Hi Guys,
I am using Mac OS X (Mountain Lion), MS Office 2011, and AppleScript version 2.5.
I have a Word file with a bunch of list items, like the sample text below.
Name:XXXXXXX
Age:YYYYY
D.O.B:00/00/0000
Gender:ZZZZ
I need a script that loops through all the paragraphs, extracts the text from the “:” to the end of each paragraph, and keeps it in a variable, so that I can pass the variables to another script.
For example:
P1 as “XXXXXXX”
P2 as “YYYYY”
P3 as “00/00/0000”
P4 as “ZZZZ”
And so on, to the end of the document.
Thanks in advance,
John
If I understand you correctly, here is a start toward a solution that keeps all the programming in AppleScript. The following script reads all the lines of the selected Word file into a single variable, which is then processed to separate out each of the values in your example data. Granted, this is a rather brute-force approach, but it does work. Another option would be to put each piece of data for an individual into a list.
Hope this helps.
-- Ask for a Word document (.doc or .docx)
set theFile to choose file with prompt "Select word file to process:" of type {"com.microsoft.word.doc", "org.openxmlformats.wordprocessingml.document"}
tell application "Microsoft Word"
    open theFile
    -- Read the whole document body as one string
    set theContent to content of text object of active document
    my processContent(theContent)
end tell

on processContent(theContent)
    set theContent to theContent as text
    set lineCount to the number of paragraphs in theContent
    repeat with i from 1 to lineCount
        set theLine to paragraph i of theContent
        -- The start positions are hard-coded to the length of each label
        if theLine starts with "Name:" then
            set theName to text 6 thru -1 of theLine
        else if theLine starts with "Age:" then
            set theAge to text 5 thru -1 of theLine
        else if theLine starts with "D.O.B:" then
            set theDOB to text 7 thru -1 of theLine
        else if theLine starts with "Gender:" then
            set theGender to text 8 thru -1 of theLine
        end if
        -- put call to process(es) to handle variables here
    end repeat
end processContent
The test Word file included the following lines:
Name:XXXXXXX
Age:YYYYY
D.O.B:00/00/0000
Gender:ZZZZ
Name:AAAAAAA
Age:YYYYY
D.O.B:00/00/0000
Gender:ZZZZ
Name:BBBBBBB
Age:YYYYY
D.O.B:00/00/0000
Gender:ZZZZ
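The list option mentioned above could look something like the following. This is only a minimal sketch, assuming that every individual's block starts with a "Name:" line; peopleList and currentPerson are names made up for the illustration, and the sample text stands in for the content read from Word.

```applescript
-- Sample text standing in for the document content (an assumption)
set theContent to "Name:XXXXXXX
Age:YYYYY
D.O.B:00/00/0000
Gender:ZZZZ
Name:AAAAAAA
Age:YYYYY
D.O.B:00/00/0000
Gender:ZZZZ"

set peopleList to {} -- one sublist per individual
set currentPerson to {} -- values collected for the individual being read

repeat with i from 1 to (count paragraphs of theContent)
	set theLine to paragraph i of theContent
	-- A "Name:" line is assumed to start a new individual
	if theLine starts with "Name:" and currentPerson is not {} then
		set end of peopleList to currentPerson
		set currentPerson to {}
	end if
	set colonPos to offset of ":" in theLine
	-- Keep only lines that have text after a colon
	if colonPos > 0 and colonPos < (length of theLine) then
		set end of currentPerson to text (colonPos + 1) thru -1 of theLine
	end if
end repeat
-- Store the last individual, if any
if currentPerson is not {} then set end of peopleList to currentPerson

peopleList
--> {{"XXXXXXX", "YYYYY", "00/00/0000", "ZZZZ"}, {"AAAAAAA", "YYYYY", "00/00/0000", "ZZZZ"}}
```

Each sublist can then be handed to whatever handler processes one individual, for example item 2 of peopleList for the second person.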
Hi, haolesurferdude. Your method requires hard-coded numbers, but, as we don’t know if the OP’s sample contains all possible scenarios (they almost never do), it would be advisable to use the offset command to determine the colon’s location.
Because there is also no way to determine how many different paragraphs there will be throughout the text, a record should be made to pair the p variables (keys) with their values.
set theContent to "Name:XXXXXXX
Age:YYYYY
D.O.B:00/00/0000
Gender:ZZZZ"
set {textList, recordList} to {{}, {}}
# Get text after offset of colons
repeat with index from 1 to count theContent's paragraphs
    set textList's end to theContent's paragraph index's text ((offset of ":" in theContent's paragraph index) + 1) thru -1
end repeat
# Make records
repeat with index from 1 to count textList
    set recordList to my recordList & (run script "{|" & ("p" & index) & "|: " & (quote & textList's item index & quote) & "}")
end repeat
# Get value with key
recordList's p2 --or p3, p4, etc.
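If the values might contain double quotes or backslashes, building the record through run script could fail, because the text is pasted into script source. One alternative, sketched here under the assumption that looking values up by their original labels is acceptable, is to keep two parallel lists. The handler valueForKey and the list names are hypothetical and not part of any library.

```applescript
-- Hypothetical helper: return the value stored for theKey, or missing value
on valueForKey(theKey, theKeys, theValues)
	repeat with i from 1 to (count theKeys)
		if item i of theKeys is theKey then return item i of theValues
	end repeat
	return missing value
end valueForKey

set theContent to "Name:XXXXXXX
Age:YYYYY
D.O.B:00/00/0000
Gender:ZZZZ"

set {keyList, valueList} to {{}, {}}
repeat with i from 1 to (count paragraphs of theContent)
	set theLine to paragraph i of theContent
	set colonPos to offset of ":" in theLine
	-- Require a label before the colon and some text after it
	if colonPos > 1 and colonPos < (length of theLine) then
		set end of keyList to text 1 thru (colonPos - 1) of theLine
		set end of valueList to text (colonPos + 1) thru -1 of theLine
	end if
end repeat

valueForKey("D.O.B", keyList, valueList)
--> "00/00/0000"
```

The positional names p1, p2, and so on are still available as item 1 of valueList, item 2 of valueList, etc.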
Hi Marc Anthony,
Your code works perfectly and splits out the content.
But I tried it with a small modification, i.e., setting thecontent to the whole document text, as below.
set myFile to choose file with prompt "Please select the Word File:" of type {"doc", "docx"} default location (path to desktop)
tell application "Microsoft Word"
    activate
    open myFile
    set thecontent to content of text object of active document
end tell
set {textList, recordList} to {{}, {}}
# Get text after offset of colons
repeat with index from 1 to count thecontent's paragraphs
    set textList's end to thecontent's paragraph index's text ((offset of ":" in thecontent's paragraph index) + 1) thru -1
end repeat
# Make records
repeat with index from 1 to count textList
    set recordList to my recordList & (run script "{|" & ("p" & index) & "|: " & (quote & textList's item index & quote) & "}")
end repeat
# Get value with key
recordList's p1 --or p2, p3, etc.
But it throws an error (Can’t get text 1 thru -1 of “”) on the following line:
set textList's end to thecontent's paragraph index's text ((offset of ":" in thecontent's paragraph index) + 1) thru -1
How can I fix it?
Thanks
John
The error message is telling you that it encountered a paragraph without text. Change the statement inside the first loop to read:
if not theContent's paragraph index's text = "" then set textList's end to theContent's paragraph index's text ((offset of ":" in theContent's paragraph index) + 1) thru -1
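For reference, here is a minimal sketch of the whole first loop with a guard of that kind in place. It rests on a few assumptions: the sample text below stands in for what Word returns, the empty paragraphs likely come from blank lines or from the paragraph mark that usually ends a Word document, and paragraphs with no colon or with nothing after the colon should simply be skipped. The variable names i and colonPos are only illustrative.

```applescript
-- Sample text with a blank line and a trailing empty paragraph (an assumption)
set theContent to "Name:XXXXXXX
Age:YYYYY

Gender:ZZZZ
"

set textList to {}
repeat with i from 1 to (count paragraphs of theContent)
	set theLine to paragraph i of theContent
	-- offset returns 0 when the paragraph is empty or has no colon
	set colonPos to offset of ":" in theLine
	-- Skip those paragraphs, and any label with nothing after the colon
	if colonPos > 0 and colonPos < (length of theLine) then
		set end of textList to text (colonPos + 1) thru -1 of theLine
	end if
end repeat

textList
--> {"XXXXXXX", "YYYYY", "ZZZZ"}
```

Calling offset once per paragraph and storing the result also avoids evaluating the same expression twice on every pass.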