I have an AppleScript variable that contains the following:
I need to parse through the text and return the following list: {“6963”, “6973”, “953489”}
Lots of ways to do this. You can use text item delimiters, shell scripts, scripting additions, etc., etc., etc. Here’s one way:
set x to " Scale Ordering Quantity Weight List Price code per carton (psi) (lbs.) (each)
Series DPG1-2
Dial Size: 2\"?Connection: 1/4\" NPT
0 ? 15 0615600 40 8.2 $27.50
0 ? 30 0615601 40 8.2 93.20
0 ? 60 0615602 40 8.2 ?6963?
0 ? 100 0615603 40 8.2 ?6973?
0 ? 160 0615604 40 8.2 ?953489?
0 ? 200 0615605 40 8.2 27.50
0 ? 300 0615606 40 8.2 27.50 "
set the_data to {}
repeat with this_row in (paragraphs of x)
    set this_row to (contents of this_row)
    if this_row contains "?" then set end of the_data to text ((offset of "?" in this_row) + 1) thru ((offset of "?" in this_row) - 1) of this_row
end repeat
return the_data
-->{"6963", "6973", "953489"}
Jon
Here is another:
on BreakAtTokens(s, t1, t2)
    --
    -- Assumes t1 and t2 are ONLY used as starting and
    -- ending tokens in the string.
    --
    set oldtids to AppleScript's text item delimiters
    try
        set AppleScript's text item delimiters to t1
        set s to s's text items
        set AppleScript's text item delimiters to t2
        set s to (s as string)'s text items
    on error m number n from f to t partial result p
        set AppleScript's text item delimiters to oldtids
        error m number n from f to t partial result p
    end try
    set AppleScript's text item delimiters to oldtids
    set a to {}
    repeat with i from 2 to s's length by 2 -- even items
        set a's end to s's item i
    end repeat
    return a
end BreakAtTokens
BreakAtTokens(str, "�", "�") --> {"6963", "6973", "953489"}
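To see why taking the even-numbered items works, it may help to trace the two delimiter passes on a tiny string. This is only a sketch: the sample text and the "<" and ">" tokens are stand-ins chosen for illustration, not the characters from the original data.

```applescript
-- Hypothetical sample text; "<" and ">" are stand-in tokens.
set sample to "a <6963> b <6973> c"
set oldtids to AppleScript's text item delimiters

-- Pass 1: split on the opening token.
set AppleScript's text item delimiters to "<"
set parts to text items of sample
--> {"a ", "6963> b ", "6973> c"}

-- Pass 2: join the pieces with the closing token, then split on it.
set AppleScript's text item delimiters to ">"
set joined to parts as string
--> "a >6963> b >6973> c"
set parts to text items of joined
--> {"a ", "6963", " b ", "6973", " c"}

set AppleScript's text item delimiters to oldtids

-- Items 2, 4, ... are the delimited values.
set evens to {}
repeat with i from 2 to (count parts) by 2
    set end of evens to item i of parts
end repeat
evens
--> {"6963", "6973"}
```

Coercing the list back to a string while the delimiter is ">" joins the pieces with ">", so every opening token is effectively replaced by a closing one. After the second split, the text that sat between tokens lands in positions 2, 4, 6 and so on, which is why the handler assumes the tokens are never used for anything else.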
It just occurred to me that I didn’t name this handler very well. How about:
on ExtractFromBetweenNonNestedDataDelimitingTokens(s, t1, t2)
set quarktext to "0 � 15 0615600 40 �06156526� �0615600�
0 � 30 0615601 40 8.2 �0615601�
0 � 60 0615602 40 8.2 �0615602�
0 � 100 0615603 40 8.2 �0615603�
"
set the_data to {}
repeat with this_row in (paragraphs of quarktext)
    set this_row to (contents of this_row)
    if this_row contains "�" then set end of the_data to text ((offset of "�" in this_row) + 1) thru ((offset of "�" in this_row) - 1) of this_row
end repeat
This will only return the following list:
{“06156526”,“0615601”,“0615602”,“0615603”}
I really want the following list:
{“06156526”,“0615600”,“0615601”,“0615602”,“0615603”}
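The reason only one value per row comes back is that `offset of` reports the position of the first match only, so a second tagged item on the same row is never seen. A minimal illustration, using "<" and ">" as stand-in tokens and a made-up row in which the range dash is written as a plain hyphen:

```applescript
-- Hypothetical row with two tagged items; "<" and ">" are stand-in tokens.
set this_row to "0 - 15 0615600 40 <06156526> <0615600>"

-- offset of ... in ... always returns the FIRST occurrence.
set open_pos to offset of "<" in this_row
set close_pos to offset of ">" in this_row

set first_item to text (open_pos + 1) thru (close_pos - 1) of this_row
--> "06156526"
```

To pick up the second item, the row has to be trimmed past the first closing token and searched again, or split with text item delimiters.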
Umm, you know my solution will handle that, right?
set quarktext to "0 � 15 0615600 40 �06156526� �0615600�
0 � 30 0615601 40 8.2 �0615601�
0 � 60 0615602 40 8.2 �0615602�
0 � 100 0615603 40 8.2 �0615603�"
BreakAtTokens(quarktext, "�", "�")
--> {"06156526", "0615600", "0615601", "0615602", "0615603"}
Here is one way to extend Jon’s technique so that it picks up more than one item per row. It can be modified to suit your purposes:
set a to {}
repeat with i from 1 to (count paragraphs in quarktext)
    set p to paragraph i of quarktext
    repeat while (p contains "�")
        set x to offset of "�" in p
        set y to offset of "�" in p
        set a's end to p's text (x + 1) thru (y - 1)
        try
            set p to p's text (y + 1) thru -1
        on error
            exit repeat
        end try
    end repeat
end repeat
a
--> {"06156526", "0615600", "0615601", "0615602", "0615603"}
I didn’t know that BreakAtTokens did what I want. I guess I should have tried it before asking for more help.
If you’re interested, this one catches some errors in the text by just looking for items with trailing tags instead of splitting the text up all at once:
set the_text to "0 � 15 0615600 40 �06156526� �0615600�
0 � 30 0615601 40 8.2 �0615601�
0 � 60 0615602 40 8.2 �0615602�
0 � 100 0615603 40 8.2 �0615603�"
set def_tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"�"}
try
    set delim_text to text items of the_text
    set AppleScript's text item delimiters to {"�"}
    set new_list to {}
    repeat with this_ti in (rest of delim_text) -- skip the first item, which is empty or holds the text before the first tag
        if this_ti contains "�" then
            set new_ti to first item of (text items of this_ti)
            set end of new_list to new_ti
        else -- error, no trailing tag
            display dialog this_ti
        end if
    end repeat
    set AppleScript's text item delimiters to def_tid
on error err_mess
    set AppleScript's text item delimiters to def_tid
    display dialog err_mess
end try
{new_list, contents of this_ti}