capturing text

Hi again,

Just curious if I have a string:

set x to "Las Vegas, Nevada 89102"

What is the best way to extract the city, state, and zip into their own strings?

Is the answer text item delimiters, or offsets of ", " and " "?

-patrick

This should return a record:

on extractAddressComponents(address)
	set components to {city:missing value, state:missing value, zip:missing value}
	set ASTID to AppleScript's text item delimiters -- save the current delimiters
	
	-- Split on the comma: the city comes before it, the state and ZIP after it
	set AppleScript's text item delimiters to ","
	set city of components to first text item of address
	set state of components to second text item of address
	
	-- Split on spaces: the ZIP is the last piece of the whole address
	set AppleScript's text item delimiters to " "
	set zip of components to last text item of address
	-- Drop the empty leading piece and the ZIP, then rejoin what is left with spaces
	set state of components to (items 2 through -2 of every text item of state of components) as text
	
	set AppleScript's text item delimiters to ASTID -- restore the saved delimiters
	return components
end extractAddressComponents

This example will make three strings and display a dialog with the new values:

extractAddressComponents("Las Vegas, Nevada 89102")
copy result to {city:theCity, state:theState, zip:theZIP}

display dialog "City: " & theCity & return & ¬
	"State: " & theState & return & ¬
	"ZIP: " & theZIP & return buttons "OK" default button 1

Thank you very much for the reply… I am still having a problem, though.

This script is supposed to extract text from emails, so the input will not always be “LAS VEGAS, NV 89102”; it could be “OAKLAND, CA 94605”, which causes this error:

Can’t make items 2 thru -2 of every text item of state of {city:"OAKLAND", state:" CA 94605", zip:"OAKLAND, CA 94605"} into type string.

Is there an easy way to fix this so that it won’t be affected by 2- or 3-word cities?

-patrick

I see that you’re cross-posting, Patrick, so I’ll give you the same answer as I did elsewhere…

You could go for either. (Offset is slightly briefer, and TIDs can be faster on long strings - although that doesn’t appear to be a huge issue here.) Rather than locate spaces, I’d probably just use the string elements ‘word’ and ‘text’ to help clean up the whole thing:

Using offset:

on locationAsList(l)
	set o to offset of "," in l
	tell l to {text 1 thru (o - 1)} & (text o thru -1)'s {text from word 1 to word -2, word -1}
end locationAsList

set x to "Las Vegas, Nevada 89102"
set {city, state, zip} to locationAsList(x)

city --> "Las Vegas"
state --> "Nevada"
zip --> "89102"
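
The 'word' and 'text' elements that the handler above relies on can also be tried on their own. The following is only a minimal sketch: the sample string and the variable names (allWords, stateName, zipCode) are made up for illustration and stand in for the part of the address that follows the city.

```applescript
-- Illustrative only: s stands for everything from the comma onward.
set s to ", New Mexico 85123"

-- Punctuation and surrounding spaces are not part of any word.
set allWords to words of s --> {"New", "Mexico", "85123"}

-- A text range between two words keeps the spaces that lie between them.
set stateName to text from word 1 to word -2 of s --> "New Mexico"

-- The last word is the ZIP.
set zipCode to word -1 of s --> "85123"

{stateName, zipCode} --> {"New Mexico", "85123"}
```

Because 'word' skips the comma and the leading space, the state needs no separate trimming step.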

Using text item delimiters:

on locationAsList(l)
	set d to text item delimiters
	set text item delimiters to ","
	tell l's text items to set l to {item 1} & item 2's {text from word 1 to word -2, word -1}
	set text item delimiters to d
	l
end locationAsList

set x to "Las Vegas, Nevada 89102"
set {city, state, zip} to locationAsList(x)

city --> "Las Vegas"
state --> "Nevada"
zip --> "89102"

Sorry, I didn’t realize that the people on MacScripter are the same people as on lists.apple…

-patrick

Well, they may not all be the same - but there are quite a few folk who participate in more than one forum - just as you do, in fact… :wink:

Just by way of an update to this thread, I subsequently learned that US postcodes can consist of either 1 or 2 ‘words’. For example: “60062” or “85123-1234”. (As you might guess from this admission of omission, I’m not American.) :wink:

To allow for this possibility, I’ve modified my original TID-based suggestion to this:

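on locationAsList(l)
	set d to text item delimiters
	-- z becomes 2 if the second-to-last word starts with a digit (a 2-word postcode), otherwise 1
	set text item delimiters to item 1 of (l's word -2)
	set z to count "0123456789"'s text items
	-- Split on the comma, then take the last z words of the second part as the postcode
	set text item delimiters to ","
	tell l's text items to set l to {item 1} & item 2's {text from word 1 to word -(z + 1), text from word -z to word -1}
	set text item delimiters to d
	l
end locationAsList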

set l to {}
repeat with i in {"Las Vegas, Nevada 89102", "Northbrook, IL 60062", "Las Cruces, New Mexico 85123-1234"}
	set l's end to locationAsList(i)
end repeat
l

--> {{"Las Vegas", "Nevada", "89102"}, {"Northbrook", "IL", "60062"}, {"Las Cruces", "New Mexico", "85123-1234"}}
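
The digit test inside that handler may be easier to follow in isolation. Here is a minimal sketch of the same idea; the handler name startsWithDigit is hypothetical and not part of the suggestion above, and it assumes a non-empty string.

```applescript
-- Hypothetical helper: reports whether a non-empty string begins with a digit.
on startsWithDigit(s)
	set d to AppleScript's text item delimiters
	-- Use the first character of s as the delimiter...
	set AppleScript's text item delimiters to character 1 of s
	-- ...and split the ten digits on it. A digit occurs exactly once in
	-- "0123456789", so it yields 2 pieces; any other character yields 1.
	set n to count text items of "0123456789"
	set AppleScript's text item delimiters to d -- restore the saved delimiters
	return n is 2
end startsWithDigit

startsWithDigit("85123") --> true
startsWithDigit("Mexico") --> false
```

In locationAsList the same count is used directly as the number of words that make up the postcode: 2 when the second-to-last word is numeric, 1 otherwise.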