Extracting a number from a webpage with 2 different delimiters

Hi, I’ve been reading up on the offset command as well as text item delimiters, but I can’t quite figure out how to crack this one. I want to:

  1. open webpage in Safari (simple enough)
  2. get text of webpage (ditto)
  3. go through the webpage for a number that is ALWAYS and EXCLUSIVELY between a “$” and a carriage return – i.e., the only dollar amount listed on the page (Ack!)
  4. put that number into a variable as a real number, not a text string – i.e., 12.34, not “12.34” (Double Ack!)

I’m just lost here. Any help would be appreciated.
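Since the question mentions the offset command, here is a minimal sketch of steps 3 and 4 using it. The sample text is made up and stands in for the page text from Safari. It assumes the page contains exactly one "$" and that the amount runs up to the next line break. The coercion to real follows the system's number format settings, so an amount with a thousands separator may need cleaning first.

```applescript
-- Sample text stands in for the page text; it is not from the real page.
set pagetext to "Current balance" & return & "$12.34" & return & "Thank you"

-- offset returns 0 when the substring is not found.
set dollarPos to offset of "$" in pagetext
if dollarPos is 0 then error "No \"$\" found in the text."
if dollarPos is (length of pagetext) then error "Nothing follows the \"$\"."

-- Everything after the "$", then only up to the first line break.
set afterDollar to text (dollarPos + 1) thru -1 of pagetext
set balance to (paragraph 1 of afterDollar) as real
```

Here balance ends up as a real number rather than a string, which is what step 4 asks for.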

I think I answered my own question without text item delimiters. I modified something I found on the MacOSXhints forums:

http://forums.macosxhints.com/archive/index.php/t-95269.html

Here 'tis (the proper webpage has already been opened in an earlier part of the script):

set AppleScript's text item delimiters to ""
tell application "Safari"
	activate
	set pagetext to get text of document 1
	set theword to 0
	repeat with someword from 1 to (count of the words of pagetext)
		get word someword of pagetext
		if the first character of the result is "$" then
			set theword to someword
			exit repeat
		end if
	end repeat
end tell
tell pagetext
	set balance to word (theword + 1) as number
end tell

So “balance” is the variable that gets the number. It’s slow, but it works.

This is probably faster:


set theText to do shell script "curl 'http://www.mySite.com'"
set {TID, text item delimiters} to {text item delimiters, "$"}
set balance to paragraph 1 of text item 2 of theText as real
set text item delimiters to TID

Thanks Stefan, but I’m getting an error message with that method. The HTML source that curl fetches from the server does not contain the dollar amount; the amount only appears once the page has been delivered to the browser. The page requires a login first, and the curl routine bypasses that login, so it never sees the logged-in page even though I’ve already logged in via JavaScript in Safari.

So I’ve combined the two scripts (Remember, Safari already has the proper page loaded in an earlier routine):

tell application "Safari"
	activate
	set pagetext to get text of document 1
end tell
set {TID, text item delimiters} to {text item delimiters, "$"}
set balance to paragraph 1 of text item 2 of pagetext as number
set text item delimiters to TID

Much faster than before. Thanks again.

Indeed, that is a reason the curl approach fails here.
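For anyone reusing the combined script, here is a hedged sketch that wraps the delimiter approach in a handler. The handler name firstDollarAmount and the sample text are hypothetical, not from the posts above. It restores the text item delimiters even if something goes wrong, and it raises a clear error when no "$" is present instead of failing on text item 2.

```applescript
-- Hypothetical handler; the name is for illustration only.
on firstDollarAmount(pageText)
	set savedTID to AppleScript's text item delimiters
	try
		set AppleScript's text item delimiters to "$"
		set pieces to text items of pageText
		set AppleScript's text item delimiters to savedTID
	on error errMsg number errNum
		-- Put the delimiters back before passing the error on.
		set AppleScript's text item delimiters to savedTID
		error errMsg number errNum
	end try
	if (count of pieces) < 2 then error "No \"$\" found in the page text."
	-- The amount runs from just after the "$" to the first line break.
	return (paragraph 1 of item 2 of pieces) as real
end firstDollarAmount

-- Made-up sample; in the real script this would be the text from Safari.
set sampleText to "Account summary" & return & "Balance: $12.34" & return & "Thank you"
set balance to firstDollarAmount(sampleText)
```

In the real script, the text Safari returns for document 1 would be passed in place of sampleText.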