Opening a URL and searching for specific text

Hey guys, I must say I’ve never touched AppleScript, but I finally thought of a use for it. What I need to do is have a browser (preferably Safari) open a page and search for specific text. If the text is there, I would like it to output a line of text to a text file, or perhaps send an IM through Proteus, etc. Any help anyone could provide would be great!

This seems to work in Mac OS X 10.3.2. It could do with some error-checking routines, though.

John M

set newUrl to "http://macscripter.net" -- your site
set myText to "Applescript" -- your search text

set myNewCode to do shell script "curl -A \"Mozilla/3.0 (Win95; I)\" \"" & newUrl & "\""
if myNewCode contains myText then
	set myFilePath to choose file name default name "result.txt"
	set saveFile to (open for access myFilePath with write permission)
	write myText & " is in " & newUrl as string to saveFile
	close access saveFile
end if
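
For the error checking mentioned above, here is one possible sketch, not a tested or definitive version. It assumes curl's -s, -S and -f options (silent, show errors, fail on HTTP errors) so that do shell script raises an error when the download fails. It also uses quoted form of rather than hand-escaped quotes, and clears the file before writing. The dialog wording is only illustrative.

```applescript
set newUrl to "http://macscripter.net" -- your site
set myText to "Applescript" -- your search text

try
	-- quoted form of protects the user agent and URL from the shell
	set myNewCode to do shell script "curl -s -S -f -A " & quoted form of "Mozilla/3.0 (Win95; I)" & " " & quoted form of newUrl
on error errMsg
	-- curl exited with a non-zero status (bad host, HTTP error, etc.)
	display dialog "Could not fetch " & newUrl & ": " & errMsg
	return
end try

if myNewCode contains myText then
	set myFilePath to choose file name default name "result.txt"
	set saveFile to open for access myFilePath with write permission
	try
		set eof of saveFile to 0 -- clear any previous contents
		write (myText & " is in " & newUrl) to saveFile
		close access saveFile
	on error errMsg
		close access saveFile -- don't leave the file open
		display dialog "Could not write the file: " & errMsg
	end try
end if
```

If you cancel the choose file name dialog, the script simply stops with the usual "User canceled" error, which is probably fine for a quick script.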

Perfect, this gives me something to work on, thanks John!

Hi Bigfat,

you might want to look at the code in this post I made today, which expands on the above code:

http://bbs.applescript.net/viewtopic.php?t=6388

Best

John M

Hm, is there a way to have the script do something if the text is not there? Sorry, I’m not done looking over your other post yet.

How about:

set newUrl to "http://macscripter.net" -- your site 
set myText to "Applescript" -- your search text 

set myNewCode to do shell script "curl -A \"Mozilla/3.0 (Win95; I)\" \"" & newUrl & "\""
if myNewCode contains myText then 
   set myFilePath to choose file name default name "result.txt" 
   set saveFile to (open for access myFilePath with write permission) 
   write myText & " is in " & newUrl as string to saveFile
   close access saveFile 
else  
   display dialog myText & " is not in " & newUrl
end if 


John M,

Any reason you assume that curl should pretend to be an old version of Netscape for Windows 95? I would guess that most sites won’t require a browser that old, and some may not even give you what you want. Or is your thinking that well-written websites will give non-frame-based pages to old browsers? I doubt that many sites are well-written enough to do that.

Just curious. Your code using cURL is what I would have recommended. A quick note: I’ve got a bunch of example code for downloading pages with cURL and post-processing the source of those pages at http://www.danshockley.com. For a quick download, though, your code is good. If you need to POST a form, my examples may be helpful.

Hi Krioni,

The cURL pretending to be an old version of Netscape for Windows 95 came about when I tried to cURL Google results pages. cURL without pretending to be another browser would pull a page saying something about forbidding access for Linux machines. The -A “Mozilla/3.0 (Win95; I)” was from a forum post at http://curl.haxx.se and was the first example I found that worked. Bit of a kludge really.

Thanks for the website link. If you have the tag for a later browser I’d be more than happy to use it.

Best

John M

John,

Well, you can check your own web browser’s user-agent by going to http://www.esperance-linux.co.uk/misc/UserAgent.html. For example, Safari for Mac OS X 10.3.2 is:

Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/106.2 (KHTML, like Gecko) Safari/100.1

Also, you can run the following in AppleScript to check your user agent (change the do JavaScript code as appropriate for your browser):


tell application "Safari" to do JavaScript "navigator.userAgent" in document 1

I tested cURL with Safari’s user agent, and Google seems to respond well. I don’t think you are supposed to query Google this way for commercial purposes, though, as Google’s Terms of Service may disallow it.
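
Putting the two pieces together, a rough sketch of passing Safari's own user agent to cURL might look like this. It assumes Safari has at least one document open, and the URL is just the example site from earlier in the thread. Newer versions of Safari may also need JavaScript from Apple Events to be allowed before do JavaScript will run.

```applescript
-- Ask Safari for its user-agent string (needs an open document)
tell application "Safari" to set myAgent to do JavaScript "navigator.userAgent" in document 1

set newUrl to "http://macscripter.net" -- your site

-- quoted form of keeps the parentheses and semicolons in the agent string away from the shell
set myNewCode to do shell script "curl -s -A " & quoted form of myAgent & " " & quoted form of newUrl
```

After this, myNewCode holds the page source and can be checked with contains exactly as in John M's script.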