Calling JavaScript from AppleScript without a browser

Hi folks,

Does anyone know how to call JavaScript from an AppleScript without using a browser? I’m trying to parse HTML returned from lynx (using do shell script to get access to lynx).

– bmig

I don’t know much about lynx. I didn’t even know it was still alive or ran on the Mac. You could just use curl to fetch the page over HTTP.

I think you’d need SpiderMonkey or Rhino to interpret JavaScript, so you’d better be good with C or Java. And depending on the JavaScript, you’d still need a browser to expose a DOM, right?

Hi Digest, thanks for the reply. You’re right, curl works well for outputting the source, and it does seem like I need a browser no matter what to expose the DOM. I did find a way to figure out the links using lynx, but it has nothing to do with JavaScript. There’s an option to have lynx output the source with the links numbered, along with their references. I can figure it out from there.

I’ll look into SpiderMonkey and Rhino. Thanks again.
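For anyone following along, here is a minimal sketch of pulling the numbered links back out of that kind of output. The exact lynx option isn't named above, so this assumes text like what `lynx -dump` prints: the page body with [n] markers, followed by a "References" heading and one "n. URL" line per link. The `parse_references` name and the sample text are made up for illustration.

```python
import re

# Matches reference lines such as "   3. http://example.com/page".
REF_LINE = re.compile(r"^\s*(\d+)\.\s+(\S+)\s*$")


def parse_references(dump_text):
    """Return {link_number: url} from the lines after the 'References' heading.

    Assumes the dump ends with a 'References' section, one link per line.
    """
    refs = {}
    in_refs = False
    for line in dump_text.splitlines():
        if line.strip() == "References":
            in_refs = True
            continue
        if in_refs:
            match = REF_LINE.match(line)
            if match:
                refs[int(match.group(1))] = match.group(2)
    return refs


# Hypothetical sample of dumped output, not taken from a real page.
sample = """\
   Visit the [1]home page or read the [2]docs.

References

   1. http://example.com/
   2. http://example.com/docs
"""

links = parse_references(sample)
```

Lines before the "References" heading are ignored, so numbered lists in the page body shouldn't be mistaken for links.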

– bmig

If you’re just looking to scrape HTML, you might say a bit more about what you’re actually trying to obtain. The AppleScript language itself isn’t well suited to parsing HTML due to its poor native text processing facilities and lack of libraries, but many other scripting languages provide powerful tools for this kind of work.

For example, BeautifulSoup for Python is very good at dealing with messy real-world markup. You could write that portion of your script in Python and then call it from AppleScript, either via ‘do shell script’ or by packaging it as a scriptable FBA (see the aemreceive/sfba package included in Appscript Installer).
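As a rough sketch of what the Python side could look like: the script below lists every link in a page. To keep it dependency-free it uses the standard library's `html.parser` instead of BeautifulSoup, which is more forgiving of really broken markup. The `LinkCollector` and `extract_links` names, and the convention of passing "-" to read from stdin, are assumptions for illustration only.

```python
import sys
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in the document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the href values found in html, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass a file path, or "-" to read the HTML from stdin (e.g. from curl).
    source = sys.argv[1]
    if source == "-":
        html = sys.stdin.read()
    else:
        with open(source, encoding="utf-8", errors="replace") as handle:
            html = handle.read()
    for link in extract_links(html):
        print(link)
```

From AppleScript, something along the lines of `do shell script "curl -s " & quoted form of theURL & " | python3 /path/to/extract_links.py -"` should return the links one per line (the script path and `theURL` variable are placeholders).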

If you really want to stick with AppleScript only, you could pipe your source HTML through HTMLTidy to convert it to XHTML, then load the result into System Events’ XML Suite where you can crawl through it using AppleScript. Or you might consider using a commercial AppleScriptable webscraping application such as Anthracite.

HTH

Great, you rock hhas! Thanks so much.