web browser text?

Good morning all,

Just wanted to seek the knowledge of the script gurus.

Anyone have any ideas how it might be possible to search for and extract a bit of predictable text off of a webpage?

thank you,

Hi,

It depends on the source code of the website.
You can fetch most pages with /usr/bin/curl and filter the output with shell commands such as grep, awk, or sed.

For example, this pulls the sixth field from the line that displays the number of registered users here on MacScripter:


do shell script "/usr/bin/curl 'http://www.macscripter.net' | awk '/registered users/ {print $6}'"
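To see what the awk part of that pipeline does without hitting the network, here is a small sketch that runs the same filter on a made-up page. The sample text is an assumption for illustration only; the real MacScripter markup may be laid out differently, in which case the field number would need adjusting.

```shell
# Hypothetical stand-in for the downloaded page; the real markup may differ.
page='<p>Welcome to the forum</p>
Total number of registered users: 12345
Total number of topics: 678'

# /registered users/ selects the matching line;
# $6 is the sixth whitespace-separated field on that line.
registered=$(printf '%s\n' "$page" | awk '/registered users/ {print $6}')
echo "$registered"
```

If the number sits in a different position on the line, change $6 to the matching field.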

Oh thank you. This shows a lot of potential.

Now what if the web page I’m trying to parse is in an open browser window?

I want to write a script that grabs a bit of volatile information on an open page, then opens a certain folder based on that clipped info.

I guess I want to parse the source for the open window instead of the http site itself. Does that make sense?

thank you in advance

dustin

You can get the source of an open document in Safari with:

tell application "Safari" to set theSource to source of document 1
do shell script "curl http://www.google.com/ | textutil -stdin -stdout -format html -convert txt -encoding UTF-8"

It should work.

Thank you. Now that I have the source of an open page, any suggestions on how I could pare down those 400 lines of code to get at the one line, or really the one number, I need? Could I use a shell script on a variable in an AppleScript?

Or can it be done within the main script?

Thank you again.

dustin

What is the site? Maybe it gets its data from a source that can be read easily without having to trim text. :cool:

It’s an internal website for my work that handles incoming and outgoing job orders.

Hello

What does the line that you really need look like?

If it were line 400, you could use something like this,
starting out with Stefan’s example of how to get the source from Safari:

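tell application "Safari" to set theSource to source of document 1
-- quoted form makes the source safe to hand to sh; sed -n '400p' prints only line 400
-- (a very large page may exceed the shell's argument length limit)
set shellCmd to "printf '%s\\n' " & quoted form of theSource & " | sed -n '400p'"
set shellResult to do shell script shellCmd
display dialog shellResult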

Best Regards

McUsr
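One detail worth knowing when picking a line by number with sed: without the -n option, sed prints every input line and the p command then prints the selected line a second time. A small sketch on a three-line sample (the sample text is made up for illustration):

```shell
# Made-up three-line input standing in for the page source.
text='first
second
third'

# With -n, only the line selected by 2p is printed.
only_second=$(printf '%s\n' "$text" | sed -n '2p')

# Without -n, all lines are printed and line 2 appears twice.
all_lines=$(printf '%s\n' "$text" | sed '2p')

echo "with -n:    $only_second"
echo "without -n:"
printf '%s\n' "$all_lines"
```

So for a single line out of a long source, the -n form is the one to use.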

I don’t know if the data I want will predictably be at a certain line number, but I’ll give it a try. I do know, though, that it will be embedded in a string such as:

Order Number:
N344426

Order Date:

I’m looking to get that N number out of the captured source and route it.

Thanks for your help.

dustin
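Since the number follows a fixed label rather than sitting on a fixed line, matching on the label may be more reliable than counting lines. Below is a minimal sketch, assuming that order numbers always look like the letter N followed by digits (inferred from the single example above) and that the label text is exactly "Order Number:". The function name extract_order_number and the sample inputs are hypothetical.

```shell
# extract_order_number: hypothetical helper. Reads page text on stdin and
# prints the first N-followed-by-digits token that appears after the
# "Order Number:" label, on the same line or on a later one.
extract_order_number() {
    awk '
        /Order Number:/ { found = 1; sub(/.*Order Number:/, "") }
        found && match($0, /N[0-9]+/) { print substr($0, RSTART, RLENGTH); exit }
    '
}

# Sample modelled on the snippet above (label and value on separate lines).
sample='Order Number:
N344426

Order Date:'

# Assumed variant where the markup keeps label and value on one line.
sample_html='<td>Order Number:</td><td>N344426</td>'

printf '%s\n' "$sample" | extract_order_number
printf '%s\n' "$sample_html" | extract_order_number
```

From AppleScript, the same awk program could be fed with quoted form of theSource through do shell script, as in the earlier posts. If the real order numbers use another pattern, the N[0-9]+ expression is the part to change.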