Login and HTML scraping

Hello everybody,

I’m new here, but I have been browsing the forum for a while :slight_smile:

I want to log in to my mobile carrier’s website (https://areaclienti3.tre.it/login_133.jsp) and retrieve information about my credit.

Searching around the forum, I came up with a nice AppleScript (attached below) that works fine, as long as I use the pre-filled login information saved in Safari.

I’d like to be able to automatically fill the username and password fields in the login form without using the “remembering password” feature of Safari.

I tried all the “document.forms[#].elements[#].value” stuff but the source of that page is a total mess to me and I think I will need something different to make it work.

Any help will be appreciated.

Thanks in advance!

P.S. I would also like to be able to run this script without Safari being shown (I don’t know if this is even possible), and maybe issue a Growl notification instead of using the display dialog command.


set AreaClientiURL to "https://areaclienti3.tre.it/login_133.jsp"
set x to "sendForm();"

tell application "Safari"
	open location AreaClientiURL
	delay 5
	do JavaScript x in document 1
	delay 5
	set pageHTML to source of document 1
end tell

set TID to AppleScript's text item delimiters --save default value

---Retrieve CREDITO---

set AppleScript's text item delimiters to {"<span class=\"txt_Value\">"}
set credito to text item 2 of pageHTML

set AppleScript's text item delimiters to {"&euro"}
set credito to text item 1 of credito

---Display CREDITO---
display dialog "credito residuo: " & credito & "€"

set AppleScript's text item delimiters to TID
tell application "Safari" to quit
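
On the P.S. about replacing display dialog: Growl has its own scripting interface, but as a rough alternative that needs no extra software, later versions of OS X include a built-in display notification command. The sketch below assumes OS X 10.9 or later, and the credito value is a placeholder standing in for the text parsed from pageHTML.

```applescript
-- Assumption: OS X 10.9 or later, where display notification is built in.
-- Placeholder value; in the real script this comes from parsing pageHTML.
set credito to "12,34"

-- Posts a notification banner instead of a modal dialog.
display notification ("credito residuo: " & credito & "€") with title "Tre"
```

Unlike display dialog, this does not block the script while waiting for a click.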

Good evening,

welcome to MacScripter.

The shell command curl can load the source of a web page without using a browser, and it can supply credentials.
You can probably pass the real destination URL instead of the login URL.


do shell script "curl 'https://areaclienti3.tre.it/login_133.jsp' -u user:pass"

Thanks, Stefan!!

Unfortunately I tried without success.

The command you posted retrieves the HTML source of the login page (so I can’t get past the login step).

If there were a way to pass the login information correctly via curl, that would be wonderful.

UP!

Anybody? :rolleyes:

Hello!

Did you specify -u yourusername:yourpassword in the curl command?

Hi!

If I type this command in Terminal

I just get the HTML source of the login page https://areaclienti3.tre.it/login_133.jsp, which is not what I am looking for.

:frowning:

Is there any other way to submit a username and password to a JavaScript-driven site using curl?
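
For reference, -u user:pass makes curl send HTTP Basic authentication, which is not the same thing as submitting an HTML login form, so getting the login page back is the expected result. A minimal sketch of a form-style login with curl follows. It assumes the form is submitted by POST, that the fields are named username1 and password, and that the POST target is the login URL itself. None of this is confirmed for this site, and because the page submits through JavaScript (sendForm()), the real target and field names may well differ.

```applescript
-- Sketch only: the POST target and the field names are assumptions,
-- not taken from the real page. Check the page source for the actual values.
set loginActionURL to "https://areaclienti3.tre.it/login_133.jsp" -- hypothetical POST target
set myUser to "123"
set myPass to "456"
set cookieFile to "/tmp/tre_cookies.txt" -- hypothetical location for session cookies

-- --cookie-jar keeps the session cookies, --location follows redirects,
-- and each --data-urlencode adds one URL-encoded field to the POST body.
set curlCommand to "curl --silent --location" & ¬
	" --cookie-jar " & quoted form of cookieFile & ¬
	" --data-urlencode " & quoted form of ("username1=" & myUser) & ¬
	" --data-urlencode " & quoted form of ("password=" & myPass) & ¬
	" " & quoted form of loginActionURL

-- The result is whatever page the server returns after the POST.
set pageHTML to do shell script curlCommand
```

If the site builds the request in JavaScript or expects hidden fields or tokens, a plain POST like this will not be enough, and driving the browser is the more practical route.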

Try figuring out which keystrokes, besides your username and password, you need to type to fill out the form: the exact number of tabs to reach the first field, then the exact number of tabs to reach the second field.

Try:

set myuser to "123"
set mypass to "456"

tell application "Safari"
	set URL of document 1 to "https://areaclienti3.tre.it/login_133.jsp"
	tell me to wait_for_page(1)
	do JavaScript "document.getElementById('username1').value='" & myuser & "' ; document.getElementById('password').value='" & mypass & "' ; document.getElementsByClassName('box_Buttons')[0].childNodes[1].click()" in document 1
end tell


on wait_for_page(inWindowIndex)
	activate application "Safari"
	tell application "System Events" to tell application process "Safari"
		repeat
			delay 0.5
			set theStatusText to name of static text 1 of group 3 of window inWindowIndex as text
			if theStatusText begins with "Contacting" or theStatusText begins with "Loading" or theStatusText begins with "Waiting" then
			else
				exit repeat
			end if
		end repeat
	end tell
end wait_for_page

Hello!

I have made a small amendment to adayzdone’s excellent solution!

But run his script first, so we can see that you actually log in before the rest happens!


set myuser to "123"
set mypass to "456"

tell application "Safari"
	set URL of document 1 to "https://areaclienti3.tre.it/login_133.jsp"
	tell me to wait_for_page(1)
	do JavaScript "document.getElementById('username1').value='" & myuser & "' ; document.getElementById('password').value='" & mypass & "' ; document.getElementsByClassName('box_Buttons')[0].childNodes[1].click()" in document 1
	tell me to wait_for_page(1)
	set pageHTML to source of document 1
end tell

set TID to AppleScript's text item delimiters --save default value

---Retrieve CREDITO---

set AppleScript's text item delimiters to {"<span class=\"txt_Value\">"}
set credito to text item 2 of pageHTML

set AppleScript's text item delimiters to {"&euro"}
set credito to text item 1 of credito

---Display CREDITO---
display dialog "credito residuo: " & credito & "€"

set AppleScript's text item delimiters to TID
tell application "Safari" to quit


on wait_for_page(inWindowIndex)
	activate application "Safari"
	tell application "System Events" to tell application process "Safari"
		repeat
			delay 0.5
			set theStatusText to name of static text 1 of group 3 of window inWindowIndex as text
			if theStatusText begins with "Contacting" or theStatusText begins with "Loading" or theStatusText begins with "Waiting" then
			else
				exit repeat
			end if
		end repeat
	end tell
end wait_for_page

I’m amazed! It works!!

Thank you both, you were very helpful!

I just get an error on the subroutine

wait_for_page(inWindowIndex)

this is the error:

on this line of code:

I replaced every call to the subroutine with a delay of 10 seconds (just to be sure the page had fully loaded) and it works like a charm.

Hello!

:slight_smile: I am glad it works! But what version of Safari are you using?

If I understand correctly, it seems that the GUI elements being tested aren’t available when the subroutine is entered.
Maybe this edited version would behave better:


on wait_for_page(inWindowIndex)
	activate application "Safari"
	tell application "System Events" to tell application process "Safari"
		repeat
			try # ADDED
				delay 0.5
				set theStatusText to name of static text 1 of group 3 of window inWindowIndex as text
				if theStatusText begins with "Contacting" or theStatusText begins with "Loading" or theStatusText begins with "Waiting" then
				else
					exit repeat
				end if
			end try # ADDED
		end repeat
	end tell
end wait_for_page

Yvan KOENIG (VALLAURIS, France) Monday, 1 October 2012 16:48:39

Hi Yvan,

I have been using that handler for a while now without problems on 10.6.8. I think it has something to do with which version of Safari the OP is using.

Hello!

Or he hasn’t enabled access for assistive devices in the “Universal Access” pane of System Preferences.

To check, I ran this script:


set myuser to "123"
set mypass to "456"

tell application "Safari"
	set URL of document 1 to "https://areaclienti3.tre.it/login_133.jsp"
	tell me to wait_for_page(1)
	do JavaScript "document.getElementById('username1').value='" & myuser & "' ; document.getElementById('password').value='" & mypass & "' ; document.getElementsByClassName('box_Buttons')[0].childNodes[1].click()" in document 1
end tell


on wait_for_page(inWindowIndex)
	activate application "Safari"
	tell application "System Events" to tell application process "Safari"
		repeat 100 times
			try
				delay 0.5
				tell window inWindowIndex
					class of UI elements
					-->{button, button, button, group, group, group, button, static text, tool bar}
					class of UI elements of group 3
					-->{group}
				end tell
				(*
				set theStatusText to name of static text 1 of group 3 of window inWindowIndex as text
				if theStatusText begins with "Contacting" or theStatusText begins with "Loading" or theStatusText begins with "Waiting" then
				else
					exit repeat
				end if
				*)
			end try
		end repeat
	end tell
end wait_for_page

The event log showed this 100 times:

get class of every UI element of window 1 of application process "Safari"
	--> {button, button, button, group, group, group, button, static text, tool bar}
get class of every UI element of group 3 of window 1 of application process "Safari"
	--> {group}

There is no hope of reaching
static text 1 of group 3 of window inWindowIndex
because group 3 contains only a group and no static text (under 10.8.2).

Yvan KOENIG (VALLAURIS, France) Monday, 1 October 2012 18:52:18
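
Since the status text depends on Safari’s window layout, one possible alternative is to ask the page itself whether it has finished loading. The handler below is a sketch, not a tested replacement: the name wait_for_ready and the maxSeconds parameter are made up for illustration, and it assumes that Safari allows do JavaScript on the current document.

```applescript
-- Hypothetical handler: polls document.readyState instead of reading GUI elements.
-- maxSeconds is an integer limit, so the loop cannot run forever.
on wait_for_ready(maxSeconds)
	tell application "Safari"
		repeat (maxSeconds * 2) times
			-- Short pause first, so a freshly set URL has a chance to start loading.
			delay 0.5
			try
				set pageState to do JavaScript "document.readyState" in document 1
				if pageState is "complete" then return true
			end try
		end repeat
	end tell
	return false -- gave up after roughly maxSeconds seconds
end wait_for_ready
```

It would be called as wait_for_ready(20) in place of wait_for_page(1). One caveat: right after the URL is changed, the previous page may still report "complete" for a moment, so a short fixed delay before the first check can still be useful.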

Hello Yvan!

I am glad you sorted it out. I am on the same version as adayzdone, so I have no means of finding out where the status text has gone. Anyway, it is interesting to see that things have changed.

The script seems to work for the OP, and that is the main thing.

Have a good evening!