Collect Text From Form In Safari

I’ve seen a lot of posts about how to pass data to a form in Safari. I want to do the reverse: I’d like to retrieve the data from a form. The use case is to collect a generated code snippet, which I could then build into an HTML document.

Is there some variation of the “do JavaScript” command that can be used to set a variable to a form’s current value?

Model: MacTower
Browser: Safari 536.29.13
Operating System: Mac OS X (10.8)

http://www.occupantunknown.com/missinglink/missinglink.html

PB


Hello.

There is another way of doing it, as an alternative, though I’d really try Peter Bunn’s “Missing Link” first.

Nik’s scheme is to parse a web form directly with AppleScript; a demo of the principles can be found here

If you really want to get the contents of a form before you submit it, then your only option is to use the Web Inspector to find the form field, and then use do JavaScript to get the value of that field (which may be a bit finicky, but will produce a result in the end).

Thanks for the advice. Can I prevail on you guys for some code samples? How can I use Missing Link to accomplish the goal? Or a snippet for any alternate method?

There is a third way, if your form’s data is visible after being posted on the net: retrieve the resulting page with curl, using the URL shown in the browser for that page, if one is available.

As for the JavaScript, there are plenty of posts here that show the technique, and you may also use Google. As for using Nik’s method, all I can say is that you’ll have to try following the blog and read the code.

But that post isn’t of much relevance if the data you are after is posted on a form on an external website.

I haven’t been able to find any code samples that retrieve the form data. I’ve seen plenty about passing data to a form - but none about collecting from a form. I’ve spent a long time looking. It would be really great if someone could give me a sample or a link.

I can’t see any difference between retrieving the value of a form field and retrieving the value of any other element; you’ll have to figure out the addressing yourself.
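
Since a sample was asked for, here is a minimal sketch of that addressing in JavaScript. The form and field names ('login' and 'req_username') are only placeholders for illustration; substitute whatever the Web Inspector shows for your page. getFieldValue is a hypothetical helper name, not part of Safari or any library.

```javascript
// Hypothetical helper: read the current value of one field in a named form.
// `doc` is the page's document object; returns null when the form or field is missing.
function getFieldValue(doc, formName, fieldName) {
  var form = doc.forms[formName];
  if (!form) return null;
  var field = form.elements[fieldName];
  if (!field) return null;
  return field.value;
}

// In a browser, the value of this last expression is the result of the script.
// Outside a browser (no global `document`) it simply evaluates to null.
typeof document !== 'undefined' ? getFieldValue(document, 'login', 'req_username') : null;
```

Pasted between the quotes of a do JavaScript "..." in document 1 command, the last line's value should be what AppleScript receives, so writing set theValue to do JavaScript ... would put the field's contents into a variable. The code deliberately uses only single quotes and no backslashes, so it can sit inside an AppleScript string without extra escaping.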

You should be able to find a script named Safari next page at red-sweater.com.

I’m posting the script here, since I wasn’t able to find it at red-sweater.com anymore.

-- © Copyright 2005, Red Sweater Software. All Rights Reserved.
-- Permission to copy granted for personal use only. All copies of this script
-- must retain this copyright information and all lines of comments below, up to
-- and including the line indicating "End of Red Sweater Comments". 
--
-- Any commercial use of this code must be licensed from the Copyright
-- owner, Red Sweater Software.
--
-- A script to go to the next page of a multi-page web document/story/whatever.
-- The script works by by guessing the "next" URL link and changing the location
-- of the page to that link's target.
--
-- This script will work on many sites that I have not tested, because it 
-- searches for a number of "link names" that are commonly used by sites 
-- presenting multi-page results. If you come across some common link names 
-- that I have not included, please contact me so I can update the script for
-- all to benefit. http://www.red-sweater.com/
--
-- Version 1.0.6
--    Fix bug preventing Salon.com from working correctly.
--
-- Version 1.0.5:
--	Support any link starting with "Next" and ending in ">"
--
-- Version 1.0.4:
--	Support sites that put non-breaking spaces in their link text (e.g. photos.com)
--	
-- Version 1.0.3:
--	Support for graphical links named next.gif or next.jpg
--	
--	New sites tested:
--	ADC documentation - "Next Page >"
--
-- Version 1.0.2:
--	Lots of new sites supported by virtue of case insensitive comparisons. Some new special cases added.
--
--	New sites tested:
--	vBulletin - Contains a link named ">"
--	Wired! News - Contains a link whose "class" attribute is 'next'
--	Google Update - For some reason its link is now "\nNext"
--	Craiglist - Special case - any link that starts with "next " and ends in " postings"
--	tribe.net - link named "next"
--	friendster - link named "next >"
--	macdevcenter - link named "Next Page"
--	apple mailing lists - link named "Next >>"
--
-- Version 1.0.1:
--	arstechnica.com - Contains a link named "Next »" (option-shift-\)
--
-- Version 1.0:
--
--	a9.com - Contains a link named "Next"
--	cocoabuilder.com - Contains a link named ">>"
--	Google Search - Contains a link named "Next"
--	msn.com - Contains a link named "Next"
--	NYTimes.com - Contains a link named "Next>>"
--	Salon.com Stories - Contains a link containing a relatively higher "index' page.
--	WashingtonPost.com - Contains a link named "Next"
--	Yahoo Search - Contains a link named "Next"
--
-- End of Red Sweater Comments

tell application "Safari"
	activate
	set myJavaScript to "

function GetLinkWithImageNamed(theImageName)
{
	// For every link
	for (i=0; i<document.links.length; i++)
	{
		// For every child of that link
		for (j=0; j<document.links[i].children.length; j++)
		{
			// Is it an IMG tag?
			if (document.links[i].children[j].tagName.toLowerCase() == 'img')
			{
				var myImageSource = document.links[i].children[j].src;
				
				// Get the leaf of the path
				var pathLeafName = myImageSource.split('/').pop();
				
				// Does it match?
				if (pathLeafName.toLowerCase() == theImageName.toLowerCase())
				{
					return i;
				}
			}
		}
	}
	return -1;
}

function GetLinkIndexNamed(theLinkName)
{
	for (i=0; i<document.links.length; i++)
	{
		var thisLinkContent = document.links[i].innerText;
			
		// Convert all non-breaking space to plain for matching
		thisLinkContent = thisLinkContent.replace(/\\xA0/g, ' ');

		// IS it the next link?
		if (thisLinkContent.toLowerCase() == theLinkName.toLowerCase())
		{		
			return i;
		}
	}
	return -1;
}

function GetLinkOfClass(theClassName)
{
	for (i=0; i<document.links.length; i++)
	{
		var thisLinkClass = document.links[i].attributes.getNamedItem('class');
		if (thisLinkClass)
		{
			// IS it the next link?
			if (thisLinkClass.value.toLowerCase() == theClassName.toLowerCase())
			{
				return i;
			}
		}
	}
	return -1;
}

function GetLinkWithPrefixAndSuffix(thePrefix, theSuffix)
{
	for (i=0; i<document.links.length; i++)
	{
		var thisLinkContent = document.links[i].innerText;
			
		// Convert all non-breaking space to plain for matching
		thisLinkContent = thisLinkContent.replace(/\\xA0/g, ' ');

		// IS it the right prefix?
		if (thisLinkContent.indexOf(thePrefix) == 0)
		{
			// And the right suffix?
			if (thisLinkContent.substring(thisLinkContent.length - theSuffix.length, thisLinkContent.length) == theSuffix)
			{
				return i;
			}
		}
	}
	return -1;
}

function SalonGetNextLink()
{
	// What is our current index number?
	var mySearchToken = '/index';
	var myURLString = document.location.toString();
	var thisIndexOffset = myURLString.indexOf(mySearchToken);
	var currentIndex = 0;
	if (thisIndexOffset != -1)
	{
		thisIndexOffset +=  mySearchToken.length;
		var indexString = myURLString.substring(thisIndexOffset, myURLString.length);
		currentIndex = parseInt(indexString);
		if (isNaN(currentIndex)) currentIndex = 0;
	}

	// Ok - the next link will be numerically higher than current. Find a link suitably 
	// identified.
	var newIndex = currentIndex + 1;
	var newString = mySearchToken + newIndex;
	alert(newString);
	for (i=0; i<document.links.length; i++)
	{
		var thisLinkContent = document.links[i].toString();
		// IS it the next link?
		if (thisLinkContent.indexOf(newString) != -1)
		{
			return i;
		}	
	}
	return -1;
}

var foundLinkIndex = -1;

// First, check to see if it's a particular domain that 
// we have special rules for handling...
if (document.location.toString().indexOf('salon.com') != -1)
{
	foundLinkIndex = SalonGetNextLink();
}
else if (document.location.toString().indexOf('craigslist.org') != -1)
{
	foundLinkIndex = GetLinkWithPrefixAndSuffix('next ', ' postings')
}
else
{
	// Try by class ID
	foundLinkIndex = GetLinkOfClass('next');

	// Try a bunch of common link names 
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkIndexNamed('Next');	
	}
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkIndexNamed('\\nNext');	
	}
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkIndexNamed('>>');
	}
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkIndexNamed('>');
	}
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkWithPrefixAndSuffix('Next Page', '');
	}
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkWithPrefixAndSuffix('Next', '>');
	}	
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkWithPrefixAndSuffix('Next', '»');
	}		
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkWithImageNamed('next.jpg');
	}
	if (foundLinkIndex == -1)
	{
		foundLinkIndex = GetLinkWithImageNamed('next.gif');
	}
}

if (foundLinkIndex == -1)
{
	alert('Sorry, I could not find the next page of results for this page.');
}
else
{
	document.location=document.links[foundLinkIndex];
}

"
	do JavaScript myJavaScript in document 1
end tell


Here is a small example of how to fill in the login fields of this site, so that you have something you can look at in Safari’s Web Inspector or with View Source while you decode the JavaScript.


tell application "Safari"
	open location "http://macscripter.net/login.php"
	repeat
		set theReturn to do JavaScript "document.getElementById('navlogin').innerHTML" in document 1
		if theReturn is not missing value then exit repeat
		delay 0.2
	end repeat
	do JavaScript "oFormObject = document.forms['login']; oFormObject.elements['req_username'].value = 'USERNAME'; oFormObject.elements['req_password'].value = 'PASSWORD';document.forms['login'].submit();" in document 1
end tell
return

I hope this helps you a little.

jlazarus:

As yet, your specific goal isn’t clear.

If you’re trying to ‘collect’ form data from local HTML pages, then Nik’s method or Missing Link can do that for you. Either might also work with networked computers or those you have remote access to by writing the data to file on that computer and reading it from yours.

If you’re wanting to retrieve data from forms on the web, I don’t think Nik’s method or ML can help.

Clarifying your goal might help us answer your question.

PB


My specific goal is to create a script that will populate values in PayPal to generate new button code. I can handle the field population - and I can script clicking the button to submit - but then I get stuck.

The general steps in PayPal are:

  1. Fill in parameters (item name, price, etc.)
  2. Click submit
  3. PayPal translates data into code.

PayPal’s script delivers a code snippet into a subsequent field, which the user (me) can copy and paste into an HTML document. I can’t find a way for AppleScript to grab that code. I’m not sure I get the nuance between ‘collect’ and ‘retrieve’ - so forgive my imprecise language. At the end of the day, I’m trying to set a variable to the contents of the form/field.

I hope this is clearer. I’d post my preliminary AppleScript - but I’ve been approaching this in tiny steps so far - and don’t have anything yet that will make sense. I’ll work on that next.

You may want to check out Keyboard Maestro. It’s a macro program that, in its latest version, can do some fairly sophisticated browser automation. I haven’t experimented much with it, but it could save you a ton of time over trying to shoehorn AppleScript into there.

The other thought I have would be to use Safari to download the contents of the page (either as text or HTML, depending on how each is formatted) and then parse the text within AppleScript to find the right snippet to copy out. Probably easier than trying to control the browser’s selection.

Hello.

You really don’t have to download anything, you can get at the text by setting a variable to the source of a Safari Document:

set AppleScript's text item delimiters to ""
set frag1 to "You may want"
set frag2 to "browser's selection"
tell application "Safari"
	set a to source of its document 1
	set AppleScript's text item delimiters to {frag1}
	set b to item 2 of text items of a
	set AppleScript's text item delimiters to {frag2}
	set c to item 1 of text items of b
	set AppleScript's text item delimiters to ""
	set d to frag1 & c & frag2
end tell
d
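
If you would rather do the slicing inside the page than in AppleScript, the same idea can be sketched in JavaScript. This is only an illustration under the same assumptions as the script above (two known fragments that bracket the text you want); extractBetween is a hypothetical helper name.

```javascript
// Hypothetical helper: return the part of `source` that starts at the first
// occurrence of startFrag and ends with the next occurrence of endFrag,
// both fragments included. Returns null if either fragment is not found.
function extractBetween(source, startFrag, endFrag) {
  var start = source.indexOf(startFrag);
  if (start === -1) return null;
  // Look for the end fragment only after the start fragment.
  var end = source.indexOf(endFrag, start + startFrag.length);
  if (end === -1) return null;
  return source.substring(start, end + endFrag.length);
}
```

Unlike the text item delimiters version, which raises an error when a fragment is missing, this sketch returns null, so the calling script can test for that case.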

Thanks D. I will experiment with parsing the source.

For what it’s worth, here’s my preliminary code:

set bookingName to "Test Name"
set bookingID to "Test ID"
set bookingPrice to "300"

tell application "Safari"
	activate
	open location "https://www.paypal.com/us/cgi-bin/?cmd=_web-tools&fli=true" -- create Buy Now link
	
	delay 3
	
	tell document 1
		
		do JavaScript "document.forms['buttonDesignerForm']['itemName'].value = '" & bookingName & "'"
		do JavaScript "document.forms['buttonDesignerForm']['itemID'].value = '" & bookingID & "'"
		do JavaScript "document.forms['buttonDesignerForm']['itemPrice'].value = '" & bookingPrice & "'"
		do JavaScript "document.getElementsByName('create_button_submit')[0].click()"

		-- And this is where I need help with code to grab the field contents produced from the steps above
		
	end tell
end tell

In case anyone has additional ideas.

Thanks everyone.

Hello.

If your element has an id, you may have a look at this line:

 set theReturn to do JavaScript "document.getElementById('navlogin').innerHTML" in document 1

You can look at the other getElement functions (getElementsByName, getElementsByTagName), or search the web for JavaScript expressions that return values.
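
For the case where the generated code lands in a field without a known id, a rough sketch follows. It assumes the result page puts the snippet into a textarea element, which I have not verified for PayPal; firstFilledTextarea is a hypothetical helper name. If the field does have an id, document.getElementById('...').value is simpler.

```javascript
// Hypothetical helper: return the contents of the first textarea on the page
// that is not empty, or null when there is none (yet).
function firstFilledTextarea(doc) {
  var areas = doc.getElementsByTagName('textarea');
  for (var i = 0; i < areas.length; i++) {
    if (areas[i].value && areas[i].value.length > 0) {
      return areas[i].value;
    }
  }
  return null;
}

// In a browser, the value of this last expression is the result of the script.
// Outside a browser (no global `document`) it simply evaluates to null.
typeof document !== 'undefined' ? firstFilledTextarea(document) : null;
```

Run through do JavaScript after the submit click, this would need the same kind of repeat loop with a short delay as the login example, because the field stays empty until the page has finished generating the code. A null result should arrive in AppleScript as missing value, which is what that loop tests for.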