Get the printable version of HTML pages and save them to a folder

Hello.

Today I clicked through the pages at MacDevCenter.com and filled a Safari window with 116 pages I wanted to read. When I started writing a script to save them to a folder, I discovered that the only title on each page is MacDevCenter.

Daniel Jalkut’s “click next link” script came to the rescue, since it contains a handler that searches for a link by its text and “clicks” it with JavaScript. Unfortunately, I can’t find the script over at red-sweater.com anymore, but I posted the original version here.

Well, here is the script, should you end up in a similar situation. It may feel slow, but going through some 100 tabs, clicking the “Print” link in each one, and then choosing a name to save it under, now that is just aggravating! (When you know you can script it!) ;)

--	Main Script
-- Adjust this to an existing folder on your own machine (HFS path, with the trailing colon).
property macdevcenterFolder : "Macintosh HD:Users:McUsr:Desktop:MacDevCenter:"
-- The javascript is totally stolen from Daniel Jalkut of Red Sweater Software
set myJavaScript to "

function GetLinkIndexNamed(theLinkName)
{
   for (var i = 0; i < document.links.length; i++)
   {
       var thisLinkContent = document.links[i].innerText;
           
       // Convert all non-breaking space to plain for matching
       thisLinkContent = thisLinkContent.replace(/\\xA0/g, ' ');

       // Is it the link we are looking for?
       if (thisLinkContent.toLowerCase() == theLinkName.toLowerCase())
       {        
           return i;
       }
   }
   return -1;
}
var foundLinkIndex = -1;
foundLinkIndex = GetLinkIndexNamed('Print'); 
if (foundLinkIndex != -1)
{
   document.location = document.links[foundLinkIndex].href;
}

"
-- You have opened a bunch of articles from http://macdevcenter.com
-- and want to save them for later under a title that tells you
-- what each article is about.
-- The Safari window is loaded with the articles, and the folder to
-- save them in is given by the property at the top of this script.
tell application "Safari"
	tell its front window
		set theTabs to every tab whose URL contains "MacDevCenter"
	end tell
	
	repeat with aTab in theTabs
		-- Address the tab itself rather than "document 1", so that every
		-- command below goes to this tab even when it is not frontmost.
		tell aTab
			-- We "click" the Print link to get a friendly title
			-- and a reader-friendly format.
			do JavaScript myJavaScript
			
			repeat while true
				-- Wait until the printable page has loaded properly.
				if (do JavaScript "document.readyState") is "complete" then
					set fullTitle to (do JavaScript "document.title")
					if fullTitle starts with "MacDevCenter.com: " then exit repeat
				end if
				delay 0.2
			end repeat
			-- This is what we are interested in.
			set theStuff to source
		end tell
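		-- Keep everything after the site prefix; a ": " inside the title becomes
		-- " - ", since a colon is the path separator in an HFS path.
		set thisTitle to my textItems({2, -1}, fullTitle, ": ", " - ")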
		-- we create the filename to save it under
		set fullHfs to macdevcenterFolder & thisTitle & ".html"
		try
			my writeToFileAsUtf8(fullHfs, theStuff)
		on error e number n
			display dialog e & " " & n
		end try
		
	end repeat
end tell


to textItems(whatItems, theText, splitDelims, replaceDelim)
	local oldDelims, theExtract, theNewText
	
	if splitDelims = null and replaceDelim is not null then
		-- theText is a list here: join its items into one text with replaceDelim
		set {oldDelims, AppleScript's text item delimiters} to {AppleScript's text item delimiters, replaceDelim}
		set theExtract to theText as text
	else
		set {oldDelims, AppleScript's text item delimiters} to {AppleScript's text item delimiters, splitDelims}
		
		if class of whatItems = list then -- range
			set theExtract to text items (item 1 of whatItems as integer) thru (item 2 of whatItems as integer) of theText
			
			if replaceDelim is not null then -- swap delimiters
				set AppleScript's text item delimiters to replaceDelim
				set theExtract to theExtract as text
			end if
			
		else if whatItems = null then -- preflight
			set theExtract to (count text items of theText)
			
		else if whatItems = 0 then -- explode
			set theExtract to text items of theText
			
			if replaceDelim is not null then -- swap delimiters
				set AppleScript's text item delimiters to replaceDelim
				set theExtract to theExtract as text
			end if
		else -- single text item
			set theExtract to text item whatItems of theText
		end if
	end if
	set AppleScript's text item delimiters to oldDelims
	
	return theExtract
	
end textItems

on writeToFileAsUtf8(fName, stuff)
	local fRef, n, fsz
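	set fRef to open for access file fName with write permission
	try
		-- Truncate first, so that no stale bytes are left behind if an
		-- older, longer file with the same name already exists.
		set eof fRef to 0
		write stuff to fRef as «class utf8»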
		set fsz to get eof fRef
		close access fRef
		return fsz
	on error e number n
		try
			close access fRef
		end try
		error e number n
	end try
end writeToFileAsUtf8
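
If you want to check the link matching without opening Safari, something like the following should run under Node.js. It is only a sketch of the same lookup, not Daniel Jalkut's original: the names getLinkIndexNamed (taking an array) and sampleLinks are made up for the illustration, and the link texts are invented stand-ins for what document.links[i].innerText would give you on a real page.

```javascript
// Hypothetical stand-alone variant of GetLinkIndexNamed: it takes the link
// texts as a plain array instead of reading them from document.links.
function getLinkIndexNamed(linkTexts, theLinkName) {
    var wanted = theLinkName.toLowerCase();
    for (var i = 0; i < linkTexts.length; i++) {
        // Convert non-breaking spaces to plain spaces before comparing
        var thisLinkContent = linkTexts[i].replace(/\xA0/g, ' ');
        // The comparison ignores case, as in the script above
        if (thisLinkContent.toLowerCase() === wanted) {
            return i;
        }
    }
    return -1; // no link with that text
}

// Invented link texts, standing in for document.links[i].innerText
var sampleLinks = ['Home', 'Next\xA0Page', 'PRINT', 'Subscribe'];

console.log(getLinkIndexNamed(sampleLinks, 'Print'));     // 2
console.log(getLinkIndexNamed(sampleLinks, 'next page')); // 1
console.log(getLinkIndexNamed(sampleLinks, 'Email'));     // -1
```

Note that there is only a single backslash in /\xA0/ here. Inside the AppleScript string it has to be doubled, because AppleScript uses up one backslash before the text ever reaches Safari.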