Saving image from web page

Hi

I am following a comic at http://www.hs.fi/fingerpori/1135231647953 and would like to download all the strips to my hard drive. The strips are standard image files I can save, but the URL of each file seems to be based on some random number, so I have to click a link (“Seuraava”, Finnish for “next”) to get the URL of the next image. As far as I understand, this rules out curl, which would be the easiest option.

So, I would like to write an AppleScript that right-clicks the image, selects “Save Image to Downloads”, clicks the link to get the next strip, and then loops. I have already figured out the link-clicking part, but I don’t know how to do the image-saving part.

Any help would be appreciated. Of course, if there is an easier way to achieve the same result, I would be glad to hear it!

Hi,

try this:


set destFolder to POSIX path of ((path to desktop as Unicode text) & "WebImages:")
do shell script "/bin/mkdir -p ~/Desktop/WebImages"
tell application "Safari" to set numberOfPictures to do JavaScript "document.images.length" in document 1

set {TID, text item delimiters} to {text item delimiters, "/"}
repeat with i from 1 to numberOfPictures
	tell application "Safari" to set picID to do JavaScript "document.images[" & ((i - 1) as string) & "].id" in document 1
	if picID starts with "strip" then
		tell application "Safari" to set picURL to do JavaScript "document.images[" & ((i - 1) as string) & "].src" in document 1
		set fName to last text item of picURL
		do shell script "curl -o " & quoted form of (destFolder & fName) & space & quoted form of picURL
	end if
end repeat
set text item delimiters to TID

I tried your script but I got following error:

error “The variable picID is not defined.” number -2753 from “picID”

(it points to the line “if picID starts with "strip" then”)

Hm, I tested the script with your link above and it worked fine.
The page with the comic must be the current (frontmost) document in Safari.

Thank you very much for all your help! I now have the script working. I still got the same error a couple of times without changing anything in the script - very strange…

So, below is the whole script. The “repeat 5 times” is just for testing purposes - for the final version I must increase the number to something very big, because I don’t know the total number of strips. Maybe it would be possible to add a feature so that when the “Seuraava” link no longer exists (i.e. the latest strip is showing), the repeat ends and the script e.g. shows a message dialog. But of course it also stops when it encounters an error :slight_smile:


set destFolder to POSIX path of ((path to desktop as Unicode text) & "WebImages:")
do shell script "/bin/mkdir -p ~/Desktop/WebImages"

repeat 5 times
	
	-- re-read the image count on each pass, since clicking "Seuraava" loads a new page
	tell application "Safari" to set numberOfPictures to do JavaScript "document.images.length" in document 1
	set {TID, text item delimiters} to {text item delimiters, "/"}
	repeat with i from 1 to numberOfPictures
		tell application "Safari" to set picID to do JavaScript "document.images[" & ((i - 1) as string) & "].id" in document 1
		if picID starts with "strip" then
			tell application "Safari" to set picURL to do JavaScript "document.images[" & ((i - 1) as string) & "].src" in document 1
			set fName to last text item of picURL
			do shell script "curl -o " & quoted form of (destFolder & fName) & space & quoted form of picURL
		end if
	end repeat
	set text item delimiters to TID
	
	delay 2
	tell application "System Events" to tell UI element "Seuraava" of group 10 of UI element 1 of scroll area 1 of group 3 of window 1 of application process "Safari"
		repeat until exists
			delay 0.2
		end repeat
		click
	end tell
	delay 2 -- give the next page time to load before the next pass
	
end repeat
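On the open question of ending the repeat when the “Seuraava” link no longer exists: the stop condition itself is simple - after each page, test whether the link text is still present, and exit the loop if it is not. A minimal shell sketch of just that logic (the here-document stands in for successive page sources that would really come from curl; the download step is only a comment):

```shell
# Sketch of the stop condition only: stop looping when the fetched page
# no longer contains the link text "Seuraava". Each line of the
# here-document below stands in for one page source fetched with curl.
downloaded=0
while IFS= read -r page; do
	downloaded=$((downloaded + 1))
	# the curl -o download step for this page would go here
	if ! printf '%s' "$page" | grep -q 'Seuraava'; then
		echo "reached the last strip after $downloaded pages"
		break
	fi
done <<'EOF'
...strip 1... Seuraava
...strip 2... Seuraava
...strip 3 (latest, no next link)...
EOF
```

In the AppleScript itself the equivalent would be testing `exists` on the “Seuraava” UI element and doing an `exit repeat` (plus the message dialog) when it fails, instead of waiting for it forever; the shell version above only illustrates the logic.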

Here is the solution to your request. This script starts from “today’s” comic strip and works its way backwards through the number of pages you set in the repeat. You only need to modify the number of repeats for the number of days you want to download. I tested it with 50 and it worked fine. Running 10.6.2 (couldn’t resist the challenge).

set destFolder to POSIX path of ((path to desktop as Unicode text) & "WebImages:")
do shell script "/bin/mkdir -p ~/Desktop/WebImages"
set Start_URL to "http://www.hs.fi/fingerpori/"
set PageSource to do shell script "curl " & Start_URL

repeat 50 times
	
	--  Get date of comic strip --
	set x to the offset of ("<h1>Fingerpori</h1>") in PageSource
	set tmpTXT to text (x) thru (x + 50) of PageSource
	set TMP_Start to the (offset of ("<p>") in tmpTXT) + 3
	set TMP_End to the (offset of ("</p>") in tmpTXT) - 1
	set FileName to (text TMP_Start thru (TMP_End) of tmpTXT) & ".gif"
	
	--  Download the comic strip --
	set x to the offset of ("display: block;") in PageSource
	set tmpTXT to text (x) thru (x + 150) of PageSource
	set TMP_Start to the offset of ("http:") in tmpTXT
	set TMP_End to the (offset of ("align") in tmpTXT)
	set PictURL to text (TMP_Start) thru (TMP_End - 3) of tmpTXT
	do shell script "curl -o " & quoted form of (destFolder & FileName) & space & quoted form of PictURL
	
	-- Get the URL of the next webpage --
	set x to the offset of ("Edellinen") in PageSource
	set tmpTXT to text (x - 120) thru (x) of PageSource
	set TMP_Start to the offset of ("http:") in tmpTXT
	set TMP_End to the offset of ("class") in tmpTXT
	set full_URL to text (TMP_Start) thru (TMP_End - 3) of tmpTXT
	set PageSource to do shell script "curl " & quoted form of full_URL
	
end repeat
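If you prefer to keep the extraction in the shell rather than doing AppleScript offset arithmetic, the same “find the src after the display: block; marker” logic can be a single sed substitution. A minimal sketch, assuming the strip image is still tagged with `style="display: block;"` and a `src` attribute - the HTML sample below is a stand-in for the real hs.fi markup, not the actual page source:

```shell
# Sketch: pull the image URL out of the page source with sed instead of
# AppleScript offsets. "page" is a stand-in for the output of
# curl http://www.hs.fi/fingerpori/ - the real markup may differ.
page='<img style="display: block;" src="http://www.hs.fi/kuvat/strip123.gif" align="top">'
pict_url=$(printf '%s' "$page" | sed -n 's/.*display: block;[^<]*src="\([^"]*\)".*/\1/p')
echo "$pict_url"
```

The extracted URL could then be fed straight to `curl -o` as in the script above; the advantage over fixed offsets like `x + 150` is that the pattern keeps working if the attribute lengths change.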