Download an image from a web page

I want to be able to do this but am at a loss… I need to download an image from a web page (each page contains only one image) and store it in a pre-defined folder on the desktop. The action to automate would be the equivalent of clicking on the image when the page is open and dragging it into the folder on the desktop. I do not know the terms or syntax to use.

Would there be an issue in grabbing the image if the URL literally points to the image itself? Basically the URL is formatted as “http://blahblahpathname/abc.jpg”, so there's only one object on the page itself, i.e. the image.

tell application "System Events"
	if exists process "Safari" then
		download image now store to folder with default name
		close web page
	end if
end tell

I have no idea what the actual syntax for those middle lines should be. Thanks for any help.

Hi,

A nice and convenient way to do this is to use the scriptable JavaScript capabilities of Safari.app:

set destFolder to POSIX path of (choose folder)

tell application "Safari" to set numberOfPictures to do JavaScript "document.images.length" in document 1
repeat with i from 1 to numberOfPictures
	tell application "Safari" to set picURL to do JavaScript "document.images[" & ((i - 1) as string) & "].src" in document 1
	if picURL ends with "jpg" then
		set {TID, text item delimiters} to {text item delimiters, "/"}
		set fName to last text item of picURL
		set text item delimiters to TID
		do shell script "curl -o " & quoted form of (destFolder & fName) & space & picURL
	end if
end repeat

Hi StefanK, Thanks for your reply and help. I tried the following script and this is what I got…

set destFolder to POSIX path of ("MacHD:Users:Mainuser:Desktop:WebImages")

tell application "Safari" to set numberOfPictures to do JavaScript "document.images.length" in document 1
repeat with i from 1 to numberOfPictures
	tell application "Safari" to set picURL to do JavaScript "document.images[" & ((i - 1) as string) & "].src" in document 1
	if picURL ends with "jpg" then
		set {TID, text item delimiters} to {text item delimiters, "/"}
		set fName to last text item of picURL
		set text item delimiters to TID
		do shell script "curl -o " & quoted form of (destFolder & fName) & space & picURL
	end if
end repeat

“MacHD:Users:Mainuser:Desktop:WebImages” is supposed to be my destination folder for the downloaded image. When I ran the script, instead of the one image from the open web page, it downloaded many images from Safari's default home page (which was not open). The images also landed on the desktop rather than in the folder. Where have I gone wrong?

Hi StefanK, I forgot to mention also that the filenames of the downloaded files came out prefixed with the folder name. I do not want that to happen. For example, if I download an image from the URL “http://www.tandb.com.au/index1.jpg”, then the filename of the downloaded image should be index1.jpg, nothing more and nothing less. Thanks again.

To specify the destination folder, use this instead:

set destFolder to POSIX path of (path to desktop as Unicode text) & "WebImages"

The script grabs the images of the frontmost document in Safari; it works on my machine.

The script should do this :wink:

Moving to OS X.

Hi StefanK, I made the change and tested the script. The results are no different.

set destFolder to POSIX path of (path to desktop as Unicode text) & "WebImages"

tell application "Safari" to set numberOfPictures to do JavaScript "document.images.length" in document 1
repeat with i from 1 to numberOfPictures
	tell application "Safari" to set picURL to do JavaScript "document.images[" & ((i - 1) as string) & "].src" in document 1
	if picURL ends with "jpg" then
		set {TID, text item delimiters} to {text item delimiters, "/"}
		set fName to last text item of picURL
		set text item delimiters to TID
		do shell script "curl -o " & quoted form of (destFolder & fName) & space & picURL
	end if
end repeat

The images I got landed on the desktop, not in the WebImages folder. The filenames are still prefixed with “WebImages” before the image filename. Lastly, it grabbed the wrong images: it still takes images from the default home page (which is not open), not the single index1.jpg from the page “http://www.tandb.com.au/index1.jpg”.

You’re right, this works:

set destFolder to POSIX path of ((path to desktop as Unicode text) & "WebImages:")

The script loads all pictures from the frontmost document. (The trailing colon is what matters here: without it, destFolder has no trailing separator, so the folder name gets glued onto the front of each file name and the files end up on the desktop.)
I've tested it with your site and it loads 6 pictures, named index1.jpg through index6.jpg.
I don't understand why this doesn't work on your machine.

Hi StefanK, why is it that you get 6 pictures? With this URL, http://www.tandb.com.au/index1.jpg, I see only an image that says “T&B”. This is index1.jpg and is the only image I need. This is really puzzling.

It loads 6 pictures because I wrote the script to load all .jpg pictures.
If you want only the “index1.jpg” picture, change the line starting with if to:

if picURL ends with "index1.jpg" then

Hi StefanK, the script is right to download all the images, but what I am saying is that there should actually be only one image on that web page.

I am not sure why the script is not working as it should. I’ll see if I can work through it further. Though I feel a bit disheartened, whatever the result may be, I am still very grateful for your help. Thanks a bundle.

I think Stefan is making the same assumption that I made: that you want to go to any page and download all the images found. Hence the six images on http://www.tandb.com.au/

But what it seems you want to do is download the image that you are currently displaying in Safari:

http://www.tandb.com.au/index1.jpg

This page is not an HTML page (a web page) as such; it is the image itself being displayed by Safari.

So this may be what you want:

set destFolder to POSIX path of ((path to desktop as Unicode text) & "WebImages:")

tell application "Safari"
	
	set firstUrl to URL of document 1
	set fileName to (do shell script "basename " & quoted form of firstUrl)
	do shell script "curl -o " & quoted form of (destFolder & fileName) & space & firstUrl
	
end tell

Hi Mark, Thanks, you hit the nail on the head. Your script works for me.

However, I have a further question to expand this script. If I want the script to actually launch Safari (if it is not already open) and open the page directly, how do I pass the URL parameter to the AppleScript? Hardcoding the URL is not a solution, the reason being that I envisage this script eventually being saved as an application. I will run a loop whereby I can pass the URLs (one for each image) to the AppleScript application, which will then download the images to disk automatically. Further queries: how do I close each web page once the downloading is done? And granting that opening each page should be much faster than downloading the image, would I end up “crashing” Safari due to many concurrently open windows?

Sorry if I sound too demanding. I promise to put the final completed script on this forum for anyone else who might be interested in the same solution. Thanks again.

A quick question first: where are you getting the URLs from, if not Safari's address bar?

If you are not using Safari as the starting place for the URL, that makes a big difference in the direction the script can take, and it may make your other questions moot.

Hi Mark,

The point is that I'll always know the URL beforehand. Is it possible for the image on the web page to be downloaded without even opening Safari? The only reason for opening Safari is just so that I can pass the URL from Safari to the AppleScript, as in…

tell application "Safari"
	set firstUrl to URL of document 1

What I am doing is actually looping through each URL I have, then running the get-image script within each iteration. Is it possible in AppleScript to do something like this?

getpictapp("URL pathname"), where getpictapp is the name of the AppleScript application and "URL pathname" is the path to the image

You mean something like this?


to getpictapp(tgt_URL)
	set destFolder to POSIX path of (path to desktop) & "WebImages/"
	set tgt_URL to quoted form of tgt_URL
	set fileName to do shell script "basename " & tgt_URL
	do shell script "curl -o " & quoted form of (destFolder & fileName) & space & tgt_URL
end getpictapp

set URL_list to {"http://www.tandb.com.au/index1.jpg", "http://www.tandb.com.au/index2.jpg", "http://www.tandb.com.au/index3.jpg", "http://www.tandb.com.au/index4.jpg", "http://www.tandb.com.au/index5.jpg", "http://www.tandb.com.au/index6.jpg"}

repeat with tgt_URL in URL_list
	getpictapp(tgt_URL)
end repeat


Or my preference (depending on what else you’re doing):


to download_items from URL_list to destFolder
	repeat with i in URL_list
		set i to quoted form of i
		set fileName to do shell script "basename " & i
		do shell script "curl -o " & quoted form of (destFolder & fileName) & space & i
	end repeat
end download_items

set destFolder to POSIX path of (path to desktop) & "WebImages/"

set URL_list to {"http://www.tandb.com.au/index1.jpg", "http://www.tandb.com.au/index2.jpg", "http://www.tandb.com.au/index3.jpg", "http://www.tandb.com.au/index4.jpg", "http://www.tandb.com.au/index5.jpg", "http://www.tandb.com.au/index6.jpg"}

download_items from URL_list to destFolder


Hi Kai, Thanks for your input.

I tried this script and it works…

to download_items from URL_list to destFolder
	repeat with i in URL_list
		set i to quoted form of i
		set fileName to do shell script "basename " & i
		do shell script "curl -o " & quoted form of (destFolder & fileName) & space & i
	end repeat
end download_items

set destFolder to POSIX path of (path to desktop) & "WebImages/"

set URL_list to {"http://www.tandb.com.au/index1.jpg", "http://www.tandb.com.au/index2.jpg", "http://www.tandb.com.au/index3.jpg", "http://www.tandb.com.au/index4.jpg", "http://www.tandb.com.au/index5.jpg", "http://www.tandb.com.au/index6.jpg"}

download_items from URL_list to destFolder

My problem is in setting the URL_list. There are hundreds of image URLs; too many to specify manually. Can we get the script to read from a text file instead? If the URLs in the text file can be formatted in the same way as in the example script above, would it be difficult to parse the file to extract each URL?

No problem. If the URLs are listed in the file in a form like this…
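
(a plain text file with one URL per line, for example:)

http://www.tandb.com.au/index1.jpg
http://www.tandb.com.au/index2.jpg
http://www.tandb.com.au/index3.jpg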

… then extracting the text and converting it to a list is a trivial matter:

set URL_list to paragraphs of (read (choose file of type "TEXT" without invisibles))

To reverse the process and write a list to a file, you can use something like:

set URL_list to {"http://www.tandb.com.au/index1.jpg", "http://www.tandb.com.au/index2.jpg", "http://www.tandb.com.au/index3.jpg", "http://www.tandb.com.au/index4.jpg", "http://www.tandb.com.au/index5.jpg", "http://www.tandb.com.au/index6.jpg"}

set text item delimiters to return
set text_list to URL_list as string
set text item delimiters to {""}
set file_path to (path to desktop as Unicode text) & "URL_list.txt"
set file_ref to open for access file file_path with write permission
set eof file_ref to 0 -- overwrite file if it already exists
write text_list to file_ref
close access file_ref

Hi All,

I want to say a big thank you to all who have helped me on this forum. I have achieved what I set out to do, i.e. download 2000-plus images from a website with minimal intervention. I couldn't have done it without the generous contributions of all of you, so again, a big THANK YOU!

I started out simply wanting to read a URL with AppleScript, but ended up learning how to grab images from various avenues: reading a URL from Safari, from a text string embedded in the AppleScript, from a text string on the clipboard, and from a text file. The final script even had a progress indicator to boot, again something I sourced from these forums. It has been an invaluable learning experience.
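
The clipboard variant, for instance, boils down to something like this (only a sketch; getpictapp here is the download handler Kai posted above):

set imageURL to (the clipboard as text)
getpictapp(imageURL)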

For what it's worth, or for anyone else who might need such a script, this is the script I finally used:

set URL_list to paragraphs of (read (choose file of type "TEXT" without invisibles))

set total_count to 2328 -- total number of image URLs in the list, used only for the progress display
set iteration to 1

repeat with imageURL in URL_list
	set main_string to "Downloading Web Images...."
	set sub_string to ("Processing item " & (iteration as string) & " of " & (total_count as string) & "...")
	my display_progress(iteration, total_count, main_string, sub_string)
	
	set imageURL to quoted form of imageURL
	set fileName to do shell script "basename " & imageURL
	set saveFilePath to (((path to desktop as Unicode text) & "WebImages:") as string) & fileName
	set testFormInfo to ""
	curlSimpleDownload(imageURL, saveFilePath, testFormInfo)
	
	set iteration to (iteration + 1)
end repeat
display dialog "All done." buttons {"OK"} default button 1 with icon 1 giving up after 2328


on curlSimpleDownload(downloadURL, destExpected, theFormInfo)
	-- version 1.2, Daniel A. Shockley - public domain
	-- downloadURL is STRING
	-- saves to destExpected (Mac path as STRING, FILE SPEC, or ALIAS), if given
	-- if destExpected is "", returns source result directly as string
	-- optional form data for POST - use "" for no form data
	try
		-- basic download to standard output
		set curlCode to "curl '" & downloadURL & "'"
		
		if (length of theFormInfo) > 0 then
			set curlCode to curlCode & " -d \"" & theFormInfo & "\""
		end if
		
		-- now, add on the desired file location, if there is one given
		if destExpected is not "" then
			set unixDestExpected to quoted form of POSIX path of (destExpected as string)
			set curlCode to curlCode & " --output " & unixDestExpected & " --write-out \"%{http_code}\""
		else -- result as string
			set curlCode to curlCode & " | vis" -- pipe into vis to strip nonprintable characters
		end if
		
		set curlResponse to do shell script curlCode
		
		return curlResponse
		(*
			curlResponse will be the http success code ("200"), or an error code.
			If no destination was given, curlResponse will be the source
			returned, and no file will be saved.
		*)
	on error errMsg number errNum
		error "curlSimpleDownload FAILED: " & errMsg number errNum
	end try
end curlSimpleDownload


on display_progress(iteration, total_count, main_string, sub_string)
	--choose from this list for the "unprocessed" part of the progress bar
	set the_counter_opts to {" ", "-------------------------------------------", "oooooooooooooooooooooooooooooooooooooooo"}
	set the_counter to (characters of (item 2 of the_counter_opts))
	set the_percentage to (round of ((iteration / total_count) * (count of the_counter)) rounding down)
	if the_percentage is not equal to 0 then
		repeat with i from 1 to the_percentage
			set item i of the_counter to "?"
		end repeat
	end if
	set the_counter to the_counter as string
	set the_message to main_string & return & return & sub_string & return & return & the_counter
	display dialog the_message buttons {"Cancel"} giving up after 1
end display_progress

Hi everybody,
This script is pretty cool, but how would it be possible to get the images from all tabs at once?
Considering I have a Safari window with many tabs open, with image content in each tab, I tried adding a “repeat” action but it does nothing…
My purpose is to make the script save an image to my folder “TEMPORAIRE”, close the tab, save the next image… and so on until there are no more open tabs in Safari.
Another question: how do I change the path to “TEMPORAIRE”? I'd like the script to save the images to a folder at “Users/me/images/TEMPORAIRE” rather than on the Desktop.

Your help would be much appreciated!!
I'm a newbie, yeahhh!

Here is the script that doesn’t work…

set destFolder to POSIX path of ((path to desktop as Unicode text) & "TEMPORAIRE:")
tell application "Safari" to activate
tell application "Safari" to set numberOfPictures to do JavaScript "document.images.length" in document 1

repeat until window 1 is true
	
	repeat with i from 1 to numberOfPictures
		tell application "Safari" to set picURL to do JavaScript "document.images[" & ((i - 1) as string) & "].src" in document 1
		if picURL ends with "jpg" then
			set {TID, text item delimiters} to {text item delimiters, "/"}
			set fName to last text item of picURL
			set text item delimiters to TID
			do shell script "curl -o " & quoted form of (destFolder & fName) & space & picURL
			
			tell application "System Events"
				tell process "Safari"
					keystroke "w" using command down
					delay 0.5
				end tell
			end tell
			
			
			if exists document 1 = false then exit repeat
		end if
	end repeat
end repeat
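
One possible direction (just an untested sketch; it assumes a Safari version whose AppleScript dictionary exposes tabs, and that the folder “Users/me/images/TEMPORAIRE” already exists) is to loop over the tabs of the front window and feed each tab's URL to curl, closing the window only once everything has been downloaded:

set destFolder to POSIX path of (path to home folder) & "images/TEMPORAIRE/" -- assumed location; curl will not create the folder for you

tell application "Safari"
	repeat with t in tabs of window 1
		-- each tab is displaying the image directly, so the tab's URL is the image URL
		set picURL to URL of t
		if picURL ends with "jpg" then
			set fName to do shell script "basename " & quoted form of picURL
			do shell script "curl -o " & quoted form of (destFolder & fName) & space & quoted form of picURL
		end if
	end repeat
	close window 1 -- closes all the tabs in one go after the downloads are done
end tell

Closing tabs inside the repeat loop tends to confuse the iteration, which is why this sketch downloads everything first and closes the window at the end.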