Download an image from a web page

Hi StefanK, why is it that you get 6 pictures? With this URL, http://www.tandb.com.au/index1.jpg, I see only an image that says “T&B”. This is index1.jpg and is the only image I need. This is really puzzling.

It loads 6 pictures, because I wrote the script to load all .jpg pictures.
If you want only the “index1.jpg” picture, change the line starting with if to

if picURL ends with "index1.jpg" then

Hi StefanK, the script is right to download all images, but what I am saying is that there actually should be only 1 image on that web page.

I am not sure why the script is not working as it should. I’ll see if I can work through it further. Though I feel a bit disheartened, whatever the result may be, I am still very grateful for your help. Thanks a bundle.

I think Stefan is making the same assumption that I made: that you want to go to any page and download all the images found there. Hence the six images on http://www.tandb.com.au/

But what it seems you want to do is download the image that you are displaying in Safari.

http://www.tandb.com.au/index1.jpg

This URL is not an HTML page (a web page) as such; it is the image itself being displayed by Safari.

So this may be what you want.

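-- The WebImages folder must already exist on the desktop
set destFolder to POSIX path of ((path to desktop as Unicode text) & "WebImages:")

tell application "Safari"
	set firstUrl to URL of document 1 -- URL shown in the front window
end tell

-- Quote the URL for the shell and use its last path component as the file name
set quotedUrl to quoted form of firstUrl
set fileName to (do shell script "basename " & quotedUrl)
do shell script "curl -o " & quoted form of (destFolder & fileName) & space & quotedUrl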

Hi Mark, Thanks, you hit the nail on the head. Your script works for me.

However, I have a further question about expanding this script. If I want the script to actually launch Safari (if it is not already open) and open the page directly, how do I pass the URL parameter to AppleScript? Hardcoding the URL is not a solution, because I envisage this script eventually being saved as an application. I will run a loop in which I pass the URL of each image to the AppleScript application, which will then download the image to disk automatically. Further queries: how do I close each web page once the download is done? Given that opening each page should be much faster than downloading the image, would I end up “crashing” Safari because of the many concurrently open windows?

Sorry if I sound too demanding. I will promise to put the final completed script on this forum for anyone else who might be interested in this same solution. Thanks again.

A quick question first: where are you getting the URLs from, if not Safari’s address bar?

If you are not using Safari as the starting place for the URL, that makes a big difference in the direction the script can take, and it may make your other questions moot.

Hi Mark,

Point is that I’ll always know the URL beforehand. Is it possible that the image on the web page can actually be downloaded without even opening Safari? The only reason for opening Safari is just so that I can pass the URL from Safari to Applescript as in …

tell application "Safari"
	set firstUrl to URL of document 1
end tell

What I am doing is actually looping through each URL I have, then running the get-image script within each iteration. Is it possible in AppleScript to do something like this?

getpictapp("URL pathname"), where getpictapp is the name of the AppleScript and URL pathname is the path to the image.

You mean something like this?


to getpictapp(tgt_URL)
	set destFolder to POSIX path of (path to desktop) & "WebImages/"
	set tgt_URL to quoted form of tgt_URL
	set fileName to do shell script "basename " & tgt_URL
	do shell script "curl -o " & quoted form of (destFolder & fileName) & space & tgt_URL
end getpictapp

set URL_list to {"http://www.tandb.com.au/index1.jpg", "http://www.tandb.com.au/index2.jpg", "http://www.tandb.com.au/index3.jpg", "http://www.tandb.com.au/index4.jpg", "http://www.tandb.com.au/index5.jpg", "http://www.tandb.com.au/index6.jpg"}

repeat with tgt_URL in URL_list
	getpictapp(tgt_URL)
end repeat


Or my preference (depending on what else you’re doing):


to download_items from URL_list to destFolder
	repeat with i in URL_list
		set i to quoted form of i
		set fileName to do shell script "basename " & i
		do shell script "curl -o " & quoted form of (destFolder & fileName) & space & i
	end repeat
end download_items

set destFolder to POSIX path of (path to desktop) & "WebImages/"

set URL_list to {"http://www.tandb.com.au/index1.jpg", "http://www.tandb.com.au/index2.jpg", "http://www.tandb.com.au/index3.jpg", "http://www.tandb.com.au/index4.jpg", "http://www.tandb.com.au/index5.jpg", "http://www.tandb.com.au/index6.jpg"}

download_items from URL_list to destFolder
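
If the URLs come from outside the script altogether, one possible route (not something suggested in this thread) is to run the script with the osascript command-line tool, which hands any extra arguments to the run handler. The sketch below assumes a hypothetical script file named getpict.scpt, and that the WebImages folder already exists on the desktop:

```applescript
-- Hypothetical file name: getpict.scpt
-- Run from Terminal as:  osascript getpict.scpt URL [URL ...]
on run argv
	-- argv is the list of command-line arguments passed by osascript
	set destFolder to POSIX path of (path to desktop) & "WebImages/"
	repeat with tgt_URL in argv
		set quotedURL to quoted form of (contents of tgt_URL)
		set fileName to do shell script "basename " & quotedURL
		do shell script "curl -o " & quoted form of (destFolder & fileName) & space & quotedURL
	end repeat
end run
```

This only applies when the script is launched through osascript; a script saved as an application and double-clicked does not receive arguments this way.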


Hi Kai, Thanks for your input.

I tried this script and it works…

to download_items from URL_list to destFolder
	repeat with i in URL_list
		set i to quoted form of i
		set fileName to do shell script "basename " & i
		do shell script "curl -o " & quoted form of (destFolder & fileName) & space & i
	end repeat
end download_items

set destFolder to POSIX path of (path to desktop) & "WebImages/"

set URL_list to {"http://www.tandb.com.au/index1.jpg", "http://www.tandb.com.au/index2.jpg", "http://www.tandb.com.au/index3.jpg", "http://www.tandb.com.au/index4.jpg", "http://www.tandb.com.au/index5.jpg", "http://www.tandb.com.au/index6.jpg"}

download_items from URL_list to destFolder

My problem is in setting the URL_list. There are hundreds of image URLs; too many to specify manually. Can we get the script to read from a text file instead? If the URLs in the text file can be formatted in the same way as in the example script above, would it be difficult to parse the file to extract each URL?

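No problem. If the URLs are listed in the file one per line, in a form like this…

http://www.tandb.com.au/index1.jpg
http://www.tandb.com.au/index2.jpg
http://www.tandb.com.au/index3.jpg

… then extracting the text and converting it to a list is a trivial matter: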

set URL_list to paragraphs of (read (choose file of type "TEXT" without invisibles))
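
A text file often ends with an empty line, which would turn up in the list as an empty URL. As a minimal sketch (the handler name nonEmptyLines is made up for illustration), the empty entries can be dropped before the list is used:

```applescript
-- Return only the non-empty lines of a text, as a list of strings.
-- Lines containing only spaces are NOT removed by this simple check.
on nonEmptyLines(theText)
	set cleanList to {}
	repeat with aLine in paragraphs of theText
		set aLine to contents of aLine
		if aLine is not "" then set end of cleanList to aLine
	end repeat
	return cleanList
end nonEmptyLines

-- Possible use with the line above:
-- set URL_list to nonEmptyLines(read (choose file of type "TEXT" without invisibles))
```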

To reverse the process, and write a list to file, you can use something like:

set URL_list to {"http://www.tandb.com.au/index1.jpg", "http://www.tandb.com.au/index2.jpg", "http://www.tandb.com.au/index3.jpg", "http://www.tandb.com.au/index4.jpg", "http://www.tandb.com.au/index5.jpg", "http://www.tandb.com.au/index6.jpg"}

set text item delimiters to return
set text_list to URL_list as string
set text item delimiters to {""}
set file_path to (path to desktop as Unicode text) & "URL_list.txt"
set file_ref to open for access file file_path with write permission
set eof file_ref to 0 -- overwrite file if it already exists
write text_list to file_ref
close access file_ref
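
If the write fails part-way, the file can be left open. A hedged variation on the snippet above wraps the write in a try block so the file is closed either way; the handler name writeLines is an invention for this sketch, and file_path is a Mac-style (colon-delimited) path as in the original:

```applescript
-- Write a list of strings to a file, one item per line, always closing the file.
on writeLines(theList, file_path)
	set {TID, text item delimiters} to {text item delimiters, return}
	set text_list to theList as string
	set text item delimiters to TID
	set file_ref to open for access file file_path with write permission
	try
		set eof file_ref to 0 -- overwrite file if it already exists
		write text_list to file_ref
		close access file_ref
	on error errMsg number errNum
		close access file_ref -- do not leave the file open after a failure
		error errMsg number errNum
	end try
end writeLines
```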

Hi All,

I want to say a big thank you to all who have helped me on this forum. I have achieved what I set out to do, i.e. download 2000+ images from a website with minimal intervention. I couldn’t have done it without the generous contributions of all of you. Again, a big THANK YOU!

I started out simply wanting to read a URL with AppleScript, but ended up learning how to get images by various routes, namely reading a URL from Safari, from a text string embedded in the AppleScript, from a text string on the clipboard, and from a text file. The final script even has a progress indicator to boot, again something I sourced from the forums. It has been an invaluable learning experience.

For what it’s worth, or any who might need such a script, this is the script I finally used:

set URL_list to paragraphs of (read (choose file of type "TEXT" without invisibles))

set total_count to count of URL_list -- derive the total from the list instead of hardcoding it
set iteration to 1

repeat with imageURL in URL_list
	set main_string to "Downloading Web Images...."
	set sub_string to ("Processing item " & (iteration as string) & " of " & (total_count as string) & "...")
	my display_progress(iteration, total_count, main_string, sub_string)
	
	set imageURL to quoted form of imageURL
	set fileName to do shell script "basename " & imageURL
	set saveFilePath to (((path to desktop as Unicode text) & "WebImages:") as string) & fileName
	set testFormInfo to ""
	curlSimpleDownload(imageURL, saveFilePath, testFormInfo)
	
	set iteration to (iteration + 1)
end repeat
display dialog "All done." buttons {"OK"} default button 1 with icon 1 giving up after 2328


on curlSimpleDownload(downloadURL, destExpected, theFormInfo)
	-- version 1.2, Daniel A. Shockley - public domain
	-- downloadURL is STRING
	-- saves to destExpected (Mac path as STRING, FILE SPEC, or ALIAS), if given
	-- if destExpected is "", returns source result directly as string
	-- optional form data for POST - use "" for no form data
	try
		-- basic download to standard output
		set curlCode to "curl '" & downloadURL & "'"
		
		if (length of theFormInfo) > 0 then
			set curlCode to curlCode & " -d \"" & theFormInfo & "\""
		end if
		
		-- now, add on the desired file location, if there is one given
		if destExpected is not "" then
			set unixDestExpected to quoted form of POSIX path of (destExpected as string)
			set curlCode to curlCode & " --output " & unixDestExpected & " --write-out \"%{http_code}\""
		else -- result as string
			set curlCode to curlCode & " | vis" -- pipe into vis to strip nonprintable characters
		end if
		
		set curlResponse to do shell script curlCode
		
		return curlResponse
		(*
		curlResponse will be the http success code ("200"), or an error code.
		If no destination was given, curlResponse will be the source 
		returned, and no file will be saved
	*)
	on error errMsg number errNum
		error "curlSimpleDownload FAILED: " & errMsg number errNum
	end try
end curlSimpleDownload


on display_progress(iteration, total_count, main_string, sub_string)
	--choose from this list for the "unprocessed" part of the progress bar
	set the_counter_opts to {" ", "-------------------------------------------", "oooooooooooooooooooooooooooooooooooooooo"}
	set the_counter to (characters of (item 2 of the_counter_opts))
	set the_percentage to (round of ((iteration / total_count) * (count of the_counter)) rounding down)
	if the_percentage is not equal to 0 then
		repeat with i from 1 to the_percentage
			set item i of the_counter to "?"
		end repeat
	end if
	set the_counter to the_counter as string
	set the_message to main_string & return & return & sub_string & return & return & the_counter
	display dialog the_message buttons {"Cancel"} giving up after 1
end display_progress
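
The loop in the script above discards the value returned by curlSimpleDownload, so a missing image (for example a 404) goes unnoticed. As a rough sketch, separate from the handlers above, a download could report anything other than HTTP 200. The handler name downloadOrReport is hypothetical; imageURL is assumed to be the plain, unquoted URL and destPath a POSIX path:

```applescript
-- Download one URL with curl and return "" on success, or a short problem description.
on downloadOrReport(imageURL, destPath)
	set cmd to "curl --silent --location --output " & quoted form of destPath & " --write-out '%{http_code}' " & quoted form of imageURL
	try
		-- With --output and --silent, the only text curl prints is the status code
		set statusCode to do shell script cmd
	on error errMsg
		-- curl itself failed (no network, unknown host, unwritable path, ...)
		return "curl failed for " & imageURL & ": " & errMsg
	end try
	if statusCode is not "200" then return "HTTP " & statusCode & " for " & imageURL
	return ""
end downloadOrReport
```

Collecting the non-empty results in a list during the loop would give a summary of failed downloads at the end of a long run.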

Hi everybody,
this script is pretty cool, but how would it be possible to get images from all tabs at once?
I have a Safari window with many tabs open, each containing an image. I tried a “repeat” loop, but it does nothing…
My goal is to have the script save the image into my folder “TEMPORAIRE”, close the tab, save the next image… and so on until there are no more open tabs in Safari.
Another question: how do I change the path to “TEMPORAIRE”? I’d like the script to save images to a folder at “Users/me/images/TEMPORAIRE” rather than on the Desktop.

Your help would be much appreciated !!
I’m a newbie, yeahhh !

Here is the script that doesn’t work…

set destFolder to POSIX path of ((path to desktop as Unicode text) & "TEMPORAIRE:")
tell application "Safari" to activate
tell application "Safari" to set numberOfPictures to do JavaScript "document.images.length" in document 1

repeat until window 1 is true
	
	repeat with i from 1 to numberOfPictures
		tell application "Safari" to set picURL to do JavaScript "document.images[" & ((i - 1) as string) & "].src" in document 1
		if picURL ends with "jpg" then
			set {TID, text item delimiters} to {text item delimiters, "/"}
			set fName to last text item of picURL
			set text item delimiters to TID
			do shell script "curl -o " & quoted form of (destFolder & fName) & space & picURL
			
			tell application "System Events"
				tell process "Safari"
					keystroke "w" using command down
					delay 0.5
				end tell
			end tell
			
			
			if exists document 1 = false then exit repeat
		end if
	end repeat
end repeat

Hi,

welcome to MacScripter

alitaliano <> temporaire? Do you come from Valle d’Aosta? ;)

There is no need for GUI scripting. Since Safari 4, tabs are easily scriptable:


set destFolder to POSIX path of (path to pictures folder as Unicode text) & "TEMPORAIRE:"
set {TID, text item delimiters} to {text item delimiters, "/"}
tell application "Safari"
	repeat with aTab from 1 to (count tabs of window 1)
		tell window 1 to set current tab to tab aTab -- go to next tab
		set numberOfPictures to do JavaScript "document.images.length" in document 1 -- document 1 is always the visible tab
		repeat with aPicture from 1 to numberOfPictures
			set picURL to do JavaScript "document.images[" & ((aPicture - 1) as string) & "].src" in document 1
			if picURL ends with "jpg" then
				set fName to last text item of picURL
				do shell script "curl -o " & quoted form of (destFolder & fName) & space & picURL
			end if
		end repeat
	end repeat
end tell
set text item delimiters to TID

Note: Without any error handling, images with the same file name will overwrite existing ones.
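
One way to avoid the overwriting, sketched here with two made-up handler names (uniquePath and fileExists), is to test whether the target already exists and put a counter in front of the name until a free one is found:

```applescript
-- Return a POSIX path inside destFolder that does not exist yet.
-- destFolder must end with "/"; the result is fName, or "1-" & fName, "2-" & fName, ...
on uniquePath(destFolder, fName)
	set candidate to destFolder & fName
	set n to 1
	repeat while fileExists(candidate)
		set candidate to destFolder & (n as string) & "-" & fName
		set n to n + 1
	end repeat
	return candidate
end uniquePath

-- True if something exists at the given POSIX path.
on fileExists(posixPath)
	try
		-- "test -e" exits non-zero when the path is missing, which raises an error here
		do shell script "test -e " & quoted form of posixPath
		return true
	on error
		return false
	end try
end fileExists
```

In the script above, the curl line would then use quoted form of uniquePath(destFolder, fName), called with the my keyword because it sits inside the tell block.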

Hey StefanK, the legend ! :stuck_out_tongue:
No, I’m french :rolleyes:
Thank you so much for your reply !
But something is still wrong: the pictures are saved in the “pictures folder” but not in the “Temporaire” folder.
I tried to add the ability to close each tab once it has been downloaded, without success.
Can you help me again? Thank you!

set destFolder to POSIX path of ((path to desktop as Unicode text) & "TEMPORAIRE:")
set {TID, text item delimiters} to {text item delimiters, "/"}
tell application "Safari"
	repeat with aTab from 1 to (count tabs of window 1)
		tell window 1 to set current tab to tab aTab -- go to next tab
		set numberOfPictures to do JavaScript "document.images.length" in document 1 -- document 1 is always the visible tab
		repeat with aPicture from 1 to numberOfPictures
			set picURL to do JavaScript "document.images[" & ((aPicture - 1) as string) & "].src" in document 1
			if picURL ends with "jpg" then
				set fName to last text item of picURL
				do shell script "curl -o " & quoted form of (destFolder & fName) & space & picURL
			end if
		end repeat
	end repeat
end tell
set text item delimiters to TID

Sorry, as we are dealing with POSIX paths, the last character must be a slash.
This version of the script also closes every tab which doesn’t contain any images.


set destFolder to POSIX path of (path to pictures folder as Unicode text) & "TEMPORAIRE/"
set {TID, text item delimiters} to {text item delimiters, "/"}
tell application "Safari"
	repeat with aTab from 1 to (count tabs of window 1)
		tell window 1 to set current tab to tab aTab -- go to next tab
		set numberOfPictures to do JavaScript "document.images.length" in document 1 -- document 1 is always the visible tab
		if numberOfPictures = 0 then tell window 1 to close current tab
		repeat with aPicture from 1 to numberOfPictures
			set picURL to do JavaScript "document.images[" & ((aPicture - 1) as string) & "].src" in document 1
			if picURL ends with "jpg" then
				set fName to last text item of picURL
				do shell script "curl -o " & quoted form of (destFolder & fName) & space & picURL
			end if
		end repeat
	end repeat
end tell
set text item delimiters to TID


I get an error, could you tell me why ?
Thanks again for your help ! :cool:

Try quoting the URL as well:

…
do shell script "curl -o " & quoted form of (destFolder & fName) & space & quoted form of picURL
…

Nope, it doesn’t work and I have the same error.
The pictures are still saved in the “pictures folder” in my home folder, though.
If you have the time and patience to tell me why…
Anyway, thank you so much :wink: for paying attention to a newbie like me!!

Of course, the script assumes that the folder TEMPORAIRE exists in your pictures folder.
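
To remove that assumption, the folder can be created before downloading. A minimal sketch, with a hypothetical handler name ensureFolder, using the shell's mkdir -p (which does nothing if the folder is already there):

```applescript
-- Create the folder at posixPath (and any missing parent folders) if needed.
on ensureFolder(posixPath)
	do shell script "mkdir -p " & quoted form of posixPath
	return posixPath
end ensureFolder

-- Possible use at the top of the script:
-- set destFolder to ensureFolder(POSIX path of (path to pictures folder) & "TEMPORAIRE/")
```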

:stuck_out_tongue: Ok it works, I’m a little bit stupid… The folder name was “*TEMPORAIRE” and not just “TEMPORAIRE”…

:rolleyes: :rolleyes:

The script DOES work.
Thank you again, and see you !!
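
A footnote on the tab loop earlier in the thread: closing a tab while counting upward renumbers the remaining tabs, so the loop can skip a tab or run past the end. A sketch of one way around this, under the assumptions that every tab should be closed after its images are saved, that the TEMPORAIRE folder exists, and that newer Safari versions have "Allow JavaScript from Apple Events" enabled, is to walk the tabs from last to first and address each tab directly:

```applescript
set destFolder to POSIX path of (path to pictures folder) & "TEMPORAIRE/"
set {TID, text item delimiters} to {text item delimiters, "/"}
tell application "Safari"
	-- Count downward so that closing a tab never renumbers the tabs still to visit
	repeat with aTab from (count tabs of window 1) to 1 by -1
		set numberOfPictures to do JavaScript "document.images.length" in tab aTab of window 1
		repeat with aPicture from 1 to numberOfPictures
			set picURL to do JavaScript "document.images[" & ((aPicture - 1) as string) & "].src" in tab aTab of window 1
			if picURL ends with "jpg" then
				set fName to last text item of picURL
				do shell script "curl -o " & quoted form of (destFolder & fName) & space & quoted form of picURL
			end if
		end repeat
		close tab aTab of window 1
	end repeat
end tell
set text item delimiters to TID
```

Closing the last remaining tab also closes the window, so this is best run on a window whose tabs are all meant to be processed.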