How do I tell curl to download an arbitrary file? I tried it with a wildcard, but it doesn't work.
set url_of to "http://Webaddress/photo-of-the-day/*.jpg"
do shell script "curl -L " & quoted form of url_of & " > " & quoted form of POSIX path of ((path to desktop folder as text) & "Test.jpg" as text)
URLs and file paths in Bash are two completely different things. Bash has a feature called pathname expansion (globbing): special characters in a path are expanded by the shell into the matching file names, so the command still receives multiple arguments; the expansion is done by Bash, not by the command. URLs are not file paths, so Bash has nothing to expand them against.
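A quick illustration of the difference (the folder and the URL below are just placeholders):

# Pathname expansion: the shell replaces the pattern with one argument per
# matching local file, so ls receives a list of real file names.
ls ~/Desktop/*.jpg

# A URL matches nothing on disk, so the pattern is passed through unchanged
# and curl receives the literal string, asterisk and all.
curl -L "http://example.com/photo-of-the-day/*.jpg"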
wget, on the other hand, does support wildcards, although unlike curl it does not come pre-installed on OS X. You have to install it with MacPorts or Homebrew, or build it from source yourself.
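For example, with either package manager already set up, installing it is a one-liner:

# with MacPorts
sudo port install wget
# or with Homebrew
brew install wget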
wget -r -l1 -np -nd "http://members.aceweb.com/randsautos/photogallery/ferrari/enzo/" -A "*.jpg"
This is a working command, and with the recursion limited to one level it is reasonably safe - but please don't abuse the host too much. The recursive switch is potentially dangerous in general: it is easy to download far more than you intended to.
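For reference, here is the same command with each switch spelled out (descriptions paraphrased from the wget manual):

# -r          recursive retrieval: follow the links found in the fetched page
# -l1         limit the recursion to one level deep
# -np         never ascend to the parent directory
# -nd         do not recreate the remote directory structure locally
# -A "*.jpg"  keep only the downloaded files whose names match *.jpg
wget -r -l1 -np -nd -A "*.jpg" "http://members.aceweb.com/randsautos/photogallery/ferrari/enzo/"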
What wget does here is not actually wildcard matching (it only does that for FTP connections), which is why I think the option name is misleading. What it does is:
It downloads the given page (wget calls it a directory, but it's just a page).
It extracts all links from that page.
It filters those links against the accept pattern and any other options given.
It downloads the remaining links one by one.
For that you don't need to install wget; you can use curl and awk (or sed) and get the same result. Below is a quick example that downloads the same files without wget.
do shell script "cd " & quoted form of POSIX path of (choose folder) & "
directory_url=http://members.aceweb.com/randsautos/photogallery/ferrari/enzo/
curl $directory_url 2>/dev/null | tr [A-Z] [a-z] | AWK 'BEGIN{
RS=\"</a>\"
}
/href/{gsub(/.*href=\\042/,\"\",$0)
if ( $0 ~ \".*\\.jpg\"){
gsub(/\\042.*/,\"\",$0)
print $0
}
}' | while read filename
do
if [[ $filename == http://* ]]
then # the link is absolute
curl -O $filename
else # the link is relative
curl -O $directory_url$filename
fi
done"
Unless it's your own server, you never know whether the returned page actually contains links to all the files in the directory. Also, most commercial web servers won't return a directory index at all; they return a 403 error or a blank page instead. So the example above, just like the wget example, can be useless or incomplete if the server returns nothing or only a small part of the directory. In practice, both approaches will be useless on most commercial web servers.
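If you want a rough upfront check of whether a server exposes a listing at all, fetching only the headers of the directory URL is enough (keep in mind some servers still answer 200 with an empty or templated page):

# Print just the HTTP status line for the directory URL.
curl -sI "http://members.aceweb.com/randsautos/photogallery/ferrari/enzo/" | head -n 1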