The main idea is to log in to the web site by having curl POST the login form data as if it had been filled out in a normal browser, using the --cookie-jar option to save the cookie that the login sets. Then, for the search queries, use the --cookie option to supply the saved login cookie. The following seems to work for me:
set un to "user"
set pw to "password"
set cjpp to getCookieJarPOSIXPath()
set l to Login_MacScripterBBS(un, pw, cjpp)
set s to Search_MacScripterBBS("http://bbs.macscripter.net/search.php?action=search&keywords=attach+drawer&author=&forum=3&sort_by=5&sort_dir=DESC&show_as=topics&search=Submit", cjpp)
{l, s}
to Search_MacScripterBBS(search, cjpp)
(* Salient bit:
Use the --cookie option, giving it the path to the previously written cookie jar file so that the required login cookie (punbb_cookie) will be supplied to (e.g.) search.php.
Alternatively, supply the cookie itself (maybe extracted from the file written to by Login_MacScripterBBS, or copied from some cookie tool): --cookie punbb_cookie=value
Adjust the rest as needed.
*)
do shell script "
/usr/bin/curl \\
--verbose \\
--cookie " & quoted form of cjpp & " \\
" & quoted form of search & " \\
2>&1"
end Search_MacScripterBBS
to Login_MacScripterBBS(un, pw, cjpp)
(* Salient bits:
Use --data options to provide form data to login.php.
Use --cookie-jar to have curl record the cookie that login.php sets.
Use --verbose and capture stderr (2>&1) for debugging/fun.
*)
do shell script "
/usr/bin/curl \\
--verbose \\
--data form_sent=1 \\
--data redirect_url=index.php \\
--data req_username=" & quoted form of encode_form_value(un) & " \\
--data req_password=" & quoted form of encode_form_value(pw) & " \\
--data login=Login \\
--cookie-jar " & quoted form of cjpp & " \\
'http://bbs.macscripter.net/login.php?action=in' \\
2>&1"
end Login_MacScripterBBS
to getCookieJarPOSIXPath()
path to temporary items folder from user domain
POSIX path of result
result & "curl-cookies-XXXXXXXXXX"
do shell script "/usr/bin/mktemp " & quoted form of result
end getCookieJarPOSIXPath
to encode_form_value(str)
encode_text(str, true, true) -- This is not exactly what the spec calls for (http://www.w3.org/MarkUp/html-spec/html-spec_toc.html#SEC8.2.1), but it might work anyway.
end encode_form_value
(* URL encoding from: http://www.apple.com/applescript/sbrt/sbrt-08.html "Text Encoding | Decoding" *)
-- A sub-routine for encoding high-ASCII characters:
on encode_char(this_char)
set the ASCII_num to (the ASCII number this_char)
set the hex_list to {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F"}
set x to item ((ASCII_num div 16) + 1) of the hex_list
set y to item ((ASCII_num mod 16) + 1) of the hex_list
return ("%" & x & y) as string
end encode_char
-- this sub-routine is used to encode text
on encode_text(this_text, encode_URL_A, encode_URL_B)
set the standard_characters to "abcdefghijklmnopqrstuvwxyz0123456789"
set the URL_A_chars to "$+!'/?;&@=#%><{}[]\"~`^\\|*"
set the URL_B_chars to ".-_:"
set the acceptable_characters to the standard_characters
if encode_URL_A is false then set the acceptable_characters to the acceptable_characters & the URL_A_chars
if encode_URL_B is false then set the acceptable_characters to the acceptable_characters & the URL_B_chars
set the encoded_text to ""
repeat with this_char in this_text
if this_char is in the acceptable_characters then
set the encoded_text to (the encoded_text & this_char)
else
set the encoded_text to (the encoded_text & encode_char(this_char)) as string
end if
end repeat
return the encoded_text
end encode_text
-- A sub-routine for decoding a three-character hex string:
on decode_chars(these_chars)
copy these_chars to {identifying_char, multiplier_char, remainder_char}
set the hex_list to "123456789ABCDEF"
if the multiplier_char is in "ABCDEF" then
set the multiplier_amt to the offset of the multiplier_char in the hex_list
else
set the multiplier_amt to the multiplier_char as integer
end if
if the remainder_char is in "ABCDEF" then
set the remainder_amt to the offset of the remainder_char in the hex_list
else
set the remainder_amt to the remainder_char as integer
end if
set the ASCII_num to (multiplier_amt * 16) + remainder_amt
return (ASCII character ASCII_num)
end decode_chars
-- this sub-routine is used to decode text strings
on decode_text(this_text)
set flag_A to false
set flag_B to false
set temp_char to ""
set the character_list to {}
repeat with this_char in this_text
set this_char to the contents of this_char
if this_char is "%" then
set flag_A to true
else if flag_A is true then
set the temp_char to this_char
set flag_A to false
set flag_B to true
else if flag_B is true then
set the end of the character_list to my decode_chars(("%" & temp_char & this_char) as string)
set the temp_char to ""
set flag_A to false
set flag_B to false
else
set the end of the character_list to this_char
end if
end repeat
return the character_list as string
end decode_text
Note that your MacScripter BBS username and password will be exposed on the command line and as such may be visible to other users on the same machine (through ps and the like). If you need better security, you can avoid this by having --data read the form data from a file (--data @filename) instead of taking it from the command line.
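Here is a rough sketch of the file-based approach, written as plain shell to keep it short. The function names write_login_data and login_from_file are made up for this illustration, and the sketch assumes the username and password are already URL-encoded (for example by encode_form_value above). The form fields are the same ones Login_MacScripterBBS sends. One caveat: if you call this through do shell script, the script text itself is handed to sh as an argument, so from AppleScript you would probably want to write the data file with AppleScript's own file commands and only pass its path to curl.

```shell
#!/bin/sh
# Sketch only: keep the credentials out of curl's argument list by
# putting the POST body in a private temporary file.

# write_login_data USER PASS  (hypothetical helper)
# USER and PASS must already be URL-encoded.
# Prints the path of a newly created file that holds the form data.
write_login_data() {
    # mktemp creates the file readable and writable by its owner only
    data_file=$(mktemp "${TMPDIR:-/tmp}/curl-login-XXXXXXXXXX") || return 1
    # No trailing newline is written; curl strips newlines from @file data anyway
    printf 'form_sent=1&redirect_url=index.php&req_username=%s&req_password=%s&login=Login' \
        "$1" "$2" > "$data_file"
    printf '%s\n' "$data_file"
}

# login_from_file DATA_FILE COOKIE_JAR  (hypothetical helper)
# Posts the form data from DATA_FILE, saves cookies to COOKIE_JAR,
# then deletes DATA_FILE whether or not curl succeeded.
login_from_file() {
    curl --silent --output /dev/null \
        --data @"$1" \
        --cookie-jar "$2" \
        'http://bbs.macscripter.net/login.php?action=in'
    rc=$?
    rm -f "$1"
    return $rc
}
```

Usage would be something like: f=$(write_login_data "$encoded_user" "$encoded_pass") followed by login_from_file "$f" "$cookie_jar". With this arrangement only the path of the data file shows up in curl's arguments.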
In practice you might not have to do the login transaction each time. Instead you might do it once and keep the cookie jar in a predictable, reusable place (not in a temp directory with a randomized filename). That way the login cookie could be reused across multiple searches. Alternatively, if you normally use Firefox (maybe other browsers, too) to log in to the MacScripter BBS, you could point curl to the browser’s cookie file (i.e. ~/Library/Application Support/Firefox/Profiles/something/cookies.txt). Safari stores cookies in a plist file that is incompatible with curl’s --cookie option.
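To reuse a saved cookie jar, you need some way to tell whether it still holds a login cookie. The following is a minimal shell sketch of one way to check, assuming the jar is in the Netscape cookie-file format that --cookie-jar writes (tab-separated fields with the expiry time in the fifth field and the cookie name in the sixth). The names has_login_cookie and search_with_saved_login are hypothetical. The check only looks at the file, so the server may still reject a cookie that looks valid here.

```shell
#!/bin/sh
# Sketch only: decide whether a saved cookie jar still holds a usable login.

# has_login_cookie JAR [NOW]  (hypothetical helper)
# Succeeds if JAR has a punbb_cookie entry that is a session cookie
# (expiry 0) or has not yet expired.
# NOW is the current time in epoch seconds; it defaults to the system clock.
has_login_cookie() {
    [ -f "$1" ] || return 1
    now=${2:-$(date +%s)}
    # Tab-separated fields: domain, flag, path, secure, expiry, name, value
    awk -F '\t' -v now="$now" '
        $6 == "punbb_cookie" && ($5 == 0 || $5 > now) { found = 1 }
        END { exit !found }
    ' "$1"
}

# search_with_saved_login URL JAR  (hypothetical wrapper)
# Runs the search with the saved cookie, or returns 2 if a login is needed first.
search_with_saved_login() {
    if has_login_cookie "$2"; then
        curl --silent --cookie "$2" "$1"
    else
        echo "no valid login cookie in $2; log in first" >&2
        return 2
    fi
}
```

A caller could run the login step only when search_with_saved_login returns 2, and otherwise go straight to the search.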
Model: iBook G4 933
AppleScript: 1.10.7
Browser: Safari Version 3.1 (4525.13)
Operating System: Mac OS X (10.4)
Edit History: Add full path to mktemp call. Fix typo and clarify comments in code.