Linus4200 wrote to me off-board, since he had some specific questions about his case. I’m not going to include the URL he is trying to reach, but he can add the extra info I sent him, if he likes. Amazon’s search box on their front page also uses POST, if you’d like to see an example.
His case is difficult, as it requires the POST method, not just GET (which means he cannot use URL Access Scripting on OS X, since POST broke in URLAS for OS X). Here’s the overview I sent him:
Yours is not the easy case, in case you had not figured that out yet.
Maybe I should give a little background, so you know the framework.
HTTP is just a simple way for a computer to say “Hey, give me some information back”, as I’m sure you know. One method is to just send a URL to a server, which then sends a bunch of ASCII (or binary) back to your computer, which is waiting for the response. Many forms work this way. The HTML source code of the page specifies that the form uses the method GET, which means that your web browser will take the info you typed into the fields (and any hidden INPUT fields in the source code), and tack that onto the end of the ACTION URL for that form, after a question mark (ACTION is also specified in the HTML source code of the page). Then, the server gets all the fields you filled out (plus those hidden ones), determines what you want, and sends back a page (or other data in some cases).
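To make the GET case concrete, here is a rough sketch of how the URL gets put together. The site and field values are placeholders (the same made-up ones used elsewhere in this post), so the actual fetch is left commented out.

```shell
# Placeholder ACTION URL and field values, for illustration only.
action='http://www.SOMESITE.com/their_cgi'
fields='firstname=Dan&lastname=Shockley'

# A GET form submission is the ACTION URL, a "?", and then the fields.
get_url="${action}?${fields}"
echo "$get_url"

# Fetching it is then an ordinary request (commented out: the site is a placeholder).
# curl "$get_url"
```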
However, sometimes there is just too much information to fit in a URL (or the website doesn’t want data sent in URLs). So, there is the POST method, which means that your request includes the fields as a separate part of your request to the web server, after the headers rather than in the URL.
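For the curious, this is roughly what a POST request looks like on the wire. It is only a sketch: the host, path, and fields are placeholders, and a real request ends each header line with a carriage return plus newline, which I've simplified to plain newlines here.

```shell
# Placeholder form data; with POST it travels in the request body, not in the URL.
data='firstname=Dan&lastname=Shockley'

# Headers first, then a blank line, then the fields as the body.
# Content-Length is the number of characters in the body.
request=$(printf 'POST /their_cgi HTTP/1.1\nHost: www.SOMESITE.com\nContent-Type: application/x-www-form-urlencoded\nContent-Length: %s\n\n%s' "${#data}" "$data")
echo "$request"
```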
The site you’re trying to access uses POST. Take a look at the source code of the page where your form is. Find the form tag you want (there can be more than one). See what the ACTION parameter equals (this may be a relative URL, which means you need to look at the URL of the page you are on to work out the full address). Also, see if the method is GET or POST. If it is POST, as in our example, you’ll need to attach the data separately from the URL. The command-line utility curl can do this using the --data option. URL Access Scripting is SUPPOSED to be able to attach form POST data, but that is broken in OS X (works in OS 9). That is the main reason I’ve given up on URLAS.
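If you'd rather not read through the page source by eye, something like the following can pull out the form tag for you. This is just a sketch run against a made-up page saved to a temporary file; the page contents, the host, and the simple sed patterns (which assume double-quoted, lowercase attribute names) are my own assumptions and may need adjusting for a real page.

```shell
# Made-up page source for illustration. A real page could be saved with
# something like: curl 'http://www.SOMESITE.com/form_page.html' > page.html
page=$(mktemp)
cat > "$page" <<'EOF'
<html><body>
<form action="/their_cgi" method="POST">
<input type="text" name="firstname">
<input type="text" name="lastname">
</form>
</body></html>
EOF

# Find every opening form tag (case-insensitive); there can be more than one.
form_tags=$(grep -i '<form' "$page")
echo "$form_tags"

# Pull out the ACTION and METHOD values from the tag.
action=$(printf '%s\n' "$form_tags" | sed -n 's/.*action="\([^"]*\)".*/\1/p')
method=$(printf '%s\n' "$form_tags" | sed -n 's/.*method="\([^"]*\)".*/\1/p')

# This ACTION is relative (starts with "/"), so prepend the site the page came from.
full_url="http://www.SOMESITE.com${action}"
echo "$method $full_url"

rm -f "$page"
```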
Look for the INPUT tags to get the fields’ names. You may also need SELECT tags, if a popup is used, or TEXTAREA tags as well. Most will be INPUT tags, with a type of hidden, text, radio, or checkbox. The --data option needs a string (with single quotes around the string) that looks like this: 'SOMEFIELDNAME=SOMETHING&ANOTHERNAME=SomethingElse' and so on.
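Here is one way to assemble that string from the field names you found. It's a sketch only: the field names and values are placeholders, and "session" stands in for whatever hidden field your particular form happens to carry.

```shell
# Join name=value pairs with "&" to build the string for --data.
# "session=abc123" is a made-up hidden field, for illustration.
data=''
for pair in 'firstname=Dan' 'lastname=Shockley' 'session=abc123'; do
  if [ -z "$data" ]; then
    data="$pair"
  else
    data="${data}&${pair}"
  fi
done
echo "$data"

# If a value contains spaces or characters like "&", curl can encode it for you
# with --data-urlencode, one option per field (commented out: placeholder site).
# curl 'http://www.SOMESITE.com/their_cgi' --data-urlencode 'firstname=Dan' --data-urlencode 'lastname=Shockley'
```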
So, your command to curl may look like this:
curl 'http://www.SOMESITE.com/their_cgi' --data 'firstname=Dan&lastname=Shockley'
So, for Amazon.com, to search for Stephenson in All Products, you’d use (notice I did not include the big numeric part in the URL, since that was specific to my visit):
curl 'http://www.amazon.com/exec/obidos/search-handle-form/' --data 'url=index=aps&field-keywords=Stephenson'
Now, some caveats: you need to make sure you specify all the required hidden fields. Another issue is that you may get back a response saying that the info you want is at another URL. This can happen when the web server gets your request, and, instead of feeding it back immediately, caches it in a local file that you can retrieve with a simple GET request. When it responds this way, your web browser just goes there, so it looks to you as if it went directly there immediately. What really happened is that the web browser was smart enough to say “Oh, OK, the web server is telling me that what I want is somewhere else, so I’ll just ask for that now instead” and you see the desired information. To tell curl to mimic this behavior, you need to add the --location option. So now your curl command would look like this (Amazon.com does not need this):
curl 'http://www.SOMESITE.com/their_cgi' --data 'firstname=Dan&lastname=Shockley' --location
Now you’ll get the information you want.
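If you want to see what the "it's somewhere else" response looks like, the server's reply carries a Location header with the new address. The sketch below picks that header out of a made-up set of response headers; the status line and URL are invented for illustration, and real headers end each line with a carriage return, which this sample leaves out.

```shell
# Made-up response headers, the kind of thing a server sends when redirecting.
response_headers='HTTP/1.1 302 Found
Location: http://www.SOMESITE.com/results/12345.html
Content-Length: 0'

# Pull out the address the server is pointing to.
redirect_url=$(printf '%s\n' "$response_headers" | sed -n 's/^[Ll]ocation: *//p')
echo "$redirect_url"

# With --location, curl reads this header and requests that address for you.
```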