This is how I do it:
1) grab the page, write it to disk
2) pull the page up into your favourite editor
3) delete (yup, delete) all the extraneous text, reducing it to just the
barebones of the form
4) reduce the form elements until I have the form action string, and the post
variables with their values.
5) now post that back to the site and save the response.
6) loop back to 2) if I'm drilling down through a set of pages
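Steps 3 and 4 can be roughed out mechanically before you open the editor. What follows is only a minimal sketch, assuming a grep that supports -o (GNU and BSD grep both do) and that each tag sits on a single line. form_skeleton is a made-up helper name, not part of curl:

```shell
#!/bin/sh
# form_skeleton FILE: print only the form-related tags found in a saved page,
# one tag per line. Tags that are split across several lines are missed, so
# treat the output as a starting point and still check the page by hand.
form_skeleton() {
    grep -i -o -E '</?(form|select|option|input|textarea)[^>]*>' "$1"
}
```

Run it as form_skeleton county.html, where county.html stands for whatever filename you saved the page under in step 1.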
So, assuming http://fisher.lib.virginia.edu/cbp/county.html is the page you're
working on, let's take a look. I can see that it's a fairly simple form,
with just a SELECT and a SUBMIT button. Throw out everything down to the
FORM tag, throw out everything between the FORM and the SELECT, and throw out
everything after the SELECT. In this particular case, the SUBMIT button has
no variable name and no value, so it does not need to be (and cannot be?)
posted back.
Now reduce the SELECT, by tossing out all the selections you don't want.
Assuming you want option 1, that leaves you with:
<FORM METHOD="post" ACTION="/cgi-local/cbpbin/county.cgi">
<SELECT NAME="st">
<OPTION VALUE="1">
So, you just have one post variable - "st" with the value "1", and the URL to
post to is going to be the original hostname with
"/cgi-local/cbpbin/county.cgi" tacked on:
curl -d "st=1" http://fisher.lib.virginia.edu/cgi-local/cbpbin/county.cgi
which hands back a page with about 5 selection boxes in it and a radio button.
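Working out the URL to post to (the original hostname plus the form's ACTION) can be sketched in shell as well. This is a rough illustration only: post_url is a hypothetical helper name, and it assumes the page URL has a path after the hostname. It handles absolute ACTIONs, ACTIONs starting with "/", and ACTIONs relative to the page's directory:

```shell
#!/bin/sh
# post_url PAGE_URL ACTION: print the URL a form should be posted to.
post_url() {
    page_url=$1
    action=$2
    case $action in
        http://*|https://*)
            # ACTION is already a full URL
            printf '%s\n' "$action" ;;
        /*)
            # ACTION is host-relative: keep scheme://host, then tack it on
            host=$(printf '%s\n' "$page_url" | sed 's#^\([a-zA-Z]*://[^/]*\).*#\1#')
            printf '%s%s\n' "$host" "$action" ;;
        *)
            # ACTION is relative to the directory the page lives in
            printf '%s/%s\n' "${page_url%/*}" "$action" ;;
    esac
}
```

For the page above that would be used as:
curl -d "st=1" "$(post_url http://fisher.lib.virginia.edu/cbp/county.html /cgi-local/cbpbin/county.cgi)"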
In the case of a radio button, simply keep the button value you want and
discard the others. In the case of text input boxes, you need to url-encode
any character strings you want to pass back - that is, replace space, &, ?,
etc. with their %XX hex equivalents, because curl's -d option doesn't do that
for you.
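The url-encoding can be done in the script itself. Below is a minimal sketch in plain Bourne-style shell, assuming ASCII input only. urlencode is a made-up function name, and anything other than letters, digits and . _ ~ - gets replaced by its %XX form. (Later curl releases, 7.18.0 onwards, also have a --data-urlencode option that does this for you.)

```shell
#!/bin/sh
# urlencode STRING: print STRING with every character outside
# A-Z a-z 0-9 . _ ~ - replaced by %XX. ASCII input only.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}
```

Encode each value separately and join the name=value pairs with & yourself, e.g. curl -d "name=$(urlencode "$value")" followed by the URL, where name and value are placeholders for a real form field.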
eliminate stuff that isn't going to be useful, such as form validation (which
you will do in *your* script, right?), floating boxes, scrolling messages,
etc. Apart from that little bit of advice, you're on your own, because
new url on-the-fly depending on the form values supplied and simply sets
location.href to the new url. Once I understood what was happening, it
wasn't too hard to replicate that in Bourne shell and get the right results,
but such stuff can be tricky.
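As a rough idea of what replicating that in shell can look like, here is a sketch that glues form values onto a base URL the way such a script might before setting location.href. Everything in it is hypothetical: build_url is an invented name, and the real rules for assembling the URL have to be read out of the page's own javascript.

```shell
#!/bin/sh
# build_url BASE [name=value ...]: print BASE followed by ?name=value&...
# Values must already be url-encoded.
build_url() {
    base=$1
    shift
    query=
    for pair in "$@"; do
        # put an & between pairs, but not before the first one
        query="${query:+$query&}$pair"
    done
    printf '%s%s\n' "$base" "${query:+?$query}"
}
```

The result can then be fetched with curl -o response.html "$(build_url ...)", using whatever base URL and field names the script on the page actually uses.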
Probably the worst site I've come across is a certain change management and
trouble ticket suite that uses Java to open a link from the browser back to
the mothership. The form selection boxes are all empty except for the first
one. As you pick values, your selections are communicated back to the
server, which feeds out values for the next selection box... I could
probably mimic that eventually, but not just with curl.
On Monday 01 July 2002 10:13 am, Saunders, Chris wrote:
> I need to figure out how to get cURL to do a few things. For one to
> select an item from a list on a webpage and to submit, then from the
> next page to select a series of choices from a list and download a data
> file. Even for a nOOb like me it appeared cURL would make that easy for
> me, however I cannot get cURL to select (highlight) a choice in a
> textbox, I can get cURL to hit the submit button, but that's it. Here
> is what I have tried:
> Curl -d "Option value=1&Input type=submit"
> <http://fisher.lib.virginia.edu/cbp/county.html> > Alabama1stpage.html
> I know there has got to be a way and this is a big priority for the
> bossman! Anybody have some suggestions?
> Christopher Saunders
Received on 2002-07-02