curl-users

Re: Trying to save html page in an script

From: Ralph Mitchell <rmitchell_at_eds.com>
Date: Thu, 17 Apr 2003 07:06:59 -0500

Where you went wrong was in the login process. The -u option only
supplies a user ID and password for HTTP authentication (the kind where
the browser shows a popup box), not for a login page such as the one
you're trying to use. With a fill-in form such as this one, you need to
examine the HTML, extract the form variables, and post them back to the
server. In this particular case, the variables are user, pass and
logon_submit, and the form action is /right.html, so your login looks
like this:

    curl -s -S -L -b cookies -c cookies -o right.html -d user=olivierb -d pass=lola22 -d logon_submit=biff http://www.hedgefundnews.com/right.html

That should all be on one line. Then to grab the page you want, do
this:

    curl -s -S -L -b cookies -c cookies -o record1425.html "http://www.hedgefundnews.com/list_funds/fund_details.php?fundid=1425"

Again, all on one line. On login, the server hands you a session cookie,
which -c stores in the cookies file. On the next request, -b reads that
file and sends the cookie back, which is what gets you the page.
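
The form variables mentioned earlier (user, pass and logon_submit) come
from the HTML of the login page. As a rough sketch of how to find them
for another form, the helper below prints the form and input tags from
whatever HTML is piped into it. The name list_form_tags is made up for
this example, and it assumes a grep that supports -o (GNU and BSD grep
do). It is a quick text match, not an HTML parser, so read the page
source by hand if the output looks incomplete.

```shell
# Print every <form ...> and <input ...> tag found in the HTML on
# standard input, one tag per line.
# Newlines are turned into spaces first so that tags split across
# several lines are still matched as a whole.
list_form_tags() {
    tr '\n' ' ' | grep -o -i -E '<(form|input)[^>]*>'
}

# Usage (assumption: the login form is served from the site's front page;
# substitute whichever URL actually shows the form):
#   curl -s -L http://www.hedgefundnews.com/ | list_form_tags
```

Each name= attribute in the output becomes one -d name=value argument,
and the form's action= attribute gives the URL to post to.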

Both commands work on my Linux box, and I don't see why they wouldn't
work on your Win2K box.
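
For the range of record numbers mentioned in the question (1400 to
1800), the two steps could be wrapped in a small script along these
lines. This is only a sketch: it assumes a POSIX shell (Linux, or
something like Cygwin on Windows) rather than a batch or VBS file, the
function names are invented for the example, and it assumes the session
cookie stays valid for the whole run.

```shell
#!/bin/sh
# Sketch: log in once, then fetch a range of records using the saved
# session cookie. Nothing runs until the functions are called.

BASE="http://www.hedgefundnews.com"
JAR="cookies"    # cookie file shared by the login and the fetches

# Build the detail-page URL for one fund id.
fund_url() {
    printf '%s/list_funds/fund_details.php?fundid=%s\n' "$BASE" "$1"
}

# Post the login form; $1 is the user name, $2 the password.
login() {
    curl -s -S -L -b "$JAR" -c "$JAR" -o right.html \
        -d "user=$1" -d "pass=$2" -d logon_submit=biff \
        "$BASE/right.html"
}

# Save records $1 through $2 as record<id>.html.
fetch_range() {
    id=$1
    while [ "$id" -le "$2" ]; do
        curl -s -S -L -b "$JAR" -c "$JAR" -o "record$id.html" \
            "$(fund_url "$id")"
        id=$((id + 1))
    done
}

# Usage:
#   login olivierb 'your-password'
#   fetch_range 1400 1800
```

Logging in once and reusing the cookie file avoids posting the password
with every request.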

Ralph Mitchell

Olivier Bezelgues wrote:

> Dear Curl Users,
>
> I have a question. I am trying to write a script (batch or VBS, it will
> depend) using the features of curl.
>
> I am working on a Windows 2000 PC with
>
> curl 7.10.4 (win32) libcurl/7.10.4 OpenSSL/0.9.7a zlib/1.1.4
>
> I would like to simulate Internet Explorer with a curl command line,
> in order to retrieve data. The aim is to record the different records
> of a database in HTML files.
>
> The site I am working on is:
>
> http://www.hedgefundnews.com/
>
> My login on this web site (using Internet Explorer) is:
>
> userid : olivierb
> pwd : lola22
>
> When surfing with Internet Explorer, I log in through a login form,
> and I think the site keeps track of your logged-in status in a cookie
> or similar.
>
> I have tried the following command line:
>
> curl -u olivierb:lola22 -o record1425.html http://www.hedgefundnews.com/list_funds/fund_details.php?fundid=1425
>
> The aim of this command line is to save the page I get when I click on
> the link
>
> http://www.hedgefundnews.com/list_funds/fund_details.php?fundid=1425
>
> 1425 is the value of a variable that, in my script, will vary from
> 1400 to 1800.
>
> The HTML page saved on my PC in my curl directory contains nothing
> about the record I am trying to save.
>
> Where am I wrong? Did I forget something?
>
> Thanks a lot for your help.
>
> Best regards
>

Received on 2003-04-17