
curl-and-php

Re[2]: Curl is identified as a robot somehow.

From: IG <911-000_at_mail.ru>
Date: Mon, 17 Jan 2005 01:26:13 +0300

That's the exact message displayed after fetching the above-mentioned page:
"Your web-client is unable to display this resource. Don't use content grabbers here. Pay some respect to my work ..."
So despite the user agent being properly set, curl is still recognized as a bot.
I think there's something about JavaScript on the page, but at the same time curl understands JavaScript, right?
Any thoughts?
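
For reference, a minimal sketch of one way to narrow this down. curl only transfers data and does not execute JavaScript, so if the page decides who is a "grabber" in a script, changing the User-Agent alone will not help. The sketch below sends a couple of extra browser-like headers and then checks whether the returned HTML contains a script tag at all. The helper names build_browser_like_options() and contains_script_tag() are hypothetical, and the Accept header values are assumptions; nothing here is known to be what the site actually checks.

```php
<?php
// Hypothetical helper (not from the original post): collects the request
// options in one array so they can be applied with curl_setopt_array().
function build_browser_like_options($url, $referer, $user_agent, $cookie_file)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_REFERER        => $referer,
        CURLOPT_USERAGENT      => $user_agent,
        // Assumed header values; the site may check something else entirely.
        CURLOPT_HTTPHEADER     => array(
            'Accept: text/html,*/*;q=0.8',
            'Accept-Language: en-us,en;q=0.5',
        ),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookie_file,
        CURLOPT_COOKIEFILE     => $cookie_file,
    );
}

// Hypothetical helper: reports whether the fetched HTML contains a <script>
// tag. libcurl never runs such scripts, so a check performed in JavaScript
// cannot be passed by changing the User-Agent.
function contains_script_tag($html)
{
    return stripos($html, '<script') !== false;
}

$options = build_browser_like_options(
    'http://www.leader.ru/secure/who.html',
    'http://www.leader.ru/',
    'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.1)',
    'c:/cookies/cookies.txt'
);

// To perform the request (needs network access):
// $ch = curl_init();
// curl_setopt_array($ch, $options);
// $result = curl_exec($ch);
// curl_close($ch);
// var_dump(contains_script_tag($result));
```

If contains_script_tag() reports true for the response, the page source is worth reading by hand to see what the script does (for example, setting a cookie or redirecting), since that step would have to be reproduced manually in PHP.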

-----Original Message-----
From: "Brian Wilkins" <brian_at_hcc.net>
To: IG <911-000_at_mail.ru>, Programming PHP/CURL <curl-and-php_at_cool.haxx.se>
Date: Sun, 16 Jan 2005 11:37:18 +0500
Subject: Re: Curl is identified as a robot somehow.

>
> Try any of these:
>
> Googlebot/2.X (http://www.googlebot.com/bot.html)
> The Google Web crawler.
> Mozilla/3.0 (Win95; I)
> Netscape Navigator 3.0 on Windows 95.
> Mozilla/3.01 (Macintosh; PPC)
> Netscape Navigator 3.01 on a Macintosh.
> Mozilla/4.0 (compatible; MSIE 4.01; AOL 4.0; Windows 98)
> The AOL browser, based on Microsoft Internet Explorer 4.01, on Windows 98.
> Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
> Microsoft Internet Explorer 6.0 on Windows 2000.
> Mozilla/5.0 (compatible; Konqueror/2.2.2; Linux 2.4.14-xfs; X11; i686)
> Konqueror 2.2.2 for Linux.
> Mozilla/5.0 (Windows; U; Win98; en-US; rv:0.9.2) Gecko/20010726 Netscape6/6.1
> Netscape 6.1 on Windows 98.
> Opera/6.x (Windows NT 4.0; U) [de]
> The German version of Opera 6.x on Windows NT.
> Opera/7.x (Windows NT 5.1; U) [en]
> The English version of Opera 7.x on Windows XP.
>
> > trying to fetch the following url:
> > http://www.leader.ru/secure/who.html
> >
> > with
> > ...
> > $referer="http://www.leader.ru/";
> > $user_agent="Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.1)";
> > $url="http://www.leader.ru/secure/who.html";
> > $cookies="c:/cookies/cookies.txt";
> >
> > $header[] = "*.*";
> > $ch = curl_init();
> > curl_setopt($ch, CURLOPT_REFERER,$referer);
> > curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
> >
> > curl_setopt($ch, CURLOPT_URL,$url);
> > curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
> > curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
> > curl_setopt($ch, CURLOPT_COOKIEJAR, $cookies);
> > curl_setopt($ch, CURLOPT_COOKIEFILE, $cookies);
> > $result = curl_exec ($ch);
> > curl_close ($ch);
> > unset($ch);
> >
> > ...
> > It says that the browser is unsupported and can't display the page.
> > What's the trick? The user agent is clearly set.
> > Thanks for any help.
>
>
> --
> Brian Wilkins
> brian_at_hcc.net
> Software Engineer
> Heritage Communications Corporation
> Melbourne, FL USA 32935
>
>
Received on 2005-01-16