curl-and-php

Splitting header and body and reliably detecting the ole http return code...

From: Drew Weaver <drew.weaver_at_thenap.com>
Date: Sat, 15 Dec 2007 11:24:48 -0500

        $url = $dat['url'];
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, true); // include the response headers in the returned data
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_MAXREDIRS, 10); //follow up to 10 redirections - avoids loops
        curl_setopt($ch, CURLOPT_TIMEOUT, 90);
        $data = curl_exec($ch);
        $err = curl_error($ch);
        if ($data !== FALSE) {
                if (empty($err)) {
                        #split header from body at the first blank line
                        $datary = explode("\r\n\r\n", $data, 2);
                        if (!isset($datary[1])) {
                                echo "the datary explode is broken fix it....!\n";
                        }
                        $header = $datary[0];
                        $body = isset($datary[1]) ? $datary[1] : '';
                        #check header for HTTP return code
                        preg_match_all("/HTTP\/1\.[01]\s(\d{3})/", $header, $matches);
                        if (empty($matches[1])) {
                                echo "Your ereg for splitting the http header sucks!\n";
                        }
                        #the last status line wins (false if nothing matched)
                        $httpret = end($matches[1]);
                }
        }

I wrote this code some time ago for a little hobby project but never used it. Pardon the self-abusive error messages; I find it helps me suck less if I taunt myself in my code.

It is my understanding that, "technically," there is supposed to be a \r\n\r\n between the header and the body on pretty much every HTTP daemon, unless I'm totally clueless.
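
One possible way to make the split more tolerant is sketched below. It assumes two things that are not confirmed by the code above: that some servers end the header with bare \n\n instead of \r\n\r\n, and that redirects or "100 Continue" responses put more than one header block in front of the body when CURLOPT_HEADER and CURLOPT_FOLLOWLOCATION are both on. The function name split_http_response() is hypothetical, and a body that itself begins with an HTTP status line would fool it.

```php
<?php
// Hypothetical helper (not part of the original post): split raw output
// captured with CURLOPT_HEADER into the final header block and the body.
function split_http_response(string $raw): array
{
    $header = '';
    // Peel off header blocks for as long as the remaining data still starts
    // with a status line (redirects and "100 Continue" add extra blocks).
    while (preg_match('/^HTTP\/\d(?:\.\d)?\s+\d{3}/', $raw)) {
        // Accept CRLF CRLF as well as bare LF LF as the blank-line separator.
        $parts = preg_split('/\r?\n\r?\n/', $raw, 2);
        $header = $parts[0];
        $raw = isset($parts[1]) ? $parts[1] : '';
    }
    // Read the status code from the last header block that was seen.
    $status = preg_match('/^HTTP\/\d(?:\.\d)?\s+(\d{3})/', $header, $m)
        ? (int) $m[1]
        : null;

    return ['status' => $status, 'header' => $header, 'body' => $raw];
}
```

Calling split_http_response($data) on the string returned by curl_exec() would then take the place of the explode() and preg_match_all() steps; a null status means no status line was found at all.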

At any rate, I'm scanning 161,000 URLs that people have submitted to me over the years to see whether the pages have been altered. The occasions where splitting the header and body doesn't work are very rare, maybe 1 in 500, but I'd like to see if I can eliminate them altogether.
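
Another option that may avoid string matching entirely is to let libcurl report where the headers end. The sketch below uses curl_getinfo() with CURLINFO_HEADER_SIZE (the total size of all received headers, redirects included) and CURLINFO_HTTP_CODE (the last status code received). The function names split_by_header_size() and fetch_split() are hypothetical, and it is worth verifying the reported size against the libcurl version in use.

```php
<?php
// Hypothetical helper: cut the raw response at the byte offset that
// libcurl reports as the combined size of all header blocks.
function split_by_header_size(string $raw, int $headerSize): array
{
    return [
        'header' => rtrim(substr($raw, 0, $headerSize)),
        'body'   => (string) substr($raw, $headerSize),
    ];
}

// Hypothetical wrapper reusing the options from the code above.
// Returns null when the transfer itself fails.
function fetch_split(string $url): ?array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_HEADER         => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 10,
        CURLOPT_CONNECTTIMEOUT => 30,
        CURLOPT_TIMEOUT        => 90,
    ]);
    $raw = curl_exec($ch);
    if ($raw === false) {
        return null;
    }
    $result = split_by_header_size($raw, curl_getinfo($ch, CURLINFO_HEADER_SIZE));
    // libcurl already tracks the final status code, so no regex is needed.
    $result['status'] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return $result;
}
```

With redirects, 'header' holds every header block that was received, so the body starts cleanly after the last one.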

Back when I wrote this, I found the "fork the header and body to separate functions" feature of PHP to be fairly complex for something that is seemingly so simple.
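
Assuming the feature meant here is CURLOPT_HEADERFUNCTION, a minimal sketch of it could look like the following. libcurl hands each header line to the callback, which must return the number of bytes it handled, and with CURLOPT_HEADER left off curl_exec() returns only the body. The class HeaderCollector and the function fetch_with_callback() are hypothetical names used for illustration.

```php
<?php
// Hypothetical collector: receives header lines one at a time from libcurl.
final class HeaderCollector
{
    public $lines = [];
    public $status = null;

    // Signature expected by CURLOPT_HEADERFUNCTION: (handle, header line).
    public function onHeader($ch, string $line): int
    {
        $trimmed = trim($line);
        if (preg_match('/^HTTP\/\d(?:\.\d)?\s+(\d{3})/', $trimmed, $m)) {
            // A new status line starts a fresh block (redirect, 100 Continue).
            $this->lines = [];
            $this->status = (int) $m[1];
        }
        if ($trimmed !== '') {
            $this->lines[] = $trimmed;
        }
        // Returning anything other than the line length aborts the transfer.
        return strlen($line);
    }
}

// Hypothetical wrapper: headers go to the collector, the body is returned.
function fetch_with_callback(string $url): ?array
{
    $collector = new HeaderCollector();
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 10,
        CURLOPT_HEADERFUNCTION => [$collector, 'onHeader'],
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        return null;
    }
    return ['status' => $collector->status, 'header' => $collector->lines, 'body' => $body];
}
```

Because the headers never share a string with the body, this approach does not depend on finding the blank line at all.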

Anyhow, thanks, and I hope everyone has happy holidays.

-Drew

_______________________________________________
http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-php
Received on 2007-12-15