curl-library

Important info about caching

From: codemastr <codemstr_at_ptd.net>
Date: Tue, 3 Feb 2004 19:48:18 -0500

OK, a while back I sent an email asking how to do caching. Daniel told me
all I had to do was set the appropriate CURLOPT_TIMECONDITION and
CURLOPT_TIMEVALUE options and then, after curl_easy_perform, call
curl_easy_getinfo with CURLINFO_RESPONSE_CODE and check for a 304. That
works most of the time, but Daniel left something out.

After several hours of debugging I found out what is going on. Sometimes (no
idea why) Apache still sends a 200 rather than a 304 even though the
document is older than the time in the request, for example in this case:

If-Modified-Since: Wed, 04 Feb 2004 00:06:02 GMT
Last-Modified: Tue, 03 Feb 2004 20:34:01 GMT

Now here is the problem. Since the server sends a 200, my detection code
doesn't realize anything is wrong: it doesn't get a 304, and therefore it
assumes the file has been downloaded. Curl disagrees. Since the server sent
a Last-Modified header, curl checks that header even when the server sends
a 200 rather than a 304. So libcurl was aborting my transfer as soon as it
got the Last-Modified header! This is obviously good behavior, but it must
be dealt with. Since the response code was a 200, my code didn't know the
transfer had been cut short. So here is how to solve the problem:

long response_code;
long last_mod;
...
curl_easy_setopt(curl, CURLOPT_TIMECONDITION, (long)CURL_TIMECOND_IFMODSINCE);
curl_easy_setopt(curl, CURLOPT_TIMEVALUE, (long)the_cached_time_here);
/* Ask libcurl to record the remote document's modification time */
curl_easy_setopt(curl, CURLOPT_FILETIME, 1L);
curl_easy_perform(curl);
...
curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
/* last_mod is set to -1 if the server did not report a time */
curl_easy_getinfo(curl, CURLINFO_FILETIME, &last_mod);
...
/* Check for a 304, or, if Last-Modified was returned, check whether the
   document is older than the cached copy */
if (response_code == 304 ||
    (last_mod != -1 && last_mod < the_cached_time_here)) {
    /* Use the cached version! */
} else {
    /* The file was downloaded */
}
...
curl_easy_cleanup(curl);
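
The check above can also be pulled out into a small helper so that every
call site decides the same way. This is only a sketch: use_cached_copy is a
made-up name, not a libcurl function, and it assumes its three arguments are
the values read from CURLINFO_RESPONSE_CODE and CURLINFO_FILETIME plus
whatever was passed to CURLOPT_TIMEVALUE.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of libcurl): decide whether to keep the
 * cached copy after a request made with CURL_TIMECOND_IFMODSINCE.
 *
 *   response_code - value read via CURLINFO_RESPONSE_CODE
 *   filetime      - value read via CURLINFO_FILETIME, -1 if unknown
 *   cached_time   - the value that was given to CURLOPT_TIMEVALUE
 */
bool use_cached_copy(long response_code, long filetime, long cached_time)
{
    /* The server confirmed that the document has not changed. */
    if (response_code == 304)
        return true;

    /* The server said 200, but the document is older than the cached
     * time, so libcurl will have stopped the transfer. */
    if (filetime != -1 && filetime < cached_time)
        return true;

    /* Otherwise treat the body as freshly downloaded. */
    return false;
}
```

With this in place the if/else above becomes
if (use_cached_copy(response_code, last_mod, the_cached_time_here)).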

On a side note, maybe the Last-Modified checks at lib/transfer.c ~872 and
~880 shouldn't return CURLE_OK? Maybe they should return
CURLE_TIMECONDITION or something? It seems kind of stupid for me to have to
do the CURLINFO_FILETIME comparison when libcurl has already done it. I
mean, I'm redoing an operation that isn't needed. Since curl has already
done this for me, perhaps it should indicate that. Though if you disagree
that's fine, I have it working now with CURLINFO_FILETIME so I'm happy. I
just figured I'd share this info with everyone since I'm sure there are
others who might have the same problem. Also, Daniel never mentioned any of
this when I asked, so I'm sure other people saw him say my code was OK and
just copied it :)

Anyway, hope this helps some people out there!

Dominick Meglio

Received on 2004-02-04