Re: Last-Modified header

From: Dan Fandrich via curl-library <curl-library_at_cool.haxx.se>
Date: Thu, 21 May 2020 17:13:44 +0200

On Thu, May 21, 2020 at 03:46:33PM +0100, James Read via curl-library wrote:
> I'm implementing a simple web crawler with curl and want to retrieve the
> Last-Modified header so I can implement a sensible recrawl policy. I've found
> https://curl.haxx.se/libcurl/c/getinfo.html which is a nice easy way to
> retrieve the Content-Type header. Is there a similarly easy way to retrieve the
> Last-Modified header? Or do I need to parse the header myself?
>
> If I need to parse the header myself I found
> https://curl.haxx.se/libcurl/c/sepheaders.html which prints headers to a file.
> Is there a way of just storing the headers in memory so I can parse them
> there? I don't want to have to write a file just to read it again.

You can use that example as a basis, then set CURLOPT_HEADERFUNCTION with a
function like WriteMemoryCallback() in the getinmemory.c example to store the
headers in memory instead. Or, do something more intelligent since you're only
interested in a single header. libcurl writes to a file by default, so by
setting your own header callback function you can process the headers however
you want.

Dan
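
As a rough illustration of the "something more intelligent" option above, here
is a minimal sketch of a header callback that keeps only the Last-Modified
value instead of buffering every header. The struct last_modified type, the
header_cb name and the 64-byte buffer are hypothetical choices made for this
example; only the callback prototype comes from libcurl. It assumes a POSIX
system for strncasecmp().

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp() is POSIX, not ISO C */

/* Hypothetical holder for the single header value we care about.
   value stays an empty string if the header was never seen. */
struct last_modified {
    char value[64];
};

/* Same prototype libcurl expects for CURLOPT_HEADERFUNCTION.
   The buffer holds one header line, is not NUL-terminated, and
   normally ends with CRLF. */
static size_t header_cb(char *buffer, size_t size, size_t nitems,
                        void *userdata)
{
    static const char name[] = "Last-Modified:";
    const size_t name_len = sizeof(name) - 1;
    const size_t total = size * nitems;
    struct last_modified *lm = userdata;

    /* Header field names are case-insensitive. */
    if (total > name_len && strncasecmp(buffer, name, name_len) == 0) {
        const char *p = buffer + name_len;
        const char *end = buffer + total;
        size_t len;

        /* Skip leading whitespace, then trim trailing CR/LF/whitespace. */
        while (p < end && (*p == ' ' || *p == '\t'))
            p++;
        while (end > p && (end[-1] == '\r' || end[-1] == '\n' ||
                           end[-1] == ' ' || end[-1] == '\t'))
            end--;

        len = (size_t)(end - p);
        if (len >= sizeof(lm->value))
            len = sizeof(lm->value) - 1;   /* truncate rather than overflow */
        memcpy(lm->value, p, len);
        lm->value[len] = '\0';
    }

    /* Report every byte as consumed; any other value aborts the transfer. */
    return total;
}
```

To use it, include <curl/curl.h>, declare a zero-initialized struct
last_modified lm, and before curl_easy_perform() call
curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, header_cb) and
curl_easy_setopt(curl, CURLOPT_HEADERDATA, &lm). If redirects are followed,
the callback sees the headers of every response, so lm.value ends up holding
the value from the last response that sent the header.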
Received on 2020-05-21