Re: Naming ETag files?

From: Timothe Litt <litt_at_acm.org>
Date: Sun, 15 Jan 2023 17:26:28 -0500


On 15-Jan-23 15:47, Dan Fandrich via curl-users wrote:
> On Sun, Jan 15, 2023 at 11:04:20AM -0700, Paul Gilmartin via curl-users wrote:
>> Interesting. The Mac and Linux xattr commands have a useful compatible
>> overlap. So I could download with:
>> --etag-save tempfile.ETag
>> then:
>> xattr -w user.aETag "$( cat tempfile.ETag )" downloaded-resource
>>
>> ... and reverse the process for --etag-compare for the next conditional
>> download.
> But, ETags are unique to the server (or possibly the URL) so just saving the
> ETag isn't enough. You'd have to also store the URL where it came from as well,
> and only use the associated ETag if the server/URL matches on the next request.

Maybe.  For the URL to matter, the same file would have to be downloaded
from different servers and have different content. That's not a likely
scenario.  How often do you download a file from several servers and
store it with the same name in the same place?  For long enough for an
etag to be reused?  I do have cases where different paths lead to the
same filename, but in those cases either (a) I store them in different
local directories or (b) give them different names when stored, or (c)
they're temporary files that get processed and deleted long before the
next download.

Even if it happened, the odds of a false ETag match are pretty slim.
An ETag is typically a hash of the content, or something server-specific,
e.g. Apache httpd allows some combination of inode, mtime, size.  So it's
highly likely that if the wrong e-tag were sent, there would be no
match, and the server would return the entire (correct) file.

These days, due to clusters and CDNs serving replicated content, inode
isn't common, and mtime + size is at best a weak validator, so it should
be so marked.  (W/"...")  Even so, the odds of a false match remain
slim.  That's the theory.  It would be interesting to see what servers
are actually doing.
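
To make the weak/strong distinction concrete, here is a minimal Python
sketch.  The size-plus-mtime tag format is only loosely modeled on what a
server might emit; it is an assumption for illustration, not any server's
documented output.  The function names are hypothetical.  The two
comparison functions follow the weak and strong comparison rules for
entity-tags in the HTTP specification, as I understand them.

```python
def weak_etag(size, mtime_ns):
    """Build a tag from size and mtime (in microseconds), marked weak with W/."""
    return 'W/"%x-%x"' % (size, mtime_ns // 1000)


def split_etag(etag):
    """Return (is_weak, opaque_tag) for an entity-tag like W/"abc" or "abc"."""
    is_weak = etag.startswith("W/")
    opaque = etag[2:] if is_weak else etag
    return is_weak, opaque.strip('"')


def weak_match(a, b):
    """Weak comparison: opaque tags are equal; the W/ marker is ignored."""
    return split_etag(a)[1] == split_etag(b)[1]


def strong_match(a, b):
    """Strong comparison: neither tag is weak and the opaque tags are equal."""
    (a_weak, a_tag), (b_weak, b_tag) = split_etag(a), split_etag(b)
    return not a_weak and not b_weak and a_tag == b_tag
```

A conditional GET with If-None-Match uses the weak comparison, so a
W/"..." tag still works for the "has it changed?" check; it just isn't
good enough for things like byte-range requests.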

That said, xattrs on most file systems can hold a url as well as an
e-tag, and reduce the probability of a mismatch from very, very small to
zero.  (The implementations on some, e.g. FAT, limit the size.)
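
A rough sketch of that check, in Python.  The attrs dict stands in for
whatever was read back from the file's xattrs.  As far as I know,
user.xdg.origin.url is the name curl's --xattr uses for the URL on Linux;
user.etag is a name I made up for this example, not something curl
writes today.  The function names are likewise hypothetical.

```python
ORIGIN_ATTR = "user.xdg.origin.url"  # URL attribute, as written by curl --xattr
ETAG_ATTR = "user.etag"              # hypothetical name, for illustration only


def etag_for_request(attrs, url):
    """Return the stored ETag only if it was recorded for this exact URL."""
    if attrs.get(ORIGIN_ATTR) != url:
        return None  # different origin (or none recorded): do not reuse the tag
    return attrs.get(ETAG_ATTR)


def conditional_headers(attrs, url):
    """Headers to add to the next request; empty when no ETag applies."""
    etag = etag_for_request(attrs, url)
    return {"If-None-Match": etag} if etag else {}
```

With the URL mismatch case returning nothing, the request simply goes
out unconditionally and the server sends the whole file.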

> curl actually saves the URL already given the --xattr option.

Then it should be simple to also save the etag... and use it. Though I
would argue that (a) it should be the default and (b) when a filesystem
doesn't support it, fallback to a dotfile would be more helpful than a
warning.  Again, let curl do the work instead of off-loading it to the
user, who probably isn't as technical as any of us.

> Dan

Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.


Received on 2023-01-15