curl-library

Re: Large transfers

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Thu, 31 Jul 2003 15:35:35 +0200 (CEST)

On Thu, 31 Jul 2003, Duncan Wilcox wrote:

> Attempting to download a large file (>2GB) with curl currently has several
> issues.

Yes, unless you fool the server into sending the file using chunked
transfer-encoding.
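
With chunked encoding the response carries no Content-Length at all, so the
32-bit size never enters the picture. Roughly, a made-up response would look
like this:

  HTTP/1.1 200 OK
  Transfer-Encoding: chunked

  4000
  [16384 bytes of data]
  4000
  [16384 bytes of data]
  ...
  0

where each chunk announces its own (hex) size and the zero-sized chunk marks
the end, so no overall byte counter is needed.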

(BTW, this defect is mentioned in docs/KNOWN_BUGS)

> For example on very large files Apache often sends a bogus Content-Length,
> which curl 7.10.6 fails to interpret correctly:

> 0000: Content-Length: -1081789720
>
> curl: (18) transfer closed with -1081789720 bytes remaining to read

I beg to differ. Treating this "correctly" is bound to be impossible. That is
a blatant server error, and even though we sometimes add work-arounds for
obvious and common bugs in servers, I have yet to be convinced this is one of
those.
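
For what it's worth, that particular number smells like a plain 32-bit wrap:
if the file is a bit over 3GB, a signed 32-bit counter ends up at exactly that
negative value. A little stand-alone sketch (assuming a two's-complement
system and a hypothetical file size):

  #include <stdio.h>

  int main(void)
  {
    long long real_size = 3213177576LL; /* hypothetical size, just over 3GB */
    int wrapped = (int)real_size;       /* what a signed 32-bit variable holds */

    printf("%lld truncated to 32 bits gives %d\n", real_size, wrapped);
    /* prints: 3213177576 truncated to 32 bits gives -1081789720 */
    return 0;
  }

So the value is already broken by the time the server writes the header, no
matter how we parse it.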

> I suppose curl should ignore a Content-Length < 0, attempt resuming only at
> positions smaller than 2GB, and otherwise just stream to the end of the file
> and cross its fingers (wget apparently does something like this, uh, sorry
> for using the w-word).

That's easier for wget to do, since it only makes HTTP 1.0 requests, and those
close the connection after the transfer by default. wget can therefore more or
less ignore the content-length, as it adds little information for it.

libcurl, however, uses HTTP 1.1, and there ignoring the content-length will
cause the transfer to just "hang" once the server considers the transfer
complete, as it won't close the connection (until it has been idle for N
seconds).
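
To spell out the difference: with a close-delimited 1.0-style response you can
simply read until EOF, but on a persistent 1.1 connection the only thing that
tells you the body has ended is the byte count (or the chunk framing). A rough
sketch of the two cases, with made-up names and not actual libcurl code:

  #include <sys/types.h>
  #include <sys/socket.h>

  /* drain a response body from 'sock'; 'remaining' holds the parsed
     Content-Length, or -1 if the length is unknown */
  static int drain_body(int sock, long long remaining)
  {
    char buf[16384];
    ssize_t n;

    if(remaining < 0) {
      /* close-delimited body (1.0 style): EOF means "done" */
      while((n = recv(sock, buf, sizeof(buf), 0)) > 0)
        ; /* pass 'buf' to the write callback here */
      return 0;
    }

    /* persistent connection (1.1): the server will NOT close, so stop
       exactly when the advertised length has been consumed, otherwise
       recv() just blocks once all the data has arrived */
    while(remaining > 0) {
      n = recv(sock, buf, sizeof(buf), 0);
      if(n <= 0)
        return -1;  /* premature close or error */
      /* pass 'buf' to the write callback here */
      remaining -= n;
    }
    return 0;
  }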

> I tried looking into developing a patch but stepped in this in
> lib/transfer.c:581:
>
> sscanf (k->p+15, " %ld", &k->contentlength))
>
> Obviously also curl needs upgrading to 64bit file lengths/offsets, and if
> you use sscanf you need different % qualifiers on different platforms
> (printfing usually is %lld on unixish systems, %I64d on VC6/7, scanfing
> '%qd' on unixish systems and %I64d on VC6/7 -- that's capital 'i' followed
> by 64).

Yes. I can only agree that this is indeed needed for this to work nicely.
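
Something along these lines would probably do for the parsing itself, using
the format letters Duncan lists (just a sketch with made-up names, not what
libcurl will necessarily end up with):

  #include <stdio.h>

  #ifdef _MSC_VER
  #define FORMAT_OFF_T "%I64d"         /* VC6/VC7 */
  typedef __int64 big_off_t;           /* made-up 64-bit type name */
  #else
  #define FORMAT_OFF_T "%lld"          /* or "%qd" with older unixish libcs */
  typedef long long big_off_t;
  #endif

  /* 'header' points just past "Content-Length:" */
  static big_off_t parse_content_length(const char *header)
  {
    big_off_t len;
    if(sscanf(header, " " FORMAT_OFF_T, &len) != 1 || len < 0)
      return -1;                       /* missing or bogus (negative) length */
    return len;
  }

The harder part is of course making everything that stores or compares the
length use the wider type all the way through.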

I would suggest that we modify the test suite so that we can get it to return
>2GB contents(*) and then make sure those test cases run fine.

Alternatively, someone that has a web server and ftp server up and running
with proper support for large files could raise a hand and offer us some URLs
to test against.

(*) = today the test suite only serves files whose contents are stored within
the test case file itself, and storing 2GB there isn't fun! ;-)

> Unfortunately, given the number of different things that need to be hacked,
> I don't currently have the time to work on it. It's not a major issue, but I
> thought it was worth noting; hopefully someone can develop a patch quickly
> starting from my notes.

Thanks a lot for your effort and summary. It can certainly serve as a base for
someone to make the bigger patch.

I believe we might even need to fix some public function prototypes when we
once and for all make all file sizes in libcurl support 64 bits (like the
progress callback, for example).
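
Just to illustrate the kind of prototype change I mean, a 64-bit clean
progress callback could look something like this (made-up names, not a
proposal for the actual API):

  typedef long long big_off_t;   /* made-up 64-bit size type */

  typedef int (*progress_callback_64)(void *clientp,
                                      big_off_t dltotal,
                                      big_off_t dlnow,
                                      big_off_t ultotal,
                                      big_off_t ulnow);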

-- 
 Daniel Stenberg -- curl: been grokking URLs since 1998