curl-library

Re: cURL behaviour when Content-Length is unrepresentable and a max file size is set

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Wed, 3 Jan 2018 04:02:01 +0100 (CET)

On Tue, 19 Dec 2017, Brad Spencer wrote:

> In the most recent versions of cURL, the Content-Length header is now parsed
> as an integer instead of a floating point number, for good reasons.

Recent? I tracked down the parsing using integer back to 2005 and I can't
recall it using anything else before that either...

> When cURL isn't able to parse the Content-Length value into a representable
> integer, it seems that it proceeds as if there was no Content-Length header.
> The intent is noble, because cURL is trying to be robust and keep going even
> when the response is very, very large.

On modern machines curl uses 64-bit numbers, so when a Content-Length value
is actually larger than that, we can more or less assume that it was made so
on purpose. But yes, there are downsides to this behavior, and I guess it is
more likely to happen on ancient systems without 64-bit support, although I
believe those systems are slowly going extinct.

In particular for the case where the parser finds the value to be "invalid"
and not just out of range, I think we should consider making it fail the
request as that's a mighty strange response that isn't even HTTP compliant.
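
To make the "invalid" versus "out of range" distinction concrete, here is a
rough sketch in plain C. The names (cl_status, parse_content_length) are made
up for illustration and this is not curl's internal parser; it only assumes
the standard strtoll() behaviour of setting errno to ERANGE on overflow.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical names for illustration only; not curl's internal API. */
enum cl_status { CL_OK, CL_OVERFLOW, CL_INVALID };

static enum cl_status parse_content_length(const char *s, long long *out)
{
  char *end;
  long long v;
  int range_error;

  /* skip optional leading whitespace */
  while(*s == ' ' || *s == '\t')
    s++;

  /* the value must start with a digit: rejects "", "abc", "-1" and "+5" */
  if(!isdigit((unsigned char)*s))
    return CL_INVALID;

  errno = 0;
  v = strtoll(s, &end, 10);
  range_error = (errno == ERANGE);

  /* allow trailing whitespace, nothing else */
  while(*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
    end++;
  if(*end)
    return CL_INVALID;   /* garbage after the digits, e.g. "12abc" */

  if(range_error)
    return CL_OVERFLOW;  /* all digits, but too large to represent */

  *out = v;
  return CL_OK;
}
```

With a classification like this, a caller could choose to fail the request on
CL_INVALID while treating CL_OVERFLOW separately.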

> Firstly, it's not clear how cURL's going to know when the response is over.

It can't know it, if the server is really intending to send that amount of
data and then keep the connection open.

> Secondly, it seems like the behaviour is different when the Content-Length
> can be represented but is negative. In this case cURL seems to close the
> connection and fail the request. Presumably, this is for reasons similar to
> the first case, since there can be no way to know when such a request is
> finished. But, of course, a negative Content-Length is meaningless anyway.
> (It looks like the intended behaviour is to close the connection when the
> Content-Length is negative, but it looks like negative values may trigger
> the "unrepresentable" behaviour instead since the ASCII-to-integer parser
> doesn't seem to handle negative numbers.)

That's actually a bug. It is supposed to mark the connection for closure and
ignore the content-length. Negative values used to be common back in the days
of early large file support and Apache servers so we have that weirdo hack
still in the code to deal with those. I would say that the time is ripe to
remove that work-around now. Especially since it doesn't even seem to work!

> Thirdly, when a maximum file size is set with the CURLOPT_MAXFILESIZE_LARGE
> option, but then cURL can't parse a Content-Length, it seems safe to assume
> that either the Content-Length is garbage, so the request needs to fail, or
> that the Content-Length is too large for cURL to represent, which means it
> must be larger than the maximum file size, so the request needs to fail. At
> the very least, it would seem that when this option is set, cURL should
> always close the connection and fail the request.

I would agree. If there's a maximum set and curl fails to parse the given
value, it feels like it should err on the safe side and assume that it might
be too big.
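
As a sketch of what "err on the safe side" could look like, the decision
might be written roughly as below. This illustrates the behaviour discussed
here, not what curl currently does; the enum, the function name and the
convention that a limit of zero or less means "no limit" are all assumptions
made for the example.

```c
#include <stdbool.h>

/* Hypothetical states for a response's Content-Length header. */
enum length_state {
  LEN_ABSENT,   /* no Content-Length header at all */
  LEN_PARSED,   /* parsed into a representable value */
  LEN_TOO_BIG,  /* digits only, but not representable */
  LEN_GARBAGE   /* not a valid number */
};

/* Returns true when the transfer should be refused before it starts.
   Assumption of this sketch: max_size <= 0 means no limit is set. */
static bool refuse_transfer(enum length_state state, long long length,
                            long long max_size)
{
  if(max_size <= 0)
    return false;             /* no maximum configured */

  switch(state) {
  case LEN_PARSED:
    return length > max_size; /* the ordinary comparison */
  case LEN_TOO_BIG:
  case LEN_GARBAGE:
    return true;              /* unknown but suspicious: assume too big */
  case LEN_ABSENT:
  default:
    return false;             /* size unknown: the limit does not apply */
  }
}
```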

> So my question is, why doesn't cURL close the connection and fail the request
> in all of these cases?

As with everything: either nobody has had any reason to ponder those
decisions, or people have been happy with them.

> Also, BTW, it's interesting that cURL does not enforce the maximum file size
> when there is no Content-Length. This is surprising (it was a surprise to me
> when I first encountered it), and it leaves the application with the task of
> counting the bytes returned in response data callbacks so it can also
> enforce the maximum file size for the unpredictable cases where the server
> didn't advertise the response size. It would seem natural for cURL to do
> this itself when a maximum file size is set on the request.

I disagree. The intended functionality for that option was always to prevent a
"too large transfer" to even start. It was not meant to abort transfers after
N bytes. So if curl doesn't know the size of the transfer to come, it won't
stop it due to the value of this option.
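
An application that does want a hard cap when the size is unknown can count
bytes itself, as described in the quoted text. A minimal sketch follows; the
struct and function names are hypothetical. It relies on the documented
write callback contract: returning a count different from size * nmemb makes
libcurl abort the transfer with CURLE_WRITE_ERROR. The function has the
CURLOPT_WRITEFUNCTION signature, and the struct would be passed via
CURLOPT_WRITEDATA.

```c
#include <stddef.h>

/* Hypothetical application state, passed via CURLOPT_WRITEDATA. */
struct capped_sink {
  size_t received;  /* bytes accepted so far; start at 0 */
  size_t limit;     /* maximum number of bytes to accept */
};

/* Same signature as a CURLOPT_WRITEFUNCTION callback. */
static size_t capped_write(char *ptr, size_t size, size_t nmemb,
                           void *userdata)
{
  struct capped_sink *sink = userdata;
  size_t n = size * nmemb;

  (void)ptr; /* a real application would store or process the data here */

  /* received never exceeds limit, so the subtraction cannot wrap */
  if(n > sink->limit - sink->received)
    return 0; /* mismatched count: libcurl aborts the transfer */

  sink->received += n;
  return n;
}
```

The check rejects the chunk that would cross the limit rather than accepting
part of it, so the application never sees more than limit bytes.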

-- 
  / daniel.haxx.se
Received on 2018-01-03