feature request: expected payload size command-line flag

From: Danny McClanahan via curl-users <curl-users_at_lists.haxx.se>
Date: Tue, 01 Nov 2022 19:27:39 +0000

Hello curl-users,

I was recently looking to download my Twitter user data archive via curl since my browser was shorting out. The file is quite large, and for some reason Twitter does not provide an exact Content-Length, except in its own custom header, e.g. "x-ton-expected-size: 8274859056". As a result, the default curl progress output is unable to estimate the remaining time for the download. This of course looks like:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  9.9M    0  9.9M    0     0  1820k      0 --:--:--  0:00:05 --:--:-- 1981k
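
As a rough illustration of what is missing here, the "Time Left" column could be derived from a size hint such as the custom header. The following is a minimal Python sketch, not curl's actual progress code; the function names (expected_size_from_headers, estimate_time_left, format_hms) are hypothetical and exist only for this example.

```python
from typing import Mapping, Optional


def expected_size_from_headers(headers: Mapping[str, str]) -> Optional[int]:
    """Return a size hint in bytes, preferring Content-Length over the custom header."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ("content-length", "x-ton-expected-size"):
        value = lowered.get(name, "").strip()
        if value.isascii() and value.isdigit():
            return int(value)
    return None


def estimate_time_left(expected: Optional[int], received: int, speed: float) -> Optional[float]:
    """Seconds remaining, or None when no estimate is possible."""
    if expected is None or speed <= 0:
        return None
    # Clamp at zero in case the hint turns out to be too small.
    return max(expected - received, 0) / speed


def format_hms(seconds: Optional[float]) -> str:
    """Format seconds as h:mm:ss, or the placeholder when there is no estimate."""
    if seconds is None:
        return "--:--:--"
    total = int(seconds)
    return f"{total // 3600}:{total % 3600 // 60:02d}:{total % 60:02d}"
```

With no Content-Length and no hint, format_hms(estimate_time_left(None, received, speed)) yields the same "--:--:--" placeholder shown in the output above.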

As it turns out, even when the download completes successfully (in either the browser or curl), the zip file twitter provides for my account is corrupt, but that's not curl's problem. I'm mostly interested in whether someone has already considered adding a way to provide an expected Content-Length to curl in order to obtain the benefits of the progress bar, such as estimating remaining time.

I have tried setting --max-filesize, but that doesn't work for my purposes for two reasons:
1. It doesn't affect the progress output ("Time Left" remains at "--:--:--"), so it does not solve the problem.
2. It would cut off the download after that many bytes, whereas my use case does not expect to know the precise number of bytes in advance, and I need to ensure I download the complete file (instead, --max-filesize would complement this proposed feature by setting an upper bound for payload size so I can avoid downloading more than I have space for).
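
The difference between a hard limit and a hint, as described in the two points above, might be sketched as follows. This is Python pseudologic under stated assumptions, not curl's implementation: should_abort and progress_total are hypothetical names, and the behaviour modelled is only what this message describes.

```python
from typing import Optional


def should_abort(received: int, max_filesize: Optional[int]) -> bool:
    """A hard limit: stop the transfer once more than max_filesize bytes arrive."""
    return max_filesize is not None and received > max_filesize


def progress_total(received: int, expected_size: Optional[int]) -> Optional[int]:
    """A hint: used only for the estimate, never to stop the transfer.

    Once the download grows past the hint, the hint is known to be wrong,
    so fall back to "unknown total" rather than report a bogus estimate.
    """
    if expected_size is None or received > expected_size:
        return None
    return expected_size
```

The two values are independent: the hint feeds the progress estimate, while the limit alone decides whether the transfer is cut off.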

In searching the archives of this mailing list, I found this issue (https://github.com/curl/curl/issues/2158), which provides an easier repro case of a download missing a Content-Length: "https://github.com/torvalds/linux/archive/v4.14-rc1.tar.gz". However, I wasn't immediately able to find any discussion about hard-coding an expected payload length when the server does not provide one.

I'd like to know whether this feature has already been considered, or whether there are likely to be any blockers. I'm not yet too familiar with how curl communicates with libcurl, but if libcurl produces the progress output, and libcurl requires a precise (instead of estimated) Content-Length to produce the progress estimate, I could see this requiring a change to libcurl. But I'm hoping this can be implemented purely in the curl command-line tool.

I'm planning to take a stab at implementing this change now from my checkout of the curl git repo, but would love to hear any objections to this feature as well. I was thinking this would be a command-line flag that accepts the same type of size specification that --max-filesize does. I was also planning to print a warning and ignore the value of this flag if the response provides its own Content-Length, in cases such as the one described in https://github.com/curl/curl/issues/2158 above, where the Content-Length may or may not be set.
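
The proposed behaviour could be sketched roughly as below, in Python rather than curl's C for brevity. Everything here is hypothetical: no such flag exists in curl, parse_size and effective_total are made-up names, and the 1024-based K/M/G suffixes are an assumption modelled on the kind of size specification --max-filesize accepts.

```python
import sys
from typing import Optional

# Assumed multipliers (1024-based); an empty suffix or 'b' means plain bytes.
_UNITS = {"": 1, "b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(spec: str) -> int:
    """Parse a size such as '8274859056', '500k', or '8G' into bytes."""
    spec = spec.strip()
    number, suffix = spec, ""
    if spec and spec[-1].isalpha():
        number, suffix = spec[:-1], spec[-1].lower()
    if not (number.isascii() and number.isdigit()) or suffix not in _UNITS:
        raise ValueError(f"invalid size: {spec!r}")
    return int(number) * _UNITS[suffix]


def effective_total(content_length: Optional[int], expected_size: Optional[int]) -> Optional[int]:
    """Pick the total for the progress meter; a real Content-Length always wins."""
    if content_length is not None:
        if expected_size is not None:
            print("warning: response has Content-Length; ignoring expected size",
                  file=sys.stderr)
        return content_length
    return expected_size
```

For the Twitter case, effective_total(None, parse_size("8274859056")) would hand the hint to the progress meter, while a response that does carry Content-Length would override it and trigger the warning.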

Thanks,
Danny
Received on 2022-11-01