Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcurl 7.58 regression with HTTP/2 #2365

Closed
JustArchi opened this issue Mar 6, 2018 · 7 comments
Closed

libcurl 7.58 regression with HTTP/2 #2365

JustArchi opened this issue Mar 6, 2018 · 7 comments
Labels

Comments

@JustArchi
Copy link

JustArchi commented Mar 6, 2018

I did this

#include <stdio.h>
#include <curl/curl.h>
#include <string.h>

int main(void)
{
  CURL *curl;

  curl_global_init(CURL_GLOBAL_DEFAULT);

  curl = curl_easy_init();
  if(curl) {
    curl_easy_setopt(curl, CURLOPT_URL, "https://discordapp.com/api/v6/gateway");
    curl_easy_setopt(curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2_0);

    char errbuf[CURL_ERROR_SIZE];
    curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, errbuf);
    errbuf[0] = 0;

    int count = 1;

    for (count = 1; count < 1500; count++) {
        CURLcode res = curl_easy_perform(curl);
        if (res != CURLE_OK) {
            size_t len = strlen(errbuf);
            fprintf(stderr, "\nlibcurl: (%d) ", res);
            if (len) {
                fprintf(stderr, "%s%s", errbuf, ((errbuf[len - 1] != '\n') ? "\n" : ""));
            } else {
                fprintf(stderr, "%s\n", curl_easy_strerror(res));
            }

            break;
        }

        fprintf(stdout, "OK: #%d\n", count);
    }

    curl_easy_cleanup(curl);
  }

  curl_global_cleanup();

  return 0;
}

I expected the following

for loop going past 1000th request.

I got this

libcurl: (16) Error in the HTTP2 framing layer at 1000th request

image

curl/libcurl version

curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 OpenSSL/1.0.2n zlib/1.2.8 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) libssh2/1.8.0 nghttp2/1.31.0 librtmp/2.3
Release-Date: 2018-01-24
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL

operating system

Debian Testing

extra info

Initially I reported this issue as part of dotnet/corefx at https://github.com/dotnet/corefx/issues/27730

The issue itself is a weird regression that I managed to reproduce only with one specific server until now, but it's possible that more are affected. This could potentially hint that it's some server-specific rate-limiting or likewise, but then the reproducible case works perfectly fine on older libcurl 7.52.1, also with HTTP/2, which makes me believe that even if it's some server-related issue, it wasn't triggered before.

Side note: I think that 7.57 also worked fine and the regression happened with 7.58, but I could be wrong. It's definitely possible to reproduce with 7.58, and definitely not possible to reproduce with 7.52.1.

The issue is very weird since it always affects exactly 1000th request being sent. Doing a similar thing but with launching a new process with each request, curl in particular, does not lead to this issue - it has to be a part of a single process. You could also do 2 processes spawning 600 requests each and that also finishes just fine.

While it's possible that it's some specific server-related issue, considering that old version of libcurl works fine while new one doesn't, I believe it's some weird regression that I'm sadly unable to analyze further due to very limited knowledge in terms of C and libcurl. I'd like to apologize in advance if I messed up something.

Thank you in advance for looking into the issue.

@bagder bagder added the HTTP/2 label Mar 6, 2018
@bagder
Copy link
Member

bagder commented Mar 9, 2018

It certainly can't be a coincidence that it stops exactly at 1000. It feels like it might be related to the max number of streams or something. Running the same test against for example https://daniel.haxx.se/, it works perfectly beyond 1000. This indicates that at least it is related to something the server does.

@bagder
Copy link
Member

bagder commented Mar 9, 2018

It's not related to MAX_CONCURRENT_STREAMS, it gets set to 128...

@JustArchi
Copy link
Author

JustArchi commented Mar 10, 2018

Running the same test against for example https://daniel.haxx.se/, it works perfectly beyond 1000. This indicates that at least it is related to something the server does.

Yes, I can confirm that I reproduced it with only that one server until now. At the same time I couldn't reproduce it with exactly the same server on older libcurl. I also couldn't reproduce it with affected libcurl but splitted across several processes, e.g. 1200 requests in 2x 600.

@bagder
Copy link
Member

bagder commented Mar 10, 2018

It stops reading here: https://github.com/curl/curl/blob/master/lib/http2.c#L1280

@bagder
Copy link
Member

bagder commented Mar 10, 2018

I captured the entire run with wireshark and here's what I learned:

The server sends a GOAWAY (with error NO_ERROR) when the 1000th stream ends. This message is defined by the spec like this:

The GOAWAY frame (type=0x7) is used to initiate shutdown of a connection or to signal serious error conditions. GOAWAY allows an endpoint to gracefully stop accepting new streams while still finishing processing of previously established streams.

I think the GOAWAY should cause the connection to get closed but I figure the next request with libcurl should then simply create a new one and continue. This case should not be considered an error for libcurl.

@bagder
Copy link
Member

bagder commented Mar 10, 2018

With #2375 applied your example above reaches all the way to 1500 for me, by creating a new connection after stream 1000 ends. As one would like it to do.

@JustArchi
Copy link
Author

Yeah, that was the goal. Thank you a lot for that!

bagder added a commit that referenced this issue Mar 11, 2018
Assisted-by: Jay Satiro
Reported-by: Łukasz Domeradzki
Fixes #2365
@bagder bagder closed this as completed in 8b498a8 Mar 12, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Development

No branches or pull requests

2 participants