curl_multi_socket_action with large posts won't multiplex properly #3436

Closed
TvdW opened this issue Jan 4, 2019 · 7 comments

TvdW commented Jan 4, 2019

I did this

I'm performing multiple concurrent requests to an HTTP/2 server, each with a large POST body. Each request takes exactly one second on the server (because of sleep(1000)), and the server can run many in parallel. Using curl_multi_perform, the whole batch takes one second as expected. Using curl_multi_socket_action, it takes one second per request.

Code: https://gist.github.com/TvdW/c41cb1e74feb0c5f79be83f9d330a74b - argv[1] refers to the server to test with; I've been using https://home.xifon.eu/sleep/1000, which runs an HTTP/2 server that sleeps for exactly one second. Commenting out the "BAD" section (thus allowing the "GOOD" section to run) makes the difference visible.
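
For context, here is a minimal sketch of the pattern the gist follows: a callback installed with CURLMOPT_SOCKETFUNCTION keeps an epoll set in sync with what libcurl wants to wait for, and the loop feeds events back in via curl_multi_socket_action(). This is an illustration only, not the gist verbatim; error handling, the callback registration, and the CURLMOPT_TIMERFUNCTION plumbing are omitted, and epfd is assumed to be an epoll instance created elsewhere.

#include <curl/curl.h>
#include <sys/epoll.h>

static int epfd; /* assumed: created with epoll_create1(0) during setup */

/* libcurl tells us which events to wait for on each of its sockets */
static int socket_cb(CURL *easy, curl_socket_t s, int what,
                     void *userp, void *socketp)
{
  struct epoll_event ev = { 0 };
  ev.data.fd = (int)s;
  if(what == CURL_POLL_REMOVE) {
    epoll_ctl(epfd, EPOLL_CTL_DEL, (int)s, NULL);
    return 0;
  }
  if(what & CURL_POLL_IN)
    ev.events |= EPOLLIN;
  if(what & CURL_POLL_OUT)
    ev.events |= EPOLLOUT;
  if(epoll_ctl(epfd, EPOLL_CTL_MOD, (int)s, &ev))
    epoll_ctl(epfd, EPOLL_CTL_ADD, (int)s, &ev);
  return 0;
}

/* translate epoll events back into curl_multi_socket_action() calls */
static void event_loop(CURLM *multi)
{
  int running = 1;
  while(running) {
    struct epoll_event events[16];
    int n = epoll_wait(epfd, events, 16, 100); /* real code should use the
                                                  timeout from the timer cb */
    if(n == 0)
      curl_multi_socket_action(multi, CURL_SOCKET_TIMEOUT, 0, &running);
    for(int i = 0; i < n; i++) {
      int flags = 0;
      if(events[i].events & EPOLLIN)
        flags |= CURL_CSELECT_IN;
      if(events[i].events & EPOLLOUT)
        flags |= CURL_CSELECT_OUT;
      curl_multi_socket_action(multi, events[i].data.fd, flags, &running);
    }
  }
}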

I expected the following

Doing 10 concurrent requests that each take one second should take one second total.

curl/libcurl version

$ ./bin/curl -V
curl 7.63.0 (x86_64-pc-linux-gnu) libcurl/7.63.0 OpenSSL/1.0.2k zlib/1.2.7 c-ares/1.15.0 nghttp2/1.35.1
Release-Date: 2018-12-12
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IPv6 Largefile NTLM NTLM_WB SSL libz HTTP2 UnixSockets HTTPS-proxy 

operating system

CentOS 7

@TvdW
Copy link
Author

TvdW commented Jan 4, 2019

Some further debugging shows that the fd_set I get from curl_multi_fdset differs from the flags passed to me via the socket callback: it asks for a write on the socket. Using those sets instead does not fix the problem, though. It seems curl_multi_perform does some additional processing that curl_multi_socket_action does not.
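
For reference, that comparison can be done with something along these lines (a hypothetical debugging sketch; last_callback_what is an assumed variable holding the most recent "what" value seen in the socket callback):

#include <stdio.h>
#include <sys/select.h>
#include <curl/curl.h>

extern int last_callback_what; /* saved in the socket callback (assumed) */

static void dump_fdset_vs_callback(CURLM *multi)
{
  fd_set rd, wr, ex;
  int maxfd = -1;
  FD_ZERO(&rd);
  FD_ZERO(&wr);
  FD_ZERO(&ex);
  curl_multi_fdset(multi, &rd, &wr, &ex, &maxfd);
  for(int fd = 0; fd <= maxfd; fd++) {
    int r = FD_ISSET(fd, &rd) ? 1 : 0;
    int w = FD_ISSET(fd, &wr) ? 1 : 0;
    if(r || w)
      fprintf(stderr, "fdset: fd=%d read=%d write=%d, callback said what=%d\n",
              fd, r, w, last_callback_what);
  }
}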


TvdW commented Jan 4, 2019

Just to be sure this isn't a recent regression: it is reproducible on 7.55.1 on Ubuntu (and also on 7.63.0 on CentOS, as noted in the initial post).

bagder added the HTTP/2 label on Jan 4, 2019

bagder commented Jan 4, 2019

The code looks like it uses libcurl wrongly. It mixes event-based and non-event-based functions in a way I'm not convinced is fine. It should wait only for the sockets reported by the socket callback and should not use curl_multi_fdset() at all. The timeout logic also looks broken.


TvdW commented Jan 4, 2019

Thanks for checking. I'm not using curl_multi_fdset in the code, and the curl_multi_perform at the bottom of the file is no-opped by the loop above it, so there should be no mix of event-based and non-event-based code in the file (unless I'm missing something).

The timeout logic is indeed broken but should be sufficient for the purpose of demonstrating the issue.


bagder commented Jan 5, 2019

> runs an HTTP/2 server that sleeps for exactly one second

And exactly how sure are we that this sleep only affects one particular stream and not the whole connection?


TvdW commented Jan 6, 2019

> runs an HTTP/2 server that sleeps for exactly one second
>
> And exactly how sure are we that this sleep only affects one particular stream and not the whole connection?

That's a fair question. Here's the output of h2load:

$ dd if=/dev/zero of=./zero_data count=100000 bs=1
100000+0 records in
100000+0 records out
100000 bytes (100 kB) copied, 0.211724 s, 472 kB/s

$ h2load -d zero_data -m 100 -n 10 -c 1 https://home.xifon.eu/sleep/1000
starting benchmark...
spawning thread #0: 1 total client(s). 10 total requests
TLS Protocol: TLSv1.2
Cipher: ECDHE-RSA-AES128-GCM-SHA256
Server Temp Key: ECDH P-256 256 bits
Application protocol: h2
progress: 10% done
progress: 20% done
progress: 30% done
progress: 40% done
progress: 50% done
progress: 60% done
progress: 70% done
progress: 80% done
progress: 90% done
progress: 100% done

finished in 1.03s, 9.71 req/s, 1.47KB/s
requests: 10 total, 10 started, 10 done, 10 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 1.51KB (1550) total, 681B (681) headers (space savings 33.24%), 510B (510) data
                     min         max         mean         sd        +/- sd
time for request:      1.02s       1.02s       1.02s       260us    70.00%
time for connect:     6.29ms      6.29ms      6.29ms         0us   100.00%
time to 1st byte:      1.03s       1.03s       1.03s         0us   100.00%
req/s           :       9.71        9.71        9.71        0.00   100.00%

In other words, the server is able to handle 10 concurrent requests in 1 second, despite all being executed over the same connection. Each request takes one second.

The code I posted (test.c) should be doing the same thing (including the 100KB POST), but it takes about ten times as long; given the h2load output, I'm pretty convinced this is not a server-side issue.

Swapping the event loop code for curl_multi_perform changes the behavior of libcurl completely, and suddenly it will perform all ten requests concurrently.
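
For comparison, the non-event-based driver boils down to roughly this (a sketch, not the exact code from test.c):

#include <curl/curl.h>

static void drive_with_perform(CURLM *multi)
{
  int running = 0;
  do {
    CURLMcode mc = curl_multi_perform(multi, &running);
    if(mc == CURLM_OK && running)
      /* let libcurl pick the sockets to wait on, up to 1000 ms */
      mc = curl_multi_wait(multi, NULL, 0, 1000, NULL);
    if(mc != CURLM_OK)
      break;
  } while(running);
}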


bagder commented Jan 8, 2019

I've figured it out. Now I just need to fix it too ... =)

bagder added a commit that referenced this issue Jan 8, 2019
Fixes #3436

Problem 1

After LOTS of head-scratching, I eventually realized that even when doing 10 uploads in parallel, the socket callback that tells the application what to wait for on the socket sometimes reflected the status of only the single transfer that had just changed state.

Digging into the code revealed that this was indeed the truth. When multiple transfers use the same connection, the application did not get the *combined* flags for all transfers, which could make it switch to READ (only) when in fact most transfers wanted to be told when the socket was WRITEABLE.

Problem 1b

A separate but related regression was also introduced by me a while ago when I cleaned up the connection/transfer association: the logic could no longer find the connection to check whether it was marked as used by more transfers, so it would prematurely remove the socket from the socket hash table even while other transfers were still using it!

Fix 1

Make sure that each socket stored in the socket hash has a "combined" action field describing what to ask the application to wait for, which is potentially the ORed actions of multiple parallel transfers. Remove the socket hash entry only when no transfers are left using it.
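
Conceptually (an illustration only, not the actual libcurl source), the fix amounts to something like this:

#include <curl/curl.h>

/* sketch of a socket hash entry with a combined action field */
struct sock_entry {
  curl_socket_t fd;
  int action; /* ORed CURL_POLL_IN/CURL_POLL_OUT of every transfer on fd */
  int users;  /* number of transfers currently using this socket */
};

/* OR in one transfer's wishes instead of overwriting the previous ones */
static void add_transfer_action(struct sock_entry *e, int transfer_action)
{
  e->action |= transfer_action;
}

/* the entry (and a CURL_POLL_REMOVE to the application) only goes away
   once no transfer uses the socket anymore */
static int entry_removable(const struct sock_entry *e)
{
  return e->users == 0;
}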

Problem 2

The socket hash entry stored an association to a single transfer using that socket, and when curl_multi_socket_action() was called to tell libcurl about activity on that specific socket, only that one transfer was handled.

This was WRONG, as a single socket/connection can be used by numerous parallel transfers, not necessarily just one.

Fix 2

We now store a list of handles in the socket hash table entry, and when libcurl is told there's traffic on a particular socket, it iterates over all known transfers using that socket.
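
Again as an illustration rather than libcurl's real data structures, the change amounts to this:

#include <curl/curl.h>

/* sketch: the hash entry now tracks every transfer on the socket */
struct sock_users {
  curl_socket_t fd;
  CURL **transfers; /* all easy handles multiplexed over fd */
  int ntransfers;
};

/* process_transfer() stands in for libcurl's per-transfer state
   machine; the name is made up for this sketch */
extern void process_transfer(CURL *easy, int ev_bitmask);

/* on socket activity, handle every transfer, not just one */
static void dispatch_socket_event(struct sock_users *su, int ev_bitmask)
{
  for(int i = 0; i < su->ntransfers; i++)
    process_transfer(su->transfers[i], ev_bitmask);
}
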
bagder closed this as completed in 4c35574 on Jan 10, 2019