
curl-library

Re: Massive HTTP/2 parallel requests

From: Molina <jose.molina_at_cern.ch>
Date: Fri, 11 Mar 2016 18:09:15 +0100

Hello, and many thanks for the replies.

> Can these values be set programmatically per socket? Did you do any magic calculations to end up with this new value? Or did you experiment your way to the optimal value?
>
> Some sources[1] say this config sets the maximum TCP window it can announce, which to me sounds a bit small while if it is buffers per socket it sounds pretty big.
>
> Another place[2] says net.inet.tcp.recvspace is the receive buffer size and that this value, plus net.inet.tcp.sendspace, must be less than kern.ipc.maxsockbuf.

Concerning the programmatic way, I saw it is possible [1], and that is actually the article that inspired me. However, it clearly states that the current buffer size (per socket, as I understand it) can be modified up to the maximum, but the maximum itself cannot be changed without administrator permissions, which is not (and should not be) the usual case. To change a socket's current size it is enough to call:

setsockopt(skt, SOL_SOCKET, SO_RCVBUF, (char *)&rcvsize, (int)sizeof(rcvsize));
For more information see the setsockopt() documentation [2]. According to these readings, net.inet.tcp.recvspace seems to be the default socket buffer size, provided it is not greater than the maximum, and net.inet.tcp.autorcvbufmax seems to be that maximum, only modifiable with administrator privileges.
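
Just to make that concrete, a minimal sketch (assuming a BSD-style sockets API; skt and rcvsize are hypothetical names for an already-created TCP socket and the buffer size we want) of growing the per-socket buffer and checking what the kernel actually granted:

#include <stdio.h>
#include <sys/socket.h>

/* Ask for a bigger receive buffer and read back the size the kernel
   actually granted; the request is silently capped by the system-wide
   limits discussed below. Returns the effective size, or -1 on error. */
static int grow_rcvbuf(int skt, int rcvsize)
{
  socklen_t len = (socklen_t)sizeof(rcvsize);

  if(setsockopt(skt, SOL_SOCKET, SO_RCVBUF, &rcvsize, sizeof(rcvsize)) < 0) {
    perror("setsockopt(SO_RCVBUF)");
    return -1;
  }
  if(getsockopt(skt, SOL_SOCKET, SO_RCVBUF, &rcvsize, &len) < 0) {
    perror("getsockopt(SO_RCVBUF)");
    return -1;
  }
  printf("effective SO_RCVBUF: %d bytes\n", rcvsize);
  return rcvsize;
}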
Strictly speaking, setting a higher default buffer size is not necessary for my particular problem; leaving the default alone only delays the download a bit, because the stack has to scale the buffer up over perhaps a few packet exchanges between client and server. In the same way, raising net.inet.tcp.autorcvbufmax is the key to increasing performance when there is a considerably large round-trip time (RTT).

Concerning the calculation, the idea was to use the following formula:

WS >= Bandwidth * RTT

For my particular case I had an RTT of 135 ms = 0.135 s and an empirically measured bandwidth of 600 Mb/s = 75 MB/s. Hence WS = 75 MB/s * 0.135 s = 10.125 MB. My idea was to round up (just in case) to the next power of two, so in this case 2^24 bytes = 16 MB.
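
Or, written as a throwaway program (next_pow2 is just my name for the rounding-up step):

#include <stdio.h>

/* round a byte count up to the next power of two */
static unsigned long next_pow2(unsigned long v)
{
  unsigned long p = 1;
  while(p < v)
    p <<= 1;
  return p;
}

int main(void)
{
  double bandwidth = 75.0 * 1024 * 1024; /* ~600 Mb/s measured, in bytes/s */
  double rtt = 0.135;                    /* 135 ms round-trip time */
  double bdp = bandwidth * rtt;          /* bytes needed in flight to fill the pipe */

  printf("BDP: %.2f MB -> window: %lu MB\n",
         bdp / (1024 * 1024),
         next_pow2((unsigned long)bdp) / (1024 * 1024));
  /* prints roughly: BDP: 10.12 MB -> window: 16 MB */
  return 0;
}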

Back to the different buffers [3], and the differences between them:

net.inet.tcp.recvspace: the net.inet.tcp.sendspace and net.inet.tcp.recvspace settings control the maximum TCP window size the system will allow for sending from the machine or receiving to the machine. Up until the latest releases of most operating systems, these values defaulted to 65535 bytes. As I understand it, that is the maximum amount of data that can be in flight on that connection… as long as its value is NOT modified.

kern.ipc.maxsockbuf: this value sets the maximum number of bytes of memory that can be allocated to a single socket. However, the article recommends letting the OS decide its value, and I did not have to set that variable to get better performance out of my test. In any case, it is a possibility to be explored.
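
For completeness, those knobs can also be inspected from code on OS X/BSD through sysctlbyname() (read-only here; writing them back is precisely what needs administrator privileges, as said above):

#include <stdio.h>
#include <sys/types.h>
#include <sys/sysctl.h>

/* print one TCP-buffer-related sysctl; the buffer is sized for either a
   32-bit or a 64-bit value since the exact type differs between systems */
static void print_sysctl(const char *name)
{
  long long value = 0;
  size_t len = sizeof(value);

  if(sysctlbyname(name, &value, &len, NULL, 0) != 0) {
    perror(name);
    return;
  }
  if(len == sizeof(int))
    printf("%-28s = %d\n", name, *(int *)&value);
  else
    printf("%-28s = %lld\n", name, value);
}

int main(void)
{
  print_sysctl("net.inet.tcp.recvspace");     /* default receive window */
  print_sysctl("net.inet.tcp.autorcvbufmax"); /* auto-tuning upper bound */
  print_sysctl("kern.ipc.maxsockbuf");        /* max memory per socket */
  return 0;
}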

> So this allows the receive buffer to automatically grow to 16MB. Have you understood how it correlates with net.inet.tcp.recvspace? And again with the actual value, how did you end up with this number?
It’s explained in the previous paragraph.

> But how about other clients? Did Firefox/Chrome also work slowly until this change or did they work fast already without the parameter tweaks?
To be honest, I am not using any browser in my benchmarks, mainly because (as far as I understood) they only support HTTP/2 over SSL, which is neither the case for the product I'm working on nor for the server I'm using. Also, I haven't found any other client that lets the user specify the per-connection and per-stream window sizes, which is also mentioned in this email chain.

> We currently don't have any such API. Ideally we should also try to avoid having one I think because it only leads to the very hard question: what should a user set the value to? It would be much better if we figure that out and set a suitable value ourselves. But maybe that's not possible?
>
> In general though, libcurl doesn't really make much use of the HTTP/2 flow control features so it would probably make sense to more or less by default use a pretty large window.
Firstly, a pretty large window that is not efficiently used means a waste of memory. That being said, taking into account the nature of HTTP/2, I'd say it would be very convenient to have a larger window size. The current default of 2^16 = 64 KB might become insufficient as soon as you launch many parallel file requests over the same HTTP/2 connection. If I had to pick a number I'd start with at least 1 MB to cover most use cases. However, that has already proven not to be enough for my particular case, so one cannot really offer a single global value that performs well in all cases and does not waste memory.

Now talking about a possible use case, and this is my personal opinion, I think a user might use curl for many things. Those things may include what I was forced to do with nghttp because curl did not provide an option. That doesn't mean a user has to specify the option every time, and it's not incompatible with having a sensible default. It's just that providing --window-bits=<N> and --connection-window-bits=<N>-like options in curl would be a possibility to consider, at least for HTTP/2.

> The window is set per-stream and libcurl sets up the streams (using libnghttp2 of course) so in order to get this feature we need to add the ability to libcurl.
> We make libcurl do what we think it should do. If we think it should be able to set the HTTP/2 stream window size then we make that possible! Feel free to dig in and help us write that patch. But perhaps we could start with simply making the default window larger than what it currently is and maybe we can be fine with that?
At this point I'd like to explore how both libraries would make this possible. Recently, after suspecting libcurl does not offer the possibility to tune the HTTP/2 window size, I started to take a look at libnghttp2 to do it manually in my project. Concerning static tuning of the window size, as I mentioned before, setting it to at least 1 MB is definitely worth a try. I will set that value, recompile curl, and of course report here the results I obtain with my test.
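
To give an idea of what such a patch would have to do, here is roughly how the windows can be enlarged through the nghttp2 API once a session exists (only a sketch, not libcurl code; session is an already-initialised nghttp2_session, and the 1 MB / 16 MB figures are the ones discussed earlier):

#include <nghttp2/nghttp2.h>

/* Enlarge both HTTP/2 flow-control windows. Returns 0 on success or an
   nghttp2 error code; the frames go out on the next nghttp2_session_send(). */
static int raise_http2_windows(nghttp2_session *session)
{
  int rv;

  /* per-stream window: every new stream will start with 1 MB */
  nghttp2_settings_entry iv[] = {
    { NGHTTP2_SETTINGS_INITIAL_WINDOW_SIZE, 1024 * 1024 }
  };
  rv = nghttp2_submit_settings(session, NGHTTP2_FLAG_NONE, iv, 1);
  if(rv != 0)
    return rv;

  /* connection window: stream 0 means the whole connection, so a
     WINDOW_UPDATE on it grows the connection window towards 16 MB,
     the bandwidth*RTT estimate from above */
  return nghttp2_submit_window_update(session, NGHTTP2_FLAG_NONE, 0,
           16 * 1024 * 1024 - NGHTTP2_INITIAL_CONNECTION_WINDOW_SIZE);
}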

> The discussion sounds interesting, is the email chain available on public archives? I'm a little confused how the original issue of a 64 stream limit has ended up as an issue related to TCP buffers. Or did you solve that one and move onto performance tuning?
You can find it here: http://mail-archives.apache.org/mod_mbox/httpd-modules-dev/201603.mbox/browser
The 64-stream limit was a limitation of the server, as far as I understood. What I naively thought was that the more files you put on the connection, the better performance you get. With my tests I realised that's not true: with just 16 parallel streams I get the same performance as with 64.

> I could guess that the libcurl window size defaults mean that streams receiving data at an equal rate eventually fill the client connection window and block the server from sending more data. But that would suggest that the clients are not sending WINDOW_UPDATE frames in a timely manner. Can you shed any more light on the process of coming to this solution?
Yes, I think I can shed some light on this issue. If you take a look at the last message of the chain [4], you can see that the distance between server and client is so big that, as you mentioned, the client does not send the WINDOW_UPDATE in time. But why? The client sends it when its current window is at half of its capacity, which happens suddenly when it receives packets from the server. The problem is that the server does not receive this message in time either, so it does not send any more packets while the update is still on its way.
However, if more packets are allowed to be sent to the client without waiting for a response, at least half of the client's window worth of data arrives early enough for the WINDOW_UPDATE to be sent while the server is most probably still transmitting the previous chunk of data. That implies the server never stops sending data, which is the ultimate goal to achieve.
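
To put numbers on that stall, here is the same reasoning as a throwaway calculation (with strict flow control, at most one full window can be in flight per RTT, so window/RTT is the hard ceiling for a single stream):

#include <stdio.h>

int main(void)
{
  double rtt = 0.135;                /* measured round-trip time, in seconds */
  double small = 64.0 * 1024;        /* default ~64 KB stream window */
  double large = 16.0 * 1024 * 1024; /* tuned 16 MB window */

  printf("64 KB window: at most %.2f MB/s per stream\n",
         small / rtt / (1024 * 1024));
  printf("16 MB window: at most %.2f MB/s per stream\n",
         large / rtt / (1024 * 1024));
  /* roughly 0.46 vs 118 MB/s, against the 75 MB/s the link can actually carry */
  return 0;
}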

Best regards,
José

[1] http://www.onlamp.com/2005/11/17/tcp_tuning.html
[2] http://pubs.opengroup.org/onlinepubs/009695399/functions/setsockopt.html
[3] https://rolande.wordpress.com/2014/05/17/performance-tuning-the-network-stack-on-mac-os-x-part-2/
[4] http://mail-archives.apache.org/mod_mbox/httpd-modules-dev/201603.mbox/browser

Received on 2016-03-11