
Memory bloat in libcurl when using http/2

From: Sangamkar, Dheeraj via curl-library <curl-library_at_cool.haxx.se>
Date: Fri, 31 Jan 2020 09:27:06 +0000

Hello,

Environment: Debian 10 with libcurl 7.64.0-4 (with some minor patches), nghttp2 version 1.36.
Problem: Memory consumption of the application keeps growing when reading a large amount of data using libcurl over http/2 over TLS. The memory is released later or during application shutdown, so I don't think there is a leak.
Problem NOT seen with http/1.1 in the same environment and workload.

Application overview: an http server that serves GETs by using libcurl to read the data from multiple http/2 source servers. The application uses one multi handle per thread in a thread pool and schedules curl easy handles for execution using curl_multi_perform on any one thread. When some bytes of the body of a GET response are received, the WRITEFUNCTION for the handle returns CURL_WRITEFUNC_PAUSE. The transfer is later unpaused using curl_easy_pause(.., CURLPAUSE_CONT).

When the concurrency of GETs of large content (100+ MB) is increased, the application's memory use bloats to 2-6 GB and the application ultimately crashes.

Heapdump from jemalloc was inspected and showed that most of the memory was held by libcurl.
Here is a dump of the top consumers in libcurl (many other functions removed):

  1041.4  77.3%  77.3%  1041.4  77.3%  curl_domalloc /src/curl-7.64.0/lib/memdebug.c:170
   178.8  13.3%  90.5%   178.8  13.3%  curl_dorealloc /src/curl-7.64.0/lib/memdebug.c:288
     0.0   0.0% 100.0%  1202.7  89.2%  Curl_client_write /src/curl-7.64.0/lib/sendf.c:676
     0.0   0.0% 100.0%  1033.3  76.7%  Curl_memdup /src/curl-7.64.0/lib/strdup.c:69
     0.0   0.0% 100.0%   171.4  12.7%  Curl_readwrite /src/curl-7.64.0/lib/transfer.c:1294
     0.0   0.0% 100.0%  1033.3  76.7%  chop_write /src/curl-7.64.0/lib/sendf.c:604
     0.0   0.0% 100.0%  1032.3  76.6%  curl_easy_pause /src/curl-7.64.0/lib/easy.c:1083
     0.0   0.0% 100.0%   167.4  12.4%  pausewrite /src/curl-7.64.0/lib/sendf.c:521
     0.0   0.0% 100.0%  1033.3  76.7%  pausewrite /src/curl-7.64.0/lib/sendf.c:532
     0.0   0.0% 100.0%   170.4  12.6%  readwrite_data /src/curl-7.64.0/lib/transfer.c:934

The PDF version of the report shows that the call sequence that allocates the memory on the heap is:
curl_easy_pause -> Curl_client_write -> chop_write -> pausewrite -> curl_domalloc (1+ GB).

Testing libcurl with modified info log messages showed that tens of MB of data are buffered on the curl easy handle.

Change to the info message generation in function pausewrite(…) in lib/sendf.c:

    DEBUGF(infof(data,
                 "Paused %zu more bytes in buffer of size %zu for type %02x\n",
                 len, s->tempwrite[i].len, type));

Messages printed when test is run:


 Info: Paused 4096 more bytes in buffer of size 4096 for type 01
 Info: Paused 4096 more bytes in buffer of size 9539557 for type 01
 Info: Paused 12315 more bytes in buffer of size 9551872 for type 01
 Info: Paused 16375 more bytes in buffer of size 9568247 for type 01
 Info: Paused 4096 more bytes in buffer of size 5869541 for type 01
 Info: Paused 16375 more bytes in buffer of size 9584622 for type 01
 Info: Paused 20086766 more bytes in buffer of size 20086766 for type 01
 Info: Paused 20086766 more bytes in buffer of size 20086766 for type 01
 Info: Paused 43628028 more bytes in buffer of size 43628028 for type 01
 Info: Paused 43595260 more bytes in buffer of size 43595260 for type 01
 Info: Paused 43595260 more bytes in buffer of size 43595260 for type 01
 Info: Paused 638985 more bytes in buffer of size 638985 for type 01
 Info: Paused 638985 more bytes in buffer of size 638985 for type 01

This implies that the buffers in curl_easy->state->tempwrite[BODY] keep growing to tens of MB.

My model of the problem is that the application cannot read the body of the GET response as fast as libcurl and the http/2 servers can deliver it. When this happens, libcurl accumulates the received data on the curl easy handle. Does this seem right in light of the evidence provided?
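
The model described above can be sketched as a toy simulation in plain C. This is only an illustration under assumed numbers and does not use or reproduce libcurl internals; `paused_buffer`, `model_round` and `model_run` are hypothetical names invented for this sketch. It shows that when each round delivers more bytes than the application accepts, and nothing limits the sender, the held bytes grow linearly with the number of rounds.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT libcurl code: bytes parked for one paused transfer. */
struct paused_buffer {
    size_t held; /* bytes currently buffered for the application */
    size_t peak; /* high-water mark, measured at the end of each round */
};

/* One round: the network delivers `arrived` bytes and the application
 * accepts at most `consumed` bytes before pausing again. */
static void model_round(struct paused_buffer *b, size_t arrived, size_t consumed)
{
    assert(b != NULL);
    b->held += arrived;
    if (consumed > b->held)
        consumed = b->held;      /* cannot consume more than is buffered */
    b->held -= consumed;
    if (b->held > b->peak)
        b->peak = b->held;
}

/* Run `rounds` identical rounds and return the bytes still held. */
static size_t model_run(struct paused_buffer *b, size_t rounds,
                        size_t arrive_per_round, size_t consume_per_round)
{
    for (size_t i = 0; i < rounds; i++)
        model_round(b, arrive_per_round, consume_per_round);
    return b->held;
}
```

With an assumed 16384 bytes arriving and 4096 bytes consumed per round, 1000 rounds leave 12288000 bytes held, which is the same order of magnitude as the buffer sizes in the log messages above. When arrival and consumption rates match, nothing accumulates.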

I also suspect that when curl_easy_pause is called (in the same thread), it calls the write function, possibly multiple times. If only the first invocation accepts data, libcurl keeps the remaining data in the temp buffers. This memory then somehow grows without pushing back on the server via the http/2 stream corresponding to the GET.

Is this a known issue or a fixed problem? Is there a workaround?
Does libcurl’s interfacing with nghttp2 provide a way for a slow reader to push back to the server during GET?
How can I fix this? What files and functions should I be looking at in the libcurl code to tell the http server via nghttp2 that I don't want to receive any more data for a GET response body beyond a certain limit, say 1 MB per GET?
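
The kind of push-back asked about here can be illustrated with a toy model of HTTP/2-style per-stream flow control, where the sender may only send as many bytes as the receiver's window allows and the window is reopened only when the application actually consumes data. This is a sketch under stated assumptions, not libcurl or nghttp2 code; `stream_model`, `stream_init`, `stream_send` and `stream_consume` are hypothetical names, and the 1 MB limit is the example figure from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Toy receiver-side model of per-stream flow control (hypothetical names). */
struct stream_model {
    size_t window;   /* bytes the sender is still allowed to send */
    size_t buffered; /* bytes received but not yet consumed by the application */
    size_t limit;    /* per-stream cap on buffered bytes */
};

static void stream_init(struct stream_model *s, size_t limit)
{
    assert(s != NULL);
    s->window = limit;
    s->buffered = 0;
    s->limit = limit;
}

/* Sender wants to send `want` bytes; only what fits in the window is sent.
 * Returns the number of bytes actually sent. */
static size_t stream_send(struct stream_model *s, size_t want)
{
    size_t n = want < s->window ? want : s->window;
    s->window -= n;
    s->buffered += n;
    return n;
}

/* Application consumes up to `want` bytes. Only consumed bytes reopen the
 * window, which models sending a window update after consumption. */
static size_t stream_consume(struct stream_model *s, size_t want)
{
    size_t n = want < s->buffered ? want : s->buffered;
    s->buffered -= n;
    s->window += n;
    return n;
}
```

In this model `window + buffered` always equals `limit`, so a slow reader never holds more than the cap per stream, no matter how much the sender has available.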

-Dheeraj

(Note: This is similar to a problem fixed last year for http/1.1 GETs in libcurl)

-------------------------------------------------------------------
Received on 2020-01-31