curl-library

RE: "pull" aspect of multi interface not quite working properly

From: Allen Pulsifer <pulsifer3_at_comcast.net>
Date: Thu, 21 Jun 2007 20:33:02 -0400

> So in summary, the problem is that curl_multi_perform
> sometimes calls CURLOPT_WRITEFUNCTION more than once each
> time it is called, which delivers in total more than
> CURL_MAX_WRITE_SIZE and overwhelms my application with data.
> The solution would be to ensure that curl_multi_perform can
> call CURLOPT_WRITEFUNCTION at most one time before returning.

In digging into this further, the most important thing IMO is to prevent
more than one read on the socket per call to curl_multi_perform. This will
limit the amount of data that will be sent to the application.

The amount of data read from the socket is limited to data->set.buffer_size,
which is directly set by CURLOPT_BUFFERSIZE. If curl_multi_perform is
limited to one read, and if there is no expansion of the data, then the
maximum amount of data sent to the application per call to
curl_multi_perform will be CURLOPT_BUFFERSIZE. In certain circumstances
however, there can be data expansion, such as when libcurl does on-the-fly
content decoding. In that case, the application may get more than
CURLOPT_BUFFERSIZE bytes in one call to curl_multi_perform and must be
prepared to handle this.
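
As a rough illustration of "prepared to handle this", here is a minimal
sketch of a write callback that accepts whatever amount it is given by
growing a buffer. The struct membuf type and the name write_cb are my own
inventions for the example, not anything in libcurl; only the callback
signature is the one CURLOPT_WRITEFUNCTION expects. It assumes the sizes
involved are small enough that the doubling never overflows.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical application-side buffer; not part of libcurl. */
struct membuf {
  char *data;   /* heap storage, NULL until first write */
  size_t len;   /* bytes stored so far */
  size_t cap;   /* bytes allocated */
};

/* Same signature as a CURLOPT_WRITEFUNCTION callback. It may be invoked
   several times, with any total size, during one curl_multi_perform. */
static size_t write_cb(char *ptr, size_t size, size_t nmemb, void *userdata)
{
  struct membuf *buf = userdata;
  size_t n = size * nmemb;

  if(n == 0)
    return 0;

  if(n > buf->cap - buf->len) {
    /* grow by doubling until the new chunk fits */
    size_t newcap = buf->cap ? buf->cap : 4096;
    char *p;
    while(newcap - buf->len < n)
      newcap *= 2;
    p = realloc(buf->data, newcap);
    if(!p)
      return 0; /* a return value other than n aborts the transfer */
    buf->data = p;
    buf->cap = newcap;
  }
  memcpy(buf->data + buf->len, ptr, n);
  buf->len += n;
  return n;
}
```

An application would register this with CURLOPT_WRITEFUNCTION and pass a
zero-initialized struct membuf through CURLOPT_WRITEDATA, then drain the
buffer at its own pace between calls to curl_multi_perform.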

Only two sets of changes to the code are needed to ensure that read is
called only once per call to curl_multi_perform.

FIRST SET OF CHANGES:

multi.c line 1362

CHANGE:

   while (easy->easy_handle->change.url_changed);

TO:

   while (0);

This is a minor change. The change.url_changed flag is only true when the
application sets CURLOPT_URL while the connection is in progress. In that
one case, this change ensures that read is not called again until the next
time the application calls curl_multi_perform.

SECOND SET OF CHANGES:

transfer.c line 345:

CHANGE:

   ((select_res & CSELECT_IN) || conn->bits.stream_was_rewound)) {

TO:

   ((select_res & CSELECT_IN) || conn->bits.stream_was_rewound) ||
       data_pending(conn)) {

transfer.c line 1347:

CHANGE:

   while(data_pending(conn));

TO:

   while (0);

This moves the test for data_pending(conn) out of the loop. As a result,
curl_multi_perform will not loop back to repeat the read when
data_pending(conn) is true. Instead it will return to the application and
perform the next read the next time it is called.
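
To make the effect of moving the test concrete, here is a toy model of the
two loop shapes. Nothing in it is libcurl code: struct fake_conn,
fake_read, fake_data_pending and the two perform_* functions are all
hypothetical stand-ins, and "pending" simply means bytes not yet handed to
the application. It is only a sketch of the control flow.

```c
#include <stddef.h>

/* Hypothetical stand-in for a connection; not a libcurl structure. */
struct fake_conn {
  size_t pending;      /* bytes still waiting to be read */
  size_t buffer_size;  /* plays the role of CURLOPT_BUFFERSIZE */
};

static int fake_data_pending(const struct fake_conn *c)
{
  return c->pending > 0;
}

/* One read: hands at most buffer_size bytes to the application. */
static size_t fake_read(struct fake_conn *c)
{
  size_t n = c->pending < c->buffer_size ? c->pending : c->buffer_size;
  c->pending -= n;
  return n;
}

/* Original shape: keep reading for as long as data is pending. */
static size_t perform_looping(struct fake_conn *c)
{
  size_t delivered = 0;
  do {
    delivered += fake_read(c);
  } while(fake_data_pending(c));
  return delivered;
}

/* Patched shape: the pending test gates entry, and the body runs once. */
static size_t perform_single(struct fake_conn *c, int socket_readable)
{
  size_t delivered = 0;
  if(socket_readable || fake_data_pending(c)) {
    do {
      delivered += fake_read(c);
    } while(0);
  }
  return delivered;
}
```

With 40000 bytes pending and a 16384-byte buffer, perform_looping delivers
everything in one call, while perform_single delivers at most 16384 bytes
per call and picks up the remainder on later calls even when the socket
itself is not reported readable.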

I'm currently running libcurl with these changes. As far as I can tell,
they have no adverse side-effects, but I can't guarantee that.

There is more that could be done to make libcurl even more "pull". For
example, content decompression could be limited to CURLOPT_BUFFERSIZE bytes
of output and then resumed on the next call to curl_multi_perform. This
looks like it would require more significant changes to the code, though,
and the same result could alternatively be achieved by disabling
decompression in libcurl and performing it in the application instead.
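
For what "limited and then resumed" could look like on the application
side, here is a minimal sketch using a toy run-length format of
(count, value) byte pairs in place of a real decoder such as zlib. The
format, struct rle_state and rle_step are all made up for the example. The
point is only that the decoder keeps its state between calls and never
emits more than the caller's cap.

```c
#include <stddef.h>

/* Hypothetical resumable decoder state for a toy (count, value) format. */
struct rle_state {
  const unsigned char *in;  /* next unread (count, value) pair */
  size_t in_left;           /* input bytes not yet consumed */
  unsigned char value;      /* value of the run being expanded */
  size_t run_left;          /* bytes of that run still to emit */
};

/* Emit at most cap bytes into out, then stop. Calling it again resumes
   exactly where the previous call left off. Returns the bytes written;
   0 means the input is exhausted. */
static size_t rle_step(struct rle_state *s, unsigned char *out, size_t cap)
{
  size_t n = 0;
  while(n < cap) {
    if(s->run_left == 0) {
      if(s->in_left < 2)
        break;              /* no complete pair left */
      s->run_left = s->in[0];
      s->value = s->in[1];
      s->in += 2;
      s->in_left -= 2;
      continue;             /* re-check, which also skips zero-length runs */
    }
    out[n++] = s->value;
    s->run_left--;
  }
  return n;
}
```

Given the input {5, 'a', 3, 'b'} and a cap of 4, successive calls produce
"aaaa", then "abbb", then nothing, so the application decides how much
expanded data it takes on each pass.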

Allen
Received on 2007-06-22