Re: libcurl + libevent2: Stalling if no data is received/written [new timeout patch]

From: Dirk Manske <dm_at_nonitor.de>
Date: Mon, 20 Sep 2010 19:14:57 +0200

> > But I've found a new detail, hiperfifo and connecting to nc -l -p 9999:
>
> > As already written the error message is missing, but I guess
The attached patch brings the error messages back.

> > more worse is what curl tells us:
> > * Connection #0 to host 127.0.0.1 left intact
> >
> > I think curl should close that connection.
> > Or would that be a problem if pipeling is used?
>
> If the app is done right, then yes it should. I mean, the "left intact" isn't
> a problem but it should say that it gets closed after that line.

The connection will not be closed until another request uses the same connection.
And if curl then detects that the connection is dead, the timeout of the new
request gets cleared. I detected that flaw while testing against a cgi script
which simply sleeps for 15 seconds:

1284986731.171602 Adding easy 0x24a1468 to multi 0x2474c98 (http://127.0.0.1/cgi-bin/sleep)
...
1284986731.174426 multi_timer_cb: Setting timeout to 4000 ms
...
1284986735.174288 multi_timer_cb: Setting timeout to 3000 ms
...
* STATE: DO => DO_DONE handle 0x24bc368; (connection #0)
...
* Operation timed out after 7043 milliseconds with 0 out of -1 bytes received
...
1284986738.177407 DONE: http://127.0.0.1/cgi-bin/sleep => (28) msg:Operation timed out after 7043 milliseconds with 0 out of -1 bytes received
* Connection #0 to host 127.0.0.1 left intact
1284986738.177585 timer_cb called fd:-1 kind:1
1284986738.177600 timer_cb curl_multi_socket_action rc:0 running:0

As configured, the operation timed out after 7 seconds. But try again:

1284986844.596010 Adding easy 0x24a1468 to multi 0x2474c98 (http://127.0.0.1/cgi-bin/sleep)
...
* Connection #0 seems to be dead!
* Expire cleared
* Closing connection #0
...
1284987006.009698 multi_timer_cb: Setting timeout to -1 ms
1284987006.009712 timer_cb curl_multi_socket_action rc:0 running:1
1284987006.009726 timer_cb called fd:-1 kind:1
1284987006.009738 timer_cb curl_multi_socket_action rc:0 running:1
* Operation timed out after 15009 milliseconds with 0 out of -1 bytes received
* STATE: PERFORM => COMPLETED handle 0x24bc328; (connection #-5000)
* STATE: COMPLETED => MSGSENT handle 0x24bc328; (connection #-5000)
1284987021.018430 socket callback: s=7 e=0x24a1468 what=REMOVE
1284987021.018467 REMAINING: 0
1284987021.018481 DONE: http://127.0.0.1/cgi-bin/sleep => (28) msg:Operation timed out after 15009 milliseconds with 0 out of -1 bytes received
* Connection #0 to host 127.0.0.1 left intact
1284987021.018631 last transfer done, kill timeout

(The cgi script is done after 15s, otherwise we wouldn't see the above message.)

In url.c the dead connection is detected, then Curl_disconnect is called,
which calls Curl_expire. And this purges the timeout of the new request.

Without thinking about side effects I would remove the Curl_expire(0) call
in Curl_disconnect. Or the timeouts must be re-added at some other place.
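
To illustrate the sequence I mean, here is a toy model in plain C. It is not
libcurl code: every name (toy_handle, toy_expire, toy_disconnect,
toy_start_request) is made up, and the order of the steps (timeout armed first,
dead connection detected afterwards) is my assumption from the log above.

```c
#include <assert.h>

/* Toy model only, NOT libcurl internals. All names are hypothetical. */
struct toy_conn { int dead; };
struct toy_handle { long expire_ms; };   /* 0 means no timeout armed */

/* Arm the timeout; passing 0 clears it (mimics the "Expire cleared" line). */
static void toy_expire(struct toy_handle *h, long ms)
{
    assert(ms >= 0);
    h->expire_ms = ms;
}

/* Drop a dead cached connection. With clear_timer set, this also wipes
 * the timeout that belongs to the request which is just starting. */
static void toy_disconnect(struct toy_handle *h, struct toy_conn *c,
                           int clear_timer)
{
    c->dead = 0;                 /* connection is gone, slot is free again */
    if (clear_timer)
        toy_expire(h, 0);
}

/* Start a request that wants to reuse a cached connection. Returns the
 * timeout the application would be told about, or -1 if none is armed. */
static long toy_start_request(struct toy_handle *h, struct toy_conn *cached,
                              long timeout_ms, int clear_timer)
{
    toy_expire(h, timeout_ms);   /* assumed: timeout is armed first */
    if (cached->dead)
        toy_disconnect(h, cached, clear_timer);
    return h->expire_ms ? h->expire_ms : -1;
}
```

With clear_timer set, the second request ends up with -1 like in the log;
without it, the 7000 ms timeout survives the disconnect.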

> > And the second problem..
>
> >> Alright, connection refused, but where is the DONE message from
> >> hiperfifo...
> >
> > I must correct me, that is not a new side effect. That bug happens with
> > 7.21.1 too.
>
> That done output is made by hiperfifo itself isn't it?
Yes.
> I didn't quite see how that was clearly a bug in libcurl.

I would expect that libcurl tells the application that a request
is done by calling the socket function with action set to CURL_POLL_REMOVE.
But you could bring forward the argument that there wasn't a socket,
so there is also nothing to remove...

Nevertheless I've patched multi_runsingle to call the socket_cb with
CURL_POLL_REMOVE ... and it doesn't help (in this case).

You were right, the real bug is in hiperfifo. In the function check_run_count
the numbers of previously running and still running requests are compared.
curl_multi_info_read is only called "if (g->prev_running > g->still_running)".
But if the request fails early, the running counter was never increased.

(Not only a fast "connection refused" can cause trouble, but also using
an unsupported protocol etc.)

So applications have to maintain their own counter, or simply call
curl_multi_info_read and see if there are messages.
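
A small sketch of why the counter comparison misses the early failure. This is
a toy model in plain C without libcurl; toy_multi and all the function names
are hypothetical, and toy_drain only stands in for a loop over
curl_multi_info_read.

```c
#include <assert.h>

/* Toy stand-in for a multi handle, NOT libcurl. All names are hypothetical. */
struct toy_multi {
    int running;   /* transfers still in progress */
    int pending;   /* completion messages waiting to be read */
};

/* A transfer that starts normally counts as running. */
static void toy_add_transfer(struct toy_multi *m) { m->running++; }

/* A running transfer finishes: counter drops, one message is queued. */
static void toy_finish_transfer(struct toy_multi *m)
{
    assert(m->running > 0);
    m->running--;
    m->pending++;
}

/* A transfer that fails at once (connection refused, unsupported
 * protocol): it never shows up as running but still queues a message. */
static void toy_add_failing_transfer(struct toy_multi *m) { m->pending++; }

/* Read all queued messages; returns how many were read. */
static int toy_drain(struct toy_multi *m)
{
    int n = m->pending;
    m->pending = 0;
    return n;
}

/* The check as described above: only read when the counter dropped. */
static int check_by_counter(struct toy_multi *m, int *prev_running)
{
    int n = 0;
    if (*prev_running > m->running)
        n = toy_drain(m);
    *prev_running = m->running;
    return n;
}

/* The simple alternative: always look for messages. */
static int check_always(struct toy_multi *m) { return toy_drain(m); }
```

For a transfer that fails early, check_by_counter reads nothing because the
counter never moved, while check_always picks the message up.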

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html

Received on 2010-09-20