Re: [PATCH 2/7] pipelining: Fix connection handling under timeouts.

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Tue, 11 Nov 2014 00:20:40 +0100 (CET)

On Mon, 10 Nov 2014, Carlo Wood wrote:

> If you agree with that, then what we need to detect is whether a request has
> already sent data to the server or not. Not whether it already reached the
> DONE state. The added multi_do_connection_id is for this purpose:

That's not exactly what it detects though and hence my question. First, the
call to Curl_do() is not guaranteed to send any data (even if it is likely).

(It is interesting to recall that there are many more protocols than HTTP...)

The later check seems to verify that the timeout happened for the transfer that
issued the most recent request on the connection (even the last one done on
the multi handle, which may use any number of pipelines and connections), and
*only* *then* do you set the bits.timedout bit. What if the timeout happens on
some other request in the pipeline (or if it issued a request on another
pipeline in between)? Shouldn't the connection get marked for timeout then as
well?

I say you can't know (using this method) whether data truly has been sent. I
think the fact that we've not reached DONE could be used as a data-has-been-sent
signal for simplicity, as it avoids that weird check. There's a writebytecount
that _can_ be read if we really wanted to know exactly, but I think that's an
optimization we can save for later.
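
To illustrate the difference, here's a minimal sketch with made-up names
(sketch_request, reached_done and friends are not the actual libcurl struct
layout): the exact variant reads a byte counter, the simple variant just
treats "not yet DONE" as the signal.

  /* Minimal sketch, not libcurl source: hypothetical names illustrating
     the two ways to decide whether a request already sent data. */
  #include <stdbool.h>
  #include <stddef.h>

  struct sketch_request {
    size_t writebytecount; /* bytes of the request sent so far */
    bool reached_done;     /* transfer has reached the DONE state */
  };

  /* exact variant: look at the byte counter */
  static bool has_sent_data_exact(const struct sketch_request *req)
  {
    return req->writebytecount > 0;
  }

  /* simpler variant: "not yet DONE" used as the data-has-been-sent signal */
  static bool has_sent_data_simple(const struct sketch_request *req)
  {
    return !req->reached_done;
  }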

>> 2 - For a transfer that you abort due to a timeout, that is part of a
>> pipeline and has data pending, you set bits.timedout to prevent more
>> transfers from getting added to the pipeline. But shouldn't you get the
>> same effect by setting bits.close, and thus avoid a new struct member for
>> only this purpose?
>
> No, I tried that first. But the result of doing that is that the (new)
> connection isn't used until it actually connects.

Does this mean we happily add requests to connections that are marked to get
closed? That seems wrong! The bit only needs to make new requests not use this
connection, isn't that so?
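
To make the point concrete, here's a minimal sketch with invented names
(sketch_connection, sketch_conn_is_reusable - not libcurl's real connection
cache code): a connection marked for close simply never gets new requests
attached to it, which is all the timeout case needs.

  /* Minimal sketch, not libcurl source: a close-marked connection is
     skipped when picking where to put a new request. */
  #include <stdbool.h>

  struct sketch_connection {
    struct {
      bool close; /* connection is marked to get closed after use */
    } bits;
    int inuse;    /* transfers currently using this connection */
  };

  /* may a new request be attached to (or pipelined on) this connection? */
  static bool sketch_conn_is_reusable(const struct sketch_connection *conn)
  {
    if(conn->bits.close)
      return false; /* marked for close (for whatever reason): skip it */
    return true;
  }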

I also have a little bit of trouble accepting that "bits.timedout" is for the
connection when it isn't the connection that timed out; the comment says
"the transfer timedout", but the bit is not for the transfer, it is for the
connection.

> Namely, upon construction of a new connectdata the close bit is set, as the
> default, to close the connection after use - until the user explicitly
> requests reuse. This (resetting the close bit) only happens after the
> connection succeeds, I found out.

That goes nicely with dont-pipeline-until-we-know-it-can, so it seems fine to
me.
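
For reference, the lifecycle Carlo describes, as a minimal sketch with made-up
names (sketch_conn_new and friends are not the real allocation path): the
close bit starts out set and is only cleared once the connection has actually
connected, which is what keeps pipelining off unproven connections.

  /* Minimal sketch, not libcurl source: close-by-default until the
     connection has proven itself. */
  #include <stdbool.h>
  #include <stdlib.h>

  struct sketch_conn {
    bool close_after_use; /* do not reuse this connection */
  };

  static struct sketch_conn *sketch_conn_new(void)
  {
    struct sketch_conn *c = calloc(1, sizeof(*c));
    if(c)
      c->close_after_use = true; /* default: close after use */
    return c;
  }

  static void sketch_conn_connected(struct sketch_conn *c)
  {
    /* only a connected connection may be reused (and thus pipelined on) */
    c->close_after_use = false;
  }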

> The resulting effect was this: all requests in the recv pipe whose head
> timed out (invalidating everything) created their own new connection.

I thought we agreed that is how the code works right now - on purpose. I don't
think it is right to change the behavior like this. This commit is about
fixing how we deal with timeouts.

> all requests end up firing a connect() of their own.

... up to the set limits, yes.

> Using a new bit seemed a lot easier than (in fact) abusing an existing bit
> with a double meaning (close after DONE / poisoned due to timeout) where I'd
> have to add hacks to achieve that double-duty use - hacks which might be
> broken accidentally again in the future.

Sorry, but I don't see it as double use. It is one single meaning: marked to
get closed. The fact that the reason was a timeout doesn't have to be
conveyed. At least I don't think this situation is a reason for it.

>> 3 - The changing of the state to DONE as Monty already commented on:
>> I believe there's a minuscule risk of a race condition here if the
>> timeout is triggered when already in the CURLM_STATE_DONE state, so I
>> believe we need a check for that and if already in DONE go to
>> CURLM_STATE_COMPLETED instead. Your thoughts?
>
> I don't think that multi_runsingle() can be invoked recursively, can it? Or
> more to the point, it cannot be invoked recursively from the state
> CURLM_STATE_DONE.

Nah, but it can get into this condition multiple times as it may stay in the
DONE state for a while, and that seems unfortunate even if it might not be
lethal, so we should make it not happen.

In fact, it struck me just now when reading this code more closely that when
we go to the DONE state "forcibly" due to a timeout (as opposed to COMPLETED),
we need to change the Curl_done() call (around line 1656) to send a TRUE in
its third argument to signify getting called "premature" (i.e. before the full
and clean end of operation).
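
Put together as a minimal sketch (simplified state and function names, not the
real multi_runsingle() code): a timeout that fires while we are already in
DONE goes straight to COMPLETED, and a forced move to DONE passes TRUE for
"premature".

  /* Minimal sketch, not libcurl source: simplified states and a stand-in
     for Curl_done() to show the suggested timeout handling. */
  #include <stdbool.h>

  enum sketch_mstate { SK_PERFORM, SK_DONE, SK_COMPLETED };

  struct sketch_easy {
    enum sketch_mstate mstate;
  };

  /* stand-in for Curl_done(); 'premature' means the transfer ended
     before its full and clean end of operation */
  static void sketch_done(struct sketch_easy *e, bool premature)
  {
    (void)premature; /* the real code would pass this on to the protocol */
    e->mstate = SK_DONE;
  }

  static void sketch_handle_timeout(struct sketch_easy *e)
  {
    if(e->mstate == SK_DONE) {
      /* already in DONE: don't run the DONE work again, just finish */
      e->mstate = SK_COMPLETED;
    }
    else {
      /* forcibly end the transfer; TRUE signals a premature end */
      sketch_done(e, true);
    }
  }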

(Another interesting but related exercise that probably will cause us grief is
to call curl_multi_remove_handle() on an easy handle that is mid-pipeline.)

(From the follow-up mail)

> What about indeed not timing out when the state is already set to
> CURLM_STATE_DONE?

We need to. Some protocols can stay in that state for quite some time and they
must be able to time out from there as well.

-- 
  / daniel.haxx.se