Re: debugging a crash in Curl_pgrsTime/checkPendPipeline?
Date: Wed, 29 Jul 2009 10:35:05 -0700
On Sun, Jul 26, 2009 at 12:52:12AM +0200, Daniel Stenberg wrote:
>> One handle, let's call it A, transitions from CURLM_STATE_PERFORM to
>> CURLM_STATE_DONE in multi_runsingle(). As part of the transition, the
>> code in CURLM_STATE_PERFORM removes the session handle for A from the
>> recv pipeline. The function returns and we go on to process another
>> handle. Some time later, handle F encounters a send error that requires
>> it to set bits.closed to True. It then calls Curl_done() on the
>> connection. Curl_done() calls Curl_disconnect(), which calls
>> signalPipeClose() on the send, recv, and pend pipelines. Part of the
>> PipeClose is a call to Curl_multi_handlePipeBreak(), which sets the
>> easy_conn in Curl_one_easy's attached to the pipeline to NULL. The
>> connection is then free'd.
>> When handle A gets run in the CURLM_STATE_DONE state, it's easy_conn
>> has been free'd by this send error, but because its SessionHandle was
>> removed from the pipeline, handlePipeBreak() didn't set handle A's
>> easy_conn to NULL. This is what causes us to access free'd memory and
> Shouldn't A perhaps immediately forget what connection it was part of
> when it reaches CURLM_STATE_DONE and another SessionHandle "takes over"
> the connection?
There's a fair amount of code in the handling of CURLM_STATE_DONE that
assumes we still have a valid struct connectdata assigned to the
easy_conn member of our Curl_one_easy. In the general case, the code
doesn't set easy_conn to NULL until it reaches CURLM_STATE_COMPLETED.
>> In terms of solving the problem itself, what do you think about
>> introducing a 4th list called the done_pipeline?
> I'm a bit reluctant to add another list since I think we should be able
> to make the existing logic work with the existing lists. I get the
> feeling adding another list is just a way to escape that fixing and
> instead try to engineer another way to solve the same problem.
I have prototyped a fix with a 4th list that seems to work. I'm open to
alternative suggestions, but I'll need details. I've been able to
identify what's causing the bug, but I don't have any knowledge about
how past design decisions affected the current state of the code base.
>> This is where we can put requests that are done, but that haven't been
>> processed in the CURLM_STATE_DONE state yet.
> Uh, how would it end in such a limbo? I mean when the request is done,
> doesn't it switch to CURLM_STATE_DONE at once?
This would be for requests making the transition from CURLM_STATE_PERFORM
to CURLM_STATE_DONE. The call to multi_runsingle only processes one
state transition per invocation. So, if a handle gets switched to the
DONE state, we have to call multi_runsingle() to actually perform the
DONE state's processing. This is where we seem to hit the bug. The
request is in the DONE state but we haven't invoked the code to perform
the processing for that state yet. Because it has a easy_conn but isn't
on a pipeline list, the connectdata gets free'd when another connection
gets an error. Since the handle isn't on list, we don't set the
easy_conn member to NULL, which is why we access free'd data.
Received on 2009-07-29