curl-library

Re: Unclear docs on curl_multi_socket_action and CURLM_CALL_MULTI_SOCKET

From: Josef Drexler <josef_at_ttdpatch.net>
Date: Thu, 10 Jul 2008 02:07:57 +0200

On Jul 9, 2008, at 11:42 PM, Daniel Stenberg wrote:
> On Wed, 9 Jul 2008, Josef Drexler wrote:
>
>> I've been working on using the curl_multi_socket_action interface
>> for rtorrent, and I found the docs (which often still describe the
>> deprecated curl_multi_socket rather than curl_multi_socket_action)
>> a bit sparse on details.
>
> We appreciate all the help we can get in improving these things!
>
> Any particular references that caught your eye as bad or misleading?

Mainly the man page at
http://curl.haxx.se/libcurl/c/curl_multi_socket.html suggests calling
curl_multi_socket rather than curl_multi_socket_action in both the
description and the return value sections. A paragraph further down
then says curl_multi_socket is actually deprecated and should not be
used. In the timeout case it refers to there is no difference between
the two calls, but suggesting a deprecated routine is still a little
confusing until you understand why it's deprecated and why that makes
no difference in this particular case.

>> 1. CURLM_CALL_MULTI_SOCKET return value
>>
>> When this is returned, what should be the arguments for the next
>> call? The same socket? The same event mask? Socket and/or event
>> mask cleared?
>
> I think the most correct thing to do for us is to prevent that
> return code from ever being returned from curl_multi_socket_action().
>
> Assuming you're using a rather recent libcurl version, I don't
> think you need to call it again at all - but it would of course be
> good and useful to really verify this and document it. And remove
> the ability for the return code to happen.

I was trying not to require too new a version, and if possible to
degrade gracefully in the presence of bugs. So even though it
shouldn't return CURLM_CALL_MULTI_SOCKET, in the event that it does
the transfers should not stall. This is a bit harder with
edge-triggered event architectures, where you get no further
readiness notification unless you read or write enough data to make
the descriptor "not ready" again. In that case,
curl_multi_socket_action would have to be called again with the right
descriptor and mask to read/write the rest of the data.

But if CURLM_CALL_MULTI_SOCKET means not that libcurl still has
unread data, but rather that some other cleanup needs to be done,
then it wouldn't matter what the socket argument is.

> If you're using a somewhat older version, I think you should call
> it again with parameters you'd use for a timeout.

Right.

>> Apparently there are cases where CURLM_CALL_MULTI_SOCKET is
>> returned after the socket in question has been removed and closed.
>> In that case the arguments for the next call MUST be different or
>> it'll return with CURLM_BAD_SOCKET. But what should they be?
>
> That's a bug that I've fixed. I thought that was already done in
> 7.18.2?

Ah, it was probably an old version then. I've compiled it with
various versions to see how it worked, since insisting on the latest
version is not really feasible.

>> It seems libcurl does not cancel the timeouts it has specified via
>> the timeout callback.
>
> It doesn't, but that's mostly just because I found it easier to
> just ignore that rather than to keep track and cleanup. I figured
> an extra time-out call (when the timer expires) will be very cheap.
>
>> Can or should it be cancelled when all handles are finished, or
>> not? Or should it be cancelled when all sockets have been removed?
>> Or should it just be kept and have it call
>> curl_multi_socket_action even though there are no more handles?
>
> Without handles the timeout should be cancelled yes. It's really
> pointless to keep it ticking then since it can't do anything on
> expiry.

OK.

>> It seems libcurl sometimes closes the socket *before* generating
>> the CURL_POLL_REMOVE, and sometimes *after* it. Since both cases
>> need to be handled differently
>
> They do? What differences are those?

See below. Essentially you must make sure not to remove events from a
reopened FD of the same number, but you must still remove it from
your event loop if it wasn't reopened after all.

For select()-based backends, the descriptor must be removed from the
poll set whether it was closed or not. But for epoll et al. it can
only be removed if it wasn't closed already; if it was, the syscalls
will simply fail. And if it was closed and reopened, you'd clobber a
different FD that now has the same number.

>> I need to waste a stat() call or similar to find out if the
>> descriptor is valid. For instance, when the socket is already
>> closed, the syscalls to remove it from the poll interface will
>> fail. This also opens a race condition where other code (or
>> another thread) might open() some resource and receive the same FD
>> as the now-closed socket, without a way of telling that the events
>> now set on that FD in the kernel should NOT be removed.
>
> That sounds like an attempt at an optimization of yours, not a
> necessary difference. I don't really see how libcurl encourages a
> race condition to occur by its behavior.

Perhaps it doesn't encourage it as such, but it makes it necessary to
handle the case where an FD might refer to a new resource without
having received notification of the old resource having been closed
and reopened. To detect this in a multi-threaded environment, all
threads with event loops would need to coordinate the FDs added to
their event loops.

It might be a bit contrived, but if thread A closes descriptor 8 and
doesn't yet notify its event loop, and in that time thread B opens a
new connection, it's not too unlikely that it might get descriptor 8
for it. Then unless the threads coordinate their FDs, thread A would
not know to remove its event watcher for FD 8, or that events it
receives for it until then are bogus.

This isn't much of a problem with architectures like epoll or kqueue
where the kernel removes the events when closing the FD, but for
select() or poll() based backends (where nothing better is
available), both threads might randomly receive read/write readiness
for the FD.

That's why closing an FD before letting the event loop know of that
fact should be avoided if at all possible. In libcurl's case that
means it's probably safe so long as it generates a CURL_POLL_REMOVE
notification for all closed sockets before returning from
curl_multi_socket_action.

But maybe I'm just a little TOO obsessed about code correctness...

>> It would be a lot better if libcurl never closed sockets before
>> generating a CURL_POLL_REMOVE.
>
> Yes I agree it would be better with a more consistent behavior (and
> it would also be easier to document in a more clear way), but that
> would require further internal changes and considerations to make
> sure that would be the case all over. For example we currently
> don't use the socket callbacks from c-ares (that I _believe_ would
> enable CURL_POLL_REMOVE before close) but instead poll it for its
> status and thus we don't know about it closing its handle(s) until
> after the fact.

Alright. If it's not simply a bug, then we'll just have to handle
this case.

-- 
Josef Drexler
josef_at_ttdpatch.net
Received on 2008-07-10