curl-library

RE: Crash during gethostbyname in libcurl

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Mon, 19 Mar 2007 09:56:22 +0100 (CET)

On Sat, 17 Mar 2007, Cyril Picat wrote:

> We have further investigated the crash related to name resolution and for us
> it is because there is a threading problem with the use of a shared DNS
> cache (data->hostcache) in the multi interface. Indeed, even if the multi
> interface is only accessed by one thread, its DNS cache could be accessed at
> the same time by several.

By several... what?

And just to reiterate my previous suggestion: if you switch to the standard
synchronous resolver, do the problems go away? You can even try the c-ares
resolver if you want asynchronous name resolution without the threaded one.

> There are two cases where it can happen (imagine we have two easy handles in
> a multi interface) :

> 1. the thread of the multi interface is processing Curl_disconnect() on its
> first easy handle because the handle has finished its transfer and
> connection could not be reused (because of the server or with Curl option)
> and at the same time the callback from the host resolver thread for the
> second easy handle is called and tries to insert a new hash entry in the DNS
> cache (indeed the two easy handle DNS cache pointers are pointing to the DNS
> cache of their multi interface)

Then we need to add some kind of locking to prevent this from happening...

> 2. the thread of the multi interface is processing
> curl_multi_remove_handle() on an easy handle because the application has
> decided to remove this handle (for example we decided to abort the transfer)
> even before the name resolution succeeds. In this case, the data->hostcache
> pointer is set to 0 in the easy handle structure and thus the callback would
> call Curl_hash_add() with h=0, which would crash in FETCH_LIST().

A similar case then. As long as the threaded resolver is running and is in
progress we must not remove the foundation it builds on!
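
One way to picture "not removing the foundation" is to let each in-flight
resolve hold its own reference to the cache, so that removing the easy handle
cannot pull the cache away from under the callback. The sketch below is only
an illustration, not libcurl code: struct hostcache, struct resolve_req,
cache_ref(), cache_unref(), resolve_start() and resolve_done() are all
hypothetical names, and the plain int counter assumes the count itself is
protected by a lock (or made atomic) when real threads are involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared DNS cache. */
struct hostcache {
  int refs;     /* owners: the multi handle plus each in-flight resolve */
  int entries;  /* number of cached results, kept trivial on purpose */
};

/* Hypothetical per-lookup state owned by the resolver. */
struct resolve_req {
  struct hostcache *cache; /* own reference, independent of the easy handle */
};

struct hostcache *cache_new(void)
{
  struct hostcache *c = calloc(1, sizeof(*c));
  if(c)
    c->refs = 1; /* the creator (the multi handle) is the first owner */
  return c;
}

struct hostcache *cache_ref(struct hostcache *c)
{
  c->refs++; /* NOTE: needs a lock or an atomic in threaded code */
  return c;
}

/* Drop one reference; returns 1 if this call freed the cache. */
int cache_unref(struct hostcache *c)
{
  if(--c->refs == 0) {
    free(c);
    return 1;
  }
  return 0;
}

/* Called when a lookup is started: pin the cache for its duration. */
void resolve_start(struct resolve_req *r, struct hostcache *c)
{
  r->cache = cache_ref(c);
}

/* Called from the resolver callback: store the result, then unpin.
   Returns 1 if the callback was the last owner and freed the cache. */
int resolve_done(struct resolve_req *r)
{
  int freed;
  assert(r->cache != NULL); /* cannot be NULL: the request owns a ref */
  r->cache->entries++;
  freed = cache_unref(r->cache);
  r->cache = NULL;
  return freed;
}
```

With this shape, the owner dropping its reference while a lookup is pending
does not free anything; the cache goes away only when the pending callback has
finished with it.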

> We have reproduced case 2. quite easily in a test case but unfortunately
> case 1. is much more difficult to reproduce (it happened once in 6 months).
>
> Thus a solution for problem 1. could be to protect the access to the shared
> DNS cache and for problem 2. the value of h could be tested either in
> Curl_cache_addr() or Curl_hash_add().

First, I must say that this is a Windows-specific problem and for *really*
sensible code I would recommend using a non-threaded resolver. For multi
interface purposes, and especially when using many easy handles, the threaded
resolver will add quite a few threads and most probably won't be as efficient
as using c-ares in the first place.

It being Windows-specific also means that I can neither test it nor claim to
know much about the Windows magic involved, and that I don't want to pollute
the generic code too much because of peculiarities in this part.

The generic approach for protecting a resource that is or might be accessed by
more than one thread simultaneously is of course to introduce some kind of
locking/mutex scheme that prevents the resource from being removed as long as
it is in use by another thread.
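
As a rough illustration of such a scheme, here is a minimal sketch of a small
name cache whose every access goes through one mutex. It assumes POSIX
threads for brevity (a Windows build would use its own primitives, such as a
critical section), and struct dns_cache, cache_init(), cache_add() and
cache_lookup() are hypothetical names that do not exist in libcurl.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 8
#define HOST_LEN 64

/* Hypothetical fixed-size cache; the mutex guards every field below it. */
struct dns_cache {
  pthread_mutex_t lock;
  size_t used;
  char host[CACHE_SLOTS][HOST_LEN];
  char addr[CACHE_SLOTS][HOST_LEN];
};

void cache_init(struct dns_cache *c)
{
  pthread_mutex_init(&c->lock, NULL);
  c->used = 0;
}

/* Insert an entry. Returns 0 on success, -1 if the cache pointer is NULL
   or the cache is full. The NULL test mirrors the "h=0" case above. */
int cache_add(struct dns_cache *c, const char *host, const char *addr)
{
  int rc = -1;
  if(!c)
    return -1;
  pthread_mutex_lock(&c->lock);
  if(c->used < CACHE_SLOTS) {
    snprintf(c->host[c->used], HOST_LEN, "%s", host);
    snprintf(c->addr[c->used], HOST_LEN, "%s", addr);
    c->used++;
    rc = 0;
  }
  pthread_mutex_unlock(&c->lock);
  return rc;
}

/* Copy the cached address for host into out. Returns 1 if found, else 0.
   The copy is made while the lock is held, so the caller never keeps a
   pointer into storage that another thread could change. */
int cache_lookup(struct dns_cache *c, const char *host,
                 char *out, size_t outlen)
{
  int found = 0;
  size_t i;
  if(!c)
    return 0;
  pthread_mutex_lock(&c->lock);
  for(i = 0; i < c->used; i++) {
    if(strcmp(c->host[i], host) == 0) {
      snprintf(out, outlen, "%s", c->addr[i]);
      found = 1;
      break;
    }
  }
  pthread_mutex_unlock(&c->lock);
  return found;
}
```

Both the thread driving the multi interface and a resolver callback would go
through these functions, so an insert and a lookup or removal never overlap.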

> Do you agree with this analysis or is there a flaw in our way of thinking ?
> Which solution do you recommend ?

The analysis seems likely to be correct. As for the solution, can you please
post a patch with your suggestion?

> Can we use the multi and the share at the same time (ie adding an easy
> handle with option CURLOPT_SHARE in a multi interface) ?

Yes, easy handles can be set to share things using the share interface even
when added to a multi handle. Some sharing is done by default between easy
handles added to the multi handle (dns cache, connection cache etc), but you
can override that.

-- 
  Commercial curl and libcurl Technical Support: http://haxx.se/curl.html
Received on 2007-03-19