
curl-library

Re: Mysterious random crash demystified.

From: Traian Nicolescu <traian_at_burstcopy.com>
Date: Tue, 12 Oct 2004 13:41:04 +0300 (EEST)

>> "If the target thread is executing certain kernel32 calls when it is
>> terminated, the kernel32 state for the thread's process could be
>> inconsistent."
>
> I haven't heard any such worries/problems about the unix code that uses
> alarm() and *longjmp(). Couldn't the state there also get compromised?

For starters, may I just note that I consider it wrong to make assumptions
about the safety of functions that are documented as unsafe. The problem is
not only in making those assumptions, but in implicitly forcing your whole
user base to depend on them as well.

> Since gethostbyname/getaddrinfo are blocking, how do you mean
> they should be synchronised? If not by forcefully terminating the thread
> one must wait 10-15 sec in curl_easy_cleanup. Or longer if NetBIOS is
> involved. IMHO not very nice.

This is naturally not what I am suggesting, as it would kill the whole
purpose of having a separate thread. I have termed it a small synch task,
because it is what one typically faces when synchronizing threads.

> One solution would be for the thread to detect if the 'connectdata'
> structure has been deleted. But that would probably need waiting on a
> critical-section or a mutex used during cleanup (Curl_disconnect).

Well, I typically use QueueUserAPC() for such purposes. Instead of
directly running the callback, it attaches a request to run the callback
to the thread's queue. When the thread enters an alertable state, the
asynchronous callback we attached is automatically run. Alertable states
are entered through WaitForSingleObjectEx() calls with the alertable flag
set to true. The drawback to this approach, for libcurl at least, is that
it would no longer run on NT 3.x systems, since they don't have
QueueUserAPC().

If you decide that NT 3.x systems are not already phased out, antiquated,
obsolete, deprecated, and altogether dead, then I suggest using the
following mutex workaround.

In the main thread:
- create mutex with random name
  (e.g. "libcurl.{9DD4790E-01FF-4121-9B31-B8170FA2CD6E}")
- start resolver thread
- wait for the thread to exit (timeout 5 sec)
- if thread exited, all is okay
- if thread timed out:
  + close mutex handle
  + if the callback is currently running (add a flag for it), wait until
    it stops
  + proceed to exit as failure

In the resolver thread:
- store mutex name
- attempt name resolve ( 10 seconds, maybe )
- try to open a handle to the mutex
- if succeeded:
  + run callback
  + close mutex handle
- if failed:
  + directly exit

This method doesn't look 100% safe to me, since the thread may open the
mutex handle _just_ before we close it, and start the callback _just_
after we check if it is running, but it is a big improvement compared to
the way it is currently done. Adding a hardcoded Sleep( 5 ) before
checking if the callback is running should prevent the above scenario from
happening as well.

Best Regards,
Traian Nicolescu
Received on 2004-10-12