curl-and-python

bug in Curl.reset()?

From: <johansen_at_sun.com>
Date: Wed, 4 Mar 2009 18:48:44 -0800

I seem to have run afoul of a problem lurking within the pycurl
easy handle's reset function. If I reset an easy handle and then
attempt to use it again, the code fails in weird ways. In the
particular case that I came across, calling multi.info_read() on a
handle that has been reset leads to the following stack trace:

$ ./reset.py http://www.google.com
Recd 5665 bytes from http://www.google.com

Traceback (most recent call last):
  File "./reset.py", line 65, in ?
    main()
  File "./reset.py", line 37, in main
    count, good, bad = cm.info_read()
pycurl.error: (0, 'Unable to fetch curl handle from curl object')

I'm attaching the test case. It illustrates the problem with using
reset(). If you comment out the line that resets the easy handle, the
program runs in a loop. Otherwise, it dies with the above stack trace
after the handle has been reset.

The problem seems to be that when a handle is reset, it isn't
re-configured the way a new handle is by do_curl_new() in pycurl.c.

do_curl_reset() looks like this:

        static PyObject*
        do_curl_reset(CurlObject *self)
        {
            unsigned int i;
         
            curl_easy_reset(self->handle);
         
            /* Decref callbacks and file handles */
            util_curl_xdecref(self, 4 | 8, self->handle);
         
            /* Free all variables allocated by setopt */
        #undef SFREE
        #define SFREE(v) if ((v) != NULL) (curl_formfree(v), (v) = NULL)
            SFREE(self->httppost);
        #undef SFREE
        #define SFREE(v) if ((v) != NULL) (curl_slist_free_all(v), (v) = NULL)
            SFREE(self->httpheader);
            SFREE(self->http200aliases);
            SFREE(self->quote);
            SFREE(self->postquote);
            SFREE(self->prequote);
        #undef SFREE
         
            /* Last, free the options */
            for (i = 0; i < OPTIONS_SIZE; i++) {
                if (self->options[i] != NULL) {
                    free(self->options[i]);
                    self->options[i] = NULL;
                }
            }
         
            return Py_None;
        }

It calls curl_easy_reset(), which resets the options configured on the
easy handle. It then invokes util_curl_xdecref() with flags 4 | 8, which
clears up some internal state maintained by pycurl. Unfortunately, the
handle never gets re-configured.

do_multi_info_read() has a portion that looks like this:

        /* Fetch the curl object that corresponds to the curl handle in the mess
        res = curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE, &co);
        if (res != CURLE_OK || co == NULL) {
            Py_DECREF(err_list);
            Py_DECREF(ok_list);
            CURLERROR_MSG("Unable to fetch curl handle from curl object");
        }

Here, if CURLINFO_PRIVATE can't be found, we die with the "Unable to
fetch curl handle from curl object" message that was seen in the stack
trace. After the call to curl_easy_reset() the value for
CURLOPT_PRIVATE has been erased. The only routine that sets this value
is do_curl_new():

    [pycurl.c: line 777]

    /* Set backreference */
    res = curl_easy_setopt(self->handle, CURLOPT_PRIVATE, (char *) self);
    if (res != CURLE_OK)
        goto error;
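
To make the failure mode concrete, here is a toy model in plain Python.
It does not use pycurl: ToyEasy, ToyMulti and the "PRIVATE" key are
hypothetical stand-ins for the easy handle, the multi handle and
CURLOPT_PRIVATE. It only assumes what is described above, namely that a
reset wipes every option, including the back-reference.

```python
class ToyEasy:
    """Hypothetical stand-in for a pycurl easy handle (not the real API)."""

    def __init__(self):
        # Models do_curl_new(): store a back-reference to the wrapper
        # object, the way CURLOPT_PRIVATE is set on the real handle.
        self.options = {"PRIVATE": self}

    def reset(self):
        # Models do_curl_reset() as quoted above: every option is wiped
        # and nothing puts the back-reference back.
        self.options.clear()


class ToyMulti:
    """Hypothetical stand-in for the multi handle."""

    def __init__(self):
        self.finished = []

    def info_read(self):
        ok_list = []
        for easy in self.finished:
            # Models the CURLINFO_PRIVATE lookup in do_multi_info_read().
            owner = easy.options.get("PRIVATE")
            if owner is None:
                raise RuntimeError(
                    "Unable to fetch curl handle from curl object")
            ok_list.append(owner)
        return ok_list


def demo():
    easy = ToyEasy()
    multi = ToyMulti()
    multi.finished.append(easy)
    first = multi.info_read()       # works: back-reference is present
    easy.reset()
    try:
        multi.info_read()           # fails: back-reference was wiped
    except RuntimeError as exc:
        return first == [easy], str(exc)
    return first == [easy], None
```

In this model the first info_read() succeeds and the one after reset()
raises with the same message as in the stack trace.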

It looks to me as though the code between lines 771 and 815 should be
extracted into a separate routine that is invoked from both
do_curl_new() and do_curl_reset() once they have succeeded.
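
The shape of that refactoring can be sketched as a toy model in plain
Python; this is not a patch against pycurl.c. The names ToyCurl,
_configure and lookup_owner are hypothetical, and _configure stands in
for the routine that would be extracted from do_curl_new(). Only the
back-reference is modelled.

```python
class ToyCurl:
    """Hypothetical model of the wrapper object; not the pycurl API."""

    def __init__(self):
        self.options = {}
        self._configure()

    def _configure(self):
        # Stand-in for the setup extracted from do_curl_new(); only the
        # back-reference is modelled here.
        self.options["PRIVATE"] = self

    def reset(self):
        # Wipe everything, as curl_easy_reset() does ...
        self.options.clear()
        # ... then reapply the same setup a new handle gets.
        self._configure()


def lookup_owner(handle):
    # Models the CURLINFO_PRIVATE lookup done by info_read().
    return handle.options.get("PRIVATE")
```

With this structure, user-set options are still cleared by reset(), but
lookup_owner(c) keeps returning c afterwards because new and reset
handles go through the same setup path.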

I'm attaching the test case as reset.py. Any thoughts on this from the
pycurl team?

-j

_______________________________________________
http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-python

Received on 2009-03-05