curl-and-python

Re: [BUG] setopt / POSTFIELDS / object reference

From: Dima Tisnek <dimaqq_at_gmail.com>
Date: Mon, 18 Nov 2013 17:06:35 +0100

yes, it does look like a valid issue.
I'd rate it as small/medium impact as usually user code looks like this:

def post(url, data, etc...):
  c = pycurl.Curl()
  c.setopt(pycurl.POSTFIELDS, data) # ref to data is kept
  c.perform()
  return xx

def user_code():
  xx = post("http:...", some_data or urllib.urlencode(...), ...)

assuing isinstance(data, str), data is immutable and won't change
during the call; a reference is kept in post() frame, thus data is
safe.

if the use case is markedly different, then yes, garbage could be sent.

d.

On 18 November 2013 16:27, Nicolas Collignon
<nicolas.collignon_at_synacktiv.com> wrote:
> Hi,
>
> I've identified a bug in PyCURL library concerning "setopt +
> pycurl.POSTFIELDS".
>
> client.setopt(pycurl.POSTFIELDS, data)
>
> The bug may be classified as a "use-after-free". The "data" pointer (post
> fields data) is passed as-is to the libcurl C API without increasing the
> associated Python object reference.
>
>
> file pycurl.c, line 1704:
> -----------------------------------------
> case CURLOPT_POSTFIELDS:
> if (PyString_AsStringAndSize(obj, &str, &len) != 0)
> return NULL;
> /* automatically set POSTFIELDSIZE */
> if (len <= INT_MAX) {
> res = curl_easy_setopt(self->handle, CURLOPT_POSTFIELDSIZE,
> (long)len);
> } else {
> res = curl_easy_setopt(self->handle,
> CURLOPT_POSTFIELDSIZE_LARGE, (curl_off_t)len);
> }
> [...]
> /* Call setopt */
> res = curl_easy_setopt(self->handle, (CURLoption)option, str);
> -----------------------------------------
>
> Therefore as soon as the "data" object is dereferenced by the Python code,
> its memory allocation may be reused by any Python code. It results in
> garbage being sent to the server.
> The bug is quite easy to reproduce with the provided python script.
>
> I confirmed that by increasing the python object reference count (variable
> "obj"), the problem is solved however it leads to a memory leak since there
> is no way of tracking the python object usage without adding a curl-specific
> garbage collector.
>
> The problem may be solved with 3 different approaches:
>
> 1) keep track of the data object and unref the object when the curl object
> is destructed or unset (which seems like a bad idea since the POST size is
> set at the same time as the POST data. the POST size may change if the
> application changes the object content).
>
> 2) copy the data object and free it when the curl object is destructed or
> unset (require adding a new field to PyCURL object).
>
> 3) ask libcurl C API to copy the data by converting POSTFIELDS to
> COPYPOSTFIELDS at the Python API level (however the COPYPOSTFIELDS option
> has been merged in libcurl 7.18.1).
>
> By the way, this issue is solved by using approaches 2 + 3 in php-curl.
>
> So i have 2 questions:
> - Do you agree with my analysis ?
> - Do you want me to "pull-request" the code that perform POSTFIELDS ->
> COPYPOSTFIELDS conversion for libcurl >= 7.18.1 and keep the bug for libcurl
> < 7.18.1 ?
>
> -- Nicolas Collignon
>
>
> _______________________________________________
> http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-python
>
_______________________________________________
http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-python
Received on 2013-11-18