Crash with CURLOPT_READFUNCTION + CURLAUTH_ANYSAFE #346

Closed
paulharris opened this issue Jul 15, 2015 · 25 comments
@paulharris

The good news is I narrowed it down.

The "ANYSAFE" auth method can cause a form to be reposted from scratch.

In this situation, the user-provided readfunction callback is switched to the standard fread callback, and then called with the user-provided readdata pointer, which the fread callback is incorrectly casting and writing to.

Valgrind also has a heap of warnings.

First, the gdb output showing where fread_func is reset and then called.

Note: readcallback is my user callback function.
Curl_FormReader is the built-in callback function.
Curl_FormReader casts my user callback data, 0x121, to a struct Form * and then tries to access it.
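
The mismatch described here can be illustrated without libcurl. What follows is a minimal sketch under stated assumptions: `struct form_state`, `struct reader`, `form_reader`, `user_reader` and `pair_is_consistent` are hypothetical names invented for illustration, not libcurl's. It shows that a read callback is only safe to call with the context pointer it was set up for, so replacing one half of the pair leaves a callback that will cast a foreign pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a library's internal form state. */
struct form_state {
  const char *data;   /* bytes still to send */
  size_t left;        /* how many remain */
};

typedef size_t (*read_cb)(char *buf, size_t size, size_t nitems, void *ctx);

/* A callback and the context it expects travel as a pair. */
struct reader {
  read_cb func;
  void *ctx;
};

/* Internal reader: assumes ctx really is a struct form_state. */
static size_t form_reader(char *buf, size_t size, size_t nitems, void *ctx)
{
  struct form_state *form = ctx;
  size_t want = size * nitems;
  size_t n = form->left < want ? form->left : want;
  memcpy(buf, form->data, n);
  form->data += n;
  form->left -= n;
  return n;
}

/* User reader: treats ctx as opaque and sends nothing. */
static size_t user_reader(char *buf, size_t size, size_t nitems, void *ctx)
{
  (void)buf; (void)size; (void)nitems; (void)ctx;
  return 0;
}

/* Returns 1 when the pair is safe to call: form_reader may only be
   combined with the form state it was set up for. */
static int pair_is_consistent(const struct reader *r,
                              const struct form_state *form)
{
  assert(r != NULL);
  if(r->func == form_reader)
    return r->ctx == (const void *)form;
  return 1;
}
```

A pair made of `form_reader` and the user's `(void *)0x121` is the inconsistent combination: the function would treat 0x121 as a `struct form_state *` and dereference it.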

Then, the code.
It has lots of odd variables and padding; I used them to force a segfault.
Your compiler may not produce a crash in the same way, so I'm attaching a lot of output.

Then, GDB and Valgrind output

Breakpoint 6, Curl_http (conn=0x615e20, done=0x7fffffffdeb3) at /project/curl/lib/http.c:2454
2454        http->form.fread_func = data->set.fread_func;
(gdb) p data->set.fread_func
$15 = (curl_read_callback) 0x40112c <readcallback(char*, size_t, size_t, void*)>
(gdb) c
Continuing.

Breakpoint 6, Curl_http (conn=0x615e20, done=0x7fffffffdeb3) at /project/curl/lib/http.c:2454
2454        http->form.fread_func = data->set.fread_func;
(gdb) p data->set.fread_func
$16 = (curl_read_callback) 0x7ffff78460fe <Curl_FormReader>
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7846138 in Curl_FormReader (
    buffer=0x608689 '-' <repeats 26 times>, "fdf41c87d3c77191\r\nContent-Disposition: form-data; name=\"file\"; filename=\"filename\"\r\n\r\n", size=1, nitems=16384, mydata=0x121)
    at /project/curl/lib/formdata.c:1453
1453      if(!form->data)
(gdb) p form
$17 = (struct Form *) 0x121
(gdb) 

The code:

#include <curl/curl.h>
#include <string>

using std::string;

const char*  BASE_URL2 = "http://mysite?action=put";
const string BASE_URL = "http://mysite?action=put";
const string SOMETHING = "1";

static size_t readcallback(char *buffer, size_t size, size_t nitems, void *instream)
{
   return 0;
}

struct SomeStruct
{
   SomeStruct( string const& url )
   {
   }
};


int main(int argc, char **argv)
{
   struct curl_httppost *formpost = NULL;
   struct curl_httppost *lastptr = NULL;
   CURL * handle = NULL;
   curl_slist * headerlist = NULL;
   const char* fn = "filename";
   char space2[17];   // 17 is the sweet spot
   char space[8+8+8];


   SomeStruct curl(BASE_URL2);
   curl_global_init(CURL_GLOBAL_ALL);

      handle = (curl_easy_init());
      headerlist = (curl_slist_append(NULL,"Expect:"));

      // we don't want "Expect: 100-continue" -- it screws up proxy
      curl_easy_setopt(handle, CURLOPT_HTTPHEADER, headerlist);

      curl_easy_setopt(handle, CURLOPT_URL, BASE_URL2);

      curl_easy_setopt(handle, CURLOPT_READFUNCTION, readcallback);

      // authentication (at the server end)
      curl_easy_setopt(handle, CURLOPT_HTTPAUTH, CURLAUTH_ANYSAFE);
      curl_easy_setopt(handle, CURLOPT_USERNAME, "");
      curl_easy_setopt(handle, CURLOPT_PASSWORD, "");


   curl_formadd( &formpost, &lastptr,
         CURLFORM_COPYNAME, "file",

         CURLFORM_STREAM, (void*)0x121,
         CURLFORM_CONTENTSLENGTH, 0,//reader_data.size,
         CURLFORM_FILENAME, fn,

         CURLFORM_END
         );

   curl_easy_setopt(handle, CURLOPT_HTTPPOST, formpost);

   curl_easy_perform(handle);

   // should die by now

   return 0;
}

The GDB output:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7846138 in Curl_FormReader (
    buffer=0x608689 '-' <repeats 26 times>, "4b8b37d30d5acaa0\r\nContent-Disposition: form-data; name=\"file\"; filename=\"filename\"\r\n\r\n", size=1, nitems=16384, mydata=0x121)
    at /project/curl/lib/formdata.c:1453
1453      if(!form->data)
(gdb) bt
#0  0x00007ffff7846138 in Curl_FormReader (
    buffer=0x608689 '-' <repeats 26 times>, "4b8b37d30d5acaa0\r\nContent-Disposition: form-data; name=\"file\"; filename=\"filename\"\r\n\r\n", size=1, nitems=16384, mydata=0x121)
    at /project/curl/lib/formdata.c:1453
#1  0x00007ffff784604b in readfromfile (form=0x616838, 
    buffer=0x608689 '-' <repeats 26 times>, "4b8b37d30d5acaa0\r\nContent-Disposition: form-data; name=\"file\"; filename=\"filename\"\r\n\r\n", size=16384) at /project/curl/lib/formdata.c:1413
#2  0x00007ffff784617f in Curl_FormReader (
    buffer=0x608689 '-' <repeats 26 times>, "4b8b37d30d5acaa0\r\nContent-Disposition: form-data; name=\"file\"; filename=\"filename\"\r\n\r\n", size=1, nitems=16384, mydata=0x616838)
    at /project/curl/lib/formdata.c:1458
#3  0x00007ffff786ed6d in Curl_fillreadbuffer (conn=0x615e20, bytes=16384, nreadp=0x7fffffffdda4)
    at /project/curl/lib/transfer.c:118
#4  0x00007ffff7870168 in readwrite_upload (data=0x603d80, conn=0x615e20, k=0x603df8, didwhat=0x7fffffffde2c)
    at /project/curl/lib/transfer.c:869
#5  0x00007ffff78706f7 in Curl_readwrite (conn=0x615e20, data=0x603d80, done=0x7fffffffdeb2)
    at /project/curl/lib/transfer.c:1071
#6  0x00007ffff787be6d in multi_runsingle (multi=0x60ccc0, now=..., data=0x603d80)
    at /project/curl/lib/multi.c:1531
#7  0x00007ffff787c637 in curl_multi_perform (multi_handle=0x60ccc0, running_handles=0x7fffffffdfec)
    at /project/curl/lib/multi.c:1808
#8  0x00007ffff787226e in easy_transfer (multi=0x60ccc0) at /project/curl/lib/easy.c:715
#9  0x00007ffff787240b in easy_perform (data=0x603d80, events=false)
    at /project/curl/lib/easy.c:803
#10 0x00007ffff7872443 in curl_easy_perform (easy=0x603d80) at /project/curl/lib/easy.c:822
#11 0x0000000000401320 in main (argc=1, argv=0x7fffffffe258)
    at /project/test/upload_file.cpp:65
(gdb) 

The valgrind output:

==21353== Memcheck, a memory error detector
==21353== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==21353== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==21353== Command: ./bin/upload_file-d
==21353== 
==21353== Conditional jump or move depends on uninitialised value(s)
==21353==    at 0x5160AE5: FormAdd (formdata.c:605)
==21353==    by 0x5160FFB: curl_formadd (formdata.c:734)
==21353==    by 0x4012F9: main (upload_file.cpp:61)
==21353== 
==21353== Conditional jump or move depends on uninitialised value(s)
==21353==    at 0x5185AB0: dprintf_formatf (mprintf.c:702)
==21353==    by 0x5186AC8: curl_mvaprintf (mprintf.c:1070)
==21353==    by 0x5165F4B: Curl_add_bufferf (http.c:1228)
==21353==    by 0x51689D6: Curl_http (http.c:2465)
==21353==    by 0x5182783: Curl_do (url.c:6186)
==21353==    by 0x519772E: multi_runsingle (multi.c:1305)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353== 
==21353== Conditional jump or move depends on uninitialised value(s)
==21353==    at 0x5185B64: dprintf_formatf (mprintf.c:728)
==21353==    by 0x5186AC8: curl_mvaprintf (mprintf.c:1070)
==21353==    by 0x5165F4B: Curl_add_bufferf (http.c:1228)
==21353==    by 0x51689D6: Curl_http (http.c:2465)
==21353==    by 0x5182783: Curl_do (url.c:6186)
==21353==    by 0x519772E: multi_runsingle (multi.c:1305)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353== 
==21353== Use of uninitialised value of size 8
==21353==    at 0x5185B32: dprintf_formatf (mprintf.c:729)
==21353==    by 0x5186AC8: curl_mvaprintf (mprintf.c:1070)
==21353==    by 0x5165F4B: Curl_add_bufferf (http.c:1228)
==21353==    by 0x51689D6: Curl_http (http.c:2465)
==21353==    by 0x5182783: Curl_do (url.c:6186)
==21353==    by 0x519772E: multi_runsingle (multi.c:1305)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353== 
==21353== Conditional jump or move depends on uninitialised value(s)
==21353==    at 0x5185BFF: dprintf_formatf (mprintf.c:749)
==21353==    by 0x5186AC8: curl_mvaprintf (mprintf.c:1070)
==21353==    by 0x5165F4B: Curl_add_bufferf (http.c:1228)
==21353==    by 0x51689D6: Curl_http (http.c:2465)
==21353==    by 0x5182783: Curl_do (url.c:6186)
==21353==    by 0x519772E: multi_runsingle (multi.c:1305)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353== 
==21353== Conditional jump or move depends on uninitialised value(s)
==21353==    at 0x5185C8E: dprintf_formatf (mprintf.c:756)
==21353==    by 0x5186AC8: curl_mvaprintf (mprintf.c:1070)
==21353==    by 0x5165F4B: Curl_add_bufferf (http.c:1228)
==21353==    by 0x51689D6: Curl_http (http.c:2465)
==21353==    by 0x5182783: Curl_do (url.c:6186)
==21353==    by 0x519772E: multi_runsingle (multi.c:1305)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353== 
==21353== Conditional jump or move depends on uninitialised value(s)
==21353==    at 0x515EDB2: Curl_pgrsSetUploadSize (progress.c:247)
==21353==    by 0x5168AF0: Curl_http (http.c:2500)
==21353==    by 0x5182783: Curl_do (url.c:6186)
==21353==    by 0x519772E: multi_runsingle (multi.c:1305)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353== 
==21353== Conditional jump or move depends on uninitialised value(s)
==21353==    at 0x5164FE1: http_perhapsrewind (http.c:452)
==21353==    by 0x5165234: Curl_http_auth_act (http.c:547)
==21353==    by 0x5169F6C: Curl_http_readwrite_headers (http.c:3140)
==21353==    by 0x518B51A: readwrite_data (transfer.c:480)
==21353==    by 0x518C6A7: Curl_readwrite (transfer.c:1062)
==21353==    by 0x5197E6C: multi_runsingle (multi.c:1531)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353== 
==21353== Conditional jump or move depends on uninitialised value(s)
==21353==    at 0x5164FEB: http_perhapsrewind (http.c:452)
==21353==    by 0x5165234: Curl_http_auth_act (http.c:547)
==21353==    by 0x5169F6C: Curl_http_readwrite_headers (http.c:3140)
==21353==    by 0x518B51A: readwrite_data (transfer.c:480)
==21353==    by 0x518C6A7: Curl_readwrite (transfer.c:1062)
==21353==    by 0x5197E6C: multi_runsingle (multi.c:1531)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353== 
==21353== Invalid read of size 8
==21353==    at 0x5162138: Curl_FormReader (formdata.c:1453)
==21353==    by 0x516204A: readfromfile (formdata.c:1413)
==21353==    by 0x516217E: Curl_FormReader (formdata.c:1458)
==21353==    by 0x518AD6C: Curl_fillreadbuffer (transfer.c:118)
==21353==    by 0x518C167: readwrite_upload (transfer.c:869)
==21353==    by 0x518C6F6: Curl_readwrite (transfer.c:1071)
==21353==    by 0x5197E6C: multi_runsingle (multi.c:1531)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353==  Address 0x121 is not stack'd, malloc'd or (recently) free'd
==21353== 
==21353== 
==21353== Process terminating with default action of signal 11 (SIGSEGV)
==21353==  Access not within mapped region at address 0x121
==21353==    at 0x5162138: Curl_FormReader (formdata.c:1453)
==21353==    by 0x516204A: readfromfile (formdata.c:1413)
==21353==    by 0x516217E: Curl_FormReader (formdata.c:1458)
==21353==    by 0x518AD6C: Curl_fillreadbuffer (transfer.c:118)
==21353==    by 0x518C167: readwrite_upload (transfer.c:869)
==21353==    by 0x518C6F6: Curl_readwrite (transfer.c:1071)
==21353==    by 0x5197E6C: multi_runsingle (multi.c:1531)
==21353==    by 0x5198636: curl_multi_perform (multi.c:1808)
==21353==    by 0x518E26D: easy_transfer (easy.c:715)
==21353==    by 0x518E40A: easy_perform (easy.c:803)
==21353==    by 0x518E442: curl_easy_perform (easy.c:822)
==21353==    by 0x40131F: main (upload_file.cpp:65)
==21353==  If you believe this happened as a result of a stack
==21353==  overflow in your program's main thread (unlikely but
==21353==  possible), you can try to increase the size of the
==21353==  main thread stack using the --main-stacksize= flag.
==21353==  The main thread stack size used in this run was 8388608.
==21353== 
==21353== HEAP SUMMARY:
==21353==     in use at exit: 83,286 bytes in 131 blocks
==21353==   total heap usage: 320 allocs, 189 frees, 101,211 bytes allocated
==21353== 
==21353== LEAK SUMMARY:
==21353==    definitely lost: 0 bytes in 0 blocks
==21353==    indirectly lost: 0 bytes in 0 blocks
==21353==      possibly lost: 365 bytes in 8 blocks
==21353==    still reachable: 82,921 bytes in 123 blocks
==21353==         suppressed: 0 bytes in 0 blocks
==21353== Rerun with --leak-check=full to see details of leaked memory
==21353== 
==21353== For counts of detected and suppressed errors, rerun with: -v
==21353== Use --track-origins=yes to see where uninitialised values come from
==21353== ERROR SUMMARY: 26 errors from 10 contexts (suppressed: 10 from 6)
@bagder
Member

bagder commented Jul 19, 2015

Sorry, cannot repeat. I converted your code into plain C and used the following verbatim with no valgrind warnings and no memory leaks. Can you tell me what I need to change to make it crash?

What libcurl version are you using?

#include <curl/curl.h>

#define URL  "http://localhost/"

static size_t readcallback(char *buffer, size_t size, size_t nitems, void *instream)
{
   return 0;
}

int main(int argc, char **argv)
{
   struct curl_httppost *formpost = NULL;
   struct curl_httppost *lastptr = NULL;
   CURL * handle = NULL;
   struct curl_slist * headerlist = NULL;
   const char* fn = "filename";
   char space2[17];   // 17 is the sweet spot
   char space[8+8+8];


   curl_global_init(CURL_GLOBAL_ALL);

   handle = (curl_easy_init());
   headerlist = (curl_slist_append(NULL,"Expect:"));

   // we don't want "Expect: 100-continue" -- it screws up proxy
   curl_easy_setopt(handle, CURLOPT_HTTPHEADER, headerlist);

   curl_easy_setopt(handle, CURLOPT_URL, URL);
   curl_easy_setopt(handle, CURLOPT_VERBOSE, 1L);

   curl_easy_setopt(handle, CURLOPT_READFUNCTION, readcallback);

   // authentication (at the server end)
   curl_easy_setopt(handle, CURLOPT_HTTPAUTH, CURLAUTH_ANYSAFE);
   curl_easy_setopt(handle, CURLOPT_USERNAME, "");
   curl_easy_setopt(handle, CURLOPT_PASSWORD, "");

   curl_formadd( &formpost, &lastptr,
                 CURLFORM_COPYNAME, "file",

                 CURLFORM_STREAM, (void*)0x121,
                 CURLFORM_CONTENTSLENGTH, 0,//reader_data.size,
                 CURLFORM_FILENAME, fn,

                 CURLFORM_END
     );

   curl_easy_setopt(handle, CURLOPT_HTTPPOST, formpost);
   curl_easy_perform(handle);

   curl_easy_cleanup(handle);

   curl_slist_free_all(headerlist);
   curl_formfree(formpost);

   // should die by now
   return 0;
}

@paulharris
Author

Hi,

I'm using github's master curl.

Yes, the C program doesn't crash for me, you need to use the C++ program as-is.
The behaviour of curl is undefined and differs depending on all sorts of things.
For example, in the C++ version, if I don't initialize that useless SomeStruct then the program doesn't crash.

So undefined behaviour is going to be difficult to nail down,
but here is a start...

Use your C code and run it. I'm pointing it to a server that only accepts DIGEST auth (and refuses to auth the user).

The output I get from your test C code is below.
Note this bit: Content-Length: 140733193388192
The content length should be zero !

If I change the CURLFORM_STREAM to (void*)0 then the content length prints as zero (as it should).

Does it behave that way for you?

$ gcc  -g -ggdb -I~/build_curl/install/include -o test test.c -rdynamic ~/build_curl/install/lib/libcurl.so -Wl,-rpath,~/build_curl/install/lib

$ ./test 
*   Trying THE_IP...
* Connected to THE_HOST (THE_IP) port 80 (#0)
> POST /A_FORM HTTP/1.1
Host: THE_HOST
Accept: */*
Content-Length: 140733193388192
Content-Type: multipart/form-data; boundary=------------------------cc3b722562ca287e

< HTTP/1.1 413 Request Entity Too Large
< Connection: close
< Content-Length: 0
< Date: Mon, 20 Jul 2015 02:04:55 GMT
< Server: lighttpd/1.4.19
< 
* Closing connection 0

@jay
Member

jay commented Jul 20, 2015

Content-Length: 140733193388192

Tested master (aab76af 2015-07-18) in Ubuntu and Windows. The content length issue is reproducible in Ubuntu 14 x64 LTS.
I made a static-only build of libcurl to do the test. Here is exactly how I built:

make distclean
./buildconf
./configure --with-ssl=/usr/local/ssl --disable-shared --enable-static --enable-debug 
make
cd lib/.libs
gcc -o a a.c libcurl.a -g -ggdb -rdynamic -I../../include -L/usr/local/ssl/lib -Wl,-rpath,/usr/local/ssl/lib -lidn -lrtmp -lssl -lcrypto -lssl -lcrypto -llber -lldap -lz

The content length varies on each run, so I assume it's uninitialized. For example:

> POST / HTTP/1.1
Host: localhost
Accept: */*
Content-Length: 139964394242208
Content-Type: multipart/form-data; boundary=------------------------ef1b08da598fced3

Initial impressions:
In valgrind my results are slightly different. The content length is consistent (at 160) and I get the uninitialized value notices but nothing else, no leak or crash. Interestingly, the low 16 bits are always 0000 0000 1010 0000 (160), regardless of whether I run under valgrind or not. Maybe a va_arg issue? Hm.

On Windows I couldn't reproduce it. I used the Visual Studio 2010 project file and tried DLL Release - DLL OpenSSL and DLL Debug - DLL OpenSSL, in both 32-bit and 64-bit.

@jay
Member

jay commented Jul 20, 2015

valgrind warnings went away when I used CURLFORM_CONTENTSLENGTH, 0L specifically. I assume this is because my Linux long is 64-bit and there's a va_arg for long. I still get content length 160 though. CURLFORM_CONTENTSLENGTH says:

If you pass a 0 (zero) for this option, libcurl will instead do a strlen() on the contents to figure out the size. If you really want to send a zero byte content then you must make sure strlen() on the data pointer returns zero.

Should probably be changed to 0L, and the overview should probably have something in bold like "if a parameter requires a long you must pass a long, you must not pass an int." This is due to the va_arg method, I assume.

Also, from CURLFORM_STREAM:

Note that when using CURLFORM_STREAM, CURLFORM_CONTENTSLENGTH must also be set with the total expected length of the part.

That should be added to CURLFORM_CONTENTSLENGTH, basically something explicit that states 0L is not valid for stream.

Possibly none of this may help with your original issue though.

@jay
Member

jay commented Jul 20, 2015

On second thought, why wouldn't 0L be a valid length for a stream? So maybe instead CURLFORM_CONTENTSLENGTH would have to be modified to say something like "If you pass a 0L (zero) for this option, libcurl will instead do a strlen() on the contents to figure out the size, except in the case of CURLFORM_STREAM, in which case 0 is taken to be the length."
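
The rule in that proposed wording can be written down as a small function. This is only a sketch of the suggestion as phrased in this comment, which was still an open question in the thread; `resolve_part_length` and its parameters are hypothetical and not part of libcurl.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a libcurl function: decides how many bytes a
   form part carries under the wording proposed above. */
static long resolve_part_length(const char *contents, long length,
                                int is_stream)
{
  if(is_stream)
    return length;                  /* stream: 0 really means zero bytes */
  assert(contents != NULL);         /* non-stream parts hold a C string */
  if(length == 0)
    return (long)strlen(contents);  /* 0 means "measure it for me" */
  return length;
}
```

Under this reading, a stream part with length 0L sends nothing, while a non-stream part with length 0L is measured with strlen().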

@bagder
Member

bagder commented Jul 20, 2015

Right, the varargs functions need 'long' for the numerals. Like 0L and not just 0. With that fixed, do you still get this problem @paulharris ?
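
To make the 0 versus 0L point concrete, here is a minimal sketch of an option/value style variadic function. The names `OPT_LENGTH`, `OPT_END` and `last_length` are hypothetical and only mimic the calling style; this is not how curl_formadd is implemented. The callee fetches a `long` with va_arg(), so the caller has to pass a `long`.

```c
#include <assert.h>
#include <stdarg.h>

enum opt { OPT_END = 0, OPT_LENGTH = 1 };

/* Hypothetical variadic function in the style of an option/value list.
   Every OPT_LENGTH must be followed by a long, because that is the type
   va_arg() is told to fetch. Returns the last length seen, or -1. */
static long last_length(int first, ...)
{
  va_list ap;
  long length = -1;
  int option = first;
  va_start(ap, first);
  while(option != OPT_END) {
    assert(option == OPT_LENGTH);  /* the only option in this sketch */
    length = va_arg(ap, long);     /* caller must have passed a long */
    option = va_arg(ap, int);
  }
  va_end(ap);
  return length;
}
```

A call such as `last_length(OPT_LENGTH, 0L, OPT_END)` is correct. Writing `0` instead of `0L` typically compiles without a diagnostic, but it passes an `int` where a `long` is fetched, which is undefined behaviour; on platforms where `long` is wider than `int` it can show up as a garbage value.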

I tried sending my post to a page returning 401 due to bad auth but I still haven't triggered any crash.

@bagder
Member

bagder commented Aug 1, 2015

This is now believed to be fixed, closing

@bagder bagder closed this as completed Aug 1, 2015
@jay
Member

jay commented Aug 2, 2015

I didn't fix this though. I think the fix would be somehow documenting that in the case of CURLFORM_STREAM a content length of 0 is taken to be the length (again I'm assuming this is acceptable, I need clarification).

How about instead of what I mentioned prior I just preface it with "If you are not using CURLFORM_STREAM " which I think gets the same point across:

If you are not using CURLFORM_STREAM and you pass a 0 (zero) for this option, libcurl will instead do a strlen() on the contents to figure out the size. If you really want to send a zero byte content then you must make sure strlen() on the data pointer returns zero.

Related discussion in 4673094 to better warn of type adherence; eg 0L not 0 and I will continue that there.

@jay jay reopened this Aug 2, 2015
@paulharris
Author

Hi guys,

I have this earmarked for me to revisit in the next week or two, just a bit busy at the moment.
I'd appreciate it if the issue isn't closed prematurely. If it is, then it discourages people like me from contributing or even reporting bugs to the project.

I dislike the whole varargs 0 vs 0L thing; it is a very easy trap to fall into.
I don't see why there can't be an additional header file added to libcurl that allows me to call, e.g.:

curl_formadd_name(&formpost, &lastptr,"file");
curl_formadd_stream(&formpost, &lastptr,(void*)0x121);
curl_formadd_contentslength(&formpost, &lastptr, 0); // note: no need for 0L
curl_formadd_filename(&formpost, &lastptr,fn);

where in the new header, e.g.:

inline void curl_formadd_contentslength(whatnot *formpost, whatnot *lastptr, long n)
{
   curl_formadd(formpost, lastptr,
                CURLFORM_CONTENTSLENGTH, n,
                CURLFORM_END);
}

This is far more robust and a safer call, especially for new users...
There wouldn't be that many functions, and would not break any API.
AND if there is a breaking change in the API (e.g. curl needs to pass int64_t instead of long because Windows' long is only 32 bits) then library users are warned by the compiler.
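
The general technique behind this suggestion, typed wrappers in front of a variadic core, can be sketched in isolation. Everything here (`struct part`, `part_set`, `part_set_name`, `part_set_length`) is hypothetical and does not model how curl_formadd builds a part; it only shows how a typed parameter forces the conversion that the variadic call cannot check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical form part and options; none of this is libcurl API. */
struct part {
  const char *name;
  long length;
};

enum part_opt { PART_NAME, PART_LENGTH };

/* Variadic core: the value's type depends on the option, and the
   compiler cannot check it at the call site. */
static void part_set(struct part *p, int option, ...)
{
  va_list ap;
  assert(p != NULL);
  va_start(ap, option);
  switch(option) {
  case PART_NAME:
    p->name = va_arg(ap, const char *);
    break;
  case PART_LENGTH:
    p->length = va_arg(ap, long);
    break;
  default:
    break;
  }
  va_end(ap);
}

/* Typed wrappers: the parameter type forces the conversion, so a plain
   0 becomes a long before it reaches the variadic call. */
static void part_set_name(struct part *p, const char *name)
{
  part_set(p, PART_NAME, name);
}

static void part_set_length(struct part *p, long n)
{
  part_set(p, PART_LENGTH, n);
}
```

With the wrappers, `part_set_length(&p, 0)` is well defined because the `int` literal is converted to `long` by the prototype, and passing a pointer by mistake produces a compiler diagnostic.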

Back to work, I'll try and double-check the bug fix suggestion soon.
Best regards,
Paul

@bagder
Member

bagder commented Aug 3, 2015

The way to build multipart formposts can certainly be made into something that is easier to use with less risks of accidentally using the wrong type, and what you're suggesting seems like a fine approach. If you're willing to work on it, I'll certainly help out to review.

@paulharris
Author

I'll see what I come up with when I revisit, but I won't be able to do full test coverage, and I may not get all the APIs right. I have only just started using libcurl, so that level of code completeness can't be expected from such a contributor. But hopefully I can give you something to get the ball rolling.

@bagder
Member

bagder commented Aug 3, 2015

Every little bit counts!

@chubiei

chubiei commented Oct 1, 2015

Not sure if anyone is still working on this issue, but I think I've found the cause:

Commit b0143a2 removed fread_func and fread_in members in the connectdata structure, but these members are switched in the Curl_http function temporarily and then switched back in Curl_http_done.

Before the commit, the user callback (fread_func) in the SessionHandle was intact, but this property was broken after the commit due to the removal of fread_func in connectdata. Therefore after the user callback was switched in Curl_http, it was never switched back.

This leads to the issue that @paulharris found: the user callback is switched to the standard version, but called with user-provided read-data pointer.

The valgrind output tells the same story as well:

  • Calling Curl_pgrsSetUploadSize from Curl_http shows that the fread_func is switched.
  • Later after Curl_http is done, Curl_readwrite is called.
  • In Curl_readwrite, the form data is sent and the issue occurs.

A quick fix is to cache both the fread_func and fread_in members before the switch occurs and then assign them back afterwards. Although it sounds tricky, the same trick has already been used in db6ff22 (see the backup member in the HTTP structure).
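
The save-and-restore idea can be sketched on its own, outside libcurl. The types and functions below (`struct settings`, `struct request`, `switch_reader`, `restore_reader`, and the two sample callbacks) are hypothetical reductions made up for illustration. One assumption that goes beyond the description above: a flag guards the backup, so that switching twice, as happens when a request is replayed, does not overwrite the saved user callback with the internal one.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t (*read_cb)(char *buf, size_t size, size_t nitems, void *ctx);

/* Hypothetical, heavily reduced stand-ins for the handle's user
   settings and the per-request state. */
struct settings {
  read_cb fread_func;
  void *in;
};

struct request {
  read_cb saved_func;   /* backup of the user's callback */
  void *saved_in;       /* backup of the user's data pointer */
  int saved;            /* non-zero while a backup is held */
};

/* Sample callbacks so the pointers can be told apart. */
static size_t user_cb(char *b, size_t s, size_t n, void *c)
{
  (void)b; (void)s; (void)n; (void)c;
  return 0;
}

static size_t internal_cb(char *b, size_t s, size_t n, void *c)
{
  (void)b; (void)s; (void)n; (void)c;
  return 1;
}

/* Swap in an internal reader, remembering what the user had set. */
static void switch_reader(struct settings *set, struct request *req,
                          read_cb internal, void *internal_ctx)
{
  assert(set != NULL && req != NULL);
  if(!req->saved) {
    req->saved_func = set->fread_func;
    req->saved_in = set->in;
    req->saved = 1;
  }
  set->fread_func = internal;
  set->in = internal_ctx;
}

/* Put the user's callback and data pointer back, as a pair. */
static void restore_reader(struct settings *set, struct request *req)
{
  assert(set != NULL && req != NULL);
  if(req->saved) {
    set->fread_func = req->saved_func;
    set->in = req->saved_in;
    req->saved = 0;
  }
}
```

The important property is that the callback and its data pointer are always saved and restored together, never one without the other.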

@bagder bagder removed the needs-info label Oct 1, 2015
@bagder
Member

bagder commented Oct 5, 2015

This sounds a bit complicated but it feels like you're on the right track. Can you figure out a use case that actually triggers this problem? @chubiei, can you show your suggested fix as a patch?

@bagder bagder added the HTTP label Oct 5, 2015
@chubiei

chubiei commented Oct 5, 2015

Hello @bagder,

Although the code execution path has been identified, I don't have a way to reproduce the issue 100% of the time. At my company, the issue can occasionally be reproduced by using the Baidu PCS API to upload a large file to a Baidu server in chunks.

The patch below is a temporary fix that I mentioned earlier based on the method used in db6ff22 and has been committed to our company's repository. Personally I think a better way of fixing the issue is to extend the backup member by adding a type:

  • The initial state of backup type is BACKUP_EMPTY.
  • Before switching fread_func to readmoredata, back up all pointers to the backup structure and set the type to BACKUP_READMOREDATA. The type is set back to BACKUP_EMPTY after the pointers are restored.
  • Likewise, back up all pointers and set the backup type to BACKUP_FORMREADER before switching fread_func to Curl_FormReader. To unify the logic, you can instead switch fread_func to another static function that calls Curl_FormReader and restores the pointers and the backup type to their initial state before returning. With the static function, all pointers are restored earlier and the restoration becomes safer because it is not delayed until Curl_http_done.
  • Finally, the only thing that needs to be ensured is that the backup types never overlap. In other words, you can't back up both kinds of pointers at the same time. Currently that condition always holds, because readmoredata is called only if the HTTP request type is HTTPREQ_POST and Curl_FormReader is called only if the HTTP request type is HTTPREQ_POST_FORM. So their code paths are different.
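
The tagged backup described in the list above could look roughly like this. It is a sketch only: the tag names follow the proposal (the second is written here as `BACKUP_READMOREDATA`, after the readmoredata function it refers to), while the struct layout, `backup_save`, `backup_restore` and the sample callback are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t (*read_cb)(char *buf, size_t size, size_t nitems, void *ctx);

/* Tag names follow the proposal; the layout is an assumption. */
enum backup_type { BACKUP_EMPTY, BACKUP_READMOREDATA, BACKUP_FORMREADER };

struct backup {
  enum backup_type type;
  read_cb fread_func;
  void *fread_in;
};

struct settings {
  read_cb fread_func;
  void *in;
};

/* Sample user callback used to exercise the sketch. */
static size_t user_cb(char *b, size_t s, size_t n, void *c)
{
  (void)b; (void)s; (void)n; (void)c;
  return 0;
}

/* Save the current pair under the given tag. Returns 0 on success, or
   -1 if a backup of any type is already held, since the two kinds must
   never overlap. */
static int backup_save(struct backup *b, const struct settings *set,
                       enum backup_type type)
{
  assert(b != NULL && set != NULL && type != BACKUP_EMPTY);
  if(b->type != BACKUP_EMPTY)
    return -1;
  b->fread_func = set->fread_func;
  b->fread_in = set->in;
  b->type = type;
  return 0;
}

/* Restore the pair and return to the initial state. Restoring an empty
   backup is a no-op. */
static void backup_restore(struct backup *b, struct settings *set)
{
  assert(b != NULL && set != NULL);
  if(b->type == BACKUP_EMPTY)
    return;
  set->fread_func = b->fread_func;
  set->in = b->fread_in;
  b->type = BACKUP_EMPTY;
}
```

Rejecting a second save while a backup is held is one way to enforce the "never overlap" condition at run time rather than relying on the two code paths staying separate.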

The suggestion above is quite long. If you need me to write it as a patch, please let me know and I'll submit it later, as I am not that familiar with contributing to open source projects.

diff --git a/lib/http.c b/lib/http.c
index fe2c2ca..6008029 100644
--- a/lib/http.c
+++ b/lib/http.c
@@ -1486,6 +1486,17 @@ CURLcode Curl_http_done(struct connectdata *conn,
 #endif

   /* set the proper values (possibly modified on POST) */
+  if (http->curl_7_43_backup.fread_func) {
+    data->set.fread_func = http->curl_7_43_backup.fread_func; /* restore */
+    http->curl_7_43_backup.fread_func = NULL;
+  }
+  if (http->curl_7_43_backup.fread_in) {
+    data->set.in = http->curl_7_43_backup.fread_in; /* restore */
+    http->curl_7_43_backup.fread_in = NULL;
+  }
+
   conn->seek_func = data->set.seek_func; /* restore */
   conn->seek_client = data->set.seek_client; /* restore */

@@ -2453,6 +2464,11 @@ CURLcode Curl_http(struct connectdata *conn, bool *done)
        stream. */
     http->form.fread_func = data->set.fread_func;

+    /* backup fread_func and in, will be restored on done */
+    http->curl_7_43_backup.fread_func = data->set.fread_func;
+    http->curl_7_43_backup.fread_in = data->set.in;
+
     /* Set the read function to read from the generated form data */
     data->set.fread_func = (curl_read_callback)Curl_FormReader;
     data->set.in = &http->form;
diff --git a/lib/http.h b/lib/http.h
index 415be39..0437753 100644
--- a/lib/http.h
+++ b/lib/http.h
@@ -144,6 +144,13 @@ struct HTTP {
     curl_off_t postsize;
   } backup;
+
+  // FIXME: fix curl 7.43 update regression, issue #346
+  struct fix_curl_7_43_back {
+    curl_read_callback fread_func;
+    void *fread_in;
+  } curl_7_43_backup;
+
   enum {
     HTTPSEND_NADA,    /* init */
     HTTPSEND_REQUEST, /* sending a request */

@bagder
Member

bagder commented Oct 5, 2015

Okay, yeah I can certainly see how it never restoring the backup there is a bad idea!

But I would like to go for another fix. Namely to stop changing the 'set' struct member, which was always a bad idea. My larger patch can be found here: http://pastebin.com/raw.php?i=i5EbHLwH

It runs through all my tests fine.

@chubiei

chubiei commented Oct 6, 2015

Thanks for the quick fix! But I am afraid that this commit will not fix this issue. The main reason is that state.fread_func is not restored to set.fread_func after it has been assigned Curl_FormReader. You didn't find it because the restoration was removed in another commit, d04bab8. Besides, you might want to check commit b0143a2 as well, because similar assignments occurred in lib/url.c and lib/file.c and they were all removed.

@bagder
Member

bagder commented Oct 6, 2015

So you can repeat the crash even with my patch? The point is: what exactly is the purpose of restoring those values after the request is done? Answer: for the subsequent request. And my patch now sets the values unconditionally in the beginning of every request. I would imagine that it would remove the need for restoring them. Unless you've figured out another reason we need to restore them?
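
The idea of leaving the user's settings alone and working on a per-request copy can be sketched as follows. This is an illustration of the principle, not the patch itself; `struct handle`, `request_init`, `use_internal_reader` and the sample callbacks are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t (*read_cb)(char *buf, size_t size, size_t nitems, void *ctx);

/* Hypothetical reduction of a transfer handle: 'set' holds what the
   application asked for, 'state' holds what the current request uses. */
struct handle {
  struct { read_cb fread_func; void *in; } set;    /* never modified */
  struct { read_cb fread_func; void *in; } state;  /* per-request copy */
};

/* Sample callbacks so the pointers can be told apart. */
static size_t user_cb(char *b, size_t s, size_t n, void *c)
{
  (void)b; (void)s; (void)n; (void)c;
  return 0;
}

static size_t internal_cb(char *b, size_t s, size_t n, void *c)
{
  (void)b; (void)s; (void)n; (void)c;
  return 1;
}

/* Run at the start of a request: begin from the user's settings. */
static void request_init(struct handle *h)
{
  assert(h != NULL);
  h->state.fread_func = h->set.fread_func;
  h->state.in = h->set.in;
}

/* A request that needs an internal reader changes only 'state'. */
static void use_internal_reader(struct handle *h, read_cb internal,
                                void *ctx)
{
  assert(h != NULL);
  h->state.fread_func = internal;
  h->state.in = ctx;
}
```

Because nothing ever writes to `set`, there is nothing to restore afterwards; the open question is only whether `request_init` runs at every point where a request starts.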

@paulharris
Author

Hi guys,

Sorry I've been unresponsive, I've been working on another part of the project for a while,
but I'll be moving back to the network side and checking up on Curl then. I really have no time to run the tests right now, I'm sorry.

I just want to quickly throw in a thought to check up on.

The crash for me was because the request would be automatically replayed when authentication failed in a POST transfer.
This happens with the ANYSAFE auth method, which will try one method, fail, and then retry with another method.

So it's not a new request from the library user's end; it's actually curl that is re-executing the request.

Does your patch correctly set those values prior to the automatic re-request? (I can't tell from looking at the patch)

cheers,
Paul

@bagder
Member

bagder commented Oct 8, 2015

@paulharris ah, right, thanks for that feedback. No, I added the init code in the wrong spot as this now only sets it up correctly before everything starts.

I'll move that piece of code and post an update.

@bagder bagder self-assigned this Oct 8, 2015
@bagder
Member

bagder commented Oct 8, 2015

I created a temporary branch for playing with the fix for this (see issue-346). Here's what I suggest that now sets the pointers correctly for every new HTTP request during a single easy "transfer": 263192f35b77b

Thoughts?

bagder added a commit that referenced this issue Oct 8, 2015
... and assign it from the set.fread_func_set pointer in the
Curl_init_CONNECT function. This A) avoids that we have code that
assigns fields in the 'set' struct (which we always knew was bad) and
more importantly B) it makes it impossible to accidentally leave the
wrong value for when the handle is re-used etc.

Introducing a state-init functionality in multi.c, so that we can set a
specific function to get called when we enter a state. The
Curl_init_CONNECT is thus called when switching to the CONNECT state.

Bug: #346
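
The state-init mechanism described in this commit message can be pictured with a small table of per-state entry hooks. This is a sketch under assumptions, not the code in multi.c: the three states, `struct transfer`, `init_connect`, `state_init` and `set_state` are hypothetical, and plain integers stand in for the read callback pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-state transfer; a real state list is longer. */
enum xfer_state { ST_INIT, ST_CONNECT, ST_DONE, ST_LAST };

struct transfer {
  enum xfer_state state;
  int set_value;     /* what the application configured */
  int state_value;   /* working copy, possibly changed mid-request */
  int connects;      /* how many times CONNECT was entered */
};

typedef void (*init_fn)(struct transfer *t);

/* Entry hook for CONNECT: refresh the working copy from the settings. */
static void init_connect(struct transfer *t)
{
  t->state_value = t->set_value;
  t->connects++;
}

/* One optional hook per state, called whenever that state is entered. */
static const init_fn state_init[ST_LAST] = {
  NULL,          /* ST_INIT */
  init_connect,  /* ST_CONNECT */
  NULL           /* ST_DONE */
};

/* Switch state and run the entry hook of the new state, if any. */
static void set_state(struct transfer *t, enum xfer_state next)
{
  assert(t != NULL && next < ST_LAST);
  t->state = next;
  if(state_init[next])
    state_init[next](t);
}
```

In this sketch a request that is issued again re-enters ST_CONNECT, so the working copy is reset from the settings each time without any explicit restore step.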
@paulharris
Author

I can't comment on the patch as I'm not familiar with the code,
but is this code only relevant for "easy" transfers?

On 8 October 2015 at 21:21, Daniel Stenberg notifications@github.com
wrote:

I created a temporary branch for playing with the fix for this. Here's
what I suggest that now sets the pointers correctly for every new HTTP
request during a single easy "transfer": 263192f
263192f35b77b

Thoughts?



@bagder
Member

bagder commented Oct 9, 2015

The easy transfer function is just a thin layer built on top of the multi interface, so internally there really isn't any difference in which way you use to drive the transfer. It runs the same code.

My hope with posting the patch/branch here is that some of you who have actually seen this crash happen, and can repeat it, would try the patch and tell me if it fixes the problem for you. The patch works and runs all existing test cases fine on my end. I'll try to spend some more time writing up a test case that can reproduce the original crash.

@bagder bagder added the crash label Oct 9, 2015
@bagder
Member

bagder commented Oct 15, 2015

Okay, then I'll go ahead and merge my fix and close this issue; if anything still remains, we'll take it up then.

bagder added a commit that referenced this issue Oct 15, 2015
@bagder bagder closed this as completed in c6aedf6 Oct 15, 2015
@paulharris
Author

I finally had a chance to test the fix.
Yes, it appears to work now, great work ! Thank you.
