PyCURL interface - Uploading large binary files

From: Jesse Noller
Date: Wed, 04 Feb 2004 17:50:06 -0500

The problem: I am writing a file uploading utility in Python that uses
the walk() function to traverse a directory, find every file under that
directory, and upload each one to a remote server using the pycurl
interface to libcurl. The files are invariably binary files, and the
upload method is an HTTP PUT to the system.
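
A rough sketch of the directory traversal described above. It uses
os.walk(), the generator-based counterpart to os.path.walk() (which is
no longer available in Python 3). The helper name iter_files is
hypothetical and not taken from the original tool.

```python
import os


def iter_files(root):
    """Yield the full path of every file under root (hypothetical helper)."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort so that files within a directory come out in a stable order.
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

Each path it yields could then be handed to the upload function, one
transfer per file.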

I also need to perform the reverse - I need to GET those files and
write them to disk.
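
For the GET direction, one possible approach is to give libcurl the
write() method of a file opened in binary mode, so the body goes to disk
as it arrives instead of being collected in memory. This is only a
sketch: configure_download is a hypothetical helper, and the URL and path
in the usage comment are placeholders.

```python
def configure_download(c, url, out_file):
    """Point a pycurl.Curl-style handle at url and stream the body to out_file.

    out_file must be opened in binary mode ("wb"). libcurl delivers the
    body in small pieces, each of which is written straight to the file.
    """
    c.setopt(c.URL, url)
    c.setopt(c.WRITEFUNCTION, out_file.write)
    c.setopt(c.NOSIGNAL, 1)


# Usage (assumes pycurl is installed; URL and path are placeholders):
#   import pycurl
#   c = pycurl.Curl()
#   with open("/tmp/out.bin", "wb") as f:
#       configure_download(c, "http://target:8080/file", f)
#       c.perform()
#   c.close()
```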

The problems I am seeing are memory use and timeouts. Currently, I call
os.path.walk(dir), and then I call the upload function. The upload
function basically goes:

import traceback

f = open(filepath, "rb")
fs = os.path.getsize(filepath)

c = pycurl.Curl()
c.setopt(c.URL, target_url)
c.setopt(c.HTTPHEADER, ["User-Agent: Load Tool (PyCURL Load Tool)"])
c.setopt(c.PUT, 1)
c.setopt(c.READDATA, f)
c.setopt(c.INFILESIZE, int(fs))
c.setopt(c.NOSIGNAL, 1)
if verbose == 'true':
    c.setopt(c.VERBOSE, 1)
c.body = StringIO()  # collects the server's response body
c.setopt(c.WRITEFUNCTION, c.body.write)

This opens the file via open(), which appears to read the file into
memory. That, of course, causes problems when the client machine has only
512 MB of RAM and we're uploading a 2-3 GB file (setting aside the
argument against doing this via HTTP PUT).
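
For what it's worth, open() on its own only returns a file handle; data
is pulled in when read() is called. A minimal sketch of feeding libcurl
the file in bounded pieces through a read callback follows. FileReader
and configure_put are hypothetical names, and UPLOAD is used here in
place of PUT as an assumption about what suits a current libcurl.
Whether this resolves the memory use seen above is untested.

```python
import os


class FileReader:
    """Hypothetical helper that feeds a file to libcurl piece by piece."""

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.bytes_sent = 0

    def read_callback(self, size):
        # libcurl asks for at most `size` bytes; an empty result means EOF.
        data = self.fileobj.read(size)
        self.bytes_sent += len(data)
        return data


def configure_put(c, url, fileobj):
    """Set up an HTTP PUT on a pycurl.Curl-style handle; returns the reader."""
    reader = FileReader(fileobj)
    size = os.fstat(fileobj.fileno()).st_size
    c.setopt(c.URL, url)
    c.setopt(c.UPLOAD, 1)
    c.setopt(c.READFUNCTION, reader.read_callback)
    c.setopt(c.INFILESIZE, size)
    return reader
```

With this shape, only one callback-sized piece of the file is held by
the Python side at a time, and the transfer is still a single PUT, so
per-transfer metrics are unaffected.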

I am also running into a problem where, once I hit a file of roughly
260 megabytes, I start getting intermittent (or constant, depending on
file size) errors:

* Empty reply from server

Originally I assumed this was because I was contacting the server with a form
POST and libcurl was taking too long to encode the file, which caused a
timeout. That is not the case - I can reproduce it with a PUT, as shown above.

The verbose output from the session is:

* About to connect() to target:8080
* Connected to ( port 8080
> POST /put HTTP/1.1
Host: target-036:8080
Pragma: no-cache
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
User-Agent: Zoidberg (PyCURL Load Tool)
Content-Length: 314573158
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------46b3250c0a30
* Empty reply from server
* Connection #0 left intact

If I run "curl -vT file URL" on the command line, it PUTs properly, so
I have to assume the problem lies in the way I, or pycurl, invoke libcurl.

Does anyone know a more efficient way to do this? Please also
note I am measuring the metrics for each transaction sent too - so I
don't want to chunk and then upload, as I only get metrics for the

The metrics are collected before the call to c.close():

speed_up = c.getinfo(c.SPEED_UPLOAD)
size_up = c.getinfo(c.SIZE_UPLOAD)
ttime = c.getinfo(c.TOTAL_TIME)
ctime = c.getinfo(c.CONNECT_TIME)
sttime = c.getinfo(c.STARTTRANSFER_TIME)
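
The same five getinfo() calls could be gathered into one function that
returns a dictionary per transfer. This is just a sketch; collect_metrics
and the dictionary keys are hypothetical names. It has to run after
c.perform() and before c.close().

```python
def collect_metrics(c):
    """Return upload statistics from a pycurl.Curl-style handle."""
    return {
        "speed_up": c.getinfo(c.SPEED_UPLOAD),       # bytes per second
        "size_up": c.getinfo(c.SIZE_UPLOAD),         # bytes
        "total_time": c.getinfo(c.TOTAL_TIME),       # seconds
        "connect_time": c.getinfo(c.CONNECT_TIME),   # seconds
        "starttransfer_time": c.getinfo(c.STARTTRANSFER_TIME),  # seconds
    }
```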

Does anyone have any thoughts?

Thank you

Received on 2004-02-04