curl-and-python

Tricks for optimizing PUT performance?

From: Mark Seger <mjseger_at_gmail.com>
Date: Wed, 23 Jan 2013 19:35:55 -0500

I've managed to get to the point where I can now upload in-memory strings
of data via a REST interface. Very cool stuff. In fact, the good news is
that I can hit very high network rates with strings on the order of 100MB
or more. The bad news is that smaller strings upload very slowly and I have
no idea why.

To try to figure out what's going on, I surrounded the perform() call with
time.time() calls to measure the delay. I'm finding that even with payloads
on the order of 32KB, the upload call always takes over a second, whereas
other interfaces go much faster, on the order of under 0.1 sec/upload. Has
anyone else ever observed this behavior?
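
For concreteness, the measurement looks roughly like this (a stripped-down
sketch of what my code does; the URL is a placeholder and the real code
sets a few more options):

  import io
  import time
  import pycurl

  data = b" " * (32 * 1024)  # 32KB payload of spaces

  c = pycurl.Curl()
  c.setopt(pycurl.URL, "http://example.com/container/object")  # placeholder
  c.setopt(pycurl.UPLOAD, 1)                            # issue a PUT
  c.setopt(pycurl.READFUNCTION, io.BytesIO(data).read)  # body from memory
  c.setopt(pycurl.INFILESIZE, len(data))

  start = time.time()
  c.perform()
  print("perform() took %.3f sec" % (time.time() - start))
  c.close()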

Digging a little deeper, I've observed a few things (a small diagnostic
sketch follows the list):
- when my read callback is called for data, it is passed a chunk size of
16384 bytes, and I wonder if asking for bigger chunks would result in fewer
calls, which in turn could speed things up
- another thing I noticed is very high CPU load: not for the small strings,
but for the larger ones I'm seeing close to 100% of a single CPU being
saturated. Is this caused by encryption? Is there any way to speed it up or
choose a faster algorithm, or is it something totally different?
- I'm also guessing the overhead is not caused by data compression, because
I'm intentionally sending a string of all spaces, which is highly
compressible, and I still see the full 100MB go over the network; if it
were compressed I'd expect to see far less.
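
To help narrow this down, I'm planning to pull libcurl's own per-phase
timers after perform() and to check which SSL backend my build uses, along
these lines (a minimal sketch; report() would be called after perform() and
before close()):

  import pycurl

  print(pycurl.version)  # shows which SSL library libcurl was built with

  def report(c):
      # break the total time into phases to see where the extra second goes
      for label, opt in (("namelookup",    pycurl.NAMELOOKUP_TIME),
                         ("connect",       pycurl.CONNECT_TIME),
                         ("pretransfer",   pycurl.PRETRANSFER_TIME),
                         ("starttransfer", pycurl.STARTTRANSFER_TIME),
                         ("total",         pycurl.TOTAL_TIME)):
          print("%s: %.3f sec" % (label, c.getinfo(opt)))

If, say, starttransfer accounts for most of the second, the time is being
spent waiting on the server rather than in pycurl itself.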

I know pycurl is very heavily used everywhere, so this could simply be a
case of operator error on my part. If anyone would like to see my code I'd
be happy to send it along, but for now I thought I'd keep it to a couple of
simple questions in case the answer is an obvious one.

-mark

_______________________________________________
http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-python
Received on 2013-01-24