curl-and-python

Re: Tricks to optimizing PUT performance?

From: Mark Seger <mjseger_at_gmail.com>
Date: Fri, 25 Jan 2013 13:17:05 -0500

oops, I have to fall on my sword and say I accidentally profiled my
non-pycurl upload script, and it was the one that spent all that time in
compression but only 17% of its time in crypto. When I uploaded a file
with pycurl it didn't compress at all (as I originally suspected), but this
time it spent 2/3 of its time in crypto. Either way, both cases are spending
a LOT of time on the CPU and that just doesn't feel right to me.
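
I guess one way to double-check what's actually going over the wire (and
whether any Content-Encoding is being negotiated at all) is to turn on
libcurl's verbose/debug output on the handle; something like this (untested):

import pycurl

def debug(msg_type, msg):
    # show libcurl's informational text and the request headers it sends
    if msg_type in (pycurl.INFOTYPE_TEXT, pycurl.INFOTYPE_HEADER_OUT):
        print msg.rstrip()

c = pycurl.Curl()
c.setopt(pycurl.VERBOSE, 1)
c.setopt(pycurl.DEBUGFUNCTION, debug)

If nothing in the request headers mentions compression, then whatever
oprofile is attributing to compression presumably isn't coming from the
data actually being sent.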
-mark

On Fri, Jan 25, 2013 at 12:58 PM, Mark Seger <mjseger_at_gmail.com> wrote:

> so I finally got around to running oprofile on this and it looks like it's
> spending most of its time doing compression, and I don't see how. If I copy
> a 100MB file of all spaces without curl, it compresses very nicely and I
> see a lot less data going over the network. So I'd think that if compression
> were really being used here the transfer would have shrunk too, but if you
> look at the data below you can see the full 100MB went over the wire.
>
> I guess my question becomes: why is compression using so much of the CPU
> when no data is actually being compressed?
>
> -mark
>
>
> On Fri, Jan 25, 2013 at 9:30 AM, Mark Seger <mjseger_at_gmail.com> wrote:
>
>> dima - thanks for the reply. sorry for not getting back to you yesterday,
>> but I was offline and don't want you to think this isn't important to me.
>> I see you're getting a pycurl run in <1 sec, so should I assume you're on
>> the same system as the target of the PUT? I'm going over a wire...
>>
>> My issue with large data isn't so much the speed as the CPU load, which
>> stays at very high levels for a single upload. This is what a 200MB upload
>> looks like when I monitor it with collectl:
>>
>>
>> #<--------CPU--------><----------Disks-----------><----------Network---------->
>> #cpu sys inter ctxsw KBRead Reads KBWrit Writes KBIn PktIn KBOut PktOut
>>    6   0    36    18      0     0      0      0    1     6     1      6
>>    3   0   129    51      0     0      0      0    4   107   790     53
>>   20   0   627    95      0     0      0      0   30   771  7913    158
>>   76   7  2557   116      0     0      0      0   67  1708 31656   1009
>>   98  10  6757   127      0     0      0      0  178  4564 38086   2357
>>   69   9  4715   107      0     0      0      0  122  3117 25116   1573
>>    0   0    10    14      0     0      0      0    0     1     0      1
>>
>> and as you can see the load is quite high, which on a small-core system
>> means you can't get much else done if you want to multi-thread. What I'm
>> trying to figure out is where all the CPU time is being spent and whether
>> it's possible to reduce it. It's certainly possible I'm doing something
>> wrong in my code. Does this look ok?
>>
>> c = pycurl.Curl()
>> c.setopt(c.URL, '%s' % url)
>> c.setopt(c.HTTPHEADER, [auth_token])
>> c.setopt(c.UPLOAD, 1)
>>
>> c.setopt(pycurl.READFUNCTION, read_callback(1).callback)
>> c.setopt(pycurl.INFILESIZE, objsize)
>> c.perform()
>>
>> where the url and auth_token are built independently of this connection.
>> My read_callback simply pulls data out of a big string and returns it in
>> 16384-byte chunks. While I don't think it would do anything to improve the
>> CPU load, is there a way to increase the size of the chunks? Maybe some
>> other setopt call?
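>>
>> For concreteness, the callback is essentially doing something like this
>> (a simplified sketch, not the exact code):
>>
>> class read_callback:
>>     # serve successive slices of one big in-memory string to libcurl
>>     def __init__(self, data):
>>         self.data = data
>>         self.offset = 0
>>
>>     def callback(self, size):
>>         # libcurl asks for at most `size` bytes (16384 here); slicing
>>         # from an offset copies only the chunk, not the whole buffer
>>         chunk = self.data[self.offset:self.offset + size]
>>         self.offset += len(chunk)
>>         return chunk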
>>
>> But my other issue is that when I run with objects as small as 1KB, the
>> PUT takes over a full second just to execute the perform() call, and that
>> doesn't sound right either. I can do many more small-object uploads per
>> second with other libraries, so I've got to believe there's something wrong
>> in the way I've written the code.
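>>
>> One thing I still want to rule out for the small objects is paying for a
>> new connection (plus a TLS handshake over https) and the Expect/100-continue
>> exchange on every upload, since those seem like plausible suspects. Roughly
>> what I have in mind, untested, with 'objects' just standing in for whatever
>> I'm iterating over:
>>
>> c = pycurl.Curl()
>> for name, payload in objects:
>>     c.setopt(pycurl.URL, '%s/%s' % (url, name))
>>     # an empty Expect: header suppresses the "Expect: 100-continue"
>>     # round trip, which can otherwise stall a small PUT at the server
>>     c.setopt(pycurl.HTTPHEADER, [auth_token, 'Expect:'])
>>     c.setopt(pycurl.UPLOAD, 1)
>>     c.setopt(pycurl.READFUNCTION, read_callback(payload).callback)
>>     c.setopt(pycurl.INFILESIZE, len(payload))
>>     c.perform()          # the connection is kept alive and reused
>> c.close()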
>>
>> -mark
>>
>>
>>
>> On Thu, Jan 24, 2013 at 5:43 AM, Dima Tisnek <dimaqq_at_gmail.com> wrote:
>> >
>> > I went ahead and tried to reproduce your workload: sent 100M of data in
>> > 10K reads over http and then over https (aes128/sha1) on localhost
>> >
>> > air:~ dima$ time openssl s_server -msg -debug -nocert -cipher 'ADH-AES128-SHA' -accept 8080 > somefile.ssl
>> > ^C
>> >
>> > real 0m5.425s
>> > user 0m1.316s
>> > sys 0m0.429s
>> >
>> > air:~ dima$ time ./test-pycurl-put.py
>> > [snip]
>> > real 0m4.078s
>> > user 0m1.810s
>> > sys 0m0.284s
>> >
>> > Well, I do get a spike of 100% CPU usage for the individual processes,
>> > but that's all for a good cause: according to openssl speed, aes-128-cbc
>> > crunches up to 120MB/s and sha1 around 300MB/s, so the ~60MB/s I get is
>> > not superb, but quite acceptable.
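>> >
>> > Those throughput numbers are just from the stock openssl benchmark run
>> > locally, i.e. something like:
>> >
>> > openssl speed aes-128-cbc sha1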
>> >
>> > For comparison, http pycurl time output:
>> > real 0m0.946s
>> > user 0m0.175s
>> > sys 0m0.177s
>> >
>> > yes, it takes about a second to push 100MB through, but it hardly taxes
>> > the processor, using only about a tenth of a single core.
>> >
>> > If you get much lower throughput than this, perhaps it's down to how you
>> > process the data you send in Python; e.g. if you keep reallocating or
>> > "resizing" large strings, that could lead to O(N^2) behavior.
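>> >
>> > For example, something along these lines shrinks the string on every
>> > read and copies the whole remainder each time:
>> >
>> > # quadratic: each call copies everything that's left (~100MB at first)
>> > def callback(size):
>> >     global data
>> >     chunk, data = data[:size], data[size:]
>> >     return chunk
>> >
>> > whereas keeping an offset and returning data[offset:offset+size] only
>> > copies 16KB per call.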
>> >
>> > d.
>> >
>> >
>> >
>> > On 24 January 2013 01:35, Mark Seger <mjseger_at_gmail.com> wrote:
>> >>
>> >> I've managed to get to the point where I can now upload in-memory
>> >> strings of data via a REST interface. Very cool stuff. In fact, the good
>> >> news is I can hit very high network rates with strings on the order of
>> >> 100MB or more. The bad news is that smaller strings upload very slowly
>> >> and I have no idea why.
>> >>
>> >> To try to figure out what's going on I surrounded the perform() call
>> >> with time.time() calls to measure the delay, and I'm finding that even
>> >> with payloads on the order of 32KB it always takes over a second to
>> >> execute the upload, whereas other interfaces go much faster, on the
>> >> order of under 0.1 sec per upload. Has anyone else ever observed this
>> >> behavior?
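>> >>
>> >> A per-phase breakdown is also available from the handle via getinfo()
>> >> right after perform(), which should show whether the time is going into
>> >> connection/TLS setup or into the transfer itself; a rough, untested
>> >> sketch:
>> >>
>> >> c.perform()
>> >> # cumulative timings, in seconds, from the start of the request
>> >> for label, key in (('dns done', pycurl.NAMELOOKUP_TIME),
>> >>                    ('connect done', pycurl.CONNECT_TIME),
>> >>                    ('tls done', pycurl.APPCONNECT_TIME),
>> >>                    ('first byte', pycurl.STARTTRANSFER_TIME),
>> >>                    ('total', pycurl.TOTAL_TIME)):
>> >>     print label, c.getinfo(key)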
>> >>
>> >> Digging a little deeper I've observed a few things:
>> >> - when my callback is called for data, it is passed a chunk size of
>> >> 16384, and I wonder if asking for bigger chunks would result in fewer
>> >> calls, which in turn could speed things up
>> >> - another thing I noticed is a very high CPU load; not for the small
>> >> strings, but for the larger ones I'm seeing close to 100% of a single
>> >> CPU being saturated. Is this caused by encryption? Is there any way to
>> >> speed it up or choose a faster algorithm? Or is it something totally
>> >> different?
>> >> - I'm also guessing the overhead is not caused by data compression,
>> >> because I'm intentionally sending a string of all spaces, which is
>> >> highly compressible, and I do see the full 100MB go over the network;
>> >> if it were compressed I'd expect to see far less.
>> >>
>> >> I know pycurl is very heavily used everywhere, and this could simply be
>> >> a case of operator error on my part. If anyone would like to see my code
>> >> I'd be happy to send it along, but for now I thought I'd just keep it to
>> >> a couple of simple questions in case the answer is an obvious one.
>> >>
>> >> -mark
>> >>
>> >>
>> >>
>> >
>> >
>> >
>>
>>
>

_______________________________________________
http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-python
Received on 2013-01-25