maximizing performance when issuing a large number of GET/PUT requests to a single server

From: Alex Loukissas <alex_at_maginatics.com>
Date: Wed, 10 Aug 2011 17:51:45 -0700

The scenario is the following:

I want to perform many GET and PUT requests against a single HTTP server
for many files (e.g. downloading/uploading large photo albums), where
the only thing that changes between requests is the name of the file.
My current code takes the URI I wish to GET/PUT and calls a function
that performs the transfer on that URI. This function initializes and
cleans up a curl_easy_handle each time, which, as expected, kills
performance (no connection reuse) and, for a large number of URIs,
also leaves a large number of TCP connections in the TIME_WAIT state.
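
For illustration, here's a minimal sketch of the reuse I'm missing
today (fetch_all, uris and count are placeholder names, not my actual
code): initialize one easy handle up front and reuse it for every URI,
so libcurl can keep the connection to the server alive between
transfers:

  #include <curl/curl.h>

  /* sketch only: one easy handle reused for all URIs, so the
     connection to the server stays alive between transfers */
  static int fetch_all(const char *uris[], int count)
  {
    int i;
    CURL *curl = curl_easy_init();   /* init once, not per request */
    if(!curl)
      return 1;
    for(i = 0; i < count; i++) {
      /* only the URL changes between requests */
      curl_easy_setopt(curl, CURLOPT_URL, uris[i]);
      if(curl_easy_perform(curl) != CURLE_OK)
        break;                       /* real code would record/retry */
    }
    curl_easy_cleanup(curl);         /* clean up once at the end */
    return 0;
  }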

I am now looking to refactor this code so that I get a speedup from
parallelism (the order in which these requests complete doesn't need
to be FIFO or anything, as long as they all succeed) and from
connection reuse (i.e. keeping the pipe full). One option I've looked
at is moving to the curl_multi_* interface, with a fixed number of
easy handles that get reused until all requests have been served. My
question is this: is there a benefit to doing that (which, from what I
understand, is single-threaded and serialized -- see
http://curl.haxx.se/mail/lib-2011-06/0065.html) over having a thread
pool of curl_easy_handles, each with its own connection to the server?
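
For reference, this is roughly the multi-based loop I have in mind --
again just a sketch (run_multi, uris, count and NUM_HANDLES are
placeholders): a fixed pool of easy handles added to one multi handle,
so they share its connection cache, with each finished handle re-armed
with the next URI:

  #include <sys/select.h>
  #include <curl/curl.h>

  #define NUM_HANDLES 10             /* size of the handle pool */

  static void run_multi(const char *uris[], int count)
  {
    CURLM *multi = curl_multi_init();
    int next = 0, running = 0, i;

    /* fill the pool */
    for(i = 0; i < NUM_HANDLES && next < count; i++, next++) {
      CURL *e = curl_easy_init();
      curl_easy_setopt(e, CURLOPT_URL, uris[next]);
      curl_multi_add_handle(multi, e);
    }

    do {
      struct timeval timeout = { 1, 0 };
      fd_set r, w, x;
      int maxfd = -1;
      int msgs_left;
      CURLMsg *msg;

      curl_multi_perform(multi, &running);

      /* re-arm each finished handle with the next URI */
      while((msg = curl_multi_info_read(multi, &msgs_left))) {
        if(msg->msg == CURLMSG_DONE) {
          CURL *e = msg->easy_handle;
          curl_multi_remove_handle(multi, e);
          if(next < count) {
            curl_easy_setopt(e, CURLOPT_URL, uris[next++]);
            curl_multi_add_handle(multi, e);
            running++;               /* count the re-added transfer so
                                        the loop keeps going */
          }
          else
            curl_easy_cleanup(e);
        }
      }

      FD_ZERO(&r);
      FD_ZERO(&w);
      FD_ZERO(&x);
      curl_multi_fdset(multi, &r, &w, &x, &maxfd);
      if(maxfd >= 0)                 /* real code would also honour
                                        curl_multi_timeout() here */
        select(maxfd + 1, &r, &w, &x, &timeout);
    } while(running || next < count);

    curl_multi_cleanup(multi);
  }

As I understand it, this keeps all transfers in a single thread over
the multi handle's shared connection cache, whereas the thread pool
alternative would pin one easy handle (and one connection) to each
thread.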

Thanks!
Alex