curl-library

HTTP Pipelining Contributions

From: George Rizkalla <grizkalla_at_rim.com>
Date: Tue, 24 Jul 2012 15:01:26 +0000

Hi all,

We are currently looking at contributing to libcurl's pipelining
implementation, and we were hoping to get your feedback on some areas we'd
like to help with.

Joe Mason will be contributing the bulk of the code changes (after he
wraps up the authentication changes he has been discussing on this list).
In the interim, I was hoping to run some of our design ideas by you.

The proposed algorithm involves balancing HTTP requests over multiple TCP
sockets, while avoiding use of HTTP pipelining in instances where we
believe errors are likely to occur, or where it is likely that there would
be a performance hit if pipelining is used.

Essentially, there are three areas we wish to address:

1. Controlling the maximum number of sockets in use
2. Controlling maximum pipeline length, and protocol behaviour when the
limit is reached
3. Providing the ability to blacklist sites or proxies that are known to
not support pipelining

1. MAX SOCKETS CHANGES
While CURLMOPT_MAXCONNECTS limits the number of sockets that are kept
open for reuse, it does not limit the number of sockets in use at a given
point in time. We propose that CURLMOPT_MAXCONNECTS be aliased/renamed to
CURLMOPT_MAXCONNECTS_SOFT, and that a new option,
CURLMOPT_MAXCONNECTS_HARD, be introduced. The latter option would cap the
number of open sockets at any given point in time. This removes the need
for the application to queue requests itself when it wishes to throttle
the number of underlying sockets.
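
To make the difference between the two limits concrete, here is a minimal
sketch in Python. It is a toy model, not libcurl code: the class and method
names are hypothetical, and the two fields simply stand in for the proposed
CURLMOPT_MAXCONNECTS_SOFT and CURLMOPT_MAXCONNECTS_HARD options.

```python
from dataclasses import dataclass


@dataclass
class ConnectionPool:
    """Toy model of the two proposed limits (hypothetical, not libcurl)."""
    maxconnects_soft: int  # idle sockets kept open for reuse
    maxconnects_hard: int  # sockets open at any one time, busy or idle
    busy: int = 0
    idle: int = 0

    def acquire(self) -> bool:
        """Get a socket for a new transfer; False means the caller must wait."""
        if self.idle > 0:
            # Reuse a persisted socket before opening a new one.
            self.idle -= 1
            self.busy += 1
            return True
        if self.busy + self.idle < self.maxconnects_hard:
            self.busy += 1  # open a new socket
            return True
        return False  # hard limit reached

    def release(self) -> None:
        """A transfer finished: persist the socket if the soft limit allows."""
        self.busy -= 1
        if self.idle < self.maxconnects_soft:
            self.idle += 1
        # Otherwise the socket is closed and simply no longer counted.
```

With a soft limit of 1 and a hard limit of 2, a third concurrent transfer is
refused, and only one of the two sockets is kept once both transfers finish.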

2. HTTP PIPELINING CHANGES
-Max Pipeline Length
We propose that the maximum pipeline length be set using a new curl
option, CURLMOPT_MAX_PIPELINE_LENGTH. When a new request would push the
number of outstanding requests on a socket past this limit, a new socket
should be opened. When the total number of outstanding requests reaches
CURLMOPT_MAX_PIPELINE_LENGTH * CURLMOPT_MAXCONNECTS_HARD, further requests
are queued.
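
As a quick illustration of the queueing threshold, the sketch below computes
the total capacity from the two proposed options. The function names are
hypothetical, and the values used in the example are arbitrary, not
recommended defaults.

```python
def total_capacity(max_pipeline_length: int, maxconnects_hard: int) -> int:
    """Outstanding requests that fit before anything has to be queued."""
    return max_pipeline_length * maxconnects_hard


def must_queue(outstanding: int, max_pipeline_length: int,
               maxconnects_hard: int) -> bool:
    """True once every allowed socket already carries a full pipeline."""
    return outstanding >= total_capacity(max_pipeline_length, maxconnects_hard)
```

For example, with a pipeline length of 4 and a hard limit of 6 sockets, up
to 24 requests can be outstanding; the 25th would be queued.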

-Socket Penalization Principle
For effective pipelining, the concept of socket penalization should be
introduced to libcurl. The guiding tenet is that, where possible, a new
request should not be pipelined behind a request (or response) that is
known to be large.

A socket should be penalized when either of the following is seen:

a) A Content-Length header specifying a content-length greater than the
proposed new curl option CURLMOPT_CONTENT_LENGTH_PENALTY_SIZE
b) A chunk size larger than the proposed new curl option
CURLMOPT_CHUNK_PENALTY_SIZE where a chunked transfer-encoding is used

Of course, if a chunked transfer-encoding is used with multiple chunks
smaller than CURLMOPT_CHUNK_PENALTY_SIZE, it will not be possible to
identify the penalized condition correctly.
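
The two checks, and the limitation just mentioned, can be sketched as
follows. This is an illustration only: the function name is hypothetical,
and the two threshold parameters stand in for the proposed
CURLMOPT_CONTENT_LENGTH_PENALTY_SIZE and CURLMOPT_CHUNK_PENALTY_SIZE.

```python
from typing import Iterable, Optional


def should_penalize(content_length: Optional[int],
                    chunk_sizes: Iterable[int],
                    content_length_penalty_size: int,
                    chunk_penalty_size: int) -> bool:
    """Return True if a message looks large by either proposed test."""
    # (a) Content-Length header above the threshold.
    if content_length is not None and content_length > content_length_penalty_size:
        return True
    # (b) Any single chunk above the chunk threshold.
    return any(size > chunk_penalty_size for size in chunk_sizes)
```

A body sent as one hundred 1000-byte chunks is not penalized with a chunk
threshold of 4096, even though it totals 100000 bytes, because no single
chunk crosses the threshold.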

-Pipeline-able Sockets
In addition to the obvious checks for the use of an HTTP/1.1 server,
before assuming that pipelining is possible on a particular socket,
libcurl could check for older/incompatible IIS servers before deciding to
pipeline. It should also check the blacklist (discussed below) to
determine if pipelining is an option for this host.

If the first response to a request on a socket is marked as HTTP/1.0, or
an older IIS server version is used, or the site is blacklisted (see
below), the socket should be marked CAN_PIPELINE=false.
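
A minimal sketch of that decision is shown below. The function name is
hypothetical. Which IIS versions count as "older" is not specified here, so
the sketch takes that as a flag supplied by the caller rather than guessing
at a version number; the blacklist lookup is likewise passed in as a flag.

```python
def can_pipeline(status_line: str, is_old_iis: bool,
                 is_blacklisted: bool) -> bool:
    """Decide CAN_PIPELINE from the first response seen on a socket."""
    # The HTTP version is the first token of the status line,
    # e.g. "HTTP/1.1 200 OK".
    version = status_line.split(" ", 1)[0]
    return version == "HTTP/1.1" and not is_old_iis and not is_blacklisted
```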

-Basic Socket Selection Algorithm
1. If one or more sockets established to the desired host are marked
CAN_PIPELINE, are not marked PENALIZED, and have fewer than
CURLMOPT_MAX_PIPELINE_LENGTH outstanding requests, select the one with
the fewest outstanding requests. Proceed to 5.
2. If the number of open sockets is less than CURLMOPT_MAXCONNECTS_HARD,
open a new socket. Proceed to 5.
3. If an idle connection to another host exists, close it. Open a new
socket, and proceed to 5.
4. Queue the new request until a prior request is complete, and then
proceed to 1.
5. If the request is a PUT or POST and contains a chunk length greater
than CURLMOPT_CHUNK_PENALTY_SIZE, or a content-length greater than
CURLMOPT_CONTENT_LENGTH_PENALTY_SIZE, mark the socket as PENALIZED.
6. Increment the OUTSTANDING_REQUESTS on the socket.
7. Send the request.
8. If the content-length or any chunk length of the response exceeds the
corresponding threshold, mark the socket as PENALIZED.
9. If the response type is HTTP/1.1, and the host is not blacklisted, and
the server is not using an old IIS version, mark the socket as
CAN_PIPELINE=true.
10. When the entire response is received, decrement the number of
outstanding requests on the socket, and un-penalize the socket.
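
The ten steps above can be sketched as a small state model. This is a
simplified illustration, not proposed libcurl code: the Sock and Scheduler
names are hypothetical, no network I/O is performed, and the "large" and
"pipelining_ok" flags are assumed to be computed elsewhere from the size
thresholds and the HTTP/1.1, IIS and blacklist checks.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional, Tuple


@dataclass(eq=False)  # compare sockets by identity, not by field values
class Sock:
    host: str
    can_pipeline: bool = False  # unknown until the first response (step 9)
    penalized: bool = False
    outstanding: int = 0


@dataclass
class Scheduler:
    max_pipeline_length: int
    maxconnects_hard: int
    socks: List[Sock] = field(default_factory=list)
    queue: Deque[Tuple[str, bool]] = field(default_factory=deque)

    def _open(self, host: str) -> Sock:
        sock = Sock(host)
        self.socks.append(sock)
        return sock

    def select(self, host: str) -> Optional[Sock]:
        # Step 1: least-loaded eligible socket to this host.
        eligible = [s for s in self.socks
                    if s.host == host and s.can_pipeline and not s.penalized
                    and s.outstanding < self.max_pipeline_length]
        if eligible:
            return min(eligible, key=lambda s: s.outstanding)
        # Step 2: open a new socket if the hard limit allows.
        if len(self.socks) < self.maxconnects_hard:
            return self._open(host)
        # Step 3: replace an idle connection to another host.
        for s in self.socks:
            if s.host != host and s.outstanding == 0:
                self.socks.remove(s)
                return self._open(host)
        return None  # Step 4: the caller queues the request

    def send(self, host: str, large: bool = False) -> Optional[Sock]:
        sock = self.select(host)
        if sock is None:
            self.queue.append((host, large))  # Step 4
            return None
        if large:
            sock.penalized = True  # Step 5
        sock.outstanding += 1      # Step 6
        return sock                # Step 7: the actual send is omitted

    def on_response_headers(self, sock: Sock, large: bool,
                            pipelining_ok: bool) -> None:
        if large:
            sock.penalized = True      # Step 8
        if pipelining_ok:
            sock.can_pipeline = True   # Step 9

    def on_response_complete(self, sock: Sock) -> Optional[Sock]:
        sock.outstanding -= 1          # Step 10
        sock.penalized = False
        if self.queue:                 # Step 4: retry one queued request
            host, large = self.queue.popleft()
            return self.send(host, large)
        return None
```

Read literally, step 1 skips an idle socket to the same host that is not
yet marked CAN_PIPELINE, so such a socket is only reused once step 9 has
run on it. The sketch keeps that behaviour rather than guessing at the
intent.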

3. BLACKLISTING
A new libcurl method that receives a list of host/port combinations that
shouldn't be pipelined could be added (per
http://tools.ietf.org/html/draft-nottingham-http-pipeline-01#section-4).
CAN_PIPELINE will always be false for these servers.
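
One possible shape for such a list is sketched below. The class and method
names are hypothetical and do not correspond to any existing libcurl
interface; the host names in the example are placeholders. Hosts are
compared case-insensitively, since host names are not case-sensitive.

```python
from typing import Iterable, Set, Tuple


class PipelineBlacklist:
    """Set of (host, port) pairs that must never be pipelined to."""

    def __init__(self, entries: Iterable[Tuple[str, int]]) -> None:
        self._entries: Set[Tuple[str, int]] = {
            (host.lower(), port) for host, port in entries
        }

    def blocks(self, host: str, port: int) -> bool:
        """True if CAN_PIPELINE must stay false for this host and port."""
        return (host.lower(), port) in self._entries


blacklist = PipelineBlacklist([("proxy.example.com", 8080)])
```

Here blacklist.blocks("Proxy.Example.com", 8080) is true, while the same
host on port 80 is not affected.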

Any feedback/suggestions would be greatly appreciated!

Warm regards,
George

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2012-07-24