cURL cURL > Mailing List > Monthly Index > Single Mail

curl-tracker Archives

[curl:bugs] #1420 Pipelining + client-side timeouts lead to loss of synchronization with response stream and incorrect data delivery

From: Carlo Wood <libcw_at_users.sf.net>
Date: Tue, 28 Oct 2014 18:10:08 +0000

I fixed all http pipelining bugs that I could find with my testsuite.
You can find/get the commits from https://github.com/CarloWood/curl

The testsuite now gives this as output:

2014-10-28 19:07:28.399012: : Request #0 added [now running: 1]
<html><body>Reply 1 on connection 1 for request #0</body></html>
2014-10-28 19:07:28.399972: : Request #0 finished [now running: 0]
2014-10-28 19:07:28.400096: : Request #1 added [now running: 1]
2014-10-28 19:07:28.400111: : Request #2 added [now running: 2]
2014-10-28 19:07:28.400134: : Request #3 added [now running: 3]
2014-10-28 19:07:28.400145: : Request #4 added [now running: 4]
<... 1.000572 seconds ...>
2014-10-28 19:07:29.400717: : Request #1 TIMED OUT! [now running: 3]
2014-10-28 19:07:29.400759: : Request #2 TIMED OUT! [now running: 2]
2014-10-28 19:07:29.400773: : Request #4 TIMED OUT! [now running: 1]
2014-10-28 19:07:29.400786: : Request #5 added [now running: 2]
2014-10-28 19:07:29.400797: : Request #6 added [now running: 3]
2014-10-28 19:07:29.400808: : Request #7 added [now running: 4]
<html><body>Reply 1 on connection 2 for request #3</body></html>
<html><body>Reply 2 on connection 2 for request #5</body></html>
<html><body>Reply 3 on connection 2 for request #6</body></html>
<html><body>Reply 4 on connection 2 for request #7</body></html>
<... 0.100344 seconds ...>
2014-10-28 19:07:29.501152: : Request #3 finished [now running: 3]
2014-10-28 19:07:29.501188: : Request #5 finished [now running: 2]
2014-10-28 19:07:29.501201: : Request #6 finished [now running: 1]
2014-10-28 19:07:29.501213: : Request #7 finished [now running: 0]
2014-10-28 19:07:29.501227: : Request #8 added [now running: 1]
2014-10-28 19:07:29.501237: : Request #9 added [now running: 2]
<html><body>Reply 5 on connection 2 for request #8</body></html>
<... 0.100530 seconds ...>
2014-10-28 19:07:29.601767: : Request #8 finished [now running: 1]
<html><body>Reply 6 on connection 2 for request #9</body></html>
<... 0.037036 seconds ...>
2014-10-28 19:07:29.638803: : Request #9 finished [now running: 0]

After #1 times out, the connection is marked for closing (and it is actually closed, unlike in the previous two cases), a new connection is created (connection 2) which is used for request #3 that was in the pipe but didn't timeout yet, as well as all newly added requests.

In total 2 connections are made, and only 1 connection exists at any time.

---
** [bugs:#1420] Pipelining + client-side timeouts lead to loss of synchronization with response stream and incorrect data delivery**
**Status:** open
**Labels:** pipelining 
**Created:** Tue Sep 02, 2014 10:36 PM UTC by Monty Brandenberg
**Last Updated:** Tue Oct 28, 2014 06:05 PM UTC
**Owner:** nobody
I've been tracking a data corruption/missing http status problem and I think I have enough data for a useful bug report.
The problem centers around the handling of queued requests in a pipeline when preceding requests are failed in libcurl after committing to a request/response transaction.  In the table below, I show six GET requests pipelined on one connection.  'Time' is relative seconds since the creation of the connection.  The first three requests are processed normally.  The fourth request times out while processing the response body.  The fifth request times out waiting for the response header.  The sixth request is allowed to proceed but appears to be out-of-sync with the response stream.  I haven't dumped the data in verbose mode but I'd guess that the sixth request is consuming the remainder of the fourth request's response body in some demented fashion.
Request | Time | Event
------- | ---- | -----
0 | 0 | HEADEROUT issued
0 | 1 | First HEADERIN data
0 | 13 | Request completed, 200 status
1 | 0 | HEADEROUT issued
1 | 13 | First HEADERIN data
1 | 15 | Request completed, 200 status
2 | 0 | HEADEROUT issued
2 | 15 | First HEADERIN data
2 | 20 | Request completed, 200 status
3 | 0 | HEADEROUT issued
3 | 20 | First HEADERIN data
3 | 30 | Timeout declared (easy error 28)
3 | 30 | Request failed, easy 28 status
4 | 0 | HEADEROUT issued
4 | 30 | Timeout declared (easy error 28), no HEADERIN data seen
4 | 30 | Request failed, easy 28 status
5 | 13 | HEADEROUT issued
5 | 30 | First DATAIN received, NO HEADERIN data for this request
5 | 43 | Connection closed.  This may be in response to server socket close.
5 | 43 | Request appears to succeed but CURLINFO_RESPONSE_CODE returns 0.
The sixth request appears to succeed as far as multi_perform and multi_info_read are concerned.  But fetching CURLINFO_RESPONSE_CODE returns 0 for status.  
As a workaround, checking the status as above appears to be useful.  I'm not certain that's 100% reliable or that the connection will be broken immediately at that point.  This has the potential of going very wrong as far as data integrity goes.
If I understand this correctly, solutions would include:
* Canceling/failing a request that's active on a pipeline results in failure to all requests farther down the pipeline.
* Canceling/failing a request results in 'passivation' of the request.  It no longer interacts with the caller but remains active sinking data from the response until satisfied.
---
Sent from sourceforge.net because curl-tracker@cool.haxx.se is subscribed to https://sourceforge.net/p/curl/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/curl/admin/bugs/options.  Or, if this is a mailing list, you can unsubscribe from the mailing list.
Received on 2014-10-28

These mail archives are generated by hypermail.