Re: Multi cURL connect bug

From: Keyur Govande <keyurgovande_at_gmail.com>
Date: Mon, 8 Jul 2013 18:06:25 -0400

On Mon, Jul 8, 2013 at 3:24 PM, Dan Fandrich <dan_at_coneharvesters.com> wrote:
> On Mon, Jul 08, 2013 at 02:14:05PM -0400, Keyur Govande wrote:
>> I respectfully disagree. Asynchronous RPC without using a separate
>> thread is not that uncommon and is definitely not wrong. In C land
>> this would be perfectly reasonable to do:
>> open a non-blocking socket()
>> connect() with timeout. If successful {
>>     loop over write() until finished
>>     // do other stuff
>>     poll() on fd with timeout
>>     read() from fd
>> }
>> close()
>> // do more stuff
>>
>> The application I'm trying to do this in is PHP, so a separate thread
>> is not an option.
>
> You can do this form of communication with libcurl, but not exactly in the way
> you describe. Since libcurl handles the write and the read in the same
> function, the // do other stuff part has to be combined with the loop and poll
> sections before and afterward. It's not always easy to cleanly separate the
> write and read states, anyway, nor would you necessarily want to. Consider a
> hypothetical RPC call with a slow DNS lookup so the connect state takes 5
> seconds, the write state that takes 100 msec, a processing state length of 10
> msec, and a multi-part response that takes 100 msec to return the first part,
> then a 10 second delay, then another 100 msec to return the second part. The
> pseudo code above would give you only 10 msec out of 15,310 msec to "do other
> stuff".
>
> The libcurl-style would look something like this:
>
> start curl transfer()
> loop until transfer and stuff done
>     poll()
>     if readable_or_writeable
>         process curl transfer()
>     if stuff == 0
>         do stuff()
>     else if stuff == 1
>         do more stuff()
>     else if stuff == 2
>         do even more stuff()
>     stuff=stuff+1
>
> The difference is that the process curl transfer part is executed regularly, on
> every iteration through the loop, and not in separate write and read sections.
> libcurl isn't ever CPU-hungry and the process curl transfer part will never
> take more than msec (assuming a properly-configured libcurl), so out of the
> 15,310 msec this transfer takes, libcurl would give the app more like 15,300
> msec of time to do stuff (instead of 10 msec in the previous example).
>
> You could of course hide the libcurl things in a couple of functions,
> so the code could look more like:
>
> start rpc()
> do stuff()
> is_rpc_done() // ignore the result--we're not ready for the rpc to be done
> do more stuff()
> is_rpc_done()
> do even more stuff()
> while not is_rpc_done()
>     just wait
>
> The trick is keeping the time between calling libcurl down. This could be done
> by splitting the stuff to do into small enough segments to give libcurl enough
> opportunities and low latency to do the transfer, or finding regular times
> while doing stuff to call is_rpc_done() to give libcurl a chance to work, or by
> calling libcurl in a loop with a short timeout in between doing stuff.
>

Thanks Dan for the detailed response.

I agree that, where feasible, what you suggest would work well. But in
most large pre-existing code bases there is no practical way to keep
calling cURL functions once you are a few class hierarchies deep.
Passing the cURL object all the way down just to do so seems leaky
and would involve a lot of changes.

I realize libcurl is not the answer to every problem :-) What led me
down this road was this statement in the Objectives section of
http://curl.haxx.se/libcurl/c/libcurl-multi.html: 'Enable a "pull"
interface. The application that uses libcurl decides where and when to
ask libcurl to get/send data'. I read that as: ask libcurl to send
first, and then the app asks it to receive when it is ready.

To take your example, a more conventional RPC would look like:
1) DNS lookup + connect in 10ms
2) send request in 20ms
(external service takes 200ms to respond)
3) poll() with timeout. If timeout, assume RPC failed and move on.
4) If #3 succeeded, receive response in 20ms
5) close connection

The goal is to run other code after step 2 and before step 3, while
the external service is still processing.
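
The steps above could be sketched with plain POSIX calls roughly as
follows. This is only an illustration: rpc_send() and rpc_receive() are
hypothetical names, the descriptor is assumed to be an already connected
stream socket, and the return-code convention (-2 for timeout) is made
up for this example.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Step 2: write the whole request. Retries on short writes and EINTR,
 * and waits for writability if a non-blocking socket reports EAGAIN.
 * Returns 0 on success, -1 on error. */
int rpc_send(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
                if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                    return -1;
                continue;
            }
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Steps 3 and 4: wait up to timeout_ms for the reply, then read it.
 * Returns the number of bytes read (0 means the peer closed),
 * -1 on error, or -2 if nothing arrived before the timeout. */
ssize_t rpc_receive(int fd, char *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0)
        return -2;              /* timed out: assume the RPC failed */
    if (rc < 0)
        return -1;
    return read(fd, buf, cap);  /* a single read; a real client may loop */
}
```

The application would call rpc_send(), run its other code, and only
then call rpc_receive() with whatever timeout it can afford.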

To solve my immediate problem, where connect() takes long(er), while
still using libcurl, the solution seems to be to pass in an already
connected socket file descriptor using CURLOPT_OPENSOCKETFUNCTION and
CURLOPT_SOCKOPTFUNCTION, and to assume that when curl_multi_perform()
returns CURLM_OK the entire request has been flushed. I would then
call curl_multi_wait() when I'm ready to receive the response. The
other option is to use a different library or raw sockets to do
exactly what I need.
Received on 2013-07-09