
curl-library

Re: libcurl and async I/O

From: Jamie Lokier <jamie_at_shareable.org>
Date: Tue, 19 Aug 2008 03:09:53 +0100

Andrew Barnert wrote:
> Sorry; I didn't notice all of the other replies before I sent mine.
>
> 18 Aug 2008 15:07, Cory Nelson:
> [lots of snipping not reflected below]
> > On Mon, Aug 18, 2008 at 2:32 PM, Daniel Stenberg <daniel_at_haxx.se>
> > wrote:
> > > For an ordinary application with < 100 connections, why would an app
> > > particularly insist on such an asynch API for libcurl?
>
> My particular reason is that I'm using asio for the native protocol, and
> I want to use the same loop for the tunneled protocol. Yes, I could just
> create another thread and put a select loop in there for the libcurl
> sockets, and add all of the inter-thread synchronization stuff, but you
> can see why I wouldn't want to do this.
>
> Other reasons:
>
> 1. Your app might one day need to handle thousands of connections.
>
> 2. Even if you only need 70, select won't do that many on Windows.

Btw, libevent seems to have code which expands the fd_sets dynamically
on Windows. Does that mean it can get past the 64 socket limit?
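
For what it's worth, Winsock's fd_set is a counted array of SOCKETs
rather than a bitmask, so the simple compile-time version of the same
trick is just to define FD_SETSIZE before including winsock2.h.
Untested sketch:

  /* On Winsock the 64 is just the default FD_SETSIZE; redefining it
   * before winsock2.h enlarges the per-set array.  (libevent apparently
   * goes further and grows the arrays at runtime.) */
  #define FD_SETSIZE 1024            /* must come before winsock2.h */
  #include <winsock2.h>

  int wait_readable_any(SOCKET *socks, int nsocks)
  {
    fd_set readfds;
    int i;

    FD_ZERO(&readfds);
    for (i = 0; i < nsocks; i++)
      FD_SET(socks[i], &readfds);    /* up to FD_SETSIZE sockets fit */

    /* Winsock's select() ignores its first argument */
    return select(0, &readfds, NULL, NULL, NULL);
  }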

> 3. There is no better cross-platform solution than select, without
> going to hefty libraries like ACE reactor. (I'm assuming libevent
> still uses select on Windows, right?)

libevent used to have experimental code using WSAEvent* I think, but
it's gone now, in favour of select() with dynamic fd_sets. WSAEvent*
was limited to 64 events anyway!

> 4. If you think in terms of asio (as die-hard Windows networking types
> do), ready-notification is as backwards and hard to wrap your head
> around as asio is for everyone else. (And remember, these are
> Windows programmers, who aren't notorious for flexibility.)
>
> 5. The boost.asio library is pretty nice, especially if your app is
> already highly boost-y; you might just prefer writing boost.asio
> code to writing select or libevent code.
>
> > > What possibly is lacking on Windows, as Jamie Lokier possibly
> > > suggests, is an event-based system that is good enough or suitable
> > > to do the job in a convenient manner!
>
> You're suggesting that Windows should have something that looks a
> lot like epoll/kqueue/etc., and that's just as efficient as them? That
> would be nice. Even poll would be nice. But it's not going to happen.

Windows does have poll: WSAPoll(). Vista or later, sorry :-)
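
Usage is just like POSIX poll(), with no 64-handle limit. Untested
sketch (needs _WIN32_WINNT >= 0x0600):

  #define _WIN32_WINNT 0x0600
  #include <winsock2.h>

  /* returns >0 if readable, 0 on timeout, SOCKET_ERROR on failure */
  int wait_readable(SOCKET s, int timeout_ms)
  {
    WSAPOLLFD pfd;

    pfd.fd = s;
    pfd.events = POLLRDNORM;
    pfd.revents = 0;
    return WSAPoll(&pfd, 1, timeout_ms);
  }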

It does have an event registration and notification queue -
WSAAsyncSelect(). You get Windows messages sent to your private
hidden window for each socket state change. I have no idea how
efficient they are, or if there's a limit on the message queue length.
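
Registration is a one-liner per socket, and the window procedure gets
the event and any error packed into lParam. Rough sketch (WM_SOCKET is
just an arbitrary message number I picked):

  #include <winsock2.h>
  #include <windows.h>

  #define WM_SOCKET (WM_APP + 1)

  void watch_socket(SOCKET s, HWND hwnd)
  {
    /* one message per state change, queued like any other message */
    WSAAsyncSelect(s, hwnd, WM_SOCKET, FD_READ | FD_WRITE | FD_CLOSE);
  }

  LRESULT CALLBACK wndproc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp)
  {
    if (msg == WM_SOCKET) {
      SOCKET s = (SOCKET)wp;
      int event = WSAGETSELECTEVENT(lp);   /* FD_READ, FD_WRITE, ... */
      int error = WSAGETSELECTERROR(lp);
      /* hand (s, event, error) to whatever event loop you have */
      (void)s; (void)event; (void)error;
      return 0;
    }
    return DefWindowProc(hwnd, msg, wp, lp);
  }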

> As I understand things, Microsoft has determined that there's no way
> to add an epoll-style API to the Windows kernel that scales well to
> multiple cores, while the linux networking team has determined that
> there's no way to add aio networking to the linux kernel in a way
> that's as low-overhead as epoll, so we may be stuck with these two
> different interfaces forever. (And of course Sun will provide both,
> and 20 others besides.)

And neither method is optimal!

Ideally, you don't want to commit 10000 large receive buffers to
overlapped reads which may take a long time to receive anything. You
want to allocate buffers from a smaller pool as data becomes
available. For writes it's similar, but less clear cut.
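
The usual Windows trick for that, if I remember right, is the
zero-byte overlapped read: the completion only tells you "data has
arrived", no buffer is pinned while the socket sits idle, and you take
a real buffer from the pool only at that point. Untested sketch:

  #include <winsock2.h>

  /* post a zero-length overlapped receive: nothing is locked, the
   * completion just signals that data is now available */
  int post_zero_byte_read(SOCKET s, WSAOVERLAPPED *ov)
  {
    WSABUF buf = { 0, NULL };
    DWORD received = 0, flags = 0;

    return WSARecv(s, &buf, 1, &received, &flags, ov, NULL);
  }

  /* when the completion for 'ov' is dequeued, borrow a buffer from the
   * shared pool and read for real */
  int drain_socket(SOCKET s, char *poolbuf, int poolbuflen)
  {
    return recv(s, poolbuf, poolbuflen, 0);
  }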

> But the secondary difference is that you have to replace your recv
> and send calls with ReadEx/WriteEx, aio_read/aio_write, etc., which
> means that handy wrappers like SSL_read designed to drop in as
> replacements for recv and send can't be used. And this one _is_
> super inconvenient.

All the more reason to have code patterns which can adapt in both
directions, so the two styles of code can be joined together.

> I'm not sure I agree. Even forgetting about SSL/SSH/Kerb, the socket
> callback is going to need the buffer to be read/written so it can make
> the async read/write call. That means either exposing internal info
> about how the connectdata, SingleRequest, etc. structs work, changing
> things so that the app manages the buffers instead of libcurl, or
> putting the buffers into the API.

Or do some data copying, and simulate an epoll-style API for curl's
benefit. I.e. pretend the socket is "ready for write", let curl write,
you start an async write and pretend it's not ready any more, until
that async write finishes. (Or overlap a few more, as you like.)
Similar for reading.
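
In code, the write half of that pretence might look something like
this (not wired into libcurl at all, just the bare pattern, and all
the names are mine):

  #include <winsock2.h>
  #include <string.h>

  struct fake_ready_sock {
    SOCKET s;
    WSAOVERLAPPED ov;
    char copy[16384];       /* our copy of the caller's data */
    int write_in_flight;    /* 0 means we pretend "ready for write" */
  };

  /* the caller believed the socket was writable and hands us bytes */
  int fake_send(struct fake_ready_sock *fs, const char *data, int len)
  {
    WSABUF buf;
    DWORD sent = 0;

    if (fs->write_in_flight || len > (int)sizeof(fs->copy))
      return -1;                   /* "not ready", come back later */

    memcpy(fs->copy, data, len);   /* the data copying mentioned above */
    buf.buf = fs->copy;
    buf.len = len;
    memset(&fs->ov, 0, sizeof(fs->ov));
    fs->write_in_flight = 1;       /* not "ready" again until it's done */
    return WSASend(fs->s, &buf, 1, &sent, 0, &fs->ov, NULL);
  }

  /* called when the completion for fs->ov is dequeued */
  void fake_send_done(struct fake_ready_sock *fs)
  {
    fs->write_in_flight = 0;       /* the socket looks writable again */
  }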

Raw data copying is less expensive than it used to be: the scalability
issues with 10000 sockets aren't about that, but about queuing
intelligently, so that might be fine.

-- Jamie
Received on 2008-08-19