curl-library

Re: Project hiper - High Performance libcurl

From: Jamie Lokier <jamie_at_shareable.org>
Date: Thu, 10 Nov 2005 01:41:19 +0000

Cory Nelson wrote:
> > I think such fancy event/thread scheduler should not be part of Curl;
> > it should be a project in its own right. (One I've heard of for unix
> > is called libasync-smp. I'm slowly working on one myself, too).
>
> It is easy to do (on Windows, at least), and would help users a lot.
> If it's not in Curl, we lose the ability of easily writing
> cross-platform applications that work on all the major OSes.

A cross-platform library to implement good event/thread scheduling is
not a trivial undertaking. I know, because I am writing one, and it
has taken a year so far to do it right. Even allowing that I might be
a slow programmer, if such a thing is written it would be useful to
more applications than just the ones using Curl.

But far more importantly, if Curl were to _require_ that the blocking
happens inside Curl's code, then it would be difficult to use Curl in
programs which already have their own event handling for other things.
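
To make the point concrete, here is a minimal sketch of the
alternative, where the application owns the blocking call. The
`struct xfer` type and the functions `xfer_on_readable()` and
`app_loop_once()` are hypothetical stand-ins, not Curl's API; the only
assumption is a library that reports which descriptor it is waiting on
and never waits itself. (libcurl's multi interface exposes its
descriptors in a similar spirit through curl_multi_fdset().)

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for a transfer library that never blocks:
 * it only says which descriptor it is waiting on, and offers a
 * function to call once that descriptor is readable. */
struct xfer {
    int fd;           /* descriptor the library wants watched */
    long bytes_seen;  /* running total, for illustration only */
};

/* Called by the application when x->fd is readable.
 * Performs a single read and returns; it never waits. */
static void xfer_on_readable(struct xfer *x)
{
    char buf[256];
    ssize_t n = read(x->fd, buf, sizeof buf);
    if (n > 0)
        x->bytes_seen += n;
}

/* One turn of the application's own event loop. The library's
 * descriptor and the application's descriptor (app_fd) share a
 * single poll() call, so the blocking happens in application code.
 * Returns the number of ready descriptors, 0 on timeout, -1 on error. */
static int app_loop_once(struct xfer *x, int app_fd, long *app_bytes,
                         int timeout_ms)
{
    struct pollfd fds[2];
    fds[0].fd = x->fd;
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    fds[1].fd = app_fd;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    int ready = poll(fds, 2, timeout_ms);
    if (ready <= 0)
        return ready;

    if (fds[0].revents & POLLIN)
        xfer_on_readable(x);          /* library work */

    if (fds[1].revents & POLLIN) {    /* the application's own work */
        char buf[256];
        ssize_t n = read(app_fd, buf, sizeof buf);
        if (n > 0)
            *app_bytes += n;
    }
    return ready;
}
```

If the library insisted on calling poll() itself, the second
descriptor could only be served by a separate thread or by some hook
the library chose to provide.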

> Not having it in Curl would also mean we lose efficiency. By not
> being able to delegate which type of thread (I/O or non-I/O) to begin
> a request in, Curl would be forcing the user to never destroy any of
> the threads that Curl has touched, for fear of canceling I/O. We
> would no longer be able to intelligently pool threads by how heavy the
> workload is.

I don't see why you say the user couldn't cancel threads touched by
Curl. Surely, if the scheduler is outside Curl, the application would
know _exactly_ which threads Curl is using at all times?

> CPUs aren't getting any faster, but cores are being added on. The
> concept of multi-threaded coding needs to get a lot more popular, and
> this is a prime example of something that would really benefit from
> it.

Well, using multiple threads may be a prime example. But I thought
you could already use multiple threads with Curl?

> As I said above, it would be trivial to add a single lock to each easy
> handle. No fancy scheduling needed.

So you're suggesting to add the right locks so that the application's
scheduler is allowed to call Curl's event handlers from any of its
threads?

That would certainly be an easy API to understand, and I'd have no
complaint with it.

But it wouldn't be as efficient as you think. As soon as some Curl
code blocked on any lock, you'd need to create another thread somehow
to ensure the CPU is still used (otherwise there is no point in this
discussion). The result would be extra threads that aren't really
needed.
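
A small sketch of what I mean, using POSIX threads. The `struct
handle` type and both dispatch functions are hypothetical and only
illustrate the locking pattern; nothing here is Curl code. The first
variant is the "single lock per handle" idea as proposed. The second
is one possible alternative (my own assumption, not something proposed
in this thread), in which a busy handle is reported back to the
scheduler instead of putting the thread to sleep.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-transfer handle guarded by a single lock. */
struct handle {
    pthread_mutex_t lock;
    int events_handled;
};

static void handle_init(struct handle *h)
{
    pthread_mutex_init(&h->lock, NULL);
    h->events_handled = 0;
}

/* Blocking variant: if another thread is already inside the handler
 * for this handle, the calling thread sleeps here, and the CPU it
 * was running on goes idle unless some other thread exists to use it. */
static void dispatch_blocking(struct handle *h)
{
    pthread_mutex_lock(&h->lock);
    h->events_handled++;              /* stand-in for the real handler */
    pthread_mutex_unlock(&h->lock);
}

/* Non-blocking variant: returns false when the handle is busy, so
 * the scheduler can hand this thread some other event instead. */
static bool dispatch_try(struct handle *h)
{
    if (pthread_mutex_trylock(&h->lock) != 0)
        return false;                 /* busy: caller should requeue */
    h->events_handled++;
    pthread_mutex_unlock(&h->lock);
    return true;
}
```

The second variant avoids the idle thread, but only by pushing the
decision about what to run next back into the scheduler, which is the
hard part.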

> This is simple to do for Windows. There really is no reason not to.
> I'm not familiar with kqueue etc but I doubt it would be very complex
> for them either.

The complex part isn't the part which calls kqueue, although that's
quite a lot of work to do, and more to do efficiently, for all the
different variations on the kqueue idea (epoll, RT signals, kqueue,
/dev/poll, port_create, and IOCP), which is why it should be a library
of its own if Curl were to depend on it.
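
For illustration, the usual shape of such a library is a small table
of operations with one implementation per mechanism. The sketch below
is hypothetical throughout (`struct ev_backend`, `poll_add`,
`poll_wait` and `MAX_FDS` are names made up for this example), and it
fills the table with plain poll() only because that is portable enough
to try anywhere. An epoll or kqueue backend would supply the same two
functions.

```c
#include <poll.h>
#include <stddef.h>

#define MAX_FDS 64   /* arbitrary limit for this sketch */

/* Operations every backend must provide. The caller sees only this
 * table, never the underlying system calls. */
struct ev_backend {
    const char *name;
    int (*add)(void *state, int fd);
    int (*wait)(void *state, int *ready, int max_ready, int timeout_ms);
};

/* State for the poll() backend. */
struct poll_state {
    struct pollfd fds[MAX_FDS];
    int count;
};

/* Register fd for read readiness. Returns 0, or -1 if the table is full. */
static int poll_add(void *state, int fd)
{
    struct poll_state *ps = state;
    if (ps->count >= MAX_FDS)
        return -1;
    ps->fds[ps->count].fd = fd;
    ps->fds[ps->count].events = POLLIN;
    ps->fds[ps->count].revents = 0;
    ps->count++;
    return 0;
}

/* Wait up to timeout_ms and copy ready descriptors into ready[].
 * Returns how many were stored, 0 on timeout, -1 on error. */
static int poll_wait(void *state, int *ready, int max_ready, int timeout_ms)
{
    struct poll_state *ps = state;
    int n = poll(ps->fds, (nfds_t)ps->count, timeout_ms);
    if (n <= 0)
        return n;

    int out = 0;
    for (int i = 0; i < ps->count && out < max_ready; i++) {
        if (ps->fds[i].revents & (POLLIN | POLLHUP | POLLERR))
            ready[out++] = ps->fds[i].fd;
    }
    return out;
}

static const struct ev_backend poll_backend = {
    "poll", poll_add, poll_wait
};
```

Even this toy version hides differences that matter in practice: poll()
rescans every descriptor on each call, while the newer mechanisms keep
the registered set in the kernel, so an efficient backend for each one
ends up looking quite different behind the same table.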

The complex part is deciding which event handlers to run on which
threads - and also deciding when to create more threads or destroy
them. Most OSes don't have a "create another thread if one blocks" or
a "create another thread if a CPU becomes available" facility.

That's interesting work, but there is no obviously perfect way to do
it (it's still a very active research field - lots of testing, lots of
heuristics). Which is why Curl should be able to work with different
people's implementations of such methods.

-- Jamie
Received on 2005-11-10