
curl-library

Re: Cacheing In

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Mon, 18 Feb 2002 16:33:31 +0100 (MET)

On Sun, 17 Feb 2002 CHRIS.CLARK_at_FRIENDSPROVIDENT.CO.UK wrote:

> I have to say first that I'm amazed and impressed at your sheer volume and
> quality of input to cURL and libcurl. Not to mention the Web site, the
> mailing lists, and the full-time job. I guess sorting through everyone
> else's comments and contributions and evaluating and prioritising them must
> keep you off the streets as well. I hope the vacation has recharged your
> batteries!

Thanks for throwing some moral support this way. I appreciate it. Yes, the
vacation was refreshing and relaxing (if skiing eight hours a day can be
considered relaxing)!

I have a fair number of curl mails to digest on my return. I'm trying to get
on top of them right now, to be able to apply some of the most important
patches to get another pre-release out and then fix the test suite so that we
can soon release a public 7.9.5.

What comes beyond that is only talk on my behalf right now. (Though I
appreciate others' contributions and talk as well, of course.)

> I guess the performance issue that I raised for files (certificates and
> keys) more properly belongs to your much wider "sharing is caring"
> discussion, since that seems to cover many of the same areas (cacheing,
> sharing objects across threads and curl handles, portable support for
> multithreaded apps with or without mutexes, and so on). And doing all of
> that will be no small undertaking, I suspect.

I think you're on the right track; this kind of caching should also be
possible to share.

However, as with the other "caches", it should also work on a single handle,
and thus we should first introduce a cache for a single handle that can later
be extended to be shared between handles etc.
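
A rough sketch, just to make the idea concrete (none of these names exist in
libcurl, they're made up for illustration): a per-handle cache entry keeps
the file name and timestamp so a change can be detected, and the cache struct
could later grow a lock when it gets shared between handles.

  #include <stddef.h>
  #include <time.h>

  struct cert_cache_entry {
    char *filename;       /* the file this data was loaded from */
    time_t mtime;         /* modification time at load, for change checks */
    unsigned char *data;  /* the raw certificate/key bytes */
    size_t len;           /* number of bytes at 'data' */
  };

  struct cert_cache {
    struct cert_cache_entry *entries;
    size_t count;
    /* a mutex/lock hook would go here once the cache is shared
       between handles, as in the "sharing is caring" discussion */
  };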

> However, in the meantime here's a couple of smaller scale suggestions on
> the particular point I raised. One option would be to use the existing
> APIs, but amend the perform() code to do its own cacheing for any of the
> referenced files (certificates and keys). It would have to check that the
> filenames were still the same as last time, and reload them if they'd
> changed. And for those awkward customers who are cussed enough to swap the
> contents of the files between calls, but keep the same filenames, maybe a
> file timestamp comparison as well. But as the whole point is to avoid
> hitting the file system again, and catering for minority tastes would spoil
> things for the majority, they could take a hike.

I would even consider a case where the application knows the files haven't
changed, and thus libcurl won't even need to check whether they're different
or not.

However, AFAIK all this talk about caching SSL data such as certificates is
very dependent on what the OpenSSL interface offers, so we should take a dive
into that dark forest first to see if we can "load" stuff from memory instead
of from files.
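
For what it's worth, OpenSSL does seem to offer memory BIOs for this. A
minimal sketch (the function name and error handling are mine, nothing from
libcurl) of loading a PEM certificate from a buffer instead of a file:

  #include <openssl/ssl.h>
  #include <openssl/pem.h>
  #include <openssl/bio.h>

  static int use_cert_from_memory(SSL_CTX *ctx, const void *pem, int pemlen)
  {
    BIO *bio = BIO_new_mem_buf((void *)pem, pemlen);
    X509 *cert;
    int rc = -1;

    if(!bio)
      return -1;

    /* parse the PEM data straight from memory */
    cert = PEM_read_bio_X509(bio, NULL, NULL, NULL);
    BIO_free(bio);
    if(!cert)
      return -1;

    /* hand the certificate to the SSL context; it keeps its own
       reference, so we can drop ours afterwards */
    if(SSL_CTX_use_certificate(ctx, cert) == 1)
      rc = 0;
    X509_free(cert);
    return rc;
  }

The private key would presumably go the same way with
PEM_read_bio_PrivateKey() and SSL_CTX_use_PrivateKey(), but that is exactly
the kind of detail the dive would have to verify.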

> Another option would be to extend the existing APIs by introducing a new
> (but small) set of CURLOPTs, paralleling the current filename-oriented
> CURLOPTs.

I think we will need new options to make this go in as smoothly as possible.
The options could be designed in a few different ways.

> The pointer parameter for these new CURLOPTs would point to an [address +
> length] structure, which would contain everything needed to get the data
> from application memory instead of from a file.

... if OpenSSL allows this.
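
Just to illustrate the quoted idea (these names are invented for this mail
and do not exist in libcurl): the new options could take a small struct
carrying a pointer and a length, and resetting the pointer to NULL would fall
back to the file-based behaviour again.

  #include <stddef.h>

  struct curl_mem_blob {
    void *data;   /* certificate/key bytes in application memory */
    size_t len;   /* number of bytes at 'data' */
  };

  /* hypothetical usage:
       struct curl_mem_blob blob = { cert_bytes, cert_len };
       curl_easy_setopt(curl, CURLOPT_SSLCERT_MEMORY, &blob);
     ...and later, to force the file-based path again:
       curl_easy_setopt(curl, CURLOPT_SSLCERT_MEMORY, NULL);
  */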

> The perform() API could be amended to get the data by preference from
> application memory, so it wouldn't have to do its own cacheing.
> Applications could also use these new CURLOPTs to reset the pointers to
> NULL again (or even just reset their own pointer in the [address + length]
> structure), so the application could easily switch off cacheing, and force
> file reloads if it wanted to. So this way we cater for the awkward squad as
> well, without impacting everyone else.

Either that, or we have an option named SSL_CERT_CACHE_CONTROL_TOOL (a name
silly enough not to remain like this) that can take a few different values:
one that checks if the cert has changed between each use and reloads it if it
has, one that assumes the cert is never changed, and one that enforces a
reload of the cert in the upcoming request. Or similar.
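
In rough code, the values could look something like this (all names here are
placeholders, just like the silly option name above):

  /* hypothetical values for a CURLOPT_SSLCERT_CACHE_CONTROL style option;
     none of this exists in libcurl */
  enum {
    CURL_CERTCACHE_CHECK,   /* stat the file each time, reload if changed */
    CURL_CERTCACHE_ASSUME,  /* trust the cached copy, never re-check */
    CURL_CERTCACHE_RELOAD   /* force a reload on the upcoming request */
  };

  /* hypothetical usage:
       curl_easy_setopt(curl, CURLOPT_SSLCERT_CACHE_CONTROL,
                        CURL_CERTCACHE_ASSUME);
  */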

> A minor variation on this theme, if you don't like the idea of introducing
> new CURLOPTs, would be to have "special" values for the pointer for the
> existing CURLOPTs

Nah, I don't like scary stuff like that! ;-)

> But either option would solve my original problem, and make libcurl HTTPS
> performance in a high-volume, multithreaded environment even better.

I am very much in favor of changes that'll make libcurl better in this
context.

-- 
    Daniel Stenberg -- curl groks URLs -- http://curl.haxx.se/
Received on 2002-02-18