RE: Many CLOSE_WAIT when handling lots of URLs
Date: Mon, 17 Feb 2014 10:04:05 +0100 (CET)
On Mon, 17 Feb 2014, Shao, Shuchao wrote:
> The attached patch is just to make sure the maxconnects limit works.
Thanks, merged and pushed. I also edited the documentation a little to make
this default behavior clearer.
>> See also CURLMOPT_MAX_TOTAL_CONNECTIONS for a more strict limit.
> Because the CURLMOPT_MAX_TOTAL_CONNECTIONS is set as unlimited as default,
> it is still easy to cumulate CLOSE_WAIT connections when the number of
> connections is too much.
You make it sound as if our job is primarily to limit the number of CLOSE_WAIT
connections. It really isn't.
The CLOSE_WAIT connections are mostly (AFAIK) unfortunate side-effects of us
keeping connections alive after we're done using them, and while kept like
that they get closed from the other end.
The multi interface allows you to add any number of handles to it, and libcurl
will try to do transfers for all of them. That can end up in an awful lot of
connections, but then the application asked for that! That's by design from
both the app and the library.
CURLMOPT_MAX_TOTAL_CONNECTIONS is a recent addition to libcurl that basically
allows an application to set up and add more handles than it wants connections
for. For example, when you know you want N transfers done but for whatever
reason don't want more than Y connections in use, you can push the queuing
logic into libcurl.
> 1. Set a proper default value for CURLMOPT_MAX_TOTAL_CONNECTIONS, but not
> just unlimited
I'm very firmly against that. Why do you add so many handles if you don't want
them handled when you add them? And if you want a maximum, then either don't
add so many handles or set CURLMOPT_MAX_TOTAL_CONNECTIONS.
I can't see how setting a default CURLMOPT_MAX_TOTAL_CONNECTIONS to a made up
number will benefit our users. (Unless _perhaps_ that limit is based somehow
on the maximum total of sockets a process is allowed to use in the specific
circumstances it runs.)
> 2. Use maxconnects to limit both done connections and in-use connections in
> cache, but not only limit the done connections as now. That's to say, in
> ConnectionDone(), we need to kill as many as idle connections in cache to
> make the number of connection is cache closer to maxconnects, but not just
> kill one idle connection as now. Do you think we need a fix around here?
I don't believe in changing the meaning of the variable, as that would affect
too many existing users and applications, but yes, when maxconnects is lowered
at run-time, just killing off a single one will not be enough.
I'm thinking we can change the if() in ConnectionDone() to a while() instead
and kill off connections until below the threshold. We should also add a test
case for that.
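To illustrate the if() versus while() point, here is a small self-contained
model. It is NOT libcurl's actual code: the toy_* structs and functions are
invented for this example, and picking the oldest idle connection as the
victim is an assumption made to keep it simple.

```c
#include <stddef.h>

#define TOY_SLOTS 16

/* Toy stand-in for a cached connection; not libcurl's real struct. */
struct toy_conn {
  int present;     /* slot holds a connection */
  int in_use;      /* a transfer currently owns it */
  long last_used;  /* larger means more recently used */
};

struct toy_cache {
  struct toy_conn slot[TOY_SLOTS];
  size_t count;    /* number of slots with present != 0 */
};

/* Return the least recently used idle connection, or NULL if none. */
static struct toy_conn *toy_oldest_idle(struct toy_cache *c)
{
  struct toy_conn *best = NULL;
  for(size_t i = 0; i < TOY_SLOTS; i++) {
    struct toy_conn *k = &c->slot[i];
    if(k->present && !k->in_use &&
       (!best || k->last_used < best->last_used))
      best = k;
  }
  return best;
}

/* Called when a transfer is done with conn. Returns how many cached
   connections were closed to get back within maxconnects. */
static size_t toy_connection_done(struct toy_cache *c,
                                  struct toy_conn *conn,
                                  size_t maxconnects)
{
  size_t closed = 0;
  conn->in_use = 0;  /* the connection goes back to being idle */

  /* while() instead of if(): keep closing idle connections until the
     cache is no longer over the limit, or nothing idle is left. */
  while(c->count > maxconnects) {
    struct toy_conn *victim = toy_oldest_idle(c);
    if(!victim)
      break;  /* everything left is in use; never close those */
    victim->present = 0;
    c->count--;
    closed++;
  }
  return closed;
}
```

With an if() there, a cache holding six connections and a maxconnects lowered
to two would shrink by only one per finished transfer. The loop brings it down
to the limit in a single call while leaving in-use connections alone.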
-- 
  / daniel.haxx.se
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html
Received on 2014-02-17