curl-library

Re: "100% CPU usage during SFTP transfer" bugfix

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Tue, 18 Aug 2009 14:12:41 +0200 (CEST)

On Tue, 18 Aug 2009, Vourhey wrote:

> send(4,
> "D\320kE\33\10\253:\241~O\261\2407l\346\313\2275^\313X\305\303RY\264Y\201foH|"...,
> 16452, MSG_NOSIGNAL) = 16452 //sending data from the file

That's libssh2 doing the send().

> recv(4, 0x827c7e4, 16384, MSG_NOSIGNAL) = -1 EAGAIN (Resource temporarily

And that's libssh2 doing a recv(), since SFTP involves a lot of back-and-forth
packets even while uploading.

> poll([{fd=4, events=POLLOUT}], 1, 1000) = 1 ([{fd=4, revents=POLLOUT}])

This is the error. It was recv() that returned EAGAIN, so we should wait until
there is more data to receive. Instead, this poll() waits for POLLOUT, that
is, for the socket to become writeable. And it already is writeable, so poll()
returns immediately...

> recv(4, 0x827c7e4, 16384, MSG_NOSIGNAL) = -1 EAGAIN (Resource temporarily

... but still not readable, so libssh2 returns EAGAIN at once again! :-/

(and on and on and on...)
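
The spin described above can be reproduced outside libcurl. The following is
a minimal sketch in C, assuming a POSIX system. It uses a local socketpair()
in place of the SSH connection, and the names wait_ms_for and
demo_wrong_direction are made up for illustration. Nothing is ever written by
the peer, so the socket is writable but not readable, which is the state the
strace output shows.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: poll a single descriptor for the given events.
   Returns poll()'s result: 1 if ready, 0 on timeout, -1 on error. */
int wait_ms_for(int fd, short events, int timeout_ms)
{
    struct pollfd pfd;

    assert(fd >= 0);

    pfd.fd = fd;
    pfd.events = events;
    pfd.revents = 0;
    return poll(&pfd, 1, timeout_ms);
}

/* Returns 1 if the socket behaves as described in the mail, 0 otherwise. */
int demo_wrong_direction(void)
{
    int sv[2];
    int out_ready, in_ready;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return 0;

    /* Wrong direction: the send buffer is empty, so POLLOUT is
       reported at once and the 1000 ms timeout is never used. */
    out_ready = wait_ms_for(sv[0], POLLOUT, 1000);

    /* Right direction: the peer has sent nothing, so waiting for
       POLLIN actually sleeps until the (short) timeout expires. */
    in_ready = wait_ms_for(sv[0], POLLIN, 50);

    close(sv[0]);
    close(sv[1]);
    return out_ready == 1 && in_ready == 0;
}
```

Calling recv() after each POLLOUT wakeup in a loop would give the same
recv/poll/recv pattern as in the trace, with no sleeping in between.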

> So, poll and clock_gettime are invoked from the libcurl.

Of course. But libssh2 tells libcurl which "direction" to wait for traffic in,
and based on that info libcurl will poll() accordingly. Somehow this logic
fails and libcurl ends up waiting for the wrong action.

Do you by any chance have more than one libssh2 installation on your system?
I mean, perhaps when libcurl was built it thought you had an older libssh2 and
thus doesn't use the correct direction-checking function (which was added in a
rather recent libssh2 version, I can't recall exactly which atm).
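
To illustrate the direction logic: libssh2 provides
libssh2_session_block_directions(), presumably the function meant here, which
reports whether the session is blocked on inbound data, outbound data, or
both. The caller is expected to translate that into poll() events. The sketch
below shows only that translation. To stay self-contained it uses stand-in
constants (DIR_INBOUND, DIR_OUTBOUND) and a hypothetical function name instead
of the real libssh2 header, and it is not libcurl's actual code.

```c
#include <assert.h>
#include <poll.h>

/* Stand-ins for libssh2's LIBSSH2_SESSION_BLOCK_INBOUND and
   LIBSSH2_SESSION_BLOCK_OUTBOUND bits, defined locally so the
   sketch compiles without the libssh2 headers. */
#define DIR_INBOUND  0x1
#define DIR_OUTBOUND 0x2

/* Hypothetical helper: turn a direction bitmask into poll() events. */
short directions_to_events(int directions)
{
    short events = 0;

    /* Only the two direction bits are meaningful here. */
    assert((directions & ~(DIR_INBOUND | DIR_OUTBOUND)) == 0);

    if (directions & DIR_INBOUND)
        events |= POLLIN;   /* blocked on recv(): wait until readable */
    if (directions & DIR_OUTBOUND)
        events |= POLLOUT;  /* blocked on send(): wait until writable */
    return events;
}
```

In the trace above it was recv() that returned EAGAIN, so the expected result
would be POLLIN rather than the POLLOUT that was actually passed to poll().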

>> libcurl is supposed to do that blocking magic on its own using the same
>> mechanism that libssh2 uses internally when blocking is selected. It would
>> then rather indicate that libssh2 _is_ right and that libcurl gets confused
>> for some reason.
>
> Could you please provide info, how to enable this mechanism on libcurl
> level?

It is already there and is used. What you see is a bug in that mechanism.

> Blocking mode in the libssh2 is performed by non-blocked socket and the
> select syscall.

Yes, libssh2 always has the actual socket non-blocking even when it provides a
blocking API.
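
A rough sketch of that pattern, not libssh2's actual code: the socket stays in
non-blocking mode, and the "blocking" call simply loops, waiting for the
socket to become readable whenever recv() reports EAGAIN. The names
set_nonblocking and blocking_recv and the timeout parameter are made up for
illustration, and poll() is used here where the quoted text mentions select.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into non-blocking mode. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical blocking-style read on a non-blocking socket.
   Returns bytes read, 0 on EOF, -1 on error or timeout (errno is
   set to ETIMEDOUT on timeout). The timeout applies to each wait. */
ssize_t blocking_recv(int fd, void *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL);

    for (;;) {
        struct pollfd pfd;
        int rc;
        ssize_t n = recv(fd, buf, len, 0);

        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;

        /* recv() would block, so wait for the socket to become readable. */
        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        rc = poll(&pfd, 1, timeout_ms);
        if (rc == 0) {
            errno = ETIMEDOUT;
            return -1;
        }
        if (rc < 0 && errno != EINTR)
            return -1;
    }
}
```

The caller never sees EAGAIN from such a wrapper, but it also cannot do
anything else until the wrapper returns.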

> So, libcurl just waits for return from a libssh2 call ("libssh2_sftp_write",
> probably). After that it will invoke the same code. Could you please tell
> me, what kind of issues can this cause?

It blocks libcurl until the entire call is complete, which prevents libcurl
from doing what it would normally do in the meantime when the call is made in
a non-blocking manner. As I said, everything in libcurl works on the
assumption that the transport layer is non-blocking. For the exact details,
just read the code or test what happens.

-- 
  / daniel.haxx.se
Received on 2009-08-18