curl-library

[LIBSSH2_ERROR_EAGAIN problem] proposed patch for curl/libssh2 bugfix

From: Vlad Grachov <fmot.fics_at_gmail.com>
Date: Fri, 26 Sep 2008 18:08:10 +0900

Hi!
While investigating how SFTP is supported in curl, I found that curl does
not use select() to wait on sockets that would block. Instead it re-enters
the same state machine step and calls a libssh2 function that in turn calls
recv()/send(), over and over. If no data arrives on the socket, CPU load
goes to 100%.

When libssh2 is set to non-blocking mode, its functions return
LIBSSH2_ERROR_EAGAIN if recv()/send() returned EAGAIN. Upon receiving
LIBSSH2_ERROR_EAGAIN, curl's state machine just repeats the same step
(without a call to select/poll). This leads to a flood of recv/send calls
that all return EAGAIN and do nothing but waste CPU time. To test this I
wrote a small TCP tunneling program that waits for user input after
receiving 2600 bytes. curl used 100% CPU during the test: it did not
release the CPU and kept calling recv() until I let the tunnel program
transfer the remaining data.
To see the 100% CPU usage yourself, you can use my small tunneling program,
which stops transmitting after it receives a certain number of bytes:
http://fmot.ru/tcp_forward.cpp
Run
  ./a.out 23 127.0.0.1 22
on a host where SFTP works, and then run
  curl sftp://127.0.0.1:23 -u [username]
and curl will use 100% CPU time.

The problem also involves libssh2, because the LIBSSH2_ERROR_EAGAIN return
code carries no information about what caused the would-block condition.
Regardless of whether recv() or send() returned EAGAIN, all curl gets from
libssh2 is the same LIBSSH2_ERROR_EAGAIN error code.

I propose two patches: one to libssh2 and one to curl. Both modify the
libraries only in a minor way.

The patched libssh2 records whether send() or recv() would block in the
socket_block_direction variable of the LIBSSH2_SESSION struct. Its value is
one of the constants LIBSSH2_SOCKET_BLOCK_INBOUND or
LIBSSH2_SOCKET_BLOCK_OUTBOUND. The value can later be retrieved from outside
via a call to the new exported function
libssh2_session_block_direction(LIBSSH2_SESSION *session). The patched curl
does exactly that in the ssh_statemach_act function: if LIBSSH2_ERROR_EAGAIN
is returned, the patched ssh.c calls Curl_socket_ready with either the read
or the write socket set, depending on the return value of
libssh2_session_block_direction.
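
The mapping from block direction to the kind of wait can be sketched roughly
as follows. This is a standalone POSIX illustration, not the patch itself: it
does not call libssh2 or Curl_socket_ready, the constants BLOCK_INBOUND and
BLOCK_OUTBOUND are local stand-ins for the two constants named above, and
wait_for_direction is a hypothetical helper that uses poll().

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>   /* callers are expected to pass socket descriptors */

/* Local stand-ins for the two direction constants; the values are
   arbitrary and chosen only for this example. */
enum { BLOCK_INBOUND = 1, BLOCK_OUTBOUND = 2 };

/* Hypothetical helper: wait until fd is ready in the direction that
   would have blocked. Returns 1 if poll() reported an event (readiness,
   or an error/hangup condition), 0 on timeout, -1 if poll() failed. */
static int wait_for_direction(int fd, int direction, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(direction == BLOCK_INBOUND || direction == BLOCK_OUTBOUND);

    pfd.fd = fd;
    /* recv() would block: wait for readability.
       send() would block: wait for writability. */
    pfd.events = (direction == BLOCK_INBOUND) ? POLLIN : POLLOUT;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

A caller that gets a would-block result would query the direction, call a
helper like this with a suitable timeout, and only then retry the same state
machine step.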

The patched curl no longer wastes CPU when the socket would block. Tested
under Linux and Windows.

Best regards, Vlad Grachev / SolutionBox Inc.

Received on 2008-09-26