
curl-users

Scripting curl from the inside out (Re: curl the next few years)

From: Rich Gray <rgray_at_plustechnologies.com>
Date: Thu, 19 Jun 2014 13:15:07 -0400

Daniel Stenberg wrote on the libcurl list:
>
> New stuff - curl
> ================
>
> 1. Embed a language interpreter (lua?). For that middle ground where curl
> isn’t enough and a libcurl binding feels “too much”.

I have been toying with the idea of an option which would allow control of
curl via scripting *between* multiple requests, allowing a single curl
instance to perform multiple transfers to one or more hosts. Because curl
would not exit between commands, connections could stay open, DNS caches
could be maintained, etc., making repeated accesses more efficient.

Imagine a "--exec <program>" option which would invoke a program (script) at
the completion of a command. Curl would pass the just-completed command
string, the result, and other useful info to the program as arguments and/or
environment variables. The program would check the result of the curl run,
optionally operate on any data and then communicate back to curl what it
wants to do next (if anything) by sending commands back to curl through a
pipe.

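  # Process the received data, then tell curl to upload the result via FTP.
  # Lines written to $CURL_CMD_HANDLE are read by curl as instructions:
  #   env NAME=value  -> put in the environment of the next --exec call
  #   cmd ARGS        -> the next curl command to run
  {
  somecommand < received_data > out_data
  echo "env STATE=sentftp"
  echo "cmd -T out_data -u u:p ftp://ftp.example.com/path"
  } >&$CURL_CMD_HANDLE
  exit 0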

The invoked programs would need to essentially be state machines where each
curl command result gets processed then a new command specified. This could
be done via a single program maintaining state by modifying the environment
as shown above. Or perhaps each program could change the --exec program so
that each program is a state (ugly!). The --exec'd programs can be written
in whatever language the user wants.
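
A minimal sketch of the single-program variant, to make the state machine
idea concrete. Everything here is hypothetical: --exec does not exist, and
the sketch assumes that curl would export CURL_RESULT (the exit code of the
finished transfer), hand back STATE as set by an earlier "env" line, and
read replies from the descriptor named in CURL_CMD_HANDLE. The example.com
URLs and file names are placeholders.

```shell
#!/bin/sh
# Hypothetical --exec handler: one script, several states.
# CURL_RESULT, STATE and CURL_CMD_HANDLE are assumed to be provided by curl.

next_step() {
    state=$1
    result=$2
    if [ "$result" -ne 0 ]; then
        # Emit no "cmd" line, so curl has nothing further to run.
        echo "env STATE=failed"
        return 1
    fi
    case $state in
        start)
            echo "env STATE=fetched"
            echo "cmd -o received_data http://www.example.com/data"
            ;;
        fetched)
            echo "env STATE=sentftp"
            echo "cmd -T received_data -u u:p ftp://ftp.example.com/path"
            ;;
        sentftp)
            echo "env STATE=done"
            ;;
        *)
            echo "env STATE=failed"
            return 1
            ;;
    esac
}

# Fall back to stdout (descriptor 1) so the script can be tried by hand.
next_step "${STATE:-start}" "${CURL_RESULT:-0}" >&${CURL_CMD_HANDLE:-1}
```

Run by hand with no variables set, it prints the replies for the "start"
state; with STATE=fetched it prints the FTP upload step instead.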

Misc thoughts:
- curl could be invoked with just the --exec option so the whole thing is
under control of the specified program.
- curl could provide the previous command in full and broken out into parts
such as protocol, host, path, etc.
- curl could provide some helper values, like a counter, elapsed time, etc.
- the next command could be specified in full as above, or maybe in some
-K-like, one-parameter-at-a-time form.
- ability to modify some part of the previous command and re-execute?
- what things would be "sticky" between commands?
- some way to specify the next command and pipe data to it from the program?
- a way to download short files into an environment variable?
- would multiple handles via "cmd[n] <command>" syntax be useful?
- setopt[n] and such? (Hmm, maybe this is the beginnings of your
interpreter? It could do things in a libcurl flavor.)
- could a slave curl running under the invoking parent be somehow useful?
- should be fun to help users with their state machine bugs!
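
For the "specified in full" form, the receiving end of the pipe could be
very simple. The following is only a mock of what curl might do with the
"env" and "cmd" lines from the example script above; it prints what it would
do rather than doing it, and the function name dispatch_replies is made up
for illustration.

```shell
#!/bin/sh
# Mock of the curl side of the pipe; nothing here is real curl behavior.
# "env NAME=value" would be kept for the next --exec call, "cmd ARGS" would
# become the next transfer, and anything else is reported and skipped.

dispatch_replies() {
    while IFS= read -r line; do
        case $line in
            "env "*) echo "would export: ${line#env }" ;;
            "cmd "*) echo "would run: curl ${line#cmd }" ;;
            "")      ;;
            *)       echo "ignored: $line" >&2 ;;
        esac
    done
}

# Feed it the two reply lines from the example script.
printf '%s\n' \
    'env STATE=sentftp' \
    'cmd -T out_data -u u:p ftp://ftp.example.com/path' | dispatch_replies
```

A line-per-instruction format like this keeps the handler language-neutral,
since any program that can write text to a descriptor can drive it.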

That's all I have time for now. I really have no use for this myself, but
wanted to float this idea that's been trying to get out for years in case it
strikes someone as worth doing. Maybe I can post a somewhat complete
example script this weekend... maybe. :/

Cheers!
Rich
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-users
FAQ: http://curl.haxx.se/docs/faq.html
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2014-06-19