curl-library

RE: Libcurl suggestions

From: DOMINICK C MEGLIO <dcm5151_at_esu.edu>
Date: Mon, 8 Dec 2003 13:44:26 -0500

> I'm against adding this feature because it misleads people into believing
> libcurl will do the right thing for them, when in fact libcurl can't know how
> to properly url encode something that is already put together into a "URL" (it
> won't really be a url if it isn't encoded properly).

I can understand what you're saying here. The problem is that, as a client
coder, I'd rather spend my time working on my application than on parsing and
dealing with a URL a user inputs. I'm not sure exactly how this should be
dealt with; it just seems like a problem to me.
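
To make the quoted objection concrete: once a URL is assembled, nobody can tell
whether a "&" or "%" in it is data or a delimiter, so encoding has to happen per
component before assembly. A minimal sketch of that, where encode_component()
and is_unreserved() are hypothetical helper names and not part of libcurl:

```c
#include <stddef.h>

/* ASCII-only test for characters that may appear unencoded in a component. */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

/* Percent-encode ONE url component (not a whole URL) into `out`.
 * Returns 0 on success, -1 if `out` (of size `outlen`) is too small. */
static int encode_component(const char *in, char *out, size_t outlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        if (is_unreserved(c)) {
            if (o + 1 >= outlen)          /* need room for char + NUL */
                return -1;
            out[o++] = (char)c;
        } else {
            if (o + 3 >= outlen)          /* need room for %XX + NUL */
                return -1;
            out[o++] = '%';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 15];
        }
    }
    if (outlen == 0)
        return -1;
    out[o] = '\0';
    return 0;
}
```

The application would call this on each user-supplied piece (a query value, a
file name) and only then glue the pieces together with "/", "?" and "&".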

> 2.) CURLOPT_DISABLEPROTOCOLS
> The idea behind these build-time options to disable certain protocols, was to
> offer users a way to reduce the size of the output library by disabling
> features they don't use. I didn't realize it would be considered a useful
> run-time option.

> I figure this would need to be bits set in curl_global_init() for this to be
> really useful?

Well, I did it differently (patch attached). I made it a curl_easy_setopt()
option, for example curl_easy_setopt(curl, CURLOPT_DISABLEPROTOCOLS,
CURLPROT_LDAP|CURLPROT_DICT); (the CURLPROT_* names are aliases for the
PROT_* constants).

My reason for needing this is basically the following. In my program, the user
enters a URL to download. My program is a server application. To get the most
out of our fds, we've closed stdin/stdout/stderr and reuse those fds as client
sockets (why waste 3 fds in a daemon that doesn't do terminal I/O anyway?).
Now let's say someone enters a telnet:// URL. The telnet:// protocol as it is
implemented in libcurl uses stdin/stdout. That means that if someone enters a
telnet:// URL, things get crazy, since stdin/stdout are actually client
sockets! Instead of reading from and writing to the screen, we're sending the
data to users of the server, which is obviously not a good thing. Of course I
could simply do some string comparisons to test whether the URL starts with
"telnet://" or whatever, but I figured: the library is already determining
which protocol it is, so why not let the library decide whether it should be
disabled? So that's exactly what I did.
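
For comparison, the string-comparison workaround mentioned above could look
roughly like this. It is only a sketch: url_has_scheme() and
url_scheme_is_blocked() are hypothetical application-side helpers, not libcurl
functions, and the blocked list is whatever the application chooses.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `url` starts with `scheme` followed by "://",
 * compared case-insensitively; otherwise return 0. */
static int url_has_scheme(const char *url, const char *scheme)
{
    size_t i, n = strlen(scheme);

    for (i = 0; i < n; i++) {
        /* If url ends early, its NUL byte cannot match and we return 0
         * before reading past the end of the string. */
        if (tolower((unsigned char)url[i]) != tolower((unsigned char)scheme[i]))
            return 0;
    }
    return strncmp(url + n, "://", 3) == 0;
}

/* Return 1 if the URL's scheme appears in the NULL-terminated `blocked`
 * list (application-defined), otherwise 0. */
static int url_scheme_is_blocked(const char *url, const char *const *blocked)
{
    for (; *blocked; blocked++)
        if (url_has_scheme(url, *blocked))
            return 1;
    return 0;
}
```

A check like this only sees what the application thinks the scheme is. As far
as I know, libcurl can also guess a protocol when the URL has no scheme at all,
which is one more reason to let the library make the decision.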

> 3.) curl_getfilename(CURL *curl);
> If you replace 'the' with 'a' in that last sentence, then I would agree with
> you. This has been discussed at length before.

> Why settle with getting just the file name? What about host name? Port number?
> Protocol? Password? etc etc... (and then someone comes up with the brilliant
> idea of having functions that let you set one of those fields)
Well, I wouldn't settle for just the filename. The way I'm planning to do it
is to move curl's URL parsing code into a Curl_parse function and then expose
curl_urldata_get and curl_urldata_free. And you're right, someone would come
up with functions to set those things. Who would this person be, you ask?
Well, it's you! CURLOPT_USERPWD, for example, lets you set the username and
password. As far as I can tell, that works the same as including
user:password right in the URL.

I can understand why you don't want such a feature, but I'm sure you also
know that many users would definitely value it. If you don't think
implementing this is useful, would you at least consider an "effective
filename" lookup? What I mean is that the filename I specify isn't
necessarily the file I get. If I go to blah.com/download.php?file=2,
download.php isn't the file I get. Through either Location: or
Content-Disposition:, it's very likely that I'll be pointed at another
file, maybe somedl.txt. From what I can tell, there is no way for me to
discover that the file I actually want to save locally is somedl.txt.
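
To illustrate the kind of lookup meant here, below is a rough sketch that pulls
the last path segment out of a URL. It assumes the application has somehow
obtained the final URL after redirects (to my knowledge, curl_easy_getinfo()
with CURLINFO_EFFECTIVE_URL reports the last used URL), and it does nothing
about Content-Disposition:. url_filename() is a hypothetical helper, not a
libcurl function.

```c
#include <stddef.h>
#include <string.h>

/* Copy the last path segment of `url` into `out` (size `outlen`).
 * The query string and fragment are ignored. Returns 0 on success,
 * -1 if the URL has no file name or `out` is too small. */
static int url_filename(const char *url, char *out, size_t outlen)
{
    size_t n = strcspn(url, "/?#");   /* index of first delimiter */
    const char *start = url;
    const char *end, *slash = NULL, *p;
    size_t len;

    /* Skip "scheme://" only if it appears before any other delimiter. */
    if (n > 0 && url[n - 1] == ':' && url[n] == '/' && url[n + 1] == '/')
        start = url + n + 2;

    end = start + strcspn(start, "?#");   /* stop before query/fragment */

    for (p = start; p < end; p++)         /* find the last '/' in the path */
        if (*p == '/')
            slash = p;
    if (!slash)
        return -1;                        /* host only, no path */

    len = (size_t)(end - (slash + 1));
    if (len == 0 || len >= outlen)
        return -1;                        /* empty name or buffer too small */
    memcpy(out, slash + 1, len);
    out[len] = '\0';
    return 0;
}
```

Given "http://blah.com/download.php?file=2" this yields "download.php", which
is exactly the name I don't want; it only becomes useful when fed the URL the
transfer actually ended up at.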

> I prefer to not offer any of these pieces, and if you want them extracted from
> the URL, you can use one of the existing URI/URL parsing libraries "out
> there". You know the full URL yourself.
Well, first off, I'd rather depend on as few libraries as possible. It seems
wasteful to me to include a library so that I can use one function. Not to
mention that you again run into the possibility of inconsistencies. Take my
fictional fake:// URL scheme again, and suppose libcurl 7.20.0 supports
fake://. Say I use libwww to parse the URL. My program was designed for
libcurl 7.11.0 and libwww 5.4.0, neither of which supports fake://. Now the
user notices libcurl 7.20.0 was released and upgrades to it. The user also
upgrades libwww to 7.0.0. What happens if libwww 7.0.0 doesn't support
fake://? It means my program can't load fake:// URLs, because I have no way to
parse them. So even though libcurl has everything it needs to work with
fake:// URLs, my program can't use them until libwww is updated to support
fake://.

Anyway, I realize you wouldn't make such a feature yourself, but does that
mean you'd also reject it if I were to code it and submit it as a patch? If
so I won't waste my time.

> I've been contemplating about adding a function that when given a URL it
> returns info about libcurl's support of that particular protocol.
That sounds like a good idea, but it wouldn't really help me with my
situation.

> One of the ideas behind the concept of URLs is that they look and work the
> same, independent of the underlying protocol. Thus, you should be pretty safe
> to assume that no such big surprises will pop up even in future versions of
> libcurl.

That may be the theory, but it isn't like that in practice. Take for example
the draft for the irc:// URL
<http://www.ietf.org/internet-drafts/draft-butcher-irc-url-03.txt> (I doubt
libcurl would ever support this, but it is a valid example of what I'm
saying). The part after the / has _nothing_ to do with a file. Instead, it
specifies which IRC channel the client should join, or a nickname that should
be queried (depending on whether ,ischan or ,isuser is specified). That is
clearly different from the schemes we're all familiar with, like ftp and http.
Consider irc://undernet/pickle%25butcher.id.au,isuser: that certainly doesn't
look like an http/ftp URL to me. Different protocols have different needs, and
therefore the URL scheme has to be adapted in some ways to accommodate those
needs.

Dominick Meglio

Received on 2003-12-08