cURL / Mailing Lists / curl-library / Single Mail

curl-library

Re: [PATCH] a new option WILDCARDMATCH (FTP only for now)

From: Pavel Raiskup <xraisk00_at_gmail.com>
Date: Wed, 14 Apr 2010 02:59:13 +0200

Hi. Only for this moment I would like solve some unclear things to
continue working..

> 1 - the CURLOPT_* options that set callbacks should preferably end in
> FUNCTION as all existing callback options do that. I also don't like the
> word "UNIT" in the two callbacks sent before and after each directory
> entry. What unit?
2 - each callback function option should have a corresponding *DATA ..

You are right. It should be more consistent.. Due to this IMAP topic
http://curl.haxx.se/mail/lib-2010-04/0082.html I'd like to introduce the
most generic and appropriate solution.

First of all, As Kamil Dudka said, this "UNIT" idea was slightly inspired
 from gcc plugin mechanism
http://gcc.gnu.org/onlinedocs/gccint/Plugins.html and I agree with this
naming. But I think there could be one little bit better word -- CHUNK.

As you can see, I'm not really good in english -- I don't know any better
word for these callbacks, I have been considering
PART | PARTITION | UNIT | ITEM | CHUNK | SEPARATOR | FRAGMENT | SPLITTER |
???.

So, the new CURLOPTS could look like this:

CURLOPT_CHUNK_BGN_FUNCTION
CURLOPT_CHUNK_END_FUNCTION
CURLOPT_CHUNK_DATA
CURLOPT_FNMATCH_FUNCTION
?? CURLOPT_FNMATCH_DATA should I implement this?

CHUNK sounds like it could be used very generally. Useful for FTP and IMAP
well for me. Please hit me if you know better name.

The dilemma - what about parameters? I was considering "struct
Curl_filedata*" as a first argument for STARTUNIT/CHUNK_BGN_FUNCTION and
it looks like there should be something like "void *transfer_info". So is
the content of mail attachment acceptable?

> 16 - error handling from the callback. Even when you detect out of
> memory within the callback, the callback still doesn't return any error
> (it returns the bufflen no matter what) and I don't think that's a good
> idea. If the error is serious enough you better return error properly.
> Only "simple" problems with parsing the data should be considered soft
> enough to allow the data to keep getting downloaded.

I completely agree. So what value should I return? E.g. (bufflen - 1)? I
haven't found recommended way how to tell the lib to stop working.
Probably it doesn't make sense. Correct me if I'm wrong.

> 19 - fnmatch() is not portable enough for unconditional use. Lots of
> systems that build and run libcurl have no native fnmatch(). You either
> need to provide an alternative "native" version for those systems or we
> need to have the entire matching code conditional on the existance of
> this function.

For 19. note.. As soon as it will be possible I'd like to propose you my
own version of fnmatch. I have tried to implement (only for now) Question
mark and Star notation and it works AFAIK perfectly (I was surprised how
easy it was). So the "*.t?t" pattern could be solved good. There is no
problem with [a-z] implementation but it is only about some time.
There is one distinction between these implementations. I have used
dynamically generated FSM and original gnu fnmatch(3) works AFAIK
statically (no allocation) --- and it costs some time to "compile" this
FSM --- I do not want to change it, it could be too hard for me for this
moment. The advantage of this function is that there are no dependencies
and it will be easy to improve. I propose to forgot about fnmatch for now
and consider my fnmatch implementation. Even if it is slower. Code
prototype will be ASAP (this week) now it doesn't look cosmetically.
Possible problems -- no Unicode support, no directory solution -- but I
don't think it is necessary for the moment and it could be implemented in
future.

The reason why I'm telling this is I'm deciding how to deal with
CURLOPT_WILDCARDMATCH "button". Because, if the wildcard is implicitly off
there will be no reason to call this extension "WILDCARDwhatever".

> 22 - I would prefer to see as much as possible of the code handling
> wildcards in a separate file. You added two new files, wildcard.c and
> fileinfo.c but yet the largest portion of new code was added to
> lib/ftp.c and while that works I find it unfortunate since ftp.c already
> is one of the larger source files in our tree.

I know that and I completely agree but the major part of this patch was
only for FTP :-( I'm going to try move something.

Pavel

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html

Received on 2010-04-14