
curl-tracker mailing list Archives

[ curl-Bugs-3048197 ] Incorrect data uploaded in case of CURLE_SEND_ERROR

From: SourceForge.net <noreply_at_sourceforge.net>
Date: Tue, 31 Aug 2010 08:59:58 +0000

Bugs item #3048197 was opened at 2010-08-19 00:59
Message generated for change (Comment added) made by bagder
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=100976&aid=3048197&group_id=976

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: ftp
Group: bad behaviour
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: catalin (catalinr)
Assigned to: Daniel Stenberg (bagder)
Summary: Incorrect data uploaded in case of CURLE_SEND_ERROR

Initial Comment:
These are my [hopefully right] conclusions:
- When doing an FTP upload, in the progressCallback function the uploadedTillNow argument (the last one) holds the correct size of the uploaded data (well, rather the "sent data");
- the uploaded data is sent in chunks of at most CURL_MAX_WRITE_SIZE bytes (btw, CURL_MAX_WRITE_SIZE is not documented anywhere, and it would be useful to know that it is the maximum size attempted in one upload);
- in case the link is broken/disconnected, the remote destination file gets appended with a chunk of the size reported by progressCallback, but the last KBs consist of NULL bytes instead of the real data. So a subsequent APPE[-nd] to that file will render a file of the same size as the original, but somewhere in the middle it will have the wrong NULL bytes.

The workaround for this was to accumulate the size reported by the progressCallback in a totalTillNow variable and, after any network error that would require a resume, subtract CURL_MAX_WRITE_SIZE*8 from totalTillNow and set up a REST upload with the calculated starting point. (See #3048174 for issues with CURLOPT_RESUME_FROM.)

I'm not sure my workaround is the best solution, but it works. OTOH, that behavior when uploading and getting disconnected leads to corrupted files at the destination, and that is IMO very wrong...

libcurl 7.21.1
msw Vista
mingw-gcc 3.4.5

----------------------------------------------------------------------

>Comment By: Daniel Stenberg (bagder)
Date: 2010-08-31 10:59

Message:
Your bug report here is based on an assumption on your end. You want me
to put in a lot of work to test that assumption in order to rule it out. I'd
rather reverse the argument and instead assume that things are good until
you can provide me with something that indicates that curl indeed
misbehaves.

If it truly is that easy to set up two systems with a router and then
unplug the router to trigger this problem, then please proceed and do
exactly that and tell us what happens, preferably with some logs.

Allow me to remind you that curl is an open source project with very
little company backing. You're asking me to spend my spare time looking for
a problem you assume exists in curl. (And yes, perhaps it does.)

I look for and fix dozens of curl bugs every month already. I say we
scale much better if you do the larger piece of the work here and I'll help
analyse the results of your tests and experiments.

I don't think this is bad practice. I think it is common sense. You're of
course free to disagree.

----------------------------------------------------------------------

Comment By: catalin (catalinr)
Date: 2010-08-31 00:34

Message:
I'm sorry, I can't see how you could get any "more info and points" without
testing this yourself... I really don't think it is that difficult nowadays
to set up a source and a destination separated by a router and just unplug
the router in the middle of a transfer.
This [lib]curl is a great piece of work, but dismissing such issues on a
pure _assumption_ is by any measure bad practice.

I believe this is due to network problems, but then curl_progress_callback
should not return the size of data that is not certain to have been
correctly transferred. So what would be wrong with reporting only the
size of the data that is confirmed to have been correctly transmitted? IOW,
report only the total data transferred so far, excluding the chunk being
transmitted at that point in time. It'll be a couple of KB less exact, but
then it can never guarantee at any point that the reported numbers are the
final ones anyway, can it?

I don't have the necessary knowledge to debug this, nor any urgent need for
it since the workaround described in my initial message is doing OK, nor
the knowledge (..did I already say that?). So I'll at least be happy with
the fact that Google indexes this and it will be easily found by anyone
else having the same issue.

----------------------------------------------------------------------

Comment By: Daniel Stenberg (bagder)
Date: 2010-08-29 00:24

Message:
Well, I'm sorry, but unless we get more info and points indicating that
curl in fact does something wrong in this case, I will consider this a
proxy/intermediate/server problem/artifact.

----------------------------------------------------------------------

Comment By: catalin (catalinr)
Date: 2010-08-23 01:39

Message:
I believe the way it happens in my case is because of an intermediate
network element. I can't be 100% sure, but maybe if a router is used
between the source and the destination and the connection is broken by
disconnecting the router, then the same thing would happen. Maybe you
already tried like this...

I don't really know where to start investigating this in the curl code -
i.e. where the numbers in the callback come from...

----------------------------------------------------------------------

Comment By: Daniel Stenberg (bagder)
Date: 2010-08-23 00:24

Message:
I've tried, and I've not seen any zeroes in my broken uploads when using
vsFTPd.

The progress callback gets the amount that the system calls have reported
as successfully sent. Unless of course there's a bug somewhere, but I've
not been able to find any such bug. Can you?

----------------------------------------------------------------------

Comment By: catalin (catalinr)
Date: 2010-08-22 04:13

Message:
I am already trying to make myself understood as best as I can. Sorry if I
should do more but just fail at it... OTOH I'm not sure I can describe this
any better than I already have; my English is not that good.

"When the connection breaks, libcurl CANNOT [...] check anything else on
the remote site"
What I'm saying is not for libcurl to do more, but for you (a person) to
check the uploaded data after a CURLE_SEND_ERROR occurs. IOW I've asked you
to check for the problem I'm signaling, not to implement something.
Again, consider this a test case: an upload is in progress,
CURLE_SEND_ERROR occurs, transfer is aborted; desired outcome: the uploaded
data is the same as the source (partial, but identical so far); actual
outcome (at my end at least): last part of the data is incorrect (null
bits).
I feel the need to express this yet again: this is not a request for
automatically doing anything, but for reproducing what I'm experiencing.

"libcurl cannot guarantee what the server does, nor can it assume
anything"
Ok, so reading the last part I can only think that, on the contrary, the
progressCallback reports the uploaded size _assuming_ it was all correctly
received at the destination. But it should probably report only the data
that is confirmed to have been sent correctly so far.
If the upload consists of, let's say, 5 chunks of CURL_MAX_WRITE_SIZE
bytes, then when a call to progressCallback is triggered while uploading
chunk 4, it should only report the bytes sent in the first 3 chunks, and
not add the [so far unconfirmed] bytes of the 4th chunk, which is what
seems to happen now.

"1. I can't see any error in libcurl's side"
I'd say the partialUpload reported by progressCallback is not always
correct, see above.

"2. It sounds like bad behavior on the server side "
It may very well be like that, but is the "good behavior" defined in an RFC
or just as a cURL concept?
_If_ cURL does assume that everything sent is also correctly received,
then that is a rather arbitrary call.
If this is an impossible-to-change fact, it should be better described in
the docs.

"3. you have not presented any way to repeat this problem"
I believe I have, even if not with a piece of code. If it was not
understood from my previous post, maybe this time will be luckier. If still
not, I'll probably give up...

----------------------------------------------------------------------

Comment By: Daniel Stenberg (bagder)
Date: 2010-08-21 21:20

Message:
When the connection breaks, libcurl CANNOT send anything further as the
connection is no more, nor can it check anything else on the remote site as
the connection... broke! Having libcurl try to reconnect just to check the
end of the file in case it got disconnected just previously is completely
out of the question.

Alas, the problem you see at disconnect depends on what the server does on
a disconnect. libcurl cannot guarantee what the server does, nor can it
assume anything. Some servers are likely to act differently than others on
disconnect. Appending zero-bytes to the file does sound like a case of bad
behaviour ON THE SERVER END.

1. I can't see any error in libcurl's side
2. It sounds like bad behavior on the server side
3. you have not presented any way to repeat this problem

I can't see what libcurl can do about this.

Anyone who decides to append data to an existing file because a previous
upload attempt got aborted may of course consider checking the end of the
file to see that it looks OK before blindly appending more data to it.
libcurl will not do that automatically, though, but it does provide the
means to get the data etc.

----------------------------------------------------------------------

Comment By: catalin (catalinr)
Date: 2010-08-21 07:02

Message:
I'm sure I fail to see a lot more than you do, but I'm just signaling what
looks like bad behavior to me. Maybe you can find a way to try to reproduce
this, as I don't think I can make a sample program that will make a server
disconnect (or an ISP interrupt it etc), can I?
Of course that ftp server (it comes with a BusyBox Linux on a NAS device)
may be broken, but I have some doubts about that and IMO it's worth
investigating...

The error received at my end was CURLE_SEND_ERROR and IIRC once I also got
CURLE_RECV_ERROR (although only uploads were being done, but maybe it was
about receiving some response from the server).
A long-shot interpretation would be that curl sends the size of the packet
being uploaded, but only part of the actual data gets to the destination.
Again, I may be far off with my guess...
I don't think those zeroes are exactly random either... Comparing the
source and destination files, the difference is made by the zeroes in the
destination file, not some _random_ bits being there.

Maybe a shorter way would be to get an ftp upload to break with
CURLE_SEND_ERROR and then check the end of the uploaded file part?

HTH

----------------------------------------------------------------------

Comment By: Daniel Stenberg (bagder)
Date: 2010-08-19 15:15

Message:
I don't see how libcurl sends any data as zeroes. Also, I fail to see how
it could send that block of zeroes if the connection disconnected?

To me it sounds like your server behaves oddly and adds random data to the
file being written at the time of the disconnect. I don't think libcurl can
do anything about it.

If this is not the case, can you please clarify your point for us?

----------------------------------------------------------------------

Received on 2010-08-31
