curl-library

Re: fflush and fsync

From: Jamie Lokier <jamie_at_shareable.org>
Date: Tue, 22 Apr 2008 14:56:16 +0100

Yang Tse wrote:
> > fsync shouldn't even be needed within curl. Whether log files
> > are physically on disk or merely in the cache should have no
> > effect on the success of any curl operation. Is that call actually
> > needed?
>
> As long as fflush() makes changes visible to other processes there is
> no need for fsync(). But if fflush() + fclose() don't make changes
> immediately visible to other processes, then on those OSes the test
> harness would certainly benefit from using fsync().

fclose() implies fflush(). That's ANSI C. If any stdio
implementation didn't do that, a lot of simple programs (like
printf_helloworld > file) wouldn't work. So you can absolutely rely
on it.
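
To illustrate, here is a minimal sketch in C. The function name, the
file name and the 256-byte buffer are made up for the example. It
writes through stdio with no explicit fflush(), closes the stream, and
then reads the data back through a fresh stream:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write msg to path relying only on fclose() (no explicit fflush()),
 * then read it back through a new stream.
 * Returns 0 if the contents match, 1 if they differ, -1 on I/O error.
 * msg is assumed to be shorter than 256 bytes. */
int fclose_flushes(const char *path, const char *msg)
{
    char buf[256];
    size_t n;
    FILE *f;

    assert(path != NULL && msg != NULL);
    assert(strlen(msg) < sizeof(buf));

    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fputs(msg, f) == EOF) {
        fclose(f);
        return -1;
    }
    /* No fflush() here: fclose() writes out any buffered data itself. */
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    buf[n] = '\0';
    fclose(f);
    remove(path);

    return strcmp(buf, msg) == 0 ? 0 : 1;
}
```

Calling fclose_flushes("fclose_demo.tmp", "hello world\n") should
return 0 on any conforming stdio implementation.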

fsync() has _no_ effect on the visibility of file writes to other
processes _on the same machine_. This is true even if the filesystem
is remote.

Its only effect is to commit data to the underlying storage devices
for higher recoverability in case there's a system crash shortly
after, and even that isn't assured.

So if you're running a program which finishes with
fflush+fsync+fclose, and then another program on the same machine to
read the newly written file, you really don't need the fflush+fsync.
If they make a difference, it indicates a bug elsewhere. fsync often
causes a time delay, so look for race conditions.
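
A rough sketch of the same-machine case, assuming a POSIX system. The
function name and buffer size are invented for the example. The parent
writes a file and never calls fsync(); a forked child then opens the
file on its own and checks what it sees:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Parent writes msg with write() and no fsync(); a forked child opens
 * the file independently and checks that it reads exactly msg.
 * Returns 0 if the child saw the data, non-zero otherwise.
 * msg is assumed to be at most 256 bytes long. */
int visible_without_fsync(const char *path, const char *msg)
{
    size_t len;
    int fd, status;
    pid_t pid;

    assert(path != NULL && msg != NULL);
    len = strlen(msg);
    assert(len <= 256);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    close(fd);                      /* deliberately no fsync() */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: a separate process reading through its own descriptor. */
        char buf[256];
        ssize_t n;
        int rfd = open(path, O_RDONLY);

        if (rfd < 0)
            _exit(2);
        n = read(rfd, buf, sizeof(buf));
        close(rfd);
        _exit(n == (ssize_t)len && memcmp(buf, msg, len) == 0 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    unlink(path);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child reads from the kernel's view of the file, which already
includes the written data whether or not it has reached the disk, so
the result should be 0 without any fsync().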

That leaves the case where a program writes a file on one machine, and
then another program reads the file on a _different_ machine (the
filesystem must be remote on at least one of them). Sometimes fsync()
might help these to synchronise, but it's only a side effect and far
from guaranteed. That's not what fsync() does, and it will only
synchronise on some implementations and in some configurations.

> > The only time I can see that having an effect is when running the
> > curl test suite on a remote machine, which doesn't quite work properly
> > without some hacks, anyway.
>
> OK, let's see...
>
> Test # 1001 was failing verification on some systems because the test
> harness sws server finished writing the server input request file
> only after runtests.pl had already read it and compared its
> incomplete contents with the expected results, resulting in a false
> test failure.

Reading a file before it's finished being written looks like a bug.
Isn't this the obvious thing to fix?

> First step taken. Introduce a small delay in the <postcheck> section
> of test # 1001. This proved to be effective on autobuilds since it no
> longer failed.
...
> If removal of the extra delay for 1001 makes it fail again even with
> fflush() + fsync() in place, then the fflush() + fsync() must be
> removed and the delay introduced again for 1001.

Seems to me the delay works only because you have a race condition:
runtests.pl shouldn't read the file until the sws server has finished
writing to it. (I'm going from your description above.)
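
One common way to close that kind of race, sketched here in C purely
as an illustration: the writer builds the file under a temporary name
and rename()s it into place once it is complete, so a reader that only
opens the final name never sees a partial file. I'm not claiming this
is how sws or runtests.pl work or should work. The function names and
file names are hypothetical, and the sketch assumes POSIX rename()
semantics (atomic replacement on the same filesystem):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writer side: write msg to tmp_path, close it, then rename() it to
 * final_path. Returns 0 on success, -1 on failure. */
int publish_when_complete(const char *tmp_path, const char *final_path,
                          const char *msg)
{
    FILE *f;

    assert(tmp_path != NULL && final_path != NULL && msg != NULL);

    f = fopen(tmp_path, "w");
    if (!f)
        return -1;
    if (fputs(msg, f) == EOF) {
        fclose(f);
        remove(tmp_path);
        return -1;
    }
    if (fclose(f) != 0) {           /* fclose() flushes buffered data */
        remove(tmp_path);
        return -1;
    }
    /* Only now does the file appear under the name the reader uses. */
    if (rename(tmp_path, final_path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}

/* Reader side: returns 1 if path exists and holds exactly expected,
 * 0 otherwise. expected is assumed to be shorter than 256 bytes. */
int file_has_contents(const char *path, const char *expected)
{
    char buf[256];
    size_t n;
    FILE *f = fopen(path, "r");

    if (!f)
        return 0;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    buf[n] = '\0';
    fclose(f);
    return strcmp(buf, expected) == 0;
}
```

With this arrangement the reader can poll for the final name instead
of sleeping for a fixed time, and no fsync() is involved.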

Seems to me the fsync() is a hack which, if it helps, is only because
of a spurious side effect. If this is a local test, the only side
effect I can imagine is that it adds a time delay after fflush(), and
you could equally call fclose() followed by a time delay for the same
effect.

-- Jamie
Received on 2008-04-22