At GitLab we have recently received a report where a repository was left
with a corrupted `packed-refs` file after the node hard-crashed even
though `core.fsync=reference` was set. This is something that in theory
should not happen if we correctly did the atomic-rename dance to:
1. Write the data into a temporary file.
2. Synchronize the temporary file to disk.
3. Rename the temporary file into place.
So if we crash in the middle of writing the `packed-refs` file we should
only ever see either the old or the new state of the file.
And while we do the dance when writing the `packed-refs` file, there is
indeed one gotcha: we use a `FILE *` stream to write the temporary file,
but don't flush it before synchronizing it to disk. As a consequence any
data that is still buffered will not get synchronized and a crash of the
machine may cause corruption.
Fix this bug by flushing the file stream before we fsync.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
goto error;
}
- if (fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
+ if (fflush(out) ||
+ fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
close_tempfile_gently(refs->tempfile)) {
strbuf_addf(err, "error closing file %s: %s",
get_tempfile_path(refs->tempfile),