From: Heikki Linnakangas
Date: Mon, 1 Jul 2013 06:36:00 +0000 (+0300)
Subject: Retry short writes when flushing WAL.
X-Git-Tag: REL9_4_BETA1~1404
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=79ce29c734c6a652b2f7193bda537cff0c8eb8c1;p=thirdparty%2Fpostgresql.git

Retry short writes when flushing WAL.

We don't normally bother retrying when the number of bytes written by
write() is short of what was requested. It is generally assumed that a
write() to disk doesn't return short, unless you run out of disk space.
While writing the WAL, however, it seems prudent to try a bit harder,
because a failure leads to PANIC. The write() is also much larger than most
write()s in the backend (up to wal_buffers), so there's more room for
surprises.

Also retry on EINTR. All signals used in the backend are flagged SA_RESTART
nowadays, so it shouldn't happen, but better to be defensive.
---

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 66447033695..0ce661bf9f4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1606,6 +1606,8 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch)
 		{
 			char	   *from;
 			Size		nbytes;
+			Size		nleft;
+			int			written;
 
 			/* Need to seek in the file? */
 			if (openLogOff != startoffset)
@@ -1622,19 +1624,25 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch)
 			/* OK to write the page(s) */
 			from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
 			nbytes = npages * (Size) XLOG_BLCKSZ;
-			errno = 0;
-			if (write(openLogFile, from, nbytes) != nbytes)
+			nleft = nbytes;
+			do
 			{
-				/* if write didn't set errno, assume no disk space */
-				if (errno == 0)
-					errno = ENOSPC;
-				ereport(PANIC,
-						(errcode_for_file_access(),
-						 errmsg("could not write to log file %s "
-								"at offset %u, length %lu: %m",
-								XLogFileNameP(ThisTimeLineID, openLogSegNo),
-								openLogOff, (unsigned long) nbytes)));
-			}
+				errno = 0;
+				written = write(openLogFile, from, nleft);
+				if (written <= 0)
+				{
+					if (errno == EINTR)
+						continue;
+					ereport(PANIC,
+							(errcode_for_file_access(),
+							 errmsg("could not write to log file %s "
+									"at offset %u, length %lu: %m",
+									XLogFileNameP(ThisTimeLineID, openLogSegNo),
+									openLogOff, (unsigned long) nbytes)));
+				}
+				nleft -= written;
+				from += written;
+			} while (nleft > 0);
 
 			/* Update state for write */
 			openLogOff += nbytes;
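
For readers who want to see the retry pattern outside of XLogWrite(), the
following is a minimal standalone sketch, not code from this commit.
write_all() is a hypothetical helper name. Unlike the patch, it returns -1
instead of raising PANIC, and it keeps the pre-patch assumption that a
zero-byte write with errno unset means the disk is full.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all: hypothetical helper, not a PostgreSQL function.
 *
 * Keeps calling write() until all len bytes have been written.
 * Returns 0 on success, or -1 with errno set on failure.
 */
int
write_all(int fd, const void *buf, size_t len)
{
	const char *from = buf;
	size_t		nleft = len;

	assert(buf != NULL || len == 0);

	while (nleft > 0)
	{
		ssize_t		written;

		errno = 0;
		written = write(fd, from, nleft);
		if (written <= 0)
		{
			/* Interrupted before any byte was written: just retry. */
			if (errno == EINTR)
				continue;
			/* Assumption: no bytes written and no errno means no space. */
			if (errno == 0)
				errno = ENOSPC;
			return -1;
		}
		/* Short write: advance past what was written and go again. */
		nleft -= (size_t) written;
		from += written;
	}
	return 0;
}
```

A caller would treat a -1 return the way XLogWrite() treats a failed
write(): report the error using errno and decide whether it is fatal.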