git.ipfire.org Git - thirdparty/xfsprogs-dev.git/log

Dave Chinner [Thu, 7 Jan 2021 20:59:18 +0000 (15:59 -0500)]

xfs: remove xfs_buf_t typedef

Source kernel commit: e82226138b20d4f638426413e83c6b5db532c6a2

Prepare for kernel xfs_buf alignment by getting rid of the
xfs_buf_t typedef from userspace.

[darrick: This patch is a port of a userspace patch removing the
xfs_buf_t typedef in preparation to make the userspace xfs_buf code
behave more like its kernel counterpart.]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Zheng Yongjun [Thu, 7 Jan 2021 20:59:18 +0000 (15:59 -0500)]

fs/xfs: convert comma to semicolon

Source kernel commit: 1189686e5440041057f8cc21a7c1d13bb6642cb9

Replace a comma between expression statements by a semicolon.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
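
To illustrate why this kind of cleanup is worthwhile, here is a minimal sketch. It is not the code touched by this commit; `struct point` and both helpers are hypothetical names. The two functions behave identically, but the comma form is a single expression statement, so it would silently change meaning if it ever became the body of an unbraced `if`.

```c
/* Hypothetical type, used only for this illustration. */
struct point {
	int x;
	int y;
};

/* Comma operator: one expression statement that performs both assignments. */
static void init_with_comma(struct point *p)
{
	p->x = 1,
	p->y = 2;
}

/* Semicolons: two separate expression statements with the same effect. */
static void init_with_semicolon(struct point *p)
{
	p->x = 1;
	p->y = 2;
}
```

Both helpers leave the structure in the same state; the semicolon form simply makes the statement boundaries explicit.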


Gao Xiang [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: kill ialloced in xfs_dialloc()

Source kernel commit: 3937493c502566d90a74c3439ebdb663d9380cc3

It's enough to just use the return code, so get rid of the argument.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: split xfs_dialloc() into 2 functions

Source kernel commit: 8d822dc38ad781b1bfa5c03227da80dbd87e9959

This patch explicitly separates free inode chunk allocation and
inode allocation into two individual high level operations.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: move xfs_dialloc_roll() into xfs_dialloc()

Source kernel commit: f3bf6e0f1196c69a7b0412521596cd1cc7622a82

Get rid of the confusing ialloc_context and failure handling around
xfs_dialloc() by moving xfs_dialloc_roll() into xfs_dialloc().

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: move on-disk inode allocation out of xfs_ialloc()

Source kernel commit: 1abcf261016e12246e1f0d2dada9c5c851a9ceb7

With this change, xfs_ialloc() only handles in-core inode allocation.
Also, rename xfs_ialloc() to xfs_dir_ialloc_init() in order to
keep everything in xfs_inode.c under the same namespace.

[sandeen: make equivalent change in xfsprogs]

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: introduce xfs_dialloc_roll()

Source kernel commit: aececc9f8dec92a25c84a3378021636ce58d72dc

Introduce a helper to make the on-disk inode allocation rolling
logic clearer in preparation for the following cleanup.

[sandeen: update xfsprogs struct xfs_trans to match]

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Gao Xiang [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: convert noroom, okalloc in xfs_dialloc() to bool

Source kernel commit: 15574ebbff260a70d344cfb924a8daf3c47dc303

Boolean is preferred for such use.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Eric Sandeen [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: don't catch dax+reflink inodes as corruption in verifier

Source kernel commit: 207ddc0ef4f413ab1f4e0c1fcab2226425dec293

We don't yet support dax on reflinked files, but that is in the works.

Further, having the flag set does not automatically mean that the inode
is actually "in the CPU direct access state," which depends on several
other conditions in addition to the flag being set.

As such, we should not catch this as corruption in the verifier - simply
not actually enabling S_DAX on reflinked files is enough for now.

Fixes: 4f435ebe7d04 ("xfs: don't mix reflink and DAX mode for now")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[darrick: fix the scrubber too]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Joseph Qi [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: remove unneeded return value check for *init_cursor()

Source kernel commit: 2e984badbcc0f1cf284441c566ca4309fe59ac05

Since *init_cursor() always returns a valid cursor, the NULL checks
in the callers are unneeded, so clean them up.
This also keeps the behavior consistent with other callers.

Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Gao Xiang [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: introduce xfs_validate_stripe_geometry()

Source kernel commit: 7bc1fea9d36c78e783ce7d4ad28ad129ebcce435

Introduce a common helper to consolidate the stripe validation process.
Also make the kernel code in xfs_validate_sb_common() use it first.

Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Kaixu Xia [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: remove the unused XFS_B_FSB_OFFSET macro

Source kernel commit: afbd914776db9c035dbe2afa6badb9955ae52492

There are no callers of the XFS_B_FSB_OFFSET macro, so remove it.

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Kaixu Xia [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: check tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flag

Source kernel commit: 04a58620a17cb14fa20c6e536e03eb27f9af6bc9

Nowadays the only things that the XFS_TRANS_DQ_DIRTY flag seems to do
are to indicate that the tp->t_dqinfo->dqs[XFS_QM_TRANS_{USR,GRP,PRJ}]
values changed and to be checked in xfs_trans_apply_dquot_deltas() and
its unreserve variant xfs_trans_unreserve_and_mod_dquots(). We can use
the tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flag:
tp->t_dqinfo is allocated only when the qtrx values change, so a
non-NULL tp->t_dqinfo is equivalent to the XFS_TRANS_DQ_DIRTY flag
being set. We therefore only need to check whether tp->t_dqinfo == NULL
in xfs_trans_apply_dquot_deltas() and its unreserve variant to determine
whether to lock all of the dquots and join them to the transaction.

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
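
The idea of letting a lazily allocated pointer double as the "dirty" flag can be sketched as follows. This is a simplified illustration under stated assumptions, not the kernel code: `struct dqinfo`, `struct trans`, `trans_mod_dquot()` and `trans_has_dquot_deltas()` are hypothetical stand-ins, and only the `t_dqinfo` field name comes from the commit message.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the real structures. */
struct dqinfo {
	long blk_delta;			/* pending block-count change */
};

struct trans {
	struct dqinfo *t_dqinfo;	/* NULL until a quota value changes */
};

/* Record a quota change, allocating the tracking structure on first use. */
static int trans_mod_dquot(struct trans *tp, long delta)
{
	if (!tp->t_dqinfo) {
		tp->t_dqinfo = calloc(1, sizeof(*tp->t_dqinfo));
		if (!tp->t_dqinfo)
			return -1;
	}
	tp->t_dqinfo->blk_delta += delta;
	return 0;
}

/* The pointer itself answers "is there anything to apply?". */
static bool trans_has_dquot_deltas(const struct trans *tp)
{
	return tp->t_dqinfo != NULL;
}
```

Because the structure only exists once something has changed, a separate flag would carry no extra information in this sketch.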


Darrick J. Wong [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: refactor file range validation

Source kernel commit: 33005fd0a537501111fc97ec330b721388c6b451

Refactor all the open-coded validation of file block ranges into a
single helper, and teach the bmap scrubber to check the ranges.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: refactor realtime volume extent validation

Source kernel commit: 18695ad4251462b33787b7e375dbda57c1969c8f

Refactor all the open-coded validation of realtime device extents into a
single helper.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: refactor data device extent validation

Source kernel commit: 67457eb0d225521a0e81327aef808cd0f9075880

Refactor all the open-coded validation of non-static data device extents
into a single helper.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: detect overflows in bmbt records

Source kernel commit: acf104c2331c1ba2a667e65dd36139d1555b1432

Detect file block mappings with a blockcount that's either zero or so
large that integer overflows occur, because neither is valid in the
filesystem. Worse yet, attempting directory modifications causes the
iext code to trip over the bmbt key handling and takes the filesystem
down. We can fix most of this by preventing the bad metadata from
entering the incore structures in the first place.

Found by setting blockcount=0 in a directory data fork mapping and
watching the fireworks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
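
The two conditions described above (a zero blockcount, or a range that overflows) can be sketched as a small check. This is only an illustration under assumptions: `struct mapping` and `mapping_range_ok()` are hypothetical names, and the real verifier may test more than is shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one file block mapping. */
struct mapping {
	uint64_t startoff;	/* first file block covered */
	uint64_t blockcount;	/* number of blocks covered */
};

static bool mapping_range_ok(const struct mapping *m)
{
	/* A zero-length mapping is never valid. */
	if (m->blockcount == 0)
		return false;

	/* Unsigned addition wraps on overflow, so a smaller sum means overflow. */
	if (m->startoff + m->blockcount < m->startoff)
		return false;

	return true;
}
```

Rejecting such records before they reach the in-memory structures matches the stated goal of keeping bad metadata out of the incore state.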


Darrick J. Wong [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: enable the needsrepair feature

Source kernel commit: 96f65bad7c31557c28468ba8c1896c7dd7a6bbfa

Make it so that libxfs recognizes the needsrepair feature. Note that
the kernel will still refuse to mount these.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: define a new "needrepair" feature

Source kernel commit: 80c720b8eb1c7800133c5ae1686353d33564b773

Define an incompat feature flag to indicate that the filesystem needs to
be repaired. While libxfs will recognize this feature, the kernel will
refuse to mount if the feature flag is set, and only xfs_repair will be
able to clear the flag. The goal here is to force the admin to run
xfs_repair to completion after upgrading the filesystem, or if we
otherwise detect anomalies.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Thu, 7 Jan 2021 20:59:17 +0000 (15:59 -0500)]

xfs: move kernel-specific superblock validation out of libxfs

Source kernel commit: 3945ae03d822aa47584dd502ac024ae1e1eb9e2d

A couple of the superblock validation checks apply only to the kernel,
so move them to xfs_fc_fill_super before we add the needsrepair "feature",
which will prevent the kernel (but not xfsprogs) from mounting the
filesystem. This also reduces the diff between kernel and userspace
libxfs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Eric Sandeen [Thu, 7 Jan 2021 20:59:03 +0000 (15:59 -0500)]

libxfs: cosmetic changes to libxfs_inode_alloc

This pre-patch helps make the next libxfs-sync for 5.11 a bit
more clear.

In reality, the libxfs_inode_alloc function matches the kernel's
xfs_dir_ialloc so rename it for clarity before the rest of the
sync, and change several variable names for the same reason.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Eric Sandeen [Fri, 11 Dec 2020 22:19:43 +0000 (17:19 -0500)]

xfsprogs: Release v5.10.0

Update all the necessary files for a 5.10.0 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Anthony Iliopoulos [Wed, 9 Dec 2020 17:20:40 +0000 (12:20 -0500)]

xfs_repair: remove obsolete code for handling mountpoint inodes

The S_IFMNT file type was never supported in Linux, remove the related
code that was supposed to deal with it, along with the translation file
entries.

Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Eric Sandeen [Fri, 4 Dec 2020 19:46:15 +0000 (14:46 -0500)]

xfsprogs: Release v5.10.0-rc1

Update all the necessary files for a 5.10.0-rc1 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Eric Sandeen [Fri, 4 Dec 2020 17:17:12 +0000 (12:17 -0500)]

xfsprogs: make things non-gender-specific

Users are not exclusively male, so fix that implication
in the xfs_quota manpage and the configure.ac comments.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Eric Sandeen [Fri, 4 Dec 2020 17:17:12 +0000 (12:17 -0500)]

xfs_quota: Remove delalloc caveat from man page

Ever since
89605011915a ("xfs: include reservations in quota reporting")
xfs quota has been in sync with delayed allocations, so this caveat
is no longer relevant or correct; remove it.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Eric Sandeen [Fri, 4 Dec 2020 17:17:12 +0000 (12:17 -0500)]

xfs_quota: document how the default quota is stored

Nowhere in the man page is the default quota described; what it
does or where it is stored. Add some brief information about this.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 4 Dec 2020 17:17:12 +0000 (12:17 -0500)]

debian: add build dependency on libinih-dev

mkfs now supports configuration files, which are parsed using libinih.
Add this dependency to the debian build.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 4 Dec 2020 17:17:12 +0000 (12:17 -0500)]

debian: fix version in changelog

We're still at 5.10-rc0, at least according to the tags.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 4 Dec 2020 17:17:12 +0000 (12:17 -0500)]

libxfs: add realtime extent reservation and usage tracking to transactions

The libxfs resync added to the deferred ops code the ability to capture
the unfinished deferred ops and transaction reservation for later replay
during log recovery. This nominally requires transactions to have the
ability to track rt extent reservations and usage, so port that missing
piece from the kernel now to avoid leaving logic bombs in case anyone
ever /does/ start messing with realtime.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 4 Dec 2020 17:17:12 +0000 (12:17 -0500)]

libxfs: fix weird comment

Not sure what happened with this multiline comment, but clean up all the
stars.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 4 Dec 2020 17:16:59 +0000 (12:16 -0500)]

libxfs-apply: don't add duplicate headers

When we're backporting patches from libxfs, don't add a S-o-b header if
there's already one at the end of the headers of the patch being ported.

That way, we avoid things like:
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Tue, 24 Nov 2020 16:58:25 +0000 (11:58 -0500)]

libxfs: get rid of b_bcount from xfs_buf

We no longer use it in the kernel - it has been replaced by b_length
and it only exists in userspace because we haven't converted it
over. Do that now before we introduce a heap of code that doesn't
ever set it and so breaks all the progs code.

While we are doing this, kill the XFS_BUF_SIZE macro, which has also
been removed from the kernel.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Christoph Hellwig [Mon, 23 Nov 2020 19:49:35 +0000 (14:49 -0500)]

repair: simplify bmap_next_offset

The tp argument is always NULL, and the whichfork argument is always
XFS_DATA_FORK, so simplify and cleanup the function based on those
assumptions.

[sandeen: rebase for current git tree]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Gao Xiang [Mon, 23 Nov 2020 19:25:15 +0000 (14:25 -0500)]

xfs: fix forkoff miscalculation related to XFS_LITINO(mp)

Source kernel commit: ada49d64fb3538144192181db05de17e2ffc3551

Commit e9e2eae89ddb dropped an (int) cast from XFS_LITINO(mp), and
since a sizeof() expression is also involved, the result of
XFS_LITINO(mp) now has type size_t (commonly unsigned long).

Considering the expression in xfs_attr_shortform_bytesfit():
offset = (XFS_LITINO(mp) - bytes) >> 3;
let "bytes" be (int)340, and
"XFS_LITINO(mp)" be (unsigned long)336.

On a 64-bit platform, the expression is
offset = ((unsigned long)336 - (int)340) >> 3 =
(int)(0xfffffffffffffffcUL >> 3) = -1

but on a 32-bit platform, the expression is
offset = ((unsigned long)336 - (int)340) >> 3 =
(int)(0xfffffffcUL >> 3) = 0x1fffffff
instead.

So offset becomes a large positive number on a 32-bit platform, which
causes xfs_attr_shortform_bytesfit() to return maxforkoff rather than 0.

One result is the
"ASSERT(new_size <= XFS_IFORK_SIZE(ip, whichfork));"
assertion failure in xfs_idata_realloc(), which was also the root
cause of the original bug report from Dennis; see:
https://bugzilla.redhat.com/show_bug.cgi?id=1894177

It can also be triggered manually on a 32-bit platform with the
following commands:
$ touch a;
$ setfattr -n user.0 -v "`seq 0 80`" a;
$ setfattr -n user.1 -v "`seq 0 80`" a

Fix this in xfs_attr_shortform_bytesfit() by bailing out early when
"XFS_LITINO(mp) < bytes", as suggested by Eric, and fix a misleading
comment along with this bugfix, as suggested by Darrick. It seems the
other users of XFS_LITINO(mp) are not impacted.

Fixes: e9e2eae89ddb ("xfs: only check the superblock version for dinode size calculation")
Cc: <stable@vger.kernel.org> # 5.7+
Reported-and-tested-by: Dennis Gilmore <dgilmore@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
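
The arithmetic above can be reproduced with fixed-width types. This is a minimal sketch, not the kernel function: `forkoff_buggy32()`, `forkoff_buggy64()` and `forkoff_fixed()` are hypothetical names, `uint32_t` and `uint64_t` stand in for a 32-bit and a 64-bit `size_t`, and the early return only models the "bail out when XFS_LITINO(mp) < bytes" idea described in the commit message.

```c
#include <stdint.h>

/* Models a 32-bit size_t: the subtraction wraps to a large positive value. */
static int forkoff_buggy32(uint32_t litino, int bytes)
{
	return (int)((litino - (uint32_t)bytes) >> 3);
}

/*
 * Models a 64-bit size_t: the shifted value does not fit in an int, and the
 * narrowing conversion (implementation-defined, truncating on common
 * compilers) happens to yield a negative number.
 */
static int forkoff_buggy64(uint64_t litino, int bytes)
{
	return (int)((litino - (uint64_t)bytes) >> 3);
}

/* Sketch of the fix: refuse to subtract when the result would be negative. */
static int forkoff_fixed(uint32_t litino, int bytes)
{
	if (bytes < 0 || litino < (uint32_t)bytes)
		return 0;
	return (int)((litino - (uint32_t)bytes) >> 3);
}
```

With litino = 336 and bytes = 340, the 32-bit model yields 0x1fffffff while the guarded version yields 0, so the result no longer depends on the width of size_t.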


Dave Chinner [Fri, 20 Nov 2020 22:03:30 +0000 (17:03 -0500)]

xfsprogs: get rid of ancient btree tracing fragments

If we are going to do any userspace tracing, it will be via the
existing libxfs tracepoint hooks, not the ancient Irix tracing
macros.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Fri, 20 Nov 2020 22:03:30 +0000 (17:03 -0500)]

libxfs: rename buftarg->dev to btdev

To prepare for alignment with kernel buftarg code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Fri, 20 Nov 2020 22:03:30 +0000 (17:03 -0500)]

xfsprogs: remove unused IO_DEBUG functionality

Similar to the XFS_BUF_TRACING code, this is largely unused and not
hugely helpful for tracing buffer IO. Remove it to simplify the
conversion process to the kernel buffer cache.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Fri, 20 Nov 2020 22:03:30 +0000 (17:03 -0500)]

xfsprogs: remove unused buffer tracing code

This isn't particularly useful for finding issues, it's rarely used,
and it complicates the conversion to the kernel buffer cache code. The
kernel code also carries its own trace hooks that could be
implemented if tracing is needed, so remove this code to make the
conversion simpler.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Eric Sandeen [Fri, 20 Nov 2020 22:03:30 +0000 (17:03 -0500)]

xfs_io: fix up typos in manpage

We go in reverse direction, not reserve direction.
We go in forward direction, not forwards direction.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

mkfs: document config files in mkfs.xfs(8)

So people know it exists.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

mkfs: hook up suboption parsing to ini files

Now that we have the config file parsing hooked up and feeding
parameters in to mkfs, wire the parameters up to the existing CLI
option parsing functions. This gives the config file exactly the
same capabilities and constraints as the command line option
specification.

[sandeen: fix whitespace, drop "opt" check in parse_subopts()]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

mkfs: constify various strings

The ini parser uses const strings, so the opt parsing needs to be
told about it to avoid compiler warnings.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

mkfs: add initial ini format config file parsing support

Add the framework that will allow the config file to be supplied on
the CLI and passed to the library that will parse it. This does not
yet do any option parsing from the config file.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Dave Chinner [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

build: add support for libinih for mkfs

Need to make sure the library is present so we can build mkfs with
config file support.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Jakub Bogusz [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

Polish translation update for xfsprogs 5.8.0.

[sandeen: reviewed insofar as it only changes .po file]

Signed-off-by: Jakub Bogusz <qboosh@pld-linux.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

xfs_repair: directly compare refcount records

Check that our observed refcount records have exact matches for what's
in the ondisk refcount btree, since they're supposed to match exactly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

xfs_repair: correctly detect partially written extents

Recently, I was able to create a realtime file with a 16b extent size
and the following data fork mapping:

data offset 0 startblock 144 (0/144) count 3 flag 0
data offset 3 startblock 147 (0/147) count 3 flag 1
data offset 6 startblock 150 (0/150) count 10 flag 0

Notice how we have a written extent, then an unwritten extent, and then
another written extent. The current code in process_rt_rec trips over
that third extent, because repair only knows not to complain about inuse
extents if the mapping was unwritten.

This loop logic is confusing, because it tries to do too many things.
Move the phase3 and phase4 code to separate helper functions, then
isolate the code that handles a mapping that starts in the middle of an
rt extent so that it's clearer what's going on.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

xfs_repair: skip the rmap and refcount btree checks when the levels are garbage

In validate_ag[fi], we should check that the levels of the rmap and
refcount btrees are valid. If they aren't, we need to tell phase4 to
skip the comparison between the existing and incore rmap and refcount
data. The comparison routines use libxfs btree cursors, which assume
that the caller validated bc_nlevels and will corrupt memory if we load
a btree cursor with a garbage level count.

This was found by examining a core dump from a failed xfs/086 invocation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 20 Nov 2020 22:03:29 +0000 (17:03 -0500)]

xfs_db: report ranges of invalid rt blocks

Copy-pasta the block range reporting code from check_range into
check_rrange so that we don't flood stdout with a ton of low value
messages when a bit flips somewhere in rt metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

xfs: remove unnecessary parameter from scrub_scan_estimate_blocks

The only caller that cares about the file counts uses it to compute the
number of files used, so return that and save a parameter.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

mkfs: don't pass on extent size inherit flags when extent size is zero

If the caller passes in an extent size hint of zero, clear the inherit
flags because a hint value of zero is treated as not a hint.

Otherwise, you get stupid stuff like:
$ mkfs.xfs -d cowextsize=0 /tmp/a.img -f
illegal CoW extent size hint 0, must be less than 9600.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
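
A rough sketch of the rule described above: a zero hint means "no hint", so the matching inherit flag is dropped. Everything here is an assumption made for illustration; the flag macros, `struct fs_defaults` and `drop_empty_hints()` are hypothetical names and do not correspond to the real on-disk flags or mkfs code.

```c
#include <stdint.h>

/* Hypothetical flag bits; the real flag names and values differ. */
#define FL_EXTSZ_INHERIT	(1u << 0)
#define FL_COWEXTSZ_INHERIT	(1u << 1)

/* Hypothetical container for the defaults handed to the root directory. */
struct fs_defaults {
	uint32_t flags;
	uint32_t extsize;	/* extent size hint, 0 means "no hint" */
	uint32_t cowextsize;	/* CoW extent size hint, 0 means "no hint" */
};

/* Clear each inherit flag whose hint value is zero. */
static void drop_empty_hints(struct fs_defaults *d)
{
	if (d->extsize == 0)
		d->flags &= ~FL_EXTSZ_INHERIT;
	if (d->cowextsize == 0)
		d->flags &= ~FL_COWEXTSZ_INHERIT;
}
```

In this model, a request such as cowextsize=0 ends up with no CoW inherit flag set, so later validation has no zero-valued hint to reject.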


Eric Sandeen [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

mkfs: clarify valid "inherit" option values

Clarify which values are valid for the various *inherit= mkfs
options.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix a few nits]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

mkfs: allow users to specify rtinherit=0

mkfs has quite a few boolean options that can be specified in several
ways: "option=1" (turn it on), "option" (turn it on), or "option=0"
(turn it off). For whatever reason, rtinherit sticks out as the only
mkfs parameter that doesn't behave that way. Let's make it behave the
same as all the other boolean variables.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

mkfs: format bigtime filesystems

Allow formatting with large timestamps.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

xfs: enable big timestamps

Source kernel commit: 29887a22713192509cfc6068ea3b200cdb8856da

Enable the big timestamp feature.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

xfs_repair: support bigtime timestamp checking

Make sure that inodes don't have the bigtime flag set when the feature
is disabled, and don't check for overflows in the nanoseconds when
bigtime is enabled because that is no longer possible. Also make sure
that quotas don't have bigtime set erroneously.
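
A minimal sketch of the two inode checks described above, as a Python model. The function and parameter names are hypothetical, and treating "bigtime is enabled" as a per-inode flag is an assumption made for the example; the real xfs_repair code is C and is structured differently.

```python
NSEC_PER_SEC = 1_000_000_000


def check_inode_timestamp(fs_has_bigtime, inode_has_bigtime, nsec):
    """Return a list of problems found, following the rules in the message."""
    problems = []
    if inode_has_bigtime and not fs_has_bigtime:
        problems.append("bigtime flag set but feature disabled")
    # Per the commit message, nanosecond overflow is not possible with
    # bigtime, so the range check only applies without it (assumption:
    # keyed off the inode flag).
    if not inode_has_bigtime and not 0 <= nsec < NSEC_PER_SEC:
        problems.append("nanoseconds out of range")
    return problems


print(check_inode_timestamp(False, True, 0))
print(check_inode_timestamp(False, False, NSEC_PER_SEC))
```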

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

xfs_quota: support editing and reporting quotas with bigtime

Enhance xfs_quota to detect and report grace period expirations past
2038.
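
For context on the 2038 figure: a signed 32-bit seconds counter runs out early that year. The Python snippet below only demonstrates that arithmetic; it assumes the pre-bigtime limit is a signed 32-bit count of seconds since the Unix epoch, and the helper names are made up for the example.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1


def last_32bit_second():
    """Latest UTC time representable as a signed 32-bit seconds counter."""
    return datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


def fits_in_int32(seconds):
    """True if the value can be stored in a signed 32-bit field."""
    return -(2**31) <= seconds <= INT32_MAX


print(last_32bit_second().isoformat())  # 2038-01-19T03:14:07+00:00
print(fits_in_int32(INT32_MAX + 1))     # False
```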

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

xfs_db: support printing time limits

Support printing the minimum and maximum timestamp limits on this
filesystem.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

xfs_db: report bigtime format timestamps

Report the large format timestamps in a human-readable manner if it is
possible to do so without loss of information.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

libfrog: list the bigtime feature when reporting geometry

When we're reporting on a filesystem's geometry, report if the bigtime
feature is enabled on this filesystem.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:28 +0000 (17:03 -0500)]

xfs_db: refactor quota timer printing

Introduce type-specific printing functions to xfs_db to print a quota
timer instead of printing a raw int32 value. This is needed to stay
ahead of changes that we're going to make to the quota timer format in
the following patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

xfs_db: refactor timestamp printing

Introduce type-specific printing functions to xfs_db to print an
xfs_timestamp instead of open-coding the timestamp decoding. This is
needed to stay ahead of changes that we're going to make to
xfs_timestamp_t in the following patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

xfs_quota: convert time_to_string to use time64_t

Rework the time_to_string helper to be capable of dealing with 64-bit
timestamps.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

libfrog: convert cvttime to return time64_t

Change the cvttime function to return 64-bit time values so that we can
put them to use with the bigtime feature.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

libfrog: define LIBFROG_BULKSTAT_CHUNKSIZE to remove dependence on XFS_INODES_PER_CHUNK

"Online" XFS programs like scrub have no business importing the internal
disk format headers to discover things like the optimum number of inodes
to request through a bulkstat request. That number can be derived from
the ioctl definition, so define a new constant in terms of that instead
of pulling in the ondisk format unnecessarily.

Note: This patch will be needed to work around new definitions in the
bigtime patchset that will break scrub builds, so clean this up instead
of adding more #includes to the two scrub source files.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

mkfs: enable the inode btree counter feature

Teach mkfs how to enable the inode btree counter feature.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

xfs: enable new inode btree counters feature

Source kernel commit: b896a39faa5a2f97dadfb347928466afb12cc63a

Enable the new inode btree counters feature.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

xfs_repair: regenerate inode btree block counters in AGI

Reset both inode btree block counters in the AGI when rebuilding the
metadata indexes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

xfs_repair: check inode btree block counters in AGI

Make sure that both inode btree block counters in the AGI are correct.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:03:27 +0000 (17:03 -0500)]

xfs_db: support displaying inode btree block counts in AGI header

Fix up xfs_db to support displaying the btree block counts.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Fri, 20 Nov 2020 22:02:55 +0000 (17:02 -0500)]

xfs: revert "xfs: fix rmap key and record comparison functions"

Source kernel commit: eb8409071a1d47e3593cfe077107ac46853182ab

This reverts commit 6ff646b2ceb0eec916101877f38da0b73e3a5b7f.

Your maintainer committed a major braino in the rmap code by adding the
attr fork, bmbt, and unwritten extent usage bits into rmap record key
comparisons. While XFS uses the usage bits *in the rmap records* for
cross-referencing metadata in xfs_scrub and xfs_repair, it only needs
the owner and offset information to distinguish between reverse mappings
of the same physical extent into the data fork of a file at multiple
offsets. The other bits are not important for key comparisons for index
lookups, and never have been.

Eric Sandeen reports that this causes regressions in generic/299, so
undo this patch before it does more damage.

Reported-by: Eric Sandeen <sandeen@sandeen.net>
Fixes: 6ff646b2ceb0 ("xfs: fix rmap key and record comparison functions")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Eric Sandeen [Mon, 16 Nov 2020 20:27:37 +0000 (15:27 -0500)]

xfsprogs: Release v5.10.0-rc0

Update all the necessary files for a 5.10.0-rc0 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Sat, 14 Nov 2020 17:09:13 +0000 (12:09 -0500)]

xfs: fix rmap key and record comparison functions

Source kernel commit: 6ff646b2ceb0eec916101877f38da0b73e3a5b7f

Keys for extent interval records in the reverse mapping btree are
supposed to be computed as follows:

(physical block, owner, fork, is_btree, is_unwritten, offset)

This provides users the ability to look up a reverse mapping from a bmbt
record -- start with the physical block; then if there are multiple
records for the same block, move on to the owner; then the inode fork
type; and so on to the file offset.

However, the key comparison functions incorrectly remove the
fork/btree/unwritten information that's encoded in the on-disk offset.
This means that lookup comparisons are only done with:

(physical block, owner, offset)

This means that queries can return incorrect results. On consistent
filesystems this hasn't been an issue because blocks are never shared
between forks or with bmbt blocks; and are never unwritten. However,
this bug means that online repair cannot always detect corruption in the
key information in internal rmapbt nodes.
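
The difference between the two key tuples can be illustrated with a short Python model. The class, the field encodings, and the sample values are hypothetical; only the field order is taken from the message above.

```python
from typing import NamedTuple


class RmapKey(NamedTuple):
    # Field order follows the tuple given in the commit message.
    physical_block: int
    owner: int
    fork: int            # 0 = data fork, 1 = attr fork (illustrative encoding)
    is_btree: bool
    is_unwritten: bool
    offset: int


def full_key(k):
    """Key compared with every field."""
    return tuple(k)


def reduced_key(k):
    """Key as compared when the fork/btree/unwritten bits are dropped."""
    return (k.physical_block, k.owner, k.offset)


data = RmapKey(100, 7, 0, False, False, 0)
attr = RmapKey(100, 7, 1, False, False, 0)
print(full_key(data) < full_key(attr))         # True: the fork breaks the tie
print(reduced_key(data) == reduced_key(attr))  # True: indistinguishable
```

This only shows the distinction the message draws; it takes no position on which comparison is appropriate for index lookups.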

Found by fuzzing keys[1].attrfork = ones on xfs/371.

Fixes: 4b8ed67794fe ("xfs: add rmap btree operations")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Sat, 14 Nov 2020 17:08:47 +0000 (12:08 -0500)]

xfs: fix flags argument to rmap lookup when converting shared file rmaps

Source kernel commit: ea8439899c0b15a176664df62aff928010fad276

Pass the same oldext argument (which contains the existing rmapping's
unwritten state) to xfs_rmap_lookup_le_range at the start of
xfs_rmap_convert_shared. At this point in the code, flags is zero,
which means that we perform lookups using the wrong key.

Fixes: 3f165b334e51 ("xfs: convert unwritten status of reverse mappings for shared files")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:39:58 +0000 (17:39 -0500)]

xfs: set xefi_discard when creating a deferred agfl free log intent item

Source kernel commit: 2c334e12f957cd8c6bb66b4aa3f79848b7c33cab

Make sure that we actually initialize xefi_discard when we're scheduling
a deferred free of an AGFL block. This was (eventually) found by the
UBSAN while I was banging on realtime rmap problems, but it exists in
the upstream codebase. While we're at it, rearrange the structure to
reduce the struct size from 64 to 56 bytes.

Fixes: fcb762f5de2e ("xfs: add bmapi nodiscard flag")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:22:04 +0000 (17:22 -0500)]

xfs: fix high key handling in the rt allocator's query_range function

Source kernel commit: d88850bd5516a77c6f727e8b6cefb64e0cc929c7

Fix some off-by-one errors in xfs_rtalloc_query_range.  The highest key
in the realtime bitmap is always one less than the number of rt extents,
which means that the key clamp at the start of the function is wrong.
The 4th argument to xfs_rtfind_forw is the highest rt extent that we
want to probe, which means that passing 1 less than the high key is
wrong.  Finally, drop the rem variable that controls the loop because we
can compare the iteration point (rtstart) against the high key directly.
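
The inclusive-high-key behaviour described above can be modelled in a few lines of Python. This is a sketch, not the kernel function: the bitmap is a plain list of booleans (True meaning free), and find_forw is a simplified stand-in for xfs_rtfind_forw with a made-up signature.

```python
def find_forw(bitmap, start, limit):
    """Last index in [start, limit] whose state matches bitmap[start]."""
    end = start
    while end < limit and bitmap[end + 1] == bitmap[start]:
        end += 1
    return end


def query_free_ranges(bitmap, low, high):
    """Yield (start, length) for free runs within [low, high], both inclusive."""
    rextents = len(bitmap)
    if low > high or low >= rextents:
        return
    high = min(high, rextents - 1)   # highest valid key is rextents - 1
    rtstart = low
    while rtstart <= high:
        # Pass the high key itself as the limit, not high - 1.
        rtend = find_forw(bitmap, rtstart, high)
        if bitmap[rtstart]:
            yield rtstart, rtend - rtstart + 1
        rtstart = rtend + 1


bitmap = [True, True, False, True]
print(list(query_free_ranges(bitmap, 0, 10)))  # [(0, 2), (3, 1)]
```

Note that the loop compares rtstart against the high key directly, so no separate remaining-count variable is needed.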

The sordid history of this function is that the original commit (fb3c3)
incorrectly passed (high_rec->ar_startblock - 1) as the 'limit' parameter
to xfs_rtfind_forw.  This was wrong because the "high key" is supposed
to be the largest key for which the caller wants result rows, not the
key for the first row that could possibly be outside the range that the
caller wants to see.

A subsequent attempt (8ad56) to strengthen the parameter checking added
incorrect clamping of the parameters to the number of rt blocks in the
system (despite the bitmap functions all taking units of rt extents) to
avoid querying ranges past the end of the rt bitmap file but failed to fix
the incorrect _rtfind_forw parameter.  The original _rtfind_forw
parameter error then survived the conversion of the startblock and
blockcount fields to rt extents (a0e5c), and the most recent off-by-one
fix (a3a37) thought it was patching a problem when the end of the rt
volume is not in use, but none of these fixes actually solved the
original problem that the author was confused about the "limit" argument
to xfs_rtfind_forw.

Sadly, all four of these patches were written by this author and even
his own usage of this function and rt testing were inadequate to get
this fixed quickly.

Original-problem: fb3c3de2f65c ("xfs: add a couple of queries to iterate free extents in the rtbitmap")
Not-fixed-by: 8ad560d2565e ("xfs: strengthen rtalloc query range checks")
Not-fixed-by: a0e5c435babd ("xfs: fix xfs_rtalloc_rec units")
Fixes: a3a374bf1889 ("xfs: fix off-by-one error in xfs_rtalloc_query_range")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:21:54 +0000 (17:21 -0500)]

xfs: only relog deferred intent items if free space in the log gets low

Source kernel commit: 74f4d6a1e065c92428c5b588099e307a582d79d9

Now that we have the ability to ask the log how far the tail needs to be
pushed to maintain its free space targets, augment the decision to relog
an intent item so that we only do it if the log has hit the 75% full
threshold. There's no point in relogging an intent into the same
checkpoint, and there's no need to relog if there's plenty of free space
in the log.
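
The decision described above boils down to two conditions, sketched here as a Python predicate. The names are hypothetical and the log-fullness test is modelled as a simple ratio, which is an assumption; the kernel asks the log how far the tail needs to be pushed rather than computing a percentage like this.

```python
def should_relog(intent_checkpoint, current_checkpoint, log_used, log_size,
                 threshold=0.75):
    """Decide whether to relog an intent item, per the two conditions above."""
    if intent_checkpoint == current_checkpoint:
        return False    # relogging into the same checkpoint gains nothing
    return log_used >= threshold * log_size


print(should_relog(1, 2, log_used=800, log_size=1000))  # True
print(should_relog(2, 2, log_used=900, log_size=1000))  # False
print(should_relog(1, 2, log_used=100, log_size=1000))  # False
```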

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:21:44 +0000 (17:21 -0500)]

xfs: periodically relog deferred intent items

Source kernel commit: 4e919af7827a6adfc28e82cd6c4ffcfcc3dd6118

There's a subtle design flaw in the deferred log item code that can lead
to pinning the log tail.  Taking up the defer ops chain examples from
the previous commit, we can get trapped in sequences like this:

Caller hands us a transaction t0 with D0-D3 attached.  The defer ops
chain will look like the following if the transaction rolls succeed:

t1: D0(t0), D1(t0), D2(t0), D3(t0)
t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0)
t3: d5(t1), D1(t0), D2(t0), D3(t0)
...
t9: d9(t7), D3(t0)
t10: D3(t0)
t11: d10(t10), d11(t10)
t12: d11(t10)

In transaction 9, we finish d9 and try to roll to t10 while holding onto
an intent item for D3 that we logged in t0.

The previous commit changed the order in which we place new defer ops in
the defer ops processing chain to reduce the maximum chain length.  Now
make xfs_defer_finish_noroll capable of relogging the entire chain
periodically so that we can always move the log tail forward.  Most
chains will never get relogged, except for operations that generate very
long chains (large extents containing many blocks with different sharing
levels) or are on filesystems with small logs and a lot of ongoing
metadata updates.

Callers are now required to ensure that the transaction reservation is
large enough to handle logging done items and new intent items for the
maximum possible chain length.  Most callers are careful to keep the
chain lengths low, so the overhead should be minimal.

The decision to relog an intent item is made based on whether the intent
was logged in a previous checkpoint, since there's no point in relogging
an intent into the same checkpoint.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:21:40 +0000 (17:21 -0500)]

xfs: change the order in which child and parent defer ops are finished

Source kernel commit: 27dada070d59c28a441f1907d2cec891b17dcb26

The defer ops code has been finishing items in the wrong order -- if a
top level defer op creates items A and B, and finishing item A creates
more defer ops A1 and A2, we'll put the new items on the end of the
chain and process them in the order A B A1 A2.  This is kind of weird,
since it's convenient for programmers to be able to think of A and B as
an ordered sequence where all the sub-tasks for A must finish before we
move on to B, e.g. A A1 A2 B.
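
The two orderings can be reproduced with a toy Python model of the pending list. The function and the item names are illustrative only; the real code operates on kernel lists inside xfs_defer_finish_noroll.

```python
from collections import deque


def finish_order(top_level, children, new_items_first):
    """Return the order in which defer ops finish.

    children maps an item name to the items that finishing it creates.
    """
    pending = deque(top_level)
    order = []
    while pending:
        item = pending.popleft()
        order.append(item)
        new = children.get(item, [])
        if new_items_first:
            pending.extendleft(reversed(new))   # splice at the head, keep order
        else:
            pending.extend(new)                 # append at the tail
    return order


children = {"A": ["A1", "A2"]}
print(finish_order(["A", "B"], children, new_items_first=False))  # ['A', 'B', 'A1', 'A2']
print(finish_order(["A", "B"], children, new_items_first=True))   # ['A', 'A1', 'A2', 'B']
```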

Right now, our log intent items are not so complex that this matters,
but this will become important for the atomic extent swapping patchset.
In order to maintain correct reference counting of extents, we have to
unmap and remap extents in that order, and we want to complete that work
before moving on to the next range that the user wants to swap.  This
patch fixes defer ops to satisfy that requirement.

The primary symptom of the incorrect order was noticed in an early
performance analysis of the atomic extent swap code.  An astonishingly
large number of deferred work items accumulated when userspace requested
an atomic update of two very fragmented files.  The cause of this was
traced to the same ordering bug in the inner loop of
xfs_defer_finish_noroll.

If the ->finish_item method of a deferred operation queues new deferred
operations, those new deferred ops are appended to the tail of the
pending work list.  To illustrate, say that a caller creates a
transaction t0 with four deferred operations D0-D3.  The first thing
defer ops does is roll the transaction to t1, leaving us with:

t1: D0(t0), D1(t0), D2(t0), D3(t0)

Let's say that finishing each of D0-D3 will create two new deferred ops.
After finishing D0 and rolling, we'll have the following chain:

t2: D1(t0), D2(t0), D3(t0), d4(t1), d5(t1)

d4 and d5 were logged to t1.  Notice that while we're about to start
work on D1, we haven't actually completed all the work implied by D0
being finished.  So far we've been careful (or lucky) to structure the
dfops callers such that D1 doesn't depend on d4 or d5 being finished,
but this is a potential logic bomb.

There's a second problem lurking.  Let's see what happens as we finish
D1-D3:

t3: D2(t0), D3(t0), d4(t1), d5(t1), d6(t2), d7(t2)
t4: D3(t0), d4(t1), d5(t1), d6(t2), d7(t2), d8(t3), d9(t3)
t5: d4(t1), d5(t1), d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4)

Let's say that d4-d11 are simple work items that don't queue any other
operations, which means that we can complete d4 and roll to t6:

t6: d5(t1), d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4)
t7: d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4)
...
t11: d10(t4), d11(t4)
t12: d11(t4)
<done>

When we try to roll to transaction #12, we're holding defer op d11,
which we logged way back in t4.  This means that the tail of the log is
pinned at t4.  If the log is very small or there are a lot of other
threads updating metadata, this means that we might have wrapped the log
and cannot roll to t11 because there isn't enough space left before
we'd run into t4.

Let's shift back to the original failure.  I mentioned before that I
discovered this flaw while developing the atomic file update code.  In
that scenario, we have a defer op (D0) that finds a range of file blocks
to remap, creates a handful of new defer ops to do that, and then asks
to be continued with however much work remains.

So, D0 is the original swapext deferred op.  The first thing defer ops
does is roll to t1:

t1: D0(t0)

We try to finish D0, logging d1 and d2 in the process, but can't get all
the work done.  We log a done item and a new intent item for the work
that D0 still has to do, and roll to t2:

t2: D0'(t1), d1(t1), d2(t1)

We roll and try to finish D0', but still can't get all the work done, so
we log a done item and a new intent item for it, requeue D0 a second
time, and roll to t3:

t3: D0''(t2), d1(t1), d2(t1), d3(t2), d4(t2)

If it takes 48 more rolls to complete D0, then we'll finally dispense
with D0 in t50:

t50: D<fifty primes>(t49), d1(t1), ..., d102(t50)

We then try to roll again to get a chain like this:

t51: d1(t1), d2(t1), ..., d101(t50), d102(t50)
...
t152: d102(t50)
<done>

Notice that in rolling to transaction #51, we're holding on to a log
intent item for d1 that was logged in transaction #1.  This means that
the tail of the log is pinned at t1.  If the log is very small or there
are a lot of other threads updating metadata, this means that we might
have wrapped the log and cannot roll to t51 because there isn't enough
space left before we'd run into t1.  This is of course problem #2 again.

But notice the third problem with this scenario: we have 102 defer ops
tied to this transaction!  Each of these items is backed by pinned
kernel memory, which means that we risk OOM if the chains get too long.

Yikes.  Problem #1 is a subtle logic bomb that could hit someone in the
future; problem #2 applies (rarely) to the current upstream, and problem
#3 applies to work under development.

This is not how incremental deferred operations were supposed to work.
The dfops design of logging in the same transaction an intent-done item
and a new intent item for the work remaining was to make it so that we
only have to juggle enough deferred work items to finish that one small
piece of work.  Deferred log item recovery will find that first
unfinished work item and restart it, no matter how many other intent
items might follow it in the log.  Therefore, it's ok to put the new
intents at the start of the dfops chain.

For the first example, the chains look like this:

t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0)
t3: d5(t1), D1(t0), D2(t0), D3(t0)
...
t9: d9(t7), D3(t0)
t10: D3(t0)
t11: d10(t10), d11(t10)
t12: d11(t10)

For the second example, the chains look like this:

t1: D0(t0)
t2: d1(t1), d2(t1), D0'(t1)
t3: d2(t1), D0'(t1)
t4: D0'(t1)
t5: d1(t4), d2(t4), D0''(t4)
...
t148: D0<50 primes>(t147)
t149: d101(t148), d102(t148)
t150: d102(t148)
<done>

This actually sucks more for pinning the log tail (we try to roll to t10
while holding an intent item that was logged in t1) but we've solved
problem #1.  We've also reduced the maximum chain length from:

sum(all the new items) + nr_original_items

to:

max(new items that each original item creates) + nr_original_items

This solves problem #3 by sharply reducing the number of defer ops that
can be attached to a transaction at any given time.  The change makes
the problem of log tail pinning worse, but is an improvement we need to
solve problem #2.  Actually solving #2, however, is left to the next
patch.
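
To make the chain-length claim concrete, the first example (four original items, each creating two simple items) can be simulated in Python. This is a toy model with hypothetical names; it counts pending items after each finish step and ignores logging and transaction rolls entirely.

```python
from collections import deque


def max_chain_length(nr_original, new_per_item, new_items_first):
    """Longest pending chain seen while finishing the first example's items.

    Each original item creates new_per_item simple items when it finishes;
    the simple items create nothing.
    """
    pending = deque(("D", i) for i in range(nr_original))
    longest = len(pending)
    while pending:
        kind, _ = pending.popleft()
        new = [("d", i) for i in range(new_per_item)] if kind == "D" else []
        if new_items_first:
            pending.extendleft(reversed(new))   # new items go to the head
        else:
            pending.extend(new)                 # new items go to the tail
        longest = max(longest, len(pending))
    return longest


print(max_chain_length(4, 2, new_items_first=False))  # 8
print(max_chain_length(4, 2, new_items_first=True))   # 5
```

The observed peaks (8 and 5) match the t5 and t2 chains shown earlier and stay within the two bounds quoted above, 8 + 4 and 2 + 4 respectively.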

Note that a subsequent analysis of some hard-to-trigger reflink and COW
livelocks on extremely fragmented filesystems (or systems running a lot
of IO threads) showed the same symptoms -- uncomfortably large numbers
of incore deferred work items and occasional stalls in the transaction
grant code while waiting for log reservations.  I think this patch and
the next one will also solve these problems.

As originally written, the code used list_splice_tail_init instead of
list_splice_init, so change that, and leave a short comment explaining
our actions.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:21:30 +0000 (17:21 -0500)]

xfs: fix an incore inode UAF in xfs_bui_recover

Source kernel commit: ff4ab5e02a0447dd1e290883eb6cd7d94848e590

In xfs_bui_item_recover, there exists a use-after-free bug with regard
to the inode that is involved in the bmap replay operation.  If the
mapping operation does not complete, we call xfs_bmap_unmap_extent to
create a deferred op to finish the unmapping work, and we retain a
pointer to the incore inode.

Unfortunately, the very next thing we do is commit the transaction and
drop the inode.  If reclaim tears down the inode before we try to finish
the defer ops, we dereference garbage and blow up.  Therefore, create a
way to join inodes to the defer ops freezer so that we can maintain the
xfs_inode reference until we're done with the inode.

Note: This imposes the requirement that there be enough memory to keep
every incore inode in memory throughout recovery.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:21:20 +0000 (17:21 -0500)]

xfs: xfs_defer_capture should absorb remaining transaction reservation

Source kernel commit: 929b92f64048d90d23e40a59c47adf59f5026903

When xfs_defer_capture extracts the deferred ops and transaction state
from a transaction, it should record the transaction reservation type
from the old transaction so that when we continue the dfops chain, we
still use the same reservation parameters.

Doing this means that the log item recovery functions get to determine
the transaction reservation instead of abusing tr_itruncate in yet
another part of xfs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:21:10 +0000 (17:21 -0500)]

xfs: xfs_defer_capture should absorb remaining block reservations

Source kernel commit: 4f9a60c48078c0efa3459678fa8d6e050e8ada5d

When xfs_defer_capture extracts the deferred ops and transaction state
from a transaction, it should record the remaining block reservations so
that when we continue the dfops chain, we can reserve the same number of
blocks to use. We capture the reservations for both data and realtime
volumes.

This adds the requirement that every log intent item recovery function
must be careful to reserve enough blocks to handle both itself and all
defer ops that it can queue. On the other hand, this enables us to do
away with the handwaving block estimation nonsense that was going on in
xlog_finish_defer_ops.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:21:01 +0000 (17:21 -0500)]

xfs: proper replay of deferred ops queued during log recovery

Source kernel commit: e6fff81e487089e47358a028526a9f63cdbcd503

When we replay unfinished intent items that have been recovered from the
log, it's possible that the replay will cause the creation of more
deferred work items.  As outlined in commit 509955823cc9c ("xfs: log
recovery should replay deferred ops in order"), later work items have an
implicit ordering dependency on earlier work items.  Therefore, recovery
must replay the items (both recovered and created) in the same order
that they would have been during normal operation.

For log recovery, we enforce this ordering by using an empty transaction
to collect deferred ops that get created in the process of recovering a
log intent item to prevent them from being committed before the rest of
the recovered intent items.  After we finish committing all the
recovered log items, we allocate a transaction with an enormous block
reservation, splice our huge list of created deferred ops into that
transaction, and commit it, thereby finishing all those ops.

This is /really/ hokey -- it's the one place in XFS where we allow
nested transactions; the splicing of the defer ops list is inelegant
and has to be done twice per recovery function; and the broken way we
handle inode pointers and block reservations causes subtle use-after-free
and allocator problems that will be fixed by this patch and the two
patches after it.

Therefore, replace the hokey empty transaction with a structure designed
to capture each chain of deferred ops that are created as part of
recovering a single unfinished log intent.  Finally, refactor the loop
that replays those chains to do so using one transaction per chain.
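
The replacement scheme can be outlined with a small Python model: recover every intent first, capture whatever deferred ops each one queues, then finish each captured chain on its own. All names here are hypothetical, and a "transaction" is reduced to a counter; the sketch shows only the ordering, not reservations or inode handling.

```python
def recover(intents, replay):
    """Replay recovered intents, then finish each captured chain separately.

    replay(intent) returns the list of deferred ops that recovering the
    intent queued. Returns a list of (transaction number, op) pairs.
    """
    captured = []
    for intent in intents:
        chain = replay(intent)
        if chain:
            captured.append(chain)      # one capture structure per intent

    finished = []
    for txn, chain in enumerate(captured, start=1):
        for op in chain:                # one transaction per captured chain
            finished.append((txn, op))
    return finished


queued = {"bui": ["unmap1", "unmap2"], "cui": ["refcount1"]}
print(recover(["bui", "efi", "cui"], lambda intent: queued.get(intent, [])))
```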

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:20:57 +0000 (17:20 -0500)]

xfs: remove xfs_defer_reset

Source kernel commit: b80b29d602a8879829fbf89115e9e6877806a2da

Remove this one-line helper since the assert is trivially true in one
call site and the rest obscures a bitmask operation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Darrick J. Wong [Thu, 12 Nov 2020 22:20:52 +0000 (17:20 -0500)]

xfs: avoid shared rmap operations for attr fork extents

Source kernel commit: d7884e6e90da974b50dc2c3bf50e03b70750e5f1

During code review, I noticed that the rmap code uses the (slower)
shared mappings rmap functions for any extent of a reflinked file, even
if those extents are for the attr fork, which doesn't support sharing.
We can speed up rmap a tiny bit by optimizing out this case.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Kaixu Xia [Thu, 12 Nov 2020 22:20:42 +0000 (17:20 -0500)]

xfs: code cleanup in xfs_attr_leaf_entsize_{remote,local}

Source kernel commit: 61ef5230518a3ad224549a50a01b73989acb94b9

Clean up the typedef usage, the unnecessary parentheses, the unnecessary
backslash and use the open-coded round_up call in
xfs_attr_leaf_entsize_{remote,local}.
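
For readers unfamiliar with the idiom, here is a Python rendering of power-of-two rounding, reading the message as replacing hand-written rounding arithmetic with a round_up helper (an assumption). The alignment of 4 used below is an illustrative value, not a statement about the attr leaf format.

```python
def round_up(x, align):
    """Round x up to a multiple of align, which must be a power of two."""
    assert align > 0 and align & (align - 1) == 0
    return ((x - 1) | (align - 1)) + 1


def open_coded_round_up(x, align):
    """The add-then-mask form that such a helper typically replaces."""
    return (x + align - 1) & ~(align - 1)


for size in (13, 16, 17):
    print(size, round_up(size, 4), open_coded_round_up(size, 4))
```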

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

commit | commitdiff | tree

Kaixu Xia [Thu, 12 Nov 2020 22:20:38 +0000 (17:20 -0500)]

xfs: remove the redundant crc feature check in xfs_attr3_rmt_verify

Source kernel commit: 3feb4ffbf69321284dc78ac6ca43b4a2afadf243

We already check whether the crc feature is enabled before calling
xfs_attr3_rmt_verify(), so remove the redundant feature check in that
function.

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Kaixu Xia [Thu, 12 Nov 2020 22:20:28 +0000 (17:20 -0500)]

xfs: fix some comments

Source kernel commit: a647d109e08ac0961ca0fd511b013d962d256987

Fix the comments to help people understand the code.

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
[darrick: fix the indenting problems too]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Kaixu Xia [Thu, 12 Nov 2020 22:20:18 +0000 (17:20 -0500)]

xfs: use the existing type definition for di_projid

Source kernel commit: 9c0fce4c16fc8d4d119cc3a20f1e5ce870206706

We have already defined the project ID type prid_t, so use it for
di_projid here.

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Darrick J. Wong [Thu, 12 Nov 2020 22:20:08 +0000 (17:20 -0500)]

xfs: log new intent items created as part of finishing recovered intent items

Source kernel commit: 93293bcbde93567efaf4e6bcd58cad270e1fcbf5

During a code inspection, I found a serious bug in the log intent item
recovery code when an intent item cannot complete all the work and
decides to requeue itself to get that done.  When this happens, the
item recovery creates a new incore deferred op representing the
remaining work and attaches it to the transaction that it allocated.  At
the end of _item_recover, it moves the entire chain of deferred ops to
the dummy parent_tp that xlog_recover_process_intents passed to it, but
fails to log a new intent item for the remaining work before committing
the transaction for the single unit of work.

xlog_finish_defer_ops logs those new intent items once recovery has
finished dealing with the intent items that it recovered, but this isn't
sufficient.  If the log is forced to disk after a recovered log item
decides to requeue itself and the system goes down before we call
xlog_finish_defer_ops, the second log recovery will never see the new
intent item and therefore has no idea that there was more work to do.
It will finish recovery leaving the filesystem in a corrupted state.

The same logic applies to /any/ deferred ops added during intent item
recovery, not just the one handling the remaining work.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Darrick J. Wong [Thu, 12 Nov 2020 22:19:52 +0000 (17:19 -0500)]

xfs: don't free rt blocks when we're doing a REMAP bunmapi call

Source kernel commit: 8df0fa39bdd86ca81a8d706a6ed9d33cc65ca625

When callers pass XFS_BMAPI_REMAP into xfs_bunmapi, they want the extent
to be unmapped from the given file fork without the extent being freed.
We do this for non-rt files, but we forgot to do this for realtime
files. So far this isn't a big deal since nobody makes a bunmapi call
to a rt file with the REMAP flag set, but don't leave a logic bomb.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Carlos Maiolino [Thu, 12 Nov 2020 22:18:52 +0000 (17:18 -0500)]

xfs: Convert xfs_attr_sf macros to inline functions

Source kernel commit: e01b7eed5d0a9b101da53701e92136c3985998af

xfs_attr_sf_totsize() requires access to the xfs_inode structure, so,
since xfs_attr_shortform_addname() is its only user, move it to
xfs_attr.c instead of playing with more #includes.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Carlos Maiolino [Thu, 12 Nov 2020 21:50:12 +0000 (16:50 -0500)]

xfs: Use variable-size array for nameval in xfs_attr_sf_entry

Source kernel commit: c418dbc9805dbd215586454f0c5729333219aa63

nameval is a variable-size array, so define it as such and remove all
the -1 magic number subtractions.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
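
The difference between the two declaration styles can be sketched as follows. The struct and function names are hypothetical, and the field layout (three one-byte fields followed by the name and value bytes) is an assumption made for illustration rather than a copy of the XFS header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old style: a one-element placeholder array, so sizeof() counts one
 * byte of nameval that every size calculation must subtract again. */
struct sketch_sf_entry_old {
	uint8_t namelen;
	uint8_t valuelen;
	uint8_t flags;
	uint8_t nameval[1];
};

/* New style: a C99 flexible array member, which sizeof() ignores. */
struct sketch_sf_entry {
	uint8_t namelen;
	uint8_t valuelen;
	uint8_t flags;
	uint8_t nameval[];
};

/* Both layouts place the name/value bytes at the same offset. */
static_assert(offsetof(struct sketch_sf_entry, nameval) ==
	      offsetof(struct sketch_sf_entry_old, nameval),
	      "nameval offset must not change");

/* Entry size with the placeholder array: note the "- 1". */
static inline size_t
sketch_entsize_old(uint8_t nlen, uint8_t vlen)
{
	return sizeof(struct sketch_sf_entry_old) - 1 + nlen + vlen;
}

/* Entry size with the flexible array member: no magic subtraction. */
static inline size_t
sketch_entsize(uint8_t nlen, uint8_t vlen)
{
	return sizeof(struct sketch_sf_entry) + nlen + vlen;
}
```

Both helpers return the same value for any name and value length; only the second one does so without the -1 adjustment.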

Carlos Maiolino [Thu, 12 Nov 2020 21:50:02 +0000 (16:50 -0500)]

xfs: Remove typedef xfs_attr_shortform_t

Source kernel commit: 47e6cc100054c8c6b809e25c286a2fd82e82bcb7

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Carlos Maiolino [Thu, 12 Nov 2020 21:49:52 +0000 (16:49 -0500)]

xfs: remove typedef xfs_attr_sf_entry_t

Source kernel commit: 6337c84466c250d5da797bc5d6941c501d500e48

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Darrick J. Wong [Thu, 12 Nov 2020 21:49:42 +0000 (16:49 -0500)]

xfs: widen ondisk quota expiration timestamps to handle y2038+

Source kernel commit: 4ea1ff3b49681af45a4a8c14baf7f0b3d11aa74a

Enable the bigtime feature for quota timers. We decrease the accuracy
of the timers to ~4s in exchange for being able to set timers up to the
bigtime maximum.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
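
The accuracy trade-off can be illustrated with a short sketch. It assumes the quota timer is stored in a 32-bit field and that the ~4s granularity comes from dropping the two low bits of a seconds counter that starts at the minimum 32-bit Unix time; all names and constants here are hypothetical, not copied from the XFS headers.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds between the counter's zero point (INT32_MIN) and the Unix epoch. */
#define SKETCH_EPOCH_OFFSET	2147483648LL
/* Assumed shift: 2^2 = 4 second granularity. */
#define SKETCH_DQ_SHIFT		2

/* Encode a Unix time as a 32-bit count of 4-second units. */
static inline uint32_t
sketch_dq_encode(int64_t unix_sec)
{
	uint64_t units;

	assert(unix_sec >= -SKETCH_EPOCH_OFFSET);
	units = (uint64_t)(unix_sec + SKETCH_EPOCH_OFFSET) >> SKETCH_DQ_SHIFT;
	assert(units <= UINT32_MAX);
	return (uint32_t)units;
}

/* Decode back to Unix seconds; the low two bits of precision are lost. */
static inline int64_t
sketch_dq_decode(uint32_t ondisk)
{
	return (int64_t)((uint64_t)ondisk << SKETCH_DQ_SHIFT) -
	       SKETCH_EPOCH_OFFSET;
}
```

Under these assumptions, any expiration time is rounded down to a multiple of four seconds when it is written, which is the loss of accuracy the commit message refers to.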

Darrick J. Wong [Thu, 12 Nov 2020 01:08:14 +0000 (20:08 -0500)]

xfs: widen ondisk inode timestamps to deal with y2038+

Source kernel commit: f93e5436f0ee5a85eaa3a86d2614d215873fb18b

Redesign the ondisk inode timestamps to be a simple unsigned 64-bit
counter of nanoseconds since 14 Dec 1901 (i.e. the minimum time in the
32-bit unix time epoch). This enables us to handle dates up to 2486,
which solves the y2038 problem.

[sandeen: update xfs_flags2diflags2() as well, to match]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
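
A minimal sketch of the encoding described above, assuming the counter's zero point is exactly the minimum 32-bit Unix time (INT32_MIN seconds). The function and macro names are hypothetical and are not the ones used in the XFS sources.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC	1000000000ULL
/* Seconds between the counter's zero point (INT32_MIN) and the Unix epoch. */
#define SKETCH_EPOCH_OFFSET	2147483648LL

/*
 * Encode (Unix seconds, nanoseconds) as an unsigned 64-bit count of
 * nanoseconds since the minimum 32-bit Unix time. The caller must keep
 * unix_sec small enough that the product fits in 64 bits.
 */
static inline uint64_t
sketch_encode_bigtime(int64_t unix_sec, uint32_t nsec)
{
	assert(nsec < SKETCH_NSEC_PER_SEC);
	assert(unix_sec >= -SKETCH_EPOCH_OFFSET);
	return (uint64_t)(unix_sec + SKETCH_EPOCH_OFFSET) *
	       SKETCH_NSEC_PER_SEC + nsec;
}

/* Split the counter back into Unix seconds and nanoseconds. */
static inline void
sketch_decode_bigtime(uint64_t ondisk, int64_t *unix_sec, uint32_t *nsec)
{
	*unix_sec = (int64_t)(ondisk / SKETCH_NSEC_PER_SEC) -
		    SKETCH_EPOCH_OFFSET;
	*nsec = (uint32_t)(ondisk % SKETCH_NSEC_PER_SEC);
}
```

Since 2^64 nanoseconds is roughly 584 years, a counter starting in late 1901 runs out in 2486, which matches the upper bound quoted in the commit message.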

Darrick J. Wong [Wed, 11 Nov 2020 18:48:47 +0000 (13:48 -0500)]

xfs: redefine xfs_ictimestamp_t

Source kernel commit: 30e05599219f3c15bd5f24190af0e33cdb4a00e5

Redefine xfs_ictimestamp_t as a uint64_t typedef in preparation for the
bigtime functionality. Preserve the legacy structure format so that we
can let the compiler take care of the masking and shifting.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Darrick J. Wong [Tue, 10 Nov 2020 21:29:40 +0000 (16:29 -0500)]

xfs: redefine xfs_timestamp_t

Source kernel commit: 5a0bb066f60fa02f453d7721844eae59f505c06e

Redefine xfs_timestamp_t as a __be64 typedef in preparation for the
bigtime functionality. Preserve the legacy structure format so that we
can let the compiler take care of masking and shifting.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
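
The idea of keeping a legacy structure alongside an opaque 64-bit type can be sketched as below. The layout (big-endian 32-bit seconds followed by big-endian 32-bit nanoseconds) and every name in the block are assumptions for illustration; the byte-swap helper is a portable stand-in, not a kernel API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Opaque 64-bit ondisk timestamp (stored big-endian on disk). */
typedef uint64_t sketch_timestamp_t;

/* Legacy view of the same 8 bytes: two big-endian 32-bit fields. */
struct sketch_legacy_timestamp {
	uint32_t t_sec;		/* seconds, big-endian */
	uint32_t t_nsec;	/* nanoseconds, big-endian */
};

static_assert(sizeof(struct sketch_legacy_timestamp) ==
	      sizeof(sketch_timestamp_t),
	      "legacy view must overlay the 64-bit value exactly");

/* Portable big-endian to host conversion for a 32-bit value. */
static inline uint32_t
sketch_be32_to_cpu(uint32_t be)
{
	const uint8_t *b = (const uint8_t *)&be;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/*
 * Decode a legacy-format timestamp by copying the 64-bit value into
 * the structure and reading its fields, so no manual shifting or
 * masking of the 64-bit value is needed.
 */
static inline void
sketch_decode_legacy(sketch_timestamp_t ts, int32_t *sec, int32_t *nsec)
{
	struct sketch_legacy_timestamp leg;

	memcpy(&leg, &ts, sizeof(leg));
	*sec = (int32_t)sketch_be32_to_cpu(leg.t_sec);
	*nsec = (int32_t)sketch_be32_to_cpu(leg.t_nsec);
}
```

The memcpy() here plays the role that a pointer cast to the legacy structure would play in kernel code; it is used in the sketch to stay within standard C aliasing rules.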
