git.ipfire.org Git - thirdparty/xfsprogs-dev.git/log
3 years ago xfs: fix radix tree tag signs
Darrick J. Wong [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: fix radix tree tag signs

Source kernel commit: 919a4ddb68413056ecb7c71d9d5465bb54c8032b

Radix tree tags are supposed to be unsigned ints, so fix the callers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: mark xfs_bmap_set_attrforkoff static
Christoph Hellwig [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: mark xfs_bmap_set_attrforkoff static

Source kernel commit: 5a981e4ea8ff8062e7c7ea8fc4a1565e4820a08b

xfs_bmap_set_attrforkoff is only used inside of xfs_bmap.c, so mark it
static.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Remove redundant assignment to busy
Jiapeng Chong [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: Remove redundant assignment to busy

Source kernel commit: 9673261c32dc2f30863b803374b726a72d16b07c

Variable busy is set to false, but this value is never read as it is
overwritten or not used later on, hence it is a redundant assignment
and can be removed.

Clean up the following clang-analyzer warning:

fs/xfs/libxfs/xfs_alloc.c:1679:2: warning: Value stored to 'busy' is
never read [clang-analyzer-deadcode.DeadStores].
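
For readers unfamiliar with this warning class, here is a minimal sketch of a dead store. The names `toy_extent_busy` and `toy_alloc_check` are hypothetical and are not taken from xfs_alloc.c; the point is only that an initial value which every path overwrites before reading can be dropped without changing behavior.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper; not taken from xfs_alloc.c. */
static bool toy_extent_busy(int len)
{
	return len > 8;
}

bool toy_alloc_check(int len)
{
	bool busy;	/* was "bool busy = false;": that value was never read */

	assert(len >= 0);
	busy = toy_extent_busy(len);	/* busy is always overwritten before use */
	return busy;
}
```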

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: sort variable alphabetically to avoid repeated declaration
Shaokun Zhang [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: sort variable alphabetically to avoid repeated declaration

Source kernel commit: 5f7fd75086203a8a4dd3e518976e52bcf24e8b22

Variables 'xfs_agf_buf_ops', 'xfs_agi_buf_ops', 'xfs_dquot_buf_ops' and
'xfs_symlink_buf_ops' are declared twice, so sort these variables
alphabetically and remove the repeated declaration.

Cc: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: remove xfs_perag_t
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: remove xfs_perag_t

Source kernel commit: 509201163fca3d4d906bd50a5320115d42818748

Almost unused, gets rid of another typedef.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: use perag through unlink processing
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: use perag through unlink processing

Source kernel commit: f40aadb2bb64fe0a3d9b59957e70796d629cdee2

Unlinked lists are held in the perag, and freeing of inodes needs to
be passed a perag, too, so look up the perag early in the unlink
processing and use it throughout.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: clean up and simplify xfs_dialloc()
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: clean up and simplify xfs_dialloc()

Source kernel commit: 8237fbf53d6fd2a3a248fc2a8608e047ef22316c

Because it's a mess.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: inode allocation can use a single perag instance
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: inode allocation can use a single perag instance

Source kernel commit: 309161f6603ce1a53b76a42817cde2a9bcd17e82

Now that we've internalised the two-phase inode allocation, we can
easily make the AG selection and allocation atomic from the
perspective of a single perag context. This will ensure AGs going
offline/away cannot occur between the selection and allocation
steps.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: get rid of xfs_dir_ialloc()
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: get rid of xfs_dir_ialloc()

Source kernel commit: b652afd937033911944d7f681f2031b006961f1d

This is just a simple wrapper around the per-ag inode allocation
that doesn't need to exist. The internal mechanism to select and
allocate within an AG does not need to be exposed outside
xfs_ialloc.c, and it being exposed simply makes it harder to follow
the code and simplify it.

This is simplified by internalising xfs_dialloc_select_ag() and
xfs_dialloc_ag() into a single xfs_dialloc() function and then
xfs_dir_ialloc() can go away.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: collapse AG selection for inode allocation
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: collapse AG selection for inode allocation

Source kernel commit: 89b1f55a2951bb89b7ae9f8cb3fd11513ff3f219

xfs_dialloc_select_ag() does a lot of repetitive work. It first
calls xfs_ialloc_ag_select() to select the AG to start allocation
attempts in, which can do up to two entire loops across the perags
that inodes can be allocated in. This is simply checking if there is
space available to allocate inodes in an AG, and it returns when it
finds the first candidate AG.

xfs_dialloc_select_ag() then does its own iterative walk across
all the perags locking the AGIs and trying to allocate inodes from
the locked AG. It also doesn't limit the search to mp->m_maxagi,
so it will walk all AGs whether they can allocate inodes or not.

Hence if we are really low on inodes, we could do almost 3 entire
walks across the whole perag range before we find an allocation
group we can allocate inodes in or report ENOSPC.

Because xfs_ialloc_ag_select() returns on the first candidate AG it
finds, we can simply do these checks directly in
xfs_dialloc_select_ag() before we lock and try to allocate inodes.
This reduces the inode allocation pass down to 2 perag sweeps at
most - one for aligned inode cluster allocation and if we can't
allocate full, aligned inode clusters anywhere we'll do another pass
trying to do sparse inode cluster allocation.

This also removes a big chunk of duplicate code.
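
The two-sweep selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the xfs_ialloc.c code: `struct ag_state`, its two flags, and `select_ag` are hypothetical stand-ins, and -1 stands in for ENOSPC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-AG state; not the real struct xfs_perag. */
struct ag_state {
	bool can_alloc_aligned;	/* room for a full, aligned inode cluster */
	bool can_alloc_sparse;	/* room for a sparse inode cluster */
};

/*
 * Return the index of the first AG that can satisfy the allocation,
 * or -1 (standing in for ENOSPC) after at most two sweeps.
 */
int select_ag(const struct ag_state *ags, size_t agcount, size_t start)
{
	assert(ags != NULL && agcount > 0);

	for (int pass = 0; pass < 2; pass++) {
		for (size_t i = 0; i < agcount; i++) {
			size_t agno = (start + i) % agcount;
			bool ok = (pass == 0) ? ags[agno].can_alloc_aligned
					      : ags[agno].can_alloc_sparse;

			/* The check and the pick happen in the same walk. */
			if (ok)
				return (int)agno;
		}
	}
	return -1;
}
```

The first pass only accepts AGs with room for an aligned cluster; the sparse pass runs only if that pass found nothing, which bounds the work at two sweeps.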

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: simplify xfs_dialloc_select_ag() return values
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: simplify xfs_dialloc_select_ag() return values

Source kernel commit: 4268547305c91b35ae7871374078de788a822ed1

The only caller of xfs_dialloc_select_ag() will always return
-ENOSPC to its caller if the agbp returned from
xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate
AGI we can allocate inodes from is always an ENOSPC condition, so
move this logic up into xfs_dialloc_select_ag() so we can simplify
the return logic in this function.

xfs_dialloc_select_ag() now only ever returns 0 with a locked
agbp, or an error with no agbp.
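
A minimal sketch of that contract, assuming a hypothetical `struct buf` in place of the locked AGI buffer and a hypothetical `select_ag_buf` in place of the real function: the callee either returns 0 with the buffer pointer set, or a negative errno with the pointer left NULL, so the caller never has to translate a NULL buffer into -ENOSPC itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct buf { int agno; };	/* hypothetical stand-in for a locked AGI buffer */

/*
 * Return 0 with *bpp set, or a negative errno with *bpp left NULL.
 * Finding no candidate is reported as -ENOSPC by the callee itself.
 */
int select_ag_buf(struct buf *candidates, size_t n, struct buf **bpp)
{
	assert(bpp != NULL);
	*bpp = NULL;
	for (size_t i = 0; i < n; i++) {
		if (candidates[i].agno >= 0) {	/* negative agno marks "full" here */
			*bpp = &candidates[i];
			return 0;
		}
	}
	return -ENOSPC;
}
```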

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: remove agno from btree cursor
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: remove agno from btree cursor

Source kernel commit: 50f02fe3338d3fee6b298a1b262a4c562e7d84e0

Now that everything passes a perag, the agno is not needed anymore.
Convert all the users to use pag->pag_agno instead and remove the
agno from the cursor. This was largely done as an automated search
and replace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: use perag for ialloc btree cursors
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: use perag for ialloc btree cursors

Source kernel commit: 7b13c515518264df0cb90d84fdab907a627c0fa9

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: convert allocbt cursors to use perags
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: convert allocbt cursors to use perags

Source kernel commit: 289d38d22cd88960cb648dc480c50de5102519bb

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: convert refcount btree cursor to use perags
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: convert refcount btree cursor to use perags

Source kernel commit: a81a06211fb43d80ee746e7a40a32ed812002f8e

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: convert rmap btree cursor to using a perag
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: convert rmap btree cursor to using a perag

Source kernel commit: fa9c3c197329fdab0efc48a8944d2c4a21c6a74f

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: add a perag to the btree cursor
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: add a perag to the btree cursor

Source kernel commit: be9fb17d88f08af648a89784d30dbac83d893154

Which will eventually completely replace the agno in it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: pass perags around in fsmap data dev functions
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: pass perags around in fsmap data dev functions

Source kernel commit: 58d43a7e3263766ade4974c86118e6b5737ea259

Needs a [from, to] ranged AG walk, and the perag to be stuffed into
the info structure for callouts to use.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: push perags through the ag reservation callouts
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: push perags through the ag reservation callouts

Source kernel commit: 30933120ad79f4549d6e364df7eda474cc0d9c65

We currently pass an agno from the AG reservation functions to the
individual feature accounting functions, which in future may have to
do perag lookups to access per-AG state. Instead, pre-emptively
plumb the perag through from the highest AG reservation layer to the
feature callouts so they won't have to look it up again.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: pass perags through to the busy extent code
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: pass perags through to the busy extent code

Source kernel commit: 45d0662117565e6100f9e0cf356cd873542c95b1

All of the callers of the busy extent API have perag references
available to use, so we can pass a perag to the busy extent
functions rather than having them do unnecessary lookups.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: convert secondary superblock walk to use perags
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: convert secondary superblock walk to use perags

Source kernel commit: 7f8d3b3ca6fe9269b3c5deee0dcea38499288e06

Clean up the last external manual AG walk.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: convert xfs_iwalk to use perag references
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: convert xfs_iwalk to use perag references

Source kernel commit: 6f4118fc6482b1989cdcb19a1a0ab53b2dca7ab9

Rather than manually walking the ags and passing agnumbers around,
pass the perag for the AG we are currently working on around in the
iwalk structure.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: convert raw ag walks to use for_each_perag
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: convert raw ag walks to use for_each_perag

Source kernel commit: 934933c3eec9e4a5826d3d7a47aca0742337fded

Convert the raw walks to an iterator, pulling the current AG out of
pag->pag_agno instead of the loop iterator variable.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: make for_each_perag... a first class citizen
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: make for_each_perag... a first class citizen

Source kernel commit: f250eedcf7621b9a56d563912b4eeacd524422c7

for_each_perag_tag() is defined in xfs_icache.c for local use.
Promote this to xfs_ag.h and define equivalent iteration functions
so that we can use them to iterate AGs, replacing open coded
perag walks and perag lookups.

We also convert as many of the straightforward open coded AG walks
to use these iterators as possible. Anything that is not a direct
conversion to an iterator is ignored and will be updated in future
commits.
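
To make the iterator idea concrete, here is a toy version of such a macro. Everything prefixed `toy_` is a hypothetical stand-in written for this illustration; it only assumes, as the text says, that a perag is obtained with a get, released with a put, and carries its AG number in `pag_agno`.

```c
#include <assert.h>
#include <stddef.h>

struct toy_perag { unsigned int pag_agno; int refs; };
struct toy_mount { struct toy_perag *pags; unsigned int agcount; };

/* Take a reference on the perag for agno, or return NULL past the end. */
static struct toy_perag *toy_perag_get(struct toy_mount *mp, unsigned int agno)
{
	assert(mp != NULL);
	if (agno >= mp->agcount)
		return NULL;
	mp->pags[agno].refs++;
	return &mp->pags[agno];
}

static void toy_perag_put(struct toy_perag *pag)
{
	pag->refs--;
}

/* Walk every AG, holding a reference only on the current perag. */
#define toy_for_each_perag(mp, agno, pag) \
	for ((agno) = 0, (pag) = toy_perag_get((mp), 0); \
	     (pag) != NULL; \
	     (agno) = (pag)->pag_agno + 1, \
	     toy_perag_put(pag), \
	     (pag) = toy_perag_get((mp), (agno)))

unsigned int sum_agnos(struct toy_mount *mp)
{
	struct toy_perag *pag;
	unsigned int agno, sum = 0;

	toy_for_each_perag(mp, agno, pag)
		sum += pag->pag_agno;	/* current AG comes from the perag */
	return sum;
}
```

One consequence of this shape is that a loop body which breaks out early still holds a reference on the current perag and has to drop it itself.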

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: move perag structure and setup to libxfs/xfs_ag.[ch]
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: move perag structure and setup to libxfs/xfs_ag.[ch]

Source kernel commit: 07b6403a6873045344b0c18cbb4a4360854f6d76

Move the xfs_perag infrastructure to the libxfs files that contain
all the per AG infrastructure. This helps set up for passing perags
around all the code instead of bare agnos with minimal extra
includes for existing files.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: move xfs_perag_get/put to xfs_ag.[ch]
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: move xfs_perag_get/put to xfs_ag.[ch]

Source kernel commit: 9bbafc71919adfdf83fafd2ce909853b493e7d86

They are AG functions, not superblock functions, so move them to the
appropriate location.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: clean up open-coded fs block unit conversions
Darrick J. Wong [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: clean up open-coded fs block unit conversions

Source kernel commit: a7bcb147fef39054fe324a1a988470f5da127196

Replace some open-coded fs block unit conversions with the standard
conversion macro.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Clean up xfs_attr_node_addname_clear_incomplete
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Clean up xfs_attr_node_addname_clear_incomplete

Source kernel commit: 4fd084dbbd05402bb6e24782b8e9f9ea3e8ab3d6

We can use the helper function xfs_attr_node_remove_name to reduce
duplicate code in this function.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Remove xfs_attr_rmtval_set
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Remove xfs_attr_rmtval_set

Source kernel commit: 0e6acf29db6f463027d1ff7cea86a641da89f0d4

This function is no longer used, so it is safe to remove.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Add delay ready attr set routines
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Add delay ready attr set routines

Source kernel commit: 8f502a4009822a6972772ae65b34078645b3ba16

This patch modifies the attr set routines to be delay ready. This means
they no longer roll or commit transactions, but instead return -EAGAIN
to have the calling routine roll and refresh the transaction.  In this
series, xfs_attr_set_args has become xfs_attr_set_iter, which uses a
state-machine-like switch to keep track of where it was when EAGAIN was
returned. See xfs_attr.h for a more detailed diagram of the states.

Two new helper functions have been added: xfs_attr_rmtval_find_space and
xfs_attr_rmtval_set_blk.  They provide a subset of logic similar to
xfs_attr_rmtval_set, but they store the current block in the delay attr
context to allow the caller to roll the transaction between allocations.
This helps to simplify and consolidate code used by
xfs_attr_leaf_addname and xfs_attr_node_addname. xfs_attr_set_args has
now become a simple loop to refresh the transaction until the operation
is completed.  Lastly, xfs_attr_rmtval_remove is no longer used, and is
removed.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Add delay ready attr remove routines
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Add delay ready attr remove routines

Source kernel commit: 2b74b03c13c444cb5af56804cc975534e2058d06

This patch modifies the attr remove routines to be delay ready. This
means they no longer roll or commit transactions, but instead return
-EAGAIN to have the calling routine roll and refresh the transaction. In
this series, xfs_attr_remove_args is merged with
xfs_attr_node_removename to become a new function, xfs_attr_remove_iter.
This new version uses a state-machine-like switch to keep track
of where it was when EAGAIN was returned. A new version of
xfs_attr_remove_args consists of a simple loop to refresh the
transaction until the operation is completed. A new XFS_DAC_DEFER_FINISH
flag is used to finish the transaction wherever the existing code used
to.

Calls to xfs_attr_rmtval_remove are replaced with the delay ready
version __xfs_attr_rmtval_remove. We will rename
__xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
done.

xfs_attr_rmtval_remove itself is still in use by the set routines (used
during a rename).  To preserve existing behavior, we
modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
set, similar to what xfs_attr_remove_args does here.  Once we transition
the set routines to be delay ready, xfs_attr_rmtval_remove will no longer
be used and will be removed.

This patch also adds a new struct xfs_delattr_context, which we will use
to keep track of the current state of an attribute operation. The new
xfs_delattr_state enum is used to track various operations that are in
progress so that we know not to repeat them, and resume where we left
off before EAGAIN was returned to cycle out the transaction. Other
members take the place of local variables that need to retain their
values across multiple function calls.  See xfs_attr.h for a more
detailed diagram of the states.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Hoist node transaction handling
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Hoist node transaction handling

Source kernel commit: 3f562d092bb1edd39bfc0e6808d7108d47f8aa3a

This patch basically hoists the node transaction handling around the
leaf code we just hoisted.  This will help set up this area for the
state machine: because the goto ends with a transaction roll, it is
easily replaced with a state.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Hoist xfs_attr_leaf_addname
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Hoist xfs_attr_leaf_addname

Source kernel commit: 83c6e70789ff371c4eebc54f2c8d979305a1bae8

This patch hoists xfs_attr_leaf_addname into the calling function.  The
goal is to get all the code that will require state management into
the same scope. This isn't particularly aesthetic right away, but it is a
preliminary step to merging in the state machine code.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Hoist xfs_attr_node_addname
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Hoist xfs_attr_node_addname

Source kernel commit: 5d954cc09f6baed80458ea02ec092031608ea3fe

This patch hoists the latter half of xfs_attr_node_addname into
the calling function.  We do this because it is this area that
will need the most state management, and we want to keep such
code in the same scope as much as possible.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Add helper xfs_attr_node_addname_find_attr
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Add helper xfs_attr_node_addname_find_attr

Source kernel commit: 6ca5a4a1f52952790a40099b79b5631d91163ba4

This patch separates the first half of xfs_attr_node_addname into a
helper function xfs_attr_node_addname_find_attr.  It also replaces the
restart goto with an EAGAIN return code driven by a loop in the calling
function.  This looks odd now, but will clean up nicely once we introduce
the state machine.  It will also enable hoisting the last state out of
xfs_attr_node_addname without having to plumb in a "done" parameter to
know if we need to move to the next state or not.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Separate xfs_attr_node_addname and xfs_attr_node_addname_clear_incomplete
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Separate xfs_attr_node_addname and xfs_attr_node_addname_clear_incomplete

Source kernel commit: f0f7c502c728d0c6947219739631bad101f8737b

This patch separates xfs_attr_node_addname into two functions.  This will
help to make it easier to hoist parts of xfs_attr_node_addname that need
state management.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Refactor xfs_attr_set_shortform
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Refactor xfs_attr_set_shortform

Source kernel commit: 6286514b63e12d7bedc67e46aa1aeff9ed8378ce

This patch is actually the combination of patches from the previous
version (v18).  Initially patch 3 hoisted xfs_attr_set_shortform, and
the next added the helper xfs_attr_set_fmt. xfs_attr_set_fmt is similar
to the old xfs_attr_set_shortform. It returns 0 when the attr has been set
and no further action is needed. It returns -EAGAIN when shortform has
been transformed to leaf, and the calling function should proceed to
set the attr in leaf form.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Add xfs_attr_node_remove_name
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Add xfs_attr_node_remove_name

Source kernel commit: a8490f699f6ec88843879b92cbb21953dab379ee

This patch pulls a new helper function xfs_attr_node_remove_name out
of xfs_attr_node_remove_step.  This helps to modularize
xfs_attr_node_remove_step which will help make the delayed attribute
code easier to follow.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: Reverse apply 72b97ea40d
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Reverse apply 72b97ea40d

Source kernel commit: 4126c06e25b38842a254b2de6ffc3019a7b2f0ca

Originally we added this patch to help modularize the attr code in
preparation for delayed attributes and the state machine it requires.
However, later reviews found that this slightly alters the transaction
handling as the helper function is ambiguous as to whether the
transaction is dirty or clean.  This may cause a dirty transaction to be
included in the next roll, where previously it had not.  To preserve the
existing code flow, we reverse apply this commit.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago libxfs: fix crash on second attempt to initialize library
Darrick J. Wong [Fri, 15 Oct 2021 20:28:14 +0000 (16:28 -0400)] 
libxfs: fix crash on second attempt to initialize library

xfs_repair crashes when it tries to initialize the libxfs library but
the initialization fails because the fs is already mounted:

# xfs_repair /dev/sdd
xfs_repair: /dev/sdd contains a mounted filesystem
xfs_repair: urcu.c:553: urcu_memb_register_thread: Assertion `!URCU_TLS(rcu_reader).registered' failed.
Aborted

This is because libxfs_init() registers the main thread with liburcu,
but doesn't unregister the thread if libxfs library initialization
fails.  When repair sets more dangerous options and tries again, the
second initialization attempt causes liburcu to abort.  Fix this by
unregistering the thread with liburcu if libxfs initialization fails.
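
The shape of the fix can be sketched with mocks. The `toy_` functions below are hypothetical stand-ins for liburcu's per-thread registration and for the library init routine; the assertion in the mock register call only imitates the double-registration abort shown in the log above.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for liburcu's per-thread registration; hypothetical mocks. */
static bool registered;

static void toy_register_thread(void)
{
	assert(!registered);	/* imitates the double-registration abort */
	registered = true;
}

static void toy_unregister_thread(void)
{
	registered = false;
}

/* Returns true on success; on failure the thread is left unregistered. */
bool toy_lib_init(bool device_is_mounted)
{
	toy_register_thread();
	if (device_is_mounted) {
		toy_unregister_thread();	/* undo so a retry is safe */
		return false;
	}
	return true;
}
```

Without the unregister call on the failure path, a second `toy_lib_init()` would trip the assertion, which is the retry scenario the commit describes.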

Observed by running xfs/284.

Fixes: e4da1b16 ("xfsprogs: introduce liburcu support")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago xfs: use xfs_buf_alloc_pages for uncached buffers
Dave Chinner [Thu, 14 Oct 2021 16:35:45 +0000 (12:35 -0400)] 
xfs: use xfs_buf_alloc_pages for uncached buffers

Source kernel commit: 07b5c5add42a0afccf79401b12d78043ed6b8240

Use the newly factored out page allocation code. This adds
automatic buffer zeroing for non-read uncached buffers.

This also allows us to greatly simplify the error handling in
xfs_buf_get_uncached(). Because xfs_buf_alloc_pages() cleans up
partial allocation failure, we can just call xfs_buf_free() in all
error cases now to clean up after failures.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years ago libxfs: fix call_rcu crash when unmounting the fake mount in mkfs
Darrick J. Wong [Thu, 14 Oct 2021 16:35:43 +0000 (12:35 -0400)] 
libxfs: fix call_rcu crash when unmounting the fake mount in mkfs

In commit a6fb6abe, we simplified the process by which mkfs.xfs computes
the minimum log size by creating a dummy xfs_mount with the
draft superblock image, using the dummy to compute the log geometry, and
then unmounting the dummy.

Note that creating a dummy mount with no data device is supported by
libxfs, though with the caveat that we don't set up any perag structures
at all.  Up until this point this has worked perfectly well since free()
(and hence kmem_free()) are perfectly happy to ignore NULL pointers.

Unfortunately, this will cause problems with the upcoming patch to shift
per-AG setup and teardown to libxfs because call_rcu in the liburcu
library actually tries to access the rcu_head of the passed-in perag
structure, but they're all NULL in the dummy mount case.  IOWs,
xfs_free_perag requires that every AG have a per-AG structure, and it's
too late to change the 5.14 kernel libxfs now, so work around this by
altering libxfs_mount to remember when it has initialized the perag
structures and libxfs_umount to skip freeing them when the flag isn't
set.

Just to be clear: This fault has no user-visible consequences right now;
it's a fixup to avoid problems in the libxfs sync series for 5.14.

Fixes: a6fb6abe ("mkfs: simplify minimum log size calculation")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
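The workaround in the fake mount fix above (remember whether the per-AG structures were set up, and skip teardown when they were not) might look roughly like the following. All names here (struct mock_mount, perag_init, mock_mount_init(), mock_umount()) are hypothetical and only illustrate the idea; they are not the libxfs definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct mock_perag {
	int			agno;
};

struct mock_mount {
	unsigned int		agcount;
	struct mock_perag	**perag;	/* NULL for a dummy mount */
	bool			perag_init;	/* set once perags exist */
};

static int mock_perag_frees;	/* counts teardown calls, for illustration */

static int mock_mount_init(struct mock_mount *mp, unsigned int agcount,
			   bool have_data_dev)
{
	unsigned int i;

	mp->agcount = agcount;
	mp->perag = NULL;
	mp->perag_init = false;
	if (!have_data_dev)
		return 0;	/* dummy mount: no per-AG structures */

	mp->perag = calloc(agcount, sizeof(*mp->perag));
	if (!mp->perag)
		return -1;
	for (i = 0; i < agcount; i++) {
		mp->perag[i] = calloc(1, sizeof(struct mock_perag));
		if (!mp->perag[i]) {
			while (i-- > 0)
				free(mp->perag[i]);
			free(mp->perag);
			mp->perag = NULL;
			return -1;
		}
		mp->perag[i]->agno = (int)i;
	}
	mp->perag_init = true;
	return 0;
}

static void mock_umount(struct mock_mount *mp)
{
	unsigned int i;

	if (!mp->perag_init)
		return;		/* nothing was set up, nothing to tear down */

	for (i = 0; i < mp->agcount; i++) {
		/* Teardown dereferences each perag, much as call_rcu would. */
		assert(mp->perag[i]->agno == (int)i);
		free(mp->perag[i]);
		mock_perag_frees++;
	}
	free(mp->perag);
	mp->perag = NULL;
	mp->perag_init = false;
}
```

The early return in mock_umount() is the important part: a teardown path that dereferences each structure can no longer rely on free() tolerating NULL.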
3 years agomisc: convert utilities to use "fallthrough;"
Darrick J. Wong [Fri, 1 Oct 2021 20:04:54 +0000 (16:04 -0400)] 
misc: convert utilities to use "fallthrough;"

Now that we have a macro to virtualize switch statement fallthroughs for
lazy compiler linters, we might as well spread it elsewhere.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Fix fall-through warnings for Clang
Gustavo A. R. Silva [Fri, 1 Oct 2021 20:03:54 +0000 (16:03 -0400)] 
xfs: Fix fall-through warnings for Clang

Source kernel commit: 53004ee78d6273c994534ccf79d993098ac89769

In preparation to enable -Wimplicit-fallthrough for Clang, fix
the following warnings by replacing /* fall through */ comments,
and its variants, with the new pseudo-keyword macro fallthrough:

fs/xfs/libxfs/xfs_alloc.c:3167:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_da_btree.c:286:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_ag_resv.c:346:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_ag_resv.c:388:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_bmap_util.c:246:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_export.c:88:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_export.c:96:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_file.c:867:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_ioctl.c:562:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_ioctl.c:1548:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_iomap.c:1040:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_inode.c:852:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_log.c:2627:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_trans_buf.c:298:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/bmap.c:275:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/btree.c:48:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/common.c:85:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/common.c:138:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/common.c:698:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/dabtree.c:51:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/repair.c:951:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/agheader.c:89:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]

Notice that Clang doesn't recognize /* fall through */ comments as
implicit fall-through markings, so in order to globally enable
-Wimplicit-fallthrough for Clang, these comments need to be
replaced with fallthrough; in the whole codebase.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
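For readers unfamiliar with the pseudo-keyword used in the two fallthrough commits above, here is a minimal sketch of how such a macro is commonly defined and used. The definition follows the widely used pattern of mapping fallthrough to the compiler's fallthrough attribute when it is available. stages_to_run() is a hypothetical example function, and the exact definition in xfsprogs may differ.

```c
#include <assert.h>

#ifndef __has_attribute
# define __has_attribute(x) 0	/* older compilers: no attribute probing */
#endif

#if __has_attribute(__fallthrough__)
# define fallthrough	__attribute__((__fallthrough__))
#else
# define fallthrough	do {} while (0)	/* fallthrough */
#endif

/* Hypothetical example: count how many stages run starting at 'stage'. */
static int stages_to_run(int stage)
{
	int n = 0;

	assert(stage >= 0);
	switch (stage) {
	case 0:
		n++;
		fallthrough;
	case 1:
		n++;
		fallthrough;
	case 2:
		n++;
		break;
	default:
		break;
	}
	return n;
}
```

Because the annotation is a statement rather than a comment, compilers that do not parse comments for fall-through markers can still verify that each fall-through is intentional.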
3 years agolibxfs: fix whitespace inconsistencies with kernel
Darrick J. Wong [Fri, 1 Oct 2021 19:15:19 +0000 (15:15 -0400)] 
libxfs: fix whitespace inconsistencies with kernel

Fix a few places where the whitespace isn't an exact match for the
kernel.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibxfs: port xfs_set_inode_alloc from the kernel
Darrick J. Wong [Fri, 1 Oct 2021 18:00:34 +0000 (14:00 -0400)] 
libxfs: port xfs_set_inode_alloc from the kernel

To prepare for moving the perag initialization code to libxfs, port the
xfs_set_inode_alloc function from the kernel and make
libxfs_initialize_perag use it.  The code isn't 1:1 identical, but
AFAICT it behaves the same way.  In a future kernel release we'll
move the function into xfs_ag.c and update xfsprogs.

(sandeen: Note that this is in effect simply syncing up several
kernel commits in this code which was not shared via libxfs.)

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibfrog: move topology.[ch] to libxfs
Darrick J. Wong [Wed, 29 Sep 2021 20:57:10 +0000 (16:57 -0400)] 
libfrog: move topology.[ch] to libxfs

The topology code depends on a few libxfs structures and is only needed
by mkfs and xfs_repair.  Move this code to libxfs to reduce the size of
libfrog and to avoid build failures caused by "xfs: move perag structure
and setup to libxfs/xfs_ag.[ch]".

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agomkfs: move mkfs/proto.c declarations to mkfs/proto.h
Darrick J. Wong [Wed, 29 Sep 2021 20:57:05 +0000 (16:57 -0400)] 
mkfs: move mkfs/proto.c declarations to mkfs/proto.h

These functions are only used by mkfs, so move them to a separate header
file that isn't in an internal library.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[sandeen: cosmetic tidyups as suggested by Christoph]
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoatomic: convert to uatomic
Dave Chinner [Wed, 29 Sep 2021 20:57:02 +0000 (16:57 -0400)] 
atomic: convert to uatomic

Now that we have liburcu, we can make use of its atomic variable
implementation. It is almost identical to the kernel API - it's just
got a "uatomic" prefix. liburcu also provides all the same atomic
variable memory barriers as the kernel, so if we pull memory barrier
dependent kernel code across, it will just work with the right
barrier wrappers.

This is preparation for the addition of the more extensive atomic
operations that the kernel buffer cache requires to function correctly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
[chandan.babu@oracle.com: Swap order of arguments provided to atomic[64]_[add|sub]()]
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
[sandeen: fix whitespace]
[sandeen: make #defines a little more consistent about (a,v)]
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
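The argument-order note in the uatomic conversion above is easier to see in code. The sketch below is an assumption-laden illustration: the mock_uatomic_*() macros are hypothetical stand-ins built on GCC/Clang __atomic builtins so that the example compiles without liburcu, and they only imitate liburcu's (address, value) argument order. The kernel-style wrappers take (value, atomic_t *), so they must swap the arguments.

```c
#include <assert.h>

/* Kernel-style atomic type; a plain int backs it in this sketch. */
typedef struct {
	int	counter;
} atomic_t;

/* Hypothetical stand-ins using liburcu's argument order: (address, value). */
#define mock_uatomic_read(addr)		__atomic_load_n((addr), __ATOMIC_RELAXED)
#define mock_uatomic_set(addr, v)	__atomic_store_n((addr), (v), __ATOMIC_RELAXED)
#define mock_uatomic_add(addr, v)	((void)__atomic_fetch_add((addr), (v), __ATOMIC_SEQ_CST))
#define mock_uatomic_sub(addr, v)	((void)__atomic_fetch_sub((addr), (v), __ATOMIC_SEQ_CST))

/* Kernel API order is (value, atomic_t *), so these wrappers swap arguments. */
#define atomic_read(a)		mock_uatomic_read(&(a)->counter)
#define atomic_set(a, v)	mock_uatomic_set(&(a)->counter, (v))
#define atomic_add(v, a)	mock_uatomic_add(&(a)->counter, (v))
#define atomic_sub(v, a)	mock_uatomic_sub(&(a)->counter, (v))
#define atomic_inc(a)		atomic_add(1, (a))
#define atomic_dec(a)		atomic_sub(1, (a))

/* Exercise the wrappers the way kernel-derived code would call them. */
static int demo_refcount(void)
{
	atomic_t ref;

	atomic_set(&ref, 1);
	assert(atomic_read(&ref) == 1);
	atomic_add(3, &ref);	/* value first, pointer second */
	atomic_sub(2, &ref);
	atomic_inc(&ref);
	return atomic_read(&ref);
}
```

Getting the swap wrong would still compile in many cases, which is presumably why the commit trailer calls the fix out explicitly.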
3 years agolibxfs: add spinlock_t wrapper
Dave Chinner [Wed, 29 Sep 2021 20:48:29 +0000 (16:48 -0400)] 
libxfs: add spinlock_t wrapper

These provide the kernel spinlock_t interface, but are *not*
spinlocks. Spinlocks cannot be used by general purpose userspace
processes because such processes cannot reliably control task preemption
and scheduling. Hence these are implemented as a
pthread_mutex_t, similar to the way the kernel RT build implements
spinlock_t as a kernel mutex.

Because the current libxfs spinlock "implementation" just makes
spinlocks go away, we have to also add initialisation to spinlocks
that libxfs uses that are missing from the userspace implementation.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
[chandan.babu@oracle.com: Initialize inode log item spin lock]
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
[sandeen: fix minor whitespace]
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
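A wrapper of the kind described in the spinlock_t commit above can be sketched in a few lines. The macro names follow the kernel spinlock interface, but the details here are an assumption rather than a copy of the xfsprogs header. struct mock_log_item and its helpers are hypothetical and show why explicit initialisation matters once the lock is a real pthread_mutex_t.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Kernel-looking interface backed by a mutex, not a real spinlock. */
typedef pthread_mutex_t	spinlock_t;

#define spin_lock_init(l)	pthread_mutex_init((l), NULL)
#define spin_lock(l)		pthread_mutex_lock(l)
#define spin_trylock(l)		(pthread_mutex_trylock(l) != EBUSY)
#define spin_unlock(l)		pthread_mutex_unlock(l)

/* Hypothetical object that embeds a lock, like a log item would. */
struct mock_log_item {
	spinlock_t	lock;
	int		dirty_fields;
};

static void mock_item_init(struct mock_log_item *lip)
{
	int error = spin_lock_init(&lip->lock);

	/* A mutex must be initialised before first use. */
	assert(error == 0);
	(void)error;
	lip->dirty_fields = 0;
}

static void mock_item_mark_dirty(struct mock_log_item *lip, int fields)
{
	spin_lock(&lip->lock);
	lip->dirty_fields |= fields;
	spin_unlock(&lip->lock);
}
```

When the lock macros expand to nothing, a missing spin_lock_init() call goes unnoticed. With a mutex underneath, every embedded lock needs its init call, which is the extra work the commit mentions.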
3 years agoxfsprogs: introduce liburcu support
Dave Chinner [Wed, 29 Sep 2021 20:48:03 +0000 (16:48 -0400)] 
xfsprogs: introduce liburcu support

The upcoming buffer cache rework/kernel sync-up requires atomic
variables. I could use C++11 atomics built into GCC, but they are a
pain to work with and shoe-horn into the kernel atomic variable API.

Much easier is to introduce a dependency on liburcu - the userspace
RCU library. This provides atomic variables that very closely match
the kernel atomic variable API, and it provides a very similar
memory model and memory barrier support to the kernel. And we get
RCU support that has an identical interface to the kernel and works
the same way.

Hence kernel code written with RCU algorithms and atomic variables
will just slot straight into the userspace xfsprogs code without us
having to think about whether the lockless algorithms will work in
userspace or not. This reduces glue and hoop jumping, and gets us
a step closer to having the entire userspace libxfs code MT safe.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
[chandan.babu@oracle.com: Add m4 macros to detect availability of liburcu]
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
[sandeen: use dchinner's m4 macros]
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfsprogs: Release v5.13.0 v5.13.0
Eric Sandeen [Fri, 20 Aug 2021 16:03:57 +0000 (12:03 -0400)] 
xfsprogs: Release v5.13.0

Update all the necessary files for a 5.13.0 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfsprogs: Release v5.13.0-rc1 v5.13.0-rc1
Eric Sandeen [Mon, 2 Aug 2021 21:31:28 +0000 (17:31 -0400)] 
xfsprogs: Release v5.13.0-rc1

Update all the necessary files for a 5.13.0-rc1 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_quota: allow users to truncate group and project quota files
Darrick J. Wong [Wed, 28 Jul 2021 23:05:51 +0000 (19:05 -0400)] 
xfs_quota: allow users to truncate group and project quota files

In commit 79ac1ae4, I /think/ xfsprogs gained the ability to deal with
project or group quotas.  For some reason, the quota remove command was
structured so that if the user passes both -g and -p, it will only ask
the kernel to truncate the group quota file.  This is a strange behavior
since -ug results in truncation requests for both user and group quota
files, and the kernel is smart enough to return 0 if asked to truncate a
quota file that doesn't exist.

In other words, this is a seemingly arbitrary limitation of the command.
It's an unexpected behavior since we don't do any sort of parameter
validation to warn users when -p is silently ignored.  Modern V5
filesystems support both group and project quotas, so it's all the more
surprising that you can't do group and project all at once.  Remove this
pointless restriction.

Found while triaging xfs/007 regressions.

Fixes: 79ac1ae4 ("Fix xfs_quota disable, enable, off and remove commands Merge of master-melb:xfs-cmds:29395a by kenmcd.")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
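The restriction removed by the xfs_quota commit above comes down to an if/else-if chain. The following sketch contrasts the two behaviors. The MOCK_* bits and both functions are hypothetical and do not reproduce the actual xfs_quota source or its constants.

```c
#include <assert.h>

/* Hypothetical quota type bits; the real tool uses its own constants. */
#define MOCK_USER	(1u << 0)
#define MOCK_GROUP	(1u << 1)
#define MOCK_PROJ	(1u << 2)

/* Old behavior: -p is silently dropped whenever -g is also given. */
static unsigned int mock_remove_mask_old(unsigned int type)
{
	unsigned int mask = 0;

	if (type & MOCK_USER)
		mask |= MOCK_USER;
	if (type & MOCK_GROUP)
		mask |= MOCK_GROUP;
	else if (type & MOCK_PROJ)
		mask |= MOCK_PROJ;
	return mask;
}

/* New behavior: each requested quota type is handled independently. */
static unsigned int mock_remove_mask_new(unsigned int type)
{
	unsigned int mask = 0;

	if (type & MOCK_USER)
		mask |= MOCK_USER;
	if (type & MOCK_GROUP)
		mask |= MOCK_GROUP;
	if (type & MOCK_PROJ)
		mask |= MOCK_PROJ;
	return mask;
}
```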
4 years agoxfs_repair: invalidate dirhash entry when junking dirent
Darrick J. Wong [Wed, 28 Jul 2021 23:05:15 +0000 (19:05 -0400)] 
xfs_repair: invalidate dirhash entry when junking dirent

In longform_dir2_entry_check_data, we add the directory entries we find
to the incore dirent hash table after we've validated the name but
before we're totally done checking the entry.  This sequence is
necessary to detect all duplicated names in the directory.

Unfortunately, if we later decide to junk the ondisk dirent, we neglect
to invalidate the dirhash entry, so if the directory gets rebuilt, it
will get rebuilt with the entry that we rejected.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: allow callers to dump fs stats individually
Darrick J. Wong [Wed, 28 Jul 2021 23:03:18 +0000 (19:03 -0400)] 
xfs_io: allow callers to dump fs stats individually

Enable callers to decide if they want to see statfs, fscounts, or
geometry information (or any combination) from the xfs_io statfs
command.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agomkfs: validate rt extent size hint when rtinherit is set
Darrick J. Wong [Wed, 28 Jul 2021 23:03:16 +0000 (19:03 -0400)] 
mkfs: validate rt extent size hint when rtinherit is set

Extent size hints exist to nudge the behavior of the file data block
allocator towards trying to make aligned allocations.  Therefore, it
doesn't make sense to allow a hint that isn't a multiple of the
fundamental allocation unit for a given file.

This means that if the sysadmin is formatting with rtinherit set on the
root dir, validate_extsize_hint needs to check the hint value on a
simulated realtime file to make sure that it's correct.  Unfortunately,
the gate check here was for a nonzero rt extent size, which is wrong
since we never format with rtextsize==0.  This leads to absurd failures
such as:

# mkfs.xfs -f /dev/sdf -r extsize=7b -d rtinherit=0,extszinherit=13
illegal extent size hint 13, must be less than 649088 and a multiple of 7.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
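The gate-check mistake described in the mkfs commit above can be reduced to a small sketch. All sizes are in filesystem blocks, and the function names are hypothetical and do not reproduce validate_extsize_hint. The old gate applies the realtime alignment rule whenever the rt extent size is nonzero (which is always), while the fixed gate applies it only when rtinherit is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A hint is acceptable if it is a multiple of the allocation unit. */
static bool mock_hint_aligned(uint32_t hint, uint32_t unit)
{
	assert(unit != 0);	/* mkfs never formats with rtextsize == 0 */
	return hint % unit == 0;
}

/* Old gate: treat the hint as a realtime hint whenever rtextsize != 0. */
static bool mock_validate_old(uint32_t hint, uint32_t rtextsize, bool rtinherit)
{
	(void)rtinherit;	/* ignored, which is the bug */
	return mock_hint_aligned(hint, rtextsize != 0 ? rtextsize : 1);
}

/* Fixed gate: only apply the realtime rule when rtinherit is set. */
static bool mock_validate_new(uint32_t hint, uint32_t rtextsize, bool rtinherit)
{
	return mock_hint_aligned(hint, rtinherit ? rtextsize : 1);
}
```

With the values from the failing command in the commit message (a hint of 13 blocks, an rt extent size of 7 blocks, rtinherit=0), the old gate rejects the hint and the fixed gate accepts it.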
4 years agoxfs_repair: validate alignment of inherited rt extent hints
Darrick J. Wong [Wed, 28 Jul 2021 23:03:12 +0000 (19:03 -0400)] 
xfs_repair: validate alignment of inherited rt extent hints

If we encounter a directory that has been configured to pass on an
extent size hint to a new realtime file and the hint isn't an integer
multiple of the rt extent size, we should turn off the hint because that
is a misconfiguration.  Old kernels didn't check for this when copying
attributes into new files and would crash in the rt allocator.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
[sandeen: clarify that it is an extent size *hint*]
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: don't count fsmaps before querying fsmaps
Darrick J. Wong [Wed, 28 Jul 2021 23:01:23 +0000 (19:01 -0400)] 
xfs_io: don't count fsmaps before querying fsmaps

There's a bunch of code in fsmap.c that tries to count the GETFSMAP
records so that it can size the fsmap array appropriately for the
GETFSMAP call.  It's pointless to iterate the entire result set /twice/
(unlike the bmap command where the extent count is actually stored in
the fs metadata), so get rid of the duplicate walk.

In other words: Iterate over the records using the default chunk size
instead of doing one call to find the size and doing a giant allocation
and GETFSMAP call.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
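The iteration strategy adopted by the fsmap commit above (walk the records in fixed-size chunks instead of counting first and allocating one giant array) can be sketched as follows. This is a simplified model: mock_query() is a hypothetical stand-in for the kernel query, MOCK_CHUNK is an assumed chunk size, and an array index is used as the cursor, which is not how the real GETFSMAP interface resumes a walk.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_CHUNK	4	/* hypothetical default chunk size */

struct mock_rec {
	unsigned long long	start;
};

/*
 * Stand-in for the kernel query: fill 'buf' with up to 'cap' records,
 * beginning at record index 'next' of a result set with 'total' records.
 * Returns the number of records written.
 */
static size_t mock_query(size_t total, size_t next, struct mock_rec *buf,
			 size_t cap)
{
	size_t n = 0;

	while (n < cap && next + n < total) {
		buf[n].start = (next + n) * 8ULL;
		n++;
	}
	return n;
}

/* Walk every record with one fixed-size buffer; report how many calls ran. */
static size_t mock_dump_all(size_t total, size_t *calls)
{
	struct mock_rec buf[MOCK_CHUNK];
	size_t seen = 0, got;

	assert(calls != NULL);
	*calls = 0;
	do {
		got = mock_query(total, seen, buf, MOCK_CHUNK);
		(*calls)++;
		seen += got;	/* a real tool would print buf[0..got) here */
	} while (got == MOCK_CHUNK);	/* a short chunk means we are done */
	return seen;
}
```

The result set is traversed once, and memory use is bounded by the chunk size regardless of how many records the filesystem returns.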
4 years agoxfs_io: only print the header once when dumping fsmap in csv format
Darrick J. Wong [Wed, 28 Jul 2021 23:01:23 +0000 (19:01 -0400)] 
xfs_io: only print the header once when dumping fsmap in csv format

Only print the column names once when we're dumping fsmap information in
csv format.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: clean up the funshare command a bit
Darrick J. Wong [Wed, 28 Jul 2021 23:01:23 +0000 (19:01 -0400)] 
xfs_io: clean up the funshare command a bit

Add proper argument parsing to the funshare command so that when you
pass it the nonexistent --help option, it will print the help instead of
complaining that it can't convert that to a number.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: fix broken funshare_cmd usage
Darrick J. Wong [Wed, 28 Jul 2021 23:01:23 +0000 (19:01 -0400)] 
xfs_io: fix broken funshare_cmd usage

Create a funshare_cmd and use that to store information about the
xfs_io funshare command instead of overwriting the contents of
fzero_cmd.  This fixes confusing output like:

$ xfs_io -c 'fzero 2 3 --help' /
fzero: invalid option -- '-'
funshare off len -- unshares shared blocks within the range

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfsprogs: Release v5.13.0-rc0 v5.13.0-rc0
Eric Sandeen [Thu, 1 Jul 2021 17:37:53 +0000 (13:37 -0400)] 
xfsprogs: Release v5.13.0-rc0

Update all the necessary files for a 5.13.0-rc0 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: bunmapi has unnecessary AG lock ordering issues libxfs-5.13-sync
Dave Chinner [Wed, 30 Jun 2021 22:38:59 +0000 (18:38 -0400)] 
xfs: bunmapi has unnecessary AG lock ordering issues

Source kernel commit: 0fe0bbe00a6fb77adf75085b7d06b71a830dd6f2

Large directory block size operations are assert failing because
xfs_bunmapi() is not completely removing fragmented directory blocks
like so:

XFS: Assertion failed: done, file: fs/xfs/libxfs/xfs_dir2.c, line: 677
....
Call Trace:
xfs_dir2_shrink_inode+0x1a8/0x210
xfs_dir2_block_to_sf+0x2ae/0x410
xfs_dir2_block_removename+0x21a/0x280
xfs_dir_removename+0x195/0x1d0
xfs_rename+0xb79/0xc50
? avc_has_perm+0x8d/0x1a0
? avc_has_perm_noaudit+0x9a/0x120
xfs_vn_rename+0xdb/0x150
vfs_rename+0x719/0xb50
? __lookup_hash+0x6a/0xa0
do_renameat2+0x413/0x5e0
__x64_sys_rename+0x45/0x50
do_syscall_64+0x3a/0x70
entry_SYSCALL_64_after_hwframe+0x44/0xae

We are aborting the bunmapi() pass because of this specific chunk of
code:

	/*
	 * Make sure we don't touch multiple AGF headers out of order
	 * in a single transaction, as that could cause AB-BA deadlocks.
	 */
	if (!wasdel && !isrt) {
		agno = XFS_FSB_TO_AGNO(mp, del.br_startblock);
		if (prev_agno != NULLAGNUMBER && prev_agno > agno)
			break;
		prev_agno = agno;
	}

This is designed to prevent deadlocks in AGF locking when freeing
multiple extents by ensuring that we only ever lock in increasing
AG number order. Unfortunately, this also violates the "bunmapi will
always succeed" semantic that some high level callers depend on,
such as xfs_dir2_shrink_inode(), xfs_da_shrink_inode() and
xfs_inactive_symlink_rmt().

This AG lock ordering was introduced back in 2017 to fix deadlocks
triggered by generic/299 as reported here:

https://lore.kernel.org/linux-xfs/800468eb-3ded-9166-20a4-047de8018582@gmail.com/

This code is old enough that it predates our deferring of all
AG based extent freeing from within xfs_bunmapi(). That is, we never
actually lock AGs in xfs_bunmapi() any more - every non-rt based
extent free is added to the defer ops list, as is all BMBT block
freeing. And RT extents are not AG based, so there are no lock
ordering issues associated with them.

Hence this AGF lock ordering code is both broken and dead. Let's
just remove it so that the large directory block code works reliably
again.

Tested against xfs/538 and generic/299 which is the original test
that exposed the deadlocks that this code fixed.

Fixes: 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: btree format inode forks can have zero extents
Dave Chinner [Wed, 30 Jun 2021 22:38:59 +0000 (18:38 -0400)] 
xfs: btree format inode forks can have zero extents

Source kernel commit: 991c2c5980fb97ae6194f7c46b44f9446629eb4e

xfs/538 is assert failing with this trace when testing with
directory block sizes of 64kB:

XFS: Assertion failed: !xfs_need_iread_extents(ifp), file: fs/xfs/libxfs/xfs_bmap.c, line: 608
....
Call Trace:
xfs_bmap_btree_to_extents+0x2a9/0x470
? kmem_cache_alloc+0xe7/0x220
__xfs_bunmapi+0x4ca/0xdf0
xfs_bunmapi+0x1a/0x30
xfs_dir2_shrink_inode+0x71/0x210
xfs_dir2_block_to_sf+0x2ae/0x410
xfs_dir2_block_removename+0x21a/0x280
xfs_dir_removename+0x195/0x1d0
xfs_remove+0x244/0x460
xfs_vn_unlink+0x53/0xa0
? selinux_inode_unlink+0x13/0x20
vfs_unlink+0x117/0x220
do_unlinkat+0x1a2/0x2d0
__x64_sys_unlink+0x42/0x60
do_syscall_64+0x3a/0x70
entry_SYSCALL_64_after_hwframe+0x44/0xae

This is a check to ensure that the extents have been read into
memory before we do an ifork btree manipulation. This assert
is bogus in the above case.

We have a fragmented directory block that has more extents in it
than can fit in extent format, so the inode data fork is in btree
format. xfs_dir2_shrink_inode() asks to remove all remaining 16
filesystem blocks from the inode so it can convert to short form,
and __xfs_bunmapi() removes all the extents. We now have a data fork
in btree format but have zero extents in the fork. This incorrectly
trips the xfs_need_iread_extents() assert because it assumes that an
empty extent btree means the extent tree has not been read into
memory yet. This is clearly not the case with xfs_bunmapi(), as it
has an explicit call to xfs_iread_extents() in it to pull the
extents into memory before it starts unmapping.

Also, the assert directly after this bogus one is:

ASSERT(ifp->if_format == XFS_DINODE_FMT_BTREE);

Which covers the context in which it is legal to call
xfs_bmap_btree_to_extents just fine. Hence we should just remove the
bogus assert as it is clearly wrong and causes a regression.

This returns the test behaviour to the pre-existing assert failure in
xfs_dir2_shrink_inode() that indicates xfs_bunmapi() has failed to
remove all the extents in the range it was asked to unmap.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: validate extsz hints against rt extent size when rtinherit is set
Darrick J. Wong [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: validate extsz hints against rt extent size when rtinherit is set

Source kernel commit: 603f000b15f21ce8932f76689c7aa9fe58261cf5

The RTINHERIT bit can be set on a directory so that newly created
regular files will have the REALTIME bit set to store their data on the
realtime volume.  If an extent size hint (and EXTSZINHERIT) is set on
the directory, the hint will also be copied into the new file.

As pointed out in previous patches, for realtime files we require the
extent size hint be an integer multiple of the realtime extent, but we
don't perform the same validation on a directory with both RTINHERIT and
EXTSZINHERIT set, even though the only use-case of that combination is
to propagate extent size hints into new realtime files.  This leads to
inode corruption errors when the bad values are propagated.

Because there may be existing filesystems with such a configuration, we
cannot simply amend the inode verifier to trip on these directories and
call it a day because that will cause previously "working" filesystems
to start throwing errors abruptly.  Note that it's valid to have
directories with rtinherit set even if there is no realtime volume, in
which case the problem does not manifest because rtinherit is ignored if
there's no realtime device; and it's possible that someone set the flag,
crashed, repaired the filesystem (which clears the hint on the realtime
file) and continued.

Therefore, mitigate this issue in several ways: First, if we try to
write out an inode with both rtinherit/extszinherit set and an unaligned
extent size hint, turn off the hint to correct the error.  Second, if
someone tries to misconfigure a directory via the fssetxattr ioctl, fail
the ioctl.  Third, reverify both extent size hint values when we
propagate heritable inode attributes from parent to child, to prevent
misconfigurations from spreading.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: standardize extent size hint validation
Darrick J. Wong [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: standardize extent size hint validation

Source kernel commit: 6b69e485894b355b333bd286f0f0958e41d8754a

While chasing a bug involving invalid extent size hints being propagated
into newly created realtime files, I noticed that the xfs_ioctl_setattr
checks for the extent size hints weren't the same as the ones now
encoded in libxfs and used for validation in repair and mkfs.

Because the checks in libxfs are more stringent than the ones in the
ioctl, it's possible for a live system to set inode flags that
immediately result in corruption warnings.  Specifically, it's possible
to set an extent size hint on an rtinherit directory without checking if
the hint is aligned to the realtime extent size, which makes no sense
since that combination is used only to seed new realtime files.

Replace the open-coded and inadequate checks with the libxfs verifier
versions and update the code comments a bit.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: check free AG space when making per-AG reservations
Darrick J. Wong [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: check free AG space when making per-AG reservations

Source kernel commit: 0f9342513cc78a31a4a272a19b35eee4e8cd7107

The new online shrink code exposed a gap in the per-AG reservation
code, which is that we only return ENOSPC to callers if the entire fs
doesn't have enough free blocks.  Except for debugging mode, the
reservation init code doesn't ever check that there's enough free space
in that AG to cover the reservation.

Not having enough space is not considered an immediate fatal error that
requires filesystem offlining because (a) it shouldn't be possible to
wind up in that state through normal file operations and (b) even if
one did, freeing data blocks would recover the situation.

However, online shrink now needs to know if shrinking would not leave
enough space so that it can abort the shrink operation.  Hence we need
to promote this assertion into an actual error return.

Observed by running xfs/168 with a 1k block size, though in theory this
could happen with any configuration.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: restore old ioctl definitions
Darrick J. Wong [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: restore old ioctl definitions

Source kernel commit: e3c2b047475b52739bcf178a9e95176c42bbcf8f

These ioctl definitions in xfs_fs.h are part of the userspace ABI and
were mistakenly removed during the 5.13 merge window.

Fixes: 9fefd5db08ce ("xfs: convert to fileattr")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: introduce in-core global counter of allocbt blocks
Brian Foster [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: introduce in-core global counter of allocbt blocks

Source kernel commit: 16eaab839a9273ed156ebfccbd40c15d1e72f3d8

Introduce an in-core counter to track the sum of all allocbt blocks
used by the filesystem. This value is currently tracked per-ag via
the ->agf_btreeblks field in the AGF, which also happens to include
rmapbt blocks. A global, in-core count of allocbt blocks is required
to identify the subset of global ->m_fdblocks that consists of
unavailable blocks currently used for allocation btrees. To support
this calculation at block reservation time, construct a similar
global counter for allocbt blocks, populate it on first read of each
AGF and update it as allocbt blocks are used and released.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: unconditionally read all AGFs on mounts with perag reservation
Brian Foster [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: unconditionally read all AGFs on mounts with perag reservation

Source kernel commit: 2675ad3890db93e58f2264d07c2d1f615ec5adf7

perag reservation is enabled at mount time on a per AG basis. The
upcoming change to set aside allocbt blocks from block reservation
requires a populated allocbt counter as soon as possible after mount
to be fully effective against large perag reservations. Therefore as
a preparation step, initialize the pagf on all mounts where at least
one reservation is active. Note that this already occurs to some
degree on most default format filesystems as reservation requirement
calculations already depend on the AGF or AGI, depending on the
reservation type.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: update superblock counters correctly for !lazysbcount
Dave Chinner [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: update superblock counters correctly for !lazysbcount

Source kernel commit: 6543990a168acf366f4b6174d7bd46ba15a8a2a6

Keep the mount superblock counters up to date for !lazysbcount
filesystems so that when we log the superblock they do not need
updating in any way because they are already correct.

This was found via the reproducer that Zorro reported:
1. mkfs.xfs -f -l lazy-count=0 -m crc=0 $dev
2. mount $dev $mnt
3. fsstress -d $mnt -p 100 -n 1000 (maybe need more or less io load)
4. umount $mnt
5. xfs_repair -n $dev
and I've seen no problem with this patch.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reported-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: remove obsolete AGF counter debugging
Darrick J. Wong [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: remove obsolete AGF counter debugging

Source kernel commit: 1aec7c3d05670b92b7339b19999009a93808efb9

In commit f8f2835a9cf3 we changed the behavior of XFS to use EFIs to
remove blocks from an overfilled AGFL because there were complaints
about transaction overruns that stemmed from trying to free multiple
blocks in a single transaction.

Unfortunately, that commit missed a subtlety in the debug-mode
transaction accounting when a realtime volume is attached.  If a
realtime file undergoes a data fork mapping change such that realtime
extents are allocated (or freed) in the same transaction that a data
device block is also allocated (or freed), we can trip a debugging
assertion.  This can happen (for example) if a realtime extent is
allocated and it is necessary to reshape the bmbt to hold the new
mapping.

When we go to allocate a bmbt block from an AG, the first thing the data
device block allocator does is ensure that the freelist is the proper
length.  If the freelist is too long, it will trim the freelist to the
proper length.

In debug mode, trimming the freelist calls xfs_trans_agflist_delta() to
record the decrement in the AG free list count.  Prior to f8f28 we would
put the free block back in the free space btrees in the same
transaction, which calls xfs_trans_agblocks_delta() to record the
increment in the AG free block count.  Since AGFL blocks are included in
the global free block count (fdblocks), there is no corresponding
fdblocks update, so the AGFL free satisfies the following condition in
xfs_trans_apply_sb_deltas:

	/*
	 * Check that superblock mods match the mods made to AGF counters.
	 */
	ASSERT((tp->t_fdblocks_delta + tp->t_res_fdblocks_delta) ==
	       (tp->t_ag_freeblks_delta + tp->t_ag_flist_delta +
		tp->t_ag_btree_delta));

The comparison here used to be: (X + 0) == ((X+1) + -1 + 0), where X is
the number of blocks that were allocated.

After commit f8f28 we defer the block freeing to the next chained
transaction, which means that the calls to xfs_trans_agflist_delta and
xfs_trans_agblocks_delta occur in separate transactions.  The (first)
transaction that shortens the free list trips on the comparison, which
has now become:

(X + 0) == ((X) + -1 + 0)

because we haven't freed the AGFL block yet; we've only logged an
intention to free it.  When the second transaction (the deferred free)
commits, it will evaluate the comparison as:

(0 + 0) == (1 + 0 + 0)

and trip over that in turn.
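
The three comparisons above can be replayed with a small model. This is only a sketch: struct sketch_deltas and the sketch_* helpers are hypothetical stand-ins for the delta fields named in the ASSERT, not kernel code, and x plays the role of X in the text.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for the per-transaction delta
 * fields named in the ASSERT; this is not the kernel's struct xfs_trans.
 */
struct sketch_deltas {
	int64_t fdblocks;	/* models t_fdblocks_delta */
	int64_t res_fdblocks;	/* models t_res_fdblocks_delta */
	int64_t ag_freeblks;	/* models t_ag_freeblks_delta */
	int64_t ag_flist;	/* models t_ag_flist_delta */
	int64_t ag_btree;	/* models t_ag_btree_delta */
};

/* The comparison that the debug assertion performs. */
static bool sketch_deltas_match(const struct sketch_deltas *d)
{
	return (d->fdblocks + d->res_fdblocks) ==
	       (d->ag_freeblks + d->ag_flist + d->ag_btree);
}

/* Old behavior: the AGFL trim and the block free share one transaction. */
static struct sketch_deltas sketch_same_trans(int64_t x)
{
	struct sketch_deltas d = {
		.fdblocks = x, .ag_freeblks = x + 1, .ag_flist = -1,
	};
	return d;
}

/* New behavior, first transaction: trim only, the free is deferred. */
static struct sketch_deltas sketch_trim_trans(int64_t x)
{
	struct sketch_deltas d = {
		.fdblocks = x, .ag_freeblks = x, .ag_flist = -1,
	};
	return d;
}

/* New behavior, second transaction: the deferred free on its own. */
static struct sketch_deltas sketch_deferred_free_trans(void)
{
	struct sketch_deltas d = { .ag_freeblks = 1 };
	return d;
}
```

In this model, sketch_deltas_match() holds for sketch_same_trans() and fails for both of the split transactions, matching the arithmetic in the text.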

At this point, the astute reader may note that the two commits tagged by
this patch have been in the kernel for a long time but haven't generated
any bug reports.  How is it that the author became aware of this bug?

This originally surfaced as an intermittent failure when I was testing
realtime rmap, but a different bug report by Zorro Lang reveals the same
assertion occurring on !lazysbcount filesystems.

The common factor to both reports (and why this problem wasn't
previously reported) becomes apparent if we consider when
xfs_trans_apply_sb_deltas is called by __xfs_trans_commit():

	if (tp->t_flags & XFS_TRANS_SB_DIRTY)
		xfs_trans_apply_sb_deltas(tp);

With a modern lazysbcount filesystem, transactions update only the
percpu counters, so they don't need to set XFS_TRANS_SB_DIRTY, hence
xfs_trans_apply_sb_deltas is rarely called.

However, updates to the count of free realtime extents are not part of
lazysbcount, so XFS_TRANS_SB_DIRTY will be set on transactions adding or
removing data fork mappings to realtime files; similarly,
XFS_TRANS_SB_DIRTY is always set on !lazysbcount filesystems.

Dave mentioned in response to an earlier version of this patch:

"IIUC, what you are saying is that this debug code is simply not
exercised in normal testing and hasn't been for the past decade?  And it
still won't be exercised on anything other than realtime device testing?

"...it was debugging code from 1994 that was largely turned into dead
code when lazysbcounters were introduced in 2007. Hence I'm not sure it
holds any value anymore."

This debugging code isn't especially helpful - you can modify the
flcount on one AG and the freeblks of another AG, and it won't trigger.
Add the fact that nobody noticed for a decade, and let's just get rid of
it (and start testing realtime :P).

This bug was found by running generic/051 on either a V4 filesystem
lacking lazysbcount or a V5 filesystem with a realtime volume.

Cc: bfoster@redhat.com, zlang@redhat.com
Fixes: f8f2835a9cf3 ("xfs: defer agfl block frees when dfops is available")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: rename struct xfs_legacy_ictimestamp
Christoph Hellwig [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: rename struct xfs_legacy_ictimestamp

Source kernel commit: 732de7dbdbd30df40a6d260a8da6fc5262039439

Rename struct xfs_legacy_ictimestamp to struct xfs_log_legacy_timestamp
as it is a type used for logging timestamps with no relationship to the
in-core inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: rename xfs_ictimestamp_t
Christoph Hellwig [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: rename xfs_ictimestamp_t

Source kernel commit: 6fc277c7c935c7e1fdee23e82da988d9d3cb6bef

Rename xfs_ictimestamp_t to xfs_log_timestamp_t as it is a type used
for logging timestamps with no relationship to the in-core inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: remove XFS_IFEXTENTS
Christoph Hellwig [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: remove XFS_IFEXTENTS

Source kernel commit: b2197a36c0ef5b35a0ed83de744610a462da1ad3

The in-memory XFS_IFEXTENTS is now only used to check if an inode with
extents still needs the extents to be read into memory before doing
operations that need the extent map.  Add a new xfs_need_iread_extents
helper that returns true for btree format forks that do not have any
entries in the in-memory extent btree, and use that instead of checking
the XFS_IFEXTENTS flag.
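
A minimal sketch of the check such a helper performs, assuming a simplified fork type. The enum, the struct, and the incore_height field are hypothetical names for illustration and do not reproduce the real fork structure.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real fork structure has many more fields. */
enum sketch_fork_format {
	SKETCH_FMT_LOCAL,	/* data stored inline in the inode */
	SKETCH_FMT_EXTENTS,	/* extent list stored in the inode */
	SKETCH_FMT_BTREE,	/* extent btree rooted in the inode */
};

struct sketch_fork {
	enum sketch_fork_format	format;		/* fork format */
	unsigned int		incore_height;	/* 0: in-memory extent tree is empty */
};

/*
 * True only for a btree format fork whose in-memory extent tree has not
 * been populated yet, i.e. the extents still have to be read from disk.
 */
static bool sketch_need_iread_extents(const struct sketch_fork *fork)
{
	return fork->format == SKETCH_FMT_BTREE && fork->incore_height == 0;
}
```

The design point is that the answer is derived from state the fork already carries, so no separate flag has to be kept in sync with it.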

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: remove XFS_IFINLINE
Christoph Hellwig [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: remove XFS_IFINLINE

Source kernel commit: 0779f4a68d4df539a7ea624f7e1560f48aa46ad9

Just check for an inline format fork instead of using the equivalent
in-memory XFS_IFINLINE flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: remove XFS_IFBROOT
Christoph Hellwig [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: remove XFS_IFBROOT

Source kernel commit: ac1e067211d1476dae304e8881c10b40c90614d5

Just check for a btree format fork instead of using the equivalent
in-memory XFS_IFBROOT flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: only look at the fork format in xfs_idestroy_fork
Christoph Hellwig [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: only look at the fork format in xfs_idestroy_fork

Source kernel commit: 0eba048dd3b73fab6c97742468176dff58650860

Stop using the XFS_IFEXTENTS flag, and instead switch on the fork format
in xfs_idestroy_fork to decide how to clean up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: simplify xfs_attr_remove_args
Christoph Hellwig [Wed, 30 Jun 2021 22:38:58 +0000 (18:38 -0400)] 
xfs: simplify xfs_attr_remove_args

Source kernel commit: 605e74e29218bb22edd5ddcf90a4d37df00446cc

Return directly from the subfunctions and avoid the error variable.  Also
remove the unneeded dp local variable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: rename and simplify xfs_bmap_one_block
Christoph Hellwig [Wed, 30 Jun 2021 22:38:57 +0000 (18:38 -0400)] 
xfs: rename and simplify xfs_bmap_one_block

Source kernel commit: 2ac131df03d4f06bb0d825335663cc5064421993

xfs_bmap_one_block is only called for the attribute fork.  Move it to
xfs_attr.c, drop the unused whichfork argument and the code only executed
for the data fork, and rename the result to xfs_attr_is_leaf.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the XFS_IFEXTENTS check into xfs_iread_extents
Christoph Hellwig [Wed, 30 Jun 2021 22:38:57 +0000 (18:38 -0400)] 
xfs: move the XFS_IFEXTENTS check into xfs_iread_extents

Source kernel commit: 862a804aae3031e91bd0ae0b13c90a1b13d77af3

Move the XFS_IFEXTENTS check from the callers into xfs_iread_extents to
simplify the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: convert to fileattr
Miklos Szeredi [Wed, 30 Jun 2021 22:38:57 +0000 (18:38 -0400)] 
xfs: convert to fileattr

Source kernel commit: 9fefd5db08ce01abffffcdca3dc0964d9cb6ee69

Use the fileattr API to let the VFS handle locking, permission checking and
conversion.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: fix return of uninitialized value in variable error
Colin Ian King [Wed, 30 Jun 2021 22:38:57 +0000 (18:38 -0400)] 
xfs: fix return of uninitialized value in variable error

Source kernel commit: 3b6dd9a9aeeada19d0c820ff68e979243a888bb6

A previous commit removed a call to xfs_attr3_leaf_read that
assigned an error return code to variable error. We now have
a few early error return paths to label 'out' that return
error if error is set; however, error is now uninitialized,
so garbage is potentially being returned.  Fix this by setting
error to zero to restore the original behaviour where error
was zero at the label 'restart'.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: precalculate default inode attribute offset
Dave Chinner [Wed, 30 Jun 2021 22:38:57 +0000 (18:38 -0400)] 
xfs: precalculate default inode attribute offset

Source kernel commit: b2941046ea85d2cd94b485831bf03402f34f4060

Default attr fork offset is based on inode size, so is a fixed
geometry parameter of the inode. Move it to the xfs_ino_geometry
structure and stop calculating it on every call to
xfs_default_attroffset().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: default attr fork size does not handle device inodes
Dave Chinner [Wed, 30 Jun 2021 22:38:57 +0000 (18:38 -0400)] 
xfs: default attr fork size does not handle device inodes

Source kernel commit: 683ec9ba887d096a6cbd9a5778be9400efe6468c

Device inodes have a non-default data fork size of 8 bytes
as checked/enforced by xfs_repair. xfs_default_attroffset() doesn't
handle this, so let's do a minor refactor so it does.

Fixes: e6a688c33238 ("xfs: initialise attr fork on inode create")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: Use struct xfs_bmdr_block instead of struct xfs_btree_block to calculate root...
Chandan Babu R [Wed, 30 Jun 2021 22:38:57 +0000 (18:38 -0400)] 
xfs: Use struct xfs_bmdr_block instead of struct xfs_btree_block to calculate root node size

Source kernel commit: b6785e279d53ca5c4fa6be1146e85000870d73ef

The incore data fork of an inode stores the bmap btree root node as 'struct
xfs_btree_block'. However, the ondisk version of the inode stores the bmap
btree root node as a 'struct xfs_bmdr_block'.

xfs_bmap_add_attrfork_btree() checks if the btree root node fits inside the
data fork of the inode. However, it incorrectly uses 'struct xfs_btree_block'
to compute the size of the bmap btree root node. Since the size of 'struct
xfs_btree_block' is larger than that of 'struct xfs_bmdr_block',
xfs_bmap_add_attrfork_btree() could end up unnecessarily demoting the current
root node to be the child of a newly allocated root node.

This commit optimizes space usage by modifying xfs_bmap_add_attrfork_btree()
to use 'struct xfs_bmdr_block' to check if the bmap btree root node fits
inside the data fork of the inode.
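
The effect of measuring the root with the wrong header can be sketched as follows. Both header structs are simplified, hypothetical layouts, and the 8-byte key and pointer sizes and the 40-byte fork space are assumptions chosen for illustration; only the size relationship (incore header larger than ondisk root header) comes from the text above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact ondisk root header (stands in for xfs_bmdr_block). */
struct sketch_disk_root_hdr {
	uint16_t level;
	uint16_t numrecs;
};

/* Hypothetical larger incore block header (stands in for xfs_btree_block). */
struct sketch_incore_hdr {
	uint32_t magic;
	uint16_t level;
	uint16_t numrecs;
	uint64_t leftsib;
	uint64_t rightsib;
};

/* Assumed per-record cost in a root node: one key plus one pointer. */
#define SKETCH_KEY_BYTES	8u
#define SKETCH_PTR_BYTES	8u

/* Does a root with nrecs records fit in fork_bytes, given a header size? */
static bool sketch_root_fits(size_t hdr_bytes, unsigned int nrecs,
			     size_t fork_bytes)
{
	size_t need = hdr_bytes +
		      (size_t)nrecs * (SKETCH_KEY_BYTES + SKETCH_PTR_BYTES);

	return need <= fork_bytes;
}
```

With two records and 40 bytes of fork space, the compact header gives a root that fits, while the larger incore header makes the same root look too big. That false negative is the kind of needless demotion the commit avoids.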

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: deprecate BMV_IF_NO_DMAPI_READ flag
Anthony Iliopoulos [Wed, 30 Jun 2021 22:38:57 +0000 (18:38 -0400)] 
xfs: deprecate BMV_IF_NO_DMAPI_READ flag

Source kernel commit: fcb62c28031eeeb626392e6a338a90dedbdecf1c

Use of the flag has had no effect since kernel commit 288699fecaff
("xfs: drop dmapi hooks"), which removed all dmapi related code, so
deprecate it.

Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_crtime field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:38:56 +0000 (18:38 -0400)] 
xfs: move the di_crtime field to struct xfs_inode

Source kernel commit: e98d5e882b3ccb0f7f38d4e893fe60c1dd7934db

Move the crtime field from struct xfs_icdinode into struct xfs_inode and
remove the now entirely unused struct xfs_icdinode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_flags2 field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:36:05 +0000 (18:36 -0400)] 
xfs: move the di_flags2 field to struct xfs_inode

Source kernel commit: 3e09ab8fdc4d4c9d0afee7a63a3b39e5ade3c863

In preparation for removing the historic icinode struct, move the flags2
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_flags field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:35:12 +0000 (18:35 -0400)] 
xfs: move the di_flags field to struct xfs_inode

Source kernel commit: db07349da2f564742c0f23528691991e641e315e

In preparation for removing the historic icinode struct, move the flags
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_forkoff field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:34:59 +0000 (18:34 -0400)] 
xfs: move the di_forkoff field to struct xfs_inode

Source kernel commit: 7821ea302dca72469c558e382d6e4ae09232b7a7

In preparation for removing the historic icinode struct, move the
forkoff field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: use a union for i_cowextsize and i_flushiter
Christoph Hellwig [Wed, 30 Jun 2021 22:34:41 +0000 (18:34 -0400)] 
xfs: use a union for i_cowextsize and i_flushiter

Source kernel commit: ee7b83fd365e32beaa405d60b8c42f42ec5f42c2

The i_cowextsize field is only used for v3 inodes, and the i_flushiter
field is only used for v1/v2 inodes.  Use a union to pack the inode a
little better after adding a few missing guards around their usage.
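
A minimal sketch of the idea, assuming a simplified inode. struct sketch_inode, its version field, and the two accessors are hypothetical and only illustrate why guards are needed once the two fields share storage.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified inode; not the real struct xfs_inode. */
struct sketch_inode {
	uint8_t version;		/* inode version: 1, 2 or 3 */
	union {
		uint32_t cowextsize;	/* meaningful only for v3 inodes */
		uint16_t flushiter;	/* meaningful only for v1/v2 inodes */
	};
};

/* Guarded setter: refuses to touch the shared storage on v1/v2 inodes. */
static bool sketch_set_cowextsize(struct sketch_inode *ip, uint32_t value)
{
	if (ip->version < 3)
		return false;
	ip->cowextsize = value;
	return true;
}

/* Guarded update: refuses to touch the shared storage on v3 inodes. */
static bool sketch_bump_flushiter(struct sketch_inode *ip)
{
	if (ip->version >= 3)
		return false;
	ip->flushiter++;
	return true;
}
```

Because both members occupy the same bytes, an unguarded write to one would silently corrupt the other, which is why the guards have to be in place before the union is introduced.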

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_flushiter field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:33:57 +0000 (18:33 -0400)] 
xfs: move the di_flushiter field to struct xfs_inode

Source kernel commit: 965e0a1ad273ba61a8040220ef8ec09c9d065875

In preparation for removing the historic icinode struct, move the
flushiter field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_cowextsize field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:33:44 +0000 (18:33 -0400)] 
xfs: move the di_cowextsize field to struct xfs_inode

Source kernel commit: b33ce57d3e61020328582ce6d7dbae1d694ac496

In preparation for removing the historic icinode struct, move the
cowextsize field into the containing xfs_inode structure.  Also
switch to using xfs_extlen_t instead of uint32_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_extsize field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:33:29 +0000 (18:33 -0400)] 
xfs: move the di_extsize field to struct xfs_inode

Source kernel commit: 031474c28a3a9a2772a715d1ec9770f9068ea5a4

In preparation for removing the historic icinode struct, move the extsize
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_nblocks field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:32:58 +0000 (18:32 -0400)] 
xfs: move the di_nblocks field to struct xfs_inode

Source kernel commit: 6e73a545f91e128d8dd7da1769dca200225f5d82

In preparation for removing the historic icinode struct, move the nblocks
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_size field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:32:42 +0000 (18:32 -0400)] 
xfs: move the di_size field to struct xfs_inode

Source kernel commit: 13d2c10b05d8e67cb9b4c2d1d4a09a906148a72e

In preparation for removing the historic icinode struct, move the on-disk
size field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: move the di_projid field to struct xfs_inode
Christoph Hellwig [Wed, 30 Jun 2021 22:32:24 +0000 (18:32 -0400)] 
xfs: move the di_projid field to struct xfs_inode

Source kernel commit: ceaf603c7024d3c021803a3e90e893feda8d76e2

In preparation for removing the historic icinode struct, move the projid
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs: remove the di_dmevmask and di_dmstate fields from struct xfs_icdinode
Christoph Hellwig [Wed, 30 Jun 2021 22:29:44 +0000 (18:29 -0400)] 
xfs: remove the di_dmevmask and di_dmstate fields from struct xfs_icdinode

Source kernel commit: 9b3beb028ff5bed99473021d1a7de8747665ac32

The legacy DMAPI fields were never set by upstream Linux XFS, and have no
way to be read using the kernel APIs.  So instead of bloating the in-core
inode for them just copy them from the on-disk inode into the log when
logging the inode.  The only caveat is that we need to make sure to zero
the fields for newly read or deleted inodes, which is solved using a new
flag in the inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>