Create get and set functions for rtbitmap words so that we can redefine
the ondisk format with a specific endianness. Note that this requires
the definition of a distinct type for ondisk rtbitmap words so that the
compiler can perform proper typechecking as we go back and forth.
In the upcoming rtgroups feature, we're going to fix the problem that
rtwords are written in host endian order, which means we'll need the
distinct rtword/rtword_raw types.
Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
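The accessor idea described above can be sketched as follows. This is a minimal illustration, not the libxfs code: the function names are hypothetical, and the 32-bit big-endian layout is an assumption made only so that the byte order is visibly pinned down.

```python
import struct

# Assumed layout for illustration only: one 32-bit big-endian word.
RTWORD_FMT = ">I"
RTWORD_SIZE = struct.calcsize(RTWORD_FMT)


def rtword_get(buf, index):
    """Decode word number `index` of a raw bitmap block into a host integer."""
    (value,) = struct.unpack_from(RTWORD_FMT, buf, index * RTWORD_SIZE)
    return value


def rtword_set(buf, index, value):
    """Encode a host integer into word number `index` of a raw bitmap block."""
    struct.pack_into(RTWORD_FMT, buf, index * RTWORD_SIZE, value)
```

Because every read and write goes through the two helpers, the raw bytes and the host-side integer never mix, which is the role the distinct ondisk type plays for the C compiler.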
Create an explicit helper function to log parts of rt bitmap and summary
blocks. While we're at it, fix an off-by-one error in two of the
rtbitmap logging calls that led to unnecessarily large log items but was
otherwise benign.
Note that the upcoming rtgroups patchset will add block headers to the
rtbitmap and rtsummary files. The helpers in this and the next few
patches take a less than direct route through xfs_rbmblock_wordptr and
xfs_rsumblock_infoptr to avoid helper churn in that patchset.
Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
There are a bunch of places where we use open-coded logic to find a
pointer to an xfs_rtword_t within a rt bitmap buffer. Convert all that
to helper functions for better type safety.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Replace these macros with typechecked helper functions. Eventually
we're going to add more logic to the helpers and it'll be easier if we
don't have to macro it up.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Convert these calls to use the helpers, and clean up all these places
where the same variable can have different units depending on where it
is in the function.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Create helpers to do unit conversions of rt block numbers to rt extent
numbers. There are three variations -- one to compute the rt extent
number from an rt block number; one to compute the offset of an rt block
within an rt extent; and one to extract both.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
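The three variations described above amount to a quotient, a remainder, and both at once. A minimal sketch, with hypothetical helper names and rextsize (the number of rt blocks per rt extent) passed explicitly rather than read from a mount structure:

```python
def rtb_to_rtx(rtbno, rextsize):
    """Rt extent number that contains rt block `rtbno`."""
    return rtbno // rextsize


def rtb_to_rtxoff(rtbno, rextsize):
    """Offset of rt block `rtbno` within its rt extent, in blocks."""
    return rtbno % rextsize


def rtb_to_rtx_and_off(rtbno, rextsize):
    """Both values at once: (rt extent number, block offset within it)."""
    return divmod(rtbno, rextsize)
```

With rextsize = 4, rt block 10 lives in rt extent 2 at offset 2.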
Further disambiguate the xfs_rtblock_t uses by creating a new type,
xfs_rtxnum_t, to store the position of an extent within the realtime
section, in units of rtextents.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
This helper function validates that a range of *blocks* in the
realtime section is completely contained within the realtime section.
It does /not/ validate ranges of *rtextents*. Rename the function to
avoid suggesting that it does, and change the type of the @len parameter
since xfs_rtblock_t is a position unit, not a length unit.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
XFS uses xfs_rtblock_t for many different uses, which makes it much more
difficult to perform a unit analysis on the codebase. One of these
(ab)uses is when we need to store the length of a free space extent as
stored in the realtime bitmap. Because there can be up to 2^64 realtime
extents in a filesystem, we need a new type that is larger than
xfs_rtxlen_t for callers that are querying the bitmap directly. This
means scrub and growfs.
Create this type as "xfs_rtbxlen_t" and use it to store 64-bit rtx
lengths. 'b' stands for 'bitmap' or 'big'; reader's choice.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
In most of the filesystem, we use xfs_extlen_t to store the length of a
file (or AG) space mapping in units of fs blocks. Unfortunately, the
realtime allocator also uses it to store the length of a rt space
mapping in units of rt extents. This is confusing, since one rt extent
can consist of many fs blocks.
Separate the two by introducing a new type (xfs_rtxlen_t) to store the
length of a space mapping (in units of realtime extents) that would be
found in a file.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
The unit conversions in this function do not make sense. First we
convert a block count to bytes, then divide that bytes value by
rextsize, which is in blocks, to get an rt extent count. You can't
divide bytes by blocks to get a (possibly multiblock) extent value.
Fortunately nobody uses delalloc on the rt volume so this hasn't
mattered.
Fixes: fa5c836ca8eb5 ("xfs: refactor xfs_bunmapi_cow") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
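The unit mismatch described above can be seen with small numbers. This is only a sketch of the arithmetic, with hypothetical function names; it assumes a block size given as a power-of-two shift (blocklog) and rextsize in blocks.

```python
def rtx_count_mixed_units(blocks, blocklog, rextsize):
    """Old shape of the calculation: bytes divided by blocks (meaningless)."""
    nbytes = blocks << blocklog        # block count -> bytes
    return nbytes // rextsize          # bytes / (blocks per rtx): wrong unit


def rtx_count(blocks, rextsize):
    """Consistent units: blocks divided by blocks-per-rt-extent."""
    return blocks // rextsize
```

For 8 blocks of 4096 bytes and rextsize = 4, the mixed-unit version yields 8192 while the consistent one yields 2 rt extents.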
Currently, xfs_bmap_del_extent_real contains a bunch of code to convert
the physical extent of a data fork mapping for a realtime file into rt
extents and pass that to the rt extent freeing function. Since the
details of this aren't needed when CONFIG_XFS_REALTIME=n, move it to
xfs_rtbitmap.c to reduce code size when realtime isn't enabled.
This will (one day) enable realtime EFIs to reuse the same
unit-converting call with less code duplication.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
The latest version of the fs geometry structure is v5. Bump this
constant so that xfs_db and mkfs calls to libxfs_fs_geometry will fill
out all the fields.
IOWs, this commit is a no-op for the kernel, but will be useful for
userspace reporting in later changes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Carlos Maiolino [Tue, 23 Jan 2024 10:31:17 +0000 (11:31 +0100)]
Merge tag 'scruball-service-fixes-6.6_2024-01-11' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next
xfs_scrub_all: fixes for systemd services [v28.3 5/6]
This patchset ties up some problems in the xfs_scrub_all program and
service, which are essential for finding mounted filesystems to scrub
and creating the background service instances that do the scrub.
First, we need to fix various errors in pathname escaping, because
systemd does /not/ like slashes in service names. Then, teach
xfs_scrub_all to deal with systemd restarts causing it to think that a
scrub has finished before the service actually finishes. Finally,
implement a signal handler so that SIGINT (console ^C) and SIGTERM
(systemd stopping the service) shut down the xfs_scrub@ services
correctly.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Carlos Maiolino [Tue, 23 Jan 2024 10:30:03 +0000 (11:30 +0100)]
Merge tag 'scrub-service-fixes-6.6_2024-01-11' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next
xfs_scrub: fixes for systemd services [v28.3 4/6]
This series fixes deficiencies in the systemd services that were created
to manage background scans. First, improve the debian packaging so that
services get installed at package install time. Next, fix copyright and
spdx header omissions.
Then, fix bugs in the mailer scripts so that scrub failures are
reported effectively. Finally, fix xfs_scrub_all to deal with systemd
restarts causing it to think that a scrub has finished before the
service actually finishes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Carlos Maiolino [Tue, 23 Jan 2024 10:28:50 +0000 (11:28 +0100)]
Merge tag 'scrub-repair-fixes-6.6_2024-01-11' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next
xfs_scrub: fixes to the repair code [v28.3 3/6]
Now that we've landed the new kernel code, it's time to reorganize the
xfs_scrub code that handles repairs. Clean up various naming warts and
misleading error messages. Move the repair code to scrub/repair.c as
the first step. Then, fix various issues in the repair code before we
start reorganizing things.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Carlos Maiolino [Tue, 23 Jan 2024 10:27:23 +0000 (11:27 +0100)]
Merge tag 'scrub-fix-legalese-6.6_2024-01-11' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next
xfs_scrub: fix licensing and copyright notices [v28.3 2/6]
Fix various attribution problems in the xfs_scrub source code, such as
the author's contact information, out of date SPDX tags, and a rough
estimate of when the feature was under heavy development. The most
egregious parts are the files that are missing license information
completely.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:07 +0000 (18:07 -0800)]
xfs_scrub_all: fix termination signal handling
Currently, xfs_scrub_all does not handle termination signals well.
SIGTERM and SIGINT are left to their default handlers, which are
immediate termination of the process group in the case of SIGTERM and
raising KeyboardInterrupt in the case of SIGINT.
Terminating the process group is fine when the xfs_scrub processes are
direct children, but this completely doesn't work if we're farming the
work out to systemd services since we don't terminate the child service.
Instead, they keep going.
Raising KeyboardInterrupt doesn't work because once the main thread
calls sys.exit at the bottom of main(), it blocks in the python runtime
waiting for child threads to terminate. There's no longer any context
to handle an exception, so the signal is ignored and no child processes
are killed.
In other words, if you try to kill a running xfs_scrub_all, chances are
good it won't kill the child xfs_scrub processes. This is undesirable
and egregious since we actually have the ability to track and kill all
the subprocesses that we create.
Solve the subproblem of getting stuck in the python runtime by calling
it repeatedly until we no longer have subprocesses. This means that the
main thread loops until all threads have exited.
Solve the subproblem of the signals doing the wrong thing by setting up
our own signal handler that can wake up the main thread and initiate
subprocess shutdown, no matter whether the subprocesses are systemd
services or directly fork/exec'd.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
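The two-part fix described above might look roughly like this. It is a sketch under assumptions, not the xfs_scrub_all source: all names are hypothetical, and workers are plain threads that each wait on one child process.

```python
import signal
import subprocess
import threading

terminate = threading.Event()       # set once SIGINT/SIGTERM arrives
children = set()                    # Popen objects for running scrub workers
children_lock = threading.Lock()


def on_signal(signum, frame):
    """Only record the request; the main loop performs the shutdown."""
    terminate.set()


def install_handlers():
    """Route both termination signals to our handler (main thread only)."""
    signal.signal(signal.SIGINT, on_signal)
    signal.signal(signal.SIGTERM, on_signal)


def run_child(argv):
    """Worker body: start one child, track it, and wait for it to exit."""
    proc = subprocess.Popen(argv)
    with children_lock:
        children.add(proc)
    try:
        proc.wait()
    finally:
        with children_lock:
            children.discard(proc)


def kill_children():
    """Ask every still-running child to terminate."""
    with children_lock:
        for proc in list(children):
            if proc.poll() is None:
                proc.terminate()


def wait_for_workers(workers, poll_interval=0.1):
    """Join in short slices so the main thread keeps noticing signals."""
    while any(t.is_alive() for t in workers):
        if terminate.is_set():
            kill_children()
        for t in workers:
            t.join(poll_interval)
```

The main thread never blocks indefinitely: it loops until every worker has exited, and once the event is set it keeps terminating whatever children are still registered. For systemd-managed workers, kill_children would stop the service unit instead of signalling a direct child.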
Darrick J. Wong [Fri, 12 Jan 2024 02:07:06 +0000 (18:07 -0800)]
xfs_scrub_all.cron: move to package data directory
cron jobs don't belong in /usr/lib. Since the cron job is also
secondary to the systemd timer, it's really only provided as a courtesy
for distributions that don't use systemd. Move it to @datadir@, aka
/usr/share/xfsprogs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:06 +0000 (18:07 -0800)]
xfs_scrub_fail: move executable script to /usr/libexec
Per FHS 3.0, non-PATH executable binaries are supposed to live under
/usr/libexec, not /usr/lib. xfs_scrub_fail is an executable script,
so move it to libexec in case some distro some day tries to mount
/usr/lib as noexec or something.
Darrick J. Wong [Fri, 12 Jan 2024 02:07:06 +0000 (18:07 -0800)]
xfs_scrub_all: survive systemd restarts when waiting for services
If xfs_scrub_all detects a running systemd, it will use it to invoke
xfs_scrub subprocesses in a sandboxed and resource-controlled
environment. Unfortunately, if you happen to restart dbus or systemd
while it's running, you get this:
systemd[1]: Reexecuting.
xfs_scrub_all[9958]: Warning! D-Bus connection terminated.
xfs_scrub_all[9956]: Warning! D-Bus connection terminated.
xfs_scrub_all[9956]: Failed to wait for response: Connection reset by peer
xfs_scrub_all[9958]: Failed to wait for response: Connection reset by peer
xfs_scrub_all[9930]: Scrubbing / done, (err=1)
xfs_scrub_all[9930]: Scrubbing /storage done, (err=1)
The xfs_scrub units themselves are still running, it's just that the
`systemctl start' command that xfs_scrub_all uses to start and wait for
the unit lost its connection to dbus and hence is no longer monitoring
sub-services.
When this happens, we don't have great options -- systemctl doesn't have
a command to wait on an activating (aka running) unit. Emulate the
functionality we normally get by polling the failed/active statuses.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
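The polling fallback described above could be sketched as below. The function names and the mapping of states to return codes are assumptions for illustration; the only external command used is `systemctl is-active UNIT`, which prints a one-word state such as "activating", "inactive", or "failed".

```python
import subprocess
import time


def unit_state(unit):
    """Return the one-word state that `systemctl is-active` prints."""
    proc = subprocess.run(["systemctl", "is-active", unit],
                          stdout=subprocess.PIPE, universal_newlines=True)
    return proc.stdout.strip()


def wait_for_unit(unit, get_state=unit_state, interval=1.0, sleep=time.sleep):
    """Poll until the unit stops running; return 1 on failure, else 0."""
    while True:
        state = get_state(unit)
        if state == "failed":
            return 1
        # Assumption: any state outside this set means the unit has finished.
        if state not in ("activating", "active", "deactivating", "reloading"):
            return 0
        sleep(interval)
```

The state probe and the sleep function are parameters so that the loop can be exercised without a running systemd.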
Darrick J. Wong [Fri, 12 Jan 2024 02:07:06 +0000 (18:07 -0800)]
xfs_scrub_all: fix argument passing when invoking xfs_scrub manually
Currently, xfs_scrub_all will try to invoke xfs_scrub with argv[1] being
"-n -x". This of course is recognized by C getopt as a weird looking
string, not two individual arguments, and causes the child process to
exit with complaints about CLI usage.
What we really want is to split the string into a proper array and then
add them to the xfs_scrub command line. The code here isn't strictly
correct, but as @scrub_args@ is controlled by us in the Makefile, it'll
do for now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
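The difference between one "-n -x" argument and two separate arguments can be sketched as follows. The helper name is hypothetical, and shlex.split is shown as one quoting-aware way to do the split; it is not necessarily what the patch itself uses.

```python
import shlex

SCRUB_ARGS = "-n -x"    # stands in for the @scrub_args@ substitution


def build_scrub_cmd(mountpoint, scrub_args=SCRUB_ARGS):
    """Build an argv list with each option as its own element."""
    return ["xfs_scrub"] + shlex.split(scrub_args) + [mountpoint]
```

Passing the resulting list to subprocess gives the child argv[1] = "-n" and argv[2] = "-x", which is what getopt expects.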
Darrick J. Wong [Fri, 12 Jan 2024 02:07:05 +0000 (18:07 -0800)]
xfs_scrub_fail: return the failure status of the mailer program
We should return the exit code of the mailer program sending the scrub
failure reports, since that's much more important to anyone watching the
system.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:04 +0000 (18:07 -0800)]
xfs_scrub: don't report media errors for space with unknowable owner
On filesystems that don't have the reverse mapping feature enabled, the
GETFSMAP call cannot tell us much about the owner of a space extent --
we're limited to static fs metadata, free space, or "unknown". In this
case, nothing is corrupt, so str_corrupt is not an appropriate logging
function. Relax this to str_info so that the user still sees a notice
that media errors were found, even if the directory tree walker cannot
find the file owning the space where the media error was found.
Filesystems with rmap enabled are never supposed to return OWN_UNKNOWN
from a GETFSMAP report, so continue to report that as a corruption.
This fixes a regression in xfs/556.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:04 +0000 (18:07 -0800)]
xfs_scrub: update copyright years for scrub/ files
Update the copyright years in the scrub/ source code files. This isn't
required, but it's helpful to remind myself just how long it's taken to
develop this feature.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:05 +0000 (18:07 -0800)]
xfs_scrub_fail: fix sendmail detection
This script emails the results of failed scrub runs to root. We
shouldn't be hardcoding the path to the mailer program because distros
can change the path according to their whim. Modify this script to use
command -v to find the program.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:04 +0000 (18:07 -0800)]
xfs_scrub: flush stdout after printing to it
Make sure we flush stdout after printf'ing to it, especially before we
start any operation that could take a while to complete. Most of scrub
already does this, but we missed a couple of spots.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:05 +0000 (18:07 -0800)]
xfs_scrub: fix pathname escaping across all service definitions
systemd services provide an "instance name" that can be associated with
a particular invocation of a service. This allows service users to
invoke multiple copies of a service, each with a unique string. For
xfs_scrub, we pass the mountpoint of the filesystem as the instance
name. However, systemd services aren't supposed to have slashes in
them, so we're supposed to escape them.
The canonical escaping scheme for pathnames is defined by the
systemd-escape --path command. Unfortunately, we've been adding our own
opinionated sauce for years, to work around the fact that --path didn't
exist in systemd before January 2017. The special sauce is incorrect,
and we no longer care about systemd of 7 years past.
Clean up this mess by following the systemd escaping scheme throughout
the service units. Now we can use the '%f' specifier in them, which
makes things a lot less complicated.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
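For reference, the path escaping rules documented in systemd.unit(5) can be approximated in a few lines. This is a hypothetical re-implementation for illustration only; `systemd-escape --path` remains the authority, and this sketch does not handle "." or ".." path components.

```python
def escape_path(path):
    """Approximate `systemd-escape --path` for a simple absolute path."""
    # Leading, trailing, and duplicate slashes are dropped first.
    parts = [p for p in path.split("/") if p]
    if not parts:
        return "-"                      # the root directory escapes to "-"
    out = []
    for i, byte in enumerate("/".join(parts).encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")
        elif (byte < 128 and ch.isalnum()) or ch in ":_" or (ch == "." and i > 0):
            out.append(ch)
        else:
            # Everything else, including "-" itself, becomes a \xNN escape.
            out.append("\\x%02x" % byte)
    return "".join(out)
```

Under these rules /storage becomes storage and /mnt/my-data becomes mnt-my\x2ddata, so a mountpoint never contributes a slash to a unit name.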
Darrick J. Wong [Fri, 12 Jan 2024 02:07:05 +0000 (18:07 -0800)]
xfs_scrub_all: escape service names consistently
This program is not consistent as to whether or not it escapes the
pathname that is being used as the xfs_scrub service instance name.
Fix it to be consistent, and to fall back to direct invocation if
escaping doesn't work. The escaping itself is also broken, but we'll
fix that in the next patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:05 +0000 (18:07 -0800)]
debian: install scrub services with dh_installsystemd
Use dh_installsystemd to handle the installation and activation of the
scrub systemd services. This requires bumping the compat version to 11.
Note that the services are /not/ activated on installation.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Fri, 12 Jan 2024 02:07:03 +0000 (18:07 -0800)]
libxfs: fix krealloc to allow freeing data
A recent refactoring to xfs_idata_realloc in the kernel made it depend
on krealloc returning NULL if the new size is zero. The xfsprogs
wrapper instead aborts, so we need to make it follow the kernel
behavior.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Darrick J. Wong [Wed, 20 Dec 2023 16:53:47 +0000 (08:53 -0800)]
xfs_scrub: try to use XFS_SCRUB_IFLAG_FORCE_REBUILD
Now that we have a FORCE_REBUILD flag to the scrub ioctl, try to use
that over the (much noisier) error injection knob, which may or may not
even be enabled in the kernel config.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 20 Dec 2023 16:53:43 +0000 (08:53 -0800)]
xfs_copy: actually do directio writes to block devices
Not sure why block device targets don't get O_DIRECT in !buffered mode,
but it's misleading when the copy completes instantly only to stall
forever due to fsync-on-close. Adjust the "write last sector" code to
allocate a properly aligned buffer.
In removing the onstack buffer for EOD writes, this also corrects the
buffer being larger than necessary -- the old code declared an array of
32768 pointers, whereas all we really need is an aligned 32768-byte
buffer.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 20 Dec 2023 16:53:46 +0000 (08:53 -0800)]
xfs_scrub: don't retry unsupported optimizations
If the kernel says it doesn't support optimizing a data structure, we
should mark it done and move on. This is much better than requeuing the
repair, in which case it will likely keep failing. Eventually these
requeued repairs end up in the single-threaded last resort at the end of
phase 4, which makes things /very/ slow.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 20 Dec 2023 16:53:46 +0000 (08:53 -0800)]
xfs_scrub: handle spurious wakeups in scan_fs_tree
Coverity reminded me that the pthread_cond_wait can wake up and return
without the predicate variable (sft.nr_dirs > 0) actually changing.
Therefore, one has to retest the condition after each wakeup.
Coverity-id: 1554280 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
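The retest-after-wakeup pattern described above applies to any condition variable. A minimal sketch using Python's threading.Condition, with hypothetical names modelled on the nr_dirs counter mentioned in the message:

```python
import threading


class ScanState:
    def __init__(self):
        self.cond = threading.Condition()
        self.nr_dirs = 0                # directories still being scanned


def dir_started(state):
    with state.cond:
        state.nr_dirs += 1


def dir_finished(state):
    with state.cond:
        state.nr_dirs -= 1
        if state.nr_dirs == 0:
            state.cond.notify_all()


def wait_for_scan(state):
    with state.cond:
        # Loop, not "if": a wakeup does not prove the predicate changed.
        while state.nr_dirs > 0:
            state.cond.wait()
```

Replacing the while with an if would let wait_for_scan return on the first wakeup even though directories are still outstanding.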
Darrick J. Wong [Wed, 20 Dec 2023 16:53:46 +0000 (08:53 -0800)]
xfs_io: extract control number parsing routines
Break out the parts of parse_args that extract control numbers from the
CLI arguments, so that the function isn't as long. This isn't all that
exciting now, but the scrub vectorization speedups will introduce a new
ioctl. For the new command that comes with that, we'll want the control
number parsing helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 20 Dec 2023 16:53:45 +0000 (08:53 -0800)]
xfs_io: collapse trivial helpers
Simplify the call chain by having parse_args set the scrub ioctl
parameters in the caller's object. The parse_args callers can then
invoke the ioctl directly, eliminating one function and one indirect
call.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 20 Dec 2023 16:53:45 +0000 (08:53 -0800)]
xfs_mdrestore: refactor progress printing and sb fixup code
Now that we've fixed the dissimilarities between the two progress
printing callsites, refactor them into helpers. Do the same for the
duplicate code that clears sb_inprogress from the primary superblock
after the copy succeeds.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 20 Dec 2023 16:53:45 +0000 (08:53 -0800)]
xfs_mdrestore: fix missed progress reporting
Currently, the progress reporting only triggers when the number of bytes
read is exactly a multiple of a megabyte. This isn't always guaranteed,
since AG headers can be 512 bytes in size. Fix the algorithm by
recording the number of megabytes we've reported as being read, and emit
a new report any time the bytes_read count, once converted to megabytes,
doesn't match.
Fix the v2 code to emit one final status message in case the last
extent restored is more than a megabyte.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
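The reporting rule described above can be sketched like this. The class and attribute names are hypothetical; the point is that a report is triggered by the megabyte count changing, not by the byte count being an exact multiple of a megabyte.

```python
MIB = 1 << 20


class Progress:
    def __init__(self):
        self.bytes_read = 0
        self.mb_reported = 0            # last megabyte count we announced
        self.reports = []               # stands in for printing to the terminal

    def advance(self, nbytes):
        """Account for nbytes more input and report if the MB count moved."""
        self.bytes_read += nbytes
        mb_now = self.bytes_read // MIB
        if mb_now != self.mb_reported:
            self.reports.append(mb_now)
            self.mb_reported = mb_now
```

After a 512-byte read followed by a one-megabyte read, bytes_read is never a multiple of a megabyte, yet this version still reports once the count reaches 1 MB.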
Darrick J. Wong [Wed, 20 Dec 2023 16:53:44 +0000 (08:53 -0800)]
xfs_mdrestore: fix uninitialized variables in mdrestore main
Coverity complained about the "is fd a file?" flags being uninitialized.
Clean this up.
Coverity-id: 1554270 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
Darrick J. Wong [Wed, 20 Dec 2023 16:53:43 +0000 (08:53 -0800)]
libxfs: don't UAF a requeued EFI
In the kernel, commit 8ebbf262d4684 ("xfs: don't block in busy flushing
when freeing extents") changed the allocator behavior such that AGFL
fixing can return -EAGAIN in response to detection of a deadlock with
the transaction busy extent list. If this happens, we're supposed to
requeue the EFI so that we can roll the transaction and try the item
again.
If a requeue happens, we should not free the xefi pointer in
xfs_extent_free_finish_item or else the retry will walk off a dangling
pointer. There is no extent busy list in userspace so this should
never happen, but let's fix the logic bomb anyway.
We should have ported kernel commit 0853b5de42b47 ("xfs: allow extent
free intents to be retried") to userspace, but neither Carlos nor I
noticed this fine detail. :(
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
libxfs: split out a libxfs_dev structure from struct libxfs_init
Most of the content of libxfs_init is members duplicated for each of the
data, log and RT devices. Split those members into a separate
libxfs_dev structure.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: stash away the device fd in struct xfs_buftarg
Cache the open file descriptor for each device in the buftarg
structure and remove the now unused dev_map infrastructure.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
xfs_repair: remove various libxfs_device_to_fd calls
A few places in xfs_repair call libxfs_device_to_fd to get the data
device fd from the data device dev_t stored in the libxfs_init
structure. Just use the file descriptor stored right there directly.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
No need to do a dev_t to fd lookup when the caller already has the fd.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: return the opened fd from libxfs_device_open
So that the caller can stash it away without having to call
libxfs_device_to_fd.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs_device_open and libxfs_device_close are only used in init.c.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: remove dead size < 0 checks in libxfs_init
libxfs_init initializes the device size to 0 at the start of the function
and libxfs_open_device never sets the size to a negative value. Remove
these checks as they are dead code.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libfrog: make platform_set_blocksize exit on fatal failure
platform_set_blocksize has a fatal argument that is currently only
used to change the printed message. Make it actually fatal similar to
other libfrog platform helpers to simplify the caller.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: remove the setblksize == 1 case in libxfs_device_open
All callers of libxfs_init always pass an actual sector size or zero in
the setblksize member. Remove the unreachable setblksize == 1 case.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: make passing flags to libxfs_init less confusing
The libxfs_xinit structure has four different ways to pass flags to
libxfs_init:
- the isreadonly argument, despite its name, contains various LIBXFS_
flags that go beyond just the readonly flag
- the isdirect flag contains a single LIBXFS_ flag from the same namespace
- the usebuflock is an integer used as bool
- the bcache_flags member is used to pass flags directly to cache_init()
for the buffer cache
While there are good arguments for keeping the last one separate, all the
others are rather confusing. Consolidate them into a single flags member
using flags in the LIBXFS_* namespace.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: merge the file vs device cases in libxfs_init
The only special handling for an XFS device on a regular file is that
we skip the checks in check_open. Simply perform those conditionally
instead of duplicating the entire sequence.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: pass a struct libxfs_init to libxfs_alloc_buftarg
Pass a libxfs_init structure to libxfs_alloc_buftarg instead of three
separate dev_t values.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Pass a libxfs_init structure to libxfs_mount instead of three separate
dev_t values.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Make the struct name more usual, and remove the libxfs_init_t typedef.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxlog: remove the global libxfs_xinit x structure
There is no need to export a libxfs_xinit with the somewhat unsuitable
name x from libxlog. Move it into the tools linking against libxlog
that actually need it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxlog: don't require a libxfs_xinit structure for xlog_init
xlog_init currently requires a libxfs_xinit structure to be passed in,
and then clobbers various log-related arguments to it. There is no
good reason for that as all the required information can be calculated
without it.
Remove the x argument to xlog_init and xlog_is_dirty and the now unused
logBBstart member in struct libxfs_xinit.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxlog: add a helper to initialize a xlog without clobbering the x structure
xfsprogs has three copies of a code sequence to initialize an xlog
structure from a libxfs_init structure. Factor the code into a helper.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxlog: remove the verbose argument to xlog_is_dirty
No caller passes a non-zero verbose argument to xlog_is_dirty.
Remove the argument and the code keyed off it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
xfs_logprint: move all code to set up the fake xlog into logstat()
Isolate the code that sets up the fake xlog into the logstat() helper to
prepare for upcoming changes.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
IRIX has the concept of a volume that has data/log/rt subvolumes (that's
where the subvolume name in Linux comes from), but in the current
Linux-only xfsprogs, pretending that we do anything with that concept
is just utterly confusing. The volname is basically just a very
obfuscated second way to pass the data device name, so get rid of it
in the libxfs and progs internals.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Stop pretending we try to distinguish between the legacy Unix raw and
block device nodes. Linux, as the only currently supported platform,
never had them, and other modern Unix variants like FreeBSD also got
rid of this distinction years ago.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: remove the dead {d,log,rt}path variables in libxfs_init
These variables are only initialized, and then unlink is called if they
were changed from the initial value, which can't happen. Remove the
variables and the conditional unlink calls.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
libxfs: remove the unused icache_flags member from struct libxfs_xinit
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Eric Biggers [Fri, 13 Oct 2023 06:26:39 +0000 (23:26 -0700)]
xfs_io/encrypt: support specifying crypto data unit size
Add an '-s' option to the 'set_encpolicy' command of xfs_io to allow
exercising the log2_data_unit_size field that is being added to struct
fscrypt_policy_v2 (kernel patch:
https://lore.kernel.org/linux-fscrypt/20230925055451.59499-6-ebiggers@kernel.org).
The xfs_io support is needed for xfstests
(https://lore.kernel.org/fstests/20231013061403.138425-1-ebiggers@kernel.org),
which currently relies on xfs_io to access the encryption ioctls.
Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Carlos Maiolino <cem@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
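The field name log2_data_unit_size suggests that the policy stores the base-2 logarithm of the data unit size rather than the size in bytes. The following is a minimal sketch of that conversion only; the helper name demo_log2_data_unit_size is hypothetical, and how xfs_io actually parses and validates the '-s' argument is not described in this commit message.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from xfs_io): convert a data unit size
 * in bytes to its base-2 logarithm.
 * Returns -1 if the size is zero or not a power of two.
 */
static int demo_log2_data_unit_size(uint32_t size)
{
	int log2 = 0;

	/* A power of two has exactly one bit set. */
	if (size == 0 || (size & (size - 1)) != 0)
		return -1;

	while (size > 1) {
		size >>= 1;
		log2++;
	}
	return log2;
}
```

Under these assumptions, a 4096-byte data unit would be stored as 12, and a size such as 3000 would be rejected.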
Chandan Babu R [Mon, 6 Nov 2023 13:10:54 +0000 (18:40 +0530)]
mdrestore: Add support for passing log device as an argument
metadump v2 format allows dumping metadata from external log devices. This
commit allows passing the device file to which log data must be restored from
the corresponding metadump file.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Chandan Babu R [Mon, 6 Nov 2023 13:10:53 +0000 (18:40 +0530)]
mdrestore: Define mdrestore ops for v2 format
This commit adds functionality to restore a metadump stored in the v2 format.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Chandan Babu R [Mon, 6 Nov 2023 13:10:52 +0000 (18:40 +0530)]
mdrestore: Extract target device size verification into a function
A future commit will need to perform the device size verification on an
external log device. In preparation for this, this commit extracts the
relevant portions into a new function. No functional changes have been
introduced.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Chandan Babu R [Mon, 6 Nov 2023 13:10:51 +0000 (18:40 +0530)]
mdrestore: Introduce mdrestore v1 operations
In order to indicate the version of metadump files that they can work with,
this commit renames read_header(), show_info() and restore() functions to
read_header_v1(), show_info_v1() and restore_v1() respectively.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Chandan Babu R [Mon, 6 Nov 2023 13:10:50 +0000 (18:40 +0530)]
mdrestore: Replace metadump header pointer argument with a union pointer
We will need two variants of read_header(), show_info() and restore() helper
functions to support two versions of the metadump format. To this end, a future
commit will introduce a vector of function pointers to work with the two
metadump formats. To have a common function signature for the function
pointers, this commit replaces the first argument of the previously listed
functions from "struct xfs_metablock *" with "union
mdrestore_headers *".
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
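The pattern described above, a union of per-version headers passed to a per-version table of function pointers, can be sketched roughly as follows. All demo_* names, the magic values, and the header layouts are hypothetical stand-ins for illustration; they are not the actual mdrestore definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical magic values, not the real metadump magics. */
#define DEMO_MAGIC_V1 1u
#define DEMO_MAGIC_V2 2u

/* Hypothetical per-version headers; both start with the magic. */
struct demo_hdr_v1 { uint32_t magic; uint16_t count; };
struct demo_hdr_v2 { uint32_t magic; uint32_t version; };

/* One argument type that every per-version function can accept. */
union demo_headers {
	uint32_t		magic;
	struct demo_hdr_v1	v1;
	struct demo_hdr_v2	v2;
};

/* Vector of operations sharing a common signature. */
struct demo_ops {
	int (*show_info)(const union demo_headers *h, char *buf, size_t len);
};

static int show_info_v1(const union demo_headers *h, char *buf, size_t len)
{
	return snprintf(buf, len, "v1: %u blocks", (unsigned)h->v1.count);
}

static int show_info_v2(const union demo_headers *h, char *buf, size_t len)
{
	return snprintf(buf, len, "v2: version %u", (unsigned)h->v2.version);
}

static const struct demo_ops demo_ops_v1 = { .show_info = show_info_v1 };
static const struct demo_ops demo_ops_v2 = { .show_info = show_info_v2 };

/* Pick the operations vector from the magic; NULL if unrecognized. */
static const struct demo_ops *demo_pick_ops(const union demo_headers *h)
{
	switch (h->magic) {
	case DEMO_MAGIC_V1:
		return &demo_ops_v1;
	case DEMO_MAGIC_V2:
		return &demo_ops_v2;
	default:
		return NULL;
	}
}
```

Because every operation takes the same union pointer, the caller only needs to select the vector once and can then call ops->show_info() without knowing which format it is handling.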
Chandan Babu R [Mon, 6 Nov 2023 13:10:49 +0000 (18:40 +0530)]
mdrestore: Add open_device(), read_header() and show_info() functions
This commit moves functionality associated with opening the target device,
reading metadump header information and printing information about the
metadump into their respective functions. There are no functional changes made
by this commit.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Chandan Babu R [Mon, 6 Nov 2023 13:10:48 +0000 (18:40 +0530)]
mdrestore: Detect metadump v1 magic before reading the header
In order to support both v1 and v2 versions of metadump, mdrestore will have
to detect the format in which the metadump file has been stored on the disk
and then read the ondisk structures accordingly. In a step in that direction,
this commit splits the work of reading the metadump header from disk into two
parts:
1. Read the first 4 bytes containing the metadump magic code.
2. Read the remaining part of the header.
A future commit will take appropriate action based on the value of the magic
code.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
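The two-step read described in this commit can be illustrated with a small sketch. The magic value, the header layout, and the demo_* names below are hypothetical and chosen only for illustration; the real metadump header fields are not reproduced here.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical magic value ("DEM1"), not the real metadump magic. */
#define DEMO_MAGIC_V1 0x44454d31u

/* Hypothetical in-memory header; on disk all fields are big-endian. */
struct demo_header_v1 {
	uint32_t magic;
	uint16_t nblocks;
	uint16_t flags;
};

/* Decode a big-endian 32-bit value from a byte buffer. */
static uint32_t demo_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Step 1: read only the first 4 bytes and check the magic.
 * Step 2: read the remaining part of the header.
 * Returns 0 on success, -1 on a short read or an unknown magic.
 */
static int demo_read_header(FILE *fp, struct demo_header_v1 *hdr)
{
	unsigned char magic[4];
	unsigned char rest[4];

	if (fread(magic, sizeof(magic), 1, fp) != 1)
		return -1;
	hdr->magic = demo_be32(magic);
	if (hdr->magic != DEMO_MAGIC_V1)
		return -1;

	if (fread(rest, sizeof(rest), 1, fp) != 1)
		return -1;
	hdr->nblocks = (uint16_t)((rest[0] << 8) | rest[1]);
	hdr->flags = (uint16_t)((rest[2] << 8) | rest[3]);
	return 0;
}
```

Splitting the read this way means the magic check is the single point where a later change could branch to a different header layout, without re-reading or seeking back in the input.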