Ameer Hamza [Fri, 4 Oct 2024 16:57:44 +0000 (21:57 +0500)]
libblkid: zfs: Use nvlist for detection instead of Uber blocks
Currently, blkid relies on the presence of Uber blocks to detect ZFS
partition types. However, Uber blocks are not consistently dumped for
cache and spare vdevs, particularly in pools created prior to
https://github.com/openzfs/zfs/commit/d9885b3. Additionally, indirect
vdevs are incorrectly detected by blkid due to the presence of Uber
blocks in the label. ZFS itself does not depend on Uber blocks either
when reading ZFS labels; instead, it parses the nvlist.
This commit aligns blkid's approach with ZFS by parsing the nvlist in
the label to detect ZFS partition types, requiring at least one valid
label for successful detection. This change also ensures compatibility
with wipefs, as it now uses nvlist headers for offsets instead of the
Uber Magic offset. Consequently, running wipefs -a will zero out the
nvlist header in each label, fully removing the ZFS partition type and
making the pool unimportable. Previously, wipefs -a did not clear all
the Uber blocks or delete all nvlist headers, allowing pools to remain
importable even after wiping.
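
As a rough illustration of where a label's nvlist would be looked for, the
sketch below computes one offset per label. The layout constants (four
256 KiB labels, two at the start and two at the end of the device, with the
nvlist area 16 KiB into each label) are assumptions about the ZFS on-disk
format and are not stated in this commit; the names are hypothetical and
this is not the libblkid code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout constants, not taken from the commit message. */
#define DEMO_LABEL_SIZE		(256ULL * 1024)	/* size of one vdev label */
#define DEMO_NVLIST_OFFSET	(16ULL * 1024)	/* nvlist area inside a label */
#define DEMO_NLABELS		4

/*
 * Fill "offsets" with the byte offset of the nvlist in each of the four
 * labels of a device that is "devsize" bytes large.
 * Returns 0 on success, -1 if the device cannot hold four labels.
 */
static int demo_nvlist_offsets(uint64_t devsize, uint64_t offsets[DEMO_NLABELS])
{
	uint64_t aligned;

	assert(offsets != NULL);

	if (devsize < DEMO_NLABELS * DEMO_LABEL_SIZE)
		return -1;

	/* The trailing labels are placed relative to the label-aligned size. */
	aligned = devsize - (devsize % DEMO_LABEL_SIZE);

	offsets[0] = 0 * DEMO_LABEL_SIZE + DEMO_NVLIST_OFFSET;
	offsets[1] = 1 * DEMO_LABEL_SIZE + DEMO_NVLIST_OFFSET;
	offsets[2] = aligned - 2 * DEMO_LABEL_SIZE + DEMO_NVLIST_OFFSET;
	offsets[3] = aligned - 1 * DEMO_LABEL_SIZE + DEMO_NVLIST_OFFSET;
	return 0;
}
```

A prober following the description above would read the nvlist header at
each of these offsets and accept the device once at least one of them is
valid.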
Karel Zak [Tue, 5 Nov 2024 10:17:10 +0000 (11:17 +0100)]
Merge branch 'PR/nsenter-pidfd' of https://github.com/karelzak/util-linux-work
* 'PR/nsenter-pidfd' of https://github.com/karelzak/util-linux-work:
nsenter: Rewrite --user-parent to use pidfd
include/pidfd-utils: add namespaces ioctls
nsenter: reuse pidfd for --net-socket
nsenter: use macros to access the nsfiles array
nsenter: use pidfd to enter target namespaces
nsenter: use separate function to enter namespaces
nsenter: add functions to enable/disable namespaces
Karel Zak [Tue, 5 Nov 2024 10:10:31 +0000 (11:10 +0100)]
libsmartcols: make __attributes__ more portable
Let's use what is already used for libmount. The header file is a public
header and does not require support for the __attribute__() compiler
feature. We need an additional #ifdef to ensure portability.
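
A minimal sketch of such a guard, assuming a hypothetical macro name
(DEMO_FORMAT_ATTRIBUTE) and a hypothetical function; the real macro used by
libmount and libsmartcols is not reproduced here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical macro: expands to the printf-format attribute on
 * compilers that understand __attribute__(), and to nothing elsewhere,
 * so the public header still compiles everywhere.
 */
#ifndef DEMO_FORMAT_ATTRIBUTE
# if defined(__GNUC__) || defined(__clang__)
#  define DEMO_FORMAT_ATTRIBUTE(fmt, args) \
	__attribute__((__format__(__printf__, fmt, args)))
# else
#  define DEMO_FORMAT_ATTRIBUTE(fmt, args)
# endif
#endif

/* With the attribute, capable compilers type-check the format arguments. */
static int demo_format(char *buf, size_t bufsz, const char *fmt, ...)
	DEMO_FORMAT_ATTRIBUTE(3, 4);

static int demo_format(char *buf, size_t bufsz, const char *fmt, ...)
{
	va_list ap;
	int rc;

	assert(buf != NULL && fmt != NULL);

	va_start(ap, fmt);
	rc = vsnprintf(buf, bufsz, fmt, ap);
	va_end(ap);
	return rc;
}
```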
Karel Zak [Tue, 5 Nov 2024 10:08:26 +0000 (11:08 +0100)]
Merge branch 'smartcols-printf' of https://github.com/rjarry/util-linux
* 'smartcols-printf' of https://github.com/rjarry/util-linux:
treewide: use scols printf api where possible
libsmartcols: add printf api to fill in column data
Karel Zak [Tue, 5 Nov 2024 10:00:11 +0000 (11:00 +0100)]
Merge branch 'mkfds--minor-fixes' of https://github.com/masatake/util-linux
* 'mkfds--minor-fixes' of https://github.com/masatake/util-linux:
tests: (test_mkfds::make-regular-file) fix the default union member for \"readable\" parameter
test_mkfds: reserve file descriptors in the early stage of execution
Anjali K [Mon, 4 Nov 2024 06:32:26 +0000 (12:02 +0530)]
lscpu: fix incorrect number of sockets during hotplug
lscpu sometimes shows incorrect 'Socket(s)' value if a hotplug operation
is running.
On a 32 CPU 2-socket system, the expected output is as shown below:
    Architecture:           ppc64le
    Byte Order:             Little Endian
    CPU(s):                 32
    On-line CPU(s) list:    0-31
    Model name:             POWER10 (architected), altivec supported
    Model:                  2.0 (pvr 0080 0200)
    Thread(s) per core:     8
    Core(s) per socket:     2
    Socket(s):              2
On the same system, if a hotplug operation runs at the same time as lscpu,
lscpu sometimes incorrectly shows "Socket(s):" as 3 or 4:
    Architecture:           ppc64le
    Byte Order:             Little Endian
    CPU(s):                 32
    On-line CPU(s) list:    0-11,16-31
    Off-line CPU(s) list:   12-15
    Model name:             POWER10 (architected), altivec supported
    Model:                  2.0 (pvr 0080 0200)
    Thread(s) per core:     8
    Core(s) per socket:     1
    Socket(s):              3
The number of sockets is taken to be the number of unique core_siblings
CPU groups. The issues that sometimes make the number of sockets higher
during hotplug are:
1. The core_siblings of CPUs on the same socket differ because a CPU
on the socket has been onlined/offlined in between. In the example below,
nr sockets was wrongly incremented for CPU 5, even though CPUs 4 and 5 are
on the same socket, because their core_siblings differed after CPU 12 was
onlined in between.
CPU: 4
core_siblings: ff f0 0 0 0 0 0 0
CPU: 5
core_siblings: ff f8 0 0 0 0 0 0
2. The core_siblings file of a CPU is created when a CPU is onlined. It may
have an invalid value for some time until the online operation is fully
complete. In the below example, nr sockets is wrongly incremented because
the core_siblings of CPU 14 was 0 as it had just been onlined.
CPU: 14
core_siblings: 0 0 0 0 0 0 0 0
To fix this, make the below changes:
1. Instead of considering CPUs to be on different sockets if their
core_siblings masks are unequal, consider them to be on different sockets
only if their core_siblings masks don't have even one common CPU. Then CPUs
on the same socket will be correctly identified even if offline/online
operations happen while they are read if at least one CPU in the socket is
online during both reads.
2. Check if a CPU's hotplug operation has been completed before using its
core_siblings file.
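
The overlap rule from point 1 can be sketched as follows. This is an
illustration only, not the lscpu code: each core_siblings mask is
simplified to a single 64-bit word, count_sockets() is a hypothetical
helper, and skipping an all-zero mask merely stands in for the hotplug
completion check from point 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count sockets from per-CPU core_siblings masks. Two CPUs are put on
 * the same socket as soon as their masks share at least one CPU.
 */
static size_t count_sockets(const uint64_t *siblings, size_t ncpus)
{
	/* Each new group adds a bit no earlier group has, so 64 slots suffice. */
	uint64_t groups[64];
	size_t ngroups = 0;

	assert(siblings != NULL || ncpus == 0);

	for (size_t i = 0; i < ncpus; i++) {
		size_t g;

		/* Stand-in for "hotplug not finished yet": ignore empty masks. */
		if (siblings[i] == 0)
			continue;

		for (g = 0; g < ngroups; g++) {
			if (groups[g] & siblings[i]) {
				/* At least one common CPU: same socket. */
				groups[g] |= siblings[i];
				break;
			}
		}
		if (g == ngroups)
			groups[ngroups++] = siblings[i];
	}
	return ngroups;
}
```

With masks modelled on the examples above (0xfff0 and 0xfff8 for two CPUs
of one socket, 0 for a CPU that has just been onlined), an equality-based
comparison would see three different values, while the overlap rule sees
a single socket.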
[kzak@redhat.com: - use xmalloc(),
- use ul_strtos32(),
- use err() on CPU_ALLOC() error]
Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com> Signed-off-by: Anjali K <anjalik@linux.ibm.com> Signed-off-by: Karel Zak <kzak@redhat.com>
Robin Jarry [Thu, 31 Oct 2024 22:55:44 +0000 (23:55 +0100)]
treewide: use scols printf api where possible
Everywhere a string generated with xasprintf() is directly passed to
scols_line_refer_data(), use scols_line_sprintf() to remove the need for
an intermediate buffer.
Replace the (now redundant) private scols_line_asprintf() function.
Masatake YAMATO [Sat, 26 Oct 2024 17:05:45 +0000 (02:05 +0900)]
test_mkfds: reserve file descriptors in the early stage of execution
A factory specified on the command line opens some files. After
opening, the factory remaps the opened file descriptors (ofds) to the file
descriptors (rfds) specified on the command line with the dup2 system call.
This remapping may fail if there is an overlap between ofds and rfds.
With this change, there cannot be an overlap between ofds and rfds;
test_mkfds reserves rfds in the early stage of execution.
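
One way to picture the reservation, as a sketch under assumptions:
reserve_fd() is a hypothetical helper that parks a placeholder on the
requested descriptor number, so that later open() calls can never return
that number. It is not the test_mkfds implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Occupy descriptor number "rfd" with a placeholder. Anything already
 * open on "rfd" is replaced, so this is meant to run early.
 * Returns 0 on success, -1 on error.
 */
static int reserve_fd(int rfd)
{
	int fd;

	assert(rfd >= 0);

	fd = open("/dev/null", O_RDONLY);
	if (fd < 0)
		return -1;
	if (fd == rfd)
		return 0;	/* already sitting on the wanted number */
	if (dup2(fd, rfd) < 0) {
		close(fd);
		return -1;
	}
	close(fd);
	return 0;
}
```

After reserve_fd(rfd), a file opened by the factory gets some other
number (ofd), and the final dup2(ofd, rfd) simply replaces the
placeholder.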
Karel Zak [Thu, 31 Oct 2024 10:21:20 +0000 (11:21 +0100)]
hardlink: implement --mount
Let's export another feature of nftw() to the hardlink command line.
In this case, we will force the file-tree-walk to stay within the same
filesystem.
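
A minimal sketch of the nftw() feature in question, FTW_MOUNT. The helper
and callback names are hypothetical and this is not the hardlink code.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ftw.h>
#include <stddef.h>
#include <sys/stat.h>

static size_t nregular;	/* nftw() callbacks have no user-data argument */

static int count_regular(const char *path, const struct stat *st,
			 int type, struct FTW *ftw)
{
	(void) path;
	(void) st;
	(void) ftw;

	if (type == FTW_F)
		nregular++;
	return 0;	/* keep walking */
}

/* Count files under "root" without crossing into other filesystems. */
static int count_files_one_fs(const char *root, size_t *count)
{
	int rc;

	assert(root != NULL && count != NULL);

	nregular = 0;
	/* FTW_MOUNT: stay on root's filesystem; FTW_PHYS: do not follow symlinks */
	rc = nftw(root, count_regular, 16, FTW_MOUNT | FTW_PHYS);
	*count = nregular;
	return rc;
}
```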
Addresses: https://github.com/util-linux/util-linux/discussions/3244 Signed-off-by: Karel Zak <kzak@redhat.com>
Karel Zak [Thu, 31 Oct 2024 09:51:11 +0000 (10:51 +0100)]
hardlink: implement --exclude-subtree
Currently, hardlink can exclude files by their names, but it cannot
ignore entire subtrees of the scanned hierarchy. The new option only
applies to directory names and forces the file-tree-walk to skip the
directory and all of its subdirectories.
This is based on FTW_SKIP_SUBTREE, which was originally only available
in glibc (since 2004). Therefore, the code is #ifdef-ed to make it
portable to other libc versions.
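
The #ifdef approach can be sketched like this. The names walk_excluding()
and visit() are hypothetical, the match is a plain string comparison on
the directory name, and on a libc without FTW_SKIP_SUBTREE the sketch
simply walks everything; the real option handling in hardlink may differ.

```c
#define _GNU_SOURCE	/* FTW_ACTIONRETVAL and FTW_SKIP_SUBTREE are extensions */
#include <assert.h>
#include <ftw.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

static const char *excluded_name;	/* directory name to skip */
static size_t nvisited;

static int visit(const char *path, const struct stat *st,
		 int type, struct FTW *ftw)
{
	(void) st;
#ifdef FTW_SKIP_SUBTREE
	/* ftw->base is the offset of the last path component */
	if (type == FTW_D && ftw->level > 0 &&
	    strcmp(path + ftw->base, excluded_name) == 0)
		return FTW_SKIP_SUBTREE;	/* do not descend into it */
#else
	(void) path;
	(void) type;
	(void) ftw;
#endif
	nvisited++;
	return 0;	/* same value as FTW_CONTINUE */
}

/* Walk "root", skipping every directory called "name" where supported. */
static int walk_excluding(const char *root, const char *name, size_t *visited)
{
	int flags = FTW_PHYS, rc;

	assert(root != NULL && name != NULL && visited != NULL);
#ifdef FTW_ACTIONRETVAL
	flags |= FTW_ACTIONRETVAL;	/* callback return value steers the walk */
#endif
	excluded_name = name;
	nvisited = 0;
	rc = nftw(root, visit, 16, flags);
	*visited = nvisited;
	return rc;
}
```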
Addresses: https://github.com/util-linux/util-linux/discussions/3244 Signed-off-by: Karel Zak <kzak@redhat.com>
наб [Mon, 28 Oct 2024 18:19:34 +0000 (19:19 +0100)]
hardlink: fix 0-sized file processing
The manual says that -s0 will process 0-sized files normally,
but as it stands (a) hardlink considers 0-sized files unlinkable
(so, with -l, unlistable) and (b) fileeq considers reading an empty
prologue to be an error.
наб [Mon, 28 Oct 2024 18:19:30 +0000 (19:19 +0100)]
hardlink: add --list-duplicates and --zero
--list-duplicates codifies what everyone keeps re-implementing with
find -exec b2sum or src:perforate's finddup or whatever.
hardlink already knows this, so make the data available thusly,
in a format well-suited for pipeline processing
(fixed-width key for uniq/cut/&c.,
tab delimiter for cut &a.,
-z for correct filename handling).
Karel Zak [Wed, 30 Oct 2024 10:14:19 +0000 (11:14 +0100)]
Merge branch 'lsfd--bpf-prog-id-and-tag' of https://github.com/masatake/util-linux
* 'lsfd--bpf-prog-id-and-tag' of https://github.com/masatake/util-linux:
tests: (lsfd::mkfds-bpf-prog) verify BPF-PROG.{ID,TAG} column
tests: (test_mkfds::bpf-prog) report id and tag
lsfd: add BPF-PROG.TAG column
lsfd: update bpf related tables
lsfd: (bugfix) fix wrong type usage in anon_bpf_map_fill_column
test_mkfds: (bugfix) listing ALL output values for a given factory
Masatake YAMATO [Mon, 14 Oct 2024 08:39:15 +0000 (17:39 +0900)]
lsfd: (bugfix) fix wrong type usage in anon_bpf_map_fill_column
anon_bpf_prog_data was used where anon_bpf_map_data should have been.
Fortunately, this has not caused real trouble, because anon_bpf_map_data
and anon_bpf_prog_data have the same member layout.
Robin Jarry [Mon, 6 May 2024 21:45:21 +0000 (23:45 +0200)]
text-utils: add bits command
Add a new text utility to convert bit masks in various formats.
This can be handy to avoid parsing affinity masks in one's head and/or
to interact with the kernel in a more human friendly way. It is
a rewrite in C of the bits command from my linux-tools python package so
that it can be more widely available.
Karel Zak [Mon, 21 Oct 2024 11:09:43 +0000 (13:09 +0200)]
nsenter: Rewrite --user-parent to use pidfd
The latest kernel pidfd supports ioctls to ask for the target's
namespaces. It seems we can use it for --user-parent if no user
namespace is explicitly specified. The fallback is to use any other
namespace or open the target's /proc/<pid>/ns/user file directly.
Karel Zak [Fri, 18 Oct 2024 10:16:04 +0000 (12:16 +0200)]
nsenter: use pidfd to enter target namespaces
The typical use case is to enter namespaces of the task (--target
<pid>). The original nsenter opens /proc/<pid>/ns/* files and uses the
file descriptors to enter the namespaces by setns(). The recent kernel
allows using the pid file descriptor instead of the files in /proc,
making it possible to enter multiple namespaces with one setns call.
This solution reduces the number of syscalls (open+setns for each
namespace), removes the dependence on /proc, and allows entering
nested namespaces.
This commit should be backward compatible, meaning it can be used on
systems without pidfd_open(). Namespaces explicitly specified by
filenames are still supported, and user namespaces are still entered
first or last, depending on whether the switch gains or drops privileges.
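
A minimal sketch of the pidfd path, assuming a kernel where setns()
accepts a pidfd. The helper names are hypothetical and this is not the
nsenter code; a caller would fall back to the /proc/<pid>/ns/* files when
the helpers fail.

```c
#define _GNU_SOURCE	/* setns(), CLONE_NEW* */
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a pidfd for "pid"; fails with ENOSYS where pidfd_open() is unknown. */
static int open_pidfd(pid_t pid)
{
	assert(pid > 0);
#ifdef SYS_pidfd_open
	return (int) syscall(SYS_pidfd_open, pid, 0);
#else
	(void) pid;
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Enter several namespaces of the target with one setns() call.
 * "nstypes" is a mask such as CLONE_NEWNET | CLONE_NEWUTS.
 * Returns 0 on success, -1 with errno set otherwise.
 */
static int enter_namespaces(pid_t pid, int nstypes)
{
	int pidfd, rc, saved;

	pidfd = open_pidfd(pid);
	if (pidfd < 0)
		return -1;

	rc = setns(pidfd, nstypes);
	saved = errno;
	close(pidfd);
	errno = saved;	/* report the setns() error, not the close() one */
	return rc;
}
```

For example, enter_namespaces(pid, CLONE_NEWNET | CLONE_NEWUTS) replaces
two open()+setns() pairs on files in /proc with one pidfd and one setns().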
Addresses: https://github.com/util-linux/util-linux/pull/301 Signed-off-by: Karel Zak <kzak@redhat.com>
Karel Zak [Thu, 17 Oct 2024 09:14:49 +0000 (11:14 +0200)]
nsenter: add functions to enable/disable namespaces
Currently, enabled namespaces are those with an open file descriptor.
However, once pidfd is supported, per-namespace file descriptors will no
longer be necessary, and we will need an FD-independent enable/disable
mechanism.
It also makes sense to delay opening the --target <pid> namespace files
until everything is ready and only handle it in one place.
Karel Zak [Mon, 14 Oct 2024 09:45:32 +0000 (11:45 +0200)]
libfdisk: make sure libblkid uses the same sector size
Libfdisk uses libblkid to check for filesystems on the device. It
makes sense for both libraries to share the logical sector size
setting, as this setting can be modified by using the fdisk command
line.
We do not see this as an issue, as filesystem detection rarely depends
on sector size (with the exception of some RAIDs). Additionally,
libblkid is usually intelligent enough to check multiple locations
independently of the current device's sector size setting.
Addresses: https://github.com/util-linux/util-linux/pull/3235 Signed-off-by: Karel Zak <kzak@redhat.com>
Maks Mishin [Thu, 10 Oct 2024 17:23:49 +0000 (20:23 +0300)]
sys-utils: (setpriv): fix potential memory leak
The dynamic memory referenced by 'buf' is allocated by calling xstrdup(),
and 'buf' is then changed by calling strsep().
The free(buf) call is incorrect if buf != NULL, because buf then points to
some place inside or outside the source string.
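
The usual way around this is to let strsep() advance a separate cursor and
keep the original pointer for free(). A minimal sketch, using strdup()
instead of xstrdup() and a hypothetical count_items() helper rather than
the setpriv code:

```c
#define _GNU_SOURCE	/* strsep(), strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count non-empty comma-separated items; returns -1 if allocation fails. */
static int count_items(const char *list)
{
	char *buf, *cursor, *item;
	int n = 0;

	assert(list != NULL);

	buf = strdup(list);	/* strsep() modifies its input: work on a copy */
	if (!buf)
		return -1;

	cursor = buf;		/* strsep() moves "cursor"; "buf" stays put */
	while ((item = strsep(&cursor, ",")) != NULL) {
		if (*item)
			n++;
	}

	free(buf);		/* free the allocation, never the moved cursor */
	return n;
}
```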
Karel Zak [Mon, 7 Oct 2024 11:27:43 +0000 (13:27 +0200)]
Merge branch 'sock-netns-with-tests' of https://github.com/masatake/util-linux
* 'sock-netns-with-tests' of https://github.com/masatake/util-linux:
tests: (lsfd) verify SOCK.NETID and ENDPOINTS for sockets made in another netns
tests: (lsns) verify the code finding an isolated netns via socket
tests: (nsenter) verify the code entering the network ns via socket made in the ns
tests: (test_sysinfo) add a helper to detect NS_GET_USERNS
tests: (test_mkfds::foreign-sockets) new factory
tests: (test_mkfds, refactor) use xmemdup newly added in xalloc.h
xalloc.h: add xmemdup
tests: (test_mkfds) fix a typo in an option name
test_mkfds: (cosmetic) remove whitespaces between a function and its arguments
Karel Zak [Mon, 7 Oct 2024 08:22:07 +0000 (10:22 +0200)]
Merge branch 'lsfd--minor-fixes' of https://github.com/masatake/util-linux
* 'lsfd--minor-fixes' of https://github.com/masatake/util-linux:
lsfd: avoid accessing an uninitialized value
lsfd: finalize abst_class
lsfd,test_mkfds: (refactor) specify the variable itself as an operand of sizeof
tests: (test_mkfds) add a missing word in a comment
The exFAT specification lists valid value ranges for the superblock
fields. Validate the fields interpreted by the libblkid prober to avoid
undefined behaviour.
Karel Zak [Wed, 2 Oct 2024 08:06:10 +0000 (10:06 +0200)]
Merge branch 'PR/libmount-xnocanon' of https://github.com/karelzak/util-linux-work
* 'PR/libmount-xnocanon' of https://github.com/karelzak/util-linux-work:
mount: (man) add note about symlink over symlink
tests: add X-mount.nocanonicalize tests
libmount: support bind symlink over symlink
libmount: add X-mount.nocanonicalize[=source|target]
Karel Zak [Tue, 1 Oct 2024 11:56:52 +0000 (13:56 +0200)]
Merge branch 'test_mkfds-dont-free-and-close-when-exit-with-error' of https://github.com/masatake/util-linux
* 'test_mkfds-dont-free-and-close-when-exit-with-error' of https://github.com/masatake/util-linux:
tests: (test_mkfds) don't close fds and free memory objects when exiting with EXIT_FAILURE
tests: (test_mkfds,refactor) simplify nested if conditions
tests: (test_mkfds) save errno before calling system calls for clean-up
tests: (test_mkfds, cosmetic) add an empty line before the definition of struct sysvshm_data
Karel Zak [Thu, 26 Sep 2024 12:44:36 +0000 (14:44 +0200)]
libmount: support bind symlink over symlink
The new mount API allows for the use of AT_SYMLINK_NOFOLLOW when
opening a mount tree (aka the "mount source" for libmount).
As a result, you can now replace one symlink with another by using a
bind mount.
By default, the mount(8) command follows symlinks and canonicalizes
all paths. However, with the X-mount.nocanonicalize=source option, it
is possible to open the symlink itself. Similarly, with the
X-mount.nocanonicalize=target option, the path of the mount point can
be kept as the original symlink. (Using X-mount.nocanonicalize without
any argument works for both the "source" and "target".)
Example:
# file /mnt/test/symlinkA /mnt/test/symlinkB
/mnt/test/symlinkA: symbolic link to /mnt/test/fileA
/mnt/test/symlinkB: symbolic link to /mnt/test/fileB
The result is that 'symlinkB' is still a symlink, but it now points to
a different file.
This commit also modifies umount(8) because it does not work with
symlinks by default. The solution is to call umount2(UMOUNT_NOFOLLOW)
for symlinks after a failed regular umount(). For example:
Thomas Weißschuh [Wed, 25 Sep 2024 06:12:45 +0000 (08:12 +0200)]
login-utils/su-common: Validate all return values again
The additional code added in commit d6564701e812 ("login-utils/su-common: Check that the user didn't change during PAM transaction")
was inserted in between the assignment and tests of "rc",
making the return value unchecked.
Add a new explicit check.
Thomas Weißschuh [Wed, 25 Sep 2024 06:09:29 +0000 (08:09 +0200)]
meson: test for pidfd_getfd()
Commit 55c7120accab ("nsenter: Provide an option to join target process's socket net namespace")
added stubs for pidfd_getfd() but didn't add the code for meson to check
if the function is already available.
Karel Zak [Tue, 24 Sep 2024 11:37:13 +0000 (13:37 +0200)]
libfdisk: (dos) ignore incomplete EBR for non-wholedisk
The logical partitions are defined by a chain of extended partitions,
with the beginning of the chain located on the whole disk device.
If a user runs "fdisk --list /dev/sda4", libfdisk cannot calculate proper
offsets for the items in the chain, resulting in the following error
message:
Failed to read extended partition table (offset=22528): Invalid argument
This error message may confuse users and is unnecessary when fdisk is
used in list-only mode (--list option). It would be sufficient to only
print the content of the partition without the error message and not
continue to the next item in the chain.
However, in write mode (without --list), the error message will still
be displayed as it is potentially dangerous to edit the EBR table.
Addresses: https://issues.redhat.com/browse/RHEL-59867 Signed-off-by: Karel Zak <kzak@redhat.com>
The new kernel mount API can bind-mount over a symlink. However, this
feature does not work with libmount because it canonicalizes all paths
by default. A possible workaround is to use the --no-canonicalize
option on the mount(8) command line, but this is a heavy-handed
solution as it disables all conversions for all paths and tags (such
as LABEL=) and fstab processing.
This commit introduces the X-mount.nocanonicalize userspace mount
option to control canonicalization. It only affects paths used for
mounting and does not affect tags and searching in fstab. Additionally,
this setting can be used in fstab.
If the optional argument [=source|target] is not specified, then path
canonicalization is disabled for both the source and target paths.
Addresses: https://github.com/util-linux/util-linux/issues/2370 Signed-off-by: Karel Zak <kzak@redhat.com>
Karel Zak [Tue, 24 Sep 2024 10:31:39 +0000 (12:31 +0200)]
Merge branch 'sock-netns' of https://github.com/0x7f454c46/util-linux
* 'sock-netns' of https://github.com/0x7f454c46/util-linux:
lsns: List network namespaces that are held by a socket
lsfd: Gather information on target socket's net namespace
nsenter: Provide an option to join target process's socket net namespace
Michal Suchanek [Tue, 24 Sep 2024 07:19:39 +0000 (09:19 +0200)]
partx: Fix example in man page
The example is:
partx -d --nr :-1 /dev/sdd
Removes the last partition on _/dev/sdd_.
The documentation says:
    M:
        Specifies the lower limit only (e.g. --nr 2:).
    :N
        Specifies the upper limit only (e.g. --nr :4).
In the above example the lower limit is not set and the upper is set to
the last partition, meaning all partitions. The lower limit should be
set instead.
nsenter: Provide an option to join target process's socket net namespace
The network namespace of a socket can be different from the target
process. Previously there were some userspace issues where a
net-namespace was held alive by a socket leak. For this purpose, Arista's
Linux kernel has a patch to provide a socket => netns map via procfs pid/fd
directory links.
Add nsenter option to join the network namespace of a target process'
socket.