]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
12 days agoxfs: allow setting errortags at mount time
Christoph Hellwig [Fri, 30 Jan 2026 05:19:21 +0000 (06:19 +0100)] 
xfs: allow setting errortags at mount time

Add an errortag mount option that enables an errortag with the default
injection frequency.  This allows injecting errors into the mount
process instead of just on live file systems, and thus test mount
error handling.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
12 days agoxfs: use WRITE_ONCE/READ_ONCE for m_errortag
Christoph Hellwig [Fri, 30 Jan 2026 05:19:20 +0000 (06:19 +0100)] 
xfs: use WRITE_ONCE/READ_ONCE for m_errortag

There is no synchronization for updating m_errortag, which is fine as
it's just a debug tool.  It would still be nice to fully avoid the
theoretical case of torn values, so use WRITE_ONCE and READ_ONCE to
access the members.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
12 days agoxfs: move the guts of XFS_ERRORTAG_DELAY out of line
Christoph Hellwig [Fri, 30 Jan 2026 05:19:19 +0000 (06:19 +0100)] 
xfs: move the guts of XFS_ERRORTAG_DELAY out of line

Mirror what is done for the more common XFS_ERRORTAG_TEST version,
and also only look at the error tag value once now that we can
easily have a local variable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
12 days agoxfs: don't validate error tags in the I/O path
Christoph Hellwig [Fri, 30 Jan 2026 05:19:18 +0000 (06:19 +0100)] 
xfs: don't validate error tags in the I/O path

We can trust XFS developers enough to not pass random stuff to
XFS_ERROR_TEST/DELAY.  Open code the validity check in xfs_errortag_add,
which is the only place that receives unvalidated error tag values from
user space, and drop the now pointless xfs_errortag_enabled helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
12 days agoxfs: allocate m_errortag early
Christoph Hellwig [Fri, 30 Jan 2026 05:19:17 +0000 (06:19 +0100)] 
xfs: allocate m_errortag early

Ensure the mount structure always has a valid m_errortag for debug
builds.  This removes the NULL checking from the runtime code, and
prepares for allowing to set errortags from mount.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
12 days agoxfs: fix the errno sign for the xfs_errortag_{add,clearall} stubs
Christoph Hellwig [Fri, 30 Jan 2026 05:19:16 +0000 (06:19 +0100)] 
xfs: fix the errno sign for the xfs_errortag_{add,clearall} stubs

All errno values should be negative in the kernel.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
12 days agoxfs: validate log record version against superblock log version
Raphael Pinsonneault-Thibeault [Thu, 29 Jan 2026 18:50:21 +0000 (13:50 -0500)] 
xfs: validate log record version against superblock log version

Syzbot creates a fuzzed record where xfs_has_logv2() but the
xlog_rec_header h_version != XLOG_VERSION_2. This causes a
KASAN: slab-out-of-bounds read in xlog_do_recovery_pass() ->
xlog_recover_process() -> xlog_cksum().

Fix by adding a check to xlog_valid_rec_header() to abort journal
recovery if the xlog_rec_header h_version does not match the super
block log version.

A file system with a version 2 log will only ever set
XLOG_VERSION_2 in its headers (and v1 will only ever set V_1), so if
there is any mismatch, either the journal or the superblock has been
corrupted and therefore we abort processing with a -EFSCORRUPTED error
immediately.

Also, refactor the structure of the validity checks for better
readability. At the default error level (LOW), XFS_IS_CORRUPT() emits
the condition that failed, the file and line number it is
located at, then dumps the stack. This gives us everything we need
to know about the failure if we do a single validity check per
XFS_IS_CORRUPT().

Reported-by: syzbot+9f6d080dece587cfdd4c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9f6d080dece587cfdd4c
Tested-by: syzbot+9f6d080dece587cfdd4c@syzkaller.appspotmail.com
Fixes: 45cf976008dd ("xfs: fix log recovery buffer allocation for the legacy h_size fixup")
Signed-off-by: Raphael Pinsonneault-Thibeault <rpthibeault@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
13 days agoxfs: fix spacing style issues in xfs_alloc.c
Shin Seong-jun [Fri, 23 Jan 2026 15:04:32 +0000 (00:04 +0900)] 
xfs: fix spacing style issues in xfs_alloc.c

Fix checkpatch.pl errors regarding missing spaces around assignment
operators in xfs_alloc_compute_diff() and xfs_alloc_fixup_trees().

Adhere to the Linux kernel coding style by ensuring spaces are placed
around the assignment operator '='.

Signed-off-by: Shin Seong-jun <shinsj4653@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
13 days agoxfs: remove xfs_zone_gc_space_available
Christoph Hellwig [Tue, 27 Jan 2026 15:10:21 +0000 (16:10 +0100)] 
xfs: remove xfs_zone_gc_space_available

xfs_zone_gc_space_available only has one caller left, so fold it into
that.  Reorder the checks so that the cheaper scratch_available check
is done first.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
13 days agoxfs: use a seprate member to track space availabe in the GC scatch buffer
Christoph Hellwig [Tue, 27 Jan 2026 15:10:20 +0000 (16:10 +0100)] 
xfs: use a seprate member to track space availabe in the GC scatch buffer

When scratch_head wraps back to 0 and scratch_tail is also 0 because no
I/O has completed yet, the ring buffer could be mistaken for empty.

Fix this by introducing a separate scratch_available member in
struct xfs_zone_gc_data.  This actually ends up simplifying the code as
well.

Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoMerge tag 'scrub-syzbot-fixes-7.0_2026-01-25' of https://git.kernel.org/pub/scm/linux...
Carlos Maiolino [Wed, 28 Jan 2026 09:22:09 +0000 (10:22 +0100)] 
Merge tag 'scrub-syzbot-fixes-7.0_2026-01-25' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-7.0-merge

xfs: syzbot fixes for online fsck [3/3]

Fix various syzbot complaints about scrub that Jiaming Zhang found.

With a bit of luck, this should all go splendidly.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoMerge tag 'attr-pptr-speedup-7.0_2026-01-25' of https://git.kernel.org/pub/scm/linux...
Carlos Maiolino [Wed, 28 Jan 2026 09:16:12 +0000 (10:16 +0100)] 
Merge tag 'attr-pptr-speedup-7.0_2026-01-25' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-7.0-merge

xfs: improve shortform attr performance [2/3]

Improve performance of the xattr (and parent pointer) code when the
attr structure is in short format and we can therefore perform all
updates in a single transaction.  Avoiding the attr intent code brings
a very nice speedup in those operations.

With a bit of luck, this should all go splendidly.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoMerge tag 'attr-leaf-freemap-fixes-7.0_2026-01-25' of https://git.kernel.org/pub...
Carlos Maiolino [Wed, 28 Jan 2026 09:15:07 +0000 (10:15 +0100)] 
Merge tag 'attr-leaf-freemap-fixes-7.0_2026-01-25' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-7.0-merge

xfs: fix problems in the attr leaf freemap code [1/3]

Running generic/753 for hours revealed data corruption problems in the
attr leaf block space management code.  Under certain circumstances,
freemap entries are left with zero size but a nonzero offset.  If that
offset happens to be the same offset as the end of the entries array
during an attr set operation, the leaf entry table expansion will push
the freemap record offset upwards without checking for overlap with any
other freemap entries.  If there happened to be a second freemap entry
overlapping with the newly allocated leaf entry space, then the next
attr set operation might find that space and overwrite the leaf entry,
thereby corrupting the leaf block.

Fix this by zeroing the freemap offset any time we set the size to zero.
If a subsequent attr set operation finds no space in the freemap, it
will compact the block and regenerate the freemaps.

With a bit of luck, this should all go splendidly.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoMerge tag 'health-monitoring-7.0_2026-01-20' of https://git.kernel.org/pub/scm/linux...
Carlos Maiolino [Wed, 28 Jan 2026 08:47:42 +0000 (09:47 +0100)] 
Merge tag 'health-monitoring-7.0_2026-01-20' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-7.0-merge

xfs: autonomous self healing of filesystems [v7]

This patchset builds new functionality to deliver live information about
filesystem health events to userspace.  This is done by creating an
anonymous file that can be read() for events by userspace programs.
Events are captured by hooking various parts of XFS and iomap so that
metadata health failures, file I/O errors, and major changes in
filesystem state (unmounts, shutdowns, etc.) can be observed by
programs.

When an event occurs, the hook functions queue an event object to each
event anonfd for later processing.  Programs must have CAP_SYS_ADMIN
to open the anonfd and there's a maximum event lag to prevent resource
overconsumption.  The events themselves can be read() from the anonfd
as C structs for the xfs_healer daemon.

In userspace, we create a new daemon program that will read the event
objects and initiate repairs automatically.  This daemon is managed
entirely by systemd and will not block unmounting of the filesystem
unless repairs are ongoing.  They are auto-started by a starter
service that uses fanotify.

This patchset depends on the new fserror code that Christian Brauner
has tentatively accepted for Linux 7.0:
https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=vfs-7.0.fserror

v7: more cleanups of the media verification ioctl, improve comments, and
    reuse the bio
v6: fix pi-breaking bugs, make verify failures trigger health reports
    and filter bio status flags better
v5: add verify-media ioctl, collapse small helper funcs with only
    one caller
v4: drop multiple client support so we can make direct calls into
    healthmon instead of chasing pointers and doing indirect calls
v3: drag out of rfc status

With a bit of luck, this should all go splendidly.

Conflicts:
This merge required an update on files:
- fs/xfs/xfs_healthmon.c
- fs/xfs/xfs_verify_media.c
Such change was required because a parallel developement changed
XFS header file xfs.h naming to xfs_platform.h, so the merge
required to update those includes in both files above

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoxfs: check for deleted cursors when revalidating two btrees
Darrick J. Wong [Fri, 23 Jan 2026 17:27:40 +0000 (09:27 -0800)] 
xfs: check for deleted cursors when revalidating two btrees

The free space and inode btree repair functions will rebuild both btrees
at the same time, after which it needs to evaluate both btrees to
confirm that the corruptions are gone.

However, Jiaming Zhang ran syzbot and produced a crash in the second
xchk_allocbt call.  His root-cause analysis is as follows (with minor
corrections):

 In xrep_revalidate_allocbt(), xchk_allocbt() is called twice (first
 for BNOBT, second for CNTBT). The cause of this issue is that the
 first call nullified the cursor required by the second call.

 Let's first enter xrep_revalidate_allocbt() via following call chain:

 xfs_file_ioctl() ->
 xfs_ioc_scrubv_metadata() ->
 xfs_scrub_metadata() ->
 `sc->ops->repair_eval(sc)` ->
 xrep_revalidate_allocbt()

 xchk_allocbt() is called twice in this function. In the first call:

 /* Note that sc->sm->sm_type is XFS_SCRUB_TYPE_BNOPT now */
 xchk_allocbt() ->
 xchk_btree() ->
 `bs->scrub_rec(bs, recp)` ->
 xchk_allocbt_rec() ->
 xchk_allocbt_xref() ->
 xchk_allocbt_xref_other()

 since sm_type is XFS_SCRUB_TYPE_BNOBT, pur is set to &sc->sa.cnt_cur.
 Kernel called xfs_alloc_get_rec() and returned -EFSCORRUPTED. Call
 chain:

 xfs_alloc_get_rec() ->
 xfs_btree_get_rec() ->
 xfs_btree_check_block() ->
 (XFS_IS_CORRUPT || XFS_TEST_ERROR), the former is false and the latter
 is true, return -EFSCORRUPTED. This should be caused by
 ioctl$XFS_IOC_ERROR_INJECTION I guess.

 Back to xchk_allocbt_xref_other(), after receiving -EFSCORRUPTED from
 xfs_alloc_get_rec(), kernel called xchk_should_check_xref(). In this
 function, *curpp (points to sc->sa.cnt_cur) is nullified.

 Back to xrep_revalidate_allocbt(), since sc->sa.cnt_cur has been
 nullified, it then triggered null-ptr-deref via xchk_allocbt() (second
 call) -> xchk_btree().

So.  The bnobt revalidation failed on a cross-reference attempt, so we
deleted the cntbt cursor, and then crashed when we tried to revalidate
the cntbt.  Therefore, check for a null cntbt cursor before that
revalidation, and mark the repair incomplete.  Also we can ignore the
second tree entirely if the first tree was rebuilt but is already
corrupt.

Apply the same fix to xrep_revalidate_iallocbt because it has the same
problem.

Cc: r772577952@gmail.com
Link: https://lore.kernel.org/linux-xfs/CANypQFYU5rRPkTy=iG5m1Lp4RWasSgrHXAh3p8YJojxV0X15dQ@mail.gmail.com/T/#m520c7835fad637eccf843c7936c200589427cc7e
Cc: <stable@vger.kernel.org> # v6.8
Fixes: dbfbf3bdf639a2 ("xfs: repair inode btrees")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jiaming Zhang <r772577952@gmail.com>
2 weeks agoxfs: fix UAF in xchk_btree_check_block_owner
Darrick J. Wong [Fri, 23 Jan 2026 17:27:39 +0000 (09:27 -0800)] 
xfs: fix UAF in xchk_btree_check_block_owner

We cannot dereference bs->cur when trying to determine if bs->cur
aliases bs->sc->sa.{bno,rmap}_cur after the latter has been freed.
Fix this by sampling before type before any freeing could happen.
The correct temporal ordering was broken when we removed xfs_btnum_t.

Cc: r772577952@gmail.com
Cc: <stable@vger.kernel.org> # v6.9
Fixes: ec793e690f801d ("xfs: remove xfs_btnum_t")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jiaming Zhang <r772577952@gmail.com>
2 weeks agoxfs: check return value of xchk_scrub_create_subord
Darrick J. Wong [Fri, 23 Jan 2026 17:27:38 +0000 (09:27 -0800)] 
xfs: check return value of xchk_scrub_create_subord

Fix this function to return NULL instead of a mangled ENOMEM, then fix
the callers to actually check for a null pointer and return ENOMEM.
Most of the corrections here are for code merged between 6.2 and 6.10.

Cc: r772577952@gmail.com
Cc: <stable@vger.kernel.org> # v6.12
Fixes: 1a5f6e08d4e379 ("xfs: create subordinate scrub contexts for xchk_metadata_inode_subtype")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jiaming Zhang <r772577952@gmail.com>
2 weeks agoxfs: only call xf{array,blob}_destroy if we have a valid pointer
Darrick J. Wong [Fri, 23 Jan 2026 17:27:37 +0000 (09:27 -0800)] 
xfs: only call xf{array,blob}_destroy if we have a valid pointer

Only call the xfarray and xfblob destructor if we have a valid pointer,
and be sure to null out that pointer afterwards.  Note that this patch
fixes a large number of commits, most of which were merged between 6.9
and 6.10.

Cc: r772577952@gmail.com
Cc: <stable@vger.kernel.org> # v6.12
Fixes: ab97f4b1c03075 ("xfs: repair AGI unlinked inode bucket lists")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jiaming Zhang <r772577952@gmail.com>
2 weeks agoxfs: get rid of the xchk_xfile_*_descr calls
Darrick J. Wong [Fri, 23 Jan 2026 17:27:37 +0000 (09:27 -0800)] 
xfs: get rid of the xchk_xfile_*_descr calls

The xchk_xfile_*_descr macros call kasprintf, which can fail to allocate
memory if the formatted string is larger than 16 bytes (or whatever the
nofail guarantees are nowadays).  Some of them could easily exceed that,
and Jiaming Zhang found a few places where that can happen with syzbot.

The descriptions are debugging aids and aren't required to be unique, so
let's just pass in static strings and eliminate this path to failure.
Note this patch touches a number of commits, most of which were merged
between 6.6 and 6.14.

Cc: r772577952@gmail.com
Cc: <stable@vger.kernel.org> # v6.12
Fixes: ab97f4b1c03075 ("xfs: repair AGI unlinked inode bucket lists")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jiaming Zhang <r772577952@gmail.com>
2 weeks agoxfs: add a method to replace shortform attrs
Darrick J. Wong [Fri, 23 Jan 2026 17:27:36 +0000 (09:27 -0800)] 
xfs: add a method to replace shortform attrs

If we're trying to replace an xattr in a shortform attr structure and
the old entry fits the new entry, we can just memcpy and exit without
having to delete, compact, and re-add the entry (or worse use the attr
intent machinery).  For parent pointers this only advantages renaming
where the filename length stays the same (e.g. mv autoexec.bat
scandisk.exe) but for regular xattrs it might be useful for updating
security labels and the like.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 weeks agoxfs: speed up parent pointer operations when possible
Darrick J. Wong [Fri, 23 Jan 2026 17:27:35 +0000 (09:27 -0800)] 
xfs: speed up parent pointer operations when possible

After a recent fsmark benchmarking run, I observed that the overhead of
parent pointers on file creation and deletion can be a bit high.  On a
machine with 20 CPUs, 128G of memory, and an NVME SSD capable of pushing
750000iops, I see the following results:

 $ mkfs.xfs -f -l logdev=/dev/nvme1n1,size=1g /dev/nvme0n1 -n parent=0
 meta-data=/dev/nvme0n1           isize=512    agcount=40, agsize=9767586 blks
          =                       sectsz=4096  attr=2, projid32bit=1
          =                       crc=1        finobt=1, sparse=1, rmapbt=1
          =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
          =                       exchange=0   metadir=0
 data     =                       bsize=4096   blocks=390703440, imaxpct=5
          =                       sunit=0      swidth=0 blks
 naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
 log      =/dev/nvme1n1           bsize=4096   blocks=262144, version=2
          =                       sectsz=4096  sunit=1 blks, lazy-count=1
 realtime =none                   extsz=4096   blocks=0, rtextents=0
          =                       rgcount=0    rgsize=0 extents
          =                       zoned=0      start=0 reserved=0

So we created 40 AGs, one per CPU.  Now we create 40 directories and run
fsmark:

 $ time fs_mark  -D  10000  -S  0  -n  100000  -s  0  -L  8 -d ...
 # Version 3.3, 40 thread(s) starting at Wed Dec 10 14:22:07 2025
 # Sync method: NO SYNC: Test does not issue sync() or fsync() calls.
 # Directories:  Time based hash between directories across 10000 subdirectories with 180 seconds per subdirectory.
 # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name)
 # Files info: size 0 bytes, written with an IO size of 16384 bytes per write
 # App overhead is time in microseconds spent in the test not doing file writing related system calls.

 parent=0               parent=1
 ==================     ==================
 real    0m57.573s      real    1m2.934s
 user    3m53.578s      user    3m53.508s
 sys     19m44.440s     sys     25m14.810s

 $ time rm -rf ...

 parent=0               parent=1
 ==================     ==================
 real    0m59.649s      real    1m12.505s
 user    0m41.196s      user    0m47.489s
 sys     13m9.566s      sys     20m33.844s

Parent pointers increase the system time by 28% overhead to create 32
million files that are totally empty.  Removing them incurs a system
time increase of 56%.  Wall time increases by 9% and 22%.

For most filesystems, each file tends to have a single owner and not
that many xattrs.  If the xattr structure is shortform, then all xattr
changes are logged with the inode and do not require the the xattr
intent mechanism to persist the parent pointer.

Therefore, we can speed up parent pointer operations by calling the
shortform xattr functions directly if the child's xattr is in short
format.  Now the overhead looks like:

 $ time fs_mark  -D  10000  -S  0  -n  100000  -s  0  -L  8 -d ...

 parent=0               parent=1
 ==================     ==================
 real    0m58.030s      real    1m0.983s
 user    3m54.141s      user    3m53.758s
 sys     19m57.003s     sys     21m30.605s

 $ time rm -rf ...

 parent=0               parent=1
 ==================     ==================
 real    0m58.911s      real    1m4.420s
 user    0m41.329s      user    0m45.169s
 sys     13m27.857s     sys     15m58.564s

Now parent pointers only increase the system time by 8% for creation and
19% for deletion.  Wall time increases by 5% and 9% now.

Close the performance gap by creating helpers for the attr set, remove,
and replace operations that will try to make direct shortform updates,
and fall back to the attr intent machinery if that doesn't work.  This
works for regular xattrs and for parent pointers.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 weeks agoxfs: reduce xfs_attr_try_sf_addname parameters
Darrick J. Wong [Fri, 23 Jan 2026 17:27:34 +0000 (09:27 -0800)] 
xfs: reduce xfs_attr_try_sf_addname parameters

The dp parameter to this function is an alias of args->dp, so remove it
for clarity before we go adding new callers.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 weeks agoxfs: fix remote xattr valuelblk check
Darrick J. Wong [Fri, 23 Jan 2026 17:27:33 +0000 (09:27 -0800)] 
xfs: fix remote xattr valuelblk check

In debugging other problems with generic/753, it turns out that it's
possible for the system go to down in the middle of a remote xattr set
operation such that the leaf block entry is marked incomplete and
valueblk is set to zero.  Make this no longer a failure.

Cc: <stable@vger.kernel.org> # v4.15
Fixes: 13791d3b833428 ("xfs: scrub extended attribute leaf space")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 weeks agoxfs: fix the xattr scrub to detect freemap/entries array collisions
Darrick J. Wong [Fri, 23 Jan 2026 17:27:33 +0000 (09:27 -0800)] 
xfs: fix the xattr scrub to detect freemap/entries array collisions

In the previous patches, we observed that it's possible for there to be
freemap entries with zero size but a nonzero base.  This isn't an
inconsistency per se, but older kernels can get confused by this and
corrupt the block, leading to corruption.

If we see this, flag the xattr structure for optimization so that it
gets rebuilt.

Cc: <stable@vger.kernel.org> # v4.15
Fixes: 13791d3b833428 ("xfs: scrub extended attribute leaf space")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 weeks agoxfs: strengthen attr leaf block freemap checking
Darrick J. Wong [Fri, 23 Jan 2026 17:27:32 +0000 (09:27 -0800)] 
xfs: strengthen attr leaf block freemap checking

Check for erroneous overlapping freemap regions and collisions between
freemap regions and the xattr leaf entry array.

Note that we must explicitly zero out the extra freemaps in
xfs_attr3_leaf_compact so that the in-memory buffer has a correctly
initialized freemap array to satisfy the new verification code, even if
subsequent code changes the contents before unlocking the buffer.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 weeks agoxfs: refactor attr3 leaf table size computation
Darrick J. Wong [Fri, 23 Jan 2026 17:27:31 +0000 (09:27 -0800)] 
xfs: refactor attr3 leaf table size computation

Replace all the open-coded callsites with a single static inline helper.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 weeks agoxfs: fix freemap adjustments when adding xattrs to leaf blocks
Darrick J. Wong [Fri, 23 Jan 2026 17:27:31 +0000 (09:27 -0800)] 
xfs: fix freemap adjustments when adding xattrs to leaf blocks

xfs/592 and xfs/794 both trip this assertion in the leaf block freemap
adjustment code after ~20 minutes of running on my test VMs:

 ASSERT(ichdr->firstused >= ichdr->count * sizeof(xfs_attr_leaf_entry_t)
+ xfs_attr3_leaf_hdr_size(leaf));

Upon enabling quite a lot more debugging code, I narrowed this down to
fsstress trying to set a local extended attribute with namelen=3 and
valuelen=71.  This results in an entry size of 80 bytes.

At the start of xfs_attr3_leaf_add_work, the freemap looks like this:

i 0 base 448 size 0 rhs 448 count 46
i 1 base 388 size 132 rhs 448 count 46
i 2 base 2120 size 4 rhs 448 count 46
firstused = 520

where "rhs" is the first byte past the end of the leaf entry array.
This is inconsistent -- the entries array ends at byte 448, but
freemap[1] says there's free space starting at byte 388!

By the end of the function, the freemap is in worse shape:

i 0 base 456 size 0 rhs 456 count 47
i 1 base 388 size 52 rhs 456 count 47
i 2 base 2120 size 4 rhs 456 count 47
firstused = 440

Important note: 388 is not aligned with the entries array element size
of 8 bytes.

Based on the incorrect freemap, the name area starts at byte 440, which
is below the end of the entries array!  That's why the assertion
triggers and the filesystem shuts down.

How did we end up here?  First, recall from the previous patch that the
freemap array in an xattr leaf block is not intended to be a
comprehensive map of all free space in the leaf block.  In other words,
it's perfectly legal to have a leaf block with:

 * 376 bytes in use by the entries array
 * freemap[0] has [base = 376, size = 8]
 * freemap[1] has [base = 388, size = 1500]
 * the space between 376 and 388 is free, but the freemap stopped
   tracking that some time ago

If we add one xattr, the entries array grows to 384 bytes, and
freemap[0] becomes [base = 384, size = 0].  So far, so good.  But if we
add a second xattr, the entries array grows to 392 bytes, and freemap[0]
gets pushed up to [base = 392, size = 0].  This is bad, because
freemap[1] hasn't been updated, and now the entries array and the free
space claim the same space.

The fix here is to adjust all freemap entries so that none of them
collide with the entries array.  Note that this fix relies on commit
2a2b5932db6758 ("xfs: fix attr leaf header freemap.size underflow") and
the previous patch that resets zero length freemap entries to have
base = 0.

Cc: <stable@vger.kernel.org> # v2.6.12
Fixes: 1da177e4c3f415 ("Linux-2.6.12-rc2")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 weeks agoxfs: delete attr leaf freemap entries when empty
Darrick J. Wong [Fri, 23 Jan 2026 17:27:30 +0000 (09:27 -0800)] 
xfs: delete attr leaf freemap entries when empty

Back in commit 2a2b5932db6758 ("xfs: fix attr leaf header freemap.size
underflow"), Brian Foster observed that it's possible for a small
freemap at the end of the end of the xattr entries array to experience
a size underflow when subtracting the space consumed by an expansion of
the entries array.  There are only three freemap entries, which means
that it is not a complete index of all free space in the leaf block.

This code can leave behind a zero-length freemap entry with a nonzero
base.  Subsequent setxattr operations can increase the base up to the
point that it overlaps with another freemap entry.  This isn't in and of
itself a problem because the code in _leaf_add that finds free space
ignores any freemap entry with zero size.

However, there's another bug in the freemap update code in _leaf_add,
which is that it fails to update a freemap entry that begins midway
through the xattr entry that was just appended to the array.  That can
result in the freemap containing two entries with the same base but
different sizes (0 for the "pushed-up" entry, nonzero for the entry
that's actually tracking free space).  A subsequent _leaf_add can then
allocate xattr namevalue entries on top of the entries array, leading to
data loss.  But fixing that is for later.

For now, eliminate the possibility of confusion by zeroing out the base
of any freemap entry that has zero size.  Because the freemap is not
intended to be a complete index of free space, a subsequent failure to
find any free space for a new xattr will trigger block compaction, which
regenerates the freemap.

It looks like this bug has been in the codebase for quite a long time.

Cc: <stable@vger.kernel.org> # v2.6.12
Fixes: 1da177e4c3f415 ("Linux-2.6.12-rc2")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: fix incorrect context handling in xfs_trans_roll
Wenwu Hou [Sat, 17 Jan 2026 06:52:43 +0000 (14:52 +0800)] 
xfs: fix incorrect context handling in xfs_trans_roll

The memalloc_nofs_save() and memalloc_nofs_restore() calls are
incorrectly paired in xfs_trans_roll.

Call path:
xfs_trans_alloc()
    __xfs_trans_alloc()
// tp->t_pflags = memalloc_nofs_save();
xfs_trans_set_context()
...
xfs_defer_trans_roll()
    xfs_trans_roll()
        xfs_trans_dup()
            // old_tp->t_pflags = 0;
            xfs_trans_switch_context()
        __xfs_trans_commit()
            xfs_trans_free()
                // memalloc_nofs_restore(tp->t_pflags);
                xfs_trans_clear_context()

The code passes 0 to memalloc_nofs_restore() when committing the original
transaction, but memalloc_nofs_restore() should always receive the
flags returned from the paired memalloc_nofs_save() call.

Before commit 3f6d5e6a468d ("mm: introduce memalloc_flags_{save,restore}"),
calling memalloc_nofs_restore(0) would unset the PF_MEMALLOC_NOFS flag,
which could cause memory allocation deadlocks[1].
Fortunately, after that commit, memalloc_nofs_restore(0) does nothing,
so this issue is currently harmless.

Fixes: 756b1c343333 ("xfs: use current->journal_info for detecting transaction recursion")
Link: https://lore.kernel.org/linux-xfs/20251104131857.1587584-1-leo.lilong@huawei.com
Signed-off-by: Wenwu Hou <hwenwur@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: always allocate the free zone with the lowest index
Hans Holmberg [Tue, 20 Jan 2026 08:57:46 +0000 (09:57 +0100)] 
xfs: always allocate the free zone with the lowest index

Zones in the beginning of the address space are typically mapped to
higer bandwidth tracks on HDDs than those at the end of the address
space. So, in stead of allocating zones "round robin" across the whole
address space, always allocate the zone with the lowest index.

This increases average write bandwidth for overwrite workloads
when less than the full capacity is being used. At ~50% utilization
this improves bandwidth for a random file overwrite benchmark
with 128MiB files and 256MiB zone capacity by 30%.

Running the same benchmark with small 2-8 MiB files at 67% capacity
shows no significant difference in performance. Due to heavy
fragmentation the whole zone range is in use, greatly limiting the
number of free zones with high bw.

Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: promote metadata directories and large block support
Darrick J. Wong [Wed, 21 Jan 2026 06:45:40 +0000 (22:45 -0800)] 
xfs: promote metadata directories and large block support

Large block support was merged upstream in 6.12 (Dec 2024) and metadata
directories was merged in 6.13 (Jan 2025).  We've not received any
serious complaints about the ondisk formats of these two features in the
past year, so let's remove the experimental warnings.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: use blkdev_get_zone_info to simplify zone reporting
Christoph Hellwig [Wed, 14 Jan 2026 06:53:29 +0000 (07:53 +0100)] 
xfs: use blkdev_get_zone_info to simplify zone reporting

Unwind the callback based programming model by querying the cached
zone information using blkdev_get_zone_info.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: check that used blocks are smaller than the write pointer
Christoph Hellwig [Wed, 14 Jan 2026 06:53:28 +0000 (07:53 +0100)] 
xfs: check that used blocks are smaller than the write pointer

Any used block must have been written, this reject used blocks > write
pointer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: split and refactor zone validation
Christoph Hellwig [Wed, 14 Jan 2026 06:53:27 +0000 (07:53 +0100)] 
xfs: split and refactor zone validation

Currently xfs_zone_validate mixes validating the software zone state in
the XFS realtime group with validating the hardware state reported in
struct blk_zone and deriving the write pointer from that.

Move all code that works on the realtime group to xfs_init_zone, and only
keep the hardware state validation in xfs_zone_validate.  This makes the
code more clear, and allows for better reuse in userspace.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: pass the write pointer to xfs_init_zone
Christoph Hellwig [Wed, 14 Jan 2026 06:53:26 +0000 (07:53 +0100)] 
xfs: pass the write pointer to xfs_init_zone

Move the two methods to query the write pointer out of xfs_init_zone into
the callers, so that xfs_init_zone doesn't have to bother with the
blk_zone structure and instead operates purely at the XFS realtime group
level.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: add a xfs_rtgroup_raw_size helper
Christoph Hellwig [Wed, 14 Jan 2026 06:53:25 +0000 (07:53 +0100)] 
xfs: add a xfs_rtgroup_raw_size helper

Add a helper to figure the on-disk size of a group, accounting for the
XFS_SB_FEAT_INCOMPAT_ZONE_GAPS feature if needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: add missing forward declaration in xfs_zones.h
Damien Le Moal [Wed, 14 Jan 2026 06:53:24 +0000 (07:53 +0100)] 
xfs: add missing forward declaration in xfs_zones.h

Add the missing forward declaration for struct blk_zone in xfs_zones.h.
This avoids headaches with the order of header file inclusion to avoid
compilation errors.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: remove xfs_attr_leaf_hasname
Christoph Hellwig [Fri, 9 Jan 2026 15:17:40 +0000 (16:17 +0100)] 
xfs: remove xfs_attr_leaf_hasname

The calling convention of xfs_attr_leaf_hasname() is problematic, because
it returns a NULL buffer when xfs_attr3_leaf_read fails, a valid buffer
when xfs_attr3_leaf_lookup_int returns -ENOATTR or -EEXIST, and a
non-NULL buffer pointer for an already released buffer when
xfs_attr3_leaf_lookup_int fails with other error values.

Fix this by simply open coding xfs_attr_leaf_hasname in the callers, so
that the buffer release code is done by each caller of
xfs_attr3_leaf_read.

Cc: stable@vger.kernel.org # v5.19+
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Reported-by: Mark Tinguely <mark.tinguely@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: mark data structures corrupt on EIO and ENODATA
Darrick J. Wong [Fri, 19 Dec 2025 02:40:50 +0000 (18:40 -0800)] 
xfs: mark data structures corrupt on EIO and ENODATA

I learned a few things this year: first, blk_status_to_errno can return
ENODATA for critical media errors; and second, the scrub code doesn't
mark data structures as corrupt on ENODATA or EIO.

Currently, scrub failing to capture these errors isn't all that
impactful -- the checking code will exit to userspace with EIO/ENODATA,
and xfs_scrub will log a complaint and exit with nonzero status.  Most
people treat fsck tools failing as a sign that the fs is corrupt, but
online fsck should mark the metadata bad and keep moving.

Cc: stable@vger.kernel.org # v4.15
Fixes: 4700d22980d459 ("xfs: create helpers to record and deal with scrub problems")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: rework zone GC buffer management
Christoph Hellwig [Wed, 14 Jan 2026 13:06:43 +0000 (14:06 +0100)] 
xfs: rework zone GC buffer management

The double buffering where just one scratch area is used at a time does
not efficiently use the available memory.  It was originally implemented
when GC I/O could happen out of order, but that was removed before
upstream submission to avoid fragmentation.  Now that all GC I/Os are
processed in order, just use a number of buffers as a simple ring buffer.

For a synthetic benchmark that fills 256MiB HDD zones and punches out
holes to free half the space this leads to a decrease of GC time by
a little more than 25%.

Thanks to Hans Holmberg <hans.holmberg@wdc.com> for testing and
benchmarking.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: use bio_reuse in the zone GC code
Christoph Hellwig [Wed, 14 Jan 2026 13:06:42 +0000 (14:06 +0100)] 
xfs: use bio_reuse in the zone GC code

Replace our somewhat fragile code to reuse the bio, which caused a
regression in the past with the block layer bio_reuse helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoblock: add a bio_reuse helper
Christoph Hellwig [Wed, 14 Jan 2026 13:06:41 +0000 (14:06 +0100)] 
block: add a bio_reuse helper

Add a helper to allow an existing bio to be resubmitted without
having to re-add the payload.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: directly include xfs_platform.h
Christoph Hellwig [Fri, 19 Dec 2025 05:41:47 +0000 (06:41 +0100)] 
xfs: directly include xfs_platform.h

The xfs.h header conflicts with the public xfs.h in xfsprogs, leading
to a spurious difference in all shared libxfs files that have to
include libxfs_priv.h in userspace.  Directly include xfs_platform.h so
that we can add a header of the same name to xfsprogs and remove this
major annoyance for the shared code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: move the remaining content from xfs.h to xfs_platform.h
Christoph Hellwig [Fri, 19 Dec 2025 05:41:46 +0000 (06:41 +0100)] 
xfs: move the remaining content from xfs.h to xfs_platform.h

Move the global defines from xfs.h to xfs_platform.h to prepare for
removing xfs.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: include global headers first in xfs_platform.h
Christoph Hellwig [Fri, 19 Dec 2025 05:41:45 +0000 (06:41 +0100)] 
xfs: include global headers first in xfs_platform.h

Ensure we have all kernel headers included by the time we do our own
thing, just like the rest of the tree.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: rename xfs_linux.h to xfs_platform.h
Christoph Hellwig [Fri, 19 Dec 2025 05:41:44 +0000 (06:41 +0100)] 
xfs: rename xfs_linux.h to xfs_platform.h

Rename xfs_linux.h to prepare for including including it directly
from source files including those shared with xfsprogs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: factor out a xlog_write_space_advance helper
Christoph Hellwig [Wed, 12 Nov 2025 12:14:26 +0000 (13:14 +0100)] 
xfs: factor out a xlog_write_space_advance helper

Add a new xlog_write_space_advance that returns the current place in the
iclog that data is written to, and advances the various counters by the
amount taken from xlog_write_iovec, and also use it xlog_write_partial,
which open codes the counter adjustments, but misses the asserts.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: improve the iclog space assert in xlog_write_iovec
Christoph Hellwig [Wed, 12 Nov 2025 12:14:25 +0000 (13:14 +0100)] 
xfs: improve the iclog space assert in xlog_write_iovec

We need enough space for the length we copy into the iclog, not just
some space, so tighten up the check a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: add a xlog_write_space_left helper
Christoph Hellwig [Wed, 12 Nov 2025 12:14:24 +0000 (13:14 +0100)] 
xfs: add a xlog_write_space_left helper

Various places check how much space is left in the current iclog,
add a helper for that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: improve the calling convention for the xlog_write helpers
Christoph Hellwig [Wed, 12 Nov 2025 12:14:23 +0000 (13:14 +0100)] 
xfs: improve the calling convention for the xlog_write helpers

The xlog_write chain passes around the same seven variables that are
often passed by reference. Add a xlog_write_data structure to contain
them to improve code generation and readability.

This change increases the generated code size by about 140 bytes for my
x86_64 build, which is hopefully worth the much easier to follow code:

$ size fs/xfs/xfs_log.o*
   text    data     bss     dec     hex filename
  29300    1730     176   31206    79e6 fs/xfs/xfs_log.o
  29160    1730     176   31066    795a fs/xfs/xfs_log.o.old

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: regularize iclog space accounting in xlog_write_partial
Christoph Hellwig [Wed, 12 Nov 2025 12:14:22 +0000 (13:14 +0100)] 
xfs: regularize iclog space accounting in xlog_write_partial

When xlog_write_partial splits a log region over multiple iclogs, it
has to include the continuation ophder in the length requested for the
new iclog.  Currently is simply adds that to the request, which makes
the accounting of the used space below look slightly different from the
other users of iclog space that decrement it.

To prepare for more code sharing, add the ophdr size to the len variable
that tracks the number of bytes still are left in this xlog_write
operation before the calling xlog_write_get_more_iclog_space, and then
decrement it later when consuming that space.

This changes the value of len when xlog_write_get_more_iclog_space
returns an error, but as nothing looks at len in that case the
difference doesn't matter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: move struct xfs_log_vec to xfs_log_priv.h
Christoph Hellwig [Wed, 12 Nov 2025 12:14:21 +0000 (13:14 +0100)] 
xfs: move struct xfs_log_vec to xfs_log_priv.h

The log_vec is a private type for the log/CIL code and should not be
exposed to anything else.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: move struct xfs_log_iovec to xfs_log_priv.h
Christoph Hellwig [Wed, 12 Nov 2025 12:14:20 +0000 (13:14 +0100)] 
xfs: move struct xfs_log_iovec to xfs_log_priv.h

This structure is now only used by the core logging and CIL code.

Also remove the unused typedef.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: improve the ->iop_format interface
Christoph Hellwig [Wed, 12 Nov 2025 12:14:19 +0000 (13:14 +0100)] 
xfs: improve the ->iop_format interface

Export a higher level interface to format log items.  The xlog_format_buf
structure is hidden inside xfs_log_cil.c and only accessed using two
helpers (and a wrapper build on top), hiding details of log iovecs from
the log items.  This also allows simply using an index into lv_iovecp
instead of keeping a cursor vec.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: set lv_bytes in xlog_write_one_vec
Christoph Hellwig [Wed, 12 Nov 2025 12:14:18 +0000 (13:14 +0100)] 
xfs: set lv_bytes in xlog_write_one_vec

lv_bytes is mostly just use by the CIL code, but has crept into the
low-level log writing code to decide on a full or partial iclog
write.  Ensure it is valid even for the special log writes that don't
go through the CIL by initializing it in xlog_write_one_vec.

Note that even without this fix, the checkpoint commits would never
trigger a partial iclog write, as they have no payload beyond the
opheader.

The unmount record on the other hand could in theory trigger a an
overflow of the iclog, but given that is has never been seen in
the wild this has probably been masked by the small size of it
and the fact that the unmount process does multiple log forces
before writing the unmount record and we thus usually operate on
an empty or almost empty iclog.

Fixes: 110dc24ad2ae ("xfs: log vector rounding leaks log space")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: add a xlog_write_one_vec helper
Christoph Hellwig [Wed, 12 Nov 2025 12:14:17 +0000 (13:14 +0100)] 
xfs: add a xlog_write_one_vec helper

Add a wrapper for xlog_write for the two callers who need to build a
log_vec and add it to a single-entry chain instead of duplicating the
code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
3 weeks agoxfs: add media verification ioctl
Darrick J. Wong [Wed, 21 Jan 2026 02:06:52 +0000 (18:06 -0800)] 
xfs: add media verification ioctl

Add a new privileged ioctl so that xfs_scrub can ask the kernel to
verify the media of the devices backing an xfs filesystem, and have any
resulting media errors reported to fsnotify and xfs_healer.

To accomplish this, the kernel allocates a folio between the base page
size and 1MB, and issues read IOs to a gradually incrementing range of
one of the storage devices underlying an xfs filesystem.  If any error
occurs, that raw error is reported to the calling process.  If the error
happens to be one of the ones that the kernel considers indicative of
data loss, then it will also be reported to xfs_healthmon and fsnotify.

Driving the verification from the kernel enables xfs (and by extension
xfs_scrub) to have precise control over the size and error handling of
IOs that are issued to the underlying block device, and to emit
notifications about problems to other relevant kernel subsystems
immediately.

Note that the caller is also allowed to reduce the size of the IO and
to ask for a relaxation period after each IO.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: check if an open file is on the health monitored fs
Darrick J. Wong [Wed, 21 Jan 2026 02:06:51 +0000 (18:06 -0800)] 
xfs: check if an open file is on the health monitored fs

Create a new ioctl for the healthmon file that checks that a given fd
points to the same filesystem that the healthmon file is monitoring.
This allows xfs_healer to check that when it reopens a mountpoint to
perform repairs, the file that it gets matches the filesystem that
generated the corruption report.

(Note that xfs_healer doesn't maintain an open fd to a filesystem that
it's monitoring so that it doesn't pin the mount.)

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: allow toggling verbose logging on the health monitoring file
Darrick J. Wong [Wed, 21 Jan 2026 02:06:51 +0000 (18:06 -0800)] 
xfs: allow toggling verbose logging on the health monitoring file

Make it so that we can reconfigure the health monitoring device by
calling the XFS_IOC_HEALTH_MONITOR ioctl on it.  As of right now we can
only toggle the verbose flag, but this is less annoying than having to
closing the monitor fd and reopen it.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: convey file I/O errors to the health monitor
Darrick J. Wong [Wed, 21 Jan 2026 02:06:50 +0000 (18:06 -0800)] 
xfs: convey file I/O errors to the health monitor

Connect the fserror reporting to the health monitor so that xfs can send
events about file I/O errors to the xfs_healer daemon.  These events are
entirely informational because xfs cannot regenerate user data, so
hopefully the fsnotify I/O error event gets noticed by the relevant
management systems.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: convey externally discovered fsdax media errors to the health monitor
Darrick J. Wong [Wed, 21 Jan 2026 02:06:49 +0000 (18:06 -0800)] 
xfs: convey externally discovered fsdax media errors to the health monitor

Connect the fsdax media failure notification code to the health monitor
so that xfs can send events about that to the xfs_healer daemon.

Later on we'll add the ability for the xfs_scrub media scan (phase 6) to
report the errors that it finds to the kernel so that those are also
logged by xfs_healer.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: convey filesystem shutdown events to the health monitor
Darrick J. Wong [Wed, 21 Jan 2026 02:06:48 +0000 (18:06 -0800)] 
xfs: convey filesystem shutdown events to the health monitor

Connect the filesystem shutdown code to the health monitor so that xfs
can send events about that to the xfs_healer daemon.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: convey metadata health events to the health monitor
Darrick J. Wong [Wed, 21 Jan 2026 02:06:47 +0000 (18:06 -0800)] 
xfs: convey metadata health events to the health monitor

Connect the filesystem metadata health event collection system to the
health monitor so that xfs can send events to xfs_healer as it collects
information.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: convey filesystem unmount events to the health monitor
Darrick J. Wong [Wed, 21 Jan 2026 02:06:47 +0000 (18:06 -0800)] 
xfs: convey filesystem unmount events to the health monitor

In xfs_healthmon_unmount, send events to xfs_healer so that it knows
that nothing further can be done for the filesystem.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: create event queuing, formatting, and discovery infrastructure
Darrick J. Wong [Wed, 21 Jan 2026 02:06:46 +0000 (18:06 -0800)] 
xfs: create event queuing, formatting, and discovery infrastructure

Create the basic infrastructure that we need to report health events to
userspace.  We need a compact form for recording critical information
about an event and queueing them; a means to notice that we've lost some
events; and a means to format the events into something that userspace
can handle.  Make the kernel export C structures via read().

In a previous iteration of this new subsystem, I wanted to explore data
exchange formats that are more flexible and easier for humans to read
than C structures.  The thought being that when we want to rev (or
worse, enlarge) the event format, it ought to be trivially easy to do
that in a way that doesn't break old userspace.

I looked at formats such as protobufs and capnproto.  These look really
nice in that extending the wire format is fairly easy, you can give it a
data schema and it generates the serialization code for you, handles
endianness problems, etc.  The huge downside is that neither support C
all that well.

Too hard, and didn't want to port either of those huge sprawling
libraries first to the kernel and then again to xfsprogs.  Then I
thought, how about JSON?  Javascript objects are human readable, the
kernel can emit json without much fuss (it's all just strings!) and
there are plenty of interpreters for python/rust/c/etc.

There's a proposed schema format for json, which means that xfs can
publish a description of the events that kernel will emit.  Userspace
consumers (e.g. xfsprogs/xfs_healer) can embed the same schema document
and use it to validate the incoming events from the kernel, which means
it can discard events that it doesn't understand, or garbage being
emitted due to bugs.

However, json has a huge crutch -- javascript is well known for its
vague definitions of what are numbers.  This makes expressing a large
number rather fraught, because the runtime is free to represent a number
in nearly any way it wants.  Stupider ones will truncate values to word
size, others will roll out doubles for uint52_t (yes, fifty-two) with
the resulting loss of precision.  Not good when you're dealing with
discrete units.

It just so happens that python's json library is smart enough to see a
sequence of digits and put them in a u64 (at least on x86_64/aarch64)
but an actual javascript interpreter (pasting into Firefox) isn't
necessarily so clever.

It turns out that none of the proposed json schemas were ever ratified
even in an open-consensus way, so json blobs are still just loosely
structured blobs.  The parsing in userspace was also noticeably slow and
memory-consumptive.

Hence only the C interface survives.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoxfs: start creating infrastructure for health monitoring
Darrick J. Wong [Wed, 21 Jan 2026 02:06:45 +0000 (18:06 -0800)] 
xfs: start creating infrastructure for health monitoring

Start creating helper functions and infrastructure to pass filesystem
health events to a health monitoring file.  Since this is an
administrative interface, we only support a single health monitor
process per filesystem, so we don't need to use anything fancy such as
notifier chains (== tons of indirect calls).

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
3 weeks agoLinux 6.19-rc6 v6.19-rc6
Linus Torvalds [Sun, 18 Jan 2026 23:42:45 +0000 (15:42 -0800)] 
Linux 6.19-rc6

3 weeks agoMerge tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic...
Linus Torvalds [Sun, 18 Jan 2026 23:15:47 +0000 (15:15 -0800)] 
Merge tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock fixes from Mickaël Salaün:
 "This fixes TCP handling, tests, documentation, non-audit elided code,
  and minor cosmetic changes"

* tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Clarify documentation for the IOCTL access right
  selftests/landlock: Properly close a file descriptor
  landlock: Improve the comment for domain_is_scoped
  selftests/landlock: Use scoped_base_variants.h for ptrace_test
  selftests/landlock: Fix missing semicolon
  selftests/landlock: Fix typo in fs_test
  landlock: Optimize stack usage when !CONFIG_AUDIT
  landlock: Fix spelling
  landlock: Clean up hook_ptrace_access_check()
  landlock: Improve erratum documentation
  landlock: Remove useless include
  landlock: Fix wrong type usage
  selftests/landlock: NULL-terminate unix pathname addresses
  selftests/landlock: Remove invalid unix socket bind()
  selftests/landlock: Add missing connect(minimal AF_UNSPEC) test
  selftests/landlock: Fix TCP bind(AF_UNSPEC) test case
  landlock: Fix TCP handling of short AF_UNSPEC addresses
  landlock: Fix formatting

3 weeks agoMerge tag 'cgroup-for-6.19-rc5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Jan 2026 22:30:27 +0000 (14:30 -0800)] 
Merge tag 'cgroup-for-6.19-rc5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:

 - Add Chen Ridong as cpuset reviewer

 - Add SPDX license identifiers to cgroup files that were missing them

* tag 'cgroup-for-6.19-rc5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  kernel: cgroup: Add LGPL-2.1 SPDX license ID to legacy_freezer.c
  kernel: cgroup: Add SPDX-License-Identifier lines
  MAINTAINERS: Add Chen Ridong as cpuset reviewer

3 weeks agoMerge tag 'ext4_for_linus-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Jan 2026 22:01:20 +0000 (14:01 -0800)] 
Merge tag 'ext4_for_linus-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:

 - Fix an inconsistency in structure size on 32-bit platforms caused by
   padding differences for the new EXT4_IOC_[GS]ET_TUNE_SB_PARAM ioctls

 - Fix a buffer leak on the error path when dropping the refcount an
   xattr value stored in an inode

 - Fix missing locking on the error path for the file defragmentation
   ioctl leading to a BUG

* tag 'ext4_for_linus-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix iloc.bh leak in ext4_xattr_inode_update_ref
  ext4: add missing down_write_data_sem in mext_move_extent().
  ext4: fix ext4_tune_sb_params padding

3 weeks agoMerge tag 'dmaengine-fix-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Linus Torvalds [Sun, 18 Jan 2026 21:38:31 +0000 (13:38 -0800)] 
Merge tag 'dmaengine-fix-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:
 "A bunch of driver fixes for:

   - dma mask fix for mmp pdma driver

   - Xilinx regmap max register, uninitialized addr_width fix

   - device leak fix for bunch of drivers in the subsystem

   - stm32 dmamux, TI crossbar driver fixes for device & of node leak
     and route allocation cleanup

   - Tegra use afer free fix

   - Memory leak fix in Qualcomm gpi and omap-dma driver

   - compatible fix for apple driver"

* tag 'dmaengine-fix-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (25 commits)
  dmaengine: apple-admac: Add "apple,t8103-admac" compatible
  dmaengine: omap-dma: fix dma_pool resource leak in error paths
  dmaengine: qcom: gpi: Fix memory leak in gpi_peripheral_config()
  dmaengine: sh: rz-dmac: Fix rz_dmac_terminate_all()
  dmaengine: xilinx_dma: Fix uninitialized addr_width when "xlnx,addrwidth" property is missing
  dmaengine: tegra-adma: Fix use-after-free
  dmaengine: fsl-edma: Fix clk leak on alloc_chan_resources failure
  dmaengine: mmp_pdma: Fix race condition in mmp_pdma_residue()
  dmaengine: ti: k3-udma: fix device leak on udma lookup
  dmaengine: ti: dma-crossbar: clean up dra7x route allocation error paths
  dmaengine: ti: dma-crossbar: fix device leak on am335x route allocation
  dmaengine: ti: dma-crossbar: fix device leak on dra7x route allocation
  dmaengine: stm32: dmamux: clean up route allocation error labels
  dmaengine: stm32: dmamux: fix OF node leak on route allocation failure
  dmaengine: stm32: dmamux: fix device leak on route allocation
  dmaengine: sh: rz-dmac: fix device leak on probe failure
  dmaengine: lpc32xx-dmamux: fix device leak on route allocation
  dmaengine: lpc18xx-dmamux: fix device leak on route allocation
  dmaengine: idxd: fix device leaks on compat bind and unbind
  dmaengine: dw: dmamux: fix OF node leak on route allocation failure
  ...

3 weeks agoMerge tag 'phy-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Linus Torvalds [Sun, 18 Jan 2026 21:18:40 +0000 (13:18 -0800)] 
Merge tag 'phy-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy fixes from Vinod Koul:
 "A bunch of driver fixes:

   - Freescale typec orientation switch fix, clearing register fix,
     assertion of phy reset during power on

   - Qualcomm pcs register clear before using

   - stm one off fix

   - TI runtimepm error handling, regmap leak fixes

   - Rockchip gadget mode disconnection and disruption fixes

   - Tegra register level fix

   - Broadcom pointer cast warning fix"

* tag 'phy-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: freescale: imx8m-pcie: assert phy reset during power on
  phy: rockchip: inno-usb2: Fix a double free bug in rockchip_usb2phy_probe()
  phy: broadcom: ns-usb3: Fix Wvoid-pointer-to-enum-cast warning (again)
  phy: tegra: xusb: Explicitly configure HS_DISCON_LEVEL to 0x7
  phy: rockchip: inno-usb2: fix communication disruption in gadget mode
  phy: rockchip: inno-usb2: fix disconnection in gadget mode
  phy: ti: gmii-sel: fix regmap leak on probe failure
  phy: sparx5-serdes: make it selectable for ARCH_LAN969X
  phy: ti: da8xx-usb: Handle devm_pm_runtime_enable() errors
  phy: stm32-usphyc: Fix off by one in probe()
  phy: qcom-qusb2: Fix NULL pointer dereference on early suspend
  phy: fsl-imx8mq-usb: Clear the PCS_TX_SWING_FULL field before using it
  dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Update pcie phy bindings for qcs8300
  phy: fsl-imx8mq-usb: fix typec orientation switch when built as module

3 weeks agoMerge tag 'soundwire-6.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 18 Jan 2026 20:29:12 +0000 (12:29 -0800)] 
Merge tag 'soundwire-6.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire fix from Vinod Koul:

 - Single off-by-one fix for allocating slave id

* tag 'soundwire-6.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: bus: fix off-by-one when allocating slave IDs

3 weeks agoMerge tag 'usb-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 18 Jan 2026 20:09:13 +0000 (12:09 -0800)] 
Merge tag 'usb-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes and new device ids for 6.19-rc6

  Included in here are:

   - new usb-serial device ids

   - dwc3-apple driver fixes to get things working properly on that
     hardware platform

   - ohci/uhci platfrom driver module soft-deps with ehci to remove a
     runtime warning that sometimes shows up on some platforms.

   - quirk for broken devices that can not handle reading the BOS
     descriptor from them without going crazy.

   - usb-serial driver fixes

   - xhci driver fixes

   - usb gadget driver fixes

  All of these except for the last xhci fix has been in linux-next for a
  while. The xhci fix has been reported by others to solve the issue for
  them, so should be ok"

* tag 'usb-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  xhci: sideband: don't dereference freed ring when removing sideband endpoint
  usb: gadget: uvc: retry vb2_reqbufs() with vb_vmalloc_memops if use_sg fail
  usb: gadget: uvc: return error from uvcg_queue_init()
  usb: gadget: uvc: fix interval_duration calculation
  usb: gadget: uvc: fix req_payload_size calculation
  usb: dwc3: apple: Ignore USB role switches to the active role
  usb: host: xhci-tegra: Use platform_get_irq_optional() for wake IRQs
  USB: OHCI/UHCI: Add soft dependencies on ehci_platform
  usb: dwc3: apple: Set USB2 PHY mode before dwc3 init
  USB: serial: f81232: fix incomplete serial port generation
  USB: serial: ftdi_sio: add support for PICAXE AXE027 cable
  USB: serial: option: add Telit LE910 MBIM composition
  usb: core: add USB_QUIRK_NO_BOS for devices that hang on BOS descriptor
  dt-bindings: usb: qcom,dwc3: Correct MSM8994 interrupts
  dt-bindings: usb: qcom,dwc3: Correct IPQ5018 interrupts
  tcpm: allow looking for role_sw device in the main node
  usb: dwc3: Check for USB4 IP_NAME

3 weeks agoMerge tag 'i2c-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 18 Jan 2026 20:00:35 +0000 (12:00 -0800)] 
Merge tag 'i2c-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - riic, imx-lpi2c: suspend/resume fixes

 - qcom-geni: DMA handling fix

 - iproc: correct DT binding description

* tag 'i2c-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: imx-lpi2c: change to PIO mode in system-wide suspend/resume progress
  i2c: qcom-geni: make sure I2C hub controllers can't use SE DMA
  i2c: riic: Move suspend handling to NOIRQ phase
  dt-bindings: i2c: brcm,iproc-i2c: Allow 2 reg entries for brcm,iproc-nic-i2c

3 weeks agoMerge tag 'edac_urgent_for_v6.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Jan 2026 19:39:56 +0000 (11:39 -0800)] 
Merge tag 'edac_urgent_for_v6.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC fixes from Borislav Petkov:
 "Make sure the memory-mapped memory controller registers BAR gets
  unmapped when the driver memory allocation fails

  Fix that in both x38 and i3200 EDAC drivers as former has copied the
  bug from the latter, it looks like"

* tag 'edac_urgent_for_v6.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/x38: Fix a resource leak in x38_probe1()
  EDAC/i3200: Fix a resource leak in i3200_probe1()

3 weeks agoMerge tag 'x86-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 18 Jan 2026 19:34:11 +0000 (11:34 -0800)] 
Merge tag 'x86-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 fixes from Ingo Molnar:

 - Fix resctrl initialization on Hygon CPUs

 - Fix resctrl memory bandwidth counters on Hygon CPUs

 - Fix x86 self-tests build bug

* tag 'x86-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/x86: Add selftests include path for kselftest.h after centralization
  x86/resctrl: Fix memory bandwidth counter width for Hygon
  x86/resctrl: Add missing resctrl initialization for Hygon

3 weeks agoMerge tag 'timers-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Jan 2026 18:56:32 +0000 (10:56 -0800)] 
Merge tag 'timers-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "Fix the update_needs_ipi() check in the hrtimer code that may result
  in incorrect skipping of hrtimer IPIs"

* tag 'timers-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hrtimer: Fix softirq base check in update_needs_ipi()

3 weeks agoMerge tag 'sched-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Jan 2026 18:17:40 +0000 (10:17 -0800)] 
Merge tag 'sched-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
 "Misc deadline scheduler fixes, mainly for a new category of bugs that
  were discovered and fixed recently:

   - Fix a race condition in the DL server

   - Fix a DL server bug which can result in incorrectly going idle when
     there's work available

   - Fix DL server bug which triggers a WARN() due to broken
     get_prio_dl() logic and subsequent misbehavior

   - Fix double update_rq_clock() calls

   - Fix setscheduler() assumption about static priorities

   - Make sure balancing callbacks are always called

   - Plus a handful of preparatory commits for the fixes"

* tag 'sched-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Use ENQUEUE_MOVE to allow priority change
  sched: Deadline has dynamic priority
  sched: Audit MOVE vs balance_callbacks
  sched: Fold rq-pin swizzle into __balance_callbacks()
  sched/deadline: Avoid double update_rq_clock()
  sched/deadline: Ensure get_prio_dl() is up-to-date
  sched/deadline: Fix server stopping with runnable tasks
  sched: Provide idle_rq() helper
  sched/deadline: Fix potential race in dl_add_task_root_domain()
  sched/deadline: Remove unnecessary comment in dl_add_task_root_domain()

3 weeks agoMerge tag 'objtool-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Jan 2026 17:09:32 +0000 (09:09 -0800)] 
Merge tag 'objtool-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixes from Ingo Molnar:
 "Fix two objtool build failures that trigger in uncommon build
  environments"

* tag 'objtool-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: fix build failure due to missing libopcodes check
  objtool: fix compilation failure with the x32 toolchain

3 weeks agoMerge tag 'irq-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 18 Jan 2026 17:08:27 +0000 (09:08 -0800)] 
Merge tag 'irq-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "Fix a riscv-imsic irqchip driver regression"

* tag 'irq-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/riscv-imsic: Revert "Remove redundant irq_data lookups"

3 weeks agoext4: fix iloc.bh leak in ext4_xattr_inode_update_ref
Yang Erkun [Sat, 13 Dec 2025 05:57:06 +0000 (13:57 +0800)] 
ext4: fix iloc.bh leak in ext4_xattr_inode_update_ref

The error branch for ext4_xattr_inode_update_ref forget to release the
refcount for iloc.bh. Find this when review code.

Fixes: 57295e835408 ("ext4: guard against EA inode refcount underflow in xattr update")
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20251213055706.3417529-1-yangerkun@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
3 weeks agoext4: add missing down_write_data_sem in mext_move_extent().
Julian Sun [Mon, 8 Dec 2025 12:37:13 +0000 (20:37 +0800)] 
ext4: add missing down_write_data_sem in mext_move_extent().

Commit 962e8a01eab9 ("ext4: introduce mext_move_extent()") attempts to
call ext4_swap_extents() on the failure path to recover the swapped
extents, but fails to acquire locks for the two inode->i_data_sem,
triggering the BUG_ON statement in ext4_swap_extents().

This issue can be fixed by calling ext4_double_down_write_data_sem()
before ext4_swap_extents().

Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
Reported-by: syzbot+4ea6bd8737669b423aae@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69368649.a70a0220.38f243.0093.GAE@google.com/
Fixes: 962e8a01eab9 ("ext4: introduce mext_move_extent()")
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://patch.msgid.link/20251208123713.1971068-1-sunjunchao@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 weeks agoext4: fix ext4_tune_sb_params padding
Arnd Bergmann [Thu, 4 Dec 2025 10:19:10 +0000 (11:19 +0100)] 
ext4: fix ext4_tune_sb_params padding

The padding at the end of struct ext4_tune_sb_params is architecture
specific and in particular is different between x86-32 and x86-64,
since the __u64 member only enforces struct alignment on the latter.

This shows up as a new warning when test-building the headers with
-Wpadded:

include/linux/ext4.h:144:1: error: padding struct size to alignment boundary with 4 bytes [-Werror=padded]

All members inside the structure are naturally aligned, so the only
difference here is the amount of padding at the end. Make the padding
explicit, to have a consistent sizeof(struct ext4_tune_sb_params) of
232 on all architectures and avoid adding compat ioctl handling for
EXT4_IOC_GET_TUNE_SB_PARAM/EXT4_IOC_SET_TUNE_SB_PARAM.

This is an ABI break on x86-32 but hopefully this can go into 6.18.y early
enough as a fixup so no actual users will be affected.  Alternatively, the
kernel could handle the ioctl commands for both sizes (232 and 228 bytes)
on all architectures.

Fixes: 04a91570ac67 ("ext4: implemet new ioctls to set and get superblock parameters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20251204101914.1037148-1-arnd@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
3 weeks agoMerge tag 'for-6.19-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Sun, 18 Jan 2026 03:29:32 +0000 (19:29 -0800)] 
Merge tag 'for-6.19-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - with large folios in use, fix partial incorrect update of a reflinked
   range

 - fix potential deadlock in iget when lookup fails and eviction is
   needed

 - in send, validate inline extent type while detecting file holes

 - fix memory leak after an error when creating a space info

 - remove zone statistics from sysfs again, the output size limitations
   make it unusable, we'll do it in another way in another release

 - test fixes:
     - return proper error codes from block remapping tests
     - fix tree root leaks in qgroup tests after errors

* tag 'for-6.19-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: remove zoned statistics from sysfs
  btrfs: fix memory leaks in create_space_info() error paths
  btrfs: invalidate pages instead of truncate after reflinking
  btrfs: update the Kconfig string for CONFIG_BTRFS_EXPERIMENTAL
  btrfs: send: check for inline extents in range_is_hole_in_parent()
  btrfs: tests: fix return 0 on rmap test failure
  btrfs: tests: fix root tree leak in btrfs_test_qgroups()
  btrfs: release path before iget_failed() in btrfs_read_locked_inode()

3 weeks agoMerge tag 'loongarch-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 18 Jan 2026 03:24:48 +0000 (19:24 -0800)] 
Merge tag 'loongarch-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Remove redundant code in head.S, fix PMU counter allocation for mixed-
  type event groups, fix a lot of dts build warnings, and fix kvm_device
  memory leaks"

* tag 'loongarch-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Fix kvm_device leak in kvm_pch_pic_destroy()
  LoongArch: KVM: Fix kvm_device leak in kvm_eiointc_destroy()
  LoongArch: KVM: Fix kvm_device leak in kvm_ipi_destroy()
  LoongArch: dts: loongson-2k1000: Fix i2c-gpio node names
  LoongArch: dts: loongson-2k2000: Add default interrupt controller address cells
  LoongArch: dts: loongson-2k1000: Add default interrupt controller address cells
  LoongArch: dts: loongson-2k0500: Add default interrupt controller address cells
  LoongArch: dts: Describe PCI sideband IRQ through interrupt-extended
  LoongArch: Fix PMU counter allocation for mixed-type event groups
  LoongArch: Remove redundant code in head.S

3 weeks agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Sat, 17 Jan 2026 16:52:45 +0000 (08:52 -0800)] 
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "An arm64/mpam fix to use non-atomic bitops on struct mmap_props member
  (atomicity not required).

  For kunit testing, the structure is packed to avoid memcmp() errors
  but this affects atomic bitops as they have strict alignment
  requirements.

  Also remove a duplicate include in the mpam driver"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm_mpam: Use non-atomic bitops when modifying feature bitmap
  arm_mpam: Remove duplicate linux/srcu.h header

3 weeks agoselftests/x86: Add selftests include path for kselftest.h after centralization
Bala-Vignesh-Reddy [Wed, 22 Oct 2025 06:29:48 +0000 (11:59 +0530)] 
selftests/x86: Add selftests include path for kselftest.h after centralization

The previous change centralizing kselftest.h include path in lib.mk caused x86
selftests to fail, as x86 Makefile overwrites CFLAGS using ":=", dropping the
include path added in lib.mk. Therefore, helpers.h could not find kselftest.h
during compilation.

Fix this by adding the tools/testing/sefltest to CFLAGS in x86 Makefile.

  [ bp: Correct commit ID in Fixes: ]

Fixes: e6fbd1759c9e ("selftests: complete kselftest include centralization")
Closes: https://lore.kernel.org/lkml/CA+G9fYvKjQcCBMfXA-z2YuL2L+3Qd-pJjEUDX8PDdz2-EEQd=Q@mail.gmail.com/T/#m83fd330231287fc9d6c921155bee16c591db7360
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Tested-by: Brendan Jackman <jackmanb@google.com>
Link: https://patch.msgid.link/20251022062948.162852-1-reddybalavignesh9979@gmail.com
3 weeks agoMerge tag 'block-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 17 Jan 2026 04:59:46 +0000 (20:59 -0800)] 
Merge tag 'block-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
     - Device quirk to disable faulty temperature (Ilikara)
     - TCP target null pointer fix from bad host protocol usage (Shivam)
     - Add apple,t8103-nvme-ans2 as a compatible apple controller
       (Janne)
     - FC tagset leak fix (Chaitanya)
     - TCP socket deadlock fix (Hannes)
     - Target name buffer overrun fix (Shin'ichiro)

 - Fix for an underflow for rnbd during device unmap

 - Zero the non-PI part of the auto integrity buffer

 - Fix for a configfs memory leak in the null block driver

* tag 'block-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  rnbd-clt: fix refcount underflow in device unmap path
  nvme: fix PCIe subsystem reset controller state transition
  nvmet: do not copy beyond sybsysnqn string length
  nvmet-tcp: fixup hang in nvmet_tcp_listen_data_ready()
  null_blk: fix kmemleak by releasing references to fault configfs items
  block: zero non-PI portion of auto integrity buffer
  nvme-fc: release admin tagset if init fails
  nvme-apple: add "apple,t8103-nvme-ans2" as compatible
  nvme-tcp: fix NULL pointer dereferences in nvmet_tcp_build_pdu_iovec
  nvme-pci: disable secondary temp for Wodposit WPBSNM8

3 weeks agoMerge tag 'io_uring-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 17 Jan 2026 04:56:56 +0000 (20:56 -0800)] 
Merge tag 'io_uring-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fix from Jens Axboe:
 "Just a single fix moving local task_work inside the cancelation loop,
  rather than only before cancelations.

  If any cancelations generate task_work, we do need to re-run it"

* tag 'io_uring-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring: move local task_work in exit cancel loop

3 weeks agoLoongArch: KVM: Fix kvm_device leak in kvm_pch_pic_destroy()
Qiang Ma [Sat, 17 Jan 2026 02:57:03 +0000 (10:57 +0800)] 
LoongArch: KVM: Fix kvm_device leak in kvm_pch_pic_destroy()

In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_pch_pic_destroy() is not currently doing this, that
would lead to a memory leak.

So, fix it.

Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoLoongArch: KVM: Fix kvm_device leak in kvm_eiointc_destroy()
Qiang Ma [Sat, 17 Jan 2026 02:57:02 +0000 (10:57 +0800)] 
LoongArch: KVM: Fix kvm_device leak in kvm_eiointc_destroy()

In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_eiointc_destroy() is not currently doing this, that
would lead to a memory leak.

So, fix it.

Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoLoongArch: KVM: Fix kvm_device leak in kvm_ipi_destroy()
Qiang Ma [Sat, 17 Jan 2026 02:57:02 +0000 (10:57 +0800)] 
LoongArch: KVM: Fix kvm_device leak in kvm_ipi_destroy()

In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_ipi_destroy() is not currently doing this, that
would lead to a memory leak.

So, fix it.

Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoLoongArch: dts: loongson-2k1000: Fix i2c-gpio node names
Binbin Zhou [Sat, 17 Jan 2026 02:56:53 +0000 (10:56 +0800)] 
LoongArch: dts: loongson-2k1000: Fix i2c-gpio node names

The binding wants the node to be named "i2c-number", but those are named
"i2c-gpio-number" instead.

Thus rename those to i2c-0, i2c-1 to adhere to the binding and suppress
dtbs_check warnings.

Cc: stable@vger.kernel.org
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoLoongArch: dts: loongson-2k2000: Add default interrupt controller address cells
Binbin Zhou [Sat, 17 Jan 2026 02:56:53 +0000 (10:56 +0800)] 
LoongArch: dts: loongson-2k2000: Add default interrupt controller address cells

Add missing address-cells 0 to the Local I/O, Extend I/O and PCH-PIC
Interrupt Controller node to silence W=1 warning:

  loongson-2k2000.dtsi:364.5-49: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map:
    Missing property '#address-cells' in node /bus@10000000/interrupt-controller@10000000, using 0 as fallback

Value '0' is correct because:
1. The LIO/EIO/PCH interrupt controller does not have children,
2. interrupt-map property (in PCI node) consists of five components and
   the fourth component "parent unit address", which size is defined by
   '#address-cells' of the node pointed to by the interrupt-parent
   component, is not used (=0)

Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoLoongArch: dts: loongson-2k1000: Add default interrupt controller address cells
Binbin Zhou [Sat, 17 Jan 2026 02:56:53 +0000 (10:56 +0800)] 
LoongArch: dts: loongson-2k1000: Add default interrupt controller address cells

Add missing address-cells 0 to the Local I/O interrupt controller node
to silence W=1 warning:

  loongson-2k1000.dtsi:498.5-55: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map:
    Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe01440, using 0 as fallback

Value '0' is correct because:
1. The Local I/O interrupt controller does not have children,
2. interrupt-map property (in PCI node) consists of five components and
   the fourth component "parent unit address", which size is defined by
   '#address-cells' of the node pointed to by the interrupt-parent
   component, is not used (=0)

Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoLoongArch: dts: loongson-2k0500: Add default interrupt controller address cells
Binbin Zhou [Sat, 17 Jan 2026 02:56:52 +0000 (10:56 +0800)] 
LoongArch: dts: loongson-2k0500: Add default interrupt controller address cells

Add missing address-cells 0 to the Local I/O and Extend I/O interrupt
controller node to silence W=1 warning:

  loongson-2k0500.dtsi:513.5-51: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@0,0:interrupt-map:
    Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe11600, using 0 as fallback

Value '0' is correct because:
1. The Local I/O & Extend I/O interrupt controller do not have children,
2. interrupt-map property (in PCI node) consists of five components and
   the fourth component "parent unit address", which size is defined by
   '#address-cells' of the node pointed to by the interrupt-parent
   component, is not used (=0)

Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoLoongArch: dts: Describe PCI sideband IRQ through interrupt-extended
Yao Zi [Sat, 17 Jan 2026 02:56:52 +0000 (10:56 +0800)] 
LoongArch: dts: Describe PCI sideband IRQ through interrupt-extended

SoC integrated peripherals on LS2K1000 and LS2K2000 could be discovered
as PCI devices, but require sideband interrupts to function, which are
previously described by interrupts and interrupt-parent properties.

However, pci/pci-device.yaml allows interrupts property to only specify
PCI INTx interrupts, not sideband ones. Convert these devices to use
interrupt-extended property, which describes sideband interrupts used by
PCI devices since dt-schema commit e6ea659d2baa ("schemas: pci-device:
Allow interrupts-extended for sideband interrupts"), eliminating
dtbs_check warnings.

Cc: stable@vger.kernel.org
Fixes: 30a5532a3206 ("LoongArch: dts: DeviceTree for Loongson-2K1000")
Signed-off-by: Yao Zi <me@ziyao.cc>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoLoongArch: Fix PMU counter allocation for mixed-type event groups
Lisa Robinson [Sat, 17 Jan 2026 02:56:43 +0000 (10:56 +0800)] 
LoongArch: Fix PMU counter allocation for mixed-type event groups

When validating a perf event group, validate_group() unconditionally
attempts to allocate hardware PMU counters for the leader, sibling
events and the new event being added.

This is incorrect for mixed-type groups. If a PERF_TYPE_SOFTWARE event
is part of the group, the current code still tries to allocate a hardware
PMU counter for it, which can wrongly consume hardware PMU resources and
cause spurious allocation failures.

Fix this by only allocating PMU counters for hardware events during group
validation, and skipping software events.

A trimmed down reproducer is as simple as this:

  #include <stdio.h>
  #include <assert.h>
  #include <unistd.h>
  #include <string.h>
  #include <sys/syscall.h>
  #include <linux/perf_event.h>

  int main (int argc, char *argv[])
  {
   struct perf_event_attr attr = { 0 };
   int fds[5];

   attr.disabled = 1;
   attr.exclude_kernel = 1;
   attr.exclude_hv = 1;
   attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
   PERF_FORMAT_TOTAL_TIME_RUNNING | PERF_FORMAT_ID | PERF_FORMAT_GROUP;
   attr.size = sizeof (attr);

   attr.type = PERF_TYPE_SOFTWARE;
   attr.config = PERF_COUNT_SW_DUMMY;
   fds[0] = syscall (SYS_perf_event_open, &attr, 0, -1, -1, 0);
   assert (fds[0] >= 0);

   attr.type = PERF_TYPE_HARDWARE;
   attr.config = PERF_COUNT_HW_CPU_CYCLES;
   fds[1] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
   assert (fds[1] >= 0);

   attr.type = PERF_TYPE_HARDWARE;
   attr.config = PERF_COUNT_HW_INSTRUCTIONS;
   fds[2] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
   assert (fds[2] >= 0);

   attr.type = PERF_TYPE_HARDWARE;
   attr.config = PERF_COUNT_HW_BRANCH_MISSES;
   fds[3] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
   assert (fds[3] >= 0);

   attr.type = PERF_TYPE_HARDWARE;
   attr.config = PERF_COUNT_HW_CACHE_REFERENCES;
   fds[4] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
   assert (fds[4] >= 0);

   printf ("PASSED\n");

   return 0;
  }

Cc: stable@vger.kernel.org
Fixes: b37042b2bb7c ("LoongArch: Add perf events support")
Signed-off-by: Lisa Robinson <lisa@bytefly.space>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 weeks agoMerge tag 'drm-fixes-2026-01-16' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Fri, 16 Jan 2026 21:48:18 +0000 (13:48 -0800)] 
Merge tag 'drm-fixes-2026-01-16' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Simona Vetter:
 "We've had nothing aside of a compiler noise fix until today, when the
  amd and drm-misc fixes showed up after Dave already went into weekend
  mode. So it's on me to push these out, since there's a bunch of
  important fixes in here I think that shouldn't be delayed for a week.

  Core Changes:
   - take gem lock when preallocating in gpuvm
   - add single byte read fallback to dp for broken usb-c adapters
   - remove duplicate drm_sysfb declarations

  Driver Changes:
   - i915: compiler noise fix
   - amdgpu/amdkfd: pile of fixes all over
   - vmwgfx:
      - v10 cursor regression fix
      - other fixes
   - rockchip:
      - waiting for cfgdone regression fix
      - other fixes
   - gud: fix oops on disconnect
   - simple-panel:
      - regression fix when connector is not set
      - fix for DataImage SCF0700C48GGU18
   - nouveau: cursor handling locking fix"

* tag 'drm-fixes-2026-01-16' of https://gitlab.freedesktop.org/drm/kernel: (33 commits)
  drm/amd/display: Add an hdmi_hpd_debounce_delay_ms module
  drm/amdgpu/userq: Fix fence reference leak on queue teardown v2
  drm/amdkfd: No need to suspend whole MES to evict process
  Revert "drm/amdgpu: don't attach the tlb fence for SI"
  drm/amdgpu: validate the flush_gpu_tlb_pasid()
  drm/amd/pm: fix smu overdrive data type wrong issue on smu 14.0.2
  drm/amd/display: Initialise backlight level values from hw
  drm/amd/display: Bump the HDMI clock to 340MHz
  drm/amd/display: Show link name in PSR status message
  drm/amdkfd: fix a memory leak in device_queue_manager_init()
  drm/amdgpu: make sure userqs are enabled in userq IOCTLs
  drm/amdgpu: Use correct address to setup gart page table for vram access
  Revert duplicate "drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces"
  drm/amd: Clean up kfd node on surprise disconnect
  drm/amdgpu: fix drm panic null pointer when driver not support atomic
  drm/amdgpu: Fix gfx9 update PTE mtype flag
  drm/sysfb: Remove duplicate declarations
  drm/nouveau/kms/nv50-: Assert we hold nv50_disp->lock in nv50_head_flush_*
  drm/nouveau/disp/nv50-: Set lock_core in curs507a_prepare
  drm/gud: fix NULL fb and crtc dereferences on USB disconnect
  ...