git.ipfire.org Git - thirdparty/kernel/linux.git/log

xfs: remove the unused bv field in struct xfs_gc_bio

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove xarray mark for reclaimable zones

We can easily check if there are any reclaimble zones by just looking
at the used counters in the reclaim buckets, so do that to free up the
xarray mark we currently use for this purpose.

Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove the xlog_in_core_t typedef

Switch the few remaining users to use the underlying struct directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove l_iclog_heads

l_iclog_heads is only used in one place and can be trivially derived
from l_iclog_hsize by a single shift operation. Remove it, and switch
the initialization of l_iclog_hsize to use struct_size so that it is
directly derived from the on-disk format definition.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove the xlog_rec_header_t typedef

There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove xlog_in_core_2_t

xlog_in_core_2_t is a really odd type, not only is it grossly
misnamed because it actually is an on-disk structure, but it also
reprents the actual on-disk structure in a rather odd way.

A v1 or small v2 log header look like:

+-----------------------+
|      xlog_record      |
+-----------------------+

while larger v2 log headers look like:

+-----------------------+
|      xlog_record      |
+-----------------------+
|  xlog_rec_ext_header  |
+-------------------+---+
|         .....         |
+-----------------------+
|  xlog_rec_ext_header  |
+-----------------------+

I.e., the ext headers are a variable sized array at the end of the
header.  So instead of declaring a union of xlog_rec_header,
xlog_rec_ext_header and padding to BBSIZE, add the proper padding to
struct struct xlog_rec_header and struct xlog_rec_ext_header, and
add a variable sized array of the latter to the former.  This also
exposes the somewhat unusual scope of the log checksums, which is
made explicitly now by adding proper padding and macro designating
the actual payload length.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove a very outdated comment from xlog_alloc_log

The xlog_iclog definition has been pretty standard for a while, so drop
this now rather misleading comment.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: cleanup xlog_alloc_log a bit

Remove the separate head variable, move the ic_datap initialization
up a bit where the context is more obvious and remove the duplicate
memset right after a zeroing memory allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: don't use xlog_in_core_2_t in struct xlog_in_core

Most accessed to the on-disk log record header are for the original
xlog_rec_header. Make that the main structure, and case for the
single remaining place using other union legs.

This prepares for removing xlog_in_core_2_t entirely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: add a on-disk log header cycle array accessor

Accessing the cycle arrays in the original log record header vs the
extended header is messy and duplicated in multiple places.

Add a xlog_cycle_data helper to abstract it out.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: add a XLOG_CYCLE_DATA_SIZE constant

The XLOG_HEADER_CYCLE_SIZE / BBSIZE expression is used a lot
in the log code, give it a symbolic name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: reduce ilock roundtrips in xfs_qm_vop_dqalloc

xfs_qm_vop_dqalloc only needs the (exclusive) ilock for attaching dquots
to the inode if not done so yet. All the other locks don't touch the inode
and don't need the ilock - the i_rwsem / iolock protects against changes
to the IDs while we are in a method, and the ilock would not help because
dropping it for the dqget calls would be racy anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: move xfs_dquot_tree calls into xfs_qm_dqget_cache_{lookup,insert}

These are the low-level functions that needs them, so localize the
(trivial) calculation of the radix tree root there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: move quota locking into xrep_quota_item

Drop two redundant lock roundtrips by not requiring q_lock to be held on
entry and return.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: move quota locking into xqcheck_commit_dquot

Drop two redundant lock roundtrips by not requiring q_lock to be held on
entry and return.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: move q_qlock locking into xqcheck_compare_dquot

Instead of having both callers do it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: move q_qlock locking into xchk_quota_item

This avoids a pointless roundtrip because ilock needs to be taken first.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: push q_qlock acquisition from xchk_dquot_iter to the callers.

There is no good reason to take q_qlock in xchk_dquot_iter, which just
provides a reference to the dquot.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove q_qlock locking in xfs_qm_scall_setqlim

q_type can't change for an existing dquot, so there is no need for
the locking here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: return the dquot unlocked from xfs_qm_dqget

There is no reason to lock the dquot in xfs_qm_dqget, which just acquires
a reference. Move the locking to the callers, or remove it in cases where
the caller instantly unlocks the dquot.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: fold xfs_qm_dqattach_one into xfs_qm_dqget_inode

xfs_qm_dqattach_one is a thin wrapper around xfs_qm_dqget_inode. Move
the extra asserts into xfs_qm_dqget_inode, drop the unneeded q_qlock
roundtrip and merge the two functions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: xfs_qm_dqattach_one is never called with a non-NULL *IO_idqpp

The caller already checks that, so replace the handling of this case with
an assert that it does not happen.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: consolidate q_qlock locking in xfs_qm_dqget and xfs_qm_dqget_inode

Move taking q_qlock from the cache lookup / insert helpers into the
main functions and do it just before returning to the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove xfs_qm_dqput and optimize dropping dquot references

With the new lockref-based dquot reference counting, there is no need to
hold q_qlock for dropping the reference. Make xfs_qm_dqrele the main
function to drop dquot references without taking q_qlock and convert all
callers of xfs_qm_dqput to unlock q_qlock and call xfs_qm_dqrele instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: use a lockref for the xfs_dquot reference count

The xfs_dquot structure currently uses the anti-pattern of using the
in-object lock that protects the content to also serialize reference
count updates for the structure, leading to a cumbersome free path.
This is partially papered over by the fact that we never free the dquot
directly but always through the LRU. Switch to use a lockref instead and
move the reference counter manipulations out of q_qlock.

To make this work, xfs_qm_flush_one and xfs_qm_flush_one are converted to
acquire a dquot reference while flushing to integrate with the lockref
"get if not dead" scheme.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: remove xfs_dqunlock and friends

There's really no point in wrapping the basic mutex operations. Remove
the wrapper to ease lock analysis annotations and make the code a litte
easier to read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: don't treat all radix_tree_insert errors as -EEXIST

Return other errors to the caller instead. Note that there really
shouldn't be any other errors because the entry is preallocated, but
if there were, we'd better return them instead of retrying forever.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: make qi_dquots a 64-bit value

qi_dquots counts all quotas in the file system, which can be up to
3 * UINT_MAX and overflow a 32-bit counter, but can't be negative.
Make qi_dquots a uint64_t, and saturate the value to UINT_MAX for
userspace reporting.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: don't leak a locked dquot when xfs_dquot_attach_buf fails

xfs_qm_quotacheck_dqadjust acquired the dquot through xfs_qm_dqget,
which means it owns a reference and holds q_qlock. Both need to
be dropped on an error exit.

Cc: <stable@vger.kernel.org> # v6.13
Fixes: ca378189fdfa ("xfs: convert quotacheck to attach dquot buffers")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: add a xfs_groups_to_rfsbs helper

Plus a rtgroup wrapper and use that to avoid overflows when converting
zone/rtg counts to block counts.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: free xfs_busy_extents structure when no RT extents are queued

kmemleak occasionally reports leaking xfs_busy_extents structure
from xfs_scrub calls after running xfs/528 (but attributed to following
tests), which seems to be caused by not freeing the xfs_busy_extents
structure when tr.queued is 0 and xfs_trim_rtgroup_extents breaks out
of the main loop. Free the structure in this case.

Fixes: a3315d11305f ("xfs: use rtgroup busy extent list for FITRIM")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: fix zone selection in xfs_select_open_zone_mru

xfs_select_open_zone_mru needs to pass XFS_ZONE_ALLOC_OK to
xfs_try_use_zone because we only want to tightly pack into zones of the
same or a compatible temperature instead of any available zone.

This got broken in commit 0301dae732a5 ("xfs: refactor hint based zone
allocation"), which failed to update this particular caller when
switching to an enum. xfs/638 sometimes, but not reliably fails due to
this change.

Fixes: 0301dae732a5 ("xfs: refactor hint based zone allocation")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: fix a rtgroup leak when xfs_init_zone fails

Drop the rtgrop reference when xfs_init_zone fails for a conventional
device.

Fixes: 4e4d52075577 ("xfs: add the zoned space allocator")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: fix various problems in xfs_atomic_write_cow_iomap_begin

I think there are several things wrong with this function:

A) xfs_bmapi_write can return a much larger unwritten mapping than what
   the caller asked for.  We convert part of that range to written, but
   return the entire written mapping to iomap even though that's
   inaccurate.

B) The arguments to xfs_reflink_convert_cow_locked are wrong -- an
   unwritten mapping could be *smaller* than the write range (or even
   the hole range).  In this case, we convert too much file range to
   written state because we then return a smaller mapping to iomap.

C) It doesn't handle delalloc mappings.  This I covered in the patch
   that I already sent to the list.

D) Reassigning count_fsb to handle the hole means that if the second
   cmap lookup attempt succeeds (due to racing with someone else) we
   trim the mapping more than is strictly necessary.  The changing
   meaning of count_fsb makes this harder to notice.

E) The tracepoint is kinda wrong because @length is mutated.  That makes
   it harder to chase the data flows through this function because you
   can't just grep on the pos/bytecount strings.

F) We don't actually check that the br_state = XFS_EXT_NORM assignment
   is accurate, i.e that the cow fork actually contains a written
   mapping for the range we're interested in

G) Somewhat inadequate documentation of why we need to xfs_trim_extent
   so aggressively in this function.

H) Not sure why xfs_iomap_end_fsb is used here, the vfs already clamped
   the write range to s_maxbytes.

Fix these issues, and then the atomic writes regressions in generic/760,
generic/617, generic/091, generic/263, and generic/521 all go away for
me.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5d249 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: fix delalloc write failures in software-provided atomic writes

With the 20 Oct 2025 release of fstests, generic/521 fails for me on
regular (aka non-block-atomic-writes) storage:

QA output created by 521
dowrite: write: Input/output error
LOG DUMP (8553 total operations):
1(  1 mod 256): SKIPPED (no operation)
2(  2 mod 256): WRITE    0x7e000 thru 0x8dfff (0x10000 bytes) HOLE
3(  3 mod 256): READ     0x69000 thru 0x79fff (0x11000 bytes)
4(  4 mod 256): FALLOC   0x53c38 thru 0x5e853 (0xac1b bytes) INTERIOR
5(  5 mod 256): COPY 0x55000 thru 0x59fff (0x5000 bytes) to 0x25000 thru 0x29fff
6(  6 mod 256): WRITE    0x74000 thru 0x88fff (0x15000 bytes)
7(  7 mod 256): ZERO     0xedb1 thru 0x11693 (0x28e3 bytes)

with a warning in dmesg from iomap about XFS trying to give it a
delalloc mapping for a directio write.  Fix the software atomic write
iomap_begin code to convert the reservation into a written mapping.
This doesn't fix the data corruption problems reported by generic/760,
but it's a start.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5d249 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: document another racy GC case in xfs_zoned_map_extent

Besides blocks being invalidated, there is another case when the original
mapping could have changed between querying the rmap for GC and calling
xfs_zoned_map_extent. Document it there as it took us quite some time
to figure out what is going on while developing the multiple-GC
protection fix.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: prevent gc from picking the same zone twice

When we are picking a zone for gc it might already be in the pipeline
which can lead to us moving the same data twice resulting in in write
amplification and a very unfortunate case where we keep on garbage
collecting the zone we just filled with migrated data stopping all
forward progress.

Fix this by introducing a count of on-going GC operations on a zone, and
skip any zone with ongoing GC when picking a new victim.

Fixes: 080d01c41 ("xfs: implement zoned garbage collection")
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Co-developed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: fix locking in xchk_nlinks_collect_dir

On a filesystem with parent pointers, xchk_nlinks_collect_dir walks both
the directory entries (data fork) and the parent pointers (attr fork) to
determine the correct link count.  Unfortunately I forgot to update the
lock mode logic to handle the case of a directory whose attr fork is in
btree format and has not yet been loaded *and* whose data fork doesn't
need loading.

This leads to a bunch of assertions from xfs/286 in xfs_iread_extents
because we only took ILOCK_SHARED, not ILOCK_EXCL.  You'd need the rare
happenstance of a directory with a large number of non-pptr extended
attributes set and enough memory pressure to cause the directory to be
evicted and partially reloaded from disk.

I /think/ this only started in 6.18-rc1 because I've started seeing OOM
errors with the maple tree slab using 70% of memory, and this didn't
happen in 6.17.  Yay dynamic systems!

Cc: stable@vger.kernel.org # v6.10
Fixes: 77ede5f44b0d86 ("xfs: walk directory parent pointers to determine backref count")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: loudly complain about defunct mount options

Apparently we can never deprecate mount options in this project, because
it will invariably turn out that some foolish userspace depends on some
behavior and break.  From Oleksandr Natalenko:

  In v6.18, the attr2 XFS mount option is removed. This may silently
  break system boot if the attr2 option is still present in /etc/fstab
  for rootfs.

  Consider Arch Linux that is being set up from scratch with / being
  formatted as XFS. The genfstab command that is used to generate
  /etc/fstab produces something like this by default:

  /dev/sda2 on / type xfs (rw,relatime,attr2,discard,inode64,logbufs=8,logbsize=32k,noquota)

  Once the system is set up and rebooted, there's no deprecation warning
  seen in the kernel log:

  # cat /proc/cmdline
  root=UUID=77b42de2-397e-47ee-a1ef-4dfd430e47e9 rootflags=discard rd.luks.options=discard quiet

  # dmesg | grep -i xfs
  [    2.409818] SGI XFS with ACLs, security attributes, realtime, scrub, repair, quota, no debug enabled
  [    2.415341] XFS (sda2): Mounting V5 Filesystem 77b42de2-397e-47ee-a1ef-4dfd430e47e9
  [    2.442546] XFS (sda2): Ending clean mount

  Although as per the deprecation intention, it should be there.

  Vlastimil (in Cc) suggests this is because xfs_fs_warn_deprecated()
  doesn't produce any warning by design if the XFS FS is set to be
  rootfs and gets remounted read-write during boot. This imposes two
  problems:

  1) a user doesn't see the deprecation warning; and
  2) with v6.18 kernel, the read-write remount fails because of unknown
     attr2 option rendering system unusable:

  systemd[1]: Switching root.
  systemd-remount-fs[225]: /usr/bin/mount for / exited with exit status 32.

  # mount -o rw /
  mount: /: fsconfig() failed: xfs: Unknown parameter 'attr2'.

  Thorsten (in Cc) suggested reporting this as a user-visible regression.

  From my PoV, although the deprecation is in place for 5 years already,
  it may not be visible enough as the warning is not emitted for rootfs.
  Considering the amount of systems set up with XFS on /, this may
  impose a mass problem for users.

  Vlastimil suggested making attr2 option a complete noop instead of
  removing it.

IOWs, the initrd mounts the root fs with (I assume) no mount options,
and mount -a remounts with whatever options are in fstab.  However,
XFS doesn't complain about deprecated mount options during a remount, so
technically speaking we were not warning all users in all combinations
that they were heading for a cliff.

Gotcha!!

Now, how did 'attr2' get slurped up on so many systems?  The old code
would put that in /proc/mounts if the filesystem happened to be in attr2
mode, even if user hadn't mounted with any such option.  IOWs, this is
because someone thought it would be a good idea to advertise system
state via /proc/mounts.

The easy way to fix this is to reintroduce the four mount options but
map them to a no-op option that ignores them, and hope that nobody's
depending on attr2 to appear in /proc/mounts.  (Hint: use the fsgeometry
ioctl).  But we've learned our lesson, so complain as LOUDLY as possible
about the deprecation.

Lessons learned:

1. Don't expose system state via /proc/mounts; the only strings that
    ought to be there are options *explicitly* provided by the user.
2. Never tidy, it's not worth the stress and irritation.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: stable@vger.kernel.org # v6.18-rc1
Fixes: b9a176e54162f8 ("xfs: remove deprecated mount options")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: always warn about deprecated mount options

The deprecation of the 'attr2' mount option in 6.18 wasn't entirely
successful because nobody noticed that the kernel never printed a
warning about attr2 being set in fstab if the only xfs filesystem is the
root fs; the initramfs mounts the root fs with no mount options; and the
init scripts only conveyed the fstab options by remounting the root fs.

Fix this by making it complain all the time.

Cc: stable@vger.kernel.org # v5.13
Fixes: 92cf7d36384b99 ("xfs: Skip repetitive warnings about mount options")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: don't set bt_nr_sectors to a negative number

xfs_daddr_t is a signed type, which means that xfs_buf_map_verify is
using a signed comparison. This causes problems if bt_nr_sectors is
never overridden (e.g. in the case of an xfbtree for rmap btree repairs)
because even daddr 0 can't pass the verifier test in that case.

Define an explicit max constant and set the initial bt_nr_sectors to a
positive value.

Found by xfs/422.

Cc: stable@vger.kernel.org # v6.18-rc1
Fixes: 42852fe57c6d2a ("xfs: track the number of blocks in each buftarg")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: don't use __GFP_NOFAIL in xfs_init_fs_context

With enough debug options enabled, struct xfs_mount is larger
than 4k and thus NOFAIL allocations won't work for it.

xfs_init_fs_context is early in the mount process, and if we really
are out of memory there we'd better give up ASAP anyway.

Fixes: 7b77b46a6137 ("xfs: use kmem functions for struct xfs_mount")
Reported-by: syzbot+359a67b608de1ef72f65@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: cache open zone in inode->i_private

The MRU cache for open zones is unfortunately still not ideal, as it can
time out pretty easily when doing heavy I/O to hard disks using up most
or all open zones. One option would be to just increase the timeout,
but while looking into that I realized we're just better off caching it
indefinitely as there is no real downside to that once we don't hold a
reference to the cache open zone.

So switch the open zone to RCU freeing, and then stash the last used
open zone into inode->i_private. This helps to significantly reduce
fragmentation by keeping I/O localized to zones for workloads that
write using many open files to HDD.

Fixes: 4e4d52075577 ("xfs: add the zoned space allocator")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: avoid busy loops in GCD

When GCD has no new work to handle, but read, write or reset commands
are outstanding, it currently busy loops, which is a bit suboptimal,
and can lead to softlockup warnings in case of stuck commands.

Change the code so that the task state is only set to running when work
is performed, which looks a bit tricky due to the design of the
reading/writing/resetting lists that contain both in-flight and finished
commands.

Fixes: 080d01c41d44 ("xfs: implement zoned garbage collection")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: XFS_ONLINE_SCRUB_STATS should depend on DEBUG_FS

Currently, XFS_ONLINE_SCRUB_STATS selects DEBUG_FS. However, DEBUG_FS
is meant for debugging, and people may want to disable it on production
systems. Since commit 0ff51a1fd786f47b ("xfs: enable online fsck by
default in Kconfig")), XFS_ONLINE_SCRUB_STATS is enabled by default,
forcing DEBUG_FS enabled too.

Fix this by replacing the selection of DEBUG_FS by a dependency on
DEBUG_FS, which is what most other options controlling the gathering and
exposing of statistics do.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: do not tightly pack-write large files

When using a zoned realtime device, tightly packing of data blocks
belonging to multiple closed files into the same realtime group (RTG)
is very efficient at improving write performance. This is especially
true with SMR HDDs as this can reduce, and even suppress, disk head
seeks.

However, such tight packing does not make sense for large files that
require at least a full RTG. If tight packing placement is applied for
such files, the VM writeback thread switching between inodes result in
the large files to be fragmented, thus increasing the garbage collection
penalty later when the RTG needs to be reclaimed.

This problem can be avoided with a simple heuristic: if the size of the
inode being written back is at least equal to the RTG size, do not use
tight-packing. Modify xfs_zoned_pack_tight() to always return false in
this case.

With this change, a multi-writer workload writing files of 256 MB on a
file system backed by an SMR HDD with 256 MB zone size as a realtime
device sees all files occupying exactly one RTG (i.e. one device zone),
thus completely removing the heavy fragmentation observed without this
change.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: Improve CONFIG_XFS_RT Kconfig help

Improve the description of the XFS_RT configuration option to document
that this option is required for zoned block devices.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

Linux 6.18-rc2

Merge tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

- Make sure the check for lost pelt idle time is done unconditionally
   to have correct lost idle time accounting

- Stop the deadline server task before a CPU goes offline

* tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix pelt lost idle time detection
  sched/deadline: Stop dl_server before CPU goes offline

Merge tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

- Make sure perf reporting works correctly in setups using
   overlayfs or FUSE

- Move the uprobe optimization to a better location logically

* tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix MMAP2 event device with backing files
  perf/core: Fix MMAP event path names with backing files
  perf/core: Fix address filter match with backing files
  uprobe: Move arch_uprobe_optimize right after handlers execution

Merge tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

- Reset the why-the-system-rebooted register on AMD to avoid stale bits
   remaining from previous boots

- Add a missing barrier in the TLB flushing code to prevent erroneously
   not flushing a TLB generation

- Make sure cpa_flush() does not overshoot when computing the end range
   of a flush region

- Fix resctrl bandwidth counting on AMD systems when the amount of
   monitoring groups created exceeds the number the hardware can track

* tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Prevent reset reasons from being retained across reboot
  x86/mm: Fix SMP ordering in switch_mm_irqs_off()
  x86/mm: Fix overflow in __cpa_addr()
  x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID

Merge tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull rustfmt fixes from Miguel Ojeda:
"Rust 'rustfmt' cleanup

  'rustfmt', by default, formats imports in a way that is prone to
  conflicts while merging and rebasing, since in some cases it condenses
  several items into the same line.

  Document in our guidelines that we will handle this for the moment
  with the trailing empty comment workaround and make the tree
  'rustfmt'-clean again"

* tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: bitmap: fix formatting
  rust: cpufreq: fix formatting
  rust: alloc: employ a trailing comment to keep vertical layout
  docs: rust: add section on imports formatting

Merge tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fix from Jarkko Sakkinen:
"Correct the state transitions for ARM FF-A to match the spec and how
tpm_crb behaves on other platforms"

* tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm_crb: Add idle support for the Arm FF-A start method

Merge tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

- Search for MSI Capability with correct ID to fix an MSI regression on
   platforms with Cadence IP (Hans Zhang)

- Revert early bridge resource set up to fix resource assignment
   failures that broke at least alpha boot and Snapdragon ath12k WiFi
   (Ilpo Järvinen)

- Implement VMD .irq_startup()/.irq_shutdown() to fix IRQ issues that
   caused boot crashes and broken devices below VMD (Inochi Amaoto)

- Select CONFIG_SCREEN_INFO on X86 to fix black screen on boot when
   SCREEN_INFO not selected (Mario Limonciello)

* tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/VGA: Select SCREEN_INFO on X86
  PCI: vmd: Override irq_startup()/irq_shutdown() in vmd_init_dev_msi_info()
  PCI: Revert early bridge resource set up
  PCI: cadence: Search for MSI Capability with correct ID

Merge tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull Compute Express Link fixes from Dave Jiang:
"A small collection of CXL fixes. In addition to some misc fixes for
  the CXL subsystem, a number of fixes for CXL extended linear cache
  support are included to make it functional again.

   - Avoid missing port component registers setup due to dport
     enumeration failure

   - Add check for no entries in cxl_feature_info to address accessing
     invalid pointer.

   - Use %pa printk format to emit resource_size_t in
     validate_region_offset()

  CXL extended linear cache support fixes:

   - Fix setup of memory resource in cxl_acpi_set_cache_size()

   - Set range param for region_res_match_cxl_range() as const
     (addresses a compile warning for match_region_by_range() fix)

   - Fix match_region_by_range() to use region_res_match_cxl_range()

   - Subtract to find an hpa_alias0 in cxl_poison events to correct the
     alias math calculation"

* tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/trace: Subtract to find an hpa_alias0 in cxl_poison events
  cxl/region: Use %pa printk format to emit resource_size_t
  cxl: Fix match_region_by_range() to use region_res_match_cxl_range()
  cxl: Set range param for region_res_match_cxl_range() as const
  cxl/acpi: Fix setup of memory resource in cxl_acpi_set_cache_size()
  cxl/features: Add check for no entries in cxl_feature_info
  cxl/port: Avoid missing port component registers setup

Merge tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

- fix for sticky fingers handling in hid-multitouch (Benjamin
   Tissoires)

- fix for reporting of 0 battery levels (Dmitry Torokhov)

- build fix for hid-haptic in certain configurations (Jonathan Denose)

- improved probe and avoiding spamming kernel log by hid-nintendo
   (Vicki Pfau)

- fix for OOB in hid-cp2112 (Deepak Sharma)

- interrupt handling fix for intel-thc-hid (Even Xu)

- a couple of new device IDs and device-specific quirks

* tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-hidpp: Add HIDPP_QUIRK_RESET_HI_RES_SCROLL
  selftests/hid: add tests for missing release on the Dell Synaptics
  HID: multitouch: fix sticky fingers
  HID: multitouch: fix name of Stylus input devices
  HID: hid-input: only ignore 0 battery events for digitizers
  HID: hid-debug: Fix spelling mistake "Rechargable" -> "Rechargeable"
  HID: Kconfig: Fix build error from CONFIG_HID_HAPTIC
  HID: nintendo: Rate limit IMU compensation message
  HID: nintendo: Wait longer for initial probe
  HID: core: Add printk_ratelimited variants to hid_warn() etc
  HID: quirks: Add ALWAYS_POLL quirk for VRS R295 steering wheel
  HID: quirks: avoid Cooler Master MM712 dongle wakeup bug
  HID: cp2112: Add parameter validation to data length
  HID: intel-thc-hid: intel-quickspi: Add ARL PCI Device Id's
  HID: intel-thc-hid: Intel-quickspi: switch first interrupt from level to edge detection
  HID: intel-thc-hid: intel-quicki2c: Fix wrong type casting

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

- Replace bpf_map_kmalloc_node() with kmalloc_nolock() to fix kmemleak
   imbalance in tracking of bpf_async_cb structures (Alexei Starovoitov)

- Make selftests/bpf arg_parsing.c more robust to errors (Andrii
   Nakryiko)

- Fix redefinition of 'off' as different kind of symbol when I40E
   driver is builtin (Brahmajit Das)

- Do not disable preemption in bpf_test_run (Sahil Chandna)

- Fix memory leak in __lookup_instance error path (Shardul Bankar)

- Ensure test data is flushed to disk before reading it (Xing Guo)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Fix redefinition of 'off' as different kind of symbol
  bpf: Do not disable preemption in bpf_test_run().
  bpf: Fix memory leak in __lookup_instance error path
  selftests: arg_parsing: Ensure data is flushed to disk before reading.
  bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.
  selftests/bpf: make arg_parsing.c more robust to crashes
  bpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error path

Merge tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat

Pull exfat fixes from Namjae Jeon:

- Fix out-of-bounds in FS_IOC_SETFSLABEL

- Add validation for stream entry size to prevent infinite loop

* tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: fix out-of-bounds in exfat_nls_to_ucs2()
exfat: fix improper check of dentry.stream.valid_size

Merge tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:

- Fix for FlexFiles mirror->dss allocation

- Apply delay_retrans to async operations

- Check if suid/sgid is cleared after a write when needed

- Fix setting the state renewal timer for early mounts after a reboot

* tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS4: Fix state renewals missing after boot
  NFS: check if suid/sgid was cleared after a write as needed
  NFS4: Apply delay_retrans to async operations
  NFSv4/flexfiles: fix to allocate mirror->dss before use

Merge tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
"smb client fixes, security and smbdirect improvements, and some minor cleanup:

   - Important OOB DFS fix

   - Fix various potential tcon refcount leaks

   - smbdirect (RDMA) fixes (following up from test event a few weeks
     ago):

      - Fixes to improve and simplify handling of memory lifetime of
        smbdirect_mr_io structures, when a connection gets disconnected

      - Make sure we really wait to reach SMBDIRECT_SOCKET_DISCONNECTED
        before destroying resources

      - Make sure the send/recv submission/completion queues are large
        enough to avoid ib_post_send() from failing under pressure

   - convert cifs.ko to use the recommended crypto libraries (instead of
     crypto_shash), this also can improve performance

   - Three small cleanup patches"

* tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (24 commits)
  smb: client: Consolidate cmac(aes) shash allocation
  smb: client: Remove obsolete crypto_shash allocations
  smb: client: Use HMAC-MD5 library for NTLMv2
  smb: client: Use MD5 library for SMB1 signature calculation
  smb: client: Use MD5 library for M-F symlink hashing
  smb: client: Use HMAC-SHA256 library for SMB2 signature calculation
  smb: client: Use HMAC-SHA256 library for key generation
  smb: client: Use SHA-512 library for SMB3.1.1 preauth hash
  cifs: parse_dfs_referrals: prevent oob on malformed input
  smb: client: Fix refcount leak for cifs_sb_tlink
  smb: client: let smbd_destroy() wait for SMBDIRECT_SOCKET_DISCONNECTED
  smb: move some duplicate definitions to common/cifsglob.h
  smb: client: let destroy_mr_list() keep smbdirect_mr_io memory if registered
  smb: client: let destroy_mr_list() call ib_dereg_mr() before ib_dma_unmap_sg()
  smb: client: call ib_dma_unmap_sg if mr->sgt.nents is not 0
  smb: client: improve logic in smbd_deregister_mr()
  smb: client: improve logic in smbd_register_mr()
  smb: client: improve logic in allocate_mr_list()
  smb: client: let destroy_mr_list() remove locked from the list
  smb: client: let destroy_mr_list() call list_del(&mr->list)
  ...

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"ARM:

   - Fix the handling of ZCR_EL2 in NV VMs

   - Pick the correct translation regime when doing a PTW on the back of
     a SEA

   - Prevent userspace from injecting an event into a vcpu that isn't
     initialised yet

   - Move timer save/restore to the sysreg handling code, fixing EL2
     timer access in the process

   - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug

   - Fix trapping configuration when the host isn't GICv3

   - Improve the detection of HCR_EL2.E2H being RES1

   - Drop a spurious 'break' statement in the S1 PTW

   - Don't try to access SPE when owned by EL3

  Documentation updates:

   - Document the failure modes of event injection

   - Document that a GICv3 guest can be created on a GICv5 host with
     FEAT_GCIE_LEGACY

  Selftest improvements:

   - Add a selftest for the effective value of HCR_EL2.AMO

   - Address build warning in the timer selftest when building with
     clang

   - Teach irqfd selftests about non-x86 architectures

   - Add missing sysregs to the set_id_regs selftest

   - Fix vcpu allocation in the vgic_lpi_stress selftest

   - Correctly enable interrupts in the vgic_lpi_stress selftest

  x86:

   - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test
     for the bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return
     -EAGAIN if userspace deletes/moves memslot during prefault")

   - Don't try to get PMU capabilities from perf when running a CPU with
     hybrid CPUs/PMUs, as perf will rightly WARN.

  guest_memfd:

   - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a
     more generic KVM_CAP_GUEST_MEMFD_FLAGS

   - Add a guest_memfd INIT_SHARED flag and require userspace to
     explicitly set said flag to initialize memory as SHARED,
     irrespective of MMAP.

     The behavior merged in 6.18 is that enabling mmap() implicitly
     initializes memory as SHARED, which would result in an ABI
     collision for x86 CoCo VMs as their memory is currently always
     initialized PRIVATE.

   - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with
     private memory, to enable testing such setups, i.e. to hopefully
     flush out any other lurking ABI issues before 6.18 is officially
     released.

   - Add testcases to the guest_memfd selftest to cover guest_memfd
     without MMAP, and host userspace accesses to mmap()'d private
     memory"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
  arm64: Revamp HCR_EL2.E2H RES1 detection
  KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
  KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
  KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
  KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
  KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit
  KVM: arm64: Kill leftovers of ad-hoc timer userspace access
  KVM: arm64: Fix WFxT handling of nested virt
  KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
  KVM: arm64: Make timer_set_offset() generally accessible
  KVM: arm64: Replace timer context vcpu pointer with timer_id
  KVM: arm64: Introduce timer_context_to_vcpu() helper
  KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
  Documentation: KVM: Update GICv3 docs for GICv5 hosts
  KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
  KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress
  KVM: arm64: selftests: Allocate vcpus with correct size
  ...

Merge tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Madhavan Srinivasan:

- Fix to handle NULL pointer dereference at irq domain teardown

- Fix for handling extraction of struct xive_irq_data

- Fix to skip parameter area allocation when fadump disabled

Thanks to Ganesh Goudar, Hari Bathini, Nam Cao, Ritesh Harjani (IBM),
Sourabh Jain, and Venkat Rao Bagalkote,

* tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/fadump: skip parameter area allocation when fadump is disabled
  powerpc, ocxl: Fix extraction of struct xive_irq_data
  powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown

Merge tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

- Fixes for two bugs that can be triggered when debugging options are
   enabled (Hao Ge, Vlastimil Babka)

* tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slab: reset slab->obj_ext when freeing and it is OBJEXTS_ALLOC_FAIL
  slab: fix clearing freelist in free_deferred_objects()

tpm_crb: Add idle support for the Arm FF-A start method

According to the CRB over FF-A specification [1], a TPM that implements
the ABI must comply with the TCG PTP specification. This requires support
for the Idle and Ready states.

This patch implements CRB control area requests for goIdle and
cmdReady on FF-A based TPMs.

The FF-A message used to notify the TPM of CRB updates includes a
locality parameter, which provides a hint to the TPM about which
locality modified the CRB. This patch adds a locality parameter
to __crb_go_idle() and __crb_cmd_ready() to support this.

[1] https://developer.arm.com/documentation/den0138/latest/

Signed-off-by: Stuart Yoder <stuart.yoder@arm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>

Merge tag 'kvm-x86-fixes-6.18-rc2' of https://github.com/kvm-x86/linux into HEAD

KVM x86 fixes for 6.18:

- Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
   bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return -EAGAIN if userspace
   deletes/moves memslot during prefault")

- Don't try to get PMU capabbilities from perf when running a CPU with hybrid
   CPUs/PMUs, as perf will rightly WARN.

- Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
   generic KVM_CAP_GUEST_MEMFD_FLAGS

- Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
   said flag to initialize memory as SHARED, irrespective of MMAP.  The
   behavior merged in 6.18 is that enabling mmap() implicitly initializes
   memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
   as their memory is currently always initialized PRIVATE.

- Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
   memory, to enable testing such setups, i.e. to hopefully flush out any
   other lurking ABI issues before 6.18 is officially released.

- Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
   and host userspace accesses to mmap()'d private memory.

Merge tag 'kvmarm-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.18, take #1

Improvements and bug fixes:

- Fix the handling of ZCR_EL2 in NV VMs
  (20250926194108.84093-1-oliver.upton@linux.dev)

- Pick the correct translation regime when doing a PTW on
  the back of a SEA (20250926224246.731748-1-oliver.upton@linux.dev)

- Prevent userspace from injecting an event into a vcpu that isn't
  initialised yet (20250930085237.108326-1-oliver.upton@linux.dev)

- Move timer save/restore to the sysreg handling code, fixing EL2 timer
  access in the process (20250929160458.3351788-1-maz@kernel.org)

- Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
  (20250924235150.617451-1-oliver.upton@linux.dev)

- Fix trapping configuration when the host isn't GICv3
  (20251007160704.1673584-1-sascha.bischoff@arm.com)

- Improve the detection of HCR_EL2.E2H being RES1
  (20251009121239.29370-1-maz@kernel.org)

- Drop a spurious 'break' statement in the S1 PTW
  (20250930135621.162050-1-osama.abdelkader@gmail.com)

- Don't try to access SPE when owned by EL3
  (20251010174707.1684200-1-mukesh.ojha@oss.qualcomm.com)

Documentation updates:

- Document the failure modes of event injection
  (20250930233620.124607-1-oliver.upton@linux.dev)

- Document that a GICv3 guest can be created on a GICv5 host
  with FEAT_GCIE_LEGACY (20251007154848.1640444-1-sascha.bischoff@arm.com)

Selftest improvements:

- Add a selftest for the effective value of HCR_EL2.AMO
  (20250926224454.734066-1-oliver.upton@linux.dev)

- Address build warning in the timer selftest when building
  with clang (20250926155838.2612205-1-seanjc@google.com)

- Teach irq_fd selftests about non-x86 architectures
  (20250930193301.119859-1-oliver.upton@linux.dev)

- Add missing sysregs to the set_id_regs selftest
  (20251012154352.61133-1-zenghui.yu@linux.dev)

- Fix vcpu allocation in the vgic_lpi_stress selftest
  (20251008154520.54801-1-zenghui.yu@linux.dev)

- Correctly enable interrupts in the vgic_lpi_stress selftest
  (20251007195254.260539-1-oliver.upton@linux.dev)

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

- Explicitly encode the XZR register if the value passed to
   write_sysreg_s() is 0.

   The GIC CDEOI instruction is encoded as a system register write with
   XZR as the source register. However, clang does not honour the "Z"
   register constraint, leading to incorrect code generation

- Ensure the interrupts (DAIF.IF) are unmasked when completing
   single-step of a suspended breakpoint before calling
   exit_to_user_mode().

   With pseudo-NMIs, interrupts are (additionally) masked at the PMR_EL1
   register, handled by local_irq_*()

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: debug: always unmask interrupts in el0_softstp()
  arm64/sysreg: Fix GIC CDEOI instruction encoding

Merge tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:

- Disable CFI with Rust for any platform other than x86 and ARM64

- Keep task mm_cpumasks up-to-date to avoid triggering M-mode firmware
   warnings if the kernel tries to send an IPI to an offline CPU

- Improve kprobe address validation performance and avoid desyncs
   (following x86)

- Avoid duplicate device probes by avoiding DT hardware probing when
   ACPI is enabled in early boot

- Use the correct set of dependencies for
   CONFIG_ARCH_HAS_ELF_CORE_EFLAGS, avoiding an allnoconfig warning

- Fix a few other minor issues

* tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: kprobes: convert one final __ASSEMBLY__ to __ASSEMBLER__
  riscv: Respect dependencies of ARCH_HAS_ELF_CORE_EFLAGS
  riscv: acpi: avoid errors caused by probing DT devices when ACPI is used
  riscv: kprobes: Fix probe address validation
  riscv: entry: fix typo in comment 'instruciton' -> 'instruction'
  RISC-V: clear hot-unplugged cores from all task mm_cpumasks to avoid rfence errors
  riscv: kgdb: Ensure that BUFMAX > NUMREGBYTES
  rust: cfi: only 64-bit arm and x86 support CFI_CLANG

selftests/bpf: Fix redefinition of 'off' as different kind of symbol

This fixes the following build error

   CLNG-BPF [test_progs] verifier_global_ptr_args.bpf.o
progs/verifier_global_ptr_args.c:228:5: error: redefinition of 'off' as
different kind of symbol
   228 | u32 off;
       |     ^

The symbol 'off' was previously defined in
tools/testing/selftests/bpf/tools/include/vmlinux.h, which includes an
enum i40e_ptp_gpio_pin_state from
drivers/net/ethernet/intel/i40e/i40e_ptp.c:

enum i40e_ptp_gpio_pin_state {
end = -2,
invalid = -1,
off = 0,
in_A = 1,
in_B = 2,
out_A = 3,
out_B = 4,
};

This enum is included when CONFIG_I40E is enabled. As of commit
032676ff8217 ("LoongArch: Update Loongson-3 default config file"),
CONFIG_I40E is set in the defconfig, which leads to the conflict.

Renaming the local variable avoids the redefinition and allows the
build to succeed.

Suggested-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20251017171551.53142-1-listout@listout.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Do not disable preemption in bpf_test_run().

The timer mode is initialized to NO_PREEMPT mode by default,
this disables preemption and force execution in atomic context
causing issue on PREEMPT_RT configurations when invoking
spin_lock_bh(), leading to the following warning:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6107, name: syz.0.17
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
Preemption disabled at:
[<ffffffff891fce58>] bpf_test_timer_enter+0xf8/0x140 net/bpf/test_run.c:42

Fix this, by removing NO_PREEMPT/NO_MIGRATE mode check.
Also, the test timer context no longer needs explicit calls to
migrate_disable()/migrate_enable() with rcu_read_lock()/rcu_read_unlock().
Use helpers rcu_read_lock_dont_migrate() and rcu_read_unlock_migrate()
instead.

Reported-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1f1fbecb9413cdbfbef8
Suggested-by: Yonghong Song <yonghong.song@linux.dev>
Suggested-by: Menglong Dong <menglong.dong@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
Co-developed-by: Brahmajit Das <listout@listout.xyz>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Signed-off-by: Sahil Chandna <chandna.sahil@gmail.com>
Link: https://lore.kernel.org/r/20251014185635.10300-1-chandna.sahil@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

arm64: debug: always unmask interrupts in el0_softstp()

We intend that EL0 exception handlers unmask all DAIF exceptions
before calling exit_to_user_mode().

When completing single-step of a suspended breakpoint, we do not call
local_daif_restore(DAIF_PROCCTX) before calling exit_to_user_mode(),
leaving all DAIF exceptions masked.

When pseudo-NMIs are not in use this is benign.

When pseudo-NMIs are in use, this is unsound. At this point interrupts
are masked by both DAIF.IF and PMR_EL1, and subsequent irq flag
manipulation may not work correctly. For example, a subsequent
local_irq_enable() within exit_to_user_mode_loop() will only unmask
interrupts via PMR_EL1 (leaving those masked via DAIF.IF), and
anything depending on interrupts being unmasked (e.g. delivery of
signals) will not work correctly.

This was detected by CONFIG_ARM64_DEBUG_PRIORITY_MASKING.

Move the call to `try_step_suspended_breakpoints()` outside of the check
so that interrupts can be unmasked even if we don't call the step handler.

Fixes: 0ac7584c08ce ("arm64: debug: split single stepping exception entry")
Cc: <stable@vger.kernel.org> # 6.17
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: added Mark's rewritten commit log and some whitespace]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64/sysreg: Fix GIC CDEOI instruction encoding

The GIC CDEOI system instruction requires the Rt field to be set to 0b11111
otherwise the instruction behaviour becomes CONSTRAINED UNPREDICTABLE.

Currenly, its usage is encoded as a system register write, with a constant
0 value:

write_sysreg_s(0, GICV5_OP_GIC_CDEOI)

While compiling with GCC, the 0 constant value, through these asm
constraints and modifiers ('x' modifier and 'Z' constraint combo):

asm volatile(__msr_s(r, "%x0") : : "rZ" (__val));

forces the compiler to issue the XZR register for the MSR operation (ie
that corresponds to Rt == 0b11111) issuing the right instruction encoding.

Unfortunately LLVM does not yet understand that modifier/constraint
combo so it ends up issuing a different register from XZR for the MSR
source, which in turns means that it encodes the GIC CDEOI instruction
wrongly and the instruction behaviour becomes CONSTRAINED UNPREDICTABLE
that we must prevent.

Add a conditional to write_sysreg_s() macro that detects whether it
is passed a constant 0 value and issues an MSR write with XZR as source
register - explicitly doing what the asm modifier/constraint is meant to
achieve through constraints/modifiers, fixing the LLVM compilation issue.

Fixes: 7ec80fb3f025 ("irqchip/gic-v5: Add GICv5 PPI support")
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Cc: Sascha Bischoff <sascha.bischoff@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Merge tag 'io_uring-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:

- Revert of a change that went into an older kernel, and which has been
   reported to cause a regression for some write workloads on LVM while
   a snapshop is being created

- Fix a regression from this merge window, where some compilers (and/or
   certain .config options) would cause an earlier evaluations of a
   dereference which would then cause a NULL pointer dereference.

   I was only able to reproduce this with OPTIMIZE_FOR_SIZE=y, but David
   Howells hit it with just KASAN enabled. Depending on how things
   inlined, this makes sense

- Fix for a missing lock around a mem region unregistration

- Fix for ring resizing with the same placement after resize

* tag 'io_uring-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/rw: check for NULL io_br_sel when putting a buffer
  io_uring: fix unexpected placement on same size resizing
  io_uring: protect mem region deregistration
  Revert "io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()"

Merge tag 'block-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

- NVMe pull request via Keith:
     - iostats accounting fixed on multipath retries (Amit)
     - secure concatenation response fixup (Martin)
     - tls partial record fixup (Wilfred)

- Fix for a lockdep reported issue with the elevator lock and
   blk group frozen operations

- Fix for a regression in this merge window, where updating
   'nr_requests' would not do the right thing for queues with
   shared tags

* tag 'block-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  nvme/tcp: handle tls partially sent records in write_space()
  block: Remove elevator_lock usage from blkg_conf frozen operations
  blk-mq: fix stale tag depth for shared sched tags in blk_mq_update_nr_requests()
  nvme-auth: update sc_c in host response
  nvme-multipath: Skip nr_active increments in RETRY disposition

Merge tag 'mmc-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull mmc cleanup from Ulf Hansson:
"Move rpmb_frame struct and constants to rpmb common header

  This helps us to avoid sharing an immutable branch between our git
  trees. I was planning to send it before rc1, but I didn't make it"

* tag 'mmc-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  rpmb: move rpmb_frame struct and constants to common header

Merge tag 'sound-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A collection of small fixes. All changes are rather boring
  device-specific fixes and quirks:

   - A few fixes for missing NULL checks

   - ASoC NAU8821 fixes for jack and irq handling

   - Various fixes for ASoC TAS2781, IDT821034, sc8280xp, max9809x,
     wcd938x, and SoundWire

   - Usual HD-audio and USB-audio quirks"

* tag 'sound-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (27 commits)
  ALSA: hda/realtek: Fix mute led for HP Omen 17-cb0xxx
  ALSA: usb-audio: fix vendor quirk for Logitech H390
  ALSA: usb-audio: add volume quirks for MS LifeChat LX-3000
  ASoC: amd/sdw_utils: avoid NULL deref when devm_kasprintf() fails
  ASoC: max98090/91: fixed max98091 ALSA widget powering up/down
  ASoC: dt-bindings: Add compatible string fsl,imx-audio-tlv320
  ASoC: codecs: wcd938x-sdw: remove redundant runtime pm calls
  ASoC: sdw_utils: add rt1321 part id to codec_info_list
  ALSA: usb-audio: Fix NULL pointer deference in try_to_register_card
  ALSA: firewire: amdtp-stream: fix enum kernel-doc warnings
  ALSA: usb-audio: add mixer_playback_min_mute quirk for Logitech H390
  ASoC: nau8821: Avoid unnecessary blocking in IRQ handler
  ASoC: nau8821: Add DMI quirk to bypass jack debounce circuit
  ASoC: nau8821: Consistently clear interrupts before unmasking
  ASoC: nau8821: Generalize helper to clear IRQ status
  ASoC: nau8821: Cancel jdet_work before handling jack ejection
  ASoC: codecs: Fix gain setting ranges for Renesas IDT821034 codec
  ASoC: tas2781: Update ti,tas2781.yaml for adding tas58xx
  ASoC: tas2781: Support more newly-released amplifiers tas58xx in the driver
  ASoC: qcom: sc8280xp: Add support for QCS615
  ...

Merge tag 'drm-fixes-2025-10-17' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"As per usual xe/amdgpu are the leaders, with some i915 and then a
  bunch of scattered fixes. There are a bunch of stability fixes for
  some older amdgpu cards.

  draw:
   - Avoid color truncation

  gpuvm:
   - Avoid kernel-doc warning

  sched:
   - Avoid double free

  i915:
   - Skip GuC communication warning if reset is in progress
   - Couple frontbuffer related fixes
   - Deactivate PSR only on LNL and when selective fetch enabled

  xe:
   - Increase global invalidation timeout to handle some workloads
   - Fix NPD while evicting BOs in an array of VM binds
   - Fix resizable BAR to account for possibly needing to move BARs
     other than the LMEMBAR
   - Fix error handling in xe_migrate_init()
   - Fix atomic fault handling with mixed mappings or if the page is
     already in VRAM
   - Enable media samplers power gating for platforms before Xe2
   - Fix de-registering exec queue from GuC when unbinding
   - Ensure data migration to system if indicated by madvise with SVM
   - Fix kerneldoc for kunit change
   - Always account for cacheline alignment on migration
   - Drop bogus assertion on eviction

  amdgpu:
   - Backlight fix
   - SI fixes
   - CIK fix
   - Make CE support debug only
   - IP discovery fix
   - Ring reset fixes
   - GPUVM fault memory barrier fix
   - Drop unused structures in amdgpu_drm.h
   - JPEG debugfs fix
   - VRAM handling fixes for GPUs without VRAM
   - GC 12 MES fixes

  amdkfd:
   - MES fix

  ast:
   - Fix display output after reboot

  bridge:
   - lt9211: Fix version check

  panthor:
   - Fix MCU suspend

  qaic:
   - Init bootlog in correct order
   - Treat remaining == 0 as error in find_and_map_user_pages()
   - Lock access to DBC request queue

  rockchip:
   - vop2: Fix destination size in atomic check"

* tag 'drm-fixes-2025-10-17' of https://gitlab.freedesktop.org/drm/kernel: (44 commits)
  drm/sched: Fix potential double free in drm_sched_job_add_resv_dependencies
  drm/xe/evict: drop bogus assert
  drm/xe/migrate: don't misalign current bytes
  drm/xe/kunit: Fix kerneldoc for parameterized tests
  drm/xe/svm: Ensure data will be migrated to system if indicated by madvise.
  drm/gpuvm: Fix kernel-doc warning for drm_gpuvm_map_req.map
  drm/i915/psr: Deactivate PSR only on LNL and when selective fetch enabled
  drm/ast: Blank with VGACR17 sync enable, always clear VGACRB6 sync off
  accel/qaic: Synchronize access to DBC request queue head & tail pointer
  accel/qaic: Treat remaining == 0 as error in find_and_map_user_pages()
  accel/qaic: Fix bootlog initialization ordering
  drm/rockchip: vop2: use correct destination rectangle height check
  drm/draw: fix color truncation in drm_draw_fill24
  drm/xe/guc: Check GuC running state before deregistering exec queue
  drm/xe: Enable media sampler power gating
  drm/xe: Handle mixed mappings and existing VRAM on atomic faults
  drm/xe/migrate: Fix an error path
  drm/xe: Move rebar to be done earlier
  drm/xe: Don't allow evicting of BOs in same VM in array of VM binds
  drm/xe: Increase global invalidation timeout to 1000us
  ...

Merge tag 'i2c-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

- PM cleanup after all prerequisites are merged with rc1

- usbio: missing addition after all dependencies are in

- slimpro: DT binding schema conversion

* tag 'i2c-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  dt-bindings: i2c: Convert apm,xgene-slimpro-i2c to DT schema
  i2c: usbio: Add ACPI device-id for MTL-CVF devices
  i2c: Remove redundant pm_runtime_mark_last_busy() calls

ALSA: hda/realtek: Fix mute led for HP Omen 17-cb0xxx

This laptop uses the ALC285 codec, fixed by enabling
the ALC285_FIXUP_HP_MUTE_LED quirk

Signed-off-by: Dawn Gardner <dawn.auroali@gmail.com>
Link: https://patch.msgid.link/20251016184218.31508-3-dawn.auroali@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

PCI/VGA: Select SCREEN_INFO on X86

commit 337bf13aa9dda ("PCI/VGA: Replace vga_is_firmware_default() with a
screen info check") introduced an implicit dependency upon SCREEN_INFO by
removing the open coded implementation.

If a user didn't have CONFIG_SCREEN_INFO set, vga_is_firmware_default()
would now return false. SCREEN_INFO is only used on X86 so add a
conditional select for SCREEN_INFO to ensure that the VGA arbiter works as
intended.

Fixes: 337bf13aa9dda ("PCI/VGA: Replace vga_is_firmware_default() with a screen info check")
Reported-by: Eric Biggers <ebiggers@kernel.org>
Closes: https://lore.kernel.org/linux-pci/20251012182302.GA3412@sol/
Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20251013220829.1536292-1-superm1@kernel.org

PCI: vmd: Override irq_startup()/irq_shutdown() in vmd_init_dev_msi_info()

Since commit 54f45a30c0d0 ("PCI/MSI: Add startup/shutdown for per
device domains") set callback irq_startup() and irq_shutdown() of
the struct pci_msi[x]_template, __irq_startup() will always invokes
irq_startup() callback instead of irq_enable() callback overridden
in vmd_init_dev_msi_info(). This will not start the IRQ correctly.

Also override irq_startup()/irq_shutdown() in vmd_init_dev_msi_info(),
so the irq_startup() can invoke the real logic.

Fixes: 54f45a30c0d0 ("PCI/MSI: Add startup/shutdown for per device domains")
Reported-by: Kenneth Crudup <kenny@panix.com>
Closes: https://lore.kernel.org/r/8a923590-5b3a-406f-a324-7bd1cf894d8f@panix.com/
Reported-by: Genes Lists <lists@sapience.com>
Closes: https://lore.kernel.org/r/4b392af8847cc19720ffcd53865f60ab3edc56b3.camel@sapience.com
Reported-by: Todd Brandt <todd.e.brandt@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220658
Reported-by: Oliver Hartkopp <socketcan@hartkopp.net>
Closes: https://lore.kernel.org/r/8d6887a5-60bc-423c-8f7a-87b4ab739f6a@hartkopp.net
Reported-by: Hervé <herve@dxcv.net>
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Genes Lists <lists@sapience.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Tested-by: Hervé <herve@dxcv.net>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251014014607.612586-1-inochiama@gmail.com

rust: bitmap: fix formatting

We do our best to keep the repository `rustfmt`-clean, thus run the tool
to fix the formatting issue.

Link: https://docs.kernel.org/rust/coding-guidelines.html#style-formatting
Link: https://rust-for-linux.com/contributing#submit-checklist-addendum
Fixes: 0f5878834d6c ("rust: bitmap: clean Rust 1.92.0 `unused_unsafe` warning")
Reviewed-by: Burak Emir <bqe@google.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

Merge tag 'drm-xe-fixes-2025-10-16' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Increase global invalidation timeout to handle some workloads
   (Kenneth Graunke)
- Fix NPD while evicting BOs in an array of VM binds (Matthew Brost)
- Fix resizable BAR to account for possibly needing to move BARs other
   than the LMEMBAR (Lucas De Marchi)
- Fix error handling in xe_migrate_init() (Thomas Hellström)
- Fix atomic fault handling with mixed mappings or if the page is
   already in VRAM (Matthew Brost)
- Enable media samplers power gating for platforms before Xe2 (Vinay
   Belgaumkar)
- Fix de-registering exec queue from GuC when unbinding (Matthew Brost)
- Ensure data migration to system if indicated by madvise with SVM
   (Thomas Hellström)
- Fix kerneldoc for kunit change (Matt Roper)
- Always account for cacheline alignment on migration (Matthew Auld)
- Drop bogus assertion on eviction (Matthew Auld)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/rch735eqkmprfyutk3ux2fsqa3e5ve4p77w7a5j66qdpgyquxr@ao3wzcqtpn6s

Merge tag 'drm-misc-fixes-2025-10-16' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

ast:
- Fix display output after reboot

bridge:
- lt9211: Fix version check

core:
- draw: Avoid color truncation
- gpuvm: Avoid kernel-doc warning
- sched: Avoid double free

panthor:
- Fix MCU suspend

qaic:
- Init bootlog in correct order
- Treat remaining == 0 as error in find_and_map_user_pages()
- Lock access to DBC request queue

rockchip:
- vop2: Fix destination size in atomic check

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20251016141607.GA73919@linux.fritz.box

rust: cpufreq: fix formatting

We do our best to keep the repository `rustfmt`-clean, thus run the tool
to fix the formatting issue.

Link: https://docs.kernel.org/rust/coding-guidelines.html#style-formatting
Link: https://rust-for-linux.com/contributing#submit-checklist-addendum
Fixes: f97aef092e19 ("cpufreq: Make drivers using CPUFREQ_ETERNAL specify transition latency")
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Benno Lossin <lossin@kernel.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: alloc: employ a trailing comment to keep vertical layout

Apply the formatting guidelines introduced in the previous commit to
make the file `rustfmt`-clean again.

Reviewed-by: Benno Lossin <lossin@kernel.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

docs: rust: add section on imports formatting

`rustfmt`, by default, formats imports in a way that is prone to conflicts
while merging and rebasing, since in some cases it condenses several
items into the same line.

For instance, Linus mentioned [1] that the following case:

    use crate::{
        fmt,
        page::AsPageIter,
    };

is compressed by `rustfmt` into:

    use crate::{fmt, page::AsPageIter};

which is undesirable.

Similarly, `rustfmt` may put several items in the same line even if the
braces span already multiple lines, e.g.:

    use kernel::{
        acpi, c_str,
        device::{property, Core},
        of, platform,
    };

The options that control the formatting behavior around imports are
generally unstable, and `rustfmt` releases do not allow to use nightly
features, unlike the compiler and other Rust tooling [2].

For the moment, we can introduce a workaround to prevent `rustfmt`
from compressing the example above -- the "trailing empty comment":

    use crate::{
        fmt,
        page::AsPageIter, //
    };

which is reminiscent of the trailing comma behavior in other formatters.
We already used empty comments for formatting purposes in the past,
e.g. in commit b9b701fce49a ("rust: clarify the language unstable features
in use").

In addition, `rustfmt` actually reformats with a vertical layout (i.e. it
does not put two items in the same line) when seeing such a comment,
i.e. it doesn't just preserve the formatting, which is good in the sense
that we can use it to easily reformat some imports, since it matches
the style we generally want to have.

A Git merge driver would help (suggested by Gary and Wedson), though
maintainers would need to set it up, the diffs would still be larger
and the formatting rules for imports would remain hard to predict.

Thus document the style that we will follow in the coding guidelines
by introducing a new section and explain how the trailing empty comment
works there too.

We discussed the issue with upstream Rust in our usual Rust <-> Rust
for Linux meeting [3], and there have also been a few other discussions
in parallel in issues [4][5] and Zulip [6]. We will see what happens,
but upstream Rust has already created a subteam of `rustfmt` to try
to overcome the bandwidth issue [7], which is a good signal, and some
organization work has already started (e.g. tracking issues). We will
continue our discussions with them about it.

Cc: Caleb Cartwright <caleb.cartwright@outlook.com>
Cc: Yacin Tmimi <yacintmimi@gmail.com>
Cc: Manish Goregaokar <manishsmail@gmail.com>
Cc: Deadbeef <ent3rm4n@gmail.com>
Cc: Cameron Steffen <cam.steffen94@gmail.com>
Cc: Jieyou Xu <jieyouxu@outlook.com>
Link: https://lore.kernel.org/all/CAHk-=wgO7S_FZUSBbngG5vtejWOpzDfTTBkVvP3_yjJmFddbzA@mail.gmail.com/
Link: https://github.com/rust-lang/rustfmt/issues/4884
Link: https://hackmd.io/iSCyY3JTTz-g8YM-nnzTTA
Link: https://github.com/rust-lang/rustfmt/issues/4991
Link: https://github.com/rust-lang/rustfmt/issues/3361
Link: https://rust-lang.zulipchat.com/#narrow/channel/392734-council/topic/rustfmt.20maintenance/near/543815381
Link: https://github.com/rust-lang/team/pull/2017
Reviewed-by: Benno Lossin <lossin@kernel.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

Merge tag 'amd-drm-fixes-6.18-2025-10-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.18-2025-10-16:

amdgpu:
- Backlight fix
- SI fixes
- CIK fix
- Make CE support debug only
- IP discovery fix
- Ring reset fixes
- GPUVM fault memory barrier fix
- Drop unused structures in amdgpu_drm.h
- JPEG debugfs fix
- VRAM handling fixes for GPUs without VRAM
- GC 12 MES fixes

amdkfd:
- MES fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20251016132224.2534946-1-alexander.deucher@amd.com

Merge tag 'drm-intel-fixes-2025-10-16' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Skip GuC communication warning if reset is in progress (Zhanjun)
- Couple frontbuffer related fixes (Ville)
- Deactivate PSR only on LNL and when selective fetch enabled (Jouni)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/aPDoguxlhXlvjNAi@intel.com

Merge tag 'nvme-6.18-2025-10-16' of git://git.infradead.org/nvme into block-6.18

Pull NVMe fixes from Keith:

"- iostats accounting fixed on multipath retries (Amit)
- secure concatenation response fixup (Martin)
- tls partial record fixup (Wilfred)"

* tag 'nvme-6.18-2025-10-16' of git://git.infradead.org/nvme:
  nvme/tcp: handle tls partially sent records in write_space()
  nvme-auth: update sc_c in host response
  nvme-multipath: Skip nr_active increments in RETRY disposition

nvme/tcp: handle tls partially sent records in write_space()

With TLS enabled, records that are encrypted and appended to TLS TX
list can fail to see a retry if the underlying TCP socket is busy, for
example, hitting an EAGAIN from tcp_sendmsg_locked(). This is not known
to the NVMe TCP driver, as the TLS layer successfully generated a record.

Typically, the TLS write_space() callback would ensure such records are
retried, but in the NVMe TCP Host driver, write_space() invokes
nvme_tcp_write_space(). This causes a partially sent record in the TLS TX
list to timeout after not being retried.

This patch fixes the above by calling queue->write_space(), which calls
into the TLS layer to retry any pending records.

Fixes: be8e82caa685 ("nvme-tcp: enable TLS handshake upcall")
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>

Merge tag 'asoc-fix-v6.18-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.18

A moderately large collection of driver specific fixes, plus a few new
quirks and device IDs. The NAU8821 changes are a little large but more
in mechanical ways than in ways that are complex.

Merge tag 'f2fs-fix-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim:

- fix soft lockupg caused by iput() added in bc986b1d756482a ("fs: stop
   accessing ->i_count directly in f2fs and gfs2")

- fix a wrong block address map on multiple devices

Link: https://lore.kernel.org/oe-lkp/202509301450.138b448f-lkp@intel.com
* tag 'f2fs-fix-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
  f2fs: fix wrong block mapping for multi-devices
  f2fs: don't call iput() from f2fs_drop_inode()

bpf: Fix memory leak in __lookup_instance error path

When __lookup_instance() allocates a func_instance structure but fails
to allocate the must_write_set array, it returns an error without freeing
the previously allocated func_instance. This causes a memory leak of 192
bytes (sizeof(struct func_instance)) each time this error path is triggered.

Fix by freeing 'result' on must_write_set allocation failure.

Fixes: b3698c356ad9 ("bpf: callchain sensitive stack liveness tracking using CFG")
Reported-by: BPF Runtime Fuzzer (BRF)
Signed-off-by: Shardul Bankar <shardulsb08@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/20251016063330.4107547-1-shardulsb08@gmail.com

Merge tag 'for-6.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- in tree-checker fix extref bounds check

- reorder send context structure to avoid
   -Wflex-array-member-not-at-end warning

- fix extent readahead length for compressed extents

- fix memory leaks on error paths (qgroup assign ioctl, zone loading
   with raid stripe tree enabled)

- fix how device specific mount options are applied, in particular the
   'ssd' option will be set unexpectedly

- fix tracking of relocation state when tasks are running and
   cancellation is attempted

- adjust assertion condition for folios allocated for scrub

- remove incorrect assertion checking for block group when populating
   free space tree

* tag 'for-6.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: send: fix -Wflex-array-member-not-at-end warning in struct send_ctx
  btrfs: tree-checker: fix bounds check in check_inode_extref()
  btrfs: fix memory leaks when rejecting a non SINGLE data profile without an RST
  btrfs: fix incorrect readahead expansion length
  btrfs: do not assert we found block group item when creating free space tree
  btrfs: do not use folio_test_partial_kmap() in ASSERT()s
  btrfs: only set the device specific options after devices are opened
  btrfs: fix memory leak on duplicated memory in the qgroup assign ioctl
  btrfs: fix clearing of BTRFS_FS_RELOC_RUNNING if relocation already running

Merge tag 'v6.18-rc1-smb-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

- Fix RPC hang due to locking bug

- Fix for memory leak in read and refcount leak (in session setup)

- Minor cleanup

* tag 'v6.18-rc1-smb-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix recursive locking in RPC handle list access
  smb/server: fix possible refcount leak in smb2_sess_setup()
  smb/server: fix possible memory leak in smb2_read()
  smb: server: Use common error handling code in smb_direct_rdma_xmit()

Merge tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from CAN

  Current release - regressions:

    - udp: do not use skb_release_head_state() before
      skb_attempt_defer_free()

    - gro_cells: use nested-BH locking for gro_cell

    - dpll: zl3073x: increase maximum size of flash utility

  Previous releases - regressions:

    - core: fix lockdep splat on device unregister

    - tcp: fix tcp_tso_should_defer() vs large RTT

    - tls:
        - don't rely on tx_work during send()
        - wait for pending async decryptions if tls_strp_msg_hold fails

    - can: j1939: add missing calls in NETDEV_UNREGISTER notification
      handler

    - eth: lan78xx: fix lost EEPROM write timeout in
      lan78xx_write_raw_eeprom

  Previous releases - always broken:

    - ip6_tunnel: prevent perpetual tunnel growth

    - dpll: zl3073x: handle missing or corrupted flash configuration

    - can: m_can: fix pm_runtime and CAN state handling

    - eth:
        - ixgbe: fix too early devlink_free() in ixgbe_remove()
        - ixgbevf: fix mailbox API compatibility
        - gve: Check valid ts bit on RX descriptor before hw timestamping
        - idpf: cleanup remaining SKBs in PTP flows
        - r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H"

* tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
  udp: do not use skb_release_head_state() before skb_attempt_defer_free()
  net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset
  netdevsim: set the carrier when the device goes up
  selftests: tls: add test for short splice due to full skmsg
  selftests: net: tls: add tests for cmsg vs MSG_MORE
  tls: don't rely on tx_work during send()
  tls: wait for pending async decryptions if tls_strp_msg_hold fails
  tls: always set record_type in tls_process_cmsg
  tls: wait for async encrypt in case of error during latter iterations of sendmsg
  tls: trim encrypted message to match the plaintext on short splice
  tg3: prevent use of uninitialized remote_adv and local_adv variables
  MAINTAINERS: new entry for IPv6 IOAM
  gve: Check valid ts bit on RX descriptor before hw timestamping
  net: core: fix lockdep splat on device unregister
  MAINTAINERS: add myself as maintainer for b53
  selftests: net: check jq command is supported
  net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit()
  tcp: fix tcp_tso_should_defer() vs large RTT
  r8152: add error handling in rtl8152_driver_init
  usbnet: Fix using smp_processor_id() in preemptible code warnings
  ...

Merge tag 'ata-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata fix from Niklas Cassel:

- Do not print an error message (and assume that the General Purpose
   Log Directory log page is not supported) for a device that reports a
   bogus General Purpose Logging Version.

   Unsurprisingly, many vendors fail to report the only valid General
   Purpose Logging Version (Damien)

* tag 'ata-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-core: relax checks in ata_read_log_directory()

selftests: arg_parsing: Ensure data is flushed to disk before reading.

test_parse_test_list_file writes some data to
/tmp/bpf_arg_parsing_test.XXXXXX and parse_test_list_file() will read
the data back. However, after writing data to that file, we forget to
call fsync() and it's causing testing failure in my laptop. This patch
helps fix it by adding the missing fsync() call.

Fixes: 64276f01dce8 ("selftests/bpf: Test_progs can read test lists from file")
Signed-off-by: Xing Guo <higuoxing@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20251016035330.3217145-1-higuoxing@gmail.com

HID: logitech-hidpp: Add HIDPP_QUIRK_RESET_HI_RES_SCROLL

The Logitech G502 Hero Wireless's high resolution scrolling resets after
being unplugged without notifying the driver, causing extremely slow
scrolling.

The only indication of this is a battery update packet, so add a quirk to
detect when the device is unplugged and re-enable the scrolling.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218037
Signed-off-by: Stuart Hayhurst <stuart.a.hayhurst@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>