git.ipfire.org Git - thirdparty/kernel/linux.git/log

Documentation: Fix syntax of kmalloc_objs example in coding style doc

The first parameter should match the variable that the allocated memory
is assigned to. Fix the example accordingly, the one for kmalloc_obj got
it right already.

Fixes: 7c6d969d5349 ("Documentation: adopt new coding style of type-aware kmalloc-family")
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260529081006.2019687-2-ukleinek@kernel.org>

docs: pt_BR: update maintainer-handbooks

Update the content of the maintainer-handbooks documentation
to Brazilian Portuguese.

Signed-off-by: Amanda Corrêa <amandacorreasilvax@gmail.com>
Acked-by: Daniel Pereira <danielmaraboo@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260528041958.57231-1-amandacorreasilvax@gmail.com>

docs: pt_BR: add translation for kernel development process guides

Add the Brazilian Portuguese (pt_BR) translation for the
'development-process.rst' and '2.process.rst' files under
the 'process/' directory.

The main 'index.rst' file is also updated to include references
to the newly translated documents in the toctree.

Signed-off-by: Daniel Pereira <danielmaraboo@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260527155350.202569-1-danielmaraboo@gmail.com>

drm/v3d: Wait for pending L2T flush before cleaning caches

v3d_clean_caches() starts the cache-clean sequence by writing
V3D_L2TCACTL_TMUWCF to V3D_CTL_L2TCACTL and then polling for that bit to
clear. It does not, however, check for an L2T flush (L2TFLS) that may
still be in flight from a previous operation.

On pre-V3D 7.1 hardware, kicking off the TMU write-combiner flush while an
L2T flush is still pending can clobber bits in L2TCACTL and cause cache
inconsistencies.

Poll for L2TFLS to clear before writing L2TCACTL on V3D < 7.1, ensuring
any pending flush has completed before a new clean is issued.

Cc: stable@vger.kernel.org
Fixes: d223f98f0209 ("drm/v3d: Add support for compute shader dispatch.")
Link: https://patch.msgid.link/20260530-v3d-fix-rpi4-freezes-v1-1-c2c8307da6ce@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

ACPI: CPPC: Add support for CPPC v4

CPPC v4 (ACPI 6.6, Section 8.4.6) adds two optional entries to the
_CPC package:

1. OSPM Nominal Performance (8.4.6.1.2.6): A write-only register that
   lets OSPM inform the platform what it considers nominal performance.
   The platform classifies performance above this level as boost and
   below as throttle for its power/thermal decisions.

2. Resource Priority (8.4.6.1.2.7): A Package of Resource Priority
   Register Descriptor sub-packages that allow OSPM to set relative
   priority among processors for shared resources (boost, throttle,
   L2/L3 cache, memory bandwidth). Parsing the full structure is not
   yet supported; such entries are marked as unsupported.

Add v4 _CPC table parsing (25 entries) and update REG_OPTIONAL to
mark the two new registers as optional.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://patch.msgid.link/20260527194626.185286-2-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

thermal: intel: Use sysfs_emit() for powerclamp cpumask

cpumask_get() is used as a sysfs getter for the cpumask module
parameter. Use sysfs_emit() and cpumask_pr_args() to emit the mask.

This prepares for removing cpumap_print_to_pagebuf().

Signed-off-by: Yury Norov <ynorov@nvidia.com>
Link: https://patch.msgid.link/20260528183625.870813-16-ynorov@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

powercap: intel_rapl: Use sysfs_emit() in cpumask_show()

cpumask_show() is a sysfs show callback, so use sysfs_emit() and
cpumask_pr_args() to emit the mask in it.

This prepares for removing cpumap_print_to_pagebuf().

Signed-off-by: Yury Norov <ynorov@nvidia.com>
[ rjw: Subject and changelog tweaks ]
Link: https://patch.msgid.link/20260528183625.870813-15-ynorov@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

erofs: fix EFSCORRUPTED on multi-algorithm images in z_erofs_map_sanity_check()

Commit a5242d37c83a ("erofs: error out obviously illegal extents in
advance") changed the per-extent algorithm presence check from "is the
bit set" to "is the only bit set":
  -      !(sbi->available_compr_algs & (1 << map->m_algorithmformat))
  + (sbi->available_compr_algs ^ BIT(map->m_algorithmformat))

`available_compr_algs` is a bitmap of every compression algorithm
available in the image (z_erofs_parse_cfgs() iterates it with
for_each_set_bit()), so an image that enables more than one algorithm
has multiple bits set.  XOR is zero only when the bitmap is exactly
BIT(map->m_algorithmformat); for any image with two or more algorithms
the test is non-zero for every extent and the read fails with
-EFSCORRUPTED ("inconsistent algorithmtype %u").

Reproducer (mkfs.erofs from erofs-utils 1.7.1):
  $ mkdir src
  $ yes A | head -c 100K > src/a
  $ head -c 64K /dev/zero > src/b
  $ mkfs.erofs -zlz4:deflate multi.erofs src
  $ mount -t erofs -o loop multi.erofs /mnt
  $ cat /mnt/a >/dev/null
  cat: /mnt/a: Structure needs cleaning
  $ dmesg | tail
    erofs (device loop0): inconsistent algorithmtype 0 for nid 46
    erofs (device loop0): read error -117 @ 0 of nid 46

The erofs on-disk format (Z_EROFS_COMPRESSION_MAX = 4 with LZ4, LZMA,
DEFLATE, ZSTD) and the kernel parser explicitly support
multi-algorithm images, and erofs-utils 1.7.1 generates them via the
"-z X:Y" syntax.

Restore the original per-bit presence check.

Fixes: a5242d37c83a ("erofs: error out obviously illegal extents in advance")
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

ACPI: PAD: Use sysfs_emit() in idlecpus_show()

idlecpus_show() is a sysfs show callback. Use sysfs_emit() and
cpumask_pr_args() to emit the mask.

This prepares for removing cpumap_print_to_pagebuf().

Signed-off-by: Yury Norov <ynorov@nvidia.com>
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/20260528183625.870813-6-ynorov@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

x86/CPU/AMD: Add more Zen6 models

Family 0x1a, models 0xd0 - 0xef are Zen6, so add them to the range which sets
X86_FEATURE_ZEN6.

[ bp: Massage commit message. ]

Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260530061819.9721-1-Pratik.Vishwakarma@amd.com

ACPI: scan: Honor _DEP for ACPI0016 PCI/CXL host bridge

CXL root devices (ACPI0017) declare _DEP on their parent ACPI0016
PCI/CXL host bridge so that cxl_acpi probes only after acpi_pci_root
has attached the PCI root and registered it for acpi_pci_find_root().
However, acpi_dev_ready_for_enumeration() only consults dep_unmet
when the supplier's HID is on acpi_honor_dep_ids[]; otherwise the
dependency is silently ignored.

Without honoring the dependency, cxl_acpi can probe before the PCI
root is ready. The resulting CXL topology is broken: decoder targets
read as 0 and no port/endpoint devices appear under
/sys/bus/cxl/devices/.

Add ACPI0016 to acpi_honor_dep_ids[] so the _DEP declared by ACPI0017
is enforced. This relies on the preceding patch ("ACPI: PCI: clear
_DEP dependencies after PCI root bridge attach"), which releases the
dependency once the PCI root is fully enumerated; the two patches
must be applied together.

Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260526025118.38935-3-cp0613@linux.alibaba.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ACPI: PCI: Clear _DEP dependencies after PCI root bridge attach

PCI root bridges enumerated by acpi_pci_root_add() can be the _DEP
supplier for other ACPI consumers, most notably ACPI0017 CXL root
devices whose probe path depends on acpi_pci_find_root() succeeding.
Once the root bus has been added, those consumers can safely be
enumerated, so notify them by clearing the dependency.

Call acpi_dev_clear_dependencies() at the end of acpi_pci_root_add(),
after pci_bus_add_devices(), following the same pattern used by other
ACPI suppliers such as the EC (drivers/acpi/ec.c) and the ACPI PCI
Link device (drivers/acpi/pci_link.c). The clear is intentionally
done only on the success path; on the error paths the supplier did
not attach and consumers must keep dep_unmet set.

This is a prerequisite for honoring _DEP on ACPI0016 host bridges,
which matters on architectures where the probe order of acpi_pci_root
relative to cxl_acpi is not guaranteed (e.g. RISC-V).

Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Suggested-by: Dan Williams (nvidia) <djbw@kernel.org>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260526025118.38935-2-cp0613@linux.alibaba.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ACPI: button: Use local pointer to platform device dev field in probe

To avoid dereferencing pdev to get to the target platform device's
dev field in multiple places in acpi_button_probe(), use a local pointer
to that field.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2049596.PYKUYFuaPT@rafael.j.wysocki

ACPI: button: Eliminate redundant conditional statement

Simplify do_update initialization in acpi_lid_notify_state() by
assigning the value of the condition it depends on directly to it.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/10868292.nUPlyArG6x@rafael.j.wysocki

ACPI: button: Change return type of two functions to void

The return value of acpi_lid_notify_state() is always 0, so change
its return type to void.

Moreover, the return value of the only caller of that function,
acpi_lid_update_state(), is never used, so change its return type
to void either.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/3429748.44csPzL39Z@rafael.j.wysocki

ACPI: button: Eliminate ternary operator from acpi_lid_evaluate_state()

The ternary operator in acpi_lid_evaluate_state() is not actually needed
because the same result can be achieved by applying the !! operator to
the lid_state value, so update the code accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/3055906.e9J7NaK4W3@rafael.j.wysocki

ACPI: button: Use bool for representing boolean values

Change the data type of the last_state field in struct acpi_button and
the data type of the acpi_lid_notify_state() second argument to bool
because they both are used for storing boolean values.

Update the callers of acpi_lid_notify_state() accordingly and
while at it, remove the unnecessary (void) cast from the
acpi_lid_update_state() call in acpi_lid_initialize_state() for
consistency.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2274778.irdbgypaU6@rafael.j.wysocki

ACPI: button: Improve warning message regarding lid state

The warning message regarding an unexpected lid state printed by
acpi_lid_notify_state() is quite cryptic and there is no information
in it to indicate that it is about a platform firmware defect. In
fact, it can only be understood after reading the comment below the
statement printing it.

For this reason, replace it with a more direct one including FW_BUG so
its connection to a firmware issue is clearer.

While at it, fix up a comment preceding the statement printing the
message in question.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/5084775.GXAFRqVoOG@rafael.j.wysocki

ACPI: button: Pass ACPI handle to acpi_lid_evaluate_state()

Make it clear that acpi_lid_evaluate_state() only uses the ACPI handle
of the lid by changing its argument to acpi_handle and adjust its
callers accordingly.

Also save the ACPI handle of the lid, that later may be passed to
acpi_lid_evaluate_state(), in a static variable instead of saving a
pointer to the ACPI device object containing that handle.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/4747530.LvFx2qVVIh@rafael.j.wysocki

ACPI: button: Fix lid_device value leak past driver removal

Static variable lid_device is set when the ACPI button driver probes
the last lid device (under the assumptions that there will be only
one lid device in the system) and never cleared, but in principle it
should be reset when the driver unbinds from the lid device pointed
to by it.

Address that and add locking that is needed to clear and set that
variable safely.

Fixes: 7e12715ecc47 ("ACPI button: provide lid status functions")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/6281379.lOV4Wx5bFT@rafael.j.wysocki

Revert "drm/xe/nvls: Define GuC firmware for NVL-S"

This reverts commit 4e88de313ff4d1c67b644b1f39f9fb4089711b71.

The early GuC FW definition meant for our CI branch was accidentally
merged to the drm-xe-next branch instead. This GuC FW will never be
released to linux-firmware, so we do not want the definition to be
available in the mainline Linux codebase.

Fixes: 4e88de313ff4 ("drm/xe/nvls: Define GuC firmware for NVL-S")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org # v7.0+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260529193558.185436-11-daniele.ceraolospurio@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 65b8e0ac86e48cfc9128c04dfc53ea3395d030dd)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ASoC: amd: yc: Enable internal mic on MSI Bravo 17 C7VF

The MSI Bravo 17 C7VF routes its internal digital microphone through
the ACP6x. The machine driver only enables the DMIC for boards present
in the DMI quirk table, so on this model the internal mic is never
detected and no capture device is created.

Add a quirk entry matching the board's DMI identifiers so the DMIC is
enabled and the internal microphone works.

Signed-off-by: João Miguel <jmiguel.ghp@gmail.com>
Link: https://patch.msgid.link/20260523213548.5219-1-jmiguel.ghp@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: amd: acp: Add DMI quirk for Lenovo Yoga Pro 7 15ASH11

Lenovo Yoga Pro 7 15ASH11 with AMD RYZEN AI MAX+ 388 (Strix Halo, ACP
7.0) uses Realtek ALC287 series codec and no any DMIC connected by ACP.
All DMICs directly connet with ALC codec.

Without this quirk, Input Device of Gnome Sound settings shows Internal
Stereo Microphone and Digital Microphone by default. In fact, Digital
Microphone of ACP doesn't work due to no connecting with ALC287 codec,
the Internal Stereo Microphone as analog device based on snd_hda_intel
driver can work well.

Add a DMI quirk to override the flag to 0, consistent with the existing
entry for the HN7306EA.

Signed-off-by: Jackie Dong <xy-jackie@139.com>
Link: https://patch.msgid.link/20260527102005.58528-1-xy-jackie@139.com
Signed-off-by: Mark Brown <broonie@kernel.org>

dm cache: make smq background work limit configurable

The maximum number of concurrent background work items (promotions,
demotions, writebacks) in the SMQ policy was hardcoded to 4096, with
a FIXME comment noting it should be made configurable.

This value was originally tuned down from 10240 to balance memory
overhead (~128 bytes per entry, ~512KB at 4096 entries) against I/O
parallelism. However, different workloads and cache sizes may benefit
from different limits:

- Write-heavy workloads may need more writeback concurrency
- Very large caches (10+ TB) may need more promotion slots
- Memory-constrained systems may want a lower limit

Make this configurable via the module parameter "smq_max_background_work"
(defaulting to 4096 to preserve existing behaviour). Clamp the value to
at least 1 to prevent setting 0, which would block all background work.
The parameter only affects newly created cache devices; existing caches
retain their value from creation time.

Signed-off-by: Cao Guanghui <caoguanghui@kylinos.cn>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

Merge tag 'thunderbolt-for-v7.1-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-linus

Mika writes:

thunderbolt: Fixes for v7.1-rc7

This includes more fixes to harden XDomain message handling against
possible malicious hosts.

All these have been in linux-next with no reported issues.

* tag 'thunderbolt-for-v7.1-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
  thunderbolt: Limit XDomain response copy to actual frame size
  thunderbolt: Validate XDomain request packet size before type cast
  thunderbolt: Clamp XDomain response data copy to allocation size
  thunderbolt: Bound root directory content to block size
  thunderbolt: Reject zero-length property entries in validator

dm cache policy smq: check allocation under invalidate lock

commit 2d1f7b65f5de ("dm cache policy smq: fix missing locks in
invalidating cache blocks") added mq->lock around the destructive part of
smq_invalidate_mapping(), but left the e->allocated check outside the
critical section.

That leaves a check-then-act race. Two concurrent invalidators can both
observe e->allocated as true before either of them takes mq->lock. The
first invalidator that acquires the lock removes the entry from the
queues and hash table and then calls free_entry(), which clears
e->allocated and puts the entry back on the free list. The second
invalidator can then acquire mq->lock and continue with the stale result
of the unlocked check.

This can corrupt the SMQ queues or hash table by deleting an entry that
is no longer on those structures. It can also hit the allocation check in
free_entry() when the same entry is freed again.

Move the allocation check under mq->lock so the predicate and the
destructive operations are serialized by the same lock.

Fixes: 2d1f7b65f5de ("dm cache policy smq: fix missing locks in invalidating cache blocks")
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

tracing: Replace BUG_ON with lockdep_assert_held in uprobe_buffer functions

Replace BUG_ON(!mutex_is_locked(&event_mutex)) with
lockdep_assert_held(&event_mutex) in uprobe_buffer_enable() and
uprobe_buffer_disable().

BUG_ON() will crash the kernel. mutex_is_locked() only checks
if any task holds lock,but not the caller task. lockdep_assert_held()
also check current task for lock and no crash on true condition.

Link: https://lore.kernel.org/all/20260521192846.8306-1-yashsuthar983@gmail.com/
Signed-off-by: Yash Suthar <yashsuthar983@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

tracing: Use flexible array for entry fetch code

Store probe entry fetch instructions in the probe_entry_arg
allocation instead of allocating a separate instruction array.

This keeps the entry fetch code tied to the entry argument lifetime while
leaving regular probe_arg instruction arrays separately allocated and
freed.

Assisted-by: Codex:GPT-5.5
Link: https://lore.kernel.org/all/20260520215817.16560-1-rosenp@gmail.com/
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

tracing/probes: Ensure the uprobe buffer size is bigger than event size

Add BUILD_BUG_ON() to ensure the uprobe per-CPU working buffer
size is bigger than the event size.

Link: https://lore.kernel.org/all/177849383209.8038.1902170479780501237.stgit@devnote2/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Merge tag 'socfpga_fix_for_v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes

SoCFPGA dts fix for v7.1
- Fix OF node refcount leak

* tag 'socfpga_fix_for_v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
ARM: socfpga: Fix OF node refcount leak in SMP setup

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'at91-fixes-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes

Microchip AT91 fixes for v7.1

This update includes:
- a fix for the GMAC DT node on SAM9X7 SoC to properly describe the
available clocks

* tag 'at91-fixes-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
ARM: dts: microchip: sam9x7: fix GMAC clock configuration

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

MAINTAINERS: use new drbd-dev mailing list

We are migrating from our own infrastructure to lists.linux.dev, so
change the drbd-dev address to point to the new domain.

Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://patch.msgid.link/20260513065557.36042-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

spi: tegra210-quad: Allocate DMA memory for DMA engine

When the SPI controllers are running in DMA mode, it is the DMA engine
that performs the memory accesses rather than the SPI controller. Pass
the DMA engine's struct device pointer to the DMA API to make sure the
correct DMA operations are used.

Suggested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Link: https://patch.msgid.link/20260525-tegra194-qspi-iommu-v2-1-a11c53f804b2@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: imx: replace dmaengine_terminate_all() with dmaengine_terminate_sync()

dmaengine_terminate_all() has been deprecated, so replace it with
dmaengine_terminate_sync().

Fixes: ba9b28652c75 ("spi: imx: enable DMA mode for target operation")
Fixes: a450c8b77f92 ("spi: imx: handle DMA submission errors with dma_submit_error()")
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Link: https://patch.msgid.link/20260525062928.3191821-1-carlos.song@oss.nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: fsl-lpspi: fix DMA termination issues

Carlos Song (OSS) <carlos.song@oss.nxp.com> says:

This series fixes two issues in the fsl-lpspi DMA transfer error paths.

Patch 1 replaces the deprecated dmaengine_terminate_all() with
dmaengine_terminate_sync() across all error paths in
fsl_lpspi_dma_transfer().

Patch 2 fixes a missing RX DMA channel termination when TX descriptor
preparation fails. Since the RX channel is already submitted and issued
before the TX descriptor is prepared, returning -EINVAL without
terminating the RX channel leaves it running against buffers that the
SPI core will unmap, potentially causing memory corruption.

Link: https://patch.msgid.link/20260525062357.3191349-1-carlos.song@oss.nxp.com

spi: fsl-lpspi: terminate the RX channel on TX prepare failure path

When dmaengine_prep_slave_sg() fails for the TX channel, the error path
terminates the TX DMA channel but leaves the RX channel running. Since
the RX channel was already submitted and issued prior to preparing
the TX descriptor, returning -EINVAL causes the SPI core to unmap the
DMA buffers while the RX DMA engine continues writing to them, leading
to potential memory corruption or use-after-free.

Terminate the RX channel before returning on the TX prepare failure path.

Fixes: 09c04466ce7e ("spi: lpspi: add dma mode support")
Cc: stable@vger.kernel.org
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Link: https://patch.msgid.link/20260525062357.3191349-3-carlos.song@oss.nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: fsl-lpspi: replace dmaengine_terminate_all() with dmaengine_terminate_sync()

dmaengine_terminate_all() has been deprecated, so replace it with
dmaengine_terminate_sync().

Fixes: 09c04466ce7e ("spi: lpspi: add dma mode support")
Cc: stable@vger.kernel.org
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Link: https://patch.msgid.link/20260525062357.3191349-2-carlos.song@oss.nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: atmel: fix DMA channel and bounce buffer leaks

The original code set use_dma to false when dma_alloc_coherent() for
bounce buffers failed, but DMA channels acquired earlier via
atmel_spi_configure_dma() were never freed.

When devm_request_irq() or clk_prepare_enable() failed later in probe,
the driver also did not release DMA channels or bounce buffers already
allocated.

The out_free_dma error path released DMA channels but did not free the
bounce buffers.

Fix by moving bounce buffer allocation into atmel_spi_configure_dma()
and registering the devres cleanup for DMA channels and bounce buffers.

Fixes: a9889ed62d06 ("spi: atmel: Implements transfers with bounce buffer")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260522-atmel-v3-1-23f8c6e6aa43@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

net/mlx5: Reorder completion before putting command entry in cmd_work_handler

Assuming callback != NULL && !page_queue, cmd_work_handler takes
command entry with refcnt == 1 from mlx5_cmd_invoke.
If either semaphore timeout or index allocation error happens,
it does final cmd_ent_put(ent). To avoid access to freed memory,
notify slotted completion before cmd_ent_put.

This is theoretical issue found by Svace static analyser.

Cc: stable@vger.kernel.org
Fixes: 485d65e135712 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
Fixes: 0e2909c6bec90 ("net/mlx5: Fix variable not being completed when function returns")
Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru>
Reviewed-by: Md Haris Iqbal <haris.iqbal@linux.dev>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260526162932.501584-1-kniv@yandex-team.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

netfilter: nft_byteorder: remove multi-register support

64bit byteorder conversion is broken when several registers need to be
converted because the source register array advances in steps for 4 bytes
instead of 8:

  for (i = ...
      src64 = nft_reg_load64(&src[i]);
                             ~~~~~ u32 *src
      nft_reg_store64(&dst64[i],

Remove the multi-register support, it has other issues as well:

Pablo points out that commit
caf3ef7468f7 ("netfilter: nf_tables: prevent OOB access in nft_byteorder_eval")
alters semantics: before the loop operated on registers, i.e.
for ( ... )
   dst32[i] = htons((u16)src32[i])

.. but after the patch it will operate on bytes, which makes this
useless to convert e.g. concatenations, which store each compound
in its own register.

Multi-convert of u32 has one theoretical application:

ct mark . meta mark . tcp dport @intervalset

Because ct mark and meta mark are host byte order, use with
intervals has to convert the byteorder for ct/meta mark value
to network byte order (bigendian).

nftables emits this:
[ meta load mark => reg 1 ]
[ byteorder reg 1 = hton(reg 1, 4, 4) ]
[ ct load mark => reg 9 ]
[ byteorder reg 9 = hton(reg 9, 4, 4) ]
...

I.e. two separate calls.  Theoretically it could be changed to do:
[ meta load mark => reg 1 ]
[ ct load mark => reg 9 ]
[ byteorder reg 1 = htonl(reg 1, 4, 8) ]
...

But then all it would take to change the set to
meta mark . tcp dport . ct mark

... and we'd be back to two "byteorder" calls. IOW, support to
convert a range of registers is both dysfunctional and dubious.

Simplify this: remove the feature.

Pablo Neira Ayuso points out that nftables before 1.1.0 can generate
incorrect byteorder conversions, see 9fe58952c45a,
"evaluate: skip byteorder conversion for selector smaller than 2 bytes"
in nftables.git).  Affected rulesets fail to load with this change and
old userspace due to 'len != size' check.

Fixes: c301f0981fdd ("netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()")
Cc: <stable+noautosel@kernel.org> # may break rule load with old nftables versions
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Link: https://lore.kernel.org/netfilter-devel/20240206104336.ctigqpkunom2ufmn@lion.mk-sys.cz/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: bridge: make ebt_snat ARP rewrite writable

The ebtables SNAT target keeps the Ethernet source address rewrite
behind skb_ensure_writable(skb, 0).  This is intentional: at the bridge
ebtables hooks the Ethernet header is addressed through
skb_mac_header()/eth_hdr(), while skb->data points at the Ethernet
payload.  Asking skb_ensure_writable() for ETH_HLEN bytes would check
the payload, not the Ethernet header, and would reintroduce the small
packet regression fixed by commit 63137bc5882a.

However, the optional ARP sender hardware address rewrite is different.
It writes through skb_store_bits() at an offset relative to skb->data:

        skb_store_bits(skb, sizeof(struct arphdr), info->mac, ETH_ALEN)

skb_header_pointer() only safely reads the ARP header; it does not make
the later sender hardware address range writable.  If that range is
still held in a nonlinear skb fragment backed by a splice-imported file
page, skb_store_bits() maps the frag page and copies the new MAC address
directly into it.

Ensure the ARP SHA range is writable before reading the ARP header and
before calling skb_store_bits().

Fixes: 63137bc5882a ("netfilter: ebtables: Fixes dropping of small packets in bridge nat")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nft_ct: bail out on template ct in get eval

I noticed this issue while looking at a historic syzbot report [1].

A rule like the one below is enough to trigger the bug:

    table ip t {
        chain pre {
            type filter hook prerouting priority raw;
            ct zone set 1
            ct original saddr 1.2.3.4 accept
        }
    }

The first expression attaches a per-cpu template ct via
nft_ct_set_zone_eval() (nf_ct_tmpl_alloc -> kzalloc, tuple is all
zero, nf_ct_l3num(ct) == 0). The next expression then calls
nft_ct_get_eval() on the same skb, treats the template as a real ct
and hits the 16-byte memcpy path. With dreg at NFT_REG32_15 this
overflows past struct nft_regs on the kernel stack; with smaller
dreg values it silently clobbers adjacent registers.

Reject template ct at the eval entry and in nft_ct_get_fast_eval(),
mirroring the check nft_ct_set_eval() already has. Additionally,
bound the address copy in NFT_CT_SRC / NFT_CT_DST by priv->len
instead of by nf_ct_l3num(ct): nf_ct_get_tuple() zeroes the tuple
before pkt_to_tuple() fills in only the protocol-relevant leading
bytes, so the trailing bytes of tuple->{src,dst}.u3.all are
well-defined zero. priv->len is validated at rule load, so the
copy size is now bounded by the destination register rather than
by an untrusted field on the conntrack.

[1]: https://syzkaller.appspot.com/bug?id=389cf09cb72926114fce90dc85a2c3231dcb647c

Fixes: 45d9bcda21f4 ("netfilter: nf_tables: validate len in nft_validate_data_load()")
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nft_tunnel: fix use-after-free on object destroy

nft_tunnel_obj_destroy() calls metadata_dst_free() which directly
kfree()s the metadata_dst, ignoring the dst_entry refcount. Packets
that took a reference via dst_hold() in nft_tunnel_obj_eval() and
are still queued (e.g. in a netem qdisc) are left with a dangling
pointer. When these packets are eventually dequeued, dst_release()
operates on freed memory.

Replace metadata_dst_free() with dst_release() so the metadata_dst
is freed only after all references are dropped. The dst subsystem
already handles metadata_dst cleanup in dst_destroy() when
DST_METADATA is set.

Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support")
Cc: stable@vger.kernel.org
Signed-off-by: Tristan Madani <tristan@talencesecurity.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: conntrack_irc: fix possible out-of-bounds read

When parsing fails after we've matched the command string we
should bail out instead of trying to match a different command.

This helper should be deprecated, given prevalence of TLS I doubt it has
any relevance in 2026.

Fixes: 869f37d8e48f ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port")
Closes: https://sashiko.dev/#/patchset/20260525182924.28456-1-fw%40strlen.de
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: synproxy: add mutex to guard hook reference counting

As the synproxy infrastructure register netfilter hooks on-demand when a
user adds the first iptables target or nftables expression, if done
concurrently they can race each other.

Introduce a mutex to serialize the refcount control blocks access from
both frontends. While a per namespace mutex might be more efficient, it
is not needed for target/expression like SYNPROXY.

Fixes: ad49d86e07a4 ("netfilter: nf_tables: Add synproxy support")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nft_fib_ipv6: bail out of sibling walk if rt got unlinked

This was reported by Sashiko [1].

The RCU walk over rt->fib6_siblings can spin forever if rt is unlinked
mid-iteration: rt->fib6_siblings.next still points into the old ring,
so the loop never meets &rt->fib6_siblings as its terminator.

fib6_purge_rt() always does WRITE_ONCE(rt->fib6_nsiblings, 0) before
list_del_rcu(), so readers can use rt->fib6_nsiblings == 0 as the
detach signal. The same pattern is used in fib6_info_uses_dev() and
rt6_nlmsg_size().

[1]: https://sashiko.dev/#/patchset/20260520023411.391233-1-jiayuan.chen%40linux.dev
Suggested-by: Florian Westphal <fw@strlen.de>
Fixes: 1c32b24c234b ("netfilter: nft_fib_ipv6: switch to fib6_lookup")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ipvs: clear the svc scheduler ptr early on edit

ip_vs_edit_service() while unbinding the old scheduler clears
the svc->scheduler ptr after the scheduler module initiates
RCU callbacks. This can cause packets to use the old
scheduler at the time when svc->sched_data is already freed
after RCU grace period.

Fix it by clearing the ptr early in ip_vs_unbind_scheduler(),
before the done_service method schedules any RCU callbacks.

Also, if the new scheduler fails to initialize when replacing
the old scheduler, try to restore the old scheduler while still
returning the error code.

Link: https://sashiko.dev/#/patchset/20260519015506.634185-1-rosenp%40gmail.com
Fixes: 05f00505a89a ("ipvs: fix crash if scheduler is changed")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: xt_NFQUEUE: prefer raw_smp_processor_id

With PREEMPT_RCU this triggers a splat because smp_processor_id() can be
preempted while inside a RCU critical section. If xt_NFQUEUE target is
invoked via nft_compat_eval() path, we are inside a RCU critical
section.

Just use the raw version instead.

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ext2: Remove deprecated DAX support

DAX support in ext2 was deprecated in commit d5a2693f93e4 ("ext2:
Deprecate DAX") with a removal deadline of end of 2025. Remove all DAX
code from ext2 as scheduled.

This removes the DAX mount option, IOMAP DAX support, DAX file
operations, DAX address_space_operations, and the DAX fault handler.

[JK: Fixup some whitespace damage]

Signed-off-by: Ashwin Gundarapu <linuxuser509@zohomail.in>
Link: https://patch.msgid.link/19e5aa07c9b.3a2e576d130187.5289857983023045470@zohomail.in
Signed-off-by: Jan Kara <jack@suse.cz>

mm/slub: detach and reattach partial slabs in batch

get_partial_node_bulk() moves each selected slab from the node's
partial list to the local pc->slabs list using a remove_partial() and
list_add() pair. In practice, the loop often detaches several adjacent
slabs. Doing this individually repeatedly manipulates list pointers
while holding n->list_lock, which causes unnecessary churn.

To demonstrate this, the counts below show how often single vs. multiple
consecutive slabs are retrieved during a will-it-scale mmap stress test:

consecutive_slabs_count        frequency
= 1                            277345324
= 2                            335238023
= 3                            175717884
>= 4                           88862337

The data confirms that retrieving multiple contiguous slabs is highly
frequent.

To optimize this, track contiguous runs of matching slabs and move each
run in a single operation using list_bulk_move_tail(). This reduces list
pointer churn inside the lock critical section.

Apply the same optimization to __refill_objects_node() when reattaching
leftover partial slabs back to the node's partial list.

The will-it-scale mmap benchmark shows a 2% ~ 5% performance improvement
after applying this patch.

Signed-off-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/20260529035120.81304-3-hao.li@linux.dev
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

mm/slub: introduce helpers for node partial slab state

Wrap partial slab count inc/dec and flag set/clear into
helper functions to reduce code duplication.

Note that __add_partial() is called locklessly in
early_kmem_cache_node_alloc(), but since there is no such use case for
removal, __remove_partial() does not exist.

Suggested-by: Harry Yoo <harry@kernel.org>
Signed-off-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/20260529035120.81304-2-hao.li@linux.dev
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

mm/slub: use empty sheaf helpers for oversized sheaves

Oversized prefilled sheaves are allocated separately because their
capacity can be larger than the cache's regular sheaf capacity. After
they are flushed, however, they are empty sheaves as well, and should be
released through the same empty-sheaf helper.

Allocate oversized prefilled sheaves with __alloc_empty_sheaf() and free
them with free_empty_sheaf() after a failed prefill or after they are
returned and flushed. This keeps the oversized and pfmemalloc return paths
consistent, including the SLAB_KMALLOC-specific __GFP_NO_OBJ_EXT and
mark_obj_codetag_empty() handling.

Keep the caller-GFP filtering in alloc_empty_sheaf() instead of
__alloc_empty_sheaf(). In particular, do not clear OBJCGS_CLEAR_MASK in
the raw helper, so the oversized prefill path does not unexpectedly drop
caller-provided flags such as __GFP_NOFAIL. The SLAB_KMALLOC-specific
addition of __GFP_NO_OBJ_EXT remains in __alloc_empty_sheaf(), matching
the free_empty_sheaf() assumption.

Since oversized sheaves are now allocated and freed through the empty
sheaf helpers, SHEAF_ALLOC and SHEAF_FREE also account for oversized
sheaves. Update the stat comments accordingly.

Keep the capacity initialization in the oversized prefill path, since
capacity is currently only used for prefilled sheaves

Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
Link: https://patch.msgid.link/20260528193537623nAo-xYBNYBysGKSBjREuO@zte.com.cn
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Reviewed-by: Hao Li <hao.li@linux.dev>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

xfrm: iptfs: preserve shared-frag marker in iptfs_consume_frags()

iptfs_consume_frags() transfers paged fragments from one socket buffer
to another but fails to propagate the SKBFL_SHARED_FRAG flag. This is
the same class of bug that was fixed in skb_try_coalesce() for
CVE-2026-46300: when fragments backed by read-only page-cache pages are
merged, the marker indicating their shared nature must be preserved so
that ESP can decide correctly whether in-place encryption is safe.

Apply the same two-line fix used in skb_try_coalesce() to
iptfs_consume_frags().

Fixes: b96ba312e21c ("xfrm: iptfs: share page fragments of inner packets")
Cc: stable@vger.kernel.org # 6.14+
Signed-off-by: Takao Sato <takaosato1997@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

rust: cpufreq: clean new `clippy::map_or_identity` lint for Rust 1.98.0

Starting with Rust 1.98.0 (expected 2026-08-20), Clippy is likely
introducing a new lint `clippy::map_or_identity` [1][2], which currently
triggers in a single case:

    warning: expression can be simplified using `Result::unwrap_or()`
        --> rust/kernel/cpufreq.rs:1326:60
         |
    1326 |         PolicyCpu::from_cpu(cpu_id).map_or(0, |mut policy| T::get(&mut policy).map_or(0, |f| f))
         |                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         |
         = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#map_or_identity
         = note: `-W clippy::map-or-identity` implied by `-W clippy::all`
         = help: to override `-W clippy::all` add `#[allow(clippy::map_or_identity)]`
    help: consider using `unwrap_or`
         |
    1326 -         PolicyCpu::from_cpu(cpu_id).map_or(0, |mut policy| T::get(&mut policy).map_or(0, |f| f))
    1326 +         PolicyCpu::from_cpu(cpu_id).map_or(0, |mut policy| T::get(&mut policy).unwrap_or(0))
         |

The suggestion is valid, thus clean it up.

Cc: stable@vger.kernel.org # Needed in 6.18.y and later.
Link: https://github.com/rust-lang/rust-clippy/issues/15801
Link: https://github.com/rust-lang/rust-clippy/pull/16052
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260530095809.213611-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

userfaultfd: remove redundant check in vm_uffd_ops()

Lorenzo says:

  static const struct vm_uffd_ops *vma_uffd_ops(struct vm_area_struct *vma)
  {
          if (vma_is_anonymous(vma))
                  return &anon_uffd_ops;
          return vma->vm_ops ? vma->vm_ops->uffd_ops : NULL;
  }

  This is doing a redundant check _and_ making life confusing, as if
  !vma->vm_ops is a condition that can be reached there, it can't, as
  vma_is_anonymous() is literally a !vma->vm_ops check :)

Remove the redundant check.

Link: https://lore.kernel.org/20260527184751.4147364-4-rppt@kernel.org
Fixes: 0f48947c4232 ("userfaultfd: introduce vm_uffd_ops")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: David Carlier <devnexen@gmail.com>
Cc: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

userfaultfd: refuse to __mfill_atomic_pte() for unsupported VMAs

__mfill_atomic_pte() unconditionally dereferences ops because there is an
assumption that VMAs that can undergo mfill_* operations are vetted on
registration and must have valid vm_uffd_ops.

Add a guard against potential bugs and make sure __mfill_atomic_pte()
bails out if ops is NULL.

Link: https://lore.kernel.org/20260527184751.4147364-3-rppt@kernel.org
Fixes: ad9ac3081332 ("userfaultfd: introduce vm_uffd_ops->alloc_folio()")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: David CARLIER <devnexen@gmail.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Michael Bommarito <michael.bommarito@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

userfaultfd: verify VMA state across UFFDIO_COPY retry

Patch series "userfaultfd: verify VMA state across UFFDIO_COPY retry", v2.

... and two more small fixes.

This patch (of 3):

mfill_copy_folio_retry() drops the VMA lock for copy_from_user() and
reacquires it afterwards.  The destination VMA can be replaced during that
window.

The existing check compares vma_uffd_ops() before and after the retry, but
if a shmem VMA with MAP_SHARED is replaced with a shmem VMA with
MAP_PRIVATE (or vice versa) the replacement goes undetected.

The change from MAP_PRIVATE to MAP_SHARED will treat the folio allocated
with shmem_alloc_folio() as anonymous and this will cause BUG() when
mfill_atomic_install_pte() will try to folio_add_new_anon_rmap().

The change from MAP_SHARED to MAP_PRIVATE allows injection of folios into
the page cache of the original VMA.

There is no need to change for hugetlb because it never uses
mfill_copy_folio_retry().

Introduce helpers for more comprehensive comparison of VMA state:
- mfill_retry_state_save() to save the relevant VMA state into a struct
  mfill_retry_state (original uffd_ops, relevant VMA flags, vm_file and
  pgoff) before dropping the lock
- mfill_retry_state_changed() to compare the saved state with the state
  of the VMA acquired after retaking the locks
- mfill_retry_state_put() to release vm_file pinning.

Use DEFINE_FREE() cleanup to wrap mfill_retry_state_put() to avoid
complicating error handling paths in mfill_copy_folio_retry().

Link: https://lore.kernel.org/20260527184751.4147364-1-rppt@kernel.org
Link: https://lore.kernel.org/20260527184751.4147364-2-rppt@kernel.org
Fixes: 292411fda25b ("mm/userfaultfd: detect VMA type change after copy retry in mfill_copy_folio_retry()")
Fixes: 6ab703034f14 ("userfaultfd: mfill_atomic(): remove retry logic")
Co-developed-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Peter Xu <peterx@redhat.com>
Co-developed-by: David Carlier <devnexen@gmail.com>
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/huge_memory: update file PMD counter before folio_put()

__split_huge_pmd_locked() updates the file/shmem RSS counter after
dropping the PMD mapping's folio reference. If folio_put() drops the last
reference, mm_counter_file() can later read freed folio state via
folio_test_swapbacked().

Move the counter update before folio_put().

Link: https://lore.kernel.org/20260526101337.1984081-1-yintirui@huawei.com
Fixes: fadae2953072 ("thp: use mm_file_counter to determine update which rss counter")
Signed-off-by: Yin Tirui <yintirui@huawei.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/huge_memory: update file PUD counter before folio_put()

__split_huge_pud_locked() updates the file/shmem RSS counter after
dropping the PUD mapping's folio reference. If folio_put() drops the last
reference, mm_counter_file() can later read freed folio state via
folio_test_swapbacked().

Move the counter update before folio_put().

Link: https://lore.kernel.org/20260526101355.1984244-1-yintirui@huawei.com
Fixes: dbe54153296d ("mm/huge_memory: add vmf_insert_folio_pud()")
Signed-off-by: Yin Tirui <yintirui@huawei.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/hugetlb_vmemmap: fix incorrect vmemmap restore in rollback

vmemmap_restore_pte() rebuilds restored vmemmap pages from a tail-page
template derived from compound_head().  This is wrong when the current PTE
already maps a page whose contents are not tail-page metadata.

In the rollback path of vmemmap_remap_free(), the first restored PTE is
backed by vmemmap_head and contains head-page metadata.  Reconstructing
that page from a tail-page template overwrites the head-page state and
corrupts the restored vmemmap page.

Fix this by copying the full page from the page currently mapped by the
PTE.  Also pass vmemmap_tail to the rollback walk so only PTEs backed by
the shared tail page are restored, while the head PTE remains mapped to
vmemmap_head.  Add VM_WARN_ON_ONCE() checks for unexpected cases.

Link: https://lore.kernel.org/20260525025213.2229628-1-songmuchun@bytedance.com
Fixes: c0b495b91a47 ("mm/hugetlb: refactor code around vmemmap_walk")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Kiryl Shutsemau <kas@kernel.org>
Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/damon/ops-common: call folio_test_lru() after folio_get()

damon_get_folio() speculatively calls folio_test_lru() before
folio_try_get().  The folio can get freed and reallocated to a tail page.
In the case, VM_BUG_ON_PGFLAGS() in const_folio_flags() can be triggered.
Remove the speculative call.

Also mark folio_test_lru() check right after folio_try_get() success as no
more unlikely.

The race should be rare.  Also the problem can happen only if the kernel
has enabled CONFIG_DEBUG_VM_PGFLAGS.  No real world report of this issue
has been made so far.  This fix is based on only theoretical analysis.
That said, a bug is a bug.  A similar issue was also fixed via commit
3203b3ab0fcf ("mm/filemap: don't call folio_test_locked() without a
reference in next_uptodate_folio()").  I don't expect this change will
make a meaningful impact to DAMON performance in the real world, though I
will be happy to be corrected from the real world reports.

The issue was discovered [1] by Sashiko.

Link: https://lore.kernel.org/20260525162256.8317-1-sj@kernel.org
Link: https://lore.kernel.org/20260517234112.89245-1-sj@kernel.org
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Fernand Sieber <sieberf@amazon.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

rbd: check snap_count against RBD_MAX_SNAP_COUNT

snap_count is u32 but the comparison is against a SIZE_MAX-derived value
(~2^61 on 64-bit), which clang flags as always false with
-Wtautological-constant-out-of-range-compare.

The proper check here should be that snap_count does not go over
RBD_MAX_SNAP_COUNT.

Assisted-by: Opencode:Big-pickle
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Alex Elder <elder@riscstar.com>
Link: https://patch.msgid.link/20260530011255.52916-1-rosenp@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

rust: block: fix GenDisk cleanup paths

GenDiskBuilder::build() still has fallible work after
__blk_mq_alloc_disk(), but its error path only recovers the
foreign queue data. That leaks the temporary gendisk and
request_queue until later teardown. If the caller moved the last
Arc<TagSet<T>> into build(), the leaked queue can retain blk-mq
state after the tag set is dropped.

Fix the pre-registration failure path by dropping the temporary
gendisk reference with put_disk() before recovering queue_data,
so disk_release() can tear down the owned queue.

Also pair GenDisk::drop() with put_disk() after del_gendisk().
Once a Rust GenDisk has been added with device_add_disk(),
del_gendisk() only unregisters it; the final gendisk reference
still has to be dropped to complete the release path.

Fixes: 3253aba3408a ("rust: block: introduce `kernel::block::mq` module")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Link: https://patch.msgid.link/b70aff9a920cc42110fe5cf454c3099561863519.1780063368.git.royenheart@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCEL

A deferred byte-range lock (an SMB2_LOCK that blocks) registers an async work on
conn->async_requests via setup_async_work(), with cancel_fn =
smb2_remove_blocked_lock and cancel_argv[0] pointing at the struct file_lock.

When the request is cancelled, the worker frees the file_lock with
locks_free_lock() and takes the cancelled early-exit, which "goto out"s and never
reaches release_async_work() -- the only site that unlinks the work from
conn->async_requests and clears cancel_fn/cancel_argv. The work therefore stays
matchable on async_requests with a live cancel_fn pointing at the freed file_lock,
until connection teardown finally runs release_async_work().

smb2_cancel() fires cancel_fn unconditionally with no state guard, so a second
SMB2_CANCEL for the same AsyncId, arriving in that window, re-runs
smb2_remove_blocked_lock() on the freed file_lock -- a slab use-after-free:

  BUG: KASAN: slab-use-after-free in __locks_delete_block
    __locks_delete_block
    locks_delete_block
    ksmbd_vfs_posix_lock_unblock
    smb2_remove_blocked_lock
    smb2_cancel                 <- 2nd SMB2_CANCEL fires cancel_fn
    handle_ksmbd_work
  Allocated by ...: locks_alloc_lock <- smb2_lock
  Freed by ...:     locks_free_lock  <- smb2_lock (cancelled branch)
  ... cache file_lock_cache of size 192

Reproduced on mainline with KASAN by an authenticated SMB client.

Skip a work whose state is already KSMBD_WORK_CANCELLED so its cancel callback
cannot be fired a second time.

Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix durable reconnect double-bind race in ksmbd_reopen_durable_fd

Two concurrent same-user DHnC reconnects can both observe fp->conn == NULL
before either sets it. ksmbd_reopen_durable_fd() checks fp->conn to guard
against a handle already being reconnected, but the check and the binding
assignment are not atomic: both threads pass the guard, both call
ksmbd_conn_get() on the same fp, and both eventually reach
kfree(fp->owner.name) -- a double-free of the owner.name slab object.
The double-bound ksmbd_file also causes a write-UAF on the 344-byte
ksmbd_file_cache object when a concurrent smb2_close() spins on fp->f_lock
after the object has been freed by the losing reconnect path.

KASAN on 7.1-rc5 (48-thread concurrent reconnect, 3000 cycles):
  BUG: KASAN: double-free in ksmbd_reopen_durable_fd+0x268/0x308
  BUG: KASAN: slab-use-after-free in _raw_spin_lock+0xac/0x150
    Write of size 4 at offset 24 into freed ksmbd_file_cache object
Five double-bind windows observed; 63 total KASAN reports triggered.

Fix: validate and claim fp->conn under write_lock(&global_ft.lock) so the
check-and-claim is atomic. ksmbd_lookup_durable_fd() already treats
fp->conn != NULL as "in use" and skips such an fp; setting fp->conn before
dropping the lock closes the race. ksmbd_conn_get() is a non-sleeping
refcount increment, safe under the rwlock. The rollback path on __open_id()
failure also clears fp->conn/tcon under the lock so concurrent readers see
a consistent state.

Fixes: b1f1e80620de ("ksmbd: centralize ksmbd_conn final release to plug transport leak")
Assisted-by: Henry (Claude):claude-opus-4
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix NULL-deref of opinfo->conn in oplock/lease break notifiers

smb2_oplock_break_noti() and smb2_lease_break_noti() read opinfo->conn
into a local with neither READ_ONCE() nor a NULL check.  Both run from
oplock_break() after opinfo_get_list() has dropped ci->m_lock, so a
concurrent SMB2 LOGOFF (session_fd_check()) can set op->conn = NULL
under ci->m_lock within that window.  ksmbd_conn_r_count_inc(conn) then
writes through NULL at offset 0xc4 -- a remotely triggerable oops.

Guard both reads the way compare_guid_key() already does: read
opinfo->conn with READ_ONCE() and return early if it is NULL, before
allocating the work struct so nothing leaks.  A NULL conn means the
client is gone and the break is moot, so return 0; oplock_break() treats
that as success and runs the normal teardown.

Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Assisted-by: Henry (Claude):claude-opus-4
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

Merge tag 'media/v7.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

- rc: igorplugusb: fix control request setup packet

- vsp1: revert a couple patches to fix regressions when setting DRM
   pipelines

* tag 'media/v7.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: rc: igorplugusb: fix control request setup packet
  Revert "media: renesas: vsp1: brx: Fix format propagation"
  Revert "media: renesas: vsp1: Initialize format on all pads"

Merge tag 'x86-urgent-2026-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

- Make the clearcpuid= boot parameter less prominent
   and warn about its dangers & caveats (Borislav Petkov)

- Do not access the (new) PLATFORM_ID MSR when running
   as a guest (Borislav Petkov)

- x86 ftrace: Relocate %rip-relative percpu refs in dynamic
   trampolines, to fix crash when using such trampolines
   (Alexis Lothoré)

- Fix x86-64 CFI build error (Peter Zijlstra)

- Revert FPU signal return magic number check optimization,
   because it broke CRIU and gVisor in certain FPU configurations
   (Andrei Vagin)

* tag 'x86-urgent-2026-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "x86/fpu: Refine and simplify the magic number check during signal return"
  x86/kvm/vmx: Fix x86_64 CFI build
  x86/ftrace: Relocate %rip-relative percpu refs in dynamic trampolines
  x86/microcode: Do not access MSR_IA32_PLATFORM_ID when running as a guest
  Documentation/arch/x86: Hide clearcpuid=

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Two core changes, the only one of significance being the change to
  kick queues in SDEV_CANCEL which had a small window for stuck
  requests.

  The major driver fixes are the one to the FC transport class to widen
  the FPIN counter to counter a theoretical (and privileged) fabric
  traffic injection attack and the other is an iscsi fix where a
  malicious target could trick the kernel into an output buffer overrun.

  Both the driver fixes were AI assisted"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: iscsi: Validate CHAP_R length before base64 decode
  scsi: target: iscsi: Bound iscsi_encode_text_output() appends to rsp_buf
  scsi: target: iscsi: Fix CRC overread and double-free in iscsit_handle_text_cmd()
  scsi: fcoe: Reject FIP descriptors with zero fip_dlen in CVL walker
  scsi: scsi_transport_fc: Widen FPIN pname walker counter to u32
  scsi: scsi_debug: Add missing newline in scsi_debug_device_reset()
  scsi: megaraid_sas: Fix NULL pointer dereference on firmware duplicate completion
  scsi: devinfo: Add BLIST_NO_RSOC for Promise VTrak E310f
  scsi: core: Run queues for all non-SDEV_DEL devices from scsi_run_host_queues

Merge tag 'i2c-for-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

- davinci: fix fallback bus frequency on missing clock-frequency

- virtio: mark device ready initially

* tag 'i2c-for-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: virtio: mark device ready before registering the adapter
i2c: davinci: fix division by zero on missing clock-frequency

Merge tag 'input-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

- updates to Elan I2C touchpad driver to handle a new IC type and to
   validate size of supplied firmware to prevent OOB access

- updates to Xpad controller driver to recognize ASUS ROG RAIKIRI II
   and "Nova 2 Lite" from GameSir controllers as well as a fix to
   prevent a potential OOB access when handling "Share" button

- an update to Synaptics touchpad driver to use RMI mode for touchpad
   in Thinkpad E490

- updates to Atmel MXT driver adding checks to prevent potential OOB
   accesses

- a fix to IMS PCU driver to free correct amount of memory when tearing
   it down

- a fixup to the recent change to Atlas buttons driver

- a small cleanup in fm801-fp for PCI IDs table initialisation

* tag 'input-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ims-pcu - fix usb_free_coherent() size in ims_pcu_buffers_free()
  Input: synaptics - add LEN2058 to SMBus passlist for ThinkPad E490
  Input: atlas - check ACPI_COMPANION() against NULL
  Input: atmel_mxt_ts - check mem_size before calculating config memory size
  Input: atmel_mxt_ts - fix boundary check in mxt_prepare_cfg_mem
  Input: fm801-gp - simplify initialisation of pci_device_id array
  Input: xpad - add "Nova 2 Lite" from GameSir
  Input: xpad - add support for ASUS ROG RAIKIRI II
  Input: elan_i2c - validate firmware size before use
  Input: xpad - fix out-of-bounds access for Share button
  Input: usbtouchscreen - clamp NEXIO data_len/x_len to URB buffer size
  Input: elan_i2c - increase device reset wait timeout after update FW
  Input: elan_i2c - add ic type 0x19

ALSA: usb-audio: Set the value of potential sticky mixers to maximum

It makes no sense to restore the saved value for a sticky mixer, since
setting any value is a no-op.

However, in some rare cases, SET_CUR is effective despite GET_CUR always
returns a constant value. These mixers are not sticky, but there's no
way to distinguish them. Without any additional information, the best
thing we can do is to set the mixer value to the maximum before bailing
out, so that a soft mixer can still reach the maximum hardware volume if
the mixer turns out to be non-sticky. Meanwhile, all channels must be
synchronized to prevent imbalance volume.

Fixes: 86aa1ea1f15c ("ALSA: usb-audio: Do not expose sticky mixers")
Signed-off-by: Rong Zhang <i@rong.moe>
Link: https://patch.msgid.link/20260531-uac-sticky-error-path-v1-1-12c2329d17ef@rong.moe
Signed-off-by: Takashi Iwai <tiwai@suse.de>

wifi: iwlwifi: pcie: simplify the resume flow if fast resume is not used

In most distributions, NetworkManager shuts the device down before
entering system suspend, so fast suspend is typically not used.

On older devices, resume currently tries to grab NIC access to infer
whether the device was powered off while suspended. That probe is only
meaningful for the fast-suspend path where the device is expected to
remain alive.

Unfortunately, for unclear reasons, grabbing NIC access was harmful as
reported in the bugzilla ticket below.

Workaround this issue by simply not grabbing NIC access if fast suspend
is not used.

Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221501
Assisted-by: GitHub Copilot:gpt-5.3-codex
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://patch.msgid.link/20260531133005.e2ed9e0cd44f.If283625983a843933e0c01561a421daff184e9e9@changeid
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>

md/raid0: use str_plural helper in dump_zones

Replace the manual ternary "s" pluralization with str_plural() to
simplify the code.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260527141932.1243503-2-thorsten.blum@linux.dev
Signed-off-by: Yu Kuai <yukuai@fygo.io>

raid1: fix nr_pending leak in REQ_ATOMIC bad-block error path

In raid1_write_request(), each per-mirror loop iteration begins by
incrementing rdev->nr_pending. If a REQ_ATOMIC write encounters a
badblock within the requested range, the code jumps to err_handle
without dropping the reference taken for the current mirror.

err_handle's cleanup loop will only decrements for k < i and
r1_bio->bios[k] is non-NULL. The current slot is therefore skipped,
leaving its nr_pending reference leaked permanently. The reference
prevents the rdev from ever being removed, since raid1_remove_conf()
refuses to remove an rdev with nr_pending > 0.

Fix this by calling rdev_dec_pending() before jumping to err_handle.

Fixes: f2a38abf5f1c ("md/raid1: Atomic write support")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Link: https://patch.msgid.link/20260530151411.4119-1-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

md/raid1: move the exceed_read_errors condition out of fix_read_error

This condition much better fits into the only caller, limiting
fix_read_error to actually fix up data devices after a read error.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260529054308.2720300-3-hch@lst.de
Signed-off-by: Yu Kuai <yukuai@fygo.io>

md/raid1: cleanup handle_read_error

Unwind the main conditional with duplicate conditions and initialize
variables at initialization time where possible.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260529054308.2720300-2-hch@lst.de
Signed-off-by: Yu Kuai <yukuai@fygo.io>

md/raid1,raid10: fix bio accounting for split md cloned bios

Use md_cloned_bio() to control bio accounting instead of relying
on r1bio_existed in raid1 or the io_accounting flag in raid10.

The previous logic does not reliably reflect whether a bio is an
md cloned bio. When a failed bio is split and resubmitted via
bio_submit_split_bioset() on the error path, this can lead to either
double accounting for md cloned bios, or missing accounting for bios
returned from bio_submit_split_bioset()

Fix this by using md_cloned_bio() to detect md cloned bios and
skip accounting accordingly.

Fixes: bb2a9acefaf9 ("md/raid1: switch to use md_account_bio() for io accounting")
Fixes: 820455238366 ("md/raid10: switch to use md_account_bio() for io accounting")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Link: https://patch.msgid.link/20260501114652.590037-4-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

md/raid1,raid10: fix error-path detection with md_cloned_bio()

Detect the error path using md_cloned_bio() instead of relying
on r1_bio in raid1 or r10_bio->read_slot in raid10, which may be
NULL or -1 after splitting and resubmitting a failed bio.

As a result, the error path may not be recognized and memory
allocations can incorrectly use GFP_NOIO instead of
(GFP_NOIO | __GFP_HIGH), which can lead to a deadlock under
memory pressure.

Fixes: 689389a06ce7 ("md/raid1: simplify handle_read_error().")
Fixes: 545250f24809 ("md/raid10: simplify handle_read_error()")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Link: https://patch.msgid.link/20260501114652.590037-3-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

md/raid1,raid10: fix deadlock in read error recovery path

raid1d and raid10d may resubmit a split md cloned bio while handling
a read error. In this case, resubmitting the bio can lead to a deadlock
if the array is suspended before md_handle_request() acquires an
active_io reference via percpu_ref_tryget_live().

Since the cloned bio already holds an active_io reference,
trying to acquire another reference via percpu_ref_tryget_live()
can lead to a deadlock while the array is suspended.

Fix this by using percpu_ref_get() for md cloned bios.

Fixes: bb2a9acefaf9 ("md/raid1: switch to use md_account_bio() for io accounting")
Fixes: 820455238366 ("md/raid10: switch to use md_account_bio() for io accounting")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Reviewed-by: Yu Kuai <yukuai@fygo.io>
Link: https://patch.msgid.link/20260501114652.590037-2-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

md/raid10: reset read_slot when reusing r10bio for discard

put_all_bios() always drops devs[i].bio, but it only drops
devs[i].repl_bio when r10_bio->read_slot < 0. If discard reuses an
r10bio that was previously used for a read, read_slot can still be
non-negative, and discard cleanup can skip bio_put() on repl_bio.

Reset read_slot to -1 when preparing an r10bio for discard so the
replacement bio is always released correctly.

Fixes: d30588b2731f ("md/raid10: improve raid10 discard request")
Signed-off-by: Chen Cheng <chencheng@fnnas.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Link: https://patch.msgid.link/20260515093019.3436882-1-chencheng@fnnas.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

md: skip redundant raid_disks update when value is unchanged

Calling update_raid_disks() with the same value as the current one
can trigger unnecessary work. For example, RAID1 will reallocate
resources such as the mempool for r1bio.

Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Link: https://patch.msgid.link/20260428130524.448063-1-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

dm-raid: only requeue bios when dm is suspending

Returning DM_MAPIO_REQUEUE from the target map() function only requeues
the bio during noflush suspends. During regular operations or during
flushing suspends, it fails the bio. Failing the bio during flushing
suspends is the correct behavior here. The bio cannot be handled, and
dm-raid cannot suspend while it is outstanding. But during normal
operations, dm-raid should not push the bio back to dm. Instead, wait
for the reshape to be resumed.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Link: https://patch.msgid.link/20260428232010.2785514-1-bmarzins@redhat.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

MAINTAINERS: Update Li Nan's E-mail address

Change to my new email address on didiglobal.com.

Signed-off-by: Li Nan <magiclinan@didiglobal.com>
Link: https://patch.msgid.link/tencent_8F8173BEDF20E98550D5429DF802F34A7108@qq.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

MAINTAINERS: update Yu Kuai's email address

Update Yu Kuai's maintainer entries to use the new fygo.io address.

Link: https://patch.msgid.link/20260520112627.1264368-1-yukuai@fnnas.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>

Merge tag 'v7.1-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- fix uninitialized variable in smb2_writev_callback()

- detect short folioq copy in cifs_copy_folioq_to_iter()

* tag 'v7.1-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix uninitialized variable in smb2_writev_callback
smb: client: detect short folioq copy in cifs_copy_folioq_to_iter()

Merge tag 'liveupdate-fixes-2026-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/liveupdate/linux

Pull liveupdate fixes from Mike Rapoport:
"Two kexec handover regression fixes:

   - fix order calculation for kho_unpreserve_pages() to make sure sure
     that the order calculation in kho_unpreserve_pages() mathes the
     order calculation in kho_preserve_pages().

   - fix math in calculation of KHO_TREE_MAX_DEPTH to make it work with
     16KB pages"

* tag 'liveupdate-fixes-2026-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/liveupdate/linux:
  kho: fix order calculation for kho_unpreserve_pages()
  kho: fix KHO_TREE_MAX_DEPTH for non-4KB page sizes

Merge tag 'fixes-2026-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock fix from Mike Rapoport:
"Fix regression from memblock_free_late() refactoring

  After refactoring of memblock_free_late() and free_init_pages() it
  became possible to call memblock_free() after memblock init data was
  discarded.

  Make sure memblock_free() does not touch memblock.reserved unless it
  is called early enough or when ARCH_KEEP_MEMBLOCK is enabled"

* tag 'fixes-2026-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: don't touch memblock arrays when memblock_free() is called late

i2c: core: clean up adapter registration error label

Clean up the adapter registration error labels by making sure that also
the last one is named after what it does.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: clean up bus id allocation

Clean up bus id allocation by using a common helper and deferring it
until it is needed during adapter registration.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: fix adapter deregistration race

Adapters can be looked up by their id using i2c_get_adapter() which
takes a reference to the embedded struct device.

Remove the adapter from the IDR before tearing it down during
deregistration (and on registration failure) to make sure its resources
are not accessed after having been freed (e.g. the device name).

Fixes: 35fc37f81881 ("i2c: Limit core locking to the necessary sections")
Cc: stable@vger.kernel.org # 2.6.31
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: fix adapter registration race

Adapters can be looked up based on their id using i2c_get_adapter()
which takes a reference to the embedded struct device.

Make sure that the adapter (including its struct device) has been
initialised before adding it to the IDR to avoid accessing uninitialised
data which could, for example, lead to NULL-pointer dereferences or
use-after-free.

Note that the i2c-dev chardev, which is registered from a bus notifier,
currently uses i2c_get_adapter() so the adapter needs to be added to the
IDR before registration.

Fixes: 6e13e6418418 ("i2c: Add i2c_add_numbered_adapter()")
Cc: stable@vger.kernel.org # 2.6.22
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: disable runtime PM on adapter registration failure

Runtime PM is disabled by driver core when deregistering a device (and
on registration failure) but add an explicit disable to balance the
enable call when adapter registration fails for symmetry.

Fixes: 23a698fe65ec ("i2c: core: treat EPROBE_DEFER when acquiring SCL/SDA GPIOs")
Cc: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: fix adapter debugfs creation

Clients can be registered from bus notifier callbacks so the debugfs
directory needs to be created before registering the adapter as clients
use that directory as their debugfs parent.

Move debugfs creation before adapter registration to avoid having
clients create their debugfs directories in the debugfs root (which is
also more likely to fail due to name collisions).

Note that failure to allocate the adapter name must now be handled
explicitly as debugfs_create_dir() cannot handle a NULL name (unlike
device_add() which returns an error).

Fixes: 73febd775bdb ("i2c: create debugfs entry per adapter")
Cc: stable@vger.kernel.org # 6.8
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: fix adapter probe deferral loop

Drivers must not probe defer after having registered devices as that
will trigger a probe loop if the devices bind to a driver (cf. commit
fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")).

Move the recovery initialisation, where the GPIO lookup may fail, before
registering the adapter to prevent this.

Fixes: 75820314de26 ("i2c: core: add generic I2C GPIO recovery")
Cc: stable@vger.kernel.org # 5.9
Cc: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: fix NULL-deref on adapter registration failure

If adapter registration ever fails the release callback would trigger a
NULL-pointer dereference as the completion struct has not been
initialised.

Note that before the offending commit this would instead have resulted
in a minor memory leak of the adapter name.

Fixes: 3f8c4f5e9a57 ("i2c: core: fix reference leak in i2c_register_adapter()")
Cc: stable@vger.kernel.org
Cc: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: fix hang on adapter registration failure

Clients may be registered from bus notifier callbacks when the adapter
is registered. On a subsequent error during registration, the adapter
references taken by such clients prevent the wait for the references to
be released from ever completing.

Fix this by refactoring client deregistration and deregistering also on
late adapter registration failures.

Fixes: f8756c67b3de ("i2c: core: call of_i2c_setup_smbus_alert in i2c_register_adapter")
Cc: stable@vger.kernel.org # 4.15
Cc: Phil Reid <preid@electromag.com.au>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: core: fix irq domain leak on adapter registration failure

Make sure to tear down the host notify irq domain on adapter
registration failure to avoid leaking it.

This issue was flagged by Sashiko when reviewing another adapter
registration fix.

Fixes: 4d5538f5882a ("i2c: use an IRQ to report Host Notify events, not alert")
Cc: stable@vger.kernel.org # 4.10
Cc: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

wifi: iwlwifi: mvm: avoid oversized UATS command copy

MCC_ALLOWED_AP_TYPE_CMD exceeds the fixed copied host-command buffer
and triggers warnings in the gen2 enqueue path when command
0xc05 is sent.

Use IWL_HCMD_DFL_NOCOPY as it was done before the offending commit.

Fixes: 078df640ef05 ("wifi: iwlwifi: mld: add support for iwl_mcc_allowed_ap_type_cmd v2")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260529085453.9af349ab459b.I348df3980764c15efce0099a35fe8a88fb2a6ee2@changeid