Input: edt-ft5x06 - fix use-after-free in debugfs teardown
The commit 68743c500c6e ("Input: edt-ft5x06 - use per-client debugfs
directory") removed the manual debugfs teardown, relying on the I2C core
to handle it. However, this creates a window where debugfs files are
still accessible after edt_ft5x06_ts_teardown_debugfs() frees
tsdata->raw_buffer.
To prevent a use-after-free, protect the freeing of raw_buffer with the
device mutex and set raw_buffer to NULL. The debugfs read function
already checks if raw_buffer is NULL under the same mutex, so this
safely avoids the use-after-free.
drm/v3d: Reject empty multisync extension to prevent infinite loop
v3d_get_extensions() walks a userspace-provided singly-linked list of
ioctl extensions without any bound on the chain length. A local user
can craft a self-referential extension (ext->next == &ext) with zero
in_sync_count and out_sync_count, which bypasses the existing duplicate-
extension guard:
if (se->in_sync_count || se->out_sync_count)
return -EINVAL;
The guard never fires because v3d_get_multisync_post_deps() returns
immediately when count is zero, leaving both fields at zero on every
iteration. The result is an infinite loop in kernel context, blocking
the calling thread and pegging a CPU core indefinitely.
Fix this by rejecting a multisync extension where both in_sync_count
and out_sync_count are zero in v3d_get_multisync_submit_deps(). An
empty multisync carries no synchronization information and serves no
useful purpose, so returning -EINVAL for such an extension is the
correct defense against this attack vector.
Igor Korotin [Thu, 9 Apr 2026 18:41:00 +0000 (19:41 +0100)]
MAINTAINERS: add Rust I2C tree and update Igor Korotin's email
Add a git tree entry for Rust I2C development and update the e-mail
address. The tree will be used to collect patches and provide a basis
for integration and testing, including linux-next.
Signed-off-by: Igor Korotin <igor.korotin@linux.dev> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Merge tag 'mm-hotfixes-stable-2026-04-19-00-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM fixes from Andrew Morton:
"7 hotfixes. 6 are cc:stable and all are for MM. Please see the
individual changelogs for details"
* tag 'mm-hotfixes-stable-2026-04-19-00-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/damon/core: disallow non-power of two min_region_sz on damon_start()
mm/vmalloc: take vmap_purge_lock in shrinker
mm: call ->free_folio() directly in folio_unmap_invalidate()
mm: blk-cgroup: fix use-after-free in cgwb_release_workfn()
mm/zone_device: do not touch device folio after calling ->folio_free()
mm/damon/core: disallow time-quota setting zero esz
mm/mempolicy: fix weighted interleave auto sysfs name
Merge tag 'driver-core-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fixes from Danilo Krummrich:
- Prevent a device from being probed before device_add() has finished
initializing it; gate probe with a "ready_to_probe" device flag to
avoid races with concurrent driver_register() calls
- Fix a kernel-doc warning for DEV_FLAG_COUNT introduced by the above
- Return -ENOTCONN from software_node_get_reference_args() when a
referenced software node is known but not yet registered, allowing
callers to defer probe
- In sysfs_group_attrs_change_owner(), also check is_visible_const();
missed when the const variant was introduced
* tag 'driver-core-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
driver core: Add kernel-doc for DEV_FLAG_COUNT enum value
sysfs: attribute_group: Respect is_visible_const() when changing owner
software node: return -ENOTCONN when referenced swnode is not registered yet
driver core: Don't let a device probe until it's ready
Merge tag 'staging-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here is the "big" set of staging driver changes for 7.1-rc1.
Nothing major in here at all, just lots of little cleanups for the
staging drivers, driven by new developers getting their feet wet in
kernel development. "Largest" thing in here is the change of some of
the octeon variable types into proper kernel ones.
Full details are in the shortlog.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (154 commits)
staging: rtl8723bs: remove redundant & parentheses
staging: most: dim2: replace BUG_ON() in poison_channel()
staging: most: dim2: replace BUG_ON() in enqueue()
staging: most: dim2: replace BUG_ON() in configure_channel()
staging: most: dim2: replace BUG_ON() in service_done_flag()
staging: most: dim2: replace BUG_ON() in try_start_dim_transfer()
staging: rtl8723bs: remove unused RTL8188E antenna selection macros
staging: rtl8723bs: remove redundant blank lines in basic_types.h
staging: rtl8723bs: wrap complex macros with parentheses
staging: rtl8723bs: remove unused WRITEEF/READEF byte macros
staging: rtl8723bs: rename camelCase variable
staging: greybus: audio: fix error message for BTN_3 button
staging: rtl8723bs: rename variables to snake_case
staging: rtl8723bs: fix spelling in comment
staging: rtl8723bs: cleanup return in sdio_init()
staging: rtl8723bs: use direct returns in sdio_dvobj_init()
staging: rtl8723bs: remove unused arg at odm_interface.h
greybus: raw: fix use-after-free if write is called after disconnect
greybus: raw: fix use-after-free on cdev close
staging: rtl8723bs: fix logical continuations in xmit_linux.c
...
Merge tag 'usb-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 7.1-rc1.
Lots of little things in here, nothing major, just constant
improvements, updates, and new features. Highlights are:
- new USB power supply driver support.
These changes did touch outside of drivers/usb/ but got acks from
the relevant mantainers for them.
- dts file updates and conversions
- string function conversions into "safer" ones
- new device quirks
- xhci driver updates
- usb gadget driver minor fixes
- typec driver additions and updates
- small number of thunderbolt driver changes
- dwc3 driver updates and additions of new hardware support
- other minor driver updates
All of these have been in the linux-next tree for a while with no
reported issues"
* tag 'usb-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (176 commits)
usb: dwc3: starfive: Add JHB100 USB 2.0 DRD controller
dt-bindings: usb: dwc3: add support for StarFive JHB100
dt-bindings: usb: atmel,at91sam9rl-udc: convert to DT schema
dt-bindings: usb: atmel,at91rm9200-udc: convert to DT schema
dt-bindings: usb: generic-ehci: fix schema structure and add at91sam9g45 constraints
dt-bindings: usb: generic-ohci: add AT91RM9200 OHCI binding support
arm: dts: at91: remove unused #address-cells/#size-cells from sam9x60 udc node
drivers/usb/host: Fix spelling error 'seperate' -> 'separate'
usbip: tools: add hint when no exported devices are found
USB: serial: iuu_phoenix: fix iuutool author name
usb: gadget: f_ncm: validate minimum block_len in ncm_unwrap_ntb()
usb: gadget: f_phonet: fix skb frags[] overflow in pn_rx_complete()
usb: gadget: f_hid: Add missing error code
usb: typec: cros_ec_ucsi: Load driver from OF and ACPI definitions
dt-bindings: chrome: Add cros-ec-ucsi compatibility to typec binding
USB: of: Simplify with scoped for each OF child loop
usbip: validate number_of_packets in usbip_pack_ret_submit()
usb: gadget: renesas_usb3: validate endpoint index in standard request handlers
usb: core: config: reverse the size check of the SSP isoc endpoint descriptor
usb: typec: ucsi: Set usb mode on partner change
...
Merge tag 'tty-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here is the set of tty and serial driver changes for 7.1-rc1.
Not much here this cycle, biggest thing is the removal of an old
driver that never got any actual hardware support (esp32), and the
second try to moving the tty ports to their own workqueues (first try
was in 7.0-rc1 but was reverted due to problems)
Otherwise it's just a small set of driver updates and some vt modifier
key enhancements.
All have been in linux-next for a while with no reported issues"
* tag 'tty-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (35 commits)
tty: serial: ip22zilog: Fix section mispatch warning
hvc/xen: Check console connection flag
serial: sh-sci: Add support for RZ/G3L RSCI
dt-bindings: serial: renesas,rsci: Document RZ/G3L SoC
tty: atmel_serial: update outdated reference to atmel_tasklet_func()
serial: xilinx_uartps: Drop unused include
serial: qcom-geni: drop stray newline format specifier
serial: 8250: loongson: Enable building on MIPS Loongson64
dt-bindings: serial: 8250: Add Loongson 3A4000 uart compatible
serial: 8250_fintek: Add support for F81214E
tty: tty_port: add workqueue to flip TTY buffer
vt: support ITU-T T.416 color subparameters
serial: qcom-geni: Fix RTS behavior with flow control
tty: serial: imx: keep dma request disabled before dma transfer setup
tty: serial: 8250: Add SystemBase Multi I/O cards
serial: pic32_uart: allow driver to be compiled on all architectures with COMPILE_TEST
serial: tegra: remove Kconfig dependency on APB DMA controller
dt-bindings: serial: amlogic,meson-uart: Add compatible string for A9
dt-bindings: serial: atmel,at91-usart: add microchip,lan9691-usart
serial: auart: check clk_enable() return in console write
...
Merge tag 'mm-stable-2026-04-18-02-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- "Eliminate Dying Memory Cgroup" (Qi Zheng and Muchun Song)
Address the longstanding "dying memcg problem". A situation wherein a
no-longer-used memory control group will hang around for an extended
period pointlessly consuming memory
- "fix unexpected type conversions and potential overflows" (Qi Zheng)
Fix a couple of potential 32-bit/64-bit issues which were identified
during review of the "Eliminate Dying Memory Cgroup" series
- "kho: history: track previous kernel version and kexec boot count"
(Breno Leitao)
Use Kexec Handover (KHO) to pass the previous kernel's version string
and the number of kexec reboots since the last cold boot to the next
kernel, and print it at boot time
- "MAINTAINERS: update KHO and LIVE UPDATE entries" (Pratyush Yadav)
Document upcoming changes in the maintenance of KHO, LUO, memfd_luo,
kexec, crash, kdump and probably other kexec-based things - they are
being moved out of mm.git and into a new git tree
* tag 'mm-stable-2026-04-18-02-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (121 commits)
MAINTAINERS: add page cache reviewer
mm/vmscan: avoid false-positive -Wuninitialized warning
MAINTAINERS: update Dave's kdump reviewer email address
MAINTAINERS: drop include/linux/liveupdate from LIVE UPDATE
MAINTAINERS: drop include/linux/kho/abi/ from KHO
MAINTAINERS: update KHO and LIVE UPDATE maintainers
MAINTAINERS: update kexec/kdump maintainers entries
mm/migrate_device: remove dead migration entry check in migrate_vma_collect_huge_pmd()
selftests: mm: skip charge_reserved_hugetlb without killall
userfaultfd: allow registration of ranges below mmap_min_addr
mm/vmstat: fix vmstat_shepherd double-scheduling vmstat_update
mm/hugetlb: fix early boot crash on parameters without '=' separator
zram: reject unrecognized type= values in recompress_store()
docs: proc: document ProtectionKey in smaps
mm/mprotect: special-case small folios when applying permissions
mm/mprotect: move softleaf code out of the main function
mm: remove '!root_reclaim' checking in should_abort_scan()
mm/sparse: fix comment for section map alignment
mm/page_io: use sio->len for PSWPIN accounting in sio_read_complete()
selftests/mm: transhuge_stress: skip the test when thp not available
...
SeongJae Park [Sat, 11 Apr 2026 21:36:36 +0000 (14:36 -0700)]
mm/damon/core: disallow non-power of two min_region_sz on damon_start()
Commit d8f867fa0825 ("mm/damon: add damon_ctx->min_sz_region") introduced
a bug that allows unaligned DAMON region address ranges. Commit c80f46ac228b ("mm/damon/core: disallow non-power of two min_region_sz")
fixed it, but only for damon_commit_ctx() use case. Still, DAMON sysfs
interface can emit non-power of two min_region_sz via damon_start(). Fix
the path by adding the is_power_of_2() check on damon_start().
decay_va_pool_node() can be invoked concurrently from two paths:
__purge_vmap_area_lazy() when pools are being purged, and the shrinker via
vmap_node_shrink_scan().
However, decay_va_pool_node() is not safe to run concurrently, and the
shrinker path currently lacks serialization, leading to races and possible
leaks.
Protect decay_va_pool_node() by taking vmap_purge_lock in the shrinker
path to ensure serialization with purge users.
Link: https://lore.kernel.org/20260413192646.14683-1-urezki@gmail.com Fixes: 7679ba6b36db ("mm: vmalloc: add a shrinker to drain vmap pools") Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Baoquan He <baoquan.he@linux.dev> Cc: chenyichong <chenyichong@uniontech.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: call ->free_folio() directly in folio_unmap_invalidate()
We can only call filemap_free_folio() if we have a reference to (or hold a
lock on) the mapping. Otherwise, we've already removed the folio from the
mapping so it no longer pins the mapping and the mapping can be removed,
causing a use-after-free when accessing mapping->a_ops.
Follow the same pattern as __remove_mapping() and load the free_folio
function pointer before dropping the lock on the mapping. That lets us
make filemap_free_folio() static as this was the only caller outside
filemap.c.
Link: https://lore.kernel.org/20260413184314.3419945-1-willy@infradead.org Fixes: fb7d3bc41493 ("mm/filemap: drop streaming/uncached pages when writeback completes") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Google Big Sleep <big-sleep-vuln-reports+bigsleep-501448199@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: blk-cgroup: fix use-after-free in cgwb_release_workfn()
cgwb_release_workfn() calls css_put(wb->blkcg_css) and then later accesses
wb->blkcg_css again via blkcg_unpin_online(). If css_put() drops the last
reference, the blkcg can be freed asynchronously (css_free_rwork_fn ->
blkcg_css_free -> kfree) before blkcg_unpin_online() dereferences the
pointer to access blkcg->online_pin, resulting in a use-after-free:
BUG: KASAN: slab-use-after-free in blkcg_unpin_online (./include/linux/instrumented.h:112 ./include/linux/atomic/atomic-instrumented.h:400 ./include/linux/refcount.h:389 ./include/linux/refcount.h:432 ./include/linux/refcount.h:450 block/blk-cgroup.c:1367)
Write of size 4 at addr ff11000117aa6160 by task kworker/71:1/531
Workqueue: cgwb_release cgwb_release_workfn
Call Trace:
<TASK>
blkcg_unpin_online (./include/linux/instrumented.h:112 ./include/linux/atomic/atomic-instrumented.h:400 ./include/linux/refcount.h:389 ./include/linux/refcount.h:432 ./include/linux/refcount.h:450 block/blk-cgroup.c:1367)
cgwb_release_workfn (mm/backing-dev.c:629)
process_scheduled_works (kernel/workqueue.c:3278 kernel/workqueue.c:3385)
** Stack based on commit 66672af7a095 ("Add linux-next specific files
for 20260410")
I am seeing this crash sporadically in Meta fleet across multiple kernel
versions. A full reproducer is available at:
https://github.com/leitao/debug/blob/main/reproducers/repro_blkcg_uaf.sh
(The race window is narrow. To make it easily reproducible, inject a
msleep(100) between css_put() and blkcg_unpin_online() in
cgwb_release_workfn(). With that delay and a KASAN-enabled kernel, the
reproducer triggers the splat reliably in less than a second.)
Fix this by moving blkcg_unpin_online() before css_put(), so the
cgwb's CSS reference keeps the blkcg alive while blkcg_unpin_online()
accesses it.
Link: https://lore.kernel.org/20260413-blkcg-v1-1-35b72622d16c@debian.org Fixes: 59b57717fff8 ("blkcg: delay blkg destruction until after writeback has finished") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: David Hildenbrand <david@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: JP Kobryn <inwardvessel@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Matthew Brost [Fri, 10 Apr 2026 23:03:46 +0000 (16:03 -0700)]
mm/zone_device: do not touch device folio after calling ->folio_free()
The contents of a device folio can immediately change after calling
->folio_free(), as the folio may be reallocated by a driver with a
different order. Instead of touching the folio again to extract the
pgmap, use the local stack variable when calling percpu_ref_put_many().
Link: https://lore.kernel.org/20260410230346.4009855-1-matthew.brost@intel.com Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Reviewed-by: Vishal Moola <vishal.moola@gmail.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Cc: David Hildenbrand <david@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Tue, 7 Apr 2026 00:31:52 +0000 (17:31 -0700)]
mm/damon/core: disallow time-quota setting zero esz
When the throughput of a DAMOS scheme is very slow, DAMOS time quota can
make the effective size quota smaller than damon_ctx->min_region_sz. In
the case, damos_apply_scheme() will skip applying the action, because the
action is tried at region level, which requires >=min_region_sz size.
That is, the quota is effectively exceeded for the quota charge window.
Because no action will be applied, the total_charged_sz and
total_charged_ns are also not updated. damos_set_effective_quota() will
try to update the effective size quota before starting the next charge
window. However, because the total_charged_sz and total_charged_ns have
not updated, the throughput and effective size quota are also not changed.
Since effective size quota can only be decreased, other effective size
quota update factors including DAMOS quota goals and size quota cannot
make any change, either.
As a result, the scheme is unexpectedly deactivated until the user notices
and mitigates the situation. The users can mitigate this situation by
changing the time quota online or re-install the scheme. While the
mitigation is somewhat straightforward, finding the situation would be
challenging, because DAMON is not providing good observabilities for that.
Even if such observability is provided, doing the additional monitoring
and the mitigation is somewhat cumbersome and not aligned to the intention
of the time quota. The time quota was intended to help reduce the user's
administration overhead.
Fix the problem by setting time quota-modified effective size quota be at
least min_region_sz always.
mm/mempolicy: fix weighted interleave auto sysfs name
The __ATTR macro is a utility that makes defining kobj_attributes easier
by stringfying the name, verifying the mode, and setting the show/store
fields in a single initializer. It takes a raw token as the first value,
rather than a string, so that __ATTR family macros like __ATTR_RW can
token-paste it for inferring the _show / _store function names.
Commit e341f9c3c841 ("mm/mempolicy: Weighted Interleave Auto-tuning") used
the __ATTR macro to define the "auto" sysfs for weighted interleave. A
few months later, commit 2fb6915fa22d ("compiler_types.h: add "auto" as a
macro for "__auto_type"") introduced a #define macro which expanded auto
into __auto_type.
This led to the "auto" token passed into __ATTR to be expanded out into
__auto_type, and the sysfs entry to be displayed as __auto_type as well.
Expand out the __ATTR macro and directly pass a string "auto" instead of
the raw token 'auto' to prevent it from being expanded out. Also bypass
the VERIFY_OCTAL_PERMISSIONS check by triple checking that 0664 is indeed
the intended permissions for this sysfs file.
Before:
$ ls /sys/kernel/mm/mempolicy/weighted_interleave
__auto_type node0
After:
$ ls /sys/kernel/mm/mempolicy/weighted_interleave/
auto node0
Link: https://lore.kernel.org/20260407141415.3080960-1-joshua.hahnjy@gmail.com Fixes: 2fb6915fa22d ("compiler_types.h: add "auto" as a macro for "__auto_type"") Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Rakie Kim <rakie.kim@sk.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Ying Huang <ying.huang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Eric Biggers [Sat, 18 Apr 2026 19:21:38 +0000 (12:21 -0700)]
lib/crypto: docs: Add rst documentation to Documentation/crypto/
Add a documentation file Documentation/crypto/libcrypto.rst which
provides a high-level overview of lib/crypto/.
Also add several sub-pages which include the kernel-doc for the
algorithms that have it. This makes the existing, quite extensive
kernel-doc start being included in the HTML and PDF documentation.
Note that the intent is very much *not* that everyone has to read these
Documentation/ files. The library is intended to be straightforward and
use familiar conventions; generally it should be possible to dive right
into the kernel-doc. You shouldn't need to read a lot of documentation
to just call `sha256()`, for example, or to run the unit tests if you're
already familiar with KUnit. (This differs from the traditional crypto
API which has a larger barrier to entry.)
Nevertheless, this seems worth adding. Hopefully it is useful and makes
LWN no longer consider the library to be "meticulously undocumented".
Eric Biggers [Sat, 18 Apr 2026 19:21:37 +0000 (12:21 -0700)]
docs: kdoc: Expand 'at_least' when creating parameter list
sphinx doesn't know that the kernel headers do:
#define at_least static
Do this replacement before declarations are passed to it.
This prevents errors like the following from appearing once the
lib/crypto/ kernel-doc is wired up to the sphinx build:
linux/Documentation/crypto/libcrypto:128: ./include/crypto/sha2.h:773: WARNING: Error in declarator or parameters
Error in declarator or parameters
Invalid C declaration: Expected ']' in end of array operator. [error at 59]
void sha512_final (struct sha512_ctx *ctx, u8 out[at_least SHA512_DIGEST_SIZE])
Acked-by: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20260418192138.15556-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Merge tag 'pinctrl-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"Core changes:
- Perform basic checks on pin config properties so as not to allow
directly contradictory settings such as setting a pin to more than
one bias or drive mode
- Introduce pinctrl_gpio_get_config() handling in the core for SCMI
GPIO using pin control
New drivers:
- GPIO-by-pin control driver (also appearing in the GPIO pull
request) fulfilling a promise on a comment from Grant Likely many
years ago: "can't GPIO just be a front-end for pin control?" it
turns out it can, if and only if you design something new from
scratch, such as SCMI
- Broadcom BCM7038 as a pinctrl-single delegate
- Mobileye EyeQ6Lplus OLB pin controller
- Qualcomm Eliza and Hawi families TLMM pin controllers
- Qualcomm SDM670 and Milos family LPASS LPI pin controllers
- Qualcomm IPQ5210 pin controller
- Realtek RTD1625 pin controller support
- Rockchip RV1103B pin controller support
- Texas Instruments AM62L as a pinctrl-single delegate
Improvements:
- Set config implementation for the Spacemit K1 pin controller"
* tag 'pinctrl-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (84 commits)
pinctrl: qcom: Add Hawi pinctrl driver
dt-bindings: pinctrl: qcom: Describe Hawi TLMM block
dt-bindings: pinctrl: pinctrl-max77620: convert to DT schema
pinctrl: single: Add bcm7038-padconf compatible matching
dt-bindings: pinctrl: pinctrl-single: Add brcm,bcm7038-padconf
dt-bindings: pinctrl: apple,pinctrl: Add t8122 compatible
pinctrl: qcom: sdm670-lpass-lpi: label variables as static
pinctrl: sophgo: pinctrl-sg2044: Fix wrong module description
pinctrl: sophgo: pinctrl-sg2042: Fix wrong module description
pinctrl: qcom: add sdm670 lpi tlmm
dt-bindings: pinctrl: qcom: Add SDM670 LPASS LPI pinctrl
dt-bindings: qcom: lpass-lpi-common: add reserved GPIOs property
pinctrl: qcom: Introduce IPQ5210 TLMM driver
dt-bindings: pinctrl: qcom: add IPQ5210 pinctrl
pinctrl: qcom: Drop redundant intr_target_reg on modern SoCs
pinctrl: qcom: eliza: Fix interrupt target bit
pinctrl: core: Don't use "proxy" headers
pinctrl: amd: Support new ACPI ID AMDI0033
pinctrl: renesas: rzg2l: Drop superfluous blank line
pinctrl: renesas: rzg2l: Fix save/restore of {IOLH,IEN,PUPD,SMT} registers
...
David Carlier [Sat, 18 Apr 2026 19:17:37 +0000 (20:17 +0100)]
eventfs: Hold eventfs_mutex and SRCU when remount walks events
Commit 340f0c7067a9 ("eventfs: Update all the eventfs_inodes from the
events descriptor") had eventfs_set_attrs() recurse through ei->children
on remount. The walk only holds the rcu_read_lock() taken by
tracefs_apply_options() over tracefs_inodes, which is wrong:
- list_for_each_entry over ei->children races with the list_del_rcu()
in eventfs_remove_rec() -- LIST_POISON1 deref, same shape as d2603279c7d6.
- eventfs_inodes are freed via call_srcu(&eventfs_srcu, ...).
rcu_read_lock() does not extend an SRCU grace period, so ti->private
can be reclaimed under the walk.
- The writes to ei->attr race with eventfs_set_attr(), which holds
eventfs_mutex.
Reproducer:
while :; do mount -o remount,uid=$((RANDOM%1000)) /sys/kernel/tracing; done &
while :; do
echo "p:kp submit_bio" > /sys/kernel/tracing/kprobe_events
echo > /sys/kernel/tracing/kprobe_events
done
Wrap the events portion of tracefs_apply_options() in
eventfs_remount_lock()/_unlock() that take eventfs_mutex and
srcu_read_lock(&eventfs_srcu). eventfs_set_attrs() doesn't sleep so the
nested rcu_read_lock() is fine; lockdep_assert_held() pins the contract.
Comment in tracefs_drop_inode() said "RCU cycle" -- it is SRCU.
Fixes: 340f0c7067a9 ("eventfs: Update all the eventfs_inodes from the events descriptor") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260418191737.10289-1-devnexen@gmail.com Signed-off-by: David Carlier <devnexen@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
David Carlier [Sat, 18 Apr 2026 15:22:50 +0000 (16:22 +0100)]
eventfs: Use list_add_tail_rcu() for SRCU-protected children list
Commit d2603279c7d6 ("eventfs: Use list_del_rcu() for SRCU protected
list variable") converted the removal side to pair with the
list_for_each_entry_srcu() walker in eventfs_iterate(). The insertion
in eventfs_create_dir() was left as a plain list_add_tail(), which on
weakly-ordered architectures can expose a new entry to the SRCU reader
before its list pointers and fields are observable.
Use list_add_tail_rcu() so the publication pairs with the existing
list_del_rcu() and list_for_each_entry_srcu().
Fixes: 43aa6f97c2d0 ("eventfs: Get rid of dentry pointers without refcounts") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260418152251.199343-1-devnexen@gmail.com Signed-off-by: David Carlier <devnexen@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
sctp: fix OOB write to userspace in sctp_getsockopt_peer_auth_chunks
sctp_getsockopt_peer_auth_chunks() checks that the caller's optval
buffer is large enough for the peer AUTH chunk list with
if (len < num_chunks)
return -EINVAL;
but then writes num_chunks bytes to p->gauth_chunks, which lives
at offset offsetof(struct sctp_authchunks, gauth_chunks) == 8
inside optval. The check is missing the sizeof(struct
sctp_authchunks) = 8-byte header. When the caller supplies
len == num_chunks (for any num_chunks > 0) the test passes but
copy_to_user() writes sizeof(struct sctp_authchunks) = 8 bytes
past the declared buffer.
The sibling function sctp_getsockopt_local_auth_chunks() at the
next line already has the correct check:
if (len < sizeof(struct sctp_authchunks) + num_chunks)
return -EINVAL;
Align the peer variant with its sibling.
Reproducer confirms on v7.0-13-generic: an unprivileged userspace
caller that opens a loopback SCTP association with AUTH enabled,
queries num_chunks with a short optval, then issues the real
getsockopt with len == num_chunks and sentinel bytes painted past
the buffer observes those sentinel bytes overwritten with the
peer's AUTH chunk type. The bytes written are under the peer's
control but land in the caller's own userspace; this is not a
kernel memory corruption, but it is a kernel-side contract
violation that can silently corrupt adjacent userspace data.
Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.") Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20260416031903.1447072-1-michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Marek Vasut [Wed, 15 Apr 2026 23:09:45 +0000 (01:09 +0200)]
net: ks8851: Avoid excess softirq scheduling
The code injects a packet into netif_rx() repeatedly, which will add
it to its internal NAPI and schedule a softirq, and process it. It is
more efficient to queue multiple packets and process them all at the
local_bh_enable() time.
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Fixes: e0863634bf9f ("net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs") Cc: stable@vger.kernel.org Signed-off-by: Marek Vasut <marex@nabladev.com> Link: https://patch.msgid.link/20260415231020.455298-2-marex@nabladev.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Marek Vasut [Wed, 15 Apr 2026 23:09:44 +0000 (01:09 +0200)]
net: ks8851: Reinstate disabling of BHs around IRQ handler
If the driver executes ks8851_irq() AND a TX packet has been sent, then
the driver enables TX queue via netif_wake_queue() which schedules TX
softirq to queue packets for this device.
If CONFIG_PREEMPT_RT=y is set AND a packet has also been received by
the MAC, then ks8851_rx_pkts() calls netdev_alloc_skb_ip_align() to
allocate SKBs for the received packets. If netdev_alloc_skb_ip_align()
is called with BH enabled, then local_bh_enable() at the end of
netdev_alloc_skb_ip_align() will trigger the pending softirq processing,
which may ultimately call the .xmit callback ks8851_start_xmit_par().
The ks8851_start_xmit_par() will try to lock struct ks8851_net_par
.lock spinlock, which is already locked by ks8851_irq() from which
ks8851_start_xmit_par() was called. This leads to a deadlock, which
is reported by the kernel, including a trace listed below.
If CONFIG_PREEMPT_RT is not set, then since commit 0913ec336a6c0
("net: ks8851: Fix deadlock with the SPI chip variant") the deadlock
can also be triggered without received packet in the RX FIFO. The
pending softirqs will be processed on return from
spin_unlock_bh(&ks->statelock) in ks8851_irq(), which triggers the
deadlock as well.
Fix the problem by disabling BH around critical sections, including the
IRQ handler, thus preventing the net_tx_action() softirq from triggering
during these critical sections. The net_tx_action() softirq is triggered
once BH are re-enabled and at the end of the IRQ handler, once all the
other IRQ handler actions have been completed.
__schedule from schedule_rtlock+0x1c/0x34
schedule_rtlock from rtlock_slowlock_locked+0x548/0x904
rtlock_slowlock_locked from rt_spin_lock+0x60/0x9c
rt_spin_lock from ks8851_start_xmit_par+0x74/0x1a8
ks8851_start_xmit_par from netdev_start_xmit+0x20/0x44
netdev_start_xmit from dev_hard_start_xmit+0xd0/0x188
dev_hard_start_xmit from sch_direct_xmit+0xb8/0x25c
sch_direct_xmit from __qdisc_run+0x1f8/0x4ec
__qdisc_run from qdisc_run+0x1c/0x28
qdisc_run from net_tx_action+0x1f0/0x268
net_tx_action from handle_softirqs+0x1a4/0x270
handle_softirqs from __local_bh_enable_ip+0xcc/0xe0
__local_bh_enable_ip from __alloc_skb+0xd8/0x128
__alloc_skb from __netdev_alloc_skb+0x3c/0x19c
__netdev_alloc_skb from ks8851_irq+0x388/0x4d4
ks8851_irq from irq_thread_fn+0x24/0x64
irq_thread_fn from irq_thread+0x178/0x28c
irq_thread from kthread+0x12c/0x138
kthread from ret_from_fork+0x14/0x28
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Fixes: e0863634bf9f ("net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs") Cc: stable@vger.kernel.org Signed-off-by: Marek Vasut <marex@nabladev.com> Link: https://patch.msgid.link/20260415231020.455298-1-marex@nabladev.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a socket in SOCKMAP receives skb with inflight fd,
sk_psock_verdict_data_ready() looks up the mapped socket and
enqueue skb to its psock->ingress_skb.
Since neither the old nor the new GC can inspect the psock
queue, the hidden skb leaks the inflight sockets. Note that
this cannot be detected via kmemleak because inflight sockets
are linked to a global list.
In addition, SOCKMAP redirect breaks the Tarjan-based GC's
assumption that unix_edge.successor is always alive, which
is no longer true once skb is redirected, resulting in
use-after-free below. [0]
Moreover, SOCKMAP does not call scm_stat_del() properly,
so unix_show_fdinfo() could report an incorrect fd count.
sk_msg_recvmsg() does not support any SCM attributes in the
first place.
Let's drop all SCM attributes before passing skb to the
SOCKMAP layer.
[0]:
BUG: KASAN: slab-use-after-free in unix_del_edges (net/unix/garbage.c:118 net/unix/garbage.c:181 net/unix/garbage.c:251)
Read of size 8 at addr ffff888125362670 by task kworker/56:1/496
net: stmmac: Update default_an_inband before passing value to phylink_config
get_interfaces() will update both the plat->phy_interfaces and
mdio_bus_data->default_an_inband based on reading a SERDES register. As
get_interfaces() will be called after default_an_inband had already been
read, dwmac-intel regressed as a result with incorrect default_an_inband
value in phylink_config.
Therefore, we moved the priv->plat->get_interfaces() to be executed first
before assigning priv->plat->default_an_inband to config->default_an_inband
to ensure default_an_inband is in correct value.
Fixes: d3836052fe09 ("net: stmmac: intel: convert speed_mode_2500() to get_interfaces()") Signed-off-by: KhaiWenTan <khai.wen.tan@linux.intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20260416102609.7953-1-khai.wen.tan@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Grzegorz updates the logic for adjusting the PTP hardware clock on E830,
fixing a bug that prevented adjustments below S32_MAX/MIN nanoseconds.
Grzegorz and Zoli update the PCS latency settings for E825 devices at 10GbE
and 25GbE, improving the accuracy of timestamps based on data from
production hardware.
Michal Schmidt fixes a double-free that could happen if a particular error
path is taken in ice_xmit_frame_ring().
Guangshuo fixes a double-free that could happen during error paths in the
ice_sf_eth_activate() function.
Paul Greenwalt fixes the PHY link configuration when the link-down-on-close
driver parameter is enabled and new media is inserted.
Paul Greenwalt fixes the ICE_AQ_LINK_SPEED_M macro for 200G, enabling 200G
link speed advertisement.
Keita Morisaki fixes a race condition in the ice Tx timestamp ring cleanup,
preventing a possible NULL pointer dereference.
Kohei Enju fixes a potential NULL pointer dereference in ice_set_ring_param().
Kohei Enju fixes i40e to stop advertising IFF_SUPP_NOFCS, when the driver
does not actually support the feature.
Petr fixes the VLAN L2TAG2 mask when the iAVF VF and a PF negotiate use of
the legacy Rx descriptor format.
Matt fixes the unrolling logic for PTP when the e1000e probe fails after
the PTP clock has been registered.
**A note to stable backports**
The patches [7/12] ("ice: fix race condition in TX timestamp ring
cleanup") and [8/12] ("ice: fix potential NULL pointer deref in error
path of ice_set_ringparam()") must be backported together. Otherwise the
fix in patch 8 will not work properly.
====================
Petr Oros [Fri, 17 Apr 2026 00:53:34 +0000 (17:53 -0700)]
iavf: fix wrong VLAN mask for legacy Rx descriptors L2TAG2
The IAVF_RXD_LEGACY_L2TAG2_M mask was incorrectly defined as
GENMASK_ULL(63, 32), extracting 32 bits from qw2 instead of the
16-bit VLAN tag. In the legacy Rx descriptor layout, the 2nd L2TAG2
(VLAN tag) occupies bits 63:48 of qw2, not 63:32.
The oversized mask causes FIELD_GET to return a 32-bit value where the
actual VLAN tag sits in bits 31:16. When this value is passed to
iavf_receive_skb() as a u16 parameter, it gets truncated to the lower
16 bits (which contain the 1st L2TAG2, typically zero). As a result,
__vlan_hwaccel_put_tag() is never called and software VLAN interfaces
on VFs receive no traffic.
This affects VFs behind ice PF (VIRTCHNL VLAN v2) when the PF
advertises VLAN stripping into L2TAG2_2 and legacy descriptors are
used.
The flex descriptor path already uses the correct mask
(IAVF_RXD_FLEX_L2TAG2_2_M = GENMASK_ULL(63, 48)).
Reproducer:
1. Create 2 VFs on ice PF (echo 2 > sriov_numvfs)
2. Disable spoofchk on both VFs
3. Move each VF into a separate network namespace
4. On each VF: create VLAN interface (e.g. vlan 198), assign IP,
bring up
5. Set rx-vlan-offload OFF on both VFs
6. Ping between VLAN interfaces -> expect PASS
(VLAN tag stays in packet data, kernel matches in-band)
7. Set rx-vlan-offload ON on both VFs
8. Ping between VLAN interfaces -> expect FAIL if bug present
(HW strips VLAN tag into descriptor L2TAG2 field, wrong mask
extracts bits 47:32 instead of 63:48, truncated to u16 -> zero,
__vlan_hwaccel_put_tag() never called, packet delivered to parent
interface, not VLAN interface)
The reproducer requires legacy Rx descriptors. On modern ice + iavf
with full PTP support, flex descriptors are always negotiated and the
buggy legacy path is never reached. Flex descriptors require all of:
- CONFIG_PTP_1588_CLOCK enabled
- VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC granted by PF
- PTP capabilities negotiated (VIRTCHNL_VF_CAP_PTP)
- VIRTCHNL_1588_PTP_CAP_RX_TSTAMP supported
- VIRTCHNL_RXDID_2_FLEX_SQ_NIC present in DDP profile
If any condition is not met, iavf_select_rx_desc_format() falls back
to legacy descriptors (RXDID=1) and the wrong L2TAG2 mask is hit.
Fixes: 2dc8e7c36d80 ("iavf: refactor iavf_clean_rx_irq to support legacy and flex descriptors") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-10-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
i40e advertises IFF_SUPP_NOFCS, allowing users to use the SO_NOFCS
socket option. However, this option is silently ignored, as the driver
does not check skb->no_fcs, and always enables FCS insertion offload.
Fix this by removing the advertisement of IFF_SUPP_NOFCS.
This behavior can be reproduced with a simple AF_PACKET socket:
ice: fix potential NULL pointer deref in error path of ice_set_ringparam()
ice_set_ringparam nullifies tstamp_ring of temporary tx_rings, without
clearing ICE_TX_RING_FLAGS_TXTIME bit.
When ICE_TX_RING_FLAGS_TXTIME is set and the subsequent
ice_setup_tx_ring() call fails, a NULL pointer dereference could happen
in the unwinding sequence:
ice: fix race condition in TX timestamp ring cleanup
Fix a race condition between ice_free_tx_tstamp_ring() and ice_tx_map()
that can cause a NULL pointer dereference.
ice_free_tx_tstamp_ring currently clears the ICE_TX_FLAGS_TXTIME flag
after NULLing the tstamp_ring. This could allow a concurrent ice_tx_map
call on another CPU to dereference the tstamp_ring, which could lead to
a NULL pointer dereference.
Fix by:
1. Reordering ice_free_tx_tstamp_ring() to clear the flag before
NULLing the pointer, with smp_wmb() to ensure proper ordering.
2. Adding smp_rmb() in ice_tx_map() after the flag check to order the
flag read before the pointer read, using READ_ONCE() for the
pointer, and adding a NULL check as a safety net.
3. Converting tx_ring->flags from u8 to DECLARE_BITMAP() and using
atomic bitops (set_bit(), clear_bit(), test_bit()) for all flag
operations throughout the driver:
- ICE_TX_RING_FLAGS_XDP
- ICE_TX_RING_FLAGS_VLAN_L2TAG1
- ICE_TX_RING_FLAGS_VLAN_L2TAG2
- ICE_TX_RING_FLAGS_TXTIME
Fixes: ccde82e909467 ("ice: add E830 Earliest TxTime First Offload support") Signed-off-by: Keita Morisaki <kmta1236@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-7-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paul Greenwalt [Fri, 17 Apr 2026 00:53:30 +0000 (17:53 -0700)]
ice: fix ICE_AQ_LINK_SPEED_M for 200G
When setting PHY configuration during driver initialization, 200G link
speed is not being advertised even when the PHY is capable. This is
because the get PHY capabilities link speed response is being masked by
ICE_AQ_LINK_SPEED_M, which does not include the 200G link speed bit.
ICE_AQ_LINK_SPEED_200GB is defined as BIT(11), but the mask 0x7FF only
covers bits 0-10. Fix ICE_AQ_LINK_SPEED_M to use GENMASK(11, 0) so
that it covers all defined link speed bits including 200G.
Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-6-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paul Greenwalt [Fri, 17 Apr 2026 00:53:29 +0000 (17:53 -0700)]
ice: fix PHY config on media change with link-down-on-close
Commit 1a3571b5938c ("ice: restore PHY settings on media insertion")
introduced separate flows for setting PHY configuration on media
present: ice_configure_phy() when link-down-on-close is disabled, and
ice_force_phys_link_state() when enabled. The latter incorrectly uses
the previous configuration even after module change, causing link
issues such as wrong speed or no link.
Unify PHY configuration into a single ice_phy_cfg() function with a
link_en parameter, ensuring PHY capabilities are always fetched fresh
from hardware.
Fixes: 1a3571b5938c ("ice: restore PHY settings on media insertion") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-5-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michal Schmidt [Fri, 17 Apr 2026 00:53:28 +0000 (17:53 -0700)]
ice: fix double-free of tx_buf skb
If ice_tso() or ice_tx_csum() fail, the error path in
ice_xmit_frame_ring() frees the skb, but the 'first' tx_buf still points
to it and is marked as valid (ICE_TX_BUF_SKB).
'next_to_use' remains unchanged, so the potential problem will
likely fix itself when the next packet is transmitted and the tx_buf
gets overwritten. But if there is no next packet and the interface is
brought down instead, ice_clean_tx_ring() -> ice_unmap_and_free_tx_buf()
will find the tx_buf and free the skb for the second time.
The fix is to reset the tx_buf type to ICE_TX_BUF_EMPTY in the error
path, so that ice_unmap_and_free_tx_buf().
Move the initialization of 'first' up, to ensure it's already valid in
case we hit the linearization error path.
The bug was spotted by AI while I had it looking for something else.
It also proposed an initial version of the patch.
I reproduced the bug and tested the fix by adding code to inject
failures, on a build with KASAN.
I looked for similar bugs in related Intel drivers and did not find any.
Fixes: d76a60ba7afb ("ice: Add support for VLANs and offloads") Assisted-by: Claude:claude-4.6-opus-high Cursor Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-4-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guangshuo Li [Fri, 17 Apr 2026 00:53:27 +0000 (17:53 -0700)]
ice: fix double free in ice_sf_eth_activate() error path
When auxiliary_device_add() fails, ice_sf_eth_activate() jumps to
aux_dev_uninit and calls auxiliary_device_uninit(&sf_dev->adev).
The device release callback ice_sf_dev_release() frees sf_dev, but
the current error path falls through to sf_dev_free and calls
kfree(sf_dev) again, causing a double free.
Keep kfree(sf_dev) for the auxiliary_device_init() failure path, but
avoid falling through to sf_dev_free after auxiliary_device_uninit().
Fixes: 13acc5c4cdbe ("ice: subfunction activation and base devlink ops") Cc: stable@vger.kernel.org Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-3-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Grzegorz Nitka [Fri, 17 Apr 2026 00:53:26 +0000 (17:53 -0700)]
ice: update PCS latency settings for E825 10G/25Gb modes
Update MAC Rx/Tx offset registers settings (PHY_MAC_[RX|TX]_OFFSET
registers) with the data obtained with the latest research. It applies
to PCS latency settings for the following speeds/modes:
* 10Gb NO-FEC
- TX latency changed from 71.25 ns to 73 ns
- RX latency changed from -25.6 ns to -28 ns
* 25Gb NO-FEC
- TX latency changed from 28.17 ns to 33 ns
- RX latency changed from -12.45 ns to -12 ns
* 25Gb RS-FEC
- TX latency changed from 64.5 ns to 69 ns
- RX latency changed from -3.6 ns to -3 ns
The original data came from simulation and pre-production hardware.
The new data measures the actual delays and as such is more accurate.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Co-developed-by: Zoltan Fodor <zoltan.fodor@intel.com> Signed-off-by: Zoltan Fodor <zoltan.fodor@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-2-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Grzegorz Nitka [Fri, 17 Apr 2026 00:53:25 +0000 (17:53 -0700)]
ice: fix 'adjust' timer programming for E830 devices
Fix incorrect 'adjust the timer' programming sequence for E830 devices
series. Only shadow registers GLTSYN_SHADJ were programmed in the
current implementation. According to the specification [1], write to
command GLTSYN_CMD register is also required with CMD field set to
"Adjust the Time" value, for the timer adjustment to take the effect.
The flow was broken for the adjustment less than S32_MAX/MIN range
(around +/- 2 seconds). For bigger adjustment, non-atomic programming
flow is used, involving set timer programming. Non-atomic flow is
implemented correctly.
Testing hints:
Run command:
phc_ctl /dev/ptpX get adj 2 get
Expected result:
Returned timestamps differ at least by 2 seconds
Fixes: f00307522786 ("ice: Implement PTP support for E830 devices") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-1-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 18 Apr 2026 18:44:11 +0000 (11:44 -0700)]
Merge tag 'ovpn-net-20260417' of https://github.com/OpenVPN/ovpn-net-next
Antonio Quartulli says:
====================
This batch includes only fixes to the selftest harness:
* switch to TAP test orchestration
* parse slurped notifications as returned by jq -s
* add ovpn_ prefix to helpers and global variables to avoid clashes
* fail test in case of netlink notification mismatch
* add missing kernel config dependencies
* add delay when launching multiple ynl/cli.py listeners
* tag 'ovpn-net-20260417' of https://github.com/OpenVPN/ovpn-net-next:
selftests: ovpn: serialize YNL listener startup
selftests: ovpn: align command flow with TAP
selftests: ovpn: add prefix to helpers and shared variables
selftests: ovpn: flatten slurped notification JSON before filtering
selftests: ovpn: fail notification check on mismatch
selftests: ovpn: add nftables config dependencies for test-mark
====================
Merge tag 'parisc-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture updates from Helge Deller:
- A fix to make modules on 32-bit parisc architecture work again
- Drop ip_fast_csum() inline assembly to avoid unaligned memory
accesses
- Allow to build kernel without 32-bit VDSO
- Reference leak fix in error path in LED driver
* tag 'parisc-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: led: fix reference leak on failed device registration
module.lds.S: Fix modules on 32-bit parisc architecture
parisc: Allow to build without VDSO32
parisc: Include 32-bit VDSO only when building for 32-bit or compat mode
parisc: Allow to disable COMPAT mode on 64-bit kernel
parisc: Fix default stack size when COMPAT=n
parisc: Fix signal code to depend on CONFIG_COMPAT instead of CONFIG_64BIT
parisc: is_compat_task() shall return false for COMPAT=n
parisc: Avoid compat syscalls when COMPAT=n
parisc: _llseek syscall is only available for 32-bit userspace
parisc: Drop ip_fast_csum() inline assembly implementation
parisc: update outdated comments for renamed ccio_alloc_consistent()
Merge tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock updates from Mike Rapoport:
- improve debuggability of reserve_mem kernel parameter handling with
print outs in case of a failure and debugfs info showing what was
actually reserved
- Make memblock_free_late() and free_reserved_area() use the same core
logic for freeing the memory to buddy and ensure it takes care of
updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled.
* tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
x86/alternative: delay freeing of smp_locks section
memblock: warn when freeing reserved memory before memory map is initialized
memblock, treewide: make memblock_free() handle late freeing
memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y
memblock: extract page freeing from free_reserved_area() into a helper
memblock: make free_reserved_area() more robust
mm: move free_reserved_area() to mm/memblock.c
powerpc: opal-core: pair alloc_pages_exact() with free_pages_exact()
powerpc: fadump: pair alloc_pages_exact() with free_pages_exact()
memblock: reserve_mem: fix end caclulation in reserve_mem_release_by_name()
memblock: move reserve_bootmem_range() to memblock.c and make it static
memblock: Add reserve_mem debugfs info
memblock: Print out errors on reserve_mem parser
====================
tcp: take care of tcp_get_timestamping_opt_stats() races
tcp_get_timestamping_opt_stats() does not own the socket lock,
this is intentional.
It calls tcp_get_info_chrono_stats() while other threads could
change chrono fields in tcp_chrono_set(). It also reads many
tcp socket fields that can be modified by other cpus/threads.
I do not think we need coherent TCP socket state snapshot
in tcp_get_timestamping_opt_stats().
Add READ_ONCE()/WRITE_ONCE() or data_race() annotations.
Note that icsk_ca_state is a bitfield, thus not covered
in this series.
====================
Wolfram Sang [Fri, 17 Apr 2026 07:42:36 +0000 (09:42 +0200)]
mailbox: mailbox-test: make data_ready a per-instance variable
While not the default case, multiple tests can be run simultaneously.
Then, data_ready being a global variable will be overwritten and the
per-instance lock will not help. Turn the global variable into a
per-instance one to avoid this problem.
Fixes: e339c80af95e ("mailbox: mailbox-test: don't rely on rx_buffer content to signal data ready") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Wolfram Sang [Fri, 17 Apr 2026 07:42:35 +0000 (09:42 +0200)]
mailbox: mailbox-test: initialize struct earlier
The waitqueue must be initialized before the debugfs files are created
because from that time, requests from userspace can already be made.
Similarily, drvdata and spinlock needs to be initialized before we
request the channel, otherwise dangling irqs might run into problems
like a NULL pointer exception.
Fixes: 8ea4484d0c2b ("mailbox: Add generic mechanism for testing Mailbox Controllers") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Wolfram Sang [Fri, 17 Apr 2026 07:42:34 +0000 (09:42 +0200)]
mailbox: mailbox-test: don't free the reused channel
The RX channel can be aliased to the TX channel if it has a different
MMIO. This special case needs to be handled when freeing the channels
otherwise a double-free occurs.
Fixes: 8ea4484d0c2b ("mailbox: Add generic mechanism for testing Mailbox Controllers") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
mbox_test_request_channel() returns either an ERR_PTR or NULL. The
callers, however, mostly checked for non-NULL which allows for bogus
code paths when an ERR_PTR is treated like a valid channel. A later
commit tried to fix it in one place but missed the other ones. Because
the ERR_PTR is only used for -ENOMEM once and is converted to
-EPROBE_DEFER anyhow, convert the callee to only return NULL which
simplifies handling a lot and makes it less error prone.
Fixes: 8ea4484d0c2b ("mailbox: Add generic mechanism for testing Mailbox Controllers") Fixes: 9b63a810c6f9 ("mailbox: mailbox-test: Fix an error check in mbox_test_probe()") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Wolfram Sang [Mon, 13 Apr 2026 10:42:38 +0000 (12:42 +0200)]
mailbox: add sanity check for channel array
Fail gracefully if there is no channel array attached to the mailbox
controller. Otherwise the later dereference will cause an OOPS which
might not be seen because mailbox controllers might instantiate very
early. Remove the comment explaining the obvious while here.
Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
ksmbd: fix out-of-bounds write in smb2_get_ea() EA alignment
smb2_get_ea() applies 4-byte alignment padding via memset() after
writing each EA entry. The bounds check on buf_free_len is performed
before the value memcpy, but the alignment memset fires unconditionally
afterward with no check on remaining space.
When the EA value exactly fills the remaining buffer (buf_free_len == 0
after value subtraction), the alignment memset writes 1-3 NUL bytes
past the buf_free_len boundary. In compound requests where the response
buffer is shared across commands, the first command (e.g., READ) can
consume most of the buffer, leaving a tight remainder for the QUERY_INFO
EA response. The alignment memset then overwrites past the physical
kvmalloc allocation into adjacent kernel heap memory.
Add a bounds check before the alignment memset to ensure buf_free_len
can accommodate the padding bytes.
This is the same bug pattern fixed by commit beef2634f81f ("ksmbd: fix
potencial OOB in get_file_all_info() for compound requests") and
commit fda9522ed6af ("ksmbd: fix OOB write in QUERY_INFO for compound
requests"), both of which added bounds checks before unconditional
writes in QUERY_INFO response handlers.
Cc: stable@vger.kernel.org Fixes: e2b76ab8b5c9 ("ksmbd: add support for read compound") Signed-off-by: Tristan Madani <tristan@talencesecurity.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd: use check_add_overflow() to prevent u16 DACL size overflow
set_posix_acl_entries_dacl() and set_ntacl_dacl() accumulate ACE sizes
in u16 variables. When a file has many POSIX ACL entries, the
accumulated size can wrap past 65535, causing the pointer arithmetic
(char *)pndace + *size to land within already-written ACEs. Subsequent
writes then overwrite earlier entries, and pndacl->size gets a
truncated value.
Use check_add_overflow() at each accumulation point to detect the
wrap before it corrupts the buffer, consistent with existing
check_mul_overflow() usage elsewhere in smbacl.c.
Cc: stable@vger.kernel.org Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Tristan Madani <tristan@talencesecurity.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd: fix use-after-free in smb2_open during durable reconnect
In smb2_open, the call to ksmbd_put_durable_fd(fp) drops the reference
to the durable file descriptor early during the durable reconnect
process. If an error occurs subsequently (eg, ksmbd_iov_pin_rsp fails)
or a scavenger accesses the file, it leads to a use-after-free when
accessing fp properties (eg fp->create_time).
Move the single put to the end of the function below err_out2 so fp
stays valid until smb2_open returns.
Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2") Signed-off-by: Akif <akif.sait111@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
num_aces is a u16 read from le16_to_cpu(parent_pdacl->num_aces)
without checking that it is consistent with the declared pdacl_size.
An authenticated client whose parent directory's security.NTACL is
tampered (e.g. via offline xattr corruption or a concurrent path that
bypasses parse_dacl()) can present num_aces = 65535 with minimal
actual ACE data. This causes a ~8 MB allocation (not kzalloc, so
uninitialized) that the subsequent loop only partially populates, and
may also overflow the three-way size_t multiply on 32-bit kernels.
Additionally, the ACE walk loop uses the weaker
offsetof(struct smb_ace, access_req) minimum size check rather than
the minimum valid on-wire ACE size, and does not reject ACEs whose
declared size is below the minimum.
Reproduced on UML + KASAN + LOCKDEP against the real ksmbd code path.
A legitimate mount.cifs client creates a parent directory over SMB
(ksmbd writes a valid security.NTACL xattr), then the NTACL blob on
the backing filesystem is rewritten to set num_aces = 0xFFFF while
keeping the posix_acl_hash bytes intact so ksmbd_vfs_get_sd_xattr()'s
hash check still passes. A subsequent SMB2 CREATE of a child under
that parent drives smb2_open() into smb_inherit_dacl() (share has
"vfs objects = acl_xattr" set), which fails the page allocator:
With the patch applied the added guard rejects the tampered value
with -EINVAL before any large allocation runs, smb2_open() falls back
to smb2_create_sd_buffer(), and the child is created with a default
SD. No warning, no splat.
Fix by:
1. Validating num_aces against pdacl_size using the same formula
applied in parse_dacl().
2. Replacing the raw kmalloc(sizeof * num_aces * 2) with
kmalloc_array(num_aces * 2, sizeof(...)) for overflow-safe
allocation.
3. Tightening the per-ACE loop guard to require the minimum valid
ACE size (offsetof(smb_ace, sid) + CIFS_SID_BASE_SIZE) and
rejecting under-sized ACEs, matching the hardening in
smb_check_perm_dacl() and parse_dacl().
v1 -> v2:
- Replace the synthetic test-module splat in the changelog with a
real-path UML + KASAN reproduction driven through mount.cifs and
SMB2 CREATE; Namjae flagged the kcifs3_test_inherit_dacl_old name
in v1 since it does not exist in ksmbd.
- Drop the commit-hash citation from the code comment per Namjae's
review; keep the parse_dacl() pointer.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
DaeMyung Kang [Thu, 16 Apr 2026 21:17:35 +0000 (06:17 +0900)]
smb: server: fix max_connections off-by-one in tcp accept path
The global max_connections check in ksmbd's TCP accept path counts
the newly accepted connection with atomic_inc_return(), but then
rejects the connection when the result is greater than or equal to
server_conf.max_connections.
That makes the effective limit one smaller than configured. For
example:
- max_connections=1 rejects the first connection
- max_connections=2 allows only one connection
The per-IP limit in the same function uses <= correctly because it
counts only pre-existing connections. The global limit instead checks
the post-increment total, so it should reject only when that total
exceeds the configured maximum.
Fix this by changing the comparison from >= to >, so exactly
max_connections simultaneous connections are allowed and the next one
is rejected. This matches the documented meaning of max_connections
in fs/smb/server/ksmbd_netlink.h as the "Number of maximum simultaneous
connections".
Fixes: 0d0d4680db22 ("ksmbd: add max connections parameter") Cc: stable@vger.kernel.org Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd: require minimum ACE size in smb_check_perm_dacl()
Both ACE-walk loops in smb_check_perm_dacl() only guard against an
under-sized remaining buffer, not against an ACE whose declared
`ace->size` is smaller than the struct it claims to describe:
if (offsetof(struct smb_ace, access_req) > aces_size)
break;
ace_size = le16_to_cpu(ace->size);
if (ace_size > aces_size)
break;
The first check only requires the 4-byte ACE header to be in bounds;
it does not require access_req (4 bytes at offset 4) to be readable.
An attacker who has set a crafted DACL on a file they own can declare
ace->size == 4 with aces_size == 4, pass both checks, and then
which is the smallest valid on-wire ACE layout (4-byte header +
4-byte access_req + 8-byte sid base with zero sub-auths). Also
reject ACEs whose sid.num_subauth exceeds SID_MAX_SUB_AUTHORITIES
before letting compare_sids() dereference sub_auth[] entries.
parse_sec_desc() already enforces an equivalent check (lines 441-448);
smb_check_perm_dacl() simply grew weaker validation over time.
Reachability: authenticated SMB client with permission to set an ACL
on a file. On a subsequent CREATE against that file, the kernel
walks the stored DACL via smb_check_perm_dacl() and triggers the
OOB read. Not pre-auth, and the OOB read is not reflected to the
attacker, but KASAN reports and kernel state corruption are
possible.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd: validate response sizes in ipc_validate_msg()
ipc_validate_msg() computes the expected message size for each
response type by adding (or multiplying) attacker-controlled fields
from the daemon response to a fixed struct size in unsigned int
arithmetic. Three cases can overflow:
resp->payload_sz is __u32 and resp->ngroups is __s32. Each addition
can wrap in unsigned int; the multiplication by sizeof(gid_t) mixes
signed and size_t, so a negative ngroups is converted to SIZE_MAX
before the multiply. A wrapped value of msg_sz that happens to
equal entry->msg_sz bypasses the size check on the next line, and
downstream consumers (smb2pdu.c:6742 memcpy using rpc_resp->payload_sz,
kmemdup in ksmbd_alloc_user using resp_ext->ngroups) then trust the
unverified length.
Use check_add_overflow() on the RPC_REQUEST and SHARE_CONFIG_REQUEST
paths to detect integer overflow without constraining functional
payload size; userspace ksmbd-tools grows NDR responses in 4096-byte
chunks for calls like NetShareEnumAll, so a hard transport cap is
unworkable on the response side. For LOGIN_REQUEST_EXT, reject
resp->ngroups outside the signed [0, NGROUPS_MAX] range up front and
report the error from ipc_validate_msg() so it fires at the IPC
boundary; with that bound the subsequent multiplication and addition
stay well below UINT_MAX. The now-redundant ngroups check and
pr_err in ksmbd_alloc_user() are removed.
This is the response-side analogue of aab98e2dbd64 ("ksmbd: fix
integer overflows on 32 bit systems"), which hardened the request
side.
Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Fixes: a77e0e02af1c ("ksmbd: add support for supplementary groups") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
smb: server: fix active_num_conn leak on transport allocation failure
Commit 77ffbcac4e56 ("smb: server: fix leak of active_num_conn in
ksmbd_tcp_new_connection()") addressed the kthread_run() failure
path. The earlier alloc_transport() == NULL path in the same
function has the same leak, is reachable pre-authentication via any
TCP connect to port 445, and was empirically reproduced on UML
(ARCH=um, v7.0-rc7): a small number of forced allocation failures
were sufficient to put ksmbd into a state where every subsequent
connection attempt was rejected for the remainder of the boot.
ksmbd_kthread_fn() increments active_num_conn before calling
ksmbd_tcp_new_connection() and discards the return value, so when
alloc_transport() returns NULL the socket is released and -ENOMEM
returned without decrementing the counter. Each such failure
permanently consumes one slot from the max_connections pool; once
cumulative failures reach the cap, atomic_inc_return() hits the
threshold on every subsequent accept and every new connection is
rejected. The counter is only reset by module reload.
An unauthenticated remote attacker can drive the server toward the
memory pressure that makes alloc_transport() fail by holding open
connections with large RFC1002 lengths up to MAX_STREAM_PROT_LEN
(0x00FFFFFF); natural transient allocation failures on a loaded
host produce the same drift more slowly.
Mirror the existing rollback pattern in ksmbd_kthread_fn(): on the
alloc_transport() failure path, decrement active_num_conn gated on
server_conf.max_connections.
Repro details: with the patch reverted, forced alloc_transport()
NULL returns leaked counter slots and subsequent connection
attempts -- including legitimate connects issued after the
forced-fail window had closed -- were all rejected with "Limit the
maximum number of connections". With this patch applied, the same
connect sequence produces no rejections and the counter cycles
cleanly between zero and one on every accept.
Fixes: 0d0d4680db22 ("ksmbd: add max connections parameter") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Merge tag 'i2c-for-7.1-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"The biggest news in this pull request is that it will start the last
cycle of me handling the I2C subsystem. From 7.2. on, I will pass
maintainership to Andi Shyti who has been maintaining the I2C drivers
for a while now and who has done a great job in doing so.
We will use this cycle for a hopefully smooth transition. Thanks must
go to Andi for stepping up! I will still be around for guidance.
Updates:
- generic cleanups in npcm7xx, qcom-cci, xiic and designware DT
bindings
- atr: use kzalloc_flex for alias pool allocation
- ixp4xx: convert bindings to DT schema
- ocores: use read_poll_timeout_atomic() for polling waits
- qcom-geni: skip extra TX DMA TRE for single read messages
- s3c24xx: validate SMBus block length before using it
- spacemit: refactor xfer path and add K1 PIO support
- tegra: identify DVC and VI with SoC data variants
- tegra: support SoC-specific register offsets
- xiic: switch to devres and generic fw properties
- xiic: skip input clock setup on non-OF systems
- various minor improvements in other drivers
rtl9300:
- add per-SoC callbacks and clock support for RTL9607C
- add support for new 50 kHz and 2.5 MHz bus speeds
- general refactoring in preparation for RTL9607C support
New support:
- DesignWare GOOG5000 (ACPI HID)
- Intel Nova Lake (ACPI ID)
- Realtek RTL9607C
- SpacemiT K3 binding
- Tegra410 register layout support"
* tag 'i2c-for-7.1-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (40 commits)
i2c: usbio: Add ACPI device-id for NVL platforms
i2c: qcom-geni: Avoid extra TX DMA TRE for single read message in GPI mode
i2c: atr: use kzalloc_flex
i2c: spacemit: introduce pio for k1
i2c: spacemit: move i2c_xfer_msg()
i2c: xiic: skip input clock setup on non-OF systems
i2c: xiic: use numbered adapter registration
i2c: xiic: cosmetic: use resource format specifier in debug log
i2c: xiic: cosmetic cleanup
i2c: xiic: switch to generic device property accessors
i2c: xiic: remove duplicate error message
i2c: xiic: switch to devres managed APIs
i2c: rtl9300: add RTL9607C i2c controller support
i2c: rtl9300: introduce new function properties to driver data
i2c: rtl9300: introduce clk struct for upcoming rtl9607 support
dt-bindings: i2c: realtek,rtl9301-i2c: extend for clocks and RTL9607C support
i2c: rtl9300: introduce a property for 8 bit width reg address
i2c: rtl9300: introduce F_BUSY to the reg_fields struct
i2c: rtl9300: introduce max length property to driver data
i2c: rtl9300: split data_reg into read and write reg
...
Merge tag 'for-linus-7.1-1' of https://github.com/cminyard/linux-ipmi
Pull ipmi updates from Corey Minyard:
"Small updates and fixes (mostly to the BMC software):
- Fix one issue in the host side driver where a kthread can be left
running on a specific memory allocation failre at probe time
- Replace system_wq with system_percpu_wq so system_wq can eventually
go away"
* tag 'for-linus-7.1-1' of https://github.com/cminyard/linux-ipmi:
ipmi:ssif: Clean up kthread on errors
ipmi:ssif: Remove unnecessary indention
ipmi: ssif_bmc: Fix KUnit test link failure when KUNIT=m
ipmi: ssif_bmc: add unit test for state machine
ipmi: ssif_bmc: change log level to dbg in irq callback
ipmi: ssif_bmc: fix message desynchronization after truncated response
ipmi: ssif_bmc: fix missing check for copy_to_user() partial failure
ipmi: ssif_bmc: cancel response timer on remove
ipmi: Replace use of system_wq with system_percpu_wq
Merge tag 'perf-tools-for-v7.1-2026-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
"perf report:
- Add 'comm_nodigit' sort key to combine similar threads that only
have different numbers in the comm. In the following example, the
'comm_nodigit' will have samples from all threads starting with
"bpfrb/" into an entry "bpfrb/<N>".
- Support flat layout for symfs. The --symfs option is to specify the
location of debugging symbol files. The default 'hierarchy' layout
would search the symbol file using the same path of the original
file under the symfs root. The new 'flat' layout would search only
in the root directory.
- Update 'simd' sort key for ARM SIMD flags to cover ASE/SME and more
predicate flags.
perf stat:
- Add --pmu-filter option to select specific PMUs. This would be
useful when you measure metrics from multiple instance of uncore
PMUs with similar names.
# perf stat -M cpa_p0_avg_bw
Performance counter stats for 'system wide':
sh: Drop CONFIG_FIRMWARE_EDID from defconfig files
CONFIG_FIRMWARE_EDID=y depends on X86 or EFI_GENERIC_STUB. Neither
is true here, so drop the lines from the defconfig files.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Thomas Weißschuh [Tue, 24 Feb 2026 15:35:31 +0000 (16:35 +0100)]
sh: Remove CONFIG_VSYSCALL reference from UAPI
AT_SYSINFO_EHDR defines the auxvector index representing the vDSO
entrypoint. Its value or presence does not depend on whether a vDSO
is actually provided by the kernel.
The definition of AT_SYSINFO_EHDR was gated between CONFIG_VSYSCALL to
avoid a default gate VMA to be created. However that default gate VMA
was removed entirely in commit a6c19dfe3994
("arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate area").
Remove the now unnecessary conditional.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Tim Bird [Thu, 12 Feb 2026 19:28:45 +0000 (12:28 -0700)]
sh: Fix typo in SPDX license ID lines
Both platform_early.c and platform_early.h have an extra dash in
their SPDX-License-Identifier lines. Use the correct (single-dash)
syntax for these lines.
Signed-off-by: Tim Bird <tim.bird@sony.com> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Include <linux/io.h> to avoid depending on <linux/backlight.h>
for including it. Declares __raw_readb() and __raw_writeb().
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202510282206.wI0HrqcK-lkp@intel.com/ Fixes: 243ce64b2b37 ("backlight: Do not include <linux/fb.h> in header file") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Daniel Thompson (RISCstar) <danielt@kernel.org> Cc: Simona Vetter <simona.vetter@ffwll.ch> Cc: Lee Jones <lee@kernel.org> Cc: Daniel Thompson <danielt@kernel.org> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: dri-devel@lists.freedesktop.org Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Daniel Thompson (RISCstar) <danielt@kernel.org> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Jan Kara [Wed, 15 Apr 2026 17:40:40 +0000 (19:40 +0200)]
MAINTAINERS: add page cache reviewer
Add myself as a page cache reviewer since I tend to review changes in
these areas anyway.
[akpm@linux-foundation.org: add linux-mm@kvack.org] Link: https://lore.kernel.org/20260415174039.13016-2-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Lorenzo Stoakes <ljs@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When the -fsanitize=bounds sanitizer is enabled, gcc-16 sometimes runs
into a corner case in the read_ctrl_pos() pos function, where it sees
possible undefined behavior from the 'tier' index overflowing, presumably
in the case that this was called with a negative tier:
In function 'get_tier_idx',
inlined from 'isolate_folios' at mm/vmscan.c:4671:14:
mm/vmscan.c: In function 'isolate_folios':
mm/vmscan.c:4645:29: error: 'pv.refaulted' is used uninitialized [-Werror=uninitialized]
Part of the problem seems to be that read_ctrl_pos() has unusual calling
conventions since commit 37a260870f2c ("mm/mglru: rework type selection")
where passing MAX_NR_TIERS makes it accumulate all tiers but passing a
smaller positive number makes it read a single tier instead.
Shut up the warning by adding a fake initialization to the two instances
of this variable that can run into that corner case.
Link: https://lore.kernel.org/all/CAJHvVcjtFW86o5FoQC8MMEXCHAC0FviggaQsd5EmiCHP+1fBpg@mail.gmail.com/ Link: https://lore.kernel.org/20260414065206.3236176-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kairui Song <kasong@tencent.com> Cc: Koichiro Den <koichiro.den@canonical.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
MAINTAINERS: update KHO and LIVE UPDATE maintainers
Patch series "MAINTAINERS: update KHO and LIVE UPDATE entries".
This series contains some updates for the Kexec Handover (KHO) and Live
update entries. Patch 1 updates the maintainers list and adds the
liveupdate tree. Patches 2 and 3 clean up stale files in the list.
This patch (of 3):
I have been helping out with reviewing and developing KHO. I would also
like to help maintain it. Change my entry from R to M for KHO and live
update. Alex has been inactive for a while, so to avoid over-crowding the
KHO entry and to keep the information up-to-date, move his entry from M to
R.
We also now have a tree for KHO and live update at liveupdate/linux.git
where we plan to start maintaining those subsystems and start queuing the
patches. List that in the entries as well.
Update KEXEC and KDUMP maintainer entries by adding the live update group
maintainers. Remove Vivek Goyal due to inactivity to keep the MAINTAINERS
file up-to-date, and add Vivek to the CREDITS file to recognize their
contributions.
Link: https://lore.kernel.org/20260413121146.49215-1-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Pratyush Yadav <pratyush@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Diego Viola <diego.viola@gmail.com> Cc: Jakub Kacinski <kuba@kernel.org> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Cc: Mark Brown <broonie@kernel.org> Cc: Martin Kepplinger <martink@posteo.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Davidlohr Bueso [Thu, 12 Feb 2026 01:46:11 +0000 (17:46 -0800)]
mm/migrate_device: remove dead migration entry check in migrate_vma_collect_huge_pmd()
The softleaf_is_migration() check is unreachable as entries that are not
device_private are filtered out. Similarly, the PTE-level equivalent in
migrate_vma_collect_pmd() skips migration entries.
This dead branch also contained a double spin_unlock(ptl) bug.
Link: https://lore.kernel.org/20260212014611.416695-1-dave@stgolabs.net Fixes: a30b48bf1b244 ("mm/migrate_device: implement THP migration of zone device pages") Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Acked-by: Balbir Singh <balbirs@nvidia.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Ying Huang <ying.huang@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cao Ruichuang [Fri, 10 Apr 2026 04:41:39 +0000 (12:41 +0800)]
selftests: mm: skip charge_reserved_hugetlb without killall
charge_reserved_hugetlb.sh tears down background writers with killall from
psmisc. Minimal Ubuntu images do not always provide that tool, so the
selftest fails in cleanup for an environment reason rather than for the
hugetlb behavior it is trying to cover.
Skip the test when killall is unavailable, similar to the existing root
check, so these environments report the dependency clearly instead of
failing the test.
Link: https://lore.kernel.org/20260410044139.67480-1-create0818@163.com Signed-off-by: Cao Ruichuang <create0818@163.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
userfaultfd: allow registration of ranges below mmap_min_addr
The current implementation of validate_range() in fs/userfaultfd.c
performs a hard check against mmap_min_addr. This is redundant because
UFFDIO_REGISTER operates on memory ranges that must already be backed by a
VMA.
Enforcing mmap_min_addr or capability checks again in userfaultfd is
unnecessary and prevents applications like binary compilers from using
UFFD for valid memory regions mapped by application.
Remove the redundant check for mmap_min_addr.
We started using UFFD instead of the classic mprotect approach in the
binary translator to track application writes. During development, we
encountered this bug. The translator cannot control where the translated
application chooses to map its memory and if the app requires a
low-address area, UFFD fails, whereas mprotect would work just fine. I
believe this is a genuine logic bug rather than an improvement, and I
would appreciate including the fix in stable.
Link: https://lore.kernel.org/20260409103345.15044-1-komlomal@gmail.com Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") Signed-off-by: Denis M. Karpov <komlomal@gmail.com> Reviewed-by: Lorenzo Stoakes <ljs@kernel.org> Acked-by: Harry Yoo (Oracle) <harry@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
vmstat_shepherd uses delayed_work_pending() to check whether vmstat_update
is already scheduled for a given CPU before queuing it. However,
delayed_work_pending() only tests WORK_STRUCT_PENDING_BIT, which is
cleared the moment a worker thread picks up the work to execute it.
This means that while vmstat_update is actively running on a CPU,
delayed_work_pending() returns false. If need_update() also returns true
at that point (per-cpu counters not yet zeroed mid-flush), the shepherd
queues a second invocation with delay=0, causing vmstat_update to run
again immediately after finishing.
On a 72-CPU system this race is readily observable: before the fix, many
CPUs show invocation gaps well below 500 jiffies (the minimum
round_jiffies_relative() can produce), with the most extreme cases
reaching 0 jiffies—vmstat_update called twice within the same jiffy.
Fix this by replacing delayed_work_pending() with work_busy(), which
returns non-zero for both WORK_BUSY_PENDING (timer armed or work queued)
and WORK_BUSY_RUNNING (work currently executing). The shepherd now
correctly skips a CPU in all busy states.
After the fix, all sub-jiffy and most sub-100-jiffie gaps disappear. The
remaining early invocations have gaps in the 700–999 jiffie range,
attributable to round_jiffies_relative() aligning to a nearer
jiffie-second boundary rather than to this race.
Each spurious vmstat_update invocation has a measurable side effect:
refresh_cpu_vm_stats() calls decay_pcp_high() for every zone, which drains
idle per-CPU pages back to the buddy allocator via free_pcppages_bulk(),
taking the zone spinlock each time. Eliminating the double-scheduling
therefore reduces zone lock contention directly. On a 72-CPU stress-ng
workload measured with perf lock contention:
free_pcppages_bulk contention count: ~55% reduction
free_pcppages_bulk total wait time: ~57% reduction
free_pcppages_bulk max wait time: ~47% reduction
Note: work_busy() is inherently racy—between the check and the
subsequent queue_delayed_work_on() call, vmstat_update can finish
execution, leaving the work neither pending nor running. In that narrow
window the shepherd can still queue a second invocation. After the fix,
this residual race is rare and produces only occasional small gaps, a
significant improvement over the systematic double-scheduling seen with
delayed_work_pending().
Link: https://lore.kernel.org/20260409-vmstat-v2-1-e9d9a6db08ad@debian.org Fixes: 7b8da4c7f07774 ("vmstat: get rid of the ugly cpu_stat_off variable") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Dmitry Ilvokhin <d@ilvokhin.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Hildenbrand <david@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>