Linus Torvalds [Fri, 20 Feb 2026 17:44:39 +0000 (09:44 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Two arm64 fixes: one fixes a warning that started showing up with
gcc 16 and the other fixes a lockup in udelay() when running on a
vCPU loaded on a CPU with the new-fangled WFIT instruction:
- Fix compiler warning from huge_pte_clear() with GCC 16
- Fix hang in udelay() on systems with WFIT by consistently using the
virtual counter to calculate the delta"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hugetlbpage: avoid unused-but-set-parameter warning (gcc-16)
arm64: Force the use of CNTVCT_EL0 in __delay()
Linus Torvalds [Fri, 20 Feb 2026 17:24:45 +0000 (09:24 -0800)]
Merge tag 's390-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Make KEXEC_SIG available again for CONFIG_MODULES=n
- The s390 topology code used to call rebuild_sched_domains() before
common code scheduling domains were setup. This was silently ignored
by common code, but now results in a warning. Address by avoiding the
early call
- Convert debug area lock from spinlock to raw spinlock to address
lockdep warnings
- The recent 3490 tape device driver rework resulted in a different
device driver name, which is visible via sysfs for user space. This
breaks at least one user space application. Change the device driver
name back to its old name to fix this
* tag 's390-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/tape: Fix device driver name
s390/debug: Convert debug area lock from a spinlock to a raw spinlock
s390/smp: Avoid calling rebuild_sched_domains() early
s390/kexec: Make KEXEC_SIG available when CONFIG_MODULES=n
Linus Torvalds [Fri, 20 Feb 2026 16:57:35 +0000 (08:57 -0800)]
Merge tag 'for-linus-7.0-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A single patch fixing a boot regression when running as a Xen PV
guest. This issue was introduced in this merge window"
* tag 'for-linus-7.0-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Fix Xen PV guest boot
Linus Torvalds [Fri, 20 Feb 2026 16:48:31 +0000 (08:48 -0800)]
Merge tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V updates from Wei Liu:
- Debugfs support for MSHV statistics (Nuno Das Neves)
- Support for the integrated scheduler (Stanislav Kinsburskii)
- Various fixes for MSHV memory management and hypervisor status
handling (Stanislav Kinsburskii)
- Expose more capabilities and flags for MSHV partition management
(Anatol Belski, Muminul Islam, Magnus Kulke)
- Miscellaneous fixes to improve code quality and stability (Carlos
López, Ethan Nelson-Moore, Li RongQing, Michael Kelley, Mukesh
Rathor, Purna Pavan Chandra Aekkaladevi, Stanislav Kinsburskii, Uros
Bizjak)
- PREEMPT_RT fixes for vmbus interrupts (Jan Kiszka)
* tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (34 commits)
mshv: Handle insufficient root memory hypervisor statuses
mshv: Handle insufficient contiguous memory hypervisor status
mshv: Introduce hv_deposit_memory helper functions
mshv: Introduce hv_result_needs_memory() helper function
mshv: Add SMT_ENABLED_GUEST partition creation flag
mshv: Add nested virtualization creation flag
Drivers: hv: vmbus: Simplify allocation of vmbus_evt
mshv: expose the scrub partition hypercall
mshv: Add support for integrated scheduler
mshv: Use try_cmpxchg() instead of cmpxchg()
x86/hyperv: Fix error pointer dereference
x86/hyperv: Reserve 3 interrupt vectors used exclusively by MSHV
Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT
x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn
x86/hyperv: Use savesegment() instead of inline asm() to save segment registers
mshv: fix SRCU protection in irqfd resampler ack handler
mshv: make field names descriptive in a header struct
x86/hyperv: Update comment in hyperv_cleanup()
mshv: clear eventfd counter on irqfd shutdown
x86/hyperv: Use memremap()/memunmap() instead of ioremap_cache()/iounmap()
...
Linus Torvalds [Thu, 19 Feb 2026 18:39:08 +0000 (10:39 -0800)]
Merge tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Netfilter.
Current release - new code bugs:
- net: fix backlog_unlock_irq_restore() vs CONFIG_PREEMPT_RT
- eth: mlx5e: XSK, Fix unintended ICOSQ change
- phy_port: correctly recompute the port's linkmodes
- vsock: prevent child netns mode switch from local to global
- couple of kconfig fixes for new symbols
Previous releases - regressions:
- nfc: nci: fix false-positive parameter validation for packet data
- net: do not delay zero-copy skbs in skb_attempt_defer_free()
Previous releases - always broken:
- mctp: ensure our nlmsg responses to user space are zero-initialised
- ipv6: ioam: fix heap buffer overflow in __ioam6_fill_trace_data()
- fixes for ICMP rate limiting
Misc:
- intel: fix PCI device ID conflict between i40e and ipw2200"
* tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
net: nfc: nci: Fix parameter validation for packet data
net/mlx5e: Use unsigned for mlx5e_get_max_num_channels
net/mlx5e: Fix deadlocks between devlink and netdev instance locks
net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event
net/mlx5: Fix misidentification of write combining CQE during poll loop
net/mlx5e: Fix misidentification of ASO CQE during poll loop
net/mlx5: Fix multiport device check over light SFs
bonding: alb: fix UAF in rlb_arp_recv during bond up/down
bnge: fix reserving resources from FW
eth: fbnic: Advertise supported XDP features.
rds: tcp: fix uninit-value in __inet_bind
net/rds: Fix NULL pointer dereference in rds_tcp_accept_one
octeontx2-af: Fix default entries mcam entry action
net/mlx5e: XSK, Fix unintended ICOSQ change
ipv6: icmp: icmpv6_xrlim_allow() optimization if net.ipv6.icmp.ratelimit is zero
ipv4: icmp: icmpv4_xrlim_allow() optimization if net.ipv4.icmp_ratelimit is zero
ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow()
inet: move icmp_global_{credit,stamp} to a separate cache line
icmp: prevent possible overflow in icmp_global_allow()
selftests/net: packetdrill: add ipv4-mapped-ipv6 tests
...
- Fix a potential use-after-free of BTF object (Anton Protopopov)
- Add feature detection to libbpf and avoid moving arena global
variables on older kernels (Emil Tsalapatis)
- Remove extern declaration of bpf_stream_vprintk() from libbpf headers
(Ihor Solodrai)
- Fix truncated netlink dumps in bpftool (Jakub Kicinski)
- Fix map_kptr grace period wait in bpf selftests (Kumar Kartikeya
Dwivedi)
- Remove hexdump dependency while building bpf selftests (Matthieu
Baerts)
- Complete fsession support in BPF trampolines on riscv (Menglong Dong)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Remove hexdump dependency
libbpf: Remove extern declaration of bpf_stream_vprintk()
selftests/bpf: Use vmlinux.h in test_xdp_meta
bpftool: Fix truncated netlink dumps
libbpf: Delay feature gate check until object prepare time
libbpf: Do not use PROG_TYPE_TRACEPOINT program for feature gating
bpf: Add a map/btf from a fd array more consistently
selftests/bpf: Fix map_kptr grace period wait
selftests/bpf: enable fsession_test on riscv64
selftests/bpf: Adjust selftest due to function rename
bpf, riscv: add fsession support for trampolines
bpf: Fix a potential use-after-free of BTF object
bpf, riscv: introduce emit_store_stack_imm64() for trampoline
libbpf: Fix invalid write loop logic in bpf_linker__add_buf()
libbpf: Add gating for arena globals relocation feature
net: nfc: nci: Fix parameter validation for packet data
Since commit 9c328f54741b ("net: nfc: nci: Add parameter validation for
packet data") communication with nci nfc chips is not working any more.
The mentioned commit tries to fix access of uninitialized data, but
failed to understand that in some cases the data packet is of variable
length and can therefore not be compared to the maximum packet length
given by the sizeof(struct).
Fixes: 9c328f54741b ("net: nfc: nci: Add parameter validation for packet data") Cc: stable@vger.kernel.org Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Reported-by: syzbot+740e04c2a93467a0f8c8@syzkaller.appspotmail.com Link: https://patch.msgid.link/20260218083000.301354-1-michael.thalmeier@hale.at Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cosmin Ratiu [Wed, 18 Feb 2026 07:29:03 +0000 (09:29 +0200)]
net/mlx5e: Fix deadlocks between devlink and netdev instance locks
In the mentioned "Fixes" commit, various work tasks triggering devlink
health reporter recovery were switched to use netdev_trylock to protect
against concurrent tear down of the channels being recovered. But this
had the side effect of introducing potential deadlocks because of
incorrect lock ordering.
The correct lock order is described by the init flow:
probe_one -> mlx5_init_one (acquires devlink lock)
-> mlx5_init_one_devl_locked -> mlx5_register_device
-> mlx5_rescan_drivers_locked -...-> mlx5e_probe -> _mlx5e_probe
-> register_netdev (acquires rtnl lock)
-> register_netdevice (acquires netdev lock)
=> devlink lock -> rtnl lock -> netdev lock.
But in the current recovery flow, the order is wrong:
mlx5e_tx_err_cqe_work (acquires netdev lock)
-> mlx5e_reporter_tx_err_cqe -> mlx5e_health_report
-> devlink_health_report (acquires devlink lock => boom!)
-> devlink_health_reporter_recover
-> mlx5e_tx_reporter_recover -> mlx5e_tx_reporter_recover_from_ctx
-> mlx5e_tx_reporter_err_cqe_recover
The same pattern exists in:
mlx5e_reporter_rx_timeout
mlx5e_reporter_tx_ptpsq_unhealthy
mlx5e_reporter_tx_timeout
Fix these by moving the netdev_trylock calls from the work handlers
lower in the call stack, in the respective recovery functions, where
they are actually necessary.
Gal Pressman [Wed, 18 Feb 2026 07:29:02 +0000 (09:29 +0200)]
net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event
The macsec_aso_set_arm_event function calls mlx5_aso_poll_cq once
without a retry loop. If the CQE is not immediately available after
posting the WQE, the function fails unnecessarily.
Use read_poll_timeout() to poll 3-10 usecs for CQE, consistent with
other ASO polling code paths in the driver.
Fixes: 739cfa34518e ("net/mlx5: Make ASO poll CQ usable in atomic context") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gal Pressman [Wed, 18 Feb 2026 07:29:01 +0000 (09:29 +0200)]
net/mlx5: Fix misidentification of write combining CQE during poll loop
The write combining completion poll loop uses usleep_range() which can
sleep much longer than requested due to scheduler latency. Under load,
we witnessed a 20ms+ delay until the process was rescheduled, causing
the jiffies based timeout to expire while the thread is sleeping.
The original do-while loop structure (poll, sleep, check timeout) would
exit without a final poll when waking after timeout, missing a CQE that
arrived during sleep.
Instead of the open-coded while loop, use the kernel's poll_timeout_us()
which always performs an additional check after the sleep expiration,
and is less error-prone.
Note: poll_timeout_us() doesn't accept a sleep range, by passing 10
sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs.
Fixes: d98995b4bf98 ("net/mlx5: Reimplement write combining test") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gal Pressman [Wed, 18 Feb 2026 07:29:00 +0000 (09:29 +0200)]
net/mlx5e: Fix misidentification of ASO CQE during poll loop
The ASO completion poll loop uses usleep_range() which can sleep much
longer than requested due to scheduler latency. Under load, we witnessed
a 20ms+ delay until the process was rescheduled, causing the jiffies
based timeout to expire while the thread is sleeping.
The original do-while loop structure (poll, sleep, check timeout) would
exit without a final poll when waking after timeout, missing a CQE that
arrived during sleep.
Instead of the open-coded while loop, use the kernel's
read_poll_timeout() which always performs an additional check after the
sleep expiration, and is less error-prone.
Note: read_poll_timeout() doesn't accept a sleep range, by passing 10
sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs.
Fixes: 739cfa34518e ("net/mlx5: Make ASO poll CQ usable in atomic context") Fixes: 7e3fce82d945 ("net/mlx5e: Overcome slow response for first macsec ASO WQE") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shay Drory [Wed, 18 Feb 2026 07:28:59 +0000 (09:28 +0200)]
net/mlx5: Fix multiport device check over light SFs
Driver is using num_vhca_ports capability to distinguish between
multiport master device and multiport slave device. num_vhca_ports is a
capability the driver sets according to the MAX num_vhca_ports
capability reported by FW. On the other hand, light SFs doesn't set the
above capbility.
This leads to wrong results whenever light SFs is checking whether he is
a multiport master or slave.
Therefore, use the MAX capability to distinguish between master and
slave devices.
Fixes: e71383fb9cd1 ("net/mlx5: Light probe local SFs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hangbin Liu [Wed, 18 Feb 2026 06:09:19 +0000 (06:09 +0000)]
bonding: alb: fix UAF in rlb_arp_recv during bond up/down
The ALB RX path may access rx_hashtbl concurrently with bond
teardown. During rapid bond up/down cycles, rlb_deinitialize()
frees rx_hashtbl while RX handlers are still running, leading
to a null pointer dereference detected by KASAN.
However, the root cause is that rlb_arp_recv() can still be accessed
after setting recv_probe to NULL, which is actually a use-after-free
(UAF) issue. That is the reason for using the referenced commit in the
Fixes tag.
The issue is reproducible by repeatedly running
ip link set bond0 up/down while receiving ARP messages, where
rlb_arp_recv() can race with rlb_deinitialize() and dereference
a freed rx_hashtbl entry.
Fix this by setting recv_probe to NULL and then calling
synchronize_net() to wait for any concurrent RX processing to finish.
This ensures that no RX handler can access rx_hashtbl after it is freed
in bond_alb_deinitialize().
Reported-by: Liang Li <liali@redhat.com> Fixes: 3aba891dde38 ("bonding: move processing of recv handlers into handle_frame()") Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260218060919.101574-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vikas Gupta [Wed, 18 Feb 2026 05:27:55 +0000 (10:57 +0530)]
bnge: fix reserving resources from FW
HWRM_FUNC_CFG is used to reserve resources, whereas HWRM_FUNC_QCFG is
intended for querying resource information from the firmware.
Since __bnge_hwrm_reserve_pf_rings() reserves resources for a specific
PF, the command type should be HWRM_FUNC_CFG.
net/rds: Fix NULL pointer dereference in rds_tcp_accept_one
Save a local pointer to new_sock->sk and hold a reference before
installing callbacks in rds_tcp_accept_one. After
rds_tcp_set_callbacks() or rds_tcp_reset_callbacks(), tc->t_sock is
set to new_sock which may race with the shutdown path. A concurrent
rds_tcp_conn_path_shutdown() may call sock_release(), which sets
new_sock->sk = NULL and may eventually free sk when the refcount
reaches zero.
Subsequent accesses to new_sock->sk->sk_state would dereference NULL,
causing the crash. The fix saves a local sk pointer before callbacks
are installed so that sk_state can be accessed safely even after
new_sock->sk is nulled, and uses sock_hold()/sock_put() to ensure
sk itself remains valid for the duration.
Fixes: 826c1004d4ae ("net/rds: rds_tcp_conn_path_shutdown must not discard messages") Reported-by: syzbot+96046021045ffe6d7709@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=96046021045ffe6d7709 Signed-off-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/20260216222643.2391390-1-achender@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jan Höppner [Thu, 19 Feb 2026 11:05:53 +0000 (12:05 +0100)]
s390/tape: Fix device driver name
Recent cleanups and code consolidations in the s390 tape device driver
renamed files and function namespaces from tape_34xx to tape_3490 to
better reflect the single support of the IBM 3490E device in the
codebase.
These changes also renamed the driver name to tape_3490, which
consequently broke userspace as the sysfs driver path is now
/sys/bus/ccw/drivers/tape_3490/ instead of
/sys/bus/ccw/drivers/tape_34xx/.
Change the device driver name back to tape_34xx to fix userspace.
Fixes: 9872dae6102e ("s390/tape: Rename tape_34xx.c to tape_3490.c") Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
gcc-16 warns about an instance that older compilers did not:
arch/arm64/mm/hugetlbpage.c: In function 'huge_pte_clear':
arch/arm64/mm/hugetlbpage.c:369:57: error: parameter 'addr' set but not used [-Werror=unused-but-set-parameter=]
The issue here is that __pte_clear() does not actually use its second
argument, but when CONFIG_ARM64_CONTPTE is enabled it still gets
updated.
Replace the macro with an inline function to let the compiler see
the argument getting passed down.
Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Dev Jain <dev.jain@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
Marc Zyngier [Fri, 13 Feb 2026 14:16:19 +0000 (14:16 +0000)]
arm64: Force the use of CNTVCT_EL0 in __delay()
Quentin forwards a report from Hyesoo Yu, describing an interesting
problem with the use of WFxT in __delay() when a vcpu is loaded and
that KVM is *not* in VHE mode (either nVHE or hVHE).
In this case, CNTVOFF_EL2 is set to a non-zero value to reflect the
state of the guest virtual counter. At the same time, __delay() is
using get_cycles() to read the counter value, which is indirected to
reading CNTPCT_EL0.
The core of the issue is that WFxT is using the *virtual* counter,
while the kernel is using the physical counter, and that the offset
introduces a really bad discrepancy between the two.
Fix this by forcing the use of CNTVCT_EL0, making __delay() consistent
irrespective of the value of CNTVOFF_EL2.
Reported-by: Hyesoo Yu <hyesoo.yu@samsung.com> Reported-by: Quentin Perret <qperret@google.com> Reviewed-by: Quentin Perret <qperret@google.com> Fixes: 7d26b0516a0d ("arm64: Use WFxT for __delay() when possible") Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/ktosachvft2cgqd5qkukn275ugmhy6xrhxur4zqpdxlfr3qh5h@o3zrfnsq63od Cc: stable@vger.kernel.org Signed-off-by: Will Deacon <will@kernel.org>
When creating guest partition objects, the hypervisor may fail to
allocate root partition pages and return an insufficient memory status.
In this case, deposit memory using the root partition ID instead.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Mukesh R <mrathor@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
mshv: Handle insufficient contiguous memory hypervisor status
The HV_STATUS_INSUFFICIENT_CONTIGUOUS_MEMORY status indicates that the
hypervisor lacks sufficient contiguous memory for its internal allocations.
When this status is encountered, allocate and deposit
HV_MAX_CONTIGUOUS_ALLOCATION_PAGES contiguous pages to the hypervisor.
HV_MAX_CONTIGUOUS_ALLOCATION_PAGES is defined in the hypervisor headers, a
deposit of this size will always satisfy the hypervisor's requirements.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Mukesh R <mrathor@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Introduce hv_deposit_memory_node() and hv_deposit_memory() helper
functions to handle memory deposit with proper error handling.
The new hv_deposit_memory_node() function takes the hypervisor status
as a parameter and validates it before depositing pages. It checks for
HV_STATUS_INSUFFICIENT_MEMORY specifically and returns an error for
unexpected status codes.
This is a precursor patch to new out-of-memory error codes support.
No functional changes intended.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Mukesh R <mrathor@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Linus Torvalds [Thu, 19 Feb 2026 05:40:16 +0000 (21:40 -0800)]
Merge tag 'mm-nonmm-stable-2026-02-18-19-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more non-MM updates from Andrew Morton:
- "two fixes in kho_populate()" fixes a couple of not-major issues in
the kexec handover code (Ran Xiaokai)
- misc singletons
* tag 'mm-nonmm-stable-2026-02-18-19-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
lib/group_cpus: handle const qualifier from clusters allocation type
kho: remove unnecessary WARN_ON(err) in kho_populate()
kho: fix missing early_memunmap() call in kho_populate()
scripts/gdb: implement x86_page_ops in mm.py
objpool: fix the overestimation of object pooling metadata size
selftests/memfd: use IPC semaphore instead of SIGSTOP/SIGCONT
delayacct: fix build regression on accounting tool
Linus Torvalds [Thu, 19 Feb 2026 04:50:32 +0000 (20:50 -0800)]
Merge tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a
couple of issues in the demotion code - pages were failed demotion
and were finding themselves demoted into disallowed nodes (Bing Jiao)
- "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare
mapledtree race and performs a number of cleanups (Liam Howlett)
- "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use
them" implements a lot of cleanups following on from the conversion
of the VMA flags into a bitmap (Lorenzo Stoakes)
- "support batch checking of references and unmapping for large folios"
implements batching to greatly improve the performance of reclaiming
clean file-backed large folios (Baolin Wang)
- "selftests/mm: add memory failure selftests" does as claimed (Miaohe
Lin)
* tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits)
mm/page_alloc: clear page->private in free_pages_prepare()
selftests/mm: add memory failure dirty pagecache test
selftests/mm: add memory failure clean pagecache test
selftests/mm: add memory failure anonymous page test
mm: rmap: support batched unmapping for file large folios
arm64: mm: implement the architecture-specific clear_flush_young_ptes()
arm64: mm: support batch clearing of the young flag for large folios
arm64: mm: factor out the address and ptep alignment into a new helper
mm: rmap: support batched checks of the references for large folios
tools/testing/vma: add VMA userland tests for VMA flag functions
tools/testing/vma: separate out vma_internal.h into logical headers
tools/testing/vma: separate VMA userland tests into separate files
mm: make vm_area_desc utilise vma_flags_t only
mm: update all remaining mmap_prepare users to use vma_flags_t
mm: update shmem_[kernel]_file_*() functions to use vma_flags_t
mm: update secretmem to use VMA flags on mmap_prepare
mm: update hugetlbfs to use VMA flags on mmap_prepare
mm: add basic VMA flag operation helper functions
tools: bitmap: add missing bitmap_[subset(), andnot()]
mm: add mk_vma_flags() bitmap flag macro helper
...
As per design, AF should update the default MCAM action only when
mcam_index is -1. A bug in the previous patch caused default entries
to be changed even when the request was not for them.
Jakub Kicinski [Thu, 19 Feb 2026 01:09:30 +0000 (17:09 -0800)]
Merge tag 'nf-26-02-17' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter: updates for net
The following patchset contains Netfilter fixes for *net*:
1) Add missing __rcu annotations to NAT helper hook pointers in Amanda,
FTP, IRC, SNMP and TFTP helpers. From Sun Jian.
2-4):
- Add global spinlock to serialize nft_counter fetch+reset operations.
- Use atomic64_xchg() for nft_quota reset instead of read+subtract pattern.
Note AI review detects a race in this change but it isn't new. The
'racing' bit only exists to prevent constant stream of 'quota expired'
notifications.
- Revert commit_mutex usage in nf_tables reset path, it caused
circular lock dependency. All from Brian Witte.
5) Fix uninitialized l3num value in nf_conntrack_h323 helper.
6) Fix musl libc compatibility in netfilter_bridge.h UAPI header. This
change isn't nice (UAPI headers should not include libc headers), but
as-is musl builds may fail due to redefinition of struct ethhdr.
7) Fix protocol checksum validation in IPVS for IPv6 with extension headers,
from Julian Anastasov.
8) Fix device reference leak in IPVS when netdev goes down. Also from
Julian.
9) Remove WARN_ON_ONCE when accessing forward path array, this can
trigger with sufficiently long forward paths. From Pablo Neira Ayuso.
10) Fix use-after-free in nf_tables_addchain() error path, from Inseo An.
* tag 'nf-26-02-17' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: fix use-after-free in nf_tables_addchain()
net: remove WARN_ON_ONCE when accessing forward path array
ipvs: do not keep dest_dst if dev is going down
ipvs: skip ipv6 extension headers for csum checks
include: uapi: netfilter_bridge.h: Cover for musl libc
netfilter: nf_conntrack_h323: don't pass uninitialised l3num value
netfilter: nf_tables: revert commit_mutex usage in reset path
netfilter: nft_quota: use atomic64_xchg for reset
netfilter: nft_counter: serialize reset with spinlock
netfilter: annotate NAT helper hook pointers with __rcu
====================
Tariq Toukan [Tue, 17 Feb 2026 07:45:25 +0000 (09:45 +0200)]
net/mlx5e: XSK, Fix unintended ICOSQ change
XSK wakeup must use the async ICOSQ (with proper locking), as it is not
guaranteed to run on the same CPU as the channel.
The commit that converted the NAPI trigger path to use the sync ICOSQ
incorrectly applied the same change to XSK, causing XSK wakeups to use
the sync ICOSQ as well. Revert XSK flows to use the async ICOSQ.
XDP program attach/detach triggers channel reopen, while XSK pool
enable/disable can happen on-the-fly via NDOs without reopening
channels. As a result, xsk_pool state cannot be reliably used at
mlx5e_open_channel() time to decide whether an async ICOSQ is needed.
Update the async_icosq_needed logic to depend on the presence of an XDP
program rather than the xsk_pool, ensuring the async ICOSQ is available
when XSK wakeups are enabled.
This fixes multiple issues:
1. Illegal synchronize_rcu() in an RCU read- side critical section via
mlx5e_xsk_wakeup() -> mlx5e_trigger_napi_icosq() ->
synchronize_net(). The stack holds RCU read-lock in xsk_poll().
2. Hitting a NULL pointer dereference in mlx5e_xsk_wakeup():
Jakub Kicinski [Thu, 19 Feb 2026 00:46:38 +0000 (16:46 -0800)]
Merge branch 'icmp-better-deal-with-ddos'
Eric Dumazet says:
====================
icmp: better deal with DDOS
When dealing with death of big UDP servers, admins might want to
increase net.ipv4.icmp_msgs_per_sec and net.ipv4.icmp_msgs_burst
to big values (2,000,000 or more).
They also might need to tune the per-host ratelimit to 1ms or 0ms
in favor of the global rate limit.
This series fixes bugs showing up in all these needs.
====================
Eric Dumazet [Mon, 16 Feb 2026 14:28:30 +0000 (14:28 +0000)]
ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow()
Following part was needed before the blamed commit, because
inet_getpeer_v6() second argument was the prefix.
/* Give more bandwidth to wider prefixes. */
if (rt->rt6i_dst.plen < 128)
tmo >>= ((128 - rt->rt6i_dst.plen)>>5);
Now inet_getpeer_v6() retrieves hosts, we need to remove
@tmo adjustement or wider prefixes likes /24 allow 8x
more ICMP to be sent for a given ratelimit.
As we had this issue for a while, this patch changes net.ipv6.icmp.ratelimit
default value from 1000ms to 100ms to avoid potential regressions.
Also add a READ_ONCE() when reading net->ipv6.sysctl.icmpv6_time.
Fixes: fd0273d7939f ("ipv6: Remove external dependency on rt6i_dst and rt6i_src") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260216142832.3834174-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 16 Feb 2026 14:28:29 +0000 (14:28 +0000)]
inet: move icmp_global_{credit,stamp} to a separate cache line
icmp_global_credit was meant to be changed ~1000 times per second,
but if an admin sets net.ipv4.icmp_msgs_per_sec to a very high value,
icmp_global_credit changes can inflict false sharing to surrounding
fields that are read mostly.
Move icmp_global_credit and icmp_global_stamp to a separate
cacheline aligned group.
Fixes: b056b4cd9178 ("icmp: move icmp_global.credit and icmp_global.stamp to per netns storage") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260216142832.3834174-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Muminul Islam [Wed, 18 Feb 2026 14:47:59 +0000 (14:47 +0000)]
mshv: Add nested virtualization creation flag
Introduce HV_PARTITION_CREATION_FLAG_NESTED_VIRTUALIZATION_CAPABLE to
indicate support for nested virtualization during partition creation.
This enables clearer configuration and capability checks for nested
virtualization scenarios.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Muminul Islam <muislam@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Michael Kelley [Wed, 18 Feb 2026 17:01:21 +0000 (09:01 -0800)]
Drivers: hv: vmbus: Simplify allocation of vmbus_evt
The per-cpu variable vmbus_evt is currently dynamically allocated. It's
only 8 bytes, so just allocate it statically to simplify and save a few
lines of code.
Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Query the hypervisor for integrated scheduler support and use it if
configured.
Microsoft Hypervisor originally provided two schedulers: root and core. The
root scheduler allows the root partition to schedule guest vCPUs across
physical cores, supporting both time slicing and CPU affinity (e.g., via
cgroups). In contrast, the core scheduler delegates vCPU-to-physical-core
scheduling entirely to the hypervisor.
Direct virtualization introduces a new privileged guest partition type - L1
Virtual Host (L1VH) — which can create child partitions from its own
resources. These child partitions are effectively siblings, scheduled by
the hypervisor's core scheduler. This prevents the L1VH parent from setting
affinity or time slicing for its own processes or guest VPs. While cgroups,
CFS, and cpuset controllers can still be used, their effectiveness is
unpredictable, as the core scheduler swaps vCPUs according to its own logic
(typically round-robin across all allocated physical CPUs). As a result,
the system may appear to "steal" time from the L1VH and its children.
To address this, Microsoft Hypervisor introduces the integrated scheduler.
This allows an L1VH partition to schedule its own vCPUs and those of its
guests across its "physical" cores, effectively emulating root scheduler
behavior within the L1VH, while retaining core scheduler behavior for the
rest of the system.
The integrated scheduler is controlled by the root partition and gated by
the vmm_enable_integrated_scheduler capability bit. If set, the hypervisor
supports the integrated scheduler. The L1VH partition must then check if it
is enabled by querying the corresponding extended partition property. If
this property is true, the L1VH partition must use the root scheduler
logic; otherwise, it must use the core scheduler. This requirement makes
reading VMM capabilities in L1VH partition a requirement too.
Signed-off-by: Andreea Pintilie <anpintil@microsoft.com> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Uros Bizjak [Wed, 18 Feb 2026 11:00:18 +0000 (12:00 +0100)]
mshv: Use try_cmpxchg() instead of cmpxchg()
Use !try_cmpxchg() instead of cmpxchg (*ptr, old, new) != old.
x86 CMPXCHG instruction returns success in ZF flag, so this
change saves a compare after CMPXCHG.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Long Li <longli@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Ethan Tidmore [Wed, 18 Feb 2026 19:09:03 +0000 (13:09 -0600)]
x86/hyperv: Fix error pointer dereference
The function idle_thread_get() can return an error pointer and is not
checked for it. Add check for error pointer.
Detected by Smatch:
arch/x86/hyperv/hv_vtl.c:126 hv_vtl_bringup_vcpu() error:
'idle' dereferencing possible ERR_PTR()
Fixes: 2b4b90e053a29 ("x86/hyperv: Use per cpu initial stack for vtl context") Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Mukesh Rathor [Tue, 17 Feb 2026 23:11:58 +0000 (15:11 -0800)]
x86/hyperv: Reserve 3 interrupt vectors used exclusively by MSHV
MSVC compiler, used to compile the Microsoft Hypervisor, currently
has an assert intrinsic that uses interrupt vector 0x29 to create an
exception. This will cause hypervisor to then crash and collect core. As
such, if this interrupt number is assigned to a device by Linux and the
device generates it, hypervisor will crash. There are two other such
vectors hard coded in the hypervisor, 0x2C and 0x2D for debug purposes.
Fortunately, the three vectors are part of the kernel driver space and
that makes it feasible to reserve them early so they are not assigned
later.
Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
The verification signature header generation requires converting a
binary certificate to a C array. Previously this only worked with xxd,
and a switch to hexdump has been done in commit b640d556a2b3
("selftests/bpf: Remove xxd util dependency").
hexdump is a more common utility program, yet it might not be installed
by default. When it is not installed, BPF selftests build without
errors, but tests_progs is unusable: it exits with the 255 code and
without any error messages. When manually reproducing the issue, it is
not too hard to find out that the generated verification_cert.h file is
incorrect, but that's time consuming. When digging the BPF selftests
build logs, this line can be seen amongst thousands others, but ignored:
/bin/sh: 2: hexdump: not found
Here, od is used instead of hexdump. od is coming from the coreutils
package, and this new od command produces the same output when using od
from GNU coreutils, uutils, and even busybox. This is more portable, and
it produces a similar results to what was done before with hexdump:
there is an extra comma at the end instead of trailing whitespaces,
but the C code is not impacted.
Ihor Solodrai [Wed, 18 Feb 2026 21:56:51 +0000 (13:56 -0800)]
libbpf: Remove extern declaration of bpf_stream_vprintk()
An issue was reported that building BPF program which includes both
vmlinux.h and bpf_helpers.h from libbpf fails due to conflicting
declarations of bpf_stream_vprintk().
Remove the extern declaration from bpf_helpers.h to address this.
In order to use bpf_stream_printk() macro, BPF programs are expected
to either include vmlinux.h of the kernel they are targeting, or add
their own extern declaration.
Ihor Solodrai [Wed, 18 Feb 2026 21:56:50 +0000 (13:56 -0800)]
selftests/bpf: Use vmlinux.h in test_xdp_meta
- Replace linux/* includes with vmlinux.h
- Include errno.h
- Include bpf_tracing_net.h for TC_ACT_* and ETH_*
- Use BPF_STDERR instead of BPF_STREAM_STDERR
Linus Torvalds [Wed, 18 Feb 2026 22:33:18 +0000 (14:33 -0800)]
Merge tag 'thermal-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"This fixes a sysfs group leak on DLVR registration failure in the
Intel int340x thermal driver (Kaushlendra Kumar)"
* tag 'thermal-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: Fix sysfs group leak on DLVR registration failure
Linus Torvalds [Wed, 18 Feb 2026 22:28:57 +0000 (14:28 -0800)]
Merge tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI support updates from Rafael J. Wysocki:
"These are mostly fixes and cleanups on top of the ACPI support updates
merged recently, including two new quirks, an ACPI CPPC library fix,
and fixes and cleanups of a few core ACPI device drivers:
- Add an unused power resource handling quirk for THUNDEROBOT ZERO
(Zhai Can)
- Fix remaining for_each_possible_cpu() in the ACPI CPPC library to
use online CPUs (Sean V Kelley)
- Drop redundant checks from the ACPI notify handler and the driver
remove callback in the ACPI battery driver (Rafael Wysocki)
- Move the creation of the wakeup source during the ACPI button
driver probe to an earlier point to avoid missing a wakeup event
due to a race and clean up system wakeup handling and remove
callback in that driver (Rafael Wysocki)
- Drop unnecessary driver_data pointer clearing from the ACPI EC and
SMBUS HC drivers and make the ACPI backlight (video) driver clear
the device's driver_data pointer on remove (Rafael Wysocki)
- Force enabling of PWM2 on the Yogabook YB1-X90 tablets (Yauhen
Kharuzhy)"
* tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Add unused power resource quirk for THUNDEROBOT ZERO
ACPI: driver: Drop driver_data pointer clearing from two drivers
ACPI: video: Clear driver_data pointer on remove
ACPI: button: Tweak acpi_button_remove()
ACPI: button: Tweak system wakeup handling
ACPI: battery: Drop redundant checks from acpi_battery_remove()
ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs
ACPI: x86: Force enabling of PWM2 on the Yogabook YB1-X90
ACPI: button: Call device_init_wakeup() earlier during probe
ACPI: battery: Drop redundant check from acpi_battery_notify()
Linus Torvalds [Wed, 18 Feb 2026 22:11:47 +0000 (14:11 -0800)]
Merge tag 'pm-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are mostly fixes on top of the power management updates merged
recently in cpuidle governors, in the Intel RAPL power capping driver
and in the wake IRQ management code:
- Fix the handling of package-scope MSRs in the intel_rapl power
capping driver when called from the PMU subsystem and make it add
all package CPUs to the PMU cpumask to allow tools to read RAPL
events from any CPU in the package (Kuppuswamy Satharayananyan)
- Rework the invalid version check in the intel_rapl_tpmi power
capping driver to account for the fact that on partitioned systems,
multiple TPMI instances may exist per package, but RAPL registers
are only valid on one instance (Kuppuswamy Satharayananyan)
- Describe the new intel_idle.table command line option in the
admin-guide intel_idle documentation (Artem Bityutskiy)
- Fix a crash in the ladder cpuidle governor on systems with only one
(polling) idle state available by making the cpuidle core bypass
the governor in those cases and adjust the other existing governors
to that change (Aboorva Devarajan, Christian Loehle)
- Update kerneldoc comments for wake IRQ management functions that
have not been matching the code (Wang Jiayue)"
* tag 'pm-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: menu: Remove single state handling
cpuidle: teo: Remove single state handling
cpuidle: haltpoll: Remove single state handling
cpuidle: Skip governor when only one idle state is available
powercap: intel_rapl_tpmi: Remove FW_BUG from invalid version check
PM: sleep: wakeirq: Update outdated documentation comments
Documentation: PM: Document intel_idle.table command line option
powercap: intel_rapl: Expose all package CPUs in PMU cpumask
powercap: intel_rapl: Remove incorrect CPU check in PMU context
Merge branches 'acpi-battery', 'acpi-button' and 'acpi-driver'
Merge additional updates of multiple core ACPI device drivers (battery,
button, video, EC, SMBUS HC) for 7.0-rc1:
- Drop redundant checks from the ACPI notify handler and the driver
remove callback in the ACPI battery driver (Rafael Wysocki)
- Move the creation of the wakeup source during the ACPI button driver
probe to an earlier point to avoid missing a wakeup event due to a
race and clean up system wakeup handling and remove callback in that
driver (Rafael Wysocki)
- Drop unnecessary driver_data pointer clearing from the ACPI EC and
SMBUS HC drivers and make the ACPI backlight (video) driver clear the
device's driver_data pointer on remove (Rafael Wysocki)
* acpi-battery:
ACPI: battery: Drop redundant checks from acpi_battery_remove()
ACPI: battery: Drop redundant check from acpi_battery_notify()
* acpi-button:
ACPI: button: Tweak acpi_button_remove()
ACPI: button: Tweak system wakeup handling
ACPI: button: Call device_init_wakeup() earlier during probe
* acpi-driver:
ACPI: driver: Drop driver_data pointer clearing from two drivers
ACPI: video: Clear driver_data pointer on remove
Merge additional power capping and cpuidle updates for 7.0-rc1:
- Fix the handling of package-scope MSRs in the intel_rapl power
capping driver when called from the PMU subsystem and make it add all
package CPUs to the PMU cpumask to allow tools to read RAPL events
from any CPU in the package (Kuppuswamy Sathyanarayanan)
- Rework the invalid version check in the intel_rapl_tpmi power capping
driver to account for the fact that on partitioned systems, multiple
TPMI instances may exist per package, but RAPL registers are only
valid on one instance (Kuppuswamy Satharayananyan)
- Describe the new intel_idle.table command line option in the
admin-guide intel_idle documentation (Artem Bityutskiy)
- Fix a crash in the ladder cpuidle governor on systems with only one
(polling) idle state available by making the cpuidle core bypass the
governor in those cases and adjust the other existing governors to
that change (Aboorva Devarajan, Christian Loehle)
* pm-powercap:
powercap: intel_rapl_tpmi: Remove FW_BUG from invalid version check
powercap: intel_rapl: Expose all package CPUs in PMU cpumask
powercap: intel_rapl: Remove incorrect CPU check in PMU context
* pm-cpuidle:
cpuidle: menu: Remove single state handling
cpuidle: teo: Remove single state handling
cpuidle: haltpoll: Remove single state handling
cpuidle: Skip governor when only one idle state is available
Documentation: PM: Document intel_idle.table command line option
Linus Torvalds [Wed, 18 Feb 2026 18:45:36 +0000 (10:45 -0800)]
Merge tag 'sysctl-7.00-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
- Remove macros from proc handler converters
Replace the proc converter macros with "regular" functions. Though it
is more verbose than the macro version, it helps when debugging and
better aligns with coding-style.rst.
- General cleanup
Remove superfluous ctl_table forward declarations. Const qualify the
memory_allocation_profiling_sysctl and loadpin_sysctl_table arrays.
Add missing kernel doc to proc_dointvec_conv.
- Testing
This series was run through sysctl selftests/kunit test suite in
x86_64. And went into linux-next after rc4, giving it a good 3 weeks
of testing
* tag 'sysctl-7.00-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
sysctl: replace SYSCTL_INT_CONV_CUSTOM macro with functions
sysctl: Replace unidirectional INT converter macros with functions
sysctl: Add kernel doc to proc_douintvec_conv
sysctl: Replace UINT converter macros with functions
sysctl: Add CONFIG_PROC_SYSCTL guards for converter macros
sysctl: clarify proc_douintvec_minmax doc
sysctl: Return -ENOSYS from proc_douintvec_conv when CONFIG_PROC_SYSCTL=n
sysctl: Remove unused ctl_table forward declarations
loadpin: Implement custom proc_handler for enforce
alloc_tag: move memory_allocation_profiling_sysctls into .rodata
sysctl: Add missing kernel-doc for proc_dointvec_conv
Benjamin Block [Tue, 17 Feb 2026 13:29:12 +0000 (14:29 +0100)]
s390/debug: Convert debug area lock from a spinlock to a raw spinlock
With PREEMPT_RT as potential configuration option, spinlock_t is now
considered as a sleeping lock, and thus might cause issues when used in
an atomic context. But even with PREEMPT_RT as potential configuration
option, raw_spinlock_t remains as a true spinning lock/atomic context.
This creates potential issues with the s390 debug/tracing feature. The
functions to trace errors are called in various contexts, including
under lock of raw_spinlock_t, and thus the used spinlock_t in each debug
area is in violation of the locking semantics.
Here are two examples involving failing PCI Read accesses that are
traced while holding `pci_lock` in `drivers/pci/access.c`:
=============================
[ BUG: Invalid wait context ]
6.19.0-devel #18 Not tainted
-----------------------------
bash/3833 is trying to lock: 0000027790baee30 (&rc->lock){-.-.}-{3:3}, at: debug_event_common+0xfc/0x300
other info that might help us debug this:
context-{5:5}
5 locks held by bash/3833:
#0: 0000027efbb29450 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x7c/0xf0
#1: 00000277f0504a90 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x13e/0x260
#2: 00000277beed8c18 (kn->active#339){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x164/0x260
#3: 00000277e9859190 (&dev->mutex){....}-{4:4}, at: pci_dev_lock+0x2e/0x40
#4: 00000383068a7708 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x4a/0xb0
stack backtrace:
CPU: 6 UID: 0 PID: 3833 Comm: bash Kdump: loaded Not tainted 6.19.0-devel #18 PREEMPTLAZY
Hardware name: IBM 9175 ME1 701 (LPAR)
Call Trace:
[<00000383048afec2>] dump_stack_lvl+0xa2/0xe8
[<00000383049ba166>] __lock_acquire+0x816/0x1660
[<00000383049bb1fa>] lock_acquire+0x24a/0x370
[<00000383059e3860>] _raw_spin_lock_irqsave+0x70/0xc0
[<00000383048bbb6c>] debug_event_common+0xfc/0x300
[<0000038304900b0a>] __zpci_load+0x17a/0x1f0
[<00000383048fad88>] pci_read+0x88/0xd0
[<00000383054cbce0>] pci_bus_read_config_dword+0x70/0xb0
[<00000383054d55e4>] pci_dev_wait+0x174/0x290
[<00000383054d5a3e>] __pci_reset_function_locked+0xfe/0x170
[<00000383054d9b30>] pci_reset_function+0xd0/0x100
[<00000383054ee21a>] reset_store+0x5a/0x80
[<0000038304e98758>] kernfs_fop_write_iter+0x1e8/0x260
[<0000038304d995da>] new_sync_write+0x13a/0x180
[<0000038304d9c5d0>] vfs_write+0x200/0x330
[<0000038304d9c88c>] ksys_write+0x7c/0xf0
[<00000383059cfa80>] __do_syscall+0x210/0x500
[<00000383059e4c06>] system_call+0x6e/0x90
INFO: lockdep is turned off.
=============================
[ BUG: Invalid wait context ]
6.19.0-devel #3 Not tainted
-----------------------------
bash/6861 is trying to lock: 0000009da05c7430 (&rc->lock){-.-.}-{3:3}, at: debug_event_common+0xfc/0x300
other info that might help us debug this:
context-{5:5}
5 locks held by bash/6861:
#0: 000000acff404450 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x7c/0xf0
#1: 000000acff41c490 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x13e/0x260
#2: 0000009da36937d8 (kn->active#75){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x164/0x260
#3: 0000009dd15250d0 (&zdev->state_lock){+.+.}-{4:4}, at: enable_slot+0x2e/0xc0
#4: 000001a19682f708 (pci_lock){....}-{2:2}, at: pci_bus_read_config_byte+0x42/0xa0
stack backtrace:
CPU: 16 UID: 0 PID: 6861 Comm: bash Kdump: loaded Not tainted 6.19.0-devel #3 PREEMPTLAZY
Hardware name: IBM 9175 ME1 701 (LPAR)
Call Trace:
[<000001a194837ec2>] dump_stack_lvl+0xa2/0xe8
[<000001a194942166>] __lock_acquire+0x816/0x1660
[<000001a1949431fa>] lock_acquire+0x24a/0x370
[<000001a19596b810>] _raw_spin_lock_irqsave+0x70/0xc0
[<000001a194843b6c>] debug_event_common+0xfc/0x300
[<000001a194888b0a>] __zpci_load+0x17a/0x1f0
[<000001a194882d88>] pci_read+0x88/0xd0
[<000001a195453b88>] pci_bus_read_config_byte+0x68/0xa0
[<000001a195457bc2>] pci_setup_device+0x62/0xad0
[<000001a195458e70>] pci_scan_single_device+0x90/0xe0
[<000001a19488a0f6>] zpci_bus_scan_device+0x46/0x80
[<000001a19547f958>] enable_slot+0x98/0xc0
[<000001a19547f134>] power_write_file+0xc4/0x110
[<000001a194e20758>] kernfs_fop_write_iter+0x1e8/0x260
[<000001a194d215da>] new_sync_write+0x13a/0x180
[<000001a194d245d0>] vfs_write+0x200/0x330
[<000001a194d2488c>] ksys_write+0x7c/0xf0
[<000001a195957a30>] __do_syscall+0x210/0x500
[<000001a19596cbb6>] system_call+0x6e/0x90
INFO: lockdep is turned off.
Since it is desired to keep it possible to create trace records in most
situations, including this particular case (failing PCI config space
accesses are relevant), convert the used spinlock_t in `struct
debug_info` to raw_spinlock_t.
The impact is small, as the debug area lock only protects bounded memory
access without external dependencies, apart from one function
debug_set_size() where kfree() is implicitly called with the lock held.
Move debug_info_free() out of this lock, to keep remove this external
dependency.
efi: Align unaccepted memory range to page boundary
The accept_memory() and range_contains_unaccepted_memory() functions
employ a "guard page" logic to prevent crashes with load_unaligned_zeropad().
This logic extends the range to be accepted (or checked) by one unit_size
if the end of the range is aligned to a unit_size boundary.
However, if the caller passes a range that is not page-aligned, the
'end' of the range might not be numerically aligned to unit_size, even
if it covers the last page of a unit. This causes the "if (!(end % unit_size))"
check to fail, skipping the necessary extension and leaving the next
unit unaccepted, which can lead to a kernel panic when accessed by
load_unaligned_zeropad().
Align the start address down and the size up to the nearest page
boundary before performing the unit_size alignment check. This ensures
that the guard unit is correctly added when the range effectively ends
on a unit boundary.
Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The reserve_unaccepted() function incorrectly calculates the size of the
memblock reservation for the unaccepted memory table. It aligns the
size of the table, but fails to account for cases where the table's
starting physical address (efi.unaccepted) is not page-aligned.
If the table starts at an offset within a page and its end crosses into
a subsequent page that the aligned size does not cover, the end of the
table will not be reserved. This can lead to the table being overwritten
or inaccessible, causing a kernel panic in accept_memory().
This issue was observed when starting Intel TDX VMs with specific memory
sizes (e.g., > 64GB).
Fix this by calculating the end address first (including the unaligned
start) and then aligning it up, ensuring the entire range is covered
by the reservation.
Fixes: 8dbe33956d96 ("efi/unaccepted: Make sure unaccepted table is mapped") Reported-by: Moritz Sanft <ms@edgeless.systems> Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The 'struct efivar_operations' is not modified by the driver after
initialization, so it should follow typical practice of being static
const for increased code safety and readability.
Ard Biesheuvel [Tue, 17 Feb 2026 11:09:35 +0000 (12:09 +0100)]
x86/kexec: Copy ACPI root pointer address from config table
Dave reports that kexec may fail when the first kernel boots via the EFI
stub but without EFI runtime services, as in that case, the RSDP address
field in struct bootparams is never assigned. Kexec copies this value
into the version of struct bootparams that it provides to the incoming
kernel, which may have no other means to locate the ACPI root pointer.
So take the value from the EFI config tables if no root pointer has been
set in the first kernel's struct bootparams.
Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Cc: <stable@vger.kernel.org> # v6.1 Reported-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Link: https://lore.kernel.org/linux-efi/aZQg_tRQmdKNadCg@darkstar.users.ipa.redhat.com/ Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Juergen Gross [Sat, 14 Feb 2026 13:50:35 +0000 (14:50 +0100)]
x86/xen: Fix Xen PV guest boot
A recent patch moving the call of sparse_init() to common mm code
broke booting as a Xen PV guest.
Reason is that the Xen PV specific boot code relied on struct page area
being accessible rather early, but this changed by the move of the call
of sparse_init().
Fortunately the fix is rather easy: there is a static branch available
indicating whether struct page contents are usable by Xen. This static
branch just needs to be tested in some places for avoiding the access
of struct page.
As code paths that handle vmbus IRQs use sleepy locks under PREEMPT_RT,
the vmbus_isr execution needs to be moved into thread context. Open-
coding this allows to skip the IPI that irq_work would additionally
bring and which we do not need, being an IRQ, never an NMI.
This affects both x86 and arm64, therefore hook into the common driver
logic.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Florian Bezdeka <florian.bezdeka@siemens.com> Tested-by: Florian Bezdeka <florian.bezdeka@siemens.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Uros Bizjak [Fri, 21 Nov 2025 14:14:11 +0000 (15:14 +0100)]
x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn
Unlike CALL instruction, VMMCALL does not push to the stack, so it's
OK to allow the compiler to insert it before the frame pointer gets
set up by the containing function. ASM_CALL_CONSTRAINT is for CALLs
that must be inserted after the frame pointer is set up, so it is
over-constraining here and can be removed.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Uros Bizjak [Fri, 21 Nov 2025 14:14:10 +0000 (15:14 +0100)]
x86/hyperv: Use savesegment() instead of inline asm() to save segment registers
Use standard savesegment() utility macro to save segment registers.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Acked-by: Wei Liu <wei.liu@kernel.org> Tested-by: Michael Kelley <mhklinux@outlook.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Increasing the MTU beyond the HDS threshold causes the hardware to
fragment packets across multiple buffers. If a single-buffer XDP program
is attached, the driver will drop all multi-frag frames. While we can't
prevent a remote sender from sending non-TCP packets larger than the MTU,
this will prevent users from inadvertently breaking new TCP streams.
Traditionally, drivers supported XDP with MTU less than 4Kb
(packet per page). Fbnic currently prevents attaching XDP when MTU is too high.
But it does not prevent increasing MTU after XDP is attached.
Fixes: 1b0a3950dbd4 ("eth: fbnic: Add XDP pass, drop, abort support") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The range size can be 65536 when the requested range covers all possible
u16 queue IDs (e.g. queue_mapping=0 and queue_mapping_max=U16_MAX).
That value cannot be represented in a u16 and previously wrapped to 0,
so tcf_skbedit_hash() could trigger a divide-by-zero:
Add documentation for the vsock per-namespace sysctls (`ns_mode` and
`child_ns_mode`) to Documentation/admin-guide/sysctl/net.rst.
These sysctls were introduced by commit eafb64f40ca4 ("vsock: add
netns to vsock core").
Document the two namespace modes (`global` and `local`), the
inheritance behavior of `child_ns_mode`, and the restriction preventing
local namespaces from setting `child_ns_mode` to `global`.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20260216163147.236844-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Fri, 13 Feb 2026 14:25:57 +0000 (14:25 +0000)]
macvlan: observe an RCU grace period in macvlan_common_newlink() error path
valis reported that a race condition still happens after my prior patch.
macvlan_common_newlink() might have made @dev visible before
detecting an error, and its caller will directly call free_netdev(dev).
We must respect an RCU period, either in macvlan or the core networking
stack.
After adding a temporary mdelay(1000) in macvlan_forward_source_one()
to open the race window, valis repro was:
ip link add p1 type veth peer p2
ip link set address 00:00:00:00:00:20 dev p1
ip link set up dev p1
ip link set up dev p2
ip link add mv0 link p2 type macvlan mode source
(ip link add invalid% link p2 type macvlan mode source macaddr add
00:00:00:00:00:20 &) ; sleep 0.5 ; ping -c1 -I p1 1.2.3.4
PING 1.2.3.4 (1.2.3.4): 56 data bytes
RTNETLINK answers: Invalid argument
BUG: KASAN: slab-use-after-free in macvlan_forward_source
(drivers/net/macvlan.c:408 drivers/net/macvlan.c:444)
Read of size 8 at addr ffff888016bb89c0 by task e/175
Jakub Kicinski [Sat, 14 Feb 2026 03:51:59 +0000 (19:51 -0800)]
selftests: tc_actions: don't dump 2MB of \0 to stdout
Since we started running selftests in NIPA we have been seeing
tc_actions.sh generate a soft lockup warning on ~20% of the runs.
On the pre-netdev foundation setup it was actually a missed irq
splat from the console. Now it's either that or a lockup.
I initially suspected a socket locking issue since the test
is exercising local loopback with act_mirred.
After hours of staring at this I noticed in strace that ncat
when -o $file is specified _both_ saves the output to the file
and still prints it to stdout. Because the file being sent
is constructed with:
the data printed is all \0. Most terminals don't display nul
characters (and neither does vng output capture save them).
But QEMU's serial console still has to poke them thru which
is very slow and causes the lockup (if the file is >600kB).
Replace the '-o $file' with '> $file'. This speeds the test up
from 2m20s to 18s on debug kernels, and prevents the warnings.
ipv6: addrconf: reduce default temp_valid_lft to 2 days
This is a recommendation from RFC 8981 and it was intended to be changed
by commit 969c54646af0 ("ipv6: Implement draft-ietf-6man-rfc4941bis")
but it only changed the sysctl documentation.
Eric Dumazet [Mon, 16 Feb 2026 10:01:49 +0000 (10:01 +0000)]
ping: annotate data-races in ping_lookup()
isk->inet_num, isk->inet_rcv_saddr and sk->sk_bound_dev_if
are read locklessly in ping_lookup().
Add READ_ONCE()/WRITE_ONCE() annotations.
The race on isk->inet_rcv_saddr is probably coming from IPv6 support,
but does not deserve a specific backport.
Fixes: dbca1596bbb0 ("ping: convert to RCU lookups, get rid of rwlock") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260216100149.3319315-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ivan Vecera [Mon, 16 Feb 2026 19:40:07 +0000 (20:40 +0100)]
dpll: zl3073x: Fix ref frequency setting
The frequency for an input reference is computed as:
frequency = freq_base * freq_mult * freq_ratio_m / freq_ratio_n
Before commit 5bc02b190a3fb ("dpll: zl3073x: Cache all reference
properties in zl3073x_ref"), zl3073x_dpll_input_pin_frequency_set()
explicitly wrote 1 to both the REF_RATIO_M and REF_RATIO_N hardware
registers whenever a new frequency was set. This ensured the FEC ratio
was always reset to 1:1 alongside the new base/multiplier values.
The refactoring in that commit introduced zl3073x_ref_freq_set() to
update the cached ref state, but this helper only sets freq_base and
freq_mult without resetting freq_ratio_m and freq_ratio_n to 1. Because
zl3073x_ref_state_set() uses a compare-and-write strategy, unchanged
ratio fields are never written to the hardware. If the device previously
had non-unity FEC ratio values, they remain in effect after a frequency
change, resulting in an incorrect computed frequency.
Explicitly set freq_ratio_m and freq_ratio_n to 1 in zl3073x_ref_freq_set()
to restore the original behavior.
Fixes: 5bc02b190a3fb ("dpll: zl3073x: Cache all reference properties in zl3073x_ref") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260216194007.680416-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 16 Feb 2026 19:36:53 +0000 (19:36 +0000)]
net: do not delay zero-copy skbs in skb_attempt_defer_free()
After the blamed commit, TCP tx zero copy notifications could be
arbitrarily delayed and cause regressions in applications waiting
for them.
Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: e20dfbad8aab ("net: fix napi_consume_skb() with alien skbs") Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260216193653.627617-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arnd Bergmann [Mon, 16 Feb 2026 10:54:54 +0000 (11:54 +0100)]
net: psp: select CONFIG_SKB_EXTENSIONS
psp now uses skb extensions, failing to build when that is disabled:
In file included from include/net/psp.h:7,
from net/psp/psp_sock.c:9:
include/net/psp/functions.h: In function '__psp_skb_coalesce_diff':
include/net/psp/functions.h:60:13: error: implicit declaration of function 'skb_ext_find'; did you mean 'skb_ext_copy'? [-Wimplicit-function-declaration]
60 | a = skb_ext_find(one, SKB_EXT_PSP);
| ^~~~~~~~~~~~
| skb_ext_copy
include/net/psp/functions.h:60:31: error: 'SKB_EXT_PSP' undeclared (first use in this function)
60 | a = skb_ext_find(one, SKB_EXT_PSP);
| ^~~~~~~~~~~
include/net/psp/functions.h:60:31: note: each undeclared identifier is reported only once for each function it appears in
include/net/psp/functions.h: In function '__psp_sk_rx_policy_check':
include/net/psp/functions.h:94:53: error: 'SKB_EXT_PSP' undeclared (first use in this function)
94 | struct psp_skb_ext *pse = skb_ext_find(skb, SKB_EXT_PSP);
| ^~~~~~~~~~~
net/psp/psp_sock.c: In function 'psp_sock_recv_queue_check':
net/psp/psp_sock.c:164:41: error: 'SKB_EXT_PSP' undeclared (first use in this function)
164 | pse = skb_ext_find(skb, SKB_EXT_PSP);
| ^~~~~~~~~~~
Select the Kconfig symbol as we do from its other users.
Fixes: 6b46ca260e22 ("net: psp: add socket security association code") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20260216105500.2382181-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 17 Feb 2026 19:41:50 +0000 (11:41 -0800)]
bpftool: Fix truncated netlink dumps
Netlink requires that the recv buffer used during dumps is at least
min(PAGE_SIZE, 8k) (see the man page). Otherwise the messages will
get truncated. Make sure bpftool follows this requirement, avoid
missing information on systems with large pages.
Linus Torvalds [Tue, 17 Feb 2026 23:37:06 +0000 (15:37 -0800)]
Merge tag 'ntfs3_for_7.0' of https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
"New code:
- improve readahead for bitmap initialization and large directory scans
- fsync files by syncing parent inodes
- drop of preallocated clusters for sparse and compressed files
- zero-fill folios beyond i_valid in ntfs_read_folio()
- implement llseek SEEK_DATA/SEEK_HOLE by scanning data runs
- implement iomap-based file operations
- allow explicit boolean acl/prealloc mount options
- fall-through between switch labels
- delayed-allocation (delalloc) support
Fixes:
- check return value of indx_find to avoid infinite loop
- initialize new folios before use
- infinite loop in attr_load_runs_range on inconsistent metadata
- infinite loop triggered by zero-sized ATTR_LIST
- ntfs_mount_options leak in ntfs_fill_super()
- deadlock in ni_read_folio_cmpr
- circular locking dependency in run_unpack_ex
- prevent infinite loops caused by the next valid being the same
- restore NULL folio initialization in ntfs_writepages()
- slab-out-of-bounds read in DeleteIndexEntryRoot
Updates:
- allow readdir() to finish after directory mutations without rewinddir()
- handle attr_set_size() errors when truncating files
- make ntfs_writeback_ops static
- refactor duplicate kmemdup pattern in do_action()
- avoid calling run_get_entry() when run == NULL in ntfs_read_run_nb_ra()
Replaced:
- use wait_on_buffer() directly
- rename ni_readpage_cmpr into ni_read_folio_cmpr"
* tag 'ntfs3_for_7.0' of https://github.com/Paragon-Software-Group/linux-ntfs3: (26 commits)
fs/ntfs3: add delayed-allocation (delalloc) support
fs/ntfs3: avoid calling run_get_entry() when run == NULL in ntfs_read_run_nb_ra()
fs/ntfs3: add fall-through between switch labels
fs/ntfs3: allow explicit boolean acl/prealloc mount options
fs/ntfs3: Fix slab-out-of-bounds read in DeleteIndexEntryRoot
ntfs3: Restore NULL folio initialization in ntfs_writepages()
ntfs3: Refactor duplicate kmemdup pattern in do_action()
fs/ntfs3: prevent infinite loops caused by the next valid being the same
fs/ntfs3: make ntfs_writeback_ops static
ntfs3: fix circular locking dependency in run_unpack_ex
fs/ntfs3: implement iomap-based file operations
fs/ntfs3: fix deadlock in ni_read_folio_cmpr
fs/ntfs3: implement llseek SEEK_DATA/SEEK_HOLE by scanning data runs
fs/ntfs3: zero-fill folios beyond i_valid in ntfs_read_folio()
fs/ntfs3: handle attr_set_size() errors when truncating files
fs/ntfs3: drop preallocated clusters for sparse and compressed files
fs/ntfs3: fsync files by syncing parent inodes
fs/ntfs3: fix ntfs_mount_options leak in ntfs_fill_super()
fs/ntfs3: allow readdir() to finish after directory mutations without rewinddir()
fs/ntfs3: improve readahead for bitmap initialization and large directory scans
...
Linus Torvalds [Tue, 17 Feb 2026 23:18:51 +0000 (15:18 -0800)]
Merge tag 'ceph-for-7.0-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"This adds support for the upcoming aes256k key type in CephX that is
based on Kerberos 5 and brings a bunch of assorted CephFS fixes from
Ethan and Sam. One of Sam's patches in particular undoes a change in
the fscrypt area that had an inadvertent side effect of making CephFS
behave as if mounted with wsize=4096 and leading to the corresponding
degradation in performance, especially for sequential writes"
* tag 'ceph-for-7.0-rc1' of https://github.com/ceph/ceph-client:
ceph: assert loop invariants in ceph_writepages_start()
ceph: remove error return from ceph_process_folio_batch()
ceph: fix write storm on fscrypted files
ceph: do not propagate page array emplacement errors as batch errors
ceph: supply snapshot context in ceph_uninline_data()
ceph: supply snapshot context in ceph_zero_partial_object()
libceph: adapt ceph_x_challenge_blob hashing and msgr1 message signing
libceph: add support for CEPH_CRYPTO_AES256KRB5
libceph: introduce ceph_crypto_key_prepare()
libceph: generalize ceph_x_encrypt_offset() and ceph_x_encrypt_buflen()
libceph: define and enforce CEPH_MAX_KEY_LEN
Linus Torvalds [Tue, 17 Feb 2026 23:08:24 +0000 (15:08 -0800)]
Merge tag 'ovl-update-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
Pull overlayfs update from Amir Goldstein:
"Relax the semantics of uuid=off to cater to a use case of overlayfs
lower layers on btrfs clones, whose UUID are ephemeral and an upper
layer on a different filesystem"
* tag 'ovl-update-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: relax requirement for uuid=off,index=on
Linus Torvalds [Tue, 17 Feb 2026 23:02:49 +0000 (15:02 -0800)]
Merge tag 'v7.0-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix three potential double free vulnerabilities
- Fix data corruption due to racy lease checks
- Enforce SMB1 signing verification checks
- Fix invalid mount option parsing
- Remove unneeded tracepoint
- Various minor error code corrections
- Minor cleanup
* tag 'v7.0-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: terminate session upon failed client required signing
cifs: some missing initializations on replay
cifs: remove unnecessary tracing after put tcon
cifs: update internal module version number
smb: client: fix data corruption due to racy lease checks
smb/client: move NT_STATUS_MORE_ENTRIES
smb/client: rename to NT_ERROR_INVALID_DATATYPE
smb/client: rename to NT_STATUS_SOME_NOT_MAPPED
smb/client: map NT_STATUS_PRIVILEGE_NOT_HELD
smb/client: map NT_STATUS_MORE_PROCESSING_REQUIRED
smb/client: map NT_STATUS_BUFFER_OVERFLOW
smb/client: map NT_STATUS_NOTIFY_ENUM_DIR
cifs: SMB1 split: Remove duplicate include of cifs_debug.h
smb: client: fix regression with mount options parsing
====================
libbpf: Fix perm errors for LDIMM_64_FULL_RANGE_OFF
Commit 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
adds a feature flag for testing whether the running kernel supports LDIMM64
instructions with large direct offsets. Fix two edge cases that can
cause unexpected -EPERM errors in two ways:
1) The probe program used for the feature has type TRACEPOINT, but it's
possible the caller does not have permission to load it, even if it is
able to do so for generic BPF programs. Use the SOCKET_FILTER type
instead that requires fewer permissions. This does not affect the check
itself, which will always fail verification anyway.
2) The probe is triggered during bpf_object__collect_relos(), itself
called in bpf_object_open(), to compute the arena relocation offsets of
arena variables. However, the caller may not have permissions to load
BPF programs. This is the case in some systems with the bpftool calls made
by the BPF selftests during compilation, e.g., for skeleton generation.
Move all uses of the feature check to bpf_object_prepare() time instead.
Fixes: 728ff167910e ("libbpf: Add gating for arena globals relocation feature") Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
v2 -> v3: https://lore.kernel.org/bpf/20260214021014.15670-1-emil@etsalapatis.com/
- Only zero out the first byte of the log buffer (Andrii)
- Minimize invocations of the feature gate (Andrii)
- Adjust the hash of the original commit post-tree rebase
- Ensure close() is not called on invalid prog_fd in feature probe (Coverity)
====================
Emil Tsalapatis [Tue, 17 Feb 2026 20:43:45 +0000 (15:43 -0500)]
libbpf: Delay feature gate check until object prepare time
Commit 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
adds a feature gate check that loads a map and BPF program to
test the running kernel supports large direct offsets for LDIMM64
instructions. This check is currently used to calculate arena symbol
offsets during bpf_object__collect_relos, itself called by
bpf_object_open.
However, the program calling bpf_object_open may not have the permissions to
load maps and programs. This is the case with the BPF selftests, where
bpftool is invoked at compilation time during skeleton generation. This
causes errors as the feature gate unexpectedly fails with -EPERM.
Avoid this by moving all the use of the FEAT_LDIMM64_FULL_RANGE_OFF feature gate
to BPF object preparation time instead.
Emil Tsalapatis [Tue, 17 Feb 2026 20:43:44 +0000 (15:43 -0500)]
libbpf: Do not use PROG_TYPE_TRACEPOINT program for feature gating
Commit 728ff167910e uses a PROG_TYPE_TRACEPOINT BPF test program to
check whether the running kernel supports large LDIMM64 offsets. The
feature gate incorrectly assumes that the program will fail at
verification time with one of two messages, depending on whether the
feature is supported by the running kernel. However,
PROG_TYPE_TRACEPOINT programs may fail to load before verification even
starts, e.g., if the shell does not have the appropriate capabilities.
Use a BPF_PROG_TYPE_SOCKET_FILTER program for the feature gate instead.
Also fix two minor issues. First, ensure the log buffer for the test is
initialized: Failing program load before verification led to libbpf dumping
uninitialized data to stdout. Also, ensure that close() is only called
for program_fd in the probe if the program load actually succeeded. The
call was currently failing silently with -EBADF most of the time.
Linus Torvalds [Tue, 17 Feb 2026 19:47:17 +0000 (11:47 -0800)]
Merge tag 'dmaengine-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"Core:
- Add Frank Li as susbstem reviewer to help with reviews
New Support:
- Mediatek support for Dimensity 6300 and 9200 controller
- Qualcomm Kaanapali and Glymur GPI DMA engine
- Synopsis DW AXI Agilex5
- Renesas RZ/V2N SoC
- Atmel microchip lan9691-dma
- Tegra ADMA tegra264
Updates:
- sg_nents_for_dma() helper use in subsystem
- pm_runtime_mark_last_busy() redundant call update for subsystem
- Residue support for xilinx AXIDMA driver
- Intel Max SGL Size Support and capabilities for DSA3.0
- AXI dma larger than 32bits address support"
* tag 'dmaengine-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (64 commits)
dmaengine: add Frank Li as reviewer
dt-bindings: dma: qcom,gpi: Update max interrupts lines to 16
dmaengine: fsl-edma: don't explicitly disable clocks in .remove()
dmaengine: xilinx: xdma: use sg_nents_for_dma() helper
dmaengine: sh: use sg_nents_for_dma() helper
dmaengine: sa11x0: use sg_nents_for_dma() helper
dmaengine: qcom: bam_dma: use sg_nents_for_dma() helper
dmaengine: qcom: adm: use sg_nents_for_dma() helper
dmaengine: pxa-dma: use sg_nents_for_dma() helper
dmaengine: lgm: use sg_nents_for_dma() helper
dmaengine: k3dma: use sg_nents_for_dma() helper
dmaengine: dw-axi-dmac: use sg_nents_for_dma() helper
dmaengine: bcm2835-dma: use sg_nents_for_dma() helper
dmaengine: axi-dmac: use sg_nents_for_dma() helper
dmaengine: altera-msgdma: use sg_nents_for_dma() helper
scatterlist: introduce sg_nents_for_dma() helper
dmaengine: idxd: Add Max SGL Size Support for DSA3.0
dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
dmaengine: sh: rz-dmac: Make channel irq local
dmaengine: pl08x: Fix comment stating the difference between PL080 and PL081
...
Linus Torvalds [Tue, 17 Feb 2026 19:40:04 +0000 (11:40 -0800)]
Merge tag 'phy-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"Core:
- Add suuport for "rx-polarity" and "tx-polarity" device tree
properties and phy common properties to manage this
New Support:
- Qualcomm Glymur PCIe Gen4 2-lanes PCIe phy, DP and edp phy, USB UNI
PHY and SMB2370 eUSB2 repeater. SC8280xp QMP UFS PHY, Kaanapali
PCIe phy and QMP PHY, QCS615 QMP USB3+DP PHY and driver support for
that.
- SpacemiT PCIe/combo PHY and K1 USB2 PHY driver.
- HDMI 2.1 FRL configuration support and driver enabling for rockchip
samsung-hdptx driver
- TI TCAN1046 phy
- Renesas RZ/V2H(P) and RZ/V2N usb3
- Mediatek MT8188 hdmi-phy
- Google Tensor SoC USB PHY driver
- Apple Type-C PHY
Updates:
- Subsystem conversion for clock round_rate() to determine_rate()
- TI USB3 DT schema conversion
- Samsung ExynosAutov920 usb3, combo hsphy and ssphy support"
* tag 'phy-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (143 commits)
phy: ti: phy-j721e-wiz: convert from divider_round_rate() to divider_determine_rate()
dt-bindings: phy: ti,control-phy-otghs: convert to DT schema
dt-bindings: phy: ti,phy-usb3: convert to DT schema
phy: tegra: xusb: Remove unused powered_on variable
phy: renesas: rcar-gen3-usb2: add regulator dependency
phy: GOOGLE_USB: add TYPEC dependency
phy: enter drivers/phy/Makefile even without CONFIG_GENERIC_PHY
phy: renesas: rcar-gen3-usb2: Use mux-state for phyrst management
phy: renesas: rcar-gen3-usb2: Add regulator for OTG VBUS control
phy: renesas: rcar-gen3-usb2: Use devm_pm_runtime_enable()
phy: renesas: rcar-gen3-usb2: Factor out VBUS control logic
dt-bindings: phy: renesas,usb2-phy: Document RZ/G3E SoC
dt-bindings: phy: renesas,usb2-phy: Document mux-states property
dt-bindings: phy: renesas,usb2-phy: Document USB VBUS regulator
phy: rockchip: samsung-hdptx: Add HDMI 2.1 FRL support
phy: rockchip: samsung-hdptx: Extend rk_hdptx_phy_verify_hdmi_config() helper
phy: rockchip: samsung-hdptx: Switch to driver specific HDMI config
phy: rockchip: samsung-hdptx: Drop hw_rate driver data
phy: rockchip: samsung-hdptx: Compute clk rate from PLL config
phy: rockchip: samsung-hdptx: Cleanup *_cmn_init_seq lists
...
Linus Torvalds [Tue, 17 Feb 2026 18:07:13 +0000 (10:07 -0800)]
Merge tag 'soundwire-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul:
- support for Qualcomm v2.2.0 controllers
- bus method updates for .probe(), .remove() and .shutdown()
and remove function return value updates
- Avell B.ON dmi-quirks mapping
- mark cs42l45 codec as wake capable
* tag 'soundwire-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: intel_ace2x: add SND_HDA_CORE dependency
dt-bindings: soundwire: qcom: Add SoundWire v2.2.0 compatible
soundwire: Use bus methods for .probe(), .remove() and .shutdown()
soundwire: Make remove function return no value
soundwire: dmi-quirks: add mapping for Avell B.ON (OEM rebranded of NUC15)
soundwire: qcom: Use guard to avoid mixing cleanup and goto
soundwire: intel_auxdevice: add cs42l45 codec to wake_capable_list
Linus Torvalds [Tue, 17 Feb 2026 17:36:43 +0000 (09:36 -0800)]
Merge tag 'usb-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the "big" set of USB and Thunderbolt driver updates for
7.0-rc1. Overall more lines were removed than added, thanks to
dropping the obsolete isp1362 USB host controller driver, always a
nice change.
Other than that, nothing major happening here, highlights are:
- lots of dwc3 driver updates and new hardware support added
- usb gadget function driver updates
- usb phy driver updates
- typec driver updates and additions
- USB rust binding updates for syntax and formatting changes
- more usb serial device ids added
- other smaller USB core and driver updates and additions
All of these have been in linux-next for a long time, with no reported
problems"
* tag 'usb-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (77 commits)
usb: typec: ucsi: Add Thunderbolt alternate mode support
usb: typec: hd3ss3220: Check if regulator needs to be switched
usb: phy: tegra: parametrize PORTSC1 register offset
usb: phy: tegra: parametrize HSIC PTS value
usb: phy: tegra: return error value from utmi_wait_register
usb: phy: tegra: cosmetic fixes
dt-bindings: usb: renesas,usbhs: Add RZ/G3E SoC support
usb: dwc2: fix resume failure if dr_mode is host
usb: cdns3: fix role switching during resume
usb: dwc3: gadget: Move vbus draw to workqueue context
USB: serial: option: add Telit FN920C04 RNDIS compositions
usb: dwc3: Log dwc3 address in traces
usb: gadget: tegra-xudc: Add handling for BLCG_COREPLL_PWRDN
usb: phy: tegra: add HSIC support
usb: phy: tegra: use phy type directly
usb: typec: ucsi: Enforce mode selection for cros_ec_ucsi
usb: typec: ucsi: Support mode selection to activate altmodes
usb: typec: Introduce mode_selection bit
usb: typec: Implement mode selection
usb: typec: Expose alternate mode priority via sysfs
...
Linus Torvalds [Tue, 17 Feb 2026 17:30:52 +0000 (09:30 -0800)]
Merge tag 'tty-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver updates from Greg KH:
"Here is the small amount of tty and serial driver updates for 7.0-rc1.
Nothing major in here at all, just some driver updates and minor
tweaks and cleanups including:
- sh-sci serial driver updates
- 8250 driver updates
- attempt to make the tty ports have their own workqueue, but was
reverted after testing found it to have problems on some platforms.
This will probably come back for 7.1 after it has been reworked and
resubmitted
- other tiny tty driver changes
All of these have been in linux-next for a while with no reported
problems"
* tag 'tty-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (49 commits)
Revert "tty: tty_port: add workqueue to flip TTY buffer"
tty: tty_port: add workqueue to flip TTY buffer
serial: 8250_pci: Remove custom deprecated baud setting routine
serial: 8250_omap: Remove custom deprecated baud setting routine
dt-bindings: serial: renesas,scif: Document RZ/G3L SoC
serial: 8250: omap: set out-of-band wakeup if wakeup pinctrl exists
tty: hvc-iucv: Remove KMSG_COMPONENT macro
dt-bindings: serial: google,goldfish-tty: Convert to DT schema
dt-bindings: serial: sh-sci: Fold single-entry compatibles into enum
serial: 8250: 8250_omap.c: Clear DMA RX running status only after DMA termination is done
serial: 8250: 8250_omap.c: Add support for handling UART error conditions
serial: SH_SCI: improve "DMA support" prompt
serial: Kconfig: fix ordering of entries for menu display
serial: 8250: fix ordering of entries for menu display
serial: imx: change SERIAL_IMX_CONSOLE to bool
8250_men_mcb: drop unneeded MODULE_ALIAS
serial: men_z135_uart: drop unneeded MODULE_ALIAS
dt-bindings: serial: renesas,rsci: Document RZ/V2H(P) and RZ/V2N SoCs
serial: rsci: Convert to FIELD_MODIFY()
dt-bindings: serial: 8250: add SpacemiT K3 UART compatible
...