]> git.ipfire.org Git - thirdparty/kernel/stable.git/log
thirdparty/kernel/stable.git
3 weeks agoKVM: arm64: Replace fault_is_perm with a helper
Marc Zyngier [Mon, 9 Mar 2026 09:59:38 +0000 (09:59 +0000)] 
KVM: arm64: Replace fault_is_perm with a helper

Carrying a boolean to indicate that a given fault is a permission fault
is slightly odd, as this is a property of the fault itself, and we'd
better avoid duplicating state.

For this purpose, introduce a kvm_s2_fault_is_perm() predicate that
can take a fault descriptor as a parameter. fault_is_perm is therefore
dropped from kvm_s2_fault.

Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Move fault context to const structure
Marc Zyngier [Sun, 8 Mar 2026 15:03:49 +0000 (15:03 +0000)] 
KVM: arm64: Move fault context to const structure

In order to make it clearer what gets updated or not during fault
handling, move a set of information that losely represents the
fault context.

This gets populated early, from handle_mem_abort(), and gets passed
along as a const pointer. user_mem_abort()'s signature is majorly
improved in doing so, and kvm_s2_fault loses a bunch of fields.

gmem_abort() will get a similar treatment down the line.

Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Make fault_ipa immutable
Marc Zyngier [Sun, 8 Mar 2026 12:18:29 +0000 (12:18 +0000)] 
KVM: arm64: Make fault_ipa immutable

Updating fault_ipa is conceptually annoying, as it changes something
that is a property of the fault itself.

Stop doing so and instead use fault->gfn as the sole piece of state
that can be used to represent the faulting IPA.

At the same time, introduce get_canonical_gfn() for the couple of case
we're we are concerned with the memslot-related IPA and not the faulting
one.

Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Kill fault->ipa
Marc Zyngier [Sat, 7 Mar 2026 11:39:49 +0000 (11:39 +0000)] 
KVM: arm64: Kill fault->ipa

fault->ipa, in a nested contest, represents the output of the guest's
S2 translation for the fault->fault_ipa input, and is equal to
fault->fault_ipa otherwise,

Given that this is readily available from kvm_s2_trans_output(),
drop fault->ipa and directly compute fault->gfn instead, which
is really what we want.

Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Clean up control flow in kvm_s2_fault_map()
Fuad Tabba [Fri, 6 Mar 2026 14:02:32 +0000 (14:02 +0000)] 
KVM: arm64: Clean up control flow in kvm_s2_fault_map()

Clean up the KVM MMU lock retry loop by pre-assigning the error code.
Add clear braces to the THP adjustment integration for readability, and
safely unnest the transparent hugepage logic branches.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Hoist MTE validation check out of MMU lock path
Fuad Tabba [Fri, 6 Mar 2026 14:02:31 +0000 (14:02 +0000)] 
KVM: arm64: Hoist MTE validation check out of MMU lock path

Simplify the non-cacheable attributes assignment by using a ternary
operator. Additionally, hoist the MTE validation check (mte_allowed) out
of kvm_s2_fault_map() and into kvm_s2_fault_compute_prot(). This allows
us to fail faster and avoid acquiring the KVM MMU lock unnecessarily
when the VMM introduces a disallowed VMA for an MTE-enabled guest.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Optimize early exit checks in kvm_s2_fault_pin_pfn()
Fuad Tabba [Fri, 6 Mar 2026 14:02:30 +0000 (14:02 +0000)] 
KVM: arm64: Optimize early exit checks in kvm_s2_fault_pin_pfn()

Optimize the early exit checks in kvm_s2_fault_pin_pfn by grouping all
error responses under the generic is_error_noslot_pfn check first,
avoiding unnecessary branches in the hot path.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Initialize struct kvm_s2_fault completely at declaration
Fuad Tabba [Fri, 6 Mar 2026 14:02:29 +0000 (14:02 +0000)] 
KVM: arm64: Initialize struct kvm_s2_fault completely at declaration

Simplify the initialization of struct kvm_s2_fault in user_mem_abort().

Instead of partially initializing the struct via designated initializers
and then sequentially assigning the remaining fields (like write_fault
and topup_memcache) further down the function, evaluate those
dependencies upfront.

This allows the entire struct to be fully initialized at declaration. It
also eliminates the need for the intermediate fault_data variable and
its associated fault pointer, reducing boilerplate.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Simplify return logic in user_mem_abort()
Fuad Tabba [Fri, 6 Mar 2026 14:02:28 +0000 (14:02 +0000)] 
KVM: arm64: Simplify return logic in user_mem_abort()

With the refactoring done, the final return block of user_mem_abort()
can be tidied up a bit more.

Clean up the trailing edge by dropping the unnecessary assignment,
collapsing the return evaluation for kvm_s2_fault_compute_prot(), and
tail calling kvm_s2_fault_map() directly.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Remove redundant state variables from struct kvm_s2_fault
Fuad Tabba [Fri, 6 Mar 2026 14:02:27 +0000 (14:02 +0000)] 
KVM: arm64: Remove redundant state variables from struct kvm_s2_fault

Remove redundant variables vma_shift and vfio_allow_any_uc from struct
kvm_s2_fault as they are easily derived or checked when needed.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Simplify nested VMA shift calculation
Fuad Tabba [Fri, 6 Mar 2026 14:02:26 +0000 (14:02 +0000)] 
KVM: arm64: Simplify nested VMA shift calculation

In the kvm_s2_resolve_vma_size() helper, the local variable vma_pagesize
is calculated from vma_shift, only to be used to bound the vma_pagesize
by max_map_size and subsequently convert it back to a shift via __ffs().

Because vma_pagesize and max_map_size are both powers of two, we can
simplify the logic by omitting vma_pagesize entirely and bounding the
vma_shift directly using the shift of max_map_size. This achieves the
same result while keeping the size-to-shift conversion out of the helper
logic.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Extract page table mapping in user_mem_abort()
Fuad Tabba [Fri, 6 Mar 2026 14:02:25 +0000 (14:02 +0000)] 
KVM: arm64: Extract page table mapping in user_mem_abort()

Extract the code responsible for locking the KVM MMU and mapping the PFN
into the stage-2 page tables into a new helper, kvm_s2_fault_map().

This helper manages the kvm_fault_lock, checks for MMU invalidation
retries, attempts to adjust for transparent huge pages (THP), handles
MTE sanitization if needed, and finally maps or relaxes permissions on
the stage-2 entries.

With this change, the main user_mem_abort() function is now a sequential
dispatcher that delegates to specialized helper functions.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Extract stage-2 permission logic in user_mem_abort()
Fuad Tabba [Fri, 6 Mar 2026 14:02:24 +0000 (14:02 +0000)] 
KVM: arm64: Extract stage-2 permission logic in user_mem_abort()

Extract the logic that computes the stage-2 protections and checks for
various permission faults (e.g., execution faults on non-cacheable
memory) into a new helper function, kvm_s2_fault_compute_prot(). This
helper also handles injecting atomic/exclusive faults back into the
guest when necessary.

This refactoring step separates the permission computation from the
mapping logic, making the main fault handler flow clearer.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Isolate mmap_read_lock inside new kvm_s2_fault_get_vma_info() helper
Fuad Tabba [Fri, 6 Mar 2026 14:02:23 +0000 (14:02 +0000)] 
KVM: arm64: Isolate mmap_read_lock inside new kvm_s2_fault_get_vma_info() helper

Extract the VMA lookup and metadata snapshotting logic from
kvm_s2_fault_pin_pfn() into a tightly-scoped sub-helper.

This refactoring structurally fixes a TOCTOU (Time-Of-Check to
Time-Of-Use) vulnerability and Use-After-Free risk involving the vma
pointer. In the previous layout, the mmap_read_lock is taken, the vma is
looked up, and then the lock is dropped before the function continues to
map the PFN. While an explicit vma = NULL safeguard was present, the vma
variable was still lexically in scope for the remainder of the function.

By isolating the locked region into kvm_s2_fault_get_vma_info(), the vma
pointer becomes a local variable strictly confined to that sub-helper.
Because the pointer's scope literally ends when the sub-helper returns,
it is not possible for the subsequent page fault logic in
kvm_s2_fault_pin_pfn() to accidentally access the vanished VMA,
eliminating this bug class by design.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Extract PFN resolution in user_mem_abort()
Fuad Tabba [Fri, 6 Mar 2026 14:02:22 +0000 (14:02 +0000)] 
KVM: arm64: Extract PFN resolution in user_mem_abort()

Extract the section of code responsible for pinning the physical page
frame number (PFN) backing the faulting IPA into a new helper,
kvm_s2_fault_pin_pfn().

This helper encapsulates the critical section where the mmap_read_lock
is held, the VMA is looked up, the mmu invalidate sequence is sampled,
and the PFN is ultimately resolved via __kvm_faultin_pfn(). It also
handles the early exits for hardware poisoned pages and noslot PFNs.

By isolating this region, we can begin to organize the state variables
required for PFN resolution into the kvm_s2_fault struct, clearing out
a significant amount of local variable clutter from user_mem_abort().

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Introduce struct kvm_s2_fault to user_mem_abort()
Fuad Tabba [Fri, 6 Mar 2026 14:02:21 +0000 (14:02 +0000)] 
KVM: arm64: Introduce struct kvm_s2_fault to user_mem_abort()

The user_mem_abort() function takes many arguments and defines a large
number of local variables. Passing all these variables around to helper
functions would result in functions with too many arguments.

Introduce struct kvm_s2_fault to encapsulate the stage-2 fault state.
This structure holds both the input parameters and the intermediate
state required during the fault handling process.

Update user_mem_abort() to initialize this structure and replace the
usage of local variables with fields from the new structure.

This prepares the ground for further extracting parts of
user_mem_abort() into smaller helper functions that can simply take a
pointer to the fault state structure.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: ptdump: Initialize parser_state before pgtable walk
Zenghui Yu (Huawei) [Sat, 28 Mar 2026 05:31:55 +0000 (13:31 +0800)] 
KVM: arm64: ptdump: Initialize parser_state before pgtable walk

If we go through the "need a bigger buffer" path in seq_read_iter(), which
is likely to happen as we're dumping page tables, we will pass the
populated-by-last-run st::parser_state to
kvm_pgtable_walk()/kvm_ptdump_visitor(). As a result, the output of
stage2_page_tables on my box looks like

0x0000000240000000-0x0000000000000000   17179869175G 1
0x0000000000000000-0x0000000000200000           2M 2   R   px ux  AF BLK
0x0000000000200000-0x0000000040000000        1022M 2
0x0000000040000000-0x0000000040200000           2M 2   R W PXNUXN AF BLK
[...]

Fix it by always initializing st::parser_state before starting a new
pgtable walk.

Besides, remove st::range as it's not used by note_page(); remove the
explicit initialization of parser_state::start_address as it will be
initialized in note_page() anyway.

Signed-off-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev>
Link: https://patch.msgid.link/20260328053155.12219-1-zenghui.yu@linux.dev
[maz: rebased on top of NV support]
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoALSA: core: Validate compress device numbers without dynamic minors
Cássio Gabriel [Wed, 25 Mar 2026 05:24:04 +0000 (02:24 -0300)] 
ALSA: core: Validate compress device numbers without dynamic minors

Without CONFIG_SND_DYNAMIC_MINORS, ALSA reserves only two fixed minors
for compress devices on each card: comprD0 and comprD1.

snd_find_free_minor() currently computes the compress minor as
type + dev without validating dev first, so device numbers greater than
1 spill into the HWDEP minor range instead of failing registration.

ASoC passes rtd->id to snd_compress_new(), so this can happen on real
non-dynamic-minor builds.

Add a dedicated fixed-minor check for SNDRV_DEVICE_TYPE_COMPRESS in
snd_find_free_minor() and reject out-of-range device numbers with
-EINVAL before constructing the minor.

Also remove the stale TODO in compress_offload.c that still claims
multiple compress nodes are missing.

Fixes: 3eafc959b32f ("ALSA: core: add support for compressed devices")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260325-alsa-compress-static-minors-v1-1-0628573bee1c@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agoALSA: hda/realtek: add quirk for HP Victus 15-fb0xxx
Sourav Nayak [Fri, 27 Mar 2026 14:28:05 +0000 (19:58 +0530)] 
ALSA: hda/realtek: add quirk for HP Victus 15-fb0xxx

This adds a mute led quirck for HP Victus 15-fb0xxx (103c:8a3d) model

- As it used 0x8(full bright)/0x7f(little dim) for mute led on and other
  values as 0ff (0x0, 0x4, ...)

- So, use ALC245_FIXUP_HP_MUTE_LED_V2_COEFBIT insted for safer approach

Cc: <stable@vger.kernel.org>
Signed-off-by: Sourav Nayak <nonameblank007@gmail.com>
Link: https://patch.msgid.link/20260327142805.17139-1-nonameblank007@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agoALSA: hda/intel: Add MSI X870E Tomahawk to denylist by DMI ID
Stuart Hayhurst [Fri, 27 Mar 2026 15:57:36 +0000 (15:57 +0000)] 
ALSA: hda/intel: Add MSI X870E Tomahawk to denylist by DMI ID

This motherboard uses USB audio instead, causing this driver to complain
about "no codecs found!".
Add it to the denylist to silence the warning.

The first attempt only matched on the PCI device, but this caused issues
for some laptops, so DMI match against the board as well.

Signed-off-by: Stuart Hayhurst <stuart.a.hayhurst@gmail.com>
Link: https://patch.msgid.link/20260327155737.21818-2-stuart.a.hayhurst@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agoALSA: hda/realtek: add quirk for Framework F111:000F
Dustin L. Howett [Fri, 27 Mar 2026 15:54:40 +0000 (10:54 -0500)] 
ALSA: hda/realtek: add quirk for Framework F111:000F

Similar to commit 7b509910b3ad ("ALSA hda/realtek: Add quirk for
Framework F111:000C") and previous quirks for Framework systems with
Realtek codecs.

000F is another new platform with an ALC285 which needs the same quirk.

Signed-off-by: Dustin L. Howett <dustin@howett.net>
Link: https://patch.msgid.link/20260327-framework-alsa-000f-v1-1-74013aba1c00@howett.net
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agoALSA: usb-audio: Fix quirk flags for NeuralDSP Quad Cortex
Phil Willoughby [Sat, 28 Mar 2026 08:07:34 +0000 (08:07 +0000)] 
ALSA: usb-audio: Fix quirk flags for NeuralDSP Quad Cortex

The NeuralDSP Quad Cortex does not support DSD playback. We need
this product-specific entry with zero quirks because otherwise it
falls through to the vendor-specific entry which marks it as
supporting DSD playback.

Cc: Yue Wang <yuleopen@gmail.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Signed-off-by: Phil Willoughby <willerz@gmail.com>
Link: https://patch.msgid.link/20260328080921.3310-1-willerz@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agofork: zero vmap stack using clear_pages() instead of memset()
Linus Walleij [Tue, 24 Feb 2026 10:26:32 +0000 (11:26 +0100)] 
fork: zero vmap stack using clear_pages() instead of memset()

After the introduction of clear_pages() we exploit the fact that the
process vm_area is allocated in contiguous pages to just clear them all in
one swift operation.

Link: https://lkml.kernel.org/r/20260224-mm-fork-clear-pages-v1-1-184c65a72d49@kernel.org
Signed-off-by: Linus Walleij <linusw@kernel.org>
Suggested-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/linux-mm/dpnwsp7dl4535rd7qmszanw6u5an2p74uxfex4dh53frpb7pu3@2bnjjavjrepe/
Suggested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/20240311164638.2015063-7-pasha.tatashin@soleen.com
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Ben Segall <bsegall@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoocfs2/dlm: fix off-by-one in dlm_match_regions() region comparison
Junrui Luo [Sat, 7 Mar 2026 07:21:09 +0000 (15:21 +0800)] 
ocfs2/dlm: fix off-by-one in dlm_match_regions() region comparison

The local-vs-remote region comparison loop uses '<=' instead of '<',
causing it to read one entry past the valid range of qr_regions.  The
other loops in the same function correctly use '<'.

Fix the loop condition to use '<' for consistency and correctness.

Link: https://lkml.kernel.org/r/SYBPR01MB78813DA26B50EC5E01F00566AF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Fixes: ea2034416b54 ("ocfs2/dlm: Add message DLM_QUERY_REGION")
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoocfs2/dlm: validate qr_numregions in dlm_match_regions()
Junrui Luo [Sat, 7 Mar 2026 07:21:08 +0000 (15:21 +0800)] 
ocfs2/dlm: validate qr_numregions in dlm_match_regions()

Patch series "ocfs2/dlm: fix two bugs in dlm_match_regions()".

In dlm_match_regions(), the qr_numregions field from a DLM_QUERY_REGION
network message is used to drive loops over the qr_regions buffer without
sufficient validation.  This series fixes two issues:

- Patch 1 adds a bounds check to reject messages where qr_numregions
  exceeds O2NM_MAX_REGIONS. The o2net layer only validates message
  byte length; it does not constrain field values, so a crafted message
  can set qr_numregions up to 255 and trigger out-of-bounds reads past
  the 1024-byte qr_regions buffer.

- Patch 2 fixes an off-by-one in the local-vs-remote comparison loop,
  which uses '<=' instead of '<', reading one entry past the valid range
  even when qr_numregions is within bounds.

This patch (of 2):

The qr_numregions field from a DLM_QUERY_REGION network message is used
directly as loop bounds in dlm_match_regions() without checking against
O2NM_MAX_REGIONS.  Since qr_regions is sized for at most O2NM_MAX_REGIONS
(32) entries, a crafted message with qr_numregions > 32 causes
out-of-bounds reads past the qr_regions buffer.

Add a bounds check for qr_numregions before entering the loops.

Link: https://lkml.kernel.org/r/SYBPR01MB7881A334D02ACEE5E0645801AF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Link: https://lkml.kernel.org/r/SYBPR01MB788166F524AD04E262E174BEAF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Fixes: ea2034416b54 ("ocfs2/dlm: Add message DLM_QUERY_REGION")
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/bch: fix signed shift overflow in build_mod8_tables
Josh Law [Wed, 18 Mar 2026 07:48:06 +0000 (07:48 +0000)] 
lib/bch: fix signed shift overflow in build_mod8_tables

Cast loop variable to unsigned int before left-shifting to avoid undefined
behavior when i >= 128 and b == 3 (i << 24 overflows signed int).

Link: https://lkml.kernel.org/r/20260318074806.16527-3-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ivan Djelic <ivan.djelic@parrot.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/bch: fix signed left-shift undefined behavior
Josh Law [Wed, 18 Mar 2026 07:48:05 +0000 (07:48 +0000)] 
lib/bch: fix signed left-shift undefined behavior

Patch series "lib/bch: fix undefined behavior from signed left-shifts".

Fix two instances of undefined behavior in lib/bch.c caused by
left-shifting signed integers into or past the sign bit.

While the kernel's -fno-strict-overflow flag prevents miscompilation
today, these are formally UB per C11 6.5.7p4 and trivial to fix.

This patch (of 2):

Use 1u instead of 1 to avoid undefined behavior when left-shifting into
the sign bit of a signed int.  deg() can return up to 31, and 1 << 31 is
UB per C11.

Link: https://lkml.kernel.org/r/20260318074806.16527-2-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ivan Djelic <ivan.djelic@parrot.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoscripts/decodecode: return 0 on success
Patrick Bellasi [Wed, 18 Mar 2026 15:05:45 +0000 (15:05 +0000)] 
scripts/decodecode: return 0 on success

The decodecode script always returns an exit code of 1, regardless of
whether the operation was successful or not.  This is because the
"cleanup" function, which is registered to run on any script exit via
"trap cleanup EXIT", contains an unconditional "exit 1".

Remove the "exit 1" from the "cleanup" function so that it only performs
the necessary file cleanup without forcing a non-zero exit status.

Do that to ensure successful script executions now exit with code 0.
Exits due to errors are all handled by the "die()" function and will still
correctly exit with code 1.

Link: https://lkml.kernel.org/r/20260318150545.2809311-1-derkling@google.com
Signed-off-by: Patrick Bellasi <derkling@google.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agodo_notify_parent: sanitize the valid_signal() checks
Oleg Nesterov [Tue, 17 Mar 2026 13:58:18 +0000 (14:58 +0100)] 
do_notify_parent: sanitize the valid_signal() checks

Now that kernel_clone() checks valid_signal(args->exit_signal), the "sig"
argument of do_notify_parent() must always be valid or we have a bug.

However, do_notify_parent() only checks that sig != -1 at the start, then
it does another valid_signal() check before __send_signal_locked().

This is confusing.  Change do_notify_parent() to WARN and return early if
valid_signal(sig) is false.

Link: https://lkml.kernel.org/r/abld-ilvMEZ7VgMw@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agodoc: watchdog: futher improvements
Petr Mladek [Mon, 23 Mar 2026 17:21:38 +0000 (18:21 +0100)] 
doc: watchdog: futher improvements

Make further additions and alterations to the watchdog documentation.

Link: https://lkml.kernel.org/r/acF3tXBxSr0KOP9b@pathway.suse.cz
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Mayank Rungta <mrungta@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Stephane Erainan <eranian@google.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agodoc: watchdog: document buddy detector
Mayank Rungta [Thu, 12 Mar 2026 23:22:06 +0000 (16:22 -0700)] 
doc: watchdog: document buddy detector

The current documentation generalizes the hardlockup detector as primarily
NMI-perf-based and lacks details on the SMP "Buddy" detector.

Update the documentation to add a detailed description of the Buddy
detector, and also restructure the "Implementation" section to explicitly
separate "Softlockup Detector", "Hardlockup Detector (NMI/Perf)", and
"Hardlockup Detector (Buddy)".

Clarify that the softlockup hrtimer acts as the heartbeat generator for
both hardlockup mechanisms and centralize the configuration details in a
"Frequency and Heartbeats" section.

Link: https://lkml.kernel.org/r/20260312-hardlockup-watchdog-fixes-v2-5-45bd8a0cc7ed@google.com
Signed-off-by: Mayank Rungta <mrungta@google.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Stephane Erainan <eranian@google.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agowatchdog/hardlockup: improve buddy system detection timeliness
Mayank Rungta [Thu, 12 Mar 2026 23:22:05 +0000 (16:22 -0700)] 
watchdog/hardlockup: improve buddy system detection timeliness

Currently, the buddy system only performs checks every 3rd sample.  With a
4-second interval.  If a check window is missed, the next check occurs 12
seconds later, potentially delaying hard lockup detection for up to 24
seconds.

Modify the buddy system to perform checks at every interval (4s).
Introduce a missed-interrupt threshold to maintain the existing grace
period while reducing the detection window to 8-12 seconds.

Best and worst case detection scenarios:

Before (12s check window):
- Best case: Lockup occurs after first check but just before heartbeat
  interval. Detected in ~8s (8s till next check).
- Worst case: Lockup occurs just after a check.
  Detected in ~24s (missed check + 12s till next check + 12s logic).

After (4s check window with threshold of 3):
- Best case: Lockup occurs just before a check.
  Detected in ~8s (0s till 1st check + 4s till 2nd + 4s till 3rd).
- Worst case: Lockup occurs just after a check.
  Detected in ~12s (4s till 1st check + 4s till 2nd + 4s till 3rd).

Link: https://lkml.kernel.org/r/20260312-hardlockup-watchdog-fixes-v2-4-45bd8a0cc7ed@google.com
Signed-off-by: Mayank Rungta <mrungta@google.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Stephane Erainan <eranian@google.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agodoc: watchdog: clarify hardlockup detection timing
Mayank Rungta [Thu, 12 Mar 2026 23:22:04 +0000 (16:22 -0700)] 
doc: watchdog: clarify hardlockup detection timing

The current documentation implies that a hardlockup is strictly defined as
looping for "more than 10 seconds." However, the detection mechanism is
periodic (based on `watchdog_thresh`), meaning detection time varies
significantly depending on when the lockup occurs relative to the NMI perf
event.

Update the definition to remove the strict "more than 10 seconds"
constraint in the introduction and defer details to the Implementation
section.

Additionally, add a "Detection Overhead" section illustrating the Best
Case (~6s) and Worst Case (~20s) detection scenarios to provide
administrators with a clearer understanding of the watchdog's latency.

Link: https://lkml.kernel.org/r/20260312-hardlockup-watchdog-fixes-v2-3-45bd8a0cc7ed@google.com
Signed-off-by: Mayank Rungta <mrungta@google.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Stephane Erainan <eranian@google.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agowatchdog: update saved interrupts during check
Mayank Rungta [Thu, 12 Mar 2026 23:22:03 +0000 (16:22 -0700)] 
watchdog: update saved interrupts during check

Currently, arch_touch_nmi_watchdog() causes an early return that skips
updating hrtimer_interrupts_saved.  This leads to stale comparisons and
delayed lockup detection.

I found this issue because in our system the serial console is fairly
chatty.  For example, the 8250 console driver frequently calls
touch_nmi_watchdog() via console_write().  If a CPU locks up after a timer
interrupt but before next watchdog check, we see the following sequence:

  * watchdog_hardlockup_check() saves counter (e.g., 1000)
  * Timer runs and updates the counter (1001)
  * touch_nmi_watchdog() is called
  * CPU locks up
  * 10s pass: check() notices touch, returns early, skips update
  * 10s pass: check() saves counter (1001)
  * 10s pass: check() finally detects lockup

This delays detection to 30 seconds.  With this fix, we detect the lockup
in 20 seconds.

Link: https://lkml.kernel.org/r/20260312-hardlockup-watchdog-fixes-v2-2-45bd8a0cc7ed@google.com
Signed-off-by: Mayank Rungta <mrungta@google.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Stephane Erainan <eranian@google.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agowatchdog: return early in watchdog_hardlockup_check()
Mayank Rungta [Thu, 12 Mar 2026 23:22:02 +0000 (16:22 -0700)] 
watchdog: return early in watchdog_hardlockup_check()

Patch series "watchdog/hardlockup: Improvements to hardlockup", v2.

This series addresses limitations in the hardlockup detector
implementations and updates the documentation to reflect actual behavior
and recent changes.

The changes are structured as follows:

Refactoring (Patch 1)
=====================
Patch 1 refactors watchdog_hardlockup_check() to return early if no
lockup is detected. This reduces the indentation level of the main
logic block, serving as a clean base for the subsequent changes.

Hardlockup Detection Improvements (Patches 2 & 4)
=================================================
The hardlockup detector logic relies on updating saved interrupt counts to
determine if the CPU is making progress.

Patch 1 ensures that the saved interrupt count is updated unconditionally
before checking the "touched" flag.  This prevents stale comparisons which
can delay detection.  This is a logic fix that ensures the detector
remains accurate even when the watchdog is frequently touched.

Patch 3 improves the Buddy detector's timeliness.  The current checking
interval (every 3rd sample) causes high variability in detection time (up
to 24s).  This patch changes the Buddy detector to check at every hrtimer
interval (4s) with a missed-interrupt threshold of 3, narrowing the
detection window to a consistent 8-12 second range.

Documentation Updates (Patches 3 & 5)
=====================================
The current documentation does not fully capture the variable nature of
detection latency or the details of the Buddy system.

Patch 3 removes the strict "10 seconds" definition of a hardlockup, which
was misleading given the periodic nature of the detector.  It adds a
"Detection Overhead" section to the admin guide, using "Best Case" and
"Worst Case" scenarios to illustrate that detection time can vary
significantly (e.g., ~6s to ~20s).

Patch 5 adds a dedicated section for the Buddy detector, which was
previously undocumented.  It details the mechanism, the new timing logic,
and known limitations.

This patch (of 5):

Invert the `is_hardlockup(cpu)` check in `watchdog_hardlockup_check()` to
return early when a hardlockup is not detected.  This flattens the main
logic block, reducing the indentation level and making the code easier to
read and maintain.

This refactoring serves as a preparation patch for future hardlockup
changes.

Link: https://lkml.kernel.org/r/20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com
Link: https://lkml.kernel.org/r/20260312-hardlockup-watchdog-fixes-v2-1-45bd8a0cc7ed@google.com
Signed-off-by: Mayank Rungta <mrungta@google.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Stephane Erainan <eranian@google.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agokernel/kexec: remove inclusion of crypto/hash.h
Eric Biggers [Sat, 14 Mar 2026 20:41:44 +0000 (13:41 -0700)] 
kernel/kexec: remove inclusion of crypto/hash.h

kexec_core.c does not do any cryptographic hashing, so the header
crypto/hash.h is not needed at all.

Link: https://lkml.kernel.org/r/20260314204144.44884-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agokernel/crash: remove inclusion of crypto/sha1.h
Eric Biggers [Sat, 14 Mar 2026 20:42:43 +0000 (13:42 -0700)] 
kernel/crash: remove inclusion of crypto/sha1.h

Several files related to kernel crash dumps include crypto/sha1.h but
never use any of its functionality.  Remove these includes so that these
files don't unnecessarily come up in searches for which kernel code is
still using the obsolete SHA-1 algorithm.

Link: https://lkml.kernel.org/r/20260314204243.45001-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoCREDITS: simplify the end-of-file alphabetical order comment
Hisam Mehboob [Thu, 12 Mar 2026 01:17:42 +0000 (06:17 +0500)] 
CREDITS: simplify the end-of-file alphabetical order comment

The existing comment references specific individuals by name which becomes
outdated as new entries are added.  Simplify it to state the alphabetical
ordering rule clearly without depending on who happens to be the last
entry.

Link: https://lkml.kernel.org/r/20260312011741.846664-2-hisamshar@gmail.com
Signed-off-by: Hisam Mehboob <hisamshar@gmail.com>
Suggested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jakub Kacinski <kuba@kernel.org>
Cc: Simon Horman <horms@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/glob: initialize back_str to silence uninitialized variable warning
Josh Law [Thu, 12 Mar 2026 21:52:49 +0000 (21:52 +0000)] 
lib/glob: initialize back_str to silence uninitialized variable warning

back_str is only used when back_pat is non-NULL, and both are always set
together, so it is safe in practice.  Initialize back_str to NULL to make
this safety invariant explicit and silence compiler/static analysis
warnings.

Link: https://lkml.kernel.org/r/20260312215249.50165-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agocheckpatch: add support for Assisted-by tag
Sasha Levin [Wed, 11 Mar 2026 21:58:17 +0000 (17:58 -0400)] 
checkpatch: add support for Assisted-by tag

The Assisted-by tag was introduced in
Documentation/process/coding-assistants.rst for attributing AI tool
contributions to kernel patches.  However, checkpatch.pl did not recognize
this tag, causing two issues:

  WARNING: Non-standard signature: Assisted-by:
  ERROR: Unrecognized email address: 'AGENT_NAME:MODEL_VERSION'

Fix this by:
1. Adding Assisted-by to the recognized $signature_tags list
2. Skipping email validation for Assisted-by lines since they use the
   AGENT_NAME:MODEL_VERSION format instead of an email address
3. Warning when the Assisted-by value doesn't match the expected format

Link: https://lkml.kernel.org/r/20260311215818.518930-1-sashal@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agodecode_stacktrace: decode caller address
Masami Hiramatsu (Google) [Fri, 6 Mar 2026 00:50:16 +0000 (09:50 +0900)] 
decode_stacktrace: decode caller address

Decode the caller address instead of the return address by default.  This
also introduced -R option to provide return address decoding mode.

This changes the decode_stacktrace.sh to decode the line info 1byte before
the return address which will be the call(branch) instruction address.  If
the return address is a symbol address (zero offset from it), it falls
back to decoding the return address.

This improves results especially when optimizations have changed the order
of the lines around the return address, or when the return address does
not have the actual line information.

With this change;
 Call Trace:
  <TASK>
  dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
  lockdep_rcu_suspicious (kernel/locking/lockdep.c:6876)
  event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:1057)
  kernel_clone (include/trace/events/sched.h:396 include/trace/events/sched.h:396 kernel/fork.c:2664)
  __x64_sys_clone (kernel/fork.c:2795 kernel/fork.c:2779 kernel/fork.c:2779)
  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
  ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
  ? trace_irq_disable (include/trace/events/preemptirq.h:36)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)

Without this (or give -R option);
 Call Trace:
  <TASK>
  dump_stack_lvl (lib/dump_stack.c:122)
  lockdep_rcu_suspicious (kernel/locking/lockdep.c:6877)
  event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:?)
  kernel_clone (include/trace/events/sched.h:? include/trace/events/sched.h:396 kernel/fork.c:2664)
  __x64_sys_clone (kernel/fork.c:2779)
  do_syscall_64 (arch/x86/entry/syscall_64.c:?)
  ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
  ? trace_irq_disable (include/trace/events/preemptirq.h:36)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

[akpm@linux-foundation.org: fix spello]
Link: https://lkml.kernel.org/r/177275821652.1557019.18367881408364381866.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com> [arm64]
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Sasha Levin (Microsoft) <sashal@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoselftests: fix ARCH normalization to handle command-line argument
Aleksei Oladko [Mon, 9 Mar 2026 20:51:45 +0000 (20:51 +0000)] 
selftests: fix ARCH normalization to handle command-line argument

Several selftests Makefiles (e.g.  prctl, breakpoints, etc) attempt to
normalize the ARCH variable by converting x86_64 and i.86 to x86.
However, it uses the conditional assignment operator '?='.

When ARCH is passed as a command-line argument (e.g., during an rpmbuild
process), the '?=' operator ignores the shell command and the sed
transformation.  This leads to an incorrect ARCH value being used, which
causes build failures

  # make -C tools/testing/selftests TARGETS=prctl ARCH=x86_64
  make: Entering directory '/build/tools/testing/selftests'
  make[1]: Entering directory '/build/tools/testing/selftests/prctl'
  make[1]: *** No targets.  Stop.
  make[1]: Leaving directory '/build/tools/testing/selftests/prctl'
  make: *** [Makefile:197: all] Error 2

Change the assignment to use 'override' and ':=' to ensure the
normalization logic is applied regardless of how the ARCH variable was
initially defined.

Link: https://lkml.kernel.org/r/20260309205145.572778-1-aleksey.oladko@virtuozzo.com
Signed-off-by: Aleksei Oladko <aleksey.oladko@virtuozzo.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com>
Cc: Chelsy Ratnawat <chelsyratnawat2001@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/ts_kmp: fix integer overflow in pattern length calculation
Josh Law [Sun, 8 Mar 2026 20:20:28 +0000 (20:20 +0000)] 
lib/ts_kmp: fix integer overflow in pattern length calculation

The ts_kmp algorithm stores its prefix_tbl[] table and pattern in a single
allocation sized from the pattern length.  If the prefix_tbl[] size
calculation wraps, the resulting allocation can be too small and
subsequent pattern copies can overflow it.

Fix this by rejecting zero-length patterns and by using overflow helpers
before calculating the combined allocation size.

This fixes a potential heap overflow.  The pattern length calculation can
wrap during a size_t addition, leading to an undersized allocation.
Because the textsearch library is reachable from userspace via Netfilter's
xt_string module, this is a security risk that should be backported to LTS
kernels.

Link: https://lkml.kernel.org/r/20260308202028.2889285-2-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/ts_bm: fix integer overflow in pattern length calculation
Josh Law [Sun, 8 Mar 2026 20:20:27 +0000 (20:20 +0000)] 
lib/ts_bm: fix integer overflow in pattern length calculation

The ts_bm algorithm stores its good_shift[] table and pattern in a single
allocation sized from the pattern length.  If the good_shift[] size
calculation wraps, the resulting allocation can be too small and
subsequent pattern copies can overflow it.

Fix this by rejecting zero-length patterns and by using overflow helpers
before calculating the combined allocation size.

This fixes a potential heap overflow.  The pattern length calculation can
wrap during a size_t addition, leading to an undersized allocation.
Because the textsearch library is reachable from userspace via Netfilter's
xt_string module, this is a security risk that should be backported to LTS
kernels.

Link: https://lkml.kernel.org/r/20260308202028.2889285-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoocfs2: remove redundant error code assignment
Alexey Velichayshiy [Sat, 7 Mar 2026 23:47:53 +0000 (02:47 +0300)] 
ocfs2: remove redundant error code assignment

Remove the error assignment for variable 'ret' during correct code
execution.  In subsequent execution, variable 'ret' is overwritten.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Link: https://lkml.kernel.org/r/20260307234809.88421-1-a.velichayshiy@ispras.ru
Signed-off-by: Alexey Velichayshiy <a.velichayshiy@ispras.ru>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agotools/getdelays: use the static UAPI headers from tools/include/uapi
Thomas Weißschuh [Sat, 7 Mar 2026 08:47:53 +0000 (09:47 +0100)] 
tools/getdelays: use the static UAPI headers from tools/include/uapi

The include directory ../../usr/include is only present if an in-tree
kernel build with CONFIG_HEADERS_INSTALL was done before.
Otherwise the system UAPI headers are used, which most likely are not
the most recent ones.

To make sure to always have access to up-to-date UAPI headers,
use the static copy in tools/include/uapi.

Link: https://lkml.kernel.org/r/20260307-accounting-taskstats-h-v1-2-0b75915c6ce5@weissschuh.net
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202603062103.Z5fecwZD-lkp@intel.com/
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Jiang Kun <jiang.kun2@zte.com.cn>
Cc: Wang Yaxin <wang.yaxin@zte.com.cn>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agotools headers UAPI: sync linux/taskstats.h
Thomas Weißschuh [Sat, 7 Mar 2026 08:47:52 +0000 (09:47 +0100)] 
tools headers UAPI: sync linux/taskstats.h

Patch series "tools/getdelays: use the static UAPI headers from
tools/include/uapi".

The include directory ../../usr/include is only present if an in-tree
kernel build with CONFIG_HEADERS_INSTALL was done before.  Otherwise the
system UAPI headers are used, which most likely are not the most recent
ones.

To make sure to always have access to up-to-date UAPI headers, use the
static copy in tools/include/uapi.

This patch (of 2):

To give the accounting tools access to the new fields introduced in commit
503efe850c74 ("delayacct: add timestamp of delay max")

Link: https://lkml.kernel.org/r/20260307-accounting-taskstats-h-v1-0-0b75915c6ce5@weissschuh.net
Link: https://lkml.kernel.org/r/20260307-accounting-taskstats-h-v1-1-0b75915c6ce5@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Jiang Kun <jiang.kun2@zte.com.cn>
Cc: Wang Yaxin <wang.yaxin@zte.com.cn>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib: decompress_bunzip2: fix 32-bit shift undefined behavior
Josh Law [Sun, 8 Mar 2026 16:50:12 +0000 (16:50 +0000)] 
lib: decompress_bunzip2: fix 32-bit shift undefined behavior

Fix undefined behavior caused by shifting a 32-bit integer by 32 bits
during decompression.  This prevents potential kernel decompression
failures or corruption when parsing malicious or malformed bzip2 archives.

Link: https://lkml.kernel.org/r/20260308165012.2872633-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoocfs2: fix possible deadlock between unlink and dio_end_io_write
Joseph Qi [Fri, 6 Mar 2026 03:22:11 +0000 (11:22 +0800)] 
ocfs2: fix possible deadlock between unlink and dio_end_io_write

ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem,
while in ocfs2_dio_end_io_write, it acquires these locks in reverse order.
This creates an ABBA lock ordering violation on lock classes
ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and
ocfs2_file_ip_alloc_sem_key.

Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem):
ocfs2_unlink
  ocfs2_prepare_orphan_dir
    ocfs2_lookup_lock_orphan_dir
      inode_lock(orphan_dir_inode) <- lock A
    __ocfs2_prepare_orphan_dir
      ocfs2_prepare_dir_for_insert
        ocfs2_extend_dir
  ocfs2_expand_inline_dir
    down_write(&oi->ip_alloc_sem) <- Lock B

Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock):
ocfs2_dio_end_io_write
  down_write(&oi->ip_alloc_sem) <- Lock B
  ocfs2_del_inode_from_orphan()
    inode_lock(orphan_dir_inode) <- Lock A

Deadlock Scenario:
  CPU0 (unlink)                     CPU1 (dio_end_io_write)
  ------                            ------
  inode_lock(orphan_dir_inode)
                                    down_write(ip_alloc_sem)
  down_write(ip_alloc_sem)
                                    inode_lock(orphan_dir_inode)

Since ip_alloc_sem is to protect allocation changes, which is unrelated
with operations in ocfs2_del_inode_from_orphan.  So move
ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock.

Link: https://lkml.kernel.org/r/20260306032211.1016452-1-joseph.qi@linux.alibaba.com
Reported-by: syzbot+67b90111784a3eac8c04@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04
Fixes: a86a72a4a4e0 ("ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/bug: remove unnecessary variable initializations
Josh Law [Fri, 6 Mar 2026 16:24:18 +0000 (16:24 +0000)] 
lib/bug: remove unnecessary variable initializations

Remove the unnecessary initialization of 'rcu' to false in
report_bug_entry() and report_bug(), as it is assigned by warn_rcu_enter()
before its first use.

Link: https://lkml.kernel.org/r/20260306162418.2815979-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/bug: fix inconsistent capitalization in BUG message
Josh Law [Fri, 6 Mar 2026 16:23:27 +0000 (16:23 +0000)] 
lib/bug: fix inconsistent capitalization in BUG message

Use lowercase "kernel BUG" consistently in pr_crit() messages.  The
verbose path already uses "kernel BUG at %s:%u!" but the non-verbose
fallback uses "Kernel BUG" with an uppercase 'K'.

Link: https://lkml.kernel.org/r/20260306162327.2815553-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/inflate: fix typo "This results" to "The results" in comment
Josh Law [Fri, 6 Mar 2026 16:17:32 +0000 (16:17 +0000)] 
lib/inflate: fix typo "This results" to "The results" in comment

Fix "This results of this trade" to "The results of this trade" in the
comment describing the lbits and dbits tuning parameters.

Link: https://lkml.kernel.org/r/20260306161732.2812132-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/inflate: fix grammar in comment: "variable" to "variables"
Josh Law [Fri, 6 Mar 2026 16:17:07 +0000 (16:17 +0000)] 
lib/inflate: fix grammar in comment: "variable" to "variables"

Fix "all variable" to "all variables" in the file header comment.

Link: https://lkml.kernel.org/r/20260306161707.2812005-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/inflate: fix memory leak in inflate_dynamic() on inflate_codes() failure
Josh Law [Fri, 6 Mar 2026 16:16:47 +0000 (16:16 +0000)] 
lib/inflate: fix memory leak in inflate_dynamic() on inflate_codes() failure

When inflate_codes() fails in inflate_dynamic(), the code jumps to the
'out' label which only frees 'll', leaking the Huffman tables 'tl' and
'td'.  Restructure the code so that the decoding tables are always freed
before reaching the 'out' label.

Link: https://lkml.kernel.org/r/20260306161647.2811874-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/inflate: fix memory leak in inflate_fixed() on inflate_codes() failure
Josh Law [Fri, 6 Mar 2026 16:16:12 +0000 (16:16 +0000)] 
lib/inflate: fix memory leak in inflate_fixed() on inflate_codes() failure

When inflate_codes() fails in inflate_fixed(), only the length list 'l' is
freed, but the Huffman tables 'tl' and 'td' are leaked.  Add the missing
huft_free() calls on the error path.

Link: https://lkml.kernel.org/r/20260306161612.2811703-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/uuid: fix typo "reversion" to "revision" in comment
Josh Law [Fri, 6 Mar 2026 16:12:50 +0000 (16:12 +0000)] 
lib/uuid: fix typo "reversion" to "revision" in comment

Fix a typo in __uuid_gen_common() where "reversion" (meaning to revert)
was used instead of "revision" when describing the UUID variant field.

Link: https://lkml.kernel.org/r/20260306161250.2811500-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoscripts/gdb/symbols: handle module path parameters
Benjamin Berg [Wed, 4 Mar 2026 11:06:43 +0000 (12:06 +0100)] 
scripts/gdb/symbols: handle module path parameters

commit 581ee79a2547 ("scripts/gdb/symbols: make BPF debug info available
to GDB") added support to make BPF debug information available to GDB.
However, the argument handling loop was slightly broken, causing it to
fail if further modules were passed.  Fix it to append these passed
modules to the instance variable after expansion.

Link: https://lkml.kernel.org/r/20260304110642.2020614-2-benjamin@sipsolutions.net
Fixes: 581ee79a2547 ("scripts/gdb/symbols: make BPF debug info available to GDB")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agohung_task: explicitly report I/O wait state in log output
Aaron Tomlin [Tue, 3 Mar 2026 22:13:24 +0000 (17:13 -0500)] 
hung_task: explicitly report I/O wait state in log output

Currently, the hung task reporting mechanism indiscriminately labels all
TASK_UNINTERRUPTIBLE (D) tasks as "blocked", irrespective of whether they
are awaiting I/O completion or kernel locking primitives.  This ambiguity
compels system administrators to manually inspect stack traces to discern
whether the delay stems from an I/O wait (typically indicative of hardware
or filesystem anomalies) or software contention.  Such detailed analysis
is not always immediately accessible to system administrators or support
engineers.

To address this, this patch utilises the existing in_iowait field within
struct task_struct to augment the failure report.  If the task is blocked
due to I/O (e.g., via io_schedule_prepare()), the log message is updated
to explicitly state "blocked in I/O wait".

Examples:
        - Standard Block: "INFO: task bash:123 blocked for more than 120
          seconds".

        - I/O Block: "INFO: task dd:456 blocked in I/O wait for more than
          120 seconds".

Theoretically, concurrent executions of io_schedule_finish() could result
in a race condition where the read flag does not precisely correlate with
the subsequently printed backtrace.  However, this limitation is deemed
acceptable in practice.  The entire reporting mechanism is inherently racy
by design; nevertheless, it remains highly reliable in the vast majority
of cases, particularly because it primarily captures protracted stalls.
Consequently, introducing additional synchronisation to mitigate this
minor inaccuracy would be entirely disproportionate to the situation.

Link: https://lkml.kernel.org/r/20260303221324.4106917-1-atomlin@atomlin.com
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agohung_task: increment the global counter immediately
Petr Mladek [Tue, 3 Mar 2026 20:30:31 +0000 (15:30 -0500)] 
hung_task: increment the global counter immediately

A recent change allowed to reset the global counter of hung tasks using
the sysctl interface.  A potential race with the regular check has been
solved by updating the global counter only once at the end of the check.

However, the hung task check can take a significant amount of time,
particularly when task information is being dumped to slow serial
consoles.  Some users monitor this global counter to trigger immediate
migration of critical containers.  Delaying the increment until the full
check completes postpones these high-priority rescue operations.

Update the global counter as soon as a hung task is detected.  Since the
value is read asynchronously, a relaxed atomic operation is sufficient.

Link: https://lkml.kernel.org/r/20260303203031.4097316-4-atomlin@atomlin.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Reported-by: Lance Yang <lance.yang@linux.dev>
Closes: https://lore.kernel.org/r/f239e00f-4282-408d-b172-0f9885f4b01b@linux.dev
Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agohung_task: enable runtime reset of hung_task_detect_count
Aaron Tomlin [Tue, 3 Mar 2026 20:30:30 +0000 (15:30 -0500)] 
hung_task: enable runtime reset of hung_task_detect_count

Currently, the hung_task_detect_count sysctl provides a cumulative count
of hung tasks since boot.  In long-running, high-availability
environments, this counter may lose its utility if it cannot be reset once
an incident has been resolved.  Furthermore, the previous implementation
relied upon implicit ordering, which could not strictly guarantee that
diagnostic metadata published by one CPU was visible to the panic logic on
another.

This patch introduces the capability to reset the detection count by
writing "0" to the hung_task_detect_count sysctl.  The proc_handler logic
has been updated to validate this input and atomically reset the counter.

The synchronisation of sysctl_hung_task_detect_count relies upon a
transactional model to ensure the integrity of the detection counter
against concurrent resets from userspace.  The application of
atomic_long_read_acquire() and atomic_long_cmpxchg_release() is correct
and provides the following guarantees:

    1. Prevention of Load-Store Reordering via Acquire Semantics By
       utilising atomic_long_read_acquire() to snapshot the counter
       before initiating the task traversal, we establish a strict
       memory barrier. This prevents the compiler or hardware from
       reordering the initial load to a point later in the scan. Without
       this "acquire" barrier, a delayed load could potentially read a
       "0" value resulting from a userspace reset that occurred
       mid-scan. This would lead to the subsequent cmpxchg succeeding
       erroneously, thereby overwriting the user's reset with stale
       increment data.

    2. Atomicity of the "Commit" Phase via Release Semantics The
       atomic_long_cmpxchg_release() serves as the transaction's commit
       point. The "release" barrier ensures that all diagnostic
       recordings and task-state observations made during the scan are
       globally visible before the counter is incremented.

    3. Race Condition Resolution This pairing effectively detects any
       "out-of-band" reset of the counter. If
       sysctl_hung_task_detect_count is modified via the procfs
       interface during the scan, the final cmpxchg will detect the
       discrepancy between the current value and the "acquire" snapshot.
       Consequently, the update will fail, ensuring that a reset command
       from the administrator is prioritised over a scan that may have
       been invalidated by that very reset.

Link: https://lkml.kernel.org/r/20260303203031.4097316-3-atomlin@atomlin.com
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Joel Granados <joel.granados@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agohung_task: refactor detection logic and atomicise detection count
Aaron Tomlin [Tue, 3 Mar 2026 20:30:29 +0000 (15:30 -0500)] 
hung_task: refactor detection logic and atomicise detection count

Patch series "hung_task: Provide runtime reset interface for hung task
detector", v9.

This series introduces the ability to reset
/proc/sys/kernel/hung_task_detect_count.

Writing a "0" value to this file atomically resets the counter of detected
hung tasks.  This functionality provides system administrators with the
means to clear the cumulative diagnostic history following incident
resolution, thereby simplifying subsequent monitoring without
necessitating a system restart.

This patch (of 3):

The check_hung_task() function currently conflates two distinct
responsibilities: validating whether a task is hung and handling the
subsequent reporting (printing warnings, triggering panics, or
tracepoints).

This patch refactors the logic by introducing hung_task_info(), a function
dedicated solely to reporting.  The actual detection check,
task_is_hung(), is hoisted into the primary loop within
check_hung_uninterruptible_tasks().  This separation clearly decouples the
mechanism of detection from the policy of reporting.

Furthermore, to facilitate future support for concurrent hung task
detection, the global sysctl_hung_task_detect_count variable is converted
from unsigned long to atomic_long_t.  Consequently, the counting logic is
updated to accumulate the number of hung tasks locally (this_round_count)
during the iteration.  The global counter is then updated atomically via
atomic_long_cmpxchg_relaxed() once the loop concludes, rather than
incrementally during the scan.

These changes are strictly preparatory and introduce no functional change
to the system's runtime behaviour.

Link: https://lkml.kernel.org/r/20260303203031.4097316-1-atomlin@atomlin.com
Link: https://lkml.kernel.org/r/20260303203031.4097316-2-atomlin@atomlin.com
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Joel Granados <joel.granados@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agomailmap: update Guru Das Srinagesh's email address
Guru Das Srinagesh [Mon, 2 Mar 2026 02:44:20 +0000 (18:44 -0800)] 
mailmap: update Guru Das Srinagesh's email address

Add my current email address and map previous addresses to it.

Link: https://lkml.kernel.org/r/20260301-gds-mailmap-update-2-v1-1-5691415be73c@gurudas.dev
Signed-off-by: Guru Das Srinagesh <linux@gurudas.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib: math: polynomial: remove link to non-exist file and fix spelling
Andy Shevchenko [Mon, 2 Mar 2026 09:28:09 +0000 (10:28 +0100)] 
lib: math: polynomial: remove link to non-exist file and fix spelling

The Baikal SoC and platform support was dropped from the kernel, remove
the reference to non-exist file.  While at it, fix spelling.

Link: https://lkml.kernel.org/r/20260302092831.2267785-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib: math: polynomial: don't use 'proxy' headers
Andy Shevchenko [Mon, 2 Mar 2026 09:28:08 +0000 (10:28 +0100)] 
lib: math: polynomial: don't use 'proxy' headers

Update header inclusions to follow IWYU (Include What You Use) principle.

Link: https://lkml.kernel.org/r/20260302092831.2267785-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib: polynomial: move to math/ subfolder
Andy Shevchenko [Mon, 2 Mar 2026 09:28:07 +0000 (10:28 +0100)] 
lib: polynomial: move to math/ subfolder

Patch series "lib: polynomial: Move to math/ and clean up", v2.

While removing Baikal SoC and platform code pieces I found that this code
belongs to lib/math/ rather than generic lib/.  Hence the move and
followed up cleanups.

This patch (of 3):

The algorithm behind polynomial belongs to our collection of math
equations and expressions handling.  Move it to math/ subfolder where
others of the kind are located.

Link: https://lkml.kernel.org/r/20260302092831.2267785-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoocfs2: fix deadlock when creating quota file
Heming Zhao [Mon, 2 Mar 2026 06:17:05 +0000 (14:17 +0800)] 
ocfs2: fix deadlock when creating quota file

syzbot detected a circular locking dependency. the scenarios:

        CPU0                    CPU1
        ----                    ----
   lock(&ocfs2_quota_ip_alloc_sem_key);
                                lock(&ocfs2_sysfile_lock_key[USER_QUOTA_SYSTEM_INODE]);
                                lock(&ocfs2_quota_ip_alloc_sem_key);
   lock(&ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE]);

or:
       CPU0                    CPU1
       ----                    ----
  lock(&ocfs2_quota_ip_alloc_sem_key);
                               lock(&dquot->dq_lock);
                               lock(&ocfs2_quota_ip_alloc_sem_key);
  lock(&ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE]);

Following are the code paths for above scenarios:

path_openat
 ocfs2_create
  ocfs2_mknod
  + ocfs2_reserve_new_inode
  |  ocfs2_reserve_suballoc_bits
  |   inode_lock(alloc_inode) //C0: hold INODE_ALLOC_SYSTEM_INODE
  |    //ocfs2_free_alloc_context(inode_ac) is called at the end of
  |    //caller ocfs2_mknod to handle the release
  |
  + ocfs2_get_init_inode
     __dquot_initialize
      dqget
       ocfs2_acquire_dquot
       + ocfs2_lock_global_qf
       |  down_write(&OCFS2_I(oinfo->dqi_gqinode)->ip_alloc_sem)//A2:grabbing
       + ocfs2_create_local_dquot
          down_write(&OCFS2_I(lqinode)->ip_alloc_sem)//A3:grabbing

evict
 ocfs2_evict_inode
  ocfs2_delete_inode
   ocfs2_wipe_inode
    + inode_lock(orphan_dir_inode) //B0:hold
    + ...
    + ocfs2_remove_inode
       inode_lock(inode_alloc_inode) //INODE_ALLOC_SYSTEM_INODE
        down_write(&inode->i_rwsem) //C1:grabbing

generic_file_direct_write
 ocfs2_direct_IO
  __blockdev_direct_IO
   dio_complete
    ocfs2_dio_end_io
     ocfs2_dio_end_io_write
     + down_write(&oi->ip_alloc_sem) //A0:hold
     + ocfs2_del_inode_from_orphan
        inode_lock(orphan_dir_inode) //B1:grabbing

Root cause for the circular locking:

DIO completion path:
 holds oi->ip_alloc_sem and is trying to acquire the orphan_dir_inode lock.

evict path:
 holds the orphan_dir_inode lock and is trying to acquire the
 inode_alloc_inode lock.

ocfs2_mknod path:
 Holds the inode_alloc_inode lock (to allocate a new quota file) and is
 blocked waiting for oi->ip_alloc_sem in ocfs2_acquire_dquot().

How to fix:

Replace down_write() with down_write_trylock() in ocfs2_acquire_dquot().
If acquiring oi->ip_alloc_sem fails, return -EBUSY to abort the file
creation routine and break the deadlock.

Link: https://lkml.kernel.org/r/20260302061707.7092-1-heming.zhao@suse.com
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reported-by: syzbot+78359d5fbb04318c35e9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=78359d5fbb04318c35e9
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoget_maintainer: add ** glob pattern support
Matteo Croce [Mon, 2 Mar 2026 10:38:22 +0000 (11:38 +0100)] 
get_maintainer: add ** glob pattern support

Add support for the ** glob operator in MAINTAINERS F: and X: patterns,
matching any number of path components (like Python's ** glob).

The existing * to .* conversion with slash-count check is preserved.  **
is converted to (?:.*), a non-capturing group used as a marker to bypass
the slash-count check in file_match_pattern(), allowing the pattern to
cross directory boundaries.

This enables patterns like F: **/*[_-]kunit*.c to match files at any depth
in the tree.

Link: https://lkml.kernel.org/r/20260302103822.77343-1-teknoraver@meta.com
Signed-off-by: Matteo Croce <teknoraver@meta.com>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agocrash_dump: use sysfs_emit in sysfs show functions
Thorsten Blum [Sun, 1 Mar 2026 12:51:07 +0000 (13:51 +0100)] 
crash_dump: use sysfs_emit in sysfs show functions

Replace sprintf() with sysfs_emit() in sysfs show functions.  sysfs_emit()
is preferred for formatting sysfs output because it provides safer bounds
checking.  No functional changes.

Link: https://lkml.kernel.org/r/20260301125106.911980-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib/glob: clean up "bool abuse" in pointer arithmetic
Josh Law [Sun, 1 Mar 2026 20:38:45 +0000 (20:38 +0000)] 
lib/glob: clean up "bool abuse" in pointer arithmetic

Replace the implicit 'bool' to 'int' conversion with an explicit ternary
operator.  This makes the pointer arithmetic clearer and avoids relying on
boolean memory representation for logic flow.

Link: https://lkml.kernel.org/r/20260301203845.2617217-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib: glob: replace bitwise OR with logical operation on boolean
Josh Law [Sun, 1 Mar 2026 15:21:42 +0000 (15:21 +0000)] 
lib: glob: replace bitwise OR with logical operation on boolean

Using bitwise OR (|=) on a boolean variable is valid C, but replacing it
with a direct logical assignment makes the intent clearer and appeases
strict static analysis tools.

Link: https://lkml.kernel.org/r/20260301152143.2572137-2-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib: glob: add explicit include for export.h
Josh Law [Sun, 1 Mar 2026 15:21:41 +0000 (15:21 +0000)] 
lib: glob: add explicit include for export.h

Include <linux/export.h> explicitly instead of relying on it being
implicitly included by <linux/module.h> for the EXPORT_SYMBOL macro.

Link: https://lkml.kernel.org/r/20260301152143.2572137-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib: glob: fix grammar and replace non-inclusive terminology
Josh Law [Sun, 1 Mar 2026 15:45:53 +0000 (15:45 +0000)] 
lib: glob: fix grammar and replace non-inclusive terminology

Fix a missing article ('a') in the comment describing the glob
implementation, and replace 'blacklists' with 'denylists' to align with
the kernel's inclusive terminology guidelines.

Link: https://lkml.kernel.org/r/20260301154553.2592681-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoselftests/fchmodat2: use ksft_finished()
Mark Brown [Thu, 26 Feb 2026 22:24:43 +0000 (22:24 +0000)] 
selftests/fchmodat2: use ksft_finished()

The fchmodat2 test program open codes a version of ksft_finished(), use
the standard version.

Link: https://lkml.kernel.org/r/20260226-selftests-fchmodat2-v4-2-a6419435f2e8@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Alexey Gladkov <legion@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoselftests/fchmodat2: clean up temporary files and directories
Mark Brown [Thu, 26 Feb 2026 22:24:42 +0000 (22:24 +0000)] 
selftests/fchmodat2: clean up temporary files and directories

Patch series "selftests/fchmodat2: Error handling and general", v4.

I looked at the fchmodat2() tests since I've been experiencing some random
intermittent segfaults with them in my test systems, while doing so I
noticed these two issues.  Unfortunately I didn't figure out the original
yet, unless I managed to fix it unwittingly.

This patch (of 2):

The fchmodat2() test program creates a temporary directory with a file and
a symlink for every test it runs but never cleans these up, resulting in
${TMPDIR} getting left with stale files after every run.  Restructure the
program a bit to ensure that we clean these up, this is more invasive than
it might otherwise be due to the extensive use of ksft_exit_fail_msg() in
the program.

As a side effect this also ensures that we report a consistent test name
for the tests and always try both tests even if they are skipped.

Link: https://lkml.kernel.org/r/20260226-selftests-fchmodat2-v4-0-a6419435f2e8@kernel.org
Link: https://lkml.kernel.org/r/20260226-selftests-fchmodat2-v4-1-a6419435f2e8@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Alexey Gladkov <legion@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agolib: glob: add missing SPDX-License-Identifier
Josh Law [Sat, 28 Feb 2026 19:53:00 +0000 (19:53 +0000)] 
lib: glob: add missing SPDX-License-Identifier

Add the missing dual MIT/GPL license identifier to glob.c.

Link: https://lkml.kernel.org/r/20260228195300.2468310-1-objecting@objecting.org
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agopid: document the PIDNS_ADDING checks in alloc_pid() and copy_process()
Oleg Nesterov [Fri, 27 Feb 2026 12:04:20 +0000 (13:04 +0100)] 
pid: document the PIDNS_ADDING checks in alloc_pid() and copy_process()

Both copy_process() and alloc_pid() do the same PIDNS_ADDING check.  The
reasons for these checks, and the fact that both are necessary, are not
immediately obvious.  Add the comments.

Link: https://lkml.kernel.org/r/aaGIRElc78U4Er42@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Adrian Reber <areber@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Kirill Tkhai <tkhai@ya.ru>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agopid: make sub-init creation retryable
Oleg Nesterov [Fri, 27 Feb 2026 12:03:41 +0000 (13:03 +0100)] 
pid: make sub-init creation retryable

Patch series "pid: make sub-init creation retryable".

This patch (of 2):

Currently we allow only one attempt to create init in a new namespace.  If
the first fork() fails after alloc_pid() succeeds, free_pid() clears
PIDNS_ADDING and thus disables further PID allocations.

Nowadays this looks like an unnecessary limitation.  The original reason
to handle "case PIDNS_ADDING" in free_pid() is gone, most probably after
commit 69879c01a0c3 ("proc: Remove the now unnecessary internal mount of
proc").

Change free_pid() to keep ns->pid_allocated == PIDNS_ADDING, and change
alloc_pid() to reset the cursor early, right after taking pidmap_lock.

Test-case:

#define _GNU_SOURCE
#include <linux/sched.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <assert.h>
#include <sched.h>
#include <errno.h>

int main(void)
{
struct clone_args args = {
.exit_signal = SIGCHLD,
.flags = CLONE_PIDFD,
.pidfd = 0,
};
unsigned long pidfd;
int pid;

assert(unshare(CLONE_NEWPID) == 0);

pid = syscall(__NR_clone3, &args, sizeof(args));
assert(pid == -1 && errno == EFAULT);

args.pidfd = (unsigned long)&pidfd;
pid = syscall(__NR_clone3, &args, sizeof(args));
if (pid)
assert(pid > 0 && wait(NULL) == pid);
else
assert(getpid() == 1);

return 0;
}

Link: https://lkml.kernel.org/r/aaGHu3ixbw9Y7kFj@redhat.com
Link: https://lkml.kernel.org/r/aaGIHa7vGdwhEc_D@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
Cc: Adrian Reber <areber@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Kirill Tkhai <tkhai@ya.ru>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agocrash_dump: fix typo in function name read_key_from_user_keying
Thorsten Blum [Fri, 27 Feb 2026 23:04:21 +0000 (00:04 +0100)] 
crash_dump: fix typo in function name read_key_from_user_keying

The function read_key_from_user_keying() is missing an 'r' in its name.
Fix the typo by renaming it to read_key_from_user_keyring().

Link: https://lkml.kernel.org/r/20260227230422.859423-1-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agocrash_dump: remove redundant less-than-zero check
Thorsten Blum [Sat, 28 Feb 2026 08:51:36 +0000 (09:51 +0100)] 
crash_dump: remove redundant less-than-zero check

'key_count' is an 'unsigned int' and cannot be less than zero. Remove
the redundant condition.

Link: https://lkml.kernel.org/r/20260228085136.861971-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agofork: replace simple_strtoul with kstrtoul in coredump_filter_setup
Thorsten Blum [Mon, 15 Dec 2025 14:21:52 +0000 (15:21 +0100)] 
fork: replace simple_strtoul with kstrtoul in coredump_filter_setup

Replace simple_strtoul() with the recommended kstrtoul() for parsing the
'coredump_filter=' boot parameter.

Check the return value of kstrtoul() and reject invalid values.  This adds
error handling while preserving behavior for existing values, and removes
use of the deprecated simple_strtoul() helper.  The current code silently
sets 'default_dump_filter = 0' if parsing fails, instead of leaving the
default value (MMF_DUMP_FILTER_DEFAULT) unchanged.

Rename the static variable 'default_dump_filter' to 'coredump_filter'
since it does not necessarily contain the default value and the current
name can be misleading.

Link: https://lkml.kernel.org/r/20251215142152.4082-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ben Segall <bsegall@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agocomplete_signal: kill always-true "core_state || !SIGNAL_GROUP_EXIT" check
Oleg Nesterov [Sun, 22 Feb 2026 15:24:00 +0000 (16:24 +0100)] 
complete_signal: kill always-true "core_state || !SIGNAL_GROUP_EXIT" check

The "(signal->core_state || !(signal->flags & SIGNAL_GROUP_EXIT))" check
in complete_signal() is not obvious at all, and in fact it only adds
unnecessary confusion: this condition is always true.

prepare_signal() does:

if (signal->flags & SIGNAL_GROUP_EXIT) {
if (signal->core_state)
return sig == SIGKILL;
/*
 * The process is in the middle of dying, drop the signal.
 */
return false;
}

This means that "!signal->core_state && (signal->flags &
SIGNAL_GROUP_EXIT)" in complete_signal() is never possible.

If SIGNAL_GROUP_EXIT is set, prepare_signal() can only return true if
signal->core_state is not NULL.

Link: https://lkml.kernel.org/r/aZsfkDhnqJ4s1oTs@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc; Deepanshu Kartikey <kartikey406@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoexit: kill unnecessary thread_group_leader() checks in exit_notify() and do_notify_pa...
Oleg Nesterov [Sun, 22 Feb 2026 15:23:37 +0000 (16:23 +0100)] 
exit: kill unnecessary thread_group_leader() checks in exit_notify() and do_notify_parent()

thread_group_empty(tsk) is only possible if tsk is a group leader, and
thread_group_empty() already does the thread_group_leader() check.

So it makes no sense to check "thread_group_leader() &&
thread_group_empty()"; thread_group_empty() alone is enough.

Link: https://lkml.kernel.org/r/aZsfeegKZPZZszJh@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Kees Cook <kees@kernel.org>
Cc; Deepanshu Kartikey <kartikey406@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoselftests/ipc: skip msgque test when MSG_COPY is unsupported
UYeol Jo [Tue, 10 Feb 2026 13:53:59 +0000 (22:53 +0900)] 
selftests/ipc: skip msgque test when MSG_COPY is unsupported

msgque kselftest uses msgrcv(..., MSG_COPY) to copy messages.  When the
kernel is built without CONFIG_CHECKPOINT_RESTORE, prepare_copy() is
stubbed out and msgrcv() returns -ENOSYS.  The test currently reports this
as a failure even though it is simply a missing feature/configuration.

Skip the test when msgrcv() fails with ENOSYS.

Link: https://lkml.kernel.org/r/20260210135359.178636-1-jouyeol8739@gmail.com
Signed-off-by: UYeol Jo <jouyeol8739@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoscripts/spelling.txt: add "exaclty" typo
Petr Vorel [Thu, 12 Feb 2026 14:40:05 +0000 (15:40 +0100)] 
scripts/spelling.txt: add "exaclty" typo

Link: https://lkml.kernel.org/r/20260212144005.45052-2-pvorel@suse.cz
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Cc: Jonathan Camerom <Jonathan.Cameron@huawei.com>
Cc: WangYuli <wangyuli@uniontech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoscripts/spelling.txt: sort alphabetically
Petr Vorel [Thu, 12 Feb 2026 14:40:04 +0000 (15:40 +0100)] 
scripts/spelling.txt: sort alphabetically

Easier to add new entries.  It was sorted when added in 66b47b4a9dad0, but
later got wrong order for few entries.  Sorted with en_US.UTF-8 locale.

Link: https://lkml.kernel.org/r/20260212144005.45052-1-pvorel@suse.cz
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Cc: Jonathan Camerom <Jonathan.Cameron@huawei.com>
Cc: WangYuli <wangyuli@uniontech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agokernel/panic: mark init_taint_buf as __initdata and panic instead of warning in alloc...
Rio [Mon, 23 Feb 2026 03:59:14 +0000 (09:29 +0530)] 
kernel/panic: mark init_taint_buf as __initdata and panic instead of warning in alloc_taint_buf()

However there's a convention of assuming that __init-time allocations
cannot fail.  Because if a kmalloc() were to fail at this time, the kernel
is hopelessly messed up anyway.  So simply panic() if that kmalloc failed,
then make that 350-byte buffer __initdata.

Link: https://lkml.kernel.org/r/20260223035914.4033-1-rioo.tsukatsukii@gmail.com
Signed-off-by: Rio <rioo.tsukatsukii@gmail.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agokernel/panic: allocate taint string buffer dynamically
Rio [Sun, 22 Feb 2026 14:08:04 +0000 (19:38 +0530)] 
kernel/panic: allocate taint string buffer dynamically

The buffer used to hold the taint string is statically allocated, which
requires updating whenever a new taint flag is added.

Instead, allocate the exact required length at boot once the allocator is
available in an init function.  The allocation sums the string lengths in
taint_flags[], along with space for separators and formatting.
print_tainted() is switched to use this dynamically allocated buffer.

If allocation fails, print_tainted() warns about the failure and continues
to use the original static buffer as a fallback.

Link: https://lkml.kernel.org/r/20260222140804.22225-1-rioo.tsukatsukii@gmail.com
Signed-off-by: Rio <rioo.tsukatsukii@gmail.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agokernel/panic: increase buffer size for verbose taint logging
Rio [Fri, 20 Feb 2026 15:15:00 +0000 (20:45 +0530)] 
kernel/panic: increase buffer size for verbose taint logging

The verbose 'Tainted: ...' string in print_tainted_seq can total to 327
characters while the buffer defined in _print_tainted is 320 bytes.
Increase its size to 350 characters to hold all flags, along with some
headroom.

[akpm@linux-foundation.org: fix spello, add comment]
Link: https://lkml.kernel.org/r/20260220151500.13585-1-rioo.tsukatsukii@gmail.com
Signed-off-by: Rio <rioo.tsukatsukii@gmail.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoscripts/bloat-o-meter: rename file arguments to match output
Valtteri Koskivuori [Thu, 12 Feb 2026 21:39:33 +0000 (23:39 +0200)] 
scripts/bloat-o-meter: rename file arguments to match output

The output of bloat-o-meter already uses the words 'old' and 'new' for
symbol size in the table header, so reflect that in the corresponding
argument names.

Link: https://lkml.kernel.org/r/20260212213941.3984330-1-vkoskiv@gmail.com
Signed-off-by: Valtteri Koskivuori <vkoskiv@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agounshare: fix nsproxy leak in ksys_unshare() on set_cred_ucounts() failure
Michal Grzedzicki [Fri, 13 Feb 2026 19:39:59 +0000 (11:39 -0800)] 
unshare: fix nsproxy leak in ksys_unshare() on set_cred_ucounts() failure

When set_cred_ucounts() fails in ksys_unshare() new_nsproxy is leaked.

Let's call put_nsproxy() if that happens.

Link: https://lkml.kernel.org/r/20260213193959.2556730-1-mge@meta.com
Fixes: 905ae01c4ae2 ("Add a reference to ucounts for each cred")
Signed-off-by: Michal Grzedzicki <mge@meta.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexey Gladkov (Intel) <legion@kernel.org>
Cc: Ben Segall <bsegall@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoscripts/spelling.txt: add "binded||bound"
Günther Noack [Sat, 14 Feb 2026 14:08:54 +0000 (15:08 +0100)] 
scripts/spelling.txt: add "binded||bound"

The correct passive of "to bind" is "bound", not "binded".  This is often
used in the context of the BSD socket bind(2) operation.

Link: https://lkml.kernel.org/r/20260214140854.42247-1-gnoack3000@gmail.com
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoproc: array: drop stale FIXME about RCU in task_sig()
Jaime Saguillo Revilla [Sun, 15 Feb 2026 12:45:11 +0000 (12:45 +0000)] 
proc: array: drop stale FIXME about RCU in task_sig()

task_sig() already wraps the SigQ rlimit read in an explicit RCU read-side
critical section.  Drop the stale FIXME comment and keep using
task_ucounts() for the ucounts access.

No functional change.

Link: https://lkml.kernel.org/r/20260215124511.14227-1-jaime.saguillo@gmail.com
Signed-off-by: Jaime Saguillo Revilla <jaime.saguillo@gmail.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 weeks agoMerge branch 'vrf-a-few-cleanups'
Jakub Kicinski [Sat, 28 Mar 2026 04:07:26 +0000 (21:07 -0700)] 
Merge branch 'vrf-a-few-cleanups'

Ido Schimmel says:

====================
vrf: A few cleanups

Perform a few cleanups in the VRF driver. Noticed these while reviewing
a recent patch [1]. See individual patches for more details.

[1] https://lore.kernel.org/netdev/20260310105331.2371-1-lirongqing@baidu.com/
====================

Link: https://patch.msgid.link/20260326203233.1128554-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agovrf: Remove unnecessary RCU protection around dst entries
Ido Schimmel [Thu, 26 Mar 2026 20:32:33 +0000 (22:32 +0200)] 
vrf: Remove unnecessary RCU protection around dst entries

During initialization of a VRF device, the VRF driver creates two dst
entries (for IPv4 and IPv6). They are attached to locally generated
packets that are transmitted out of the VRF ports (via the
l3mdev_l3_out() hook). Their purpose is to redirect packets towards the
VRF device instead of having the packets egress directly out of the VRF
ports. This is useful, for example, when a queuing discipline is
configured on the VRF device.

In order to avoid a NULL pointer dereference, commit b0e95ccdd775 ("net:
vrf: protect changes to private data with rcu") made the pointers to the
dst entries RCU protected. As far as I can tell, this was needed because
back then the dst entries were released (and the pointers reset to NULL)
before removing the VRF ports.

Later on, commit f630c38ef0d7 ("vrf: fix bug_on triggered by rx when
destroying a vrf") moved the removal of the VRF ports to the VRF
device's dellink() callback. As such, the tear down sequence of a VRF
device looks as follows:

1. VRF ports are removed.
2. VRF device is unregistered.
    a. Device is closed.
    b. An RCU grace period passes.
    c. ndo_uninit() is called.
        i. dst entries are released.

Given the above, the Tx path will always see the same fully initialized
dst entries and will never race with the ndo_uninit() callback.

Therefore, there is no need to make the pointers to the dst entries RCU
protected. Remove it as well as the unnecessary NULL checks in the Tx
path.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260326203233.1128554-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agovrf: Use dst_dev_put() instead of using loopback device
Ido Schimmel [Thu, 26 Mar 2026 20:32:32 +0000 (22:32 +0200)] 
vrf: Use dst_dev_put() instead of using loopback device

Use dst_dev_put() to clean up the device referenced by the dst entry
instead of partially open coding it. Internally, the helper uses the
blackhole device instead of the loopback device.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260326203233.1128554-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agovrf: Remove unnecessary NULL check
Ido Schimmel [Thu, 26 Mar 2026 20:32:31 +0000 (22:32 +0200)] 
vrf: Remove unnecessary NULL check

The VRF driver always allocates an IPv4 dst entry for a VRF device and
prevents the device from being registered if the allocation fails.

Therefore, there is no need to check if the entry exists when tearing
down a VRF device. Remove the check.

Note that the same is not true for the IPv6 dst entry. Its creation can
be skipped if IPv6 is administratively disabled (i.e.,
'ipv6.disable=1').

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260326203233.1128554-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'net-stmmac-disable-eee-on-i-mx'
Jakub Kicinski [Sat, 28 Mar 2026 03:57:40 +0000 (20:57 -0700)] 
Merge branch 'net-stmmac-disable-eee-on-i-mx'

Laurent Pinchart says:

====================
net: stmmac: Disable EEE on i.MX

This small patch series fixes a long-standing interrupt storm issue with
stmmac on NXP i.MX platforms.

The initial attempt to fix^Wwork around the problem in DT ([1]) was
painfully but rightfully rejected by Russell, who helped me investigate
the issue in depth. It turned out that the root cause is a mistake in
how interrupts are wired in the SoC, a hardware bug that has been
replicated in all i.MX SoCs that integrate an stmmac. The only viable
solution is to disable EEE on those devices.

Individual patches explain the issue in more details. Patch 1/2,
authored by Russell, adds a new STMMAC_FLAG to disable EEE, and patch
2/2 sets the flag for i.MX platforms.

[1] https://lore.kernel.org/r/20251026122905.29028-1-laurent.pinchart@ideasonboard.com
====================

Link: https://patch.msgid.link/20260325210003.2752013-1-laurent.pinchart@ideasonboard.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: stmmac: imx: Disable EEE
Laurent Pinchart [Wed, 25 Mar 2026 21:00:03 +0000 (23:00 +0200)] 
net: stmmac: imx: Disable EEE

The i.MX8MP suffers from an interrupt storm related to the stmmac and
EEE. A long and tedious analysis ([1]) concluded that the SoC wires the
stmmac lpi_intr_o signal to an OR gate along with the main dwmac
interrupts, which causes an interrupt storm for two reasons.

First, there's a race condition due to the interrupt deassertion being
synchronous to the RX clock domain:

- When the PHY exits LPI mode, it restarts generating the RX clock
  (clk_rx_i input signal to the GMAC).
- The MAC detects exit from LPI, and asserts lpi_intr_o. This triggers
  the ENET_EQOS interrupt.
- Before the CPU has time to process the interrupt, the PHY enters LPI
  mode again, and stops generating the RX clock.
- The CPU processes the interrupt and reads the GMAC4_LPI_CTRL_STATUS
  registers. This does not clear lpi_intr_o as there's no clk_rx_i.

An attempt was made to fixing the issue by not stopping RX_CLK in Rx LPI
state ([2]). This alleviates the symptoms but doesn't fix the issue.
Since lpi_intr_o takes four RX_CLK cycles to clear, an interrupt storm
can still occur during that window. In 1000T mode this is harder to
notice, but slower receive clocks cause hundreds to thousands of
spurious interrupts.

Fix the issue by disabling EEE completely on i.MX8MP.

[1] https://lore.kernel.org/all/20251026122905.29028-1-laurent.pinchart@ideasonboard.com/
[2] https://lore.kernel.org/all/20251123053518.8478-1-laurent.pinchart@ideasonboard.com/

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20260325210003.2752013-3-laurent.pinchart@ideasonboard.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: stmmac: provide flag to disable EEE
Russell King (Oracle) [Wed, 25 Mar 2026 21:00:02 +0000 (23:00 +0200)] 
net: stmmac: provide flag to disable EEE

Some platforms have problems when EEE is enabled, and thus need a way
to disable stmmac EEE support. Add a flag before the other LPI related
flags which tells stmmac to avoid populating the phylink LPI
capabilities, which causes phylink to call phy_disable_eee() for any
PHY that is attached to the affected phylink instance.

iMX8MP is an example - the lpi_intr_o signal is wired to an OR gate
along with the main dwmac interrupts. Since lpi_intr_o is synchronous
to the receive clock domain, and takes four clock cycles to clear, this
leads to interrupt storms as the interrupt remains asserted for some
time after the LPI control and status register is read.

This problem becomes worse when the receive clock from the PHY stops
when the receive path enters LPI state - which means that lpi_intr_o
can not deassert until the clock restarts. Since the LPI state of the
receive path depends on the link partner, this is out of our control.
We could disable RX clock stop at the PHY, but that doesn't get around
the slow-to-deassert lpi_intr_o mentioned in the above paragraph.

Previously, iMX8MP worked around this by disabling gigabit EEE, but
this is insufficient - the problem is also visible at 100M speeds,
where the receive clock is slower.

There is extensive discussion and investigation in the thread linked
below, the result of which is summarised in this commit message.

Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Closes: https://lore.kernel.org/r/20251026122905.29028-1-laurent.pinchart@ideasonboard.com
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Link: https://patch.msgid.link/20260325210003.2752013-2-laurent.pinchart@ideasonboard.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'net-enetc-add-more-checks-to-enetc_set_rxfh'
Jakub Kicinski [Sat, 28 Mar 2026 03:56:49 +0000 (20:56 -0700)] 
Merge branch 'net-enetc-add-more-checks-to-enetc_set_rxfh'

Wei Fang says:

====================
net: enetc: add more checks to enetc_set_rxfh()

ENETC only supports Toeplitz algorithm, and VFs do not support setting
the RSS key, but enetc_set_rxfh() does not check these constraints and
silently accepts unsupported configurations. This may mislead users or
tools into believing that the requested RSS settings have been
successfully applied. So add checks to reject unsupported hash functions
and RSS key updates on VFs, and return "-EOPNOTSUPP" to user space.
====================

Link: https://patch.msgid.link/20260326075233.3628047-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>