git.ipfire.org Git - thirdparty/kernel/stable.git/log

mm/memory_hotplug: maintain N_NORMAL_MEMORY during hotplug

N_NORMAL_MEMORY is initialized from zone population at boot, but memory
hotplug currently only updates N_MEMORY. As a result, a node that gains
normal memory via hotplug can remain invisible to users iterating over
N_NORMAL_MEMORY, while a node that loses its last normal memory can stay
incorrectly marked as such.

The most visible effect is that
/sys/devices/system/node/has_normal_memory does not report a node even
after that node has gained normal memory via hotplug.

Also, list_lru-based shrinkers can undercount objects on such a node
and may skip reclaim on that node entirely, which can lead to a higher
memory footprint than expected.

Restore N_NORMAL_MEMORY maintenance directly in online_pages() and
offline_pages(). Set the bit when a node that currently lacks normal
memory onlines pages into a zone <= ZONE_NORMAL, and clear it when
offlining removes the last present pages from zones <= ZONE_NORMAL.

This restores the intended semantics without bringing back the old
status_change_nid_normal notifier plumbing which was removed in
8d2882a8edb8.

Current users that benefit include list_lru, zswap, nfsd filecache,
hugetlb_cgroup, and has_normal_memory sysfs reporting.

Link: https://lkml.kernel.org/r/20260330035941.518186-1-hao.li@linux.dev
Fixes: 8d2882a8edb8 ("mm,memory_hotplug: remove status_change_nid_normal and update documentation")
Signed-off-by: Hao Li <hao.li@linux.dev>
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/damon/sysfs: dealloc repeat_call_control if damon_call() fails

damon_call() for repeat_call_control of DAMON_SYSFS could fail if somehow
the kdamond is stopped before the damon_call(). It could happen, for
example, when te damon context was made for monitroing of a virtual
address processes, and the process is terminated immediately, before the
damon_call() invocation. In the case, the dyanmically allocated
repeat_call_control is not deallocated and leaked.

Fix the leak by deallocating the repeat_call_control under the
damon_call() failure.

This issue is discovered by sashiko [1].

Link: https://lkml.kernel.org/r/20260327003224.55752-1-sj@kernel.org
Link: https://lore.kernel.org/20260320020630.962-1-sj@kernel.org
Fixes: 04a06b139ec0 ("mm/damon/sysfs: use dynamically allocated repeat mode damon_call_control")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: reinstate unconditional writeback start in balance_dirty_pages()

Commit 64dd89ae01f2 ("mm/block/fs: remove laptop_mode") removed this
unconditional writeback start from balance_dirty_pages():

       if (unlikely(!writeback_in_progress(wb)))
       wb_start_background_writeback(wb);

This logic needs to be reinstated to prevent performance regressions for
strictlimited BDIs and memcg setups.  The problem occurs because:

a) For strictlimited BDIs, throttling is calculated using per-wb
   thresholds.  The per-wb threshold can be exceeded even when the global
   dirty threshold was not exceeded (nr_dirty < gdtc->bg_thresh)

b) For memcg-based throttling, memcg uses its own dirty count /
   thresholds and can trigger throttling even when the global threshold
   isn't exceeded

Without the unconditional writeback start, IO is throttled as it waits for
dirty pages to be written back but there is no writeback running.  This
leads to severe stalls.  On fuse, buffered write performance dropped from
1400 MiB/s to 2000 KiB/s.

Reinstate the unconditional writeback start so that writeback is
guaranteed to be running whenever IO needs to be throttled.

Link: https://lkml.kernel.org/r/20260326215127.3857682-2-joannelkoong@gmail.com
Fixes: 64dd89ae01f2 ("mm/block/fs: remove laptop_mode")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

liveupdate: propagate file deserialization failures

luo_session_deserialize() ignored the return value from
luo_file_deserialize(). As a result, a session could be left partially
restored even though the /dev/liveupdate open path treats deserialization
failures as fatal.

Propagate the error so a failed file deserialization aborts session
deserialization instead of silently continuing.

Link: https://lkml.kernel.org/r/20260325044608.8407-1-leotimmins1974@gmail.com
Link: https://lkml.kernel.org/r/20260325044608.8407-2-leotimmins1974@gmail.com
Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: filemap: fix nr_pages calculation overflow in filemap_map_pages()

When running stress-ng on my Arm64 machine with v7.0-rc3 kernel, I
encountered some very strange crash issues showing up as "Bad page state":

"
[  734.496287] BUG: Bad page state in process stress-ng-env  pfn:415735fb
[  734.496427] page: refcount:0 mapcount:1 mapping:0000000000000000 index:0x4cf316 pfn:0x415735fb
[  734.496434] flags: 0x57fffe000000800(owner_2|node=1|zone=2|lastcpupid=0x3ffff)
[  734.496439] raw: 057fffe000000800 0000000000000000 dead000000000122 0000000000000000
[  734.496440] raw: 00000000004cf316 0000000000000000 0000000000000000 0000000000000000
[  734.496442] page dumped because: nonzero mapcount
"

After analyzing this page’s state, it is hard to understand why the
mapcount is not 0 while the refcount is 0, since this page is not where
the issue first occurred.  By enabling the CONFIG_DEBUG_VM config, I can
reproduce the crash as well and captured the first warning where the issue
appears:

"
[  734.469226] page: refcount:33 mapcount:0 mapping:00000000bef2d187 index:0x81a0 pfn:0x415735c0
[  734.469304] head: order:5 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[  734.469315] memcg:ffff000807a8ec00
[  734.469320] aops:ext4_da_aops ino:100b6f dentry name(?):"stress-ng-mmaptorture-9397-0-2736200540"
[  734.469335] flags: 0x57fffe400000069(locked|uptodate|lru|head|node=1|zone=2|lastcpupid=0x3ffff)
......
[  734.469364] page dumped because: VM_WARN_ON_FOLIO((_Generic((page + nr_pages - 1),
const struct page *: (const struct folio *)_compound_head(page + nr_pages - 1), struct page *:
(struct folio *)_compound_head(page + nr_pages - 1))) != folio)
[  734.469390] ------------[ cut here ]------------
[  734.469393] WARNING: ./include/linux/rmap.h:351 at folio_add_file_rmap_ptes+0x3b8/0x468,
CPU#90: stress-ng-mlock/9430
[  734.469551]  folio_add_file_rmap_ptes+0x3b8/0x468 (P)
[  734.469555]  set_pte_range+0xd8/0x2f8
[  734.469566]  filemap_map_folio_range+0x190/0x400
[  734.469579]  filemap_map_pages+0x348/0x638
[  734.469583]  do_fault_around+0x140/0x198
......
[  734.469640]  el0t_64_sync+0x184/0x188
"

The code that triggers the warning is: "VM_WARN_ON_FOLIO(page_folio(page +
nr_pages - 1) != folio, folio)", which indicates that set_pte_range()
tried to map beyond the large folio’s size.

By adding more debug information, I found that 'nr_pages' had overflowed
in filemap_map_pages(), causing set_pte_range() to establish mappings for
a range exceeding the folio size, potentially corrupting fields of pages
that do not belong to this folio (e.g., page->_mapcount).

After above analysis, I think the possible race is as follows:

CPU 0                                                  CPU 1
filemap_map_pages()                                   ext4_setattr()
   //get and lock folio with old inode->i_size
   next_uptodate_folio()

                                                          .......
                                                          //shrink the inode->i_size
                                                          i_size_write(inode, attr->ia_size);

   //calculate the end_pgoff with the new inode->i_size
   file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
   end_pgoff = min(end_pgoff, file_end);

   ......
   //nr_pages can be overflowed, cause xas.xa_index > end_pgoff
   end = folio_next_index(folio) - 1;
   nr_pages = min(end, end_pgoff) - xas.xa_index + 1;

   ......
   //map large folio
   filemap_map_folio_range()
                                                          ......
                                                          //truncate folios
                                                          truncate_pagecache(inode, inode->i_size);

To fix this issue, move the 'end_pgoff' calculation before
next_uptodate_folio(), so the retrieved folio stays consistent with the
file end to avoid 'nr_pages' calculation overflow.  After this patch, the
crash issue is gone.

Link: https://lkml.kernel.org/r/1cf1ac59018fc647a87b0dad605d4056a71c14e4.1773739704.git.baolin.wang@linux.alibaba.com
Fixes: 743a2753a02e ("filemap: cap PTE range to be created to allowed zero fill in folio_map_range()")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reported-by: Yuanhe Shu <xiangzao@linux.alibaba.com>
Tested-by: Yuanhe Shu <xiangzao@linux.alibaba.com>
Acked-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Luis Chamberalin <mcgrof@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

sched_ext: Documentation: Add ops.dequeue() to task lifecycle

Document ops.dequeue() in the sched_ext task lifecycle now that its
semantics are well-defined.

Also update the pseudo-code to use task_is_runnable() consistently and
clarify the case where ops.dispatch() does not refill the time slice.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing

scx_alloc_free_idx() zeroes the payload of a freed arena allocation
one word at a time. The loop bound was alloc->pool.elem_size / 8, but
elem_size includes sizeof(struct sdt_data) (the 8-byte union sdt_id
header). This caused the loop to write one extra u64 past the
allocation, corrupting the tid field of the adjacent pool element.

Fix the loop bound to (elem_size - sizeof(struct sdt_data)) / 8 so
only the payload portion is zeroed.

Test plan:
- Add a temporary sanity check in scx_task_free() before the free call:

  if (mval->data->tid.idx != mval->tid.idx)
      scx_bpf_error("tid corruption: arena=%d storage=%d",
                    mval->data->tid.idx, (int)mval->tid.idx);

- stress-ng --fork 100 -t 10 & sudo ./build/bin/scx_sdt

Without this fix, running scx_sdt under fork-heavy load triggers the
corruption error. With the fix applied, the same workload completes
without error.

Fixes: 36929ebd17ae ("tools/sched_ext: add arena based scheduler")
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

selftests/nolibc: only use libgcc when really necessary

nolibc should work without libgcc to be compatible with as many
toolchains as possible. Currently the functionality tested by
nolibc-test does not contain any dependencies, make sure it stays
this way by not linking libgcc anymore.

On the ppc target GCC always emits references to '_restgpr_' functions,
so keep linking libgcc there.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260404-nolibc-libgcc-v1-1-eb3ecfe0e176@weissschuh.net

selftests/nolibc: test the memory allocator

The memory allocator has not seen any testing so far.

Add a simple testcase for it.

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/adDRK8D6YBZgv36H@1wt.eu/
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260404-nolibc-asprintf-v2-2-17d2d0df9763@weissschuh.net

tools/nolibc: check for overflow in calloc() without divisions

On some architectures without native division instructions
the division can generate calls into libgcc/compiler-rt.
This library might not be available, so its use should be avoided.

Use the compiler builtin to check for overflows without needing a
division. The builtin has been available since GCC 3 and clang 3.8.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260404-nolibc-asprintf-v2-1-17d2d0df9763@weissschuh.net

tools/nolibc: add support for asprintf()

Add support for dynamically allocating formatted strings through
asprintf() and vasprintf().

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260401-nolibc-asprintf-v1-3-46292313439f@weissschuh.net

PCI/NPEM: Set LED_HW_PLUGGABLE for hotplug-capable ports

NPEM registers LED classdevs on PCI endpoint that may be behind
hotplug-capable ports. During hot-removal, led_classdev_unregister() calls
led_set_brightness(LED_OFF) which leads to a PCI config read to a
disconnected device, which fails and returns -ENODEV (topology details in
msgid.link below):

leds 0003:01:00.0:enclosure:ok: Setting an LED's brightness failed (-19)

The LED core already suppresses this for devices with LED_HW_PLUGGABLE set,
but NPEM never sets it. Add the flag since NPEM LEDs are on hot-pluggable
hardware by nature.

Fixes: 4e893545ef87 ("PCI/NPEM: Add Native PCIe Enclosure Management support")
Signed-off-by: Richard Cheng <icheng@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Kai-Heng Feng <kaihengf@nvidia.com>
Link: https://patch.msgid.link/20260402093850.23075-1-icheng@nvidia.com

cpupower: remove extern declarations in cmd functions

extern char *optarg and extern int optind, opterr, optopt are
already declared by <getopt.h>, which is included at the top of
the file. Repeating extern declarations inside a function body
is misleading and unnecessary.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

Merge tag 'soc-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
"The largest part here are devicetree fixes for Qualcomm, and NXP i.MX,
  addressing a few regressions and incorrect settings in board and SoC
  pecific dts files.

  The largest single commits are a revert of a cleanup patch for i.MX
  that caused regressions for the NAND flash controller and a fixup for
  an incomplete cleanup of the PCIe controller on Qualcomm platforms
  that broke because the state was left incompatible with both the old
  and new behavior.

  On the Rockchips, Hisilicon, Renesas, Allwinner and AT91 platforms,
  only a single simple dts bugfix each was added since the last round of
  fixes.

  On the SoC specific device drivers, everything is relatively harmless:
  three reset controller driver fixes, a compatibility for fix ASpeed
  soc ID, and error handling fixes for Qualcomm and Microchip. One
  regression fix on Qualcomm addresses a problem with a previous fix for
  DisplayPort alt mode"

* tag 'soc-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (32 commits)
  arm64: dts: qcom: hamoa: Fix incomplete Root Port property migration
  dt-bindings: display/msm: qcm2290-mdss: Fix missing ranges in example
  firmware: microchip: fail auto-update probe if no flash found
  arm64: dts: renesas: sparrow-hawk: Reserve first 128 MiB of DRAM
  arm64: dts: qcom: agatti: Fix IOMMU DT properties
  dt-bindings: media: venus: Fix iommus property
  dt-bindings: display: msm: qcm2290-mdss: Fix iommus property
  arm64: dts: allwinner: sun55i: Fix r-spi DMA
  reset: spacemit: k3: Decouple composite reset lines
  reset: gpio: fix double free in reset_add_gpio_aux_device() error path
  ARM: dts: microchip: sam9x7: fix gpio-lines count for pioB
  arm64: dts: hisilicon: hi3798cv200: Add missing dma-ranges
  arm64: dts: hisilicon: poplar: Correct PCIe reset GPIO polarity
  reset: rzg2l-usbphy-ctrl: Fix malformed MODULE_AUTHOR string
  soc: microchip: mpfs-mss-top-sysreg: Fix resource leak on driver unbind
  soc: microchip: mpfs-control-scb: Fix resource leak on driver unbind
  soc: qcom: pmic_glink_altmode: Fix TBT->SAFE->!TBT transition
  arm64: dts: qcom: monaco: Reserve full Gunyah metadata region
  arm64: dts: imx8mq-librem5: Bump BUCK1 suspend voltage up to 0.85V
  Revert "arm64: dts: imx8mq-librem5: Set the DVS voltages lower"
  ...

PCI: imx6: Fix reference clock source selection for i.MX95

In the PCIe PHY init for the i.MX95, the reference clock source selection
uses a conditional instead of always passing the mask. This currently
breaks functionality if the internal refclk is used.

To fix this issue, always pass IMX95_PCIE_REF_USE_PAD as the mask and clear
bit if external refclk is not used. This essentially swaps the parameters.

Fixes: d8574ce57d76 ("PCI: imx6: Add external reference clock input mode support")
Signed-off-by: Franz Schnyder <franz.schnyder@toradex.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260325093118.684142-1-fra.schnyder@gmail.com

PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM

pcie_tph_get_cpu_st() uses the Query Cache Locality Features _DSM [1]
to retrieve the TPH Steering Tag for memory associated with the CPU
identified by its "cpu_uid" parameter, a Linux logical CPU ID.

The _DSM requires an ACPI Processor UID, which pcie_tph_get_cpu_st()
previously assumed was the same as the Linux logical CPU ID. This is
true on x86 but not on arm64, so pcie_tph_get_cpu_st() returned the
wrong Steering Tag, resulting in incorrect TPH functionality on arm64.

Convert the Linux logical CPU ID to the ACPI Processor UID with
acpi_get_cpu_uid() before passing it to the _DSM. Additionally, rename
the pcie_tph_get_cpu_st() parameter from "cpu_uid" to "cpu" to reflect
that it represents a logical CPU ID (not an ACPI Processor UID).

[1] According to ECN_TPH-ST_Revision_20200924
    (https://members.pcisig.com/wg/PCI-SIG/document/15470), the input
    is defined as: "If the target is a processor, then this field
    represents the ACPI Processor UID of the processor as specified in
    the MADT. If the target is a processor container, then this field
    represents the ACPI Processor UID of the processor container as
    specified in the PPTT."

Fixes: d2e8a34876ce ("PCI/TPH: Add Steering Tag support")
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260401081640.26875-9-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu()

Update acpi/pptt.c to use acpi_get_cpu_uid() and remove unused
get_acpi_id_for_cpu() from arm64/loongarch/riscv, completing PPTT's
migration to the unified ACPI CPU UID interface

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-8-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu()

Update arm_cspmu to use acpi_get_cpu_uid() instead of
get_acpi_id_for_cpu(), aligning with unified ACPI CPU UID interface.

No functional changes are introduced by this switch (valid inputs retain
original behavior).

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-7-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h

Centralize acpi_get_cpu_uid() in include/linux/acpi.h (global scope) and
remove arch-specific declarations from arm64/loongarch/riscv/x86
asm/acpi.h. This unifies the interface across architectures and
simplifies maintenance by eliminating duplicate prototypes.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-6-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval

As a step towards unifying the interface for retrieving ACPI CPU UID
across architectures, introduce a new function acpi_get_cpu_uid() for
x86. While at it, add input validation to make the code more robust.

Update Xen-related code to use acpi_get_cpu_uid() instead of the legacy
cpu_acpi_id() function, and remove the now-unused cpu_acpi_id() to clean
up redundant code.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://patch.msgid.link/20260401081640.26875-5-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval

As a step towards unifying the interface for retrieving ACPI CPU UID
across architectures, introduce a new function acpi_get_cpu_uid() for
riscv. While at it, add input validation to make the code more robust.

And also update acpi_numa.c and rhct.c to use the new interface instead
of the legacy get_acpi_id_for_cpu().

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-4-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval

As a step towards unifying the interface for retrieving ACPI CPU UID
across architectures, introduce a new function acpi_get_cpu_uid() for
loongarch. While at it, add input validation to make the code more
robust.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-3-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval

As a step towards unifying the interface for retrieving ACPI CPU UID
across architectures, introduce a new function acpi_get_cpu_uid() for
arm64. While at it, add input validation to make the code more robust.

Reimplement get_cpu_for_acpi_id() based on acpi_get_cpu_uid() for
consistency, and move its implementation next to the new function for
code coherence.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://patch.msgid.link/20260401081640.26875-2-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

remoteproc: qcom: pas: Add Eliza ADSP support

The ADSP found on Eliza SoC is similar to the one found on SM8550.
So just add the dedicated compatible for Eliza ADSP and reuse the
SM8550 resource configuration.

Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260327-eliza-remoteproc-adsp-v1-2-1c46c5e5f809@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler

Add support for decoding NVIDIA-specific CPER sections delivered via
the APEI GHES vendor record notifier chain. NVIDIA hardware generates
vendor-specific CPER sections containing error signatures and diagnostic
register dumps. This implementation registers a notifier_block with the
GHES vendor record notifier and decodes these sections, printing error
details via dev_info().

The driver binds to ACPI device NVDA2012, present on NVIDIA server
platforms. The NVIDIA CPER section contains a fixed header with error
metadata (signature, error type, severity, socket) followed by
variable-length register address-value pairs for hardware diagnostics.

This work is based on libcper [1].

Example output:
nvidia-ghes NVDA2012:00: NVIDIA CPER section, error_data_length: 544
nvidia-ghes NVDA2012:00: signature: CMET-INFO
nvidia-ghes NVDA2012:00: error_type: 0
nvidia-ghes NVDA2012:00: error_instance: 0
nvidia-ghes NVDA2012:00: severity: 3
nvidia-ghes NVDA2012:00: socket: 0
nvidia-ghes NVDA2012:00: number_regs: 32
nvidia-ghes NVDA2012:00: instance_base: 0x0000000000000000
nvidia-ghes NVDA2012:00: register[0]: address=0x8000000100000000 value=0x0000000100000000

https://github.com/openbmc/libcper/commit/683e055061ce [1]
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260330094203.38022-4-kaihengf@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

PCI: hisi: Use devm_ghes_register_vendor_record_notifier()

Switch to the device-managed variant so the notifier is automatically
unregistered on device removal, allowing the open-coded remove callback
to be dropped entirely.

Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20260330094203.38022-3-kaihengf@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()

Add a device-managed wrapper around ghes_register_vendor_record_notifier()
so drivers can avoid manual cleanup on device removal or probe failure.

Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://patch.msgid.link/20260330094203.38022-2-kaihengf@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

dt-bindings: remoteproc: qcom,milos-pas: Document Eliza ADSP

Since the devicetree bindings are exactly the same between Eliza ADSP and
Milos ADSP, reuse the existing Milos schema, just add the Eliza specific
ADSP compatible.

Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260327-eliza-remoteproc-adsp-v1-1-1c46c5e5f809@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

remoteproc: qcom: Add missing space before closing bracket

Add missing space before closing curly bracket for qcom_q6v5_mss and
qcom_q6v5_pas driver of_match[] lines, so that all qcom remoteproc
drivers are consistent on the common coding style.

Signed-off-by: Shawn Guo <shengchao.guo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260306145607.1394878-1-shengchao.guo@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

dt-bindings: remoteproc: qcom: Drop types for firmware-name

The type of firmware-name is already defined by core schemas. Drop it
from individual bindings that have either a redundant definition or
an override as string type. For the later cases, constrain the number
of expected firmware names to 1.

Signed-off-by: Shawn Guo <shengchao.guo@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260309123357.1911586-1-shengchao.guo@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

remoteproc: qcom: Fix minidump out-of-bounds access on subsystems array

MAX_NUM_OF_SS was hardcoded to 10 in the minidump_global_toc struct,
which is a direct overlay on an SMEM item allocated by the firmware.
Newer Qualcomm SoC firmware allocates space for more subsystems, while
older firmware only allocates space for 10. Bumping the constant would
cause Linux to read/write beyond the SMEM item boundary on older
platforms.

Fix this by converting subsystems[] to a flexible array member and
deriving the actual number of subsystems at runtime from the size
returned by qcom_smem_get(). Add a bounds check on minidump_id against
the derived count before indexing into the array.

Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Acked-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260331171243.1962067-1-mukesh.ojha@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

hwspinlock: remove now unused pdata from header file

The last user turned out to be obsolete and was removed. Remove the
unused struct now, too.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Link: https://lore.kernel.org/r/20260401071141.4718-3-wsa+renesas@sang-engineering.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

hwspinlock: u8500: delete driver

The U8500 platform was converted to DT around 2013 and is DT only
meanwhile. This driver has never been converted to a DT driver, so it
clearly hasn't been used since then. To ease upcoming refactoring in the
hwspinlock subsystem, remove this obsolete driver.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Link: https://lore.kernel.org/r/20260401071141.4718-2-wsa+renesas@sang-engineering.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

ACPI: tables: Enable FPDT on LoongArch

FPDT provides system- and application-readable performance statistics,
useful for profiling and analyzing boot-time performance. FPDT table
support is now available as a pending patch at the EDK II upstream [1]
and has been tested on real hardware such as Loongson XA61200_V1.1 and
XB612B0_V1.2 with patched firmware.

We have also cross checked systemd-analyze(1) against a stop watch and
the `dp' command in EFI Shell to see that the timing information are
correct.

Now that the functionality of FPDT is verified on LoongArch hardware,
list LOONGARCH as a possible dependency, allowing it to be enabled.

Link: https://github.com/tianocore/edk2/pull/12378
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260401135311.1737958-2-xry111@xry111.site
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

media: platform: mtk-mdp3: Constify buffer passed to mdp_vpu_sendmsg()

mdp_vpu_sendmsg() passes the buffer to scp_ipi_send(), which takes now
pointer to const, so adjust this interface as well for increased code
safety and code readability.

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260317-rpmsg-send-const-v3-5-4d7fd27f037f@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

ASoC: qcom: Constify GPR packet being send over GPR interface

gpr_send_pkt() and pkt_router_send_svc_pkt() only send the GPR packet
they receive, without any need to actually modify it, so mark the
pointer to GPR packet as pointer to const for code safety and code
self-documentation. Several users of this interface can follow up and
also operate on pointer to const.

Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260317-rpmsg-send-const-v3-4-4d7fd27f037f@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

rpmsg: Constify buffer passed to send API

The rpmsg_send(), rpmsg_sendto() and other variants of sending
interfaces should only send the passed data, without modifying its
contents, so mark pointer 'data' as pointer to const. All users of this
interface already follow this approach, so only the function
declarations have to be updated.

Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260317-rpmsg-send-const-v3-3-4d7fd27f037f@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

remoteproc: mtk_scp: Constify buffer passed to scp_send_ipi()

scp_send_ipi() should only send the passed buffer, without modifying its
contents, so mark pointer 'buf' as pointer to const.

Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260317-rpmsg-send-const-v3-2-4d7fd27f037f@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

remoteproc: mtk_scp_ipi: Constify buffer passed to scp_ipi_send()

scp_ipi_send() should only send the passed buffer, without modifying its
contents, so mark pointer 'buf' as pointer to const.

Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260317-rpmsg-send-const-v3-1-4d7fd27f037f@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

selftests: ublk: test that teardown after incomplete recovery completes

Before the fix, teardown of a ublk server that was attempting to recover
a device, but died when it had submitted a nonempty proper subset of the
fetch commands to any queue would loop forever. Add a test to verify
that, after the fix, teardown completes. This is done by:

- Adding a new argument to the fault_inject target that causes it die
  after fetching a nonempty proper subset of the IOs to a queue
- Using that argument in a new test while trying to recover an
  already-created device
- Attempting to delete the ublk device at the end of the test; this
  hangs forever if teardown from the fault-injected ublk server never
  completed.

It was manually verified that the test passes with the fix and hangs
without it.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://patch.msgid.link/20260405-cancel-v2-2-02d711e643c2@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ublk: reset per-IO canceled flag on each fetch

If a ublk server starts recovering devices but dies before issuing fetch
commands for all IOs, cancellation of the fetch commands that were
successfully issued may never complete. This is because the per-IO
canceled flag can remain set even after the fetch for that IO has been
submitted - the per-IO canceled flags for all IOs in a queue are reset
together only once all IOs for that queue have been fetched. So if a
nonempty proper subset of the IOs for a queue are fetched when the ublk
server dies, the IOs in that subset will never successfully be canceled,
as their canceled flags remain set, and this prevents ublk_cancel_cmd
from actually calling io_uring_cmd_done on the commands, despite the
fact that they are outstanding.

Fix this by resetting the per-IO cancel flags immediately when each IO
is fetched instead of waiting for all IOs for the queue (which may never
happen).

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Fixes: 728cbac5fe21 ("ublk: move device reset into ublk_ch_release()")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: zhang, the-essence-of-life <zhangweize9@gmail.com>
Link: https://patch.msgid.link/20260405-cancel-v2-1-02d711e643c2@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'opp-updates-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Pull OPP updates for 7.1 from Viresh Kumar:

"- Use performance level if available to distinguish between rates in
   debugfs (Manivannan Sadhasivam).

- Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)."

* tag 'opp-updates-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  OPP: Move break out of scoped_guard in dev_pm_opp_xlate_required_opp()
  OPP: debugfs: Use performance level if available to distinguish between rates

Merge tag 'cpufreq-arm-updates-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Pull CPUFreq Arm updates for 7.1 from Viresh Kumar:

"- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa).

- Update cpufreq-dt-platdev blocklist (Faruque Ansari).

- Minor updates to driver and dt-bindings for Tegra (Thierry Reding and
   Rosen Penev).

- Add MAINTAINERS entry for CPPC driver (Viresh Kumar)."

* tag 'cpufreq-arm-updates-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: tegra194: remove COMPILE_TEST
  cpufreq: Add QCS8300 to cpufreq-dt-platdev blocklist
  cpufreq: Add MAINTAINERS entry for CPPC driver
  cpufreq: tegra194: Rename Tegra239 to Tegra238
  dt-bindings: arm: nvidia: Document the Tegra238 CCPLEX cluster
  dt-bindings: cpufreq: qcom-hw: document Eliza cpufreq hardware

batman-adv: hold claim backbone gateways by reference

batadv_bla_add_claim() can replace claim->backbone_gw and drop the old
gateway's last reference while readers still follow the pointer.

The netlink claim dump path dereferences claim->backbone_gw->orig and
takes claim->backbone_gw->crc_lock without pinning the underlying
backbone gateway. batadv_bla_check_claim() still has the same naked
pointer access pattern.

Reuse batadv_bla_claim_get_backbone_gw() in both readers so they operate
on a stable gateway reference until the read-side work is complete.
This keeps the dump and claim-check paths aligned with the lifetime
rules introduced for the other BLA claim readers.

Fixes: 23721387c409 ("batman-adv: add basic bridge loop avoidance code")
Fixes: 04f3f5bf1883 ("batman-adv: add B.A.T.M.A.N. Dump BLA claims via netlink")
Cc: stable@vger.kernel.org
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Co-developed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Yuan Tan <yuantan098@gmail.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Ao Zhou <n05ec@lzu.edu.cn>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>

regulator: dt-bindings: qcom,qca6390-pmu: Document WCN6755 PMU

Document the WCN6755 PMU using a fallback to WCN6750 since the two chips
seem to be completely pin and software compatible. In fact the original
downstream kernel just pretends the WCN6755 is a WCN6750.

Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260403-milos-fp6-bt-wifi-v2-1-393322b27c5f@fairphone.com
Signed-off-by: Mark Brown <broonie@kernel.org>

regulator: dt-bindings: regulator-max77620: convert to DT schema

Convert regulator-max77620 devicetree bindings for the MAX77620 PMIC from
TXT to YAML format. This patch does not change any functionality; the
bindings remain the same.

Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260406075114.25672-2-clamor95@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: soc.h: remove unused card->pmdown_time

commit f0fba2ad1b6b ("ASoC: multi-component - ASoC Multi-Component
Support") has replaced "card->pmdown_time" to "rtd->pmdown_time".
card->pmdown_time has been not used this 15 years. Let's remove it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87eckstz49.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: SOF: Intel: Fixes for find_acpi_adr_device() when some endpoints are missing

Bard Liao <yung-chuan.liao@linux.intel.com> says:

To make sure find_acpi_adr_device can work well when some of the
endpoints are missing and do not map 1:1 to codec_info_list.

ASoC: SOF: Intel: fix iteration in is_endpoint_present()

is_endpoint_present() iterates over sdca_data.num_functions, but checks
the dai_type according to codec info list, which will cause problems if
not all endpoints from the codec info list are present. Make sure the
type of actually present functions is compared against target dai_type.

Fixes: 5226d19d4cae ("ASoC: SOF: Intel: use sof_sdw as default SDW machine driver")
Signed-off-by: Maciej Strozek <mstrozek@opensource.cirrus.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260402064531.2287261-3-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: SOF: Intel: Fix endpoint index if endpoints are missing

In case of missing endpoints, the sequential numbering will cause wrong
mapping. Instead, assign the original DAI index from codec_info_list.

Fixes: 5226d19d4cae ("ASoC: SOF: Intel: use sof_sdw as default SDW machine driver")
Signed-off-by: Maciej Strozek <mstrozek@opensource.cirrus.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260402064531.2287261-2-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: SDCA: Fix errors in IRQ cleanup

IRQs are enabled through sdca_irq_populate() from component probe
using devm_request_threaded_irq(), this however means the IRQs can
persist if the sound card is torn down. Some of the IRQ handlers
store references to the card and the kcontrols which can then
fail. Some detail of the crash was explained in [1].

Generally it is not advised to use devm outside of bus probe, so
the code is updated to not use devm. The IRQ requests are not moved
to bus probe time as it makes passing the snd_soc_component into
the IRQs very awkward and would the require a second step once the
component is available, so it is simpler to just register the IRQs
at this point, even though that necessitates some manual cleanup.

Link: https://lore.kernel.org/linux-sound/20260310183829.2907805-1-gaggery.tsai@intel.com/
Fixes: b126394d9ec6 ("ASoC: SDCA: Generic interrupt support")
Reported-by: Gaggery Tsai <gaggery.tsai@intel.com>
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260316141449.2950215-1-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>

MIPS: dts: loongson64g-package: Switch to Loongson UART driver

Loongson64g is Loongson 3A4000, whose UART controller is compatible with
Loongson 2K1500, which is NS16550A-compatible with an additional
fractional frequency divisor register.

Update the compatible strings to reflect this, so that 3A4000 can
benefit from the fractional frequency divisor provided by loongson-uart.
This is required on some devices, otherwise their UART can't work at
some high baud rates, e.g., 115200.

Tested on Loongson-LS3A4000-7A1000-NUC-SE with a 25MHz UART clock.
Without fractional frequency divisor, the actual baud rate was 111607
(25MHz / 16 / 14, measured value: 111545) and some USB-to-UART
converters couldn't work with it at all. With fractional frequency
divisor, the measured baud rate becomes 115207, which is quite accurate.

Signed-off-by: Rong Zhang <rongrong@oss.cipunited.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

ASoC: SOF: compress: return the configured codec from get_params

The SOF compressed offload path accepts codec parameters in
sof_compr_set_params() and forwards them to firmware as
extended data in the SOF IPC stream params message.

However, sof_compr_get_params() still returns success without
filling the snd_codec structure. Since the compress core allocates
that structure zeroed and copies it back to userspace on success,
SNDRV_COMPRESS_GET_PARAMS returns an all-zero codec description
even after the stream has been configured successfully.

The stale TODO in this callback conflates get_params() with capability
discovery. Supported codec enumeration belongs in get_caps() and
get_codec_caps(). get_params() should report the current codec settings.

Cache the codec accepted by sof_compr_set_params() in the per-stream SOF
compress state and return it from sof_compr_get_params().

Fixes: 6324cf901e14 ("ASoC: SOF: compr: Add compress ops implementation")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260325-sof-compr-get-params-v1-1-0758815f13c7@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: amd: acp: add Lenovo P16s G5 AMD quirk for legacy SDW machine

Add a DMI quirk entry for Lenovo P16s G5 AMD to use ASOC_SDW_ACP_DMIC.
Needed to allow the microphone to work on this platform

Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Link: https://patch.msgid.link/20260403010336.1223078-1-mpearson-lenovo@squebb.ca
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: dt-bindings: ti,tas2552: Add sound-dai-cells

Add missing sound-sai-cells for this codec into schema.
At the same time, drop trailing spaces from description.

Fixes: 506e0825a4c9 ("ASoC: dt-bindings: Convert ti,tas2552 to DT schema")
Signed-off-by: Marek Vasut <marex@nabladev.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260405234502.154227-1-marex@nabladev.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: audioreach: explicitly enable speaker protection modules

Speaker protection and VI feedback modules are disabled by default.
Explicitly enable them when configuring speaker protection.

Fixes: 3e43a8c033c3 ("ASoC: qcom: audioreach: Add support for VI Sense module")
Fixes: 0db76f5b2235 ("ASoC: qcom: audioreach: Add support for Speaker Protection module")
Signed-off-by: Ravi Hothi <ravi.hothi@oss.qualcomm.com>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260326113531.3144998-1-ravi.hothi@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

mips: pci-mt7620: rework initialization procedure

Move the reset operation to the common part to reduce the code
redundancy. They are actually the same and needed for all SoCs.
Disabling power and clock are unnecessary for MT7620 and will be
removed. In vendor SDK, it's used to save the power when the PCI
driver is not selected. The MT7628 GPIO pinctrl has been removed
because this should be done in device-tree. Some delay intervals
have also been increased to follow the recommendations of the SoC
SDK and datasheet. Tested on both MT7620 and MT7628.

Signed-off-by: Shiji Yang <yangshiji66@outlook.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

mips: pci-mt7620: add more register init values

These missing register init values are ported from the vendor SDK.
It should have some stability enhancements. Tested on both MT7620
and MT7628.

Signed-off-by: Shiji Yang <yangshiji66@outlook.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

mips: pci-mt7620: fix bridge register access

Host bridge registers and PCI RC control registers have different
memory base. pcie_m32() is used to write the RC control registers
instead of bridge registers. This patch introduces bridge_m32()
and use it to operate bridge registers to fix the access issue.

Signed-off-by: Shiji Yang <yangshiji66@outlook.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

mips: dts: Add PCIe to EcoNet EN751221

Add PCIe based on EN7528 PCIe driver, also add two MT76 wifi devices
to SmartFiber XP8421-B.

Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: mobileye: eyeq5-epm: add two Cadence GEM Ethernet PHYs

The Mobileye EyeQ5 eval board (EPM) embeds two MDIO PHYs.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: mobileye: eyeq5: add two Cadence GEM Ethernet controllers

Add both MACB/GEM instances found in the Mobileye EyeQ5 SoC.

Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

dt-bindings: soc: mobileye: OLB is an Ethernet PHY provider on EyeQ5

OLB on EyeQ5 ("mobileye,eyeq5-olb" compatible) is now declared as a
generic PHY provider. Under the hood, it provides Ethernet RGMII/SGMII
PHY support for both MAC instances.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

ASoC: rt5640: Handle 0Hz sysclk during stream shutdown

Commit 2458adb8f92a ("SoC: simple-card-utils: set 0Hz to sysclk when
shutdown") sends a 0Hz sysclk request during stream shutdown to clear
codec rate constraints. The rt5640 codec forwards this 0Hz to
clk_set_rate(), which can cause clock controller firmware faults on
platforms where MCLK is SoC-driven (e.g. Tegra) and 0Hz falls below
the hardware minimum rate.

Handle the 0Hz case by clearing the internal sysclk state and
returning early, avoiding the invalid clk_set_rate() call.

Signed-off-by: Sheetal <sheetal@nvidia.com>
Link: https://patch.msgid.link/20260406090547.988966-1-sheetal@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>

MIPS: DEC: Rate-limit memory errors for non-KN01 parity systems

Similarly to memory errors in ECC systems also rate-limit memory parity
errors for KN02-BA, KN02-CA, KN04-BA, KN04-CA DECstation and DECsystem
models. Unlike with ECC these events are always fatal and are less
likely to cause a message flood, but handle them the same way for
consistency.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: DEC: Rate-limit memory errors for KN01 systems

Similarly to memory errors in ECC systems also rate-limit memory parity
errors for KN01 DECstation and DECsystem models. Unlike with ECC these
events are always fatal and are less likely to cause a message flood,
but handle them the same way for consistency.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: DEC: Rate-limit memory errors for ECC systems

Prevent the system from becoming unusable due to a flood of memory error
messages with DECstation and DECsystem models using ECC, that is KN02,
KN03 and KN05 systems.  It seems common for gradual oxidation of memory
module contacts to cause memory errors to eventually develop and while
ECC takes care of correcting them and the system affected can continue
operating normally until the contacts have been cleaned, the unlimited
messages make the system spend all its time on producing them, therefore
preventing it from being used.

Rate-limiting removes the load from the system and enables its normal
operation, e.g.:

Bus error interrupt: CPU memory read ECC error at 0x139cfb04
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU partial memory write ECC error at 0x138c1f5c
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU partial memory write ECC error at 0x138c1f6c
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU memory read ECC error at 0x139cff64
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU memory read ECC error at 0x136af00c
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU memory read ECC error at 0x136af044
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU memory read ECC error at 0x136af0cc
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU memory read ECC error at 0x136af0cc
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU memory read ECC error at 0x136af0e4
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
Bus error interrupt: CPU memory read ECC error at 0x136af104
  ECC syndrome 0x54 -- corrected single bit error at data bit D3
dec_ecc_be_backend: 34455 callbacks suppressed

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: kernel: Remove $0 clobber from `mult_sh_align_mod'

Remove rubbish $0 clobber added to inline asm within `mult_sh_align_mod'
with the removal of support for GCC versions below 3.4 made with commit
57810ecb581a ("MIPS: Remove GCC_IMM_ASM & GCC_REG_ACCUM macros").

Previously a macro was used that, depending on GCC version, expanded to
either `accum' or $0. Since the latter choice was only a placeholder to
keep the syntax consistent and the register referred is hardwired, there
is no point in having it here as it has no effect on code generation.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

arch/mips: Drop CONFIG_FIRMWARE_EDID from defconfig files

CONFIG_FIRMWARE_EDID=y depends on X86 or EFI_GENERIC_STUB. Neither is
true here, so drop the lines from the defconfig files.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

mei: me: add nova lake point H DID

Add Nova Lake H device id.

Cc: stable <stable@kernel.org>
Co-developed-by: Tomas Winkler <tomasw@gmail.com>
Signed-off-by: Tomas Winkler <tomasw@gmail.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Link: https://patch.msgid.link/20260405141758.1634556-1-alexander.usyskin@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mei: lb: add late binding version 2

The second Late Binding version allows to send payload bigger
than client MTU by splitting it to chunks and uses separate
firmware client for transfer.

The component interface is unchanged and driver doing all splitting.

Only one Late Binding version is supported by firmware.
When Late binding version 2 is supported, the new client is advertised
by firmware and existing MKHI will have version 2.
This helps driver to select the right mode of work.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Link: https://patch.msgid.link/20260405112326.1535208-3-alexander.usyskin@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mei: bus: add mei_cldev_uuid

Add mei_cldev_uuid API on mei bus to allow client
to query what UUID it bound to.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Link: https://patch.msgid.link/20260405112326.1535208-2-alexander.usyskin@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: gusmax: add ISA suspend and resume callbacks

gusmax still leaves its ISA PM callbacks disabled even though the shared
GF1 suspend and resume path now exists.

This board needs one extra piece of PM glue around the shared GF1 helpers.
The attached WSS codec has its own register image that must be saved and
restored across suspend, and the MAX control register must be rewritten on
resume before the codec and GF1 sides are brought back.

Use the existing wss->suspend() and wss->resume() hooks for the codec, then
wire the driver up to the shared GUS suspend and resume helpers for the GF1
side.

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260406-b4-alsa-gus-isa-pm-v1-4-b6829a7457cd@gmail.com

ALSA: gusextreme: add ISA suspend and resume callbacks

gusextreme still leaves its ISA PM callbacks disabled because the shared
GF1 core had no suspend and resume path suitable for PM recovery.

Resume on this board needs one extra step before the shared GF1 path can
touch the chip again: the ES1688 side must restore the GF1 routing. Split
that routing sequence into a helper, reuse it for probe and resume, reset
the ES1688 side first on resume, and then wire the driver up to the shared
GUS PM helpers.

This restores usable post-resume GF1 operation on GUS Extreme without
rerunning probe-only detection in the shared GF1 path.

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260406-b4-alsa-gus-isa-pm-v1-3-b6829a7457cd@gmail.com

ALSA: gusclassic: add ISA suspend and resume callbacks

gusclassic still leaves its ISA PM callbacks disabled because the shared
GF1 core had no suspend and resume path suitable for PM recovery.

Wire the driver up to the new shared GUS suspend and resume helpers so a
suspend/resume cycle restores usable GF1 operation without rerunning
probe-only detection or tearing down the runtime bookkeeping kept by the
card instance.

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260406-b4-alsa-gus-isa-pm-v1-2-b6829a7457cd@gmail.com

ALSA: gus: add shared GF1 suspend and resume helpers

gusclassic and gusextreme still leave their ISA PM callbacks disabled
because the shared GF1 core only provides probe-time startup and full
shutdown paths.

Those helpers are not suitable for suspend and resume. They reset software
handlers and tear down runtime state such as the DRAM allocator, timer
state, DMA queues, PCM state and UART setup. Resume instead needs a
narrower recovery path that rebuilds the GF1 hardware state without
rerunning probe-only detection or discarding the bookkeeping kept by the
card instance.

Add shared GF1 suspend and resume helpers for that recovery path. Suspend
now quiesces GF1 PCM, aborts queued GF1 DMA work, resets the UART and
powers the chip down without tearing down allocator, timer or rawmidi
bookkeeping. Resume rebuilds the GF1 hardware state, restores timer and
UART handlers, and brings the chip back to a usable post-resume state for
the ISA front-ends.

The scope is limited to restoring post-resume usability. It does not
attempt transparent continuation of active GF1 PCM or synth state across
suspend, and userspace may still need to reprepare streams or reload
onboard sample data after resume. Open rawmidi substreams are restored
only to a usable post-resume state.

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260406-b4-alsa-gus-isa-pm-v1-1-b6829a7457cd@gmail.com

ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14IAH10

The bass speakers are not working, and add the following entry
in /etc/modprobe.d/snd.conf:
options snd-sof-intel-hda-generic hda_model=alc287-yoga9-bass-spk-pin
Fixes the bass speakers.

So add the quick ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN here.

Reported-by: Fernando Garcia Corona <fgarcor@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221317
Signed-off-by: songxiebing <songxiebing@kylinos.cn>
Link: https://patch.msgid.link/20260405012651.133838-1-songxiebing@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: ctxfi: Add fallback to default RSR for S/PDIF

spdif_passthru_playback_get_resources() uses atc->pll_rate as the RSR
for the MSR calculation loop. However, pll_rate is only updated in
atc_pll_init() and not in hw_pll_init(), so it remains 0 after the
card init.

When spdif_passthru_playback_setup() skips atc_pll_init() for
32000 Hz, (rsr * desc.msr) always becomes 0, causing the loop to spin
indefinitely.

Add fallback to use atc->rsr when atc->pll_rate is 0. This reflects
the hardware state, since hw_card_init() already configures the PLL
to the default RSR.

Fixes: 8cc72361481f ("ALSA: SB X-Fi driver merge")
Cc: stable@vger.kernel.org
Signed-off-by: Harin Lee <me@harin.net>
Link: https://patch.msgid.link/20260406074913.217374-1-me@harin.net
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: ctxfi: Limit PTP to a single page

Commit 391e69143d0a increased CT_PTP_NUM from 1 to 4 to support 256
playback streams, but the additional pages are not used by the card
correctly. The CT20K2 hardware already has multiple VMEM_PTPAL
registers, but using them separately would require refactoring the
entire virtual memory allocation logic.

ct_vm_map() always uses PTEs in vm->ptp[0].area regardless of
CT_PTP_NUM. On AMD64 systems, a single PTP covers 512 PTEs (2M). When
aggregate memory allocations exceed this limit, ct_vm_map() tries to
access beyond the allocated space and causes a page fault:

  BUG: unable to handle page fault for address: ffffd4ae8a10a000
  Oops: Oops: 0002 [#1] SMP PTI
  RIP: 0010:ct_vm_map+0x17c/0x280 [snd_ctxfi]
  Call Trace:
  atc_pcm_playback_prepare+0x225/0x3b0
  ct_pcm_playback_prepare+0x38/0x60
  snd_pcm_do_prepare+0x2f/0x50
  snd_pcm_action_single+0x36/0x90
  snd_pcm_action_nonatomic+0xbf/0xd0
  snd_pcm_ioctl+0x28/0x40
  __x64_sys_ioctl+0x97/0xe0
  do_syscall_64+0x81/0x610
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Revert CT_PTP_NUM to 1. The 256 SRC_RESOURCE_NUM and playback_count
remain unchanged.

Fixes: 391e69143d0a ("ALSA: ctxfi: Bump playback substreams to 256")
Cc: stable@vger.kernel.org
Signed-off-by: Harin Lee <me@harin.net>
Link: https://patch.msgid.link/20260406074857.216034-1-me@harin.net
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: scarlett2: Add missing sentinel initializer field

A "-Wmissing-field-initializers" warning was emitted when compiling the
module using the W=2 option. There is a sentinel initializer field
missing in the end of scarlett2_devices[]. Tested using a
Scarlett Solo 4th gen.

Fixes: d98cc489029d ("ALSA: scarlett2: Move USB IDs out from device_info struct")
Signed-off-by: Panagiotis Petrakopoulos <npetrakopoulos2003@gmail.com>
Link: https://patch.msgid.link/20260405222548.8903-1-npetrakopoulos2003@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: hda/realtek: Fix code style error

Output of checkpatch shows error:
ERROR: else should follow close brace '}'
2168: FILE: sound/hda/codecs/realtek/realtek.c:2168:
+ }
+ else

So fix it.

Signed-off-by: songxiebing <songxiebing@kylinos.cn>
Link: https://patch.msgid.link/20260405014208.167364-1-songxiebing@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: aoa: onyx: Update IEC958 sample-rate status for PCM playback

onyx_prepare() accepts 32/44.1/48 kHz PCM playback, but it leaves the
Onyx IEC958 sample-rate status bits at the driver's initial 44.1 kHz
setting in DIG_INFO3. As a result, 32 kHz and 48 kHz PCM streams
advertise a stale IEC958 sample rate unless userspace rewrites IEC958
Playback Default first.

Update only the consumer sample-frequency bits in DIG_INFO3 from the PCM
runtime during prepare, resolving the long-standing FIXME in the PCM
playback path while leaving the other user-controlled IEC958 status bits
unchanged.

Mark IEC958 Playback Default as volatile as well, since prepare() now
changes the exposed register contents outside the control put callback.

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260403-onyx-spdif-pcm-rate-v1-1-dcfaf931cf83@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge tag 'v7.0-rc7' into usb-next

We need the USB fixes in here to build on and for testing

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'v7.0-rc7' into char-misc-next

We need the char/misc/iio/comedi fixes in here as well for testing

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf test: Skip sched stats test for !root

Running perf sched stats requires root and it fails to open the
schedstat file for regular users.  Let's skip the test.

  $ perf sched stats true
  Failed to open /proc/sys/kernel/sched_schedstats

Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf cgroup: Update metric leader in evlist__expand_cgroup

When the evlist is expanded the metric leader wasn't being updated. As
the original evsel is deleted this creates a use-after-free in
stat-shadow's prepare_metric. This was detected running the "perf stat
--bpf-counters --for-each-cgroup test" with sanitizers.

The change itself puts the copied evsel into the priv field (known
unused because of evsel__clone use) and then in a second pass over the
list updates the copied values using the priv pointer.

Fixes: d1c5a0e86a4e ("perf stat: Add --for-each-cgroup option")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Sun Jian <sun.jian.kdev@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sample: Add evsel to struct perf_sample

Add the evsel from evsel__parse_sample into the struct
perf_sample. Sometimes we want to alter the evsel associated with a
sample, such as with off-cpu bpf-output events. In general the evsel
and perf_sample are passed as a pair, but this makes an altered evsel
something of a chore to keep checking for and setting up. Later
patches will remove passing an evsel with the perf_sample and switch
to just using the perf_sample's value.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sample: Make sure perf_sample__init/exit are used

The deferred stack trace code wasn't using perf_sample__init/exit. Add
the deferred stack trace clean up to perf_sample__exit which requires
proper NULL initialization in perf_sample__init. Make the
perf_sample__exit robust to being called more than once by using
zfree. Make the error paths in evsel__parse_sample exit the
sample. Add a merged_callchain boolean to capture that callchain is
allocated, deferred_callchain doen't suffice for this. Pack the struct
variables to avoid padding bytes for this.

Similiarly powerpc_vpadtl_sample wasn't using perf_sample__init/exit,
use it for consistency and potential issues with uninitialized
variables.

Similarly guest_session__inject_events in builtin-inject wasn't using
perf_sample_init/exit. The lifetime management for fetched events is
somewhat complex there, but when an event is fetched the sample should
be initialized and needs exiting on error. The sample may be left in
place so that future injects have access to it.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sample: Document struct perf_sample

Add kernel-doc for struct perf_sample capturing the somewhat unusual
population of fields and lifetime relationships.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf tools: Save cln_size header

Store cacheline size during perf record in header, so that cacheline
size can be used for other features, like sort keys for perf report.

Testing example with feat enabled:

  $ perf record ./Example

  $ perf report --header-only | grep -C 3 cacheline
  CPU_DOMAIN_INFO info available, use -I to display
  e_machine : 62
  e_flags : 0
  cacheline size: 64
  missing features: TRACING_DATA BUILD_ID BRANCH_STACK GROUP_DESC AUXTRACE \
  STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA
  ========

[namhyung: Update the commit message and remove blank lines]
Signed-off-by: Ricky Ringler <ricky.ringler@proton.me>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf tests sched stats: Write output to temp file

Writing to the perf.data file can fail in various contexts such as
continual test. Other tests write to a mktemp-ed file, make the "perf
sched stats tests" follow this convention.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched: Avoid crash for unexpected perf sched stats report

Doing a `perf sched record` then `perf sched stats report` crashes as
the tp_handler isn't set. Add a dummy tp_handler for it rather than
adding an extra check.

Reported-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

RISC-V: KVM: Fix shift-out-of-bounds in make_xfence_request()

The make_xfence_request() function uses a shift operation to check if a
vCPU is in the hart mask:

  if (!(hmask & (1UL << (vcpu->vcpu_id - hbase))))

However, when the difference between vcpu_id and hbase
is >= BITS_PER_LONG, the shift operation causes undefined behavior.

This was detected by UBSAN:
  UBSAN: shift-out-of-bounds in arch/riscv/kvm/tlb.c:343:23
  shift exponent 256 is too large for 64-bit type 'long unsigned int'

Fix this by adding a bounds check before the shift operation.

This bug was found by fuzzing the KVM RISC-V interface.

Fixes: 13acfec2dbcc ("RISC-V: KVM: Add remote HFENCE functions based on VCPU requests")
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260403232011.2394966-1-xujiakai2025@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>

md: fix array_state=clear sysfs deadlock

When "clear" is written to array_state, md_attr_store() breaks sysfs
active protection so the array can delete itself from its own sysfs
store method.

However, md_attr_store() currently drops the mddev reference before
calling sysfs_unbreak_active_protection(). Once do_md_stop(..., 0)
has made the mddev eligible for delayed deletion, the temporary
kobject reference taken by sysfs_break_active_protection() can become
the last kobject reference protecting the md kobject.

That allows sysfs_unbreak_active_protection() to drop the last
kobject reference from the current sysfs writer context. kobject
teardown then recurses into kernfs removal while the current sysfs
node is still being unwound, and lockdep reports recursive locking on
kn->active with kernfs_drain() in the call chain.

Reproducer on an existing level:
1. Create an md0 linear array and activate it:
   mknod /dev/md0 b 9 0
   echo none > /sys/block/md0/md/metadata_version
   echo linear > /sys/block/md0/md/level
   echo 1 > /sys/block/md0/md/raid_disks
   echo "$(cat /sys/class/block/sdb/dev)" > /sys/block/md0/md/new_dev
   echo "$(($(cat /sys/class/block/sdb/size) / 2))" > \
/sys/block/md0/md/dev-sdb/size
   echo 0 > /sys/block/md0/md/dev-sdb/slot
   echo active > /sys/block/md0/md/array_state
2. Wait briefly for the array to settle, then clear it:
   sleep 2
   echo clear > /sys/block/md0/md/array_state

The warning looks like:

  WARNING: possible recursive locking detected
  bash/588 is trying to acquire lock:
  (kn->active#65) at __kernfs_remove+0x157/0x1d0
  but task is already holding lock:
  (kn->active#65) at sysfs_unbreak_active_protection+0x1f/0x40
  ...
  Call Trace:
   kernfs_drain
   __kernfs_remove
   kernfs_remove_by_name_ns
   sysfs_remove_group
   sysfs_remove_groups
   __kobject_del
   kobject_put
   md_attr_store
   kernfs_fop_write_iter
   vfs_write
   ksys_write

Restore active protection before mddev_put() so the extra sysfs
kobject reference is dropped while the mddev is still held alive. The
actual md kobject deletion is then deferred until after the sysfs
write path has fully returned.

Fixes: 9e59d609763f ("md: call del_gendisk in control path")
Reviewed-by: Xiao Ni <xni@redhat.com>
Link: https://lore.kernel.org/linux-raid/20260330055213.3976052-1-yukuai@fnnas.com/
Signed-off-by: Yu Kuai <yukuai@fnnas.com>

bpf: Fix stale offload->prog pointer after constant blinding

When a dev-bound-only BPF program (BPF_F_XDP_DEV_BOUND_ONLY) undergoes
JIT compilation with constant blinding enabled (bpf_jit_harden >= 2),
bpf_jit_blind_constants() clones the program. The original prog is then
freed in bpf_jit_prog_release_other(), which updates aux->prog to point
to the surviving clone, but fails to update offload->prog.

This leaves offload->prog pointing to the freed original program. When
the network namespace is subsequently destroyed, cleanup_net() triggers
bpf_dev_bound_netdev_unregister(), which iterates ondev->progs and calls
__bpf_prog_offload_destroy(offload->prog). Accessing the freed prog
causes a page fault:

BUG: unable to handle page fault for address: ffffc900085f1038
Workqueue: netns cleanup_net
RIP: 0010:__bpf_prog_offload_destroy+0xc/0x80
Call Trace:
__bpf_offload_dev_netdev_unregister+0x257/0x350
bpf_dev_bound_netdev_unregister+0x4a/0x90
unregister_netdevice_many_notify+0x2a2/0x660
...
cleanup_net+0x21a/0x320

The test sequence that triggers this reliably is:

1. Set net.core.bpf_jit_harden=2 (echo 2 > /proc/sys/net/core/bpf_jit_harden)
2. Run xdp_metadata selftest, which creates a dev-bound-only XDP
program on a veth inside a netns (./test_progs -t xdp_metadata)
3. cleanup_net -> page fault in __bpf_prog_offload_destroy

Dev-bound-only programs are unique in that they have an offload structure
but go through the normal JIT path instead of bpf_prog_offload_compile().
This means they are subject to constant blinding's prog clone-and-replace,
while also having offload->prog that must stay in sync.

Fix this by updating offload->prog in bpf_jit_prog_release_other(),
alongside the existing aux->prog update. Both are back-pointers to
the prog that must be kept in sync when the prog is replaced.

Fixes: 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs")
Signed-off-by: MingTao Huang <mintaohuang@tencent.com>
Link: https://lore.kernel.org/r/tencent_BCF692F45859CCE6C22B7B0B64827947D406@qq.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: remove unused toggle in tc_tunnel

tc_tunnel test is based on a send_and_test_data function which takes a
subtest configuration, and a boolean indicating whether the connection
is supposed to fail or not. This boolean is systematically passed to
true, and is a remnant from the first (not integrated) attempts to
convert tc_tunnel to test_progs: those versions validated for
example that a connection properly fails when only one side of the
connection has tunneling enabled. This specific testing has not been
integrated because it involved large timeouts which increased quite a
lot the test duration, for little added value.

Remove the unused boolean from send_and_test_data to simplify the
generic part of subtests.

Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260403-tc_tunnel_cleanup-v1-1-4f1bb113d3ab@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge branch 'bpf-fix-end-of-list-detection-in-cgroup_storage_get_next_key'

Weiming Shi says:

====================
bpf: fix end-of-list detection in cgroup_storage_get_next_key()

list_next_entry() never returns NULL, so the NULL check in
cgroup_storage_get_next_key() is dead code. When iterating past the last
element, the function reads storage->key from a bogus pointer that aliases
internal map fields and copies the result to userspace.

Patch 1 replaces the NULL check with list_entry_is_head() so the function
correctly returns -ENOENT when there are no more entries.

Patch 2 adds a selftest to cover this corner case, as suggested by Sun Jian
and Paul Chaignon.

v2:
- Added selftest (Paul Chaignon)
- Collected Reviewed-by and Acked-by tags
====================

Link: https://patch.msgid.link/20260403132951.43533-1-bestswngs@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: add get_next_key boundary test for cgroup_storage

Verify that bpf_map__get_next_key() correctly returns -ENOENT when
called on the last (and only) key in a cgroup_storage map. Before the
fix in the previous patch, this would succeed with bogus key data
instead of failing.

Suggested-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260403132951.43533-3-bestswngs@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: fix end-of-list detection in cgroup_storage_get_next_key()

list_next_entry() never returns NULL -- when the current element is the
last entry it wraps to the list head via container_of(). The subsequent
NULL check is therefore dead code and get_next_key() never returns
-ENOENT for the last element, instead reading storage->key from a bogus
pointer that aliases internal map fields and copying the result to
userspace.

Replace it with list_entry_is_head() so the function correctly returns
-ENOENT when there are no more entries.

Fixes: de9cbbaadba5 ("bpf: introduce cgroup storage maps")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260403132951.43533-2-bestswngs@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge branch 'bpf-fix-torn-writes-in-non-prealloc-htab-with-bpf_f_lock'

Mykyta Yatsenko says:

====================
bpf: Fix torn writes in non-prealloc htab with BPF_F_LOCK

A torn write issue was reported in htab_map_update_elem() with
BPF_F_LOCK on hash maps. The BPF_F_LOCK fast path performs
a lockless lookup and copies the value under the element's embedded
spin_lock. A concurrent delete can free the element via
bpf_mem_cache_free(), which allows immediate reuse. When
alloc_htab_elem() recycles the same memory, it writes the value with
plain copy_map_value() without taking the spin_lock, racing with the
stale lock holder and producing torn writes.

Patch 1 fixes alloc_htab_elem() to use copy_map_value_locked() when
BPF_F_LOCK is set.

Patch 2 adds a selftest that reliably detects the torn writes on an
unpatched kernel.

Reported-by: Aaron Esau <aaron1esau@gmail.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
====================

Link: https://patch.msgid.link/20260401-bpf_map_torn_writes-v1-0-782d071c55e7@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>