]> git.ipfire.org Git - thirdparty/kernel/stable.git/log
thirdparty/kernel/stable.git
3 months agotools/power turbostat: regression fix: --show C1E%
Len Brown [Tue, 10 Jun 2025 03:34:04 +0000 (23:34 -0400)] 
tools/power turbostat: regression fix: --show C1E%

[ Upstream commit 5d939fbdd480cdf276eccc01eda3ed41e37d3f8a ]

The new default idle counter groupings broke "--show C1E%" (or any other C-state %)

Also delete a stray debug printf from the same offending commit.

Reported-by: Zhang Rui <rui.zhang@intel.com>
Fixes: ec4acd3166d8 ("tools/power turbostat: disable "cpuidle" invocation counters, by default")
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoPCI: pnv_php: Fix surprise plug detection and recovery
Timothy Pearson [Tue, 15 Jul 2025 21:39:06 +0000 (16:39 -0500)] 
PCI: pnv_php: Fix surprise plug detection and recovery

[ Upstream commit a2a2a6fc2469524caa713036297c542746d148dc ]

The existing PowerNV hotplug code did not handle surprise plug events
correctly, leading to a complete failure of the hotplug system after device
removal and a required reboot to detect new devices.

This comes down to two issues:

 1) When a device is surprise removed, often the bridge upstream
    port will cause a PE freeze on the PHB.  If this freeze is not
    cleared, the MSI interrupts from the bridge hotplug notification
    logic will not be received by the kernel, stalling all plug events
    on all slots associated with the PE.

 2) When a device is removed from a slot, regardless of surprise or
    programmatic removal, the associated PHB/PE ls left frozen.
    If this freeze is not cleared via a fundamental reset, skiboot
    is unable to clear the freeze and cannot retrain / rescan the
    slot.  This also requires a reboot to clear the freeze and redetect
    the device in the slot.

Issue the appropriate unfreeze and rescan commands on hotplug events,
and don't oops on hotplug if pci_bus_to_OF_node() returns NULL.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[bhelgaas: tidy comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/171044224.1359864.1752615546988.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agopowerpc/eeh: Make EEH driver device hotplug safe
Timothy Pearson [Tue, 15 Jul 2025 21:38:23 +0000 (16:38 -0500)] 
powerpc/eeh: Make EEH driver device hotplug safe

[ Upstream commit 1010b4c012b0d78dfb9d3132b49aa2ef024a07a7 ]

Multiple race conditions existed between the PCIe hotplug driver and the
EEH driver, leading to a variety of kernel oopses of the same general
nature:

<pcie device unplug>
<eeh driver trigger>
<hotplug removal trigger>
<pcie tree reconfiguration>
<eeh recovery next step>
<oops in EEH driver bus iteration loop>

A second class of oops is also seen when the underlying bus disappears
during device recovery.

Refactor the EEH module to be PCI rescan and remove safe.  Also clean
up a few minor formatting / readability issues.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1334208367.1359861.1752615503144.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agopowerpc/eeh: Export eeh_unfreeze_pe()
Timothy Pearson [Tue, 15 Jul 2025 21:37:34 +0000 (16:37 -0500)] 
powerpc/eeh: Export eeh_unfreeze_pe()

[ Upstream commit e82b34eed04b0ddcff4548b62633467235672fd3 ]

The PowerNV hotplug driver needs to be able to clear any frozen PE(s)
on the PHB after suprise removal of a downstream device.

Export the eeh_unfreeze_pe() symbol to allow implementation of this
functionality in the php_nv module.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1778535414.1359858.1752615454618.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoPCI: pnv_php: Work around switches with broken presence detection
Timothy Pearson [Tue, 15 Jul 2025 21:36:55 +0000 (16:36 -0500)] 
PCI: pnv_php: Work around switches with broken presence detection

[ Upstream commit 80f9fc2362797538ebd4fd70a1dfa838cc2c2cdb ]

The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
was observed to incorrectly assert the Presence Detect Set bit in its
capabilities when tested on a Raptor Computing Systems Blackbird system,
resulting in the hot insert path never attempting a rescan of the bus
and any downstream devices not being re-detected.

Work around this by additionally checking whether the PCIe data link is
active or not when performing presence detection on downstream switches'
ports, similar to the pciehp_hpc.c driver.

Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/505981576.1359853.1752615415117.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoPCI: pnv_php: Clean up allocated IRQs on unplug
Timothy Pearson [Tue, 15 Jul 2025 21:36:07 +0000 (16:36 -0500)] 
PCI: pnv_php: Clean up allocated IRQs on unplug

[ Upstream commit 4668619092554e1b95c9a5ac2941ca47ba6d548a ]

When the root of a nested PCIe bridge configuration is unplugged, the
pnv_php driver leaked the allocated IRQ resources for the child bridges'
hotplug event notifications, resulting in a panic.

Fix this by walking all child buses and deallocating all its IRQ resources
before calling pci_hp_remove_devices().

Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so
that it is only destroyed in pnv_php_free_slot(), instead of
pnv_php_disable_irq(). This is required since pnv_php_disable_irq() will
now be called by workers triggered by hot unplug interrupts, so the
workqueue needs to stay allocated.

The abridged kernel panic that occurs without this patch is as follows:

  WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c
  CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2
  Call Trace:
   msi_device_data_release+0x34/0x9c (unreliable)
   release_nodes+0x64/0x13c
   devres_release_all+0xc0/0x140
   device_del+0x2d4/0x46c
   pci_destroy_dev+0x5c/0x194
   pci_hp_remove_devices+0x90/0x128
   pci_hp_remove_devices+0x44/0x128
   pnv_php_disable_slot+0x54/0xd4
   power_write_file+0xf8/0x18c
   pci_slot_attr_store+0x40/0x5c
   sysfs_kf_write+0x64/0x78
   kernfs_fop_write_iter+0x1b0/0x290
   vfs_write+0x3bc/0x50c
   ksys_write+0x84/0x140
   system_call_exception+0x124/0x230
   system_call_vectored_common+0x15c/0x2ec

Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[bhelgaas: tidy comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/2013845045.1359852.1752615367790.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agopadata: Remove comment for reorder_work
Herbert Xu [Mon, 16 Jun 2025 08:38:49 +0000 (16:38 +0800)] 
padata: Remove comment for reorder_work

[ Upstream commit 82a0302e7167d0b7c6cde56613db3748f8dd806d ]

Remove comment for reorder_work which no longer exists.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 71203f68c774 ("padata: Fix pd UAF once and for all")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agosched/psi: Fix psi_seq initialization
Peter Zijlstra [Tue, 15 Jul 2025 19:11:14 +0000 (15:11 -0400)] 
sched/psi: Fix psi_seq initialization

[ Upstream commit 99b773d720aeea1ef2170dce5fcfa80649e26b78 ]

With the seqcount moved out of the group into a global psi_seq,
re-initializing the seqcount on group creation is causing seqcount
corruption.

Fixes: 570c8efd5eb7 ("sched/psi: Optimize psi_group_change() cpu_clock() usage")
Reported-by: Chris Mason <clm@meta.com>
Suggested-by: Beata Michalska <beata.michalska@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFD
Jason Gunthorpe [Mon, 14 Jul 2025 16:08:25 +0000 (13:08 -0300)] 
vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFD

[ Upstream commit 86624ba3b522b6512def25534341da93356c8da4 ]

This was missed during the initial implementation. The VFIO PCI encodes
the vf_token inside the device name when opening the device from the group
FD, something like:

  "0000:04:10.0 vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3"

This is used to control access to a VF unless there is co-ordination with
the owner of the PF.

Since we no longer have a device name in the cdev path, pass the token
directly through VFIO_DEVICE_BIND_IOMMUFD using an optional field
indicated by VFIO_DEVICE_BIND_FLAG_TOKEN.

Fixes: 5fcc26969a16 ("vfio: Add VFIO_DEVICE_BIND_IOMMUFD")
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v3-bdd8716e85fe+3978a-vfio_token_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agokconfig: qconf: fix ConfigList::updateListAllforAll()
Masahiro Yamada [Sun, 29 Jun 2025 18:48:56 +0000 (03:48 +0900)] 
kconfig: qconf: fix ConfigList::updateListAllforAll()

[ Upstream commit 721bfe583c52ba1ea74b3736a31a9dcfe6dd6d95 ]

ConfigList::updateListForAll() and ConfigList::updateListAllforAll()
are identical.

Commit f9b918fae678 ("kconfig: qconf: move ConfigView::updateList(All)
to ConfigList class") was a misconversion.

Fixes: f9b918fae678 ("kconfig: qconf: move ConfigView::updateList(All) to ConfigList class")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoscsi: sd: Make sd shutdown issue START STOP UNIT appropriately
Salomon Dushimirimana [Thu, 24 Jul 2025 21:45:20 +0000 (21:45 +0000)] 
scsi: sd: Make sd shutdown issue START STOP UNIT appropriately

[ Upstream commit 8e48727c26c4d839ff9b4b73d1cae486bea7fe19 ]

Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device
manage_system_start_stop") enabled libata EH to manage device power mode
trasitions for system suspend/resume and removed the flag from
ata_scsi_dev_config. However, since the sd_shutdown() function still
relies on the manage_system_start_stop flag, a spin-down command is not
issued to the disk with command "echo 1 > /sys/block/sdb/device/delete"

sd_shutdown() can be called for both system/runtime start stop
operations, so utilize the manage_run_time_start_stop flag set in the
ata_scsi_dev_config and issue a spin-down command during disk removal
when the system is running. This is in addition to when the system is
powering off and manage_shutdown flag is set. The
manage_system_start_stop flag will still be used for drivers that still
set the flag.

Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
Signed-off-by: Salomon Dushimirimana <salomondush@google.com>
Link: https://lore.kernel.org/r/20250724214520.112927-1-salomondush@google.com
Tested-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoscsi: ufs: core: Use link recovery when h8 exit fails during runtime resume
Seunghui Lee [Thu, 17 Jul 2025 08:12:13 +0000 (17:12 +0900)] 
scsi: ufs: core: Use link recovery when h8 exit fails during runtime resume

[ Upstream commit 35dabf4503b94a697bababe94678a8bc989c3223 ]

If the h8 exit fails during runtime resume process, the runtime thread
enters runtime suspend immediately and the error handler operates at the
same time.  It becomes stuck and cannot be recovered through the error
handler.  To fix this, use link recovery instead of the error handler.

Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
Link: https://lore.kernel.org/r/20250717081213.6811-1-sh043.lee@samsung.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoscsi: Revert "scsi: iscsi: Fix HW conn removal use after free"
Li Lingfeng [Tue, 15 Jul 2025 07:39:26 +0000 (15:39 +0800)] 
scsi: Revert "scsi: iscsi: Fix HW conn removal use after free"

[ Upstream commit 7bdc68921481c19cd8c85ddf805a834211c19e61 ]

This reverts commit c577ab7ba5f3bf9062db8a58b6e89d4fe370447e.

The invocation of iscsi_put_conn() in iscsi_iter_destory_conn_fn() is
used to free the initial reference counter of iscsi_cls_conn.  For
non-qla4xxx cases, the ->destroy_conn() callback (e.g.,
iscsi_conn_teardown) will call iscsi_remove_conn() and iscsi_put_conn()
to remove the connection from the children list of session and free the
connection at last.  However for qla4xxx, it is not the case. The
->destroy_conn() callback of qla4xxx will keep the connection in the
session conn_list and doesn't use iscsi_put_conn() to free the initial
reference counter. Therefore, it seems necessary to keep the
iscsi_put_conn() in the iscsi_iter_destroy_conn_fn(), otherwise, there
will be memory leak problem.

Link: https://lore.kernel.org/all/88334658-072b-4b90-a949-9c74ef93cfd1@huawei.com/
Fixes: c577ab7ba5f3 ("scsi: iscsi: Fix HW conn removal use after free")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Link: https://lore.kernel.org/r/20250715073926.3529456-1-lilingfeng3@huawei.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoscsi: mpt3sas: Fix a fw_event memory leak
Tomas Henzl [Wed, 23 Jul 2025 15:30:18 +0000 (17:30 +0200)] 
scsi: mpt3sas: Fix a fw_event memory leak

[ Upstream commit 3e90b38781e3bdd651edaf789585687611638862 ]

In _mpt3sas_fw_work() the fw_event reference is removed, it should also
be freed in all cases.

Fixes: 4318c7347847 ("scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20250723153018.50518-1-thenzl@redhat.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovfio/pci: Separate SR-IOV VF dev_set
Alex Williamson [Thu, 26 Jun 2025 22:56:18 +0000 (16:56 -0600)] 
vfio/pci: Separate SR-IOV VF dev_set

[ Upstream commit e908f58b6beb337cbe4481d52c3f5c78167b1aab ]

In the below noted Fixes commit we introduced a reflck mutex to allow
better scaling between devices for open and close.  The reflck was
based on the hot reset granularity, device level for root bus devices
which cannot support hot reset or bus/slot reset otherwise.  Overlooked
in this were SR-IOV VFs, where there's also no bus reset option, but
the default for a non-root-bus, non-slot-based device is bus level
reflck granularity.

The reflck mutex has since become the dev_set mutex (via commit
2cd8b14aaa66 ("vfio/pci: Move to the device set infrastructure")) and
is our defacto serialization for various operations and ioctls.  It
still seems to be the case though that sets of vfio-pci devices really
only need serialization relative to hot resets affecting the entire
set, which is not relevant to SR-IOV VFs.  As described in the Closes
link below, this serialization contributes to startup latency when
multiple VFs sharing the same "bus" are opened concurrently.

Mark the device itself as the basis of the dev_set for SR-IOV VFs.

Reported-by: Aaron Lewis <aaronlewis@google.com>
Closes: https://lore.kernel.org/all/20250626180424.632628-1-aaronlewis@google.com
Tested-by: Aaron Lewis <aaronlewis@google.com>
Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release")
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20250626225623.1180952-1-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovfio/pds: Fix missing detach_ioas op
Brett Creeley [Wed, 2 Jul 2025 16:37:44 +0000 (09:37 -0700)] 
vfio/pds: Fix missing detach_ioas op

[ Upstream commit fe24d5bc635e103a517ec201c3cb571eeab8be2f ]

When CONFIG_IOMMUFD is enabled and a device is bound to the pds_vfio_pci
driver, the following WARN_ON() trace is seen and probe fails:

WARNING: CPU: 0 PID: 5040 at drivers/vfio/vfio_main.c:317 __vfio_register_dev+0x130/0x140 [vfio]
<...>
pds_vfio_pci 0000:08:00.1: probe with driver pds_vfio_pci failed with error -22

This is because the driver's vfio_device_ops.detach_ioas isn't set.

Fix this by using the generic vfio_iommufd_physical_detach_ioas
function.

Fixes: 38fe3975b4c2 ("vfio/pds: Initial support for pds VFIO driver")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20250702163744.69767-1-brett.creeley@amd.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovfio: Prevent open_count decrement to negative
Jacob Pan [Wed, 18 Jun 2025 23:46:18 +0000 (16:46 -0700)] 
vfio: Prevent open_count decrement to negative

[ Upstream commit 982ddd59ed97dc7e63efd97ed50273ffb817bd41 ]

When vfio_df_close() is called with open_count=0, it triggers a warning in
vfio_assert_device_open() but still decrements open_count to -1. This allows
a subsequent open to incorrectly pass the open_count == 0 check, leading to
unintended behavior, such as setting df->access_granted = true.

For example, running an IOMMUFD compat no-IOMMU device with VFIO tests
(https://github.com/awilliam/tests/blob/master/vfio-noiommu-pci-device-open.c)
results in a warning and a failed VFIO_GROUP_GET_DEVICE_FD ioctl on the first
run, but the second run succeeds incorrectly.

Add checks to avoid decrementing open_count below zero.

Fixes: 05f37e1c03b6 ("vfio: Pass struct vfio_device_file * to vfio_device_open/close()")
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250618234618.1910456-2-jacob.pan@linux.microsoft.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovfio: Fix unbalanced vfio_df_close call in no-iommu mode
Jacob Pan [Wed, 18 Jun 2025 23:46:17 +0000 (16:46 -0700)] 
vfio: Fix unbalanced vfio_df_close call in no-iommu mode

[ Upstream commit b25e271b377999191b12f0afbe1861edcf57e3fe ]

For devices with no-iommu enabled in IOMMUFD VFIO compat mode, the group open
path skips vfio_df_open(), leaving open_count at 0. This causes a warning in
vfio_assert_device_open(device) when vfio_df_close() is called during group
close.

The correct behavior is to skip only the IOMMUFD bind in the device open path
for no-iommu devices. Commit 6086efe73498 omitted vfio_df_open(), which was
too broad. This patch restores the previous behavior, ensuring
the vfio_df_open is called in the group open path.

Fixes: 6086efe73498 ("vfio-iommufd: Move noiommu compat validation out of vfio_iommufd_bind()")
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20250618234618.1910456-1-jacob.pan@linux.microsoft.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoi2c: muxes: mule: Fix an error handling path in mule_i2c_mux_probe()
Christophe JAILLET [Wed, 30 Jul 2025 19:38:02 +0000 (21:38 +0200)] 
i2c: muxes: mule: Fix an error handling path in mule_i2c_mux_probe()

[ Upstream commit 33ac5155891cab165c93b51b0e22e153eacc2ee7 ]

If an error occurs in the loop that creates the device adapters, then a
reference to 'dev' still needs to be released.

Use for_each_child_of_node_scoped() to both fix the issue and save one line
of code.

Fixes: d0f8e97866bf ("i2c: muxes: add support for tsd,mule-i2c multiplexer")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoexfat: fdatasync flag should be same like generic_write_sync()
Zhengxu Zhang [Thu, 19 Jun 2025 01:33:31 +0000 (09:33 +0800)] 
exfat: fdatasync flag should be same like generic_write_sync()

[ Upstream commit 2f2d42a17b5a6711378d39df74f1f69a831c5d4e ]

Test: androbench by default setting, use 64GB sdcard.
 the random write speed:
without this patch 3.5MB/s
with this patch 7MB/s

After patch "11a347fb6cef", the random write speed decreased significantly.
the .write_iter() interface had been modified, and check the differences
with generic_file_write_iter(), when calling generic_write_sync() and
exfat_file_write_iter() to call vfs_fsync_range(), the fdatasync flag is
wrong, and make not use the fdatasync mode, and make random write speed
decreased. So use generic_write_sync() instead of vfs_fsync_range().

Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength")
Signed-off-by: Zhengxu Zhang <zhengxu.zhang@unisoc.com>
Acked-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to trigger foreground gc during f2fs_map_blocks() in lfs mode
Chao Yu [Thu, 24 Jul 2025 08:01:44 +0000 (16:01 +0800)] 
f2fs: fix to trigger foreground gc during f2fs_map_blocks() in lfs mode

[ Upstream commit 1005a3ca28e90c7a64fa43023f866b960a60f791 ]

w/ "mode=lfs" mount option, generic/299 will cause system panic as below:

------------[ cut here ]------------
kernel BUG at fs/f2fs/segment.c:2835!
Call Trace:
 <TASK>
 f2fs_allocate_data_block+0x6f4/0xc50
 f2fs_map_blocks+0x970/0x1550
 f2fs_iomap_begin+0xb2/0x1e0
 iomap_iter+0x1d6/0x430
 __iomap_dio_rw+0x208/0x9a0
 f2fs_file_write_iter+0x6b3/0xfa0
 aio_write+0x15d/0x2e0
 io_submit_one+0x55e/0xab0
 __x64_sys_io_submit+0xa5/0x230
 do_syscall_64+0x84/0x2f0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0010:new_curseg+0x70f/0x720

The root cause of we run out-of-space is: in f2fs_map_blocks(), f2fs may
trigger foreground gc only if it allocates any physical block, it will be
a little bit later when there is multiple threads writing data w/
aio/dio/bufio method in parallel, since we always use OPU in lfs mode, so
f2fs_map_blocks() does block allocations aggressively.

In order to fix this issue, let's give a chance to trigger foreground
gc in prior to block allocation in f2fs_map_blocks().

Fixes: 36abef4e796d ("f2fs: introduce mode=lfs mount option")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to calculate dirty data during has_not_enough_free_secs()
Chao Yu [Thu, 24 Jul 2025 08:01:43 +0000 (16:01 +0800)] 
f2fs: fix to calculate dirty data during has_not_enough_free_secs()

[ Upstream commit e194e140ab7de2ce2782e64b9e086a43ca6ff4f2 ]

In lfs mode, dirty data needs OPU, we'd better calculate lower_p and
upper_p w/ them during has_not_enough_free_secs(), otherwise we may
encounter out-of-space issue due to we missed to reclaim enough
free section w/ foreground gc.

Fixes: 36abef4e796d ("f2fs: introduce mode=lfs mount option")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to update upper_p in __get_secs_required() correctly
Chao Yu [Thu, 24 Jul 2025 08:01:42 +0000 (16:01 +0800)] 
f2fs: fix to update upper_p in __get_secs_required() correctly

[ Upstream commit 6840faddb65683b4e7bd8196f177b038a1e19faf ]

Commit 1acd73edbbfe ("f2fs: fix to account dirty data in __get_secs_required()")
missed to calculate upper_p w/ data_secs, fix it.

Fixes: 1acd73edbbfe ("f2fs: fix to account dirty data in __get_secs_required()")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: vm_unmap_ram() may be called from an invalid context
Jan Prusakowski [Thu, 24 Jul 2025 15:31:15 +0000 (17:31 +0200)] 
f2fs: vm_unmap_ram() may be called from an invalid context

[ Upstream commit 08a7efc5b02a0620ae16aa9584060e980a69cb55 ]

When testing F2FS with xfstests using UFS backed virtual disks the
kernel complains sometimes that f2fs_release_decomp_mem() calls
vm_unmap_ram() from an invalid context. Example trace from
f2fs/007 test:

f2fs/007 5s ...  [12:59:38][    8.902525] run fstests f2fs/007
[   11.468026] BUG: sleeping function called from invalid context at mm/vmalloc.c:2978
[   11.471849] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 68, name: irq/22-ufshcd
[   11.475357] preempt_count: 1, expected: 0
[   11.476970] RCU nest depth: 0, expected: 0
[   11.478531] CPU: 0 UID: 0 PID: 68 Comm: irq/22-ufshcd Tainted: G        W           6.16.0-rc5-xfstests-ufs-g40f92e79b0aa #9 PREEMPT(none)
[   11.478535] Tainted: [W]=WARN
[   11.478536] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   11.478537] Call Trace:
[   11.478543]  <TASK>
[   11.478545]  dump_stack_lvl+0x4e/0x70
[   11.478554]  __might_resched.cold+0xaf/0xbe
[   11.478557]  vm_unmap_ram+0x21/0xb0
[   11.478560]  f2fs_release_decomp_mem+0x59/0x80
[   11.478563]  f2fs_free_dic+0x18/0x1a0
[   11.478565]  f2fs_finish_read_bio+0xd7/0x290
[   11.478570]  blk_update_request+0xec/0x3b0
[   11.478574]  ? sbitmap_queue_clear+0x3b/0x60
[   11.478576]  scsi_end_request+0x27/0x1a0
[   11.478582]  scsi_io_completion+0x40/0x300
[   11.478583]  ufshcd_mcq_poll_cqe_lock+0xa3/0xe0
[   11.478588]  ufshcd_sl_intr+0x194/0x1f0
[   11.478592]  ufshcd_threaded_intr+0x68/0xb0
[   11.478594]  ? __pfx_irq_thread_fn+0x10/0x10
[   11.478599]  irq_thread_fn+0x20/0x60
[   11.478602]  ? __pfx_irq_thread_fn+0x10/0x10
[   11.478603]  irq_thread+0xb9/0x180
[   11.478605]  ? __pfx_irq_thread_dtor+0x10/0x10
[   11.478607]  ? __pfx_irq_thread+0x10/0x10
[   11.478609]  kthread+0x10a/0x230
[   11.478614]  ? __pfx_kthread+0x10/0x10
[   11.478615]  ret_from_fork+0x7e/0xd0
[   11.478619]  ? __pfx_kthread+0x10/0x10
[   11.478621]  ret_from_fork_asm+0x1a/0x30
[   11.478623]  </TASK>

This patch modifies in_task() check inside f2fs_read_end_io() to also
check if interrupts are disabled. This ensures that pages are unmapped
asynchronously in an interrupt handler.

Fixes: bff139b49d9f ("f2fs: handle decompress only post processing in softirq")
Signed-off-by: Jan Prusakowski <jprusakowski@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to avoid out-of-boundary access in devs.path
Chao Yu [Fri, 11 Jul 2025 07:14:50 +0000 (15:14 +0800)] 
f2fs: fix to avoid out-of-boundary access in devs.path

[ Upstream commit 5661998536af52848cc4d52a377e90368196edea ]

- touch /mnt/f2fs/012345678901234567890123456789012345678901234567890123
- truncate -s $((1024*1024*1024)) \
  /mnt/f2fs/012345678901234567890123456789012345678901234567890123
- touch /mnt/f2fs/file
- truncate -s $((1024*1024*1024)) /mnt/f2fs/file
- mkfs.f2fs /mnt/f2fs/012345678901234567890123456789012345678901234567890123 \
  -c /mnt/f2fs/file
- mount /mnt/f2fs/012345678901234567890123456789012345678901234567890123 \
  /mnt/f2fs/loop

[16937.192225] F2FS-fs (loop0): Mount Device [ 0]: /mnt/f2fs/012345678901234567890123456789012345678901234567890123\xff\x01,      511,        0 -    3ffff
[16937.192268] F2FS-fs (loop0): Failed to find devices

If device path length equals to MAX_PATH_LEN, sbi->devs.path[] may
not end up w/ null character due to path array is fully filled, So
accidently, fields locate after path[] may be treated as part of
device path, result in parsing wrong device path.

struct f2fs_dev_info {
...
char path[MAX_PATH_LEN];
...
};

Let's add one byte space for sbi->devs.path[] to store null
character of device path string.

Fixes: 3c62be17d4f5 ("f2fs: support multiple devices")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to avoid panic in f2fs_evict_inode
Chao Yu [Tue, 8 Jul 2025 09:56:57 +0000 (17:56 +0800)] 
f2fs: fix to avoid panic in f2fs_evict_inode

[ Upstream commit a509a55f8eecc8970b3980c6f06886bbff0e2f68 ]

As syzbot [1] reported as below:

R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>
---[ end trace 0000000000000000 ]---
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff88812d962278 by task syz-executor/564

CPU: 1 PID: 564 Comm: syz-executor Tainted: G        W          6.1.129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack+0x21/0x24 lib/dump_stack.c:88
 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106
 print_address_description+0x71/0x210 mm/kasan/report.c:316
 print_report+0x4a/0x60 mm/kasan/report.c:427
 kasan_report+0x122/0x150 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531
 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585
 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703
 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677
 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733
 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789
 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159
 block_operations fs/f2fs/checkpoint.c:1269 [inline]
 f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658
 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668
 deactivate_locked_super+0x98/0x100 fs/super.c:332
 deactivate_super+0xaf/0xe0 fs/super.c:363
 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193
 task_work_run+0x1c6/0x230 kernel/task_work.c:203
 exit_task_work include/linux/task_work.h:39 [inline]
 do_exit+0x9fb/0x2410 kernel/exit.c:871
 do_group_exit+0x210/0x2d0 kernel/exit.c:1021
 __do_sys_exit_group kernel/exit.c:1032 [inline]
 __se_sys_exit_group kernel/exit.c:1030 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030
 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f28b1b8e169
Code: Unable to access opcode bytes at 0x7f28b1b8e13f.
RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>

Allocated by task 569:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487
 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690
 lookup_slow+0x57/0x70 fs/namei.c:1707
 walk_component+0x2e6/0x410 fs/namei.c:1998
 lookup_last fs/namei.c:2455 [inline]
 path_lookupat+0x180/0x490 fs/namei.c:2479
 filename_lookup+0x1f0/0x500 fs/namei.c:2508
 vfs_statx+0x10b/0x660 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3424 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xd5/0x350 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 13:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516
 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562
 i_callback+0x4c/0x70 fs/inode.c:250
 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297
 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x178/0x500 kernel/softirq.c:578
 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945
 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164
 kthread+0x270/0x310 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Last potentially related work creation:
 kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845
 destroy_inode fs/inode.c:316 [inline]
 evict+0x7da/0x870 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x62b/0x830 fs/inode.c:1860
 do_unlinkat+0x356/0x540 fs/namei.c:4397
 __do_sys_unlink fs/namei.c:4438 [inline]
 __se_sys_unlink fs/namei.c:4436 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4436
 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff88812d961f20
 which belongs to the cache f2fs_inode_cache of size 1200
The buggy address is located 856 bytes inside of
 1200-byte region [ffff88812d961f20ffff88812d9623d0)

The buggy address belongs to the physical page:
page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960
head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328
 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605
 alloc_slab_page include/linux/gfp.h:-1 [inline]
 allocate_slab mm/slub.c:1939 [inline]
 new_slab+0xec/0x4b0 mm/slub.c:1992
 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180
 __slab_alloc+0x5e/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293
 mount_bdev+0x2ae/0x3e0 fs/super.c:1443
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced w/ the reproducer [2], once we enable
CONFIG_F2FS_CHECK_FS config, the reproducer will trigger panic as below,
so the direct reason of this bug is the same as the one below patch [3]
fixed.

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/20250702120321.1080759-1-chao@kernel.org

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is: in the fuzzed image, dnode #8 belongs to inode #7,
after inode #7 eviction, dnode #8 was dropped.

However there is dirent that has ino #8, so, once we unlink file3, in
f2fs_evict_inode(), both f2fs_truncate() and f2fs_update_inode_page()
will fail due to we can not load node #8, result in we missed to call
f2fs_inode_synced() to clear inode dirty status.

Let's fix this by calling f2fs_inode_synced() in error path of
f2fs_evict_inode().

PS: As I verified, the reproducer [2] can trigger this bug in v6.1.129,
but it failed in v6.16-rc4, this is because the testcase will stop due to
other corruption has been detected by f2fs:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b462b2e5 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to avoid UAF in f2fs_sync_inode_meta()
Chao Yu [Tue, 8 Jul 2025 09:53:39 +0000 (17:53 +0800)] 
f2fs: fix to avoid UAF in f2fs_sync_inode_meta()

[ Upstream commit 7c30d79930132466f5be7d0b57add14d1a016bda ]

syzbot reported an UAF issue as below: [1] [2]

[1] https://syzkaller.appspot.com/text?tag=CrashReport&x=16594c60580000

==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff888100567dc8 by task kworker/u4:0/8

CPU: 1 PID: 8 Comm: kworker/u4:0 Tainted: G        W          6.1.129-syzkaller-00017-g642656a36791 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Workqueue: writeback wb_workfn (flush-7:0)
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x151/0x1b7 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:316 [inline]
 print_report+0x158/0x4e0 mm/kasan/report.c:427
 kasan_report+0x13c/0x170 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0x100/0x2e0 fs/f2fs/super.c:1553
 f2fs_update_inode+0x72/0x1c40 fs/f2fs/inode.c:588
 f2fs_update_inode_page+0x135/0x170 fs/f2fs/inode.c:706
 f2fs_write_inode+0x416/0x790 fs/f2fs/inode.c:734
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4cf/0xb80 fs/fs-writeback.c:1677
 writeback_sb_inodes+0xb32/0x1910 fs/fs-writeback.c:1903
 __writeback_inodes_wb+0x118/0x3f0 fs/fs-writeback.c:1974
 wb_writeback+0x3da/0xa00 fs/fs-writeback.c:2081
 wb_check_background_flush fs/fs-writeback.c:2151 [inline]
 wb_do_writeback fs/fs-writeback.c:2239 [inline]
 wb_workfn+0xbba/0x1030 fs/fs-writeback.c:2266
 process_one_work+0x73d/0xcb0 kernel/workqueue.c:2299
 worker_thread+0xa60/0x1260 kernel/workqueue.c:2446
 kthread+0x26d/0x300 kernel/kthread.c:386
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>

Allocated by task 298:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x1f/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:333
 kasan_slab_alloc include/linux/kasan.h:202 [inline]
 slab_post_alloc_hook+0x53/0x2c0 mm/slab.h:768
 slab_alloc_node mm/slub.c:3421 [inline]
 slab_alloc mm/slub.c:3431 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3438 [inline]
 kmem_cache_alloc_lru+0x102/0x270 mm/slub.c:3454
 alloc_inode_sb include/linux/fs.h:3255 [inline]
 f2fs_alloc_inode+0x2d/0x350 fs/f2fs/super.c:1437
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x18c/0x7e0 fs/inode.c:1373
 f2fs_iget+0x55/0x4ca0 fs/f2fs/inode.c:486
 f2fs_lookup+0x3c1/0xb50 fs/f2fs/namei.c:484
 __lookup_slow+0x2b9/0x3e0 fs/namei.c:1689
 lookup_slow+0x5a/0x80 fs/namei.c:1706
 walk_component+0x2e7/0x410 fs/namei.c:1997
 lookup_last fs/namei.c:2454 [inline]
 path_lookupat+0x16d/0x450 fs/namei.c:2478
 filename_lookup+0x251/0x600 fs/namei.c:2507
 vfs_statx+0x107/0x4b0 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3434 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xda/0x7c0 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x52/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x3b/0x80 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 0:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:516
 ____kasan_slab_free+0x131/0x180 mm/kasan/common.c:241
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:249
 kasan_slab_free include/linux/kasan.h:178 [inline]
 slab_free_hook mm/slub.c:1745 [inline]
 slab_free_freelist_hook mm/slub.c:1771 [inline]
 slab_free mm/slub.c:3686 [inline]
 kmem_cache_free+0x291/0x560 mm/slub.c:3711
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1584
 i_callback+0x4b/0x70 fs/inode.c:250
 rcu_do_batch+0x552/0xbe0 kernel/rcu/tree.c:2297
 rcu_core+0x502/0xf40 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x1db/0x650 kernel/softirq.c:624
 __do_softirq kernel/softirq.c:662 [inline]
 invoke_softirq kernel/softirq.c:479 [inline]
 __irq_exit_rcu+0x52/0xf0 kernel/softirq.c:711
 irq_exit_rcu+0x9/0x10 kernel/softirq.c:723
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1118 [inline]
 sysvec_apic_timer_interrupt+0xa9/0xc0 arch/x86/kernel/apic/apic.c:1118
 asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:691

Last potentially related work creation:
 kasan_save_stack+0x3b/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb4/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 __call_rcu_common kernel/rcu/tree.c:2807 [inline]
 call_rcu+0xdc/0x10f0 kernel/rcu/tree.c:2926
 destroy_inode fs/inode.c:316 [inline]
 evict+0x87d/0x930 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x616/0x690 fs/inode.c:1860
 do_unlinkat+0x4e1/0x920 fs/namei.c:4396
 __do_sys_unlink fs/namei.c:4437 [inline]
 __se_sys_unlink fs/namei.c:4435 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4435
 x64_sys_call+0x289/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x3b/0x80 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff888100567a10
 which belongs to the cache f2fs_inode_cache of size 1360
The buggy address is located 952 bytes inside of
 1360-byte region [ffff888100567a10ffff888100567f60)

The buggy address belongs to the physical page:
page:ffffea0004015800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100560
head:ffffea0004015800 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff8881002c4d80
raw: 0000000000000000 0000000080160016 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Reclaimable, gfp_mask 0xd2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_RECLAIMABLE), pid 298, tgid 298 (syz-executor330), ts 26489303743, free_ts 0
 set_page_owner include/linux/page_owner.h:33 [inline]
 post_alloc_hook+0x213/0x220 mm/page_alloc.c:2637
 prep_new_page+0x1b/0x110 mm/page_alloc.c:2644
 get_page_from_freelist+0x3a98/0x3b10 mm/page_alloc.c:4539
 __alloc_pages+0x234/0x610 mm/page_alloc.c:5837
 alloc_slab_page+0x6c/0xf0 include/linux/gfp.h:-1
 allocate_slab mm/slub.c:1962 [inline]
 new_slab+0x90/0x3e0 mm/slub.c:2015
 ___slab_alloc+0x6f9/0xb80 mm/slub.c:3203
 __slab_alloc+0x5d/0xa0 mm/slub.c:3302
 slab_alloc_node mm/slub.c:3387 [inline]
 slab_alloc mm/slub.c:3431 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3438 [inline]
 kmem_cache_alloc_lru+0x149/0x270 mm/slub.c:3454
 alloc_inode_sb include/linux/fs.h:3255 [inline]
 f2fs_alloc_inode+0x2d/0x350 fs/f2fs/super.c:1437
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x18c/0x7e0 fs/inode.c:1373
 f2fs_iget+0x55/0x4ca0 fs/f2fs/inode.c:486
 f2fs_fill_super+0x5360/0x6dc0 fs/f2fs/super.c:4488
 mount_bdev+0x282/0x3b0 fs/super.c:1445
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4743
 legacy_get_tree+0xf1/0x190 fs/fs_context.c:632
page_owner free stack trace missing

Memory state around the buggy address:
 ffff888100567c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888100567d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888100567d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                              ^
 ffff888100567e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888100567e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[2] https://syzkaller.appspot.com/text?tag=CrashLog&x=13654c60580000

[   24.675720][   T28] audit: type=1400 audit(1745327318.732:72): avc:  denied  { write } for  pid=298 comm="syz-executor399" name="/" dev="loop0" ino=3 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1
[   24.705426][  T296] ------------[ cut here ]------------
[   24.706608][   T28] audit: type=1400 audit(1745327318.732:73): avc:  denied  { remove_name } for  pid=298 comm="syz-executor399" name="file0" dev="loop0" ino=4 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1
[   24.711550][  T296] WARNING: CPU: 0 PID: 296 at fs/f2fs/inode.c:847 f2fs_evict_inode+0x1262/0x1540
[   24.734141][   T28] audit: type=1400 audit(1745327318.732:74): avc:  denied  { rename } for  pid=298 comm="syz-executor399" name="file0" dev="loop0" ino=4 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1
[   24.742969][  T296] Modules linked in:
[   24.765201][   T28] audit: type=1400 audit(1745327318.732:75): avc:  denied  { add_name } for  pid=298 comm="syz-executor399" name="bus" scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1
[   24.768847][  T296] CPU: 0 PID: 296 Comm: syz-executor399 Not tainted 6.1.129-syzkaller-00017-g642656a36791 #0
[   24.799506][  T296] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
[   24.809401][  T296] RIP: 0010:f2fs_evict_inode+0x1262/0x1540
[   24.815018][  T296] Code: 34 70 4a ff eb 0d e8 2d 70 4a ff 4d 89 e5 4c 8b 64 24 18 48 8b 5c 24 28 4c 89 e7 e8 78 38 03 00 e9 84 fc ff ff e8 0e 70 4a ff <0f> 0b 4c 89 f7 be 08 00 00 00 e8 7f 21 92 ff f0 41 80 0e 04 e9 61
[   24.834584][  T296] RSP: 0018:ffffc90000db7a40 EFLAGS: 00010293
[   24.840465][  T296] RAX: ffffffff822aca42 RBX: 0000000000000002 RCX: ffff888110948000
[   24.848291][  T296] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
[   24.856064][  T296] RBP: ffffc90000db7bb0 R08: ffffffff822ac6a8 R09: ffffed10200b005d
[   24.864073][  T296] R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888100580000
[   24.871812][  T296] R13: dffffc0000000000 R14: ffff88810fef4078 R15: 1ffff920001b6f5c

The root cause is w/ a fuzzed image, f2fs may missed to clear FI_DIRTY_INODE
flag for target inode, after f2fs_evict_inode(), the inode is still linked in
sbi->inode_list[DIRTY_META] global list, once it triggers checkpoint,
f2fs_sync_inode_meta() may access the released inode.

In f2fs_evict_inode(), let's always call f2fs_inode_synced() to clear
FI_DIRTY_INODE flag and drop inode from global dirty list to avoid this
UAF issue.

Fixes: 0f18b462b2e5 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/bug?extid=849174b2efaf0d8be6ba
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: doc: fix wrong quota mount option description
Chao Yu [Wed, 2 Jul 2025 06:49:25 +0000 (14:49 +0800)] 
f2fs: doc: fix wrong quota mount option description

[ Upstream commit 81b6ecca2f15922e8d653dc037df5871e754be6e ]

We should use "{usr,grp,prj}jquota=" to disable journaled quota,
rather than using off{usr,grp,prj}jquota.

Fixes: 4b2414d04e99 ("f2fs: support journalled quota")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to check upper boundary for gc_no_zoned_gc_percent
Chao Yu [Fri, 27 Jun 2025 02:38:18 +0000 (10:38 +0800)] 
f2fs: fix to check upper boundary for gc_no_zoned_gc_percent

[ Upstream commit a919ae794ad2dc6d04b3eea2f9bc86332c1630cc ]

This patch adds missing upper boundary check while setting
gc_no_zoned_gc_percent via sysfs.

Fixes: 9a481a1c16f4 ("f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to check upper boundary for gc_valid_thresh_ratio
Chao Yu [Fri, 27 Jun 2025 02:38:17 +0000 (10:38 +0800)] 
f2fs: fix to check upper boundary for gc_valid_thresh_ratio

[ Upstream commit 7a96d1d73ce9de5041e891a623b722f900651561 ]

This patch adds missing upper boundary check while setting
gc_valid_thresh_ratio via sysfs.

Fixes: e791d00bd06c ("f2fs: add valid block ratio not to do excessive GC for one time GC")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to check upper boundary for value of gc_boost_zoned_gc_percent
yohan.joung [Wed, 25 Jun 2025 00:14:07 +0000 (09:14 +0900)] 
f2fs: fix to check upper boundary for value of gc_boost_zoned_gc_percent

[ Upstream commit 10dcaa56ef93f2a45e4c3fec27d8e1594edad110 ]

to check the upper boundary when setting gc_boost_zoned_gc_percent

Fixes: 9a481a1c16f4 ("f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent")
Signed-off-by: yohan.joung <yohan.joung@sk.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix KMSAN uninit-value in extent_info usage
Abinash Singh [Wed, 25 Jun 2025 11:05:37 +0000 (16:35 +0530)] 
f2fs: fix KMSAN uninit-value in extent_info usage

[ Upstream commit 154467f4ad033473e5c903a03e7b9bca7df9a0fa ]

KMSAN reported a use of uninitialized value in `__is_extent_mergeable()`
 and `__is_back_mergeable()` via the read extent tree path.

The root cause is that `get_read_extent_info()` only initializes three
fields (`fofs`, `blk`, `len`) of `struct extent_info`, leaving the
remaining fields uninitialized. This leads to undefined behavior
when those fields are accessed later, especially during
extent merging.

Fix it by zero-initializing the `extent_info` struct before population.

Reported-by: syzbot+b8c1d60e95df65e827d4@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b8c1d60e95df65e827d4
Fixes: 94afd6d6e525 ("f2fs: extent cache: support unaligned extent")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Abinash Singh <abinashsinghlalotra@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: compress: fix UAF of f2fs_inode_info in f2fs_free_dic
Zhiguo Niu [Fri, 13 Jun 2025 01:50:45 +0000 (09:50 +0800)] 
f2fs: compress: fix UAF of f2fs_inode_info in f2fs_free_dic

[ Upstream commit 39868685c2a94a70762bc6d77dc81d781d05bff5 ]

The decompress_io_ctx may be released asynchronously after
I/O completion. If this file is deleted immediately after read,
and the kworker of processing post_read_wq has not been executed yet
due to high workloads, It is possible that the inode(f2fs_inode_info)
is evicted and freed before it is used f2fs_free_dic.

    The UAF case as below:
    Thread A                                      Thread B
    - f2fs_decompress_end_io
     - f2fs_put_dic
      - queue_work
        add free_dic work to post_read_wq
                                                   - do_unlink
                                                    - iput
                                                     - evict
                                                      - call_rcu
    This file is deleted after read.

    Thread C                                 kworker to process post_read_wq
    - rcu_do_batch
     - f2fs_free_inode
      - kmem_cache_free
     inode is freed by rcu
                                             - process_scheduled_works
                                              - f2fs_late_free_dic
                                               - f2fs_free_dic
                                                - f2fs_release_decomp_mem
                                      read (dic->inode)->i_compress_algorithm

This patch store compress_algorithm and sbi in dic to avoid inode UAF.

In addition, the previous solution is deprecated in [1] may cause system hang.
[1] https://lore.kernel.org/all/c36ab955-c8db-4a8b-a9d0-f07b5f426c3f@kernel.org

Cc: Daeho Jeong <daehojeong@google.com>
Fixes: bff139b49d9f ("f2fs: handle decompress only post processing in softirq")
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Baocong Liu <baocong.liu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: compress: change the first parameter of page_array_{alloc,free} to sbi
Zhiguo Niu [Fri, 13 Jun 2025 01:50:44 +0000 (09:50 +0800)] 
f2fs: compress: change the first parameter of page_array_{alloc,free} to sbi

[ Upstream commit 8e2a9b656474d67c55010f2c003ea2cf889a19ff ]

No logic changes, just cleanup and prepare for fixing the UAF issue
in f2fs_free_dic.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Baocong Liu <baocong.liu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: 39868685c2a9 ("f2fs: compress: fix UAF of f2fs_inode_info in f2fs_free_dic")
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix to avoid invalid wait context issue
Chao Yu [Wed, 11 Jun 2025 08:42:18 +0000 (16:42 +0800)] 
f2fs: fix to avoid invalid wait context issue

[ Upstream commit 90d5c9ba3ed91950f1546bf123a7a57cd958b452 ]

=============================
[ BUG: Invalid wait context ]
6.13.0-rc1 #84 Tainted: G           O
-----------------------------
cat/56160 is trying to lock:
ffff888105c86648 (&cprc->stat_lock){+.+.}-{3:3}, at: update_general_status+0x32a/0x8c0 [f2fs]
other info that might help us debug this:
context-{5:5}
2 locks held by cat/56160:
 #0: ffff88810a002a98 (&p->lock){+.+.}-{4:4}, at: seq_read_iter+0x56/0x4c0
 #1: ffffffffa0462638 (f2fs_stat_lock){....}-{2:2}, at: stat_show+0x29/0x1020 [f2fs]
stack backtrace:
CPU: 0 UID: 0 PID: 56160 Comm: cat Tainted: G           O       6.13.0-rc1 #84
Tainted: [O]=OOT_MODULE
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
 <TASK>
 dump_stack_lvl+0x88/0xd0
 dump_stack+0x14/0x20
 __lock_acquire+0x8d4/0xbb0
 lock_acquire+0xd6/0x300
 _raw_spin_lock+0x38/0x50
 update_general_status+0x32a/0x8c0 [f2fs]
 stat_show+0x50/0x1020 [f2fs]
 seq_read_iter+0x116/0x4c0
 seq_read+0xfa/0x130
 full_proxy_read+0x66/0x90
 vfs_read+0xc4/0x350
 ksys_read+0x74/0xf0
 __x64_sys_read+0x1d/0x20
 x64_sys_call+0x17d9/0x1b80
 do_syscall_64+0x68/0x130
 entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x7f2ca53147e2

- seq_read
 - stat_show
  - raw_spin_lock_irqsave(&f2fs_stat_lock, flags)
  : f2fs_stat_lock is raw_spinlock_t type variable
  - update_general_status
   - spin_lock(&sbi->cprc_info.stat_lock);
   : stat_lock is spinlock_t type variable

The root cause is the lock order is incorrect [1], we should not acquire
spinlock_t lock after raw_spinlock_t lock, as if CONFIG_PREEMPT_LOCK is
on, spinlock_t is implemented based on rtmutex, which can sleep after
holding the lock.

To fix this issue, let's use change f2fs_stat_lock lock type from
raw_spinlock_t to spinlock_t, it's safe due to:
- we don't need to use raw version of spinlock as the path is not
performance sensitive.
- we don't need to use irqsave version of spinlock as it won't be
used in irq context.

Quoted from [1]:

"Extend lockdep to validate lock wait-type context.

The current wait-types are:

LD_WAIT_FREE, /* wait free, rcu etc.. */
LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */
LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */

Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).

This ensures that there is no attempt to acquire mutexes while holding
spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
other words, its a more fancy might_sleep()."

[1] https://lore.kernel.org/all/20200321113242.427089655@linutronix.de

Fixes: 98237fcda4a2 ("f2fs: use spin_lock to avoid hang")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: fix bio memleak when committing super block
Sheng Yong [Sat, 7 Jun 2025 06:41:16 +0000 (14:41 +0800)] 
f2fs: fix bio memleak when committing super block

[ Upstream commit 554d9b7242a73d701ce121ac81bb578a3fca538e ]

When committing new super block, bio is allocated but not freed, and
kmemleak complains:

  unreferenced object 0xffff88801d185600 (size 192):
    comm "kworker/3:2", pid 128, jiffies 4298624992
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 80 67 c3 00 81 88 ff ff  .........g......
      01 08 06 00 00 00 00 00 00 00 00 00 01 00 00 00  ................
    backtrace (crc 650ecdb1):
      kmem_cache_alloc_noprof+0x3a9/0x460
      mempool_alloc_noprof+0x12f/0x310
      bio_alloc_bioset+0x1e2/0x7e0
      __f2fs_commit_super+0xe0/0x370
      f2fs_commit_super+0x4ed/0x8c0
      f2fs_record_error_work+0xc7/0x190
      process_one_work+0x7db/0x1970
      worker_thread+0x518/0xea0
      kthread+0x359/0x690
      ret_from_fork+0x34/0x70
      ret_from_fork_asm+0x1a/0x30

The issue can be reproduced by:

  mount /dev/vda /mnt
  i=0
  while :; do
      echo '[h]abc' > /sys/fs/f2fs/vda/extension_list
      echo '[h]!abc' > /sys/fs/f2fs/vda/extension_list
      echo scan > /sys/kernel/debug/kmemleak
      dmesg | grep "new suspected memory leaks"
      [ $? -eq 0 ] && break
      i=$((i + 1))
      echo "$i"
  done
  umount /mnt

Fixes: 5bcde4557862 ("f2fs: get rid of buffer_head use")
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agof2fs: turn off one_time when forcibly set to foreground GC
Daeho Jeong [Fri, 6 Jun 2025 18:49:04 +0000 (11:49 -0700)] 
f2fs: turn off one_time when forcibly set to foreground GC

[ Upstream commit 8142daf8a53806689186ee255cc02f89af7f8890 ]

one_time mode is only for background GC. So, we need to set it back to
false when foreground GC is enforced.

Fixes: 9748c2ddea4a ("f2fs: do FG_GC when GC boosting is required for zoned devices")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agortc: rv3028: fix incorrect maximum clock rate handling
Brian Masney [Thu, 10 Jul 2025 15:20:26 +0000 (11:20 -0400)] 
rtc: rv3028: fix incorrect maximum clock rate handling

[ Upstream commit b574acb3cf7591d2513a9f29f8c2021ad55fb881 ]

When rv3028_clkout_round_rate() is called with a requested rate higher
than the highest supported rate, it currently returns 0, which disables
the clock. According to the clk API, round_rate() should instead return
the highest supported rate. Update the function to return the maximum
supported rate in this case.

Fixes: f583c341a515f ("rtc: rv3028: add clkout support")
Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-6-33140bb2278e@redhat.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agortc: pcf8563: fix incorrect maximum clock rate handling
Brian Masney [Thu, 10 Jul 2025 15:20:25 +0000 (11:20 -0400)] 
rtc: pcf8563: fix incorrect maximum clock rate handling

[ Upstream commit 906726a5efeefe0ef0103ccff5312a09080c04ae ]

When pcf8563_clkout_round_rate() is called with a requested rate higher
than the highest supported rate, it currently returns 0, which disables
the clock. According to the clk API, round_rate() should instead return
the highest supported rate. Update the function to return the maximum
supported rate in this case.

Fixes: a39a6405d5f94 ("rtc: pcf8563: add CLKOUT to common clock framework")
Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-5-33140bb2278e@redhat.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agortc: pcf85063: fix incorrect maximum clock rate handling
Brian Masney [Thu, 10 Jul 2025 15:20:24 +0000 (11:20 -0400)] 
rtc: pcf85063: fix incorrect maximum clock rate handling

[ Upstream commit 186ae1869880e58bb3f142d222abdb35ecb4df0f ]

When pcf85063_clkout_round_rate() is called with a requested rate higher
than the highest supported rate, it currently returns 0, which disables
the clock. According to the clk API, round_rate() should instead return
the highest supported rate. Update the function to return the maximum
supported rate in this case.

Fixes: 8c229ab6048b7 ("rtc: pcf85063: Add pcf85063 clkout control to common clock framework")
Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-4-33140bb2278e@redhat.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agortc: nct3018y: fix incorrect maximum clock rate handling
Brian Masney [Thu, 10 Jul 2025 15:20:23 +0000 (11:20 -0400)] 
rtc: nct3018y: fix incorrect maximum clock rate handling

[ Upstream commit 437c59e4b222cd697b4cf95995d933e7d583c5f1 ]

When nct3018y_clkout_round_rate() is called with a requested rate higher
than the highest supported rate, it currently returns 0, which disables
the clock. According to the clk API, round_rate() should instead return
the highest supported rate. Update the function to return the maximum
supported rate in this case.

Fixes: 5adbaed16cc63 ("rtc: Add NCT3018Y real time clock driver")
Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-3-33140bb2278e@redhat.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agortc: hym8563: fix incorrect maximum clock rate handling
Brian Masney [Thu, 10 Jul 2025 15:20:22 +0000 (11:20 -0400)] 
rtc: hym8563: fix incorrect maximum clock rate handling

[ Upstream commit d0a518eb0a692a2ab8357e844970660c5ea37720 ]

When hym8563_clkout_round_rate() is called with a requested rate higher
than the highest supported rate, it currently returns 0, which disables
the clock. According to the clk API, round_rate() should instead return
the highest supported rate. Update the function to return the maximum
supported rate in this case.

Fixes: dcaf038493525 ("rtc: add hym8563 rtc-driver")
Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-2-33140bb2278e@redhat.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agortc: ds1307: fix incorrect maximum clock rate handling
Brian Masney [Thu, 10 Jul 2025 15:20:21 +0000 (11:20 -0400)] 
rtc: ds1307: fix incorrect maximum clock rate handling

[ Upstream commit cf6eb547a24af7ad7bbd2abe9c5327f956bbeae8 ]

When ds3231_clk_sqw_round_rate() is called with a requested rate higher
than the highest supported rate, it currently returns 0, which disables
the clock. According to the clk API, round_rate() should instead return
the highest supported rate. Update the function to return the maximum
supported rate in this case.

Fixes: 6c6ff145b3346 ("rtc: ds1307: add clock provider support for DS3231")
Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-1-33140bb2278e@redhat.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoucount: fix atomic_long_inc_below() argument type
Uros Bizjak [Mon, 21 Jul 2025 17:45:57 +0000 (19:45 +0200)] 
ucount: fix atomic_long_inc_below() argument type

[ Upstream commit f8cd9193b62e92ad25def5370ca8ea2bc7585381 ]

The type of u argument of atomic_long_inc_below() should be long to avoid
unwanted truncation to int.

The patch fixes the wrong argument type of an internal function to
prevent unwanted argument truncation.  It fixes an internal locking
primitive; it should not have any direct effect on userspace.

Mark said

: AFAICT there's no problem in practice because atomic_long_inc_below()
: is only used by inc_ucount(), and it looks like the value is
: constrained between 0 and INT_MAX.
:
: In inc_ucount() the limit value is taken from
: user_namespace::ucount_max[], and AFAICT that's only written by
: sysctls, to the table setup by setup_userns_sysctls(), where
: UCOUNT_ENTRY() limits the value between 0 and INT_MAX.
:
: This is certainly a cleanup, but there might be no functional issue in
: practice as above.

Link: https://lkml.kernel.org/r/20250721174610.28361-1-ubizjak@gmail.com
Fixes: f9c82a4ea89c ("Increase size of ucounts to atomic_long_t")
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: MengEn Sun <mengensun@tencent.com>
Cc: "Thomas Weißschuh" <linux@weissschuh.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agomodule: Restore the moduleparam prefix length check
Petr Pavlu [Mon, 30 Jun 2025 14:32:34 +0000 (16:32 +0200)] 
module: Restore the moduleparam prefix length check

[ Upstream commit bdc877ba6b7ff1b6d2ebeff11e63da4a50a54854 ]

The moduleparam code allows modules to provide their own definition of
MODULE_PARAM_PREFIX, instead of using the default KBUILD_MODNAME ".".

Commit 730b69d22525 ("module: check kernel param length at compile time,
not runtime") added a check to ensure the prefix doesn't exceed
MODULE_NAME_LEN, as this is what param_sysfs_builtin() expects.

Later, commit 58f86cc89c33 ("VERIFY_OCTAL_PERMISSIONS: stricter checking
for sysfs perms.") removed this check, but there is no indication this was
intentional.

Since the check is still useful for param_sysfs_builtin() to function
properly, reintroduce it in __module_param_call(), but in a modernized form
using static_assert().

While here, clean up the __module_param_call() comments. In particular,
remove the comment "Default value instead of permissions?", which comes
from commit 9774a1f54f17 ("[PATCH] Compile-time check re world-writeable
module params"). This comment was related to the test variable
__param_perm_check_##name, which was removed in the previously mentioned
commit 58f86cc89c33.

Fixes: 58f86cc89c33 ("VERIFY_OCTAL_PERMISSIONS: stricter checking for sysfs perms.")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Link: https://lore.kernel.org/r/20250630143535.267745-4-petr.pavlu@suse.com
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoi3c: master: svc: Fix npcm845 FIFO_EMPTY quirk
Stanley Chu [Wed, 30 Jul 2025 00:37:19 +0000 (08:37 +0800)] 
i3c: master: svc: Fix npcm845 FIFO_EMPTY quirk

[ Upstream commit bc4a09d8e79cadccdd505f47b01903a80bc666e7 ]

In a private write transfer, the driver pre-fills the FIFO to work around
the FIFO_EMPTY quirk. However, if an IBIWON event occurs, the hardware
emits a NACK and the driver initiates a retry. During the retry, driver
attempts to pre-fill the FIFO again if there is remaining data, but since
the FIFO is already full, this leads to data loss.

Check available space in FIFO to prevent overflow.

Fixes: 4008a74e0f9b ("i3c: master: svc: Fix npcm845 FIFO empty issue")
Signed-off-by: Stanley Chu <yschu@nuvoton.com>
Link: https://lore.kernel.org/r/20250730003719.1825593-1-yschu@nuvoton.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoi3c: fix module_i3c_i2c_driver() with I3C=n
Arnd Bergmann [Fri, 25 Jul 2025 09:06:03 +0000 (11:06 +0200)] 
i3c: fix module_i3c_i2c_driver() with I3C=n

[ Upstream commit 5523a466e905b6287b94654ddb364536f2f948cf ]

When CONFIG_I3C is disabled and the i3c_i2c_driver_register() happens
to not be inlined, any driver calling it still references the i3c_driver
instance, which then causes a link failure:

x86_64-linux-ld: drivers/hwmon/lm75.o: in function `lm75_i3c_reg_read':
lm75.c:(.text+0xc61): undefined reference to `i3cdev_to_dev'
x86_64-linux-ld: lm75.c:(.text+0xd25): undefined reference to `i3c_device_do_priv_xfers'
x86_64-linux-ld: lm75.c:(.text+0xdd8): undefined reference to `i3c_device_do_priv_xfers'

This issue was part of the original i3c code, but only now caused problems
when i3c support got added to lm75.

Change the 'inline' annotations in the header to '__always_inline' to
ensure that the dead-code-elimination pass in the compiler can optimize
it out as intended.

Fixes: 6071d10413ff ("hwmon: (lm75) add I3C support for P3T1755")
Fixes: 3a379bbcea0a ("i3c: Add core I3C infrastructure")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250725090609.2456262-1-arnd@kernel.org
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoapparmor: Fix unaligned memory accesses in KUnit test
Helge Deller [Sat, 31 May 2025 15:08:22 +0000 (17:08 +0200)] 
apparmor: Fix unaligned memory accesses in KUnit test

[ Upstream commit c68804199dd9d63868497a27b5da3c3cd15356db ]

The testcase triggers some unnecessary unaligned memory accesses on the
parisc architecture:
  Kernel: unaligned access to 0x12f28e27 in policy_unpack_test_init+0x180/0x374 (iir 0x0cdc1280)
  Kernel: unaligned access to 0x12f28e67 in policy_unpack_test_init+0x270/0x374 (iir 0x64dc00ce)

Use the existing helper functions put_unaligned_le32() and
put_unaligned_le16() to avoid such warnings on architectures which
prefer aligned memory accesses.

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 98c0cc48e27e ("apparmor: fix policy_unpack_test on big endian systems")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agosquashfs: fix incorrect argument to sizeof in kmalloc_array call
Colin Ian King [Tue, 8 Jul 2025 14:26:04 +0000 (15:26 +0100)] 
squashfs: fix incorrect argument to sizeof in kmalloc_array call

[ Upstream commit 97103dcec292b8688de142f7a48bd0d46038d3f6 ]

The sizeof(void *) is the incorrect argument in the kmalloc_array call, it
best to fix this by using sizeof(*cache_folios) instead.

Fortunately the sizes of void* and folio* happen to be the same, so this
has not shown up as a run time issue.

[akpm@linux-foundation.org: fix build]
Link: https://lkml.kernel.org/r/20250708142604.1891156-1-colin.i.king@gmail.com
Fixes: 2e227ff5e272 ("squashfs: add optional full compressed block caching")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Chanho Min <chanho.min@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agosquashfs: use folios in squashfs_bio_read_cached()
Matthew Wilcox (Oracle) [Thu, 12 Jun 2025 14:39:01 +0000 (15:39 +0100)] 
squashfs: use folios in squashfs_bio_read_cached()

[ Upstream commit c9e3fb050e9cb0d3a833b2c62b35ea42cdd81e89 ]

Remove an access to page->mapping and a few calls to the old page-based
APIs.  This doesn't support large folios, but it's still a nice
improvement.

Link: https://lkml.kernel.org/r/20250612143903.2849289-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 97103dcec292 ("squashfs: fix incorrect argument to sizeof in kmalloc_array call")
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoscripts: gdb: move MNT_* constants to gdb-parsed
Johannes Berg [Wed, 18 Jun 2025 13:46:02 +0000 (15:46 +0200)] 
scripts: gdb: move MNT_* constants to gdb-parsed

[ Upstream commit 41a7f737685eed2700654720d3faaffdf0132135 ]

Since these are now no longer defines, but in an enum.

Link: https://lkml.kernel.org/r/20250618134629.25700-2-johannes@sipsolutions.net
Fixes: 101f2bbab541 ("fs: convert mount flags to enum")
Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agouprobes: revert ref_ctr_offset in uprobe_unregister error path
Jiri Olsa [Wed, 14 May 2025 10:18:09 +0000 (12:18 +0200)] 
uprobes: revert ref_ctr_offset in uprobe_unregister error path

[ Upstream commit aa644c405291a419e92b112e2279c01c410e9a26 ]

There's error path that could lead to inactive uprobe:

  1) uprobe_register succeeds - updates instruction to int3 and
     changes ref_ctr from 0 to 1
  2) uprobe_unregister fails  - int3 stays in place, but ref_ctr
     is changed to 0 (it's not restored to 1 in the fail path)
     uprobe is leaked
  3) another uprobe_register comes and re-uses the leaked uprobe
     and succeds - but int3 is already in place, so ref_ctr update
     is skipped and it stays 0 - uprobe CAN NOT be triggered now
  4) uprobe_unregister fails because ref_ctr value is unexpected

Fix this by reverting the updated ref_ctr value back to 1 in step 2),
which is the case when uprobe_unregister fails (int3 stays in place), but
we have already updated refctr.

The new scenario will go as follows:

  1) uprobe_register succeeds - updates instruction to int3 and
     changes ref_ctr from 0 to 1
  2) uprobe_unregister fails  - int3 stays in place and ref_ctr
     is reverted to 1..  uprobe is leaked
  3) another uprobe_register comes and re-uses the leaked uprobe
     and succeds - but int3 is already in place, so ref_ctr update
     is skipped and it stays 1 - uprobe CAN be triggered now
  4) uprobe_unregister succeeds

Link: https://lkml.kernel.org/r/20250514101809.2010193-1-jolsa@kernel.org
Fixes: 1cc33161a83d ("uprobes: Support SDT markers having reference count (semaphore)")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agodm-flakey: Fix corrupt_bio_byte setup checks
Kent Overstreet [Wed, 11 Jun 2025 19:12:20 +0000 (15:12 -0400)] 
dm-flakey: Fix corrupt_bio_byte setup checks

[ Upstream commit 75227ed6812cb869380c8fb6d41a845ae571781e ]

Fix the error_reads mode - it's incompatible with corrupt_bio_byte, but
that's only enabled if corrupt_bio_byte is nonzero.

Cc: Benjamin Marzinski <bmarzins@redhat.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: dm-devel@lists.linux.dev
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
Fixes: 19da6b2c9e8e ("dm-flakey: Clean up parsing messages")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoapparmor: fix loop detection used in conflicting attachment resolution
Ryan Lee [Thu, 1 May 2025 19:54:39 +0000 (12:54 -0700)] 
apparmor: fix loop detection used in conflicting attachment resolution

[ Upstream commit a88db916b8c77552f49f7d9f8744095ea01a268f ]

Conflicting attachment resolution is based on the number of states
traversed to reach an accepting state in the attachment DFA, accounting
for DFA loops traversed during the matching process. However, the loop
counting logic had multiple bugs:

 - The inc_wb_pos macro increments both position and length, but length
   is supposed to saturate upon hitting buffer capacity, instead of
   wrapping around.
 - If no revisited state is found when traversing the history, is_loop
   would still return true, as if there was a loop found the length of
   the history buffer, instead of returning false and signalling that
   no loop was found. As a result, the adjustment step of
   aa_dfa_leftmatch would sometimes produce negative counts with loop-
   free DFAs that traversed enough states.
 - The iteration in the is_loop for loop is supposed to stop before
   i = wb->len, so the conditional should be < instead of <=.

This patch fixes the above bugs as well as the following nits:
 - The count and size fields in struct match_workbuf were not used,
   so they can be removed.
 - The history buffer in match_workbuf semantically stores aa_state_t
   and not unsigned ints, even if aa_state_t is currently unsigned int.
 - The local variables in is_loop are counters, and thus should be
   unsigned ints instead of aa_state_t's.

Fixes: 21f606610502 ("apparmor: improve overlapping domain attachment resolution")
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Co-developed-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoapparmor: ensure WB_HISTORY_SIZE value is a power of 2
Ryan Lee [Thu, 1 May 2025 19:54:38 +0000 (12:54 -0700)] 
apparmor: ensure WB_HISTORY_SIZE value is a power of 2

[ Upstream commit 6c055e62560b958354625604293652753d82bcae ]

WB_HISTORY_SIZE was defined to be a value not a power of 2, despite a
comment in the declaration of struct match_workbuf stating it is and a
modular arithmetic usage in the inc_wb_pos macro assuming that it is. Bump
WB_HISTORY_SIZE's value up to 32 and add a BUILD_BUG_ON_NOT_POWER_OF_2
line to ensure that any future changes to the value of WB_HISTORY_SIZE
respect this requirement.

Fixes: 136db994852a ("apparmor: increase left match history buffer size")
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agobpf: Check netfilter ctx accesses are aligned
Paul Chaignon [Fri, 1 Aug 2025 09:48:15 +0000 (11:48 +0200)] 
bpf: Check netfilter ctx accesses are aligned

[ Upstream commit 9e6448f7b1efb27f8d508b067ecd33ed664a4246 ]

Similarly to the previous patch fixing the flow_dissector ctx accesses,
nf_is_valid_access also doesn't check that ctx accesses are aligned.
Contrary to flow_dissector programs, netfilter programs don't have
context conversion. The unaligned ctx accesses are therefore allowed by
the verifier.

Fixes: fd9c663b9ad6 ("bpf: minimal support for programs hooked into netfilter framework")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/853ae9ed5edaa5196e8472ff0f1bb1cc24059214.1754039605.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agobpf: Check flow_dissector ctx accesses are aligned
Paul Chaignon [Fri, 1 Aug 2025 09:47:23 +0000 (11:47 +0200)] 
bpf: Check flow_dissector ctx accesses are aligned

[ Upstream commit ead3d7b2b6afa5ee7958620c4329982a7d9c2b78 ]

flow_dissector_is_valid_access doesn't check that the context access is
aligned. As a consequence, an unaligned access within one of the exposed
field is considered valid and later rejected by
flow_dissector_convert_ctx_access when we try to convert it.

The later rejection is problematic because it's reported as a verifier
bug with a kernel warning and doesn't point to the right instruction in
verifier logs.

Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
Reported-by: syzbot+ccac90e482b2a81d74aa@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ccac90e482b2a81d74aa
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/cc1b036be484c99be45eddf48bd78cc6f72839b1.1754039605.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovhost: Reintroduce kthread API and add mode selection
Cindy Lu [Mon, 14 Jul 2025 07:12:32 +0000 (15:12 +0800)] 
vhost: Reintroduce kthread API and add mode selection

[ Upstream commit 7d9896e9f6d02d8aa85e63f736871f96c59a5263 ]

Since commit 6e890c5d5021 ("vhost: use vhost_tasks for worker threads"),
the vhost uses vhost_task and operates as a child of the
owner thread. This is required for correct CPU usage accounting,
especially when using containers.

However, this change has caused confusion for some legacy
userspace applications, and we didn't notice until it's too late.

Unfortunately, it's too late to revert - we now have userspace
depending both on old and new behaviour :(

To address the issue, reintroduce kthread mode for vhost workers and
provide a configuration to select between kthread and task worker.

- Add 'fork_owner' parameter to vhost_dev to let users select kthread
  or task mode. Default mode is task mode(VHOST_FORK_OWNER_TASK).

- Reintroduce kthread mode support:
  * Bring back the original vhost_worker() implementation,
    and renamed to vhost_run_work_kthread_list().
  * Add cgroup support for the kthread
  * Introduce struct vhost_worker_ops:
    - Encapsulates create / stop / wake‑up callbacks.
    - vhost_worker_create() selects the proper ops according to
      inherit_owner.

- Userspace configuration interface:
  * New IOCTLs:
      - VHOST_SET_FORK_FROM_OWNER lets userspace select task mode
        (VHOST_FORK_OWNER_TASK) or kthread mode (VHOST_FORK_OWNER_KTHREAD)
      - VHOST_GET_FORK_FROM_OWNER reads the current worker mode
  * Expose module parameter 'fork_from_owner_default' to allow system
    administrators to configure the default mode for vhost workers
  * Kconfig option CONFIG_VHOST_ENABLE_FORK_OWNER_CONTROL controls whether
    these IOCTLs and the parameter are available

- The VHOST_NEW_WORKER functionality requires fork_owner to be set
  to true, with validation added to ensure proper configuration

This partially reverts or improves upon:
  commit 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
  commit 1cdaafa1b8b4 ("vhost: replace single worker pointer with xarray")

Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads"),
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20250714071333.59794-2-lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovdpa: Fix IDR memory leak in VDUSE module exit
Anders Roxell [Fri, 4 Jul 2025 12:53:35 +0000 (14:53 +0200)] 
vdpa: Fix IDR memory leak in VDUSE module exit

[ Upstream commit d9ea58b5dc6b4b50fbb6a10c73f840e8b10442b7 ]

Add missing idr_destroy() call in vduse_exit() to properly free the
vduse_idr radix tree nodes. Without this, module load/unload cycles leak
576-byte radix tree node allocations, detectable by kmemleak as:

unreferenced object (size 576):
  backtrace:
    [<ffffffff81234567>] radix_tree_node_alloc+0xa0/0xf0
    [<ffffffff81234568>] idr_get_free+0x128/0x280

The vduse_idr is initialized via DEFINE_IDR() at line 136 and used throughout
the VDUSE (vDPA Device in Userspace) driver for device ID management. The fix
follows the documented pattern in lib/idr.c and matches the cleanup approach
used by other drivers.

This leak was discovered through comprehensive module testing with cumulative
kmemleak detection across 10 load/unload iterations per module.

Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Message-Id: <20250704125335.1084649-1-anders.roxell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovdpa/mlx5: Fix release of uninitialized resources on error path
Dragos Tatulea [Tue, 8 Jul 2025 12:04:24 +0000 (12:04 +0000)] 
vdpa/mlx5: Fix release of uninitialized resources on error path

[ Upstream commit cc51a66815999afb7e9cd845968de4fdf07567b7 ]

The commit in the fixes tag made sure that mlx5_vdpa_free()
is the single entrypoint for removing the vdpa device resources
added in mlx5_vdpa_dev_add(), even in the cleanup path of
mlx5_vdpa_dev_add().

This means that all functions from mlx5_vdpa_free() should be able to
handle uninitialized resources. This was not the case though:
mlx5_vdpa_destroy_mr_resources() and mlx5_cmd_cleanup_async_ctx()
were not able to do so. This caused the splat below when adding
a vdpa device without a MAC address.

This patch fixes these remaining issues:

- Makes mlx5_vdpa_destroy_mr_resources() return early if called on
  uninitialized resources.

- Moves mlx5_cmd_init_async_ctx() early on during device addition
  because it can't fail. This means that mlx5_cmd_cleanup_async_ctx()
  also can't fail. To mirror this, move the call site of
  mlx5_cmd_cleanup_async_ctx() in mlx5_vdpa_free().

An additional comment was added in mlx5_vdpa_free() to document
the expectations of functions called from this context.

Splat:

  mlx5_core 0000:b5:03.2: mlx5_vdpa_dev_add:3950:(pid 2306) warning: No mac address provisioned?
  ------------[ cut here ]------------
  WARNING: CPU: 13 PID: 2306 at kernel/workqueue.c:4207 __flush_work+0x9a/0xb0
  [...]
  Call Trace:
   <TASK>
   ? __try_to_del_timer_sync+0x61/0x90
   ? __timer_delete_sync+0x2b/0x40
   mlx5_vdpa_destroy_mr_resources+0x1c/0x40 [mlx5_vdpa]
   mlx5_vdpa_free+0x45/0x160 [mlx5_vdpa]
   vdpa_release_dev+0x1e/0x50 [vdpa]
   device_release+0x31/0x90
   kobject_cleanup+0x37/0x130
   mlx5_vdpa_dev_add+0x327/0x890 [mlx5_vdpa]
   vdpa_nl_cmd_dev_add_set_doit+0x2c1/0x4d0 [vdpa]
   genl_family_rcv_msg_doit+0xd8/0x130
   genl_family_rcv_msg+0x14b/0x220
   ? __pfx_vdpa_nl_cmd_dev_add_set_doit+0x10/0x10 [vdpa]
   genl_rcv_msg+0x47/0xa0
   ? __pfx_genl_rcv_msg+0x10/0x10
   netlink_rcv_skb+0x53/0x100
   genl_rcv+0x24/0x40
   netlink_unicast+0x27b/0x3b0
   netlink_sendmsg+0x1f7/0x430
   __sys_sendto+0x1fa/0x210
   ? ___pte_offset_map+0x17/0x160
   ? next_uptodate_folio+0x85/0x2b0
   ? percpu_counter_add_batch+0x51/0x90
   ? filemap_map_pages+0x515/0x660
   __x64_sys_sendto+0x20/0x30
   do_syscall_64+0x7b/0x2c0
   ? do_read_fault+0x108/0x220
   ? do_pte_missing+0x14a/0x3e0
   ? __handle_mm_fault+0x321/0x730
   ? count_memcg_events+0x13f/0x180
   ? handle_mm_fault+0x1fb/0x2d0
   ? do_user_addr_fault+0x20c/0x700
   ? syscall_exit_work+0x104/0x140
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7f0c25b0feca
  [...]
  ---[ end trace 0000000000000000 ]---

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Fixes: 83e445e64f48 ("vdpa/mlx5: Fix error path during device add")
Reported-by: Wenli Quan <wquan@redhat.com>
Closes: https://lore.kernel.org/virtualization/CADZSLS0r78HhZAStBaN1evCSoPqRJU95Lt8AqZNJ6+wwYQ6vPQ@mail.gmail.com/
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Message-Id: <20250708120424.2363354-2-dtatulea@nvidia.com>
Tested-by: Wenli Quan <wquan@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovhost-scsi: Fix check for inline_sg_cnt exceeding preallocated limit
Alok Tiwari [Sat, 28 Jun 2025 18:33:53 +0000 (11:33 -0700)] 
vhost-scsi: Fix check for inline_sg_cnt exceeding preallocated limit

[ Upstream commit 400cad513c78f9af72c5a20f3611c1f1dc71d465 ]

The condition comparing ret to VHOST_SCSI_PREALLOC_SGLS was incorrect,
as ret holds the result of kstrtouint() (typically 0 on success),
not the parsed value. Update the check to use cnt, which contains the
actual user-provided value.

prevents silently accepting values exceeding the maximum inline_sg_cnt.

Fixes: bca939d5bcd0 ("vhost-scsi: Dynamically allocate scatterlists")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20250628183405.3979538-1-alok.a.tiwari@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovhost-scsi: Fix log flooding with target does not exist errors
Mike Christie [Wed, 11 Jun 2025 21:01:13 +0000 (16:01 -0500)] 
vhost-scsi: Fix log flooding with target does not exist errors

[ Upstream commit 69cd720a8a5e9ef0f05ce5dd8c9ea6e018245c82 ]

As part of the normal initiator side scanning the guest's scsi layer
will loop over all possible targets and send an inquiry. Since the
max number of targets for virtio-scsi is 256, this can result in 255
error messages about targets not existing if you only have a single
target. When there's more than 1 vhost-scsi device each with a single
target, then you get N * 255 log messages.

It looks like the log message was added by accident in:

commit 3f8ca2e115e5 ("vhost/scsi: Extract common handling code from
control queue handler")

when we added common helpers. Then in:

commit 09d7583294aa ("vhost/scsi: Use common handling code in request
queue handler")

we converted the scsi command processing path to use the new
helpers so we started to see the extra log messages during scanning.

The patches were just making some code common but added the vq_err
call and I'm guessing the patch author forgot to enable the vq_err
call (vq_err is implemented by pr_debug which defaults to off). So
this patch removes the call since it's expected to hit this path
during device discovery.

Fixes: 09d7583294aa ("vhost/scsi: Use common handling code in request queue handler")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20250611210113.10912-1-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agovdpa/mlx5: Fix needs_teardown flag calculation
Dragos Tatulea [Wed, 4 Jun 2025 18:48:01 +0000 (21:48 +0300)] 
vdpa/mlx5: Fix needs_teardown flag calculation

[ Upstream commit 6f0f3d7fc4e05797b801ded4910a64d16db230e9 ]

needs_teardown is a device flag that indicates when virtual queues need
to be recreated. This happens for certain configuration changes: queue
size and some specific features.

Currently, the needs_teardown state can be incorrectly reset by
subsequent .set_vq_num() calls. For example, for 1 rx VQ with size 512
and 1 tx VQ with size 256:

.set_vq_num(0, 512) -> sets needs_teardown to true (rx queue has a
                       non-default size)
.set_vq_num(1, 256) -> sets needs_teardown to false (tx queue has a
                       default size)

This change takes into account the previous value of the needs_teardown
flag when re-calculating it during VQ size configuration.

Fixes: 0fe963d6fc16 ("vdpa/mlx5: Re-create HW VQs under certain conditions")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu<si-wei.liu@oracle.com>
Message-Id: <20250604184802.2625300-1-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agobpf: Fix oob access in cgroup local storage
Daniel Borkmann [Wed, 30 Jul 2025 23:47:33 +0000 (01:47 +0200)] 
bpf: Fix oob access in cgroup local storage

[ Upstream commit abad3d0bad72a52137e0c350c59542d75ae4f513 ]

Lonial reported that an out-of-bounds access in cgroup local storage
can be crafted via tail calls. Given two programs each utilizing a
cgroup local storage with a different value size, and one program
doing a tail call into the other. The verifier will validate each of
the indivial programs just fine. However, in the runtime context
the bpf_cg_run_ctx holds an bpf_prog_array_item which contains the
BPF program as well as any cgroup local storage flavor the program
uses. Helpers such as bpf_get_local_storage() pick this up from the
runtime context:

  ctx = container_of(current->bpf_ctx, struct bpf_cg_run_ctx, run_ctx);
  storage = ctx->prog_item->cgroup_storage[stype];

  if (stype == BPF_CGROUP_STORAGE_SHARED)
    ptr = &READ_ONCE(storage->buf)->data[0];
  else
    ptr = this_cpu_ptr(storage->percpu_buf);

For the second program which was called from the originally attached
one, this means bpf_get_local_storage() will pick up the former
program's map, not its own. With mismatching sizes, this can result
in an unintended out-of-bounds access.

To fix this issue, we need to extend bpf_map_owner with an array of
storage_cookie[] to match on i) the exact maps from the original
program if the second program was using bpf_get_local_storage(), or
ii) allow the tail call combination if the second program was not
using any of the cgroup local storage maps.

Fixes: 7d9c3427894f ("bpf: Make cgroup storages shared between programs on the same cgroup")
Reported-by: Lonial Con <kongln9170@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20250730234733.530041-4-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agobpf: Move cgroup iterator helpers to bpf.h
Daniel Borkmann [Wed, 30 Jul 2025 23:47:32 +0000 (01:47 +0200)] 
bpf: Move cgroup iterator helpers to bpf.h

[ Upstream commit 9621e60f59eae87eb9ffe88d90f24f391a1ef0f0 ]

Move them into bpf.h given we also need them in core code.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20250730234733.530041-3-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: abad3d0bad72 ("bpf: Fix oob access in cgroup local storage")
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agobpf: Move bpf map owner out of common struct
Daniel Borkmann [Wed, 30 Jul 2025 23:47:31 +0000 (01:47 +0200)] 
bpf: Move bpf map owner out of common struct

[ Upstream commit fd1c98f0ef5cbcec842209776505d9e70d8fcd53 ]

Given this is only relevant for BPF tail call maps, it is adding up space
and penalizing other map types. We also need to extend this with further
objects to track / compare to. Therefore, lets move this out into a separate
structure and dynamically allocate it only for BPF tail call maps.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20250730234733.530041-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: abad3d0bad72 ("bpf: Fix oob access in cgroup local storage")
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agobpf: Add cookie object to bpf maps
Daniel Borkmann [Wed, 30 Jul 2025 23:47:30 +0000 (01:47 +0200)] 
bpf: Add cookie object to bpf maps

[ Upstream commit 12df58ad294253ac1d8df0c9bb9cf726397a671d ]

Add a cookie to BPF maps to uniquely identify BPF maps for the timespan
when the node is up. This is different to comparing a pointer or BPF map
id which could get rolled over and reused.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20250730234733.530041-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: abad3d0bad72 ("bpf: Fix oob access in cgroup local storage")
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoperf record: Cache build-ID of hit DSOs only
Namhyung Kim [Thu, 31 Jul 2025 07:03:30 +0000 (00:03 -0700)] 
perf record: Cache build-ID of hit DSOs only

[ Upstream commit 6235ce77749f45cac27f630337e2fdf04e8a6c73 ]

It post-processes samples to find which DSO has samples.  Based on that
info, it can save used DSOs in the build-ID cache directory.  But for
some reason, it saves all DSOs without checking the hit mark.  Skipping
unused DSOs can give some speedup especially with --buildid-mmap being
default.

On my idle machine, `time perf record -a sleep 1` goes down from 3 sec
to 1.5 sec with this change.

Fixes: e29386c8f7d71fa5 ("perf record: Add --buildid-mmap option to enable PERF_RECORD_MMAP2's build id")
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250731070330.57116-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoALSA: usb: scarlett2: Fix missing NULL check
Takashi Iwai [Thu, 31 Jul 2025 05:37:08 +0000 (07:37 +0200)] 
ALSA: usb: scarlett2: Fix missing NULL check

[ Upstream commit df485a4b2b3ee5b35c80f990beb554e38a8a5fb1 ]

scarlett2_input_select_ctl_info() sets up the string arrays allocated
via kasprintf(), but it misses NULL checks, which may lead to NULL
dereference Oops.  Let's add the proper NULL check.

Fixes: 8eba063b5b2b ("ALSA: scarlett2: Simplify linked channel handling")
Link: https://patch.msgid.link/20250731053714.29414-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoselftests: ALSA: fix memory leak in utimer test
WangYuli [Thu, 31 Jul 2025 10:02:22 +0000 (18:02 +0800)] 
selftests: ALSA: fix memory leak in utimer test

[ Upstream commit 6260da046819b7bda828bacae148fc8856fdebd7 ]

Free the malloc'd buffer in TEST_F(timer_f, utimer) to prevent
memory leak.

Fixes: 1026392d10af ("selftests: ALSA: Cover userspace-driven timers with test")
Reported-by: Jun Zhan <zhanjun@uniontech.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Link: https://patch.msgid.link/DE4D931FCF54F3DB+20250731100222.65748-1-wangyuli@uniontech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agodrm/xe/vf: Disable CSC support on VF
Lukasz Laguna [Tue, 29 Jul 2025 12:34:37 +0000 (14:34 +0200)] 
drm/xe/vf: Disable CSC support on VF

[ Upstream commit f62408efc8669b82541295a4611494c8c8c52684 ]

CSC is not accessible by VF drivers, so disable its support flag on VF
to prevent further initialization attempts.

Fixes: e02cea83d32d ("drm/xe/gsc: add Battlemage support")
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Cc: Alexander Usyskin <alexander.usyskin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250729123437.5933-1-lukasz.laguna@intel.com
(cherry picked from commit 552dbba1caaf0cb40ce961806d757615e26ec668)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agomtd: rawnand: atmel: set pmecc data setup time
Balamanikandan Gunasundar [Mon, 21 Jul 2025 10:43:40 +0000 (16:13 +0530)] 
mtd: rawnand: atmel: set pmecc data setup time

[ Upstream commit f552a7c7e0a14215cb8a6fd89e60fa3932a74786 ]

Setup the pmecc data setup time as 3 clock cycles for 133MHz as recommended
by the datasheet.

Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Reported-by: Zixun LI <admin@hifiphile.com>
Closes: https://lore.kernel.org/all/c015bb20-6a57-4f63-8102-34b3d83e0f5b@microchip.com
Suggested-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Signed-off-by: Balamanikandan Gunasundar <balamanikandan.gunasundar@microchip.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agomtd: rawnand: rockchip: Add missing check after DMA map
Thomas Fourier [Mon, 7 Jul 2025 07:15:50 +0000 (09:15 +0200)] 
mtd: rawnand: rockchip: Add missing check after DMA map

[ Upstream commit 3b36f86dc47261828f96f826077131a35dd825fd ]

The DMA map functions can fail and should be tested for errors.

Fixes: 058e0e847d54 ("mtd: rawnand: rockchip: NFC driver for RK3308, RK2928 and others")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agomtd: rawnand: atmel: Fix dma_mapping_error() address
Thomas Fourier [Wed, 2 Jul 2025 06:45:11 +0000 (08:45 +0200)] 
mtd: rawnand: atmel: Fix dma_mapping_error() address

[ Upstream commit e1e6b933c56b1e9fda93caa0b8bae39f3f421e5c ]

It seems like what was intended is to test if the dma_map of the
previous line failed but the wrong dma address was passed.

Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Rule: add
Link: https://lore.kernel.org/stable/20250702064515.18145-2-fourier.thomas%40gmail.com
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agojfs: fix metapage reference count leak in dbAllocCtl
Zheng Yu [Tue, 29 Jul 2025 01:22:14 +0000 (01:22 +0000)] 
jfs: fix metapage reference count leak in dbAllocCtl

[ Upstream commit 856db37592021e9155384094e331e2d4589f28b1 ]

In dbAllocCtl(), read_metapage() increases the reference count of the
metapage. However, when dp->tree.budmin < 0, the function returns -EIO
without calling release_metapage() to decrease the reference count,
leading to a memory leak.

Add release_metapage(mp) before the error return to properly manage
the metapage reference count and prevent the leak.

Fixes: a5f5e4698f8abbb25fe4959814093fb5bfa1aa9d ("jfs: fix shift-out-of-bounds in dbSplit")
Signed-off-by: Zheng Yu <zheng.yu@northwestern.edu>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agodrm/xe/configfs: Fix pci_dev reference leak
Michal Wajdeczko [Tue, 22 Jul 2025 14:10:54 +0000 (16:10 +0200)] 
drm/xe/configfs: Fix pci_dev reference leak

[ Upstream commit 942ac8da6388c25fe62b2792c78715e0ea6e649b ]

We are using pci_get_domain_bus_and_slot() function to verify if
the given config directory name matches any existing PCI device,
but we missed to call matching pci_dev_put() to release reference.

While around, also change error code in case of no device match,
to make it more specific than generic formatting error.

Fixes: 16280ded45fb ("drm/xe: Add configfs to enable survivability mode")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://lore.kernel.org/r/20250722141059.30707-2-michal.wajdeczko@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 0bdd05c2a82bbf2419415d012fd4f5faeca7f1af)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agosmb: client: allow parsing zero-length AV pairs
Paulo Alcantara [Fri, 25 Jul 2025 03:04:43 +0000 (00:04 -0300)] 
smb: client: allow parsing zero-length AV pairs

[ Upstream commit be77ab6b9fbe348daf3c2d3ee40f23ca5110a339 ]

Zero-length AV pairs should be considered as valid target infos.
Don't skip the next AV pairs that follow them.

Cc: linux-cifs@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Fixes: 0e8ae9b953bc ("smb: client: parse av pair type 4 in CHALLENGE_MESSAGE")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agofbdev: imxfb: Check fb_add_videomode to prevent null-ptr-deref
Chenyuan Yang [Thu, 24 Jul 2025 03:25:34 +0000 (22:25 -0500)] 
fbdev: imxfb: Check fb_add_videomode to prevent null-ptr-deref

[ Upstream commit da11e6a30e0bb8e911288bdc443b3dc8f6a7cac7 ]

fb_add_videomode() can fail with -ENOMEM when its internal kmalloc() cannot
allocate a struct fb_modelist.  If that happens, the modelist stays empty but
the driver continues to register.  Add a check for its return value to prevent
poteintial null-ptr-deref, which is similar to the commit 17186f1f90d3 ("fbdev:
Fix do_register_framebuffer to prevent null-ptr-deref in fb_videomode_to_var").

Fixes: 1b6c79361ba5 ("video: imxfb: Add DT support")
Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agocrypto: qat - fix seq_file position update in adf_ring_next()
Giovanni Cabiddu [Mon, 14 Jul 2025 07:10:29 +0000 (08:10 +0100)] 
crypto: qat - fix seq_file position update in adf_ring_next()

[ Upstream commit 6908c5f4f066a0412c3d9a6f543a09fa7d87824b ]

The `adf_ring_next()` function in the QAT debug transport interface
fails to correctly update the position index when reaching the end of
the ring elements. This triggers the following kernel warning when
reading ring files, such as
/sys/kernel/debug/qat_c6xx_<D:B:D:F>/transport/bank_00/ring_00:

   [27725.022965] seq_file: buggy .next function adf_ring_next [intel_qat] did not update position index

Ensure that the `*pos` index is incremented before returning NULL when
after the last element in the ring is found, satisfying the seq_file API
requirements and preventing the warning.

Fixes: a672a9dc872e ("crypto: qat - Intel(R) QAT transport code")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Ahsan Atta <ahsan.atta@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agocrypto: qat - fix DMA direction for compression on GEN2 devices
Giovanni Cabiddu [Mon, 14 Jul 2025 07:07:49 +0000 (08:07 +0100)] 
crypto: qat - fix DMA direction for compression on GEN2 devices

[ Upstream commit d41d75fe1b751ee6b347bf1cb1cfe9accc4fcb12 ]

QAT devices perform an additional integrity check during compression by
decompressing the output. Starting from QAT GEN4, this verification is
done in-line by the hardware. However, on GEN2 devices, the hardware
reads back the compressed output from the destination buffer and performs
a decompression operation using it as the source.

In the current QAT driver, destination buffers are always marked as
write-only. This is incorrect for QAT GEN2 compression, where the buffer
is also read during verification. Since commit 6f5dc7658094
("iommu/vt-d: Restore WO permissions on second-level paging entries"),
merged in v6.16-rc1, write-only permissions are strictly enforced, leading
to DMAR errors when using QAT GEN2 devices for compression, if VT-d is
enabled.

Mark the destination buffers as DMA_BIDIRECTIONAL. This ensures
compatibility with GEN2 devices, even though it is not required for
QAT GEN4 and later.

Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Fixes: cf5bb835b7c8 ("crypto: qat - fix DMA transfer direction")
Reviewed-by: Ahsan Atta <ahsan.atta@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoclk: clocking-wizard: Fix the round rate handling for versal
Shubhrajyoti Datta [Wed, 25 Jun 2025 05:41:14 +0000 (11:11 +0530)] 
clk: clocking-wizard: Fix the round rate handling for versal

[ Upstream commit 7f5e9ca0a424af44a708bb4727624d56f83ecffa ]

Fix the `clk_round_rate` implementation for Versal platforms by calling
the Versal-specific divider calculation helper. The existing code used
the generic divider routine, which results in incorrect round rate.

Fixes: 7681f64e6404 ("clk: clocking-wizard: calculate dividers fractional parts")
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Link: https://lore.kernel.org/r/20250625054114.28273-1-shubhrajyoti.datta@amd.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoperf tools: Remove libtraceevent in .gitignore
Chen Pei [Sat, 26 Jul 2025 11:15:32 +0000 (19:15 +0800)] 
perf tools: Remove libtraceevent in .gitignore

[ Upstream commit af470fb532fc803c4c582d15b4bd394682a77a15 ]

The libtraceevent has been removed from the source tree, and .gitignore
needs to be updated as well.

Fixes: 4171925aa9f3f7bf ("tools lib traceevent: Remove libtraceevent")
Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250726111532.8031-1-cp0613@linux.alibaba.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agosh: Do not use hyphen in exported variable name
Ben Hutchings [Thu, 17 Jul 2025 14:47:32 +0000 (16:47 +0200)] 
sh: Do not use hyphen in exported variable name

[ Upstream commit c32969d0362a790fbc6117e0b6a737a7e510b843 ]

arch/sh/Makefile defines and exports ld-bfd to be used by
arch/sh/boot/compressed/Makefile and arch/sh/boot/romimage/Makefile.
However some shells, including dash, will not pass through environment
variables whose name includes a hyphen.  Usually GNU make does not use
a shell to recurse, but if e.g. $(srctree) contains '~' it will use a
shell here.

Other instances of this problem were previously fixed by commits
2bfbe7881ee0 "kbuild: Do not use hyphen in exported variable name"
and 82977af93a0d "sh: rename suffix-y to suffix_y".

Rename the variable to ld_bfd.

References: https://buildd.debian.org/status/fetch.php?pkg=linux&arch=sh4&ver=4.13%7Erc5-1%7Eexp1&stamp=1502943967&raw=0
Fixes: 7b022d07a0fd ("sh: Tidy up the ldscript output format specifier.")
Signed-off-by: Ben Hutchings <benh@debian.org>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoclk: spacemit: ccu_pll: fix error return value in recalc_rate callback
Akhilesh Patil [Wed, 23 Jul 2025 05:29:56 +0000 (10:59 +0530)] 
clk: spacemit: ccu_pll: fix error return value in recalc_rate callback

[ Upstream commit c60b95389d0206a3a3c087c09113315e7084be3f ]

Return 0 instead of -EINVAL if function ccu_pll_recalc_rate() fails to
get correct rate entry. Follow .recalc_rate callback documentation
as mentioned in include/linux/clk-provider.h for error return value.

Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in>
Fixes: 1b72c59db0add ("clk: spacemit: Add clock support for SpacemiT K1 SoC")
Reviewed-by: Haylen Chu <heylenay@4d2.org>
Reviewed-by: Alex Elder <elder@riscstar.com>
Link: https://lore.kernel.org/r/aIBzVClNQOBrjIFG@bhairav-test.ee.iitb.ac.in
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoASoC: fsl_xcvr: get channel status data with firmware exists
Shengjiu Wang [Thu, 10 Jul 2025 03:04:05 +0000 (11:04 +0800)] 
ASoC: fsl_xcvr: get channel status data with firmware exists

[ Upstream commit 6776ecc9dd587c08a6bb334542f9f8821a091013 ]

For the XCVR module on i.MX95, even though it only supports SPDIF, the
channel status needs to be obtained from RAM space, which is processed
by firmware. Firmware is necessary to trigger the FSL_XCVR_IRQ_NEW_CS
interrupt.

This change also applies for the SPDIF & ARC function on i.MX8MP which
has the firmware.

Fixes: e6a9750a346b ("ASoC: fsl_xcvr: Add suspend and resume support")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20250710030405.3370671-3-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoASoC: fsl_xcvr: get channel status data when PHY is not exists
Shengjiu Wang [Thu, 10 Jul 2025 03:04:04 +0000 (11:04 +0800)] 
ASoC: fsl_xcvr: get channel status data when PHY is not exists

[ Upstream commit ca592e20659e0304ebd8f4dabb273da4f9385848 ]

There is no PHY for the XCVR module on i.MX93, the channel status needs
to be obtained from FSL_XCVR_RX_CS_DATA_* registers. And channel status
acknowledge (CSA) bit should be set once channel status is processed.

Fixes: e240b9329a30 ("ASoC: fsl_xcvr: Add support for i.MX93 platform")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20250710030405.3370671-2-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoASoC: SDCA: Fix some holes in the regmap readable/writeable helpers
Charles Keepax [Fri, 18 Jul 2025 13:54:31 +0000 (14:54 +0100)] 
ASoC: SDCA: Fix some holes in the regmap readable/writeable helpers

[ Upstream commit 061fade7a67f6cdfe918a675270d84107abbef61 ]

The current regmap readable/writeable helper functions always
allow the Next flag and allows any Control Number. Mask the Next
flag based on SDCA_ACCESS_MODE_DUAL which is the only Mode that
supports it. Also check that the Control Number is valid for
the given control.

Fixes: e3f7caf74b79 ("ASoC: SDCA: Add generic regmap SDCA helpers")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20250718135432.1048566-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agomfd: tps65219: Update TPS65214 MFD cell's GPIO compatible string
Shree Ramamoorthy [Tue, 27 May 2025 19:04:54 +0000 (14:04 -0500)] 
mfd: tps65219: Update TPS65214 MFD cell's GPIO compatible string

[ Upstream commit 6f27d26e363a41fc651be852094823ce47a43243 ]

This patch reflects the change made to move TPS65215 from 1 GPO and 1 GPIO
to 2 GPOs and 1 GPIO. TPS65215 and TPS65219 both have 2 GPOs and 1 GPIO.
TPS65214 has 1 GPO and 1 GPIO. TPS65215 will reuse the TPS65219 GPIO
compatible string.

TPS65214 TRM: https://www.ti.com/lit/pdf/slvud30
TPS65215 TRM: https://www.ti.com/lit/pdf/slvucw5/

Fixes: 7947219ab1a2 ("mfd: tps65219: Add support for TI TPS65214 PMIC")
Signed-off-by: Shree Ramamoorthy <s-ramamoorthy@ti.com>
Link: https://lore.kernel.org/r/20250527190455.169772-2-s-ramamoorthy@ti.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agodmaengine: nbpfaxi: Add missing check after DMA map
Thomas Fourier [Mon, 7 Jul 2025 07:57:16 +0000 (09:57 +0200)] 
dmaengine: nbpfaxi: Add missing check after DMA map

[ Upstream commit c6ee78fc8f3e653bec427cfd06fec7877ee782bd ]

The DMA map functions can fail and should be tested for errors.
If the mapping fails, unmap and return an error.

Fixes: b45b262cefd5 ("dmaengine: add a driver for AMBA AXI NBPF DMAC IP cores")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250707075752.28674-2-fourier.thomas@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agodmaengine: mv_xor: Fix missing check after DMA map and missing unmap
Thomas Fourier [Tue, 1 Jul 2025 12:37:52 +0000 (14:37 +0200)] 
dmaengine: mv_xor: Fix missing check after DMA map and missing unmap

[ Upstream commit 60095aca6b471b7b7a79c80b7395f7e4e414b479 ]

The DMA map functions can fail and should be tested for errors.

In case of error, unmap the already mapped regions.

Fixes: 22843545b200 ("dma: mv_xor: Add support for DMA_INTERRUPT")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250701123753.46935-2-fourier.thomas@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoperf pmu: Switch FILENAME_MAX to NAME_MAX
Ian Rogers [Thu, 17 Jul 2025 15:08:54 +0000 (08:08 -0700)] 
perf pmu: Switch FILENAME_MAX to NAME_MAX

[ Upstream commit 82aac553372cd201b91a8b064be0cd5a501932b2 ]

FILENAME_MAX is the same as PATH_MAX (4kb) in glibc rather than
NAME_MAX's 255. Switch to using NAME_MAX and ensure the '\0' is
accounted for in the path's buffer size.

Fixes: 754baf426e09 ("perf pmu: Change aliases from list to hashmap")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250717150855.1032526-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agotools subcmd: Tighten the filename size in check_if_command_finished
Ian Rogers [Thu, 17 Jul 2025 15:08:53 +0000 (08:08 -0700)] 
tools subcmd: Tighten the filename size in check_if_command_finished

[ Upstream commit 478272d1cdd9959a6d638e9d81f70642f04290c9 ]

FILENAME_MAX is often PATH_MAX (4kb), far more than needed for the
/proc path. Make the buffer size sufficient for the maximum integer
plus "/proc/" and "/status" with a '\0' terminator.

Fixes: 5ce42b5de461 ("tools subcmd: Add non-waitpid check_if_command_finished()")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250717150855.1032526-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoclk: thead: th1520-ap: Describe mux clocks with clk_mux
Yao Zi [Tue, 22 Jul 2025 08:05:36 +0000 (08:05 +0000)] 
clk: thead: th1520-ap: Describe mux clocks with clk_mux

[ Upstream commit 54edba916e2913b0893b0f6404b73155d48374ea ]

Mux clocks are now described with a customized ccu_mux structure
consisting of ccu_internal and ccu_common substructures, and registered
later with devm_clk_hw_register_mux_parent_data_table(). As this helper
always allocates a new clk_hw structure, it's extremely hard to use mux
clocks as parents statically by clk_hw pointers, since CCF has no
knowledge about the clk_hw structure embedded in ccu_mux.

This scheme already causes issues for clock c910, which takes a mux
clock, c910-i0, as a possible parent. With mainline U-Boot that
reparents c910 to c910-i0 at boottime, c910 is considered as an orphan
by CCF.

This patch refactors handling of mux clocks, embeds a clk_mux structure
in ccu_mux directly. Instead of calling devm_clk_hw_register_mux_*(),
we could register mux clocks on our own without allocating any new
clk_hw pointer, fixing c910 clock's issue.

Fixes: ae81b69fd2b1 ("clk: thead: Add support for T-Head TH1520 AP_SUBSYS clocks")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Signed-off-by: Drew Fustini <fustini@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agofs/orangefs: Allow 2 more characters in do_c_string()
Dan Carpenter [Sat, 19 Jul 2025 14:19:10 +0000 (09:19 -0500)] 
fs/orangefs: Allow 2 more characters in do_c_string()

[ Upstream commit 2138e89cb066b40386b1d9ddd61253347d356474 ]

The do_k_string() and do_c_string() functions do essentially the same
thing which is they add a string and a comma onto the end of an existing
string.  At the end, the caller will overwrite the last comma with a
newline.  Later, in orangefs_kernel_debug_init(), we add a newline to
the string.

The change to do_k_string() is just cosmetic.  I moved the "- 1" to
the other side of the comparison and made it "+ 1".  This has no
effect on runtime, I just wanted the functions to match each other
and the rest of the file.

However in do_c_string(), I removed the "- 2" which allows us to print
two extra characters.  I noticed this issue while reviewing the code
and I doubt affects anything in real life.  My guess is that this was
double counting the comma and the newline.  The "+ 1" accounts for
the newline, and the caller will delete the final comma which ensures
there is enough space for the newline.

Removing the "- 2" lets us print 2 more characters, but mainly it makes
the code more consistent and understandable for reviewers.

Fixes: 44f4641073f1 ("orangefs: clean up debugfs globals")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoremoteproc: xlnx: Disable unsupported features
Tanmay Shah [Wed, 16 Jul 2025 21:30:47 +0000 (14:30 -0700)] 
remoteproc: xlnx: Disable unsupported features

[ Upstream commit 699cdd706290208d47bd858a188b030df2e90357 ]

AMD-Xilinx platform driver does not support iommu or recovery mechanism
yet. Disable both features in platform driver.

Signed-off-by: Tanmay Shah <tanmay.shah@amd.com>
Link: https://lore.kernel.org/r/20250716213048.2316424-2-tanmay.shah@amd.com
Fixes: 6b291e8020a8 ("drivers: remoteproc: Add Xilinx r5 remoteproc driver")
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agophy: qcom: phy-qcom-snps-eusb2: Add missing write from init sequence
Luca Weiss [Tue, 15 Jul 2025 07:29:36 +0000 (09:29 +0200)] 
phy: qcom: phy-qcom-snps-eusb2: Add missing write from init sequence

[ Upstream commit 7f5f703210109366c1e1b685086c9b0a4897ea54 ]

As per a commit from Qualcomm's downstream 6.1 kernel[0], the init
sequence is missing setting the CMN_CTRL_OVERRIDE_EN bit back to 0 at
the end, as per the 'latest' HPG revision (as of November 2023).

[0] https://git.codelinaro.org/clo/la/kernel/qcom/-/commit/b77774a89e3fda3246e09dd39e16e2ab43cd1329

Fixes: 80090810f5d3 ("phy: qcom: Add QCOM SNPS eUSB2 driver")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Link: https://lore.kernel.org/r/20250715-sm7635-eusb-phy-v3-3-6c3224085eb6@fairphone.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoclk: imx95-blk-ctl: Fix synchronous abort
Laurentiu Palcu [Mon, 7 Jul 2025 02:24:38 +0000 (10:24 +0800)] 
clk: imx95-blk-ctl: Fix synchronous abort

[ Upstream commit b08217a257215ed9130fce93d35feba66b49bf0a ]

When enabling runtime PM for clock suppliers that also belong to a power
domain, the following crash is thrown:
error: synchronous external abort: 0000000096000010 [#1] PREEMPT SMP
Workqueue: events_unbound deferred_probe_work_func
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : clk_mux_get_parent+0x60/0x90
lr : clk_core_reparent_orphans_nolock+0x58/0xd8
  Call trace:
   clk_mux_get_parent+0x60/0x90
   clk_core_reparent_orphans_nolock+0x58/0xd8
   of_clk_add_hw_provider.part.0+0x90/0x100
   of_clk_add_hw_provider+0x1c/0x38
   imx95_bc_probe+0x2e0/0x3f0
   platform_probe+0x70/0xd8

Enabling runtime PM without explicitly resuming the device caused
the power domain cut off after clk_register() is called. As a result,
a crash happens when the clock hardware provider is added and attempts
to access the BLK_CTL register.

Fix this by using devm_pm_runtime_enable() instead of pm_runtime_enable()
and getting rid of the pm_runtime_disable() in the cleanup path.

Fixes: 5224b189462f ("clk: imx: add i.MX95 BLK CTL clk driver")
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20250707-imx95-blk-ctl-7-1-v3-2-c1b676ec13be@nxp.com
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agoPCI: endpoint: pci-epf-vntb: Fix the incorrect usage of __iomem attribute
Manivannan Sadhasivam [Wed, 9 Jul 2025 12:50:22 +0000 (18:20 +0530)] 
PCI: endpoint: pci-epf-vntb: Fix the incorrect usage of __iomem attribute

[ Upstream commit 61ae7f8694fb4b57a8c02a1a8d2b601806afc999 ]

__iomem attribute is supposed to be used only with variables holding the
MMIO pointer. But here, 'mw_addr' variable is just holding a 'void *'
returned by pci_epf_alloc_space(). So annotating it with __iomem is clearly
wrong. Hence, drop the attribute.

This also fixes the below sparse warning:

  drivers/pci/endpoint/functions/pci-epf-vntb.c:524:17: warning: incorrect type in assignment (different address spaces)
  drivers/pci/endpoint/functions/pci-epf-vntb.c:524:17:    expected void [noderef] __iomem *mw_addr
  drivers/pci/endpoint/functions/pci-epf-vntb.c:524:17:    got void *
  drivers/pci/endpoint/functions/pci-epf-vntb.c:530:21: warning: incorrect type in assignment (different address spaces)
  drivers/pci/endpoint/functions/pci-epf-vntb.c:530:21:    expected unsigned int [usertype] *epf_db
  drivers/pci/endpoint/functions/pci-epf-vntb.c:530:21:    got void [noderef] __iomem *mw_addr
  drivers/pci/endpoint/functions/pci-epf-vntb.c:542:38: warning: incorrect type in argument 2 (different address spaces)
  drivers/pci/endpoint/functions/pci-epf-vntb.c:542:38:    expected void *addr
  drivers/pci/endpoint/functions/pci-epf-vntb.c:542:38:    got void [noderef] __iomem *mw_addr

Fixes: e35f56bb0330 ("PCI: endpoint: Support NTB transfer between RC and EP")
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250709125022.22524-1-mani@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agosoundwire: stream: restore params when prepare ports fail
Bard Liao [Thu, 26 Jun 2025 06:09:52 +0000 (14:09 +0800)] 
soundwire: stream: restore params when prepare ports fail

[ Upstream commit dba7d9dbfdc4389361ff3a910e767d3cfca22587 ]

The bus->params should be restored if the stream is failed to prepare.
The issue exists since beginning. The Fixes tag just indicates the
first commit that the commit can be applied to.

Fixes: 17ed5bef49f4 ("soundwire: add missing newlines in dynamic debug logs")
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20250626060952.405996-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 months agocgroup: Add compatibility option for content of /proc/cgroups
Michal Koutný [Fri, 18 Jul 2025 09:18:54 +0000 (11:18 +0200)] 
cgroup: Add compatibility option for content of /proc/cgroups

[ Upstream commit 646faf36d7271c597497ca547a59912fcab49be9 ]

/proc/cgroups lists only v1 controllers by default, however, this is
only enforced since the commit af000ce85293b ("cgroup: Do not report
unavailable v1 controllers in /proc/cgroups") and there is software in
the wild that uses content of /proc/cgroups to decide on availability of
v2 (sic) controllers.

Add a boottime param that can bring back the previous behavior for
setups where the check in the software cannot be changed and it causes
e.g. unintended OOMs.

Also, this patch takes out cgrp_v1_visible from cgroup1_subsys_absent()
guard since it's only important to check which hierarchy (v1 vs v2) the
subsys is attached to. This has no effect on the printed message but
the code is cleaner since cgrp_v1_visible is really about mounted
hierarchies, not the content of /proc/cgroups.

Link: https://lore.kernel.org/r/b26b60b7d0d2a5ecfd2f3c45f95f32922ed24686.camel@decadent.org.uk
Fixes: af000ce85293b ("cgroup: Do not report unavailable v1 controllers in /proc/cgroups")
Fixes: a0ab1453226d8 ("cgroup: Print message when /proc/cgroups is read on v2-only system")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>