Aksh Garg [Fri, 30 Jan 2026 11:55:15 +0000 (17:25 +0530)]
PCI: dwc: ep: Add per-PF BAR and inbound ATU mapping support
The commit 24ede430fa49 ("PCI: designware-ep: Add multiple PFs support
for DWC") added support for multiple PFs in the DWC driver, but the
implementation was incomplete. It did not properly support MSI/MSI-X,
as well as BAR and inbound ATU mapping for multiple PFs. The MSI/MSI-X
issue was later fixed by commit 47a062609a30 ("PCI: designware-ep:
Modify MSI and MSIX CAP way of finding") by introducing a per-PF
struct dw_pcie_ep_func.
However, even with both commits, the multiple PF support in the driver
remains broken because BAR configuration and ATU mappings are managed
globally in struct dw_pcie_ep, meaning all PFs share the same BAR-to-ATU
mapping table. This causes one PF's EPF to overwrite the address
translation of another PF's EPF in the internal ATU region,
creating conflicts when multiple physical functions attempt to
configure their BARs independently.
The commit cfbc98dbf44d ("PCI: dwc: ep: Support BAR subrange inbound
mapping via Address Match Mode iATU") later introduced Address Match
Mode support, which suffers from the same multi-PF conflict issue.
Fix this by moving the required members from struct dw_pcie_ep to
struct dw_pcie_ep_func, similar to what commit 47a062609a30
("PCI: designware-ep: Modify MSI and MSIX CAP way of finding") did for
MSI/MSI-X capability support, to allow proper multi-function endpoint
operation, where each PF can configure its BARs and corresponding
internal ATU region without interfering with other PFs.
Fixes: 24ede430fa49 ("PCI: designware-ep: Add multiple PFs support for DWC") Fixes: cc839bef7727 ("PCI: dwc: ep: Support BAR subrange inbound mapping via Address Match Mode iATU") Signed-off-by: Aksh Garg <a-garg7@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260130115516.515082-3-a-garg7@ti.com
Aksh Garg [Fri, 30 Jan 2026 11:55:14 +0000 (17:25 +0530)]
PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations
The resizable BAR support added by the commit 3a3d4cabe681 ("PCI: dwc: ep:
Allow EPF drivers to configure the size of Resizable BARs") incorrectly
configures the resizable BARs only for the first Physical Function (PF0)
in EP mode.
The resizable BAR configuration functions use generic dw_pcie_*_dbi()
operations instead of physical function specific dw_pcie_ep_*_dbi()
operations. This causes resizable BAR configuration to always target
PF0 regardless of the requested function number.
Additionally, dw_pcie_ep_init_non_sticky_registers() only initializes
resizable BAR registers for PF0, leaving other PFs unconfigured during
the execution of this function.
Fix this by using physical function specific configuration space access
operations throughout the resizable BAR code path and initializing
registers for all the physical functions that support resizable BARs.
Niklas Cassel [Fri, 30 Jan 2026 11:30:39 +0000 (12:30 +0100)]
PCI: endpoint: pci-epf-test: Allow overriding default BAR sizes
Add bar{0,1,2,3,4,5}_size attributes in configfs, so that the user is not
restricted to run pci-epf-test with the hardcoded BAR size values defined
in pci-epf-test.c.
This code is shamelessly more or less copy pasted from pci-epf-vntb.c
Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Koichiro Den <den@valinux.co.jp> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260130113038.2143947-2-cassel@kernel.org
Koichiro Den [Sat, 24 Jan 2026 14:50:12 +0000 (23:50 +0900)]
selftests: pci_endpoint: Add BAR subrange mapping test case
Add BAR_SUBRANGE_TEST to the pci_endpoint kselftest suite.
The test uses the PCITEST_BAR_SUBRANGE ioctl and will skip when the
chosen BAR is disabled (-ENODATA), when the endpoint/controller does not
support subrange mapping (-EOPNOTSUPP), or when the BAR is reserved for
the test register space (-EBUSY).
Koichiro Den [Sat, 24 Jan 2026 14:50:11 +0000 (23:50 +0900)]
misc: pci_endpoint_test: Add BAR subrange mapping test case
Add a new PCITEST_BAR_SUBRANGE ioctl to exercise EPC BAR subrange
mapping end-to-end.
The test programs a simple 2-subrange layout on the endpoint (via
pci-epf-test) and verifies that:
- the endpoint-provided per-subrange signature bytes are observed at
the expected PCIe BAR offsets, and
- writes to each subrange are routed to the correct backing region
(i.e. the submap order is applied rather than accidentally working
due to an identity mapping).
Return -EOPNOTSUPP when the endpoint does not advertise subrange
mapping, -ENODATA when the BAR is disabled, and -EBUSY when the BAR is
reserved for the test register space.
Koichiro Den [Sat, 24 Jan 2026 14:50:10 +0000 (23:50 +0900)]
PCI: endpoint: pci-epf-test: Add BAR subrange mapping test support
Extend pci-epf-test so that pci_endpoint_test can exercise BAR subrange
mapping end-to-end.
Add BAR_SUBRANGE_SETUP/CLEAR commands that program (and tear down) a
simple 2-subrange layout for a selected BAR. The endpoint deliberately
permutes the physical backing regions (swap the halves) and writes a
deterministic signature byte per subrange. This allows the RC to verify
that the submap order is actually applied, not just that reads/writes
work with an identity mapping.
Advertise CAP_SUBRANGE_MAPPING only when the underlying EPC supports
dynamic_inbound_mapping and subrange_mapping. Also bump the default BAR
sizes (BAR0-4) to 128 KiB so that split subranges are large enough to
satisfy common inbound-translation alignment constraints. E.g. for DWC
EP, the default and maximum CX_ATU_MIN_REGION_SIZE is 64 kB, so 128 KiB
is sufficient for DWC-based EP platforms for 2-subrange testing.
The current documentation implies that pci_epc_set_bar() is only used
before the host enumerates the endpoint.
In practice, some Endpoint Controllers support calling pci_epc_set_bar()
multiple times for the same BAR (without clearing it) in order to update
inbound address translations after the host has programmed the BAR base
address, which some Endpoint Functions such as vNTB already rely on.
Add document text for that.
Also document the expected call flow for BAR subrange mapping
(pci_epf_bar.num_submap / pci_epf_bar.submap), which may require a
second pci_epc_set_bar() call after the host has programmed the BAR base
address.
Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260124145012.2794108-6-den@valinux.co.jp
Koichiro Den [Sat, 24 Jan 2026 14:50:08 +0000 (23:50 +0900)]
PCI: dwc: ep: Support BAR subrange inbound mapping via Address Match Mode iATU
Extend dw_pcie_ep_set_bar() to support inbound mappings for BAR
subranges using Address Match Mode IB iATU when pci_epf_bar.num_submap
is non-zero.
Rename the existing BAR-match helper into dw_pcie_ep_ib_atu_bar() and
introduce dw_pcie_ep_ib_atu_addr() for Address Match Mode. When
num_submap is non-zero, read the assigned BAR base address and program
one inbound iATU window per subrange. Validate the submap array before
programming:
- each subrange is aligned to pci->region_align
- subranges cover the whole BAR (no gaps and no overlaps)
Track Address Match Mode mappings and tear them down on clear_bar() and
on set_bar() error paths to avoid leaving half-programmed state or
untranslated BAR holes.
Advertise this capability by extending the common feature bit
initializer macro (DWC_EPC_COMMON_FEATURES).
This enables multiple inbound windows within a single BAR, which is
useful on platforms where usable BARs are scarce but EPFs need multiple
inbound regions.
Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260124145012.2794108-5-den@valinux.co.jp
Koichiro Den [Sat, 24 Jan 2026 14:50:07 +0000 (23:50 +0900)]
PCI: dwc: Advertise dynamic inbound mapping support
The DesignWare EP core has supported updating the inbound iATU mapping
for an already configured BAR (i.e. allowing pci_epc_set_bar() to be
called again without a prior pci_epc_clear_bar()) since
commit 4284c88fff0e ("PCI: designware-ep: Allow pci_epc_set_bar() update
inbound map address").
Now that this capability is exposed via the dynamic_inbound_mapping EPC
feature bit, set it for DWC-based EP glue drivers using a common
initializer macro to avoid duplicating the same flag in each driver.
Note that pci-layerscape-ep.c is untouched. It currently constructs the
feature struct dynamically in ls_pcie_ep_init(). Once converted to a
static feature definition, it will use DWC_EPC_COMMON_FEATURES as well.
Koichiro Den [Sat, 24 Jan 2026 14:50:06 +0000 (23:50 +0900)]
PCI: endpoint: Add BAR subrange mapping support
Some endpoint platforms have only a small number of usable BARs. At the
same time, EPF drivers (e.g. vNTB) may need multiple independent inbound
regions (control/scratchpad, one or more memory windows, and optionally
MSI or other feature-related regions). Subrange mapping allows these to
share a single BAR without consuming additional BARs that may not be
available, or forcing a fragile layout by aggressively packing into a
single contiguous memory range.
Extend the PCI endpoint core to support mapping subranges within a BAR.
Add an optional 'submap' field in struct pci_epf_bar so an endpoint
function driver can request inbound mappings that fully cover the BAR.
Introduce a new EPC feature bit, subrange_mapping, and reject submap
requests from pci_epc_set_bar() unless the controller advertises both
subrange_mapping and dynamic_inbound_mapping features.
The submap array describes the complete BAR layout (no overlaps and no
gaps are allowed to avoid exposing untranslated address ranges). This
provides the generic infrastructure needed to map multiple logical
regions into a single BAR at different offsets, without assuming a
controller-specific inbound address translation mechanism.
Introduce a new EPC feature bit (dynamic_inbound_mapping) that indicates
whether an Endpoint Controller can update the inbound address
translation for a BAR without requiring the EPF driver to clear/reset
the BAR first.
Endpoint Function drivers (e.g. vNTB) can use this information to decide
whether it really is safe to call pci_epc_set_bar() multiple times to
update inbound mappings for the BAR.
Suggested-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260124145012.2794108-2-den@valinux.co.jp
Qiang Yu [Thu, 22 Jan 2026 07:45:19 +0000 (23:45 -0800)]
PCI: dwc: Rename dw_pcie_rp::has_msi_ctrl to dw_pcie_rp::use_imsi_rx for clarity
The current "has_msi_ctrl" flag name is misleading because it suggests the
presence of any MSI controller, while it is specifically set for platforms
that lack .msi_init() callback and don't have "msi-parent" or "msi-map"
device tree properties, indicating they rely on the iMSI-RX module for MSI
functionality.
Rename it to "use_imsi_rx" to make the intent clear:
- When true: Platform uses the iMSI-RX module for MSI handling
- When false: Platform has other MSI controller support (ITS/MBI, external
MSI controller)
No functional changes, only improves code readability and eliminates
naming confusion.
Samuel Holland [Fri, 9 Jan 2026 11:34:30 +0000 (19:34 +0800)]
PCI: dwc: Use multiple iATU windows for mapping large bridge windows and DMA ranges
The DWC driver tries to use a single iATU region for mapping the individual
entries of the bridge window and DMA range. If a bridge window/DMA range is
larger than the iATU inbound/outbound window size, then the mapping will
fail.
Hence, avoid this failure by using multiple iATU windows to map the whole
region. If the region runs out of iATU windows, then return failure.
Signed-off-by: Charles Mirabile <cmirabil@redhat.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Co-developed-by: Randolph Lin <randolph@andestech.com> Signed-off-by: Randolph Lin <randolph@andestech.com>
[mani: reworded description, minor code cleanup] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Acked-by: Charles Mirabile <cmirabil@redhat.com> Link: https://patch.msgid.link/20260109113430.2767264-1-randolph@andestech.com
Qiang Yu [Wed, 24 Dec 2025 10:10:46 +0000 (02:10 -0800)]
PCI: dwc: Remove duplicate dw_pcie_ep_hide_ext_capability() function
Remove dw_pcie_ep_hide_ext_capability() and replace its usage with
dw_pcie_remove_ext_capability(). Both functions serve the same purpose
of hiding PCIe extended capabilities, but dw_pcie_remove_ext_capability()
provides a cleaner API that doesn't require the caller to specify the
previous capability ID.
Richard Zhu [Wed, 14 Jan 2026 08:33:00 +0000 (16:33 +0800)]
PCI: dwc: Skip waiting for L2/L3 Ready if dw_pcie_rp::skip_l23_wait is true
In NXP i.MX6QP and i.MX7D SoCs, LTSSM registers are not accessible once
PME_Turn_Off message is broadcasted to the link. So there is no way to
verify whether the link has entered L2/L3 Ready state or not.
Hence, add a new flag 'dw_pcie_rp::skip_l23_ready' and set it to 'true' for
the above mentioned SoCs. This flag when set, will allow the DWC core to
skip polling for L2/L3 Ready state and just wait for 10ms as recommended in
the PCIe spec r6.0, sec 5.3.3.2.1.
Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method") Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[mani: renamed flag to skip_l23_ready and reworded description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260114083300.3689672-2-hongxing.zhu@nxp.com
PCI: dwc: Fail dw_pcie_host_init() if dw_pcie_wait_for_link() returns -ETIMEDOUT
The dw_pcie_wait_for_link() API now distinguishes link failures more
precisely:
-ENODEV: Device not found on the bus.
-EIO: Device found but inactive.
-ETIMEDOUT: Link failed to come up.
Out of these three errors, only -ETIMEDOUT represents a definitive link
failure since it signals that something is wrong with the link. For the
other two errors, there is a possibility that the link might come up later.
So fail dw_pcie_host_init() if -ETIMEDOUT is returned and skip the failure
otherwise.
PCI: dwc: Rework the error print of dw_pcie_wait_for_link()
For the cases where the link cannot come up later i.e., when LTSSM is not
in Detect.{Quiet/Active} or Poll.{Active/Compliance} states,
dw_pcie_wait_for_link() should log an error.
So promote dev_info() to dev_err(), reword the error log to make it clear
and also print the LTSSM state to aid debugging.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Tested-by: Richard Zhu <hongxing.zhu@nxp.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260120-pci-dwc-suspend-rework-v4-4-2f32d5082549@oss.qualcomm.com
PCI: dwc: Rename and move ltssm_status_string() to pcie-designware.c
Rename ltssm_status_string() to dw_pcie_ltssm_status_string() and move it
to the common file pcie-designware.c so that this function could be used
outside of pcie-designware-debugfs.c file.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Tested-by: Richard Zhu <hongxing.zhu@nxp.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260120-pci-dwc-suspend-rework-v4-3-2f32d5082549@oss.qualcomm.com
PCI: dwc: Return -EIO from dw_pcie_wait_for_link() if device is not active
There are cases where the PCIe device would be physically connected to the
bus, but the device firmware might not be active. So the LTSSM will
get stuck in POLL.{Active/Compliance} states.
This behavior is common with endpoint devices controlled by the PCI
Endpoint framework, where the device will wait for the user to start its
operation through configfs.
For those cases, print the relevant log and return -EIO to indicate that
the device is present, but not active. This will allow the callers to skip
the failure as the device might become active in the future.
PCI: dwc: Return -ENODEV from dw_pcie_wait_for_link() if device is not found
The dw_pcie_wait_for_link() function waits up to 1 second for the PCIe link
to come up and returns -ETIMEDOUT for all failures without distinguishing
cases where no device is present on the bus. But the callers may want to
just skip the failure if the device is not found on the bus and handle
failure for other reasons.
So after timeout, if the LTSSM is in Detect.Quiet or Detect.Active state,
return -ENODEV to indicate the callers that the device is not found on the
bus and return -ETIMEDOUT otherwise.
Also add kernel doc to document the parameter and return values.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Tested-by: Richard Zhu <hongxing.zhu@nxp.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260120-pci-dwc-suspend-rework-v4-1-2f32d5082549@oss.qualcomm.com
Shawn Lin [Wed, 24 Dec 2025 01:38:52 +0000 (09:38 +0800)]
PCI: dwc: Use cfg0_base as iMSI-RX target address to support 32-bit MSI devices
commit f3a296405b6e ("PCI: dwc: Strengthen the MSI address allocation
logic") strengthened the iMSI-RX target address allocation logic to handle
64-bit addresses for platforms without 32-bit DMA addresses. However, it
still left 32-bit MSI capable endpoints (EPs) non-functional on such
platforms.
Per DWC databook r6.21a, sec 3.10.2.3, it states:
"The iMSI-RX is programmed with an address (MSI_CTRL_ADDR_OFF and
MSI_CTRL_UPPER_ADDR_OFF) that is used as the system MSI address. When an
inbound MWr request is passed to the AXI bridge and matches this address
as well as the conditions specified for an MSI memory write request, an
MSI interrupt is detected. When this MWr is about to be driven onto the
AXI bridge master interface1, it is dropped and never appears on the AXI
bus."
Since iMSI-RX MSI_CTRL_ADDR doesn't require actual system memory mapping,
any 32-bit address that won't be used for BAR memory allocations can be
assigned. So assign cfg0_base to the iMSI-RX target address as the first
option if it's a 32-bit address, which satisfies this requirement.
Otherwise, fallback to the existing coherent allocation.
cc: Ajay Agarwal <ajayagarwal@google.com>
cc: Will McVicker <willmcvicker@google.com> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
[mani: trimmed the description to exclude testbed info and used imperative tone] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/1766540332-24235-1-git-send-email-shawn.lin@rock-chips.com
Koichiro Den [Mon, 22 Dec 2025 11:01:44 +0000 (12:01 +0100)]
PCI: dwc: ep: Cache MSI outbound iATU mapping
dw_pcie_ep_raise_msi_irq() currently programs an outbound iATU window
for the MSI target address on every interrupt and tears it down again
via dw_pcie_ep_unmap_addr().
On systems that heavily use the AXI bridge interface (for example when
the integrated eDMA engine is active), this means the outbound iATU
registers are updated while traffic is in flight. The DesignWare
endpoint databook 5.40a - "3.10.6.1 iATU Outbound Programming Overview"
warns that updating iATU registers in this situation is not supported,
and the behavior is undefined.
Under high MSI and eDMA load this pattern results in occasional bogus
outbound transactions and IOMMU faults, on the RC side, such as:
ipmmu-vmsa eed40000.iommu: Unhandled fault: status 0x00001502 iova 0xfe000000
followed by the system becoming unresponsive. This is the actual output
observed on Renesas R-Car S4, with its ipmmu_hc used with PCIe ch0.
There is no need to reprogram the iATU region used for MSI on every
interrupt. The host-provided MSI address is stable while MSI is enabled,
and the endpoint driver already dedicates a scratch buffer for MSI
generation.
Cache the aligned MSI address and map size, program the outbound iATU
once, and keep the window enabled. Subsequent interrupts only perform a
write to the MSI scratch buffer, avoiding dynamic iATU reprogramming in
the hot path and fixing the lockups seen under load.
dw_pcie_ep_raise_msix_irq() is not modified, as each vector can have a
different msg_addr, and because the msg_addr is allowed to be changed
while the vector is masked. Neither problem is easy to solve with the
current design. Instead, the plan is for the DWC vendor drivers to
transition to dw_pcie_ep_raise_msix_irq_doorbell(), which does not rely
on the iATU.
While this fake hotplugging was a nice idea, it has shown that this feature
does not handle PCIe switches correctly:
pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43
pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them
pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44
pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them
pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45
pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them
pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46
pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them
pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46
pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41])
pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them
pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46
During the initial scan, PCI core doesn't see the switch and since the Root
Port is not hot plug capable, the secondary bus number gets assigned as the
subordinate bus number. This means, the PCI core assumes that only one bus
will appear behind the Root Port since the Root Port is not hot plug
capable.
This works perfectly fine for PCIe endpoints connected to the Root Port,
since they don't extend the bus. However, if a PCIe switch is connected,
then there is a problem when the downstream busses starts showing up and
the PCI core doesn't extend the subordinate bus number and bridge resources
after initial scan during boot.
So revert the change that skipped dw_pcie_wait_for_link() if the Link up
IRQ was used by a vendor glue driver.
While this fake hotplugging was a nice idea, it has shown that this feature
does not handle PCIe switches correctly:
pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43
pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them
pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44
pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them
pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45
pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them
pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46
pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them
pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46
pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41])
pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them
pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46
During the initial scan, PCI core doesn't see the switch and since the Root
Port is not hot plug capable, the secondary bus number gets assigned as the
subordinate bus number. This means, the PCI core assumes that only one bus
will appear behind the Root Port since the Root Port is not hot plug
capable.
This works perfectly fine for PCIe endpoints connected to the Root Port,
since they don't extend the bus. However, if a PCIe switch is connected,
then there is a problem when the downstream busses starts showing up and
the PCI core doesn't extend the subordinate bus number and bridge resources
after initial scan during boot.
The long term plan is to migrate this driver to the upcoming pwrctrl APIs
that are supposed to handle this problem elegantly.
While this fake hotplugging was a nice idea, it has shown that this feature
does not handle PCIe switches correctly:
pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43
pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them
pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44
pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them
pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45
pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them
pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46
pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them
pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46
pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41])
pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them
pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46
During the initial scan, PCI core doesn't see the switch and since the Root
Port is not hot plug capable, the secondary bus number gets assigned as the
subordinate bus number. This means, the PCI core assumes that only one bus
will appear behind the Root Port since the Root Port is not hot plug
capable.
This works perfectly fine for PCIe endpoints connected to the Root Port,
since they don't extend the bus. However, if a PCIe switch is connected,
then there is a problem when the downstream busses starts showing up and
the PCI core doesn't extend the subordinate bus number and bridge resources
after initial scan during boot.
The long term plan is to migrate this driver to the upcoming pwrctrl APIs
that are supposed to handle this problem elegantly.
While this fake hotplugging was a nice idea, it has shown that this feature
does not handle PCIe switches correctly:
pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43
pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them
pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44
pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them
pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45
pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them
pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46
pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them
pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46
pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41])
pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them
pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46
During the initial scan, PCI core doesn't see the switch and since the Root
Port is not hot plug capable, the secondary bus number gets assigned as the
subordinate bus number. This means, the PCI core assumes that only one bus
will appear behind the Root Port since the Root Port is not hot plug
capable.
This works perfectly fine for PCIe endpoints connected to the Root Port,
since they don't extend the bus. However, if a PCIe switch is connected,
then there is a problem when the downstream busses starts showing up and
the PCI core doesn't extend the subordinate bus number and bridge resources
after initial scan during boot.
The long term plan is to migrate this driver to the upcoming pwrctrl APIs
that are supposed to handle this problem elegantly.
While this fake hotplugging was a nice idea, it has shown that this feature
does not handle PCIe switches correctly:
pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43
pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them
pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44
pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them
pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45
pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them
pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46
pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them
pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46
pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41])
pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them
pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46
During the initial scan, PCI core doesn't see the switch and since the Root
Port is not hot plug capable, the secondary bus number gets assigned as the
subordinate bus number. This means, the PCI core assumes that only one bus
will appear behind the Root Port since the Root Port is not hot plug
capable.
This works perfectly fine for PCIe endpoints connected to the Root Port,
since they don't extend the bus. However, if a PCIe switch is connected,
then there is a problem when the downstream busses starts showing up and
the PCI core doesn't extend the subordinate bus number and bridge resources
after initial scan during boot.
The long term plan is to migrate this driver to the upcoming pwrctrl APIs
that are supposed to handle this problem elegantly.
PCI: dwc: Skip PME_Turn_Off broadcast and L2/L3 transition during suspend if link is not up
During system suspend, if the PCIe link is not up, then there is no need
to broadcast PME_Turn_Off message and wait for L2/L3 transition. So skip
them.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Link: https://patch.msgid.link/20251218-pci-dwc-suspend-rework-v2-1-5a7778c6094a@oss.qualcomm.com
Shawn Lin [Fri, 12 Dec 2025 01:33:25 +0000 (09:33 +0800)]
PCI: dw-rockchip: Change get_ltssm() to provide L1 Substates info
Rename rockchip_pcie_get_ltssm() to rockchip_pcie_get_ltssm_reg() and add
rockchip_pcie_get_ltssm() to get_ltssm() callback in order to show the
proper L1 Substates. The PCIE_CLIENT_LTSSM_STATUS[5:0] register returns
the same LTSSM layout as enum dw_pcie_ltssm. So the driver just need to
convey L1 PM Substates by returning the proper value defined in
pcie-designware.h.
Shawn Lin [Fri, 12 Dec 2025 01:33:24 +0000 (09:33 +0800)]
PCI: dwc: Add L1 Substates context to ltssm_status of debugfs
DWC core couldn't distinguish LTSSM state among L1.0, L1.1 and L1.2. But
the vendor glue driver may implement additional logic to convey this
information. So add two pseudo definitions for vendor glue drivers to
translate their internal L1 Substates for debugfs to show.
Qiang Yu [Mon, 10 Nov 2025 06:59:44 +0000 (22:59 -0800)]
PCI: qcom: Remove DPC Extended Capability
Some platforms (e.g., X1E80100) expose Downstream Port Containment (DPC)
Extended Capability registers in the PCIe Root Port config space, but do
not fully support it. To prevent undefined behavior and ensure DPC cap is
not visible to PCI framework and users, remove DPC Extended Capability
unconditionally, since there is no Qcom platform support DPC till now.
Qiang Yu [Mon, 10 Nov 2025 06:59:43 +0000 (22:59 -0800)]
PCI: qcom: Remove MSI-X Capability for Root Ports
On some platforms like Glymur, the hardware does not support MSI-X in RC
mode, yet still exposes the MSI-X capability. However, it omits the
required MSI-X Table and PBA structures.
This mismatch can lead to issues where the PCIe port driver requests MSI-X
instead of MSI, causing the Root Port to trigger interrupts by writing to
an uninitialized address, resulting in SMMU faults.
To address this, remove MSI-X capability unconditionally for Root Ports of
all SoCs as none of the Qcom PCIe Root Ports support MSI-X.
Qiang Yu [Mon, 10 Nov 2025 06:59:42 +0000 (22:59 -0800)]
PCI: dwc: Remove MSI/MSIX capability for Root Port if iMSI-RX is used as MSI controller
Some platforms may not support ITS (Interrupt Translation Service) and MBI
(Message Based Interrupt), or there are not enough available empty SPI
lines for MBI, in which case the msi-map and msi-parent property will not
be provided in device tree node. For those cases, the DWC PCIe driver
defaults to using the iMSI-RX module as MSI controller. However, due to
DWC IP design, iMSI-RX cannot generate MSI interrupts for Root Ports even
when MSI is properly configured and supported as iMSI-RX will only monitor
and intercept incoming MSI TLPs from PCIe link, but the memory write
generated by Root Port are internal system bus transactions instead of
PCIe TLPs, so they are ignored.
This leads to interrupts such as PME, AER from the Root Port not received
on the host and the users have to resort to workarounds such as passing
"pcie_pme=nomsi" cmdline parameter.
To ensure reliable interrupt handling, remove MSI and MSI-X capabilities
from Root Ports when using iMSI-RX as MSI controller, which is indicated
by 'dw_pcie_rp::has_msi_ctrl == true'. This forces a fallback to INTx
interrupts by default, eliminating the need for manual kernel command line
workarounds.
With this behavior:
- Platforms with ITS/MBI support use ITS/MBI MSI for interrupts from all
components.
- Platforms without ITS/MBI support fall back to INTx for Root Ports and
use iMSI-RX for other PCI devices.
Signed-off-by: Qiang Yu <qiang.yu@oss.qualcomm.com>
[mani: reworded the comment] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Link: https://patch.msgid.link/20251109-remove_cap-v1-3-2208f46f4dc2@oss.qualcomm.com
Qiang Yu [Mon, 10 Nov 2025 06:59:41 +0000 (22:59 -0800)]
PCI: dwc: Add new APIs to remove standard and extended Capability
On some platforms, certain PCIe Capabilities may be present in hardware
but are not fully implemented as defined in PCIe spec. These incomplete
capabilities should be hidden from the PCI framework to prevent unexpected
behavior.
Introduce two APIs to remove a specific PCIe Capability and Extended
Capability by updating the previous capability's next offset field to skip
over the unwanted capability. These APIs allow RC drivers to easily hide
unsupported or partially implemented capabilities from software.
Qiang Yu [Mon, 10 Nov 2025 06:59:40 +0000 (22:59 -0800)]
PCI: Add preceding capability position support in PCI_FIND_NEXT_*_CAP macros
Add support for finding the preceding capability position in PCI
capability list by extending the capability finding macros with an
additional parameter. This functionality is essential for modifying PCI
capability list, as it provides the necessary information to update the
"next" pointer of the predecessor capability when removing entries.
Modify two macros to accept a new 'prev_ptr' parameter:
- PCI_FIND_NEXT_CAP - Now accepts 'prev_ptr' parameter for standard
capabilities
- PCI_FIND_NEXT_EXT_CAP - Now accepts 'prev_ptr' parameter for extended
capabilities
When a capability is found, these macros:
- Store the position of the preceding capability in *prev_ptr
(if prev_ptr != NULL)
- Maintain all existing functionality when prev_ptr is NULL
Update current callers to accommodate this API change by passing NULL to
'prev_ptr' argument if they do not care about the preceding capability
position.
No functional changes to driver behavior result from this commit as it
maintains the existing capability finding functionality while adding the
infrastructure for future capability removal operations.
Linus Torvalds [Sun, 14 Dec 2025 03:35:35 +0000 (15:35 +1200)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The only core fix is in doc; all the others are in drivers, with the
biggest impacts in libsas being the rollback on error handling and in
ufs coming from a couple of error handling fixes, one causing a crash
if it's activated before scanning and the other fixing W-LUN
resumption"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: qcom: Fix confusing cleanup.h syntax
scsi: libsas: Add rollback handling when an error occurs
scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name()
scsi: ufs: core: Fix a deadlock in the frequency scaling code
scsi: ufs: core: Fix an error handler crash
scsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed"
scsi: ufs: core: Fix RPMB link error by reversing Kconfig dependencies
scsi: qla4xxx: Use time conversion macros
scsi: qla2xxx: Enable/disable IRQD_NO_BALANCING during reset
scsi: ipr: Enable/disable IRQD_NO_BALANCING during reset
scsi: imm: Fix use-after-free bug caused by unfinished delayed work
scsi: target: sbp: Remove KMSG_COMPONENT macro
scsi: core: Correct documentation for scsi_device_quiesce()
scsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1
scsi: target: Reset t_task_cdb pointer in error case
scsi: ufs: core: Fix EH failure after W-LUN resume error
Linus Torvalds [Sun, 14 Dec 2025 03:24:10 +0000 (15:24 +1200)]
Merge tag 'ceph-for-6.19-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"We have a patch that adds an initial set of tracepoints to the MDS
client from Max, a fix that hardens osdmap parsing code from myself
(marked for stable) and a few assorted fixups"
* tag 'ceph-for-6.19-rc1' of https://github.com/ceph/ceph-client:
rbd: stop selecting CRC32, CRYPTO, and CRYPTO_AES
ceph: stop selecting CRC32, CRYPTO, and CRYPTO_AES
libceph: make decode_pool() more resilient against corrupted osdmaps
libceph: Amend checking to fix `make W=1` build breakage
ceph: Amend checking to fix `make W=1` build breakage
ceph: add trace points to the MDS client
libceph: fix log output race condition in OSD client
Linus Torvalds [Sat, 13 Dec 2025 18:12:46 +0000 (06:12 +1200)]
Merge tag 'smp-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fix from Ingo Molnar:
- Fix CPU hotplug callbacks to disable interrupts on UP kernels
* tag 'smp-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu: Make atomic hotplug callbacks run with interrupts disabled on UP
Linus Torvalds [Sat, 13 Dec 2025 18:10:35 +0000 (06:10 +1200)]
Merge tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
- Fix NULL pointer dereference crash in the Intel PMU driver
- Fix missing read event generation on task exit
- Fix AMD uncore driver init error handling
- Fix whitespace noise
* tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()
perf/core: Fix missing read event generation on task exit
perf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on error
perf/uprobes: Remove <space><Tab> whitespace noise
Linus Torvalds [Sat, 13 Dec 2025 18:07:09 +0000 (06:07 +1200)]
Merge tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
- Fix error code in the irqchip/mchp-eic driver
- Fix setup_percpu_irq() affinity assumptions
- Remove the unused irq_domain_add_tree() function
* tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mchp-eic: Fix error code in mchp_eic_domain_alloc()
irqdomain: Delete irq_domain_add_tree()
genirq: Allow NULL affinity for setup_percpu_irq()
Linus Torvalds [Sat, 13 Dec 2025 18:04:16 +0000 (06:04 +1200)]
Merge tag 'core-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc core fixes from Ingo Molnar:
- Improve bug reporting
- Suppress W=1 format warning
- Improve rseq scalability on Clang builds
* tag 'core-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Always inline rseq_debug_syscall_return()
bug: Hush suggest-attribute=format for __warn_printf()
bug: Let report_bug_entry() provide the correct bugaddr
Linus Torvalds [Sat, 13 Dec 2025 08:55:12 +0000 (20:55 +1200)]
Merge tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc updates from Andrew Morton:
"There are no significant series in this small merge. Please see the
individual changelogs for details"
[ Editor's note: it's mainly ocfs2 and a couple of random fixes ]
* tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: memfd_luo: add CONFIG_SHMEM dependency
mm: shmem: avoid build warning for CONFIG_SHMEM=n
ocfs2: fix memory leak in ocfs2_merge_rec_left()
ocfs2: invalidate inode if i_mode is zero after block read
ocfs2: avoid -Wflex-array-member-not-at-end warning
ocfs2: convert remaining read-only checks to ocfs2_emergency_state
ocfs2: add ocfs2_emergency_state helper and apply to setattr
checkpatch: add uninitialized pointer with __free attribute check
args: fix documentation to reflect the correct numbers
ocfs2: fix kernel BUG in ocfs2_find_victim_chain
liveupdate: luo_core: fix redundant bound check in luo_ioctl()
ocfs2: validate inline xattr size and entry count in ocfs2_xattr_ibody_list
fs/fat: remove unnecessary wrapper fat_max_cache()
ocfs2: replace deprecated strcpy with strscpy
ocfs2: check tl_used after reading it from trancate log inode
liveupdate: luo_file: don't use invalid list iterator
Brown paper bag time. This is a silly oversight where I missed to drop
the error condition checking to ensure we clean up on early error
returns. I have an internal unit testset coming up for this which will
catch all such issues going forward.
Reported-by: Chris Mason <clm@fb.com> Reported-by: Jeff Layton <jlayton@kernel.org> Fixes: 011703a9acd7 ("file: add FD_{ADD,PREPARE}()") Signed-off-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 13 Dec 2025 07:57:41 +0000 (19:57 +1200)]
x86/hv: Add gitignore entry for generated header file
Commit 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver") added a
new generated header file for the offsets into the mshv_vtl_cpu_context
structure to be used by the low-level assembly code. But it didn't add
the .gitignore file to go with it, so 'git status' and friends will
mention it.
Let's add the gitignore file before somebody thinks that generated
header should be committed.
Linus Torvalds [Sat, 13 Dec 2025 05:39:28 +0000 (17:39 +1200)]
Merge tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel
Pull more drm fixes from Dave Airlie:
"These are the enqueued fixes that ended up in our fixes branch,
nouveau mostly, along with some small fixes in other places.
plane:
- Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties()
ttm:
- fix devcoredump for evicted bos
panel:
- Fix stack usage warning in novatek-nt35560
nouveau:
- alloc fwsec sb at boot to avoid s/r problems
- fix strcpy usage
- fix i2c encoder crash
bridge:
- Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83
mgag200:
- Fix bigendian handling in mgag200
tilcdc:
- Fix probe failure in tilcdc"
* tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
drm/mgag200: Fix big-endian support
drm/tilcdc: Fix removal actions in case of failed probe
drm/ttm: Avoid NULL pointer deref for evicted BOs
drm: nouveau: Replace sprintf() with sysfs_emit()
drm/nouveau: fix circular dep oops from vendored i2c encoder
drm/nouveau: refactor deprecated strcpy
drm/plane: Fix IS_ERR() vs NULL check in drm_plane_create_hotspot_properties()
drm/bridge: ti-sn65dsi83: ignore PLL_UNLOCK errors
drm/nouveau/gsp: Allocate fwsec-sb at boot
drm/panel: novatek-nt35560: avoid on-stack device structure
i915:
- Fix format string truncation warning
- FIx runtime PM reference during fbdev BO creation
panthor:
- fix UAF
renesas:
- fix sync flag handling"
* tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
Revert "drm/amd/display: Fix pbn to kbps Conversion"
drm/amd: Fix unbind/rebind for VCN 4.0.5
drm/i915: Fix format string truncation warning
drm/i915/fbdev: Hold runtime PM ref during fbdev BO creation
drm/amd/display: Improve HDMI info retrieval
drm/amdkfd: bump minimum vgpr size for gfx1151
drm/amd/display: shrink struct members
drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace
drm/amd/display: Refactor dml_core_mode_support to reduce stack frame
drm/amdgpu: don't attach the tlb fence for SI
drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state()
drm/amdkfd: Trap handler support for expert scheduling mode
drm/amdkfd: Use huge page size to check split svm range alignment
drm/rcar-du: dsi: Handle both DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC
drm/gem-shmem: revert the 8-byte alignment constraint
drm/gem-dma: revert the 8-byte alignment constraint
drm/panthor: Prevent potential UAF in group creation
Linus Torvalds [Sat, 13 Dec 2025 05:15:16 +0000 (17:15 +1200)]
Merge tag 'i3c/for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull further i3c update from Alexandre Belloni:
"We are removing a legacy API callback and having this sooner rather
than later will help ensuring no one introduces a new driver using it.
I've also added patches removing the "__free(...) = NULL" pattern
because I'm sure we won't avoid people sending those following the
mailing list discussion..."
* tag 'i3c/for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
i3c: adi: Fix confusing cleanup.h syntax
i3c: master: Fix confusing cleanup.h syntax
i3c: master: cleanup callback .priv_xfers()
i3c: master: switch to use new callback .i3c_xfers() from .priv_xfers()
Linus Torvalds [Sat, 13 Dec 2025 05:09:06 +0000 (17:09 +1200)]
Merge tag 'rtc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Subsystem:
- stop setting max_user_freq from the individual drivers as this has
not been hardware related for a while
New drivers:
- Andes ATCRTC100
- Apple SMC
- Nvidia VRS
Linus Torvalds [Sat, 13 Dec 2025 04:36:57 +0000 (16:36 +1200)]
Merge tag 'gpio-fixes-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
- fix spinlock op type after conversion to lock guards
- fix a memory leak in error path in gpio-regmap
- Kconfig fixes in GPIO drivers
- add a GPIO ACPI quirk for Dell Precision 7780
- set of fixes for shared GPIO management
* tag 'gpio-fixes-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: shared: make locking more fine-grained
gpio: shared: fix auxiliary device cleanup order
gpio: shared: check if a reference is populated before cleaning its resources
gpio: shared: fix NULL-pointer dereference in teardown path
gpio: shared: ignore disabled nodes when traversing the device-tree
gpiolib: acpi: Add quirk for Dell Precision 7780
gpio: tb10x: fix OF_GPIO dependency
gpio: qixis: select CONFIG_REGMAP_MMIO
gpio: regmap: Fix memleak in error path in gpio_regmap_register()
gpio: mmio: fix bad guard conversion
Linus Torvalds [Sat, 13 Dec 2025 04:09:10 +0000 (16:09 +1200)]
Merge tag 'sound-fix-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"The only slightly large change is the enablement of CIX HD-audio
controller, which took a bit time to be cooked up, while most of other
changes are device-specific small trivial fixes:
- Default disablement of the kconfig for decades old pre-release
alsa-lib PCM API; it's only the default config value change, so it
can't lead to any regressions for the existing setups
- Support for CIX HD-audio controller
- A few ASoC ACP fixes
- Fixes for ASoC cirrus, bcm, wcd, qcom, ak platforms
- Trivial hardening for FireWire and USB-audio
- HD-audio Intel binding fix and quirks"
* tag 'sound-fix-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (30 commits)
ALSA: hda/tas2781: Add new quirk for HP new project
ALSA: hda: cix-ipbloq: Use modern PM ops
ALSA: hda: intel-dsp-config: Prefer legacy driver as fallback
ASoC: amd: acp: update tdm channels for specific DAI
ASoC: cs35l56: Fix incorrect select SND_SOC_CS35L56_CAL_SYSFS_COMMON
ALSA: firewire-motu: add bounds check in put_user loop for DSP events
ASoC: cs35l41: Always return 0 when a subsystem ID is found
ALSA: uapi: Fix typo in asound.h comment
ALSA: Do not build obsolete API
ALSA: hda: add CIX IPBLOQ HDA controller support
ALSA: hda/core: add addr_offset field for bus address translation
ALSA: hda: dt-bindings: add CIX IPBLOQ HDA controller support
ALSA: hda/realtek: Add support for ASUS UM3406GA
ALSA: hda/realtek: Add support for HP Turbine Laptops
ALSA: usb-audio: Initialize status1 to fix uninitialized symbol errors
ALSA: firewire-motu: fix buffer overflow in hwdep read for DSP events
ALSA: hda: cs35l41: Fix NULL pointer dereference in cs35l41_hda_read_acpi()
ASoC: cros_ec_codec: Remove unnecessary selection of CRYPTO
ASoc: qcom: q6afe: fix bad guard conversion
ASoC: rockchip: Fix Wvoid-pointer-to-enum-cast warning (again)
...
Dave Airlie [Sat, 13 Dec 2025 00:54:28 +0000 (10:54 +1000)]
Merge tag 'drm-misc-fixes-2025-12-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.19-rc1:
- Fix stack usage warning in novatek-nt35560.
- Fix s/r, i2c issues in nouveau and update string handling.
- Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83.
- Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties().
- Fix devcoredump crash on reading evicted bo's.
- Fix bigendian handling in mgag200.
- Fix probe failure in tilcdc.
Initializing automatic __free variables to NULL without need (e.g.
branches with different allocations), followed by actual allocation is
in contrary to explicit coding rules guiding cleanup.h:
"Given that the "__free(...) = NULL" pattern for variables defined at
the top of the function poses this potential interdependency problem the
recommendation is to always define and assign variables in one statement
and not group variable definitions at the top of the function when
__free() is used."
Code does not have a bug, but is less readable and uses discouraged
coding practice, so fix that by moving declaration to the place of
assignment.
Initializing automatic __free variables to NULL without need (e.g.
branches with different allocations), followed by actual allocation is
in contrary to explicit coding rules guiding cleanup.h:
"Given that the "__free(...) = NULL" pattern for variables defined at
the top of the function poses this potential interdependency problem the
recommendation is to always define and assign variables in one statement
and not group variable definitions at the top of the function when
__free() is used."
Code does not have a bug, but is less readable and uses discouraged
coding practice, so fix that by moving declaration to the place of
assignment.
Not that other existing usage of __free() in this context is a corret
exception initialized to NULL, because the actual allocation is branched
in if().
Linus Torvalds [Fri, 12 Dec 2025 17:44:03 +0000 (05:44 +1200)]
Merge tag 'loongarch-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Add basic LoongArch32 support
Note: Build infrastructures of LoongArch32 are not enabled yet,
because we need to adjust irqchip drivers and wait for GNU toolchain
be upstream first.
- Select HAVE_ARCH_BITREVERSE in Kconfig
- Fix build and boot for CONFIG_RANDSTRUCT
- Correct the calculation logic of thread_count
- Some bug fixes and other small changes
* tag 'loongarch-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits)
LoongArch: Adjust default config files for 32BIT/64BIT
LoongArch: Adjust VDSO/VSYSCALL for 32BIT/64BIT
LoongArch: Adjust misc routines for 32BIT/64BIT
LoongArch: Adjust user accessors for 32BIT/64BIT
LoongArch: Adjust system call for 32BIT/64BIT
LoongArch: Adjust module loader for 32BIT/64BIT
LoongArch: Adjust time routines for 32BIT/64BIT
LoongArch: Adjust process management for 32BIT/64BIT
LoongArch: Adjust memory management for 32BIT/64BIT
LoongArch: Adjust boot & setup for 32BIT/64BIT
LoongArch: Adjust common macro definitions for 32BIT/64BIT
LoongArch: Add adaptive CSR accessors for 32BIT/64BIT
LoongArch: Add atomic operations for 32BIT/64BIT
LoongArch: Add new PCI ID for pci_fixup_vgadev()
LoongArch: Add and use some macros for AVEC
LoongArch: Correct the calculation logic of thread_count
LoongArch: Use unsigned long for _end and _text
LoongArch: Use __pmd()/__pte() for swap entry conversions
LoongArch: Fix arch_dup_task_struct() for CONFIG_RANDSTRUCT
LoongArch: Fix build errors for CONFIG_RANDSTRUCT
...
- Fix chacha-riscv64-zvkb.S to not use frame pointer for data"
* tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
crypto: arm64/ghash - Fix incorrect output from ghash-neon
crypto/arm64: sm4/xts - Merge ksimd scopes to reduce stack bloat
crypto/arm64: aes/xts - Use single ksimd scope to reduce stack bloat
lib/crypto: blake2s: Replace manual unrolling with unrolled_full
lib/crypto: blake2b: Roll up BLAKE2b round loop on 32-bit
lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
lib/crypto: riscv/chacha: Avoid s0/fp register
Linus Torvalds [Fri, 12 Dec 2025 10:04:18 +0000 (22:04 +1200)]
Merge tag 'block-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Always initialize DMA state, fixing a potentially nasty issue on the
block side
- btrfs zoned write fix with cached zone reports
- Fix corruption issues in bcache with chained bio's, and further make
it clear that the chained IO handler is simply a marker, it's not
code meant to be executed
- Kill old code dealing with synchronous IO polling in the block layer,
that has been dead for a long time. Only async polling is supported
these days
- Fix a lockdep issue in tag_set management, moving it to RCU
- Fix an issue with ublks bio_vec iteration
- Don't unconditionally enforce blocking issue of ublk control
commands, allow some of them with non-blocking issue as they
do not block
* tag 'block-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
blk-mq-dma: always initialize dma state
blk-mq: delete task running check in blk_hctx_poll()
block: fix cached zone reports on devices with native zone append
block: Use RCU in blk_mq_[un]quiesce_tagset() instead of set->tag_list_lock
ublk: don't mutate struct bio_vec in iteration
block: prohibit calls to bio_chain_endio
bcache: fix improper use of bi_end_io
ublk: allow non-blocking ctrl cmds in IO_URING_F_NONBLOCK issue
Linus Torvalds [Fri, 12 Dec 2025 10:01:32 +0000 (22:01 +1200)]
Merge tag 'io_uring-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fix from Jens Axboe:
"Single fix for io_uring headed to stable, fixing an issue introduced
with the min_wait support earlier this year, where SQPOLL didn't get
correctly woken if an event arrived once the event waiting has
finished the min_wait portion.
As we already have regression tests for this added and people
reporting new failures there, let's get this one flushed out
so it can bubble back down to stable as well"
* tag 'io_uring-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: fix min_wait wakeups for SQPOLL
Linus Torvalds [Fri, 12 Dec 2025 09:59:19 +0000 (21:59 +1200)]
Merge tag 'v6.19-rc-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- minor cleanup
- minor update to comment to avoid confusion about fs type
* tag 'v6.19-rc-smb3-server-fixes' of git://git.samba.org/ksmbd:
smb/server: add comment to FileSystemName of FileFsAttributeInformation
smb/server: remove unused nterr.h
smb/server: rename include guard in smb_common.h
Linus Torvalds [Fri, 12 Dec 2025 09:52:42 +0000 (21:52 +1200)]
Merge tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Bugfixes:
- Fix 'nlink' attribute update races when unlinking a file
- Add missing initialisers for the directory verifier in various
places
- Don't regress the NFSv4 open state due to misordered racing replies
- Ensure the NFSv4.x callback server uses the correct transport
connection
- Fix potential use-after-free races when shutting down the NFSv4.x
callback server
- Fix a pNFS layout commit crash
- Assorted fixes to ensure correct propagation of mount options when
the client crosses a filesystem boundary and triggers the VFS
automount code
- More localio fixes
Features and cleanups:
- Add initial support for basic directory delegations
- SunRPC back channel code cleanups"
* tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits)
NFSv4: Handle NFS4ERR_NOTSUPP errors for directory delegations
nfs/localio: remove 61 byte hole from needless ____cacheline_aligned
nfs/localio: remove alignment size checking in nfs_is_local_dio_possible
NFS: Fix up the automount fs_context to use the correct cred
NFS: Fix inheritance of the block sizes when automounting
NFS: Automounted filesystems should inherit ro,noexec,nodev,sync flags
Revert "nfs: ignore SB_RDONLY when mounting nfs"
Revert "nfs: clear SB_RDONLY before getting superblock"
Revert "nfs: ignore SB_RDONLY when remounting nfs"
NFS: Add a module option to disable directory delegations
NFS: Shortcut lookup revalidations if we have a directory delegation
NFS: Request a directory delegation during RENAME
NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK
NFS: Add support for sending GDD_GETATTR
NFSv4/pNFS: Clear NFS_INO_LAYOUTCOMMIT in pnfs_mark_layout_stateid_invalid
NFSv4.1: protect destroying and nullifying bc_serv structure
SUNRPC: new helper function for stopping backchannel server
SUNRPC: cleanup common code in backchannel request
NFSv4.1: pass transport for callback shutdown
NFSv4: ensure the open stateid seqid doesn't go backwards
...
Brendan Jackman [Sun, 7 Dec 2025 03:53:18 +0000 (03:53 +0000)]
bug: Hush suggest-attribute=format for __warn_printf()
Recent additions to this function cause GCC 14.3.0 to get excited
(W=1) and suggest a missing attribute:
lib/bug.c: In function '__warn_printf':
lib/bug.c:187:25: error: function '__warn_printf' be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format]
187 | vprintk(fmt, *args);
| ^~~~~~~
Disable the diagnostic locally, following the pattern used for stuff
like va_format().
Heiko Carstens [Mon, 8 Dec 2025 20:06:58 +0000 (21:06 +0100)]
bug: Let report_bug_entry() provide the correct bugaddr
report_bug_entry() always provides zero for bugaddr but could easily
extract the correct address from the provided bug_entry. Just do that to
have proper warning messages.
E.g. adding an artificial:
void foo(void) { WARN_ONCE(1, "bar"); }
function generates this warning message:
WARNING: arch/s390/kernel/setup.c:1017 at 0x0, CPU#0: swapper/0/0
^^^
With the correct bug address this changes to:
WARNING: arch/s390/kernel/setup.c:1017 at foo+0x1c/0x40, CPU#0: swapper/0/0
^^^^^^^^^^^^^
Evan Li [Fri, 12 Dec 2025 08:49:43 +0000 (16:49 +0800)]
perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()
handle_pmi_common() may observe an active bit set in cpuc->active_mask
while the corresponding cpuc->events[] entry has already been cleared,
which leads to a NULL pointer dereference.
This can happen when interrupt throttling stops all events in a group
while PEBS processing is still in progress. perf_event_overflow() can
trigger perf_event_throttle_group(), which stops the group and clears
the cpuc->events[] entry, but the active bit may still be set when
handle_pmi_common() iterates over the events.
The following recent fix:
7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")
moved the cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del() and
relied on cpuc->active_mask/pebs_enabled checks. However,
handle_pmi_common() can still encounter a NULL cpuc->events[] entry
despite the active bit being set.
Add an explicit NULL check on the event pointer before using it,
to cover this legitimate scenario and avoid the NULL dereference crash.
Fixes: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss") Reported-by: kitta <kitta@linux.alibaba.com> Co-developed-by: kitta <kitta@linux.alibaba.com> Signed-off-by: Evan Li <evan.li@linux.alibaba.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20251212084943.2124787-1-evan.li@linux.alibaba.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220855
When building without CONFIG_PM_SLEEP, there are several warnings (or
errors with CONFIG_WERROR=y / W=e) from the cix-ipbloq driver:
sound/hda/controllers/cix-ipbloq.c:378:12: error: 'cix_ipbloq_hda_runtime_resume' defined but not used [-Werror=unused-function]
378 | static int cix_ipbloq_hda_runtime_resume(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sound/hda/controllers/cix-ipbloq.c:362:12: error: 'cix_ipbloq_hda_runtime_suspend' defined but not used [-Werror=unused-function]
362 | static int cix_ipbloq_hda_runtime_suspend(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sound/hda/controllers/cix-ipbloq.c:349:12: error: 'cix_ipbloq_hda_resume' defined but not used [-Werror=unused-function]
349 | static int cix_ipbloq_hda_resume(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~
sound/hda/controllers/cix-ipbloq.c:336:12: error: 'cix_ipbloq_hda_suspend' defined but not used [-Werror=unused-function]
336 | static int cix_ipbloq_hda_suspend(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~
When CONFIG_PM and CONFIG_PM_SLEEP are unset, SET_SYSTEM_SLEEP_PM_OPS()
and SET_RUNTIME_PM_OPS() evaluate to nothing, so these functions appear
unused to the compiler in this configuration.
Use the modern SYSTEM_SLEEP_PM_OPS and RUNTIME_PM_OPS macros to resolve
these warnings, which is what they are intended to do. Additionally,
wrap &cix_ipbloq_hda_pm in pm_ptr() to ensure the compiler can drop the
entire structure when CONFIG_PM is unset.
Linus Torvalds [Thu, 11 Dec 2025 03:13:29 +0000 (12:13 +0900)]
Merge tag 'for-6.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mikulas Patocka:
- convert crypto_shash users to direct crypto library use with simpler
and faster code and reduced stack usage (Eric Biggers):
- the dm-verity SHA-256 conversion also teaches it to do two-way
interleaved hashing for added performance
- dm-crypt MD5 conversion (used for Loop-AES compatibility)
- added document for for takeover/reshape raid1 -> raid5 examples (Heinz Mauelshagen)
- fix dm-vdo kerneldoc warnings (Matthew Sakai)
- various random fixes and cleanups
* tag 'for-6.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (29 commits)
dm pcache: fix segment info indexing
dm pcache: fix cache info indexing
dm-pcache: advance slot index before writing slot
dm raid: add documentation for takeover/reshape raid1 -> raid5 table line examples
dm log-writes: Add missing set_freezable() for freezable kthread
dm-raid: fix possible NULL dereference with undefined raid type
dm-snapshot: fix 'scheduling while atomic' on real-time kernels
dm: ignore discard return value
MAINTAINERS: add Benjamin Marzinski as a device mapper maintainer
dm-mpath: Simplify the setup_scsi_dh code
dm vdo: fix kerneldoc warnings
dm-bufio: align write boundary on physical block size
dm-crypt: enable DM_TARGET_ATOMIC_WRITES
dm: test for REQ_ATOMIC in dm_accept_partial_bio()
dm-verity: remove useless mempool
dm-verity: disable recursive forward error correction
dm-ebs: Mark full buffer dirty even on partial write
dm mpath: enable DM_TARGET_ATOMIC_WRITES
dm verity fec: Expose corrected block count via status
dm: Don't warn if IMA_DISABLE_HTABLE is not enabled
...
Linus Torvalds [Thu, 11 Dec 2025 00:57:08 +0000 (09:57 +0900)]
Merge tag 'spi-fix-v6.19-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few small fixes for SPI that came in during the merge window,
nothing too exciting here"
* tag 'spi-fix-v6.19-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: microchip-core: Fix an error handling path in mchp_corespi_probe()
spi: cadence-qspi: Fix runtime PM imbalance in probe
Linus Torvalds [Thu, 11 Dec 2025 00:54:59 +0000 (09:54 +0900)]
Merge tag 'regulator-fix-v6.19-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A few fixes that came in during the merge window, nothing too
exciting - the one core fix improves error propagation from gpiolib
which hopefully shouldn't actually happen but is safer"
* tag 'regulator-fix-v6.19-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: spacemit: Align input supply name with the DT binding
regulator: fixed: Rely on the core freeing the enable GPIO
regulator: check the return value of gpiod_set_value_cansleep()
Arnd Bergmann [Thu, 4 Dec 2025 10:01:58 +0000 (11:01 +0100)]
mm: memfd_luo: add CONFIG_SHMEM dependency
The new memfd code fails to link without SHMEM:
aarch64-linux-ld: mm/memfd_luo.o: in function `memfd_luo_retrieve_folios':
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0xdc): undefined reference to `shmem_add_to_page_cache'
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0x11c): undefined reference to `shmem_inode_acct_blocks'
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0x134): undefined reference to `shmem_recalc_inode'
Add a Kconfig dependency to disallow that configuration.
Arnd Bergmann [Thu, 4 Dec 2025 10:28:59 +0000 (11:28 +0100)]
mm: shmem: avoid build warning for CONFIG_SHMEM=n
The newly added 'flags' variable is unused and causes a warning if
CONFIG_SHMEM is disabled, since the shmem_acct_size() macro it is passed
into does nothing:
mm/shmem.c: In function '__shmem_file_setup':
mm/shmem.c:5816:23: error: unused variable 'flags' [-Werror=unused-variable]
5816 | unsigned long flags = (vm_flags & VM_NORESERVE) ? SHMEM_F_NORESERVE : 0;
| ^~~~~
Replace the two macros with equivalent inline functions to get the
argument checking.
Link: https://lkml.kernel.org/r/20251204102905.1048000-1-arnd@kernel.org Fixes: 6ff1610ced56 ("mm: shmem: use SHMEM_F_* flags instead of VM_* flags") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Christian Brauner <brauner@kernel.org> Cc: guoweikang <guoweikang.kernel@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: invalidate inode if i_mode is zero after block read
A panic occurs in ocfs2_unlink due to WARN_ON(inode->i_nlink == 0) when
handling a corrupted inode with i_mode=0 and i_nlink=0 in memory.
This "zombie" inode is created because ocfs2_read_locked_inode proceeds
even after ocfs2_validate_inode_block successfully validates a block that
structurally looks okay (passes checksum, signature etc.) but contains
semantically invalid data (specifically i_mode=0). The current validation
function doesn't check for i_mode being zero.
This results in an in-memory inode with i_mode=0 being added to the VFS
cache, which later triggers the panic during unlink.
Prevent this by adding an explicit check for (i_mode == 0, i_nlink == 0,
non-orphan) within ocfs2_validate_inode_block. If the check is true,
return -EFSCORRUPTED to signal corruption. This causes the caller
(ocfs2_read_locked_inode) to invoke make_bad_inode(), correctly preventing
the zombie inode from entering the cache.
Link: https://lkml.kernel.org/r/20251202224507.53452-2-eraykrdg1@gmail.com Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com> Reported-by: syzbot+55c40ae8a0e5f3659f2b@syzkaller.appspotmail.com Fixes: https://syzkaller.appspot.com/bug?extid=55c40ae8a0e5f3659f2b Link: https://lore.kernel.org/all/20251022222752.46758-2-eraykrdg1@gmail.com/T/ Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: David Hunter <david.hunter.linux@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
Use the new TRAILING_OVERLAP() helper to fix the following warning:
fs/ocfs2/xattr.c:52:41: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
This helper creates a union between a flexible-array member (FAM) and a
set of MEMBERS that would otherwise follow it.
This overlays the trailing MEMBER struct ocfs2_extent_rec er; onto the FAM
struct ocfs2_xattr_value_root::xr_list.l_recs[], while keeping the FAM and
the start of MEMBER aligned.
The static_assert() ensures this alignment remains, and it's intentionally
placed inmediately after the related structure --no blank line in between.
Link: https://lkml.kernel.org/r/aRKm_7aN7Smc3J5L@kspp Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: convert remaining read-only checks to ocfs2_emergency_state
Now that the centralized `ocfs2_emergency_state()` helper is available,
refactor remaining filesystem-wide checks for `ocfs2_is_soft_readonly` and
`ocfs2_is_hard_readonly` to use this new function.
To ensure strict consistency with the previous behavior and guarantee no
functional changes, the call sites continue to explicitly return -EROFS
when the emergency state is detected. This standardizes the check logic
while preserving the existing error handling flow.
Link: https://lkml.kernel.org/r/3421641b54ad6b6e4ffca052351b518eacc1bd08.1764728893.git.eraykrdg1@gmail.com Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: David Hunter <david.hunter.linux@gmail.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: add ocfs2_emergency_state helper and apply to setattr
Patch series "ocfs2: Refactor read-only checks to use
ocfs2_emergency_state", v4.
Following the fix for the `make_bad_inode` validation failure (syzbot ID: b93b65ee321c97861072), this separate series introduces a new helper
function, `ocfs2_emergency_state()`, to improve and centralize read-only
and error state checking.
This is modeled after the `ext4_emergency_state()` pattern, providing a
single, unified location for checking all filesystem-level emergency
conditions. This makes the code cleaner and ensures that any future
checks (e.g., for fatal error states) can be added in one place.
This series is structured as follows:
1. The first patch introduces the `ocfs2_emergency_state()` helper
(currently checking for -EROFS) and applies it to `ocfs2_setattr`
to provide a "fail-fast" mechanism, as suggested by Albin
Babu Varghese.
2. The second patch completes the refactoring by converting all
remaining read-only checks throughout OCFS2 to use this new helper.
This patch (of 2):
To centralize error checking, follow the pattern of other filesystems like
ext4 (which uses `ext4_emergency_state()`), and prepare for future
enhancements, this patch introduces a new helper function:
`ocfs2_emergency_state()`.
The purpose of this helper is to provide a single, unified location for
checking all filesystem-level emergency conditions. In this initial
implementation, the function only checks for the existing hard and soft
read-only modes, returning -EROFS if either is set.
This provides a foundation where future checks (e.g., for fatal error
states returning -EIO, or shutdown states) can be easily added in one
place.
This patch also adds this new check to the beginning of `ocfs2_setattr()`.
This ensures that operations like `ftruncate` (which triggered the
original BUG) fail-fast with -EROFS when the filesystem is already in a
read-only state.
Link: https://lkml.kernel.org/r/cover.1764728893.git.eraykrdg1@gmail.com Link: https://lkml.kernel.org/r/e9e975bcaaff8dbc155b70fbc1b2798a2e36e96f.1764728893.git.eraykrdg1@gmail.com Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com> Suggested-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: David Hunter <david.hunter.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ally Heev [Wed, 3 Dec 2025 15:28:49 +0000 (20:58 +0530)]
checkpatch: add uninitialized pointer with __free attribute check
Uinitialized pointers with __free attribute can cause undefined behavior
as the memory randomly assigned to the pointer is freed automatically when
the pointer goes out of scope. add check in checkpatch to detect such
issues.
syzbot reported a kernel BUG in ocfs2_find_victim_chain() because the
`cl_next_free_rec` field of the allocation chain list (next free slot in
the chain list) is 0, triggring the BUG_ON(!cl->cl_next_free_rec)
condition in ocfs2_find_victim_chain() and panicking the kernel.
To fix this, an if condition is introduced in ocfs2_claim_suballoc_bits(),
just before calling ocfs2_find_victim_chain(), the code block in it being
executed when either of the following conditions is true:
1. `cl_next_free_rec` is equal to 0, indicating that there are no free
chains in the allocation chain list
2. `cl_next_free_rec` is greater than `cl_count` (the total number of
chains in the allocation chain list)
Either of them being true is indicative of the fact that there are no
chains left for usage.
This is addressed using ocfs2_error(), which prints
the error log for debugging purposes, rather than panicking the kernel.
Link: https://lkml.kernel.org/r/20251201130711.143900-1-activprithvi@gmail.com Signed-off-by: Prithvi Tambewagh <activprithvi@gmail.com> Reported-by: syzbot+96d38c6e1655c1420a72@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=96d38c6e1655c1420a72 Tested-by: syzbot+96d38c6e1655c1420a72@syzkaller.appspotmail.com Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pasha Tatashin [Sun, 30 Nov 2025 01:09:19 +0000 (20:09 -0500)]
liveupdate: luo_core: fix redundant bound check in luo_ioctl()
The kernel test robot reported a Smatch warning:
kernel/liveupdate/luo_core.c:402 luo_ioctl() warn: unsigned 'nr' is
never less than zero.
This occurs because 'nr' is unsigned and LIVEUPDATE_CMD_BASE is currently
defined as 0, making the check (nr < LIVEUPDATE_CMD_BASE) always false.
Remove the explicit lower bound check. The logic remains correct because
'nr' is unsigned; if nr is less than LIVEUPDATE_CMD_BASE, the expression
(nr - LIVEUPDATE_CMD_BASE) will wrap around to a large positive value.
This will inevitably be larger than ARRAY_SIZE(luo_ioctl_ops) and be
caught by the upper bound check.
Link: https://lkml.kernel.org/r/20251130010919.1488230-1-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511280300.6pvBmXUS-lkp@intel.com/ Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: validate inline xattr size and entry count in ocfs2_xattr_ibody_list
Add comprehensive validation of inline xattr metadata in
ocfs2_xattr_ibody_list() to prevent out-of-bounds access and
use-after-free bugs when processing corrupted inline xattrs.
The patch adds two critical validations:
1. Validates i_xattr_inline_size before use:
- Ensures it does not exceed block size
- Ensures it is at least large enough for xattr header
- Prevents pointer arithmetic with corrupted size values that could
point outside the inode block
2. Validates xattr entry count (xh_count):
- Calculates maximum entries that can fit in the inline space
- Rejects counts that exceed this limit
- Prevents out-of-bounds array access in subsequent code
Without these checks, a corrupted filesystem with invalid inline xattr
metadata can cause the code to access memory beyond the allocated space.
For example:
- A corrupted i_xattr_inline_size of 0 would cause header pointer
calculation to point past the end of the block
- A corrupted xh_count of 22 with inline_size of 256 would cause
array access 7 entries beyond the 15 that actually fit (the syzbot
reproducer used xh_count of 20041), leading to use-after-free when
accessing freed memory pages
The validation uses the correct inline_size (from di->i_xattr_inline_size)
rather than block size, ensuring accurate bounds checking for inline
xattrs specifically.
Thorsten Blum [Wed, 26 Nov 2025 11:44:15 +0000 (12:44 +0100)]
ocfs2: replace deprecated strcpy with strscpy
strcpy() has been deprecated [1] because it performs no bounds checking on
the destination buffer, which can lead to buffer overflows. Replace it
with the safer strscpy(). No functional changes.
ocfs2: check tl_used after reading it from trancate log inode
The fuzz image has a truncate log inode whose tl_used is bigger than
tl_count so it triggers the BUG in ocfs2_truncate_log_needs_flush() [1].
As what the check in ocfs2_truncate_log_needs_flush() does, just do same
check into ocfs2_get_truncate_log_info() when truncate log inode is
reading in so we can bail out earlier.
Link: https://lkml.kernel.org/r/tencent_B24B1C1BE225DCBA44BB6933AB9E1B1B0708@qq.com Reported-by: syzbot+f82afc4d4e74d0ef7a89@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f82afc4d4e74d0ef7a89 Tested-by: syzbot+f82afc4d4e74d0ef7a89@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>