When a UAS device is unplugged during data transfer, there is
a probability of a system panic occurring. The root cause is
an access to an invalid memory address during URB callback handling.
Specifically, this happens when the dma_direct_unmap_sg() function
is called within the usb_hcd_unmap_urb_for_dma() interface, but the
sg->dma_address field is 0 and the sg data structure has already been
freed.
The SCSI driver sends transfer commands by invoking uas_queuecommand_lck()
in uas.c, using the uas_submit_urbs() function to submit requests to USB.
Within the uas_submit_urbs() implementation, three URBs (sense_urb,
data_urb, and cmd_urb) are sequentially submitted. Device removal may
occur at any point during uas_submit_urbs execution, which may result
in URB submission failure. However, some URBs might have been successfully
submitted before the failure, and uas_submit_urbs will return the -ENODEV
error code in this case. The current error handling directly calls
scsi_done(). In the SCSI driver, this eventually triggers scsi_complete()
to invoke scsi_end_request() for releasing the sgtable. The successfully
submitted URBs, when being unlinked to giveback, call
usb_hcd_unmap_urb_for_dma() in hcd.c, leading to exceptions during sg
unmapping operations since the sg data structure has already been freed.
This patch modifies the error condition check in the uas_submit_urbs()
function. When a UAS device is removed but one or more URBs have already
been successfully submitted to USB, it avoids immediately invoking
scsi_done() and save the cmnd to devinfo->cmnd array. If the successfully
submitted URBs is completed before devinfo->resetting being set, then
the scsi_done() function will be called within uas_try_complete() after
all pending URB operations are finalized. Otherwise, the scsi_done()
function will be called within uas_zap_pending(), which is executed after
usb_kill_anchored_urbs().
The error handling only takes effect when uas_queuecommand_lck() calls
uas_submit_urbs() and returns the error value -ENODEV . In this case,
the device is disconnected, and the flow proceeds to uas_disconnect(),
where uas_zap_pending() is invoked to call uas_try_complete().
Discovered by Atuin - Automated Vulnerability Discovery Engine.
new_pba comes from the status packet returned after each write.
A bogus device could report values beyond the block count derived
from info->capacity, letting the driver walk off the end of
pba_to_lba[] and corrupt heap memory.
Reject PBAs that exceed the computed block count and fail the
transfer so we avoid touching out-of-range mapping entries.
A report from Oleg Smirnov indicates that the unusual_devs quirks
entry for the Novatek camera does not need to override the subclass
and protocol parameters:
[3266355.209532] usb 1-3: new high-speed USB device number 10 using xhci_hcd
[3266355.333031] usb 1-3: New USB device found, idVendor=0603, idProduct=8611, bcdDevice= 1.00
[3266355.333040] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[3266355.333043] usb 1-3: Product: YICARCAM
[3266355.333045] usb 1-3: Manufacturer: XIAO-YI
[3266355.333047] usb 1-3: SerialNumber: 966110000000100
[3266355.338621] usb-storage 1-3:1.0: USB Mass Storage device detected
[3266355.338817] usb-storage 1-3:1.0: Quirks match for vid 0603 pid 8611: 4000
[3266355.338821] usb-storage 1-3:1.0: This device (0603,8611,0100 S 06 P 50) has unneeded SubClass and Protocol entries in unusual_devs.h (kernel 6.16.10-arch1-1)
Please send a copy of this message to
<linux-usb@vger.kernel.org> and <usb-storage@lists.one-eyed-alien.net>
The overrides are harmless but they do provoke the driver into logging
this annoying message. Update the entry to remove the unneeded entries.
A kernel memory leak was identified by the 'ioctl_sg01' test from Linux
Test Project (LTP). The following bytes were mainly observed: 0x53425355.
When USB storage devices incorrectly skip the data phase with status data,
the code extracts/validates the CSW from the sg buffer, but fails to clear
it afterwards. This leaves status protocol data in srb's transfer buffer,
such as the US_BULK_CS_SIGN 'USBS' signature observed here. Thus, this can
lead to USB protocols leaks to user space through SCSI generic (/dev/sg*)
interfaces, such as the one seen here when the LTP test requested 512 KiB.
Fix the leak by zeroing the CSW data in srb's transfer buffer immediately
after the validation of devices that skip data phase.
Note: Differently from CVE-2018-1000204, which fixed a big leak by zero-
ing pages at allocation time, this leak occurs after allocation, when USB
protocol data is written to already-allocated sg pages.
Fixes: a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()") Cc: stable <stable@kernel.org> Signed-off-by: Desnes Nunes <desnesn@redhat.com> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Link: https://patch.msgid.link/20251031043436.55929-1-desnesn@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The issue occurs because usbhs_sys_function_pullup(), which accesses the IP
registers, is executed after the USBHS clocks have been disabled. The
problem is reproducible on the Renesas RZ/G3S SoC starting with the
addition of module stop in the clock enable/disable APIs. With module stop
functionality enabled, a bus error is expected if a master accesses a
module whose clock has been stopped and module stop activated.
The existing code did not handle the failure case of usb_ep_queue in the
command path, potentially leading to memory leaks.
Improve error handling to free all allocated resources on usb_ep_queue
failure. This patch continues to use goto logic for error handling, as the
existing error handling is complex and not easily adaptable to auto-cleanup
helpers.
The driver uses pcim_enable_device() to enable the PCI device,
the device will be automatically disabled on driver detach through
the managed device framework. The manual pci_disable_device() calls
in the error paths are therefore redundant and should be removed.
Found via static anlaysis and this is similar to commit 99ca0b57e49f
("thermal: intel: int340x: processor: Fix warning during module unload").
The MOST subsystem has a non-standard registration function which frees
the interface on registration failures and on deregistration.
This unsurprisingly leads to bugs in the MOST drivers, and a couple of
recent changes turned a reference underflow and use-after-free in the
USB driver into several double free and a use-after-free on late probe
failures.
Fixes: 723de0f9171e ("staging: most: remove device from interface structure") Fixes: 4b1270902609 ("most: usb: Fix use-after-free in hdm_disconnect") Fixes: a8cc9e5fcb0e ("most: usb: hdm_probe: Fix calling put_device() before device initialization") Cc: stable@vger.kernel.org Cc: Christian Gromm <christian.gromm@microchip.com> Cc: Victoria Votokina <Victoria.Votokina@kaspersky.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20251029093029.28922-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Check for returned DMA addresses using specialized dma_mapping_error()
helper which is generally recommended for this purpose by
Documentation/core-api/dma-api.rst:
"In some circumstances dma_map_single(), ...
will fail to create a mapping. A driver can check for these errors
by testing the returned DMA address with dma_mapping_error()."
Found via static analysis and this is similar to commit fa0308134d26
("ALSA: memalloc: prefer dma_mapping_error() over explicit address checking")
When __mptcp_retrans() kicks-in, it schedules one or more subflows for
retransmission, but such subflows could be actually left alone if there
is no more data to retransmit and/or in case of concurrent fallback.
Scheduled subflows could be processed much later in time, i.e. when new
data will be transmitted, leading to bad subflow selection.
Explicitly clear all scheduled subflows before leaving the
retransmission function.
Fixes: ee2708aedad0 ("mptcp: use get_retrans wrapper") Cc: stable@vger.kernel.org Reported-by: Filip Pokryvka <fpokryvk@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251125-net-mptcp-clear-sched-rtx-v1-1-1cea4ad2165f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 27e8fe0da3b7 ("mmc: sdhci-of-dwcmshc: Prevent stale command
interrupt handling") clears pending interrupts when resetting
host->pending_reset to ensure no pending stale interrupts after
sdhci_threaded_irq restores interrupts. But this fix is only added for
th1520 platforms, in fact per my test, this issue exists on all
dwcmshc users, such as cv1800b, sg2002, and synaptics platforms.
So promote the above reset handling from th1520 to ip level. And keep
reset handling on rk, sg2042 and bf3 as is, until it's confirmed that
the same issue exists on these platforms too.
Fixes: 017199c2849c ("mmc: sdhci-of-dwcmshc: Add support for Sophgo CV1800B and SG2002") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When allocating hugetlb folios for memfd, three initialization steps are
missing:
1. Folios are not zeroed, leading to kernel memory disclosure to userspace
2. Folios are not marked uptodate before adding to page cache
3. hugetlb_fault_mutex is not taken before hugetlb_add_to_page_cache()
The memfd allocation path bypasses the normal page fault handler
(hugetlb_no_page) which would handle all of these initialization steps.
This is problematic especially for udmabuf use cases where folios are
pinned and directly accessed by userspace via DMA.
Fix by matching the initialization pattern used in hugetlb_no_page():
- Zero the folio using folio_zero_user() which is optimized for huge pages
- Mark it uptodate with folio_mark_uptodate()
- Take hugetlb_fault_mutex before adding to page cache to prevent races
The folio_zero_user() change also fixes a potential security issue where
uninitialized kernel memory could be disclosed to userspace through read()
or mmap() operations on the memfd.
Link: https://lkml.kernel.org/r/20251112145034.2320452-1-kartikey406@gmail.com Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios") Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reported-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/20251112031631.2315651-1-kartikey406@gmail.com/ Closes: https://syzkaller.appspot.com/bug?extid=f64019ba229e3a5c411b Suggested-by: Oscar Salvador <osalvador@suse.de> Suggested-by: David Hildenbrand <david@redhat.com> Tested-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Hugh Dickins <hughd@google.com> Cc: Vivek Kasireddy <vivek.kasireddy@intel.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@nvidia.com> (v2) Cc: Christoph Hellwig <hch@lst.de> (v6) Cc: Dave Airlie <airlied@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix the incorrect usage of platform_set_drvdata and dev_set_drvdata. They
both are of the same data and overrides each other. This resulted in the
rmmod of the svc driver to fail and throw a kernel panic for kthread_stop
and fifo free.
Fixes: b5dc75c915cd ("firmware: stratix10-svc: extend svc to support new RSU features") Cc: stable@vger.kernel.org # 6.6+ Signed-off-by: Ang Tien Sung <tiensung.ang@altera.com> Signed-off-by: Khairul Anuar Romli <khairul.anuar.romli@altera.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function qcom_slim_ngd_notify_slaves() calls of_slim_get_device() which
internally uses device_find_child() to obtain a device reference.
According to the device_find_child() documentation,
the caller must drop the reference with put_device() after use.
Found via static analysis and this is similar to commit 4e65bda8273c
("ASoC: wcd934x: fix error handling in wcd934x_codec_parse_data()")
When having a multiuser mount with domain= specified and using
cifscreds, cifs_set_cifscreds() will end up setting @ctx->domainname,
so it needs to be freed before leaving cifs_construct_tcon().
This fixes the following memory leak reported by kmemleak:
Fixes: f2aee329a68f ("cifs: set domainName when a domain-key is used in multiuser") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Jay Shin <jaeshin@redhat.com> Cc: stable@vger.kernel.org Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The sysdev_is_parent check was being used to infer PCI devices that have
the DMA mask set from the PCI capabilities, but sysdev_is_parent is also
used for non-PCI ACPI devices in which case the DMA mask would be the
bus default or as set by the _DMA method.
Without this fix the DMA mask would default to 32-bits and so allocation
would fail if there was no DRAM below 4GB.
GFP_NOWAIT allocation may fail anytime. It needs to be changed to
GFP_NOIO. There's no need to handle an error because mempool_alloc with
GFP_NOIO can't fail.
The crash in process_v2_sparse_read() for fscrypt-encrypted directories
has been reported. Issue takes place for Ceph msgr2 protocol in secure
mode. It can be reproduced by the steps:
sudo mount -t ceph :/ /mnt/cephfs/ -o name=admin,fs=cephfs,ms_mode=secure
[ 417.133921] Second to last potentially related work creation:
[ 417.134183] kasan_save_stack+0x26/0x60
[ 417.134362] kasan_record_aux_stack+0xae/0xd0
[ 417.134566] __call_rcu_common+0xcd/0x14b0
[ 417.134782] call_rcu+0x31/0x50
[ 417.134929] put_task_struct_rcu_user+0x58/0xb0
[ 417.135143] finish_task_switch.isra.0+0x5d3/0x830
[ 417.135366] __schedule+0xd30/0x5100
[ 417.135534] schedule_idle+0x5a/0x90
[ 417.135712] do_idle+0x25f/0x410
[ 417.135871] cpu_startup_entry+0x53/0x70
[ 417.136053] start_secondary+0x216/0x2c0
[ 417.136233] common_startup_64+0x13e/0x141
[ 417.136894] The buggy address belongs to the object at ffff888124870000
which belongs to the cache task_struct of size 10504
[ 417.138122] The buggy address is located 52 bytes inside of
freed 10504-byte region [ffff888124870000, ffff888124872908)
[ 417.145859] Memory state around the buggy address:
[ 417.146094] ffff88812486ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
fc
[ 417.146439] ffff88812486ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
fc
[ 417.146791] >ffff888124870000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[ 417.147145] ^
[ 417.147387] ffff888124870080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[ 417.147751] ffff888124870100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[ 417.148123]
==================================================================
First of all, we have warning in get_bvec_at() because
cursor->total_resid contains zero value. And, finally,
we have crash in ceph_msg_data_advance() because
cursor->data is NULL. It means that get_bvec_at()
receives not initialized ceph_msg_data_cursor structure
because data is NULL and total_resid contains zero.
Moreover, we don't have likewise issue for the case of
Ceph msgr1 protocol because ceph_msg_data_cursor_init()
has been called before reading sparse data.
This patch adds calling of ceph_msg_data_cursor_init()
in the beginning of process_v2_sparse_read() with
the goal to guarantee that logic of reading sparse data
works correctly for the case of Ceph msgr2 protocol.
Reading the interrupt register `SUN4I_REG_INT_ADDR` causes all of its bits
to be reset. If we ever reach the condition of handling more than
`SUN4I_CAN_MAX_IRQ` IRQs, we will have read the register and reset all its
bits but without actually handling the interrupt inside of the loop body.
This may, among other issues, cause us to never `netif_wake_queue()` again
after a transmission interrupt.
Fixes: 0738eff14d81 ("can: Allwinner A10/A20 CAN Controller support - Kernel module") Cc: stable@vger.kernel.org Co-developed-by: Thomas Mühlbacher <tmuehlbacher@posteo.net> Signed-off-by: Thomas Mühlbacher <tmuehlbacher@posteo.net> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251116-sun4i-fix-loop-v1-1-3d76d3f81950@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reading the interrupt register `SJA1000_IR` causes all of its bits to be
reset. If we ever reach the condition of handling more than
`SJA1000_MAX_IRQ` IRQs, we will have read the register and reset all its
bits but without actually handling the interrupt inside of the loop
body.
This may, among other issues, cause us to never `netif_wake_queue()`
again after a transmission interrupt.
Fixes: 429da1cc841b ("can: Driver for the SJA1000 CAN controller") Cc: stable@vger.kernel.org Signed-off-by: Thomas Mühlbacher <tmuehlbacher@posteo.net> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20251115153437.11419-1-tmuehlbacher@posteo.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In btusb_mtk_setup(), we set `btmtk_data->isopkt_intf` to:
usb_ifnum_to_if(data->udev, MTK_ISO_IFNUM)
That function can return NULL in some cases. Even when it returns
NULL, though, we still go on to call btusb_mtk_claim_iso_intf().
As of commit e9087e828827 ("Bluetooth: btusb: mediatek: Add locks for
usb_driver_claim_interface()"), calling btusb_mtk_claim_iso_intf()
when `btmtk_data->isopkt_intf` is NULL will cause a crash because
we'll end up passing a bad pointer to device_lock(). Prior to that
commit we'd pass the NULL pointer directly to
usb_driver_claim_interface() which would detect it and return an
error, which was handled.
Resolve the crash in btusb_mtk_claim_iso_intf() by adding a NULL check
at the start of the function. This makes the code handle a NULL
`btmtk_data->isopkt_intf` the same way it did before the problematic
commit (just with a slight change to the error message printed).
Reported-by: IncogCyberpunk <incogcyberpunk@proton.me> Closes: http://lore.kernel.org/r/a380d061-479e-4713-bddd-1d6571ca7e86@leemhuis.info Fixes: e9087e828827 ("Bluetooth: btusb: mediatek: Add locks for usb_driver_claim_interface()") Cc: stable@vger.kernel.org Tested-by: IncogCyberpunk <incogcyberpunk@proton.me> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Protect access to fore200e->available_cell_rate with rate_mtx lock in the
error handling path of fore200e_open() to prevent a data race.
The field fore200e->available_cell_rate is a shared resource used to track
available bandwidth. It is concurrently accessed by fore200e_open(),
fore200e_close(), and fore200e_change_qos().
In fore200e_open(), the lock rate_mtx is correctly held when subtracting
vcc->qos.txtp.max_pcr from available_cell_rate to reserve bandwidth.
However, if the subsequent call to fore200e_activate_vcin() fails, the
function restores the reserved bandwidth by adding back to
available_cell_rate without holding the lock.
This introduces a race condition because available_cell_rate is a global
device resource shared across all VCCs. If the error path in
fore200e_open() executes concurrently with operations like
fore200e_close() or fore200e_change_qos() on other VCCs, a
read-modify-write race occurs.
Specifically, the error path reads the rate without the lock. If another
CPU acquires the lock and modifies the rate (e.g., releasing bandwidth in
fore200e_close()) between this read and the subsequent write, the error
path will overwrite the concurrent update with a stale value. This results
in incorrect bandwidth accounting.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251120120657.2462194-1-hanguidong02@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The i.MX6UL reference manual lists two possible interrupt lines for
SAI3 (56 and 57, offset +32). The current device tree entry uses
the first one (24), which prevents IRQs from being handled properly.
Use the second interrupt line (25), which does allow interrupts
to work as expected.
According to the board design, set SEL to high means flipped
connection (TX2/RX2). And the TCPM will output logical 1 if it needs
flipped connection. So switch to active high for select-gpios.
The EN pin on mux chip is low active, so switch to active low for
enable-gpios too.
Fixes: b237975b2cd5 ("arm64: dts: imx8qm-mek: add usb 3.0 and related type C nodes") Cc: stable@vger.kernel.org Reviewed-by: Jun Li <jun.li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Integrated amplifier LEAK Stereo 230 by IAG Limited has built-in
ESS9038Q2M DAC served by XMOS controller. It supports both DSD Native
and DSD-over-PCM (DoP) operational modes. But it doesn't work properly
by default and tries DSD-to-PCM conversion. USB quirks below allow it
to operate as designed.
Add DSD_RAW quirk flag for IAG Limited devices (vendor ID 0x2622)
Add DSD format quirk for LEAK Stereo 230 (USB ID 0x2622:0x0061)
When a VMA is split (e.g., by partial munmap or MAP_FIXED), the kernel
calls vm_ops->close on each portion. For trace buffer mappings, this
results in ring_buffer_unmap() being called multiple times while
ring_buffer_map() was only called once.
This causes ring_buffer_unmap() to return -ENODEV on subsequent calls
because user_mapped is already 0, triggering a WARN_ON.
Trace buffer mappings cannot support partial mappings because the ring
buffer structure requires the complete buffer including the meta page.
Fix this by adding a may_split callback that returns -EINVAL to prevent
VMA splits entirely.
Owing to Config4.MMUSizeExt and VTLB/FTLB MMU features later MIPSr2+
cores can have more than 64 TLB entries. Therefore allocate an array
for uniquification instead of placing too an small array on the stack.
Fixes: 35ad7e181541 ("MIPS: mm: tlb-r4k: Uniquify TLB entries on init") Co-developed-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: stable@vger.kernel.org # v6.17+: 9f048fa48740: MIPS: mm: Prevent a TLB shutdown on initial uniquification Cc: stable@vger.kernel.org # v6.17+ Tested-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Klara Modin <klarasmodin@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Depending on the particular CPU implementation a TLB shutdown may occur
if multiple matching entries are detected upon the execution of a TLBP
or the TLBWI/TLBWR instructions. Given that we don't know what entries
we have been handed we need to be very careful with the initial TLB
setup and avoid all these instructions.
Therefore read all the TLB entries one by one with the TLBR instruction,
bypassing the content addressing logic, and truncate any large pages in
place so as to avoid a case in the second step where an incoming entry
for a large page at a lower address overlaps with a replacement entry
chosen at another index. Then preinitialize the TLB using addresses
outside our usual unique range and avoiding clashes with any entries
received, before making the usual call to local_flush_tlb_all().
This fixes (at least) R4x00 cores if TLBP hits multiple matching TLB
entries (SGI IP22 PROM for examples sets up all TLBs to the same virtual
address).
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: 35ad7e181541 ("MIPS: mm: tlb-r4k: Uniquify TLB entries on init") Cc: stable@vger.kernel.org Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Tested-by: Jiaxun Yang <jiaxun.yang@flygoat.com> # Boston I6400, M5150 sim Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use correct argument to iio_str_to_fixpoint() to parse 3 decimal places.
iio_str_to_fixpoint() has a bit of an unintuitive API where the
fract_mult parameter is the multiplier of the first decimal place as if
it was already an integer. So to get 3 decimal places, fract_mult must
be 100 rather than 1000.
Fixes: 96ccdbc07a74 ("staging:iio:adc:ad7280a: Standardize extended ABI naming") Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Nuno Sá <nuno.sa@analog.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is an race-condition where device is not full working after SW reset.
Therefore it's necessary to wait some time after reset and verify shadow
registers values by reading and comparing the values before/after reset.
This mechanism is described in datasheet at least from revision D.
Fixes: 12ed27863ea3 ("iio: accel: Add driver support for ADXL355") Signed-off-by: Valek Andrej <andrej.v@skyrain.eu> Signed-off-by: Kessler Markus <markus.kessler@hilti.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The code in bmc150-accel-core.c unconditionally calls
bmc150_accel_set_interrupt() in the iio_buffer_setup_ops,
such as on the runtime PM resume path giving a kernel
splat like this if the device has no interrupts:
Unable to handle kernel NULL pointer dereference at virtual
address 00000001 when read
PC is at bmc150_accel_set_interrupt+0x98/0x194
LR is at __pm_runtime_resume+0x5c/0x64
(...)
Call trace:
bmc150_accel_set_interrupt from bmc150_accel_buffer_postenable+0x40/0x108
bmc150_accel_buffer_postenable from __iio_update_buffers+0xbe0/0xcbc
__iio_update_buffers from enable_store+0x84/0xc8
enable_store from kernfs_fop_write_iter+0x154/0x1b4
This bug seems to have been in the driver since the beginning,
but it only manifests recently, I do not know why.
Store the IRQ number in the state struct, as this is a common
pattern in other drivers, then use this to determine if we have
IRQ support or not.
Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Reviewed-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Initially st,adc-alt-channel property was defined as an enum in the DFSDM
binding. The DFSDM binding has been changed to use the new IIO backend
framework, along with the adoption of IIO generic channels.
In this new binding st,adc-alt-channel is defined as a boolean property,
but it is still handled has an enum in DFSDM driver.
Fix st,adc-alt-channel property handling in DFSDM driver.
If an error occurs after a successful mfd_add_devices() call, it should be
undone by a corresponding mfd_remove_devices() call, as already done in the
remove function.
The `decimator` and `batch` fields of struct st_lsm6dsx_settings
are arrays indexed by sensor type, not by sensor hardware
identifier; moreover, the `batch` field is only used for the
accelerometer and gyroscope.
Change the array size for `decimator` from ST_LSM6DSX_MAX_ID to
ST_LSM6DSX_ID_MAX, and change the array size for `batch` from
ST_LSM6DSX_MAX_ID to 2; move the enum st_lsm6dsx_sensor_id
definition so that the ST_LSM6DSX_ID_MAX value is usable within
the struct st_lsm6dsx_settings definition.
Fixes: 801a6e0af0c6c ("iio: imu: st_lsm6dsx: add support to LSM6DSO") Signed-off-by: Francesco Lavra <flavra@baylibre.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the ABI the units after application of scale and offset are
milli degree celsius for temperature thresholds and milli percent for
relative humidity thresholds. Currently the resulting units are degree
celsius for temperature thresholds and hysteresis and percent for relative
humidity thresholds and hysteresis. Change scale factor to fix this issue.
According to the ABI the units after application of scale and offset are
milli degrees for temperature measurements and milli percent for relative
humidity measurements. Currently the resulting units are degree celsius for
temperature measurements and percent for relative humidity measurements.
Change scale factor to fix this issue.
Fixes: c9180b8e39be ("iio: humidity: Add driver for ti HDC302x humidity sensors") Reported-by: Chris Lesiak <chris.lesiak@licorbio.com> Suggested-by: Chris Lesiak <chris.lesiak@licorbio.com> Reviewed-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Signed-off-by: Dimitri Fedrau <dimitri.fedrau@liebherr.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a new buffer accessor .get_dma_dev() in order to get the
struct device responsible for actually providing the dma channel. We
cannot assume that we can use the parent of the IIO device for mapping
the DMA buffer. This becomes important on systems (like the Xilinx/AMD
zynqMP Ultrascale) where memory (or part of it) is mapped above the
32 bit range. On such systems and given that a device by default has
a dma mask of 32 bits we would then need to rely on bounce buffers (to
swiotlb) for mapping memory above the dma mask limit.
In the process, add an iio_buffer_get_dma_dev() helper function to get
the proper DMA device.
Cc: stable@vger.kernel.org Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wire up the .get_dma_dev() callback to use the DMA buffer infrastructure's
implementation. This ensures that DMABUF operations use the correct DMA
device for mapping, which is essential for proper operation on systems
where memory is mapped above the 32-bit range.
Without this callback, the core would fall back to using the IIO device's
parent, which may not have the appropriate DMA mask configuration for
high memory access.
Fixes: 7a86d469983a ("iio: buffer-dmaengine: Support new DMABUF based userspace API") Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Implement the .get_dma_dev() callback for DMA buffers by returning the
device that owns the DMA channel. This allows the core DMABUF
infrastructure to properly map DMA buffers using the correct device,
avoiding the need for bounce buffers on systems where memory is mapped
above the 32-bit range.
The function returns the DMA queue's device, which is the actual device
responsible for DMA operations in buffer-dma implementations.
Cc: stable@vger.kernel.org Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently we store initial stacktrace entry twice for non-HW ot_regs, which
means callers that fail perf_hw_regs(regs) condition in perf_callchain_kernel.
When perf_callchain_kernel calls unwind_start with first_frame, AFAICS
we do not skip regs->ip, but it's added as part of the unwind process.
Hence reverting the extra perf_callchain_store for non-hw regs leg.
I was not able to bisect this, so I'm not really sure why this was needed
in v5.2 and why it's not working anymore, but I could see double entries
as far as v5.10.
I did the test for both ORC and framepointer unwind with and without the
this fix and except for the initial entry the stacktraces are the same.
Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This results in a blank screen on the HDMI port on some systems.
Revert for now so as not to regress 6.18, can be addressed
in 6.19 once the issue is root caused.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4652 Cc: Sunpeng.Li@amd.com Cc: ivan.lipski@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d0e9de7a81503cdde37fb2d37f1d102f9e0f38fb) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On BCM6358 (and also observed on BCM6368) the controller appears to
only generate as many SPI clocks as bytes that have been written into
the TX FIFO. For RX-only transfers the driver programs the transfer
length in SPI_MSG_CTL but does not write anything into the FIFO, so
chip select is deasserted early and the RX transfer segment is never
fully clocked in.
A concrete failing case is a three-transfer MAC address read from
SPI-NOR:
- TX 0x03 (read command)
- TX 3-byte address
- RX 6 bytes (MAC)
In contrast, a two-transfer JEDEC-ID read (0x9f + 6-byte RX) works
because the driver uses prepend_len and writes dummy bytes into the
TX FIFO for the RX part.
Fix this by writing 0xff dummy bytes into the TX FIFO for RX-only
segments so that the number of bytes written to the FIFO matches the
total message length seen by the controller.
Propagate fwnode of the ACPI device to the SPI controller Linux device.
Currently only OF case propagates fwnode to the controller.
While at it, replace several calls to dev_fwnode() with a single one
cached in a local variable, and unify checks for fwnode type by using
is_*_node() APIs.
Fixes: 55ab8487e01d ("spi: spi-nxp-fspi: Add ACPI support") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Haibo Chen <haibo.chen@nxp.com> Link: https://patch.msgid.link/20251126202501.2319679-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add OCT-DTR mode support in default, since flexspi do not supports
swapping bytes on a 16 bit boundary in OCT-DTR mode, so mark swap16
as false.
lx2160a do not support DQS, so add a quirk to disable DTR mode for this
platform.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20250917-flexspi-ddr-v2-5-bb9fe2a01889@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 40ad64ac25bb ("spi: nxp-fspi: Propagate fwnode in ACPI case as well") Signed-off-by: Sasha Levin <sashal@kernel.org>
Every ->exec_op() call correctly configures the spi bus speed to the
maximum allowed frequency for the memory using the constant spi default
parameter. Since we can now have per-operation constraints, let's use
the value that comes from the spi-mem operation structure instead. In
case there is no specific limitation for this operation, the default spi
device value will be given anyway.
The per-operation frequency capability is thus advertised to the spi-mem
core.
There are spi devices with multiple frequency limitations depending on
the invoked command. We probably do not want to afford running at the
lowest supported frequency all the time, so if we want to get the most
of our hardware, we need to allow per-operation frequency limitations.
Among all the SPI memory controllers, I believe all are capable of
changing the spi frequency on the fly. Some of the drivers do not make
any frequency setup though. And some others will derive a per chip
prescaler value which will be used forever.
Actually changing the frequency on the fly is something new in Linux, so
we need to carefully flag the drivers which do and do not support it. A
controller capability is created for that, and the presence for this
capability will always be checked before accepting such pattern.
In the spi subsystem, the bus frequency is derived as follows:
- the controller may expose a minimum and maximum operating frequency
- the hardware description, through the spi peripheral properties,
advise what is the maximum acceptable frequency from a device/wiring
point of view.
Transfers must be observed at a frequency which fits both (so in
practice, the lowest maximum).
Actually, this second point mixes two information and already takes the
lowest frequency among:
- what the spi device is capable of (what is written in the component
datasheet)
- what the wiring allows (electromagnetic sensibility, crossovers,
terminations, antenna effect, etc).
This logic works until spi devices are no longer capable of sustaining
their highest frequency regardless of the operation. Spi memories are
typically subject to such variation. Some devices are capable of
spitting their internally stored data (essentially in read mode) at a
very fast rate, typically up to 166MHz on Winbond SPI-NAND chips, using
"fast" commands. However, some of the low-end operations, such as
regular page read-from-cache commands, are more limited and can only be
executed at 54MHz at most. This is currently a problem in the SPI-NAND
subsystem. Another situation, even if not yet supported, will be with
DTR commands, when the data is latched on both edges of the clock. The
same chips as mentioned previously are in this case limited to
80MHz. Yet another example might be continuous reads, which, under
certain circumstances, can also run at most at 104 or 120MHz.
As a matter of fact, the "one frequency per chip" policy is outdated and
more fine grain configuration is needed: we need to allow per-operation
frequency limitations. So far, all datasheets I encountered advertise a
maximum default frequency, which need to be lowered for certain specific
operations. So based on the current infrastructure, we can still expect
firmware (device trees in general) to continued advertising the same
maximum speed which is a mix between the PCB limitations and the chip
maximum capability, and expect per-operation lower frequencies when this
is relevant.
Add a `struct spi_mem_op` member to carry this information. Not
providing this field explicitly from upper layers means that there is no
further constraint and the default spi device maximum speed will be
carried instead. The SPI_MEM_OP() macro is also expanded with an
optional frequency argument, because virtually all operations can be
subject to such a limitation, and this will allow for a smooth and
discrete transition.
For controller drivers which do not implement the spi-mem interface, the
per-transfer speed is also set acordingly to a lower (than the maximum
default) speed when relevant.
There are NOR flashes (Macronix) that swap the bytes on a 16-bit
boundary when configured in Octal DTR mode. The byte order of
16-bit words is swapped when read or written in Octal Double
Transfer Rate (DTR) mode compared to Single Transfer Rate (STR)
modes. If one writes D0 D1 D2 D3 bytes using 1-1-1 mode, and uses
8D-8D-8D SPI mode for reading, it will read back D1 D0 D3 D2.
Swapping the bytes may introduce some endianness problems. It can
affect the boot sequence if the entire boot sequence is not handled
in either 8D-8D-8D mode or 1-1-1 mode. Therefore, it is necessary
to swap the bytes back to ensure the same byte order as in STR modes.
Fortunately there are controllers that could swap the bytes back at
runtime, addressing the flash's endianness requirements. Provide a
way for the upper layers to specify the byte order in Octal DTR mode.
Merge Tudor's patch and add modifications for suiting newer version
of Linux kernel.
Suggested-by: Michael Walle <mwalle@kernel.org> Signed-off-by: JaimeLiao <jaimeliao@mxic.com.tw> Signed-off-by: AlvinZhou <alvinzhou@mxic.com.tw> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240926141956.2386374-3-alvinzhou.tw@gmail.com Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Stable-dep-of: 40ad64ac25bb ("spi: nxp-fspi: Propagate fwnode in ACPI case as well") Signed-off-by: Sasha Levin <sashal@kernel.org>
devm_pm_runtime_enable() can fail due to memory allocation. The current
code ignores its return value, potentially causing runtime PM operations
to fail silently after autosuspend configuration.
Check the return value of devm_pm_runtime_enable() and return on failure.
Fixes: 909fac05b926 ("spi: add support for Amlogic A1 SPI Flash Controller") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Link: https://patch.msgid.link/20251124015852.937-1-vulab@iscas.ac.cn Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
This driver runs also on Tegra SoCs without a Tegra20 APB DMA controller
(e.g. Tegra234).
Remove the Kconfig dependency on TEGRA20_APB_DMA; in addition, amend the
help text to reflect the fact that this driver works on SoCs different from
Tegra114.
Fixes: bb9667d8187b ("arm64: tegra: Add SPI device tree nodes for Tegra234") Signed-off-by: Francesco Lavra <flavra@baylibre.com> Link: https://patch.msgid.link/20251126095027.4102004-1-flavra@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The error status mask for a type 3/4 subspace is used for reading the
error status, and the bitwise inverse is used for clearing the error
with the intent being to preserve any of the non-error bits. However,
we were previously applying the mask to extract the status and then
applying the inverse to the result which ended up clearing all bits.
Instead, store the inverse mask in the preserve mask and then use that
on the original value read from the error status so that only the error
is cleared.
The existing error handling logic in pcc_mbox_irq() is intermixed with the
main flow of the function. The command complete check and the complete
complete update/acknowledgment are nicely factored into separate functions.
Moves error detection and clearing logic into a separate function called:
pcc_mbox_error_check_and_clear() by extracting error-handling logic from
pcc_mbox_irq().
This ensures error checking and clearing are handled separately and it
improves maintainability by keeping the IRQ handler focused on processing
events.
Acked-by: Huisong Li <lihuisong@huawei.com> Tested-by: Huisong Li <lihuisong@huawei.com> Tested-by: Adam Young <admiyo@os.amperecomputing.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Stable-dep-of: ff0e4d4c97c9 ("mailbox: pcc: don't zero error register") Signed-off-by: Sasha Levin <sashal@kernel.org>
GCE can only fetch the command buffer address from a 32-bit register.
Some SoCs support a 35-bit command buffer address for GCE, which
requires a right shift of 3 bits before setting the address into
the 32-bit register. A comment has been added to the header of
cmdq_get_shift_pa() to explain this requirement.
To prevent the GCE command buffer address from being DMA mapped beyond
its supported bit range, the DMA bit mask for the device is set during
initialization.
Additionally, to ensure the correct shift is applied when setting or
reading the register that stores the GCE command buffer address,
new APIs, cmdq_convert_gce_addr() and cmdq_revert_gce_addr(), have
been introduced for consistent operations on this register.
The variable type for the command buffer address has been standardized
to dma_addr_t to prevent handling issues caused by type mismatches.
Fixes: 0858fde496f8 ("mailbox: cmdq: variablize address shift in platform") Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
devm_pm_runtime_enable() can fail due to memory allocation.
The current code ignores its return value, potentially causing
pm_runtime_resume_and_get() to operate on uninitialized runtime
PM state.
Check the return value of devm_pm_runtime_enable() and return on failure.
The calibrated timestamp is calculated from the nominal value using the
formula:
ts_gain[ns] ≈ ts_sensitivity - (ts_trim_coeff * val) / 1000.
The values of ts_sensitivity and ts_trim_coeff are not the same for all
devices, so it is necessary to differentiate them based on the part name.
For the correct values please consult the relevant AN.
Fixes: cb3b6b8e1bc0 ("iio: imu: st_lsm6dsx: add odr calibration feature") Signed-off-by: Mario Tesi <mario.tesi@st.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
There are currently two situations that can trigger the PTP interrupt,
one is the PPS event, the other is the PEROUT event. However, the irq
handler fec_pps_interrupt() does not check the irq event type and
directly registers a PPS event into the system, but the event may be
a PEROUT event. This is incorrect because PEROUT is an output signal,
while PPS is the input of the kernel PPS system. Therefore, add a check
for the event type, if pps_enable is true, it means that the current
event is a PPS event, and then the PPS event is registered.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-5-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the current driver, PPS and PEROUT use the same channel to generate
the events, so they cannot be enabled at the same time. Otherwise, the
later configuration will overwrite the earlier configuration. Therefore,
when configuring PPS, the driver will check whether PEROUT is enabled.
Similarly, when configuring PEROUT, the driver will check whether PPS
is enabled.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-4-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the previously set PEROUT is already active, updating it will cause
the new PEROUT to start immediately instead of at the specified time.
This is because fep->reload_period is updated whithout check whether
the PEROUT is enabled, and the old PEROUT is not disabled. Therefore,
the pulse period will be updated immediately in the pulse interrupt
handler fec_pps_interrupt().
Currently, the driver does not support directly updating PEROUT and it
will make the logic be more complicated. To fix the current issue, add
a check before enabling the PEROUT, the driver will return an error if
PEROUT is enabled. If users wants to update a new PEROUT, they should
disable the old PEROUT first.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-3-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The PEROUT allows the user to set a specified future time to output the
periodic signal. If the future time is far from the current time, the FEC
driver will use hrtimer to configure PEROUT one second before the future
time. However, the hrtimer will not be canceled if the PEROUT is disabled
before the hrtimer expires. So the PEROUT will be configured when the
hrtimer expires, which is not as expected. Therefore, cancel the hrtimer
in fec_ptp_pps_disable() to fix this issue.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-2-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The atlantic driver can receive packets with more than MAX_SKB_FRAGS (17)
fragments when handling large multi-descriptor packets. This causes an
out-of-bounds write in skb_add_rx_frag_netmem() leading to kernel panic.
The issue occurs because the driver doesn't check the total number of
fragments before calling skb_add_rx_frag(). When a packet requires more
than MAX_SKB_FRAGS fragments, the fragment index exceeds the array bounds.
Fix by assuming there will be an extra frag if buff->len > AQ_CFG_RX_HDR_SIZE,
then all fragments are accounted for. And reusing the existing check to
prevent the overflow earlier in the code path.
This crash occurred in production with an Aquantia AQC113 10G NIC.
Fix a potential counter roll-over issue in fbnic_mbx_alloc_rx_msgs()
when calculating descriptor slots. The issue occurs when head - tail
results in a large positive value (unsigned) and the compiler interprets
head - tail - 1 as a signed value.
Since FBNIC_IPC_MBX_DESC_LEN is a power of two, use a masking operation,
which is a common way of avoiding this problem when dealing with these
sort of ring space calculations.
When using the SGMII PCS as a fixed-link chip-to-chip connection, it is
easy to miss the fact that traffic passes only at 1G, since that's what
any normal such connection would use.
When using the SGMII PCS connected towards an on-board PHY or an SFP
module, it is immediately noticeable that when the link resolves to a
speed other than 1G, traffic from the MAC fails to pass: TX counters
increase, but nothing gets decoded by the other end, and no local RX
counters increase either.
Artificially lowering a fixed-link rate to speed = <100> makes us able
to see the same issue as in the case of having an SGMII PHY.
Some debugging shows that the XPCS configuration is A-OK, but that the
MAC Configuration Table entry for the port has the SPEED bits still set
to 1000Mbps, due to a special condition in the driver. Deleting that
condition, and letting the resolved link speed be programmed directly
into the MAC speed field, results in a functional link at all 3 speeds.
This piece of evidence, based on testing on both generations with SGMII
support (SJA1105S and SJA1110A) directly contradicts the statement from
the blamed commit that "the MAC is fixed at 1 Gbps and we need to
configure the PCS only (if even that)". Worse, that statement is not
backed by any documentation, and no one from NXP knows what it might
refer to.
I am unable to recall sufficient context regarding my testing from March
2020 to understand what led me to draw such a braindead and factually
incorrect conclusion. Yet, there is nothing of value regarding forcing
the MAC speed, either for SGMII or 2500Base-X (introduced at a later
stage), so remove all such logic.
Fixes: ffe10e679cec ("net: dsa: sja1105: Add support for the SGMII port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251122111324.136761-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The static configuration reload saves the port speed in the static
configuration tables by first converting it from the internal
respresentation to the SPEED_xxx ethtool representation, and then
converts it back to restore the setting. This is because
sja1105_adjust_port_config() takes the speed as SPEED_xxx.
However, this is unnecessarily complex. If we split
sja1105_adjust_port_config() up, we can simply save and restore the
mac[port].speed member in the static configuration tables.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1svfMa-005ZIX-If@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: da62abaaa268 ("net: dsa: sja1105: fix SGMII linking at 10M or 100M but not passing traffic") Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, when skb is null, the driver prints an error and then
dereferences skb on the next line.
To fix this, let's add a 'break' after the error message to switch
to sxgbe_rx_refill(), which is similar to the approach taken by the
other drivers in this particular case, e.g. calxeda with xgmac_rx().
Found during a code review.
Fixes: 1edb9ca69e8a ("net: sxgbe: add basic framework for Samsung 10Gb ethernet driver") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251121123834.97748-1-aleksei.kodanev@bell-sw.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Attempting to add a port device that is already up will expectedly fail,
but not before modifying the team device header_ops.
In the case of the syzbot reproducer the gre0 device is
already in state UP when it attempts to add it as a
port device of team0, this fails but before that
header_ops->create of team0 is changed from eth_header to ipgre_header
in the call to team_dev_type_check_change.
Later when we end up in ipgre_header() struct ip_tunnel* points to nonsense
as the private data of the device still holds a struct team.
Example sequence of iproute2 commands to reproduce the hang/BUG():
ip link add dev team0 type team
ip link add dev gre0 type gre
ip link set dev gre0 up
ip link set dev gre0 master team0
ip link set dev team0 up
ping -I team0 1.1.1.1
Move team_dev_type_check_change down where all other checks have passed
as it changes the dev type with no way to restore it in case
one of the checks that follow it fail.
Also make sure to preserve the origial mtu assignment:
- If port_dev is not the same type as dev, dev takes mtu from port_dev
- If port_dev is the same type as dev, port_dev takes mtu from dev
This is done by adding a conditional before the call to dev_set_mtu
to prevent it from assigning port_dev->mtu = dev->mtu and instead
letting team_dev_type_check_change assign dev->mtu = port_dev->mtu.
The conditional is needed because the patch moves the call to
team_dev_type_check_change past dev_set_mtu.
Testing:
- team device driver in-tree selftests
- Add/remove various devices as slaves of team device
- syzbot
Reported-by: syzbot+a2a3b519de727b0f7903@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a2a3b519de727b0f7903 Fixes: 1d76efe1577b ("team: add support for non-ethernet devices") Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20251122002027.695151-1-zlatistiv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The rate limiting validation condition currently checks the output
variable max_bw_value[i] instead of the input value
maxrate->tc_maxrate[i]. This causes the validation to compare an
uninitialized or stale value rather than the actual requested rate.
The condition should check the input rate to properly validate against
the upper limit:
} else if (maxrate->tc_maxrate[i] <= upper_limit_gbps) {
This aligns with the pattern used in the first branch, which correctly
checks maxrate->tc_maxrate[i] against upper_limit_mbps.
The current implementation can lead to unreliable validation behavior:
- For rates between 25.5 Gbps and 255 Gbps, if max_bw_value[i] is 0
from initialization, the GBPS path may be taken regardless of whether
the actual rate is within bounds
- When processing multiple TCs (i > 0), max_bw_value[i] contains the
value computed for the previous TC, affecting the validation logic
- The overflow check for rates exceeding 255 Gbps may not trigger
consistently depending on previous array values
This patch ensures the validation correctly examines the requested rate
value for proper bounds checking.
Fixes: 43b27d1bd88a ("net/mlx5e: Fix wraparound in rate limiting for values above 255 Gbps") Signed-off-by: Danielle Costantino <dcostantino@meta.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20251124180043.2314428-1-dcostantino@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
To initialize the taprio block in lan966x, it is required to configure
the register REVISIT_DLY. The purpose of this register is to set the
delay before revisit the next gate and the value of this register depends
on the system clock. The problem is that the we calculated wrong the value
of the system clock period in picoseconds. The actual system clock is
~165.617754MHZ and this correspond to a period of 6038 pico seconds and
not 15125 as currently set.
Fixes: e462b2717380b4 ("net: lan966x: Add offload support for taprio") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251121061411.810571-1-horatiu.vultur@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 7a1bb49461b1 ("net: aquantia: fix potential IOMMU fault after
driver unbind") and commit ed4d81c4b3f2 ("net: aquantia: when cleaning
hw cache it should be toggled") fixed cache invalidation for ATL B0, but
ATL2 was left with only interrupt disabling. This allowed hardware to
write to cached descriptors after DMA memory was unmapped, triggering
SMMU faults. Once cache invalidation is applied to ATL2, the translation
fault can't be observed anymore.
Add shared aq_hw_invalidate_descriptor_cache() helper and use it in both
ATL B0 and ATL2 hw_stop() implementations for consistent behavior.
Fixes: e54dcf4bba3e ("net: atlantic: basic A2 init/deinit hw_ops") Tested-by: Carol Soto <csoto@nvidia.com> Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251120041537.62184-1-kaihengf@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This passes the address of the pointer "&punit_ipcdev" when the intent
was to pass the pointer itself "punit_ipcdev" (without the ampersand).
This means that the:
complete(&ipcdev->cmd_complete);
in intel_punit_ioc() will write to a wrong memory address corrupting it.
Fixes: fdca4f16f57d ("platform:x86: add Intel P-Unit mailbox IPC driver") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/aSCmoBipSQ_tlD-D@stanley.mountain Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As the interface mode doesn't need to be updated on PHYs connected with
USXGMII and integrated PHYs, gpy_update_interface() should just return 0
in these cases rather than -EINVAL which has wrongly been introduced by
commit 7a495dde27ebc ("net: phy: mxl-gpy: Change gpy_update_interface()
function return type"), as this breaks support for those PHYs.
Fixes: 7a495dde27ebc ("net: phy: mxl-gpy: Change gpy_update_interface() function return type") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/f744f721a1fcc5e2e936428c62ff2c7d94d2a293.1763648168.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
As explain in commit fa349e396e48 ("veth: Fix race with AF_XDP exposing
old or uninitialized descriptors") for veth there is a chance after
napi_complete_done() that another CPU can manage start another NAPI
instance running veth_pool(). For NAPI this is correctly handled as the
napi_schedule_prep() check will prevent multiple instances from getting
scheduled, but for the remaining code in veth_pool() this can run
concurrent with the newly started NAPI instance.
The problem/race is that xdp_clear_return_frame_no_direct() isn't
designed to be nested.
Prior to commit 401cb7dae813 ("net: Reference bpf_redirect_info via
task_struct on PREEMPT_RT.") the temporary BPF net context
bpf_redirect_info was stored per CPU, where this wasn't an issue. Since
this commit the BPF context is stored in 'current' task_struct. When
running veth in threaded-NAPI mode, then the kthread becomes the storage
area. Now a race exists between two concurrent veth_pool() function calls
one exiting NAPI and one running new NAPI, both using the same BPF net
context.
Race is when another CPU gets within the xdp_set_return_frame_no_direct()
section before exiting veth_pool() calls the clear-function
xdp_clear_return_frame_no_direct().
Fixes: 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/176356963888.337072.4805242001928705046.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to
reduce TX drops") introduced a race condition that can lead to a permanently
stalled TXQ. This was observed in production on ARM64 systems (Ampere Altra
Max).
The race occurs in veth_xmit(). The producer observes a full ptr_ring and
stops the queue (netif_tx_stop_queue()). The subsequent conditional logic,
intended to re-wake the queue if the consumer had just emptied it (if
(__ptr_ring_empty(...)) netif_tx_wake_queue()), can fail. This leads to a
"lost wakeup" where the TXQ remains stopped (QUEUE_STATE_DRV_XOFF) and
traffic halts.
This failure is caused by an incorrect use of the __ptr_ring_empty() API
from the producer side. As noted in kernel comments, this check is not
guaranteed to be correct if a consumer is operating on another CPU. The
empty test is based on ptr_ring->consumer_head, making it reliable only for
the consumer. Using this check from the producer side is fundamentally racy.
This patch fixes the race by adopting the more robust logic from an earlier
version V4 of the patchset, which always flushed the peer:
(1) In veth_xmit(), the racy conditional wake-up logic and its memory barrier
are removed. Instead, after stopping the queue, we unconditionally call
__veth_xdp_flush(rq). This guarantees that the NAPI consumer is scheduled,
making it solely responsible for re-waking the TXQ.
This handles the race where veth_poll() consumes all packets and completes
NAPI *before* veth_xmit() on the producer side has called netif_tx_stop_queue.
The __veth_xdp_flush(rq) will observe rx_notify_masked is false and schedule
NAPI.
(2) On the consumer side, the logic for waking the peer TXQ is moved out of
veth_xdp_rcv() and placed at the end of the veth_poll() function. This
placement is part of fixing the race, as the netif_tx_queue_stopped() check
must occur after rx_notify_masked is potentially set to false during NAPI
completion.
This handles the race where veth_poll() consumes all packets, but haven't
finished (rx_notify_masked is still true). The producer veth_xmit() stops the
TXQ and __veth_xdp_flush(rq) will observe rx_notify_masked is true, meaning
not starting NAPI. Then veth_poll() change rx_notify_masked to false and
stops NAPI. Before exiting veth_poll() will observe TXQ is stopped and wake
it up.
Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/176295323282.307447.14790015927673763094.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: a14602fcae17 ("veth: reduce XDP no_direct return section to fix race") Signed-off-by: Sasha Levin <sashal@kernel.org>
The veth peer device is RCU protected, but when the peer device gets
deleted (veth_dellink) then the pointer is assigned NULL (via
RCU_INIT_POINTER).
This patch adds a necessary NULL check in veth_xdp_rcv when accessing
the veth peer net_device.
This fixes a bug introduced in commit dc82a33297fc ("veth: apply qdisc
backpressure on full ptr_ring to reduce TX drops"). The bug is a race
and only triggers when having inflight packets on a veth that is being
deleted.
In production, we're seeing TX drops on veth devices when the ptr_ring
fills up. This can occur when NAPI mode is enabled, though it's
relatively rare. However, with threaded NAPI - which we use in
production - the drops become significantly more frequent.
The underlying issue is that with threaded NAPI, the consumer often runs
on a different CPU than the producer. This increases the likelihood of
the ring filling up before the consumer gets scheduled, especially under
load, leading to drops in veth_xmit() (ndo_start_xmit()).
This patch introduces backpressure by returning NETDEV_TX_BUSY when the
ring is full, signaling the qdisc layer to requeue the packet. The txq
(netdev queue) is stopped in this condition and restarted once
veth_poll() drains entries from the ring, ensuring coordination between
NAPI and qdisc.
Backpressure is only enabled when a qdisc is attached. Without a qdisc,
the driver retains its original behavior - dropping packets immediately
when the ring is full. This avoids unexpected behavior changes in setups
without a configured qdisc.
With a qdisc in place (e.g. fq, sfq) this allows Active Queue Management
(AQM) to fairly schedule packets across flows and reduce collateral
damage from elephant flows.
A known limitation of this approach is that the full ring sits in front
of the qdisc layer, effectively forming a FIFO buffer that introduces
base latency. While AQM still improves fairness and mitigates flow
dominance, the latency impact is measurable.
In hardware drivers, this issue is typically addressed using BQL (Byte
Queue Limits), which tracks in-flight bytes needed based on physical link
rate. However, for virtual drivers like veth, there is no fixed bandwidth
constraint - the bottleneck is CPU availability and the scheduler's ability
to run the NAPI thread. It is unclear how effective BQL would be in this
context.
This patch serves as a first step toward addressing TX drops. Future work
may explore adapting a BQL-like mechanism to better suit virtual devices
like veth.
The "noqueue" qdisc can either be directly attached, or get default
attached if net_device priv_flags has IFF_NO_QUEUE. In both cases, the
allocated Qdisc structure gets it's enqueue function pointer reset to
NULL by noqueue_init() via noqueue_qdisc_ops.
This is a common case for software virtual net_devices. For these devices
with no-queue, the transmission path in __dev_queue_xmit() will bypass
the qdisc layer. Directly invoking device drivers ndo_start_xmit (via
dev_hard_start_xmit). In this mode the device driver is not allowed to
ask for packets to be queued (either via returning NETDEV_TX_BUSY or
stopping the TXQ).
The simplest and most reliable way to identify this no-queue case is by
checking if enqueue == NULL.
The vrf driver currently open-codes this check (!qdisc->enqueue). While
functionally correct, this low-level detail is better encapsulated in a
dedicated helper for clarity and long-term maintainability.
To make this behavior more explicit and reusable, this patch introduce a
new helper: qdisc_txq_has_no_queue(). Helper will also be used by the
veth driver in the next patch, which introduces optional qdisc-based
backpressure.
The change eed467b517e8 ("Bluetooth: fix passkey uninitialized when used")
introduced a goto that bypasses the creation of temporary mackey and ltk
which are later used by the likes of DHKey Check step.
Later ffee202a78c2 ("Bluetooth: Always request for user confirmation for
Just Works (LE SC)") which means confirm_hint is always set in case
JUST_WORKS so the branch checking for an existing LTK becomes pointless
as confirm_hint will always be set, so this just merge both cases of
malicious or legitimate devices to be confirmed before continuing with the
pairing procedure.
Link: https://github.com/bluez/bluez/issues/1622 Fixes: eed467b517e8 ("Bluetooth: fix passkey uninitialized when used") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
There is a potential race condition between sock bind and socket write
iter. bind may free the same cmd via mgmt_pending before write iter sends
the cmd, just as syzbot reported in UAF[1].
Here we use hci_dev_lock to synchronize the two, thereby avoiding the
UAF mentioned in [1].
[1]
syzbot reported:
BUG: KASAN: slab-use-after-free in mgmt_pending_remove+0x3b/0x210 net/bluetooth/mgmt_util.c:316
Read of size 8 at addr ffff888077164818 by task syz.0.17/5989
Call Trace:
mgmt_pending_remove+0x3b/0x210 net/bluetooth/mgmt_util.c:316
set_link_security+0x5c2/0x710 net/bluetooth/mgmt.c:1918
hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719
hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg+0x21c/0x270 net/socket.c:742
sock_write_iter+0x279/0x360 net/socket.c:1195
Fixes: 6fe26f694c82 ("Bluetooth: MGMT: Protect mgmt_pending list with its own lock") Reported-by: syzbot+9aa47cd4633a3cf92a80@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9aa47cd4633a3cf92a80 Tested-by: syzbot+9aa47cd4633a3cf92a80@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
HCI_OP_NOP means no command was actually sent so there is no point in
triggering cmd_timer which may cause a hdev->reset in the process since
it is assumed that the controller is stuck processing a command.
Fixes: e2d471b7806b ("Bluetooth: ISO: Fix not using SID from adv report") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When performing reset tests and encountering abnormal card drop issues
that lead to a kernel crash, it is necessary to perform a null check
before releasing resources to avoid attempting to release a null pointer.
Fixes: ceac1cb0259d ("Bluetooth: btusb: mediatek: add ISO data transmission functions") Signed-off-by: Chris Lu <chris.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The URB received in gs_usb_receive_bulk_callback() contains a struct
gs_host_frame. The length of the data after the header depends on the
gs_host_frame hf::flags and the active device features (e.g. time
stamping).
Introduce a new function gs_usb_get_minimum_length() and check that we have
at least received the required amount of data before accessing it. Only
copy the data to that skb that has actually been received.
The driver expects to receive a struct gs_host_frame in
gs_usb_receive_bulk_callback().
Use struct_group to describe the header of the struct gs_host_frame and
check that we have at least received the header before accessing any
members of it.
To resubmit the URB, do not dereference the pointer chain
"dev->parent->hf_size_rx" but use "parent->hf_size_rx" instead. Since
"urb->context" contains "parent", it is always defined, while "dev" is not
defined if the URB it too short.
The driver lacks the cleanup of failed transfers of URBs. This reduces the
number of available URBs per error by 1. This leads to reduced performance
and ultimately to a complete stop of the transmission.
If the sending of a bulk URB fails do proper cleanup:
- increase netdev stats
- mark the echo_sbk as free
- free the driver's context and do accounting
- wake the send queue
The `kvaser_usb_leaf_wait_cmd()` and `kvaser_usb_leaf_read_bulk_callback`
functions contain logic to zero-length commands. These commands are used
to align data to the USB endpoint's wMaxPacketSize boundary.
The driver attempts to skip these placeholders by aligning the buffer
position `pos` to the next packet boundary using `round_up()` function.
However, if zero-length command is found exactly on a packet boundary
(i.e., `pos` is a multiple of wMaxPacketSize, including 0), `round_up`
function will return the unchanged value of `pos`. This prevents `pos`
to be increased, causing an infinite loop in the parsing logic.
This patch fixes this in the function by using `pos + 1` instead.
This ensures that even if `pos` is on a boundary, the calculation is
based on `pos + 1`, forcing `round_up()` to always return the next
aligned boundary.
Fixes: 7259124eac7d ("can: kvaser_usb: Split driver into kvaser_usb_core.c and kvaser_usb_leaf.c") Signed-off-by: Seungjin Bae <eeodqql09@gmail.com> Reviewed-by: Jimmy Assarsson <extja@kvaser.com> Tested-by: Jimmy Assarsson <extja@kvaser.com> Link: https://patch.msgid.link/20251023162709.348240-1-eeodqql09@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
This software node change doesn't actually fix any current issues
with the kernel, it is an improvement to the lookup process rather
than fixing a live bug. It also causes a couple of regressions with
shipping laptops, which relied on the label based lookup.
There is a fix for the regressions in mainline, the first 5 patches
of [1]. However, those patches are fairly substantial changes and
given the patch causing the regression doesn't actually fix a bug
it seems better to just revert it in stable.
[why]
1. With allow_0_dtb_clk enabled, the time required to latch DTBCLK to 600 MHz
depends on the SMU. If DTBCLK is not latched to 600 MHz before set_mode completes,
gating DTBCLK causes the DP2 sink to lose its clock source.
2. The existing DTBCLK gating sequence ungates DTBCLK based on both pix_clk and ref_dtbclk,
but gates DTBCLK when either pix_clk or ref_dtbclk is zero.
pix_clk can be zero outside the set_mode sequence before DTBCLK is properly latched,
which can lead to DTBCLK being gated by mistake.
[how]
Consider both pixel_clk and ref_dtbclk when determining when it is safe to gate DTBCLK;
this is more accurate.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4701 Fixes: 5949e7c4890c ("drm/amd/display: Enable Dynamic DTBCLK Switch") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d04eb0c402780ca037b62a6aecf23b863545ebca) Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>