In our user-safe ino resolve ioctl we turn any error returned by
inode_permission() into -EACCES. This is redundant, and could potentially be
wrong if we had, say, an ENOMEM from the security layer or some other such
error, so simply return the actual return value.
Note: The patch was taken from v5 of fscrypt patchset
(https://lore.kernel.org/linux-btrfs/cover.1706116485.git.josef@toxicpanda.com/)
which was handled over time by various people: Omar Sandoval, Sweet Tea
Dorminy, Josef Bacik.
Fixes: 23d0b79dfaed ("btrfs: Add unprivileged version of ino_lookup ioctl") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Daniel Vacek <neelx@suse.com> Reviewed-by: David Sterba <dsterba@suse.com>
[ add note ] Signed-off-by: David Sterba <dsterba@suse.com>
[ Adjust context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tpm2_get_pcr_allocation() does not enforce any upper limit on the number of
banks. Cap the limit at eight banks so that out-of-bounds values coming
from external I/O cause only limited harm.
Cc: stable@vger.kernel.org # v5.10+ Fixes: bcfff8384f6c ("tpm: dynamically allocate the allocated_banks array") Tested-by: Lai Yi <yi1.lai@linux.intel.com> Reviewed-by: Jonathan McDowell <noodles@meta.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@opinsys.com>
[ added backward-compatible define for TPM_MAX_DIGEST_SIZE to support older ima_init.c code still using that macro name ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Copying the file system while it is mounted as read-only results in
a mount failure:
[~]# mkfs.ext4 -F /dev/sdc
[~]# mount /dev/sdc -o ro /mnt/test
[~]# dd if=/dev/sdc of=/dev/sda bs=1M
[~]# mount /dev/sda /mnt/test1
[ 1094.849826] JBD2: journal checksum error
[ 1094.850927] EXT4-fs (sda): Could not load journal inode
mount: mount /dev/sda on /mnt/test1 failed: Bad message
The process described above is just an abstracted way I came up with to
reproduce the issue. In the actual scenario, the file system was mounted
read-only and then copied while it was still mounted. It was found that
the mount operation failed. The user intended to verify the data or use
it as a backup, and this action was performed during a version upgrade.
The above issue may happen as follows:

ext4_fill_super
  set_journal_csum_feature_set(sb)
    if (ext4_has_metadata_csum(sb))
      incompat = JBD2_FEATURE_INCOMPAT_CSUM_V3;
    if (test_opt(sb, JOURNAL_CHECKSUM))
      jbd2_journal_set_features(sbi->s_journal, compat, 0, incompat);
        lock_buffer(journal->j_sb_buffer);
        sb->s_feature_incompat |= cpu_to_be32(incompat);
        // The data in the journal sb was modified, but the checksum was
        // not updated, so the data remaining in memory has a mismatch
        // between the data and the checksum.
        unlock_buffer(journal->j_sb_buffer);
In this case, the journal sb copied over is in a state where the checksum
and data are inconsistent, so mounting fails.
To solve the above issue, update the checksum in memory after modifying
the journal sb.
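A minimal sketch of the idea, reusing the helper referenced in the backport
note below; the exact call site inside jbd2_journal_set_features() may differ:

  lock_buffer(journal->j_sb_buffer);
  sb->s_feature_incompat |= cpu_to_be32(incompat);
  /* keep the in-memory checksum consistent with the modified feature bits */
  sb->s_checksum = jbd2_superblock_csum(journal, sb);
  unlock_buffer(journal->j_sb_buffer);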
Fixes: 4fd5ea43bc11 ("jbd2: checksum journal superblock") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20251103010123.3753631-1-yebin@huaweicloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
[ Changed jbd2_superblock_csum(sb) to jbd2_superblock_csum(journal, sb) ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When DbC is disconnected, xhci_dbc_tty_unregister_device() is called.
However, if any user space process is blocked on a write to the DbC
terminal device, it will never be signalled and thus stays blocked
indefinitely.
This fix adds a tty_vhangup() call in xhci_dbc_tty_unregister_device().
The tty_vhangup() wakes up any blocked writers and causes subsequent
write attempts to DbC terminal device to fail.
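A minimal sketch of the idea, assuming the DbC tty is managed through a
struct tty_port embedded in the driver's port structure (names may differ
slightly):

  struct tty_struct *tty = tty_port_tty_get(&port->port);

  if (tty) {
          /* wake up any blocked writers; subsequent writes will fail */
          tty_vhangup(tty);
          tty_kref_put(tty);
  }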
A race condition during gadget teardown can lead to a use-after-free
in usb_gadget_state_work(), as reported by KASAN:
BUG: KASAN: invalid-access in sysfs_notify+0x2c/0xd0
Workqueue: events usb_gadget_state_work
The fundamental race occurs because a concurrent event (e.g., an
interrupt) can call usb_gadget_set_state() and schedule gadget->work
at any time during the cleanup process in usb_del_gadget().
Commit 399a45e5237c ("usb: gadget: core: flush gadget workqueue after
device removal") attempted to fix this by moving flush_work() to after
device_del(). However, this does not fully solve the race, as a new
work item can still be scheduled *after* flush_work() completes but
before the gadget's memory is freed, leading to the same use-after-free.
This patch fixes the race condition robustly by introducing a 'teardown'
flag and a 'state_lock' spinlock to the usb_gadget struct. The flag is
set during cleanup in usb_del_gadget() *before* calling flush_work() to
prevent any new work from being scheduled once cleanup has commenced.
The scheduling site, usb_gadget_set_state(), now checks this flag under
the lock before queueing the work, thus safely closing the race window.
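A minimal sketch of the scheme; 'teardown' and 'state_lock' are the new
struct usb_gadget members this patch introduces, and the fragments below are
simplified:

  /* usb_gadget_set_state() */
  spin_lock_irqsave(&gadget->state_lock, flags);
  gadget->state = state;
  if (!gadget->teardown)
          schedule_work(&gadget->work);
  spin_unlock_irqrestore(&gadget->state_lock, flags);

  /* usb_del_gadget(), before flushing the work */
  spin_lock_irqsave(&gadget->state_lock, flags);
  gadget->teardown = true;
  spin_unlock_irqrestore(&gadget->state_lock, flags);
  flush_work(&gadget->work);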
Fixes: 5702f75375aa9 ("usb: gadget: udc-core: move sysfs_notify() to a workqueue") Cc: stable <stable@kernel.org> Signed-off-by: Jimmy Hu <hhhuuu@google.com> Link: https://patch.msgid.link/20251023054945.233861-1-hhhuuu@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Adjust context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Two clearly different specimens of NEC uPD720200 (one with start/stop
bug, one without) were seen to cause IOMMU faults after some Missed
Service Errors. Faulting address is immediately after a transfer ring
segment and patched dynamic debug messages revealed that the MSE was
received when waiting for a TD near the end of that segment:
[ 1.041954] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ffa08fe0
[ 1.042120] xhci_hcd: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffa09000 flags=0x0000]
[ 1.042146] xhci_hcd: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffa09040 flags=0x0000]
It gets even funnier if the next page is a ring segment accessible to
the HC. Below, it reports MSE in segment at ff1e8000, plows through a
zero-filled page at ff1e9000 and starts reporting events for TRBs in
page at ff1ea000 every microframe, instead of jumping to seg ff1e6000.
[ 7.041671] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ff1e8fe0
[ 7.041999] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ff1e8fe0
[ 7.042011] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint
[ 7.042028] xhci_hcd: All TDs skipped for slot 1 ep 2. Clear skip flag.
[ 7.042134] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint
[ 7.042138] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 31
[ 7.042144] xhci_hcd: Looking for event-dma 00000000ff1ea040 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820
[ 7.042259] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint
[ 7.042262] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 31
[ 7.042266] xhci_hcd: Looking for event-dma 00000000ff1ea050 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820
At some point completion events change from Isoch Buffer Overrun to
Short Packet and the HC finally finds cycle bit mismatch in ff1ec000.
[ 7.098130] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[ 7.098132] xhci_hcd: Looking for event-dma 00000000ff1ecc50 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820
[ 7.098254] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[ 7.098256] xhci_hcd: Looking for event-dma 00000000ff1ecc60 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820
[ 7.098379] xhci_hcd: Overrun event on slot 1 ep 2
It's possible that data from the isochronous device were written to
random buffers of pending TDs on other endpoints (either IN or OUT),
other devices or even other HCs in the same IOMMU domain.
Lastly, an error from a different USB device on another HC. Was it
caused by the above? I don't know, but it may have been. The disk
was working without any other issues and generated PCIe traffic to
starve the NEC of upstream BW and trigger those MSEs. The two HCs
shared one x1 slot by means of a commercial "PCIe splitter" board.
[ 7.162604] usb 10-2: reset SuperSpeed USB device number 3 using xhci_hcd
[ 7.178990] sd 9:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s
[ 7.179001] sd 9:0:0:0: [sdb] tag#0 CDB: opcode=0x28 28 00 04 02 ae 00 00 02 00 00
[ 7.179004] I/O error, dev sdb, sector 67284480 op 0x0:(READ) flags 0x80700 phys_seg 5 prio class 0
Fortunately, it appears that this ridiculous bug is avoided by setting
the chain bit of Link TRBs on isochronous rings. Other ancient HCs are
known which also expect the bit to be set and they ignore Link TRBs if
it's not. Reportedly, 0.95 spec guaranteed that the bit is set.
The bandwidth-starved NEC HC running a 32KB/uframe UVC endpoint reports
tens of MSEs per second and runs into the bug within seconds. Chaining
Link TRBs allows the same workload to run for many minutes, many times.
No negative side effects seen in UVC recording and UAC playback with a
few devices at full speed, high speed and SuperSpeed.
The problem doesn't reproduce on the newer Renesas uPD720201/uPD720202
and on old Etron EJ168 and VIA VL805 (but the VL805 has other bugs).
[shorten line length of log snippets in commit message -Mathias]
Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-14-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Shivani: Modified to apply on v5.10.y-v6.1.y] Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Older 0.95 xHCI hosts and some other specific newer hosts require the
chain bit to be set for Link TRBs even if the link TRB is not in the
middle of a transfer descriptor (TD).
Move the checks for all those cases into one xhci_link_chain_quirk()
function to clean up and avoid code duplication.
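A rough sketch of the consolidated helper; XHCI_LINK_TRB_QUIRK is the
existing 0.95-era quirk flag, while the isochronous-ring condition is shown
here only schematically:

  static bool xhci_link_chain_quirk(struct xhci_hcd *xhci, enum xhci_ring_type type)
  {
          /* 0.95 hosts always need the chain bit set on link TRBs */
          if (xhci->quirks & XHCI_LINK_TRB_QUIRK)
                  return true;
          /* hosts like the NEC uPD720200 need it on isochronous rings */
          return type == TYPE_ISOC;
  }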
Check that the resource which is converted to a surface exists before
trying to use the cursor snooper on it.
vmw_cmd_res_check() allows explicitly invalid (SVGA3D_INVALID_ID) identifiers
because some SVGA commands accept SVGA3D_INVALID_ID to mean "no surface".
Unfortunately, functions that operate on the actual surface objects might not
be able to handle null objects (and in the case of the cursor snooper,
cannot). Make sure that we validate not only the identifier (via
vmw_cmd_res_check()) but also check that the actual resource exists before
trying to do something with it.
This fixes an unchecked null-pointer dereference in the snooping code.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Fixes: c0951b797e7d ("drm/vmwgfx: Refactor resource management") Reported-by: Kuzey Arda Bulut <kuzeyardabulut@gmail.com> Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Ian Forbes <ian.forbes@broadcom.com> Link: https://lore.kernel.org/r/20250917153655.1968583-1-zack.rusin@broadcom.com Signed-off-by: Sasha Levin <sashal@kernel.org>
[Shivani: Modified to apply on v5.10.y-v6.1.y] Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to section 5.3.6.2 (Multiport Device Operation) of the virtio
spec (version 1.2), a control buffer with the event VIRTIO_CONSOLE_RESIZE
is followed by a virtio_console_resize struct containing cols then rows.
The kernel implements this the wrong way around (rows then cols), resulting
in the two values being swapped.
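A minimal sketch of the corrected ordering, assuming the handler copies the
payload into a local struct and passes it to a set_console_size()-style
helper:

  struct {
          __u16 cols;
          __u16 rows;
  } size;

  /* per the spec: cols first, then rows */
  memcpy(&size, buf->buf + buf->offset + sizeof(struct virtio_console_control),
         sizeof(size));
  set_console_size(port, size.rows, size.cols);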
Signed-off-by: Maximilian Immanuel Brandtner <maxbr@linux.ibm.com>
Message-Id: <20250324144300.905535-1-maxbr@linux.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Filip Hejsek <filip.hejsek@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kmod is now (since kmod commit 09c9f8c5df04 ("libkmod: Use kernel
decompression when available")) using the kernel decompressor, when
loading compressed modules.
However, the kernel XZ decompressor is XZ Embedded, which doesn't
handle CRC64 and dictionaries larger than 1MiB.
Use CRC32 and 1MiB dictionary when XZ compressing and installing
kernel modules.
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1050582 Signed-off-by: Martin Nybo Andersen <tweek@tweek.dk> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_aggregate() is assuming all dynamic memory allocation in it
will succeed. Those allocations are indeed likely to succeed in real use
cases since they are too small to fail, but theoretically they could fail.
In that case, inappropriate memory access can happen. Fix it by
appropriately cleaning up the pre-allocated memory and skipping the
execution of the remaining tests in the failure cases.
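A minimal sketch of the pattern, assuming kunit_skip() is used to abort the
test on allocation failure; the DAMON helper signatures differ slightly
across the affected stable branches:

  t = damon_new_target();
  if (!t)
          kunit_skip(test, "target allocation failed");
  r = damon_new_region(0, 100);
  if (!r) {
          damon_free_target(t);   /* clean up what was already allocated */
          kunit_skip(test, "region allocation failed");
  }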
Link: https://lkml.kernel.org/r/20251101182021.74868-5-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_split_regions_of() is assuming all dynamic memory allocation in
it will succeed. Those allocations are indeed likely to succeed in real use
cases since they are too small to fail, but theoretically they could fail.
In that case, inappropriate memory access can happen. Fix it by
appropriately cleaning up the pre-allocated memory and skipping the
execution of the remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-9-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_target() is assuming all dynamic memory allocation in it will
succeed. Those allocations are indeed likely to succeed in real use cases
since they are too small to fail, but theoretically they could fail. In
that case, inappropriate memory access can happen. Fix it by appropriately
cleaning up the pre-allocated memory and skipping the execution of the
remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-4-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_merge_two() is assuming all dynamic memory allocation in it
will succeed. Those allocations are indeed likely to succeed in real use
cases since they are too small to fail, but theoretically they could fail.
In that case, inappropriate memory access can happen. Fix it by
appropriately cleaning up the pre-allocated memory and skipping the
execution of the remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-7-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_merge_regions_of() is assuming all dynamic memory allocation in
it will succeed. Those allocations are indeed likely to succeed in real use
cases since they are too small to fail, but theoretically they could fail.
In that case, inappropriate memory access can happen. Fix it by
appropriately cleaning up the pre-allocated memory and skipping the
execution of the remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-8-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_split_at() is assuming all dynamic memory allocation in it will
succeed. Those allocations are indeed likely to succeed in real use cases
since they are too small to fail, but theoretically they could fail. In
that case, inappropriate memory access can happen. Fix it by appropriately
cleaning up the pre-allocated memory and skipping the execution of the
remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-6-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_regions() is assuming all dynamic memory allocation in it will
succeed. Those allocations are indeed likely to succeed in real use cases
since they are too small to fail, but theoretically they could fail. In
that case, inappropriate memory access can happen. Fix it by appropriately
cleaning up the pre-allocated memory and skipping the execution of the
remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-3-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_split_evenly_succ() is assuming all dynamic memory allocation
in it will succeed. Those allocations are indeed likely to succeed in real
use cases since they are too small to fail, but theoretically they could
fail. In that case, inappropriate memory access can happen. Fix it by
appropriately cleaning up the pre-allocated memory and skipping the
execution of the remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-20-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This problem is similar to the problem that the
commit 1d6a9e7449e2 ("RDMA/core: Fix use-after-free when rename device name")
fixes.
The root cause is that ib_device_rename() renames the device name while
holding a lock, but kobject_uevent() accesses the name without lock
protection at the same time.
The solution is to add lock protection when the name is accessed in
kobject_uevent().
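A minimal sketch of the approach taken for this stable range (see the
backport note below), assuming the uevent is wrapped with the same
devices_rwsem that ib_device_rename() takes for writing:

  down_read(&devices_rwsem);
  /* the device name cannot change while devices_rwsem is held */
  kobject_uevent(&device->dev.kobj, KOBJ_ADD);
  up_read(&devices_rwsem);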
Fixes: 779e0bf47632 ("RDMA/core: Do not indicate device ready when device enablement fails") Link: https://patch.msgid.link/r/20250506151008.75701-1-yanjun.zhu@linux.dev Reported-by: syzbot+e2ce9e275ecc70a30b72@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e2ce9e275ecc70a30b72 Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Ajay: Modified to apply on v5.10.y-v6.6.y
ib_device_notify_register() not present in v5.10.y-v6.6.y,
so directly added lock for kobject_uevent() ] Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_do_test_apply_three_regions() is assuming all dynamic memory
allocation in it will succeed. Those allocations are indeed likely to
succeed in real use cases since they are too small to fail, but
theoretically they could fail. In that case, inappropriate memory access
can happen. Fix it by appropriately cleaning up the pre-allocated memory
and skipping the execution of the remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-18-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_test_split_evenly_fail() is assuming all dynamic memory allocation
in it will succeed. Those allocations are indeed likely to succeed in real
use cases since they are too small to fail, but theoretically they could
fail. In that case, inappropriate memory access can happen. Fix it by
appropriately cleaning up the pre-allocated memory and skipping the
execution of the remaining tests in the failure cases.
Link: https://lkml.kernel.org/r/20251101182021.74868-19-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since we recently started warning about uses of this function after the
atomic check phase completes, we've started getting warnings about this in
nouveau. It appears a misplaced drm_atomic_get_crtc_state() call has been
hiding in our .prepare_fb callback for a while.
So, fix this by adding a new nv50_head_atom_get_new() function and use that
in our .prepare_fb callback instead.
Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Fixes: 1590700d94ac ("drm/nouveau/kms/nv50-: split each resource type into their own source files") Cc: <stable@vger.kernel.org> # v4.18+ Link: https://patch.msgid.link/20251211190256.396742-1-lyude@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is possible for a BO to exist that is not currently associated with a
resource, e.g. because it has been evicted.
When devcoredump tries to read the contents of all BOs for dumping, we need
to expect this as well -- in this case, ENODATA is recorded instead of the
buffer contents.
Fixes: 7d08df5d0bd3 ("drm/ttm: Add ttm_bo_access") Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2") Cc: stable <stable@kernel.org> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6271 Signed-off-by: Simon Richter <Simon.Richter@hogyros.de> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251013161241.709916-1-Simon.Richter@hogyros.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A deadlock can occur between nfc_unregister_device() and rfkill_fop_write()
due to lock ordering inversion between device_lock and rfkill_global_mutex.
The problematic lock order is:
Thread A (rfkill_fop_write):
  rfkill_fop_write()
    mutex_lock(&rfkill_global_mutex)
    rfkill_set_block()
      nfc_rfkill_set_block()
        nfc_dev_down()
          device_lock(&dev->dev) <- waits for device_lock

Thread B (nfc_unregister_device):
  nfc_unregister_device()
    device_lock(&dev->dev)
    rfkill_unregister()
      mutex_lock(&rfkill_global_mutex) <- waits for rfkill_global_mutex
This creates a classic ABBA deadlock scenario.
Fix this by moving rfkill_unregister() and rfkill_destroy() outside the
device_lock critical section. Store the rfkill pointer in a local variable
before releasing the lock, then call rfkill_unregister() after releasing
device_lock.
This change is safe because rfkill_fop_write() holds rfkill_global_mutex
while calling the rfkill callbacks, and rfkill_unregister() also acquires
rfkill_global_mutex before cleanup. Therefore, rfkill_unregister() will
wait for any ongoing callback to complete before proceeding, and
device_del() is only called after rfkill_unregister() returns, preventing
any use-after-free.
The similar lock ordering in nfc_register_device() (device_lock ->
rfkill_global_mutex via rfkill_register) is safe because during
registration the device is not yet in rfkill_list, so no concurrent
rfkill operations can occur on this device.
Fixes: 3e3b5dfcd16a ("NFC: reorder the logic in nfc_{un,}register_device") Cc: stable@vger.kernel.org Reported-by: syzbot+4ef89409a235d804c6c2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4ef89409a235d804c6c2 Link: https://lore.kernel.org/all/20251217054908.178907-1-kartikey406@gmail.com/T/ Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20251218012355.279940-1-kartikey406@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes the device failing to initialize with "error reading MAC
address" for me, probably because the incorrect write of NCR_RST to
SR_NCR is not actually resetting the device.
Fixes: c9b37458e95629b1d1171457afdcc1bf1eb7881d ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support") Cc: stable@vger.kernel.org Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Link: https://patch.msgid.link/20251221082400.50688-1-enelsonmoore@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the error path, call drop_client() to drop the reference
obtained by get_nfsdfs_clp().
Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In e1000_tbi_should_accept() we read the last byte of the frame via
'data[length - 1]' to evaluate the TBI workaround. If the descriptor-
reported length is zero or larger than the actual RX buffer size, this
read goes out of bounds and can hit unrelated slab objects. The issue
is observed from the NAPI receive path (e1000_clean_rx_irq):
==================================================================
BUG: KASAN: slab-out-of-bounds in e1000_tbi_should_accept+0x610/0x790
Read of size 1 at addr ffff888014114e54 by task sshd/363
CPU: 0 PID: 363 Comm: sshd Not tainted 5.18.0-rc1 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0x5a/0x74
print_address_description+0x7b/0x440
print_report+0x101/0x200
kasan_report+0xc1/0xf0
e1000_tbi_should_accept+0x610/0x790
e1000_clean_rx_irq+0xa8c/0x1110
e1000_clean+0xde2/0x3c10
__napi_poll+0x98/0x380
net_rx_action+0x491/0xa20
__do_softirq+0x2c9/0x61d
do_softirq+0xd1/0x120
</IRQ>
<TASK>
__local_bh_enable_ip+0xfe/0x130
ip_finish_output2+0x7d5/0xb00
__ip_queue_xmit+0xe24/0x1ab0
__tcp_transmit_skb+0x1bcb/0x3340
tcp_write_xmit+0x175d/0x6bd0
__tcp_push_pending_frames+0x7b/0x280
tcp_sendmsg_locked+0x2e4f/0x32d0
tcp_sendmsg+0x24/0x40
sock_write_iter+0x322/0x430
vfs_write+0x56c/0xa60
ksys_write+0xd1/0x190
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f511b476b10
Code: 73 01 c3 48 8b 0d 88 d3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 2b 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 8e 9b 01 00 48 89 04 24
RSP: 002b:00007ffc9211d4e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000004024 RCX: 00007f511b476b10
RDX: 0000000000004024 RSI: 0000559a9385962c RDI: 0000000000000003
RBP: 0000559a9383a400 R08: fffffffffffffff0 R09: 0000000000004f00
R10: 0000000000000070 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc9211d57f R14: 0000559a9347bde7 R15: 0000000000000003
</TASK>
Allocated by task 1:
__kasan_krealloc+0x131/0x1c0
krealloc+0x90/0xc0
add_sysfs_param+0xcb/0x8a0
kernel_add_sysfs_param+0x81/0xd4
param_sysfs_builtin+0x138/0x1a6
param_sysfs_init+0x57/0x5b
do_one_initcall+0x104/0x250
do_initcall_level+0x102/0x132
do_initcalls+0x46/0x74
kernel_init_freeable+0x28f/0x393
kernel_init+0x14/0x1a0
ret_from_fork+0x22/0x30
The buggy address belongs to the object at ffff888014114000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 1620 bytes to the right of
2048-byte region [ffff888014114000, ffff888014114800]
The buggy address belongs to the physical page:
page:ffffea0000504400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14110
head:ffffea0000504400 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head|node=0|zone=1)
raw: 0100000000010200 0000000000000000 dead000000000001 ffff888013442000
raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
==================================================================
This happens because the TBI check unconditionally dereferences the last
byte without validating the reported length first:
u8 last_byte = *(data + length - 1);
Fix by rejecting the frame early if the length is zero, or if it exceeds
adapter->rx_buffer_len. This preserves the TBI workaround semantics for
valid frames and prevents touching memory beyond the RX buffer.
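A minimal sketch of the early rejection, placed before the dereference
quoted above; its exact position inside e1000_tbi_should_accept() may
differ:

  u8 last_byte;

  /* reject frames whose reported length cannot be trusted */
  if (length == 0 || length > adapter->rx_buffer_len)
          return false;

  last_byte = *(data + length - 1);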
Fixes: 2037110c96d5 ("e1000: move tbi workaround code into helper function") Cc: stable@vger.kernel.org Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the CM ID is destroyed while the CM event for multicast creation is
still queued, cancel_work_sync() will prevent the work from running,
which also prevents destroying the ah_attr. This leaks a refcount and
triggers a WARN:
GID entry ref leak for dev syz1 index 2 ref=573
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 release_gid_table drivers/infiniband/core/cache.c:806 [inline]
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 gid_table_release_one+0x284/0x3cc drivers/infiniband/core/cache.c:886
Destroy the ah_attr after canceling the work, it is safe to call this
twice.
The netlink response for RDMA_NL_LS_OP_IP_RESOLVE should always have a
LS_NLA_TYPE_DGID attribute, it is invalid if it does not.
Use the nl parsing logic properly and call nla_parse_deprecated() to fill
the nlattrs array and then directly index that array to get the data for
the DGID. Just fail if it is NULL.
Remove the for loop searching for the nla, and squash the validation and
parsing into one function.
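A minimal sketch of the squashed parse-and-validate step; the policy name
(ib_nl_addr_policy) is an assumption and the surrounding context is
simplified:

  union ib_gid gid;
  struct nlattr *tb[LS_NLA_TYPE_MAX];
  int ret;

  ret = nla_parse_deprecated(tb, LS_NLA_TYPE_MAX - 1, nlmsg_data(nlh),
                             nlmsg_len(nlh), ib_nl_addr_policy, NULL);
  if (ret || !tb[LS_NLA_TYPE_DGID])
          return false;   /* invalid response: the DGID attribute is mandatory */

  memcpy(&gid, nla_data(tb[LS_NLA_TYPE_DGID]), sizeof(gid));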
Fixes an uninitialized read from the stack triggered by userspace if it
does not provide the DGID to a kernel initiated RDMA_NL_LS_OP_IP_RESOLVE
query.
If you use an IDR with a non-zero base, and specify a range that lies
entirely below the base, 'max - base' becomes very large and
idr_get_free() can return an ID that lies outside of the requested range.
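A small illustration of the failure mode (the numbers are arbitrary):

  struct idr idr;
  int id;

  idr_init_base(&idr, 16);                /* callers expect IDs to start at 16 */
  id = idr_alloc(&idr, NULL, 2, 8, GFP_KERNEL);
  /* the requested range [2, 8) lies entirely below the base; 'max - base'
   * underflows internally and 'id' can come back outside [2, 8) instead of
   * the call failing
   */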
Link: https://lkml.kernel.org/r/20251128161853.3200058-1-willy@infradead.org Fixes: 6ce711f27500 ("idr: Make 1-based IDRs more efficient") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Jan Sokolowski <jan.sokolowski@intel.com> Reported-by: Koen Koning <koen.koning@intel.com> Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6449 Reviewed-by: Christian König <christian.koenig@amd.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The delayed_work delayed_work_enable_hotplug is initialized with
INIT_DELAYED_WORK() in adv7842_probe(), but it is never scheduled
anywhere in the probe function.
Calling cancel_delayed_work() on a work that has never been
scheduled is redundant and unnecessary, as there is no pending
work to cancel.
Remove the redundant cancel_delayed_work() from error handling
path and adjust the goto label accordingly to simplify the code
and avoid potential confusion.
Fixes: a89bcd4c6c20 ("[media] adv7842: add new video decoder driver") Cc: stable@vger.kernel.org Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The delayed_work delayed_work_enable_hotplug is initialized with
INIT_DELAYED_WORK() in adv76xx_probe(), but it is never scheduled
anywhere in the probe function.
Calling cancel_delayed_work() on a work that has never been
scheduled is redundant and unnecessary, as there is no pending
work to cancel.
Remove the redundant cancel_delayed_work() from error handling
path and adjust the goto label accordingly to simplify the code
and avoid potential confusion.
Fixes: 54450f591c99 ("[media] adv7604: driver for the Analog Devices ADV7604 video decoder") Cc: stable@vger.kernel.org Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In cec_devnode_init(), the debugfs directory created with
debugfs_create_dir() is not removed if bus_register() fails.
This leaves a stale "cec" entry in debugfs and prevents
proper module reloading.
Fix this by removing the debugfs directory in the error path.
The variables were never clamped because the return value of clamp_val()
was not used. Fix this by assigning the clamped values, and use clamp()
instead of clamp_val().
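A minimal illustration of the difference (variable names are illustrative):

  clamp_val(val, min, max);      /* return value discarded: val stays unchanged */
  val = clamp(val, min, max);    /* clamped value actually assigned */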
While debugging why X would not start on mips64 SGI/O2 I found the
phys address being off. Turns out gbefb passed the internal
dma_addr as phys. May be broken since pre-git history. Fix by converting
with dma_to_phys().
When performing a read-modify-write (RMW) operation, any modification
to a buffered block must cause the entire buffer to be marked dirty.
Marking only a subrange as dirty is incorrect because the underlying
device block size (ubs) defines the minimum read/write granularity. A
lower device can perform I/O only on regions which are fully aligned
and sized to ubs.
This change ensures that write-back operations always occur in full
ubs-sized chunks, matching the intended emulation semantics of the
EBS target.
As for the user-space visible impact: devices which are tuned to ubs
sizes only will reject sub-ubs and misaligned I/O requests, which can
therefore lead to losing data. Example:
When the kernel returns to userspace via syscall_restore_rfi(), the
W bit is not set in the new PSW. This doesn't cause any problems
because there's no 64 bit userspace for parisc. Simple static binaries
are usually loaded at addresses way below the 32 bit limit so the W bit
doesn't matter.
Fix this by setting the W bit when TIF_32BIT is not set.
In wide mode, the IASQ contains the upper part of the GVA
during an interruption. This needs to be reversed before
the space is used, otherwise it contains parts of the IAOQ.
See "Processing Resources / Interruption Instruction Address
Queues" in the Parisc 2.0 Architecture manual, page 2-13,
for an explanation.
The IAOQ/IASQ space_adjust was skipped for interruptions other
than itlb misses. However, the code in handle_interruption()
checks whether iasq[0] contains a valid space. Due to the bits
not being masked out, this match failed and the process was killed.
Also add space_adjust for IAOQ1/IASQ1 so that ptregs contains sane values.
The driver calls reset_control_get_optional_exclusive() but never calls
reset_control_put() in error paths or in the remove function. This causes
a resource leak when probe fails after successfully acquiring the reset
control, or when the driver is unloaded.
Switch to devm_reset_control_get_optional_exclusive() to automatically
manage the reset control resource.
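A minimal sketch of the devm conversion, assuming the reset control is kept
in the driver's private data:

  priv->rstc = devm_reset_control_get_optional_exclusive(dev, NULL);
  if (IS_ERR(priv->rstc))
          return PTR_ERR(priv->rstc);
  /* no reset_control_put() needed in error paths or in remove() */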
MAX77620 is most likely always a single device on the board, however
nothing stops board designers from having two of them, thus the same
device driver could probe twice. Or the user could manually try probing
it a second time.
The device driver is not ready for that case, because it statically
allocates a non-const 'struct regmap_irq_chip' and, during probe, stores
a pointer to the per-probe state container ('struct max77620_chip') in
its 'irq_drv_data' member. devm_regmap_add_irq_chip() does not make a
copy of 'struct regmap_irq_chip' but stores the pointer.
A second probe - either successful or failing - would overwrite the
'irq_drv_data' from the previous device probe, so interrupts would be
executed in the wrong context.
Cc: stable@vger.kernel.org Fixes: 3df140d11c6d ("mfd: max77620: Mask/unmask interrupt before/after servicing it") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20251023101939.67991-2-krzysztof.kozlowski@linaro.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
LP5009 supports 9 LED outputs that are grouped into 3 modules.
Cc: stable@vger.kernel.org Fixes: 242b81170fb8 ("leds: lp50xx: Add the LP50XX family of the RGB LED driver") Signed-off-by: Christian Hitz <christian.hitz@bbv.ch> Link: https://patch.msgid.link/20251022063305.972190-1-christian@klarinett.li Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
led_banks contains LED module number(s) that should be grouped into the
module bank. led_banks is 0-initialized.
By checking the led_banks entries for 0, un-set entries are detected.
But a 0-entry also indicates that LED module 0 should be grouped into the
module bank.
By only iterating over the available entries, no check for unused entries
is required and LED module 0 can be added to a bank.
Cc: stable@vger.kernel.org Fixes: 242b81170fb8 ("leds: lp50xx: Add the LP50XX family of the RGB LED driver") Signed-off-by: Christian Hitz <christian.hitz@bbv.ch> Reviewed-by: Jacek Anaszewski <jacek.anaszewski@gmail.com> Link: https://patch.msgid.link/20251008123222.1117331-1-christian@klarinett.li Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a PCI device is suspended, it is normally the PCI core's job to save
Config Space and put the device into a low power state. However drivers
are allowed to assume these responsibilities. When they do, the PCI core
can tell by looking at the state_saved flag in struct pci_dev: The flag
is cleared before commencing the suspend sequence and it is set when
pci_save_state() is called. If the PCI core finds the flag set late in
the suspend sequence, it refrains from calling pci_save_state() itself.
But there are two corner cases where the PCI core neglects to clear the
flag before commencing the suspend sequence:
* If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects
to clear the flag. The (stale) flag is subsequently queried by
pci_legacy_suspend() itself and pci_legacy_suspend_late().
* If a device has no driver or its driver has no PCI PM callbacks,
pci_pm_freeze() neglects to clear the flag. The (stale) flag is
subsequently queried by pci_pm_freeze_noirq().
The flag may be set prior to suspend if the device went through error
recovery: Drivers commonly invoke pci_restore_state() + pci_save_state()
to restore Config Space after reset.
The flag may also be set if drivers call pci_save_state() on probe to
allow for recovery from subsequent errors.
The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq()
don't call pci_save_state() and so the state that will be restored on
resume is the one recorded on last error recovery or on probe, not the one
that the device had on suspend. If the two states happen to be identical,
there's no problem.
Reinstate clearing the flag in pci_legacy_suspend() and pci_pm_freeze().
The two functions used to do that until commit 4b77b0a2ba27 ("PCI: Clear
saved_state after the state has been restored") deemed it unnecessary
because it assumed that it's sufficient to clear the flag on resume in
pci_restore_state(). The commit seemingly did not take into account that
pci_save_state() and pci_restore_state() are not only used by power
management code, but also for error recovery.
Devices without driver or whose driver has no PCI PM callbacks may be in
runtime suspend when pci_pm_freeze() is called. Their state has already
been saved, so don't clear the flag to skip a pointless pci_save_state()
in pci_pm_freeze_noirq().
None of the drivers with legacy PCI PM callbacks seem to use runtime PM,
so clear the flag unconditionally in their case.
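A minimal sketch of the reinstated clearing; the runtime-suspend check shown
for pci_pm_freeze() is an assumption about how "state already saved" is
detected:

  /* pci_legacy_suspend(): drop state saved earlier, e.g. by error recovery */
  pci_dev->state_saved = false;

  /* pci_pm_freeze(): likewise, unless the device is already runtime
   * suspended and its state has therefore been saved
   */
  if (!pm_runtime_status_suspended(dev))
          pci_dev->state_saved = false;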
Fixes: 4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Cc: stable@vger.kernel.org # v2.6.32+ Link: https://patch.msgid.link/094f2aad64418710daf0940112abe5a0afdc6bce.1763483367.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
logi_dj_recv_query_paired_devices() and logi_dj_recv_switch_to_dj_mode()
both have 2 callers which all log an error if the function fails. Move
the error logging to inside these 2 functions to remove the duplicated
error logging in the callers.
While at it also move the logi_dj_recv_send_report() call error handling
in logi_dj_recv_switch_to_dj_mode() to directly after the call. That call
only fails if the report cannot be found and in that case it does nothing,
so the msleep() is not necessary on failures.
Fixes: 6f20d3261265 ("HID: logitech-dj: Fix error handling in logi_dj_recv_switch_to_dj_mode()") Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during probe_device().
Note that commit 9826e393e4a8 ("iommu/tegra-smmu: Fix missing
put_device() call in tegra_smmu_find") fixed the leak in an error path,
but the reference is still leaking on success.
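A minimal sketch of the pattern, simplified from a lookup-by-OF-node helper
(variable names are illustrative):

  struct platform_device *pdev;
  struct tegra_mc *mc;

  pdev = of_find_device_by_node(np);
  if (!pdev)
          return NULL;

  mc = platform_get_drvdata(pdev);
  put_device(&pdev->dev);   /* drop the lookup reference on success as well */
  if (!mc)
          return NULL;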
Make sure to drop the references taken to the iommu platform devices
when looking up their driver data during probe_device().
Note that the arch data device pointer added by commit 604629bcb505
("iommu/omap: add support for late attachment of iommu devices") has
never been used. Remove it to underline that the references are not
needed.
Fixes: 9d5018deec86 ("iommu/omap: Add support to program multiple iommus") Fixes: 7d6827748d54 ("iommu/omap: Fix iommu archdata name for DT-based devices") Cc: stable@vger.kernel.org # 3.18 Cc: Suman Anna <s-anna@ti.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Note that commit 1a26044954a6 ("iommu/exynos: add missing put_device()
call in exynos_iommu_of_xlate()") fixed the leak in a couple of error
paths, but the reference is still leaking on success.
Fixes: aa759fd376fb ("iommu/exynos: Add callback for initializing devices from device tree") Cc: stable@vger.kernel.org # 4.2: 1a26044954a6 Cc: Yu Kuai <yukuai3@huawei.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The DSP expects the periods to be aligned to fragment sizes. Currently,
setting up hw constraints on period bytes is not going to work
correctly as we can end up with period sizes aligned to 32 bytes but
not aligned to the fragment size.
Update the constraints to use the fragment size, and also set a step of
10ms for the period size to accommodate the DSP requirement of 10ms latency.
A matching common object post-processing (COPP) instance is normally reused
across multiple streams. However, currently we close this on the DSP
even though there is a refcount on this copp object, which can result in
the errors below:
q6routing ab00000.remoteproc:glink-edge:apr:service@8:routing: Found Matching Copp 0x0
qcom-q6adm aprsvc:service:4:8: cmd = 0x10325 return error = 0x2
q6routing ab00000.remoteproc:glink-edge:apr:service@8:routing: DSP returned error[2]
q6routing ab00000.remoteproc:glink-edge:apr:service@8:routing: Found Matching Copp 0x0
qcom-q6adm aprsvc:service:4:8: cmd = 0x10325 return error = 0x2
q6routing ab00000.remoteproc:glink-edge:apr:service@8:routing: DSP returned error[2]
qcom-q6adm aprsvc:service:4:8: cmd = 0x10327 return error = 0x2
qcom-q6adm aprsvc:service:4:8: DSP returned error[2]
qcom-q6adm aprsvc:service:4:8: Failed to close copp -22
qcom-q6adm aprsvc:service:4:8: cmd = 0x10327 return error = 0x2
qcom-q6adm aprsvc:service:4:8: DSP returned error[2]
qcom-q6adm aprsvc:service:4:8: Failed to close copp -22
Fix this by moving the adm_close to the copp_kref destructor callback.
The func_traceonoff_triggers.tc test sometimes fails
on my board, Kunpeng-920.
[root@localhost]# ./ftracetest ./test.d/ftrace/func_traceonoff_triggers.tc -l fail.log
=== Ftrace unit tests ===
[1] ftrace - test for function traceon/off triggers [FAIL]
[2] (instance) ftrace - test for function traceon/off triggers [UNSUPPORTED]
Looking at the log, it shows that the md5sums csum1 and csum2 differ.
Link: https://lore.kernel.org/r/20230818013226.2182299-1-zouyipeng@huawei.com Fixes: d87b29179aa0 ("selftests: ftrace: Use md5sum to take less time of checking logs") Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Yipeng Zou <zouyipeng@huawei.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
dma_alloc_coherent() allocates a DMA-mapped buffer, pbl->pg_arr[i].
dma_free_coherent() should be passed that same buffer, not a
page-aligned version of the pointer.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Link: https://patch.msgid.link/20251230085121.8023-2-fourier.thomas@gmail.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In bnxt_qplib_alloc_init_hwq(), while allocating memory for the PDE table,
the driver incorrectly uses the "pg_size" value passed to the function.
Fix it to use the right value, 4K. Also, fix the allocation size for the
PBL table.
The bnxt_re SEND path checks wr->send_flags to enable features such as
IP checksum offload. However, send_flags is a bitmask and may contain
multiple flags (e.g. IB_SEND_SIGNALED | IB_SEND_IP_CSUM), while the
existing code uses a switch() statement that only matches when
send_flags is exactly IB_SEND_IP_CSUM.
As a result, checksum offload is not enabled when additional SEND
flags are present.
Replace the switch() with a bitmask test:
if (wr->send_flags & IB_SEND_IP_CSUM)
This ensures IP checksum offload is enabled correctly when multiple
SEND flags are used.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://patch.msgid.link/20251219093308.2415620-1-alok.a.tiwari@oracle.com Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
RCFW_COMM_CONS_PCI_BAR_REGION is defined as BAR 2, so checking
!creq_db->reg.bar_id is incorrect and always false.
pci_resource_start() returns the BAR base address, and a value of 0
indicates that the BAR is unassigned. Update the condition to test
bar_base == 0 instead.
This ensures the driver detects and logs an error for an unassigned
RCFW communication BAR.
Fix missing comparison operator for RDMA_NETWORK_ROCE_V1 in the
conditional statement. The constant was used directly instead of
being compared with net_type, causing the condition to always
evaluate to true.
Fixes: 1c15b4f2a42f ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type") Signed-off-by: Jang Ingyu <ingyujang25@korea.ac.kr> Link: https://patch.msgid.link/20251219041508.1725947-1-ingyujang25@korea.ac.kr Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The page size used for the device might in some cases be smaller than
PAGE_SIZE, which results in a negative shift when calculating the number of
host pages in PAGE_SIZE for a debug log. Remove the debug line together
with the calculation.
Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation") Link: https://patch.msgid.link/r/20251210173656.8180-1-mrgolin@amazon.com Reviewed-by: Tom Sela <tomsela@amazon.com> Reviewed-by: Yonatan Nachum <ynachum@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Reviewed-by: Gal Pressman <gal.pressman@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
irdma_net_event() should not dereference anything from "neigh" (alias
"ptr") until it has checked that the event is NETEVENT_NEIGH_UPDATE.
Other events come with different structures pointed to by "ptr" and they
may be smaller than struct neighbour.
Move the read of neigh->dev under the NETEVENT_NEIGH_UPDATE case.
The bug is mostly harmless, but it triggers KASAN on debug kernels:
BUG: KASAN: stack-out-of-bounds in irdma_net_event+0x32e/0x3b0 [irdma]
Read of size 8 at addr ffffc900075e07f0 by task kworker/27:2/542554
rose_kill_by_device() collects sockets into a local array[] and then
iterates over them to disconnect sockets bound to a device being brought
down.
The loop mistakenly indexes array[cnt] instead of array[i]. For cnt <
ARRAY_SIZE(array), this reads an uninitialized entry; for cnt ==
ARRAY_SIZE(array), it is an out-of-bounds read. Either case can lead to
an invalid socket pointer dereference and also leaks references taken
via sock_hold().
When a nexthop object is deleted, it is marked as dead and then
fib_table_flush() is called to flush all the routes that are using the
dead nexthop.
The current logic in fib_table_flush() is to only flush error routes
(e.g., blackhole) when it is called as part of network namespace
dismantle (i.e., with flush_all=true). Therefore, error routes are not
flushed when their nexthop object is deleted:
# ip link add name dummy1 up type dummy
# ip nexthop add id 1 dev dummy1
# ip route add 198.51.100.1/32 nhid 1
# ip route add blackhole 198.51.100.2/32 nhid 1
# ip nexthop del id 1
# ip route show
blackhole 198.51.100.2 nhid 1 dev dummy1
As such, they keep holding a reference on the nexthop object which in
turn holds a reference on the nexthop device, resulting in a reference
count leak:
# ip link del dev dummy1
[ 70.516258] unregister_netdevice: waiting for dummy1 to become free. Usage count = 2
Fix by flushing error routes when their nexthop is marked as dead.
IPv6 does not suffer from this problem.
Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects") Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Closes: https://lore.kernel.org/netdev/d943f806-4da6-4970-ac28-b9373b0e63ac@I-love.SAKURA.ne.jp/ Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20251221144829.197694-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
There exists a kernel oops caused by a BUG_ON(nhead < 0) at
net/core/skbuff.c:2232 in pskb_expand_head().
This bug is triggered as part of the calipso_skbuff_setattr()
routine when skb_cow() is passed headroom > INT_MAX
(i.e. (int)(skb_headroom(skb) + len_delta) < 0).
The root cause of the bug is due to an implicit integer cast in
__skb_cow(). The check (headroom > skb_headroom(skb)) is meant to ensure
that delta = headroom - skb_headroom(skb) is never negative, otherwise
we will trigger a BUG_ON in pskb_expand_head(). However, if
headroom > INT_MAX and delta <= -NET_SKB_PAD, the check passes, delta
becomes negative, and pskb_expand_head() is passed a negative value for
nhead.
Fix the trigger condition in calipso_skbuff_setattr(). Avoid passing
"negative" headroom sizes to skb_cow() within calipso_skbuff_setattr()
by only using skb_cow() to grow headroom.
Fixes: 2917f57b6bc1 ("calipso: Allow the lsm to label the skbuff directly.") Suggested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Will Rosenberg <whrosenb@asu.edu> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://patch.msgid.link/20251219173637.797418-1-whrosenb@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch ensures that the RX ring size (rx_pending) is not
set below the permitted minimum. This avoids UBSAN
shift-out-of-bounds errors when users pass small or zero
ring sizes via ethtool -G.
The ASIX driver reads the PHY address from the USB device via
asix_read_phy_addr(). A malicious or faulty device can return an
invalid address (>= PHY_MAX_ADDR), which causes a warning in
mdiobus_get_phy():
addr 207 out of range
WARNING: drivers/net/phy/mdio_bus.c:76
Validate the PHY address in asix_read_phy_addr() and remove the
now-redundant check in ax88172a.c.
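A minimal sketch of the validation, assuming it sits at the end of
asix_read_phy_addr() where 'ret' holds the address read from the device:

  if (ret >= PHY_MAX_ADDR) {
          netdev_err(dev->net, "invalid PHY address %#x\n", ret);
          return -ENODEV;
  }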
Reported-by: syzbot+3d43c9066a5b54902232@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=3d43c9066a5b54902232 Tested-by: syzbot+3d43c9066a5b54902232@syzkaller.appspotmail.com Fixes: 7e88b11a862a ("net: usb: asix: refactor asix_read_phy_addr() and handle errors on return") Link: https://lore.kernel.org/all/20251217085057.270704-1-kartikey406@gmail.com/T/ Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251218011156.276824-1-kartikey406@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This happens because smc_special_trylock() disables IRQs even on PREEMPT_RT,
but smc_special_unlock() does not restore IRQs on PREEMPT_RT.
The reason is that smc_special_unlock() calls spin_unlock_irqrestore(),
and rcu_read_unlock_bh() in __dev_queue_xmit() cannot invoke
rcu_read_unlock() through __local_bh_enable_ip() when current->softirq_disable_cnt becomes zero.
To address this issue, replace smc_special_trylock() with spin_trylock_irqsave().
Fixes: 342a93247e08 ("locking/spinlock: Provide RT variant header: <linux/spinlock_rt.h>") Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251217085115.1730036-1-yeoreum.yun@arm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In async_set_registers(), when usb_submit_urb() fails, the allocated
async_req structure and URB are not freed, causing a memory leak.
The completion callback async_set_reg_cb() is responsible for freeing
these allocations, but it is only called after the URB is successfully
submitted and completes (successfully or with error). If submission
fails, the callback never runs and the memory is leaked.
Fix this by freeing both the URB and the request structure in the error
path when usb_submit_urb() fails.
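An illustrative sketch of the error path; the request wrapper is a
made-up stand-in for the driver's async_req:

#include <linux/slab.h>
#include <linux/usb.h>

struct example_async_req {
	struct urb *urb;
	/* ... register write payload ... */
};

/* If submission fails the completion callback never runs, so the
 * submitter must free both allocations itself. */
static int example_submit_async(struct example_async_req *req)
{
	int ret = usb_submit_urb(req->urb, GFP_ATOMIC);

	if (ret) {
		usb_free_urb(req->urb);
		kfree(req);
	}
	return ret;
}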
The problem is in this flow:
1) Port is enabled, queue_id != 0, in qom_list
2) Port gets disabled
-> team_port_disable()
-> team_queue_override_port_del()
-> del (removed from list)
3) Port is disabled, queue_id != 0, not in any list
4) Priority changes
-> team_queue_override_port_prio_changed()
-> checks: port disabled && queue_id != 0
-> calls del - hits the BUG as it is removed already
To fix this, change the check in team_queue_override_port_prio_changed()
so it returns early if the port is not enabled.
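A sketch of the corrected check (simplified types, not the team driver's
actual structures):

#include <linux/list.h>
#include <linux/types.h>

struct example_port {
	bool enabled;
	u16 queue_id;
	struct list_head qom_list;	/* queue override list node */
};

static void example_prio_changed(struct example_port *port)
{
	/* A disabled port was already removed from the override list in
	 * step 2 above, so there is nothing to delete or re-sort here. */
	if (!port->enabled)
		return;

	/* ... re-sort the port within the queue override list ... */
}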
Reported-by: syzbot+422806e5f4cce722a71f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=422806e5f4cce722a71f Fixes: 6c31ff366c11 ("team: remove synchronize_rcu() called during queue override change") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251212102953.167287-1-jiri@resnulli.us Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The ibm_rtl_init() function searches for the signature but has a pointer
arithmetic error. The loop counter suggests searching at 4-byte intervals
but the implementation only advances by 1 byte per iteration.
Fix by properly advancing the pointer by sizeof(unsigned int) bytes
each iteration.
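A standalone sketch of the corrected scan loop, advancing by
sizeof(unsigned int) per iteration; names and the search-area handling
are illustrative:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const uint8_t *scan_for_signature(const uint8_t *area, size_t len,
					 const char *sig, size_t sig_len)
{
	size_t off;

	/* Step by 4 bytes (sizeof(unsigned int)), not by 1 byte. */
	for (off = 0; off + sig_len <= len; off += sizeof(unsigned int)) {
		if (!memcmp(area + off, sig, sig_len))
			return area + off;
	}
	return NULL;
}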
Reported-by: Yuhao Jiang <danisjiang@gmail.com> Reported-by: Junrui Luo <moonafterrain@outlook.com> Fixes: 35f0ce032b0f ("IBM Real-Time "SMI Free" mode driver -v7") Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://patch.msgid.link/SYBPR01MB78812D887A92DE3802D0D06EAFA9A@SYBPR01MB7881.ausprd01.prod.outlook.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
A sysfs group is created in msi_init() when old_ec_model is enabled, but
never removed. Remove the msipf_old_attribute_group in that case.
Fixes: 03696e51d75a ("msi-laptop: Disable brightness control for new EC") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Link: https://patch.msgid.link/20251217103617.27668-2-fourier.thomas@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Over the years, syzbot found many ways to crash the kernel
in ip6gre_header() [1].
This involves the team or bonding drivers' ability to dynamically
change their dev->needed_headroom and/or dev->hard_header_len.
In this particular crash mld_newpack() allocated an skb
with a too small reserve/headroom, and by the time mld_sendpack()
was called, syzbot managed to attach an ip6gre device.
Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Reported-by: syzbot+43a2ebcf2a64b1102d64@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/693b002c.a70a0220.33cd7b.0033.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251211173550.2032674-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The openvswitch teardown code will immediately call
ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
It will then start the dp_notify_work workqueue, which will later end up
calling the vport destroy() callback. This callback takes the RTNL to do
another ovs_netdev_detach_port(), which in this case is unnecessary.
This causes extra pressure on the RTNL, in some cases leading to
"unregister_netdevice: waiting for XX to become free" warnings on
teardown.
We can straightforwardly avoid the extra RTNL lock acquisition by
checking the device flags before taking the lock, and skip the locking
altogether if the IFF_OVS_DATAPATH flag has already been unset.
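A sketch of the shortcut described above (simplified: the real code
operates on a vport rather than a bare net_device):

#include <linux/netdevice.h>
#include <linux/rtnetlink.h>

static void example_vport_destroy(struct net_device *dev)
{
	/* Already detached by the NETDEV_UNREGISTER handler: the flag is
	 * cleared, so skip taking the RTNL entirely. */
	if (!(dev->priv_flags & IFF_OVS_DATAPATH))
		return;

	rtnl_lock();
	/* ... detach the port from the datapath ... */
	rtnl_unlock();
}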
The Aspeed MDIO controller may return incorrect data when a read operation
follows immediately after a write. Due to a controller bug, the subsequent
read can latch stale data, causing the polling logic to terminate earlier
than expected.
To work around this hardware issue, insert a dummy read after each write
operation. This ensures that the next actual read returns the correct
data and prevents premature polling exit.
This workaround has been verified to stabilize MDIO transactions on
affected Aspeed platforms.
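A conceptual sketch of the workaround; the register offset and context
structure are placeholders, not the driver's definitions:

#include <linux/io.h>
#include <linux/types.h>

#define EXAMPLE_MDIO_DATA	0x04

struct example_mdio {
	void __iomem *base;
};

static void example_mdio_flush_write(struct example_mdio *ctx)
{
	/* ... the write command has been issued and has completed ... */

	/* Dummy read: the value is discarded; the access itself prevents
	 * the next real read from latching stale data. */
	(void)readl(ctx->base + EXAMPLE_MDIO_DATA);
}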
Add aspeed_mdio_op() and aspeed_mdio_get_data() for register access.
aspeed_mdio_op() handles operations: it writes the command to the control
register, then waits until the operation is finished (bit 31 is cleared).
aspeed_mdio_get_data() fetches the result of the operation from the data
register.
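A minimal sketch of the described split; the register offsets, bit names
and poll timings are assumptions, not the driver's actual values:

#include <linux/bits.h>
#include <linux/io.h>
#include <linux/iopoll.h>

#define EX_MDIO_CTRL		0x00
#define EX_MDIO_DATA		0x04
#define EX_MDIO_CTRL_FIRE	BIT(31)
#define EX_MDIO_DATA_MASK	GENMASK(15, 0)

struct ex_mdio {
	void __iomem *base;
};

/* Write the command, then poll until the controller clears bit 31. */
static int ex_mdio_op(struct ex_mdio *ctx, u32 cmd)
{
	u32 ctrl;

	writel(cmd | EX_MDIO_CTRL_FIRE, ctx->base + EX_MDIO_CTRL);

	return readl_poll_timeout(ctx->base + EX_MDIO_CTRL, ctrl,
				  !(ctrl & EX_MDIO_CTRL_FIRE), 20, 2000);
}

/* Fetch the result of the previous operation from the data register. */
static u16 ex_mdio_get_data(struct ex_mdio *ctx)
{
	return readl(ctx->base + EX_MDIO_DATA) & EX_MDIO_DATA_MASK;
}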
Signed-off-by: Potin Lai <potin.lai@quantatw.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: d1a1a4bade4b ("net: mdio: aspeed: add dummy read to avoid read-after-write issue") Signed-off-by: Sasha Levin <sashal@kernel.org>
This reverts commit 98921dbd00c4e ("Bluetooth: Use devm_kzalloc in
btusb.c file").
In btusb_probe(), we use devm_kzalloc() to allocate the btusb data. This
ties the lifetime of all the btusb data to the binding of a driver to
one interface, INTF. In a driver that binds to other interfaces, ISOC
and DIAG, this is an accident waiting to happen.
The issue is revealed in btusb_disconnect(), where calling
usb_driver_release_interface(&btusb_driver, data->intf) will have devm
free the data that is also being used by the other interfaces of the
driver that may not be released yet.
To fix this, revert the use of devm and go back to freeing memory
explicitly.
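A sketch of the reverted allocation pattern (simplified data structure,
not the actual btusb code): allocate explicitly in probe and free in
disconnect, so the data is not torn down when a single interface is
released:

#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/usb.h>

struct example_btusb_data {
	struct usb_interface *intf;
	struct usb_interface *isoc;
	struct usb_interface *diag;
};

static int example_probe(struct usb_interface *intf,
			 const struct usb_device_id *id)
{
	struct example_btusb_data *data = kzalloc(sizeof(*data), GFP_KERNEL);

	if (!data)
		return -ENOMEM;

	data->intf = intf;
	usb_set_intfdata(intf, data);
	return 0;
}

static void example_disconnect(struct usb_interface *intf)
{
	struct example_btusb_data *data = usb_get_intfdata(intf);

	usb_set_intfdata(intf, NULL);
	/* ... release the extra isoc/diag interfaces while data is valid ... */
	kfree(data);
}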
Fixes: 98921dbd00c4e ("Bluetooth: Use devm_kzalloc in btusb.c file") Signed-off-by: Raphael Pinsonneault-Thibeault <rpthibeault@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As soon as crypto_aead_encrypt is called, the underlying request
may be freed by an asynchronous completion. Thus dereferencing
req->iv after it returns is invalid.
Instead of checking req->iv against info, create a new variable
unaligned_info and use it for that purpose.
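A generic user-space sketch of the rule the fix follows: capture
everything you need from the request before the asynchronous hand-off,
because the request may already be freed when the call returns (the stub
below stands in for crypto_aead_encrypt()):

#include <stdbool.h>

struct example_req {
	void *iv;
};

/* Stub for the asynchronous hand-off; in the real code the request may
 * complete, and be freed, before this returns. */
static int example_async_encrypt(struct example_req *req)
{
	(void)req;
	return 0;
}

static int example_encrypt(struct example_req *req, void *info)
{
	bool unaligned_info = (req->iv != info);	/* capture before hand-off */
	int err = example_async_encrypt(req);		/* req is off-limits afterwards */

	if (unaligned_info) {
		/* ... clean up the temporary aligned copy behind info ... */
	}
	return err;
}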
Fixes: 0a270321dbf9 ("[CRYPTO] seqiv: Add Sequence Number IV Generator") Reported-by: Xiumei Mu <xmu@redhat.com> Reported-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
There are off-by-one bugs when configuring RSS hash key and lookup
table, causing out-of-bounds reads to memory [1] and out-of-bounds
writes to device registers.
Before commit 43a3d9ba34c9 ("i40evf: Allow PF driver to configure RSS"),
the loop upper bounds were:
i <= I40E_VFQF_{HKEY,HLUT}_MAX_INDEX
which is safe since the value is the last valid index.
That commit changed the bounds to:
i <= adapter->rss_{key,lut}_size / 4
where `rss_{key,lut}_size / 4` is the number of dwords, so the last
valid index is `(rss_{key,lut}_size / 4) - 1`. Therefore, using `<=`
accesses one element past the end.
Fix the issues by using `<` instead of `<=`, ensuring we do not exceed
the bounds.
[1] KASAN splat about rss_key_size off-by-one
BUG: KASAN: slab-out-of-bounds in iavf_config_rss+0x619/0x800
Read of size 4 at addr ffff888102c50134 by task kworker/u8:6/63
The buggy address belongs to the object at ffff888102c50100
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 0 bytes to the right of
allocated 52-byte region [ffff888102c50100, ffff888102c50134)
Memory state around the buggy address:
 ffff888102c50000: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
 ffff888102c50080: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
>ffff888102c50100: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc
                                     ^
 ffff888102c50180: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
 ffff888102c50200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
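A standalone sketch of the off-by-one with the sizes from the splat
above (52-byte key, 13 dwords, valid indices 0..12):

#include <stdio.h>

int main(void)
{
	unsigned int rss_key_size = 52;			/* bytes */
	unsigned int ndwords = rss_key_size / 4;	/* 13 */
	unsigned int i, buggy = 0, fixed = 0;

	for (i = 0; i <= ndwords; i++)	/* "<=" walks one dword too far */
		buggy++;
	for (i = 0; i < ndwords; i++)	/* "<" stops at the last valid index */
		fixed++;

	printf("accesses with '<=': %u, with '<': %u\n", buggy, fixed);
	return 0;
}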
The maximum number of descriptors supported by the hardware is
hardware-dependent and can be retrieved using
i40e_get_max_num_descriptors(). Move this function to a shared header
and use it when checking for a valid ring_len parameter rather than a
hardcoded value.
Fixing this over-acceptance is a behavior change: ring_len may now be
rejected when configuring RX and TX queues if it exceeds the
hardware-dependent maximum number of descriptors.
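A hedged sketch of the shape of the check; names are placeholders and
the real validation also enforces minimum size and descriptor-multiple
alignment, which is elided here:

#include <linux/errno.h>
#include <linux/types.h>

static int example_validate_ring_len(u32 ring_len, u32 max_num_descriptors)
{
	if (!ring_len || ring_len > max_num_descriptors)
		return -EINVAL;
	return 0;
}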
Fixes: 55d225670def ("i40e: add validation for ring_len param") Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 07d44190a389 ("i40e/i40evf: Detect and recover hung queue
scenario") changes i40e_detect_recover_hung() argument type from
i40e_pf* to i40e_vsi* to be shareable by both i40e and i40evf.
Because the i40evf driver no longer exists and the function is
exclusively used by i40e, we can revert this change.
Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Stable-dep-of: 699428342153 ("i40e: validate ring_len parameter against hardware-specific values") Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 0ef2d5afb12d ("i40e: KISS the client interface") simplified
the client interface so in practice it supports only one client
per i40e netdev. But we still have two notification functions that take
as a parameter a pointer to the VSI of the netdevice associated with the
client. After the mentioned commit, the only possible and used VSI is
the main (LAN) VSI.
So refactor these functions so they are called with a PF pointer
argument and the associated (LAN) VSI is looked up inside them.
Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Stable-dep-of: 699428342153 ("i40e: validate ring_len parameter against hardware-specific values") Signed-off-by: Sasha Levin <sashal@kernel.org>
Add service task schedule to set_rx_mode.
In some cases there are error messages printed out in PTP application
(ptp4l):
ptp4l[13848.762]: port 1 (ens2f3np3): received SYNC without timestamp
ptp4l[13848.825]: port 1 (ens2f3np3): received SYNC without timestamp
ptp4l[13848.887]: port 1 (ens2f3np3): received SYNC without timestamp
This happens when the service task does not run immediately after
set_rx_mode, even though we need it for setup tasks. The service task
checks whether PTP RX packets are hung in firmware and propagates
correct settings such as the multicast address for the IEEE 1588
Precision Time Protocol.
RX timestamping depends on some of these filters being set. The bug
happens only with a high rate of incoming PTP packets, and not on every
run, since sometimes the service task is already run from a different
place immediately after starting ptp4l.
Fixes: 0e4425ed641f ("i40e: fix: do not sleep in netdev_ops") Reviewed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The macros FAN_FROM_REG and TEMP_FROM_REG evaluate their arguments
multiple times. When used in lockless contexts involving shared driver
data, this causes Time-of-Check to Time-of-Use (TOCTOU) race
conditions.
Convert the macros to static functions. This guarantees that arguments
are evaluated only once (pass-by-value), preventing the race
conditions.
Adhere to the principle of minimal changes by only converting macros
that evaluate arguments multiple times and are used in lockless
contexts.
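A standalone illustration of why the conversion matters; the conversion
formula is made up, not the driver's:

#include <stdio.h>

/* The macro re-reads its argument for every occurrence of 'reg'; a
 * function parameter is read once at the call site (pass-by-value). */
#define TEMP_FROM_REG_MACRO(reg) \
	(((reg) > 0x80 ? (reg) - 0x100 : (reg)) * 500)

static int temp_from_reg(int reg)
{
	return (reg > 0x80 ? reg - 0x100 : reg) * 500;
}

int main(void)
{
	volatile int shared_reg = 0x7f;	/* imagine a concurrent update */

	/* The macro may mix values from different reads of shared_reg. */
	printf("macro:    %d\n", TEMP_FROM_REG_MACRO(shared_reg));
	/* The function converts one consistent snapshot. */
	printf("function: %d\n", temp_from_reg(shared_reg));
	return 0;
}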
The macro FAN_FROM_REG evaluates its arguments multiple times. When used
in lockless contexts involving shared driver data, this leads to
Time-of-Check to Time-of-Use (TOCTOU) race conditions, potentially
causing divide-by-zero errors.
Convert the macro to a static function. This guarantees that arguments
are evaluated only once (pass-by-value), preventing the race
conditions.
Additionally, in store_fan_div, move the calculation of the minimum
limit inside the update lock. This ensures that the read-modify-write
sequence operates on consistent data.
Adhere to the principle of minimal changes by only converting macros
that evaluate arguments multiple times and are used in lockless
contexts.
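A standalone illustration of the divide-by-zero hazard; the conversion
formula is illustrative, not the driver's exact macro:

#include <stdio.h>

/* The macro re-reads 'val' for the division after the zero check; if a
 * concurrent update zeroes the shared value in between, the division
 * faults. The function checks and divides one private copy. */
#define FAN_FROM_REG_MACRO(val, div) \
	((val) == 0 ? -1 : 1350000 / ((val) * (div)))

static int fan_from_reg(int val, int div)
{
	if (val == 0 || div == 0)
		return -1;
	return 1350000 / (val * div);
}

int main(void)
{
	volatile int reg_val = 100;	/* imagine a concurrent update */
	int div = 2;

	printf("macro:    %d rpm\n", FAN_FROM_REG_MACRO(reg_val, div));
	printf("function: %d rpm\n", fan_from_reg(reg_val, div));
	return 0;
}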