With older FW, we may get the ASYNC_EVENT_CMPL_EVENT_ID_DBG_BUF_PRODUCER
for FW trace data type that has not been initialized. This will result
in a crash in bnxt_bs_trace_type_wrap(). Add a guard to check for a
valid magic_byte pointer before proceeding.
Fixes: 84fcd9449fd7 ("bnxt_en: Manage the FW trace context memory") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Shruti Parab <shruti.parab@broadcom.com> Signed-off-by: Gautam R A <gautam-r.a@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251104005700.542174-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In bnxt_ptp_init(), when ptp_clock_register() fails, the driver is
not freeing the memory allocated for ptp_info->pin_config. Fix it
to unconditionally free ptp_info->pin_config in bnxt_ptp_free().
Fixes: caf3eedbcd8d ("bnxt_en: 1PPS support for 5750X family chips") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251104005700.542174-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The netif_close() call in bnxt_shutdown() only stops packet DMA. There
may be FW DMA for trace logging (recently added) that will continue. If
we kexec to a new kernel, the DMA will corrupt memory in the new kernel.
Add bnxt_hwrm_func_drv_unrgtr() to unregister the driver from the FW.
This will stop the FW DMA. In case the call fails, call pcie_flr() to
reset the function and stop the DMA.
Fixes: 24d694aec139 ("bnxt_en: Allocate backing store memory for FW trace logs") Reported-by: Jakub Kicinski <kicinski@meta.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251104005700.542174-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Raw IP packets have no MAC header, leaving skb->mac_header uninitialized.
This can trigger kernel panics on ARM64 when xfrm or other subsystems
access the offset due to strict alignment checks.
Initialize the MAC header to prevent such crashes.
This can trigger kernel panics on ARM when running IPsec over the
qmimux0 interface.
The devm_kcalloc() function never return error pointers, it returns NULL
on failure. Also delete the netdev_err() printk. These allocation
functions already have debug output built-in some the extra error message
is not required.
Fixes: efabce290151 ("octeontx2-pf: AF_XDP zero copy receive support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/aQYKkrGA12REb2sj@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The TSO path called ionic_tx_map_skb() before preparing the TCP pseudo
checksum (ionic_tx_tcp_[inner_]pseudo_csum()), which may perform
skb_cow_head() and might modifies bytes in the linear header area.
Mapping first and then mutating the header risks:
- Using a stale DMA address if skb_cow_head() relocates the head, and/or
- Device reading stale header bytes on weakly-ordered systems
(CPU writes after mapping are not guaranteed visible without an
explicit dma_sync_single_for_device()).
Reorder the TX path to perform all header mutations (including
skb_cow_head()) *before* DMA mapping. Mapping is now done only after the
skb layout and header contents are final. This removes the need for any
post-mapping dma_sync and prevents on-wire corruption observed under
VLAN+TSO load after repeated runs.
This change is purely an ordering fix; no functional behavior change
otherwise.
Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Mohammad Heib <mheib@redhat.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Link: https://patch.msgid.link/20251031155203.203031-2-mheib@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The TX path currently writes descriptors and then immediately writes to
the MMIO doorbell register to notify the NIC. On weakly ordered
architectures, descriptor writes may still be pending in CPU or DMA
write buffers when the doorbell is issued, leading to the device
fetching stale or incomplete descriptors.
Add a dma_wmb() in ionic_txq_post() to ensure all descriptor writes are
visible to the device before the doorbell MMIO write.
Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Mohammad Heib <mheib@redhat.com> Link: https://patch.msgid.link/20251031155203.203031-1-mheib@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When iterating over the ARL table we stop at max ARL entries / 2, but
this is only valid if the chip actually returns 2 results at once. For
chips with only one result register we will stop before reaching the end
of the table if it is more than half full.
Fix this by only dividing the maximum results by two if we have a chip
with more than one result register (i.e. those with 4 ARL bins).
The switch clears the ARL_SRCH_STDN bit when the search is done, i.e. it
finished traversing the ARL table.
This means that there will be no valid result, so we should not attempt
to read and process any further entries.
We only ever check the validity of the entries for 4 ARL bin chips, and
only after having passed the first entry to the b53_fdb_copy().
This means that we always pass an invalid entry at the end to the
b53_fdb_copy(). b53_fdb_copy() does check the validity though before
passing on the entry, so it never gets passed on.
On < 4 ARL bin chips, we will even continue reading invalid entries
until we reach the result limit.
In the New Control register bit 1 is either reserved, or has a different
function:
Out of Range Error Discard
When enabled, the ingress port discards any frames
if the Length field is between 1500 and 1536
(excluding 1500 and 1536) and with good CRC.
The actual bit for enabling IP multicast is bit 0, which was only
explicitly enabled for BCM5325 so far.
For older switch chips, this bit defaults to 0, so we want to enable it
as well, while newer switch chips default to 1, and their documentation
says "It is illegal to set this bit to zero."
So drop the wrong B53_IPMC_FWD_EN define, enable the IP multicast bit
also for other switch chips. While at it, rename it to (B53_)IP_MC as
that is how it is called in Broadcom code.
There is no guarantee that the port state override registers have their
default values, as not all switches support being reset via register or
have a reset GPIO.
So when forcing port config, we need to make sure to clear all fields,
which we currently do not do for the speed and flow control
configuration. This can cause flow control stay enabled, or in the case
of speed becoming an illegal value, e.g. configured for 1G (0x2), then
setting 100M (0x1), results in 0x3 which is invalid.
For PORT_OVERRIDE_SPEED_2000M we need to make sure to only clear it on
supported chips, as the bit can have different meanings on other chips,
e.g. for BCM5389 this controls scanning PHYs for link/speed
configuration.
Fixes: 5e004460f874 ("net: dsa: b53: Add helper to set link parameters") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20251101132807.50419-2-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The call to device_node_to_regmap() in airoha_mdio_probe() can return
an ERR_PTR() if regmap initialization fails. Currently, the driver
stores the pointer without validation, which could lead to a crash
if it is later dereferenced.
Add an IS_ERR() check and return the corresponding error code to make
the probe path more robust.
Fixes: 67e3ba978361 ("net: mdio: Add MDIO bus controller for Airoha AN7583") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251031161607.58581-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the memory allocation in gpiolib_seq_start() fails, the s->private
field remains uninitialized and is later dereferenced without checking
in gpiolib_seq_stop(). Initialize s->private to NULL before calling
kzalloc() and check it before dereferencing it.
Fixes: e348544f7994 ("gpio: protect the list of GPIO devices with SRCU") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20251103141132.53471-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Looking up a GPIO controller by label that is the name of the software
node is wonky at best - the GPIO controller driver is free to set
a different label than the name of its firmware node. We're already being
passed a firmware node handle attached to the GPIO device to
swnode_get_gpio_device() so use it instead for a more precise lookup.
There is a race between operations that iterate over the userdata
cg_children list and concurrent add/remove of userdata items through
configfs. The update_userdata() function iterates over the
nt->userdata_group.cg_children list, and count_extradata_entries() also
iterates over this same list to count nodes.
Quoting from Documentation/filesystems/configfs.rst:
> A subsystem can navigate the cg_children list and the ci_parent pointer
> to see the tree created by the subsystem. This can race with configfs'
> management of the hierarchy, so configfs uses the subsystem mutex to
> protect modifications. Whenever a subsystem wants to navigate the
> hierarchy, it must do so under the protection of the subsystem
> mutex.
Without proper locking, if a userdata item is added or removed
concurrently while these functions are iterating, the list can be
accessed in an inconsistent state. For example, the list_for_each() loop
can reach a node that is being removed from the list by list_del_init()
which sets the nodes' .next pointer to point to itself, so the loop will
never end (or reach the WARN_ON_ONCE in update_userdata() ).
Fix this by holding the configfs subsystem mutex (su_mutex) during all
operations that iterate over cg_children.
This includes:
- userdatum_value_store() which calls update_userdata() to iterate over
cg_children
- All sysdata_*_enabled_store() functions which call
count_extradata_entries() to iterate over cg_children
The su_mutex must be acquired before dynamic_netconsole_mutex to avoid
potential lock ordering issues, as configfs operations may already hold
su_mutex when calling into our code.
After registering a VLAN device and setting its feature flags, we need to
synchronize the VLAN features with the lower device. For example, the VLAN
device does not have the NETIF_F_LRO flag, it should be synchronized with
the lower device based on the NETIF_F_UPPER_DISABLES definition.
As the dev->vlan_features has changed, we need to call
netdev_update_features(). The caller must run after netdev_upper_dev_link()
links the lower devices, so this patch adds the netdev_update_features()
call in register_vlan_dev().
Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251030073539.133779-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The script "ethtool-common.sh" is not installed in INSTALL_PATH, and
triggers some errors when I try to run the test
'drivers/net/netdevsim/ethtool-coalesce.sh':
TAP version 13
1..1
# timeout set to 600
# selftests: drivers/net/netdevsim: ethtool-coalesce.sh
# ./ethtool-coalesce.sh: line 4: ethtool-common.sh: No such file or directory
# ./ethtool-coalesce.sh: line 25: make_netdev: command not found
# ethtool: bad command line argument(s)
# ./ethtool-coalesce.sh: line 124: check: command not found
# ./ethtool-coalesce.sh: line 126: [: -eq: unary operator expected
# FAILED /0 checks
not ok 1 selftests: drivers/net/netdevsim: ethtool-coalesce.sh # exit=1
Install this file to avoid this error. After this patch:
TAP version 13
1..1
# timeout set to 600
# selftests: drivers/net/netdevsim: ethtool-coalesce.sh
# PASSED all 22 checks
ok 1 selftests: drivers/net/netdevsim: ethtool-coalesce.sh
Fixes: fbb8531e58bd ("selftests: extract common functions in ethtool-common.sh") Signed-off-by: Wang Liang <wangliang74@huawei.com> Link: https://patch.msgid.link/20251030040340.3258110-1-wangliang74@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The GRO self-test, gro.c, currently constructs IPv6 packets containing a
Hop-by-Hop Options header (IPPROTO_HOPOPTS) to ensure the GRO path
correctly handles IPv6 extension headers.
However, network elements may be configured to drop packets with the
Hop-by-Hop Options header (HBH). This causes the self-test to fail
in environments where such network elements are present.
To improve the robustness and reliability of this test in diverse
network environments, switch from using IPPROTO_HOPOPTS to
IPPROTO_DSTOPTS (Destination Options).
The Destination Options header is less likely to be dropped by
intermediate routers and still serves the core purpose of the test:
validating GRO's handling of an IPv6 extension header. This change
ensures the test can execute successfully without being incorrectly
failed by network policies outside the kernel's control.
Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Anubhav Singh <anubhavsinggh@google.com> Link: https://patch.msgid.link/20251030060436.1556664-1-anubhavsinggh@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Due to the gro_sender sending data packets and FIN packets
in very quick succession, these are received almost simultaneously
by the gro_receiver. FIN packets are sometimes processed before the
data packets leading to intermittent (~1/100) test failures.
This change adds a delay of 100ms before sending FIN packets
in gro:tcp test to avoid the out-of-order delivery. The same
mitigation already exists for the gro:ip test.
Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Anubhav Singh <anubhavsinggh@google.com> Link: https://patch.msgid.link/20251030062818.1562228-1-anubhavsinggh@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The internal switch on BCM63XX SoCs will unconditionally add 802.1Q VLAN
tags on egress to CPU when 802.1Q mode is enabled. We do this
unconditionally since commit ed409f3bbaa5 ("net: dsa: b53: Configure
VLANs while not filtering").
This is fine for VLAN aware bridges, but for standalone ports and vlan
unaware bridges this means all packets are tagged with the default VID,
which is 0.
While the kernel will treat that like untagged, this can break userspace
applications processing raw packets, expecting untagged traffic, like
STP daemons.
This also breaks several bridge tests, where the tcpdump output then
does not match the expected output anymore.
Since 0 isn't a valid VID, just strip out the VLAN tag if we encounter
it, unless the priority field is set, since that would be a valid tag
again.
Fixes: 964dbf186eaa ("net: dsa: tag_brcm: add support for legacy tags") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20251027194621.133301-1-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
gve implemented a ptp_clock for sole use of do_aux_work at this time.
ptp_clock_gettime() and ptp_sys_offset() assume every ptp_clock has
implemented either gettimex64 or gettime64. Stub gettimex64 and return
-EOPNOTSUPP to prevent NULL dereferencing.
In hci_cmd_complete_evt(), if the command complete event has an unknown
opcode, we assume the first byte of the remaining skb->data contains the
return status. However, parameter data has previously been pulled in
hci_event_func(), which may leave the skb empty. If so, using skb->data[0]
for the return status uses un-init memory.
The fix is to check skb->len before using skb->data.
Reported-by: syzbot+a9a4bedfca6aa9d7fa24@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a9a4bedfca6aa9d7fa24 Tested-by: syzbot+a9a4bedfca6aa9d7fa24@syzkaller.appspotmail.com Fixes: afcb3369f46ed ("Bluetooth: hci_event: Fix vendor (unknown) opcode status handling") Signed-off-by: Raphael Pinsonneault-Thibeault <rpthibeault@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Patch "Make HID attributes visible" is needed for older kernel versions
(e.g. 6.12) where ufs_get_device_desc() is called from ufshcd_probe_hba().
In these older kernel versions ufshcd_get_device_desc() may be called
after the sysfs attributes have been added. In the upstream kernel however
ufshcd_get_device_desc() is called before ufs_sysfs_add_nodes(). See also
the ufshcd_device_params_init() call from ufshcd_init(). Hence, calling
sysfs_update_group() is not necessary.
See also commit 69f5eb78d4b0 ("scsi: ufs: core: Move the
ufshcd_device_init(hba, true) call") in kernel v6.13.
Cc: Daniel Lee <chullee@google.com> Cc: Peter Wang <peter.wang@mediatek.com> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Neil Armstrong <neil.armstrong@linaro.org> Fixes: bb7663dec67b ("scsi: ufs: sysfs: Make HID attributes visible") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Fixes: bb7663dec67b ("scsi: ufs: sysfs: Make HID attributes visible") Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://patch.msgid.link/20251028222433.1108299-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In [1], Ross Brown reports poor performance of WCN7850 after enabling
power save. Temporarily revert the fix; it will be re-enabled once
the issue is resolved.
Fixes: 4b66d18918f8 ("wifi: ath12k: Fix missing station power save configuration") Reported-by: Ross Brown <true.robot.ross@gmail.com> Closes: https://lore.kernel.org/all/CAMn66qZENLhDOcVJuwUZ3ir89PVtVnQRq9DkV5xjJn1p6BKB9w@mail.gmail.com/ # [1] Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20251028060744.897198-1-miaoqing.pan@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The pt_dump_seq_puts() macro incorrectly uses seq_printf() instead of
seq_puts(). This is both a performance issue and conceptually wrong,
as the macro name suggests plain string output (puts) but the
implementation uses formatted output (printf).
The macro is used in ptdump.c:301 to output a newline character. Using
seq_printf() adds unnecessary overhead for format string parsing when
outputting this constant string.
This bug was introduced in commit 59c4da8640cc ("riscv: Add support to
dump the kernel page tables") in 2020, which copied the implementation
pattern from other architectures that had the same bug.
Fixes: 59c4da8640cc ("riscv: Add support to dump the kernel page tables") Signed-off-by: Josephine Pfeiffer <hi@josie.lol> Link: https://lore.kernel.org/r/20251018170451.3355496-1-hi@josie.lol Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Unwinding the stack of a task other than current, KASAN would report
"BUG: KASAN: out-of-bounds in walk_stackframe+0x41c/0x460"
There is a same issue on x86 and has been resolved by the commit 84936118bdf3 ("x86/unwind: Disable KASAN checks for non-current tasks")
The solution could be applied to RISC-V too.
This patch also can solve the issue:
https://seclists.org/oss-sec/2025/q4/23
If this happens, ufs_sysfs_add_nodes() triggers a kernel warning and
fails. Fix this by calling ufs_sysfs_add_nodes() before SCSI LUNs are
scanned since the sysfs_update_group() call happens from the context of
thread that executes ufshcd_async_scan(). This patch fixes the following
kernel warning:
Cc: Daniel Lee <chullee@google.com> Fixes: bb7663dec67b ("scsi: ufs: sysfs: Make HID attributes visible") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251014200118.3390839-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The device bus LAN ID was obtained from PCI_FUNC(), but when a PF
port is passthrough to a virtual machine, the function number may not
match the actual port index on the device. This could cause the driver
to perform operations such as LAN reset on the wrong port.
Fix this by reading the LAN ID from port status register.
Fixes: a34b3e6ed8fb ("net: txgbe: Store PCI info") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/B60A670C1F52CB8E+20251104062321.40059-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function ring_buffer_map_get_reader() is a bit more strict than the
other get reader functions, and except for certain situations the
rb_get_reader_page() should not return NULL. If it does, it triggers a
warning.
This warning was triggering but after looking at why, it was because
another acceptable situation was happening and it wasn't checked for.
If the reader catches up to the writer and there's still data to be read
on the reader page, then the rb_get_reader_page() will return NULL as
there's no new page to get.
In this situation, the reader page should not be updated and no warning
should trigger.
__unregister_trace_fprobe() checks tf->tuser to put it when removing
tprobe. However, disable_trace_fprobe() does not use it and only calls
unregister_fprobe(). Thus it forgets to disable tracepoint_user.
If the trace_fprobe has tuser, put it for unregistering the tracepoint
callbacks when disabling tprobe correctly.
Since __tracepoint_user_init() calls tracepoint_user_register() without
initializing tuser->tpoint with given tracpoint, it does not register
tracepoint stub function as callback correctly, and tprobe does not work.
Initializing tuser->tpoint correctly before tracepoint_user_register()
so that it sets up tracepoint callback.
Although this commit benefits QCA6174, it breaks QCA988x and
QCA9984 [1][2]. Since it is not likely to root cause/fix this
issue in a short time, revert it to get those chips back.
Commit c410fa9b07c3 ("drm/mediatek: Add AFBC support to Mediatek DRM
driver") added AFBC support to Mediatek DRM and enabled the
32x8/split/sparse modifier.
However, this is currently broken on Mediatek MT8188 (Genio 700 EVK
platform); tested using upstream Kernel and Mesa (v25.2.1), AFBC is used by
default since Mesa v25.0.
Kernel trace reports vblank timeouts constantly, and the render is garbled:
Until this gets fixed upstream, disable AFBC support on this platform, as
it's currently broken with upstream Mesa.
Fixes: c410fa9b07c3 ("drm/mediatek: Add AFBC support to Mediatek DRM driver") Cc: stable@vger.kernel.org Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: Macpaul Lin <macpaul.lin@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20251024202756.811425-1-ariel.dalessandro@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vb2_ioctl_remove_bufs() call manipulates queue internal buffer list,
potentially overwriting some pointers used by the legacy fileio access
mode. Forbid that ioctl when fileio is active to protect internal queue
state between subsequent read/write calls.
CC: stable@vger.kernel.org Fixes: a3293a85381e ("media: v4l2: Add REMOVE_BUFS ioctl") Reported-by: Shuangpeng Bai <SJB7183@psu.edu> Closes: https://lore.kernel.org/linux-media/5317B590-AAB4-4F17-8EA1-621965886D49@psu.edu/ Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some devices, like the Grandstream GUV3100 webcam, have an invalid UVC
descriptor where multiple entities share the same ID, this is invalid
and makes it impossible to make a proper entity tree without heuristics.
We have recently introduced a change in the way that we handle invalid
entities that has caused a regression on broken devices.
Implement a new heuristic to handle these devices properly.
Reported-by: Angel4005 <ooara1337@gmail.com> Closes: https://lore.kernel.org/linux-media/CAOzBiVuS7ygUjjhCbyWg-KiNx+HFTYnqH5+GJhd6cYsNLT=DaA@mail.gmail.com/ Fixes: 0e2ee70291e6 ("media: uvcvideo: Mark invalid entities with id UVC_INVALID_ENTITY_ID") Cc: stable@vger.kernel.org Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[BUG]
During development of a minor feature (make sure all btrfs_bio::end_io()
is called in task context), I noticed a crash in generic/388, where
metadata writes triggered new works after btrfs_stop_all_workers().
It turns out that it can even happen without any code modification, just
using RAID5 for metadata and the same workload from generic/388 is going
to trigger the use-after-free.
[CAUSE]
If btrfs hits an error, the fs is marked as error, no new
transaction is allowed thus metadata is in a frozen state.
But there are some metadata modifications before that error, and they are
still in the btree inode page cache.
Since there will be no real transaction commit, all those dirty folios
are just kept as is in the page cache, and they can not be invalidated
by invalidate_inode_pages2() call inside close_ctree(), because they are
dirty.
And finally after btrfs_stop_all_workers(), we call iput() on btree
inode, which triggers writeback of those dirty metadata.
And if the fs is using RAID56 metadata, this will trigger RMW and queue
new works into rmw_workers, which is already stopped, causing warning
from queue_work() and use-after-free.
[FIX]
Add a special handling for write_one_eb(), that if the fs is already in
an error state, immediately mark the bbio as failure, instead of really
submitting them.
Then during close_ctree(), iput() will just discard all those dirty
tree blocks without really writing them back, thus no more new jobs for
already stopped-and-freed workqueues.
The extra discard in write_one_eb() also acts as an extra safenet.
E.g. the transaction abort is triggered by some extent/free space
tree corruptions, and since extent/free space tree is already corrupted
some tree blocks may be allocated where they shouldn't be (overwriting
existing tree blocks). In that case writing them back will further
corrupting the fs.
CC: stable@vger.kernel.org # 6.6+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Even if normally `build_error` isn't a kernel object, it should still
be treated as such so that we pass the same flags. Similarly, `rustdoc`
targets are never kernel objects, but we need to treat them as such.
Otherwise, starting with Rust 1.91.0 (released 2025-10-30), `rustc`
will complain about missing sanitizer flags since `-Zsanitizer` is a
target modifier too [1]:
error: mixing `-Zsanitizer` will cause an ABI mismatch in crate `build_error`
--> rust/build_error.rs:3:1
|
3 | //! Build-time error.
| ^
|
= help: the `-Zsanitizer` flag modifies the ABI so Rust crates compiled with different values of this flag cannot be used together safely
= note: unset `-Zsanitizer` in this crate is incompatible with `-Zsanitizer=kernel-address` in dependency `core`
= help: set `-Zsanitizer=kernel-address` in this crate or unset `-Zsanitizer` in `core`
= help: if you are sure this will not cause problems, you may use `-Cunsafe-allow-abi-mismatch=sanitizer` to silence this error
The `rustdoc` modifiers bug [1] was fixed in Rust 1.90.0 [2], for which
we added a workaround in commit abbf9a449441 ("rust: workaround `rustdoc`
target modifiers bug").
However, `rustdoc`'s doctest generation still has a similar issue [3],
being fixed at [4], which does not affect us because we apply the
workaround to both, and now, starting with Rust 1.91.0 (released
2025-10-30), `-Zsanitizer` is a target modifier too [5], which means we
fail with:
RUSTDOC TK rust/kernel/lib.rs
error: mixing `-Zsanitizer` will cause an ABI mismatch in crate `kernel`
--> rust/kernel/lib.rs:3:1
|
3 | //! The `kernel` crate.
| ^
|
= help: the `-Zsanitizer` flag modifies the ABI so Rust crates compiled with different values of this flag cannot be used together safely
= note: unset `-Zsanitizer` in this crate is incompatible with `-Zsanitizer=kernel-address` in dependency `core`
= help: set `-Zsanitizer=kernel-address` in this crate or unset `-Zsanitizer` in `core`
= help: if you are sure this will not cause problems, you may use `-Cunsafe-allow-abi-mismatch=sanitizer` to silence this error
A simple way around is to add the sanitizer to the list in the existing
workaround (especially if we had not started to pass the sanitizer
flags in the previous commit, since in that case that would not be
necessary). However, that still applies the workaround in more cases
than necessary.
Instead, only modify the doctests flags to ignore the check for
sanitizers, so that it is more local (and thus the compiler keeps checking
it for us in the normal `rustdoc` calls). Since the previous commit
already treated the `rustdoc` calls as kernel objects, this should allow
us in the future to easily remove this workaround when the time comes.
By the way, the `-Cunsafe-allow-abi-mismatch` flag overwrites previous
ones rather than appending, so it needs to be all done in the same flag.
Moreover, unknown modifiers are rejected, and thus we have to gate based
on the version too.
Finally, `-Zsanitizer-cfi-normalize-integers` is not affected (in Rust
1.91.0), so it is not needed in the workaround for the moment.
The future move of pin-init to `syn` uncovers the following private
intra-doc link:
error: public documentation for `Devres` links to private item `Self::inner`
--> rust/kernel/devres.rs:106:7
|
106 | /// [`Self::inner`] is guaranteed to be initialized and is always accessed read-only.
| ^^^^^^^^^^^ this item is private
|
= note: this link will resolve properly if you pass `--document-private-items`
= note: `-D rustdoc::private-intra-doc-links` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(rustdoc::private_intra_doc_links)]`
Currently, when rendered, the link points to "nowhere" (an inexistent
anchor for a "method").
Thus fix it.
Cc: stable@vger.kernel.org Fixes: f5d3ef25d238 ("rust: devres: get rid of Devres' inner Arc") Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20251029071406.324511-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The future move of pin-init to `syn` uncovers the following broken
intra-doc link:
error: unresolved link to `crate::pin_init`
--> rust/kernel/sync/condvar.rs:39:40
|
39 | /// instances is with the [`pin_init`](crate::pin_init!) and [`new_condvar`] macros.
| ^^^^^^^^^^^^^^^^ no item named `pin_init` in module `kernel`
|
= note: `-D rustdoc::broken-intra-doc-links` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(rustdoc::broken_intra_doc_links)]`
Currently, when rendered, the link points to a literal `crate::pin_init!`
URL.
The runtime-const infrastructure was never designed to handle the
modular case, because the constant fixup is only done at boot time for
core kernel code.
But by the time I used it for the x86-64 user space limit handling in
commit 86e6b1547b3d ("x86: fix user address masking non-canonical
speculation issue"), I had completely repressed that fact.
And it all happens to work because the only code that currently actually
gets inlined by modules is for the access_ok() limit check, where the
default constant value works even when not fixed up. Because at least I
had intentionally made it be something that is in the non-canonical
address space region.
But it's technically very wrong, and it does mean that at least in
theory, the use of 'access_ok()' + '__get_user()' can trigger the same
speculation issue with non-canonical addresses that the original commit
was all about.
The pattern is unusual enough that this probably doesn't matter in
practice, but very wrong is still very wrong. Also, let's fix it before
the nice optimized scoped user accessor helpers that Thomas Gleixner is
working on cause this pseudo-constant to then be more widely used.
This all came up due to an unrelated discussion with Mateusz Guzik about
using the runtime const infrastructure for names_cachep accesses too.
There the modular case was much more obviously broken, and Mateusz noted
it in his 'v2' of the patch series.
That then made me notice how broken 'access_ok()' had been in modules
all along. Mea culpa, mea maxima culpa.
Fix it by simply not using the runtime-const code in modules, and just
using the USER_PTR_MAX variable value instead. This is not
performance-critical like the core user accessor functions (get_user()
and friends) are.
Also make sure this doesn't get forgotten the next time somebody wants
to do runtime constant optimizations by having the x86 runtime-const.h
header file error out if included by modules.
The mds auth caps check should also validate the
fsname along with the associated caps. Not doing
so would result in applying the mds auth caps of
one fs on to the other fs in a multifs ceph cluster.
The bug causes multiple issues w.r.t user
authentication, following is one such example.
Steps to Reproduce (on vstart cluster):
1. Create two file systems in a cluster, say 'fsname1' and 'fsname2'
2. Authorize read only permission to the user 'client.usr' on fs 'fsname1'
$ceph fs authorize fsname1 client.usr / r
3. Authorize read and write permission to the same user 'client.usr' on fs 'fsname2'
$ceph fs authorize fsname2 client.usr / rw
4. Update the keyring
$ceph auth get client.usr >> ./keyring
With above permssions for the user 'client.usr', following is the
expectation.
a. The 'client.usr' should be able to only read the contents
and not allowed to create or delete files on file system 'fsname1'.
b. The 'client.usr' should be able to read/write on file system 'fsname2'.
But, with this bug, the 'client.usr' is allowed to read/write on file
system 'fsname1'. See below.
5. Mount the file system 'fsname1' with the user 'client.usr'
$sudo bin/mount.ceph usr@.fsname1=/ /kmnt_fsname1_usr/
6. Try creating a file on file system 'fsname1' with user 'client.usr'. This
should fail but passes with this bug.
$touch /kmnt_fsname1_usr/file1
7. Mount the file system 'fsname1' with the user 'client.admin' and create a
file.
$sudo bin/mount.ceph admin@.fsname1=/ /kmnt_fsname1_admin
$echo "data" > /kmnt_fsname1_admin/admin_file1
8. Try removing an existing file on file system 'fsname1' with the user
'client.usr'. This shoudn't succeed but succeeds with the bug.
$rm -f /kmnt_fsname1_usr/admin_file1
For more information, please take a look at the corresponding mds/fuse patch
and tests added by looking into the tracker mentioned below.
v2: Fix a possible null dereference in doutc
v3: Don't store fsname from mdsmap, validate against
ceph_mount_options's fsname and use it
v4: Code refactor, better warning message and
fix possible compiler warning
The wake_up_bit() is called in ceph_async_unlink_cb(),
wake_async_create_waiters(), and ceph_finish_async_create().
It makes sense to switch on clear_bit() function, because
it makes the code much cleaner and easier to understand.
More important rework is the adding of smp_mb__after_atomic()
memory barrier after the bit modification and before
wake_up_bit() call. It can prevent potential race condition
of accessing the modified bit in other threads. Luckily,
clear_and_wake_up_bit() already implements the required
functionality pattern:
static inline void clear_and_wake_up_bit(int bit, unsigned long *word)
{
clear_bit_unlock(bit, word);
/* See wake_up_bit() for which memory barrier you need to use. */
smp_mb__after_atomic();
wake_up_bit(word, bit);
}
The Coverity Scan service has detected potential
race condition in ceph_ioctl_lazyio() [1].
The CID 1591046 contains explanation: "Check of thread-shared
field evades lock acquisition (LOCK_EVASION). Thread1 sets
fmode to a new value. Now the two threads have an inconsistent
view of fmode and updates to fields correlated with fmode
may be lost. The data guarded by this critical section may
be read while in an inconsistent state or modified by multiple
racing threads. In ceph_ioctl_lazyio: Checking the value of
a thread-shared field outside of a locked region to determine
if a locked operation involving that thread shared field
has completed. (CWE-543)".
The patch places fi->fmode field access under ci->i_ceph_lock
protection. Also, it introduces the is_file_already_lazy
variable that is set under the lock and it is checked later
out of scope of critical section.
The Coverity Scan service has detected the calling of
wait_for_completion_killable() without checking the return
value in ceph_lock_wait_for_completion() [1]. The CID 1636232
defect contains explanation: "If the function returns an error
value, the error value may be mistaken for a normal value.
In ceph_lock_wait_for_completion(): Value returned from
a function is not checked for errors before being used. (CWE-252)".
The patch adds the checking of wait_for_completion_killable()
return value and return the error code from
ceph_lock_wait_for_completion().
If mmap write lock is taken while draining retry fault, mmap write lock
is not released because svm_range_restore_pages calls mmap_read_unlock
then returns. This causes deadlock and system hangs later because mmap
read or write lock cannot be taken.
Downgrade mmap write lock to read lock if draining retry fault fix this
bug.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to I2S specs audio data is sampled on the rising edge of the
clock and it can change on the falling one. When operating in normal mode
this SoC behaves the opposite so a clock polarity inversion is required
in this case.
This was tested on an OdroidC2 (Amlogic S905 SoC) board.
Signed-off-by: Valerio Setti <vsetti@baylibre.com> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com> Tested-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://patch.msgid.link/20251007-fix-i2s-polarity-v1-1-86704d9cda10@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On m68k, check_sizetypes in headers_check reports:
./usr/include/asm/bootinfo-amiga.h:17: found __[us]{8,16,32,64} type without #include <linux/types.h>
This header file does not use any of the Linux-specific integer types,
but merely refers to them from comments, so this is a false positive.
As of commit c3a9d74ee413bdb3 ("kbuild: uapi: upgrade check_sizetypes()
warning to error"), this check was promoted to an error, breaking m68k
all{mod,yes}config builds.
Fix this by stripping simple comments before looking for Linux-specific
integer types.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://patch.msgid.link/949f096337e28d50510e970ae3ba3ec9c1342ec0.1759753998.git.geert@linux-m68k.org
[nathan: Adjust comment and remove unnecessary escaping from slashes in
regex] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When using interrupt pin (INT A) as watchdog output all other
interrupt sources need to be disabled to avoid additional
resets. Resulting INT_A_MASK1 value is 55 (0x37).
During kexec reboots, RTC alarms that are fired during the kernel
transition experience delayed execution. The new kernel would eventually
honor these alarms, but the interrupt handlers would only execute after
the driver probe is completed rather than at the intended alarm time.
This is because pending alarm interrupt status from the previous kernel
is not properly cleared during driver initialization, causing timing
discrepancies in alarm delivery.
To ensure precise alarm timing across kexec transitions, enhance the
probe function to:
1. Clear any pending alarm interrupt status from previous boot.
2. Detect existing valid alarms and preserve their state.
3. Re-enable alarm interrupts for future alarms.
The ASUS ROG Zephyrus Duo 15 SE (GX551QS) with ALC 289 codec requires specific
pin configuration for proper volume control. Without this quirk, volume
adjustments produce a muffled sound effect as only certain channels attenuate,
leaving bass frequency at full volume.
Testing with hdajackretask confirms these pin tweaks fix the issue:
- Pin 0x17: Internal Speaker (LFE)
- Pin 0x1e: Internal Speaker
Add bounds checking to prevent writes past framebuffer boundaries when
rendering text near screen edges. Return early if the Y position is off-screen
and clip image height to screen boundary. Break from the rendering loop if the
X position is off-screen. When clipping image width to fit the screen, update
the character count to match the clipped width to prevent buffer size
mismatches.
Without the character count update, bit_putcs_aligned and bit_putcs_unaligned
receive mismatched parameters where the buffer is allocated for the clipped
width but cnt reflects the original larger count, causing out-of-bounds writes.
Instead of preserving mode, timestamp, and owner, for the object files
during installation, just preserve the mode and timestamp.
When installing as root, the installed files should be owned by root.
When installing as user, --preserve=ownership doesn't work anyway. This
makes --preserve=ownership rather pointless.
Signed-off-by: Emil Dahl Juhl <juhl.emildahl@gmail.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
small_const_nbits is defined in asm-generic/bitsperlong.h which
bitmap.h uses but doesn't include causing build failures in some build
systems. Add the missing #include.
Note the bitmap.h in tools has diverged from that of the kernel, so no
changes are made there.
Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Yury Norov <yury.norov@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: André Almeida <andrealmeid@igalia.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Darren Hart <dvhart@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jason Xing <kerneljasonxing@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonas Gottlieb <jonas.gottlieb@stackit.cloud> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maurice Lambert <mauricelambert434@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Machata <petrm@nvidia.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yuyang Huang <yuyanghuang@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The atomic instructions sc.q, llacq.{w/d}, screl.{w/d} were newly added
in the LoongArch Reference Manual v1.10, it is necessary to handle them
in insns_not_supported() to avoid putting a breakpoint in the middle of
a ll/sc atomic sequence, otherwise it will loop forever for kprobes and
uprobes.
fwnode_graph_get_next_subnode() may return fwnode backed by ACPI
device nodes and there has been no check these devices are present
in the system, unlike there has been on fwnode OF backend.
In order to provide consistent behaviour towards callers,
add a check for device presence by introducing
a new function acpi_get_next_present_subnode(), used as the
get_next_child_node() fwnode operation that also checks device
node presence.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251001102636.1272722-2-sakari.ailus@linux.intel.com
[ rjw: Kerneldoc comment and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When a UTP error occurs in isolation, UFS is not currently recoverable.
This is because the UTP error is not considered fatal in the error
handling code, leading to either an I/O timeout or an OCS error.
Add the UTP error flag to INT_FATAL_ERRORS so the controller will be
reset in this situation.
sd 0:0:0:0: [sda] tag#38 UNKNOWN(0x2003) Result: hostbyte=0x07
driverbyte=DRIVER_OK cmd_age=0s
sd 0:0:0:0: [sda] tag#38 CDB: opcode=0x28 28 00 00 51 24 e2 00 00 08 00
I/O error, dev sda, sector 42542864 op 0x0:(READ) flags 0x80700 phys_seg
8 prio class 2
OCS error from controller = 9 for tag 39
pa_err[1] = 0x80000010 at 2667224756 us
pa_err: total cnt=2
dl_err[0] = 0x80000002 at 2667148060 us
dl_err[1] = 0x80002000 at 2667282844 us
No record of nl_err
No record of tl_err
No record of dme_err
No record of auto_hibern8_err
fatal_err[0] = 0x804 at 2667282836 us
---------------------------------------------------
REGISTER
---------------------------------------------------
NAME OFFSET VALUE
STD HCI SFR 0xfffffff0 0x0
AHIT 0x18 0x814
INTERRUPT STATUS 0x20 0x1000
INTERRUPT ENABLE 0x24 0x70ef5
[mkp: commit desc]
Signed-off-by: Hoyoung Seo <hy50.seo@samsung.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Message-Id: <20250930061428.617955-1-hy50.seo@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
During initialization, the EDVD_COREx_VOLT_FREQ registers for some cores
are still at reset values and not reflecting the actual frequency. This
causes get calls to fail. Set all cores to their respective max
frequency during probe to initialize the registers to working values.
Suggested-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Aaron Kling <webgeek1234@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The NTB epf host driver assumes the BAR number associated with a memory
window is just incremented from the BAR number associated with MW1. This
seems to have been enough so far but this is not really how the endpoint
side work and the two could easily become mis-aligned.
ntb_epf_mw_to_bar() even assumes that the BAR number is the memory window
index + 2, which means the function only returns a proper result if BAR_2
is associated with MW1.
Instead, fully describe and allow arbitrary NTB BAR mapping.
The output clock register offset used in clk_wzrd_register_output_clocks
was incorrectly referencing 0x3C instead of 0x38, which caused
misconfiguration of output dividers on Versal platforms.
Correcting the off-by-one error ensures proper configuration of output
clocks.
For some of the SCMI based platforms, the oem extended config may be
supported, but not for duty cycle purpose. Skip the duty cycle ops if
err return when trying to get duty cycle info.
Signed-off-by: Jacky Bai <ping.bai@nxp.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
As described in AM335x Errata Advisory 1.0.42, WKUP_DEBUGSS_CLKCTRL
can't be disabled - the clock module will just be stuck in transitioning
state forever, resulting in the following warning message after the wait
loop times out:
l3-aon-clkctrl:0000:0: failed to disable
Just add the clock to enable_init_clks, so no attempt is made to disable
it.
Signed-off-by: Matthias Schiffer <matthias.schiffer@tq-group.com> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Acked-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
p9_read_work() doesn't set Rworksched and doesn't do schedule_work(m->rq)
if list_empty(&m->req_list).
However, if the pipe is full, we need to read more data and this used to
work prior to commit aaec5a95d59615 ("pipe_read: don't wake up the writer
if the pipe is still full").
p9_read_work() does p9_fd_read() -> ... -> anon_pipe_read() which (before
the commit above) triggered the unnecessary wakeup. This wakeup calls
p9_pollwake() which kicks p9_poll_workfn() -> p9_poll_mux(), p9_poll_mux()
will notice EPOLLIN and schedule_work(&m->rq).
This no longer happens after the optimization above, change p9_fd_request()
to use p9_poll_mux() instead of only checking for EPOLLOUT.
This register is important for sequencing the commands to PLLs, so
actually write the update bits with regmap_write_bits() instead of
relying on a read/modify/write regmap command that could skip the actual
hardware write if the value is identical to the one read.
It's changed when modification is needed to the PLL, when
read-only operation is done, we could keep the call to
regmap_update_bits().
Add a comment to the sam9x60_div_pll_set_div() function that uses this
PLL_UPDT register so that it's used consistently, according to the
product's datasheet.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Tested-by: Ryan Wanner <ryan.wanner@microchip.com> # on sama7d65 and sam9x75 Link: https://lore.kernel.org/r/20250827150811.82496-1-nicolas.ferre@microchip.com
[claudiu.beznea: fix "Alignment should match open parenthesis"
checkpatch.pl check] Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Sasha Levin <sashal@kernel.org>
A potential divider for the master clock is div/3. The register
configuration for div/3 is MASTER_PRES_MAX. The current bit shifting
method does not work for this case. Checking for MASTER_PRES_MAX will
ensure the correct decimal value is stored in the system.
Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
PCF2127 can generate interrupt every full second or minute configured
from control and status register 1, bits MI (1) and SI (0).
On interrupt control register 2 bit MSF (7) is set and must be cleared
to continue normal operation.
While the driver never enables this interrupt on its own, users or
firmware may do so - e.g. as an easy way to test the interrupt.
Add preprocessor definition for MSF bit and include it in the irq
bitmask to ensure minute and second interrupts are cleared when fired.
This fixes an issue where the rtc enters a test mode and becomes
unresponsive after a second interrupt has fired and is not cleared in
time. In this state register writes to control registers have no
effect and the interrupt line is kept asserted [1]:
[1] userspace commands to put rtc into unresponsive state:
$ i2cget -f -y 2 0x51 0x00
0x04
$ i2cset -f -y 2 0x51 0x00 0x05 # set bit 0 SI
$ i2cget -f -y 2 0x51 0x00
0x84 # bit 8 EXT_TEST set
$ i2cset -f -y 2 0x51 0x00 0x05 # try overwrite control register
$ i2cget -f -y 2 0x51 0x00
0x84 # no change
The A523's RTC block is backward compatible with the R329's, but it also
has a calibration function for its internal oscillator, which would
allow it to provide a clock rate closer to the desired 32.768 KHz. This
is useful on the Radxa Cubie A5E, which does not have an external 32.768
KHz crystal.
Add the missing option name in the help message. Additionally,
switch to __uml_help(), because this is a global option rather
than a per-channel option.
This field is unused, but the correct structure size is needed
when computing the amount of space for the output argument to
reside, so that it does not cross a page boundary.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The HV_ACCESS_TSC_INVARIANT bit is always zero when Linux runs as the
root partition. The root partition will see directly what the hardware
provides.
The old logic in ms_hyperv_init_platform caused the native TSC clock
source to be incorrectly marked as unstable on x86. Fix it.
Skip the unnecessary checks in code for the root partition. Add one
extra comment in code to clarify the behavior.
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The function call new_inode() is a primitive for allocating an inode in memory,
rather than planning disk space for it. Therefore, -ENOMEM should be returned
as the error code rather than -ENOSPC.
To be specific, new_inode()'s call path looks like this:
new_inode
new_inode_pseudo
alloc_inode
ops->alloc_inode (hpfs_alloc_inode)
alloc_inode_sb
kmem_cache_alloc_lru
Therefore, the failure of new_inode() indicates a memory presure issue (-ENOMEM),
not a lack of disk space. However, the current implementation of
hpfs_mkdir/create/mknod/symlink incorrectly returns -ENOSPC when new_inode() fails.
This patch fix this by set err to -ENOMEM before the goto statement.
BTW, we also noticed that other nested calls within these four functions,
like hpfs_alloc_f/dnode and hpfs_add_dirent, might also fail due to memory presure.
But similarly, only -ENOSPC is returned. Addressing these will involve code
modifications in other functions, and we plan to submit dedicated patches for these
issues in the future. For this patch, we focus on new_inode().
Prevent issues during reset deassertion by re-asserting the reset if a
timeout occurs when trying to deassert. This ensures the reset line is in a
known state and improves reliability for hardware that may not immediately
clear the reset monitor bit.
Rework nss_port5 to use the new multiple configuration implementation
and correctly fix the clocks for this port under some corner case.
In OpenWrt, this patch avoids intermittent dmesg errors of the form
nss_port5_rx_clk_src: rcg didn't update its configuration.
This is a mechanical, straightforward port of
commit e88f03230dc07aa3293b6aeb078bd27370bb2594
("clk: qcom: gcc-ipq8074: rework nss_port5/6 clock to multiple conf")
to gcc-ipq6018, with two conflicts resolved: different frequency of the
P_XO clock source, and only 5 Ethernet ports.
This was originally developed by JiaY-shi <shi05275@163.com>.
In btrfs_fallocate(), when the allocated range overlaps with a prealloc
extent and the extent starts after i_size, the range doesn't get marked
dirty in file_extent_tree. This results in persisting an incorrect
disk_i_size for the inode when not using the no-holes feature.
This is reproducible since commit 41a2ee75aab0 ("btrfs: introduce
per-inode file extent tree"), then became hidden since commit 3d7db6e8bd22
("btrfs: don't allocate file extent tree for non regular files") and then
visible again after commit 8679d2687c35 ("btrfs: initialize
inode::file_extent_tree after i_mode has been set"), which fixes the
previous commit.
When btrfs_add_qgroup_relation() is called with invalid qgroup levels
(src >= dst), the function returns -EINVAL directly without freeing the
preallocated qgroup_list structure passed by the caller. This causes a
memory leak because the caller unconditionally sets the pointer to NULL
after the call, preventing any cleanup.
The issue occurs because the level validation check happens before the
mutex is acquired and before any error handling path that would free
the prealloc pointer. On this early return, the cleanup code at the
'out' label (which includes kfree(prealloc)) is never reached.
In btrfs_ioctl_qgroup_assign(), the code pattern is:
prealloc = kzalloc(sizeof(*prealloc), GFP_KERNEL);
ret = btrfs_add_qgroup_relation(trans, sa->src, sa->dst, prealloc);
prealloc = NULL; // Always set to NULL regardless of return value
...
kfree(prealloc); // This becomes kfree(NULL), does nothing
When the level check fails, 'prealloc' is never freed by either the
callee or the caller, resulting in a 64-byte memory leak per failed
operation. This can be triggered repeatedly by an unprivileged user
with access to a writable btrfs mount, potentially exhausting kernel
memory.
Fix this by freeing prealloc before the early return, ensuring prealloc
is always freed on all error paths.
Fixes: 4addc1ffd67a ("btrfs: qgroup: preallocate memory before adding a relation") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Shardul Bankar <shardulsb08@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When QP wraps around, WQE data from the previous use at the same
position still remains as driver does not clear it. The WQE field
layout differs across different opcodes, causing that the fields
that are not explicitly assigned for the current opcode retain
stale values, and are issued to HW by mistake. Such fields are as
follows:
* MSG_START_SGE_IDX field in ATOMIC WQE
* BLOCK_SIZE and ZBVA fields in FRMR WQE
* DirectWQE fields when DirectWQE not used
For ATOMIC WQE, always set the latest sge index in MSG_START_SGE_IDX
as required by HW.
For FRMR WQE and DirectWQE, clear only those unassigned fields
instead of the entire WQE to avoid performance penalty.
The actual sge number may exceed the value specified in init_attr->cap
when HW needs extra sge to enable inline feature. Since these extra
sges are not expected by ULP, return the user-specified value to ULP
instead of the expanded sge number.
Currently driver enforces affinity between QP cache and send CQ
cache, which helps improve the performance of sending, but doesn't
set affinity with recv CQ cache, resulting in suboptimal performance
of receiving.
Use one CQ bank per context to ensure the affinity among QP, send CQ
and recv CQ. For kernel ULP, CQ bank is fixed to 0.
In `UVERBS_METHOD_CQ_CREATE`, umem should be released if anything goes
wrong. Currently, if `create_cq_umem` fails, umem would not be
released or referenced, causing a possible leak.
In this patch, we release umem at `UVERBS_METHOD_CQ_CREATE`, the driver
should not release umem if it returns an error code.
Fixes: 1a40c362ae26 ("RDMA/uverbs: Add a common way to create CQ with umem") Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Link: https://patch.msgid.link/aOh1le4YqtYwj-hH@osx.local Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The driver maintains a CQ table that is used to ensure that a CQ is
still valid when processing CQ related AEs. When a CQ is destroyed,
the table entry is cleared, using irdma_cq.cq_num as the index. This
field was never being set, so it was just always clearing out entry
0.
Additionally, the cq_num field size was increased to accommodate HW
supporting more than 64K CQs.