The old IC does not support the I2C_MASTER_WRRD (write-then-read)
function, but the current code’s handling of i2c->auto_restart may
potentially lead to entering the I2C_MASTER_WRRD software flow,
resulting in unexpected bugs.
Instead of repurposing the auto_restart flag, add a separate flag
to signal I2C_MASTER_WRRD operations.
Also fix handling of msgs. If the operation (i2c->op) is
I2C_MASTER_WRRD, then the msgs pointer is incremented by 2.
For all other operations, msgs is simply incremented by 1.
As reported by LKP, the Qualcomm LMH driver needs to include several
IRQ-related headers, which decrlare necessary IRQ functionality.
Currently driver builds on ARM64 platforms, where the headers are pulled
in implicitly by other headers, but fails to build on other platforms.
Fixes: 53bca371cdf7 ("thermal/drivers/qcom: Add support for LMh driver") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507270042.KdK0KKht-lkp@intel.com/ Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20250728-lmh-scm-v2-2-33bc58388ca5@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The QCOM_SCM symbol is not user-visible, so it makes little sense to
depend on it. Make LMH driver select QCOM_SCM as all other drivers do
and, as the dependecy is now correctly handled, enable || COMPILE_TEST
in order to include the driver into broader set of build tests.
Fixes: 9e5a4fb84230 ("thermal/drivers/qcom/lmh: make QCOM_LMH depends on QCOM_SCM") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20250728-lmh-scm-v2-1-33bc58388ca5@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Distinct between fan speed setting request coming for hwmon and
thermal subsystems.
There are fields 'last_hwmon_state' and 'last_thermal_state' in the
structure 'mlxreg_fan_pwm', which respectively store the cooling state
set by the 'hwmon' and 'thermal' subsystem.
The purpose is to make arbitration of fan speed setting. For example, if
fan speed required to be not lower than some limit, such setting is to
be performed through 'hwmon' subsystem, thus 'thermal' subsystem will
not set fan below this limit.
Currently, the 'last_thermal_state' is also be updated by 'hwmon' causing
cooling state to never be set to a lower value.
Eliminate update of 'last_thermal_state', when request is coming from
'hwmon' subsystem.
Fixes: da74944d3a46 ("hwmon: (mlxreg-fan) Use pwm attribute for setting fan speed low limit") Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20250113084859.27064-2-vadimp@nvidia.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit d5094bcb5bfd ("tools/nolibc: define time_t in terms of
__kernel_old_time_t") made nolibc use the kernel's time type so that
`time_t` matches `timespec::tv_sec` on all ABIs (notably x32).
But since __kernel_old_time_t is fairly new, notably from 2020 in commit 94c467ddb273 ("y2038: add __kernel_old_timespec and __kernel_old_time_t"),
nolibc builds that rely on host headers may fail.
Switch to __kernel_time_t, which is the same as __kernel_old_time_t and
has existed for longer.
Tested in PPC VM of Open Source Lab of Oregon State University
(./tools/testing/selftests/rcutorture/bin/mkinitrd.sh)
Fixes: d5094bcb5bfd ("tools/nolibc: define time_t in terms of __kernel_old_time_t") Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
[Thomas: Reformat commit and its message a bit] Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The smp_call_function_many() kerneldoc comment got out of sync with the
function definition (bool parameter "wait" is incorrectly described as a
bitmask in it), so fix it up by copying the "wait" description from the
smp_call_function() kerneldoc and add information regarding the handling
of the local CPU to it.
Fixes: 49b3bd213a9f ("smp: Fix all kernel-doc warnings") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Syzkaller found a kernel warning on the following sock_addr program:
0: r0 = 0
1: r2 = *(u32 *)(r1 +60)
2: exit
which triggers:
verifier bug: error during ctx access conversion (0)
This is happening because offset 60 in bpf_sock_addr corresponds to an
implicit padding of 4 bytes, right after msg_src_ip4. Access to this
padding isn't rejected in sock_addr_is_valid_access and it thus later
fails to convert the access.
This patch fixes it by explicitly checking the various fields of
bpf_sock_addr in sock_addr_is_valid_access.
I checked the other ctx structures and is_valid_access functions and
didn't find any other similar cases. Other cases of (properly handled)
padding are covered in new tests in a subsequent patch.
Fixes: 1cedee13d25a ("bpf: Hooks for sys_sendmsg") Reported-by: syzbot+136ca59d411f92e821b7@syzkaller.appspotmail.com Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://syzkaller.appspot.com/bug?extid=136ca59d411f92e821b7 Link: https://lore.kernel.org/bpf/b58609d9490649e76e584b0361da0abd3c2c1779.1758094761.git.paul.chaignon@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Check if watchdog device supports WDIOF_KEEPALIVEPING option before
entering keep_alive() ping test loop. Fix watchdog-test silently looping
if ioctl based ping is not supported by the device. Exit from test in
such case instead of getting stuck in loop executing failing keep_alive()
watchdog_info:
identity: m41t93 rtc Watchdog
firmware_version: 0
Support/Status: Set timeout (in seconds)
Support/Status: Watchdog triggers a management or other external alarm not a reboot
Watchdog card disabled.
Watchdog timeout set to 5 seconds.
Watchdog ping rate set to 2 seconds.
Watchdog card enabled.
WDIOC_KEEPALIVE not supported by this device
without this change
Watchdog card disabled.
Watchdog timeout set to 5 seconds.
Watchdog ping rate set to 2 seconds.
Watchdog card enabled.
Watchdog Ticking Away!
(Where test stuck here forver silently)
Updated change log at commit time:
Shuah Khan <skhan@linuxfoundation.org>
In svc_i3c_master_handle_ibi(), an IBI slot is fetched from the pool
to store the IBI payload. However, when an error condition is encountered,
the function returns without recycling the IBI slot, resulting in an IBI
slot leak.
Fixes: c85e209b799f ("i3c: master: svc: fix ibi may not return mandatory data byte") Signed-off-by: Stanley Chu <yschu@nuvoton.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20250829012309.3562585-3-yschu@nuvoton.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Driver wants to nack the IBI request when the target is not in the
known address list. In below code, svc_i3c_master_nack_ibi() will
cause undefined behavior when using AUTOIBI with auto response rule,
because hw always auto ack the IBI request.
switch (ibitype) {
case SVC_I3C_MSTATUS_IBITYPE_IBI:
dev = svc_i3c_master_dev_from_addr(master, ibiaddr);
if (!dev || !is_events_enabled(master, SVC_I3C_EVENT_IBI))
svc_i3c_master_nack_ibi(master);
...
break;
AutoIBI has another issue that the controller doesn't quit AutoIBI state
after IBIWON polling timeout when there is a SDA glitch(high->low->high).
1. SDA high->low: raising an interrupt to execute IBI ISR
2. SDA low->high
3. Driver writes an AutoIBI request
4. AutoIBI process does not start because SDA is not low
5. IBIWON polling times out
6. Controller reamins in AutoIBI state and doesn't accept EmitStop request
Emitting broadcast address with IBIRESP_MANUAL avoids both issues.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Signed-off-by: Stanley Chu <yschu@nuvoton.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20250829012309.3562585-2-yschu@nuvoton.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It’s possible for more than one async command to be in flight from
__nvmet_fc_send_ls_req. For each command, a tgtport reference is taken.
In the current code, only one put work item is queued at a time, which
results in a leaked reference.
To fix this, move the work item to the nvmet_fc_ls_req_op struct, which
already tracks all resources related to the command.
Fixes: 710c69dbaccd ("nvmet-fc: avoid deadlock on delete association path") Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <wagi@kernel.org> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The function set_prescale_div() is responsible for calculating the clock
divisor settings such that the input clock rate is divided down such that
the required period length is at most 0x10000 clock ticks. If period_cycles
is an integer multiple of 0x10000, the divisor period_cycles / 0x10000 is
good enough. So round up in the calculation of the required divisor and
compare it using >= instead of >.
This devicetree contained only the SoC compatible but lacked the
machine specific one: add a "mediatek,mt8516-pumpkin" compatible
to the list to fix dtbs_check warnings.
Make sure to drop the reference to the saw device taken by
of_find_device_by_node() after retrieving its driver data during
probe().
Also drop the reference to the CPU node sooner to avoid leaking it in
case there is no saw node or device.
Fixes: 60f3692b5f0b ("cpuidle: qcom_spm: Detach state machine from main SPM handling") Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When executing modinfo null_blk, there is an error in the description
of module parameter mbps, and the output information of cache_size is
incomplete.The output of modinfo before and after applying this patch
is as follows:
Before:
[...]
parm: cache_size:ulong
[...]
parm: mbps:Cache size in MiB for memory-backed device.
Default: 0 (none) (uint)
[...]
After:
[...]
parm: cache_size:Cache size in MiB for memory-backed device.
Default: 0 (none) (ulong)
[...]
parm: mbps:Limit maximum bandwidth (in MiB/s).
Default: 0 (no limit) (uint)
[...]
Fixes: 058efe000b31 ("null_blk: add module parameters for 4 options") Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Change the 'ret' variable in sh_pfc_pinconf_group_set() from unsigned
int to int, as it needs to store either negative error codes or zero
returned by sh_pfc_pinconf_set().
No effect on runtime.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Fixes: d0593c363f04ccc4 ("pinctrl: sh-pfc: Propagate errors on group config") Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/20250831084958.431913-4-rongqianfeng@vivo.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix the checkpatch warning:
CHECK: Alignment should match open parenthesis
Fixes: 0cb172a4918e ("power: supply: cw2015: Use device managed API to simplify the code") Signed-off-by: Andy Yan <andyshrk@163.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The drv->sram_reg pointer could be set to ERR_PTR(-EPROBE_DEFER) which
would lead to a error pointer dereference. Use IS_ERR_OR_NULL() to check
that the pointer is valid.
If system suspend is aborted in the "noirq" phase (for instance, due to
an error returned by one of the device callbacks), power.is_noirq_suspended
will not be set for some devices and device_resume_noirq() will return
early for them. Consequently, noirq resume callbacks will not run for
them at all because the noirq suspend callbacks have not run for them
yet.
If any of them has power.must_resume set and late suspend has been
skipped for it (due to power.smart_suspend), early resume should be
skipped for it either, or its state may become inconsistent (for
instance, if the early resume assumes that it will always follow
noirq resume).
Make that happen by clearing power.must_resume in device_resume_noirq()
for devices with power.is_noirq_suspended clear that have been left in
suspend by device_suspend_late(), which will subsequently cause
device_resume_early() to leave the device in suspend and avoid
changing its state.
Change the 'ret' variable in blk_stack_limits() from unsigned int to int,
as it needs to store negative value -1.
Storing the negative error codes in unsigned type, or performing equality
comparisons (e.g., ret == -1), doesn't cause an issue at runtime [1] but
can be confusing. Additionally, assigning negative error codes to unsigned
type may trigger a GCC warning when the -Wsign-conversion flag is enabled.
Change the 'ret' variable from u32 to int to store negative error codes or
zero returned by of_property_read_u32().
Storing the negative error codes in unsigned type, doesn't cause an issue
at runtime but it's ugly as pants. Additionally, assigning negative error
codes to unsigned type may trigger a GCC warning when the -Wsign-conversion
flag is enabled.
No effect on runtime.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Fixes: 0fbeae70ee7c ("regulator: add SCMI driver") Link: https://patch.msgid.link/20250829101411.625214-1-rongqianfeng@vivo.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The at91_mckx_ps_restore() assembly function is responsible for setting
back MCKx system bus clocks after exiting low power modes.
Fix a typo and use tmp3 variable instead of tmp2 to correctly set MCKx
to previously saved state.
Tmp2 was used without the needed changes in CSS and DIV. Moreover the
required bit 7, telling that MCR register's content is to be changed
(CMD/write), was not set.
Fix function comment to match tmp variables actually used.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Fixes: 28eb1d40fe57 ("ARM: at91: pm: add support for MCK1..4 save/restore for ulp modes") Link: https://lore.kernel.org/r/20250827145427.46819-3-nicolas.ferre@microchip.com Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
[claudiu.beznea: s/sate/state in commit description] Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Sasha Levin <sashal@kernel.org>
In __blk_mq_update_nr_hw_queues() the return value of
blk_mq_sysfs_register_hctxs() is not checked. If sysfs creation for hctx
fails, later changing the number of hw_queues or removing disk will
trigger the following warning:
kernfs: can not remove 'nr_tags', no directory
WARNING: CPU: 2 PID: 637 at fs/kernfs/dir.c:1707 kernfs_remove_by_name_ns+0x13f/0x160
Call Trace:
remove_files.isra.1+0x38/0xb0
sysfs_remove_group+0x4d/0x100
sysfs_remove_groups+0x31/0x60
__kobject_del+0x23/0xf0
kobject_del+0x17/0x40
blk_mq_unregister_hctx+0x5d/0x80
blk_mq_sysfs_unregister_hctxs+0x94/0xd0
blk_mq_update_nr_hw_queues+0x124/0x760
nullb_update_nr_hw_queues+0x71/0xf0 [null_blk]
nullb_device_submit_queues_store+0x92/0x120 [null_blk]
kobjct_del() was called unconditionally even if sysfs creation failed.
Fix it by checkig the kobject creation statusbefore deleting it.
Fixes: 477e19dedc9d ("blk-mq: adjust debugfs and sysfs register when updating nr_hw_queues") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250826084854.1030545-1-linan666@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Unconditionally clear the TCS_AMC_MODE_TRIGGER bit when a
transaction completes. Previously this bit was only cleared when
a wake TCS was borrowed as an AMC TCS but not for dedicated
AMC TCS. Leaving this bit set for AMC TCS and entering deeper low
power modes can generate a false completion IRQ.
Prevent this scenario by always clearing the TCS_AMC_MODE_TRIGGER
bit upon receiving a completion IRQ.
Fixes: 15b3bf61b8d4 ("soc: qcom: rpmh-rsc: Clear active mode configuration for wake TCS") Signed-off-by: Sneh Mankad <sneh.mankad@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250825-rpmh_rsc_change-v1-1-138202c31bf6@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The cpuidle device's memory is leaked when cpuidle device registration
fails in acpi_processor_power_init(). Free it as appropriate.
Fixes: 3d339dcbb56d ("cpuidle / ACPI : move cpuidle_device field out of the acpi_processor_power structure") Signed-off-by: Huisong Li <lihuisong@huawei.com> Link: https://patch.msgid.link/20250728070612.1260859-2-lihuisong@huawei.com
[ rjw: Changed the order of the new statements, added empty line after if () ]
[ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Broadcom STB platforms were early adopters (2017) of the SCMI framework and as
a result, not all deployed systems have a Device Tree entry where SCMI
protocol 0x13 (PERFORMANCE) is declared as a clock provider, nor are the
CPU Device Tree node(s) referencing protocol 0x13 as their clock
provider. This was clarified in commit e11c480b6df1 ("dt-bindings:
firmware: arm,scmi: Extend bindings for protocol@13") in 2023.
For those platforms, we allow the checks done by scmi_dev_used_by_cpus()
to continue, and in the event of not having done an early return, we key
off the documented compatible string and give them a pass to continue to
use scmi-cpufreq.
Fixes: 6c9bb8692272 ("cpufreq: scmi: Skip SCMI devices that aren't used by the CPUs") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Previously, re-using pinned DEVMAP maps would always fail, because
get_map_info on a DEVMAP always returns flags with BPF_F_RDONLY_PROG set,
but BPF_F_RDONLY_PROG being set on a map during creation is invalid.
Thus, ignore the BPF_F_RDONLY_PROG flag in the flags returned from
get_map_info when checking for compatibility with an existing DEVMAP.
The same problem is handled in a third-party ebpf library:
- https://github.com/cilium/ebpf/issues/925
- https://github.com/cilium/ebpf/pull/930
Graph tracer framework ensures we won't migrate, kprobe_multi_link_prog_run
called all the way from graph tracer, which disables preemption in
function_graph_enter_regs, as Jiri and Yonghong suggested, there is no
need to use migrate_disable. As a result, some overhead may will be reduced.
And add cant_sleep check for __this_cpu_inc_return.
Based on a bisect, it appears that commit 7ee988770326 ("timers:
Implement the hierarchical pull model") has somehow inadvertently
broken BPF selftest test_tcpnotify_user. The error that is being
generated by this test is as follows:
FAILED: Wrong stats Expected 10 calls, got 8
It looks like the change allows timer functions to be run on CPUs
different from the one they are armed on. The test had pinned itself
to CPU 0, and in the past the retransmit attempts also occurred on CPU
0. The test had set the max_entries attribute for
BPF_MAP_TYPE_PERF_EVENT_ARRAY to 2 and was calling
bpf_perf_event_output() with BPF_F_CURRENT_CPU, so the entry was
likely to be in range. With the change to allow timers to run on other
CPUs, the current CPU tasked with performing the retransmit might be
bumped and in turn fall out of range, as the event will be filtered
out via __bpf_perf_event_output() using:
if (unlikely(index >= array->map.max_entries))
return -E2BIG;
A possible change would be to explicitly set the max_entries attribute
for perf_event_map in test_tcpnotify_kern.c to a value that's at least
as large as the number of CPUs. As it turns out however, if the field
is left unset, then the libbpf will determine the number of CPUs available
on the underlying system and update the max_entries attribute accordingly
in map_set_def_max_entries().
A further problem with the test is that it has a thread that continues
running up until the program exits. The main thread cleans up some
LIBBPF data structures, while the other thread continues to use them,
which inevitably will trigger a SIGSEGV. This can be dealt with by
telling the thread to run for as long as necessary and doing a
pthread_join on it before exiting the program.
Finally, I don't think binding the process to CPU 0 is meaningful for
this test any more, so get rid of that.
Fixes: 435f90a338ae ("selftests/bpf: add a test case for sock_ops perf-event notification") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/aJ8kHhwgATmA3rLf@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
On RZ/G2LC SMARC EVK, CAN-FD channel0 is not populated, and currently we
are deleting a wrong and nonexistent node. Fixing the wrong node would
invoke a dtb warning message, as channel0 is a required property.
Disable CAN-FD channel0 instead of deleting the node.
Already do real negotiation in smb_direct_handle_connect_request()
where we see the requested initiator_depth and responder_resources
from the client.
We should detect legacy iwarp clients using MPA v1
with the custom IRD/ORD negotiation.
We need to send the custom IRD/ORD in big endian,
but we need to try to let clients with broken requests
using little endian (older cifs.ko) to work.
Note the reason why this uses u8 for
initiator_depth and responder_resources is
that the rdma layer also uses it.
Acked-by: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: linux-rdma@vger.kernel.org Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Cast nr_pages to unsigned long to avoid overflow when handling large
AUX buffer sizes (>= 2 GiB).
Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix -Wunused-result warning generated when compiled with gcc 13.3.0,
by checking fread's return value and handling errors, preventing
potential failures when reading from stdin.
Fixes compiler warning:
warning: ignoring return value of 'fread' declared with attribute
'warn_unused_result' [-Wunused-result]
Fixes: 806a15b2545e ("kselftests/arm64: add PAuth test for whether exec() changes keys") Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Normally the tracee starts in SECCOMP_NOTIFY_INIT, sends an
event to the tracer, and starts to wait interruptibly. With
SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV, if the tracer receives the
message (SECCOMP_NOTIFY_SENT is reached) while the tracee was waiting
and is subsequently interrupted, the tracee begins to wait again
uninterruptibly (but killable).
This fails if SECCOMP_NOTIFY_REPLIED is reached before the tracee
is interrupted, as the check only considered SECCOMP_NOTIFY_SENT as a
condition to begin waiting again. In this case the tracee is interrupted
even though the tracer already acted on its behalf. This breaks the
assumption SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV wanted to ensure,
namely that the tracer can be sure the syscall is not interrupted or
restarted on the tracee after it is received on the tracer. Fix this
by also considering SECCOMP_NOTIFY_REPLIED when evaluating whether to
switch to uninterruptible waiting.
With the condition changed the loop in seccomp_do_user_notification()
would exit immediately after deciding that noninterruptible waiting
is required if the operation already reached SECCOMP_NOTIFY_REPLIED,
skipping the code that processes pending addfd commands first. Prevent
this by executing the remaining loop body one last time in this case.
Fixes: c2aa2dfef243 ("seccomp: Add wait_killable semantic to seccomp user notifier") Reported-by: Ali Polatel <alip@chesswob.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220291 Signed-off-by: Johannes Nixdorf <johannes@nixdorf.dev> Link: https://lore.kernel.org/r/20250725-seccomp-races-v2-1-cf8b9d139596@nixdorf.dev Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
INITRAMFS_PRESERVE_MTIME is only used in init/initramfs.c and
init/initramfs_test.c. Hence add a dependency on BLK_DEV_INITRD, to
prevent asking the user about this feature when configuring a kernel
without initramfs support.
Fixes: 1274aea127b2e8c9 ("initramfs: add INITRAMFS_PRESERVE_MTIME Kconfig option") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin Wilck <mwilck@suse.com> Reviewed-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
spin_lock(&m->req_lock)
...
// second remove
list_del(&req->req_list);
spin_unlock(&m->req_lock)
...
Commit 74d6a5d56629 ("9p/trans_fd: Fix concurrency del of req_list in
p9_fd_cancelled/p9_read_work") fixes a concurrency issue in the 9p filesystem
client where the req_list could be deleted simultaneously by both
p9_read_work and p9_fd_cancelled functions, but for the case where req->status
equals REQ_STATUS_RCVD.
Update the check for req->status in p9_fd_cancelled to skip processing not
just received requests, but anything that is not SENT, as whatever
changed the state from SENT also removed the request from its list.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: afd8d6541155 ("9P: Add cancelled() to the transport functions.") Cc: stable@vger.kernel.org Signed-off-by: Nalivayko Sergey <Sergey.Nalivayko@kaspersky.com>
Message-ID: <20250715154815.3501030-1-Sergey.Nalivayko@kaspersky.com>
[updated the check from status == RECV || status == ERROR to status != SENT] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Devices with power.no_pm set are not expected to need any power
management at all, so modify device_set_pm_not_required() to set
power.no_callbacks for them too in case runtime PM will be enabled
for any of them (which in principle may be done for convenience if
such a device participates in a dependency chain).
Since device_set_pm_not_required() must be called before device_add()
or it would not have any effect, it can update power.no_callbacks
without locking, unlike pm_runtime_no_callbacks() that can be called
after registering the target device.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable <stable@kernel.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/1950054.tdWV9SEqCh@rafael.j.wysocki Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Flush stale data from the RX FIFO in case of errors, to avoid reading
old data when new packets arrive.
Commit c6e8d85fafa7 ("staging: axis-fifo: Remove hardware resets for
user errors") removed full FIFO resets from the read error paths, which
fixed potential TX data losses, but introduced this RX issue.
If copy_from_user() fails, write() currently returns -EFAULT, but any
partially written data leaves the TX FIFO in an inconsistent state.
Subsequent write() calls then fail with "transmit length mismatch"
errors.
Once partial data is written to the hardware FIFO, it cannot be removed
without a TX reset. Commit c6e8d85fafa7 ("staging: axis-fifo: Remove
hardware resets for user errors") removed a full FIFO reset for this case,
which fixed a potential RX data loss, but introduced this TX issue.
Fix this by introducing a bounce buffer: copy the full packet from
userspace first, and write to the hardware FIFO only if the copy
was successful.
Since commit 2ca34b508774 ("staging: axis-fifo: Correct handling of
tx_fifo_depth for size validation"), write() operations with packets
larger than 'tx_fifo_depth - 4' words are no longer rejected with -EINVAL.
Fortunately, the packets are not actually getting transmitted to hardware,
otherwise they would be raising a 'Transmit Packet Overrun Error'
interrupt, which requires a reset of the TX circuit to recover from.
Instead, the request times out inside wait_event_interruptible_timeout()
and always returns -EAGAIN, since the wake up condition can never be true
for these packets. But still, they unnecessarily block other tasks from
writing to the FIFO and the EAGAIN return code signals userspace to retry
the write() call, even though it will always fail and time out.
According to the AXI4-Stream FIFO reference manual (PG080), the maximum
valid packet length is 'tx_fifo_depth - 4' words, so attempting to send
larger packets is invalid and should not be happening in the first place:
> The maximum packet that can be transmitted is limited by the size of
> the FIFO, which is (C_TX_FIFO_DEPTH–4)*(data interface width/8) bytes.
Therefore, bring back the old behavior and outright reject packets larger
than 'tx_fifo_depth - 4' with -EINVAL. Add a comment to explain why the
check is necessary. The dev_err() message was removed to avoid cluttering
the dmesg log if an invalid packet is received from userspace.
As reported by syzbot, mcp2221_raw_event lacked
validation of incoming I2C read data sizes, risking buffer
overflows in mcp->rxbuf during multi-part transfers.
As highlighted in the DS20005565B spec, p44, we have:
"The number of read-back data bytes to follow in this packet:
from 0 to a maximum of 60 bytes of read-back bytes."
This patch enforces we don't exceed this limit.
Driver configures register to choose controller mode before
setting all channels to reset mode leading to failure.
The patch corrects operation of mode setting.
This issue is similar to the vulnerability in the `mcp251x` driver,
which was fixed in commit 03c427147b2d ("can: mcp251x: fix resume from
sleep before interface was brought up").
In the `hi311x` driver, when the device resumes from sleep, the driver
schedules `priv->restart_work`. However, if the network interface was
not previously enabled, the `priv->wq` (workqueue) is not allocated and
initialized, leading to a null pointer dereference.
To fix this, we move the allocation and initialization of the workqueue
from the `hi3110_open` function to the `hi3110_can_probe` function.
This ensures that the workqueue is properly initialized before it is
used during device resume. And added logic to destroy the workqueue
in the error handling paths of `hi3110_can_probe` and in the
`hi3110_can_remove` function to prevent resource leaks.
Syzbot hits a problem with enabled ref-verify, ignorebadroots and a
fuzzed/damaged extent tree. There's no fallback option like in other
places that can deal with it so disable the whole ref-verify as it is
just a debugging feature.
This happens when the perf binary is not named exactly "perf" or when
multiple "perf-*" binaries exist in the same directory. In such cases,
the `excludes` command list can be empty, which leads to the final
assertion in exclude_cmds() being triggered.
Add a simple guard at the beginning of exclude_cmds() to return early
if excludes->cnt is zero, preventing the crash.
The bodies of __signed_type_use() and __unsigned_type_use() are much the
same size as their names - so put the bodies in the only line that expands
them.
Similarly __signed_type() is defined separately for 64bit and then used
exactly once just below.
Change the test for __signed_type from CONFIG_64BIT to one based on gcc
defined macros so that the code is valid if it gets used outside of a
kernel build.
Link: https://lkml.kernel.org/r/9386d1ebb8974fbabbed2635160c3975@AcuMS.aculab.com Signed-off-by: David Laight <david.laight@aculab.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pedro Falcato <pedro.falcato@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
At some point the definitions for clamp() got added in the middle of the
ones for min() and max(). Re-order the definitions so they are more
sensibly grouped.
Link: https://lkml.kernel.org/r/8bb285818e4846469121c8abc3dfb6e2@AcuMS.aculab.com Signed-off-by: David Laight <david.laight@aculab.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pedro Falcato <pedro.falcato@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use BUILD_BUG_ON_MSG(statically_true(ulo > uhi), ...) for the sanity check
of the bounds in clamp(). Gives better error coverage and one less
expansion of the arguments.
Link: https://lkml.kernel.org/r/34d53778977747f19cce2abb287bb3e6@AcuMS.aculab.com Signed-off-by: David Laight <david.laight@aculab.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pedro Falcato <pedro.falcato@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the test for signed values being non-negative only relies on
__builtion_constant_p() (not is_constexpr()) it can use the 'ux' variable
instead of the caller supplied expression. This means that the #define
parameters are only expanded twice. Once in the code and once quoted in
the error message.
Link: https://lkml.kernel.org/r/051afc171806425da991908ed8688a98@AcuMS.aculab.com Signed-off-by: David Laight <david.laight@aculab.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pedro Falcato <pedro.falcato@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Change three to several.
- Remove the comment about retaining constant expressions, no longer true.
- Realign to nearer 80 columns and break on major punctiation.
- Add a leading comment to the block before __signed_type() and __is_nonneg()
Otherwise the block explaining the cast is a bit 'floating'.
Reword the rest of that comment to improve readability.
Link: https://lkml.kernel.org/r/85b050c81c1d4076aeb91a6cded45fee@AcuMS.aculab.com Signed-off-by: David Laight <david.laight@aculab.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pedro Falcato <pedro.falcato@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Laight pointed out that we should deal with the min3() and max3()
mess too, which still does excessive expansion.
And our current macros are actually rather broken.
In particular, the macros did this:
#define min3(x, y, z) min((typeof(x))min(x, y), z)
#define max3(x, y, z) max((typeof(x))max(x, y), z)
and that not only is a nested expansion of possibly very complex
arguments with all that involves, the typing with that "typeof()" cast
is completely wrong.
For example, imagine what happens in max3() if 'x' happens to be a
'unsigned char', but 'y' and 'z' are 'unsigned long'. The types are
compatible, and there's no warning - but the result is just random
garbage.
No, I don't think we've ever hit that issue in practice, but since we
now have sane infrastructure for doing this right, let's just use it.
It fixes any excessive expansion, and also avoids these kinds of broken
type issues.
This clarifies the rules for min()/max()/clamp() type checking and makes
them a much more efficient macro expansion.
In particular, we now look at the type and range of the inputs to see
whether they work together, generating a mask of acceptable comparisons,
and then just verifying that the inputs have a shared case:
- an expression with a signed type can be used for
(1) signed comparisons
(2) unsigned comparisons if it is statically known to have a
non-negative value
- an expression with an unsigned type can be used for
(3) unsigned comparison
(4) signed comparisons if the type is smaller than 'int' and thus
the C integer promotion rules will make it signed anyway
Here rule (1) and (3) are obvious, and rule (2) is important in order to
allow obvious trivial constants to be used together with unsigned
values.
Rule (4) is not necessarily a good idea, but matches what we used to do,
and we have extant cases of this situation in the kernel. Notably with
bcachefs having an expression like
where bch2_bucket_sectors_dirty() returns an 's64', and
'ca->mi.bucket_size' is of type 'u16'.
Technically that bcachefs comparison is clearly sensible on a C type
level, because the 'u16' will go through the normal C integer promotion,
and become 'int', and then we're comparing two signed values and
everything looks sane.
However, it's not entirely clear that a 'min(s64,u16)' operation makes a
lot of conceptual sense, and it's possible that we will remove rule (4).
After all, the _reason_ we have these complicated type checks is exactly
that the C type promotion rules are not very intuitive.
But at least for now the rule is in place for backwards compatibility.
Also note that rule (2) existed before, but is hugely relaxed by this
commit. It used to be true only for the simplest compile-time
non-negative integer constants. The new macro model will allow cases
where the compiler can trivially see that an expression is non-negative
even if it isn't necessarily a constant.
because our old 'min()' macro would see that 'pia[addr] & 0x3F' is of
type 'int' and clearly not a C constant expression, so doing a 'min()'
with a 'size_t' is a signedness violation.
Our new 'min()' macro still sees that 'pia[addr] & 0x3F' is of type
'int', but is smart enough to also see that it is clearly non-negative,
and thus would allow that case without any complaints.
Now that we no longer have any C constant expression contexts (ie array
size declarations or static initializers) that use min() or max(), we
can simpify the implementation by not having to worry about the result
staying as a C constant expression.
So now we can unconditionally just use temporary variables of the right
type, and get rid of the excessive expansion that used to come from the
use of
__builtin_choose_expr(__is_constexpr(...), ..
to pick the specialized code for constant expressions.
Another expansion simplification is to pass the temporary variables (in
addition to the original expression) to our __types_ok() macro. That
may superficially look like it complicates the macro, but when we only
want the type of the expression, expanding the temporary variable names
is much simpler and smaller than expanding the potentially complicated
original expression.
As a result, on my machine, doing a
$ time make drivers/staging/media/atomisp/pci/isp/kernels/ynr/ynr_1.0/ia_css_ynr.host.i
goes from
real 0m16.621s
user 0m15.360s
sys 0m1.221s
to
real 0m2.532s
user 0m2.091s
sys 0m0.452s
because the token expansion goes down dramatically.
In particular, the longest line expansion (which was line 71 of that
'ia_css_ynr.host.c' file) shrinks from 23,338kB (yes, 23MB for one
single line) to "just" 1,444kB (now "only" 1.4MB).
And yes, that line is still the line from hell, because it's doing
multiple levels of "min()/max()" expansion thanks to some of them being
hidden inside the uDIGIT_FITTING() macro.
Lorenzo has a nice cleanup patch that makes that driver use inline
functions instead of macros for sDIGIT_FITTING() and uDIGIT_FITTING(),
which will fix that line once and for all, but the 16-fold reduction in
this case does show why we need to simplify these helpers.
Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We only had a couple of array[] declarations, and changing them to just
use 'MAX()' instead of 'max()' fixes the issue.
This will allow us to simplify our min/max macros enormously, since they
can now unconditionally use temporary variables to avoid using the
argument values multiple times.
Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The state->timer is a cyclic timer that schedules work_i2c_poll and
delayed_work_enable_hotplug, while rearming itself. Using timer_delete()
fails to guarantee the timer isn't still running when destroyed, similarly
cancel_delayed_work() cannot ensure delayed_work_enable_hotplug has
terminated if already executing. During probe failure after timer
initialization, these may continue running as orphans and reference the
already-freed tc358743_state object through tc358743_irq_poll_timer.
Replace timer_delete() with timer_delete_sync() and cancel_delayed_work()
with cancel_delayed_work_sync() to ensure proper termination of timer and
work items before resource cleanup.
This bug was initially identified through static analysis. For reproduction
and testing, I created a functional emulation of the tc358743 device via a
kernel module and introduced faults through the debugfs interface.
Fixes: 869f38ae07f7 ("media: i2c: tc358743: Fix crash in the probe error path when using polling") Fixes: d32d98642de6 ("[media] Driver for Toshiba TC358743 HDMI to CSI-2 bridge") Cc: stable@vger.kernel.org Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
[ replaced del_timer() instead of timer_delete() ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The original code uses cancel_delayed_work() in xc5000_release(), which
does not guarantee that the delayed work item timer_sleep has fully
completed if it was already running. This leads to use-after-free scenarios
where xc5000_release() may free the xc5000_priv while timer_sleep is still
active and attempts to dereference the xc5000_priv.
A typical race condition is illustrated below:
CPU 0 (release thread) | CPU 1 (delayed work callback)
xc5000_release() | xc5000_do_timer_sleep()
cancel_delayed_work() |
hybrid_tuner_release_state(priv) |
kfree(priv) |
| priv = container_of() // UAF
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the timer_sleep is properly canceled before the xc5000_priv memory
is deallocated.
A deadlock concern was considered: xc5000_release() is called in a process
context and is not holding any locks that the timer_sleep work item might
also need. Therefore, the use of the _sync() variant is safe here.
This bug was initially identified through static analysis.
Make sure the firmware is released when we leave
xc_load_fw_and_init_tuner()
This change makes smatch happy:
drivers/media/tuners/xc5000.c:1213 xc_load_fw_and_init_tuner() warn: 'fw' from request_firmware() not released on lines: 1213.
Will Deacon [Fri, 3 Oct 2025 18:40:18 +0000 (19:40 +0100)]
KVM: arm64: Fix softirq masking in FPSIMD register saving sequence
Stable commit 8f4dc4e54eed ("KVM: arm64: Fix kernel BUG() due to bad
backport of FPSIMD/SVE/SME fix") fixed a kernel BUG() caused by a bad
backport of upstream commit fbc7e61195e2 ("KVM: arm64: Unconditionally
save+flush host FPSIMD/SVE/SME state") by ensuring that softirqs are
disabled/enabled across the fpsimd register save operation.
Unfortunately, although this fixes the original issue, it can now lead
to deadlock when re-enabling softirqs causes pending softirqs to be
handled with locks already held:
Take a tiny step towards the upstream fix in 9b19700e623f ("arm64:
fpsimd: Drop unneeded 'busy' flag") by additionally disabling hardirqs
while saving the fpsimd registers.
Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Lee Jones <lee@kernel.org> Cc: Sasha Levin <sashal@kernel.org> Cc: <stable@vger.kernel.org> # 6.1.y Fixes: 8f4dc4e54eed ("KVM: arm64: Fix kernel BUG() due to bad backport of FPSIMD/SVE/SME fix") Reported-by: Kenneth Van Alstyne <kvanals@kvanals.org> Link: https://lore.kernel.org/r/010001999bae0958-4d80d25d-8dda-4006-a6b9-798f3e774f6c-000000@email.amazonses.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is possible that the topology parsing function
audioreach_widget_load_module_common() could return NULL or an error
pointer. Add missing NULL check so that we do not dereference it.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Stable@vger.kernel.org Fixes: 36ad9bf1d93d ("ASoC: qdsp6: audioreach: add topology support") Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> Link: https://patch.msgid.link/20250825101247.152619-2-srinivas.kandagatla@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Syzbot reports a KASAN issue as below:
BUG: KASAN: use-after-free in __create_pipe include/linux/usb.h:1945 [inline]
BUG: KASAN: use-after-free in send_packet+0xa2d/0xbc0 drivers/media/rc/imon.c:627
Read of size 4 at addr ffff8880256fb000 by task syz-executor314/4465
The iMON driver improperly releases the usb_device reference in
imon_disconnect without coordinating with active users of the
device.
Specifically, the fields usbdev_intf0 and usbdev_intf1 are not
protected by the users counter (ictx->users). During probe,
imon_init_intf0 or imon_init_intf1 increments the usb_device
reference count depending on the interface. However, during
disconnect, usb_put_dev is called unconditionally, regardless of
actual usage.
As a result, if vfd_write or other operations are still in
progress after disconnect, this can lead to a use-after-free of
the usb_device pointer.
Thread 1 vfd_write Thread 2 imon_disconnect
...
if
usb_put_dev(ictx->usbdev_intf0)
else
usb_put_dev(ictx->usbdev_intf1)
...
while
send_packet
if
pipe = usb_sndintpipe(
ictx->usbdev_intf0) UAF
else
pipe = usb_sndctrlpipe(
ictx->usbdev_intf0, 0) UAF
Guard access to usbdev_intf0 and usbdev_intf1 after disconnect by
checking ictx->disconnected in all writer paths. Add early return
with -ENODEV in send_packet(), vfd_write(), lcd_write() and
display_open() if the device is no longer present.
Set and read ictx->disconnected under ictx->lock to ensure memory
synchronization. Acquire the lock in imon_disconnect() before setting
the flag to synchronize with any ongoing operations.
Ensure writers exit early and safely after disconnect before the USB
core proceeds with cleanup.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Reported-by: syzbot+f1a69784f6efe748c3bf@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f1a69784f6efe748c3bf Fixes: 21677cfc562a ("V4L/DVB: ir-core: add imon driver") Cc: stable@vger.kernel.org Signed-off-by: Larshin Sergey <Sergey.Larshin@kaspersky.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The original code uses cancel_delayed_work() in flexcop_pci_remove(), which
does not guarantee that the delayed work item irq_check_work has fully
completed if it was already running. This leads to use-after-free scenarios
where flexcop_pci_remove() may free the flexcop_device while irq_check_work
is still active and attempts to dereference the device.
A typical race condition is illustrated below:
CPU 0 (remove) | CPU 1 (delayed work callback)
flexcop_pci_remove() | flexcop_pci_irq_check_work()
cancel_delayed_work() |
flexcop_device_kfree(fc_pci->fc_dev) |
| fc = fc_pci->fc_dev; // UAF
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the delayed work item is properly canceled and any executing delayed
work has finished before the device memory is deallocated.
This bug was initially identified through static analysis. To reproduce
and test it, I simulated the B2C2 FlexCop PCI device in QEMU and introduced
artificial delays within the flexcop_pci_irq_check_work() function to
increase the likelihood of triggering the bug.
A buffer overflow arises from the usage of snprintf to write into the
buffer "buf" in target_lu_gp_members_show function located in
/drivers/target/target_core_configfs.c. This buffer is allocated with
size LU_GROUP_NAME_BUF (256 bytes).
snprintf(...) formats multiple strings into buf with the HBA name
(hba->hba_group.cg_item), a slash character, a devicename (dev->
dev_group.cg_item) and a newline character, the total formatted string
length may exceed the buffer size of 256 bytes.
Since snprintf() returns the total number of bytes that would have been
written (the length of %s/%sn ), this value may exceed the buffer length
(256 bytes) passed to memcpy(), this will ultimately cause function
memcpy reporting a buffer overflow error.
An additional check of the return value of snprintf() can avoid this
buffer overflow.
Reported-by: Wang Haoran <haoranwangsec@gmail.com> Reported-by: ziiiro <yuanmingbuaa@gmail.com> Signed-off-by: Wang Haoran <haoranwangsec@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kenta Akagi [Thu, 2 Oct 2025 14:17:59 +0000 (23:17 +0900)]
selftests: mptcp: connect: fix build regression caused by backport
Since v6.1.154, mptcp selftests have failed to build with the following
errors:
mptcp_connect.c: In function ‘main_loop_s’:
mptcp_connect.c:1040:59: error: ‘winfo’ undeclared (first use in this function)
1040 | err = copyfd_io(fd, remotesock, 1, true, &winfo);
| ^~~~~
mptcp_connect.c:1040:59: note: each undeclared identifier is reported only once for each function it appears in
mptcp_connect.c:1040:23: error: too many arguments to function ‘copyfd_io’; expected 4, have 5
1040 | err = copyfd_io(fd, remotesock, 1, true, &winfo);
| ^~~~~~~~~ ~~~~~~
mptcp_connect.c:845:12: note: declared here
845 | static int copyfd_io(int infd, int peerfd, int outfd, bool close_peerfd)
| ^~~~~~~~~
This is caused by commit ff160500c499 ("selftests: mptcp: connect: catch
IO errors on listen side"), a backport of upstream 14e22b43df25,
which attempts to use the undeclared variable 'winfo' and passes too many
arguments to copyfd_io(). Both the winfo variable and the updated
copyfd_io() function were introduced in upstream
commit ca7ae8916043 ("selftests: mptcp: mptfo Initiator/Listener"),
which is not present in v6.1.y.
The goal of the backport is to stop on errors from copyfd_io.
Therefore, the backport does not depend on the changes in upstream
commit ca7ae8916043 ("selftests: mptcp: mptfo Initiator/Listener").
This commit simply removes ', &winfo' to fix a build failure.
UBSAN: signed-integer-overflow in ./include/crypto/sha256_base.h:64:19 34152083 * 64 cannot be represented in type 'int'
...
BUG: unable to handle page fault for address: ff9fffff83b624c0
sha256_update (lib/crypto/sha256.c:137)
crypto_sha256_update (crypto/sha256_generic.c:40)
kexec_calculate_store_digests (kernel/kexec_file.c:769)
__se_sys_kexec_file_load (kernel/kexec_file.c:397 kernel/kexec_file.c:332)
...
(Line numbers based on commit da274362a7bd9 ("Linux 6.12.49")
This started happening after commit f4da7afe07523f
("kexec_file: increase maximum file size to 4G") that landed in v6.0,
which increased the file size for kexec.
This is not happening upstream (v6.16+), given that `block` type was
upgraded from "int" to "size_t" in commit 74a43a2cf5e8 ("crypto:
lib/sha256 - Move partial block handling out")
Upgrade the block type similar to the commit above, avoiding hitting the
overflow.
This patch is only suitable for the stable tree, and before 6.16, which
got commit 74a43a2cf5e8 ("crypto: lib/sha256 - Move partial block
handling out"). This is not required before f4da7afe07523f ("kexec_file:
increase maximum file size to 4G"). In other words, this fix is required
between versions v6.0 and v6.16.
Signed-off-by: Breno Leitao <leitao@debian.org> Fixes: f4da7afe07523f ("kexec_file: increase maximum file size to 4G") # Before v6.16 Reported-by: Michael van der Westhuizen <rmikey@meta.com> Reported-by: Tobias Fleig <tfleig@meta.com> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The scale() functions detects invalid parameters, but continues
its calculations anyway. This causes bad results if negative values
are used for unsigned operations. Worst case, a division by 0 error
will be seen if source_min == source_max.
On top of that, after v6.13, the sequence of WARN_ON() followed by clamp()
may result in a build error with gcc 13.x.
drivers/gpu/drm/i915/display/intel_backlight.c: In function 'scale':
include/linux/compiler_types.h:542:45: error:
call to '__compiletime_assert_415' declared with attribute error:
clamp() low limit source_min greater than high limit source_max
This happens if the compiler decides to rearrange the code as follows.
if (source_min > source_max) {
WARN(..);
/* Do the clamp() knowing that source_min > source_max */
source_val = clamp(source_val, source_min, source_max);
} else {
/* Do the clamp knowing that source_min <= source_max */
source_val = clamp(source_val, source_min, source_max);
}
Fix the problem by evaluating the return values from WARN_ON and returning
immediately after a warning. While at it, fix divide by zero error seen
if source_min == source_max.
This simplifies the min_t() and max_t() macros by no longer making them
work in the context of a C constant expression.
That means that you can no longer use them for static initializers or
for array sizes in type definitions, but there were only a couple of
such uses, and all of them were converted (famous last words) to use
MIN_T/MAX_T instead.
Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 3a7e02c040b1 ("minmax: avoid overly complicated constant
expressions in VM code") added the simpler MIN_T/MAX_T macros in order
to avoid some excessive expansion from the rather complicated regular
min/max macros.
The complexity of those macros stems from two issues:
(a) trying to use them in situations that require a C constant
expression (in static initializers and for array sizes)
(b) the type sanity checking
and MIN_T/MAX_T avoids both of these issues.
Now, in the whole (long) discussion about all this, it was pointed out
that the whole type sanity checking is entirely unnecessary for
min_t/max_t which get a fixed type that the comparison is done in.
But that still leaves min_t/max_t unnecessarily complicated due to
worries about the C constant expression case.
However, it turns out that there really aren't very many cases that use
min_t/max_t for this, and we can just force-convert those.
This does exactly that.
Which in turn will then allow for much simpler implementations of
min_t()/max_t(). All the usual "macros in all upper case will evaluate
the arguments multiple times" rules apply.
We should do all the same things for the regular min/max() vs MIN/MAX()
cases, but that has the added complexity of various drivers defining
their own local versions of MIN/MAX, so that needs another level of
fixes first.
This just standardizes the use of MIN() and MAX() macros, with the very
traditional semantics. The goal is to use these for C constant
expressions and for top-level / static initializers, and so be able to
simplify the min()/max() macros.
These macro names were used by various kernel code - they are very
traditional, after all - and all such users have been fixed up, with a
few different approaches:
- trivial duplicated macro definitions have been removed
Note that 'trivial' here means that it's obviously kernel code that
already included all the major kernel headers, and thus gets the new
generic MIN/MAX macros automatically.
- non-trivial duplicated macro definitions are guarded with #ifndef
This is the "yes, they define their own versions, but no, the include
situation is not entirely obvious, and maybe they don't get the
generic version automatically" case.
- strange use case #1
A couple of drivers decided that the way they want to describe their
versioning is with
#define MAJ 1
#define MIN 2
#define DRV_VERSION __stringify(MAJ) "." __stringify(MIN)
which adds zero value and I just did my Alexander the Great
impersonation, and rewrote that pointless Gordian knot as
#define DRV_VERSION "1.2"
instead.
- strange use case #2
A couple of drivers thought that it's a good idea to have a random
'MIN' or 'MAX' define for a value or index into a table, rather than
the traditional macro that takes arguments.
These values were re-written as C enum's instead. The new
function-line macros only expand when followed by an open
parenthesis, and thus don't clash with enum use.
Happily, there weren't really all that many of these cases, and a lot of
users already had the pattern of using '#ifndef' guarding (or in one
case just using '#undef MIN') before defining their own private version
that does the same thing. I left such cases alone.
Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This occurs when memset() is called on a buffer that is not 4-byte aligned
and extends to the end of a guard page, i.e. the next page is unmapped.
The bug is that the loop at the end of kmsan_internal_set_shadow_origin()
accesses the wrong shadow memory bytes when the address is not 4-byte
aligned. Since each 4 bytes are associated with an origin, it rounds the
address and size so that it can access all the origins that contain the
buffer. However, when it checks the corresponding shadow bytes for a
particular origin, it incorrectly uses the original unrounded shadow
address. This results in reads from shadow memory beyond the end of the
buffer's shadow memory, which crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
Link: https://lkml.kernel.org/r/20250911195858.394235-1-ebiggers@kernel.org Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning") Signed-off-by: Eric Biggers <ebiggers@kernel.org> Tested-by: Alexander Potapenko <glider@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Adjust context in tests ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The `ring_len` parameter provided by the virtual function (VF)
is assigned directly to the hardware memory context (HMC) without
any validation.
To address this, introduce an upper boundary check for both Tx and Rx
queue lengths. The maximum number of descriptors supported by the
hardware is 8k-32.
Additionally, enforce alignment constraints: Tx rings must be a multiple
of 8, and Rx rings must be a multiple of 32.
In Tables 8-12 and 8-22 in the X710/XXV710/XL710 datasheet, the QLEN
description states that the maximum size of the descriptor queue is 8k
minus 32, or 8160.
Signed-off-by: Justin Bronder <jsbronder@cold-front.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20231113231047.548659-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 55d225670def ("i40e: add validation for ring_len param") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
VF state I40E_VF_STATE_ACTIVE is not the only state in which
VF is actually active so it should not be used to determine
if a VF is allowed to obtain resources.
Use I40E_VF_STATE_RESOURCES_LOADED that is set only in
i40e_vc_get_vf_resources_msg() and cleared during reset.
Fixes: 61125b8be85d ("i40e: Fix failed opcode appearing if handling messages from VF") Cc: stable@vger.kernel.org Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
[ Adjust context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The minmax infrastructure is overkill for simple constants, and can
cause huge expansions because those simple constants are then used by
other things.
For example, 'pageblock_order' is a core VM constant, but because it was
implemented using 'min_t()' and all the type-checking that involves, it
actually expanded to something like 2.5kB of preprocessor noise.
And when that simple constant was then used inside other expansions:
the end result was that one statement expanding to 253kB in size.
There are probably other cases of this, but this one case certainly
stood out.
I've added 'MIN_T()' and 'MAX_T()' macros for this kind of "core simple
constant with specific type" use. These macros skip the type checking,
and as such need to be very sparingly used only for obvious cases that
have active issues like this.
It appears that compiler_types.h already have an implementation of the
__unconst_integer_typeof() called __unqual_scalar_typeof(). Use it
instead of the copy.
flush_dcache_folio() isn't technically new, but no architecture
implemented it, so I've done that for them. The old APIs remain around
but are mostly implemented by calling the new interfaces.
The new APIs are based around setting up N page table entries at once.
The N entries belong to the same PMD, the same folio and the same VMA, so
ptep++ is a legitimate operation, and locking is taken care of for you.
Some architectures can do a better job of it than just a loop, but I have
hesitated to make too deep a change to architectures I don't understand
well.
One thing I have changed in every architecture is that PG_arch_1 is now a
per-folio bit instead of a per-page bit when used for dcache clean/dirty
tracking. This was something that would have to happen eventually, and it
makes sense to do it now rather than iterate over every page involved in a
cache flush and figure out if it needs to happen.
The point of all this is better performance, and Fengwei Yin has measured
improvement on x86. I suspect you'll see improvement on your architecture
too. Try the new will-it-scale test mentioned here:
https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/
You'll need to run it on an XFS filesystem and have
CONFIG_TRANSPARENT_HUGEPAGE set.
This patchset is the basis for much of the anonymous large folio work
being done by Ryan, so it's received quite a lot of testing over the last
few months.
This patch (of 38):
Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND). It also has useful (under some
circumstances) behaviour if the range exceeds the maximum value of the
type. Convert all the conflicting definitions of in_range() within the
kernel; some can use the generic definition while others need their own
definition.
If migration succeeded, we called
folio_migrate_flags()->mem_cgroup_migrate() to migrate the memcg from the
old to the new folio. This will set memcg_data of the old folio to 0.
Similarly, if migration failed, memcg_data of the dst folio is left unset.
If we call folio_putback_lru() on such folios (memcg_data == 0), we will
add the folio to be freed to the LRU, making memcg code unhappy. Running
the hmm selftests:
Likely, nothing else goes wrong: putting the last folio reference will
remove the folio from the LRU again. So besides memcg complaining, adding
the folio to be freed to the LRU is just an unnecessary step.
The new flow resembles what we have in migrate_folio_move(): add the dst
to the lru, remove migration ptes, unlock and unref dst.
Link: https://lkml.kernel.org/r/20250210161317.717936-1-david@redhat.com Fixes: 8763cb45ab96 ("mm/migrate: new memory migration helper for use with device memory") Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
s390/cpum_cf: Fix uninitialized warning after backport of ce971233242b
Upstream commit ce971233242b ("s390/cpum_cf: Deny all sampling events by
counter PMU"), backported to 6.6 as commit d660c8d8142e ("s390/cpum_cf:
Deny all sampling events by counter PMU"), implicitly depends on the
unconditional initialization of err to -ENOENT added by upstream
commit aa1ac98268cd ("s390/cpumf: Fix double free on error in
cpumf_pmu_event_init()"). The latter change is missing from 6.6,
resulting in an instance of -Wuninitialized, which is fairly obvious
from looking at the actual diff.
arch/s390/kernel/perf_cpum_cf.c:858:10: warning: variable 'err' is uninitialized when used here [-Wuninitialized]
858 | return err;
| ^~~
Commit aa1ac98268cd ("s390/cpumf: Fix double free on error in
cpumf_pmu_event_init()") depends on commit c70ca298036c ("perf/core:
Simplify the perf_event_alloc() error path"), which is a part of a much
larger series unsuitable for stable.
Extract the unconditional initialization of err to -ENOENT from
commit aa1ac98268cd ("s390/cpumf: Fix double free on error in
cpumf_pmu_event_init()") and apply it to 6.6 as a standalone change to
resolve the warning.
Fixes: d660c8d8142e ("s390/cpum_cf: Deny all sampling events by counter PMU") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 1a194e6c8e1e ("fbcon: fix integer overflow in fbcon_do_set_font")
introduced an out-of-bounds access by storing data and allocation sizes
in the same variable. Restore the old size calculation and use the new
variable 'alloc_size' for the allocation.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 1a194e6c8e1e ("fbcon: fix integer overflow in fbcon_do_set_font") Reported-by: Jani Nikula <jani.nikula@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15020 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6201 Cc: Samasth Norway Ananda <samasth.norway.ananda@oracle.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: George Kennedy <george.kennedy@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Simona Vetter <simona@ffwll.ch> Cc: Helge Deller <deller@gmx.de> Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Qianqiang Liu <qianqiang.liu@163.com> Cc: Shixiong Ou <oushixiong@kylinos.cn> Cc: Kees Cook <kees@kernel.org> Cc: <stable@vger.kernel.org> # v5.9+ Cc: Zsolt Kajtar <soci@c64.rulez.org> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Qianqiang Liu <qianqiang.liu@163.com> Link: https://lore.kernel.org/r/20250922134619.257684-1-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>