Ben Horgan [Wed, 6 May 2026 08:28:55 +0000 (09:28 +0100)]
fs/resctrl: Document tasks file behaviour for task id 0 and idle tasks
When 0 is written to the tasks file it is interpreted as the current task in
rdtgroup_move_task(). Each CPU's idle task has task_struct::pid set to 0 and,
on x86, task_struct::closid to RESCTRL_RESERVED_CLOSID and task_struct::rmid
to RESCTRL_RESERVED_RMID.
Equivalently, on MPAM platforms, thread_info::mpam_partid_pmg is encoded with
PARTID and PMG set to RESCTRL_RESERVED_CLOSID and RESCTRL_RESERVED_RMID,
respectively. As there is no interface to change these from the default, the
resctrl configuration for the idle tasks is fixed and they always behave
equivalently to a task in the default tasks file and so take their
configuration from the cpus/cpus_list files.
On read of the tasks file, show_rdt_tasks() filters out any 0 PID. Hence, a
task id of 0 is never shown in the tasks file and the idle tasks are not
represented either.
Ben Horgan [Wed, 6 May 2026 08:28:53 +0000 (09:28 +0100)]
fs/resctrl: Continue counter allocation after failure
In mbm_event mode, with mbm_assign_on_mkdir set to 1, when a user creates a
new CTRL_MON or MON group resctrl attempts to allocate counters for each of
the supported MBM events on each resctrl domain. As counters are limited,
such allocation may fail and when it does counter allocations for the
remaining domains are skipped even if the domains have available counters.
Because of that, the user needs to view the resource group'smbm_L3_assignments
file to get an accurate view of counter assignment in a new resource group and
then manually create counters in the skipped domains with available counters.
Writes to mbm_L3_assignments using the wildcard format, <event>:*=e, also skip
counter allocation in other domains after a counter allocation failure.
When handling a request to create counters in all domains it is unnecessary
for a counter allocation in one domain to prevent counter allocation in
other domains. Always attempt to allocate all the counters requested.
Francis, David [Tue, 28 Apr 2026 19:25:50 +0000 (19:25 +0000)]
drm: Set old handle to NULL before prime swap in change_handle
There was a potential race condition in change_handle. The ioctl
briefly had a single object with two idr entries; a concurrent
gem_close could delete the object and remove one of the handles
while leaving the other one dangling, which could subsequently
be dereferenced for a use-after-free.
To fix this, do the same dance that gem_close itself does.
(f6cd7daecff5 drm: Release driver references to handle before making it available again)
First idr_replace the old handle to NULL. Later, if the prime
operations are successful, actually close it.
create_tail required a similar dance to avoid a similar problem.
(bd46cece51a3 drm/gem: Fix race in drm_gem_handle_create_tail())
It idr_allocs the new handle with NULL, then swaps in the correct
object later to avoid races. We don't need to do that here, since
the only operations that could race are drm_prime, and
change_handle holds the prime lock for the entire duration.
v2: cleanups of error paths
Signed-off-by: David Francis <David.Francis@amd.com> Co-authored-by: Dave Airlie <airlied@gmail.com> Reported-by: Puttimet Thammasaeng <pwn8official@gmail.com> Tested-by: Vitaly Prosyak <Vitaly.Prosyak@amd.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: stable@vger.kernel.org Cc: Christian Koenig <Christian.Koenig@amd.com> Fixes: 53096728b8910 ("drm: Add DRM prime interface to reassign GEM handle") Signed-off-by: Dave Airlie <airlied@redhat.com>
John Walker [Thu, 7 May 2026 23:07:20 +0000 (17:07 -0600)]
wifi: cfg80211: advance loop vars in cfg80211_merge_profile()
cfg80211_merge_profile() reassembles a Multi-BSSID non-transmitted BSS
profile that has been split across multiple consecutive MBSSID elements.
Its while-loop calls
but never advances mbssid_elem or sub_elem inside the body. Each
iteration therefore searches for a continuation that follows the same
fixed pair; the helper returns the same next_mbssid; and the same
next_sub bytes are memcpy()'d into merged_ie at a growing offset until
the buffer fills.
Advance both mbssid_elem and sub_elem to the just-consumed continuation
so the next call to cfg80211_get_profile_continuation() searches for a
further continuation beyond it (or returns NULL when none exists).
A specially-crafted malicious beacon can take advantage of this bug
to cause the kernel to spend an excessive amount of time in
cfg80211_merge_profile (up to as much as 2ms per beacon received),
which could theoretically be abused in some way.
Cc: stable@vger.kernel.org Fixes: fe806e4992c9 ("cfg80211: support profile split between elements") Signed-off-by: John Walker <johnwalker0@gmail.com> Link: https://patch.msgid.link/20260507230720.64783-1-johnwalker0@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
K Prateek Nayak [Fri, 8 May 2026 05:17:48 +0000 (05:17 +0000)]
cpufreq/amd-pstate-ut: Drop policy reference before driver switch
Recent changes to the EPP unit test tries to perform a driver switch
with a cpufreq_policy reference held when the driver is loaded into
anything but the active mode which leads to a circular dependency and
the unit test hanging indefinitely.
Drop the reference before driver switch and grab it back once the driver
mode is stabilized for the test.
The EPP writes are only possible with CPUFREQ_POLICY_POWERSAVE policy.
Temporarily switch the cpudata->policy (while holding the write end of
the policy->rwsem) to CPUFREQ_POLICY_POWERSAVE and restore the original
policy once tests are done. To ensure the final EPP is correct in case
the driver started with CPUFREQ_POLICY_PERFORMANCE, EPP performance is
tested last.
The __free() based cleanup for cpufreq_policy is lost in the process.
Reported-by: Kalpana Shetty <kalpana.shetty@amd.com> Fixes: 7e173bc310d2b ("cpufreq/amd-pstate-ut: Add a unit test for raw EPP") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20260508051748.10484-7-kprateek.nayak@amd.com Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
K Prateek Nayak [Fri, 8 May 2026 05:17:47 +0000 (05:17 +0000)]
cpufreq/amd-pstate: Use "epp_default_dc" as default when dynamic_epp is disabled
If "dynamic_epp" is disabled, the driver initialization and the default
EPP selection from sysfs currently sets the EPP based on the power
supply state of the system at that time but there is no power supply
callbacks registered to toggle it when the power supply state changes.
This can lead to faster battery drain on platforms that start off while
being plugged to the wall but later move to battery power since the EPP
stays at AMD_CPPC_EPP_PERFORMANCE.
Use "epp_default_dc" as the default EPP selection when dynamic_epp is
disabled, restoring older behavior. On servers, this defaults to
AMD_CPPC_EPP_PERFORMANCE and on other platforms, it defaults to
AMD_CPPC_EPP_BALANCE_PERFORMANCE.
Fixes: e30ca6dd5345 ("cpufreq/amd-pstate: Add dynamic energy performance preference") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20260508051748.10484-6-kprateek.nayak@amd.com Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
K Prateek Nayak [Fri, 8 May 2026 05:17:46 +0000 (05:17 +0000)]
cpufreq/amd-pstate: Reorder notifier unregistration and floor perf reset
An active power supply notifier can race with amd_pstate_epp_cpu_exit()
trying to reset the floor perf and can overwrite the floor perf set in
MSR_AMD_CPPC_REQ.
Unregister the notifier before setting the floor perf to prevent the
rare race.
Fixes: e30ca6dd5345 ("cpufreq/amd-pstate: Add dynamic energy performance preference") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20260508051748.10484-5-kprateek.nayak@amd.com Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
K Prateek Nayak [Fri, 8 May 2026 05:17:45 +0000 (05:17 +0000)]
cpufreq/amd-pstate: Allow writes to dynamic_epp when state isn't modified
Writing the current "dynamic_epp" state to sysfs fails with -EINVAL even
though the desired result was achieved. Allow writes to "dynamic_epp"
that does not modify the state.
Fixes: e30ca6dd5345 ("cpufreq/amd-pstate: Add dynamic energy performance preference") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20260508051748.10484-4-kprateek.nayak@amd.com Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
K Prateek Nayak [Fri, 8 May 2026 05:17:44 +0000 (05:17 +0000)]
cpufreq/amd-pstate: Return -ENOMEM on failure to allocate profile_name
Failure to allocate profile name will return -EINVAL from
platform_profile_register() while in fact, it is a failure to allocate
memory for the profile_name string.
Return -ENOMEM when kasprintf() fails to allocate profile_name string.
Fixes: e30ca6dd5345 ("cpufreq/amd-pstate: Add dynamic energy performance preference") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20260508051748.10484-3-kprateek.nayak@amd.com Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
hits the WARN_ON_ONCE() in static_key_disable_cpuslocked() and hangs the
system since both sysfs writes are trying to do
amd_pstate_change_driver_mode() without any synchronization.
Grab the "amd_pstate_driver_lock" mutex when modifying "dynamic_epp" to
prevent the two paths from racing with each other. Add a lockdep
assertion for "amd_pstate_driver_lock" in
amd_pstate_change_driver_mode() to formalize the dependency.
Since "cppc_mode" is stable under "amd_pstate_driver_lock", only reload
the driver when in "AMD_PSTATE_ACTIVE" mode and reject all writes when
in passive or guided mode, or if the driver is not loaded, since only
active mode operates on EPP.
Fixes: e30ca6dd5345 ("cpufreq/amd-pstate: Add dynamic energy performance preference") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20260508051748.10484-2-kprateek.nayak@amd.com Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Hui Wang [Wed, 6 May 2026 13:21:52 +0000 (21:21 +0800)]
riscv: cpufeature: Use pre-defined ISA ext macros to index isa2hwcap
We have pre-defined ISA extension macros, here use those macros to
replace a magic number for isa2hwcap definition and some array
indexing for isa2hwcap access.
This doesn't change the original functionality, just improve the code
maintainability and readability.
The testsuite defines several kprobes and kretprobes as static variables
that are preserved across test runs.
After register_kprobe and unregister_kprobe, a kprobe contains some
leftover data that must be cleared before the kprobe can be registered
again. The tests are setting symbol_name to define the probe location.
Address and flags must be cleared.
The existing code clears some of the probes between subsequent tests, but
not between two test runs. The leftover data from a previous test run
makes the registrations fail in the next run.
Move the cleanups for all kprobes into kprobes_test_init, this function
is called before each single test (including the first test of a test
run).
Jianpeng Chang [Fri, 8 May 2026 00:56:36 +0000 (09:56 +0900)]
kprobes: skip non-symbol addresses in kprobe_add_ksym_blacklist()
When kprobe_add_area_blacklist() iterates through a section like
.kprobes.text, the start address may not correspond to a named symbol.
On ARM64 with CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y (introduced by
commit baaf553d3bc3 ("arm64: Implement
HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")), the compiler flag
-fpatchable-function-entry=4,2 inserts 2 NOPs before each function entry
point for ftrace call_ops. These pre-function NOPs sit at the section base
address, before the first named function symbol. The compiler emits a $x
mapping symbol at offset 0x00 to mark the start of code, but
find_kallsyms_symbol() ignores mapping symbols.
Without CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS (e.g. defconfig), no
pre-function NOPs are inserted, the first function starts at offset
0x00, and the bug does not trigger.
This only affects modules that have a .kprobes.text section (i.e. those
using the __kprobes annotation). Modules using NOKPROBE_SYMBOL() instead
(like kretprobe_example.ko) blacklist exact function addresses via the
_kprobe_blacklist section and are not affected.
For kprobe_example.ko on ARM64 with -fpatchable-function-entry=4,2,
the .kprobes.text section layout is:
kprobe_add_area_blacklist() starts iterating from the section base
address (offset 0x00), which only has the $x mapping symbol.
kprobe_add_ksym_blacklist() then calls kallsyms_lookup_size_offset()
for this address, which goes through:
find_kallsyms_symbol() scans all module symbols to find the closest
preceding symbol.
Since no named text symbol exists at offset 0x00,
find_kallsyms_symbol() picks __UNIQUE_ID_vermagic (a .modinfo symbol
whose address is in the temporary image) as the "best" match. The
computed "size" = next_text_symbol - modinfo_symbol spans across
these two unrelated memory regions, creating a blacklist entry with
a bogus range of tens of terabytes.
Whether this causes a visible failure depends on address randomization,
here is what happens on Raspberry Pi 4/5:
- On RPi5, the bogus size was ~35 TB. start + size stayed within
64-bit range, so the blacklist entry covered the entire kernel
text. register_kprobe() in the module's own init function failed
with -EINVAL.
- On RPi4, the bogus size was ~75 TB. start + size overflowed
64 bits and wrapped to a small address near zero. The range
check (addr >= start && addr < end) then failed because end
wrapped around, so the bogus entry was accidentally harmless
and kprobes worked by luck.
The same bug exists on both machines, but randomization determines whether
the integer overflow masks it or not.
Fix this by adding notrace to the __kprobes macro. Functions in
.kprobes.text are kprobe infrastructure handlers that should never be
traced by ftrace. With notrace, the compiler stops inserting them and the
non-symbol gap at the section start disappears entirely.
Linus Torvalds [Fri, 8 May 2026 00:26:43 +0000 (17:26 -0700)]
Merge tag 'selinux-pr-20260507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fixes from Paul Moore:
- Allow for multiple opens of /sys/fs/selinux/policy
Prevent a single process from blocking others from reading the
SELinux policy loaded in the kernel. This does have the side effect
of potentially allowing userspace to trigger additional kernel memory
allocations as part of the open/read operation, but this is mitigated
by requiring the SELinux security/read_policy permission.
- Reduce the critical sections where the SELinux policy mutex is held
This includes the patch to the policy loader code where we move the
permission checks and an allocation outside the mutex as well as the
the patch to checkreqprot which drops the code/lock entirely.
While the checkreqprot code had effectively been dropped in an
earlier release, portions of the code still remained that would have
triggered the mutex to perform an IMA measurement. This finally drops
all of that while preserving the user visible behavior.
- Eliminate potential sources of log spamming
There were a few areas where processes could flood the system logs
and hide other, more critical events. The previously disabled
checkreqprot and runtime disable knobs in selinuxfs were two such
areas that have now been greatly simplified and a pr_err() replaced
with a pr_err_once().
The third such place is the /sys/fs/selinux/user file, which hasn't
been used by a userspace release since 2020 and was scheduled for
removal after 2025; this effectively disables this functionality, but
similar to checkreqprot, it is done in a way that should not break
old userspace.
* tag 'selinux-pr-20260507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: shrink critical section in sel_write_load()
selinux: allow multiple opens of /sys/fs/selinux/policy
selinux: prune /sys/fs/selinux/user
selinux: prune /sys/fs/selinux/disable
selinux: prune /sys/fs/selinux/checkreqprot
Li Xiasong [Thu, 7 May 2026 14:04:22 +0000 (22:04 +0800)]
netfilter: nf_conntrack_sip: get helper before allocating expectation
process_register_request() allocates an expectation and then checks
whether a conntrack helper is available. If helper lookup fails, the
function returns early and the allocated expectation is left behind.
Reorder the code to fetch and validate helper before calling
nf_ct_expect_alloc(). This keeps the logic simpler and removes the leak
path while preserving existing behavior.
Fixes: e14575fa7529 ("netfilter: nf_conntrack: use rcu accessors where needed") Cc: stable@vger.kernel.org Signed-off-by: Li Xiasong <lixiasong1@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nf_conntrack_expect: restore helper propagation via expectation
A recent series to fix expectations broke helper propagation via
expectation, this mechanism is used by the sip and h323 helper. This
also propagates the conntrack helper to expected connections. I changed
semantics of exp->helper which now tells us the actual helper that
created the expectation.
Add an explicit assign_helper field to expectations for this purpose
and update helpers to use it.
Restore this feature for userspace conntrack helper via ctnetlink
nfqueue integration so it is again possible to attach a helper to an
expectation, where it makes sense. This is not restored via ctnetlink
expectation creation as there is no client for such feature. Use the
expectation layer 4 protocol number for the helper lookup for
consistency.
Make sure the expectation using this helper propagation mechanism also
go away when the helper is unregistered.
netfilter: bridge: eb_tables: close module init race
sashiko reports for unrelated patch:
Does the core ebtables initialization in ebtables.c suffer from a similar race?
Once nf_register_sockopt() completes, the sockopts are exposed globally.
sockopt has to be registered last, just like in ip/ip6/arptables.
Fixes: 5b53951cfc85 ("netfilter: ebtables: use net_generic infra") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: x_tables: close dangling table module init race
Similar to the previous ebtables patch:
template add exposes the table to userspace, we must do this last to
rnsure the pernet ops are set up (contain the destructors).
Fixes: fdacd57c79b7 ("netfilter: x_tables: never register tables by default") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: ebtables: close dangling table module init race
sashiko reported for a related patch:
In modules like iptable_raw.c, [..], if register_pernet_subsys() fails,
the rollback might call kfree(rawtable_ops) before [..]
During this window, could a concurrent userspace process find the globally
visible template, trigger table_init(), [..]
The table init functions must always register the template last.
Otherwise, set/getsockopt can instantiate a table in a namespace
while the required pernet ops (contain the destructor) isn't available.
This change is also required in x_tables, handled in followup change.
Fixes: 87663c39f898 ("netfilter: ebtables: do not hook tables by default") Reviewed-by: Tristan Madani <tristan@talencesecurity.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: x_tables: add and use xtables_unregister_table_exit
Previous change added xtables_unregister_table_pre_exit to detach the
table from the packetpath and to unlink it from the active table list.
In case of rmmod, userspace that is doing set/getsockopt for this table
will not be able to re-instantiate the table:
1. The larval table has been removed already
2. existing instantiated table is no longer on the xt pernet table list.
This adds the second stage helper:
unlink the table from the dying list, free the hook ops (if any) and do
the audit notification. It replaces xt_unregister_table().
Fixes: fdacd57c79b7 ("netfilter: x_tables: never register tables by default") Reported-by: Tristan Madani <tristan@talencesecurity.com> Reviewed-by: Tristan Madani <tristan@talencesecurity.com> Closes: https://lore.kernel.org/netfilter-devel/20260429175613.1459342-1-tristmd@gmail.com/ Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: x_tables: unregister the templates first
When the module is going away we need to zap the template
first. Else there is a small race window where userspace
could instantiate a new table after the pernet exit function
has removed the current table.
Fixes: fdacd57c79b7 ("netfilter: x_tables: never register tables by default") Reported-by: Tristan Madani <tristan@talencesecurity.com> Reviewed-by: Tristan Madani <tristan@talencesecurity.com> Closes: https://lore.kernel.org/netfilter-devel/20260429175613.1459342-1-tristmd@gmail.com/ Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: x_tables: add and use xt_unregister_table_pre_exit
Remove the copypasted variants of _pre_exit and add one single
function in the xtables core. ebtables is not compatible with
x_tables and therefore unchanged.
This is a preparation patch to reduce noise in the followup
bug fixes.
netfilter: x_tables: allocate hook ops while under mutex
arp/ip(6)t_register_table() add the table to the per-netns list via
xt_register_table() before allocating the per-netns hook ops copy
via kmemdup_array(). This leaves a window where the table is
visible in the list with ops=NULL.
If the pernet exit happens runs concurrently the pre_exit callback finds
the table via xt_find_table() and passes the NULL ops pointer to
nf_unregister_net_hooks(), causing a NULL dereference:
general protection fault in nf_unregister_net_hooks+0xbc/0x150
RIP: nf_unregister_net_hooks (net/netfilter/core.c:613)
Call Trace:
ipt_unregister_table_pre_exit
iptable_mangle_net_pre_exit
ops_pre_exit_list
cleanup_net
Fix by moving the ops allocation into the xtables core so the table is
never in the list without valid ops. Also ensure the table is no longer
processing packets before its torn down on error unwind.
nf_register_net_hooks might have published at least one hook; call
synchronize_rcu() if there was an error.
audit log register message gets deferred until all operations have
passed, this avoids need to emit another ureg message in case of
error unwinding.
At the moment we emit the audit log a bit too early, which makes it
necessary to also emit an unregister log in case we have to unwind
errors after possible hook register failure.
Followup patch will be slightly simpler if we can delay the
register message until after the hooks have been wired up.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tabrez Ahmed [Sat, 2 May 2026 02:08:42 +0000 (07:38 +0530)]
hwmon: (ads7871) Fix endianness bug in 16-bit register reads
The ads7871_read_reg16() function relies on spi_w8r16() to read the
16-bit sensor output. The ADS7871 device transmits the Least Significant
Byte (LSB) first.
On Little-Endian architectures, spi_w8r16() correctly reconstructs the
16-bit value. However, on Big-Endian architectures, the byte swapping
causes the first received byte (LSB) to be placed in the most significant
byte of the u16, resulting in corrupted voltage readings.
To fix this, cast the integer result of spi_w8r16() to a restricted
__le16 type and convert it to the host CPU's native byte order using
le16_to_cpu(). Negative error codes returned by the SPI core are caught
and returned prior to the conversion to avoid mangling the error status.
Reported-by: Sashiko <sashiko-bot@kernel.org> Closes: https://sashiko.dev/#/patchset/20260418034601.90226-1-tabreztalks@gmail.com Fixes: e0c70b8078629 ("hwmon: add TI ads7871 a/d converter driver") Suggested-by: David Laight <david.laight.linux@gmail.com> Signed-off-by: Tabrez Ahmed <tabreztalks@gmail.com> Link: https://lore.kernel.org/r/20260502020844.110038-2-tabreztalks@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Sensors configurations are defined by set and clear masks. These
do not follow a consistent "clear mask is a superset of set mask"
rule. This relaxed definition breaks lm75_write_config()
Basically all bits from set_mask that are not defined in clr_mask are
dropped. Fix that by enhancing the helper to always combine clr_mask
and set_mask into the mask bits of regmap_update_bits().
hwmon: (lm75) Fix AS6200 and TMP112 setup and alarm handling
The initialization of the AS6200 has two shortcomings
- The device-add-commit states "Conversion mode: continuous" but the
the lm75_params structure uses set_mask = 0x94c0. This activates
single shot mode (bit 15). According to the datasheet "The device
features a single shot measurement mode if the device is in sleep
mode (SM=1)". This is quite contradictionary.
- It is the only device that activates polarity active-high (bit 10)
All this is paired with a undefined clear mask bug in function
lm75_write_config() that was introduced with a later refactoring
commit.
[as6200] = {
.config_reg_16bits = true,
.set_mask = 0x94C0,
-> .clr_mask not defined here
.default_resolution = 12,
...
static inline int lm75_write_config(struct lm75_data *data, u16 set_mask,
u16 clr_mask)
{
return regmap_update_bits(data->regmap, LM75_REG_CONF,
clr_mask | LM75_SHUTDOWN, set_mask);
}
regmap_update_bits() requires clr_mask to be a superset of set_mask.
So basically all sensors with "wrong" masks like the AS6200 are not
initialized as intended.
Fix that by
- Change the set_mask to 0xc010 to reflect the current active-low
setup properly and to drive the sensor in continous mode. This
takes into account that the config register is little endian and
the first byte sent to the chip is the LSB.
- Adapt the alarm handling so it can report the alarm correctly
even if it is high active. This is done by comparing config register
bit 5 and 10 (translated to 2 and 13).
This commit does not introduce any ABI breakage as the mutliple bugs
effectly drive the AS6200 in standard active-low mode.
Fixes: 4b6358e1fe46 ("hwmon: (lm75) Add AMS AS6200 temperature sensor") Suggested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de> Link: https://lore.kernel.org/r/20260502173207.3567876-2-markus.stockhausen@gmx.de
[groeck: Update set_mask for as6200 further: As modeled, the upper bits
contain the conversion rate, so the config register needs to be set to
0xc010 instead of 0x10c0 to reflect 8 samples/s and 4 consecutive faults.
Fix the same problem for TMP112.] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Dave Airlie [Thu, 7 May 2026 22:51:01 +0000 (08:51 +1000)]
Merge tag 'drm-xe-fixes-2026-05-07' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
Driver Changes:
- Add NULL check for media_gt in intel_hdcp_gsc_check_status (Gustavo)
- Fix EAGAIN sign in pf_migration_consume (Shuicheng)
- Fix MMIO access using PF view instead of VF view during migration (Shuicheng)
- Exclude indirect ring state page from ADS engine state size (Satya)
Dave Airlie [Thu, 7 May 2026 22:34:34 +0000 (08:34 +1000)]
Merge tag 'drm-rust-fixes-2026-05-07' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-fixes
DRM Rust fixes for v7.1-rc3
- Fix unsound initialization in drm::Device::new(); if pinned
initialization of drm::Device::Data fails, make sure
drm::Device::release() isn't called, so we don't run the data's
destructor
- Fix missing GEM state cleanup in the init failure case; call
drm_gem_private_object_fini() if drm_gem_object_init() fails
- Fix wrong ARef import in the DRM shmem GEM helper abstraction
- Replace the nouveau mailing list with the new nova-gpu mailing list
for both nova-core and nova-drm, and remove unused patchwork entries
Robbie Ko [Fri, 1 May 2026 02:41:56 +0000 (10:41 +0800)]
btrfs: fix incorrect i_size after remount caused by KEEP_SIZE prealloc gap
When fallocate() with FALLOC_FL_KEEP_SIZE preallocates an extent past the
current i_size, the file_extent_tree of the inode is updated to cover
that range. However, on the next mount, btrfs_read_locked_inode() only
re-populates file_extent_tree with [0, round_up(i_size, sectorsize)),
losing the marks that belonged to the KEEP_SIZE prealloc extent beyond
i_size.
Later, when a non-KEEP_SIZE fallocate() extends i_size into / past that
old prealloc extent, the reservation loop in btrfs_fallocate() skips
already-prealloc segments and does not call into the path that marks the
file_extent_tree, so a gap remains inside the file_extent_tree across
[old_aligned_i_size, start_of_new_alloc). Then __btrfs_prealloc_file_range()
calls btrfs_inode_safe_disk_i_size_write(), which uses
find_contiguous_extent_bit() starting at offset 0 to derive disk_i_size.
The walk stops at the gap, so disk_i_size ends up smaller than i_size and
gets persisted. After the next mount, the file shows the wrong (smaller)
size.
# non-KEEP_SIZE fallocate that overlaps the previous prealloc tail
# and extends past it
fallocate -o 7M -l 2M $MNT/file1
ls -lh $MNT/file1
umount $MNT
mount $DEV $MNT
ls -lh $MNT/file1
umount $MNT
Running the reproducer gives the following result:
The size before the second mount is correct (9M), but after the
remount it drops to 7M, i.e. the start of the gap inside file_extent_tree.
Fix this in __btrfs_prealloc_file_range() by marking the entire range
[round_down(old_i_size, sectorsize), round_up(new_i_size, sectorsize))
in file_extent_tree before updating i_size and calling
btrfs_inode_safe_disk_i_size_write(). This ensures the contiguous bit
search starting from 0 is not truncated by a stale gap left behind by a
previous KEEP_SIZE prealloc that was not restored on inode load.
The fix has no effect when the NO_HOLES feature is enabled because
btrfs_inode_safe_disk_i_size_write() and
btrfs_inode_set_file_extent_range()
both take the fast path that directly tracks disk_i_size without
consulting file_extent_tree.
Fixes: 9ddc959e802b ("btrfs: use the file extent tree infrastructure") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Robbie Ko <robbieko@synology.com>
[ Minor updates to the change log ] Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
btrfs: only release the dirty pages io tree after successful writes
[WARNING]
With extra warning on dirty extent buffers at umount (aka, the next
patch in the series), test case generic/388 can trigger the following
warning about dirty extent buffers at unmount time:
BTRFS critical (device dm-2 state E): emergency shutdown
BTRFS error (device dm-2 state E): error while writing out transaction: -30
BTRFS warning (device dm-2 state E): Skipping commit of aborted transaction.
BTRFS error (device dm-2 state EA): Transaction 9 aborted (error -30)
BTRFS: error (device dm-2 state EA) in cleanup_transaction:2068: errno=-30 Readonly filesystem
BTRFS info (device dm-2 state EA): forced readonly
BTRFS info (device dm-2 state EA): last unmount of filesystem 4fbf2e15-f941-49a0-bc7c-716315d2777c
------------[ cut here ]------------
WARNING: disk-io.c:3311 at invalidate_and_check_btree_folios+0xfd/0x1ca [btrfs], CPU#8: umount/914368
CPU: 8 UID: 0 PID: 914368 Comm: umount Tainted: G OE 7.1.0-rc1-custom+ #372 PREEMPT(full) 2de38db8d1deae71fde295430a0ff3ab98ccf596
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
RIP: 0010:invalidate_and_check_btree_folios+0xfd/0x1ca [btrfs]
Call Trace:
<TASK>
close_ctree+0x52e/0x574 [btrfs d2f0b1cd330d1287e7a9919d112eadfc0e914efd]
generic_shutdown_super+0x89/0x1a0
kill_anon_super+0x16/0x40
btrfs_kill_super+0x16/0x20 [btrfs d2f0b1cd330d1287e7a9919d112eadfc0e914efd]
deactivate_locked_super+0x2d/0xb0
cleanup_mnt+0xdc/0x140
task_work_run+0x5a/0xa0
exit_to_user_mode_loop+0x123/0x4b0
do_syscall_64+0x243/0x7c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
---[ end trace 0000000000000000 ]---
BTRFS warning (device dm-2 state EA): unable to release extent buffer 30539776 owner 9 gen 9 refs 2 flags 0x7
BTRFS warning (device dm-2 state EA): unable to release extent buffer 30621696 owner 257 gen 9 refs 2 flags 0x7
BTRFS warning (device dm-2 state EA): unable to release extent buffer 30638080 owner 258 gen 9 refs 2 flags 0x7
BTRFS warning (device dm-2 state EA): unable to release extent buffer 30654464 owner 7 gen 9 refs 2 flags 0x7
BTRFS warning (device dm-2 state EA): unable to release extent buffer 30703616 owner 2 gen 9 refs 2 flags 0x7
BTRFS warning (device dm-2 state EA): unable to release extent buffer 30720000 owner 10 gen 9 refs 2 flags 0x7
BTRFS warning (device dm-2 state EA): unable to release extent buffer 30736384 owner 4 gen 9 refs 2 flags 0x7
BTRFS warning (device dm-2 state EA): unable to release extent buffer 30752768 owner 11 gen 9 refs 2 flags 0x7
I'm using a stripped down version, which seems to trigger the warning
more reliably:
for (( i = 0; i < $runtime; i++ )); do
echo "=== $i/$runtime ==="
workload
done
[CAUSE]
Inside btrfs_write_and_wait_transaction(), we first try to write all
dirty ebs, then wait for them to finish.
After that we call btrfs_extent_io_tree_release() to free all
extent states from dirty_pages io tree.
However if we hit an error from btrfs_write_marked_extent(), then we
still call btrfs_extent_io_tree_release() to clear that dirty_pages io
tree, which may contain dirty records that we haven't yet submitted.
Furthermore, the later transaction cleanup path will utilize that
dirty_pages io tree to properly cleanup those dirty ebs, but since it's
already empty, no dirty ebs are properly cleaned up, thus will later
trigger the warnings inside invalidate_btree_folios().
[FIX]
Normally such dirty ebs won't cause problems, as when the iput() is
called on the btree inode, the dirty ebs will be forcibly written back,
and since the fs is already in an error status, such writeback will not
reach disk and finish immediately.
But it's still better to get rid of such dirty ebs, if we ended up with
dirty ebs but the fs is not in an error status, then such writeback at
iput() time will be too late, as all workers are already stopped but
writeback will utilize workers, which will lead to NULL pointer
dereferences.
Instead of unconditionally calling btrfs_extent_io_tree_release(), only
call it if btrfs_write_and_wait_transaction() finished successfully, so
that @dirty_pages extent io tree is kept untouched for transaction
cleanup.
CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Tue, 28 Apr 2026 15:58:56 +0000 (16:58 +0100)]
btrfs: tracepoints: fix sleep while in atomic context in btrfs_sync_file()
The trace event btrfs_sync_file() is called in an atomic context (all trace
events are) and its call to dput(), which is needed due to the call to
dget_parent(), can sleep, triggering a kernel splat.
This can be reproduced by enabling the trace event and running btrfs/056
from fstests for example. The splat shown in dmesg is the following:
So stop using dget_parent() and dput() and access the parent dentry
directly as dentry->d_parent. This is also what ext4 is doing in
its equivalent trace event ext4_sync_file_enter().
Fixes: a85b46db143f ("btrfs: tracepoints: get correct superblock from dentry in event btrfs_sync_file()") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
(lldb) source list -a add_ra_bio_pages+0x13c
.../vmlinux.unstripped add_ra_bio_pages + 316 at .../fs/btrfs/compression.c:454:8
451
452 folio = filemap_alloc_folio(mapping_gfp_constraint(mapping, constraint_gfp),
453 0, NULL);
-> 454 if (!folio)
455 break;
I can reproduce this consistently by running a memory hog concurrently
with a buffered writer on a machine with a very large amount of swap.
Commit 7ae37b2c94ed ("btrfs: prevent direct reclaim during compressed
readahead") clearly intended to suppress these warnings. But because the
mask set in the address_space with mapping_set_gfp_mask() doesn't include
__GFP_NOWARN, mapping_gfp_constraint() removes it from constraint_gfp
before it is passed to filemap_alloc_folio().
Fix by refactoring the code to add __GFP_NOWARN after the call to
mapping_gfp_constraint().
Fixes: 7ae37b2c94ed ("btrfs: prevent direct reclaim during compressed readahead") Signed-off-by: Calvin Owens <calvin@wbinvd.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
ZhengYuan Huang [Wed, 25 Mar 2026 00:43:39 +0000 (08:43 +0800)]
btrfs: fix check_chunk_block_group_mappings() to iterate all chunk maps
[BUG]
A corrupted image with a chunk present in the chunk tree but whose
corresponding block group item is missing from the extent tree can be
mounted successfully, even though check_chunk_block_group_mappings()
is supposed to catch exactly this corruption at mount time. Once
mounted, running btrfs balance with a usage filter (-dusage=N or
-dusage=min..max) triggers a null-ptr-deref:
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
RIP: 0010:chunk_usage_filter fs/btrfs/volumes.c:3874 [inline]
RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4018 [inline]
RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4172 [inline]
RIP: 0010:btrfs_balance+0x2024/0x42b0 fs/btrfs/volumes.c:4604
[CAUSE]
The crash occurs because __btrfs_balance() iterates the on-disk chunk
tree, finds the orphaned chunk, calls chunk_usage_filter() (or
chunk_usage_range_filter()), which queries the in-memory block group
cache via btrfs_lookup_block_group(). Since no block group was ever
inserted for this chunk, the lookup returns NULL, and the subsequent
dereference of cache->used crashes.
check_chunk_block_group_mappings() uses btrfs_find_chunk_map() to
iterate the in-memory chunk map (fs_info->mapping_tree):
map = btrfs_find_chunk_map(fs_info, start, 1);
With @start = 0 and @length = 1, btrfs_find_chunk_map() looks for a
chunk map that *contains* the logical address 0. If no chunk contains
logical address 0, btrfs_find_chunk_map(fs_info, 0, 1) returns NULL
immediately and the loop breaks after the very first iteration,
having checked zero chunks. The entire verification function is therefore
a no-op, and the corrupted image passes the mount-time check undetected.
[FIX]
Replace the btrfs_find_chunk_map() based loop with a direct in-order
walk of fs_info->mapping_tree using rb_first_cached() + rb_next().
This guarantees that every chunk map in the tree is visited regardless
of the logical addresses involved.
No lock is taken around the traversal. This function is called during
mount from btrfs_read_block_groups(), which is invoked from open_ctree()
before any background threads (cleaner, transaction kthread, etc.) are
started. There are therefore no concurrent writers that could modify
mapping_tree at this point. An analogous lockless direct traversal of
mapping_tree already exists in fill_dummy_bgs() in the same file.
Since we walk the rb-tree directly via rb_entry() without going through
btrfs_find_chunk_map(), no reference is taken on each map entry, so the
btrfs_free_chunk_map() calls are also removed.
Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Tejun Heo [Thu, 7 May 2026 22:09:21 +0000 (12:09 -1000)]
sched_ext: Drop unused scx_find_sub_sched() stub
scx_find_sub_sched()'s only caller, scx_bpf_sub_dispatch(), is gated on
CONFIG_EXT_SUB_SCHED. When CONFIG_EXT_SUB_SCHED=n the caller compiles out
and the stub becomes dead code, tripping -Wunused-function on randconfigs.
Drop the stub.
Fixes: 25037af712eb ("sched_ext: Add rhashtable lookup for sub-schedulers") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/all/202605080556.42PXw8U9-lkp@intel.com/ Signed-off-by: Tejun Heo <tj@kernel.org>
Tristan Madani [Tue, 5 May 2026 11:12:59 +0000 (11:12 +0000)]
hfs/hfsplus: zero-initialize buffer in hfs_bnode_read
hfs_bnode_read() can return early without writing to the output buffer
when is_bnode_offset_valid() fails or when check_and_correct_requested_
length() corrects the length to zero. Callers such as hfs_bnode_read_
u16() and hfs_bnode_read_u8() pass stack-allocated buffers and use the
result unconditionally, leading to KMSAN uninit-value reports.
Rather than initializing at each individual call site, zero the buffer
at the start of hfs_bnode_read() before any validation checks. This
ensures all callers in both hfs and hfsplus get a deterministic zero
value regardless of which early-return path is taken.
Tristan Madani [Tue, 5 May 2026 11:12:58 +0000 (11:12 +0000)]
hfs/hfsplus: fix u32 overflow in check_and_correct_requested_length
check_and_correct_requested_length() compares (off + len) against
node_size using u32 arithmetic. When the caller passes a large len
value (e.g. from an underflowed subtraction in hfs_brec_remove()),
off + len can wrap past 2^32 and produce a small result, causing the
bounds check to pass when it should fail.
For example, with off=14 and len=0xFFFFFFF2 (underflowed from
data_off - keyoffset - size in hfs_brec_remove), off + len wraps to 6,
which is less than a typical node_size of 512, so the check passes and
the subsequent memmove reads ~4GB past the node buffer.
Fix this by widening the addition to u64 before comparing against
node_size. This prevents the u32 wrap while keeping the logic
straightforward.
Chen Wandun [Thu, 7 May 2026 10:54:34 +0000 (18:54 +0800)]
cgroup/cpuset: move PF_EXITING check before __GFP_HARDWALL in cpuset_current_node_allowed()
Since prepare_alloc_pages() unconditionally adds __GFP_HARDWALL for the
fast path when cpusets are enabled, the __GFP_HARDWALL check in
cpuset_current_node_allowed() causes the PF_EXITING escape path to be
skipped on the first allocation attempt. This makes it unreachable in
the common case, so dying tasks can get stuck in direct reclaim or even
trigger OOM while trying to exit, despite being allowed to allocate from
any node.
Move the PF_EXITING check before __GFP_HARDWALL so that dying tasks
can allocate memory from any node to exit quickly, even when cpusets
are enabled.
Also update the function comment to reflect the actual behavior of
prepare_alloc_pages() and the corrected check ordering.
Signed-off-by: Chen Wandun <chenwandun@lixiang.com> Acked-by: Michal Koutný <mkoutny@suse.com> Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Thu, 7 May 2026 21:05:31 +0000 (11:05 -1000)]
sched_ext: Move scx_error() out of scx_link_sched()'s lock region
scx_link_sched() holds scx_sched_lock. The scx_error() calls inside take the
same lock through scx_claim_exit() and deadlock. Move them out of the guard.
Fixes: 6b4576b09714 ("sched_ext: Reject sub-sched attachment to a disabled parent") Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
smb: client: validate dacloffset before building DACL pointers
parse_sec_desc(), build_sec_desc(), and the chown path in
id_mode_to_cifs_acl() all add the server-supplied dacloffset to pntsd
before proving a DACL header fits inside the returned security
descriptor.
On 32-bit builds a malicious server can return dacloffset near
U32_MAX, wrap the derived DACL pointer below end_of_acl, and then slip
past the later pointer-based bounds checks. build_sec_desc() and
id_mode_to_cifs_acl() can then dereference DACL fields from the wrapped
pointer in the chmod/chown rewrite paths.
Validate dacloffset numerically before building any DACL pointer and
reuse the same helper at the three DACL entry points.
Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Zisen Ye [Wed, 6 May 2026 03:49:08 +0000 (11:49 +0800)]
smb/client: fix out-of-bounds read in smb2_compound_op()
If a server sends a truncated response but a large OutputBufferLength, and
terminates the EA list early, check_wsl_eas() returns success without
validating that the entire OutputBufferLength fits within iov_len.
Then smb2_compound_op() does:
memcpy(idata->wsl.eas, data[0], size[0]);
Where size[0] is OutputBufferLength. If iov_len is smaller than size[0],
memcpy can read beyond the end of the rsp_iov allocation and leak adjacent
kernel heap memory.
Zisen Ye [Sat, 2 May 2026 10:48:36 +0000 (18:48 +0800)]
smb/client: fix out-of-bounds read in symlink_data()
Since smb2_check_message() returns success without length validation for
the symlink error response, in symlink_data() it is possible for
iov->iov_len to be smaller than sizeof(struct smb2_err_rsp). If the buffer
only contains the base SMB2 header (64 bytes), accessing
err->ErrorContextCount (at offset 66) or err->ByteCount later in
symlink_data() will cause an out-of-bounds read.
Piyush Sachdeva [Thu, 7 May 2026 16:52:14 +0000 (22:22 +0530)]
smb: client: Zero-pad short GSS session keys per MS-SMB2
Per MS-SMB2 section 3.2.5.3, Session.SessionKey is the first 16 bytes
of the GSS cryptographic key, right-padded with zero bytes if the key
is shorter than 16 bytes.
SMB2_auth_kerberos() copies the GSS session key from the cifs.upcall
response using kmemdup(msg->data, msg->sesskey_len, ...) and stores
the GSS-reported length verbatim in ses->auth_key.len. generate_key()
reads SMB2_NTLMV2_SESSKEY_SIZE bytes from this buffer when feeding the
HMAC-SHA256 KDF for signing key derivation. If a GSS mechanism returns
a session key shorter than 16 bytes (e.g. a deprecated single-DES
Kerberos enctype with an 8-byte session key), the KDF call performs an
out-of-bounds slab read and derives keys that do not match the server,
which pads per the spec.
Modern KDCs disable short-key enctypes by default, so this is latent
rather than reachable in production, but it is still a kernel heap
over-read.
Allocate auth_key.response with kzalloc() at a length of
max(msg->sesskey_len, SMB2_NTLMV2_SESSKEY_SIZE), copy the GSS key in,
and rely on kzalloc()'s zero initialization for the spec-mandated
padding. Set ses->auth_key.len to the padded length. Larger GSS keys
(e.g. the 32-byte aes256-cts-hmac-sha1-96 session key) continue to be
stored at their natural length, preserving the FullSessionKey path.
Emit a cifs_dbg(VFS, ...) message when a short key is encountered to
surface deprecated-enctype usage.
NTLMv2 and NTLMSSP code paths produce a 16-byte session key by
construction and are unaffected.
Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com> Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Piyush Sachdeva [Thu, 7 May 2026 16:52:13 +0000 (22:22 +0530)]
smb: client: Use FullSessionKey for AES-256 encryption key derivation
When Kerberos authentication is used with AES-256 encryption (AES-256-CCM
or AES-256-GCM), the SMB3 encryption and decryption keys must be derived
using the full session key (Session.FullSessionKey) rather than just the
first 16 bytes (Session.SessionKey).
Per MS-SMB2 section 3.2.5.3.1, when Connection.Dialect is "3.1.1" and
Connection.CipherId is AES-256-CCM or AES-256-GCM, Session.FullSessionKey
must be set to the full cryptographic key from the GSS authentication
context. The encryption and decryption key derivation (SMBC2SCipherKey,
SMBS2CCipherKey) must use this FullSessionKey as the KDF input. The
signing key derivation continues to use Session.SessionKey (first 16
bytes) in all cases.
Previously, generate_key() hardcoded SMB2_NTLMV2_SESSKEY_SIZE (16) as the
HMAC-SHA256 key input length for all derivations. When Kerberos with
AES-256 provides a 32-byte session key, the KDF for encryption/decryption
was using only the first 16 bytes, producing keys that did not match the
server's, causing mount failures with sec=krb5 and require_gcm_256=1.
Add a full_key_size parameter to generate_key() and pass the appropriate
size from generate_smb3signingkey():
- Signing: always SMB2_NTLMV2_SESSKEY_SIZE (16 bytes)
- Encryption/Decryption: ses->auth_key.len when AES-256, otherwise 16
Also fix cifs_dump_full_key() to report the actual session key length for
AES-256 instead of hardcoded CIFS_SESS_KEY_SIZE, so that userspace tools
like Wireshark receive the correct key for decryption.
Cc: <stable@vger.kernel.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com> Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Dmitry Torokhov [Mon, 4 May 2026 18:54:46 +0000 (11:54 -0700)]
Input: atmel_mxt_ts - check mem_size before calculating config memory size
In mxt_update_cfg(), the driver calculates the memory size needed to store
the configuration as data->mem_size - cfg.start_ofs. If data->mem_size is
less than or equal to cfg.start_ofs, this calculation will underflow or
result in a zero-size buffer, neither of which is valid for a configuration
update.
Add a check to return -EINVAL if data->mem_size is too small. While at it,
change the types of start_ofs and mem_size in struct mxt_cfg to u16 to
match the device address space.
Dmitry Torokhov [Mon, 4 May 2026 18:54:45 +0000 (11:54 -0700)]
Input: atmel_mxt_ts - fix boundary check in mxt_prepare_cfg_mem
When a configuration file provides an object size that is larger than the
driver's known mxt_obj_size(object), the driver intends to discard the
extra bytes.
The loop iterates using for (i = 0; i < size; i++). Inside the loop, the
condition to skip processing extra bytes is:
if (i > mxt_obj_size(object))
continue;
Since i is a 0-based index, the valid indices for the object are 0 through
mxt_obj_size(object) - 1.
When i == mxt_obj_size(object), the condition evaluates to false, and the
code processes the byte instead of discarding it.
This causes the code to calculate byte_offset = reg + i - cfg->start_ofs
and writes the byte there, overwriting exactly one byte of the adjacent
instance or object.
Update the boundary check to skip extra bytes correctly by using >=.
Input: fm801-gp - simplify initialisation of pci_device_id array
Instead of assigning the pci_device_id members using a list (which is
hard to read as you need to look at the order of the members in that
struct in parallel) use the PCI_VDEVICE() convenience macro to compact
the initialisation while improving readability.
Also drop trailing zeros that the compiler will care about then.
The change doesn't introduce binary changes to the compiled driver,
verified on both ARCH=x86 and ARCH=arm64.
Daniel Machon [Wed, 6 May 2026 07:25:39 +0000 (09:25 +0200)]
net: sparx5: configure serdes for 1000BASE-X in sparx5_port_init()
sparx5_port_init() only invokes sparx5_serdes_set() and the associated
shadow-device enable and low-speed device switch for SGMII and QSGMII.
On any port with a high-speed primary device (DEV5G/DEV10G/DEV25G)
configured for 1000BASE-X the serdes is therefore left uninitialized,
the DEV2G5 shadow is never enabled, and the port stays pointed at its
high-speed device rather than the DEV2G5. The PCS1G block looks
healthy in isolation, but no frames reach the link partner.
Add 1000BASE-X to the check so the same three steps run.
Note: the same issue might apply to 2500BASE-X, but that will,
eventually, be addressed in a separate commit.
The value read back from the chip is GCB_CHIP_ID_PART_ID, which is a
GENMASK(27, 12) field, i.e. at most 16 bits wide. It can never match
these IDs, so probing a TSN part fails with a "Target not supported"
error.
Fix the enum to use the actual 16-bit part IDs returned by the
hardware: 0x0546, 0x0549, 0x0552, 0x0556 and 0x0558.
Linus Torvalds [Thu, 7 May 2026 15:55:15 +0000 (08:55 -0700)]
Merge tag 'sound-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Again a collection of small fixes, mostly for device-specific ones.
The only big LOC is about the removal of pretty old dead code in
ab8500 codec driver, while the rest all nice small changes.
Core / API:
- Fix race in deferred fasync state checks
- Fix UMP group filtering in sequencer
ASoC:
- cs35l56: fixes for driver cleanup and error paths
- tas2764/2770: workaround for bogus temperature readings
- wm_adsp: fixes for firmware unit tests
- amd-yc: more DMI quirks for laptops
- Minor fixes for fsl_xcvr and spacemit
HD-Audio:
- Mute LED and speaker quirks for HP, Lenovo, and Xiaomi laptops
USB-audio:
- New device-specific quirks (Motu, JBL, AlphaTheta, Razer)
- Fix of MIDI2 playback on resume
Others:
- Firewire-tascam control event fix
- Minor cleanups and fixes for sparc/dbri and pcmtest"
* tag 'sound-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (28 commits)
ASoC: cs35l56: Destroy workqueue in probe error path
ASoC: cs35l56: Don't use devres to unregister component
ALSA: sparc/dbri: add missing fallthrough
ALSA: core: Serialize deferred fasync state checks
ALSA: hda/realtek: Add mute LED fixup for HP Pavilion 15-cs1xxx
ALSA: seq: Fix UMP group 16 filtering
ASoC: wm_adsp_fw_find_test: Clear searched_fw_files in find-by-index test
ASoC: wm_adsp_fw_find_test: Redirect wm_adsp_release_firmware_files()
ASoC: tas2770: Deal with bogus initial temperature value
ASoC: tas2764: Deal with bogus initial temperature register value
ALSA: usb-audio: add clock quirk for Motu 1248
ALSA: usb-audio: midi2: Restart output URBs on resume
ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP Envy X360 15-fh0xxx
ALSA: usb-audio: Add quirk flags for JBL Pebbles
ALSA: firewire-tascam: Do not drop unread control events
ALSA: usb-audio: Add quirk flags for AlphaTheta EUPHONIA
ASoC: fsl_xcvr: Fix event generation for cached controls
ASoC: sdw_utils: avoid the SDCA companion function not supported failure
ASoC: amd: yc: Add HP OMEN Gaming Laptop 16-ap0xxx product line in quirk table
ASoC: cs35l56: Fix out-of-bounds in dev_err() in cs35l56_read_onchip_spkid()
...
Peng Fan [Mon, 27 Apr 2026 01:01:48 +0000 (09:01 +0800)]
soc: imx8m: Fix match data lookup for soc device
The i.MX8M soc device is registered via platform_device_register_simple(),
so it is not associated with a Device Tree node and the imx8m_soc_driver
has no of_match_table.
As a result, device_get_match_data() always returns NULL when probing
the soc device.
Retrieve the match data directly from the machine compatible using
of_machine_get_match_data(imx8_soc_match), which provides the correct SoC
data.
Fixes: 2524b293a59e5 ("soc: imx8m: don't access of_root directly") Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Linus Torvalds [Thu, 7 May 2026 15:43:25 +0000 (08:43 -0700)]
Merge tag 'pmdomain-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
- Fix detach procedure for virtual devices in genpd
- mediatek: Fix use-after-free in scpsys_get_bus_protection_legacy()
* tag 'pmdomain-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: mediatek: fix use-after-free in scpsys_get_bus_protection_legacy()
pmdomain: core: Fix detach procedure for virtual devices in genpd
Joey Lu [Wed, 6 May 2026 08:46:13 +0000 (16:46 +0800)]
net: stmmac: dwmac-nuvoton: fix NULL pointer dereference in nvt_set_phy_intf_sel()
priv->dev was never initialized after devm_kzalloc() allocates the
private data structure. When nvt_set_phy_intf_sel() is later invoked
via the phylink interface_select callback, it calls
nvt_gmac_get_delay(priv->dev, ...) which dereferences the NULL pointer.
Fix this by assigning priv->dev = dev immediately after allocation.
If a socket is bound to a wildcard address, tcp_v[46]_connect()
updates it with a non-wildcard address based on the route lookup.
After bhash2 was introduced in the cited commit, we must call
inet_bhash2_update_saddr() to update the bhash2 entry as well.
If inet_bhash2_update_saddr() fails, we must release the refcount
for dst by ip_route_connect() or ip6_dst_lookup_flow().
While tcp_v4_connect() calls ip_rt_put() in the error path,
tcp_v6_connect() does not call dst_release().
Let's call dst_release() when inet_bhash2_update_saddr() fails
in tcp_v6_connect().
Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address") Reported-by: Damiano Melotti <melotti@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260506070443.1699879-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Chen [Tue, 5 May 2026 17:39:26 +0000 (10:39 -0700)]
net: phy: broadcom: Save PHY counters during suspend
The PHY counters can be lost if the PHY is reset during suspend. We
need to save the values into the shadow counters or the accounting
will be incorrect over multiple suspend and resume cycles.
Fixes: 820ee17b8d3b ("net: phy: broadcom: Add support code for reading PHY counters") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20260505173926.2870069-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
D. Wythe [Wed, 6 May 2026 01:41:05 +0000 (09:41 +0800)]
net/smc: fix missing sk_err when TCP handshake fails
In smc_connect_work(), when the underlying TCP handshake fails, the error
code (rc) must be propagated to sk_err to ensure userspace can correctly
retrieve the error status via SO_ERROR. Currently, the code only handles
a restricted set of error codes (e.g., EPIPE, ECONNREFUSED). If other
errors occurs, such as EHOSTUNREACH, sk_err remains unset (zero).
This affects applications that rely on SO_ERROR to determine connect
outcome. For example, higher versions of Go's netpoller treats
SO_ERROR == 0 combined with a failed getpeername() as a spurious wakeup
and re-enters epoll_wait(). Under ET mode, no further edge will be
generated since the socket is already in a terminal state, causing the
connect to hang indefinitely or until a user-specified timeout, if one
is set.
Jiexun Wang [Wed, 6 May 2026 14:08:23 +0000 (22:08 +0800)]
af_unix: Reject SIOCATMARK on non-stream sockets
SIOCATMARK reports whether the receive queue is at the urgent mark for
MSG_OOB.
In AF_UNIX, MSG_OOB is supported only for SOCK_STREAM sockets.
SOCK_DGRAM and SOCK_SEQPACKET reject MSG_OOB in sendmsg() and recvmsg(),
so they should not support SIOCATMARK either.
Return -EOPNOTSUPP for non-stream sockets before checking the receive
queue.
Fixes: 314001f0bf92 ("af_unix: Add OOB support") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Suggested-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260506140825.2987635-1-n05ec@lzu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The top-level MFD driver now passes the device properties to the GPIO
cell via the software node. Use generic device property accessors and
stop using platform data. We can ignore the "ngpios" property here now
as it will be retrieved internally by GPIO core.
mfd: timberdale: Set up a software node for the GPIO cell
Using generic device properties instead of custom platform data
structures is preferred due to the resulting unification of the way
properties are accessed in consumer drivers. There's no DT node for the
GPIO cell in this driver but we can create a software node with device
properties and attach it to all the GPIO cells.
Shuhao Fu [Tue, 28 Apr 2026 08:12:38 +0000 (16:12 +0800)]
ALSA: hda: cs35l41: Put ACPI device on missing physical node
acpi_dev_get_first_match_dev() returns a refcounted ACPI device and
callers must balance it with acpi_dev_put().
cs35l41_hda_read_acpi() stores the returned ACPI device in
cs35l41->dacpi. That reference is normally released by the later
probe cleanup or the remove path, but the NULL-check on
physdev exits before either of those paths can run.
Drop the lookup reference before returning -ENODEV.
Fixes: c34b04cc6178 ("ALSA: hda: cs35l41: Fix NULL pointer dereference in cs35l41_hda_read_acpi()") Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Tested-by: Simon Trimmer <simont@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20260428081238.GA1659932@chcpu16
Shuhao Fu [Tue, 28 Apr 2026 08:01:39 +0000 (16:01 +0800)]
ALSA: hda: cs35l56: Put ACPI device after setting companion
acpi_dev_get_first_match_dev() returns a refcounted ACPI device and
callers are expected to balance it with acpi_dev_put().
When no companion is already attached, cs35l56_hda_read_acpi() looks
up an ACPI device and sets it with ACPI_COMPANION_SET(), but leaves
the lookup reference held.
ACPI_COMPANION_SET() does not take ownership of that reference, so
drop it with acpi_dev_put() after attaching the companion.
Fixes: 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier") Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Tested-by: Simon Trimmer <simont@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20260428080139.GA1649104@chcpu16
3b497c3f4f04 ("fs/resctrl: Introduce the interface to display monitoring modes")
introduced CONFIG_RESCTRL_ASSIGN_FIXED but left adding the Kconfig
entry until it was necessary. The counter assignment mode is fixed in
MPAM, even when there are assignable counters, and so addressing this
is needed to support MPAM.
To avoid the burden of another Kconfig entry, replace
CONFIG_RESCTRL_ASSIGN_FIXED with a new property in 'struct resctrl_mon',
'mbm_cntr_assign_fixed' to be set by the architecture.
Do not request the architecture to change the counter assignment mode if it
does not support doing so. Provide insight to user space about why such a
request fails.
veth: fix OOB txq access in veth_poll() with asymmetric queue counts
XDP redirect into a veth device (via bpf_redirect()) calls
veth_xdp_xmit(), which enqueues frames into the peer's ptr_ring using
smp_processor_id() % peer->real_num_rx_queues
as the ring index. With an asymmetric veth pair where the peer has
fewer TX queues than RX queues, that index can exceed
peer->real_num_tx_queues.
veth_poll() then resolves peer_txq for the ring via:
where queue_idx = rq->xdp_rxq.queue_index. When queue_idx exceeds
peer_dev->real_num_tx_queues this is an out-of-bounds (OOB) access
into the peer's netdev_queue array, triggering DEBUG_NET_WARN_ON_ONCE
in netdev_get_tx_queue().
The normal ndo_start_xmit path is not affected: the stack clamps
skb->queue_mapping via netdev_cap_txqueue() before invoking
ndo_start_xmit, so rxq in veth_xmit() never exceeds real_num_tx_queues.
Fix veth_poll() by clamping: only dereference peer_txq when queue_idx is
within bounds, otherwise set it to NULL. The out-of-range rings are fed
exclusively via XDP redirect (veth_xdp_xmit), never via ndo_start_xmit
(veth_xmit), so the peer txq was never stopped and there is nothing to
wake; NULL is the correct fallback.
Reported-by: Sashiko <sashiko-bot@kernel.org> Closes: https://lore.kernel.org/all/20260502071828.616C3C19425@smtp.kernel.org/ Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/20260505132159.241305-2-hawk@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
drm/panfrost: Fix wait_bo ioctl leaking positive return from dma_resv_wait_timeout()
dma_resv_wait_timeout() returns a positive 'remaining jiffies' value
on success, 0 on timeout, and -errno on failure.
panfrost_ioctl_wait_bo() returns this 'long' result from an int-typed
ioctl handler, so positive values reach userspace as bogus errors.
Explicitly set ret to 0 on the success path.
accel/rocket: Fix prep_bo ioctl leaking positive return from dma_resv_wait_timeout()
dma_resv_wait_timeout() returns a positive 'remaining jiffies' value
on success, 0 on timeout, and -errno on failure.
rocket_ioctl_prep_bo() returns this 'long' result from an int-typed
ioctl handler, so positive values reach userspace as bogus errors.
Explicitly set ret to 0 on the success path.
Fuad Tabba [Fri, 1 May 2026 11:21:49 +0000 (12:21 +0100)]
KVM: arm64: Pre-check vcpu memcache for host->guest donate
__pkvm_host_donate_guest() flips the host stage-2 PTE for the
donated page to a non-valid annotation via
host_stage2_set_owner_metadata_locked() and then calls
kvm_pgtable_stage2_map() to install the matching guest stage-2
mapping. The map's return value is wrapped in WARN_ON() and
otherwise discarded, asserting that the call cannot fail.
WARN_ON() at nVHE EL2 panics, so this assertion is only correct
if the call genuinely cannot fail. kvm_pgtable_stage2_map() can
fail with -ENOMEM even at PAGE_SIZE granularity: the donate path
verifies PKVM_NOPAGE for the guest IPA before the map, so the
walker must allocate fresh page-table pages from the vcpu
memcache, and the host controls the vcpu memcache via the topup
interface. An under-provisioned donation request would otherwise
turn a recoverable -ENOMEM into a fatal hyp panic.
Bound the worst-case walker allocation alongside the existing
__host_check_page_state_range() / __guest_check_page_state_range()
pre-checks, using the helper introduced for host->guest share. If
the vcpu memcache holds fewer pages than kvm_mmu_cache_min_pages(),
return -ENOMEM before any state mutation.
Fuad Tabba [Fri, 1 May 2026 11:21:48 +0000 (12:21 +0100)]
KVM: arm64: Pre-check vcpu memcache for host->guest share
__pkvm_host_share_guest() ends with kvm_pgtable_stage2_map() to
install the guest stage-2 mapping, after a forward pass that mutates
the host vmemmap (sets PKVM_PAGE_SHARED_OWNED and increments
host_share_guest_count) for every page in the range. The map's
return value is wrapped in WARN_ON() and otherwise discarded,
asserting that the call cannot fail.
WARN_ON() at nVHE EL2 panics, so this assertion is only correct if
the call genuinely cannot fail. kvm_pgtable_stage2_map() can fail
with -ENOMEM when the stage-2 walker exhausts the caller's
memcache, and the host controls the vcpu memcache via the topup
interface, so an under-provisioned share request would otherwise
turn a recoverable -ENOMEM into a fatal hyp panic.
Bound the worst-case walker allocation in the existing pre-check
pass so that kvm_pgtable_stage2_map() cannot fail at the call
site, using kvm_mmu_cache_min_pages() -- the same bound host EL1
uses for its own stage-2 maps. If the vcpu memcache holds fewer
pages, return -ENOMEM before any state mutation.
The hypercall handlers call pkvm_refill_memcache() to top up the
hyp_vcpu memcache before invoking __pkvm_host_{share,donate}_guest().
pkvm_ownership_selftest invokes those functions directly with a
static selftest_vcpu that has an empty memcache.
Seed selftest_vcpu's memcache from the prepopulated selftest
pages, leaving the remainder for selftest_vm.pool. Required by
the memcache-sufficiency pre-check added in the following
patches.
__deactivate_fgt() declares its first parameter as "htcxt" but the body
references "hctxt". The parameter is unused; the macro silently captures
"hctxt" from the enclosing scope. Both existing callers
(__deactivate_traps_hfgxtr() and __deactivate_traps_ich_hfgxtr()) happen
to define a local "struct kvm_cpu_context *hctxt", so the macro works
by coincidence.
A future caller without an "hctxt" local in scope, or naming it
differently, would compile but bind to the wrong context. Align the
parameter name with the sibling __activate_fgt() macro.
The "vcpu" parameter remains unused in the body, kept for API symmetry
with __activate_fgt() (which uses it).
Fuad Tabba [Fri, 1 May 2026 11:21:45 +0000 (12:21 +0100)]
KVM: arm64: Guard against NULL vcpu on VHE hyp panic path
On VHE, __hyp_call_panic() unconditionally calls __deactivate_traps(vcpu)
on the vcpu pointer read from host_ctxt->__hyp_running_vcpu. That pointer
is cleared after every guest exit (and is never set when no guest is
running), so an unexpected EL2 exception landing in _guest_exit_panic,
e.g. via the el2t*_invalid / el2h_irq_invalid vectors - reaches this
function with vcpu == NULL. __deactivate_traps() then dereferences vcpu
via ___deactivate_traps() -> vserror_state_is_nested() -> vcpu_has_nv()
-> vcpu->arch.features, faulting inside the panic handler and obscuring
the original failure.
The nVHE counterpart (hyp_panic() in arch/arm64/kvm/hyp/nvhe/switch.c)
already guards its vcpu-using cleanup with "if (vcpu)"; mirror that
here. sysreg_restore_host_state_vhe() does not depend on vcpu and
continues to run unconditionally, preserving panic forensics. The
trailing panic("...VCPU:%p", vcpu) prints "(null)" safely via printk's
%p handling.
Fuad Tabba [Fri, 1 May 2026 11:21:44 +0000 (12:21 +0100)]
KVM: arm64: Make EL2 exception entry and exit context-synchronization events
SCTLR_EL2.EIS and SCTLR_EL2.EOS control whether exception entry and
exit at EL2 are Context Synchronisation Events (CSEs). Per ARM DDI
0487 M.b D24.2.175 (p. D24-9754):
- !FEAT_ExS: the bit is RES1, so the entry/exit is unconditionally
a CSE.
- FEAT_ExS: the reset value is architecturally UNKNOWN; software
must set the bit to make the entry/exit a CSE.
INIT_SCTLR_EL2_MMU_ON in arch/arm64/include/asm/sysreg.h sets neither
bit. KVM/arm64 hot paths rely on ERET from EL2 being a CSE, and on
synchronous EL1->EL2 entry being a CSE, to elide explicit ISBs after
MSRs to context-switching system registers (HCR_EL2, ZCR_EL2,
ptrauth keys, etc.). On FEAT_ExS hardware those reliances are not
architecturally backed unless EOS=1 (and, for entry, EIS=1).
Until commit 0a35bd285f43 ("arm64: Convert SCTLR_EL2 to sysreg
infrastructure"), SCTLR_EL2_RES1 was a hand-rolled mask that
included BIT(11) (EOS) and BIT(22) (EIS), so INIT_SCTLR_EL2_MMU_ON
was setting both unconditionally. The conversion made
SCTLR_EL2_RES1 auto-generated; because the sysreg tooling only
models unconditionally-RES1 fields and EIS/EOS are RES1 only when
FEAT_ExS is absent, the auto-generated mask is UL(0). The seven
other bits dropped from the old mask (positions 4, 5, 16, 18, 23,
28, 29) are unconditionally RES1 in the E2H=0 SCTLR_EL2 layout per
DDI 0487 M.b D24.2.175, so dropping them is harmless. EIS and EOS
are the only bits whose semantics changed for FEAT_ExS hardware
and where the kernel relies on the value being 1.
Make the guarantee explicit: include SCTLR_ELx_EIS | SCTLR_ELx_EOS in
INIT_SCTLR_EL2_MMU_ON so that EL2 exception entry and exit are
unconditionally CSEs regardless of whether FEAT_ExS is implemented.
This matches the pairing in arch/arm64/kvm/config.c which treats EIS
and EOS together as RES1 under !FEAT_ExS.
Fixes: 0a35bd285f43 ("arm64: Convert SCTLR_EL2 to sysreg infrastructure") Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com> Assisted-by: Gemini:gemini-3.1-pro review-prompts Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260501112149.2824881-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
This is caused by accessing plr->dbgfs_dir after vsec_tpmi driver is
removed. Here vsec_tpmi driver is the parent. On unbind, the parent
device remove callback is called first which here will remove debugfs
interface. Hence plr->dbgfs_dir is no longer valid.
Register notifier for TPMI_CORE_EXIT and make this pointer to NULL,
so that debugfs_remove_recursive() is not called with bad plr->dbgfs_dir
pointer.
After notifier is returned the vsec_tpmi driver will call remove debugfs
by calling debugfs_remove_recursive().
Fixes: 811f67c51636 ("platform/x86/intel/tpmi: Add new auxiliary driver for performance limits") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stable@vger.kernel.org Link: https://patch.msgid.link/20260430151103.1549733-4-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
In some cases a driver using services of vsec_tpmi driver requires some
processing before vsec_tpmi exits. For example a children using debugfs
can't use debugfs as this will be deleted by the vsec_tpmi driver.
This is the case when unbind using PCI driver interface. In this case
the remove callback of vsec_tpmi driver is called first, then remove
callback of its children.
Add support of blocking chain notifiers support. Notify on successful probe
and before clean up in the remove callback.
Fixes: 811f67c51636 ("platform/x86/intel/tpmi: Add new auxiliary driver for performance limits") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stable@vger.kernel.org Link: https://patch.msgid.link/20260430151103.1549733-3-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
platform/x86: intel: Move debugfs register before creating devices
It is possible that the driver handling device is enumerated before
registering debugfs. If the driver wants to access debugfs by calling
tpmi_get_debugfs_dir(), this will return error in this case.
Hence register debugfs before creating devices.
Fixes: 811f67c51636 ("platform/x86/intel/tpmi: Add new auxiliary driver for performance limits") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stable@vger.kernel.org Link: https://patch.msgid.link/20260430151103.1549733-2-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Konrad Dybcio [Wed, 31 Dec 2025 16:21:26 +0000 (21:51 +0530)]
firmware: psci: Set pm_set_resume/suspend_via_firmware() for SYSTEM_SUSPEND
PSCI specification defines the SYSTEM_SUSPEND feature which enables the
firmware to implement the suspend to RAM (S2RAM) functionality by
transitioning the system to a deeper low power state. When the system
enters such state, the power to the peripheral devices might be removed. So
the respective device drivers must prepare for the possible removal of the
power by performing actions such as shutting down or resetting the device
in their system suspend callbacks.
The Linux PM framework allows the platform drivers to convey this info to
device drivers by calling the pm_set_suspend_via_firmware() and
pm_set_resume_via_firmware() APIs.
Hence, if the PSCI firmware supports SYSTEM_SUSPEND feature, call the above
mentioned APIs in the psci_system_suspend_begin() and
psci_system_suspend_enter() callbacks.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
[mani: reworded the description to be more elaborative] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Thu, 7 May 2026 12:06:53 +0000 (14:06 +0200)]
Merge tag 'ffa-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm FF-A fixes for v7.1
The updates fix several robustness issues in the FF-A bus and core driver,
mostly around notification handling, RxTx buffer sizing, firmware-provided
bounds, and cleanup paths.
The notification fixes make per-CPU notification work truly per-CPU, avoid
running per-vCPU notification handling from IPI context, keep RX buffer
release serialized under rx_lock, validate framework notification payload
layout before copying from the shared RX buffer, and avoid dereferencing
notifier entries after they may have been unregistered.
The remaining fixes reject FF-A drivers without an ID table at registration
time, avoid freeing an unallocated RX buffer after allocation failure,
unregister the FF-A v1.0 bus notifier on teardown, bound register-based
PARTITION_INFO_GET descriptor copies, align the stored RxTx buffer size with
the size mapped to firmware, and fix sched-recv callback partition lookup on
the circular partition list.
* tag 'ffa-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_ffa: Fix sched-recv callback partition lookup
firmware: arm_ffa: Snapshot notifier callbacks under lock
firmware: arm_ffa: Align RxTx buffer size before mapping
firmware: arm_ffa: Validate framework notification message layout
firmware: arm_ffa: Keep framework RX release under lock
firmware: arm_ffa: Bound PARTITION_INFO_GET_REGS copies
firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0
firmware: arm_ffa: Fix per-vcpu self notifications handling in workqueue
firmware: arm_ffa: Avoid collapsing NPI work from different CPUs
firmware: arm_ffa: Skip free_pages on RX buffer alloc failure
firmware: arm_ffa: Check for NULL FF-A ID table while driver registration
ARM: realtek: MAINTAINERS: Include pin controller drivers
No dedicated maintainers are shown for Realtek SoC pin controllers,
except pinctrl subsystem maintainer, which means reduced review and
impression of abandoned drivers. Pin controller drivers are essential
part of an SoC, so in case of lack of dedicated entry at least cover it
by the SoC platform maintainers.
Acked-by: Yu-Chun Lin <eleanor.lin@realtek.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Yu-Chun Lin <eleanor.lin@realtek.com> Link: https://lore.kernel.org/r/20260505105838.1014771-2-eleanor.lin@realtek.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Guenter Roeck [Tue, 5 May 2026 19:15:37 +0000 (21:15 +0200)]
ARM: integrator: Fix early initialization
Starting with commit bdb249fce9ad4 ("ARM: integrator: read counter using
syscon/regmap"), intcp_init_early calls syscon_regmap_lookup_by_compatible
which in turn calls of_syscon_register. This function allocates memory.
Since the memory management code has not been initialized at that time,
the call always fails. It either returns -ENOMEM or crashes as follows.
Unable to handle kernel NULL pointer dereference at virtual address 0000000c when read
[0000000c] *pgd=00000000
Internal error: Oops: 5 [#1] ARM
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc5-00026-g5fcc9bf84ee5 #1 PREEMPT
Hardware name: ARM Integrator/CP (Device Tree)
PC is at __kmalloc_cache_noprof+0xec/0x39c
LR is at __kmalloc_cache_noprof+0x34/0x39c
...
Call trace:
__kmalloc_cache_noprof from of_syscon_register+0x7c/0x310
of_syscon_register from device_node_get_regmap+0xa4/0xb0
device_node_get_regmap from intcp_init_early+0xc/0x40
intcp_init_early from start_kernel+0x60/0x688
start_kernel from 0x0
The crash is seen due to a dereferenced pointer which is not supposed to be
NULL but is NULL if the memory management subsystem has not been
initialized. The crash is not seen with all versions of gcc. Some versions
such as gcc 9.x apparently do not dereference the pointer, presumably if
tracing is disabled. The problem has been reproduced with gcc 10.x, 11.x,
and 13.x. Either case, if the crash is not seen, the call to
syscon_regmap_lookup_by_compatible returns -ENOMEM, and
sched_clock_register is never called.
Fix the problem by moving the early initialization code into the standard
machine initialization code.
Arnd Bergmann [Thu, 7 May 2026 12:00:11 +0000 (14:00 +0200)]
Merge tag 'renesas-fixes-for-v7.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixes
Renesas fixes for v7.1
- Fix SCIF (serial port) clocks on R-Car X5H,
- Fix various dtc and dtbs_check warnings.
* tag 'renesas-fixes-for-v7.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
arm64: dts: renesas: r9a09g056: Add #mux-state-cells to usb20phyrst
arm64: dts: renesas: r9a09g057: Add #mux-state-cells to usb2{0,1}phyrst
ARM: dts: renesas: rskrza1: Drop superfluous cells
ARM: dts: renesas: genmai: Drop superfluous cells
ARM: dts: renesas: r7s72100: Add missing unit address to bus node
ARM: dts: renesas: r8a7792: Add missing unit address to bus node
ARM: dts: renesas: r8a7779: Add missing unit address to bus node
ARM: dts: renesas: r8a7778: Add missing unit address to bus node
arm64: dts: renesas: rz-smarc-du-adv7513-smarc: Fix missing cells and reg in DU subnode
arm64: dts: renesas: rz-smarc-cru-csi-ov5645: Fix missing cells and reg in CSI2 subnode
arm64: dts: renesas: salvator-panel: Fix missing cells and reg in DTO
arm64: dts: renesas: draak/ebisu-panel: Fix missing cells and reg in DTO
arm64: dts: renesas: r8a78000: Fix SCIF brg_int clocks
On Samsung Galaxy Book 5 (SAM0430), the keyboard backlight, microphone
mute, and camera block hotkeys do not generate i8042 scancodes.
Instead they arrive as ACPI notifications 0x7d, 0x6e, and 0x6f
respectively, all of which previously fell through to the default
"unknown" warning in galaxybook_acpi_notify().
Add handling for these three events:
- 0x7d (Fn+F9, keyboard backlight): schedule the existing
kbd_backlight_hotkey_work which cycles brightness.
- 0x6e (Fn+F10, microphone mute): emit KEY_MICMUTE via the driver's
input device.
- 0x6f (Fn+F11, camera block): if block_recording is active use the
existing block_recording_hotkey_work; otherwise emit a toggle of
SW_CAMERA_LENS_COVER via the driver's input device on models where
the block_recording ACPI feature is not supported.
Tested on Samsung Galaxy Book 5 (SAM0430) and Samsung Galaxy Book2 Pro
(SAM0429).
Signed-off-by: Ayaan Mirza Baig <ayaanmirzabaig85@gmail.com> Co-developed-by: Joshua Grisham <josh@joshuagrisham.com> Signed-off-by: Joshua Grisham <josh@joshuagrisham.com> Link: https://patch.msgid.link/20260418004613.93981-3-ayaanmirzabaig85@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
platform/x86: samsung-galaxybook: Refactor camera lens cover input device
Rename the camera_lens_cover_switch input device to a generic input
device which can be used for multiple input events. Move input device
allocation and registration into a dedicated galaxybook_input_init()
helper which is called early in probe so that the device is available
to all features.