The algorithm running in firmware may take a little longer than
5 milliseconds (e.g. a measurement on 80MHz while associated). Increase
the minimum time between measurements to 7 milliseconds.
Fixes: 830aa3e7d1ca ("iwlwifi: mvm: add support for range request command version 13") Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20240729201718.d3f3c26e00d9.I09e951290e8a3d73f147b88166fd9a678d1d69ed@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The cipher pointer is not set, but is dereferenced when trying to set its
content, which leads to a NULL pointer dereference.
Fix it by pointing to the cipher parameter before dereferencing.
The 'gl' devices are in the bz family, but they're not
integrated, so should have their own trans config struct.
Fix that, also necessitating the removal of LTR config,
and while at it remove 0x2727 and 0x272D IDs that were
only used for test chips.
Fixes: c30a2a64788b ("wifi: iwlwifi: add a new PCI device ID for BZ device") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20240729201718.95aed0620080.Ib9129512c95aa57acc9876bdff8b99dd41e1562c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Like the commit ab9177d83c04 ("wifi: mac80211: don't use rate mask for
scanning"), ignore incorrect settings to avoid no supported rate warning
reported by syzbot.
syzbot bisected the warning to commit 9df66d5b9f45 ("cfg80211:
fix default HE tx bitrate mask in 2G band"). That commit, however, only
corrects the HE MCS bitmask and correctly recognizes a configuration with
empty legacy rates plus HE MCS rates instead of returning -EINVAL.
As suggested in [1], follow the SCAN TX change and handle the
offchannel TX case the same way.
It used to be that the MacbookPro9,2 used its native intel backlight
device until the following commit was introduced:
commit b1d36e73cc1c ("drm/i915: Don't register backlight when another
backlight should be used (v2)")
This commit forced this model to use its firmware acpi_video backlight
device instead.
That worked fine until an additional commit was added:
commit 92714006eb4d ("drm/i915/backlight: Do not bump min brightness
to max on enable")
That commit uncovered a bug in the MacbookPro 9,2's acpi_video
backlight firmware; the backlight does not come back up after resume.
Add DMI quirk to select the working native intel interface instead
so that the backlight successfully comes back up after resume.
Fixes: 92714006eb4d ("drm/i915/backlight: Do not bump min brightness to max on enable") Signed-off-by: Esther Shimanovich <eshimanovich@chromium.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20240806-acpi-video-quirk-v1-1-369d8f7abc59@chromium.org
[ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The macro `ADF_RP_INT_SRC_SEL_F_RISE_MASK` is currently set to the value
`0100b` which means "Empty Going False". This might cause an incorrect
restore of the bank state during live migration.
Fix the definition of the macro to properly represent the "Full Going
True" state which is encoded as `0011b`.
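As a minimal sketch of why the old value could never match the intended state: the two encodings share no set bits. The macro and helper names below are hypothetical, and the 4-bit field width is an assumption; only the two encodings are taken from the text above.

```c
#include <stdint.h>

/*
 * Hypothetical names for illustration; only the two encodings
 * (0100b and 0011b) come from the commit message.
 */
#define SRC_SEL_EMPTY_GOING_FALSE	0x4u	/* 0100b */
#define SRC_SEL_FULL_GOING_TRUE		0x3u	/* 0011b */
#define SRC_SEL_FIELD_MASK		0xFu	/* assumed 4-bit field */

/* Returns non-zero if the source-select field encodes "Full Going True". */
static inline int src_sel_is_full_going_true(uint32_t reg)
{
	return (reg & SRC_SEL_FIELD_MASK) == SRC_SEL_FULL_GOING_TRUE;
}
```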
Fixes: bbfdde7d195f ("crypto: qat - add bank save and restore flows") Signed-off-by: Svyatoslav Pankratov <svyatoslav.pankratov@intel.com> Reviewed-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the PCIe devices are discovered late, the driver can't find
them during init and returns without registering with the bus
notifier. As a result, the driver never picks up devices that are
discovered late.
Register the bus notifier and the driver even if no device is found
as part of init.
When there are multiple instances of PCIe controllers, registration
to the perf driver fails with this error:
sysfs: cannot create duplicate filename '/devices/platform/dwc_pcie_pmu.0'
CPU: 0 PID: 166 Comm: modprobe Not tainted 6.10.0-rc2-next-20240607-dirty
Hardware name: Qualcomm SA8775P Ride (DT)
Call trace:
dump_backtrace.part.8+0x98/0xf0
show_stack+0x14/0x1c
dump_stack_lvl+0x74/0x88
dump_stack+0x14/0x1c
sysfs_warn_dup+0x60/0x78
sysfs_create_dir_ns+0xe8/0x100
kobject_add_internal+0x94/0x224
kobject_add+0xa8/0x118
device_add+0x298/0x7b4
platform_device_add+0x1a0/0x228
platform_device_register_full+0x11c/0x148
dwc_pcie_register_dev+0x74/0xf0 [dwc_pcie_pmu]
dwc_pcie_pmu_init+0x7c/0x1000 [dwc_pcie_pmu]
do_one_initcall+0x58/0x1c0
do_init_module+0x58/0x208
load_module+0x1804/0x188c
__do_sys_init_module+0x18c/0x1f0
__arm64_sys_init_module+0x14/0x1c
invoke_syscall+0x40/0xf8
el0_svc_common.constprop.1+0x70/0xf4
do_el0_svc+0x18/0x20
el0_svc+0x28/0xb0
el0t_64_sync_handler+0x9c/0xc0
el0t_64_sync+0x160/0x164
kobject: kobject_add_internal failed for dwc_pcie_pmu.0 with -EEXIST,
don't try to register things with the same name in the same directory.
This happens because devices under two different controllers can have
the same bdf value.
Update the logic to use sbdf, which is unique even with multiple
controller instances.
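The collision can be sketched in plain C. This assumes the common layout where sbdf places the PCI segment (domain) number above the 16-bit bdf; the helper names are hypothetical and are not the driver's own.

```c
#include <stdint.h>

/* 16-bit bdf: bus in bits 15:8, device in bits 7:3, function in bits 2:0. */
static inline uint16_t make_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x07));
}

/*
 * Assumed sbdf layout: segment (domain) number in the upper 16 bits,
 * bdf in the lower 16 bits.
 */
static inline uint32_t make_sbdf(uint16_t segment, uint16_t bdf)
{
	return ((uint32_t)segment << 16) | bdf;
}
```

Two root ports at 00:00.0 on different controllers produce the same bdf but different sbdf values, so a device name derived from sbdf stays unique.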
The alibaba_uncore_pmu driver forgot to clear all interrupt status
in the interrupt processing function. After the PMU counter overflow
interrupt occurred, an interrupt storm occurred, causing the system
to hang.
Therefore, clear the correct interrupt status in the interrupt handling
function to fix it.
Using round_jiffies() in thermal_set_delay_jiffies() is invalid because
its argument should be time in the future in absolute jiffies and it
computes the result with respect to the current jiffies value at the
invocation time. Fortunately, in the majority of cases it does not
make any difference due to the time_is_after_jiffies() check in
round_jiffies_common().
While using round_jiffies_relative() instead of round_jiffies() might
reflect the intent a bit better, it still would not be defensible
because that function should be called when the timer is about to be
set and it is not suitable for pre-computation of delay values.
Accordingly, drop thermal_set_delay_jiffies() altogether, simply
convert polling_delay and passive_delay to jiffies during thermal
zone initialization and make thermal_zone_device_set_polling() call
round_jiffies_relative() on the delay if it is greater than 1 second.
Fixes: 17d399cd9c89 ("thermal/core: Precompute the delays from msecs to jiffies") Fixes: e5f2cda61d06 ("thermal/core: Move thermal_set_delay_jiffies to static") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/1994438.PYKUYFuaPT@rjwysocki.net Signed-off-by: Sasha Levin <sashal@kernel.org>
Fold bind_cdev() into __thermal_cooling_device_register() and bind_tz()
into thermal_zone_device_register_with_trips() to reduce code bloat and
make it somewhat easier to follow the code flow.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/2962184.e9J7NaK4W3@rjwysocki.net
Stable-dep-of: 8144dbe68c49 ("thermal: core: Fix rounding of delay jiffies") Signed-off-by: Sasha Levin <sashal@kernel.org>
When testing hard lockup handling on my sc7180-trogdor-lazor device
with pseudo-NMI enabled, with serial console enabled and with kgdb
disabled, I found that the stack crawls printed to the serial console
ended up as a jumbled mess. After rebooting, the pstore-based console
looked fine though. Also, enabling kgdb to trap the panic made the
console look fine and avoided the mess.
After a bit of tracking down, I came to the conclusion that this was
what was happening:
1. The panic path was stopping all other CPUs with
panic_other_cpus_shutdown().
2. At least one of those other CPUs was in the middle of printing to
the serial console and holding the console port's lock, which is
grabbed with "irqsave". ...but since we were stopping with an NMI
we didn't care about the "irqsave" and interrupted anyway.
3. Since we stopped the CPU while it was holding the lock it would
never release it.
4. All future calls to output to the console would end up failing to
get the lock in qcom_geni_serial_console_write(). This isn't
_totally_ unexpected at panic time but it's a code path that's not
well tested, hard to get right, and apparently doesn't work
terribly well on the Qualcomm geni serial driver.
The Qualcomm geni serial driver was fixed to be a bit better in commit
9e957a155005 ("serial: qcom-geni: Don't cancel/abort if we can't get
the port lock") but it's nice not to get into this situation in the
first place.
Taking a page from what x86 appears to do in native_stop_other_cpus(),
do this:
1. First, try to stop other CPUs with a normal IPI and wait a second.
This gives them a chance to leave critical sections.
2. If CPUs fail to stop then retry with an NMI, but give a much lower
timeout since there's no good reason for a CPU not to react quickly
to an NMI.
This works well and avoids the corrupted console and (presumably)
could help avoid other similar issues.
In order to do this, we need to do a little re-organization of our
IPIs since we don't have any more free IDs. Do what was suggested in
previous conversations and combine "stop" and "crash stop". That frees
up an IPI so now we can have a "stop" and "stop NMI".
In order to do this we also need a slight change in the way we keep
track of which CPUs still need to be stopped. We need to know
specifically which CPUs haven't stopped yet when we fall back to NMI
but in the "crash stop" case the "cpu_online_mask" isn't updated as
CPUs go down. This is why that code path had an atomic of the number
of CPUs left. Solve this by also updating the "cpu_online_mask" for
crash stops.
All of the above lets us combine the logic for "stop" and "crash stop"
code, which appeared to have a bunch of arbitrary implementation
differences.
Aside from the above change where we try a normal IPI and then an NMI,
the combined function has a few subtle differences:
* In the normal smp_send_stop(), if we fail to stop one or more CPUs
then we won't include the current CPU (the one running
smp_send_stop()) in the error message.
* In crash_smp_send_stop(), if we fail to stop some CPUs we'll print
the CPUs that we failed to stop instead of printing all _but_ the
current running CPU.
* In crash_smp_send_stop(), we will now only print "SMP: stopping
secondary CPUs" if (system_state <= SYSTEM_RUNNING).
Currently a number of SVE/SME related tests have almost identical
functions to enumerate all supported vector lengths. However over time
the copy&pasted code has diverged, allowing some bugs to creep in:
- fake_sigreturn_sme_change_vl reports a failure, not a SKIP if only
one vector length is supported (but the SVE version is fine)
- fake_sigreturn_sme_change_vl tries to set the SVE vector length, not
the SME one (but the other SME tests are fine)
- za_no_regs keeps iterating forever if only one vector length is
supported (but za_regs is correct)
Since those bugs seem to be mostly copy&paste ones, let's consolidate
the enumeration loop into one shared function, and just call that from
each test. That should fix the above bugs, and prevent similar issues
from happening again.
Fixes: 4963aeb35a9e ("kselftest/arm64: signal: Add SME signal handling tests") Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240821164401.3598545-1-andre.przywara@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the PPDU length for VHT rate exceeds 0x40000, calculating the PSDU
length will overflow. TMAC will determine the length to be too small and
as a result, all packets will be sent as ZLD (Zero Length Delimiter).
Fixes: 5f7e92c59b8e ("wifi: rtw89: 8852b: set AMSDU limit to 5000") Signed-off-by: Chia-Yuan Li <leo.li@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20240815134054.44649-1-pkshih@realtek.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The rp->priv->rpi array is either rpi_msr or rpi_tpmi which have
NR_RAPL_PRIMITIVES number of elements. Thus the > needs to be >=
to prevent an off by one access.
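A minimal sketch of the bounds check, using a small stand-in table rather than the real rpi arrays. NR_PRIMITIVES, rpi_table and get_rpi() are hypothetical names for illustration.

```c
#include <stddef.h>

#define NR_PRIMITIVES 4	/* stand-in for NR_RAPL_PRIMITIVES */

static const int rpi_table[NR_PRIMITIVES] = { 10, 20, 30, 40 };

/*
 * Valid indexes are 0..NR_PRIMITIVES-1, so the upper check must be
 * ">=". With ">" the index NR_PRIMITIVES would slip through and read
 * one element past the end of the array.
 */
static const int *get_rpi(int prim)
{
	if (prim < 0 || prim >= NR_PRIMITIVES)
		return NULL;
	return &rpi_table[prim];
}
```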
Fixes: 98ff639a7289 ("powercap: intel_rapl: Support per Interface primitive information") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://patch.msgid.link/86e3a059-504d-4795-a5ea-4a653f3b41f8@stanley.mountain Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
I think LLVM's reordering is valid as the code is currently written: the
compiler doesn't know the instructions have side effects in hardware.
Fix by using "asm volatile" in fmxr() and fmrx(), so they cannot be
reordered with respect to each other. The original compiler now produces
working kernels on my hardware with DYNAMIC_DEBUG=n.
This is the relevant piece of the diff of the vfp_support_entry() text,
from the original oopsing kernel to a working kernel with this patch:
nft_set_lookup_byid() is very slow when transaction becomes large, due to
walk of the transaction list.
Add a dedicated list that contains only the new sets.
Before: nft -f ruleset 0.07s user 0.00s system 0% cpu 1:04.84 total
After: nft -f ruleset 0.07s user 0.00s system 0% cpu 30.115 total
.. where ruleset contains ~10 sets with ~100k elements.
The above number is for a combined flush+reload of the ruleset.
With previous flush, even the first NEWELEM has to walk through a few
hundred thousands of DELSET(ELEM) transactions before the first NEWSET
object. To cope with random-order-newset-newsetelem we'd need to replace
commit_set_list with a hashtable.
Expectation is that a NEWELEM operation refers to the most recently added
set, so last entry of the dedicated list should be the set we want.
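A rough sketch of the lookup idea, assuming the dedicated list is searched from its most recent entry backwards. The array-backed list and all names here are hypothetical simplifications and do not mirror the nf_tables structures.

```c
#include <stddef.h>
#include <stdint.h>

struct new_set {
	uint32_t id;
};

/* Dedicated list holding only the sets added in this transaction. */
struct new_set_list {
	struct new_set *items;
	size_t count;
};

/*
 * Walk from the newest entry to the oldest: a NEWELEM usually refers
 * to the most recently added set, so the first comparison tends to hit.
 */
static struct new_set *lookup_byid(const struct new_set_list *l, uint32_t id)
{
	for (size_t i = l->count; i-- > 0; ) {
		if (l->items[i].id == id)
			return &l->items[i];
	}
	return NULL;
}
```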
NB: This is not a bug fix per se (functionality is fine), but with
larger transaction batches list search takes forever, so it would be
nice to speed this up for -stable too, hence adding a "fixes" tag.
Fixes: 958bee14d071 ("netfilter: nf_tables: use new transaction infrastructure to handle sets") Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The csr_fun defines a count parameter which defines the total number of
CSRs emulated in KVM starting from the base. This value should be
equal to total number of counters possible for trap/emulation (32).
Currently, KVM traps & emulates PMU counter access only if SBI PMU
is available as the guest can configure/read PMU counters via
SBI only. However, if SBI PMU is not enabled in the host, the
guest will fall back to the legacy PMU which will try to access
cycle/instret and result in an illegal instruction trap which
is not desired.
KVM can allow dummy emulation of cycle/instret only for the guest
if SBI PMU is not enabled in the host. The dummy emulation will
still return zero as we don't want to expose the host counter values
from a guest using legacy PMU.
With the latest Linux-6.11-rc3, the below NULL pointer crash is observed
when SBI PMU snapshot is enabled for the guest and the guest is forcefully
powered-off.
Clearly, the kvm_vcpu_write_guest() function is crashing because it is
being called from kvm_pmu_clear_snapshot_area() upon guest tear down.
To address the above issue, simplify kvm_pmu_clear_snapshot_area() to
not zero out the PMU snapshot area because the guest is being torn
down anyway.
kvm_pmu_clear_snapshot_area() is also called when the guest changes
the PMU snapshot area of a VCPU, but even in this case the previous PMU
snapshot area must not be zeroed out because the guest might have
reclaimed the previous PMU snapshot area for some other purpose.
When forwarding SBI calls to userspace ensure sbiret.error is
initialized to SBI_ERR_NOT_SUPPORTED first, in case userspace
neglects to set it to anything. If userspace neglects it then we
can't be sure it did anything else either, so we just report it
didn't do or try anything. Just init sbiret.value to zero, which is
the preferred value to return when nothing special is specified.
KVM was already initializing both sbiret.error and sbiret.value, but
the values used appear to come from a copy+paste of the __sbi_ecall()
implementation, i.e. a0 and a1, which don't apply prior to the call
being executed, nor at all when forwarding to userspace.
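The initialization described above can be sketched in userspace C as follows. The struct, the handler type and the function names are hypothetical; SBI_ERR_NOT_SUPPORTED is given the value -2 as defined by the SBI specification.

```c
#define SBI_ERR_NOT_SUPPORTED	(-2)

/* Hypothetical stand-in for the error/value pair returned to the guest. */
struct sbiret_sketch {
	long error;
	long value;
};

typedef void (*userspace_handler_t)(struct sbiret_sketch *ret);

/*
 * Pre-load the return values before handing the call to userspace, so
 * that a handler which never touches them reports "not supported".
 */
static struct sbiret_sketch forward_to_userspace(userspace_handler_t handler)
{
	struct sbiret_sketch ret = {
		.error = SBI_ERR_NOT_SUPPORTED,
		.value = 0,
	};

	if (handler)
		handler(&ret);
	return ret;
}

/* A handler that neglects to set anything. */
static void handler_ignores_ret(struct sbiret_sketch *ret)
{
	(void)ret;
}

/* A handler that services the call and reports success. */
static void handler_success(struct sbiret_sketch *ret)
{
	ret->error = 0;
	ret->value = 42;
}
```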
In 'rtw_coex_action_bt_a2dp_pan', 'wl_cpt_test' and 'bt_cpt_test' are
hardcoded to false, so corresponding 'table_case' and 'tdma_case'
assignments are never met.
Also 'rtw_coex_set_rf_para(rtwdev, chip->wl_rf_para_rx[1])' is never
executed. Assuming that CPT was never fully implemented, remove
lookalike leftovers. Compile tested only.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
The handler of firmware C2H event RTW89_MAC_C2H_FUNC_READ_WOW_CAM isn't
implemented, but the driver expects the number of handlers to be
NUM_OF_RTW89_MAC_C2H_FUNC_WOW, causing an out-of-bounds access. Fix it by
removing the ID.
A few SME-related sigcontext UAPI macros leave an argument
unprotected from misparsing during macro expansion.
Add parentheses around references to macro arguments where
appropriate.
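To illustrate the class of bug, here is a generic pair of macros that are not taken from the sigcontext header. Without parentheses around the argument, an expression passed by the caller is split by operator precedence.

```c
/* Hypothetical macros for illustration only. */
#define VQ_BYTES_UNPROTECTED(vq)	(vq * 16)
#define VQ_BYTES_PROTECTED(vq)		((vq) * 16)

/*
 * VQ_BYTES_UNPROTECTED(1 + 1) expands to (1 + 1 * 16), i.e. 17.
 * VQ_BYTES_PROTECTED(1 + 1) expands to ((1 + 1) * 16), i.e. 32.
 */
```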
Signed-off-by: Dave Martin <Dave.Martin@arm.com> Fixes: ee072cf70804 ("arm64/sme: Implement signal handling for ZT") Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling") Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240729152005.289844-1-Dave.Martin@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
We calculate guest offloads during probe without the protection of
rtnl_lock. This leads to a race between probe and ndo_set_features. Fix
this by moving the calculation under the rtnl_lock.
Fixes: 3f93522ffab2 ("virtio-net: switch off offloads on demand if possible on XDP set") Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20240814052228.4654-5-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch synchronizes operstate with admin state per RFC2863.
This is done by trying to toggle the carrier upon open/close and
synchronizing with the config change work. This allows status to be
propagated correctly to stacked devices like:
ip link add link enp0s3 macvlan0 type macvlan
ip link set link enp0s3 down
ip link show
Before this patch:
3: enp0s3: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:05:00:00:09 brd ff:ff:ff:ff:ff:ff
......
5: macvlan0@enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether b2:a9:c5:04:da:53 brd ff:ff:ff:ff:ff:ff
After this patch:
3: enp0s3: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:05:00:00:09 brd ff:ff:ff:ff:ff:ff
...
5: macvlan0@enp0s3: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
link/ether b2:a9:c5:04:da:53 brd ff:ff:ff:ff:ff:ff
Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> Cc: Gia-Khanh Nguyen <gia-khanh.nguyen@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20240814052228.4654-4-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: c392d6019398 ("virtio-net: synchronize probe with ndo_set_features") Signed-off-by: Sasha Levin <sashal@kernel.org>
Sometimes it is useful for a driver to disable the config change
notification. This patch allows that by introducing a variable,
config_change_driver_disabled, and only allowing the config change
notification callback to be triggered when it is permitted by
both the virtio core and the driver. It is set to false by default to
keep the current semantics, so no drivers need to change.
The first user for this would be virtio-net.
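The gating rule can be sketched as below. The struct is a hypothetical simplification: only config_change_driver_disabled is named in the text, while the core flag and the counter are assumptions for illustration.

```c
#include <stdbool.h>

struct cfg_dev_sketch {
	bool config_core_enabled;		/* assumed core-side switch */
	bool config_change_driver_disabled;	/* false by default */
	int notifications;			/* callbacks delivered */
};

/*
 * Deliver the config change callback only when both the core and the
 * driver allow it.
 */
static void config_changed(struct cfg_dev_sketch *dev)
{
	if (dev->config_core_enabled && !dev->config_change_driver_disabled)
		dev->notifications++;
}
```

Because the driver-side flag defaults to false, a driver that never touches it sees the same behaviour as before.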
Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> Cc: Gia-Khanh Nguyen <gia-khanh.nguyen@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20240814052228.4654-3-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: c392d6019398 ("virtio-net: synchronize probe with ndo_set_features") Signed-off-by: Sasha Levin <sashal@kernel.org>
The following patch will allow the config interrupt to be disabled by a
specific driver via another boolean. So this patch renames
virtio_config_enabled and the relevant helpers to
virtio_config_core_enabled.
Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> Cc: Gia-Khanh Nguyen <gia-khanh.nguyen@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20240814052228.4654-2-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: c392d6019398 ("virtio-net: synchronize probe with ndo_set_features") Signed-off-by: Sasha Levin <sashal@kernel.org>
The reference and PTP clock rate of the Loongson GMAC devices is 125MHz
(as it is in the GNET devices, support for which is about to be added).
Set the respective plat_stmmacenet_data field accordingly so that the
coalesce command and timestamping work correctly.
On PREEMPT_RT, kfree() takes sleeping locks and must not be called with
preemption disabled. Therefore, on PREEMPT_RT skcipher_walk_done() must
not be called from within a kernel_fpu_{begin,end}() pair, even when
it's the last call which is guaranteed to not allocate memory.
Therefore, move the last skcipher_walk_done() in gcm_crypt() to the end
of the function so that it goes after the kernel_fpu_end(). To make
this work cleanly, rework the data processing loop to handle only
non-last data segments.
Fixes: b06affb1cb58 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM") Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Closes: https://lore.kernel.org/linux-crypto/20240802102333.itejxOsJ@linutronix.de Signed-off-by: Eric Biggers <ebiggers@google.com> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
Before commit addea5858b66 ("hwrng: Kconfig - Do not enable by default
CN10K driver") the Marvell CN10K Random Number Generator was always
enabled when HW_RANDOM was enabled.
That commit changed this to avoid always enabling the driver on arm64.
To avoid regressions with some old defconfigs, enable the driver when
ARCH_THUNDER is enabled.
Fixes: addea5858b66 ("hwrng: Kconfig - Do not enable by default CN10K driver") Closes: https://lore.kernel.org/all/SN7PR18MB53144B37B82ADEEC5D35AE0CE3AC2@SN7PR18MB5314.namprd18.prod.outlook.com/ Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, the firmware returns incorrect pdev_id information in
WMI_PDEV_BSS_CHAN_INFO_EVENTID, leading to incorrect filling of
the pdev's survey information.
To prevent this issue, when requesting BSS channel information
through WMI_PDEV_BSS_CHAN_INFO_REQUEST_CMDID, firmware expects
pdev_id as one of the arguments in this WMI command.
Add pdev_id to the struct wmi_pdev_bss_chan_info_req_cmd and fill it
during ath12k_wmi_pdev_bss_chan_info_request(). With this, the firmware
reports the correct pdev_id in WMI_PDEV_BSS_CHAN_INFO_EVENTID.
We should not be checking the return values from debugfs creation at all: the
debugfs functions are designed to handle errors of previously called functions
and just transparently abort the creation of debugfs entries when debugfs is
disabled. If we check the return value and abort driver initialisation, we break
the driver if debugfs is disabled (such as when booting with debugfs=off).
Earlier versions of ath9k accidentally did the right thing by checking the
return value, but only for NULL, not for IS_ERR(). This was "fixed" by the two
commits referenced below, breaking ath9k with debugfs=off starting from the 6.6
kernel (as reported in the Bugzilla linked below).
Restore functionality by just getting rid of the return value check entirely.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219122 Fixes: 1e4134610d93 ("wifi: ath9k: use IS_ERR() with debugfs_create_dir()") Fixes: 6edb4ba6fb5b ("wifi: ath9k: fix parameter check in ath9k_init_debug()") Reported-by: Daniel Tobias <dan.g.tob@gmail.com> Tested-by: Daniel Tobias <dan.g.tob@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://patch.msgid.link/20240805110225.19690-1-toke@toke.dk Signed-off-by: Sasha Levin <sashal@kernel.org>
When the firmware interface layer was refactored it provided various
"get" and "set" functions. For the "get" in some cases a parameter
needed to be passed down to firmware as a key indicating what to
"get" turning the output parameter of the "get" function into an
input parameter as well. To accommodate this the "get" function blindly
copies the parameter, which in some places resulted in uninitialized
variable warnings from the compiler. These have been fixed in the past by
initializing the input parameter. Recently another batch of similar fixes
was submitted to address clang static checker warnings [1].
Propose another solution by introducing a "query" variant which is used
when the (input) parameter is needed by firmware. The "get" variant will
only fill the (output) parameter with the result received from firmware,
taking care of proper endianness conversion.
The free_device_compression_mode(iaa_device, device_mode) function frees
"device_mode" but it is passed to iaa_compression_modes[i]->free() a few
lines later resulting in a use after free.
The good news is that, so far as I can tell, nothing implements the
->free() function and the use after free happens in dead code. But, with
this fix, when something does implement it, we'll be ready. :)
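One way to order the two steps safely is to run the optional callback while the object is still valid and free it afterwards. The sketch below uses hypothetical types and names and is not the iaa driver code.

```c
#include <stdlib.h>

struct mode_sketch {
	int id;
};

typedef void (*mode_free_cb)(struct mode_sketch *m);

static int last_seen_id = -1;

/* Example callback that reads the object it is given. */
static void record_id_cb(struct mode_sketch *m)
{
	last_seen_id = m->id;
}

/*
 * Call the callback first, then release the memory and clear the
 * slot so that no stale pointer is left behind.
 */
static void remove_mode(struct mode_sketch **slot, mode_free_cb cb)
{
	struct mode_sketch *m = *slot;

	if (!m)
		return;
	if (cb)
		cb(m);		/* object is still valid here */
	free(m);
	*slot = NULL;
}
```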
Fixes: b190447e0fa3 ("crypto: iaa - Add compression mode management along with fixed mode") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the process of sending the ADF_PF2VF_MSGTYPE_RESTARTING message to
Virtual Functions (VFs), the Physical Function (PF) should set the
`vf->restarting` flag to true before dispatching the message.
This change is necessary to prevent a race condition where the handling
of the ADF_VF2PF_MSGTYPE_RESTARTING_COMPLETE message (which sets the
`vf->restarting` flag to false) runs immediately after the message is sent,
but before the flag is set to true.
Set the `vf->restarting` to true before sending the message
ADF_PF2VF_MSGTYPE_RESTARTING, if supported by the version of the
protocol and if the VF is started.
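The ordering problem can be shown with a deterministic sketch in which the reply arrives immediately, standing in for a VF that answers before the PF gets to its next statement. All names below are hypothetical.

```c
#include <stdbool.h>

struct vf_sketch {
	bool restarting;
};

/* Handler for the "restarting complete" reply: clears the flag. */
static void on_restarting_complete(struct vf_sketch *vf)
{
	vf->restarting = false;
}

/* Simulated transport where the reply is handled right away. */
static void send_restarting(struct vf_sketch *vf)
{
	on_restarting_complete(vf);
}

/* Flag set after sending: the early reply is overwritten. */
static void notify_flag_after_send(struct vf_sketch *vf)
{
	send_restarting(vf);
	vf->restarting = true;
}

/* Flag set before sending: the reply clears it as intended. */
static void notify_flag_before_send(struct vf_sketch *vf)
{
	vf->restarting = true;
	send_restarting(vf);
}
```

With the flag set after the send, the VF stays marked as restarting even though it has already completed.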
Fixes: ec26f8e6c784 ("crypto: qat - update PFVF protocol for recovery") Signed-off-by: Michal Witwicki <michal.witwicki@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the PFVF protocol was updated to support version 5, i.e.
ADF_PFVF_COMPAT_FALLBACK, the compatibility version for the VF was
updated without supporting the message RESTARTING_COMPLETE required for
such version.
Add support for the ADF_VF2PF_MSGTYPE_RESTARTING_COMPLETE message in the
VF drivers. This message is sent by the VF driver to the PF to notify
the completion of the shutdown flow.
Fixes: ec26f8e6c784 ("crypto: qat - update PFVF protocol for recovery") Signed-off-by: Michal Witwicki <michal.witwicki@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
Disabling IOV has the side effect of re-enabling the AEs that might
attempt to do DMAs into the heartbeat buffers.
Move the disable_iov() function in adf_dev_stop() before the AEs are
stopped.
Fixes: ed8ccaef52fa ("crypto: qat - Add support for SRIOV") Signed-off-by: Michal Witwicki <michal.witwicki@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit c055e3eae0f1 ("crypto: xor - use ktime for template benchmarking")
switched from using jiffies to ktime-based performance benchmarking.
This works nicely on machines which have a fine-grained ktime()
clocksource as e.g. x86 machines with TSC.
But other machines, e.g. my 4-way HP PARISC server, don't have such
fine-grained clocksources, which is why it seems that 800 xor loops
take zero seconds, which then shows up in the logs as:
Fix this with some small modifications to the existing code to improve
the algorithm to always produce correct results without introducing
major delays for architectures with a fine-grained ktime()
clocksource:
a) Delay start of the timing until ktime() just advanced. On machines
with a fast ktime() this should be just one additional ktime() call.
b) Count the number of loops. Run at minimum 800 loops and finish
earliest when the ktime() counter has progressed.
With that the throughput can now be calculated more accurately under all
conditions.
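Steps a) and b) can be sketched in userspace C, with clock_gettime() standing in for ktime and a caller-supplied function standing in for one xor pass. The names and the result struct are assumptions for illustration and do not reflect the kernel implementation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

#define MIN_LOOPS 800

struct bench_result {
	unsigned long loops;
	uint64_t elapsed_ns;
};

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static struct bench_result bench(void (*work)(void))
{
	struct bench_result r = { 0, 0 };
	uint64_t start, t;

	/* a) Delay the start until the clock has just advanced. */
	t = now_ns();
	do {
		start = now_ns();
	} while (start == t);

	/* b) Run at least MIN_LOOPS and until the clock has moved on. */
	do {
		work();
		r.loops++;
		t = now_ns();
	} while (r.loops < MIN_LOOPS || t == start);

	r.elapsed_ns = t - start;
	return r;
}

/* Trivial workload so the sketch can be exercised. */
static volatile unsigned int sink;

static void dummy_work(void)
{
	sink ^= 0x5a5a5a5au;
}
```

Since the loop only ends once the clock reading differs from the start, the elapsed time is never zero, and throughput can be derived from loops divided by elapsed_ns even on a coarse clocksource.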
Fixes: c055e3eae0f1 ("crypto: xor - use ktime for template benchmarking") Signed-off-by: Helge Deller <deller@gmx.de> Tested-by: John David Anglin <dave.anglin@bell.net>
v2:
- clean up coding style (noticed & suggested by Herbert Xu)
- rephrased & fixed typo in commit message
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
In 'rtw_wait_firmware_completion()', always wait for both (regular and
wowlan) firmware loading attempts. Otherwise if 'rtw_usb_intf_init()'
has failed in 'rtw_usb_probe()', 'rtw_usb_disconnect()' may issue
'ieee80211_free_hw()' when one of 'rtw_load_firmware_cb()' (usually
the wowlan one) is still in progress, causing UAF detected by KASAN.
The Zynq UltraScale+ MPSoC DDR has a disjoint memory from 2GB to 32GB.
The DDR host interface has a contiguous memory so while injecting
errors, the driver should remove the hole else the injection fails as
the address translation is incorrect.
Introduce a get_mem_info() function pointer and set it for Zynq
UltraScale+ platform to return host address.
Commit 3a415daa3e8b ("wifi: ath11k: add P2P IE in beacon template")
from Feb 28, 2024 (linux-next), leads to the following Smatch static
checker warning:
drivers/net/wireless/ath/ath11k/wmi.c:1742 ath11k_wmi_p2p_go_bcn_ie()
warn: sleeping in atomic context
The reason is that ath11k_bcn_tx_status_event() directly calls
ath11k_wmi_cmd_send(), which might sleep, inside an RCU read-side
critical section. The call trace is like:
Commit 886433a98425 ("ath11k: add support for BSS color change") added the
ath11k_mac_bcn_tx_event(), commit 01e782c89108 ("ath11k: fix warning
of RCU usage for ath11k_mac_get_arvif_by_vdev_id()") added the RCU lock
to avoid warning but also introduced this BUG.
Use work queue to avoid directly calling ath11k_mac_bcn_tx_event()
during RCU critical sections. No need to worry about the deletion of vif
because cancel_work_sync() will drop the work if it doesn't start or
block vif deletion until the running work is done.
Fixes: 3a415daa3e8b ("wifi: ath11k: add P2P IE in beacon template") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/2d277abd-5e7b-4da0-80e0-52bd96337f6e@moroto.mountain/ Signed-off-by: Kang Yang <quic_kangyang@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://patch.msgid.link/20240626053543.1946-1-quic_kangyang@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The rapl_find_package_domain_cpuslocked() function is supposed to
return NULL on error.
This new error path returns ERR_PTR(-EINVAL), but none of the callers
check for that, so it would lead to an Oops.
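To see why an error pointer slips past a NULL check, here is a small userspace sketch. err_ptr() and is_err() are simplified re-implementations of the kernel's ERR_PTR()/IS_ERR() idea, and the find_domain_*() functions are hypothetical stand-ins for rapl_find_package_domain_cpuslocked().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (simplified ERR_PTR()). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer value lies in the top errno range (simplified IS_ERR()). */
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct domain {
	int id;
};

static struct domain pkg0;

/* Buggy variant: callers that only test for NULL will dereference this. */
static struct domain *find_domain_buggy(int cpu)
{
	if (cpu < 0)
		return err_ptr(-EINVAL);
	return &pkg0;
}

/* Fixed variant: honours the "NULL on error" contract. */
static struct domain *find_domain_fixed(int cpu)
{
	if (cpu < 0)
		return NULL;
	return &pkg0;
}

/* Models a caller that checks only for NULL before dereferencing. */
static int caller_would_deref(const struct domain *d)
{
	return d != NULL;
}
```

The buggy variant returns a non-NULL value on error, so a caller that only checks for NULL goes on to dereference an invalid address.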
Fixes: 26096aed255f ("powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/fa719c6a-8d3b-4cca-9b43-bcd477ff6655@stanley.mountain Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Another device has been reported to be unreliable if we have more than
one outstanding command. In this new case, data corruption may occur.
Since we have two devices now needing this quirky behavior, make a
generic quirk flag.
The same Apple quirk is clearly not "temporary", so update the comment
while moving it.
Add ZSC Control register programming sequence for ACP D0 and D3 state
transitions for ACP7.0 onwards. This will allow ACP to enter low power
state when ACP enters D3 state. When ACP enters D0 State, ZSC control
should be disabled.
Tested-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Link: https://patch.msgid.link/20240807085154.1987681-1-Vijendar.Mukunda@amd.com Signed-off-by: Mark Brown <broonie@kernel.org> Cc: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix driver not allocating memory for struct btintel_data which is used
to store internal data.
Fixes: 6e65a09f9275 ("Bluetooth: btintel_pcie: Add *setup* function to download firmware") Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: Thomas Leroy <thomas.leroy@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"),
on AMD processors that support extended CPUID leaf 0x80000026, the
topology_logical_die_id() macro no longer returns the package id; instead it
returns the CCD (Core Complex Die) id. This changes the scope of the
energy-pkg event from package to CCD.
For more historical context, please refer to commit 32fb480e0a2c
("powercap/intel_rapl: Support multi-die/package"), which initially changed
the RAPL scope from package to die for all systems, as Intel systems
with Die enumeration have RAPL scope as die, and those without die
enumeration are not affected. So all systems (Intel, AMD, Hygon) worked
correctly with topology_logical_die_id() until recently, but this changed
after the "0x80000026 leaf" commit mentioned above.
Future multi-die Intel systems will have package scope RAPL counters,
but they will be using TPMI RAPL interface, which is not affected by
this change.
Replacing topology_logical_die_id() with topology_physical_package_id()
conditionally only for AMD and Hygon fixes the energy-pkg event.
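The vendor-conditional selection can be sketched as follows. The struct, the enum and rapl_domain_id() are hypothetical names for illustration only; the real code uses the kernel's topology helpers and vendor checks.

```c
enum sketch_vendor {
	SKETCH_INTEL,
	SKETCH_AMD,
	SKETCH_HYGON,
};

/* Hypothetical per-CPU topology snapshot. */
struct sketch_cpu {
	enum sketch_vendor vendor;
	int package_id; /* physical package id */
	int die_id;     /* logical die id (the CCD on newer AMD parts) */
};

/* Pick the id that matches the RAPL scope for this vendor. */
static int rapl_domain_id(const struct sketch_cpu *cpu)
{
	/* AMD and Hygon: energy-pkg is package scoped. */
	if (cpu->vendor == SKETCH_AMD || cpu->vendor == SKETCH_HYGON)
		return cpu->package_id;

	/* Everyone else keeps the existing die-scoped behaviour. */
	return cpu->die_id;
}
```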
Having a limit of 64 DRM devices is not good enough for modern world
where we have multi-GPU servers, SR-IOV virtual functions and virtual
devices used for testing.
Let's utilize full minor range for DRM devices.
To avoid regressing the existing userspace, we're still maintaining the
numbering scheme where 0-63 is used for primary, 64-127 is reserved
(formerly for control) and 128-191 is used for render.
For minors >= 192, we're allocating minors dynamically on a first-come,
first-served basis.
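The numbering scheme above can be sketched with a simple bitmap allocator: try the legacy 64-wide range for the minor type first, then fall back to the dynamic range starting at 192. This is an illustration only; the names are hypothetical, the real code uses an XArray rather than a bitmap, and SKETCH_MAX_MINORS is an arbitrary small cap chosen for the example.

```c
#include <stdbool.h>

enum sketch_minor_type {
	SKETCH_PRIMARY  = 0, /* legacy minors 0-63 */
	SKETCH_RESERVED = 1, /* legacy minors 64-127 (formerly control) */
	SKETCH_RENDER   = 2, /* legacy minors 128-191 */
};

#define SKETCH_LEGACY_RANGE 64
#define SKETCH_DYNAMIC_BASE 192
#define SKETCH_MAX_MINORS   1024 /* arbitrary cap for this sketch */

static bool minor_used[SKETCH_MAX_MINORS];

/* Return the first free minor in [lo, hi], or -1 if the range is full. */
static int alloc_in_range(int lo, int hi)
{
	for (int i = lo; i <= hi; i++) {
		if (!minor_used[i]) {
			minor_used[i] = true;
			return i;
		}
	}
	return -1;
}

static int minor_alloc(enum sketch_minor_type type)
{
	int lo = SKETCH_LEGACY_RANGE * type;
	int id = alloc_in_range(lo, lo + SKETCH_LEGACY_RANGE - 1);

	/* Legacy range exhausted: first-come, first-served from 192 up. */
	if (id < 0)
		id = alloc_in_range(SKETCH_DYNAMIC_BASE, SKETCH_MAX_MINORS - 1);

	return id;
}
```

Existing userspace that expects primary nodes at 0-63 and render nodes at 128-191 keeps working; only the 65th device of a given type lands in the dynamic range.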
Accel minor management is based on DRM (and also uses struct
drm_minor internally). Since DRM is using XArray for minors, it makes
sense to also convert accel.
As the two implementations are identical (only difference being the
underlying xarray), move the accel_minor_* functionality to DRM.
IDR is deprecated, and since XArray manages its own state with internal
locking, it simplifies the locking on DRM side.
Additionally, don't use the IRQ-safe variant, since operating on drm
minor is not done in IRQ context.
Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Acked-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240823163048.2676257-2-michal.winiarski@intel.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This caused a regression with the bochsdrm driver, which used ioremap()
instead of ioremap_wc() to map the video RAM. After the commit, the
WB memory type is used without the IGNORE_PAT, resulting in the slower
UC memory type. In fact, UC is slow enough to basically cause guests
to not boot... but only on new processors such as Sapphire Rapids and
Cascade Lake. Coffee Lake for example works properly, though that might
also be an effect of being on a larger, more NUMA system.
The driver has been fixed but that does not help older guests. Until we
figure out whether Cascade Lake and newer processors are working as
intended, revert the commit. Long term we might add a quirk, but the
details depend on whether the processors are working as intended: for
example if they are, the quirk might reference bochs-compatible devices,
e.g. in the name and documentation, so that userspace can disable the
quirk by default and only leave it enabled if such a device is being
exposed to the guest.
If instead this is actually a bug in CLX+, then the actions we need to
take are different and depend on the actual cause of the bug.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge tag 'sound-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A few last-minute ASoC fixes and MAINTAINERS update.
All look small, obvious and nice-to-have fixes for 6.11-final"
* tag 'sound-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: meson: axg-card: fix 'use-after-free'
ASoC: codecs: avoid possible garbage value in peb2466_reg_read()
MAINTAINERS: update Pierre Bossart's email and role
ASoC: tas2781: fix to save the dsp bin file name into the correct array in case name_prefix is not NULL
ASoC: Intel: soc-acpi-intel-mtl-match: add missing empty item
ASoC: Intel: soc-acpi-intel-lnl-match: add missing empty item
Merge tag 'asoc-fix-v6.11-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.11
A few last minute fixes, plus an update for Pierre's contact details and
status. It'd be good to get these into v6.11 (especially the
MAINTAINERS update) but it wouldn't be the end of the world if they
waited for the merge window, none of them are super remarkable and it's
just a question of timing that they're last minute.
Merge tag 'pci-v6.11-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fix from Bjorn Helgaas:
- Prevent a possible deadlock (reported by lockdep) when a driver
relinquishes a pci_dev, another driver claims it, and one uses
managed pcim_enable_device() and the other doesn't (Philipp Stanner)
* tag 'pci-v6.11-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Fix potential deadlock in pcim_intx()
Merge tag 'spi-fix-v6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few last minute fixes for v6.11, they're all individually
unremarkable and only last minute due to when they came in"
* tag 'spi-fix-v6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: nxp-fspi: fix the KASAN report out-of-bounds bug
spi: geni-qcom: Fix incorrect free_irq() sequence
spi: geni-qcom: Undo runtime PM changes at driver exit time
Merge tag 'soundwire-6.11-fixes_2' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:
- Revert of earlier fix sent for non-continuous port map programming
which caused regression on Intel platforms
* tag 'soundwire-6.11-fixes_2' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: stream: Revert "soundwire: stream: fix programming slave ports for non-continous port maps"
Merge tag 'drm-fixes-2024-09-13' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Regular fixes pull, the amdgpu JPEG engine fixes are probably the
biggest, they look to block some register accessing, otherwise there
are just minor fixes and regression fixes all over.
nouveau had a regression report going back a few kernels that finally
got fixed. Not entirely happy with so many changes so late, but they
all seem quite benign apart from the jpeg one.
dma-buf/heaps:
- fix off by one in CMA heap fault handler
syncobj:
- fix syncobj leak in drm_syncobj_eventfd_ioctl
amdgpu:
- Avoid races between set_drr() functions and dc_state_destruct()
- Fix regression related to zpos
- Fix regression related to overlay cursor
- SMU 14.x updates
- JPEG fixes
- Silence an UBSAN warning
amdkfd:
- Fetch cacheline size from IP discovery
i915:
- Prevent a possible int overflow in wq offsets
xe:
- Remove a double include
- Fix null checks and UAF
- Fix access_ok check in user_fence_create
- Fix compat IS_DISPLAY_STEP() range
- OA fix
- Fixes in show_meminfo
nouveau:
- fix GP10x regression on boot
stm:
- add COMMON_CLK dep
rockchip:
- iommu api change
tegra:
- iommu api change"
* tag 'drm-fixes-2024-09-13' of https://gitlab.freedesktop.org/drm/kernel: (25 commits)
drm/xe/client: add missing bo locking in show_meminfo()
drm/xe/client: fix deadlock in show_meminfo()
drm/xe/oa: Enable Xe2+ PES disaggregation
drm/xe/display: fix compat IS_DISPLAY_STEP() range end
drm/xe: Fix access_ok check in user_fence_create
drm/xe: Fix possible UAF in guc_exec_queue_process_msg
drm/xe: Remove fence check from send_tlb_invalidation
drm/xe/gt: Remove double include
drm/amd/display: Add all planes on CRTC to state for overlay cursor
drm/amdgpu/atomfirmware: Silence UBSAN warning
drm/amd/amdgpu: apply command submission parser for JPEG v1
drm/amd/amdgpu: apply command submission parser for JPEG v2+
drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3
drm/amd/pm: update the features set on smu v14.0.2/3
drm/amd/display: Do not reset planes based on crtc zpos_changed
drm/amd/display: Avoid race between dcn35_set_drr() and dc_state_destruct()
drm/amd/display: Avoid race between dcn10_set_drr() and dc_state_destruct()
drm/amdkfd: Add cache line size info
drm/tegra: Use iommu_paging_domain_alloc()
drm/rockchip: Use iommu_paging_domain_alloc()
...
Patrick Rudolph [Mon, 2 Sep 2024 07:28:58 +0000 (09:28 +0200)]
pinctrl: pinctrl-cy8c95x0: Fix regcache
The size of the mux stride was off by one, which could result in
invalid pin configuration on the device side or invalid state
readings on the software side.
While at it, also update the code to:
- Increase the mux stride size to 16
- Align the virtual muxed regmap range to 16
- Start the regmap window at the selector
- Mark reserved registers as not-readable
Fixes: 8670de9fae49 ("pinctrl: cy8c95x0: Use regmap ranges") Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com> Reported-by: Andy Shevchenko <andy@kernel.org> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/20240902072859.583490-1-patrick.rudolph@9elements.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Dave Airlie [Fri, 13 Sep 2024 05:18:15 +0000 (15:18 +1000)]
Merge tag 'drm-xe-fixes-2024-09-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
- Remove a double include (Lucas)
- Fix null checks and UAF (Brost)
- Fix access_ok check in user_fence_create (Nirmoy)
- Fix compat IS_DISPLAY_STEP() range (Jani)
- OA fix (Ashutosh)
- Fixes in show_meminfo (Auld)
Dave Airlie [Fri, 13 Sep 2024 04:47:49 +0000 (14:47 +1000)]
Merge tag 'drm-misc-fixes-2024-09-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
An off-by-one fix for the CMA DMA-buf heap, an init fix for nouveau, a
config dependency fix for stm, a syncobj leak fix, and two iommu fixes
for tegra and rockchip.
Dave Airlie [Fri, 13 Sep 2024 01:33:37 +0000 (11:33 +1000)]
Merge tag 'amd-drm-fixes-6.11-2024-09-11' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.11-2024-09-11:
amdgpu:
- Avoid races between set_drr() functions and dc_state_destruct()
- Fix regression related to zpos
- Fix regression related to overlay cursor
- SMU 14.x updates
- JPEG fixes
- Silence an UBSAN warning
David Howells [Thu, 12 Sep 2024 15:58:48 +0000 (16:58 +0100)]
cifs: Fix signature miscalculation
Fix the calculation of packet signatures by adding the offset into a page
in the read or write data payload when hashing the pages from it.
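The effect of the missing offset can be shown with a toy model. Everything here is an assumption made for illustration: the 16-byte "page", the multiplicative hash_update() stand-in (not the real signing hash), and the honor_offset switch that toggles between the buggy and the fixed behaviour.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 16 /* tiny page size, for the sketch only */

/* Toy rolling hash standing in for the real signature hash. */
static uint32_t hash_update(uint32_t h, const uint8_t *p, size_t n)
{
	while (n--)
		h = h * 31 + *p++;
	return h;
}

/*
 * Hash len bytes of payload that start at 'offset' inside pages[0] and
 * continue across the following pages. With honor_offset == 0 the first
 * chunk is wrongly read from the start of the page, which models the bug.
 */
static uint32_t hash_pages(uint8_t *const pages[], size_t offset, size_t len,
			   int honor_offset)
{
	uint32_t h = 0;
	size_t i = 0;

	while (len) {
		size_t chunk = SKETCH_PAGE_SIZE - offset;

		if (chunk > len)
			chunk = len;
		h = hash_update(h, pages[i] + (honor_offset ? offset : 0), chunk);
		len -= chunk;
		offset = 0; /* only the first page starts mid-page */
		i++;
	}
	return h;
}
```

Only the variant that adds the in-page offset hashes the same bytes as a flat copy of the payload, so only that variant produces a signature the peer can verify.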
Fixes: 39bc58203f04 ("cifs: Add a function to Hash the contents of an iterator") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Tom Talpey <tom@talpey.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One build fix for 32-bit arches using the Qualcomm PLL driver. It's
cheaper to use a comparison here instead of a division so we just do
that to fix the build"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: clk-alpha-pll: Simplify the zonda_pll_adjust_l_val()
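The idea behind the clk fix above, replacing a division with a comparison when rounding, can be sketched like this. The function names are hypothetical and the two variants are an assumed shape of the before/after code, not a copy of zonda_pll_adjust_l_val().

```c
#include <stdint.h>

/*
 * Rounding step written with a second division. On 32-bit targets a plain
 * 64-bit '/' in kernel code needs a helper, which is what breaks the build.
 */
static uint64_t round_l_div(uint64_t quotient, uint64_t remainder,
			    uint64_t prate)
{
	if ((remainder * 2) / prate)
		quotient++;
	return quotient;
}

/* Same decision expressed as a comparison: no division needed. */
static uint64_t round_l_cmp(uint64_t quotient, uint64_t remainder,
			    uint64_t prate)
{
	if (remainder * 2 >= prate)
		quotient++;
	return quotient;
}
```

For any non-zero prate, (remainder * 2) / prate is non-zero exactly when remainder * 2 >= prate, so both variants round identically (assuming remainder * 2 does not overflow).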
Merge tag 'block-6.11-20240912' of git://git.kernel.dk/linux
Pull block fix from Jens Axboe:
"Just a single fix for a deadlock issue that can happen if someone
attempts to change the root disk IO scheduler with a module that
requires loading from disk.
Changing the scheduler freezes the queue while that operation is
happening, hence causing a deadlock"
* tag 'block-6.11-20240912' of git://git.kernel.dk/linux:
block: Prevent deadlocks when switching elevators
Merge tag 'hwmon-for-v6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
- Fix clearing status register bits for chips supporting older
PMBus versions
* tag 'hwmon-for-v6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pmbus) Conditionally clear individual status bits for pmbus rev >= 1.2
Merge tag 'wq-for-6.11-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"A fix for a NULL worker->pool deref bug which can be triggered when a
worker is created and then destroyed immediately"
* tag 'wq-for-6.11-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Clear worker->pool in the worker thread context
Merge tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- Two fixes for smp_processor_id() calls in preemptible sections: one
in the perf driver, and one in the fence.i prctl.
* tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
drivers: perf: Fix smp_processor_id() use in preemptible code
Merge tag 'net-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
There is a recently notified BT regression with no fix yet. I do not
think a fix will land in the next week.
Current release - regressions:
- core: tighten bad gso csum offset check in virtio_net_hdr
- netfilter: move nf flowtable bpf initialization in
nf_flow_table_module_init()
- eth: ice: stop calling pci_disable_device() as we use pcim
- eth: fou: fix null-ptr-deref in GRO.
Current release - new code bugs:
- hsr: prevent NULL pointer dereference in hsr_proxy_announce()
Previous releases - regressions:
- hsr: remove seqnr_lock
- netfilter: nft_socket: fix sk refcount leaks
- mptcp: pm: fix uaf in __timer_delete_sync
- phy: dp83822: fix NULL pointer dereference on DP83825 devices
- eth: revert "virtio_net: rx enable premapped mode by default"
- eth: octeontx2-af: Modify SMQ flush sequence to drop packets
Previous releases - always broken:
- eth: mlx5: fix bridge mode operations when there are no VFs
- eth: igb: Always call igb_xdp_ring_update_tail() under Tx lock"
* tag 'net-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init()
net: tighten bad gso csum offset check in virtio_net_hdr
netlink: specs: mptcp: fix port endianness
net: dpaa: Pad packets to ETH_ZLEN
mptcp: pm: Fix uaf in __timer_delete_sync
net: libwx: fix number of Rx and Tx descriptors
net: dsa: felix: ignore pending status of TAS module when it's disabled
net: hsr: prevent NULL pointer dereference in hsr_proxy_announce()
selftests: mptcp: include net_helper.sh file
selftests: mptcp: include lib.sh file
selftests: mptcp: join: restrict fullmesh endp on 1st sf
netfilter: nft_socket: make cgroupsv2 matching work with namespaces
netfilter: nft_socket: fix sk refcount leaks
MAINTAINERS: Add ethtool pse-pd to PSE NETWORK DRIVER
dt-bindings: net: tja11xx: fix the broken binding
selftests: net: csum: Fix checksums for packets with non-zero padding
net: phy: dp83822: Fix NULL pointer dereference on DP83825 devices
virtio_net: disable premapped mode by default
Revert "virtio_net: big mode skip the unmap check"
Revert "virtio_net: rx remove premapped failover code"
...
Merge tag 'platform-drivers-x86-v6.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- asus-wmi: Disable OOBE that interferes with backlight control
- panasonic-laptop: Two fixes to SINF array handling
* tag 'platform-drivers-x86-v6.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-wmi: Disable OOBE experience on Zenbook S 16
platform/x86: panasonic-laptop: Allocate 1 entry extra in the sinf array
platform/x86: panasonic-laptop: Fix SINF array out of bounds accesses
mm: avoid leaving partial pfn mappings around in error case
As Jann points out, PFN mappings are special, because unlike normal
memory mappings, there is no lifetime information associated with the
mapping - it is just a raw mapping of PFNs with no reference counting of
a 'struct page'.
That's all very much intentional, but it does mean that it's easy to
mess up the cleanup in case of errors. Yes, a failed mmap() will always
eventually clean up any partial mappings, but without any explicit
lifetime in the page table mapping itself, it's very easy to do the
error handling in the wrong order.
In particular, it's easy to mistakenly free the physical backing store
before the page tables are actually cleaned up and (temporarily) have
stale dangling PTE entries.
To make this situation less error-prone, just make sure that any partial
pfn mapping is torn down early, before any other error handling.
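The "tear down early" pattern can be modelled in userspace with an array standing in for page table entries. All names here are hypothetical, a zero entry means "not mapped" (so PFNs are assumed non-zero), and fail_at merely simulates an error partway through the range.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NPTES 8

static uint64_t sketch_ptes[SKETCH_NPTES]; /* 0 means "no mapping" */

/* Fill entries one by one; fail when reaching index fail_at. */
static int map_range_internal(size_t start, size_t n, uint64_t pfn,
			      size_t fail_at)
{
	for (size_t i = 0; i < n; i++) {
		if (start + i == fail_at)
			return -1; /* leaves entries [start, start + i) set */
		sketch_ptes[start + i] = pfn + i;
	}
	return 0;
}

/* Clear every entry in the range, mapped or not. */
static void zap_range(size_t start, size_t n)
{
	for (size_t i = 0; i < n; i++)
		sketch_ptes[start + i] = 0;
}

/*
 * Wrapper: on error, remove the partial mapping immediately, before the
 * caller gets a chance to free the backing store in its own error path.
 */
static int map_range(size_t start, size_t n, uint64_t pfn, size_t fail_at)
{
	int err = map_range_internal(start, n, pfn, fail_at);

	if (err)
		zap_range(start, n);
	return err;
}

static size_t live_ptes(void)
{
	size_t count = 0;

	for (size_t i = 0; i < SKETCH_NPTES; i++)
		count += sketch_ptes[i] != 0;
	return count;
}
```

Because the wrapper zaps before returning, no stale entry can outlive whatever cleanup the caller performs afterwards, regardless of the order in which the caller releases its resources.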
Reported-and-tested-by: Jann Horn <jannh@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Auld [Wed, 11 Sep 2024 15:55:28 +0000 (16:55 +0100)]
drm/xe/client: add missing bo locking in show_meminfo()
bo_meminfo() wants to inspect bo state like tt and the ttm resource,
however this state can change at any point leading to stuff like NPD and
UAF, if the bo lock is not held. Grab the bo lock when calling
bo_meminfo(), ensuring we drop any spinlocks first. In the case of
object_idr we now also need to hold a ref.
Matthew Auld [Wed, 11 Sep 2024 15:55:27 +0000 (16:55 +0100)]
drm/xe/client: fix deadlock in show_meminfo()
There is a real deadlock as well as a sleeping-in-atomic bug in here if
the bo put happens to be the last ref, since bo destruction wants to
grab the same spinlock and sleeping locks. Fix that by dropping the ref
using xe_bo_put_deferred(), and moving the final commit outside of the
lock. Dropping the lock around the put is tricky since the bo can go
out of scope and delete itself from the list, making it difficult to
navigate to the next list entry.
Ashutosh Dixit [Mon, 9 Sep 2024 16:59:33 +0000 (09:59 -0700)]
drm/xe/oa: Enable Xe2+ PES disaggregation
Enable Xe2+ PES disaggregation (for OAG) to retrieve disaggregated metrics
when disaggregated data is needed. Userspace can select whether to receive
aggregated or disaggregated metrics via the particular OA configuration it
uses (programmed via DRM_XE_OBSERVATION_OP_ADD_CONFIG).
Lorenzo Bianconi [Wed, 11 Sep 2024 15:37:30 +0000 (17:37 +0200)]
net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init()
Move the nf flowtable bpf initialization into the nf_flow_table module load
routine, since nf_flow_table_bpf is part of the nf_flow_table module and not
the nf_flow_table_inet one. This avoids the following kernel
warning when running the reproducer below:
$modprobe nf_flow_table_inet
$rmmod nf_flow_table_inet
$modprobe nf_flow_table_inet
modprobe: ERROR: could not insert 'nf_flow_table_inet': Invalid argument
Paolo Abeni [Thu, 12 Sep 2024 13:26:18 +0000 (15:26 +0200)]
Merge tag 'nf-24-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains two fixes from Florian Westphal:
Patch #1 fixes a sk refcount leak in nft_socket on mismatch.
Patch #2 fixes cgroupsv2 matching from containers due to incorrect
level in subtree.
netfilter pull request 24-09-12
* tag 'nf-24-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_socket: make cgroupsv2 matching work with namespaces
netfilter: nft_socket: fix sk refcount leaks
====================
Philipp Stanner [Thu, 5 Sep 2024 07:25:57 +0000 (09:25 +0200)]
PCI: Fix potential deadlock in pcim_intx()
25216afc9db5 ("PCI: Add managed pcim_intx()") moved the allocation step for
pci_intx()'s device resource from pcim_enable_device() to pcim_intx(). As
before, pcim_enable_device() sets pci_dev.is_managed to true; and it is
never set to false again.
Due to the lifecycle of a struct pci_dev, it can happen that a second
driver obtains the same pci_dev after a first driver ran. If one driver
uses pcim_enable_device() and the other doesn't, this causes the other
driver to run into managed pcim_intx(), which will try to allocate when
called for the first time.
Allocations might sleep, so calling pci_intx() while holding spinlocks
then becomes invalid, which causes lockdep warnings and could cause
deadlocks:
========================================================
WARNING: possible irq lock inversion dependency detected
6.11.0-rc6+ #59 Tainted: G W
--------------------------------------------------------
CPU 0/KVM/1537 just changed the state of lock: ffffa0f0cff965f0 (&vdev->irqlock){-...}-{2:2}, at:
vfio_intx_handler+0x21/0xd0 [vfio_pci_core] but this lock took another,
HARDIRQ-unsafe lock in the past: (fs_reclaim){+.+.}-{0:0}
and interrupts could create inverse lock ordering between them.
Have pcim_enable_device()'s release function, pcim_disable_device(), set
pci_dev.is_managed to false so that subsequent drivers using the same
struct pci_dev do not implicitly run into managed code.
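The lifecycle issue can be modelled with a flag on a device struct that outlives its drivers. This is a simplified state model with hypothetical names, not the PCI core code; it only shows why the release function has to clear the flag.

```c
#include <stdbool.h>

/* Minimal stand-in for the bits of struct pci_dev that matter here. */
struct sketch_pdev {
	bool enabled;
	bool is_managed;
};

/* First driver: managed enable marks the device as managed. */
static void sketch_managed_enable(struct sketch_pdev *d)
{
	d->enabled = true;
	d->is_managed = true;
}

/* Release function run when the managed driver goes away. */
static void sketch_managed_release(struct sketch_pdev *d)
{
	d->enabled = false;
	d->is_managed = false; /* the fix: do not leak the flag to the next driver */
}

/* Second driver: plain, unmanaged enable. */
static void sketch_plain_enable(struct sketch_pdev *d)
{
	d->enabled = true;
}

/* Would an INTx call take the managed (allocating, possibly sleeping) path? */
static bool sketch_intx_is_managed(const struct sketch_pdev *d)
{
	return d->is_managed;
}
```

Without the reset in the release function, the second driver would inherit is_managed == true and end up in the allocating path while holding a spinlock.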
Link: https://lore.kernel.org/r/20240905072556.11375-2-pstanner@redhat.com Fixes: 25216afc9db5 ("PCI: Add managed pcim_intx()") Reported-by: Alex Williamson <alex.williamson@redhat.com> Closes: https://lore.kernel.org/all/20240903094431.63551744.alex.williamson@redhat.com/ Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org>