git.ipfire.org Git - thirdparty/kernel/stable.git/log

can: m_can: m_can_close(): stop clocks after device has been shut down

[ Upstream commit 2c09b50efcad985cf920ca88baa9aa52b1999dcc ]

After calling m_can_stop() an interrupt may be pending or NAPI might
still be executed. This means the driver might still touch registers
of the IP core after the clocks have been disabled. This is not good
practice and might lead to aborts depending on the SoC integration.

To avoid these potential problems, make m_can_close() symmetric to
m_can_open(), i.e. stop the clocks at the end, right before shutting
down the transceiver.

Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support")
Link: https://patch.msgid.link/20240910-can-m_can-fix-ifup-v3-2-6c1720ba45ce@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

can: m_can: enable NAPI before enabling interrupts

[ Upstream commit 801ad2f87b0c6d0c34a75a4efd6bfd3a2d9f9298 ]

If an interrupt (RX-complete or error flag) is set when bringing up
the CAN device, e.g. due to CAN bus traffic before initializing the
device, when m_can_start() is called and interrupts are enabled,
m_can_isr() is called immediately, which disables all CAN interrupts
and calls napi_schedule().

Because napi_enable() isn't called until later in m_can_open(), the
call to napi_schedule() never schedules the m_can_poll() callback and
the device is left with interrupts disabled and can't receive any CAN
packets until rebooted.

This can be verified by running "cansend" from another device before
setting the bitrate and calling "ip link set up can0" on the test
device. Adding debug lines to m_can_isr() shows it's called with flags
(IR_EP | IR_EW | IR_CRCE), which calls m_can_disable_all_interrupts()
and napi_schedule(), and then m_can_poll() is never called.

Move the call to napi_enable() above the call to m_can_start() to
enable any initial interrupt flags to be handled by m_can_poll() so
that interrupts are reenabled. Add a call to napi_disable() in the
error handling section of m_can_open(), to handle the case where later
functions return errors.

Also, in m_can_close(), move the call to napi_disable() below the call
to m_can_stop() to ensure all interrupts are handled when bringing
down the device. This race condition is much less likely to occur.

Tested on a Microchip SAMA7G54 MPU. The fix should be applicable to
any SoC with a Bosch M_CAN controller.

Signed-off-by: Jake Hamby <Jake.Hamby@Teledyne.com>
Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support")
Link: https://patch.msgid.link/20240910-can-m_can-fix-ifup-v3-1-6c1720ba45ce@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

can: bcm: Clear bo->bcm_proc_read after remove_proc_entry().

[ Upstream commit 94b0818fa63555a65f6ba107080659ea6bcca63e ]

syzbot reported a warning in bcm_release(). [0]

The blamed change fixed another warning that is triggered when
connect() is issued again for a socket whose connect()ed device has
been unregistered.

However, if the socket is just close()d without the 2nd connect(), the
remaining bo->bcm_proc_read triggers unnecessary remove_proc_entry()
in bcm_release().

Let's clear bo->bcm_proc_read after remove_proc_entry() in bcm_notify().

[0]
name '4986'
WARNING: CPU: 0 PID: 5234 at fs/proc/generic.c:711 remove_proc_entry+0x2e7/0x5d0 fs/proc/generic.c:711
Modules linked in:
CPU: 0 UID: 0 PID: 5234 Comm: syz-executor606 Not tainted 6.11.0-rc5-syzkaller-00178-g5517ae241919 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
RIP: 0010:remove_proc_entry+0x2e7/0x5d0 fs/proc/generic.c:711
Code: ff eb 05 e8 cb 1e 5e ff 48 8b 5c 24 10 48 c7 c7 e0 f7 aa 8e e8 2a 38 8e 09 90 48 c7 c7 60 3a 1b 8c 48 89 de e8 da 42 20 ff 90 <0f> 0b 90 90 48 8b 44 24 18 48 c7 44 24 40 0e 36 e0 45 49 c7 04 07
RSP: 0018:ffffc9000345fa20 EFLAGS: 00010246
RAX: 2a2d0aee2eb64600 RBX: ffff888032f1f548 RCX: ffff888029431e00
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc9000345fb08 R08: ffffffff8155b2f2 R09: 1ffff1101710519a
R10: dffffc0000000000 R11: ffffed101710519b R12: ffff888011d38640
R13: 0000000000000004 R14: 0000000000000000 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff8880b8800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fcfb52722f0 CR3: 000000000e734000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
bcm_release+0x250/0x880 net/can/bcm.c:1578
__sock_release net/socket.c:659 [inline]
sock_close+0xbc/0x240 net/socket.c:1421
__fput+0x24a/0x8a0 fs/file_table.c:422
task_work_run+0x24f/0x310 kernel/task_work.c:228
exit_task_work include/linux/task_work.h:40 [inline]
do_exit+0xa2f/0x27f0 kernel/exit.c:882
do_group_exit+0x207/0x2c0 kernel/exit.c:1031
__do_sys_exit_group kernel/exit.c:1042 [inline]
__se_sys_exit_group kernel/exit.c:1040 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1040
x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcfb51ee969
Code: Unable to access opcode bytes at 0x7fcfb51ee93f.
RSP: 002b:00007ffce0109ca8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fcfb51ee969
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
RBP: 00007fcfb526f3b0 R08: ffffffffffffffb8 R09: 0000555500000000
R10: 0000555500000000 R11: 0000000000000246 R12: 00007fcfb526f3b0
R13: 0000000000000000 R14: 00007fcfb5271ee0 R15: 00007fcfb51bf160
</TASK>

Fixes: 76fe372ccb81 ("can: bcm: Remove proc entry when dev is unregistered.")
Reported-by: syzbot+0532ac7a06fb1a03187e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=0532ac7a06fb1a03187e
Tested-by: syzbot+0532ac7a06fb1a03187e@syzkaller.appspotmail.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20240905012237.79683-1-kuniyu@amazon.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hsr: Use the seqnr lock for frames received via interlink port.

[ Upstream commit 430d67bdcb04ee8502c2b10dcbaced4253649189 ]

syzbot reported that the seqnr_lock is not acquire for frames received
over the interlink port. In the interlink case a new seqnr is generated
and assigned to the frame.
Frames, which are received over the slave port have already a sequence
number assigned so the lock is not required.

Acquire the hsr_priv::seqnr_lock during in the invocation of
hsr_forward_skb() if a packet has been received from the interlink port.

Reported-by: syzbot+3d602af7549af539274e@syzkaller.appspotmail.com
Closes: https://groups.google.com/g/syzkaller-bugs/c/KppVvGviGg4/m/EItSdCZdBAAJ
Fixes: 5055cccfc2d1c ("net: hsr: Provide RedBox support (HSR-SAN)")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Tested-by: Lukasz Majewski <lukma@denx.de>
Link: https://patch.msgid.link/20240906132816.657485-2-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

sock_map: Add a cond_resched() in sock_hash_free()

[ Upstream commit b1339be951ad31947ae19bc25cb08769bf255100 ]

Several syzbot soft lockup reports all have in common sock_hash_free()

If a map with a large number of buckets is destroyed, we need to yield
the cpu when needed.

Fixes: 75e68e5bf2c7 ("bpf, sockhash: Synchronize delete from bucket list on map free")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240906154449.3742932-1-edumazet@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_sync: Ignore errors from HCI_OP_REMOTE_NAME_REQ_CANCEL

[ Upstream commit cfbfeee61582e638770a1a10deef866c9adb38f5 ]

This ignores errors from HCI_OP_REMOTE_NAME_REQ_CANCEL since it
shouldn't interfere with the stopping of discovery and in certain
conditions it seems to be failing.

Link: https://github.com/bluez/bluez/issues/575
Fixes: d0b137062b2d ("Bluetooth: hci_sync: Rework init stages")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_core: Fix sending MGMT_EV_CONNECT_FAILED

[ Upstream commit d47da6bd4cfa982fe903f33423b9e2ec541e9496 ]

If HCI_CONN_MGMT_CONNECTED has been set then the event shall be
HCI_CONN_MGMT_DISCONNECTED.

Fixes: b644ba336997 ("Bluetooth: Update device_connected and device_found events to latest API")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: wilc1000: fix potential RCU dereference issue in wilc_parse_join_bss_param

[ Upstream commit 6d7c6ae1efb1ff68bc01d79d94fdf0388f86cdd8 ]

In the `wilc_parse_join_bss_param` function, the TSF field of the `ies`
structure is accessed after the RCU read-side critical section is
unlocked. According to RCU usage rules, this is illegal. Reusing this
pointer can lead to unpredictable behavior, including accessing memory
that has been updated or causing use-after-free issues.

This possible bug was identified using a static analysis tool developed
by myself, specifically designed to detect RCU-related issues.

To address this, the TSF value is now stored in a local variable
`ies_tsf` before the RCU lock is released. The `param->tsf_lo` field is
then assigned using this local variable, ensuring that the TSF value is
safely accessed.

Fixes: 205c50306acf ("wifi: wilc1000: fix RCU usage in connect path")
Signed-off-by: Jiawei Ye <jiawei.ye@foxmail.com>
Reviewed-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/tencent_466225AA599BA49627FB26F707EE17BC5407@qq.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: use two-phase skb reclamation in ieee80211_do_stop()

[ Upstream commit 9d301de12da6e1bb069a9835c38359b8e8135121 ]

Since '__dev_queue_xmit()' should be called with interrupts enabled,
the following backtrace:

ieee80211_do_stop()
...
spin_lock_irqsave(&local->queue_stop_reason_lock, flags)
...
ieee80211_free_txskb()
  ieee80211_report_used_skb()
   ieee80211_report_ack_skb()
    cfg80211_mgmt_tx_status_ext()
     nl80211_frame_tx_status()
      genlmsg_multicast_netns()
       genlmsg_multicast_netns_filtered()
        nlmsg_multicast_filtered()
netlink_broadcast_filtered()
  do_one_broadcast()
   netlink_broadcast_deliver()
    __netlink_sendskb()
     netlink_deliver_tap()
      __netlink_deliver_tap_skb()
       dev_queue_xmit()
        __dev_queue_xmit() ; with IRQS disabled
...
spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags)

issues the warning (as reported by syzbot reproducer):

WARNING: CPU: 2 PID: 5128 at kernel/softirq.c:362 __local_bh_enable_ip+0xc3/0x120

Fix this by implementing a two-phase skb reclamation in
'ieee80211_do_stop()', where actual work is performed
outside of a section with interrupts disabled.

Fixes: 5061b0c2b906 ("mac80211: cooperate more with network namespaces")
Reported-by: syzbot+1a3986bbd3169c307819@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1a3986bbd3169c307819
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://patch.msgid.link/20240906123151.351647-1-dmantipov@yandex.ru
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: fix two more possible UBSAN-detected off-by-one errors

[ Upstream commit 15ea13b1b1fbf6364d4cd568e65e4c8479632999 ]

Although not reproduced in practice, these two cases may be
considered by UBSAN as off-by-one errors. So fix them in the
same way as in commit a26a5107bc52 ("wifi: cfg80211: fix UBSAN
noise in cfg80211_wext_siwscan()").

Fixes: 807f8a8c3004 ("cfg80211/nl80211: add support for scheduled scans")
Fixes: 5ba63533bbf6 ("cfg80211: fix alignment problem in scan request")
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://patch.msgid.link/20240909090806.1091956-1-dmantipov@yandex.ru
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7996: fix uninitialized TLV data

[ Upstream commit 9e461f4a2329109571f4b10f9bcad28d07e6ebb3 ]

Use skb_put_zero instead of skb_put

Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Link: https://patch.msgid.link/20240827093011.18621-23-nbd@nbd.name
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7915: fix rx filter setting for bfee functionality

[ Upstream commit 6ac80fce713e875a316a58975b830720a3e27721 ]

Fix rx filter setting to prevent dropping NDPA frames. Without this
change, bfee functionality may behave abnormally.

Fixes: e57b7901469f ("mt76: add mac80211 driver for MT7915 PCIe-based chipsets")
Signed-off-by: Howard Hsu <howard-yh.hsu@mediatek.com>
Link: https://patch.msgid.link/20240827093011.18621-21-nbd@nbd.name
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: fix UBSAN noise in cfg80211_wext_siwscan()

[ Upstream commit a26a5107bc52922cf5f67361e307ad66547b51c7 ]

Looking at https://syzkaller.appspot.com/bug?extid=1a3986bbd3169c307819
and running reproducer with CONFIG_UBSAN_BOUNDS, I've noticed the
following:

[ T4985] UBSAN: array-index-out-of-bounds in net/wireless/scan.c:3479:25
[ T4985] index 164 is out of range for type 'struct ieee80211_channel *[]'
<...skipped...>
[ T4985] Call Trace:
[ T4985]  <TASK>
[ T4985]  dump_stack_lvl+0x1c2/0x2a0
[ T4985]  ? __pfx_dump_stack_lvl+0x10/0x10
[ T4985]  ? __pfx__printk+0x10/0x10
[ T4985]  __ubsan_handle_out_of_bounds+0x127/0x150
[ T4985]  cfg80211_wext_siwscan+0x11a4/0x1260
<...the rest is not too useful...>

Even if we do 'creq->n_channels = n_channels' before 'creq->ssids =
(void *)&creq->channels[n_channels]', UBSAN treats the latter as
off-by-one error. Fix this by using pointer arithmetic rather than
an expression with explicit array indexing and use convenient
'struct_size()' to simplify the math here and in 'kzalloc()' above.

Fixes: 5ba63533bbf6 ("cfg80211: fix alignment problem in scan request")
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20240905150400.126386-1-dmantipov@yandex.ru
[fix coding style for multi-line calculation]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7603: fix mixed declarations and code

[ Upstream commit 9b8d932053b8b45d650360b36f701cf0f9b6470e ]

Move the qid variable declaration further up

Fixes: b473c0e47f04 ("wifi: mt76: mt7603: fix tx queue of loopback packets")
Link: https://patch.msgid.link/20240827093011.18621-1-nbd@nbd.name
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: connac: fix checksum offload fields of connac3 RXD

[ Upstream commit a04b920fbc707deb699292cdae47006e8ea57d87 ]

Fix incorrect RXD offset and bitfield related to RX checksum offload.

Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Fixes: 4e9011fcdfc4 ("wifi: mt76: connac: move connac3 definitions in mt76_connac3_mac.h")
Co-developed-by: Shayne Chen <shayne.chen@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Link: https://patch.msgid.link/20240816095040.2574-1-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7996: fix handling mbss enable/disable

[ Upstream commit ded1a6d9e13a32d4e8bef0c63accf49b9415ff5f ]

When mbss was previously enabled, the TLV needs to be included when
disabling it again, in order to clear the firmware state.

Fixes: a7908d5b61e5 ("wifi: mt76: mt7996: fix non-main BSS no beacon issue for MBSS scenario")
Signed-off-by: Rex Lu <rex.lu@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20240816094635.2391-9-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: hisilicon/qm - inject error before stopping queue

[ Upstream commit b04f06fc0243600665b3b50253869533b7938468 ]

The master ooo cannot be completely closed when the
accelerator core reports memory error. Therefore, the driver
needs to inject the qm error to close the master ooo. Currently,
the qm error is injected after stopping queue, memory may be
released immediately after stopping queue, causing the device to
access the released memory. Therefore, error is injected to close master
ooo before stopping queue to ensure that the device does not access
the released memory.

Fixes: 6c6dd5802c2d ("crypto: hisilicon/qm - add controller reset interface")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: hisilicon/qm - reset device before enabling it

[ Upstream commit 5d2d1ee0874c26b8010ddf7f57e2f246e848af38 ]

Before the device is enabled again, the device may still
store the previously processed data. If an error occurs in
the previous task, the device may fail to be enabled again.
Therefore, before enabling device, reset the device to restore
the initial state.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stable-dep-of: b04f06fc0243 ("crypto: hisilicon/qm - inject error before stopping queue")
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: hisilicon/hpre - mask cluster timeout error

[ Upstream commit 145013f723947c83b1a5f76a0cf6e7237d59e973 ]

The timeout threshold of the hpre cluster is 16ms. When the CPU
and device share virtual address, page fault processing time may
exceed the threshold.

In the current test, there is a high probability that the
cluster times out. However, the cluster is waiting for the
completion of memory access, which is not an error, the device
does not need to be reset. If an error occurs in the cluster,
qm also reports the error. Therefore, the cluster timeout
error of hpre can be masked.

Fixes: d90fab0deb8e ("crypto: hisilicon/qm - get error type from hardware registers")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: ccp - do not request interrupt on cmd completion when irqs disabled

[ Upstream commit 3401f63e72596dcb7d912a5b67b4291643cc1034 ]

While sending a command to the PSP, we always requested an interrupt
from the PSP after command completion. This worked for most cases. For
the special case of irqs being disabled -- e.g. when running within
crashdump or kexec contexts, we should not set the SEV_CMDRESP_IOC flag,
so the PSP knows to not attempt interrupt delivery.

Fixes: 8ef979584ea8 ("crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump")
Based-on-patch-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Amit Shah <amit.shah@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

pm:cpupower: Add missing powercap_set_enabled() stub function

[ Upstream commit 4b80294fb53845dc5c98cca0c989da09150f2ca9 ]

There was a symbol listed in the powercap.h file that was not implemented.
Implement it with a stub return of 0.

Programs like SWIG require that functions that are defined in the
headers be implemented.

Fixes: c2294c1496b7 ("cpupower: Introduce powercap intel-rapl library and powercap-info command")
Suggested-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: John B. Wyatt IV <jwyatt@redhat.com>
Signed-off-by: John B. Wyatt IV <sageofredondo@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/sgx: Fix deadlock in SGX NUMA node search

[ Upstream commit 9c936844010466535bd46ea4ce4656ef17653644 ]

When the current node doesn't have an EPC section configured by firmware
and all other EPC sections are used up, CPU can get stuck inside the
while loop that looks for an available EPC page from remote nodes
indefinitely, leading to a soft lockup. Note how nid_of_current will
never be equal to nid in that while loop because nid_of_current is not
set in sgx_numa_mask.

Also worth mentioning is that it's perfectly fine for the firmware not
to setup an EPC section on a node. While setting up an EPC section on
each node can enhance performance, it is not a requirement for
functionality.

Rework the loop to start and end on *a* node that has SGX memory. This
avoids the deadlock looking for the current SGX-lacking node to show up
in the loop when it never will.

Fixes: 901ddbb9ecf5 ("x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()")
Reported-by: "Molina Sabido, Gerardo" <gerardo.molina.sabido@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Zhimin Luo <zhimin.luo@intel.com>
Link: https://lore.kernel.org/all/20240905080855.1699814-2-aaron.lu%40intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7996: fix EHT beamforming capability check

[ Upstream commit 9ca65757f0a5b393a7737d37f377d5daf91716af ]

If a VIF acts as a beamformer, it should check peer's beamformee
capability, and vice versa.

Fixes: ba01944adee9 ("wifi: mt76: mt7996: add EHT beamforming support")
Signed-off-by: Howard Hsu <howard-yh.hsu@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20240816094635.2391-7-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7996: fix HE and EHT beamforming capabilities

[ Upstream commit e1f4847fdbdf5d44ae60e035c131920e5ab88598 ]

Fix HE and EHT beamforming capabilities for different bands and
interface types.

Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Fixes: 348533eb968d ("wifi: mt76: mt7996: add EHT capability init")
Signed-off-by: Howard Hsu <howard-yh.hsu@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20240816094635.2391-5-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7996: fix wmm set of station interface to 3

[ Upstream commit 9265397caacf5c0c2d10c40b2958a474664ebd9e ]

According to connac3 HW design, the WMM index of AP and STA interface
should be 0 and 3, respectively.

Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20240816094635.2391-3-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7996: fix traffic delay when switching back to working channel

[ Upstream commit 376200f095d0c3a7096199b336204698d7086279 ]

During scanning, UNI_CHANNEL_RX_PATH tag is necessary for the firmware to
properly stop and resume MAC TX queue. Without this tag, HW needs more time
to resume traffic when switching back to working channel.

Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20240816094635.2391-2-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7996: use hweight16 to get correct tx antenna

[ Upstream commit f98c3de92bb05dac4a4969df8a4595ed380b4604 ]

The chainmask is u16 so using hweight8 cannot get correct tx_ant.
Without this patch, the tx_ant of band 2 would be -1 and lead to the
following issue:
BUG: KASAN: stack-out-of-bounds in mt7996_mcu_add_sta+0x12e0/0x16e0 [mt7996e]

Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20240816094635.2391-1-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7921: fix wrong UNII-4 freq range check for the channel usage

[ Upstream commit 723762a7a7e6fdb3cc6953f127a3fe9c5162beb7 ]

The check should start from 5845 to 5925, which includes
channels 169, 173, and 177.

Fixes: 09382d8f8641 ("wifi: mt76: mt7921: update the channel usage when the regd domain changed")
Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://patch.msgid.link/20240806013408.17874-1-mingyen.hsieh@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7915: fix oops on non-dbdc mt7986

[ Upstream commit 862bf7cbd772c2bad570ef0c5b5556a1330656dd ]

mt7915_band_config() sets band_idx = 1 on the main phy for mt7986
with MT7975_ONE_ADIE or MT7976_ONE_ADIE.

Commit 0335c034e726 ("wifi: mt76: fix race condition related to
checking tx queue fill status") introduced a dereference of the
phys array indirectly indexed by band_idx via wcid->phy_idx in
mt76_wcid_cleanup(). This caused the following Oops on affected
mt7986 devices:

Unable to handle kernel read from unreadable memory at virtual address 0000000000000024
Mem abort info:
   ESR = 0x0000000096000005
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x05: level 1 translation fault
Data abort info:
   ISV = 0, ISS = 0x00000005
   CM = 0, WnR = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=0000000042545000
[0000000000000024] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] SMP
Modules linked in: ... mt7915e mt76_connac_lib mt76 mac80211 cfg80211 ...
CPU: 2 PID: 1631 Comm: hostapd Not tainted 5.15.150 #0
Hardware name: ZyXEL EX5700 (Telenor) (DT)
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mt76_wcid_cleanup+0x84/0x22c [mt76]
lr : mt76_wcid_cleanup+0x64/0x22c [mt76]
sp : ffffffc00a803700
x29: ffffffc00a803700 x28: ffffff80008f7300 x27: ffffff80003f3c00
x26: ffffff80000a7880 x25: ffffffc008c26e00 x24: 0000000000000001
x23: ffffffc000a68114 x22: 0000000000000000 x21: ffffff8004172cc8
x20: ffffffc00a803748 x19: ffffff8004152020 x18: 0000000000000000
x17: 00000000000017c0 x16: ffffffc008ef5000 x15: 0000000000000be0
x14: ffffff8004172e28 x13: ffffff8004172e28 x12: 0000000000000000
x11: 0000000000000000 x10: ffffff8004172e30 x9 : ffffff8004172e28
x8 : 0000000000000000 x7 : ffffff8004156020 x6 : 0000000000000000
x5 : 0000000000000031 x4 : 0000000000000000 x3 : 0000000000000001
x2 : 0000000000000000 x1 : ffffff80008f7300 x0 : 0000000000000024
Call trace:
  mt76_wcid_cleanup+0x84/0x22c [mt76]
  __mt76_sta_remove+0x70/0xbc [mt76]
  mt76_sta_state+0x8c/0x1a4 [mt76]
  mt7915_eeprom_get_power_delta+0x11e4/0x23a0 [mt7915e]
  drv_sta_state+0x144/0x274 [mac80211]
  sta_info_move_state+0x1cc/0x2a4 [mac80211]
  sta_set_sinfo+0xaf8/0xc24 [mac80211]
  sta_info_destroy_addr_bss+0x4c/0x6c [mac80211]

  ieee80211_color_change_finish+0x1c08/0x1e70 [mac80211]
  cfg80211_check_station_change+0x1360/0x4710 [cfg80211]
  genl_family_rcv_msg_doit+0xb4/0x110
  genl_rcv_msg+0xd0/0x1bc
  netlink_rcv_skb+0x58/0x120
  genl_rcv+0x34/0x50
  netlink_unicast+0x1f0/0x2ec
  netlink_sendmsg+0x198/0x3d0
  ____sys_sendmsg+0x1b0/0x210
  ___sys_sendmsg+0x80/0xf0
  __sys_sendmsg+0x44/0xa0
  __arm64_sys_sendmsg+0x20/0x30
  invoke_syscall.constprop.0+0x4c/0xe0
  do_el0_svc+0x40/0xd0
  el0_svc+0x14/0x4c
  el0t_64_sync_handler+0x100/0x110
  el0t_64_sync+0x15c/0x160
Code: d2800002 910092c0 52800023 f9800011 (885f7c01)
---[ end trace 7e42dd9a39ed2281 ]---

Fix by using mt76_dev_phy() which will map band_idx to the correct phy
for all hardware combinations.

Fixes: 0335c034e726 ("wifi: mt76: fix race condition related to checking tx queue fill status")
Link: https://github.com/openwrt/openwrt/issues/14548
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Link: https://patch.msgid.link/20240713130010.516037-1-bjorn@mork.no
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: gov_bang_bang: Adjust states of all uninitialized instances

[ Upstream commit 15cb56bd529868d9242b22812fc69bd144bfdc94 ]

If a cooling device is registered after a thermal zone it should be
bound to and the trip point it should be bound to has already been
crossed by the zone temperature on the way up, the cooling device's
state may need to be adjusted, but the Bang-bang governor will not
do that because its .manage() callback only looks at thermal instances
for trip points whose thresholds are below or at the zone temperature.

Address this by updating bang_bang_manage() to look at all of the
uninitialized thermal instances and setting their target states in
accordance with the position of the zone temperature with respect to
the threshold of the given trip point.

Fixes: 5f64b4a1ab1b ("thermal: gov_bang_bang: Add .manage() callback")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/6103874.lOV4Wx5bFT@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>

cpufreq: ti-cpufreq: Introduce quirks to handle syscon fails appropriately

[ Upstream commit abc00ffda43bd4ba85896713464c7510c39f8165 ]

Commit b4bc9f9e27ed ("cpufreq: ti-cpufreq: add support for omap34xx
and omap36xx") introduced special handling for OMAP3 class devices
where syscon node may not be present. However, this also creates a bug
where the syscon node is present, however the offset used to read
is beyond the syscon defined range.

Fix this by providing a quirk option that is populated when such
special handling is required. This allows proper failure for all other
platforms when the syscon node and efuse offsets are mismatched.

Fixes: b4bc9f9e27ed ("cpufreq: ti-cpufreq: add support for omap34xx and omap36xx")
Signed-off-by: Nishanth Menon <nm@ti.com>
Tested-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf/arm-cmn: Ensure dtm_idx is big enough

[ Upstream commit 359414b33e00bae91e4eabf3e4ef8e76024c7673 ]

While CMN_MAX_DIMENSION was bumped to 12 for CMN-650, that only supports
up to a 10x10 mesh, so bumping dtm_idx to 256 bits at the time worked
out OK in practice. However CMN-700 did finally support up to 144 XPs,
and thus needs a worst-case 288 bits of dtm_idx for an aggregated XP
event on a maxed-out config. Oops.

Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e771b358526a0d7fc06efee2c3a2fdc0c9f51d44.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf/arm-cmn: Fix CCLA register offset

[ Upstream commit 88b63a82c84ed9bbcbdefb10cb6f75dd1dd04887 ]

Apparently pmu_event_sel is offset by 8 for all CCLA nodes, not just
the CCLA_RNI combination type.

Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6e7bb06fef6046f83e7647aad0e5be544139763f.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf/arm-cmn: Refactor node ID handling. Again.

[ Upstream commit e79634b53e398966c49f803c49701bc74dc3ccf8 ]

The scope of the "extra device ports" configuration is not made clear by
the CMN documentation - so far we've assumed it applies globally, based
on the sole example which suggests as much. However it transpires that
this is incorrect, and the format does in fact vary based on each
individual XP's port configuration. As a consequence, we're currenly
liable to decode the port/device indices from a node ID incorrectly,
thus program the wrong event source in the DTM leading to bogus event
counts, and also show device topology on the wrong ports in debugfs.

To put this right, rework node IDs yet again to carry around the
additional data necessary to decode them properly per-XP. At this point
the notion of fully decomposing an ID becomes more impractical than it's
worth, so unabstracting the XY mesh coordinates (where 2/3 users were
just debug anyway) ends up leaving things a bit simpler overall.

Fixes: 60d1504070c2 ("perf/arm-cmn: Support new IP features")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/5195f990152fc37adba5fbf5929a6b11063d9f09.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nft_dynset: annotate data-races around set timeout

[ Upstream commit c5ad8ed61fa8410b272c077ec167c593602b4542 ]

set timeout can be read locklessly while being updated from control
plane, add annotation.

Fixes: 123b99619cca ("netfilter: nf_tables: honor set timeout and garbage collection updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: remove annotation to access set timeout while holding lock

[ Upstream commit 15d8605c0cf4fc9cf4386cae658c68a0fd4bdb92 ]

Mutex is held when adding an element, no need for READ_ONCE, remove it.

Fixes: 123b99619cca ("netfilter: nf_tables: honor set timeout and garbage collection updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: reject expiration higher than timeout

[ Upstream commit c0f38a8c60174368aed1d0f9965d733195f15033 ]

Report ERANGE to userspace if user specifies an expiration larger than
the timeout.

Fixes: 8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23 days")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: reject element expiration with no timeout

[ Upstream commit d2dc429ecb4e79ad164028d965c00f689e6f6d06 ]

If element timeout is unset and set provides no default timeout, the
element expiration is silently ignored, reject this instead to let user
know this is unsupported.

Also prepare for supporting timeout that never expire, where zero
timeout and expiration must be also rejected.

Fixes: 8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23 days")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire

[ Upstream commit e0c47281723f301894c14e6f5cd5884fdfb813f9 ]

Element timeout that is below CONFIG_HZ never expires because the
timeout extension is not allocated given that nf_msecs_to_jiffies64()
returns 0. Set timeout to the minimum value to honor timeout.

Fixes: 8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23 days")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPI: CPPC: Fix MASK_VAL() usage

[ Upstream commit 60949b7b805424f21326b450ca4f1806c06d982e ]

MASK_VAL() was added as a way to handle bit_offset and bit_width for
registers located in system memory address space. However, while suited
for reading, it does not work for writing and result in corrupted
registers when writing values with bit_offset > 0. Moreover, when a
register is collocated with another one at the same address but with a
different mask, the current code results in the other registers being
overwritten with 0s. The write procedure for SYSTEM_MEMORY registers
should actually read the value, mask it, update it and write it with the
updated value. Moreover, since registers can be located in the same
word, we must take care of locking the access before doing it. We should
potentially use a global lock since we don't know in if register
addresses aren't shared with another _CPC package but better not
encourage vendors to do so. Assume that registers can use the same word
inside a _CPC package and thus, use a per _CPC package lock.

Fixes: 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Link: https://patch.msgid.link/20240826101648.95654-1-cleger@rivosinc.com
[ rjw: Dropped redundant semicolon ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

can: j1939: use correct function name in comment

[ Upstream commit dc2ddcd136fe9b6196a7dd01f75f824beb02d43f ]

The function j1939_cancel_all_active_sessions() was renamed to
j1939_cancel_active_session() but name in comment wasn't updated.

Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Link: https://patch.msgid.link/1724935703-44621-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

kselftest/arm64: Actually test SME vector length changes via sigreturn

[ Upstream commit 6f0315330af7a57c1c00587fdfb69c7778bf1c50 ]

The test case for SME vector length changes via sigreturn use a bit too
much cut'n'paste and only actually changed the SVE vector length in the
test itself. Andre's recent factoring out of the initialisation code caused
this to be exposed and the test to start failing. Fix the test to actually
cover the thing it's supposed to test.

Fixes: 4963aeb35a9e ("kselftest/arm64: signal: Add SME signal handling tests")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20240829-arm64-sme-signal-vl-change-test-v1-1-42d7534cb818@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

drivers/perf: hisi_pcie: Fix TLP headers bandwidth counting

[ Upstream commit 17bf68aeb3642221e3e770399b5a52f370747ac1 ]

We make the initial value of event ctrl register as HISI_PCIE_INIT_SET
and modify according to the user options. This will make TLP headers
bandwidth only counting never take effect since HISI_PCIE_INIT_SET
configures to count the TLP payloads bandwidth. Fix this by making
the initial value of event ctrl register as 0.

Fixes: 17d573984d4d ("drivers/perf: hisi: Add TLP filter support")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240829090332.28756-3-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

drivers/perf: hisi_pcie: Record hardware counts correctly

[ Upstream commit daecd3373a16a039ad241086e30a1ec46fc9d61f ]

Currently we set the period and record it as the initial value of the
counter without checking it's set to the hardware successfully or not.
However the counter maybe unwritable if the target event is unsupported
by the device. In such case we will pass user a wrong count:

[start counts when setting the period]
hwc->prev_count = 0x8000000000000000
device.counter_value = 0 // the counter is not set as the period
[when user reads the counter]
event->count = device.counter_value - hwc->prev_count
= 0x8000000000000000 // wrong. should be 0.

Fix this by record the hardware counter counts correctly when setting
the period.

Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240829090332.28756-2-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

padata: Honor the caller's alignment in case of chunk_size 0

[ Upstream commit 24cc57d8faaa4060fd58adf810b858fcfb71a02f ]

In the case where we are forcing the ps.chunk_size to be at least 1,
we are ignoring the caller's alignment.

Move the forcing of ps.chunk_size to be at least 1 before rounding it
up to caller's alignment, so that caller's alignment is honored.

While at it, use max() to force the ps.chunk_size to be at least 1 to
improve readability.

Fixes: 6d45e1c948a8 ("padata: Fix possible divide-by-0 panic in padata_mt_helper()")
Signed-off-by: Kamlesh Gurudasani <kamlesh@ti.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPICA: executer/exsystem: Don't nag user about every Stall() violating the spec

[ Upstream commit c82c507126c9c9db350be28f14c83fad1c7969ae ]

ACPICA commit 129b75516fc49fe1fd6b8c5798f86c13854630b3

Stop nagging user about every Stall() that violates the spec

On my Dell XPS 15 7590 I get hundreds of these warnings after few hours of
uptime:

$ dmesg | grep "fix the firmware" | wc -l
261

I cannot fix the firmware and I doubt that Dell cares about 4 year old
laptop either

Fixes: ace8f1c54a02 ("ACPICA: executer/exsystem: Inform users about ACPI spec violation")
Link: https://github.com/acpica/acpica/commit/129b7551
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPICA: Implement ACPI_WARNING_ONCE and ACPI_ERROR_ONCE

[ Upstream commit 632b746b108e3c62e0795072d00ed597371c738a ]

ACPICA commit 2ad4e6e7c4118f4cdfcad321c930b836cec77406

In some cases it is not practical nor useful to nag user about some
firmware errors that they cannot fix. Add a macro that will print a
warning or error only once to be used in these cases.

Link: https://github.com/acpica/acpica/commit/2ad4e6e7
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: c82c507126c9 ("ACPICA: executer/exsystem: Don't nag user about every Stall() violating the spec")
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: Check for missing VHT elements only for 5 GHz

[ Upstream commit 67bb124cd9ae38870667e4f9c876ef8e0f82ec44 ]

Check for missing VHT Capabilities and VHT Operation elements in
association response frame only for 5 GHz links.

Fixes: 310c8387c638 ("wifi: mac80211: clean up connection process")
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240827103920.dd711282d543.Iaba245cebc52209b0499d5bab7d8a8ef1df9dd65@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: allow ESR when we the ROC expires

[ Upstream commit 76364f3edfde60aa2fa20b578ba9b96797d7bff5 ]

We forgot to release the ROC reason for ESR prevention when the remain
on channel expires.
Add this.

Fixes: a1efeb823084 ("wifi: iwlwifi: mvm: Block EMLSR when a p2p/softAP vif is active")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.8f8765f359cc.I16fcd6198072d422ff36dce68070aafaf011f4c1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: fix the comeback long retry times

[ Upstream commit 1524173a3745899612c71d9e83ff8fe29dbb2cfb ]

When we had a comeback, we will never use the default timeout values
again because comeback is never cleared.
Clear comeback if we send another association request which will allow
to start a default timer after Tx status.

The problem was seen with iwlwifi where the tx_status on the association
request is handled before the association response frame (which is the
usual case).

1) Tx assoc request 1/3
2) Rx assoc response (comeback, timeout = 1 second)
3) wait 1 second
4) Tx assoc request 2/3
5) Set timer to IEEE80211_ASSOC_TIMEOUT_LONG = 500ms (1 second after
round_up)
6) tx_status on frame sent in 4) is ignored because comeback is still
true
7) AP does not reply with assoc response
8) wait 1s <= This is where the bug is felt
9) Tx assoc request 3/3

With this fix, in step 6 we will reset the timer to
IEEE80211_ASSOC_TIMEOUT_SHORT = 100ms and we will wait only 100ms in
step 8.

Fixes: b133fdf07db8 ("wifi: mac80211: Skip association timeout update after comeback rejection")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://patch.msgid.link/20240808085916.23519-1-emmanuel.grumbach@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: fix bug of mapping AF3x to incorrect User Priority

[ Upstream commit a68b22e2905b04f376e2fa116be5e48b948f81c8 ]

According to RFC8325 4.3, Multimedia Streaming: AF31(011010, 26),
AF32(011100, 28), AF33(011110, 30) maps to User Priority = 4
and AC_VI (Video).

However, the original code remain the default three Most Significant
Bits (MSBs) of the DSCP, which makes AF3x map to User Priority = 3
and AC_BE (Best Effort).

Fixes: 6fdb8b8781d5 ("wifi: cfg80211: Update the default DSCP-to-UP mapping")
Signed-off-by: hhorace <hhoracehsu@gmail.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20240807082205.1369-1-hhoracehsu@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: increase the time between ranging measurements

[ Upstream commit 3a7ee94559dfd640604d0265739e86dec73b64e8 ]

The algo running in fw may take a little longer than 5 milliseconds,
(e.g. measurement on 80MHz while associated). Increase the minimum
time between measurements to 7 milliseconds.

Fixes: 830aa3e7d1ca ("iwlwifi: mvm: add support for range request command version 13")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240729201718.d3f3c26e00d9.I09e951290e8a3d73f147b88166fd9a678d1d69ed@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: config: label 'gl' devices as discrete

[ Upstream commit 8131dd52810dfcdb49fcdc78f5e18e1538b6c441 ]

The 'gl' devices are in the bz family, but they're not,
integrated, so should have their own trans config struct.
Fix that, also necessitating the removal of LTR config,
and while at it remove 0x2727 and 0x272D IDs that were
only used for test chips.

Fixes: c30a2a64788b ("wifi: iwlwifi: add a new PCI device ID for BZ device")ticket=none
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240729201718.95aed0620080.Ib9129512c95aa57acc9876bdff8b99dd41e1562c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: remove AX101, AX201 and AX203 support from LNL

[ Upstream commit 6adae0b081454393ca5f7363fd3c7379c8e2a7a1 ]

LNL is the codename for the upcoming Series 2 Core Ultra
processors designed by Intel. AX101, AX201 and AX203 devices
are not shiped on LNL platforms, so don't support them.

Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240613171043.f24a228dfd96.I989a2d3f1513211bc49ac8143ee4e9e341e1ee67@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Stable-dep-of: 8131dd52810d ("wifi: iwlwifi: config: label 'gl' devices as discrete")
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: don't use rate mask for offchannel TX either

[ Upstream commit e7a7ef9a0742dbd0818d5b15fba2c5313ace765b ]

Like the commit ab9177d83c04 ("wifi: mac80211: don't use rate mask for
scanning"), ignore incorrect settings to avoid no supported rate warning
reported by syzbot.

The syzbot did bisect and found cause is commit 9df66d5b9f45 ("cfg80211:
fix default HE tx bitrate mask in 2G band"), which however corrects
bitmask of HE MCS and recognizes correctly settings of empty legacy rate
plus HE MCS rate instead of returning -EINVAL.

As suggestions [1], follow the change of SCAN TX to consider this case of
offchannel TX as well.

[1] https://lore.kernel.org/linux-wireless/6ab2dc9c3afe753ca6fdcdd1421e7a1f47e87b84.camel@sipsolutions.net/T/#m2ac2a6d2be06a37c9c47a3d8a44b4f647ed4f024

Reported-by: syzbot+8dd98a9e98ee28dc484a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-wireless/000000000000fdef8706191a3f7b@google.com/
Fixes: 9df66d5b9f45 ("cfg80211: fix default HE tx bitrate mask in 2G band")
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240729074816.20323-1-pkshih@realtek.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPI: video: force native for Apple MacbookPro9,2

[ Upstream commit 7dc918daaf2994963690171584ba423f28724df5 ]

It used to be that the MacbookPro9,2 used its native intel backlight
device until the following commit was introduced:

commit b1d36e73cc1c ("drm/i915: Don't register backlight when another
backlight should be used (v2)")

This commit forced this model to use its firmware acpi_video backlight
device instead.

That worked fine until an additional commit was added:

commit 92714006eb4d ("drm/i915/backlight: Do not bump min brightness
to max on enable")

That commit uncovered a bug in the MacbookPro 9,2's acpi_video
backlight firmware; the backlight does not come back up after resume.

Add DMI quirk to select the working native intel interface instead
so that the backlight successfully comes back up after resume.

Fixes: 92714006eb4d ("drm/i915/backlight: Do not bump min brightness to max on enable")
Signed-off-by: Esther Shimanovich <eshimanovich@chromium.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patch.msgid.link/20240806-acpi-video-quirk-v1-1-369d8f7abc59@chromium.org
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPI: video: force native for some T2 macbooks

[ Upstream commit d010a0282e045f02895f88299e5442506585b46c ]

The intel backlight is needed for these, previously users had nothing in
/sys/class/backlight.

Signed-off-by: Orlando Chamberlain <orlandoch.dev@gmail.com>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patch.msgid.link/3DA0EAE3-9EB7-492B-96FC-988503BBDCCC@live.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 7dc918daaf29 ("ACPI: video: force native for Apple MacbookPro9,2")
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: qat - fix "Full Going True" macro definition

[ Upstream commit 694a6f594817462942acbb1a35b1f7d61e2d49e7 ]

The macro `ADF_RP_INT_SRC_SEL_F_RISE_MASK` is currently set to the value
`0100b` which means "Empty Going False". This might cause an incorrect
restore of the bank state during live migration.

Fix the definition of the macro to properly represent the "Full Going
True" state which is encoded as `0011b`.

Fixes: bbfdde7d195f ("crypto: qat - add bank save and restore flows")
Signed-off-by: Svyatoslav Pankratov <svyatoslav.pankratov@intel.com>
Reviewed-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf/dwc_pcie: Always register for PCIe bus notifier

[ Upstream commit b94b05478fb6a09033bf70c6edd03f8930a0fe24 ]

When the PCIe devices are discovered late, the driver can't find
the PCIe devices and returns in the init without registering with
the bus notifier. Due to that the devices which are discovered late
the driver can't register for this.

Register for bus notifier & driver even if the device is not found
as part of init.

Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver")
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240816-dwc_pmu_fix-v2-3-198b8ab1077c@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf/dwc_pcie: Fix registration issue in multi PCIe controller instances

[ Upstream commit e669388537c472142804eb5a0449cc23d5409694 ]

When there are multiple of instances of PCIe controllers, registration
to perf driver fails with this error.
sysfs: cannot create duplicate filename '/devices/platform/dwc_pcie_pmu.0'
CPU: 0 PID: 166 Comm: modprobe Not tainted 6.10.0-rc2-next-20240607-dirty
Hardware name: Qualcomm SA8775P Ride (DT)
Call trace:
dump_backtrace.part.8+0x98/0xf0
show_stack+0x14/0x1c
dump_stack_lvl+0x74/0x88
dump_stack+0x14/0x1c
sysfs_warn_dup+0x60/0x78
sysfs_create_dir_ns+0xe8/0x100
kobject_add_internal+0x94/0x224
kobject_add+0xa8/0x118
device_add+0x298/0x7b4
platform_device_add+0x1a0/0x228
platform_device_register_full+0x11c/0x148
dwc_pcie_register_dev+0x74/0xf0 [dwc_pcie_pmu]
dwc_pcie_pmu_init+0x7c/0x1000 [dwc_pcie_pmu]
do_one_initcall+0x58/0x1c0
do_init_module+0x58/0x208
load_module+0x1804/0x188c
__do_sys_init_module+0x18c/0x1f0
__arm64_sys_init_module+0x14/0x1c
invoke_syscall+0x40/0xf8
el0_svc_common.constprop.1+0x70/0xf4
do_el0_svc+0x18/0x20
el0_svc+0x28/0xb0
el0t_64_sync_handler+0x9c/0xc0
el0t_64_sync+0x160/0x164
kobject: kobject_add_internal failed for dwc_pcie_pmu.0 with -EEXIST,
don't try to register things with the same name in the same directory.

This is because of having same bdf value for devices under two different
controllers.

Update the logic to use sbdf which is a unique number in case of
multi instance also.

Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver")
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240816-dwc_pmu_fix-v2-1-198b8ab1077c@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

drivers/perf: Fix ali_drw_pmu driver interrupt status clearing

[ Upstream commit a3dd920977dccc453c550260c4b7605b280b79c3 ]

The alibaba_uncore_pmu driver forgot to clear all interrupt status
in the interrupt processing function. After the PMU counter overflow
interrupt occurred, an interrupt storm occurred, causing the system
to hang.

Therefore, clear the correct interrupt status in the interrupt handling
function to fix it.

Fixes: cf7b61073e45 ("drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC")
Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1724297611-20686-1-git-send-email-renyu.zj@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: core: Fix rounding of delay jiffies

[ Upstream commit 8144dbe68c493baa412d78f41a57e90a6461f6c3 ]

Using round_jiffies() in thermal_set_delay_jiffies() is invalid because
its argument should be time in the future in absolute jiffies and it
computes the result with respect to the current jiffies value at the
invocation time. Fortunately, in the majority of cases it does not
make any difference due to the time_is_after_jiffies() check in
round_jiffies_common().

While using round_jiffies_relative() instead of round_jiffies() might
reflect the intent a bit better, it still would not be defensible
because that function should be called when the timer is about to be
set and it is not suitable for pre-computation of delay values.

Accordingly, drop thermal_set_delay_jiffies() altogether, simply
convert polling_delay and passive_delay to jiffies during thermal
zone initialization and make thermal_zone_device_set_polling() call
round_jiffies_relative() on the delay if it is greather than 1 second.

Fixes: 17d399cd9c89 ("thermal/core: Precompute the delays from msecs to jiffies")
Fixes: e5f2cda61d06 ("thermal/core: Move thermal_set_delay_jiffies to static")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/1994438.PYKUYFuaPT@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: core: Fold two functions into their respective callers

[ Upstream commit a8bbe6f10f78f85243ff821432c5d798a6d99ed4 ]

Fold bind_cdev() into __thermal_cooling_device_register() and bind_tz()
into thermal_zone_device_register_with_trips() to reduce code bloat and
make it somewhat easier to follow the code flow.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/2962184.e9J7NaK4W3@rjwysocki.net
Stable-dep-of: 8144dbe68c49 ("thermal: core: Fix rounding of delay jiffies")
Signed-off-by: Sasha Levin <sashal@kernel.org>

arm64: smp: smp_send_stop() and crash_smp_send_stop() should try non-NMI first

[ Upstream commit fdfa588124b6356cd08e5d3f0c3643c4ec3d6887 ]

When testing hard lockup handling on my sc7180-trogdor-lazor device
with pseudo-NMI enabled, with serial console enabled and with kgdb
disabled, I found that the stack crawls printed to the serial console
ended up as a jumbled mess. After rebooting, the pstore-based console
looked fine though. Also, enabling kgdb to trap the panic made the
console look fine and avoided the mess.

After a bit of tracking down, I came to the conclusion that this was
what was happening:
1. The panic path was stopping all other CPUs with
   panic_other_cpus_shutdown().
2. At least one of those other CPUs was in the middle of printing to
   the serial console and holding the console port's lock, which is
   grabbed with "irqsave". ...but since we were stopping with an NMI
   we didn't care about the "irqsave" and interrupted anyway.
3. Since we stopped the CPU while it was holding the lock it would
   never release it.
4. All future calls to output to the console would end up failing to
   get the lock in qcom_geni_serial_console_write(). This isn't
   _totally_ unexpected at panic time but it's a code path that's not
   well tested, hard to get right, and apparently doesn't work
   terribly well on the Qualcomm geni serial driver.

The Qualcomm geni serial driver was fixed to be a bit better in commit
9e957a155005 ("serial: qcom-geni: Don't cancel/abort if we can't get
the port lock") but it's nice not to get into this situation in the
first place.

Taking a page from what x86 appears to do in native_stop_other_cpus(),
do this:
1. First, try to stop other CPUs with a normal IPI and wait a second.
   This gives them a chance to leave critical sections.
2. If CPUs fail to stop then retry with an NMI, but give a much lower
   timeout since there's no good reason for a CPU not to react quickly
   to a NMI.

This works well and avoids the corrupted console and (presumably)
could help avoid other similar issues.

In order to do this, we need to do a little re-organization of our
IPIs since we don't have any more free IDs. Do what was suggested in
previous conversations and combine "stop" and "crash stop". That frees
up an IPI so now we can have a "stop" and "stop NMI".

In order to do this we also need a slight change in the way we keep
track of which CPUs still need to be stopped. We need to know
specifically which CPUs haven't stopped yet when we fall back to NMI
but in the "crash stop" case the "cpu_online_mask" isn't updated as
CPUs go down. This is why that code path had an atomic of the number
of CPUs left. Solve this by also updating the "cpu_online_mask" for
crash stops.

All of the above lets us combine the logic for "stop" and "crash stop"
code, which appeared to have a bunch of arbitrary implementation
differences.

Aside from the above change where we try a normal IPI and then an NMI,
the combined function has a few subtle differences:
* In the normal smp_send_stop(), if we fail to stop one or more CPUs
  then we won't include the current CPU (the one running
  smp_send_stop()) in the error message.
* In crash_smp_send_stop(), if we fail to stop some CPUs we'll print
  the CPUs that we failed to stop instead of printing all _but_ the
  current running CPU.
* In crash_smp_send_stop(), we will now only print "SMP: stopping
  secondary CPUs" if (system_state <= SYSTEM_RUNNING).

Fixes: d7402513c935 ("arm64: smp: IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20240821145353.v3.1.Id4817adef610302554b8aa42b090d57270dc119c@changeid
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

kselftest/arm64: signal: fix/refactor SVE vector length enumeration

[ Upstream commit 5225b6562b9a7dc808d5a1e465aaf5e2ebb220cd ]

Currently a number of SVE/SME related tests have almost identical
functions to enumerate all supported vector lengths. However over time
the copy&pasted code has diverged, allowing some bugs to creep in:
- fake_sigreturn_sme_change_vl reports a failure, not a SKIP if only
  one vector length is supported (but the SVE version is fine)
- fake_sigreturn_sme_change_vl tries to set the SVE vector length, not
  the SME one (but the other SME tests are fine)
- za_no_regs keeps iterating forever if only one vector length is
  supported (but za_regs is correct)

Since those bugs seem to be mostly copy&paste ones, let's consolidate
the enumeration loop into one shared function, and just call that from
each test. That should fix the above bugs, and prevent similar issues
from happening again.

Fixes: 4963aeb35a9e ("kselftest/arm64: signal: Add SME signal handling tests")
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240821164401.3598545-1-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

powercap: intel_rapl: Fix off by one in get_rpi()

[ Upstream commit 95f6580352a7225e619551febb83595bcb77ab17 ]

The rp->priv->rpi array is either rpi_msr or rpi_tpmi which have
NR_RAPL_PRIMITIVES number of elements. Thus the > needs to be >=
to prevent an off by one access.

Fixes: 98ff639a7289 ("powercap: intel_rapl: Support per Interface primitive information")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://patch.msgid.link/86e3a059-504d-4795-a5ea-4a653f3b41f8@stanley.mountain
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ARM: 9410/1: vfp: Use asm volatile in fmrx/fmxr macros

[ Upstream commit 89a906dfa8c3b21b3e5360f73c49234ac1eb885b ]

Floating point instructions in userspace can crash some arm kernels
built with clang/LLD 17.0.6:

    BUG: unsupported FP instruction in kernel mode
    FPEXC == 0xc0000780
    Internal error: Oops - undefined instruction: 0 [#1] ARM
    CPU: 0 PID: 196 Comm: vfp-reproducer Not tainted 6.10.0 #1
    Hardware name: BCM2835
    PC is at vfp_support_entry+0xc8/0x2cc
    LR is at do_undefinstr+0xa8/0x250
    pc : [<c0101d50>]    lr : [<c010a80c>]    psr: a0000013
    sp : dc8d1f68  ip : 60000013  fp : bedea19c
    r10: ec532b17  r9 : 00000010  r8 : 0044766c
    r7 : c0000780  r6 : ec532b17  r5 : c1c13800  r4 : dc8d1fb0
    r3 : c10072c4  r2 : c0101c88  r1 : ec532b17  r0 : 0044766c
    Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
    Control: 00c5387d  Table: 0251c008  DAC: 00000051
    Register r0 information: non-paged memory
    Register r1 information: vmalloc memory
    Register r2 information: non-slab/vmalloc memory
    Register r3 information: non-slab/vmalloc memory
    Register r4 information: 2-page vmalloc region
    Register r5 information: slab kmalloc-cg-2k
    Register r6 information: vmalloc memory
    Register r7 information: non-slab/vmalloc memory
    Register r8 information: non-paged memory
    Register r9 information: zero-size pointer
    Register r10 information: vmalloc memory
    Register r11 information: non-paged memory
    Register r12 information: non-paged memory
    Process vfp-reproducer (pid: 196, stack limit = 0x61aaaf8b)
    Stack: (0xdc8d1f68 to 0xdc8d2000)
    1f60:                   0000081f b6f69300 0000000f c10073f4 c10072c4 dc8d1fb0
    1f80: ec532b17 0c532b17 0044766c b6f9ccd8 00000000 c010a80c 00447670 60000010
    1fa0: ffffffff c1c13800 00c5387d c0100f10 b6f68af8 00448fc0 00000000 bedea188
    1fc0: bedea314 00000001 00448ebc b6f9d000 00447608 b6f9ccd8 00000000 bedea19c
    1fe0: bede9198 bedea188 b6e1061c 0044766c 60000010 ffffffff 00000000 00000000
    Call trace:
    [<c0101d50>] (vfp_support_entry) from [<c010a80c>] (do_undefinstr+0xa8/0x250)
    [<c010a80c>] (do_undefinstr) from [<c0100f10>] (__und_usr+0x70/0x80)
    Exception stack(0xdc8d1fb0 to 0xdc8d1ff8)
    1fa0:                                     b6f68af8 00448fc0 00000000 bedea188
    1fc0: bedea314 00000001 00448ebc b6f9d000 00447608 b6f9ccd8 00000000 bedea19c
    1fe0: bede9198 bedea188 b6e1061c 0044766c 60000010 ffffffff
    Code: 0a000061 e3877202 e594003c e3a09010 (eef16a10)
    ---[ end trace 0000000000000000 ]---
    Kernel panic - not syncing: Fatal exception in interrupt
    ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

This is a minimal userspace reproducer on a Raspberry Pi Zero W:

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
            double v = 1.0;
            printf("%fn", NAN + *(volatile double *)&v);
            return 0;
    }

Another way to consistently trigger the oops is:

    calvin@raspberry-pi-zero-w ~$ python -c "import json"

The bug reproduces only when the kernel is built with DYNAMIC_DEBUG=n,
because the pr_debug() calls act as barriers even when not activated.

This is the output from the same kernel source built with the same
compiler and DYNAMIC_DEBUG=y, where the userspace reproducer works as
expected:

    VFP: bounce: trigger ec532b17 fpexc c0000780
    VFP: emulate: INST=0xee377b06 SCR=0x00000000
    VFP: bounce: trigger eef1fa10 fpexc c0000780
    VFP: emulate: INST=0xeeb40b40 SCR=0x00000000
    VFP: raising exceptions 30000000

    calvin@raspberry-pi-zero-w ~$ ./vfp-reproducer
    nan

Crudely grepping for vmsr/vmrs instructions in the otherwise nearly
idential text for vfp_support_entry() makes the problem obvious:

    vmlinux.llvm.good [0xc0101cb8] <+48>:  vmrs   r7, fpexc
    vmlinux.llvm.good [0xc0101cd8] <+80>:  vmsr   fpexc, r0
    vmlinux.llvm.good [0xc0101d20] <+152>: vmsr   fpexc, r7
    vmlinux.llvm.good [0xc0101d38] <+176>: vmrs   r4, fpexc
    vmlinux.llvm.good [0xc0101d6c] <+228>: vmrs   r0, fpscr
    vmlinux.llvm.good [0xc0101dc4] <+316>: vmsr   fpexc, r0
    vmlinux.llvm.good [0xc0101dc8] <+320>: vmrs   r0, fpsid
    vmlinux.llvm.good [0xc0101dcc] <+324>: vmrs   r6, fpscr
    vmlinux.llvm.good [0xc0101e10] <+392>: vmrs   r10, fpinst
    vmlinux.llvm.good [0xc0101eb8] <+560>: vmrs   r10, fpinst2

    vmlinux.llvm.bad  [0xc0101cb8] <+48>:  vmrs   r7, fpexc
    vmlinux.llvm.bad  [0xc0101cd8] <+80>:  vmsr   fpexc, r0
    vmlinux.llvm.bad  [0xc0101d20] <+152>: vmsr   fpexc, r7
    vmlinux.llvm.bad  [0xc0101d30] <+168>: vmrs   r0, fpscr
    vmlinux.llvm.bad  [0xc0101d50] <+200>: vmrs   r6, fpscr  <== BOOM!
    vmlinux.llvm.bad  [0xc0101d6c] <+228>: vmsr   fpexc, r0
    vmlinux.llvm.bad  [0xc0101d70] <+232>: vmrs   r0, fpsid
    vmlinux.llvm.bad  [0xc0101da4] <+284>: vmrs   r10, fpinst
    vmlinux.llvm.bad  [0xc0101df8] <+368>: vmrs   r4, fpexc
    vmlinux.llvm.bad  [0xc0101e5c] <+468>: vmrs   r10, fpinst2

I think LLVM's reordering is valid as the code is currently written: the
compiler doesn't know the instructions have side effects in hardware.

Fix by using "asm volatile" in fmxr() and fmrx(), so they cannot be
reordered with respect to each other. The original compiler now produces
working kernels on my hardware with DYNAMIC_DEBUG=n.

This is the relevant piece of the diff of the vfp_support_entry() text,
from the original oopsing kernel to a working kernel with this patch:

         vmrs r0, fpscr
         tst r0, #4096
         bne 0xc0101d48
         tst r0, #458752
         beq 0xc0101ecc
         orr r7, r7, #536870912
         ldr r0, [r4, #0x3c]
         mov r9, #16
        -vmrs r6, fpscr
         orr r9, r9, #251658240
         add r0, r0, #4
         str r0, [r4, #0x3c]
         mvn r0, #159
         sub r0, r0, #-1207959552
         and r0, r7, r0
         vmsr fpexc, r0
         vmrs r0, fpsid
        +vmrs r6, fpscr
         and r0, r0, #983040
         cmp r0, #65536
         bne 0xc0101d88

Fixes: 4708fb041346 ("ARM: vfp: Reimplement VFP exception entry in C code")
Signed-off-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>

autofs: fix missing fput for FSCONFIG_SET_FD

[ Upstream commit 6a64c5220c5df235448b846aeff3c0660d4cc83e ]

If you pass an fd using FSCONFIG_SET_FD, autofs_parse_fd() "steals" the
param->file and so the fs_context infrastructure will not do fput() for
us.

Fixes: e6ec453bd0f0 ("autofs: convert autofs to use the new mount api")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Link: https://lore.kernel.org/r/20240731-fsconfig-fsparam_fd-fixes-v2-1-e7c472224417@cyphar.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mount: handle OOM on mnt_warn_timestamp_expiry

[ Upstream commit 4bcda1eaf184e308f07f9c61d3a535f9ce477ce8 ]

If no page could be allocated, an error pointer was used as format
string in pr_warn.

Rearrange the code to return early in case of OOM. Also add a check
for the return value of d_path.

Fixes: f8b92ba67c5d ("mount: Add mount warning for impending timestamp expiry")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Link: https://lore.kernel.org/r/20240730085856.32385-1-olaf@aepfle.de
[brauner: rewrite commit and commit message]
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RISC-V: KVM: Fix to allow hpmcounter31 from the guest

[ Upstream commit 5aa09297a3dcc798d038bd7436f8c90f664045a6 ]

The csr_fun defines a count parameter which defines the total number
CSRs emulated in KVM starting from the base. This value should be
equal to total number of counters possible for trap/emulation (32).

Fixes: a9ac6c37521f ("RISC-V: KVM: Implement trap & emulate for hpmcounters")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240816-kvm_pmu_fixes-v1-2-cdfce386dd93@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RISC-V: KVM: Allow legacy PMU access from guest

[ Upstream commit 7d1ffc8b087e97dbe1985912c7a2d00e53cea169 ]

Currently, KVM traps & emulates PMU counter access only if SBI PMU
is available as the guest can only configure/read PMU counters via
SBI only. However, if SBI PMU is not enabled in the host, the
guest will fallback to the legacy PMU which will try to access
cycle/instret and result in an illegal instruction trap which
is not desired.

KVM can allow dummy emulation of cycle/instret only for the guest
if SBI PMU is not enabled in the host. The dummy emulation will
still return zero as we don't to expose the host counter values
from a guest using legacy PMU.

Fixes: a9ac6c37521f ("RISC-V: KVM: Implement trap & emulate for hpmcounters")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240816-kvm_pmu_fixes-v1-1-cdfce386dd93@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RISC-V: KVM: Don't zero-out PMU snapshot area before freeing data

[ Upstream commit 47d40d93292d9cff8dabb735bed83d930fa03950 ]

With the latest Linux-6.11-rc3, the below NULL pointer crash is observed
when SBI PMU snapshot is enabled for the guest and the guest is forcefully
powered-off.

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000508
  Oops [#1]
  Modules linked in: kvm
  CPU: 0 UID: 0 PID: 61 Comm: term-poll Not tainted 6.11.0-rc3-00018-g44d7178dd77a #3
  Hardware name: riscv-virtio,qemu (DT)
  epc : __kvm_write_guest_page+0x94/0xa6 [kvm]
   ra : __kvm_write_guest_page+0x54/0xa6 [kvm]
  epc : ffffffff01590e98 ra : ffffffff01590e58 sp : ffff8f80001f39b0
   gp : ffffffff81512a60 tp : ffffaf80024872c0 t0 : ffffaf800247e000
   t1 : 00000000000007e0 t2 : 0000000000000000 s0 : ffff8f80001f39f0
   s1 : 00007fff89ac4000 a0 : ffffffff015dd7e8 a1 : 0000000000000086
   a2 : 0000000000000000 a3 : ffffaf8000000000 a4 : ffffaf80024882c0
   a5 : 0000000000000000 a6 : ffffaf800328d780 a7 : 00000000000001cc
   s2 : ffffaf800197bd00 s3 : 00000000000828c4 s4 : ffffaf800248c000
   s5 : ffffaf800247d000 s6 : 0000000000001000 s7 : 0000000000001000
   s8 : 0000000000000000 s9 : 00007fff861fd500 s10: 0000000000000001
   s11: 0000000000800000 t3 : 00000000000004d3 t4 : 00000000000004d3
   t5 : ffffffff814126e0 t6 : ffffffff81412700
  status: 0000000200000120 badaddr: 0000000000000508 cause: 000000000000000d
  [<ffffffff01590e98>] __kvm_write_guest_page+0x94/0xa6 [kvm]
  [<ffffffff015943a6>] kvm_vcpu_write_guest+0x56/0x90 [kvm]
  [<ffffffff015a175c>] kvm_pmu_clear_snapshot_area+0x42/0x7e [kvm]
  [<ffffffff015a1972>] kvm_riscv_vcpu_pmu_deinit.part.0+0xe0/0x14e [kvm]
  [<ffffffff015a2ad0>] kvm_riscv_vcpu_pmu_deinit+0x1a/0x24 [kvm]
  [<ffffffff0159b344>] kvm_arch_vcpu_destroy+0x28/0x4c [kvm]
  [<ffffffff0158e420>] kvm_destroy_vcpus+0x5a/0xda [kvm]
  [<ffffffff0159930c>] kvm_arch_destroy_vm+0x14/0x28 [kvm]
  [<ffffffff01593260>] kvm_destroy_vm+0x168/0x2a0 [kvm]
  [<ffffffff015933d4>] kvm_put_kvm+0x3c/0x58 [kvm]
  [<ffffffff01593412>] kvm_vm_release+0x22/0x2e [kvm]

Clearly, the kvm_vcpu_write_guest() function is crashing because it is
being called from kvm_pmu_clear_snapshot_area() upon guest tear down.

To address the above issue, simplify the kvm_pmu_clear_snapshot_area() to
not zero-out PMU snapshot area from kvm_pmu_clear_snapshot_area() because
the guest is anyway being tore down.

The kvm_pmu_clear_snapshot_area() is also called when guest changes
PMU snapshot area of a VCPU but even in this case the previous PMU
snaphsot area must not be zeroed-out because the guest might have
reclaimed the pervious PMU snapshot area for some other purpose.

Fixes: c2f41ddbcdd7 ("RISC-V: KVM: Implement SBI PMU Snapshot feature")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20240815170907.2792229-1-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RISC-V: KVM: Fix sbiret init before forwarding to userspace

[ Upstream commit 6b7b282e6baea06ba65b55ae7d38326ceb79cebf ]

When forwarding SBI calls to userspace ensure sbiret.error is
initialized to SBI_ERR_NOT_SUPPORTED first, in case userspace
neglects to set it to anything. If userspace neglects it then we
can't be sure it did anything else either, so we just report it
didn't do or try anything. Just init sbiret.value to zero, which is
the preferred value to return when nothing special is specified.

KVM was already initializing both sbiret.error and sbiret.value, but
the values used appear to come from a copy+paste of the __sbi_ecall()
implementation, i.e. a0 and a1, which don't apply prior to the call
being executed, nor at all when forwarding to userspace.

Fixes: dea8ee31a039 ("RISC-V: KVM: Add SBI v0.1 support")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240807154943.150540-2-ajones@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: rtw88: remove CPT execution branch never used

[ Upstream commit 77c977327dfaa9ae2e154964cdb89ceb5c7b7cf1 ]

In 'rtw_coex_action_bt_a2dp_pan', 'wl_cpt_test' and 'bt_cpt_test' are
hardcoded to false, so corresponding 'table_case' and 'tdma_case'
assignments are never met.
Also 'rtw_coex_set_rf_para(rtwdev, chip->wl_rf_para_rx[1])' is never
executed. Assuming that CPT was never fully implemented, remove
lookalike leftovers. Compile tested only.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 76f631cb401f ("rtw88: coex: update the mechanism for A2DP + PAN")
Signed-off-by: Dmitry Kandybka <d.kandybka@gmail.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240809085310.10512-1-d.kandybka@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: rtw89: remove unused C2H event ID RTW89_MAC_C2H_FUNC_READ_WOW_CAM to prevent out-of-bounds reading

[ Upstream commit 56310ddb50b190b3390fdc974aec455d0a516bd2 ]

The handler of firmware C2H event RTW89_MAC_C2H_FUNC_READ_WOW_CAM isn't
implemented, but driver expects number of handlers is
NUM_OF_RTW89_MAC_C2H_FUNC_WOW causing out-of-bounds access. Fix it by
removing ID.

Addresses-Coverity-ID: 1598775 ("Out-of-bounds read")

Fixes: ff53fce5c78b ("wifi: rtw89: wow: update latest PTK GTK info to mac80211 after resume")
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240809072012.84152-4-pkshih@realtek.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

arm64: signal: Fix some under-bracketed UAPI macros

[ Upstream commit fc2220c9b15828319b09384e68399b4afc6276d9 ]

A few SME-related sigcontext UAPI macros leave an argument
unprotected from misparsing during macro expansion.

Add parentheses around references to macro arguments where
appropriate.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Fixes: ee072cf70804 ("arm64/sme: Implement signal handling for ZT")
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240729152005.289844-1-Dave.Martin@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: dwmac-loongson: Init ref and PTP clocks rate

[ Upstream commit c70f3163681381c15686bdd2fe56bf4af9b8aaaa ]

Reference and PTP clocks rate of the Loongson GMAC devices is 125MHz.
(So is in the GNET devices which support is about to be added.) Set
the respective plat_stmmacenet_data field up in accordance with that
so to have the coalesce command and timestamping work correctly.

Fixes: 30bba69d7db4 ("stmmac: pci: Add dwmac support for Loongson")
Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: ath12k: fix invalid AMPDU factor calculation in ath12k_peer_assoc_h_he()

[ Upstream commit a66de2d0f22b1740f3f9777776ad98c4bee62dff ]

Currently ampdu_factor is wrongly calculated in ath12k_peer_assoc_h_he(), fix it.

This is found during code review.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4

Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240710021819.87216-1-quic_bqiang@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: ath12k: match WMI BSS chan info structure with firmware definition

[ Upstream commit dd98d54db29fb553839f43ade5f547baa93392c8 ]

struct wmi_pdev_bss_chan_info_event is not similar to the firmware
struct definition, this will cause some random failures.

Fix by matching the struct wmi_pdev_bss_chan_info_event with the
firmware structure definition.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: P Praneesh <quic_ppranees@quicinc.com>
Signed-off-by: Karthikeyan Kathirvel <quic_kathirve@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240331183232.2158756-3-quic_kathirve@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: ath12k: fix BSS chan info request WMI command

[ Upstream commit 59529c982f85047650fd473db903b23006a796c6 ]

Currently, the firmware returns incorrect pdev_id information in
WMI_PDEV_BSS_CHAN_INFO_EVENTID, leading to incorrect filling of
the pdev's survey information.

To prevent this issue, when requesting BSS channel information
through WMI_PDEV_BSS_CHAN_INFO_REQUEST_CMDID, firmware expects
pdev_id as one of the arguments in this WMI command.

Add pdev_id to the struct wmi_pdev_bss_chan_info_req_cmd and fill it
during ath12k_wmi_pdev_bss_chan_info_request(). This resolves the
issue of sending the correct pdev_id in WMI_PDEV_BSS_CHAN_INFO_EVENTID.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: P Praneesh <quic_ppranees@quicinc.com>
Signed-off-by: Karthikeyan Kathirvel <quic_kathirve@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240331183232.2158756-2-quic_kathirve@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: ath9k: Remove error checks when creating debugfs entries

[ Upstream commit f6ffe7f0184792c2f99aca6ae5b916683973d7d3 ]

We should not be checking the return values from debugfs creation at all: the
debugfs functions are designed to handle errors of previously called functions
and just transparently abort the creation of debugfs entries when debugfs is
disabled. If we check the return value and abort driver initialisation, we break
the driver if debugfs is disabled (such as when booting with debugfs=off).

Earlier versions of ath9k accidentally did the right thing by checking the
return value, but only for NULL, not for IS_ERR(). This was "fixed" by the two
commits referenced below, breaking ath9k with debugfs=off starting from the 6.6
kernel (as reported in the Bugzilla linked below).

Restore functionality by just getting rid of the return value check entirely.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219122
Fixes: 1e4134610d93 ("wifi: ath9k: use IS_ERR() with debugfs_create_dir()")
Fixes: 6edb4ba6fb5b ("wifi: ath9k: fix parameter check in ath9k_init_debug()")
Reported-by: Daniel Tobias <dan.g.tob@gmail.com>
Tested-by: Daniel Tobias <dan.g.tob@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240805110225.19690-1-toke@toke.dk
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: brcmfmac: introducing fwil query functions

[ Upstream commit c6002b6c05f3edfa12fd25990cc637281f200442 ]

When the firmware interface layer was refactored it provided various
"get" and "set" functions. For the "get" in some cases a parameter
needed to be passed down to firmware as a key indicating what to
"get" turning the output parameter of the "get" function into an
input parameter as well. To accommodate this the "get" function blindly
copies the parameter which in some places resulted in an uninitialized
warnings from the compiler. These have been fixed by initializing the
input parameter in the past. Recently another batch of similar fixes
were submitted to address clang static checker warnings [1].

Proposing another solution by introducing a "query" variant which is used
when the (input) parameter is needed by firmware. The "get" variant will
only fill the (output) parameter with the result received from firmware
taking care of proper endianess conversion.

[1] https://lore.kernel.org/all/20240702122450.2213833-1-suhui@nfschina.com/

Fixes: 81f5dcb80830 ("brcmfmac: refactor firmware interface layer.")
Reported-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20240727185617.253210-1-arend.vanspriel@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPI: PMIC: Remove unneeded check in tps68470_pmic_opregion_probe()

[ Upstream commit 07442c46abad1d50ac82af5e0f9c5de2732c4592 ]

In tps68470_pmic_opregion_probe() pointer 'dev' is compared to NULL which
is useless.

Fix this issue by removing unneeded check.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: e13452ac3790 ("ACPI / PMIC: Add TI PMIC TPS68470 operation region driver")
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://patch.msgid.link/20240730225339.13165-1-amishin@t-argos.ru
[ rjw: Subject edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: iaa - Fix potential use after free bug

[ Upstream commit e0d3b845a1b10b7b5abdad7ecc69d45b2aab3209 ]

The free_device_compression_mode(iaa_device, device_mode) function frees
"device_mode" but it iss passed to iaa_compression_modes[i]->free() a few
lines later resulting in a use after free.

The good news is that, so far as I can tell, nothing implements the
->free() function and the use after free happens in dead code. But, with
this fix, when something does implement it, we'll be ready. :)

Fixes: b190447e0fa3 ("crypto: iaa - Add compression mode management along with fixed mode")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: qat - ensure correct order in VF restarting handler

[ Upstream commit cd8d2d74292c199b433ef77762bb1d28a4821784 ]

In the process of sending the ADF_PF2VF_MSGTYPE_RESTARTING message to
Virtual Functions (VFs), the Physical Function (PF) should set the
`vf->restarting` flag to true before dispatching the message.
This change is necessary to prevent a race condition where the handling
of the ADF_VF2PF_MSGTYPE_RESTARTING_COMPLETE message (which sets the
`vf->restarting` flag to false) runs immediately after the message is sent,
but before the flag is set to true.

Set the `vf->restarting` to true before sending the message
ADF_PF2VF_MSGTYPE_RESTARTING, if supported by the version of the
protocol and if the VF is started.

Fixes: ec26f8e6c784 ("crypto: qat - update PFVF protocol for recovery")
Signed-off-by: Michal Witwicki <michal.witwicki@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: qat - fix recovery flow for VFs

[ Upstream commit 6f1b5236348fced7e7691a933327694b4106bc39 ]

When the PFVF protocol was updated to support version 5, i.e.
ADF_PFVF_COMPAT_FALLBACK, the compatibility version for the VF was
updated without supporting the message RESTARTING_COMPLETE required for
such version.

Add support for the ADF_VF2PF_MSGTYPE_RESTARTING_COMPLETE message in the
VF drivers. This message is sent by the VF driver to the PF to notify
the completion of the shutdown flow.

Fixes: ec26f8e6c784 ("crypto: qat - update PFVF protocol for recovery")
Signed-off-by: Michal Witwicki <michal.witwicki@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: qat - disable IOV in adf_dev_stop()

[ Upstream commit b6c7d36292d50627dbe6a57fa344f87c776971e6 ]

Disabling IOV has the side effect of re-enabling the AEs that might
attempt to do DMAs into the heartbeat buffers.
Move the disable_iov() function in adf_dev_stop() before the AEs are
stopped.

Fixes: ed8ccaef52fa ("crypto: qat - Add support for SRIOV")
Signed-off-by: Michal Witwicki <michal.witwicki@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: xor - fix template benchmarking

[ Upstream commit ab9a244c396aae4aaa34b2399b82fc15ec2df8c1 ]

Commit c055e3eae0f1 ("crypto: xor - use ktime for template benchmarking")
switched from using jiffies to ktime-based performance benchmarking.

This works nicely on machines which have a fine-grained ktime()
clocksource as e.g. x86 machines with TSC.
But other machines, e.g. my 4-way HP PARISC server, don't have such
fine-grained clocksources, which is why it seems that 800 xor loops
take zero seconds, which then shows up in the logs as:

xor: measuring software checksum speed
    8regs           : -1018167296 MB/sec
    8regs_prefetch  : -1018167296 MB/sec
    32regs          : -1018167296 MB/sec
    32regs_prefetch : -1018167296 MB/sec

Fix this with some small modifications to the existing code to improve
the algorithm to always produce correct results without introducing
major delays for architectures with a fine-grained ktime()
clocksource:
a) Delay start of the timing until ktime() just advanced. On machines
with a fast ktime() this should be just one additional ktime() call.
b) Count the number of loops. Run at minimum 800 loops and finish
earliest when the ktime() counter has progressed.

With that the throughput can now be calculated more accurately under all
conditions.

Fixes: c055e3eae0f1 ("crypto: xor - use ktime for template benchmarking")
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: John David Anglin <dave.anglin@bell.net>
v2:
- clean up coding style (noticed & suggested by Herbert Xu)
- rephrased & fixed typo in commit message

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: rtw88: always wait for both firmware loading attempts

[ Upstream commit 0e735a4c6137262bcefe45bb52fde7b1f5fc6c4d ]

In 'rtw_wait_firmware_completion()', always wait for both (regular and
wowlan) firmware loading attempts. Otherwise if 'rtw_usb_intf_init()'
has failed in 'rtw_usb_probe()', 'rtw_usb_disconnect()' may issue
'ieee80211_free_hw()' when one of 'rtw_load_firmware_cb()' (usually
the wowlan one) is still in progress, causing UAF detected by KASAN.

Fixes: c8e5695eae99 ("rtw88: load wowlan firmware if wowlan is supported")
Reported-by: syzbot+6c6c08700f9480c41fe3@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6c6c08700f9480c41fe3
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240726114657.25396-1-dmantipov@yandex.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>

EDAC/synopsys: Fix error injection on Zynq UltraScale+

[ Upstream commit 35e6dbfe1846caeafabb49b7575adb36b0aa2269 ]

The Zynq UltraScale+ MPSoC DDR has a disjoint memory from 2GB to 32GB.
The DDR host interface has a contiguous memory so while injecting
errors, the driver should remove the hole else the injection fails as
the address translation is incorrect.

Introduce a get_mem_info() function pointer and set it for Zynq
UltraScale+ platform to return host address.

Fixes: 1a81361f75d8 ("EDAC, synopsys: Add Error Injection support for ZynqMP DDR controller")
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240711100656.31376-1-shubhrajyoti.datta@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: ath11k: use work queue to process beacon tx event

[ Upstream commit 177b49dbf9c1d8f9f25a22ffafa416fc2c8aa6a3 ]

Commit 3a415daa3e8b ("wifi: ath11k: add P2P IE in beacon template")
from Feb 28, 2024 (linux-next), leads to the following Smatch static
checker warning:

drivers/net/wireless/ath/ath11k/wmi.c:1742 ath11k_wmi_p2p_go_bcn_ie()
warn: sleeping in atomic context

The reason is that ath11k_bcn_tx_status_event() will directly call might
sleep function ath11k_wmi_cmd_send() during RCU read-side critical
sections. The call trace is like:

ath11k_bcn_tx_status_event()
-> rcu_read_lock()
-> ath11k_mac_bcn_tx_event()
-> ath11k_mac_setup_bcn_tmpl()
……
-> ath11k_wmi_bcn_tmpl()
-> ath11k_wmi_cmd_send()
-> rcu_read_unlock()

Commit 886433a98425 ("ath11k: add support for BSS color change") added the
ath11k_mac_bcn_tx_event(), commit 01e782c89108 ("ath11k: fix warning
of RCU usage for ath11k_mac_get_arvif_by_vdev_id()") added the RCU lock
to avoid warning but also introduced this BUG.

Use work queue to avoid directly calling ath11k_mac_bcn_tx_event()
during RCU critical sections. No need to worry about the deletion of vif
because cancel_work_sync() will drop the work if it doesn't start or
block vif deletion until the running work is done.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Fixes: 3a415daa3e8b ("wifi: ath11k: add P2P IE in beacon template")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/2d277abd-5e7b-4da0-80e0-52bd96337f6e@moroto.mountain/
Signed-off-by: Kang Yang <quic_kangyang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240626053543.1946-1-quic_kangyang@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

Linux 6.10.12

Link: https://lore.kernel.org/r/20240927121718.789211866@linuxfoundation.org
Tested-by: Allen Pais <apais@linux.microsoft.com>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Kexy Biscuit <kexybiscuit@aosc.io>
Tested-by: kernelci.org bot <bot@kernelci.org>
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powercap: intel_rapl: Change an error pointer to NULL

commit 6b08b4ee5e60d8789aeb87250f718ef14bebb90e upstream.

The rapl_find_package_domain_cpuslocked() function is supposed to
return NULL on error.

This new error patch returns ERR_PTR(-EINVAL) but none of the callers
check for that so it would lead to an Oops.

Fixes: 26096aed255f ("powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/fa719c6a-8d3b-4cca-9b43-bcd477ff6655@stanley.mountain
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: usbtmc: prevent kernel-usb-infoleak

commit 625fa77151f00c1bd00d34d60d6f2e710b3f9aad upstream.

The syzbot reported a kernel-usb-infoleak in usbtmc_write,
we need to clear the structure before filling fields.

Fixes: 4ddc645f40e9 ("usb: usbtmc: Add ioctl for vendor specific write")
Reported-and-tested-by: syzbot+9d34f80f841e948c3fdb@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9d34f80f841e948c3fdb
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/tencent_9649AA6EC56EDECCA8A7D106C792D1C66B06@qq.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: serial: pl2303: add device id for Macrosilicon MS3020

commit 7d47d22444bb7dc1b6d768904a22070ef35e1fc0 upstream.

Add the device id for the Macrosilicon MS3020 which is a
PL2303HXN based device.

Signed-off-by: Junhao Xie <bigfoot@classfun.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

can: mcp251xfd: move mcp251xfd_timestamp_start()/stop() into mcp251xfd_chip_start/stop()

commit a7801540f325d104de5065850a003f1d9bdc6ad3 upstream.

The mcp251xfd wakes up from Low Power or Sleep Mode when SPI activity
is detected. To avoid this, make sure that the timestamp worker is
stopped before shutting down the chip.

Split the starting of the timestamp worker out of
mcp251xfd_timestamp_init() into the separate function
mcp251xfd_timestamp_start().

Call mcp251xfd_timestamp_init() before mcp251xfd_chip_start(), move
mcp251xfd_timestamp_start() to mcp251xfd_chip_start(). In this way,
mcp251xfd_timestamp_stop() can be called unconditionally by
mcp251xfd_chip_stop().

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

can: mcp251xfd: properly indent labels

commit 51b2a721612236335ddec4f3fb5f59e72a204f3a upstream.

To fix the coding style, remove the whitespace in front of labels.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nvme-pci: qdepth 1 quirk

commit 83bdfcbdbe5d901c5fa432decf12e1725a840a56 upstream.

Another device has been reported to be unreliable if we have more than
one outstanding command. In this new case, data corruption may occur.
Since we have two devices now needing this quirky behavior, make a
generic quirk flag.

The same Apple quirk is clearly not "temporary", so update the comment
while moving it.

Link: https://lore.kernel.org/linux-nvme/191d810a4e3.fcc6066c765804.973611676137075390@collabora.com/
Reported-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Christoph Hellwig hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Cc: "Gagniuc, Alexandru" <alexandru.gagniuc@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bluetooth: btintel_pcie: Allocate memory for driver private data

commit 7ffaa200251871980af12e57649ad57c70bf0f43 upstream.

Fix driver not allocating memory for struct btintel_data which is used
to store internal data.

Fixes: 6e65a09f9275 ("Bluetooth: btintel_pcie: Add *setup* function to download firmware")
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Cc: Thomas Leroy <thomas.leroy@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nft_socket: Fix a NULL vs IS_ERR() bug in nft_socket_cgroup_subtree_level()

commit 7052622fccb1efb850c6b55de477f65d03525a30 upstream.

The cgroup_get_from_path() function never returns NULL, it returns error
pointers. Update the error handling to match.

Fixes: 7f3287db6543 ("netfilter: nft_socket: make cgroupsv2 matching work with namespaces")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/bbc0c4e0-05cc-4f44-8797-2f4b3920a820@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>