The skb hash comes from sk->sk_txhash when using TCP, except for some
IPv6 RST packets. This is because in tcp_v6_send_reset when not in
TIME_WAIT the hash is taken from sk->sk_hash, while it should come from
sk->sk_txhash as those two hashes are not computed the same way.
// Wrong ack seq, trigger a rst.
+0 < S. 0:0(0) ack 0 win 4000
// Check the flowlabel matches prior one from SYN.
+0 > (flowlabel 0x1) R 0:0(0) <...>
Fixes: 9258b8b1be2e ("ipv6: tcp: send consistent autoflowlabel in RST packets") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
On some distributions, the rp_filter is automatically set (=1) by
default on a netdev basis (also on VRFs).
In an SRv6 End.DT46 behavior, decapsulated IPv4 packets are routed using
the table associated with the VRF bound to that tunnel. During lookup
operations, the rp_filter can lead to packet loss when activated on the
VRF.
Therefore, we chose to make this selftest more robust by explicitly
disabling the rp_filter during tests (as it is automatically set by some
Linux distributions).
Fixes: 03a0b567a03d ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Tested-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When a tunnel device is bound to an underlying device, its
dev->needed_headroom needs to be updated properly. IPv4 tunnels
already do this in ip_tunnel_bind_dev(). Otherwise we may
not have enough header room for skb, especially after commit b17f709a2401 ("gue: TX support for using remote checksum offload option").
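A minimal sketch of the idea, mirroring what ip_tunnel_bind_dev() does for IPv4 tunnels (tdev is the underlying device the tunnel is bound to; the helper name and exact upstream change may differ):

/* Sketch only: grow the tunnel's headroom to cover the underlying device. */
static void sit_update_headroom(struct net_device *dev,
                                const struct net_device *tdev,
                                unsigned int t_hlen)
{
    if (!tdev)
        return;

    dev->needed_headroom = tdev->hard_header_len +
                           tdev->needed_headroom + t_hlen;
}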
Fixes: 32b8a8e59c9c ("sit: add IPv4 over IPv4 support") Reported-by: Palash Oswal <oswalpalash@gmail.com> Link: https://lore.kernel.org/netdev/CAGyP=7fDcSPKu6nttbGwt7RXzE3uyYxLjCSE97J64pRxJP8jPA@mail.gmail.com/ Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Error handler of tcf_block_bind() frees the whole bo->cb_list on error.
However, by that time the flow_block_cb instances are already on the driver
list, because the driver's ndo_setup_tc() callback has already been invoked
further up the call chain, in tcf_block_offload_cmd(). This leaves dangling pointers to
freed objects in the list and causes use-after-free[0]. Fix it by also
removing flow_block_cb instances from driver_list before deallocating them.
[0]:
[ 279.868433] ==================================================================
[ 279.869964] BUG: KASAN: slab-use-after-free in flow_block_cb_setup_simple+0x631/0x7c0
[ 279.871527] Read of size 8 at addr ffff888147e2bf20 by task tc/2963
[ 279.987532] The buggy address belongs to the object at ffff888147e2bf00
which belongs to the cache kmalloc-192 of size 192
[ 279.989747] The buggy address is located 32 bytes inside of
freed 192-byte region [ffff888147e2bf00, ffff888147e2bfc0)
[ 280.002386] Memory state around the buggy address:
[ 280.003338] ffff888147e2be00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 280.004781] ffff888147e2be80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 280.006224] >ffff888147e2bf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 280.007700] ^
[ 280.008592] ffff888147e2bf80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 280.010035] ffff888147e2c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 280.011564] ==================================================================
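A sketch of the unroll idea described above (simplified; it uses the flow offload API's list fields but is not the exact upstream diff):

/* Error path sketch: unlink each flow_block_cb from the driver's list
 * before freeing it, so no dangling pointers remain.
 */
list_for_each_entry_safe(block_cb, next, &bo->cb_list, list) {
    list_del(&block_cb->driver_list);
    list_del(&block_cb->list);
    flow_block_cb_free(block_cb);
}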
Fixes: 59094b1e5094 ("net: sched: use flow block API") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
David Ahern reported crashes in skb_copy_ubufs() caused by TCP tx zerocopy
using hugepages, and skb length bigger than ~68 KB.
skb_copy_ubufs() assumed it could copy all payload using up to
MAX_SKB_FRAGS order-0 pages.
This assumption broke when BIG TCP was able to put up to 512 KB per skb.
We did not hit this bug at Google because we use CONFIG_MAX_SKB_FRAGS=45
and limit gso_max_size to 180000.
A solution is to use higher order pages if needed.
v2: add missing __GFP_COMP, or we leak memory.
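An illustrative sketch of the idea, not the exact upstream logic: fall back to higher-order compound pages when the payload cannot fit in MAX_SKB_FRAGS order-0 pages.

/* Sketch: pick an order large enough that MAX_SKB_FRAGS pages cover the
 * payload; __GFP_COMP is required so the compound page is freed correctly.
 */
unsigned int order = 0;

while (DIV_ROUND_UP(data_len, PAGE_SIZE << order) > MAX_SKB_FRAGS)
    order++;

page = alloc_pages(gfp_mask | __GFP_COMP, order);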
Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536") Reported-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/netdev/c70000f6-baa4-4a05-46d0-4b3e0dc1ccc8@gmail.com/T/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Xin Long <lucien.xin@gmail.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Coco Li <lixiaoyan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
ncsi_channel_is_tx() determines whether a given channel should be
used for Tx or not. However, when reconfiguring the channel by
handling a Configuration Required AEN, there is a misjudgment that
the channel Tx has already been enabled, which results in the Enable
Channel Network Tx command not being sent.
Clear the channel Tx enable flag before reconfiguring the channel to
avoid the misjudgment.
Fixes: 8d951a75d022 ("net/ncsi: Configure multi-package, multi-channel modes with failover") Signed-off-by: Cosmo Chou <chou.cosmo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
After creating SecYs, SCs and SAs, a SecY can be modified
to change attributes like validation mode, protect-frames
mode, etc. During this SecY update, the packet number is
mistakenly reset to the initial user-given value. Hence do not
reset the PN when updating SecY parameters.
Macsec stats like InPktsLate and InPktsDelayed share the
same counter in hardware. If SecY replay_protect is true,
the counter represents InPktsLate, otherwise InPktsDelayed.
This mode change was mistakenly tracked based on protect_frames
instead of replay_protect. Similarly, InPktsUnchecked
and InPktsOk share the same counter, and the mode change was
tracked based on validate_check instead of validate_disabled.
This patch fixes those problems.
When freeing MCS hardware resources like SecY, SC and
SA, the corresponding stats need to be cleared. Otherwise,
previous stats are shown on newly created macsec interfaces.
On CN10KB silicon a single hardware macsec block is
present and offloads macsec operations for all the
ethernet LMACs. TCAM match with macsec ethertype 0x88e5
alone at RX side is not sufficient to distinguish all the
macsec interfaces created on top of netdevs. Hence append
the DMAC of the macsec interface too; otherwise only the first
created macsec interface receives all the macsec traffic.
When the system is rebooted after creating a macsec interface,
NULL pointer dereference crashes occurred. This patch fixes
those crashes by using the correct teardown order.
On CN10KB, the MCS IP vector number and the BBE and PAB interrupt
masks were changed to support more block-level interrupts.
To accommodate these changes, this patch fixes the BBE and PAB
interrupt handlers.
When ptp timestamping is enabled in RPM, RPM will append an 8B
timestamp header to all RX traffic. MCS needs to skip this
8-byte header while parsing the packet header, so that the
correct TCAM key is created for lookup.
This patch fixes the mcs parser configuration to skip this
8B header for ptp packets.
Fixes: ca7f49ff8846 ("octeontx2-af: cn10k: Introduce driver for macsec block.") Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As per a hardware erratum on CN10KB, all four TCAM_DATA
and TCAM_MASK registers have to be written at once; otherwise
writes to individual registers will fail. Hence write to all
TCAM_DATA registers and then to all TCAM_MASK registers.
For each LMAC port, MCS has two MCS_TOP_SLAVE_CHANNEL_CONFIGX
registers. For CN10KB, both registers need to be configured for
port-level MCS bypass to work. This patch also sets the bitmap
of flowid/secy entry reserved for default bypass so that these
entries can be shown in debugfs.
Fixes: bd69476e86fc ("octeontx2-af: cn10k: mcs: Install a default TCAM for normal traffic") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
adds support to allow XDP programs to run on systems with more than
64 CPUs by locking the XDP TX rings and indexing them using cpu % 64
(IXGBE_MAX_XDP_QS).
Upon trying out this patch on a system with more than 64 cores,
the kernel panicked with an array-index-out-of-bounds at the return in
ixgbe_determine_xdp_ring in ixgbe.h, which means ixgbe_determine_xdp_q_idx
was just returning the cpu instead of cpu % IXGBE_MAX_XDP_QS. An example
splat:
Upon loading the first XDP program on a system with more than 64 CPUs,
ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup. However,
immediately after this, the rings are reconfigured by ixgbe_setup_tc.
ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
it is non-zero. Commenting out the decrement in ixgbe_free_q_vector
stopped my system from panicking.
I suspect to make the original patch work, I would need to load an XDP
program and then replace it in order to get ixgbe_xdp_locking_key back
above 0 since ixgbe_setup_tc is only called when transitioning between
XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
incremented every time ixgbe_xdp_setup is called.
Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
with more than 64 CPUs.
Since ixgbe_xdp_locking_key only protects the XDP_TX path and is tied
to the number of CPUs present, there is no reason to disable it upon
unloading an XDP program. To avoid confusion, I have moved enabling
ixgbe_xdp_locking_key into ixgbe_sw_init, which is part of the probe path.
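A minimal sketch of the resulting behaviour (illustrative; the exact placement inside ixgbe_sw_init() may differ):

/* Sketch: enable the static key once, during probe, and never decrement
 * it when rings are freed or an XDP program is unloaded.
 */
if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
    static_branch_inc(&ixgbe_xdp_locking_key);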
Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus") Signed-off-by: John Hickey <jjh@daedalian.us> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230425170308.2522429-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Inside the loop in rxrpc_wait_to_be_connected() it checks call->error to
see if it should exit the loop without first checking the call state. This
is probably safe as if call->error is set, the call is dead anyway, but we
should probably wait for the call state to have been set to completion
first, lest it cause surprise on the way out.
Fix this by only accessing call->error if the call is complete. We don't
actually need to access the error inside the loop as we'll do that after.
This caused the following report:
BUG: KCSAN: data-race in rxrpc_send_data / rxrpc_set_call_completion
write to 0xffff888159cf3c50 of 4 bytes by task 25673 on cpu 1:
rxrpc_set_call_completion+0x71/0x1c0 net/rxrpc/call_state.c:22
rxrpc_send_data_packet+0xba9/0x1650 net/rxrpc/output.c:479
rxrpc_transmit_one+0x1e/0x130 net/rxrpc/output.c:714
rxrpc_decant_prepared_tx net/rxrpc/call_event.c:326 [inline]
rxrpc_transmit_some_data+0x496/0x600 net/rxrpc/call_event.c:350
rxrpc_input_call_event+0x564/0x1220 net/rxrpc/call_event.c:464
rxrpc_io_thread+0x307/0x1d80 net/rxrpc/io_thread.c:461
kthread+0x1ac/0x1e0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
read to 0xffff888159cf3c50 of 4 bytes by task 25672 on cpu 0:
rxrpc_send_data+0x29e/0x1950 net/rxrpc/sendmsg.c:296
rxrpc_do_sendmsg+0xb7a/0xc20 net/rxrpc/sendmsg.c:726
rxrpc_sendmsg+0x413/0x520 net/rxrpc/af_rxrpc.c:565
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg net/socket.c:747 [inline]
____sys_sendmsg+0x375/0x4c0 net/socket.c:2501
___sys_sendmsg net/socket.c:2555 [inline]
__sys_sendmmsg+0x263/0x500 net/socket.c:2641
__do_sys_sendmmsg net/socket.c:2670 [inline]
__se_sys_sendmmsg net/socket.c:2667 [inline]
__x64_sys_sendmmsg+0x57/0x60 net/socket.c:2667
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x00000000 -> 0xffffffea
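A sketch of the loop-exit check described above (assuming the rxrpc_call_is_complete() helper; illustrative only):

/* Only look at call->error once the call has reached a terminal state. */
if (rxrpc_call_is_complete(call)) {
    ret = call->error;
    break;
}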
Fixes: 9d35d880e0e4 ("rxrpc: Move client call connection to the I/O thread") Reported-by: syzbot+ebc945fdb4acd72cba78@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/000000000000e7c6d205fa10a3cd@google.com/ Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Dmitry Vyukov <dvyukov@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/508133.1682427395@warthog.procyon.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the commit fff7eb56b376 ("drm/amd/display: Don't set dram clock
change requirement for SubVP") was merged, we missed some parts
associated with the MCLK switch. This commit adds all the missing parts.
Fixes: fff7eb56b376 ("drm/amd/display: Don't set dram clock change requirement for SubVP") Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In qedi_probe() we call __qedi_probe() which initializes
&qedi->recovery_work with qedi_recovery_handler() and
&qedi->board_disable_work with qedi_board_disable_work().
When qedi_schedule_recovery_handler() is called, schedule_delayed_work()
will finally start the work.
In qedi_remove(), which is called to remove the driver, the following
sequence may be observed:
Fix this by finishing the work before cleanup in qedi_remove().
Commit 7e1d728a94ca ("ASoC: Intel: soc-acpi-byt: Add new WM5102 ACPI HID")
added an extra HID to wm5102_comp_ids.codecs, but it forgot to bump
wm5102_comp_ids.num_codecs, causing the last codec HID in the codecs list
to no longer work.
Bump wm5102_comp_ids.num_codecs to fix this.
Fixes: 7e1d728a94ca ("ASoC: Intel: soc-acpi-byt: Add new WM5102 ACPI HID") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20230421183714.35186-1-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Replace the name rxe_dbg with rxe_dbg_dev which better matches
the remaining rxe_dbg_xxx macros for debug messages with a
rxe device parameter. Reuse the name rxe_dbg for debug messages
which do not have a rxe device parameter.
Currently all the object types in the rxe driver are allocated in
rdma-core except for MRs. By moving the kzalloc() call outside of
the pool code the rxe_alloc() subroutine can be eliminated and code
checking for MR as a special case can be removed.
This patch moves the kzalloc() and kfree_rcu() calls into the mr
registration and destruction verbs. It removes that code from
rxe_pool.c including the rxe_alloc() subroutine which is no longer
used.
Refresh the MMU's snapshot of the vCPU's CR0.WP prior to checking for
permission faults when emulating a guest memory access and CR0.WP may be
guest owned. If the guest toggles only CR0.WP and triggers emulation of
a supervisor write, e.g. when KVM is emulating UMIP, KVM may consume a
stale CR0.WP, i.e. use stale protection bits metadata.
Note, KVM passes through CR0.WP if and only if EPT is enabled as CR0.WP
is part of the MMU role for legacy shadow paging, and SVM (NPT) doesn't
support per-bit interception controls for CR0. Don't bother checking for
EPT vs. NPT as the "old == new" check will always be true under NPT, i.e.
the only cost is the read of vcpu->arch.cr4 (SVM unconditionally grabs CR0
from the VMCB on VM-Exit).
Guests like grsecurity that make heavy use of CR0.WP to implement kernel
level W^X will suffer from the implied VMEXITs.
With EPT there is no need to intercept a guest change of CR0.WP, so
simply make it a guest owned bit if we can do so.
This implies that a read of a guest's CR0.WP bit might need a VMREAD.
However, the only potentially affected user seems to be kvm_init_mmu()
which is a heavy operation to begin with. But also most callers already
cache the full value of CR0 anyway, so no additional VMREAD is needed.
The only exception is nested_vmx_load_cr3().
This change is VMX-specific, as SVM has no such fine grained control
register intercept control.
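A sketch of the idea (the helper and mask names are assumptions based on the description, not necessarily the exact upstream code):

/* Sketch: CR0.WP may only be guest-owned when EPT is enabled; with shadow
 * paging CR0.WP is folded into the MMU role and must stay intercepted.
 */
static unsigned long vmx_guest_owned_cr0_bits(void)
{
    unsigned long bits = KVM_POSSIBLE_CR0_GUEST_BITS;   /* assumed mask */

    if (!enable_ept)
        bits &= ~X86_CR0_WP;

    return bits;
}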
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> Link: https://lore.kernel.org/r/20230322013731.102955-7-minipli@grsecurity.net Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> # backport to v6.2.x Signed-off-by: Sasha Levin <sashal@kernel.org>
There is no need to unload the MMU roots with TDP enabled when only
CR0.WP has changed -- the paging structures are still valid, only the
permission bitmap needs to be updated.
One heavy user of toggling CR0.WP is grsecurity's KERNEXEC feature to
implement kernel W^X.
The optimization brings a huge performance gain for this case as the
following micro-benchmark running 'ssdd 10 50000' from rt-tests[1] on a
grsecurity L1 VM shows (runtime in seconds, lower is better):
For the legacy MMU this is ~36% faster, for the TDP MMU even ~40% faster.
TDP and legacy MMU now also have similar runtimes, which eliminates the
need to disable the TDP MMU for grsecurity.
Shadow MMU sees no measurable difference and is still slow, as expected.
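A sketch of the fast path this adds (illustrative; assumes the check lives in kvm_post_set_cr0() and that TDP is enabled):

/* Sketch: with TDP the paging structures stay valid when only CR0.WP
 * toggles, so just recompute the permission bitmaps instead of unloading
 * the MMU roots.
 */
if (tdp_enabled && (cr0 ^ old_cr0) == X86_CR0_WP) {
    kvm_init_mmu(vcpu);
    return;
}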
Most of the time, calls to get_guest_pgd result in calling
kvm_read_cr3 (the exception is only nested TDP). Hardcode
the default instead of using the get_cr3 function, avoiding
a retpoline if they are enabled.
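A sketch of the resulting helper (close to the description above, but treat it as illustrative):

static inline unsigned long kvm_mmu_get_guest_pgd(struct kvm_vcpu *vcpu,
                                                  struct kvm_mmu *mmu)
{
    /* Hardcode the common case to avoid a retpoline'd indirect call. */
    if (IS_ENABLED(CONFIG_RETPOLINE) && mmu->get_guest_pgd == kvm_read_cr3)
        return kvm_read_cr3(vcpu);

    return mmu->get_guest_pgd(vcpu);
}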
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> Link: https://lore.kernel.org/r/20230322013731.102955-2-minipli@grsecurity.net Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> # backport to v6.2.x Signed-off-by: Sasha Levin <sashal@kernel.org>
[Why]
The default LTTPR timeout needs to be set again after resume.
[How]
Set the default (3.2 ms) timeout on resume if the sink supports
LTTPR.
Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Ryan Lin <tsung-hua.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Infineon(Cypress) SEMPER NOR flash family has on-die ECC and its program
granularity is 16-byte ECC data unit size. JFFS2 supports write buffer
mode for ECC'd NOR flash. Provide a way to clear the MTD_BIT_WRITEABLE
flag in order to enable JFFS2 write buffer mode support.
Using a flexible array is more straightforward. It
- saves 1 pointer in the 'zynqmp_ipi_pdata' structure
- saves an indirection when using this array
- saves some LoC and avoids some always spurious pointer arithmetic
Platforms based on the SDM845 SoC lock access to the EDAC registers in the
bootloader, so probing the EDAC driver will result in a crash. Hence,
disable the creation of the EDAC platform device on all SDM845 devices.
The issue has been observed on Lenovo Yoga C630 and DB845c.
While at it, also sort the members of `struct qcom_llcc_config` to avoid
any holes in-between.
Not all Qcom platforms support IRQ mode for ECC handling. For those
platforms, the current EDAC driver will not be probed due to missing ECC
IRQ in devicetree.
So add support for polling mode so that the EDAC driver can be used on all
Qcom platforms supporting LLCC.
The polling delay of 5000ms is chosen based on Qcom downstream/vendor
driver.
Infineon(Cypress) SEMPER NOR flash family has on-die ECC and its program
granularity is 16-byte ECC data unit size. JFFS2 supports write buffer
mode for ECC'd NOR flash. Provide a way to clear the MTD_BIT_WRITEABLE
flag in order to enable JFFS2 write buffer mode support.
A new SNOR_F_ECC flag is introduced to determine if the part has on-die
ECC and if it has, MTD_BIT_WRITEABLE is unset.
In vendor specific driver, a common cypress_nor_ecc_init() helper is
added. This helper takes care for ECC related initialization for SEMPER
flash family by setting up params->writesize and SNOR_F_ECC.
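A rough sketch of what such a helper and the flag handling could look like (illustrative; the actual field and flag plumbing lives in the SPI NOR core and the Cypress vendor driver):

/* Sketch: mark the flash as ECC'd and record the 16-byte program granularity. */
static void cypress_nor_ecc_init(struct spi_nor *nor)
{
    nor->params->writesize = 16;
    nor->flags |= SNOR_F_ECC;
}

/* Later, when setting up the MTD device (sketch): */
if (nor->flags & SNOR_F_ECC)
    mtd->flags &= ~MTD_BIT_WRITEABLE;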
Introduce a new (no SFDP) flag for the feature that we are about to
support: Read While Write. This means, if the chip has several banks and
supports RWW, once a page of data to write has been transferred into the
chip's internal SRAM, another read operation happening on a different
bank can be performed during the tPROG delay.
The WCD938x comes with three devices on two Linux drivers:
1. RX Soundwire device (wcd938x-sdw.c driver),
2. TX Soundwire device, which is used to access devices via regmap (also
wcd938x-sdw.c driver),
3. platform device (wcd938x.c driver) - glue and component master,
actually having most of the code using TX Soundwire device regmap.
When RX and TX Soundwire devices probe, the component master (platform
device) bind tries to write micbias configuration via TX Soundwire
regmap. This might happen before TX Soundwire enumerates, so the regmap
access fails. On Qualcomm SM8550 board with WCD9385:
Fix the issue by:
1. Moving the regmap creation from platform device to TX Soundwire
device. The regmap settings are moved as-is with one difference:
making the wcd938x_regmap_config const.
2. Using regmap in cache only mode till the actual TX Soundwire device
enumerates and then sync the regmap cache.
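A sketch of the cache-only approach using the regmap API (illustrative; the actual callback names in the driver differ):

/* At TX SoundWire device probe: create the regmap but only touch the cache. */
regmap = devm_regmap_init_sdw(slave, &wcd938x_regmap_config);
if (IS_ERR(regmap))
    return PTR_ERR(regmap);
regcache_cache_only(regmap, true);

/* Once the SoundWire device has enumerated (e.g. in the update_status callback): */
regcache_cache_only(regmap, false);
regcache_sync(regmap);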
Cc: <stable@vger.kernel.org> # v3.14+ Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Message-Id: <20230503144102.242240-1-krzysztof.kozlowski@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The PSP IRQ is edge-triggered (MSI or MSI-X) in all cases supported by
the psp module so clear the interrupt status register early in the
handler to prevent missed interrupts. sev_irq_handler() calls wake_up()
on a wait queue, which can result in a new command being submitted from
a different CPU. This then races with the clearing of isr and can result
in missed interrupts. A missed interrupt results in a command waiting
until it times out, which results in the psp being declared dead.
This is unlikely on bare metal, but has been observed when running
virtualized. In the cases where this is observed, sev->cmdresp_reg has
PSP_CMDRESP_RESP set which indicates that the command was processed
correctly but no interrupt was asserted.
The full sequence of events looks like this:
CPU 1: submits SEV cmd #1
CPU 1: calls wait_event_timeout()
CPU 0: enters psp_irq_handler()
CPU 0: calls sev_handler()->wake_up()
CPU 1: wakes up; finishes processing cmd #1
CPU 1: submits SEV cmd #2
CPU 1: calls wait_event_timeout()
PSP: finishes processing cmd #2; interrupt status is still set; no interrupt
CPU 0: clears intsts
CPU 0: exits psp_irq_handler()
CPU 1: wait_event_timeout() times out; psp_dead=true
Do not call gadget stop until the poll for controller halt is
completed. DEVTEN is cleared as part of gadget stop, so the intention of
allowing ep0 events to continue while waiting for the controller halt is
not fulfilled.
The hibernation code is broken and has never been enabled in mainline
and should thus be dropped.
Remove the hibernation bits from the gadget code, which effectively
reverts commits e1dadd3b0f27 ("usb: dwc3: workaround: bogus hibernation
events") and 7b2a0368bbc9 ("usb: dwc3: gadget: set KEEP_CONNECT in case
of hibernation") except for the spurious interrupt warning.
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20230404072524.19014-5-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 39674be56fba ("usb: dwc3: gadget: Execute gadget stop after halting the controller") Signed-off-by: Sasha Levin <sashal@kernel.org>
Toggle deleted anonymous sets as inactive in the next generation, so
users cannot perform any update on it. Clear the generation bitmask
in case the transaction is aborted.
The following KASAN splat shows a set element deletion for a bound
anonymous set that has been already removed in the same transaction.
The recent fix to ensure atomicity of lookup and allocation inadvertently
broke the pool refill mechanism.
Prior to that change, debug_objects_activate() and debug_objects_assert_init()
invoked debug_objects_init() to set up the tracking object for statically
initialized objects. That's no longer the case, and debug_objects_init() is
now the only place which does pool refills.
Depending on the number of statically initialized objects this can be
enough to actually deplete the pool, which was observed by Ido via a
debugobjects OOM warning.
Restore the old behaviour by adding explicit refill opportunities to
debug_objects_activate() and debug_objects_assert_init().
To prepare for removal of i40e_status, change the variables
from i40e_status to int. This eases the transition when values
are changed to return standard int error codes over enum i40e_status.
As such changes often also change variable orders, a cleanup
is also applied here to make variables conform to RCT and
some lines are also reformatted where applicable.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove the i40e_stat_str() function which prints the string
representation of the i40e_status error code. With upcoming changes
moving away from i40e_status, there will be no need for this function.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In an effort to remove i40e status codes and replace them
with standard kernel errnos, unused values of i40e_status_code
were removed.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After a standalone CBR (not associated with TSC), update the cycles
reference timestamp and reset the cycle count, so that CYC timestamps
are calculated relative to that point with the new frequency.
Fixes: cc33618619cefc6d ("perf tools: Add Intel PT support for decoding CYC packets") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tb_port_is_clx_enabled() generates a valid warning with gcc-13:
drivers/thunderbolt/switch.c:1286:6: error: conflicting types for 'tb_port_is_clx_enabled' due to enum/integer mismatch; have 'bool(struct tb_port *, unsigned int)' ...
drivers/thunderbolt/tb.h:1050:6: note: previous declaration of 'tb_port_is_clx_enabled' with type 'bool(struct tb_port *, enum tb_clx)' ...
I.e. the type of the 2nd parameter of tb_port_is_clx_enabled() in the
declaration is unsigned int, while the definition spells enum tb_clx.
Synchronize them to the former as the parameter is in fact a mask of the
enum values.
Protect access of TCP_Server_Info::{origin,leaf}_fullpath when
matching DFS connections, and get rid of
TCP_Server_Info::current_fullpath while we're at it.
Cc: stable@vger.kernel.org # v6.2+ Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Protect access of TCP_Server_Info::hostname when building the ipc tree
name as it might get freed in cifsd thread and thus causing an
use-after-free bug in __tree_connect_dfs_target(). Also, while at it,
update status of IPC tcon on success and then avoid any extra tree
connects.
Cc: stable@vger.kernel.org # v6.2+ Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When matching DFS connections, we can't rely on the values set in
cifs_sb_info::prepath and cifs_tcon::tree_name as they might change
during DFS failover. The DFS referrals related to a specific DFS tcon
are already matched earlier in match_server(), therefore we can safely
skip those checks altogether as the connection is guaranteed to be
unique for the DFS tcon.
Besides, when creating or finding an SMB session, make sure to also
refcount any DFS root session related to it (cifs_ses::dfs_root_ses),
so if a new DFS mount ends up reusing the connection from the old
mount while there was an umount(2) still in progress (e.g. umount(2)
-> cifs_umount() -> reconnect -> cifs_put_tcon()), the connection
could potentially be put right after the umount(2) finished.
Patch has minor update to include fix for unused variable issue
noted by the kernel test robot
Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202305041040.j7W2xQSy-lkp@intel.com/ Cc: stable@vger.kernel.org # v6.2+ Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use @ses->ses_lock to protect access of @ses->ses_status.
Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TCP_Server_Info::hostname may be updated once or many times during
reconnect, so protect its access outside reconnect path as well and
then prevent any potential use-after-free bugs.
Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The behavior of 'enum' types has changed in gcc-13, so now the
UNBUSY_THR_PCT constant is interpreted as a 64-bit number because
it is defined as part of the same enum definition as some other
constants that do not fit within a 32-bit integer. This in turn
leads to some inefficient code on 32-bit architectures as well
as a link error:
arm-linux-gnueabi/bin/arm-linux-gnueabi-ld: block/blk-iocost.o: in function `ioc_timer_fn':
blk-iocost.c:(.text+0x68e8): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: blk-iocost.c:(.text+0x6908): undefined reference to `__aeabi_uldivmod'
Split the enum definition to keep the 64-bit timing constants in
a separate enum type from those constants that can clearly fit
within a smaller type.
Commit fa247089de99 ("dm: requeue IO if mapping table not yet available")
added a detection of whether the mapping table is available in the IO
submission process. If the mapping table is unavailable, it returns
BLK_STS_RESOURCE and requeues the IO.
This can lead to the following deadlock problem:
dm create                                    mount
ioctl(DM_DEV_CREATE_CMD)
ioctl(DM_TABLE_LOAD_CMD)
                                             do_mount
                                              vfs_get_tree
                                               ext4_get_tree
                                                get_tree_bdev
                                                 sget_fc
                                                  alloc_super
                                                   // got &s->s_umount
                                                   down_write_nested(&s->s_umount, ...);
                                                 ext4_fill_super
                                                  ext4_load_super
                                                   ext4_read_bh
                                                    submit_bio
                                                    // submit and wait io end
ioctl(DM_DEV_SUSPEND_CMD)
dev_suspend
 do_resume
  dm_suspend
   __dm_suspend
    lock_fs
     freeze_bdev
      get_active_super
       grab_super
        // wait for &s->s_umount
        down_write(&s->s_umount);
  dm_swap_table
   __bind
    // set md->map (can't get here)
IO will be continuously requeued while holding the lock since mapping
table is NULL. At the same time, mapping table won't be set since the
lock is not available.
Like request-based DM, bio-based DM also has the same problem.
It's not proper to just abort the IO if the mapping table is not available.
So clear DM_SKIP_LOCKFS_FLAG when the mapping table is NULL; this
allows the DM table to be loaded and the IO submitted upon resume.
Fixes: fa247089de99 ("dm: requeue IO if mapping table not yet available") Cc: stable@vger.kernel.org Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In table_clear, it first acquires a write lock
https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L1520
down_write(&_hash_lock);
Then, before the lock is released at L1539, there is the following path:
table_clear -> __dev_status -> dm_get_inactive_table -> down_read
https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L773
down_read(&_hash_lock);
It tries to acquire the same read lock again, resulting in the deadlock
problem.
Fix this by moving table_clear()'s __dev_status() call to after its
up_write(&_hash_lock);
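A sketch of the reordering (illustrative; simplified from the actual dm-ioctl code):

/* Sketch: snapshot the device status only after _hash_lock is dropped,
 * so __dev_status() can take its own read lock without self-deadlock.
 */
md = hc->md;
up_write(&_hash_lock);

__dev_status(md, param);
dm_put(md);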
In verity_end_io(), if bi_status is not BLK_STS_OK, it can be returned
directly. But if FEC is configured, it is desirable to correct the data page
through verity_verify_io(), and the return value will be converted to a
blk_status and passed to verity_finish_io().
BTW, when a bit is set in v->validated_blocks, verity_verify_io() skips
verification for the corresponding bio regardless of any I/O error. In this
case, the I/O error could not be returned properly, and as a result,
corrupted data could be read for the corresponding block.
To fix this problem, when an I/O error occurs, do not skip verification
even if the related bit is set in v->validated_blocks.
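A sketch of the adjusted skip condition (illustrative, not the exact upstream hunk):

/* Only take the "already validated" shortcut when the bio saw no I/O error. */
if (v->validated_blocks && bio->bi_status == BLK_STS_OK &&
    likely(test_bit(cur_block, v->validated_blocks))) {
    /* skip hashing this block */
    continue;
}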
Fixes: 843f38d382b1 ("dm verity: add 'check_at_most_once' option to only validate hashes once") Cc: stable@vger.kernel.org Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While using the vdpa device with vIOMMU enabled in the guest VM, the
system will fail to unmap when the vdpa device is bound to vfio-pci and
testpmd is run.
The test process is
Load the guest VM --> attach to the virtio driver --> bind to the vfio-pci driver
So the mapping process is
1) batched mode map to the normal MR
2) batched mode unmap of the normal MR
3) unmap of all the memory
4) map to the iommu MR
This error happened in step 3). The iotlb was freed in step 2), so
vhost_vdpa_process_iotlb_msg() returns a failure, which causes the
unmap to fail.
To fix this, do not remove the AS while iotlb->nmaps is 0; it will be
freed in vhost_vdpa_clean().
Cc: stable@vger.kernel.org Fixes: aaca8373c4b1 ("vhost-vdpa: support ASID based IOTLB API") Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20230420151734.860168-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The refactoring in commit f4e9e0e69468 ("mm/mempolicy: fix use-after-free
of VMA iterator") introduces a subtle bug which arises when attempting to
apply a new NUMA policy across a range of VMAs in mbind_range().
The refactoring passes a **prev pointer to keep track of the previous VMA
in order to reduce duplication, and in all but one case it keeps this
correctly updated.
The bug arises when a VMA within the specified range has an equivalent
policy as determined by mpol_equal() - which unlike other cases, does not
update prev.
This can result in a situation where, later in the iteration, a VMA is
found whose policy does need to change. At this point, vma_merge() is
invoked with prev pointing to a VMA which is before the previous VMA.
Since vma_merge() discovers the curr VMA by looking for the one
immediately after prev, it will now be in a situation where this VMA is
incorrect and the merge will not proceed correctly.
This is checked in the VM_WARN_ON() invariant case with end >
curr->vm_end, which, if a merge is possible, results in a warning (if
CONFIG_DEBUG_VM is specified).
I note that vma_merge() performs these invariant checks only after
merge_prev/merge_next are checked, which is debatable as it hides this
issue if no merge is possible even though a buggy situation has arisen.
The solution is simply to update the prev pointer even when policies are
equal.
This caused a bug to arise in the 6.2.y stable tree, and this patch
resolves this bug.
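A sketch of the fix in the mbind_range() loop (illustrative):

/* Keep prev in sync even when this VMA's policy is already correct. */
if (mpol_equal(vma_policy(vma), new_pol)) {
    *prev = vma;
    continue;
}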
The DASD driver does not kick the requeue list when requeuing IO requests
to the block layer. This might lead to a hanging block device when there is
no other trigger for this.
Fix this by automatically kicking the requeue list when requeuing DASD
requests to the block layer.
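A sketch of the idea using the blk-mq API (illustrative):

/* Requeue the request and kick the requeue list in one go, so the request
 * does not sit on the list waiting for another trigger.
 */
blk_mq_requeue_request(req, true);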
Since the introduction of scrub interface, the only flag that we support
is BTRFS_SCRUB_READONLY. Thus there is no sanity checks, if there are
some undefined flags passed in, we just ignore them.
This is problematic if we want to introduce new scrub flags, as we have
no way to determine if such flags are supported.
Address the problem by introducing a check for the flags, and if
unsupported flags are set, return -EOPNOTSUPP to inform the user space.
This check should be backported for all supported kernels before any new
scrub flags are introduced.
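A sketch of such a check in the scrub ioctl (the mask name is an assumption):

/* Assumed mask of the flags currently understood by the scrub ioctl. */
#define BTRFS_SCRUB_SUPPORTED_FLAGS    (BTRFS_SCRUB_READONLY)

if (sa->flags & ~BTRFS_SCRUB_SUPPORTED_FLAGS)
    return -EOPNOTSUPP;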
The following incompatibilities between Python 2 and Python 3 made
lx-timerlist fail to run under Python 3.
o xrange() is replaced by range() in Python3
o bytes and str are different types in Python3
o the return value of Inferior.read_memory() is memoryview object in
Python3
akpm: cc stable so that older kernels are properly debuggable under newer
Python.
Link: https://lkml.kernel.org/r/TYCP286MB2146EE1180A4D5176CBA8AB2C6819@TYCP286MB2146.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Peng Liu <liupeng17@lenovo.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
clk_cifout is derived from clk_cifout_src through an integer divider
limited to 32. clk_cifout_src is a child of either cpll, gpll or npll
without any possibility of a divider of any sort. The default clock
parent is cpll.
Let's allow clk_cifout to ask its parent clk_cifout_src to reparent in
order to find the real closest possible rate for clk_cifout and not one
derived from cpll only.
Similar to commit 1c11289b34ab ("peci: cpu: Fix use-after-free in
adev_release()"), the auxiliary device is not torn down in the correct
order. If auxiliary_device_add() fails, the release callback will be
called twice, resulting in a UAF. Due to timing, the auxdev code in this
driver "took inspiration" from the aforementioned commit, and thus its
bugs too!
Moving auxiliary_device_uninit() to the unregister callback instead
avoids the issue.
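For reference, a sketch of the usual auxiliary-device registration pattern that avoids the double release (illustrative):

/* Once auxiliary_device_init() succeeds, the device holds a reference and
 * the release callback runs exactly once via auxiliary_device_uninit().
 */
ret = auxiliary_device_init(adev);
if (ret)
    return ret;

ret = auxiliary_device_add(adev);
if (ret) {
    auxiliary_device_uninit(adev);
    return ret;
}

/* At unregister time: */
auxiliary_device_delete(adev);
auxiliary_device_uninit(adev);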
A race condition can happen if netdev is registered, but NAPI isn't
initialized yet, and meanwhile user space starts the netdev that will
enable NAPI. Then, it hits BUG_ON():
To fix this, follow Jonas' suggestion to switch the order of these
functions and move register netdev to be the last step of PCI probe.
Also, correct the error handling of rtw89_core_register_hw().
On my RTW8821CU chipset rfe_option reads as 0x22. Looking at the
vendor driver suggests that the field width of rfe_option is 5 bit,
so rfe_option should be masked with 0x1f.
Without this the rfe_option comparisons with 2 further down the
driver evaluate as false when they should really evaluate as true.
The effect is that 2G channels do not work.
rfe_option is also used as an array index into rtw8821c_rfe_defs[].
rtw8821c_rfe_defs[34] (0x22) was added as part of adding USB support,
likely because rfe_option reads as 0x22. As this now becomes 0x2,
rtw8821c_rfe_defs[34] is no longer used and can be removed.
Note that this might not be the whole truth. In the vendor driver
there are indeed places where the unmasked rfe_option value is used.
However, the driver has several places where rfe_option is tested
with the pattern if (rfe_option == 2 || rfe_option == 0x22) or
if (rfe_option == 4 || rfe_option == 0x24), so that rfe_option BIT(5)
has no influence on the code path taken. We therefore mask BIT(5)
out from rfe_option entirely until this assumption is proved wrong
by some chip variant we do not know yet.
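A sketch of the masking (illustrative; the efuse field names are assumptions):

/* Keep only the 5-bit rfe_option field; BIT(5) is ignored for now. */
efuse->rfe_option = map->rfe_option & GENMASK(4, 0);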
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Tested-by: Alexandru gagniuc <mr.nuke.me@gmail.com> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: ValdikSS <iam@valdikss.org.ru> Cc: stable@vger.kernel.org Reviewed-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230417140358.2240429-3-s.hauer@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are two pointers in struct xfrm_dev_offload: *dev and *real_dev.
The *dev pointer points to either the bonding interface or the real
interface: if bonding IPsec offload is used, it points to the bonding
interface; if not, it points to the real interface. And *real_dev always
points to the real interface.
So nfp should always use real_dev instead of dev.
Prior to this change the system becomes unresponsive when offloading
IPsec for a device which is a lower device to a bonding device.
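A sketch of the change (illustrative):

/* Always program the offload against the real (lower) device. */
struct net_device *netdev = x->xso.real_dev;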
Fixes: 859a497fe80c ("nfp: implement xfrm callbacks and expose ipsec offload feature to upper layer") CC: stable@vger.kernel.org Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Link: https://lore.kernel.org/r/20230420140125.38521-1-louis.peens@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The reason for this is that on certain arm64 configurations, since commit
e35123d83ee3 ("arm64: lto: Strengthen READ_ONCE() to acquire when
CONFIG_LTO=y"), READ_ONCE() may be promoted to a full atomic acquire
instruction which cannot be used on unaligned addresses.
Fix it by avoiding READ_ONCE() in read_instrumented_memory(), and simply
forcing the compiler to do the required access by casting to the
appropriate volatile type. In terms of generated code this currently
only affects architectures that do not use the default READ_ONCE()
implementation.
The only downside is that we are not guaranteed atomicity of the access
itself, although on most architectures a plain load up to machine word
size should still be atomic (a fact the default READ_ONCE() still relies
on itself).
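A sketch of the volatile-cast approach (close to what the description implies, but illustrative):

/* Force the access through a plain volatile load of the right width,
 * without READ_ONCE()'s potential acquire promotion under LTO.
 */
static __always_inline u64 read_instrumented_memory(const volatile void *ptr,
                                                    size_t size)
{
    switch (size) {
    case 1:  return *(const volatile u8 *)ptr;
    case 2:  return *(const volatile u16 *)ptr;
    case 4:  return *(const volatile u32 *)ptr;
    case 8:  return *(const volatile u64 *)ptr;
    default: return 0; /* only the value difference matters for KCSAN */
    }
}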
Reported-by: Haibo Li <haibo.li@mediatek.com> Tested-by: Haibo Li <haibo.li@mediatek.com> Cc: <stable@vger.kernel.org> # 5.17+ Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Multiple IPI channels are mapped to the same interrupt handler.
The current ISR implementation handles only one channel per ISR.
Fix this behavior by checking the ISR status bit of all child
mailbox nodes.
If the unit address is appended to the memory-region node name,
then adding rproc carveouts fails, as both the node name and the
unit address are passed as the carveout name (i.e. vdev0vring0@xxxxxxxx).
However, only the node name is expected by the remoteproc framework.
This patch moves memory-region node parsing from driver probe to prepare
and passes only the node name, not the unit address.
The bug was observed while reading the code; there are not many users of
addr_mode_nbytes. Anyway, we should update the flash's current address
mode when changing the address mode, so fix it. We don't care for now about
the set_4byte_addr_mode(nor, false) call from spi_nor_restore(), as it is
used at driver remove and shutdown.
If mtd_otp_nvmem_add() fails, the partitions won't be removed
because there is simply no call to del_mtd_partitions().
Unfortunately, add_mtd_partitions() will print all partitions to
the kernel console. If mtd_otp_nvmem_add() returns -EPROBE_DEFER
this would print the partitions multiple times to the kernel
console. Instead move mtd_otp_nvmem_add() to the beginning of the
function.
The master MTD will only have an associated device if
CONFIG_MTD_PARTITIONED_MASTER is set, thus we cannot use dev_err() on
mtd->dev. Instead use the parent device which is the physical flash
memory.
Commit c048b60d39e1 ("mtd: core: provide unique name for nvmem device")
tries to give the nvmem device a unique name, but fails badly if the mtd
device doesn't have a "struct device" associated with it, i.e. if
CONFIG_MTD_PARTITIONED_MASTER is not set. This will result in the name
"(null)-user-otp", which is not unique. It seems the best we can do is
to use the compatible name together with a unique identifier added by
the nvmem subsystem by using NVMEM_DEVID_AUTO.
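A sketch of the naming change (illustrative):

/* Use the compatible string plus an automatic id to keep names unique. */
config.name = compatible;
config.id = NVMEM_DEVID_AUTO;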
Fixes: c048b60d39e1 ("mtd: core: provide unique name for nvmem device") Cc: stable@vger.kernel.org Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230308082021.870459-1-michael@walle.cc Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When booting with 'kasan.vmalloc=off', a kernel configured with support
for KASAN_HW_TAGS will explode at boot time due to bogus use of
virt_to_page() on a vmalloc adddress. With CONFIG_DEBUG_VIRTUAL selected
this will be reported explicitly, and with or without CONFIG_DEBUG_VIRTUAL
the kernel will dereference a bogus address:
This is because init_vmalloc_pages() erroneously calls virt_to_page() on
a vmalloc address, while virt_to_page() is only valid for addresses in
the linear/direct map. Since init_vmalloc_pages() expects virtual
addresses in the vmalloc range, it must use vmalloc_to_page() rather
than virt_to_page().
We call init_vmalloc_pages() from __kasan_unpoison_vmalloc(), where we
check !is_vmalloc_or_module_addr(), suggesting that we might encounter a
non-vmalloc address. Luckily, this never happens. By design, we only
call __kasan_unpoison_vmalloc() on pointers in the vmalloc area, and I
have verified that we don't violate that expectation. Given that,
is_vmalloc_or_module_addr() must always be true for any legitimate
argument to __kasan_unpoison_vmalloc().
Correct init_vmalloc_pages() to use vmalloc_to_page(), and remove the
redundant and misleading use of is_vmalloc_or_module_addr() in
__kasan_unpoison_vmalloc().
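A sketch of the corrected helper, following the description above (illustrative):

static void init_vmalloc_pages(const void *start, unsigned long size)
{
    const void *addr;

    for (addr = start; addr < start + size; addr += PAGE_SIZE) {
        /* vmalloc addresses must go through vmalloc_to_page(). */
        struct page *page = vmalloc_to_page(addr);

        clear_highpage_kasan_tagged(page);
    }
}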
Link: https://lkml.kernel.org/r/20230418164212.1775741-1-mark.rutland@arm.com Fixes: 6c2f761dad7851d8 ("kasan: fix zeroing vmalloc memory with HW_TAGS") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()") changed the
order in which requests for underlying disks are created. Since for
large sequential IO adding of requests frequently races with md_raid5
thread submitting bios to underlying disks, this results in a change in
IO pattern because intermediate states of new order of request creation
result in more smaller discontiguous requests. For RAID5 on top of three
rotational disks our performance testing revealed this results in
regression in write throughput:
To reduce the amount of discontiguous requests we can start generating
requests with the stripe with the lowest chunk offset as that has the
best chance of being adjacent to IO queued previously. This improves the
performance to:
        KB  reclen   write  rewrite    read  reread
 131072000       4  497682   506317  518043  514559
 131072000       8  514048   501886  506453  504319
restoring big part of the regression.
Fixes: 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230417171537.17899-1-jack@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
init_resync() inits the mempool and sets conf->have_replacement at the beginning
of a sync; close_sync() frees the mempool when the sync is completed.
After [1] recovery might be skipped and init_resync() is called but
close_sync() is not. null-ptr-deref occurs with r10bio->dev[i].repl_bio.
The following is one way to reproduce the issue.
1) create a array, wait for resync to complete, mddev->recovery_cp is set
to MaxSector.
2) recovery is woken and it is skipped. conf->have_replacement is set to
0 in init_resync(). close_sync() not called.
3) some io errors and rdev A is set to WantReplacement.
4) a new device is added and set to A's replacement.
5) recovery is woken, A has a replacement, but conf->have_replacement is
0. r10bio->dev[i].repl_bio will not be allocated and a null-ptr-deref
occurs.
Fix it by not calling init_resync() if recovery skipped.
[1] commit 7e83ccbecd60 ("md/raid10: Allow skipping recovery when clean arrays are assembled") Fixes: 7e83ccbecd60 ("md/raid10: Allow skipping recovery when clean arrays are assembled") Cc: stable@vger.kernel.org Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230222041000.3341651-3-linan666@huaweicloud.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we receive a flush command (or "barrier" in DRBD), we currently use
a REQ_OP_FLUSH with the REQ_PREFLUSH flag set.
The correct way to submit a flush bio is by using a REQ_OP_WRITE without
any data, and set the REQ_PREFLUSH flag.
Since commit b4a6bb3a67aa ("block: add a sanity check for non-write
flush/fua bios"), this triggers a warning in the block layer, but this
has been broken for quite some time before that.
So use the correct set of flags to actually make the flush happen.
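A sketch of building such a flush bio with the current block-layer API (illustrative):

/* An empty write with REQ_PREFLUSH is the canonical flush bio. */
bio = bio_alloc(bdev, 0, REQ_OP_WRITE | REQ_PREFLUSH, GFP_NOIO);
submit_bio(bio);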
Cc: Christoph Hellwig <hch@infradead.org> Cc: stable@vger.kernel.org Fixes: f9ff0da56437 ("drbd: allow parallel flushes for multi-volume resources") Reported-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230503121937.17232-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the page is pinned, there's no point in trying to reclaim it.
Furthermore if the page is from the page cache we don't want to reclaim
fs-private data from the page because the pinning process may be writing
to the page at any time and reclaiming fs private info on a dirty page can
upset the filesystem (see link below).
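A sketch of the check (illustrative; the label name is an assumption):

/* Reclaiming a DMA-pinned folio is pointless and may corrupt fs-private
 * state, so skip it.
 */
if (folio_maybe_dma_pinned(folio))
    goto keep_locked;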
Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz Link: https://lkml.kernel.org/r/20230428124140.30166-1-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the disk image that nilfs2 mounts is corrupted and a virtual block
address obtained by block lookup for a metadata file is invalid,
nilfs_bmap_lookup_at_level() may return the same internal return code as
-ENOENT, meaning the block does not exist in the metadata file.
This duplication of return codes confuses nilfs_mdt_get_block(), causing
it to read and create a metadata block indefinitely.
In particular, if this happens to the inode metadata file, ifile,
semaphore i_rwsem can be left held, causing task hangs in lock_mount.
Fix this issue by making nilfs_bmap_lookup_at_level() treat virtual block
address translation failures with -ENOENT as metadata corruption instead
of returning the error code.
According to syzbot's report, mark_buffer_dirty() called from
nilfs_segctor_do_construct() outputs a warning with some patterns after
nilfs2 detects metadata corruption and degrades to read-only mode.
After such read-only degeneration, page cache data may be cleared through
nilfs_clear_dirty_page() which may also clear the uptodate flag for their
buffer heads. However, even after the degeneration, log writes are still
performed by unmount processing etc., which causes mark_buffer_dirty() to
be called for buffer heads without the "uptodate" flag and causes the
warning.
Since any writes should not be done to a read-only file system in the
first place, this fixes the warning in mark_buffer_dirty() by letting
nilfs_segctor_do_construct() abort early if in read-only mode.
This also changes the retry check of nilfs_segctor_write_out() to avoid
unnecessary log write retries if it detects the -EROFS returned by
nilfs_segctor_do_construct().
This patch adds support for the mute LED on the HP Pavilion Aero Laptop
13-be0xxx. The current behavior is that the LED does not turn on at any
time and does not indicate to the user whether the sound is muted.
The solution is to add a PCI quirk to properly recognize and support the
LED on this device.
This change has been tested on the device in question using modified
versions of kernels 6.0.7-6.2.12 on Arch Linux.