]> git.ipfire.org Git - thirdparty/kernel/stable.git/log
thirdparty/kernel/stable.git
15 months agortnetlink: Don't ignore IFLA_TARGET_NETNSID when ifname is specified in rtnl_dellink().
Kuniyuki Iwashima [Sat, 27 Jul 2024 00:19:53 +0000 (17:19 -0700)] 
rtnetlink: Don't ignore IFLA_TARGET_NETNSID when ifname is specified in rtnl_dellink().

[ Upstream commit 9415d375d8520e0ed55f0c0b058928da9a5b5b3d ]

The cited commit accidentally replaced tgt_net with net in rtnl_dellink().

As a result, IFLA_TARGET_NETNSID is ignored if the interface is specified
with IFLA_IFNAME or IFLA_ALT_IFNAME.

Let's pass tgt_net to rtnl_dev_get().

Fixes: cc6090e985d7 ("net: rtnetlink: introduce helper to get net_device instance by ifname")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agonet: axienet: start napi before enabling Rx/Tx
Andy Chiu [Fri, 26 Jul 2024 07:06:50 +0000 (15:06 +0800)] 
net: axienet: start napi before enabling Rx/Tx

[ Upstream commit 799a829507506924add8a7620493adc1c3cfda30 ]

softirq may get lost if an Rx interrupt comes before we call
napi_enable. Move napi_enable in front of axienet_setoptions(), which
turns on the device, to address the issue.

Link: https://lists.gnu.org/archive/html/qemu-devel/2024-07/msg06160.html
Fixes: cc37610caaf8 ("net: axienet: implement NAPI and GRO receive")
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agotcp: Adjust clamping window for applications specifying SO_RCVBUF
Subash Abhinov Kasiviswanathan [Fri, 26 Jul 2024 20:41:05 +0000 (13:41 -0700)] 
tcp: Adjust clamping window for applications specifying SO_RCVBUF

[ Upstream commit 05f76b2d634e65ab34472802d9b142ea9e03f74e ]

tp->scaling_ratio is not updated based on skb->len/skb->truesize once
SO_RCVBUF is set leading to the maximum window scaling to be 25% of
rcvbuf after
commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
and 50% of rcvbuf after
commit 697a6c8cec03 ("tcp: increase the default TCP scaling ratio").
50% tries to emulate the behavior of older kernels using
sysctl_tcp_adv_win_scale with default value.

Systems which were using a different values of sysctl_tcp_adv_win_scale
in older kernels ended up seeing reduced download speeds in certain
cases as covered in https://lists.openwall.net/netdev/2024/05/15/13
While the sysctl scheme is no longer acceptable, the value of 50% is
a bit conservative when the skb->len/skb->truesize ratio is later
determined to be ~0.66.

Applications not specifying SO_RCVBUF update the window scaling and
the receiver buffer every time data is copied to userspace. This
computation is now used for applications setting SO_RCVBUF to update
the maximum window scaling while ensuring that the receive buffer
is within the application specified limit.

Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
Signed-off-by: Sean Tranchetti <quic_stranche@quicinc.com>
Signed-off-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoethtool: fix setting key and resetting indir at once
Jakub Kicinski [Thu, 25 Jul 2024 22:23:51 +0000 (15:23 -0700)] 
ethtool: fix setting key and resetting indir at once

[ Upstream commit 7195f0ef7f5b8c678cf28de7c9b619cb908b482c ]

The indirection table and the key follow struct ethtool_rxfh
in user memory.

To reset the indirection table user space calls SET_RXFH with
table of size 0 (OTOH to say "no change" it should use -1 / ~0).
The logic for calculating the offset where they key sits is
incorrect in this case, as kernel would still offset by the full
table length, while for the reset there is no indir table and
key is immediately after the struct.

  $ ethtool -X eth0 default hkey 01:02:03...
  $ ethtool -x eth0
  [...]
  RSS hash key:
00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
  [...]

Fixes: 3de0b592394d ("ethtool: Support for configurable RSS hash key")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agodrm/client: Fix error code in drm_client_buffer_vmap_local()
Dan Carpenter [Wed, 24 Jul 2024 16:09:54 +0000 (11:09 -0500)] 
drm/client: Fix error code in drm_client_buffer_vmap_local()

[ Upstream commit b5fbf924f125ba3638cfdc21c0515eb7e76264ca ]

This function accidentally returns zero/success on the failure path.
It leads to locking issues and an uninitialized *map_copy in the
caller.

Fixes: b4b0193e83cb ("drm/fbdev-generic: Fix locking with drm_client_buffer_vmap_local()")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/89d13df3-747c-4c5d-b122-d081aef5110a@stanley.mountain
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoBluetooth: hci_sync: Fix suspending with wrong filter policy
Luiz Augusto von Dentz [Mon, 15 Jul 2024 14:40:03 +0000 (10:40 -0400)] 
Bluetooth: hci_sync: Fix suspending with wrong filter policy

[ Upstream commit 96b82af36efaa1787946e021aa3dc5410c05beeb ]

When suspending the scan filter policy cannot be 0x00 (no acceptlist)
since that means the host has to process every advertisement report
waking up the system, so this attempts to check if hdev is marked as
suspended and if the resulting filter policy would be 0x00 (no
acceptlist) then skip passive scanning if thre no devices in the
acceptlist otherwise reset the filter policy to 0x01 so the acceptlist
is used since the devices programmed there can still wakeup be system.

Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoBluetooth: btintel: Fail setup on error
Kiran K [Wed, 3 Jul 2024 08:52:42 +0000 (14:22 +0530)] 
Bluetooth: btintel: Fail setup on error

[ Upstream commit e22a3a9d4134d7e6351a2998771522e74bcc58da ]

Do not attempt to send any hci command to controller if *setup* function
fails.

Fixes: af395330abed ("Bluetooth: btintel: Add Intel devcoredump support")
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoALSA: hda: conexant: Fix headset auto detect fail in the polling mode
songxiebing [Fri, 26 Jul 2024 10:07:26 +0000 (18:07 +0800)] 
ALSA: hda: conexant: Fix headset auto detect fail in the polling mode

[ Upstream commit e60dc98122110594d0290845160f12916192fc6d ]

The previous fix (7aeb25908648) only handles the unsol_event reporting
during interrupts and does not include the polling mode used to set
jackroll_ms, so now we are replacing it with
snd_hda_jack_detect_enable_callback.

Fixes: 7aeb25908648 ("ALSA: hda/conexant: Fix headset auto detect fail in cx8070 and SN6140")
Co-developed-by: bo liu <bo.liu@senarytech.com>
Signed-off-by: bo liu <bo.liu@senarytech.com>
Signed-off-by: songxiebing <songxiebing@kylinos.cn>
Link: https://patch.msgid.link/20240726100726.50824-1-soxiebing@163.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agonet: phy: realtek: add support for RTL8366S Gigabit PHY
Mark Mentovai [Thu, 25 Jul 2024 20:41:44 +0000 (16:41 -0400)] 
net: phy: realtek: add support for RTL8366S Gigabit PHY

[ Upstream commit 225990c487c1023e7b3aa89beb6a68011fbc0461 ]

The PHY built in to the Realtek RTL8366S switch controller was
previously supported by genphy_driver. This PHY does not implement MMD
operations. Since commit 9b01c885be36 ("net: phy: c22: migrate to
genphy_c45_write_eee_adv()"), MMD register reads have been made during
phy_probe to determine EEE support. For genphy_driver, these reads are
transformed into 802.3 annex 22D clause 45-over-clause 22
mmd_phy_indirect operations that perform MII register writes to
MII_MMD_CTRL and MII_MMD_DATA. This overwrites those two MII registers,
which on this PHY are reserved and have another function, rendering the
PHY unusable while so configured.

Proper support for this PHY is restored by providing a phy_driver that
declares MMD operations as unsupported by using the helper functions
provided for that purpose, while remaining otherwise identical to
genphy_driver.

Fixes: 9b01c885be36 ("net: phy: c22: migrate to genphy_c45_write_eee_adv()")
Reported-by: Russell Senior <russell@personaltelco.net>
Closes: https://github.com/openwrt/openwrt/issues/15981
Link: https://github.com/openwrt/openwrt/issues/15739
Signed-off-by: Mark Mentovai <mark@mentovai.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agowifi: cfg80211: correct S1G beacon length calculation
Johannes Berg [Wed, 24 Jul 2024 11:29:12 +0000 (13:29 +0200)] 
wifi: cfg80211: correct S1G beacon length calculation

[ Upstream commit 6873cc4416078202882691b424fcca5b5fb1a94d ]

The minimum header length calculation (equivalent to the start
of the elements) for the S1G long beacon erroneously required
only up to the start of u.s1g_beacon rather than the start of
u.s1g_beacon.variable. Fix that, and also shuffle the branches
around a bit to not assign useless values that are overwritten
later.

Reported-by: syzbot+0f3afa93b91202f21939@syzkaller.appspotmail.com
Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results")
Link: https://patch.msgid.link/20240724132912.9662972db7c1.I8779675b5bbda4994cc66f876b6b87a2361c3c0b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agowifi: cfg80211: fix reporting failed MLO links status with cfg80211_connect_done
Veerendranath Jakkam [Wed, 24 Jul 2024 12:53:27 +0000 (18:23 +0530)] 
wifi: cfg80211: fix reporting failed MLO links status with cfg80211_connect_done

[ Upstream commit baeaabf970b9a90999f62ae27edf63f6cb86c023 ]

Individual MLO links connection status is not copied to
EVENT_CONNECT_RESULT data while processing the connect response
information in cfg80211_connect_done(). Due to this failed links
are wrongly indicated with success status in EVENT_CONNECT_RESULT.

To fix this, copy the individual MLO links status to the
EVENT_CONNECT_RESULT data.

Fixes: 53ad07e9823b ("wifi: cfg80211: support reporting failed links")
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Link: https://patch.msgid.link/20240724125327.3495874-1-quic_vjakkam@quicinc.com
[commit message editorial changes]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agosched: act_ct: take care of padding in struct zones_ht_key
Eric Dumazet [Thu, 25 Jul 2024 09:27:45 +0000 (09:27 +0000)] 
sched: act_ct: take care of padding in struct zones_ht_key

[ Upstream commit 2191a54f63225b548fd8346be3611c3219a24738 ]

Blamed commit increased lookup key size from 2 bytes to 16 bytes,
because zones_ht_key got a struct net pointer.

Make sure rhashtable_lookup() is not using the padding bytes
which are not initialized.

 BUG: KMSAN: uninit-value in rht_ptr_rcu include/linux/rhashtable.h:376 [inline]
 BUG: KMSAN: uninit-value in __rhashtable_lookup include/linux/rhashtable.h:607 [inline]
 BUG: KMSAN: uninit-value in rhashtable_lookup include/linux/rhashtable.h:646 [inline]
 BUG: KMSAN: uninit-value in rhashtable_lookup_fast include/linux/rhashtable.h:672 [inline]
 BUG: KMSAN: uninit-value in tcf_ct_flow_table_get+0x611/0x2260 net/sched/act_ct.c:329
  rht_ptr_rcu include/linux/rhashtable.h:376 [inline]
  __rhashtable_lookup include/linux/rhashtable.h:607 [inline]
  rhashtable_lookup include/linux/rhashtable.h:646 [inline]
  rhashtable_lookup_fast include/linux/rhashtable.h:672 [inline]
  tcf_ct_flow_table_get+0x611/0x2260 net/sched/act_ct.c:329
  tcf_ct_init+0xa67/0x2890 net/sched/act_ct.c:1408
  tcf_action_init_1+0x6cc/0xb30 net/sched/act_api.c:1425
  tcf_action_init+0x458/0xf00 net/sched/act_api.c:1488
  tcf_action_add net/sched/act_api.c:2061 [inline]
  tc_ctl_action+0x4be/0x19d0 net/sched/act_api.c:2118
  rtnetlink_rcv_msg+0x12fc/0x1410 net/core/rtnetlink.c:6647
  netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2550
  rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6665
  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
  netlink_unicast+0xf52/0x1260 net/netlink/af_netlink.c:1357
  netlink_sendmsg+0x10da/0x11e0 net/netlink/af_netlink.c:1901
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x30f/0x380 net/socket.c:745
  ____sys_sendmsg+0x877/0xb60 net/socket.c:2597
  ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2651
  __sys_sendmsg net/socket.c:2680 [inline]
  __do_sys_sendmsg net/socket.c:2689 [inline]
  __se_sys_sendmsg net/socket.c:2687 [inline]
  __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2687
  x64_sys_call+0x2dd6/0x3c10 arch/x86/include/generated/asm/syscalls_64.h:47
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Local variable key created at:
  tcf_ct_flow_table_get+0x4a/0x2260 net/sched/act_ct.c:324
  tcf_ct_init+0xa67/0x2890 net/sched/act_ct.c:1408

Fixes: 88c67aeb1407 ("sched: act_ct: add netns into the key of tcf_ct_flow_table")
Reported-by: syzbot+1b5e4e187cc586d05ea0@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agodrm/vmwgfx: Trigger a modeset when the screen moves
Ian Forbes [Mon, 24 Jun 2024 20:59:51 +0000 (15:59 -0500)] 
drm/vmwgfx: Trigger a modeset when the screen moves

[ Upstream commit 75c3e8a26a35d4f3eee299b3cc7e465f166f4e2d ]

When multi-monitor is cycled the X,Y position of the Screen Target will
likely change but the resolution will not. We need to trigger a modeset
when this occurs in order to recreate the Screen Target with the correct
X,Y position.

Fixes a bug where multiple displays are shown in a single scrollable
host window rather than in 2+ windows on separate host displays.

Fixes: 426826933109 ("drm/vmwgfx: Filter modes which exceed graphics memory")
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240624205951.23343-1-ian.forbes@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoethtool: rss: echo the context number back
Jakub Kicinski [Wed, 24 Jul 2024 23:42:49 +0000 (16:42 -0700)] 
ethtool: rss: echo the context number back

[ Upstream commit f96aae91b0d260f682e630e092ef70a05a718a43 ]

The response to a GET request in Netlink should fully identify
the queried object. RSS_GET accepts context id as an input,
so it must echo that attribute back to the response.

After (assuming context 1 has been created):

  $ ./cli.py --spec netlink/specs/ethtool.yaml \
             --do rss-get \
     --json '{"header": {"dev-index": 2}, "context": 1}'
  {'context': 1,
   'header': {'dev-index': 2, 'dev-name': 'eth0'},
  [...]

Fixes: 7112a04664bf ("ethtool: add netlink based get rss support")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20240724234249.2621109-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agonetlink: specs: correct the spec of ethtool
Jakub Kicinski [Wed, 24 Jul 2024 23:42:48 +0000 (16:42 -0700)] 
netlink: specs: correct the spec of ethtool

[ Upstream commit a40c7a24f97edda025f53cfe8f0bc6a6e3c12fa6 ]

The spec for Ethtool is a bit inaccurate. We don't currently
support dump. Context is only accepted as input and not echoed
to output (which is a separate bug).

Fixes: a353318ebf24 ("tools: ynl: populate most of the ethtool spec")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20240724234249.2621109-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agobnxt_en: Fix RSS logic in __bnxt_reserve_rings()
Pavan Chebbi [Wed, 24 Jul 2024 22:21:06 +0000 (15:21 -0700)] 
bnxt_en: Fix RSS logic in __bnxt_reserve_rings()

[ Upstream commit 98ba1d931f611e8f8f519c0405fa0a1a76554bfa ]

In __bnxt_reserve_rings(), the existing code unconditionally sets the
default RSS indirection table to default if netif_is_rxfh_configured()
returns false.  This used to be correct before we added RSS contexts
support.  For example, if the user is changing the number of ethtool
channels, we will enter this path to reserve the new number of rings.
We will then set the RSS indirection table to default to cover the new
number of rings if netif_is_rxfh_configured() is false.

Now, with RSS contexts support, if the user has added or deleted RSS
contexts, we may now enter this path to reserve the new number of VNICs.
However, netif_is_rxfh_configured() will not return the correct state if
we are still in the middle of set_rxfh().  So the existing code may
set the indirection table of the default RSS context to default by
mistake.

Fix it to check if the reservation of the RX rings is changing.  Only
check netif_is_rxfh_configured() if it is changing.  RX rings will not
change in the middle of set_rxfh() and this will fix the issue.

Fixes: b3d0083caf9a ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()")
Reported-and-tested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/20240625010210.2002310-1-kuba@kernel.org
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240724222106.147744-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agodrm/vmwgfx: Fix overlay when using Screen Targets
Ian Forbes [Fri, 19 Jul 2024 16:36:27 +0000 (11:36 -0500)] 
drm/vmwgfx: Fix overlay when using Screen Targets

[ Upstream commit cb372a505a994cb39aa75acfb8b3bcf94787cf94 ]

This code was never updated to support Screen Targets.
Fixes a bug where Xv playback displays a green screen instead of actual
video contents when 3D acceleration is disabled in the guest.

Fixes: c8261a961ece ("vmwgfx: Major KMS refactoring / cleanup in preparation of screen targets")
Reported-by: Doug Brown <doug@schmorgal.com>
Closes: https://lore.kernel.org/all/bd9cb3c7-90e8-435d-bc28-0e38fee58977@schmorgal.com
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Tested-by: Doug Brown <doug@schmorgal.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719163627.20888-1-ian.forbes@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agodrm/vmwgfx: Make sure the screen surface is ref counted
Zack Rusin [Mon, 22 Jul 2024 18:41:14 +0000 (14:41 -0400)] 
drm/vmwgfx: Make sure the screen surface is ref counted

[ Upstream commit 09f34a00272d2311f6e5d64ed8ad824ef78f7487 ]

Fix races issues in virtual crc generation by making sure the surface
the code uses for crc computation is properly ref counted.

Crc generation was trying to be too clever by allowing the surfaces
to go in and out of scope, with the hope of always having some kind
of screen present. That's not always the code, in particular during
atomic disable, so to make sure the surface, when present, is not
being actively destroyed at the same time, hold a reference to it.

Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: 7b0062036c3b ("drm/vmwgfx: Implement virtual crc generation")
Cc: Zack Rusin <zack.rusin@broadcom.com>
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722184313.181318-3-zack.rusin@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agodrm/nouveau: prime: fix refcount underflow
Danilo Krummrich [Thu, 18 Jul 2024 16:58:46 +0000 (18:58 +0200)] 
drm/nouveau: prime: fix refcount underflow

[ Upstream commit a9bf3efc33f1fbf88787a277f7349459283c9b95 ]

Calling nouveau_bo_ref() on a nouveau_bo without initializing it (and
hence the backing ttm_bo) leads to a refcount underflow.

Instead of calling nouveau_bo_ref() in the unwind path of
drm_gem_object_init(), clean things up manually.

Fixes: ab9ccb96a6e6 ("drm/nouveau: use prime helpers")
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718165959.3983-2-dakr@kernel.org
(cherry picked from commit 1b93f3e89d03cfc576636e195466a0d728ad8de5)
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agodrm/gpuvm: fix missing dependency to DRM_EXEC
Danilo Krummrich [Mon, 15 Jul 2024 13:51:33 +0000 (15:51 +0200)] 
drm/gpuvm: fix missing dependency to DRM_EXEC

[ Upstream commit eeb1f825b5dc68047a0556e5ae86d1467920db41 ]

In commit 50c1a36f594b ("drm/gpuvm: track/lock/validate external/evicted
objects") we started using drm_exec, but did not select DRM_EXEC in the
Kconfig for DRM_GPUVM, fix this.

Cc: Christian König <christian.koenig@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 50c1a36f594b ("drm/gpuvm: track/lock/validate external/evicted objects")
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240715135158.133287-1-dakr@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoperf tool: fix dereferencing NULL al->maps
Casey Chen [Mon, 22 Jul 2024 21:15:48 +0000 (15:15 -0600)] 
perf tool: fix dereferencing NULL al->maps

[ Upstream commit 4c17736689ccfc44ec7dcc472577f25c34cf8724 ]

With 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions"),
when cpumode is 3 (macro PERF_RECORD_MISC_HYPERVISOR),
thread__find_map() could return with al->maps being NULL.

The path below could add a callchain_cursor_node with NULL ms.maps.

add_callchain_ip()
  thread__find_symbol(.., &al)
    thread__find_map(.., &al)   // al->maps becomes NULL
  ms.maps = maps__get(al.maps)
  callchain_cursor_append(..., &ms, ...)
    node->ms.maps = maps__get(ms->maps)

Then the path below would dereference NULL maps and get segfault.

fill_callchain_info()
  maps__machine(node->ms.maps);

Fix it by checking if maps is NULL in fill_callchain_info().

Fixes: 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions")
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: yzhong@purestorage.com
Link: https://lore.kernel.org/r/20240722211548.61455-1-cachen@purestorage.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoHID: amd_sfh: Move sensor discovery before HID device initialization
Basavaraj Natikar [Thu, 18 Jul 2024 11:16:16 +0000 (16:46 +0530)] 
HID: amd_sfh: Move sensor discovery before HID device initialization

[ Upstream commit 8031b001da700474c11d28629581480b12a0d8d4 ]

Sensors discovery is independent of HID device initialization. If sensor
discovery fails after HID initialization, then the HID device needs to be
deinitialized. Therefore, sensors discovery should be moved before HID
device initialization.

Fixes: 7bcfdab3f0c6 ("HID: amd_sfh: if no sensors are enabled, clean up")
Tested-by: Aurinko <petrvelicka@tuta.io>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Link: https://patch.msgid.link/20240718111616.3012155-1-Basavaraj.Natikar@amd.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoARM: 9408/1: mm: CFI: Fix some erroneous reset prototypes
Linus Walleij [Thu, 27 Jun 2024 08:22:09 +0000 (09:22 +0100)] 
ARM: 9408/1: mm: CFI: Fix some erroneous reset prototypes

[ Upstream commit 657a292d679ae3a6c733ab0e939e24ae44b20faf ]

I somehow got a few cpu_nn_reset() signatures wrong in my
patch. Fix it up.

Closes: https://lore.kernel.org/oe-kbuild-all/202406260432.6WGV2jCk-lkp@intel.com/
Fixes: 393999fa9627 ("ARM: 9389/2: mm: Define prototypes for all per-processor calls")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoARM: 9406/1: Fix callchain_trace() return value
Jinjie Ruan [Thu, 27 Jun 2024 07:29:59 +0000 (08:29 +0100)] 
ARM: 9406/1: Fix callchain_trace() return value

[ Upstream commit 4e7b4ff2dcaed228cb2fb7bfe720262c98ec1bb9 ]

perf_callchain_store() return 0 on success, -1 otherwise, fix
callchain_trace() to return correct bool value. So walk_stackframe() can
have a chance to stop walking the stack ahead.

Fixes: 70ccc7c0667b ("ARM: 9258/1: stacktrace: Make stack walk callback consistent with generic code")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoMIPS: dts: loongson: Fix ls2k1000-rtc interrupt
Jiaxun Yang [Fri, 14 Jun 2024 15:40:11 +0000 (16:40 +0100)] 
MIPS: dts: loongson: Fix ls2k1000-rtc interrupt

[ Upstream commit f70fd92df7529e7283e02a6c3a2510075f13ba30 ]

The correct interrupt line for RTC is line 8 on liointc1.

Fixes: e47084e116fc ("MIPS: Loongson64: DTS: Add RTC support to Loongson-2K1000")
Cc: stable@vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoMIPS: dts: loongson: Fix liointc IRQ polarity
Jiaxun Yang [Fri, 14 Jun 2024 15:40:10 +0000 (16:40 +0100)] 
MIPS: dts: loongson: Fix liointc IRQ polarity

[ Upstream commit dbb69b9d6234aad23b3ecd33e5bc8a8ae1485b7d ]

All internal liointc interrupts are high level triggered.

Fixes: b1a792601f26 ("MIPS: Loongson64: DeviceTree for Loongson-2K1000")
Cc: stable@vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoMIPS: Loongson64: DTS: Fix PCIe port nodes for ls7a
Jiaxun Yang [Tue, 7 May 2024 18:51:22 +0000 (19:51 +0100)] 
MIPS: Loongson64: DTS: Fix PCIe port nodes for ls7a

[ Upstream commit d89a415ff8d5e0aad4963f2d8ebb0f9e8110b7fa ]

Add various required properties to silent warnings:

arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi:116.16-297.5: Warning (interrupt_provider): /bus@10000000/pci@1a000000: '#interrupt-cells' found, but node is not an interrupt provider
arch/mips/boot/dts/loongson/loongson64_2core_2k1000.dtb: Warning (interrupt_map): Failed prerequisite 'interrupt_provider'

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Stable-dep-of: dbb69b9d6234 ("MIPS: dts: loongson: Fix liointc IRQ polarity")
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoperf/x86/intel: Add a distinct name for Granite Rapids
Kan Liang [Mon, 8 Jul 2024 19:33:35 +0000 (12:33 -0700)] 
perf/x86/intel: Add a distinct name for Granite Rapids

[ Upstream commit fa0c1c9d283b37fdb7fc1dcccbb88fc8f48a4aa4 ]

Currently, the Sapphire Rapids and Granite Rapids share the same PMU
name, sapphire_rapids. Because from the kernel’s perspective, GNR is
similar to SPR. The only key difference is that they support different
extra MSRs. The code path and the PMU name are shared.

However, from end users' perspective, they are quite different. Besides
the extra MSRs, GNR has a newer PEBS format, supports Retire Latency,
supports new CPUID enumeration architecture, doesn't required the
load-latency AUX event, has additional TMA Level 1 Architectural Events,
etc. The differences can be enumerated by CPUID or the PERF_CAPABILITIES
MSR. They weren't reflected in the model-specific kernel setup.
But it is worth to have a distinct PMU name for GNR.

Fixes: a6742cb90b56 ("perf/x86/intel: Fix the FRONTEND encoding on GNR and MTL")
Suggested-by: Ahmad Yasin <ahmad.yasin@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20240708193336.1192217-3-kan.liang@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoperf/x86/intel: Switch to new Intel CPU model defines
Tony Luck [Mon, 20 May 2024 22:46:02 +0000 (15:46 -0700)] 
perf/x86/intel: Switch to new Intel CPU model defines

[ Upstream commit d142df13f3574237688c7a20e0019cccc7ae39eb ]

New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20240520224620.9480-32-tony.luck%40intel.com
Stable-dep-of: fa0c1c9d283b ("perf/x86/intel: Add a distinct name for Granite Rapids")
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoperf: imx_perf: fix counter start and config sequence
Xu Yang [Wed, 29 May 2024 08:03:55 +0000 (16:03 +0800)] 
perf: imx_perf: fix counter start and config sequence

[ Upstream commit ac9aa295f7a89d38656739628796f086f0b160e2 ]

In current driver, the counter will start firstly and then be configured.
This sequence is not correct for AXI filter events since the correct
AXI_MASK and AXI_ID are not set yet. Then the results may be inaccurate.

Reviewed-by: Frank Li <Frank.Li@nxp.com>
Fixes: 55691f99d417 ("drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver")
cc: stable@vger.kernel.org
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240529080358.703784-5-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agof2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
Jaegeuk Kim [Tue, 18 Jun 2024 02:15:38 +0000 (02:15 +0000)] 
f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid

[ Upstream commit 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1 ]

mkdir /mnt/test/comp
f2fs_io setflags compression /mnt/test/comp
dd if=/dev/zero of=/mnt/test/comp/testfile bs=16k count=1
truncate --size 13 /mnt/test/comp/testfile

In the above scenario, we can get a BUG_ON.
 kernel BUG at fs/f2fs/segment.c:3589!
 Call Trace:
  do_write_page+0x78/0x390 [f2fs]
  f2fs_outplace_write_data+0x62/0xb0 [f2fs]
  f2fs_do_write_data_page+0x275/0x740 [f2fs]
  f2fs_write_single_data_page+0x1dc/0x8f0 [f2fs]
  f2fs_write_multi_pages+0x1e5/0xae0 [f2fs]
  f2fs_write_cache_pages+0xab1/0xc60 [f2fs]
  f2fs_write_data_pages+0x2d8/0x330 [f2fs]
  do_writepages+0xcf/0x270
  __writeback_single_inode+0x44/0x350
  writeback_sb_inodes+0x242/0x530
  __writeback_inodes_wb+0x54/0xf0
  wb_writeback+0x192/0x310
  wb_workfn+0x30d/0x400

The reason is we gave CURSEG_ALL_DATA_ATGC to COMPR_ADDR where the
page was set the gcing flag by set_cluster_dirty().

Cc: stable@vger.kernel.org
Fixes: 4961acdd65c9 ("f2fs: fix to tag gcing flag on page during block migration")
Reviewed-by: Chao Yu <chao@kernel.org>
Tested-by: Will McVicker <willmcvicker@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agof2fs: fix to avoid use SSR allocate when do defragment
Zhiguo Niu [Wed, 29 May 2024 09:47:00 +0000 (17:47 +0800)] 
f2fs: fix to avoid use SSR allocate when do defragment

[ Upstream commit 21327a042dd94bc73181d7300e688699cb1f467e ]

SSR allocate mode will be used when doing file defragment
if ATGC is working at the same time, that is because
set_page_private_gcing may make CURSEG_ALL_DATA_ATGC segment
type got in f2fs_allocate_data_block when defragment page
is writeback, which may cause file fragmentation is worse.

A file with 2 fragmentations is changed as following after defragment:

----------------file info-------------------
sensorsdata :
--------------------------------------------
dev       [254:48]
ino       [0x    3029 : 12329]
mode      [0x    81b0 : 33200]
nlink     [0x       1 : 1]
uid       [0x    27e6 : 10214]
gid       [0x    27e6 : 10214]
size      [0x  242000 : 2367488]
blksize   [0x    1000 : 4096]
blocks    [0x    1210 : 4624]
--------------------------------------------

file_pos   start_blk     end_blk        blks
       0    11361121    11361207          87
  356352    11361215    11361216           2
  364544    11361218    11361218           1
  368640    11361220    11361221           2
  376832    11361224    11361225           2
  385024    11361227    11361238          12
  434176    11361240    11361252          13
  487424    11361254    11361254           1
  491520    11361271    11361279           9
  528384     3681794     3681795           2
  536576     3681797     3681797           1
  540672     3681799     3681799           1
  544768     3681803     3681803           1
  548864     3681805     3681805           1
  552960     3681807     3681807           1
  557056     3681809     3681809           1

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: 8cb1f4080dd9 ("f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoext4: check the extent status again before inserting delalloc block
Zhang Yi [Fri, 17 May 2024 12:39:57 +0000 (20:39 +0800)] 
ext4: check the extent status again before inserting delalloc block

[ Upstream commit 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c ]

ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).

If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.

Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
   folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
   lock.

But this can race between ext4_page_mkwrite() & ext4 fallocate path

ext4_page_mkwrite()             ext4_fallocate()
 block_page_mkwrite()
  ext4_da_map_blocks()
   //find hole in extent status tree
                                 ext4_alloc_file_blocks()
                                  ext4_map_blocks()
                                   //allocate block and unwritten extent
   ext4_insert_delayed_block()
    ext4_da_reserve_space()
     //reserve one more block
    ext4_es_insert_delayed_block()
     //drop unwritten extent and add delayed extent by mistake

Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:

 EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!

Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.

Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agoext4: factor out a common helper to query extent map
Zhang Yi [Fri, 17 May 2024 12:39:56 +0000 (20:39 +0800)] 
ext4: factor out a common helper to query extent map

[ Upstream commit 8e4e5cdf2fdeb99445a468b6b6436ad79b9ecb30 ]

Factor out a new common helper ext4_map_query_blocks() from the
ext4_da_map_blocks(), it query and return the extent map status on the
inode's extent path, no logic changes.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/20240517124005.347221-2-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 0ea6560abb3b ("ext4: check the extent status again before inserting delalloc block")
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agomm/migrate: putback split folios when numa hint migration fails
Peter Xu [Mon, 8 Jul 2024 21:55:37 +0000 (17:55 -0400)] 
mm/migrate: putback split folios when numa hint migration fails

[ Upstream commit 6e49019db5f7a09a9c0e8ac4d108e656c3f8e583 ]

This issue is not from any report yet, but by code observation only.

This is yet another fix besides Hugh's patch [1] but on relevant code
path, where eager split of folio can happen if the folio is already on
deferred list during a folio migration.

Here the issue is NUMA path (migrate_misplaced_folio()) may start to
encounter such folio split now even with MR_NUMA_MISPLACED hint applied.
Then when migrate_pages() didn't migrate all the folios, it's possible the
split small folios be put onto the list instead of the original folio.
Then putting back only the head page won't be enough.

Fix it by putting back all the folios on the list.

[1] https://lore.kernel.org/all/46c948b4-4dd8-6e03-4c7b-ce4e81cfa536@google.com/

[akpm@linux-foundation.org: remove now unused local `nr_pages']
Link: https://lkml.kernel.org/r/20240708215537.2630610-1-peterx@redhat.com
Fixes: 7262f208ca68 ("mm/migrate: split source folio if it is on deferred split list")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agomm/migrate: move NUMA hinting fault folio isolation + checks under PTL
David Hildenbrand [Thu, 20 Jun 2024 21:29:35 +0000 (23:29 +0200)] 
mm/migrate: move NUMA hinting fault folio isolation + checks under PTL

[ Upstream commit ee86814b0562f18255b55c5e6a01a022895994cf ]

Currently we always take a folio reference even if migration will not even
be tried or isolation failed, requiring us to grab+drop an additional
reference.

Further, we end up calling folio_likely_mapped_shared() while the folio
might have already been unmapped, because after we dropped the PTL, that
can easily happen.  We want to stop touching mapcounts and friends from
such context, and only call folio_likely_mapped_shared() while the folio
is still mapped: mapcount information is pretty much stale and unreliable
otherwise.

So let's move checks into numamigrate_isolate_folio(), rename that
function to migrate_misplaced_folio_prepare(), and call that function from
callsites where we call migrate_misplaced_folio(), but still with the PTL
held.

We can now stop taking temporary folio references, and really only take a
reference if folio isolation succeeded.  Doing the
folio_likely_mapped_shared() + folio isolation under PT lock is now
similar to how we handle MADV_PAGEOUT.

While at it, combine the folio_is_file_lru() checks.

[david@redhat.com: fix list_del() corruption]
Link: https://lkml.kernel.org/r/8f85c31a-e603-4578-bf49-136dae0d4b69@redhat.com
Link: https://lkml.kernel.org/r/20240626191129.658CFC32782@smtp.kernel.org
Link: https://lkml.kernel.org/r/20240620212935.656243-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 6e49019db5f7 ("mm/migrate: putback split folios when numa hint migration fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agomm/migrate: make migrate_misplaced_folio() return 0 on success
David Hildenbrand [Thu, 20 Jun 2024 21:29:34 +0000 (23:29 +0200)] 
mm/migrate: make migrate_misplaced_folio() return 0 on success

[ Upstream commit 4b88c23ab8c9bc3857f7c8847e2c6bed95185530 ]

Patch series "mm/migrate: move NUMA hinting fault folio isolation + checks
under PTL".

Let's just return 0 on success, which is less confusing.

...  especially because we got it wrong in the migrate.h stub where we
have "return -EAGAIN; /* can't migrate now */" instead of "return 0;".
Likely this wrong return value doesn't currently matter, but it certainly
adds confusion.

We'll add migrate_misplaced_folio_prepare() next, where we want to use the
same "return 0 on success" approach, so let's just clean this up.

Link: https://lkml.kernel.org/r/20240620212935.656243-1-david@redhat.com
Link: https://lkml.kernel.org/r/20240620212935.656243-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 6e49019db5f7 ("mm/migrate: putback split folios when numa hint migration fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agomm: fix khugepaged activation policy
Ryan Roberts [Thu, 4 Jul 2024 09:10:50 +0000 (10:10 +0100)] 
mm: fix khugepaged activation policy

[ Upstream commit 00f58104202c472e487f0866fbd38832523fd4f9 ]

Since the introduction of mTHP, the docuementation has stated that
khugepaged would be enabled when any mTHP size is enabled, and disabled
when all mTHP sizes are disabled.  There are 2 problems with this; 1.
this is not what was implemented by the code and 2.  this is not the
desirable behavior.

Desirable behavior is for khugepaged to be enabled when any PMD-sized THP
is enabled, anon or file.  (Note that file THP is still controlled by the
top-level control so we must always consider that, as well as the PMD-size
mTHP control for anon).  khugepaged only supports collapsing to PMD-sized
THP so there is no value in enabling it when PMD-sized THP is disabled.
So let's change the code and documentation to reflect this policy.

Further, per-size enabled control modification events were not previously
forwarded to khugepaged to give it an opportunity to start or stop.
Consequently the following was resulting in khugepaged eroneously not
being activated:

  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled

[ryan.roberts@arm.com: v3]
Link: https://lkml.kernel.org/r/20240705102849.2479686-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20240705102849.2479686-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20240704091051.2411934-1-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
Closes: https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
15 months agomm/huge_memory: mark racy access onhuge_anon_orders_always
Ran Xiaokai [Wed, 15 May 2024 02:47:54 +0000 (10:47 +0800)] 
mm/huge_memory: mark racy access onhuge_anon_orders_always

[ Upstream commit 7f83bf14603ef41a44dc907594d749a283e22c37 ]

huge_anon_orders_always is accessed lockless, it is better to use the
READ_ONCE() wrapper.  This is not fixing any visible bug, hopefully this
can cease some KCSAN complains in the future.  Also do that for
huge_anon_orders_madvise.

Link: https://lkml.kernel.org/r/20240515104754889HqrahFPePOIE1UlANHVAh@zte.com.cn
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 00f58104202c ("mm: fix khugepaged activation policy")
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoLinux 6.10.3 v6.10.3
Greg Kroah-Hartman [Sat, 3 Aug 2024 07:01:09 +0000 (09:01 +0200)] 
Linux 6.10.3

Link: https://lore.kernel.org/r/20240730151724.637682316@linuxfoundation.org
Tested-by: Ronald Warsow <rwarsow@gmx.de>
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240731073248.306752137@linuxfoundation.org
Link: https://lore.kernel.org/r/20240731095022.970699670@linuxfoundation.org
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Mark Brown <broonie@kernel.org>
Tested-by: Luna Jernberg <droidbittin@gmail.com>
Tested-by: Ronald Warsow <rwarsow@gmx.de>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Christian Heusel <christian@heusel.eu>
Tested-by: Markus Reichelt <lkt+2023@mareichelt.com>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Allen Pais <apais@linux.microsoft.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Jiri Slaby <jirislaby@kernel.org>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agoselftests/bpf: DENYLIST.aarch64: Skip fexit_sleep again
Daniel Borkmann [Fri, 12 Jul 2024 16:12:30 +0000 (18:12 +0200)] 
selftests/bpf: DENYLIST.aarch64: Skip fexit_sleep again

commit 517125f6749402e579f715519147145944f12ad9 upstream.

Revert commit 90dc946059b7 ("selftests/bpf: DENYLIST.aarch64: Remove
fexit_sleep") again. The fix in 19d3c179a377 ("bpf, arm64: Fix trampoline
for BPF_TRAMP_F_CALL_ORIG") does not address all of the issues and BPF
CI is still hanging and timing out:

   https://github.com/kernel-patches/bpf/actions/runs/9905842936/job/27366435436

   [...]
   #89/11   fexit_bpf2bpf/func_replace_global_func:OK
   #89/12   fexit_bpf2bpf/fentry_to_cgroup_bpf:OK
   #89/13   fexit_bpf2bpf/func_replace_progmap:OK
   #89      fexit_bpf2bpf:OK
   Error: The operation was canceled.

Thus more investigation work & fixing is needed before the test can be put
in place again.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/bpf/20240705145009.32340-1-puranjay@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agoselinux,smack: remove the capability checks in the removexattr hooks
Paul Moore [Wed, 3 Jul 2024 21:00:20 +0000 (17:00 -0400)] 
selinux,smack: remove the capability checks in the removexattr hooks

commit dd44477e7fa15ba3b100dfc67bf7cf083f3dccf6 upstream.

Commit 61df7b828204 ("lsm: fixup the inode xattr capability handling")
moved the responsibility of doing the inode xattr capability checking
out of the individual LSMs and into the LSM framework itself.
Unfortunately, while the original commit added the capability checks
to both the setxattr and removexattr code in the LSM framework, it
only removed the setxattr capability checks from the individual LSMs,
leaving duplicated removexattr capability checks in both the SELinux
and Smack code.

This patch removes the duplicated code from SELinux and Smack.

Fixes: 61df7b828204 ("lsm: fixup the inode xattr capability handling")
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agopowerpc/configs: Update defconfig with now user-visible CONFIG_FSL_IFC
Esben Haabendal [Thu, 30 May 2024 14:46:37 +0000 (16:46 +0200)] 
powerpc/configs: Update defconfig with now user-visible CONFIG_FSL_IFC

commit 45547a0a93d85f704b49788cde2e1d9ab9cd363b upstream.

With CONFIG_FSL_IFC now being user-visible, and thus changed from a select
to depends in CONFIG_MTD_NAND_FSL_IFC, the dependencies needs to be
selected in defconfigs.

Depends-on: 9ba0cae3cac0 ("memory: fsl_ifc: Make FSL_IFC config visible and selectable")
Signed-off-by: Esben Haabendal <esben@geanix.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240530-fsl-ifc-config-v3-2-1fd2c3d233dd@geanix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
16 months agoperf dso: Fix build when libunwind is enabled
James Clark [Mon, 15 Jul 2024 09:47:13 +0000 (10:47 +0100)] 
perf dso: Fix build when libunwind is enabled

[ Upstream commit 92717bc077892d1ce60fee07aee3a33f33909b85 ]

Now that symsrc_filename is always accessed through an accessor, we also
need a free() function for it to avoid the following compilation error:

  util/unwind-libunwind-local.c:416:12: error: lvalue required as unary
    ‘&’ operand
  416 |      zfree(&dso__symsrc_filename(dso));

Fixes: 1553419c3c10 ("perf dso: Fix address sanitizer build")
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240715094715.3914813-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agowifi: ath12k: fix mbssid max interface advertisement
Karthikeyan Periyasamy [Thu, 13 Jun 2024 15:38:13 +0000 (21:08 +0530)] 
wifi: ath12k: fix mbssid max interface advertisement

[ Upstream commit 253ec89c9013b0e61f5c54344df7c065d54934d1 ]

The Current method for advertising the maximum MBSSID interface count
assumes single radio per wiphy (multi wiphy model). However, this
assumption is incorrect for multi radio per wiphy (single wiphy model).
Therefore, populate the parameter for each radio present in the MAC
abstraction layer (ah). This approach ensure scalability for both single
wiphy and multi wiphy models.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: 519a545cfee7 ("wifi: ath12k: advertise driver capabilities for MBSSID and EMA")
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240613153813.3509837-1-quic_periyasa@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agofs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT
Seth Forshee (DigitalOcean) [Wed, 24 Jul 2024 14:53:59 +0000 (09:53 -0500)] 
fs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT

[ Upstream commit e1c5ae59c0f22f7fe5c07fb5513a29e4aad868c9 ]

Christian noticed that it is possible for a privileged user to mount
most filesystems with a non-initial user namespace in sb->s_user_ns.
When fsopen() is called in a non-init namespace the caller's namespace
is recorded in fs_context->user_ns. If the returned file descriptor is
then passed to a process priviliged in init_user_ns, that process can
call fsconfig(fd_fs, FSCONFIG_CMD_CREATE), creating a new superblock
with sb->s_user_ns set to the namespace of the process which called
fsopen().

This is problematic. We cannot assume that any filesystem which does not
set FS_USERNS_MOUNT has been written with a non-initial s_user_ns in
mind, increasing the risk for bugs and security issues.

Prevent this by returning EPERM from sget_fc() when FS_USERNS_MOUNT is
not set for the filesystem and a non-initial user namespace will be
used. sget() does not need to be updated as it always uses the user
namespace of the current context, or the initial user namespace if
SB_SUBMOUNT is set.

Fixes: cb50b348c71f ("convenience helpers: vfs_get_super() and sget_fc()")
Reported-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
Link: https://lore.kernel.org/r/20240724-s_user_ns-fix-v1-1-895d07c94701@kernel.org
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoi3c: mipi-i3c-hci: Fix number of DAT/DCT entries for HCI versions < 1.1
Jarkko Nikula [Fri, 14 Jun 2024 14:02:08 +0000 (17:02 +0300)] 
i3c: mipi-i3c-hci: Fix number of DAT/DCT entries for HCI versions < 1.1

[ Upstream commit be90ae1ba14a83962b33c4d4c854ef081186b0e4 ]

I was wrong about the TABLE_SIZE field description in the
commit 0676bfebf576 ("i3c: mipi-i3c-hci: Fix DAT/DCT entry sizes").

For the MIPI I3C HCI versions 1.0 and earlier the TABLE_SIZE field in
the registers DAT_SECTION_OFFSET and DCT_SECTION_OFFSET is indeed defined
in DWORDs and not number of entries like it is defined in later versions.

Where above fix allowed driver initialization to continue the wrongly
interpreted TABLE_SIZE field leads variables DAT_entries being twice and
DCT_entries four times as big as they really are.

That in turn leads clearing the DAT table over the boundary in the
dat_v1.c: hci_dat_v1_init().

So interprete the TABLE_SIZE field in DWORDs for HCI versions < 1.1 and
fix number of DAT/DCT entries accordingly.

Fixes: 0676bfebf576 ("i3c: mipi-i3c-hci: Fix DAT/DCT entry sizes")
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agonvme-pci: add missing condition check for existence of mapped data
Leon Romanovsky [Wed, 24 Jul 2024 10:31:14 +0000 (13:31 +0300)] 
nvme-pci: add missing condition check for existence of mapped data

[ Upstream commit c31fad1470389666ac7169fe43aa65bf5b7e2cfd ]

nvme_map_data() is called when request has physical segments, hence
the nvme_unmap_data() should have same condition to avoid dereference.

Fixes: 4aedb705437f ("nvme-pci: split metadata handling from nvme_map_data / nvme_unmap_data")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoapparmor: unpack transition table if dfa is not present
Georgia Garcia [Mon, 10 Jun 2024 12:51:48 +0000 (09:51 -0300)] 
apparmor: unpack transition table if dfa is not present

[ Upstream commit e0ff0cff1f6cdce0aa596aac04129893201c4162 ]

Due to a bug in earlier userspaces, a transition table may be present
even when the dfa is not. Commit 7572fea31e3e
("apparmor: convert fperm lookup to use accept as an index") made the
verification check more rigourous regressing old userspaces with
the bug. For compatibility reasons allow the orphaned transition table
during unpack and discard.

Fixes: 7572fea31e3e ("apparmor: convert fperm lookup to use accept as an index")
Signed-off-by: Georgia Garcia <georgia.garcia@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoublk: fix UBLK_CMD_DEL_DEV_ASYNC handling
Ming Lei [Wed, 24 Jul 2024 14:33:11 +0000 (22:33 +0800)] 
ublk: fix UBLK_CMD_DEL_DEV_ASYNC handling

[ Upstream commit 55fbb9a5d64e0e590cad5eacc16c99f2482a008f ]

In ublk_ctrl_uring_cmd(), ioctl command NR should be used for
matching _IOC_NR(cmd_op).

Fix it by adding one private macro, and this way is clean.

Fixes: 13fe8e6825e4 ("ublk: add UBLK_CMD_DEL_DEV_ASYNC")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20240724143311.2646330-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoio_uring: fix io_match_task must_hold
Pavel Begunkov [Wed, 24 Jul 2024 11:16:18 +0000 (12:16 +0100)] 
io_uring: fix io_match_task must_hold

[ Upstream commit e142e9cd8891b0c6f277ac2c2c254199a6aa56e3 ]

The __must_hold annotation in io_match_task() uses a non existing
parameter "req", fix it.

Fixes: 6af3f48bf6156 ("io_uring: fix link traversal locking")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3e65ee7709e96507cef3d93291746f2c489f2307.1721819383.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agothermal: core: Back off when polling thermal zones on errors
Rafael J. Wysocki [Thu, 18 Jul 2024 19:01:14 +0000 (21:01 +0200)] 
thermal: core: Back off when polling thermal zones on errors

[ Upstream commit f7c1b0e4ae47e67c6f9af84568a5f4a80638ccd8 ]

Commit a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone
temperature is invalid") introduced a polling mechanism by which the
thermal core attampts to get a valid temperature value for thermal zones
where the .get_temp() callback returns errors to start with (for
example, due to initialization ordering woes).  However, this polling is
carried out periodically ad infinitum and every iteration of it causes
a message to be printed to the kernel log which means a lot of log noise
on systems where there are thermal zones that never get ready for some
reason.  It is also not really useful to continuously poll thermal zones
that never respond.

To address this, modify the thermal core to increase the delay between
consecutive thermal zone temperature checks after every check that fails
until it reaches a certain maximum value.  At that point, the thermal
zone in question will be disabled, but user space will be able to
reenable it if it believes that the failure is transient.

Also change the code to print messages regarding failed temperature
checks to the kernel log only twice, once when the thermal zone's
.get_temp() callback returns an error for the first time and once when
disabling the given thermal zone.  In addition, a dev_crit() message
will be printed at that point if the given thermal zone contains a
critical trip point to notify the system operator about the situation.

Fixes: a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid")
Link: https://lore.kernel.org/linux-acpi/CAGnHSE=RyPK++UG0-wAtVKgeJxe0uzFYgLxm+RUOKKoQquW=Ow@mail.gmail.com/
Reported-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2962033.e9J7NaK4W3@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agothermal: trip: Split thermal_zone_device_set_mode()
Rafael J. Wysocki [Thu, 18 Jul 2024 19:00:35 +0000 (21:00 +0200)] 
thermal: trip: Split thermal_zone_device_set_mode()

[ Upstream commit e5f98896efb3b6350cb6f1c241394966dcbcf240 ]

Pull a wrapper around thermal zone .change_mode() callback out of
thermal_zone_device_set_mode() because it will be used elsewhere
subsequently.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2206793.irdbgypaU6@rjwysocki.net
Stable-dep-of: f7c1b0e4ae47 ("thermal: core: Back off when polling thermal zones on errors")
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoiommu: sprd: Avoid NULL deref in sprd_iommu_hw_en
Artem Chernyshev [Tue, 16 Jul 2024 12:55:14 +0000 (15:55 +0300)] 
iommu: sprd: Avoid NULL deref in sprd_iommu_hw_en

[ Upstream commit 630482ee0653decf9e2482ac6181897eb6cde5b8 ]

In sprd_iommu_cleanup() before calling function sprd_iommu_hw_en()
dom->sdev is equal to NULL, which leads to null dereference.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 9afea57384d4 ("iommu/sprd: Release dma buffer to avoid memory leak")
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Reviewed-by: Chunyan Zhang <zhang.lyra@gmail.com>
Link: https://lore.kernel.org/r/20240716125522.3690358-1-artem.chernyshev@red-soft.ru
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agos390/cpum_cf: Fix endless loop in CF_DIAG event stop
Thomas Richter [Mon, 15 Jul 2024 10:07:29 +0000 (12:07 +0200)] 
s390/cpum_cf: Fix endless loop in CF_DIAG event stop

[ Upstream commit e6ce1f12d777f6ee22b20e10ae6a771e7e6f44f5 ]

Event CF_DIAG reads out complete counter sets using stcctm
instruction. This is done at event start time when the process
starts execution and at event stop time when the process is
removed from the CPU. During removal the difference of each
counter in the counter sets is calculated and saved as raw data
in the ring buffer. This works fine unless the number of counters
in a counter set is zero. This may happen for the extended counter
set. This set is machine specific and the size of the counter
set can be zero even when extended counter set is authorized for
read access.

This case is not handled. cfdiag_diffctr() checks authorization
of the extended counter set. If true the functions assumes
the extended counter set has been saved in a data buffer. However
this is not the case, cfdiag_getctrset() does not save a counter
set with counter set size of zero. This mismatch causes an endless
loop in the counter set readout during event stop handling.

The calculation of the difference of the counters in each counter
now verifies the size of the counter set is non-zero. A counter set
with size zero is skipped.

Fixes: a029a4eab39e ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agos390/setup: Fix __pa/__va for modules under non-GPL licenses
Vasily Gorbik [Thu, 11 Jul 2024 13:50:26 +0000 (15:50 +0200)] 
s390/setup: Fix __pa/__va for modules under non-GPL licenses

[ Upstream commit e188e5d5ffd01d484b5255b88739fcf67b300223 ]

The struct vm_layout contains fields used in __pa/__va calculations. Such
fundamental things have to be exported with EXPORT_SYMBOL to avoid
breakages of out-of-tree modules under non-GPL licenses.

Fixes: 7de0446f0b26 ("s390/boot: Make identity mapping base address explicit")
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agos390/pci: Allow allocation of more than 1 MSI interrupt
Gerd Bayer [Thu, 11 Jul 2024 13:45:27 +0000 (15:45 +0200)] 
s390/pci: Allow allocation of more than 1 MSI interrupt

[ Upstream commit ab42fcb511fd9d241bbab7cc3ca04e34e9fc0666 ]

On a PCI adapter that provides up to 8 MSI interrupt sources the s390
implementation of PCI interrupts rejected to accommodate them, although
the underlying hardware is able to support that.

For MSI-X it is sufficient to allocate a single irq_desc per msi_desc,
but for MSI multiple irq descriptors are attached to and controlled by
a single msi descriptor. Add the appropriate loops to maintain multiple
irq descriptors and tie/untie them to/from the appropriate AIBV bit, if
a device driver allocates more than 1 MSI interrupt.

Common PCI code passes on requests to allocate a number of interrupt
vectors based on the device drivers' demand and the PCI functions'
capabilities. However, the root-complex of s390 systems support just a
limited number of interrupt vectors per PCI function.
Produce a kernel log message to inform about any architecture-specific
capping that might be done.

With this change, we had a PCI adapter successfully raising
interrupts to its device driver via all 8 sources.

Fixes: a384c8924a8b ("s390/PCI: Fix single MSI only check")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agos390/pci: Refactor arch_setup_msi_irqs()
Gerd Bayer [Thu, 11 Jul 2024 13:45:26 +0000 (15:45 +0200)] 
s390/pci: Refactor arch_setup_msi_irqs()

[ Upstream commit 5fd11b96b43708f2f6e3964412c301c1bd20ec0f ]

Factor out adapter interrupt allocation from arch_setup_msi_irqs() in
preparation for enabling registration of multiple MSIs. Code movement
only, no change of functionality intended.

Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Stable-dep-of: ab42fcb511fd ("s390/pci: Allow allocation of more than 1 MSI interrupt")
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoceph: fix incorrect kmalloc size of pagevec mempool
ethanwu [Thu, 11 Jul 2024 06:47:56 +0000 (14:47 +0800)] 
ceph: fix incorrect kmalloc size of pagevec mempool

[ Upstream commit 03230edb0bd831662a7c08b6fef66b2a9a817774 ]

The kmalloc size of pagevec mempool is incorrectly calculated.
It misses the size of page pointer and only accounts the number for the array.

Fixes: a0102bda5bc0 ("ceph: move sb->wb_pagevec_pool to be a global mempool")
Signed-off-by: ethanwu <ethanwu@synology.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agotimers/migration: Do not rely always on group->parent
Anna-Maria Behnsen [Tue, 16 Jul 2024 14:19:19 +0000 (16:19 +0200)] 
timers/migration: Do not rely always on group->parent

[ Upstream commit facd40aa5c4699f94014012e4e58414c082f2c01 ]

When reading group->parent without holding the group lock it is racy
against CPUs coming online the first time and thereby creating another
level of the hierarchy. This is not a problem when this value is read once
to decide whether to abort a propagation or not. The worst outcome is an
unnecessary/early CPU wake up. But it is racy when reading it several times
during a single 'action' (like activation, deactivation, checking for
remote timer expiry,...) and relying on the consitency of this value
without holding the lock. This happens at the moment e.g. in
tmigr_inactive_up() which is also calling tmigr_udpate_events(). Code relys
on group->parent not to change during this 'action'.

Update parent struct member description to explain the above only
once. Remove parent pointer checks when they are not mandatory (like update
of data->childmask). Remove a warning, which would be nice but the trigger
of this warning is not reliable and add expand the data structure member
description instead. Expand a comment, why it is safe to rely on parent
pointer here (inside hierarchy update).

Fixes: 7ee988770326 ("timers: Implement the hierarchical pull model")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20240716-tmigr-fixes-v4-1-757baa7803fe@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoASoC: TAS2781: Fix tasdev_load_calibrated_data()
Dan Carpenter [Fri, 19 Jul 2024 23:53:48 +0000 (18:53 -0500)] 
ASoC: TAS2781: Fix tasdev_load_calibrated_data()

[ Upstream commit 92c78222168e9035a9bfb8841c2e56ce23e51f73 ]

This function has a reversed if statement so it's either a no-op or it
leads to a NULL dereference.

Fixes: b195acf5266d ("ASoC: tas2781: Fix wrong loading calibrated data sequence")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/18a29b68-cc85-4139-b7c7-2514e8409a42@stanley.mountain
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoASoC: Intel: use soc_intel_is_byt_cr() only when IOSF_MBI is reachable
Pierre-Louis Bossart [Mon, 22 Jul 2024 08:30:02 +0000 (10:30 +0200)] 
ASoC: Intel: use soc_intel_is_byt_cr() only when IOSF_MBI is reachable

[ Upstream commit 9931f7d5d251882a147cc5811060097df43e79f5 ]

the Intel kbuild bot reports a link failure when IOSF_MBI is built-in
but the Merrifield driver is configured as a module. The
soc-intel-quirks.h is included for Merrifield platforms, but IOSF_MBI
is not selected for that platform.

ld.lld: error: undefined symbol: iosf_mbi_read
>>> referenced by atom.c
>>>               sound/soc/sof/intel/atom.o:(atom_machine_select) in archive vmlinux.a

This patch forces the use of the fallback static inline when IOSF_MBI is not reachable.

Fixes: 536cfd2f375d ("ASoC: Intel: use common helpers to detect CPUs")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407160704.zpdhJ8da-lkp@intel.com/
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20240722083002.10800-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agodrm/xe/pf: Limit fair VF LMEM provisioning
Michal Wajdeczko [Thu, 11 Jul 2024 19:23:19 +0000 (21:23 +0200)] 
drm/xe/pf: Limit fair VF LMEM provisioning

[ Upstream commit bf07ca963d4fd11c88a9d4b058f2bd62e8d46a98 ]

Due to the current design of the BO and VRAM manager, any object
with XE_BO_FLAG_PINNED flag, which the PF driver uses during VF
LMEM provisionining, is created with the TTM_PL_FLAG_CONTIGUOUS
flag, which may cause VRAM fragmentation that prevents subsequent
allocations of larger objects, like fair VF LMEM provisioning.

To avoid such failures, round down fair VF LMEM provisioning size
to next power of two size, to compensate what xe_ttm_vram_mgr is
doing to achieve contiguous allocations.

Fixes: ac6598aed1b3 ("drm/xe/pf: Add support to configure SR-IOV VFs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711192320.1198-2-michal.wajdeczko@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4c3fe5eae46b92e2fd961b19f7779608352e5368)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agodrm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup
Ashutosh Dixit [Thu, 11 Jul 2024 21:12:03 +0000 (14:12 -0700)] 
drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup

[ Upstream commit 408c2f14a5d3d7ac4824b96e52693ab271efb738 ]

Increment num_syncs after xe_sync_entry_parse() is successful to ensure
the xe_sync_entry_cleanup() logic under "err_syncs" label works correctly.

v2: Use the same pattern as that in xe_vm.c (Matt Brost)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711211203.3728180-1-ashutosh.dixit@intel.com
(cherry picked from commit 43a6faa6d9b5e9139758200a79fe9c8f4aaa0c8d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agospi: spidev: add correct compatible for Rohm BH2228FV
Conor Dooley [Wed, 17 Jul 2024 09:59:49 +0000 (10:59 +0100)] 
spi: spidev: add correct compatible for Rohm BH2228FV

[ Upstream commit fc28d1c1fe3b3e2fbc50834c8f73dda72f6af9fc ]

When Maxime originally added the BH2228FV to the spidev driver, he spelt
it incorrectly - the d should have been a b. Add the correctly spelt
compatible to the driver. Although the majority of users of this
compatible are abusers, there is at least one board that validly uses
the incorrect spelt compatible, so keep it in the driver to avoid
breaking the few real users it has.

Fixes: 8fad805bdc52 ("spi: spidev: Add Rohm DH2228FV DAC compatible string")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://patch.msgid.link/20240717-ventricle-strewn-a7678c509e85@spud
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoASoC: sof: amd: fix for firmware reload failure in Vangogh platform
Venkata Prasad Potturu [Thu, 18 Jul 2024 06:20:02 +0000 (11:50 +0530)] 
ASoC: sof: amd: fix for firmware reload failure in Vangogh platform

[ Upstream commit f2038c12e8133bf4c6bd4d1127a23310d55d9e21 ]

Setting ACP ACLK as clock source when ACP enters D0 state causing
firmware load failure, as per design clock source should be internal
clock.

Remove acp_clkmux_sel field so that ACP will use internal clock
source when ACP enters into D0 state.

Fixes: d0dab6b76a9f ("ASoC: SOF: amd: Add sof support for vangogh platform")
Signed-off-by: Venkata Prasad Potturu <venkataprasad.potturu@amd.com>
Link: https://patch.msgid.link/20240718062004.581685-1-venkataprasad.potturu@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoASoC: Intel: Fix RT5650 SSP lookup
Curtis Malainey [Tue, 16 Jul 2024 08:40:12 +0000 (10:40 +0200)] 
ASoC: Intel: Fix RT5650 SSP lookup

[ Upstream commit 6f6a23d42bdfbbfe1b41a23e2c78319a0cc65db3 ]

Commit 8efcd4864652 ("ASoC: Intel: sof_rt5682: use common module for
sof_card_private initialization") migrated the pin assignment in the
context struct up to soc-acpi-intel-ssp-common.c. This uses a lookup
table to see if a device has a amp/codec before assigning the pin. The
issue here arises when combination parts that serve both (with 2 ports)
are used.

sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1f.3/adl_rt5682_def/SSP0-Codec'
CPU: 1 PID: 2079 Comm: udevd Tainted: G     U             6.6.36-03391-g744739e00023 #1 3be1a2880a0970f65545a957db7d08ef4b3e2c0d
Hardware name: Google Anraggar/Anraggar, BIOS Google_Anraggar.15217.552.0 05/07/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x69/0xa0
 sysfs_warn_dup+0x5b/0x70
 sysfs_create_dir_ns+0xb0/0x100
 kobject_add_internal+0x133/0x3c0
 kobject_add+0x66/0xb0
 ? device_add+0x65/0x780
 device_add+0x164/0x780
 snd_soc_add_pcm_runtimes+0x2fa/0x800
 snd_soc_bind_card+0x35e/0xc20
 devm_snd_soc_register_card+0x48/0x90
 platform_probe+0x7b/0xb0
 really_probe+0xf7/0x2a0
 ...
kobject: kobject_add_internal failed for SSP0-Codec with -EEXIST, don't try to register things with the same name in the same directory.

The issue is that the ALC5650 was only defined in the codec table and
not the amp table which left the pin unassigned but the dai link was
still created by the machine driver.

Also patch the suffix filename code for the topology to prevent double
suffix names as a result of this change.

Fixes: 8efcd4864652 ("ASoC: Intel: sof_rt5682: use common module for sof_card_private initialization")
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Curtis Malainey <cujomalainey@chromium.org>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://patch.msgid.link/20240716084012.299257-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoASOC: SOF: Intel: hda-loader: only wait for HDaudio IOC for IPC4 devices
Pierre-Louis Bossart [Tue, 16 Jul 2024 08:45:30 +0000 (10:45 +0200)] 
ASOC: SOF: Intel: hda-loader: only wait for HDaudio IOC for IPC4 devices

[ Upstream commit 9ee3f0d8c9999eb1ef2866e86f8d57d996fc0348 ]

Multiple users report a regression bisected to commit d5263dbbd8af
("ASoC: SOF: Intel: don't ignore IOC interrupts for non-audio
transfers"). The firmware version is the likely suspect, as these
users relied on SOF 2.0 while Intel only tested with the 2.2 release.

Rather than completely disable the wait_for_completion(), which can
help us gather timing information on the different stages of the boot
process, the simplest course of action is to just disable it for older
IPC versions which are no longer under active development.

Closes: https://github.com/thesofproject/linux/issues/5072
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218961
Fixes: d5263dbbd8af ("ASoC: SOF: Intel: don't ignore IOC interrupts for non-audio transfers")
Tested-by: Mike Krinkin <krinkin.m.u@gmail.com>
Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20240716084530.300829-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agonvme-pci: Fix the instructions for disabling power management
Bart Van Assche [Thu, 11 Jul 2024 22:59:52 +0000 (15:59 -0700)] 
nvme-pci: Fix the instructions for disabling power management

[ Upstream commit 92fc2c469eb26060384e9b2cd4cb0cc228aba582 ]

pcie_aspm=off tells the kernel not to modify the ASPM configuration. This
setting does not guarantee that ASPM (Active State Power Management) is
disabled. Hence add pcie_port_pm=off. This disables power management for
all PCIe ports.

This patch has been tested on a workstation with a Samsung SSD 970 EVO Plus
NVMe SSD.

Fixes: 4641a8e6e145 ("nvme-pci: add trouble shooting steps for timeouts")
Cc: Keith Busch <kbusch@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agospi: microchip-core: ensure TX and RX FIFOs are empty at start of a transfer
Steve Wilkins [Mon, 15 Jul 2024 11:13:56 +0000 (12:13 +0100)] 
spi: microchip-core: ensure TX and RX FIFOs are empty at start of a transfer

[ Upstream commit 9cf71eb0faef4bff01df4264841b8465382d7927 ]

While transmitting with rx_len == 0, the RX FIFO is not going to be
emptied in the interrupt handler. A subsequent transfer could then
read crap from the previous transfer out of the RX FIFO into the
start RX buffer. The core provides a register that will empty the RX and
TX FIFOs, so do that before each transfer.

Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers")
Signed-off-by: Steve Wilkins <steve.wilkins@raymarine.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20240715-flammable-provoke-459226d08e70@wendy
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agospi: microchip-core: fix init function not setting the master and motorola modes
Steve Wilkins [Mon, 15 Jul 2024 11:13:55 +0000 (12:13 +0100)] 
spi: microchip-core: fix init function not setting the master and motorola modes

[ Upstream commit 3a5e76283672efddf47cea39ccfe9f5735cc91d5 ]

mchp_corespi_init() reads the CONTROL register, sets the master and
motorola bits, but doesn't write the value back to the register. The
function also doesn't ensure the controller is disabled at the start,
which may present a problem if the controller was used by an
earlier boot stage as some settings (including the mode) can only be
modified while the controller is disabled.

Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers")
Signed-off-by: Steve Wilkins <steve.wilkins@raymarine.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20240715-designing-thus-05f7c26e1da7@wendy
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agospi: microchip-core: only disable SPI controller when register value change requires it
Steve Wilkins [Mon, 15 Jul 2024 11:13:54 +0000 (12:13 +0100)] 
spi: microchip-core: only disable SPI controller when register value change requires it

[ Upstream commit de9850b5c606b754dd7861678d6e2874b96b04f8 ]

Setting up many of the registers for a new SPI transfer involves
unconditionally disabling the SPI controller, writing the register
value and re-enabling the controller. This is being done for registers
even when the value is unchanged and is also done for registers that
don't require the controller to be disabled for the change to take
effect. Make an effort to detect changes to the register values, and
only disables the controller if the new register value is different
and disabling the controller is required. This stops the controller
being repeated disabled and the bus going tristate before every
transfer.

Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers")
Signed-off-by: Steve Wilkins <steve.wilkins@raymarine.com>
Co-developed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20240715-depict-twirl-7e592eeabaad@wendy
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agospi: microchip-core: defer asserting chip select until just before write to TX FIFO
Steve Wilkins [Mon, 15 Jul 2024 11:13:53 +0000 (12:13 +0100)] 
spi: microchip-core: defer asserting chip select until just before write to TX FIFO

[ Upstream commit 22fd98c107c792e35db7abe45298bc3a29bf4723 ]

Setting up many of the registers for a new SPI transfer requires the
SPI controller to be disabled after set_cs() has been called to assert
the chip select line. However, disabling the controller results in the
SCLK and MOSI output pins being tristate, which can cause clock
transitions to be seen by a slave device whilst SS is active. To fix
this, the CS is only set to inactive inline, whilst setting it active
is deferred until all registers are set up and the any controller
disables have been completed.

Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers")
Signed-off-by: Steve Wilkins <steve.wilkins@raymarine.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20240715-sanitizer-recant-dd96b7a97048@wendy
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agospi: microchip-core: fix the issues in the isr
Naga Sureshkumar Relli [Mon, 15 Jul 2024 11:13:52 +0000 (12:13 +0100)] 
spi: microchip-core: fix the issues in the isr

[ Upstream commit 502a582b8dd897d9282db47c0911d5320ef2e6b9 ]

It is possible for the TXDONE interrupt be raised if the tx FIFO becomes
temporarily empty while transmitting, resulting in recursive calls to
mchp_corespi_write_fifo() and therefore a garbage message might be
transmitted depending on when the interrupt is triggered. Moving all of
the tx FIFO writes out of the TXDONE portion of the interrupt handler
avoids this problem.

Most of rest of the TXDONE portion of the handler is problematic too.
Only reading the rx FIFO (and finalising the transfer) when the TXDONE
interrupt is raised can cause the transfer to stall, if the final bytes
of rx data are not available in the rx FIFO when the final TXDONE
interrupt is raised. The transfer should be finalised regardless of
which interrupt is raised, provided that all tx data has been set and
all rx data received.

The first issue was encountered "in the wild", the second is
theoretical.

Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers")
Signed-off-by: Naga Sureshkumar Relli <nagasuresh.relli@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20240715-candied-deforest-585685ef3c8a@wendy
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoASoC: SOF: imx8m: Fix DSP control regmap retrieval
Daniel Baluta [Mon, 15 Jul 2024 15:16:53 +0000 (18:16 +0300)] 
ASoC: SOF: imx8m: Fix DSP control regmap retrieval

[ Upstream commit 2634f745eac25a33f032df32cf98fca8538a534a ]

According to Documentation/devicetree/bindings/dsp/fsl,dsp.yaml
fsl,dsp-ctrl is a phandle to syscon block so we need to use correct
function to retrieve it.

Currently there is no SOF DSP DTS merged into mainline so there is no
need to support the old way of retrieving the dsp control node.

Fixes: 9ba23717b292 ("ASoC: SOF: imx8m: Implement DSP start")
Signed-off-by: Daniel Baluta <daniel.baluta@nxp.com>
Link: https://patch.msgid.link/20240715151653.114751-1-daniel.baluta@oss.nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoauxdisplay: ht16k33: Drop reference after LED registration
Markus Elfring [Tue, 4 Jun 2024 15:02:15 +0000 (17:02 +0200)] 
auxdisplay: ht16k33: Drop reference after LED registration

[ Upstream commit 2ccfe94bc3ac980d2d1df9f7a0b2c6d2137abe55 ]

The reference count is bumped by device_get_named_child_node()
and never dropped. Since LED APIs do not require it to be
bumped by the user, drop the reference after LED registration.

[andy: rewritten the commit message and amended the change]

Fixes: c223d9c636ed ("auxdisplay: ht16k33: Add LED support")
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agolirc: rc_dev_get_from_fd(): fix file leak
Al Viro [Fri, 31 May 2024 03:58:26 +0000 (23:58 -0400)] 
lirc: rc_dev_get_from_fd(): fix file leak

[ Upstream commit bba1f6758a9ec90c1adac5dcf78f8a15f1bad65b ]

missing fdput() on a failure exit

Fixes: 6a9d552483d50 "media: rc: bpf attach/detach requires write permission" # v6.9
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agopowerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
Al Viro [Fri, 31 May 2024 03:54:55 +0000 (23:54 -0400)] 
powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()

[ Upstream commit b4cf5fc01ce83e5c0bcf3dbb9f929428646b9098 ]

missing fdput() on one of the failure exits

Fixes: eacc56bb9de3e # v5.2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoapparmor: Fix null pointer deref when receiving skb during sock creation
Xiao Liang [Sat, 2 Sep 2023 00:48:38 +0000 (08:48 +0800)] 
apparmor: Fix null pointer deref when receiving skb during sock creation

[ Upstream commit fce09ea314505a52f2436397608fa0a5d0934fb1 ]

The panic below is observed when receiving ICMP packets with secmark set
while an ICMP raw socket is being created. SK_CTX(sk)->label is updated
in apparmor_socket_post_create(), but the packet is delivered to the
socket before that, causing the null pointer dereference.
Drop the packet if label context is not set.

    BUG: kernel NULL pointer dereference, address: 000000000000004c
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0 P4D 0
    Oops: 0000 [#1] PREEMPT SMP NOPTI
    CPU: 0 PID: 407 Comm: a.out Not tainted 6.4.12-arch1-1 #1 3e6fa2753a2d75925c34ecb78e22e85a65d083df
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/28/2020
    RIP: 0010:aa_label_next_confined+0xb/0x40
    Code: 00 00 48 89 ef e8 d5 25 0c 00 e9 66 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 89 f0 <8b> 77 4c 39 c6 7e 1f 48 63 d0 48 8d 14 d7 eb 0b 83 c0 01 48 83 c2
    RSP: 0018:ffffa92940003b08 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000e
    RDX: ffffa92940003be8 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: ffff8b57471e7800 R08: ffff8b574c642400 R09: 0000000000000002
    R10: ffffffffbd820eeb R11: ffffffffbeb7ff00 R12: ffff8b574c642400
    R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
    FS:  00007fb092ea7640(0000) GS:ffff8b577bc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000000000004c CR3: 00000001020f2005 CR4: 00000000007706f0
    PKRU: 55555554
    Call Trace:
     <IRQ>
     ? __die+0x23/0x70
     ? page_fault_oops+0x171/0x4e0
     ? exc_page_fault+0x7f/0x180
     ? asm_exc_page_fault+0x26/0x30
     ? aa_label_next_confined+0xb/0x40
     apparmor_secmark_check+0xec/0x330
     security_sock_rcv_skb+0x35/0x50
     sk_filter_trim_cap+0x47/0x250
     sock_queue_rcv_skb_reason+0x20/0x60
     raw_rcv+0x13c/0x210
     raw_local_deliver+0x1f3/0x250
     ip_protocol_deliver_rcu+0x4f/0x2f0
     ip_local_deliver_finish+0x76/0xa0
     __netif_receive_skb_one_core+0x89/0xa0
     netif_receive_skb+0x119/0x170
     ? __netdev_alloc_skb+0x3d/0x140
     vmxnet3_rq_rx_complete+0xb23/0x1010 [vmxnet3 56a84f9c97178c57a43a24ec073b45a9d6f01f3a]
     vmxnet3_poll_rx_only+0x36/0xb0 [vmxnet3 56a84f9c97178c57a43a24ec073b45a9d6f01f3a]
     __napi_poll+0x28/0x1b0
     net_rx_action+0x2a4/0x380
     __do_softirq+0xd1/0x2c8
     __irq_exit_rcu+0xbb/0xf0
     common_interrupt+0x86/0xa0
     </IRQ>
     <TASK>
     asm_common_interrupt+0x26/0x40
    RIP: 0010:apparmor_socket_post_create+0xb/0x200
    Code: 08 48 85 ff 75 a1 eb b1 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 <55> 48 89 fd 53 45 85 c0 0f 84 b2 00 00 00 48 8b 1d 80 56 3f 02 48
    RSP: 0018:ffffa92940ce7e50 EFLAGS: 00000286
    RAX: ffffffffbc756440 RBX: 0000000000000000 RCX: 0000000000000001
    RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff8b574eaab740
    RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
    R10: ffff8b57444cec70 R11: 0000000000000000 R12: 0000000000000003
    R13: 0000000000000002 R14: ffff8b574eaab740 R15: ffffffffbd8e4748
     ? __pfx_apparmor_socket_post_create+0x10/0x10
     security_socket_post_create+0x4b/0x80
     __sock_create+0x176/0x1f0
     __sys_socket+0x89/0x100
     __x64_sys_socket+0x17/0x20
     do_syscall_64+0x5d/0x90
     ? do_syscall_64+0x6c/0x90
     ? do_syscall_64+0x6c/0x90
     ? do_syscall_64+0x6c/0x90
     entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fixes: ab9f2115081a ("apparmor: Allow filtering based on secmark policy")
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agomISDN: Fix a use after free in hfcmulti_tx()
Dan Carpenter [Wed, 24 Jul 2024 16:08:18 +0000 (11:08 -0500)] 
mISDN: Fix a use after free in hfcmulti_tx()

[ Upstream commit 61ab751451f5ebd0b98e02276a44e23a10110402 ]

Don't dereference *sp after calling dev_kfree_skb(*sp).

Fixes: af69fb3a8ffa ("Add mISDN HFC multiport driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8be65f5a-c2dd-4ba0-8a10-bfe5980b8cfb@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoxsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
Stanislav Fomichev [Sat, 13 Jul 2024 01:52:51 +0000 (18:52 -0700)] 
xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len

[ Upstream commit d5e726d9143c5624135f5dc9e4069799adeef734 ]

Julian reports that commit 341ac980eab9 ("xsk: Support tx_metadata_len")
can break existing use cases which don't zero-initialize xdp_umem_reg
padding. Introduce new XDP_UMEM_TX_METADATA_LEN to make sure we
interpret the padding as tx_metadata_len only when being explicitly
asked.

Fixes: 341ac980eab9 ("xsk: Support tx_metadata_len")
Reported-by: Julian Schindel <mail@arctic-alpaca.de>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20240713015253.121248-2-sdf@fomichev.me
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agobpf: Fix a segment issue when downgrading gso_size
Fred Li [Fri, 19 Jul 2024 02:46:53 +0000 (10:46 +0800)] 
bpf: Fix a segment issue when downgrading gso_size

[ Upstream commit fa5ef655615a01533035c6139248c5b33aa27028 ]

Linearize the skb when downgrading gso_size because it may trigger a
BUG_ON() later when the skb is segmented as described in [1,2].

Fixes: 2be7e212d5419 ("bpf: add bpf_skb_adjust_room helper")
Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/all/20240626065555.35460-2-dracodingfly@gmail.com
Link: https://lore.kernel.org/all/668d5cf1ec330_1c18c32947@willemb.c.googlers.com.notmuch
Link: https://lore.kernel.org/bpf/20240719024653.77006-1-dracodingfly@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agonet: mediatek: Fix potential NULL pointer dereference in dummy net_device handling
Breno Leitao [Wed, 24 Jul 2024 08:05:23 +0000 (01:05 -0700)] 
net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling

[ Upstream commit 16f3a28cf5f876a7f3550d8f4c870a7b41bcfaef ]

Move the freeing of the dummy net_device from mtk_free_dev() to
mtk_remove().

Previously, if alloc_netdev_dummy() failed in mtk_probe(),
eth->dummy_dev would be NULL. The error path would then call
mtk_free_dev(), which in turn called free_netdev() assuming dummy_dev
was allocated (but it was not), potentially causing a NULL pointer
dereference.

By moving free_netdev() to mtk_remove(), we ensure it's only called when
mtk_probe() has succeeded and dummy_dev is fully allocated. This
addresses a potential NULL pointer dereference detected by Smatch[1].

Fixes: b209bd6d0bff ("net: mediatek: mtk_eth_sock: allocate dummy net_device dynamically")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/4160f4e0-cbef-4a22-8b5d-42c4d399e1f7@stanley.mountain/ [1]
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240724080524.2734499-1-leitao@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agonet: nexthop: Initialize all fields in dumped nexthops
Petr Machata [Tue, 23 Jul 2024 16:04:16 +0000 (18:04 +0200)] 
net: nexthop: Initialize all fields in dumped nexthops

[ Upstream commit 6d745cd0e9720282cd291d36b9db528aea18add2 ]

struct nexthop_grp contains two reserved fields that are not initialized by
nla_put_nh_group(), and carry garbage. This can be observed e.g. with
strace (edited for clarity):

    # ip nexthop add id 1 dev lo
    # ip nexthop add id 101 group 1
    # strace -e recvmsg ip nexthop get id 101
    ...
    recvmsg(... [{nla_len=12, nla_type=NHA_GROUP},
                 [{id=1, weight=0, resvd1=0x69, resvd2=0x67}]] ...) = 52

The fields are reserved and therefore not currently used. But as they are, they
leak kernel memory, and the fact they are not just zero complicates repurposing
of the fields for new ends. Initialize the full structure.

Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agonet: stmmac: Correct byte order of perfect_match
Simon Horman [Tue, 23 Jul 2024 13:29:27 +0000 (14:29 +0100)] 
net: stmmac: Correct byte order of perfect_match

[ Upstream commit e9dbebae2e3c338122716914fe105458f41e3a4a ]

The perfect_match parameter of the update_vlan_hash operation is __le16,
and is correctly converted from host byte-order in the lone caller,
stmmac_vlan_update().

However, the implementations of this caller, dwxgmac2_update_vlan_hash()
and dwxgmac2_update_vlan_hash(), both treat this parameter as host byte
order, using the following pattern:

u32 value = ...
...
writel(value | perfect_match, ...);

This is not correct because both:
1) value is host byte order; and
2) writel expects a host byte order value as it's first argument

I believe that this will break on big endian systems. And I expect it
has gone unnoticed by only being exercised on little endian systems.

The approach taken by this patch is to update the callback, and it's
caller to simply use a host byte order value.

Flagged by Sparse.
Compile tested only.

Fixes: c7ab0b8088d7 ("net: stmmac: Fallback to VLAN Perfect filtering if HASH is not available")
Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoselftests: forwarding: skip if kernel not support setting bridge fdb learning limit
Hangbin Liu [Tue, 23 Jul 2024 08:22:52 +0000 (16:22 +0800)] 
selftests: forwarding: skip if kernel not support setting bridge fdb learning limit

[ Upstream commit 863ff546fb62a8fa75757a30794ab6ec6cc4bab7 ]

If the testing kernel doesn't support setting fdb_max_learned or show
fdb_n_learned, just skip it. Or we will get errors like

./bridge_fdb_learning_limit.sh: line 218: [: null: integer expression expected
./bridge_fdb_learning_limit.sh: line 225: [: null: integer expression expected

Fixes: 6f84090333bb ("selftests: forwarding: bridge_fdb_learning_limit: Add a new selftest")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agotipc: Return non-zero value from tipc_udp_addr2str() on error
Shigeru Yoshida [Tue, 16 Jul 2024 02:09:05 +0000 (11:09 +0900)] 
tipc: Return non-zero value from tipc_udp_addr2str() on error

[ Upstream commit fa96c6baef1b5385e2f0c0677b32b3839e716076 ]

tipc_udp_addr2str() should return non-zero value if the UDP media
address is invalid. Otherwise, a buffer overflow access can occur in
tipc_media_addr_printf(). Fix this by returning 1 on an invalid UDP
media address.

Fixes: d0f91938bede ("tipc: add ip/udp media type")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@endava.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agonetfs: Fix writeback that needs to go to both server and cache
David Howells [Fri, 19 Jul 2024 14:20:18 +0000 (15:20 +0100)] 
netfs: Fix writeback that needs to go to both server and cache

[ Upstream commit 212be98aa19303cbf376d61faf9de3ec9997c1cd ]

When netfslib is performing writeback (ie. ->writepages), it maintains two
parallel streams of writes, one to the server and one to the cache, but it
doesn't mark either stream of writes as active until it gets some data that
needs to be written to that stream.

This is done because some folios will only be written to the cache
(e.g. copying to the cache on read is done by marking the folios and
letting writeback do the actual work) and sometimes we'll only be writing
to the server (e.g. if there's no cache).

Now, since we don't actually dispatch uploads and cache writes in parallel,
but rather flip between the streams, depending on which has the lowest
so-far-issued offset, and don't wait for the subreqs to finish before
flipping, we can end up in a situation where, say, we issue a write to the
server and this completes before we start the write to the cache.

But because we only activate a stream when we first add a subreq to it, the
result collection code may run before we manage to activate the stream -
resulting in the folio being cleaned and having the writeback-in-progress
mark removed.  At this point, the folio no longer belongs to us.

This is only really a problem for folios that need to be written to both
streams - and in that case, the upload to the server is started first,
followed by the write to the cache - and the cache write may see a bad
folio.

Fix this by activating the cache stream up front if there's a cache
available.  If there's a cache, then all data is going to be written to it.

Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/1599053.1721398818@warthog.procyon.org.uk
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agonetfilter: nft_set_pipapo_avx2: disable softinterrupts
Florian Westphal [Fri, 19 Jul 2024 11:19:26 +0000 (13:19 +0200)] 
netfilter: nft_set_pipapo_avx2: disable softinterrupts

[ Upstream commit a16909ae9982e931841c456061cb57fbaec9c59e ]

We need to disable softinterrupts, else we get following problem:

1. pipapo_avx2 called from process context; fpu usable
2. preempt_disable() called, pcpu scratchmap in use
3. softirq handles rx or tx, we re-enter pipapo_avx2
4. fpu busy, fallback to generic non-avx version
5. fallback reuses scratch map and index, which are in use
   by the preempted process

Handle this same way as generic version by first disabling
softinterrupts while the scratchmap is in use.

Fixes: f0b3d338064e ("netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check, fallback to non-AVX2 version")
Cc: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoice: Fix recipe read procedure
Wojciech Drewek [Mon, 1 Jul 2024 09:05:46 +0000 (11:05 +0200)] 
ice: Fix recipe read procedure

[ Upstream commit 19abb9c2b900bad59e0a9818d6c83bb4cc875437 ]

When ice driver reads recipes from firmware information about
need_pass_l2 and allow_pass_l2 flags is not stored correctly.
Those flags are stored as one bit each in ice_sw_recipe structure.
Because of that, the result of checking a flag has to be casted to bool.
Note that the need_pass_l2 flag currently works correctly, because
it's stored in the first bit.

Fixes: bccd9bce29e0 ("ice: Add guard rule when creating FDB in switchdev")
Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agonet: bonding: correctly annotate RCU in bond_should_notify_peers()
Johannes Berg [Fri, 19 Jul 2024 16:41:18 +0000 (09:41 -0700)] 
net: bonding: correctly annotate RCU in bond_should_notify_peers()

[ Upstream commit 3ba359c0cd6eb5ea772125a7aededb4a2d516684 ]

RCU use in bond_should_notify_peers() looks wrong, since it does
rcu_dereference(), leaves the critical section, and uses the
pointer after that.

Luckily, it's called either inside a nested RCU critical section
or with the RTNL held.

Annotate it with rcu_dereference_rtnl() instead, and remove the
inner RCU critical section.

Fixes: 4cb4f97b7e36 ("bonding: rebuild the lock use for bond_mii_monitor()")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20240719094119.35c62455087d.I68eb9c0f02545b364b79a59f2110f2cf5682a8e2@changeid
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoipv4: Fix incorrect source address in Record Route option
Ido Schimmel [Thu, 18 Jul 2024 12:34:07 +0000 (15:34 +0300)] 
ipv4: Fix incorrect source address in Record Route option

[ Upstream commit cc73bbab4b1fb8a4f53a24645871dafa5f81266a ]

The Record Route IP option records the addresses of the routers that
routed the packet. In the case of forwarded packets, the kernel performs
a route lookup via fib_lookup() and fills in the preferred source
address of the matched route.

The lookup is performed with the DS field of the forwarded packet, but
using the RT_TOS() macro which only masks one of the two ECN bits. If
the packet is ECT(0) or CE, the matched route might be different than
the route via which the packet was forwarded as the input path masks
both of the ECN bits, resulting in the wrong address being filled in the
Record Route option.

Fix by masking both of the ECN bits.

Fixes: 8e36360ae876 ("ipv4: Remove route key identity dependencies in ip_rt_get_source().")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20240718123407.434778-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoMIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later
Gregory CLEMENT [Mon, 22 Jul 2024 13:15:39 +0000 (15:15 +0200)] 
MIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later

[ Upstream commit a263e5f309f32301e1f3ad113293f4e68a82a646 ]

When the CM block migrated from CM2.5 to CM3.0, the address offset for
the Global CSR Access Privilege register was modified. We saw this in
the "MIPS64 I6500 Multiprocessing System Programmer's Guide," it is
stated that "the Global CSR Access Privilege register is located at
offset 0x0120" in section 5.4. It is at least the same for I6400.

This fix allows to use the VP cores in SMP mode if the reset values
were modified by the bootloader.

Based on the work of Vladimir Kondratiev
<vladimir.kondratiev@mobileye.com> and the feedback from Jiaxun Yang
<jiaxun.yang@flygoat.com>.

Fixes: 197e89e0984a ("MIPS: mips-cm: Implement mips_cm_revision")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agotools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids
Liwei Song [Mon, 22 Jul 2024 08:32:59 +0000 (16:32 +0800)] 
tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids

[ Upstream commit 13c9b702e6cb8e406d5fa6b2dca422fa42d2f13e ]

Add a type cast for set8->pairs to fix below compile warning:

main.c: In function 'sets_patch':
main.c:699:50: warning: comparison of distinct pointer types lacks a cast
  699 |        BUILD_BUG_ON(set8->pairs != &set8->pairs[0].id);
      |                                 ^~

Fixes: 9707ac4fe2f5 ("tools/resolve_btfids: Refactor set sorting with types from btf_ids.h")
Signed-off-by: Liwei Song <liwei.song.lsong@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240722083305.4009723-1-liwei.song.lsong@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agobpf, events: Use prog to emit ksymbol event for main program
Hou Tao [Sun, 14 Jul 2024 06:55:33 +0000 (14:55 +0800)] 
bpf, events: Use prog to emit ksymbol event for main program

[ Upstream commit 0be9ae5486cd9e767138c13638820d240713f5f1 ]

Since commit 0108a4e9f358 ("bpf: ensure main program has an extable"),
prog->aux->func[0]->kallsyms is left as uninitialized. For BPF programs
with subprogs, the symbol for the main program is missing just as shown
in the output of perf script below:

 ffffffff81284b69 qp_trie_lookup_elem+0xb9 ([kernel.kallsyms])
 ffffffffc0011125 bpf_prog_a4a0eb0651e6af8b_lookup_qp_trie+0x5d (bpf...)
 ffffffff8127bc2b bpf_for_each_array_elem+0x7b ([kernel.kallsyms])
 ffffffffc00110a1 +0x25 ()
 ffffffff8121a89a trace_call_bpf+0xca ([kernel.kallsyms])

Fix it by always using prog instead prog->aux->func[0] to emit ksymbol
event for the main program. After the fix, the output of perf script
will be correct:

 ffffffff81284b96 qp_trie_lookup_elem+0xe6 ([kernel.kallsyms])
 ffffffffc001382d bpf_prog_a4a0eb0651e6af8b_lookup_qp_trie+0x5d (bpf...)
 ffffffff8127bc2b bpf_for_each_array_elem+0x7b ([kernel.kallsyms])
 ffffffffc0013779 bpf_prog_245c55ab25cfcf40_qp_trie_lookup+0x25 (bpf...)
 ffffffff8121a89a trace_call_bpf+0xca ([kernel.kallsyms])

Fixes: 0108a4e9f358 ("bpf: ensure main program has an extable")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Krister Johansen <kjlx@templeofstupid.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240714065533.1112616-1-houtao@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agodma: fix call order in dmam_free_coherent
Lance Richardson [Thu, 18 Jul 2024 14:38:24 +0000 (14:38 +0000)] 
dma: fix call order in dmam_free_coherent

[ Upstream commit 28e8b7406d3a1f5329a03aa25a43aa28e087cb20 ]

dmam_free_coherent() frees a DMA allocation, which makes the
freed vaddr available for reuse, then calls devres_destroy()
to remove and free the data structure used to track the DMA
allocation. Between the two calls, it is possible for a
concurrent task to make an allocation with the same vaddr
and add it to the devres list.

If this happens, there will be two entries in the devres list
with the same vaddr and devres_destroy() can free the wrong
entry, triggering the WARN_ON() in dmam_match.

Fix by destroying the devres entry before freeing the DMA
allocation.

Tested:
  kokonut //net/encryption
    http://sponge2/b9145fe6-0f72-4325-ac2f-a84d81075b03

Fixes: 9ac7849e35f7 ("devres: device resource management")
Signed-off-by: Lance Richardson <rlance@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoaf_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
Michal Luczaj [Sat, 13 Jul 2024 19:41:38 +0000 (21:41 +0200)] 
af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash

[ Upstream commit 638f32604385fd23059985da8de918e9c18f0b98 ]

AF_UNIX socket tracks the most recent OOB packet (in its receive queue)
with an `oob_skb` pointer. BPF redirecting does not account for that: when
an OOB packet is moved between sockets, `oob_skb` is left outdated. This
results in a single skb that may be accessed from two different sockets.

Take the easy way out: silently drop MSG_OOB data targeting any socket that
is in a sockmap or a sockhash. Note that such silent drop is akin to the
fate of redirected skb's scm_fp_list (SCM_RIGHTS, SCM_CREDENTIALS).

For symmetry, forbid MSG_OOB in unix_bpf_recvmsg().

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20240713200218.2140950-2-mhal@rbox.co
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agolibbpf: Fix no-args func prototype BTF dumping syntax
Andrii Nakryiko [Fri, 12 Jul 2024 22:44:42 +0000 (15:44 -0700)] 
libbpf: Fix no-args func prototype BTF dumping syntax

[ Upstream commit 189f1a976e426011e6a5588f1d3ceedf71fe2965 ]

For all these years libbpf's BTF dumper has been emitting not strictly
valid syntax for function prototypes that have no input arguments.

Instead of `int (*blah)()` we should emit `int (*blah)(void)`.

This is not normally a problem, but it manifests when we get kfuncs in
vmlinux.h that have no input arguments. Due to compiler internal
specifics, we get no BTF information for such kfuncs, if they are not
declared with proper `(void)`.

The fix is trivial. We also need to adjust a few ancient tests that
happily assumed `()` is correct.

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://lore.kernel.org/bpf/20240712224442.282823-1-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agoselftests/bpf: fexit_sleep: Fix stack allocation for arm64
Puranjay Mohan [Mon, 15 Jul 2024 17:33:27 +0000 (17:33 +0000)] 
selftests/bpf: fexit_sleep: Fix stack allocation for arm64

[ Upstream commit e1ef78dce9b7b0fa7f9d88bb3554441d74d33b34 ]

On ARM64 the stack pointer should be aligned at a 16 byte boundary or
the SPAlignmentFault can occur. The fexit_sleep selftest allocates the
stack for the child process as a character array, this is not guaranteed
to be aligned at 16 bytes.

Because of the SPAlignmentFault, the child process is killed before it
can do the nanosleep call and hence fentry_cnt remains as 0. This causes
the main thread to hang on the following line:

while (READ_ONCE(fexit_skel->bss->fentry_cnt) != 2);

Fix this by allocating the stack using mmap() as described in the
example in the man page of clone().

Remove the fexit_sleep test from the DENYLIST of arm64.

Fixes: eddbe8e65214 ("selftest/bpf: Add a test to check trampoline freeing logic.")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240715173327.8657-1-puranjay@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
16 months agokbuild: avoid build error when single DTB is turned into composite DTB
Masahiro Yamada [Thu, 4 Jul 2024 13:13:58 +0000 (22:13 +0900)] 
kbuild: avoid build error when single DTB is turned into composite DTB

[ Upstream commit 712aba5543b88996bc4682086471076fbf048927 ]

As commit afa974b77128 ("kbuild: add real-prereqs shorthand for
$(filter-out FORCE,$^)") explained, $(real-prereqs) is not just a list
of objects when linking a multi-object module. If a single-object module
is turned into a multi-object module, $^ (and therefore $(real-prereqs)
as well) contains header files recorded in the *.cmd file. Such headers
must be filtered out.

Now that a DTB can be built either from a single source or multiple
source files, the same issue can occur.

Consider the following scenario:

First, foo.dtb is implemented as a single-blob device tree.

The code looks something like this:

[Sample Code 1]

  Makefile:

      dtb-y += foo.dtb

  foo.dts:

    #include <dt-bindings/gpio/gpio.h>
    /dts-v1/;
    / { };

When it is compiled, .foo.dtb.cmd records that foo.dtb depends on
scripts/dtc/include-prefixes/dt-bindings/gpio/gpio.h.

Later, foo.dtb is split into a base and an overlay. The code looks
something like this:

[Sample Code 2]

  Makefile:

      dtb-y += foo.dtb
      foo-dtbs := foo-base.dtb foo-addon.dtbo

  foo-base.dts:

    #include <dt-bindings/gpio/gpio.h>
    /dts-v1/;
    / { };

  foo-addon.dtso:

    /dts-v1/;
    /plugin/;
    / { };

If you rebuild foo.dtb without 'make clean', you will get this error:

    Overlay 'scripts/dtc/include-prefixes/dt-bindings/gpio/gpio.h' is incomplete

$(real-prereqs) contains not only foo-base.dtb and foo-addon.dtbo but
also scripts/dtc/include-prefixes/dt-bindings/gpio/gpio.h, which is
passed to scripts/dtc/fdtoverlay.

Fixes: 15d16d6dadf6 ("kbuild: Add generic rule to apply fdtoverlay")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>