]> git.ipfire.org Git - thirdparty/kernel/linux.git/log
thirdparty/kernel/linux.git
6 weeks agonet/sched: Remove unused typedef psched_tdiff_t
Yue Haibing [Fri, 24 Oct 2025 02:51:45 +0000 (10:51 +0800)] 
net/sched: Remove unused typedef psched_tdiff_t

Since commit 051d44209842 ("net/sched: Retire CBQ qdisc")
this is not used anymore.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://patch.msgid.link/20251024025145.4069583-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoMerge branch 'sctp-avoid-redundant-initialisation-in-sctp_accept-and-sctp_do_peeloff'
Jakub Kicinski [Tue, 28 Oct 2025 01:05:01 +0000 (18:05 -0700)] 
Merge branch 'sctp-avoid-redundant-initialisation-in-sctp_accept-and-sctp_do_peeloff'

Kuniyuki Iwashima says:

====================
sctp: Avoid redundant initialisation in sctp_accept() and sctp_do_peeloff().

When sctp_accept() and sctp_do_peeloff() allocates a new socket,
somehow sk_alloc() is used, and the new socket goes through full
initialisation, but most of the fields are overwritten later.

  1)
  sctp_accept()
  |- sctp_v[46]_create_accept_sk()
  |  |- sk_alloc()
  |  |- sock_init_data()
  |  |- sctp_copy_sock()
  |  `- newsk->sk_prot->init() / sctp_init_sock()
  |
  `- sctp_sock_migrate()
     `- sctp_copy_descendant(newsk, oldsk)

  sock_init_data() initialises struct sock, but many fields are
  overwritten by sctp_copy_sock(), which inherits fields of struct
  sock and inet_sock from the parent socket.

  sctp_init_sock() fully initialises struct sctp_sock, but later
  sctp_copy_descendant() inherits most fields from the parent's
  struct sctp_sock by memcpy().

  2)
  sctp_do_peeloff()
  |- sock_create()
  |  |
  |  ...
  |      |- sk_alloc()
  |      |- sock_init_data()
  |  ...
  |    `- newsk->sk_prot->init() / sctp_init_sock()
  |
  |- sctp_copy_sock()
  `- sctp_sock_migrate()
     `- sctp_copy_descendant(newsk, oldsk)

  sock_create() creates a brand new socket, but sctp_copy_sock()
  and sctp_sock_migrate() overwrite most of the fields.

So, sk_alloc(), sock_init_data(), sctp_copy_sock(), and
sctp_copy_descendant() can be replaced with a single function
like sk_clone_lock().

This series does the conversion and removes TODO comment added
by commit 4a997d49d92ad ("tcp: Save lock_sock() for memcg in
inet_csk_accept().").

Tested accept() and SCTP_SOCKOPT_PEELOFF and both work properly.

  socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP) = 3
  bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
  listen(3, -1)                           = 0
  getsockname(3, {sa_family=AF_INET, sin_port=htons(49460), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
  socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP) = 4
  connect(4, {sa_family=AF_INET, sin_port=htons(49460), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
  accept(3, NULL, NULL)                   = 5
  ...
  socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP) = 3
  bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
  listen(3, -1)                           = 0
  getsockname(3, {sa_family=AF_INET, sin_port=htons(48240), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
  socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP) = 4
  connect(4, {sa_family=AF_INET, sin_port=htons(48240), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
  getsockopt(3, SOL_SCTP, SCTP_SOCKOPT_PEELOFF, "*\0\0\0\5\0\0\0", [8]) = 5

v1: https://lore.kernel.org/20251021214422.1941691-1-kuniyu@google.com
====================

Link: https://patch.msgid.link/20251023231751.4168390-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agosctp: Remove sctp_copy_sock() and sctp_copy_descendant().
Kuniyuki Iwashima [Thu, 23 Oct 2025 23:16:57 +0000 (23:16 +0000)] 
sctp: Remove sctp_copy_sock() and sctp_copy_descendant().

Now, sctp_accept() and sctp_do_peeloff() use sk_clone(), and
we no longer need sctp_copy_sock() and sctp_copy_descendant().

Let's remove them.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251023231751.4168390-9-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agosctp: Use sctp_clone_sock() in sctp_do_peeloff().
Kuniyuki Iwashima [Thu, 23 Oct 2025 23:16:56 +0000 (23:16 +0000)] 
sctp: Use sctp_clone_sock() in sctp_do_peeloff().

sctp_do_peeloff() calls sock_create() to allocate and initialise
struct sock, inet_sock, and sctp_sock, but later sctp_copy_sock()
and sctp_sock_migrate() overwrite most fields.

What sctp_do_peeloff() does is more like accept().

Let's use sock_create_lite() and sctp_clone_sock().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251023231751.4168390-8-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agosctp: Remove sctp_pf.create_accept_sk().
Kuniyuki Iwashima [Thu, 23 Oct 2025 23:16:55 +0000 (23:16 +0000)] 
sctp: Remove sctp_pf.create_accept_sk().

sctp_v[46]_create_accept_sk() are no longer used.

Let's remove sctp_pf.create_accept_sk().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251023231751.4168390-7-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agosctp: Use sk_clone() in sctp_accept().
Kuniyuki Iwashima [Thu, 23 Oct 2025 23:16:54 +0000 (23:16 +0000)] 
sctp: Use sk_clone() in sctp_accept().

sctp_accept() calls sctp_v[46]_create_accept_sk() to allocate a new
socket and calls sctp_sock_migrate() to copy fields from the parent
socket to the new socket.

sctp_v4_create_accept_sk() allocates sk by sk_alloc(), initialises
it by sock_init_data(), and copy a bunch of fields from the parent
socekt by sctp_copy_sock().

sctp_sock_migrate() calls sctp_copy_descendant() to copy most fields
in sctp_sock from the parent socket by memcpy().

These can be simply replaced by sk_clone().

Let's consolidate sctp_v[46]_create_accept_sk() to sctp_clone_sock()
with sk_clone().

We will reuse sctp_clone_sock() for sctp_do_peeloff() and then remove
sctp_copy_descendant().

Note that sock_reset_flag(newsk, SOCK_ZAPPED) is not copied to
sctp_clone_sock() as sctp does not use SOCK_ZAPPED at all.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251023231751.4168390-6-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: Add sk_clone().
Kuniyuki Iwashima [Thu, 23 Oct 2025 23:16:53 +0000 (23:16 +0000)] 
net: Add sk_clone().

sctp_accept() will use sk_clone_lock(), but it will be called
with the parent socket locked, and sctp_migrate() acquires the
child lock later.

Let's add no lock version of sk_clone_lock().

Note that lockdep complains if we simply use bh_lock_sock_nested().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251023231751.4168390-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agosctp: Don't call sk->sk_prot->init() in sctp_v[46]_create_accept_sk().
Kuniyuki Iwashima [Thu, 23 Oct 2025 23:16:52 +0000 (23:16 +0000)] 
sctp: Don't call sk->sk_prot->init() in sctp_v[46]_create_accept_sk().

sctp_accept() calls sctp_v[46]_create_accept_sk() to allocate a new
socket and calls sctp_sock_migrate() to copy fields from the parent
socket to the new socket.

sctp_v[46]_create_accept_sk() calls sctp_init_sock() to initialise
sctp_sock, but most fields are overwritten by sctp_copy_descendant()
called from sctp_sock_migrate().

Things done in sctp_init_sock() but not in sctp_sock_migrate() are
the following:

  1. Copy sk->sk_gso
  2. Copy sk->sk_destruct (sctp_v6_init_sock())
  3. Allocate sctp_sock.ep
  4. Initialise sctp_sock.pd_lobby
  5. Count sk_sockets_allocated_inc(), sock_prot_inuse_add(),
     and SCTP_DBG_OBJCNT_INC()

Let's do these in sctp_copy_sock() and sctp_sock_migrate() and avoid
calling sk->sk_prot->init() in sctp_v[46]_create_accept_sk().

Note that sk->sk_destruct is already copied in sctp_copy_sock().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251023231751.4168390-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agosctp: Don't copy sk_sndbuf and sk_rcvbuf in sctp_sock_migrate().
Kuniyuki Iwashima [Thu, 23 Oct 2025 23:16:51 +0000 (23:16 +0000)] 
sctp: Don't copy sk_sndbuf and sk_rcvbuf in sctp_sock_migrate().

sctp_sock_migrate() is called from 2 places.

1) sctp_accept() calls sp->pf->create_accept_sk() before
   sctp_sock_migrate(), and sp->pf->create_accept_sk() calls
   sctp_copy_sock().

2) sctp_do_peeloff() also calls sctp_copy_sock() before
   sctp_sock_migrate().

sctp_copy_sock() copies sk_sndbuf and sk_rcvbuf from the
parent socket.

Let's not copy the two fields in sctp_sock_migrate().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251023231751.4168390-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agosctp: Defer SCTP_DBG_OBJCNT_DEC() to sctp_destroy_sock().
Kuniyuki Iwashima [Thu, 23 Oct 2025 23:16:50 +0000 (23:16 +0000)] 
sctp: Defer SCTP_DBG_OBJCNT_DEC() to sctp_destroy_sock().

SCTP_DBG_OBJCNT_INC() is called only when sctp_init_sock()
returns 0 after successfully allocating sctp_sk(sk)->ep.

OTOH, SCTP_DBG_OBJCNT_DEC() is called in sctp_close().

The code seems to expect that the socket is always exposed
to userspace once SCTP_DBG_OBJCNT_INC() is incremented, but
there is a path where the assumption is not true.

In sctp_accept(), sctp_sock_migrate() could fail after
sctp_init_sock().

Then, sk_common_release() does not call inet_release() nor
sctp_close().  Instead, it calls sk->sk_prot->destroy().

Let's move SCTP_DBG_OBJCNT_DEC() from sctp_close() to
sctp_destroy_sock().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251023231751.4168390-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoMerge branch 'convert-net-drivers-to-ndo_hwtstamp-api-part-2'
Jakub Kicinski [Tue, 28 Oct 2025 01:04:38 +0000 (18:04 -0700)] 
Merge branch 'convert-net-drivers-to-ndo_hwtstamp-api-part-2'

Vadim Fedorenko says:

====================
convert net drivers to ndo_hwtstamp API part 2

This is part 2 of patchset to convert drivers which support HW
timestamping to use .ndo_hwtstamp_get()/.ndo_hwtstamp_set() callbacks.
The new API uses netlink to communicate with user-space and have some
test coverage.
====================

Link: https://patch.msgid.link/20251023220457.3201122-1-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: hns3: add hwtstamp_get/hwtstamp_set ops
Vadim Fedorenko [Thu, 23 Oct 2025 22:04:57 +0000 (22:04 +0000)] 
net: hns3: add hwtstamp_get/hwtstamp_set ops

And .ndo_hwtstamp_get()/.ndo_hwtstamp_set() callbacks to HNS3 framework
to support HW timestamp configuration via netlink and adopt hns3pf to
use .ndo_hwtstamp_get()/.ndo_hwtstamp_set() callbacks.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251023220457.3201122-7-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: renesas: rswitch: convert to ndo_hwtstamp API
Vadim Fedorenko [Thu, 23 Oct 2025 22:04:56 +0000 (22:04 +0000)] 
net: renesas: rswitch: convert to ndo_hwtstamp API

Convert driver to use .ndo_hwtstamp_set()/.ndo_hwtstamp_get() callbacks.
rswitch_eth_ioctl() becomes phy_do_ioctl_running(), remove it and
replace .ndo_eth_ioctl callback with phy_do_ioctl_running().

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251023220457.3201122-6-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: ravb: convert to ndo_hwtstamp API
Vadim Fedorenko [Thu, 23 Oct 2025 22:04:55 +0000 (22:04 +0000)] 
net: ravb: convert to ndo_hwtstamp API

Convert driver to use .ndo_hwtstamp_set()/.ndo_hwtstamp_get callbacks.
ravb_do_ioctl() becomes pure phy_do_ioctl_running(), remove it and
replace in callbacks.

Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251023220457.3201122-5-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoionic: convert to ndo_hwtstamp API
Vadim Fedorenko [Thu, 23 Oct 2025 22:04:54 +0000 (22:04 +0000)] 
ionic: convert to ndo_hwtstamp API

Convert driver to use .ndo_hwtstamp_get()/.ndo_hwtstamp_set() callbacks.
ionic_eth_ioctl() becomes empty, remove it.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251023220457.3201122-4-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agomlx4: convert to ndo_hwtstamp API
Vadim Fedorenko [Thu, 23 Oct 2025 22:04:53 +0000 (22:04 +0000)] 
mlx4: convert to ndo_hwtstamp API

Convert driver to use .ndo_hwtstamp_get()/.ndo_hwtstamp_set() callbacks.
mlx4_en_ioctl() becomes empty, remove it.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251023220457.3201122-3-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoocteontx2: convert to ndo_hwtstamp API
Vadim Fedorenko [Thu, 23 Oct 2025 22:04:52 +0000 (22:04 +0000)] 
octeontx2: convert to ndo_hwtstamp API

Convert driver to use .ndo_hwtstamp_get()/.ndo_hwtstamp_set() callbacks.
otx2_ioctl() becomes empty, remove it.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251023220457.3201122-2-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: airoha: Fix a copy and paste bug in probe()
Dan Carpenter [Fri, 24 Oct 2025 11:23:35 +0000 (14:23 +0300)] 
net: airoha: Fix a copy and paste bug in probe()

This code has a copy and paste bug where it accidentally checks "if (err)"
instead of checking if "xsi_rsts" is NULL.  Also, as a free bonus, I
changed the allocation from kzalloc() to  kcalloc() which is a kernel
hardening measure to protect against integer overflows.

Fixes: 5863b4e065e2 ("net: airoha: Add airoha_eth_soc_data struct")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/aPtht6y5DRokn9zv@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoMerge tag 'batadv-next-pullrequest-20251024' of https://git.open-mesh.org/linux-merge
Jakub Kicinski [Tue, 28 Oct 2025 01:02:38 +0000 (18:02 -0700)] 
Merge tag 'batadv-next-pullrequest-20251024' of https://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - use skb_crc32c() instead of skb_seq_read(), by Sven Eckelmann

* tag 'batadv-next-pullrequest-20251024' of https://git.open-mesh.org/linux-merge:
  batman-adv: use skb_crc32c() instead of skb_seq_read()
  batman-adv: Start new development cycle
====================

Link: https://patch.msgid.link/20251024092315.232636-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoMerge branch 'phy-mscc-fix-ptp-for-vsc8574-and-vsc8572'
Jakub Kicinski [Tue, 28 Oct 2025 00:58:05 +0000 (17:58 -0700)] 
Merge branch 'phy-mscc-fix-ptp-for-vsc8574-and-vsc8572'

Horatiu Vultur says:

====================
phy: mscc: Fix PTP for VSC8574 and VSC8572

The first patch will update the PHYs VSC8584, VSC8582, VSC8575 and VSC856X
to use PHY_ID_MATCH_EXACT because only rev B exists for these PHYs.
But for the PHYs VSC8574 and VSC8572 exists rev A, B, C, D and E.
This is just a preparation for the second patch to allow the VSC8574 and
VSC8572 to use the function vsc8584_probe().

We want to use vsc8584_probe() for VSC8574 and VSC8572 because this
function does the correct PTP initialization. This change is in the second
patch.
====================

Link: https://patch.msgid.link/20251023191350.190940-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agophy: mscc: Fix PTP for VSC8574 and VSC8572
Horatiu Vultur [Thu, 23 Oct 2025 19:13:50 +0000 (21:13 +0200)] 
phy: mscc: Fix PTP for VSC8574 and VSC8572

The PTP initialization is two-step. First part are the function
vsc8584_ptp_probe_once() and vsc8584_ptp_probe() at probe time which
initialize the locks, queues, creates the PTP device. The second part is
the function vsc8584_ptp_init() at config_init() time which initialize
PTP in the HW.

For VSC8574 and VSC8572, the PTP initialization is incomplete. It is
missing the first part but it makes the second part. Meaning that the
ptp_clock_register() is never called.

There is no crash without the first part when enabling PTP but this is
unexpected because some PHys have PTP functionality exposed by the
driver and some don't even though they share the same PTP clock PTP.

Fixes: 774626fa440e ("net: phy: mscc: Add PTP support for 2 more VSC PHYs")
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://patch.msgid.link/20251023191350.190940-3-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agophy: mscc: Use PHY_ID_MATCH_EXACT for VSC8584, VSC8582, VSC8575, VSC856X
Horatiu Vultur [Thu, 23 Oct 2025 19:13:49 +0000 (21:13 +0200)] 
phy: mscc: Use PHY_ID_MATCH_EXACT for VSC8584, VSC8582, VSC8575, VSC856X

As the PHYs VSC8584, VSC8582, VSC8575 and VSC856X exists only as rev B,
we can use PHY_ID_MATCH_EXACT to match exactly on revision B of the PHY.
Because of this change then there is not need the check if it is a
different revision than rev B in the function vsc8584_probe() as we
already know that this will never happen.
These changes are a preparation for the next patch because in that patch
we will make the PHYs VSC8574 and VSC8572 to use vsc8584_probe() and
these PHYs have multiple revision.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://patch.msgid.link/20251023191350.190940-2-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: bridge_mdb: Add a test for MDB flush on snooping disable
Petr Machata [Thu, 23 Oct 2025 14:45:38 +0000 (16:45 +0200)] 
selftests: bridge_mdb: Add a test for MDB flush on snooping disable

Check that non-permanent MDB entries are removed as IGMP / MLD snooping is
disabled.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/9420dfbcf26c8e1134d31244e9e7d6a49d677a69.1761228273.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: bridge: Flush multicast groups when snooping is disabled
Petr Machata [Thu, 23 Oct 2025 14:45:37 +0000 (16:45 +0200)] 
net: bridge: Flush multicast groups when snooping is disabled

When forwarding multicast packets, the bridge takes MDB into account when
IGMP / MLD snooping is enabled. Currently, when snooping is disabled, the
MDB is retained, even though it is not used anymore.

At the same time, during the time that snooping is disabled, the IGMP / MLD
control packets are obviously ignored, and after the snooping is reenabled,
the administrator has to assume it is out of sync. In particular, missed
join and leave messages would lead to traffic being forwarded to wrong
interfaces.

Keeping the MDB entries around thus serves no purpose, and just takes
memory. Note also that disabling per-VLAN snooping does actually flush the
relevant MDB entries.

This patch flushes non-permanent MDB entries as global snooping is
disabled.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/5e992df1bb93b88e19c0ea5819e23b669e3dde5d.1761228273.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: tls: add tls record_size_limit test
Wilfred Mallawa [Wed, 22 Oct 2025 00:19:37 +0000 (10:19 +1000)] 
selftests: tls: add tls record_size_limit test

Test that outgoing plaintext records respect the tls TLS_TX_MAX_PAYLOAD_LEN
set using setsockopt(). The limit is set to be 128, thus, in all received
records, the plaintext must not exceed this amount.

Also test that setting a new record size limit whilst a pending open
record exists is handled correctly by discarding the request.

Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251022001937.20155-2-wilfred.opensource@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet/tls: support setting the maximum payload size
Wilfred Mallawa [Wed, 22 Oct 2025 00:19:36 +0000 (10:19 +1000)] 
net/tls: support setting the maximum payload size

During a handshake, an endpoint may specify a maximum record size limit.
Currently, the kernel defaults to TLS_MAX_PAYLOAD_SIZE (16KB) for the
maximum record size. Meaning that, the outgoing records from the kernel
can exceed a lower size negotiated during the handshake. In such a case,
the TLS endpoint must send a fatal "record_overflow" alert [1], and
thus the record is discarded.

Upcoming Western Digital NVMe-TCP hardware controllers implement TLS
support. For these devices, supporting TLS record size negotiation is
necessary because the maximum TLS record size supported by the controller
is less than the default 16KB currently used by the kernel.

Currently, there is no way to inform the kernel of such a limit. This patch
adds support to a new setsockopt() option `TLS_TX_MAX_PAYLOAD_LEN` that
allows for setting the maximum plaintext fragment size. Once set, outgoing
records are no larger than the size specified. This option can be used to
specify the record size limit.

[1] https://www.rfc-editor.org/rfc/rfc8449

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251022001937.20155-1-wilfred.opensource@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge branch 'dwmac-support-for-rockchip-rk3506'
Jakub Kicinski [Sat, 25 Oct 2025 02:07:48 +0000 (19:07 -0700)] 
Merge branch 'dwmac-support-for-rockchip-rk3506'

Heiko Stuebner says:

====================
DWMAC support for Rockchip RK3506

Some cleanups to the DT binding for Rockchip variants of the dwmac
and adding the RK3506 support on top.

As well as the driver-glue needed for setting up the correct RMII
speed seitings.
====================

Link: https://patch.msgid.link/20251023111213.298860-1-heiko@sntech.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMAINTAINERS: add dwmac-rk glue driver to the main Rockchip entry
Heiko Stuebner [Thu, 23 Oct 2025 11:12:12 +0000 (13:12 +0200)] 
MAINTAINERS: add dwmac-rk glue driver to the main Rockchip entry

The dwmac-rk glue driver is currently not caught by the general maintainer
entry for Rockchip SoCs, so add it explicitly, similar to the i2c driver.

The binding document in net/rockchip-dwmac.yaml already gets caught by
the wildcard match.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251023111213.298860-6-heiko@sntech.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoethernet: stmmac: dwmac-rk: Add RK3506 GMAC support
David Wu [Thu, 23 Oct 2025 11:12:11 +0000 (13:12 +0200)] 
ethernet: stmmac: dwmac-rk: Add RK3506 GMAC support

Add the needed glue blocks for the RK3506-specific setup.

The RK3506 dwmac only supports up to 100MBit with a RMII PHY,
but no RGMII.

Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20251023111213.298860-5-heiko@sntech.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agodt-bindings: net: rockchip-dwmac: Add compatible string for RK3506
Heiko Stuebner [Thu, 23 Oct 2025 11:12:10 +0000 (13:12 +0200)] 
dt-bindings: net: rockchip-dwmac: Add compatible string for RK3506

Rockchip RK3506 has two Ethernet controllers based on Synopsys DWC
Ethernet QoS IP.

Add compatible string for the RK3506 variant.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20251023111213.298860-4-heiko@sntech.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agodt-bindings: net: snps,dwmac: Sync list of Rockchip compatibles
Heiko Stuebner [Thu, 23 Oct 2025 11:12:09 +0000 (13:12 +0200)] 
dt-bindings: net: snps,dwmac: Sync list of Rockchip compatibles

A number of dwmac variants from Rockchip SoCs have turned up in the
Rockchip-specific binding, but not in the main list in snps,dwmac.yaml
which as the comment indicates is needed for accurate matching.

So add the missing rk3528, rk3568 and rv1126 to the main list.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20251023111213.298860-3-heiko@sntech.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agodt-bindings: net: snps,dwmac: move rk3399 line to its correct position
Heiko Stuebner [Thu, 23 Oct 2025 11:12:08 +0000 (13:12 +0200)] 
dt-bindings: net: snps,dwmac: move rk3399 line to its correct position

Move the rk3399 compatible to its alphabetically correct position.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20251023111213.298860-2-heiko@sntech.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge branch 'net-ravb-soc-specific-configuration'
Jakub Kicinski [Sat, 25 Oct 2025 02:04:36 +0000 (19:04 -0700)] 
Merge branch 'net-ravb-soc-specific-configuration'

Lad Prabhakar says:

====================
net: ravb: SoC-specific configuration

This series addresses several issues in the Renesas Ethernet AVB (ravb)
driver related to SoC-specific resource configuration.

The series includes the following changes:

- Make DBAT entry count configurable per SoC
The number of descriptor base address table (DBAT) entries is not uniform
across all SoCs. Pass this information via the hardware info structure and
allocate resources accordingly.

- Allocate correct number of queues based on SoC support
Use the per-SoC configuration to determine whether a network control queue
is available, and allocate queues dynamically to match the SoC's
capability.

v2: https://lore.kernel.org/20251017151830.171062-1-prabhakar.mahadev-lad.rj@bp.renesas.com
====================

Link: https://patch.msgid.link/20251023112111.215198-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: ravb: Allocate correct number of queues based on SoC support
Lad Prabhakar [Thu, 23 Oct 2025 11:21:11 +0000 (12:21 +0100)] 
net: ravb: Allocate correct number of queues based on SoC support

Use the per-SoC match data flag `nc_queues` to decide how many TX/RX
queues to allocate. If the SoC does not provide a network-control queue,
fall back to a single TX/RX queue. Obtain the match data before calling
alloc_etherdev_mqs() so the allocation is sized correctly.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20251023112111.215198-3-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: ravb: Make DBAT entry count configurable per-SoC
Lad Prabhakar [Thu, 23 Oct 2025 11:21:10 +0000 (12:21 +0100)] 
net: ravb: Make DBAT entry count configurable per-SoC

Avoid wasting coherent DMA memory by allocating the descriptor base
address table sized for the actual number of DBAT/CDARq entries supported
by the SoC. Some platforms (for example GBETH) only provide two CDARq
entries; previously the driver always allocated space for 22 entries which
needlessly consumed memory on those systems.

Pass the per-SoC dbat_entry_num via struct ravb_hw_info and use it for
allocation and initialization in probe. This sizes the table correctly and
removes the unnecessary memory overhead on SoCs with fewer DBAT entries.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20251023112111.215198-2-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: usb: usbnet: coding style for functions
Oliver Neukum [Thu, 23 Oct 2025 10:00:19 +0000 (12:00 +0200)] 
net: usb: usbnet: coding style for functions

Functions are not to have blanks between names
and parameter lists. Remove them.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://patch.msgid.link/20251023100136.909118-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge branch 'net-stmmac-pcs-support-part-2'
Jakub Kicinski [Sat, 25 Oct 2025 01:56:37 +0000 (18:56 -0700)] 
Merge branch 'net-stmmac-pcs-support-part-2'

Russell King says:

====================
net: stmmac: pcs support part 2

This is the next part of stmmac PCS support. Not much here, other than
dealing with what remains of the interrupts, which are the PCS AN
complete and PCS Link interrupts, which are just cleared and update
accounting.

Currently, they are enabled at core init time, but if we have an
implementation that supports multiple PHY interfaces, we want to
enable only the appropriate interrupts.

I also noticed that stmmac_fpe_configure_pmac() also modifies the
interrupt mask during run time. As a pre-requisit, we need a way
to ensure that we don't have different threads modifying the
interrupt settings at the same time. So, the first patch introduces
a new function and a spinlock which must be held when manipulating
the interrupt enable/mask state.

The second patch adds the PCS bits for enabling the PCS AN and PCS
link interrupts when the PCS is in-use.
====================

Link: https://patch.msgid.link/aPn5YVeUcWo4CW3c@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: stmmac: add support for controlling PCS interrupts
Russell King (Oracle) [Thu, 23 Oct 2025 09:46:25 +0000 (10:46 +0100)] 
net: stmmac: add support for controlling PCS interrupts

Add support to the PCS instance for controlling the PCS interrupts
depending on whether the PCS is used.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vBrtp-0000000BMYs-3bhI@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: stmmac: add stmmac_mac_irq_modify()
Russell King (Oracle) [Thu, 23 Oct 2025 09:46:20 +0000 (10:46 +0100)] 
net: stmmac: add stmmac_mac_irq_modify()

Add a function to allow interrupts to be enabled and disabled in a
core independent manner.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vBrtk-0000000BMYm-3CV5@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge branch 'net-add-phylink-managed-wol-and-convert-stmmac'
Jakub Kicinski [Sat, 25 Oct 2025 01:52:09 +0000 (18:52 -0700)] 
Merge branch 'net-add-phylink-managed-wol-and-convert-stmmac'

Russell King says:

====================
net: add phylink managed WoL and convert stmmac

This series is implementing the thoughts of Andrew, Florian and myself
to improve the quality of Wake-on-Lan (WoL) implementations.

This changes nothing for MAC drivers that do not wish to participate in
this, but if they do, then they gain the benefit of phylink configuring
WoL at the point closest to the media as possible.

We first need to solve the problem that the multitude of PHY drivers
report their device supports WoL, but are not capable of waking the
system. Correcting this is fundamental to choosing where WoL should be
enabled - a mis-reported WoL support can render WoL completely
ineffective.

The only PHY drivers which uses the driver model's wakeup support is
drivers/net/phy/broadcom.c, and until recently, realtek. This means
we have the opportunity for PHY drivers to be _correctly_ converted
to use this method of signalling wake-up capability only when they can
actually wake the system, and thus providing a way for phylink to
know whether to use PHY-based WoL at all.

However, a PHY driver not implementing that logic doesn't become a
blocker to MACs wanting to convert. In full, the logic is:

- phylink supports a flag, wol_phy_legacy, which forces phylink to use
  the PHY-based WoL even if the MDIO device is not marked as wake-up
  capable.

- when wol_phy_legacy is not set, we check whether the PHY MDIO device
  is wake-up capable. If it is, we offer the WoL request to the PHY.

- if neither wol_phy_legacy is set, or the PHY is not wake-up capable,
  we do not offer the WoL request to the PHY.

In both cases, after setting any PHY based WoL, we remove the options
that the PHY now reports are enabled from the options mask, and offer
these (if any) to the MAC. The mac will get a "mac_set_wol()" method
call when any settings change.

Phylink mainatains the WoL state for the MAC, so there's no need for
a "mac_get_wol()" method. There may be the need to set the initial
state but this is not supported at present.

I've also added support for doing the PHY speed-up/speed-down at
suspend/resume time depending on the WoL state, which takes another
issue from the MAC authors.

Lastly, with phylink now having the full picture for WoL, the
"mac_wol" argument for phylink_suspend() becomes redundant, and for
MAC drivers that implement mac_set_wol(), the value passed becomes
irrelevant.
====================

Link: https://patch.msgid.link/aPnyW54J80h9DmhB@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: stmmac: convert to phylink managed WoL PHY speed
Russell King (Oracle) [Thu, 23 Oct 2025 09:16:55 +0000 (10:16 +0100)] 
net: stmmac: convert to phylink managed WoL PHY speed

Convert stmmac to use phylink's management of the PHY speed when
Wake-on-Lan is enabled.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vBrRH-0000000BLzm-3JjF@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: stmmac: convert to phylink-managed Wake-on-Lan
Russell King (Oracle) [Thu, 23 Oct 2025 09:16:50 +0000 (10:16 +0100)] 
net: stmmac: convert to phylink-managed Wake-on-Lan

Convert stmmac to use phylink-managed Wake-on-Lan support. To achieve
this, we implement the .mac_wol_set() method, which simply configures
the driver model's struct device wakeup for stmmac, and sets the
priv->wolopts appropriately.

When STMMAC_FLAG_USE_PHY_WOL is set, in the stmmac world this means to
only use the PHY's WoL support and ignore the MAC's WoL capabilities.
To preserve this behaviour, we enable phylink's legacy mode, and avoid
telling phylink that the MAC has any WoL support. This achieves the
same functionality for this case.

When STMMAC_FLAG_USE_PHY_WOL is not set, we provide the MAC's WoL
capabilities to phylink, which then allows phylink to choose between
the PHY and MAC for WoL depending on their individual capabilities
as described in the phylink commit. This only augments the WoL
functionality with PHYs that declare to the driver model that they are
wake-up capable. Currently, very few PHY drivers support this.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vBrRC-0000000BLzg-2tA4@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: phylink: add phylink managed wake-on-lan PHY speed control
Russell King (Oracle) [Thu, 23 Oct 2025 09:16:45 +0000 (10:16 +0100)] 
net: phylink: add phylink managed wake-on-lan PHY speed control

Some drivers, e.g. stmmac, use the speed_up()/speed_down() APIs to
gain additional power saving during Wake-on-LAN where the PHY is
managing the state.

Add support to phylink for this, which can be enabled by the MAC
driver. Only change the PHY speed if the PHY is configured for
wake-up, but without any wake-up on the MAC side, as MAC side
means changing the configuration once the negotiation has
completed.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vBrR7-0000000BLza-2PjK@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: phylink: add phylink managed MAC Wake-on-Lan support
Russell King (Oracle) [Thu, 23 Oct 2025 09:16:40 +0000 (10:16 +0100)] 
net: phylink: add phylink managed MAC Wake-on-Lan support

Add core phylink managed Wake-on-Lan support, which is enabled when the
MAC driver fills in the new .mac_wol_set() method that this commit
creates.

When this feature is disabled, phylink acts as it has in the past,
merely passing the ethtool WoL calls to phylib whenever a PHY exists.
No other new functionality provided by this commit is enabled.

When this feature is enabled, a more inteligent approach is used.
Phylink will first pass WoL options to the PHY, read them back, and
attempt to set any options that were not set at the PHY at the MAC.

Since we have PHY drivers that report they support WoL, and accept WoL
configuration even though they aren't wired up to be capable of waking
the system, we need a way to differentiate between PHYs that think
they support WoL and those which actually do. As PHY drivers do not
make use of the driver model's wake-up infrastructure, but could, we
use this to determine whether PHY drivers can participate. This gives
a path forward where, as MAC drivers are converted to this, it
encourages PHY drivers to also be converted.

Phylink will also ignore the mac_wol argument to phylink_suspend() as
it now knows the WoL state at the MAC.

MAC drivers are expected to record/configure the Wake-on-Lan state in
their .mac_set_wol() method, and deal appropriately with it in their
suspend/resume methods. The driver model provides assistance to set the
IRQ wake support which may assist driver authors in achieving the
necessary configuration.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vBrR2-0000000BLzU-1xYL@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: phy: add phy_may_wakeup()
Russell King (Oracle) [Thu, 23 Oct 2025 09:16:35 +0000 (10:16 +0100)] 
net: phy: add phy_may_wakeup()

Add phy_may_wakeup() which uses the driver model's device_may_wakeup()
when the PHY driver has marked the device as wakeup capable in the
driver model, otherwise use phy_drv_wol_enabled().

Replace the sites that used to call phy_drv_wol_enabled() with this
as checking the driver model will be more efficient than checking the
WoL state.

Export phy_may_wakeup() so that phylink can use it.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vBrQx-0000000BLzO-1RLt@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: phy: add phy_can_wakeup()
Russell King (Oracle) [Thu, 23 Oct 2025 09:16:30 +0000 (10:16 +0100)] 
net: phy: add phy_can_wakeup()

Add phy_can_wakeup() to report whether the PHY driver has marked the
PHY device as being wake-up capable as far as the driver model is
concerned.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vBrQs-0000000BLzI-0w3U@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agosmc: rename smc_find_ism_store_rc to reflect broader usage
Dust Li [Thu, 23 Oct 2025 02:00:12 +0000 (10:00 +0800)] 
smc: rename smc_find_ism_store_rc to reflect broader usage

The function smc_find_ism_store_rc() is used to record the reason
why a suitable device (either ISM or RDMA) could not be found.
However, its name suggests it is ISM-specific, which is misleading.

Rename it to better reflect its actual usage.

No functional changes.

Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251023020012.69609-1-dust.li@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agostrparser: fix typo in comment
Julia Lawall [Thu, 23 Oct 2025 01:30:51 +0000 (03:30 +0200)] 
strparser: fix typo in comment

The name frags_list doesn't appear in the kernel.
It should be frag_list as in the next sentence.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://patch.msgid.link/20251023013051.1728388-1-Julia.Lawall@inria.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoselftest: net: prevent use of uninitialized variable
Alessandro Zanni [Thu, 23 Oct 2025 20:53:52 +0000 (22:53 +0200)] 
selftest: net: prevent use of uninitialized variable

Fix to avoid the usage of the `ret` variable uninitialized in the
following macro expansions.

It solves the following warning:

In file included from netlink-dumps.c:21:
netlink-dumps.c: In function ‘dump_extack’:
../kselftest_harness.h:788:35: warning: ‘ret’ may be used uninitialized [-Wmaybe-uninitialized]
  788 |                         intmax_t  __exp_print = (intmax_t)__exp; \
      |                                   ^~~~~~~~~~~
../kselftest_harness.h:631:9: note: in expansion of macro ‘__EXPECT’
  631 |         __EXPECT(expected, #expected, seen, #seen, ==, 0)
      |         ^~~~~~~~
netlink-dumps.c:169:9: note: in expansion of macro ‘EXPECT_EQ’
  169 |         EXPECT_EQ(ret, FOUND_EXTACK);
      |         ^~~~~~~~~

The issue can be reproduced, building the tests, with the command:
make -C tools/testing/selftests TARGETS=net

Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
Link: https://patch.msgid.link/20251023205354.28249-1-alessandro.zanni87@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge branch 'neighbour-convert-rtm_getneightbl-and-rtm_setneightbl-to-rcu'
Jakub Kicinski [Sat, 25 Oct 2025 00:57:27 +0000 (17:57 -0700)] 
Merge branch 'neighbour-convert-rtm_getneightbl-and-rtm_setneightbl-to-rcu'

Kuniyuki Iwashima says:

====================
neighbour: Convert RTM_GETNEIGHTBL and RTM_SETNEIGHTBL to RCU.

Patch 1 & 2 are prep for RCU conversion for RTM_GETNEIGHTBL.

Patch 3 & 4 converts RTM_GETNEIGHTBL and RTM_SETNEIGHTBL to RCU.

Patch 5 converts the neighbour table rwlock to the plain spinlock.
====================

Link: https://patch.msgid.link/20251022054004.2514876-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoneighbour: Convert rwlock of struct neigh_table to spinlock.
Kuniyuki Iwashima [Wed, 22 Oct 2025 05:39:49 +0000 (05:39 +0000)] 
neighbour: Convert rwlock of struct neigh_table to spinlock.

Only neigh_for_each() and neigh_seq_start/stop() are on the
reader side of neigh_table.lock.

Let's convert rwlock to the plain spinlock.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251022054004.2514876-6-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoneighbour: Convert RTM_SETNEIGHTBL to RCU.
Kuniyuki Iwashima [Wed, 22 Oct 2025 05:39:48 +0000 (05:39 +0000)] 
neighbour: Convert RTM_SETNEIGHTBL to RCU.

neightbl_set() fetches neigh_tables[] and updates attributes under
write_lock_bh(&tbl->lock), so RTNL is not needed.

neigh_table_clear() synchronises RCU only, and rcu_dereference_rtnl()
protects nothing here.

If we released RCU after fetching neigh_tables[], there would be no
synchronisation to block neigh_table_clear() further, so RCU is held
until the end of the function.

Another option would be to protect neigh_tables[] user with SRCU
and add synchronize_srcu() in neigh_table_clear().

But, holding RCU should be fine as we hold write_lock_bh() for the
rest of neightbl_set() anyway.

Let's perform RTM_SETNEIGHTBL under RCU and drop RTNL.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251022054004.2514876-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoneighbour: Convert RTM_GETNEIGHTBL to RCU.
Kuniyuki Iwashima [Wed, 22 Oct 2025 05:39:47 +0000 (05:39 +0000)] 
neighbour: Convert RTM_GETNEIGHTBL to RCU.

neightbl_dump_info() calls these functions for each neigh_tables[]
entry:

  1. neightbl_fill_info() for tbl->parms
  2. neightbl_fill_param_info() for tbl->parms_list (except tbl->parms)

Both functions rely on the table lock (read_lock_bh(&tbl->lock))
and RTNL is not needed.

Let's fetch the table under RCU and convert RTM_GETNEIGHTBL to RCU.

Note that the first entry of tbl->parms_list is tbl->parms.list and
embedded in neigh_table, so list_next_entry() is safe.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251022054004.2514876-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoneighbour: Annotate access to neigh_parms fields.
Kuniyuki Iwashima [Wed, 22 Oct 2025 05:39:46 +0000 (05:39 +0000)] 
neighbour: Annotate access to neigh_parms fields.

NEIGH_VAR() is read locklessly in the fast path, and IPv6 ndisc uses
NEIGH_VAR_SET() locklessly.

The next patch will convert neightbl_dump_info() to RCU.

Let's annotate accesses to neigh_param with READ_ONCE() and WRITE_ONCE().

Note that ndisc_ifinfo_sysctl_change() uses &NEIGH_VAR() and we cannot
use '&' with READ_ONCE(), so NEIGH_VAR_PTR() is introduced.

Note also that NEIGH_VAR_INIT() does not need WRITE_ONCE() as it is before
parms is published.  Also, the only user hippi_neigh_setup_dev() is no
longer called since commit e3804cbebb67 ("net: remove COMPAT_NET_DEV_OPS"),
which looks wrong, but probably no one uses HIPPI and RoadRunner.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251022054004.2514876-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoneighbour: Use RCU list helpers for neigh_parms.list writers.
Kuniyuki Iwashima [Wed, 22 Oct 2025 05:39:45 +0000 (05:39 +0000)] 
neighbour: Use RCU list helpers for neigh_parms.list writers.

We will convert RTM_GETNEIGHTBL to RCU soon, where we traverse
tbl->parms_list under RCU in neightbl_dump_info().

Let's use RCU list helper for neigh_parms in neigh_parms_alloc()
and neigh_parms_release().

neigh_table_init() uses the plain list_add() for the default
neigh_parm that is embedded in the table and not yet published.

Note that neigh_parms_release() already uses call_rcu() to free
neigh_parms.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251022054004.2514876-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge branch 'net-dsa-lantiq_gswip-use-regmap-for-register-access'
Jakub Kicinski [Fri, 24 Oct 2025 01:53:08 +0000 (18:53 -0700)] 
Merge branch 'net-dsa-lantiq_gswip-use-regmap-for-register-access'

Daniel Golle says:

====================
net: dsa: lantiq_gswip: use regmap for register access

This series refactors the lantiq_gswip driver to utilize the regmap API
for register access, replacing the previous approach of open-coding
register operations.

Using regmap paves the way for supporting different busses to access the
switch registers, for example it makes it easier to use an MDIO-based
method required to access the registers of the MaxLinear GSW1xx series
of dedicated switch ICs.

Apart from that, the use of regmap improves readability and
maintainability of the driver by standardizing register access.

When ever possible changes were made using Coccinelle semantic patches,
sometimes adjusting white space and adding line breaks when needed.
The remaining changes which were not done using semantic patches are
small and should be easy to review and verify.

The whole series has been
Tested-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
====================

Link: https://patch.msgid.link/cover.1761045000.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: dsa: lantiq_gswip: harmonize gswip_mii_mask_*() parameters
Daniel Golle [Tue, 21 Oct 2025 11:17:33 +0000 (12:17 +0100)] 
net: dsa: lantiq_gswip: harmonize gswip_mii_mask_*() parameters

The 'clear' parameter of gswip_mii_mask_cfg() and gswip_mii_mask_pcdu()
is inconsistent with the semantics of regmap_write_bits() which also
applies the mask to the value to be written.
Change the semantic mask/set of the functions gswip_mii_mask_cfg() and
gswip_mii_mask_pcdu() to follow the regmap_write_bits() pattern.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Link: https://patch.msgid.link/218854236c97a152af071852bda83d02ff2dd918.1761045000.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: dsa: lantiq_gswip: optimize regmap_write_bits() statements
Daniel Golle [Tue, 21 Oct 2025 11:17:14 +0000 (12:17 +0100)] 
net: dsa: lantiq_gswip: optimize regmap_write_bits() statements

Further optimize the previous naive conversion of the *_mask() accessor
functions to regmap_write_bits by manually removing redundant mask
operands.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Link: https://patch.msgid.link/fce2f964b22fe3efc234c664b1e50de28dddf512.1761045000.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: dsa: lantiq_gswip: replace *_mask() functions with regmap API
Daniel Golle [Tue, 21 Oct 2025 11:17:06 +0000 (12:17 +0100)] 
net: dsa: lantiq_gswip: replace *_mask() functions with regmap API

Use coccinelle to replace all uses of *_mask() with an equivalent call
to regmap_write_bits().

// Replace gswip_switch_mask with regmap_write_bits
@@
expression priv, clear, set, offset;
@@
- gswip_switch_mask(priv, clear, set, offset)
+ regmap_write_bits(priv->gswip, offset, clear | set, set)

// Replace gswip_mdio_mask with regmap_write_bits
@@
expression priv, clear, set, offset;
@@
- gswip_mdio_mask(priv, clear, set, offset)
+ regmap_write_bits(priv->mdio, offset, clear | set, set)

// Replace gswip_mii_mask with regmap_write_bits
@@
expression priv, clear, set, offset;
@@
- gswip_mii_mask(priv, clear, set, offset)
+ regmap_write_bits(priv->mii, offset, clear | set, set)

Remove the new unused *_mask() functions.
This naive approach will be further optmized manually in the next commit.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Link: https://patch.msgid.link/258d931386a512b7089924c70073ca7acba71168.1761045000.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: dsa: lantiq_gswip: manually convert remaining uses of read accessors
Daniel Golle [Tue, 21 Oct 2025 11:16:58 +0000 (12:16 +0100)] 
net: dsa: lantiq_gswip: manually convert remaining uses of read accessors

Manually convert the remaining uses of the read accessor functions and
remove them now that they are unused.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Link: https://patch.msgid.link/0e2a44b83131b40fc1ee558ed1f536c26e1232ba.1761045000.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: dsa: lantiq_gswip: convert trivial accessor uses to regmap
Daniel Golle [Tue, 21 Oct 2025 11:16:50 +0000 (12:16 +0100)] 
net: dsa: lantiq_gswip: convert trivial accessor uses to regmap

Use coccinelle semantic patch to convert all trivial uses of the register
accessor functions to use the regmap API directly.

// Replace gswip_switch_w with regmap_write
@@
expression priv, val, offset;
@@
- gswip_switch_w(priv, val, offset)
+ regmap_write(priv->gswip, offset, val)

// Replace gswip_mdio_w with regmap_write
@@
expression priv, val, offset;
@@
- gswip_mdio_w(priv, val, offset)
+ regmap_write(priv->mdio, offset, val)

// Replace gswip_switch_r in simple assignment - only for u32
@@
expression priv, offset;
u32 var;
@@
- var = gswip_switch_r(priv, offset)
+ regmap_read(priv->gswip, offset, &var)

// Replace gswip_switch_mask with regmap_set_bits when clear is 0
@@
expression priv, set, offset;
@@
- gswip_switch_mask(priv, 0, set, offset)
+ regmap_set_bits(priv->gswip, offset, set)

// Replace gswip_mdio_mask with regmap_set_bits when clear is 0
@@
expression priv, set, offset;
@@
- gswip_mdio_mask(priv, 0, set, offset)
+ regmap_set_bits(priv->mdio, offset, set)

// Replace gswip_switch_mask with regmap_clear_bits when set is 0
@@
expression priv, clear, offset;
@@
- gswip_switch_mask(priv, clear, 0, offset)
+ regmap_clear_bits(priv->gswip, offset, clear)

// Replace gswip_mdio_mask with regmap_clear_bits when set is 0
@@
expression priv, clear, offset;
@@
- gswip_mdio_mask(priv, clear, 0, offset)
+ regmap_clear_bits(priv->mdio, offset, clear)

Remove gswip_switch_w() and gswip_mdio_w() functions as they now no
longer have any users.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Link: https://patch.msgid.link/48a60f386b1bd487c410b1f5fb25ba50ceddc6f7.1761045000.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: dsa: lantiq_gswip: convert accessors to use regmap
Daniel Golle [Tue, 21 Oct 2025 11:16:37 +0000 (12:16 +0100)] 
net: dsa: lantiq_gswip: convert accessors to use regmap

Use regmap for register access in preparation for supporting the MaxLinear
GSW1xx family of switches connected via MDIO or SPI.
Rewrite the existing accessor read-poll-timeout functions to use calls to
the regmap API for now.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Link: https://patch.msgid.link/535d968bc6319a74bdf76166ef19364ee659285f.1761045000.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: dsa: lantiq_gswip: clarify GSWIP 2.2 VLAN mode in comment
Daniel Golle [Tue, 21 Oct 2025 11:16:30 +0000 (12:16 +0100)] 
net: dsa: lantiq_gswip: clarify GSWIP 2.2 VLAN mode in comment

The comment above writing the default PVID incorrectly states that
"GSWIP 2.2 (GRX300) and later program here the VID directly."
The truth is that even GSWIP 2.2 and newer maintain the behavior of
GSWIP 2.1 unless the VLANMD bit in PCE Global Control Register 1 is
set ("GSWIP2.2 VLAN Mode").
Fix the misleading comment accordingly.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Acked-by; Hauke Mehrtens <hauke@hauke-m.de>:
Link: https://patch.msgid.link/018056a575503d9797f3222f71a988e825316be0.1761045000.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agotcp: Remove unnecessary null check in tcp_inbound_md5_hash()
Eric Biggers [Wed, 22 Oct 2025 22:12:09 +0000 (15:12 -0700)] 
tcp: Remove unnecessary null check in tcp_inbound_md5_hash()

The 'if (!key && hash_location)' check in tcp_inbound_md5_hash() implies
that hash_location might be null.  However, later code in the function
dereferences hash_location anyway, without checking for null first.
Fortunately, there is no real bug, since tcp_inbound_md5_hash() is
called only with non-null values of hash_location.

Therefore, remove the unnecessary and misleading null check of
hash_location.  This silences a Smatch static checker warning
(https://lore.kernel.org/netdev/aPi4b6aWBbBR52P1@stanley.mountain/)

Also fix the related comment at the beginning of the function.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Link: https://patch.msgid.link/20251022221209.19716-1-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: rmnet: Use section heading markup for packet format subsections
Bagas Sanjaya [Wed, 22 Oct 2025 02:54:57 +0000 (09:54 +0700)] 
net: rmnet: Use section heading markup for packet format subsections

Format subsections of "Packet format" section as reST subsections.

Link: https://lore.kernel.org/linux-doc/aO_MefPIlQQrCU3j@horms.kernel.org/
Suggested-by: Simon Horman <horms@kernel.org>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20251022025456.19004-2-bagasdotme@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: unix: remove outdated BSD behavior comment in unix_release_sock()
Sunday Adelodun [Tue, 21 Oct 2025 19:59:06 +0000 (20:59 +0100)] 
net: unix: remove outdated BSD behavior comment in unix_release_sock()

Remove the long-standing comment in unix_release_sock() that described a
behavioral difference between Linux and BSD regarding when ECONNRESET is
sent to connected UNIX sockets upon closure.

As confirmed by testing on macOS (similar to BSD behavior), ECONNRESET
is only observed for SOCK_DGRAM sockets, not for SOCK_STREAM. Meanwhile,
Linux already returns ECONNRESET in cases where a socket is closed with
unread data or is not yet accept()ed. This means the previous comment no
longer accurately describes current behavior and is misleading.

Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Sunday Adelodun <adelodunolaoluwa@yahoo.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251021195906.20389-1-adelodunolaoluwa@yahoo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: airoha: Remove code duplication in airoha_regs.h
Lorenzo Bianconi [Wed, 22 Oct 2025 07:11:12 +0000 (09:11 +0200)] 
net: airoha: Remove code duplication in airoha_regs.h

This patch does not introduce any logical change, it just removes
duplicated code in airoha_regs.h.
Fix naming conventions in airoha_regs.h.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20251022-airoha-regs-cosmetics-v2-1-e0425b3f2c2c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 16 Oct 2025 17:53:13 +0000 (10:53 -0700)] 
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-6.18-rc3).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 23 Oct 2025 17:03:18 +0000 (07:03 -1000)] 
Merge tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from can. Slim pickings, I'm guessing people haven't
  really started testing.

  Current release - new code bugs:

   - eth: mlx5e:
       - psp: avoid 'accel' NULL pointer dereference
       - skip PPHCR register query for FEC histogram if not supported

  Previous releases - regressions:

   - bonding: update the slave array for broadcast mode

   - rtnetlink: re-allow deleting FDB entries in user namespace

   - eth: dpaa2: fix the pointer passed to PTR_ALIGN on Tx path

  Previous releases - always broken:

   - can: drop skb on xmit if device is in listen-only mode

   - gro: clear skb_shinfo(skb)->hwtstamps in napi_reuse_skb()

   - eth: mlx5e
       - RX, fix generating skb from non-linear xdp_buff if program
         trims frags
       - make devcom init failures non-fatal, fix races with IPSec

  Misc:

   - some documentation formatting 'fixes'"

* tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  net/mlx5: Fix IPsec cleanup over MPV device
  net/mlx5: Refactor devcom to return NULL on failure
  net/mlx5e: Skip PPHCR register query if not supported by the device
  net/mlx5: Add PPHCR to PCAM supported registers mask
  virtio-net: zero unused hash fields
  net: phy: micrel: always set shared->phydev for LAN8814
  vsock: fix lock inversion in vsock_assign_transport()
  ovpn: use datagram_poll_queue for socket readiness in TCP
  espintcp: use datagram_poll_queue for socket readiness
  net: datagram: introduce datagram_poll_queue for custom receive queues
  net: bonding: fix possible peer notify event loss or dup issue
  net: hsr: prevent creation of HSR device with slaves from another netns
  sctp: avoid NULL dereference when chunk data buffer is missing
  ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
  net: ravb: Ensure memory write completes before ringing TX doorbell
  net: ravb: Enforce descriptor type ordering
  net: hibmcge: select FIXED_PHY
  net: dlink: use dev_kfree_skb_any instead of dev_kfree_skb
  Documentation: networking: ax25: update the mailing list info.
  net: gro_cells: fix lock imbalance in gro_cells_receive()
  ...

7 weeks agoMerge tag 'acpi-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 23 Oct 2025 16:53:12 +0000 (06:53 -1000)] 
Merge tag 'acpi-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix a fallout of a recent ACPI properties management update and
  work around a compiler bug in ACPICA:

   - Fix a recent coding mistake causing __acpi_node_get_property_reference()
     arguments to be put in an incorrect order (Sunil V L)

   - Work around bogus -Wstringop-overread warning on LoongArch since
     GCC 11 in ACPICA (Xi Ruoyao)"

* tag 'acpi-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPICA: Work around bogus -Wstringop-overread warning since GCC 11
  ACPI: property: Fix argument order in __acpi_node_get_property_reference()

7 weeks agoMerge tag 'pm-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Thu, 23 Oct 2025 16:48:32 +0000 (06:48 -1000)] 
Merge tag 'pm-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These revert a cpuidle menu governor commit leading to a performance
  regression, fix an amd-pstate driver regression introduced recently,
  and fix new conditional guard definitions for runtime PM.

   - Add missing _RET == 0 condition to recently introduced conditional
     guard definitions for runtime PM (Rafael Wysocki)

   - Revert a cpuidle menu governor change that introduced a serious
     performance regression on Chromebooks with Intel Jasper Lake
     processors (Rafael Wysocki)

   - Fix an amd-pstate driver regression leading to EPP=0 after
     hibernation (Mario Limonciello)"

* tag 'pm-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: runtime: Fix conditional guard definitions
  Revert "cpuidle: menu: Avoid discarding useful information"
  cpufreq/amd-pstate: Fix a regression leading to EPP 0 after hibernate

7 weeks agoMerge tag 'for-6.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Thu, 23 Oct 2025 16:44:43 +0000 (06:44 -1000)] 
Merge tag 'for-6.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - in send, fix duplicated rmdir operations when using extrefs
   (hardlinks), receive can fail with ENOENT

 - fixup of error check when reading extent root in ref-verify and
   damaged roots are allowed by mount option (found by smatch)

 - fix freeing partially initialized fs info (found by syzkaller)

 - fix use-after-free when printing ref_tracking status of delayed
   inodes

* tag 'for-6.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: ref-verify: fix IS_ERR() vs NULL check in btrfs_build_ref_tree()
  btrfs: fix delayed_node ref_tracker use after free
  btrfs: send: fix duplicated rmdir operations when using extrefs
  btrfs: directly free partially initialized fs_info in btrfs_check_leaked_roots()

7 weeks agoMerge branch 'mlx5-misc-fixes-2025-10-22'
Jakub Kicinski [Thu, 23 Oct 2025 14:14:38 +0000 (07:14 -0700)] 
Merge branch 'mlx5-misc-fixes-2025-10-22'

Tariq Toukan says:

====================
mlx5 misc fixes 2025-10-22

This patchset provides misc bug fixes from the team to the mlx5 core and
Eth drivers.
====================

Link: https://patch.msgid.link/1761136182-918470-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet/mlx5: Fix IPsec cleanup over MPV device
Patrisious Haddad [Wed, 22 Oct 2025 12:29:42 +0000 (15:29 +0300)] 
net/mlx5: Fix IPsec cleanup over MPV device

When we do mlx5e_detach_netdev() we eventually disable blocking events
notifier, among those events are IPsec MPV events from IB to core.

So before disabling those blocking events, make sure to also unregister
the devcom device and mark all this device operations as complete,
in order to prevent the other device from using invalid netdev
during future devcom events which could cause the trace below.

BUG: kernel NULL pointer dereference, address: 0000000000000010
PGD 146427067 P4D 146427067 PUD 146488067 PMD 0
Oops: Oops: 0000 [#1] SMP
CPU: 1 UID: 0 PID: 7735 Comm: devlink Tainted: GW 6.12.0-rc6_for_upstream_min_debug_2024_11_08_00_46 #1
Tainted: [W]=WARN
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core]
Code: 00 01 48 83 05 23 32 1e 00 01 41 b8 ed ff ff ff e9 60 ff ff ff 48 83 05 00 32 1e 00 01 eb e3 66 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 47 10 48 83 05 5f 32 1e 00 01 48 8b 50 40 48 85 d2 74 05 40
RSP: 0018:ffff88811a5c35f8 EFLAGS: 00010206
RAX: ffff888106e8ab80 RBX: ffff888107d7e200 RCX: ffff88810d6f0a00
RDX: ffff88810d6f0a00 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff88811a17e620 R08: 0000000000000040 R09: 0000000000000000
R10: ffff88811a5c3618 R11: 0000000de85d51bd R12: ffff88811a17e600
R13: ffff88810d6f0a00 R14: 0000000000000000 R15: ffff8881034bda80
FS:  00007f27bdf89180(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000010f159005 CR4: 0000000000372eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? __die+0x20/0x60
 ? page_fault_oops+0x150/0x3e0
 ? exc_page_fault+0x74/0x130
 ? asm_exc_page_fault+0x22/0x30
 ? mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core]
 mlx5e_devcom_event_mpv+0x42/0x60 [mlx5_core]
 mlx5_devcom_send_event+0x8c/0x170 [mlx5_core]
 blocking_event+0x17b/0x230 [mlx5_core]
 notifier_call_chain+0x35/0xa0
 blocking_notifier_call_chain+0x3d/0x60
 mlx5_blocking_notifier_call_chain+0x22/0x30 [mlx5_core]
 mlx5_core_mp_event_replay+0x12/0x20 [mlx5_core]
 mlx5_ib_bind_slave_port+0x228/0x2c0 [mlx5_ib]
 mlx5_ib_stage_init_init+0x664/0x9d0 [mlx5_ib]
 ? idr_alloc_cyclic+0x50/0xb0
 ? __kmalloc_cache_noprof+0x167/0x340
 ? __kmalloc_noprof+0x1a7/0x430
 __mlx5_ib_add+0x34/0xd0 [mlx5_ib]
 mlx5r_probe+0xe9/0x310 [mlx5_ib]
 ? kernfs_add_one+0x107/0x150
 ? __mlx5_ib_add+0xd0/0xd0 [mlx5_ib]
 auxiliary_bus_probe+0x3e/0x90
 really_probe+0xc5/0x3a0
 ? driver_probe_device+0x90/0x90
 __driver_probe_device+0x80/0x160
 driver_probe_device+0x1e/0x90
 __device_attach_driver+0x7d/0x100
 bus_for_each_drv+0x80/0xd0
 __device_attach+0xbc/0x1f0
 bus_probe_device+0x86/0xa0
 device_add+0x62d/0x830
 __auxiliary_device_add+0x3b/0xa0
 ? auxiliary_device_init+0x41/0x90
 add_adev+0xd1/0x150 [mlx5_core]
 mlx5_rescan_drivers_locked+0x21c/0x300 [mlx5_core]
 esw_mode_change+0x6c/0xc0 [mlx5_core]
 mlx5_devlink_eswitch_mode_set+0x21e/0x640 [mlx5_core]
 devlink_nl_eswitch_set_doit+0x60/0xe0
 genl_family_rcv_msg_doit+0xd0/0x120
 genl_rcv_msg+0x180/0x2b0
 ? devlink_get_from_attrs_lock+0x170/0x170
 ? devlink_nl_eswitch_get_doit+0x290/0x290
 ? devlink_nl_pre_doit_port_optional+0x50/0x50
 ? genl_family_rcv_msg_dumpit+0xf0/0xf0
 netlink_rcv_skb+0x54/0x100
 genl_rcv+0x24/0x40
 netlink_unicast+0x1fc/0x2d0
 netlink_sendmsg+0x1e4/0x410
 __sock_sendmsg+0x38/0x60
 ? sockfd_lookup_light+0x12/0x60
 __sys_sendto+0x105/0x160
 ? __sys_recvmsg+0x4e/0x90
 __x64_sys_sendto+0x20/0x30
 do_syscall_64+0x4c/0x100
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f27bc91b13a
Code: bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 fa 96 2c 00 45 89 c9 4c 63 d1 48 63 ff 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 f3 c3 0f 1f 40 00 41 55 41 54 4d 89 c5 55
RSP: 002b:00007fff369557e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000009c54b10 RCX: 00007f27bc91b13a
RDX: 0000000000000038 RSI: 0000000009c54b10 RDI: 0000000000000006
RBP: 0000000009c54920 R08: 00007f27bd0030e0 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
 </TASK>
Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi ib_umad scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm mlx5_fwctl mlx5_ib ib_uverbs ib_core mlx5_core
CR2: 0000000000000010

Fixes: 82f9378c443c ("net/mlx5: Handle IPsec steering upon master unbind/bind")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761136182-918470-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet/mlx5: Refactor devcom to return NULL on failure
Patrisious Haddad [Wed, 22 Oct 2025 12:29:41 +0000 (15:29 +0300)] 
net/mlx5: Refactor devcom to return NULL on failure

Devcom device and component registration isn't always critical to the
functionality of the caller, hence the registration can fail and we can
continue working with an ERR_PTR value saved inside a variable.

In order to avoid that make sure all devcom failures return NULL.

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761136182-918470-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet/mlx5e: Skip PPHCR register query if not supported by the device
Alexei Lazar [Wed, 22 Oct 2025 12:29:40 +0000 (15:29 +0300)] 
net/mlx5e: Skip PPHCR register query if not supported by the device

Check the PCAM supported registers mask before querying the PPHCR
register, as it is not supported in older devices.

Fixes: 44907e7c8fd0 ("net/mlx5e: Add logic to read RS-FEC histogram bin ranges from PPHCR")
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Yael Chemla <ychemla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761136182-918470-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet/mlx5: Add PPHCR to PCAM supported registers mask
Alexei Lazar [Wed, 22 Oct 2025 12:29:39 +0000 (15:29 +0300)] 
net/mlx5: Add PPHCR to PCAM supported registers mask

Add the PPHCR bit to the port_access_reg_cap_mask field of PCAM
register to indicate that the device supports the PPHCR register
and the RS-FEC histogram feature.

Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Yael Chemla <ychemla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761136182-918470-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agovirtio-net: zero unused hash fields
Jason Wang [Wed, 22 Oct 2025 03:44:21 +0000 (11:44 +0800)] 
virtio-net: zero unused hash fields

When GSO tunnel is negotiated virtio_net_hdr_tnl_from_skb() tries to
initialize the tunnel metadata but forget to zero unused rxhash
fields. This may leak information to another side. Fixing this by
zeroing the unused hash fields.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: a2fb4bc4e2a6a ("net: implement virtio helpers to handle UDP GSO tunneling")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20251022034421.70244-1-jasowang@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: phy: micrel: always set shared->phydev for LAN8814
Robert Marko [Tue, 21 Oct 2025 13:20:26 +0000 (15:20 +0200)] 
net: phy: micrel: always set shared->phydev for LAN8814

Currently, during the LAN8814 PTP probe shared->phydev is only set if PTP
clock gets actually set, otherwise the function will return before setting
it.

This is an issue as shared->phydev is unconditionally being used when IRQ
is being handled, especially in lan8814_gpio_process_cap and since it was
not set it will cause a NULL pointer exception and crash the kernel.

So, simply always set shared->phydev to avoid the NULL pointer exception.

Fixes: b3f1a08fcf0d ("net: phy: micrel: Add support for PTP_PF_EXTTS for lan8814")
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://patch.msgid.link/20251021132034.983936-1-robert.marko@sartura.hr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agovsock: fix lock inversion in vsock_assign_transport()
Stefano Garzarella [Tue, 21 Oct 2025 12:17:18 +0000 (14:17 +0200)] 
vsock: fix lock inversion in vsock_assign_transport()

Syzbot reported a potential lock inversion deadlock between
vsock_register_mutex and sk_lock-AF_VSOCK when vsock_linger() is called.

The issue was introduced by commit 687aa0c5581b ("vsock: Fix
transport_* TOCTOU") which added vsock_register_mutex locking in
vsock_assign_transport() around the transport->release() call, that can
call vsock_linger(). vsock_assign_transport() can be called with sk_lock
held. vsock_linger() calls sk_wait_event() that temporarily releases and
re-acquires sk_lock. During this window, if another thread hold
vsock_register_mutex while trying to acquire sk_lock, a circular
dependency is created.

Fix this by releasing vsock_register_mutex before calling
transport->release() and vsock_deassign_transport(). This is safe
because we don't need to hold vsock_register_mutex while releasing the
old transport, and we ensure the new transport won't disappear by
obtaining a module reference first via try_module_get().

Reported-by: syzbot+10e35716f8e4929681fa@syzkaller.appspotmail.com
Tested-by: syzbot+10e35716f8e4929681fa@syzkaller.appspotmail.com
Fixes: 687aa0c5581b ("vsock: Fix transport_* TOCTOU")
Cc: mhal@rbox.co
Cc: stable@vger.kernel.org
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251021121718.137668-1-sgarzare@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 weeks agoMerge branch 'fix-poll-behaviour-for-tcp-based-tunnel-protocols'
Paolo Abeni [Thu, 23 Oct 2025 13:46:10 +0000 (15:46 +0200)] 
Merge branch 'fix-poll-behaviour-for-tcp-based-tunnel-protocols'

Ralf Lici says:

====================
fix poll behaviour for TCP-based tunnel protocols

This patch series introduces a polling function for datagram-style
sockets that operates on custom skb queues, and updates ovpn (the
OpenVPN data-channel offload module) and espintcp (the TCP Encapsulation
of IKE and IPsec Packets implementation) to use it accordingly.

Protocols like the aforementioned one decapsulate packets received over
TCP and deliver userspace-bound data through a separate skb queue, not
the standard sk_receive_queue. Previously, both relied on
datagram_poll(), which would signal readiness based on non-userspace
packets, leading to misleading poll results and unnecessary recv
attempts in userspace.

Patch 1 introduces datagram_poll_queue(), a variant of datagram_poll()
that accepts an explicit receive queue. This builds on the approach
introduced in commit b50b058, which extended other skb-related functions
to support custom queues. Patch 2 and 3 update espintcp_poll() and
ovpn_tcp_poll() respectively to use this helper, ensuring readiness is
only signaled when userspace data is available.

Each patch is self-contained and the ovpn one includes rationale and
lifecycle enforcement where appropriate.
====================

Link: https://patch.msgid.link/20251021100942.195010-1-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 weeks agoovpn: use datagram_poll_queue for socket readiness in TCP
Ralf Lici [Tue, 21 Oct 2025 10:09:42 +0000 (12:09 +0200)] 
ovpn: use datagram_poll_queue for socket readiness in TCP

openvpn TCP encapsulation uses a custom queue to deliver packets to
userspace. Currently it relies on datagram_poll, which checks
sk_receive_queue, leading to false readiness signals when that queue
contains non-userspace packets.

Switch ovpn_tcp_poll to use datagram_poll_queue with the peer's
user_queue, ensuring poll only signals readiness when userspace data is
actually available. Also refactor ovpn_tcp_poll in order to enforce the
assumption we can make on the lifetime of ovpn_sock and peer.

Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251021100942.195010-4-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 weeks agoespintcp: use datagram_poll_queue for socket readiness
Ralf Lici [Tue, 21 Oct 2025 10:09:41 +0000 (12:09 +0200)] 
espintcp: use datagram_poll_queue for socket readiness

espintcp uses a custom queue (ike_queue) to deliver packets to
userspace. The polling logic relies on datagram_poll, which checks
sk_receive_queue, which can lead to false readiness signals when that
queue contains non-userspace packets.

Switch espintcp_poll to use datagram_poll_queue with ike_queue, ensuring
poll only signals readiness when userspace data is actually available.

Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251021100942.195010-3-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 weeks agonet: datagram: introduce datagram_poll_queue for custom receive queues
Ralf Lici [Tue, 21 Oct 2025 10:09:40 +0000 (12:09 +0200)] 
net: datagram: introduce datagram_poll_queue for custom receive queues

Some protocols using TCP encapsulation (e.g., espintcp, openvpn) deliver
userspace-bound packets through a custom skb queue rather than the
standard sk_receive_queue.

Introduce datagram_poll_queue that accepts an explicit receive queue,
and convert datagram_poll into a wrapper around datagram_poll_queue.
This allows protocols with custom skb queues to reuse the core polling
logic without relying on sk_receive_queue.

Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Antonio Quartulli <antonio@openvpn.net>
Link: https://patch.msgid.link/20251021100942.195010-2-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 weeks agoMerge branch 'acpi-property'
Rafael J. Wysocki [Thu, 23 Oct 2025 11:25:02 +0000 (13:25 +0200)] 
Merge branch 'acpi-property'

Merge an ACPI device properties handling change fixing the order of
__acpi_node_get_property_reference() arguments broken by a recent
update (Sunil V L)

* 'acpi-property':
  ACPI: property: Fix argument order in __acpi_node_get_property_reference()

7 weeks agoMerge branches 'pm-cpuidle' and 'pm-cpufreq'
Rafael J. Wysocki [Thu, 23 Oct 2025 11:12:24 +0000 (13:12 +0200)] 
Merge branches 'pm-cpuidle' and 'pm-cpufreq'

Merge cpuidle and cpufreq fixes for 6.18-rc3:

 - Revert a cpuidle menu governor change that introduced a serious
   performance regression on Chromebooks with Intel Jasper Lake
   processors (Rafael Wysocki)

 - Fix an amd-pstate driver regression leading to EPP=0 after
   hibernation (Mario Limonciello)

* pm-cpuidle:
  Revert "cpuidle: menu: Avoid discarding useful information"

* pm-cpufreq:
  cpufreq/amd-pstate: Fix a regression leading to EPP 0 after hibernate

7 weeks agonet: phy: micrel: Add support for non PTP SKUs for lan8814
Horatiu Vultur [Tue, 21 Oct 2025 07:07:26 +0000 (09:07 +0200)] 
net: phy: micrel: Add support for non PTP SKUs for lan8814

The lan8814 has 4 different SKUs and for 2 of these SKUs the PTP is
disabled. All these SKUs have the same value in the register 2 and 3.
Meaning that we can't differentiate them based on device id, therefore
check the SKU register and based on this allow or not to create a PTP
device.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251021070726.3690685-1-horatiu.vultur@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 weeks agonet: bonding: fix possible peer notify event loss or dup issue
Tonghao Zhang [Tue, 21 Oct 2025 05:09:33 +0000 (13:09 +0800)] 
net: bonding: fix possible peer notify event loss or dup issue

If the send_peer_notif counter and the peer event notify are not synchronized.
It may cause problems such as the loss or dup of peer notify event.

Before this patch:
- If should_notify_peers is true and the lock for send_peer_notif-- fails, peer
  event may be sent again in next mii_monitor loop, because should_notify_peers
  is still true.
- If should_notify_peers is true and the lock for send_peer_notif-- succeeded,
  but the lock for peer event fails, the peer event will be lost.

This patch locks the RTNL for send_peer_notif, events, and commit simultaneously.

Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications")
Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Vincent Bernat <vincent@bernat.ch>
Cc: <stable@vger.kernel.org>
Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20251021050933.46412-1-tonghao@bamaicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 weeks agonet: ti: icssg-prueth: Omit a variable reassignment in prueth_netdev_init()
Markus Elfring [Mon, 20 Oct 2025 13:46:11 +0000 (15:46 +0200)] 
net: ti: icssg-prueth: Omit a variable reassignment in prueth_netdev_init()

An error code was assigned to a variable and checked accordingly.
This value was passed to a dev_err_probe() call in an if branch.
This function is documented in the way that the same value is returned.
Thus delete two redundant variable reassignments.

The source code was transformed by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: https://patch.msgid.link/71f7daa3-d4f4-4753-aae8-67040fc8297d@web.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 weeks agonet: hsr: prevent creation of HSR device with slaves from another netns
Fernando Fernandez Mancera [Mon, 20 Oct 2025 13:55:33 +0000 (15:55 +0200)] 
net: hsr: prevent creation of HSR device with slaves from another netns

HSR/PRP driver does not handle correctly having slaves/interlink devices
in a different net namespace. Currently, it is possible to create a HSR
link in a different net namespace than the slaves/interlink with the
following command:

 ip link add hsr0 netns hsr-ns type hsr slave1 eth1 slave2 eth2

As there is no use-case on supporting this scenario, enforce that HSR
device link matches netns defined by IFLA_LINK_NETNSID.

The iproute2 command mentioned above will throw the following error:

 Error: hsr: HSR slaves/interlink must be on the same net namespace than HSR link.

Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20251020135533.9373-1-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoeth: fbnic: fix integer overflow warning in TLV_MAX_DATA definition
Pei Xiao [Tue, 21 Oct 2025 09:42:27 +0000 (17:42 +0800)] 
eth: fbnic: fix integer overflow warning in TLV_MAX_DATA definition

The TLV_MAX_DATA macro calculates (PAGE_SIZE - 512) which can exceed
the maximum value of a 16-bit unsigned integer on architectures with
large page sizes, causing compiler warnings:

drivers/net/ethernet/meta/fbnic/fbnic_tlv.h:83:24: warning: conversion
from 'long unsigned int' to 'short unsigned int' changes value from
'261632' to '65024' [-Woverflow]

Fix this by explicitly masking the result to 16 bits using bitwise AND
with 0xFFFF, ensuring the value fits within the expected data type
while maintaining the intended behavior for normal page sizes.

This preserves the existing functionality while eliminating the
compiler warning and potential undefined behavior from integer
truncation.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202510190832.3SQkTCHe-lkp@intel.com/
Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/182b9d0235d044d69d7a57c1296cc6f46e395beb.1761039651.git.xiaopei01@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet/sched: Remove unused inline helper qdisc_from_priv()
Yue Haibing [Tue, 21 Oct 2025 11:46:26 +0000 (19:46 +0800)] 
net/sched: Remove unused inline helper qdisc_from_priv()

Since commit fb38306ceb9e ("net/sched: Retire ATM qdisc"), this is
not used and can be removed.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://patch.msgid.link/20251021114626.3148894-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: spacemit: Avoid -Wflex-array-member-not-at-end warnings
Gustavo A. R. Silva [Tue, 21 Oct 2025 11:54:10 +0000 (12:54 +0100)] 
net: spacemit: Avoid -Wflex-array-member-not-at-end warnings

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Use regular arrays instead of flexible-array members (they're not
really needed in this case) in a couple of unions, and fix the
following warnings:

      1 drivers/net/ethernet/spacemit/k1_emac.c:122:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
      1 drivers/net/ethernet/spacemit/k1_emac.c:122:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
      1 drivers/net/ethernet/spacemit/k1_emac.c:121:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
      1 drivers/net/ethernet/spacemit/k1_emac.c:121:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Vivian Wang <wangruikang@iscas.ac.cn>
Link: https://patch.msgid.link/aPd0YjO-oP60Lgvj@kspp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agosctp: avoid NULL dereference when chunk data buffer is missing
Alexey Simakov [Tue, 21 Oct 2025 13:00:36 +0000 (16:00 +0300)] 
sctp: avoid NULL dereference when chunk data buffer is missing

chunk->skb pointer is dereferenced in the if-block where it's supposed
to be NULL only.

chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list
instead and do it just before replacing chunk->skb. We're sure that
otherwise chunk->skb is non-NULL because of outer if() condition.

Fixes: 90017accff61 ("sctp: Add GSO support")
Signed-off-by: Alexey Simakov <bigalex934@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
Jiasheng Jiang [Tue, 21 Oct 2025 18:24:56 +0000 (18:24 +0000)] 
ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop

In ptp_ocp_sma_fb_init(), the code mistakenly used bp->sma[1]
instead of bp->sma[i] inside a for-loop, which caused only SMA[1]
to have its DIRECTION_CAN_CHANGE capability cleared. This led to
inconsistent capability flags across SMA pins.

Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251021182456.9729-1-jiashengjiangcool@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: stmmac: replace has_xxxx with core_type
Russell King (Oracle) [Tue, 21 Oct 2025 07:26:49 +0000 (08:26 +0100)] 
net: stmmac: replace has_xxxx with core_type

Replace the has_gmac, has_gmac4 and has_xgmac ints, of which only one
can be set when matching a core to its driver backend, with an
enumerated type carrying the DWMAC core type.

Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Chen-Yu Tsai <wens@kernel.org>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/E1vB6ld-0000000BIPy-2Qi4@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge branch 'net-ravb-fix-soc-specific-configuration-and-descriptor-handling-issues'
Jakub Kicinski [Thu, 23 Oct 2025 01:08:06 +0000 (18:08 -0700)] 
Merge branch 'net-ravb-fix-soc-specific-configuration-and-descriptor-handling-issues'

Lad Prabhakar says:

====================
net: ravb: Fix SoC-specific configuration and descriptor handling issues [part]

This series addresses several issues in the Renesas Ethernet AVB (ravb)
driver related descriptor ordering.

A potential ordering hazard in descriptor setup could cause
the DMA engine to start prematurely, leading to TX stalls on some
platforms.

The series includes the following changes:

Enforce descriptor type ordering to prevent early DMA start
Ensure proper write ordering of TX descriptor type fields to prevent the
DMA engine from observing an incomplete descriptor chain. This fixes
observed TX stalls on RZ/G2L platforms running RT kernels.

Tested on R/G1x Gen2, RZ/G2x Gen3 and RZ/G2L family hardware.
====================

Link: https://patch.msgid.link/20251017151830.171062-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: ravb: Ensure memory write completes before ringing TX doorbell
Lad Prabhakar [Fri, 17 Oct 2025 15:18:30 +0000 (16:18 +0100)] 
net: ravb: Ensure memory write completes before ringing TX doorbell

Add a final dma_wmb() barrier before triggering the transmit request
(TCCR_TSRQ) to ensure all descriptor and buffer writes are visible to
the DMA engine.

According to the hardware manual, a read-back operation is required
before writing to the doorbell register to guarantee completion of
previous writes. Instead of performing a dummy read, a dma_wmb() is
used to both enforce the same ordering semantics on the CPU side and
also to ensure completion of writes.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Cc: stable@vger.kernel.org
Co-developed-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20251017151830.171062-5-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: ravb: Enforce descriptor type ordering
Lad Prabhakar [Fri, 17 Oct 2025 15:18:29 +0000 (16:18 +0100)] 
net: ravb: Enforce descriptor type ordering

Ensure the TX descriptor type fields are published in a safe order so the
DMA engine never begins processing a descriptor chain before all descriptor
fields are fully initialised.

For multi-descriptor transmits the driver writes DT_FEND into the last
descriptor and DT_FSTART into the first. The DMA engine begins processing
when it observes DT_FSTART. Move the dma_wmb() barrier so it executes
immediately after DT_FEND and immediately before writing DT_FSTART
(and before DT_FSINGLE in the single-descriptor case). This guarantees
that all prior CPU writes to the descriptor memory are visible to the
device before DT_FSTART is seen.

This avoids a situation where compiler/CPU reordering could publish
DT_FSTART ahead of DT_FEND or other descriptor fields, allowing the DMA to
start on a partially initialised chain and causing corrupted transmissions
or TX timeouts. Such a failure was observed on RZ/G2L with an RT kernel as
transmit queue timeouts and device resets.

Fixes: 2f45d1902acf ("ravb: minimize TX data copying")
Cc: stable@vger.kernel.org
Co-developed-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20251017151830.171062-4-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Thu, 23 Oct 2025 01:00:34 +0000 (15:00 -1000)] 
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "All driver fixes. The big change is the storvsc one to rejig the
  hyper-v channel handling to be more efficient for SMP virtual
  machines"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: phy: dt-bindings: Add QMP UFS PHY compatible for Kaanapali
  scsi: ufs: qcom: dt-bindings: Document the Kaanapali UFS controller
  scsi: libfc: Prevent integer overflow in fc_fcp_recv_data()
  scsi: qla4xxx: Fix typos in comments
  scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU