From: Sasha Levin Date: Sat, 25 Oct 2025 22:39:52 +0000 (-0400) Subject: Fixes for all trees X-Git-Tag: v5.4.301~42 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=bef85054bb692b439f3badd97cf016a686a59274;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for all trees Signed-off-by: Sasha Levin --- diff --git a/queue-5.10/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch b/queue-5.10/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch new file mode 100644 index 0000000000..8f2e36b4c5 --- /dev/null +++ b/queue-5.10/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch @@ -0,0 +1,71 @@ +From c6be02ab82675e67cc12278cd79528253562514c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:37:12 +0800 +Subject: arm64, mm: avoid always making PTE dirty in pte_mkwrite() + +From: Huang Ying + +[ Upstream commit 143937ca51cc6ae2fccc61a1cb916abb24cd34f5 ] + +Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may +mark some pages that are never written dirty wrongly. For example, +do_swap_page() may map the exclusive pages with writable and clean PTEs +if the VMA is writable and the page fault is for read access. +However, current pte_mkwrite_novma() implementation always dirties the +PTE. This may cause unnecessary disk writing if the pages are +never written before being reclaimed. + +So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the +PTE_DIRTY bit is set to make it possible to make the PTE writable and +clean. + +The current behavior was introduced in commit 73e86cb03cf2 ("arm64: +Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, +pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only +clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits +are set. 
+ +To test the performance impact of the patch, on an arm64 server +machine, run 16 redis-server processes on socket 1 and 16 +memtier_benchmark processes on socket 0 with mostly get +transactions (that is, redis-server will mostly read memory only). +The memory footprint of redis-server is larger than the available +memory, so swap out/in will be triggered. Test results show that the +patch can avoid most swapping out because the pages are mostly clean. +And the benchmark throughput improves ~23.9% in the test. + +Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") +Signed-off-by: Huang Ying +Cc: Will Deacon +Cc: Anshuman Khandual +Cc: Ryan Roberts +Cc: Gavin Shan +Cc: Ard Biesheuvel +Cc: Matthew Wilcox (Oracle) +Cc: Yicong Yang +Cc: linux-arm-kernel@lists.infradead.org +Cc: linux-kernel@vger.kernel.org +Reviewed-by: Catalin Marinas +Signed-off-by: Catalin Marinas +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/pgtable.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h +index d92b5aed354e9..9bf40864b6e4f 100644 +--- a/arch/arm64/include/asm/pgtable.h ++++ b/arch/arm64/include/asm/pgtable.h +@@ -174,7 +174,8 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot) + static inline pte_t pte_mkwrite(pte_t pte) + { + pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); +- pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); ++ if (pte_sw_dirty(pte)) ++ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); + return pte; + } + +-- +2.51.0 + diff --git a/queue-5.10/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch b/queue-5.10/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch new file mode 100644 index 0000000000..8120db2215 --- /dev/null +++ b/queue-5.10/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch @@ -0,0 +1,50 @@ +From 6672aa1bf3525a072563c8be728d42fd93c7c8d2 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 
2025 16:58:07 +0300
+Subject: dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path
+
+From: Ioana Ciornei
+
+[ Upstream commit 902e81e679d86846a2404630d349709ad9372d0d ]
+
+The blamed commit increased the needed headroom to account for
+alignment. This means that the size required to always align a Tx buffer
+was added inside the dpaa2_eth_needed_headroom() function. By doing
+that, a manual adjustment of the pointer passed to PTR_ALIGN() was no
+longer correct since the 'buffer_start' variable was already pointing
+to the start of the skb's memory.
+
+The behavior of the dpaa2-eth driver without this patch was to drop
+frames on Tx even when the headroom was matching the 128 bytes
+necessary. Fix this by removing the manual adjustment of 'buffer_start'
+from the PTR_ALIGN() call.
+
+Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fastmail.com/T/#u
+Fixes: f422abe3f23d ("dpaa2-eth: increase the needed headroom to account for alignment")
+Signed-off-by: Ioana Ciornei
+Tested-by: Mathew McBride
+Reviewed-by: Simon Horman
+Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com
+Signed-off-by: Jakub Kicinski
+Signed-off-by: Sasha Levin
+---
+ drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 +--
+ 1 file changed, 1 insertion(+), 2 deletions(-)
+
+diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+index 0a1a7d94583b4..d27f5a8e59dcd 100644
+--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
++++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+@@ -994,8 +994,7 @@ static int dpaa2_eth_build_single_fd(struct dpaa2_eth_priv *priv,
+ 	dma_addr_t addr;
+ 
+ 	buffer_start = skb->data - dpaa2_eth_needed_headroom(skb);
+-	aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN,
+-				  DPAA2_ETH_TX_BUF_ALIGN);
++	aligned_start = PTR_ALIGN(buffer_start, DPAA2_ETH_TX_BUF_ALIGN);
+ 	if (aligned_start >= skb->head)
+ 		buffer_start = aligned_start;
+ 	else
+--
+2.51.0 + diff --git a/queue-5.10/net-add-ndo_fdb_del_bulk.patch b/queue-5.10/net-add-ndo_fdb_del_bulk.patch new file mode 100644 index 0000000000..969585d56a --- /dev/null +++ b/queue-5.10/net-add-ndo_fdb_del_bulk.patch @@ -0,0 +1,52 @@ +From fef15122646f571ef55085b9d9e90814a31435d1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:56 +0300 +Subject: net: add ndo_fdb_del_bulk + +From: Nikolay Aleksandrov + +[ Upstream commit 1306d5362a591493a2d07f685ed2cc480dcda320 ] + +Add a new netdev op called ndo_fdb_del_bulk, it will be later used for +driver-specific bulk delete implementation dispatched from rtnetlink. The +first user will be the bridge, we need it to signal to rtnetlink from +the driver that we support bulk delete operation (NLM_F_BULK). + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/linux/netdevice.h | 9 +++++++++ + 1 file changed, 9 insertions(+) + +diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h +index 06b37f45b67c9..d3a3e77a18df1 100644 +--- a/include/linux/netdevice.h ++++ b/include/linux/netdevice.h +@@ -1200,6 +1200,10 @@ struct netdev_net_notifier { + * struct net_device *dev, + * const unsigned char *addr, u16 vid) + * Deletes the FDB entry from dev coresponding to addr. 
++ * int (*ndo_fdb_del_bulk)(struct ndmsg *ndm, struct nlattr *tb[], ++ * struct net_device *dev, ++ * u16 vid, ++ * struct netlink_ext_ack *extack); + * int (*ndo_fdb_dump)(struct sk_buff *skb, struct netlink_callback *cb, + * struct net_device *dev, struct net_device *filter_dev, + * int *idx) +@@ -1452,6 +1456,11 @@ struct net_device_ops { + struct net_device *dev, + const unsigned char *addr, + u16 vid); ++ int (*ndo_fdb_del_bulk)(struct ndmsg *ndm, ++ struct nlattr *tb[], ++ struct net_device *dev, ++ u16 vid, ++ struct netlink_ext_ack *extack); + int (*ndo_fdb_dump)(struct sk_buff *skb, + struct netlink_callback *cb, + struct net_device *dev, +-- +2.51.0 + diff --git a/queue-5.10/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch b/queue-5.10/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch new file mode 100644 index 0000000000..1ff7e9688b --- /dev/null +++ b/queue-5.10/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch @@ -0,0 +1,54 @@ +From dd434de6d26ea49c136995862de01301f23e427a Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:01:31 +0800 +Subject: net: enetc: correct the value of ENETC_RXB_TRUESIZE + +From: Wei Fang + +[ Upstream commit e59bc32df2e989f034623a580e30a2a72af33b3f ] + +The ENETC RX ring uses the page halves flipping mechanism, each page is +split into two halves for the RX ring to use. And ENETC_RXB_TRUESIZE is +defined to 2048 to indicate the size of half a page. However, the page +size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, +but it could be configured to 16K or 64K. + +When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, +and the RX ring will always use the first half of the page. This is not +consistent with the description in the relevant kernel doc and commit +messages. 
+ +This issue is invisible in most cases, but if users want to increase +PAGE_SIZE to receive a Jumbo frame with a single buffer for some use +cases, it will not work as expected, because the buffer size of each +RX BD is fixed to 2048 bytes. + +Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE +to (PAGE_SIZE >> 1), as described in the comment. + +Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") +Signed-off-by: Wei Fang +Reviewed-by: Claudiu Manoil +Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h +index 725c3d1cbb198..f58cca219a1e7 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.h ++++ b/drivers/net/ethernet/freescale/enetc/enetc.h +@@ -28,7 +28,7 @@ struct enetc_tx_swbd { + }; + + #define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE +-#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */ ++#define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) + #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ + #define ENETC_RXB_DMA_SIZE \ + (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) +-- +2.51.0 + diff --git a/queue-5.10/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch b/queue-5.10/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch new file mode 100644 index 0000000000..3471805e21 --- /dev/null +++ b/queue-5.10/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch @@ -0,0 +1,43 @@ +From 96415de8a1c72a8795f5eeb66df71a2242b9c265 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:54 +0300 +Subject: net: netlink: add NLM_F_BULK delete request modifier + +From: Nikolay Aleksandrov + +[ Upstream commit 545528d788556c724eeb5400757f828ef27782a8 ] + +Add a new delete request 
modifier called NLM_F_BULK which, when +supported, would cause the request to delete multiple objects. The flag +is a convenient way to signal that a multiple delete operation is +requested which can be gradually added to different delete requests. In +order to make sure older kernels will error out if the operation is not +supported instead of doing something unintended we have to break a +required condition when implementing support for this flag, f.e. for +neighbors we will omit the mandatory mac address attribute. +Initially it will be used to add flush with filtering support for bridge +fdbs, but it also opens the door to add similar support to others. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/uapi/linux/netlink.h | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h +index 49751d5fee88a..9bcca78ee1f91 100644 +--- a/include/uapi/linux/netlink.h ++++ b/include/uapi/linux/netlink.h +@@ -72,6 +72,7 @@ struct nlmsghdr { + + /* Modifiers to DELETE request */ + #define NLM_F_NONREC 0x100 /* Do not delete recursively */ ++#define NLM_F_BULK 0x200 /* Delete multiple objects */ + + /* Flags for ACK message */ + #define NLM_F_CAPPED 0x100 /* request was capped */ +-- +2.51.0 + diff --git a/queue-5.10/net-rtnetlink-add-bulk-delete-support-flag.patch b/queue-5.10/net-rtnetlink-add-bulk-delete-support-flag.patch new file mode 100644 index 0000000000..1dec27988c --- /dev/null +++ b/queue-5.10/net-rtnetlink-add-bulk-delete-support-flag.patch @@ -0,0 +1,66 @@ +From c8a58df5ce10e79b2863b55939bd5b5a08e43a5e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:55 +0300 +Subject: net: rtnetlink: add bulk delete support flag + +From: Nikolay Aleksandrov + +[ Upstream commit a6cec0bcd34264be8887791594be793b3f12719f ] + +Add a new rtnl flag 
(RTNL_FLAG_BULK_DEL_SUPPORTED) which is used to +verify that the delete operation allows bulk object deletion. Also emit +a warning if anyone tries to set it for non-delete kind. + +Suggested-by: David Ahern +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 3 ++- + net/core/rtnetlink.c | 8 ++++++++ + 2 files changed, 10 insertions(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 030fc7eef7401..e893b1f21913e 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -10,7 +10,8 @@ typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *, + typedef int (*rtnl_dumpit_func)(struct sk_buff *, struct netlink_callback *); + + enum rtnl_link_flags { +- RTNL_FLAG_DOIT_UNLOCKED = BIT(0), ++ RTNL_FLAG_DOIT_UNLOCKED = BIT(0), ++ RTNL_FLAG_BULK_DEL_SUPPORTED = BIT(1), + }; + + enum rtnl_kinds { +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 2cd4990ac8bc3..d861744940c6f 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -214,6 +214,8 @@ static int rtnl_register_internal(struct module *owner, + if (dumpit) + link->dumpit = dumpit; + ++ WARN_ON(rtnl_msgtype_kind(msgtype) != RTNL_KIND_DEL && ++ (flags & RTNL_FLAG_BULK_DEL_SUPPORTED)); + link->flags |= flags; + + /* publish protocol:msgtype */ +@@ -5594,6 +5596,12 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + } + + flags = link->flags; ++ if (kind == RTNL_KIND_DEL && (nlh->nlmsg_flags & NLM_F_BULK) && ++ !(flags & RTNL_FLAG_BULK_DEL_SUPPORTED)) { ++ NL_SET_ERR_MSG(extack, "Bulk delete is not supported"); ++ goto err_unlock; ++ } ++ + if (flags & RTNL_FLAG_DOIT_UNLOCKED) { + doit = link->doit; + rcu_read_unlock(); +-- +2.51.0 + diff --git a/queue-5.10/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch 
b/queue-5.10/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch new file mode 100644 index 0000000000..668083fe8d --- /dev/null +++ b/queue-5.10/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch @@ -0,0 +1,53 @@ +From 22fcff877461fa890fcf5fb14b58435cb54732ae Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:52 +0300 +Subject: net: rtnetlink: add helper to extract msg type's kind + +From: Nikolay Aleksandrov + +[ Upstream commit 2e9ea3e30f696fd438319c07836422bb0bbb4608 ] + +Add a helper which extracts the msg type's kind using the kind mask (0x3). + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 6 ++++++ + net/core/rtnetlink.c | 2 +- + 2 files changed, 7 insertions(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 74eff5259b361..02b0636a4523d 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -19,6 +19,12 @@ enum rtnl_kinds { + RTNL_KIND_GET, + RTNL_KIND_SET + }; ++#define RTNL_KIND_MASK 0x3 ++ ++static inline enum rtnl_kinds rtnl_msgtype_kind(int msgtype) ++{ ++ return msgtype & RTNL_KIND_MASK; ++} + + void rtnl_register(int protocol, int msgtype, + rtnl_doit_func, rtnl_dumpit_func, unsigned int flags); +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index c0a3cf3ed8f34..2cd4990ac8bc3 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -5532,7 +5532,7 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + return 0; + + family = ((struct rtgenmsg *)nlmsg_data(nlh))->rtgen_family; +- kind = type&3; ++ kind = rtnl_msgtype_kind(type); + + if (kind != RTNL_KIND_GET && !netlink_net_capable(skb, CAP_NET_ADMIN)) + return -EPERM; +-- +2.51.0 + diff --git a/queue-5.10/net-rtnetlink-add-msg-kind-names.patch b/queue-5.10/net-rtnetlink-add-msg-kind-names.patch 
new file mode 100644 index 0000000000..80edaadd13 --- /dev/null +++ b/queue-5.10/net-rtnetlink-add-msg-kind-names.patch @@ -0,0 +1,73 @@ +From 4f83642fa38856c87e849ecd07b32b7a51ee9311 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:51 +0300 +Subject: net: rtnetlink: add msg kind names + +From: Nikolay Aleksandrov + +[ Upstream commit 12dc5c2cb7b269c5a1c6d02844f40bfce942a7a6 ] + +Add rtnl kind names instead of using raw values. We'll need to +check for DEL kind later to validate bulk flag support. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 7 +++++++ + net/core/rtnetlink.c | 6 +++--- + 2 files changed, 10 insertions(+), 3 deletions(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 5c2a73bbfabee..74eff5259b361 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -13,6 +13,13 @@ enum rtnl_link_flags { + RTNL_FLAG_DOIT_UNLOCKED = 1, + }; + ++enum rtnl_kinds { ++ RTNL_KIND_NEW, ++ RTNL_KIND_DEL, ++ RTNL_KIND_GET, ++ RTNL_KIND_SET ++}; ++ + void rtnl_register(int protocol, int msgtype, + rtnl_doit_func, rtnl_dumpit_func, unsigned int flags); + int rtnl_register_module(struct module *owner, int protocol, int msgtype, +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index bc86034e17eab..c0a3cf3ed8f34 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -5513,11 +5513,11 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + { + struct net *net = sock_net(skb->sk); + struct rtnl_link *link; ++ enum rtnl_kinds kind; + struct module *owner; + int err = -EOPNOTSUPP; + rtnl_doit_func doit; + unsigned int flags; +- int kind; + int family; + int type; + +@@ -5534,11 +5534,11 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + family = ((struct rtgenmsg 
*)nlmsg_data(nlh))->rtgen_family; + kind = type&3; + +- if (kind != 2 && !netlink_net_capable(skb, CAP_NET_ADMIN)) ++ if (kind != RTNL_KIND_GET && !netlink_net_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + + rcu_read_lock(); +- if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) { ++ if (kind == RTNL_KIND_GET && (nlh->nlmsg_flags & NLM_F_DUMP)) { + struct sock *rtnl; + rtnl_dumpit_func dumpit; + u32 min_dump_alloc = 0; +-- +2.51.0 + diff --git a/queue-5.10/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch b/queue-5.10/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch new file mode 100644 index 0000000000..f81deeb208 --- /dev/null +++ b/queue-5.10/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch @@ -0,0 +1,159 @@ +From 72b0f090a48cb18db845866246b36b4c4a00125e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:57 +0300 +Subject: net: rtnetlink: add NLM_F_BULK support to rtnl_fdb_del + +From: Nikolay Aleksandrov + +[ Upstream commit 9e83425993f38bb89e0ea07849ba0039a748e85b ] + +When NLM_F_BULK is specified in a fdb del message we need to handle it +differently. First since this is a new call we can strictly validate the +passed attributes, at first only ifindex and vlan are allowed as these +will be the initially supported filter attributes, any other attribute +is rejected. The mac address is no longer mandatory, but we use it +to error out in older kernels because it cannot be specified with bulk +request (the attribute is not allowed) and then we have to dispatch +the call to ndo_fdb_del_bulk if the device supports it. The del bulk +callback can do further validation of the attributes if necessary. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 67 +++++++++++++++++++++++++++++++------------- + 1 file changed, 48 insertions(+), 19 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index d861744940c6f..42c97a71174fe 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -4134,22 +4134,34 @@ int ndo_dflt_fdb_del(struct ndmsg *ndm, + } + EXPORT_SYMBOL(ndo_dflt_fdb_del); + ++static const struct nla_policy fdb_del_bulk_policy[NDA_MAX + 1] = { ++ [NDA_VLAN] = { .type = NLA_U16 }, ++ [NDA_IFINDEX] = NLA_POLICY_MIN(NLA_S32, 1), ++}; ++ + static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) + { ++ bool del_bulk = !!(nlh->nlmsg_flags & NLM_F_BULK); + struct net *net = sock_net(skb->sk); ++ const struct net_device_ops *ops; + struct ndmsg *ndm; + struct nlattr *tb[NDA_MAX+1]; + struct net_device *dev; +- __u8 *addr; ++ __u8 *addr = NULL; + int err; + u16 vid; + + if (!netlink_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + +- err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, NULL, +- extack); ++ if (!del_bulk) { ++ err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, ++ NULL, extack); ++ } else { ++ err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, ++ fdb_del_bulk_policy, extack); ++ } + if (err < 0) + return err; + +@@ -4165,9 +4177,12 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + return -ENODEV; + } + +- if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) { +- NL_SET_ERR_MSG(extack, "invalid address"); +- return -EINVAL; ++ if (!del_bulk) { ++ if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) { ++ NL_SET_ERR_MSG(extack, "invalid address"); ++ return -EINVAL; ++ } ++ addr = nla_data(tb[NDA_LLADDR]); + } + + if (dev->type != ARPHRD_ETHER) { +@@ -4175,8 +4190,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr 
*nlh, + return -EINVAL; + } + +- addr = nla_data(tb[NDA_LLADDR]); +- + err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack); + if (err) + return err; +@@ -4187,10 +4200,16 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + if ((!ndm->ndm_flags || ndm->ndm_flags & NTF_MASTER) && + netif_is_bridge_port(dev)) { + struct net_device *br_dev = netdev_master_upper_dev_get(dev); +- const struct net_device_ops *ops = br_dev->netdev_ops; + +- if (ops->ndo_fdb_del) +- err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ ops = br_dev->netdev_ops; ++ if (!del_bulk) { ++ if (ops->ndo_fdb_del) ++ err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ } else { ++ if (ops->ndo_fdb_del_bulk) ++ err = ops->ndo_fdb_del_bulk(ndm, tb, dev, vid, ++ extack); ++ } + + if (err) + goto out; +@@ -4200,15 +4219,24 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + + /* Embedded bridge, macvlan, and any other device support */ + if (ndm->ndm_flags & NTF_SELF) { +- if (dev->netdev_ops->ndo_fdb_del) +- err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr, +- vid); +- else +- err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid); ++ ops = dev->netdev_ops; ++ if (!del_bulk) { ++ if (ops->ndo_fdb_del) ++ err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ else ++ err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid); ++ } else { ++ /* in case err was cleared by NTF_MASTER call */ ++ err = -EOPNOTSUPP; ++ if (ops->ndo_fdb_del_bulk) ++ err = ops->ndo_fdb_del_bulk(ndm, tb, dev, vid, ++ extack); ++ } + + if (!err) { +- rtnl_fdb_notify(dev, addr, vid, RTM_DELNEIGH, +- ndm->ndm_state); ++ if (!del_bulk) ++ rtnl_fdb_notify(dev, addr, vid, RTM_DELNEIGH, ++ ndm->ndm_state); + ndm->ndm_flags &= ~NTF_SELF; + } + } +@@ -5730,7 +5758,8 @@ void __init rtnetlink_init(void) + rtnl_register(PF_UNSPEC, RTM_DELLINKPROP, rtnl_dellinkprop, NULL, 0); + + rtnl_register(PF_BRIDGE, RTM_NEWNEIGH, rtnl_fdb_add, NULL, 0); +- rtnl_register(PF_BRIDGE, RTM_DELNEIGH, rtnl_fdb_del, NULL, 0); ++ 
rtnl_register(PF_BRIDGE, RTM_DELNEIGH, rtnl_fdb_del, NULL, ++ RTNL_FLAG_BULK_DEL_SUPPORTED); + rtnl_register(PF_BRIDGE, RTM_GETNEIGH, rtnl_fdb_get, rtnl_fdb_dump, 0); + + rtnl_register(PF_BRIDGE, RTM_GETLINK, NULL, rtnl_bridge_getlink, 0); +-- +2.51.0 + diff --git a/queue-5.10/net-rtnetlink-use-bit-for-flag-values.patch b/queue-5.10/net-rtnetlink-use-bit-for-flag-values.patch new file mode 100644 index 0000000000..4b55a38c65 --- /dev/null +++ b/queue-5.10/net-rtnetlink-use-bit-for-flag-values.patch @@ -0,0 +1,35 @@ +From a4f22743a2ee42975f95f6d7e67e1e189be01327 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:53 +0300 +Subject: net: rtnetlink: use BIT for flag values + +From: Nikolay Aleksandrov + +[ Upstream commit 0569e31f1bc2f50613ba4c219f3ecc0d1174d841 ] + +Use BIT to define flag values. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 02b0636a4523d..030fc7eef7401 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -10,7 +10,7 @@ typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *, + typedef int (*rtnl_dumpit_func)(struct sk_buff *, struct netlink_callback *); + + enum rtnl_link_flags { +- RTNL_FLAG_DOIT_UNLOCKED = 1, ++ RTNL_FLAG_DOIT_UNLOCKED = BIT(0), + }; + + enum rtnl_kinds { +-- +2.51.0 + diff --git a/queue-5.10/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch b/queue-5.10/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch new file mode 100644 index 0000000000..b979dd6ac3 --- /dev/null +++ b/queue-5.10/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch @@ -0,0 +1,56 @@ +From 2048595a7e5060d81a5fa1c519eec5d442328c34 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 
2025 22:15:43 +0200 +Subject: rtnetlink: Allow deleting FDB entries in user namespace +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Johannes Wiesböck + +[ Upstream commit bf29555f5bdc017bac22ca66fcb6c9f46ec8788f ] + +Creating FDB entries is possible from a non-initial user namespace when +having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive +an EPERM because the capability is always checked against the initial +user namespace. This restricts the FDB management from unprivileged +containers. + +Drop the netlink_capable check in rtnl_fdb_del as it was originally +dropped in c5c351088ae7 and reintroduced in 1690be63a27b without +intention. + +This patch was tested using a container on GyroidOS, where it was +possible to delete FDB entries from an unprivileged user namespace and +private network namespace. + +Fixes: 1690be63a27b ("bridge: Add vlan support to static neighbors") +Reviewed-by: Michael Weiß +Tested-by: Harshal Gohel +Signed-off-by: Johannes Wiesböck +Reviewed-by: Ido Schimmel +Reviewed-by: Nikolay Aleksandrov +Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 42c97a71174fe..0b44c6b1ef999 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -4152,9 +4152,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + int err; + u16 vid; + +- if (!netlink_capable(skb, CAP_NET_ADMIN)) +- return -EPERM; +- + if (!del_bulk) { + err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, + NULL, extack); +-- +2.51.0 + diff --git a/queue-5.10/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch b/queue-5.10/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch new file mode 100644 index 0000000000..48853d7956 --- 
/dev/null +++ b/queue-5.10/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch @@ -0,0 +1,54 @@ +From 74e2e5459ab49fa62d74395afd93ed4de34b9241 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 16:00:36 +0300 +Subject: sctp: avoid NULL dereference when chunk data buffer is missing + +From: Alexey Simakov + +[ Upstream commit 441f0647f7673e0e64d4910ef61a5fb8f16bfb82 ] + +chunk->skb pointer is dereferenced in the if-block where it's supposed +to be NULL only. + +chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list +instead and do it just before replacing chunk->skb. We're sure that +otherwise chunk->skb is non-NULL because of outer if() condition. + +Fixes: 90017accff61 ("sctp: Add GSO support") +Signed-off-by: Alexey Simakov +Acked-by: Marcelo Ricardo Leitner +Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/sctp/inqueue.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c +index 7182c5a450fb5..6a434d441dc70 100644 +--- a/net/sctp/inqueue.c ++++ b/net/sctp/inqueue.c +@@ -163,13 +163,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) + chunk->head_skb = chunk->skb; + + /* skbs with "cover letter" */ +- if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) ++ if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) { ++ if (WARN_ON(!skb_shinfo(chunk->skb)->frag_list)) { ++ __SCTP_INC_STATS(dev_net(chunk->skb->dev), ++ SCTP_MIB_IN_PKT_DISCARDS); ++ sctp_chunk_free(chunk); ++ goto next_chunk; ++ } + chunk->skb = skb_shinfo(chunk->skb)->frag_list; +- +- if (WARN_ON(!chunk->skb)) { +- __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); +- sctp_chunk_free(chunk); +- goto next_chunk; + } + } + +-- +2.51.0 + diff --git a/queue-5.10/series b/queue-5.10/series index a74c2efd0d..d1c7c03bf1 100644 --- a/queue-5.10/series 
+++ b/queue-5.10/series @@ -265,3 +265,15 @@ dlm-check-for-defined-force-value-in-dlm_lockspace_r.patch hfs-fix-kmsan-uninit-value-issue-in-hfs_find_set_zer.patch hfsplus-return-eio-when-type-of-hidden-directory-mis.patch m68k-bitops-fix-find_-_bit-signatures.patch +net-rtnetlink-add-msg-kind-names.patch +net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch +net-rtnetlink-use-bit-for-flag-values.patch +net-netlink-add-nlm_f_bulk-delete-request-modifier.patch +net-rtnetlink-add-bulk-delete-support-flag.patch +net-add-ndo_fdb_del_bulk.patch +net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch +rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch +net-enetc-correct-the-value-of-enetc_rxb_truesize.patch +dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch +arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch +sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch diff --git a/queue-5.15/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch b/queue-5.15/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch new file mode 100644 index 0000000000..1aeabea372 --- /dev/null +++ b/queue-5.15/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch @@ -0,0 +1,71 @@ +From 47b637adc8a6c431e34445d3240767212d96b3ac Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:37:12 +0800 +Subject: arm64, mm: avoid always making PTE dirty in pte_mkwrite() + +From: Huang Ying + +[ Upstream commit 143937ca51cc6ae2fccc61a1cb916abb24cd34f5 ] + +Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may +mark some pages that are never written dirty wrongly. For example, +do_swap_page() may map the exclusive pages with writable and clean PTEs +if the VMA is writable and the page fault is for read access. +However, current pte_mkwrite_novma() implementation always dirties the +PTE. This may cause unnecessary disk writing if the pages are +never written before being reclaimed. 
+ +So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the +PTE_DIRTY bit is set to make it possible to make the PTE writable and +clean. + +The current behavior was introduced in commit 73e86cb03cf2 ("arm64: +Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, +pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only +clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits +are set. + +To test the performance impact of the patch, on an arm64 server +machine, run 16 redis-server processes on socket 1 and 16 +memtier_benchmark processes on socket 0 with mostly get +transactions (that is, redis-server will mostly read memory only). +The memory footprint of redis-server is larger than the available +memory, so swap out/in will be triggered. Test results show that the +patch can avoid most swapping out because the pages are mostly clean. +And the benchmark throughput improves ~23.9% in the test. + +Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") +Signed-off-by: Huang Ying +Cc: Will Deacon +Cc: Anshuman Khandual +Cc: Ryan Roberts +Cc: Gavin Shan +Cc: Ard Biesheuvel +Cc: Matthew Wilcox (Oracle) +Cc: Yicong Yang +Cc: linux-arm-kernel@lists.infradead.org +Cc: linux-kernel@vger.kernel.org +Reviewed-by: Catalin Marinas +Signed-off-by: Catalin Marinas +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/pgtable.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h +index a0bfa9cd76dab..a1902dcf7a7e3 100644 +--- a/arch/arm64/include/asm/pgtable.h ++++ b/arch/arm64/include/asm/pgtable.h +@@ -175,7 +175,8 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot) + static inline pte_t pte_mkwrite(pte_t pte) + { + pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); +- pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); ++ if (pte_sw_dirty(pte)) ++ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); + return pte; + } 
+ +-- +2.51.0 + diff --git a/queue-5.15/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch b/queue-5.15/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch new file mode 100644 index 0000000000..82a29bbbbb --- /dev/null +++ b/queue-5.15/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch @@ -0,0 +1,50 @@ +From 150eccbe79588ca884f52fbc0caa5cda167ccc94 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:58:07 +0300 +Subject: dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path + +From: Ioana Ciornei + +[ Upstream commit 902e81e679d86846a2404630d349709ad9372d0d ] + +The blamed commit increased the needed headroom to account for +alignment. This means that the size required to always align a Tx buffer +was added inside the dpaa2_eth_needed_headroom() function. By doing +that, a manual adjustment of the pointer passed to PTR_ALIGN() was no +longer correct since the 'buffer_start' variable was already pointing +to the start of the skb's memory. + +The behavior of the dpaa2-eth driver without this patch was to drop +frames on Tx even when the headroom was matching the 128 bytes +necessary. Fix this by removing the manual adjust of 'buffer_start' from +the PTR_MODE call. 
+ +Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fastmail.com/T/#u +Fixes: f422abe3f23d ("dpaa2-eth: increase the needed headroom to account for alignment") +Signed-off-by: Ioana Ciornei +Tested-by: Mathew McBride +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +index 7554cf37507df..0439bf465fa5b 100644 +--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c ++++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +@@ -1018,8 +1018,7 @@ static int dpaa2_eth_build_single_fd(struct dpaa2_eth_priv *priv, + dma_addr_t addr; + + buffer_start = skb->data - dpaa2_eth_needed_headroom(skb); +- aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN, +- DPAA2_ETH_TX_BUF_ALIGN); ++ aligned_start = PTR_ALIGN(buffer_start, DPAA2_ETH_TX_BUF_ALIGN); + if (aligned_start >= skb->head) + buffer_start = aligned_start; + else +-- +2.51.0 + diff --git a/queue-5.15/net-add-ndo_fdb_del_bulk.patch b/queue-5.15/net-add-ndo_fdb_del_bulk.patch new file mode 100644 index 0000000000..e86024f7a4 --- /dev/null +++ b/queue-5.15/net-add-ndo_fdb_del_bulk.patch @@ -0,0 +1,52 @@ +From 8657fa3780fc789a4a81f287883e7af31fd31339 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:56 +0300 +Subject: net: add ndo_fdb_del_bulk + +From: Nikolay Aleksandrov + +[ Upstream commit 1306d5362a591493a2d07f685ed2cc480dcda320 ] + +Add a new netdev op called ndo_fdb_del_bulk, it will be later used for +driver-specific bulk delete implementation dispatched from rtnetlink. The +first user will be the bridge, we need it to signal to rtnetlink from +the driver that we support bulk delete operation (NLM_F_BULK). 
+ +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/linux/netdevice.h | 9 +++++++++ + 1 file changed, 9 insertions(+) + +diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h +index 179c569a55c42..83bb0f21b1b02 100644 +--- a/include/linux/netdevice.h ++++ b/include/linux/netdevice.h +@@ -1273,6 +1273,10 @@ struct netdev_net_notifier { + * struct net_device *dev, + * const unsigned char *addr, u16 vid) + * Deletes the FDB entry from dev coresponding to addr. ++ * int (*ndo_fdb_del_bulk)(struct ndmsg *ndm, struct nlattr *tb[], ++ * struct net_device *dev, ++ * u16 vid, ++ * struct netlink_ext_ack *extack); + * int (*ndo_fdb_dump)(struct sk_buff *skb, struct netlink_callback *cb, + * struct net_device *dev, struct net_device *filter_dev, + * int *idx) +@@ -1528,6 +1532,11 @@ struct net_device_ops { + struct net_device *dev, + const unsigned char *addr, + u16 vid); ++ int (*ndo_fdb_del_bulk)(struct ndmsg *ndm, ++ struct nlattr *tb[], ++ struct net_device *dev, ++ u16 vid, ++ struct netlink_ext_ack *extack); + int (*ndo_fdb_dump)(struct sk_buff *skb, + struct netlink_callback *cb, + struct net_device *dev, +-- +2.51.0 + diff --git a/queue-5.15/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch b/queue-5.15/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch new file mode 100644 index 0000000000..5927911231 --- /dev/null +++ b/queue-5.15/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch @@ -0,0 +1,54 @@ +From f92741dd81287a500d6ccc1b4cdd2fb50e4a65d7 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:01:31 +0800 +Subject: net: enetc: correct the value of ENETC_RXB_TRUESIZE + +From: Wei Fang + +[ Upstream commit e59bc32df2e989f034623a580e30a2a72af33b3f ] + +The ENETC RX ring uses the page halves flipping mechanism, each page is +split into two halves for the RX ring to use. 
And ENETC_RXB_TRUESIZE is +defined to 2048 to indicate the size of half a page. However, the page +size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, +but it could be configured to 16K or 64K. + +When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, +and the RX ring will always use the first half of the page. This is not +consistent with the description in the relevant kernel doc and commit +messages. + +This issue is invisible in most cases, but if users want to increase +PAGE_SIZE to receive a Jumbo frame with a single buffer for some use +cases, it will not work as expected, because the buffer size of each +RX BD is fixed to 2048 bytes. + +Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE +to (PAGE_SIZE >> 1), as described in the comment. + +Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") +Signed-off-by: Wei Fang +Reviewed-by: Claudiu Manoil +Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h +index a3b936375c561..40c8f0f026a5b 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.h ++++ b/drivers/net/ethernet/freescale/enetc/enetc.h +@@ -37,7 +37,7 @@ struct enetc_tx_swbd { + }; + + #define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE +-#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */ ++#define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) + #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ + #define ENETC_RXB_DMA_SIZE \ + (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) +-- +2.51.0 + diff --git a/queue-5.15/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch b/queue-5.15/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch new file mode 100644 index 
0000000000..9e19deefee --- /dev/null +++ b/queue-5.15/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch @@ -0,0 +1,43 @@ +From 889b9e2a27a8a58210c567dfc040323b5ac576ec Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:54 +0300 +Subject: net: netlink: add NLM_F_BULK delete request modifier + +From: Nikolay Aleksandrov + +[ Upstream commit 545528d788556c724eeb5400757f828ef27782a8 ] + +Add a new delete request modifier called NLM_F_BULK which, when +supported, would cause the request to delete multiple objects. The flag +is a convenient way to signal that a multiple delete operation is +requested which can be gradually added to different delete requests. In +order to make sure older kernels will error out if the operation is not +supported instead of doing something unintended we have to break a +required condition when implementing support for this flag, f.e. for +neighbors we will omit the mandatory mac address attribute. +Initially it will be used to add flush with filtering support for bridge +fdbs, but it also opens the door to add similar support to others. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/uapi/linux/netlink.h | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h +index 4940a93315995..1e543cf0568c0 100644 +--- a/include/uapi/linux/netlink.h ++++ b/include/uapi/linux/netlink.h +@@ -72,6 +72,7 @@ struct nlmsghdr { + + /* Modifiers to DELETE request */ + #define NLM_F_NONREC 0x100 /* Do not delete recursively */ ++#define NLM_F_BULK 0x200 /* Delete multiple objects */ + + /* Flags for ACK message */ + #define NLM_F_CAPPED 0x100 /* request was capped */ +-- +2.51.0 + diff --git a/queue-5.15/net-rtnetlink-add-bulk-delete-support-flag.patch b/queue-5.15/net-rtnetlink-add-bulk-delete-support-flag.patch new file mode 100644 index 0000000000..e4ae2adaeb --- /dev/null +++ b/queue-5.15/net-rtnetlink-add-bulk-delete-support-flag.patch @@ -0,0 +1,66 @@ +From c0fc810275c8749af1ce12d559baf3f8da48111e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:55 +0300 +Subject: net: rtnetlink: add bulk delete support flag + +From: Nikolay Aleksandrov + +[ Upstream commit a6cec0bcd34264be8887791594be793b3f12719f ] + +Add a new rtnl flag (RTNL_FLAG_BULK_DEL_SUPPORTED) which is used to +verify that the delete operation allows bulk object deletion. Also emit +a warning if anyone tries to set it for non-delete kind. + +Suggested-by: David Ahern +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 3 ++- + net/core/rtnetlink.c | 8 ++++++++ + 2 files changed, 10 insertions(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 268eadbbaa300..fdc7b4ce0ef7b 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -10,7 +10,8 @@ typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *, + typedef int (*rtnl_dumpit_func)(struct sk_buff *, struct netlink_callback *); + + enum rtnl_link_flags { +- RTNL_FLAG_DOIT_UNLOCKED = BIT(0), ++ RTNL_FLAG_DOIT_UNLOCKED = BIT(0), ++ RTNL_FLAG_BULK_DEL_SUPPORTED = BIT(1), + }; + + enum rtnl_kinds { +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 79fb6d74e6dab..61ab0497ac755 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -214,6 +214,8 @@ static int rtnl_register_internal(struct module *owner, + if (dumpit) + link->dumpit = dumpit; + ++ WARN_ON(rtnl_msgtype_kind(msgtype) != RTNL_KIND_DEL && ++ (flags & RTNL_FLAG_BULK_DEL_SUPPORTED)); + link->flags |= flags; + + /* publish protocol:msgtype */ +@@ -5634,6 +5636,12 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + } + + flags = link->flags; ++ if (kind == RTNL_KIND_DEL && (nlh->nlmsg_flags & NLM_F_BULK) && ++ !(flags & RTNL_FLAG_BULK_DEL_SUPPORTED)) { ++ NL_SET_ERR_MSG(extack, "Bulk delete is not supported"); ++ goto err_unlock; ++ } ++ + if (flags & RTNL_FLAG_DOIT_UNLOCKED) { + doit = link->doit; + rcu_read_unlock(); +-- +2.51.0 + diff --git a/queue-5.15/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch b/queue-5.15/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch new file mode 100644 index 0000000000..428974a96c --- /dev/null +++ b/queue-5.15/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch @@ -0,0 +1,53 @@ +From 00a720fe3bf5b74578fb61ff7263e94a17718c03 Mon Sep 17 00:00:00 2001 
+From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:52 +0300 +Subject: net: rtnetlink: add helper to extract msg type's kind + +From: Nikolay Aleksandrov + +[ Upstream commit 2e9ea3e30f696fd438319c07836422bb0bbb4608 ] + +Add a helper which extracts the msg type's kind using the kind mask (0x3). + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 6 ++++++ + net/core/rtnetlink.c | 2 +- + 2 files changed, 7 insertions(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index dcb1c92e69879..d2961e2ed30bd 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -19,6 +19,12 @@ enum rtnl_kinds { + RTNL_KIND_GET, + RTNL_KIND_SET + }; ++#define RTNL_KIND_MASK 0x3 ++ ++static inline enum rtnl_kinds rtnl_msgtype_kind(int msgtype) ++{ ++ return msgtype & RTNL_KIND_MASK; ++} + + struct rtnl_msg_handler { + struct module *owner; +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index e8e67429e437f..79fb6d74e6dab 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -5572,7 +5572,7 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + return 0; + + family = ((struct rtgenmsg *)nlmsg_data(nlh))->rtgen_family; +- kind = type&3; ++ kind = rtnl_msgtype_kind(type); + + if (kind != RTNL_KIND_GET && !netlink_net_capable(skb, CAP_NET_ADMIN)) + return -EPERM; +-- +2.51.0 + diff --git a/queue-5.15/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch b/queue-5.15/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch new file mode 100644 index 0000000000..13d147b64d --- /dev/null +++ b/queue-5.15/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch @@ -0,0 +1,159 @@ +From 2f39f6a914737c7904d85b3531a46f9ec05c91ff Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:57 +0300 +Subject: net: rtnetlink: 
add NLM_F_BULK support to rtnl_fdb_del + +From: Nikolay Aleksandrov + +[ Upstream commit 9e83425993f38bb89e0ea07849ba0039a748e85b ] + +When NLM_F_BULK is specified in a fdb del message we need to handle it +differently. First since this is a new call we can strictly validate the +passed attributes, at first only ifindex and vlan are allowed as these +will be the initially supported filter attributes, any other attribute +is rejected. The mac address is no longer mandatory, but we use it +to error out in older kernels because it cannot be specified with bulk +request (the attribute is not allowed) and then we have to dispatch +the call to ndo_fdb_del_bulk if the device supports it. The del bulk +callback can do further validation of the attributes if necessary. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 67 +++++++++++++++++++++++++++++++------------- + 1 file changed, 48 insertions(+), 19 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 61ab0497ac755..08bb8e09994db 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -4174,22 +4174,34 @@ int ndo_dflt_fdb_del(struct ndmsg *ndm, + } + EXPORT_SYMBOL(ndo_dflt_fdb_del); + ++static const struct nla_policy fdb_del_bulk_policy[NDA_MAX + 1] = { ++ [NDA_VLAN] = { .type = NLA_U16 }, ++ [NDA_IFINDEX] = NLA_POLICY_MIN(NLA_S32, 1), ++}; ++ + static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) + { ++ bool del_bulk = !!(nlh->nlmsg_flags & NLM_F_BULK); + struct net *net = sock_net(skb->sk); ++ const struct net_device_ops *ops; + struct ndmsg *ndm; + struct nlattr *tb[NDA_MAX+1]; + struct net_device *dev; +- __u8 *addr; ++ __u8 *addr = NULL; + int err; + u16 vid; + + if (!netlink_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + +- err = nlmsg_parse_deprecated(nlh, 
sizeof(*ndm), tb, NDA_MAX, NULL, +- extack); ++ if (!del_bulk) { ++ err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, ++ NULL, extack); ++ } else { ++ err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, ++ fdb_del_bulk_policy, extack); ++ } + if (err < 0) + return err; + +@@ -4205,9 +4217,12 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + return -ENODEV; + } + +- if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) { +- NL_SET_ERR_MSG(extack, "invalid address"); +- return -EINVAL; ++ if (!del_bulk) { ++ if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) { ++ NL_SET_ERR_MSG(extack, "invalid address"); ++ return -EINVAL; ++ } ++ addr = nla_data(tb[NDA_LLADDR]); + } + + if (dev->type != ARPHRD_ETHER) { +@@ -4215,8 +4230,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + return -EINVAL; + } + +- addr = nla_data(tb[NDA_LLADDR]); +- + err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack); + if (err) + return err; +@@ -4227,10 +4240,16 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + if ((!ndm->ndm_flags || ndm->ndm_flags & NTF_MASTER) && + netif_is_bridge_port(dev)) { + struct net_device *br_dev = netdev_master_upper_dev_get(dev); +- const struct net_device_ops *ops = br_dev->netdev_ops; + +- if (ops->ndo_fdb_del) +- err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ ops = br_dev->netdev_ops; ++ if (!del_bulk) { ++ if (ops->ndo_fdb_del) ++ err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ } else { ++ if (ops->ndo_fdb_del_bulk) ++ err = ops->ndo_fdb_del_bulk(ndm, tb, dev, vid, ++ extack); ++ } + + if (err) + goto out; +@@ -4240,15 +4259,24 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + + /* Embedded bridge, macvlan, and any other device support */ + if (ndm->ndm_flags & NTF_SELF) { +- if (dev->netdev_ops->ndo_fdb_del) +- err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr, +- vid); +- else +- err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid); ++ ops = 
dev->netdev_ops; ++ if (!del_bulk) { ++ if (ops->ndo_fdb_del) ++ err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ else ++ err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid); ++ } else { ++ /* in case err was cleared by NTF_MASTER call */ ++ err = -EOPNOTSUPP; ++ if (ops->ndo_fdb_del_bulk) ++ err = ops->ndo_fdb_del_bulk(ndm, tb, dev, vid, ++ extack); ++ } + + if (!err) { +- rtnl_fdb_notify(dev, addr, vid, RTM_DELNEIGH, +- ndm->ndm_state); ++ if (!del_bulk) ++ rtnl_fdb_notify(dev, addr, vid, RTM_DELNEIGH, ++ ndm->ndm_state); + ndm->ndm_flags &= ~NTF_SELF; + } + } +@@ -5770,7 +5798,8 @@ void __init rtnetlink_init(void) + rtnl_register(PF_UNSPEC, RTM_DELLINKPROP, rtnl_dellinkprop, NULL, 0); + + rtnl_register(PF_BRIDGE, RTM_NEWNEIGH, rtnl_fdb_add, NULL, 0); +- rtnl_register(PF_BRIDGE, RTM_DELNEIGH, rtnl_fdb_del, NULL, 0); ++ rtnl_register(PF_BRIDGE, RTM_DELNEIGH, rtnl_fdb_del, NULL, ++ RTNL_FLAG_BULK_DEL_SUPPORTED); + rtnl_register(PF_BRIDGE, RTM_GETNEIGH, rtnl_fdb_get, rtnl_fdb_dump, 0); + + rtnl_register(PF_BRIDGE, RTM_GETLINK, NULL, rtnl_bridge_getlink, 0); +-- +2.51.0 + diff --git a/queue-5.15/net-rtnetlink-use-bit-for-flag-values.patch b/queue-5.15/net-rtnetlink-use-bit-for-flag-values.patch new file mode 100644 index 0000000000..30e8228a3c --- /dev/null +++ b/queue-5.15/net-rtnetlink-use-bit-for-flag-values.patch @@ -0,0 +1,35 @@ +From cbc81c8793a2647e649abcafad534c9ba1ec1d91 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:53 +0300 +Subject: net: rtnetlink: use BIT for flag values + +From: Nikolay Aleksandrov + +[ Upstream commit 0569e31f1bc2f50613ba4c219f3ecc0d1174d841 ] + +Use BIT to define flag values. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index d2961e2ed30bd..268eadbbaa300 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -10,7 +10,7 @@ typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *, + typedef int (*rtnl_dumpit_func)(struct sk_buff *, struct netlink_callback *); + + enum rtnl_link_flags { +- RTNL_FLAG_DOIT_UNLOCKED = 1, ++ RTNL_FLAG_DOIT_UNLOCKED = BIT(0), + }; + + enum rtnl_kinds { +-- +2.51.0 + diff --git a/queue-5.15/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch b/queue-5.15/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch new file mode 100644 index 0000000000..82c71c815e --- /dev/null +++ b/queue-5.15/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch @@ -0,0 +1,56 @@ +From f108200ef89ecf8a15122612c286a663d6239332 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 22:15:43 +0200 +Subject: rtnetlink: Allow deleting FDB entries in user namespace +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Johannes Wiesböck + +[ Upstream commit bf29555f5bdc017bac22ca66fcb6c9f46ec8788f ] + +Creating FDB entries is possible from a non-initial user namespace when +having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive +an EPERM because the capability is always checked against the initial +user namespace. This restricts the FDB management from unprivileged +containers. + +Drop the netlink_capable check in rtnl_fdb_del as it was originally +dropped in c5c351088ae7 and reintroduced in 1690be63a27b without +intention. + +This patch was tested using a container on GyroidOS, where it was +possible to delete FDB entries from an unprivileged user namespace and +private network namespace. 
+ +Fixes: 1690be63a27b ("bridge: Add vlan support to static neighbors") +Reviewed-by: Michael Weiß +Tested-by: Harshal Gohel +Signed-off-by: Johannes Wiesböck +Reviewed-by: Ido Schimmel +Reviewed-by: Nikolay Aleksandrov +Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 08bb8e09994db..c44ab3b71f3e7 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -4192,9 +4192,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + int err; + u16 vid; + +- if (!netlink_capable(skb, CAP_NET_ADMIN)) +- return -EPERM; +- + if (!del_bulk) { + err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, + NULL, extack); +-- +2.51.0 + diff --git a/queue-5.15/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch b/queue-5.15/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch new file mode 100644 index 0000000000..35fd01c81c --- /dev/null +++ b/queue-5.15/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch @@ -0,0 +1,54 @@ +From d00c0033dcd620bc1b61f2edb6b1b5bb529b8843 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 16:00:36 +0300 +Subject: sctp: avoid NULL dereference when chunk data buffer is missing + +From: Alexey Simakov + +[ Upstream commit 441f0647f7673e0e64d4910ef61a5fb8f16bfb82 ] + +chunk->skb pointer is dereferenced in the if-block where it's supposed +to be NULL only. + +chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list +instead and do it just before replacing chunk->skb. We're sure that +otherwise chunk->skb is non-NULL because of outer if() condition. 
+ +Fixes: 90017accff61 ("sctp: Add GSO support") +Signed-off-by: Alexey Simakov +Acked-by: Marcelo Ricardo Leitner +Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/sctp/inqueue.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c +index 7182c5a450fb5..6a434d441dc70 100644 +--- a/net/sctp/inqueue.c ++++ b/net/sctp/inqueue.c +@@ -163,13 +163,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) + chunk->head_skb = chunk->skb; + + /* skbs with "cover letter" */ +- if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) ++ if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) { ++ if (WARN_ON(!skb_shinfo(chunk->skb)->frag_list)) { ++ __SCTP_INC_STATS(dev_net(chunk->skb->dev), ++ SCTP_MIB_IN_PKT_DISCARDS); ++ sctp_chunk_free(chunk); ++ goto next_chunk; ++ } + chunk->skb = skb_shinfo(chunk->skb)->frag_list; +- +- if (WARN_ON(!chunk->skb)) { +- __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); +- sctp_chunk_free(chunk); +- goto next_chunk; + } + } + +-- +2.51.0 + diff --git a/queue-5.15/series b/queue-5.15/series index a8be826e2b..e34f9a297d 100644 --- a/queue-5.15/series +++ b/queue-5.15/series @@ -49,3 +49,14 @@ dlm-check-for-defined-force-value-in-dlm_lockspace_r.patch hfs-fix-kmsan-uninit-value-issue-in-hfs_find_set_zer.patch hfsplus-return-eio-when-type-of-hidden-directory-mis.patch m68k-bitops-fix-find_-_bit-signatures.patch +net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch +net-rtnetlink-use-bit-for-flag-values.patch +net-netlink-add-nlm_f_bulk-delete-request-modifier.patch +net-rtnetlink-add-bulk-delete-support-flag.patch +net-add-ndo_fdb_del_bulk.patch +net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch +rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch +net-enetc-correct-the-value-of-enetc_rxb_truesize.patch 
+dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch +arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch +sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch diff --git a/queue-5.4/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch b/queue-5.4/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch new file mode 100644 index 0000000000..bd9904a872 --- /dev/null +++ b/queue-5.4/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch @@ -0,0 +1,71 @@ +From 9f6cb4504d7324f75b60e54250ee440fdfaa0d4a Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:37:12 +0800 +Subject: arm64, mm: avoid always making PTE dirty in pte_mkwrite() + +From: Huang Ying + +[ Upstream commit 143937ca51cc6ae2fccc61a1cb916abb24cd34f5 ] + +Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may +mark some pages that are never written dirty wrongly. For example, +do_swap_page() may map the exclusive pages with writable and clean PTEs +if the VMA is writable and the page fault is for read access. +However, current pte_mkwrite_novma() implementation always dirties the +PTE. This may cause unnecessary disk writing if the pages are +never written before being reclaimed. + +So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the +PTE_DIRTY bit is set to make it possible to make the PTE writable and +clean. + +The current behavior was introduced in commit 73e86cb03cf2 ("arm64: +Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, +pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only +clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits +are set. + +To test the performance impact of the patch, on an arm64 server +machine, run 16 redis-server processes on socket 1 and 16 +memtier_benchmark processes on socket 0 with mostly get +transactions (that is, redis-server will mostly read memory only). 
+The memory footprint of redis-server is larger than the available +memory, so swap out/in will be triggered. Test results show that the +patch can avoid most swapping out because the pages are mostly clean. +And the benchmark throughput improves ~23.9% in the test. + +Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") +Signed-off-by: Huang Ying +Cc: Will Deacon +Cc: Anshuman Khandual +Cc: Ryan Roberts +Cc: Gavin Shan +Cc: Ard Biesheuvel +Cc: Matthew Wilcox (Oracle) +Cc: Yicong Yang +Cc: linux-arm-kernel@lists.infradead.org +Cc: linux-kernel@vger.kernel.org +Reviewed-by: Catalin Marinas +Signed-off-by: Catalin Marinas +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/pgtable.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h +index 709badd4475f5..a05d782dcf5e9 100644 +--- a/arch/arm64/include/asm/pgtable.h ++++ b/arch/arm64/include/asm/pgtable.h +@@ -145,7 +145,8 @@ static inline pte_t set_pte_bit(pte_t pte, pgprot_t prot) + static inline pte_t pte_mkwrite(pte_t pte) + { + pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); +- pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); ++ if (pte_sw_dirty(pte)) ++ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); + return pte; + } + +-- +2.51.0 + diff --git a/queue-5.4/net-add-ndo_fdb_del_bulk.patch b/queue-5.4/net-add-ndo_fdb_del_bulk.patch new file mode 100644 index 0000000000..cad3d5766d --- /dev/null +++ b/queue-5.4/net-add-ndo_fdb_del_bulk.patch @@ -0,0 +1,52 @@ +From fc09f2f4efb3cf9b2b39df71e443dfea1ae62b33 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:56 +0300 +Subject: net: add ndo_fdb_del_bulk + +From: Nikolay Aleksandrov + +[ Upstream commit 1306d5362a591493a2d07f685ed2cc480dcda320 ] + +Add a new netdev op called ndo_fdb_del_bulk, it will be later used for +driver-specific bulk delete implementation dispatched from rtnetlink. 
The +first user will be the bridge, we need it to signal to rtnetlink from +the driver that we support bulk delete operation (NLM_F_BULK). + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/linux/netdevice.h | 9 +++++++++ + 1 file changed, 9 insertions(+) + +diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h +index f5c1058f565c8..037a48bc5690a 100644 +--- a/include/linux/netdevice.h ++++ b/include/linux/netdevice.h +@@ -1158,6 +1158,10 @@ struct tlsdev_ops; + * struct net_device *dev, + * const unsigned char *addr, u16 vid) + * Deletes the FDB entry from dev coresponding to addr. ++ * int (*ndo_fdb_del_bulk)(struct ndmsg *ndm, struct nlattr *tb[], ++ * struct net_device *dev, ++ * u16 vid, ++ * struct netlink_ext_ack *extack); + * int (*ndo_fdb_dump)(struct sk_buff *skb, struct netlink_callback *cb, + * struct net_device *dev, struct net_device *filter_dev, + * int *idx) +@@ -1396,6 +1400,11 @@ struct net_device_ops { + struct net_device *dev, + const unsigned char *addr, + u16 vid); ++ int (*ndo_fdb_del_bulk)(struct ndmsg *ndm, ++ struct nlattr *tb[], ++ struct net_device *dev, ++ u16 vid, ++ struct netlink_ext_ack *extack); + int (*ndo_fdb_dump)(struct sk_buff *skb, + struct netlink_callback *cb, + struct net_device *dev, +-- +2.51.0 + diff --git a/queue-5.4/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch b/queue-5.4/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch new file mode 100644 index 0000000000..9c3a010b66 --- /dev/null +++ b/queue-5.4/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch @@ -0,0 +1,54 @@ +From 10cb9522d7dad53e4a8f7b3d7a253ef4237327f5 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:01:31 +0800 +Subject: net: enetc: correct the value of ENETC_RXB_TRUESIZE + +From: Wei Fang + +[ Upstream commit e59bc32df2e989f034623a580e30a2a72af33b3f ] 
+ +The ENETC RX ring uses the page halves flipping mechanism, each page is +split into two halves for the RX ring to use. And ENETC_RXB_TRUESIZE is +defined to 2048 to indicate the size of half a page. However, the page +size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, +but it could be configured to 16K or 64K. + +When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, +and the RX ring will always use the first half of the page. This is not +consistent with the description in the relevant kernel doc and commit +messages. + +This issue is invisible in most cases, but if users want to increase +PAGE_SIZE to receive a Jumbo frame with a single buffer for some use +cases, it will not work as expected, because the buffer size of each +RX BD is fixed to 2048 bytes. + +Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE +to (PAGE_SIZE >> 1), as described in the comment. + +Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") +Signed-off-by: Wei Fang +Reviewed-by: Claudiu Manoil +Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h +index b8801a2b6a025..6203d117d0d2c 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.h ++++ b/drivers/net/ethernet/freescale/enetc/enetc.h +@@ -27,7 +27,7 @@ struct enetc_tx_swbd { + }; + + #define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE +-#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */ ++#define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) + #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ + #define ENETC_RXB_DMA_SIZE \ + (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) +-- +2.51.0 + diff --git 
a/queue-5.4/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch b/queue-5.4/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch new file mode 100644 index 0000000000..2afe7e1d4a --- /dev/null +++ b/queue-5.4/net-netlink-add-nlm_f_bulk-delete-request-modifier.patch @@ -0,0 +1,43 @@ +From a7b45b39636e9d646f02e3dc5910abd0ad557056 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:54 +0300 +Subject: net: netlink: add NLM_F_BULK delete request modifier + +From: Nikolay Aleksandrov + +[ Upstream commit 545528d788556c724eeb5400757f828ef27782a8 ] + +Add a new delete request modifier called NLM_F_BULK which, when +supported, would cause the request to delete multiple objects. The flag +is a convenient way to signal that a multiple delete operation is +requested which can be gradually added to different delete requests. In +order to make sure older kernels will error out if the operation is not +supported instead of doing something unintended we have to break a +required condition when implementing support for this flag, f.e. for +neighbors we will omit the mandatory mac address attribute. +Initially it will be used to add flush with filtering support for bridge +fdbs, but it also opens the door to add similar support to others. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/uapi/linux/netlink.h | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h +index cf4e4836338f6..9ad4c47dea844 100644 +--- a/include/uapi/linux/netlink.h ++++ b/include/uapi/linux/netlink.h +@@ -72,6 +72,7 @@ struct nlmsghdr { + + /* Modifiers to DELETE request */ + #define NLM_F_NONREC 0x100 /* Do not delete recursively */ ++#define NLM_F_BULK 0x200 /* Delete multiple objects */ + + /* Flags for ACK message */ + #define NLM_F_CAPPED 0x100 /* request was capped */ +-- +2.51.0 + diff --git a/queue-5.4/net-rtnetlink-add-bulk-delete-support-flag.patch b/queue-5.4/net-rtnetlink-add-bulk-delete-support-flag.patch new file mode 100644 index 0000000000..7ff3fe977b --- /dev/null +++ b/queue-5.4/net-rtnetlink-add-bulk-delete-support-flag.patch @@ -0,0 +1,66 @@ +From dcf75079b8244926fdf78ca4153d54a2ce6d7d47 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:55 +0300 +Subject: net: rtnetlink: add bulk delete support flag + +From: Nikolay Aleksandrov + +[ Upstream commit a6cec0bcd34264be8887791594be793b3f12719f ] + +Add a new rtnl flag (RTNL_FLAG_BULK_DEL_SUPPORTED) which is used to +verify that the delete operation allows bulk object deletion. Also emit +a warning if anyone tries to set it for non-delete kind. + +Suggested-by: David Ahern +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 3 ++- + net/core/rtnetlink.c | 8 ++++++++ + 2 files changed, 10 insertions(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 030fc7eef7401..e893b1f21913e 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -10,7 +10,8 @@ typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *, + typedef int (*rtnl_dumpit_func)(struct sk_buff *, struct netlink_callback *); + + enum rtnl_link_flags { +- RTNL_FLAG_DOIT_UNLOCKED = BIT(0), ++ RTNL_FLAG_DOIT_UNLOCKED = BIT(0), ++ RTNL_FLAG_BULK_DEL_SUPPORTED = BIT(1), + }; + + enum rtnl_kinds { +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index b41f31a09a7cd..c4b33a2ecac26 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -214,6 +214,8 @@ static int rtnl_register_internal(struct module *owner, + if (dumpit) + link->dumpit = dumpit; + ++ WARN_ON(rtnl_msgtype_kind(msgtype) != RTNL_KIND_DEL && ++ (flags & RTNL_FLAG_BULK_DEL_SUPPORTED)); + link->flags |= flags; + + /* publish protocol:msgtype */ +@@ -5274,6 +5276,12 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + } + + flags = link->flags; ++ if (kind == RTNL_KIND_DEL && (nlh->nlmsg_flags & NLM_F_BULK) && ++ !(flags & RTNL_FLAG_BULK_DEL_SUPPORTED)) { ++ NL_SET_ERR_MSG(extack, "Bulk delete is not supported"); ++ goto err_unlock; ++ } ++ + if (flags & RTNL_FLAG_DOIT_UNLOCKED) { + doit = link->doit; + rcu_read_unlock(); +-- +2.51.0 + diff --git a/queue-5.4/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch b/queue-5.4/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch new file mode 100644 index 0000000000..2405514d53 --- /dev/null +++ b/queue-5.4/net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch @@ -0,0 +1,53 @@ +From 6493d6ef421011cbe8e678d9e18b6e1845b65ff7 Mon Sep 17 00:00:00 2001 +From: 
Sasha Levin +Date: Wed, 13 Apr 2022 13:51:52 +0300 +Subject: net: rtnetlink: add helper to extract msg type's kind + +From: Nikolay Aleksandrov + +[ Upstream commit 2e9ea3e30f696fd438319c07836422bb0bbb4608 ] + +Add a helper which extracts the msg type's kind using the kind mask (0x3). + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 6 ++++++ + net/core/rtnetlink.c | 2 +- + 2 files changed, 7 insertions(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 74eff5259b361..02b0636a4523d 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -19,6 +19,12 @@ enum rtnl_kinds { + RTNL_KIND_GET, + RTNL_KIND_SET + }; ++#define RTNL_KIND_MASK 0x3 ++ ++static inline enum rtnl_kinds rtnl_msgtype_kind(int msgtype) ++{ ++ return msgtype & RTNL_KIND_MASK; ++} + + void rtnl_register(int protocol, int msgtype, + rtnl_doit_func, rtnl_dumpit_func, unsigned int flags); +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 2cdb07dd263bd..b41f31a09a7cd 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -5212,7 +5212,7 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + return 0; + + family = ((struct rtgenmsg *)nlmsg_data(nlh))->rtgen_family; +- kind = type&3; ++ kind = rtnl_msgtype_kind(type); + + if (kind != RTNL_KIND_GET && !netlink_net_capable(skb, CAP_NET_ADMIN)) + return -EPERM; +-- +2.51.0 + diff --git a/queue-5.4/net-rtnetlink-add-msg-kind-names.patch b/queue-5.4/net-rtnetlink-add-msg-kind-names.patch new file mode 100644 index 0000000000..cf5a105ca2 --- /dev/null +++ b/queue-5.4/net-rtnetlink-add-msg-kind-names.patch @@ -0,0 +1,73 @@ +From a8ab9e91ddbc3a8ed4a0628c2c420e12ea048f7c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:51 +0300 +Subject: net: rtnetlink: add msg kind names 
+ +From: Nikolay Aleksandrov + +[ Upstream commit 12dc5c2cb7b269c5a1c6d02844f40bfce942a7a6 ] + +Add rtnl kind names instead of using raw values. We'll need to +check for DEL kind later to validate bulk flag support. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 7 +++++++ + net/core/rtnetlink.c | 6 +++--- + 2 files changed, 10 insertions(+), 3 deletions(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 5c2a73bbfabee..74eff5259b361 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -13,6 +13,13 @@ enum rtnl_link_flags { + RTNL_FLAG_DOIT_UNLOCKED = 1, + }; + ++enum rtnl_kinds { ++ RTNL_KIND_NEW, ++ RTNL_KIND_DEL, ++ RTNL_KIND_GET, ++ RTNL_KIND_SET ++}; ++ + void rtnl_register(int protocol, int msgtype, + rtnl_doit_func, rtnl_dumpit_func, unsigned int flags); + int rtnl_register_module(struct module *owner, int protocol, int msgtype, +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index f1338734c2eee..2cdb07dd263bd 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -5193,11 +5193,11 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + { + struct net *net = sock_net(skb->sk); + struct rtnl_link *link; ++ enum rtnl_kinds kind; + struct module *owner; + int err = -EOPNOTSUPP; + rtnl_doit_func doit; + unsigned int flags; +- int kind; + int family; + int type; + +@@ -5214,11 +5214,11 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, + family = ((struct rtgenmsg *)nlmsg_data(nlh))->rtgen_family; + kind = type&3; + +- if (kind != 2 && !netlink_net_capable(skb, CAP_NET_ADMIN)) ++ if (kind != RTNL_KIND_GET && !netlink_net_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + + rcu_read_lock(); +- if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) { ++ if (kind == RTNL_KIND_GET && 
(nlh->nlmsg_flags & NLM_F_DUMP)) { + struct sock *rtnl; + rtnl_dumpit_func dumpit; + u16 min_dump_alloc = 0; +-- +2.51.0 + diff --git a/queue-5.4/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch b/queue-5.4/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch new file mode 100644 index 0000000000..a9e4bda9c0 --- /dev/null +++ b/queue-5.4/net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch @@ -0,0 +1,159 @@ +From 9dbc74c7dfbde6011de8ae7121deb60888ea2299 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:57 +0300 +Subject: net: rtnetlink: add NLM_F_BULK support to rtnl_fdb_del + +From: Nikolay Aleksandrov + +[ Upstream commit 9e83425993f38bb89e0ea07849ba0039a748e85b ] + +When NLM_F_BULK is specified in a fdb del message we need to handle it +differently. First since this is a new call we can strictly validate the +passed attributes, at first only ifindex and vlan are allowed as these +will be the initially supported filter attributes, any other attribute +is rejected. The mac address is no longer mandatory, but we use it +to error out in older kernels because it cannot be specified with bulk +request (the attribute is not allowed) and then we have to dispatch +the call to ndo_fdb_del_bulk if the device supports it. The del bulk +callback can do further validation of the attributes if necessary. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 67 +++++++++++++++++++++++++++++++------------- + 1 file changed, 48 insertions(+), 19 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index c4b33a2ecac26..3d3743ef4f691 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -3818,22 +3818,34 @@ int ndo_dflt_fdb_del(struct ndmsg *ndm, + } + EXPORT_SYMBOL(ndo_dflt_fdb_del); + ++static const struct nla_policy fdb_del_bulk_policy[NDA_MAX + 1] = { ++ [NDA_VLAN] = { .type = NLA_U16 }, ++ [NDA_IFINDEX] = NLA_POLICY_MIN(NLA_S32, 1), ++}; ++ + static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) + { ++ bool del_bulk = !!(nlh->nlmsg_flags & NLM_F_BULK); + struct net *net = sock_net(skb->sk); ++ const struct net_device_ops *ops; + struct ndmsg *ndm; + struct nlattr *tb[NDA_MAX+1]; + struct net_device *dev; +- __u8 *addr; ++ __u8 *addr = NULL; + int err; + u16 vid; + + if (!netlink_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + +- err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, NULL, +- extack); ++ if (!del_bulk) { ++ err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, ++ NULL, extack); ++ } else { ++ err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, ++ fdb_del_bulk_policy, extack); ++ } + if (err < 0) + return err; + +@@ -3849,9 +3861,12 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + return -ENODEV; + } + +- if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) { +- NL_SET_ERR_MSG(extack, "invalid address"); +- return -EINVAL; ++ if (!del_bulk) { ++ if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) { ++ NL_SET_ERR_MSG(extack, "invalid address"); ++ return -EINVAL; ++ } ++ addr = nla_data(tb[NDA_LLADDR]); + } + + if (dev->type != ARPHRD_ETHER) { +@@ -3859,8 +3874,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr 
*nlh, + return -EINVAL; + } + +- addr = nla_data(tb[NDA_LLADDR]); +- + err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack); + if (err) + return err; +@@ -3871,10 +3884,16 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + if ((!ndm->ndm_flags || ndm->ndm_flags & NTF_MASTER) && + netif_is_bridge_port(dev)) { + struct net_device *br_dev = netdev_master_upper_dev_get(dev); +- const struct net_device_ops *ops = br_dev->netdev_ops; + +- if (ops->ndo_fdb_del) +- err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ ops = br_dev->netdev_ops; ++ if (!del_bulk) { ++ if (ops->ndo_fdb_del) ++ err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ } else { ++ if (ops->ndo_fdb_del_bulk) ++ err = ops->ndo_fdb_del_bulk(ndm, tb, dev, vid, ++ extack); ++ } + + if (err) + goto out; +@@ -3884,15 +3903,24 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + + /* Embedded bridge, macvlan, and any other device support */ + if (ndm->ndm_flags & NTF_SELF) { +- if (dev->netdev_ops->ndo_fdb_del) +- err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr, +- vid); +- else +- err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid); ++ ops = dev->netdev_ops; ++ if (!del_bulk) { ++ if (ops->ndo_fdb_del) ++ err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid); ++ else ++ err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid); ++ } else { ++ /* in case err was cleared by NTF_MASTER call */ ++ err = -EOPNOTSUPP; ++ if (ops->ndo_fdb_del_bulk) ++ err = ops->ndo_fdb_del_bulk(ndm, tb, dev, vid, ++ extack); ++ } + + if (!err) { +- rtnl_fdb_notify(dev, addr, vid, RTM_DELNEIGH, +- ndm->ndm_state); ++ if (!del_bulk) ++ rtnl_fdb_notify(dev, addr, vid, RTM_DELNEIGH, ++ ndm->ndm_state); + ndm->ndm_flags &= ~NTF_SELF; + } + } +@@ -5407,7 +5435,8 @@ void __init rtnetlink_init(void) + rtnl_register(PF_UNSPEC, RTM_GETNETCONF, NULL, rtnl_dump_all, 0); + + rtnl_register(PF_BRIDGE, RTM_NEWNEIGH, rtnl_fdb_add, NULL, 0); +- rtnl_register(PF_BRIDGE, RTM_DELNEIGH, rtnl_fdb_del, NULL, 0); ++ 
rtnl_register(PF_BRIDGE, RTM_DELNEIGH, rtnl_fdb_del, NULL, ++ RTNL_FLAG_BULK_DEL_SUPPORTED); + rtnl_register(PF_BRIDGE, RTM_GETNEIGH, rtnl_fdb_get, rtnl_fdb_dump, 0); + + rtnl_register(PF_BRIDGE, RTM_GETLINK, NULL, rtnl_bridge_getlink, 0); +-- +2.51.0 + diff --git a/queue-5.4/net-rtnetlink-remove-redundant-assignment-to-variabl.patch b/queue-5.4/net-rtnetlink-remove-redundant-assignment-to-variabl.patch new file mode 100644 index 0000000000..3e9ae7b6ca --- /dev/null +++ b/queue-5.4/net-rtnetlink-remove-redundant-assignment-to-variabl.patch @@ -0,0 +1,39 @@ +From 480556efa4ca8c93423d5428a112da915b7d3bac Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 25 Apr 2020 12:28:14 +0100 +Subject: net: rtnetlink: remove redundant assignment to variable err + +From: Colin Ian King + +[ Upstream commit 7d3118016787b5c05da94b3bcdb96c9d6ff82c44 ] + +The variable err is being initializeed with a value that is never read +and it is being updated later with a new value. The initialization +is redundant and can be removed. + +Addresses-Coverity: ("Unused value") +Signed-off-by: Colin Ian King +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 2b7ad5cf8fbfd..f1338734c2eee 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -3823,8 +3823,8 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + struct ndmsg *ndm; + struct nlattr *tb[NDA_MAX+1]; + struct net_device *dev; +- int err = -EINVAL; + __u8 *addr; ++ int err; + u16 vid; + + if (!netlink_capable(skb, CAP_NET_ADMIN)) +-- +2.51.0 + diff --git a/queue-5.4/net-rtnetlink-use-bit-for-flag-values.patch b/queue-5.4/net-rtnetlink-use-bit-for-flag-values.patch new file mode 100644 index 0000000000..19dd704465 --- /dev/null +++ b/queue-5.4/net-rtnetlink-use-bit-for-flag-values.patch @@ -0,0 +1,35 @@ +From 54e716814343b9be927e9827b15f9d6f2fade062 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 13 Apr 2022 13:51:53 +0300 +Subject: net: rtnetlink: use BIT for flag values + +From: Nikolay Aleksandrov + +[ Upstream commit 0569e31f1bc2f50613ba4c219f3ecc0d1174d841 ] + +Use BIT to define flag values. + +Signed-off-by: Nikolay Aleksandrov +Signed-off-by: David S. 
Miller +Stable-dep-of: bf29555f5bdc ("rtnetlink: Allow deleting FDB entries in user namespace") +Signed-off-by: Sasha Levin +--- + include/net/rtnetlink.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h +index 02b0636a4523d..030fc7eef7401 100644 +--- a/include/net/rtnetlink.h ++++ b/include/net/rtnetlink.h +@@ -10,7 +10,7 @@ typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *, + typedef int (*rtnl_dumpit_func)(struct sk_buff *, struct netlink_callback *); + + enum rtnl_link_flags { +- RTNL_FLAG_DOIT_UNLOCKED = 1, ++ RTNL_FLAG_DOIT_UNLOCKED = BIT(0), + }; + + enum rtnl_kinds { +-- +2.51.0 + diff --git a/queue-5.4/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch b/queue-5.4/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch new file mode 100644 index 0000000000..34f41ea58d --- /dev/null +++ b/queue-5.4/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch @@ -0,0 +1,56 @@ +From 74a1770e3cb55743240fedfbb5bf04949e336f20 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 22:15:43 +0200 +Subject: rtnetlink: Allow deleting FDB entries in user namespace +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Johannes Wiesböck + +[ Upstream commit bf29555f5bdc017bac22ca66fcb6c9f46ec8788f ] + +Creating FDB entries is possible from a non-initial user namespace when +having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive +an EPERM because the capability is always checked against the initial +user namespace. This restricts the FDB management from unprivileged +containers. + +Drop the netlink_capable check in rtnl_fdb_del as it was originally +dropped in c5c351088ae7 and reintroduced in 1690be63a27b without +intention. + +This patch was tested using a container on GyroidOS, where it was +possible to delete FDB entries from an unprivileged user namespace and +private network namespace. 
+ +Fixes: 1690be63a27b ("bridge: Add vlan support to static neighbors") +Reviewed-by: Michael Weiß +Tested-by: Harshal Gohel +Signed-off-by: Johannes Wiesböck +Reviewed-by: Ido Schimmel +Reviewed-by: Nikolay Aleksandrov +Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 3d3743ef4f691..342b92afd1219 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -3836,9 +3836,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + int err; + u16 vid; + +- if (!netlink_capable(skb, CAP_NET_ADMIN)) +- return -EPERM; +- + if (!del_bulk) { + err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, + NULL, extack); +-- +2.51.0 + diff --git a/queue-5.4/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch b/queue-5.4/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch new file mode 100644 index 0000000000..1b5e7ad0cf --- /dev/null +++ b/queue-5.4/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch @@ -0,0 +1,54 @@ +From 05fffd25e3277cf959ac0563bdd43dcf4655d8ac Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 16:00:36 +0300 +Subject: sctp: avoid NULL dereference when chunk data buffer is missing + +From: Alexey Simakov + +[ Upstream commit 441f0647f7673e0e64d4910ef61a5fb8f16bfb82 ] + +chunk->skb pointer is dereferenced in the if-block where it's supposed +to be NULL only. + +chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list +instead and do it just before replacing chunk->skb. We're sure that +otherwise chunk->skb is non-NULL because of outer if() condition. 
+ +Fixes: 90017accff61 ("sctp: Add GSO support") +Signed-off-by: Alexey Simakov +Acked-by: Marcelo Ricardo Leitner +Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/sctp/inqueue.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c +index 7182c5a450fb5..6a434d441dc70 100644 +--- a/net/sctp/inqueue.c ++++ b/net/sctp/inqueue.c +@@ -163,13 +163,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) + chunk->head_skb = chunk->skb; + + /* skbs with "cover letter" */ +- if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) ++ if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) { ++ if (WARN_ON(!skb_shinfo(chunk->skb)->frag_list)) { ++ __SCTP_INC_STATS(dev_net(chunk->skb->dev), ++ SCTP_MIB_IN_PKT_DISCARDS); ++ sctp_chunk_free(chunk); ++ goto next_chunk; ++ } + chunk->skb = skb_shinfo(chunk->skb)->frag_list; +- +- if (WARN_ON(!chunk->skb)) { +- __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); +- sctp_chunk_free(chunk); +- goto next_chunk; + } + } + +-- +2.51.0 + diff --git a/queue-5.4/series b/queue-5.4/series index 79650bd2d6..1c38eb5262 100644 --- a/queue-5.4/series +++ b/queue-5.4/series @@ -179,3 +179,15 @@ dlm-check-for-defined-force-value-in-dlm_lockspace_r.patch hfs-fix-kmsan-uninit-value-issue-in-hfs_find_set_zer.patch hfsplus-return-eio-when-type-of-hidden-directory-mis.patch m68k-bitops-fix-find_-_bit-signatures.patch +net-rtnetlink-remove-redundant-assignment-to-variabl.patch +net-rtnetlink-add-msg-kind-names.patch +net-rtnetlink-add-helper-to-extract-msg-type-s-kind.patch +net-rtnetlink-use-bit-for-flag-values.patch +net-netlink-add-nlm_f_bulk-delete-request-modifier.patch +net-rtnetlink-add-bulk-delete-support-flag.patch +net-add-ndo_fdb_del_bulk.patch +net-rtnetlink-add-nlm_f_bulk-support-to-rtnl_fdb_del.patch 
+rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch +net-enetc-correct-the-value-of-enetc_rxb_truesize.patch +arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch +sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch diff --git a/queue-6.1/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch b/queue-6.1/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch new file mode 100644 index 0000000000..c1b3f61c66 --- /dev/null +++ b/queue-6.1/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch @@ -0,0 +1,71 @@ +From 72451ca559e47a76869238bd79561931235aa8ea Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:37:12 +0800 +Subject: arm64, mm: avoid always making PTE dirty in pte_mkwrite() + +From: Huang Ying + +[ Upstream commit 143937ca51cc6ae2fccc61a1cb916abb24cd34f5 ] + +Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may +mark some pages that are never written dirty wrongly. For example, +do_swap_page() may map the exclusive pages with writable and clean PTEs +if the VMA is writable and the page fault is for read access. +However, current pte_mkwrite_novma() implementation always dirties the +PTE. This may cause unnecessary disk writing if the pages are +never written before being reclaimed. + +So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the +PTE_DIRTY bit is set to make it possible to make the PTE writable and +clean. + +The current behavior was introduced in commit 73e86cb03cf2 ("arm64: +Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, +pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only +clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits +are set. + +To test the performance impact of the patch, on an arm64 server +machine, run 16 redis-server processes on socket 1 and 16 +memtier_benchmark processes on socket 0 with mostly get +transactions (that is, redis-server will mostly read memory only). 
+The memory footprint of redis-server is larger than the available +memory, so swap out/in will be triggered. Test results show that the +patch can avoid most swapping out because the pages are mostly clean. +And the benchmark throughput improves ~23.9% in the test. + +Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") +Signed-off-by: Huang Ying +Cc: Will Deacon +Cc: Anshuman Khandual +Cc: Ryan Roberts +Cc: Gavin Shan +Cc: Ard Biesheuvel +Cc: Matthew Wilcox (Oracle) +Cc: Yicong Yang +Cc: linux-arm-kernel@lists.infradead.org +Cc: linux-kernel@vger.kernel.org +Reviewed-by: Catalin Marinas +Signed-off-by: Catalin Marinas +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/pgtable.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h +index 426c3cb3e3bb1..62326f249aa71 100644 +--- a/arch/arm64/include/asm/pgtable.h ++++ b/arch/arm64/include/asm/pgtable.h +@@ -183,7 +183,8 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot) + static inline pte_t pte_mkwrite(pte_t pte) + { + pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); +- pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); ++ if (pte_sw_dirty(pte)) ++ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); + return pte; + } + +-- +2.51.0 + diff --git a/queue-6.1/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch b/queue-6.1/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch new file mode 100644 index 0000000000..219ca8714a --- /dev/null +++ b/queue-6.1/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch @@ -0,0 +1,50 @@ +From 4e50bc6f4ede46d6ec3833217a752ed6eb6f74da Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:58:07 +0300 +Subject: dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path + +From: Ioana Ciornei + +[ Upstream commit 902e81e679d86846a2404630d349709ad9372d0d ] + +The blamed commit increased the needed headroom to account for +alignment. 
This means that the size required to always align a Tx buffer +was added inside the dpaa2_eth_needed_headroom() function. By doing +that, a manual adjustment of the pointer passed to PTR_ALIGN() was no +longer correct since the 'buffer_start' variable was already pointing +to the start of the skb's memory. + +The behavior of the dpaa2-eth driver without this patch was to drop +frames on Tx even when the headroom was matching the 128 bytes +necessary. Fix this by removing the manual adjustment of 'buffer_start' from +the PTR_ALIGN() call. + +Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fastmail.com/T/#u +Fixes: f422abe3f23d ("dpaa2-eth: increase the needed headroom to account for alignment") +Signed-off-by: Ioana Ciornei +Tested-by: Mathew McBride +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +index dbc40e4514f0a..3c19be56af22e 100644 +--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c ++++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +@@ -1046,8 +1046,7 @@ static int dpaa2_eth_build_single_fd(struct dpaa2_eth_priv *priv, + dma_addr_t addr; + + buffer_start = skb->data - dpaa2_eth_needed_headroom(skb); +- aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN, +- DPAA2_ETH_TX_BUF_ALIGN); ++ aligned_start = PTR_ALIGN(buffer_start, DPAA2_ETH_TX_BUF_ALIGN); + if (aligned_start >= skb->head) + buffer_start = aligned_start; + else +-- +2.51.0 + diff --git a/queue-6.1/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch b/queue-6.1/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch new file mode 100644 index 0000000000..2361db729e --- /dev/null +++
b/queue-6.1/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch @@ -0,0 +1,54 @@ +From 58fe9141a77d25ac21d355241194a772bbfa08ef Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:01:31 +0800 +Subject: net: enetc: correct the value of ENETC_RXB_TRUESIZE + +From: Wei Fang + +[ Upstream commit e59bc32df2e989f034623a580e30a2a72af33b3f ] + +The ENETC RX ring uses the page halves flipping mechanism, each page is +split into two halves for the RX ring to use. And ENETC_RXB_TRUESIZE is +defined to 2048 to indicate the size of half a page. However, the page +size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, +but it could be configured to 16K or 64K. + +When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, +and the RX ring will always use the first half of the page. This is not +consistent with the description in the relevant kernel doc and commit +messages. + +This issue is invisible in most cases, but if users want to increase +PAGE_SIZE to receive a Jumbo frame with a single buffer for some use +cases, it will not work as expected, because the buffer size of each +RX BD is fixed to 2048 bytes. + +Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE +to (PAGE_SIZE >> 1), as described in the comment. 
+ +Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") +Signed-off-by: Wei Fang +Reviewed-by: Claudiu Manoil +Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h +index c6d8cc15c2701..aacdfe98b65ab 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.h ++++ b/drivers/net/ethernet/freescale/enetc/enetc.h +@@ -40,7 +40,7 @@ struct enetc_tx_swbd { + }; + + #define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE +-#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */ ++#define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) + #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ + #define ENETC_RXB_DMA_SIZE \ + (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) +-- +2.51.0 + diff --git a/queue-6.1/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch b/queue-6.1/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch new file mode 100644 index 0000000000..f5a91cbe49 --- /dev/null +++ b/queue-6.1/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch @@ -0,0 +1,158 @@ +From 18c00ec29df3d353de5407578e2bfe84f63c76dc Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:14:27 +0800 +Subject: net: enetc: fix the deadlock of enetc_mdio_lock + +From: Jianpeng Chang + +[ Upstream commit 50bd33f6b3922a6b760aa30d409cae891cec8fb5 ] + +After applying the workaround for err050089, the LS1028A platform +experiences RCU stalls on RT kernel. This issue is caused by the +recursive acquisition of the read lock enetc_mdio_lock. 
Here are some +of the call stacks identified under the enetc_poll path that may lead to +a deadlock: + +enetc_poll + -> enetc_lock_mdio + -> enetc_clean_rx_ring OR napi_complete_done + -> napi_gro_receive + -> enetc_start_xmit + -> enetc_lock_mdio + -> enetc_map_tx_buffs + -> enetc_unlock_mdio + -> enetc_unlock_mdio + +After enetc_poll acquires the read lock, a higher-priority writer attempts +to acquire the lock, causing preemption. The writer detects that a +read lock is already held and is scheduled out. However, readers under +enetc_poll cannot acquire the read lock again because a writer is already +waiting, leading to a thread hang. + +The deadlock is avoided here by adjusting the enetc_lock_mdio usage to prevent +recursive lock acquisition. + +Fixes: 6d36ecdbc441 ("net: enetc: take the MDIO lock only once per NAPI poll cycle") +Signed-off-by: Jianpeng Chang +Acked-by: Wei Fang +Link: https://patch.msgid.link/20251015021427.180757-1-jianpeng.chang.cn@windriver.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.c | 25 ++++++++++++++++---- + 1 file changed, 21 insertions(+), 4 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c +index 44ae1d2c34fd6..ed1db7f056e66 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.c ++++ b/drivers/net/ethernet/freescale/enetc/enetc.c +@@ -1225,6 +1225,8 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + /* next descriptor to process */ + i = rx_ring->next_to_clean; + ++ enetc_lock_mdio(); ++ + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd; + struct sk_buff *skb; +@@ -1260,7 +1262,9 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + rx_byte_cnt += skb->len + ETH_HLEN; + rx_frm_cnt++; + ++ enetc_unlock_mdio(); + napi_gro_receive(napi, skb); ++ enetc_lock_mdio(); + } + + rx_ring->next_to_clean = i; +@@ -1268,6 +1272,8 @@ static int 
enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + ++ enetc_unlock_mdio(); ++ + return rx_frm_cnt; + } + +@@ -1572,6 +1578,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + /* next descriptor to process */ + i = rx_ring->next_to_clean; + ++ enetc_lock_mdio(); ++ + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd, *orig_rxbd; + int orig_i, orig_cleaned_cnt; +@@ -1631,7 +1639,9 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + if (unlikely(!skb)) + goto out; + ++ enetc_unlock_mdio(); + napi_gro_receive(napi, skb); ++ enetc_lock_mdio(); + break; + case XDP_TX: + tx_ring = priv->xdp_tx_ring[rx_ring->index]; +@@ -1660,7 +1670,9 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + } + break; + case XDP_REDIRECT: ++ enetc_unlock_mdio(); + err = xdp_do_redirect(rx_ring->ndev, &xdp_buff, prog); ++ enetc_lock_mdio(); + if (unlikely(err)) { + enetc_xdp_drop(rx_ring, orig_i, i); + rx_ring->stats.xdp_redirect_failures++; +@@ -1680,8 +1692,11 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + +- if (xdp_redirect_frm_cnt) ++ if (xdp_redirect_frm_cnt) { ++ enetc_unlock_mdio(); + xdp_do_flush(); ++ enetc_lock_mdio(); ++ } + + if (xdp_tx_frm_cnt) + enetc_update_tx_ring_tail(tx_ring); +@@ -1690,6 +1705,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + enetc_refill_rx_ring(rx_ring, enetc_bd_unused(rx_ring) - + rx_ring->xdp.xdp_tx_in_flight); + ++ enetc_unlock_mdio(); ++ + return rx_frm_cnt; + } + +@@ -1708,6 +1725,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) + for (i = 0; i < v->count_tx_rings; i++) + if (!enetc_clean_tx_ring(&v->tx_ring[i], budget)) + complete = false; ++ enetc_unlock_mdio(); + + prog = rx_ring->xdp.prog; + if (prog) +@@ -1719,10 +1737,8 @@ static int enetc_poll(struct napi_struct *napi, int 
budget) + if (work_done) + v->rx_napi_work = true; + +- if (!complete) { +- enetc_unlock_mdio(); ++ if (!complete) + return budget; +- } + + napi_complete_done(napi, work_done); + +@@ -1731,6 +1747,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) + + v->rx_napi_work = false; + ++ enetc_lock_mdio(); + /* enable interrupts */ + enetc_wr_reg_hot(v->rbier, ENETC_RBIER_RXTIE); + +-- +2.51.0 + diff --git a/queue-6.1/net-ethernet-enetc-unlock-xdp_redirect-for-xdp-non-l.patch b/queue-6.1/net-ethernet-enetc-unlock-xdp_redirect-for-xdp-non-l.patch new file mode 100644 index 0000000000..db5019d2a5 --- /dev/null +++ b/queue-6.1/net-ethernet-enetc-unlock-xdp_redirect-for-xdp-non-l.patch @@ -0,0 +1,69 @@ +From ecf10ad02f186ebfcc966a9d82e6e3fc29e70f58 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 4 Jan 2023 14:57:10 +0100 +Subject: net: ethernet: enetc: unlock XDP_REDIRECT for XDP non-linear buffers + +From: Lorenzo Bianconi + +[ Upstream commit 8feb020f92a559f5a73a55c7337a3e51f19a2dc9 ] + +Even if full XDP_REDIRECT is not supported yet for non-linear XDP buffers +since we allow redirecting just into CPUMAPs, unlock XDP_REDIRECT for +S/G XDP buffer and rely on XDP stack to properly take care of the +frames. 
+ +Tested-by: Vladimir Oltean +Signed-off-by: Lorenzo Bianconi +Reviewed-by: Leon Romanovsky +Signed-off-by: Jakub Kicinski +Stable-dep-of: 50bd33f6b392 ("net: enetc: fix the deadlock of enetc_mdio_lock") +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.c | 24 ++++++++------------ + 1 file changed, 10 insertions(+), 14 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c +index bf49c07c8b513..4b4cc64724932 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.c ++++ b/drivers/net/ethernet/freescale/enetc/enetc.c +@@ -1460,6 +1460,16 @@ static void enetc_add_rx_buff_to_xdp(struct enetc_bdr *rx_ring, int i, + /* To be used for XDP_TX */ + rx_swbd->len = size; + ++ if (!xdp_buff_has_frags(xdp_buff)) { ++ xdp_buff_set_frags_flag(xdp_buff); ++ shinfo->xdp_frags_size = size; ++ } else { ++ shinfo->xdp_frags_size += size; ++ } ++ ++ if (page_is_pfmemalloc(rx_swbd->page)) ++ xdp_buff_set_frag_pfmemalloc(xdp_buff); ++ + skb_frag_off_set(frag, rx_swbd->page_offset); + skb_frag_size_set(frag, size); + __skb_frag_set_page(frag, rx_swbd->page); +@@ -1650,20 +1660,6 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + } + break; + case XDP_REDIRECT: +- /* xdp_return_frame does not support S/G in the sense +- * that it leaks the fragments (__xdp_return should not +- * call page_frag_free only for the initial buffer). +- * Until XDP_REDIRECT gains support for S/G let's keep +- * the code structure in place, but dead. We drop the +- * S/G frames ourselves to avoid memory leaks which +- * would otherwise leave the kernel OOM. 
+- */ +- if (unlikely(cleaned_cnt - orig_cleaned_cnt != 1)) { +- enetc_xdp_drop(rx_ring, orig_i, i); +- rx_ring->stats.xdp_redirect_sg++; +- break; +- } +- + err = xdp_do_redirect(rx_ring->ndev, &xdp_buff, prog); + if (unlikely(err)) { + enetc_xdp_drop(rx_ring, orig_i, i); +-- +2.51.0 + diff --git a/queue-6.1/net-fec-add-initial-xdp-support.patch b/queue-6.1/net-fec-add-initial-xdp-support.patch new file mode 100644 index 0000000000..ce8a89fa15 --- /dev/null +++ b/queue-6.1/net-fec-add-initial-xdp-support.patch @@ -0,0 +1,390 @@ +From 47a3f48d7721f3132f9f1d0eae94edb0167b09ca Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 31 Oct 2022 13:53:50 -0500 +Subject: net: fec: add initial XDP support + +From: Shenwei Wang + +[ Upstream commit 6d6b39f180b83dfe1e938382b68dd1e6cb51363c ] + +This patch adds the initial XDP support to Freescale driver. It supports +XDP_PASS, XDP_DROP and XDP_REDIRECT actions. Upcoming patches will add +support for XDP_TX and Zero Copy features. + +As the patch is rather large, the part of codes to collect the +statistics is separated and will prepare a dedicated patch for that +part. + +I just tested with the application of xdpsock. 
+ -- Native here means running command of "xdpsock -i eth0" + -- SKB-Mode means running command of "xdpsock -S -i eth0" + +The following are the testing result relating to XDP mode: + +root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 + sock0@eth0:0 rxdrop xdp-drv + pps pkts 1.00 +rx 371347 2717794 +tx 0 0 + +root@imx8qxpc0mek:~/bpf# ./xdpsock -S -i eth0 + sock0@eth0:0 rxdrop xdp-skb + pps pkts 1.00 +rx 202229 404528 +tx 0 0 + +root@imx8qxpc0mek:~/bpf# ./xdp2 eth0 +proto 0: 496708 pkt/s +proto 0: 505469 pkt/s +proto 0: 505283 pkt/s +proto 0: 505443 pkt/s +proto 0: 505465 pkt/s + +root@imx8qxpc0mek:~/bpf# ./xdp2 -S eth0 +proto 0: 0 pkt/s +proto 17: 118778 pkt/s +proto 17: 118989 pkt/s +proto 0: 1 pkt/s +proto 17: 118987 pkt/s +proto 0: 0 pkt/s +proto 17: 118943 pkt/s +proto 17: 118976 pkt/s +proto 0: 1 pkt/s +proto 17: 119006 pkt/s +proto 0: 0 pkt/s +proto 17: 119071 pkt/s +proto 17: 119092 pkt/s + +Signed-off-by: Shenwei Wang +Reported-by: kernel test robot +Link: https://lore.kernel.org/r/20221031185350.2045675-1-shenwei.wang@nxp.com +Signed-off-by: Paolo Abeni +Stable-dep-of: 50bd33f6b392 ("net: enetc: fix the deadlock of enetc_mdio_lock") +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/fec.h | 4 +- + drivers/net/ethernet/freescale/fec_main.c | 224 +++++++++++++++++++++- + 2 files changed, 226 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h +index 33f84a30e1671..6bdf7c98c3651 100644 +--- a/drivers/net/ethernet/freescale/fec.h ++++ b/drivers/net/ethernet/freescale/fec.h +@@ -348,7 +348,6 @@ struct bufdesc_ex { + */ + + #define FEC_ENET_XDP_HEADROOM (XDP_PACKET_HEADROOM) +- + #define FEC_ENET_RX_PAGES 256 + #define FEC_ENET_RX_FRSIZE (PAGE_SIZE - FEC_ENET_XDP_HEADROOM \ + - SKB_DATA_ALIGN(sizeof(struct skb_shared_info))) +@@ -661,6 +660,9 @@ struct fec_enet_private { + + struct imx_sc_ipc *ipc_handle; + ++ /* XDP BPF Program */ ++ struct bpf_prog *xdp_prog; ++ + u64 
ethtool_stats[]; + }; + +diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c +index ca271d7a388b4..5c72d229d3b6f 100644 +--- a/drivers/net/ethernet/freescale/fec_main.c ++++ b/drivers/net/ethernet/freescale/fec_main.c +@@ -89,6 +89,11 @@ static const u16 fec_enet_vlan_pri_to_queue[8] = {0, 0, 1, 1, 1, 2, 2, 2}; + #define FEC_ENET_OPD_V 0xFFF0 + #define FEC_MDIO_PM_TIMEOUT 100 /* ms */ + ++#define FEC_ENET_XDP_PASS 0 ++#define FEC_ENET_XDP_CONSUMED BIT(0) ++#define FEC_ENET_XDP_TX BIT(1) ++#define FEC_ENET_XDP_REDIR BIT(2) ++ + struct fec_devinfo { + u32 quirks; + }; +@@ -443,13 +448,14 @@ static int + fec_enet_create_page_pool(struct fec_enet_private *fep, + struct fec_enet_priv_rx_q *rxq, int size) + { ++ struct bpf_prog *xdp_prog = READ_ONCE(fep->xdp_prog); + struct page_pool_params pp_params = { + .order = 0, + .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV, + .pool_size = size, + .nid = dev_to_node(&fep->pdev->dev), + .dev = &fep->pdev->dev, +- .dma_dir = DMA_FROM_DEVICE, ++ .dma_dir = xdp_prog ? 
DMA_BIDIRECTIONAL : DMA_FROM_DEVICE, + .offset = FEC_ENET_XDP_HEADROOM, + .max_len = FEC_ENET_RX_FRSIZE, + }; +@@ -1610,6 +1616,59 @@ static void fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq, + bdp->cbd_bufaddr = cpu_to_fec32(phys_addr); + } + ++static u32 ++fec_enet_run_xdp(struct fec_enet_private *fep, struct bpf_prog *prog, ++ struct xdp_buff *xdp, struct fec_enet_priv_rx_q *rxq, int index) ++{ ++ unsigned int sync, len = xdp->data_end - xdp->data; ++ u32 ret = FEC_ENET_XDP_PASS; ++ struct page *page; ++ int err; ++ u32 act; ++ ++ act = bpf_prog_run_xdp(prog, xdp); ++ ++ /* Due xdp_adjust_tail: DMA sync for_device cover max len CPU touch */ ++ sync = xdp->data_end - xdp->data_hard_start - FEC_ENET_XDP_HEADROOM; ++ sync = max(sync, len); ++ ++ switch (act) { ++ case XDP_PASS: ++ ret = FEC_ENET_XDP_PASS; ++ break; ++ ++ case XDP_REDIRECT: ++ err = xdp_do_redirect(fep->netdev, xdp, prog); ++ if (!err) { ++ ret = FEC_ENET_XDP_REDIR; ++ } else { ++ ret = FEC_ENET_XDP_CONSUMED; ++ page = virt_to_head_page(xdp->data); ++ page_pool_put_page(rxq->page_pool, page, sync, true); ++ } ++ break; ++ ++ default: ++ bpf_warn_invalid_xdp_action(fep->netdev, prog, act); ++ fallthrough; ++ ++ case XDP_TX: ++ bpf_warn_invalid_xdp_action(fep->netdev, prog, act); ++ fallthrough; ++ ++ case XDP_ABORTED: ++ fallthrough; /* handle aborts by dropping packet */ ++ ++ case XDP_DROP: ++ ret = FEC_ENET_XDP_CONSUMED; ++ page = virt_to_head_page(xdp->data); ++ page_pool_put_page(rxq->page_pool, page, sync, true); ++ break; ++ } ++ ++ return ret; ++} ++ + /* During a receive, the bd_rx.cur points to the current incoming buffer. 
+ * When we update through the ring, if the next incoming buffer has + * not been given to the system, we just set the empty indicator, +@@ -1631,6 +1690,9 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) + u16 vlan_tag; + int index = 0; + bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME; ++ struct bpf_prog *xdp_prog = READ_ONCE(fep->xdp_prog); ++ u32 ret, xdp_result = FEC_ENET_XDP_PASS; ++ struct xdp_buff xdp; + struct page *page; + + #ifdef CONFIG_M532x +@@ -1642,6 +1704,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) + * These get messed up if we get called due to a busy condition. + */ + bdp = rxq->bd.cur; ++ xdp_init_buff(&xdp, PAGE_SIZE, &rxq->xdp_rxq); + + while (!((status = fec16_to_cpu(bdp->cbd_sc)) & BD_ENET_RX_EMPTY)) { + +@@ -1691,6 +1754,17 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) + prefetch(page_address(page)); + fec_enet_update_cbd(rxq, bdp, index); + ++ if (xdp_prog) { ++ xdp_buff_clear_frags_flag(&xdp); ++ xdp_prepare_buff(&xdp, page_address(page), ++ FEC_ENET_XDP_HEADROOM, pkt_len, false); ++ ++ ret = fec_enet_run_xdp(fep, xdp_prog, &xdp, rxq, index); ++ xdp_result |= ret; ++ if (ret != FEC_ENET_XDP_PASS) ++ goto rx_processing_done; ++ } ++ + /* The packet length includes FCS, but we don't want to + * include that when passing upstream as it messes up + * bridging applications. 
+@@ -1794,6 +1868,10 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) + writel(0, rxq->bd.reg_desc_active); + } + rxq->bd.cur = bdp; ++ ++ if (xdp_result & FEC_ENET_XDP_REDIR) ++ xdp_do_flush_map(); ++ + return pkt_received; + } + +@@ -3604,6 +3682,148 @@ static u16 fec_enet_select_queue(struct net_device *ndev, struct sk_buff *skb, + return fec_enet_vlan_pri_to_queue[vlan_tag >> 13]; + } + ++static int fec_enet_bpf(struct net_device *dev, struct netdev_bpf *bpf) ++{ ++ struct fec_enet_private *fep = netdev_priv(dev); ++ bool is_run = netif_running(dev); ++ struct bpf_prog *old_prog; ++ ++ switch (bpf->command) { ++ case XDP_SETUP_PROG: ++ if (is_run) { ++ napi_disable(&fep->napi); ++ netif_tx_disable(dev); ++ } ++ ++ old_prog = xchg(&fep->xdp_prog, bpf->prog); ++ fec_restart(dev); ++ ++ if (is_run) { ++ napi_enable(&fep->napi); ++ netif_tx_start_all_queues(dev); ++ } ++ ++ if (old_prog) ++ bpf_prog_put(old_prog); ++ ++ return 0; ++ ++ case XDP_SETUP_XSK_POOL: ++ return -EOPNOTSUPP; ++ ++ default: ++ return -EOPNOTSUPP; ++ } ++} ++ ++static int ++fec_enet_xdp_get_tx_queue(struct fec_enet_private *fep, int cpu) ++{ ++ int index = cpu; ++ ++ if (unlikely(index < 0)) ++ index = 0; ++ ++ while (index >= fep->num_tx_queues) ++ index -= fep->num_tx_queues; ++ ++ return index; ++} ++ ++static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep, ++ struct fec_enet_priv_tx_q *txq, ++ struct xdp_frame *frame) ++{ ++ unsigned int index, status, estatus; ++ struct bufdesc *bdp, *last_bdp; ++ dma_addr_t dma_addr; ++ int entries_free; ++ ++ entries_free = fec_enet_get_free_txdesc_num(txq); ++ if (entries_free < MAX_SKB_FRAGS + 1) { ++ netdev_err(fep->netdev, "NOT enough BD for SG!\n"); ++ return NETDEV_TX_OK; ++ } ++ ++ /* Fill in a Tx ring entry */ ++ bdp = txq->bd.cur; ++ last_bdp = bdp; ++ status = fec16_to_cpu(bdp->cbd_sc); ++ status &= ~BD_ENET_TX_STATS; ++ ++ index = fec_enet_get_bd_index(bdp, &txq->bd); ++ ++ dma_addr = 
dma_map_single(&fep->pdev->dev, frame->data, ++ frame->len, DMA_TO_DEVICE); ++ if (dma_mapping_error(&fep->pdev->dev, dma_addr)) ++ return FEC_ENET_XDP_CONSUMED; ++ ++ status |= (BD_ENET_TX_INTR | BD_ENET_TX_LAST); ++ if (fep->bufdesc_ex) ++ estatus = BD_ENET_TX_INT; ++ ++ bdp->cbd_bufaddr = cpu_to_fec32(dma_addr); ++ bdp->cbd_datlen = cpu_to_fec16(frame->len); ++ ++ if (fep->bufdesc_ex) { ++ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp; ++ ++ if (fep->quirks & FEC_QUIRK_HAS_AVB) ++ estatus |= FEC_TX_BD_FTYPE(txq->bd.qid); ++ ++ ebdp->cbd_bdu = 0; ++ ebdp->cbd_esc = cpu_to_fec32(estatus); ++ } ++ ++ index = fec_enet_get_bd_index(last_bdp, &txq->bd); ++ txq->tx_skbuff[index] = NULL; ++ ++ /* Send it on its way. Tell FEC it's ready, interrupt when done, ++ * it's the last BD of the frame, and to put the CRC on the end. ++ */ ++ status |= (BD_ENET_TX_READY | BD_ENET_TX_TC); ++ bdp->cbd_sc = cpu_to_fec16(status); ++ ++ /* If this was the last BD in the ring, start at the beginning again. */ ++ bdp = fec_enet_get_nextdesc(last_bdp, &txq->bd); ++ ++ txq->bd.cur = bdp; ++ ++ return 0; ++} ++ ++static int fec_enet_xdp_xmit(struct net_device *dev, ++ int num_frames, ++ struct xdp_frame **frames, ++ u32 flags) ++{ ++ struct fec_enet_private *fep = netdev_priv(dev); ++ struct fec_enet_priv_tx_q *txq; ++ int cpu = smp_processor_id(); ++ struct netdev_queue *nq; ++ unsigned int queue; ++ int i; ++ ++ queue = fec_enet_xdp_get_tx_queue(fep, cpu); ++ txq = fep->tx_queue[queue]; ++ nq = netdev_get_tx_queue(fep->netdev, queue); ++ ++ __netif_tx_lock(nq, cpu); ++ ++ for (i = 0; i < num_frames; i++) ++ fec_enet_txq_xmit_frame(fep, txq, frames[i]); ++ ++ /* Make sure the update to bdp and tx_skbuff are performed. 
*/ ++ wmb(); ++ ++ /* Trigger transmission start */ ++ writel(0, txq->bd.reg_desc_active); ++ ++ __netif_tx_unlock(nq); ++ ++ return num_frames; ++} ++ + static const struct net_device_ops fec_netdev_ops = { + .ndo_open = fec_enet_open, + .ndo_stop = fec_enet_close, +@@ -3615,6 +3835,8 @@ static const struct net_device_ops fec_netdev_ops = { + .ndo_set_mac_address = fec_set_mac_address, + .ndo_eth_ioctl = fec_enet_ioctl, + .ndo_set_features = fec_set_features, ++ .ndo_bpf = fec_enet_bpf, ++ .ndo_xdp_xmit = fec_enet_xdp_xmit, + }; + + static const unsigned short offset_des_active_rxq[] = { +-- +2.51.0 + diff --git a/queue-6.1/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch b/queue-6.1/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch new file mode 100644 index 0000000000..9b7d6ec952 --- /dev/null +++ b/queue-6.1/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch @@ -0,0 +1,65 @@ +From 5c01e0de3289b1a271ce0b1d3ce7f18049c5bf62 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 14 Oct 2025 13:46:49 -0700 +Subject: net/mlx5e: Return 1 instead of 0 in invalid case in + mlx5e_mpwrq_umr_entry_size() + +From: Nathan Chancellor + +[ Upstream commit aaf043a5688114703ae2c1482b92e7e0754d684e ] + +When building with Clang 20 or newer, there are some objtool warnings +from unexpected fallthroughs to other functions: + + vmlinux.o: warning: objtool: mlx5e_mpwrq_mtts_per_wqe() falls through to next function mlx5e_mpwrq_max_num_entries() + vmlinux.o: warning: objtool: mlx5e_mpwrq_max_log_rq_size() falls through to next function mlx5e_get_linear_rq_headroom() + +LLVM 20 contains an (admittedly problematic [1]) optimization [2] to +convert divide by zero into the equivalent of __builtin_unreachable(), +which invokes undefined behavior and destroys code generation when it is +encountered in a control flow graph. + +mlx5e_mpwrq_umr_entry_size() returns 0 in the default case of an +unrecognized mlx5e_mpwrq_umr_mode value. 
mlx5e_mpwrq_mtts_per_wqe(), +which is inlined into mlx5e_mpwrq_max_log_rq_size(), uses the result of +mlx5e_mpwrq_umr_entry_size() in a divide operation without checking for +zero, so LLVM is able to infer there will be a divide by zero in this +case and invokes undefined behavior. While there is some proposed work +to isolate this undefined behavior and avoid the destructive code +generation that results in these objtool warnings, code should still be +defensive against divide by zero. + +As the WARN_ONCE() implies that an invalid value should be handled +gracefully, return 1 instead of 0 in the default case so that the +results of this division operation is always valid. + +Fixes: 168723c1f8d6 ("net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters") +Link: https://lore.kernel.org/CAGG=3QUk8-Ak7YKnRziO4=0z=1C_7+4jF+6ZeDQ9yF+kuTOHOQ@mail.gmail.com/ [1] +Link: https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448 [2] +Closes: https://github.com/ClangBuiltLinux/linux/issues/2131 +Closes: https://github.com/ClangBuiltLinux/linux/issues/2132 +Signed-off-by: Nathan Chancellor +Reviewed-by: Tariq Toukan +Link: https://patch.msgid.link/20251014-mlx5e-avoid-zero-div-from-mlx5e_mpwrq_umr_entry_size-v1-1-dc186b8819ef@kernel.org +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +index 33cc53f221e0b..542cc017e64cd 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +@@ -94,7 +94,7 @@ u8 mlx5e_mpwrq_umr_entry_size(enum mlx5e_mpwrq_umr_mode mode) + return sizeof(struct mlx5_ksm) * 4; + } + WARN_ONCE(1, "MPWRQ UMR mode %d is not known\n", mode); +- return 0; ++ return 1; + } + + u8 mlx5e_mpwrq_log_wqe_sz(struct mlx5_core_dev 
*mdev, u8 page_shift, +-- +2.51.0 + diff --git a/queue-6.1/net-tree-wide-replace-xdp_do_flush_map-with-xdp_do_f.patch b/queue-6.1/net-tree-wide-replace-xdp_do_flush_map-with-xdp_do_f.patch new file mode 100644 index 0000000000..37b14da037 --- /dev/null +++ b/queue-6.1/net-tree-wide-replace-xdp_do_flush_map-with-xdp_do_f.patch @@ -0,0 +1,285 @@ +From 20abbd7880cfc00ffcf960a25d3d7cd5ac0fb978 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 8 Sep 2023 16:32:14 +0200 +Subject: net: Tree wide: Replace xdp_do_flush_map() with xdp_do_flush(). +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Sebastian Andrzej Siewior + +[ Upstream commit 7f04bd109d4c358a12b125bc79a6f0eac2e915ec ] + +xdp_do_flush_map() is deprecated and new code should use xdp_do_flush() +instead. + +Replace xdp_do_flush_map() with xdp_do_flush(). + +Cc: AngeloGioacchino Del Regno +Cc: Clark Wang +Cc: Claudiu Manoil +Cc: David Arinzon +Cc: Edward Cree +Cc: Felix Fietkau +Cc: Grygorii Strashko +Cc: Jassi Brar +Cc: Jesse Brandeburg +Cc: John Crispin +Cc: Leon Romanovsky +Cc: Lorenzo Bianconi +Cc: Louis Peens +Cc: Marcin Wojtas +Cc: Mark Lee +Cc: Matthias Brugger +Cc: NXP Linux Team +Cc: Noam Dagan +Cc: Russell King +Cc: Saeed Bishara +Cc: Saeed Mahameed +Cc: Sean Wang +Cc: Shay Agroskin +Cc: Shenwei Wang +Cc: Thomas Petazzoni +Cc: Tony Nguyen +Cc: Vladimir Oltean +Cc: Wei Fang +Signed-off-by: Sebastian Andrzej Siewior +Acked-by: Arthur Kiyanovski +Acked-by: Toke Høiland-Jørgensen +Acked-by: Ilias Apalodimas +Acked-by: Martin Habets +Acked-by: Jesper Dangaard Brouer +Link: https://lore.kernel.org/r/20230908143215.869913-2-bigeasy@linutronix.de +Signed-off-by: Jakub Kicinski +Stable-dep-of: 50bd33f6b392 ("net: enetc: fix the deadlock of enetc_mdio_lock") +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 +- + drivers/net/ethernet/freescale/enetc/enetc.c | 2 +- + drivers/net/ethernet/freescale/fec_main.c | 2 +- + 
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +- + drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 +- + drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +- + drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 2 +- + drivers/net/ethernet/marvell/mvneta.c | 2 +- + drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 2 +- + drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +- + drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 2 +- + drivers/net/ethernet/netronome/nfp/nfd3/xsk.c | 2 +- + drivers/net/ethernet/sfc/efx_channels.c | 2 +- + drivers/net/ethernet/sfc/siena/efx_channels.c | 2 +- + drivers/net/ethernet/socionext/netsec.c | 2 +- + drivers/net/ethernet/ti/cpsw_priv.c | 2 +- + 16 files changed, 16 insertions(+), 16 deletions(-) + +diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c +index 77fa4c35f2331..af03e307451c2 100644 +--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c ++++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c +@@ -1828,7 +1828,7 @@ static int ena_clean_rx_irq(struct ena_ring *rx_ring, struct napi_struct *napi, + } + + if (xdp_flags & ENA_XDP_REDIRECT) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + return work_done; + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c +index 4b4cc64724932..44ae1d2c34fd6 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.c ++++ b/drivers/net/ethernet/freescale/enetc/enetc.c +@@ -1681,7 +1681,7 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + rx_ring->stats.bytes += rx_byte_cnt; + + if (xdp_redirect_frm_cnt) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_tx_frm_cnt) + enetc_update_tx_ring_tail(tx_ring); +diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c +index 5c72d229d3b6f..b352b89d096bf 100644 +--- a/drivers/net/ethernet/freescale/fec_main.c ++++ b/drivers/net/ethernet/freescale/fec_main.c +@@ -1870,7 +1870,7 @@ 
fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) + rxq->bd.cur = bdp; + + if (xdp_result & FEC_ENET_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + return pkt_received; + } +diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c +index 2ede35ba3919b..a428d57c1da52 100644 +--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c ++++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c +@@ -2392,7 +2392,7 @@ void i40e_update_rx_stats(struct i40e_ring *rx_ring, + void i40e_finalize_xdp_rx(struct i40e_ring *rx_ring, unsigned int xdp_res) + { + if (xdp_res & I40E_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_res & I40E_XDP_TX) { + struct i40e_ring *xdp_ring = +diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +index d137b98d78eb6..48eac14fe991a 100644 +--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c ++++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +@@ -358,7 +358,7 @@ int ice_xmit_xdp_buff(struct xdp_buff *xdp, struct ice_tx_ring *xdp_ring) + void ice_finalize_xdp_rx(struct ice_tx_ring *xdp_ring, unsigned int xdp_res) + { + if (xdp_res & ICE_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_res & ICE_XDP_TX) { + if (static_branch_unlikely(&ice_xdp_locking_key)) +diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +index 086cc25730338..d036dc190396c 100644 +--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c ++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +@@ -2429,7 +2429,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector, + } + + if (xdp_xmit & IXGBE_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_xmit & IXGBE_XDP_TX) { + struct ixgbe_ring *ring = ixgbe_determine_xdp_ring(adapter); +diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +index 
7ef82c30e8571..9fdd19acf2242 100644 +--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c ++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +@@ -351,7 +351,7 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, + } + + if (xdp_xmit & IXGBE_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_xmit & IXGBE_XDP_TX) { + struct ixgbe_ring *ring = ixgbe_determine_xdp_ring(adapter); +diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c +index eb4ebaa1c92ff..0cc277fe33f5b 100644 +--- a/drivers/net/ethernet/marvell/mvneta.c ++++ b/drivers/net/ethernet/marvell/mvneta.c +@@ -2514,7 +2514,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi, + mvneta_xdp_put_buff(pp, rxq, &xdp_buf, -1); + + if (ps.xdp_redirect) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (ps.rx_packets) + mvneta_update_stats(pp, &ps); +diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c +index ec69bb90f5740..79516673811bd 100644 +--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c ++++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c +@@ -4052,7 +4052,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi, + } + + if (xdp_ret & MVPP2_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (ps.rx_packets) { + struct mvpp2_pcpu_stats *stats = this_cpu_ptr(port->stats); +diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c +index 3f2f725ccceb3..b2ec1f183271d 100644 +--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c ++++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c +@@ -1989,7 +1989,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget, + net_dim(ð->rx_dim, dim_sample); + + if (xdp_flush) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + return done; + } +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +index 20507ef2f9569..e6612d1c3749c 
100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +@@ -677,7 +677,7 @@ void mlx5e_xdp_rx_poll_complete(struct mlx5e_rq *rq) + mlx5e_xmit_xdp_doorbell(xdpsq); + + if (test_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags)) { +- xdp_do_flush_map(); ++ xdp_do_flush(); + __clear_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags); + } + } +diff --git a/drivers/net/ethernet/netronome/nfp/nfd3/xsk.c b/drivers/net/ethernet/netronome/nfp/nfd3/xsk.c +index 5d9db8c2a5b43..45be6954d5aae 100644 +--- a/drivers/net/ethernet/netronome/nfp/nfd3/xsk.c ++++ b/drivers/net/ethernet/netronome/nfp/nfd3/xsk.c +@@ -256,7 +256,7 @@ nfp_nfd3_xsk_rx(struct nfp_net_rx_ring *rx_ring, int budget, + nfp_net_xsk_rx_ring_fill_freelist(r_vec->rx_ring); + + if (xdp_redir) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (tx_ring->wr_ptr_add) + nfp_net_tx_xmit_more_flush(tx_ring); +diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c +index 27d00ffac68f4..506163c977fa0 100644 +--- a/drivers/net/ethernet/sfc/efx_channels.c ++++ b/drivers/net/ethernet/sfc/efx_channels.c +@@ -1281,7 +1281,7 @@ static int efx_poll(struct napi_struct *napi, int budget) + + spent = efx_process_channel(channel, budget); + +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (spent < budget) { + if (efx_channel_has_rx_queue(channel) && +diff --git a/drivers/net/ethernet/sfc/siena/efx_channels.c b/drivers/net/ethernet/sfc/siena/efx_channels.c +index 1776f7f8a7a90..a7346e965bfe7 100644 +--- a/drivers/net/ethernet/sfc/siena/efx_channels.c ++++ b/drivers/net/ethernet/sfc/siena/efx_channels.c +@@ -1285,7 +1285,7 @@ static int efx_poll(struct napi_struct *napi, int budget) + + spent = efx_process_channel(channel, budget); + +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (spent < budget) { + if (efx_channel_has_rx_queue(channel) && +diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c +index 
b130e978366c1..f4b516906abdb 100644 +--- a/drivers/net/ethernet/socionext/netsec.c ++++ b/drivers/net/ethernet/socionext/netsec.c +@@ -780,7 +780,7 @@ static void netsec_finalize_xdp_rx(struct netsec_priv *priv, u32 xdp_res, + u16 pkts) + { + if (xdp_res & NETSEC_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_res & NETSEC_XDP_TX) + netsec_xdp_ring_tx_db(priv, pkts); +diff --git a/drivers/net/ethernet/ti/cpsw_priv.c b/drivers/net/ethernet/ti/cpsw_priv.c +index 758295c898ac9..7b861e1027b9b 100644 +--- a/drivers/net/ethernet/ti/cpsw_priv.c ++++ b/drivers/net/ethernet/ti/cpsw_priv.c +@@ -1359,7 +1359,7 @@ int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp, + * particular hardware is sharing a common queue, so the + * incoming device might change per packet. + */ +- xdp_do_flush_map(); ++ xdp_do_flush(); + break; + default: + bpf_warn_invalid_xdp_action(ndev, prog, act); +-- +2.51.0 + diff --git a/queue-6.1/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch b/queue-6.1/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch new file mode 100644 index 0000000000..dc96955ab9 --- /dev/null +++ b/queue-6.1/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch @@ -0,0 +1,56 @@ +From a271e5e029b9ed0388add7b2d451c6ec6c8b5eba Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 22:15:43 +0200 +Subject: rtnetlink: Allow deleting FDB entries in user namespace +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Johannes Wiesböck + +[ Upstream commit bf29555f5bdc017bac22ca66fcb6c9f46ec8788f ] + +Creating FDB entries is possible from a non-initial user namespace when +having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive +an EPERM because the capability is always checked against the initial +user namespace. This restricts the FDB management from unprivileged +containers. 
+ +Drop the netlink_capable check in rtnl_fdb_del as it was originally +dropped in c5c351088ae7 and reintroduced in 1690be63a27b without +intention. + +This patch was tested using a container on GyroidOS, where it was +possible to delete FDB entries from an unprivileged user namespace and +private network namespace. + +Fixes: 1690be63a27b ("bridge: Add vlan support to static neighbors") +Reviewed-by: Michael Weiß +Tested-by: Harshal Gohel +Signed-off-by: Johannes Wiesböck +Reviewed-by: Ido Schimmel +Reviewed-by: Nikolay Aleksandrov +Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 6bc2f78a5ebbf..6fd6c717d1e39 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -4274,9 +4274,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + int err; + u16 vid; + +- if (!netlink_capable(skb, CAP_NET_ADMIN)) +- return -EPERM; +- + if (!del_bulk) { + err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, + NULL, extack); +-- +2.51.0 + diff --git a/queue-6.1/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch b/queue-6.1/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch new file mode 100644 index 0000000000..c9ceee18f4 --- /dev/null +++ b/queue-6.1/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch @@ -0,0 +1,54 @@ +From 0d9fc7cd18bcb69a4fc814377e9ee1df71c2a04d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 16:00:36 +0300 +Subject: sctp: avoid NULL dereference when chunk data buffer is missing + +From: Alexey Simakov + +[ Upstream commit 441f0647f7673e0e64d4910ef61a5fb8f16bfb82 ] + +chunk->skb pointer is dereferenced in the if-block where it's supposed +to be NULL only. + +chunk->skb can only be NULL if chunk->head_skb is not. 
Check for frag_list +instead and do it just before replacing chunk->skb. We're sure that +otherwise chunk->skb is non-NULL because of outer if() condition. + +Fixes: 90017accff61 ("sctp: Add GSO support") +Signed-off-by: Alexey Simakov +Acked-by: Marcelo Ricardo Leitner +Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/sctp/inqueue.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c +index 5c16521818058..f5a7d5a387555 100644 +--- a/net/sctp/inqueue.c ++++ b/net/sctp/inqueue.c +@@ -169,13 +169,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) + chunk->head_skb = chunk->skb; + + /* skbs with "cover letter" */ +- if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) ++ if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) { ++ if (WARN_ON(!skb_shinfo(chunk->skb)->frag_list)) { ++ __SCTP_INC_STATS(dev_net(chunk->skb->dev), ++ SCTP_MIB_IN_PKT_DISCARDS); ++ sctp_chunk_free(chunk); ++ goto next_chunk; ++ } + chunk->skb = skb_shinfo(chunk->skb)->frag_list; +- +- if (WARN_ON(!chunk->skb)) { +- __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); +- sctp_chunk_free(chunk); +- goto next_chunk; + } + } + +-- +2.51.0 + diff --git a/queue-6.1/series b/queue-6.1/series index 047d550d4e..7f3affbbb2 100644 --- a/queue-6.1/series +++ b/queue-6.1/series @@ -78,3 +78,13 @@ lkdtm-fortify-fix-potential-null-dereference-on-kmal.patch m68k-bitops-fix-find_-_bit-signatures.patch powerpc-32-remove-page_kernel_text-to-fix-startup-fa.patch smb-server-let-smb_direct_flush_send_list-invalidate.patch +net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch +rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch +net-fec-add-initial-xdp-support.patch +net-ethernet-enetc-unlock-xdp_redirect-for-xdp-non-l.patch +net-tree-wide-replace-xdp_do_flush_map-with-xdp_do_f.patch 
+net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch +net-enetc-correct-the-value-of-enetc_rxb_truesize.patch +dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch +arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch +sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch diff --git a/queue-6.12/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch b/queue-6.12/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch new file mode 100644 index 0000000000..79814ed208 --- /dev/null +++ b/queue-6.12/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch @@ -0,0 +1,71 @@ +From fff0a678da0faba6327f8c2a4bff9ad1c4092dcd Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:37:12 +0800 +Subject: arm64, mm: avoid always making PTE dirty in pte_mkwrite() + +From: Huang Ying + +[ Upstream commit 143937ca51cc6ae2fccc61a1cb916abb24cd34f5 ] + +Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may +mark some pages that are never written dirty wrongly. For example, +do_swap_page() may map the exclusive pages with writable and clean PTEs +if the VMA is writable and the page fault is for read access. +However, current pte_mkwrite_novma() implementation always dirties the +PTE. This may cause unnecessary disk writing if the pages are +never written before being reclaimed. + +So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the +PTE_DIRTY bit is set to make it possible to make the PTE writable and +clean. + +The current behavior was introduced in commit 73e86cb03cf2 ("arm64: +Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, +pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only +clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits +are set. 
+ +To test the performance impact of the patch, on an arm64 server +machine, run 16 redis-server processes on socket 1 and 16 +memtier_benchmark processes on socket 0 with mostly get +transactions (that is, redis-server will mostly read memory only). +The memory footprint of redis-server is larger than the available +memory, so swap out/in will be triggered. Test results show that the +patch can avoid most swapping out because the pages are mostly clean. +And the benchmark throughput improves ~23.9% in the test. + +Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") +Signed-off-by: Huang Ying +Cc: Will Deacon +Cc: Anshuman Khandual +Cc: Ryan Roberts +Cc: Gavin Shan +Cc: Ard Biesheuvel +Cc: Matthew Wilcox (Oracle) +Cc: Yicong Yang +Cc: linux-arm-kernel@lists.infradead.org +Cc: linux-kernel@vger.kernel.org +Reviewed-by: Catalin Marinas +Signed-off-by: Catalin Marinas +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/pgtable.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h +index 5ba8376735cb0..eb57ddb5ecc53 100644 +--- a/arch/arm64/include/asm/pgtable.h ++++ b/arch/arm64/include/asm/pgtable.h +@@ -212,7 +212,8 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot) + static inline pte_t pte_mkwrite_novma(pte_t pte) + { + pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); +- pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); ++ if (pte_sw_dirty(pte)) ++ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); + return pte; + } + +-- +2.51.0 + diff --git a/queue-6.12/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch b/queue-6.12/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch new file mode 100644 index 0000000000..0575c616a2 --- /dev/null +++ b/queue-6.12/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch @@ -0,0 +1,43 @@ +From c5c009bb22b17855dd78ba4cf5a78f51be1f920b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 
Oct 2025 16:28:49 +0200 +Subject: can: bxcan: bxcan_start_xmit(): use can_dev_dropped_skb() instead of + can_dropped_invalid_skb() + +From: Marc Kleine-Budde + +[ Upstream commit 3a20c444cd123e820e10ae22eeaf00e189315aa1 ] + +In addition to can_dropped_invalid_skb(), the helper function +can_dev_dropped_skb() checks whether the device is in listen-only mode and +discards the skb accordingly. + +Replace can_dropped_invalid_skb() by can_dev_dropped_skb() to also drop +skbs in listen-only mode. + +Reported-by: Marc Kleine-Budde +Closes: https://lore.kernel.org/all/20251017-bizarre-enchanted-quokka-f3c704-mkl@pengutronix.de/ +Fixes: f00647d8127b ("can: bxcan: add support for ST bxCAN controller") +Link: https://patch.msgid.link/20251017-fix-skb-drop-check-v1-1-556665793fa4@pengutronix.de +Signed-off-by: Marc Kleine-Budde +Signed-off-by: Sasha Levin +--- + drivers/net/can/bxcan.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/can/bxcan.c b/drivers/net/can/bxcan.c +index bfc60eb33dc37..333ad42ea73bc 100644 +--- a/drivers/net/can/bxcan.c ++++ b/drivers/net/can/bxcan.c +@@ -842,7 +842,7 @@ static netdev_tx_t bxcan_start_xmit(struct sk_buff *skb, + u32 id; + int i, j; + +- if (can_dropped_invalid_skb(ndev, skb)) ++ if (can_dev_dropped_skb(ndev, skb)) + return NETDEV_TX_OK; + + if (bxcan_tx_busy(priv)) +-- +2.51.0 + diff --git a/queue-6.12/can-esd-acc_start_xmit-use-can_dev_dropped_skb-inste.patch b/queue-6.12/can-esd-acc_start_xmit-use-can_dev_dropped_skb-inste.patch new file mode 100644 index 0000000000..5e145c8d54 --- /dev/null +++ b/queue-6.12/can-esd-acc_start_xmit-use-can_dev_dropped_skb-inste.patch @@ -0,0 +1,43 @@ +From 8c0cf88d04ca763b3d6bbb76cdeefbf6cdc520cf Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 16:28:49 +0200 +Subject: can: esd: acc_start_xmit(): use can_dev_dropped_skb() instead of + can_dropped_invalid_skb() + +From: Marc Kleine-Budde + +[ Upstream commit 
0bee15a5caf36fe513fdeee07fd4f0331e61c064 ] + +In addition to can_dropped_invalid_skb(), the helper function +can_dev_dropped_skb() checks whether the device is in listen-only mode and +discards the skb accordingly. + +Replace can_dropped_invalid_skb() by can_dev_dropped_skb() to also drop +skbs in listen-only mode. + +Reported-by: Marc Kleine-Budde +Closes: https://lore.kernel.org/all/20251017-bizarre-enchanted-quokka-f3c704-mkl@pengutronix.de/ +Fixes: 9721866f07e1 ("can: esd: add support for esd GmbH PCIe/402 CAN interface family") +Link: https://patch.msgid.link/20251017-fix-skb-drop-check-v1-2-556665793fa4@pengutronix.de +Signed-off-by: Marc Kleine-Budde +Signed-off-by: Sasha Levin +--- + drivers/net/can/esd/esdacc.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/can/esd/esdacc.c b/drivers/net/can/esd/esdacc.c +index c80032bc1a521..73e66f9a3781c 100644 +--- a/drivers/net/can/esd/esdacc.c ++++ b/drivers/net/can/esd/esdacc.c +@@ -254,7 +254,7 @@ netdev_tx_t acc_start_xmit(struct sk_buff *skb, struct net_device *netdev) + u32 acc_id; + u32 acc_dlc; + +- if (can_dropped_invalid_skb(netdev, skb)) ++ if (can_dev_dropped_skb(netdev, skb)) + return NETDEV_TX_OK; + + /* Access core->tx_fifo_tail only once because it may be changed +-- +2.51.0 + diff --git a/queue-6.12/can-rockchip-canfd-rkcanfd_start_xmit-use-can_dev_dr.patch b/queue-6.12/can-rockchip-canfd-rkcanfd_start_xmit-use-can_dev_dr.patch new file mode 100644 index 0000000000..3db8e52a8f --- /dev/null +++ b/queue-6.12/can-rockchip-canfd-rkcanfd_start_xmit-use-can_dev_dr.patch @@ -0,0 +1,43 @@ +From 3c3ca2ff6d561bab1b383d8ddddb92b6d3b2ba87 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 16:28:49 +0200 +Subject: can: rockchip-canfd: rkcanfd_start_xmit(): use can_dev_dropped_skb() + instead of can_dropped_invalid_skb() + +From: Marc Kleine-Budde + +[ Upstream commit 3a3bc9bbb3a0287164a595787df0c70d91e77cfd ] + +In addition to can_dropped_invalid_skb(), the 
helper function +can_dev_dropped_skb() checks whether the device is in listen-only mode and +discards the skb accordingly. + +Replace can_dropped_invalid_skb() by can_dev_dropped_skb() to also drop +skbs in listen-only mode. + +Reported-by: Marc Kleine-Budde +Closes: https://lore.kernel.org/all/20251017-bizarre-enchanted-quokka-f3c704-mkl@pengutronix.de/ +Fixes: ff60bfbaf67f ("can: rockchip_canfd: add driver for Rockchip CAN-FD controller") +Link: https://patch.msgid.link/20251017-fix-skb-drop-check-v1-3-556665793fa4@pengutronix.de +Signed-off-by: Marc Kleine-Budde +Signed-off-by: Sasha Levin +--- + drivers/net/can/rockchip/rockchip_canfd-tx.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/can/rockchip/rockchip_canfd-tx.c b/drivers/net/can/rockchip/rockchip_canfd-tx.c +index 865a15e033a9e..12200dcfd3389 100644 +--- a/drivers/net/can/rockchip/rockchip_canfd-tx.c ++++ b/drivers/net/can/rockchip/rockchip_canfd-tx.c +@@ -72,7 +72,7 @@ netdev_tx_t rkcanfd_start_xmit(struct sk_buff *skb, struct net_device *ndev) + int err; + u8 i; + +- if (can_dropped_invalid_skb(ndev, skb)) ++ if (can_dev_dropped_skb(ndev, skb)) + return NETDEV_TX_OK; + + if (!netif_subqueue_maybe_stop(priv->ndev, 0, +-- +2.51.0 + diff --git a/queue-6.12/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch b/queue-6.12/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch new file mode 100644 index 0000000000..ca4dabbde8 --- /dev/null +++ b/queue-6.12/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch @@ -0,0 +1,50 @@ +From c01a43a46554a1af0af781fa240bcd09ad7894f0 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:58:07 +0300 +Subject: dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path + +From: Ioana Ciornei + +[ Upstream commit 902e81e679d86846a2404630d349709ad9372d0d ] + +The blamed commit increased the needed headroom to account for +alignment. 
This means that the size required to always align a Tx buffer +was added inside the dpaa2_eth_needed_headroom() function. By doing +that, a manual adjustment of the pointer passed to PTR_ALIGN() was no +longer correct since the 'buffer_start' variable was already pointing +to the start of the skb's memory. + +The behavior of the dpaa2-eth driver without this patch was to drop +frames on Tx even when the headroom matched the necessary 128 bytes. +Fix this by removing the manual adjustment of 'buffer_start' from +the PTR_ALIGN() call. + +Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fastmail.com/T/#u +Fixes: f422abe3f23d ("dpaa2-eth: increase the needed headroom to account for alignment") +Signed-off-by: Ioana Ciornei +Tested-by: Mathew McBride +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +index c744e10e64033..f56a14e09d4a3 100644 +--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c ++++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +@@ -1077,8 +1077,7 @@ static int dpaa2_eth_build_single_fd(struct dpaa2_eth_priv *priv, + dma_addr_t addr; + + buffer_start = skb->data - dpaa2_eth_needed_headroom(skb); +- aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN, +- DPAA2_ETH_TX_BUF_ALIGN); ++ aligned_start = PTR_ALIGN(buffer_start, DPAA2_ETH_TX_BUF_ALIGN); + if (aligned_start >= skb->head) + buffer_start = aligned_start; + else +-- +2.51.0 + diff --git a/queue-6.12/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch b/queue-6.12/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch new file mode 100644 index 0000000000..446603dbaa --- /dev/null +++ 
b/queue-6.12/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch @@ -0,0 +1,54 @@ +From b1af17fe8db14330f9f85510035eb50c61abc331 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:01:31 +0800 +Subject: net: enetc: correct the value of ENETC_RXB_TRUESIZE + +From: Wei Fang + +[ Upstream commit e59bc32df2e989f034623a580e30a2a72af33b3f ] + +The ENETC RX ring uses the page halves flipping mechanism, each page is +split into two halves for the RX ring to use. And ENETC_RXB_TRUESIZE is +defined to 2048 to indicate the size of half a page. However, the page +size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, +but it could be configured to 16K or 64K. + +When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, +and the RX ring will always use the first half of the page. This is not +consistent with the description in the relevant kernel doc and commit +messages. + +This issue is invisible in most cases, but if users want to increase +PAGE_SIZE to receive a Jumbo frame with a single buffer for some use +cases, it will not work as expected, because the buffer size of each +RX BD is fixed to 2048 bytes. + +Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE +to (PAGE_SIZE >> 1), as described in the comment. 
+ +Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") +Signed-off-by: Wei Fang +Reviewed-by: Claudiu Manoil +Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h +index fb7d98d577839..bf72b2825fa68 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.h ++++ b/drivers/net/ethernet/freescale/enetc/enetc.h +@@ -41,7 +41,7 @@ struct enetc_tx_swbd { + }; + + #define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE +-#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */ ++#define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) + #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ + #define ENETC_RXB_DMA_SIZE \ + (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) +-- +2.51.0 + diff --git a/queue-6.12/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch b/queue-6.12/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch new file mode 100644 index 0000000000..7d48606864 --- /dev/null +++ b/queue-6.12/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch @@ -0,0 +1,158 @@ +From dddd0762402e9c1b9dad1508ab5f745b4ab5e663 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:14:27 +0800 +Subject: net: enetc: fix the deadlock of enetc_mdio_lock + +From: Jianpeng Chang + +[ Upstream commit 50bd33f6b3922a6b760aa30d409cae891cec8fb5 ] + +After applying the workaround for err050089, the LS1028A platform +experiences RCU stalls on RT kernel. This issue is caused by the +recursive acquisition of the read lock enetc_mdio_lock. 
Here list some +of the call stacks identified under the enetc_poll path that may lead to +a deadlock: + +enetc_poll + -> enetc_lock_mdio + -> enetc_clean_rx_ring OR napi_complete_done + -> napi_gro_receive + -> enetc_start_xmit + -> enetc_lock_mdio + -> enetc_map_tx_buffs + -> enetc_unlock_mdio + -> enetc_unlock_mdio + +After enetc_poll acquires the read lock, a higher-priority writer attempts +to acquire the lock, causing preemption. The writer detects that a +read lock is already held and is scheduled out. However, readers under +enetc_poll cannot acquire the read lock again because a writer is already +waiting, leading to a thread hang. + +Currently, the deadlock is avoided by adjusting enetc_lock_mdio to prevent +recursive lock acquisition. + +Fixes: 6d36ecdbc441 ("net: enetc: take the MDIO lock only once per NAPI poll cycle") +Signed-off-by: Jianpeng Chang +Acked-by: Wei Fang +Link: https://patch.msgid.link/20251015021427.180757-1-jianpeng.chang.cn@windriver.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.c | 25 ++++++++++++++++---- + 1 file changed, 21 insertions(+), 4 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c +index d8272b7a55fcb..749b65aab14a9 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.c ++++ b/drivers/net/ethernet/freescale/enetc/enetc.c +@@ -1246,6 +1246,8 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + /* next descriptor to process */ + i = rx_ring->next_to_clean; + ++ enetc_lock_mdio(); ++ + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd; + struct sk_buff *skb; +@@ -1281,7 +1283,9 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + rx_byte_cnt += skb->len + ETH_HLEN; + rx_frm_cnt++; + ++ enetc_unlock_mdio(); + napi_gro_receive(napi, skb); ++ enetc_lock_mdio(); + } + + rx_ring->next_to_clean = i; +@@ -1289,6 +1293,8 @@ static int 
enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + ++ enetc_unlock_mdio(); ++ + return rx_frm_cnt; + } + +@@ -1598,6 +1604,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + /* next descriptor to process */ + i = rx_ring->next_to_clean; + ++ enetc_lock_mdio(); ++ + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd, *orig_rxbd; + int orig_i, orig_cleaned_cnt; +@@ -1657,7 +1665,9 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + if (unlikely(!skb)) + goto out; + ++ enetc_unlock_mdio(); + napi_gro_receive(napi, skb); ++ enetc_lock_mdio(); + break; + case XDP_TX: + tx_ring = priv->xdp_tx_ring[rx_ring->index]; +@@ -1692,7 +1702,9 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + } + break; + case XDP_REDIRECT: ++ enetc_unlock_mdio(); + err = xdp_do_redirect(rx_ring->ndev, &xdp_buff, prog); ++ enetc_lock_mdio(); + if (unlikely(err)) { + enetc_xdp_drop(rx_ring, orig_i, i); + rx_ring->stats.xdp_redirect_failures++; +@@ -1712,8 +1724,11 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + +- if (xdp_redirect_frm_cnt) ++ if (xdp_redirect_frm_cnt) { ++ enetc_unlock_mdio(); + xdp_do_flush(); ++ enetc_lock_mdio(); ++ } + + if (xdp_tx_frm_cnt) + enetc_update_tx_ring_tail(tx_ring); +@@ -1722,6 +1737,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + enetc_refill_rx_ring(rx_ring, enetc_bd_unused(rx_ring) - + rx_ring->xdp.xdp_tx_in_flight); + ++ enetc_unlock_mdio(); ++ + return rx_frm_cnt; + } + +@@ -1740,6 +1757,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) + for (i = 0; i < v->count_tx_rings; i++) + if (!enetc_clean_tx_ring(&v->tx_ring[i], budget)) + complete = false; ++ enetc_unlock_mdio(); + + prog = rx_ring->xdp.prog; + if (prog) +@@ -1751,10 +1769,8 @@ static int enetc_poll(struct napi_struct *napi, int 
budget) + if (work_done) + v->rx_napi_work = true; + +- if (!complete) { +- enetc_unlock_mdio(); ++ if (!complete) + return budget; +- } + + napi_complete_done(napi, work_done); + +@@ -1763,6 +1779,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) + + v->rx_napi_work = false; + ++ enetc_lock_mdio(); + /* enable interrupts */ + enetc_wr_reg_hot(v->rbier, ENETC_RBIER_RXTIE); + +-- +2.51.0 + diff --git a/queue-6.12/net-ethernet-ti-am65-cpts-fix-timestamp-loss-due-to-.patch b/queue-6.12/net-ethernet-ti-am65-cpts-fix-timestamp-loss-due-to-.patch new file mode 100644 index 0000000000..4acdc1fd3c --- /dev/null +++ b/queue-6.12/net-ethernet-ti-am65-cpts-fix-timestamp-loss-due-to-.patch @@ -0,0 +1,191 @@ +From b2297dd41c3d0e434a6357aff6d70dc33aab59aa Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 17:27:55 +0530 +Subject: net: ethernet: ti: am65-cpts: fix timestamp loss due to race + conditions + +From: Aksh Garg + +[ Upstream commit 49d34f3dd8519581030547eb7543a62f9ab5fa08 ] + +Resolve race conditions in timestamp events list handling between TX +and RX paths causing missed timestamps. + +The current implementation uses a single events list for both TX and RX +timestamps. The am65_cpts_find_ts() function acquires the lock, +splices all events (TX as well as RX events) to a temporary list, +and releases the lock. This function performs matching of timestamps +for TX packets only. Before it acquires the lock again to put the +non-TX events back to the main events list, a concurrent RX +processing thread could acquire the lock (as observed in practice), +find an empty events list, and fail to attach timestamp to it, +even though a relevant event exists in the spliced list which is yet to +be restored to the main list. + +Fix this by creating separate events lists to handle TX and RX +timestamps independently. 
+ +Fixes: c459f606f66df ("net: ethernet: ti: am65-cpts: Enable RX HW timestamp for PTP packets using CPTS FIFO") +Signed-off-by: Aksh Garg +Reviewed-by: Siddharth Vadapalli +Link: https://patch.msgid.link/20251016115755.1123646-1-a-garg7@ti.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/ti/am65-cpts.c | 63 ++++++++++++++++++++--------- + 1 file changed, 43 insertions(+), 20 deletions(-) + +diff --git a/drivers/net/ethernet/ti/am65-cpts.c b/drivers/net/ethernet/ti/am65-cpts.c +index 59d6ab989c554..8ffbfaa3ab18c 100644 +--- a/drivers/net/ethernet/ti/am65-cpts.c ++++ b/drivers/net/ethernet/ti/am65-cpts.c +@@ -163,7 +163,9 @@ struct am65_cpts { + struct device_node *clk_mux_np; + struct clk *refclk; + u32 refclk_freq; +- struct list_head events; ++ /* separate lists to handle TX and RX timestamp independently */ ++ struct list_head events_tx; ++ struct list_head events_rx; + struct list_head pool; + struct am65_cpts_event pool_data[AM65_CPTS_MAX_EVENTS]; + spinlock_t lock; /* protects events lists*/ +@@ -227,6 +229,24 @@ static void am65_cpts_disable(struct am65_cpts *cpts) + am65_cpts_write32(cpts, 0, int_enable); + } + ++static int am65_cpts_purge_event_list(struct am65_cpts *cpts, ++ struct list_head *events) ++{ ++ struct list_head *this, *next; ++ struct am65_cpts_event *event; ++ int removed = 0; ++ ++ list_for_each_safe(this, next, events) { ++ event = list_entry(this, struct am65_cpts_event, list); ++ if (time_after(jiffies, event->tmo)) { ++ list_del_init(&event->list); ++ list_add(&event->list, &cpts->pool); ++ ++removed; ++ } ++ } ++ return removed; ++} ++ + static int am65_cpts_event_get_port(struct am65_cpts_event *event) + { + return (event->event1 & AM65_CPTS_EVENT_1_PORT_NUMBER_MASK) >> +@@ -239,20 +259,12 @@ static int am65_cpts_event_get_type(struct am65_cpts_event *event) + AM65_CPTS_EVENT_1_EVENT_TYPE_SHIFT; + } + +-static int am65_cpts_cpts_purge_events(struct am65_cpts *cpts) ++static int 
am65_cpts_purge_events(struct am65_cpts *cpts) + { +- struct list_head *this, *next; +- struct am65_cpts_event *event; + int removed = 0; + +- list_for_each_safe(this, next, &cpts->events) { +- event = list_entry(this, struct am65_cpts_event, list); +- if (time_after(jiffies, event->tmo)) { +- list_del_init(&event->list); +- list_add(&event->list, &cpts->pool); +- ++removed; +- } +- } ++ removed += am65_cpts_purge_event_list(cpts, &cpts->events_tx); ++ removed += am65_cpts_purge_event_list(cpts, &cpts->events_rx); + + if (removed) + dev_dbg(cpts->dev, "event pool cleaned up %d\n", removed); +@@ -287,7 +299,7 @@ static int __am65_cpts_fifo_read(struct am65_cpts *cpts) + struct am65_cpts_event, list); + + if (!event) { +- if (am65_cpts_cpts_purge_events(cpts)) { ++ if (am65_cpts_purge_events(cpts)) { + dev_err(cpts->dev, "cpts: event pool empty\n"); + ret = -1; + goto out; +@@ -306,11 +318,21 @@ static int __am65_cpts_fifo_read(struct am65_cpts *cpts) + cpts->timestamp); + break; + case AM65_CPTS_EV_RX: ++ event->tmo = jiffies + ++ msecs_to_jiffies(AM65_CPTS_EVENT_RX_TX_TIMEOUT); ++ ++ list_move_tail(&event->list, &cpts->events_rx); ++ ++ dev_dbg(cpts->dev, ++ "AM65_CPTS_EV_RX e1:%08x e2:%08x t:%lld\n", ++ event->event1, event->event2, ++ event->timestamp); ++ break; + case AM65_CPTS_EV_TX: + event->tmo = jiffies + + msecs_to_jiffies(AM65_CPTS_EVENT_RX_TX_TIMEOUT); + +- list_move_tail(&event->list, &cpts->events); ++ list_move_tail(&event->list, &cpts->events_tx); + + dev_dbg(cpts->dev, + "AM65_CPTS_EV_TX e1:%08x e2:%08x t:%lld\n", +@@ -828,7 +850,7 @@ static bool am65_cpts_match_tx_ts(struct am65_cpts *cpts, + return found; + } + +-static void am65_cpts_find_ts(struct am65_cpts *cpts) ++static void am65_cpts_find_tx_ts(struct am65_cpts *cpts) + { + struct am65_cpts_event *event; + struct list_head *this, *next; +@@ -837,7 +859,7 @@ static void am65_cpts_find_ts(struct am65_cpts *cpts) + LIST_HEAD(events); + + spin_lock_irqsave(&cpts->lock, flags); +- 
list_splice_init(&cpts->events, &events); ++ list_splice_init(&cpts->events_tx, &events); + spin_unlock_irqrestore(&cpts->lock, flags); + + list_for_each_safe(this, next, &events) { +@@ -850,7 +872,7 @@ static void am65_cpts_find_ts(struct am65_cpts *cpts) + } + + spin_lock_irqsave(&cpts->lock, flags); +- list_splice_tail(&events, &cpts->events); ++ list_splice_tail(&events, &cpts->events_tx); + list_splice_tail(&events_free, &cpts->pool); + spin_unlock_irqrestore(&cpts->lock, flags); + } +@@ -861,7 +883,7 @@ static long am65_cpts_ts_work(struct ptp_clock_info *ptp) + unsigned long flags; + long delay = -1; + +- am65_cpts_find_ts(cpts); ++ am65_cpts_find_tx_ts(cpts); + + spin_lock_irqsave(&cpts->txq.lock, flags); + if (!skb_queue_empty(&cpts->txq)) +@@ -905,7 +927,7 @@ static u64 am65_cpts_find_rx_ts(struct am65_cpts *cpts, u32 skb_mtype_seqid) + + spin_lock_irqsave(&cpts->lock, flags); + __am65_cpts_fifo_read(cpts); +- list_for_each_safe(this, next, &cpts->events) { ++ list_for_each_safe(this, next, &cpts->events_rx) { + event = list_entry(this, struct am65_cpts_event, list); + if (time_after(jiffies, event->tmo)) { + list_move(&event->list, &cpts->pool); +@@ -1155,7 +1177,8 @@ struct am65_cpts *am65_cpts_create(struct device *dev, void __iomem *regs, + return ERR_PTR(ret); + + mutex_init(&cpts->ptp_clk_lock); +- INIT_LIST_HEAD(&cpts->events); ++ INIT_LIST_HEAD(&cpts->events_tx); ++ INIT_LIST_HEAD(&cpts->events_rx); + INIT_LIST_HEAD(&cpts->pool); + spin_lock_init(&cpts->lock); + skb_queue_head_init(&cpts->txq); +-- +2.51.0 + diff --git a/queue-6.12/net-mlx5-fix-ipsec-cleanup-over-mpv-device.patch b/queue-6.12/net-mlx5-fix-ipsec-cleanup-over-mpv-device.patch new file mode 100644 index 0000000000..793272d06f --- /dev/null +++ b/queue-6.12/net-mlx5-fix-ipsec-cleanup-over-mpv-device.patch @@ -0,0 +1,201 @@ +From d5216bd7e651c7440c30c381bf8ee17502351415 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 22 Oct 2025 15:29:42 +0300 +Subject: net/mlx5: Fix IPsec 
cleanup over MPV device + +From: Patrisious Haddad + +[ Upstream commit 664f76be38a18c61151d0ef248c7e2f3afb4f3c7 ] + +When we do mlx5e_detach_netdev() we eventually disable blocking events +notifier, among those events are IPsec MPV events from IB to core. + +So before disabling those blocking events, make sure to also unregister +the devcom device and mark all this device operations as complete, +in order to prevent the other device from using invalid netdev +during future devcom events which could cause the trace below. + +BUG: kernel NULL pointer dereference, address: 0000000000000010 +PGD 146427067 P4D 146427067 PUD 146488067 PMD 0 +Oops: Oops: 0000 [#1] SMP +CPU: 1 UID: 0 PID: 7735 Comm: devlink Tainted: GW 6.12.0-rc6_for_upstream_min_debug_2024_11_08_00_46 #1 +Tainted: [W]=WARN +Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 +RIP: 0010:mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core] +Code: 00 01 48 83 05 23 32 1e 00 01 41 b8 ed ff ff ff e9 60 ff ff ff 48 83 05 00 32 1e 00 01 eb e3 66 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 47 10 48 83 05 5f 32 1e 00 01 48 8b 50 40 48 85 d2 74 05 40 +RSP: 0018:ffff88811a5c35f8 EFLAGS: 00010206 +RAX: ffff888106e8ab80 RBX: ffff888107d7e200 RCX: ffff88810d6f0a00 +RDX: ffff88810d6f0a00 RSI: 0000000000000001 RDI: 0000000000000000 +RBP: ffff88811a17e620 R08: 0000000000000040 R09: 0000000000000000 +R10: ffff88811a5c3618 R11: 0000000de85d51bd R12: ffff88811a17e600 +R13: ffff88810d6f0a00 R14: 0000000000000000 R15: ffff8881034bda80 +FS: 00007f27bdf89180(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 0000000000000010 CR3: 000000010f159005 CR4: 0000000000372eb0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + + ? __die+0x20/0x60 + ? page_fault_oops+0x150/0x3e0 + ? exc_page_fault+0x74/0x130 + ? 
asm_exc_page_fault+0x22/0x30 + ? mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core] + mlx5e_devcom_event_mpv+0x42/0x60 [mlx5_core] + mlx5_devcom_send_event+0x8c/0x170 [mlx5_core] + blocking_event+0x17b/0x230 [mlx5_core] + notifier_call_chain+0x35/0xa0 + blocking_notifier_call_chain+0x3d/0x60 + mlx5_blocking_notifier_call_chain+0x22/0x30 [mlx5_core] + mlx5_core_mp_event_replay+0x12/0x20 [mlx5_core] + mlx5_ib_bind_slave_port+0x228/0x2c0 [mlx5_ib] + mlx5_ib_stage_init_init+0x664/0x9d0 [mlx5_ib] + ? idr_alloc_cyclic+0x50/0xb0 + ? __kmalloc_cache_noprof+0x167/0x340 + ? __kmalloc_noprof+0x1a7/0x430 + __mlx5_ib_add+0x34/0xd0 [mlx5_ib] + mlx5r_probe+0xe9/0x310 [mlx5_ib] + ? kernfs_add_one+0x107/0x150 + ? __mlx5_ib_add+0xd0/0xd0 [mlx5_ib] + auxiliary_bus_probe+0x3e/0x90 + really_probe+0xc5/0x3a0 + ? driver_probe_device+0x90/0x90 + __driver_probe_device+0x80/0x160 + driver_probe_device+0x1e/0x90 + __device_attach_driver+0x7d/0x100 + bus_for_each_drv+0x80/0xd0 + __device_attach+0xbc/0x1f0 + bus_probe_device+0x86/0xa0 + device_add+0x62d/0x830 + __auxiliary_device_add+0x3b/0xa0 + ? auxiliary_device_init+0x41/0x90 + add_adev+0xd1/0x150 [mlx5_core] + mlx5_rescan_drivers_locked+0x21c/0x300 [mlx5_core] + esw_mode_change+0x6c/0xc0 [mlx5_core] + mlx5_devlink_eswitch_mode_set+0x21e/0x640 [mlx5_core] + devlink_nl_eswitch_set_doit+0x60/0xe0 + genl_family_rcv_msg_doit+0xd0/0x120 + genl_rcv_msg+0x180/0x2b0 + ? devlink_get_from_attrs_lock+0x170/0x170 + ? devlink_nl_eswitch_get_doit+0x290/0x290 + ? devlink_nl_pre_doit_port_optional+0x50/0x50 + ? genl_family_rcv_msg_dumpit+0xf0/0xf0 + netlink_rcv_skb+0x54/0x100 + genl_rcv+0x24/0x40 + netlink_unicast+0x1fc/0x2d0 + netlink_sendmsg+0x1e4/0x410 + __sock_sendmsg+0x38/0x60 + ? sockfd_lookup_light+0x12/0x60 + __sys_sendto+0x105/0x160 + ? 
__sys_recvmsg+0x4e/0x90 + __x64_sys_sendto+0x20/0x30 + do_syscall_64+0x4c/0x100 + entry_SYSCALL_64_after_hwframe+0x4b/0x53 +RIP: 0033:0x7f27bc91b13a +Code: bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 fa 96 2c 00 45 89 c9 4c 63 d1 48 63 ff 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 f3 c3 0f 1f 40 00 41 55 41 54 4d 89 c5 55 +RSP: 002b:00007fff369557e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c +RAX: ffffffffffffffda RBX: 0000000009c54b10 RCX: 00007f27bc91b13a +RDX: 0000000000000038 RSI: 0000000009c54b10 RDI: 0000000000000006 +RBP: 0000000009c54920 R08: 00007f27bd0030e0 R09: 000000000000000c +R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 +R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 + +Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi ib_umad scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm mlx5_fwctl mlx5_ib ib_uverbs ib_core mlx5_core +CR2: 0000000000000010 + +Fixes: 82f9378c443c ("net/mlx5: Handle IPsec steering upon master unbind/bind") +Signed-off-by: Patrisious Haddad +Reviewed-by: Leon Romanovsky +Signed-off-by: Tariq Toukan +Link: https://patch.msgid.link/1761136182-918470-5-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../mellanox/mlx5/core/en_accel/ipsec.h | 5 ++++ + .../mellanox/mlx5/core/en_accel/ipsec_fs.c | 25 +++++++++++++++++-- + .../net/ethernet/mellanox/mlx5/core/en_main.c | 2 ++ + 3 files changed, 30 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h +index 9aff779c77c89..78e78b6f81467 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h +@@ -337,6 +337,7 
@@ void mlx5e_ipsec_build_accel_xfrm_attrs(struct mlx5e_ipsec_sa_entry *sa_entry, + void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *slave_priv, + struct mlx5e_priv *master_priv); + void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event); ++void mlx5e_ipsec_disable_events(struct mlx5e_priv *priv); + + static inline struct mlx5_core_dev * + mlx5e_ipsec_sa2dev(struct mlx5e_ipsec_sa_entry *sa_entry) +@@ -382,6 +383,10 @@ static inline void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *sl + static inline void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event) + { + } ++ ++static inline void mlx5e_ipsec_disable_events(struct mlx5e_priv *priv) ++{ ++} + #endif + + #endif /* __MLX5E_IPSEC_H__ */ +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +index 59b9653f573c8..131eb9b4eba65 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +@@ -2421,9 +2421,30 @@ void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *slave_priv, + + void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event) + { +- if (!priv->ipsec) +- return; /* IPsec not supported */ ++ if (!priv->ipsec || mlx5_devcom_comp_get_size(priv->devcom) < 2) ++ return; /* IPsec not supported or no peers */ + + mlx5_devcom_send_event(priv->devcom, event, event, priv); + wait_for_completion(&priv->ipsec->comp); + } ++ ++void mlx5e_ipsec_disable_events(struct mlx5e_priv *priv) ++{ ++ struct mlx5_devcom_comp_dev *tmp = NULL; ++ struct mlx5e_priv *peer_priv; ++ ++ if (!priv->devcom) ++ return; ++ ++ if (!mlx5_devcom_for_each_peer_begin(priv->devcom)) ++ goto out; ++ ++ peer_priv = mlx5_devcom_get_next_peer_data(priv->devcom, &tmp); ++ if (peer_priv) ++ complete_all(&peer_priv->ipsec->comp); ++ ++ mlx5_devcom_for_each_peer_end(priv->devcom); ++out: ++ mlx5_devcom_unregister_component(priv->devcom); 
++ priv->devcom = NULL; ++} +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +index 4a2f58a9d7066..7e04a17fa3b82 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +@@ -257,6 +257,7 @@ static void mlx5e_devcom_cleanup_mpv(struct mlx5e_priv *priv) + } + + mlx5_devcom_unregister_component(priv->devcom); ++ priv->devcom = NULL; + } + + static int blocking_event(struct notifier_block *nb, unsigned long event, void *data) +@@ -5830,6 +5831,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv) + if (mlx5e_monitor_counter_supported(priv)) + mlx5e_monitor_counter_cleanup(priv); + ++ mlx5e_ipsec_disable_events(priv); + mlx5e_disable_blocking_events(priv); + if (priv->en_trap) { + mlx5e_deactivate_trap(priv); +-- +2.51.0 + diff --git a/queue-6.12/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch b/queue-6.12/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch new file mode 100644 index 0000000000..a6b2b0b7a1 --- /dev/null +++ b/queue-6.12/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch @@ -0,0 +1,65 @@ +From d42366d09222029c41694f3eff0b676a2ed06a23 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 14 Oct 2025 13:46:49 -0700 +Subject: net/mlx5e: Return 1 instead of 0 in invalid case in + mlx5e_mpwrq_umr_entry_size() + +From: Nathan Chancellor + +[ Upstream commit aaf043a5688114703ae2c1482b92e7e0754d684e ] + +When building with Clang 20 or newer, there are some objtool warnings +from unexpected fallthroughs to other functions: + + vmlinux.o: warning: objtool: mlx5e_mpwrq_mtts_per_wqe() falls through to next function mlx5e_mpwrq_max_num_entries() + vmlinux.o: warning: objtool: mlx5e_mpwrq_max_log_rq_size() falls through to next function mlx5e_get_linear_rq_headroom() + +LLVM 20 contains an (admittedly problematic [1]) optimization [2] to +convert divide by zero into the equivalent of 
__builtin_unreachable(), +which invokes undefined behavior and destroys code generation when it is +encountered in a control flow graph. + +mlx5e_mpwrq_umr_entry_size() returns 0 in the default case of an +unrecognized mlx5e_mpwrq_umr_mode value. mlx5e_mpwrq_mtts_per_wqe(), +which is inlined into mlx5e_mpwrq_max_log_rq_size(), uses the result of +mlx5e_mpwrq_umr_entry_size() in a divide operation without checking for +zero, so LLVM is able to infer there will be a divide by zero in this +case and invokes undefined behavior. While there is some proposed work +to isolate this undefined behavior and avoid the destructive code +generation that results in these objtool warnings, code should still be +defensive against divide by zero. + +As the WARN_ONCE() implies that an invalid value should be handled +gracefully, return 1 instead of 0 in the default case so that the +result of this division operation is always valid. + +Fixes: 168723c1f8d6 ("net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters") +Link: https://lore.kernel.org/CAGG=3QUk8-Ak7YKnRziO4=0z=1C_7+4jF+6ZeDQ9yF+kuTOHOQ@mail.gmail.com/ [1] +Link: https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448 [2] +Closes: https://github.com/ClangBuiltLinux/linux/issues/2131 +Closes: https://github.com/ClangBuiltLinux/linux/issues/2132 +Signed-off-by: Nathan Chancellor +Reviewed-by: Tariq Toukan +Link: https://patch.msgid.link/20251014-mlx5e-avoid-zero-div-from-mlx5e_mpwrq_umr_entry_size-v1-1-dc186b8819ef@kernel.org +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +index 58ec5e44aa7ad..3dac708c0d75a 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +@@ -99,7 +99,7
@@ u8 mlx5e_mpwrq_umr_entry_size(enum mlx5e_mpwrq_umr_mode mode) + return sizeof(struct mlx5_ksm) * 4; + } + WARN_ONCE(1, "MPWRQ UMR mode %d is not known\n", mode); +- return 0; ++ return 1; + } + + u8 mlx5e_mpwrq_log_wqe_sz(struct mlx5_core_dev *mdev, u8 page_shift, +-- +2.51.0 + diff --git a/queue-6.12/net-mlx5e-reuse-per-rq-xdp-buffer-to-avoid-stack-zer.patch b/queue-6.12/net-mlx5e-reuse-per-rq-xdp-buffer-to-avoid-stack-zer.patch new file mode 100644 index 0000000000..eb74e7f93c --- /dev/null +++ b/queue-6.12/net-mlx5e-reuse-per-rq-xdp-buffer-to-avoid-stack-zer.patch @@ -0,0 +1,327 @@ +From 251acb7277406f0f5444e193c3bdb164770577d8 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 14 May 2025 23:03:52 +0300 +Subject: net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead + +From: Carolina Jubran + +[ Upstream commit b66b76a82c8879d764ab89adc21ee855ffd292d5 ] + +CONFIG_INIT_STACK_ALL_ZERO introduces a performance cost by +zero-initializing all stack variables on function entry. The mlx5 XDP +RX path previously allocated a struct mlx5e_xdp_buff on the stack per +received CQE, resulting in measurable performance degradation under +this config. + +This patch reuses a mlx5e_xdp_buff stored in the mlx5e_rq struct, +avoiding per-CQE stack allocations and repeated zeroing. + +With this change, XDP_DROP and XDP_TX performance matches that of +kernels built without CONFIG_INIT_STACK_ALL_ZERO. + +Performance was measured on a ConnectX-6Dx using a single RX channel +(1 CPU at 100% usage) at ~50 Mpps. The baseline results were taken from +net-next-6.15. 
+ +Stack zeroing disabled: +- XDP_DROP: + * baseline: 31.47 Mpps + * baseline + per-RQ allocation: 32.31 Mpps (+2.68%) + +- XDP_TX: + * baseline: 12.41 Mpps + * baseline + per-RQ allocation: 12.95 Mpps (+4.30%) + +Stack zeroing enabled: +- XDP_DROP: + * baseline: 24.32 Mpps + * baseline + per-RQ allocation: 32.27 Mpps (+32.7%) + +- XDP_TX: + * baseline: 11.80 Mpps + * baseline + per-RQ allocation: 12.24 Mpps (+3.72%) + +Reported-by: Sebastiano Miano +Reported-by: Samuel Dobron +Link: https://lore.kernel.org/all/CAMENy5pb8ea+piKLg5q5yRTMZacQqYWAoVLE1FE9WhQPq92E0g@mail.gmail.com/ +Signed-off-by: Carolina Jubran +Reviewed-by: Dragos Tatulea +Signed-off-by: Tariq Toukan +Acked-by: Jesper Dangaard Brouer +Link: https://patch.msgid.link/1747253032-663457-1-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Stable-dep-of: afd5ba577c10 ("net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ") +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/mellanox/mlx5/core/en.h | 7 ++ + .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 6 -- + .../net/ethernet/mellanox/mlx5/core/en_rx.c | 81 ++++++++++--------- + 3 files changed, 51 insertions(+), 43 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h +index e048a667e0758..f2952a6b0db73 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h +@@ -512,6 +512,12 @@ struct mlx5e_xdpsq { + struct mlx5e_channel *channel; + } ____cacheline_aligned_in_smp; + ++struct mlx5e_xdp_buff { ++ struct xdp_buff xdp; ++ struct mlx5_cqe64 *cqe; ++ struct mlx5e_rq *rq; ++}; ++ + struct mlx5e_ktls_resync_resp; + + struct mlx5e_icosq { +@@ -710,6 +716,7 @@ struct mlx5e_rq { + struct mlx5e_xdpsq *xdpsq; + DECLARE_BITMAP(flags, 8); + struct page_pool *page_pool; ++ struct mlx5e_xdp_buff mxbuf; + + /* AF_XDP zero-copy */ + struct xsk_buff_pool *xsk_pool; +diff --git 
a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +index e054db1e10f8a..75256cf978c86 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +@@ -45,12 +45,6 @@ + (MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \ + sizeof(struct mlx5_wqe_inline_seg)) + +-struct mlx5e_xdp_buff { +- struct xdp_buff xdp; +- struct mlx5_cqe64 *cqe; +- struct mlx5e_rq *rq; +-}; +- + /* XDP packets can be transmitted in different ways. On completion, we need to + * distinguish between them to clean up things in a proper way. + */ +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +index 673043d9ed11a..f072b21eb610d 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +@@ -1691,17 +1691,17 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, + + prog = rcu_dereference(rq->xdp_prog); + if (prog) { +- struct mlx5e_xdp_buff mxbuf; ++ struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf; + + net_prefetchw(va); /* xdp_frame data area */ + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, rq->buff.frame0_sz, +- cqe_bcnt, &mxbuf); +- if (mlx5e_xdp_handle(rq, prog, &mxbuf)) ++ cqe_bcnt, mxbuf); ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) + return NULL; /* page/packet was consumed by XDP */ + +- rx_headroom = mxbuf.xdp.data - mxbuf.xdp.data_hard_start; +- metasize = mxbuf.xdp.data - mxbuf.xdp.data_meta; +- cqe_bcnt = mxbuf.xdp.data_end - mxbuf.xdp.data; ++ rx_headroom = mxbuf->xdp.data - mxbuf->xdp.data_hard_start; ++ metasize = mxbuf->xdp.data - mxbuf->xdp.data_meta; ++ cqe_bcnt = mxbuf->xdp.data_end - mxbuf->xdp.data; + } + frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); + skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize); +@@ -1720,11 +1720,11 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct 
mlx5e_wqe_frag_info *wi + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) + { + struct mlx5e_rq_frag_info *frag_info = &rq->wqe.info.arr[0]; ++ struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf; + struct mlx5e_wqe_frag_info *head_wi = wi; + u16 rx_headroom = rq->buff.headroom; + struct mlx5e_frag_page *frag_page; + struct skb_shared_info *sinfo; +- struct mlx5e_xdp_buff mxbuf; + u32 frag_consumed_bytes; + struct bpf_prog *prog; + struct sk_buff *skb; +@@ -1744,8 +1744,8 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + net_prefetch(va + rx_headroom); + + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, rq->buff.frame0_sz, +- frag_consumed_bytes, &mxbuf); +- sinfo = xdp_get_shared_info_from_buff(&mxbuf.xdp); ++ frag_consumed_bytes, mxbuf); ++ sinfo = xdp_get_shared_info_from_buff(&mxbuf->xdp); + truesize = 0; + + cqe_bcnt -= frag_consumed_bytes; +@@ -1757,8 +1757,9 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + + frag_consumed_bytes = min_t(u32, frag_info->frag_size, cqe_bcnt); + +- mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf.xdp, frag_page, +- wi->offset, frag_consumed_bytes); ++ mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf->xdp, ++ frag_page, wi->offset, ++ frag_consumed_bytes); + truesize += frag_info->frag_stride; + + cqe_bcnt -= frag_consumed_bytes; +@@ -1767,7 +1768,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + } + + prog = rcu_dereference(rq->xdp_prog); +- if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) { ++ if (prog && mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + struct mlx5e_wqe_frag_info *pwi; + +@@ -1777,21 +1778,23 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + return NULL; /* page/packet was consumed by XDP */ + } + +- skb = mlx5e_build_linear_skb(rq, mxbuf.xdp.data_hard_start, rq->buff.frame0_sz, +- mxbuf.xdp.data - mxbuf.xdp.data_hard_start, +- 
mxbuf.xdp.data_end - mxbuf.xdp.data, +- mxbuf.xdp.data - mxbuf.xdp.data_meta); ++ skb = mlx5e_build_linear_skb( ++ rq, mxbuf->xdp.data_hard_start, rq->buff.frame0_sz, ++ mxbuf->xdp.data - mxbuf->xdp.data_hard_start, ++ mxbuf->xdp.data_end - mxbuf->xdp.data, ++ mxbuf->xdp.data - mxbuf->xdp.data_meta); + if (unlikely(!skb)) + return NULL; + + skb_mark_for_recycle(skb); + head_wi->frag_page->frags++; + +- if (xdp_buff_has_frags(&mxbuf.xdp)) { ++ if (xdp_buff_has_frags(&mxbuf->xdp)) { + /* sinfo->nr_frags is reset by build_skb, calculate again. */ + xdp_update_skb_shared_info(skb, wi - head_wi - 1, + sinfo->xdp_frags_size, truesize, +- xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); ++ xdp_buff_is_frag_pfmemalloc( ++ &mxbuf->xdp)); + + for (struct mlx5e_wqe_frag_info *pwi = head_wi + 1; pwi < wi; pwi++) + pwi->frag_page->frags++; +@@ -1991,10 +1994,10 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + struct mlx5e_frag_page *frag_page = &wi->alloc_units.frag_pages[page_idx]; + u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt); + struct mlx5e_frag_page *head_page = frag_page; ++ struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf; + u32 frag_offset = head_offset; + u32 byte_cnt = cqe_bcnt; + struct skb_shared_info *sinfo; +- struct mlx5e_xdp_buff mxbuf; + unsigned int truesize = 0; + struct bpf_prog *prog; + struct sk_buff *skb; +@@ -2040,9 +2043,10 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } + } + +- mlx5e_fill_mxbuf(rq, cqe, va, linear_hr, linear_frame_sz, linear_data_len, &mxbuf); ++ mlx5e_fill_mxbuf(rq, cqe, va, linear_hr, linear_frame_sz, ++ linear_data_len, mxbuf); + +- sinfo = xdp_get_shared_info_from_buff(&mxbuf.xdp); ++ sinfo = xdp_get_shared_info_from_buff(&mxbuf->xdp); + + while (byte_cnt) { + /* Non-linear mode, hence non-XSK, which always uses PAGE_SIZE. 
*/ +@@ -2053,7 +2057,8 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + else + truesize += ALIGN(pg_consumed_bytes, BIT(rq->mpwqe.log_stride_sz)); + +- mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf.xdp, frag_page, frag_offset, ++ mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf->xdp, ++ frag_page, frag_offset, + pg_consumed_bytes); + byte_cnt -= pg_consumed_bytes; + frag_offset = 0; +@@ -2061,7 +2066,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } + + if (prog) { +- if (mlx5e_xdp_handle(rq, prog, &mxbuf)) { ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + struct mlx5e_frag_page *pfp; + +@@ -2074,10 +2079,10 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + return NULL; /* page/packet was consumed by XDP */ + } + +- skb = mlx5e_build_linear_skb(rq, mxbuf.xdp.data_hard_start, +- linear_frame_sz, +- mxbuf.xdp.data - mxbuf.xdp.data_hard_start, 0, +- mxbuf.xdp.data - mxbuf.xdp.data_meta); ++ skb = mlx5e_build_linear_skb( ++ rq, mxbuf->xdp.data_hard_start, linear_frame_sz, ++ mxbuf->xdp.data - mxbuf->xdp.data_hard_start, 0, ++ mxbuf->xdp.data - mxbuf->xdp.data_meta); + if (unlikely(!skb)) { + mlx5e_page_release_fragmented(rq, &wi->linear_page); + return NULL; +@@ -2087,13 +2092,14 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + wi->linear_page.frags++; + mlx5e_page_release_fragmented(rq, &wi->linear_page); + +- if (xdp_buff_has_frags(&mxbuf.xdp)) { ++ if (xdp_buff_has_frags(&mxbuf->xdp)) { + struct mlx5e_frag_page *pagep; + + /* sinfo->nr_frags is reset by build_skb, calculate again. 
*/ + xdp_update_skb_shared_info(skb, frag_page - head_page, + sinfo->xdp_frags_size, truesize, +- xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); ++ xdp_buff_is_frag_pfmemalloc( ++ &mxbuf->xdp)); + + pagep = head_page; + do +@@ -2104,12 +2110,13 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } else { + dma_addr_t addr; + +- if (xdp_buff_has_frags(&mxbuf.xdp)) { ++ if (xdp_buff_has_frags(&mxbuf->xdp)) { + struct mlx5e_frag_page *pagep; + + xdp_update_skb_shared_info(skb, sinfo->nr_frags, + sinfo->xdp_frags_size, truesize, +- xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); ++ xdp_buff_is_frag_pfmemalloc( ++ &mxbuf->xdp)); + + pagep = frag_page - sinfo->nr_frags; + do +@@ -2159,20 +2166,20 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + + prog = rcu_dereference(rq->xdp_prog); + if (prog) { +- struct mlx5e_xdp_buff mxbuf; ++ struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf; + + net_prefetchw(va); /* xdp_frame data area */ + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, rq->buff.frame0_sz, +- cqe_bcnt, &mxbuf); +- if (mlx5e_xdp_handle(rq, prog, &mxbuf)) { ++ cqe_bcnt, mxbuf); ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) + frag_page->frags++; + return NULL; /* page/packet was consumed by XDP */ + } + +- rx_headroom = mxbuf.xdp.data - mxbuf.xdp.data_hard_start; +- metasize = mxbuf.xdp.data - mxbuf.xdp.data_meta; +- cqe_bcnt = mxbuf.xdp.data_end - mxbuf.xdp.data; ++ rx_headroom = mxbuf->xdp.data - mxbuf->xdp.data_hard_start; ++ metasize = mxbuf->xdp.data - mxbuf->xdp.data_meta; ++ cqe_bcnt = mxbuf->xdp.data_end - mxbuf->xdp.data; + } + frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); + skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize); +-- +2.51.0 + diff --git a/queue-6.12/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch b/queue-6.12/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch new file mode 100644 
index 0000000000..e15f74cd01 --- /dev/null +++ b/queue-6.12/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch @@ -0,0 +1,68 @@ +From 710e4bfbedd2b76ac9788b52e021c139e1bb7654 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 22:55:39 +0300 +Subject: net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy + RQ + +From: Amery Hung + +[ Upstream commit afd5ba577c10639f62e8120df67dc70ea4b61176 ] + +XDP programs can release xdp_buff fragments when calling +bpf_xdp_adjust_tail(). The driver currently assumes the number of +fragments to be unchanged and may generate skb with wrong truesize or +containing invalid frags. Fix the bug by generating skb according to +xdp_buff after the XDP program runs. + +Fixes: ea5d49bdae8b ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ") +Reviewed-by: Dragos Tatulea +Signed-off-by: Amery Hung +Signed-off-by: Tariq Toukan +Link: https://patch.msgid.link/1760644540-899148-2-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../net/ethernet/mellanox/mlx5/core/en_rx.c | 25 ++++++++++++++----- + 1 file changed, 19 insertions(+), 6 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +index f072b21eb610d..5cacb25a763e4 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +@@ -1768,14 +1768,27 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + } + + prog = rcu_dereference(rq->xdp_prog); +- if (prog && mlx5e_xdp_handle(rq, prog, mxbuf)) { +- if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { +- struct mlx5e_wqe_frag_info *pwi; ++ if (prog) { ++ u8 nr_frags_free, old_nr_frags = sinfo->nr_frags; ++ ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) { ++ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, ++ rq->flags)) { ++ struct mlx5e_wqe_frag_info *pwi; ++ ++ wi -= 
old_nr_frags - sinfo->nr_frags; ++ ++ for (pwi = head_wi; pwi < wi; pwi++) ++ pwi->frag_page->frags++; ++ } ++ return NULL; /* page/packet was consumed by XDP */ ++ } + +- for (pwi = head_wi; pwi < wi; pwi++) +- pwi->frag_page->frags++; ++ nr_frags_free = old_nr_frags - sinfo->nr_frags; ++ if (unlikely(nr_frags_free)) { ++ wi -= nr_frags_free; ++ truesize -= nr_frags_free * frag_info->frag_stride; + } +- return NULL; /* page/packet was consumed by XDP */ + } + + skb = mlx5e_build_linear_skb( +-- +2.51.0 + diff --git a/queue-6.12/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-30117 b/queue-6.12/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-30117 new file mode 100644 index 0000000000..a7965f8418 --- /dev/null +++ b/queue-6.12/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-30117 @@ -0,0 +1,120 @@ +From e4c7984fb40996098fec85eff9c4a43f9e8d0ba6 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 22:55:40 +0300 +Subject: net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for + striding RQ + +From: Amery Hung + +[ Upstream commit 87bcef158ac1faca1bd7e0104588e8e2956d10be ] + +XDP programs can change the layout of an xdp_buff through +bpf_xdp_adjust_tail() and bpf_xdp_adjust_head(). Therefore, the driver +cannot assume the size of the linear data area nor fragments. Fix the +bug in mlx5 by generating skb according to xdp_buff after XDP programs +run. + +Currently, when handling multi-buf XDP, the mlx5 driver assumes the +layout of an xdp_buff to be unchanged. That is, the linear data area +continues to be empty and fragments remain the same. This may cause +the driver to generate an erroneous skb or trigger a kernel +warning. When an XDP program adds linear data through +bpf_xdp_adjust_head(), the linear data will be ignored as +mlx5e_build_linear_skb() builds an skb without linear data and then +pulls data from fragments to fill the linear data area.
When an XDP +program has shrunk the non-linear data through bpf_xdp_adjust_tail(), +the delta passed to __pskb_pull_tail() may exceed the actual nonlinear +data size and trigger the BUG_ON in it. + +To fix the issue, first record the original number of fragments. If the +number of fragments changes after the XDP program runs, rewind the end +fragment pointer by the difference and recalculate the truesize. Then, +build the skb with the linear data area matching the xdp_buff. Finally, +only pull data in if there is non-linear data and fill the linear part +up to 256 bytes. + +Fixes: f52ac7028bec ("net/mlx5e: RX, Add XDP multi-buffer support in Striding RQ") +Signed-off-by: Amery Hung +Signed-off-by: Tariq Toukan +Link: https://patch.msgid.link/1760644540-899148-3-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../net/ethernet/mellanox/mlx5/core/en_rx.c | 26 ++++++++++++++++--- + 1 file changed, 23 insertions(+), 3 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +index 5cacb25a763e4..59aa10f1a9d95 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +@@ -2012,6 +2012,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + u32 byte_cnt = cqe_bcnt; + struct skb_shared_info *sinfo; + unsigned int truesize = 0; ++ u32 pg_consumed_bytes; + struct bpf_prog *prog; + struct sk_buff *skb; + u32 linear_frame_sz; +@@ -2063,7 +2064,8 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + + while (byte_cnt) { + /* Non-linear mode, hence non-XSK, which always uses PAGE_SIZE. 
*/ +- u32 pg_consumed_bytes = min_t(u32, PAGE_SIZE - frag_offset, byte_cnt); ++ pg_consumed_bytes = ++ min_t(u32, PAGE_SIZE - frag_offset, byte_cnt); + + if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) + truesize += pg_consumed_bytes; +@@ -2079,10 +2081,15 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } + + if (prog) { ++ u8 nr_frags_free, old_nr_frags = sinfo->nr_frags; ++ u32 len; ++ + if (mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + struct mlx5e_frag_page *pfp; + ++ frag_page -= old_nr_frags - sinfo->nr_frags; ++ + for (pfp = head_page; pfp < frag_page; pfp++) + pfp->frags++; + +@@ -2092,9 +2099,19 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + return NULL; /* page/packet was consumed by XDP */ + } + ++ nr_frags_free = old_nr_frags - sinfo->nr_frags; ++ if (unlikely(nr_frags_free)) { ++ frag_page -= nr_frags_free; ++ truesize -= (nr_frags_free - 1) * PAGE_SIZE + ++ ALIGN(pg_consumed_bytes, ++ BIT(rq->mpwqe.log_stride_sz)); ++ } ++ ++ len = mxbuf->xdp.data_end - mxbuf->xdp.data; ++ + skb = mlx5e_build_linear_skb( + rq, mxbuf->xdp.data_hard_start, linear_frame_sz, +- mxbuf->xdp.data - mxbuf->xdp.data_hard_start, 0, ++ mxbuf->xdp.data - mxbuf->xdp.data_hard_start, len, + mxbuf->xdp.data - mxbuf->xdp.data_meta); + if (unlikely(!skb)) { + mlx5e_page_release_fragmented(rq, &wi->linear_page); +@@ -2118,8 +2135,11 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + do + pagep->frags++; + while (++pagep < frag_page); ++ ++ headlen = min_t(u16, MLX5E_RX_MAX_HEAD - len, ++ skb->data_len); ++ __pskb_pull_tail(skb, headlen); + } +- __pskb_pull_tail(skb, headlen); + } else { + dma_addr_t addr; + +-- +2.51.0 + diff --git a/queue-6.12/net-phy-micrel-always-set-shared-phydev-for-lan8814.patch b/queue-6.12/net-phy-micrel-always-set-shared-phydev-for-lan8814.patch new file mode 100644 index 
0000000000..407e52ca4a --- /dev/null +++ b/queue-6.12/net-phy-micrel-always-set-shared-phydev-for-lan8814.patch @@ -0,0 +1,54 @@ +From 1a4719f43a7acbea666502948cc062d778c98091 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 15:20:26 +0200 +Subject: net: phy: micrel: always set shared->phydev for LAN8814 + +From: Robert Marko + +[ Upstream commit 399d10934740ae8cdaa4e3245f7c5f6c332da844 ] + +Currently, during the LAN8814 PTP probe shared->phydev is only set if PTP +clock gets actually set, otherwise the function will return before setting +it. + +This is an issue as shared->phydev is unconditionally being used when IRQ +is being handled, especially in lan8814_gpio_process_cap and since it was +not set it will cause a NULL pointer exception and crash the kernel. + +So, simply always set shared->phydev to avoid the NULL pointer exception. + +Fixes: b3f1a08fcf0d ("net: phy: micrel: Add support for PTP_PF_EXTTS for lan8814") +Signed-off-by: Robert Marko +Tested-by: Horatiu Vultur +Link: https://patch.msgid.link/20251021132034.983936-1-robert.marko@sartura.hr +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/phy/micrel.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c +index 92e9eb4146d9b..f60cf630bdb3d 100644 +--- a/drivers/net/phy/micrel.c ++++ b/drivers/net/phy/micrel.c +@@ -3870,6 +3870,8 @@ static int lan8814_ptp_probe_once(struct phy_device *phydev) + { + struct lan8814_shared_priv *shared = phydev->shared->priv; + ++ shared->phydev = phydev; ++ + /* Initialise shared lock for clock*/ + mutex_init(&shared->shared_lock); + +@@ -3921,8 +3923,6 @@ static int lan8814_ptp_probe_once(struct phy_device *phydev) + + phydev_dbg(phydev, "successfully registered ptp clock\n"); + +- shared->phydev = phydev; +- + /* The EP.4 is shared between all the PHYs in the package and also it + * can be accessed by any of the PHYs + */ +-- +2.51.0 + diff --git 
a/queue-6.12/net-smc-fix-general-protection-fault-in-__smc_diag_d.patch b/queue-6.12/net-smc-fix-general-protection-fault-in-__smc_diag_d.patch new file mode 100644 index 0000000000..fe93b32db7 --- /dev/null +++ b/queue-6.12/net-smc-fix-general-protection-fault-in-__smc_diag_d.patch @@ -0,0 +1,131 @@ +From 51ee9b5c94ad55cdd98e6fef2a69ee4ceac8f346 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 10:48:27 +0800 +Subject: net/smc: fix general protection fault in __smc_diag_dump + +From: Wang Liang + +[ Upstream commit f584239a9ed25057496bf397c370cc5163dde419 ] + +Syzbot reported a crash: + + Oops: general protection fault, probably for non-canonical address 0xfbd5a5d5a0000003: 0000 [#1] SMP KASAN NOPTI + KASAN: maybe wild-memory-access in range [0xdead4ead00000018-0xdead4ead0000001f] + CPU: 1 UID: 0 PID: 6949 Comm: syz.0.335 Not tainted syzkaller #0 PREEMPT(full) + Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025 + RIP: 0010:smc_diag_msg_common_fill net/smc/smc_diag.c:44 [inline] + RIP: 0010:__smc_diag_dump.constprop.0+0x3ca/0x2550 net/smc/smc_diag.c:89 + Call Trace: + + smc_diag_dump_proto+0x26d/0x420 net/smc/smc_diag.c:217 + smc_diag_dump+0x27/0x90 net/smc/smc_diag.c:234 + netlink_dump+0x539/0xd30 net/netlink/af_netlink.c:2327 + __netlink_dump_start+0x6d6/0x990 net/netlink/af_netlink.c:2442 + netlink_dump_start include/linux/netlink.h:341 [inline] + smc_diag_handler_dump+0x1f9/0x240 net/smc/smc_diag.c:251 + __sock_diag_cmd net/core/sock_diag.c:249 [inline] + sock_diag_rcv_msg+0x438/0x790 net/core/sock_diag.c:285 + netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2552 + netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline] + netlink_unicast+0x5a7/0x870 net/netlink/af_netlink.c:1346 + netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896 + sock_sendmsg_nosec net/socket.c:714 [inline] + __sock_sendmsg net/socket.c:729 [inline] + ____sys_sendmsg+0xa95/0xc70 net/socket.c:2614 + 
___sys_sendmsg+0x134/0x1d0 net/socket.c:2668 + __sys_sendmsg+0x16d/0x220 net/socket.c:2700 + do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] + do_syscall_64+0xcd/0x4e0 arch/x86/entry/syscall_64.c:94 + entry_SYSCALL_64_after_hwframe+0x77/0x7f + + +The race happens like this: + + (CPU1) | (CPU2) + ---------------------------------|------------------------------- + inet_create() | + // init clcsock to NULL | + sk = sk_alloc() | + | + // unexpectedly change clcsock | + inet_init_csk_locks() | + | + // add sk to hash table | + smc_inet_init_sock() | + smc_sk_init() | + smc_hash_sk() | + | // traverse the hash table + | smc_diag_dump_proto + | __smc_diag_dump() + | // visit wrong clcsock + | smc_diag_msg_common_fill() + // alloc clcsock | + smc_create_clcsk | + sock_create_kern | + +With CONFIG_DEBUG_LOCK_ALLOC=y, the smc->clcsock is unexpectedly changed +in inet_init_csk_locks(). The INET_PROTOSW_ICSK flag is not needed by smc, +so just remove it. + +After removing the INET_PROTOSW_ICSK flag, this patch also reverts +commit 6fd27ea183c2 ("net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC") +to avoid casting smc_sock to inet_connection_sock. + +Reported-by: syzbot+f775be4458668f7d220e@syzkaller.appspotmail.com +Closes: https://syzkaller.appspot.com/bug?extid=f775be4458668f7d220e +Tested-by: syzbot+f775be4458668f7d220e@syzkaller.appspotmail.com +Fixes: d25a92ccae6b ("net/smc: Introduce IPPROTO_SMC") +Signed-off-by: Wang Liang +Reviewed-by: Kuniyuki Iwashima +Reviewed-by: Eric Dumazet +Reviewed-by: D. 
Wythe +Link: https://patch.msgid.link/20251017024827.3137512-1-wangliang74@huawei.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/smc/smc_inet.c | 13 ------------- + 1 file changed, 13 deletions(-) + +diff --git a/net/smc/smc_inet.c b/net/smc/smc_inet.c +index a944e7dcb8b96..a94084b4a498e 100644 +--- a/net/smc/smc_inet.c ++++ b/net/smc/smc_inet.c +@@ -56,7 +56,6 @@ static struct inet_protosw smc_inet_protosw = { + .protocol = IPPROTO_SMC, + .prot = &smc_inet_prot, + .ops = &smc_inet_stream_ops, +- .flags = INET_PROTOSW_ICSK, + }; + + #if IS_ENABLED(CONFIG_IPV6) +@@ -104,27 +103,15 @@ static struct inet_protosw smc_inet6_protosw = { + .protocol = IPPROTO_SMC, + .prot = &smc_inet6_prot, + .ops = &smc_inet6_stream_ops, +- .flags = INET_PROTOSW_ICSK, + }; + #endif /* CONFIG_IPV6 */ + +-static unsigned int smc_sync_mss(struct sock *sk, u32 pmtu) +-{ +- /* No need pass it through to clcsock, mss can always be set by +- * sock_create_kern or smc_setsockopt. +- */ +- return 0; +-} +- + static int smc_inet_init_sock(struct sock *sk) + { + struct net *net = sock_net(sk); + + /* init common smc sock */ + smc_sk_init(net, sk, IPPROTO_SMC); +- +- inet_csk(sk)->icsk_sync_mss = smc_sync_mss; +- + /* create clcsock */ + return smc_create_clcsk(net, sk, sk->sk_family); + } +-- +2.51.0 + diff --git a/queue-6.12/ptp-ocp-fix-typo-using-index-1-instead-of-i-in-sma-i.patch b/queue-6.12/ptp-ocp-fix-typo-using-index-1-instead-of-i-in-sma-i.patch new file mode 100644 index 0000000000..1b4a49dfc4 --- /dev/null +++ b/queue-6.12/ptp-ocp-fix-typo-using-index-1-instead-of-i-in-sma-i.patch @@ -0,0 +1,41 @@ +From 6de3fab0f139265d0e5121eb658da88a9652c380 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 18:24:56 +0000 +Subject: ptp: ocp: Fix typo using index 1 instead of i in SMA initialization + loop + +From: Jiasheng Jiang + +[ Upstream commit a767957e7a83f9e742be196aa52a48de8ac5a7e4 ] + +In ptp_ocp_sma_fb_init(), the code mistakenly used bp->sma[1] 
+instead of bp->sma[i] inside a for-loop, which caused only SMA[1] +to have its DIRECTION_CAN_CHANGE capability cleared. This led to +inconsistent capability flags across SMA pins. + +Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops") +Signed-off-by: Jiasheng Jiang +Reviewed-by: Vadim Fedorenko +Link: https://patch.msgid.link/20251021182456.9729-1-jiashengjiangcool@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/ptp/ptp_ocp.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/ptp/ptp_ocp.c b/drivers/ptp/ptp_ocp.c +index efbd80db778d6..bd9919c01e502 100644 +--- a/drivers/ptp/ptp_ocp.c ++++ b/drivers/ptp/ptp_ocp.c +@@ -2546,7 +2546,7 @@ ptp_ocp_sma_fb_init(struct ptp_ocp *bp) + for (i = 0; i < OCP_SMA_NUM; i++) { + bp->sma[i].fixed_fcn = true; + bp->sma[i].fixed_dir = true; +- bp->sma[1].dpll_prop.capabilities &= ++ bp->sma[i].dpll_prop.capabilities &= + ~DPLL_PIN_CAPABILITIES_DIRECTION_CAN_CHANGE; + } + return; +-- +2.51.0 + diff --git a/queue-6.12/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch b/queue-6.12/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch new file mode 100644 index 0000000000..ad455d30b5 --- /dev/null +++ b/queue-6.12/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch @@ -0,0 +1,56 @@ +From 34f8ec2bdb70615779c296bfd5bfe4d8036ba575 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 22:15:43 +0200 +Subject: rtnetlink: Allow deleting FDB entries in user namespace +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Johannes Wiesböck + +[ Upstream commit bf29555f5bdc017bac22ca66fcb6c9f46ec8788f ] + +Creating FDB entries is possible from a non-initial user namespace when +having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive +an EPERM because the capability is always checked against the initial +user namespace. This restricts the FDB management from unprivileged +containers. 
+ +Drop the netlink_capable check in rtnl_fdb_del as it was originally +dropped in c5c351088ae7 and reintroduced in 1690be63a27b without +intention. + +This patch was tested using a container on GyroidOS, where it was +possible to delete FDB entries from an unprivileged user namespace and +private network namespace. + +Fixes: 1690be63a27b ("bridge: Add vlan support to static neighbors") +Reviewed-by: Michael Weiß +Tested-by: Harshal Gohel +Signed-off-by: Johannes Wiesböck +Reviewed-by: Ido Schimmel +Reviewed-by: Nikolay Aleksandrov +Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 4d0ee1c9002aa..650c3c20e79ff 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -4414,9 +4414,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + int err; + u16 vid; + +- if (!netlink_capable(skb, CAP_NET_ADMIN)) +- return -EPERM; +- + if (!del_bulk) { + err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, + NULL, extack); +-- +2.51.0 + diff --git a/queue-6.12/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch b/queue-6.12/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch new file mode 100644 index 0000000000..50ebb584c7 --- /dev/null +++ b/queue-6.12/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch @@ -0,0 +1,54 @@ +From 8e809518e6326988c746ce5388211c5ca67fd00e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 16:00:36 +0300 +Subject: sctp: avoid NULL dereference when chunk data buffer is missing + +From: Alexey Simakov + +[ Upstream commit 441f0647f7673e0e64d4910ef61a5fb8f16bfb82 ] + +chunk->skb pointer is dereferenced in the if-block where it's supposed +to be NULL only. + +chunk->skb can only be NULL if chunk->head_skb is not. 
Check for frag_list +instead and do it just before replacing chunk->skb. We're sure that +otherwise chunk->skb is non-NULL because of outer if() condition. + +Fixes: 90017accff61 ("sctp: Add GSO support") +Signed-off-by: Alexey Simakov +Acked-by: Marcelo Ricardo Leitner +Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/sctp/inqueue.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c +index 5c16521818058..f5a7d5a387555 100644 +--- a/net/sctp/inqueue.c ++++ b/net/sctp/inqueue.c +@@ -169,13 +169,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) + chunk->head_skb = chunk->skb; + + /* skbs with "cover letter" */ +- if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) ++ if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) { ++ if (WARN_ON(!skb_shinfo(chunk->skb)->frag_list)) { ++ __SCTP_INC_STATS(dev_net(chunk->skb->dev), ++ SCTP_MIB_IN_PKT_DISCARDS); ++ sctp_chunk_free(chunk); ++ goto next_chunk; ++ } + chunk->skb = skb_shinfo(chunk->skb)->frag_list; +- +- if (WARN_ON(!chunk->skb)) { +- __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); +- sctp_chunk_free(chunk); +- goto next_chunk; + } + } + +-- +2.51.0 + diff --git a/queue-6.12/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch b/queue-6.12/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch new file mode 100644 index 0000000000..dd321060e4 --- /dev/null +++ b/queue-6.12/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch @@ -0,0 +1,241 @@ +From 9ddd89ac28ab4e4c440e965eb336feb7057a4714 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 16:06:14 -0400 +Subject: selftests: net: fix server bind failure in sctp_vrf.sh +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Xin Long + +[ Upstream commit 
a73ca0449bcb7c238097cc6a1bf3fd82a78374df ] + +sctp_vrf.sh could fail: + + TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [FAIL] + not ok 1 selftests: net: sctp_vrf.sh # exit=3 + +The failure happens when the server bind in a new run conflicts with an +existing association from the previous run: + +[1] ip netns exec $SERVER_NS ./sctp_hello server ... +[2] ip netns exec $CLIENT_NS ./sctp_hello client ... +[3] ip netns exec $SERVER_NS pkill sctp_hello ... +[4] ip netns exec $SERVER_NS ./sctp_hello server ... + +It occurs if the client in [2] sends a message and closes immediately. +With the message unacked, no SHUTDOWN is sent. Killing the server in [3] +triggers a SHUTDOWN the client also ignores due to the unacked message, +leaving the old association alive. This causes the bind at [4] to fail +until the message is acked and the client responds to a second SHUTDOWN +after the server’s T2 timer expires (3s). + +This patch fixes the issue by preventing the client from sending data. +Instead, the client blocks on recv() and waits for the server to close. +It also waits until both the server and the client sockets are fully +released in stop_server and wait_client before restarting. + +Additionally, replace 2>&1 >/dev/null with -q in sysctl and grep, and +drop other redundant 2>&1 >/dev/null redirections, and fix a typo from +N to Y (connect successfully) in the description of the last test. 
+ +Fixes: a61bd7b9fef3 ("selftests: add a selftest for sctp vrf") +Reported-by: Hangbin Liu +Tested-by: Jakub Kicinski +Signed-off-by: Xin Long +Link: https://patch.msgid.link/be2dacf52d0917c4ba5e2e8c5a9cb640740ad2b6.1760731574.git.lucien.xin@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + tools/testing/selftests/net/sctp_hello.c | 17 +----- + tools/testing/selftests/net/sctp_vrf.sh | 73 +++++++++++++++--------- + 2 files changed, 47 insertions(+), 43 deletions(-) + +diff --git a/tools/testing/selftests/net/sctp_hello.c b/tools/testing/selftests/net/sctp_hello.c +index f02f1f95d2275..a04dac0b8027d 100644 +--- a/tools/testing/selftests/net/sctp_hello.c ++++ b/tools/testing/selftests/net/sctp_hello.c +@@ -29,7 +29,6 @@ static void set_addr(struct sockaddr_storage *ss, char *ip, char *port, int *len + static int do_client(int argc, char *argv[]) + { + struct sockaddr_storage ss; +- char buf[] = "hello"; + int csk, ret, len; + + if (argc < 5) { +@@ -56,16 +55,10 @@ static int do_client(int argc, char *argv[]) + + set_addr(&ss, argv[3], argv[4], &len); + ret = connect(csk, (struct sockaddr *)&ss, len); +- if (ret < 0) { +- printf("failed to connect to peer\n"); ++ if (ret < 0) + return -1; +- } + +- ret = send(csk, buf, strlen(buf) + 1, 0); +- if (ret < 0) { +- printf("failed to send msg %d\n", ret); +- return -1; +- } ++ recv(csk, NULL, 0, 0); + close(csk); + + return 0; +@@ -75,7 +68,6 @@ int main(int argc, char *argv[]) + { + struct sockaddr_storage ss; + int lsk, csk, ret, len; +- char buf[20]; + + if (argc < 2 || (strcmp(argv[1], "server") && strcmp(argv[1], "client"))) { + printf("%s server|client ...\n", argv[0]); +@@ -125,11 +117,6 @@ int main(int argc, char *argv[]) + return -1; + } + +- ret = recv(csk, buf, sizeof(buf), 0); +- if (ret <= 0) { +- printf("failed to recv msg %d\n", ret); +- return -1; +- } + close(csk); + close(lsk); + +diff --git a/tools/testing/selftests/net/sctp_vrf.sh b/tools/testing/selftests/net/sctp_vrf.sh 
+index c854034b6aa16..667b211aa8a11 100755 +--- a/tools/testing/selftests/net/sctp_vrf.sh ++++ b/tools/testing/selftests/net/sctp_vrf.sh +@@ -20,9 +20,9 @@ setup() { + modprobe sctp_diag + setup_ns CLIENT_NS1 CLIENT_NS2 SERVER_NS + +- ip net exec $CLIENT_NS1 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null +- ip net exec $CLIENT_NS2 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null +- ip net exec $SERVER_NS sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null ++ ip net exec $CLIENT_NS1 sysctl -wq net.ipv6.conf.default.accept_dad=0 ++ ip net exec $CLIENT_NS2 sysctl -wq net.ipv6.conf.default.accept_dad=0 ++ ip net exec $SERVER_NS sysctl -wq net.ipv6.conf.default.accept_dad=0 + + ip -n $SERVER_NS link add veth1 type veth peer name veth1 netns $CLIENT_NS1 + ip -n $SERVER_NS link add veth2 type veth peer name veth1 netns $CLIENT_NS2 +@@ -62,17 +62,40 @@ setup() { + } + + cleanup() { +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null ++ wait_client $CLIENT_NS1 ++ wait_client $CLIENT_NS2 ++ stop_server + cleanup_ns $CLIENT_NS1 $CLIENT_NS2 $SERVER_NS + } + +-wait_server() { ++start_server() { + local IFACE=$1 + local CNT=0 + +- until ip netns exec $SERVER_NS ss -lS src $SERVER_IP:$SERVER_PORT | \ +- grep LISTEN | grep "$IFACE" 2>&1 >/dev/null; do +- [ $((CNT++)) = "20" ] && { RET=3; return $RET; } ++ ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP $SERVER_PORT $IFACE & ++ disown ++ until ip netns exec $SERVER_NS ss -SlH | grep -q "$IFACE"; do ++ [ $((CNT++)) -eq 30 ] && { RET=3; return $RET; } ++ sleep 0.1 ++ done ++} ++ ++stop_server() { ++ local CNT=0 ++ ++ ip netns exec $SERVER_NS pkill sctp_hello ++ while ip netns exec $SERVER_NS ss -SaH | grep -q .; do ++ [ $((CNT++)) -eq 30 ] && break ++ sleep 0.1 ++ done ++} ++ ++wait_client() { ++ local CLIENT_NS=$1 ++ local CNT=0 ++ ++ while ip netns exec $CLIENT_NS ss -SaH | grep -q .; do ++ [ $((CNT++)) -eq 30 ] && break + sleep 0.1 + done + } +@@ -81,14 +104,12 @@ do_test() { 
+ local CLIENT_NS=$1 + local IFACE=$2 + +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE 2>&1 >/dev/null & +- disown +- wait_server $IFACE || return $RET ++ start_server $IFACE || return $RET + timeout 3 ip netns exec $CLIENT_NS ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT + RET=$? ++ wait_client $CLIENT_NS ++ stop_server + return $RET + } + +@@ -96,25 +117,21 @@ do_testx() { + local IFACE1=$1 + local IFACE2=$2 + +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE1 2>&1 >/dev/null & +- disown +- wait_server $IFACE1 || return $RET +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE2 2>&1 >/dev/null & +- disown +- wait_server $IFACE2 || return $RET ++ start_server $IFACE1 || return $RET ++ start_server $IFACE2 || return $RET + timeout 3 ip netns exec $CLIENT_NS1 ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null && \ ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT && \ + timeout 3 ip netns exec $CLIENT_NS2 ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT + RET=$? 
++ wait_client $CLIENT_NS1 ++ wait_client $CLIENT_NS2 ++ stop_server + return $RET + } + + testup() { +- ip netns exec $SERVER_NS sysctl -w net.sctp.l3mdev_accept=1 2>&1 >/dev/null ++ ip netns exec $SERVER_NS sysctl -wq net.sctp.l3mdev_accept=1 + echo -n "TEST 01: nobind, connect from client 1, l3mdev_accept=1, Y " + do_test $CLIENT_NS1 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" +@@ -123,7 +140,7 @@ testup() { + do_test $CLIENT_NS2 && { echo "[FAIL]"; return $RET; } + echo "[PASS]" + +- ip netns exec $SERVER_NS sysctl -w net.sctp.l3mdev_accept=0 2>&1 >/dev/null ++ ip netns exec $SERVER_NS sysctl -wq net.sctp.l3mdev_accept=0 + echo -n "TEST 03: nobind, connect from client 1, l3mdev_accept=0, N " + do_test $CLIENT_NS1 && { echo "[FAIL]"; return $RET; } + echo "[PASS]" +@@ -160,7 +177,7 @@ testup() { + do_testx vrf-1 vrf-2 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" + +- echo -n "TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N " ++ echo -n "TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, Y " + do_testx vrf-2 vrf-1 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" + } +-- +2.51.0 + diff --git a/queue-6.12/series b/queue-6.12/series index 29716f5166..af52b0a957 100644 --- a/queue-6.12/series +++ b/queue-6.12/series @@ -23,3 +23,22 @@ pm-em-drop-unused-parameter-from-em_adjust_new_capacity.patch pm-em-slightly-reduce-em_check_capacity_update-overhead.patch pm-em-move-cpu-capacity-check-to-em_adjust_new_capacity.patch pm-em-fix-late-boot-with-holes-in-cpu-topology.patch +net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch +rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch +net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch +net-enetc-correct-the-value-of-enetc_rxb_truesize.patch +dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch +can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch +can-esd-acc_start_xmit-use-can_dev_dropped_skb-inste.patch 
+can-rockchip-canfd-rkcanfd_start_xmit-use-can_dev_dr.patch +selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch +net-mlx5e-reuse-per-rq-xdp-buffer-to-avoid-stack-zer.patch +net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch +net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-30117 +net-smc-fix-general-protection-fault-in-__smc_diag_d.patch +net-ethernet-ti-am65-cpts-fix-timestamp-loss-due-to-.patch +arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch +ptp-ocp-fix-typo-using-index-1-instead-of-i-in-sma-i.patch +sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch +net-phy-micrel-always-set-shared-phydev-for-lan8814.patch +net-mlx5-fix-ipsec-cleanup-over-mpv-device.patch diff --git a/queue-6.17/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch b/queue-6.17/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch new file mode 100644 index 0000000000..3bd3a6588c --- /dev/null +++ b/queue-6.17/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch @@ -0,0 +1,71 @@ +From 2b67e5b4bc39a051bc5c316aecab99e85d59618b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:37:12 +0800 +Subject: arm64, mm: avoid always making PTE dirty in pte_mkwrite() + +From: Huang Ying + +[ Upstream commit 143937ca51cc6ae2fccc61a1cb916abb24cd34f5 ] + +Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may +mark some pages that are never written dirty wrongly. For example, +do_swap_page() may map the exclusive pages with writable and clean PTEs +if the VMA is writable and the page fault is for read access. +However, current pte_mkwrite_novma() implementation always dirties the +PTE. This may cause unnecessary disk writing if the pages are +never written before being reclaimed. + +So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the +PTE_DIRTY bit is set to make it possible to make the PTE writable and +clean. 
+ +The current behavior was introduced in commit 73e86cb03cf2 ("arm64: +Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, +pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only +clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits +are set. + +To test the performance impact of the patch, on an arm64 server +machine, run 16 redis-server processes on socket 1 and 16 +memtier_benchmark processes on socket 0 with mostly get +transactions (that is, redis-server will mostly read memory only). +The memory footprint of redis-server is larger than the available +memory, so swap out/in will be triggered. Test results show that the +patch can avoid most swapping out because the pages are mostly clean. +And the benchmark throughput improves ~23.9% in the test. + +Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") +Signed-off-by: Huang Ying +Cc: Will Deacon +Cc: Anshuman Khandual +Cc: Ryan Roberts +Cc: Gavin Shan +Cc: Ard Biesheuvel +Cc: Matthew Wilcox (Oracle) +Cc: Yicong Yang +Cc: linux-arm-kernel@lists.infradead.org +Cc: linux-kernel@vger.kernel.org +Reviewed-by: Catalin Marinas +Signed-off-by: Catalin Marinas +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/pgtable.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h +index abd2dee416b3b..e6fdb52963303 100644 +--- a/arch/arm64/include/asm/pgtable.h ++++ b/arch/arm64/include/asm/pgtable.h +@@ -293,7 +293,8 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot) + static inline pte_t pte_mkwrite_novma(pte_t pte) + { + pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); +- pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); ++ if (pte_sw_dirty(pte)) ++ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); + return pte; + } + +-- +2.51.0 + diff --git a/queue-6.17/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch 
b/queue-6.17/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch new file mode 100644 index 0000000000..79e615216f --- /dev/null +++ b/queue-6.17/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch @@ -0,0 +1,43 @@ +From 4032714161163727ca5387be1748e43f6eba7fb3 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 16:28:49 +0200 +Subject: can: bxcan: bxcan_start_xmit(): use can_dev_dropped_skb() instead of + can_dropped_invalid_skb() + +From: Marc Kleine-Budde + +[ Upstream commit 3a20c444cd123e820e10ae22eeaf00e189315aa1 ] + +In addition to can_dropped_invalid_skb(), the helper function +can_dev_dropped_skb() checks whether the device is in listen-only mode and +discards the skb accordingly. + +Replace can_dropped_invalid_skb() by can_dev_dropped_skb() to also drop +skbs in listen-only mode. + +Reported-by: Marc Kleine-Budde +Closes: https://lore.kernel.org/all/20251017-bizarre-enchanted-quokka-f3c704-mkl@pengutronix.de/ +Fixes: f00647d8127b ("can: bxcan: add support for ST bxCAN controller") +Link: https://patch.msgid.link/20251017-fix-skb-drop-check-v1-1-556665793fa4@pengutronix.de +Signed-off-by: Marc Kleine-Budde +Signed-off-by: Sasha Levin +--- + drivers/net/can/bxcan.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/can/bxcan.c b/drivers/net/can/bxcan.c +index bfc60eb33dc37..333ad42ea73bc 100644 +--- a/drivers/net/can/bxcan.c ++++ b/drivers/net/can/bxcan.c +@@ -842,7 +842,7 @@ static netdev_tx_t bxcan_start_xmit(struct sk_buff *skb, + u32 id; + int i, j; + +- if (can_dropped_invalid_skb(ndev, skb)) ++ if (can_dev_dropped_skb(ndev, skb)) + return NETDEV_TX_OK; + + if (bxcan_tx_busy(priv)) +-- +2.51.0 + diff --git a/queue-6.17/can-esd-acc_start_xmit-use-can_dev_dropped_skb-inste.patch b/queue-6.17/can-esd-acc_start_xmit-use-can_dev_dropped_skb-inste.patch new file mode 100644 index 0000000000..0511cb7d37 --- /dev/null +++ b/queue-6.17/can-esd-acc_start_xmit-use-can_dev_dropped_skb-inste.patch @@ 
-0,0 +1,43 @@ +From 4770fbad30609e9b90c2fac16d849492eb3b3af3 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 16:28:49 +0200 +Subject: can: esd: acc_start_xmit(): use can_dev_dropped_skb() instead of + can_dropped_invalid_skb() + +From: Marc Kleine-Budde + +[ Upstream commit 0bee15a5caf36fe513fdeee07fd4f0331e61c064 ] + +In addition to can_dropped_invalid_skb(), the helper function +can_dev_dropped_skb() checks whether the device is in listen-only mode and +discards the skb accordingly. + +Replace can_dropped_invalid_skb() by can_dev_dropped_skb() to also drop +skbs in listen-only mode. + +Reported-by: Marc Kleine-Budde +Closes: https://lore.kernel.org/all/20251017-bizarre-enchanted-quokka-f3c704-mkl@pengutronix.de/ +Fixes: 9721866f07e1 ("can: esd: add support for esd GmbH PCIe/402 CAN interface family") +Link: https://patch.msgid.link/20251017-fix-skb-drop-check-v1-2-556665793fa4@pengutronix.de +Signed-off-by: Marc Kleine-Budde +Signed-off-by: Sasha Levin +--- + drivers/net/can/esd/esdacc.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/can/esd/esdacc.c b/drivers/net/can/esd/esdacc.c +index c80032bc1a521..73e66f9a3781c 100644 +--- a/drivers/net/can/esd/esdacc.c ++++ b/drivers/net/can/esd/esdacc.c +@@ -254,7 +254,7 @@ netdev_tx_t acc_start_xmit(struct sk_buff *skb, struct net_device *netdev) + u32 acc_id; + u32 acc_dlc; + +- if (can_dropped_invalid_skb(netdev, skb)) ++ if (can_dev_dropped_skb(netdev, skb)) + return NETDEV_TX_OK; + + /* Access core->tx_fifo_tail only once because it may be changed +-- +2.51.0 + diff --git a/queue-6.17/can-rockchip-canfd-rkcanfd_start_xmit-use-can_dev_dr.patch b/queue-6.17/can-rockchip-canfd-rkcanfd_start_xmit-use-can_dev_dr.patch new file mode 100644 index 0000000000..47f1cb44c8 --- /dev/null +++ b/queue-6.17/can-rockchip-canfd-rkcanfd_start_xmit-use-can_dev_dr.patch @@ -0,0 +1,43 @@ +From c44f1e70bfdb2ad0a18773159331df58e7a0c068 Mon Sep 17 00:00:00 2001 +From: Sasha Levin 
+Date: Fri, 17 Oct 2025 16:28:49 +0200 +Subject: can: rockchip-canfd: rkcanfd_start_xmit(): use can_dev_dropped_skb() + instead of can_dropped_invalid_skb() + +From: Marc Kleine-Budde + +[ Upstream commit 3a3bc9bbb3a0287164a595787df0c70d91e77cfd ] + +In addition to can_dropped_invalid_skb(), the helper function +can_dev_dropped_skb() checks whether the device is in listen-only mode and +discards the skb accordingly. + +Replace can_dropped_invalid_skb() by can_dev_dropped_skb() to also drop +skbs in listen-only mode. + +Reported-by: Marc Kleine-Budde +Closes: https://lore.kernel.org/all/20251017-bizarre-enchanted-quokka-f3c704-mkl@pengutronix.de/ +Fixes: ff60bfbaf67f ("can: rockchip_canfd: add driver for Rockchip CAN-FD controller") +Link: https://patch.msgid.link/20251017-fix-skb-drop-check-v1-3-556665793fa4@pengutronix.de +Signed-off-by: Marc Kleine-Budde +Signed-off-by: Sasha Levin +--- + drivers/net/can/rockchip/rockchip_canfd-tx.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/can/rockchip/rockchip_canfd-tx.c b/drivers/net/can/rockchip/rockchip_canfd-tx.c +index 865a15e033a9e..12200dcfd3389 100644 +--- a/drivers/net/can/rockchip/rockchip_canfd-tx.c ++++ b/drivers/net/can/rockchip/rockchip_canfd-tx.c +@@ -72,7 +72,7 @@ netdev_tx_t rkcanfd_start_xmit(struct sk_buff *skb, struct net_device *ndev) + int err; + u8 i; + +- if (can_dropped_invalid_skb(ndev, skb)) ++ if (can_dev_dropped_skb(ndev, skb)) + return NETDEV_TX_OK; + + if (!netif_subqueue_maybe_stop(priv->ndev, 0, +-- +2.51.0 + diff --git a/queue-6.17/cpufreq-amd-pstate-fix-a-regression-leading-to-epp-0.patch b/queue-6.17/cpufreq-amd-pstate-fix-a-regression-leading-to-epp-0.patch new file mode 100644 index 0000000000..6c0f6dbc1e --- /dev/null +++ b/queue-6.17/cpufreq-amd-pstate-fix-a-regression-leading-to-epp-0.patch @@ -0,0 +1,48 @@ +From 565f74c413346406485392d4a88e4693622b217b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 23 Sep 2025 10:29:29 -0500 
+Subject: cpufreq/amd-pstate: Fix a regression leading to EPP 0 after hibernate + +From: Mario Limonciello (AMD) + +[ Upstream commit 85d7dda5a9f665ea579741ec873a8841f37e8943 ] + +After resuming from S4, all CPUs except the boot CPU have the wrong EPP +hint programmed. This is because when the CPUs were offlined the EPP value +was reset to 0. + +This is a similar problem as fixed by +commit ba3319e590571 ("cpufreq/amd-pstate: Fix a regression leading to EPP +0 after resume") and the solution is also similar. When offlining rather +than reset the values to zero, reset them to match those chosen by the +policy. When the CPUs are onlined again these values will be restored. + +Closes: https://community.frame.work/t/increased-power-usage-after-resuming-from-suspend-on-ryzen-7040-kernel-6-15-regression/74531/20?u=mario_limonciello +Fixes: 608a76b65288 ("cpufreq/amd-pstate: Add support for the "Requested CPU Min frequency" BIOS option") +Reviewed-by: Gautham R. Shenoy +Signed-off-by: Mario Limonciello (AMD) +Signed-off-by: Sasha Levin +--- + drivers/cpufreq/amd-pstate.c | 6 +++++- + 1 file changed, 5 insertions(+), 1 deletion(-) + +diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c +index b4c79fde1979b..e4f1933dd7d47 100644 +--- a/drivers/cpufreq/amd-pstate.c ++++ b/drivers/cpufreq/amd-pstate.c +@@ -1614,7 +1614,11 @@ static int amd_pstate_cpu_offline(struct cpufreq_policy *policy) + * min_perf value across kexec reboots. 
If this CPU is just onlined normally after this, the + * limits, epp and desired perf will get reset to the cached values in cpudata struct + */ +- return amd_pstate_update_perf(policy, perf.bios_min_perf, 0U, 0U, 0U, false); ++ return amd_pstate_update_perf(policy, perf.bios_min_perf, ++ FIELD_GET(AMD_CPPC_DES_PERF_MASK, cpudata->cppc_req_cached), ++ FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached), ++ FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached), ++ false); + } + + static int amd_pstate_suspend(struct cpufreq_policy *policy) +-- +2.51.0 + diff --git a/queue-6.17/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch b/queue-6.17/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch new file mode 100644 index 0000000000..cb0a76fd1c --- /dev/null +++ b/queue-6.17/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch @@ -0,0 +1,50 @@ +From bee3ae60f3ccc69a96576d3deb67d780ac60f809 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:58:07 +0300 +Subject: dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path + +From: Ioana Ciornei + +[ Upstream commit 902e81e679d86846a2404630d349709ad9372d0d ] + +The blamed commit increased the needed headroom to account for +alignment. This means that the size required to always align a Tx buffer +was added inside the dpaa2_eth_needed_headroom() function. By doing +that, a manual adjustment of the pointer passed to PTR_ALIGN() was no +longer correct since the 'buffer_start' variable was already pointing +to the start of the skb's memory. + +The behavior of the dpaa2-eth driver without this patch was to drop +frames on Tx even when the headroom was matching the 128 bytes +necessary. Fix this by removing the manual adjust of 'buffer_start' from +the PTR_MODE call. 
+ +Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fastmail.com/T/#u +Fixes: f422abe3f23d ("dpaa2-eth: increase the needed headroom to account for alignment") +Signed-off-by: Ioana Ciornei +Tested-by: Mathew McBride +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +index 0f4efd5053320..a5f3d19f1466c 100644 +--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c ++++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +@@ -1077,8 +1077,7 @@ static int dpaa2_eth_build_single_fd(struct dpaa2_eth_priv *priv, + dma_addr_t addr; + + buffer_start = skb->data - dpaa2_eth_needed_headroom(skb); +- aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN, +- DPAA2_ETH_TX_BUF_ALIGN); ++ aligned_start = PTR_ALIGN(buffer_start, DPAA2_ETH_TX_BUF_ALIGN); + if (aligned_start >= skb->head) + buffer_start = aligned_start; + else +-- +2.51.0 + diff --git a/queue-6.17/erofs-avoid-infinite-loops-due-to-corrupted-subpage-.patch b/queue-6.17/erofs-avoid-infinite-loops-due-to-corrupted-subpage-.patch new file mode 100644 index 0000000000..5ea2dbabb9 --- /dev/null +++ b/queue-6.17/erofs-avoid-infinite-loops-due-to-corrupted-subpage-.patch @@ -0,0 +1,93 @@ +From 4a86a316230d3ceaa59781ac05e73c77fc60e8be Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 15:05:38 +0800 +Subject: erofs: avoid infinite loops due to corrupted subpage compact indexes + +From: Gao Xiang + +[ Upstream commit e13d315ae077bb7c3c6027cc292401bc0f4ec683 ] + +Robert reported an infinite loop observed by two crafted images. 
+ +The root cause is that `clusterofs` can be larger than `lclustersize` +for !NONHEAD `lclusters` in corrupted subpage compact indexes, e.g.: + + blocksize = lclustersize = 512 lcn = 6 clusterofs = 515 + +Move the corresponding check for full compress indexes to +`z_erofs_load_lcluster_from_disk()` to also cover subpage compact +compress indexes. + +It also fixes the position of `m->type >= Z_EROFS_LCLUSTER_TYPE_MAX` +check, since it should be placed right after +`z_erofs_load_{compact,full}_lcluster()`. + +Fixes: 8d2517aaeea3 ("erofs: fix up compacted indexes for block size < 4096") +Fixes: 1a5223c182fd ("erofs: do sanity check on m->type in z_erofs_load_compact_lcluster()") +Reported-by: Robert Morris +Closes: https://lore.kernel.org/r/35167.1760645886@localhost +Reviewed-by: Hongbo Li +Signed-off-by: Gao Xiang +Signed-off-by: Sasha Levin +--- + fs/erofs/zmap.c | 32 ++++++++++++++++++-------------- + 1 file changed, 18 insertions(+), 14 deletions(-) + +diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c +index 87032f90fe840..b2dabdf176b6c 100644 +--- a/fs/erofs/zmap.c ++++ b/fs/erofs/zmap.c +@@ -55,10 +55,6 @@ static int z_erofs_load_full_lcluster(struct z_erofs_maprecorder *m, + } else { + m->partialref = !!(advise & Z_EROFS_LI_PARTIAL_REF); + m->clusterofs = le16_to_cpu(di->di_clusterofs); +- if (m->clusterofs >= 1 << vi->z_lclusterbits) { +- DBG_BUGON(1); +- return -EFSCORRUPTED; +- } + m->pblk = le32_to_cpu(di->di_u.blkaddr); + } + return 0; +@@ -240,21 +236,29 @@ static int z_erofs_load_compact_lcluster(struct z_erofs_maprecorder *m, + static int z_erofs_load_lcluster_from_disk(struct z_erofs_maprecorder *m, + unsigned int lcn, bool lookahead) + { ++ struct erofs_inode *vi = EROFS_I(m->inode); ++ int err; ++ ++ if (vi->datalayout == EROFS_INODE_COMPRESSED_COMPACT) { ++ err = z_erofs_load_compact_lcluster(m, lcn, lookahead); ++ } else { ++ DBG_BUGON(vi->datalayout != EROFS_INODE_COMPRESSED_FULL); ++ err = z_erofs_load_full_lcluster(m, lcn); ++ } ++ if (err) ++ 
return err; ++ + if (m->type >= Z_EROFS_LCLUSTER_TYPE_MAX) { + erofs_err(m->inode->i_sb, "unknown type %u @ lcn %u of nid %llu", +- m->type, lcn, EROFS_I(m->inode)->nid); ++ m->type, lcn, EROFS_I(m->inode)->nid); + DBG_BUGON(1); + return -EOPNOTSUPP; ++ } else if (m->type != Z_EROFS_LCLUSTER_TYPE_NONHEAD && ++ m->clusterofs >= (1 << vi->z_lclusterbits)) { ++ DBG_BUGON(1); ++ return -EFSCORRUPTED; + } +- +- switch (EROFS_I(m->inode)->datalayout) { +- case EROFS_INODE_COMPRESSED_FULL: +- return z_erofs_load_full_lcluster(m, lcn); +- case EROFS_INODE_COMPRESSED_COMPACT: +- return z_erofs_load_compact_lcluster(m, lcn, lookahead); +- default: +- return -EINVAL; +- } ++ return 0; + } + + static int z_erofs_extent_lookback(struct z_erofs_maprecorder *m, +-- +2.51.0 + diff --git a/queue-6.17/erofs-fix-crafted-invalid-cases-for-encoded-extents.patch b/queue-6.17/erofs-fix-crafted-invalid-cases-for-encoded-extents.patch new file mode 100644 index 0000000000..eb183ee4a9 --- /dev/null +++ b/queue-6.17/erofs-fix-crafted-invalid-cases-for-encoded-extents.patch @@ -0,0 +1,72 @@ +From 24d14f9a04a18b4350678601c693c6a758d8b7ee Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 12 Oct 2025 21:59:25 +0800 +Subject: erofs: fix crafted invalid cases for encoded extents + +From: Gao Xiang + +[ Upstream commit a429b76114aaca3ef1aff4cd469dcf025431bd11 ] + +Robert recently reported two corrupted images that can cause system +crashes, which are related to the new encoded extents introduced +in Linux 6.15: + + - The first one [1] has plen != 0 (e.g. plen == 0x2000000) but + (plen & Z_EROFS_EXTENT_PLEN_MASK) == 0. 
It is used to represent + special extents such as sparse extents (!EROFS_MAP_MAPPED), but + previously only plen == 0 was handled; + + - The second one [2] has pa 0xffffffffffdcffed and plen 0xb4000, + then "cur [0xfffffffffffff000] += bvec.bv_len [0x1000]" in + "} while ((cur += bvec.bv_len) < end);" wraps around, causing an + out-of-bound access of pcl->compressed_bvecs[] in + z_erofs_submit_queue(). EROFS only supports 48-bit physical block + addresses (up to 1EiB for 4k blocks), so add a sanity check to + enforce this. + +Fixes: 1d191b4ca51d ("erofs: implement encoded extent metadata") +Reported-by: Robert Morris +Closes: https://lore.kernel.org/r/75022.1759355830@localhost [1] +Closes: https://lore.kernel.org/r/80524.1760131149@localhost [2] +Reviewed-by: Hongbo Li +Signed-off-by: Gao Xiang +Signed-off-by: Sasha Levin +--- + fs/erofs/zmap.c | 7 ++++++- + 1 file changed, 6 insertions(+), 1 deletion(-) + +diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c +index 798223e6da9ce..87032f90fe840 100644 +--- a/fs/erofs/zmap.c ++++ b/fs/erofs/zmap.c +@@ -596,7 +596,7 @@ static int z_erofs_map_blocks_ext(struct inode *inode, + vi->z_fragmentoff = map->m_plen; + if (recsz > offsetof(struct z_erofs_extent, pstart_lo)) + vi->z_fragmentoff |= map->m_pa << 32; +- } else if (map->m_plen) { ++ } else if (map->m_plen & Z_EROFS_EXTENT_PLEN_MASK) { + map->m_flags |= EROFS_MAP_MAPPED | + EROFS_MAP_FULL_MAPPED | EROFS_MAP_ENCODED; + fmt = map->m_plen >> Z_EROFS_EXTENT_PLEN_FMT_BIT; +@@ -715,6 +715,7 @@ static int z_erofs_map_sanity_check(struct inode *inode, + struct erofs_map_blocks *map) + { + struct erofs_sb_info *sbi = EROFS_I_SB(inode); ++ u64 pend; + + if (!(map->m_flags & EROFS_MAP_ENCODED)) + return 0; +@@ -732,6 +733,10 @@ static int z_erofs_map_sanity_check(struct inode *inode, + if (unlikely(map->m_plen > Z_EROFS_PCLUSTER_MAX_SIZE || + map->m_llen > Z_EROFS_PCLUSTER_MAX_DSIZE)) + return -EOPNOTSUPP; ++ /* Filesystems beyond 48-bit physical block addresses are invalid */ ++ 
if (unlikely(check_add_overflow(map->m_pa, map->m_plen, &pend) || ++ (pend >> sbi->blkszbits) >= BIT_ULL(48))) ++ return -EFSCORRUPTED; + return 0; + } + +-- +2.51.0 + diff --git a/queue-6.17/espintcp-use-datagram_poll_queue-for-socket-readines.patch b/queue-6.17/espintcp-use-datagram_poll_queue-for-socket-readines.patch new file mode 100644 index 0000000000..2a69077865 --- /dev/null +++ b/queue-6.17/espintcp-use-datagram_poll_queue-for-socket-readines.patch @@ -0,0 +1,50 @@ +From 0094a6cd6979360bdda1f22d72c024759d7a9d4c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 12:09:41 +0200 +Subject: espintcp: use datagram_poll_queue for socket readiness + +From: Ralf Lici + +[ Upstream commit 0fc3e32c2c069f541f2724d91f5e98480b640326 ] + +espintcp uses a custom queue (ike_queue) to deliver packets to +userspace. The polling logic relies on datagram_poll, which checks +sk_receive_queue, which can lead to false readiness signals when that +queue contains non-userspace packets. + +Switch espintcp_poll to use datagram_poll_queue with ike_queue, ensuring +poll only signals readiness when userspace data is actually available. 
+ +Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)") +Signed-off-by: Ralf Lici +Reviewed-by: Sabrina Dubroca +Link: https://patch.msgid.link/20251021100942.195010-3-ralf@mandelbit.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + net/xfrm/espintcp.c | 6 +----- + 1 file changed, 1 insertion(+), 5 deletions(-) + +diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c +index fc7a603b04f13..bf744ac9d5a73 100644 +--- a/net/xfrm/espintcp.c ++++ b/net/xfrm/espintcp.c +@@ -555,14 +555,10 @@ static void espintcp_close(struct sock *sk, long timeout) + static __poll_t espintcp_poll(struct file *file, struct socket *sock, + poll_table *wait) + { +- __poll_t mask = datagram_poll(file, sock, wait); + struct sock *sk = sock->sk; + struct espintcp_ctx *ctx = espintcp_getctx(sk); + +- if (!skb_queue_empty(&ctx->ike_queue)) +- mask |= EPOLLIN | EPOLLRDNORM; +- +- return mask; ++ return datagram_poll_queue(file, sock, wait, &ctx->ike_queue); + } + + static void build_protos(struct proto *espintcp_prot, +-- +2.51.0 + diff --git a/queue-6.17/net-datagram-introduce-datagram_poll_queue-for-custo.patch b/queue-6.17/net-datagram-introduce-datagram_poll_queue-for-custo.patch new file mode 100644 index 0000000000..9372c69ae4 --- /dev/null +++ b/queue-6.17/net-datagram-introduce-datagram_poll_queue-for-custo.patch @@ -0,0 +1,123 @@ +From b1d6ab5eeda92d9a1863987ec269eb6ca4ab8917 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 12:09:40 +0200 +Subject: net: datagram: introduce datagram_poll_queue for custom receive + queues + +From: Ralf Lici + +[ Upstream commit f6ceec6434b5efff62cecbaa2ff74fc29b96c0c6 ] + +Some protocols using TCP encapsulation (e.g., espintcp, openvpn) deliver +userspace-bound packets through a custom skb queue rather than the +standard sk_receive_queue. + +Introduce datagram_poll_queue that accepts an explicit receive queue, +and convert datagram_poll into a wrapper around datagram_poll_queue. 
+This allows protocols with custom skb queues to reuse the core polling +logic without relying on sk_receive_queue. + +Cc: Sabrina Dubroca +Cc: Antonio Quartulli +Signed-off-by: Ralf Lici +Reviewed-by: Sabrina Dubroca +Reviewed-by: Antonio Quartulli +Link: https://patch.msgid.link/20251021100942.195010-2-ralf@mandelbit.com +Signed-off-by: Paolo Abeni +Stable-dep-of: efd729408bc7 ("ovpn: use datagram_poll_queue for socket readiness in TCP") +Signed-off-by: Sasha Levin +--- + include/linux/skbuff.h | 3 +++ + net/core/datagram.c | 44 ++++++++++++++++++++++++++++++++---------- + 2 files changed, 37 insertions(+), 10 deletions(-) + +diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h +index fa633657e4c06..ad66110b43cca 100644 +--- a/include/linux/skbuff.h ++++ b/include/linux/skbuff.h +@@ -4157,6 +4157,9 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk, + struct sk_buff_head *sk_queue, + unsigned int flags, int *off, int *err); + struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned int flags, int *err); ++__poll_t datagram_poll_queue(struct file *file, struct socket *sock, ++ struct poll_table_struct *wait, ++ struct sk_buff_head *rcv_queue); + __poll_t datagram_poll(struct file *file, struct socket *sock, + struct poll_table_struct *wait); + int skb_copy_datagram_iter(const struct sk_buff *from, int offset, +diff --git a/net/core/datagram.c b/net/core/datagram.c +index f474b9b120f98..8b328879f8d25 100644 +--- a/net/core/datagram.c ++++ b/net/core/datagram.c +@@ -920,21 +920,22 @@ int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, + EXPORT_SYMBOL(skb_copy_and_csum_datagram_msg); + + /** +- * datagram_poll - generic datagram poll ++ * datagram_poll_queue - same as datagram_poll, but on a specific receive ++ * queue + * @file: file struct + * @sock: socket + * @wait: poll table ++ * @rcv_queue: receive queue to poll + * +- * Datagram poll: Again totally generic. 
This also handles +- * sequenced packet sockets providing the socket receive queue +- * is only ever holding data ready to receive. ++ * Performs polling on the given receive queue, handling shutdown, error, ++ * and connection state. This is useful for protocols that deliver ++ * userspace-bound packets through a custom queue instead of ++ * sk->sk_receive_queue. + * +- * Note: when you *don't* use this routine for this protocol, +- * and you use a different write policy from sock_writeable() +- * then please supply your own write_space callback. ++ * Return: poll bitmask indicating the socket's current state + */ +-__poll_t datagram_poll(struct file *file, struct socket *sock, +- poll_table *wait) ++__poll_t datagram_poll_queue(struct file *file, struct socket *sock, ++ poll_table *wait, struct sk_buff_head *rcv_queue) + { + struct sock *sk = sock->sk; + __poll_t mask; +@@ -956,7 +957,7 @@ __poll_t datagram_poll(struct file *file, struct socket *sock, + mask |= EPOLLHUP; + + /* readable? */ +- if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) ++ if (!skb_queue_empty_lockless(rcv_queue)) + mask |= EPOLLIN | EPOLLRDNORM; + + /* Connection-based need to check for termination and startup */ +@@ -978,4 +979,27 @@ __poll_t datagram_poll(struct file *file, struct socket *sock, + + return mask; + } ++EXPORT_SYMBOL(datagram_poll_queue); ++ ++/** ++ * datagram_poll - generic datagram poll ++ * @file: file struct ++ * @sock: socket ++ * @wait: poll table ++ * ++ * Datagram poll: Again totally generic. This also handles ++ * sequenced packet sockets providing the socket receive queue ++ * is only ever holding data ready to receive. ++ * ++ * Note: when you *don't* use this routine for this protocol, ++ * and you use a different write policy from sock_writeable() ++ * then please supply your own write_space callback. 
++ * ++ * Return: poll bitmask indicating the socket's current state ++ */ ++__poll_t datagram_poll(struct file *file, struct socket *sock, poll_table *wait) ++{ ++ return datagram_poll_queue(file, sock, wait, ++ &sock->sk->sk_receive_queue); ++} + EXPORT_SYMBOL(datagram_poll); +-- +2.51.0 + diff --git a/queue-6.17/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch b/queue-6.17/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch new file mode 100644 index 0000000000..9a4beaefc5 --- /dev/null +++ b/queue-6.17/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch @@ -0,0 +1,54 @@ +From b1aff76cdd772d7ff56cf4c0e1920c36c556536b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:01:31 +0800 +Subject: net: enetc: correct the value of ENETC_RXB_TRUESIZE + +From: Wei Fang + +[ Upstream commit e59bc32df2e989f034623a580e30a2a72af33b3f ] + +The ENETC RX ring uses the page halves flipping mechanism, each page is +split into two halves for the RX ring to use. And ENETC_RXB_TRUESIZE is +defined to 2048 to indicate the size of half a page. However, the page +size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, +but it could be configured to 16K or 64K. + +When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, +and the RX ring will always use the first half of the page. This is not +consistent with the description in the relevant kernel doc and commit +messages. + +This issue is invisible in most cases, but if users want to increase +PAGE_SIZE to receive a Jumbo frame with a single buffer for some use +cases, it will not work as expected, because the buffer size of each +RX BD is fixed to 2048 bytes. + +Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE +to (PAGE_SIZE >> 1), as described in the comment. 
+ +Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") +Signed-off-by: Wei Fang +Reviewed-by: Claudiu Manoil +Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h +index 62e8ee4d2f04e..fbc08a18db6d5 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.h ++++ b/drivers/net/ethernet/freescale/enetc/enetc.h +@@ -67,7 +67,7 @@ struct enetc_lso_t { + #define ENETC_LSO_MAX_DATA_LEN SZ_256K + + #define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE +-#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */ ++#define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) + #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ + #define ENETC_RXB_DMA_SIZE \ + (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) +-- +2.51.0 + diff --git a/queue-6.17/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch b/queue-6.17/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch new file mode 100644 index 0000000000..9081dee2c5 --- /dev/null +++ b/queue-6.17/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch @@ -0,0 +1,158 @@ +From 7473b2b23d8e3fc8d681f68bf306dc3e6e0285d4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:14:27 +0800 +Subject: net: enetc: fix the deadlock of enetc_mdio_lock + +From: Jianpeng Chang + +[ Upstream commit 50bd33f6b3922a6b760aa30d409cae891cec8fb5 ] + +After applying the workaround for err050089, the LS1028A platform +experiences RCU stalls on RT kernel. This issue is caused by the +recursive acquisition of the read lock enetc_mdio_lock. 
Here list some +of the call stacks identified under the enetc_poll path that may lead to +a deadlock: + +enetc_poll + -> enetc_lock_mdio + -> enetc_clean_rx_ring OR napi_complete_done + -> napi_gro_receive + -> enetc_start_xmit + -> enetc_lock_mdio + -> enetc_map_tx_buffs + -> enetc_unlock_mdio + -> enetc_unlock_mdio + +After enetc_poll acquires the read lock, a higher-priority writer attempts +to acquire the lock, causing preemption. The writer detects that a +read lock is already held and is scheduled out. However, readers under +enetc_poll cannot acquire the read lock again because a writer is already +waiting, leading to a thread hang. + +Currently, the deadlock is avoided by adjusting enetc_lock_mdio to prevent +recursive lock acquisition. + +Fixes: 6d36ecdbc441 ("net: enetc: take the MDIO lock only once per NAPI poll cycle") +Signed-off-by: Jianpeng Chang +Acked-by: Wei Fang +Link: https://patch.msgid.link/20251015021427.180757-1-jianpeng.chang.cn@windriver.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.c | 25 ++++++++++++++++---- + 1 file changed, 21 insertions(+), 4 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c +index e4287725832e0..5496b4cb2a64a 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.c ++++ b/drivers/net/ethernet/freescale/enetc/enetc.c +@@ -1558,6 +1558,8 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + /* next descriptor to process */ + i = rx_ring->next_to_clean; + ++ enetc_lock_mdio(); ++ + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd; + struct sk_buff *skb; +@@ -1593,7 +1595,9 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + rx_byte_cnt += skb->len + ETH_HLEN; + rx_frm_cnt++; + ++ enetc_unlock_mdio(); + napi_gro_receive(napi, skb); ++ enetc_lock_mdio(); + } + + rx_ring->next_to_clean = i; +@@ -1601,6 +1605,8 @@ static int 
enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + ++ enetc_unlock_mdio(); ++ + return rx_frm_cnt; + } + +@@ -1910,6 +1916,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + /* next descriptor to process */ + i = rx_ring->next_to_clean; + ++ enetc_lock_mdio(); ++ + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd, *orig_rxbd; + struct xdp_buff xdp_buff; +@@ -1973,7 +1981,9 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + */ + enetc_bulk_flip_buff(rx_ring, orig_i, i); + ++ enetc_unlock_mdio(); + napi_gro_receive(napi, skb); ++ enetc_lock_mdio(); + break; + case XDP_TX: + tx_ring = priv->xdp_tx_ring[rx_ring->index]; +@@ -2008,7 +2018,9 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + } + break; + case XDP_REDIRECT: ++ enetc_unlock_mdio(); + err = xdp_do_redirect(rx_ring->ndev, &xdp_buff, prog); ++ enetc_lock_mdio(); + if (unlikely(err)) { + enetc_xdp_drop(rx_ring, orig_i, i); + rx_ring->stats.xdp_redirect_failures++; +@@ -2028,8 +2040,11 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + +- if (xdp_redirect_frm_cnt) ++ if (xdp_redirect_frm_cnt) { ++ enetc_unlock_mdio(); + xdp_do_flush(); ++ enetc_lock_mdio(); ++ } + + if (xdp_tx_frm_cnt) + enetc_update_tx_ring_tail(tx_ring); +@@ -2038,6 +2053,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + enetc_refill_rx_ring(rx_ring, enetc_bd_unused(rx_ring) - + rx_ring->xdp.xdp_tx_in_flight); + ++ enetc_unlock_mdio(); ++ + return rx_frm_cnt; + } + +@@ -2056,6 +2073,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) + for (i = 0; i < v->count_tx_rings; i++) + if (!enetc_clean_tx_ring(&v->tx_ring[i], budget)) + complete = false; ++ enetc_unlock_mdio(); + + prog = rx_ring->xdp.prog; + if (prog) +@@ -2067,10 +2085,8 @@ static int enetc_poll(struct napi_struct 
*napi, int budget) + if (work_done) + v->rx_napi_work = true; + +- if (!complete) { +- enetc_unlock_mdio(); ++ if (!complete) + return budget; +- } + + napi_complete_done(napi, work_done); + +@@ -2079,6 +2095,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) + + v->rx_napi_work = false; + ++ enetc_lock_mdio(); + /* enable interrupts */ + enetc_wr_reg_hot(v->rbier, ENETC_RBIER_RXTIE); + +-- +2.51.0 + diff --git a/queue-6.17/net-ethernet-ti-am65-cpts-fix-timestamp-loss-due-to-.patch b/queue-6.17/net-ethernet-ti-am65-cpts-fix-timestamp-loss-due-to-.patch new file mode 100644 index 0000000000..04e078a6f5 --- /dev/null +++ b/queue-6.17/net-ethernet-ti-am65-cpts-fix-timestamp-loss-due-to-.patch @@ -0,0 +1,191 @@ +From 988b97a61f82815f19031a422d090863ded96daa Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 17:27:55 +0530 +Subject: net: ethernet: ti: am65-cpts: fix timestamp loss due to race + conditions + +From: Aksh Garg + +[ Upstream commit 49d34f3dd8519581030547eb7543a62f9ab5fa08 ] + +Resolve race conditions in timestamp events list handling between TX +and RX paths causing missed timestamps. + +The current implementation uses a single events list for both TX and RX +timestamps. The am65_cpts_find_ts() function acquires the lock, +splices all events (TX as well as RX events) to a temporary list, +and releases the lock. This function performs matching of timestamps +for TX packets only. Before it acquires the lock again to put the +non-TX events back to the main events list, a concurrent RX +processing thread could acquire the lock (as observed in practice), +find an empty events list, and fail to attach timestamp to it, +even though a relevant event exists in the spliced list which is yet to +be restored to the main list. + +Fix this by creating separate events lists to handle TX and RX +timestamps independently. 
+ +Fixes: c459f606f66df ("net: ethernet: ti: am65-cpts: Enable RX HW timestamp for PTP packets using CPTS FIFO") +Signed-off-by: Aksh Garg +Reviewed-by: Siddharth Vadapalli +Link: https://patch.msgid.link/20251016115755.1123646-1-a-garg7@ti.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/ti/am65-cpts.c | 63 ++++++++++++++++++++--------- + 1 file changed, 43 insertions(+), 20 deletions(-) + +diff --git a/drivers/net/ethernet/ti/am65-cpts.c b/drivers/net/ethernet/ti/am65-cpts.c +index 59d6ab989c554..8ffbfaa3ab18c 100644 +--- a/drivers/net/ethernet/ti/am65-cpts.c ++++ b/drivers/net/ethernet/ti/am65-cpts.c +@@ -163,7 +163,9 @@ struct am65_cpts { + struct device_node *clk_mux_np; + struct clk *refclk; + u32 refclk_freq; +- struct list_head events; ++ /* separate lists to handle TX and RX timestamp independently */ ++ struct list_head events_tx; ++ struct list_head events_rx; + struct list_head pool; + struct am65_cpts_event pool_data[AM65_CPTS_MAX_EVENTS]; + spinlock_t lock; /* protects events lists*/ +@@ -227,6 +229,24 @@ static void am65_cpts_disable(struct am65_cpts *cpts) + am65_cpts_write32(cpts, 0, int_enable); + } + ++static int am65_cpts_purge_event_list(struct am65_cpts *cpts, ++ struct list_head *events) ++{ ++ struct list_head *this, *next; ++ struct am65_cpts_event *event; ++ int removed = 0; ++ ++ list_for_each_safe(this, next, events) { ++ event = list_entry(this, struct am65_cpts_event, list); ++ if (time_after(jiffies, event->tmo)) { ++ list_del_init(&event->list); ++ list_add(&event->list, &cpts->pool); ++ ++removed; ++ } ++ } ++ return removed; ++} ++ + static int am65_cpts_event_get_port(struct am65_cpts_event *event) + { + return (event->event1 & AM65_CPTS_EVENT_1_PORT_NUMBER_MASK) >> +@@ -239,20 +259,12 @@ static int am65_cpts_event_get_type(struct am65_cpts_event *event) + AM65_CPTS_EVENT_1_EVENT_TYPE_SHIFT; + } + +-static int am65_cpts_cpts_purge_events(struct am65_cpts *cpts) ++static int 
am65_cpts_purge_events(struct am65_cpts *cpts) + { +- struct list_head *this, *next; +- struct am65_cpts_event *event; + int removed = 0; + +- list_for_each_safe(this, next, &cpts->events) { +- event = list_entry(this, struct am65_cpts_event, list); +- if (time_after(jiffies, event->tmo)) { +- list_del_init(&event->list); +- list_add(&event->list, &cpts->pool); +- ++removed; +- } +- } ++ removed += am65_cpts_purge_event_list(cpts, &cpts->events_tx); ++ removed += am65_cpts_purge_event_list(cpts, &cpts->events_rx); + + if (removed) + dev_dbg(cpts->dev, "event pool cleaned up %d\n", removed); +@@ -287,7 +299,7 @@ static int __am65_cpts_fifo_read(struct am65_cpts *cpts) + struct am65_cpts_event, list); + + if (!event) { +- if (am65_cpts_cpts_purge_events(cpts)) { ++ if (am65_cpts_purge_events(cpts)) { + dev_err(cpts->dev, "cpts: event pool empty\n"); + ret = -1; + goto out; +@@ -306,11 +318,21 @@ static int __am65_cpts_fifo_read(struct am65_cpts *cpts) + cpts->timestamp); + break; + case AM65_CPTS_EV_RX: ++ event->tmo = jiffies + ++ msecs_to_jiffies(AM65_CPTS_EVENT_RX_TX_TIMEOUT); ++ ++ list_move_tail(&event->list, &cpts->events_rx); ++ ++ dev_dbg(cpts->dev, ++ "AM65_CPTS_EV_RX e1:%08x e2:%08x t:%lld\n", ++ event->event1, event->event2, ++ event->timestamp); ++ break; + case AM65_CPTS_EV_TX: + event->tmo = jiffies + + msecs_to_jiffies(AM65_CPTS_EVENT_RX_TX_TIMEOUT); + +- list_move_tail(&event->list, &cpts->events); ++ list_move_tail(&event->list, &cpts->events_tx); + + dev_dbg(cpts->dev, + "AM65_CPTS_EV_TX e1:%08x e2:%08x t:%lld\n", +@@ -828,7 +850,7 @@ static bool am65_cpts_match_tx_ts(struct am65_cpts *cpts, + return found; + } + +-static void am65_cpts_find_ts(struct am65_cpts *cpts) ++static void am65_cpts_find_tx_ts(struct am65_cpts *cpts) + { + struct am65_cpts_event *event; + struct list_head *this, *next; +@@ -837,7 +859,7 @@ static void am65_cpts_find_ts(struct am65_cpts *cpts) + LIST_HEAD(events); + + spin_lock_irqsave(&cpts->lock, flags); +- 
list_splice_init(&cpts->events, &events); ++ list_splice_init(&cpts->events_tx, &events); + spin_unlock_irqrestore(&cpts->lock, flags); + + list_for_each_safe(this, next, &events) { +@@ -850,7 +872,7 @@ static void am65_cpts_find_ts(struct am65_cpts *cpts) + } + + spin_lock_irqsave(&cpts->lock, flags); +- list_splice_tail(&events, &cpts->events); ++ list_splice_tail(&events, &cpts->events_tx); + list_splice_tail(&events_free, &cpts->pool); + spin_unlock_irqrestore(&cpts->lock, flags); + } +@@ -861,7 +883,7 @@ static long am65_cpts_ts_work(struct ptp_clock_info *ptp) + unsigned long flags; + long delay = -1; + +- am65_cpts_find_ts(cpts); ++ am65_cpts_find_tx_ts(cpts); + + spin_lock_irqsave(&cpts->txq.lock, flags); + if (!skb_queue_empty(&cpts->txq)) +@@ -905,7 +927,7 @@ static u64 am65_cpts_find_rx_ts(struct am65_cpts *cpts, u32 skb_mtype_seqid) + + spin_lock_irqsave(&cpts->lock, flags); + __am65_cpts_fifo_read(cpts); +- list_for_each_safe(this, next, &cpts->events) { ++ list_for_each_safe(this, next, &cpts->events_rx) { + event = list_entry(this, struct am65_cpts_event, list); + if (time_after(jiffies, event->tmo)) { + list_move(&event->list, &cpts->pool); +@@ -1155,7 +1177,8 @@ struct am65_cpts *am65_cpts_create(struct device *dev, void __iomem *regs, + return ERR_PTR(ret); + + mutex_init(&cpts->ptp_clk_lock); +- INIT_LIST_HEAD(&cpts->events); ++ INIT_LIST_HEAD(&cpts->events_tx); ++ INIT_LIST_HEAD(&cpts->events_rx); + INIT_LIST_HEAD(&cpts->pool); + spin_lock_init(&cpts->lock); + skb_queue_head_init(&cpts->txq); +-- +2.51.0 + diff --git a/queue-6.17/net-hibmcge-select-fixed_phy.patch b/queue-6.17/net-hibmcge-select-fixed_phy.patch new file mode 100644 index 0000000000..e016a58bb3 --- /dev/null +++ b/queue-6.17/net-hibmcge-select-fixed_phy.patch @@ -0,0 +1,38 @@ +From c482525290edb598396e8bb4ba9ede1c21298ce0 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 20 Oct 2025 08:54:54 +0200 +Subject: net: hibmcge: select FIXED_PHY + +From: Heiner Kallweit + +[ 
Upstream commit d63f0391d6c7b75e1a847e1a26349fa8cad0004d ] + +hibmcge uses fixed_phy_register() et al, but doesn't cater for the case +that hibmcge is built-in and fixed_phy is a module. To solve this +select FIXED_PHY. + +Fixes: 1d7cd7a9c69c ("net: hibmcge: support scenario without PHY") +Signed-off-by: Heiner Kallweit +Reviewed-by: Jijie Shao +Link: https://patch.msgid.link/c4fc061f-b6d5-418b-a0dc-6b238cdbedce@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/hisilicon/Kconfig | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/drivers/net/ethernet/hisilicon/Kconfig b/drivers/net/ethernet/hisilicon/Kconfig +index 65302c41bfb14..38875c196cb69 100644 +--- a/drivers/net/ethernet/hisilicon/Kconfig ++++ b/drivers/net/ethernet/hisilicon/Kconfig +@@ -148,6 +148,7 @@ config HIBMCGE + tristate "Hisilicon BMC Gigabit Ethernet Device Support" + depends on PCI && PCI_MSI + select PHYLIB ++ select FIXED_PHY + select MOTORCOMM_PHY + select REALTEK_PHY + help +-- +2.51.0 + diff --git a/queue-6.17/net-hsr-prevent-creation-of-hsr-device-with-slaves-f.patch b/queue-6.17/net-hsr-prevent-creation-of-hsr-device-with-slaves-f.patch new file mode 100644 index 0000000000..cb47f3ddfc --- /dev/null +++ b/queue-6.17/net-hsr-prevent-creation-of-hsr-device-with-slaves-f.patch @@ -0,0 +1,60 @@ +From 71437d9a00cd0e1e616d6c6a2433c4c6e6cb06bc Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 20 Oct 2025 15:55:33 +0200 +Subject: net: hsr: prevent creation of HSR device with slaves from another + netns + +From: Fernando Fernandez Mancera + +[ Upstream commit c0178eec8884231a5ae0592b9fce827bccb77e86 ] + +HSR/PRP driver does not handle correctly having slaves/interlink devices +in a different net namespace. 
Currently, it is possible to create a HSR +link in a different net namespace than the slaves/interlink with the +following command: + + ip link add hsr0 netns hsr-ns type hsr slave1 eth1 slave2 eth2 + +As there is no use-case on supporting this scenario, enforce that HSR +device link matches netns defined by IFLA_LINK_NETNSID. + +The iproute2 command mentioned above will throw the following error: + + Error: hsr: HSR slaves/interlink must be on the same net namespace than HSR link. + +Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") +Signed-off-by: Fernando Fernandez Mancera +Link: https://patch.msgid.link/20251020135533.9373-1-fmancera@suse.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/hsr/hsr_netlink.c | 8 +++++++- + 1 file changed, 7 insertions(+), 1 deletion(-) + +diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c +index b120470246cc5..c96b63adf96ff 100644 +--- a/net/hsr/hsr_netlink.c ++++ b/net/hsr/hsr_netlink.c +@@ -34,12 +34,18 @@ static int hsr_newlink(struct net_device *dev, + struct netlink_ext_ack *extack) + { + struct net *link_net = rtnl_newlink_link_net(params); ++ struct net_device *link[2], *interlink = NULL; + struct nlattr **data = params->data; + enum hsr_version proto_version; + unsigned char multicast_spec; + u8 proto = HSR_PROTOCOL_HSR; + +- struct net_device *link[2], *interlink = NULL; ++ if (!net_eq(link_net, dev_net(dev))) { ++ NL_SET_ERR_MSG_MOD(extack, ++ "HSR slaves/interlink must be on the same net namespace than HSR link"); ++ return -EINVAL; ++ } ++ + if (!data) { + NL_SET_ERR_MSG_MOD(extack, "No slave devices specified"); + return -EINVAL; +-- +2.51.0 + diff --git a/queue-6.17/net-mlx5-fix-ipsec-cleanup-over-mpv-device.patch b/queue-6.17/net-mlx5-fix-ipsec-cleanup-over-mpv-device.patch new file mode 100644 index 0000000000..4a71864244 --- /dev/null +++ b/queue-6.17/net-mlx5-fix-ipsec-cleanup-over-mpv-device.patch @@ -0,0 +1,201 @@ +From 
397834ae86b6a38a234e97496c081b3fb85bf3ec Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 22 Oct 2025 15:29:42 +0300 +Subject: net/mlx5: Fix IPsec cleanup over MPV device + +From: Patrisious Haddad + +[ Upstream commit 664f76be38a18c61151d0ef248c7e2f3afb4f3c7 ] + +When we do mlx5e_detach_netdev() we eventually disable blocking events +notifier, among those events are IPsec MPV events from IB to core. + +So before disabling those blocking events, make sure to also unregister +the devcom device and mark all of this device's operations as complete, +in order to prevent the other device from using invalid netdev +during future devcom events which could cause the trace below. + +BUG: kernel NULL pointer dereference, address: 0000000000000010 +PGD 146427067 P4D 146427067 PUD 146488067 PMD 0 +Oops: Oops: 0000 [#1] SMP +CPU: 1 UID: 0 PID: 7735 Comm: devlink Tainted: GW 6.12.0-rc6_for_upstream_min_debug_2024_11_08_00_46 #1 +Tainted: [W]=WARN +Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 +RIP: 0010:mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core] +Code: 00 01 48 83 05 23 32 1e 00 01 41 b8 ed ff ff ff e9 60 ff ff ff 48 83 05 00 32 1e 00 01 eb e3 66 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 47 10 48 83 05 5f 32 1e 00 01 48 8b 50 40 48 85 d2 74 05 40 +RSP: 0018:ffff88811a5c35f8 EFLAGS: 00010206 +RAX: ffff888106e8ab80 RBX: ffff888107d7e200 RCX: ffff88810d6f0a00 +RDX: ffff88810d6f0a00 RSI: 0000000000000001 RDI: 0000000000000000 +RBP: ffff88811a17e620 R08: 0000000000000040 R09: 0000000000000000 +R10: ffff88811a5c3618 R11: 0000000de85d51bd R12: ffff88811a17e600 +R13: ffff88810d6f0a00 R14: 0000000000000000 R15: ffff8881034bda80 +FS: 00007f27bdf89180(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 0000000000000010 CR3: 000000010f159005 CR4: 0000000000372eb0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 
00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + + ? __die+0x20/0x60 + ? page_fault_oops+0x150/0x3e0 + ? exc_page_fault+0x74/0x130 + ? asm_exc_page_fault+0x22/0x30 + ? mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core] + mlx5e_devcom_event_mpv+0x42/0x60 [mlx5_core] + mlx5_devcom_send_event+0x8c/0x170 [mlx5_core] + blocking_event+0x17b/0x230 [mlx5_core] + notifier_call_chain+0x35/0xa0 + blocking_notifier_call_chain+0x3d/0x60 + mlx5_blocking_notifier_call_chain+0x22/0x30 [mlx5_core] + mlx5_core_mp_event_replay+0x12/0x20 [mlx5_core] + mlx5_ib_bind_slave_port+0x228/0x2c0 [mlx5_ib] + mlx5_ib_stage_init_init+0x664/0x9d0 [mlx5_ib] + ? idr_alloc_cyclic+0x50/0xb0 + ? __kmalloc_cache_noprof+0x167/0x340 + ? __kmalloc_noprof+0x1a7/0x430 + __mlx5_ib_add+0x34/0xd0 [mlx5_ib] + mlx5r_probe+0xe9/0x310 [mlx5_ib] + ? kernfs_add_one+0x107/0x150 + ? __mlx5_ib_add+0xd0/0xd0 [mlx5_ib] + auxiliary_bus_probe+0x3e/0x90 + really_probe+0xc5/0x3a0 + ? driver_probe_device+0x90/0x90 + __driver_probe_device+0x80/0x160 + driver_probe_device+0x1e/0x90 + __device_attach_driver+0x7d/0x100 + bus_for_each_drv+0x80/0xd0 + __device_attach+0xbc/0x1f0 + bus_probe_device+0x86/0xa0 + device_add+0x62d/0x830 + __auxiliary_device_add+0x3b/0xa0 + ? auxiliary_device_init+0x41/0x90 + add_adev+0xd1/0x150 [mlx5_core] + mlx5_rescan_drivers_locked+0x21c/0x300 [mlx5_core] + esw_mode_change+0x6c/0xc0 [mlx5_core] + mlx5_devlink_eswitch_mode_set+0x21e/0x640 [mlx5_core] + devlink_nl_eswitch_set_doit+0x60/0xe0 + genl_family_rcv_msg_doit+0xd0/0x120 + genl_rcv_msg+0x180/0x2b0 + ? devlink_get_from_attrs_lock+0x170/0x170 + ? devlink_nl_eswitch_get_doit+0x290/0x290 + ? devlink_nl_pre_doit_port_optional+0x50/0x50 + ? genl_family_rcv_msg_dumpit+0xf0/0xf0 + netlink_rcv_skb+0x54/0x100 + genl_rcv+0x24/0x40 + netlink_unicast+0x1fc/0x2d0 + netlink_sendmsg+0x1e4/0x410 + __sock_sendmsg+0x38/0x60 + ? sockfd_lookup_light+0x12/0x60 + __sys_sendto+0x105/0x160 + ? 
__sys_recvmsg+0x4e/0x90 + __x64_sys_sendto+0x20/0x30 + do_syscall_64+0x4c/0x100 + entry_SYSCALL_64_after_hwframe+0x4b/0x53 +RIP: 0033:0x7f27bc91b13a +Code: bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 fa 96 2c 00 45 89 c9 4c 63 d1 48 63 ff 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 f3 c3 0f 1f 40 00 41 55 41 54 4d 89 c5 55 +RSP: 002b:00007fff369557e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c +RAX: ffffffffffffffda RBX: 0000000009c54b10 RCX: 00007f27bc91b13a +RDX: 0000000000000038 RSI: 0000000009c54b10 RDI: 0000000000000006 +RBP: 0000000009c54920 R08: 00007f27bd0030e0 R09: 000000000000000c +R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 +R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 + +Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi ib_umad scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm mlx5_fwctl mlx5_ib ib_uverbs ib_core mlx5_core +CR2: 0000000000000010 + +Fixes: 82f9378c443c ("net/mlx5: Handle IPsec steering upon master unbind/bind") +Signed-off-by: Patrisious Haddad +Reviewed-by: Leon Romanovsky +Signed-off-by: Tariq Toukan +Link: https://patch.msgid.link/1761136182-918470-5-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../mellanox/mlx5/core/en_accel/ipsec.h | 5 ++++ + .../mellanox/mlx5/core/en_accel/ipsec_fs.c | 25 +++++++++++++++++-- + .../net/ethernet/mellanox/mlx5/core/en_main.c | 2 ++ + 3 files changed, 30 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h +index 5d7c15abfcaf6..f8eaaf37963b1 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h +@@ -342,6 +342,7 
@@ void mlx5e_ipsec_build_accel_xfrm_attrs(struct mlx5e_ipsec_sa_entry *sa_entry, + void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *slave_priv, + struct mlx5e_priv *master_priv); + void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event); ++void mlx5e_ipsec_disable_events(struct mlx5e_priv *priv); + + static inline struct mlx5_core_dev * + mlx5e_ipsec_sa2dev(struct mlx5e_ipsec_sa_entry *sa_entry) +@@ -387,6 +388,10 @@ static inline void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *sl + static inline void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event) + { + } ++ ++static inline void mlx5e_ipsec_disable_events(struct mlx5e_priv *priv) ++{ ++} + #endif + + #endif /* __MLX5E_IPSEC_H__ */ +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +index 9e23652535638..f1297b5a04082 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +@@ -2869,9 +2869,30 @@ void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *slave_priv, + + void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event) + { +- if (!priv->ipsec) +- return; /* IPsec not supported */ ++ if (!priv->ipsec || mlx5_devcom_comp_get_size(priv->devcom) < 2) ++ return; /* IPsec not supported or no peers */ + + mlx5_devcom_send_event(priv->devcom, event, event, priv); + wait_for_completion(&priv->ipsec->comp); + } ++ ++void mlx5e_ipsec_disable_events(struct mlx5e_priv *priv) ++{ ++ struct mlx5_devcom_comp_dev *tmp = NULL; ++ struct mlx5e_priv *peer_priv; ++ ++ if (!priv->devcom) ++ return; ++ ++ if (!mlx5_devcom_for_each_peer_begin(priv->devcom)) ++ goto out; ++ ++ peer_priv = mlx5_devcom_get_next_peer_data(priv->devcom, &tmp); ++ if (peer_priv) ++ complete_all(&peer_priv->ipsec->comp); ++ ++ mlx5_devcom_for_each_peer_end(priv->devcom); ++out: ++ mlx5_devcom_unregister_component(priv->devcom); 
++ priv->devcom = NULL; ++} +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +index 21bb88c5d3dce..8a63e62938e73 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +@@ -261,6 +261,7 @@ static void mlx5e_devcom_cleanup_mpv(struct mlx5e_priv *priv) + } + + mlx5_devcom_unregister_component(priv->devcom); ++ priv->devcom = NULL; + } + + static int blocking_event(struct notifier_block *nb, unsigned long event, void *data) +@@ -6068,6 +6069,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv) + if (mlx5e_monitor_counter_supported(priv)) + mlx5e_monitor_counter_cleanup(priv); + ++ mlx5e_ipsec_disable_events(priv); + mlx5e_disable_blocking_events(priv); + mlx5e_disable_async_events(priv); + mlx5_lag_remove_netdev(mdev, priv->netdev); +-- +2.51.0 + diff --git a/queue-6.17/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch b/queue-6.17/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch new file mode 100644 index 0000000000..7edec68322 --- /dev/null +++ b/queue-6.17/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch @@ -0,0 +1,65 @@ +From 836d7621ecb64a41ebe2a4901a6404e07c779834 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 14 Oct 2025 13:46:49 -0700 +Subject: net/mlx5e: Return 1 instead of 0 in invalid case in + mlx5e_mpwrq_umr_entry_size() + +From: Nathan Chancellor + +[ Upstream commit aaf043a5688114703ae2c1482b92e7e0754d684e ] + +When building with Clang 20 or newer, there are some objtool warnings +from unexpected fallthroughs to other functions: + + vmlinux.o: warning: objtool: mlx5e_mpwrq_mtts_per_wqe() falls through to next function mlx5e_mpwrq_max_num_entries() + vmlinux.o: warning: objtool: mlx5e_mpwrq_max_log_rq_size() falls through to next function mlx5e_get_linear_rq_headroom() + +LLVM 20 contains an (admittedly problematic [1]) optimization [2] to +convert divide by zero into the 
equivalent of __builtin_unreachable(), +which invokes undefined behavior and destroys code generation when it is +encountered in a control flow graph. + +mlx5e_mpwrq_umr_entry_size() returns 0 in the default case of an +unrecognized mlx5e_mpwrq_umr_mode value. mlx5e_mpwrq_mtts_per_wqe(), +which is inlined into mlx5e_mpwrq_max_log_rq_size(), uses the result of +mlx5e_mpwrq_umr_entry_size() in a divide operation without checking for +zero, so LLVM is able to infer there will be a divide by zero in this +case and invokes undefined behavior. While there is some proposed work +to isolate this undefined behavior and avoid the destructive code +generation that results in these objtool warnings, code should still be +defensive against divide by zero. + +As the WARN_ONCE() implies that an invalid value should be handled +gracefully, return 1 instead of 0 in the default case so that the +result of this division operation is always valid. + +Fixes: 168723c1f8d6 ("net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters") +Link: https://lore.kernel.org/CAGG=3QUk8-Ak7YKnRziO4=0z=1C_7+4jF+6ZeDQ9yF+kuTOHOQ@mail.gmail.com/ [1] +Link: https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448 [2] +Closes: https://github.com/ClangBuiltLinux/linux/issues/2131 +Closes: https://github.com/ClangBuiltLinux/linux/issues/2132 +Signed-off-by: Nathan Chancellor +Reviewed-by: Tariq Toukan +Link: https://patch.msgid.link/20251014-mlx5e-avoid-zero-div-from-mlx5e_mpwrq_umr_entry_size-v1-1-dc186b8819ef@kernel.org +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +index 3cca06a74cf94..06e1a04e693f3 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c 
+@@ -99,7 +99,7 @@ u8 mlx5e_mpwrq_umr_entry_size(enum mlx5e_mpwrq_umr_mode mode) + return sizeof(struct mlx5_ksm) * 4; + } + WARN_ONCE(1, "MPWRQ UMR mode %d is not known\n", mode); +- return 0; ++ return 1; + } + + u8 mlx5e_mpwrq_log_wqe_sz(struct mlx5_core_dev *mdev, u8 page_shift, +-- +2.51.0 + diff --git a/queue-6.17/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch b/queue-6.17/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch new file mode 100644 index 0000000000..6c5218676b --- /dev/null +++ b/queue-6.17/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch @@ -0,0 +1,68 @@ +From a0a70121d40244130d5ea89a607324a8fe40b934 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 22:55:39 +0300 +Subject: net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy + RQ + +From: Amery Hung + +[ Upstream commit afd5ba577c10639f62e8120df67dc70ea4b61176 ] + +XDP programs can release xdp_buff fragments when calling +bpf_xdp_adjust_tail(). The driver currently assumes the number of +fragments to be unchanged and may generate skb with wrong truesize or +containing invalid frags. Fix the bug by generating skb according to +xdp_buff after the XDP program runs. 
+ +Fixes: ea5d49bdae8b ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ") +Reviewed-by: Dragos Tatulea +Signed-off-by: Amery Hung +Signed-off-by: Tariq Toukan +Link: https://patch.msgid.link/1760644540-899148-2-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../net/ethernet/mellanox/mlx5/core/en_rx.c | 25 ++++++++++++++----- + 1 file changed, 19 insertions(+), 6 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +index b8c609d91d11b..25d993ded314a 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +@@ -1773,14 +1773,27 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + } + + prog = rcu_dereference(rq->xdp_prog); +- if (prog && mlx5e_xdp_handle(rq, prog, mxbuf)) { +- if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { +- struct mlx5e_wqe_frag_info *pwi; ++ if (prog) { ++ u8 nr_frags_free, old_nr_frags = sinfo->nr_frags; ++ ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) { ++ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, ++ rq->flags)) { ++ struct mlx5e_wqe_frag_info *pwi; ++ ++ wi -= old_nr_frags - sinfo->nr_frags; ++ ++ for (pwi = head_wi; pwi < wi; pwi++) ++ pwi->frag_page->frags++; ++ } ++ return NULL; /* page/packet was consumed by XDP */ ++ } + +- for (pwi = head_wi; pwi < wi; pwi++) +- pwi->frag_page->frags++; ++ nr_frags_free = old_nr_frags - sinfo->nr_frags; ++ if (unlikely(nr_frags_free)) { ++ wi -= nr_frags_free; ++ truesize -= nr_frags_free * frag_info->frag_stride; + } +- return NULL; /* page/packet was consumed by XDP */ + } + + skb = mlx5e_build_linear_skb( +-- +2.51.0 + diff --git a/queue-6.17/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-22401 b/queue-6.17/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-22401 new file mode 100644 index 0000000000..fc970ca764 --- /dev/null 
+++ b/queue-6.17/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-22401 @@ -0,0 +1,120 @@ +From 587657c0d0f9786ebeb7fb8d5faf34dd50282a1c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 22:55:40 +0300 +Subject: net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for + striding RQ + +From: Amery Hung + +[ Upstream commit 87bcef158ac1faca1bd7e0104588e8e2956d10be ] + +XDP programs can change the layout of an xdp_buff through +bpf_xdp_adjust_tail() and bpf_xdp_adjust_head(). Therefore, the driver +cannot assume the size of the linear data area nor fragments. Fix the +bug in mlx5 by generating skb according to xdp_buff after XDP programs +run. + +Currently, when handling multi-buf XDP, the mlx5 driver assumes the +layout of an xdp_buff to be unchanged. That is, the linear data area +continues to be empty and fragments remain the same. This may cause +the driver to generate an erroneous skb or trigger a kernel +warning. When an XDP program adds linear data through +bpf_xdp_adjust_head(), the linear data will be ignored as +mlx5e_build_linear_skb() builds an skb without linear data and then +pulls data from fragments to fill the linear data area. When an XDP +program has shrunk the non-linear data through bpf_xdp_adjust_tail(), +the delta passed to __pskb_pull_tail() may exceed the actual nonlinear +data size and trigger the BUG_ON in it. + +To fix the issue, first record the original number of fragments. If the +number of fragments changes after the XDP program runs, rewind the end +fragment pointer by the difference and recalculate the truesize. Then, +build the skb with the linear data area matching the xdp_buff. Finally, +only pull data in if there is non-linear data and fill the linear part +up to 256 bytes. 
+ +Fixes: f52ac7028bec ("net/mlx5e: RX, Add XDP multi-buffer support in Striding RQ") +Signed-off-by: Amery Hung +Signed-off-by: Tariq Toukan +Link: https://patch.msgid.link/1760644540-899148-3-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../net/ethernet/mellanox/mlx5/core/en_rx.c | 26 ++++++++++++++++--- + 1 file changed, 23 insertions(+), 3 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +index 25d993ded314a..5dbf48da2f4f1 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +@@ -2017,6 +2017,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + u32 byte_cnt = cqe_bcnt; + struct skb_shared_info *sinfo; + unsigned int truesize = 0; ++ u32 pg_consumed_bytes; + struct bpf_prog *prog; + struct sk_buff *skb; + u32 linear_frame_sz; +@@ -2070,7 +2071,8 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + + while (byte_cnt) { + /* Non-linear mode, hence non-XSK, which always uses PAGE_SIZE. 
*/ +- u32 pg_consumed_bytes = min_t(u32, PAGE_SIZE - frag_offset, byte_cnt); ++ pg_consumed_bytes = ++ min_t(u32, PAGE_SIZE - frag_offset, byte_cnt); + + if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) + truesize += pg_consumed_bytes; +@@ -2086,10 +2088,15 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } + + if (prog) { ++ u8 nr_frags_free, old_nr_frags = sinfo->nr_frags; ++ u32 len; ++ + if (mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + struct mlx5e_frag_page *pfp; + ++ frag_page -= old_nr_frags - sinfo->nr_frags; ++ + for (pfp = head_page; pfp < frag_page; pfp++) + pfp->frags++; + +@@ -2100,9 +2107,19 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + return NULL; /* page/packet was consumed by XDP */ + } + ++ nr_frags_free = old_nr_frags - sinfo->nr_frags; ++ if (unlikely(nr_frags_free)) { ++ frag_page -= nr_frags_free; ++ truesize -= (nr_frags_free - 1) * PAGE_SIZE + ++ ALIGN(pg_consumed_bytes, ++ BIT(rq->mpwqe.log_stride_sz)); ++ } ++ ++ len = mxbuf->xdp.data_end - mxbuf->xdp.data; ++ + skb = mlx5e_build_linear_skb( + rq, mxbuf->xdp.data_hard_start, linear_frame_sz, +- mxbuf->xdp.data - mxbuf->xdp.data_hard_start, 0, ++ mxbuf->xdp.data - mxbuf->xdp.data_hard_start, len, + mxbuf->xdp.data - mxbuf->xdp.data_meta); + if (unlikely(!skb)) { + mlx5e_page_release_fragmented(rq->page_pool, +@@ -2127,8 +2144,11 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + do + pagep->frags++; + while (++pagep < frag_page); ++ ++ headlen = min_t(u16, MLX5E_RX_MAX_HEAD - len, ++ skb->data_len); ++ __pskb_pull_tail(skb, headlen); + } +- __pskb_pull_tail(skb, headlen); + } else { + dma_addr_t addr; + +-- +2.51.0 + diff --git a/queue-6.17/net-phy-micrel-always-set-shared-phydev-for-lan8814.patch b/queue-6.17/net-phy-micrel-always-set-shared-phydev-for-lan8814.patch new file mode 100644 index 0000000000..cc4cb5cfad --- 
/dev/null +++ b/queue-6.17/net-phy-micrel-always-set-shared-phydev-for-lan8814.patch @@ -0,0 +1,54 @@ +From 339824d5ad7e65c49aba6bd0954b4ebad353b393 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 15:20:26 +0200 +Subject: net: phy: micrel: always set shared->phydev for LAN8814 + +From: Robert Marko + +[ Upstream commit 399d10934740ae8cdaa4e3245f7c5f6c332da844 ] + +Currently, during the LAN8814 PTP probe shared->phydev is only set if PTP +clock gets actually set, otherwise the function will return before setting +it. + +This is an issue as shared->phydev is unconditionally being used when IRQ +is being handled, especially in lan8814_gpio_process_cap and since it was +not set it will cause a NULL pointer exception and crash the kernel. + +So, simply always set shared->phydev to avoid the NULL pointer exception. + +Fixes: b3f1a08fcf0d ("net: phy: micrel: Add support for PTP_PF_EXTTS for lan8814") +Signed-off-by: Robert Marko +Tested-by: Horatiu Vultur +Link: https://patch.msgid.link/20251021132034.983936-1-robert.marko@sartura.hr +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/phy/micrel.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c +index 605b0315b4cb0..99f8374fd32a7 100644 +--- a/drivers/net/phy/micrel.c ++++ b/drivers/net/phy/micrel.c +@@ -4109,6 +4109,8 @@ static int lan8814_ptp_probe_once(struct phy_device *phydev) + { + struct lan8814_shared_priv *shared = phy_package_get_priv(phydev); + ++ shared->phydev = phydev; ++ + /* Initialise shared lock for clock*/ + mutex_init(&shared->shared_lock); + +@@ -4164,8 +4166,6 @@ static int lan8814_ptp_probe_once(struct phy_device *phydev) + + phydev_dbg(phydev, "successfully registered ptp clock\n"); + +- shared->phydev = phydev; +- + /* The EP.4 is shared between all the PHYs in the package and also it + * can be accessed by any of the PHYs + */ +-- +2.51.0 + diff --git 
a/queue-6.17/net-phy-realtek-fix-rtl8221b-vm-cg-name.patch b/queue-6.17/net-phy-realtek-fix-rtl8221b-vm-cg-name.patch new file mode 100644 index 0000000000..ea889a47e2 --- /dev/null +++ b/queue-6.17/net-phy-realtek-fix-rtl8221b-vm-cg-name.patch @@ -0,0 +1,80 @@ +From 4d288f5e71c8aeaa9373c8d5e6dc4c6530f90656 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 21:22:52 +0200 +Subject: net: phy: realtek: fix rtl8221b-vm-cg name + +From: Aleksander Jan Bajkowski + +[ Upstream commit ffff5c8fc2af2218a3332b3d5b97654599d50cde ] + +When splitting the RTL8221B-VM-CG into C22 and C45 variants, the name was +accidentally changed to RTL8221B-VN-CG. This patch brings back the previous +part number. + +Fixes: ad5ce743a6b0 ("net: phy: realtek: Add driver instances for rtl8221b via Clause 45") +Signed-off-by: Aleksander Jan Bajkowski +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/20251016192325.2306757-1-olek2@wp.pl +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/phy/realtek/realtek_main.c | 16 ++++++++-------- + 1 file changed, 8 insertions(+), 8 deletions(-) + +diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c +index 64af3b96f0288..62ef87ecc5587 100644 +--- a/drivers/net/phy/realtek/realtek_main.c ++++ b/drivers/net/phy/realtek/realtek_main.c +@@ -156,7 +156,7 @@ + #define RTL_8211FVD_PHYID 0x001cc878 + #define RTL_8221B 0x001cc840 + #define RTL_8221B_VB_CG 0x001cc849 +-#define RTL_8221B_VN_CG 0x001cc84a ++#define RTL_8221B_VM_CG 0x001cc84a + #define RTL_8251B 0x001cc862 + #define RTL_8261C 0x001cc890 + +@@ -1362,16 +1362,16 @@ static int rtl8221b_vb_cg_c45_match_phy_device(struct phy_device *phydev, + return rtlgen_is_c45_match(phydev, RTL_8221B_VB_CG, true); + } + +-static int rtl8221b_vn_cg_c22_match_phy_device(struct phy_device *phydev, ++static int rtl8221b_vm_cg_c22_match_phy_device(struct phy_device *phydev, + const struct phy_driver *phydrv) + { +- return 
rtlgen_is_c45_match(phydev, RTL_8221B_VN_CG, false); ++ return rtlgen_is_c45_match(phydev, RTL_8221B_VM_CG, false); + } + +-static int rtl8221b_vn_cg_c45_match_phy_device(struct phy_device *phydev, ++static int rtl8221b_vm_cg_c45_match_phy_device(struct phy_device *phydev, + const struct phy_driver *phydrv) + { +- return rtlgen_is_c45_match(phydev, RTL_8221B_VN_CG, true); ++ return rtlgen_is_c45_match(phydev, RTL_8221B_VM_CG, true); + } + + static int rtl_internal_nbaset_match_phy_device(struct phy_device *phydev, +@@ -1718,7 +1718,7 @@ static struct phy_driver realtek_drvs[] = { + .suspend = genphy_c45_pma_suspend, + .resume = rtlgen_c45_resume, + }, { +- .match_phy_device = rtl8221b_vn_cg_c22_match_phy_device, ++ .match_phy_device = rtl8221b_vm_cg_c22_match_phy_device, + .name = "RTL8221B-VM-CG 2.5Gbps PHY (C22)", + .probe = rtl822x_probe, + .get_features = rtl822x_get_features, +@@ -1731,8 +1731,8 @@ static struct phy_driver realtek_drvs[] = { + .read_page = rtl821x_read_page, + .write_page = rtl821x_write_page, + }, { +- .match_phy_device = rtl8221b_vn_cg_c45_match_phy_device, +- .name = "RTL8221B-VN-CG 2.5Gbps PHY (C45)", ++ .match_phy_device = rtl8221b_vm_cg_c45_match_phy_device, ++ .name = "RTL8221B-VM-CG 2.5Gbps PHY (C45)", + .probe = rtl822x_probe, + .config_init = rtl822xb_config_init, + .get_rate_matching = rtl822xb_get_rate_matching, +-- +2.51.0 + diff --git a/queue-6.17/net-smc-fix-general-protection-fault-in-__smc_diag_d.patch b/queue-6.17/net-smc-fix-general-protection-fault-in-__smc_diag_d.patch new file mode 100644 index 0000000000..bcb433e456 --- /dev/null +++ b/queue-6.17/net-smc-fix-general-protection-fault-in-__smc_diag_d.patch @@ -0,0 +1,131 @@ +From 7c3ceb8d6f601e916aba44a8f4007ab6b2822ab1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 10:48:27 +0800 +Subject: net/smc: fix general protection fault in __smc_diag_dump + +From: Wang Liang + +[ Upstream commit f584239a9ed25057496bf397c370cc5163dde419 ] + +The syzbot reported a 
crash: + + Oops: general protection fault, probably for non-canonical address 0xfbd5a5d5a0000003: 0000 [#1] SMP KASAN NOPTI + KASAN: maybe wild-memory-access in range [0xdead4ead00000018-0xdead4ead0000001f] + CPU: 1 UID: 0 PID: 6949 Comm: syz.0.335 Not tainted syzkaller #0 PREEMPT(full) + Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025 + RIP: 0010:smc_diag_msg_common_fill net/smc/smc_diag.c:44 [inline] + RIP: 0010:__smc_diag_dump.constprop.0+0x3ca/0x2550 net/smc/smc_diag.c:89 + Call Trace: + + smc_diag_dump_proto+0x26d/0x420 net/smc/smc_diag.c:217 + smc_diag_dump+0x27/0x90 net/smc/smc_diag.c:234 + netlink_dump+0x539/0xd30 net/netlink/af_netlink.c:2327 + __netlink_dump_start+0x6d6/0x990 net/netlink/af_netlink.c:2442 + netlink_dump_start include/linux/netlink.h:341 [inline] + smc_diag_handler_dump+0x1f9/0x240 net/smc/smc_diag.c:251 + __sock_diag_cmd net/core/sock_diag.c:249 [inline] + sock_diag_rcv_msg+0x438/0x790 net/core/sock_diag.c:285 + netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2552 + netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline] + netlink_unicast+0x5a7/0x870 net/netlink/af_netlink.c:1346 + netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896 + sock_sendmsg_nosec net/socket.c:714 [inline] + __sock_sendmsg net/socket.c:729 [inline] + ____sys_sendmsg+0xa95/0xc70 net/socket.c:2614 + ___sys_sendmsg+0x134/0x1d0 net/socket.c:2668 + __sys_sendmsg+0x16d/0x220 net/socket.c:2700 + do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] + do_syscall_64+0xcd/0x4e0 arch/x86/entry/syscall_64.c:94 + entry_SYSCALL_64_after_hwframe+0x77/0x7f + + +The process like this: + + (CPU1) | (CPU2) + ---------------------------------|------------------------------- + inet_create() | + // init clcsock to NULL | + sk = sk_alloc() | + | + // unexpectedly change clcsock | + inet_init_csk_locks() | + | + // add sk to hash table | + smc_inet_init_sock() | + smc_sk_init() | + smc_hash_sk() | + | // traverse the hash table + | 
smc_diag_dump_proto + | __smc_diag_dump() + | // visit wrong clcsock + | smc_diag_msg_common_fill() + // alloc clcsock | + smc_create_clcsk | + sock_create_kern | + +With CONFIG_DEBUG_LOCK_ALLOC=y, the smc->clcsock is unexpectedly changed +in inet_init_csk_locks(). The INET_PROTOSW_ICSK flag is not needed by smc, +just remove it. + +After removing the INET_PROTOSW_ICSK flag, this patch also reverts +commit 6fd27ea183c2 ("net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC") +to avoid casting smc_sock to inet_connection_sock. + +Reported-by: syzbot+f775be4458668f7d220e@syzkaller.appspotmail.com +Closes: https://syzkaller.appspot.com/bug?extid=f775be4458668f7d220e +Tested-by: syzbot+f775be4458668f7d220e@syzkaller.appspotmail.com +Fixes: d25a92ccae6b ("net/smc: Introduce IPPROTO_SMC") +Signed-off-by: Wang Liang +Reviewed-by: Kuniyuki Iwashima +Reviewed-by: Eric Dumazet +Reviewed-by: D. Wythe +Link: https://patch.msgid.link/20251017024827.3137512-1-wangliang74@huawei.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/smc/smc_inet.c | 13 ------------- + 1 file changed, 13 deletions(-) + +diff --git a/net/smc/smc_inet.c b/net/smc/smc_inet.c +index a944e7dcb8b96..a94084b4a498e 100644 +--- a/net/smc/smc_inet.c ++++ b/net/smc/smc_inet.c +@@ -56,7 +56,6 @@ static struct inet_protosw smc_inet_protosw = { + .protocol = IPPROTO_SMC, + .prot = &smc_inet_prot, + .ops = &smc_inet_stream_ops, +- .flags = INET_PROTOSW_ICSK, + }; + + #if IS_ENABLED(CONFIG_IPV6) +@@ -104,27 +103,15 @@ static struct inet_protosw smc_inet6_protosw = { + .protocol = IPPROTO_SMC, + .prot = &smc_inet6_prot, + .ops = &smc_inet6_stream_ops, +- .flags = INET_PROTOSW_ICSK, + }; + #endif /* CONFIG_IPV6 */ + +-static unsigned int smc_sync_mss(struct sock *sk, u32 pmtu) +-{ +- /* No need pass it through to clcsock, mss can always be set by +- * sock_create_kern or smc_setsockopt. 
+- */ +- return 0; +-} +- + static int smc_inet_init_sock(struct sock *sk) + { + struct net *net = sock_net(sk); + + /* init common smc sock */ + smc_sk_init(net, sk, IPPROTO_SMC); +- +- inet_csk(sk)->icsk_sync_mss = smc_sync_mss; +- + /* create clcsock */ + return smc_create_clcsk(net, sk, sk->sk_family); + } +-- +2.51.0 + diff --git a/queue-6.17/ovpn-use-datagram_poll_queue-for-socket-readiness-in.patch b/queue-6.17/ovpn-use-datagram_poll_queue-for-socket-readiness-in.patch new file mode 100644 index 0000000000..304f0ed5c8 --- /dev/null +++ b/queue-6.17/ovpn-use-datagram_poll_queue-for-socket-readiness-in.patch @@ -0,0 +1,76 @@ +From 34078a5a3e97840f10514832ea2a0c65f32ef519 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 12:09:42 +0200 +Subject: ovpn: use datagram_poll_queue for socket readiness in TCP + +From: Ralf Lici + +[ Upstream commit efd729408bc7d57e0c8d027b9ff514187fc1a05b ] + +openvpn TCP encapsulation uses a custom queue to deliver packets to +userspace. Currently it relies on datagram_poll, which checks +sk_receive_queue, leading to false readiness signals when that queue +contains non-userspace packets. + +Switch ovpn_tcp_poll to use datagram_poll_queue with the peer's +user_queue, ensuring poll only signals readiness when userspace data is +actually available. Also refactor ovpn_tcp_poll in order to enforce the +assumption we can make on the lifetime of ovpn_sock and peer. 
+ +Fixes: 11851cbd60ea ("ovpn: implement TCP transport") +Signed-off-by: Antonio Quartulli +Signed-off-by: Ralf Lici +Reviewed-by: Sabrina Dubroca +Link: https://patch.msgid.link/20251021100942.195010-4-ralf@mandelbit.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ovpn/tcp.c | 26 ++++++++++++++++++++++---- + 1 file changed, 22 insertions(+), 4 deletions(-) + +diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c +index 289f62c5d2c70..0d7f30360d874 100644 +--- a/drivers/net/ovpn/tcp.c ++++ b/drivers/net/ovpn/tcp.c +@@ -560,16 +560,34 @@ static void ovpn_tcp_close(struct sock *sk, long timeout) + static __poll_t ovpn_tcp_poll(struct file *file, struct socket *sock, + poll_table *wait) + { +- __poll_t mask = datagram_poll(file, sock, wait); ++ struct sk_buff_head *queue = &sock->sk->sk_receive_queue; + struct ovpn_socket *ovpn_sock; ++ struct ovpn_peer *peer = NULL; ++ __poll_t mask; + + rcu_read_lock(); + ovpn_sock = rcu_dereference_sk_user_data(sock->sk); +- if (ovpn_sock && ovpn_sock->peer && +- !skb_queue_empty(&ovpn_sock->peer->tcp.user_queue)) +- mask |= EPOLLIN | EPOLLRDNORM; ++ /* if we landed in this callback, we expect to have a ++ * meaningful state. The ovpn_socket lifecycle would ++ * prevent it otherwise. 
++ */ ++ if (WARN(!ovpn_sock || !ovpn_sock->peer, ++ "ovpn: null state in ovpn_tcp_poll!")) { ++ rcu_read_unlock(); ++ return 0; ++ } ++ ++ if (ovpn_peer_hold(ovpn_sock->peer)) { ++ peer = ovpn_sock->peer; ++ queue = &peer->tcp.user_queue; ++ } + rcu_read_unlock(); + ++ mask = datagram_poll_queue(file, sock, wait, queue); ++ ++ if (peer) ++ ovpn_peer_put(peer); ++ + return mask; + } + +-- +2.51.0 + diff --git a/queue-6.17/platform-mellanox-mlxbf-pmc-add-sysfs_attr_init-to-c.patch b/queue-6.17/platform-mellanox-mlxbf-pmc-add-sysfs_attr_init-to-c.patch new file mode 100644 index 0000000000..444840cf10 --- /dev/null +++ b/queue-6.17/platform-mellanox-mlxbf-pmc-add-sysfs_attr_init-to-c.patch @@ -0,0 +1,65 @@ +From 91ba15701629cd5bbc20116e4909b107191492b2 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 13 Oct 2025 15:56:05 +0000 +Subject: platform/mellanox: mlxbf-pmc: add sysfs_attr_init() to count_clock + init +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: David Thompson + +[ Upstream commit a7b4747d8e0e7871c3d4971cded1dcc9af6af9e9 ] + +The lock-related debug logic (CONFIG_LOCK_STAT) in the kernel is noting +the following warning when the BlueField-3 SOC is booted: + + BUG: key ffff00008a3402a8 has not been registered! 
+ ------------[ cut here ]------------ + DEBUG_LOCKS_WARN_ON(1) + WARNING: CPU: 4 PID: 592 at kernel/locking/lockdep.c:4801 lockdep_init_map_type+0x1d4/0x2a0 + + Call trace: + lockdep_init_map_type+0x1d4/0x2a0 + __kernfs_create_file+0x84/0x140 + sysfs_add_file_mode_ns+0xcc/0x1cc + internal_create_group+0x110/0x3d4 + internal_create_groups.part.0+0x54/0xcc + sysfs_create_groups+0x24/0x40 + device_add+0x6e8/0x93c + device_register+0x28/0x40 + __hwmon_device_register+0x4b0/0x8a0 + devm_hwmon_device_register_with_groups+0x7c/0xe0 + mlxbf_pmc_probe+0x1e8/0x3e0 [mlxbf_pmc] + platform_probe+0x70/0x110 + +The mlxbf_pmc driver must call sysfs_attr_init() during the +initialization of the "count_clock" data structure to avoid +this warning. + +Fixes: 5efc800975d9 ("platform/mellanox: mlxbf-pmc: Add support for monitoring cycle count") +Reviewed-by: Shravan Kumar Ramani +Signed-off-by: David Thompson +Link: https://patch.msgid.link/20251013155605.3589770-1-davthompson@nvidia.com +Reviewed-by: Ilpo Järvinen +Signed-off-by: Ilpo Järvinen +Signed-off-by: Sasha Levin +--- + drivers/platform/mellanox/mlxbf-pmc.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c +index 4776013e07649..16a2fd9fdd9b8 100644 +--- a/drivers/platform/mellanox/mlxbf-pmc.c ++++ b/drivers/platform/mellanox/mlxbf-pmc.c +@@ -2015,6 +2015,7 @@ static int mlxbf_pmc_init_perftype_counter(struct device *dev, unsigned int blk_ + if (pmc->block[blk_num].type == MLXBF_PMC_TYPE_CRSPACE) { + /* Program crspace counters to count clock cycles using "count_clock" sysfs */ + attr = &pmc->block[blk_num].attr_count_clock; ++ sysfs_attr_init(&attr->dev_attr.attr); + attr->dev_attr.attr.mode = 0644; + attr->dev_attr.show = mlxbf_pmc_count_clock_show; + attr->dev_attr.store = mlxbf_pmc_count_clock_store; +-- +2.51.0 + diff --git a/queue-6.17/ptp-ocp-fix-typo-using-index-1-instead-of-i-in-sma-i.patch 
b/queue-6.17/ptp-ocp-fix-typo-using-index-1-instead-of-i-in-sma-i.patch new file mode 100644 index 0000000000..5104d57fbf --- /dev/null +++ b/queue-6.17/ptp-ocp-fix-typo-using-index-1-instead-of-i-in-sma-i.patch @@ -0,0 +1,41 @@ +From 4b3abf2af27b5e5bcde3beae8103b4f8b05f1f9f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 18:24:56 +0000 +Subject: ptp: ocp: Fix typo using index 1 instead of i in SMA initialization + loop + +From: Jiasheng Jiang + +[ Upstream commit a767957e7a83f9e742be196aa52a48de8ac5a7e4 ] + +In ptp_ocp_sma_fb_init(), the code mistakenly used bp->sma[1] +instead of bp->sma[i] inside a for-loop, which caused only SMA[1] +to have its DIRECTION_CAN_CHANGE capability cleared. This led to +inconsistent capability flags across SMA pins. + +Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops") +Signed-off-by: Jiasheng Jiang +Reviewed-by: Vadim Fedorenko +Link: https://patch.msgid.link/20251021182456.9729-1-jiashengjiangcool@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/ptp/ptp_ocp.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/ptp/ptp_ocp.c b/drivers/ptp/ptp_ocp.c +index 4e1286ce05c9a..f354f2f51a48c 100644 +--- a/drivers/ptp/ptp_ocp.c ++++ b/drivers/ptp/ptp_ocp.c +@@ -2550,7 +2550,7 @@ ptp_ocp_sma_fb_init(struct ptp_ocp *bp) + for (i = 0; i < OCP_SMA_NUM; i++) { + bp->sma[i].fixed_fcn = true; + bp->sma[i].fixed_dir = true; +- bp->sma[1].dpll_prop.capabilities &= ++ bp->sma[i].dpll_prop.capabilities &= + ~DPLL_PIN_CAPABILITIES_DIRECTION_CAN_CHANGE; + } + return; +-- +2.51.0 + diff --git a/queue-6.17/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch b/queue-6.17/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch new file mode 100644 index 0000000000..64e0cbd844 --- /dev/null +++ b/queue-6.17/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch @@ -0,0 +1,56 @@ +From 151f75d7a21ccb1c3dabda28ba5bebf850cf2db2 Mon Sep 17 00:00:00 2001 +From: Sasha 
Levin +Date: Wed, 15 Oct 2025 22:15:43 +0200 +Subject: rtnetlink: Allow deleting FDB entries in user namespace +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Johannes Wiesböck + +[ Upstream commit bf29555f5bdc017bac22ca66fcb6c9f46ec8788f ] + +Creating FDB entries is possible from a non-initial user namespace when +having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive +an EPERM because the capability is always checked against the initial +user namespace. This restricts the FDB management from unprivileged +containers. + +Drop the netlink_capable check in rtnl_fdb_del as it was originally +dropped in c5c351088ae7 and reintroduced in 1690be63a27b without +intention. + +This patch was tested using a container on GyroidOS, where it was +possible to delete FDB entries from an unprivileged user namespace and +private network namespace. + +Fixes: 1690be63a27b ("bridge: Add vlan support to static neighbors") +Reviewed-by: Michael Weiß +Tested-by: Harshal Gohel +Signed-off-by: Johannes Wiesböck +Reviewed-by: Ido Schimmel +Reviewed-by: Nikolay Aleksandrov +Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 094b085cff206..8f3fd52f089d2 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -4707,9 +4707,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + int err; + u16 vid; + +- if (!netlink_capable(skb, CAP_NET_ADMIN)) +- return -EPERM; +- + if (!del_bulk) { + err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, + NULL, extack); +-- +2.51.0 + diff --git a/queue-6.17/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch b/queue-6.17/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch new file mode 100644 index 
0000000000..a6fbcead0e --- /dev/null +++ b/queue-6.17/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch @@ -0,0 +1,54 @@ +From 6e47a6224e4ce7e2aa3ec95fbe942ada9309517f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 16:00:36 +0300 +Subject: sctp: avoid NULL dereference when chunk data buffer is missing + +From: Alexey Simakov + +[ Upstream commit 441f0647f7673e0e64d4910ef61a5fb8f16bfb82 ] + +chunk->skb pointer is dereferenced in the if-block where it's supposed +to be NULL only. + +chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list +instead and do it just before replacing chunk->skb. We're sure that +otherwise chunk->skb is non-NULL because of outer if() condition. + +Fixes: 90017accff61 ("sctp: Add GSO support") +Signed-off-by: Alexey Simakov +Acked-by: Marcelo Ricardo Leitner +Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/sctp/inqueue.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c +index 5c16521818058..f5a7d5a387555 100644 +--- a/net/sctp/inqueue.c ++++ b/net/sctp/inqueue.c +@@ -169,13 +169,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) + chunk->head_skb = chunk->skb; + + /* skbs with "cover letter" */ +- if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) ++ if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) { ++ if (WARN_ON(!skb_shinfo(chunk->skb)->frag_list)) { ++ __SCTP_INC_STATS(dev_net(chunk->skb->dev), ++ SCTP_MIB_IN_PKT_DISCARDS); ++ sctp_chunk_free(chunk); ++ goto next_chunk; ++ } + chunk->skb = skb_shinfo(chunk->skb)->frag_list; +- +- if (WARN_ON(!chunk->skb)) { +- __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); +- sctp_chunk_free(chunk); +- goto next_chunk; + } + } + +-- +2.51.0 + diff --git a/queue-6.17/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch 
b/queue-6.17/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch new file mode 100644 index 0000000000..2d0399f99d --- /dev/null +++ b/queue-6.17/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch @@ -0,0 +1,241 @@ +From 14bc0ed0dd7384cc4eb8e03f2a3c32abc3984203 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 16:06:14 -0400 +Subject: selftests: net: fix server bind failure in sctp_vrf.sh +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Xin Long + +[ Upstream commit a73ca0449bcb7c238097cc6a1bf3fd82a78374df ] + +sctp_vrf.sh could fail: + + TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [FAIL] + not ok 1 selftests: net: sctp_vrf.sh # exit=3 + +The failure happens when the server bind in a new run conflicts with an +existing association from the previous run: + +[1] ip netns exec $SERVER_NS ./sctp_hello server ... +[2] ip netns exec $CLIENT_NS ./sctp_hello client ... +[3] ip netns exec $SERVER_NS pkill sctp_hello ... +[4] ip netns exec $SERVER_NS ./sctp_hello server ... + +It occurs if the client in [2] sends a message and closes immediately. +With the message unacked, no SHUTDOWN is sent. Killing the server in [3] +triggers a SHUTDOWN the client also ignores due to the unacked message, +leaving the old association alive. This causes the bind at [4] to fail +until the message is acked and the client responds to a second SHUTDOWN +after the server’s T2 timer expires (3s). + +This patch fixes the issue by preventing the client from sending data. +Instead, the client blocks on recv() and waits for the server to close. +It also waits until both the server and the client sockets are fully +released in stop_server and wait_client before restarting. + +Additionally, replace 2>&1 >/dev/null with -q in sysctl and grep, and +drop other redundant 2>&1 >/dev/null redirections, and fix a typo from +N to Y (connect successfully) in the description of the last test. 
+ +Fixes: a61bd7b9fef3 ("selftests: add a selftest for sctp vrf") +Reported-by: Hangbin Liu +Tested-by: Jakub Kicinski +Signed-off-by: Xin Long +Link: https://patch.msgid.link/be2dacf52d0917c4ba5e2e8c5a9cb640740ad2b6.1760731574.git.lucien.xin@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + tools/testing/selftests/net/sctp_hello.c | 17 +----- + tools/testing/selftests/net/sctp_vrf.sh | 73 +++++++++++++++--------- + 2 files changed, 47 insertions(+), 43 deletions(-) + +diff --git a/tools/testing/selftests/net/sctp_hello.c b/tools/testing/selftests/net/sctp_hello.c +index f02f1f95d2275..a04dac0b8027d 100644 +--- a/tools/testing/selftests/net/sctp_hello.c ++++ b/tools/testing/selftests/net/sctp_hello.c +@@ -29,7 +29,6 @@ static void set_addr(struct sockaddr_storage *ss, char *ip, char *port, int *len + static int do_client(int argc, char *argv[]) + { + struct sockaddr_storage ss; +- char buf[] = "hello"; + int csk, ret, len; + + if (argc < 5) { +@@ -56,16 +55,10 @@ static int do_client(int argc, char *argv[]) + + set_addr(&ss, argv[3], argv[4], &len); + ret = connect(csk, (struct sockaddr *)&ss, len); +- if (ret < 0) { +- printf("failed to connect to peer\n"); ++ if (ret < 0) + return -1; +- } + +- ret = send(csk, buf, strlen(buf) + 1, 0); +- if (ret < 0) { +- printf("failed to send msg %d\n", ret); +- return -1; +- } ++ recv(csk, NULL, 0, 0); + close(csk); + + return 0; +@@ -75,7 +68,6 @@ int main(int argc, char *argv[]) + { + struct sockaddr_storage ss; + int lsk, csk, ret, len; +- char buf[20]; + + if (argc < 2 || (strcmp(argv[1], "server") && strcmp(argv[1], "client"))) { + printf("%s server|client ...\n", argv[0]); +@@ -125,11 +117,6 @@ int main(int argc, char *argv[]) + return -1; + } + +- ret = recv(csk, buf, sizeof(buf), 0); +- if (ret <= 0) { +- printf("failed to recv msg %d\n", ret); +- return -1; +- } + close(csk); + close(lsk); + +diff --git a/tools/testing/selftests/net/sctp_vrf.sh b/tools/testing/selftests/net/sctp_vrf.sh 
+index c854034b6aa16..667b211aa8a11 100755 +--- a/tools/testing/selftests/net/sctp_vrf.sh ++++ b/tools/testing/selftests/net/sctp_vrf.sh +@@ -20,9 +20,9 @@ setup() { + modprobe sctp_diag + setup_ns CLIENT_NS1 CLIENT_NS2 SERVER_NS + +- ip net exec $CLIENT_NS1 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null +- ip net exec $CLIENT_NS2 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null +- ip net exec $SERVER_NS sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null ++ ip net exec $CLIENT_NS1 sysctl -wq net.ipv6.conf.default.accept_dad=0 ++ ip net exec $CLIENT_NS2 sysctl -wq net.ipv6.conf.default.accept_dad=0 ++ ip net exec $SERVER_NS sysctl -wq net.ipv6.conf.default.accept_dad=0 + + ip -n $SERVER_NS link add veth1 type veth peer name veth1 netns $CLIENT_NS1 + ip -n $SERVER_NS link add veth2 type veth peer name veth1 netns $CLIENT_NS2 +@@ -62,17 +62,40 @@ setup() { + } + + cleanup() { +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null ++ wait_client $CLIENT_NS1 ++ wait_client $CLIENT_NS2 ++ stop_server + cleanup_ns $CLIENT_NS1 $CLIENT_NS2 $SERVER_NS + } + +-wait_server() { ++start_server() { + local IFACE=$1 + local CNT=0 + +- until ip netns exec $SERVER_NS ss -lS src $SERVER_IP:$SERVER_PORT | \ +- grep LISTEN | grep "$IFACE" 2>&1 >/dev/null; do +- [ $((CNT++)) = "20" ] && { RET=3; return $RET; } ++ ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP $SERVER_PORT $IFACE & ++ disown ++ until ip netns exec $SERVER_NS ss -SlH | grep -q "$IFACE"; do ++ [ $((CNT++)) -eq 30 ] && { RET=3; return $RET; } ++ sleep 0.1 ++ done ++} ++ ++stop_server() { ++ local CNT=0 ++ ++ ip netns exec $SERVER_NS pkill sctp_hello ++ while ip netns exec $SERVER_NS ss -SaH | grep -q .; do ++ [ $((CNT++)) -eq 30 ] && break ++ sleep 0.1 ++ done ++} ++ ++wait_client() { ++ local CLIENT_NS=$1 ++ local CNT=0 ++ ++ while ip netns exec $CLIENT_NS ss -SaH | grep -q .; do ++ [ $((CNT++)) -eq 30 ] && break + sleep 0.1 + done + } +@@ -81,14 +104,12 @@ do_test() { 
+ local CLIENT_NS=$1 + local IFACE=$2 + +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE 2>&1 >/dev/null & +- disown +- wait_server $IFACE || return $RET ++ start_server $IFACE || return $RET + timeout 3 ip netns exec $CLIENT_NS ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT + RET=$? ++ wait_client $CLIENT_NS ++ stop_server + return $RET + } + +@@ -96,25 +117,21 @@ do_testx() { + local IFACE1=$1 + local IFACE2=$2 + +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE1 2>&1 >/dev/null & +- disown +- wait_server $IFACE1 || return $RET +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE2 2>&1 >/dev/null & +- disown +- wait_server $IFACE2 || return $RET ++ start_server $IFACE1 || return $RET ++ start_server $IFACE2 || return $RET + timeout 3 ip netns exec $CLIENT_NS1 ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null && \ ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT && \ + timeout 3 ip netns exec $CLIENT_NS2 ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT + RET=$? 
++ wait_client $CLIENT_NS1 ++ wait_client $CLIENT_NS2 ++ stop_server + return $RET + } + + testup() { +- ip netns exec $SERVER_NS sysctl -w net.sctp.l3mdev_accept=1 2>&1 >/dev/null ++ ip netns exec $SERVER_NS sysctl -wq net.sctp.l3mdev_accept=1 + echo -n "TEST 01: nobind, connect from client 1, l3mdev_accept=1, Y " + do_test $CLIENT_NS1 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" +@@ -123,7 +140,7 @@ testup() { + do_test $CLIENT_NS2 && { echo "[FAIL]"; return $RET; } + echo "[PASS]" + +- ip netns exec $SERVER_NS sysctl -w net.sctp.l3mdev_accept=0 2>&1 >/dev/null ++ ip netns exec $SERVER_NS sysctl -wq net.sctp.l3mdev_accept=0 + echo -n "TEST 03: nobind, connect from client 1, l3mdev_accept=0, N " + do_test $CLIENT_NS1 && { echo "[FAIL]"; return $RET; } + echo "[PASS]" +@@ -160,7 +177,7 @@ testup() { + do_testx vrf-1 vrf-2 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" + +- echo -n "TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N " ++ echo -n "TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, Y " + do_testx vrf-2 vrf-1 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" + } +-- +2.51.0 + diff --git a/queue-6.17/series b/queue-6.17/series index 2c56c90e64..22096c5823 100644 --- a/queue-6.17/series +++ b/queue-6.17/series @@ -31,3 +31,31 @@ smb-client-limit-the-range-of-info-receive_credit_ta.patch smb-client-make-use-of-ib_wc_status_msg-and-skip-ib_.patch smb-server-let-smb_direct_flush_send_list-invalidate.patch unbreak-make-tools-for-user-space-targets.patch +platform-mellanox-mlxbf-pmc-add-sysfs_attr_init-to-c.patch +cpufreq-amd-pstate-fix-a-regression-leading-to-epp-0.patch +net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch +rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch +erofs-fix-crafted-invalid-cases-for-encoded-extents.patch +net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch +net-enetc-correct-the-value-of-enetc_rxb_truesize.patch +dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch 
+net-phy-realtek-fix-rtl8221b-vm-cg-name.patch +can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch +can-esd-acc_start_xmit-use-can_dev_dropped_skb-inste.patch +can-rockchip-canfd-rkcanfd_start_xmit-use-can_dev_dr.patch +selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch +net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch +net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-22401 +net-smc-fix-general-protection-fault-in-__smc_diag_d.patch +net-ethernet-ti-am65-cpts-fix-timestamp-loss-due-to-.patch +arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch +erofs-avoid-infinite-loops-due-to-corrupted-subpage-.patch +net-hibmcge-select-fixed_phy.patch +ptp-ocp-fix-typo-using-index-1-instead-of-i-in-sma-i.patch +sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch +net-hsr-prevent-creation-of-hsr-device-with-slaves-f.patch +espintcp-use-datagram_poll_queue-for-socket-readines.patch +net-datagram-introduce-datagram_poll_queue-for-custo.patch +ovpn-use-datagram_poll_queue-for-socket-readiness-in.patch +net-phy-micrel-always-set-shared-phydev-for-lan8814.patch +net-mlx5-fix-ipsec-cleanup-over-mpv-device.patch diff --git a/queue-6.6/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch b/queue-6.6/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch new file mode 100644 index 0000000000..0f7fb0d86a --- /dev/null +++ b/queue-6.6/arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch @@ -0,0 +1,71 @@ +From 25be4c7051794367db243c86719f0a087619aac3 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:37:12 +0800 +Subject: arm64, mm: avoid always making PTE dirty in pte_mkwrite() + +From: Huang Ying + +[ Upstream commit 143937ca51cc6ae2fccc61a1cb916abb24cd34f5 ] + +Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may +mark some pages that are never written dirty wrongly. 
For example, +do_swap_page() may map the exclusive pages with writable and clean PTEs +if the VMA is writable and the page fault is for read access. +However, current pte_mkwrite_novma() implementation always dirties the +PTE. This may cause unnecessary disk writing if the pages are +never written before being reclaimed. + +So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the +PTE_DIRTY bit is set to make it possible to make the PTE writable and +clean. + +The current behavior was introduced in commit 73e86cb03cf2 ("arm64: +Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, +pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only +clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits +are set. + +To test the performance impact of the patch, on an arm64 server +machine, run 16 redis-server processes on socket 1 and 16 +memtier_benchmark processes on socket 0 with mostly get +transactions (that is, redis-server will mostly read memory only). +The memory footprint of redis-server is larger than the available +memory, so swap out/in will be triggered. Test results show that the +patch can avoid most swapping out because the pages are mostly clean. +And the benchmark throughput improves ~23.9% in the test. 
+ +Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") +Signed-off-by: Huang Ying +Cc: Will Deacon +Cc: Anshuman Khandual +Cc: Ryan Roberts +Cc: Gavin Shan +Cc: Ard Biesheuvel +Cc: Matthew Wilcox (Oracle) +Cc: Yicong Yang +Cc: linux-arm-kernel@lists.infradead.org +Cc: linux-kernel@vger.kernel.org +Reviewed-by: Catalin Marinas +Signed-off-by: Catalin Marinas +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/pgtable.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h +index 0212129b13d07..92e43b3a10df9 100644 +--- a/arch/arm64/include/asm/pgtable.h ++++ b/arch/arm64/include/asm/pgtable.h +@@ -184,7 +184,8 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot) + static inline pte_t pte_mkwrite_novma(pte_t pte) + { + pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); +- pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); ++ if (pte_sw_dirty(pte)) ++ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); + return pte; + } + +-- +2.51.0 + diff --git a/queue-6.6/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch b/queue-6.6/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch new file mode 100644 index 0000000000..1573250abc --- /dev/null +++ b/queue-6.6/can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch @@ -0,0 +1,43 @@ +From 2b8b4f0174aa5e6b44a3e5af0737ff23c8b414f5 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 16:28:49 +0200 +Subject: can: bxcan: bxcan_start_xmit(): use can_dev_dropped_skb() instead of + can_dropped_invalid_skb() + +From: Marc Kleine-Budde + +[ Upstream commit 3a20c444cd123e820e10ae22eeaf00e189315aa1 ] + +In addition to can_dropped_invalid_skb(), the helper function +can_dev_dropped_skb() checks whether the device is in listen-only mode and +discards the skb accordingly. + +Replace can_dropped_invalid_skb() by can_dev_dropped_skb() to also drop +skbs in for listen-only mode. 
+ +Reported-by: Marc Kleine-Budde +Closes: https://lore.kernel.org/all/20251017-bizarre-enchanted-quokka-f3c704-mkl@pengutronix.de/ +Fixes: f00647d8127b ("can: bxcan: add support for ST bxCAN controller") +Link: https://patch.msgid.link/20251017-fix-skb-drop-check-v1-1-556665793fa4@pengutronix.de +Signed-off-by: Marc Kleine-Budde +Signed-off-by: Sasha Levin +--- + drivers/net/can/bxcan.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/can/bxcan.c b/drivers/net/can/bxcan.c +index 49cf9682b9254..247d02447fc3f 100644 +--- a/drivers/net/can/bxcan.c ++++ b/drivers/net/can/bxcan.c +@@ -842,7 +842,7 @@ static netdev_tx_t bxcan_start_xmit(struct sk_buff *skb, + u32 id; + int i, j; + +- if (can_dropped_invalid_skb(ndev, skb)) ++ if (can_dev_dropped_skb(ndev, skb)) + return NETDEV_TX_OK; + + if (bxcan_tx_busy(priv)) +-- +2.51.0 + diff --git a/queue-6.6/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch b/queue-6.6/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch new file mode 100644 index 0000000000..07baaab56e --- /dev/null +++ b/queue-6.6/dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch @@ -0,0 +1,50 @@ +From f1299efc498dadf1ae905ec7333f732fa55bd2f7 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:58:07 +0300 +Subject: dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path + +From: Ioana Ciornei + +[ Upstream commit 902e81e679d86846a2404630d349709ad9372d0d ] + +The blamed commit increased the needed headroom to account for +alignment. This means that the size required to always align a Tx buffer +was added inside the dpaa2_eth_needed_headroom() function. By doing +that, a manual adjustment of the pointer passed to PTR_ALIGN() was no +longer correct since the 'buffer_start' variable was already pointing +to the start of the skb's memory. + +The behavior of the dpaa2-eth driver without this patch was to drop +frames on Tx even when the headroom was matching the 128 bytes +necessary. 
Fix this by removing the manual adjust of 'buffer_start' from +the PTR_MODE call. + +Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fastmail.com/T/#u +Fixes: f422abe3f23d ("dpaa2-eth: increase the needed headroom to account for alignment") +Signed-off-by: Ioana Ciornei +Tested-by: Mathew McBride +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +index 81a99f4824d05..61bd2389ef4b5 100644 +--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c ++++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +@@ -1077,8 +1077,7 @@ static int dpaa2_eth_build_single_fd(struct dpaa2_eth_priv *priv, + dma_addr_t addr; + + buffer_start = skb->data - dpaa2_eth_needed_headroom(skb); +- aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN, +- DPAA2_ETH_TX_BUF_ALIGN); ++ aligned_start = PTR_ALIGN(buffer_start, DPAA2_ETH_TX_BUF_ALIGN); + if (aligned_start >= skb->head) + buffer_start = aligned_start; + else +-- +2.51.0 + diff --git a/queue-6.6/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch b/queue-6.6/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch new file mode 100644 index 0000000000..1fd5fe8bbb --- /dev/null +++ b/queue-6.6/net-enetc-correct-the-value-of-enetc_rxb_truesize.patch @@ -0,0 +1,54 @@ +From 43ce71465388de4b943faba51c3d84cc4dfc6967 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 16:01:31 +0800 +Subject: net: enetc: correct the value of ENETC_RXB_TRUESIZE + +From: Wei Fang + +[ Upstream commit e59bc32df2e989f034623a580e30a2a72af33b3f ] + +The ENETC RX ring uses the page halves flipping mechanism, each page is +split into two halves for the RX ring to use. 
And ENETC_RXB_TRUESIZE is +defined to 2048 to indicate the size of half a page. However, the page +size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, +but it could be configured to 16K or 64K. + +When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, +and the RX ring will always use the first half of the page. This is not +consistent with the description in the relevant kernel doc and commit +messages. + +This issue is invisible in most cases, but if users want to increase +PAGE_SIZE to receive a Jumbo frame with a single buffer for some use +cases, it will not work as expected, because the buffer size of each +RX BD is fixed to 2048 bytes. + +Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE +to (PAGE_SIZE >> 1), as described in the comment. + +Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") +Signed-off-by: Wei Fang +Reviewed-by: Claudiu Manoil +Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h +index 860ecee302f1a..dcf3e4b4e3f55 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.h ++++ b/drivers/net/ethernet/freescale/enetc/enetc.h +@@ -41,7 +41,7 @@ struct enetc_tx_swbd { + }; + + #define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE +-#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */ ++#define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) + #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ + #define ENETC_RXB_DMA_SIZE \ + (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) +-- +2.51.0 + diff --git a/queue-6.6/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch b/queue-6.6/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch new file mode 100644 index 
0000000000..31e23a0732 --- /dev/null +++ b/queue-6.6/net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch @@ -0,0 +1,158 @@ +From fb4e79cfb1588c06c7d601f1291173ae316336df Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 10:14:27 +0800 +Subject: net: enetc: fix the deadlock of enetc_mdio_lock + +From: Jianpeng Chang + +[ Upstream commit 50bd33f6b3922a6b760aa30d409cae891cec8fb5 ] + +After applying the workaround for err050089, the LS1028A platform +experiences RCU stalls on RT kernel. This issue is caused by the +recursive acquisition of the read lock enetc_mdio_lock. Here list some +of the call stacks identified under the enetc_poll path that may lead to +a deadlock: + +enetc_poll + -> enetc_lock_mdio + -> enetc_clean_rx_ring OR napi_complete_done + -> napi_gro_receive + -> enetc_start_xmit + -> enetc_lock_mdio + -> enetc_map_tx_buffs + -> enetc_unlock_mdio + -> enetc_unlock_mdio + +After enetc_poll acquires the read lock, a higher-priority writer attempts +to acquire the lock, causing preemption. The writer detects that a +read lock is already held and is scheduled out. However, readers under +enetc_poll cannot acquire the read lock again because a writer is already +waiting, leading to a thread hang. + +Currently, the deadlock is avoided by adjusting enetc_lock_mdio to prevent +recursive lock acquisition. 
+ +Fixes: 6d36ecdbc441 ("net: enetc: take the MDIO lock only once per NAPI poll cycle") +Signed-off-by: Jianpeng Chang +Acked-by: Wei Fang +Link: https://patch.msgid.link/20251015021427.180757-1-jianpeng.chang.cn@windriver.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/freescale/enetc/enetc.c | 25 ++++++++++++++++---- + 1 file changed, 21 insertions(+), 4 deletions(-) + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c +index 49c61aa920b02..7accf3a3e9f0d 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.c ++++ b/drivers/net/ethernet/freescale/enetc/enetc.c +@@ -1246,6 +1246,8 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + /* next descriptor to process */ + i = rx_ring->next_to_clean; + ++ enetc_lock_mdio(); ++ + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd; + struct sk_buff *skb; +@@ -1281,7 +1283,9 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + rx_byte_cnt += skb->len + ETH_HLEN; + rx_frm_cnt++; + ++ enetc_unlock_mdio(); + napi_gro_receive(napi, skb); ++ enetc_lock_mdio(); + } + + rx_ring->next_to_clean = i; +@@ -1289,6 +1293,8 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + ++ enetc_unlock_mdio(); ++ + return rx_frm_cnt; + } + +@@ -1598,6 +1604,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + /* next descriptor to process */ + i = rx_ring->next_to_clean; + ++ enetc_lock_mdio(); ++ + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd, *orig_rxbd; + int orig_i, orig_cleaned_cnt; +@@ -1657,7 +1665,9 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + if (unlikely(!skb)) + goto out; + ++ enetc_unlock_mdio(); + napi_gro_receive(napi, skb); ++ enetc_lock_mdio(); + break; + case XDP_TX: + tx_ring = priv->xdp_tx_ring[rx_ring->index]; +@@ -1692,7 +1702,9 @@ static int 
enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + } + break; + case XDP_REDIRECT: ++ enetc_unlock_mdio(); + err = xdp_do_redirect(rx_ring->ndev, &xdp_buff, prog); ++ enetc_lock_mdio(); + if (unlikely(err)) { + enetc_xdp_drop(rx_ring, orig_i, i); + rx_ring->stats.xdp_redirect_failures++; +@@ -1712,8 +1724,11 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + +- if (xdp_redirect_frm_cnt) ++ if (xdp_redirect_frm_cnt) { ++ enetc_unlock_mdio(); + xdp_do_flush(); ++ enetc_lock_mdio(); ++ } + + if (xdp_tx_frm_cnt) + enetc_update_tx_ring_tail(tx_ring); +@@ -1722,6 +1737,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + enetc_refill_rx_ring(rx_ring, enetc_bd_unused(rx_ring) - + rx_ring->xdp.xdp_tx_in_flight); + ++ enetc_unlock_mdio(); ++ + return rx_frm_cnt; + } + +@@ -1740,6 +1757,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) + for (i = 0; i < v->count_tx_rings; i++) + if (!enetc_clean_tx_ring(&v->tx_ring[i], budget)) + complete = false; ++ enetc_unlock_mdio(); + + prog = rx_ring->xdp.prog; + if (prog) +@@ -1751,10 +1769,8 @@ static int enetc_poll(struct napi_struct *napi, int budget) + if (work_done) + v->rx_napi_work = true; + +- if (!complete) { +- enetc_unlock_mdio(); ++ if (!complete) + return budget; +- } + + napi_complete_done(napi, work_done); + +@@ -1763,6 +1779,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) + + v->rx_napi_work = false; + ++ enetc_lock_mdio(); + /* enable interrupts */ + enetc_wr_reg_hot(v->rbier, ENETC_RBIER_RXTIE); + +-- +2.51.0 + diff --git a/queue-6.6/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch b/queue-6.6/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch new file mode 100644 index 0000000000..7fb0914c07 --- /dev/null +++ b/queue-6.6/net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch @@ -0,0 +1,65 @@ +From 5e8aa0cda90b9c9c7da8ef5ff77b5ddd7440e67f Mon Sep 17 
00:00:00 2001 +From: Sasha Levin +Date: Tue, 14 Oct 2025 13:46:49 -0700 +Subject: net/mlx5e: Return 1 instead of 0 in invalid case in + mlx5e_mpwrq_umr_entry_size() + +From: Nathan Chancellor + +[ Upstream commit aaf043a5688114703ae2c1482b92e7e0754d684e ] + +When building with Clang 20 or newer, there are some objtool warnings +from unexpected fallthroughs to other functions: + + vmlinux.o: warning: objtool: mlx5e_mpwrq_mtts_per_wqe() falls through to next function mlx5e_mpwrq_max_num_entries() + vmlinux.o: warning: objtool: mlx5e_mpwrq_max_log_rq_size() falls through to next function mlx5e_get_linear_rq_headroom() + +LLVM 20 contains an (admittedly problematic [1]) optimization [2] to +convert divide by zero into the equivalent of __builtin_unreachable(), +which invokes undefined behavior and destroys code generation when it is +encountered in a control flow graph. + +mlx5e_mpwrq_umr_entry_size() returns 0 in the default case of an +unrecognized mlx5e_mpwrq_umr_mode value. mlx5e_mpwrq_mtts_per_wqe(), +which is inlined into mlx5e_mpwrq_max_log_rq_size(), uses the result of +mlx5e_mpwrq_umr_entry_size() in a divide operation without checking for +zero, so LLVM is able to infer there will be a divide by zero in this +case and invokes undefined behavior. While there is some proposed work +to isolate this undefined behavior and avoid the destructive code +generation that results in these objtool warnings, code should still be +defensive against divide by zero. + +As the WARN_ONCE() implies that an invalid value should be handled +gracefully, return 1 instead of 0 in the default case so that the +result of this division operation is always valid. 
+ +Fixes: 168723c1f8d6 ("net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters") +Link: https://lore.kernel.org/CAGG=3QUk8-Ak7YKnRziO4=0z=1C_7+4jF+6ZeDQ9yF+kuTOHOQ@mail.gmail.com/ [1] +Link: https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448 [2] +Closes: https://github.com/ClangBuiltLinux/linux/issues/2131 +Closes: https://github.com/ClangBuiltLinux/linux/issues/2132 +Signed-off-by: Nathan Chancellor +Reviewed-by: Tariq Toukan +Link: https://patch.msgid.link/20251014-mlx5e-avoid-zero-div-from-mlx5e_mpwrq_umr_entry_size-v1-1-dc186b8819ef@kernel.org +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +index dcd5db907f102..9c22d64af6853 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +@@ -98,7 +98,7 @@ u8 mlx5e_mpwrq_umr_entry_size(enum mlx5e_mpwrq_umr_mode mode) + return sizeof(struct mlx5_ksm) * 4; + } + WARN_ONCE(1, "MPWRQ UMR mode %d is not known\n", mode); +- return 0; ++ return 1; + } + + u8 mlx5e_mpwrq_log_wqe_sz(struct mlx5_core_dev *mdev, u8 page_shift, +-- +2.51.0 + diff --git a/queue-6.6/net-mlx5e-reuse-per-rq-xdp-buffer-to-avoid-stack-zer.patch b/queue-6.6/net-mlx5e-reuse-per-rq-xdp-buffer-to-avoid-stack-zer.patch new file mode 100644 index 0000000000..64945ac788 --- /dev/null +++ b/queue-6.6/net-mlx5e-reuse-per-rq-xdp-buffer-to-avoid-stack-zer.patch @@ -0,0 +1,327 @@ +From 854eab43bdfa91cf0373d80f3a109d9f93570399 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 14 May 2025 23:03:52 +0300 +Subject: net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead + +From: Carolina Jubran + +[ Upstream commit b66b76a82c8879d764ab89adc21ee855ffd292d5 ] + +CONFIG_INIT_STACK_ALL_ZERO introduces 
a performance cost by +zero-initializing all stack variables on function entry. The mlx5 XDP +RX path previously allocated a struct mlx5e_xdp_buff on the stack per +received CQE, resulting in measurable performance degradation under +this config. + +This patch reuses a mlx5e_xdp_buff stored in the mlx5e_rq struct, +avoiding per-CQE stack allocations and repeated zeroing. + +With this change, XDP_DROP and XDP_TX performance matches that of +kernels built without CONFIG_INIT_STACK_ALL_ZERO. + +Performance was measured on a ConnectX-6Dx using a single RX channel +(1 CPU at 100% usage) at ~50 Mpps. The baseline results were taken from +net-next-6.15. + +Stack zeroing disabled: +- XDP_DROP: + * baseline: 31.47 Mpps + * baseline + per-RQ allocation: 32.31 Mpps (+2.68%) + +- XDP_TX: + * baseline: 12.41 Mpps + * baseline + per-RQ allocation: 12.95 Mpps (+4.30%) + +Stack zeroing enabled: +- XDP_DROP: + * baseline: 24.32 Mpps + * baseline + per-RQ allocation: 32.27 Mpps (+32.7%) + +- XDP_TX: + * baseline: 11.80 Mpps + * baseline + per-RQ allocation: 12.24 Mpps (+3.72%) + +Reported-by: Sebastiano Miano +Reported-by: Samuel Dobron +Link: https://lore.kernel.org/all/CAMENy5pb8ea+piKLg5q5yRTMZacQqYWAoVLE1FE9WhQPq92E0g@mail.gmail.com/ +Signed-off-by: Carolina Jubran +Reviewed-by: Dragos Tatulea +Signed-off-by: Tariq Toukan +Acked-by: Jesper Dangaard Brouer +Link: https://patch.msgid.link/1747253032-663457-1-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Stable-dep-of: afd5ba577c10 ("net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ") +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/mellanox/mlx5/core/en.h | 7 ++ + .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 6 -- + .../net/ethernet/mellanox/mlx5/core/en_rx.c | 81 ++++++++++--------- + 3 files changed, 51 insertions(+), 43 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h +index 9cf33ae48c216..455d02b6500d0 
100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h +@@ -519,6 +519,12 @@ struct mlx5e_xdpsq { + struct mlx5e_channel *channel; + } ____cacheline_aligned_in_smp; + ++struct mlx5e_xdp_buff { ++ struct xdp_buff xdp; ++ struct mlx5_cqe64 *cqe; ++ struct mlx5e_rq *rq; ++}; ++ + struct mlx5e_ktls_resync_resp; + + struct mlx5e_icosq { +@@ -717,6 +723,7 @@ struct mlx5e_rq { + struct mlx5e_xdpsq *xdpsq; + DECLARE_BITMAP(flags, 8); + struct page_pool *page_pool; ++ struct mlx5e_xdp_buff mxbuf; + + /* AF_XDP zero-copy */ + struct xsk_buff_pool *xsk_pool; +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +index ecfe93a479da8..38e9ff6aa3aee 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +@@ -44,12 +44,6 @@ + (MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \ + sizeof(struct mlx5_wqe_inline_seg)) + +-struct mlx5e_xdp_buff { +- struct xdp_buff xdp; +- struct mlx5_cqe64 *cqe; +- struct mlx5e_rq *rq; +-}; +- + /* XDP packets can be transmitted in different ways. On completion, we need to + * distinguish between them to clean up things in a proper way. 
+ */ +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +index 8278395ee20a0..711c95074f05c 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +@@ -1697,17 +1697,17 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, + + prog = rcu_dereference(rq->xdp_prog); + if (prog) { +- struct mlx5e_xdp_buff mxbuf; ++ struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf; + + net_prefetchw(va); /* xdp_frame data area */ + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, rq->buff.frame0_sz, +- cqe_bcnt, &mxbuf); +- if (mlx5e_xdp_handle(rq, prog, &mxbuf)) ++ cqe_bcnt, mxbuf); ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) + return NULL; /* page/packet was consumed by XDP */ + +- rx_headroom = mxbuf.xdp.data - mxbuf.xdp.data_hard_start; +- metasize = mxbuf.xdp.data - mxbuf.xdp.data_meta; +- cqe_bcnt = mxbuf.xdp.data_end - mxbuf.xdp.data; ++ rx_headroom = mxbuf->xdp.data - mxbuf->xdp.data_hard_start; ++ metasize = mxbuf->xdp.data - mxbuf->xdp.data_meta; ++ cqe_bcnt = mxbuf->xdp.data_end - mxbuf->xdp.data; + } + frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); + skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize); +@@ -1726,11 +1726,11 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) + { + struct mlx5e_rq_frag_info *frag_info = &rq->wqe.info.arr[0]; ++ struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf; + struct mlx5e_wqe_frag_info *head_wi = wi; + u16 rx_headroom = rq->buff.headroom; + struct mlx5e_frag_page *frag_page; + struct skb_shared_info *sinfo; +- struct mlx5e_xdp_buff mxbuf; + u32 frag_consumed_bytes; + struct bpf_prog *prog; + struct sk_buff *skb; +@@ -1750,8 +1750,8 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + net_prefetch(va + rx_headroom); + + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, 
rq->buff.frame0_sz, +- frag_consumed_bytes, &mxbuf); +- sinfo = xdp_get_shared_info_from_buff(&mxbuf.xdp); ++ frag_consumed_bytes, mxbuf); ++ sinfo = xdp_get_shared_info_from_buff(&mxbuf->xdp); + truesize = 0; + + cqe_bcnt -= frag_consumed_bytes; +@@ -1763,8 +1763,9 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + + frag_consumed_bytes = min_t(u32, frag_info->frag_size, cqe_bcnt); + +- mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf.xdp, frag_page, +- wi->offset, frag_consumed_bytes); ++ mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf->xdp, ++ frag_page, wi->offset, ++ frag_consumed_bytes); + truesize += frag_info->frag_stride; + + cqe_bcnt -= frag_consumed_bytes; +@@ -1773,7 +1774,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + } + + prog = rcu_dereference(rq->xdp_prog); +- if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) { ++ if (prog && mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + struct mlx5e_wqe_frag_info *pwi; + +@@ -1783,21 +1784,23 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + return NULL; /* page/packet was consumed by XDP */ + } + +- skb = mlx5e_build_linear_skb(rq, mxbuf.xdp.data_hard_start, rq->buff.frame0_sz, +- mxbuf.xdp.data - mxbuf.xdp.data_hard_start, +- mxbuf.xdp.data_end - mxbuf.xdp.data, +- mxbuf.xdp.data - mxbuf.xdp.data_meta); ++ skb = mlx5e_build_linear_skb( ++ rq, mxbuf->xdp.data_hard_start, rq->buff.frame0_sz, ++ mxbuf->xdp.data - mxbuf->xdp.data_hard_start, ++ mxbuf->xdp.data_end - mxbuf->xdp.data, ++ mxbuf->xdp.data - mxbuf->xdp.data_meta); + if (unlikely(!skb)) + return NULL; + + skb_mark_for_recycle(skb); + head_wi->frag_page->frags++; + +- if (xdp_buff_has_frags(&mxbuf.xdp)) { ++ if (xdp_buff_has_frags(&mxbuf->xdp)) { + /* sinfo->nr_frags is reset by build_skb, calculate again. 
*/ + xdp_update_skb_shared_info(skb, wi - head_wi - 1, + sinfo->xdp_frags_size, truesize, +- xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); ++ xdp_buff_is_frag_pfmemalloc( ++ &mxbuf->xdp)); + + for (struct mlx5e_wqe_frag_info *pwi = head_wi + 1; pwi < wi; pwi++) + pwi->frag_page->frags++; +@@ -2003,10 +2006,10 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + struct mlx5e_frag_page *frag_page = &wi->alloc_units.frag_pages[page_idx]; + u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt); + struct mlx5e_frag_page *head_page = frag_page; ++ struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf; + u32 frag_offset = head_offset; + u32 byte_cnt = cqe_bcnt; + struct skb_shared_info *sinfo; +- struct mlx5e_xdp_buff mxbuf; + unsigned int truesize = 0; + struct bpf_prog *prog; + struct sk_buff *skb; +@@ -2052,9 +2055,10 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } + } + +- mlx5e_fill_mxbuf(rq, cqe, va, linear_hr, linear_frame_sz, linear_data_len, &mxbuf); ++ mlx5e_fill_mxbuf(rq, cqe, va, linear_hr, linear_frame_sz, ++ linear_data_len, mxbuf); + +- sinfo = xdp_get_shared_info_from_buff(&mxbuf.xdp); ++ sinfo = xdp_get_shared_info_from_buff(&mxbuf->xdp); + + while (byte_cnt) { + /* Non-linear mode, hence non-XSK, which always uses PAGE_SIZE. 
*/ +@@ -2065,7 +2069,8 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + else + truesize += ALIGN(pg_consumed_bytes, BIT(rq->mpwqe.log_stride_sz)); + +- mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf.xdp, frag_page, frag_offset, ++ mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf->xdp, ++ frag_page, frag_offset, + pg_consumed_bytes); + byte_cnt -= pg_consumed_bytes; + frag_offset = 0; +@@ -2073,7 +2078,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } + + if (prog) { +- if (mlx5e_xdp_handle(rq, prog, &mxbuf)) { ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + struct mlx5e_frag_page *pfp; + +@@ -2086,10 +2091,10 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + return NULL; /* page/packet was consumed by XDP */ + } + +- skb = mlx5e_build_linear_skb(rq, mxbuf.xdp.data_hard_start, +- linear_frame_sz, +- mxbuf.xdp.data - mxbuf.xdp.data_hard_start, 0, +- mxbuf.xdp.data - mxbuf.xdp.data_meta); ++ skb = mlx5e_build_linear_skb( ++ rq, mxbuf->xdp.data_hard_start, linear_frame_sz, ++ mxbuf->xdp.data - mxbuf->xdp.data_hard_start, 0, ++ mxbuf->xdp.data - mxbuf->xdp.data_meta); + if (unlikely(!skb)) { + mlx5e_page_release_fragmented(rq, &wi->linear_page); + return NULL; +@@ -2099,13 +2104,14 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + wi->linear_page.frags++; + mlx5e_page_release_fragmented(rq, &wi->linear_page); + +- if (xdp_buff_has_frags(&mxbuf.xdp)) { ++ if (xdp_buff_has_frags(&mxbuf->xdp)) { + struct mlx5e_frag_page *pagep; + + /* sinfo->nr_frags is reset by build_skb, calculate again. 
*/ + xdp_update_skb_shared_info(skb, frag_page - head_page, + sinfo->xdp_frags_size, truesize, +- xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); ++ xdp_buff_is_frag_pfmemalloc( ++ &mxbuf->xdp)); + + pagep = head_page; + do +@@ -2116,12 +2122,13 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } else { + dma_addr_t addr; + +- if (xdp_buff_has_frags(&mxbuf.xdp)) { ++ if (xdp_buff_has_frags(&mxbuf->xdp)) { + struct mlx5e_frag_page *pagep; + + xdp_update_skb_shared_info(skb, sinfo->nr_frags, + sinfo->xdp_frags_size, truesize, +- xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); ++ xdp_buff_is_frag_pfmemalloc( ++ &mxbuf->xdp)); + + pagep = frag_page - sinfo->nr_frags; + do +@@ -2171,20 +2178,20 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + + prog = rcu_dereference(rq->xdp_prog); + if (prog) { +- struct mlx5e_xdp_buff mxbuf; ++ struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf; + + net_prefetchw(va); /* xdp_frame data area */ + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, rq->buff.frame0_sz, +- cqe_bcnt, &mxbuf); +- if (mlx5e_xdp_handle(rq, prog, &mxbuf)) { ++ cqe_bcnt, mxbuf); ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) + frag_page->frags++; + return NULL; /* page/packet was consumed by XDP */ + } + +- rx_headroom = mxbuf.xdp.data - mxbuf.xdp.data_hard_start; +- metasize = mxbuf.xdp.data - mxbuf.xdp.data_meta; +- cqe_bcnt = mxbuf.xdp.data_end - mxbuf.xdp.data; ++ rx_headroom = mxbuf->xdp.data - mxbuf->xdp.data_hard_start; ++ metasize = mxbuf->xdp.data - mxbuf->xdp.data_meta; ++ cqe_bcnt = mxbuf->xdp.data_end - mxbuf->xdp.data; + } + frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); + skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize); +-- +2.51.0 + diff --git a/queue-6.6/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch b/queue-6.6/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch new file mode 100644 
index 0000000000..78f1faaec8 --- /dev/null +++ b/queue-6.6/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch @@ -0,0 +1,68 @@ +From a1636e0d5127ae83dc899612769e6f1cafba0579 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 22:55:39 +0300 +Subject: net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy + RQ + +From: Amery Hung + +[ Upstream commit afd5ba577c10639f62e8120df67dc70ea4b61176 ] + +XDP programs can release xdp_buff fragments when calling +bpf_xdp_adjust_tail(). The driver currently assumes the number of +fragments to be unchanged and may generate skb with wrong truesize or +containing invalid frags. Fix the bug by generating skb according to +xdp_buff after the XDP program runs. + +Fixes: ea5d49bdae8b ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ") +Reviewed-by: Dragos Tatulea +Signed-off-by: Amery Hung +Signed-off-by: Tariq Toukan +Link: https://patch.msgid.link/1760644540-899148-2-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../net/ethernet/mellanox/mlx5/core/en_rx.c | 25 ++++++++++++++----- + 1 file changed, 19 insertions(+), 6 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +index 711c95074f05c..54268892148d4 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +@@ -1774,14 +1774,27 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi + } + + prog = rcu_dereference(rq->xdp_prog); +- if (prog && mlx5e_xdp_handle(rq, prog, mxbuf)) { +- if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { +- struct mlx5e_wqe_frag_info *pwi; ++ if (prog) { ++ u8 nr_frags_free, old_nr_frags = sinfo->nr_frags; ++ ++ if (mlx5e_xdp_handle(rq, prog, mxbuf)) { ++ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, ++ rq->flags)) { ++ struct mlx5e_wqe_frag_info *pwi; ++ ++ wi -= 
old_nr_frags - sinfo->nr_frags; ++ ++ for (pwi = head_wi; pwi < wi; pwi++) ++ pwi->frag_page->frags++; ++ } ++ return NULL; /* page/packet was consumed by XDP */ ++ } + +- for (pwi = head_wi; pwi < wi; pwi++) +- pwi->frag_page->frags++; ++ nr_frags_free = old_nr_frags - sinfo->nr_frags; ++ if (unlikely(nr_frags_free)) { ++ wi -= nr_frags_free; ++ truesize -= nr_frags_free * frag_info->frag_stride; + } +- return NULL; /* page/packet was consumed by XDP */ + } + + skb = mlx5e_build_linear_skb( +-- +2.51.0 + diff --git a/queue-6.6/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-5039 b/queue-6.6/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-5039 new file mode 100644 index 0000000000..1f62c5328e --- /dev/null +++ b/queue-6.6/net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-5039 @@ -0,0 +1,120 @@ +From 6964f250f8a89c02e7af51dcd8897316ff7d6ce4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 16 Oct 2025 22:55:40 +0300 +Subject: net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for + striding RQ + +From: Amery Hung + +[ Upstream commit 87bcef158ac1faca1bd7e0104588e8e2956d10be ] + +XDP programs can change the layout of an xdp_buff through +bpf_xdp_adjust_tail() and bpf_xdp_adjust_head(). Therefore, the driver +cannot assume the size of the linear data area nor fragments. Fix the +bug in mlx5 by generating skb according to xdp_buff after XDP programs +run. + +Currently, when handling multi-buf XDP, the mlx5 driver assumes the +layout of an xdp_buff to be unchanged. That is, the linear data area +continues to be empty and fragments remain the same. This may cause +the driver to generate an erroneous skb or trigger a kernel +warning. When an XDP program adds linear data through +bpf_xdp_adjust_head(), the linear data will be ignored as +mlx5e_build_linear_skb() builds an skb without linear data and then +pulls data from fragments to fill the linear data area. 
When an XDP +program has shrunk the non-linear data through bpf_xdp_adjust_tail(), +the delta passed to __pskb_pull_tail() may exceed the actual nonlinear +data size and trigger the BUG_ON in it. + +To fix the issue, first record the original number of fragments. If the +number of fragments changes after the XDP program runs, rewind the end +fragment pointer by the difference and recalculate the truesize. Then, +build the skb with the linear data area matching the xdp_buff. Finally, +only pull data in if there is non-linear data and fill the linear part +up to 256 bytes. + +Fixes: f52ac7028bec ("net/mlx5e: RX, Add XDP multi-buffer support in Striding RQ") +Signed-off-by: Amery Hung +Signed-off-by: Tariq Toukan +Link: https://patch.msgid.link/1760644540-899148-3-git-send-email-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../net/ethernet/mellanox/mlx5/core/en_rx.c | 26 ++++++++++++++++--- + 1 file changed, 23 insertions(+), 3 deletions(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +index 54268892148d4..fcf7437174e18 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +@@ -2024,6 +2024,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + u32 byte_cnt = cqe_bcnt; + struct skb_shared_info *sinfo; + unsigned int truesize = 0; ++ u32 pg_consumed_bytes; + struct bpf_prog *prog; + struct sk_buff *skb; + u32 linear_frame_sz; +@@ -2075,7 +2076,8 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + + while (byte_cnt) { + /* Non-linear mode, hence non-XSK, which always uses PAGE_SIZE. 
*/ +- u32 pg_consumed_bytes = min_t(u32, PAGE_SIZE - frag_offset, byte_cnt); ++ pg_consumed_bytes = ++ min_t(u32, PAGE_SIZE - frag_offset, byte_cnt); + + if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) + truesize += pg_consumed_bytes; +@@ -2091,10 +2093,15 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + } + + if (prog) { ++ u8 nr_frags_free, old_nr_frags = sinfo->nr_frags; ++ u32 len; ++ + if (mlx5e_xdp_handle(rq, prog, mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + struct mlx5e_frag_page *pfp; + ++ frag_page -= old_nr_frags - sinfo->nr_frags; ++ + for (pfp = head_page; pfp < frag_page; pfp++) + pfp->frags++; + +@@ -2104,9 +2111,19 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + return NULL; /* page/packet was consumed by XDP */ + } + ++ nr_frags_free = old_nr_frags - sinfo->nr_frags; ++ if (unlikely(nr_frags_free)) { ++ frag_page -= nr_frags_free; ++ truesize -= (nr_frags_free - 1) * PAGE_SIZE + ++ ALIGN(pg_consumed_bytes, ++ BIT(rq->mpwqe.log_stride_sz)); ++ } ++ ++ len = mxbuf->xdp.data_end - mxbuf->xdp.data; ++ + skb = mlx5e_build_linear_skb( + rq, mxbuf->xdp.data_hard_start, linear_frame_sz, +- mxbuf->xdp.data - mxbuf->xdp.data_hard_start, 0, ++ mxbuf->xdp.data - mxbuf->xdp.data_hard_start, len, + mxbuf->xdp.data - mxbuf->xdp.data_meta); + if (unlikely(!skb)) { + mlx5e_page_release_fragmented(rq, &wi->linear_page); +@@ -2130,8 +2147,11 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w + do + pagep->frags++; + while (++pagep < frag_page); ++ ++ headlen = min_t(u16, MLX5E_RX_MAX_HEAD - len, ++ skb->data_len); ++ __pskb_pull_tail(skb, headlen); + } +- __pskb_pull_tail(skb, headlen); + } else { + dma_addr_t addr; + +-- +2.51.0 + diff --git a/queue-6.6/net-tree-wide-replace-xdp_do_flush_map-with-xdp_do_f.patch b/queue-6.6/net-tree-wide-replace-xdp_do_flush_map-with-xdp_do_f.patch new file mode 100644 index 
0000000000..1cdf2bbc43 --- /dev/null +++ b/queue-6.6/net-tree-wide-replace-xdp_do_flush_map-with-xdp_do_f.patch @@ -0,0 +1,285 @@ +From b64ee021443956bc0e65379754dcece8a0b01a1b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 8 Sep 2023 16:32:14 +0200 +Subject: net: Tree wide: Replace xdp_do_flush_map() with xdp_do_flush(). +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Sebastian Andrzej Siewior + +[ Upstream commit 7f04bd109d4c358a12b125bc79a6f0eac2e915ec ] + +xdp_do_flush_map() is deprecated and new code should use xdp_do_flush() +instead. + +Replace xdp_do_flush_map() with xdp_do_flush(). + +Cc: AngeloGioacchino Del Regno +Cc: Clark Wang +Cc: Claudiu Manoil +Cc: David Arinzon +Cc: Edward Cree +Cc: Felix Fietkau +Cc: Grygorii Strashko +Cc: Jassi Brar +Cc: Jesse Brandeburg +Cc: John Crispin +Cc: Leon Romanovsky +Cc: Lorenzo Bianconi +Cc: Louis Peens +Cc: Marcin Wojtas +Cc: Mark Lee +Cc: Matthias Brugger +Cc: NXP Linux Team +Cc: Noam Dagan +Cc: Russell King +Cc: Saeed Bishara +Cc: Saeed Mahameed +Cc: Sean Wang +Cc: Shay Agroskin +Cc: Shenwei Wang +Cc: Thomas Petazzoni +Cc: Tony Nguyen +Cc: Vladimir Oltean +Cc: Wei Fang +Signed-off-by: Sebastian Andrzej Siewior +Acked-by: Arthur Kiyanovski +Acked-by: Toke Høiland-Jørgensen +Acked-by: Ilias Apalodimas +Acked-by: Martin Habets +Acked-by: Jesper Dangaard Brouer +Link: https://lore.kernel.org/r/20230908143215.869913-2-bigeasy@linutronix.de +Signed-off-by: Jakub Kicinski +Stable-dep-of: 50bd33f6b392 ("net: enetc: fix the deadlock of enetc_mdio_lock") +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 +- + drivers/net/ethernet/freescale/enetc/enetc.c | 2 +- + drivers/net/ethernet/freescale/fec_main.c | 2 +- + drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +- + drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 +- + drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +- + drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 2 +- + 
drivers/net/ethernet/marvell/mvneta.c | 2 +- + drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 2 +- + drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +- + drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 2 +- + drivers/net/ethernet/netronome/nfp/nfd3/xsk.c | 2 +- + drivers/net/ethernet/sfc/efx_channels.c | 2 +- + drivers/net/ethernet/sfc/siena/efx_channels.c | 2 +- + drivers/net/ethernet/socionext/netsec.c | 2 +- + drivers/net/ethernet/ti/cpsw_priv.c | 2 +- + 16 files changed, 16 insertions(+), 16 deletions(-) + +diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c +index 0d201a57d7e29..dd9c50d3ec0f0 100644 +--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c ++++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c +@@ -1310,7 +1310,7 @@ static int ena_clean_rx_irq(struct ena_ring *rx_ring, struct napi_struct *napi, + } + + if (xdp_flags & ENA_XDP_REDIRECT) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + return work_done; + +diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c +index 0c09d82dbf00d..49c61aa920b02 100644 +--- a/drivers/net/ethernet/freescale/enetc/enetc.c ++++ b/drivers/net/ethernet/freescale/enetc/enetc.c +@@ -1713,7 +1713,7 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + rx_ring->stats.bytes += rx_byte_cnt; + + if (xdp_redirect_frm_cnt) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_tx_frm_cnt) + enetc_update_tx_ring_tail(tx_ring); +diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c +index 8352d9b6469f2..64cd72c194783 100644 +--- a/drivers/net/ethernet/freescale/fec_main.c ++++ b/drivers/net/ethernet/freescale/fec_main.c +@@ -1904,7 +1904,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) + rxq->bd.cur = bdp; + + if (xdp_result & FEC_ENET_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + return pkt_received; + } +diff --git 
a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c +index 6a9b47b005d29..99604379c87b6 100644 +--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c ++++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c +@@ -2398,7 +2398,7 @@ void i40e_update_rx_stats(struct i40e_ring *rx_ring, + void i40e_finalize_xdp_rx(struct i40e_ring *rx_ring, unsigned int xdp_res) + { + if (xdp_res & I40E_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_res & I40E_XDP_TX) { + struct i40e_ring *xdp_ring = +diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +index c8322fb6f2b37..7e06373e14d98 100644 +--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c ++++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +@@ -450,7 +450,7 @@ void ice_finalize_xdp_rx(struct ice_tx_ring *xdp_ring, unsigned int xdp_res, + struct ice_tx_buf *tx_buf = &xdp_ring->tx_buf[first_idx]; + + if (xdp_res & ICE_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_res & ICE_XDP_TX) { + if (static_branch_unlikely(&ice_xdp_locking_key)) +diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +index f245f3df40fca..99876b765b08b 100644 +--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c ++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +@@ -2421,7 +2421,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector, + } + + if (xdp_xmit & IXGBE_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_xmit & IXGBE_XDP_TX) { + struct ixgbe_ring *ring = ixgbe_determine_xdp_ring(adapter); +diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +index 7ef82c30e8571..9fdd19acf2242 100644 +--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c ++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +@@ -351,7 +351,7 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, + } + + if (xdp_xmit & 
IXGBE_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_xmit & IXGBE_XDP_TX) { + struct ixgbe_ring *ring = ixgbe_determine_xdp_ring(adapter); +diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c +index 165f76d1231c1..2941721b65152 100644 +--- a/drivers/net/ethernet/marvell/mvneta.c ++++ b/drivers/net/ethernet/marvell/mvneta.c +@@ -2520,7 +2520,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi, + mvneta_xdp_put_buff(pp, rxq, &xdp_buf, -1); + + if (ps.xdp_redirect) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (ps.rx_packets) + mvneta_update_stats(pp, &ps); +diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c +index fce57faf345ce..aabc39f7690f8 100644 +--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c ++++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c +@@ -4055,7 +4055,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi, + } + + if (xdp_ret & MVPP2_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (ps.rx_packets) { + struct mvpp2_pcpu_stats *stats = this_cpu_ptr(port->stats); +diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c +index aefe2af6f01d4..c843e6531449b 100644 +--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c ++++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c +@@ -2221,7 +2221,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget, + net_dim(ð->rx_dim, dim_sample); + + if (xdp_flush) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + return done; + } +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +index b723ff5e5249c..13c7ed1bb37e9 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +@@ -895,7 +895,7 @@ void mlx5e_xdp_rx_poll_complete(struct mlx5e_rq *rq) + mlx5e_xmit_xdp_doorbell(xdpsq); + + if 
(test_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags)) { +- xdp_do_flush_map(); ++ xdp_do_flush(); + __clear_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags); + } + } +diff --git a/drivers/net/ethernet/netronome/nfp/nfd3/xsk.c b/drivers/net/ethernet/netronome/nfp/nfd3/xsk.c +index 5d9db8c2a5b43..45be6954d5aae 100644 +--- a/drivers/net/ethernet/netronome/nfp/nfd3/xsk.c ++++ b/drivers/net/ethernet/netronome/nfp/nfd3/xsk.c +@@ -256,7 +256,7 @@ nfp_nfd3_xsk_rx(struct nfp_net_rx_ring *rx_ring, int budget, + nfp_net_xsk_rx_ring_fill_freelist(r_vec->rx_ring); + + if (xdp_redir) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (tx_ring->wr_ptr_add) + nfp_net_tx_xmit_more_flush(tx_ring); +diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c +index 8d2d7ea2ebefc..c9e17a8208a90 100644 +--- a/drivers/net/ethernet/sfc/efx_channels.c ++++ b/drivers/net/ethernet/sfc/efx_channels.c +@@ -1260,7 +1260,7 @@ static int efx_poll(struct napi_struct *napi, int budget) + + spent = efx_process_channel(channel, budget); + +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (spent < budget) { + if (efx_channel_has_rx_queue(channel) && +diff --git a/drivers/net/ethernet/sfc/siena/efx_channels.c b/drivers/net/ethernet/sfc/siena/efx_channels.c +index 1776f7f8a7a90..a7346e965bfe7 100644 +--- a/drivers/net/ethernet/sfc/siena/efx_channels.c ++++ b/drivers/net/ethernet/sfc/siena/efx_channels.c +@@ -1285,7 +1285,7 @@ static int efx_poll(struct napi_struct *napi, int budget) + + spent = efx_process_channel(channel, budget); + +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (spent < budget) { + if (efx_channel_has_rx_queue(channel) && +diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c +index f358ea0031936..b834b129639f0 100644 +--- a/drivers/net/ethernet/socionext/netsec.c ++++ b/drivers/net/ethernet/socionext/netsec.c +@@ -780,7 +780,7 @@ static void netsec_finalize_xdp_rx(struct netsec_priv *priv, u32 xdp_res, + u16 pkts) + { 
+ if (xdp_res & NETSEC_XDP_REDIR) +- xdp_do_flush_map(); ++ xdp_do_flush(); + + if (xdp_res & NETSEC_XDP_TX) + netsec_xdp_ring_tx_db(priv, pkts); +diff --git a/drivers/net/ethernet/ti/cpsw_priv.c b/drivers/net/ethernet/ti/cpsw_priv.c +index 0ec85635dfd60..764ed298b5708 100644 +--- a/drivers/net/ethernet/ti/cpsw_priv.c ++++ b/drivers/net/ethernet/ti/cpsw_priv.c +@@ -1360,7 +1360,7 @@ int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp, + * particular hardware is sharing a common queue, so the + * incoming device might change per packet. + */ +- xdp_do_flush_map(); ++ xdp_do_flush(); + break; + default: + bpf_warn_invalid_xdp_action(ndev, prog, act); +-- +2.51.0 + diff --git a/queue-6.6/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch b/queue-6.6/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch new file mode 100644 index 0000000000..38ef5116fb --- /dev/null +++ b/queue-6.6/rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch @@ -0,0 +1,56 @@ +From 97d403a4e76c5d74d63900e5e6718eebdf2ca80e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 15 Oct 2025 22:15:43 +0200 +Subject: rtnetlink: Allow deleting FDB entries in user namespace +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Johannes Wiesböck + +[ Upstream commit bf29555f5bdc017bac22ca66fcb6c9f46ec8788f ] + +Creating FDB entries is possible from a non-initial user namespace when +having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive +an EPERM because the capability is always checked against the initial +user namespace. This restricts the FDB management from unprivileged +containers. + +Drop the netlink_capable check in rtnl_fdb_del as it was originally +dropped in c5c351088ae7 and reintroduced in 1690be63a27b without +intention. + +This patch was tested using a container on GyroidOS, where it was +possible to delete FDB entries from an unprivileged user namespace and +private network namespace. 
+ +Fixes: 1690be63a27b ("bridge: Add vlan support to static neighbors") +Reviewed-by: Michael Weiß +Tested-by: Harshal Gohel +Signed-off-by: Johannes Wiesböck +Reviewed-by: Ido Schimmel +Reviewed-by: Nikolay Aleksandrov +Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 26c520d1af6e6..1613563132035 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -4383,9 +4383,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, + int err; + u16 vid; + +- if (!netlink_capable(skb, CAP_NET_ADMIN)) +- return -EPERM; +- + if (!del_bulk) { + err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, + NULL, extack); +-- +2.51.0 + diff --git a/queue-6.6/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch b/queue-6.6/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch new file mode 100644 index 0000000000..478a942b4b --- /dev/null +++ b/queue-6.6/sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch @@ -0,0 +1,54 @@ +From d0035397a3044ab9c4599f62553a8c93d592b6a3 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 21 Oct 2025 16:00:36 +0300 +Subject: sctp: avoid NULL dereference when chunk data buffer is missing + +From: Alexey Simakov + +[ Upstream commit 441f0647f7673e0e64d4910ef61a5fb8f16bfb82 ] + +chunk->skb pointer is dereferenced in the if-block where it's supposed +to be NULL only. + +chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list +instead and do it just before replacing chunk->skb. We're sure that +otherwise chunk->skb is non-NULL because of outer if() condition. 
+ +Fixes: 90017accff61 ("sctp: Add GSO support") +Signed-off-by: Alexey Simakov +Acked-by: Marcelo Ricardo Leitner +Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/sctp/inqueue.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c +index 5c16521818058..f5a7d5a387555 100644 +--- a/net/sctp/inqueue.c ++++ b/net/sctp/inqueue.c +@@ -169,13 +169,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) + chunk->head_skb = chunk->skb; + + /* skbs with "cover letter" */ +- if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) ++ if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) { ++ if (WARN_ON(!skb_shinfo(chunk->skb)->frag_list)) { ++ __SCTP_INC_STATS(dev_net(chunk->skb->dev), ++ SCTP_MIB_IN_PKT_DISCARDS); ++ sctp_chunk_free(chunk); ++ goto next_chunk; ++ } + chunk->skb = skb_shinfo(chunk->skb)->frag_list; +- +- if (WARN_ON(!chunk->skb)) { +- __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); +- sctp_chunk_free(chunk); +- goto next_chunk; + } + } + +-- +2.51.0 + diff --git a/queue-6.6/selftests-net-convert-sctp_vrf.sh-to-run-it-in-uniqu.patch b/queue-6.6/selftests-net-convert-sctp_vrf.sh-to-run-it-in-uniqu.patch new file mode 100644 index 0000000000..62532a92ec --- /dev/null +++ b/queue-6.6/selftests-net-convert-sctp_vrf.sh-to-run-it-in-uniqu.patch @@ -0,0 +1,72 @@ +From 33fb7f03ff2fe02f82cec33d0b1c6e07fa0d2fb7 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 2 Dec 2023 10:01:09 +0800 +Subject: selftests/net: convert sctp_vrf.sh to run it in unique namespace + +From: Hangbin Liu + +[ Upstream commit 90e271f65ee428ae5a75e783f5ba50a10dece09d ] + +Here is the test result after conversion. + +]# ./sctp_vrf.sh +Testing For SCTP VRF: +TEST 01: nobind, connect from client 1, l3mdev_accept=1, Y [PASS] +... 
+TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [PASS] +***v6 Tests Done*** + +Acked-by: David Ahern +Reviewed-by: Xin Long +Signed-off-by: Hangbin Liu +Signed-off-by: Paolo Abeni +Stable-dep-of: a73ca0449bcb ("selftests: net: fix server bind failure in sctp_vrf.sh") +Signed-off-by: Sasha Levin +--- + tools/testing/selftests/net/sctp_vrf.sh | 12 +++--------- + 1 file changed, 3 insertions(+), 9 deletions(-) + +diff --git a/tools/testing/selftests/net/sctp_vrf.sh b/tools/testing/selftests/net/sctp_vrf.sh +index c721e952e5f30..c854034b6aa16 100755 +--- a/tools/testing/selftests/net/sctp_vrf.sh ++++ b/tools/testing/selftests/net/sctp_vrf.sh +@@ -6,13 +6,11 @@ + # SERVER_NS + # CLIENT_NS2 (veth1) <---> (veth2) -> vrf_s2 + +-CLIENT_NS1="client-ns1" +-CLIENT_NS2="client-ns2" ++source lib.sh + CLIENT_IP4="10.0.0.1" + CLIENT_IP6="2000::1" + CLIENT_PORT=1234 + +-SERVER_NS="server-ns" + SERVER_IP4="10.0.0.2" + SERVER_IP6="2000::2" + SERVER_PORT=1234 +@@ -20,9 +18,7 @@ SERVER_PORT=1234 + setup() { + modprobe sctp + modprobe sctp_diag +- ip netns add $CLIENT_NS1 +- ip netns add $CLIENT_NS2 +- ip netns add $SERVER_NS ++ setup_ns CLIENT_NS1 CLIENT_NS2 SERVER_NS + + ip net exec $CLIENT_NS1 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null + ip net exec $CLIENT_NS2 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null +@@ -67,9 +63,7 @@ setup() { + + cleanup() { + ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null +- ip netns del "$CLIENT_NS1" +- ip netns del "$CLIENT_NS2" +- ip netns del "$SERVER_NS" ++ cleanup_ns $CLIENT_NS1 $CLIENT_NS2 $SERVER_NS + } + + wait_server() { +-- +2.51.0 + diff --git a/queue-6.6/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch b/queue-6.6/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch new file mode 100644 index 0000000000..fbae5d7b07 --- /dev/null +++ b/queue-6.6/selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch @@ -0,0 +1,241 @@ +From fb9d93cbdd5445b3fdfe147308d17d60bdc48082 
Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 17 Oct 2025 16:06:14 -0400 +Subject: selftests: net: fix server bind failure in sctp_vrf.sh +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Xin Long + +[ Upstream commit a73ca0449bcb7c238097cc6a1bf3fd82a78374df ] + +sctp_vrf.sh could fail: + + TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [FAIL] + not ok 1 selftests: net: sctp_vrf.sh # exit=3 + +The failure happens when the server bind in a new run conflicts with an +existing association from the previous run: + +[1] ip netns exec $SERVER_NS ./sctp_hello server ... +[2] ip netns exec $CLIENT_NS ./sctp_hello client ... +[3] ip netns exec $SERVER_NS pkill sctp_hello ... +[4] ip netns exec $SERVER_NS ./sctp_hello server ... + +It occurs if the client in [2] sends a message and closes immediately. +With the message unacked, no SHUTDOWN is sent. Killing the server in [3] +triggers a SHUTDOWN the client also ignores due to the unacked message, +leaving the old association alive. This causes the bind at [4] to fail +until the message is acked and the client responds to a second SHUTDOWN +after the server’s T2 timer expires (3s). + +This patch fixes the issue by preventing the client from sending data. +Instead, the client blocks on recv() and waits for the server to close. +It also waits until both the server and the client sockets are fully +released in stop_server and wait_client before restarting. + +Additionally, replace 2>&1 >/dev/null with -q in sysctl and grep, and +drop other redundant 2>&1 >/dev/null redirections, and fix a typo from +N to Y (connect successfully) in the description of the last test. 
+ +Fixes: a61bd7b9fef3 ("selftests: add a selftest for sctp vrf") +Reported-by: Hangbin Liu +Tested-by: Jakub Kicinski +Signed-off-by: Xin Long +Link: https://patch.msgid.link/be2dacf52d0917c4ba5e2e8c5a9cb640740ad2b6.1760731574.git.lucien.xin@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + tools/testing/selftests/net/sctp_hello.c | 17 +----- + tools/testing/selftests/net/sctp_vrf.sh | 73 +++++++++++++++--------- + 2 files changed, 47 insertions(+), 43 deletions(-) + +diff --git a/tools/testing/selftests/net/sctp_hello.c b/tools/testing/selftests/net/sctp_hello.c +index f02f1f95d2275..a04dac0b8027d 100644 +--- a/tools/testing/selftests/net/sctp_hello.c ++++ b/tools/testing/selftests/net/sctp_hello.c +@@ -29,7 +29,6 @@ static void set_addr(struct sockaddr_storage *ss, char *ip, char *port, int *len + static int do_client(int argc, char *argv[]) + { + struct sockaddr_storage ss; +- char buf[] = "hello"; + int csk, ret, len; + + if (argc < 5) { +@@ -56,16 +55,10 @@ static int do_client(int argc, char *argv[]) + + set_addr(&ss, argv[3], argv[4], &len); + ret = connect(csk, (struct sockaddr *)&ss, len); +- if (ret < 0) { +- printf("failed to connect to peer\n"); ++ if (ret < 0) + return -1; +- } + +- ret = send(csk, buf, strlen(buf) + 1, 0); +- if (ret < 0) { +- printf("failed to send msg %d\n", ret); +- return -1; +- } ++ recv(csk, NULL, 0, 0); + close(csk); + + return 0; +@@ -75,7 +68,6 @@ int main(int argc, char *argv[]) + { + struct sockaddr_storage ss; + int lsk, csk, ret, len; +- char buf[20]; + + if (argc < 2 || (strcmp(argv[1], "server") && strcmp(argv[1], "client"))) { + printf("%s server|client ...\n", argv[0]); +@@ -125,11 +117,6 @@ int main(int argc, char *argv[]) + return -1; + } + +- ret = recv(csk, buf, sizeof(buf), 0); +- if (ret <= 0) { +- printf("failed to recv msg %d\n", ret); +- return -1; +- } + close(csk); + close(lsk); + +diff --git a/tools/testing/selftests/net/sctp_vrf.sh b/tools/testing/selftests/net/sctp_vrf.sh 
+index c854034b6aa16..667b211aa8a11 100755 +--- a/tools/testing/selftests/net/sctp_vrf.sh ++++ b/tools/testing/selftests/net/sctp_vrf.sh +@@ -20,9 +20,9 @@ setup() { + modprobe sctp_diag + setup_ns CLIENT_NS1 CLIENT_NS2 SERVER_NS + +- ip net exec $CLIENT_NS1 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null +- ip net exec $CLIENT_NS2 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null +- ip net exec $SERVER_NS sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null ++ ip net exec $CLIENT_NS1 sysctl -wq net.ipv6.conf.default.accept_dad=0 ++ ip net exec $CLIENT_NS2 sysctl -wq net.ipv6.conf.default.accept_dad=0 ++ ip net exec $SERVER_NS sysctl -wq net.ipv6.conf.default.accept_dad=0 + + ip -n $SERVER_NS link add veth1 type veth peer name veth1 netns $CLIENT_NS1 + ip -n $SERVER_NS link add veth2 type veth peer name veth1 netns $CLIENT_NS2 +@@ -62,17 +62,40 @@ setup() { + } + + cleanup() { +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null ++ wait_client $CLIENT_NS1 ++ wait_client $CLIENT_NS2 ++ stop_server + cleanup_ns $CLIENT_NS1 $CLIENT_NS2 $SERVER_NS + } + +-wait_server() { ++start_server() { + local IFACE=$1 + local CNT=0 + +- until ip netns exec $SERVER_NS ss -lS src $SERVER_IP:$SERVER_PORT | \ +- grep LISTEN | grep "$IFACE" 2>&1 >/dev/null; do +- [ $((CNT++)) = "20" ] && { RET=3; return $RET; } ++ ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP $SERVER_PORT $IFACE & ++ disown ++ until ip netns exec $SERVER_NS ss -SlH | grep -q "$IFACE"; do ++ [ $((CNT++)) -eq 30 ] && { RET=3; return $RET; } ++ sleep 0.1 ++ done ++} ++ ++stop_server() { ++ local CNT=0 ++ ++ ip netns exec $SERVER_NS pkill sctp_hello ++ while ip netns exec $SERVER_NS ss -SaH | grep -q .; do ++ [ $((CNT++)) -eq 30 ] && break ++ sleep 0.1 ++ done ++} ++ ++wait_client() { ++ local CLIENT_NS=$1 ++ local CNT=0 ++ ++ while ip netns exec $CLIENT_NS ss -SaH | grep -q .; do ++ [ $((CNT++)) -eq 30 ] && break + sleep 0.1 + done + } +@@ -81,14 +104,12 @@ do_test() { 
+ local CLIENT_NS=$1 + local IFACE=$2 + +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE 2>&1 >/dev/null & +- disown +- wait_server $IFACE || return $RET ++ start_server $IFACE || return $RET + timeout 3 ip netns exec $CLIENT_NS ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT + RET=$? ++ wait_client $CLIENT_NS ++ stop_server + return $RET + } + +@@ -96,25 +117,21 @@ do_testx() { + local IFACE1=$1 + local IFACE2=$2 + +- ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE1 2>&1 >/dev/null & +- disown +- wait_server $IFACE1 || return $RET +- ip netns exec $SERVER_NS ./sctp_hello server $AF $SERVER_IP \ +- $SERVER_PORT $IFACE2 2>&1 >/dev/null & +- disown +- wait_server $IFACE2 || return $RET ++ start_server $IFACE1 || return $RET ++ start_server $IFACE2 || return $RET + timeout 3 ip netns exec $CLIENT_NS1 ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null && \ ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT && \ + timeout 3 ip netns exec $CLIENT_NS2 ./sctp_hello client $AF \ +- $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT 2>&1 >/dev/null ++ $SERVER_IP $SERVER_PORT $CLIENT_IP $CLIENT_PORT + RET=$? 
++ wait_client $CLIENT_NS1 ++ wait_client $CLIENT_NS2 ++ stop_server + return $RET + } + + testup() { +- ip netns exec $SERVER_NS sysctl -w net.sctp.l3mdev_accept=1 2>&1 >/dev/null ++ ip netns exec $SERVER_NS sysctl -wq net.sctp.l3mdev_accept=1 + echo -n "TEST 01: nobind, connect from client 1, l3mdev_accept=1, Y " + do_test $CLIENT_NS1 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" +@@ -123,7 +140,7 @@ testup() { + do_test $CLIENT_NS2 && { echo "[FAIL]"; return $RET; } + echo "[PASS]" + +- ip netns exec $SERVER_NS sysctl -w net.sctp.l3mdev_accept=0 2>&1 >/dev/null ++ ip netns exec $SERVER_NS sysctl -wq net.sctp.l3mdev_accept=0 + echo -n "TEST 03: nobind, connect from client 1, l3mdev_accept=0, N " + do_test $CLIENT_NS1 && { echo "[FAIL]"; return $RET; } + echo "[PASS]" +@@ -160,7 +177,7 @@ testup() { + do_testx vrf-1 vrf-2 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" + +- echo -n "TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N " ++ echo -n "TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, Y " + do_testx vrf-2 vrf-1 || { echo "[FAIL]"; return $RET; } + echo "[PASS]" + } +-- +2.51.0 + diff --git a/queue-6.6/series b/queue-6.6/series index e8110fa27a..66e9485306 100644 --- a/queue-6.6/series +++ b/queue-6.6/series @@ -14,3 +14,17 @@ powerpc-32-remove-page_kernel_text-to-fix-startup-fa.patch drivers-perf-hisi-relax-the-event-id-check-in-the-fr.patch smb-server-let-smb_direct_flush_send_list-invalidate.patch unbreak-make-tools-for-user-space-targets.patch +net-mlx5e-return-1-instead-of-0-in-invalid-case-in-m.patch +rtnetlink-allow-deleting-fdb-entries-in-user-namespa.patch +net-tree-wide-replace-xdp_do_flush_map-with-xdp_do_f.patch +net-enetc-fix-the-deadlock-of-enetc_mdio_lock.patch +net-enetc-correct-the-value-of-enetc_rxb_truesize.patch +dpaa2-eth-fix-the-pointer-passed-to-ptr_align-on-tx-.patch +can-bxcan-bxcan_start_xmit-use-can_dev_dropped_skb-i.patch +selftests-net-convert-sctp_vrf.sh-to-run-it-in-uniqu.patch 
+selftests-net-fix-server-bind-failure-in-sctp_vrf.sh.patch +net-mlx5e-reuse-per-rq-xdp-buffer-to-avoid-stack-zer.patch +net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch +net-mlx5e-rx-fix-generating-skb-from-non-linear-xdp_.patch-5039 +arm64-mm-avoid-always-making-pte-dirty-in-pte_mkwrit.patch +sctp-avoid-null-dereference-when-chunk-data-buffer-i.patch