From: Sasha Levin
Date: Sat, 24 May 2025 10:22:30 +0000 (-0400)
Subject: Fixes for 5.10
X-Git-Tag: v6.12.31~72
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=7190fc143b3512855703f6489642e491376e5333;p=thirdparty%2Fkernel%2Fstable-queue.git

Fixes for 5.10

Signed-off-by: Sasha Levin
---

diff --git a/queue-5.10/bridge-netfilter-fix-forwarding-of-fragmented-packet.patch b/queue-5.10/bridge-netfilter-fix-forwarding-of-fragmented-packet.patch
new file mode 100644
index 0000000000..5aa7a29910
--- /dev/null
+++ b/queue-5.10/bridge-netfilter-fix-forwarding-of-fragmented-packet.patch
@@ -0,0 +1,95 @@
+From 4fd84b7430e233010033bd241ec6d69e1a11e4b8 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Thu, 15 May 2025 11:48:48 +0300
+Subject: bridge: netfilter: Fix forwarding of fragmented packets
+
+From: Ido Schimmel
+
+[ Upstream commit 91b6dbced0ef1d680afdd69b14fc83d50ebafaf3 ]
+
+When netfilter defrag hooks are loaded (due to the presence of conntrack
+rules, for example), fragmented packets entering the bridge will be
+defragged by the bridge's pre-routing hook (br_nf_pre_routing() ->
+ipv4_conntrack_defrag()).
+
+Later on, in the bridge's post-routing hook, the defragged packet will
+be fragmented again. If the size of the largest fragment is larger than
+what the kernel has determined as the destination MTU (using
+ip_skb_dst_mtu()), the defragged packet will be dropped.
+
+Before commit ac6627a28dbf ("net: ipv4: Consolidate ipv4_mtu and
+ip_dst_mtu_maybe_forward"), ip_skb_dst_mtu() would return dst_mtu() as
+the destination MTU. Assuming the dst entry attached to the packet is
+the bridge's fake rtable one, this would simply be the bridge's MTU (see
+fake_mtu()).
+
+However, after above mentioned commit, ip_skb_dst_mtu() ends up
+returning the route's MTU stored in the dst entry's metrics. Ideally, in
+case the dst entry is the bridge's fake rtable one, this should be the
+bridge's MTU as the bridge takes care of updating this metric when its
+MTU changes (see br_change_mtu()).
+
+Unfortunately, the last operation is a no-op given the metrics attached
+to the fake rtable entry are marked as read-only. Therefore,
+ip_skb_dst_mtu() ends up returning 1500 (the initial MTU value) and
+defragged packets are dropped during fragmentation when dealing with
+large fragments and high MTU (e.g., 9k).
+
+Fix by moving the fake rtable entry's metrics to be per-bridge (in a
+similar fashion to the fake rtable entry itself) and marking them as
+writable, thereby allowing MTU changes to be reflected.
+
+Fixes: 62fa8a846d7d ("net: Implement read-only protection and COW'ing of metrics.")
+Fixes: 33eb9873a283 ("bridge: initialize fake_rtable metrics")
+Reported-by: Venkat Venkatsubra
+Closes: https://lore.kernel.org/netdev/PH0PR10MB4504888284FF4CBA648197D0ACB82@PH0PR10MB4504.namprd10.prod.outlook.com/
+Tested-by: Venkat Venkatsubra
+Signed-off-by: Ido Schimmel
+Acked-by: Nikolay Aleksandrov
+Link: https://patch.msgid.link/20250515084848.727706-1-idosch@nvidia.com
+Signed-off-by: Jakub Kicinski
+Signed-off-by: Sasha Levin
+---
+ net/bridge/br_nf_core.c | 7 ++-----
+ net/bridge/br_private.h | 1 +
+ 2 files changed, 3 insertions(+), 5 deletions(-)
+
+diff --git a/net/bridge/br_nf_core.c b/net/bridge/br_nf_core.c
+index 8c69f0c95a8ed..b8c8deb87407d 100644
+--- a/net/bridge/br_nf_core.c
++++ b/net/bridge/br_nf_core.c
+@@ -65,17 +65,14 @@ static struct dst_ops fake_dst_ops = {
+  * ipt_REJECT needs it. Future netfilter modules might
+  * require us to fill additional fields.
+  */
+-static const u32 br_dst_default_metrics[RTAX_MAX] = {
+-	[RTAX_MTU - 1] = 1500,
+-};
+-
+ void br_netfilter_rtable_init(struct net_bridge *br)
+ {
+ 	struct rtable *rt = &br->fake_rtable;
+ 
+ 	atomic_set(&rt->dst.__refcnt, 1);
+ 	rt->dst.dev = br->dev;
+-	dst_init_metrics(&rt->dst, br_dst_default_metrics, true);
++	dst_init_metrics(&rt->dst, br->metrics, false);
++	dst_metric_set(&rt->dst, RTAX_MTU, br->dev->mtu);
+ 	rt->dst.flags = DST_NOXFRM | DST_FAKE_RTABLE;
+ 	rt->dst.ops = &fake_dst_ops;
+ }
+diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
+index 2b88b17cc8b25..259b43b435a99 100644
+--- a/net/bridge/br_private.h
++++ b/net/bridge/br_private.h
+@@ -400,6 +400,7 @@ struct net_bridge {
+ 		struct rtable fake_rtable;
+ 		struct rt6_info fake_rt6_info;
+ 	};
++	u32 metrics[RTAX_MAX];
+ #endif
+ 	u16 group_fwd_mask;
+ 	u16 group_fwd_mask_required;
+-- 
+2.39.5
+
diff --git a/queue-5.10/net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch b/queue-5.10/net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch
new file mode 100644
index 0000000000..581cafcd9b
--- /dev/null
+++ b/queue-5.10/net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch
@@ -0,0 +1,48 @@
+From 27fcc2da618f81309609914376aec63a234072c3 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Mon, 19 May 2025 18:49:36 +0200
+Subject: net: dwmac-sun8i: Use parsed internal PHY address instead of 1
+
+From: Paul Kocialkowski
+
+[ Upstream commit 47653e4243f2b0a26372e481ca098936b51ec3a8 ]
+
+While the MDIO address of the internal PHY on Allwinner sun8i chips is
+generally 1, of_mdio_parse_addr is used to cleanly parse the address
+from the device-tree instead of hardcoding it.
+
+A commit reworking the code ditched the parsed value and hardcoded the
+value 1 instead, which didn't really break anything but is more fragile
+and not future-proof.
+
+Restore the initial behavior using the parsed address returned from the
+helper.
+
+Fixes: 634db83b8265 ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs")
+Signed-off-by: Paul Kocialkowski
+Reviewed-by: Andrew Lunn
+Acked-by: Corentin LABBE
+Tested-by: Corentin LABBE
+Link: https://patch.msgid.link/20250519164936.4172658-1-paulk@sys-base.io
+Signed-off-by: Jakub Kicinski
+Signed-off-by: Sasha Levin
+---
+ drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+index 958bbcfc2668d..d04bc6597e0f0 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+@@ -936,7 +936,7 @@ static int sun8i_dwmac_set_syscon(struct device *dev,
+ 		/* of_mdio_parse_addr returns a valid (0 ~ 31) PHY
+ 		 * address. No need to mask it again.
+ 		 */
+-		reg |= 1 << H3_EPHY_ADDR_SHIFT;
++		reg |= ret << H3_EPHY_ADDR_SHIFT;
+ 	} else {
+ 		/* For SoCs without internal PHY the PHY selection bit should be
+ 		 * set to 0 (external PHY).
+-- 
+2.39.5
+
diff --git a/queue-5.10/net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch b/queue-5.10/net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch
new file mode 100644
index 0000000000..44dc4aeebd
--- /dev/null
+++ b/queue-5.10/net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch
@@ -0,0 +1,125 @@
+From a5d86bbf6cac77f275e171a621b202fa47376a15 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Tue, 20 May 2025 18:14:04 +0800
+Subject: net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done
+
+From: Wang Liang
+
+[ Upstream commit e279024617134c94fd3e37470156534d5f2b3472 ]
+
+Syzbot reported a slab-use-after-free with the following call trace:
+
+  ==================================================================
+  BUG: KASAN: slab-use-after-free in tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
+  Read of size 8 at addr ffff88807a733000 by task kworker/1:0/25
+
+  Call Trace:
+   kasan_report+0xd9/0x110 mm/kasan/report.c:601
+   tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
+   crypto_request_complete include/crypto/algapi.h:266
+   aead_request_complete include/crypto/internal/aead.h:85
+   cryptd_aead_crypt+0x3b8/0x750 crypto/cryptd.c:772
+   crypto_request_complete include/crypto/algapi.h:266
+   cryptd_queue_worker+0x131/0x200 crypto/cryptd.c:181
+   process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
+
+  Allocated by task 8355:
+   kzalloc_noprof include/linux/slab.h:778
+   tipc_crypto_start+0xcc/0x9e0 net/tipc/crypto.c:1466
+   tipc_init_net+0x2dd/0x430 net/tipc/core.c:72
+   ops_init+0xb9/0x650 net/core/net_namespace.c:139
+   setup_net+0x435/0xb40 net/core/net_namespace.c:343
+   copy_net_ns+0x2f0/0x670 net/core/net_namespace.c:508
+   create_new_namespaces+0x3ea/0xb10 kernel/nsproxy.c:110
+   unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:228
+   ksys_unshare+0x419/0x970 kernel/fork.c:3323
+   __do_sys_unshare kernel/fork.c:3394
+
+  Freed by task 63:
+   kfree+0x12a/0x3b0 mm/slub.c:4557
+   tipc_crypto_stop+0x23c/0x500 net/tipc/crypto.c:1539
+   tipc_exit_net+0x8c/0x110 net/tipc/core.c:119
+   ops_exit_list+0xb0/0x180 net/core/net_namespace.c:173
+   cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
+   process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
+
+After freed the tipc_crypto tx by delete namespace, tipc_aead_encrypt_done
+may still visit it in cryptd_queue_worker workqueue.
+
+I reproduce this issue by:
+  ip netns add ns1
+  ip link add veth1 type veth peer name veth2
+  ip link set veth1 netns ns1
+  ip netns exec ns1 tipc bearer enable media eth dev veth1
+  ip netns exec ns1 tipc node set key this_is_a_master_key master
+  ip netns exec ns1 tipc bearer disable media eth dev veth1
+  ip netns del ns1
+
+The key of reproduction is that, simd_aead_encrypt is interrupted, leading
+to crypto_simd_usable() return false. Thus, the cryptd_queue_worker is
+triggered, and the tipc_crypto tx will be visited.
+
+  tipc_disc_timeout
+    tipc_bearer_xmit_skb
+      tipc_crypto_xmit
+        tipc_aead_encrypt
+          crypto_aead_encrypt
+            // encrypt()
+            simd_aead_encrypt
+              // crypto_simd_usable() is false
+              child = &ctx->cryptd_tfm->base;
+
+  simd_aead_encrypt
+    crypto_aead_encrypt
+      // encrypt()
+      cryptd_aead_encrypt_enqueue
+        cryptd_aead_enqueue
+          cryptd_enqueue_request
+            // trigger cryptd_queue_worker
+            queue_work_on(smp_processor_id(), cryptd_wq, &cpu_queue->work)
+
+Fix this by holding net reference count before encrypt.
+
+Reported-by: syzbot+55c12726619ff85ce1f6@syzkaller.appspotmail.com
+Closes: https://syzkaller.appspot.com/bug?extid=55c12726619ff85ce1f6
+Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
+Signed-off-by: Wang Liang
+Link: https://patch.msgid.link/20250520101404.1341730-1-wangliang74@huawei.com
+Signed-off-by: Paolo Abeni
+Signed-off-by: Sasha Levin
+---
+ net/tipc/crypto.c | 5 +++++
+ 1 file changed, 5 insertions(+)
+
+diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
+index bf384bd126963..159d891b81c59 100644
+--- a/net/tipc/crypto.c
++++ b/net/tipc/crypto.c
+@@ -821,12 +821,16 @@ static int tipc_aead_encrypt(struct tipc_aead *aead, struct sk_buff *skb,
+ 		goto exit;
+ 	}
+ 
++	/* Get net to avoid freed tipc_crypto when delete namespace */
++	get_net(aead->crypto->net);
++
+ 	/* Now, do encrypt */
+ 	rc = crypto_aead_encrypt(req);
+ 	if (rc == -EINPROGRESS || rc == -EBUSY)
+ 		return rc;
+ 
+ 	tipc_bearer_put(b);
++	put_net(aead->crypto->net);
+ 
+ exit:
+ 	kfree(ctx);
+@@ -864,6 +868,7 @@ static void tipc_aead_encrypt_done(struct crypto_async_request *base, int err)
+ 	kfree(tx_ctx);
+ 	tipc_bearer_put(b);
+ 	tipc_aead_put(aead);
++	put_net(net);
+ }
+ 
+ /**
+-- 
+2.39.5
+
diff --git a/queue-5.10/sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch b/queue-5.10/sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch
new file mode 100644
index 0000000000..c36af08ae9
--- /dev/null
+++ b/queue-5.10/sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch
@@ -0,0 +1,62 @@
+From f2f3a88e0d6b83a10ee10536016b4bc01966bac2 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Sun, 18 May 2025 15:20:37 -0700
+Subject: sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue()
+
+From: Cong Wang
+
+[ Upstream commit 3f981138109f63232a5fb7165938d4c945cc1b9d ]
+
+When enqueuing the first packet to an HFSC class, hfsc_enqueue() calls the
+child qdisc's peek() operation before incrementing sch->q.qlen and
+sch->qstats.backlog. If the child qdisc uses qdisc_peek_dequeued(), this may
+trigger an immediate dequeue and potential packet drop. In such cases,
+qdisc_tree_reduce_backlog() is called, but the HFSC qdisc's qlen and backlog
+have not yet been updated, leading to inconsistent queue accounting. This
+can leave an empty HFSC class in the active list, causing further
+consequences like use-after-free.
+
+This patch fixes the bug by moving the increment of sch->q.qlen and
+sch->qstats.backlog before the call to the child qdisc's peek() operation.
+This ensures that queue length and backlog are always accurate when packet
+drops or dequeues are triggered during the peek.
+
+Fixes: 12d0ad3be9c3 ("net/sched/sch_hfsc.c: handle corner cases where head may change invalidating calculated deadline")
+Reported-by: Mingi Cho
+Signed-off-by: Cong Wang
+Reviewed-by: Simon Horman
+Link: https://patch.msgid.link/20250518222038.58538-2-xiyou.wangcong@gmail.com
+Reviewed-by: Jamal Hadi Salim
+Signed-off-by: Paolo Abeni
+Signed-off-by: Sasha Levin
+---
+ net/sched/sch_hfsc.c | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
+index adc16643779fb..45d17501a6ed0 100644
+--- a/net/sched/sch_hfsc.c
++++ b/net/sched/sch_hfsc.c
+@@ -1571,6 +1571,9 @@ hfsc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free)
+ 		return err;
+ 	}
+ 
++	sch->qstats.backlog += len;
++	sch->q.qlen++;
++
+ 	if (first && !cl->cl_nactive) {
+ 		if (cl->cl_flags & HFSC_RSC)
+ 			init_ed(cl, len);
+@@ -1586,9 +1589,6 @@ hfsc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free)
+ 
+ 	}
+ 
+-	sch->qstats.backlog += len;
+-	sch->q.qlen++;
+-
+ 	return NET_XMIT_SUCCESS;
+ }
+ 
+-- 
+2.39.5
+
diff --git a/queue-5.10/series b/queue-5.10/series
index 2243e76152..ebc1d22fbf 100644
--- a/queue-5.10/series
+++ b/queue-5.10/series
@@ -233,3 +233,8 @@ nvmet-tcp-don-t-restore-null-sk_state_change.patch
 btrfs-correct-the-order-of-prelim_ref-arguments-in-b.patch
 xenbus-allow-pvh-dom0-a-non-local-xenstore.patch
 __legitimize_mnt-check-for-mnt_sync_umount-should-be.patch
+xfrm-sanitize-marks-before-insert.patch
+bridge-netfilter-fix-forwarding-of-fragmented-packet.patch
+net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch
+sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch
+net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch
diff --git a/queue-5.10/xfrm-sanitize-marks-before-insert.patch b/queue-5.10/xfrm-sanitize-marks-before-insert.patch
new file mode 100644
index 0000000000..9d99fe2a22
--- /dev/null
+++ b/queue-5.10/xfrm-sanitize-marks-before-insert.patch
@@ -0,0 +1,71 @@
+From 2aa323c27cfacd8b2dcb0a86a3a67e1f083d404a Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 7 May 2025 13:31:58 +0200
+Subject: xfrm: Sanitize marks before insert
+
+From: Paul Chaignon
+
+[ Upstream commit 0b91fda3a1f044141e1e615456ff62508c32b202 ]
+
+Prior to this patch, the mark is sanitized (applying the state's mask to
+the state's value) only on inserts when checking if a conflicting XFRM
+state or policy exists.
+
+We discovered in Cilium that this same sanitization does not occur
+in the hot-path __xfrm_state_lookup. In the hot-path, the sk_buff's mark
+is simply compared to the state's value:
+
+    if ((mark & x->mark.m) != x->mark.v)
+        continue;
+
+Therefore, users can define unsanitized marks (ex. 0xf42/0xf00) which will
+never match any packet.
+
+This commit updates __xfrm_state_insert and xfrm_policy_insert to store
+the sanitized marks, thus removing this footgun.
+
+This has the side effect of changing the ip output, as the
+returned mark will have the mask applied to it when printed.
+
+Fixes: 3d6acfa7641f ("xfrm: SA lookups with mark")
+Signed-off-by: Paul Chaignon
+Signed-off-by: Louis DeLosSantos
+Co-developed-by: Louis DeLosSantos
+Signed-off-by: Steffen Klassert
+Signed-off-by: Sasha Levin
+---
+ net/xfrm/xfrm_policy.c | 3 +++
+ net/xfrm/xfrm_state.c  | 3 +++
+ 2 files changed, 6 insertions(+)
+
+diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
+index a1a662a55c2ae..64b971bb1d36a 100644
+--- a/net/xfrm/xfrm_policy.c
++++ b/net/xfrm/xfrm_policy.c
+@@ -1594,6 +1594,9 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl)
+ 	struct xfrm_policy *delpol;
+ 	struct hlist_head *chain;
+ 
++	/* Sanitize mark before store */
++	policy->mark.v &= policy->mark.m;
++
+ 	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
+ 	chain = policy_hash_bysel(net, &policy->selector, policy->family, dir);
+ 	if (chain)
+diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
+index 94179ff475f2f..da2d7012e5c74 100644
+--- a/net/xfrm/xfrm_state.c
++++ b/net/xfrm/xfrm_state.c
+@@ -1249,6 +1249,9 @@ static void __xfrm_state_insert(struct xfrm_state *x)
+ 
+ 	list_add(&x->km.all, &net->xfrm.state_all);
+ 
++	/* Sanitize mark before store */
++	x->mark.v &= x->mark.m;
++
+ 	h = xfrm_dst_hash(net, &x->id.daddr, &x->props.saddr,
+ 			  x->props.reqid, x->props.family);
+ 	hlist_add_head_rcu(&x->bydst, net->xfrm.state_bydst + h);
+-- 
+2.39.5
+