From 5d72e4738771612fddc619852d61534f2c733a7f Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Sat, 24 May 2025 06:22:28 -0400 Subject: [PATCH] Fixes for 6.14 Signed-off-by: Sasha Levin --- ...el-hda-fix-uaf-when-reloading-module.patch | 101 ++ ...use-skb_pull-to-avoid-unsafe-access-.patch | 193 ++++ ...fix-not-checking-l2cap_chan-security.patch | 92 ++ ...-fix-forwarding-of-fragmented-packet.patch | 95 ++ ...add-missing-divider-for-mmc-mod-cloc.patch | 131 +++ .../devres-introduce-devm_kmemdup_array.patch | 42 + ...ma-fix-return-code-for-unhandled-int.patch | 40 + ...ix-allowing-write-from-different-add.patch | 59 ++ ...dmaengine-idxd-fix-poll-return-value.patch | 41 + ...split-devres-apis-to-device-devres.h.patch | 302 ++++++ queue-6.14/espintcp-fix-skb-leaks.patch | 73 ++ ...encap-socket-caching-to-avoid-refere.patch | 252 +++++ ...lacp-bonds-without-sriov-environment.patch | 68 ++ ...num_mac-count-with-port-representors.patch | 53 ++ ...idpf-fix-idpf_vport_splitq_napi_poll.patch | 72 ++ ...ull-ptr-deref-in-idpf_features_check.patch | 121 +++ ...-fix-overflow-resched-cqe-reordering.patch | 38 + ...sic-start-local-sync-timer-on-correc.patch | 71 ++ ...-call-untrack_pfn_clear-on-vmas-dupl.patch | 98 ++ ...re-write_iter-for-writable-files-in-.patch | 43 + ...idate-the-ipmr_can_free_table-checks.patch | 165 ++++ ...use-parsed-internal-phy-address-inst.patch | 48 + ...estore-sgmii-ctrl-register-on-resume.patch | 92 ++ ...b-use-after-free-read-in-tipc_aead_e.patch | 125 +++ ...-apr-entry-mapping-based-on-apr_lmt_.patch | 111 +++ ...et-lmt_ena-bit-for-apr-table-entries.patch | 76 ++ ...-pf-add-af_xdp-non-zero-copy-support.patch | 52 + ...-pf-af_xdp-zero-copy-receive-support.patch | 896 ++++++++++++++++++ ...id-adding-dcbnl_ops-for-lbk-and-sdp-.patch | 45 + ...-xdp_return_frame-to-free-xdp-buffer.patch | 216 +++++ ...ix-segfault-with-pebs-via-pt-with-sa.patch | 101 ++ ...tch-to-devm_register_sys_off_handler.patch | 96 ++ ...gnal-freq-counts-in-summary-output-f.patch | 131 +++ ...wcnss-fix-on-platforms-without-fallb.patch | 45 + ...n-accounting-bug-when-using-peek-in-.patch | 62 ++ queue-6.14/series | 40 + ...x-race-on-the-creation-of-the-irq-do.patch | 60 ++ ...ator-precedence-in-ghcb_msr_vmpl_req.patch | 45 + ...p-gro-handling-for-some-corner-cases.patch | 143 +++ .../xfrm-sanitize-marks-before-insert.patch | 71 ++ ...ack-busy-polling-support-in-xdp_copy.patch | 63 ++ 41 files changed, 4668 insertions(+) create mode 100644 queue-6.14/asoc-sof-intel-hda-fix-uaf-when-reloading-module.patch create mode 100644 queue-6.14/bluetooth-btusb-use-skb_pull-to-avoid-unsafe-access-.patch create mode 100644 queue-6.14/bluetooth-l2cap-fix-not-checking-l2cap_chan-security.patch create mode 100644 queue-6.14/bridge-netfilter-fix-forwarding-of-fragmented-packet.patch create mode 100644 queue-6.14/clk-sunxi-ng-d1-add-missing-divider-for-mmc-mod-cloc.patch create mode 100644 queue-6.14/devres-introduce-devm_kmemdup_array.patch create mode 100644 queue-6.14/dmaengine-fsl-edma-fix-return-code-for-unhandled-int.patch create mode 100644 queue-6.14/dmaengine-idxd-fix-allowing-write-from-different-add.patch create mode 100644 queue-6.14/dmaengine-idxd-fix-poll-return-value.patch create mode 100644 queue-6.14/driver-core-split-devres-apis-to-device-devres.h.patch create mode 100644 queue-6.14/espintcp-fix-skb-leaks.patch create mode 100644 queue-6.14/espintcp-remove-encap-socket-caching-to-avoid-refere.patch create mode 100644 queue-6.14/ice-fix-lacp-bonds-without-sriov-environment.patch create mode 100644 
queue-6.14/ice-fix-vf-num_mac-count-with-port-representors.patch create mode 100644 queue-6.14/idpf-fix-idpf_vport_splitq_napi_poll.patch create mode 100644 queue-6.14/idpf-fix-null-ptr-deref-in-idpf_features_check.patch create mode 100644 queue-6.14/io_uring-fix-overflow-resched-cqe-reordering.patch create mode 100644 queue-6.14/irqchip-riscv-imsic-start-local-sync-timer-on-correc.patch create mode 100644 queue-6.14/kernel-fork-only-call-untrack_pfn_clear-on-vmas-dupl.patch create mode 100644 queue-6.14/loop-don-t-require-write_iter-for-writable-files-in-.patch create mode 100644 queue-6.14/mr-consolidate-the-ipmr_can_free_table-checks.patch create mode 100644 queue-6.14/net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch create mode 100644 queue-6.14/net-lan743x-restore-sgmii-ctrl-register-on-resume.patch create mode 100644 queue-6.14/net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch create mode 100644 queue-6.14/octeontx2-af-fix-apr-entry-mapping-based-on-apr_lmt_.patch create mode 100644 queue-6.14/octeontx2-af-set-lmt_ena-bit-for-apr-table-entries.patch create mode 100644 queue-6.14/octeontx2-pf-add-af_xdp-non-zero-copy-support.patch create mode 100644 queue-6.14/octeontx2-pf-af_xdp-zero-copy-receive-support.patch create mode 100644 queue-6.14/octeontx2-pf-avoid-adding-dcbnl_ops-for-lbk-and-sdp-.patch create mode 100644 queue-6.14/octeontx2-pf-use-xdp_return_frame-to-free-xdp-buffer.patch create mode 100644 queue-6.14/perf-x86-intel-fix-segfault-with-pebs-via-pt-with-sa.patch create mode 100644 queue-6.14/pinctrl-qcom-switch-to-devm_register_sys_off_handler.patch create mode 100644 queue-6.14/ptp-ocp-limit-signal-freq-counts-in-summary-output-f.patch create mode 100644 queue-6.14/remoteproc-qcom_wcnss-fix-on-platforms-without-fallb.patch create mode 100644 queue-6.14/sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch create mode 100644 queue-6.14/soundwire-bus-fix-race-on-the-creation-of-the-irq-do.patch create mode 100644 queue-6.14/x86-sev-fix-operator-precedence-in-ghcb_msr_vmpl_req.patch create mode 100644 queue-6.14/xfrm-fix-udp-gro-handling-for-some-corner-cases.patch create mode 100644 queue-6.14/xfrm-sanitize-marks-before-insert.patch create mode 100644 queue-6.14/xsk-bring-back-busy-polling-support-in-xdp_copy.patch diff --git a/queue-6.14/asoc-sof-intel-hda-fix-uaf-when-reloading-module.patch b/queue-6.14/asoc-sof-intel-hda-fix-uaf-when-reloading-module.patch new file mode 100644 index 0000000000..88a6b8059e --- /dev/null +++ b/queue-6.14/asoc-sof-intel-hda-fix-uaf-when-reloading-module.patch @@ -0,0 +1,101 @@ +From 8bfd643e3dc927a092e99563d3e5350159157813 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 14 May 2025 09:37:49 -0400 +Subject: ASoC: SOF: Intel: hda: Fix UAF when reloading module + +From: Tavian Barnes + +[ Upstream commit 7dd7f39fce0022b386ef1ea5ffef92ecc7dfc6af ] + +hda_generic_machine_select() appends -idisp to the tplg filename by +allocating a new string with devm_kasprintf(), then stores the string +right back into the global variable snd_soc_acpi_intel_hda_machines. +When the module is unloaded, this memory is freed, resulting in a global +variable pointing to freed memory. 
Reloading the module then triggers +a use-after-free: + +BUG: KFENCE: use-after-free read in string+0x48/0xe0 + +Use-after-free read at 0x00000000967e0109 (in kfence-#99): + string+0x48/0xe0 + vsnprintf+0x329/0x6e0 + devm_kvasprintf+0x54/0xb0 + devm_kasprintf+0x58/0x80 + hda_machine_select.cold+0x198/0x17a2 [snd_sof_intel_hda_generic] + sof_probe_work+0x7f/0x600 [snd_sof] + process_one_work+0x17b/0x330 + worker_thread+0x2ce/0x3f0 + kthread+0xcf/0x100 + ret_from_fork+0x31/0x50 + ret_from_fork_asm+0x1a/0x30 + +kfence-#99: 0x00000000198a940f-0x00000000ace47d9d, size=64, cache=kmalloc-64 + +allocated by task 333 on cpu 8 at 17.798069s (130.453553s ago): + devm_kmalloc+0x52/0x120 + devm_kvasprintf+0x66/0xb0 + devm_kasprintf+0x58/0x80 + hda_machine_select.cold+0x198/0x17a2 [snd_sof_intel_hda_generic] + sof_probe_work+0x7f/0x600 [snd_sof] + process_one_work+0x17b/0x330 + worker_thread+0x2ce/0x3f0 + kthread+0xcf/0x100 + ret_from_fork+0x31/0x50 + ret_from_fork_asm+0x1a/0x30 + +freed by task 1543 on cpu 4 at 141.586686s (6.665010s ago): + release_nodes+0x43/0xb0 + devres_release_all+0x90/0xf0 + device_unbind_cleanup+0xe/0x70 + device_release_driver_internal+0x1c1/0x200 + driver_detach+0x48/0x90 + bus_remove_driver+0x6d/0xf0 + pci_unregister_driver+0x42/0xb0 + __do_sys_delete_module+0x1d1/0x310 + do_syscall_64+0x82/0x190 + entry_SYSCALL_64_after_hwframe+0x76/0x7e + +Fix it by copying the match array with devm_kmemdup_array() before we +modify it. + +Fixes: 5458411d7594 ("ASoC: SOF: Intel: hda: refactoring topology name fixup for HDA mach") +Suggested-by: Peter Ujfalusi +Acked-by: Peter Ujfalusi +Signed-off-by: Tavian Barnes +Link: https://patch.msgid.link/570b15570b274520a0d9052f4e0f064a29c950ef.1747229716.git.tavianator@tavianator.com +Signed-off-by: Mark Brown +Signed-off-by: Sasha Levin +--- + sound/soc/sof/intel/hda.c | 16 +++++++++++++++- + 1 file changed, 15 insertions(+), 1 deletion(-) + +diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c +index a1ccd95da8bb7..9ea194dfbd2ec 100644 +--- a/sound/soc/sof/intel/hda.c ++++ b/sound/soc/sof/intel/hda.c +@@ -1011,7 +1011,21 @@ static void hda_generic_machine_select(struct snd_sof_dev *sdev, + if (!*mach && codec_num <= 2) { + bool tplg_fixup = false; + +- hda_mach = snd_soc_acpi_intel_hda_machines; ++ /* ++ * make a local copy of the match array since we might ++ * be modifying it ++ */ ++ hda_mach = devm_kmemdup_array(sdev->dev, ++ snd_soc_acpi_intel_hda_machines, ++ 2, /* we have one entry + sentinel in the array */ ++ sizeof(snd_soc_acpi_intel_hda_machines[0]), ++ GFP_KERNEL); ++ if (!hda_mach) { ++ dev_err(bus->dev, ++ "%s: failed to duplicate the HDA match table\n", ++ __func__); ++ return; ++ } + + dev_info(bus->dev, "using HDA machine driver %s now\n", + hda_mach->drv_name); +-- +2.39.5 + diff --git a/queue-6.14/bluetooth-btusb-use-skb_pull-to-avoid-unsafe-access-.patch b/queue-6.14/bluetooth-btusb-use-skb_pull-to-avoid-unsafe-access-.patch new file mode 100644 index 0000000000..6c91aeddce --- /dev/null +++ b/queue-6.14/bluetooth-btusb-use-skb_pull-to-avoid-unsafe-access-.patch @@ -0,0 +1,193 @@ +From 2c840e2ef65e2d88761d4d6f92ac40690f02049e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 8 May 2025 22:15:20 +0800 +Subject: Bluetooth: btusb: use skb_pull to avoid unsafe access in QCA dump + handling + +From: En-Wei Wu + +[ Upstream commit 4bcb0c7dc25446b99fc7a8fa2a143d69f3314162 ] + +Use skb_pull() and skb_pull_data() to safely parse QCA dump packets. 
+ +This avoids direct pointer math on skb->data, which could lead to +invalid access if the packet is shorter than expected. + +Fixes: 20981ce2d5a5 ("Bluetooth: btusb: Add WCN6855 devcoredump support") +Signed-off-by: En-Wei Wu +Signed-off-by: Luiz Augusto von Dentz +Signed-off-by: Sasha Levin +--- + drivers/bluetooth/btusb.c | 98 ++++++++++++++++----------------------- + 1 file changed, 40 insertions(+), 58 deletions(-) + +diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c +index ccd0a21da3955..b15f3ed767c53 100644 +--- a/drivers/bluetooth/btusb.c ++++ b/drivers/bluetooth/btusb.c +@@ -3014,9 +3014,8 @@ static void btusb_coredump_qca(struct hci_dev *hdev) + static int handle_dump_pkt_qca(struct hci_dev *hdev, struct sk_buff *skb) + { + int ret = 0; ++ unsigned int skip = 0; + u8 pkt_type; +- u8 *sk_ptr; +- unsigned int sk_len; + u16 seqno; + u32 dump_size; + +@@ -3025,18 +3024,13 @@ static int handle_dump_pkt_qca(struct hci_dev *hdev, struct sk_buff *skb) + struct usb_device *udev = btdata->udev; + + pkt_type = hci_skb_pkt_type(skb); +- sk_ptr = skb->data; +- sk_len = skb->len; ++ skip = sizeof(struct hci_event_hdr); ++ if (pkt_type == HCI_ACLDATA_PKT) ++ skip += sizeof(struct hci_acl_hdr); + +- if (pkt_type == HCI_ACLDATA_PKT) { +- sk_ptr += HCI_ACL_HDR_SIZE; +- sk_len -= HCI_ACL_HDR_SIZE; +- } +- +- sk_ptr += HCI_EVENT_HDR_SIZE; +- sk_len -= HCI_EVENT_HDR_SIZE; ++ skb_pull(skb, skip); ++ dump_hdr = (struct qca_dump_hdr *)skb->data; + +- dump_hdr = (struct qca_dump_hdr *)sk_ptr; + seqno = le16_to_cpu(dump_hdr->seqno); + if (seqno == 0) { + set_bit(BTUSB_HW_SSR_ACTIVE, &btdata->flags); +@@ -3056,16 +3050,15 @@ static int handle_dump_pkt_qca(struct hci_dev *hdev, struct sk_buff *skb) + + btdata->qca_dump.ram_dump_size = dump_size; + btdata->qca_dump.ram_dump_seqno = 0; +- sk_ptr += offsetof(struct qca_dump_hdr, data0); +- sk_len -= offsetof(struct qca_dump_hdr, data0); ++ ++ skb_pull(skb, offsetof(struct qca_dump_hdr, data0)); + + usb_disable_autosuspend(udev); + bt_dev_info(hdev, "%s memdump size(%u)\n", + (pkt_type == HCI_ACLDATA_PKT) ? "ACL" : "event", + dump_size); + } else { +- sk_ptr += offsetof(struct qca_dump_hdr, data); +- sk_len -= offsetof(struct qca_dump_hdr, data); ++ skb_pull(skb, offsetof(struct qca_dump_hdr, data)); + } + + if (!btdata->qca_dump.ram_dump_size) { +@@ -3085,7 +3078,6 @@ static int handle_dump_pkt_qca(struct hci_dev *hdev, struct sk_buff *skb) + return ret; + } + +- skb_pull(skb, skb->len - sk_len); + hci_devcd_append(hdev, skb); + btdata->qca_dump.ram_dump_seqno++; + if (seqno == QCA_LAST_SEQUENCE_NUM) { +@@ -3113,68 +3105,58 @@ static int handle_dump_pkt_qca(struct hci_dev *hdev, struct sk_buff *skb) + /* Return: true if the ACL packet is a dump packet, false otherwise. 
*/ + static bool acl_pkt_is_dump_qca(struct hci_dev *hdev, struct sk_buff *skb) + { +- u8 *sk_ptr; +- unsigned int sk_len; +- + struct hci_event_hdr *event_hdr; + struct hci_acl_hdr *acl_hdr; + struct qca_dump_hdr *dump_hdr; ++ struct sk_buff *clone = skb_clone(skb, GFP_ATOMIC); ++ bool is_dump = false; + +- sk_ptr = skb->data; +- sk_len = skb->len; +- +- acl_hdr = hci_acl_hdr(skb); +- if (le16_to_cpu(acl_hdr->handle) != QCA_MEMDUMP_ACL_HANDLE) ++ if (!clone) + return false; + +- sk_ptr += HCI_ACL_HDR_SIZE; +- sk_len -= HCI_ACL_HDR_SIZE; +- event_hdr = (struct hci_event_hdr *)sk_ptr; +- +- if ((event_hdr->evt != HCI_VENDOR_PKT) || +- (event_hdr->plen != (sk_len - HCI_EVENT_HDR_SIZE))) +- return false; ++ acl_hdr = skb_pull_data(clone, sizeof(*acl_hdr)); ++ if (!acl_hdr || (le16_to_cpu(acl_hdr->handle) != QCA_MEMDUMP_ACL_HANDLE)) ++ goto out; + +- sk_ptr += HCI_EVENT_HDR_SIZE; +- sk_len -= HCI_EVENT_HDR_SIZE; ++ event_hdr = skb_pull_data(clone, sizeof(*event_hdr)); ++ if (!event_hdr || (event_hdr->evt != HCI_VENDOR_PKT)) ++ goto out; + +- dump_hdr = (struct qca_dump_hdr *)sk_ptr; +- if ((sk_len < offsetof(struct qca_dump_hdr, data)) || +- (dump_hdr->vse_class != QCA_MEMDUMP_VSE_CLASS) || +- (dump_hdr->msg_type != QCA_MEMDUMP_MSG_TYPE)) +- return false; ++ dump_hdr = skb_pull_data(clone, sizeof(*dump_hdr)); ++ if (!dump_hdr || (dump_hdr->vse_class != QCA_MEMDUMP_VSE_CLASS) || ++ (dump_hdr->msg_type != QCA_MEMDUMP_MSG_TYPE)) ++ goto out; + +- return true; ++ is_dump = true; ++out: ++ consume_skb(clone); ++ return is_dump; + } + + /* Return: true if the event packet is a dump packet, false otherwise. */ + static bool evt_pkt_is_dump_qca(struct hci_dev *hdev, struct sk_buff *skb) + { +- u8 *sk_ptr; +- unsigned int sk_len; +- + struct hci_event_hdr *event_hdr; + struct qca_dump_hdr *dump_hdr; ++ struct sk_buff *clone = skb_clone(skb, GFP_ATOMIC); ++ bool is_dump = false; + +- sk_ptr = skb->data; +- sk_len = skb->len; +- +- event_hdr = hci_event_hdr(skb); +- +- if ((event_hdr->evt != HCI_VENDOR_PKT) +- || (event_hdr->plen != (sk_len - HCI_EVENT_HDR_SIZE))) ++ if (!clone) + return false; + +- sk_ptr += HCI_EVENT_HDR_SIZE; +- sk_len -= HCI_EVENT_HDR_SIZE; ++ event_hdr = skb_pull_data(clone, sizeof(*event_hdr)); ++ if (!event_hdr || (event_hdr->evt != HCI_VENDOR_PKT)) ++ goto out; + +- dump_hdr = (struct qca_dump_hdr *)sk_ptr; +- if ((sk_len < offsetof(struct qca_dump_hdr, data)) || +- (dump_hdr->vse_class != QCA_MEMDUMP_VSE_CLASS) || +- (dump_hdr->msg_type != QCA_MEMDUMP_MSG_TYPE)) +- return false; ++ dump_hdr = skb_pull_data(clone, sizeof(*dump_hdr)); ++ if (!dump_hdr || (dump_hdr->vse_class != QCA_MEMDUMP_VSE_CLASS) || ++ (dump_hdr->msg_type != QCA_MEMDUMP_MSG_TYPE)) ++ goto out; + +- return true; ++ is_dump = true; ++out: ++ consume_skb(clone); ++ return is_dump; + } + + static int btusb_recv_acl_qca(struct hci_dev *hdev, struct sk_buff *skb) +-- +2.39.5 + diff --git a/queue-6.14/bluetooth-l2cap-fix-not-checking-l2cap_chan-security.patch b/queue-6.14/bluetooth-l2cap-fix-not-checking-l2cap_chan-security.patch new file mode 100644 index 0000000000..db0366627c --- /dev/null +++ b/queue-6.14/bluetooth-l2cap-fix-not-checking-l2cap_chan-security.patch @@ -0,0 +1,92 @@ +From f242070ec5de5fb08cd735ae2bca9d0ff15bf73b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 7 May 2025 15:00:30 -0400 +Subject: Bluetooth: L2CAP: Fix not checking l2cap_chan security level + +From: Luiz Augusto von Dentz + +[ Upstream commit 7af8479d9eb4319b4ba7b47a8c4d2c55af1c31e1 ] + +l2cap_check_enc_key_size shall check the 
security level of the
+l2cap_chan rather than the hci_conn, since for an incoming connection
+request it may be different, as the hci_conn may already have been
+encrypted using a different security level.
+
+Fixes: 522e9ed157e3 ("Bluetooth: l2cap: Check encryption key size on incoming connection")
+Signed-off-by: Luiz Augusto von Dentz
+Signed-off-by: Sasha Levin
+---
+ net/bluetooth/l2cap_core.c | 15 ++++++++-------
+ 1 file changed, 8 insertions(+), 7 deletions(-)
+
+diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
+index c219a8c596d3e..66fa5d6fea6ca 100644
+--- a/net/bluetooth/l2cap_core.c
++++ b/net/bluetooth/l2cap_core.c
+@@ -1411,7 +1411,8 @@ static void l2cap_request_info(struct l2cap_conn *conn)
+ sizeof(req), &req);
+ }
+
+-static bool l2cap_check_enc_key_size(struct hci_conn *hcon)
++static bool l2cap_check_enc_key_size(struct hci_conn *hcon,
++ struct l2cap_chan *chan)
+ {
+ /* The minimum encryption key size needs to be enforced by the
+ * host stack before establishing any L2CAP connections. The
+@@ -1425,7 +1426,7 @@ static bool l2cap_check_enc_key_size(struct hci_conn *hcon)
+ int min_key_size = hcon->hdev->min_enc_key_size;
+
+ /* On FIPS security level, key size must be 16 bytes */
+- if (hcon->sec_level == BT_SECURITY_FIPS)
++ if (chan->sec_level == BT_SECURITY_FIPS)
+ min_key_size = 16;
+
+ return (!test_bit(HCI_CONN_ENCRYPT, &hcon->flags) ||
+@@ -1453,7 +1454,7 @@ static void l2cap_do_start(struct l2cap_chan *chan)
+ !__l2cap_no_conn_pending(chan))
+ return;
+
+- if (l2cap_check_enc_key_size(conn->hcon))
++ if (l2cap_check_enc_key_size(conn->hcon, chan))
+ l2cap_start_connection(chan);
+ else
+ __set_chan_timer(chan, L2CAP_DISC_TIMEOUT);
+@@ -1528,7 +1529,7 @@ static void l2cap_conn_start(struct l2cap_conn *conn)
+ continue;
+ }
+
+- if (l2cap_check_enc_key_size(conn->hcon))
++ if (l2cap_check_enc_key_size(conn->hcon, chan))
+ l2cap_start_connection(chan);
+ else
+ l2cap_chan_close(chan, ECONNREFUSED);
+@@ -3957,7 +3958,7 @@ static void l2cap_connect(struct l2cap_conn *conn, struct l2cap_cmd_hdr *cmd,
+ /* Check if the ACL is secure enough (if not SDP) */
+ if (psm != cpu_to_le16(L2CAP_PSM_SDP) &&
+ (!hci_conn_check_link_mode(conn->hcon) ||
+- !l2cap_check_enc_key_size(conn->hcon))) {
++ !l2cap_check_enc_key_size(conn->hcon, pchan))) {
+ conn->disc_reason = HCI_ERROR_AUTH_FAILURE;
+ result = L2CAP_CR_SEC_BLOCK;
+ goto response;
+@@ -7317,7 +7318,7 @@ static void l2cap_security_cfm(struct hci_conn *hcon, u8 status, u8 encrypt)
+ }
+
+ if (chan->state == BT_CONNECT) {
+- if (!status && l2cap_check_enc_key_size(hcon))
++ if (!status && l2cap_check_enc_key_size(hcon, chan))
+ l2cap_start_connection(chan);
+ else
+ __set_chan_timer(chan, L2CAP_DISC_TIMEOUT);
+@@ -7327,7 +7328,7 @@ static void l2cap_security_cfm(struct hci_conn *hcon, u8 status, u8 encrypt)
+ struct l2cap_conn_rsp rsp;
+ __u16 res, stat;
+
+- if (!status && l2cap_check_enc_key_size(hcon)) {
++ if (!status && l2cap_check_enc_key_size(hcon, chan)) {
+ if (test_bit(FLAG_DEFER_SETUP, &chan->flags)) {
+ res = L2CAP_CR_PEND;
+ stat = L2CAP_CS_AUTHOR_PEND;
+--
+2.39.5
+
diff --git a/queue-6.14/bridge-netfilter-fix-forwarding-of-fragmented-packet.patch b/queue-6.14/bridge-netfilter-fix-forwarding-of-fragmented-packet.patch
new file mode 100644
index 0000000000..3cabe3fd19
--- /dev/null
+++ b/queue-6.14/bridge-netfilter-fix-forwarding-of-fragmented-packet.patch
@@ -0,0 +1,95 @@
+From 273fa934c3b930bec21deff53486aa469528e45f Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Thu, 15 May 2025 11:48:48 +0300
+Subject: bridge: netfilter: Fix forwarding of fragmented packets + +From: Ido Schimmel + +[ Upstream commit 91b6dbced0ef1d680afdd69b14fc83d50ebafaf3 ] + +When netfilter defrag hooks are loaded (due to the presence of conntrack +rules, for example), fragmented packets entering the bridge will be +defragged by the bridge's pre-routing hook (br_nf_pre_routing() -> +ipv4_conntrack_defrag()). + +Later on, in the bridge's post-routing hook, the defragged packet will +be fragmented again. If the size of the largest fragment is larger than +what the kernel has determined as the destination MTU (using +ip_skb_dst_mtu()), the defragged packet will be dropped. + +Before commit ac6627a28dbf ("net: ipv4: Consolidate ipv4_mtu and +ip_dst_mtu_maybe_forward"), ip_skb_dst_mtu() would return dst_mtu() as +the destination MTU. Assuming the dst entry attached to the packet is +the bridge's fake rtable one, this would simply be the bridge's MTU (see +fake_mtu()). + +However, after above mentioned commit, ip_skb_dst_mtu() ends up +returning the route's MTU stored in the dst entry's metrics. Ideally, in +case the dst entry is the bridge's fake rtable one, this should be the +bridge's MTU as the bridge takes care of updating this metric when its +MTU changes (see br_change_mtu()). + +Unfortunately, the last operation is a no-op given the metrics attached +to the fake rtable entry are marked as read-only. Therefore, +ip_skb_dst_mtu() ends up returning 1500 (the initial MTU value) and +defragged packets are dropped during fragmentation when dealing with +large fragments and high MTU (e.g., 9k). + +Fix by moving the fake rtable entry's metrics to be per-bridge (in a +similar fashion to the fake rtable entry itself) and marking them as +writable, thereby allowing MTU changes to be reflected. + +Fixes: 62fa8a846d7d ("net: Implement read-only protection and COW'ing of metrics.") +Fixes: 33eb9873a283 ("bridge: initialize fake_rtable metrics") +Reported-by: Venkat Venkatsubra +Closes: https://lore.kernel.org/netdev/PH0PR10MB4504888284FF4CBA648197D0ACB82@PH0PR10MB4504.namprd10.prod.outlook.com/ +Tested-by: Venkat Venkatsubra +Signed-off-by: Ido Schimmel +Acked-by: Nikolay Aleksandrov +Link: https://patch.msgid.link/20250515084848.727706-1-idosch@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/bridge/br_nf_core.c | 7 ++----- + net/bridge/br_private.h | 1 + + 2 files changed, 3 insertions(+), 5 deletions(-) + +diff --git a/net/bridge/br_nf_core.c b/net/bridge/br_nf_core.c +index 98aea5485aaef..a8c67035e23c0 100644 +--- a/net/bridge/br_nf_core.c ++++ b/net/bridge/br_nf_core.c +@@ -65,17 +65,14 @@ static struct dst_ops fake_dst_ops = { + * ipt_REJECT needs it. Future netfilter modules might + * require us to fill additional fields. 
+ */ +-static const u32 br_dst_default_metrics[RTAX_MAX] = { +- [RTAX_MTU - 1] = 1500, +-}; +- + void br_netfilter_rtable_init(struct net_bridge *br) + { + struct rtable *rt = &br->fake_rtable; + + rcuref_init(&rt->dst.__rcuref, 1); + rt->dst.dev = br->dev; +- dst_init_metrics(&rt->dst, br_dst_default_metrics, true); ++ dst_init_metrics(&rt->dst, br->metrics, false); ++ dst_metric_set(&rt->dst, RTAX_MTU, br->dev->mtu); + rt->dst.flags = DST_NOXFRM | DST_FAKE_RTABLE; + rt->dst.ops = &fake_dst_ops; + } +diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h +index d5b3c5936a79e..4715a8d6dc326 100644 +--- a/net/bridge/br_private.h ++++ b/net/bridge/br_private.h +@@ -505,6 +505,7 @@ struct net_bridge { + struct rtable fake_rtable; + struct rt6_info fake_rt6_info; + }; ++ u32 metrics[RTAX_MAX]; + #endif + u16 group_fwd_mask; + u16 group_fwd_mask_required; +-- +2.39.5 + diff --git a/queue-6.14/clk-sunxi-ng-d1-add-missing-divider-for-mmc-mod-cloc.patch b/queue-6.14/clk-sunxi-ng-d1-add-missing-divider-for-mmc-mod-cloc.patch new file mode 100644 index 0000000000..bf63fbab2d --- /dev/null +++ b/queue-6.14/clk-sunxi-ng-d1-add-missing-divider-for-mmc-mod-cloc.patch @@ -0,0 +1,131 @@ +From 356a200d007c613d3794cb712350fba2ba15e4d8 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 1 May 2025 13:06:31 +0100 +Subject: clk: sunxi-ng: d1: Add missing divider for MMC mod clocks +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Andre Przywara + +[ Upstream commit 98e6da673cc6dd46ca9a599802bd2c8f83606710 ] + +The D1/R528/T113 SoCs have a hidden divider of 2 in the MMC mod clocks, +just as other recent SoCs. So far we did not describe that, which led +to the resulting MMC clock rate to be only half of its intended value. + +Use a macro that allows to describe a fixed post-divider, to compensate +for that divisor. + +This brings the MMC performance on those SoCs to its expected level, +so about 23 MB/s for SD cards, instead of the 11 MB/s measured so far. 
+ +Fixes: 35b97bb94111 ("clk: sunxi-ng: Add support for the D1 SoC clocks") +Reported-by: Kuba Szczodrzyński +Signed-off-by: Andre Przywara +Link: https://patch.msgid.link/20250501120631.837186-1-andre.przywara@arm.com +Signed-off-by: Chen-Yu Tsai +Signed-off-by: Sasha Levin +--- + drivers/clk/sunxi-ng/ccu-sun20i-d1.c | 44 ++++++++++++++++------------ + drivers/clk/sunxi-ng/ccu_mp.h | 22 ++++++++++++++ + 2 files changed, 47 insertions(+), 19 deletions(-) + +diff --git a/drivers/clk/sunxi-ng/ccu-sun20i-d1.c b/drivers/clk/sunxi-ng/ccu-sun20i-d1.c +index bb66c906ebbb6..e83d4fd40240f 100644 +--- a/drivers/clk/sunxi-ng/ccu-sun20i-d1.c ++++ b/drivers/clk/sunxi-ng/ccu-sun20i-d1.c +@@ -412,19 +412,23 @@ static const struct clk_parent_data mmc0_mmc1_parents[] = { + { .hw = &pll_periph0_2x_clk.common.hw }, + { .hw = &pll_audio1_div2_clk.common.hw }, + }; +-static SUNXI_CCU_MP_DATA_WITH_MUX_GATE(mmc0_clk, "mmc0", mmc0_mmc1_parents, 0x830, +- 0, 4, /* M */ +- 8, 2, /* P */ +- 24, 3, /* mux */ +- BIT(31), /* gate */ +- 0); +- +-static SUNXI_CCU_MP_DATA_WITH_MUX_GATE(mmc1_clk, "mmc1", mmc0_mmc1_parents, 0x834, +- 0, 4, /* M */ +- 8, 2, /* P */ +- 24, 3, /* mux */ +- BIT(31), /* gate */ +- 0); ++static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0", ++ mmc0_mmc1_parents, 0x830, ++ 0, 4, /* M */ ++ 8, 2, /* P */ ++ 24, 3, /* mux */ ++ BIT(31), /* gate */ ++ 2, /* post-div */ ++ 0); ++ ++static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1", ++ mmc0_mmc1_parents, 0x834, ++ 0, 4, /* M */ ++ 8, 2, /* P */ ++ 24, 3, /* mux */ ++ BIT(31), /* gate */ ++ 2, /* post-div */ ++ 0); + + static const struct clk_parent_data mmc2_parents[] = { + { .fw_name = "hosc" }, +@@ -433,12 +437,14 @@ static const struct clk_parent_data mmc2_parents[] = { + { .hw = &pll_periph0_800M_clk.common.hw }, + { .hw = &pll_audio1_div2_clk.common.hw }, + }; +-static SUNXI_CCU_MP_DATA_WITH_MUX_GATE(mmc2_clk, "mmc2", mmc2_parents, 0x838, +- 0, 4, /* M */ +- 8, 2, /* P */ +- 24, 3, /* mux */ +- BIT(31), /* gate */ +- 0); ++static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_POSTDIV(mmc2_clk, "mmc2", mmc2_parents, ++ 0x838, ++ 0, 4, /* M */ ++ 8, 2, /* P */ ++ 24, 3, /* mux */ ++ BIT(31), /* gate */ ++ 2, /* post-div */ ++ 0); + + static SUNXI_CCU_GATE_HWS(bus_mmc0_clk, "bus-mmc0", psi_ahb_hws, + 0x84c, BIT(0), 0); +diff --git a/drivers/clk/sunxi-ng/ccu_mp.h b/drivers/clk/sunxi-ng/ccu_mp.h +index 6e50f3728fb5f..7d836a9fb3db3 100644 +--- a/drivers/clk/sunxi-ng/ccu_mp.h ++++ b/drivers/clk/sunxi-ng/ccu_mp.h +@@ -52,6 +52,28 @@ struct ccu_mp { + } \ + } + ++#define SUNXI_CCU_MP_DATA_WITH_MUX_GATE_POSTDIV(_struct, _name, _parents, \ ++ _reg, \ ++ _mshift, _mwidth, \ ++ _pshift, _pwidth, \ ++ _muxshift, _muxwidth, \ ++ _gate, _postdiv, _flags)\ ++ struct ccu_mp _struct = { \ ++ .enable = _gate, \ ++ .m = _SUNXI_CCU_DIV(_mshift, _mwidth), \ ++ .p = _SUNXI_CCU_DIV(_pshift, _pwidth), \ ++ .mux = _SUNXI_CCU_MUX(_muxshift, _muxwidth), \ ++ .fixed_post_div = _postdiv, \ ++ .common = { \ ++ .reg = _reg, \ ++ .features = CCU_FEATURE_FIXED_POSTDIV, \ ++ .hw.init = CLK_HW_INIT_PARENTS_DATA(_name, \ ++ _parents, \ ++ &ccu_mp_ops, \ ++ _flags), \ ++ } \ ++ } ++ + #define SUNXI_CCU_MP_WITH_MUX_GATE(_struct, _name, _parents, _reg, \ + _mshift, _mwidth, \ + _pshift, _pwidth, \ +-- +2.39.5 + diff --git a/queue-6.14/devres-introduce-devm_kmemdup_array.patch b/queue-6.14/devres-introduce-devm_kmemdup_array.patch new file mode 100644 index 0000000000..8c67fb67f7 --- /dev/null +++ b/queue-6.14/devres-introduce-devm_kmemdup_array.patch @@ -0,0 +1,42 @@ +From 
c27745a3e9179b49979ff92bca4c9da01de3cd33 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 12 Feb 2025 11:55:05 +0530
+Subject: devres: Introduce devm_kmemdup_array()
+
+From: Raag Jadav
+
+[ Upstream commit a103b833ac3806b816bc993cba77d0b17cf801f1 ]
+
+Introduce an '_array' variant of devm_kmemdup(), which is more robust
+and consistent with the alloc family of helpers.
+
+Suggested-by: Andy Shevchenko
+Signed-off-by: Raag Jadav
+Reviewed-by: Dmitry Torokhov
+Reviewed-by: Linus Walleij
+Signed-off-by: Andy Shevchenko
+Stable-dep-of: 7dd7f39fce00 ("ASoC: SOF: Intel: hda: Fix UAF when reloading module")
+Signed-off-by: Sasha Levin
+---
+ include/linux/device/devres.h | 5 +++++
+ 1 file changed, 5 insertions(+)
+
+diff --git a/include/linux/device/devres.h b/include/linux/device/devres.h
+index 6b0b265058bcc..9b49f99158508 100644
+--- a/include/linux/device/devres.h
++++ b/include/linux/device/devres.h
+@@ -79,6 +79,11 @@ void devm_kfree(struct device *dev, const void *p);
+
+ void * __realloc_size(3)
+ devm_kmemdup(struct device *dev, const void *src, size_t len, gfp_t gfp);
++static inline void *devm_kmemdup_array(struct device *dev, const void *src,
++ size_t n, size_t size, gfp_t flags)
++{
++ return devm_kmemdup(dev, src, size_mul(size, n), flags);
++}
+
+ char * __malloc
+ devm_kstrdup(struct device *dev, const char *s, gfp_t gfp);
+--
+2.39.5
+
diff --git a/queue-6.14/dmaengine-fsl-edma-fix-return-code-for-unhandled-int.patch b/queue-6.14/dmaengine-fsl-edma-fix-return-code-for-unhandled-int.patch
new file mode 100644
index 0000000000..e4544dcfcf
--- /dev/null
+++ b/queue-6.14/dmaengine-fsl-edma-fix-return-code-for-unhandled-int.patch
@@ -0,0 +1,40 @@
+From c5b64048540f35dc96d9c112f3abc02ad4d2a7b3 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Thu, 24 Apr 2025 13:48:29 +0200
+Subject: dmaengine: fsl-edma: Fix return code for unhandled interrupts
+
+From: Stefan Wahren
+
+[ Upstream commit 5e27af0514e2249a9ccc9a762abd3b74e03a1f90 ]
+
+For fsl,imx93-edma4 two DMA channels share the same interrupt.
+So in case fsl_edma3_tx_handler is called for the "wrong"
+channel, the return code must be IRQ_NONE. This signals that
+the interrupt wasn't handled.
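+
+The canonical shape of a handler on a shared interrupt line is roughly
+the following (a generic sketch with made-up helper names, not code
+from this driver):
+
+	static irqreturn_t tx_handler(int irq, void *dev_id)
+	{
+		struct my_chan *chan = dev_id;
+
+		/* The line is shared: check our own status register. */
+		if (!my_chan_irq_pending(chan))
+			return IRQ_NONE; /* not ours, let other handlers run */
+
+		my_chan_ack_and_complete(chan);
+		return IRQ_HANDLED;
+	}
+
+Claiming IRQ_HANDLED for an interrupt that was not actually handled
+would also defeat the kernel's spurious-interrupt detection.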
+ +Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support") +Signed-off-by: Stefan Wahren +Reviewed-by: Joy Zou +Link: https://lore.kernel.org/r/20250424114829.9055-1-wahrenst@gmx.net +Signed-off-by: Vinod Koul +Signed-off-by: Sasha Levin +--- + drivers/dma/fsl-edma-main.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c +index e3e0e88a76d3c..619403c6e5173 100644 +--- a/drivers/dma/fsl-edma-main.c ++++ b/drivers/dma/fsl-edma-main.c +@@ -57,7 +57,7 @@ static irqreturn_t fsl_edma3_tx_handler(int irq, void *dev_id) + + intr = edma_readl_chreg(fsl_chan, ch_int); + if (!intr) +- return IRQ_HANDLED; ++ return IRQ_NONE; + + edma_writel_chreg(fsl_chan, 1, ch_int); + +-- +2.39.5 + diff --git a/queue-6.14/dmaengine-idxd-fix-allowing-write-from-different-add.patch b/queue-6.14/dmaengine-idxd-fix-allowing-write-from-different-add.patch new file mode 100644 index 0000000000..ef1646cce9 --- /dev/null +++ b/queue-6.14/dmaengine-idxd-fix-allowing-write-from-different-add.patch @@ -0,0 +1,59 @@ +From abe83c31cb989e72b895bfcf073c7de0a0056245 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 21 Apr 2025 10:03:37 -0700 +Subject: dmaengine: idxd: Fix allowing write() from different address spaces + +From: Vinicius Costa Gomes + +[ Upstream commit 8dfa57aabff625bf445548257f7711ef294cd30e ] + +Check if the process submitting the descriptor belongs to the same +address space as the one that opened the file, reject otherwise. + +Fixes: 6827738dc684 ("dmaengine: idxd: add a write() method for applications to submit work") +Signed-off-by: Vinicius Costa Gomes +Signed-off-by: Dave Jiang +Link: https://lore.kernel.org/r/20250421170337.3008875-1-dave.jiang@intel.com +Signed-off-by: Vinod Koul +Signed-off-by: Sasha Levin +--- + drivers/dma/idxd/cdev.c | 9 +++++++++ + 1 file changed, 9 insertions(+) + +diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c +index ff94ee892339d..b847b74949f19 100644 +--- a/drivers/dma/idxd/cdev.c ++++ b/drivers/dma/idxd/cdev.c +@@ -407,6 +407,9 @@ static int idxd_cdev_mmap(struct file *filp, struct vm_area_struct *vma) + if (!idxd->user_submission_safe && !capable(CAP_SYS_RAWIO)) + return -EPERM; + ++ if (current->mm != ctx->mm) ++ return -EPERM; ++ + rc = check_vma(wq, vma, __func__); + if (rc < 0) + return rc; +@@ -473,6 +476,9 @@ static ssize_t idxd_cdev_write(struct file *filp, const char __user *buf, size_t + ssize_t written = 0; + int i; + ++ if (current->mm != ctx->mm) ++ return -EPERM; ++ + for (i = 0; i < len/sizeof(struct dsa_hw_desc); i++) { + int rc = idxd_submit_user_descriptor(ctx, udesc + i); + +@@ -493,6 +499,9 @@ static __poll_t idxd_cdev_poll(struct file *filp, + struct idxd_device *idxd = wq->idxd; + __poll_t out = 0; + ++ if (current->mm != ctx->mm) ++ return -EPERM; ++ + poll_wait(filp, &wq->err_queue, wait); + spin_lock(&idxd->dev_lock); + if (idxd->sw_err.valid) +-- +2.39.5 + diff --git a/queue-6.14/dmaengine-idxd-fix-poll-return-value.patch b/queue-6.14/dmaengine-idxd-fix-poll-return-value.patch new file mode 100644 index 0000000000..4857a06280 --- /dev/null +++ b/queue-6.14/dmaengine-idxd-fix-poll-return-value.patch @@ -0,0 +1,41 @@ +From 9f4e0f6fa78b4936519d53a6c825054af27fe92a Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 8 May 2025 10:05:48 -0700 +Subject: dmaengine: idxd: Fix ->poll() return value + +From: Dave Jiang + +[ Upstream commit ae74cd15ade833adc289279b5c6f12e78f64d4d7 ] + +The fix to block access from different address space did not return 
a
+correct value for the modified ->poll() callback: the kernel test robot
+reported that a return value of type __poll_t is expected rather than
+int. Fix it to return POLLNVAL to indicate an invalid request.
+
+Fixes: 8dfa57aabff6 ("dmaengine: idxd: Fix allowing write() from different address spaces")
+Reported-by: kernel test robot
+Closes: https://lore.kernel.org/oe-kbuild-all/202505081851.rwD7jVxg-lkp@intel.com/
+Signed-off-by: Dave Jiang
+Link: https://lore.kernel.org/r/20250508170548.2747425-1-dave.jiang@intel.com
+Signed-off-by: Vinod Koul
+Signed-off-by: Sasha Levin
+---
+ drivers/dma/idxd/cdev.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
+index b847b74949f19..cd57067e82180 100644
+--- a/drivers/dma/idxd/cdev.c
++++ b/drivers/dma/idxd/cdev.c
+@@ -500,7 +500,7 @@ static __poll_t idxd_cdev_poll(struct file *filp,
+ __poll_t out = 0;
+
+ if (current->mm != ctx->mm)
+- return -EPERM;
++ return POLLNVAL;
+
+ poll_wait(filp, &wq->err_queue, wait);
+ spin_lock(&idxd->dev_lock);
+--
+2.39.5
+
diff --git a/queue-6.14/driver-core-split-devres-apis-to-device-devres.h.patch b/queue-6.14/driver-core-split-devres-apis-to-device-devres.h.patch
new file mode 100644
index 0000000000..b93b0dcf49
--- /dev/null
+++ b/queue-6.14/driver-core-split-devres-apis-to-device-devres.h.patch
@@ -0,0 +1,302 @@
+From ad33899b058be93f7616ba65de80ad2509c37bc0 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 12 Feb 2025 11:55:03 +0530
+Subject: driver core: Split devres APIs to device/devres.h
+
+From: Andy Shevchenko
+
+[ Upstream commit a21cad9312767d26b5257ce0662699bb202cdda1 ]
+
+device.h is a huge header that is hard to follow, and it is easy to
+miss something in it. Improve that by splitting the devres APIs out to
+device/devres.h.
+
+In particular, this helps to speed up the build of code that includes
+device.h solely for the devres APIs.
+
+While at it, cast the error pointers to __iomem using IOMEM_ERR_PTR()
+and fix sparse warnings.
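+
+For reference, IOMEM_ERR_PTR() is just ERR_PTR() with an __iomem cast;
+its definition (quoted here from include/linux/io.h) reads:
+
+	#define IOMEM_ERR_PTR(err) (__force void __iomem *)ERR_PTR(err)
+
+which is why the !CONFIG_HAS_IOMEM stubs below return a pointer in the
+address space sparse expects.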
+ +Signed-off-by: Raag Jadav +Acked-by: Arnd Bergmann +Reviewed-by: Greg Kroah-Hartman +Signed-off-by: Andy Shevchenko +Stable-dep-of: 7dd7f39fce00 ("ASoC: SOF: Intel: hda: Fix UAF when reloading module") +Signed-off-by: Sasha Levin +--- + include/linux/device.h | 119 +------------------------------- + include/linux/device/devres.h | 124 ++++++++++++++++++++++++++++++++++ + 2 files changed, 125 insertions(+), 118 deletions(-) + create mode 100644 include/linux/device/devres.h + +diff --git a/include/linux/device.h b/include/linux/device.h +index 80a5b32689866..78ca7fd0e625a 100644 +--- a/include/linux/device.h ++++ b/include/linux/device.h +@@ -26,9 +26,9 @@ + #include + #include + #include +-#include + #include + #include ++#include + #include + #include + #include +@@ -281,123 +281,6 @@ int __must_check device_create_bin_file(struct device *dev, + void device_remove_bin_file(struct device *dev, + const struct bin_attribute *attr); + +-/* device resource management */ +-typedef void (*dr_release_t)(struct device *dev, void *res); +-typedef int (*dr_match_t)(struct device *dev, void *res, void *match_data); +- +-void *__devres_alloc_node(dr_release_t release, size_t size, gfp_t gfp, +- int nid, const char *name) __malloc; +-#define devres_alloc(release, size, gfp) \ +- __devres_alloc_node(release, size, gfp, NUMA_NO_NODE, #release) +-#define devres_alloc_node(release, size, gfp, nid) \ +- __devres_alloc_node(release, size, gfp, nid, #release) +- +-void devres_for_each_res(struct device *dev, dr_release_t release, +- dr_match_t match, void *match_data, +- void (*fn)(struct device *, void *, void *), +- void *data); +-void devres_free(void *res); +-void devres_add(struct device *dev, void *res); +-void *devres_find(struct device *dev, dr_release_t release, +- dr_match_t match, void *match_data); +-void *devres_get(struct device *dev, void *new_res, +- dr_match_t match, void *match_data); +-void *devres_remove(struct device *dev, dr_release_t release, +- dr_match_t match, void *match_data); +-int devres_destroy(struct device *dev, dr_release_t release, +- dr_match_t match, void *match_data); +-int devres_release(struct device *dev, dr_release_t release, +- dr_match_t match, void *match_data); +- +-/* devres group */ +-void * __must_check devres_open_group(struct device *dev, void *id, gfp_t gfp); +-void devres_close_group(struct device *dev, void *id); +-void devres_remove_group(struct device *dev, void *id); +-int devres_release_group(struct device *dev, void *id); +- +-/* managed devm_k.alloc/kfree for device drivers */ +-void *devm_kmalloc(struct device *dev, size_t size, gfp_t gfp) __alloc_size(2); +-void *devm_krealloc(struct device *dev, void *ptr, size_t size, +- gfp_t gfp) __must_check __realloc_size(3); +-__printf(3, 0) char *devm_kvasprintf(struct device *dev, gfp_t gfp, +- const char *fmt, va_list ap) __malloc; +-__printf(3, 4) char *devm_kasprintf(struct device *dev, gfp_t gfp, +- const char *fmt, ...) 
__malloc; +-static inline void *devm_kzalloc(struct device *dev, size_t size, gfp_t gfp) +-{ +- return devm_kmalloc(dev, size, gfp | __GFP_ZERO); +-} +-static inline void *devm_kmalloc_array(struct device *dev, +- size_t n, size_t size, gfp_t flags) +-{ +- size_t bytes; +- +- if (unlikely(check_mul_overflow(n, size, &bytes))) +- return NULL; +- +- return devm_kmalloc(dev, bytes, flags); +-} +-static inline void *devm_kcalloc(struct device *dev, +- size_t n, size_t size, gfp_t flags) +-{ +- return devm_kmalloc_array(dev, n, size, flags | __GFP_ZERO); +-} +-static inline __realloc_size(3, 4) void * __must_check +-devm_krealloc_array(struct device *dev, void *p, size_t new_n, size_t new_size, gfp_t flags) +-{ +- size_t bytes; +- +- if (unlikely(check_mul_overflow(new_n, new_size, &bytes))) +- return NULL; +- +- return devm_krealloc(dev, p, bytes, flags); +-} +- +-void devm_kfree(struct device *dev, const void *p); +-char *devm_kstrdup(struct device *dev, const char *s, gfp_t gfp) __malloc; +-const char *devm_kstrdup_const(struct device *dev, const char *s, gfp_t gfp); +-void *devm_kmemdup(struct device *dev, const void *src, size_t len, gfp_t gfp) +- __realloc_size(3); +- +-unsigned long devm_get_free_pages(struct device *dev, +- gfp_t gfp_mask, unsigned int order); +-void devm_free_pages(struct device *dev, unsigned long addr); +- +-#ifdef CONFIG_HAS_IOMEM +-void __iomem *devm_ioremap_resource(struct device *dev, +- const struct resource *res); +-void __iomem *devm_ioremap_resource_wc(struct device *dev, +- const struct resource *res); +- +-void __iomem *devm_of_iomap(struct device *dev, +- struct device_node *node, int index, +- resource_size_t *size); +-#else +- +-static inline +-void __iomem *devm_ioremap_resource(struct device *dev, +- const struct resource *res) +-{ +- return ERR_PTR(-EINVAL); +-} +- +-static inline +-void __iomem *devm_ioremap_resource_wc(struct device *dev, +- const struct resource *res) +-{ +- return ERR_PTR(-EINVAL); +-} +- +-static inline +-void __iomem *devm_of_iomap(struct device *dev, +- struct device_node *node, int index, +- resource_size_t *size) +-{ +- return ERR_PTR(-EINVAL); +-} +- +-#endif +- + /* allows to add/remove a custom action to devres stack */ + int devm_remove_action_nowarn(struct device *dev, void (*action)(void *), void *data); + +diff --git a/include/linux/device/devres.h b/include/linux/device/devres.h +new file mode 100644 +index 0000000000000..6b0b265058bcc +--- /dev/null ++++ b/include/linux/device/devres.h +@@ -0,0 +1,124 @@ ++/* SPDX-License-Identifier: GPL-2.0 */ ++#ifndef _DEVICE_DEVRES_H_ ++#define _DEVICE_DEVRES_H_ ++ ++#include ++#include ++#include ++#include ++#include ++#include ++ ++struct device; ++struct device_node; ++struct resource; ++ ++/* device resource management */ ++typedef void (*dr_release_t)(struct device *dev, void *res); ++typedef int (*dr_match_t)(struct device *dev, void *res, void *match_data); ++ ++void * __malloc ++__devres_alloc_node(dr_release_t release, size_t size, gfp_t gfp, int nid, const char *name); ++#define devres_alloc(release, size, gfp) \ ++ __devres_alloc_node(release, size, gfp, NUMA_NO_NODE, #release) ++#define devres_alloc_node(release, size, gfp, nid) \ ++ __devres_alloc_node(release, size, gfp, nid, #release) ++ ++void devres_for_each_res(struct device *dev, dr_release_t release, ++ dr_match_t match, void *match_data, ++ void (*fn)(struct device *, void *, void *), ++ void *data); ++void devres_free(void *res); ++void devres_add(struct device *dev, void *res); ++void *devres_find(struct 
device *dev, dr_release_t release, dr_match_t match, void *match_data); ++void *devres_get(struct device *dev, void *new_res, dr_match_t match, void *match_data); ++void *devres_remove(struct device *dev, dr_release_t release, dr_match_t match, void *match_data); ++int devres_destroy(struct device *dev, dr_release_t release, dr_match_t match, void *match_data); ++int devres_release(struct device *dev, dr_release_t release, dr_match_t match, void *match_data); ++ ++/* devres group */ ++void * __must_check devres_open_group(struct device *dev, void *id, gfp_t gfp); ++void devres_close_group(struct device *dev, void *id); ++void devres_remove_group(struct device *dev, void *id); ++int devres_release_group(struct device *dev, void *id); ++ ++/* managed devm_k.alloc/kfree for device drivers */ ++void * __alloc_size(2) ++devm_kmalloc(struct device *dev, size_t size, gfp_t gfp); ++void * __must_check __realloc_size(3) ++devm_krealloc(struct device *dev, void *ptr, size_t size, gfp_t gfp); ++static inline void *devm_kzalloc(struct device *dev, size_t size, gfp_t gfp) ++{ ++ return devm_kmalloc(dev, size, gfp | __GFP_ZERO); ++} ++static inline void *devm_kmalloc_array(struct device *dev, size_t n, size_t size, gfp_t flags) ++{ ++ size_t bytes; ++ ++ if (unlikely(check_mul_overflow(n, size, &bytes))) ++ return NULL; ++ ++ return devm_kmalloc(dev, bytes, flags); ++} ++static inline void *devm_kcalloc(struct device *dev, size_t n, size_t size, gfp_t flags) ++{ ++ return devm_kmalloc_array(dev, n, size, flags | __GFP_ZERO); ++} ++static inline __realloc_size(3, 4) void * __must_check ++devm_krealloc_array(struct device *dev, void *p, size_t new_n, size_t new_size, gfp_t flags) ++{ ++ size_t bytes; ++ ++ if (unlikely(check_mul_overflow(new_n, new_size, &bytes))) ++ return NULL; ++ ++ return devm_krealloc(dev, p, bytes, flags); ++} ++ ++void devm_kfree(struct device *dev, const void *p); ++ ++void * __realloc_size(3) ++devm_kmemdup(struct device *dev, const void *src, size_t len, gfp_t gfp); ++ ++char * __malloc ++devm_kstrdup(struct device *dev, const char *s, gfp_t gfp); ++const char *devm_kstrdup_const(struct device *dev, const char *s, gfp_t gfp); ++char * __printf(3, 0) __malloc ++devm_kvasprintf(struct device *dev, gfp_t gfp, const char *fmt, va_list ap); ++char * __printf(3, 4) __malloc ++devm_kasprintf(struct device *dev, gfp_t gfp, const char *fmt, ...); ++ ++unsigned long devm_get_free_pages(struct device *dev, gfp_t gfp_mask, unsigned int order); ++void devm_free_pages(struct device *dev, unsigned long addr); ++ ++#ifdef CONFIG_HAS_IOMEM ++ ++void __iomem *devm_ioremap_resource(struct device *dev, const struct resource *res); ++void __iomem *devm_ioremap_resource_wc(struct device *dev, const struct resource *res); ++ ++void __iomem *devm_of_iomap(struct device *dev, struct device_node *node, int index, ++ resource_size_t *size); ++#else ++ ++static inline ++void __iomem *devm_ioremap_resource(struct device *dev, const struct resource *res) ++{ ++ return IOMEM_ERR_PTR(-EINVAL); ++} ++ ++static inline ++void __iomem *devm_ioremap_resource_wc(struct device *dev, const struct resource *res) ++{ ++ return IOMEM_ERR_PTR(-EINVAL); ++} ++ ++static inline ++void __iomem *devm_of_iomap(struct device *dev, struct device_node *node, int index, ++ resource_size_t *size) ++{ ++ return IOMEM_ERR_PTR(-EINVAL); ++} ++ ++#endif ++ ++#endif /* _DEVICE_DEVRES_H_ */ +-- +2.39.5 + diff --git a/queue-6.14/espintcp-fix-skb-leaks.patch b/queue-6.14/espintcp-fix-skb-leaks.patch new file mode 100644 index 
0000000000..e5553d4104
--- /dev/null
+++ b/queue-6.14/espintcp-fix-skb-leaks.patch
@@ -0,0 +1,73 @@
+From c527d680e059202d56404376c47c9ba0f77839f6 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 9 Apr 2025 15:59:56 +0200
+Subject: espintcp: fix skb leaks
+
+From: Sabrina Dubroca
+
+[ Upstream commit 63c1f19a3be3169e51a5812d22a6d0c879414076 ]
+
+A few error paths are missing a kfree_skb.
+
+Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
+Signed-off-by: Sabrina Dubroca
+Reviewed-by: Simon Horman
+Signed-off-by: Steffen Klassert
+Signed-off-by: Sasha Levin
+---
+ net/ipv4/esp4.c | 4 +++-
+ net/ipv6/esp6.c | 4 +++-
+ net/xfrm/espintcp.c | 4 +++-
+ 3 files changed, 9 insertions(+), 3 deletions(-)
+
+diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
+index 0e4076866c0a4..876df672c0bfa 100644
+--- a/net/ipv4/esp4.c
++++ b/net/ipv4/esp4.c
+@@ -199,8 +199,10 @@ static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb)
+
+ sk = esp_find_tcp_sk(x);
+ err = PTR_ERR_OR_ZERO(sk);
+- if (err)
++ if (err) {
++ kfree_skb(skb);
+ goto out;
++ }
+
+ bh_lock_sock(sk);
+ if (sock_owned_by_user(sk))
+diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
+index 9e73944e3b530..574989b82179c 100644
+--- a/net/ipv6/esp6.c
++++ b/net/ipv6/esp6.c
+@@ -216,8 +216,10 @@ static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb)
+
+ sk = esp6_find_tcp_sk(x);
+ err = PTR_ERR_OR_ZERO(sk);
+- if (err)
++ if (err) {
++ kfree_skb(skb);
+ goto out;
++ }
+
+ bh_lock_sock(sk);
+ if (sock_owned_by_user(sk))
+diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
+index fe82e2d073006..fc7a603b04f13 100644
+--- a/net/xfrm/espintcp.c
++++ b/net/xfrm/espintcp.c
+@@ -171,8 +171,10 @@ int espintcp_queue_out(struct sock *sk, struct sk_buff *skb)
+ struct espintcp_ctx *ctx = espintcp_getctx(sk);
+
+ if (skb_queue_len(&ctx->out_queue) >=
+- READ_ONCE(net_hotdata.max_backlog))
++ READ_ONCE(net_hotdata.max_backlog)) {
++ kfree_skb(skb);
+ return -ENOBUFS;
++ }
+
+ __skb_queue_tail(&ctx->out_queue, skb);
+
+--
+2.39.5
+
diff --git a/queue-6.14/espintcp-remove-encap-socket-caching-to-avoid-refere.patch b/queue-6.14/espintcp-remove-encap-socket-caching-to-avoid-refere.patch
new file mode 100644
index 0000000000..126b2c40cd
--- /dev/null
+++ b/queue-6.14/espintcp-remove-encap-socket-caching-to-avoid-refere.patch
@@ -0,0 +1,252 @@
+From d6c913ce820127ca3950eb87e2671741d3a8a85a Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 9 Apr 2025 15:59:57 +0200
+Subject: espintcp: remove encap socket caching to avoid reference leak
+
+From: Sabrina Dubroca
+
+[ Upstream commit 028363685bd0b7a19b4a820f82dd905b1dc83999 ]
+
+The current scheme for caching the encap socket can lead to reference
+leaks when we try to delete the netns.
+
+The reference chain is: xfrm_state -> encap_sk -> netns
+
+Since the encap socket is a userspace socket, it holds a reference on
+the netns. If we delete the espintcp state (through flush or
+individual delete) before removing the netns, the reference on the
+socket is dropped and the netns is correctly deleted. Otherwise, the
+netns may not be reachable anymore (if all processes within the ns
+have terminated), so we cannot delete the xfrm state to drop its
+reference on the socket.
+
+This patch results in a small (~2% in my tests) performance
+regression.
+
+A GC-type mechanism could be added for the socket cache, to clear
+references if the state hasn't been used "recently", but it's a lot
+more complex than just not caching the socket.
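+
+The refcount pattern the TX path uses after this change is roughly the
+following (simplified from the esp4/esp6 hunks below, error handling
+elided):
+
+	sk = esp_find_tcp_sk(x);	/* lookup takes a reference */
+	if (IS_ERR(sk))
+		return PTR_ERR(sk);
+	/* ... transmit under bh_lock_sock() ... */
+	sock_put(sk);			/* drop it, nothing is cached */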
+ +Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)") +Signed-off-by: Sabrina Dubroca +Reviewed-by: Simon Horman +Signed-off-by: Steffen Klassert +Signed-off-by: Sasha Levin +--- + include/net/xfrm.h | 1 - + net/ipv4/esp4.c | 49 ++++--------------------------------------- + net/ipv6/esp6.c | 49 ++++--------------------------------------- + net/xfrm/xfrm_state.c | 3 --- + 4 files changed, 8 insertions(+), 94 deletions(-) + +diff --git a/include/net/xfrm.h b/include/net/xfrm.h +index e1eed5d47d072..03a1ed1e610b2 100644 +--- a/include/net/xfrm.h ++++ b/include/net/xfrm.h +@@ -236,7 +236,6 @@ struct xfrm_state { + + /* Data for encapsulator */ + struct xfrm_encap_tmpl *encap; +- struct sock __rcu *encap_sk; + + /* NAT keepalive */ + u32 nat_keepalive_interval; /* seconds */ +diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c +index 876df672c0bfa..f14a41ee4aa10 100644 +--- a/net/ipv4/esp4.c ++++ b/net/ipv4/esp4.c +@@ -120,47 +120,16 @@ static void esp_ssg_unref(struct xfrm_state *x, void *tmp, struct sk_buff *skb) + } + + #ifdef CONFIG_INET_ESPINTCP +-struct esp_tcp_sk { +- struct sock *sk; +- struct rcu_head rcu; +-}; +- +-static void esp_free_tcp_sk(struct rcu_head *head) +-{ +- struct esp_tcp_sk *esk = container_of(head, struct esp_tcp_sk, rcu); +- +- sock_put(esk->sk); +- kfree(esk); +-} +- + static struct sock *esp_find_tcp_sk(struct xfrm_state *x) + { + struct xfrm_encap_tmpl *encap = x->encap; + struct net *net = xs_net(x); +- struct esp_tcp_sk *esk; + __be16 sport, dport; +- struct sock *nsk; + struct sock *sk; + +- sk = rcu_dereference(x->encap_sk); +- if (sk && sk->sk_state == TCP_ESTABLISHED) +- return sk; +- + spin_lock_bh(&x->lock); + sport = encap->encap_sport; + dport = encap->encap_dport; +- nsk = rcu_dereference_protected(x->encap_sk, +- lockdep_is_held(&x->lock)); +- if (sk && sk == nsk) { +- esk = kmalloc(sizeof(*esk), GFP_ATOMIC); +- if (!esk) { +- spin_unlock_bh(&x->lock); +- return ERR_PTR(-ENOMEM); +- } +- RCU_INIT_POINTER(x->encap_sk, NULL); +- esk->sk = sk; +- call_rcu(&esk->rcu, esp_free_tcp_sk); +- } + spin_unlock_bh(&x->lock); + + sk = inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, x->id.daddr.a4, +@@ -173,20 +142,6 @@ static struct sock *esp_find_tcp_sk(struct xfrm_state *x) + return ERR_PTR(-EINVAL); + } + +- spin_lock_bh(&x->lock); +- nsk = rcu_dereference_protected(x->encap_sk, +- lockdep_is_held(&x->lock)); +- if (encap->encap_sport != sport || +- encap->encap_dport != dport) { +- sock_put(sk); +- sk = nsk ?: ERR_PTR(-EREMCHG); +- } else if (sk == nsk) { +- sock_put(sk); +- } else { +- rcu_assign_pointer(x->encap_sk, sk); +- } +- spin_unlock_bh(&x->lock); +- + return sk; + } + +@@ -211,6 +166,8 @@ static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb) + err = espintcp_push_skb(sk, skb); + bh_unlock_sock(sk); + ++ sock_put(sk); ++ + out: + rcu_read_unlock(); + return err; +@@ -394,6 +351,8 @@ static struct ip_esp_hdr *esp_output_tcp_encap(struct xfrm_state *x, + if (IS_ERR(sk)) + return ERR_CAST(sk); + ++ sock_put(sk); ++ + *lenp = htons(len); + esph = (struct ip_esp_hdr *)(lenp + 1); + +diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c +index 574989b82179c..72adfc107b557 100644 +--- a/net/ipv6/esp6.c ++++ b/net/ipv6/esp6.c +@@ -137,47 +137,16 @@ static void esp_ssg_unref(struct xfrm_state *x, void *tmp, struct sk_buff *skb) + } + + #ifdef CONFIG_INET6_ESPINTCP +-struct esp_tcp_sk { +- struct sock *sk; +- struct rcu_head rcu; +-}; +- +-static void esp_free_tcp_sk(struct rcu_head *head) +-{ +- struct esp_tcp_sk *esk = 
container_of(head, struct esp_tcp_sk, rcu); +- +- sock_put(esk->sk); +- kfree(esk); +-} +- + static struct sock *esp6_find_tcp_sk(struct xfrm_state *x) + { + struct xfrm_encap_tmpl *encap = x->encap; + struct net *net = xs_net(x); +- struct esp_tcp_sk *esk; + __be16 sport, dport; +- struct sock *nsk; + struct sock *sk; + +- sk = rcu_dereference(x->encap_sk); +- if (sk && sk->sk_state == TCP_ESTABLISHED) +- return sk; +- + spin_lock_bh(&x->lock); + sport = encap->encap_sport; + dport = encap->encap_dport; +- nsk = rcu_dereference_protected(x->encap_sk, +- lockdep_is_held(&x->lock)); +- if (sk && sk == nsk) { +- esk = kmalloc(sizeof(*esk), GFP_ATOMIC); +- if (!esk) { +- spin_unlock_bh(&x->lock); +- return ERR_PTR(-ENOMEM); +- } +- RCU_INIT_POINTER(x->encap_sk, NULL); +- esk->sk = sk; +- call_rcu(&esk->rcu, esp_free_tcp_sk); +- } + spin_unlock_bh(&x->lock); + + sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, &x->id.daddr.in6, +@@ -190,20 +159,6 @@ static struct sock *esp6_find_tcp_sk(struct xfrm_state *x) + return ERR_PTR(-EINVAL); + } + +- spin_lock_bh(&x->lock); +- nsk = rcu_dereference_protected(x->encap_sk, +- lockdep_is_held(&x->lock)); +- if (encap->encap_sport != sport || +- encap->encap_dport != dport) { +- sock_put(sk); +- sk = nsk ?: ERR_PTR(-EREMCHG); +- } else if (sk == nsk) { +- sock_put(sk); +- } else { +- rcu_assign_pointer(x->encap_sk, sk); +- } +- spin_unlock_bh(&x->lock); +- + return sk; + } + +@@ -228,6 +183,8 @@ static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb) + err = espintcp_push_skb(sk, skb); + bh_unlock_sock(sk); + ++ sock_put(sk); ++ + out: + rcu_read_unlock(); + return err; +@@ -424,6 +381,8 @@ static struct ip_esp_hdr *esp6_output_tcp_encap(struct xfrm_state *x, + if (IS_ERR(sk)) + return ERR_CAST(sk); + ++ sock_put(sk); ++ + *lenp = htons(len); + esph = (struct ip_esp_hdr *)(lenp + 1); + +diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c +index 69af5964c886c..fa8a8776d5397 100644 +--- a/net/xfrm/xfrm_state.c ++++ b/net/xfrm/xfrm_state.c +@@ -838,9 +838,6 @@ int __xfrm_state_delete(struct xfrm_state *x) + xfrm_nat_keepalive_state_updated(x); + spin_unlock(&net->xfrm.xfrm_state_lock); + +- if (x->encap_sk) +- sock_put(rcu_dereference_raw(x->encap_sk)); +- + xfrm_dev_state_delete(x); + + /* All xfrm_state objects are created by xfrm_state_alloc. +-- +2.39.5 + diff --git a/queue-6.14/ice-fix-lacp-bonds-without-sriov-environment.patch b/queue-6.14/ice-fix-lacp-bonds-without-sriov-environment.patch new file mode 100644 index 0000000000..a5ccbf0303 --- /dev/null +++ b/queue-6.14/ice-fix-lacp-bonds-without-sriov-environment.patch @@ -0,0 +1,68 @@ +From e0dff81496737a9facdebb0e6275e66e4b3bf51a Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 28 Apr 2025 15:33:39 -0400 +Subject: ice: Fix LACP bonds without SRIOV environment + +From: Dave Ertman + +[ Upstream commit 6c778f1b839b63525b30046c9d1899424a62be0a ] + +If an aggregate has the following conditions: +- The SRIOV LAG DDP package has been enabled +- The bond is in 802.3ad LACP mode +- The bond is disqualified from supporting SRIOV VF LAG +- Both interfaces were added simultaneously to the bond (same command) + +Then there is a chance that the two interfaces will be assigned different +LACP Aggregator ID's. This will cause a failure of the LACP control over +the bond. + +To fix this, we can detect if the primary interface for the bond (as +defined by the driver) is not in switchdev mode, and exit the setup flow +if so. 
+
+Reproduction steps:
+
+%> ip link add bond0 type bond mode 802.3ad miimon 100
+%> ip link set bond0 up
+%> ifenslave bond0 eth0 eth1
+%> cat /proc/net/bonding/bond0 | grep Agg
+
+Check for Aggregator IDs that differ.
+
+Fixes: ec5a6c5f79ed ("ice: process events created by lag netdev event handler")
+Reviewed-by: Aleksandr Loktionov
+Signed-off-by: Dave Ertman
+Tested-by: Sujai Buvaneswaran
+Signed-off-by: Tony Nguyen
+Signed-off-by: Sasha Levin
+---
+ drivers/net/ethernet/intel/ice/ice_lag.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c b/drivers/net/ethernet/intel/ice/ice_lag.c
+index 22371011c2492..2410aee59fb2d 100644
+--- a/drivers/net/ethernet/intel/ice/ice_lag.c
++++ b/drivers/net/ethernet/intel/ice/ice_lag.c
+@@ -1321,12 +1321,18 @@ static void ice_lag_changeupper_event(struct ice_lag *lag, void *ptr)
+ */
+ if (!primary_lag) {
+ lag->primary = true;
++ if (!ice_is_switchdev_running(lag->pf))
++ return;
++
+ /* Configure primary's SWID to be shared */
+ ice_lag_primary_swid(lag, true);
+ primary_lag = lag;
+ } else {
+ u16 swid;
+
++ if (!ice_is_switchdev_running(primary_lag->pf))
++ return;
++
+ swid = primary_lag->pf->hw.port_info->sw_id;
+ ice_lag_set_swid(swid, lag, true);
+ ice_lag_add_prune_list(primary_lag, lag->pf);
+--
+2.39.5
+
diff --git a/queue-6.14/ice-fix-vf-num_mac-count-with-port-representors.patch b/queue-6.14/ice-fix-vf-num_mac-count-with-port-representors.patch
new file mode 100644
index 0000000000..bbdac05061
--- /dev/null
+++ b/queue-6.14/ice-fix-vf-num_mac-count-with-port-representors.patch
@@ -0,0 +1,53 @@
+From a322df09451d045c495685d5a71c795c93736b6b Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Thu, 10 Apr 2025 11:13:52 -0700
+Subject: ice: fix vf->num_mac count with port representors
+
+From: Jacob Keller
+
+[ Upstream commit bbd95160a03dbfcd01a541f25c27ddb730dfbbd5 ]
+
+The ice_vc_repr_add_mac() function indicates that it does not store the MAC
+address filters in the firmware. However, it still increments vf->num_mac.
+This is incorrect, as vf->num_mac should represent the number of MAC
+filters currently programmed to firmware.
+
+Indeed, we only perform this increment if the requested filter is a unicast
+address that doesn't match the existing vf->hw_lan_addr. In addition,
+ice_vc_repr_del_mac() does not decrement the vf->num_mac counter. This
+results in the counter becoming out of sync with the actual count.
+
+As it turns out, vf->num_mac is currently only used in legacy mode without
+port representors. The single place where the value is checked is for
+enforcing a filter limit on untrusted VFs.
+
+Upcoming patches to support VF Live Migration will use this value when
+determining the size of the TLV for MAC address filters. Fix the
+representor mode function to stop incrementing the counter incorrectly.
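+
+For context, the filter limit mentioned above is enforced along these
+lines (a simplified sketch, not the exact driver code):
+
+	if (!ice_is_vf_trusted(vf) &&
+	    vf->num_mac + new_filters > ICE_MAX_MACADDR_PER_VF)
+		return -EPERM; /* untrusted VFs get a bounded filter count */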
+ +Fixes: ac19e03ef780 ("ice: allow process VF opcodes in different ways") +Signed-off-by: Jacob Keller +Reviewed-by: Michal Swiatkowski +Reviewed-by: Simon Horman +Tested-by: Sujai Buvaneswaran +Signed-off-by: Tony Nguyen +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/intel/ice/ice_virtchnl.c | 1 - + 1 file changed, 1 deletion(-) + +diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c +index 1af51469f070b..9be9ce300fa4a 100644 +--- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c ++++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c +@@ -4211,7 +4211,6 @@ static int ice_vc_repr_add_mac(struct ice_vf *vf, u8 *msg) + } + + ice_vfhw_mac_add(vf, &al->list[i]); +- vf->num_mac++; + break; + } + +-- +2.39.5 + diff --git a/queue-6.14/idpf-fix-idpf_vport_splitq_napi_poll.patch b/queue-6.14/idpf-fix-idpf_vport_splitq_napi_poll.patch new file mode 100644 index 0000000000..731b576142 --- /dev/null +++ b/queue-6.14/idpf-fix-idpf_vport_splitq_napi_poll.patch @@ -0,0 +1,72 @@ +From a37bc61047e7f1e28a10709ae681c622e2328373 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 20 May 2025 12:40:30 +0000 +Subject: idpf: fix idpf_vport_splitq_napi_poll() + +From: Eric Dumazet + +[ Upstream commit 407e0efdf8baf1672876d5948b75049860a93e59 ] + +idpf_vport_splitq_napi_poll() can incorrectly return @budget +after napi_complete_done() has been called. + +This violates NAPI rules, because after napi_complete_done(), +current thread lost napi ownership. + +Move the test against POLL_MODE before the napi_complete_done(). + +Fixes: c2d548cad150 ("idpf: add TX splitq napi poll support") +Reported-by: Peter Newman +Closes: https://lore.kernel.org/netdev/20250520121908.1805732-1-edumazet@google.com/T/#u +Signed-off-by: Eric Dumazet +Cc: Joshua Hay +Cc: Alan Brady +Cc: Madhu Chittim +Cc: Phani Burra +Cc: Pavan Kumar Linga +Link: https://patch.msgid.link/20250520124030.1983936-1-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/intel/idpf/idpf_txrx.c | 18 +++++++++--------- + 1 file changed, 9 insertions(+), 9 deletions(-) + +diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c +index 977741c414980..60b2e034c0348 100644 +--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c ++++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c +@@ -4031,6 +4031,14 @@ static int idpf_vport_splitq_napi_poll(struct napi_struct *napi, int budget) + return budget; + } + ++ /* Switch to poll mode in the tear-down path after sending disable ++ * queues virtchnl message, as the interrupts will be disabled after ++ * that. 
++ */ ++ if (unlikely(q_vector->num_txq && idpf_queue_has(POLL_MODE, ++ q_vector->tx[0]))) ++ return budget; ++ + work_done = min_t(int, work_done, budget - 1); + + /* Exit the polling mode, but don't re-enable interrupts if stack might +@@ -4041,15 +4049,7 @@ static int idpf_vport_splitq_napi_poll(struct napi_struct *napi, int budget) + else + idpf_vport_intr_set_wb_on_itr(q_vector); + +- /* Switch to poll mode in the tear-down path after sending disable +- * queues virtchnl message, as the interrupts will be disabled after +- * that +- */ +- if (unlikely(q_vector->num_txq && idpf_queue_has(POLL_MODE, +- q_vector->tx[0]))) +- return budget; +- else +- return work_done; ++ return work_done; + } + + /** +-- +2.39.5 + diff --git a/queue-6.14/idpf-fix-null-ptr-deref-in-idpf_features_check.patch b/queue-6.14/idpf-fix-null-ptr-deref-in-idpf_features_check.patch new file mode 100644 index 0000000000..475d1df77b --- /dev/null +++ b/queue-6.14/idpf-fix-null-ptr-deref-in-idpf_features_check.patch @@ -0,0 +1,121 @@ +From 43626cbd553e35887709ee0b3755ae77954b3e31 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 11 Apr 2025 09:00:35 -0700 +Subject: idpf: fix null-ptr-deref in idpf_features_check + +From: Pavan Kumar Linga + +[ Upstream commit 2dabe349f7882ff1407a784d54d8541909329088 ] + +idpf_features_check is used to validate the TX packet. skb header +length is compared with the hardware supported value received from +the device control plane. The value is stored in the adapter structure +and to access it, vport pointer is used. During reset all the vports +are released and the vport pointer that the netdev private structure +points to is NULL. + +To avoid null-ptr-deref, store the max header length value in netdev +private structure. This also helps to cache the value and avoid +accessing adapter pointer in hot path. + +BUG: kernel NULL pointer dereference, address: 0000000000000068 +... +RIP: 0010:idpf_features_check+0x6d/0xe0 [idpf] +Call Trace: + + ? __die+0x23/0x70 + ? page_fault_oops+0x154/0x520 + ? exc_page_fault+0x76/0x190 + ? asm_exc_page_fault+0x26/0x30 + ? idpf_features_check+0x6d/0xe0 [idpf] + netif_skb_features+0x88/0x310 + validate_xmit_skb+0x2a/0x2b0 + validate_xmit_skb_list+0x4c/0x70 + sch_direct_xmit+0x19d/0x3a0 + __dev_queue_xmit+0xb74/0xe70 + ... 
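+
+Conceptually, the fix caches the limit at netdev creation time and reads
+it without touching the vport (illustrative sketch, not the literal
+driver code):
+
+	struct idpf_netdev_priv *np = netdev_priv(netdev);
+
+	/* np is embedded in the netdev itself, so it stays valid even
+	 * while np->vport is NULL during a reset.
+	 */
+	if (unlikely(skb_network_header_len(skb) > np->max_tx_hdr_size))
+		goto unsupported;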
+ +Fixes: a251eee62133 ("idpf: add SRIOV support and other ndo_ops") +Reviewed-by: Madhu Chititm +Signed-off-by: Pavan Kumar Linga +Reviewed-by: Simon Horman +Tested-by: Samuel Salin +Signed-off-by: Tony Nguyen +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/intel/idpf/idpf.h | 2 ++ + drivers/net/ethernet/intel/idpf/idpf_lib.c | 10 ++++++---- + 2 files changed, 8 insertions(+), 4 deletions(-) + +diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h +index aef0e9775a330..70dbf80f3bb75 100644 +--- a/drivers/net/ethernet/intel/idpf/idpf.h ++++ b/drivers/net/ethernet/intel/idpf/idpf.h +@@ -143,6 +143,7 @@ enum idpf_vport_state { + * @vport_id: Vport identifier + * @link_speed_mbps: Link speed in mbps + * @vport_idx: Relative vport index ++ * @max_tx_hdr_size: Max header length hardware can support + * @state: See enum idpf_vport_state + * @netstats: Packet and byte stats + * @stats_lock: Lock to protect stats update +@@ -153,6 +154,7 @@ struct idpf_netdev_priv { + u32 vport_id; + u32 link_speed_mbps; + u16 vport_idx; ++ u16 max_tx_hdr_size; + enum idpf_vport_state state; + struct rtnl_link_stats64 netstats; + spinlock_t stats_lock; +diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c +index 6e8a82dae1628..df71e6ad65109 100644 +--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c ++++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c +@@ -723,6 +723,7 @@ static int idpf_cfg_netdev(struct idpf_vport *vport) + np->vport = vport; + np->vport_idx = vport->idx; + np->vport_id = vport->vport_id; ++ np->max_tx_hdr_size = idpf_get_max_tx_hdr_size(adapter); + vport->netdev = netdev; + + return idpf_init_mac_addr(vport, netdev); +@@ -740,6 +741,7 @@ static int idpf_cfg_netdev(struct idpf_vport *vport) + np->adapter = adapter; + np->vport_idx = vport->idx; + np->vport_id = vport->vport_id; ++ np->max_tx_hdr_size = idpf_get_max_tx_hdr_size(adapter); + + spin_lock_init(&np->stats_lock); + +@@ -2202,8 +2204,8 @@ static netdev_features_t idpf_features_check(struct sk_buff *skb, + struct net_device *netdev, + netdev_features_t features) + { +- struct idpf_vport *vport = idpf_netdev_to_vport(netdev); +- struct idpf_adapter *adapter = vport->adapter; ++ struct idpf_netdev_priv *np = netdev_priv(netdev); ++ u16 max_tx_hdr_size = np->max_tx_hdr_size; + size_t len; + + /* No point in doing any of this if neither checksum nor GSO are +@@ -2226,7 +2228,7 @@ static netdev_features_t idpf_features_check(struct sk_buff *skb, + goto unsupported; + + len = skb_network_header_len(skb); +- if (unlikely(len > idpf_get_max_tx_hdr_size(adapter))) ++ if (unlikely(len > max_tx_hdr_size)) + goto unsupported; + + if (!skb->encapsulation) +@@ -2239,7 +2241,7 @@ static netdev_features_t idpf_features_check(struct sk_buff *skb, + + /* IPLEN can support at most 127 dwords */ + len = skb_inner_network_header_len(skb); +- if (unlikely(len > idpf_get_max_tx_hdr_size(adapter))) ++ if (unlikely(len > max_tx_hdr_size)) + goto unsupported; + + /* No need to validate L4LEN as TCP is the only protocol with a +-- +2.39.5 + diff --git a/queue-6.14/io_uring-fix-overflow-resched-cqe-reordering.patch b/queue-6.14/io_uring-fix-overflow-resched-cqe-reordering.patch new file mode 100644 index 0000000000..13dbcb210f --- /dev/null +++ b/queue-6.14/io_uring-fix-overflow-resched-cqe-reordering.patch @@ -0,0 +1,38 @@ +From 69f7b9cc8f6c4d25f68529e29a7301fb211b089c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 17 May 2025 13:27:37 +0100 +Subject: io_uring: fix 
overflow resched cqe reordering
+
+From: Pavel Begunkov
+
+[ Upstream commit a7d755ed9ce9738af3db602eb29d32774a180bc7 ]
+
+Leaving the CQ critical section in the middle of an overflow flush
+can cause cqe reordering, since the cached cq pointers are reset and any
+new cqe emitters that might get called in between are not going to be
+forced into io_cqe_cache_refill().
+
+Fixes: eac2ca2d682f9 ("io_uring: check if we need to reschedule during overflow flush")
+Signed-off-by: Pavel Begunkov
+Link: https://lore.kernel.org/r/90ba817f1a458f091f355f407de1c911d2b93bbf.1747483784.git.asml.silence@gmail.com
+Signed-off-by: Jens Axboe
+Signed-off-by: Sasha Levin
+---
+ io_uring/io_uring.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
+index 52c9fa6c06450..1f60883d78c64 100644
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -632,6 +632,7 @@ static void __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool dying)
+ 	 * to care for a non-real case.
+ 	 */
+ 	if (need_resched()) {
++		ctx->cqe_sentinel = ctx->cqe_cached;
+ 		io_cq_unlock_post(ctx);
+ 		mutex_unlock(&ctx->uring_lock);
+ 		cond_resched();
+-- 
+2.39.5
+
diff --git a/queue-6.14/irqchip-riscv-imsic-start-local-sync-timer-on-correc.patch b/queue-6.14/irqchip-riscv-imsic-start-local-sync-timer-on-correc.patch
new file mode 100644
index 0000000000..b5165644ca
--- /dev/null
+++ b/queue-6.14/irqchip-riscv-imsic-start-local-sync-timer-on-correc.patch
@@ -0,0 +1,71 @@
+From e2680122a41184cd3ff9301ee4c54fd8354f910b Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 14 May 2025 10:13:20 -0700
+Subject: irqchip/riscv-imsic: Start local sync timer on correct CPU
+
+From: Andrew Bresticker
+
+[ Upstream commit 08fb624802d8786253994d8ebdbbcdaa186f04f5 ]
+
+When starting the local sync timer to synchronize the state of a remote
+CPU, it should be added on the CPU to be synchronized, not the initiating
+CPU. This results in interrupt delivery being delayed until the timer
+eventually runs (due to another mask/unmask/migrate operation) on the
+target CPU.
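+
+The fix boils down to arming the timer on the CPU being synchronized
+(sketch; the complete change is in the diff below):
+
+	/* Arm the local sync timer on the CPU whose state must be
+	 * synchronized, not on the initiating CPU.
+	 */
+	if (!timer_pending(&lpriv->timer)) {
+		lpriv->timer.expires = jiffies + 1;
+		add_timer_on(&lpriv->timer, cpu);
+	}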
+ +Fixes: 0f67911e821c ("irqchip/riscv-imsic: Separate next and previous pointers in IMSIC vector") +Signed-off-by: Andrew Bresticker +Signed-off-by: Thomas Gleixner +Reviewed-by: Anup Patel +Link: https://lore.kernel.org/all/20250514171320.3494917-1-abrestic@rivosinc.com +Signed-off-by: Sasha Levin +--- + drivers/irqchip/irq-riscv-imsic-state.c | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +diff --git a/drivers/irqchip/irq-riscv-imsic-state.c b/drivers/irqchip/irq-riscv-imsic-state.c +index 1aeba76d72795..06ff0e17c0c33 100644 +--- a/drivers/irqchip/irq-riscv-imsic-state.c ++++ b/drivers/irqchip/irq-riscv-imsic-state.c +@@ -186,17 +186,17 @@ static bool __imsic_local_sync(struct imsic_local_priv *lpriv) + } + + #ifdef CONFIG_SMP +-static void __imsic_local_timer_start(struct imsic_local_priv *lpriv) ++static void __imsic_local_timer_start(struct imsic_local_priv *lpriv, unsigned int cpu) + { + lockdep_assert_held(&lpriv->lock); + + if (!timer_pending(&lpriv->timer)) { + lpriv->timer.expires = jiffies + 1; +- add_timer_on(&lpriv->timer, smp_processor_id()); ++ add_timer_on(&lpriv->timer, cpu); + } + } + #else +-static inline void __imsic_local_timer_start(struct imsic_local_priv *lpriv) ++static inline void __imsic_local_timer_start(struct imsic_local_priv *lpriv, unsigned int cpu) + { + } + #endif +@@ -211,7 +211,7 @@ void imsic_local_sync_all(bool force_all) + if (force_all) + bitmap_fill(lpriv->dirty_bitmap, imsic->global.nr_ids + 1); + if (!__imsic_local_sync(lpriv)) +- __imsic_local_timer_start(lpriv); ++ __imsic_local_timer_start(lpriv, smp_processor_id()); + + raw_spin_unlock_irqrestore(&lpriv->lock, flags); + } +@@ -256,7 +256,7 @@ static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu + return; + } + +- __imsic_local_timer_start(lpriv); ++ __imsic_local_timer_start(lpriv, cpu); + } + } + #else +-- +2.39.5 + diff --git a/queue-6.14/kernel-fork-only-call-untrack_pfn_clear-on-vmas-dupl.patch b/queue-6.14/kernel-fork-only-call-untrack_pfn_clear-on-vmas-dupl.patch new file mode 100644 index 0000000000..48d05bab68 --- /dev/null +++ b/queue-6.14/kernel-fork-only-call-untrack_pfn_clear-on-vmas-dupl.patch @@ -0,0 +1,98 @@ +From d2e3f469aa2bef0aabfd601f1dd3164107778c28 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 22 Apr 2025 16:49:42 +0200 +Subject: kernel/fork: only call untrack_pfn_clear() on VMAs duplicated for + fork() + +From: David Hildenbrand + +[ Upstream commit e9f180d7cfde23b9f8eebd60272465176373ab2c ] + +Not intuitive, but vm_area_dup() located in kernel/fork.c is not only used +for duplicating VMAs during fork(), but also for duplicating VMAs when +splitting VMAs or when mremap()'ing them. + +VM_PFNMAP mappings can at least get ordinarily mremap()'ed (no change in +size) and apparently also shrunk during mremap(), which implies +duplicating the VMA in __split_vma() first. + +In case of ordinary mremap() (no change in size), we first duplicate the +VMA in copy_vma_and_data()->copy_vma() to then call untrack_pfn_clear() on +the old VMA: we effectively move the VM_PAT reservation. So the +untrack_pfn_clear() call on the new VMA duplicating is wrong in that +context. + +Splitting of VMAs seems problematic, because we don't duplicate/adjust the +reservation when splitting the VMA. Instead, in memtype_erase() -- called +during zapping/munmap -- we shrink a reservation in case only the end +address matches: Assume we split a VMA into A and B, both would share a +reservation until B is unmapped. 
+
+So when unmapping B, the reservation would be updated to cover only A.
+When unmapping A, we would properly remove the now-shrunk reservation.
+That scenario describes the mremap() shrinking (old_size > new_size),
+where we split + unmap B, and the untrack_pfn_clear() on the new VMA is
+wrong in that case as well.
+
+What if we manage to split a VM_PFNMAP VMA into A and B and unmap A first?
+It would be broken because we would never free the reservation. Likely,
+there are ways to trigger such a VMA split outside of mremap().
+
+Affecting other VMA duplication was not intended, vm_area_dup() being used
+outside of kernel/fork.c was an oversight. So let's fix that for now; how
+to handle VMA splits better should be investigated separately.
+
+With a simple reproducer that uses mprotect() to split such a VMA, I can
+trigger:
+
+x86/PAT: pat_mremap:26448 freeing invalid memtype [mem 0x00000000-0x00000fff]
+
+Link: https://lkml.kernel.org/r/20250422144942.2871395-1-david@redhat.com
+Fixes: dc84bc2aba85 ("x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()")
+Signed-off-by: David Hildenbrand
+Reviewed-by: Lorenzo Stoakes
+Cc: Ingo Molnar
+Cc: Dave Hansen
+Cc: Andy Lutomirski
+Cc: Peter Zijlstra
+Cc: Thomas Gleixner
+Cc: Borislav Petkov
+Cc: Rik van Riel
+Cc: "H. Peter Anvin"
+Cc: Linus Torvalds
+Signed-off-by: Andrew Morton
+Signed-off-by: Sasha Levin
+---
+ kernel/fork.c | 9 +++++----
+ 1 file changed, 5 insertions(+), 4 deletions(-)
+
+diff --git a/kernel/fork.c b/kernel/fork.c
+index ca2ca3884f763..5e640468baff1 100644
+--- a/kernel/fork.c
++++ b/kernel/fork.c
+@@ -504,10 +504,6 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
+ 	vma_numab_state_init(new);
+ 	dup_anon_vma_name(orig, new);
+ 
+-	/* track_pfn_copy() will later take care of copying internal state. */
+-	if (unlikely(new->vm_flags & VM_PFNMAP))
+-		untrack_pfn_clear(new);
+-
+ 	return new;
+ }
+ 
+@@ -698,6 +694,11 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
+ 	tmp = vm_area_dup(mpnt);
+ 	if (!tmp)
+ 		goto fail_nomem;
++
++	/* track_pfn_copy() will later take care of copying internal state. */
++	if (unlikely(tmp->vm_flags & VM_PFNMAP))
++		untrack_pfn_clear(tmp);
++
+ 	retval = vma_dup_policy(mpnt, tmp);
+ 	if (retval)
+ 		goto fail_nomem_policy;
+-- 
+2.39.5
+
diff --git a/queue-6.14/loop-don-t-require-write_iter-for-writable-files-in-.patch b/queue-6.14/loop-don-t-require-write_iter-for-writable-files-in-.patch
new file mode 100644
index 0000000000..9069bf9e07
--- /dev/null
+++ b/queue-6.14/loop-don-t-require-write_iter-for-writable-files-in-.patch
@@ -0,0 +1,43 @@
+From 2f0f6c671d00022bdf6ea406965b2e8843e9ead3 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Tue, 20 May 2025 15:54:20 +0200
+Subject: loop: don't require ->write_iter for writable files in loop_configure
+
+From: Christoph Hellwig
+
+[ Upstream commit 355341e4359b2d5edf0ed5e117f7e9e7a0a5dac0 ]
+
+Block devices can be opened read-write even if they can't be written to
+for historic reasons. Remove the check requiring file->f_op->write_iter
+when the block device was opened in loop_configure. The call to
+loop_check_backing_file just below ensures that ->write_iter is present
+for backing files opened for writing, which is the only check that is
+actually needed.
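+
+For illustration, a userspace sequence that must keep working (a
+hypothetical snippet: "backing_fd" is a backing file without write
+support, and error handling is omitted):
+
+	#include <fcntl.h>
+	#include <sys/ioctl.h>
+	#include <linux/loop.h>
+
+	int lo = open("/dev/loop0", O_RDWR);	/* may succeed anyway */
+	struct loop_config cfg = { .fd = backing_fd };
+
+	/* Backing-file capabilities are validated here, not at open(). */
+	ioctl(lo, LOOP_CONFIGURE, &cfg);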
+ +Fixes: f5c84eff634b ("loop: Add sanity check for read/write_iter") +Reported-by: Christian Hesse +Signed-off-by: Christoph Hellwig +Link: https://lore.kernel.org/r/20250520135420.1177312-1-hch@lst.de +Signed-off-by: Jens Axboe +Signed-off-by: Sasha Levin +--- + drivers/block/loop.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/drivers/block/loop.c b/drivers/block/loop.c +index f68f86e9cb716..0b135d1ca25ea 100644 +--- a/drivers/block/loop.c ++++ b/drivers/block/loop.c +@@ -973,9 +973,6 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode, + if (!file) + return -EBADF; + +- if ((mode & BLK_OPEN_WRITE) && !file->f_op->write_iter) +- return -EINVAL; +- + error = loop_check_backing_file(file); + if (error) + return error; +-- +2.39.5 + diff --git a/queue-6.14/mr-consolidate-the-ipmr_can_free_table-checks.patch b/queue-6.14/mr-consolidate-the-ipmr_can_free_table-checks.patch new file mode 100644 index 0000000000..e4e10da8c9 --- /dev/null +++ b/queue-6.14/mr-consolidate-the-ipmr_can_free_table-checks.patch @@ -0,0 +1,165 @@ +From 64a6ec2ad998ef271d43797dbb62f5607bc93ffc Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 15 May 2025 18:49:26 +0200 +Subject: mr: consolidate the ipmr_can_free_table() checks. + +From: Paolo Abeni + +[ Upstream commit c46286fdd6aa1d0e33c245bcffe9ff2428a777bd ] + +Guoyu Yin reported a splat in the ipmr netns cleanup path: + +WARNING: CPU: 2 PID: 14564 at net/ipv4/ipmr.c:440 ipmr_free_table net/ipv4/ipmr.c:440 [inline] +WARNING: CPU: 2 PID: 14564 at net/ipv4/ipmr.c:440 ipmr_rules_exit+0x135/0x1c0 net/ipv4/ipmr.c:361 +Modules linked in: +CPU: 2 UID: 0 PID: 14564 Comm: syz.4.838 Not tainted 6.14.0 #1 +Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 +RIP: 0010:ipmr_free_table net/ipv4/ipmr.c:440 [inline] +RIP: 0010:ipmr_rules_exit+0x135/0x1c0 net/ipv4/ipmr.c:361 +Code: ff df 48 c1 ea 03 80 3c 02 00 75 7d 48 c7 83 60 05 00 00 00 00 00 00 5b 5d 41 5c 41 5d 41 5e e9 71 67 7f 00 e8 4c 2d 8a fd 90 <0f> 0b 90 eb 93 e8 41 2d 8a fd 0f b6 2d 80 54 ea 01 31 ff 89 ee e8 +RSP: 0018:ffff888109547c58 EFLAGS: 00010293 +RAX: 0000000000000000 RBX: ffff888108c12dc0 RCX: ffffffff83e09868 +RDX: ffff8881022b3300 RSI: ffffffff83e098d4 RDI: 0000000000000005 +RBP: ffff888104288000 R08: 0000000000000000 R09: ffffed10211825c9 +R10: 0000000000000001 R11: ffff88801816c4a0 R12: 0000000000000001 +R13: ffff888108c13320 R14: ffff888108c12dc0 R15: fffffbfff0b74058 +FS: 00007f84f39316c0(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 00007f84f3930f98 CR3: 0000000113b56000 CR4: 0000000000350ef0 +Call Trace: + + ipmr_net_exit_batch+0x50/0x90 net/ipv4/ipmr.c:3160 + ops_exit_list+0x10c/0x160 net/core/net_namespace.c:177 + setup_net+0x47d/0x8e0 net/core/net_namespace.c:394 + copy_net_ns+0x25d/0x410 net/core/net_namespace.c:516 + create_new_namespaces+0x3f6/0xaf0 kernel/nsproxy.c:110 + unshare_nsproxy_namespaces+0xc3/0x180 kernel/nsproxy.c:228 + ksys_unshare+0x78d/0x9a0 kernel/fork.c:3342 + __do_sys_unshare kernel/fork.c:3413 [inline] + __se_sys_unshare kernel/fork.c:3411 [inline] + __x64_sys_unshare+0x31/0x40 kernel/fork.c:3411 + do_syscall_x64 arch/x86/entry/common.c:52 [inline] + do_syscall_64+0xa6/0x1a0 arch/x86/entry/common.c:83 + entry_SYSCALL_64_after_hwframe+0x77/0x7f +RIP: 0033:0x7f84f532cc29 +Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 
c1 a8 ff ff ff f7 d8 64 89 01 48 +RSP: 002b:00007f84f3931038 EFLAGS: 00000246 ORIG_RAX: 0000000000000110 +RAX: ffffffffffffffda RBX: 00007f84f5615fa0 RCX: 00007f84f532cc29 +RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000040000400 +RBP: 00007f84f53fba18 R08: 0000000000000000 R09: 0000000000000000 +R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 +R13: 0000000000000000 R14: 00007f84f5615fa0 R15: 00007fff51c5f328 + + +The running kernel has CONFIG_IP_MROUTE_MULTIPLE_TABLES disabled, and +the sanity check for such build is still too loose. + +Address the issue consolidating the relevant sanity check in a single +helper regardless of the kernel configuration. Also share it between +the ipv4 and ipv6 code. + +Reported-by: Guoyu Yin +Fixes: 50b94204446e ("ipmr: tune the ipmr_can_free_table() checks.") +Signed-off-by: Paolo Abeni +Link: https://patch.msgid.link/372dc261e1bf12742276e1b984fc5a071b7fc5a8.1747321903.git.pabeni@redhat.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + include/linux/mroute_base.h | 5 +++++ + net/ipv4/ipmr.c | 12 +----------- + net/ipv6/ip6mr.c | 12 +----------- + 3 files changed, 7 insertions(+), 22 deletions(-) + +diff --git a/include/linux/mroute_base.h b/include/linux/mroute_base.h +index 58a2401e4b551..0075f6e5c3da9 100644 +--- a/include/linux/mroute_base.h ++++ b/include/linux/mroute_base.h +@@ -262,6 +262,11 @@ struct mr_table { + int mroute_reg_vif_num; + }; + ++static inline bool mr_can_free_table(struct net *net) ++{ ++ return !check_net(net) || !net_initialized(net); ++} ++ + #ifdef CONFIG_IP_MROUTE_COMMON + void vif_device_init(struct vif_device *v, + struct net_device *dev, +diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c +index 21ae7594a8525..69df45c4a0aae 100644 +--- a/net/ipv4/ipmr.c ++++ b/net/ipv4/ipmr.c +@@ -120,11 +120,6 @@ static void ipmr_expire_process(struct timer_list *t); + lockdep_rtnl_is_held() || \ + list_empty(&net->ipv4.mr_tables)) + +-static bool ipmr_can_free_table(struct net *net) +-{ +- return !check_net(net) || !net_initialized(net); +-} +- + static struct mr_table *ipmr_mr_table_iter(struct net *net, + struct mr_table *mrt) + { +@@ -317,11 +312,6 @@ EXPORT_SYMBOL(ipmr_rule_default); + #define ipmr_for_each_table(mrt, net) \ + for (mrt = net->ipv4.mrt; mrt; mrt = NULL) + +-static bool ipmr_can_free_table(struct net *net) +-{ +- return !check_net(net); +-} +- + static struct mr_table *ipmr_mr_table_iter(struct net *net, + struct mr_table *mrt) + { +@@ -437,7 +427,7 @@ static void ipmr_free_table(struct mr_table *mrt) + { + struct net *net = read_pnet(&mrt->net); + +- WARN_ON_ONCE(!ipmr_can_free_table(net)); ++ WARN_ON_ONCE(!mr_can_free_table(net)); + + timer_shutdown_sync(&mrt->ipmr_expire_timer); + mroute_clean_tables(mrt, MRT_FLUSH_VIFS | MRT_FLUSH_VIFS_STATIC | +diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c +index 535e9f72514c0..33351acc45e10 100644 +--- a/net/ipv6/ip6mr.c ++++ b/net/ipv6/ip6mr.c +@@ -108,11 +108,6 @@ static void ipmr_expire_process(struct timer_list *t); + lockdep_rtnl_is_held() || \ + list_empty(&net->ipv6.mr6_tables)) + +-static bool ip6mr_can_free_table(struct net *net) +-{ +- return !check_net(net) || !net_initialized(net); +-} +- + static struct mr_table *ip6mr_mr_table_iter(struct net *net, + struct mr_table *mrt) + { +@@ -306,11 +301,6 @@ EXPORT_SYMBOL(ip6mr_rule_default); + #define ip6mr_for_each_table(mrt, net) \ + for (mrt = net->ipv6.mrt6; mrt; mrt = NULL) + +-static bool ip6mr_can_free_table(struct net *net) +-{ +- return !check_net(net); +-} +- + static 
struct mr_table *ip6mr_mr_table_iter(struct net *net, + struct mr_table *mrt) + { +@@ -416,7 +406,7 @@ static void ip6mr_free_table(struct mr_table *mrt) + { + struct net *net = read_pnet(&mrt->net); + +- WARN_ON_ONCE(!ip6mr_can_free_table(net)); ++ WARN_ON_ONCE(!mr_can_free_table(net)); + + timer_shutdown_sync(&mrt->ipmr_expire_timer); + mroute_clean_tables(mrt, MRT6_FLUSH_MIFS | MRT6_FLUSH_MIFS_STATIC | +-- +2.39.5 + diff --git a/queue-6.14/net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch b/queue-6.14/net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch new file mode 100644 index 0000000000..20b97ac729 --- /dev/null +++ b/queue-6.14/net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch @@ -0,0 +1,48 @@ +From 48958bed040e326ffe0d9158a223404941d76f11 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 19 May 2025 18:49:36 +0200 +Subject: net: dwmac-sun8i: Use parsed internal PHY address instead of 1 + +From: Paul Kocialkowski + +[ Upstream commit 47653e4243f2b0a26372e481ca098936b51ec3a8 ] + +While the MDIO address of the internal PHY on Allwinner sun8i chips is +generally 1, of_mdio_parse_addr is used to cleanly parse the address +from the device-tree instead of hardcoding it. + +A commit reworking the code ditched the parsed value and hardcoded the +value 1 instead, which didn't really break anything but is more fragile +and not future-proof. + +Restore the initial behavior using the parsed address returned from the +helper. + +Fixes: 634db83b8265 ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs") +Signed-off-by: Paul Kocialkowski +Reviewed-by: Andrew Lunn +Acked-by: Corentin LABBE +Tested-by: Corentin LABBE +Link: https://patch.msgid.link/20250519164936.4172658-1-paulk@sys-base.io +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +index 4b7b2582a1201..9d31fa5bbe15e 100644 +--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c ++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +@@ -964,7 +964,7 @@ static int sun8i_dwmac_set_syscon(struct device *dev, + /* of_mdio_parse_addr returns a valid (0 ~ 31) PHY + * address. No need to mask it again. + */ +- reg |= 1 << H3_EPHY_ADDR_SHIFT; ++ reg |= ret << H3_EPHY_ADDR_SHIFT; + } else { + /* For SoCs without internal PHY the PHY selection bit should be + * set to 0 (external PHY). +-- +2.39.5 + diff --git a/queue-6.14/net-lan743x-restore-sgmii-ctrl-register-on-resume.patch b/queue-6.14/net-lan743x-restore-sgmii-ctrl-register-on-resume.patch new file mode 100644 index 0000000000..1de328921d --- /dev/null +++ b/queue-6.14/net-lan743x-restore-sgmii-ctrl-register-on-resume.patch @@ -0,0 +1,92 @@ +From 90ffa3eaa1dd318230e779df2eb9c0fb475970e2 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 16 May 2025 09:27:19 +0530 +Subject: net: lan743x: Restore SGMII CTRL register on resume + +From: Thangaraj Samynathan + +[ Upstream commit 293e38ff4e4c2ba53f3fd47d8a4a9f0f0414a7a6 ] + +SGMII_CTRL register, which specifies the active interface, was not +properly restored when resuming from suspend. This led to incorrect +interface selection after resume particularly in scenarios involving +the FPGA. + +To fix this: +- Move the SGMII_CTRL setup out of the probe function. 
+- Initialize the register in the hardware initialization helper function, +which is called during both device initialization and resume. + +This ensures the interface configuration is consistently restored after +suspend/resume cycles. + +Fixes: a46d9d37c4f4f ("net: lan743x: Add support for SGMII interface") +Signed-off-by: Thangaraj Samynathan +Link: https://patch.msgid.link/20250516035719.117960-1-thangaraj.s@microchip.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/microchip/lan743x_main.c | 19 ++++++++++--------- + 1 file changed, 10 insertions(+), 9 deletions(-) + +diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c +index e2d6bfb5d6933..a70b88037a208 100644 +--- a/drivers/net/ethernet/microchip/lan743x_main.c ++++ b/drivers/net/ethernet/microchip/lan743x_main.c +@@ -3495,6 +3495,7 @@ static int lan743x_hardware_init(struct lan743x_adapter *adapter, + struct pci_dev *pdev) + { + struct lan743x_tx *tx; ++ u32 sgmii_ctl; + int index; + int ret; + +@@ -3507,6 +3508,15 @@ static int lan743x_hardware_init(struct lan743x_adapter *adapter, + spin_lock_init(&adapter->eth_syslock_spinlock); + mutex_init(&adapter->sgmii_rw_lock); + pci11x1x_set_rfe_rd_fifo_threshold(adapter); ++ sgmii_ctl = lan743x_csr_read(adapter, SGMII_CTL); ++ if (adapter->is_sgmii_en) { ++ sgmii_ctl |= SGMII_CTL_SGMII_ENABLE_; ++ sgmii_ctl &= ~SGMII_CTL_SGMII_POWER_DN_; ++ } else { ++ sgmii_ctl &= ~SGMII_CTL_SGMII_ENABLE_; ++ sgmii_ctl |= SGMII_CTL_SGMII_POWER_DN_; ++ } ++ lan743x_csr_write(adapter, SGMII_CTL, sgmii_ctl); + } else { + adapter->max_tx_channels = LAN743X_MAX_TX_CHANNELS; + adapter->used_tx_channels = LAN743X_USED_TX_CHANNELS; +@@ -3558,7 +3568,6 @@ static int lan743x_hardware_init(struct lan743x_adapter *adapter, + + static int lan743x_mdiobus_init(struct lan743x_adapter *adapter) + { +- u32 sgmii_ctl; + int ret; + + adapter->mdiobus = devm_mdiobus_alloc(&adapter->pdev->dev); +@@ -3570,10 +3579,6 @@ static int lan743x_mdiobus_init(struct lan743x_adapter *adapter) + adapter->mdiobus->priv = (void *)adapter; + if (adapter->is_pci11x1x) { + if (adapter->is_sgmii_en) { +- sgmii_ctl = lan743x_csr_read(adapter, SGMII_CTL); +- sgmii_ctl |= SGMII_CTL_SGMII_ENABLE_; +- sgmii_ctl &= ~SGMII_CTL_SGMII_POWER_DN_; +- lan743x_csr_write(adapter, SGMII_CTL, sgmii_ctl); + netif_dbg(adapter, drv, adapter->netdev, + "SGMII operation\n"); + adapter->mdiobus->read = lan743x_mdiobus_read_c22; +@@ -3584,10 +3589,6 @@ static int lan743x_mdiobus_init(struct lan743x_adapter *adapter) + netif_dbg(adapter, drv, adapter->netdev, + "lan743x-mdiobus-c45\n"); + } else { +- sgmii_ctl = lan743x_csr_read(adapter, SGMII_CTL); +- sgmii_ctl &= ~SGMII_CTL_SGMII_ENABLE_; +- sgmii_ctl |= SGMII_CTL_SGMII_POWER_DN_; +- lan743x_csr_write(adapter, SGMII_CTL, sgmii_ctl); + netif_dbg(adapter, drv, adapter->netdev, + "RGMII operation\n"); + // Only C22 support when RGMII I/F +-- +2.39.5 + diff --git a/queue-6.14/net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch b/queue-6.14/net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch new file mode 100644 index 0000000000..20e4d21429 --- /dev/null +++ b/queue-6.14/net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch @@ -0,0 +1,125 @@ +From 92276c8b3c218f200ae708dfdf2125de2a62903c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 20 May 2025 18:14:04 +0800 +Subject: net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done + +From: Wang Liang + +[ Upstream commit 
e279024617134c94fd3e37470156534d5f2b3472 ]
+
+Syzbot reported a slab-use-after-free with the following call trace:
+
+ ==================================================================
+ BUG: KASAN: slab-use-after-free in tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
+ Read of size 8 at addr ffff88807a733000 by task kworker/1:0/25
+
+ Call Trace:
+  kasan_report+0xd9/0x110 mm/kasan/report.c:601
+  tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
+  crypto_request_complete include/crypto/algapi.h:266
+  aead_request_complete include/crypto/internal/aead.h:85
+  cryptd_aead_crypt+0x3b8/0x750 crypto/cryptd.c:772
+  crypto_request_complete include/crypto/algapi.h:266
+  cryptd_queue_worker+0x131/0x200 crypto/cryptd.c:181
+  process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
+
+ Allocated by task 8355:
+  kzalloc_noprof include/linux/slab.h:778
+  tipc_crypto_start+0xcc/0x9e0 net/tipc/crypto.c:1466
+  tipc_init_net+0x2dd/0x430 net/tipc/core.c:72
+  ops_init+0xb9/0x650 net/core/net_namespace.c:139
+  setup_net+0x435/0xb40 net/core/net_namespace.c:343
+  copy_net_ns+0x2f0/0x670 net/core/net_namespace.c:508
+  create_new_namespaces+0x3ea/0xb10 kernel/nsproxy.c:110
+  unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:228
+  ksys_unshare+0x419/0x970 kernel/fork.c:3323
+  __do_sys_unshare kernel/fork.c:3394
+
+ Freed by task 63:
+  kfree+0x12a/0x3b0 mm/slub.c:4557
+  tipc_crypto_stop+0x23c/0x500 net/tipc/crypto.c:1539
+  tipc_exit_net+0x8c/0x110 net/tipc/core.c:119
+  ops_exit_list+0xb0/0x180 net/core/net_namespace.c:173
+  cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
+  process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
+
+After the tipc_crypto tx has been freed by deleting the namespace,
+tipc_aead_encrypt_done may still access it from the cryptd_queue_worker
+workqueue.
+
+I reproduce this issue by:
+ ip netns add ns1
+ ip link add veth1 type veth peer name veth2
+ ip link set veth1 netns ns1
+ ip netns exec ns1 tipc bearer enable media eth dev veth1
+ ip netns exec ns1 tipc node set key this_is_a_master_key master
+ ip netns exec ns1 tipc bearer disable media eth dev veth1
+ ip netns del ns1
+
+The key to the reproduction is that simd_aead_encrypt is interrupted,
+leading to crypto_simd_usable() returning false. Thus, the
+cryptd_queue_worker is triggered, and the tipc_crypto tx will be accessed.
+
+ tipc_disc_timeout
+ tipc_bearer_xmit_skb
+ tipc_crypto_xmit
+ tipc_aead_encrypt
+ crypto_aead_encrypt
+ // encrypt()
+ simd_aead_encrypt
+ // crypto_simd_usable() is false
+ child = &ctx->cryptd_tfm->base;
+
+ simd_aead_encrypt
+ crypto_aead_encrypt
+ // encrypt()
+ cryptd_aead_encrypt_enqueue
+ cryptd_aead_enqueue
+ cryptd_enqueue_request
+ // trigger cryptd_queue_worker
+ queue_work_on(smp_processor_id(), cryptd_wq, &cpu_queue->work)
+
+Fix this by holding a net reference count before the encrypt operation.
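+
+The resulting reference pairing, sketched (the real change is in the
+diff below):
+
+	get_net(aead->crypto->net);	/* pin the netns for async completion */
+	rc = crypto_aead_encrypt(req);
+	if (rc == -EINPROGRESS || rc == -EBUSY)
+		return rc;	/* tipc_aead_encrypt_done() will put_net() */
+	put_net(aead->crypto->net);	/* synchronous path: drop it here */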
+ +Reported-by: syzbot+55c12726619ff85ce1f6@syzkaller.appspotmail.com +Closes: https://syzkaller.appspot.com/bug?extid=55c12726619ff85ce1f6 +Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication") +Signed-off-by: Wang Liang +Link: https://patch.msgid.link/20250520101404.1341730-1-wangliang74@huawei.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + net/tipc/crypto.c | 5 +++++ + 1 file changed, 5 insertions(+) + +diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c +index c524421ec6525..8584893b47851 100644 +--- a/net/tipc/crypto.c ++++ b/net/tipc/crypto.c +@@ -817,12 +817,16 @@ static int tipc_aead_encrypt(struct tipc_aead *aead, struct sk_buff *skb, + goto exit; + } + ++ /* Get net to avoid freed tipc_crypto when delete namespace */ ++ get_net(aead->crypto->net); ++ + /* Now, do encrypt */ + rc = crypto_aead_encrypt(req); + if (rc == -EINPROGRESS || rc == -EBUSY) + return rc; + + tipc_bearer_put(b); ++ put_net(aead->crypto->net); + + exit: + kfree(ctx); +@@ -860,6 +864,7 @@ static void tipc_aead_encrypt_done(void *data, int err) + kfree(tx_ctx); + tipc_bearer_put(b); + tipc_aead_put(aead); ++ put_net(net); + } + + /** +-- +2.39.5 + diff --git a/queue-6.14/octeontx2-af-fix-apr-entry-mapping-based-on-apr_lmt_.patch b/queue-6.14/octeontx2-af-fix-apr-entry-mapping-based-on-apr_lmt_.patch new file mode 100644 index 0000000000..a777eb7775 --- /dev/null +++ b/queue-6.14/octeontx2-af-fix-apr-entry-mapping-based-on-apr_lmt_.patch @@ -0,0 +1,111 @@ +From cb6502e2ef8e99c1d7133361a0d8afdf95d067ba Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 21 May 2025 11:38:34 +0530 +Subject: octeontx2-af: Fix APR entry mapping based on APR_LMT_CFG + +From: Geetha sowjanya + +[ Upstream commit a6ae7129819ad20788e610261246e71736543b8b ] + +The current implementation maps the APR table using a fixed size, +which can lead to incorrect mapping when the number of PFs and VFs +varies. +This patch corrects the mapping by calculating the APR table +size dynamically based on the values configured in the +APR_LMT_CFG register, ensuring accurate representation +of APR entries in debugfs. + +Fixes: 0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table"). 
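+
+The mapping size is now derived from APR_LMT_CFG instead of a fixed
+128 KB (a sketch of the computation used in the diff below):
+
+	cfg = rvu_read64(rvu, BLKADDR_APR, APR_AF_LMT_CFG);
+	vfs = 1 << (cfg & 0xF);		/* max VFs per PF */
+	pfs = 1 << ((cfg >> 4) & 0x7);	/* max PFs */
+	lmt_map_base = ioremap_wc(tbl_base,
+				  pfs * vfs * LMT_MAPTBL_ENTRY_SIZE);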
+Signed-off-by: Geetha sowjanya +Link: https://patch.msgid.link/20250521060834.19780-3-gakula@marvell.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c | 9 ++++++--- + .../net/ethernet/marvell/octeontx2/af/rvu_debugfs.c | 11 ++++++++--- + 2 files changed, 14 insertions(+), 6 deletions(-) + +diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c +index 3838c04b78c22..4a3370a40dd88 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c ++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c +@@ -13,7 +13,6 @@ + /* RVU LMTST */ + #define LMT_TBL_OP_READ 0 + #define LMT_TBL_OP_WRITE 1 +-#define LMT_MAP_TABLE_SIZE (128 * 1024) + #define LMT_MAPTBL_ENTRY_SIZE 16 + #define LMT_MAX_VFS 256 + +@@ -26,10 +25,14 @@ static int lmtst_map_table_ops(struct rvu *rvu, u32 index, u64 *val, + { + void __iomem *lmt_map_base; + u64 tbl_base, cfg; ++ int pfs, vfs; + + tbl_base = rvu_read64(rvu, BLKADDR_APR, APR_AF_LMT_MAP_BASE); ++ cfg = rvu_read64(rvu, BLKADDR_APR, APR_AF_LMT_CFG); ++ vfs = 1 << (cfg & 0xF); ++ pfs = 1 << ((cfg >> 4) & 0x7); + +- lmt_map_base = ioremap_wc(tbl_base, LMT_MAP_TABLE_SIZE); ++ lmt_map_base = ioremap_wc(tbl_base, pfs * vfs * LMT_MAPTBL_ENTRY_SIZE); + if (!lmt_map_base) { + dev_err(rvu->dev, "Failed to setup lmt map table mapping!!\n"); + return -ENOMEM; +@@ -80,7 +83,7 @@ static int rvu_get_lmtaddr(struct rvu *rvu, u16 pcifunc, + + mutex_lock(&rvu->rsrc_lock); + rvu_write64(rvu, BLKADDR_RVUM, RVU_AF_SMMU_ADDR_REQ, iova); +- pf = rvu_get_pf(pcifunc) & 0x1F; ++ pf = rvu_get_pf(pcifunc) & RVU_PFVF_PF_MASK; + val = BIT_ULL(63) | BIT_ULL(14) | BIT_ULL(13) | pf << 8 | + ((pcifunc & RVU_PFVF_FUNC_MASK) & 0xFF); + rvu_write64(rvu, BLKADDR_RVUM, RVU_AF_SMMU_TXN_REQ, val); +diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c +index a1f9ec03c2ce6..c827da6264712 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c ++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c +@@ -553,6 +553,7 @@ static ssize_t rvu_dbg_lmtst_map_table_display(struct file *filp, + u64 lmt_addr, val, tbl_base; + int pf, vf, num_vfs, hw_vfs; + void __iomem *lmt_map_base; ++ int apr_pfs, apr_vfs; + int buf_size = 10240; + size_t off = 0; + int index = 0; +@@ -568,8 +569,12 @@ static ssize_t rvu_dbg_lmtst_map_table_display(struct file *filp, + return -ENOMEM; + + tbl_base = rvu_read64(rvu, BLKADDR_APR, APR_AF_LMT_MAP_BASE); ++ val = rvu_read64(rvu, BLKADDR_APR, APR_AF_LMT_CFG); ++ apr_vfs = 1 << (val & 0xF); ++ apr_pfs = 1 << ((val >> 4) & 0x7); + +- lmt_map_base = ioremap_wc(tbl_base, 128 * 1024); ++ lmt_map_base = ioremap_wc(tbl_base, apr_pfs * apr_vfs * ++ LMT_MAPTBL_ENTRY_SIZE); + if (!lmt_map_base) { + dev_err(rvu->dev, "Failed to setup lmt map table mapping!!\n"); + kfree(buf); +@@ -591,7 +596,7 @@ static ssize_t rvu_dbg_lmtst_map_table_display(struct file *filp, + off += scnprintf(&buf[off], buf_size - 1 - off, "PF%d \t\t\t", + pf); + +- index = pf * rvu->hw->total_vfs * LMT_MAPTBL_ENTRY_SIZE; ++ index = pf * apr_vfs * LMT_MAPTBL_ENTRY_SIZE; + off += scnprintf(&buf[off], buf_size - 1 - off, " 0x%llx\t\t", + (tbl_base + index)); + lmt_addr = readq(lmt_map_base + index); +@@ -604,7 +609,7 @@ static ssize_t rvu_dbg_lmtst_map_table_display(struct file *filp, + /* Reading num of VFs per PF */ + rvu_get_pf_numvfs(rvu, pf, &num_vfs, &hw_vfs); + for (vf = 0; vf < num_vfs; 
vf++) { +- index = (pf * rvu->hw->total_vfs * 16) + ++ index = (pf * apr_vfs * LMT_MAPTBL_ENTRY_SIZE) + + ((vf + 1) * LMT_MAPTBL_ENTRY_SIZE); + off += scnprintf(&buf[off], buf_size - 1 - off, + "PF%d:VF%d \t\t", pf, vf); +-- +2.39.5 + diff --git a/queue-6.14/octeontx2-af-set-lmt_ena-bit-for-apr-table-entries.patch b/queue-6.14/octeontx2-af-set-lmt_ena-bit-for-apr-table-entries.patch new file mode 100644 index 0000000000..3f0ebb21ce --- /dev/null +++ b/queue-6.14/octeontx2-af-set-lmt_ena-bit-for-apr-table-entries.patch @@ -0,0 +1,76 @@ +From 03063ee3a072713721147c2837d04b1b1cf7fca2 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 21 May 2025 11:38:33 +0530 +Subject: octeontx2-af: Set LMT_ENA bit for APR table entries + +From: Subbaraya Sundeep + +[ Upstream commit 0eefa27b493306928d88af6368193b134c98fd64 ] + +This patch enables the LMT line for a PF/VF by setting the +LMT_ENA bit in the APR_LMT_MAP_ENTRY_S structure. + +Additionally, it simplifies the logic for calculating the +LMTST table index by consistently using the maximum +number of hw supported VFs (i.e., 256). + +Fixes: 873a1e3d207a ("octeontx2-af: cn10k: Setting up lmtst map table"). +Signed-off-by: Subbaraya Sundeep +Signed-off-by: Geetha sowjanya +Reviewed-by: Michal Swiatkowski +Link: https://patch.msgid.link/20250521060834.19780-2-gakula@marvell.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + .../net/ethernet/marvell/octeontx2/af/rvu_cn10k.c | 15 +++++++++++++-- + 1 file changed, 13 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c +index 7fa98aeb3663c..3838c04b78c22 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c ++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cn10k.c +@@ -15,13 +15,17 @@ + #define LMT_TBL_OP_WRITE 1 + #define LMT_MAP_TABLE_SIZE (128 * 1024) + #define LMT_MAPTBL_ENTRY_SIZE 16 ++#define LMT_MAX_VFS 256 ++ ++#define LMT_MAP_ENTRY_ENA BIT_ULL(20) ++#define LMT_MAP_ENTRY_LINES GENMASK_ULL(18, 16) + + /* Function to perform operations (read/write) on lmtst map table */ + static int lmtst_map_table_ops(struct rvu *rvu, u32 index, u64 *val, + int lmt_tbl_op) + { + void __iomem *lmt_map_base; +- u64 tbl_base; ++ u64 tbl_base, cfg; + + tbl_base = rvu_read64(rvu, BLKADDR_APR, APR_AF_LMT_MAP_BASE); + +@@ -35,6 +39,13 @@ static int lmtst_map_table_ops(struct rvu *rvu, u32 index, u64 *val, + *val = readq(lmt_map_base + index); + } else { + writeq((*val), (lmt_map_base + index)); ++ ++ cfg = FIELD_PREP(LMT_MAP_ENTRY_ENA, 0x1); ++ /* 2048 LMTLINES */ ++ cfg |= FIELD_PREP(LMT_MAP_ENTRY_LINES, 0x6); ++ ++ writeq(cfg, (lmt_map_base + (index + 8))); ++ + /* Flushing the AP interceptor cache to make APR_LMT_MAP_ENTRY_S + * changes effective. Write 1 for flush and read is being used as a + * barrier and sets up a data dependency. 
Write to 0 after a write
+@@ -52,7 +63,7 @@ static int lmtst_map_table_ops(struct rvu *rvu, u32 index, u64 *val,
+ #define LMT_MAP_TBL_W1_OFF 8
+ static u32 rvu_get_lmtst_tbl_index(struct rvu *rvu, u16 pcifunc)
+ {
+-	return ((rvu_get_pf(pcifunc) * rvu->hw->total_vfs) +
++	return ((rvu_get_pf(pcifunc) * LMT_MAX_VFS) +
+ 		(pcifunc & RVU_PFVF_FUNC_MASK)) * LMT_MAPTBL_ENTRY_SIZE;
+ }
+ 
+-- 
+2.39.5
+
diff --git a/queue-6.14/octeontx2-pf-add-af_xdp-non-zero-copy-support.patch b/queue-6.14/octeontx2-pf-add-af_xdp-non-zero-copy-support.patch
new file mode 100644
index 0000000000..e6a1e20b84
--- /dev/null
+++ b/queue-6.14/octeontx2-pf-add-af_xdp-non-zero-copy-support.patch
@@ -0,0 +1,52 @@
+From 9b776a4b7e18c449ce6c8b54555671ed44b0b835 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Thu, 13 Feb 2025 11:01:37 +0530
+Subject: octeontx2-pf: Add AF_XDP non-zero copy support
+
+From: Suman Ghosh
+
+[ Upstream commit b4164de5041b51cda3438e75bce668e2556057c3 ]
+
+Set xdp rx ring memory type as MEM_TYPE_PAGE_POOL for
+af-xdp to work. This is needed since xdp_return_frame
+internally will use page pools.
+
+Fixes: 06059a1a9a4a ("octeontx2-pf: Add XDP support to netdev PF")
+Signed-off-by: Suman Ghosh
+Signed-off-by: Paolo Abeni
+Stable-dep-of: 184fb40f731b ("octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf")
+Signed-off-by: Sasha Levin
+---
+ drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c | 8 +++++++-
+ 1 file changed, 7 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
+index 2b49bfec78692..161cf33ef89ed 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
+@@ -1047,6 +1047,7 @@ static int otx2_cq_init(struct otx2_nic *pfvf, u16 qidx)
+ 	int err, pool_id, non_xdp_queues;
+ 	struct nix_aq_enq_req *aq;
+ 	struct otx2_cq_queue *cq;
++	struct otx2_pool *pool;
+ 
+ 	cq = &qset->cq[qidx];
+ 	cq->cq_idx = qidx;
+@@ -1055,8 +1056,13 @@ static int otx2_cq_init(struct otx2_nic *pfvf, u16 qidx)
+ 		cq->cq_type = CQ_RX;
+ 		cq->cint_idx = qidx;
+ 		cq->cqe_cnt = qset->rqe_cnt;
+-		if (pfvf->xdp_prog)
++		if (pfvf->xdp_prog) {
++			pool = &qset->pool[qidx];
+ 			xdp_rxq_info_reg(&cq->xdp_rxq, pfvf->netdev, qidx, 0);
++			xdp_rxq_info_reg_mem_model(&cq->xdp_rxq,
++						   MEM_TYPE_PAGE_POOL,
++						   pool->page_pool);
++		}
+ 	} else if (qidx < non_xdp_queues) {
+ 		cq->cq_type = CQ_TX;
+ 		cq->cint_idx = qidx - pfvf->hw.rx_queues;
+-- 
+2.39.5
+
diff --git a/queue-6.14/octeontx2-pf-af_xdp-zero-copy-receive-support.patch b/queue-6.14/octeontx2-pf-af_xdp-zero-copy-receive-support.patch
new file mode 100644
index 0000000000..28f5973822
--- /dev/null
+++ b/queue-6.14/octeontx2-pf-af_xdp-zero-copy-receive-support.patch
@@ -0,0 +1,896 @@
+From 50e57471fb6ed256d4860efe812cbb40a5ca2850 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Thu, 13 Feb 2025 11:01:38 +0530
+Subject: octeontx2-pf: AF_XDP zero copy receive support
+
+From: Suman Ghosh
+
+[ Upstream commit efabce29015189cb5cd8066cf29eb1d754de6c3c ]
+
+This patch adds support for AF_XDP zero copy on CN10K.
+This patch specifically adds receive side support. In this approach, once
+an xdp program with zero copy support on a specific rx queue is enabled,
+then that receive queue is disabled/detached from the existing kernel
+queue and re-assigned to the umem memory.
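+
+At pool init time the queue is matched against a bound XSK pool roughly
+like this (sketch; the complete logic is in the otx2_pool_init() hunk
+below):
+
+	/* If userspace bound an AF_XDP socket to this queue id, back the
+	 * RX pool with the UMEM instead of a page pool.
+	 */
+	xsk_pool = xsk_get_pool_from_qid(pfvf->netdev, pool_id);
+	if (xsk_pool) {
+		pool->xsk_pool = xsk_pool;
+		pool->xdp_cnt = numptrs;
+	}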
+ +Signed-off-by: Suman Ghosh +Signed-off-by: Paolo Abeni +Stable-dep-of: 184fb40f731b ("octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf") +Signed-off-by: Sasha Levin +--- + .../ethernet/marvell/octeontx2/nic/Makefile | 2 +- + .../ethernet/marvell/octeontx2/nic/cn10k.c | 7 +- + .../marvell/octeontx2/nic/otx2_common.c | 114 ++++++++--- + .../marvell/octeontx2/nic/otx2_common.h | 6 +- + .../ethernet/marvell/octeontx2/nic/otx2_pf.c | 25 ++- + .../marvell/octeontx2/nic/otx2_txrx.c | 73 +++++-- + .../marvell/octeontx2/nic/otx2_txrx.h | 6 + + .../ethernet/marvell/octeontx2/nic/otx2_vf.c | 12 +- + .../ethernet/marvell/octeontx2/nic/otx2_xsk.c | 182 ++++++++++++++++++ + .../ethernet/marvell/octeontx2/nic/otx2_xsk.h | 21 ++ + .../ethernet/marvell/octeontx2/nic/qos_sq.c | 2 +- + 11 files changed, 389 insertions(+), 61 deletions(-) + create mode 100644 drivers/net/ethernet/marvell/octeontx2/nic/otx2_xsk.c + create mode 100644 drivers/net/ethernet/marvell/octeontx2/nic/otx2_xsk.h + +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/Makefile b/drivers/net/ethernet/marvell/octeontx2/nic/Makefile +index cb6513ab35e74..69e0778f9ac10 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/Makefile ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/Makefile +@@ -9,7 +9,7 @@ obj-$(CONFIG_RVU_ESWITCH) += rvu_rep.o + + rvu_nicpf-y := otx2_pf.o otx2_common.o otx2_txrx.o otx2_ethtool.o \ + otx2_flows.o otx2_tc.o cn10k.o otx2_dmac_flt.o \ +- otx2_devlink.o qos_sq.o qos.o ++ otx2_devlink.o qos_sq.o qos.o otx2_xsk.o + rvu_nicvf-y := otx2_vf.o + rvu_rep-y := rep.o + +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c +index a15cc86635d66..c3b6e0f60a799 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c +@@ -112,9 +112,12 @@ int cn10k_refill_pool_ptrs(void *dev, struct otx2_cq_queue *cq) + struct otx2_nic *pfvf = dev; + int cnt = cq->pool_ptrs; + u64 ptrs[NPA_MAX_BURST]; ++ struct otx2_pool *pool; + dma_addr_t bufptr; + int num_ptrs = 1; + ++ pool = &pfvf->qset.pool[cq->cq_idx]; ++ + /* Refill pool with new buffers */ + while (cq->pool_ptrs) { + if (otx2_alloc_buffer(pfvf, cq, &bufptr)) { +@@ -124,7 +127,9 @@ int cn10k_refill_pool_ptrs(void *dev, struct otx2_cq_queue *cq) + break; + } + cq->pool_ptrs--; +- ptrs[num_ptrs] = (u64)bufptr + OTX2_HEAD_ROOM; ++ ptrs[num_ptrs] = pool->xsk_pool ? 
++ (u64)bufptr : (u64)bufptr + OTX2_HEAD_ROOM; ++ + num_ptrs++; + if (num_ptrs == NPA_MAX_BURST || cq->pool_ptrs == 0) { + __cn10k_aura_freeptr(pfvf, cq->cq_idx, ptrs, +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c +index 161cf33ef89ed..92b0dba07853a 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c +@@ -17,6 +17,7 @@ + #include "otx2_common.h" + #include "otx2_struct.h" + #include "cn10k.h" ++#include "otx2_xsk.h" + + static bool otx2_is_pfc_enabled(struct otx2_nic *pfvf) + { +@@ -549,10 +550,13 @@ static int otx2_alloc_pool_buf(struct otx2_nic *pfvf, struct otx2_pool *pool, + } + + static int __otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool, +- dma_addr_t *dma) ++ dma_addr_t *dma, int qidx, int idx) + { + u8 *buf; + ++ if (pool->xsk_pool) ++ return otx2_xsk_pool_alloc_buf(pfvf, pool, dma, idx); ++ + if (pool->page_pool) + return otx2_alloc_pool_buf(pfvf, pool, dma); + +@@ -571,12 +575,12 @@ static int __otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool, + } + + int otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool, +- dma_addr_t *dma) ++ dma_addr_t *dma, int qidx, int idx) + { + int ret; + + local_bh_disable(); +- ret = __otx2_alloc_rbuf(pfvf, pool, dma); ++ ret = __otx2_alloc_rbuf(pfvf, pool, dma, qidx, idx); + local_bh_enable(); + return ret; + } +@@ -584,7 +588,8 @@ int otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool, + int otx2_alloc_buffer(struct otx2_nic *pfvf, struct otx2_cq_queue *cq, + dma_addr_t *dma) + { +- if (unlikely(__otx2_alloc_rbuf(pfvf, cq->rbpool, dma))) ++ if (unlikely(__otx2_alloc_rbuf(pfvf, cq->rbpool, dma, ++ cq->cq_idx, cq->pool_ptrs - 1))) + return -ENOMEM; + return 0; + } +@@ -884,7 +889,7 @@ void otx2_sqb_flush(struct otx2_nic *pfvf) + #define RQ_PASS_LVL_AURA (255 - ((95 * 256) / 100)) /* RED when 95% is full */ + #define RQ_DROP_LVL_AURA (255 - ((99 * 256) / 100)) /* Drop when 99% is full */ + +-static int otx2_rq_init(struct otx2_nic *pfvf, u16 qidx, u16 lpb_aura) ++int otx2_rq_init(struct otx2_nic *pfvf, u16 qidx, u16 lpb_aura) + { + struct otx2_qset *qset = &pfvf->qset; + struct nix_aq_enq_req *aq; +@@ -1041,7 +1046,7 @@ int otx2_sq_init(struct otx2_nic *pfvf, u16 qidx, u16 sqb_aura) + + } + +-static int otx2_cq_init(struct otx2_nic *pfvf, u16 qidx) ++int otx2_cq_init(struct otx2_nic *pfvf, u16 qidx) + { + struct otx2_qset *qset = &pfvf->qset; + int err, pool_id, non_xdp_queues; +@@ -1057,11 +1062,18 @@ static int otx2_cq_init(struct otx2_nic *pfvf, u16 qidx) + cq->cint_idx = qidx; + cq->cqe_cnt = qset->rqe_cnt; + if (pfvf->xdp_prog) { +- pool = &qset->pool[qidx]; + xdp_rxq_info_reg(&cq->xdp_rxq, pfvf->netdev, qidx, 0); +- xdp_rxq_info_reg_mem_model(&cq->xdp_rxq, +- MEM_TYPE_PAGE_POOL, +- pool->page_pool); ++ pool = &qset->pool[qidx]; ++ if (pool->xsk_pool) { ++ xdp_rxq_info_reg_mem_model(&cq->xdp_rxq, ++ MEM_TYPE_XSK_BUFF_POOL, ++ NULL); ++ xsk_pool_set_rxq_info(pool->xsk_pool, &cq->xdp_rxq); ++ } else if (pool->page_pool) { ++ xdp_rxq_info_reg_mem_model(&cq->xdp_rxq, ++ MEM_TYPE_PAGE_POOL, ++ pool->page_pool); ++ } + } + } else if (qidx < non_xdp_queues) { + cq->cq_type = CQ_TX; +@@ -1281,9 +1293,10 @@ void otx2_free_bufs(struct otx2_nic *pfvf, struct otx2_pool *pool, + + pa = otx2_iova_to_phys(pfvf->iommu_domain, iova); + page = virt_to_head_page(phys_to_virt(pa)); +- + if (pool->page_pool) { + page_pool_put_full_page(pool->page_pool, 
page, true); ++ } else if (pool->xsk_pool) { ++ /* Note: No way of identifying xdp_buff */ + } else { + dma_unmap_page_attrs(pfvf->dev, iova, size, + DMA_FROM_DEVICE, +@@ -1298,6 +1311,7 @@ void otx2_free_aura_ptr(struct otx2_nic *pfvf, int type) + int pool_id, pool_start = 0, pool_end = 0, size = 0; + struct otx2_pool *pool; + u64 iova; ++ int idx; + + if (type == AURA_NIX_SQ) { + pool_start = otx2_get_pool_idx(pfvf, type, 0); +@@ -1312,16 +1326,21 @@ void otx2_free_aura_ptr(struct otx2_nic *pfvf, int type) + + /* Free SQB and RQB pointers from the aura pool */ + for (pool_id = pool_start; pool_id < pool_end; pool_id++) { +- iova = otx2_aura_allocptr(pfvf, pool_id); + pool = &pfvf->qset.pool[pool_id]; ++ iova = otx2_aura_allocptr(pfvf, pool_id); + while (iova) { + if (type == AURA_NIX_RQ) + iova -= OTX2_HEAD_ROOM; +- + otx2_free_bufs(pfvf, pool, iova, size); +- + iova = otx2_aura_allocptr(pfvf, pool_id); + } ++ ++ for (idx = 0 ; idx < pool->xdp_cnt; idx++) { ++ if (!pool->xdp[idx]) ++ continue; ++ ++ xsk_buff_free(pool->xdp[idx]); ++ } + } + } + +@@ -1338,7 +1357,8 @@ void otx2_aura_pool_free(struct otx2_nic *pfvf) + qmem_free(pfvf->dev, pool->stack); + qmem_free(pfvf->dev, pool->fc_addr); + page_pool_destroy(pool->page_pool); +- pool->page_pool = NULL; ++ devm_kfree(pfvf->dev, pool->xdp); ++ pool->xsk_pool = NULL; + } + devm_kfree(pfvf->dev, pfvf->qset.pool); + pfvf->qset.pool = NULL; +@@ -1425,6 +1445,7 @@ int otx2_pool_init(struct otx2_nic *pfvf, u16 pool_id, + int stack_pages, int numptrs, int buf_size, int type) + { + struct page_pool_params pp_params = { 0 }; ++ struct xsk_buff_pool *xsk_pool; + struct npa_aq_enq_req *aq; + struct otx2_pool *pool; + int err; +@@ -1468,21 +1489,35 @@ int otx2_pool_init(struct otx2_nic *pfvf, u16 pool_id, + aq->ctype = NPA_AQ_CTYPE_POOL; + aq->op = NPA_AQ_INSTOP_INIT; + +- if (type != AURA_NIX_RQ) { +- pool->page_pool = NULL; ++ if (type != AURA_NIX_RQ) ++ return 0; ++ ++ if (!test_bit(pool_id, pfvf->af_xdp_zc_qidx)) { ++ pp_params.order = get_order(buf_size); ++ pp_params.flags = PP_FLAG_DMA_MAP; ++ pp_params.pool_size = min(OTX2_PAGE_POOL_SZ, numptrs); ++ pp_params.nid = NUMA_NO_NODE; ++ pp_params.dev = pfvf->dev; ++ pp_params.dma_dir = DMA_FROM_DEVICE; ++ pool->page_pool = page_pool_create(&pp_params); ++ if (IS_ERR(pool->page_pool)) { ++ netdev_err(pfvf->netdev, "Creation of page pool failed\n"); ++ return PTR_ERR(pool->page_pool); ++ } + return 0; + } + +- pp_params.order = get_order(buf_size); +- pp_params.flags = PP_FLAG_DMA_MAP; +- pp_params.pool_size = min(OTX2_PAGE_POOL_SZ, numptrs); +- pp_params.nid = NUMA_NO_NODE; +- pp_params.dev = pfvf->dev; +- pp_params.dma_dir = DMA_FROM_DEVICE; +- pool->page_pool = page_pool_create(&pp_params); +- if (IS_ERR(pool->page_pool)) { +- netdev_err(pfvf->netdev, "Creation of page pool failed\n"); +- return PTR_ERR(pool->page_pool); ++ /* Set XSK pool to support AF_XDP zero-copy */ ++ xsk_pool = xsk_get_pool_from_qid(pfvf->netdev, pool_id); ++ if (xsk_pool) { ++ pool->xsk_pool = xsk_pool; ++ pool->xdp_cnt = numptrs; ++ pool->xdp = devm_kcalloc(pfvf->dev, ++ numptrs, sizeof(struct xdp_buff *), GFP_KERNEL); ++ if (IS_ERR(pool->xdp)) { ++ netdev_err(pfvf->netdev, "Creation of xsk pool failed\n"); ++ return PTR_ERR(pool->xdp); ++ } + } + + return 0; +@@ -1543,9 +1578,18 @@ int otx2_sq_aura_pool_init(struct otx2_nic *pfvf) + } + + for (ptr = 0; ptr < num_sqbs; ptr++) { +- err = otx2_alloc_rbuf(pfvf, pool, &bufptr); +- if (err) ++ err = otx2_alloc_rbuf(pfvf, pool, &bufptr, pool_id, ptr); ++ if (err) { ++ if 
(pool->xsk_pool) { ++ ptr--; ++ while (ptr >= 0) { ++ xsk_buff_free(pool->xdp[ptr]); ++ ptr--; ++ } ++ } + goto err_mem; ++ } ++ + pfvf->hw_ops->aura_freeptr(pfvf, pool_id, bufptr); + sq->sqb_ptrs[sq->sqb_count++] = (u64)bufptr; + } +@@ -1595,11 +1639,19 @@ int otx2_rq_aura_pool_init(struct otx2_nic *pfvf) + /* Allocate pointers and free them to aura/pool */ + for (pool_id = 0; pool_id < hw->rqpool_cnt; pool_id++) { + pool = &pfvf->qset.pool[pool_id]; ++ + for (ptr = 0; ptr < num_ptrs; ptr++) { +- err = otx2_alloc_rbuf(pfvf, pool, &bufptr); +- if (err) ++ err = otx2_alloc_rbuf(pfvf, pool, &bufptr, pool_id, ptr); ++ if (err) { ++ if (pool->xsk_pool) { ++ while (ptr) ++ xsk_buff_free(pool->xdp[--ptr]); ++ } + return -ENOMEM; ++ } ++ + pfvf->hw_ops->aura_freeptr(pfvf, pool_id, ++ pool->xsk_pool ? bufptr : + bufptr + OTX2_HEAD_ROOM); + } + } +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h +index 0bec3a6af26a0..7477038d29e21 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h +@@ -533,6 +533,8 @@ struct otx2_nic { + + /* Inline ipsec */ + struct cn10k_ipsec ipsec; ++ /* af_xdp zero-copy */ ++ unsigned long *af_xdp_zc_qidx; + }; + + static inline bool is_otx2_lbkvf(struct pci_dev *pdev) +@@ -1004,7 +1006,7 @@ void otx2_txschq_free_one(struct otx2_nic *pfvf, u16 lvl, u16 schq); + void otx2_free_pending_sqe(struct otx2_nic *pfvf); + void otx2_sqb_flush(struct otx2_nic *pfvf); + int otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool, +- dma_addr_t *dma); ++ dma_addr_t *dma, int qidx, int idx); + int otx2_rxtx_enable(struct otx2_nic *pfvf, bool enable); + void otx2_ctx_disable(struct mbox *mbox, int type, bool npa); + int otx2_nix_config_bp(struct otx2_nic *pfvf, bool enable); +@@ -1034,6 +1036,8 @@ void otx2_pfaf_mbox_destroy(struct otx2_nic *pf); + void otx2_disable_mbox_intr(struct otx2_nic *pf); + void otx2_disable_napi(struct otx2_nic *pf); + irqreturn_t otx2_cq_intr_handler(int irq, void *cq_irq); ++int otx2_rq_init(struct otx2_nic *pfvf, u16 qidx, u16 lpb_aura); ++int otx2_cq_init(struct otx2_nic *pfvf, u16 qidx); + + /* RSS configuration APIs*/ + int otx2_rss_init(struct otx2_nic *pfvf); +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c +index 4347a3c95350f..50a42cd5d50a2 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c +@@ -27,6 +27,7 @@ + #include "qos.h" + #include + #include "cn10k_ipsec.h" ++#include "otx2_xsk.h" + + #define DRV_NAME "rvu_nicpf" + #define DRV_STRING "Marvell RVU NIC Physical Function Driver" +@@ -1662,9 +1663,7 @@ void otx2_free_hw_resources(struct otx2_nic *pf) + struct nix_lf_free_req *free_req; + struct mbox *mbox = &pf->mbox; + struct otx2_cq_queue *cq; +- struct otx2_pool *pool; + struct msg_req *req; +- int pool_id; + int qidx; + + /* Ensure all SQE are processed */ +@@ -1705,13 +1704,6 @@ void otx2_free_hw_resources(struct otx2_nic *pf) + /* Free RQ buffer pointers*/ + otx2_free_aura_ptr(pf, AURA_NIX_RQ); + +- for (qidx = 0; qidx < pf->hw.rx_queues; qidx++) { +- pool_id = otx2_get_pool_idx(pf, AURA_NIX_RQ, qidx); +- pool = &pf->qset.pool[pool_id]; +- page_pool_destroy(pool->page_pool); +- pool->page_pool = NULL; +- } +- + otx2_free_cq_res(pf); + + /* Free all ingress bandwidth profiles allocated */ +@@ -2788,6 +2780,8 @@ static int 
otx2_xdp(struct net_device *netdev, struct netdev_bpf *xdp) + switch (xdp->command) { + case XDP_SETUP_PROG: + return otx2_xdp_setup(pf, xdp->prog); ++ case XDP_SETUP_XSK_POOL: ++ return otx2_xsk_pool_setup(pf, xdp->xsk.pool, xdp->xsk.queue_id); + default: + return -EINVAL; + } +@@ -2865,6 +2859,7 @@ static const struct net_device_ops otx2_netdev_ops = { + .ndo_set_vf_vlan = otx2_set_vf_vlan, + .ndo_get_vf_config = otx2_get_vf_config, + .ndo_bpf = otx2_xdp, ++ .ndo_xsk_wakeup = otx2_xsk_wakeup, + .ndo_xdp_xmit = otx2_xdp_xmit, + .ndo_setup_tc = otx2_setup_tc, + .ndo_set_vf_trust = otx2_ndo_set_vf_trust, +@@ -3203,16 +3198,26 @@ static int otx2_probe(struct pci_dev *pdev, const struct pci_device_id *id) + /* Enable link notifications */ + otx2_cgx_config_linkevents(pf, true); + ++ pf->af_xdp_zc_qidx = bitmap_zalloc(qcount, GFP_KERNEL); ++ if (!pf->af_xdp_zc_qidx) { ++ err = -ENOMEM; ++ goto err_sriov_cleannup; ++ } ++ + #ifdef CONFIG_DCB + err = otx2_dcbnl_set_ops(netdev); + if (err) +- goto err_pf_sriov_init; ++ goto err_free_zc_bmap; + #endif + + otx2_qos_init(pf, qos_txqs); + + return 0; + ++err_free_zc_bmap: ++ bitmap_free(pf->af_xdp_zc_qidx); ++err_sriov_cleannup: ++ otx2_sriov_vfcfg_cleanup(pf); + err_pf_sriov_init: + otx2_shutdown_tc(pf); + err_mcam_flow_del: +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c +index 4a72750431036..00b6903ba250c 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c +@@ -12,6 +12,7 @@ + #include + #include + #include ++#include + + #include "otx2_reg.h" + #include "otx2_common.h" +@@ -523,9 +524,10 @@ static void otx2_adjust_adaptive_coalese(struct otx2_nic *pfvf, struct otx2_cq_p + int otx2_napi_handler(struct napi_struct *napi, int budget) + { + struct otx2_cq_queue *rx_cq = NULL; ++ struct otx2_cq_queue *cq = NULL; ++ struct otx2_pool *pool = NULL; + struct otx2_cq_poll *cq_poll; + int workdone = 0, cq_idx, i; +- struct otx2_cq_queue *cq; + struct otx2_qset *qset; + struct otx2_nic *pfvf; + int filled_cnt = -1; +@@ -550,6 +552,7 @@ int otx2_napi_handler(struct napi_struct *napi, int budget) + + if (rx_cq && rx_cq->pool_ptrs) + filled_cnt = pfvf->hw_ops->refill_pool_ptrs(pfvf, rx_cq); ++ + /* Clear the IRQ */ + otx2_write64(pfvf, NIX_LF_CINTX_INT(cq_poll->cint_idx), BIT_ULL(0)); + +@@ -562,20 +565,31 @@ int otx2_napi_handler(struct napi_struct *napi, int budget) + if (pfvf->flags & OTX2_FLAG_ADPTV_INT_COAL_ENABLED) + otx2_adjust_adaptive_coalese(pfvf, cq_poll); + ++ if (likely(cq)) ++ pool = &pfvf->qset.pool[cq->cq_idx]; ++ + if (unlikely(!filled_cnt)) { + struct refill_work *work; + struct delayed_work *dwork; + +- work = &pfvf->refill_wrk[cq->cq_idx]; +- dwork = &work->pool_refill_work; +- /* Schedule a task if no other task is running */ +- if (!cq->refill_task_sched) { +- work->napi = napi; +- cq->refill_task_sched = true; +- schedule_delayed_work(dwork, +- msecs_to_jiffies(100)); ++ if (likely(cq)) { ++ work = &pfvf->refill_wrk[cq->cq_idx]; ++ dwork = &work->pool_refill_work; ++ /* Schedule a task if no other task is running */ ++ if (!cq->refill_task_sched) { ++ work->napi = napi; ++ cq->refill_task_sched = true; ++ schedule_delayed_work(dwork, ++ msecs_to_jiffies(100)); ++ } ++ /* Call wake-up for not able to fill buffers */ ++ if (pool->xsk_pool) ++ xsk_set_rx_need_wakeup(pool->xsk_pool); + } + } else { ++ /* Clear wake-up, since buffers are filled successfully */ ++ if (pool && 
pool->xsk_pool) ++ xsk_clear_rx_need_wakeup(pool->xsk_pool); + /* Re-enable interrupts */ + otx2_write64(pfvf, + NIX_LF_CINTX_ENA_W1S(cq_poll->cint_idx), +@@ -1226,15 +1240,19 @@ void otx2_cleanup_rx_cqes(struct otx2_nic *pfvf, struct otx2_cq_queue *cq, int q + u16 pool_id; + u64 iova; + +- if (pfvf->xdp_prog) ++ pool_id = otx2_get_pool_idx(pfvf, AURA_NIX_RQ, qidx); ++ pool = &pfvf->qset.pool[pool_id]; ++ ++ if (pfvf->xdp_prog) { ++ if (pool->page_pool) ++ xdp_rxq_info_unreg_mem_model(&cq->xdp_rxq); ++ + xdp_rxq_info_unreg(&cq->xdp_rxq); ++ } + + if (otx2_nix_cq_op_status(pfvf, cq) || !cq->pend_cqe) + return; + +- pool_id = otx2_get_pool_idx(pfvf, AURA_NIX_RQ, qidx); +- pool = &pfvf->qset.pool[pool_id]; +- + while (cq->pend_cqe) { + cqe = (struct nix_cqe_rx_s *)otx2_get_next_cqe(cq); + processed_cqe++; +@@ -1418,17 +1436,28 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, + struct otx2_cq_queue *cq, + bool *need_xdp_flush) + { ++ struct xdp_buff xdp, *xsk_buff = NULL; + unsigned char *hard_start; + struct otx2_pool *pool; + struct xdp_frame *xdpf; + int qidx = cq->cq_idx; +- struct xdp_buff xdp; + struct page *page; + u64 iova, pa; + u32 act; + int err; + + pool = &pfvf->qset.pool[qidx]; ++ ++ if (pool->xsk_pool) { ++ xsk_buff = pool->xdp[--cq->rbpool->xdp_top]; ++ if (!xsk_buff) ++ return false; ++ ++ xsk_buff->data_end = xsk_buff->data + cqe->sg.seg_size; ++ act = bpf_prog_run_xdp(prog, xsk_buff); ++ goto handle_xdp_verdict; ++ } ++ + iova = cqe->sg.seg_addr - OTX2_HEAD_ROOM; + pa = otx2_iova_to_phys(pfvf->iommu_domain, iova); + page = virt_to_page(phys_to_virt(pa)); +@@ -1441,6 +1470,7 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, + + act = bpf_prog_run_xdp(prog, &xdp); + ++handle_xdp_verdict: + switch (act) { + case XDP_PASS: + break; +@@ -1452,6 +1482,15 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, + cqe->sg.seg_size, qidx, XDP_TX); + case XDP_REDIRECT: + cq->pool_ptrs++; ++ if (xsk_buff) { ++ err = xdp_do_redirect(pfvf->netdev, xsk_buff, prog); ++ if (!err) { ++ *need_xdp_flush = true; ++ return true; ++ } ++ return false; ++ } ++ + err = xdp_do_redirect(pfvf->netdev, &xdp, prog); + if (!err) { + *need_xdp_flush = true; +@@ -1467,11 +1506,15 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, + bpf_warn_invalid_xdp_action(pfvf->netdev, prog, act); + break; + case XDP_ABORTED: ++ if (xsk_buff) ++ xsk_buff_free(xsk_buff); + trace_xdp_exception(pfvf->netdev, prog, act); + break; + case XDP_DROP: + cq->pool_ptrs++; +- if (page->pp) { ++ if (xsk_buff) { ++ xsk_buff_free(xsk_buff); ++ } else if (page->pp) { + page_pool_recycle_direct(pool->page_pool, page); + } else { + otx2_dma_unmap_page(pfvf, iova, pfvf->rbsize, +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h +index 92e1e84cad75c..8f346fbc8221f 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h +@@ -12,6 +12,7 @@ + #include + #include + #include ++#include + + #define LBK_CHAN_BASE 0x000 + #define SDP_CHAN_BASE 0x700 +@@ -128,7 +129,11 @@ struct otx2_pool { + struct qmem *stack; + struct qmem *fc_addr; + struct page_pool *page_pool; ++ struct xsk_buff_pool *xsk_pool; ++ struct xdp_buff **xdp; ++ u16 xdp_cnt; + u16 rbsize; ++ u16 xdp_top; + }; + + struct otx2_cq_queue { +@@ -145,6 +150,7 @@ struct otx2_cq_queue { + void *cqe_base; + struct qmem *cqe; + struct otx2_pool *rbpool; ++ bool xsk_zc_en; + struct xdp_rxq_info 
xdp_rxq; + } ____cacheline_aligned_in_smp; + +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c +index e926c6ce96cff..63ddd262d1229 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c +@@ -722,15 +722,25 @@ static int otx2vf_probe(struct pci_dev *pdev, const struct pci_device_id *id) + if (err) + goto err_shutdown_tc; + ++ vf->af_xdp_zc_qidx = bitmap_zalloc(qcount, GFP_KERNEL); ++ if (!vf->af_xdp_zc_qidx) { ++ err = -ENOMEM; ++ goto err_unreg_devlink; ++ } ++ + #ifdef CONFIG_DCB + err = otx2_dcbnl_set_ops(netdev); + if (err) +- goto err_shutdown_tc; ++ goto err_free_zc_bmap; + #endif + otx2_qos_init(vf, qos_txqs); + + return 0; + ++err_free_zc_bmap: ++ bitmap_free(vf->af_xdp_zc_qidx); ++err_unreg_devlink: ++ otx2_unregister_dl(vf); + err_shutdown_tc: + otx2_shutdown_tc(vf); + err_unreg_netdev: +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_xsk.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_xsk.c +new file mode 100644 +index 0000000000000..894c1e0aea6f1 +--- /dev/null ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_xsk.c +@@ -0,0 +1,182 @@ ++// SPDX-License-Identifier: GPL-2.0 ++/* Marvell RVU Ethernet driver ++ * ++ * Copyright (C) 2024 Marvell. ++ * ++ */ ++ ++#include ++#include ++#include ++#include ++ ++#include "otx2_common.h" ++#include "otx2_xsk.h" ++ ++int otx2_xsk_pool_alloc_buf(struct otx2_nic *pfvf, struct otx2_pool *pool, ++ dma_addr_t *dma, int idx) ++{ ++ struct xdp_buff *xdp; ++ int delta; ++ ++ xdp = xsk_buff_alloc(pool->xsk_pool); ++ if (!xdp) ++ return -ENOMEM; ++ ++ pool->xdp[pool->xdp_top++] = xdp; ++ *dma = OTX2_DATA_ALIGN(xsk_buff_xdp_get_dma(xdp)); ++ /* Adjust xdp->data for unaligned addresses */ ++ delta = *dma - xsk_buff_xdp_get_dma(xdp); ++ xdp->data += delta; ++ ++ return 0; ++} ++ ++static int otx2_xsk_ctx_disable(struct otx2_nic *pfvf, u16 qidx, int aura_id) ++{ ++ struct nix_cn10k_aq_enq_req *cn10k_rq_aq; ++ struct npa_aq_enq_req *aura_aq; ++ struct npa_aq_enq_req *pool_aq; ++ struct nix_aq_enq_req *rq_aq; ++ ++ if (test_bit(CN10K_LMTST, &pfvf->hw.cap_flag)) { ++ cn10k_rq_aq = otx2_mbox_alloc_msg_nix_cn10k_aq_enq(&pfvf->mbox); ++ if (!cn10k_rq_aq) ++ return -ENOMEM; ++ cn10k_rq_aq->qidx = qidx; ++ cn10k_rq_aq->rq.ena = 0; ++ cn10k_rq_aq->rq_mask.ena = 1; ++ cn10k_rq_aq->ctype = NIX_AQ_CTYPE_RQ; ++ cn10k_rq_aq->op = NIX_AQ_INSTOP_WRITE; ++ } else { ++ rq_aq = otx2_mbox_alloc_msg_nix_aq_enq(&pfvf->mbox); ++ if (!rq_aq) ++ return -ENOMEM; ++ rq_aq->qidx = qidx; ++ rq_aq->sq.ena = 0; ++ rq_aq->sq_mask.ena = 1; ++ rq_aq->ctype = NIX_AQ_CTYPE_RQ; ++ rq_aq->op = NIX_AQ_INSTOP_WRITE; ++ } ++ ++ aura_aq = otx2_mbox_alloc_msg_npa_aq_enq(&pfvf->mbox); ++ if (!aura_aq) ++ goto fail; ++ ++ aura_aq->aura_id = aura_id; ++ aura_aq->aura.ena = 0; ++ aura_aq->aura_mask.ena = 1; ++ aura_aq->ctype = NPA_AQ_CTYPE_AURA; ++ aura_aq->op = NPA_AQ_INSTOP_WRITE; ++ ++ pool_aq = otx2_mbox_alloc_msg_npa_aq_enq(&pfvf->mbox); ++ if (!pool_aq) ++ goto fail; ++ ++ pool_aq->aura_id = aura_id; ++ pool_aq->pool.ena = 0; ++ pool_aq->pool_mask.ena = 1; ++ ++ pool_aq->ctype = NPA_AQ_CTYPE_POOL; ++ pool_aq->op = NPA_AQ_INSTOP_WRITE; ++ ++ return otx2_sync_mbox_msg(&pfvf->mbox); ++ ++fail: ++ otx2_mbox_reset(&pfvf->mbox.mbox, 0); ++ return -ENOMEM; ++} ++ ++static void otx2_clean_up_rq(struct otx2_nic *pfvf, int qidx) ++{ ++ struct otx2_qset *qset = &pfvf->qset; ++ struct otx2_cq_queue *cq; ++ struct otx2_pool *pool; ++ u64 iova; 
++ ++ /* If the DOWN flag is set SQs are already freed */ ++ if (pfvf->flags & OTX2_FLAG_INTF_DOWN) ++ return; ++ ++ cq = &qset->cq[qidx]; ++ if (cq) ++ otx2_cleanup_rx_cqes(pfvf, cq, qidx); ++ ++ pool = &pfvf->qset.pool[qidx]; ++ iova = otx2_aura_allocptr(pfvf, qidx); ++ while (iova) { ++ iova -= OTX2_HEAD_ROOM; ++ otx2_free_bufs(pfvf, pool, iova, pfvf->rbsize); ++ iova = otx2_aura_allocptr(pfvf, qidx); ++ } ++ ++ mutex_lock(&pfvf->mbox.lock); ++ otx2_xsk_ctx_disable(pfvf, qidx, qidx); ++ mutex_unlock(&pfvf->mbox.lock); ++} ++ ++int otx2_xsk_pool_enable(struct otx2_nic *pf, struct xsk_buff_pool *pool, u16 qidx) ++{ ++ u16 rx_queues = pf->hw.rx_queues; ++ u16 tx_queues = pf->hw.tx_queues; ++ int err; ++ ++ if (qidx >= rx_queues || qidx >= tx_queues) ++ return -EINVAL; ++ ++ err = xsk_pool_dma_map(pool, pf->dev, DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING); ++ if (err) ++ return err; ++ ++ set_bit(qidx, pf->af_xdp_zc_qidx); ++ otx2_clean_up_rq(pf, qidx); ++ /* Kick start the NAPI context so that receiving will start */ ++ return otx2_xsk_wakeup(pf->netdev, qidx, XDP_WAKEUP_RX); ++} ++ ++int otx2_xsk_pool_disable(struct otx2_nic *pf, u16 qidx) ++{ ++ struct net_device *netdev = pf->netdev; ++ struct xsk_buff_pool *pool; ++ ++ pool = xsk_get_pool_from_qid(netdev, qidx); ++ if (!pool) ++ return -EINVAL; ++ ++ otx2_clean_up_rq(pf, qidx); ++ clear_bit(qidx, pf->af_xdp_zc_qidx); ++ xsk_pool_dma_unmap(pool, DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING); ++ ++ return 0; ++} ++ ++int otx2_xsk_pool_setup(struct otx2_nic *pf, struct xsk_buff_pool *pool, u16 qidx) ++{ ++ if (pool) ++ return otx2_xsk_pool_enable(pf, pool, qidx); ++ ++ return otx2_xsk_pool_disable(pf, qidx); ++} ++ ++int otx2_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags) ++{ ++ struct otx2_nic *pf = netdev_priv(dev); ++ struct otx2_cq_poll *cq_poll = NULL; ++ struct otx2_qset *qset = &pf->qset; ++ ++ if (pf->flags & OTX2_FLAG_INTF_DOWN) ++ return -ENETDOWN; ++ ++ if (queue_id >= pf->hw.rx_queues) ++ return -EINVAL; ++ ++ cq_poll = &qset->napi[queue_id]; ++ if (!cq_poll) ++ return -EINVAL; ++ ++ /* Trigger interrupt */ ++ if (!napi_if_scheduled_mark_missed(&cq_poll->napi)) ++ otx2_write64(pf, NIX_LF_CINTX_ENA_W1S(cq_poll->cint_idx), BIT_ULL(0)); ++ ++ return 0; ++} +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_xsk.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_xsk.h +new file mode 100644 +index 0000000000000..022b3433edbbb +--- /dev/null ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_xsk.h +@@ -0,0 +1,21 @@ ++/* SPDX-License-Identifier: GPL-2.0 */ ++/* Marvell RVU PF/VF Netdev Devlink ++ * ++ * Copyright (C) 2024 Marvell. 
++ * ++ */ ++ ++#ifndef OTX2_XSK_H ++#define OTX2_XSK_H ++ ++struct otx2_nic; ++struct xsk_buff_pool; ++ ++int otx2_xsk_pool_setup(struct otx2_nic *pf, struct xsk_buff_pool *pool, u16 qid); ++int otx2_xsk_pool_enable(struct otx2_nic *pf, struct xsk_buff_pool *pool, u16 qid); ++int otx2_xsk_pool_disable(struct otx2_nic *pf, u16 qid); ++int otx2_xsk_pool_alloc_buf(struct otx2_nic *pfvf, struct otx2_pool *pool, ++ dma_addr_t *dma, int idx); ++int otx2_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags); ++ ++#endif /* OTX2_XSK_H */ +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/qos_sq.c b/drivers/net/ethernet/marvell/octeontx2/nic/qos_sq.c +index 9d887bfc31089..c5dbae0e513b6 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/qos_sq.c ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/qos_sq.c +@@ -82,7 +82,7 @@ static int otx2_qos_sq_aura_pool_init(struct otx2_nic *pfvf, int qidx) + } + + for (ptr = 0; ptr < num_sqbs; ptr++) { +- err = otx2_alloc_rbuf(pfvf, pool, &bufptr); ++ err = otx2_alloc_rbuf(pfvf, pool, &bufptr, pool_id, ptr); + if (err) + goto sqb_free; + pfvf->hw_ops->aura_freeptr(pfvf, pool_id, bufptr); +-- +2.39.5 + diff --git a/queue-6.14/octeontx2-pf-avoid-adding-dcbnl_ops-for-lbk-and-sdp-.patch b/queue-6.14/octeontx2-pf-avoid-adding-dcbnl_ops-for-lbk-and-sdp-.patch new file mode 100644 index 0000000000..384cc59ffe --- /dev/null +++ b/queue-6.14/octeontx2-pf-avoid-adding-dcbnl_ops-for-lbk-and-sdp-.patch @@ -0,0 +1,45 @@ +From f3399a16bda7cc0c50c0e77d5aaac30247612e64 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 19 May 2025 12:56:58 +0530 +Subject: octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf + +From: Suman Ghosh + +[ Upstream commit 184fb40f731bd3353b0887731f7caba66609e9cd ] + +Priority flow control is not supported for LBK and SDP vf. This patch +adds support to not add dcbnl_ops for LBK and SDP vf. 
+
+Fixes: 8e67558177f8 ("octeontx2-pf: PFC config support with DCBx")
+Signed-off-by: Suman Ghosh
+Reviewed-by: Simon Horman
+Link: https://patch.msgid.link/20250519072658.2960851-1-sumang@marvell.com
+Signed-off-by: Paolo Abeni
+Signed-off-by: Sasha Levin
+---
+ drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c | 9 ++++++---
+ 1 file changed, 6 insertions(+), 3 deletions(-)
+
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
+index 63ddd262d1229..1f53bd5e45604 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
+@@ -729,9 +729,12 @@ static int otx2vf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ 	}
+
+ #ifdef CONFIG_DCB
+-	err = otx2_dcbnl_set_ops(netdev);
+-	if (err)
+-		goto err_free_zc_bmap;
++	/* Priority flow control is not supported for LBK and SDP vf(s) */
++	if (!(is_otx2_lbkvf(vf->pdev) || is_otx2_sdp_rep(vf->pdev))) {
++		err = otx2_dcbnl_set_ops(netdev);
++		if (err)
++			goto err_free_zc_bmap;
++	}
+ #endif
+ 	otx2_qos_init(vf, qos_txqs);
+
+--
+2.39.5
+
diff --git a/queue-6.14/octeontx2-pf-use-xdp_return_frame-to-free-xdp-buffer.patch b/queue-6.14/octeontx2-pf-use-xdp_return_frame-to-free-xdp-buffer.patch
new file mode 100644
index 0000000000..6eb2697116
--- /dev/null
+++ b/queue-6.14/octeontx2-pf-use-xdp_return_frame-to-free-xdp-buffer.patch
@@ -0,0 +1,216 @@
+From edccbd17a168becb32ec5a50ee7df1900c2a374a Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Thu, 13 Feb 2025 11:01:36 +0530
+Subject: octeontx2-pf: use xdp_return_frame() to free xdp buffers
+
+From: Suman Ghosh
+
+[ Upstream commit 94c80f748873514af27b9fac3f72acafcde3bcd6 ]
+
+xdp_return_frame() will help to free the xdp frames and their
+associated pages back to the page pool.
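The recycling model this change adopts is easy to picture outside the kernel. The following standalone C program is only an analogue, with hypothetical names throughout: completed buffers are handed back to their pool for reuse, which is what xdp_return_frame() arranges for page-pool-backed frames, instead of being released a page at a time with put_page().

```
/* Standalone sketch of the "return to pool, don't free" pattern.
 * All names are hypothetical; this is not driver code. */
#include <stdio.h>
#include <stdlib.h>

#define POOL_SZ 4

struct pool {
	void *slot[POOL_SZ];
	int top;			/* number of cached buffers */
};

static void *pool_get(struct pool *p)
{
	if (p->top > 0)
		return p->slot[--p->top];	/* fast path: recycle */
	return malloc(256);			/* slow path: fresh allocation */
}

/* Completion handler: like xdp_return_frame(), the buffer goes back
 * to its pool so the next allocation avoids the allocator. */
static void pool_put(struct pool *p, void *buf)
{
	if (p->top < POOL_SZ)
		p->slot[p->top++] = buf;
	else
		free(buf);			/* pool already full */
}

int main(void)
{
	struct pool p = { .top = 0 };
	void *a = pool_get(&p), *b = pool_get(&p);

	pool_put(&p, a);
	pool_put(&p, b);
	printf("cached buffers after completion: %d\n", p.top);
	pool_put(&p, pool_get(&p));	/* recycled, no malloc */
	while (p.top)			/* leak-free exit for the sketch */
		free(p.slot[--p.top]);
	return 0;
}
```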
+ +Signed-off-by: Geetha sowjanya +Signed-off-by: Suman Ghosh +Signed-off-by: Paolo Abeni +Stable-dep-of: 184fb40f731b ("octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf") +Signed-off-by: Sasha Levin +--- + .../marvell/octeontx2/nic/otx2_common.h | 4 +- + .../ethernet/marvell/octeontx2/nic/otx2_pf.c | 7 ++- + .../marvell/octeontx2/nic/otx2_txrx.c | 53 +++++++++++-------- + .../marvell/octeontx2/nic/otx2_txrx.h | 1 + + 4 files changed, 38 insertions(+), 27 deletions(-) + +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h +index 7cc12f10e8a15..0bec3a6af26a0 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h +@@ -21,6 +21,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -1095,7 +1096,8 @@ int otx2_del_macfilter(struct net_device *netdev, const u8 *mac); + int otx2_add_macfilter(struct net_device *netdev, const u8 *mac); + int otx2_enable_rxvlan(struct otx2_nic *pf, bool enable); + int otx2_install_rxvlan_offload_flow(struct otx2_nic *pfvf); +-bool otx2_xdp_sq_append_pkt(struct otx2_nic *pfvf, u64 iova, int len, u16 qidx); ++bool otx2_xdp_sq_append_pkt(struct otx2_nic *pfvf, struct xdp_frame *xdpf, ++ u64 iova, int len, u16 qidx, u16 flags); + u16 otx2_get_max_mtu(struct otx2_nic *pfvf); + int otx2_handle_ntuple_tc_features(struct net_device *netdev, + netdev_features_t features); +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c +index e1dde93e8af82..4347a3c95350f 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c +@@ -2691,7 +2691,6 @@ static int otx2_get_vf_config(struct net_device *netdev, int vf, + static int otx2_xdp_xmit_tx(struct otx2_nic *pf, struct xdp_frame *xdpf, + int qidx) + { +- struct page *page; + u64 dma_addr; + int err = 0; + +@@ -2701,11 +2700,11 @@ static int otx2_xdp_xmit_tx(struct otx2_nic *pf, struct xdp_frame *xdpf, + if (dma_mapping_error(pf->dev, dma_addr)) + return -ENOMEM; + +- err = otx2_xdp_sq_append_pkt(pf, dma_addr, xdpf->len, qidx); ++ err = otx2_xdp_sq_append_pkt(pf, xdpf, dma_addr, xdpf->len, ++ qidx, XDP_REDIRECT); + if (!err) { + otx2_dma_unmap_page(pf, dma_addr, xdpf->len, DMA_TO_DEVICE); +- page = virt_to_page(xdpf->data); +- put_page(page); ++ xdp_return_frame(xdpf); + return -ENOMEM; + } + return 0; +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c +index 224cef9389274..4a72750431036 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c +@@ -96,20 +96,16 @@ static unsigned int frag_num(unsigned int i) + + static void otx2_xdp_snd_pkt_handler(struct otx2_nic *pfvf, + struct otx2_snd_queue *sq, +- struct nix_cqe_tx_s *cqe) ++ struct nix_cqe_tx_s *cqe) + { + struct nix_send_comp_s *snd_comp = &cqe->comp; + struct sg_list *sg; +- struct page *page; +- u64 pa; + + sg = &sq->sg[snd_comp->sqe_id]; +- +- pa = otx2_iova_to_phys(pfvf->iommu_domain, sg->dma_addr[0]); +- otx2_dma_unmap_page(pfvf, sg->dma_addr[0], +- sg->size[0], DMA_TO_DEVICE); +- page = virt_to_page(phys_to_virt(pa)); +- put_page(page); ++ if (sg->flags & XDP_REDIRECT) ++ otx2_dma_unmap_page(pfvf, sg->dma_addr[0], sg->size[0], DMA_TO_DEVICE); ++ xdp_return_frame((struct xdp_frame *)sg->skb); ++ sg->skb = 
(u64)NULL; + } + + static void otx2_snd_pkt_handler(struct otx2_nic *pfvf, +@@ -1359,8 +1355,9 @@ void otx2_free_pending_sqe(struct otx2_nic *pfvf) + } + } + +-static void otx2_xdp_sqe_add_sg(struct otx2_snd_queue *sq, u64 dma_addr, +- int len, int *offset) ++static void otx2_xdp_sqe_add_sg(struct otx2_snd_queue *sq, ++ struct xdp_frame *xdpf, ++ u64 dma_addr, int len, int *offset, u16 flags) + { + struct nix_sqe_sg_s *sg = NULL; + u64 *iova = NULL; +@@ -1377,9 +1374,12 @@ static void otx2_xdp_sqe_add_sg(struct otx2_snd_queue *sq, u64 dma_addr, + sq->sg[sq->head].dma_addr[0] = dma_addr; + sq->sg[sq->head].size[0] = len; + sq->sg[sq->head].num_segs = 1; ++ sq->sg[sq->head].flags = flags; ++ sq->sg[sq->head].skb = (u64)xdpf; + } + +-bool otx2_xdp_sq_append_pkt(struct otx2_nic *pfvf, u64 iova, int len, u16 qidx) ++bool otx2_xdp_sq_append_pkt(struct otx2_nic *pfvf, struct xdp_frame *xdpf, ++ u64 iova, int len, u16 qidx, u16 flags) + { + struct nix_sqe_hdr_s *sqe_hdr; + struct otx2_snd_queue *sq; +@@ -1405,7 +1405,7 @@ bool otx2_xdp_sq_append_pkt(struct otx2_nic *pfvf, u64 iova, int len, u16 qidx) + + offset = sizeof(*sqe_hdr); + +- otx2_xdp_sqe_add_sg(sq, iova, len, &offset); ++ otx2_xdp_sqe_add_sg(sq, xdpf, iova, len, &offset, flags); + sqe_hdr->sizem1 = (offset / 16) - 1; + pfvf->hw_ops->sqe_flush(pfvf, sq, offset, qidx); + +@@ -1419,6 +1419,8 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, + bool *need_xdp_flush) + { + unsigned char *hard_start; ++ struct otx2_pool *pool; ++ struct xdp_frame *xdpf; + int qidx = cq->cq_idx; + struct xdp_buff xdp; + struct page *page; +@@ -1426,6 +1428,7 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, + u32 act; + int err; + ++ pool = &pfvf->qset.pool[qidx]; + iova = cqe->sg.seg_addr - OTX2_HEAD_ROOM; + pa = otx2_iova_to_phys(pfvf->iommu_domain, iova); + page = virt_to_page(phys_to_virt(pa)); +@@ -1444,19 +1447,21 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, + case XDP_TX: + qidx += pfvf->hw.tx_queues; + cq->pool_ptrs++; +- return otx2_xdp_sq_append_pkt(pfvf, iova, +- cqe->sg.seg_size, qidx); ++ xdpf = xdp_convert_buff_to_frame(&xdp); ++ return otx2_xdp_sq_append_pkt(pfvf, xdpf, cqe->sg.seg_addr, ++ cqe->sg.seg_size, qidx, XDP_TX); + case XDP_REDIRECT: + cq->pool_ptrs++; + err = xdp_do_redirect(pfvf->netdev, &xdp, prog); +- +- otx2_dma_unmap_page(pfvf, iova, pfvf->rbsize, +- DMA_FROM_DEVICE); + if (!err) { + *need_xdp_flush = true; + return true; + } +- put_page(page); ++ ++ otx2_dma_unmap_page(pfvf, iova, pfvf->rbsize, ++ DMA_FROM_DEVICE); ++ xdpf = xdp_convert_buff_to_frame(&xdp); ++ xdp_return_frame(xdpf); + break; + default: + bpf_warn_invalid_xdp_action(pfvf->netdev, prog, act); +@@ -1465,10 +1470,14 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, + trace_xdp_exception(pfvf->netdev, prog, act); + break; + case XDP_DROP: +- otx2_dma_unmap_page(pfvf, iova, pfvf->rbsize, +- DMA_FROM_DEVICE); +- put_page(page); + cq->pool_ptrs++; ++ if (page->pp) { ++ page_pool_recycle_direct(pool->page_pool, page); ++ } else { ++ otx2_dma_unmap_page(pfvf, iova, pfvf->rbsize, ++ DMA_FROM_DEVICE); ++ put_page(page); ++ } + return true; + } + return false; +diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h +index d23810963fdbd..92e1e84cad75c 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h ++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h +@@ -76,6 +76,7 @@ struct otx2_rcv_queue { + + struct sg_list { + u16 
num_segs; ++ u16 flags; + u64 skb; + u64 size[OTX2_MAX_FRAGS_IN_SQE]; + u64 dma_addr[OTX2_MAX_FRAGS_IN_SQE]; +-- +2.39.5 + diff --git a/queue-6.14/perf-x86-intel-fix-segfault-with-pebs-via-pt-with-sa.patch b/queue-6.14/perf-x86-intel-fix-segfault-with-pebs-via-pt-with-sa.patch new file mode 100644 index 0000000000..bf01d3cd2d --- /dev/null +++ b/queue-6.14/perf-x86-intel-fix-segfault-with-pebs-via-pt-with-sa.patch @@ -0,0 +1,101 @@ +From 4ddddc832ae561d88436781bbf1f578570169b43 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 8 May 2025 16:44:52 +0300 +Subject: perf/x86/intel: Fix segfault with PEBS-via-PT with sample_freq + +From: Adrian Hunter + +[ Upstream commit 99bcd91fabada0dbb1d5f0de44532d8008db93c6 ] + +Currently, using PEBS-via-PT with a sample frequency instead of a sample +period, causes a segfault. For example: + + BUG: kernel NULL pointer dereference, address: 0000000000000195 + + ? __die_body.cold+0x19/0x27 + ? page_fault_oops+0xca/0x290 + ? exc_page_fault+0x7e/0x1b0 + ? asm_exc_page_fault+0x26/0x30 + ? intel_pmu_pebs_event_update_no_drain+0x40/0x60 + ? intel_pmu_pebs_event_update_no_drain+0x32/0x60 + intel_pmu_drain_pebs_icl+0x333/0x350 + handle_pmi_common+0x272/0x3c0 + intel_pmu_handle_irq+0x10a/0x2e0 + perf_event_nmi_handler+0x2a/0x50 + +That happens because intel_pmu_pebs_event_update_no_drain() assumes all the +pebs_enabled bits represent counter indexes, which is not always the case. +In this particular case, bits 60 and 61 are set for PEBS-via-PT purposes. + +The behaviour of PEBS-via-PT with sample frequency is questionable because +although a PMI is generated (PEBS_PMI_AFTER_EACH_RECORD), the period is not +adjusted anyway. + +Putting that aside, fix intel_pmu_pebs_event_update_no_drain() by passing +the mask of counter bits instead of 'size'. Note, prior to the Fixes +commit, 'size' would be limited to the maximum counter index, so the issue +was not hit. + +Fixes: 722e42e45c2f1 ("perf/x86: Support counter mask") +Signed-off-by: Adrian Hunter +Signed-off-by: Ingo Molnar +Reviewed-by: Kan Liang +Cc: Peter Zijlstra +Cc: Ingo Molnar +Cc: Alexander Shishkin +Cc: Arnaldo Carvalho de Melo +Cc: Jiri Olsa +Cc: Namhyung Kim +Cc: Ian Rogers +Cc: linux-perf-users@vger.kernel.org +Link: https://lore.kernel.org/r/20250508134452.73960-1-adrian.hunter@intel.com +Signed-off-by: Sasha Levin +--- + arch/x86/events/intel/ds.c | 9 +++++---- + 1 file changed, 5 insertions(+), 4 deletions(-) + +diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c +index 08de293bebad1..97587d4c7befd 100644 +--- a/arch/x86/events/intel/ds.c ++++ b/arch/x86/events/intel/ds.c +@@ -2280,8 +2280,9 @@ static void intel_pmu_drain_pebs_core(struct pt_regs *iregs, struct perf_sample_ + setup_pebs_fixed_sample_data); + } + +-static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, int size) ++static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, u64 mask) + { ++ u64 pebs_enabled = cpuc->pebs_enabled & mask; + struct perf_event *event; + int bit; + +@@ -2292,7 +2293,7 @@ static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, int + * It needs to call intel_pmu_save_and_restart_reload() to + * update the event->count for this case. 
+ */ +- for_each_set_bit(bit, (unsigned long *)&cpuc->pebs_enabled, size) { ++ for_each_set_bit(bit, (unsigned long *)&pebs_enabled, X86_PMC_IDX_MAX) { + event = cpuc->events[bit]; + if (event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD) + intel_pmu_save_and_restart_reload(event, 0); +@@ -2327,7 +2328,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d + } + + if (unlikely(base >= top)) { +- intel_pmu_pebs_event_update_no_drain(cpuc, size); ++ intel_pmu_pebs_event_update_no_drain(cpuc, mask); + return; + } + +@@ -2441,7 +2442,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d + (hybrid(cpuc->pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED); + + if (unlikely(base >= top)) { +- intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX); ++ intel_pmu_pebs_event_update_no_drain(cpuc, mask); + return; + } + +-- +2.39.5 + diff --git a/queue-6.14/pinctrl-qcom-switch-to-devm_register_sys_off_handler.patch b/queue-6.14/pinctrl-qcom-switch-to-devm_register_sys_off_handler.patch new file mode 100644 index 0000000000..3bccf9c6ca --- /dev/null +++ b/queue-6.14/pinctrl-qcom-switch-to-devm_register_sys_off_handler.patch @@ -0,0 +1,96 @@ +From 6f3a6c3e7193810dae1d88cfaafec1ff657fe2b4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 13 May 2025 21:38:58 +0300 +Subject: pinctrl: qcom: switch to devm_register_sys_off_handler() + +From: Dmitry Baryshkov + +[ Upstream commit 41e452e6933d14146381ea25cff5e4d1ac2abea1 ] + +Error-handling paths in msm_pinctrl_probe() don't call +a function required to unroll restart handler registration, +unregister_restart_handler(). Instead of adding calls to this function, +switch the msm pinctrl code into using devm_register_sys_off_handler(). + +Fixes: cf1fc1876289 ("pinctrl: qcom: use restart_notifier mechanism for ps_hold") +Signed-off-by: Dmitry Baryshkov +Link: https://lore.kernel.org/20250513-pinctrl-msm-fix-v2-2-249999af0fc1@oss.qualcomm.com +Signed-off-by: Linus Walleij +Signed-off-by: Sasha Levin +--- + drivers/pinctrl/qcom/pinctrl-msm.c | 23 ++++++++++++----------- + 1 file changed, 12 insertions(+), 11 deletions(-) + +diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c +index 82f0cc43bbf4f..0eb816395dc64 100644 +--- a/drivers/pinctrl/qcom/pinctrl-msm.c ++++ b/drivers/pinctrl/qcom/pinctrl-msm.c +@@ -44,7 +44,6 @@ + * @pctrl: pinctrl handle. + * @chip: gpiochip handle. + * @desc: pin controller descriptor +- * @restart_nb: restart notifier block. + * @irq: parent irq for the TLMM irq_chip. 
+ * @intr_target_use_scm: route irq to application cpu using scm calls + * @lock: Spinlock to protect register resources as well +@@ -64,7 +63,6 @@ struct msm_pinctrl { + struct pinctrl_dev *pctrl; + struct gpio_chip chip; + struct pinctrl_desc desc; +- struct notifier_block restart_nb; + + int irq; + +@@ -1471,10 +1469,9 @@ static int msm_gpio_init(struct msm_pinctrl *pctrl) + return 0; + } + +-static int msm_ps_hold_restart(struct notifier_block *nb, unsigned long action, +- void *data) ++static int msm_ps_hold_restart(struct sys_off_data *data) + { +- struct msm_pinctrl *pctrl = container_of(nb, struct msm_pinctrl, restart_nb); ++ struct msm_pinctrl *pctrl = data->cb_data; + + writel(0, pctrl->regs[0] + PS_HOLD_OFFSET); + mdelay(1000); +@@ -1485,7 +1482,11 @@ static struct msm_pinctrl *poweroff_pctrl; + + static void msm_ps_hold_poweroff(void) + { +- msm_ps_hold_restart(&poweroff_pctrl->restart_nb, 0, NULL); ++ struct sys_off_data data = { ++ .cb_data = poweroff_pctrl, ++ }; ++ ++ msm_ps_hold_restart(&data); + } + + static void msm_pinctrl_setup_pm_reset(struct msm_pinctrl *pctrl) +@@ -1495,9 +1496,11 @@ static void msm_pinctrl_setup_pm_reset(struct msm_pinctrl *pctrl) + + for (i = 0; i < pctrl->soc->nfunctions; i++) + if (!strcmp(func[i].name, "ps_hold")) { +- pctrl->restart_nb.notifier_call = msm_ps_hold_restart; +- pctrl->restart_nb.priority = 128; +- if (register_restart_handler(&pctrl->restart_nb)) ++ if (devm_register_sys_off_handler(pctrl->dev, ++ SYS_OFF_MODE_RESTART, ++ 128, ++ msm_ps_hold_restart, ++ pctrl)) + dev_err(pctrl->dev, + "failed to setup restart handler.\n"); + poweroff_pctrl = pctrl; +@@ -1599,8 +1602,6 @@ void msm_pinctrl_remove(struct platform_device *pdev) + struct msm_pinctrl *pctrl = platform_get_drvdata(pdev); + + gpiochip_remove(&pctrl->chip); +- +- unregister_restart_handler(&pctrl->restart_nb); + } + EXPORT_SYMBOL(msm_pinctrl_remove); + +-- +2.39.5 + diff --git a/queue-6.14/ptp-ocp-limit-signal-freq-counts-in-summary-output-f.patch b/queue-6.14/ptp-ocp-limit-signal-freq-counts-in-summary-output-f.patch new file mode 100644 index 0000000000..58018b1b4d --- /dev/null +++ b/queue-6.14/ptp-ocp-limit-signal-freq-counts-in-summary-output-f.patch @@ -0,0 +1,131 @@ +From d7d348eb920809b223f67eb18c7a0e1b7c9c8363 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 14 May 2025 10:35:41 +0300 +Subject: ptp: ocp: Limit signal/freq counts in summary output functions + +From: Sagi Maimon + +[ Upstream commit c9e455581e2ba87ee38c126e8dc49a424b9df0cf ] + +The debugfs summary output could access uninitialized elements in +the freq_in[] and signal_out[] arrays, causing NULL pointer +dereferences and triggering a kernel Oops (page_fault_oops). +This patch adds u8 fields (nr_freq_in, nr_signal_out) to track the +number of initialized elements, with a maximum of 4 per array. +The summary output functions are updated to respect these limits, +preventing out-of-bounds access and ensuring safe array handling. + +Widen the label variables because the change confuses GCC about +max length of the strings. 
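Both halves of the fix can be reproduced in a standalone program. The sketch below uses hypothetical names, not the driver source: an 8-byte label buffer is silently truncated by snprintf() (and flagged by GCC's -Wformat-truncation), and bounding the summary loop by a per-board count of initialized entries, as the driver now does with the int fields signals_nr and freq_in_nr added by this patch, keeps it away from uninitialized slots.

```
/* Standalone illustration; names are hypothetical. */
#include <stdio.h>

#define MAX_SIG 4

struct board {
	const char *signal[MAX_SIG];	/* only the first nr entries are set */
	int nr;				/* like signals_nr / freq_in_nr */
};

static void summary(const struct board *b)
{
	char label[16];		/* with label[8], "signal 2" plus NUL no longer fits */
	int i;

	/* Bound the loop by the initialized count, not the array size. */
	for (i = 0; i < b->nr; i++) {
		snprintf(label, sizeof(label), "signal %d", i + 1);
		printf("%-10s %s\n", label, b->signal[i]);
	}
}

int main(void)
{
	/* A board that wires up only 2 of the 4 possible slots. */
	struct board adva_like = { .signal = { "10MHz", "PPS" }, .nr = 2 };

	summary(&adva_like);	/* looping to MAX_SIG here would print NULLs */
	return 0;
}
```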
+ +Fixes: ef61f5528fca ("ptp: ocp: add Adva timecard support") +Signed-off-by: Sagi Maimon +Reviewed-by: Simon Horman +Reviewed-by: Vadim Fedorenko +Link: https://patch.msgid.link/20250514073541.35817-1-maimon.sagi@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/ptp/ptp_ocp.c | 24 +++++++++++++++++------- + 1 file changed, 17 insertions(+), 7 deletions(-) + +diff --git a/drivers/ptp/ptp_ocp.c b/drivers/ptp/ptp_ocp.c +index 605cce32a3d37..e4be8b9291966 100644 +--- a/drivers/ptp/ptp_ocp.c ++++ b/drivers/ptp/ptp_ocp.c +@@ -315,6 +315,8 @@ struct ptp_ocp_serial_port { + #define OCP_BOARD_ID_LEN 13 + #define OCP_SERIAL_LEN 6 + #define OCP_SMA_NUM 4 ++#define OCP_SIGNAL_NUM 4 ++#define OCP_FREQ_NUM 4 + + enum { + PORT_GNSS, +@@ -342,8 +344,8 @@ struct ptp_ocp { + struct dcf_master_reg __iomem *dcf_out; + struct dcf_slave_reg __iomem *dcf_in; + struct tod_reg __iomem *nmea_out; +- struct frequency_reg __iomem *freq_in[4]; +- struct ptp_ocp_ext_src *signal_out[4]; ++ struct frequency_reg __iomem *freq_in[OCP_FREQ_NUM]; ++ struct ptp_ocp_ext_src *signal_out[OCP_SIGNAL_NUM]; + struct ptp_ocp_ext_src *pps; + struct ptp_ocp_ext_src *ts0; + struct ptp_ocp_ext_src *ts1; +@@ -378,10 +380,12 @@ struct ptp_ocp { + u32 utc_tai_offset; + u32 ts_window_adjust; + u64 fw_cap; +- struct ptp_ocp_signal signal[4]; ++ struct ptp_ocp_signal signal[OCP_SIGNAL_NUM]; + struct ptp_ocp_sma_connector sma[OCP_SMA_NUM]; + const struct ocp_sma_op *sma_op; + struct dpll_device *dpll; ++ int signals_nr; ++ int freq_in_nr; + }; + + #define OCP_REQ_TIMESTAMP BIT(0) +@@ -2697,6 +2701,8 @@ ptp_ocp_fb_board_init(struct ptp_ocp *bp, struct ocp_resource *r) + bp->eeprom_map = fb_eeprom_map; + bp->fw_version = ioread32(&bp->image->version); + bp->sma_op = &ocp_fb_sma_op; ++ bp->signals_nr = 4; ++ bp->freq_in_nr = 4; + + ptp_ocp_fb_set_version(bp); + +@@ -2862,6 +2868,8 @@ ptp_ocp_art_board_init(struct ptp_ocp *bp, struct ocp_resource *r) + bp->fw_version = ioread32(&bp->reg->version); + bp->fw_tag = 2; + bp->sma_op = &ocp_art_sma_op; ++ bp->signals_nr = 4; ++ bp->freq_in_nr = 4; + + /* Enable MAC serial port during initialisation */ + iowrite32(1, &bp->board_config->mro50_serial_activate); +@@ -2888,6 +2896,8 @@ ptp_ocp_adva_board_init(struct ptp_ocp *bp, struct ocp_resource *r) + bp->flash_start = 0xA00000; + bp->eeprom_map = fb_eeprom_map; + bp->sma_op = &ocp_adva_sma_op; ++ bp->signals_nr = 2; ++ bp->freq_in_nr = 2; + + version = ioread32(&bp->image->version); + /* if lower 16 bits are empty, this is the fw loader. 
*/
+@@ -4008,7 +4018,7 @@ _signal_summary_show(struct seq_file *s, struct ptp_ocp *bp, int nr)
+ {
+ 	struct signal_reg __iomem *reg = bp->signal_out[nr]->mem;
+ 	struct ptp_ocp_signal *signal = &bp->signal[nr];
+-	char label[8];
++	char label[16];
+ 	bool on;
+ 	u32 val;
+
+@@ -4034,7 +4044,7 @@ static void
+ _frequency_summary_show(struct seq_file *s, int nr,
+ 			struct frequency_reg __iomem *reg)
+ {
+-	char label[8];
++	char label[16];
+ 	bool on;
+ 	u32 val;
+
+@@ -4178,11 +4188,11 @@ ptp_ocp_summary_show(struct seq_file *s, void *data)
+ 	}
+
+ 	if (bp->fw_cap & OCP_CAP_SIGNAL)
+-		for (i = 0; i < 4; i++)
++		for (i = 0; i < bp->signals_nr; i++)
+ 			_signal_summary_show(s, bp, i);
+
+ 	if (bp->fw_cap & OCP_CAP_FREQ)
+-		for (i = 0; i < 4; i++)
++		for (i = 0; i < bp->freq_in_nr; i++)
+ 			_frequency_summary_show(s, i, bp->freq_in[i]);
+
+ 	if (bp->irig_out) {
+--
+2.39.5
+
diff --git a/queue-6.14/remoteproc-qcom_wcnss-fix-on-platforms-without-fallb.patch b/queue-6.14/remoteproc-qcom_wcnss-fix-on-platforms-without-fallb.patch
new file mode 100644
index 0000000000..c4e84d9b8b
--- /dev/null
+++ b/queue-6.14/remoteproc-qcom_wcnss-fix-on-platforms-without-fallb.patch
@@ -0,0 +1,45 @@
+From ea68755f26d3558f6c55542891bf9ecd01218c42 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Mon, 12 May 2025 02:40:15 +0300
+Subject: remoteproc: qcom_wcnss: Fix on platforms without fallback regulators
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Matti Lehtimäki
+
+[ Upstream commit 4ca45af0a56d00b86285d6fdd720dca3215059a7 ]
+
+A recent change to handle platforms with only a single power domain broke
+pronto-v3, which requires power domains and doesn't have fallback voltage
+regulators in case power domains are missing. Add a check to verify
+the number of fallback voltage regulators before using the code which
+handles the single power domain situation.
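The arithmetic the new check guards is simple to see in isolation. Below is a minimal standalone sketch with hypothetical values that mirrors the num_vregs adjustment in wcnss_init_regulators(): with more power domains than power-domain regulator entries, the unguarded form subtracts from the count of genuine fallback regulators.

```
/* Standalone arithmetic check; all values are hypothetical. */
#include <stdio.h>

int main(void)
{
	int num_pd_vregs = 2;	/* regulator entries that double as power domains */
	int num_vregs = 3;	/* genuine fallback regulators */
	int num_pds = 3;	/* power domains actually provided */

	/* Unguarded: 3 + (2 - 3) = 2, silently dropping a regulator. */
	int broken = num_vregs + (num_pd_vregs - num_pds);

	/* Guarded as in the fix: only fold in a genuine surplus. */
	int fixed = num_vregs;
	if (num_pds < num_pd_vregs)
		fixed += num_pd_vregs - num_pds;

	printf("unguarded: %d regulators, guarded: %d regulators\n",
	       broken, fixed);
	return 0;
}
```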
+ +Fixes: 65991ea8a6d1 ("remoteproc: qcom_wcnss: Handle platforms with only single power domain") +Signed-off-by: Matti Lehtimäki +Tested-by: Luca Weiss # sdm632-fairphone-fp3 +Link: https://lore.kernel.org/r/20250511234026.94735-1-matti.lehtimaki@gmail.com +Signed-off-by: Bjorn Andersson +Signed-off-by: Sasha Levin +--- + drivers/remoteproc/qcom_wcnss.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/drivers/remoteproc/qcom_wcnss.c b/drivers/remoteproc/qcom_wcnss.c +index 775b056d795a8..2c7e519a2254b 100644 +--- a/drivers/remoteproc/qcom_wcnss.c ++++ b/drivers/remoteproc/qcom_wcnss.c +@@ -456,7 +456,8 @@ static int wcnss_init_regulators(struct qcom_wcnss *wcnss, + if (wcnss->num_pds) { + info += wcnss->num_pds; + /* Handle single power domain case */ +- num_vregs += num_pd_vregs - wcnss->num_pds; ++ if (wcnss->num_pds < num_pd_vregs) ++ num_vregs += num_pd_vregs - wcnss->num_pds; + } else { + num_vregs += num_pd_vregs; + } +-- +2.39.5 + diff --git a/queue-6.14/sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch b/queue-6.14/sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch new file mode 100644 index 0000000000..7b35f0b089 --- /dev/null +++ b/queue-6.14/sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch @@ -0,0 +1,62 @@ +From c01a71b490cc5631accc5ca78f3caea085b96e69 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 18 May 2025 15:20:37 -0700 +Subject: sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue() + +From: Cong Wang + +[ Upstream commit 3f981138109f63232a5fb7165938d4c945cc1b9d ] + +When enqueuing the first packet to an HFSC class, hfsc_enqueue() calls the +child qdisc's peek() operation before incrementing sch->q.qlen and +sch->qstats.backlog. If the child qdisc uses qdisc_peek_dequeued(), this may +trigger an immediate dequeue and potential packet drop. In such cases, +qdisc_tree_reduce_backlog() is called, but the HFSC qdisc's qlen and backlog +have not yet been updated, leading to inconsistent queue accounting. This +can leave an empty HFSC class in the active list, causing further +consequences like use-after-free. + +This patch fixes the bug by moving the increment of sch->q.qlen and +sch->qstats.backlog before the call to the child qdisc's peek() operation. +This ensures that queue length and backlog are always accurate when packet +drops or dequeues are triggered during the peek. 
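The ordering rule at the heart of this fix, namely that the parent must count the packet before calling into a child that may immediately drop it, can be shown with a toy counter. This is a standalone analogue with hypothetical names, not the scheduler source:

```
/* Toy analogue of the qlen accounting order; names are hypothetical. */
#include <stdio.h>

static int qlen;	/* the parent's idea of how many packets it holds */

/* Like qdisc_peek_dequeued(), the peek may drop the packet, and the
 * drop path trims the parent's accounting right away, as
 * qdisc_tree_reduce_backlog() does. */
static void child_peek_and_maybe_drop(void)
{
	qlen -= 1;
	if (qlen == 0)
		printf("  consistent: queue empty, class can be deactivated\n");
	else
		printf("  inconsistent: qlen=%d while the queue is empty\n", qlen);
}

static void enqueue(int increment_first)
{
	if (increment_first)
		qlen += 1;	/* the fixed ordering */
	child_peek_and_maybe_drop();
	if (!increment_first)
		qlen += 1;	/* the buggy ordering */
}

int main(void)
{
	puts("buggy order (count after peek):");
	qlen = 0;
	enqueue(0);
	puts("fixed order (count before peek):");
	qlen = 0;
	enqueue(1);
	return 0;
}
```

The transient negative count in the buggy order corresponds to HFSC trimming a qlen that did not yet include the packet, which is how an empty class could be left on the active list.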
+ +Fixes: 12d0ad3be9c3 ("net/sched/sch_hfsc.c: handle corner cases where head may change invalidating calculated deadline") +Reported-by: Mingi Cho +Signed-off-by: Cong Wang +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/20250518222038.58538-2-xiyou.wangcong@gmail.com +Reviewed-by: Jamal Hadi Salim +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + net/sched/sch_hfsc.c | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c +index cb8c525ea20ea..7986145a527cb 100644 +--- a/net/sched/sch_hfsc.c ++++ b/net/sched/sch_hfsc.c +@@ -1569,6 +1569,9 @@ hfsc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) + return err; + } + ++ sch->qstats.backlog += len; ++ sch->q.qlen++; ++ + if (first && !cl->cl_nactive) { + if (cl->cl_flags & HFSC_RSC) + init_ed(cl, len); +@@ -1584,9 +1587,6 @@ hfsc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) + + } + +- sch->qstats.backlog += len; +- sch->q.qlen++; +- + return NET_XMIT_SUCCESS; + } + +-- +2.39.5 + diff --git a/queue-6.14/series b/queue-6.14/series index 5a0ca27af8..dd0894dac2 100644 --- a/queue-6.14/series +++ b/queue-6.14/series @@ -684,3 +684,43 @@ x86-kconfig-make-cfi_auto_default-depend-on-rust-or-.patch xenbus-allow-pvh-dom0-a-non-local-xenstore.patch drm-amd-display-call-fp-protect-before-mode-programm.patch __legitimize_mnt-check-for-mnt_sync_umount-should-be.patch +soundwire-bus-fix-race-on-the-creation-of-the-irq-do.patch +espintcp-fix-skb-leaks.patch +espintcp-remove-encap-socket-caching-to-avoid-refere.patch +xfrm-fix-udp-gro-handling-for-some-corner-cases.patch +dmaengine-idxd-fix-allowing-write-from-different-add.patch +x86-sev-fix-operator-precedence-in-ghcb_msr_vmpl_req.patch +kernel-fork-only-call-untrack_pfn_clear-on-vmas-dupl.patch +remoteproc-qcom_wcnss-fix-on-platforms-without-fallb.patch +clk-sunxi-ng-d1-add-missing-divider-for-mmc-mod-cloc.patch +xfrm-sanitize-marks-before-insert.patch +dmaengine-idxd-fix-poll-return-value.patch +dmaengine-fsl-edma-fix-return-code-for-unhandled-int.patch +driver-core-split-devres-apis-to-device-devres.h.patch +devres-introduce-devm_kmemdup_array.patch +asoc-sof-intel-hda-fix-uaf-when-reloading-module.patch +irqchip-riscv-imsic-start-local-sync-timer-on-correc.patch +perf-x86-intel-fix-segfault-with-pebs-via-pt-with-sa.patch +bluetooth-l2cap-fix-not-checking-l2cap_chan-security.patch +bluetooth-btusb-use-skb_pull-to-avoid-unsafe-access-.patch +ptp-ocp-limit-signal-freq-counts-in-summary-output-f.patch +bridge-netfilter-fix-forwarding-of-fragmented-packet.patch +mr-consolidate-the-ipmr_can_free_table-checks.patch +ice-fix-vf-num_mac-count-with-port-representors.patch +ice-fix-lacp-bonds-without-sriov-environment.patch +idpf-fix-null-ptr-deref-in-idpf_features_check.patch +loop-don-t-require-write_iter-for-writable-files-in-.patch +pinctrl-qcom-switch-to-devm_register_sys_off_handler.patch +net-dwmac-sun8i-use-parsed-internal-phy-address-inst.patch +net-lan743x-restore-sgmii-ctrl-register-on-resume.patch +xsk-bring-back-busy-polling-support-in-xdp_copy.patch +io_uring-fix-overflow-resched-cqe-reordering.patch +idpf-fix-idpf_vport_splitq_napi_poll.patch +sch_hfsc-fix-qlen-accounting-bug-when-using-peek-in-.patch +octeontx2-pf-use-xdp_return_frame-to-free-xdp-buffer.patch +octeontx2-pf-add-af_xdp-non-zero-copy-support.patch +octeontx2-pf-af_xdp-zero-copy-receive-support.patch +octeontx2-pf-avoid-adding-dcbnl_ops-for-lbk-and-sdp-.patch 
+net-tipc-fix-slab-use-after-free-read-in-tipc_aead_e.patch +octeontx2-af-set-lmt_ena-bit-for-apr-table-entries.patch +octeontx2-af-fix-apr-entry-mapping-based-on-apr_lmt_.patch diff --git a/queue-6.14/soundwire-bus-fix-race-on-the-creation-of-the-irq-do.patch b/queue-6.14/soundwire-bus-fix-race-on-the-creation-of-the-irq-do.patch new file mode 100644 index 0000000000..db734a167e --- /dev/null +++ b/queue-6.14/soundwire-bus-fix-race-on-the-creation-of-the-irq-do.patch @@ -0,0 +1,60 @@ +From 6842cae3e47b8cb1b9272028e60e667afce2636c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 9 Apr 2025 13:22:39 +0100 +Subject: soundwire: bus: Fix race on the creation of the IRQ domain + +From: Charles Keepax + +[ Upstream commit fd15594ba7d559d9da741504c322b9f57c4981e5 ] + +The SoundWire IRQ domain needs to be created before any slaves are added +to the bus, such that the domain is always available when needed. Move +the call to sdw_irq_create() before the calls to sdw_acpi_find_slaves() +and sdw_of_find_slaves(). + +Fixes: 12a95123bfe1 ("soundwire: bus: Allow SoundWire peripherals to register IRQ handlers") +Signed-off-by: Charles Keepax +Link: https://lore.kernel.org/r/20250409122239.1396489-1-ckeepax@opensource.cirrus.com +Signed-off-by: Vinod Koul +Signed-off-by: Sasha Levin +--- + drivers/soundwire/bus.c | 9 +++++---- + 1 file changed, 5 insertions(+), 4 deletions(-) + +diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c +index 9b295fc9acd53..df73e2c040904 100644 +--- a/drivers/soundwire/bus.c ++++ b/drivers/soundwire/bus.c +@@ -121,6 +121,10 @@ int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent, + set_bit(SDW_GROUP13_DEV_NUM, bus->assigned); + set_bit(SDW_MASTER_DEV_NUM, bus->assigned); + ++ ret = sdw_irq_create(bus, fwnode); ++ if (ret) ++ return ret; ++ + /* + * SDW is an enumerable bus, but devices can be powered off. So, + * they won't be able to report as present. +@@ -137,6 +141,7 @@ int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent, + + if (ret < 0) { + dev_err(bus->dev, "Finding slaves failed:%d\n", ret); ++ sdw_irq_delete(bus); + return ret; + } + +@@ -155,10 +160,6 @@ int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent, + bus->params.curr_bank = SDW_BANK0; + bus->params.next_bank = SDW_BANK1; + +- ret = sdw_irq_create(bus, fwnode); +- if (ret) +- return ret; +- + return 0; + } + EXPORT_SYMBOL(sdw_bus_master_add); +-- +2.39.5 + diff --git a/queue-6.14/x86-sev-fix-operator-precedence-in-ghcb_msr_vmpl_req.patch b/queue-6.14/x86-sev-fix-operator-precedence-in-ghcb_msr_vmpl_req.patch new file mode 100644 index 0000000000..6e5694289b --- /dev/null +++ b/queue-6.14/x86-sev-fix-operator-precedence-in-ghcb_msr_vmpl_req.patch @@ -0,0 +1,45 @@ +From 2c8a5392253cb17c0f4251f09ca839786a85f782 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 11 May 2025 18:23:28 +0900 +Subject: x86/sev: Fix operator precedence in GHCB_MSR_VMPL_REQ_LEVEL macro + +From: Seongman Lee + +[ Upstream commit f7387eff4bad33d12719c66c43541c095556ae4e ] + +The GHCB_MSR_VMPL_REQ_LEVEL macro lacked parentheses around the bitmask +expression, causing the shift operation to bind too early. As a result, +when requesting VMPL1 (e.g., GHCB_MSR_VMPL_REQ_LEVEL(1)), incorrect +values such as 0x000000016 were generated instead of the intended +0x100000016 (the requested VMPL level is specified in GHCBData[39:32]). + +Fix the precedence issue by grouping the masked value before applying +the shift. + + [ bp: Massage commit message. 
] + +Fixes: 34ff65901735 ("x86/sev: Use kernel provided SVSM Calling Areas") +Signed-off-by: Seongman Lee +Signed-off-by: Borislav Petkov (AMD) +Link: https://lore.kernel.org/20250511092329.12680-1-cloudlee1719@gmail.com +Signed-off-by: Sasha Levin +--- + arch/x86/include/asm/sev-common.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h +index dcbccdb280f9e..1385227125ab4 100644 +--- a/arch/x86/include/asm/sev-common.h ++++ b/arch/x86/include/asm/sev-common.h +@@ -116,7 +116,7 @@ enum psc_op { + #define GHCB_MSR_VMPL_REQ 0x016 + #define GHCB_MSR_VMPL_REQ_LEVEL(v) \ + /* GHCBData[39:32] */ \ +- (((u64)(v) & GENMASK_ULL(7, 0) << 32) | \ ++ ((((u64)(v) & GENMASK_ULL(7, 0)) << 32) | \ + /* GHCBDdata[11:0] */ \ + GHCB_MSR_VMPL_REQ) + +-- +2.39.5 + diff --git a/queue-6.14/xfrm-fix-udp-gro-handling-for-some-corner-cases.patch b/queue-6.14/xfrm-fix-udp-gro-handling-for-some-corner-cases.patch new file mode 100644 index 0000000000..ef9b16a450 --- /dev/null +++ b/queue-6.14/xfrm-fix-udp-gro-handling-for-some-corner-cases.patch @@ -0,0 +1,143 @@ +From 89698e025b1380abfdd23523489849b108790777 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 15 Apr 2025 13:13:18 +0200 +Subject: xfrm: Fix UDP GRO handling for some corner cases + +From: Tobias Brunner + +[ Upstream commit e3fd0577768584ece824c8b661c40fb3d912812a ] + +This fixes an issue that's caused if there is a mismatch between the data +offset in the GRO header and the length fields in the regular sk_buff due +to the pskb_pull()/skb_push() calls. That's because the UDP GRO layer +stripped off the UDP header via skb_gro_pull() already while the UDP +header was explicitly not pulled/pushed in this function. + +For example, an IKE packet that triggered this had len=data_len=1268 and +the data_offset in the GRO header was 28 (IPv4 + UDP). So pskb_pull() +was called with an offset of 28-8=20, which reduced len to 1248 and via +pskb_may_pull() and __pskb_pull_tail() it also set data_len to 1248. +As the ESP offload module was not loaded, the function bailed out and +called skb_push(), which restored len to 1268, however, data_len remained +at 1248. + +So while skb_headlen() was 0 before, it was now 20. The latter caused a +difference of 8 instead of 28 (or 0 if pskb_pull()/skb_push() was called +with the complete GRO data_offset) in gro_try_pull_from_frag0() that +triggered a call to gro_pull_from_frag0() that corrupted the packet. + +This change uses a more GRO-like approach seen in other GRO receivers +via skb_gro_header() to just read the actual data we are interested in +and does not try to "restore" the UDP header at this point to call the +existing function. If the offload module is not loaded, it immediately +bails out, otherwise, it only does a quick check to see if the packet +is an IKE or keepalive packet instead of calling the existing function. 
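The quick check that replaces the call into __xfrm4_udp_encap_rcv()/__xfrm6_udp_encap_rcv() is the RFC 3948 demultiplexing rule for UDP-encapsulated traffic: a datagram too short to hold an ESP header is a NAT-keepalive, and a zero first word is the non-ESP marker of an IKE packet. Below is a standalone sketch of that classification; the helper name is hypothetical.

```
/* Standalone sketch of the keepalive/IKE/ESP demux used by the GRO
 * fast path; the helper name is hypothetical. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ESP_HDR_LEN 8	/* SPI (4 bytes) + sequence number (4 bytes) */

static const char *classify_udp_encap(const uint8_t *data, size_t len)
{
	uint32_t first_word;

	if (len <= ESP_HDR_LEN)
		return "NAT-T keepalive (too short for ESP)";
	memcpy(&first_word, data, sizeof(first_word));
	if (first_word == 0)
		return "IKE (four-zero-byte non-ESP marker)";
	return "UDP-encapsulated ESP (first word is the SPI)";
}

int main(void)
{
	uint8_t keepalive[1] = { 0xff };
	uint8_t ike[16] = { 0 };			/* starts with the marker */
	uint8_t esp[16] = { 0x00, 0x00, 0x12, 0x34 };	/* nonzero SPI */

	printf("%s\n", classify_udp_encap(keepalive, sizeof(keepalive)));
	printf("%s\n", classify_udp_encap(ike, sizeof(ike)));
	printf("%s\n", classify_udp_encap(esp, sizeof(esp)));
	return 0;
}
```

The condition mirrors the new `len <= sizeof(struct ip_esp_hdr) || udpdata32[0] == 0` test in the patch.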
+ +Fixes: 172bf009c18d ("xfrm: Support GRO for IPv4 ESP in UDP encapsulation") +Fixes: 221ddb723d90 ("xfrm: Support GRO for IPv6 ESP in UDP encapsulation") +Signed-off-by: Tobias Brunner +Signed-off-by: Steffen Klassert +Signed-off-by: Sasha Levin +--- + net/ipv4/xfrm4_input.c | 18 ++++++++++-------- + net/ipv6/xfrm6_input.c | 18 ++++++++++-------- + 2 files changed, 20 insertions(+), 16 deletions(-) + +diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c +index b5b06323cfd94..0d31a8c108d4f 100644 +--- a/net/ipv4/xfrm4_input.c ++++ b/net/ipv4/xfrm4_input.c +@@ -182,11 +182,15 @@ struct sk_buff *xfrm4_gro_udp_encap_rcv(struct sock *sk, struct list_head *head, + int offset = skb_gro_offset(skb); + const struct net_offload *ops; + struct sk_buff *pp = NULL; +- int ret; +- +- offset = offset - sizeof(struct udphdr); ++ int len, dlen; ++ __u8 *udpdata; ++ __be32 *udpdata32; + +- if (!pskb_pull(skb, offset)) ++ len = skb->len - offset; ++ dlen = offset + min(len, 8); ++ udpdata = skb_gro_header(skb, dlen, offset); ++ udpdata32 = (__be32 *)udpdata; ++ if (unlikely(!udpdata)) + return NULL; + + rcu_read_lock(); +@@ -194,11 +198,10 @@ struct sk_buff *xfrm4_gro_udp_encap_rcv(struct sock *sk, struct list_head *head, + if (!ops || !ops->callbacks.gro_receive) + goto out; + +- ret = __xfrm4_udp_encap_rcv(sk, skb, false); +- if (ret) ++ /* check if it is a keepalive or IKE packet */ ++ if (len <= sizeof(struct ip_esp_hdr) || udpdata32[0] == 0) + goto out; + +- skb_push(skb, offset); + NAPI_GRO_CB(skb)->proto = IPPROTO_UDP; + + pp = call_gro_receive(ops->callbacks.gro_receive, head, skb); +@@ -208,7 +211,6 @@ struct sk_buff *xfrm4_gro_udp_encap_rcv(struct sock *sk, struct list_head *head, + + out: + rcu_read_unlock(); +- skb_push(skb, offset); + NAPI_GRO_CB(skb)->same_flow = 0; + NAPI_GRO_CB(skb)->flush = 1; + +diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c +index 4abc5e9d63227..841c81abaaf4f 100644 +--- a/net/ipv6/xfrm6_input.c ++++ b/net/ipv6/xfrm6_input.c +@@ -179,14 +179,18 @@ struct sk_buff *xfrm6_gro_udp_encap_rcv(struct sock *sk, struct list_head *head, + int offset = skb_gro_offset(skb); + const struct net_offload *ops; + struct sk_buff *pp = NULL; +- int ret; ++ int len, dlen; ++ __u8 *udpdata; ++ __be32 *udpdata32; + + if (skb->protocol == htons(ETH_P_IP)) + return xfrm4_gro_udp_encap_rcv(sk, head, skb); + +- offset = offset - sizeof(struct udphdr); +- +- if (!pskb_pull(skb, offset)) ++ len = skb->len - offset; ++ dlen = offset + min(len, 8); ++ udpdata = skb_gro_header(skb, dlen, offset); ++ udpdata32 = (__be32 *)udpdata; ++ if (unlikely(!udpdata)) + return NULL; + + rcu_read_lock(); +@@ -194,11 +198,10 @@ struct sk_buff *xfrm6_gro_udp_encap_rcv(struct sock *sk, struct list_head *head, + if (!ops || !ops->callbacks.gro_receive) + goto out; + +- ret = __xfrm6_udp_encap_rcv(sk, skb, false); +- if (ret) ++ /* check if it is a keepalive or IKE packet */ ++ if (len <= sizeof(struct ip_esp_hdr) || udpdata32[0] == 0) + goto out; + +- skb_push(skb, offset); + NAPI_GRO_CB(skb)->proto = IPPROTO_UDP; + + pp = call_gro_receive(ops->callbacks.gro_receive, head, skb); +@@ -208,7 +211,6 @@ struct sk_buff *xfrm6_gro_udp_encap_rcv(struct sock *sk, struct list_head *head, + + out: + rcu_read_unlock(); +- skb_push(skb, offset); + NAPI_GRO_CB(skb)->same_flow = 0; + NAPI_GRO_CB(skb)->flush = 1; + +-- +2.39.5 + diff --git a/queue-6.14/xfrm-sanitize-marks-before-insert.patch b/queue-6.14/xfrm-sanitize-marks-before-insert.patch new file mode 100644 index 0000000000..19ed3650a2 --- /dev/null +++ 
b/queue-6.14/xfrm-sanitize-marks-before-insert.patch
@@ -0,0 +1,71 @@
+From 2a1437b575e905adc1f23958b609ebec72a9341f Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 7 May 2025 13:31:58 +0200
+Subject: xfrm: Sanitize marks before insert
+
+From: Paul Chaignon
+
+[ Upstream commit 0b91fda3a1f044141e1e615456ff62508c32b202 ]
+
+Prior to this patch, the mark is sanitized (applying the state's mask to
+the state's value) only on inserts when checking if a conflicting XFRM
+state or policy exists.
+
+We discovered in Cilium that this same sanitization does not occur
+in the hot-path __xfrm_state_lookup. In the hot-path, the sk_buff's mark
+is simply compared to the state's value:
+
+    if ((mark & x->mark.m) != x->mark.v)
+        continue;
+
+Therefore, users can define unsanitized marks (ex. 0xf42/0xf00) which will
+never match any packet.
+
+This commit updates __xfrm_state_insert and xfrm_policy_insert to store
+the sanitized marks, thus removing this footgun.
+
+This has the side effect of changing the ip output, as the
+returned mark will have the mask applied to it when printed.
+
+Fixes: 3d6acfa7641f ("xfrm: SA lookups with mark")
+Signed-off-by: Paul Chaignon
+Signed-off-by: Louis DeLosSantos
+Co-developed-by: Louis DeLosSantos
+Signed-off-by: Steffen Klassert
+Signed-off-by: Sasha Levin
+---
+ net/xfrm/xfrm_policy.c | 3 +++
+ net/xfrm/xfrm_state.c  | 3 +++
+ 2 files changed, 6 insertions(+)
+
+diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
+index 6551e588fe526..50a17112c87af 100644
+--- a/net/xfrm/xfrm_policy.c
++++ b/net/xfrm/xfrm_policy.c
+@@ -1581,6 +1581,9 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl)
+ 	struct xfrm_policy *delpol;
+ 	struct hlist_head *chain;
+
++	/* Sanitize mark before store */
++	policy->mark.v &= policy->mark.m;
++
+ 	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
+ 	chain = policy_hash_bysel(net, &policy->selector, policy->family, dir);
+ 	if (chain)
+diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
+index fa8a8776d5397..8176081fa1f49 100644
+--- a/net/xfrm/xfrm_state.c
++++ b/net/xfrm/xfrm_state.c
+@@ -1718,6 +1718,9 @@ static void __xfrm_state_insert(struct xfrm_state *x)
+
+ 	list_add(&x->km.all, &net->xfrm.state_all);
+
++	/* Sanitize mark before store */
++	x->mark.v &= x->mark.m;
++
+ 	h = xfrm_dst_hash(net, &x->id.daddr, &x->props.saddr,
+ 			  x->props.reqid, x->props.family);
+ 	XFRM_STATE_INSERT(bydst, &x->bydst, net->xfrm.state_bydst + h,
+--
+2.39.5
+
diff --git a/queue-6.14/xsk-bring-back-busy-polling-support-in-xdp_copy.patch b/queue-6.14/xsk-bring-back-busy-polling-support-in-xdp_copy.patch
new file mode 100644
index 0000000000..77608b96a1
--- /dev/null
+++ b/queue-6.14/xsk-bring-back-busy-polling-support-in-xdp_copy.patch
@@ -0,0 +1,63 @@
+From 7fbee8ffa7ef62b96181729aa0e208584d463918 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Fri, 16 May 2025 21:36:38 +0000
+Subject: xsk: Bring back busy polling support in XDP_COPY
+
+From: Samiullah Khawaja
+
+[ Upstream commit b95ed5517354a5f451fb9b3771776ca4c0b65ac3 ]
+
+Commit 5ef44b3cb43b ("xsk: Bring back busy polling support") fixed the
+busy polling support in xsk for XDP_ZEROCOPY after it was broken in
+commit 86e25f40aa1e ("net: napi: Add napi_config"). The busy polling
+support with XDP_COPY remained broken since the napi_id setup in
+xsk_rcv_check was removed.
+
+Bring back the setup of napi_id for XDP_COPY so socket level SO_BUSY_POLL
+can be used to poll the underlying napi.
+ +Do the setup of napi_id for XDP_COPY in xsk_bind, as it is done +currently for XDP_ZEROCOPY. The setup of napi_id for XDP_COPY in +xsk_bind is safe because xsk_rcv_check checks that the rx queue at which +the packet arrives is equal to the queue_id that was supplied in bind. +This is done for both XDP_COPY and XDP_ZEROCOPY mode. + +Tested using AF_XDP support in virtio-net by running the xsk_rr AF_XDP +benchmarking tool shared here: +https://lore.kernel.org/all/20250320163523.3501305-1-skhawaja@google.com/T/ + +Enabled socket busy polling using following commands in qemu, + +``` +sudo ethtool -L eth0 combined 1 +echo 400 | sudo tee /proc/sys/net/core/busy_read +echo 100 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs +echo 15000 | sudo tee /sys/class/net/eth0/gro_flush_timeout +``` + +Fixes: 5ef44b3cb43b ("xsk: Bring back busy polling support") +Signed-off-by: Samiullah Khawaja +Reviewed-by: Willem de Bruijn +Acked-by: Stanislav Fomichev +Signed-off-by: David S. Miller +Signed-off-by: Sasha Levin +--- + net/xdp/xsk.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c +index c13e13fa79fc0..dc67870b76122 100644 +--- a/net/xdp/xsk.c ++++ b/net/xdp/xsk.c +@@ -1301,7 +1301,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) + xs->queue_id = qid; + xp_add_xsk(xs->pool, xs); + +- if (xs->zc && qid < dev->real_num_rx_queues) { ++ if (qid < dev->real_num_rx_queues) { + struct netdev_rx_queue *rxq; + + rxq = __netif_get_rx_queue(dev, qid); +-- +2.39.5 + -- 2.47.3
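For reference, the socket-level busy polling that this last patch restores for XDP_COPY is driven from userspace through the SO_BUSY_POLL family of options, alongside the sysctl and ethtool settings quoted in the commit message. The sketch below uses a plain UDP socket for brevity where an AF_XDP socket would be used in practice, and it assumes a kernel and libc recent enough to provide SO_PREFER_BUSY_POLL and SO_BUSY_POLL_BUDGET.

```
/* Minimal sketch: enabling socket-level busy polling from userspace.
 * In an AF_XDP application the same setsockopt() calls are made on
 * the XSK socket fd. */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SO_BUSY_POLL
#define SO_BUSY_POLL 46		/* fallback for older headers */
#endif
#ifndef SO_PREFER_BUSY_POLL
#define SO_PREFER_BUSY_POLL 69
#endif
#ifndef SO_BUSY_POLL_BUDGET
#define SO_BUSY_POLL_BUDGET 70
#endif

int main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	int usecs = 400;	/* like the busy_read value quoted above */
	int prefer = 1;
	int budget = 64;

	if (fd < 0)
		return 1;
	if (setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL, &usecs, sizeof(usecs)) ||
	    setsockopt(fd, SOL_SOCKET, SO_PREFER_BUSY_POLL, &prefer, sizeof(prefer)) ||
	    setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET, &budget, sizeof(budget)))
		perror("setsockopt");
	else
		puts("busy polling enabled for this socket");
	close(fd);
	return 0;
}
```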