From: Sasha Levin Date: Mon, 10 Feb 2025 03:58:37 +0000 (-0500) Subject: Fixes for 5.10 X-Git-Tag: v6.6.77~32 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=fca33e4965d49c996ecb32c120f0b18e2d628721;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for 5.10 Signed-off-by: Sasha Levin --- diff --git a/queue-5.10/firmware-iscsi_ibft-fix-iscsi_ibft-kconfig-entry.patch b/queue-5.10/firmware-iscsi_ibft-fix-iscsi_ibft-kconfig-entry.patch new file mode 100644 index 0000000000..9f92ad57dc --- /dev/null +++ b/queue-5.10/firmware-iscsi_ibft-fix-iscsi_ibft-kconfig-entry.patch @@ -0,0 +1,35 @@ +From eabcfe220a643b9ac54e3b40279e0ac4a039e56e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 11 Mar 2024 16:21:22 +0530 +Subject: firmware: iscsi_ibft: fix ISCSI_IBFT Kconfig entry + +From: Prasad Pandit + +[ Upstream commit e1e17a1715982201034024863efbf238bee2bdf9 ] + +Fix ISCSI_IBFT Kconfig entry, replace tab with a space character. + +Fixes: 138fe4e0697 ("Firmware: add iSCSI iBFT Support") +Signed-off-by: Prasad Pandit +Signed-off-by: Konrad Rzeszutek Wilk +Signed-off-by: Sasha Levin +--- + drivers/firmware/Kconfig | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig +index 807c5320dc0ff..a83101310e34f 100644 +--- a/drivers/firmware/Kconfig ++++ b/drivers/firmware/Kconfig +@@ -171,7 +171,7 @@ config ISCSI_IBFT + select ISCSI_BOOT_SYSFS + select ISCSI_IBFT_FIND if X86 + depends on ACPI && SCSI && SCSI_LOWLEVEL +- default n ++ default n + help + This option enables support for detection and exposing of iSCSI + Boot Firmware Table (iBFT) via sysfs to userspace. If you wish to +-- +2.39.5 + diff --git a/queue-5.10/gpio-pca953x-improve-interrupt-support.patch b/queue-5.10/gpio-pca953x-improve-interrupt-support.patch new file mode 100644 index 0000000000..491d531cbb --- /dev/null +++ b/queue-5.10/gpio-pca953x-improve-interrupt-support.patch @@ -0,0 +1,68 @@ +From 85aa045ced74e4f1dfb4c35802f8ae92a496e2a4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 6 Jun 2024 15:31:02 +1200 +Subject: gpio: pca953x: Improve interrupt support + +From: Mark Tomlinson + +[ Upstream commit d6179f6c6204f9932aed3a7a2100b4a295dfed9d ] + +The GPIO drivers with latch interrupt support (typically types starting +with PCAL) have interrupt status registers to determine which particular +inputs have caused an interrupt. Unfortunately there is no atomic +operation to read these registers and clear the interrupt. Clearing the +interrupt is done by reading the input registers. + +The code was reading the interrupt status registers, and then reading +the input registers. If an input changed between these two events it was +lost. + +The solution in this patch is to revert to the non-latch version of +code, i.e. remembering the previous input status, and looking for the +changes. This system results in no more I2C transfers, so is no slower. +The latch property of the device still means interrupts will still be +noticed if the input changes back to its initial state. + +Fixes: 44896beae605 ("gpio: pca953x: add PCAL9535 interrupt support for Galileo Gen2") +Signed-off-by: Mark Tomlinson +Reviewed-by: Andy Shevchenko +Link: https://lore.kernel.org/r/20240606033102.2271916-1-mark.tomlinson@alliedtelesis.co.nz +Signed-off-by: Bartosz Golaszewski +Signed-off-by: Sasha Levin +--- + drivers/gpio/gpio-pca953x.c | 19 ------------------- + 1 file changed, 19 deletions(-) + +diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c +index 3ad1a9e432c8a..64a4128b9a422 100644 +--- a/drivers/gpio/gpio-pca953x.c ++++ b/drivers/gpio/gpio-pca953x.c +@@ -732,25 +732,6 @@ static bool pca953x_irq_pending(struct pca953x_chip *chip, unsigned long *pendin + DECLARE_BITMAP(trigger, MAX_LINE); + int ret; + +- if (chip->driver_data & PCA_PCAL) { +- /* Read the current interrupt status from the device */ +- ret = pca953x_read_regs(chip, PCAL953X_INT_STAT, trigger); +- if (ret) +- return false; +- +- /* Check latched inputs and clear interrupt status */ +- ret = pca953x_read_regs(chip, chip->regs->input, cur_stat); +- if (ret) +- return false; +- +- /* Apply filter for rising/falling edge selection */ +- bitmap_replace(new_stat, chip->irq_trig_fall, chip->irq_trig_raise, cur_stat, gc->ngpio); +- +- bitmap_and(pending, new_stat, trigger, gc->ngpio); +- +- return !bitmap_empty(pending, gc->ngpio); +- } +- + ret = pca953x_read_regs(chip, chip->regs->input, cur_stat); + if (ret) + return false; +-- +2.39.5 + diff --git a/queue-5.10/gpu-drm_dp_cec-fix-broken-cec-adapter-properties-che.patch b/queue-5.10/gpu-drm_dp_cec-fix-broken-cec-adapter-properties-che.patch new file mode 100644 index 0000000000..b331304308 --- /dev/null +++ b/queue-5.10/gpu-drm_dp_cec-fix-broken-cec-adapter-properties-che.patch @@ -0,0 +1,90 @@ +From 8651e7b794d6eb6b699b7eafd0a25ebed7ba8d31 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 29 Jan 2025 10:51:48 +0100 +Subject: gpu: drm_dp_cec: fix broken CEC adapter properties check + +From: Hans Verkuil + +[ Upstream commit 6daaae5ff7f3b23a2dacc9c387ff3d4f95b67cad ] + +If the hotplug detect of a display is low for longer than one second +(configurable through drm_dp_cec_unregister_delay), then the CEC adapter +is unregistered since we assume the display was disconnected. If the +HPD went low for less than one second, then we check if the properties +of the CEC adapter have changed, since that indicates that we actually +switch to new hardware and we have to unregister the old CEC device and +register a new one. + +Unfortunately, the test for changed properties was written poorly, and +after a new CEC capability was added to the CEC core code the test always +returned true (i.e. the properties had changed). + +As a result the CEC device was unregistered and re-registered for every +HPD toggle. If the CEC remote controller integration was also enabled +(CONFIG_MEDIA_CEC_RC was set), then the corresponding input device was +also unregistered and re-registered. As a result the input device in +/sys would keep incrementing its number, e.g.: + +/sys/devices/pci0000:00/0000:00:08.1/0000:e7:00.0/rc/rc0/input20 + +Since short HPD toggles are common, the number could over time get into +the thousands. + +While not a serious issue (i.e. nothing crashes), it is not intended +to work that way. + +This patch changes the test so that it only checks for the single CEC +capability that can actually change, and it ignores any other +capabilities, so this is now safe as well if new caps are added in +the future. + +With the changed test the bit under #ifndef CONFIG_MEDIA_CEC_RC can be +dropped as well, so that's a nice cleanup. + +Signed-off-by: Hans Verkuil +Reported-by: Farblos +Reviewed-by: Dmitry Baryshkov +Fixes: 2c6d1fffa1d9 ("drm: add support for DisplayPort CEC-Tunneling-over-AUX") +Tested-by: Farblos +Link: https://patchwork.freedesktop.org/patch/msgid/361bb03d-1691-4e23-84da-0861ead5dbdc@xs4all.nl +Signed-off-by: Dmitry Baryshkov +Signed-off-by: Sasha Levin +--- + drivers/gpu/drm/drm_dp_cec.c | 14 +++----------- + 1 file changed, 3 insertions(+), 11 deletions(-) + +diff --git a/drivers/gpu/drm/drm_dp_cec.c b/drivers/gpu/drm/drm_dp_cec.c +index 3ab2609f9ec74..3ec770d602da6 100644 +--- a/drivers/gpu/drm/drm_dp_cec.c ++++ b/drivers/gpu/drm/drm_dp_cec.c +@@ -310,16 +310,6 @@ void drm_dp_cec_set_edid(struct drm_dp_aux *aux, const struct edid *edid) + if (!aux->transfer) + return; + +-#ifndef CONFIG_MEDIA_CEC_RC +- /* +- * CEC_CAP_RC is part of CEC_CAP_DEFAULTS, but it is stripped by +- * cec_allocate_adapter() if CONFIG_MEDIA_CEC_RC is undefined. +- * +- * Do this here as well to ensure the tests against cec_caps are +- * correct. +- */ +- cec_caps &= ~CEC_CAP_RC; +-#endif + cancel_delayed_work_sync(&aux->cec.unregister_work); + + mutex_lock(&aux->cec.lock); +@@ -336,7 +326,9 @@ void drm_dp_cec_set_edid(struct drm_dp_aux *aux, const struct edid *edid) + num_las = CEC_MAX_LOG_ADDRS; + + if (aux->cec.adap) { +- if (aux->cec.adap->capabilities == cec_caps && ++ /* Check if the adapter properties have changed */ ++ if ((aux->cec.adap->capabilities & CEC_CAP_MONITOR_ALL) == ++ (cec_caps & CEC_CAP_MONITOR_ALL) && + aux->cec.adap->available_log_addrs == num_las) { + /* Unchanged, so just set the phys addr */ + cec_s_phys_addr_from_edid(aux->cec.adap, edid); +-- +2.39.5 + diff --git a/queue-5.10/net-atlantic-fix-warning-during-hot-unplug.patch b/queue-5.10/net-atlantic-fix-warning-during-hot-unplug.patch new file mode 100644 index 0000000000..7c66349d72 --- /dev/null +++ b/queue-5.10/net-atlantic-fix-warning-during-hot-unplug.patch @@ -0,0 +1,71 @@ +From 7769670be9976929d89e24245625db40ae48e3e1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 3 Feb 2025 09:36:05 -0500 +Subject: net: atlantic: fix warning during hot unplug + +From: Jacob Moroni + +[ Upstream commit 028676bb189ed6d1b550a0fc570a9d695b6acfd3 ] + +Firmware deinitialization performs MMIO accesses which are not +necessary if the device has already been removed. In some cases, +these accesses happen via readx_poll_timeout_atomic which ends up +timing out, resulting in a warning at hw_atl2_utils_fw.c:112: + +[ 104.595913] Call Trace: +[ 104.595915] +[ 104.595918] ? show_regs+0x6c/0x80 +[ 104.595923] ? __warn+0x8d/0x150 +[ 104.595925] ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic] +[ 104.595934] ? report_bug+0x182/0x1b0 +[ 104.595938] ? handle_bug+0x6e/0xb0 +[ 104.595940] ? exc_invalid_op+0x18/0x80 +[ 104.595942] ? asm_exc_invalid_op+0x1b/0x20 +[ 104.595944] ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic] +[ 104.595952] ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic] +[ 104.595959] aq_nic_deinit.part.0+0xbd/0xf0 [atlantic] +[ 104.595964] aq_nic_deinit+0x17/0x30 [atlantic] +[ 104.595970] aq_ndev_close+0x2b/0x40 [atlantic] +[ 104.595975] __dev_close_many+0xad/0x160 +[ 104.595978] dev_close_many+0x99/0x170 +[ 104.595979] unregister_netdevice_many_notify+0x18b/0xb20 +[ 104.595981] ? __call_rcu_common+0xcd/0x700 +[ 104.595984] unregister_netdevice_queue+0xc6/0x110 +[ 104.595986] unregister_netdev+0x1c/0x30 +[ 104.595988] aq_pci_remove+0xb1/0xc0 [atlantic] + +Fix this by skipping firmware deinitialization altogether if the +PCI device is no longer present. + +Tested with an AQC113 attached via Thunderbolt by performing +repeated unplug cycles while traffic was running via iperf. + +Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code") +Signed-off-by: Jacob Moroni +Reviewed-by: Igor Russkikh +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/20250203143604.24930-3-mail@jakemoroni.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c +index 2d491efa11bdf..54aa84f06e403 100644 +--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c ++++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c +@@ -1294,7 +1294,9 @@ void aq_nic_deinit(struct aq_nic_s *self, bool link_down) + aq_ptp_ring_free(self); + aq_ptp_free(self); + +- if (likely(self->aq_fw_ops->deinit) && link_down) { ++ /* May be invoked during hot unplug. */ ++ if (pci_device_is_present(self->pdev) && ++ likely(self->aq_fw_ops->deinit) && link_down) { + mutex_lock(&self->fwreq_mutex); + self->aq_fw_ops->deinit(self->aq_hw); + mutex_unlock(&self->fwreq_mutex); +-- +2.39.5 + diff --git a/queue-5.10/net-rose-lock-the-socket-in-rose_bind.patch b/queue-5.10/net-rose-lock-the-socket-in-rose_bind.patch new file mode 100644 index 0000000000..2b70eb86e8 --- /dev/null +++ b/queue-5.10/net-rose-lock-the-socket-in-rose_bind.patch @@ -0,0 +1,87 @@ +From 5ad5a86afbef3a82fbd4e5874700d6745a73c306 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 3 Feb 2025 17:08:38 +0000 +Subject: net: rose: lock the socket in rose_bind() + +From: Eric Dumazet + +[ Upstream commit a1300691aed9ee852b0a9192e29e2bdc2411a7e6 ] + +syzbot reported a soft lockup in rose_loopback_timer(), +with a repro calling bind() from multiple threads. + +rose_bind() must lock the socket to avoid this issue. + +Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") +Reported-by: syzbot+7ff41b5215f0c534534e@syzkaller.appspotmail.com +Closes: https://lore.kernel.org/netdev/67a0f78d.050a0220.d7c5a.00a0.GAE@google.com/T/#u +Signed-off-by: Eric Dumazet +Acked-by: Paolo Abeni +Link: https://patch.msgid.link/20250203170838.3521361-1-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/rose/af_rose.c | 24 ++++++++++++++++-------- + 1 file changed, 16 insertions(+), 8 deletions(-) + +diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c +index 65fd5b99f9dea..f8cd085c42345 100644 +--- a/net/rose/af_rose.c ++++ b/net/rose/af_rose.c +@@ -700,11 +700,9 @@ static int rose_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) + struct net_device *dev; + ax25_address *source; + ax25_uid_assoc *user; ++ int err = -EINVAL; + int n; + +- if (!sock_flag(sk, SOCK_ZAPPED)) +- return -EINVAL; +- + if (addr_len != sizeof(struct sockaddr_rose) && addr_len != sizeof(struct full_sockaddr_rose)) + return -EINVAL; + +@@ -717,8 +715,15 @@ static int rose_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) + if ((unsigned int) addr->srose_ndigis > ROSE_MAX_DIGIS) + return -EINVAL; + +- if ((dev = rose_dev_get(&addr->srose_addr)) == NULL) +- return -EADDRNOTAVAIL; ++ lock_sock(sk); ++ ++ if (!sock_flag(sk, SOCK_ZAPPED)) ++ goto out_release; ++ ++ err = -EADDRNOTAVAIL; ++ dev = rose_dev_get(&addr->srose_addr); ++ if (!dev) ++ goto out_release; + + source = &addr->srose_call; + +@@ -729,7 +734,8 @@ static int rose_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) + } else { + if (ax25_uid_policy && !capable(CAP_NET_BIND_SERVICE)) { + dev_put(dev); +- return -EACCES; ++ err = -EACCES; ++ goto out_release; + } + rose->source_call = *source; + } +@@ -751,8 +757,10 @@ static int rose_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) + rose_insert_socket(sk); + + sock_reset_flag(sk, SOCK_ZAPPED); +- +- return 0; ++ err = 0; ++out_release: ++ release_sock(sk); ++ return err; + } + + static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_len, int flags) +-- +2.39.5 + diff --git a/queue-5.10/netem-update-sch-q.qlen-before-qdisc_tree_reduce_bac.patch b/queue-5.10/netem-update-sch-q.qlen-before-qdisc_tree_reduce_bac.patch new file mode 100644 index 0000000000..3acdc1b772 --- /dev/null +++ b/queue-5.10/netem-update-sch-q.qlen-before-qdisc_tree_reduce_bac.patch @@ -0,0 +1,44 @@ +From 652e622dfb2efeddb679bcdbbe397e9a164992c9 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 3 Feb 2025 16:58:40 -0800 +Subject: netem: Update sch->q.qlen before qdisc_tree_reduce_backlog() + +From: Cong Wang + +[ Upstream commit 638ba5089324796c2ee49af10427459c2de35f71 ] + +qdisc_tree_reduce_backlog() notifies parent qdisc only if child +qdisc becomes empty, therefore we need to reduce the backlog of the +child qdisc before calling it. Otherwise it would miss the opportunity +to call cops->qlen_notify(), in the case of DRR, it resulted in UAF +since DRR uses ->qlen_notify() to maintain its active list. + +Fixes: f8d4bc455047 ("net/sched: netem: account for backlog updates from child qdisc") +Cc: Martin Ottens +Reported-by: Mingi Cho +Signed-off-by: Cong Wang +Link: https://patch.msgid.link/20250204005841.223511-4-xiyou.wangcong@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/sched/sch_netem.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c +index f459e34684ad3..22f5d9421f6a6 100644 +--- a/net/sched/sch_netem.c ++++ b/net/sched/sch_netem.c +@@ -739,9 +739,9 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch) + if (err != NET_XMIT_SUCCESS) { + if (net_xmit_drop_count(err)) + qdisc_qstats_drop(sch); +- qdisc_tree_reduce_backlog(sch, 1, pkt_len); + sch->qstats.backlog -= pkt_len; + sch->q.qlen--; ++ qdisc_tree_reduce_backlog(sch, 1, pkt_len); + } + goto tfifo_dequeue; + } +-- +2.39.5 + diff --git a/queue-5.10/nvme-handle-connectivity-loss-in-nvme_set_queue_coun.patch b/queue-5.10/nvme-handle-connectivity-loss-in-nvme_set_queue_coun.patch new file mode 100644 index 0000000000..cad2d574ce --- /dev/null +++ b/queue-5.10/nvme-handle-connectivity-loss-in-nvme_set_queue_coun.patch @@ -0,0 +1,53 @@ +From 2ed24fb2d631bd1f058db13eca4fbb0cc5432af2 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 9 Jan 2025 14:30:48 +0100 +Subject: nvme: handle connectivity loss in nvme_set_queue_count + +From: Daniel Wagner + +[ Upstream commit 294b2b7516fd06a8dd82e4a6118f318ec521e706 ] + +When the set feature attempts fails with any NVME status code set in +nvme_set_queue_count, the function still report success. Though the +numbers of queues set to 0. This is done to support controllers in +degraded state (the admin queue is still up and running but no IO +queues). + +Though there is an exception. When nvme_set_features reports an host +path error, nvme_set_queue_count should propagate this error as the +connectivity is lost, which means also the admin queue is not working +anymore. + +Fixes: 9a0be7abb62f ("nvme: refactor set_queue_count") +Reviewed-by: Christoph Hellwig +Reviewed-by: Hannes Reinecke +Reviewed-by: Sagi Grimberg +Signed-off-by: Daniel Wagner +Signed-off-by: Keith Busch +Signed-off-by: Sasha Levin +--- + drivers/nvme/host/core.c | 8 +++++++- + 1 file changed, 7 insertions(+), 1 deletion(-) + +diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c +index f988a5e3f0e15..019a6dbdcbc28 100644 +--- a/drivers/nvme/host/core.c ++++ b/drivers/nvme/host/core.c +@@ -1491,7 +1491,13 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count) + + status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count, NULL, 0, + &result); +- if (status < 0) ++ ++ /* ++ * It's either a kernel error or the host observed a connection ++ * lost. In either case it's not possible communicate with the ++ * controller and thus enter the error code path. ++ */ ++ if (status < 0 || status == NVME_SC_HOST_PATH_ERROR) + return status; + + /* +-- +2.39.5 + diff --git a/queue-5.10/series b/queue-5.10/series index 6355af8fd5..579816cd05 100644 --- a/queue-5.10/series +++ b/queue-5.10/series @@ -164,3 +164,15 @@ net-usb-rtl8150-use-new-tasklet-api.patch net-usb-rtl8150-enable-basic-endpoint-checking.patch usb-xhci-add-timeout-argument-in-address_device-usb-.patch usb-xhci-fix-null-pointer-dereference-on-certain-com.patch +nvme-handle-connectivity-loss-in-nvme_set_queue_coun.patch +firmware-iscsi_ibft-fix-iscsi_ibft-kconfig-entry.patch +gpu-drm_dp_cec-fix-broken-cec-adapter-properties-che.patch +tg3-disable-tg3-pcie-aer-on-system-reboot.patch +udp-gso-do-not-drop-small-packets-when-pmtu-reduces.patch +gpio-pca953x-improve-interrupt-support.patch +net-atlantic-fix-warning-during-hot-unplug.patch +net-rose-lock-the-socket-in-rose_bind.patch +x86-xen-fix-xen_hypercall_hvm-to-not-clobber-rbx.patch +x86-xen-add-frame_end-to-xen_hypercall_hvm.patch +netem-update-sch-q.qlen-before-qdisc_tree_reduce_bac.patch +tun-revert-fix-group-permission-check.patch diff --git a/queue-5.10/tg3-disable-tg3-pcie-aer-on-system-reboot.patch b/queue-5.10/tg3-disable-tg3-pcie-aer-on-system-reboot.patch new file mode 100644 index 0000000000..905b02f3cf --- /dev/null +++ b/queue-5.10/tg3-disable-tg3-pcie-aer-on-system-reboot.patch @@ -0,0 +1,131 @@ +From c2f01ca0a8874e75538afa89242f7c09fc6c5c2f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 30 Jan 2025 16:57:54 -0500 +Subject: tg3: Disable tg3 PCIe AER on system reboot + +From: Lenny Szubowicz + +[ Upstream commit e0efe83ed325277bb70f9435d4d9fc70bebdcca8 ] + +Disable PCIe AER on the tg3 device on system reboot on a limited +list of Dell PowerEdge systems. This prevents a fatal PCIe AER event +on the tg3 device during the ACPI _PTS (prepare to sleep) method for +S5 on those systems. The _PTS is invoked by acpi_enter_sleep_state_prep() +as part of the kernel's reboot sequence as a result of commit +38f34dba806a ("PM: ACPI: reboot: Reinstate S5 for reboot"). + +There was an earlier fix for this problem by commit 2ca1c94ce0b6 +("tg3: Disable tg3 device on system reboot to avoid triggering AER"). +But it was discovered that this earlier fix caused a reboot hang +when some Dell PowerEdge servers were booted via ipxe. To address +this reboot hang, the earlier fix was essentially reverted by commit +9fc3bc764334 ("tg3: power down device only on SYSTEM_POWER_OFF"). +This re-exposed the tg3 PCIe AER on reboot problem. + +This fix is not an ideal solution because the root cause of the AER +is in system firmware. Instead, it's a targeted work-around in the +tg3 driver. + +Note also that the PCIe AER must be disabled on the tg3 device even +if the system is configured to use "firmware first" error handling. + +V3: + - Fix sparse warning on improper comparison of pdev->current_state + - Adhere to netdev comment style + +Fixes: 9fc3bc764334 ("tg3: power down device only on SYSTEM_POWER_OFF") +Signed-off-by: Lenny Szubowicz +Reviewed-by: Pavan Chebbi +Reviewed-by: Simon Horman +Signed-off-by: David S. Miller +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/broadcom/tg3.c | 58 +++++++++++++++++++++++++++++ + 1 file changed, 58 insertions(+) + +diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c +index 937579817f226..a7e8f13bb9761 100644 +--- a/drivers/net/ethernet/broadcom/tg3.c ++++ b/drivers/net/ethernet/broadcom/tg3.c +@@ -55,6 +55,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -18184,6 +18185,50 @@ static int tg3_resume(struct device *device) + + static SIMPLE_DEV_PM_OPS(tg3_pm_ops, tg3_suspend, tg3_resume); + ++/* Systems where ACPI _PTS (Prepare To Sleep) S5 will result in a fatal ++ * PCIe AER event on the tg3 device if the tg3 device is not, or cannot ++ * be, powered down. ++ */ ++static const struct dmi_system_id tg3_restart_aer_quirk_table[] = { ++ { ++ .matches = { ++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), ++ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R440"), ++ }, ++ }, ++ { ++ .matches = { ++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), ++ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R540"), ++ }, ++ }, ++ { ++ .matches = { ++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), ++ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R640"), ++ }, ++ }, ++ { ++ .matches = { ++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), ++ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R650"), ++ }, ++ }, ++ { ++ .matches = { ++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), ++ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R740"), ++ }, ++ }, ++ { ++ .matches = { ++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), ++ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R750"), ++ }, ++ }, ++ {} ++}; ++ + static void tg3_shutdown(struct pci_dev *pdev) + { + struct net_device *dev = pci_get_drvdata(pdev); +@@ -18200,6 +18245,19 @@ static void tg3_shutdown(struct pci_dev *pdev) + + if (system_state == SYSTEM_POWER_OFF) + tg3_power_down(tp); ++ else if (system_state == SYSTEM_RESTART && ++ dmi_first_match(tg3_restart_aer_quirk_table) && ++ pdev->current_state != PCI_D3cold && ++ pdev->current_state != PCI_UNKNOWN) { ++ /* Disable PCIe AER on the tg3 to avoid a fatal ++ * error during this system restart. ++ */ ++ pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL, ++ PCI_EXP_DEVCTL_CERE | ++ PCI_EXP_DEVCTL_NFERE | ++ PCI_EXP_DEVCTL_FERE | ++ PCI_EXP_DEVCTL_URRE); ++ } + + rtnl_unlock(); + +-- +2.39.5 + diff --git a/queue-5.10/tun-revert-fix-group-permission-check.patch b/queue-5.10/tun-revert-fix-group-permission-check.patch new file mode 100644 index 0000000000..26057c53df --- /dev/null +++ b/queue-5.10/tun-revert-fix-group-permission-check.patch @@ -0,0 +1,75 @@ +From 0b117ea2d1f24f77416451786957fc9f981e7f5d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 4 Feb 2025 11:10:06 -0500 +Subject: tun: revert fix group permission check + +From: Willem de Bruijn + +[ Upstream commit a70c7b3cbc0688016810bb2e0b9b8a0d6a530045 ] + +This reverts commit 3ca459eaba1bf96a8c7878de84fa8872259a01e3. + +The blamed commit caused a regression when neither tun->owner nor +tun->group is set. This is intended to be allowed, but now requires +CAP_NET_ADMIN. + +Discussion in the referenced thread pointed out that the original +issue that prompted this patch can be resolved in userspace. + +The relaxed access control may also make a device accessible when it +previously wasn't, while existing users may depend on it to not be. + +This is a clean pure git revert, except for fixing the indentation on +the gid_valid line that checkpatch correctly flagged. + +Fixes: 3ca459eaba1b ("tun: fix group permission check") +Link: https://lore.kernel.org/netdev/CAFqZXNtkCBT4f+PwyVRmQGoT3p1eVa01fCG_aNtpt6dakXncUg@mail.gmail.com/ +Signed-off-by: Willem de Bruijn +Cc: Ondrej Mosnacek +Cc: Stas Sergeev +Link: https://patch.msgid.link/20250204161015.739430-1-willemdebruijn.kernel@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/tun.c | 14 +++++--------- + 1 file changed, 5 insertions(+), 9 deletions(-) + +diff --git a/drivers/net/tun.c b/drivers/net/tun.c +index 52ea9f81d388b..3a89f9457fa24 100644 +--- a/drivers/net/tun.c ++++ b/drivers/net/tun.c +@@ -586,18 +586,14 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb, + return ret; + } + +-static inline bool tun_capable(struct tun_struct *tun) ++static inline bool tun_not_capable(struct tun_struct *tun) + { + const struct cred *cred = current_cred(); + struct net *net = dev_net(tun->dev); + +- if (ns_capable(net->user_ns, CAP_NET_ADMIN)) +- return 1; +- if (uid_valid(tun->owner) && uid_eq(cred->euid, tun->owner)) +- return 1; +- if (gid_valid(tun->group) && in_egroup_p(tun->group)) +- return 1; +- return 0; ++ return ((uid_valid(tun->owner) && !uid_eq(cred->euid, tun->owner)) || ++ (gid_valid(tun->group) && !in_egroup_p(tun->group))) && ++ !ns_capable(net->user_ns, CAP_NET_ADMIN); + } + + static void tun_set_real_num_queues(struct tun_struct *tun) +@@ -2776,7 +2772,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr) + !!(tun->flags & IFF_MULTI_QUEUE)) + return -EINVAL; + +- if (!tun_capable(tun)) ++ if (tun_not_capable(tun)) + return -EPERM; + err = security_tun_dev_open(tun->security); + if (err < 0) +-- +2.39.5 + diff --git a/queue-5.10/udp-gso-do-not-drop-small-packets-when-pmtu-reduces.patch b/queue-5.10/udp-gso-do-not-drop-small-packets-when-pmtu-reduces.patch new file mode 100644 index 0000000000..f0a5c84707 --- /dev/null +++ b/queue-5.10/udp-gso-do-not-drop-small-packets-when-pmtu-reduces.patch @@ -0,0 +1,113 @@ +From 5e4829d40cec7ddb367673faba457599195d1d7f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 31 Jan 2025 00:31:39 -0800 +Subject: udp: gso: do not drop small packets when PMTU reduces + +From: Yan Zhai + +[ Upstream commit 235174b2bed88501fda689c113c55737f99332d8 ] + +Commit 4094871db1d6 ("udp: only do GSO if # of segs > 1") avoided GSO +for small packets. But the kernel currently dismisses GSO requests only +after checking MTU/PMTU on gso_size. This means any packets, regardless +of their payload sizes, could be dropped when PMTU becomes smaller than +requested gso_size. We encountered this issue in production and it +caused a reliability problem that new QUIC connection cannot be +established before PMTU cache expired, while non GSO sockets still +worked fine at the same time. + +Ideally, do not check any GSO related constraints when payload size is +smaller than requested gso_size, and return EMSGSIZE instead of EINVAL +on MTU/PMTU check failure to be more specific on the error cause. + +Fixes: 4094871db1d6 ("udp: only do GSO if # of segs > 1") +Signed-off-by: Yan Zhai +Suggested-by: Willem de Bruijn +Reviewed-by: Willem de Bruijn +Signed-off-by: David S. Miller +Signed-off-by: Sasha Levin +--- + net/ipv4/udp.c | 4 ++-- + net/ipv6/udp.c | 4 ++-- + tools/testing/selftests/net/udpgso.c | 26 ++++++++++++++++++++++++++ + 3 files changed, 30 insertions(+), 4 deletions(-) + +diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c +index 6ad25dc9710c1..b801759147a68 100644 +--- a/net/ipv4/udp.c ++++ b/net/ipv4/udp.c +@@ -923,9 +923,9 @@ static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4, + const int hlen = skb_network_header_len(skb) + + sizeof(struct udphdr); + +- if (hlen + cork->gso_size > cork->fragsize) { ++ if (hlen + min(datalen, cork->gso_size) > cork->fragsize) { + kfree_skb(skb); +- return -EINVAL; ++ return -EMSGSIZE; + } + if (datalen > cork->gso_size * UDP_MAX_SEGMENTS) { + kfree_skb(skb); +diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c +index 203a6d64d7e99..224339c3d831d 100644 +--- a/net/ipv6/udp.c ++++ b/net/ipv6/udp.c +@@ -1210,9 +1210,9 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6, + const int hlen = skb_network_header_len(skb) + + sizeof(struct udphdr); + +- if (hlen + cork->gso_size > cork->fragsize) { ++ if (hlen + min(datalen, cork->gso_size) > cork->fragsize) { + kfree_skb(skb); +- return -EINVAL; ++ return -EMSGSIZE; + } + if (datalen > cork->gso_size * UDP_MAX_SEGMENTS) { + kfree_skb(skb); +diff --git a/tools/testing/selftests/net/udpgso.c b/tools/testing/selftests/net/udpgso.c +index 7badaf215de28..0e137182a4f40 100644 +--- a/tools/testing/selftests/net/udpgso.c ++++ b/tools/testing/selftests/net/udpgso.c +@@ -94,6 +94,19 @@ struct testcase testcases_v4[] = { + .gso_len = CONST_MSS_V4, + .r_num_mss = 1, + }, ++ { ++ /* datalen <= MSS < gso_len: will fall back to no GSO */ ++ .tlen = CONST_MSS_V4, ++ .gso_len = CONST_MSS_V4 + 1, ++ .r_num_mss = 0, ++ .r_len_last = CONST_MSS_V4, ++ }, ++ { ++ /* MSS < datalen < gso_len: fail */ ++ .tlen = CONST_MSS_V4 + 1, ++ .gso_len = CONST_MSS_V4 + 2, ++ .tfail = true, ++ }, + { + /* send a single MSS + 1B */ + .tlen = CONST_MSS_V4 + 1, +@@ -197,6 +210,19 @@ struct testcase testcases_v6[] = { + .gso_len = CONST_MSS_V6, + .r_num_mss = 1, + }, ++ { ++ /* datalen <= MSS < gso_len: will fall back to no GSO */ ++ .tlen = CONST_MSS_V6, ++ .gso_len = CONST_MSS_V6 + 1, ++ .r_num_mss = 0, ++ .r_len_last = CONST_MSS_V6, ++ }, ++ { ++ /* MSS < datalen < gso_len: fail */ ++ .tlen = CONST_MSS_V6 + 1, ++ .gso_len = CONST_MSS_V6 + 2, ++ .tfail = true ++ }, + { + /* send a single MSS + 1B */ + .tlen = CONST_MSS_V6 + 1, +-- +2.39.5 + diff --git a/queue-5.10/x86-xen-add-frame_end-to-xen_hypercall_hvm.patch b/queue-5.10/x86-xen-add-frame_end-to-xen_hypercall_hvm.patch new file mode 100644 index 0000000000..59fa539f6b --- /dev/null +++ b/queue-5.10/x86-xen-add-frame_end-to-xen_hypercall_hvm.patch @@ -0,0 +1,38 @@ +From a95b101d43ed852afc725e52d7d2dfde9c4340db Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 5 Feb 2025 10:07:56 +0100 +Subject: x86/xen: add FRAME_END to xen_hypercall_hvm() + +From: Juergen Gross + +[ Upstream commit 0bd797b801bd8ee06c822844e20d73aaea0878dd ] + +xen_hypercall_hvm() is missing a FRAME_END at the end, add it. + +Reported-by: kernel test robot +Closes: https://lore.kernel.org/oe-kbuild-all/202502030848.HTNTTuo9-lkp@intel.com/ +Fixes: b4845bb63838 ("x86/xen: add central hypercall functions") +Signed-off-by: Juergen Gross +Reviewed-by: Jan Beulich +Reviewed-by: Andrew Cooper +Signed-off-by: Juergen Gross +Signed-off-by: Sasha Levin +--- + arch/x86/xen/xen-head.S | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S +index 0dce73077c8cb..6105404ba5703 100644 +--- a/arch/x86/xen/xen-head.S ++++ b/arch/x86/xen/xen-head.S +@@ -130,6 +130,7 @@ SYM_FUNC_START(xen_hypercall_hvm) + pop %rcx + pop %rax + #endif ++ FRAME_END + /* Use correct hypercall function. */ + jz xen_hypercall_amd + jmp xen_hypercall_intel +-- +2.39.5 + diff --git a/queue-5.10/x86-xen-fix-xen_hypercall_hvm-to-not-clobber-rbx.patch b/queue-5.10/x86-xen-fix-xen_hypercall_hvm-to-not-clobber-rbx.patch new file mode 100644 index 0000000000..219b4842ca --- /dev/null +++ b/queue-5.10/x86-xen-fix-xen_hypercall_hvm-to-not-clobber-rbx.patch @@ -0,0 +1,44 @@ +From 262f128f3ac50591ff7eea3e8cf9d87ea69e812e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 5 Feb 2025 09:43:31 +0100 +Subject: x86/xen: fix xen_hypercall_hvm() to not clobber %rbx + +From: Juergen Gross + +[ Upstream commit 98a5cfd2320966f40fe049a9855f8787f0126825 ] + +xen_hypercall_hvm(), which is used when running as a Xen PVH guest at +most only once during early boot, is clobbering %rbx. Depending on +whether the caller relies on %rbx to be preserved across the call or +not, this clobbering might result in an early crash of the system. + +This can be avoided by using an already saved register instead of %rbx. + +Fixes: b4845bb63838 ("x86/xen: add central hypercall functions") +Signed-off-by: Juergen Gross +Reviewed-by: Jan Beulich +Reviewed-by: Andrew Cooper +Signed-off-by: Juergen Gross +Signed-off-by: Sasha Levin +--- + arch/x86/xen/xen-head.S | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S +index 152bbe900a174..0dce73077c8cb 100644 +--- a/arch/x86/xen/xen-head.S ++++ b/arch/x86/xen/xen-head.S +@@ -115,8 +115,8 @@ SYM_FUNC_START(xen_hypercall_hvm) + pop %ebx + pop %eax + #else +- lea xen_hypercall_amd(%rip), %rbx +- cmp %rax, %rbx ++ lea xen_hypercall_amd(%rip), %rcx ++ cmp %rax, %rcx + #ifdef CONFIG_FRAME_POINTER + pop %rax /* Dummy pop. */ + #endif +-- +2.39.5 +