From: Greg Kroah-Hartman
Date: Thu, 25 Jun 2026 11:18:58 +0000 (+0100)
Subject: 6.18-stable patches
X-Git-Tag: v6.18.37~24
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=800e689b61091e1fdb3ca5ff9dd4b591d36ad72e;p=thirdparty%2Fkernel%2Fstable-queue.git

6.18-stable patches

added patches:
	drivers-hv-vmbus-improve-the-logic-of-reserving-fb_mmio-on-gen2-vms.patch
	firmware-samsung-acpm-fix-cross-thread-rx-length-corruption.patch
	hv-utils-handle-and-propagate-errors-in-kvp_register.patch
	regulator-core-fix-locking-in-regulator_resolve_supply-error-path.patch
	rose-cancel-neighbour-timers-in-rose_neigh_put-before-freeing.patch
	rose-clear-neighbour-pointer-after-rose_neigh_put-in-state-machines.patch
	rose-clear-neighbour-pointer-in-rose_kill_by_device.patch
	rose-disconnect-orphaned-state_2-sockets-when-device-is-gone.patch
	rose-don-t-free-fd-owned-sockets-when-reaping-in-the-heartbeat.patch
	rose-drop-call_request-in-loopback-timer-when-device-is-not-running.patch
	rose-fix-dev_put-leak-in-rose_loopback_timer.patch
	rose-fix-netdev-double-hold-in-rose_make_new.patch
	rose-fix-netdev-double-hold-in-rose_rx_call_request.patch
	rose-fix-notifier-unregistered-too-early-in-rose_exit.patch
	rose-fix-race-between-loopback-timer-and-module-removal.patch
	rose-guard-rose_neigh_put-against-null-in-timer-expiry.patch
	rose-hold-loopback-neighbour-reference-across-timer-callback.patch
	rose-release-netdev-ref-and-destroy-orphaned-incoming-sockets.patch
	rose-set-sock_destroy-in-rose_kill_by_device-for-prompt-cleanup.patch
	sctp-disable-bh-before-calling-udp_tunnel_xmit_skb.patch
---

diff --git a/queue-6.18/drivers-hv-vmbus-improve-the-logic-of-reserving-fb_mmio-on-gen2-vms.patch b/queue-6.18/drivers-hv-vmbus-improve-the-logic-of-reserving-fb_mmio-on-gen2-vms.patch
new file mode 100644
index 0000000000..0fcaaa24b3
--- /dev/null
+++ b/queue-6.18/drivers-hv-vmbus-improve-the-logic-of-reserving-fb_mmio-on-gen2-vms.patch
@@ -0,0 +1,153 @@
+From stable+bounces-265000-greg=kroah.com@vger.kernel.org Tue Jun 16 18:01:40 2026
+From: Sasha Levin
+Date: Tue, 16 Jun 2026 12:56:09 -0400
+Subject: Drivers: hv: vmbus: Improve the logic of reserving fb_mmio on Gen2 VMs
+To: stable@vger.kernel.org
+Cc: Dexuan Cui , Michael Kelley , Krister Johansen , Matthew Ruffell , Wei Liu , Sasha Levin
+Message-ID: <20260616165609.3366925-1-sashal@kernel.org>
+
+From: Dexuan Cui
+
+[ Upstream commit 016a25e4b0df4d77e7c258edee4aaf982e4ee809 ]
+
+If vmbus_reserve_fb() in the kdump/kexec kernel fails to properly reserve
+the framebuffer MMIO range (which is below 4GB) due to a Gen2 VM's
+screen.lfb_base being zero [1], there is an MMIO conflict between the
+drivers hyperv-drm and pci-hyperv: when the driver pci-hyperv's
+hv_allocate_config_window() calls vmbus_allocate_mmio() to get an
+MMIO range, typically it gets a 32-bit MMIO range that overlaps with the
+framebuffer MMIO range, and later hv_pci_enter_d0() fails with an
+error message "PCI Pass-through VSP failed D0 Entry with status" since
+the host thinks that PCI devices must not use MMIO space that the
+host has assigned to the framebuffer.
+
+This is especially an issue if pci-hyperv is built-in and hyperv-drm is
+built as a module. Consequently, the kdump/kexec kernel fails to detect
+PCI devices via pci-hyperv, and may fail to mount the root file system,
+which may reside in a NVMe disk. The issue described here has existed
+for SR-IOV VF NICs since day one of the pci-hyperv driver, and has been
+worked around on x64 when possible. With the recent introduction of
+ARM64 VMs that boot from NVMe, there is no workaround, so we need a
+formal fix.
+
+On Gen2 VMs, if the screen.lfb_base is 0 in the kdump/kexec kernel [1],
+fall back to the low MMIO base, which should be equal to the framebuffer
+MMIO base [2] (the statement is true according to my testing on x64
+Windows Server 2016, and on x64 and ARM64 Windows Server 2025 and on
+Azure. I checked with the Hyper-V team and they said the statement should
+continue to be true for Gen2 VMs). In the first kernel, screen.lfb_base
+is not 0; if the user specifies a very high resolution, it's not enough
+to only reserve 8MB: let's always reserve half of the space below 4GB,
+but cap the reservation to 128MB, which is the required framebuffer size
+of the highest resolution 7680*4320 supported by Hyper-V.
+
+While at it, fix the comparison "end > VTPM_BASE_ADDRESS" by changing
+the > to >=. Here the 'end' is an inclusive end (typically, it's
+0xFFFF_FFFF for the low MMIO range).
+
+Note: vmbus_reserve_fb() now also reserves an MMIO range at the beginning
+of the low MMIO range on CVMs, which have no framebuffers (the
+'screen.lfb_base' in vmbus_reserve_fb() is 0 for CVMs), just in case the
+host might treat the beginning of the low MMIO range specially [3]. BTW,
+the OpenHCL kernel is not affected by the change, because that kernel
+boots with DeviceTree rather than ACPI (so vmbus_reserve_fb() won't run
+there), and there is no framebuffer device for that kernel.
+
+Note: normally Gen1 VMs don't have the MMIO conflict issue because the
+framebuffer MMIO range (which is hardcoded to base=4GB-128MB and
+size=64MB for Gen1 VMs by the host) is always reported via the legacy PCI
+graphics device's BAR, so the kdump/kexec kernel can reserve the 64MB
+MMIO range; however, if the VM is configured to use a very high resolution
+and the required framebuffer size exceeds 64MB (AFAIK, in practice, this
+isn't a typical configuration by users), the hyperv-drm driver may need to
+allocate an MMIO range above 4GB and change the framebuffer MMIO location
+to the allocated MMIO range -- in this case, there can still be issues [4]
+which can't be easily fixed: any possible affected Gen1 users would have
+to use a resolution whose framebuffer size is <= 64MB, or switch to Gen2
+VMs.
+
+[1] https://lore.kernel.org/all/SA1PR21MB692176C1BC53BFC9EAE5CF8EBF51A@SA1PR21MB6921.namprd21.prod.outlook.com/
+[2] https://lore.kernel.org/all/SA1PR21MB69218F955B62DFF62E3E88D2BF222@SA1PR21MB6921.namprd21.prod.outlook.com/
+[3] https://lore.kernel.org/all/SN6PR02MB415726B17D5A6027CD1717E8D4342@SN6PR02MB4157.namprd02.prod.outlook.com/
+[4] https://lore.kernel.org/all/SA1PR21MB69213486F821CA5A2C793C81BF342@SA1PR21MB6921.namprd21.prod.outlook.com/
+
+Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
+CC: stable@vger.kernel.org
+Reviewed-by: Michael Kelley
+Tested-by: Krister Johansen
+Tested-by: Matthew Ruffell
+Signed-off-by: Dexuan Cui
+Signed-off-by: Wei Liu
+Signed-off-by: Sasha Levin
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/hv/vmbus_drv.c | 29 ++++++++++++++++++++++++++---
+ 1 file changed, 26 insertions(+), 3 deletions(-)
+
+--- a/drivers/hv/vmbus_drv.c
++++ b/drivers/hv/vmbus_drv.c
+@@ -2291,8 +2291,8 @@ static acpi_status vmbus_walk_resources(
+ 		return AE_NO_MEMORY;
+ 
+ 	/* If this range overlaps the virtual TPM, truncate it. */
+-	if (end > VTPM_BASE_ADDRESS && start < VTPM_BASE_ADDRESS)
+-		end = VTPM_BASE_ADDRESS;
++	if (end >= VTPM_BASE_ADDRESS && start < VTPM_BASE_ADDRESS)
++		end = VTPM_BASE_ADDRESS - 1;
+ 
+ 	new_res->name = "hyperv mmio";
+ 	new_res->flags = IORESOURCE_MEM;
+@@ -2359,6 +2359,7 @@ static void vmbus_mmio_remove(void)
+ static void __maybe_unused vmbus_reserve_fb(void)
+ {
+ 	resource_size_t start = 0, size;
++	resource_size_t low_mmio_base;
+ 	struct pci_dev *pdev;
+ 
+ 	if (efi_enabled(EFI_BOOT)) {
+@@ -2366,6 +2367,24 @@ static void __maybe_unused vmbus_reserve
+ 		if (IS_ENABLED(CONFIG_SYSFB)) {
+ 			start = screen_info.lfb_base;
+ 			size = max_t(__u32, screen_info.lfb_size, 0x800000);
++
++			low_mmio_base = hyperv_mmio->start;
++			if (!low_mmio_base || upper_32_bits(low_mmio_base) ||
++			    (start && start < low_mmio_base)) {
++				pr_warn("Unexpected low mmio base %pa\n", &low_mmio_base);
++			} else {
++				/*
++				 * If the kdump/kexec or CVM kernel's lfb_base
++				 * is 0, fall back to the low mmio base.
++				 */
++				if (!start)
++					start = low_mmio_base;
++				/*
++				 * Reserve half of the space below 4GB for high
++				 * resolutions, but cap the reservation to 128MB.
++				 */
++				size = min((SZ_4G - start) / 2, SZ_128M);
++			}
+ 		}
+ 	} else {
+ 		/* Gen1 VM: get FB base from PCI */
+@@ -2386,8 +2405,10 @@ static void __maybe_unused vmbus_reserve
+ 		pci_dev_put(pdev);
+ 	}
+ 
+-	if (!start)
++	if (!start) {
++		pr_warn("Unexpected framebuffer mmio base of zero\n");
+ 		return;
++	}
+ 
+ 	/*
+ 	 * Make a claim for the frame buffer in the resource tree under the
+@@ -2397,6 +2418,8 @@ static void __maybe_unused vmbus_reserve
+ 	 */
+ 	for (; !fb_mmio && (size >= 0x100000); size >>= 1)
+ 		fb_mmio = __request_region(hyperv_mmio, start, size, fb_mmio_name, 0);
++
++	pr_info("hv_mmio=%pR,%pR fb=%pR\n", hyperv_mmio, hyperv_mmio->sibling, fb_mmio);
+ }
+ 
+ /**
diff --git a/queue-6.18/firmware-samsung-acpm-fix-cross-thread-rx-length-corruption.patch b/queue-6.18/firmware-samsung-acpm-fix-cross-thread-rx-length-corruption.patch
new file mode 100644
index 0000000000..1309d719a9
--- /dev/null
+++ b/queue-6.18/firmware-samsung-acpm-fix-cross-thread-rx-length-corruption.patch
@@ -0,0 +1,105 @@
+From stable+bounces-266607-greg=kroah.com@vger.kernel.org Wed Jun 17 02:47:28 2026
+From: Sasha Levin
+Date: Tue, 16 Jun 2026 21:47:19 -0400
+Subject: firmware: samsung: acpm: Fix cross-thread RX length corruption
+To: stable@vger.kernel.org
+Cc: Tudor Ambarus , Titouan Ameline de Cadeville , Krzysztof Kozlowski , Sasha Levin
+Message-ID: <20260617014719.3671631-1-sashal@kernel.org>
+
+From: Tudor Ambarus
+
+[ Upstream commit f133bd4b5daf71bccdde0ad1a4f47fac76a6bfb1 ]
+
+Sashiko identified a cross-thread RX length corruption bug when
+reviewing the thermal addition to ACPM [1].
+
+When multiple threads concurrently send IPC requests, the ACPM polling
+mechanism can encounter responses belonging to other threads. To drain
+the queue, the driver saves these concurrent responses into an internal
+cache (`rx_data->cmd`) to be retrieved later by the owning thread.
+
+Previously, the driver incorrectly used `xfer->rxcnt` (the expected
+receive length of the *current* polling thread) when copying data for
+*other* threads into this cache. If the threads expected responses of
+different lengths, this resulted in buffer underflows (leading to reads
+of uninitialized memory) or potential buffer overflows.
+
+Fix this by replacing the boolean `response` flag in
+`struct acpm_rx_data` with `rxcnt`, caching the exact expected receive
+length for each specific transaction during transfer preparation. Use
+this cached length when saving concurrent responses.
+
+Consequently, ensure that `xfer->rxcnt` is explicitly zeroed in driver
+helpers (e.g., `acpm_dvfs_set_xfer`) for fire-and-forget messages to
+prevent uninitialized stack garbage from being interpreted as a massive
+expected receive length.
+
+Cc: stable@vger.kernel.org
+Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver")
+Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
+Reported-by: Titouan Ameline de Cadeville
+Closes: https://lore.kernel.org/r/20260426210255.73674-1-titouan.ameline@gmail.com/
+Signed-off-by: Tudor Ambarus
+Link: https://patch.msgid.link/20260505-acpm-fixes-sashiko-reports-v5-1-43b5ee7f1674@linaro.org
+Signed-off-by: Krzysztof Kozlowski
+Signed-off-by: Sasha Levin
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/firmware/samsung/exynos-acpm.c | 14 +++++++-------
+ 1 file changed, 7 insertions(+), 7 deletions(-)
+
+--- a/drivers/firmware/samsung/exynos-acpm.c
++++ b/drivers/firmware/samsung/exynos-acpm.c
+@@ -103,12 +103,12 @@ struct acpm_queue {
+  *
+  * @cmd: pointer to where the data shall be saved.
+  * @n_cmd: number of 32-bit commands.
+- * @response: true if the client expects the RX data.
++ * @rxcnt: expected length of the response in 32-bit words.
+  */
+ struct acpm_rx_data {
+ 	u32 *cmd;
+ 	size_t n_cmd;
+-	bool response;
++	size_t rxcnt;
+ };
+ 
+ #define ACPM_SEQNUM_MAX 64
+@@ -196,7 +196,7 @@ static void acpm_get_saved_rx(struct acp
+ 	const struct acpm_rx_data *rx_data = &achan->rx_data[tx_seqnum - 1];
+ 	u32 rx_seqnum;
+ 
+-	if (!rx_data->response)
++	if (!rx_data->rxcnt)
+ 		return;
+ 
+ 	rx_seqnum = FIELD_GET(ACPM_PROTOCOL_SEQNUM, rx_data->cmd[0]);
+@@ -253,7 +253,7 @@ static int acpm_get_rx(struct acpm_chan
+ 		seqnum = rx_seqnum - 1;
+ 		rx_data = &achan->rx_data[seqnum];
+ 
+-		if (rx_data->response) {
++		if (rx_data->rxcnt) {
+ 			if (rx_seqnum == tx_seqnum) {
+ 				__ioread32_copy(xfer->rxd, addr,
+ 						xfer->rxlen / 4);
+@@ -267,7 +267,7 @@ static int acpm_get_rx(struct acpm_chan
+ 				 * after the response is copied to the request.
+ 				 */
+ 				__ioread32_copy(rx_data->cmd, addr,
+-						xfer->rxlen / 4);
++						rx_data->rxcnt);
+ 			}
+ 		} else {
+ 			clear_bit(seqnum, achan->bitmap_seqnum);
+@@ -379,8 +379,8 @@ static void acpm_prepare_xfer(struct acp
+ 	/* Clear data for upcoming responses */
+ 	rx_data = &achan->rx_data[achan->seqnum - 1];
+ 	memset(rx_data->cmd, 0, sizeof(*rx_data->cmd) * rx_data->n_cmd);
+-	if (xfer->rxd)
+-		rx_data->response = true;
++	/* zero means no response expected */
++	rx_data->rxcnt = xfer->rxlen / 4;
+ 
+ 	/* Flag the index based on seqnum. (seqnum: 1~63, bitmap: 0~62) */
+ 	set_bit(achan->seqnum - 1, achan->bitmap_seqnum);
diff --git a/queue-6.18/hv-utils-handle-and-propagate-errors-in-kvp_register.patch b/queue-6.18/hv-utils-handle-and-propagate-errors-in-kvp_register.patch
new file mode 100644
index 0000000000..86d8f7e7a0
--- /dev/null
+++ b/queue-6.18/hv-utils-handle-and-propagate-errors-in-kvp_register.patch
@@ -0,0 +1,89 @@
+From stable+bounces-264996-greg=kroah.com@vger.kernel.org Tue Jun 16 18:01:04 2026
+From: Sasha Levin
+Date: Tue, 16 Jun 2026 12:55:57 -0400
+Subject: hv: utils: handle and propagate errors in kvp_register
+To: stable@vger.kernel.org
+Cc: Thorsten Blum , Long Li , Wei Liu , Sasha Levin
+Message-ID: <20260616165557.3366587-1-sashal@kernel.org>
+
+From: Thorsten Blum
+
+[ Upstream commit 3fcf923302a8f5c0dc3af3d2ca2657cb5fae4297 ]
+
+Make kvp_register() return an error code instead of silently ignoring
+failures, and propagate the error from kvp_handle_handshake() instead of
+returning success.
+
+This propagates both kzalloc_obj() and hvutil_transport_send() failures
+to kvp_handle_handshake() and thus to kvp_on_msg().
+
+Fixes: 245ba56a52a3 ("Staging: hv: Implement key/value pair (KVP)")
+Cc: stable@vger.kernel.org
+Signed-off-by: Thorsten Blum
+Reviewed-by: Long Li
+Signed-off-by: Wei Liu
+Signed-off-by: Sasha Levin
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/hv/hv_kvp.c | 27 ++++++++++++++-------------
+ 1 file changed, 14 insertions(+), 13 deletions(-)
+
+--- a/drivers/hv/hv_kvp.c
++++ b/drivers/hv/hv_kvp.c
+@@ -93,7 +93,7 @@ static void kvp_send_key(struct work_str
+ static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error);
+ static void kvp_timeout_func(struct work_struct *dummy);
+ static void kvp_host_handshake_func(struct work_struct *dummy);
+-static void kvp_register(int);
++static int kvp_register(int);
+ 
+ static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func);
+ static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func);
+@@ -127,24 +127,26 @@ static void kvp_register_done(void)
+ 	hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper);
+ }
+ 
+-static void
++static int
+ kvp_register(int reg_value)
+ {
+ 
+ 	struct hv_kvp_msg *kvp_msg;
+ 	char *version;
++	int ret;
+ 
+ 	kvp_msg = kzalloc(sizeof(*kvp_msg), GFP_KERNEL);
++	if (!kvp_msg)
++		return -ENOMEM;
+ 
+-	if (kvp_msg) {
+-		version = kvp_msg->body.kvp_register.version;
+-		kvp_msg->kvp_hdr.operation = reg_value;
+-		strcpy(version, HV_DRV_VERSION);
+-
+-		hvutil_transport_send(hvt, kvp_msg, sizeof(*kvp_msg),
+-				      kvp_register_done);
+-		kfree(kvp_msg);
+-	}
++	version = kvp_msg->body.kvp_register.version;
++	kvp_msg->kvp_hdr.operation = reg_value;
++	strcpy(version, HV_DRV_VERSION);
++
++	ret = hvutil_transport_send(hvt, kvp_msg, sizeof(*kvp_msg),
++				    kvp_register_done);
++	kfree(kvp_msg);
++	return ret;
+ }
+ 
+ static void kvp_timeout_func(struct work_struct *dummy)
+@@ -186,9 +188,8 @@ static int kvp_handle_handshake(struct h
+ 	 */
+ 	pr_debug("KVP: userspace daemon ver. %d connected\n",
+ 		 msg->kvp_hdr.operation);
+-	kvp_register(dm_reg_value);
+ 
+-	return 0;
++	return kvp_register(dm_reg_value);
+ }
+ 
+ 
diff --git a/queue-6.18/regulator-core-fix-locking-in-regulator_resolve_supply-error-path.patch b/queue-6.18/regulator-core-fix-locking-in-regulator_resolve_supply-error-path.patch
new file mode 100644
index 0000000000..543d18e524
--- /dev/null
+++ b/queue-6.18/regulator-core-fix-locking-in-regulator_resolve_supply-error-path.patch
@@ -0,0 +1,66 @@
+From 497330b203d2c59c5ff3fa4c34d14494d7203bc3 Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?Andr=C3=A9=20Draszik?=
+Date: Fri, 9 Jan 2026 08:38:38 +0000
+Subject: regulator: core: fix locking in regulator_resolve_supply() error path
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: André Draszik
+
+commit 497330b203d2c59c5ff3fa4c34d14494d7203bc3 upstream.
+
+If late enabling of a supply regulator fails in
+regulator_resolve_supply(), the code currently triggers a lockdep
+warning:
+
+    WARNING: drivers/regulator/core.c:2649 at _regulator_put+0x80/0xa0, CPU#6: kworker/u32:4/596
+    ...
+    Call trace:
+     _regulator_put+0x80/0xa0 (P)
+     regulator_resolve_supply+0x7cc/0xbe0
+     regulator_register_resolve_supply+0x28/0xb8
+
+as the regulator_list_mutex must be held when calling _regulator_put().
+
+To solve this, simply switch to using regulator_put().
+
+While at it, we should also make sure that no concurrent access happens
+to our rdev while we clear out the supply pointer. Add appropriate
+locking to ensure that.
+
+While the code in question will be removed altogether in a follow-up
+commit, I believe it is still beneficial to have this corrected before
+removal for future reference.
+
+Fixes: 36a1f1b6ddc6 ("regulator: core: Fix memory leak in regulator_resolve_supply()")
+Fixes: 8e5356a73604 ("regulator: core: Clear the supply pointer if enabling fails")
+Signed-off-by: André Draszik
+Link: https://patch.msgid.link/20260109-regulators-defer-v2-2-1a25dc968e60@linaro.org
+Signed-off-by: Mark Brown
+Signed-off-by: Nazar Kalashnikov
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/regulator/core.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/drivers/regulator/core.c
++++ b/drivers/regulator/core.c
+@@ -2159,8 +2159,16 @@ static int regulator_resolve_supply(stru
+ 	if (rdev->use_count) {
+ 		ret = regulator_enable(rdev->supply);
+ 		if (ret < 0) {
+-			_regulator_put(rdev->supply);
++			struct regulator *supply;
++
++			regulator_lock_two(rdev, rdev->supply->rdev, &ww_ctx);
++
++			supply = rdev->supply;
+ 			rdev->supply = NULL;
++
++			regulator_unlock_two(rdev, supply->rdev, &ww_ctx);
++
++			regulator_put(supply);
+ 			goto out;
+ 		}
+ 	}
diff --git a/queue-6.18/rose-cancel-neighbour-timers-in-rose_neigh_put-before-freeing.patch b/queue-6.18/rose-cancel-neighbour-timers-in-rose_neigh_put-before-freeing.patch
new file mode 100644
index 0000000000..1562502827
--- /dev/null
+++ b/queue-6.18/rose-cancel-neighbour-timers-in-rose_neigh_put-before-freeing.patch
@@ -0,0 +1,48 @@
+From 9b222cb1d23ff210975e9df5ebab7b011acb6fad Mon Sep 17 00:00:00 2001
+From: Bernard Pidoux
+Date: Sun, 31 May 2026 15:41:45 +0200
+Subject: rose: cancel neighbour timers in rose_neigh_put() before freeing
+
+From: Bernard Pidoux
+
+commit 9b222cb1d23ff210975e9df5ebab7b011acb6fad upstream.
+
+rose_neigh_put() kfree()s the neighbour but never cancels its ftimer and
+t0timer. Until now every caller that dropped the final reference first
+called rose_remove_neigh(), which deletes those timers. The socket
+heartbeat reaping path drops the last reference directly, so a neighbour
+could be freed with t0timer still armed -- it re-arms itself in
+rose_t0timer_expiry() -- leading to a use-after-free write in
+enqueue_timer().
+
+Cancel both timers with timer_delete_sync() (the synchronous variant, to
+wait out a concurrently running, self-rearming handler) in the
+refcount-zero branch of rose_neigh_put().
+
+Signed-off-by: Bernard Pidoux
+Signed-off-by: Greg Kroah-Hartman
+---
+ include/net/rose.h | 12 ++++++++++++
+ 1 file changed, 12 insertions(+)
+
+--- a/include/net/rose.h
++++ b/include/net/rose.h
+@@ -160,6 +160,18 @@ static inline void rose_neigh_hold(struc
+ static inline void rose_neigh_put(struct rose_neigh *rose_neigh)
+ {
+ 	if (refcount_dec_and_test(&rose_neigh->use)) {
++		/* We are dropping the last reference, so we are about to free the
++		 * neighbour. Its timers may still be armed -- t0timer in particular
++		 * re-arms itself in rose_t0timer_expiry(). rose_remove_neigh()
++		 * cancels them before its own put, but callers that drop the final
++		 * reference without first calling rose_remove_neigh() (the socket
++		 * heartbeat reaping path) would otherwise kfree() a neighbour with a
++		 * live timer -> use-after-free. timer_delete_sync() (not the async
++		 * variant) is required: it waits out a concurrently running handler
++		 * and loops until the self-rearming timer stays stopped.
++		 */
++		timer_delete_sync(&rose_neigh->ftimer);
++		timer_delete_sync(&rose_neigh->t0timer);
+ 		if (rose_neigh->ax25)
+ 			ax25_cb_put(rose_neigh->ax25);
+ 		kfree(rose_neigh->digipeat);
diff --git a/queue-6.18/rose-clear-neighbour-pointer-after-rose_neigh_put-in-state-machines.patch b/queue-6.18/rose-clear-neighbour-pointer-after-rose_neigh_put-in-state-machines.patch
new file mode 100644
index 0000000000..b16cd802f3
--- /dev/null
+++ b/queue-6.18/rose-clear-neighbour-pointer-after-rose_neigh_put-in-state-machines.patch
@@ -0,0 +1,73 @@
+From e8eb0c6faa8849ba7769516c1a8c84d9f612acf6 Mon Sep 17 00:00:00 2001
+From: Bernard Pidoux
+Date: Sat, 16 May 2026 12:10:38 +0200
+Subject: rose: clear neighbour pointer after rose_neigh_put() in state machines
+
+From: Bernard Pidoux
+
+commit e8eb0c6faa8849ba7769516c1a8c84d9f612acf6 upstream.
+
+After calling rose_neigh_put() in rose_state1_machine() through
+rose_state5_machine(), rose->neighbour was left pointing at the
+potentially freed neighbour structure. A subsequent timer expiry or
+concurrent teardown path could dereference the stale pointer, causing
+a use-after-free.
+
+Set rose->neighbour to NULL immediately after each rose_neigh_put()
+call in the state machine functions.
+
+Fixes: d860d1faa6b2 ("net: rose: convert 'use' field to refcount_t")
+Signed-off-by: Bernard Pidoux
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/rose/rose_in.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+--- a/net/rose/rose_in.c
++++ b/net/rose/rose_in.c
+@@ -57,6 +57,7 @@ static int rose_state1_machine(struct so
+ 		rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
+ 		rose_disconnect(sk, ECONNREFUSED, skb->data[3], skb->data[4]);
+ 		rose_neigh_put(rose->neighbour);
++		rose->neighbour = NULL;
+ 		break;
+ 
+ 	default:
+@@ -80,11 +81,13 @@ static int rose_state2_machine(struct so
+ 		rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
+ 		rose_disconnect(sk, 0, skb->data[3], skb->data[4]);
+ 		rose_neigh_put(rose->neighbour);
++		rose->neighbour = NULL;
+ 		break;
+ 
+ 	case ROSE_CLEAR_CONFIRMATION:
+ 		rose_disconnect(sk, 0, -1, -1);
+ 		rose_neigh_put(rose->neighbour);
++		rose->neighbour = NULL;
+ 		break;
+ 
+ 	default:
+@@ -122,6 +125,7 @@ static int rose_state3_machine(struct so
+ 		rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
+ 		rose_disconnect(sk, 0, skb->data[3], skb->data[4]);
+ 		rose_neigh_put(rose->neighbour);
++		rose->neighbour = NULL;
+ 		break;
+ 
+ 	case ROSE_RR:
+@@ -235,6 +239,7 @@ static int rose_state4_machine(struct so
+ 		rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
+ 		rose_disconnect(sk, 0, skb->data[3], skb->data[4]);
+ 		rose_neigh_put(rose->neighbour);
++		rose->neighbour = NULL;
+ 		break;
+ 
+ 	default:
+@@ -255,6 +260,7 @@ static int rose_state5_machine(struct so
+ 		rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
+ 		rose_disconnect(sk, 0, skb->data[3], skb->data[4]);
+ 		rose_neigh_put(rose_sk(sk)->neighbour);
++		rose_sk(sk)->neighbour = NULL;
+ 	}
+ 
+ 	return 0;
diff --git a/queue-6.18/rose-clear-neighbour-pointer-in-rose_kill_by_device.patch b/queue-6.18/rose-clear-neighbour-pointer-in-rose_kill_by_device.patch
new file mode 100644
index 0000000000..3ded72464e
--- /dev/null
+++ b/queue-6.18/rose-clear-neighbour-pointer-in-rose_kill_by_device.patch
@@ -0,0 +1,44 @@
+From 606e42d195b467480d4d405f8814c48d1651a76a Mon Sep 17 00:00:00 2001
+From: Bernard Pidoux
+Date: Sun, 31 May 2026 15:41:45 +0200
+Subject: rose: clear neighbour pointer in rose_kill_by_device()
+
+From: Bernard Pidoux
+
+commit 606e42d195b467480d4d405f8814c48d1651a76a upstream.
+
+rose_kill_by_device() drops the neighbour reference but leaves
+rose->neighbour pointing at it, unlike every other rose_neigh_put() site
+(see "rose: clear neighbour pointer after rose_neigh_put() in state
+machines"). The heartbeat STATE_0 reaping path then puts the same
+neighbour a second time, causing a rose_neigh refcount underflow and a
+use-after-free.
+
+Set rose->neighbour = NULL after the put, restoring the invariant.
+
+Signed-off-by: Bernard Pidoux
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/rose/af_rose.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/net/rose/af_rose.c
++++ b/net/rose/af_rose.c
+@@ -216,8 +216,16 @@ start:
+ 			 * looping forever in ROSE_STATE_0 with no owner.
+ 			 */
+ 			sock_set_flag(sk, SOCK_DESTROY);
+-			if (rose->neighbour)
++			if (rose->neighbour) {
+ 				rose_neigh_put(rose->neighbour);
++				/* Clear the pointer after dropping the reference, as
++				 * every other rose_neigh_put() site does. Otherwise
++				 * rose_heartbeat_expiry() (STATE_0 reaping) sees a stale
++				 * rose->neighbour and puts it a second time -> rose_neigh
++				 * refcount underflow / use-after-free.
++				 */
++				rose->neighbour = NULL;
++			}
+ 			netdev_put(rose->device, &rose->dev_tracker);
+ 			rose->device = NULL;
+ 		}
diff --git a/queue-6.18/rose-disconnect-orphaned-state_2-sockets-when-device-is-gone.patch b/queue-6.18/rose-disconnect-orphaned-state_2-sockets-when-device-is-gone.patch
new file mode 100644
index 0000000000..b0339aee9d
--- /dev/null
+++ b/queue-6.18/rose-disconnect-orphaned-state_2-sockets-when-device-is-gone.patch
@@ -0,0 +1,50 @@
+From d4f4cf9f09a3f5fafa8f09110a7c1b5d10f2f261 Mon Sep 17 00:00:00 2001
+From: Bernard Pidoux
+Date: Thu, 28 May 2026 17:38:18 +0200
+Subject: rose: disconnect orphaned STATE_2 sockets when device is gone
+
+From: Bernard Pidoux
+
+commit d4f4cf9f09a3f5fafa8f09110a7c1b5d10f2f261 upstream.
+
+When ax25stop brings down ROSE interfaces, sockets in ROSE_STATE_2
+(awaiting CLEAR CONFIRM) whose device pointer is already NULL are not
+reached by rose_kill_by_device() and wait for T3 (up to 180s) before
+self-cleaning via rose_timer_expiry(). This keeps the rose module
+usecount at 1, blocking rmmod for the full T3 duration.
+
+In rose_heartbeat_expiry(), detect ROSE_STATE_2 sockets with no device,
+cancel T3, release the neighbour reference, and call rose_disconnect()
++ sock_set_flag(SOCK_DESTROY). The next heartbeat tick (<=5s) then
+destroys the socket via the existing ROSE_STATE_0/SOCK_DESTROY path,
+allowing clean module unload within 10s instead of up to 180s.
+
+Signed-off-by: Bernard Pidoux
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/rose/rose_timer.c | 14 ++++++++++++++
+ 1 file changed, 14 insertions(+)
+
+--- a/net/rose/rose_timer.c
++++ b/net/rose/rose_timer.c
+@@ -139,6 +139,20 @@ static void rose_heartbeat_expiry(struct
+ 		}
+ 		break;
+ 
++	case ROSE_STATE_2:
++		/* Device gone before CLEAR CONFIRM arrived: stop waiting for T3
++		 * and disconnect now instead of blocking rmmod for up to 180s. */
++		if (!rose->device) {
++			rose_stop_timer(sk);
++			if (rose->neighbour) {
++				rose_neigh_put(rose->neighbour);
++				rose->neighbour = NULL;
++			}
++			rose_disconnect(sk, ENETDOWN, -1, -1);
++			sock_set_flag(sk, SOCK_DESTROY);
++		}
++		break;
++
+ 	case ROSE_STATE_3:
+ 		/*
+ 		 * Check for the state of the receive buffer.
diff --git a/queue-6.18/rose-don-t-free-fd-owned-sockets-when-reaping-in-the-heartbeat.patch b/queue-6.18/rose-don-t-free-fd-owned-sockets-when-reaping-in-the-heartbeat.patch
new file mode 100644
index 0000000000..96c615e552
--- /dev/null
+++ b/queue-6.18/rose-don-t-free-fd-owned-sockets-when-reaping-in-the-heartbeat.patch
@@ -0,0 +1,100 @@
+From 56576518920edd7b6c3479477d8d490fe2ebdaaa Mon Sep 17 00:00:00 2001
+From: Bernard Pidoux
+Date: Sun, 31 May 2026 15:41:45 +0200
+Subject: rose: don't free fd-owned sockets when reaping in the heartbeat
+
+From: Bernard Pidoux
+
+commit 56576518920edd7b6c3479477d8d490fe2ebdaaa upstream.
+
+The heartbeat reaps orphaned ROSE sockets after their bound device goes
+down. A socket still attached to a struct socket (sk->sk_socket != NULL --
+e.g. an incoming connection an fpad client has accepted and kept open) is
+owned by that userspace fd: rose_release() frees it on close(). Freeing it
+from the heartbeat left the fd dangling, so the eventual close() touched
+freed memory -- slab-use-after-free in rose_release().
+
+Reap only sockets with sk->sk_socket == NULL (unaccepted incoming
+connections and post-close orphans). For an fd-owned socket whose device
+went down, disconnect it and fall through to the switch so close() does
+the teardown. Also release the neighbour reference held by orphaned
+incoming sockets before tearing them down.
+
+Signed-off-by: Bernard Pidoux
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/rose/rose_timer.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++-
+ 1 file changed, 56 insertions(+), 1 deletion(-)
+
+--- a/net/rose/rose_timer.c
++++ b/net/rose/rose_timer.c
+@@ -126,13 +126,68 @@ static void rose_heartbeat_expiry(struct
+ 		sk_reset_timer(sk, &sk->sk_timer, jiffies + HZ/20);
+ 		goto out;
+ 	}
++
++	/* The bound device went down while we still hold a reference on it.
++	 * This catches the narrow race where rose_loopback_timer() created a
++	 * socket in the window after rose_kill_by_device()'s NETDEV_DOWN sweep
++	 * but before rose_insert_socket() -- leaving a STATE_3 socket that no
++	 * other branch reaps. A down device means the link is dead, so tear
++	 * the socket down regardless of state. rose_destroy_socket() releases
++	 * the held netdev reference (rose->device still set).
++	 */
++	if (rose->device && !netif_running(rose->device)) {
++		if (rose->neighbour) {
++			rose_neigh_put(rose->neighbour);
++			rose->neighbour = NULL;
++		}
++		rose_disconnect(sk, ENETDOWN, -1, -1);
++
++		/* Only reap the socket if userspace no longer holds it. A socket
++		 * still attached to a struct socket (sk->sk_socket != NULL -- e.g.
++		 * a connection an fpad client has accepted and kept open) is owned
++		 * by that fd: rose_release() will destroy it on close(). Dropping
++		 * the last reference here leaves the open fd dangling, so the
++		 * eventual close() touches freed memory -> slab-use-after-free in
++		 * rose_release(). Unaccepted incoming sockets and post-close
++		 * orphans have sk->sk_socket == NULL and stay safe to reap here.
++		 */
++		if (!sk->sk_socket) {
++			sock_set_flag(sk, SOCK_DESTROY);
++			bh_unlock_sock(sk);
++			rose_destroy_socket(sk);
++			sock_put(sk);
++			return;
++		}
++
++		/* Owned by userspace: the link is down and the socket is now
++		 * disconnected (rose_disconnect() moved it to STATE_0). Fall
++		 * through to the switch, which re-arms the heartbeat; the close()
++		 * will tear the socket down. */
++	}
++
+ 	switch (rose->state) {
+ 	case ROSE_STATE_0:
+ 		/* Destroy any orphaned STATE_0 socket: either explicitly
+ 		 * flagged SOCK_DESTROY, or SOCK_DEAD (covers both unaccepted
+ 		 * incoming connections and listening sockets whose link died).
+ 		 */
+-		if (sock_flag(sk, SOCK_DESTROY) || sock_flag(sk, SOCK_DEAD)) {
++		if ((sock_flag(sk, SOCK_DESTROY) || sock_flag(sk, SOCK_DEAD)) &&
++		    !sk->sk_socket) {
++			/* Reap only orphaned sockets (sk->sk_socket == NULL). A
++			 * socket still owned by a userspace fd reaches here via the
++			 * STATE_2 device-gone branch, which sets SOCK_DESTROY without
++			 * knowing about the fd; freeing it would race rose_release()
++			 * at close() -> use-after-free. Leave it for close().
++			 *
++			 * Orphaned incoming sockets (rose_rx_call_request) hold a
++			 * neighbour reference; release it before teardown, as the
++			 * STATE_2 and device-down branches do. rose_destroy_socket()
++			 * does not drop it.
++			 */
++			if (rose->neighbour) {
++				rose_neigh_put(rose->neighbour);
++				rose->neighbour = NULL;
++			}
+ 			bh_unlock_sock(sk);
+ 			rose_destroy_socket(sk);
+ 			sock_put(sk);
diff --git a/queue-6.18/rose-drop-call_request-in-loopback-timer-when-device-is-not-running.patch b/queue-6.18/rose-drop-call_request-in-loopback-timer-when-device-is-not-running.patch
new file mode 100644
index 0000000000..4607c6e0ef
--- /dev/null
+++ b/queue-6.18/rose-drop-call_request-in-loopback-timer-when-device-is-not-running.patch
@@ -0,0 +1,55 @@
+From cf5567a2652e44866eae8987dff4c1ea507680df Mon Sep 17 00:00:00 2001
+From: Bernard Pidoux
+Date: Thu, 28 May 2026 20:20:55 +0200
+Subject: rose: drop CALL_REQUEST in loopback timer when device is not running
+
+From: Bernard Pidoux
+
+commit cf5567a2652e44866eae8987dff4c1ea507680df upstream.
+ +When ax25stop brings down rose0 while the loopback timer has pending +CALL_REQUEST frames, rose_loopback_timer() calls rose_dev_get() and +finds the device still registered (unregister_netdevice waits for +refs to drop), then calls rose_rx_call_request() which takes a +netdev_hold() for the new socket. + +But NETDEV_DOWN fires only once: rose_kill_by_device() already ran +before this timer tick, so the new socket is never cleaned up. The +stuck reference prevents unregister_netdevice from completing, and the +orphan socket's timers eventually fire on freed memory (KASAN +slab-use-after-free in __run_timers). + +The kernel clears IFF_UP via dev_close() before sending NETDEV_DOWN, +so checking netif_running() after rose_dev_get() is sufficient: if the +device is no longer running, the CALL_REQUEST is silently dropped and +no socket is created. This closes the race without touching the +module-exit path (which already stops the timer via loopback_stopping). + +Tested: unregister_netdevice completes immediately after ax25stop with +active loopback connections; no ref_tracker warnings, no KASAN. + +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/rose_loopback.c | 10 ++++++++++ + 1 file changed, 10 insertions(+) + +--- a/net/rose/rose_loopback.c ++++ b/net/rose/rose_loopback.c +@@ -118,6 +118,16 @@ static void rose_loopback_timer(struct t + kfree_skb(skb); + continue; + } ++ /* rose_kill_by_device() runs on NETDEV_DOWN (IFF_UP cleared) ++ * before the device is unregistered. If we create a new ++ * socket here after that cleanup, the ref never gets released ++ * because NETDEV_DOWN fires only once. Drop the call instead. 
++ */ ++ if (!netif_running(dev)) { ++ dev_put(dev); ++ kfree_skb(skb); ++ continue; ++ } + + if (rose_rx_call_request(skb, dev, rose_loopback_neigh, lci_o) == 0) + kfree_skb(skb); diff --git a/queue-6.18/rose-fix-dev_put-leak-in-rose_loopback_timer.patch b/queue-6.18/rose-fix-dev_put-leak-in-rose_loopback_timer.patch new file mode 100644 index 0000000000..ef529f72a6 --- /dev/null +++ b/queue-6.18/rose-fix-dev_put-leak-in-rose_loopback_timer.patch @@ -0,0 +1,56 @@ +From ff91adc54db2b62c7cdf063ff761eceb5adf2215 Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Sat, 16 May 2026 12:09:33 +0200 +Subject: rose: fix dev_put() leak in rose_loopback_timer() + +From: Bernard Pidoux + +commit ff91adc54db2b62c7cdf063ff761eceb5adf2215 upstream. + +rose_rx_call_request() always consumes or returns the skb but never +releases the device reference obtained from rose_dev_get(). When +rose_rx_call_request() succeeds (returns non-zero) dev_put() was never +called, leaking one reference per loopback CALL_REQUEST. + +Move dev_put() outside the conditional so it is called unconditionally +after rose_rx_call_request() in all cases. + +Also remove the dead check (!rose_loopback_neigh->dev && +!rose_loopback_neigh->loopback) that immediately precedes it: the +loopback neighbour always has loopback=1 so this condition can never +be true. 
+ +Fixes: 0453c6824595 ("net/rose: fix unbound loop in rose_loopback_timer()") +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/rose_loopback.c | 11 ++--------- + 1 file changed, 2 insertions(+), 9 deletions(-) + +--- a/net/rose/rose_loopback.c ++++ b/net/rose/rose_loopback.c +@@ -96,22 +96,15 @@ static void rose_loopback_timer(struct t + } + + if (frametype == ROSE_CALL_REQUEST) { +- if (!rose_loopback_neigh->dev && +- !rose_loopback_neigh->loopback) { +- kfree_skb(skb); +- continue; +- } +- + dev = rose_dev_get(dest); + if (!dev) { + kfree_skb(skb); + continue; + } + +- if (rose_rx_call_request(skb, dev, rose_loopback_neigh, lci_o) == 0) { +- dev_put(dev); ++ if (rose_rx_call_request(skb, dev, rose_loopback_neigh, lci_o) == 0) + kfree_skb(skb); +- } ++ dev_put(dev); + } else { + kfree_skb(skb); + } diff --git a/queue-6.18/rose-fix-netdev-double-hold-in-rose_make_new.patch b/queue-6.18/rose-fix-netdev-double-hold-in-rose_make_new.patch new file mode 100644 index 0000000000..ad1c82f6b3 --- /dev/null +++ b/queue-6.18/rose-fix-netdev-double-hold-in-rose_make_new.patch @@ -0,0 +1,50 @@ +From b9fb21ceb4f0d043767a1eba60786ec84809033b Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Thu, 28 May 2026 19:11:55 +0200 +Subject: rose: fix netdev double-hold in rose_make_new() + +From: Bernard Pidoux + +commit b9fb21ceb4f0d043767a1eba60786ec84809033b upstream. + +rose_make_new() copies orose->device from the listener socket and calls +netdev_hold(), storing the tracker in rose->dev_tracker. The only +caller, rose_rx_call_request(), then overwrites both make_rose->device +and make_rose->dev_tracker with a fresh netdev_hold() for the actual +incoming-call device. + +This orphans the tracker allocated by rose_make_new(): it remains in +the device's refcount_tracker list but no pointer exists to free it +via netdev_put(). 
The result is one spurious outstanding reference per +accepted CALL_REQUEST, visible at rmmod time as: + + ref_tracker: netdev@X has 2/2 users at + rose_rx_call_request+0xba3/0x1d50 [rose] + rose_loopback_timer+0x3eb/0x670 [rose] + +The second entry is the orphaned tracker from rose_make_new(); the +first is the correctly-managed socket reference from rose_rx_call_request(). + +Fix: initialise rose->device to NULL in rose_make_new() and let +rose_rx_call_request() -- the sole caller -- assign the correct device +and take the sole netdev_hold() as it already does. + +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/af_rose.c | 4 +--- + 1 file changed, 1 insertion(+), 3 deletions(-) + +--- a/net/rose/af_rose.c ++++ b/net/rose/af_rose.c +@@ -631,9 +631,7 @@ static struct sock *rose_make_new(struct + rose->hb = orose->hb; + rose->idle = orose->idle; + rose->defer = orose->defer; +- rose->device = orose->device; +- if (rose->device) +- netdev_hold(rose->device, &rose->dev_tracker, GFP_ATOMIC); ++ rose->device = NULL; /* rose_rx_call_request() sets this */ + rose->qbitincl = orose->qbitincl; + + return sk; diff --git a/queue-6.18/rose-fix-netdev-double-hold-in-rose_rx_call_request.patch b/queue-6.18/rose-fix-netdev-double-hold-in-rose_rx_call_request.patch new file mode 100644 index 0000000000..1577e4193e --- /dev/null +++ b/queue-6.18/rose-fix-netdev-double-hold-in-rose_rx_call_request.patch @@ -0,0 +1,49 @@ +From c675277c3ba0d2310e0825577d58308c39931e14 Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Tue, 26 May 2026 15:57:04 +0200 +Subject: rose: fix netdev double-hold in rose_rx_call_request() + +From: Bernard Pidoux + +commit c675277c3ba0d2310e0825577d58308c39931e14 upstream. + +rose_rx_call_request() used netdev_tracker_alloc() after assigning +make_rose->device, intending to take ownership of the reference passed +by the caller. 
But every caller -- rose_route_frame() and +rose_loopback_timer() -- already calls dev_put() for its own hold after +the function returns, so the socket ended up with a tracker entry +pointing at a reference that had already been released. + +The result was spurious refcount_t warnings ("saturated", "decrement +hit 0") on every incoming CALL_REQUEST, leading to refcount corruption +and eventual silent freeze. + +Replace netdev_tracker_alloc() with netdev_hold() so that +rose_rx_call_request() acquires its own independent reference. Each +caller retains its own hold from rose_dev_get() and releases it via +dev_put() as before; socket cleanup releases the socket's separate hold +via netdev_put(). + +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/af_rose.c | 8 +++++--- + 1 file changed, 5 insertions(+), 3 deletions(-) + +--- a/net/rose/af_rose.c ++++ b/net/rose/af_rose.c +@@ -1078,9 +1078,11 @@ int rose_rx_call_request(struct sk_buff + make_rose->source_digis[n] = facilities.source_digis[n]; + make_rose->neighbour = neigh; + make_rose->device = dev; +- /* Caller got a reference for us. */ +- netdev_tracker_alloc(make_rose->device, &make_rose->dev_tracker, +- GFP_ATOMIC); ++ /* Take an independent reference for this socket; callers keep their ++ * own reference (from rose_dev_get / dev_hold) and will release it ++ * themselves via dev_put(). 
++ */ ++ netdev_hold(make_rose->device, &make_rose->dev_tracker, GFP_ATOMIC); + make_rose->facilities = facilities; + + rose_neigh_hold(make_rose->neighbour); diff --git a/queue-6.18/rose-fix-notifier-unregistered-too-early-in-rose_exit.patch b/queue-6.18/rose-fix-notifier-unregistered-too-early-in-rose_exit.patch new file mode 100644 index 0000000000..dc857075a5 --- /dev/null +++ b/queue-6.18/rose-fix-notifier-unregistered-too-early-in-rose_exit.patch @@ -0,0 +1,67 @@ +From f71a8a1edc14dba746edde38adddd654ba202b4d Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Tue, 26 May 2026 15:57:47 +0200 +Subject: rose: fix notifier unregistered too early in rose_exit() + +From: Bernard Pidoux + +commit f71a8a1edc14dba746edde38adddd654ba202b4d upstream. + +rose_exit() called unregister_netdevice_notifier() before the loop that +calls unregister_netdev() on each ROSE virtual device. As a result, +the NETDEV_DOWN event fired by unregister_netdev() was never delivered +to rose_device_event(), so rose_kill_by_device() never ran. + +Every socket whose rose->device pointed at a ROSE device therefore kept +its netdev_tracker entry live until free_netdev() destroyed the +ref_tracker_dir, at which point the kernel reported all of them as +leaked references (165 entries in a typical FPAC setup). Worse, those +sockets retained stale device pointers and live timers that could fire +into freed module text after module unload, causing a silent system +freeze with no kernel panic logged. + +Fix by moving unregister_netdevice_notifier() to after the device- +unregistration loop. unregister_netdev() then delivers NETDEV_DOWN +while the notifier is still registered, rose_kill_by_device() runs for +each device, releases all netdev references held by open sockets, and +calls rose_disconnect() which stops the per-socket timers. 
+ +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/af_rose.c | 13 +++++++++++-- + 1 file changed, 11 insertions(+), 2 deletions(-) + +--- a/net/rose/af_rose.c ++++ b/net/rose/af_rose.c +@@ -1669,19 +1669,28 @@ static void __exit rose_exit(void) + #ifdef CONFIG_SYSCTL + rose_unregister_sysctl(); + #endif +- unregister_netdevice_notifier(&rose_dev_notifier); +- + sock_unregister(PF_ROSE); + + for (i = 0; i < rose_ndevs; i++) { + struct net_device *dev = dev_rose[i]; + + if (dev) { ++ /* unregister_netdev() fires NETDEV_DOWN, which -- while the ++ * notifier is still registered below -- invokes ++ * rose_kill_by_device(dev). That releases every socket's ++ * netdev reference and disconnects all active circuits. ++ * Unregistering the notifier before this loop was the ++ * original bug: NETDEV_DOWN was never delivered, leaving ++ * 165 netdev_tracker entries leaked and stale timers live. ++ */ + unregister_netdev(dev); + free_netdev(dev); + } + } + ++ /* Now safe to remove the notifier -- all ROSE devices are gone. */ ++ unregister_netdevice_notifier(&rose_dev_notifier); ++ + kfree(dev_rose); + proto_unregister(&rose_proto); + } diff --git a/queue-6.18/rose-fix-race-between-loopback-timer-and-module-removal.patch b/queue-6.18/rose-fix-race-between-loopback-timer-and-module-removal.patch new file mode 100644 index 0000000000..8c2069c811 --- /dev/null +++ b/queue-6.18/rose-fix-race-between-loopback-timer-and-module-removal.patch @@ -0,0 +1,108 @@ +From 47dd6ec1a77d77895afb00aa2e68373a48289108 Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Sat, 16 May 2026 12:10:20 +0200 +Subject: rose: fix race between loopback timer and module removal + +From: Bernard Pidoux + +commit 47dd6ec1a77d77895afb00aa2e68373a48289108 upstream. + +rose_loopback_clear() called timer_delete() which returns immediately +without waiting for any running callback to complete. 
If the timer +fired concurrently with module removal, rose_loopback_timer() could +re-arm the timer after timer_delete() returned and then access +rose_loopback_neigh after it was freed. + +Two complementary changes close the race: + +1. Add a loopback_stopping atomic flag. rose_loopback_timer() checks + it at entry (before acquiring a reference) and again inside the + loop; when set it drains the queue and exits without re-arming the + timer. + +2. Switch rose_loopback_clear() to timer_delete_sync() so it blocks + until any in-flight callback has returned before freeing resources. + +The smp_mb() between setting the flag and calling timer_delete_sync() +ensures the flag is visible to any callback that is about to run. + +Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/rose_loopback.c | 31 ++++++++++++++++++++++++------- + 1 file changed, 24 insertions(+), 7 deletions(-) + +--- a/net/rose/rose_loopback.c ++++ b/net/rose/rose_loopback.c +@@ -12,13 +12,15 @@ + #include + #include + +-static struct sk_buff_head loopback_queue; + #define ROSE_LOOPBACK_LIMIT 1000 +-static struct timer_list loopback_timer; + ++static struct timer_list loopback_timer; ++static struct sk_buff_head loopback_queue; + static void rose_set_loopback_timer(void); + static void rose_loopback_timer(struct timer_list *unused); + ++static atomic_t loopback_stopping = ATOMIC_INIT(0); ++ + void rose_loopback_init(void) + { + skb_queue_head_init(&loopback_queue); +@@ -66,6 +68,9 @@ static void rose_loopback_timer(struct t + unsigned int lci_i, lci_o; + int count; + ++ if (atomic_read(&loopback_stopping)) ++ return; ++ + if (rose_loopback_neigh) + rose_neigh_hold(rose_loopback_neigh); + else +@@ -75,6 +80,13 @@ static void rose_loopback_timer(struct t + skb = skb_dequeue(&loopback_queue); + if (!skb) + goto out; ++ ++ if (atomic_read(&loopback_stopping)) { ++ kfree_skb(skb); ++ skb_queue_purge(&loopback_queue); ++ goto out; ++ } 
++ + if (skb->len < ROSE_MIN_LEN) { + kfree_skb(skb); + continue; +@@ -118,7 +130,7 @@ static void rose_loopback_timer(struct t + out: + rose_neigh_put(rose_loopback_neigh); + +- if (!skb_queue_empty(&loopback_queue)) ++ if (!atomic_read(&loopback_stopping) && !skb_queue_empty(&loopback_queue)) + mod_timer(&loopback_timer, jiffies + 1); + } + +@@ -126,10 +138,15 @@ void __exit rose_loopback_clear(void) + { + struct sk_buff *skb; + +- timer_delete(&loopback_timer); ++ atomic_set(&loopback_stopping, 1); ++ /* Pairs with atomic_read() in rose_loopback_timer(): ensure the ++ * stopping flag is visible before we cancel, so a concurrent ++ * callback aborts its loop early rather than re-arming the timer. ++ */ ++ smp_mb(); ++ ++ timer_delete_sync(&loopback_timer); + +- while ((skb = skb_dequeue(&loopback_queue)) != NULL) { +- skb->sk = NULL; ++ while ((skb = skb_dequeue(&loopback_queue)) != NULL) + kfree_skb(skb); +- } + } diff --git a/queue-6.18/rose-guard-rose_neigh_put-against-null-in-timer-expiry.patch b/queue-6.18/rose-guard-rose_neigh_put-against-null-in-timer-expiry.patch new file mode 100644 index 0000000000..8506dca395 --- /dev/null +++ b/queue-6.18/rose-guard-rose_neigh_put-against-null-in-timer-expiry.patch @@ -0,0 +1,41 @@ +From 2b67342c6ff899a0b83359517146a5b7b243af97 Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Sat, 16 May 2026 12:10:55 +0200 +Subject: rose: guard rose_neigh_put() against NULL in timer expiry + +From: Bernard Pidoux + +commit 2b67342c6ff899a0b83359517146a5b7b243af97 upstream. + +In rose_timer_expiry(), the ROSE_STATE_2 branch calls +rose_neigh_put(rose->neighbour) without first checking whether the +pointer is NULL. 
After commit 5de7665e0a07 ("net: rose: fix timer +races against user threads") the timer is re-armed when the socket is +owned by a user thread; between the re-arm and the next firing, a +device-down event or concurrent teardown via rose_kill_by_device() can +set rose->neighbour to NULL, leading to a NULL-pointer dereference +inside rose_neigh_put(). + +Add a NULL check before the put and clear the pointer afterwards. + +Fixes: 5de7665e0a07 ("net: rose: fix timer races against user threads") +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/rose_timer.c | 5 ++++- + 1 file changed, 4 insertions(+), 1 deletion(-) + +--- a/net/rose/rose_timer.c ++++ b/net/rose/rose_timer.c +@@ -180,7 +180,10 @@ static void rose_timer_expiry(struct tim + break; + + case ROSE_STATE_2: /* T3 */ +- rose_neigh_put(rose->neighbour); ++ if (rose->neighbour) { ++ rose_neigh_put(rose->neighbour); ++ rose->neighbour = NULL; ++ } + rose_disconnect(sk, ETIMEDOUT, -1, -1); + break; + diff --git a/queue-6.18/rose-hold-loopback-neighbour-reference-across-timer-callback.patch b/queue-6.18/rose-hold-loopback-neighbour-reference-across-timer-callback.patch new file mode 100644 index 0000000000..02173f5e7a --- /dev/null +++ b/queue-6.18/rose-hold-loopback-neighbour-reference-across-timer-callback.patch @@ -0,0 +1,56 @@ +From d270a7a5793af84555c40dd1eb80f1d497fdf53c Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Sat, 16 May 2026 12:10:03 +0200 +Subject: rose: hold loopback neighbour reference across timer callback + +From: Bernard Pidoux + +commit d270a7a5793af84555c40dd1eb80f1d497fdf53c upstream. + +rose_loopback_timer() dereferences rose_loopback_neigh throughout its +body but holds no reference on it. A concurrent rose_loopback_clear() +followed by rose_add_loopback_neigh() could free and reallocate the +neighbour while the timer body is running, causing a use-after-free. 
+ +Take a reference with rose_neigh_hold() at the start of the callback +(bailing out if the pointer is already NULL) and release it with +rose_neigh_put() at the single exit point. The neigh cannot be freed +while the callback holds a reference. + +Fixes: d860d1faa6b2 ("net: rose: convert 'use' field to refcount_t") +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/rose_loopback.c | 11 ++++++++++- + 1 file changed, 10 insertions(+), 1 deletion(-) + +--- a/net/rose/rose_loopback.c ++++ b/net/rose/rose_loopback.c +@@ -66,10 +66,15 @@ static void rose_loopback_timer(struct t + unsigned int lci_i, lci_o; + int count; + ++ if (rose_loopback_neigh) ++ rose_neigh_hold(rose_loopback_neigh); ++ else ++ return; ++ + for (count = 0; count < ROSE_LOOPBACK_LIMIT; count++) { + skb = skb_dequeue(&loopback_queue); + if (!skb) +- return; ++ goto out; + if (skb->len < ROSE_MIN_LEN) { + kfree_skb(skb); + continue; +@@ -109,6 +114,10 @@ static void rose_loopback_timer(struct t + kfree_skb(skb); + } + } ++ ++out: ++ rose_neigh_put(rose_loopback_neigh); ++ + if (!skb_queue_empty(&loopback_queue)) + mod_timer(&loopback_timer, jiffies + 1); + } diff --git a/queue-6.18/rose-release-netdev-ref-and-destroy-orphaned-incoming-sockets.patch b/queue-6.18/rose-release-netdev-ref-and-destroy-orphaned-incoming-sockets.patch new file mode 100644 index 0000000000..c3984e04dd --- /dev/null +++ b/queue-6.18/rose-release-netdev-ref-and-destroy-orphaned-incoming-sockets.patch @@ -0,0 +1,81 @@ +From df12be096302d2c947388acc25764456c7f18cc1 Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Thu, 28 May 2026 19:38:31 +0200 +Subject: rose: release netdev ref and destroy orphaned incoming sockets + +From: Bernard Pidoux + +commit df12be096302d2c947388acc25764456c7f18cc1 upstream. + +Two related cleanup gaps left the module unremovable after a loopback +session: + +1. rose_destroy_socket() did not release the device reference. 
When + an unaccepted incoming socket (created by rose_rx_call_request()) is + destroyed via rose_heartbeat_expiry(), it is removed from rose_list + before rose_kill_by_device() can find it, so the netdev_hold() taken + in rose_rx_call_request() was never matched by netdev_put(). Add the + release at the top of rose_destroy_socket() guarded by a NULL check + so that rose_release() and rose_kill_by_device(), which already call + netdev_put() and set device = NULL, are not affected. + +2. rose_heartbeat_expiry() STATE_0 cleanup required TCP_LISTEN in + addition to SOCK_DEAD. Unaccepted incoming sockets are + TCP_ESTABLISHED, so the condition was never true and those sockets + lingered forever, holding the module use count above zero and + blocking rmmod. Drop the TCP_LISTEN restriction: any STATE_0 + + SOCK_DEAD socket is orphaned and should be destroyed. + +Together with the earlier rose_make_new() double-hold fix these three +patches allow clean rmmod after loopback sessions. + +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/af_rose.c | 9 +++++++++ + net/rose/rose_timer.c | 9 +++++---- + 2 files changed, 14 insertions(+), 4 deletions(-) + +--- a/net/rose/af_rose.c ++++ b/net/rose/af_rose.c +@@ -363,6 +363,7 @@ static void rose_destroy_timer(struct ti + */ + void rose_destroy_socket(struct sock *sk) + { ++ struct rose_sock *rose = rose_sk(sk); + struct sk_buff *skb; + + rose_remove_socket(sk); +@@ -370,6 +371,14 @@ void rose_destroy_socket(struct sock *sk + rose_stop_idletimer(sk); + rose_stop_timer(sk); + ++ /* Drop any device reference not already released by rose_kill_by_device() ++ * or rose_release() -- e.g. incoming sockets that were never accepted. 
++ */ ++ if (rose->device) { ++ netdev_put(rose->device, &rose->dev_tracker); ++ rose->device = NULL; ++ } ++ + rose_clear_queues(sk); /* Flush the queues */ + + while ((skb = skb_dequeue(&sk->sk_receive_queue)) != NULL) { +--- a/net/rose/rose_timer.c ++++ b/net/rose/rose_timer.c +@@ -128,10 +128,11 @@ static void rose_heartbeat_expiry(struct + } + switch (rose->state) { + case ROSE_STATE_0: +- /* Magic here: If we listen() and a new link dies before it +- is accepted() it isn't 'dead' so doesn't get removed. */ +- if (sock_flag(sk, SOCK_DESTROY) || +- (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_DEAD))) { ++ /* Destroy any orphaned STATE_0 socket: either explicitly ++ * flagged SOCK_DESTROY, or SOCK_DEAD (covers both unaccepted ++ * incoming connections and listening sockets whose link died). ++ */ ++ if (sock_flag(sk, SOCK_DESTROY) || sock_flag(sk, SOCK_DEAD)) { + bh_unlock_sock(sk); + rose_destroy_socket(sk); + sock_put(sk); diff --git a/queue-6.18/rose-set-sock_destroy-in-rose_kill_by_device-for-prompt-cleanup.patch b/queue-6.18/rose-set-sock_destroy-in-rose_kill_by_device-for-prompt-cleanup.patch new file mode 100644 index 0000000000..e9321b6dd8 --- /dev/null +++ b/queue-6.18/rose-set-sock_destroy-in-rose_kill_by_device-for-prompt-cleanup.patch @@ -0,0 +1,43 @@ +From 741a4863ad570889c75f7a8e404567d8f3e46335 Mon Sep 17 00:00:00 2001 +From: Bernard Pidoux +Date: Wed, 27 May 2026 14:11:21 +0200 +Subject: rose: set SOCK_DESTROY in rose_kill_by_device() for prompt cleanup + +From: Bernard Pidoux + +commit 741a4863ad570889c75f7a8e404567d8f3e46335 upstream. + +When rose_kill_by_device() is called (via NETDEV_DOWN on module exit +or interface removal), it calls rose_disconnect() which transitions +sockets to ROSE_STATE_0 and sets SOCK_DEAD. However, +rose_heartbeat_expiry() only calls rose_destroy_socket() at +ROSE_STATE_0 if SOCK_DESTROY is set -- the SOCK_DEAD path is reserved +for TCP_LISTEN sockets. 
Without SOCK_DESTROY, orphaned sockets in +ROSE_STATE_2 (clearing) loop indefinitely in the heartbeat without +ever being freed, keeping the module use-count elevated and blocking +modprobe -r rose until the T1 timer (up to 200 s) expires. + +Set SOCK_DESTROY immediately after rose_disconnect() so the heartbeat +destroys the socket at its next tick (within 5 s), allowing clean +module unload. + +Signed-off-by: Bernard Pidoux +Signed-off-by: Greg Kroah-Hartman +--- + net/rose/af_rose.c | 5 +++++ + 1 file changed, 5 insertions(+) + +--- a/net/rose/af_rose.c ++++ b/net/rose/af_rose.c +@@ -211,6 +211,11 @@ start: + spin_lock_bh(&rose_list_lock); + if (rose->device == dev) { + rose_disconnect(sk, ENETUNREACH, ROSE_OUT_OF_ORDER, 0); ++ /* Mark for destruction so rose_heartbeat_expiry() ++ * cleans up the socket at its next tick rather than ++ * looping forever in ROSE_STATE_0 with no owner. ++ */ ++ sock_set_flag(sk, SOCK_DESTROY); + if (rose->neighbour) + rose_neigh_put(rose->neighbour); + netdev_put(rose->device, &rose->dev_tracker); diff --git a/queue-6.18/sctp-disable-bh-before-calling-udp_tunnel_xmit_skb.patch b/queue-6.18/sctp-disable-bh-before-calling-udp_tunnel_xmit_skb.patch new file mode 100644 index 0000000000..362ebcb3d4 --- /dev/null +++ b/queue-6.18/sctp-disable-bh-before-calling-udp_tunnel_xmit_skb.patch @@ -0,0 +1,90 @@ +From stable+bounces-268290-greg=kroah.com@vger.kernel.org Thu Jun 25 08:45:49 2026 +From: Alexander Martyniuk +Date: Thu, 25 Jun 2026 10:43:46 +0300 +Subject: sctp: disable BH before calling udp_tunnel_xmit_skb() +To: stable@vger.kernel.org, Greg Kroah-Hartman +Cc: marcelo.leitner@gmail.com, lucien.xin@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, horms@kernel.org, bestswngs@gmail.com, linux-sctp@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Alexander Martyniuk +Message-ID: <20260625074348.90149-1-alexevgmart@gmail.com> + +From: Xin Long + +commit 
2cd7e6971fc2787408ceef17906ea152791448cf upstream. + +udp_tunnel_xmit_skb() / udp_tunnel6_xmit_skb() are expected to run with +BH disabled. After commit 6f1a9140ecda ("add xmit recursion limit to +tunnel xmit functions"), on the path: + + udp(6)_tunnel_xmit_skb() -> ip(6)tunnel_xmit() + +dev_xmit_recursion_inc()/dec() must stay balanced on the same CPU. + +Without local_bh_disable(), the context may move between CPUs, which can +break the inc/dec pairing. This may lead to incorrect recursion level +detection and cause packets to be dropped in ip(6)_tunnel_xmit() or +__dev_queue_xmit(). + +Fix it by disabling BH around both IPv4 and IPv6 SCTP UDP xmit paths. + +In my testing, after enabling the SCTP over UDP: + + # ip net exec ha sysctl -w net.sctp.udp_port=9899 + # ip net exec ha sysctl -w net.sctp.encap_port=9899 + # ip net exec hb sysctl -w net.sctp.udp_port=9899 + # ip net exec hb sysctl -w net.sctp.encap_port=9899 + + # ip net exec ha iperf3 -s + +- without this patch: + + # ip net exec hb iperf3 -c 192.168.0.1 --sctp + [ 5] 0.00-10.00 sec 37.2 MBytes 31.2 Mbits/sec sender + [ 5] 0.00-10.00 sec 37.1 MBytes 31.1 Mbits/sec receiver + +- with this patch: + + # ip net exec hb iperf3 -c 192.168.0.1 --sctp + [ 5] 0.00-10.00 sec 3.14 GBytes 2.69 Gbits/sec sender + [ 5] 0.00-10.00 sec 3.14 GBytes 2.69 Gbits/sec receiver + +Fixes: 6f1a9140ecda ("net: add xmit recursion limit to tunnel xmit functions") +Fixes: 046c052b475e ("sctp: enable udp tunneling socks") +Signed-off-by: Xin Long +Acked-by: Marcelo Ricardo Leitner +Link: https://patch.msgid.link/c874a8548221dcd56ff03c65ba75a74e6cf99119.1776017727.git.lucien.xin@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Alexander Martyniuk +Signed-off-by: Greg Kroah-Hartman +--- + net/sctp/ipv6.c | 2 ++ + net/sctp/protocol.c | 2 ++ + 2 files changed, 4 insertions(+) + +--- a/net/sctp/ipv6.c ++++ b/net/sctp/ipv6.c +@@ -261,9 +261,11 @@ static int sctp_v6_xmit(struct sk_buff * + skb_set_inner_ipproto(skb, IPPROTO_SCTP); + 
label = ip6_make_flowlabel(sock_net(sk), skb, fl6->flowlabel, true, fl6); + ++ local_bh_disable(); + udp_tunnel6_xmit_skb(dst, sk, skb, NULL, &fl6->saddr, &fl6->daddr, + tclass, ip6_dst_hoplimit(dst), label, + sctp_sk(sk)->udp_port, t->encap_port, false, 0); ++ local_bh_enable(); + return 0; + } + +--- a/net/sctp/protocol.c ++++ b/net/sctp/protocol.c +@@ -1102,10 +1102,12 @@ static inline int sctp_v4_xmit(struct sk + skb_reset_inner_mac_header(skb); + skb_reset_inner_transport_header(skb); + skb_set_inner_ipproto(skb, IPPROTO_SCTP); ++ local_bh_disable(); + udp_tunnel_xmit_skb(dst_rtable(dst), sk, skb, fl4->saddr, + fl4->daddr, dscp, ip4_dst_hoplimit(dst), df, + sctp_sk(sk)->udp_port, t->encap_port, false, false, + 0); ++ local_bh_enable(); + return 0; + } + diff --git a/queue-6.18/series b/queue-6.18/series index f123481cea..53d62583b1 100644 --- a/queue-6.18/series +++ b/queue-6.18/series @@ -11,3 +11,23 @@ i2c-stub-reject-i2c-block-transfers-with-invalid-length.patch net-qualcomm-rmnet-fix-endpoint-use-after-free-in-rmnet_dellink.patch agp-amd64-fix-broken-error-propagation-in-agp_amd64_probe.patch acpi-scan-use-async-schedule-function-in-acpi_scan_c.patch +rose-fix-dev_put-leak-in-rose_loopback_timer.patch +rose-hold-loopback-neighbour-reference-across-timer-callback.patch +rose-fix-race-between-loopback-timer-and-module-removal.patch +rose-clear-neighbour-pointer-after-rose_neigh_put-in-state-machines.patch +rose-guard-rose_neigh_put-against-null-in-timer-expiry.patch +rose-fix-netdev-double-hold-in-rose_rx_call_request.patch +rose-fix-notifier-unregistered-too-early-in-rose_exit.patch +rose-set-sock_destroy-in-rose_kill_by_device-for-prompt-cleanup.patch +rose-disconnect-orphaned-state_2-sockets-when-device-is-gone.patch +rose-fix-netdev-double-hold-in-rose_make_new.patch +rose-release-netdev-ref-and-destroy-orphaned-incoming-sockets.patch +rose-drop-call_request-in-loopback-timer-when-device-is-not-running.patch 
+rose-cancel-neighbour-timers-in-rose_neigh_put-before-freeing.patch
+rose-clear-neighbour-pointer-in-rose_kill_by_device.patch
+rose-don-t-free-fd-owned-sockets-when-reaping-in-the-heartbeat.patch
+regulator-core-fix-locking-in-regulator_resolve_supply-error-path.patch
+hv-utils-handle-and-propagate-errors-in-kvp_register.patch
+drivers-hv-vmbus-improve-the-logic-of-reserving-fb_mmio-on-gen2-vms.patch
+firmware-samsung-acpm-fix-cross-thread-rx-length-corruption.patch
+sctp-disable-bh-before-calling-udp_tunnel_xmit_skb.patch