From: Greg Kroah-Hartman
Date: Mon, 24 Feb 2025 13:19:31 +0000 (+0100)
Subject: 6.6-stable patches
X-Git-Tag: v6.6.80~15
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=70ad828c148d4e064676c101d98a2723eb7a5be5;p=thirdparty%2Fkernel%2Fstable-queue.git

6.6-stable patches

added patches:
	net-mlx5e-don-t-call-cleanup-on-profile-rollback-failure.patch
---

diff --git a/queue-6.6/net-mlx5e-don-t-call-cleanup-on-profile-rollback-failure.patch b/queue-6.6/net-mlx5e-don-t-call-cleanup-on-profile-rollback-failure.patch
new file mode 100644
index 0000000000..2d5185249d
--- /dev/null
+++ b/queue-6.6/net-mlx5e-don-t-call-cleanup-on-profile-rollback-failure.patch
@@ -0,0 +1,83 @@
+From 4dbc1d1a9f39c3711ad2a40addca04d07d9ab5d0 Mon Sep 17 00:00:00 2001
+From: Cosmin Ratiu
+Date: Tue, 15 Oct 2024 12:32:08 +0300
+Subject: net/mlx5e: Don't call cleanup on profile rollback failure
+
+From: Cosmin Ratiu
+
+commit 4dbc1d1a9f39c3711ad2a40addca04d07d9ab5d0 upstream.
+
+When profile rollback fails in mlx5e_netdev_change_profile, the netdev
+profile var is left set to NULL. Avoid a crash when unloading the driver
+by not calling profile->cleanup in such a case.
+
+This was encountered while testing, with the original trigger that
+the wq rescuer thread creation got interrupted (presumably due to
+Ctrl+C-ing modprobe), which gets converted to ENOMEM (-12) by
+mlx5e_priv_init, the profile rollback also fails for the same reason
+(signal still active) so the profile is left as NULL, leading to a crash
+later in _mlx5e_remove.
+
+ [ 732.473932] mlx5_core 0000:08:00.1: E-Switch: Unload vfs: mode(OFFLOADS), nvfs(2), necvfs(0), active vports(2)
+ [ 734.525513] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
+ [ 734.557372] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12
+ [ 734.559187] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: new profile init failed, -12
+ [ 734.560153] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
+ [ 734.589378] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12
+ [ 734.591136] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12
+ [ 745.537492] BUG: kernel NULL pointer dereference, address: 0000000000000008
+ [ 745.538222] #PF: supervisor read access in kernel mode
+
+ [ 745.551290] Call Trace:
+ [ 745.551590]
+ [ 745.551866] ? __die+0x20/0x60
+ [ 745.552218] ? page_fault_oops+0x150/0x400
+ [ 745.555307] ? exc_page_fault+0x79/0x240
+ [ 745.555729] ? asm_exc_page_fault+0x22/0x30
+ [ 745.556166] ? mlx5e_remove+0x6b/0xb0 [mlx5_core]
+ [ 745.556698] auxiliary_bus_remove+0x18/0x30
+ [ 745.557134] device_release_driver_internal+0x1df/0x240
+ [ 745.557654] bus_remove_device+0xd7/0x140
+ [ 745.558075] device_del+0x15b/0x3c0
+ [ 745.558456] mlx5_rescan_drivers_locked.part.0+0xb1/0x2f0 [mlx5_core]
+ [ 745.559112] mlx5_unregister_device+0x34/0x50 [mlx5_core]
+ [ 745.559686] mlx5_uninit_one+0x46/0xf0 [mlx5_core]
+ [ 745.560203] remove_one+0x4e/0xd0 [mlx5_core]
+ [ 745.560694] pci_device_remove+0x39/0xa0
+ [ 745.561112] device_release_driver_internal+0x1df/0x240
+ [ 745.561631] driver_detach+0x47/0x90
+ [ 745.562022] bus_remove_driver+0x84/0x100
+ [ 745.562444] pci_unregister_driver+0x3b/0x90
+ [ 745.562890] mlx5_cleanup+0xc/0x1b [mlx5_core]
+ [ 745.563415] __x64_sys_delete_module+0x14d/0x2f0
+ [ 745.563886] ? kmem_cache_free+0x1b0/0x460
+ [ 745.564313] ? lockdep_hardirqs_on_prepare+0xe2/0x190
+ [ 745.564825] do_syscall_64+0x6d/0x140
+ [ 745.565223] entry_SYSCALL_64_after_hwframe+0x4b/0x53
+ [ 745.565725] RIP: 0033:0x7f1579b1288b
+
+Fixes: 3ef14e463f6e ("net/mlx5e: Separate between netdev objects and mlx5e profiles initialization")
+Signed-off-by: Cosmin Ratiu
+Reviewed-by: Dragos Tatulea
+Signed-off-by: Tariq Toukan
+Signed-off-by: Paolo Abeni
+Signed-off-by: Jianqi Ren
+Signed-off-by: He Zhe
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+@@ -6110,7 +6110,9 @@ static void mlx5e_remove(struct auxiliar
+ 	mlx5e_dcbnl_delete_app(priv);
+ 	unregister_netdev(priv->netdev);
+ 	mlx5e_suspend(adev, state);
+-	priv->profile->cleanup(priv);
++	/* Avoid cleanup if profile rollback failed. */
++	if (priv->profile)
++		priv->profile->cleanup(priv);
+ 	mlx5e_destroy_netdev(priv);
+ 	mlx5e_devlink_port_unregister(mlx5e_dev);
+ 	mlx5e_destroy_devlink(mlx5e_dev);
diff --git a/queue-6.6/series b/queue-6.6/series
index 328eb16a3e..6df724f859 100644
--- a/queue-6.6/series
+++ b/queue-6.6/series
@@ -129,3 +129,4 @@ smb-client-add-check-for-next_buffer-in-receive_encrypted_standard.patch
 edac-qcom-correct-interrupt-enable-register-configuration.patch
 ftrace-correct-preemption-accounting-for-function-tracing.patch
 ftrace-do-not-add-duplicate-entries-in-subops-manager-ops.patch
+net-mlx5e-don-t-call-cleanup-on-profile-rollback-failure.patch
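
Illustration only, not part of the queued patch: a minimal user-space sketch of the failure mode the commit message describes, where both the new profile init and the rollback fail, the profile pointer stays NULL, and the remove path must skip the cleanup callback. All names here (demo_priv, demo_profile, demo_change_profile, demo_remove, demo_failed_rollback) are hypothetical stand-ins and do not reproduce the real mlx5e structures or control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real mlx5e types are far larger. */
struct demo_priv;

struct demo_profile {
	void (*cleanup)(struct demo_priv *priv);
};

struct demo_priv {
	const struct demo_profile *profile; /* NULL after a failed rollback */
	int cleanup_calls;
};

static void demo_cleanup(struct demo_priv *priv)
{
	priv->cleanup_calls++;
}

static const struct demo_profile demo_profile = { .cleanup = demo_cleanup };

/*
 * Assumed model of a profile change: the old profile is detached first,
 * then the new one is initialised. If that fails, a rollback to the
 * original profile is attempted. If the rollback fails too, the profile
 * pointer is left NULL, as the commit message describes.
 */
int demo_change_profile(struct demo_priv *priv, int new_init_err,
			int rollback_err)
{
	const struct demo_profile *orig = priv->profile;

	priv->profile = NULL;
	if (!new_init_err) {
		priv->profile = &demo_profile;
		return 0;
	}
	if (!rollback_err)
		priv->profile = orig;
	return new_init_err;
}

/* Remove path with the guard from the patch: skip cleanup without a profile. */
void demo_remove(struct demo_priv *priv)
{
	if (priv->profile)
		priv->profile->cleanup(priv);
}

/* Runs the double-failure scenario and returns how often cleanup ran. */
int demo_failed_rollback(void)
{
	struct demo_priv p = { .profile = &demo_profile, .cleanup_calls = 0 };
	int err = demo_change_profile(&p, -ENOMEM, -ENOMEM);

	assert(err == -ENOMEM && p.profile == NULL);
	demo_remove(&p); /* would dereference NULL without the guard */
	return p.cleanup_calls;
}
```

In this model, demo_failed_rollback() returns 0 because cleanup is skipped; when the rollback succeeds instead, the original profile is restored and demo_remove() calls cleanup once.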