RDMA/hns: Fix memory leak of bonding resources
author Junxian Huang <huangjunxian6@hisilicon.com>
Sat, 13 Jun 2026 10:20:45 +0000 (18:20 +0800)
committer Jason Gunthorpe <jgg@nvidia.com>
Tue, 16 Jun 2026 18:04:55 +0000 (15:04 -0300)
In a corner case where driver removal races with driver reset, the
bonding resources are first released in hns_roce_hw_v2_exit() during
driver removal and then allocated again in hns_roce_register_device()
during driver reset. This leaks memory because the release point has
already passed. It may also lead to a kernel panic, as shown below,
because of the leaked notifier callback:

Call trace:
  0xffffa20fccc04978 (P)
  raw_notifier_call_chain+0x20/0x38
  call_netdevice_notifiers_info+0x60/0xb8
  netdev_lower_state_changed+0x4c/0xb8

As Sashiko suggested, invert the teardown order so that the bonding
resources are released after hnae3_unregister_client(), which makes
sure they are freed when the driver is removed.

Fixes: b37ad2e290fc ("RDMA/hns: Initialize bonding resources")
Link: https://patch.msgid.link/r/20260613102045.811623-1-huangjunxian6@hisilicon.com
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
drivers/infiniband/hw/hns/hns_roce_hw_v2.c

index 332a4816f2ca423d948a300ee5bba3c4709ff869..2b3a1cafd1b2170d99928d8d443acab4e433c7d3 100644
@@ -7654,8 +7654,8 @@ static int __init hns_roce_hw_v2_init(void)
 
 static void __exit hns_roce_hw_v2_exit(void)
 {
-       hns_roce_dealloc_bond_grp();
        hnae3_unregister_client(&hns_roce_hw_v2_client);
+       hns_roce_dealloc_bond_grp();
        hns_roce_cleanup_debugfs();
 }