]> git.ipfire.org Git - thirdparty/kernel/stable.git/commit
RDMA/hns: Fix missing flush CQE for DWQE
authorChengchang Tang <tangchengchang@huawei.com>
Fri, 20 Dec 2024 05:52:49 +0000 (13:52 +0800)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 9 Jan 2025 12:28:45 +0000 (13:28 +0100)
commita68ec6380f2f8e6f868a86c00c96b0ddd3619f88
treec4b2fba5b1096e42480d829a46f2ba7d87b5b089
parent06e2d3ec7a7d76b6b92292891276e266003a6db9
RDMA/hns: Fix missing flush CQE for DWQE

[ Upstream commit e3debdd48423d3d75b9d366399228d7225d902cd ]

Flush CQE handler has not been called if QP state gets into errored
mode in DWQE path. So, the new added outstanding WQEs will never be
flushed.

It leads to a hung task timeout when using NFS over RDMA:
    __switch_to+0x7c/0xd0
    __schedule+0x350/0x750
    schedule+0x50/0xf0
    schedule_timeout+0x2c8/0x340
    wait_for_common+0xf4/0x2b0
    wait_for_completion+0x20/0x40
    __ib_drain_sq+0x140/0x1d0 [ib_core]
    ib_drain_sq+0x98/0xb0 [ib_core]
    rpcrdma_xprt_disconnect+0x68/0x270 [rpcrdma]
    xprt_rdma_close+0x20/0x60 [rpcrdma]
    xprt_autoclose+0x64/0x1cc [sunrpc]
    process_one_work+0x1d8/0x4e0
    worker_thread+0x154/0x420
    kthread+0x108/0x150
    ret_from_fork+0x10/0x18

Fixes: 01584a5edcc4 ("RDMA/hns: Add support of direct wqe")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20241220055249.146943-5-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/infiniband/hw/hns/hns_roce_hw_v2.c