From: Sasha Levin
Date: Sun, 23 May 2021 20:05:29 +0000 (-0400)
Subject: Fixes for 5.12
X-Git-Tag: v4.4.270~76
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=232fe7b9148444eaf7b5f17d38dae5e2c24adb89;p=thirdparty%2Fkernel%2Fstable-queue.git

Fixes for 5.12

Signed-off-by: Sasha Levin
---

diff --git a/queue-5.12/drm-ttm-do-not-add-non-system-domain-bo-into-swap-li.patch b/queue-5.12/drm-ttm-do-not-add-non-system-domain-bo-into-swap-li.patch
new file mode 100644
index 00000000000..6779964abad
--- /dev/null
+++ b/queue-5.12/drm-ttm-do-not-add-non-system-domain-bo-into-swap-li.patch
@@ -0,0 +1,43 @@
+From dff1c397d42afe14c71a841b0525d2dc2a06bdd9 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 24 Feb 2021 11:28:08 +0800
+Subject: drm/ttm: Do not add non-system domain BO into swap list
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: xinhui pan
+
+[ Upstream commit ad2c28bd9a4083816fa45a7e90c2486cde8a9873 ]
+
+A BO is added to the swap list when it is validated into the system
+domain. If the BO is later validated into a non-system domain, say the
+VRAM domain, it should no longer be on the swap list.
+
+Signed-off-by: xinhui pan
+Acked-by: Guchun Chen
+Acked-by: Alex Deucher
+Reviewed-by: Christian König
+Link: https://patchwork.freedesktop.org/patch/msgid/20210224032808.150465-1-xinhui.pan@amd.com
+Signed-off-by: Christian König
+Signed-off-by: Sasha Levin
+---
+ drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
+index 101a68dc615b..799ec7a7caa4 100644
+--- a/drivers/gpu/drm/ttm/ttm_bo.c
++++ b/drivers/gpu/drm/ttm/ttm_bo.c
+@@ -153,6 +153,8 @@ void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
+
+         swap = &ttm_bo_glob.swap_lru[bo->priority];
+         list_move_tail(&bo->swap, swap);
++    } else {
++        list_del_init(&bo->swap);
+     }
+
+     if (bdev->driver->del_from_lru_notify)
+--
+2.30.2
+
diff --git a/queue-5.12/firmware-arm_scpi-prevent-the-ternary-sign-expansion.patch b/queue-5.12/firmware-arm_scpi-prevent-the-ternary-sign-expansion.patch
new file mode 100644
index 00000000000..e77318b8f98
--- /dev/null
+++ b/queue-5.12/firmware-arm_scpi-prevent-the-ternary-sign-expansion.patch
@@ -0,0 +1,49 @@
+From 3520b55e13f161ddddd06472cc03c20cb3b0b47e Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Thu, 22 Apr 2021 12:02:29 +0300
+Subject: firmware: arm_scpi: Prevent the ternary sign expansion bug
+
+From: Dan Carpenter
+
+[ Upstream commit d9cd78edb2e6b7e26747c0ec312be31e7ef196fe ]
+
+How type promotion works in ternary expressions is a bit tricky.
+The problem is that scpi_clk_get_val() returns long, "ret" is an int
+which holds a negative error code, and le32_to_cpu() returns an
+unsigned int. We want the negative error code to be cast to a negative
+long. But because le32_to_cpu() is a u32, "ret" is first type promoted
+to u32 and becomes a large positive value, and when it is then promoted
+to long it is still a large positive value.
+
+Fix this by getting rid of the ternary.
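+
+For illustration only (not part of the upstream patch), a minimal
+user-space sketch of the same promotion, assuming a 32-bit int and a
+64-bit long:
+
+    #include <stdio.h>
+
+    int main(void)
+    {
+        int ret = -5;                   /* negative error code */
+        unsigned int rate = 100;        /* what le32_to_cpu() yields */
+        long val = ret ? ret : rate;    /* ret converted to u32 first */
+
+        printf("%ld\n", val);           /* prints 4294967291, not -5 */
+        return 0;
+    }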
+
+Link: https://lore.kernel.org/r/YIE7pdqV/h10tEAK@mwanda
+Fixes: 8cb7cf56c9fe ("firmware: add support for ARM System Control and Power Interface(SCPI) protocol")
+Reviewed-by: Cristian Marussi
+Signed-off-by: Dan Carpenter
+[sudeep.holla: changed to return 0 as clock rate on error]
+Signed-off-by: Sudeep Holla
+Signed-off-by: Sasha Levin
+---
+ drivers/firmware/arm_scpi.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c
+index d0dee37ad522..4ceba5ef7895 100644
+--- a/drivers/firmware/arm_scpi.c
++++ b/drivers/firmware/arm_scpi.c
+@@ -552,8 +552,10 @@ static unsigned long scpi_clk_get_val(u16 clk_id)
+
+     ret = scpi_send_message(CMD_GET_CLOCK_VALUE, &le_clk_id,
+                 sizeof(le_clk_id), &rate, sizeof(rate));
++    if (ret)
++        return 0;
+
+-    return ret ? ret : le32_to_cpu(rate);
++    return le32_to_cpu(rate);
+ }
+
+ static int scpi_clk_set_val(u16 clk_id, unsigned long rate)
+--
+2.30.2
+
diff --git a/queue-5.12/habanalabs-gaudi-fix-a-potential-use-after-free-in-g.patch b/queue-5.12/habanalabs-gaudi-fix-a-potential-use-after-free-in-g.patch
new file mode 100644
index 00000000000..a757b3aa3eb
--- /dev/null
+++ b/queue-5.12/habanalabs-gaudi-fix-a-potential-use-after-free-in-g.patch
@@ -0,0 +1,57 @@
+From 78f96b4733e432f87b2e20c63da71118a10df7cd Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Mon, 26 Apr 2021 06:43:46 -0700
+Subject: habanalabs/gaudi: Fix a potential use after free in
+ gaudi_memset_device_memory
+
+From: Lv Yunlong
+
+[ Upstream commit 115726c5d312b462c9d9931ea42becdfa838a076 ]
+
+Our code analyzer reported a use-after-free.
+
+In gaudi_memset_device_memory(), cb is obtained via hl_cb_kernel_create()
+with a refcount of 2. If hl_cs_allocate_job() fails, execution reaches
+the release_cb branch. One reference to cb is dropped by hl_cb_put(cb),
+and cb could be freed if another thread also drops a reference. cb->id
+is then dereferenced, which is a potential use-after-free.
+
+Add a variable 'id' to capture the value of cb->id before hl_cb_put(cb)
+is called, to avoid the potential use-after-free.
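+
+In outline, using the calls from the hunk below (sketch only;
+hl_cb_put() may free cb once the last reference is dropped):
+
+    /* BAD: cb->id is read after the reference is dropped */
+    hl_cb_put(cb);
+    hl_cb_destroy(hdev, &hdev->kernel_cb_mgr, cb->id << PAGE_SHIFT);
+
+    /* GOOD: capture the id while the reference is still held */
+    id = cb->id;
+    hl_cb_put(cb);
+    hl_cb_destroy(hdev, &hdev->kernel_cb_mgr, id << PAGE_SHIFT);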
+
+Fixes: 423815bf02e25 ("habanalabs/gaudi: remove PCI access to SM block")
+Signed-off-by: Lv Yunlong
+Reviewed-by: Oded Gabbay
+Signed-off-by: Oded Gabbay
+Signed-off-by: Sasha Levin
+---
+ drivers/misc/habanalabs/gaudi/gaudi.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
+index 9152242778f5..ecdedd87f8cc 100644
+--- a/drivers/misc/habanalabs/gaudi/gaudi.c
++++ b/drivers/misc/habanalabs/gaudi/gaudi.c
+@@ -5546,6 +5546,7 @@ static int gaudi_memset_device_memory(struct hl_device *hdev, u64 addr,
+     struct hl_cs_job *job;
+     u32 cb_size, ctl, err_cause;
+     struct hl_cb *cb;
++    u64 id;
+     int rc;
+
+     cb = hl_cb_kernel_create(hdev, PAGE_SIZE, false);
+@@ -5612,8 +5613,9 @@ static int gaudi_memset_device_memory(struct hl_device *hdev, u64 addr,
+     }
+
+ release_cb:
++    id = cb->id;
+     hl_cb_put(cb);
+-    hl_cb_destroy(hdev, &hdev->kernel_cb_mgr, cb->id << PAGE_SHIFT);
++    hl_cb_destroy(hdev, &hdev->kernel_cb_mgr, id << PAGE_SHIFT);
+
+     return rc;
+ }
+--
+2.30.2
+
diff --git a/queue-5.12/nvme-fc-clear-q_live-at-beginning-of-association-tea.patch b/queue-5.12/nvme-fc-clear-q_live-at-beginning-of-association-tea.patch
new file mode 100644
index 00000000000..780357bd607
--- /dev/null
+++ b/queue-5.12/nvme-fc-clear-q_live-at-beginning-of-association-tea.patch
@@ -0,0 +1,59 @@
+From 36ac0196a59fc334ad2b718dda96e462fad316ba Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Mon, 10 May 2021 21:56:35 -0700
+Subject: nvme-fc: clear q_live at beginning of association teardown
+
+From: James Smart
+
+[ Upstream commit a7d139145a6640172516b193abf6d2398620aa14 ]
+
+The __nvmf_check_ready() routine used to bounce all filesystem io if the
+controller state isn't LIVE. However, a later patch changed the logic so
+that the rejection ends up being based on the queue live check. The FC
+transport has a slightly different sequence from rdma and tcp for
+shutting down queues/marking them non-live. FC marks its queue non-live
+after aborting all ios and waiting for their termination, leaving a
+rather large window for filesystem io to continue to hit the transport.
+Unfortunately this resulted in filesystem I/O or applications seeing I/O
+errors.
+
+Change the FC transport to mark the queues non-live at the first sign of
+teardown for the association (when I/O is initially terminated).
+
+Fixes: 73a5379937ec ("nvme-fabrics: allow to queue requests for live queues")
+Signed-off-by: James Smart
+Reviewed-by: Sagi Grimberg
+Reviewed-by: Himanshu Madhani
+Reviewed-by: Hannes Reinecke
+Signed-off-by: Christoph Hellwig
+Signed-off-by: Sasha Levin
+---
+ drivers/nvme/host/fc.c | 12 ++++++++++++
+ 1 file changed, 12 insertions(+)
+
+diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
+index 6ffa8de2a0d7..5eee603bc249 100644
+--- a/drivers/nvme/host/fc.c
++++ b/drivers/nvme/host/fc.c
+@@ -2460,6 +2460,18 @@ nvme_fc_terminate_exchange(struct request *req, void *data, bool reserved)
+ static void
+ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
+ {
++    int q;
++
++    /*
++     * if aborting io, the queues are no longer good, mark them
++     * all as not live.
++     */
++    if (ctrl->ctrl.queue_count > 1) {
++        for (q = 1; q < ctrl->ctrl.queue_count; q++)
++            clear_bit(NVME_FC_Q_LIVE, &ctrl->queues[q].flags);
++    }
++    clear_bit(NVME_FC_Q_LIVE, &ctrl->queues[0].flags);
++
+     /*
+      * If io queues are present, stop them and terminate all outstanding
+      * ios on them. As FC allocates FC exchange for each io, the
+--
+2.30.2
+
diff --git a/queue-5.12/nvme-loop-fix-memory-leak-in-nvme_loop_create_ctrl.patch b/queue-5.12/nvme-loop-fix-memory-leak-in-nvme_loop_create_ctrl.patch
new file mode 100644
index 00000000000..45441bd8820
--- /dev/null
+++ b/queue-5.12/nvme-loop-fix-memory-leak-in-nvme_loop_create_ctrl.patch
@@ -0,0 +1,39 @@
+From 5457acdd06bd18ffcff131d82749b36a47843781 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 19 May 2021 13:01:10 +0800
+Subject: nvme-loop: fix memory leak in nvme_loop_create_ctrl()
+
+From: Wu Bo
+
+[ Upstream commit 03504e3b54cc8118cc26c064e60a0b00c2308708 ]
+
+When creating loop ctrl in nvme_loop_create_ctrl(), if nvme_init_ctrl()
+fails, the loop ctrl should be freed before jumping to the "out" label.
+
+Fixes: 3a85a5de29ea ("nvme-loop: add a NVMe loopback host driver")
+Signed-off-by: Wu Bo
+Signed-off-by: Christoph Hellwig
+Signed-off-by: Sasha Levin
+---
+ drivers/nvme/target/loop.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
+index 3e189e753bcf..14913a4588ec 100644
+--- a/drivers/nvme/target/loop.c
++++ b/drivers/nvme/target/loop.c
+@@ -588,8 +588,10 @@ static struct nvme_ctrl *nvme_loop_create_ctrl(struct device *dev,
+
+     ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_loop_ctrl_ops,
+                 0 /* no quirks, we're perfect! */);
+-    if (ret)
++    if (ret) {
++        kfree(ctrl);
+         goto out;
++    }
+
+     if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
+         WARN_ON_ONCE(1);
+--
+2.30.2
+
diff --git a/queue-5.12/nvme-tcp-rerun-io_work-if-req_list-is-not-empty.patch b/queue-5.12/nvme-tcp-rerun-io_work-if-req_list-is-not-empty.patch
new file mode 100644
index 00000000000..ab1ef21f4a4
--- /dev/null
+++ b/queue-5.12/nvme-tcp-rerun-io_work-if-req_list-is-not-empty.patch
@@ -0,0 +1,48 @@
+From 7c9fae5411c4ebbfe6c70e09f5630dfdede54385 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Mon, 17 May 2021 15:36:43 -0700
+Subject: nvme-tcp: rerun io_work if req_list is not empty
+
+From: Keith Busch
+
+[ Upstream commit a0fdd1418007f83565d3f2e04b47923ba93a9b8c ]
+
+A possible race condition exists where a request to send data, enqueued
+from nvme_tcp_handle_r2t(), will not be observed by nvme_tcp_send_all()
+if the latter happens to be running. The driver relies on io_work to
+send the enqueued request when it runs again, but the concurrently
+running nvme_tcp_send_all() may not have released the send_mutex at that
+time. If no future commands are enqueued to re-kick the io_work, the
+request will time out in the SEND_H2C state, resulting in a timeout
+error like:
+
+  nvme nvme0: queue 1: timeout request 0x3 type 6
+
+Ensure the io_work continues to run as long as the req_list is not empty.
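+
+In outline, the send path of nvme_tcp_io_work() now treats a non-empty
+req_list as pending work when it loses the send_mutex race (simplified
+sketch of the loop body; see the hunk below for the actual one-line
+change):
+
+    if (mutex_trylock(&queue->send_mutex)) {
+        result = nvme_tcp_try_send(queue);
+        mutex_unlock(&queue->send_mutex);
+        if (result > 0)
+            pending = true;
+        else if (unlikely(result < 0))
+            break;
+    } else {
+        /* another context holds send_mutex and may not have seen
+         * our request: stay pending so io_work gets rerun */
+        pending = !llist_empty(&queue->req_list);
+    }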
+
+Fixes: db5ad6b7f8cdd ("nvme-tcp: try to send request in queue_rq context")
+Signed-off-by: Keith Busch
+Reviewed-by: Sagi Grimberg
+Signed-off-by: Christoph Hellwig
+Signed-off-by: Sasha Levin
+---
+ drivers/nvme/host/tcp.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
+index d7d7c81d0701..f8ef1faaf5e4 100644
+--- a/drivers/nvme/host/tcp.c
++++ b/drivers/nvme/host/tcp.c
+@@ -1137,7 +1137,8 @@ static void nvme_tcp_io_work(struct work_struct *w)
+             pending = true;
+         else if (unlikely(result < 0))
+             break;
+-    }
++    } else
++        pending = !llist_empty(&queue->req_list);
+
+     result = nvme_tcp_try_recv(queue);
+     if (result > 0)
+--
+2.30.2
+
diff --git a/queue-5.12/nvmet-fix-memory-leak-in-nvmet_alloc_ctrl.patch b/queue-5.12/nvmet-fix-memory-leak-in-nvmet_alloc_ctrl.patch
new file mode 100644
index 00000000000..128975962bc
--- /dev/null
+++ b/queue-5.12/nvmet-fix-memory-leak-in-nvmet_alloc_ctrl.patch
@@ -0,0 +1,40 @@
+From 54dc1c90f16926100e95c6fc23a72931c6660903 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 19 May 2021 13:01:09 +0800
+Subject: nvmet: fix memory leak in nvmet_alloc_ctrl()
+
+From: Wu Bo
+
+[ Upstream commit fec356a61aa3d3a66416b4321f1279e09e0f256f ]
+
+When creating a ctrl in nvmet_alloc_ctrl(), if cntlid_min is larger
+than cntlid_max of the subsystem, the code jumps to the
+"out_free_changed_ns_list" label, but ctrl->sqs is never freed.
+Fix this by jumping to the "out_free_sqs" label instead.
+
+Fixes: 94a39d61f80f ("nvmet: make ctrl-id configurable")
+Signed-off-by: Wu Bo
+Reviewed-by: Sagi Grimberg
+Reviewed-by: Chaitanya Kulkarni
+Signed-off-by: Christoph Hellwig
+Signed-off-by: Sasha Levin
+---
+ drivers/nvme/target/core.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
+index a027433b8be8..348057fdc568 100644
+--- a/drivers/nvme/target/core.c
++++ b/drivers/nvme/target/core.c
+@@ -1371,7 +1371,7 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn,
+         goto out_free_changed_ns_list;
+
+     if (subsys->cntlid_min > subsys->cntlid_max)
+-        goto out_free_changed_ns_list;
++        goto out_free_sqs;
+
+     ret = ida_simple_get(&cntlid_ida,
+             subsys->cntlid_min, subsys->cntlid_max,
+--
+2.30.2
+
diff --git a/queue-5.12/nvmet-seset-ns-file-when-open-fails.patch b/queue-5.12/nvmet-seset-ns-file-when-open-fails.patch
new file mode 100644
index 00000000000..e8799f722a2
--- /dev/null
+++ b/queue-5.12/nvmet-seset-ns-file-when-open-fails.patch
@@ -0,0 +1,61 @@
+From 71e97a341ff1b93aa96b3cd5fc47cbeb4533664b Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 12 May 2021 16:50:05 +0200
+Subject: nvmet: reset ns->file when open fails
+
+From: Daniel Wagner
+
+[ Upstream commit 85428beac80dbcace5b146b218697c73e367dcf5 ]
+
+Reset the ns->file value to NULL also in the error case in
+nvmet_file_ns_enable().
+
+The ns->file variable points either to a file object or contains the
+error code after the filp_open() call. This can lead to the following
+problem:
+
+When the user first sets up an invalid file backend and tries to enable
+the ns, it will fail. Then the user switches over to a bdev backend
+and successfully enables the ns. The first received I/O will crash the
+system because the I/O backend is chosen based on the ns->file value:
+
+static u16 nvmet_parse_io_cmd(struct nvmet_req *req)
+{
+    [...]
+
+    if (req->ns->file)
+        return nvmet_file_parse_io_cmd(req);
+
+    return nvmet_bdev_parse_io_cmd(req);
+}
+
+Reported-by: Enzo Matsumiya
+Signed-off-by: Daniel Wagner
+Signed-off-by: Christoph Hellwig
+Signed-off-by: Sasha Levin
+---
+ drivers/nvme/target/io-cmd-file.c | 8 +++++---
+ 1 file changed, 5 insertions(+), 3 deletions(-)
+
+diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
+index 715d4376c997..7fdbdc496597 100644
+--- a/drivers/nvme/target/io-cmd-file.c
++++ b/drivers/nvme/target/io-cmd-file.c
+@@ -49,9 +49,11 @@ int nvmet_file_ns_enable(struct nvmet_ns *ns)
+
+     ns->file = filp_open(ns->device_path, flags, 0);
+     if (IS_ERR(ns->file)) {
+-        pr_err("failed to open file %s: (%ld)\n",
+-            ns->device_path, PTR_ERR(ns->file));
+-        return PTR_ERR(ns->file);
++        ret = PTR_ERR(ns->file);
++        pr_err("failed to open file %s: (%d)\n",
++            ns->device_path, ret);
++        ns->file = NULL;
++        return ret;
+     }
+
+     ret = nvmet_file_ns_revalidate(ns);
+--
+2.30.2
+
diff --git a/queue-5.12/openrisc-fix-a-memory-leak.patch b/queue-5.12/openrisc-fix-a-memory-leak.patch
new file mode 100644
index 00000000000..2db20d9eae3
--- /dev/null
+++ b/queue-5.12/openrisc-fix-a-memory-leak.patch
@@ -0,0 +1,42 @@
+From 7382e4ead8015bfad0a03925194ffb4c262b6d36 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Fri, 23 Apr 2021 17:09:28 +0200
+Subject: openrisc: Fix a memory leak
+
+From: Christophe JAILLET
+
+[ Upstream commit c019d92457826bb7b2091c86f36adb5de08405f9 ]
+
+'setup_find_cpu_node()' takes a reference on the node it returns.
+This reference must be decremented when it is no longer needed, or
+there will be a leak.
+
+Add the missing 'of_node_put(cpu)'.
+
+Note that 'setup_cpuinfo()', which also calls this function, already
+has a correct 'of_node_put(cpu)' at its end.
+
+Fixes: 9d02a4283e9c ("OpenRISC: Boot code")
+Signed-off-by: Christophe JAILLET
+Signed-off-by: Stafford Horne
+Signed-off-by: Sasha Levin
+---
+ arch/openrisc/kernel/setup.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/arch/openrisc/kernel/setup.c b/arch/openrisc/kernel/setup.c
+index 2416a9f91533..c6f9e7b9f7cb 100644
+--- a/arch/openrisc/kernel/setup.c
++++ b/arch/openrisc/kernel/setup.c
+@@ -278,6 +278,8 @@ void calibrate_delay(void)
+     pr_cont("%lu.%02lu BogoMIPS (lpj=%lu)\n",
+         loops_per_jiffy / (500000 / HZ),
+         (loops_per_jiffy / (5000 / HZ)) % 100, loops_per_jiffy);
++
++    of_node_put(cpu);
+ }
+
+ void __init setup_arch(char **cmdline_p)
+--
+2.30.2
+
diff --git a/queue-5.12/platform-mellanox-mlxbf-tmfifo-fix-a-memory-barrier-.patch b/queue-5.12/platform-mellanox-mlxbf-tmfifo-fix-a-memory-barrier-.patch
new file mode 100644
index 00000000000..cd73165a17e
--- /dev/null
+++ b/queue-5.12/platform-mellanox-mlxbf-tmfifo-fix-a-memory-barrier-.patch
@@ -0,0 +1,66 @@
+From 6fd84bfa492e001355cf9367a9f07e81c0c8f6d8 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Fri, 7 May 2021 20:30:12 -0400
+Subject: platform/mellanox: mlxbf-tmfifo: Fix a memory barrier issue
+
+From: Liming Sun
+
+[ Upstream commit 1c0e5701c5e792c090aef0e5b9b8923c334d9324 ]
+
+The virtio framework uses wmb() when updating avail->idx. It
+guarantees the write order, but not necessarily the load order
+for the code accessing the memory. This commit adds a load barrier
+after reading the avail->idx to make sure all the data in the
+descriptor is visible. It also adds a barrier when returning the
+packet to the virtio framework to make sure reads/writes are
+visible to the virtio code.
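+
+In outline, the consumer-side barriers added here pair with the
+producer's wmb() (sketch; with weak_barriers == false, virtio_rmb()
+resolves to dma_rmb() and virtio_mb() to mb()):
+
+    producer (virtio core)        consumer (tmfifo)
+    ----------------------        -----------------
+    write descriptors             read avail->idx
+    wmb()                         virtio_rmb(false)   /* added */
+    write avail->idx              read descriptors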
+
+Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc")
+Signed-off-by: Liming Sun
+Reviewed-by: Vadim Pasternak
+Link: https://lore.kernel.org/r/1620433812-17911-1-git-send-email-limings@nvidia.com
+Signed-off-by: Hans de Goede
+Signed-off-by: Sasha Levin
+---
+ drivers/platform/mellanox/mlxbf-tmfifo.c | 11 ++++++++++-
+ 1 file changed, 10 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c b/drivers/platform/mellanox/mlxbf-tmfifo.c
+index bbc4e71a16ff..38800e86ed8a 100644
+--- a/drivers/platform/mellanox/mlxbf-tmfifo.c
++++ b/drivers/platform/mellanox/mlxbf-tmfifo.c
+@@ -294,6 +294,9 @@ mlxbf_tmfifo_get_next_desc(struct mlxbf_tmfifo_vring *vring)
+     if (vring->next_avail == virtio16_to_cpu(vdev, vr->avail->idx))
+         return NULL;
+
++    /* Make sure 'avail->idx' is visible already. */
++    virtio_rmb(false);
++
+     idx = vring->next_avail % vr->num;
+     head = virtio16_to_cpu(vdev, vr->avail->ring[idx]);
+     if (WARN_ON(head >= vr->num))
+@@ -322,7 +325,7 @@ static void mlxbf_tmfifo_release_desc(struct mlxbf_tmfifo_vring *vring,
+      * done or not. Add a memory barrier here to make sure the update above
+      * completes before updating the idx.
+      */
+-    mb();
++    virtio_mb(false);
+     vr->used->idx = cpu_to_virtio16(vdev, vr_idx + 1);
+ }
+
+@@ -733,6 +736,12 @@ static bool mlxbf_tmfifo_rxtx_one_desc(struct mlxbf_tmfifo_vring *vring,
+         desc = NULL;
+         fifo->vring[is_rx] = NULL;
+
++        /*
++         * Make sure the load/store are in order before
++         * returning back to virtio.
++         */
++        virtio_mb(false);
++
+         /* Notify upper layer that packet is done. */
+         spin_lock_irqsave(&fifo->spin_lock[is_rx], flags);
+         vring_interrupt(0, vring->vq);
+--
+2.30.2
+
diff --git a/queue-5.12/platform-x86-dell-smbios-wmi-fix-oops-on-rmmod-dell_.patch b/queue-5.12/platform-x86-dell-smbios-wmi-fix-oops-on-rmmod-dell_.patch
new file mode 100644
index 00000000000..e81acce4c31
--- /dev/null
+++ b/queue-5.12/platform-x86-dell-smbios-wmi-fix-oops-on-rmmod-dell_.patch
@@ -0,0 +1,53 @@
+From 842ccca71aabeedd72dcd21966bf346a464bbac0 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Tue, 18 May 2021 14:50:27 +0200
+Subject: platform/x86: dell-smbios-wmi: Fix oops on rmmod dell_smbios
+
+From: Hans de Goede
+
+[ Upstream commit 3a53587423d25c87af4b4126a806a0575104b45e ]
+
+init_dell_smbios_wmi() only registers the dell_smbios_wmi_driver on
+systems where the Dell WMI interface is supported, while
+exit_dell_smbios_wmi() unregisters it unconditionally. This leads to
+the following oops:
+
+[  175.722921] ------------[ cut here ]------------
+[  175.722925] Unexpected driver unregister!
+[  175.722939] WARNING: CPU: 1 PID: 3630 at drivers/base/driver.c:194 driver_unregister+0x38/0x40
+...
+[  175.723089] Call Trace:
+[  175.723094]  cleanup_module+0x5/0xedd [dell_smbios]
+...
+[  175.723148] ---[ end trace 064c34e1ad49509d ]---
+
+To fix this, make the unregister happen under the same condition as the
+register.
+ +Cc: Mario Limonciello +Fixes: 1a258e670434 ("platform/x86: dell-smbios-wmi: Add new WMI dispatcher driver") +Signed-off-by: Hans de Goede +Reviewed-by: Mario Limonciello +Reviewed-by: Mark Gross +Link: https://lore.kernel.org/r/20210518125027.21824-1-hdegoede@redhat.com +Signed-off-by: Sasha Levin +--- + drivers/platform/x86/dell/dell-smbios-wmi.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/drivers/platform/x86/dell/dell-smbios-wmi.c b/drivers/platform/x86/dell/dell-smbios-wmi.c +index 27a298b7c541..c97bd4a45242 100644 +--- a/drivers/platform/x86/dell/dell-smbios-wmi.c ++++ b/drivers/platform/x86/dell/dell-smbios-wmi.c +@@ -271,7 +271,8 @@ int init_dell_smbios_wmi(void) + + void exit_dell_smbios_wmi(void) + { +- wmi_driver_unregister(&dell_smbios_wmi_driver); ++ if (wmi_supported) ++ wmi_driver_unregister(&dell_smbios_wmi_driver); + } + + MODULE_DEVICE_TABLE(wmi, dell_smbios_wmi_id_table); +-- +2.30.2 + diff --git a/queue-5.12/platform-x86-ideapad-laptop-fix-a-null-pointer-deref.patch b/queue-5.12/platform-x86-ideapad-laptop-fix-a-null-pointer-deref.patch new file mode 100644 index 00000000000..b090ff39efc --- /dev/null +++ b/queue-5.12/platform-x86-ideapad-laptop-fix-a-null-pointer-deref.patch @@ -0,0 +1,46 @@ +From 36466da743e8cf82f72fb6aa8bbfc68e3c5355e6 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 28 Apr 2021 13:06:36 +0800 +Subject: platform/x86: ideapad-laptop: fix a NULL pointer dereference + +From: Qiu Wenbo + +[ Upstream commit ff67dbd554b2aaa22be933eced32610ff90209dd ] + +The third parameter of dytc_cql_command should not be NULL since it will +be dereferenced immediately. + +Fixes: ff36b0d953dc4 ("platform/x86: ideapad-laptop: rework and create new ACPI helpers") +Signed-off-by: Qiu Wenbo +Acked-by: Ike Panhc +Link: https://lore.kernel.org/r/20210428050636.8003-1-qiuwenbo@kylinos.com.cn +Signed-off-by: Hans de Goede +Signed-off-by: Sasha Levin +--- + drivers/platform/x86/ideapad-laptop.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/drivers/platform/x86/ideapad-laptop.c b/drivers/platform/x86/ideapad-laptop.c +index 6cb5ad4be231..8f871151f0cc 100644 +--- a/drivers/platform/x86/ideapad-laptop.c ++++ b/drivers/platform/x86/ideapad-laptop.c +@@ -809,6 +809,7 @@ static int dytc_profile_set(struct platform_profile_handler *pprof, + { + struct ideapad_dytc_priv *dytc = container_of(pprof, struct ideapad_dytc_priv, pprof); + struct ideapad_private *priv = dytc->priv; ++ unsigned long output; + int err; + + err = mutex_lock_interruptible(&dytc->mutex); +@@ -829,7 +830,7 @@ static int dytc_profile_set(struct platform_profile_handler *pprof, + + /* Determine if we are in CQL mode. 
This alters the commands we do */ + err = dytc_cql_command(priv, DYTC_SET_COMMAND(DYTC_FUNCTION_MMC, perfmode, 1), +- NULL); ++ &output); + if (err) + goto unlock; + } +-- +2.30.2 + diff --git a/queue-5.12/platform-x86-intel_int0002_vgpio-only-call-enable_ir.patch b/queue-5.12/platform-x86-intel_int0002_vgpio-only-call-enable_ir.patch new file mode 100644 index 00000000000..e2c41eb22e1 --- /dev/null +++ b/queue-5.12/platform-x86-intel_int0002_vgpio-only-call-enable_ir.patch @@ -0,0 +1,232 @@ +From b68d2c11d8b3a13b7c10717aa49818129ec6abe0 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 12 May 2021 14:55:23 +0200 +Subject: platform/x86: intel_int0002_vgpio: Only call enable_irq_wake() when + using s2idle + +From: Hans de Goede + +[ Upstream commit b68e182a3062e326b891f47152a3a1b84abccf0f ] + +Commit 871f1f2bcb01 ("platform/x86: intel_int0002_vgpio: Only implement +irq_set_wake on Bay Trail") stopped passing irq_set_wake requests on to +the parents IRQ because this was breaking suspend (causing immediate +wakeups) on an Asus E202SA. + +This workaround for the Asus E202SA is causing wakeup by USB keyboard to +not work on other devices with Airmont CPU cores such as the Medion Akoya +E1239T. In hindsight the problem with the Asus E202SA has nothing to do +with Silvermont vs Airmont CPU cores, so the differentiation between the +2 types of CPU cores introduced by the previous fix is wrong. + +The real issue at hand is s2idle vs S3 suspend where the suspend is +mostly handled by firmware. The parent IRQ for the INT0002 device is shared +with the ACPI SCI and the real problem is that the INT0002 code should not +be messing with the wakeup settings of that IRQ when suspend/resume is +being handled by the firmware. + +Note that on systems which support both s2idle and S3 suspend, which +suspend method to use can be changed at runtime. + +This patch fixes both the Asus E202SA spurious wakeups issue as well as +the wakeup by USB keyboard not working on the Medion Akoya E1239T issue. + +These are both fixed by replacing the old workaround with delaying the +enable_irq_wake(parent_irq) call till system-suspend time and protecting +it with a !pm_suspend_via_firmware() check so that we still do not call +it on devices using firmware-based (S3) suspend such as the Asus E202SA. + +Note rather then adding #ifdef CONFIG_PM_SLEEP, this commit simply adds +a "depends on PM_SLEEP" to the Kconfig since this drivers whole purpose +is to deal with wakeup events, so using it without CONFIG_PM_SLEEP makes +no sense. + +Cc: Maxim Mikityanskiy +Fixes: 871f1f2bcb01 ("platform/x86: intel_int0002_vgpio: Only implement irq_set_wake on Bay Trail") +Signed-off-by: Hans de Goede +Reviewed-by: Andy Shevchenko +Reviewed-by: Rafael J. 
Wysocki +Link: https://lore.kernel.org/r/20210512125523.55215-2-hdegoede@redhat.com +Signed-off-by: Sasha Levin +--- + drivers/platform/x86/Kconfig | 2 +- + drivers/platform/x86/intel_int0002_vgpio.c | 80 +++++++++++++++------- + 2 files changed, 57 insertions(+), 25 deletions(-) + +diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig +index 461ec61530eb..205a096e9cee 100644 +--- a/drivers/platform/x86/Kconfig ++++ b/drivers/platform/x86/Kconfig +@@ -688,7 +688,7 @@ config INTEL_HID_EVENT + + config INTEL_INT0002_VGPIO + tristate "Intel ACPI INT0002 Virtual GPIO driver" +- depends on GPIOLIB && ACPI ++ depends on GPIOLIB && ACPI && PM_SLEEP + select GPIOLIB_IRQCHIP + help + Some peripherals on Bay Trail and Cherry Trail platforms signal a +diff --git a/drivers/platform/x86/intel_int0002_vgpio.c b/drivers/platform/x86/intel_int0002_vgpio.c +index 289c6655d425..569342aa8926 100644 +--- a/drivers/platform/x86/intel_int0002_vgpio.c ++++ b/drivers/platform/x86/intel_int0002_vgpio.c +@@ -51,6 +51,12 @@ + #define GPE0A_STS_PORT 0x420 + #define GPE0A_EN_PORT 0x428 + ++struct int0002_data { ++ struct gpio_chip chip; ++ int parent_irq; ++ int wake_enable_count; ++}; ++ + /* + * As this is not a real GPIO at all, but just a hack to model an event in + * ACPI the get / set functions are dummy functions. +@@ -98,14 +104,16 @@ static void int0002_irq_mask(struct irq_data *data) + static int int0002_irq_set_wake(struct irq_data *data, unsigned int on) + { + struct gpio_chip *chip = irq_data_get_irq_chip_data(data); +- struct platform_device *pdev = to_platform_device(chip->parent); +- int irq = platform_get_irq(pdev, 0); ++ struct int0002_data *int0002 = container_of(chip, struct int0002_data, chip); + +- /* Propagate to parent irq */ ++ /* ++ * Applying of the wakeup flag to our parent IRQ is delayed till system ++ * suspend, because we only want to do this when using s2idle. ++ */ + if (on) +- enable_irq_wake(irq); ++ int0002->wake_enable_count++; + else +- disable_irq_wake(irq); ++ int0002->wake_enable_count--; + + return 0; + } +@@ -135,7 +143,7 @@ static bool int0002_check_wake(void *data) + return (gpe_sts_reg & GPE0A_PME_B0_STS_BIT); + } + +-static struct irq_chip int0002_byt_irqchip = { ++static struct irq_chip int0002_irqchip = { + .name = DRV_NAME, + .irq_ack = int0002_irq_ack, + .irq_mask = int0002_irq_mask, +@@ -143,21 +151,9 @@ static struct irq_chip int0002_byt_irqchip = { + .irq_set_wake = int0002_irq_set_wake, + }; + +-static struct irq_chip int0002_cht_irqchip = { +- .name = DRV_NAME, +- .irq_ack = int0002_irq_ack, +- .irq_mask = int0002_irq_mask, +- .irq_unmask = int0002_irq_unmask, +- /* +- * No set_wake, on CHT the IRQ is typically shared with the ACPI SCI +- * and we don't want to mess with the ACPI SCI irq settings. +- */ +- .flags = IRQCHIP_SKIP_SET_WAKE, +-}; +- + static const struct x86_cpu_id int0002_cpu_ids[] = { +- X86_MATCH_INTEL_FAM6_MODEL(ATOM_SILVERMONT, &int0002_byt_irqchip), +- X86_MATCH_INTEL_FAM6_MODEL(ATOM_AIRMONT, &int0002_cht_irqchip), ++ X86_MATCH_INTEL_FAM6_MODEL(ATOM_SILVERMONT, NULL), ++ X86_MATCH_INTEL_FAM6_MODEL(ATOM_AIRMONT, NULL), + {} + }; + +@@ -172,8 +168,9 @@ static int int0002_probe(struct platform_device *pdev) + { + struct device *dev = &pdev->dev; + const struct x86_cpu_id *cpu_id; +- struct gpio_chip *chip; ++ struct int0002_data *int0002; + struct gpio_irq_chip *girq; ++ struct gpio_chip *chip; + int irq, ret; + + /* Menlow has a different INT0002 device? 
 */
+@@ -185,10 +182,13 @@ static int int0002_probe(struct platform_device *pdev)
+     if (irq < 0)
+         return irq;
+
+-    chip = devm_kzalloc(dev, sizeof(*chip), GFP_KERNEL);
+-    if (!chip)
++    int0002 = devm_kzalloc(dev, sizeof(*int0002), GFP_KERNEL);
++    if (!int0002)
+         return -ENOMEM;
+
++    int0002->parent_irq = irq;
++
++    chip = &int0002->chip;
+     chip->label = DRV_NAME;
+     chip->parent = dev;
+     chip->owner = THIS_MODULE;
+@@ -214,7 +214,7 @@ static int int0002_probe(struct platform_device *pdev)
+     }
+
+     girq = &chip->irq;
+-    girq->chip = (struct irq_chip *)cpu_id->driver_data;
++    girq->chip = &int0002_irqchip;
+     /* This let us handle the parent IRQ in the driver */
+     girq->parent_handler = NULL;
+     girq->num_parents = 0;
+@@ -230,6 +230,7 @@ static int int0002_probe(struct platform_device *pdev)
+
+     acpi_register_wakeup_handler(irq, int0002_check_wake, NULL);
+     device_init_wakeup(dev, true);
++    dev_set_drvdata(dev, int0002);
+     return 0;
+ }
+
+@@ -240,6 +241,36 @@ static int int0002_remove(struct platform_device *pdev)
+     return 0;
+ }
+
++static int int0002_suspend(struct device *dev)
++{
++    struct int0002_data *int0002 = dev_get_drvdata(dev);
++
++    /*
++     * The INT0002 parent IRQ is often shared with the ACPI GPE IRQ, don't
++     * muck with it when firmware based suspend is used, otherwise we may
++     * cause spurious wakeups from firmware managed suspend.
++     */
++    if (!pm_suspend_via_firmware() && int0002->wake_enable_count)
++        enable_irq_wake(int0002->parent_irq);
++
++    return 0;
++}
++
++static int int0002_resume(struct device *dev)
++{
++    struct int0002_data *int0002 = dev_get_drvdata(dev);
++
++    if (!pm_suspend_via_firmware() && int0002->wake_enable_count)
++        disable_irq_wake(int0002->parent_irq);
++
++    return 0;
++}
++
++static const struct dev_pm_ops int0002_pm_ops = {
++    .suspend = int0002_suspend,
++    .resume = int0002_resume,
++};
++
+ static const struct acpi_device_id int0002_acpi_ids[] = {
+     { "INT0002", 0 },
+     { },
+@@ -250,6 +281,7 @@ static struct platform_driver int0002_driver = {
+     .driver = {
+         .name = DRV_NAME,
+         .acpi_match_table = int0002_acpi_ids,
++        .pm = &int0002_pm_ops,
+     },
+     .probe = int0002_probe,
+     .remove = int0002_remove,
+--
+2.30.2
+
diff --git a/queue-5.12/powerpc-pseries-fix-hcall-tracing-recursion-in-pv-qu.patch b/queue-5.12/powerpc-pseries-fix-hcall-tracing-recursion-in-pv-qu.patch
new file mode 100644
index 00000000000..a0bc426dc2c
--- /dev/null
+++ b/queue-5.12/powerpc-pseries-fix-hcall-tracing-recursion-in-pv-qu.patch
@@ -0,0 +1,152 @@
+From 02062c76dbd8f70073d490c356c79e3db1deedc7 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Sat, 8 May 2021 20:14:52 +1000
+Subject: powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks
+
+From: Nicholas Piggin
+
+[ Upstream commit 2c8c89b95831f46a2fb31a8d0fef4601694023ce ]
+
+The paravirt queued spinlock slow path adds itself to the queue then
+calls pv_wait to wait for the lock to become free. This is implemented
+by calling H_CONFER to donate cycles.
+
+When hcall tracing is enabled, this H_CONFER call can lead to a spin
+lock being taken in the tracing code, which will result in the lock
+being taken again, which will also go to the slow path because it
+queues behind itself and so won't ever make progress.
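+
+In outline, the fix routes the spinlock hcalls around the hcall
+tracepoints (call-path sketch; plpar_hcall_norets_notrace() is the
+variant this patch introduces):
+
+    yield_to_preempted() / prod_cpu() / yield_to_any()
+      -> plpar_hcall_norets_notrace(...)    /* no tracepoints taken */
+
+    all other callers
+      -> plpar_hcall_norets(...)
+           -> __trace_hcall_entry()/__trace_hcall_exit()   /* may lock */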
+ +An example trace of a deadlock: + + __pv_queued_spin_lock_slowpath + trace_clock_global + ring_buffer_lock_reserve + trace_event_buffer_lock_reserve + trace_event_buffer_reserve + trace_event_raw_event_hcall_exit + __trace_hcall_exit + plpar_hcall_norets_trace + __pv_queued_spin_lock_slowpath + trace_clock_global + ring_buffer_lock_reserve + trace_event_buffer_lock_reserve + trace_event_buffer_reserve + trace_event_raw_event_rcu_dyntick + rcu_irq_exit + irq_exit + __do_irq + call_do_irq + do_IRQ + hardware_interrupt_common_virt + +Fix this by introducing plpar_hcall_norets_notrace(), and using that to +make SPLPAR virtual processor dispatching hcalls by the paravirt +spinlock code. + +Signed-off-by: Nicholas Piggin +Reviewed-by: Naveen N. Rao +Signed-off-by: Michael Ellerman +Link: https://lore.kernel.org/r/20210508101455.1578318-2-npiggin@gmail.com +Signed-off-by: Sasha Levin +--- + arch/powerpc/include/asm/hvcall.h | 3 +++ + arch/powerpc/include/asm/paravirt.h | 22 +++++++++++++++++++--- + arch/powerpc/platforms/pseries/hvCall.S | 10 ++++++++++ + arch/powerpc/platforms/pseries/lpar.c | 3 +-- + 4 files changed, 33 insertions(+), 5 deletions(-) + +diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h +index ed6086d57b22..0c92b01a3c3c 100644 +--- a/arch/powerpc/include/asm/hvcall.h ++++ b/arch/powerpc/include/asm/hvcall.h +@@ -446,6 +446,9 @@ + */ + long plpar_hcall_norets(unsigned long opcode, ...); + ++/* Variant which does not do hcall tracing */ ++long plpar_hcall_norets_notrace(unsigned long opcode, ...); ++ + /** + * plpar_hcall: - Make a pseries hypervisor call + * @opcode: The hypervisor call to make. +diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h +index 5d1726bb28e7..bcb7b5f917be 100644 +--- a/arch/powerpc/include/asm/paravirt.h ++++ b/arch/powerpc/include/asm/paravirt.h +@@ -28,19 +28,35 @@ static inline u32 yield_count_of(int cpu) + return be32_to_cpu(yield_count); + } + ++/* ++ * Spinlock code confers and prods, so don't trace the hcalls because the ++ * tracing code takes spinlocks which can cause recursion deadlocks. ++ * ++ * These calls are made while the lock is not held: the lock slowpath yields if ++ * it can not acquire the lock, and unlock slow path might prod if a waiter has ++ * yielded). So this may not be a problem for simple spin locks because the ++ * tracing does not technically recurse on the lock, but we avoid it anyway. ++ * ++ * However the queued spin lock contended path is more strictly ordered: the ++ * H_CONFER hcall is made after the task has queued itself on the lock, so then ++ * recursing on that lock will cause the task to then queue up again behind the ++ * first instance (or worse: queued spinlocks use tricks that assume a context ++ * never waits on more than one spinlock, so such recursion may cause random ++ * corruption in the lock code). 
++ */
+ static inline void yield_to_preempted(int cpu, u32 yield_count)
+ {
+-    plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
++    plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
+ }
+
+ static inline void prod_cpu(int cpu)
+ {
+-    plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu));
++    plpar_hcall_norets_notrace(H_PROD, get_hard_smp_processor_id(cpu));
+ }
+
+ static inline void yield_to_any(void)
+ {
+-    plpar_hcall_norets(H_CONFER, -1, 0);
++    plpar_hcall_norets_notrace(H_CONFER, -1, 0);
+ }
+ #else
+ static inline bool is_shared_processor(void)
+diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S
+index 2136e42833af..8a2b8d64265b 100644
+--- a/arch/powerpc/platforms/pseries/hvCall.S
++++ b/arch/powerpc/platforms/pseries/hvCall.S
+@@ -102,6 +102,16 @@ END_FTR_SECTION(0, 1);						\
+ #define HCALL_BRANCH(LABEL)
+ #endif
+
++_GLOBAL_TOC(plpar_hcall_norets_notrace)
++    HMT_MEDIUM
++
++    mfcr    r0
++    stw     r0,8(r1)
++    HVSC                /* invoke the hypervisor */
++    lwz     r0,8(r1)
++    mtcrf   0xff,r0
++    blr                 /* return r3 = status */
++
+ _GLOBAL_TOC(plpar_hcall_norets)
+     HMT_MEDIUM
+
+diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
+index cd38bd421f38..d4aa6a46e1fa 100644
+--- a/arch/powerpc/platforms/pseries/lpar.c
++++ b/arch/powerpc/platforms/pseries/lpar.c
+@@ -1830,8 +1830,7 @@ void hcall_tracepoint_unregfunc(void)
+
+ /*
+  * Since the tracing code might execute hcalls we need to guard against
+- * recursion. One example of this are spinlocks calling H_YIELD on
+- * shared processor partitions.
++ * recursion.
+  */
+ static DEFINE_PER_CPU(unsigned int, hcall_trace_depth);
+
+--
+2.30.2
+
diff --git a/queue-5.12/ptrace-make-ptrace-fail-if-the-tracee-changed-its-pi.patch b/queue-5.12/ptrace-make-ptrace-fail-if-the-tracee-changed-its-pi.patch
new file mode 100644
index 00000000000..6506949043a
--- /dev/null
+++ b/queue-5.12/ptrace-make-ptrace-fail-if-the-tracee-changed-its-pi.patch
@@ -0,0 +1,161 @@
+From 1534bb38a072d193f3b5d14e1ed344ca8f707bf5 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 12 May 2021 15:33:08 +0200
+Subject: ptrace: make ptrace() fail if the tracee changed its pid unexpectedly
+
+From: Oleg Nesterov
+
+[ Upstream commit dbb5afad100a828c97e012c6106566d99f041db6 ]
+
+Suppose we have 2 threads, the group-leader L and a sub-thread T,
+both parked in ptrace_stop(). The debugger tries to resume both threads
+and does
+
+    ptrace(PTRACE_CONT, T);
+    ptrace(PTRACE_CONT, L);
+
+If the sub-thread T execs in between, the 2nd PTRACE_CONT does not
+resume the old leader L, it resumes the post-exec thread T which is
+actually now stopped in PTRACE_EVENT_EXEC. In this case the
+PTRACE_EVENT_EXEC event is lost, and the tracer can't know that the
+tracee changed its pid.
+
+This patch makes ptrace() fail in this case until the debugger does
+wait() and consumes PTRACE_EVENT_EXEC, which reports old_pid. This
+affects all ptrace requests except the "asynchronous"
+PTRACE_INTERRUPT/KILL.
+
+The patch doesn't add a new PTRACE_ option so as not to complicate the
+API, and I _hope_ this won't cause any noticeable regression:
+
+ - If the debugger uses PTRACE_O_TRACEEXEC and the thread did an exec
+   and the tracer does a ptrace request without having consumed
+   the exec event, it's 100% sure that the thread the ptracer
+   thinks it is targeting does not exist anymore, or isn't the
+   same as the one it thinks it is targeting.
+
+ - To some degree this patch adds nothing new. In the scenario
+   above ptrace(L) can fail with -ESRCH if it is called after the
+   execing sub-thread wakes the leader up and before it "steals"
+   the leader's pid.
+
+Test-case:
+
+    #include <stdio.h>
+    #include <unistd.h>
+    #include <signal.h>
+    #include <pthread.h>
+    #include <sys/ptrace.h>
+    #include <sys/wait.h>
+    #include <assert.h>
+    #include <errno.h>
+
+    void *tf(void *arg)
+    {
+        execve("/usr/bin/true", NULL, NULL);
+        assert(0);
+
+        return NULL;
+    }
+
+    int main(void)
+    {
+        int leader = fork();
+        if (!leader) {
+            kill(getpid(), SIGSTOP);
+
+            pthread_t th;
+            pthread_create(&th, NULL, tf, NULL);
+            for (;;)
+                pause();
+
+            return 0;
+        }
+
+        waitpid(leader, NULL, WSTOPPED);
+
+        ptrace(PTRACE_SEIZE, leader, 0,
+                PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC);
+        waitpid(leader, NULL, 0);
+
+        ptrace(PTRACE_CONT, leader, 0,0);
+        waitpid(leader, NULL, 0);
+
+        int status, thread = waitpid(-1, &status, 0);
+        assert(thread > 0 && thread != leader);
+        assert(status == 0x80137f);
+
+        ptrace(PTRACE_CONT, thread, 0,0);
+        /*
+         * waitid() because waitpid(leader, &status, WNOWAIT) does not
+         * report status. Why ????
+         *
+         * Why WEXITED? because we have another kernel problem connected
+         * to mt-exec.
+         */
+        siginfo_t info;
+        assert(waitid(P_PID, leader, &info, WSTOPPED|WEXITED|WNOWAIT) == 0);
+        assert(info.si_pid == leader && info.si_status == 0x0405);
+
+        /* OK, it sleeps in ptrace(PTRACE_EVENT_EXEC == 0x04) */
+        assert(ptrace(PTRACE_CONT, leader, 0,0) == -1);
+        assert(errno == ESRCH);
+
+        assert(leader == waitpid(leader, &status, WNOHANG));
+        assert(status == 0x04057f);
+
+        assert(ptrace(PTRACE_CONT, leader, 0,0) == 0);
+
+        return 0;
+    }
+
+Signed-off-by: Oleg Nesterov
+Reported-by: Simon Marchi
+Acked-by: "Eric W. Biederman"
+Acked-by: Pedro Alves
+Acked-by: Simon Marchi
+Acked-by: Jan Kratochvil
+Signed-off-by: Linus Torvalds
+Signed-off-by: Sasha Levin
+---
+ kernel/ptrace.c | 18 +++++++++++++++++-
+ 1 file changed, 17 insertions(+), 1 deletion(-)
+
+diff --git a/kernel/ptrace.c b/kernel/ptrace.c
+index 61db50f7ca86..5f50fdd1d855 100644
+--- a/kernel/ptrace.c
++++ b/kernel/ptrace.c
+@@ -169,6 +169,21 @@ void __ptrace_unlink(struct task_struct *child)
+     spin_unlock(&child->sighand->siglock);
+ }
+
++static bool looks_like_a_spurious_pid(struct task_struct *task)
++{
++    if (task->exit_code != ((PTRACE_EVENT_EXEC << 8) | SIGTRAP))
++        return false;
++
++    if (task_pid_vnr(task) == task->ptrace_message)
++        return false;
++    /*
++     * The tracee changed its pid but the PTRACE_EVENT_EXEC event
++     * was not wait()'ed, most probably debugger targets the old
++     * leader which was destroyed in de_thread().
++ */ ++ return true; ++} ++ + /* Ensure that nothing can wake it up, even SIGKILL */ + static bool ptrace_freeze_traced(struct task_struct *task) + { +@@ -179,7 +194,8 @@ static bool ptrace_freeze_traced(struct task_struct *task) + return ret; + + spin_lock_irq(&task->sighand->siglock); +- if (task_is_traced(task) && !__fatal_signal_pending(task)) { ++ if (task_is_traced(task) && !looks_like_a_spurious_pid(task) && ++ !__fatal_signal_pending(task)) { + task->state = __TASK_TRACED; + ret = true; + } +-- +2.30.2 + diff --git a/queue-5.12/rdma-core-don-t-access-cm_id-after-its-destruction.patch b/queue-5.12/rdma-core-don-t-access-cm_id-after-its-destruction.patch new file mode 100644 index 00000000000..2574dc69a65 --- /dev/null +++ b/queue-5.12/rdma-core-don-t-access-cm_id-after-its-destruction.patch @@ -0,0 +1,106 @@ +From a43b69f49c62030c41037502be5f8261a2465028 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 11 May 2021 08:48:28 +0300 +Subject: RDMA/core: Don't access cm_id after its destruction + +From: Shay Drory + +[ Upstream commit 889d916b6f8a48b8c9489fffcad3b78eedd01a51 ] + +restrack should only be attached to a cm_id while the ID has a valid +device pointer. It is set up when the device is first loaded, but not +cleared when the device is removed. There is also two copies of the device +pointer, one private and one in the public API, and these were left out of +sync. + +Make everything go to NULL together and manipulate restrack right around +the device assignments. + +Found by syzcaller: +BUG: KASAN: wild-memory-access in __list_del include/linux/list.h:112 [inline] +BUG: KASAN: wild-memory-access in __list_del_entry include/linux/list.h:135 [inline] +BUG: KASAN: wild-memory-access in list_del include/linux/list.h:146 [inline] +BUG: KASAN: wild-memory-access in cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] +BUG: KASAN: wild-memory-access in cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] +BUG: KASAN: wild-memory-access in cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 +Write of size 8 at addr dead000000000108 by task syz-executor716/334 + +CPU: 0 PID: 334 Comm: syz-executor716 Not tainted 5.11.0+ #271 +Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS +rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 +Call Trace: + __dump_stack lib/dump_stack.c:79 [inline] + dump_stack+0xbe/0xf9 lib/dump_stack.c:120 + __kasan_report mm/kasan/report.c:400 [inline] + kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413 + __list_del include/linux/list.h:112 [inline] + __list_del_entry include/linux/list.h:135 [inline] + list_del include/linux/list.h:146 [inline] + cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] + cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] + cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 + _destroy_id+0x29/0x460 drivers/infiniband/core/cma.c:1862 + ucma_close_id+0x36/0x50 drivers/infiniband/core/ucma.c:185 + ucma_destroy_private_ctx+0x58d/0x5b0 drivers/infiniband/core/ucma.c:576 + ucma_close+0x91/0xd0 drivers/infiniband/core/ucma.c:1797 + __fput+0x169/0x540 fs/file_table.c:280 + task_work_run+0xb7/0x100 kernel/task_work.c:140 + exit_task_work include/linux/task_work.h:30 [inline] + do_exit+0x7da/0x17f0 kernel/exit.c:825 + do_group_exit+0x9e/0x190 kernel/exit.c:922 + __do_sys_exit_group kernel/exit.c:933 [inline] + __se_sys_exit_group kernel/exit.c:931 [inline] + __x64_sys_exit_group+0x2d/0x30 kernel/exit.c:931 + do_syscall_64+0x2d/0x40 
arch/x86/entry/common.c:46 + entry_SYSCALL_64_after_hwframe+0x44/0xa9 + +Fixes: 255d0c14b375 ("RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count") +Link: https://lore.kernel.org/r/3352ee288fe34f2b44220457a29bfc0548686363.1620711734.git.leonro@nvidia.com +Signed-off-by: Shay Drory +Signed-off-by: Leon Romanovsky +Signed-off-by: Jason Gunthorpe +Signed-off-by: Sasha Levin +--- + drivers/infiniband/core/cma.c | 5 +++-- + 1 file changed, 3 insertions(+), 2 deletions(-) + +diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c +index 6ac07911a17b..5b9022a8c9ec 100644 +--- a/drivers/infiniband/core/cma.c ++++ b/drivers/infiniband/core/cma.c +@@ -482,6 +482,7 @@ static void cma_release_dev(struct rdma_id_private *id_priv) + list_del(&id_priv->list); + cma_dev_put(id_priv->cma_dev); + id_priv->cma_dev = NULL; ++ id_priv->id.device = NULL; + if (id_priv->id.route.addr.dev_addr.sgid_attr) { + rdma_put_gid_attr(id_priv->id.route.addr.dev_addr.sgid_attr); + id_priv->id.route.addr.dev_addr.sgid_attr = NULL; +@@ -1864,6 +1865,7 @@ static void _destroy_id(struct rdma_id_private *id_priv, + iw_destroy_cm_id(id_priv->cm_id.iw); + } + cma_leave_mc_groups(id_priv); ++ rdma_restrack_del(&id_priv->res); + cma_release_dev(id_priv); + } + +@@ -1877,7 +1879,6 @@ static void _destroy_id(struct rdma_id_private *id_priv, + kfree(id_priv->id.route.path_rec); + + put_net(id_priv->id.route.addr.dev_addr.net); +- rdma_restrack_del(&id_priv->res); + kfree(id_priv); + } + +@@ -3740,7 +3741,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog) + } + + id_priv->backlog = backlog; +- if (id->device) { ++ if (id_priv->cma_dev) { + if (rdma_cap_ib_cm(id->device, 1)) { + ret = cma_ib_listen(id_priv); + if (ret) +-- +2.30.2 + diff --git a/queue-5.12/rdma-core-prevent-divide-by-zero-error-triggered-by-.patch b/queue-5.12/rdma-core-prevent-divide-by-zero-error-triggered-by-.patch new file mode 100644 index 00000000000..c136cb60698 --- /dev/null +++ b/queue-5.12/rdma-core-prevent-divide-by-zero-error-triggered-by-.patch @@ -0,0 +1,64 @@ +From c9b7058877a8d77756266cb609ec5e1a96e09976 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 10 May 2021 17:46:00 +0300 +Subject: RDMA/core: Prevent divide-by-zero error triggered by the user + +From: Leon Romanovsky + +[ Upstream commit 54d87913f147a983589923c7f651f97de9af5be1 ] + +The user_entry_size is supplied by the user and later used as a +denominator to calculate number of entries. 
The zero supplied by the user +will trigger the following divide-by-zero error: + + divide error: 0000 [#1] SMP KASAN PTI + CPU: 4 PID: 497 Comm: c_repro Not tainted 5.13.0-rc1+ #281 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 + RIP: 0010:ib_uverbs_handler_UVERBS_METHOD_QUERY_GID_TABLE+0x1b1/0x510 + Code: 87 59 03 00 00 e8 9f ab 1e ff 48 8d bd a8 00 00 00 e8 d3 70 41 ff 44 0f b7 b5 a8 00 00 00 e8 86 ab 1e ff 31 d2 4c 89 f0 31 ff <49> f7 f5 48 89 d6 48 89 54 24 10 48 89 04 24 e8 1b ad 1e ff 48 8b + RSP: 0018:ffff88810416f828 EFLAGS: 00010246 + RAX: 0000000000000008 RBX: 1ffff1102082df09 RCX: ffffffff82183f3d + RDX: 0000000000000000 RSI: ffff888105f2da00 RDI: 0000000000000000 + RBP: ffff88810416fa98 R08: 0000000000000001 R09: ffffed102082df5f + R10: ffff88810416faf7 R11: ffffed102082df5e R12: 0000000000000000 + R13: 0000000000000000 R14: 0000000000000008 R15: ffff88810416faf0 + FS: 00007f5715efa740(0000) GS:ffff88811a700000(0000) knlGS:0000000000000000 + CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + CR2: 0000000020000840 CR3: 000000010c2e0001 CR4: 0000000000370ea0 + DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 + DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 + Call Trace: + ? ib_uverbs_handler_UVERBS_METHOD_INFO_HANDLES+0x4b0/0x4b0 + ib_uverbs_cmd_verbs+0x1546/0x1940 + ib_uverbs_ioctl+0x186/0x240 + __x64_sys_ioctl+0x38a/0x1220 + do_syscall_64+0x3f/0x80 + entry_SYSCALL_64_after_hwframe+0x44/0xae + +Fixes: 9f85cbe50aa0 ("RDMA/uverbs: Expose the new GID query API to user space") +Link: https://lore.kernel.org/r/b971cc70a8b240a8b5eda33c99fa0558a0071be2.1620657876.git.leonro@nvidia.com +Reviewed-by: Jason Gunthorpe +Signed-off-by: Leon Romanovsky +Signed-off-by: Jason Gunthorpe +Signed-off-by: Sasha Levin +--- + drivers/infiniband/core/uverbs_std_types_device.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/drivers/infiniband/core/uverbs_std_types_device.c b/drivers/infiniband/core/uverbs_std_types_device.c +index 9ec6971056fa..a03021d94e11 100644 +--- a/drivers/infiniband/core/uverbs_std_types_device.c ++++ b/drivers/infiniband/core/uverbs_std_types_device.c +@@ -331,6 +331,9 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QUERY_GID_TABLE)( + if (ret) + return ret; + ++ if (!user_entry_size) ++ return -EINVAL; ++ + max_entries = uverbs_attr_ptr_get_array_size( + attrs, UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES, + user_entry_size); +-- +2.30.2 + diff --git a/queue-5.12/rdma-mlx5-fix-query-dct-via-devx.patch b/queue-5.12/rdma-mlx5-fix-query-dct-via-devx.patch new file mode 100644 index 00000000000..fed6955b792 --- /dev/null +++ b/queue-5.12/rdma-mlx5-fix-query-dct-via-devx.patch @@ -0,0 +1,54 @@ +From 87223501897f44670844c9317e8bf3676490fc05 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 19 May 2021 11:41:32 +0300 +Subject: RDMA/mlx5: Fix query DCT via DEVX + +From: Maor Gottlieb + +[ Upstream commit cfa3b797118eda7d68f9ede9b1a0279192aca653 ] + +When executing DEVX command to query QP object, we need to take the QP +type from the mlx5_ib_qp struct which hold the driver specific QP types as +well, such as DC. 
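+
+In outline: qp->ibqp.qp_type can only hold IB core QP types, so a DC QP
+created through the mlx5 vendor interface has to be recognised via the
+driver-side type field (sketch, matching the hunk below):
+
+    if (qp->type == MLX5_IB_QPT_DCT)    /* driver-specific type */
+        return get_enc_obj_id(MLX5_CMD_OP_CREATE_DCT,
+                              qp->dct.mdct.mqp.qpn) == obj_id;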
+
+Fixes: 34613eb1d2ad ("IB/mlx5: Enable modify and query verbs objects via DEVX")
+Link: https://lore.kernel.org/r/6eee15d63f09bb70787488e0cf96216e2957f5aa.1621413654.git.leonro@nvidia.com
+Reviewed-by: Yishai Hadas
+Signed-off-by: Maor Gottlieb
+Signed-off-by: Leon Romanovsky
+Signed-off-by: Jason Gunthorpe
+Signed-off-by: Sasha Levin
+---
+ drivers/infiniband/hw/mlx5/devx.c | 6 ++----
+ 1 file changed, 2 insertions(+), 4 deletions(-)
+
+diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
+index 07b8350929cd..81276b4247f8 100644
+--- a/drivers/infiniband/hw/mlx5/devx.c
++++ b/drivers/infiniband/hw/mlx5/devx.c
+@@ -630,9 +630,8 @@ static bool devx_is_valid_obj_id(struct uverbs_attr_bundle *attrs,
+     case UVERBS_OBJECT_QP:
+     {
+         struct mlx5_ib_qp *qp = to_mqp(uobj->object);
+-        enum ib_qp_type qp_type = qp->ibqp.qp_type;
+
+-        if (qp_type == IB_QPT_RAW_PACKET ||
++        if (qp->type == IB_QPT_RAW_PACKET ||
+             (qp->flags & IB_QP_CREATE_SOURCE_QPN)) {
+             struct mlx5_ib_raw_packet_qp *raw_packet_qp =
+                 &qp->raw_packet_qp;
+@@ -649,10 +648,9 @@ static bool devx_is_valid_obj_id(struct uverbs_attr_bundle *attrs,
+                     sq->tisn) == obj_id);
+         }
+
+-        if (qp_type == MLX5_IB_QPT_DCT)
++        if (qp->type == MLX5_IB_QPT_DCT)
+             return get_enc_obj_id(MLX5_CMD_OP_CREATE_DCT,
+                           qp->dct.mdct.mqp.qpn) == obj_id;
+-
+         return get_enc_obj_id(MLX5_CMD_OP_CREATE_QP,
+                       qp->ibqp.qp_num) == obj_id;
+     }
+--
+2.30.2
+
diff --git a/queue-5.12/rdma-mlx5-recover-from-fatal-event-in-dual-port-mode.patch b/queue-5.12/rdma-mlx5-recover-from-fatal-event-in-dual-port-mode.patch
new file mode 100644
index 00000000000..eb305649bbb
--- /dev/null
+++ b/queue-5.12/rdma-mlx5-recover-from-fatal-event-in-dual-port-mode.patch
@@ -0,0 +1,38 @@
+From f64171566e127b42bfe3ac0bad58c4d06d6e6278 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Tue, 11 May 2021 08:48:29 +0300
+Subject: RDMA/mlx5: Recover from fatal event in dual port mode
+
+From: Maor Gottlieb
+
+[ Upstream commit 97f30d324ce6645a4de4ffb71e4ae9b8ca36ff04 ]
+
+When there is a fatal event on the slave port, the device is marked as
+not active. We need to mark it as active again when the slave is
+recovered to regain full functionality.
+ +Fixes: d69a24e03659 ("IB/mlx5: Move IB event processing onto a workqueue") +Link: https://lore.kernel.org/r/8906754455bb23019ef223c725d2c0d38acfb80b.1620711734.git.leonro@nvidia.com +Signed-off-by: Maor Gottlieb +Signed-off-by: Leon Romanovsky +Signed-off-by: Jason Gunthorpe +Signed-off-by: Sasha Levin +--- + drivers/infiniband/hw/mlx5/main.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c +index 4be7bccefaa4..59ffbbdda317 100644 +--- a/drivers/infiniband/hw/mlx5/main.c ++++ b/drivers/infiniband/hw/mlx5/main.c +@@ -4655,6 +4655,7 @@ static int mlx5r_mp_probe(struct auxiliary_device *adev, + + if (bound) { + rdma_roce_rescan_device(&dev->ib_dev); ++ mpi->ibdev->ib_active = true; + break; + } + } +-- +2.30.2 + diff --git a/queue-5.12/rdma-rxe-clear-all-qp-fields-if-creation-failed.patch b/queue-5.12/rdma-rxe-clear-all-qp-fields-if-creation-failed.patch new file mode 100644 index 00000000000..de8cc5bbe25 --- /dev/null +++ b/queue-5.12/rdma-rxe-clear-all-qp-fields-if-creation-failed.patch @@ -0,0 +1,117 @@ +From e79986cca29a659613571e3a749a73a8a0a3f786 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 11 May 2021 10:26:03 +0300 +Subject: RDMA/rxe: Clear all QP fields if creation failed + +From: Leon Romanovsky + +[ Upstream commit 67f29896fdc83298eed5a6576ff8f9873f709228 ] + +rxe_qp_do_cleanup() relies on valid pointer values in QP for the properly +created ones, but in case rxe_qp_from_init() failed it was filled with +garbage and caused tot the following error. + + refcount_t: underflow; use-after-free. + WARNING: CPU: 1 PID: 12560 at lib/refcount.c:28 refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 + Modules linked in: + CPU: 1 PID: 12560 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0 + Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 + RIP: 0010:refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 + Code: e9 db fe ff ff 48 89 df e8 2c c2 ea fd e9 8a fe ff ff e8 72 6a a7 fd 48 c7 c7 e0 b2 c1 89 c6 05 dc 3a e6 09 01 e8 ee 74 fb 04 <0f> 0b e9 af fe ff ff 0f 1f 84 00 00 00 00 00 41 56 41 55 41 54 55 + RSP: 0018:ffffc900097ceba8 EFLAGS: 00010286 + RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 + RDX: 0000000000040000 RSI: ffffffff815bb075 RDI: fffff520012f9d67 + RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 + R10: ffffffff815b4eae R11: 0000000000000000 R12: ffff8880322a4800 + R13: ffff8880322a4940 R14: ffff888033044e00 R15: 0000000000000000 + FS: 00007f6eb2be3700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 + CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + CR2: 00007fdbe5d41000 CR3: 000000001d181000 CR4: 00000000001506e0 + DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 + DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 + Call Trace: + __refcount_sub_and_test include/linux/refcount.h:283 [inline] + __refcount_dec_and_test include/linux/refcount.h:315 [inline] + refcount_dec_and_test include/linux/refcount.h:333 [inline] + kref_put include/linux/kref.h:64 [inline] + rxe_qp_do_cleanup+0x96f/0xaf0 drivers/infiniband/sw/rxe/rxe_qp.c:805 + execute_in_process_context+0x37/0x150 kernel/workqueue.c:3327 + rxe_elem_release+0x9f/0x180 drivers/infiniband/sw/rxe/rxe_pool.c:391 + kref_put include/linux/kref.h:65 [inline] + rxe_create_qp+0x2cd/0x310 drivers/infiniband/sw/rxe/rxe_verbs.c:425 + _ib_create_qp drivers/infiniband/core/core_priv.h:331 [inline] + ib_create_named_qp+0x2ad/0x1370 
drivers/infiniband/core/verbs.c:1231 + ib_create_qp include/rdma/ib_verbs.h:3644 [inline] + create_mad_qp+0x177/0x2d0 drivers/infiniband/core/mad.c:2920 + ib_mad_port_open drivers/infiniband/core/mad.c:3001 [inline] + ib_mad_init_device+0xd6f/0x1400 drivers/infiniband/core/mad.c:3092 + add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:717 + enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1331 + ib_register_device drivers/infiniband/core/device.c:1413 [inline] + ib_register_device+0x7c7/0xa50 drivers/infiniband/core/device.c:1365 + rxe_register_device+0x3d5/0x4a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1147 + rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247 + rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:503 + rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline] + rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250 + nldev_newlink+0x30e/0x550 drivers/infiniband/core/nldev.c:1555 + rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195 + rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] + rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259 + netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] + netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 + netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 + sock_sendmsg_nosec net/socket.c:654 [inline] + sock_sendmsg+0xcf/0x120 net/socket.c:674 + ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 + ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 + __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 + do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 + entry_SYSCALL_64_after_hwframe+0x44/0xae + +Fixes: 8700e3e7c485 ("Soft RoCE driver") +Link: https://lore.kernel.org/r/7bf8d548764d406dbbbaf4b574960ebfd5af8387.1620717918.git.leonro@nvidia.com +Reported-by: syzbot+36a7f280de4e11c6f04e@syzkaller.appspotmail.com +Signed-off-by: Leon Romanovsky +Reviewed-by: Zhu Yanjun +Signed-off-by: Jason Gunthorpe +Signed-off-by: Sasha Levin +--- + drivers/infiniband/sw/rxe/rxe_qp.c | 7 +++++++ + 1 file changed, 7 insertions(+) + +diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c +index 34ae957a315c..b0f350d674fd 100644 +--- a/drivers/infiniband/sw/rxe/rxe_qp.c ++++ b/drivers/infiniband/sw/rxe/rxe_qp.c +@@ -242,6 +242,7 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp, + if (err) { + vfree(qp->sq.queue->buf); + kfree(qp->sq.queue); ++ qp->sq.queue = NULL; + return err; + } + +@@ -295,6 +296,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp, + if (err) { + vfree(qp->rq.queue->buf); + kfree(qp->rq.queue); ++ qp->rq.queue = NULL; + return err; + } + } +@@ -355,6 +357,11 @@ int rxe_qp_from_init(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_pd *pd, + err2: + rxe_queue_cleanup(qp->sq.queue); + err1: ++ qp->pd = NULL; ++ qp->rcq = NULL; ++ qp->scq = NULL; ++ qp->srq = NULL; ++ + if (srq) + rxe_drop_ref(srq); + rxe_drop_ref(scq); +-- +2.30.2 + diff --git a/queue-5.12/rdma-rxe-return-cqe-error-if-invalid-lkey-was-suppli.patch b/queue-5.12/rdma-rxe-return-cqe-error-if-invalid-lkey-was-suppli.patch new file mode 100644 index 00000000000..ea02c4b0b23 --- /dev/null +++ b/queue-5.12/rdma-rxe-return-cqe-error-if-invalid-lkey-was-suppli.patch @@ -0,0 +1,102 @@ +From 2d813a18c4d3e7f265a4104efcccaa1f4bb0543f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 11 May 2021 08:48:31 +0300 +Subject: RDMA/rxe: Return CQE error if invalid lkey was supplied + +From: Leon Romanovsky + +[ Upstream commit 
dc07628bd2bbc1da768e265192c28ebd301f509d ] + +RXE is missing update of WQE status in LOCAL_WRITE failures. This caused +the following kernel panic if someone sent an atomic operation with an +explicitly wrong lkey. + +[leonro@vm ~]$ mkt test +test_atomic_invalid_lkey (tests.test_atomic.AtomicTest) ... + WARNING: CPU: 5 PID: 263 at drivers/infiniband/sw/rxe/rxe_comp.c:740 rxe_completer+0x1a6d/0x2e30 [rdma_rxe] + Modules linked in: crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel rdma_ucm rdma_cm ib_umad ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core ptp pps_core + CPU: 5 PID: 263 Comm: python3 Not tainted 5.13.0-rc1+ #2936 + Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 + RIP: 0010:rxe_completer+0x1a6d/0x2e30 [rdma_rxe] + Code: 03 0f 8e 65 0e 00 00 3b 93 10 06 00 00 0f 84 82 0a 00 00 4c 89 ff 4c 89 44 24 38 e8 2d 74 a9 e1 4c 8b 44 24 38 e9 1c f5 ff ff <0f> 0b e9 0c e8 ff ff b8 05 00 00 00 41 bf 05 00 00 00 e9 ab e7 ff + RSP: 0018:ffff8880158af090 EFLAGS: 00010246 + RAX: 0000000000000000 RBX: ffff888016a78000 RCX: ffffffffa0cf1652 + RDX: 1ffff9200004b442 RSI: 0000000000000004 RDI: ffffc9000025a210 + RBP: dffffc0000000000 R08: 00000000ffffffea R09: ffff88801617740b + R10: ffffed1002c2ee81 R11: 0000000000000007 R12: ffff88800f3b63e8 + R13: ffff888016a78008 R14: ffffc9000025a180 R15: 000000000000000c + FS: 00007f88b622a740(0000) GS:ffff88806d540000(0000) knlGS:0000000000000000 + CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + CR2: 00007f88b5a1fa10 CR3: 000000000d848004 CR4: 0000000000370ea0 + DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 + DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 + Call Trace: + rxe_do_task+0x130/0x230 [rdma_rxe] + rxe_rcv+0xb11/0x1df0 [rdma_rxe] + rxe_loopback+0x157/0x1e0 [rdma_rxe] + rxe_responder+0x5532/0x7620 [rdma_rxe] + rxe_do_task+0x130/0x230 [rdma_rxe] + rxe_rcv+0x9c8/0x1df0 [rdma_rxe] + rxe_loopback+0x157/0x1e0 [rdma_rxe] + rxe_requester+0x1efd/0x58c0 [rdma_rxe] + rxe_do_task+0x130/0x230 [rdma_rxe] + rxe_post_send+0x998/0x1860 [rdma_rxe] + ib_uverbs_post_send+0xd5f/0x1220 [ib_uverbs] + ib_uverbs_write+0x847/0xc80 [ib_uverbs] + vfs_write+0x1c5/0x840 + ksys_write+0x176/0x1d0 + do_syscall_64+0x3f/0x80 + entry_SYSCALL_64_after_hwframe+0x44/0xae + +Fixes: 8700e3e7c485 ("Soft RoCE driver") +Link: https://lore.kernel.org/r/11e7b553f3a6f5371c6bb3f57c494bb52b88af99.1620711734.git.leonro@nvidia.com +Signed-off-by: Leon Romanovsky +Acked-by: Zhu Yanjun +Signed-off-by: Jason Gunthorpe +Signed-off-by: Sasha Levin +--- + drivers/infiniband/sw/rxe/rxe_comp.c | 16 ++++++++++------ + 1 file changed, 10 insertions(+), 6 deletions(-) + +diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c +index a612b335baa0..06b556169867 100644 +--- a/drivers/infiniband/sw/rxe/rxe_comp.c ++++ b/drivers/infiniband/sw/rxe/rxe_comp.c +@@ -346,13 +346,15 @@ static inline enum comp_state do_read(struct rxe_qp *qp, + ret = copy_data(qp->pd, IB_ACCESS_LOCAL_WRITE, + &wqe->dma, payload_addr(pkt), + payload_size(pkt), to_mr_obj, NULL); +- if (ret) ++ if (ret) { ++ wqe->status = IB_WC_LOC_PROT_ERR; + return COMPST_ERROR; ++ } + + if (wqe->dma.resid == 0 && (pkt->mask & RXE_END_MASK)) + return COMPST_COMP_ACK; +- else +- return COMPST_UPDATE_COMP; ++ ++ return COMPST_UPDATE_COMP; + } + + static inline enum comp_state do_atomic(struct rxe_qp *qp, +@@ -366,10 +368,12 @@ static inline enum comp_state do_atomic(struct rxe_qp *qp, + ret = copy_data(qp->pd, 
IB_ACCESS_LOCAL_WRITE, + &wqe->dma, &atomic_orig, + sizeof(u64), to_mr_obj, NULL); +- if (ret) ++ if (ret) { ++ wqe->status = IB_WC_LOC_PROT_ERR; + return COMPST_ERROR; +- else +- return COMPST_COMP_ACK; ++ } ++ ++ return COMPST_COMP_ACK; + } + + static void make_send_cqe(struct rxe_qp *qp, struct rxe_send_wqe *wqe, +-- +2.30.2 + diff --git a/queue-5.12/rdma-rxe-split-mem-into-mr-and-mw.patch b/queue-5.12/rdma-rxe-split-mem-into-mr-and-mw.patch new file mode 100644 index 00000000000..c711edc943a --- /dev/null +++ b/queue-5.12/rdma-rxe-split-mem-into-mr-and-mw.patch @@ -0,0 +1,1065 @@ +From 41b5169e747d4e6de30983bbcdc46c3747023d65 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 25 Mar 2021 16:24:26 -0500 +Subject: RDMA/rxe: Split MEM into MR and MW + +From: Bob Pearson + +[ Upstream commit 364e282c4fe7e24a5f32cd6e93e1056c6a6e3d31 ] + +In the original rxe implementation it was intended to use a common object +to represent MRs and MWs but they are different enough to separate these +into two objects. + +This allows replacing the mem name with mr for MRs which is more +consistent with the style for the other objects and less likely to be +confusing. This is a long patch that mostly changes mem to mr where it +makes sense and adds a new rxe_mw struct. + +Link: https://lore.kernel.org/r/20210325212425.2792-1-rpearson@hpe.com +Signed-off-by: Bob Pearson +Acked-by: Zhu Yanjun +Signed-off-by: Jason Gunthorpe +Signed-off-by: Sasha Levin +--- + drivers/infiniband/sw/rxe/rxe_comp.c | 4 +- + drivers/infiniband/sw/rxe/rxe_loc.h | 29 ++- + drivers/infiniband/sw/rxe/rxe_mr.c | 271 ++++++++++++-------------- + drivers/infiniband/sw/rxe/rxe_pool.c | 14 +- + drivers/infiniband/sw/rxe/rxe_req.c | 10 +- + drivers/infiniband/sw/rxe/rxe_resp.c | 34 ++-- + drivers/infiniband/sw/rxe/rxe_verbs.c | 22 +-- + drivers/infiniband/sw/rxe/rxe_verbs.h | 60 +++--- + 8 files changed, 218 insertions(+), 226 deletions(-) + +diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c +index 17a361b8dbb1..a612b335baa0 100644 +--- a/drivers/infiniband/sw/rxe/rxe_comp.c ++++ b/drivers/infiniband/sw/rxe/rxe_comp.c +@@ -345,7 +345,7 @@ static inline enum comp_state do_read(struct rxe_qp *qp, + + ret = copy_data(qp->pd, IB_ACCESS_LOCAL_WRITE, + &wqe->dma, payload_addr(pkt), +- payload_size(pkt), to_mem_obj, NULL); ++ payload_size(pkt), to_mr_obj, NULL); + if (ret) + return COMPST_ERROR; + +@@ -365,7 +365,7 @@ static inline enum comp_state do_atomic(struct rxe_qp *qp, + + ret = copy_data(qp->pd, IB_ACCESS_LOCAL_WRITE, + &wqe->dma, &atomic_orig, +- sizeof(u64), to_mem_obj, NULL); ++ sizeof(u64), to_mr_obj, NULL); + if (ret) + return COMPST_ERROR; + else +diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h +index 0d758760b9ae..08e21fa9ec97 100644 +--- a/drivers/infiniband/sw/rxe/rxe_loc.h ++++ b/drivers/infiniband/sw/rxe/rxe_loc.h +@@ -72,40 +72,37 @@ int rxe_mmap(struct ib_ucontext *context, struct vm_area_struct *vma); + + /* rxe_mr.c */ + enum copy_direction { +- to_mem_obj, +- from_mem_obj, ++ to_mr_obj, ++ from_mr_obj, + }; + +-void rxe_mem_init_dma(struct rxe_pd *pd, +- int access, struct rxe_mem *mem); ++void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr); + +-int rxe_mem_init_user(struct rxe_pd *pd, u64 start, +- u64 length, u64 iova, int access, struct ib_udata *udata, +- struct rxe_mem *mr); ++int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova, ++ int access, struct ib_udata *udata, struct rxe_mr *mr); + +-int 
rxe_mem_init_fast(struct rxe_pd *pd, +- int max_pages, struct rxe_mem *mem); ++int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr); + +-int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, +- int length, enum copy_direction dir, u32 *crcp); ++int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length, ++ enum copy_direction dir, u32 *crcp); + + int copy_data(struct rxe_pd *pd, int access, + struct rxe_dma_info *dma, void *addr, int length, + enum copy_direction dir, u32 *crcp); + +-void *iova_to_vaddr(struct rxe_mem *mem, u64 iova, int length); ++void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length); + + enum lookup_type { + lookup_local, + lookup_remote, + }; + +-struct rxe_mem *lookup_mem(struct rxe_pd *pd, int access, u32 key, +- enum lookup_type type); ++struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, ++ enum lookup_type type); + +-int mem_check_range(struct rxe_mem *mem, u64 iova, size_t length); ++int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length); + +-void rxe_mem_cleanup(struct rxe_pool_entry *arg); ++void rxe_mr_cleanup(struct rxe_pool_entry *arg); + + int advance_dma_data(struct rxe_dma_info *dma, unsigned int length); + +diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c +index 6e8c41567ba0..9f63947bab12 100644 +--- a/drivers/infiniband/sw/rxe/rxe_mr.c ++++ b/drivers/infiniband/sw/rxe/rxe_mr.c +@@ -24,16 +24,15 @@ static u8 rxe_get_key(void) + return key; + } + +-int mem_check_range(struct rxe_mem *mem, u64 iova, size_t length) ++int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length) + { +- switch (mem->type) { +- case RXE_MEM_TYPE_DMA: ++ switch (mr->type) { ++ case RXE_MR_TYPE_DMA: + return 0; + +- case RXE_MEM_TYPE_MR: +- if (iova < mem->iova || +- length > mem->length || +- iova > mem->iova + mem->length - length) ++ case RXE_MR_TYPE_MR: ++ if (iova < mr->iova || length > mr->length || ++ iova > mr->iova + mr->length - length) + return -EFAULT; + return 0; + +@@ -46,85 +45,83 @@ int mem_check_range(struct rxe_mem *mem, u64 iova, size_t length) + | IB_ACCESS_REMOTE_WRITE \ + | IB_ACCESS_REMOTE_ATOMIC) + +-static void rxe_mem_init(int access, struct rxe_mem *mem) ++static void rxe_mr_init(int access, struct rxe_mr *mr) + { +- u32 lkey = mem->pelem.index << 8 | rxe_get_key(); ++ u32 lkey = mr->pelem.index << 8 | rxe_get_key(); + u32 rkey = (access & IB_ACCESS_REMOTE) ? 
lkey : 0; + +- mem->ibmr.lkey = lkey; +- mem->ibmr.rkey = rkey; +- mem->state = RXE_MEM_STATE_INVALID; +- mem->type = RXE_MEM_TYPE_NONE; +- mem->map_shift = ilog2(RXE_BUF_PER_MAP); ++ mr->ibmr.lkey = lkey; ++ mr->ibmr.rkey = rkey; ++ mr->state = RXE_MR_STATE_INVALID; ++ mr->type = RXE_MR_TYPE_NONE; ++ mr->map_shift = ilog2(RXE_BUF_PER_MAP); + } + +-void rxe_mem_cleanup(struct rxe_pool_entry *arg) ++void rxe_mr_cleanup(struct rxe_pool_entry *arg) + { +- struct rxe_mem *mem = container_of(arg, typeof(*mem), pelem); ++ struct rxe_mr *mr = container_of(arg, typeof(*mr), pelem); + int i; + +- ib_umem_release(mem->umem); ++ ib_umem_release(mr->umem); + +- if (mem->map) { +- for (i = 0; i < mem->num_map; i++) +- kfree(mem->map[i]); ++ if (mr->map) { ++ for (i = 0; i < mr->num_map; i++) ++ kfree(mr->map[i]); + +- kfree(mem->map); ++ kfree(mr->map); + } + } + +-static int rxe_mem_alloc(struct rxe_mem *mem, int num_buf) ++static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf) + { + int i; + int num_map; +- struct rxe_map **map = mem->map; ++ struct rxe_map **map = mr->map; + + num_map = (num_buf + RXE_BUF_PER_MAP - 1) / RXE_BUF_PER_MAP; + +- mem->map = kmalloc_array(num_map, sizeof(*map), GFP_KERNEL); +- if (!mem->map) ++ mr->map = kmalloc_array(num_map, sizeof(*map), GFP_KERNEL); ++ if (!mr->map) + goto err1; + + for (i = 0; i < num_map; i++) { +- mem->map[i] = kmalloc(sizeof(**map), GFP_KERNEL); +- if (!mem->map[i]) ++ mr->map[i] = kmalloc(sizeof(**map), GFP_KERNEL); ++ if (!mr->map[i]) + goto err2; + } + + BUILD_BUG_ON(!is_power_of_2(RXE_BUF_PER_MAP)); + +- mem->map_shift = ilog2(RXE_BUF_PER_MAP); +- mem->map_mask = RXE_BUF_PER_MAP - 1; ++ mr->map_shift = ilog2(RXE_BUF_PER_MAP); ++ mr->map_mask = RXE_BUF_PER_MAP - 1; + +- mem->num_buf = num_buf; +- mem->num_map = num_map; +- mem->max_buf = num_map * RXE_BUF_PER_MAP; ++ mr->num_buf = num_buf; ++ mr->num_map = num_map; ++ mr->max_buf = num_map * RXE_BUF_PER_MAP; + + return 0; + + err2: + for (i--; i >= 0; i--) +- kfree(mem->map[i]); ++ kfree(mr->map[i]); + +- kfree(mem->map); ++ kfree(mr->map); + err1: + return -ENOMEM; + } + +-void rxe_mem_init_dma(struct rxe_pd *pd, +- int access, struct rxe_mem *mem) ++void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr) + { +- rxe_mem_init(access, mem); ++ rxe_mr_init(access, mr); + +- mem->ibmr.pd = &pd->ibpd; +- mem->access = access; +- mem->state = RXE_MEM_STATE_VALID; +- mem->type = RXE_MEM_TYPE_DMA; ++ mr->ibmr.pd = &pd->ibpd; ++ mr->access = access; ++ mr->state = RXE_MR_STATE_VALID; ++ mr->type = RXE_MR_TYPE_DMA; + } + +-int rxe_mem_init_user(struct rxe_pd *pd, u64 start, +- u64 length, u64 iova, int access, struct ib_udata *udata, +- struct rxe_mem *mem) ++int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova, ++ int access, struct ib_udata *udata, struct rxe_mr *mr) + { + struct rxe_map **map; + struct rxe_phys_buf *buf = NULL; +@@ -142,23 +139,23 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start, + goto err1; + } + +- mem->umem = umem; ++ mr->umem = umem; + num_buf = ib_umem_num_pages(umem); + +- rxe_mem_init(access, mem); ++ rxe_mr_init(access, mr); + +- err = rxe_mem_alloc(mem, num_buf); ++ err = rxe_mr_alloc(mr, num_buf); + if (err) { +- pr_warn("err %d from rxe_mem_alloc\n", err); ++ pr_warn("err %d from rxe_mr_alloc\n", err); + ib_umem_release(umem); + goto err1; + } + +- mem->page_shift = PAGE_SHIFT; +- mem->page_mask = PAGE_SIZE - 1; ++ mr->page_shift = PAGE_SHIFT; ++ mr->page_mask = PAGE_SIZE - 1; + + num_buf = 0; +- map = mem->map; ++ map = mr->map; + 
if (length > 0) { + buf = map[0]->buf; + +@@ -185,15 +182,15 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start, + } + } + +- mem->ibmr.pd = &pd->ibpd; +- mem->umem = umem; +- mem->access = access; +- mem->length = length; +- mem->iova = iova; +- mem->va = start; +- mem->offset = ib_umem_offset(umem); +- mem->state = RXE_MEM_STATE_VALID; +- mem->type = RXE_MEM_TYPE_MR; ++ mr->ibmr.pd = &pd->ibpd; ++ mr->umem = umem; ++ mr->access = access; ++ mr->length = length; ++ mr->iova = iova; ++ mr->va = start; ++ mr->offset = ib_umem_offset(umem); ++ mr->state = RXE_MR_STATE_VALID; ++ mr->type = RXE_MR_TYPE_MR; + + return 0; + +@@ -201,24 +198,23 @@ err1: + return err; + } + +-int rxe_mem_init_fast(struct rxe_pd *pd, +- int max_pages, struct rxe_mem *mem) ++int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr) + { + int err; + +- rxe_mem_init(0, mem); ++ rxe_mr_init(0, mr); + + /* In fastreg, we also set the rkey */ +- mem->ibmr.rkey = mem->ibmr.lkey; ++ mr->ibmr.rkey = mr->ibmr.lkey; + +- err = rxe_mem_alloc(mem, max_pages); ++ err = rxe_mr_alloc(mr, max_pages); + if (err) + goto err1; + +- mem->ibmr.pd = &pd->ibpd; +- mem->max_buf = max_pages; +- mem->state = RXE_MEM_STATE_FREE; +- mem->type = RXE_MEM_TYPE_MR; ++ mr->ibmr.pd = &pd->ibpd; ++ mr->max_buf = max_pages; ++ mr->state = RXE_MR_STATE_FREE; ++ mr->type = RXE_MR_TYPE_MR; + + return 0; + +@@ -226,28 +222,24 @@ err1: + return err; + } + +-static void lookup_iova( +- struct rxe_mem *mem, +- u64 iova, +- int *m_out, +- int *n_out, +- size_t *offset_out) ++static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out, ++ size_t *offset_out) + { +- size_t offset = iova - mem->iova + mem->offset; ++ size_t offset = iova - mr->iova + mr->offset; + int map_index; + int buf_index; + u64 length; + +- if (likely(mem->page_shift)) { +- *offset_out = offset & mem->page_mask; +- offset >>= mem->page_shift; +- *n_out = offset & mem->map_mask; +- *m_out = offset >> mem->map_shift; ++ if (likely(mr->page_shift)) { ++ *offset_out = offset & mr->page_mask; ++ offset >>= mr->page_shift; ++ *n_out = offset & mr->map_mask; ++ *m_out = offset >> mr->map_shift; + } else { + map_index = 0; + buf_index = 0; + +- length = mem->map[map_index]->buf[buf_index].size; ++ length = mr->map[map_index]->buf[buf_index].size; + + while (offset >= length) { + offset -= length; +@@ -257,7 +249,7 @@ static void lookup_iova( + map_index++; + buf_index = 0; + } +- length = mem->map[map_index]->buf[buf_index].size; ++ length = mr->map[map_index]->buf[buf_index].size; + } + + *m_out = map_index; +@@ -266,49 +258,49 @@ static void lookup_iova( + } + } + +-void *iova_to_vaddr(struct rxe_mem *mem, u64 iova, int length) ++void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) + { + size_t offset; + int m, n; + void *addr; + +- if (mem->state != RXE_MEM_STATE_VALID) { +- pr_warn("mem not in valid state\n"); ++ if (mr->state != RXE_MR_STATE_VALID) { ++ pr_warn("mr not in valid state\n"); + addr = NULL; + goto out; + } + +- if (!mem->map) { ++ if (!mr->map) { + addr = (void *)(uintptr_t)iova; + goto out; + } + +- if (mem_check_range(mem, iova, length)) { ++ if (mr_check_range(mr, iova, length)) { + pr_warn("range violation\n"); + addr = NULL; + goto out; + } + +- lookup_iova(mem, iova, &m, &n, &offset); ++ lookup_iova(mr, iova, &m, &n, &offset); + +- if (offset + length > mem->map[m]->buf[n].size) { ++ if (offset + length > mr->map[m]->buf[n].size) { + pr_warn("crosses page boundary\n"); + addr = NULL; + goto out; + } + +- addr = (void 
*)(uintptr_t)mem->map[m]->buf[n].addr + offset; ++ addr = (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; + + out: + return addr; + } + + /* copy data from a range (vaddr, vaddr+length-1) to or from +- * a mem object starting at iova. Compute incremental value of +- * crc32 if crcp is not zero. caller must hold a reference to mem ++ * a mr object starting at iova. Compute incremental value of ++ * crc32 if crcp is not zero. caller must hold a reference to mr + */ +-int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length, +- enum copy_direction dir, u32 *crcp) ++int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length, ++ enum copy_direction dir, u32 *crcp) + { + int err; + int bytes; +@@ -323,43 +315,41 @@ int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length, + if (length == 0) + return 0; + +- if (mem->type == RXE_MEM_TYPE_DMA) { ++ if (mr->type == RXE_MR_TYPE_DMA) { + u8 *src, *dest; + +- src = (dir == to_mem_obj) ? +- addr : ((void *)(uintptr_t)iova); ++ src = (dir == to_mr_obj) ? addr : ((void *)(uintptr_t)iova); + +- dest = (dir == to_mem_obj) ? +- ((void *)(uintptr_t)iova) : addr; ++ dest = (dir == to_mr_obj) ? ((void *)(uintptr_t)iova) : addr; + + memcpy(dest, src, length); + + if (crcp) +- *crcp = rxe_crc32(to_rdev(mem->ibmr.device), +- *crcp, dest, length); ++ *crcp = rxe_crc32(to_rdev(mr->ibmr.device), *crcp, dest, ++ length); + + return 0; + } + +- WARN_ON_ONCE(!mem->map); ++ WARN_ON_ONCE(!mr->map); + +- err = mem_check_range(mem, iova, length); ++ err = mr_check_range(mr, iova, length); + if (err) { + err = -EFAULT; + goto err1; + } + +- lookup_iova(mem, iova, &m, &i, &offset); ++ lookup_iova(mr, iova, &m, &i, &offset); + +- map = mem->map + m; ++ map = mr->map + m; + buf = map[0]->buf + i; + + while (length > 0) { + u8 *src, *dest; + + va = (u8 *)(uintptr_t)buf->addr + offset; +- src = (dir == to_mem_obj) ? addr : va; +- dest = (dir == to_mem_obj) ? va : addr; ++ src = (dir == to_mr_obj) ? addr : va; ++ dest = (dir == to_mr_obj) ? 
va : addr; + + bytes = buf->size - offset; + +@@ -369,8 +359,8 @@ int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length, + memcpy(dest, src, bytes); + + if (crcp) +- crc = rxe_crc32(to_rdev(mem->ibmr.device), +- crc, dest, bytes); ++ crc = rxe_crc32(to_rdev(mr->ibmr.device), crc, dest, ++ bytes); + + length -= bytes; + addr += bytes; +@@ -411,7 +401,7 @@ int copy_data( + struct rxe_sge *sge = &dma->sge[dma->cur_sge]; + int offset = dma->sge_offset; + int resid = dma->resid; +- struct rxe_mem *mem = NULL; ++ struct rxe_mr *mr = NULL; + u64 iova; + int err; + +@@ -424,8 +414,8 @@ int copy_data( + } + + if (sge->length && (offset < sge->length)) { +- mem = lookup_mem(pd, access, sge->lkey, lookup_local); +- if (!mem) { ++ mr = lookup_mr(pd, access, sge->lkey, lookup_local); ++ if (!mr) { + err = -EINVAL; + goto err1; + } +@@ -435,9 +425,9 @@ int copy_data( + bytes = length; + + if (offset >= sge->length) { +- if (mem) { +- rxe_drop_ref(mem); +- mem = NULL; ++ if (mr) { ++ rxe_drop_ref(mr); ++ mr = NULL; + } + sge++; + dma->cur_sge++; +@@ -449,9 +439,9 @@ int copy_data( + } + + if (sge->length) { +- mem = lookup_mem(pd, access, sge->lkey, +- lookup_local); +- if (!mem) { ++ mr = lookup_mr(pd, access, sge->lkey, ++ lookup_local); ++ if (!mr) { + err = -EINVAL; + goto err1; + } +@@ -466,7 +456,7 @@ int copy_data( + if (bytes > 0) { + iova = sge->addr + offset; + +- err = rxe_mem_copy(mem, iova, addr, bytes, dir, crcp); ++ err = rxe_mr_copy(mr, iova, addr, bytes, dir, crcp); + if (err) + goto err2; + +@@ -480,14 +470,14 @@ int copy_data( + dma->sge_offset = offset; + dma->resid = resid; + +- if (mem) +- rxe_drop_ref(mem); ++ if (mr) ++ rxe_drop_ref(mr); + + return 0; + + err2: +- if (mem) +- rxe_drop_ref(mem); ++ if (mr) ++ rxe_drop_ref(mr); + err1: + return err; + } +@@ -525,31 +515,30 @@ int advance_dma_data(struct rxe_dma_info *dma, unsigned int length) + return 0; + } + +-/* (1) find the mem (mr or mw) corresponding to lkey/rkey ++/* (1) find the mr corresponding to lkey/rkey + * depending on lookup_type +- * (2) verify that the (qp) pd matches the mem pd +- * (3) verify that the mem can support the requested access +- * (4) verify that mem state is valid ++ * (2) verify that the (qp) pd matches the mr pd ++ * (3) verify that the mr can support the requested access ++ * (4) verify that mr state is valid + */ +-struct rxe_mem *lookup_mem(struct rxe_pd *pd, int access, u32 key, +- enum lookup_type type) ++struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, ++ enum lookup_type type) + { +- struct rxe_mem *mem; ++ struct rxe_mr *mr; + struct rxe_dev *rxe = to_rdev(pd->ibpd.device); + int index = key >> 8; + +- mem = rxe_pool_get_index(&rxe->mr_pool, index); +- if (!mem) ++ mr = rxe_pool_get_index(&rxe->mr_pool, index); ++ if (!mr) + return NULL; + +- if (unlikely((type == lookup_local && mr_lkey(mem) != key) || +- (type == lookup_remote && mr_rkey(mem) != key) || +- mr_pd(mem) != pd || +- (access && !(access & mem->access)) || +- mem->state != RXE_MEM_STATE_VALID)) { +- rxe_drop_ref(mem); +- mem = NULL; ++ if (unlikely((type == lookup_local && mr_lkey(mr) != key) || ++ (type == lookup_remote && mr_rkey(mr) != key) || ++ mr_pd(mr) != pd || (access && !(access & mr->access)) || ++ mr->state != RXE_MR_STATE_VALID)) { ++ rxe_drop_ref(mr); ++ mr = NULL; + } + +- return mem; ++ return mr; + } +diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c +index 307d8986e7c9..d24901f2af3f 100644 +--- a/drivers/infiniband/sw/rxe/rxe_pool.c ++++ 
b/drivers/infiniband/sw/rxe/rxe_pool.c +@@ -8,8 +8,6 @@ + #include "rxe_loc.h" + + /* info about object pools +- * note that mr and mw share a single index space +- * so that one can map an lkey to the correct type of object + */ + struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = { + [RXE_TYPE_UC] = { +@@ -56,18 +54,18 @@ struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = { + }, + [RXE_TYPE_MR] = { + .name = "rxe-mr", +- .size = sizeof(struct rxe_mem), +- .elem_offset = offsetof(struct rxe_mem, pelem), +- .cleanup = rxe_mem_cleanup, ++ .size = sizeof(struct rxe_mr), ++ .elem_offset = offsetof(struct rxe_mr, pelem), ++ .cleanup = rxe_mr_cleanup, + .flags = RXE_POOL_INDEX, + .max_index = RXE_MAX_MR_INDEX, + .min_index = RXE_MIN_MR_INDEX, + }, + [RXE_TYPE_MW] = { + .name = "rxe-mw", +- .size = sizeof(struct rxe_mem), +- .elem_offset = offsetof(struct rxe_mem, pelem), +- .flags = RXE_POOL_INDEX, ++ .size = sizeof(struct rxe_mw), ++ .elem_offset = offsetof(struct rxe_mw, pelem), ++ .flags = RXE_POOL_INDEX | RXE_POOL_NO_ALLOC, + .max_index = RXE_MAX_MW_INDEX, + .min_index = RXE_MIN_MW_INDEX, + }, +diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c +index 889290793d75..3664cdae7e1f 100644 +--- a/drivers/infiniband/sw/rxe/rxe_req.c ++++ b/drivers/infiniband/sw/rxe/rxe_req.c +@@ -464,7 +464,7 @@ static int fill_packet(struct rxe_qp *qp, struct rxe_send_wqe *wqe, + } else { + err = copy_data(qp->pd, 0, &wqe->dma, + payload_addr(pkt), paylen, +- from_mem_obj, ++ from_mr_obj, + &crc); + if (err) + return err; +@@ -596,7 +596,7 @@ next_wqe: + if (wqe->mask & WR_REG_MASK) { + if (wqe->wr.opcode == IB_WR_LOCAL_INV) { + struct rxe_dev *rxe = to_rdev(qp->ibqp.device); +- struct rxe_mem *rmr; ++ struct rxe_mr *rmr; + + rmr = rxe_pool_get_index(&rxe->mr_pool, + wqe->wr.ex.invalidate_rkey >> 8); +@@ -607,14 +607,14 @@ next_wqe: + wqe->status = IB_WC_MW_BIND_ERR; + goto exit; + } +- rmr->state = RXE_MEM_STATE_FREE; ++ rmr->state = RXE_MR_STATE_FREE; + rxe_drop_ref(rmr); + wqe->state = wqe_state_done; + wqe->status = IB_WC_SUCCESS; + } else if (wqe->wr.opcode == IB_WR_REG_MR) { +- struct rxe_mem *rmr = to_rmr(wqe->wr.wr.reg.mr); ++ struct rxe_mr *rmr = to_rmr(wqe->wr.wr.reg.mr); + +- rmr->state = RXE_MEM_STATE_VALID; ++ rmr->state = RXE_MR_STATE_VALID; + rmr->access = wqe->wr.wr.reg.access; + rmr->ibmr.lkey = wqe->wr.wr.reg.key; + rmr->ibmr.rkey = wqe->wr.wr.reg.key; +diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c +index 142f3d8014d8..8e237b623b31 100644 +--- a/drivers/infiniband/sw/rxe/rxe_resp.c ++++ b/drivers/infiniband/sw/rxe/rxe_resp.c +@@ -391,7 +391,7 @@ static enum resp_states check_length(struct rxe_qp *qp, + static enum resp_states check_rkey(struct rxe_qp *qp, + struct rxe_pkt_info *pkt) + { +- struct rxe_mem *mem = NULL; ++ struct rxe_mr *mr = NULL; + u64 va; + u32 rkey; + u32 resid; +@@ -430,18 +430,18 @@ static enum resp_states check_rkey(struct rxe_qp *qp, + resid = qp->resp.resid; + pktlen = payload_size(pkt); + +- mem = lookup_mem(qp->pd, access, rkey, lookup_remote); +- if (!mem) { ++ mr = lookup_mr(qp->pd, access, rkey, lookup_remote); ++ if (!mr) { + state = RESPST_ERR_RKEY_VIOLATION; + goto err; + } + +- if (unlikely(mem->state == RXE_MEM_STATE_FREE)) { ++ if (unlikely(mr->state == RXE_MR_STATE_FREE)) { + state = RESPST_ERR_RKEY_VIOLATION; + goto err; + } + +- if (mem_check_range(mem, va, resid)) { ++ if (mr_check_range(mr, va, resid)) { + state = RESPST_ERR_RKEY_VIOLATION; + goto err; + } +@@ -469,12 +469,12 
@@ static enum resp_states check_rkey(struct rxe_qp *qp, + + WARN_ON_ONCE(qp->resp.mr); + +- qp->resp.mr = mem; ++ qp->resp.mr = mr; + return RESPST_EXECUTE; + + err: +- if (mem) +- rxe_drop_ref(mem); ++ if (mr) ++ rxe_drop_ref(mr); + return state; + } + +@@ -484,7 +484,7 @@ static enum resp_states send_data_in(struct rxe_qp *qp, void *data_addr, + int err; + + err = copy_data(qp->pd, IB_ACCESS_LOCAL_WRITE, &qp->resp.wqe->dma, +- data_addr, data_len, to_mem_obj, NULL); ++ data_addr, data_len, to_mr_obj, NULL); + if (unlikely(err)) + return (err == -ENOSPC) ? RESPST_ERR_LENGTH + : RESPST_ERR_MALFORMED_WQE; +@@ -499,8 +499,8 @@ static enum resp_states write_data_in(struct rxe_qp *qp, + int err; + int data_len = payload_size(pkt); + +- err = rxe_mem_copy(qp->resp.mr, qp->resp.va, payload_addr(pkt), +- data_len, to_mem_obj, NULL); ++ err = rxe_mr_copy(qp->resp.mr, qp->resp.va, payload_addr(pkt), data_len, ++ to_mr_obj, NULL); + if (err) { + rc = RESPST_ERR_RKEY_VIOLATION; + goto out; +@@ -522,9 +522,9 @@ static enum resp_states process_atomic(struct rxe_qp *qp, + u64 iova = atmeth_va(pkt); + u64 *vaddr; + enum resp_states ret; +- struct rxe_mem *mr = qp->resp.mr; ++ struct rxe_mr *mr = qp->resp.mr; + +- if (mr->state != RXE_MEM_STATE_VALID) { ++ if (mr->state != RXE_MR_STATE_VALID) { + ret = RESPST_ERR_RKEY_VIOLATION; + goto out; + } +@@ -700,8 +700,8 @@ static enum resp_states read_reply(struct rxe_qp *qp, + if (!skb) + return RESPST_ERR_RNR; + +- err = rxe_mem_copy(res->read.mr, res->read.va, payload_addr(&ack_pkt), +- payload, from_mem_obj, &icrc); ++ err = rxe_mr_copy(res->read.mr, res->read.va, payload_addr(&ack_pkt), ++ payload, from_mr_obj, &icrc); + if (err) + pr_err("Failed copying memory\n"); + +@@ -883,7 +883,7 @@ static enum resp_states do_complete(struct rxe_qp *qp, + } + + if (pkt->mask & RXE_IETH_MASK) { +- struct rxe_mem *rmr; ++ struct rxe_mr *rmr; + + wc->wc_flags |= IB_WC_WITH_INVALIDATE; + wc->ex.invalidate_rkey = ieth_rkey(pkt); +@@ -895,7 +895,7 @@ static enum resp_states do_complete(struct rxe_qp *qp, + wc->ex.invalidate_rkey); + return RESPST_ERROR; + } +- rmr->state = RXE_MEM_STATE_FREE; ++ rmr->state = RXE_MR_STATE_FREE; + rxe_drop_ref(rmr); + } + +diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c +index dee5e0e919d2..38249c1a76a8 100644 +--- a/drivers/infiniband/sw/rxe/rxe_verbs.c ++++ b/drivers/infiniband/sw/rxe/rxe_verbs.c +@@ -865,7 +865,7 @@ static struct ib_mr *rxe_get_dma_mr(struct ib_pd *ibpd, int access) + { + struct rxe_dev *rxe = to_rdev(ibpd->device); + struct rxe_pd *pd = to_rpd(ibpd); +- struct rxe_mem *mr; ++ struct rxe_mr *mr; + + mr = rxe_alloc(&rxe->mr_pool); + if (!mr) +@@ -873,7 +873,7 @@ static struct ib_mr *rxe_get_dma_mr(struct ib_pd *ibpd, int access) + + rxe_add_index(mr); + rxe_add_ref(pd); +- rxe_mem_init_dma(pd, access, mr); ++ rxe_mr_init_dma(pd, access, mr); + + return &mr->ibmr; + } +@@ -887,7 +887,7 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd, + int err; + struct rxe_dev *rxe = to_rdev(ibpd->device); + struct rxe_pd *pd = to_rpd(ibpd); +- struct rxe_mem *mr; ++ struct rxe_mr *mr; + + mr = rxe_alloc(&rxe->mr_pool); + if (!mr) { +@@ -899,8 +899,7 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd, + + rxe_add_ref(pd); + +- err = rxe_mem_init_user(pd, start, length, iova, +- access, udata, mr); ++ err = rxe_mr_init_user(pd, start, length, iova, access, udata, mr); + if (err) + goto err3; + +@@ -916,9 +915,9 @@ err2: + + static int rxe_dereg_mr(struct ib_mr *ibmr, struct 
ib_udata *udata) + { +- struct rxe_mem *mr = to_rmr(ibmr); ++ struct rxe_mr *mr = to_rmr(ibmr); + +- mr->state = RXE_MEM_STATE_ZOMBIE; ++ mr->state = RXE_MR_STATE_ZOMBIE; + rxe_drop_ref(mr_pd(mr)); + rxe_drop_index(mr); + rxe_drop_ref(mr); +@@ -930,7 +929,7 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type, + { + struct rxe_dev *rxe = to_rdev(ibpd->device); + struct rxe_pd *pd = to_rpd(ibpd); +- struct rxe_mem *mr; ++ struct rxe_mr *mr; + int err; + + if (mr_type != IB_MR_TYPE_MEM_REG) +@@ -946,7 +945,7 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type, + + rxe_add_ref(pd); + +- err = rxe_mem_init_fast(pd, max_num_sg, mr); ++ err = rxe_mr_init_fast(pd, max_num_sg, mr); + if (err) + goto err2; + +@@ -962,7 +961,7 @@ err1: + + static int rxe_set_page(struct ib_mr *ibmr, u64 addr) + { +- struct rxe_mem *mr = to_rmr(ibmr); ++ struct rxe_mr *mr = to_rmr(ibmr); + struct rxe_map *map; + struct rxe_phys_buf *buf; + +@@ -982,7 +981,7 @@ static int rxe_set_page(struct ib_mr *ibmr, u64 addr) + static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, + int sg_nents, unsigned int *sg_offset) + { +- struct rxe_mem *mr = to_rmr(ibmr); ++ struct rxe_mr *mr = to_rmr(ibmr); + int n; + + mr->nbuf = 0; +@@ -1110,6 +1109,7 @@ static const struct ib_device_ops rxe_dev_ops = { + INIT_RDMA_OBJ_SIZE(ib_pd, rxe_pd, ibpd), + INIT_RDMA_OBJ_SIZE(ib_srq, rxe_srq, ibsrq), + INIT_RDMA_OBJ_SIZE(ib_ucontext, rxe_ucontext, ibuc), ++ INIT_RDMA_OBJ_SIZE(ib_mw, rxe_mw, ibmw), + }; + + int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name) +diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h +index 79e0a5a878da..11eba7a3ba8f 100644 +--- a/drivers/infiniband/sw/rxe/rxe_verbs.h ++++ b/drivers/infiniband/sw/rxe/rxe_verbs.h +@@ -156,7 +156,7 @@ struct resp_res { + struct sk_buff *skb; + } atomic; + struct { +- struct rxe_mem *mr; ++ struct rxe_mr *mr; + u64 va_org; + u32 rkey; + u32 length; +@@ -183,7 +183,7 @@ struct rxe_resp_info { + + /* RDMA read / atomic only */ + u64 va; +- struct rxe_mem *mr; ++ struct rxe_mr *mr; + u32 resid; + u32 rkey; + u32 length; +@@ -262,18 +262,18 @@ struct rxe_qp { + struct execute_work cleanup_work; + }; + +-enum rxe_mem_state { +- RXE_MEM_STATE_ZOMBIE, +- RXE_MEM_STATE_INVALID, +- RXE_MEM_STATE_FREE, +- RXE_MEM_STATE_VALID, ++enum rxe_mr_state { ++ RXE_MR_STATE_ZOMBIE, ++ RXE_MR_STATE_INVALID, ++ RXE_MR_STATE_FREE, ++ RXE_MR_STATE_VALID, + }; + +-enum rxe_mem_type { +- RXE_MEM_TYPE_NONE, +- RXE_MEM_TYPE_DMA, +- RXE_MEM_TYPE_MR, +- RXE_MEM_TYPE_MW, ++enum rxe_mr_type { ++ RXE_MR_TYPE_NONE, ++ RXE_MR_TYPE_DMA, ++ RXE_MR_TYPE_MR, ++ RXE_MR_TYPE_MW, + }; + + #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct rxe_phys_buf)) +@@ -287,17 +287,14 @@ struct rxe_map { + struct rxe_phys_buf buf[RXE_BUF_PER_MAP]; + }; + +-struct rxe_mem { ++struct rxe_mr { + struct rxe_pool_entry pelem; +- union { +- struct ib_mr ibmr; +- struct ib_mw ibmw; +- }; ++ struct ib_mr ibmr; + + struct ib_umem *umem; + +- enum rxe_mem_state state; +- enum rxe_mem_type type; ++ enum rxe_mr_state state; ++ enum rxe_mr_type type; + u64 va; + u64 iova; + size_t length; +@@ -318,6 +315,17 @@ struct rxe_mem { + struct rxe_map **map; + }; + ++enum rxe_mw_state { ++ RXE_MW_STATE_INVALID = RXE_MR_STATE_INVALID, ++ RXE_MW_STATE_FREE = RXE_MR_STATE_FREE, ++ RXE_MW_STATE_VALID = RXE_MR_STATE_VALID, ++}; ++ ++struct rxe_mw { ++ struct ib_mw ibmw; ++ struct rxe_pool_entry pelem; ++}; ++ + struct rxe_mc_grp { + struct 
rxe_pool_entry pelem; + spinlock_t mcg_lock; /* guard group */ +@@ -422,27 +430,27 @@ static inline struct rxe_cq *to_rcq(struct ib_cq *cq) + return cq ? container_of(cq, struct rxe_cq, ibcq) : NULL; + } + +-static inline struct rxe_mem *to_rmr(struct ib_mr *mr) ++static inline struct rxe_mr *to_rmr(struct ib_mr *mr) + { +- return mr ? container_of(mr, struct rxe_mem, ibmr) : NULL; ++ return mr ? container_of(mr, struct rxe_mr, ibmr) : NULL; + } + +-static inline struct rxe_mem *to_rmw(struct ib_mw *mw) ++static inline struct rxe_mw *to_rmw(struct ib_mw *mw) + { +- return mw ? container_of(mw, struct rxe_mem, ibmw) : NULL; ++ return mw ? container_of(mw, struct rxe_mw, ibmw) : NULL; + } + +-static inline struct rxe_pd *mr_pd(struct rxe_mem *mr) ++static inline struct rxe_pd *mr_pd(struct rxe_mr *mr) + { + return to_rpd(mr->ibmr.pd); + } + +-static inline u32 mr_lkey(struct rxe_mem *mr) ++static inline u32 mr_lkey(struct rxe_mr *mr) + { + return mr->ibmr.lkey; + } + +-static inline u32 mr_rkey(struct rxe_mem *mr) ++static inline u32 mr_rkey(struct rxe_mr *mr) + { + return mr->ibmr.rkey; + } +-- +2.30.2 + diff --git a/queue-5.12/rdma-siw-properly-check-send-and-receive-cq-pointers.patch b/queue-5.12/rdma-siw-properly-check-send-and-receive-cq-pointers.patch new file mode 100644 index 00000000000..c29a84f2ee6 --- /dev/null +++ b/queue-5.12/rdma-siw-properly-check-send-and-receive-cq-pointers.patch @@ -0,0 +1,62 @@ +From b12d78860e561bfda2465cc793723eda7b2fa6c7 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 9 May 2021 14:39:21 +0300 +Subject: RDMA/siw: Properly check send and receive CQ pointers + +From: Leon Romanovsky + +[ Upstream commit a568814a55a0e82bbc7c7b51333d0c38e8fb5520 ] + +The check for the NULL of pointer received from container_of() is +incorrect by definition as it points to some offset from NULL. + +Change such check with proper NULL check of SIW QP attributes. 
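+
+To make the failure mode concrete, here is a stand-alone sketch
+(illustrative only, not code from the driver): a container_of()-style
+macro subtracts the member offset from the pointer it is given, so
+feeding it NULL yields a small non-NULL value that defeats any later
+NULL test:
+
+  #include <stddef.h>
+  #include <stdio.h>
+
+  struct inner { int x; };
+  struct outer { long pad; struct inner member; };
+
+  /* mirrors the kernel's container_of() for this illustration */
+  #define container_of_sketch(ptr, type, field) \
+          ((type *)((char *)(ptr) - offsetof(type, field)))
+
+  int main(void)
+  {
+          struct inner *null_inner = NULL;
+          struct outer *o = container_of_sketch(null_inner,
+                                                struct outer, member);
+
+          /* prints an offset like 0xfffffffffffffff8, never (nil) */
+          printf("%p\n", (void *)o);
+          return 0;
+  }
+
+This is why the validity test below is applied to the attrs pointers
+before the conversion is done.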
+
+Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
+Link: https://lore.kernel.org/r/a7535a82925f6f4c1f062abaa294f3ae6e54bdd2.1620560310.git.leonro@nvidia.com
+Signed-off-by: Leon Romanovsky
+Reviewed-by: Bernard Metzler
+Signed-off-by: Jason Gunthorpe
+Signed-off-by: Sasha Levin
+---
+ drivers/infiniband/sw/siw/siw_verbs.c | 9 +++------
+ 1 file changed, 3 insertions(+), 6 deletions(-)
+
+diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
+index e389d44e5591..d1859c56a6db 100644
+--- a/drivers/infiniband/sw/siw/siw_verbs.c
++++ b/drivers/infiniband/sw/siw/siw_verbs.c
+@@ -300,7 +300,6 @@ struct ib_qp *siw_create_qp(struct ib_pd *pd,
+ struct siw_ucontext *uctx =
+ rdma_udata_to_drv_context(udata, struct siw_ucontext,
+ base_ucontext);
+- struct siw_cq *scq = NULL, *rcq = NULL;
+ unsigned long flags;
+ int num_sqe, num_rqe, rv = 0;
+ size_t length;
+@@ -343,10 +342,8 @@ struct ib_qp *siw_create_qp(struct ib_pd *pd,
+ rv = -EINVAL;
+ goto err_out;
+ }
+- scq = to_siw_cq(attrs->send_cq);
+- rcq = to_siw_cq(attrs->recv_cq);
+
+- if (!scq || (!rcq && !attrs->srq)) {
++ if (!attrs->send_cq || (!attrs->recv_cq && !attrs->srq)) {
+ siw_dbg(base_dev, "send CQ or receive CQ invalid\n");
+ rv = -EINVAL;
+ goto err_out;
+@@ -401,8 +398,8 @@ struct ib_qp *siw_create_qp(struct ib_pd *pd,
+ }
+ }
+ qp->pd = pd;
+- qp->scq = scq;
+- qp->rcq = rcq;
++ qp->scq = to_siw_cq(attrs->send_cq);
++ qp->rcq = to_siw_cq(attrs->recv_cq);
+
+ if (attrs->srq) {
+ /*
+--
+2.30.2
+
diff --git a/queue-5.12/rdma-siw-release-xarray-entry.patch b/queue-5.12/rdma-siw-release-xarray-entry.patch
new file mode 100644
index 00000000000..056fd3929db
--- /dev/null
+++ b/queue-5.12/rdma-siw-release-xarray-entry.patch
@@ -0,0 +1,38 @@
+From d05a64da82618ee3bfef3755ae5b00bed2ff8081 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Sun, 9 May 2021 14:41:38 +0300
+Subject: RDMA/siw: Release xarray entry
+
+From: Leon Romanovsky
+
+[ Upstream commit a3d83276d98886879b5bf7b30b7c29882754e4df ]
+
+The xarray entry is allocated in siw_qp_add(), but releasing it was
+missed when a zero-sized SQ was discovered.
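+
+As a general unwind pattern (a stand-alone sketch with assumed names,
+not the driver code): once an object has been published to a lookup
+structure, every later failure branch must jump to a label that also
+removes it, not to the plain exit label:
+
+  #include <errno.h>
+  #include <stdio.h>
+  #include <stdlib.h>
+
+  static int *table[1];               /* stands in for the xarray */
+
+  static int create_qp(int num_sqe)
+  {
+          int rv;
+          int *entry = malloc(sizeof(*entry));
+
+          if (!entry)
+                  return -ENOMEM;
+          table[0] = entry;           /* entry is now published */
+
+          if (num_sqe == 0) {         /* zero-sized SQ unsupported */
+                  rv = -EINVAL;
+                  goto err_out_xa;    /* unpublish, don't just return */
+          }
+          return 0;                   /* on success the entry stays */
+
+  err_out_xa:
+          table[0] = NULL;
+          free(entry);
+          return rv;
+  }
+
+  int main(void)
+  {
+          printf("%d\n", create_qp(0));   /* prints -22 (-EINVAL) */
+          return 0;
+  }
+
+The one-line fix below retargets the zero-sized-SQ branch from the
+plain err_out label to err_out_xa in exactly this way.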
+ +Fixes: 661f385961f0 ("RDMA/siw: Fix handling of zero-sized Read and Receive Queues.") +Link: https://lore.kernel.org/r/f070b59d5a1114d5a4e830346755c2b3f141cde5.1620560472.git.leonro@nvidia.com +Signed-off-by: Leon Romanovsky +Reviewed-by: Bernard Metzler +Signed-off-by: Jason Gunthorpe +Signed-off-by: Sasha Levin +--- + drivers/infiniband/sw/siw/siw_verbs.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c +index d1859c56a6db..8a00c06e5f56 100644 +--- a/drivers/infiniband/sw/siw/siw_verbs.c ++++ b/drivers/infiniband/sw/siw/siw_verbs.c +@@ -375,7 +375,7 @@ struct ib_qp *siw_create_qp(struct ib_pd *pd, + else { + /* Zero sized SQ is not supported */ + rv = -EINVAL; +- goto err_out; ++ goto err_out_xa; + } + if (num_rqe) + num_rqe = roundup_pow_of_two(num_rqe); +-- +2.30.2 + diff --git a/queue-5.12/rdma-uverbs-fix-a-null-vs-is_err-bug.patch b/queue-5.12/rdma-uverbs-fix-a-null-vs-is_err-bug.patch new file mode 100644 index 00000000000..a8ea246987e --- /dev/null +++ b/queue-5.12/rdma-uverbs-fix-a-null-vs-is_err-bug.patch @@ -0,0 +1,40 @@ +From 4e83d4b14563f3185c4deefbb0f163cf666134a8 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 14 May 2021 17:18:10 +0300 +Subject: RDMA/uverbs: Fix a NULL vs IS_ERR() bug + +From: Dan Carpenter + +[ Upstream commit 463a3f66473b58d71428a1c3ce69ea52c05440e5 ] + +The uapi_get_object() function returns error pointers, it never returns +NULL. + +Fixes: 149d3845f4a5 ("RDMA/uverbs: Add a method to introspect handles in a context") +Link: https://lore.kernel.org/r/YJ6Got+U7lz+3n9a@mwanda +Signed-off-by: Dan Carpenter +Reviewed-by: Leon Romanovsky +Signed-off-by: Jason Gunthorpe +Signed-off-by: Sasha Levin +--- + drivers/infiniband/core/uverbs_std_types_device.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/drivers/infiniband/core/uverbs_std_types_device.c b/drivers/infiniband/core/uverbs_std_types_device.c +index a03021d94e11..049684880ae0 100644 +--- a/drivers/infiniband/core/uverbs_std_types_device.c ++++ b/drivers/infiniband/core/uverbs_std_types_device.c +@@ -117,8 +117,8 @@ static int UVERBS_HANDLER(UVERBS_METHOD_INFO_HANDLES)( + return ret; + + uapi_object = uapi_get_object(attrs->ufile->device->uapi, object_id); +- if (!uapi_object) +- return -EINVAL; ++ if (IS_ERR(uapi_object)) ++ return PTR_ERR(uapi_object); + + handles = gather_objects_handle(attrs->ufile, uapi_object, attrs, + out_len, &total); +-- +2.30.2 + diff --git a/queue-5.12/scsi-qedf-add-pointer-checks-in-qedf_update_link_spe.patch b/queue-5.12/scsi-qedf-add-pointer-checks-in-qedf_update_link_spe.patch new file mode 100644 index 00000000000..208a3f35e1c --- /dev/null +++ b/queue-5.12/scsi-qedf-add-pointer-checks-in-qedf_update_link_spe.patch @@ -0,0 +1,63 @@ +From 80d298f69b5fbecfe31b4a8a9fd7ec342f54f24c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 12 May 2021 00:25:33 -0700 +Subject: scsi: qedf: Add pointer checks in qedf_update_link_speed() + +From: Javed Hasan + +[ Upstream commit 73578af92a0fae6609b955fcc9113e50e413c80f ] + +The following trace was observed: + + [ 14.042059] Call Trace: + [ 14.042061] + [ 14.042068] qedf_link_update+0x144/0x1f0 [qedf] + [ 14.042117] qed_link_update+0x5c/0x80 [qed] + [ 14.042135] qed_mcp_handle_link_change+0x2d2/0x410 [qed] + [ 14.042155] ? qed_set_ptt+0x70/0x80 [qed] + [ 14.042170] ? qed_set_ptt+0x70/0x80 [qed] + [ 14.042186] ? 
qed_rd+0x13/0x40 [qed] + [ 14.042205] qed_mcp_handle_events+0x437/0x690 [qed] + [ 14.042221] ? qed_set_ptt+0x70/0x80 [qed] + [ 14.042239] qed_int_sp_dpc+0x3a6/0x3e0 [qed] + [ 14.042245] tasklet_action_common.isra.14+0x5a/0x100 + [ 14.042250] __do_softirq+0xe4/0x2f8 + [ 14.042253] irq_exit+0xf7/0x100 + [ 14.042255] do_IRQ+0x7f/0xd0 + [ 14.042257] common_interrupt+0xf/0xf + [ 14.042259] + +API qedf_link_update() is getting called from QED but by that time +shost_data is not initialised. This results in a NULL pointer dereference +when we try to dereference shost_data while updating supported_speeds. + +Add a NULL pointer check before dereferencing shost_data. + +Link: https://lore.kernel.org/r/20210512072533.23618-1-jhasan@marvell.com +Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") +Reviewed-by: Himanshu Madhani +Signed-off-by: Javed Hasan +Signed-off-by: Martin K. Petersen +Signed-off-by: Sasha Levin +--- + drivers/scsi/qedf/qedf_main.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/drivers/scsi/qedf/qedf_main.c b/drivers/scsi/qedf/qedf_main.c +index cec27f2ef70d..e5076f09d5ed 100644 +--- a/drivers/scsi/qedf/qedf_main.c ++++ b/drivers/scsi/qedf/qedf_main.c +@@ -536,7 +536,9 @@ static void qedf_update_link_speed(struct qedf_ctx *qedf, + if (linkmode_intersects(link->supported_caps, sup_caps)) + lport->link_supported_speeds |= FC_PORTSPEED_20GBIT; + +- fc_host_supported_speeds(lport->host) = lport->link_supported_speeds; ++ if (lport->host && lport->host->shost_data) ++ fc_host_supported_speeds(lport->host) = ++ lport->link_supported_speeds; + } + + static void qedf_bw_update(void *dev) +-- +2.30.2 + diff --git a/queue-5.12/scsi-qla2xxx-fix-error-return-code-in-qla82xx_write_.patch b/queue-5.12/scsi-qla2xxx-fix-error-return-code-in-qla82xx_write_.patch new file mode 100644 index 00000000000..824a6befbe3 --- /dev/null +++ b/queue-5.12/scsi-qla2xxx-fix-error-return-code-in-qla82xx_write_.patch @@ -0,0 +1,40 @@ +From 12a34016b2a8cd362d96a2ad2d715d08f239dc45 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 14 May 2021 17:09:52 +0800 +Subject: scsi: qla2xxx: Fix error return code in qla82xx_write_flash_dword() + +From: Zhen Lei + +[ Upstream commit 5cb289bf2d7c34ca1abd794ce116c4f19185a1d4 ] + +Fix to return a negative error code from the error handling case instead of +0 as done elsewhere in this function. + +Link: https://lore.kernel.org/r/20210514090952.6715-1-thunder.leizhen@huawei.com +Fixes: a9083016a531 ("[SCSI] qla2xxx: Add ISP82XX support.") +Reported-by: Hulk Robot +Reviewed-by: Himanshu Madhani +Signed-off-by: Zhen Lei +Signed-off-by: Martin K. 
Petersen +Signed-off-by: Sasha Levin +--- + drivers/scsi/qla2xxx/qla_nx.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/drivers/scsi/qla2xxx/qla_nx.c b/drivers/scsi/qla2xxx/qla_nx.c +index 0677295957bc..615e44af1ca6 100644 +--- a/drivers/scsi/qla2xxx/qla_nx.c ++++ b/drivers/scsi/qla2xxx/qla_nx.c +@@ -1063,7 +1063,8 @@ qla82xx_write_flash_dword(struct qla_hw_data *ha, uint32_t flashaddr, + return ret; + } + +- if (qla82xx_flash_set_write_enable(ha)) ++ ret = qla82xx_flash_set_write_enable(ha); ++ if (ret < 0) + goto done_write; + + qla82xx_wr_32(ha, QLA82XX_ROMUSB_ROM_WDATA, data); +-- +2.30.2 + diff --git a/queue-5.12/scsi-ufs-core-increase-the-usable-queue-depth.patch b/queue-5.12/scsi-ufs-core-increase-the-usable-queue-depth.patch new file mode 100644 index 00000000000..6ec756edb1e --- /dev/null +++ b/queue-5.12/scsi-ufs-core-increase-the-usable-queue-depth.patch @@ -0,0 +1,71 @@ +From 5d061953c212af4be24855e6c0a4e33c6e1493b0 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 13 May 2021 09:49:12 -0700 +Subject: scsi: ufs: core: Increase the usable queue depth + +From: Bart Van Assche + +[ Upstream commit d0b2b70eb12e9ffaf95e11b16b230a4e015a536c ] + +With the current implementation of the UFS driver active_queues is 1 +instead of 0 if all UFS request queues are idle. That causes +hctx_may_queue() to divide the queue depth by 2 when queueing a request and +hence reduces the usable queue depth. + +The shared tag set code in the block layer keeps track of the number of +active request queues. blk_mq_tag_busy() is called before a request is +queued onto a hwq and blk_mq_tag_idle() is called some time after the hwq +became idle. blk_mq_tag_idle() is called from inside blk_mq_timeout_work(). +Hence, blk_mq_tag_idle() is only called if a timer is associated with each +request that is submitted to a request queue that shares a tag set with +another request queue. + +Adds a blk_mq_start_request() call in ufshcd_exec_dev_cmd(). This doubles +the queue depth on my test setup from 16 to 32. + +In addition to increasing the usable queue depth, also fix the +documentation of the 'timeout' parameter in the header above +ufshcd_exec_dev_cmd(). + +Link: https://lore.kernel.org/r/20210513164912.5683-1-bvanassche@acm.org +Fixes: 7252a3603015 ("scsi: ufs: Avoid busy-waiting by eliminating tag conflicts") +Cc: Can Guo +Cc: Alim Akhtar +Cc: Avri Altman +Cc: Stanley Chu +Cc: Bean Huo +Cc: Adrian Hunter +Reviewed-by: Can Guo +Signed-off-by: Bart Van Assche +Signed-off-by: Martin K. Petersen +Signed-off-by: Sasha Levin +--- + drivers/scsi/ufs/ufshcd.c | 5 ++++- + 1 file changed, 4 insertions(+), 1 deletion(-) + +diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c +index 0c71a159d08f..e1e510882ff4 100644 +--- a/drivers/scsi/ufs/ufshcd.c ++++ b/drivers/scsi/ufs/ufshcd.c +@@ -2849,7 +2849,7 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba, + * ufshcd_exec_dev_cmd - API for sending device management requests + * @hba: UFS hba + * @cmd_type: specifies the type (NOP, Query...) +- * @timeout: time in seconds ++ * @timeout: timeout in milliseconds + * + * NOTE: Since there is only one available tag for device management commands, + * it is expected you hold the hba->dev_cmd.lock mutex. +@@ -2879,6 +2879,9 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba *hba, + } + tag = req->tag; + WARN_ON_ONCE(!ufshcd_valid_tag(hba, tag)); ++ /* Set the timeout such that the SCSI error handler is not activated. 
*/ ++ req->timeout = msecs_to_jiffies(2 * timeout); ++ blk_mq_start_request(req); + + init_completion(&wait); + lrbp = &hba->lrb[tag]; +-- +2.30.2 + diff --git a/queue-5.12/series b/queue-5.12/series new file mode 100644 index 00000000000..f3a744fe6d4 --- /dev/null +++ b/queue-5.12/series @@ -0,0 +1,30 @@ +firmware-arm_scpi-prevent-the-ternary-sign-expansion.patch +openrisc-fix-a-memory-leak.patch +tee-amdtee-unload-ta-only-when-its-refcount-becomes-.patch +habanalabs-gaudi-fix-a-potential-use-after-free-in-g.patch +rdma-siw-properly-check-send-and-receive-cq-pointers.patch +rdma-siw-release-xarray-entry.patch +rdma-core-prevent-divide-by-zero-error-triggered-by-.patch +platform-x86-ideapad-laptop-fix-a-null-pointer-deref.patch +rdma-rxe-clear-all-qp-fields-if-creation-failed.patch +scsi-ufs-core-increase-the-usable-queue-depth.patch +scsi-qedf-add-pointer-checks-in-qedf_update_link_spe.patch +scsi-qla2xxx-fix-error-return-code-in-qla82xx_write_.patch +rdma-mlx5-recover-from-fatal-event-in-dual-port-mode.patch +rdma-rxe-split-mem-into-mr-and-mw.patch +rdma-rxe-return-cqe-error-if-invalid-lkey-was-suppli.patch +rdma-core-don-t-access-cm_id-after-its-destruction.patch +nvmet-fix-memory-leak-in-nvmet_alloc_ctrl.patch +nvme-loop-fix-memory-leak-in-nvme_loop_create_ctrl.patch +nvme-tcp-rerun-io_work-if-req_list-is-not-empty.patch +nvme-fc-clear-q_live-at-beginning-of-association-tea.patch +platform-mellanox-mlxbf-tmfifo-fix-a-memory-barrier-.patch +platform-x86-intel_int0002_vgpio-only-call-enable_ir.patch +platform-x86-dell-smbios-wmi-fix-oops-on-rmmod-dell_.patch +rdma-mlx5-fix-query-dct-via-devx.patch +rdma-uverbs-fix-a-null-vs-is_err-bug.patch +tools-testing-selftests-exec-fix-link-error.patch +drm-ttm-do-not-add-non-system-domain-bo-into-swap-li.patch +powerpc-pseries-fix-hcall-tracing-recursion-in-pv-qu.patch +ptrace-make-ptrace-fail-if-the-tracee-changed-its-pi.patch +nvmet-seset-ns-file-when-open-fails.patch diff --git a/queue-5.12/tee-amdtee-unload-ta-only-when-its-refcount-becomes-.patch b/queue-5.12/tee-amdtee-unload-ta-only-when-its-refcount-becomes-.patch new file mode 100644 index 00000000000..1ef77f66494 --- /dev/null +++ b/queue-5.12/tee-amdtee-unload-ta-only-when-its-refcount-becomes-.patch @@ -0,0 +1,285 @@ +From 03baccd7d8a06e1b553e962634064c14f0fdba50 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 14 Apr 2021 23:08:27 +0530 +Subject: tee: amdtee: unload TA only when its refcount becomes 0 + +From: Rijo Thomas + +[ Upstream commit 9f015b3765bf593b3ed5d3b588e409dc0ffa9f85 ] + +Same Trusted Application (TA) can be loaded in multiple TEE contexts. + +If it is a single instance TA, the TA should not get unloaded from AMD +Secure Processor, while it is still in use in another TEE context. + +Therefore reference count TA and unload it when the count becomes zero. 
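+
+The counting scheme, reduced to a stand-alone sketch (the driver keys
+a per-TA node by ta_handle on a mutex-protected list; a single counter
+stands in for that here):
+
+  #include <stdio.h>
+
+  static unsigned int ta_refcount;    /* one count per loaded TA */
+
+  static unsigned int get_ta(void) { return ++ta_refcount; }
+  static unsigned int put_ta(void) { return --ta_refcount; }
+
+  int main(void)
+  {
+          get_ta();                   /* first context loads the TA */
+          get_ta();                   /* second context reuses it */
+
+          if (put_ta())               /* still referenced elsewhere */
+                  printf("TA busy, skipping unload\n");
+          if (!put_ta())              /* last reference dropped */
+                  printf("unloading TA\n");
+          return 0;
+  }
+
+In the patch below, handle_load_ta() takes such a reference under
+ta_refcount_mutex and handle_unload_ta() issues TEE_CMD_ID_UNLOAD_TA
+only when the count it drops reaches zero.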
+ +Fixes: 757cc3e9ff1d ("tee: add AMD-TEE driver") +Reviewed-by: Devaraj Rangasamy +Signed-off-by: Rijo Thomas +Acked-by: Dan Carpenter +Signed-off-by: Jens Wiklander +Signed-off-by: Sasha Levin +--- + drivers/tee/amdtee/amdtee_private.h | 13 ++++ + drivers/tee/amdtee/call.c | 94 ++++++++++++++++++++++++++--- + drivers/tee/amdtee/core.c | 15 +++-- + 3 files changed, 106 insertions(+), 16 deletions(-) + +diff --git a/drivers/tee/amdtee/amdtee_private.h b/drivers/tee/amdtee/amdtee_private.h +index 337c8d82f74e..6d0f7062bb87 100644 +--- a/drivers/tee/amdtee/amdtee_private.h ++++ b/drivers/tee/amdtee/amdtee_private.h +@@ -21,6 +21,7 @@ + #define TEEC_SUCCESS 0x00000000 + #define TEEC_ERROR_GENERIC 0xFFFF0000 + #define TEEC_ERROR_BAD_PARAMETERS 0xFFFF0006 ++#define TEEC_ERROR_OUT_OF_MEMORY 0xFFFF000C + #define TEEC_ERROR_COMMUNICATION 0xFFFF000E + + #define TEEC_ORIGIN_COMMS 0x00000002 +@@ -93,6 +94,18 @@ struct amdtee_shm_data { + u32 buf_id; + }; + ++/** ++ * struct amdtee_ta_data - Keeps track of all TAs loaded in AMD Secure ++ * Processor ++ * @ta_handle: Handle to TA loaded in TEE ++ * @refcount: Reference count for the loaded TA ++ */ ++struct amdtee_ta_data { ++ struct list_head list_node; ++ u32 ta_handle; ++ u32 refcount; ++}; ++ + #define LOWER_TWO_BYTE_MASK 0x0000FFFF + + /** +diff --git a/drivers/tee/amdtee/call.c b/drivers/tee/amdtee/call.c +index 096dd4d92d39..07f36ac834c8 100644 +--- a/drivers/tee/amdtee/call.c ++++ b/drivers/tee/amdtee/call.c +@@ -121,15 +121,69 @@ static int amd_params_to_tee_params(struct tee_param *tee, u32 count, + return ret; + } + ++static DEFINE_MUTEX(ta_refcount_mutex); ++static struct list_head ta_list = LIST_HEAD_INIT(ta_list); ++ ++static u32 get_ta_refcount(u32 ta_handle) ++{ ++ struct amdtee_ta_data *ta_data; ++ u32 count = 0; ++ ++ /* Caller must hold a mutex */ ++ list_for_each_entry(ta_data, &ta_list, list_node) ++ if (ta_data->ta_handle == ta_handle) ++ return ++ta_data->refcount; ++ ++ ta_data = kzalloc(sizeof(*ta_data), GFP_KERNEL); ++ if (ta_data) { ++ ta_data->ta_handle = ta_handle; ++ ta_data->refcount = 1; ++ count = ta_data->refcount; ++ list_add(&ta_data->list_node, &ta_list); ++ } ++ ++ return count; ++} ++ ++static u32 put_ta_refcount(u32 ta_handle) ++{ ++ struct amdtee_ta_data *ta_data; ++ u32 count = 0; ++ ++ /* Caller must hold a mutex */ ++ list_for_each_entry(ta_data, &ta_list, list_node) ++ if (ta_data->ta_handle == ta_handle) { ++ count = --ta_data->refcount; ++ if (count == 0) { ++ list_del(&ta_data->list_node); ++ kfree(ta_data); ++ break; ++ } ++ } ++ ++ return count; ++} ++ + int handle_unload_ta(u32 ta_handle) + { + struct tee_cmd_unload_ta cmd = {0}; +- u32 status; ++ u32 status, count; + int ret; + + if (!ta_handle) + return -EINVAL; + ++ mutex_lock(&ta_refcount_mutex); ++ ++ count = put_ta_refcount(ta_handle); ++ ++ if (count) { ++ pr_debug("unload ta: not unloading %u count %u\n", ++ ta_handle, count); ++ ret = -EBUSY; ++ goto unlock; ++ } ++ + cmd.ta_handle = ta_handle; + + ret = psp_tee_process_cmd(TEE_CMD_ID_UNLOAD_TA, (void *)&cmd, +@@ -137,8 +191,12 @@ int handle_unload_ta(u32 ta_handle) + if (!ret && status != 0) { + pr_err("unload ta: status = 0x%x\n", status); + ret = -EBUSY; ++ } else { ++ pr_debug("unloaded ta handle %u\n", ta_handle); + } + ++unlock: ++ mutex_unlock(&ta_refcount_mutex); + return ret; + } + +@@ -340,7 +398,8 @@ int handle_open_session(struct tee_ioctl_open_session_arg *arg, u32 *info, + + int handle_load_ta(void *data, u32 size, struct tee_ioctl_open_session_arg *arg) + { +- struct 
tee_cmd_load_ta cmd = {0}; ++ struct tee_cmd_unload_ta unload_cmd = {}; ++ struct tee_cmd_load_ta load_cmd = {}; + phys_addr_t blob; + int ret; + +@@ -353,21 +412,36 @@ int handle_load_ta(void *data, u32 size, struct tee_ioctl_open_session_arg *arg) + return -EINVAL; + } + +- cmd.hi_addr = upper_32_bits(blob); +- cmd.low_addr = lower_32_bits(blob); +- cmd.size = size; ++ load_cmd.hi_addr = upper_32_bits(blob); ++ load_cmd.low_addr = lower_32_bits(blob); ++ load_cmd.size = size; + +- ret = psp_tee_process_cmd(TEE_CMD_ID_LOAD_TA, (void *)&cmd, +- sizeof(cmd), &arg->ret); ++ mutex_lock(&ta_refcount_mutex); ++ ++ ret = psp_tee_process_cmd(TEE_CMD_ID_LOAD_TA, (void *)&load_cmd, ++ sizeof(load_cmd), &arg->ret); + if (ret) { + arg->ret_origin = TEEC_ORIGIN_COMMS; + arg->ret = TEEC_ERROR_COMMUNICATION; +- } else { +- set_session_id(cmd.ta_handle, 0, &arg->session); ++ } else if (arg->ret == TEEC_SUCCESS) { ++ ret = get_ta_refcount(load_cmd.ta_handle); ++ if (!ret) { ++ arg->ret_origin = TEEC_ORIGIN_COMMS; ++ arg->ret = TEEC_ERROR_OUT_OF_MEMORY; ++ ++ /* Unload the TA on error */ ++ unload_cmd.ta_handle = load_cmd.ta_handle; ++ psp_tee_process_cmd(TEE_CMD_ID_UNLOAD_TA, ++ (void *)&unload_cmd, ++ sizeof(unload_cmd), &ret); ++ } else { ++ set_session_id(load_cmd.ta_handle, 0, &arg->session); ++ } + } ++ mutex_unlock(&ta_refcount_mutex); + + pr_debug("load TA: TA handle = 0x%x, RO = 0x%x, ret = 0x%x\n", +- cmd.ta_handle, arg->ret_origin, arg->ret); ++ load_cmd.ta_handle, arg->ret_origin, arg->ret); + + return 0; + } +diff --git a/drivers/tee/amdtee/core.c b/drivers/tee/amdtee/core.c +index 8a6a8f30bb42..da6b88e80dc0 100644 +--- a/drivers/tee/amdtee/core.c ++++ b/drivers/tee/amdtee/core.c +@@ -59,10 +59,9 @@ static void release_session(struct amdtee_session *sess) + continue; + + handle_close_session(sess->ta_handle, sess->session_info[i]); ++ handle_unload_ta(sess->ta_handle); + } + +- /* Unload Trusted Application once all sessions are closed */ +- handle_unload_ta(sess->ta_handle); + kfree(sess); + } + +@@ -224,8 +223,6 @@ static void destroy_session(struct kref *ref) + struct amdtee_session *sess = container_of(ref, struct amdtee_session, + refcount); + +- /* Unload the TA from TEE */ +- handle_unload_ta(sess->ta_handle); + mutex_lock(&session_list_mutex); + list_del(&sess->list_node); + mutex_unlock(&session_list_mutex); +@@ -238,7 +235,7 @@ int amdtee_open_session(struct tee_context *ctx, + { + struct amdtee_context_data *ctxdata = ctx->data; + struct amdtee_session *sess = NULL; +- u32 session_info; ++ u32 session_info, ta_handle; + size_t ta_size; + int rc, i; + void *ta; +@@ -259,11 +256,14 @@ int amdtee_open_session(struct tee_context *ctx, + if (arg->ret != TEEC_SUCCESS) + goto out; + ++ ta_handle = get_ta_handle(arg->session); ++ + mutex_lock(&session_list_mutex); + sess = alloc_session(ctxdata, arg->session); + mutex_unlock(&session_list_mutex); + + if (!sess) { ++ handle_unload_ta(ta_handle); + rc = -ENOMEM; + goto out; + } +@@ -277,6 +277,7 @@ int amdtee_open_session(struct tee_context *ctx, + + if (i >= TEE_NUM_SESSIONS) { + pr_err("reached maximum session count %d\n", TEE_NUM_SESSIONS); ++ handle_unload_ta(ta_handle); + kref_put(&sess->refcount, destroy_session); + rc = -ENOMEM; + goto out; +@@ -289,12 +290,13 @@ int amdtee_open_session(struct tee_context *ctx, + spin_lock(&sess->lock); + clear_bit(i, sess->sess_mask); + spin_unlock(&sess->lock); ++ handle_unload_ta(ta_handle); + kref_put(&sess->refcount, destroy_session); + goto out; + } + + sess->session_info[i] = session_info; +- 
set_session_id(sess->ta_handle, i, &arg->session); ++ set_session_id(ta_handle, i, &arg->session); + out: + free_pages((u64)ta, get_order(ta_size)); + return rc; +@@ -329,6 +331,7 @@ int amdtee_close_session(struct tee_context *ctx, u32 session) + + /* Close the session */ + handle_close_session(ta_handle, session_info); ++ handle_unload_ta(ta_handle); + + kref_put(&sess->refcount, destroy_session); + +-- +2.30.2 + diff --git a/queue-5.12/tools-testing-selftests-exec-fix-link-error.patch b/queue-5.12/tools-testing-selftests-exec-fix-link-error.patch new file mode 100644 index 00000000000..4936a44e245 --- /dev/null +++ b/queue-5.12/tools-testing-selftests-exec-fix-link-error.patch @@ -0,0 +1,48 @@ +From e92cbd6c25fcf897c272b7edd8c752365afd377a Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 22 May 2021 17:41:53 -0700 +Subject: tools/testing/selftests/exec: fix link error + +From: Yang Yingliang + +[ Upstream commit 4d1cd3b2c5c1c32826454de3a18c6183238d47ed ] + +Fix the link error by adding '-static': + + gcc -Wall -Wl,-z,max-page-size=0x1000 -pie load_address.c -o /home/yang/linux/tools/testing/selftests/exec/load_address_4096 + /usr/bin/ld: /tmp/ccopEGun.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `stderr@@GLIBC_2.17' which may bind externally can not be used when making a shared object; recompile with -fPIC + /usr/bin/ld: /tmp/ccopEGun.o(.text+0x158): unresolvable R_AARCH64_ADR_PREL_PG_HI21 relocation against symbol `stderr@@GLIBC_2.17' + /usr/bin/ld: final link failed: bad value + collect2: error: ld returned 1 exit status + make: *** [Makefile:25: tools/testing/selftests/exec/load_address_4096] Error 1 + +Link: https://lkml.kernel.org/r/20210514092422.2367367-1-yangyingliang@huawei.com +Fixes: 206e22f01941 ("tools/testing/selftests: add self-test for verifying load alignment") +Signed-off-by: Yang Yingliang +Cc: Chris Kennelly +Signed-off-by: Andrew Morton +Signed-off-by: Linus Torvalds +Signed-off-by: Sasha Levin +--- + tools/testing/selftests/exec/Makefile | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/tools/testing/selftests/exec/Makefile b/tools/testing/selftests/exec/Makefile +index cf69b2fcce59..dd61118df66e 100644 +--- a/tools/testing/selftests/exec/Makefile ++++ b/tools/testing/selftests/exec/Makefile +@@ -28,8 +28,8 @@ $(OUTPUT)/execveat.denatured: $(OUTPUT)/execveat + cp $< $@ + chmod -x $@ + $(OUTPUT)/load_address_4096: load_address.c +- $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x1000 -pie $< -o $@ ++ $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x1000 -pie -static $< -o $@ + $(OUTPUT)/load_address_2097152: load_address.c +- $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x200000 -pie $< -o $@ ++ $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x200000 -pie -static $< -o $@ + $(OUTPUT)/load_address_16777216: load_address.c +- $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x1000000 -pie $< -o $@ ++ $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x1000000 -pie -static $< -o $@ +-- +2.30.2 +
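+
+For context on the link error above (an illustrative note, not part
+of the upstream commit): the relocation failure comes from the test
+referencing libc data such as stderr from a PIE. A minimal reproducer
+under the same assumptions:
+
+  #include <stdio.h>
+
+  int main(void)
+  {
+          /* taking stderr (libc data, not just a function call) is
+           * what emitted R_AARCH64_ADR_PREL_PG_HI21 when linked as
+           * a dynamic PIE with a non-default max-page-size */
+          fprintf(stderr, "hello\n");
+          return 0;
+  }
+
+Linking with -static means the symbol can no longer bind externally,
+so the page-relative reference can be resolved at link time.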