From: Greg Kroah-Hartman
Date: Mon, 15 Apr 2024 12:24:01 +0000 (+0200)
Subject: drop broken i915 patch
X-Git-Tag: v5.15.156~46
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=79428b6bc54c11a510dc13438fca47beb374fd0e;p=thirdparty%2Fkernel%2Fstable-queue.git

drop broken i915 patch
---

diff --git a/queue-6.1/drm-i915-vma-fix-uaf-on-destroy-against-retire-race.patch b/queue-6.1/drm-i915-vma-fix-uaf-on-destroy-against-retire-race.patch
deleted file mode 100644
index fce802ef3fa..00000000000
--- a/queue-6.1/drm-i915-vma-fix-uaf-on-destroy-against-retire-race.patch
+++ /dev/null
@@ -1,202 +0,0 @@
-From 0e45882ca829b26b915162e8e86dbb1095768e9e Mon Sep 17 00:00:00 2001
-From: Janusz Krzysztofik
-Date: Tue, 5 Mar 2024 15:35:06 +0100
-Subject: drm/i915/vma: Fix UAF on destroy against retire race
-MIME-Version: 1.0
-Content-Type: text/plain; charset=UTF-8
-Content-Transfer-Encoding: 8bit
-
-From: Janusz Krzysztofik
-
-commit 0e45882ca829b26b915162e8e86dbb1095768e9e upstream.
-
-Object debugging tools were sporadically reporting illegal attempts to
-free a still active i915 VMA object when parking a GT believed to be idle.
-
-[161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
-[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0
-...
-[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
-[161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
-[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
-[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
-...
-[161.361347] debug_object_free+0xeb/0x110
-[161.361362] i915_active_fini+0x14/0x130 [i915]
-[161.361866] release_references+0xfe/0x1f0 [i915]
-[161.362543] i915_vma_parked+0x1db/0x380 [i915]
-[161.363129] __gt_park+0x121/0x230 [i915]
-[161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915]
-
-That has been tracked down to be happening when another thread is
-deactivating the VMA inside __active_retire() helper, after the VMA's
-active counter has been already decremented to 0, but before deactivation
-of the VMA's object is reported to the object debugging tool.
-
-We could prevent from that race by serializing i915_active_fini() with
-__active_retire() via ref->tree_lock, but that wouldn't stop the VMA from
-being used, e.g. from __i915_vma_retire() called at the end of
-__active_retire(), after that VMA has been already freed by a concurrent
-i915_vma_destroy() on return from the i915_active_fini(). Then, we should
-rather fix the issue at the VMA level, not in i915_active.
-
-Since __i915_vma_parked() is called from __gt_park() on last put of the
-GT's wakeref, the issue could be addressed by holding the GT wakeref long
-enough for __active_retire() to complete before that wakeref is released
-and the GT parked.
-
-I believe the issue was introduced by commit d93939730347 ("drm/i915:
-Remove the vma refcount") which moved a call to i915_active_fini() from
-a dropped i915_vma_release(), called on last put of the removed VMA kref,
-to i915_vma_parked() processing path called on last put of a GT wakeref.
-However, its visibility to the object debugging tool was suppressed by a
-bug in i915_active that was fixed two weeks later with commit e92eb246feb9
-("drm/i915/active: Fix missing debug object activation").
-
-A VMA associated with a request doesn't acquire a GT wakeref by itself.
-Instead, it depends on a wakeref held directly by the request's active
-intel_context for a GT associated with its VM, and indirectly on that
-intel_context's engine wakeref if the engine belongs to the same GT as the
-VMA's VM. Those wakerefs are released asynchronously to VMA deactivation.
-
-Fix the issue by getting a wakeref for the VMA's GT when activating it,
-and putting that wakeref only after the VMA is deactivated. However,
-exclude global GTT from that processing path, otherwise the GPU never goes
-idle. Since __i915_vma_retire() may be called from atomic contexts, use
-async variant of wakeref put. Also, to avoid circular locking dependency,
-take care of acquiring the wakeref before VM mutex when both are needed.
-
-v7: Add inline comments with justifications for:
-    - using untracked variants of intel_gt_pm_get/put() (Nirmoy),
-    - using async variant of _put(),
-    - not getting the wakeref in case of a global GTT,
-    - always getting the first wakeref outside vm->mutex.
-v6: Since __i915_vma_active/retire() callbacks are not serialized, storing
-    a wakeref tracking handle inside struct i915_vma is not safe, and
-    there is no other good place for that. Use untracked variants of
-    intel_gt_pm_get/put_async().
-v5: Replace "tile" with "GT" across commit description (Rodrigo),
-  - avoid mentioning multi-GT case in commit description (Rodrigo),
-  - explain why we need to take a temporary wakeref unconditionally inside
-    i915_vma_pin_ww() (Rodrigo).
-v4: Refresh on top of commit 5e4e06e4087e ("drm/i915: Track gt pm
-    wakerefs") (Andi),
-  - for more easy backporting, split out removal of former insufficient
-    workarounds and move them to separate patches (Nirmoy).
-  - clean up commit message and description a bit.
-v3: Identify root cause more precisely, and a commit to blame,
-  - identify and drop former workarounds,
-  - update commit message and description.
-v2: Get the wakeref before VM mutex to avoid circular locking dependency,
-  - drop questionable Fixes: tag.
-
-Fixes: d93939730347 ("drm/i915: Remove the vma refcount")
-Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875
-Signed-off-by: Janusz Krzysztofik
-Cc: Thomas Hellström
-Cc: Nirmoy Das
-Cc: Andi Shyti
-Cc: Rodrigo Vivi
-Cc: stable@vger.kernel.org # v5.19+
-Reviewed-by: Nirmoy Das
-Signed-off-by: Andi Shyti
-Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-6-janusz.krzysztofik@linux.intel.com
-(cherry picked from commit f3c71b2ded5c4367144a810ef25f998fd1d6c381)
-Signed-off-by: Rodrigo Vivi
-Signed-off-by: Janusz Krzysztofik
-Signed-off-by: Greg Kroah-Hartman
----
- drivers/gpu/drm/i915/i915_vma.c | 50 ++++++++++++++++++++++++++++++++++------
- 1 file changed, 43 insertions(+), 7 deletions(-)
-
---- a/drivers/gpu/drm/i915/i915_vma.c
-+++ b/drivers/gpu/drm/i915/i915_vma.c
-@@ -32,6 +32,7 @@
- #include "gt/intel_engine.h"
- #include "gt/intel_engine_heartbeat.h"
- #include "gt/intel_gt.h"
-+#include "gt/intel_gt_pm.h"
- #include "gt/intel_gt_requests.h"
- 
- #include "i915_drv.h"
-@@ -98,12 +99,42 @@ static inline struct i915_vma *active_to
- 
- static int __i915_vma_active(struct i915_active *ref)
- {
--	return i915_vma_tryget(active_to_vma(ref)) ? 0 : -ENOENT;
-+	struct i915_vma *vma = active_to_vma(ref);
-+
-+	if (!i915_vma_tryget(vma))
-+		return -ENOENT;
-+
-+	/*
-+	 * Exclude global GTT VMA from holding a GT wakeref
-+	 * while active, otherwise GPU never goes idle.
-+	 */
-+	if (!i915_vma_is_ggtt(vma)) {
-+		/*
-+		 * Since we and our _retire() counterpart can be
-+		 * called asynchronously, storing a wakeref tracking
-+		 * handle inside struct i915_vma is not safe, and
-+		 * there is no other good place for that. Hence,
-+		 * use untracked variants of intel_gt_pm_get/put().
-+		 */
-+		intel_gt_pm_get_untracked(vma->vm->gt);
-+	}
-+
-+	return 0;
- }
- 
- static void __i915_vma_retire(struct i915_active *ref)
- {
--	i915_vma_put(active_to_vma(ref));
-+	struct i915_vma *vma = active_to_vma(ref);
-+
-+	if (!i915_vma_is_ggtt(vma)) {
-+		/*
-+		 * Since we can be called from atomic contexts,
-+		 * use an async variant of intel_gt_pm_put().
-+		 */
-+		intel_gt_pm_put_async_untracked(vma->vm->gt);
-+	}
-+
-+	i915_vma_put(vma);
- }
- 
- static struct i915_vma *
-@@ -1365,7 +1396,7 @@ int i915_vma_pin_ww(struct i915_vma *vma
- 	struct i915_vma_work *work = NULL;
- 	struct dma_fence *moving = NULL;
- 	struct i915_vma_resource *vma_res = NULL;
--	intel_wakeref_t wakeref = 0;
-+	intel_wakeref_t wakeref;
- 	unsigned int bound;
- 	int err;
- 
-@@ -1385,8 +1416,14 @@ int i915_vma_pin_ww(struct i915_vma *vma
- 	if (err)
- 		return err;
- 
--	if (flags & PIN_GLOBAL)
--		wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
-+	/*
-+	 * In case of a global GTT, we must hold a runtime-pm wakeref
-+	 * while global PTEs are updated. In other cases, we hold
-+	 * the rpm reference while the VMA is active. Since runtime
-+	 * resume may require allocations, which are forbidden inside
-+	 * vm->mutex, get the first rpm wakeref outside of the mutex.
-+	 */
-+	wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
- 
- 	if (flags & vma->vm->bind_async_flags) {
- 		/* lock VM */
-@@ -1522,8 +1559,7 @@ err_fence:
- 	if (work)
- 		dma_fence_work_commit_imm(&work->base);
- err_rpm:
--	if (wakeref)
--		intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
-+	intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
- 
- 	if (moving)
- 		dma_fence_put(moving);
diff --git a/queue-6.1/series b/queue-6.1/series
index 65148b15200..564a01432f2 100644
--- a/queue-6.1/series
+++ b/queue-6.1/series
@@ -39,4 +39,3 @@ net-ena-fix-incorrect-descriptor-free-behavior.patch
 tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch
 tracing-hide-unused-ftrace_event_id_fops.patch
 iommu-vt-d-allocate-local-memory-for-page-request-qu.patch
-drm-i915-vma-fix-uaf-on-destroy-against-retire-race.patch
diff --git a/queue-6.6/drm-i915-vma-fix-uaf-on-destroy-against-retire-race.patch b/queue-6.6/drm-i915-vma-fix-uaf-on-destroy-against-retire-race.patch
deleted file mode 100644
index a11ce027187..00000000000
--- a/queue-6.6/drm-i915-vma-fix-uaf-on-destroy-against-retire-race.patch
+++ /dev/null
@@ -1,202 +0,0 @@
-From 0e45882ca829b26b915162e8e86dbb1095768e9e Mon Sep 17 00:00:00 2001
-From: Janusz Krzysztofik
-Date: Tue, 5 Mar 2024 15:35:06 +0100
-Subject: drm/i915/vma: Fix UAF on destroy against retire race
-MIME-Version: 1.0
-Content-Type: text/plain; charset=UTF-8
-Content-Transfer-Encoding: 8bit
-
-From: Janusz Krzysztofik
-
-commit 0e45882ca829b26b915162e8e86dbb1095768e9e upstream.
-
-Object debugging tools were sporadically reporting illegal attempts to
-free a still active i915 VMA object when parking a GT believed to be idle.
-
-[161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
-[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0
-...
-[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
-[161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
-[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
-[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
-...
-[161.361347] debug_object_free+0xeb/0x110
-[161.361362] i915_active_fini+0x14/0x130 [i915]
-[161.361866] release_references+0xfe/0x1f0 [i915]
-[161.362543] i915_vma_parked+0x1db/0x380 [i915]
-[161.363129] __gt_park+0x121/0x230 [i915]
-[161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915]
-
-That has been tracked down to be happening when another thread is
-deactivating the VMA inside __active_retire() helper, after the VMA's
-active counter has been already decremented to 0, but before deactivation
-of the VMA's object is reported to the object debugging tool.
-
-We could prevent from that race by serializing i915_active_fini() with
-__active_retire() via ref->tree_lock, but that wouldn't stop the VMA from
-being used, e.g. from __i915_vma_retire() called at the end of
-__active_retire(), after that VMA has been already freed by a concurrent
-i915_vma_destroy() on return from the i915_active_fini(). Then, we should
-rather fix the issue at the VMA level, not in i915_active.
-
-Since __i915_vma_parked() is called from __gt_park() on last put of the
-GT's wakeref, the issue could be addressed by holding the GT wakeref long
-enough for __active_retire() to complete before that wakeref is released
-and the GT parked.
-
-I believe the issue was introduced by commit d93939730347 ("drm/i915:
-Remove the vma refcount") which moved a call to i915_active_fini() from
-a dropped i915_vma_release(), called on last put of the removed VMA kref,
-to i915_vma_parked() processing path called on last put of a GT wakeref.
-However, its visibility to the object debugging tool was suppressed by a
-bug in i915_active that was fixed two weeks later with commit e92eb246feb9
-("drm/i915/active: Fix missing debug object activation").
-
-A VMA associated with a request doesn't acquire a GT wakeref by itself.
-Instead, it depends on a wakeref held directly by the request's active
-intel_context for a GT associated with its VM, and indirectly on that
-intel_context's engine wakeref if the engine belongs to the same GT as the
-VMA's VM. Those wakerefs are released asynchronously to VMA deactivation.
-
-Fix the issue by getting a wakeref for the VMA's GT when activating it,
-and putting that wakeref only after the VMA is deactivated. However,
-exclude global GTT from that processing path, otherwise the GPU never goes
-idle. Since __i915_vma_retire() may be called from atomic contexts, use
-async variant of wakeref put. Also, to avoid circular locking dependency,
-take care of acquiring the wakeref before VM mutex when both are needed.
-
-v7: Add inline comments with justifications for:
-    - using untracked variants of intel_gt_pm_get/put() (Nirmoy),
-    - using async variant of _put(),
-    - not getting the wakeref in case of a global GTT,
-    - always getting the first wakeref outside vm->mutex.
-v6: Since __i915_vma_active/retire() callbacks are not serialized, storing
-    a wakeref tracking handle inside struct i915_vma is not safe, and
-    there is no other good place for that. Use untracked variants of
-    intel_gt_pm_get/put_async().
-v5: Replace "tile" with "GT" across commit description (Rodrigo),
-  - avoid mentioning multi-GT case in commit description (Rodrigo),
-  - explain why we need to take a temporary wakeref unconditionally inside
-    i915_vma_pin_ww() (Rodrigo).
-v4: Refresh on top of commit 5e4e06e4087e ("drm/i915: Track gt pm
-    wakerefs") (Andi),
-  - for more easy backporting, split out removal of former insufficient
-    workarounds and move them to separate patches (Nirmoy).
-  - clean up commit message and description a bit.
-v3: Identify root cause more precisely, and a commit to blame,
-  - identify and drop former workarounds,
-  - update commit message and description.
-v2: Get the wakeref before VM mutex to avoid circular locking dependency,
-  - drop questionable Fixes: tag.
-
-Fixes: d93939730347 ("drm/i915: Remove the vma refcount")
-Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875
-Signed-off-by: Janusz Krzysztofik
-Cc: Thomas Hellström
-Cc: Nirmoy Das
-Cc: Andi Shyti
-Cc: Rodrigo Vivi
-Cc: stable@vger.kernel.org # v5.19+
-Reviewed-by: Nirmoy Das
-Signed-off-by: Andi Shyti
-Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-6-janusz.krzysztofik@linux.intel.com
-(cherry picked from commit f3c71b2ded5c4367144a810ef25f998fd1d6c381)
-Signed-off-by: Rodrigo Vivi
-Signed-off-by: Janusz Krzysztofik
-Signed-off-by: Greg Kroah-Hartman
----
- drivers/gpu/drm/i915/i915_vma.c | 50 ++++++++++++++++++++++++++++++++++------
- 1 file changed, 43 insertions(+), 7 deletions(-)
-
---- a/drivers/gpu/drm/i915/i915_vma.c
-+++ b/drivers/gpu/drm/i915/i915_vma.c
-@@ -33,6 +33,7 @@
- #include "gt/intel_engine.h"
- #include "gt/intel_engine_heartbeat.h"
- #include "gt/intel_gt.h"
-+#include "gt/intel_gt_pm.h"
- #include "gt/intel_gt_requests.h"
- #include "gt/intel_tlb.h"
- 
-@@ -102,12 +103,42 @@ static inline struct i915_vma *active_to
- 
- static int __i915_vma_active(struct i915_active *ref)
- {
--	return i915_vma_tryget(active_to_vma(ref)) ? 0 : -ENOENT;
-+	struct i915_vma *vma = active_to_vma(ref);
-+
-+	if (!i915_vma_tryget(vma))
-+		return -ENOENT;
-+
-+	/*
-+	 * Exclude global GTT VMA from holding a GT wakeref
-+	 * while active, otherwise GPU never goes idle.
-+	 */
-+	if (!i915_vma_is_ggtt(vma)) {
-+		/*
-+		 * Since we and our _retire() counterpart can be
-+		 * called asynchronously, storing a wakeref tracking
-+		 * handle inside struct i915_vma is not safe, and
-+		 * there is no other good place for that. Hence,
-+		 * use untracked variants of intel_gt_pm_get/put().
-+		 */
-+		intel_gt_pm_get_untracked(vma->vm->gt);
-+	}
-+
-+	return 0;
- }
- 
- static void __i915_vma_retire(struct i915_active *ref)
- {
--	i915_vma_put(active_to_vma(ref));
-+	struct i915_vma *vma = active_to_vma(ref);
-+
-+	if (!i915_vma_is_ggtt(vma)) {
-+		/*
-+		 * Since we can be called from atomic contexts,
-+		 * use an async variant of intel_gt_pm_put().
-+		 */
-+		intel_gt_pm_put_async_untracked(vma->vm->gt);
-+	}
-+
-+	i915_vma_put(vma);
- }
- 
- static struct i915_vma *
-@@ -1403,7 +1434,7 @@ int i915_vma_pin_ww(struct i915_vma *vma
- 	struct i915_vma_work *work = NULL;
- 	struct dma_fence *moving = NULL;
- 	struct i915_vma_resource *vma_res = NULL;
--	intel_wakeref_t wakeref = 0;
-+	intel_wakeref_t wakeref;
- 	unsigned int bound;
- 	int err;
- 
-@@ -1423,8 +1454,14 @@ int i915_vma_pin_ww(struct i915_vma *vma
- 	if (err)
- 		return err;
- 
--	if (flags & PIN_GLOBAL)
--		wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
-+	/*
-+	 * In case of a global GTT, we must hold a runtime-pm wakeref
-+	 * while global PTEs are updated. In other cases, we hold
-+	 * the rpm reference while the VMA is active. Since runtime
-+	 * resume may require allocations, which are forbidden inside
-+	 * vm->mutex, get the first rpm wakeref outside of the mutex.
-+	 */
-+	wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
- 
- 	if (flags & vma->vm->bind_async_flags) {
- 		/* lock VM */
-@@ -1560,8 +1597,7 @@ err_fence:
- 	if (work)
- 		dma_fence_work_commit_imm(&work->base);
- err_rpm:
--	if (wakeref)
--		intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
-+	intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
- 
- 	if (moving)
- 		dma_fence_put(moving);
diff --git a/queue-6.6/series b/queue-6.6/series
index 5547a30d0ba..da923e157c2 100644
--- a/queue-6.6/series
+++ b/queue-6.6/series
@@ -77,4 +77,3 @@ tracing-hide-unused-ftrace_event_id_fops.patch
 iommu-vt-d-fix-wrong-use-of-pasid-config.patch
 iommu-vt-d-allocate-local-memory-for-page-request-qu.patch
 selftests-mptcp-use-operator-to-append-strings.patch
-drm-i915-vma-fix-uaf-on-destroy-against-retire-race.patch