--- /dev/null
+From 0e45882ca829b26b915162e8e86dbb1095768e9e Mon Sep 17 00:00:00 2001
+From: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
+Date: Tue, 5 Mar 2024 15:35:06 +0100
+Subject: drm/i915/vma: Fix UAF on destroy against retire race
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
+
+commit 0e45882ca829b26b915162e8e86dbb1095768e9e upstream.
+
+Object debugging tools were sporadically reporting illegal attempts to
+free a still active i915 VMA object when parking a GT believed to be idle.
+
+[161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
+[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0
+...
+[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
+[161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
+[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
+[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
+...
+[161.361347] debug_object_free+0xeb/0x110
+[161.361362] i915_active_fini+0x14/0x130 [i915]
+[161.361866] release_references+0xfe/0x1f0 [i915]
+[161.362543] i915_vma_parked+0x1db/0x380 [i915]
+[161.363129] __gt_park+0x121/0x230 [i915]
+[161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915]
+
+That has been tracked down to be happening when another thread is
+deactivating the VMA inside __active_retire() helper, after the VMA's
+active counter has been already decremented to 0, but before deactivation
+of the VMA's object is reported to the object debugging tool.
+
+We could prevent from that race by serializing i915_active_fini() with
+__active_retire() via ref->tree_lock, but that wouldn't stop the VMA from
+being used, e.g. from __i915_vma_retire() called at the end of
+__active_retire(), after that VMA has been already freed by a concurrent
+i915_vma_destroy() on return from the i915_active_fini(). Then, we should
+rather fix the issue at the VMA level, not in i915_active.
+
+Since __i915_vma_parked() is called from __gt_park() on last put of the
+GT's wakeref, the issue could be addressed by holding the GT wakeref long
+enough for __active_retire() to complete before that wakeref is released
+and the GT parked.
+
+I believe the issue was introduced by commit d93939730347 ("drm/i915:
+Remove the vma refcount") which moved a call to i915_active_fini() from
+a dropped i915_vma_release(), called on last put of the removed VMA kref,
+to i915_vma_parked() processing path called on last put of a GT wakeref.
+However, its visibility to the object debugging tool was suppressed by a
+bug in i915_active that was fixed two weeks later with commit e92eb246feb9
+("drm/i915/active: Fix missing debug object activation").
+
+A VMA associated with a request doesn't acquire a GT wakeref by itself.
+Instead, it depends on a wakeref held directly by the request's active
+intel_context for a GT associated with its VM, and indirectly on that
+intel_context's engine wakeref if the engine belongs to the same GT as the
+VMA's VM. Those wakerefs are released asynchronously to VMA deactivation.
+
+Fix the issue by getting a wakeref for the VMA's GT when activating it,
+and putting that wakeref only after the VMA is deactivated. However,
+exclude global GTT from that processing path, otherwise the GPU never goes
+idle. Since __i915_vma_retire() may be called from atomic contexts, use
+async variant of wakeref put. Also, to avoid circular locking dependency,
+take care of acquiring the wakeref before VM mutex when both are needed.
+
+v7: Add inline comments with justifications for:
+ - using untracked variants of intel_gt_pm_get/put() (Nirmoy),
+ - using async variant of _put(),
+ - not getting the wakeref in case of a global GTT,
+ - always getting the first wakeref outside vm->mutex.
+v6: Since __i915_vma_active/retire() callbacks are not serialized, storing
+ a wakeref tracking handle inside struct i915_vma is not safe, and
+ there is no other good place for that. Use untracked variants of
+ intel_gt_pm_get/put_async().
+v5: Replace "tile" with "GT" across commit description (Rodrigo),
+ - avoid mentioning multi-GT case in commit description (Rodrigo),
+ - explain why we need to take a temporary wakeref unconditionally inside
+ i915_vma_pin_ww() (Rodrigo).
+v4: Refresh on top of commit 5e4e06e4087e ("drm/i915: Track gt pm
+ wakerefs") (Andi),
+ - for more easy backporting, split out removal of former insufficient
+ workarounds and move them to separate patches (Nirmoy).
+ - clean up commit message and description a bit.
+v3: Identify root cause more precisely, and a commit to blame,
+ - identify and drop former workarounds,
+ - update commit message and description.
+v2: Get the wakeref before VM mutex to avoid circular locking dependency,
+ - drop questionable Fixes: tag.
+
+Fixes: d93939730347 ("drm/i915: Remove the vma refcount")
+Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875
+Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
+Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
+Cc: Nirmoy Das <nirmoy.das@intel.com>
+Cc: Andi Shyti <andi.shyti@linux.intel.com>
+Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
+Cc: stable@vger.kernel.org # v5.19+
+Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
+Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
+Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-6-janusz.krzysztofik@linux.intel.com
+(cherry picked from commit f3c71b2ded5c4367144a810ef25f998fd1d6c381)
+Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
+Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/gpu/drm/i915/i915_vma.c | 50 ++++++++++++++++++++++++++++++++++------
+ 1 file changed, 43 insertions(+), 7 deletions(-)
+
+--- a/drivers/gpu/drm/i915/i915_vma.c
++++ b/drivers/gpu/drm/i915/i915_vma.c
+@@ -33,6 +33,7 @@
+ #include "gt/intel_engine.h"
+ #include "gt/intel_engine_heartbeat.h"
+ #include "gt/intel_gt.h"
++#include "gt/intel_gt_pm.h"
+ #include "gt/intel_gt_requests.h"
+ #include "gt/intel_tlb.h"
+
+@@ -102,12 +103,42 @@ static inline struct i915_vma *active_to
+
+ static int __i915_vma_active(struct i915_active *ref)
+ {
+- return i915_vma_tryget(active_to_vma(ref)) ? 0 : -ENOENT;
++ struct i915_vma *vma = active_to_vma(ref);
++
++ if (!i915_vma_tryget(vma))
++ return -ENOENT;
++
++ /*
++ * Exclude global GTT VMA from holding a GT wakeref
++ * while active, otherwise GPU never goes idle.
++ */
++ if (!i915_vma_is_ggtt(vma)) {
++ /*
++ * Since we and our _retire() counterpart can be
++ * called asynchronously, storing a wakeref tracking
++ * handle inside struct i915_vma is not safe, and
++ * there is no other good place for that. Hence,
++ * use untracked variants of intel_gt_pm_get/put().
++ */
++ intel_gt_pm_get_untracked(vma->vm->gt);
++ }
++
++ return 0;
+ }
+
+ static void __i915_vma_retire(struct i915_active *ref)
+ {
+- i915_vma_put(active_to_vma(ref));
++ struct i915_vma *vma = active_to_vma(ref);
++
++ if (!i915_vma_is_ggtt(vma)) {
++ /*
++ * Since we can be called from atomic contexts,
++ * use an async variant of intel_gt_pm_put().
++ */
++ intel_gt_pm_put_async_untracked(vma->vm->gt);
++ }
++
++ i915_vma_put(vma);
+ }
+
+ static struct i915_vma *
+@@ -1403,7 +1434,7 @@ int i915_vma_pin_ww(struct i915_vma *vma
+ struct i915_vma_work *work = NULL;
+ struct dma_fence *moving = NULL;
+ struct i915_vma_resource *vma_res = NULL;
+- intel_wakeref_t wakeref = 0;
++ intel_wakeref_t wakeref;
+ unsigned int bound;
+ int err;
+
+@@ -1423,8 +1454,14 @@ int i915_vma_pin_ww(struct i915_vma *vma
+ if (err)
+ return err;
+
+- if (flags & PIN_GLOBAL)
+- wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
++ /*
++ * In case of a global GTT, we must hold a runtime-pm wakeref
++ * while global PTEs are updated. In other cases, we hold
++ * the rpm reference while the VMA is active. Since runtime
++ * resume may require allocations, which are forbidden inside
++ * vm->mutex, get the first rpm wakeref outside of the mutex.
++ */
++ wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
+
+ if (flags & vma->vm->bind_async_flags) {
+ /* lock VM */
+@@ -1560,8 +1597,7 @@ err_fence:
+ if (work)
+ dma_fence_work_commit_imm(&work->base);
+ err_rpm:
+- if (wakeref)
+- intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
++ intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
+
+ if (moving)
+ dma_fence_put(moving);
--- /dev/null
+From e7c42bf4d320affe37337aa83ae0347832b3f568 Mon Sep 17 00:00:00 2001
+From: Geliang Tang <tanggeliang@kylinos.cn>
+Date: Fri, 8 Mar 2024 23:10:15 +0100
+Subject: selftests: mptcp: use += operator to append strings
+
+From: Geliang Tang <tanggeliang@kylinos.cn>
+
+commit e7c42bf4d320affe37337aa83ae0347832b3f568 upstream.
+
+This patch uses addition assignment operator (+=) to append strings
+instead of duplicating the variable name in mptcp_connect.sh and
+mptcp_join.sh.
+
+This can make the statements shorter.
+
+Note: in mptcp_connect.sh, add a local variable extra in do_transfer to
+save the various extra warning logs, using += to append it. And add a
+new variable tc_info to save various tc info, also using += to append it.
+This can make the code more readable and prepare for the next commit.
+
+Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
+Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
+Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
+Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-8-4f42c347b653@kernel.org
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+[ Conflicts in mptcp_connect.sh: this commit was supposed to be
+ backported before commit 7a1b3490f47e ("mptcp: don't account accept()
+ of non-MPC client as fallback to TCP"). The new condition added by
+ this commit was then not expected, and was in fact at the wrong place
+ in v6.6: in case of issue, the problem would not have been reported
+ correctly. ]
+Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ tools/testing/selftests/net/mptcp/mptcp_connect.sh | 53 +++++++++++----------
+ tools/testing/selftests/net/mptcp/mptcp_join.sh | 30 +++++------
+ 2 files changed, 43 insertions(+), 40 deletions(-)
+
+--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
++++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+@@ -374,15 +374,15 @@ do_transfer()
+ TEST_COUNT=$((TEST_COUNT+1))
+
+ if [ "$rcvbuf" -gt 0 ]; then
+- extra_args="$extra_args -R $rcvbuf"
++ extra_args+=" -R $rcvbuf"
+ fi
+
+ if [ "$sndbuf" -gt 0 ]; then
+- extra_args="$extra_args -S $sndbuf"
++ extra_args+=" -S $sndbuf"
+ fi
+
+ if [ -n "$testmode" ]; then
+- extra_args="$extra_args -m $testmode"
++ extra_args+=" -m $testmode"
+ fi
+
+ if [ -n "$extra_args" ] && $options_log; then
+@@ -503,6 +503,7 @@ do_transfer()
+ check_transfer $cin $sout "file received by server"
+ rets=$?
+
++ local extra=""
+ local stat_synrx_now_l
+ local stat_ackrx_now_l
+ local stat_cookietx_now
+@@ -538,7 +539,7 @@ do_transfer()
+ "${stat_ackrx_now_l}" "${expect_ackrx}" 1>&2
+ rets=1
+ else
+- printf "[ Note ] fallback due to TCP OoO"
++ extra+=" [ Note ] fallback due to TCP OoO"
+ fi
+ fi
+
+@@ -561,13 +562,6 @@ do_transfer()
+ fi
+ fi
+
+- if [ $retc -eq 0 ] && [ $rets -eq 0 ]; then
+- printf "[ OK ]"
+- mptcp_lib_result_pass "${TEST_GROUP}: ${result_msg}"
+- else
+- mptcp_lib_result_fail "${TEST_GROUP}: ${result_msg}"
+- fi
+-
+ if [ ${stat_ooo_now} -eq 0 ] && [ ${stat_tcpfb_last_l} -ne ${stat_tcpfb_now_l} ]; then
+ mptcp_lib_pr_fail "unexpected fallback to TCP"
+ rets=1
+@@ -575,30 +569,39 @@ do_transfer()
+
+ if [ $cookies -eq 2 ];then
+ if [ $stat_cookietx_last -ge $stat_cookietx_now ] ;then
+- printf " WARN: CookieSent: did not advance"
++ extra+=" WARN: CookieSent: did not advance"
+ fi
+ if [ $stat_cookierx_last -ge $stat_cookierx_now ] ;then
+- printf " WARN: CookieRecv: did not advance"
++ extra+=" WARN: CookieRecv: did not advance"
+ fi
+ else
+ if [ $stat_cookietx_last -ne $stat_cookietx_now ] ;then
+- printf " WARN: CookieSent: changed"
++ extra+=" WARN: CookieSent: changed"
+ fi
+ if [ $stat_cookierx_last -ne $stat_cookierx_now ] ;then
+- printf " WARN: CookieRecv: changed"
++ extra+=" WARN: CookieRecv: changed"
+ fi
+ fi
+
+ if [ ${stat_synrx_now_l} -gt ${expect_synrx} ]; then
+- printf " WARN: SYNRX: expect %d, got %d (probably retransmissions)" \
+- "${expect_synrx}" "${stat_synrx_now_l}"
++ extra+=" WARN: SYNRX: expect ${expect_synrx},"
++ extra+=" got ${stat_synrx_now_l} (probably retransmissions)"
+ fi
+ if [ ${stat_ackrx_now_l} -gt ${expect_ackrx} ]; then
+- printf " WARN: ACKRX: expect %d, got %d (probably retransmissions)" \
+- "${expect_ackrx}" "${stat_ackrx_now_l}"
++ extra+=" WARN: ACKRX: expect ${expect_ackrx},"
++ extra+=" got ${stat_ackrx_now_l} (probably retransmissions)"
++ fi
++
++ if [ $retc -eq 0 ] && [ $rets -eq 0 ]; then
++ printf "[ OK ]%s\n" "${extra}"
++ mptcp_lib_result_pass "${TEST_GROUP}: ${result_msg}"
++ else
++ if [ -n "${extra}" ]; then
++ printf "%s\n" "${extra:1}"
++ fi
++ mptcp_lib_result_fail "${TEST_GROUP}: ${result_msg}"
+ fi
+
+- echo
+ cat "$capout"
+ [ $retc -eq 0 ] && [ $rets -eq 0 ]
+ }
+@@ -924,8 +927,8 @@ mptcp_lib_result_code "${ret}" "ping tes
+ stop_if_error "Could not even run ping tests"
+
+ [ -n "$tc_loss" ] && tc -net "$ns2" qdisc add dev ns2eth3 root netem loss random $tc_loss delay ${tc_delay}ms
+-echo -n "INFO: Using loss of $tc_loss "
+-test "$tc_delay" -gt 0 && echo -n "delay $tc_delay ms "
++tc_info="loss of $tc_loss "
++test "$tc_delay" -gt 0 && tc_info+="delay $tc_delay ms "
+
+ reorder_delay=$((tc_delay / 4))
+
+@@ -936,17 +939,17 @@ if [ -z "${tc_reorder}" ]; then
+
+ if [ $reorder_delay -gt 0 ] && [ $reorder1 -lt 100 ] && [ $reorder2 -gt 0 ]; then
+ tc_reorder="reorder ${reorder1}% ${reorder2}%"
+- echo -n "$tc_reorder with delay ${reorder_delay}ms "
++ tc_info+="$tc_reorder with delay ${reorder_delay}ms "
+ fi
+ elif [ "$tc_reorder" = "0" ];then
+ tc_reorder=""
+ elif [ "$reorder_delay" -gt 0 ];then
+ # reordering requires some delay
+ tc_reorder="reorder $tc_reorder"
+- echo -n "$tc_reorder with delay ${reorder_delay}ms "
++ tc_info+="$tc_reorder with delay ${reorder_delay}ms "
+ fi
+
+-echo "on ns3eth4"
++echo "INFO: Using ${tc_info}on ns3eth4"
+
+ tc -net "$ns3" qdisc add dev ns3eth4 root netem delay ${reorder_delay}ms $tc_reorder
+
+--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
++++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
+@@ -822,18 +822,18 @@ pm_nl_check_endpoint()
+ line="${line% }"
+ # the dump order is: address id flags port dev
+ [ -n "$addr" ] && expected_line="$addr"
+- expected_line="$expected_line $id"
+- [ -n "$_flags" ] && expected_line="$expected_line ${_flags//","/" "}"
+- [ -n "$dev" ] && expected_line="$expected_line $dev"
+- [ -n "$port" ] && expected_line="$expected_line $port"
++ expected_line+=" $id"
++ [ -n "$_flags" ] && expected_line+=" ${_flags//","/" "}"
++ [ -n "$dev" ] && expected_line+=" $dev"
++ [ -n "$port" ] && expected_line+=" $port"
+ else
+ line=$(ip netns exec $ns ./pm_nl_ctl get $_id)
+ # the dump order is: id flags dev address port
+ expected_line="$id"
+- [ -n "$flags" ] && expected_line="$expected_line $flags"
+- [ -n "$dev" ] && expected_line="$expected_line $dev"
+- [ -n "$addr" ] && expected_line="$expected_line $addr"
+- [ -n "$_port" ] && expected_line="$expected_line $_port"
++ [ -n "$flags" ] && expected_line+=" $flags"
++ [ -n "$dev" ] && expected_line+=" $dev"
++ [ -n "$addr" ] && expected_line+=" $addr"
++ [ -n "$_port" ] && expected_line+=" $_port"
+ fi
+ if [ "$line" = "$expected_line" ]; then
+ print_ok
+@@ -1256,7 +1256,7 @@ chk_csum_nr()
+ print_check "sum"
+ count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtDataCsumErr")
+ if [ "$count" != "$csum_ns1" ]; then
+- extra_msg="$extra_msg ns1=$count"
++ extra_msg+=" ns1=$count"
+ fi
+ if [ -z "$count" ]; then
+ print_skip
+@@ -1269,7 +1269,7 @@ chk_csum_nr()
+ print_check "csum"
+ count=$(mptcp_lib_get_counter ${ns2} "MPTcpExtDataCsumErr")
+ if [ "$count" != "$csum_ns2" ]; then
+- extra_msg="$extra_msg ns2=$count"
++ extra_msg+=" ns2=$count"
+ fi
+ if [ -z "$count" ]; then
+ print_skip
+@@ -1313,7 +1313,7 @@ chk_fail_nr()
+ print_check "ftx"
+ count=$(mptcp_lib_get_counter ${ns_tx} "MPTcpExtMPFailTx")
+ if [ "$count" != "$fail_tx" ]; then
+- extra_msg="$extra_msg,tx=$count"
++ extra_msg+=",tx=$count"
+ fi
+ if [ -z "$count" ]; then
+ print_skip
+@@ -1327,7 +1327,7 @@ chk_fail_nr()
+ print_check "failrx"
+ count=$(mptcp_lib_get_counter ${ns_rx} "MPTcpExtMPFailRx")
+ if [ "$count" != "$fail_rx" ]; then
+- extra_msg="$extra_msg,rx=$count"
++ extra_msg+=",rx=$count"
+ fi
+ if [ -z "$count" ]; then
+ print_skip
+@@ -1362,7 +1362,7 @@ chk_fclose_nr()
+ if [ -z "$count" ]; then
+ print_skip
+ elif [ "$count" != "$fclose_tx" ]; then
+- extra_msg="$extra_msg,tx=$count"
++ extra_msg+=",tx=$count"
+ fail_test "got $count MP_FASTCLOSE[s] TX expected $fclose_tx"
+ else
+ print_ok
+@@ -1373,7 +1373,7 @@ chk_fclose_nr()
+ if [ -z "$count" ]; then
+ print_skip
+ elif [ "$count" != "$fclose_rx" ]; then
+- extra_msg="$extra_msg,rx=$count"
++ extra_msg+=",rx=$count"
+ fail_test "got $count MP_FASTCLOSE[s] RX expected $fclose_rx"
+ else
+ print_ok
+@@ -1742,7 +1742,7 @@ chk_rm_nr()
+ count=$((count + cnt))
+ if [ "$count" != "$rm_subflow_nr" ]; then
+ suffix="$count in [$rm_subflow_nr:$((rm_subflow_nr*2))]"
+- extra_msg="$extra_msg simult"
++ extra_msg+=" simult"
+ fi
+ if [ $count -ge "$rm_subflow_nr" ] && \
+ [ "$count" -le "$((rm_subflow_nr *2 ))" ]; then