--- /dev/null
+From stable+bounces-246936-greg=kroah.com@vger.kernel.org Wed May 13 19:05:49 2026
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 13 May 2026 12:33:14 -0400
+Subject: cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated
+To: stable@vger.kernel.org
+Cc: Tejun Heo <tj@kernel.org>, Martin Pitt <martin@piware.de>, Sebastian Andrzej Siewior <bigeasy@linutronix.de>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20260513163314.3807064-2-sashal@kernel.org>
+
+From: Tejun Heo <tj@kernel.org>
+
+[ Upstream commit 93618edf753838a727dbff63c7c291dee22d656b ]
+
+A chain of commits going back to v7.0 reworked rmdir to satisfy the
+controller invariant that a subsystem's ->css_offline() must not run while
+tasks are still doing kernel-side work in the cgroup.
+
+[1] d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
+[2] a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup")
+[3] 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
+[4] 4c56a8ac6869 ("cgroup: Fix cgroup_drain_dying() testing the wrong condition")
+[5] 13e786b64bd3 ("cgroup: Increment nr_dying_subsys_* from rmdir context")
+
+[1] moved task cset unlink from do_exit() to finish_task_switch() so a
+task's cset link drops only after the task has fully stopped scheduling.
+That made tasks past exit_signals() linger on cset->tasks until their final
+context switch, which led to a series of problems as what userspace
+expected to see after rmdir diverged from what the kernel needed to
+wait for. [2]-[5]
+tried to bridge that divergence: [2] filtered the exiting tasks from
+cgroup.procs; [3] had rmdir(2) sleep in TASK_UNINTERRUPTIBLE for them; [4]
+fixed the wait's condition; [5] made nr_dying_subsys_* visible
+synchronously.
+
+The cgroup_drain_dying() wait in [3] turned out to be a dead end. When the
+rmdir caller is also the reaper of a zombie that pins a pidns teardown (e.g.
+host PID 1 systemd reaping orphan pids that were re-parented to it during
+the same teardown), rmdir blocks in TASK_UNINTERRUPTIBLE waiting for
+those pids to be freed, the pids can't be freed because PID 1 is the
+reaper and it's stuck in rmdir, and the system A-A deadlocks. No
+internal lock ordering breaks
+this; the wait itself is the bug.
+
+The css killing side that drove the original reorder, however, can be made
+cleanly asynchronous: ->css_offline() is already async, run from
+css_killed_work_fn() driven by percpu_ref_kill_and_confirm(). The fix is to
+make that chain start only after all tasks have left the cgroup. rmdir's
+user-visible side then returns as soon as cgroup.procs and friends are
+empty, while ->css_offline() still runs only after the cgroup is fully
+drained.
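+
+Condensed from the patch below, the resulting flow is:
+
+  /* rmdir context, cgroup_mutex held */
+  cgroup_destroy_locked(cgrp);          /* user-visible teardown + kill_css_sync() */
+  if (!cgroup_is_populated(cgrp))
+          cgroup_finish_destroy(cgrp);  /* kill_css_finish() kicks the percpu_ref kill */
+
+  /* otherwise, once the last task leaves the subtree: */
+  cgroup_update_populated()
+    -> queue_work(cgroup_offline_wq, &cgrp->finish_destroy_work)
+      -> cgroup_finish_destroy(cgrp)    /* -> css_killed_work_fn() -> ->css_offline() */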
+
+Verified by the original reproducer (pidns teardown + zombie reaper, runs
+under vng) which hangs vanilla and succeeds here, and by per-commit
+deterministic repros for [2], [3], [4], [5] with a boot parameter that
+widens the post-exit_signals() window so each state is reliably reachable.
+Some stress tests on top of that.
+
+cgroup_apply_control_disable() has a pre-existing race of the same shape:
+when a controller is disabled via subtree_control, kill_css() ran
+synchronously while tasks past exit_signals() could still be linked to
+the cgroup's csets, and ->css_offline() could fire before they drained.
+This patch preserves the existing synchronous behavior at that call site
+(kill_css_sync() + kill_css_finish() back-to-back) and a follow-up patch
+will defer kill_css_finish() there using a per-css trigger.
+
+This seems like the right approach and I don't see problems with it. The
+changes are somewhat invasive but not excessively so, so backporting to
+-stable should be okay. If something does turn out to be wrong, the fallback
+is to revert the entire chain ([1]-[5]) and rework in the development branch
+instead.
+
+v2: Pin cgrp across the deferred destroy work with explicit
+ cgroup_get()/cgroup_put() around queue_work() and the work_fn. v1
+ wasn't actually broken (ordered cgroup_offline_wq + queue_work order
+ in cgroup_task_dead() saved it) but the explicit ref removes the
+ dependency on those non-obvious invariants. Also note the
+ pre-existing cgroup_apply_control_disable() race in the description;
+ a follow-up will defer kill_css_finish() there.
+
+Fixes: 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
+Cc: stable@vger.kernel.org # v7.0+
+Reported-and-tested-by: Martin Pitt <martin@piware.de>
+Link: https://lore.kernel.org/all/afHNg2VX2jy9bW7y@piware.de/
+Link: https://lore.kernel.org/all/35e0670adb4abeab13da2c321582af9f@kernel.org/
+Signed-off-by: Tejun Heo <tj@kernel.org>
+Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/linux/cgroup-defs.h | 4
+ kernel/cgroup/cgroup.c | 250 ++++++++++++++++++++------------------------
+ 2 files changed, 119 insertions(+), 135 deletions(-)
+
+--- a/include/linux/cgroup-defs.h
++++ b/include/linux/cgroup-defs.h
+@@ -609,8 +609,8 @@ struct cgroup {
+ /* used to wait for offlining of csses */
+ wait_queue_head_t offline_waitq;
+
+- /* used by cgroup_rmdir() to wait for dying tasks to leave */
+- wait_queue_head_t dying_populated_waitq;
++ /* defers killing csses after removal until cgroup is depopulated */
++ struct work_struct finish_destroy_work;
+
+ /* used to schedule release agent */
+ struct work_struct release_agent_work;
+--- a/kernel/cgroup/cgroup.c
++++ b/kernel/cgroup/cgroup.c
+@@ -278,10 +278,12 @@ static void cgroup_finalize_control(stru
+ static void css_task_iter_skip(struct css_task_iter *it,
+ struct task_struct *task);
+ static int cgroup_destroy_locked(struct cgroup *cgrp);
++static void cgroup_finish_destroy(struct cgroup *cgrp);
++static void kill_css_sync(struct cgroup_subsys_state *css);
++static void kill_css_finish(struct cgroup_subsys_state *css);
+ static struct cgroup_subsys_state *css_create(struct cgroup *cgrp,
+ struct cgroup_subsys *ss);
+ static void css_release(struct percpu_ref *ref);
+-static void kill_css(struct cgroup_subsys_state *css);
+ static int cgroup_addrm_files(struct cgroup_subsys_state *css,
+ struct cgroup *cgrp, struct cftype cfts[],
+ bool is_add);
+@@ -858,6 +860,16 @@ static void cgroup_update_populated(stru
+ if (was_populated == cgroup_is_populated(cgrp))
+ break;
+
++ /*
++ * Subtree just emptied below an offlined cgrp. Fire deferred
++ * destroy. The transition is one-shot.
++ */
++ if (was_populated && !css_is_online(&cgrp->self)) {
++ cgroup_get(cgrp);
++ WARN_ON_ONCE(!queue_work(cgroup_offline_wq,
++ &cgrp->finish_destroy_work));
++ }
++
+ cgroup1_check_for_release(cgrp);
+ TRACE_CGROUP_PATH(notify_populated, cgrp,
+ cgroup_is_populated(cgrp));
+@@ -2100,6 +2112,16 @@ static int cgroup_reconfigure(struct fs_
+ return 0;
+ }
+
++static void cgroup_finish_destroy_work_fn(struct work_struct *work)
++{
++ struct cgroup *cgrp = container_of(work, struct cgroup, finish_destroy_work);
++
++ cgroup_lock();
++ cgroup_finish_destroy(cgrp);
++ cgroup_unlock();
++ cgroup_put(cgrp);
++}
++
+ static void init_cgroup_housekeeping(struct cgroup *cgrp)
+ {
+ struct cgroup_subsys *ss;
+@@ -2126,7 +2148,7 @@ static void init_cgroup_housekeeping(str
+ #endif
+
+ init_waitqueue_head(&cgrp->offline_waitq);
+- init_waitqueue_head(&cgrp->dying_populated_waitq);
++ INIT_WORK(&cgrp->finish_destroy_work, cgroup_finish_destroy_work_fn);
+ INIT_WORK(&cgrp->release_agent_work, cgroup1_release_agent);
+ }
+
+@@ -3436,7 +3458,8 @@ static void cgroup_apply_control_disable
+
+ if (css->parent &&
+ !(cgroup_ss_mask(dsct) & (1 << ss->id))) {
+- kill_css(css);
++ kill_css_sync(css);
++ kill_css_finish(css);
+ } else if (!css_visible(css)) {
+ css_clear_dir(css);
+ if (ss->css_reset)
+@@ -5558,7 +5581,7 @@ static struct cftype cgroup_psi_files[]
+ * css destruction is four-stage process.
+ *
+ * 1. Destruction starts. Killing of the percpu_ref is initiated.
+- * Implemented in kill_css().
++ * Implemented in kill_css_finish().
+ *
+ * 2. When the percpu_ref is confirmed to be visible as killed on all CPUs
+ * and thus css_tryget_online() is guaranteed to fail, the css can be
+@@ -6037,7 +6060,7 @@ out_unlock:
+ /*
+ * This is called when the refcnt of a css is confirmed to be killed.
+ * css_tryget_online() is now guaranteed to fail. Tell the subsystem to
+- * initiate destruction and put the css ref from kill_css().
++ * initiate destruction and put the css ref from kill_css_finish().
+ */
+ static void css_killed_work_fn(struct work_struct *work)
+ {
+@@ -6069,15 +6092,12 @@ static void css_killed_ref_fn(struct per
+ }
+
+ /**
+- * kill_css - destroy a css
+- * @css: css to destroy
++ * kill_css_sync - synchronous half of css teardown
++ * @css: css being killed
+ *
+- * This function initiates destruction of @css by removing cgroup interface
+- * files and putting its base reference. ->css_offline() will be invoked
+- * asynchronously once css_tryget_online() is guaranteed to fail and when
+- * the reference count reaches zero, @css will be released.
++ * See cgroup_destroy_locked().
+ */
+-static void kill_css(struct cgroup_subsys_state *css)
++static void kill_css_sync(struct cgroup_subsys_state *css)
+ {
+ struct cgroup_subsys *ss = css->ss;
+
+@@ -6100,24 +6120,6 @@ static void kill_css(struct cgroup_subsy
+ */
+ css_clear_dir(css);
+
+- /*
+- * Killing would put the base ref, but we need to keep it alive
+- * until after ->css_offline().
+- */
+- css_get(css);
+-
+- /*
+- * cgroup core guarantees that, by the time ->css_offline() is
+- * invoked, no new css reference will be given out via
+- * css_tryget_online(). We can't simply call percpu_ref_kill() and
+- * proceed to offlining css's because percpu_ref_kill() doesn't
+- * guarantee that the ref is seen as killed on all CPUs on return.
+- *
+- * Use percpu_ref_kill_and_confirm() to get notifications as each
+- * css is confirmed to be seen as killed on all CPUs.
+- */
+- percpu_ref_kill_and_confirm(&css->refcnt, css_killed_ref_fn);
+-
+ css->cgroup->nr_dying_subsys[ss->id]++;
+ /*
+ * Parent css and cgroup cannot be freed until after the freeing
+@@ -6130,44 +6132,88 @@ static void kill_css(struct cgroup_subsy
+ }
+
+ /**
+- * cgroup_destroy_locked - the first stage of cgroup destruction
++ * kill_css_finish - deferred half of css teardown
++ * @css: css being killed
++ *
++ * See cgroup_destroy_locked().
++ */
++static void kill_css_finish(struct cgroup_subsys_state *css)
++{
++ lockdep_assert_held(&cgroup_mutex);
++
++ /*
++ * Skip on re-entry: cgroup_apply_control_disable() may have killed @css
++ * earlier. cgroup_destroy_locked() can still walk it because
++ * offline_css() (which NULLs cgrp->subsys[ssid]) runs async.
++ */
++ if (percpu_ref_is_dying(&css->refcnt))
++ return;
++
++ /*
++ * Killing would put the base ref, but we need to keep it alive until
++ * after ->css_offline().
++ */
++ css_get(css);
++
++ /*
++ * cgroup core guarantees that, by the time ->css_offline() is invoked,
++ * no new css reference will be given out via css_tryget_online(). We
++ * can't simply call percpu_ref_kill() and proceed to offlining css's
++ * because percpu_ref_kill() doesn't guarantee that the ref is seen as
++ * killed on all CPUs on return.
++ *
++ * Use percpu_ref_kill_and_confirm() to get notifications as each css is
++ * confirmed to be seen as killed on all CPUs.
++ */
++ percpu_ref_kill_and_confirm(&css->refcnt, css_killed_ref_fn);
++}
++
++/**
++ * cgroup_destroy_locked - destroy @cgrp (called on rmdir)
+ * @cgrp: cgroup to be destroyed
+ *
+- * css's make use of percpu refcnts whose killing latency shouldn't be
+- * exposed to userland and are RCU protected. Also, cgroup core needs to
+- * guarantee that css_tryget_online() won't succeed by the time
+- * ->css_offline() is invoked. To satisfy all the requirements,
+- * destruction is implemented in the following two steps.
+- *
+- * s1. Verify @cgrp can be destroyed and mark it dying. Remove all
+- * userland visible parts and start killing the percpu refcnts of
+- * css's. Set up so that the next stage will be kicked off once all
+- * the percpu refcnts are confirmed to be killed.
+- *
+- * s2. Invoke ->css_offline(), mark the cgroup dead and proceed with the
+- * rest of destruction. Once all cgroup references are gone, the
+- * cgroup is RCU-freed.
+- *
+- * This function implements s1. After this step, @cgrp is gone as far as
+- * the userland is concerned and a new cgroup with the same name may be
+- * created. As cgroup doesn't care about the names internally, this
+- * doesn't cause any problem.
++ * Tear down @cgrp on behalf of rmdir. Constraints:
++ *
++ * - Userspace: rmdir must succeed when cgroup.procs and friends are empty.
++ *
++ * - Kernel: subsystem ->css_offline() must not run while any task in @cgrp's
++ * subtree is still doing kernel work. A task hidden from cgroup.procs (past
++ * exit_signals() with signal->live cleared) can still schedule, allocate, and
++ * consume resources until its final context switch. Dying descendants in the
++ * subtree can host such tasks too.
++ *
++ * - Kernel: css_tryget_online() must fail by the time ->css_offline() runs.
++ *
++ * The destruction runs in three parts:
++ *
++ * - This function: synchronous user-visible state teardown plus kill_css_sync()
++ * on each subsystem css.
++ *
++ * - cgroup_finish_destroy(): kicks the percpu_ref kill via kill_css_finish() on
++ * each subsystem css. Fires once @cgrp's subtree is fully drained, either
++ * inline here or from cgroup_update_populated().
++ *
++ * - The percpu_ref kill chain: css_killed_ref_fn -> css_killed_work_fn ->
++ * ->css_offline() -> release/free.
++ *
++ * Return 0 on success, -EBUSY if a userspace-visible task or an online child
++ * remains.
+ */
+ static int cgroup_destroy_locked(struct cgroup *cgrp)
+- __releases(&cgroup_mutex) __acquires(&cgroup_mutex)
+ {
+ struct cgroup *tcgrp, *parent = cgroup_parent(cgrp);
+ struct cgroup_subsys_state *css;
+ struct cgrp_cset_link *link;
++ struct css_task_iter it;
++ struct task_struct *task;
+ int ssid, ret;
+
+ lockdep_assert_held(&cgroup_mutex);
+
+- /*
+- * Only migration can raise populated from zero and we're already
+- * holding cgroup_mutex.
+- */
+- if (cgroup_is_populated(cgrp))
++ css_task_iter_start(&cgrp->self, 0, &it);
++ task = css_task_iter_next(&it);
++ css_task_iter_end(&it);
++ if (task)
+ return -EBUSY;
+
+ /*
+@@ -6191,9 +6237,8 @@ static int cgroup_destroy_locked(struct
+ link->cset->dead = true;
+ spin_unlock_irq(&css_set_lock);
+
+- /* initiate massacre of all css's */
+ for_each_css(css, ssid, cgrp)
+- kill_css(css);
++ kill_css_sync(css);
+
+ /* clear and remove @cgrp dir, @cgrp has an extra ref on its kn */
+ css_clear_dir(&cgrp->self);
+@@ -6224,79 +6269,27 @@ static int cgroup_destroy_locked(struct
+ /* put the base reference */
+ percpu_ref_kill(&cgrp->self.refcnt);
+
++ if (!cgroup_is_populated(cgrp))
++ cgroup_finish_destroy(cgrp);
++
+ return 0;
+ };
+
+ /**
+- * cgroup_drain_dying - wait for dying tasks to leave before rmdir
+- * @cgrp: the cgroup being removed
++ * cgroup_finish_destroy - deferred half of @cgrp destruction
++ * @cgrp: cgroup whose subtree just became empty
+ *
+- * cgroup.procs and cgroup.threads use css_task_iter which filters out
+- * PF_EXITING tasks so that userspace doesn't see tasks that have already been
+- * reaped via waitpid(). However, cgroup_has_tasks() - which tests whether the
+- * cgroup has non-empty css_sets - is only updated when dying tasks pass through
+- * cgroup_task_dead() in finish_task_switch(). This creates a window where
+- * cgroup.procs reads empty but cgroup_has_tasks() is still true, making rmdir
+- * fail with -EBUSY from cgroup_destroy_locked() even though userspace sees no
+- * tasks.
+- *
+- * This function aligns cgroup_has_tasks() with what userspace can observe. If
+- * cgroup_has_tasks() but the task iterator sees nothing (all remaining tasks are
+- * PF_EXITING), we wait for cgroup_task_dead() to finish processing them. As the
+- * window between PF_EXITING and cgroup_task_dead() is short, the wait is brief.
+- *
+- * This function only concerns itself with this cgroup's own dying tasks.
+- * Whether the cgroup has children is cgroup_destroy_locked()'s problem.
+- *
+- * Each cgroup_task_dead() kicks the waitqueue via cset->cgrp_links, and we
+- * retry the full check from scratch.
+- *
+- * Must be called with cgroup_mutex held.
++ * See cgroup_destroy_locked() for the rationale.
+ */
+-static int cgroup_drain_dying(struct cgroup *cgrp)
+- __releases(&cgroup_mutex) __acquires(&cgroup_mutex)
++static void cgroup_finish_destroy(struct cgroup *cgrp)
+ {
+- struct css_task_iter it;
+- struct task_struct *task;
+- DEFINE_WAIT(wait);
++ struct cgroup_subsys_state *css;
++ int ssid;
+
+ lockdep_assert_held(&cgroup_mutex);
+-retry:
+- if (!cgroup_has_tasks(cgrp))
+- return 0;
+
+- /* Same iterator as cgroup.threads - if any task is visible, it's busy */
+- css_task_iter_start(&cgrp->self, 0, &it);
+- task = css_task_iter_next(&it);
+- css_task_iter_end(&it);
+-
+- if (task)
+- return -EBUSY;
+-
+- /*
+- * All remaining tasks are PF_EXITING and will pass through
+- * cgroup_task_dead() shortly. Wait for a kick and retry.
+- *
+- * cgroup_has_tasks() can't transition from false to true while we're
+- * holding cgroup_mutex, but the true to false transition happens
+- * under css_set_lock (via cgroup_task_dead()). We must retest and
+- * prepare_to_wait() under css_set_lock. Otherwise, the transition
+- * can happen between our first test and prepare_to_wait(), and we
+- * sleep with no one to wake us.
+- */
+- spin_lock_irq(&css_set_lock);
+- if (!cgroup_has_tasks(cgrp)) {
+- spin_unlock_irq(&css_set_lock);
+- return 0;
+- }
+- prepare_to_wait(&cgrp->dying_populated_waitq, &wait,
+- TASK_UNINTERRUPTIBLE);
+- spin_unlock_irq(&css_set_lock);
+- mutex_unlock(&cgroup_mutex);
+- schedule();
+- finish_wait(&cgrp->dying_populated_waitq, &wait);
+- mutex_lock(&cgroup_mutex);
+- goto retry;
++ for_each_css(css, ssid, cgrp)
++ kill_css_finish(css);
+ }
+
+ int cgroup_rmdir(struct kernfs_node *kn)
+@@ -6308,12 +6301,9 @@ int cgroup_rmdir(struct kernfs_node *kn)
+ if (!cgrp)
+ return 0;
+
+- ret = cgroup_drain_dying(cgrp);
+- if (!ret) {
+- ret = cgroup_destroy_locked(cgrp);
+- if (!ret)
+- TRACE_CGROUP_PATH(rmdir, cgrp);
+- }
++ ret = cgroup_destroy_locked(cgrp);
++ if (!ret)
++ TRACE_CGROUP_PATH(rmdir, cgrp);
+
+ cgroup_kn_unlock(kn);
+ return ret;
+@@ -7073,7 +7063,6 @@ void cgroup_task_exit(struct task_struct
+
+ static void do_cgroup_task_dead(struct task_struct *tsk)
+ {
+- struct cgrp_cset_link *link;
+ struct css_set *cset;
+ unsigned long flags;
+
+@@ -7087,11 +7076,6 @@ static void do_cgroup_task_dead(struct t
+ if (thread_group_leader(tsk) && atomic_read(&tsk->signal->live))
+ list_add_tail(&tsk->cg_list, &cset->dying_tasks);
+
+- /* kick cgroup_drain_dying() waiters, see cgroup_rmdir() */
+- list_for_each_entry(link, &cset->cgrp_links, cgrp_link)
+- if (waitqueue_active(&link->cgrp->dying_populated_waitq))
+- wake_up(&link->cgrp->dying_populated_waitq);
+-
+ if (dl_task(tsk))
+ dec_dl_tasks_cs(tsk);
+
--- /dev/null
+From stable+bounces-246935-greg=kroah.com@vger.kernel.org Wed May 13 19:05:07 2026
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 13 May 2026 12:33:13 -0400
+Subject: cgroup: Increment nr_dying_subsys_* from rmdir context
+To: stable@vger.kernel.org
+Cc: Petr Malat <oss@malat.biz>, Tejun Heo <tj@kernel.org>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20260513163314.3807064-1-sashal@kernel.org>
+
+From: Petr Malat <oss@malat.biz>
+
+[ Upstream commit 13e786b64bd3fd81c7eb22aa32bf8305c32f2ccf ]
+
+Incrementing nr_dying_subsys_* in offline_css(), which is executed by
+the cgroup_offline_wq worker, leads to a race where the user can see a
+value of 0 when reading cgroup.stat after calling rmdir but before the worker
+executes. This makes the user wrongly expect resources released by the
+removed cgroup to be available for a new assignment.
+
+Increment nr_dying_subsys_* from kill_css(), which is called from the
+cgroup_rmdir() context.
+
+Fixes: ab0312526867 ("cgroup: Show # of subsystem CSSes in cgroup.stat")
+Signed-off-by: Petr Malat <oss@malat.biz>
+Signed-off-by: Tejun Heo <tj@kernel.org>
+Stable-dep-of: 93618edf7538 ("cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated")
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/cgroup/cgroup.c | 22 ++++++++++++----------
+ 1 file changed, 12 insertions(+), 10 deletions(-)
+
+--- a/kernel/cgroup/cgroup.c
++++ b/kernel/cgroup/cgroup.c
+@@ -5768,16 +5768,6 @@ static void offline_css(struct cgroup_su
+ RCU_INIT_POINTER(css->cgroup->subsys[ss->id], NULL);
+
+ wake_up_all(&css->cgroup->offline_waitq);
+-
+- css->cgroup->nr_dying_subsys[ss->id]++;
+- /*
+- * Parent css and cgroup cannot be freed until after the freeing
+- * of child css, see css_free_rwork_fn().
+- */
+- while ((css = css->parent)) {
+- css->nr_descendants--;
+- css->cgroup->nr_dying_subsys[ss->id]++;
+- }
+ }
+
+ /**
+@@ -6089,6 +6079,8 @@ static void css_killed_ref_fn(struct per
+ */
+ static void kill_css(struct cgroup_subsys_state *css)
+ {
++ struct cgroup_subsys *ss = css->ss;
++
+ lockdep_assert_held(&cgroup_mutex);
+
+ if (css->flags & CSS_DYING)
+@@ -6125,6 +6117,16 @@ static void kill_css(struct cgroup_subsy
+ * css is confirmed to be seen as killed on all CPUs.
+ */
+ percpu_ref_kill_and_confirm(&css->refcnt, css_killed_ref_fn);
++
++ css->cgroup->nr_dying_subsys[ss->id]++;
++ /*
++ * Parent css and cgroup cannot be freed until after the freeing
++ * of child css, see css_free_rwork_fn().
++ */
++ while ((css = css->parent)) {
++ css->nr_descendants--;
++ css->cgroup->nr_dying_subsys[ss->id]++;
++ }
+ }
+
+ /**
--- /dev/null
+From stable+bounces-247230-greg=kroah.com@vger.kernel.org Thu May 14 17:11:10 2026
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 14 May 2026 11:08:25 -0400
+Subject: EDAC/versalnet: Fix device name memory leak
+To: stable@vger.kernel.org
+Cc: Prasanna Kumar T S M <ptsm@linux.microsoft.com>, "Borislav Petkov (AMD)" <bp@alien8.de>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20260514150825.274588-2-sashal@kernel.org>
+
+From: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
+
+[ Upstream commit 8cf5dd235eff6008cb04c3d8064d2acfa90616f1 ]
+
+The device name allocated via kzalloc() in init_one_mc() is assigned to
+dev->init_name but never freed on the normal removal path. device_register()
+copies init_name and then sets dev->init_name to NULL, so the name
+pointer becomes unreachable from the device, leaking the allocation.
+
+Use a stack-local char array for the name instead of allocating it with
+kzalloc().
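+
+A minimal sketch of the resulting pattern (device_register() duplicates
+init_name into the kobject name, so the stack buffer is safe):
+
+  char name[MC_NAME_LEN];
+
+  snprintf(name, sizeof(name), "versal-net-ddrmc5-edac-%d", i);
+  dev->init_name = name;
+  rc = device_register(dev);    /* copies the name, then clears init_name */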
+
+Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
+Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
+Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
+Cc: stable@vger.kernel.org
+Link: https://patch.msgid.link/20260401111856.2342975-1-ptsm@linux.microsoft.com
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/edac/versalnet_edac.c | 10 ++--------
+ 1 file changed, 2 insertions(+), 8 deletions(-)
+
+--- a/drivers/edac/versalnet_edac.c
++++ b/drivers/edac/versalnet_edac.c
+@@ -777,9 +777,9 @@ static int init_one_mc(struct mc_priv *p
+ u32 num_chans, rank, dwidth, config;
+ struct edac_mc_layer layers[2];
+ struct mem_ctl_info *mci;
++ char name[MC_NAME_LEN];
+ struct device *dev;
+ enum dev_type dt;
+- char *name;
+ int rc;
+
+ config = priv->adec[CONF + i * ADEC_NUM];
+@@ -813,13 +813,9 @@ static int init_one_mc(struct mc_priv *p
+ layers[1].is_virt_csrow = false;
+
+ rc = -ENOMEM;
+- name = kzalloc(MC_NAME_LEN, GFP_KERNEL);
+- if (!name)
+- return rc;
+-
+ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+ if (!dev)
+- goto err_name_free;
++ return rc;
+
+ mci = edac_mc_alloc(i, ARRAY_SIZE(layers), layers, sizeof(struct mc_priv));
+ if (!mci) {
+@@ -858,8 +854,6 @@ err_mc_free:
+ edac_mc_free(mci);
+ err_dev_free:
+ kfree(dev);
+-err_name_free:
+- kfree(name);
+
+ return rc;
+ }
--- /dev/null
+From stable+bounces-247229-greg=kroah.com@vger.kernel.org Thu May 14 17:11:14 2026
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 14 May 2026 11:08:24 -0400
+Subject: EDAC/versalnet: Refactor memory controller initialization and cleanup
+To: stable@vger.kernel.org
+Cc: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>, "Borislav Petkov (AMD)" <bp@alien8.de>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20260514150825.274588-1-sashal@kernel.org>
+
+From: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
+
+[ Upstream commit 62a9fc50e8d947601ea3484e732b1a65a0a54b96 ]
+
+Simplify the initialization and cleanup flow for Versal Net DDRMC
+controllers in the EDAC driver by carving out the single controller init
+into a separate function, which allows for much better and more
+readable error handling and unwinding.
+
+ [ bp:
+ - do the kzalloc allocations first
+ - "publish" the structures only after they've been initialized
+ properly so that you don't need to unwind unnecessarily when
+ it fails later
+ - remove_versalnet() is now trivial
+ ]
+
+Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
+Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
+Link: https://patch.msgid.link/20251104093932.3838876-1-shubhrajyoti.datta@amd.com
+Stable-dep-of: 8cf5dd235eff ("EDAC/versalnet: Fix device name memory leak")
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/edac/versalnet_edac.c | 174 +++++++++++++++++++++++-------------------
+ 1 file changed, 97 insertions(+), 77 deletions(-)
+
+--- a/drivers/edac/versalnet_edac.c
++++ b/drivers/edac/versalnet_edac.c
+@@ -70,6 +70,8 @@
+ #define XDDR5_BUS_WIDTH_32 1
+ #define XDDR5_BUS_WIDTH_16 2
+
++#define MC_NAME_LEN 32
++
+ /**
+ * struct ecc_error_info - ECC error log information.
+ * @burstpos: Burst position.
+@@ -760,7 +762,17 @@ static void versal_edac_release(struct d
+ kfree(dev);
+ }
+
+-static int init_versalnet(struct mc_priv *priv, struct platform_device *pdev)
++static void remove_one_mc(struct mc_priv *priv, int i)
++{
++ struct mem_ctl_info *mci;
++
++ mci = priv->mci[i];
++ device_unregister(mci->pdev);
++ edac_mc_del_mc(mci->pdev);
++ edac_mc_free(mci);
++}
++
++static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i)
+ {
+ u32 num_chans, rank, dwidth, config;
+ struct edac_mc_layer layers[2];
+@@ -768,102 +780,110 @@ static int init_versalnet(struct mc_priv
+ struct device *dev;
+ enum dev_type dt;
+ char *name;
+- int rc, i;
++ int rc;
+
+- for (i = 0; i < NUM_CONTROLLERS; i++) {
+- config = priv->adec[CONF + i * ADEC_NUM];
+- num_chans = FIELD_GET(MC5_NUM_CHANS_MASK, config);
+- rank = 1 << FIELD_GET(MC5_RANK_MASK, config);
+- dwidth = FIELD_GET(MC5_BUS_WIDTH_MASK, config);
+-
+- switch (dwidth) {
+- case XDDR5_BUS_WIDTH_16:
+- dt = DEV_X16;
+- break;
+- case XDDR5_BUS_WIDTH_32:
+- dt = DEV_X32;
+- break;
+- case XDDR5_BUS_WIDTH_64:
+- dt = DEV_X64;
+- break;
+- default:
+- dt = DEV_UNKNOWN;
+- }
++ config = priv->adec[CONF + i * ADEC_NUM];
++ num_chans = FIELD_GET(MC5_NUM_CHANS_MASK, config);
++ rank = 1 << FIELD_GET(MC5_RANK_MASK, config);
++ dwidth = FIELD_GET(MC5_BUS_WIDTH_MASK, config);
++
++ switch (dwidth) {
++ case XDDR5_BUS_WIDTH_16:
++ dt = DEV_X16;
++ break;
++ case XDDR5_BUS_WIDTH_32:
++ dt = DEV_X32;
++ break;
++ case XDDR5_BUS_WIDTH_64:
++ dt = DEV_X64;
++ break;
++ default:
++ dt = DEV_UNKNOWN;
++ }
+
+- if (dt == DEV_UNKNOWN)
+- continue;
++ if (dt == DEV_UNKNOWN)
++ return 0;
+
+- /* Find the first enabled device and register that one. */
+- layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+- layers[0].size = rank;
+- layers[0].is_virt_csrow = true;
+- layers[1].type = EDAC_MC_LAYER_CHANNEL;
+- layers[1].size = num_chans;
+- layers[1].is_virt_csrow = false;
+-
+- rc = -ENOMEM;
+- mci = edac_mc_alloc(i, ARRAY_SIZE(layers), layers,
+- sizeof(struct mc_priv));
+- if (!mci) {
+- edac_printk(KERN_ERR, EDAC_MC, "Failed memory allocation for MC%d\n", i);
+- goto err_alloc;
+- }
++ /* Find the first enabled device and register that one. */
++ layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
++ layers[0].size = rank;
++ layers[0].is_virt_csrow = true;
++ layers[1].type = EDAC_MC_LAYER_CHANNEL;
++ layers[1].size = num_chans;
++ layers[1].is_virt_csrow = false;
+
+- priv->mci[i] = mci;
+- priv->dwidth = dt;
++ rc = -ENOMEM;
++ name = kzalloc(MC_NAME_LEN, GFP_KERNEL);
++ if (!name)
++ return rc;
+
+- dev = kzalloc_obj(*dev);
+- dev->release = versal_edac_release;
+- name = kmalloc(32, GFP_KERNEL);
+- sprintf(name, "versal-net-ddrmc5-edac-%d", i);
+- dev->init_name = name;
+- rc = device_register(dev);
+- if (rc)
+- goto err_alloc;
++ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
++ if (!dev)
++ goto err_name_free;
+
+- mci->pdev = dev;
++ mci = edac_mc_alloc(i, ARRAY_SIZE(layers), layers, sizeof(struct mc_priv));
++ if (!mci) {
++ edac_printk(KERN_ERR, EDAC_MC, "Failed memory allocation for MC%d\n", i);
++ goto err_dev_free;
++ }
+
+- platform_set_drvdata(pdev, priv);
++ sprintf(name, "versal-net-ddrmc5-edac-%d", i);
+
+- mc_init(mci, dev);
+- rc = edac_mc_add_mc(mci);
+- if (rc) {
+- edac_printk(KERN_ERR, EDAC_MC, "Failed to register MC%d with EDAC core\n", i);
+- goto err_alloc;
+- }
+- }
+- return 0;
++ dev->init_name = name;
++ dev->release = versal_edac_release;
+
+-err_alloc:
+- while (i--) {
+- mci = priv->mci[i];
+- if (!mci)
+- continue;
+-
+- if (mci->pdev) {
+- device_unregister(mci->pdev);
+- edac_mc_del_mc(mci->pdev);
+- }
++ rc = device_register(dev);
++ if (rc)
++ goto err_mc_free;
+
+- edac_mc_free(mci);
++ mci->pdev = dev;
++ mc_init(mci, dev);
++
++ rc = edac_mc_add_mc(mci);
++ if (rc) {
++ edac_printk(KERN_ERR, EDAC_MC, "Failed to register MC%d with EDAC core\n", i);
++ goto err_unreg;
+ }
+
++ priv->mci[i] = mci;
++ priv->dwidth = dt;
++
++ platform_set_drvdata(pdev, priv);
++
++ return 0;
++
++err_unreg:
++ device_unregister(mci->pdev);
++err_mc_free:
++ edac_mc_free(mci);
++err_dev_free:
++ kfree(dev);
++err_name_free:
++ kfree(name);
++
+ return rc;
+ }
+
+-static void remove_versalnet(struct mc_priv *priv)
++static int init_versalnet(struct mc_priv *priv, struct platform_device *pdev)
+ {
+- struct mem_ctl_info *mci;
+- int i;
++ int rc, i;
+
+ for (i = 0; i < NUM_CONTROLLERS; i++) {
+- device_unregister(priv->mci[i]->pdev);
+- mci = edac_mc_del_mc(priv->mci[i]->pdev);
+- if (!mci)
+- return;
++ rc = init_one_mc(priv, pdev, i);
++ if (rc) {
++ while (i--)
++ remove_one_mc(priv, i);
+
+- edac_mc_free(mci);
++ return rc;
++ }
+ }
++ return 0;
++}
++
++static void remove_versalnet(struct mc_priv *priv)
++{
++ for (int i = 0; i < NUM_CONTROLLERS; i++)
++ remove_one_mc(priv, i);
+ }
+
+ static int mc_probe(struct platform_device *pdev)
--- /dev/null
+From 898ad80d1207cbdb22b21bafb6de4adfd7627bd0 Mon Sep 17 00:00:00 2001
+From: Pavel Begunkov <asml.silence@gmail.com>
+Date: Mon, 23 Mar 2026 12:43:57 +0000
+Subject: io_uring/zcrx: use guards for locking
+
+From: Pavel Begunkov <asml.silence@gmail.com>
+
+commit 898ad80d1207cbdb22b21bafb6de4adfd7627bd0 upstream.
+
+Convert last several places using manual locking to guards to simplify
+the code.
+
+Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
+Link: https://patch.msgid.link/eb4667cfaf88c559700f6399da9e434889f5b04a.1774261953.git.asml.silence@gmail.com
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ io_uring/zcrx.c | 15 +++++++--------
+ 1 file changed, 7 insertions(+), 8 deletions(-)
+
+--- a/io_uring/zcrx.c
++++ b/io_uring/zcrx.c
+@@ -586,9 +586,8 @@ static void io_zcrx_return_niov_freelist
+ {
+ struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);
+
+- spin_lock_bh(&area->freelist_lock);
++ guard(spinlock_bh)(&area->freelist_lock);
+ area->freelist[area->free_count++] = net_iov_idx(niov);
+- spin_unlock_bh(&area->freelist_lock);
+ }
+
+ static void io_zcrx_return_niov(struct net_iov *niov)
+@@ -1029,7 +1028,8 @@ static void io_zcrx_refill_slow(struct p
+ {
+ struct io_zcrx_area *area = ifq->area;
+
+- spin_lock_bh(&area->freelist_lock);
++ guard(spinlock_bh)(&area->freelist_lock);
++
+ while (area->free_count && pp->alloc.count < PP_ALLOC_CACHE_REFILL) {
+ struct net_iov *niov = __io_zcrx_get_free_niov(area);
+ netmem_ref netmem = net_iov_to_netmem(niov);
+@@ -1038,7 +1038,6 @@ static void io_zcrx_refill_slow(struct p
+ io_zcrx_sync_for_device(pp, niov);
+ net_mp_netmem_place_in_cache(pp, netmem);
+ }
+- spin_unlock_bh(&area->freelist_lock);
+ }
+
+ static netmem_ref io_pp_zc_alloc_netmems(struct page_pool *pp, gfp_t gfp)
+@@ -1264,10 +1263,10 @@ static struct net_iov *io_alloc_fallback
+ if (area->mem.is_dmabuf)
+ return NULL;
+
+- spin_lock_bh(&area->freelist_lock);
+- if (area->free_count)
+- niov = __io_zcrx_get_free_niov(area);
+- spin_unlock_bh(&area->freelist_lock);
++ scoped_guard(spinlock_bh, &area->freelist_lock) {
++ if (area->free_count)
++ niov = __io_zcrx_get_free_niov(area);
++ }
+
+ if (niov)
+ page_pool_fragment_netmem(net_iov_to_netmem(niov), 1);
--- /dev/null
+From 770594e78c3964cf23cf5287f849437cdde9b7d0 Mon Sep 17 00:00:00 2001
+From: Pavel Begunkov <asml.silence@gmail.com>
+Date: Tue, 21 Apr 2026 09:45:29 +0100
+Subject: io_uring/zcrx: warn on freelist violations
+
+From: Pavel Begunkov <asml.silence@gmail.com>
+
+commit 770594e78c3964cf23cf5287f849437cdde9b7d0 upstream.
+
+The freelist is appropriately sized to always be able to take a free
+niov, but let's be more defensive and check the invariant with a
+warning. That should help to catch any double-free issues.
+
+Suggested-by: Kai Aizen <kai@snailsploit.com>
+Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
+Link: https://patch.msgid.link/2f3cea363b04649755e3b6bb9ab66485a95936d5.1776760901.git.asml.silence@gmail.com
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ io_uring/zcrx.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/io_uring/zcrx.c
++++ b/io_uring/zcrx.c
+@@ -587,6 +587,8 @@ static void io_zcrx_return_niov_freelist
+ struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);
+
+ guard(spinlock_bh)(&area->freelist_lock);
++ if (WARN_ON_ONCE(area->free_count >= area->nia.num_niovs))
++ return;
+ area->freelist[area->free_count++] = net_iov_idx(niov);
+ }
+
--- /dev/null
+From stable+bounces-247282-greg=kroah.com@vger.kernel.org Thu May 14 21:27:11 2026
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 14 May 2026 15:25:53 -0400
+Subject: kho: fix error handling in kho_add_subtree()
+To: stable@vger.kernel.org
+Cc: Breno Leitao <leitao@debian.org>, Pratyush Yadav <pratyush@kernel.org>, "Mike Rapoport (Microsoft)" <rppt@kernel.org>, Alexander Graf <graf@amazon.com>, Pasha Tatashin <pasha.tatashin@soleen.com>, Andrew Morton <akpm@linux-foundation.org>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20260514192553.1255751-1-sashal@kernel.org>
+
+From: Breno Leitao <leitao@debian.org>
+
+[ Upstream commit 9ec95329894864170a1a7685b9a11b739393131a ]
+
+Fix two issues in kho_add_subtree(), where the error path is not
+handled correctly.
+
+1. If fdt_setprop() fails after the subnode has been created, the
+ subnode is not removed. This leaves an incomplete node in the FDT
+ (missing "preserved-data" or "blob-size" properties).
+
+2. The fdt_setprop() return value (an FDT error code) is stored
+ directly in err and returned to the caller, which expects -errno.
+
+Fix both by storing fdt_setprop() results in fdt_err, jumping to a new
+out_del_node label that removes the subnode on failure, and only setting
+err = 0 on the success path, otherwise returning -ENOMEM (instead of
+FDT_ERR_ errors that would come from fdt_setprop).
+
+No user-visible changes. This patch fixes error handling in the KHO
+(Kexec HandOver) subsystem, which is used to preserve data across kexec
+reboots. The fix only affects a rare failure path during kexec
+preparation — specifically when the kernel runs out of space in the
+Flattened Device Tree buffer while registering preserved memory regions.
+
+In the unlikely event that this error path was triggered, the old code
+would leave a malformed node in the device tree and return an incorrect
+error code to the calling subsystem, which could lead to confusing log
+messages or incorrect recovery decisions. With this fix, the incomplete
+node is properly cleaned up and the appropriate errno value is
+propagated to the calling subsystem rather than to userspace.
+
+Link: https://lore.kernel.org/20260410-kho_fix_send-v2-1-1b4debf7ee08@debian.org
+Fixes: 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers")
+Signed-off-by: Breno Leitao <leitao@debian.org>
+Suggested-by: Pratyush Yadav <pratyush@kernel.org>
+Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
+Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
+Cc: Alexander Graf <graf@amazon.com>
+Cc: Breno Leitao <leitao@debian.org>
+Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/liveupdate/kexec_handover.c | 13 +++++++++----
+ 1 file changed, 9 insertions(+), 4 deletions(-)
+
+--- a/kernel/liveupdate/kexec_handover.c
++++ b/kernel/liveupdate/kexec_handover.c
+@@ -757,13 +757,18 @@ int kho_add_subtree(const char *name, vo
+ goto out_pack;
+ }
+
+- err = fdt_setprop(root_fdt, off, KHO_FDT_SUB_TREE_PROP_NAME,
+- &phys, sizeof(phys));
+- if (err < 0)
+- goto out_pack;
++ fdt_err = fdt_setprop(root_fdt, off, KHO_FDT_SUB_TREE_PROP_NAME,
++ &phys, sizeof(phys));
++ if (fdt_err < 0)
++ goto out_del_node;
+
+ WARN_ON_ONCE(kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false));
+
++ err = 0;
++ goto out_pack;
++
++out_del_node:
++ fdt_del_node(root_fdt, off);
+ out_pack:
+ fdt_pack(root_fdt);
+
--- /dev/null
+From stable+bounces-247163-greg=kroah.com@vger.kernel.org Thu May 14 12:35:09 2026
+From: Lorenzo Stoakes <ljs@kernel.org>
+Date: Thu, 14 May 2026 11:33:20 +0100
+Subject: mm/vma: do not try to unmap a VMA if mmap_prepare() invoked from mmap()
+To: stable@vger.kernel.org
+Message-ID: <20260514103320.155081-1-ljs@kernel.org>
+
+From: Lorenzo Stoakes <ljs@kernel.org>
+
+[ Upstream commit 619eab23e1ce7c97e54bfc5a417306d94b3f6f13 ]
+
+The mmap_prepare hook functionality includes the ability to invoke
+mmap_prepare() from the mmap() hook of existing 'stacked' drivers, that is
+ones which are capable of calling the mmap hooks of other drivers/file
+systems (e.g. overlayfs, shm).
+
+As part of the mmap_prepare action functionality, we deal with errors by
+unmapping the VMA should one arise. This works in the usual mmap_prepare
+case, as we invoke this action at the last moment, when the VMA is
+established in the maple tree.
+
+However, the mmap() hook passes a not-fully-established, detached VMA
+pointer to the caller (this is the motivation behind the mmap_prepare()
+work).
+
+So attempting to unmap a VMA in this state will be problematic, with the
+most obvious symptom being a warning in vma_mark_detached(), because the
+VMA is already detached.
+
+It's also unnecessary - the mmap() handler will clean up the VMA on error.
+
+To fix this issue, this patch propagates whether an mmap action is
+being completed via the compatibility layer or directly.
+
+If the former, we do not attempt VMA cleanup; if the latter, we do.
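+
+Condensed from the change below, the cleanup gate in
+mmap_action_finish() becomes:
+
+  if (!err && action->success_hook)
+          err = action->success_hook(vma);
+
+  /* compat (mmap() hook) callers clean up the detached VMA themselves */
+  if (!err || is_compat)
+          return err;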
+
+This patch also updates the userland VMA tests to reflect the change.
+
+Link: https://lore.kernel.org/20260421102150.189982-1-ljs@kernel.org
+Fixes: ac0a3fc9c07d ("mm: add ability to take further action in vm_area_desc")
+Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
+Reported-by: syzbot+db390288d141a1dccf96@syzkaller.appspotmail.com
+Closes: https://lore.kernel.org/all/69e69734.050a0220.24bfd3.0027.GAE@google.com/
+Cc: David Hildenbrand <david@kernel.org>
+Cc: Jann Horn <jannh@google.com>
+Cc: Liam Howlett <liam.howlett@oracle.com>
+Cc: Michal Hocko <mhocko@suse.com>
+Cc: Mike Rapoport <rppt@kernel.org>
+Cc: Pedro Falcato <pfalcato@suse.de>
+Cc: Suren Baghdasaryan <surenb@google.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/linux/mm.h | 2 -
+ mm/util.c | 51 +++++++++++++++++++++-----------------
+ mm/vma.c | 3 --
+ tools/testing/vma/include/dup.h | 41 ++++++++++++++----------------
+ tools/testing/vma/include/stubs.h | 3 +-
+ 5 files changed, 53 insertions(+), 47 deletions(-)
+
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -4080,7 +4080,7 @@ static inline void mmap_action_ioremap_f
+
+ int mmap_action_prepare(struct vm_area_desc *desc);
+ int mmap_action_complete(struct vm_area_struct *vma,
+- struct mmap_action *action);
++ struct mmap_action *action, bool is_compat);
+
+ /* Look up the first VMA which exactly match the interval vm_start ... vm_end */
+ static inline struct vm_area_struct *find_exact_vma(struct mm_struct *mm,
+--- a/mm/util.c
++++ b/mm/util.c
+@@ -1186,7 +1186,8 @@ int compat_vma_mmap(struct file *file, s
+ return err;
+
+ set_vma_from_desc(vma, &desc);
+- err = mmap_action_complete(vma, &desc.action);
++ err = mmap_action_complete(vma, &desc.action,
++ /*is_compat=*/true);
+ if (err) {
+ const size_t len = vma_pages(vma) << PAGE_SHIFT;
+
+@@ -1277,28 +1278,31 @@ again:
+ }
+
+ static int mmap_action_finish(struct vm_area_struct *vma,
+- struct mmap_action *action, int err)
++ struct mmap_action *action, int err,
++ bool is_compat)
+ {
++ if (!err && action->success_hook)
++ err = action->success_hook(vma);
++
++ /*
++ * If this is invoked from the compatibility layer, post-mmap() hook
++ * logic will handle cleanup for us.
++ */
++ if (!err || is_compat)
++ return err;
++
+ /*
+ * If an error occurs, unmap the VMA altogether and return an error. We
+ * only clear the newly allocated VMA, since this function is only
+ * invoked if we do NOT merge, so we only clean up the VMA we created.
+ */
+- if (err) {
+- if (action->error_hook) {
+- /* We may want to filter the error. */
+- err = action->error_hook(err);
+-
+- /* The caller should not clear the error. */
+- VM_WARN_ON_ONCE(!err);
+- }
+- return err;
++ if (action->error_hook) {
++ /* We may want to filter the error. */
++ err = action->error_hook(err);
++ /* The caller should not clear the error. */
++ VM_WARN_ON_ONCE(!err);
+ }
+-
+- if (action->success_hook)
+- return action->success_hook(vma);
+-
+- return 0;
++ return err;
+ }
+
+ #ifdef CONFIG_MMU
+@@ -1329,14 +1333,16 @@ EXPORT_SYMBOL(mmap_action_prepare);
+ * mmap_action_complete - Execute VMA descriptor action.
+ * @vma: The VMA to perform the action upon.
+ * @action: The action to perform.
++ * @is_compat: Is this being invoked from the compatibility layer?
+ *
+ * Similar to mmap_action_prepare().
+ *
+- * Return: 0 on success, or error, at which point the VMA will be unmapped.
++ * Return: 0 on success, or error, at which point the VMA will be unmapped if
++ * !@is_compat.
+ */
+ int mmap_action_complete(struct vm_area_struct *vma,
+- struct mmap_action *action)
+-
++ struct mmap_action *action,
++ bool is_compat)
+ {
+ int err = 0;
+
+@@ -1353,7 +1359,7 @@ int mmap_action_complete(struct vm_area_
+ break;
+ }
+
+- return mmap_action_finish(vma, action, err);
++ return mmap_action_finish(vma, action, err, is_compat);
+ }
+ EXPORT_SYMBOL(mmap_action_complete);
+ #else
+@@ -1373,7 +1379,8 @@ int mmap_action_prepare(struct vm_area_d
+ EXPORT_SYMBOL(mmap_action_prepare);
+
+ int mmap_action_complete(struct vm_area_struct *vma,
+- struct mmap_action *action)
++ struct mmap_action *action,
++ bool is_compat)
+ {
+ int err = 0;
+
+@@ -1388,7 +1395,7 @@ int mmap_action_complete(struct vm_area_
+ break;
+ }
+
+- return mmap_action_finish(vma, action, err);
++ return mmap_action_finish(vma, action, err, is_compat);
+ }
+ EXPORT_SYMBOL(mmap_action_complete);
+ #endif
+--- a/mm/vma.c
++++ b/mm/vma.c
+@@ -2708,7 +2708,7 @@ static int call_action_complete(struct m
+ {
+ int err;
+
+- err = mmap_action_complete(vma, action);
++ err = mmap_action_complete(vma, action, /*is_compat=*/false);
+
+ /* If we held the file rmap we need to release it. */
+ if (map->hold_file_rmap_lock) {
+@@ -2778,7 +2778,6 @@ static unsigned long __mmap_region(struc
+
+ if (have_mmap_prepare && allocated_new) {
+ error = call_action_complete(&map, &desc.action, vma);
+-
+ if (error)
+ return error;
+ }
+--- a/tools/testing/vma/include/dup.h
++++ b/tools/testing/vma/include/dup.h
+@@ -1071,8 +1071,17 @@ static inline void vma_set_anonymous(str
+ static inline void set_vma_from_desc(struct vm_area_struct *vma,
+ struct vm_area_desc *desc);
+
+-static inline int __compat_vma_mmap(const struct file_operations *f_op,
+- struct file *file, struct vm_area_struct *vma)
++static inline unsigned long vma_pages(struct vm_area_struct *vma)
++{
++ return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
++}
++
++static inline int vfs_mmap_prepare(struct file *file, struct vm_area_desc *desc)
++{
++ return file->f_op->mmap_prepare(desc);
++}
++
++static inline int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
+ {
+ struct vm_area_desc desc = {
+ .mm = vma->vm_mm,
+@@ -1082,14 +1091,14 @@ static inline int __compat_vma_mmap(cons
+
+ .pgoff = vma->vm_pgoff,
+ .vm_file = vma->vm_file,
+- .vm_flags = vma->vm_flags,
++ .vma_flags = vma->flags,
+ .page_prot = vma->vm_page_prot,
+
+ .action.type = MMAP_NOTHING, /* Default */
+ };
+ int err;
+
+- err = f_op->mmap_prepare(&desc);
++ err = vfs_mmap_prepare(file, &desc);
+ if (err)
+ return err;
+
+@@ -1098,27 +1107,22 @@ static inline int __compat_vma_mmap(cons
+ return err;
+
+ set_vma_from_desc(vma, &desc);
+- return mmap_action_complete(vma, &desc.action);
+-}
++ err = mmap_action_complete(vma, &desc.action,
++ /*is_compat=*/true);
++ if (err) {
++ const size_t len = vma_pages(vma) << PAGE_SHIFT;
+
+-static inline int compat_vma_mmap(struct file *file,
+- struct vm_area_struct *vma)
+-{
+- return __compat_vma_mmap(file->f_op, file, vma);
++ do_munmap(current->mm, vma->vm_start, len, NULL);
++ }
++ return err;
+ }
+
+-
+ static inline void vma_iter_init(struct vma_iterator *vmi,
+ struct mm_struct *mm, unsigned long addr)
+ {
+ mas_init(&vmi->mas, &mm->mm_mt, addr);
+ }
+
+-static inline unsigned long vma_pages(struct vm_area_struct *vma)
+-{
+- return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+-}
+-
+ static inline void mmap_assert_locked(struct mm_struct *);
+ static inline struct vm_area_struct *find_vma_intersection(struct mm_struct *mm,
+ unsigned long start_addr,
+@@ -1309,11 +1313,6 @@ static inline int vfs_mmap(struct file *
+ return file->f_op->mmap(file, vma);
+ }
+
+-static inline int vfs_mmap_prepare(struct file *file, struct vm_area_desc *desc)
+-{
+- return file->f_op->mmap_prepare(desc);
+-}
+-
+ static inline void vma_set_file(struct vm_area_struct *vma, struct file *file)
+ {
+ /* Changing an anonymous vma with this is illegal */
+--- a/tools/testing/vma/include/stubs.h
++++ b/tools/testing/vma/include/stubs.h
+@@ -87,7 +87,8 @@ static inline int mmap_action_prepare(st
+ }
+
+ static inline int mmap_action_complete(struct vm_area_struct *vma,
+- struct mmap_action *action)
++ struct mmap_action *action,
++ bool is_compat)
+ {
+ return 0;
+ }
--- /dev/null
+From stable+bounces-247054-greg=kroah.com@vger.kernel.org Thu May 14 01:47:02 2026
+From: Florian Fainelli <florian.fainelli@broadcom.com>
+Date: Wed, 13 May 2026 16:46:38 -0700
+Subject: perf build: fix "argument list too long" in second location
+To: stable@vger.kernel.org
+Cc: Markus Mayer <mmayer@broadcom.com>, James Clark <james.clark@linaro.org>, Namhyung Kim <namhyung@kernel.org>, Florian Fainelli <florian.fainelli@broadcom.com>, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>, Adrian Hunter <adrian.hunter@intel.com>, linux-perf-users@vger.kernel.org (open list:PERFORMANCE EVENTS SUBSYSTEM), linux-kernel@vger.kernel.org (open list:PERFORMANCE EVENTS SUBSYSTEM)
+Message-ID: <20260513234639.128528-1-florian.fainelli@broadcom.com>
+
+From: Markus Mayer <mmayer@broadcom.com>
+
+commit 97ab89686a9e5d087042dbe73604a32b3de72653 upstream
+
+Turns out that displaying "RM $^" via quiet_cmd_rm can also upset the
+shell and cause it to display "argument list too long".
+
+Trying to quote $^ doesn't help.
+
+In the end, *not* displaying the (potentially long) list of files is
+probably the right thing to do for a "quiet" message, anyway. Instead,
+let's display a count of how many files were removed. There is always
+V=1 if more detail is required.
+
+ TEST linux/tools/perf/pmu-events/metric_test.log
+ RM ...634 orphan file(s)...
+ LD linux/tools/perf/util/perf-util-in.o
+
+Also move the comment regarding xargs before the rule, so it doesn't
+show up in the build output.
+
+Signed-off-by: Markus Mayer <mmayer@broadcom.com>
+Reviewed-by: James Clark <james.clark@linaro.org>
+Signed-off-by: Namhyung Kim <namhyung@kernel.org>
+Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ tools/perf/pmu-events/Build | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/tools/perf/pmu-events/Build
++++ b/tools/perf/pmu-events/Build
+@@ -211,10 +211,10 @@ ifneq ($(strip $(ORPHAN_FILES)),)
+
+ # Message for $(call echo-cmd,rm). Generally cleaning files isn't part
+ # of a build step.
+-quiet_cmd_rm = RM $^
++quiet_cmd_rm = RM ...$(words $^) orphan file(s)...
+
++# The list of files can be long. Use xargs to prevent issues.
+ prune_orphans: $(ORPHAN_FILES)
+- # The list of files can be long. Use xargs to prevent issues.
+ $(Q)$(call echo-cmd,rm)echo "$^" | xargs rm -f
+
+ JEVENTS_DEPS += prune_orphans
--- /dev/null
+From arighi@nvidia.com Wed May 13 15:01:26 2026
+From: Andrea Righi <arighi@nvidia.com>
+Date: Wed, 13 May 2026 15:01:11 +0200
+Subject: sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu()
+To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>, Changwoo Min <changwoo@igalia.com>
+Cc: Chris Mason <clm@meta.com>, Peter Schneider <pschneider1968@googlemail.com>, sched-ext@lists.linux.dev, stable@vger.kernel.org, linux-kernel@vger.kernel.org
+Message-ID: <20260513130111.689740-1-arighi@nvidia.com>
+
+From: Tejun Heo <tj@kernel.org>
+
+commit da2d81b4118a74e65d2335e221a38d665902a98c upstream.
+
+bypass_lb_cpu() transfers tasks between per-CPU bypass DSQs without
+migrating them - task_cpu() only updates when the donee later consumes the
+task via move_remote_task_to_local_dsq(). If the LB timer fires again before
+consumption and the new DSQ becomes a donor, @p is still on the previous CPU
+and task_rq(@p) != donor_rq. @p can't be moved without its own rq locked.
+
+Skip such tasks.
+
+Fixes: 95d1df610cdc ("sched_ext: Implement load balancer for bypass mode")
+Cc: stable@vger.kernel.org # v6.19+
+Reported-by: Chris Mason <clm@meta.com>
+Signed-off-by: Tejun Heo <tj@kernel.org>
+Reviewed-by: Andrea Righi <arighi@nvidia.com>
+[ arighi: replace donor_rq with rq, not present in v7.0.y ]
+Signed-off-by: Andrea Righi <arighi@nvidia.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/sched/ext.c | 9 +++++++++
+ 1 file changed, 9 insertions(+)
+
+--- a/kernel/sched/ext.c
++++ b/kernel/sched/ext.c
+@@ -4010,6 +4010,15 @@ resume:
+ if (cpumask_empty(donee_mask))
+ break;
+
++ /*
++ * If an earlier pass placed @p on @donor_dsq from a different
++ * CPU and the donee hasn't consumed it yet, @p is still on the
++ * previous CPU and task_rq(@p) != @rq. @p can't be moved
++ * without its rq locked. Skip.
++ */
++ if (task_rq(p) != rq)
++ continue;
++
+ donee = cpumask_any_and_distribute(donee_mask, p->cpus_ptr);
+ if (donee >= nr_cpu_ids)
+ continue;
batman-adv-bla-only-purge-non-released-claims.patch
batman-adv-bla-put-backbone-reference-on-failed-claim-hash-insert.patch
sched_ext-use-hk_type_domain_boot-to-detect-isolcpus-domain-isolation.patch
+usb-typec-tcpm-reset-internal-port-states-on-soft-reset-ams.patch
+io_uring-zcrx-use-guards-for-locking.patch
+io_uring-zcrx-warn-on-freelist-violations.patch
+kho-fix-error-handling-in-kho_add_subtree.patch
+edac-versalnet-refactor-memory-controller-initialization-and-cleanup.patch
+edac-versalnet-fix-device-name-memory-leak.patch
+spi-uniphier-simplify-clock-handling-with-devm_clk_get_enabled.patch
+spi-uniphier-fix-controller-deregistration.patch
+cgroup-increment-nr_dying_subsys_-from-rmdir-context.patch
+cgroup-defer-css-percpu_ref-kill-on-rmdir-until-cgroup-is-depopulated.patch
+sched_ext-skip-tasks-with-stale-task_rq-in-bypass_lb_cpu.patch
+perf-build-fix-argument-list-too-long-in-second-location.patch
+mm-vma-do-not-try-to-unmap-a-vma-if-mmap_prepare-invoked-from-mmap.patch
--- /dev/null
+From stable+bounces-247104-greg=kroah.com@vger.kernel.org Thu May 14 06:36:57 2026
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 14 May 2026 00:36:31 -0400
+Subject: spi: uniphier: fix controller deregistration
+To: stable@vger.kernel.org
+Cc: Johan Hovold <johan@kernel.org>, Keiji Hayashibara <hayashibara.keiji@socionext.com>, Mark Brown <broonie@kernel.org>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20260514043631.10946-2-sashal@kernel.org>
+
+From: Johan Hovold <johan@kernel.org>
+
+[ Upstream commit 0245435f777264ac45945ed2f325dd095a41d1af ]
+
+Make sure to deregister the controller before releasing underlying
+resources like DMA during driver unbind.
+
+Note that clocks were also disabled before the recent commit
+fdca270f8f87 ("spi: uniphier: Simplify clock handling with
+devm_clk_get_enabled()").
+
+Fixes: 5ba155a4d4cc ("spi: add SPI controller driver for UniPhier SoC")
+Cc: stable@vger.kernel.org # 4.19
+Cc: Keiji Hayashibara <hayashibara.keiji@socionext.com>
+Signed-off-by: Johan Hovold <johan@kernel.org>
+Link: https://patch.msgid.link/20260410081757.503099-25-johan@kernel.org
+Signed-off-by: Mark Brown <broonie@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/spi/spi-uniphier.c | 8 +++++++-
+ 1 file changed, 7 insertions(+), 1 deletion(-)
+
+--- a/drivers/spi/spi-uniphier.c
++++ b/drivers/spi/spi-uniphier.c
+@@ -746,7 +746,7 @@ static int uniphier_spi_probe(struct pla
+
+ host->max_dma_len = min(dma_tx_burst, dma_rx_burst);
+
+- ret = devm_spi_register_controller(&pdev->dev, host);
++ ret = spi_register_controller(host);
+ if (ret)
+ goto out_release_dma;
+
+@@ -771,10 +771,16 @@ static void uniphier_spi_remove(struct p
+ {
+ struct spi_controller *host = platform_get_drvdata(pdev);
+
++ spi_controller_get(host);
++
++ spi_unregister_controller(host);
++
+ if (host->dma_tx)
+ dma_release_channel(host->dma_tx);
+ if (host->dma_rx)
+ dma_release_channel(host->dma_rx);
++
++ spi_controller_put(host);
+ }
+
+ static const struct of_device_id uniphier_spi_match[] = {
--- /dev/null
+From stable+bounces-247103-greg=kroah.com@vger.kernel.org Thu May 14 06:36:56 2026
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 14 May 2026 00:36:30 -0400
+Subject: spi: uniphier: Simplify clock handling with devm_clk_get_enabled()
+To: stable@vger.kernel.org
+Cc: Pei Xiao <xiaopei01@kylinos.cn>, Kunihiko Hayashi <hayashi.kunihiko@socionext.com>, Mark Brown <broonie@kernel.org>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20260514043631.10946-1-sashal@kernel.org>
+
+From: Pei Xiao <xiaopei01@kylinos.cn>
+
+[ Upstream commit fdca270f8f87cae2eb5b619234b9dd11a863ce6b ]
+
+Replace devm_clk_get() followed by clk_prepare_enable() with
+devm_clk_get_enabled() for the clock. This removes the need for
+explicit clock enable and disable calls, as the managed API automatically
+handles clock disabling on device removal or probe failure.
+
+Remove the now-unnecessary clk_disable_unprepare() calls from the probe
+error path and the remove callback. Adjust error labels accordingly.
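+
+Condensed from the hunk below, the managed pattern is:
+
+  priv->clk = devm_clk_get_enabled(&pdev->dev, NULL);  /* get + prepare_enable */
+  if (IS_ERR(priv->clk)) {
+          dev_err(&pdev->dev, "failed to get clock\n");
+          ret = PTR_ERR(priv->clk);
+          goto out_host_put;
+  }
+  /* no clk_disable_unprepare() needed on any exit path afterwards */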
+
+Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
+Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
+Link: https://patch.msgid.link/b2deeefd4ef1a4bce71116aabfcb7e81400f6d37.1775546948.git.xiaopei01@kylinos.cn
+Signed-off-by: Mark Brown <broonie@kernel.org>
+Stable-dep-of: 0245435f7772 ("spi: uniphier: fix controller deregistration")
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/spi/spi-uniphier.c | 18 ++++--------------
+ 1 file changed, 4 insertions(+), 14 deletions(-)
+
+--- a/drivers/spi/spi-uniphier.c
++++ b/drivers/spi/spi-uniphier.c
+@@ -666,28 +666,24 @@ static int uniphier_spi_probe(struct pla
+ }
+ priv->base_dma_addr = res->start;
+
+- priv->clk = devm_clk_get(&pdev->dev, NULL);
++ priv->clk = devm_clk_get_enabled(&pdev->dev, NULL);
+ if (IS_ERR(priv->clk)) {
+ dev_err(&pdev->dev, "failed to get clock\n");
+ ret = PTR_ERR(priv->clk);
+ goto out_host_put;
+ }
+
+- ret = clk_prepare_enable(priv->clk);
+- if (ret)
+- goto out_host_put;
+-
+ irq = platform_get_irq(pdev, 0);
+ if (irq < 0) {
+ ret = irq;
+- goto out_disable_clk;
++ goto out_host_put;
+ }
+
+ ret = devm_request_irq(&pdev->dev, irq, uniphier_spi_handler,
+ 0, "uniphier-spi", priv);
+ if (ret) {
+ dev_err(&pdev->dev, "failed to request IRQ\n");
+- goto out_disable_clk;
++ goto out_host_put;
+ }
+
+ init_completion(&priv->xfer_done);
+@@ -716,7 +712,7 @@ static int uniphier_spi_probe(struct pla
+ if (IS_ERR_OR_NULL(host->dma_tx)) {
+ if (PTR_ERR(host->dma_tx) == -EPROBE_DEFER) {
+ ret = -EPROBE_DEFER;
+- goto out_disable_clk;
++ goto out_host_put;
+ }
+ host->dma_tx = NULL;
+ dma_tx_burst = INT_MAX;
+@@ -766,9 +762,6 @@ out_release_dma:
+ host->dma_tx = NULL;
+ }
+
+-out_disable_clk:
+- clk_disable_unprepare(priv->clk);
+-
+ out_host_put:
+ spi_controller_put(host);
+ return ret;
+@@ -777,14 +770,11 @@ out_host_put:
+ static void uniphier_spi_remove(struct platform_device *pdev)
+ {
+ struct spi_controller *host = platform_get_drvdata(pdev);
+- struct uniphier_spi_priv *priv = spi_controller_get_devdata(host);
+
+ if (host->dma_tx)
+ dma_release_channel(host->dma_tx);
+ if (host->dma_rx)
+ dma_release_channel(host->dma_rx);
+-
+- clk_disable_unprepare(priv->clk);
+ }
+
+ static const struct of_device_id uniphier_spi_match[] = {
--- /dev/null
+From 2909f0d4994fb4306bf116df5ccee797791fce2c Mon Sep 17 00:00:00 2001
+From: Amit Sunil Dhamne <amitsd@google.com>
+Date: Tue, 14 Apr 2026 00:58:32 +0000
+Subject: usb: typec: tcpm: reset internal port states on soft reset AMS
+
+From: Amit Sunil Dhamne <amitsd@google.com>
+
+commit 2909f0d4994fb4306bf116df5ccee797791fce2c upstream.
+
+Reset internal port states (such as vdm_sm_running and
+explicit_contract) on soft reset AMS as the port needs to negotiate a
+new contract. The consequences of leaving the states as-is are as
+follows:
+ * port is in SRC power role and an explicit contract is negotiated
+ with the port partner (in sink role)
+ * port partner sends a Soft Reset AMS while VDM State Machine is
+ running
+ * port accepts the Soft Reset request and the port advertises src caps
+ * port partner sends a Request message but since the explicit_contract
+ and vdm_sm_running are true from previous negotiation, the port ends
+ up sending Soft Reset instead of Accept msg.
+
+Stub Log:
+[ 203.653942] AMS DISCOVER_IDENTITY start
+[ 203.653947] PD TX, header: 0x176f
+[ 203.655901] PD TX complete, status: 0
+[ 203.657470] PD RX, header: 0x124f [1]
+[ 203.657477] Rx VDM cmd 0xff008081 type 2 cmd 1 len 1
+[ 203.657482] AMS DISCOVER_IDENTITY finished
+[ 203.657484] cc:=4
+[ 204.155698] PD RX, header: 0x144f [1]
+[ 204.155718] Rx VDM cmd 0xeeee8001 type 0 cmd 1 len 1
+[ 204.155741] PD TX, header: 0x196f
+[ 204.157622] PD TX complete, status: 0
+[ 204.160060] PD RX, header: 0x4d [1]
+[ 204.160066] state change SRC_READY -> SOFT_RESET [rev2 SOFT_RESET_AMS]
+[ 204.160076] PD TX, header: 0x163
+[ 204.162486] PD TX complete, status: 0
+[ 204.162832] AMS SOFT_RESET_AMS finished
+[ 204.162840] cc:=4
+[ 204.162891] AMS POWER_NEGOTIATION start
+[ 204.162896] state change SOFT_RESET -> AMS_START [rev2 POWER_NEGOTIATION]
+[ 204.162908] state change AMS_START -> SRC_SEND_CAPABILITIES [rev2 POWER_NEGOTIATION]
+[ 204.162913] PD TX, header: 0x1361
+[ 204.165529] PD TX complete, status: 0
+[ 204.165571] pending state change SRC_SEND_CAPABILITIES -> SRC_SEND_CAPABILITIES_TIMEOUT @ 60 ms [rev2 POWER_NEGOTIATION]
+[ 204.166996] PD RX, header: 0x1242 [1]
+[ 204.167009] state change SRC_SEND_CAPABILITIES -> SRC_SOFT_RESET_WAIT_SNK_TX [rev2 POWER_NEGOTIATION]
+[ 204.167019] AMS POWER_NEGOTIATION finished
+[ 204.167020] cc:=4
+[ 204.167083] AMS SOFT_RESET_AMS start
+[ 204.167086] state change SRC_SOFT_RESET_WAIT_SNK_TX -> SOFT_RESET_SEND [rev2 SOFT_RESET_AMS]
+[ 204.167092] PD TX, header: 0x16d
+[ 204.168824] PD TX complete, status: 0
+[ 204.168854] pending state change SOFT_RESET_SEND -> HARD_RESET_SEND @ 60 ms [rev2 SOFT_RESET_AMS]
+[ 204.171876] PD RX, header: 0x43 [1]
+[ 204.171879] AMS SOFT_RESET_AMS finished
+
+This causes COMMON.PROC.PD.11.2 check failure for
+TEST.PD.VDM.SRC.2_Rev2Src test on the PD compliance tester.
+
+Signed-off-by: Amit Sunil Dhamne <amitsd@google.com>
+Fixes: 8d3a0578ad1a ("usb: typec: tcpm: Respond Wait if VDM state machine is running")
+Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)")
+Cc: stable <stable@kernel.org>
+Reviewed-by: Badhri Jagan Sridharan <badhri@google.com>
+Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
+Link: https://patch.msgid.link/20260414-fix-soft-reset-v1-1-01d7cb9764e2@google.com
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/usb/typec/tcpm/tcpm.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/drivers/usb/typec/tcpm/tcpm.c
++++ b/drivers/usb/typec/tcpm/tcpm.c
+@@ -5539,6 +5539,8 @@ static void run_state_machine(struct tcp
+ usb_power_delivery_unregister_capabilities(port->partner_source_caps);
+ port->partner_source_caps = NULL;
+ tcpm_pd_send_control(port, PD_CTRL_ACCEPT, TCPC_TX_SOP);
++ port->vdm_sm_running = false;
++ port->explicit_contract = false;
+ tcpm_ams_finish(port);
+ if (port->pwr_role == TYPEC_SOURCE) {
+ port->upcoming_state = SRC_SEND_CAPABILITIES;