Both copy_process() and alloc_pid() do the same PIDNS_ADDING check. The
reasons for these checks, and the fact that both are necessary, are not
immediately obvious. Add comments explaining them.
Link: https://lkml.kernel.org/r/aaGIRElc78U4Er42@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Adrian Reber <areber@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Kirill Tkhai <tkhai@ya.ru>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
rseq_fork(p, clone_flags);
- /* Don't start children in a dying pid namespace */
+ /*
+ * If zap_pid_ns_processes() was called after alloc_pid(), the new
+ * child missed SIGKILL. If current is not in the same namespace,
+ * we can't rely on fatal_signal_pending() below.
+ */
if (unlikely(!(ns_of_pid(pid)->pid_allocated & PIDNS_ADDING))) {
retval = -ENOMEM;
goto bad_fork_core_free;
*
* This can't be done earlier because we need to preserve other
* error conditions.
+ *
+ * We need this even if copy_process() does the same check. If two
+ * or more tasks from the parent namespace try to inject a child into a
+ * dead namespace, one of the free_pid() calls from the copy_process()
+ * error path may try to wake up the possibly freed ns->child_reaper.
*/
retval = -ENOMEM;
if (unlikely(!(ns->pid_allocated & PIDNS_ADDING)))