nsresourced: detect and clean up registry entries for dead user namespaces (#42070)

The BPF kprobe that fires on user namespace destruction is the only
thing that triggers registry cleanup. Whenever it does not run (ring
buffer overflow, missing kprobe, fdstore entry dropped outside our
cleanup path), a registry entry is left behind forever.

Stamp each registry entry with the kernel's unique namespace identifier
(NS_GET_ID, kernel ≥ 6.13) at allocation time. At manager startup, after
the existing fdstore→registry sweep, walk the registry and ask the
kernel to look each namespace up by id via open_by_handle_at() on nsfs;
if the lookup returns -ESTALE the namespace is gone and we release the
entry. Old entries written before this change carry no identifier and
are left alone.
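
The startup sweep can be sketched roughly as follows. This is an
illustration only, not the code in this change: RegistryEntry,
registry_sweep_dead(), ns_lookup_t and fake_lookup() are hypothetical
names, an id of 0 is assumed to mean "no identifier recorded", and the
lookup is passed in as a callback so the logic can be exercised without
kernel support.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry entry. Assumption: ns_id == 0 marks a legacy
 * entry that was written without a namespace identifier. */
typedef struct RegistryEntry {
        uint64_t ns_id;
        bool released;
} RegistryEntry;

/* Lookup callback: returns >= 0 if the namespace still exists,
 * a negative errno-style value otherwise. */
typedef int (*ns_lookup_t)(uint64_t ns_id);

/* Walk the registry and release every entry whose namespace lookup
 * fails with -ESTALE. Returns the number of entries released. */
static size_t registry_sweep_dead(RegistryEntry *entries, size_t n, ns_lookup_t lookup) {
        size_t n_released = 0;

        assert(entries || n == 0);
        assert(lookup);

        for (size_t i = 0; i < n; i++) {
                RegistryEntry *e = entries + i;
                int r;

                /* Already released, or legacy entry without an id: leave alone. */
                if (e->released || e->ns_id == 0)
                        continue;

                r = lookup(e->ns_id);
                if (r == -ESTALE) {
                        /* Namespace is gone: drop the stale entry. */
                        e->released = true;
                        n_released++;
                }
                /* Any other result (success, -EPERM, ...) keeps the entry. */
        }

        return n_released;
}

/* Stand-in lookup for illustration: pretends that only namespaces
 * with an even id are still alive. */
static int fake_lookup(uint64_t ns_id) {
        return ns_id % 2 == 0 ? 0 : -ESTALE;
}
```

A real lookup would return a file descriptor that has to be closed
again; the stand-in above only returns 0 to keep the sketch short.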

Add a namespace_open_by_id() helper for the lookup. The kernel restricts
open_by_handle_at() on nsfs to processes in the initial user namespace
and reports both permission denials and dead namespaces as -ESTALE. The
helper therefore refuses early with -EPERM outside the initial user
namespace so callers can tell the two apart.
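
One way such an early refusal could look is sketched below. How the
helper actually detects the initial user namespace is not stated here;
this sketch assumes a heuristic based on the first line of
/proc/self/uid_map, which in the initial user namespace reads
"0 0 4294967295". Both function names are hypothetical, and the
heuristic can be fooled by a child namespace that is given the same
full identity mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Returns true if the given uid_map line is the full identity mapping
 * "0 0 4294967295" that the initial user namespace exposes. */
static bool uid_map_line_is_initial(const char *line) {
        unsigned long long inside, outside, range;

        assert(line);

        if (sscanf(line, "%llu %llu %llu", &inside, &outside, &range) != 3)
                return false;

        return inside == 0 && outside == 0 && range == 4294967295ULL;
}

/* Pre-check to run before the by-id lookup: returns 0 if the lookup may
 * proceed, -EPERM if we appear to be outside the initial user namespace.
 * With this check in place, a later -ESTALE from the kernel can be read
 * as "namespace is gone" rather than "permission denied". */
static int namespace_lookup_precheck(const char *uid_map_first_line) {
        return uid_map_line_is_initial(uid_map_first_line) ? 0 : -EPERM;
}
```

A caller would read the first line of /proc/self/uid_map, pass it to the
pre-check, and treat -EPERM as "cannot tell" instead of releasing the
entry.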