From: Lennart Poettering
Date: Wed, 24 Apr 2024 07:44:16 +0000 (+0200)
Subject: capability-util: avoid thread_local
X-Git-Tag: v256-rc1~28
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=fbe8f6928e0795030a5aecceda45c6eccadafd89;p=thirdparty%2Fsystemd.git

capability-util: avoid thread_local

While stracing PID 1's forking off of children I noticed that every
single forked-off child reads cap_last_cap from procfs. That value is a
kernel constant, hence we can save a lot of work if we cache it.

Thing is, we actually do cache it, in a thread_local cache field. This
means that the forked-off processes (which are considered new threads)
have to re-query it, even though we already know the result.

Hence, let's get rid of the thread_local stuff (given that the value is
going to be the same for all threads anyway, and we pretty much have a
single thread only). Use a C11 atomic_int instead, which ensures the
value is either initialized or not initialized, so we don't need to be
concerned about partial initialization.

This makes the cap_last_cap read go away in the children, as strace
shows (since cap_last_cap() is already called by PID 1 before fork()ing
anyway).
---
diff --git a/src/basic/capability-util.c b/src/basic/capability-util.c
index c3cf455e45c..e9b41fe7915 100644
--- a/src/basic/capability-util.c
+++ b/src/basic/capability-util.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: LGPL-2.1-or-later */

 #include
+#include <stdatomic.h>
 #include
 #include
 #include
@@ -34,37 +35,38 @@ int have_effective_cap(int value) {
 }

 unsigned cap_last_cap(void) {
-        static thread_local unsigned saved;
-        static thread_local bool valid = false;
-        _cleanup_free_ char *content = NULL;
-        unsigned long p = 0;
-        int r;
+        static atomic_int saved = INT_MAX;
+        int r, c;

-        if (valid)
-                return saved;
+        c = saved;
+        if (c != INT_MAX)
+                return c;

-        /* available since linux-3.2 */
+        /* Available since linux-3.2 */
+        _cleanup_free_ char *content = NULL;
         r = read_one_line_file("/proc/sys/kernel/cap_last_cap", &content);
-        if (r >= 0) {
-                r = safe_atolu(content, &p);
-                if (r >= 0) {
-
-                        if (p > CAP_LIMIT) /* Safety for the future: if one day the kernel learns more than
+        if (r < 0)
+                log_debug_errno(r, "Failed to read /proc/sys/kernel/cap_last_cap, ignoring: %m");
+        else {
+                r = safe_atoi(content, &c);
+                if (r < 0)
+                        log_debug_errno(r, "Failed to parse /proc/sys/kernel/cap_last_cap, ignoring: %m");
+                else {
+                        if (c > CAP_LIMIT) /* Safety for the future: if one day the kernel learns more than
                                             * 64 caps, then we are in trouble (since we, as much userspace
                                             * and kernel space store capability masks in uint64_t types). We
                                             * also want to use UINT64_MAX as marker for "unset". Hence let's
                                             * hence protect ourselves against that and always cap at 62 for
                                             * now. */
-                                p = CAP_LIMIT;
+                                c = CAP_LIMIT;

-                        saved = p;
-                        valid = true;
-                        return p;
+                        saved = c;
+                        return c;
                 }
         }

-        /* fall back to syscall-probing for pre linux-3.2 */
-        p = (unsigned long) MIN(CAP_LAST_CAP, CAP_LIMIT);
+        /* Fall back to syscall-probing for pre linux-3.2, or where /proc/ is not mounted */
+        unsigned long p = (unsigned long) MIN(CAP_LAST_CAP, CAP_LIMIT);

         if (prctl(PR_CAPBSET_READ, p) < 0) {
@@ -81,10 +83,9 @@ unsigned cap_last_cap(void) {
                         break;
         }

-        saved = p;
-        valid = true;
-
-        return p;
+        c = (int) p;
+        saved = c;
+        return c;
 }

 int capability_update_inherited_set(cap_t caps, uint64_t set) {
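
For illustration, here is a minimal standalone sketch of the caching pattern
this patch introduces, independent of the systemd tree. The function names
(query_kernel_constant(), cached_kernel_constant()) are hypothetical, and the
sketch skips the prctl() fallback and the CAP_LIMIT clamping that the real
cap_last_cap() performs. It only shows why a static atomic_int with INT_MAX as
the "not cached yet" sentinel is safe without locks or thread_local: loads and
stores of a C11 atomic_int cannot be torn, and if two threads race, both simply
read the same kernel constant and store the same value.

/* Hypothetical sketch, not systemd code: cache a kernel constant in a
 * static atomic_int, using INT_MAX as the "not cached yet" marker. */
#include <limits.h>
#include <stdatomic.h>
#include <stdio.h>

/* Stand-in for the procfs lookup; returns -1 on failure. */
static int query_kernel_constant(void) {
        FILE *f = fopen("/proc/sys/kernel/cap_last_cap", "re");
        int v = -1;

        if (!f)
                return -1;
        if (fscanf(f, "%d", &v) != 1)
                v = -1;
        fclose(f);
        return v;
}

int cached_kernel_constant(void) {
        static atomic_int saved = INT_MAX;      /* INT_MAX == not cached yet */

        int c = saved;                          /* atomic load */
        if (c != INT_MAX)
                return c;                       /* fast path, no procfs access */

        c = query_kernel_constant();
        if (c < 0)
                return -1;                      /* don't cache failures */

        saved = c;                              /* atomic store; a racing thread stores the same value */
        return c;
}

int main(void) {
        printf("first call:  %d\n", cached_kernel_constant());
        printf("second call: %d\n", cached_kernel_constant()); /* served from the cache */
        return 0;
}

Note that, unlike the thread_local variant, the cached value is useful across
fork(): the child inherits the parent's memory, sees saved != INT_MAX and never
touches procfs again, which is exactly the behaviour the commit message
describes.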