---
title: Users, Groups, UIDs and GIDs on systemd Systems
category: Users, Groups and Home Directories
layout: default
SPDX-License-Identifier: LGPL-2.1-or-later
---

# Users, Groups, UIDs and GIDs on systemd Systems

Here's a summary of the requirements `systemd` (and Linux) make on UID/GID
assignments and their ranges.

Note that while in theory UIDs and GIDs are orthogonal concepts, in practice
they really aren't. With that in mind, when we discuss UIDs below it should be
assumed that whatever we say about UIDs applies to GIDs in mostly the same way,
and that all the special assignments and ranges for UIDs have mostly the same
validity for GIDs too.

## Special Linux UIDs

In theory, the range of the C type `uid_t` is 32bit wide on Linux,
i.e. 0…4294967295. However, four UIDs are special on Linux:

1. 0 → The `root` super-user

2. 65534 → The `nobody` UID, also called the "overflow" UID or similar. It's
   where various subsystems map unmappable users to, for example file systems
   only supporting 16bit UIDs, NFS or user namespacing. (The latter can be
   changed with a sysctl during runtime, but that's not supported on
   `systemd`. If you do change it you void your warranty.) Because Fedora is a
   bit confused the `nobody` user is called `nfsnobody` there (and they have a
   different `nobody` user at UID 99). I hope this will be corrected eventually
   though. (Also, some distributions call the `nobody` group `nogroup`. I wish
   they didn't.)

3. 4294967295, aka "32bit `(uid_t) -1`" → This UID is not a valid user ID, as
   `setresuid()`, `chown()` and friends treat -1 as a special request to not
   change the UID of the process/file. This UID is hence not available for
   assignment to users in the user database.

4. 65535, aka "16bit `(uid_t) -1`" → Before Linux kernel 2.4 `uid_t` used to be
   16bit, and programs compiled for that would hence assume that `(uid_t) -1`
   is 65535. This UID is hence not usable either.

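The four special values above can be captured in a small validity check. The following is only an illustrative sketch: the constant and function names are made up for this example and are not part of `systemd` or any library.

```python
# Special UIDs as described in the list above (names are hypothetical).
UID_ROOT = 0
UID_NOBODY = 65534
UID_INVALID_16BIT = 0xFFFF       # 65535, the 16bit (uid_t) -1
UID_INVALID_32BIT = 0xFFFFFFFF   # 4294967295, the 32bit (uid_t) -1


def uid_is_assignable(uid: int) -> bool:
    """Return True if uid fits into 32 bits and is not one of the two
    "(uid_t) -1" values that cannot be assigned to a user."""
    if not 0 <= uid <= UID_INVALID_32BIT:
        return False
    return uid not in (UID_INVALID_16BIT, UID_INVALID_32BIT)
```

Note that `root` and `nobody` pass this check: they are special, but they are valid entries in the user database.
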
The `nss-systemd` glibc NSS module will synthesize user database records for
the UIDs 0 and 65534 if the system user database doesn't list them. This means
that any system where this module is enabled works to some minimal level
without `/etc/passwd`.

## Special Distribution UID ranges

Distributions generally split the available UID range in two:

1. 1…999 → System users. These are users that do not map to actual "human"
   users, but are used as security identities for system daemons, to implement
   privilege separation and run system daemons with minimal privileges.

2. 1000…65533 and 65536…4294967294 → Everything else, i.e. regular (human) users.

Note that most distributions allow changing the boundary between system and
regular users, even during runtime as user configuration. Moreover, some older
systems placed the boundary at 499/500, or even 99/100. In `systemd`, the
boundary is configurable only at compile time, as this should be a decision
for distribution builders, not for users. Moreover, we strongly discourage
downstreams from changing the boundary from the upstream default of 999/1000.

Also note that programs such as `adduser` tend to allocate from a subset of the
available regular user range only, usually 1000…60000. This subset is usually
user-configurable, too.

Note that systemd requires that system users and groups are resolvable without
networking available, a requirement that is not made for regular users. This
means regular users may be stored in remote LDAP or NIS databases, but system
users may not (except when a consistent local cache is kept that is available
during earliest boot, including in the initial RAM disk).

## Special `systemd` GIDs

`systemd` defines no special UIDs beyond what Linux already defines (see
above). However, it does define some special group/GID assignments, which are
primarily used for `systemd-udevd`'s device management. The precise list of the
currently defined groups is found in this `sysusers.d` snippet:
[basic.conf](https://raw.githubusercontent.com/systemd/systemd/main/sysusers.d/basic.conf.in)

It's strongly recommended that downstream distributions include these groups in
their default group databases.

Note that the actual GID numbers assigned to these groups do not have to be
constant beyond a specific system. There's one exception however: the `tty`
group must have the GID 5. That's because it must be encoded in the `devpts`
mount parameters during earliest boot, at a time when NSS lookups are not
possible. (Note that the actual GID can be changed at `systemd` build time,
but downstreams are strongly advised against doing that.)

## Special `systemd` UID ranges

`systemd` defines a number of special UID ranges:

1. 60001…60513 → UIDs for home directories managed by
   [`systemd-homed.service(8)`](https://www.freedesktop.org/software/systemd/man/systemd-homed.service.html). UIDs
   from this range are automatically assigned to any home directory discovered,
   and persisted locally on first login. On different systems the same user
   might get different UIDs assigned in case of conflict, though an attempt is
   made to keep UID assignments stable by deriving them from a hash of the
   user name.

2. 61184…65519 → UIDs for dynamic users are allocated from this range (see the
   `DynamicUser=` documentation in
   [`systemd.exec(5)`](https://www.freedesktop.org/software/systemd/man/systemd.exec.html)). This
   range has been chosen so that it is below the 16bit boundary (i.e. below
   65535), in order to provide compatibility with container environments that
   assign a 64K range of UIDs to containers using user namespacing. This range
   is above the 60000 boundary, so that its allocations are unlikely to be
   affected by `adduser` allocations (see above). And we leave some room
   upwards for other purposes. (And if you wonder why precisely these numbers:
   if you write them in hexadecimal, they might make more sense: 0xEF00 and
   0xFFEF). The `nss-systemd` module will synthesize user records implicitly
   for all currently allocated dynamic users from this range. Thus, NSS-based
   user record resolving works correctly without those users being in
   `/etc/passwd`.

3. 524288…1879048191 → UID range for `systemd-nspawn`'s automatic allocation of
   per-container UID ranges. When the `--private-users=pick` switch is used (or
   `-U`) then it will automatically find a so far unused 16bit subrange of this
   range and assign it to the container. The range is picked so that the upper
   16bit of the 32bit UIDs are constant for all users of the container, while
   the lower 16bit directly encode the 65536 UIDs assigned to the
   container. This mode of allocation means that the upper 16bit of any UID
   assigned to a container are kind of a "container ID", while the lower 16bit
   directly expose the container's own UID numbers. If you wonder why precisely
   these numbers, consider them in hexadecimal: 0x00080000…0x6FFFFFFF. This
   range is above the 16bit boundary. Moreover it's below the 31bit boundary,
   as some broken code (specifically: the kernel's `devpts` file system)
   erroneously considers UIDs signed integers, and hence can't deal with values
   of 2^31 or above. The `systemd-machined.service` service will synthesize user
   database records for all UIDs assigned to a running container from this
   range.

Note for both allocation ranges: when a UID allocation takes place NSS is
checked for collisions first, and a different UID is picked if an entry is
found. Thus, the user database is used as synchronization mechanism to ensure
exclusive ownership of UIDs and UID ranges. To ensure compatibility with other
subsystems allocating from the same ranges it is hence essential that they
ensure that whatever they pick shows up in the user/group databases, either by
providing an NSS module, or by adding entries directly to `/etc/passwd` and
`/etc/group`. For performance reasons, do note that `systemd-nspawn` will only
do an NSS check for the first UID of the range it allocates, not all 65536 of
them. Also note that while the allocation logic is operating, the glibc
`lckpwdf()` user database lock is taken, in order to make this logic race-free.

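The collision check for container ranges might be sketched as follows. This is a simplified illustration, not `systemd-nspawn`'s actual code: the function names are hypothetical, the linear scan over candidate bases is an assumption made for brevity, and the `lckpwdf()` lock mentioned above is not taken because the Python standard library does not wrap it.

```python
import pwd
from typing import Callable, Optional

CONTAINER_UID_BASE_MIN = 0x00080000   # 524288
CONTAINER_UID_BASE_MAX = 0x6FFF0000   # 1878982656, last base whose range fits
CONTAINER_UID_RANGE_SIZE = 0x10000    # 65536 UIDs per container


def uid_in_user_database(uid: int) -> bool:
    """Return True if NSS (via getpwuid()) knows a user with this UID."""
    try:
        pwd.getpwuid(uid)
    except KeyError:
        return False
    return True


def pick_container_base(
    is_taken: Callable[[int], bool] = uid_in_user_database,
) -> Optional[int]:
    """Return the first 64K-aligned base UID whose first UID is unused.

    Only the first UID of each candidate range is checked, mirroring the
    note above. Returns None if every candidate is taken.
    """
    for base in range(CONTAINER_UID_BASE_MIN,
                      CONTAINER_UID_BASE_MAX + 1,
                      CONTAINER_UID_RANGE_SIZE):
        if not is_taken(base):
            return base
    return None
```

Passing a custom `is_taken` predicate makes the logic testable without touching the real user database.
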
## Figuring out the system's UID boundaries

The most important boundaries of the local system may be queried with
`pkg-config`:

```
$ pkg-config --variable=systemuidmax systemd
999
$ pkg-config --variable=dynamicuidmin systemd
61184
$ pkg-config --variable=dynamicuidmax systemd
65519
$ pkg-config --variable=containeruidbasemin systemd
524288
$ pkg-config --variable=containeruidbasemax systemd
1878982656
```

(Note that the latter encodes the maximum UID *base* `systemd-nspawn` might
pick. Given that 64K UIDs are assigned to each container according to this
allocation logic, the maximum UID used for this range is hence
1878982656+65535=1879048191.)

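The arithmetic in the note above, plus a small wrapper around the `pkg-config` calls, could look like the sketch below. The helper names are hypothetical, and the constants are simply the values printed in the example output, so a given system may report different ones.

```python
import subprocess
from typing import Optional

# Values taken from the example pkg-config output above.
CONTAINER_UID_BASE_MIN = 524288
CONTAINER_UID_BASE_MAX = 1878982656
UIDS_PER_CONTAINER = 65536


def query_systemd_variable(name: str) -> Optional[int]:
    """Run `pkg-config --variable=NAME systemd` and parse the result.

    Returns None if pkg-config is unavailable, fails, or prints no number.
    """
    try:
        result = subprocess.run(
            ["pkg-config", f"--variable={name}", "systemd"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return None
    value = result.stdout.strip()
    if result.returncode != 0 or not value.isdigit():
        return None
    return int(value)


def highest_container_uid(base_max: int = CONTAINER_UID_BASE_MAX) -> int:
    """Highest UID covered when the highest possible base is picked."""
    return base_max + UIDS_PER_CONTAINER - 1


def container_range_count(base_min: int = CONTAINER_UID_BASE_MIN,
                          base_max: int = CONTAINER_UID_BASE_MAX) -> int:
    """Number of distinct 64K-aligned bases between base_min and base_max."""
    return (base_max - base_min) // UIDS_PER_CONTAINER + 1
```

With the defaults shown, `highest_container_uid()` yields 1879048191 and `container_range_count()` yields 28664 possible container ranges.
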
systemd has compile-time defaults for these boundaries. Using those defaults is
recommended. It will nevertheless query `/etc/login.defs` at runtime when
compiled with `-Dcompat-mutable-uid-boundaries=true` and that file is present.
Support for this is considered only a compatibility feature and should not be
used except when upgrading systems which were created with different defaults.

## Considerations for container managers

If you hack on a container manager, and wonder how and how many UIDs best to
assign to your containers, here are a few recommendations:

1. Definitely, don't assign fewer than 65536 UIDs/GIDs. After all the `nobody`
   user has magic properties, and hence should be available in your container,
   and given that it's assigned the UID 65534, you should really cover the full
   16bit range in your container. Note that systemd will (as mentioned)
   synthesize user records for the `nobody` user, and assumes its availability
   in various other parts of its codebase, too, hence assigning fewer UIDs
   means you lose compatibility with running systemd code inside your
   container. And most likely other packages make similar restrictions.

2. While it's fine to assign more than 65536 UIDs/GIDs to a container, there's
   most likely not much value in doing so, as Linux distributions won't use the
   higher ranges by default (as mentioned neither `adduser` nor `systemd`'s
   dynamic user concept allocate from above the 16bit range). Unless you
   actively care for nested containers, it's hence probably a good idea to
   allocate exactly 65536 UIDs per container, and neither fewer nor more. A
   nice side effect is that by doing so, you expose the same number of UIDs per
   container as Linux 2.2 supported for the whole system, back in the day.

3. Consider allocating UID ranges for containers so that the first UID you
   assign has the lower 16bits all set to zero. That way, the upper 16bits
   become a container ID of some kind, while the lower 16bits directly encode
   the internal container UID. This is the way `systemd-nspawn` allocates UID
   ranges (see above). Following this allocation logic ensures best
   compatibility with `systemd-nspawn` and all other container managers
   following the scheme, as it is sufficient then to check NSS for the first
   UID you pick regarding conflicts, as that's what they do, too. Moreover, it
   makes `chown()`ing container file system trees nicely robust to
   interruptions: as the external UID encodes the internal UID in a fixed way,
   it's very easy to adjust the container's base UID without the need to know
   the original base UID: to change the container base, just mask away the
   upper 16bit, and insert the upper 16bit of the new container base
   instead. Here are the easy conversions to derive the internal UID, the
   external UID, and the container base UID from each other:

   ```
   INTERNAL_UID = EXTERNAL_UID & 0x0000FFFF
   CONTAINER_BASE_UID = EXTERNAL_UID & 0xFFFF0000
   EXTERNAL_UID = INTERNAL_UID | CONTAINER_BASE_UID
   ```

4. When picking a UID range for containers, make sure to check NSS first, with
   a simple `getpwuid()` call: if there's already a user record for the first
   UID you want to pick, then it's already in use: pick a different one. Wrap
   that call in a `lckpwdf()` + `ulckpwdf()` pair, to make allocation
   race-free. Provide an NSS module that makes all UIDs you end up taking show
   up in the user database, and make sure that the NSS module returns
   up-to-date information before you release the lock, so that other system
   components can safely use the NSS user database as allocation check,
   too. Note that if you follow this scheme no changes to `/etc/passwd` need to
   be made, thus minimizing the artifacts the container manager persistently
   leaves in the system.

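The conversions from recommendation 3, together with the "mask away the upper 16bit and insert the new base" step, might be written out like this. It is a minimal sketch; the function names are hypothetical and chosen only for this example.

```python
LOWER_16BIT_MASK = 0x0000FFFF
UPPER_16BIT_MASK = 0xFFFF0000


def internal_uid(external_uid: int) -> int:
    """UID as seen inside the container."""
    return external_uid & LOWER_16BIT_MASK


def container_base_uid(external_uid: int) -> int:
    """Base UID (the "container ID" part) of an external UID."""
    return external_uid & UPPER_16BIT_MASK


def external_uid(internal: int, base: int) -> int:
    """UID as seen on the host for a given internal UID and container base."""
    if internal & ~LOWER_16BIT_MASK:
        raise ValueError("internal UID must fit into 16 bits")
    if base & LOWER_16BIT_MASK:
        raise ValueError("container base must have the lower 16 bits zero")
    return internal | base


def rebase_uid(old_external: int, new_base: int) -> int:
    """Move an external UID to a new container base.

    The old base does not need to be known: it is simply masked away.
    """
    return external_uid(internal_uid(old_external), new_base)
```

Because `rebase_uid()` never looks at the old base, re-running it over a partially `chown()`ed tree gives the same result for files that were already adjusted and for files that were not.
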
## Summary

| UID/GID               | Purpose                           | Defined By    | Listed in                     |
|-----------------------|-----------------------------------|---------------|-------------------------------|
| 0                     | `root` user                       | Linux         | `/etc/passwd` + `nss-systemd` |
| 1…4                   | System users                      | Distributions | `/etc/passwd`                 |
| 5                     | `tty` group                       | `systemd`     | `/etc/passwd`                 |
| 6…999                 | System users                      | Distributions | `/etc/passwd`                 |
| 1000…60000            | Regular users                     | Distributions | `/etc/passwd` + LDAP/NIS/…    |
| 60001…60513           | Human users (homed)               | `systemd`     | `nss-systemd`                 |
| 60514…60577           | Host users mapped into containers | `systemd`     | `systemd-nspawn`              |
| 60578…61183           | Unused                            |               |                               |
| 61184…65519           | Dynamic service users             | `systemd`     | `nss-systemd`                 |
| 65520…65533           | Unused                            |               |                               |
| 65534                 | `nobody` user                     | Linux         | `/etc/passwd` + `nss-systemd` |
| 65535                 | 16bit `(uid_t) -1`                | Linux         |                               |
| 65536…524287          | Unused                            |               |                               |
| 524288…1879048191     | Container UID ranges              | `systemd`     | `nss-systemd`                 |
| 1879048192…2147483647 | Unused                            |               |                               |
| 2147483648…4294967294 | HIC SVNT LEONES                   |               |                               |
| 4294967295            | 32bit `(uid_t) -1`                | Linux         |                               |

Note that "Unused" in the table above doesn't mean that these ranges are
really unused. It just means that these ranges have no well-established
pre-defined purposes between Linux, generic low-level distributions and
`systemd`. There might very well be other packages that allocate from these
ranges.

Note that the range 2147483648…4294967294 (i.e. 2^31…2^32-2) should be handled
with care. Various programs (including kernel file systems, see `devpts`) have
trouble with UIDs outside of the signed 32bit range, i.e. any UIDs equal to or
above 2147483648. It is thus strongly recommended to stay away from this range
in order to avoid complications. This range should be considered reserved for
future, special purposes.

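For scripts that need to tell these ranges apart, the summary table can be transcribed into a lookup. The sketch below copies the boundaries of the table with the upstream defaults; the names `UID_TABLE` and `uid_purpose` are hypothetical, and systems built with different boundaries would need different numbers.

```python
from typing import Optional

# (first, last, purpose) rows transcribed from the summary table.
UID_TABLE = [
    (0, 0, "root user"),
    (1, 4, "System users"),
    (5, 5, "tty group"),
    (6, 999, "System users"),
    (1000, 60000, "Regular users"),
    (60001, 60513, "Human users (homed)"),
    (60514, 60577, "Host users mapped into containers"),
    (60578, 61183, "Unused"),
    (61184, 65519, "Dynamic service users"),
    (65520, 65533, "Unused"),
    (65534, 65534, "nobody user"),
    (65535, 65535, "16bit (uid_t) -1"),
    (65536, 524287, "Unused"),
    (524288, 1879048191, "Container UID ranges"),
    (1879048192, 2147483647, "Unused"),
    (2147483648, 4294967294, "HIC SVNT LEONES"),
    (4294967295, 4294967295, "32bit (uid_t) -1"),
]


def uid_purpose(uid: int) -> Optional[str]:
    """Return the table's purpose for a UID/GID, or None if out of range."""
    for first, last, purpose in UID_TABLE:
        if first <= uid <= last:
            return purpose
    return None
```
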
## Notes on resolvability of user and group names

User names, UIDs, group names and GIDs don't have to be resolvable using NSS
(i.e. `getpwuid()` and `getpwnam()` and friends) all the time. However, systemd
makes the following requirements:

System users generally have to be resolvable during early boot already. This
means they should not be provided by any networked service (as those usually
become available during late boot only), except if a local cache is kept that
makes them available during early boot too (i.e. before networking is
up). Specifically, system users need to be resolvable at least before
`systemd-udevd.service` and `systemd-tmpfiles.service` are started, as both
need to resolve system users. Note, however, that there might be more services
requiring full resolvability of system users than just these two.

Regular users do not need to be resolvable during early boot; it is sufficient
if they become resolvable during late boot. Specifically, regular users need to
be resolvable at the point in time the `nss-user-lookup.target` unit is
reached. This target unit is generally used as synchronization point between
providers of the user database and consumers of it. Services that require that
the user database is fully available (for example, the login service
`systemd-logind.service`) are ordered *after* it, while services that provide
parts of the user database (for example an LDAP user database client) are
ordered *before* it. Note that `nss-user-lookup.target` is a *passive* unit: in
order to minimize synchronization points on systems that don't need it, the
unit is pulled into the initial transaction only if there's at least one
service that really needs it, and that means only if there's a service
providing the local user database somehow through IPC or suchlike. Or in other
words: if you hack on some networked user database project, then make sure you
order your service `Before=nss-user-lookup.target` and that you pull it in with
`Wants=nss-user-lookup.target`. However, if you hack on some project that needs
the user database to be up in full, then order your service
`After=nss-user-lookup.target`, but do *not* pull it in via a `Wants=`
dependency.
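
As a rough illustration of the two cases, the `[Unit]` sections might look as follows. The unit names `example-userdb-provider.service` and `example-userdb-consumer.service` are made up for this sketch; only the `Before=`, `Wants=` and `After=` settings come from the text above.

```ini
# example-userdb-provider.service (hypothetical): provides part of the user database
[Unit]
Description=Example networked user database client
Wants=nss-user-lookup.target
Before=nss-user-lookup.target

# example-userdb-consumer.service (hypothetical): needs the full user database
[Unit]
Description=Example service that resolves regular users
After=nss-user-lookup.target
```

The two `[Unit]` sections belong in separate unit files; they are shown together here only for comparison.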