Commit | Line | Data |
---|---|---|
c3e270f4 | 1 | --- |
48f60ea9 | 2 | title: Users, Groups, UIDs and GIDs on systemd Systems |
5fe63895 | 3 | category: Users, Groups and Home Directories |
b41a3f66 | 4 | layout: default |
0aff7b75 | 5 | SPDX-License-Identifier: LGPL-2.1-or-later |
c3e270f4 FB |
6 | --- |
7 | ||
48f60ea9 | 8 | # Users, Groups, UIDs and GIDs on systemd Systems |
39972553 LP |
9 | |
10 | Here's a summary of the requirements `systemd` (and Linux) make on UID/GID | |
11 | assignments and their ranges. | |
12 | ||
5cc4d7fa | 13 | Note that while in theory UIDs and GIDs are orthogonal concepts, in practice they really aren't. |
14 | With that in mind, when we discuss UIDs below it should be assumed | |
15 | that whatever we say about UIDs applies to GIDs in mostly the same way, | |
16 | and all the special assignments and ranges for UIDs always have mostly the same validity for GIDs too. | |
39972553 LP |
17 | |
18 | ## Special Linux UIDs | |
19 | ||
da890466 | 20 | In theory, the range of the C type `uid_t` is 32-bit wide on Linux, |
39972553 LP |
21 | i.e. 0…4294967295. However, four UIDs are special on Linux: |
22 | ||
7e4f30c3 | 23 | 1. 0 → The `root` super-user. |
39972553 | 24 | |
5cc4d7fa | 25 | 2. 65534 → The `nobody` UID, also called the "overflow" UID or similar. |
26 | It's where various subsystems map unmappable users to, for example file systems | |
27 | only supporting 16-bit UIDs, NFS or user namespacing. | |
28 | (The latter can be changed with a sysctl during runtime, but that's not supported on | |
29 | `systemd`. If you do change it you void your warranty.) | |
30 | Because Fedora is a bit confused the `nobody` user is called `nfsnobody` there | |
31 | (and they have a different `nobody` user at UID 99). | |
32 | I hope this will be corrected eventually though. | |
33 | (Also, some distributions call the `nobody` group `nogroup`. I wish they didn't.) | |
39972553 | 34 | |
da890466 | 35 | 3. 4294967295, aka "32-bit `(uid_t) -1`" → This UID is not a valid user ID, as |
9e4b8893 | 36 | `setresuid()`, `chown()` and friends treat -1 as a special request to not |
5cc4d7fa | 37 | change the UID of the process/file. |
38 | This UID is hence not available for assignment to users in the user database. | |
39972553 | 39 | |
da890466 ZJS |
40 | 4. 65535, aka "16-bit `(uid_t) -1`" → Before Linux kernel 2.4 `uid_t` used to be |
41 | 16-bit, and programs compiled for that would hence assume that `(uid_t) -1` | |
9e4b8893 | 42 | is 65535. This UID is hence not usable either. |
39972553 LP |
43 | |
44 | The `nss-systemd` glibc NSS module will synthesize user database records for | |
5cc4d7fa | 45 | the UIDs 0 and 65534 if the system user database doesn't list them. |
46 | This means that any system where this module is enabled works to some minimal level | |
39972553 LP |
47 | without `/etc/passwd`. |
48 | ||
49 | ## Special Distribution UID ranges | |
50 | ||
51 | Distributions generally split the available UID range in two: | |
52 | ||
53 | 1. 1…999 → System users. These are users that do not map to actual "human" | |
54 | users, but are used as security identities for system daemons, to implement | |
55 | privilege separation and run system daemons with minimal privileges. | |
56 | ||
57 | 2. 1000…65533 and 65536…4294967294 → Everything else, i.e. regular (human) users. | |
58 | ||
7e4f30c3 | 59 | Some older systems placed the boundary at 499/500, or even 99/100, |
5cc4d7fa | 60 | and some distributions allow the boundary between system and regular users to be changed via local configuration. |
7e4f30c3 ZJS |
61 | In `systemd`, the boundary is configurable during compilation time |
62 | and is also queried from `/etc/login.defs` at runtime, | |
63 | if the `-Dcompat-mutable-uid-boundaries=true` compile-time setting is used. | |
64 | We strongly discourage downstreams from changing the boundary from the upstream default of 999/1000. | |
39972553 LP |
65 | |
66 | Also note that programs such as `adduser` tend to allocate from a subset of the | |
7e4f30c3 ZJS |
67 | available regular user range only, usually 1000..60000. |
68 | This range can also be configured using `/etc/login.defs`. | |
39972553 LP |
69 | |
70 | Note that systemd requires that system users and groups are resolvable without | |
5cc4d7fa | 71 | network — a requirement that is not made for regular users. |
72 | This means regular users may be stored in remote LDAP or NIS databases, | |
73 | but system users may not (except when there's a consistent local cache kept, that is | |
55c041b4 | 74 | available during earliest boot, including in the initrd). |
39972553 LP |
75 | |
76 | ## Special `systemd` GIDs | |
77 | ||
5cc4d7fa | 78 | `systemd` defines no special UIDs beyond what Linux already defines (see above). |
79 | However, it does define some special group/GID assignments, | |
80 | which are primarily used for `systemd-udevd`'s device management. | |
81 | The precise list of the currently defined groups is found in this `sysusers.d` snippet: | |
df1f621b | 82 | [basic.conf](https://raw.githubusercontent.com/systemd/systemd/main/sysusers.d/basic.conf.in) |
39972553 LP |
83 | |
84 | It's strongly recommended that downstream distributions include these groups in | |
85 | their default group databases. | |
86 | ||
87 | Note that the actual GID numbers assigned to these groups do not have to be | |
5cc4d7fa | 88 | constant beyond a specific system. |
89 | There's one exception however: the `tty` group must have the GID 5. | |
90 | That's because it must be encoded in the `devpts` mount parameters during earliest boot, at a time where NSS lookups are not | |
91 | possible. | |
92 | (Note that the actual GID can be changed during `systemd` build time, but downstreams are strongly advised against doing that.) | |
39972553 LP |
93 | |
94 | ## Special `systemd` UID ranges | |
95 | ||
96 | `systemd` defines a number of special UID ranges: | |
97 | ||
f62dd237 | 98 | 1. 60001…60513 → UIDs for home directories managed by |
5cc4d7fa | 99 | [`systemd-homed.service(8)`](https://www.freedesktop.org/software/systemd/man/systemd-homed.service.html). |
100 | UIDs from this range are automatically assigned to any home directory discovered, | |
101 | and persisted locally on first login. | |
102 | On different systems the same user might get different UIDs assigned in case of conflict, though it is | |
f62dd237 LP |
103 | attempted to make UID assignments stable, by deriving them from a hash of |
104 | the user name. | |
105 | ||
106 | 2. 61184…65519 → UIDs for dynamic users are allocated from this range (see the | |
39972553 | 107 | `DynamicUser=` documentation in |
5cc4d7fa | 108 | [`systemd.exec(5)`](https://www.freedesktop.org/software/systemd/man/systemd.exec.html)). |
109 | This range has been chosen so that it is below the 16-bit boundary | |
110 | (i.e. below 65535), in order to provide compatibility with container environments that | |
111 | assign a 64K range of UIDs to containers using user namespacing. | |
112 | This range is above the 60000 boundary, so that its allocations are unlikely to be | |
113 | affected by `adduser` allocations (see above). | |
114 | And we leave some room upwards for other purposes. | |
115 | (And if you wonder why precisely these numbers: if you write them in hexadecimal, they might make more sense: 0xEF00 and 0xFFEF). | |
116 | The `nss-systemd` module will synthesize user records implicitly | |
117 | for all currently allocated dynamic users from this range. | |
118 | Thus, NSS-based user record resolving works correctly without those users being in `/etc/passwd`. | |
39972553 | 119 | |
f62dd237 | 120 | 3. 524288…1879048191 → UID range for `systemd-nspawn`'s automatic allocation of |
5cc4d7fa | 121 | per-container UID ranges. |
122 | When the `--private-users=pick` switch is used (or `-U`) then it will automatically find a so far unused 16-bit subrange of this | |
123 | range and assign it to the container. | |
124 | The range is picked so that the upper 16 bits of the 32-bit UIDs are constant for all users of the container, | |
125 | while the lower 16 bits directly encode the 65536 UIDs assigned to the container. | |
126 | This mode of allocation means that the upper 16 bits of any UID | |
da890466 | 127 | assigned to a container are kind of a "container ID", while the lower 16 bits |
5cc4d7fa | 128 | directly expose the container's own UID numbers. |
129 | If you wonder why precisely these numbers, consider them in hexadecimal: 0x00080000…0x6FFFFFFF. | |
130 | This range is above the 16-bit boundary. | |
131 | Moreover it's below the 31-bit boundary, as some broken code (specifically: the kernel's `devpts` file system) | |
132 | erroneously considers UIDs signed integers, and hence can't deal with values of 2^31 or above. | |
133 | The `systemd-machined.service` service will synthesize user database records for all UIDs assigned to a running container from this range. | |
39972553 | 134 | |
5bc9ea07 | 135 | Note for both allocation ranges: when a UID allocation takes place NSS is |
5cc4d7fa | 136 | checked for collisions first, and a different UID is picked if an entry is found. |
137 | Thus, the user database is used as synchronization mechanism to ensure | |
138 | exclusive ownership of UIDs and UID ranges. | |
139 | To ensure compatibility with other subsystems allocating from the same ranges it is hence essential that they | |
39972553 | 140 | ensure that whatever they pick shows up in the user/group databases, either by |
5cc4d7fa | 141 | providing an NSS module, or by adding entries directly to `/etc/passwd` and `/etc/group`. |
142 | For performance reasons, do note that `systemd-nspawn` will only | |
143 | do an NSS check for the first UID of the range it allocates, not all 65536 of them. | |
144 | Also note that while the allocation logic is operating, | |
145 | the glibc `lckpwdf()` user database lock is taken, in order to make this logic race-free. | |
39972553 LP |
146 | |
147 | ## Figuring out the system's UID boundaries | |
148 | ||
149 | The most important boundaries of the local system may be queried with | |
150 | `pkg-config`: | |
151 | ||
5cc4d7fa | 152 | ```sh |
4e434bc0 | 153 | $ pkg-config --variable=system_uid_max systemd |
39972553 | 154 | 999 |
4e434bc0 | 155 | $ pkg-config --variable=dynamic_uid_min systemd |
39972553 | 156 | 61184 |
4e434bc0 | 157 | $ pkg-config --variable=dynamic_uid_max systemd |
39972553 | 158 | 65519 |
4e434bc0 | 159 | $ pkg-config --variable=container_uid_base_min systemd |
39972553 | 160 | 524288 |
4e434bc0 | 161 | $ pkg-config --variable=container_uid_base_max systemd |
39972553 LP |
162 | 1878982656 |
163 | ``` | |
164 | ||
165 | (Note that the latter encodes the maximum UID *base* `systemd-nspawn` might | |
166 | pick — given that 64K UIDs are assigned to each container according to this | |
167 | allocation logic, the maximum UID used for this range is hence | |
168 | 1878982656+65535=1879048191.) | |
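The arithmetic in the note above can be checked with a minimal sketch. The helper name `container_uid_max` is hypothetical (not part of `systemd`), and it assumes each container is assigned exactly 65536 UIDs starting at the given base:

```sh
# Highest UID covered by a container range that starts at the given base,
# assuming exactly 65536 UIDs per container (hypothetical helper).
container_uid_max() {
    echo $(( $1 + 65535 ))
}

# Highest base that may be picked -> highest UID in the container range.
container_uid_max 1878982656
# The same value in hexadecimal, for comparison with 0x6FFFFFFF.
printf '0x%08X\n' "$(container_uid_max 1878982656)"
```

On a real system the base would come from `pkg-config --variable=container_uid_base_max systemd` rather than being hard-coded.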
169 | ||
5cc4d7fa | 170 | systemd has compile-time defaults for these boundaries. |
171 | Using those defaults is recommended. | |
172 | It will nevertheless query `/etc/login.defs` at runtime, when compiled with `-Dcompat-mutable-uid-boundaries=true` and that file is present. | |
53393c89 | 173 | Support for this is considered only a compatibility feature and should not be |
f223fd6a | 174 | used except when upgrading systems which were created with different defaults. |
39972553 LP |
175 | |
176 | ## Considerations for container managers | |
177 | ||
178 | If you hack on a container manager, and wonder how and how many UIDs best to | |
179 | assign to your containers, here are a few recommendations: | |
180 | ||
5cc4d7fa | 181 | 1. Definitely, don't assign fewer than 65536 UIDs/GIDs. |
182 | After all the `nobody` user has magic properties, and hence should be available in your container, | |
183 | and given that it's assigned the UID 65534, you should really cover the full 16-bit range in your container. | |
184 | Note that systemd will — as mentioned — synthesize user records for the `nobody` user, | |
185 | and assumes its availability in various other parts of its codebase, too, hence assigning fewer users means you lose | |
186 | compatibility with running systemd code inside your container. | |
187 | And most likely other packages make similar restrictions. | |
39972553 LP |
188 | |
189 | 2. While it's fine to assign more than 65536 UIDs/GIDs to a container, there's | |
190 | most likely not much value in doing so, as Linux distributions won't use the | |
191 | higher ranges by default (as mentioned neither `adduser` nor `systemd`'s | |
5cc4d7fa | 192 | dynamic user concept allocate from above the 16-bit range). |
193 | Unless you actively care for nested containers, it's hence probably a good idea to allocate exactly | |
194 | 65536 UIDs per container, and neither less nor more. | |
195 | A pretty side-effect is that by doing so, you expose the same number of UIDs per container as Linux 2.2 | |
39972553 LP |
196 | supported for the whole system, back in the day. |
197 | ||
5cc4d7fa | 198 | 3. Consider allocating UID ranges for containers so that the first UID you assign has the lower 16 bits all set to zero. |
199 | That way, the upper 16 bits become a container ID of some kind, | |
200 | while the lower 16 bits directly encode the internal container UID. | |
201 | This is the way `systemd-nspawn` allocates UID ranges (see above). | |
202 | Following this allocation logic ensures best compatibility with `systemd-nspawn` | |
203 | and all other container managers following the scheme, as it | |
204 | is sufficient then to check NSS for the first UID you pick regarding conflicts, as that's what they do, too. | |
205 | Moreover, it makes `chown()`ing container file system trees nicely robust to interruptions: as the external UID encodes the | |
39972553 LP |
206 | internal UID in a fixed way, it's very easy to adjust the container's base UID |
207 | without the need to know the original base UID: to change the container base, | |
5cc4d7fa | 208 | just mask away the upper 16 bits, and insert the upper 16 bits of the new container base instead. |
209 | Here are the easy conversions to derive the internal UID, the external UID, and the container base UID from each other: | |
39972553 | 210 | |
5cc4d7fa | 211 | ```sh |
212 | INTERNAL_UID=$(( EXTERNAL_UID & 0x0000FFFF )) | |
213 | CONTAINER_BASE_UID=$(( EXTERNAL_UID & 0xFFFF0000 )) | |
214 | EXTERNAL_UID=$(( INTERNAL_UID | CONTAINER_BASE_UID )) | |
215 | ``` | |
39972553 LP |
216 | |
217 | 4. When picking a UID range for containers, make sure to check NSS first, with | |
218 | a simple `getpwuid()` call: if there's already a user record for the first UID | |
5cc4d7fa | 219 | you want to pick, then it's already in use: pick a different one. |
220 | Wrap that call in a `lckpwdf()` + `ulckpwdf()` pair, to make allocation race-free. | |
221 | Provide an NSS module that makes all UIDs you end up taking show up | |
39972553 LP |
222 | in the user database, and make sure that the NSS module returns up-to-date |
223 | information before you release the lock, so that other system components can | |
5cc4d7fa | 224 | safely use the NSS user database as allocation check, too. |
225 | Note that if you follow this scheme no changes to `/etc/passwd` need to be made, thus minimizing | |
39972553 LP |
226 | the artifacts the container manager persistently leaves in the system. |
227 | ||
5cc4d7fa | 228 | 5. `systemd-homed` by default mounts the home directories it manages with UID mapping applied. |
229 | It will map four UID ranges into that uidmap, and leave everything else unmapped: | |
230 | the range from 0…60000, the user's own UID, | |
231 | the range 60514…65534, and the container range 524288…1879048191. | |
232 | This means files/directories in home directories managed by `systemd-homed` cannot be | |
9df83788 | 233 | owned by UIDs/GIDs outside of these four ranges (attempts to `chown()` files to |
5cc4d7fa | 234 | UIDs outside of these ranges will fail). |
235 | Thus, if container trees are to be placed within a home directory managed by `systemd-homed` they should take | |
236 | these ranges into consideration and either place the trees at base UID 0 | |
237 | (and then map them to a higher UID range for use in user namespacing via another | |
238 | level of UID mapped mounts, at *runtime*) or at a base UID from the container UID range. | |
239 | That said, placing container trees (and in fact any files/directories not owned by the home directory's user) in home directories | |
9df83788 LP |
240 | is generally a questionable idea (regardless of whether `systemd-homed` is used |
241 | or not), given this typically breaks quota assumptions, makes it impossible for | |
242 | users to properly manage all files in their own home directory due to | |
243 | permission problems, introduces security issues around SETUID and severely | |
5cc4d7fa | 244 | restricts compatibility with networked home directories. |
245 | Typically, it's a much better idea to place container images outside of the home directory, | |
9df83788 LP |
246 | i.e. somewhere below `/var/` or similar. |
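The masking scheme from recommendation 3 and the NSS check from recommendation 4 can be sketched in shell as follows. All function names here are hypothetical helpers, not part of any `systemd` tool. `uid_in_use` assumes the `getent` utility is available and, unlike a real allocator, does not take the `lckpwdf()` lock, so on its own it is not race-free:

```sh
# Internal (in-container) UID: the lower 16 bits of the external UID.
internal_uid() { echo $(( $1 & 0x0000FFFF )); }

# Container base UID: the upper 16 bits of the external UID.
container_base() { echo $(( $1 & 0xFFFF0000 )); }

# Move an external UID to a new container base, keeping the internal UID.
# usage: rebase_uid EXTERNAL_UID NEW_BASE
rebase_uid() {
    echo $(( ($1 & 0x0000FFFF) | ($2 & 0xFFFF0000) ))
}

# Succeeds if NSS already has a user record for the given UID.
# Sketch only: a real allocator would hold the lckpwdf() lock around this.
uid_in_use() { getent passwd "$1" >/dev/null 2>&1; }

internal_uid 525288          # internal UID 1000
container_base 525288        # base 524288 (0x00080000)
rebase_uid 525288 1048576    # same internal UID under base 0x00100000
```

Because the internal UID is recovered purely by masking, a recursive re-`chown()` that gets interrupted can simply be restarted with `rebase_uid`; files already moved to the new base map to themselves.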
247 | ||
39972553 LP |
248 | ## Summary |
249 | ||
250 | | UID/GID | Purpose | Defined By | Listed in | | |
251 | |-----------------------|-----------------------|---------------|-------------------------------| | |
252 | | 0 | `root` user | Linux | `/etc/passwd` + `nss-systemd` | | |
253 | | 1…4 | System users | Distributions | `/etc/passwd` | | |
254 | | 5 | `tty` group | `systemd` | `/etc/group` | |
255 | | 6…999 | System users | Distributions | `/etc/passwd` | | |
256 | | 1000…60000 | Regular users | Distributions | `/etc/passwd` + LDAP/NIS/… | | |
a06c9ac2 | 257 | | 60001…60513 | Human users (homed) | `systemd` | `nss-systemd` | |
da890466 | 258 | | 60514…60577 | Host users mapped into containers | `systemd` | `systemd-nspawn` | |
a06c9ac2 | 259 | | 60578…61183 | Unused | | | |
39972553 LP |
260 | | 61184…65519 | Dynamic service users | `systemd` | `nss-systemd` | |
261 | | 65520…65533 | Unused | | | | |
262 | | 65534 | `nobody` user | Linux | `/etc/passwd` + `nss-systemd` | | |
da890466 | 263 | | 65535 | 16-bit `(uid_t) -1` | Linux | | |
39972553 | 264 | | 65536…524287 | Unused | | | |
38ccb557 | 265 | | 524288…1879048191 | Container UID ranges | `systemd` | `nss-systemd` | |
581004bd | 266 | | 1879048192…2147483647 | Unused | | | |
a305eda3 | 267 | | 2147483648…4294967294 | HIC SVNT LEONES | | | |
da890466 | 268 | | 4294967295 | 32-bit `(uid_t) -1` | Linux | | |
39972553 | 269 | |
5cc4d7fa | 270 | Note that "Unused" in the table above doesn't mean that these ranges are really unused. |
271 | It just means that these ranges have no well-established | |
272 | pre-defined purposes between Linux, generic low-level distributions and `systemd`. | |
273 | There might very well be other packages that allocate from these ranges. | |
bf613f7a | 274 | |
5cc4d7fa | 275 | Note that the range 2147483648…4294967294 (i.e. 2^31…2^32-2) should be handled with care. |
276 | Various programs (including kernel file systems — see `devpts` — or | |
09d4d603 | 277 | even kernel syscalls – see `setfsuid()`) have trouble with UIDs outside of the |
5cc4d7fa | 278 | signed 32-bit range, i.e. any UIDs equal to or above 2147483648. |
279 | It is thus strongly recommended to stay away from this range in order to avoid complications. | |
280 | This range should be considered reserved for future, special purposes. | |
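As an illustration of the summary table, here is a minimal sketch that maps a numeric UID to the row it falls into. The function name `uid_class` and the labels it prints are made up for this example, and the boundaries are the upstream defaults from the table, which a given system may override:

```sh
# Classify a UID according to the summary table (upstream default boundaries).
# Hypothetical helper; the labels are illustrative only.
uid_class() {
    uid=$1
    if   [ "$uid" -eq 0 ];          then echo "root"
    elif [ "$uid" -le 999 ];        then echo "system"
    elif [ "$uid" -le 60000 ];      then echo "regular"
    elif [ "$uid" -le 60513 ];      then echo "homed"
    elif [ "$uid" -le 60577 ];      then echo "mapped"      # host users mapped into containers
    elif [ "$uid" -le 61183 ];      then echo "unused"
    elif [ "$uid" -le 65519 ];      then echo "dynamic"
    elif [ "$uid" -le 65533 ];      then echo "unused"
    elif [ "$uid" -eq 65534 ];      then echo "nobody"
    elif [ "$uid" -eq 65535 ];      then echo "invalid"     # 16-bit (uid_t) -1
    elif [ "$uid" -le 524287 ];     then echo "unused"
    elif [ "$uid" -le 1879048191 ]; then echo "container"
    elif [ "$uid" -le 2147483647 ]; then echo "unused"
    elif [ "$uid" -le 4294967294 ]; then echo "reserved"    # above the signed 32-bit range
    else                                 echo "invalid"     # 32-bit (uid_t) -1
    fi
}

uid_class 61184
```

For GIDs the same ranges apply, with the one fixed assignment of GID 5 to the `tty` group inside the system range.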
a305eda3 | 281 | |
bf613f7a LP |
282 | ## Notes on resolvability of user and group names |
283 | ||
284 | User names, UIDs, group names and GIDs don't have to be resolvable using NSS | |
5cc4d7fa | 285 | (i.e. `getpwuid()` and `getpwnam()` and friends) all the time. |
286 | However, systemd makes the following requirements: | |
bf613f7a | 287 | |
5cc4d7fa | 288 | System users generally have to be resolvable during early boot already. |
289 | This means they should not be provided by any networked service (as those usually | |
bf613f7a | 290 | become available during late boot only), except if a local cache is kept that |
5cc4d7fa | 291 | makes them available during early boot too (i.e. before networking is up). |
292 | Specifically, system users need to be resolvable at least before | |
293 | `systemd-udevd.service` and `systemd-tmpfiles-setup.service` are started, | |
294 | as both need to resolve system users — but note that there might be more services | |
bf613f7a LP |
295 | requiring full resolvability of system users than just these two. |
296 | ||
297 | Regular users do not need to be resolvable during early boot; it is sufficient | |
5cc4d7fa | 298 | if they become resolvable during late boot. |
299 | Specifically, regular users need to be resolvable at the point in time the `nss-user-lookup.target` unit is reached. | |
300 | This target unit is generally used as synchronization point between | |
301 | providers of the user database and consumers of it. | |
302 | Services that require that the user database is fully available (for example, the login service | |
bf613f7a | 303 | `systemd-logind.service`) are ordered *after* it, while services that provide |
5cc4d7fa | 304 | parts of the user database (for example an LDAP user database client) are ordered *before* it. |
305 | Note that `nss-user-lookup.target` is a *passive* unit: in | |
bf613f7a LP |
306 | order to minimize synchronization points on systems that don't need it the unit |
307 | is pulled into the initial transaction only if there's at least one service | |
308 | that really needs it, and that means only if there's a service providing the | |
5cc4d7fa | 309 | local user database somehow through IPC or suchlike. |
310 | Or in other words: if you hack on some networked user database project, then make sure you order your | |
311 | service `Before=nss-user-lookup.target` and that you pull it in with `Wants=nss-user-lookup.target`. | |
312 | However, if you hack on some project that needs the user database to be up in full, then order your service | |
313 | `After=nss-user-lookup.target`, but do *not* pull it in via a `Wants=` dependency. |
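The two ordering rules above can be summarized with a small sketch that prints the `[Unit]` lines each kind of service would carry. The function name `nss_ordering` and its `provider`/`consumer` arguments are made up for this illustration; only the `Before=`, `After=` and `Wants=` settings come from the text:

```sh
# Print the [Unit] ordering lines for a service, depending on its role
# relative to the user database (hypothetical helper for illustration).
nss_ordering() {
    case "$1" in
        provider)
            # Provides part of the user database: order before the target
            # and pull it in.
            printf '[Unit]\nBefore=nss-user-lookup.target\nWants=nss-user-lookup.target\n'
            ;;
        consumer)
            # Needs the full user database: order after the target,
            # but do not pull it in.
            printf '[Unit]\nAfter=nss-user-lookup.target\n'
            ;;
        *)
            echo "usage: nss_ordering provider|consumer" >&2
            return 1
            ;;
    esac
}

nss_ordering provider
nss_ordering consumer
```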