---
title: Users, Groups, UIDs and GIDs on `systemd` Systems
category: Concepts
---

# Users, Groups, UIDs and GIDs on `systemd` Systems

Here's a summary of the requirements `systemd` (and Linux) make on UID/GID
assignments and their ranges.

Note that while in theory UIDs and GIDs are orthogonal concepts, in practice
they really aren't. With that in mind, when we discuss UIDs below it should be
assumed that whatever we say about UIDs applies to GIDs in mostly the same way,
and all the special assignments and ranges for UIDs always have mostly the same
validity for GIDs too.

## Special Linux UIDs

In theory, the range of the C type `uid_t` is 32bit wide on Linux,
i.e. 0…4294967295. However, four UIDs are special on Linux:

1. 0 → The `root` super-user

2. 65534 → The `nobody` UID, also called the "overflow" UID or similar. It's
   where various subsystems map unmappable users to, for example file systems
   only supporting 16bit UIDs, NFS or user namespacing. (The latter can be
   changed with a sysctl during runtime, but that's not supported on
   `systemd`. If you do change it you void your warranty.) Because Fedora is a
   bit confused the `nobody` user is called `nfsnobody` there (and they have a
   different `nobody` user at UID 99). I hope this will be corrected eventually
   though. (Also, some distributions call the `nobody` group `nogroup`. I wish
   they didn't.)

3. 4294967295, aka "32bit `(uid_t) -1`" → This UID is not a valid user ID, as
   `setresuid()`, `chown()` and friends treat -1 as a special request to not
   change the UID of the process/file. This UID is hence not available for
   assignment to users in the user database.

4. 65535, aka "16bit `(uid_t) -1`" → Before Linux kernel 2.4 `uid_t` used to be
   16bit, and programs compiled for that would hence assume that `(uid_t) -1`
   is 65535. This UID is hence not usable either.

The `nss-systemd` glibc NSS module will synthesize user database records for
the UIDs 0 and 65534 if the system user database doesn't list them. This means
that any system where this module is enabled works to some minimal level
without `/etc/passwd`.
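The two `(uid_t) -1` values above can be expressed as a small validity check. The following is a minimal sketch, not `systemd`'s own implementation; the function and constant names are made up for illustration.

```python
# Hypothetical names; the numeric values are the ones listed above.
UID_INVALID_16 = 0xFFFF      # 65535, the 16bit (uid_t) -1
UID_INVALID_32 = 0xFFFFFFFF  # 4294967295, the 32bit (uid_t) -1


def uid_is_valid(uid: int) -> bool:
    """Return True if `uid` may be assigned to a user at all."""
    if not 0 <= uid <= 0xFFFFFFFF:
        # Outside the 32bit range of uid_t.
        return False
    # Both "-1" values are reserved and must never name a user.
    return uid not in (UID_INVALID_16, UID_INVALID_32)
```

With this check, 0 (`root`) and 65534 (`nobody`) are valid, while 65535 and 4294967295 are rejected.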

## Special Distribution UID ranges

Distributions generally split the available UID range in two:

1. 1…999 → System users. These are users that do not map to actual "human"
   users, but are used as security identities for system daemons, to implement
   privilege separation and run system daemons with minimal privileges.

2. 1000…65533 and 65536…4294967294 → Everything else, i.e. regular (human) users.

Note that most distributions allow changing the boundary between system and
regular users, even during runtime as user configuration. Moreover, some older
systems placed the boundary at 499/500, or even 99/100. In `systemd`, the
boundary is configurable only during compilation time, as this should be a
decision for distribution builders, not for users. Moreover, we strongly
discourage downstreams from changing the boundary from the upstream default of
999/1000.

Also note that programs such as `adduser` tend to allocate from a subset of the
available regular user range only, usually 1000…60000. This subset is usually
user-configurable, too.

Note that `systemd` requires that system users and groups are resolvable
without networking available, a requirement that is not made for regular
users. This means regular users may be stored in remote LDAP or NIS databases,
but system users may not (except when there's a consistent local cache kept,
that is available during earliest boot, including in the initial RAM disk).

## Special `systemd` GIDs

`systemd` defines no special UIDs beyond what Linux already defines (see
above). However, it does define some special group/GID assignments, which are
primarily used for `systemd-udevd`'s device management. The precise list of the
currently defined groups is found in this `sysusers.d` snippet:
[basic.conf](https://raw.githubusercontent.com/systemd/systemd/master/sysusers.d/basic.conf.in)

It's strongly recommended that downstream distributions include these groups in
their default group databases.

Note that the actual GID numbers assigned to these groups do not have to be
constant beyond a specific system. There's one exception however: the `tty`
group must have the GID 5. That's because it must be encoded in the `devpts`
mount parameters during earliest boot, at a time when NSS lookups are not
possible. (Note that the actual GID can be changed during `systemd` build time,
but downstreams are strongly advised against doing that.)

## Special `systemd` UID ranges

`systemd` defines a number of special UID ranges:

1. 61184…65519 → UIDs for dynamic users are allocated from this range (see the
   `DynamicUser=` documentation in
   [`systemd.exec(5)`](https://www.freedesktop.org/software/systemd/man/systemd.exec.html)). This
   range has been chosen so that it is below the 16bit boundary (i.e. below
   65535), in order to provide compatibility with container environments that
   assign a 64K range of UIDs to containers using user namespacing. This range
   is above the 60000 boundary, so that its allocations are unlikely to be
   affected by `adduser` allocations (see above). And we leave some room
   upwards for other purposes. (And if you wonder why precisely these numbers:
   if you write them in hexadecimal, they might make more sense: 0xEF00 and
   0xFFEF.) The `nss-systemd` module will synthesize user records implicitly
   for all currently allocated dynamic users from this range. Thus, NSS-based
   user record resolving works correctly without those users being in
   `/etc/passwd`.

2. 524288…1879048191 → UID range for `systemd-nspawn`'s automatic allocation of
   per-container UID ranges. When the `--private-users=pick` switch is used (or
   `-U`) then it will automatically find a so far unused 16bit subrange of this
   range and assign it to the container. The range is picked so that the upper
   16bit of the 32bit UIDs are constant for all users of the container, while
   the lower 16bit directly encode the 65536 UIDs assigned to the
   container. This mode of allocation means that the upper 16bit of any UID
   assigned to a container are kind of a "container ID", while the lower 16bit
   directly expose the container's own UID numbers. If you wonder why precisely
   these numbers, consider them in hexadecimal: 0x00080000…0x6FFFFFFF. This
   range is above the 16bit boundary. Moreover it's below the 31bit boundary,
   as some broken code (specifically: the kernel's `devpts` file system)
   erroneously considers UIDs signed integers, and hence can't deal with values
   of 2^31 or above. The `nss-mymachines` glibc NSS module will synthesize user
   database records for all UIDs assigned to a running container from this
   range.

Note for both allocation ranges: when a UID allocation takes place NSS is
checked for collisions first, and a different UID is picked if an entry is
found. Thus, the user database is used as synchronization mechanism to ensure
exclusive ownership of UIDs and UID ranges. To ensure compatibility with other
subsystems allocating from the same ranges it is hence essential that they
ensure that whatever they pick shows up in the user/group databases, either by
providing an NSS module, or by adding entries directly to `/etc/passwd` and
`/etc/group`. Do note that, for performance reasons, `systemd-nspawn` will only
do an NSS check for the first UID of the range it allocates, not all 65536 of
them. Also note that while the allocation logic is operating, the glibc
`lckpwdf()` user database lock is taken, in order to make this logic race-free.
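The split into an upper 16bit "container ID" and a lower 16bit internal UID, as described for the container range above, can be sketched with plain bit masks. This is only an illustration; the function names are hypothetical and not part of `systemd-nspawn`.

```python
INTERNAL_MASK = 0x0000FFFF  # lower 16bit: the UID as seen inside the container
BASE_MASK = 0xFFFF0000      # upper 16bit: the container's base UID


def internal_uid(external: int) -> int:
    """UID as seen inside the container."""
    return external & INTERNAL_MASK


def container_base_uid(external: int) -> int:
    """First UID of the 64K range the external UID belongs to."""
    return external & BASE_MASK


def external_uid(base: int, internal: int) -> int:
    """Combine a container base and an internal UID into a host UID."""
    if base & INTERNAL_MASK:
        raise ValueError("container base must have its lower 16bit set to zero")
    if not 0 <= internal <= INTERNAL_MASK:
        raise ValueError("internal UID must fit into 16bit")
    return base | internal


def rebase(external: int, new_base: int) -> int:
    """Move a host UID to a different container base, keeping the internal UID."""
    return external_uid(new_base, internal_uid(external))
```

For example, internal UID 1000 in a container with base 0x00080000 (524288) maps to host UID 525288, and rebasing that UID to base 0x00090000 (589824) yields 590824.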

## Figuring out the system's UID boundaries

The most important boundaries of the local system may be queried with
`pkg-config`:

```
$ pkg-config --variable=systemuidmax systemd
999
$ pkg-config --variable=dynamicuidmin systemd
61184
$ pkg-config --variable=dynamicuidmax systemd
65519
$ pkg-config --variable=containeruidbasemin systemd
524288
$ pkg-config --variable=containeruidbasemax systemd
1878982656
```

(Note that the latter encodes the maximum UID *base* `systemd-nspawn` might
pick. Given that 64K UIDs are assigned to each container according to this
allocation logic, the maximum UID used for this range is hence
1878982656+65535=1879048191.)

Note that `systemd` does not make any of these values runtime-configurable. All
these boundaries are chosen during build time. That said, the system UID/GID
boundary is traditionally configured in `/etc/login.defs`, though `systemd`
won't look there during runtime.
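The relationship between these decimal values and their hexadecimal forms can be checked with a few lines of arithmetic. This is a sketch using the upstream default values quoted above; the constant names are chosen for illustration, and a given system may have been built with different boundaries.

```python
# Upstream defaults as quoted above, written in hexadecimal.
DYNAMIC_UID_MIN = 0xEF00             # 61184
DYNAMIC_UID_MAX = 0xFFEF             # 65519
CONTAINER_UID_BASE_MIN = 0x00080000  # 524288
CONTAINER_UID_BASE_MAX = 0x6FFF0000  # 1878982656
UIDS_PER_CONTAINER = 0x10000         # 64K UIDs per container

# The highest UID used by the container range is the last UID of the
# range that starts at the maximum base.
highest_container_uid = CONTAINER_UID_BASE_MAX + UIDS_PER_CONTAINER - 1

# Number of distinct 64K ranges between the minimum and maximum base.
container_ranges = (
    (CONTAINER_UID_BASE_MAX - CONTAINER_UID_BASE_MIN) // UIDS_PER_CONTAINER + 1
)

print(highest_container_uid, hex(highest_container_uid), container_ranges)
```

The highest container UID computed this way is 1879048191 (0x6FFFFFFF), which stays below 2^31.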

## Considerations for container managers

If you hack on a container manager, and wonder how and how many UIDs best to
assign to your containers, here are a few recommendations:

1. Definitely, don't assign fewer than 65536 UIDs/GIDs. After all the `nobody`
user has magic properties, and hence should be available in your container, and
given that it's assigned the UID 65534, you should really cover the full 16bit
range in your container. Note that `systemd` will, as mentioned, synthesize
user records for the `nobody` user, and assumes its availability in various
other parts of its codebase, too, hence assigning fewer users means you lose
compatibility with running `systemd` code inside your container. And most
likely other packages make similar restrictions.

2. While it's fine to assign more than 65536 UIDs/GIDs to a container, there's
most likely not much value in doing so, as Linux distributions won't use the
higher ranges by default (as mentioned neither `adduser` nor `systemd`'s
dynamic user concept allocate from above the 16bit range). Unless you actively
care for nested containers, it's hence probably a good idea to allocate exactly
65536 UIDs per container, and neither fewer nor more. A pretty side-effect is
that by doing so, you expose the same number of UIDs per container as Linux 2.2
supported for the whole system, back in the day.

3. Consider allocating UID ranges for containers so that the first UID you
assign has the lower 16bits all set to zero. That way, the upper 16bits become
a container ID of some kind, while the lower 16bits directly encode the
internal container UID. This is the way `systemd-nspawn` allocates UID ranges
(see above). Following this allocation logic ensures best compatibility with
`systemd-nspawn` and all other container managers following the scheme, as it
is sufficient then to check NSS for the first UID you pick regarding conflicts,
as that's what they do, too. Moreover, it makes `chown()`ing container file
system trees nicely robust to interruptions: as the external UID encodes the
internal UID in a fixed way, it's very easy to adjust the container's base UID
without the need to know the original base UID: to change the container base,
just mask away the upper 16bit, and insert the upper 16bit of the new container
base instead. Here are the easy conversions to derive the internal UID, the
external UID, and the container base UID from each other:

   ```
   INTERNAL_UID = EXTERNAL_UID & 0x0000FFFF
   CONTAINER_BASE_UID = EXTERNAL_UID & 0xFFFF0000
   EXTERNAL_UID = INTERNAL_UID | CONTAINER_BASE_UID
   ```

4. When picking a UID range for containers, make sure to check NSS first, with
a simple `getpwuid()` call: if there's already a user record for the first UID
you want to pick, then it's already in use: pick a different one. Wrap that
call in a `lckpwdf()` + `ulckpwdf()` pair, to make allocation
race-free. Provide an NSS module that makes all UIDs you end up taking show up
in the user database, and make sure that the NSS module returns up-to-date
information before you release the lock, so that other system components can
safely use the NSS user database as allocation check, too. Note that if you
follow this scheme no changes to `/etc/passwd` need to be made, thus minimizing
the artifacts the container manager persistently leaves in the system.
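Recommendations 3 and 4 can be combined into a small allocation routine. The following is a minimal sketch under stated assumptions, not how `systemd-nspawn` is implemented: the function names are hypothetical, candidates are drawn at random purely for illustration, and the `lckpwdf()` + `ulckpwdf()` locking is only indicated in a comment because the Python standard library has no wrapper for it.

```python
import pwd
import random
from typing import Callable, Optional

CONTAINER_UID_BASE_MIN = 0x00080000
CONTAINER_UID_BASE_MAX = 0x6FFF0000
UIDS_PER_CONTAINER = 0x10000


def uid_in_user_database(uid: int) -> bool:
    """NSS check: True if getpwuid() finds a record for `uid`."""
    try:
        pwd.getpwuid(uid)
    except KeyError:
        return False
    return True


def pick_container_base(
    is_taken: Callable[[int], bool] = uid_in_user_database,
    attempts: int = 100,
) -> Optional[int]:
    """Pick an unused base UID whose lower 16bit are all zero.

    A real container manager would hold the lckpwdf() lock around this
    loop and release it only once its NSS module reports the new range.
    """
    for _ in range(attempts):
        # Stepping by 64K from an aligned minimum keeps every candidate aligned.
        base = random.randrange(
            CONTAINER_UID_BASE_MIN,
            CONTAINER_UID_BASE_MAX + 1,
            UIDS_PER_CONTAINER,
        )
        # Only the first UID of the range is checked, as described above.
        if not is_taken(base):
            return base
    return None
```

Passing `is_taken` as a parameter keeps the NSS lookup replaceable, which makes the selection logic easy to exercise without touching the real user database.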

## Summary

| UID/GID               | Purpose               | Defined By    | Listed in                     |
|-----------------------|-----------------------|---------------|-------------------------------|
| 0                     | `root` user           | Linux         | `/etc/passwd` + `nss-systemd` |
| 1…4                   | System users          | Distributions | `/etc/passwd`                 |
| 5                     | `tty` group           | `systemd`     | `/etc/group`                  |
| 6…999                 | System users          | Distributions | `/etc/passwd`                 |
| 1000…60000            | Regular users         | Distributions | `/etc/passwd` + LDAP/NIS/…    |
| 60001…61183           | Unused                |               |                               |
| 61184…65519           | Dynamic service users | `systemd`     | `nss-systemd`                 |
| 65520…65533           | Unused                |               |                               |
| 65534                 | `nobody` user         | Linux         | `/etc/passwd` + `nss-systemd` |
| 65535                 | 16bit `(uid_t) -1`    | Linux         |                               |
| 65536…524287          | Unused                |               |                               |
| 524288…1879048191     | Container UID ranges  | `systemd`     | `nss-mymachines`              |
| 1879048192…2147483647 | Unused                |               |                               |
| 2147483648…4294967294 | HIC SVNT LEONES       |               |                               |
| 4294967295            | 32bit `(uid_t) -1`    | Linux         |                               |

Note that "Unused" in the table above doesn't mean that these ranges are
really unused. It just means that these ranges have no well-established
pre-defined purposes between Linux, generic low-level distributions and
`systemd`. There might very well be other packages that allocate from these
ranges.

Note that the range 2147483648…4294967294 (i.e. 2^31…2^32-2) should be handled
with care. Various programs (including kernel file systems, see `devpts`) have
trouble with UIDs outside of the signed 32bit range, i.e. any UIDs equal to or
above 2147483648. It is thus strongly recommended to stay away from this range
in order to avoid complications. This range should be considered reserved for
future, special purposes.
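The table can also be read as a lookup structure. The sketch below classifies a UID by the upstream default ranges from the table; the names are hypothetical, the `tty` GID 5 row is folded into the system range since it concerns groups only, and the labels are shortened from the table.

```python
from bisect import bisect_right

# (first UID, last UID, purpose), contiguous and sorted, upstream defaults.
UID_RANGES = [
    (0, 0, "root user"),
    (1, 999, "system users"),
    (1000, 60000, "regular users"),
    (60001, 61183, "unused"),
    (61184, 65519, "dynamic service users"),
    (65520, 65533, "unused"),
    (65534, 65534, "nobody user"),
    (65535, 65535, "16bit (uid_t) -1"),
    (65536, 524287, "unused"),
    (524288, 1879048191, "container UID ranges"),
    (1879048192, 2147483647, "unused"),
    (2147483648, 4294967294, "reserved, stay away"),
    (4294967295, 4294967295, "32bit (uid_t) -1"),
]
_RANGE_STARTS = [first for first, _, _ in UID_RANGES]


def classify_uid(uid: int) -> str:
    """Return the purpose of the range that `uid` falls into."""
    if not 0 <= uid <= 0xFFFFFFFF:
        raise ValueError("UID outside the 32bit range of uid_t")
    # The ranges are contiguous, so the last start <= uid identifies the row.
    index = bisect_right(_RANGE_STARTS, uid) - 1
    return UID_RANGES[index][2]
```

For instance, `classify_uid(61184)` returns `"dynamic service users"` and `classify_uid(65535)` returns `"16bit (uid_t) -1"`.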

## Notes on resolvability of user and group names

User names, UIDs, group names and GIDs don't have to be resolvable using NSS
(i.e. `getpwuid()` and `getpwnam()` and friends) all the time. However,
`systemd` makes the following requirements:

System users generally have to be resolvable during early boot already. This
means they should not be provided by any networked service (as those usually
become available during late boot only), except if a local cache is kept that
makes them available during early boot too (i.e. before networking is
up). Specifically, system users need to be resolvable at least before
`systemd-udevd.service` and `systemd-tmpfiles.service` are started, as both
need to resolve system users. Note, however, that there might be more services
requiring full resolvability of system users than just these two.

Regular users do not need to be resolvable during early boot; it is sufficient
if they become resolvable during late boot. Specifically, regular users need to
be resolvable at the point in time the `nss-user-lookup.target` unit is
reached. This target unit is generally used as synchronization point between
providers of the user database and consumers of it. Services that require that
the user database is fully available (for example, the login service
`systemd-logind.service`) are ordered *after* it, while services that provide
parts of the user database (for example an LDAP user database client) are
ordered *before* it. Note that `nss-user-lookup.target` is a *passive* unit: in
order to minimize synchronization points on systems that don't need it the unit
is pulled into the initial transaction only if there's at least one service
that really needs it, and that means only if there's a service providing the
local user database somehow through IPC or suchlike. Or in other words: if you
hack on some networked user database project, then make sure you order your
service `Before=nss-user-lookup.target` and that you pull it in with
`Wants=nss-user-lookup.target`. However, if you hack on some project that needs
the user database to be up in full, then order your service
`After=nss-user-lookup.target`, but do *not* pull it in via a `Wants=`
dependency.
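As a sketch, the two cases might look like this in unit files. The unit names `example-userdb.service` and `example-consumer.service` are hypothetical, and only the `[Unit]` settings discussed above are shown:

```ini
# example-userdb.service (hypothetical): provides part of the user database
[Unit]
Wants=nss-user-lookup.target
Before=nss-user-lookup.target

# example-consumer.service (hypothetical): needs the full user database
[Unit]
After=nss-user-lookup.target
```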