.\" Copyright (C) 2019 Aleksa Sarai <cyphar@cyphar.com>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.TH OPENAT2 2 2021-03-22 "Linux" "Linux Programmer's Manual"
.SH NAME
openat2 \- open and possibly create a file (extended)
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.BR "#include <fcntl.h>" \
"          /* Definition of " O_* " and " S_* " constants */"
.BR "#include <linux/openat2.h>" "  /* Definition of " RESOLVE_* " constants */"
.BR "#include <sys/syscall.h>" "    /* Definition of " SYS_* " constants */"
.B #include <unistd.h>
.PP
.BI "long syscall(SYS_openat2, int " dirfd ", const char *" pathname ,
.BI "             struct open_how *" how ", size_t " size );
.fi
.PP
.IR Note :
glibc provides no wrapper for
.BR openat2 (),
necessitating the use of
.BR syscall (2).
.SH DESCRIPTION
The
.BR openat2 ()
system call is an extension of
.BR openat (2)
and provides a superset of its functionality.
.PP
The
.BR openat2 ()
system call opens the file specified by
.IR pathname .
If the specified file does not exist, it may optionally (if
.B O_CREAT
is specified in
.IR how.flags )
be created.
.PP
As with
.BR openat (2),
if
.I pathname
is a relative pathname, then it is interpreted relative to the
directory referred to by the file descriptor
.I dirfd
(or the current working directory of the calling process, if
.I dirfd
is the special value
.BR AT_FDCWD ).
If
.I pathname
is an absolute pathname, then
.I dirfd
is ignored (unless
.I how.resolve
contains
.BR RESOLVE_IN_ROOT ,
in which case
.I pathname
is resolved relative to
.IR dirfd ).
.PP
Rather than taking a single
.I flags
argument, an extensible structure (\fIhow\fP) is passed to allow for
future extensions.
The
.I size
argument must be specified as
.IR "sizeof(struct open_how)" .
.\"
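.PP
As a minimal sketch of the calling convention described above
(not a definitive implementation),
the following fragment wraps the raw system call.
The helper names my_openat2() and open_readonly() are hypothetical
and exist only for illustration.

```c
#define _GNU_SOURCE
#include <fcntl.h>          /* O_* and AT_* constants */
#include <linux/openat2.h>  /* struct open_how, RESOLVE_* constants */
#include <string.h>
#include <sys/syscall.h>    /* SYS_openat2 */
#include <unistd.h>

/* Hypothetical helper: glibc provides no openat2() wrapper, so the
   system call is invoked through syscall(2).  Returns a file
   descriptor on success, or -1 with errno set. */
static int
my_openat2(int dirfd, const char *pathname,
           const struct open_how *how, size_t size)
{
    return (int) syscall(SYS_openat2, dirfd, pathname, how, size);
}

/* Hypothetical helper: open pathname read-only relative to dirfd,
   without any RESOLVE_* restrictions. */
static int
open_readonly(int dirfd, const char *pathname)
{
    struct open_how how;

    memset(&how, 0, sizeof(how));       /* zero-fill before use */
    how.flags = O_RDONLY | O_CLOEXEC;

    return my_openat2(dirfd, pathname, &how, sizeof(how));
}
```

.PP
A caller might use open_readonly(AT_FDCWD, "some/file")
and check for a return value of \-1 in the usual way.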
.SS The open_how structure
The
.I how
argument specifies how
.I pathname
should be opened, and acts as a superset of the
.IR flags
and
.IR mode
arguments to
.BR openat (2).
This argument is a pointer to a structure of the following form:
.PP
.in +4n
.EX
struct open_how {
    u64 flags;    /* O_* flags */
    u64 mode;     /* Mode for O_{CREAT,TMPFILE} */
    u64 resolve;  /* RESOLVE_* flags */
    /* ... */
};
.EE
.in
.PP
Any future extensions to
.BR openat2 ()
will be implemented as new fields appended to the above structure,
with a zero value in a new field resulting in the kernel behaving
as though that extension field was not present.
Therefore, the caller
.I must
zero-fill this structure on
initialization.
(See the "Extensibility" section of the
.B NOTES
for more detail on why this is necessary.)
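.PP
The following sketch shows one way a caller might fill in the structure
when creating a file.
The helper name create_private_file() and the choice of mode 0600
are assumptions made for illustration;
the designated initializer leaves every unnamed field zero.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/openat2.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: create (or open) a file for writing with
   permissions 0600, relative to dirfd.  Returns a file descriptor,
   or -1 with errno set. */
static int
create_private_file(int dirfd, const char *pathname)
{
    struct open_how how = {
        .flags = O_WRONLY | O_CREAT | O_CLOEXEC,
        .mode = 0600,   /* nonzero mode requires O_CREAT or O_TMPFILE */
        /* .resolve and any future fields are implicitly zero */
    };

    return (int) syscall(SYS_openat2, dirfd, pathname,
                         &how, sizeof(how));
}
```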
.PP
The fields of the
.I open_how
structure are as follows:
.TP
.I flags
This field specifies
the file creation and file status flags to use when opening the file.
All of the
.B O_*
flags defined for
.BR openat (2)
are valid
.BR openat2 ()
flag values.
.IP
Whereas
.BR openat (2)
ignores unknown bits in its
.I flags
argument,
.BR openat2 ()
returns an error if unknown or conflicting flags are specified in
.IR how.flags .
.TP
.I mode
This field specifies the
mode for the new file, with identical semantics to the
.I mode
argument of
.BR openat (2).
.IP
Whereas
.BR openat (2)
ignores bits other than those in the range
.I 07777
in its
.I mode
argument,
.BR openat2 ()
returns an error if
.I how.mode
contains bits other than
.IR 07777 .
Similarly, an error is returned if
.BR openat2 ()
is called with a nonzero
.IR how.mode
and
.IR how.flags
does not contain
.BR O_CREAT
or
.BR O_TMPFILE .
.TP
.I resolve
This is a bit-mask of flags that modify the way in which
.B all
components of
.I pathname
will be resolved.
(See
.BR path_resolution (7)
for background information.)
.IP
The primary use case for these flags is to allow trusted programs to restrict
how untrusted paths (or paths inside untrusted directories) are resolved.
The full list of
.I resolve
flags is as follows:
.RS
.TP
.B RESOLVE_BENEATH
.\" commit adb21d2b526f7f196b2f3fdca97d80ba05dd14a0
Do not permit the path resolution to succeed if any component of the resolution
is not a descendant of the directory indicated by
.IR dirfd .
This causes absolute symbolic links (and absolute values of
.IR pathname )
to be rejected.
.IP
Currently, this flag also disables magic-link resolution (see below).
However, this may change in the future.
Therefore, to ensure that magic links are not resolved,
the caller should explicitly specify
.BR RESOLVE_NO_MAGICLINKS .
.TP
.B RESOLVE_IN_ROOT
.\" commit 8db52c7e7ee1bd861b6096fcafc0fe7d0f24a994
Treat the directory referred to by
.I dirfd
as the root directory while resolving
.IR pathname .
Absolute symbolic links are interpreted relative to
.IR dirfd .
If a prefix component of
.I pathname
equates to
.IR dirfd ,
then an immediately following
.IR ..\&
component likewise equates to
.IR dirfd
(just as
.I /..\&
is traditionally equivalent to
.IR / ).
If
.I pathname
is an absolute path, it is also interpreted relative to
.IR dirfd .
.IP
The effect of this flag is as though the calling process had used
.BR chroot (2)
to (temporarily) modify its root directory (to the directory
referred to by
.IR dirfd ).
However, unlike
.BR chroot (2)
(which changes the filesystem root permanently for a process),
.B RESOLVE_IN_ROOT
allows a program to efficiently restrict path resolution on a per-open basis.
.IP
Currently, this flag also disables magic-link resolution.
However, this may change in the future.
Therefore, to ensure that magic links are not resolved,
the caller should explicitly specify
.BR RESOLVE_NO_MAGICLINKS .
.TP
.B RESOLVE_NO_MAGICLINKS
.\" commit 278121417a72d87fb29dd8c48801f80821e8f75a
Disallow all magic-link resolution during path resolution.
.IP
Magic links are symbolic link-like objects that are most notably found in
.BR proc (5);
examples include
.IR /proc/[pid]/exe
and
.IR /proc/[pid]/fd/* .
(See
.BR symlink (7)
for more details.)
.IP
Unknowingly opening magic links can be risky for some applications.
Examples of such risks include the following:
.RS
.IP \(bu 2
If the process opening a pathname is a controlling process that
currently has no controlling terminal (see
.BR credentials (7)),
then opening a magic link inside
.IR /proc/[pid]/fd
that happens to refer to a terminal
would cause the process to acquire a controlling terminal.
.IP \(bu
.\" From https://lwn.net/Articles/796868/:
.\" The presence of this flag will prevent a path lookup operation
.\" from traversing through one of these magic links, thus blocking
.\" (for example) attempts to escape from a container via a /proc
.\" entry for an open file descriptor.
In a containerized environment,
a magic link inside
.I /proc
may refer to an object outside the container,
and thus may provide a means to escape from the container.
.RE
.IP
Because of such risks,
an application may prefer to disable magic-link resolution using the
.BR RESOLVE_NO_MAGICLINKS
flag.
.IP
If the trailing component (i.e., basename) of
.I pathname
is a magic link,
.I how.resolve
contains
.BR RESOLVE_NO_MAGICLINKS ,
and
.I how.flags
contains both
.BR O_PATH
and
.BR O_NOFOLLOW ,
then an
.B O_PATH
file descriptor referencing the magic link will be returned.
.TP
.B RESOLVE_NO_SYMLINKS
.\" commit 278121417a72d87fb29dd8c48801f80821e8f75a
Disallow resolution of symbolic links during path resolution.
This option implies
.BR RESOLVE_NO_MAGICLINKS .
.IP
If the trailing component (i.e., basename) of
.I pathname
is a symbolic link,
.I how.resolve
contains
.BR RESOLVE_NO_SYMLINKS ,
and
.I how.flags
contains both
.BR O_PATH
and
.BR O_NOFOLLOW ,
then an
.B O_PATH
file descriptor referencing the symbolic link will be returned.
.IP
Note that the effect of the
.BR RESOLVE_NO_SYMLINKS
flag,
which affects the treatment of symbolic links in all of the components of
.IR pathname ,
differs from the effect of the
.BR O_NOFOLLOW
file creation flag (in
.IR how.flags ),
which affects the handling of symbolic links only in the final component of
.IR pathname .
.IP
Applications that employ the
.BR RESOLVE_NO_SYMLINKS
flag are encouraged to make its use configurable
(unless it is used for a specific security purpose),
as symbolic links are very widely used by end-users.
Setting this flag indiscriminately\(emi.e.,
for purposes not specifically related to security\(emfor all uses of
.BR openat2 ()
may result in spurious errors on previously functional systems.
This may occur if, for example,
a system pathname that is used by an application is modified
(e.g., in a new distribution release)
so that a pathname component (now) contains a symbolic link.
.TP
.B RESOLVE_NO_XDEV
.\" commit 72ba29297e1439efaa54d9125b866ae9d15df339
Disallow traversal of mount points during path resolution (including all bind
mounts).
Consequently,
.I pathname
must either be on the same mount as the directory referred to by
.IR dirfd ,
or on the same mount as the current working directory if
.I dirfd
is specified as
.BR AT_FDCWD .
.IP
Applications that employ the
.B RESOLVE_NO_XDEV
flag are encouraged to make its use configurable (unless it is
used for a specific security purpose),
as bind mounts are widely used by end-users.
Setting this flag indiscriminately\(emi.e.,
for purposes not specifically related to security\(emfor all uses of
.BR openat2 ()
may result in spurious errors on previously functional systems.
This may occur if, for example,
a system pathname that is used by an application is modified
(e.g., in a new distribution release)
so that a pathname component (now) contains a bind mount.
.TP
.B RESOLVE_CACHED
Make the open operation fail unless all path components are already present
in the kernel's lookup cache.
If any kind of revalidation or I/O is needed to satisfy the lookup,
.BR openat2 ()
fails with the error
.BR EAGAIN .
This is useful in providing a fast-path open that can be performed without
resorting to thread offload, or other mechanisms that an application might
use to offload slower operations.
.RE
.IP
If any bits other than those listed above are set in
.IR how.resolve ,
an error is returned.
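.IP
As an illustration of the primary use case mentioned above,
the following sketch opens an untrusted relative path so that
resolution cannot leave the directory referred to by
.IR dirfd .
The helper name open_beneath() is hypothetical, and the particular
combination of flags is only one reasonable choice.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/openat2.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: open an untrusted path for reading such that
   no component of the resolution may escape the directory referred
   to by dirfd.  Returns a file descriptor, or -1 with errno set. */
static int
open_beneath(int dirfd, const char *untrusted_path)
{
    struct open_how how = {
        .flags = O_RDONLY | O_CLOEXEC,
        /* RESOLVE_NO_MAGICLINKS is named explicitly rather than
           relying on RESOLVE_BENEATH currently implying it. */
        .resolve = RESOLVE_BENEATH | RESOLVE_NO_MAGICLINKS,
    };

    return (int) syscall(SYS_openat2, dirfd, untrusted_path,
                         &how, sizeof(how));
}
```

.IP
With these flags, an absolute
.I untrusted_path
is rejected rather than being resolved outside
.IR dirfd .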
.SH RETURN VALUE
On success, a new file descriptor is returned.
On error, \-1 is returned, and
.I errno
is set to indicate the error.
.SH ERRORS
The set of errors returned by
.BR openat2 ()
includes all of the errors returned by
.BR openat (2),
as well as the following additional errors:
.TP
.B E2BIG
An extension that this kernel does not support was specified in
.IR how .
(See the "Extensibility" section of
.B NOTES
for more detail on how extensions are handled.)
.TP
.B EAGAIN
.I how.resolve
contains either
.BR RESOLVE_IN_ROOT
or
.BR RESOLVE_BENEATH ,
and the kernel could not ensure that a ".." component didn't escape (due to a
race condition or potential attack).
The caller may choose to retry the
.BR openat2 ()
call.
.TP
.B EAGAIN
.BR RESOLVE_CACHED
was set, and the open operation cannot be performed using only cached
information.
The caller should retry without
.B RESOLVE_CACHED
set in
.IR how.resolve .
.TP
.B EINVAL
An unknown flag or invalid value was specified in
.IR how .
.TP
.B EINVAL
.I how.mode
is nonzero, but
.I how.flags
does not contain
.BR O_CREAT
or
.BR O_TMPFILE .
.TP
.B EINVAL
.I size
was smaller than any known version of
.IR "struct open_how" .
.TP
.B ELOOP
.I how.resolve
contains
.BR RESOLVE_NO_SYMLINKS ,
and one of the path components was a symbolic link (or magic link).
.TP
.B ELOOP
.I how.resolve
contains
.BR RESOLVE_NO_MAGICLINKS ,
and one of the path components was a magic link.
.TP
.B EXDEV
.I how.resolve
contains either
.BR RESOLVE_IN_ROOT
or
.BR RESOLVE_BENEATH ,
and an escape from the root during path resolution was detected.
.TP
.B EXDEV
.I how.resolve
contains
.BR RESOLVE_NO_XDEV ,
and a path component crosses a mount point.
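.PP
The
.B EAGAIN
handling described above for
.B RESOLVE_CACHED
might be sketched as follows.
The helper name open_cached_then_full() is hypothetical,
and the fallback definition of
.B RESOLVE_CACHED
is an assumption for older kernel headers that predate the flag.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef RESOLVE_CACHED
#define RESOLVE_CACHED 0x20   /* assumed value, for older headers */
#endif

/* Hypothetical helper: try a cache-only lookup first, then fall back
   to an ordinary lookup if the kernel reports EAGAIN.  Returns a
   file descriptor, or -1 with errno set. */
static int
open_cached_then_full(int dirfd, const char *pathname)
{
    struct open_how how = {
        .flags = O_RDONLY | O_CLOEXEC,
        .resolve = RESOLVE_CACHED,
    };
    int fd;

    fd = (int) syscall(SYS_openat2, dirfd, pathname, &how, sizeof(how));
    if (fd == -1 && errno == EAGAIN) {
        /* The lookup needed revalidation or I/O: retry without
           RESOLVE_CACHED.  A real application might hand this
           slower call to a worker thread instead. */
        how.resolve &= ~RESOLVE_CACHED;
        fd = (int) syscall(SYS_openat2, dirfd, pathname,
                           &how, sizeof(how));
    }
    return fd;
}
```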
.SH VERSIONS
.BR openat2 ()
first appeared in Linux 5.6.
.\" commit fddb5d430ad9fa91b49b1d34d0202ffe2fa0e179
.SH CONFORMING TO
This system call is Linux-specific.
.PP
The semantics of
.B RESOLVE_BENEATH
were modeled after FreeBSD's
.BR O_BENEATH .
.SH NOTES
.SS Extensibility
In order to allow for future extensibility,
.BR openat2 ()
requires the user-space application to specify the size of the
.I open_how
structure that it is passing.
By providing this information, it is possible for
.BR openat2 ()
to provide both forwards- and backwards-compatibility, with
.I size
acting as an implicit version number.
(Because new extension fields will always
be appended, the structure size will always increase.)
This extensibility design is very similar to other system calls such as
.BR sched_setattr (2),
.BR perf_event_open (2),
and
.BR clone3 (2).
.PP
If we let
.I usize
be the size of the structure as specified by the user-space application, and
.I ksize
be the size of the structure which the kernel supports, then there are
three cases to consider:
.IP \(bu 2
If
.IR ksize
equals
.IR usize ,
then there is no version mismatch and
.I how
can be used verbatim.
.IP \(bu
If
.IR ksize
is larger than
.IR usize ,
then there are some extension fields that the kernel supports
which the user-space application
is unaware of.
Because a zero value in any added extension field signifies a no-op,
the kernel
treats all of the extension fields not provided by the user-space application
as having zero values.
This provides backwards-compatibility.
.IP \(bu
If
.IR ksize
is smaller than
.IR usize ,
then there are some extension fields which the user-space application
is aware of but which the kernel does not support.
Because any extension field must have its zero values signify a no-op,
the kernel can
safely ignore the unsupported extension fields if they are all-zero.
If any unsupported extension fields are nonzero, then \-1 is returned and
.I errno
is set to
.BR E2BIG .
This provides forwards-compatibility.
.PP
Because the definition of
.I struct open_how
may change in the future (with new fields being added when system headers are
updated), user-space applications should zero-fill
.I struct open_how
to ensure that recompiling the program with new headers will not result in
spurious errors at runtime.
The simplest way is to use a designated
initializer:
.PP
.in +4n
.EX
struct open_how how = { .flags = O_RDWR,
                        .resolve = RESOLVE_IN_ROOT };
.EE
.in
.PP
or explicitly using
.BR memset (3)
or similar:
.PP
.in +4n
.EX
struct open_how how;
memset(&how, 0, sizeof(how));
how.flags = O_RDWR;
how.resolve = RESOLVE_IN_ROOT;
.EE
.in
.PP
A user-space application that wishes to determine which extensions
the running kernel supports can do so by conducting a binary search on
.IR size
with a structure which has every byte nonzero (to find the largest value
which doesn't produce an error of
.BR E2BIG ).
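.PP
The binary search described above might look roughly like the following.
This is a sketch under stated assumptions, not a definitive probe:
the helper names, the search bounds
.I PROBE_MIN
and
.IR PROBE_MAX ,
and the choice to search in 8-byte steps are all illustrative.
It also assumes that a size the kernel does accept fails with
.B EINVAL
(because the known fields then contain invalid values),
and it gives up on any other error.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed bounds for this sketch, in bytes: 24 is the size of the
   three-field structure shown earlier; 4096 is an arbitrary cap. */
#define PROBE_MIN 24
#define PROBE_MAX 4096

/* Returns 1 if a size-byte structure with every byte nonzero fails
   with E2BIG, 0 if it fails with EINVAL (size understood, contents
   invalid), and -1 on anything else (e.g., ENOSYS). */
static int
size_is_too_big(size_t size)
{
    unsigned char buf[PROBE_MAX];
    long ret;

    memset(buf, 0xff, sizeof(buf));
    ret = syscall(SYS_openat2, AT_FDCWD, ".", buf, size);
    if (ret >= 0) {             /* not expected with garbage flags */
        close((int) ret);
        return -1;
    }
    if (errno == E2BIG)
        return 1;
    return (errno == EINVAL) ? 0 : -1;
}

/* Returns the largest probed size (a multiple of 8) that does not
   produce E2BIG, or -1 if the probe could not be carried out. */
static long
probe_open_how_size(void)
{
    size_t lo = PROBE_MIN / 8;  /* known-good candidate, 8-byte units */
    size_t hi = PROBE_MAX / 8;  /* largest candidate considered */

    if (size_is_too_big(lo * 8) != 0)
        return -1;

    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;    /* upper midpoint */
        int r = size_is_too_big(mid * 8);

        if (r < 0)
            return -1;
        if (r)
            hi = mid - 1;       /* mid is beyond what the kernel knows */
        else
            lo = mid;           /* mid is still understood */
    }
    return (long) (lo * 8);
}
```

.PP
The value returned by probe_open_how_size() corresponds to
.I ksize
in the discussion above, provided that it lies below the assumed cap.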
.SH SEE ALSO
.BR openat (2),
.BR path_resolution (7),
.BR symlink (7)