.\" This manpage is Copyright (C) 1992 Drew Eckhardt;
.\" and Copyright (C) 1993 Michael Haardt, Ian Jackson.
.\" and Copyright (C) 2008 Greg Banks
.\" and Copyright (C) 2006, 2008, 2013, 2014 Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.\" Modified 1993-07-21 by Rik Faith <faith@cs.unc.edu>
.\" Modified 1994-08-21 by Michael Haardt
.\" Modified 1996-04-13 by Andries Brouwer <aeb@cwi.nl>
.\" Modified 1996-05-13 by Thomas Koenig
.\" Modified 1996-12-20 by Michael Haardt
.\" Modified 1999-02-19 by Andries Brouwer <aeb@cwi.nl>
.\" Modified 1998-11-28 by Joseph S. Myers <jsm28@hermes.cam.ac.uk>
.\" Modified 1999-06-03 by Michael Haardt
.\" Modified 2002-05-07 by Michael Kerrisk <mtk.manpages@gmail.com>
.\" Modified 2004-06-23 by Michael Kerrisk <mtk.manpages@gmail.com>
.\" 2004-12-08, mtk, reordered flags list alphabetically
.\" 2004-12-08, Martin Pool <mbp@sourcefrog.net> (& mtk), added O_NOATIME
.\" 2007-09-18, mtk, Added description of O_CLOEXEC + other minor edits
.\" 2008-01-03, mtk, with input from Trond Myklebust
.\" <trond.myklebust@fys.uio.no> and Timo Sirainen <tss@iki.fi>
.\" Rewrite description of O_EXCL.
.\" 2008-01-11, Greg Banks <gnb@melbourne.sgi.com>: add more detail
.\" on O_DIRECT.
.\" 2008-02-26, Michael Haardt: Reorganized text for O_CREAT and mode
.\"
.\" FIXME . Apr 08: The next POSIX revision has O_EXEC, O_SEARCH, and
.\" O_TTYINIT. Eventually these may need to be documented. --mtk
.\"
.TH OPEN 2 2021-08-27 "Linux man-pages (unreleased)"
.SH NAME
open, openat, creat \- open and possibly create a file
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.B #include <fcntl.h>
.PP
.BI "int open(const char *" pathname ", int " flags );
.BI "int open(const char *" pathname ", int " flags ", mode_t " mode );
.PP
.BI "int creat(const char *" pathname ", mode_t " mode );
.PP
.BI "int openat(int " dirfd ", const char *" pathname ", int " flags );
.BI "int openat(int " dirfd ", const char *" pathname ", int " flags \
", mode_t " mode );
.PP
/* Documented separately, in \fBopenat2\fP(2): */
.BI "int openat2(int " dirfd ", const char *" pathname ,
.BI "            const struct open_how *" how ", size_t " size ");"
.fi
.PP
.RS -4
Feature Test Macro Requirements for glibc (see
.BR feature_test_macros (7)):
.RE
.PP
.BR openat ():
.nf
    Since glibc 2.10:
        _POSIX_C_SOURCE >= 200809L
    Before glibc 2.10:
        _ATFILE_SOURCE
.fi
.SH DESCRIPTION
The
.BR open ()
system call opens the file specified by
.IR pathname .
If the specified file does not exist,
it may optionally (if
.B O_CREAT
is specified in
.IR flags )
be created by
.BR open ().
.PP
The return value of
.BR open ()
is a file descriptor, a small, nonnegative integer that is an index
to an entry in the process's table of open file descriptors.
The file descriptor is used
in subsequent system calls
.RB ( read "(2), " write "(2), " lseek "(2), " fcntl (2),
etc.) to refer to the open file.
The file descriptor returned by a successful call will be
the lowest-numbered file descriptor not currently open for the process.
.PP
By default, the new file descriptor is set to remain open across an
.BR execve (2)
(i.e., the
.B FD_CLOEXEC
file descriptor flag described in
.BR fcntl (2)
is initially disabled); the
.B O_CLOEXEC
flag, described below, can be used to change this default.
The file offset is set to the beginning of the file (see
.BR lseek (2)).
.PP
A call to
.BR open ()
creates a new
.IR "open file description" ,
an entry in the system-wide table of open files.
The open file description records the file offset and the file status flags
(see below).
A file descriptor is a reference to an open file description;
this reference is unaffected if
.I pathname
is subsequently removed or modified to refer to a different file.
For further details on open file descriptions, see NOTES.
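.PP
The point that a file descriptor keeps referring to its open file
description after the pathname is removed can be illustrated with a
minimal sketch.
The function name fd_survives_unlink() and the scratch-file name are
assumptions made only for this illustration; they are not part of any API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if data written to a file can still be
   read back through the same descriptor after the pathname has been
   removed, 0 if not, and -1 on a setup error. */
int
fd_survives_unlink(const char *dir)
{
    char path[4096];
    char buf[6] = { 0 };
    int fd, ok;

    assert(dir != NULL);
    snprintf(path, sizeof(path), "%s/open_demo_%ld", dir, (long) getpid());

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    if (write(fd, "hello", 5) != 5 || unlink(path) == -1) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* The name is gone, but the open file description is still there. */
    ok = lseek(fd, 0, SEEK_SET) == 0 &&
         read(fd, buf, 5) == 5 &&
         strcmp(buf, "hello") == 0;

    close(fd);
    return ok;
}
```

A caller would pass a writable directory, for example
fd_survives_unlink("/tmp").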
.PP
The argument
.I flags
must include one of the following
.IR "access modes" :
.BR O_RDONLY ", " O_WRONLY ", or " O_RDWR .
These request opening the file read-only, write-only, or read/write,
respectively.
.PP
In addition, zero or more file creation flags and file status flags
can be
.RI bitwise- or 'd
in
.IR flags .
The
.I file creation flags
are
.BR O_CLOEXEC ,
.BR O_CREAT ,
.BR O_DIRECTORY ,
.BR O_EXCL ,
.BR O_NOCTTY ,
.BR O_NOFOLLOW ,
.BR O_TMPFILE ,
and
.BR O_TRUNC .
The
.I file status flags
are all of the remaining flags listed below.
.\" SUSv4 divides the flags into:
.\" * Access mode
.\" * File creation
.\" * File status
.\" * Other (O_CLOEXEC, O_DIRECTORY, O_NOFOLLOW)
.\" though it's not clear what the difference between "other" and
.\" "File creation" flags is. I raised an Aardvark to see if this
.\" can be clarified in SUSv4; 10 Oct 2008.
.\" http://thread.gmane.org/gmane.comp.standards.posix.austin.general/64/focus=67
.\" TC1 (balloted in 2013), resolved this, so that those three constants
.\" are also categorized as file status flags.
.\"
The distinction between these two groups of flags is that
the file creation flags affect the semantics of the open operation itself,
while the file status flags affect the semantics of subsequent I/O operations.
The file status flags can be retrieved and (in some cases)
modified; see
.BR fcntl (2)
for details.
.PP
The full list of file creation flags and file status flags is as follows:
.TP
.B O_APPEND
The file is opened in append mode.
Before each
.BR write (2),
the file offset is positioned at the end of the file,
as if with
.BR lseek (2).
The modification of the file offset and the write operation
are performed as a single atomic step.
.IP
.B O_APPEND
may lead to corrupted files on NFS filesystems if more than one process
appends data to a file at once.
.\" For more background, see
.\" http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=453946
.\" http://nfs.sourceforge.net/
This is because NFS does not support
appending to a file, so the client kernel has to simulate it, which
can't be done without a race condition.
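.IP
The repositioning performed before each write in append mode can be
observed with a small sketch on a local filesystem.
The helper name append_demo_size() and the scratch-file name are
hypothetical and exist only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: writes "abc", rewinds the file offset to 0, then
   writes "de" through an O_APPEND descriptor.  Returns the final file
   size, or -1 on error.  Without O_APPEND the second write would
   overwrite the first bytes and the size would stay at 3. */
long
append_demo_size(const char *dir)
{
    char path[4096];
    struct stat st;
    long size = -1;
    int fd;

    assert(dir != NULL);
    snprintf(path, sizeof(path), "%s/open_append_%ld", dir, (long) getpid());

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND,
              S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    if (write(fd, "abc", 3) == 3 &&
        lseek(fd, 0, SEEK_SET) == 0 &&  /* the offset is now 0 ... */
        write(fd, "de", 2) == 2 &&      /* ... but this still lands at EOF */
        fstat(fd, &st) == 0)
        size = (long) st.st_size;

    close(fd);
    unlink(path);
    return size;
}
```

With append mode in effect the function is expected to report a size of 5.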
.TP
.B O_ASYNC
Enable signal-driven I/O:
generate a signal
.RB ( SIGIO
by default, but this can be changed via
.BR fcntl (2))
when input or output becomes possible on this file descriptor.
This feature is available only for terminals, pseudoterminals,
sockets, and (since Linux 2.6) pipes and FIFOs.
See
.BR fcntl (2)
for further details.
See also BUGS, below.
.TP
.BR O_CLOEXEC " (since Linux 2.6.23)"
.\" NOTE! several other man pages refer to this text
Enable the close-on-exec flag for the new file descriptor.
.\" FIXME . for later review when Issue 8 is one day released...
.\" POSIX proposes to fix many APIs that provide hidden FDs
.\" http://austingroupbugs.net/tag_view_page.php?tag_id=8
.\" http://austingroupbugs.net/view.php?id=368
Specifying this flag permits a program to avoid additional
.BR fcntl (2)
.B F_SETFD
operations to set the
.B FD_CLOEXEC
flag.
.IP
Note that the use of this flag is essential in some multithreaded programs,
because using a separate
.BR fcntl (2)
.B F_SETFD
operation to set the
.B FD_CLOEXEC
flag does not suffice to avoid race conditions
where one thread opens a file descriptor and
attempts to set its close-on-exec flag using
.BR fcntl (2)
at the same time as another thread does a
.BR fork (2)
plus
.BR execve (2).
Depending on the order of execution,
the race may lead to the file descriptor returned by
.BR open ()
being unintentionally leaked to the program executed by the child process
created by
.BR fork (2).
(This kind of race is in principle possible for any system call
that creates a file descriptor whose close-on-exec flag should be set,
and various other Linux system calls provide an equivalent of the
.B O_CLOEXEC
flag to deal with this problem.)
.\" This flag fixes only one form of the race condition;
.\" The race can also occur with, for example, file descriptors
.\" returned by accept(), pipe(), etc.
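.IP
As a rough sketch of the difference, the following helper opens a file
with caller-supplied extra flags and reports whether
.B FD_CLOEXEC
is already set on the returned descriptor, with no separate
.B F_SETFD
step.
The name opens_with_cloexec() is an assumption of this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens path read-only with the given extra flags
   and returns 1 if the new descriptor has FD_CLOEXEC set, 0 if it does
   not, and -1 on error. */
int
opens_with_cloexec(const char *path, int extra_flags)
{
    int fd, fdflags;

    assert(path != NULL);

    /* With O_CLOEXEC, the flag is set atomically as part of open(). */
    fd = open(path, O_RDONLY | extra_flags);
    if (fd == -1)
        return -1;

    fdflags = fcntl(fd, F_GETFD);
    close(fd);
    if (fdflags == -1)
        return -1;

    return (fdflags & FD_CLOEXEC) != 0;
}
```

Calling it once with
.B O_CLOEXEC
and once with 0 as the extra flags shows the default described earlier:
the flag is off unless requested.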
.TP
.B O_CREAT
If
.I pathname
does not exist, create it as a regular file.
.IP
The owner (user ID) of the new file is set to the effective user ID
of the process.
.IP
The group ownership (group ID) of the new file is set either to
the effective group ID of the process (System V semantics)
or to the group ID of the parent directory (BSD semantics).
On Linux, the behavior depends on whether the
set-group-ID mode bit is set on the parent directory:
if that bit is set, then BSD semantics apply;
otherwise, System V semantics apply.
For some filesystems, the behavior also depends on the
.I bsdgroups
and
.I sysvgroups
mount options described in
.BR mount (8).
.\" As at 2.6.25, bsdgroups is supported by ext2, ext3, ext4, and
.\" XFS (since 2.6.14).
.IP
The
.I mode
argument specifies the file mode bits to be applied when a new file is created.
If neither
.B O_CREAT
nor
.B O_TMPFILE
is specified in
.IR flags ,
then
.I mode
is ignored (and can thus be specified as 0, or simply omitted).
The
.I mode
argument
.B must
be supplied if
.B O_CREAT
or
.B O_TMPFILE
is specified in
.IR flags ;
if it is not supplied,
some arbitrary bytes from the stack will be applied as the file mode.
.IP
The effective mode is modified by the process's
.I umask
in the usual way: in the absence of a default ACL, the mode of the
created file is
.IR "(mode\ &\ \(tiumask)" .
.IP
Note that
.I mode
applies only to future accesses of the
newly created file; the
.BR open ()
call that creates a read-only file may well return a read/write
file descriptor.
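.IP
Both effects, the umask adjustment and the read/write descriptor on a
read-only file, can be seen in one short sketch.
It assumes that the target directory has no default ACL;
the helper name create_readonly_and_write() and the scratch-file name
are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: under a umask of 022, creates a file requesting
   mode 0466 and opens it read/write.  Returns the permission bits of the
   new file if a write through the descriptor succeeded, or -1 on error.
   Assumes no default ACL on dir. */
int
create_readonly_and_write(const char *dir)
{
    char path[4096];
    struct stat st;
    mode_t old_umask;
    int fd, result = -1;

    assert(dir != NULL);
    snprintf(path, sizeof(path), "%s/open_mode_%ld", dir, (long) getpid());
    unlink(path);               /* make sure O_CREAT really creates it */

    old_umask = umask(022);     /* 0466 & ~022 == 0444 */
    fd = open(path, O_RDWR | O_CREAT | O_EXCL,
              S_IRUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);
    umask(old_umask);
    if (fd == -1)
        return -1;

    /* The file is read-only for everyone, yet this descriptor is writable. */
    if (write(fd, "x", 1) == 1 && fstat(fd, &st) == 0)
        result = (int) (st.st_mode & 0777);

    close(fd);
    unlink(path);
    return result;
}
```

The write-permission bits requested for group and others are removed by
the umask, so the expected result is 0444.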
.IP
The following symbolic constants are provided for
.IR mode :
.RS
.TP 9
.B S_IRWXU
00700 user (file owner) has read, write, and execute permission
.TP
.B S_IRUSR
00400 user has read permission
.TP
.B S_IWUSR
00200 user has write permission
.TP
.B S_IXUSR
00100 user has execute permission
.TP
.B S_IRWXG
00070 group has read, write, and execute permission
.TP
.B S_IRGRP
00040 group has read permission
.TP
.B S_IWGRP
00020 group has write permission
.TP
.B S_IXGRP
00010 group has execute permission
.TP
.B S_IRWXO
00007 others have read, write, and execute permission
.TP
.B S_IROTH
00004 others have read permission
.TP
.B S_IWOTH
00002 others have write permission
.TP
.B S_IXOTH
00001 others have execute permission
.RE
.IP
According to POSIX, the effect when other bits are set in
.I mode
is unspecified.
On Linux, the following bits are also honored in
.IR mode :
.RS
.TP 9
.B S_ISUID
0004000 set-user-ID bit
.TP
.B S_ISGID
0002000 set-group-ID bit (see
.BR inode (7)).
.TP
.B S_ISVTX
0001000 sticky bit (see
.BR inode (7)).
.RE
.TP
.BR O_DIRECT " (since Linux 2.4.10)"
Try to minimize cache effects of the I/O to and from this file.
In general this will degrade performance, but it is useful in
special situations, such as when applications do their own caching.
File I/O is done directly to/from user-space buffers.
The
.B O_DIRECT
flag on its own makes an effort to transfer data synchronously,
but does not give the guarantees of the
.B O_SYNC
flag that data and necessary metadata are transferred.
To guarantee synchronous I/O,
.B O_SYNC
must be used in addition to
.BR O_DIRECT .
See NOTES below for further discussion.
.IP
A semantically similar (but deprecated) interface for block devices
is described in
.BR raw (8).
.TP
.B O_DIRECTORY
If \fIpathname\fP is not a directory, cause the open to fail.
.\" But see the following and its replies:
.\" http://marc.theaimsgroup.com/?t=112748702800001&r=1&w=2
.\" [PATCH] open: O_DIRECTORY and O_CREAT together should fail
.\" O_DIRECTORY | O_CREAT causes O_DIRECTORY to be ignored.
This flag was added in kernel version 2.1.126, to
avoid denial-of-service problems if
.BR opendir (3)
is called on a
FIFO or tape device.
.TP
.B O_DSYNC
Write operations on the file will complete according to the requirements of
synchronized I/O
.I data
integrity completion.
.IP
By the time
.BR write (2)
(and similar)
return, the output data
has been transferred to the underlying hardware,
along with any file metadata that would be required to retrieve that data
(i.e., as though each
.BR write (2)
was followed by a call to
.BR fdatasync (2)).
.IR "See NOTES below" .
.TP
.B O_EXCL
Ensure that this call creates the file:
if this flag is specified in conjunction with
.BR O_CREAT ,
and
.I pathname
already exists, then
.BR open ()
fails with the error
.BR EEXIST .
.IP
When these two flags are specified, symbolic links are not followed:
.\" POSIX.1-2001 explicitly requires this behavior.
if
.I pathname
is a symbolic link, then
.BR open ()
fails regardless of where the symbolic link points.
.IP
In general, the behavior of
.B O_EXCL
is undefined if it is used without
.BR O_CREAT .
There is one exception: on Linux 2.6 and later,
.B O_EXCL
can be used without
.B O_CREAT
if
.I pathname
refers to a block device.
If the block device is in use by the system (e.g., mounted),
.BR open ()
fails with the error
.BR EBUSY .
.IP
On NFS,
.B O_EXCL
is supported only when using NFSv3 or later on kernel 2.6 or later.
In NFS environments where
.B O_EXCL
support is not provided, programs that rely on it
for performing locking tasks will contain a race condition.
Portable programs that want to perform atomic file locking using a lockfile,
and need to avoid reliance on NFS support for
.BR O_EXCL ,
can create a unique file on
the same filesystem (e.g., incorporating hostname and PID), and use
.BR link (2)
to make a link to the lockfile.
If
.BR link (2)
returns 0, the lock is successful.
Otherwise, use
.BR stat (2)
on the unique file to check if its link count has increased to 2,
in which case the lock is also successful.
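.IP
The
.BR link (2)
based locking steps above might look roughly like the following sketch.
The function name acquire_lock() and its return convention are
assumptions of this example; building a suitably unique name
(for instance from the hostname and PID) is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: tries to take the lock named by lockfile, using
   a caller-chosen unique file on the same filesystem.  Returns 1 if the
   lock was acquired, 0 if someone else holds it, and -1 on error.
   The lock is released by unlinking lockfile. */
int
acquire_lock(const char *lockfile, const char *unique)
{
    struct stat st;
    int fd, got;

    assert(lockfile != NULL && unique != NULL);

    /* The unique name makes O_EXCL unnecessary for this step. */
    fd = open(unique, O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    close(fd);

    if (link(unique, lockfile) == 0)
        got = 1;                        /* link(2) succeeded: lock held */
    else if (stat(unique, &st) == 0)
        got = (st.st_nlink == 2);       /* link count 2: lock held anyway */
    else
        got = -1;

    unlink(unique);                     /* lockfile keeps the inode alive */
    return got;
}
```

A second caller using a different unique file is expected to get 0 while
the lockfile still exists.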
.TP
.B O_LARGEFILE
(LFS)
Allow files whose sizes cannot be represented in an
.I off_t
(but can be represented in an
.IR off64_t )
to be opened.
The
.B _LARGEFILE64_SOURCE
macro must be defined
(before including
.I any
header files)
in order to obtain this definition.
Setting the
.B _FILE_OFFSET_BITS
feature test macro to 64 (rather than using
.BR O_LARGEFILE )
is the preferred
method of accessing large files on 32-bit systems (see
.BR feature_test_macros (7)).
.TP
.BR O_NOATIME " (since Linux 2.6.8)"
Do not update the file last access time
.RI ( st_atime
in the inode)
when the file is
.BR read (2).
.IP
This flag can be employed only if one of the following conditions is true:
.RS
.IP * 3
The effective UID of the process
.\" Strictly speaking: the filesystem UID
matches the owner UID of the file.
.IP *
The calling process has the
.B CAP_FOWNER
capability in its user namespace and
the owner UID of the file has a mapping in the namespace.
.RE
.IP
This flag is intended for use by indexing or backup programs,
where its use can significantly reduce the amount of disk activity.
This flag may not be effective on all filesystems.
One example is NFS, where the server maintains the access time.
.\" The O_NOATIME flag also affects the treatment of st_atime
.\" by mmap() and readdir(2), MTK, Dec 04.
.TP
.B O_NOCTTY
If
.I pathname
refers to a terminal device\(emsee
.BR tty (4)\(emit
will not become the process's controlling terminal even if the
process does not have one.
.TP
.B O_NOFOLLOW
If the trailing component (i.e., basename) of
.I pathname
is a symbolic link, then the open fails, with the error
.BR ELOOP .
Symbolic links in earlier components of the pathname will still be
followed.
(Note that the
.B ELOOP
error that can occur in this case is indistinguishable from the case where
an open fails because there are too many symbolic links found
while resolving components in the prefix part of the pathname.)
.IP
This flag is a FreeBSD extension, which was added to Linux in version 2.1.126,
and has subsequently been standardized in POSIX.1-2008.
.IP
See also
.B O_PATH
below.
.\" The headers from glibc 2.0.100 and later include a
.\" definition of this flag; \fIkernels before 2.1.126 will ignore it if
.\" used\fP.
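.IP
A minimal sketch of the
.B ELOOP
failure for a trailing symbolic link follows.
The helper name nofollow_errno() and the scratch-file names are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates a regular file and a symbolic link to it
   inside dir, then opens the link with O_NOFOLLOW.  Returns the errno of
   the failed open, 0 if the open unexpectedly succeeded, and -1 on a
   setup error. */
int
nofollow_errno(const char *dir)
{
    char target[4096], linkpath[4096];
    int fd, err;

    assert(dir != NULL);
    snprintf(target, sizeof(target), "%s/open_nofollow_target_%ld",
             dir, (long) getpid());
    snprintf(linkpath, sizeof(linkpath), "%s/open_nofollow_link_%ld",
             dir, (long) getpid());
    unlink(linkpath);
    unlink(target);

    fd = open(target, O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    close(fd);

    if (symlink(target, linkpath) == -1) {
        unlink(target);
        return -1;
    }

    /* The trailing component is a symbolic link, so this should fail. */
    fd = open(linkpath, O_RDONLY | O_NOFOLLOW);
    err = (fd == -1) ? errno : 0;
    if (fd != -1)
        close(fd);

    unlink(linkpath);
    unlink(target);
    return err;
}
```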
.TP
.BR O_NONBLOCK " or " O_NDELAY
When possible, the file is opened in nonblocking mode.
Neither the
.BR open ()
nor any subsequent I/O operations on the file descriptor which is
returned will cause the calling process to wait.
.IP
Note that the setting of this flag has no effect on the operation of
.BR poll (2),
.BR select (2),
.BR epoll (7),
and similar,
since those interfaces merely inform the caller about whether
a file descriptor is "ready",
meaning that an I/O operation performed on
the file descriptor with the
.B O_NONBLOCK
flag
.I clear
would not block.
.IP
Note that this flag has no effect for regular files and block devices;
that is, I/O operations will (briefly) block when device activity
is required, regardless of whether
.B O_NONBLOCK
is set.
Since
.B O_NONBLOCK
semantics might eventually be implemented,
applications should not depend upon blocking behavior
when specifying this flag for regular files and block devices.
.IP
For the handling of FIFOs (named pipes), see also
.BR fifo (7).
For a discussion of the effect of
.B O_NONBLOCK
in conjunction with mandatory file locks and with file leases, see
.BR fcntl (2).
.TP
.BR O_PATH " (since Linux 2.6.39)"
.\" commit 1abf0c718f15a56a0a435588d1b104c7a37dc9bd
.\" commit 326be7b484843988afe57566b627fb7a70beac56
.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
.\"
.\" http://thread.gmane.org/gmane.linux.man/2790/focus=3496
.\" Subject: Re: [PATCH] open(2): document O_PATH
.\" Newsgroups: gmane.linux.man, gmane.linux.kernel
.\"
Obtain a file descriptor that can be used for two purposes:
to indicate a location in the filesystem tree and
to perform operations that act purely at the file descriptor level.
The file itself is not opened, and other file operations (e.g.,
.BR read (2),
.BR write (2),
.BR fchmod (2),
.BR fchown (2),
.BR fgetxattr (2),
.BR ioctl (2),
.BR mmap (2))
fail with the error
.BR EBADF .
.IP
The following operations
.I can
be performed on the resulting file descriptor:
.RS
.IP * 3
.BR close (2).
.IP *
.BR fchdir (2),
if the file descriptor refers to a directory
(since Linux 3.5).
.\" commit 332a2e1244bd08b9e3ecd378028513396a004a24
.IP *
.BR fstat (2)
(since Linux 3.6).
.IP *
.\" fstat(): commit 55815f70147dcfa3ead5738fd56d3574e2e3c1c2
.BR fstatfs (2)
(since Linux 3.12).
.\" fstatfs(): commit 9d05746e7b16d8565dddbe3200faa1e669d23bbf
.IP *
Duplicating the file descriptor
.RB ( dup (2),
.BR fcntl (2)
.BR F_DUPFD ,
etc.).
.IP *
Getting and setting file descriptor flags
.RB ( fcntl (2)
.B F_GETFD
and
.BR F_SETFD ).
.IP *
Retrieving open file status flags using the
.BR fcntl (2)
.B F_GETFL
operation: the returned flags will include the bit
.BR O_PATH .
.IP *
Passing the file descriptor as the
.I dirfd
argument of
.BR openat ()
and the other "*at()" system calls.
This includes
.BR linkat (2)
with
.B AT_EMPTY_PATH
(or via procfs using
.BR AT_SYMLINK_FOLLOW )
even if the file is not a directory.
.IP *
Passing the file descriptor to another process via a UNIX domain socket
(see
.B SCM_RIGHTS
in
.BR unix (7)).
.RE
.IP
When
.B O_PATH
is specified in
.IR flags ,
flag bits other than
.BR O_CLOEXEC ,
.BR O_DIRECTORY ,
and
.B O_NOFOLLOW
are ignored.
.IP
Opening a file or directory with the
.B O_PATH
flag requires no permissions on the object itself
(but does require execute permission on the directories in the path prefix).
Depending on the subsequent operation,
a check for suitable file permissions may be performed (e.g.,
.BR fchdir (2)
requires execute permission on the directory referred to
by its file descriptor argument).
By contrast,
obtaining a reference to a filesystem object by opening it with the
.B O_RDONLY
flag requires that the caller have read permission on the object,
even when the subsequent operation (e.g.,
.BR fchdir (2),
.BR fstat (2))
does not require read permission on the object.
.IP
If
.I pathname
is a symbolic link and the
.B O_NOFOLLOW
flag is also specified,
then the call returns a file descriptor referring to the symbolic link.
This file descriptor can be used as the
.I dirfd
argument in calls to
.BR fchownat (2),
.BR fstatat (2),
.BR linkat (2),
and
.BR readlinkat (2)
with an empty pathname to have the calls operate on the symbolic link.
.IP
If
.I pathname
refers to an automount point that has not yet been triggered, so no
other filesystem is mounted on it, then the call returns a file
descriptor referring to the automount directory without triggering a mount.
.BR fstatfs (2)
can then be used to determine if it is, in fact, an untriggered
automount point
.RB ( ".f_type == AUTOFS_SUPER_MAGIC" ).
.IP
One use of
.B O_PATH
for regular files is to provide the equivalent of POSIX.1's
.B O_EXEC
functionality.
This permits us to open a file for which we have execute
permission but not read permission, and then execute that file,
with steps something like the following:
.IP
.in +4n
.EX
char buf[PATH_MAX];
fd = open("some_prog", O_PATH);
snprintf(buf, PATH_MAX, "/proc/self/fd/%d", fd);
execl(buf, "some_prog", (char *) NULL);
.EE
.in
.IP
An
.B O_PATH
file descriptor can also be passed as the argument of
.BR fexecve (3).
.TP
.B O_SYNC
Write operations on the file will complete according to the requirements of
synchronized I/O
.I file
integrity completion
(by contrast with the
synchronized I/O
.I data
integrity completion
provided by
.BR O_DSYNC .)
.IP
By the time
.BR write (2)
(or similar)
returns, the output data and associated file metadata
have been transferred to the underlying hardware
(i.e., as though each
.BR write (2)
was followed by a call to
.BR fsync (2)).
.IR "See NOTES below" .
.TP
.BR O_TMPFILE " (since Linux 3.11)"
.\" commit 60545d0d4610b02e55f65d141c95b18ccf855b6e
.\" commit f4e0c30c191f87851c4a53454abb55ee276f4a7e
.\" commit bb458c644a59dbba3a1fe59b27106c5e68e1c4bd
Create an unnamed temporary regular file.
The
.I pathname
argument specifies a directory;
an unnamed inode will be created in that directory's filesystem.
Anything written to the resulting file will be lost when
the last file descriptor is closed, unless the file is given a name.
.IP
.B O_TMPFILE
must be specified with one of
.B O_RDWR
or
.B O_WRONLY
and, optionally,
.BR O_EXCL .
If
.B O_EXCL
is not specified, then
.BR linkat (2)
can be used to link the temporary file into the filesystem, making it
permanent, using code like the following:
.IP
.in +4n
.EX
char path[PATH_MAX];
fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
          S_IRUSR | S_IWUSR);

/* File I/O on \(aqfd\(aq... */

linkat(fd, "", AT_FDCWD, "/path/for/file", AT_EMPTY_PATH);

/* If the caller doesn\(aqt have the CAP_DAC_READ_SEARCH
   capability (needed to use AT_EMPTY_PATH with linkat(2)),
   and there is a proc(5) filesystem mounted, then the
   linkat(2) call above can be replaced with:

snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
       AT_SYMLINK_FOLLOW);
*/
.EE
.in
.IP
In this case,
the
.BR open ()
.I mode
argument determines the file permission mode, as with
.BR O_CREAT .
.IP
Specifying
.B O_EXCL
in conjunction with
.B O_TMPFILE
prevents a temporary file from being linked into the filesystem
in the above manner.
(Note that the meaning of
.B O_EXCL
in this case is different from the meaning of
.B O_EXCL
otherwise.)
.IP
There are two main use cases for
.\" Inspired by http://lwn.net/Articles/559147/
.BR O_TMPFILE :
.RS
.IP * 3
Improved
.BR tmpfile (3)
functionality: race-free creation of temporary files that
(1) are automatically deleted when closed;
(2) can never be reached via any pathname;
(3) are not subject to symlink attacks; and
(4) do not require the caller to devise unique names.
.IP *
Creating a file that is initially invisible, which is then populated
with data and adjusted to have appropriate filesystem attributes
.RB ( fchown (2),
.BR fchmod (2),
.BR fsetxattr (2),
etc.)
before being atomically linked into the filesystem
in a fully formed state (using
.BR linkat (2)
as described above).
.RE
.IP
.B O_TMPFILE
requires support by the underlying filesystem;
only a subset of Linux filesystems provide that support.
In the initial implementation, support was provided in
the ext2, ext3, ext4, UDF, Minix, and tmpfs filesystems.
.\" To check for support, grep for "tmpfile" in kernel sources
Support for other filesystems has subsequently been added as follows:
XFS (Linux 3.15);
.\" commit 99b6436bc29e4f10e4388c27a3e4810191cc4788
.\" commit ab29743117f9f4c22ac44c13c1647fb24fb2bafe
Btrfs (Linux 3.16);
.\" commit ef3b9af50bfa6a1f02cd7b3f5124b712b1ba3e3c
F2FS (Linux 3.16);
.\" commit 50732df02eefb39ab414ef655979c2c9b64ad21c
and ubifs (Linux 4.9).
.TP
.B O_TRUNC
If the file already exists and is a regular file and the access mode allows
writing (i.e., is
.B O_RDWR
or
.BR O_WRONLY )
it will be truncated to length 0.
If the file is a FIFO or terminal device file, the
.B O_TRUNC
flag is ignored.
Otherwise, the effect of
.B O_TRUNC
is unspecified.
.SS creat()
A call to
.BR creat ()
is equivalent to calling
.BR open ()
with
.I flags
equal to
.BR O_CREAT|O_WRONLY|O_TRUNC .
.SS openat()
The
.BR openat ()
system call operates in exactly the same way as
.BR open (),
except for the differences described here.
.PP
The
.I dirfd
argument is used in conjunction with the
.I pathname
argument as follows:
.IP * 3
If the pathname given in
.I pathname
is absolute, then
.I dirfd
is ignored.
.IP *
If the pathname given in
.I pathname
is relative and
.I dirfd
is the special value
.BR AT_FDCWD ,
then
.I pathname
is interpreted relative to the current working
directory of the calling process (like
.BR open ()).
.IP *
If the pathname given in
.I pathname
is relative, then it is interpreted relative to the directory
referred to by the file descriptor
.I dirfd
(rather than relative to the current working directory of
the calling process, as is done by
.BR open ()
for a relative pathname).
In this case,
.I dirfd
must be a directory that was opened for reading
.RB ( O_RDONLY )
or using the
.B O_PATH
flag.
.PP
If the pathname given in
.I pathname
is relative, and
.I dirfd
is not a valid file descriptor, an error
.RB ( EBADF )
results.
(Specifying an invalid file descriptor number in
.I dirfd
can be used as a means to ensure that
.I pathname
is absolute.)
4b322a2f 965.\"
a2dbb2e3
AS
966.SS openat2(2)
967The
968.BR openat2 (2)
969system call is an extension of
970.BR openat (),
4b322a2f
MK
971and provides a superset of the features of
972.BR openat ().
aec13430 973It is documented separately, in
4b322a2f 974.BR openat2 (2).
47297adb 975.SH RETURN VALUE
c112329f 976On success,
7b8ba76c
MK
977.BR open (),
978.BR openat (),
c13182ef 979and
e1d6264d 980.BR creat ()
c112329f
MK
981return the new file descriptor (a nonnegative integer).
982On error, \-1 is returned and
fea681da 983.I errno
c112329f 984is set to indicate the error.
fea681da 985.SH ERRORS
7b8ba76c
MK
986.BR open (),
987.BR openat (),
988and
989.BR creat ()
990can fail with the following errors:
fea681da
MK
991.TP
992.B EACCES
993The requested access to the file is not allowed, or search permission
994is denied for one of the directories in the path prefix of
995.IR pathname ,
996or the file did not exist yet and write access to the parent directory
997is not allowed.
998(See also
ad7cc990 999.BR path_resolution (7).)
fea681da 1000.TP
2ddf885a
JS
1001.B EACCES
1002.\" commit 30aba6656f61ed44cba445a3c0d38b296fa9e8f5
1003Where
1004.B O_CREAT
d9e7db1b
MK
1005is specified, the
1006.I protected_fifos
1007or
510adbed 1008.I protected_regular
d9e7db1b 1009sysctl is enabled, the file already exists and is a FIFO or regular file, the
2ddf885a
JS
1010owner of the file is neither the current user nor the owner of the
1011containing directory, and the containing directory is both world- or
1012group-writable and sticky.
d9e7db1b 1013For details, see the descriptions of
1ae6b2c7 1014.I /proc/sys/fs/protected_fifos
d9e7db1b 1015and
1ae6b2c7 1016.I /proc/sys/fs/protected_regular
d9e7db1b
MK
1017in
1018.BR proc (5).
2ddf885a 1019.TP
90879cbd
MK
1020.B EBADF
1021.RB ( openat ())
1022.I pathname
1023is relative but
1024.I dirfd
1025is neither
1026.B AT_FDCWD
1027nor a valid file descriptor.
1028.TP
836a5bbf
MK
1029.B EBUSY
1030.B O_EXCL
1031was specified in
1032.I flags
1033and
1034.I pathname
1035refers to a block device that is in use by the system (e.g., it is mounted).
1036.TP
a1f01685
MH
1037.B EDQUOT
1038Where
1039.B O_CREAT
1040is specified, the file does not exist, and the user's quota of disk
9ee4a2b6 1041blocks or inodes on the filesystem has been exhausted.
a1f01685 1042.TP
fea681da
MK
1043.B EEXIST
1044.I pathname
1045already exists and
1046.BR O_CREAT " and " O_EXCL
1047were used.
1048.TP
1049.B EFAULT
0daa9e92 1050.I pathname
e1d6264d 1051points outside your accessible address space.
fea681da 1052.TP
9f5773f7 1053.B EFBIG
7c7fb552
MK
1054See
1055.BR EOVERFLOW .
9f5773f7 1056.TP
e51412ea
MK
1057.B EINTR
1058While blocked waiting to complete an open of a slow device
1059(e.g., a FIFO; see
1060.BR fifo (7)),
1061the call was interrupted by a signal handler; see
1062.BR signal (7).
1063.TP
ef490193
DG
1064.B EINVAL
1065The filesystem does not support the
1ae6b2c7 1066.B O_DIRECT
e6f89ed2
MK
1067flag.
1068See
1ae6b2c7 1069.B NOTES
ef490193
DG
1070for more information.
1071.TP
8e335391
MK
1072.B EINVAL
1073Invalid value in
1074.\" In particular, __O_TMPFILE instead of O_TMPFILE
1075.IR flags .
1076.TP
1077.B EINVAL
1078.B O_TMPFILE
1079was specified in
1080.IR flags ,
1081but neither
1082.B O_WRONLY
1083nor
1084.B O_RDWR
1085was specified.
1086.TP
5c6f8de0
MK
1087.B EINVAL
1088.B O_CREAT
1089was specified in
1090.I flags
1091and the final component ("basename") of the new file's
1092.I pathname
1093is invalid
1094(e.g., it contains characters not permitted by the underlying filesystem).
ed6fe005 1095.TP
ed6fe005
MK
1096.B EINVAL
1097The final component ("basename") of
1098.I pathname
1099is invalid
1100(e.g., it contains characters not permitted by the underlying filesystem).
5c6f8de0 1101.TP
fea681da
MK
1102.B EISDIR
1103.I pathname
1104refers to a directory and the access requested involved writing
1105(that is,
1106.B O_WRONLY
1107or
1108.B O_RDWR
1109is set).
1110.TP
8e335391 1111.B EISDIR
843068bd
MK
1112.I pathname
1113refers to an existing directory,
8e335391
MK
1114.B O_TMPFILE
1115and one of
1116.B O_WRONLY
1117or
1118.B O_RDWR
1119were specified in
1120.IR flags ,
1121but this kernel version does not provide the
1122.B O_TMPFILE
1123functionality.
1124.TP
fea681da
MK
1125.B ELOOP
1126Too many symbolic links were encountered in resolving
289f7907
MK
1127.IR pathname .
1128.TP
1129.B ELOOP
fea681da 1130.I pathname
289f7907
MK
1131was a symbolic link, and
1132.I flags
1133specified
1ae6b2c7 1134.B O_NOFOLLOW
289f7907
MK
1135but not
1136.BR O_PATH .
fea681da
MK
1137.TP
1138.B EMFILE
26c32fab 1139The per-process limit on the number of open file descriptors has been reached
12c21590 1140(see the description of
1ae6b2c7 1141.B RLIMIT_NOFILE
12c21590
MK
1142in
1143.BR getrlimit (2)).
fea681da
MK
1144.TP
1145.B ENAMETOOLONG
0daa9e92 1146.I pathname
e1d6264d 1147was too long.
fea681da
MK
1148.TP
1149.B ENFILE
e258766b 1150The system-wide limit on the total number of open files has been reached.
fea681da
MK
1151.TP
1152.B ENODEV
1153.I pathname
1154refers to a device special file and no corresponding device exists.
682edefb
MK
1155(This is a Linux kernel bug; in this situation
1156.B ENXIO
1157must be returned.)
fea681da
MK
1158.TP
1159.B ENOENT
682edefb
MK
1160.B O_CREAT
1161is not set and the named file does not exist.
115bbafa
MK
1162.TP
1163.B ENOENT
1164A directory component in
fea681da
MK
1165.I pathname
1166does not exist or is a dangling symbolic link.
1167.TP
ba03011f
MK
1168.B ENOENT
1169.I pathname
1170refers to a nonexistent directory,
1171.B O_TMPFILE
1172and one of
1173.B O_WRONLY
1174or
1175.B O_RDWR
1176were specified in
1177.IR flags ,
1178but this kernel version does not provide the
1179.B O_TMPFILE
1180functionality.
1181.TP
fea681da 1182.B ENOMEM
8ef529f9
MK
1183The named file is a FIFO,
1184but memory for the FIFO buffer can't be allocated because
1185the per-user hard limit on memory allocation for pipes has been reached
1186and the caller is not privileged; see
1187.BR pipe (7).
1188.TP
1189.B ENOMEM
fea681da
MK
1190Insufficient kernel memory was available.
1191.TP
1192.B ENOSPC
1193.I pathname
1194was to be created but the device containing
1195.I pathname
1196has no room for the new file.
1197.TP
1198.B ENOTDIR
1199A component used as a directory in
1200.I pathname
a8d55537 1201is not, in fact, a directory, or \fBO_DIRECTORY\fP was specified and
fea681da
MK
1202.I pathname
1203was not a directory.
1204.TP
90879cbd
MK
1205.B ENOTDIR
1206.RB ( openat ())
1207.I pathname
1208is a relative pathname and
1209.I dirfd
1210is a file descriptor referring to a file other than a directory.
1211.TP
fea681da 1212.B ENXIO
682edefb 1213.BR O_NONBLOCK " | " O_WRONLY
103ea4f6
MK
1214is set, the named file is a FIFO, and
1215no process has the FIFO open for reading.
7b032b23
MK
1216.TP
1217.B ENXIO
1218The file is a device special file and no corresponding device exists.
fea681da 1219.TP
71b12d0a 1220.B ENXIO
8b5bbcfa 1221The file is a UNIX domain socket.
71b12d0a 1222.TP
1ae6b2c7 1223.B EOPNOTSUPP
bbe02b45
MK
1224The filesystem containing
1225.I pathname
1226does not support
1227.BR O_TMPFILE .
1228.TP
7c7fb552
MK
1229.B EOVERFLOW
1230.I pathname
1231refers to a regular file that is too large to be opened.
1232The usual scenario here is that an application compiled
1233on a 32-bit platform without
2c1acf16 1234.I \-D_FILE_OFFSET_BITS=64
7c7fb552 1235tried to open a file whose size exceeds
cd415e73 1236.I (1<<31)\-1
4e1a4d72 1237bytes;
7c7fb552
MK
1238see also
1239.B O_LARGEFILE
1240above.
c84d3aa3 1241This is the error specified by POSIX.1;
7c7fb552
MK
1242in kernels before 2.6.24, Linux gave the error
1243.B EFBIG
1244for this case.
1245.\" See http://bugzilla.kernel.org/show_bug.cgi?id=7253
1246.\" "Open of a large file on 32-bit fails with EFBIG, should be EOVERFLOW"
1247.\" Reported 2006-10-03
1248.TP
1c1e15ed
MK
1249.B EPERM
1250The
1251.B O_NOATIME
1252flag was specified, but the effective user ID of the caller
9ee4a2b6 1253.\" Strictly speaking, it's the filesystem UID... (MTK)
47c906e5 1254did not match the owner of the file and the caller was not privileged.
1c1e15ed 1255.TP
fbab10e5
MK
1256.B EPERM
1257The operation was prevented by a file seal; see
1258.BR fcntl (2).
1259.TP
fea681da
MK
1260.B EROFS
1261.I pathname
9ee4a2b6 1262refers to a file on a read-only filesystem and write access was
fea681da
MK
1263requested.
1264.TP
1265.B ETXTBSY
1266.I pathname
1267refers to an executable image which is currently being executed and
1268write access was requested.
d3952311 1269.TP
19d37126
JH
1270.B ETXTBSY
1271.I pathname
1272refers to a file that is currently in use as a swap file, and the
1273.B O_TRUNC
1274flag was specified.
1275.TP
1276.B ETXTBSY
1277.I pathname
0629df8b 1278refers to a file that is currently being read by the kernel (e.g., for
19d37126
JH
1279module/firmware loading), and write access was requested.
1280.TP
d3952311
MK
1281.B EWOULDBLOCK
1282The
1283.B O_NONBLOCK
1284flag was specified, and an incompatible lease was held on the file
1285(see
1286.BR fcntl (2)).
7b8ba76c
MK
1287.SH VERSIONS
1288.BR openat ()
1289was added to Linux in kernel 2.6.16;
1290library support was added to glibc in version 2.4.
3113c7f3 1291.SH STANDARDS
7b8ba76c
MK
1292.BR open (),
1293.BR creat ():
72ac7268 1294SVr4, 4.3BSD, POSIX.1-2001, POSIX.1-2008.
5355ff82 1295.PP
7b8ba76c
MK
1296.BR openat ():
1297POSIX.1-2008.
5355ff82 1298.PP
a2dbb2e3
AS
1299.BR openat2 (2)
1300is Linux-specific.
1301.PP
fea681da 1302The
72ac7268 1303.BR O_DIRECT ,
1c1e15ed 1304.BR O_NOATIME ,
72ac7268 1305.BR O_PATH ,
fea681da 1306and
1ae6b2c7 1307.B O_TMPFILE
72ac7268
MK
1308flags are Linux-specific.
1309One must define
61b7c1e1
MK
1310.B _GNU_SOURCE
1311to obtain their definitions.
5355ff82 1312.PP
9f91e36c 1313The
72ac7268
MK
1314.BR O_CLOEXEC ,
1315.BR O_DIRECTORY ,
1316and
1ae6b2c7 1317.B O_NOFOLLOW
72ac7268
MK
1318flags are not specified in POSIX.1-2001,
1319but are specified in POSIX.1-2008.
1320Since glibc 2.12, one can obtain their definitions by defining either
1321.B _POSIX_C_SOURCE
1322with a value greater than or equal to 200809L or
1ae6b2c7 1323.B _XOPEN_SOURCE
72ac7268
MK
1324with a value greater than or equal to 700.
1325In glibc 2.11 and earlier, one obtains the definitions by defining
1326.BR _GNU_SOURCE .
5355ff82 1327.PP
72ac7268
MK
1328As noted in
1329.BR feature_test_macros (7),
84fc2a6e 1330feature test macros such as
72ac7268
MK
1331.BR _POSIX_C_SOURCE ,
1332.BR _XOPEN_SOURCE ,
1333and
fe75ec04 1334.B _GNU_SOURCE
72ac7268 1335must be defined before including
e417acb0 1336.I any
72ac7268 1337header files.
a1d5f77c 1338.SH NOTES
988db661 1339Under Linux, the
a1d5f77c 1340.B O_NONBLOCK
3897a3f8 1341flag is sometimes used in cases where one wants to open a file
a1d5f77c 1342but does not necessarily intend to read or write.
3897a3f8
MK
1343For example,
1344this may be used to open a device in order to get a file descriptor
a1d5f77c
MK
1345for use with
1346.BR ioctl (2).
dd3568a1 1347.PP
fea681da
MK
1348The (undefined) effect of
1349.B O_RDONLY | O_TRUNC
c13182ef 1350varies among implementations.
bcdd964e 1351On many systems the file is actually truncated.
fea681da
MK
1352.\" Linux 2.0, 2.5: truncate
1353.\" Solaris 5.7, 5.8: truncate
1354.\" Irix 6.5: truncate
1355.\" Tru64 5.1B: truncate
1356.\" HP-UX 11.22: truncate
1357.\" FreeBSD 4.7: truncate
5355ff82 1358.PP
5dc8986d
MK
1359Note that
1360.BR open ()
1361can open device special files, but
1362.BR creat ()
1363cannot create them; use
1364.BR mknod (2)
1365instead.
5355ff82 1366.PP
5dc8986d
MK
1367If the file is newly created, its
1368.IR st_atime ,
1369.IR st_ctime ,
1370.I st_mtime
1371fields
1372(respectively, time of last access, time of last status change, and
1373time of last modification; see
1374.BR stat (2))
1375are set
1376to the current time, and so are the
1377.I st_ctime
1378and
1379.I st_mtime
1380fields of the
1381parent directory.
1382Otherwise, if the file is modified because of the
1383.B O_TRUNC
3a9c5a29
MK
1384flag, its
1385.I st_ctime
1386and
1387.I st_mtime
1388fields are set to the current time.
5355ff82 1389.PP
aaf7a574
MK
1390The files in the
1391.I /proc/[pid]/fd
1392directory show the open file descriptors of the process with the PID
1393.IR pid .
1394The files in the
1395.I /proc/[pid]/fdinfo
d40e0bfc 1396directory show even more information about these file descriptors.
aaf7a574
MK
1397See
1398.BR proc (5)
1399for further details of both of these directories.
8132c115 1400.PP
319e9b31 1401The Linux header file
8132c115
ES
1402.B <asm/fcntl.h>
1403doesn't define
1404.BR O_ASYNC ;
319e9b31 1405the (BSD-derived)
8132c115 1406.B FASYNC
319e9b31 1407synonym is defined instead.
5dc8986d
MK
1408.\"
1409.\"
d20d9d33
MK
1410.SS Open file descriptions
1411The term open file description is the one used by POSIX to refer to the
1412entries in the system-wide table of open files.
91085d85 1413In other contexts, this object is
d20d9d33
MK
1414variously also called an "open file object",
1415a "file handle", an "open file table entry",
1416or\(emin kernel-developer parlance\(ema
1417.IR "struct file" .
5355ff82 1418.PP
d20d9d33
MK
1419When a file descriptor is duplicated (using
1420.BR dup (2)
1421or similar),
1422the duplicate refers to the same open file description
1423as the original file descriptor,
1424and the two file descriptors consequently share
1425the file offset and file status flags.
1426Such sharing can also occur between processes:
1427a child process created via
91085d85 1428.BR fork (2)
d20d9d33
MK
1429inherits duplicates of its parent's file descriptors,
1430and those duplicates refer to the same open file descriptions.
5355ff82 1431.PP
d20d9d33 1432Each
bf7bc8b8 1433.BR open ()
d20d9d33
MK
1434of a file creates a new open file description;
1435thus, there may be multiple open file descriptions
1436corresponding to a file inode.
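.PP
The sharing described above can be observed through the file offset.
The following sketch uses a hypothetical helper, shares_offset(),
which is a cruder probe than the kcmp(2) operation and assumes
that both descriptors refer to a seekable file.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if moving the offset through fd_a is
   visible through fd_b (they share one open file description),
   0 if it is not, and -1 on error.
   The offset of fd_a is restored before returning. */
int
shares_offset(int fd_a, int fd_b)
{
    assert(fd_a >= 0 && fd_b >= 0);

    off_t a0 = lseek(fd_a, 0, SEEK_CUR);
    off_t b0 = lseek(fd_b, 0, SEEK_CUR);
    if (a0 == (off_t) -1 || b0 == (off_t) -1)
        return -1;

    /* Advance fd_a by one byte and check whether fd_b follows. */
    if (lseek(fd_a, 1, SEEK_CUR) == (off_t) -1)
        return -1;
    off_t b1 = lseek(fd_b, 0, SEEK_CUR);

    if (lseek(fd_a, a0, SEEK_SET) == (off_t) -1 || b1 == (off_t) -1)
        return -1;

    return b1 != b0;
}
```

For a descriptor obtained with dup(2) the result is 1;
for a second, independent open() of the same file it is 0.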
5355ff82 1437.PP
9539ebc9
MK
1438On Linux, one can use the
1439.BR kcmp (2)
1440.B KCMP_FILE
1441operation to test whether two file descriptors
1442(in the same process or in two different processes)
1443refer to the same open file description.
d20d9d33
MK
1444.\"
1445.\"
5dc8986d 1446.SS Synchronized I/O
6cf19e62
MK
1447The POSIX.1-2008 "synchronized I/O" option
1448specifies different variants of synchronized I/O,
1449and specifies the
1450.BR open ()
1451flags
015221ef
CH
1452.BR O_SYNC ,
1453.BR O_DSYNC ,
1454and
1ae6b2c7 1455.B O_RSYNC
6cf19e62
MK
1456for controlling the behavior.
1457Regardless of whether an implementation supports this option,
1458it must at least support the use of
1ae6b2c7 1459.B O_SYNC
6cf19e62 1460for regular files.
5355ff82 1461.PP
89851a00 1462Linux implements
1ae6b2c7 1463.B O_SYNC
6cf19e62
MK
1464and
1465.BR O_DSYNC ,
1466but not
015221ef 1467.BR O_RSYNC .
352c4c5c 1468Somewhat incorrectly, glibc defines
1ae6b2c7 1469.B O_RSYNC
6cf19e62 1470to have the same value as
352c4c5c
MK
1471.BR O_SYNC .
1472.RB ( O_RSYNC
1473is defined in the Linux header file
1474.I <asm/fcntl.h>
1475on HP PA-RISC, but it is not used.)
5355ff82 1476.PP
1ae6b2c7 1477.B O_SYNC
6cf19e62
MK
1478provides synchronized I/O
1479.I file
1480integrity completion,
1481meaning write operations will flush data and all associated metadata
1482to the underlying hardware.
1ae6b2c7 1483.B O_DSYNC
6cf19e62
MK
1484provides synchronized I/O
1485.I data
1486integrity completion,
1487meaning write operations will flush data
1488to the underlying hardware,
1489but will only flush metadata updates that are required
1490to allow a subsequent read operation to complete successfully.
1491Data integrity completion can reduce the number of disk operations
1492that are required for applications that don't need the guarantees
1493of file integrity completion.
5355ff82 1494.PP
a83923ca 1495To understand the difference between the two types of completion,
6cf19e62
MK
1496consider two pieces of file metadata:
1497the file last modification timestamp
1498.RI ( st_mtime )
1499and the file length.
1500All write operations will update the last file modification timestamp,
1501but only writes that add data to the end of the
1502file will change the file length.
1503The last modification timestamp is not needed to ensure that
1504a read completes successfully, but the file length is.
1505Thus,
1ae6b2c7 1506.B O_DSYNC
6cf19e62
MK
1507would only guarantee to flush updates to the file length metadata
1508(whereas
1ae6b2c7 1509.B O_SYNC
6cf19e62 1510would also always flush the last modification timestamp metadata).
5355ff82 1511.PP
6cf19e62 1512Before Linux 2.6.33, Linux implemented only the
1ae6b2c7 1513.B O_SYNC
89851a00 1514flag for
6cf19e62
MK
1515.BR open ().
1516However, when that flag was specified,
1517most filesystems actually provided the equivalent of synchronized I/O
1518.I data
1519integrity completion (i.e.,
1ae6b2c7 1520.B O_SYNC
6cf19e62
MK
1521was actually implemented as the equivalent of
1522.BR O_DSYNC ).
5355ff82 1523.PP
6cf19e62 1524Since Linux 2.6.33, proper
1ae6b2c7 1525.B O_SYNC
6cf19e62
MK
1526support is provided.
1527However, to ensure backward binary compatibility,
1ae6b2c7 1528.B O_DSYNC
6cf19e62 1529was defined with the same value as the historical
015221ef 1530.BR O_SYNC ,
015221ef 1531and
1ae6b2c7 1532.B O_SYNC
89851a00 1533was defined as a new (two-bit) flag value that includes the
1ae6b2c7 1534.B O_DSYNC
6cf19e62
MK
1535flag value.
1536This ensures that applications compiled against
1537new headers get at least
1ae6b2c7 1538.B O_DSYNC
6cf19e62 1539semantics on pre-2.6.33 kernels.
5dc8986d 1540.\"
76f054b1
MK
1541.SS C library/kernel differences
1542Since version 2.26,
1543the glibc wrapper function for
1544.BR open ()
1545employs the
1546.BR openat ()
1547system call, rather than the kernel's
1548.BR open ()
1549system call.
1550For certain architectures, this is also true in glibc versions before 2.26.
5dc8986d
MK
1551.\"
1552.SS NFS
1553There are many infelicities in the protocol underlying NFS, affecting
1554amongst others
1555.BR O_SYNC " and " O_NDELAY .
5355ff82 1556.PP
9ee4a2b6 1557On NFS filesystems with UID mapping enabled,
a1d5f77c
MK
1558.BR open ()
1559may
75b94dc3 1560return a file descriptor but, for example,
a1d5f77c
MK
1561.BR read (2)
1562requests are denied
1ae6b2c7
AC
1563with
1564.BR EACCES .
a1d5f77c
MK
1565This is because the client performs
1566.BR open ()
1567by checking the
1568permissions, but UID mapping is performed by the server upon
1569read and write requests.
5dc8986d
MK
1570.\"
1571.\"
1bdc161d
MK
1572.SS FIFOs
1573Opening the read or write end of a FIFO blocks until the other
1574end is also opened (by another process or thread).
1575See
1576.BR fifo (7)
1577for further details.
1578.\"
1579.\"
5dc8986d
MK
1580.SS File access mode
1581Unlike the other values that can be specified in
1582.IR flags ,
1583the
1584.I "access mode"
1585values
1586.BR O_RDONLY ", " O_WRONLY ", and " O_RDWR
1587do not specify individual bits.
1588Rather, they define the low order two bits of
1589.IR flags ,
1590and are defined respectively as 0, 1, and 2.
1591In other words, the combination
1592.B "O_RDONLY | O_WRONLY"
1593is a logical error, and certainly does not have the same meaning as
1594.BR O_RDWR .
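.PP
Because the access mode is a small integer rather than a set of bits,
it is normally extracted with the O_ACCMODE mask and compared for equality.
A minimal sketch, with hypothetical helper names:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns the access mode of an open file
   descriptor (O_RDONLY, O_WRONLY, or O_RDWR), or -1 on error. */
int
access_mode(int fd)
{
    assert(fd >= 0);

    int fl = fcntl(fd, F_GETFL);
    if (fl == -1)
        return -1;
    return fl & O_ACCMODE;      /* mask first, then compare */
}

/* Hypothetical helper: nonzero if fd was opened for writing. */
int
can_write(int fd)
{
    int m = access_mode(fd);
    return m == O_WRONLY || m == O_RDWR;
}
```

Testing (flags & O_RDONLY) would always yield 0, since O_RDONLY is 0;
this is why the comparison is made against the masked value.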
5355ff82 1595.PP
5dc8986d
MK
1596Linux reserves the special, nonstandard access mode 3 (binary 11) in
1597.I flags
1598to mean:
d9cb0d7d 1599check for read and write permission on the file and return a file descriptor
5dc8986d
MK
1600that can't be used for reading or writing.
1601This nonstandard access mode is used by some Linux drivers to return a
d9cb0d7d 1602file descriptor that is to be used only for device-specific
5dc8986d
MK
1603.BR ioctl (2)
1604operations.
1605.\" See for example util-linux's disk-utils/setfdprm.c
1606.\" For some background on access mode 3, see
1607.\" http://thread.gmane.org/gmane.linux.kernel/653123
1608.\" "[RFC] correct flags to f_mode conversion in __dentry_open"
1609.\" LKML, 12 Mar 2008
7b8ba76c
MK
1610.\"
1611.\"
80d250b4 1612.SS Rationale for openat() and other "directory file descriptor" APIs
7b8ba76c 1613.BR openat ()
80d250b4
MK
1614and the other system calls and library functions that take
1615a directory file descriptor argument
7b8ba76c 1616(i.e.,
c6a16783 1617.BR execveat (2),
7b8ba76c 1618.BR faccessat (2),
80d250b4 1619.BR fanotify_mark (2),
7b8ba76c
MK
1620.BR fchmodat (2),
1621.BR fchownat (2),
5c30e7cd 1622.BR fspick (2),
7b8ba76c
MK
1623.BR fstatat (2),
1624.BR futimesat (2),
1625.BR linkat (2),
1626.BR mkdirat (2),
1627.BR mknodat (2),
d53b1b17 1628.BR mount_setattr (2),
0a5c96db 1629.BR move_mount (2),
80d250b4 1630.BR name_to_handle_at (2),
5c30e7cd 1631.BR open_tree (2),
e64c566c 1632.BR openat2 (2),
7b8ba76c
MK
1633.BR readlinkat (2),
1634.BR renameat (2),
0a5c96db 1635.BR renameat2 (2),
3f092cef 1636.BR statx (2),
7b8ba76c
MK
1637.BR symlinkat (2),
1638.BR unlinkat (2),
f37759b1 1639.BR utimensat (2),
80d250b4 1640.BR mkfifoat (3),
7b8ba76c 1641and
80d250b4 1642.BR scandirat (3))
a98e0304 1643address two problems with the older interfaces that preceded them.
92692952 1644Here, the explanation is in terms of the
7b8ba76c 1645.BR openat ()
d26f8a31 1646call, but the rationale is analogous for the other interfaces.
5355ff82 1647.PP
7b8ba76c
MK
1648First,
1649.BR openat ()
1650allows an application to avoid race conditions that could
1651occur when using
cadd38ba 1652.BR open ()
7b8ba76c
MK
1653to open files in directories other than the current working directory.
1654These race conditions result from the fact that some component
1655of the directory prefix given to
cadd38ba 1656.BR open ()
7b8ba76c 1657could be changed in parallel with the call to
cadd38ba 1658.BR open ().
54305f5b 1659Suppose, for example, that we wish to create the file
a710e359 1660.I dir1/dir2/xxx.dep
54305f5b 1661if the file
a710e359 1662.I dir1/dir2/xxx
54305f5b 1663exists.
069d2f9a 1664The problem is that between the existence check and the file-creation step,
a710e359 1665.I dir1
54305f5b 1666or
a710e359 1667.I dir2
54305f5b
MK
1668(which might be symbolic links)
1669could be modified to point to a different location.
7b8ba76c
MK
1670Such races can be avoided by
1671opening a file descriptor for the target directory,
1672and then specifying that file descriptor as the
1673.I dirfd
54305f5b
MK
1674argument of (say)
1675.BR fstatat (2)
1676and
7b8ba76c 1677.BR openat ().
941d2892
MK
1678The use of the
1679.I dirfd
1680file descriptor also has other benefits:
1681.IP * 3
1682the file descriptor is a stable reference to the directory,
1683even if the directory is renamed; and
1684.IP *
1685the open file descriptor prevents the underlying filesystem from
1686being dismounted,
1687just as when a process has a current working directory on a filesystem.
1688.PP
7b8ba76c
MK
1689Second,
1690.BR openat ()
1691allows the implementation of a per-thread "current working
1692directory", via file descriptor(s) maintained by the application.
1693(This functionality can also be obtained by tricks based
1694on the use of
1695.IR /proc/self/fd/ dirfd,
1696but less efficiently.)
96c44b8f
MK
1697.PP
1698The
1699.I dirfd
1700argument for these APIs can be obtained by using
1701.BR open ()
1702or
1703.BR openat ()
1704to open a directory (with either the
1ae6b2c7 1705.B O_RDONLY
96c44b8f 1706or the
1ae6b2c7 1707.B O_PATH
96c44b8f
MK
1708flag).
1709Alternatively, such a file descriptor can be obtained by applying
1710.BR dirfd (3)
1711to a directory stream created using
1712.BR opendir (3).
4146f81b
MK
1713.PP
1714When these APIs are given a
1715.I dirfd
1716argument of
1ae6b2c7 1717.B AT_FDCWD
4146f81b 1718or the specified pathname is absolute,
313fb527 1719then they handle their pathname argument in the same way as
4146f81b
MK
1720the corresponding conventional APIs.
1721However, in this case, several of the APIs have a
1722.I flags
1723argument that provides access to functionality that is not available with
1724the corresponding conventional APIs.
7b8ba76c
MK
1725.\"
1726.\"
ddc4d339 1727.SS O_DIRECT
ddc4d339
MK
1728The
1729.B O_DIRECT
1730flag may impose alignment restrictions on the length and address
7fac88a9 1731of user-space buffers and the file offset of I/Os.
ddc4d339 1732In Linux, alignment
9ee4a2b6 1733restrictions vary by filesystem and kernel version and might be
ddc4d339 1734absent entirely.
9ee4a2b6 1735However, there is currently no filesystem\-independent
ddc4d339 1736interface for an application to discover these restrictions for a given
9ee4a2b6
MK
1737file or filesystem.
1738Some filesystems provide their own interfaces
ddc4d339
MK
1739for doing so, for example the
1740.B XFS_IOC_DIOINFO
1741operation in
1742.BR xfsctl (3).
dd3568a1 1743.PP
36dce687 1744Under Linux 2.4, transfer sizes, the alignment of the user buffer,
85c2bdba 1745and the file offset must all be multiples of the logical block size
9ee4a2b6 1746of the filesystem.
21557928 1747Since Linux 2.6.0, alignment to the logical block size of the
e6042e4a 1748underlying storage (typically 512 bytes) suffices.
21557928 1749The logical block size can be determined using the
e6042e4a
PS
1750.BR ioctl (2)
1751.B BLKSSZGET
21557928 1752operation or from the shell using the command:
5355ff82 1753.PP
da16ac09 1754.in +4n
5355ff82 1755.EX
da16ac09 1756blockdev \-\-getss
5355ff82 1757.EE
da16ac09 1758.in
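.PP
Once the logical block size is known, a buffer satisfying the alignment
rules above can be allocated as in the following sketch.
The helper names round_up() and alloc_aligned() are hypothetical,
and the caller is assumed to pass a block size that is a power of two
(for example, the value reported by BLKSSZGET).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round len up to the next multiple of align.
   align must be a nonzero power of two. */
size_t
round_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/* Hypothetical helper: allocate a buffer whose address and length are
   both multiples of align. The padded length is stored in *padded_len.
   Returns NULL on failure; release the buffer with free(). */
void *
alloc_aligned(size_t len, size_t align, size_t *padded_len)
{
    assert(padded_len != NULL);

    void *buf = NULL;
    size_t n = round_up(len, align);

    if (posix_memalign(&buf, align, n) != 0)
        return NULL;

    *padded_len = n;
    return buf;
}
```

The file offset used for each transfer has to be rounded to the same
multiple; that part is not shown here.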
5355ff82 1759.PP
1847167b
NP
1760.B O_DIRECT
1761I/Os should never be run concurrently with the
04cd7f64 1762.BR fork (2)
1847167b
NP
1763system call,
1764if the memory buffer is a private mapping
1765(i.e., any mapping created with the
02ace852 1766.BR mmap (2)
1ae6b2c7 1767.B MAP_PRIVATE
0ab8aeec 1768flag;
1847167b
NP
1769this includes memory allocated on the heap and statically allocated buffers).
1770Any such I/Os, whether submitted via an asynchronous I/O interface or from
1771another thread in the process,
1772should be completed before
1773.BR fork (2)
1774is called.
1775Failure to do so can result in data corruption and undefined behavior in
1776parent and child processes.
1777This restriction does not apply when the memory buffer for the
1778.B O_DIRECT
1779I/Os was created using
1780.BR shmat (2)
1781or
1782.BR mmap (2)
1783with the
1784.B MAP_SHARED
1785flag.
1786Nor does this restriction apply when the memory buffer has been advised as
1787.B MADV_DONTFORK
0ab8aeec 1788with
02ace852 1789.BR madvise (2),
1847167b
NP
1790ensuring that it will not be available
1791to the child after
1792.BR fork (2).
dd3568a1 1793.PP
ddc4d339
MK
1794The
1795.B O_DIRECT
1796flag was introduced in SGI IRIX, where it has alignment
1797restrictions similar to those of Linux 2.4.
1798IRIX also has a
1799.BR fcntl (2)
1800call to query appropriate alignments and sizes.
1801FreeBSD 4.x introduced
1802a flag of the same name, but without alignment restrictions.
dd3568a1 1803.PP
ddc4d339
MK
1804.B O_DIRECT
1805support was added under Linux in kernel version 2.4.10.
1806Older Linux kernels simply ignore this flag.
fedb2ff5 1807Some filesystems may not implement the flag, in which case
ddc4d339 1808.BR open ()
9e4be7e9 1809fails with the error
ddc4d339
MK
1810.B EINVAL
1811if it is used.
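.PP
An application that prefers direct I/O but can work without it might
handle the EINVAL case as in the following sketch.
The helper name open_direct_or_buffered() is hypothetical,
and the sketch assumes that flags does not include O_CREAT
(no mode argument is passed).

```c
#define _GNU_SOURCE             /* exposes O_DIRECT; must precede all #includes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open path with O_DIRECT and fall back
   to ordinary buffered I/O if the filesystem rejects the flag.
   On success, *is_direct tells the caller which mode was obtained.
   Returns a file descriptor, or -1 with errno set. */
int
open_direct_or_buffered(const char *path, int flags, int *is_direct)
{
    assert(path != NULL && is_direct != NULL);

    int fd = open(path, flags | O_DIRECT);
    if (fd != -1) {
        *is_direct = 1;
        return fd;
    }

    /* EINVAL can also indicate other invalid flags; the retry
       below fails again in that case. */
    if (errno != EINVAL)
        return -1;

    *is_direct = 0;
    return open(path, flags);
}
```

When *is_direct is 1, the caller remains responsible for the alignment
restrictions described earlier in this section.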
dd3568a1 1812.PP
ddc4d339
MK
1813Applications should avoid mixing
1814.B O_DIRECT
1815and normal I/O to the same file,
1816and especially to overlapping byte regions in the same file.
9ee4a2b6 1817Even when the filesystem correctly handles the coherency issues in
ddc4d339
MK
1818this situation, overall I/O throughput is likely to be slower than
1819using either mode alone.
1820Likewise, applications should avoid mixing
1821.BR mmap (2)
1822of files with direct I/O to the same files.
dd3568a1 1823.PP
a1fa36af 1824The behavior of
ddc4d339 1825.B O_DIRECT
9ee4a2b6 1826with NFS will differ from local filesystems.
ddc4d339
MK
1827Older kernels, or
1828kernels configured in certain ways, may not support this combination.
1829The NFS protocol does not support passing the flag to the server, so
1830.B O_DIRECT
33a0ccb2 1831I/O will bypass the page cache only on the client; the server may
ddc4d339
MK
1832still cache the I/O.
1833The client asks the server to make the I/O
1834synchronous to preserve the synchronous semantics of
1835.BR O_DIRECT .
1836Some servers will perform poorly under these circumstances, especially
1837if the I/O size is small.
1838Some servers may also be configured to
1839lie to clients about the I/O having reached stable storage; this
1840will avoid the performance penalty at some risk to data integrity
1841in the event of server power failure.
1842The Linux NFS client places no alignment restrictions on
1843.B O_DIRECT
1844I/O.
1845.PP
1846In summary,
1847.B O_DIRECT
1848is a potentially powerful tool that should be used with caution.
1849It is recommended that applications treat use of
1850.B O_DIRECT
1851as a performance option which is disabled by default.
ddc4d339 1852.SH BUGS
b50582eb
MK
1853Currently, it is not possible to enable signal-driven
1854I/O by specifying
1855.B O_ASYNC
c13182ef 1856when calling
b50582eb
MK
1857.BR open ();
1858use
1859.BR fcntl (2)
1860to enable this flag.
0e1ad98c 1861.\" FIXME . Check bugzilla report on open(O_ASYNC)
92057f4d 1862.\" See http://bugzilla.kernel.org/show_bug.cgi?id=5993
5355ff82 1863.PP
0d730fcc
MK
1864One must check for two different error codes,
1865.B EISDIR
1866and
1867.BR ENOENT ,
1868when trying to determine whether the kernel supports
0d55b37f 1869.B O_TMPFILE
0d730fcc 1870functionality.
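.PP
A probe along these lines is sketched below.
The helper name tmpfile_supported() is hypothetical;
the sketch also treats EOPNOTSUPP (the filesystem lacks support)
as "not supported", and it assumes that dir names a directory known to
exist, since otherwise ENOENT is ambiguous.

```c
#define _GNU_SOURCE             /* exposes O_TMPFILE; must precede all #includes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if an unnamed temporary file could be
   created in dir, 0 if O_TMPFILE is not supported there, and -1 with
   errno set for any other error. */
int
tmpfile_supported(const char *dir)
{
    assert(dir != NULL);

    int fd = open(dir, O_TMPFILE | O_RDWR, 0600);
    if (fd != -1) {
        close(fd);              /* the unnamed file is discarded */
        return 1;
    }

    switch (errno) {
    case EISDIR:                /* kernel without O_TMPFILE, dir exists */
    case ENOENT:                /* kernel without O_TMPFILE, dir missing */
    case EOPNOTSUPP:            /* filesystem without O_TMPFILE support */
        return 0;
    default:
        return -1;
    }
}
```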
5355ff82 1871.PP
320f8a8e
MK
1872When both
1873.B O_CREAT
1874and
1875.B O_DIRECTORY
1876are specified in
1ae6b2c7 1877.I flags
320f8a8e
MK
1878and the file specified by
1879.I pathname
1880does not exist,
1881.BR open ()
1882will create a regular file (i.e.,
1883.B O_DIRECTORY
1884is ignored).
47297adb 1885.SH SEE ALSO
a3bf8022
MK
1886.BR chmod (2),
1887.BR chown (2),
fea681da 1888.BR close (2),
e366dbc4 1889.BR dup (2),
fea681da
MK
1890.BR fcntl (2),
1891.BR link (2),
1f6ceb40 1892.BR lseek (2),
fea681da 1893.BR mknod (2),
e366dbc4 1894.BR mmap (2),
f0c34053 1895.BR mount (2),
fa5d243f 1896.BR open_by_handle_at (2),
c8fb1c6d 1897.BR openat2 (2),
fea681da
MK
1898.BR read (2),
1899.BR socket (2),
1900.BR stat (2),
1901.BR umask (2),
1902.BR unlink (2),
1903.BR write (2),
1904.BR fopen (3),
b31056e3 1905.BR acl (5),
f0c34053 1906.BR fifo (7),
3b363b62 1907.BR inode (7),
a9cfde1d
MK
1908.BR path_resolution (7),
1909.BR symlink (7)