Commit | Line | Data |
---|---|---|
fea681da | 1 | .\" This manpage is Copyright (C) 1992 Drew Eckhardt; |
fd185f58 MK |
2 | .\" and Copyright (C) 1993 Michael Haardt, Ian Jackson. |
3 | .\" and Copyright (C) 2008 Greg Banks | |
7b8ba76c | 4 | .\" and Copyright (C) 2006, 2008, 2013, 2014 Michael Kerrisk <mtk.manpages@gmail.com> |
fea681da | 5 | .\" |
5fbde956 | 6 | .\" SPDX-License-Identifier: Linux-man-pages-copyleft |
fea681da MK |
7 | .\" |
8 | .\" Modified 1993-07-21 by Rik Faith <faith@cs.unc.edu> | |
9 | .\" Modified 1994-08-21 by Michael Haardt | |
10 | .\" Modified 1996-04-13 by Andries Brouwer <aeb@cwi.nl> | |
11 | .\" Modified 1996-05-13 by Thomas Koenig | |
12 | .\" Modified 1996-12-20 by Michael Haardt | |
13 | .\" Modified 1999-02-19 by Andries Brouwer <aeb@cwi.nl> | |
14 | .\" Modified 1998-11-28 by Joseph S. Myers <jsm28@hermes.cam.ac.uk> | |
15 | .\" Modified 1999-06-03 by Michael Haardt | |
c11b1abf MK |
16 | .\" Modified 2002-05-07 by Michael Kerrisk <mtk.manpages@gmail.com> |
17 | .\" Modified 2004-06-23 by Michael Kerrisk <mtk.manpages@gmail.com> | |
1c1e15ed MK |
18 | .\" 2004-12-08, mtk, reordered flags list alphabetically |
19 | .\" 2004-12-08, Martin Pool <mbp@sourcefrog.net> (& mtk), added O_NOATIME | |
fe75ec04 | 20 | .\" 2007-09-18, mtk, Added description of O_CLOEXEC + other minor edits |
447bb15e | 21 | .\" 2008-01-03, mtk, with input from Trond Myklebust |
f4b9d6a5 MK |
22 | .\" <trond.myklebust@fys.uio.no> and Timo Sirainen <tss@iki.fi> |
23 | .\" Rewrite description of O_EXCL. | |
ddc4d339 MK |
24 | .\" 2008-01-11, Greg Banks <gnb@melbourne.sgi.com>: add more detail |
25 | .\" on O_DIRECT. | |
d77eb764 | 26 | .\" 2008-02-26, Michael Haardt: Reorganized text for O_CREAT and mode |
fea681da | 27 | .\" |
61b7c1e1 | 28 | .\" FIXME . Apr 08: The next POSIX revision has O_EXEC, O_SEARCH, and |
9f91e36c MK |
29 | .\" O_TTYINIT. Eventually these may need to be documented. --mtk |
30 | .\" | |
45186a5d | 31 | .TH OPEN 2 2021-08-27 "Linux man-pages (unreleased)" |
fea681da | 32 | .SH NAME |
7b8ba76c | 33 | open, openat, creat \- open and possibly create a file |
d554739d AC |
34 | .SH LIBRARY |
35 | Standard C library | |
8fc3b2cf | 36 | .RI ( libc ", " \-lc ) |
fea681da MK |
37 | .SH SYNOPSIS |
38 | .nf | |
fea681da | 39 | .B #include <fcntl.h> |
5355ff82 | 40 | .PP |
fea681da MK |
41 | .BI "int open(const char *" pathname ", int " flags ); |
42 | .BI "int open(const char *" pathname ", int " flags ", mode_t " mode ); | |
5355ff82 | 43 | .PP |
fea681da | 44 | .BI "int creat(const char *" pathname ", mode_t " mode ); |
5355ff82 | 45 | .PP |
7b8ba76c MK |
46 | .BI "int openat(int " dirfd ", const char *" pathname ", int " flags ); |
47 | .BI "int openat(int " dirfd ", const char *" pathname ", int " flags \ | |
48 | ", mode_t " mode ); | |
a2dbb2e3 | 49 | .PP |
4b322a2f MK |
50 | /* Documented separately, in \fBopenat2\fP(2): */ |
51 | .BI "int openat2(int " dirfd ", const char *" pathname , | |
9bfc9cb1 | 52 | .BI " const struct open_how *" how ", size_t " size ");" |
fea681da | 53 | .fi |
5355ff82 | 54 | .PP |
d39ad78f | 55 | .RS -4 |
7b8ba76c MK |
56 | Feature Test Macro Requirements for glibc (see |
57 | .BR feature_test_macros (7)): | |
d39ad78f | 58 | .RE |
5355ff82 | 59 | .PP |
7b8ba76c | 60 | .BR openat (): |
9d2adbae MK |
61 | .nf |
62 | Since glibc 2.10: | |
5c10d2c5 | 63 | _POSIX_C_SOURCE >= 200809L |
9d2adbae MK |
64 | Before glibc 2.10: |
65 | _ATFILE_SOURCE | |
66 | .fi | |
fea681da | 67 | .SH DESCRIPTION |
ef81e101 | 68 | The |
1f6ceb40 | 69 | .BR open () |
ef81e101 MK |
70 | system call opens the file specified by |
71 | .IR pathname . | |
72 | If the specified file does not exist, | |
73 | it may optionally (if | |
74 | .B O_CREAT | |
75 | is specified in | |
76 | .IR flags ) | |
77 | be created by | |
78 | .BR open (). | |
79 | .PP | |
80 | The return value of | |
81 | .BR open () | |
5c3611aa MK |
82 | is a file descriptor, a small, nonnegative integer that is an index |
83 | to an entry in the process's table of open file descriptors. | |
84 | The file descriptor is used | |
ef81e101 MK |
85 | in subsequent system calls |
86 | .RB ( read "(2), " write "(2), " lseek "(2), " fcntl (2), | |
87 | etc.) to refer to the open file. | |
e366dbc4 | 88 | The file descriptor returned by a successful call will be |
2c4bff36 | 89 | the lowest-numbered file descriptor not currently open for the process. |
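The lowest-numbered rule can be observed directly. The following is a minimal sketch, assuming a single-threaded process and that /dev/null exists; the function name lowest_fd_is_reused() is hypothetical and not part of any API.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if reopening after close() yields the same descriptor
   number, 0 if it does not, and -1 on error. */
int lowest_fd_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first == -1)
        return -1;
    close(first);           /* 'first' is once more the lowest free number */

    int second = open("/dev/null", O_RDONLY);
    if (second == -1)
        return -1;
    close(second);

    return second == first;
}
```

In a multithreaded program another thread could open a file between the two calls, so the result is only predictable when no other thread creates descriptors concurrently.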
e366dbc4 | 90 | .PP |
fe75ec04 | 91 | By default, the new file descriptor is set to remain open across an |
e366dbc4 | 92 | .BR execve (2) |
1f6ceb40 MK |
93 | (i.e., the |
94 | .B FD_CLOEXEC | |
95 | file descriptor flag described in | |
31d79098 SP |
96 | .BR fcntl (2) |
97 | is initially disabled); the | |
fe75ec04 | 98 | .B O_CLOEXEC |
d6a74b95 | 99 | flag, described below, can be used to change this default. |
1f6ceb40 | 100 | The file offset is set to the beginning of the file (see |
c13182ef | 101 | .BR lseek (2)). |
e366dbc4 MK |
102 | .PP |
103 | A call to | |
104 | .BR open () | |
105 | creates a new | |
106 | .IR "open file description" , | |
107 | an entry in the system-wide table of open files. | |
61b12e2b | 108 | The open file description records the file offset and the file status flags |
20ee63c1 | 109 | (see below). |
61b12e2b | 110 | A file descriptor is a reference to an open file description; |
2c4bff36 MK |
111 | this reference is unaffected if |
112 | .I pathname | |
113 | is subsequently removed or modified to refer to a different file. | |
d20d9d33 | 114 | For further details on open file descriptions, see NOTES. |
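As an illustration of a descriptor surviving the removal of its pathname, here is a minimal sketch. It assumes a writable /tmp and uses mkstemp(3) as a stand-in for an open() call that creates the file; the function name io_after_unlink() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 0 if I/O through the descriptor still works after the
   pathname has been removed, -1 otherwise. */
int io_after_unlink(void)
{
    char path[] = "/tmp/open_demo_XXXXXX";
    int fd = mkstemp(path);         /* creates and opens a unique file */
    if (fd == -1)
        return -1;

    if (unlink(path) == -1) {       /* remove the name; fd stays valid */
        close(fd);
        return -1;
    }

    char buf[6] = "";
    int ok = write(fd, "hello", 5) == 5
          && lseek(fd, 0, SEEK_SET) == 0
          && read(fd, buf, 5) == 5
          && memcmp(buf, "hello", 5) == 0;

    close(fd);                      /* the file's storage is released here */
    return ok ? 0 : -1;
}
```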
e366dbc4 | 115 | .PP |
c4bb193f | 116 | The argument |
fea681da | 117 | .I flags |
e366dbc4 MK |
118 | must include one of the following |
119 | .IR "access modes" : | |
c7992edc | 120 | .BR O_RDONLY ", " O_WRONLY ", or " O_RDWR . |
e366dbc4 MK |
121 | These request opening the file read-only, write-only, or read/write, |
122 | respectively. | |
5355ff82 | 123 | .PP |
bfe9ba67 | 124 | In addition, zero or more file creation flags and file status flags |
c13182ef | 125 | can be |
fea681da | 126 | .RI bitwise- or 'd |
e366dbc4 | 127 | in |
bfe9ba67 | 128 | .IR flags . |
c13182ef MK |
129 | The |
130 | .I file creation flags | |
131 | are | |
0e40804c | 132 | .BR O_CLOEXEC , |
b072a788 | 133 | .BR O_CREAT , |
0e40804c MK |
134 | .BR O_DIRECTORY , |
135 | .BR O_EXCL , | |
136 | .BR O_NOCTTY , | |
137 | .BR O_NOFOLLOW , | |
f2698a42 | 138 | .BR O_TMPFILE , |
0e40804c | 139 | and |
15fb5d03 | 140 | .BR O_TRUNC . |
c13182ef MK |
141 | The |
142 | .I file status flags | |
bfe9ba67 | 143 | are all of the remaining flags listed below. |
0e40804c | 144 | .\" SUSv4 divides the flags into: |
93ee8f96 MK |
145 | .\" * Access mode |
146 | .\" * File creation | |
147 | .\" * File status | |
148 | .\" * Other (O_CLOEXEC, O_DIRECTORY, O_NOFOLLOW) | |
149 | .\" though it's not clear what the difference between "other" and | |
0e40804c MK |
150 | .\" "File creation" flags is. I raised an Aardvark to see if this |
151 | .\" can be clarified in SUSv4; 10 Oct 2008. | |
152 | .\" http://thread.gmane.org/gmane.comp.standards.posix.austin.general/64/focus=67 | |
153 | .\" TC1 (balloted in 2013), resolved this, so that those three constants | |
154 | .\" are also categorized as file status flags. |
155 | .\" | |
bfe9ba67 | 156 | The distinction between these two groups of flags is that |
68210340 MK |
157 | the file creation flags affect the semantics of the open operation itself, |
158 | while the file status flags affect the semantics of subsequent I/O operations. | |
159 | The file status flags can be retrieved and (in some cases) | |
566b427d MK |
160 | modified; see |
161 | .BR fcntl (2) | |
162 | for details. | |
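Retrieving and modifying the file status flags might look like the following minimal sketch. It assumes /dev/null exists and picks O_APPEND as the flag to change; the function name toggle_append_flag() is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if O_APPEND could be switched on after the open, -1 otherwise. */
int toggle_append_flag(void)
{
    int fd = open("/dev/null", O_WRONLY);
    if (fd == -1)
        return -1;

    /* F_GETFL returns the access mode together with the status flags. */
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1
        || (flags & O_ACCMODE) != O_WRONLY
        || (flags & O_APPEND)) {
        close(fd);
        return -1;
    }

    /* F_SETFL changes status flags only; the access mode is left alone. */
    if (fcntl(fd, F_SETFL, flags | O_APPEND) == -1) {
        close(fd);
        return -1;
    }

    int updated = fcntl(fd, F_GETFL);
    close(fd);
    return (updated != -1 && (updated & O_APPEND)) ? 0 : -1;
}
```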
5355ff82 | 163 | .PP |
bfe9ba67 | 164 | The full list of file creation flags and file status flags is as follows: |
fea681da | 165 | .TP |
1c1e15ed | 166 | .B O_APPEND |
c13182ef MK |
167 | The file is opened in append mode. |
168 | Before each | |
0bfa087b | 169 | .BR write (2), |
1e568304 | 170 | the file offset is positioned at the end of the file, |
1c1e15ed | 171 | as if with |
0bfa087b | 172 | .BR lseek (2). |
17efe87f | 173 | The modification of the file offset and the write operation |
20b8f0e2 | 174 | are performed as a single atomic step. |
5355ff82 | 175 | .IP |
1c1e15ed | 176 | .B O_APPEND |
9ee4a2b6 | 177 | may lead to corrupted files on NFS filesystems if more than one process |
c13182ef | 178 | appends data to a file at once. |
a4391429 MK |
179 | .\" For more background, see |
180 | .\" http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=453946 | |
181 | .\" http://nfs.sourceforge.net/ | |
c13182ef | 182 | This is because NFS does not support |
1c1e15ed MK |
183 | appending to a file, so the client kernel has to simulate it, which |
184 | can't be done without a race condition. | |
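The effect of append mode on the file offset can be seen in a minimal sketch such as the one below. It assumes a local (non-NFS) filesystem under /tmp; the function name append_ignores_seek() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 0 if a write after lseek(fd, 0, SEEK_SET) still lands at the
   end of the file, -1 otherwise. */
int append_ignores_seek(void)
{
    char path[] = "/tmp/append_demo_XXXXXX";
    int tmp = mkstemp(path);        /* create an empty scratch file */
    if (tmp == -1)
        return -1;
    close(tmp);

    int fd = open(path, O_RDWR | O_APPEND);
    if (fd == -1) {
        unlink(path);
        return -1;
    }

    char buf[7] = "";
    int ok = write(fd, "abc", 3) == 3
          && lseek(fd, 0, SEEK_SET) == 0    /* move the offset to the start */
          && write(fd, "def", 3) == 3       /* still appended after "abc" */
          && pread(fd, buf, 6, 0) == 6
          && memcmp(buf, "abcdef", 6) == 0;

    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```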
185 | .TP | |
186 | .B O_ASYNC | |
b50582eb | 187 | Enable signal-driven I/O: |
8bd58774 MK |
188 | generate a signal |
189 | .RB ( SIGIO | |
190 | by default, but this can be changed via | |
1c1e15ed MK |
191 | .BR fcntl (2)) |
192 | when input or output becomes possible on this file descriptor. | |
33a0ccb2 | 193 | This feature is available only for terminals, pseudoterminals, |
1f6ceb40 MK |
194 | sockets, and (since Linux 2.6) pipes and FIFOs. |
195 | See | |
1c1e15ed MK |
196 | .BR fcntl (2) |
197 | for further details. | |
9bde4908 | 198 | See also BUGS, below. |
fe75ec04 | 199 | .TP |
31c1f2b0 | 200 | .BR O_CLOEXEC " (since Linux 2.6.23)" |
7fdec065 | 201 | .\" NOTE! several other man pages refer to this text |
fe75ec04 | 202 | Enable the close-on-exec flag for the new file descriptor. |
00d82ce8 MK |
203 | .\" FIXME . for later review when Issue 8 is one day released... |
204 | .\" POSIX proposes to fix many APIs that provide hidden FDs | |
205 | .\" http://austingroupbugs.net/tag_view_page.php?tag_id=8 | |
206 | .\" http://austingroupbugs.net/view.php?id=368 | |
24ec631f | 207 | Specifying this flag permits a program to avoid additional |
fe75ec04 MK |
208 | .BR fcntl (2) |
209 | .B F_SETFD | |
24ec631f | 210 | operations to set the |
0daa9e92 | 211 | .B FD_CLOEXEC |
fe75ec04 | 212 | flag. |
5355ff82 | 213 | .IP |
7756d157 MK |
214 | Note that the use of this flag is essential in some multithreaded programs, |
215 | because using a separate | |
fe75ec04 MK |
216 | .BR fcntl (2) |
217 | .B F_SETFD | |
218 | operation to set the | |
0daa9e92 | 219 | .B FD_CLOEXEC |
fe75ec04 | 220 | flag does not suffice to avoid race conditions |
7756d157 MK |
221 | where one thread opens a file descriptor and |
222 | attempts to set its close-on-exec flag using | |
223 | .BR fcntl (2) | |
224 | at the same time as another thread does a | |
fe75ec04 MK |
225 | .BR fork (2) |
226 | plus | |
227 | .BR execve (2). | |
7756d157 | 228 | Depending on the order of execution, |
30821db8 | 229 | the race may lead to the file descriptor returned by |
7756d157 MK |
230 | .BR open () |
231 | being unintentionally leaked to the program executed by the child process | |
232 | created by | |
233 | .BR fork (2). | |
234 | (This kind of race is in principle possible for any system call | |
235 | that creates a file descriptor whose close-on-exec flag should be set, | |
236 | and various other Linux system calls provide an equivalent of the | |
1ae6b2c7 | 237 | .B O_CLOEXEC |
7756d157 | 238 | flag to deal with this problem.) |
fe75ec04 | 239 | .\" This flag fixes only one form of the race condition; |
d9cb0d7d | 240 | .\" The race can also occur with, for example, file descriptors |
fe75ec04 | 241 | .\" returned by accept(), pipe(), etc. |
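A minimal sketch of the difference between the default and O_CLOEXEC, assuming /dev/null exists; the function name cloexec_is_set_at_open() is hypothetical. It only inspects the descriptor flags and does not reproduce the fork-plus-exec race itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if FD_CLOEXEC is clear by default and set when O_CLOEXEC
   is passed to open(), -1 otherwise. */
int cloexec_is_set_at_open(void)
{
    int plain = open("/dev/null", O_RDONLY);
    int guarded = open("/dev/null", O_RDONLY | O_CLOEXEC);
    int result = -1;

    if (plain != -1 && guarded != -1) {
        int plain_flags = fcntl(plain, F_GETFD);
        int guarded_flags = fcntl(guarded, F_GETFD);

        if (plain_flags != -1 && guarded_flags != -1
            && !(plain_flags & FD_CLOEXEC)
            && (guarded_flags & FD_CLOEXEC))
            result = 0;
    }

    if (plain != -1)
        close(plain);
    if (guarded != -1)
        close(guarded);
    return result;
}
```

Because the flag is applied by the open() call itself, there is no window in which another thread's fork(2) could copy the descriptor without the flag.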
1c1e15ed | 242 | .TP |
fea681da | 243 | .B O_CREAT |
6f72cae5 MK |
244 | If |
245 | .I pathname | |
246 | does not exist, create it as a regular file. | |
5355ff82 | 247 | .IP |
40169a93 | 248 | The owner (user ID) of the new file is set to the effective user ID |
c13182ef | 249 | of the process. |
5355ff82 | 250 | .IP |
ddf5e4ab MK |
251 | The group ownership (group ID) of the new file is set either to |
252 | the effective group ID of the process (System V semantics) | |
253 | or to the group ID of the parent directory (BSD semantics). | |
254 | On Linux, the behavior depends on whether the | |
255 | set-group-ID mode bit is set on the parent directory: | |
256 | if that bit is set, then BSD semantics apply; | |
257 | otherwise, System V semantics apply. | |
258 | For some filesystems, the behavior also depends on the | |
fea681da MK |
259 | .I bsdgroups |
260 | and | |
261 | .I sysvgroups | |
ddf5e4ab | 262 | mount options described in |
53dcd8d2 | 263 | .BR mount (8). |
8b39ad66 MK |
264 | .\" As at 2.6.25, bsdgroups is supported by ext2, ext3, ext4, and |
265 | .\" XFS (since 2.6.14). | |
7f4e9716 | 266 | .IP |
1bab84a8 | 267 | The |
4e698277 | 268 | .I mode |
901c8ecf MK |
269 | argument specifies the file mode bits to be applied when a new file is created. |
270 | If neither | |
4e698277 | 271 | .B O_CREAT |
901c8ecf | 272 | nor |
f2698a42 | 273 | .B O_TMPFILE |
4e698277 | 274 | is specified in |
901c8ecf MK |
275 | .IR flags , |
276 | then | |
277 | .I mode | |
278 | is ignored (and can thus be specified as 0, or simply omitted). | |
279 | The | |
280 | .I mode | |
281 | argument | |
282 | .B must | |
283 | be supplied if | |
4e698277 | 284 | .B O_CREAT |
901c8ecf | 285 | or |
f2698a42 | 286 | .B O_TMPFILE |
901c8ecf MK |
287 | is specified in |
288 | .IR flags ; | |
289 | if it is not supplied, | |
290 | some arbitrary bytes from the stack will be applied as the file mode. | |
88f463a9 | 291 | .IP |
58222012 | 292 | The effective mode is modified by the process's |
4e698277 | 293 | .I umask |
58222012 MK |
294 | in the usual way: in the absence of a default ACL, the mode of the |
295 | created file is | |
af2d18b2 | 296 | .IR "(mode\ &\ \(tiumask)" . |
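The (mode & ~umask) computation can be checked with a minimal sketch like the following. It assumes /tmp is writable and carries no default ACL; the function name created_mode() and the scratch file name are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a scratch file with the given mode while 'mask' is the umask.
   Returns the resulting permission bits, or -1 on error. */
int created_mode(mode_t requested, mode_t mask)
{
    char path[64];
    snprintf(path, sizeof(path), "/tmp/umask_demo_%ld", (long) getpid());

    mode_t saved = umask(mask);     /* install the mask for this test */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    umask(saved);                   /* restore the caller's umask */
    if (fd == -1)
        return -1;

    struct stat st;
    int rc = fstat(fd, &st);
    close(fd);
    unlink(path);
    return rc == -1 ? -1 : (int) (st.st_mode & 0777);
}
```

For example, a requested mode of 0666 under a umask of 022 gives 0644, since 0666 & ~022 = 0644.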
88f463a9 MK |
297 | .IP |
298 | Note that | |
299 | .I mode | |
300 | applies only to future accesses of the | |
4e698277 MK |
301 | newly created file; the |
302 | .BR open () | |
303 | call that creates a read-only file may well return a read/write | |
304 | file descriptor. | |
7f4e9716 | 305 | .IP |
4e698277 MK |
306 | The following symbolic constants are provided for |
307 | .IR mode : | |
7f4e9716 | 308 | .RS |
4e698277 MK |
309 | .TP 9 |
310 | .B S_IRWXU | |
97d5b762 | 311 | 00700 user (file owner) has read, write, and execute permission |
4e698277 MK |
312 | .TP |
313 | .B S_IRUSR | |
314 | 00400 user has read permission | |
315 | .TP | |
316 | .B S_IWUSR | |
317 | 00200 user has write permission | |
318 | .TP | |
319 | .B S_IXUSR | |
320 | 00100 user has execute permission | |
321 | .TP | |
322 | .B S_IRWXG | |
97d5b762 | 323 | 00070 group has read, write, and execute permission |
4e698277 MK |
324 | .TP |
325 | .B S_IRGRP | |
326 | 00040 group has read permission | |
327 | .TP | |
328 | .B S_IWGRP | |
329 | 00020 group has write permission | |
330 | .TP | |
331 | .B S_IXGRP | |
332 | 00010 group has execute permission | |
333 | .TP | |
334 | .B S_IRWXO | |
97d5b762 | 335 | 00007 others have read, write, and execute permission |
4e698277 MK |
336 | .TP |
337 | .B S_IROTH | |
338 | 00004 others have read permission | |
339 | .TP | |
340 | .B S_IWOTH | |
341 | 00002 others have write permission | |
342 | .TP | |
343 | .B S_IXOTH | |
344 | 00001 others have execute permission | |
345 | .RE | |
9e1d8950 MK |
346 | .IP |
347 | According to POSIX, the effect when other bits are set in | |
348 | .I mode | |
349 | is unspecified. | |
350 | On Linux, the following bits are also honored in | |
351 | .IR mode : | |
352 | .RS | |
353 | .TP 9 | |
354 | .B S_ISUID | |
355 | 0004000 set-user-ID bit | |
356 | .TP | |
357 | .B S_ISGID | |
358 | 0002000 set-group-ID bit (see | |
e6fc1596 | 359 | .BR inode (7)). |
9e1d8950 MK |
360 | .TP |
361 | .B S_ISVTX | |
362 | 0001000 sticky bit (see | |
e6fc1596 | 363 | .BR inode (7)). |
9e1d8950 | 364 | .RE |
fea681da | 365 | .TP |
31c1f2b0 | 366 | .BR O_DIRECT " (since Linux 2.4.10)" |
1c1e15ed MK |
367 | Try to minimize cache effects of the I/O to and from this file. |
368 | In general this will degrade performance, but it is useful in | |
369 | special situations, such as when applications do their own caching. | |
bce0482f | 370 | File I/O is done directly to/from user-space buffers. |
015221ef CH |
371 | The |
372 | .B O_DIRECT | |
0deb3ce9 | 373 | flag on its own makes an effort to transfer data synchronously, |
015221ef CH |
374 | but does not give the guarantees of the |
375 | .B O_SYNC | |
0deb3ce9 JM |
376 | flag that data and necessary metadata are transferred. |
377 | To guarantee synchronous I/O, | |
015221ef CH |
378 | .B O_SYNC |
379 | must be used in addition to | |
380 | .BR O_DIRECT . | |
be02e49f | 381 | See NOTES below for further discussion. |
5355ff82 | 382 | .IP |
c13182ef | 383 | A semantically similar (but deprecated) interface for block devices |
9b54d4fa | 384 | is described in |
1c1e15ed MK |
385 | .BR raw (8). |
386 | .TP | |
387 | .B O_DIRECTORY | |
a8d55537 | 388 | If \fIpathname\fP is not a directory, cause the open to fail. |
9f8d688a MK |
389 | .\" But see the following and its replies: |
390 | .\" http://marc.theaimsgroup.com/?t=112748702800001&r=1&w=2 | |
391 | .\" [PATCH] open: O_DIRECTORY and O_CREAT together should fail | |
392 | .\" O_DIRECTORY | O_CREAT causes O_DIRECTORY to be ignored. | |
65496644 | 393 | This flag was added in kernel version 2.1.126, to |
60a90ecd MK |
394 | avoid denial-of-service problems if |
395 | .BR opendir (3) | |
396 | is called on a | |
a3041a58 | 397 | FIFO or tape device. |
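A minimal sketch of O_DIRECTORY rejecting a non-directory. The function name open_dir_only() is hypothetical; the ENOTDIR error it reports is the one listed for this case in the ERRORS section of this page.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if 'path' could be opened as a directory,
   otherwise the errno value reported by open(). */
int open_dir_only(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}
```

Calling open_dir_only("/") is expected to succeed, while open_dir_only("/dev/null") should return ENOTDIR without ever opening the device.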
1c1e15ed | 398 | .TP |
6cf19e62 MK |
399 | .B O_DSYNC |
400 | Write operations on the file will complete according to the requirements of | |
401 | synchronized I/O | |
402 | .I data | |
403 | integrity completion. | |
5355ff82 | 404 | .IP |
6cf19e62 MK |
405 | By the time |
406 | .BR write (2) | |
407 | (and similar) | |
408 | return, the output data | |
409 | has been transferred to the underlying hardware, | |
410 | along with any file metadata that would be required to retrieve that data | |
411 | (i.e., as though each | |
412 | .BR write (2) | |
413 | was followed by a call to | |
414 | .BR fdatasync (2)). | |
415 | .IR "See NOTES below" . | |
416 | .TP | |
fea681da | 417 | .B O_EXCL |
f4b9d6a5 MK |
418 | Ensure that this call creates the file: |
419 | if this flag is specified in conjunction with | |
fea681da | 420 | .BR O_CREAT , |
f4b9d6a5 MK |
421 | and |
422 | .I pathname | |
423 | already exists, then | |
1c1e15ed | 424 | .BR open () |
26cd31fd MK |
425 | fails with the error |
426 | .BR EEXIST . | |
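The EEXIST behavior can be sketched as follows, assuming a writable local /tmp; the function name excl_second_create_fails() and the scratch file name are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 0 if a second O_CREAT | O_EXCL open of the same pathname
   fails with EEXIST, -1 otherwise. */
int excl_second_create_fails(void)
{
    char path[64];
    snprintf(path, sizeof(path), "/tmp/excl_demo_%ld", (long) getpid());

    int first = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (first == -1)
        return -1;

    errno = 0;
    int second = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    int saved = errno;              /* keep errno before any cleanup call */

    if (second != -1)
        close(second);
    close(first);
    unlink(path);
    return (second == -1 && saved == EEXIST) ? 0 : -1;
}
```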
5355ff82 | 427 | .IP |
f4b9d6a5 MK |
428 | When these two flags are specified, symbolic links are not followed: |
429 | .\" POSIX.1-2001 explicitly requires this behavior. | |
430 | if | |
431 | .I pathname | |
432 | is a symbolic link, then | |
433 | .BR open () | |
43116169 | 434 | fails regardless of where the symbolic link points. |
5355ff82 | 435 | .IP |
10b7a945 IHV |
436 | In general, the behavior of |
437 | .B O_EXCL | |
438 | is undefined if it is used without | |
439 | .BR O_CREAT . | |
440 | There is one exception: on Linux 2.6 and later, | |
441 | .B O_EXCL | |
442 | can be used without | |
443 | .B O_CREAT | |
444 | if | |
445 | .I pathname | |
446 | refers to a block device. | |
6303d401 DB |
447 | If the block device is in use by the system (e.g., mounted), |
448 | .BR open () | |
10b7a945 IHV |
449 | fails with the error |
450 | .BR EBUSY . | |
5355ff82 | 451 | .IP |
efe08656 | 452 | On NFS, |
f4b9d6a5 | 453 | .B O_EXCL |
33a0ccb2 | 454 | is supported only when using NFSv3 or later on kernel 2.6 or later. |
efe08656 | 455 | In NFS environments where |
fea681da | 456 | .B O_EXCL |
f4b9d6a5 MK |
457 | support is not provided, programs that rely on it |
458 | for performing locking tasks will contain a race condition. | |
459 | Portable programs that want to perform atomic file locking using a lockfile, | |
460 | and need to avoid reliance on NFS support for | |
461 | .BR O_EXCL , | |
462 | can create a unique file on | |
9ee4a2b6 | 463 | the same filesystem (e.g., incorporating hostname and PID), and use |
fea681da | 464 | .BR link (2) |
c13182ef | 465 | to make a link to the lockfile. |
60a90ecd MK |
466 | If |
467 | .BR link (2) | |
f4b9d6a5 | 468 | returns 0, the lock is successful. |
c13182ef | 469 | Otherwise, use |
fea681da MK |
470 | .BR stat (2) |
471 | on the unique file to check if its link count has increased to 2, | |
472 | in which case the lock is also successful. | |
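The link(2)-based locking scheme described above might be sketched as follows. This is only an illustration under stated assumptions: the function name acquire_lock(), the naming of the unique file (lockfile name plus hostname and PID), and the buffer sizes are all hypothetical choices, and releasing the lock is assumed to be a plain unlink(2) of the lockfile.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Tries to take 'lockfile'. Returns 0 if the lock was acquired,
   -1 otherwise. */
int acquire_lock(const char *lockfile)
{
    char host[256] = "";
    char unique[4096];
    struct stat st;

    /* Build a name that is unique per host and process, placed next to
       the lockfile so that both are on the same filesystem. */
    gethostname(host, sizeof(host) - 1);
    int n = snprintf(unique, sizeof(unique), "%s.%s.%ld",
                     lockfile, host, (long) getpid());
    if (n < 0 || (size_t) n >= sizeof(unique))
        return -1;

    int fd = open(unique, O_WRONLY | O_CREAT, 0600);
    if (fd == -1)
        return -1;
    close(fd);

    int acquired = 0;
    if (link(unique, lockfile) == 0)
        acquired = 1;               /* link() succeeded: lock is ours */
    else if (stat(unique, &st) == 0 && st.st_nlink == 2)
        acquired = 1;               /* link() reply was lost, but it worked */

    unlink(unique);                 /* the lockfile itself remains */
    return acquired ? 0 : -1;
}
```

The scheme relies on link(2) failing if the lockfile already exists, so it does not depend on O_EXCL at all.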
473 | .TP | |
1c1e15ed MK |
474 | .B O_LARGEFILE |
475 | (LFS) | |
476 | Allow files whose sizes cannot be represented in an | |
8478ee02 | 477 | .I off_t |
1c1e15ed | 478 | (but can be represented in an |
8478ee02 | 479 | .IR off64_t ) |
1c1e15ed | 480 | to be opened. |
c13182ef | 481 | The |
bcdd964e | 482 | .B _LARGEFILE64_SOURCE |
e417acb0 MK |
483 | macro must be defined |
484 | (before including | |
485 | .I any | |
486 | header files) | |
487 | in order to obtain this definition. | |
c13182ef | 488 | Setting the |
bcdd964e | 489 | .B _FILE_OFFSET_BITS |
9f3d8b28 MK |
490 | feature test macro to 64 (rather than using |
491 | .BR O_LARGEFILE ) | |
12e263f1 | 492 | is the preferred |
9f3d8b28 | 493 | method of accessing large files on 32-bit systems (see |
2dcbf4f7 | 494 | .BR feature_test_macros (7)). |
1c1e15ed | 495 | .TP |
31c1f2b0 | 496 | .BR O_NOATIME " (since Linux 2.6.8)" |
1bb72c96 MK |
497 | Do not update the file last access time |
498 | .RI ( st_atime | |
499 | in the inode) | |
310b7919 | 500 | when the file is |
1c1e15ed | 501 | .BR read (2). |
5355ff82 | 502 | .IP |
47c906e5 MK |
503 | This flag can be employed only if one of the following conditions is true: |
504 | .RS | |
505 | .IP * 3 | |
506 | The effective UID of the process | |
507 | .\" Strictly speaking: the filesystem UID | |
508 | matches the owner UID of the file. | |
509 | .IP * | |
510 | The calling process has the | |
1ae6b2c7 | 511 | .B CAP_FOWNER |
47c906e5 MK |
512 | capability in its user namespace and |
513 | the owner UID of the file has a mapping in the namespace. | |
514 | .RE | |
515 | .IP | |
1c1e15ed MK |
516 | This flag is intended for use by indexing or backup programs, |
517 | where its use can significantly reduce the amount of disk activity. | |
9ee4a2b6 | 518 | This flag may not be effective on all filesystems. |
1c1e15ed | 519 | One example is NFS, where the server maintains the access time. |
0e1ad98c | 520 | .\" The O_NOATIME flag also affects the treatment of st_atime |
92057f4d | 521 | .\" by mmap() and readdir(2), MTK, Dec 04. |
1c1e15ed | 522 | .TP |
fea681da MK |
523 | .B O_NOCTTY |
524 | If | |
525 | .I pathname | |
5503c85e | 526 | refers to a terminal device\(emsee |
1bb72c96 MK |
527 | .BR tty (4)\(emit |
528 | will not become the process's controlling terminal even if the | |
fea681da MK |
529 | process does not have one. |
530 | .TP | |
1c1e15ed | 531 | .B O_NOFOLLOW |
7a11fc63 MK |
532 | If the trailing component (i.e., basename) of |
533 | .I pathname | |
534 | is a symbolic link, then the open fails, with the error | |
6ccb7137 | 535 | .BR ELOOP . |
7fba0065 MK |
536 | Symbolic links in earlier components of the pathname will still be |
537 | followed. | |
538 | (Note that the | |
539 | .B ELOOP | |
540 | error that can occur in this case is indistinguishable from the case where | |
6ccb7137 MK |
541 | an open fails because there are too many symbolic links found |
542 | while resolving components in the prefix part of the pathname.) | |
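A minimal sketch of O_NOFOLLOW refusing a symbolic link in the final component. It assumes a writable /tmp and that /dev/null exists; the function name nofollow_rejects_symlink() and the link name are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 0 if a symbolic link opens normally without O_NOFOLLOW but
   fails with ELOOP when O_NOFOLLOW is given, -1 otherwise. */
int nofollow_rejects_symlink(void)
{
    char link_path[64];
    snprintf(link_path, sizeof(link_path),
             "/tmp/nofollow_demo_%ld", (long) getpid());
    unlink(link_path);              /* drop any stale link from a past run */

    if (symlink("/dev/null", link_path) == -1)
        return -1;

    int followed = open(link_path, O_RDONLY);   /* resolves to /dev/null */

    errno = 0;
    int refused = open(link_path, O_RDONLY | O_NOFOLLOW);
    int saved = errno;              /* keep errno before any cleanup call */

    if (followed != -1)
        close(followed);
    if (refused != -1)
        close(refused);
    unlink(link_path);

    return (followed != -1 && refused == -1 && saved == ELOOP) ? 0 : -1;
}
```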
5355ff82 | 543 | .IP |
8db11e23 MK |
544 | This flag is a FreeBSD extension, which was added to Linux in version 2.1.126, |
545 | and has subsequently been standardized in POSIX.1-2008. | |
5355ff82 | 546 | .IP |
1135dbe1 | 547 | See also |
1ae6b2c7 | 548 | .B O_PATH |
1135dbe1 | 549 | below. |
e366dbc4 MK |
550 | .\" The headers from glibc 2.0.100 and later include a |
551 | .\" definition of this flag; \fIkernels before 2.1.126 will ignore it if | |
a8d55537 | 552 | .\" used\fP. |
fea681da MK |
553 | .TP |
554 | .BR O_NONBLOCK " or " O_NDELAY | |
ff40dbb3 | 555 | When possible, the file is opened in nonblocking mode. |
c13182ef | 556 | Neither the |
1c1e15ed | 557 | .BR open () |
b0972b3b | 558 | nor any subsequent I/O operations on the file descriptor which is |
fea681da | 559 | returned will cause the calling process to wait. |
5355ff82 | 560 | .IP |
f3fdbe28 | 561 | Note that the setting of this flag has no effect on the operation of |
f2a11072 MK |
562 | .BR poll (2), |
563 | .BR select (2), | |
564 | .BR epoll (7), | |
565 | and similar, | |
566 | since those interfaces merely inform the caller about whether | |
567 | a file descriptor is "ready", | |
568 | meaning that an I/O operation performed on | |
569 | the file descriptor with the | |
570 | .B O_NONBLOCK | |
571 | flag | |
572 | .I clear | |
573 | would not block. | |
574 | .IP | |
9f629381 MK |
575 | Note that this flag has no effect for regular files and block devices; |
576 | that is, I/O operations will (briefly) block when device activity | |
577 | is required, regardless of whether | |
578 | .B O_NONBLOCK | |
579 | is set. | |
580 | Since | |
581 | .B O_NONBLOCK | |
582 | semantics might eventually be implemented, | |
583 | applications should not depend upon blocking behavior | |
584 | when specifying this flag for regular files and block devices. | |
5355ff82 | 585 | .IP |
fea681da | 586 | For the handling of FIFOs (named pipes), see also |
af5b2ef2 | 587 | .BR fifo (7). |
db28bfac | 588 | For a discussion of the effect of |
0daa9e92 | 589 | .B O_NONBLOCK |
db28bfac MK |
590 | in conjunction with mandatory file locks and with file leases, see |
591 | .BR fcntl (2). | |
fea681da | 592 | .TP |
1135dbe1 MK |
593 | .BR O_PATH " (since Linux 2.6.39)" |
594 | .\" commit 1abf0c718f15a56a0a435588d1b104c7a37dc9bd | |
595 | .\" commit 326be7b484843988afe57566b627fb7a70beac56 | |
596 | .\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d | |
597 | .\" | |
598 | .\" http://thread.gmane.org/gmane.linux.man/2790/focus=3496 | |
599 | .\" Subject: Re: [PATCH] open(2): document O_PATH | |
600 | .\" Newsgroups: gmane.linux.man, gmane.linux.kernel | |
601 | .\" | |
1135dbe1 | 602 | Obtain a file descriptor that can be used for two purposes: |
9ee4a2b6 | 603 | to indicate a location in the filesystem tree and |
1135dbe1 MK |
604 | to perform operations that act purely at the file descriptor level. |
605 | The file itself is not opened, and other file operations (e.g., | |
606 | .BR read (2), | |
607 | .BR write (2), | |
608 | .BR fchmod (2), | |
609 | .BR fchown (2), | |
2510e4e5 | 610 | .BR fgetxattr (2), |
97a45d02 | 611 | .BR ioctl (2), |
2510e4e5 | 612 | .BR mmap (2)) |
1135dbe1 MK |
613 | fail with the error |
614 | .BR EBADF . | |
5355ff82 | 615 | .IP |
1135dbe1 MK |
616 | The following operations |
617 | .I can | |
618 | be performed on the resulting file descriptor: | |
619 | .RS | |
620 | .IP * 3 | |
b9307a4a MK |
621 | .BR close (2). |
622 | .IP * | |
f3cd742c MK |
623 | .BR fchdir (2), |
624 | if the file descriptor refers to a directory | |
b9307a4a | 625 | (since Linux 3.5). |
1135dbe1 | 626 | .\" commit 332a2e1244bd08b9e3ecd378028513396a004a24 |
b9307a4a | 627 | .IP * |
1135dbe1 | 628 | .BR fstat (2) |
b9307a4a MK |
629 | (since Linux 3.6). |
630 | .IP * | |
1135dbe1 | 631 | .\" fstat(): commit 55815f70147dcfa3ead5738fd56d3574e2e3c1c2 |
97a45d02 N |
632 | .BR fstatfs (2) |
633 | (since Linux 3.12). | |
634 | .\" fstatfs(): commit 9d05746e7b16d8565dddbe3200faa1e669d23bbf | |
1135dbe1 MK |
635 | .IP * |
636 | Duplicating the file descriptor | |
637 | .RB ( dup (2), | |
638 | .BR fcntl (2) | |
639 | .BR F_DUPFD , | |
640 | etc.). | |
641 | .IP * | |
642 | Getting and setting file descriptor flags | |
643 | .RB ( fcntl (2) | |
1ae6b2c7 | 644 | .B F_GETFD |
1135dbe1 MK |
645 | and |
646 | .BR F_SETFD ). | |
09f677a3 MK |
647 | .IP * |
648 | Retrieving open file status flags using the | |
649 | .BR fcntl (2) | |
1ae6b2c7 | 650 | .B F_GETFL |
09f677a3 MK |
651 | operation: the returned flags will include the bit |
652 | .BR O_PATH . | |
1135dbe1 MK |
653 | .IP * |
654 | Passing the file descriptor as the | |
1ae6b2c7 | 655 | .I dirfd |
1135dbe1 | 656 | argument of |
490f876a | 657 | .BR openat () |
1135dbe1 | 658 | and the other "*at()" system calls. |
7dee406b AL |
659 | This includes |
660 | .BR linkat (2) | |
661 | with | |
1ae6b2c7 | 662 | .B AT_EMPTY_PATH |
7dee406b AL |
663 | (or via procfs using |
664 | .BR AT_SYMLINK_FOLLOW ) | |
665 | even if the file is not a directory. | |
1135dbe1 MK |
666 | .IP * |
667 | Passing the file descriptor to another process via a UNIX domain socket | |
668 | (see | |
1ae6b2c7 | 669 | .B SCM_RIGHTS |
1135dbe1 MK |
670 | in |
671 | .BR unix (7)). | |
672 | .RE | |
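The split between refused and permitted operations on an O_PATH descriptor can be sketched as follows. This assumes Linux 3.6 or later (for fstat(2) on such descriptors) and glibc with _GNU_SOURCE defined so that O_PATH is visible; the function name path_fd_behaviour() is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 if read() is refused with EBADF while fstat() and F_GETFL
   work on an O_PATH descriptor, -1 otherwise. */
int path_fd_behaviour(void)
{
    int fd = open("/", O_PATH | O_DIRECTORY);
    if (fd == -1)
        return -1;

    char byte;
    struct stat st;

    errno = 0;
    int read_refused = read(fd, &byte, 1) == -1 && errno == EBADF;

    int stat_ok = fstat(fd, &st) == 0 && S_ISDIR(st.st_mode);

    int flags = fcntl(fd, F_GETFL);
    int flag_visible = flags != -1 && (flags & O_PATH);

    close(fd);
    return (read_refused && stat_ok && flag_visible) ? 0 : -1;
}
```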
673 | .IP | |
674 | When | |
675 | .B O_PATH | |
676 | is specified in | |
677 | .IR flags , | |
678 | flag bits other than | |
6807fc6f MK |
679 | .BR O_CLOEXEC , |
680 | .BR O_DIRECTORY , | |
1135dbe1 | 681 | and |
1ae6b2c7 | 682 | .B O_NOFOLLOW |
1135dbe1 | 683 | are ignored. |
5355ff82 | 684 | .IP |
4a3b9ffc MK |
685 | Opening a file or directory with the |
686 | .B O_PATH | |
687 | flag requires no permissions on the object itself | |
688 | (but does require execute permission on the directories in the path prefix). | |
689 | Depending on the subsequent operation, | |
690 | a check for suitable file permissions may be performed (e.g., | |
691 | .BR fchdir (2) | |
692 | requires execute permission on the directory referred to | |
693 | by its file descriptor argument). | |
694 | By contrast, | |
695 | obtaining a reference to a filesystem object by opening it with the | |
696 | .B O_RDONLY | |
697 | flag requires that the caller have read permission on the object, | |
698 | even when the subsequent operation (e.g., | |
699 | .BR fchdir (2), | |
700 | .BR fstat (2)) | |
701 | does not require read permission on the object. | |
702 | .IP | |
d30344ab MK |
703 | If |
704 | .I pathname | |
705 | is a symbolic link and the | |
1ae6b2c7 | 706 | .B O_NOFOLLOW |
1135dbe1 MK |
707 | flag is also specified, |
708 | then the call returns a file descriptor referring to the symbolic link. | |
709 | This file descriptor can be used as the | |
710 | .I dirfd | |
711 | argument in calls to | |
712 | .BR fchownat (2), | |
713 | .BR fstatat (2), | |
714 | .BR linkat (2), | |
715 | and | |
716 | .BR readlinkat (2) | |
717 | with an empty pathname to have the calls operate on the symbolic link. | |
5355ff82 | 718 | .IP |
97a45d02 N |
719 | If |
720 | .I pathname | |
721 | refers to an automount point that has not yet been triggered, so no | |
722 | other filesystem is mounted on it, then the call returns a file | |
723 | descriptor referring to the automount directory without triggering a mount. | |
724 | .BR fstatfs (2) | |
725 | can then be used to determine if it is, in fact, an untriggered | |
726 | automount point | |
727 | .RB ( ".f_type == AUTOFS_SUPER_MAGIC" ). | |
d1304ede MK |
728 | .IP |
729 | One use of | |
730 | .B O_PATH | |
731 | for regular files is to provide the equivalent of POSIX.1's | |
732 | .B O_EXEC | |
733 | functionality. | |
734 | This permits us to open a file for which we have execute | |
ebab32e1 | 735 | permission but not read permission, and then execute that file, |
d1304ede MK |
736 | with steps something like the following: |
737 | .IP | |
738 | .in +4n | |
739 | .EX | |
740 | char buf[PATH_MAX]; | |
741 | int fd = open("some_prog", O_PATH); |
8e13d566 | 742 | snprintf(buf, PATH_MAX, "/proc/self/fd/%d", fd); |
d1304ede MK |
743 | execl(buf, "some_prog", (char *) NULL); |
744 | .EE | |
745 | .in | |
e982cebf MK |
746 | .IP |
747 | An | |
748 | .B O_PATH | |
749 | file descriptor can also be passed as the argument of | |
750 | .BR fexecve (3). | |
1135dbe1 | 751 | .TP |
fea681da | 752 | .B O_SYNC |
6cf19e62 MK |
753 | Write operations on the file will complete according to the requirements of |
754 | synchronized I/O | |
755 | .I file | |
756 | integrity completion | |
f36a1468 | 757 | (by contrast with the |
6cf19e62 MK |
758 | synchronized I/O |
759 | .I data | |
760 | integrity completion | |
761 | provided by | |
762 | .BR O_DSYNC .) | |
5355ff82 | 763 | .IP |
6cf19e62 MK |
764 | By the time |
765 | .BR write (2) | |
ca20a8a5 MK |
766 | (or similar) |
767 | returns, the output data and associated file metadata | |
6cf19e62 MK |
768 | have been transferred to the underlying hardware |
769 | (i.e., as though each | |
770 | .BR write (2) | |
771 | was followed by a call to | |
772 | .BR fsync (2)). | |
773 | .IR "See NOTES below" . | |
fea681da | 774 | .TP |
40398c1a MK |
775 | .BR O_TMPFILE " (since Linux 3.11)" |
776 | .\" commit 60545d0d4610b02e55f65d141c95b18ccf855b6e | |
777 | .\" commit f4e0c30c191f87851c4a53454abb55ee276f4a7e | |
778 | .\" commit bb458c644a59dbba3a1fe59b27106c5e68e1c4bd | |
6a11a5d4 | 779 | Create an unnamed temporary regular file. |
40398c1a MK |
780 | The |
781 | .I pathname | |
782 | argument specifies a directory; | |
783 | an unnamed inode will be created in that directory's filesystem. | |
784 | Anything written to the resulting file will be lost when | |
785 | the last file descriptor is closed, unless the file is given a name. | |
5355ff82 | 786 | .IP |
40398c1a MK |
787 | .B O_TMPFILE |
788 | must be specified with one of | |
789 | .B O_RDWR | |
790 | or | |
791 | .B O_WRONLY | |
792 | and, optionally, | |
793 | .BR O_EXCL . | |
794 | If | |
795 | .B O_EXCL | |
796 | is not specified, then | |
797 | .BR linkat (2) | |
798 | can be used to link the temporary file into the filesystem, making it | |
799 | permanent, using code like the following: | |
5355ff82 | 800 | .IP |
40398c1a | 801 | .in +4n |
5355ff82 | 802 | .EX |
40398c1a MK |
803 | char path[PATH_MAX]; |
804 | fd = open("/path/to/dir", O_TMPFILE | O_RDWR, | |
0fb83d00 MK |
805 | S_IRUSR | S_IWUSR); |
806 | ||
89de1a39 | 807 | /* File I/O on \(aqfd\(aq... */ |
0fb83d00 | 808 | |
1c551957 | 809 | linkat(fd, "", AT_FDCWD, "/path/for/file", AT_EMPTY_PATH); |
a2587fbb | 810 | |
89de1a39 | 811 | /* If the caller doesn\(aqt have the CAP_DAC_READ_SEARCH |
a2587fbb MK |
812 | capability (needed to use AT_EMPTY_PATH with linkat(2)), |
813 | and there is a proc(5) filesystem mounted, then the | |
814 | linkat(2) call above can be replaced with: | |
815 | ||
816 | snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd); | |
817 | linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file", | |
818 | AT_SYMLINK_FOLLOW); | |
819 | */ | |
5355ff82 | 820 | .EE |
40398c1a | 821 | .in |
5355ff82 | 822 | .IP |
40398c1a MK |
823 | In this case, |
824 | the | |
825 | .BR open () | |
826 | .I mode | |
827 | argument determines the file permission mode, as with | |
828 | .BR O_CREAT . | |
5355ff82 | 829 | .IP |
0115aaed MK |
830 | Specifying |
831 | .B O_EXCL | |
832 | in conjunction with | |
833 | .B O_TMPFILE | |
834 | prevents a temporary file from being linked into the filesystem | |
835 | in the above manner. | |
836 | (Note that the meaning of | |
837 | .B O_EXCL | |
838 | in this case is different from the meaning of | |
839 | .B O_EXCL | |
840 | otherwise.) | |
5355ff82 | 841 | .IP |
40398c1a MK |
842 | There are two main use cases for |
843 | .\" Inspired by http://lwn.net/Articles/559147/ | |
844 | .BR O_TMPFILE : | |
845 | .RS | |
846 | .IP * 3 | |
847 | Improved | |
848 | .BR tmpfile (3) | |
849 | functionality: race-free creation of temporary files that | |
850 | (1) are automatically deleted when closed; | |
851 | (2) can never be reached via any pathname; | |
852 | (3) are not subject to symlink attacks; and | |
853 | (4) do not require the caller to devise unique names. | |
854 | .IP * | |
855 | Creating a file that is initially invisible, which is then populated | |
8b04592d | 856 | with data and adjusted to have appropriate filesystem attributes |
c89a9937 EB |
857 | .RB ( fchown (2), |
858 | .BR fchmod (2), | |
40398c1a MK |
859 | .BR fsetxattr (2), |
860 | etc.) | |
861 | before being atomically linked into the filesystem | |
862 | in a fully formed state (using | |
863 | .BR linkat (2) | |
864 | as described above). | |
865 | .RE | |
866 | .IP | |
867 | .B O_TMPFILE | |
868 | requires support by the underlying filesystem; | |
40398c1a | 869 | only a subset of Linux filesystems provide that support. |
cde2074a | 870 | In the initial implementation, support was provided in |
7a0095a5 | 871 | the ext2, ext3, ext4, UDF, Minix, and tmpfs filesystems. |
bd79a35a | 872 | .\" To check for support, grep for "tmpfile" in kernel sources |
6065b906 MK |
873 | Support for other filesystems has subsequently been added as follows: |
874 | XFS (Linux 3.15); | |
cde2074a MK |
875 | .\" commit 99b6436bc29e4f10e4388c27a3e4810191cc4788 |
876 | .\" commit ab29743117f9f4c22ac44c13c1647fb24fb2bafe | |
1b9d5819 | 877 | Btrfs (Linux 3.16); |
e746db2e | 878 | .\" commit ef3b9af50bfa6a1f02cd7b3f5124b712b1ba3e3c |
6065b906 | 879 | F2FS (Linux 3.16); |
bd79a35a | 880 | .\" commit 50732df02eefb39ab414ef655979c2c9b64ad21c |
6065b906 | 881 | and ubifs (Linux 4.9). |
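Because filesystem and kernel support varies, a caller may want a fallback. The sketch below is one possible approach under stated assumptions: the helper name open_unnamed() and the "tmpXXXXXX" template are hypothetical, and the fallback (mkstemp(3) followed by unlink(2)) briefly exposes a name in the directory, so it does not offer the same guarantees as O_TMPFILE.

```c
#define _GNU_SOURCE             /* O_TMPFILE is Linux-specific */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return a read-write descriptor for a file with
   no name in directory 'dir', or -1 on error with errno set. */
int
open_unnamed(const char *dir)
{
    int fd = open(dir, O_TMPFILE | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd != -1)
        return fd;

    /* EOPNOTSUPP: the filesystem does not support O_TMPFILE.
       EISDIR: the kernel does not provide O_TMPFILE at all.
       Any other error is reported to the caller unchanged. */
    if (errno != EOPNOTSUPP && errno != EISDIR)
        return -1;

    /* Fallback: create a named file, then remove the name at once. */
    char path[PATH_MAX];
    int n = snprintf(path, sizeof(path), "%s/tmpXXXXXX", dir);
    if (n < 0 || (size_t) n >= sizeof(path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = mkstemp(path);
    if (fd != -1)
        unlink(path);
    return fd;
}
```

A file obtained through the fallback path can no longer be given a name with linkat(2) via AT_EMPTY_PATH in every case, so code that relies on later linking should treat the fallback as a degraded mode.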
40398c1a | 882 | .TP |
1c1e15ed | 883 | .B O_TRUNC |
4d61d36a | 884 | If the file already exists and is a regular file and the access mode allows |
682edefb MK |
885 | writing (i.e., is |
886 | .B O_RDWR | |
887 | or | |
888 | .BR O_WRONLY ) | |
889 | it will be truncated to length 0. | |
890 | If the file is a FIFO or terminal device file, the | |
891 | .B O_TRUNC | |
c13182ef | 892 | flag is ignored. |
2b9b829d | 893 | Otherwise, the effect of |
682edefb MK |
894 | .B O_TRUNC |
895 | is unspecified. | |
7b8ba76c | 896 | .SS creat() |
1f7191bb | 897 | A call to |
1c1e15ed | 898 | .BR creat () |
1f7191bb | 899 | is equivalent to calling |
1c1e15ed | 900 | .BR open () |
fea681da MK |
901 | with |
902 | .I flags | |
903 | equal to | |
904 | .BR O_CREAT|O_WRONLY|O_TRUNC . | |
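The equivalence can be written out directly. In this sketch, the name my_creat() is hypothetical and is used only to avoid clashing with the real creat().

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for creat(): create the file if needed,
   open it write-only, and truncate any existing contents. */
int
my_creat(const char *pathname, mode_t mode)
{
    return open(pathname, O_CREAT | O_WRONLY | O_TRUNC, mode);
}
```

As with creat(), the returned descriptor cannot be used for reading, and an existing regular file loses its contents.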
7b8ba76c MK |
905 | .SS openat() |
906 | The | |
907 | .BR openat () | |
908 | system call operates in exactly the same way as | |
cadd38ba | 909 | .BR open (), |
7b8ba76c | 910 | except for the differences described here. |
3130d10b | 911 | .PP |
5241f3cc MK |
912 | The |
913 | .I dirfd | |
914 | argument is used in conjunction with the | |
915 | .I pathname | |
916 | argument as follows: | |
917 | .IP * 3 | |
7b8ba76c MK |
918 | If the pathname given in |
919 | .I pathname | |
56dddcba | 920 | is absolute, then |
7b8ba76c | 921 | .I dirfd |
56dddcba | 922 | is ignored. |
5241f3cc | 923 | .IP * |
56dddcba | 924 | If the pathname given in |
7b8ba76c MK |
925 | .I pathname |
926 | is relative and | |
927 | .I dirfd | |
928 | is the special value | |
929 | .BR AT_FDCWD , | |
930 | then | |
931 | .I pathname | |
932 | is interpreted relative to the current working | |
933 | directory of the calling process (like | |
cadd38ba | 934 | .BR open ()). |
5241f3cc | 935 | .IP * |
56dddcba | 936 | If the pathname given in |
7b8ba76c | 937 | .I pathname |
56dddcba MK |
938 | is relative, then it is interpreted relative to the directory |
939 | referred to by the file descriptor | |
7b8ba76c | 940 | .I dirfd |
56dddcba MK |
941 | (rather than relative to the current working directory of |
942 | the calling process, as is done by | |
943 | .BR open () | |
944 | for a relative pathname). | |
a9db6c1b MK |
945 | In this case, |
946 | .I dirfd | |
947 | must be a directory that was opened for reading | |
948 | .RB ( O_RDONLY ) | |
949 | or using the | |
950 | .B O_PATH | |
951 | flag. | |
73434f40 MK |
952 | .PP |
953 | If the pathname given in | |
954 | .I pathname | |
955 | is relative, and | |
956 | .I dirfd | |
957 | is not a valid file descriptor, an error | |
958 | .RB ( EBADF ) | |
959 | results. | |
960 | (Specifying an invalid file descriptor number in | |
961 | .I dirfd | |
962 | can be used as a means to ensure that | |
963 | .I pathname | |
964 | is absolute.) | |
4b322a2f | 965 | .\" |
a2dbb2e3 AS |
966 | .SS openat2(2) |
967 | The | |
968 | .BR openat2 (2) | |
969 | system call is an extension of | |
970 | .BR openat (), | |
4b322a2f MK |
971 | and provides a superset of the features of |
972 | .BR openat (). | |
aec13430 | 973 | It is documented separately, in |
4b322a2f | 974 | .BR openat2 (2). |
47297adb | 975 | .SH RETURN VALUE |
c112329f | 976 | On success, |
7b8ba76c MK |
977 | .BR open (), |
978 | .BR openat (), | |
c13182ef | 979 | and |
e1d6264d | 980 | .BR creat () |
c112329f MK |
981 | return the new file descriptor (a nonnegative integer). |
982 | On error, \-1 is returned and | |
fea681da | 983 | .I errno |
c112329f | 984 | is set to indicate the error. |
fea681da | 985 | .SH ERRORS |
7b8ba76c MK |
986 | .BR open (), |
987 | .BR openat (), | |
988 | and | |
989 | .BR creat () | |
990 | can fail with the following errors: | |
fea681da MK |
991 | .TP |
992 | .B EACCES | |
993 | The requested access to the file is not allowed, or search permission | |
994 | is denied for one of the directories in the path prefix of | |
995 | .IR pathname , | |
996 | or the file did not exist yet and write access to the parent directory | |
997 | is not allowed. | |
998 | (See also | |
ad7cc990 | 999 | .BR path_resolution (7).) |
fea681da | 1000 | .TP |
2ddf885a JS |
1001 | .B EACCES |
1002 | .\" commit 30aba6656f61ed44cba445a3c0d38b296fa9e8f5 | |
1003 | Where | |
1004 | .B O_CREAT | |
d9e7db1b MK |
1005 | is specified, the |
1006 | .I protected_fifos | |
1007 | or | |
510adbed | 1008 | .I protected_regular |
d9e7db1b | 1009 | sysctl is enabled, the file already exists and is a FIFO or regular file, the |
2ddf885a JS |
1010 | owner of the file is neither the current user nor the owner of the |
1011 | containing directory, and the containing directory is both world- or | |
1012 | group-writable and sticky. | |
d9e7db1b | 1013 | For details, see the descriptions of |
1ae6b2c7 | 1014 | .I /proc/sys/fs/protected_fifos |
d9e7db1b | 1015 | and |
1ae6b2c7 | 1016 | .I /proc/sys/fs/protected_regular |
d9e7db1b MK |
1017 | in |
1018 | .BR proc (5). | |
2ddf885a | 1019 | .TP |
90879cbd MK |
1020 | .B EBADF |
1021 | .RB ( openat ()) | |
1022 | .I pathname | |
1023 | is relative but | |
1024 | .I dirfd | |
1025 | is neither | |
1026 | .B AT_FDCWD | |
1027 | nor a valid file descriptor. | |
1028 | .TP | |
836a5bbf MK |
1029 | .B EBUSY |
1030 | .B O_EXCL | |
1031 | was specified in | |
1032 | .I flags | |
1033 | and | |
1034 | .I pathname | |
1035 | refers to a block device that is in use by the system (e.g., it is mounted). | |
1036 | .TP | |
a1f01685 MH |
1037 | .B EDQUOT |
1038 | Where | |
1039 | .B O_CREAT | |
1040 | is specified, the file does not exist, and the user's quota of disk | |
9ee4a2b6 | 1041 | blocks or inodes on the filesystem has been exhausted. |
a1f01685 | 1042 | .TP |
fea681da MK |
1043 | .B EEXIST |
1044 | .I pathname | |
1045 | already exists and | |
1046 | .BR O_CREAT " and " O_EXCL | |
1047 | were used. | |
1048 | .TP | |
1049 | .B EFAULT | |
0daa9e92 | 1050 | .I pathname |
e1d6264d | 1051 | points outside your accessible address space. |
fea681da | 1052 | .TP |
9f5773f7 | 1053 | .B EFBIG |
7c7fb552 MK |
1054 | See |
1055 | .BR EOVERFLOW . | |
9f5773f7 | 1056 | .TP |
e51412ea MK |
1057 | .B EINTR |
1058 | While blocked waiting to complete an open of a slow device | |
1059 | (e.g., a FIFO; see | |
1060 | .BR fifo (7)), | |
1061 | the call was interrupted by a signal handler; see | |
1062 | .BR signal (7). | |
1063 | .TP | |
ef490193 DG |
1064 | .B EINVAL |
1065 | The filesystem does not support the | |
1ae6b2c7 | 1066 | .B O_DIRECT |
e6f89ed2 MK |
1067 | flag. |
1068 | See | |
1ae6b2c7 | 1069 | .B NOTES |
ef490193 DG |
1070 | for more information. |
1071 | .TP | |
8e335391 MK |
1072 | .B EINVAL |
1073 | Invalid value in | |
1074 | .\" In particular, __O_TMPFILE instead of O_TMPFILE | |
1075 | .IR flags . | |
1076 | .TP | |
1077 | .B EINVAL | |
1078 | .B O_TMPFILE | |
1079 | was specified in | |
1080 | .IR flags , | |
1081 | but neither | |
1082 | .B O_WRONLY | |
1083 | nor | |
1084 | .B O_RDWR | |
1085 | was specified. | |
1086 | .TP | |
5c6f8de0 MK |
1087 | .B EINVAL |
1088 | .B O_CREAT | |
1089 | was specified in | |
1090 | .I flags | |
1091 | and the final component ("basename") of the new file's | |
1092 | .I pathname | |
1093 | is invalid | |
1094 | (e.g., it contains characters not permitted by the underlying filesystem). | |
ed6fe005 | 1095 | .TP |
ed6fe005 MK |
1096 | .B EINVAL |
1097 | The final component ("basename") of | |
1098 | .I pathname | |
1099 | is invalid | |
1100 | (e.g., it contains characters not permitted by the underlying filesystem). | |
5c6f8de0 | 1101 | .TP |
fea681da MK |
1102 | .B EISDIR |
1103 | .I pathname | |
1104 | refers to a directory and the access requested involved writing | |
1105 | (that is, | |
1106 | .B O_WRONLY | |
1107 | or | |
1108 | .B O_RDWR | |
1109 | is set). | |
1110 | .TP | |
8e335391 | 1111 | .B EISDIR |
843068bd MK |
1112 | .I pathname |
1113 | refers to an existing directory, | |
8e335391 MK |
1114 | .B O_TMPFILE |
1115 | and one of | |
1116 | .B O_WRONLY | |
1117 | or | |
1118 | .B O_RDWR | |
1119 | were specified in | |
1120 | .IR flags , | |
1121 | but this kernel version does not provide the | |
1122 | .B O_TMPFILE | |
1123 | functionality. | |
1124 | .TP | |
fea681da MK |
1125 | .B ELOOP |
1126 | Too many symbolic links were encountered in resolving | |
289f7907 MK |
1127 | .IR pathname . |
1128 | .TP | |
1129 | .B ELOOP | |
fea681da | 1130 | .I pathname |
289f7907 MK |
1131 | was a symbolic link, and |
1132 | .I flags | |
1133 | specified | |
1ae6b2c7 | 1134 | .B O_NOFOLLOW |
289f7907 MK |
1135 | but not |
1136 | .BR O_PATH . | |
fea681da MK |
1137 | .TP |
1138 | .B EMFILE | |
26c32fab | 1139 | The per-process limit on the number of open file descriptors has been reached |
12c21590 | 1140 | (see the description of |
1ae6b2c7 | 1141 | .B RLIMIT_NOFILE |
12c21590 MK |
1142 | in |
1143 | .BR getrlimit (2)). | |
fea681da MK |
1144 | .TP |
1145 | .B ENAMETOOLONG | |
0daa9e92 | 1146 | .I pathname |
e1d6264d | 1147 | was too long. |
fea681da MK |
1148 | .TP |
1149 | .B ENFILE | |
e258766b | 1150 | The system-wide limit on the total number of open files has been reached. |
fea681da MK |
1151 | .TP |
1152 | .B ENODEV | |
1153 | .I pathname | |
1154 | refers to a device special file and no corresponding device exists. | |
682edefb MK |
1155 | (This is a Linux kernel bug; in this situation |
1156 | .B ENXIO | |
1157 | must be returned.) | |
fea681da MK |
1158 | .TP |
1159 | .B ENOENT | |
682edefb MK |
1160 | .B O_CREAT |
1161 | is not set and the named file does not exist. | |
115bbafa MK |
1162 | .TP |
1163 | .B ENOENT | |
1164 | A directory component in | |
fea681da MK |
1165 | .I pathname |
1166 | does not exist or is a dangling symbolic link. | |
1167 | .TP | |
ba03011f MK |
1168 | .B ENOENT |
1169 | .I pathname | |
1170 | refers to a nonexistent directory, | |
1171 | .B O_TMPFILE | |
1172 | and one of | |
1173 | .B O_WRONLY | |
1174 | or | |
1175 | .B O_RDWR | |
1176 | were specified in | |
1177 | .IR flags , | |
1178 | but this kernel version does not provide the | |
1179 | .B O_TMPFILE | |
1180 | functionality. | |
1181 | .TP | |
fea681da | 1182 | .B ENOMEM |
8ef529f9 MK |
1183 | The named file is a FIFO, |
1184 | but memory for the FIFO buffer can't be allocated because | |
1185 | the per-user hard limit on memory allocation for pipes has been reached | |
1186 | and the caller is not privileged; see | |
1187 | .BR pipe (7). | |
1188 | .TP | |
1189 | .B ENOMEM | |
fea681da MK |
1190 | Insufficient kernel memory was available. |
1191 | .TP | |
1192 | .B ENOSPC | |
1193 | .I pathname | |
1194 | was to be created but the device containing | |
1195 | .I pathname | |
1196 | has no room for the new file. | |
1197 | .TP | |
1198 | .B ENOTDIR | |
1199 | A component used as a directory in | |
1200 | .I pathname | |
a8d55537 | 1201 | is not, in fact, a directory, or \fBO_DIRECTORY\fP was specified and |
fea681da MK |
1202 | .I pathname |
1203 | was not a directory. | |
1204 | .TP | |
90879cbd MK |
1205 | .B ENOTDIR |
1206 | .RB ( openat ()) | |
1207 | .I pathname | |
1208 | is a relative pathname and | |
1209 | .I dirfd | |
1210 | is a file descriptor referring to a file other than a directory. | |
1211 | .TP | |
fea681da | 1212 | .B ENXIO |
682edefb | 1213 | .BR O_NONBLOCK " | " O_WRONLY |
103ea4f6 MK |
1214 | is set, the named file is a FIFO, and |
1215 | no process has the FIFO open for reading. | |
7b032b23 MK |
1216 | .TP |
1217 | .B ENXIO | |
1218 | The file is a device special file and no corresponding device exists. | |
fea681da | 1219 | .TP |
71b12d0a | 1220 | .B ENXIO |
8b5bbcfa | 1221 | The file is a UNIX domain socket. |
71b12d0a | 1222 | .TP |
1ae6b2c7 | 1223 | .B EOPNOTSUPP |
bbe02b45 MK |
1224 | The filesystem containing |
1225 | .I pathname | |
1226 | does not support | |
1227 | .BR O_TMPFILE . | |
1228 | .TP | |
7c7fb552 MK |
1229 | .B EOVERFLOW |
1230 | .I pathname | |
1231 | refers to a regular file that is too large to be opened. | |
1232 | The usual scenario here is that an application compiled | |
1233 | on a 32-bit platform without | |
2c1acf16 | 1234 | .I \-D_FILE_OFFSET_BITS=64 |
7c7fb552 | 1235 | tried to open a file whose size exceeds |
cd415e73 | 1236 | .I (1<<31)\-1 |
4e1a4d72 | 1237 | bytes; |
7c7fb552 MK |
1238 | see also |
1239 | .B O_LARGEFILE | |
1240 | above. | |
c84d3aa3 | 1241 | This is the error specified by POSIX.1; |
7c7fb552 MK |
1242 | in kernels before 2.6.24, Linux gave the error |
1243 | .B EFBIG | |
1244 | for this case. | |
1245 | .\" See http://bugzilla.kernel.org/show_bug.cgi?id=7253 | |
1246 | .\" "Open of a large file on 32-bit fails with EFBIG, should be EOVERFLOW" | |
1247 | .\" Reported 2006-10-03 | |
1248 | .TP | |
1c1e15ed MK |
1249 | .B EPERM |
1250 | The | |
1251 | .B O_NOATIME | |
1252 | flag was specified, but the effective user ID of the caller | |
9ee4a2b6 | 1253 | .\" Strictly speaking, it's the filesystem UID... (MTK) |
47c906e5 | 1254 | did not match the owner of the file and the caller was not privileged. |
1c1e15ed | 1255 | .TP |
fbab10e5 MK |
1256 | .B EPERM |
1257 | The operation was prevented by a file seal; see | |
1258 | .BR fcntl (2). | |
1259 | .TP | |
fea681da MK |
1260 | .B EROFS |
1261 | .I pathname | |
9ee4a2b6 | 1262 | refers to a file on a read-only filesystem and write access was |
fea681da MK |
1263 | requested. |
1264 | .TP | |
1265 | .B ETXTBSY | |
1266 | .I pathname | |
1267 | refers to an executable image which is currently being executed and | |
1268 | write access was requested. | |
d3952311 | 1269 | .TP |
19d37126 JH |
1270 | .B ETXTBSY |
1271 | .I pathname | |
1272 | refers to a file that is currently in use as a swap file, and the | |
1273 | .B O_TRUNC | |
1274 | flag was specified. | |
1275 | .TP | |
1276 | .B ETXTBSY | |
1277 | .I pathname | |
0629df8b | 1278 | refers to a file that is currently being read by the kernel (e.g., for |
19d37126 JH |
1279 | module/firmware loading), and write access was requested. |
1280 | .TP | |
d3952311 MK |
1281 | .B EWOULDBLOCK |
1282 | The | |
1283 | .B O_NONBLOCK | |
1284 | flag was specified, and an incompatible lease was held on the file | |
1285 | (see | |
1286 | .BR fcntl (2)). | |
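A caller typically inspects errno after a failed call. As a minimal sketch, the hypothetical helper open_retry() below restarts the call when it fails with EINTR and leaves every other error for the caller to handle.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: like open(), but retry while the call is
   interrupted by a signal handler (EINTR).  'mode' is used only
   when 'flags' includes O_CREAT or O_TMPFILE. */
int
open_retry(const char *pathname, int flags, mode_t mode)
{
    int fd;

    do {
        fd = open(pathname, flags, mode);
    } while (fd == -1 && errno == EINTR);

    return fd;                          /* -1 with errno set on failure */
}
```

Whether retrying is appropriate depends on the application; a program that uses signals to cancel a blocking open of a FIFO, for instance, would not want this loop.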
7b8ba76c MK |
1287 | .SH VERSIONS |
1288 | .BR openat () | |
1289 | was added to Linux in kernel 2.6.16; | |
1290 | library support was added to glibc in version 2.4. | |
3113c7f3 | 1291 | .SH STANDARDS |
7b8ba76c MK |
1292 | .BR open (), |
1293 | .BR creat (): |
72ac7268 | 1294 | SVr4, 4.3BSD, POSIX.1-2001, POSIX.1-2008. |
5355ff82 | 1295 | .PP |
7b8ba76c MK |
1296 | .BR openat (): |
1297 | POSIX.1-2008. | |
5355ff82 | 1298 | .PP |
a2dbb2e3 AS |
1299 | .BR openat2 (2) |
1300 | is Linux-specific. | |
1301 | .PP | |
fea681da | 1302 | The |
72ac7268 | 1303 | .BR O_DIRECT , |
1c1e15ed | 1304 | .BR O_NOATIME , |
72ac7268 | 1305 | .BR O_PATH , |
fea681da | 1306 | and |
1ae6b2c7 | 1307 | .B O_TMPFILE |
72ac7268 MK |
1308 | flags are Linux-specific. |
1309 | One must define | |
61b7c1e1 MK |
1310 | .B _GNU_SOURCE |
1311 | to obtain their definitions. | |
5355ff82 | 1312 | .PP |
9f91e36c | 1313 | The |
72ac7268 MK |
1314 | .BR O_CLOEXEC , |
1315 | .BR O_DIRECTORY , | |
1316 | and | |
1ae6b2c7 | 1317 | .B O_NOFOLLOW |
72ac7268 MK |
1318 | flags are not specified in POSIX.1-2001, |
1319 | but are specified in POSIX.1-2008. | |
1320 | Since glibc 2.12, one can obtain their definitions by defining either | |
1321 | .B _POSIX_C_SOURCE | |
1322 | with a value greater than or equal to 200809L or | |
1ae6b2c7 | 1323 | .B _XOPEN_SOURCE |
72ac7268 MK |
1324 | with a value greater than or equal to 700. |
1325 | In glibc 2.11 and earlier, one obtains the definitions by defining | |
1326 | .BR _GNU_SOURCE . | |
5355ff82 | 1327 | .PP |
72ac7268 MK |
1328 | As noted in |
1329 | .BR feature_test_macros (7), | |
84fc2a6e | 1330 | feature test macros such as |
72ac7268 MK |
1331 | .BR _POSIX_C_SOURCE , |
1332 | .BR _XOPEN_SOURCE , | |
1333 | and | |
fe75ec04 | 1334 | .B _GNU_SOURCE |
72ac7268 | 1335 | must be defined before including |
e417acb0 | 1336 | .I any |
72ac7268 | 1337 | header files. |
a1d5f77c | 1338 | .SH NOTES |
988db661 | 1339 | Under Linux, the |
a1d5f77c | 1340 | .B O_NONBLOCK |
3897a3f8 | 1341 | flag is sometimes used in cases where one wants to open a file |
a1d5f77c | 1342 | but does not necessarily intend to read or write it. |
3897a3f8 MK |
1343 | For example, |
1344 | this may be used to open a device in order to get a file descriptor | |
a1d5f77c MK |
1345 | for use with |
1346 | .BR ioctl (2). | |
dd3568a1 | 1347 | .PP |
fea681da MK |
1348 | The (undefined) effect of |
1349 | .B O_RDONLY | O_TRUNC | |
c13182ef | 1350 | varies among implementations. |
bcdd964e | 1351 | On many systems the file is actually truncated. |
fea681da MK |
1352 | .\" Linux 2.0, 2.5: truncate |
1353 | .\" Solaris 5.7, 5.8: truncate | |
1354 | .\" Irix 6.5: truncate | |
1355 | .\" Tru64 5.1B: truncate | |
1356 | .\" HP-UX 11.22: truncate | |
1357 | .\" FreeBSD 4.7: truncate | |
5355ff82 | 1358 | .PP |
5dc8986d MK |
1359 | Note that |
1360 | .BR open () | |
1361 | can open device special files, but | |
1362 | .BR creat () | |
1363 | cannot create them; use | |
1364 | .BR mknod (2) | |
1365 | instead. | |
5355ff82 | 1366 | .PP |
5dc8986d MK |
1367 | If the file is newly created, its |
1368 | .IR st_atime , | |
1369 | .IR st_ctime , | |
1370 | .I st_mtime | |
1371 | fields | |
1372 | (respectively, time of last access, time of last status change, and | |
1373 | time of last modification; see | |
1374 | .BR stat (2)) | |
1375 | are set | |
1376 | to the current time, and so are the | |
1377 | .I st_ctime | |
1378 | and | |
1379 | .I st_mtime | |
1380 | fields of the | |
1381 | parent directory. | |
1382 | Otherwise, if the file is modified because of the | |
1383 | .B O_TRUNC | |
3a9c5a29 MK |
1384 | flag, its |
1385 | .I st_ctime | |
1386 | and | |
1387 | .I st_mtime | |
1388 | fields are set to the current time. | |
5355ff82 | 1389 | .PP |
aaf7a574 MK |
1390 | The files in the |
1391 | .I /proc/[pid]/fd | |
1392 | directory show the open file descriptors of the process with the PID | |
1393 | .IR pid . | |
1394 | The files in the | |
1395 | .I /proc/[pid]/fdinfo | |
d40e0bfc | 1396 | directory show even more information about these file descriptors. |
aaf7a574 MK |
1397 | See |
1398 | .BR proc (5) | |
1399 | for further details of both of these directories. | |
8132c115 | 1400 | .PP |
319e9b31 | 1401 | The Linux header file |
8132c115 ES |
1402 | .I <asm/fcntl.h> |
1403 | doesn't define | |
1404 | .BR O_ASYNC ; | |
319e9b31 | 1405 | the (BSD-derived) |
8132c115 | 1406 | .B FASYNC |
319e9b31 | 1407 | synonym is defined instead. |
5dc8986d MK |
1408 | .\" |
1409 | .\" | |
d20d9d33 MK |
1410 | .SS Open file descriptions |
1411 | The term open file description is the one used by POSIX to refer to the | |
1412 | entries in the system-wide table of open files. | |
91085d85 | 1413 | In other contexts, this object is |
d20d9d33 MK |
1414 | variously also called an "open file object", |
1415 | a "file handle", an "open file table entry", | |
1416 | or\(emin kernel-developer parlance\(ema | |
1417 | .IR "struct file" . | |
5355ff82 | 1418 | .PP |
d20d9d33 MK |
1419 | When a file descriptor is duplicated (using |
1420 | .BR dup (2) | |
1421 | or similar), | |
1422 | the duplicate refers to the same open file description | |
1423 | as the original file descriptor, | |
1424 | and the two file descriptors consequently share | |
1425 | the file offset and file status flags. | |
1426 | Such sharing can also occur between processes: | |
1427 | a child process created via | |
91085d85 | 1428 | .BR fork (2) |
d20d9d33 MK |
1429 | inherits duplicates of its parent's file descriptors, |
1430 | and those duplicates refer to the same open file descriptions. | |
5355ff82 | 1431 | .PP |
d20d9d33 | 1432 | Each |
bf7bc8b8 | 1433 | .BR open () |
d20d9d33 MK |
1434 | of a file creates a new open file description; |
1435 | thus, there may be multiple open file descriptions | |
1436 | corresponding to a file inode. | |
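The difference between a duplicated descriptor and a second open() of the same file can be observed through the file offset. The helper shares_offset() below is a hypothetical sketch: it assumes both descriptors are seekable and that nothing else moves the offsets while it runs.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if a change of file offset made through
   'fd_a' is visible through 'fd_b' (same open file description),
   0 if it is not, and -1 on error. */
int
shares_offset(int fd_a, int fd_b)
{
    off_t old_a = lseek(fd_a, 0, SEEK_CUR);
    off_t old_b = lseek(fd_b, 0, SEEK_CUR);
    if (old_a == -1 || old_b == -1)
        return -1;

    /* Move fd_a to an offset that fd_b does not currently have. */
    off_t probe = old_b + 1;
    if (lseek(fd_a, probe, SEEK_SET) == -1)
        return -1;

    int shared = (lseek(fd_b, 0, SEEK_CUR) == probe);

    lseek(fd_a, old_a, SEEK_SET);       /* Restore the original offset */
    return shared;
}
```

For a descriptor obtained with dup(2) the helper reports 1; for a descriptor obtained by opening the same pathname again it reports 0.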
5355ff82 | 1437 | .PP |
9539ebc9 MK |
1438 | On Linux, one can use the |
1439 | .BR kcmp (2) | |
1440 | .B KCMP_FILE | |
1441 | operation to test whether two file descriptors | |
1442 | (in the same process or in two different processes) | |
1443 | refer to the same open file description. | |
d20d9d33 MK |
1444 | .\" |
1445 | .\" | |
5dc8986d | 1446 | .SS Synchronized I/O |
6cf19e62 MK |
1447 | The POSIX.1-2008 "synchronized I/O" option |
1448 | specifies different variants of synchronized I/O, | |
1449 | and specifies the | |
1450 | .BR open () | |
1451 | flags | |
015221ef CH |
1452 | .BR O_SYNC , |
1453 | .BR O_DSYNC , | |
1454 | and | |
1ae6b2c7 | 1455 | .B O_RSYNC |
6cf19e62 MK |
1456 | for controlling the behavior. |
1457 | Regardless of whether an implementation supports this option, | |
1458 | it must at least support the use of | |
1ae6b2c7 | 1459 | .B O_SYNC |
6cf19e62 | 1460 | for regular files. |
5355ff82 | 1461 | .PP |
89851a00 | 1462 | Linux implements |
1ae6b2c7 | 1463 | .B O_SYNC |
6cf19e62 MK |
1464 | and |
1465 | .BR O_DSYNC , | |
1466 | but not | |
015221ef | 1467 | .BR O_RSYNC . |
352c4c5c | 1468 | Somewhat incorrectly, glibc defines |
1ae6b2c7 | 1469 | .B O_RSYNC |
6cf19e62 | 1470 | to have the same value as |
352c4c5c MK |
1471 | .BR O_SYNC . |
1472 | .RB ( O_RSYNC | |
1473 | is defined in the Linux header file | |
1474 | .I <asm/fcntl.h> | |
1475 | on HP PA-RISC, but it is not used.) | |
5355ff82 | 1476 | .PP |
1ae6b2c7 | 1477 | .B O_SYNC |
6cf19e62 MK |
1478 | provides synchronized I/O |
1479 | .I file | |
1480 | integrity completion, | |
1481 | meaning write operations will flush data and all associated metadata | |
1482 | to the underlying hardware. | |
1ae6b2c7 | 1483 | .B O_DSYNC |
6cf19e62 MK |
1484 | provides synchronized I/O |
1485 | .I data | |
1486 | integrity completion, | |
1487 | meaning write operations will flush data | |
1488 | to the underlying hardware, | |
1489 | but will only flush metadata updates that are required | |
1490 | to allow a subsequent read operation to complete successfully. | |
1491 | Data integrity completion can reduce the number of disk operations | |
1492 | that are required for applications that don't need the guarantees | |
1493 | of file integrity completion. | |
5355ff82 | 1494 | .PP |
a83923ca | 1495 | To understand the difference between the two types of completion, |
6cf19e62 MK |
1496 | consider two pieces of file metadata: |
1497 | the file's last modification timestamp |
1498 | .RI ( st_mtime ) |
1499 | and the file length. |
1500 | All write operations will update the last modification timestamp, |
1501 | but only writes that add data to the end of the | |
1502 | file will change the file length. | |
1503 | The last modification timestamp is not needed to ensure that | |
1504 | a read completes successfully, but the file length is. | |
1505 | Thus, | |
1ae6b2c7 | 1506 | .B O_DSYNC |
6cf19e62 MK |
1507 | would only guarantee to flush updates to the file length metadata |
1508 | (whereas | |
1ae6b2c7 | 1509 | .B O_SYNC |
6cf19e62 | 1510 | would also always flush the last modification timestamp metadata). |
5355ff82 | 1511 | .PP |
6cf19e62 | 1512 | Before Linux 2.6.33, Linux implemented only the |
1ae6b2c7 | 1513 | .B O_SYNC |
89851a00 | 1514 | flag for |
6cf19e62 MK |
1515 | .BR open (). |
1516 | However, when that flag was specified, | |
1517 | most filesystems actually provided the equivalent of synchronized I/O | |
1518 | .I data | |
1519 | integrity completion (i.e., | |
1ae6b2c7 | 1520 | .B O_SYNC |
6cf19e62 MK |
1521 | was actually implemented as the equivalent of |
1522 | .BR O_DSYNC ). | |
5355ff82 | 1523 | .PP |
6cf19e62 | 1524 | Since Linux 2.6.33, proper |
1ae6b2c7 | 1525 | .B O_SYNC |
6cf19e62 MK |
1526 | support is provided. |
1527 | However, to ensure backward binary compatibility, | |
1ae6b2c7 | 1528 | .B O_DSYNC |
6cf19e62 | 1529 | was defined with the same value as the historical |
015221ef | 1530 | .BR O_SYNC , |
015221ef | 1531 | and |
1ae6b2c7 | 1532 | .B O_SYNC |
89851a00 | 1533 | was defined as a new (two-bit) flag value that includes the |
1ae6b2c7 | 1534 | .B O_DSYNC |
6cf19e62 MK |
1535 | flag value. |
1536 | This ensures that applications compiled against | |
1537 | new headers get at least | |
1ae6b2c7 | 1538 | .B O_DSYNC |
6cf19e62 | 1539 | semantics on pre-2.6.33 kernels. |
5dc8986d | 1540 | .\" |
76f054b1 MK |
1541 | .SS C library/kernel differences |
1542 | Since version 2.26, | |
1543 | the glibc wrapper function for | |
1544 | .BR open () | |
1545 | employs the | |
1546 | .BR openat () | |
1547 | system call, rather than the kernel's | |
1548 | .BR open () | |
1549 | system call. | |
1550 | For certain architectures, this is also true in glibc versions before 2.26. | |
5dc8986d MK |
1551 | .\" |
1552 | .SS NFS | |
1553 | There are many infelicities in the protocol underlying NFS, affecting, |
1554 | among others, |
1555 | .BR O_SYNC " and " O_NDELAY . | |
5355ff82 | 1556 | .PP |
9ee4a2b6 | 1557 | On NFS filesystems with UID mapping enabled, |
a1d5f77c MK |
1558 | .BR open () |
1559 | may | |
75b94dc3 | 1560 | return a file descriptor but, for example, |
a1d5f77c MK |
1561 | .BR read (2) |
1562 | requests are denied | |
1ae6b2c7 AC |
1563 | with |
1564 | .BR EACCES . | |
a1d5f77c MK |
1565 | This is because the client performs |
1566 | .BR open () | |
1567 | by checking the | |
1568 | permissions, but UID mapping is performed by the server upon | |
1569 | read and write requests. | |
5dc8986d MK |
1570 | .\" |
1571 | .\" | |
1bdc161d MK |
1572 | .SS FIFOs |
1573 | Opening the read or write end of a FIFO blocks until the other | |
1574 | end is also opened (by another process or thread). | |
1575 | See | |
1576 | .BR fifo (7) | |
1577 | for further details. | |
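A writer that must not block can combine O_WRONLY with O_NONBLOCK and check for the ENXIO error listed under ERRORS. The helper name open_fifo_writer_nonblock() is hypothetical; this is a sketch of the pattern rather than a complete program.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: try to open the write end of the FIFO at 'path'
   without blocking.  Returns a file descriptor on success.  On failure
   returns -1 with errno set; errno is ENXIO if no process currently
   has the FIFO open for reading. */
int
open_fifo_writer_nonblock(const char *path)
{
    return open(path, O_WRONLY | O_NONBLOCK);
}
```

A caller that receives ENXIO can retry later or report that no reader is present, instead of stalling in open().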
1578 | .\" | |
1579 | .\" | |
5dc8986d MK |
1580 | .SS File access mode |
1581 | Unlike the other values that can be specified in | |
1582 | .IR flags , | |
1583 | the | |
1584 | .I "access mode" | |
1585 | values | |
1586 | .BR O_RDONLY ", " O_WRONLY ", and " O_RDWR | |
1587 | do not specify individual bits. | |
1588 | Rather, they define the low order two bits of | |
1589 | .IR flags , | |
1590 | and are defined respectively as 0, 1, and 2. | |
1591 | In other words, the combination | |
1592 | .B "O_RDONLY | O_WRONLY" | |
1593 | is a logical error, and certainly does not have the same meaning as | |
1594 | .BR O_RDWR . | |
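Since the access mode is a two-bit field rather than a set of independent bits, it should be extracted with the POSIX O_ACCMODE mask before being compared. The helper name fd_is_writable() is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if 'fd' was opened with an access mode
   that permits writing, 0 if not, and -1 on error.
   Note that a test such as (flags & O_RDONLY) is always 0, because
   O_RDONLY is defined as 0. */
int
fd_is_writable(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    int accmode = flags & O_ACCMODE;
    return accmode == O_WRONLY || accmode == O_RDWR;
}
```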
5355ff82 | 1595 | .PP |
5dc8986d MK |
1596 | Linux reserves the special, nonstandard access mode 3 (binary 11) in |
1597 | .I flags | |
1598 | to mean: | |
d9cb0d7d | 1599 | check for read and write permission on the file and return a file descriptor |
5dc8986d MK |
1600 | that can't be used for reading or writing. |
1601 | This nonstandard access mode is used by some Linux drivers to return a | |
d9cb0d7d | 1602 | file descriptor that is to be used only for device-specific |
5dc8986d MK |
1603 | .BR ioctl (2) |
1604 | operations. | |
1605 | .\" See for example util-linux's disk-utils/setfdprm.c | |
1606 | .\" For some background on access mode 3, see | |
1607 | .\" http://thread.gmane.org/gmane.linux.kernel/653123 | |
1608 | .\" "[RFC] correct flags to f_mode conversion in __dentry_open" | |
1609 | .\" LKML, 12 Mar 2008 | |
7b8ba76c MK |
1610 | .\" |
1611 | .\" | |
80d250b4 | 1612 | .SS Rationale for openat() and other "directory file descriptor" APIs |
7b8ba76c | 1613 | .BR openat () |
80d250b4 MK |
1614 | and the other system calls and library functions that take |
1615 | a directory file descriptor argument | |
7b8ba76c | 1616 | (i.e., |
c6a16783 | 1617 | .BR execveat (2), |
7b8ba76c | 1618 | .BR faccessat (2), |
80d250b4 | 1619 | .BR fanotify_mark (2), |
7b8ba76c MK |
1620 | .BR fchmodat (2), |
1621 | .BR fchownat (2), | |
5c30e7cd | 1622 | .BR fspick (2), |
7b8ba76c MK |
1623 | .BR fstatat (2), |
1624 | .BR futimesat (2), | |
1625 | .BR linkat (2), | |
1626 | .BR mkdirat (2), | |
1627 | .BR mknodat (2), | |
d53b1b17 | 1628 | .BR mount_setattr (2), |
0a5c96db | 1629 | .BR move_mount (2), |
80d250b4 | 1630 | .BR name_to_handle_at (2), |
5c30e7cd | 1631 | .BR open_tree (2), |
e64c566c | 1632 | .BR openat2 (2), |
7b8ba76c MK |
1633 | .BR readlinkat (2), |
1634 | .BR renameat (2), | |
0a5c96db | 1635 | .BR renameat2 (2), |
3f092cef | 1636 | .BR statx (2), |
7b8ba76c MK |
1637 | .BR symlinkat (2), |
1638 | .BR unlinkat (2), | |
f37759b1 | 1639 | .BR utimensat (2), |
80d250b4 | 1640 | .BR mkfifoat (3), |
7b8ba76c | 1641 | and |
80d250b4 | 1642 | .BR scandirat (3)) |
a98e0304 | 1643 | address two problems with the older interfaces that preceded them. |
92692952 | 1644 | Here, the explanation is in terms of the |
7b8ba76c | 1645 | .BR openat () |
d26f8a31 | 1646 | call, but the rationale is analogous for the other interfaces. |
5355ff82 | 1647 | .PP |
7b8ba76c MK |
1648 | First, |
1649 | .BR openat () | |
1650 | allows an application to avoid race conditions that could | |
1651 | occur when using | |
cadd38ba | 1652 | .BR open () |
7b8ba76c MK |
1653 | to open files in directories other than the current working directory. |
1654 | These race conditions result from the fact that some component | |
1655 | of the directory prefix given to | |
cadd38ba | 1656 | .BR open () |
7b8ba76c | 1657 | could be changed in parallel with the call to |
cadd38ba | 1658 | .BR open (). |
54305f5b | 1659 | Suppose, for example, that we wish to create the file |
a710e359 | 1660 | .I dir1/dir2/xxx.dep |
54305f5b | 1661 | if the file |
a710e359 | 1662 | .I dir1/dir2/xxx |
54305f5b | 1663 | exists. |
069d2f9a | 1664 | The problem is that between the existence check and the file-creation step, |
a710e359 | 1665 | .I dir1 |
54305f5b | 1666 | or |
a710e359 | 1667 | .I dir2 |
54305f5b MK |
1668 | (which might be symbolic links) |
1669 | could be modified to point to a different location. | |
7b8ba76c MK |
1670 | Such races can be avoided by |
1671 | opening a file descriptor for the target directory, | |
1672 | and then specifying that file descriptor as the | |
1673 | .I dirfd | |
54305f5b MK |
1674 | argument of (say) |
1675 | .BR fstatat (2) | |
1676 | and | |
7b8ba76c | 1677 | .BR openat (). |
941d2892 MK |
1678 | The use of the |
1679 | .I dirfd | |
1680 | file descriptor also has other benefits: | |
1681 | .IP * 3 | |
1682 | the file descriptor is a stable reference to the directory, | |
1683 | even if the directory is renamed; and | |
1684 | .IP * | |
1685 | the open file descriptor prevents the underlying filesystem from | |
1686 | being dismounted, | |
1687 | just as when a process has a current working directory on a filesystem. | |
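The race-free pattern described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name create_dep_if_exists() and its parameters are hypothetical, and the choice of O_EXCL and mode 0644 for the new file is an assumption made for the example. The directory is resolved once, and both the existence check and the file creation are then performed relative to the same file descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* If "name" exists in the directory "dirpath", create "dep" in that
   same directory.  Returns a file descriptor for the new file, or -1
   on error.  Hypothetical helper. */
int
create_dep_if_exists(const char *dirpath, const char *name,
                     const char *dep)
{
    struct stat st;
    int dirfd, fd, saved_errno;

    /* Resolve the directory prefix exactly once. */
    dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
    if (dirfd == -1)
        return -1;

    /* Existence check, relative to the directory file descriptor. */
    if (fstatat(dirfd, name, &st, 0) == -1) {
        saved_errno = errno;
        close(dirfd);
        errno = saved_errno;
        return -1;
    }

    /* File creation, relative to the same directory, even if the
       path that led to it has since been renamed or relinked. */
    fd = openat(dirfd, dep, O_WRONLY | O_CREAT | O_EXCL, 0644);

    saved_errno = errno;
    close(dirfd);
    errno = saved_errno;

    return fd;
}
```

For the example in the text, the call would be create_dep_if_exists("dir1/dir2", "xxx", "xxx.dep"). Note that the initial open() of the directory still resolves the prefix once; the guarantee is only that both later steps refer to the same directory.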
1688 | .PP | |
7b8ba76c MK |
1689 | Second, |
1690 | .BR openat () | |
1691 | allows the implementation of a per-thread "current working | |
1692 | directory", via file descriptor(s) maintained by the application. | |
1693 | (This functionality can also be obtained by tricks based | |
1694 | on the use of | |
1695 | .IR /proc/self/fd/ dirfd, | |
1696 | but less efficiently.) | |
96c44b8f MK |
1697 | .PP |
1698 | The | |
1699 | .I dirfd | |
1700 | argument for these APIs can be obtained by using | |
1701 | .BR open () | |
1702 | or | |
1703 | .BR openat () | |
1704 | to open a directory (with either the | |
1ae6b2c7 | 1705 | .B O_RDONLY |
96c44b8f | 1706 | or the |
1ae6b2c7 | 1707 | .B O_PATH |
96c44b8f MK |
1708 | flag). |
1709 | Alternatively, such a file descriptor can be obtained by applying | |
1710 | .BR dirfd (3) | |
1711 | to a directory stream created using | |
1712 | .BR opendir (3). | |
4146f81b MK |
1713 | .PP |
1714 | When these APIs are given a | |
1715 | .I dirfd | |
1716 | argument of | |
1ae6b2c7 | 1717 | .B AT_FDCWD |
4146f81b | 1718 | or the specified pathname is absolute, |
313fb527 | 1719 | then they handle their pathname argument in the same way as |
4146f81b MK |
1720 | the corresponding conventional APIs. |
1721 | However, in this case, several of the APIs have a | |
1722 | .I flags | |
1723 | argument that provides access to functionality that is not available with | |
1724 | the corresponding conventional APIs. | |
7b8ba76c MK |
1725 | .\" |
1726 | .\" | |
ddc4d339 | 1727 | .SS O_DIRECT |
ddc4d339 MK |
1728 | The |
1729 | .B O_DIRECT | |
1730 | flag may impose alignment restrictions on the length and address | |
7fac88a9 | 1731 | of user-space buffers and the file offset of I/Os. |
ddc4d339 | 1732 | In Linux, alignment |
9ee4a2b6 | 1733 | restrictions vary by filesystem and kernel version and might be |
ddc4d339 | 1734 | absent entirely. |
9ee4a2b6 | 1735 | However, there is currently no filesystem\-independent |
ddc4d339 | 1736 | interface for an application to discover these restrictions for a given |
9ee4a2b6 MK |
1737 | file or filesystem. |
1738 | Some filesystems provide their own interfaces | |
ddc4d339 MK |
1739 | for doing so, for example the |
1740 | .B XFS_IOC_DIOINFO | |
1741 | operation in | |
1742 | .BR xfsctl (3). | |
dd3568a1 | 1743 | .PP |
36dce687 | 1744 | Under Linux 2.4, transfer sizes, the alignment of the user buffer, |
85c2bdba | 1745 | and the file offset must all be multiples of the logical block size |
9ee4a2b6 | 1746 | of the filesystem. |
21557928 | 1747 | Since Linux 2.6.0, alignment to the logical block size of the |
e6042e4a | 1748 | underlying storage (typically 512 bytes) suffices. |
21557928 | 1749 | The logical block size can be determined using the |
e6042e4a PS |
1750 | .BR ioctl (2) |
1751 | .B BLKSSZGET | |
21557928 | 1752 | operation or from the shell using the command: |
5355ff82 | 1753 | .PP |
da16ac09 | 1754 | .in +4n |
5355ff82 | 1755 | .EX |
da16ac09 | 1756 | blockdev \-\-getss |
5355ff82 | 1757 | .EE |
da16ac09 | 1758 | .in |
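One way an application might satisfy these alignment requirements is sketched below. The helper names alloc_aligned() and open_direct_or_buffered() are hypothetical, and the alignment value is left to the caller (512 is only the typical logical block size, not a guarantee). The fallback to buffered I/O on EINVAL is a design choice for this example, based on the error this page documents for filesystems that do not implement the flag.

```c
#define _GNU_SOURCE             /* for O_DIRECT */
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Allocate a buffer whose address and length are both multiples of
   "align".  "align" must be a power of two and a multiple of
   sizeof(void *).  Returns NULL and sets errno on failure.
   Hypothetical helper. */
void *
alloc_aligned(size_t align, size_t len)
{
    void *buf = NULL;
    int err;

    if (align == 0 || len % align != 0) {
        errno = EINVAL;
        return NULL;
    }

    err = posix_memalign(&buf, align, len);
    if (err != 0) {
        errno = err;
        return NULL;
    }
    return buf;
}

/* Open "path" for reading with O_DIRECT.  If the filesystem rejects
   the flag with EINVAL, fall back to ordinary buffered I/O.
   *direct is set to 1 or 0 to tell the caller which mode was
   obtained.  Returns a file descriptor, or -1 on error.
   Hypothetical helper. */
int
open_direct_or_buffered(const char *path, int *direct)
{
    int fd = open(path, O_RDONLY | O_DIRECT);

    if (fd != -1) {
        *direct = 1;
        return fd;
    }
    if (errno != EINVAL)
        return -1;

    *direct = 0;
    return open(path, O_RDONLY);
}
```

With a direct file descriptor, each pread(2) or read(2) would then use a buffer from alloc_aligned(), a length that is a multiple of the alignment, and a file offset that is likewise a multiple of it.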
5355ff82 | 1759 | .PP |
1847167b NP |
1760 | .B O_DIRECT |
1761 | I/Os should never be run concurrently with the | |
04cd7f64 | 1762 | .BR fork (2) |
1847167b NP |
1763 | system call, |
1764 | if the memory buffer is a private mapping | |
1765 | (i.e., any mapping created with the | |
02ace852 | 1766 | .BR mmap (2) |
1ae6b2c7 | 1767 | .B MAP_PRIVATE |
0ab8aeec | 1768 | flag; |
1847167b NP |
1769 | this includes memory allocated on the heap and statically allocated buffers). |
1770 | Any such I/Os, whether submitted via an asynchronous I/O interface or from | |
1771 | another thread in the process, | |
1772 | should be completed before | |
1773 | .BR fork (2) | |
1774 | is called. | |
1775 | Failure to do so can result in data corruption and undefined behavior in | |
1776 | parent and child processes. | |
1777 | This restriction does not apply when the memory buffer for the | |
1778 | .B O_DIRECT | |
1779 | I/Os was created using | |
1780 | .BR shmat (2) | |
1781 | or | |
1782 | .BR mmap (2) | |
1783 | with the | |
1784 | .B MAP_SHARED | |
1785 | flag. | |
1786 | Nor does this restriction apply when the memory buffer has been advised as | |
1787 | .B MADV_DONTFORK | |
0ab8aeec | 1788 | with |
02ace852 | 1789 | .BR madvise (2), |
1847167b NP |
1790 | ensuring that it will not be available |
1791 | to the child after | |
1792 | .BR fork (2). | |
dd3568a1 | 1793 | .PP |
ddc4d339 MK |
1794 | The |
1795 | .B O_DIRECT | |
1796 | flag was introduced in SGI IRIX, where it has alignment | |
1797 | restrictions similar to those of Linux 2.4. | |
1798 | IRIX also has a |
1799 | .BR fcntl (2) |
1800 | call to query appropriate alignments and sizes. |
1801 | FreeBSD 4.x introduced | |
1802 | a flag of the same name, but without alignment restrictions. | |
dd3568a1 | 1803 | .PP |
ddc4d339 MK |
1804 | .B O_DIRECT |
1805 | support was added under Linux in kernel version 2.4.10. | |
1806 | Older Linux kernels simply ignore this flag. | |
fedb2ff5 | 1807 | Some filesystems may not implement the flag, in which case |
ddc4d339 | 1808 | .BR open () |
9e4be7e9 | 1809 | fails with the error |
ddc4d339 MK |
1810 | .B EINVAL |
1811 | if it is used. | |
dd3568a1 | 1812 | .PP |
ddc4d339 MK |
1813 | Applications should avoid mixing |
1814 | .B O_DIRECT | |
1815 | and normal I/O to the same file, | |
1816 | and especially to overlapping byte regions in the same file. | |
9ee4a2b6 | 1817 | Even when the filesystem correctly handles the coherency issues in |
ddc4d339 MK |
1818 | this situation, overall I/O throughput is likely to be slower than |
1819 | using either mode alone. | |
1820 | Likewise, applications should avoid mixing | |
1821 | .BR mmap (2) | |
1822 | of files with direct I/O to the same files. | |
dd3568a1 | 1823 | .PP |
a1fa36af | 1824 | The behavior of |
ddc4d339 | 1825 | .B O_DIRECT |
9ee4a2b6 | 1826 | with NFS differs from its behavior on local filesystems. |
ddc4d339 MK |
1827 | Older kernels, or |
1828 | kernels configured in certain ways, may not support this combination. | |
1829 | The NFS protocol does not support passing the flag to the server, so | |
1830 | .B O_DIRECT | |
33a0ccb2 | 1831 | I/O will bypass the page cache only on the client; the server may |
ddc4d339 MK |
1832 | still cache the I/O. |
1833 | The client asks the server to make the I/O | |
1834 | synchronous to preserve the synchronous semantics of | |
1835 | .BR O_DIRECT . | |
1836 | Some servers will perform poorly under these circumstances, especially | |
1837 | if the I/O size is small. | |
1838 | Some servers may also be configured to | |
1839 | lie to clients about the I/O having reached stable storage; this | |
1840 | will avoid the performance penalty at some risk to data integrity | |
1841 | in the event of server power failure. | |
1842 | The Linux NFS client places no alignment restrictions on | |
1843 | .B O_DIRECT | |
1844 | I/O. | |
1845 | .PP | |
1846 | In summary, | |
1847 | .B O_DIRECT | |
1848 | is a potentially powerful tool that should be used with caution. | |
1849 | It is recommended that applications treat use of | |
1850 | .B O_DIRECT | |
1851 | as a performance option which is disabled by default. | |
ddc4d339 | 1852 | .SH BUGS |
b50582eb MK |
1853 | Currently, it is not possible to enable signal-driven |
1854 | I/O by specifying | |
1855 | .B O_ASYNC | |
c13182ef | 1856 | when calling |
b50582eb MK |
1857 | .BR open (); |
1858 | use | |
1859 | .BR fcntl (2) | |
1860 | to enable this flag. | |
0e1ad98c | 1861 | .\" FIXME . Check bugzilla report on open(O_ASYNC) |
92057f4d | 1862 | .\" See http://bugzilla.kernel.org/show_bug.cgi?id=5993 |
5355ff82 | 1863 | .PP |
0d730fcc MK |
1864 | One must check for two different error codes, |
1865 | .B EISDIR | |
1866 | and | |
1867 | .BR ENOENT , | |
1868 | when trying to determine whether the kernel supports | |
0d55b37f | 1869 | .B O_TMPFILE |
0d730fcc | 1870 | functionality. |
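A probe that checks for both error codes might look like the sketch below. The helper name probe_tmpfile() and its three-way return convention are hypothetical, and the sketch assumes that the directory passed in exists, since a missing directory also produces ENOENT and cannot be told apart from a kernel that lacks the feature.

```c
#define _GNU_SOURCE             /* for O_TMPFILE */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Probe for O_TMPFILE support using the existing directory "dir".
   Returns 1 if an unnamed temporary file could be created, 0 if the
   error is one of the two codes that indicate a kernel without
   O_TMPFILE support, or -1 for any other error (errno is left set,
   for example when the filesystem itself does not support the flag).
   Hypothetical helper. */
int
probe_tmpfile(const char *dir)
{
    int fd = open(dir, O_TMPFILE | O_RDWR, 0600);

    if (fd != -1) {
        close(fd);              /* The unnamed file is discarded. */
        return 1;
    }

    if (errno == EISDIR || errno == ENOENT)
        return 0;

    return -1;
}
```

An application would call this once at startup, for example as probe_tmpfile("/tmp"), and fall back to a conventional create-then-unlink scheme when the result is not 1.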
5355ff82 | 1871 | .PP |
320f8a8e MK |
1872 | When both |
1873 | .B O_CREAT | |
1874 | and | |
1875 | .B O_DIRECTORY | |
1876 | are specified in | |
1ae6b2c7 | 1877 | .I flags |
320f8a8e MK |
1878 | and the file specified by |
1879 | .I pathname | |
1880 | does not exist, | |
1881 | .BR open () | |
1882 | will create a regular file (i.e., | |
1883 | .B O_DIRECTORY | |
1884 | is ignored). | |
47297adb | 1885 | .SH SEE ALSO |
a3bf8022 MK |
1886 | .BR chmod (2), |
1887 | .BR chown (2), | |
fea681da | 1888 | .BR close (2), |
e366dbc4 | 1889 | .BR dup (2), |
fea681da MK |
1890 | .BR fcntl (2), |
1891 | .BR link (2), | |
1f6ceb40 | 1892 | .BR lseek (2), |
fea681da | 1893 | .BR mknod (2), |
e366dbc4 | 1894 | .BR mmap (2), |
f0c34053 | 1895 | .BR mount (2), |
fa5d243f | 1896 | .BR open_by_handle_at (2), |
c8fb1c6d | 1897 | .BR openat2 (2), |
fea681da MK |
1898 | .BR read (2), |
1899 | .BR socket (2), | |
1900 | .BR stat (2), | |
1901 | .BR umask (2), | |
1902 | .BR unlink (2), | |
1903 | .BR write (2), | |
1904 | .BR fopen (3), | |
b31056e3 | 1905 | .BR acl (5), |
f0c34053 | 1906 | .BR fifo (7), |
3b363b62 | 1907 | .BR inode (7), |
a9cfde1d MK |
1908 | .BR path_resolution (7), |
1909 | .BR symlink (7) |