Commit | Line | Data |
---|---|---|
28f540f4 | 1 | @node I/O Overview, I/O on Streams, Pattern Matching, Top |
7a68c94a | 2 | @c %MENU% Introduction to the I/O facilities |
28f540f4 RM |
3 | @chapter Input/Output Overview |
4 | ||
5 | Most programs need to do either input (reading data) or output (writing | |
1f77f049 JM |
6 | data), or most frequently both, in order to do anything useful. @Theglibc{} |
7 | provides such a large selection of input and output functions | |
28f540f4 RM |
8 | that the hardest part is often deciding which function is most |
9 | appropriate! | |
10 | ||
11 | This chapter introduces concepts and terminology relating to input | |
12 | and output. Other chapters relating to the GNU I/O facilities are: | |
13 | ||
14 | @itemize @bullet | |
15 | @item | |
16 | @ref{I/O on Streams}, which covers the high-level functions | |
17 | that operate on streams, including formatted input and output. | |
18 | ||
19 | @item | |
20 | @ref{Low-Level I/O}, which covers the basic I/O and control | |
21 | functions on file descriptors. | |
22 | ||
23 | @item | |
24 | @ref{File System Interface}, which covers functions for operating on | |
25 | directories and for manipulating file attributes such as access modes | |
26 | and ownership. | |
27 | ||
28 | @item | |
29 | @ref{Pipes and FIFOs}, which includes information on the basic interprocess | |
30 | communication facilities. | |
31 | ||
32 | @item | |
33 | @ref{Sockets}, which covers a more complicated interprocess communication | |
34 | facility with support for networking. | |
35 | ||
36 | @item | |
37 | @ref{Low-Level Terminal Interface}, which covers functions for changing | |
0be8752b | 38 | how input and output to terminals or other serial devices are processed. |
28f540f4 RM |
39 | @end itemize |
40 | ||
41 | ||
42 | @menu | |
43 | * I/O Concepts:: Some basic information and terminology. | |
44 | * File Names:: How to refer to a file. | |
45 | @end menu | |
46 | ||
47 | @node I/O Concepts, File Names, , I/O Overview | |
48 | @section Input/Output Concepts | |
49 | ||
50 | Before you can read or write the contents of a file, you must establish | |
51 | a connection or communications channel to the file. This process is | |
52 | called @dfn{opening} the file. You can open a file for reading, writing, | |
53 | or both. | |
54 | @cindex opening a file | |
55 | ||
56 | The connection to an open file is represented either as a stream or as a | |
57 | file descriptor. You pass this as an argument to the functions that do | |
58 | the actual read or write operations, to tell them which file to operate | |
59 | on. Certain functions expect streams, and others are designed to | |
60 | operate on file descriptors. | |
61 | ||
62 | When you have finished reading from or writing to the file, you can | |
63 | terminate the connection by @dfn{closing} the file. Once you have | |
64 | closed a stream or file descriptor, you cannot do any more input or | |
65 | output operations on it. | |
66 | ||
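As a minimal sketch of the open, read, close sequence described above, the following uses a stream. The helper name @code{count_bytes} is hypothetical and exists only for illustration; @code{fopen}, @code{fgetc}, @code{ferror} and @code{fclose} are the standard stream functions.

```c
#include <stdio.h>

/* Count the bytes in the file named FILE_NAME.  This helper is
   hypothetical.  It returns the count, or -1 if the file cannot be
   opened or read.  */
long
count_bytes (const char *file_name)
{
  /* Opening the file establishes the connection, here as a stream.  */
  FILE *stream = fopen (file_name, "r");
  if (stream == NULL)
    return -1;

  /* Read one character at a time until end of file or an error.  */
  long count = 0;
  while (fgetc (stream) != EOF)
    ++count;

  int failed = ferror (stream);

  /* Closing ends the connection; STREAM must not be used afterward.  */
  if (fclose (stream) != 0 || failed)
    return -1;
  return count;
}
```

A caller might write @code{count_bytes ("notes.txt")}, where the file name is only a placeholder.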
67 | @menu | |
1f77f049 | 68 | * Streams and File Descriptors:: The GNU C Library provides two ways |
28f540f4 RM |
69 | to access the contents of files. |
70 | * File Position:: The number of bytes from the | |
71 | beginning of the file. | |
72 | @end menu | |
73 | ||
74 | @node Streams and File Descriptors, File Position, , I/O Concepts | |
75 | @subsection Streams and File Descriptors | |
76 | ||
77 | When you want to do input or output to a file, you have a choice of two | |
78 | basic mechanisms for representing the connection between your program | |
79 | and the file: file descriptors and streams. File descriptors are | |
80 | represented as objects of type @code{int}, while streams are represented | |
81 | as @code{FILE *} objects. | |
82 | ||
83 | File descriptors provide a primitive, low-level interface to input and | |
84 | output operations. Both file descriptors and streams can represent a | |
85 | connection to a device (such as a terminal), or a pipe or socket for | |
86 | communicating with another process, as well as a normal file. But, if | |
87 | you want to do control operations that are specific to a particular kind | |
88 | of device, you must use a file descriptor; there are no facilities to | |
89 | use streams in this way. You must also use file descriptors if your | |
90 | program needs to do input or output in special modes, such as | |
91 | nonblocking (or polled) input (@pxref{File Status Flags}). | |
92 | ||
93 | Streams provide a higher-level interface, layered on top of the | |
94 | primitive file descriptor facilities. The stream interface treats all | |
95 | kinds of files pretty much alike---the sole exception being the three | |
96 | styles of buffering that you can choose (@pxref{Stream Buffering}). | |
97 | ||
98 | The main advantage of using the stream interface is that the set of | |
99 | functions for performing actual input and output operations (as opposed | |
100 | to control operations) on streams is much richer and more powerful than | |
101 | the corresponding facilities for file descriptors. The file descriptor | |
102 | interface provides only simple functions for transferring blocks of | |
103 | characters, but the stream interface also provides powerful formatted | |
104 | input and output functions (@code{printf} and @code{scanf}) as well as | |
105 | functions for character- and line-oriented input and output. | |
106 | @c !!! glibc has dprintf, which lets you do printf on an fd. | |
107 | ||
108 | Since streams are implemented in terms of file descriptors, you can | |
109 | extract the file descriptor from a stream and perform low-level | |
110 | operations directly on the file descriptor. You can also initially open | |
111 | a connection as a file descriptor and then make a stream associated with | |
112 | that file descriptor. | |
113 | ||
114 | In general, you should stick with using streams rather than file | |
115 | descriptors, unless there is some specific operation you want to do that | |
116 | can only be done on a file descriptor. If you are a beginning | |
117 | programmer and aren't sure what functions to use, we suggest that you | |
118 | concentrate on the formatted input functions (@pxref{Formatted Input}) | |
119 | and formatted output functions (@pxref{Formatted Output}). | |
120 | ||
121 | If you are concerned about portability of your programs to systems other | |
122 | than GNU, you should also be aware that file descriptors are not as | |
f65fd747 | 123 | portable as streams. You can expect any system running @w{ISO C} to |
a7a93d50 | 124 | support streams, but @nongnusystems{} may not support file descriptors at |
28f540f4 | 125 | all, or may only implement a subset of the GNU functions that operate on |
1f77f049 JM |
126 | file descriptors. Most of the file descriptor functions in @theglibc{} |
127 | are included in the POSIX.1 standard, however. | |
28f540f4 RM |
128 | |
129 | @node File Position, , Streams and File Descriptors, I/O Concepts | |
f65fd747 | 130 | @subsection File Position |
28f540f4 RM |
131 | |
132 | One of the attributes of an open file is its @dfn{file position}, which | |
133 | keeps track of where in the file the next character is to be read or | |
a7a93d50 | 134 | written. On @gnusystems{}, and all POSIX.1 systems, the file position |
28f540f4 RM |
135 | is simply an integer representing the number of bytes from the beginning |
136 | of the file. | |
137 | ||
138 | The file position is normally set to the beginning of the file when it | |
139 | is opened, and each time a character is read or written, the file | |
140 | position is incremented. In other words, access to the file is normally | |
141 | @dfn{sequential}. | |
142 | @cindex file position | |
143 | @cindex sequential-access files | |
144 | ||
145 | Ordinary files permit read or write operations at any position within | |
146 | the file. Some other kinds of files may also permit this. Files which | |
147 | do permit this are sometimes referred to as @dfn{random-access} files. | |
148 | You can change the file position using the @code{fseek} function on a | |
149 | stream (@pxref{File Positioning}) or the @code{lseek} function on a file | |
150 | descriptor (@pxref{I/O Primitives}). If you try to change the file | |
151 | position on a file that doesn't support random access, you get the | |
152 | @code{ESPIPE} error. | |
153 | @cindex random-access files | |
154 | ||
155 | Streams and descriptors that are opened for @dfn{append access} are | |
156 | treated specially for output: output to such files is @emph{always} | |
157 | appended sequentially to the @emph{end} of the file, regardless of the | |
158 | file position. However, the file position is still used to control where in | |
159 | the file reading is done. | |
160 | @cindex append-access files | |
161 | ||
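The following sketch illustrates append access on a file descriptor, assuming a POSIX system and a file that already exists. The helper name @code{append_after_rewind} is hypothetical. It moves the file position to the beginning and then writes one byte, which is nevertheless added at the end of the file.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open the existing file FILE_NAME for append access, move the file
   position to the start, and write the single byte C.  Returns the
   file position after the write, or -1 on failure.  Hypothetical
   helper.  */
long
append_after_rewind (const char *file_name, char c)
{
  int fd = open (file_name, O_WRONLY | O_APPEND);
  if (fd < 0)
    return -1;

  long result = -1;
  /* Move the file position to the beginning of the file...  */
  if (lseek (fd, 0, SEEK_SET) == 0
      /* ...yet the output is still appended at the end.  */
      && write (fd, &c, 1) == 1)
    result = (long) lseek (fd, 0, SEEK_CUR);

  close (fd);
  return result;
}
```

If the file held three bytes beforehand, the existing bytes are left untouched and the new byte becomes the fourth.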
162 | If you think about it, you'll realize that several programs can read a | |
163 | given file at the same time. In order for each program to be able to | |
164 | read the file at its own pace, each program must have its own file | |
165 | position, which is not affected by anything the other programs do. | |
166 | ||
f65fd747 | 167 | In fact, each opening of a file creates a separate file position. |
28f540f4 RM |
168 | Thus, if you open a file twice even in the same program, you get two |
169 | streams or descriptors with independent file positions. | |
170 | ||
f65fd747 | 171 | By contrast, if you open a descriptor and then duplicate it to get |
28f540f4 RM |
172 | another descriptor, these two descriptors share the same file position: |
173 | changing the file position of one descriptor will affect the other. | |
174 | ||
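The difference between opening a file twice and duplicating a descriptor can be sketched like this, assuming a POSIX system. The helper name @code{compare_positions} is hypothetical, and the file is assumed to hold at least three bytes.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open FILE_NAME twice and duplicate the first descriptor, then read
   three bytes through the first descriptor.  Store the resulting file
   position of the second opening in *REOPENED and that of the
   duplicate in *DUPLICATED.  Returns 0 on success, -1 on failure.
   Hypothetical helper.  */
int
compare_positions (const char *file_name, long *reopened, long *duplicated)
{
  int first = open (file_name, O_RDONLY);
  int second = open (file_name, O_RDONLY);
  int copy = first < 0 ? -1 : dup (first);
  int status = -1;
  char buf[3];

  if (first >= 0 && second >= 0 && copy >= 0
      && read (first, buf, sizeof buf) == (ssize_t) sizeof buf)
    {
      /* A separate opening has its own file position.  */
      *reopened = (long) lseek (second, 0, SEEK_CUR);
      /* A duplicate shares the file position of the original.  */
      *duplicated = (long) lseek (copy, 0, SEEK_CUR);
      status = 0;
    }

  if (copy >= 0)
    close (copy);
  if (second >= 0)
    close (second);
  if (first >= 0)
    close (first);
  return status;
}
```

After a successful call, the position stored for the second opening is still zero, while the one stored for the duplicate has advanced by the three bytes read.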
175 | @node File Names, , I/O Concepts, I/O Overview | |
176 | @section File Names | |
177 | ||
178 | In order to open a connection to a file, or to perform other operations | |
179 | such as deleting a file, you need some way to refer to the file. Nearly | |
180 | all files have names that are strings---even files which are actually | |
181 | devices such as tape drives or terminals. These strings are called | |
182 | @dfn{file names}. You specify the file name to say which file you want | |
183 | to open or operate on. | |
184 | ||
185 | This section describes the conventions for file names and how the | |
186 | operating system works with them. | |
187 | @cindex file name | |
188 | ||
189 | @menu | |
190 | * Directories:: Directories contain entries for files. | |
191 | * File Name Resolution:: A file name specifies how to look up a file. | |
192 | * File Name Errors:: Error conditions relating to file names. | |
193 | * File Name Portability:: File name portability and syntax issues. | |
194 | @end menu | |
195 | ||
196 | ||
197 | @node Directories, File Name Resolution, , File Names | |
198 | @subsection Directories | |
199 | ||
200 | In order to understand the syntax of file names, you need to understand | |
201 | how the file system is organized into a hierarchy of directories. | |
202 | ||
203 | @cindex directory | |
204 | @cindex link | |
205 | @cindex directory entry | |
206 | A @dfn{directory} is a file that contains information to associate other | |
207 | files with names; these associations are called @dfn{links} or | |
208 | @dfn{directory entries}. Sometimes, people speak of ``files in a | |
209 | directory'', but in reality, a directory only contains pointers to | |
210 | files, not the files themselves. | |
211 | ||
212 | @cindex file name component | |
213 | The name of a file contained in a directory entry is called a @dfn{file | |
214 | name component}. In general, a file name consists of a sequence of one | |
215 | or more such components, separated by the slash character (@samp{/}). A | |
216 | file name which is just one component names a file with respect to its | |
217 | directory. A file name with multiple components names a directory, and | |
218 | then a file in that directory, and so on. | |
219 | ||
220 | Some other documents, such as the POSIX standard, use the term | |
221 | @dfn{pathname} for what we call a file name, and either @dfn{filename} | |
222 | or @dfn{pathname component} for what this manual calls a file name | |
223 | component. We don't use this terminology because a ``path'' is | |
224 | something completely different (a list of directories to search), and we | |
225 | think that ``pathname'' used for something else will confuse users. We | |
226 | always use ``file name'' and ``file name component'' (or sometimes just | |
227 | ``component'', where the context is obvious) in GNU documentation. Some | |
228 | macros use the POSIX terminology in their names, such as | |
229 | @code{PATH_MAX}. These macros are defined by the POSIX standard, so we | |
230 | cannot change their names. | |
231 | ||
232 | You can find more detailed information about operations on directories | |
233 | in @ref{File System Interface}. | |
234 | ||
235 | @node File Name Resolution, File Name Errors, Directories, File Names | |
236 | @subsection File Name Resolution | |
237 | ||
238 | A file name consists of file name components separated by slash | |
1f77f049 | 239 | (@samp{/}) characters. On the systems that @theglibc{} supports, |
28f540f4 RM |
240 | multiple successive @samp{/} characters are equivalent to a single |
241 | @samp{/} character. | |
242 | ||
243 | @cindex file name resolution | |
244 | The process of determining what file a file name refers to is called | |
245 | @dfn{file name resolution}. This is performed by examining the | |
246 | components that make up a file name in left-to-right order, and locating | |
247 | each successive component in the directory named by the previous | |
248 | component. Of course, each of the files that are referenced as | |
249 | directories must actually exist, be directories instead of regular | |
250 | files, and have the appropriate permissions to be accessible by the | |
251 | process; otherwise the file name resolution fails. | |
252 | ||
253 | @cindex root directory | |
254 | @cindex absolute file name | |
255 | If a file name begins with a @samp{/}, the first component in the file | |
256 | name is located in the @dfn{root directory} of the process (usually all | |
257 | processes on the system have the same root directory). Such a file name | |
258 | is called an @dfn{absolute file name}. | |
259 | @c !!! xref here to chroot, if we ever document chroot. -rm | |
260 | ||
261 | @cindex relative file name | |
262 | Otherwise, the first component in the file name is located in the | |
263 | current working directory (@pxref{Working Directory}). This kind of | |
264 | file name is called a @dfn{relative file name}. | |
265 | ||
266 | @cindex parent directory | |
267 | The file name components @file{.} (``dot'') and @file{..} (``dot-dot'') | |
268 | have special meanings. Every directory has entries for these file name | |
269 | components. The file name component @file{.} refers to the directory | |
270 | itself, while the file name component @file{..} refers to its | |
271 | @dfn{parent directory} (the directory that contains the link for the | |
272 | directory in question). As a special case, @file{..} in the root | |
273 | directory refers to the root directory itself, since it has no parent; | |
274 | thus @file{/..} is the same as @file{/}. | |
275 | ||
276 | Here are some examples of file names: | |
277 | ||
278 | @table @file | |
279 | @item /a | |
280 | The file named @file{a}, in the root directory. | |
281 | ||
282 | @item /a/b | |
283 | The file named @file{b}, in the directory named @file{a} in the root directory. | |
284 | ||
285 | @item a | |
286 | The file named @file{a}, in the current working directory. | |
287 | ||
288 | @item /a/./b | |
f65fd747 | 289 | This is the same as @file{/a/b}. |
28f540f4 RM |
290 | |
291 | @item ./a | |
292 | The file named @file{a}, in the current working directory. | |
293 | ||
294 | @item ../a | |
295 | The file named @file{a}, in the parent directory of the current working | |
296 | directory. | |
297 | @end table | |
298 | ||
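One way to check that two file names resolve to the same file is to compare the device and file serial numbers reported by @code{stat}. This is only a sketch for POSIX systems, and the helper name @code{same_file} is hypothetical.

```c
#include <sys/stat.h>

/* Return 1 if the file names A and B resolve to the same file, 0 if
   they resolve to different files, and -1 if either name cannot be
   resolved.  Hypothetical helper.  */
int
same_file (const char *a, const char *b)
{
  struct stat sa, sb;

  /* stat performs file name resolution on each name.  */
  if (stat (a, &sa) != 0 || stat (b, &sb) != 0)
    return -1;

  /* Two names refer to the same file when both the device and the
     file serial number match.  */
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For instance, @code{same_file ("/", "/..")} and @code{same_file (".", "./.")} both report a match, in line with the rules for @file{.} and @file{..} given above.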
f65fd747 | 299 | @c An empty string may ``work'', but I think it's confusing to |
28f540f4 RM |
300 | @c try to describe it. It's not a useful thing for users to use--rms. |
301 | A file name that names a directory may optionally end in a @samp{/}. | |
302 | You can specify a file name of @file{/} to refer to the root directory, | |
303 | but the empty string is not a meaningful file name. If you want to | |
304 | refer to the current working directory, use a file name of @file{.} or | |
305 | @file{./}. | |
306 | ||
a7a93d50 | 307 | Unlike some other operating systems, @gnusystems{} don't have any |
28f540f4 RM |
308 | built-in support for file types (or extensions) or file versions as part |
309 | of their file name syntax. Many programs and utilities use conventions | |
310 | for file names---for example, files containing C source code usually | |
311 | have names suffixed with @samp{.c}---but there is nothing in the file | |
312 | system itself that enforces this kind of convention. | |
313 | ||
314 | @node File Name Errors, File Name Portability, File Name Resolution, File Names | |
315 | @subsection File Name Errors | |
316 | ||
317 | @cindex file name errors | |
318 | @cindex usual file name errors | |
319 | ||
320 | Functions that accept file name arguments usually detect these | |
321 | @code{errno} error conditions relating to the file name syntax or | |
322 | trouble finding the named file. These errors are referred to throughout | |
323 | this manual as the @dfn{usual file name errors}. | |
324 | ||
325 | @table @code | |
326 | @item EACCES | |
f65fd747 | 327 | The process does not have search permission for a directory component |
28f540f4 RM |
328 | of the file name. |
329 | ||
330 | @item ENAMETOOLONG | |
3081378b | 331 | This error is used when the total length of a file name is |
28f540f4 RM |
332 | greater than @code{PATH_MAX}, or when an individual file name component |
333 | has a length greater than @code{NAME_MAX}. @xref{Limits for Files}. | |
334 | ||
a7a93d50 | 335 | On @gnuhurdsystems{}, there is no imposed limit on overall file name |
28f540f4 RM |
336 | length, but some file systems may place limits on the length of a |
337 | component. | |
338 | ||
339 | @item ENOENT | |
340 | This error is reported when a file referenced as a directory component | |
341 | in the file name doesn't exist, or when a component is a symbolic link | |
342 | whose target file does not exist. @xref{Symbolic Links}. | |
343 | ||
344 | @item ENOTDIR | |
345 | A file that is referenced as a directory component in the file name | |
346 | exists, but it isn't a directory. | |
347 | ||
348 | @item ELOOP | |
349 | Too many symbolic links were resolved while trying to look up the file | |
350 | name. The system has an arbitrary limit on the number of symbolic links | |
351 | that may be resolved in looking up a single file name, as a primitive | |
352 | way to detect loops. @xref{Symbolic Links}. | |
353 | @end table | |
354 | ||
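To see which of these errors a given file name produces, you can try to open the file and inspect @code{errno}. The following is a minimal sketch for POSIX systems; the helper name @code{open_errno} is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to open FILE_NAME for reading.  Returns 0 if the open succeeds,
   or the errno value describing why it failed.  Hypothetical
   helper.  */
int
open_errno (const char *file_name)
{
  int fd = open (file_name, O_RDONLY);
  if (fd < 0)
    return errno;   /* For example ENOENT, ENOTDIR or ELOOP.  */

  close (fd);
  return 0;
}
```

A name whose directory component does not exist yields @code{ENOENT}, a name that uses a regular file as a directory component yields @code{ENOTDIR}, and a symbolic link that points to itself yields @code{ELOOP}.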
355 | ||
356 | @node File Name Portability, , File Name Errors, File Names | |
357 | @subsection Portability of File Names | |
358 | ||
359 | The rules for the syntax of file names discussed in @ref{File Names}, | |
a7a93d50 | 360 | are the rules normally used by @gnusystems{} and by other POSIX |
28f540f4 RM |
361 | systems. However, other operating systems may use other conventions. |
362 | ||
363 | There are two reasons why it can be important for you to be aware of | |
364 | file name portability issues: | |
365 | ||
366 | @itemize @bullet | |
f65fd747 | 367 | @item |
28f540f4 RM |
368 | If your program makes assumptions about file name syntax, or contains |
369 | embedded literal file name strings, it is more difficult to get it to | |
370 | run under other operating systems that use different syntax conventions. | |
371 | ||
372 | @item | |
373 | Even if you are not concerned about running your program on machines | |
374 | that run other operating systems, it may still be possible to access | |
375 | files that use different naming conventions. For example, you may be | |
376 | able to access file systems on another computer running a different | |
377 | operating system over a network, or read and write disks in formats used | |
378 | by other operating systems. | |
379 | @end itemize | |
380 | ||
f65fd747 | 381 | The @w{ISO C} standard says very little about file name syntax, only that |
28f540f4 RM |
382 | file names are strings. In addition to varying restrictions on the |
383 | length of file names and what characters can validly appear in a file | |
384 | name, different operating systems use different conventions and syntax | |
385 | for concepts such as structured directories and file types or | |
386 | extensions. Some concepts such as file versions might be supported in | |
387 | some operating systems and not by others. | |
388 | ||
389 | The POSIX.1 standard allows implementations to put additional | |
390 | restrictions on file name syntax, concerning what characters are | |
391 | permitted in file names and on the length of file name and file name | |
a7a93d50 JM |
392 | component strings. However, on @gnusystems{}, any character except |
393 | the null character is permitted in a file name string, and | |
394 | on @gnuhurdsystems{} there are no limits on the length of file name | |
395 | strings. |
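Because these limits vary between systems and file systems, a program can ask for them at run time instead of assuming a value. The sketch below uses @code{pathconf} with @code{_PC_NAME_MAX}, both defined by POSIX; the helper name @code{component_length_limit} and its convention of returning 0 for ``no limit'' are assumptions made for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Return the maximum length of a file name component in DIRECTORY,
   0 if the system imposes no limit, or -1 if the query fails.
   Hypothetical helper.  */
long
component_length_limit (const char *directory)
{
  /* pathconf returns -1 both on error and when there is no limit;
     only an error sets errno, so clear it first.  */
  errno = 0;
  long limit = pathconf (directory, _PC_NAME_MAX);
  if (limit == -1)
    return errno == 0 ? 0 : -1;
  return limit;
}
```

The result applies to the file system that contains the named directory, so different directories can report different limits.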