.\" This manpage is copyright (C) 2001 Paul Sheer.
.\"
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\"
.\" very minor changes, aeb
.\"
.\" Modified 5 June 2002, Michael Kerrisk <mtk-manpages@gmx.net>
.\" 2006-05-13, mtk, removed much material that is redundant with select.2
.\"             various other changes
.TH SELECT_TUT 2 2006-05-13 "Linux" "Linux Programmer's Manual"
.SH NAME
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO \-
synchronous I/O multiplexing
.SH SYNOPSIS
.nf
/* According to POSIX.1-2001 */
.B #include <sys/select.h>
.sp
/* According to earlier standards */
.B #include <sys/time.h>
.B #include <sys/types.h>
.B #include <unistd.h>
.sp
\fBint select(int \fInfds\fB, fd_set *\fIreadfds\fB, fd_set *\fIwritefds\fB,
           fd_set *\fIexceptfds\fB, struct timeval *\fItimeout\fB);\fR
.sp
.BI "void FD_CLR(int " fd ", fd_set *" set );
.BI "int FD_ISSET(int " fd ", fd_set *" set );
.BI "void FD_SET(int " fd ", fd_set *" set );
.BI "void FD_ZERO(fd_set *" set );
.sp
.B #define _XOPEN_SOURCE 600
.B #include <sys/select.h>
.sp
\fBint pselect(int \fInfds\fB, fd_set *\fIreadfds\fB, fd_set *\fIwritefds\fB,
            fd_set *\fIexceptfds\fB, const struct timespec *\fItimeout\fB,
            const sigset_t *\fIsigmask\fB);\fR
.fi
.SH DESCRIPTION
\fBselect\fP() (or \fBpselect\fP()) is the pivot function of
programs that must
handle more than one file descriptor (or socket handle) at a time
in an efficient
manner. Its principal arguments are three arrays of file descriptors:
\fIreadfds\fP, \fIwritefds\fP, and \fIexceptfds\fP.
\fBselect\fP() is usually used to block while waiting for a "change of
status" on one or more of the file descriptors. A "change of status"
occurs when more characters become available from the file descriptor, \fIor\fP
when space becomes available within the kernel's internal buffers for
more to be written to the file descriptor, \fIor\fP when a file
descriptor goes into error (in the case of a socket or pipe this is
when the other end of the connection is closed).
.PP
In summary, \fBselect\fP() just watches multiple file descriptors,
and is the standard Unix call to do so.
.PP
The arrays of file descriptors are called \fIfile descriptor sets\fP.
Each set is declared as type \fBfd_set\fP, and its contents can be
manipulated with the macros \fBFD_CLR\fP(), \fBFD_ISSET\fP(), \fBFD_SET\fP(), and
\fBFD_ZERO\fP(). \fBFD_ZERO\fP() is usually the first macro to be used on
a newly declared set. Thereafter, the individual file descriptors that
you are interested in can be added one by one with \fBFD_SET\fP().
\fBselect\fP() modifies the contents of the sets according to the rules
described below; after calling \fBselect\fP() you can test whether your file
descriptor is still present in the set with the \fBFD_ISSET\fP() macro.
\fBFD_ISSET\fP() returns non-zero if the descriptor is present and zero if
it is not. \fBFD_CLR\fP() removes a file descriptor from the set.
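.PP
As a minimal sketch of how these macros fit together, the following builds a
set, removes one member again, and tests membership. The helper names
count_members and demo_fd_set are illustrative only and are not part of any API;
the descriptor numbers 3 and 5 are arbitrary and need not refer to open files.

```c
#include <sys/select.h>

/* Hypothetical helper: count how many of the descriptors 0..nfds-1
   are members of *set. */
static int
count_members(fd_set *set, int nfds)
{
    int fd, n = 0;

    for (fd = 0; fd < nfds; fd++)
        if (FD_ISSET(fd, set))
            n++;
    return n;
}

/* Hypothetical demonstration: returns 1 if the set behaves as expected. */
static int
demo_fd_set(void)
{
    fd_set set;

    FD_ZERO(&set);          /* start with an empty set */
    FD_SET(3, &set);        /* add descriptors 3 and 5 */
    FD_SET(5, &set);
    FD_CLR(3, &set);        /* remove descriptor 3 again */

    /* only descriptor 5 should remain */
    return FD_ISSET(5, &set) && !FD_ISSET(3, &set)
           && count_members(&set, 8) == 1;
}
```

Because the macros only touch the set in memory, none of them can fail or block.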
.TP
\fIreadfds\fP
This set is watched to see if data is available for reading from any of
its file descriptors. After \fBselect\fP() has returned, \fIreadfds\fP will be
cleared of all file descriptors except for those
that are immediately available for reading with a \fBrecv\fP() (for sockets) or
\fBread\fP() (for pipes, files, and sockets) call.
.TP
\fIwritefds\fP
This set is watched to see if there is space to write data to any of
its file descriptors.
After \fBselect\fP() has returned, \fIwritefds\fP will be
cleared of all file descriptors except for those
that are immediately available for writing with a \fBsend\fP() (for sockets) or
\fBwrite\fP() (for pipes, files, and sockets) call.
.TP
\fIexceptfds\fP
This set is watched for exceptions or errors on any of the file
descriptors. However, that is actually just a rumor. In practice,
\fIexceptfds\fP is used to watch for \fIout\-of\-band\fP (OOB) data. OOB data
is data sent on a socket using the \fBMSG_OOB\fP flag, and hence
\fIexceptfds\fP only really applies to sockets. See \fBrecv\fP(2) and
\fBsend\fP(2) about this. After \fBselect\fP() has returned,
\fIexceptfds\fP will be cleared of all file descriptors except for those
that are available for reading OOB data. You can only ever
read one byte of OOB data, though (which is done with \fBrecv\fP()), and
writing OOB data (done with \fBsend\fP()) can be done at any time and will
not block. Hence there is no need for a fourth set to check if a socket
is available for writing OOB data.
.TP
\fInfds\fP
This is an integer one more than the maximum of any file descriptor in
any of the sets. In other words, while you are adding file descriptors
to your sets, you must calculate the maximum integer value of all of
them, then increment this value by one, and then pass this as \fInfds\fP to
\fBselect\fP().
.TP
\fItimeout\fP
This is the longest time \fBselect\fP() must wait before returning, even
if nothing interesting happened. If this value is passed as NULL,
then \fBselect\fP() blocks indefinitely waiting for an event.
\fItimeout\fP can be set to zero seconds, which causes \fBselect\fP() to
return immediately. The structure \fBstruct timeval\fP is defined as:
.IP
.nf
struct timeval {
    time_t tv_sec;    /* seconds */
    long   tv_usec;   /* microseconds */
};
.fi
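.TP
\fItimeout\fP (\fBpselect\fP())
For \fBpselect\fP(), this argument has the same meaning as the
\fItimeout\fP argument of \fBselect\fP(), but \fIstruct timespec\fP
has nanosecond precision, as follows:
.IP
.nf
struct timespec {
    time_t tv_sec;    /* seconds */
    long   tv_nsec;   /* nanoseconds */
};
.fi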
.TP
\fIsigmask\fP
This argument specifies the signal mask that is in effect while
\fBpselect\fP() waits (see \fBsigaddset\fP(3) and \fBsigprocmask\fP(2)),
so that signals not in the mask can be delivered during the call.
It can be passed
as NULL, in which case the signal mask is not modified on
entry and exit to the function. It will then behave just like \fBselect\fP().
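.PP
The \fIreadfds\fP, \fInfds\fP, and \fItimeout\fP arguments described above can be
exercised with a short sketch. It watches two descriptors, computes
\fInfds\fP as the larger descriptor plus one, and uses a zero timeout so that
the call returns immediately. The function names first_readable and
demo_first_readable are hypothetical, and the two pipes exist only to give
the example something to watch.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: return 0 if fd_a is readable, 1 if only fd_b is
   readable, and -1 if neither is readable or select() failed. */
static int
first_readable(int fd_a, int fd_b)
{
    fd_set rd;
    struct timeval tv;
    int nfds;

    FD_ZERO(&rd);
    FD_SET(fd_a, &rd);
    FD_SET(fd_b, &rd);

    /* nfds is the highest descriptor in any set, plus one */
    nfds = (fd_a > fd_b ? fd_a : fd_b) + 1;

    tv.tv_sec = 0;               /* zero timeout: do not block */
    tv.tv_usec = 0;

    if (select(nfds, &rd, NULL, NULL, &tv) <= 0)
        return -1;
    if (FD_ISSET(fd_a, &rd))     /* still present means ready */
        return 0;
    if (FD_ISSET(fd_b, &rd))
        return 1;
    return -1;
}

/* Hypothetical check: only the second pipe has data waiting. */
static int
demo_first_readable(void)
{
    int a[2], b[2], which;

    if (pipe(a) == -1)
        return -1;
    if (pipe(b) == -1) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    if (write(b[1], "x", 1) != 1)
        which = -1;
    else
        which = first_readable(a[0], b[0]);
    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return which;
}
```

Passing a value of \fInfds\fP that is too small would silently exclude the
higher descriptor from the check, which is why it is derived from the
descriptors themselves here.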
.SH COMBINING SIGNAL AND DATA EVENTS
\fBpselect\fP() must be used if you are waiting for a signal as well as
data from a file descriptor. Programs that receive signals as events
normally use the signal handler only to raise a global flag. The global
flag indicates that the event must be processed in the main loop of
the program. A signal will cause the \fBselect\fP() (or \fBpselect\fP())
call to return \-1 with \fIerrno\fP set to \fBEINTR\fP. This behavior is
essential so that signals can be processed in the main loop of the
program; otherwise \fBselect\fP() would block indefinitely. Now, somewhere
in the main loop will be a conditional to check the global flag. So we
must ask: what if a signal arrives after the conditional, but before the
\fBselect\fP() call? The answer is that \fBselect\fP() would block
indefinitely, even though an event is actually pending. This race
condition is solved by the \fBpselect\fP() call. This call can be used to
mask out signals that are not to be received except within the
\fBpselect\fP() call. For instance, let us say that the event in question
was the exit of a child process. Before the start of the main loop, we
would block \fBSIGCHLD\fP using \fBsigprocmask\fP(). Our \fBpselect\fP()
call would enable \fBSIGCHLD\fP by using the original signal mask. Our
program would look like:
.PP
.nf
static volatile sig_atomic_t child_events = 0;

static void
child_sig_handler(int x)
{
    child_events++;     /* only raise the flag; work is done in main() */
}

int
main(int argc, char **argv)
{
    sigset_t sigmask, orig_sigmask;

    sigemptyset(&sigmask);
    sigaddset(&sigmask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &sigmask, &orig_sigmask);
    signal(SIGCHLD, child_sig_handler);

    for (;;) { /* main loop */
        for (; child_events > 0; child_events\-\-) {
            /* do event work here */
        }
        r = pselect(nfds, &rd, &wr, &er, NULL, &orig_sigmask);
        /* main body of program */
    }
}
.fi
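.PP
The fragment above leaves out its declarations, so here is a self-contained
sketch of the same idea that can be compiled on its own. It is an
illustration under stated assumptions, not a definitive implementation:
\fBSIGUSR1\fP and \fBraise\fP() stand in for a real child exiting, no file
descriptors are watched, and the names on_signal and demo_pselect are
hypothetical. The signal is sent while it is blocked, so it stays pending
until \fBpselect\fP() installs a mask that admits it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_signal = 0;

static void
on_signal(int sig)
{
    (void) sig;
    got_signal = 1;          /* only raise a flag */
}

/* Hypothetical demonstration: returns 1 if the pending signal
   interrupted pselect() and the handler ran. */
static int
demo_pselect(void)
{
    sigset_t block, orig, waitmask;
    struct sigaction sa;
    struct timespec ts;
    int r, err;

    sa.sa_handler = on_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &orig) == -1)
        return -1;

    raise(SIGUSR1);          /* arrives "too early": stays pending */

    waitmask = orig;         /* mask to use while waiting */
    sigdelset(&waitmask, SIGUSR1);
    ts.tv_sec = 5;           /* safety net; not expected to elapse */
    ts.tv_nsec = 0;
    r = pselect(0, NULL, NULL, NULL, &ts, &waitmask);
    err = errno;             /* save before any other call */

    sigprocmask(SIG_SETMASK, &orig, NULL);
    return r == -1 && err == EINTR && got_signal == 1;
}
```

With plain \fBselect\fP() there would be no way to unblock the signal and
start waiting in one step, which is the gap that the race described above
slips through.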
.PP
So what is the point of \fBselect\fP()? Can't I just read and write to my
descriptors whenever I want?
The point of \fBselect\fP() is that it watches
multiple descriptors at the same time and properly puts the process to
sleep if there is no activity. It does this while enabling you to handle
multiple simultaneous pipes and sockets. Unix programmers often find
themselves in a position where they have to handle I/O from more than one
file descriptor where the data flow may be intermittent. If you were to
merely create a sequence of \fBread\fP() and \fBwrite\fP() calls, you would
find that one of your calls may block waiting for data from/to a file
descriptor, while another file descriptor sits unused though it is ready
for I/O. \fBselect\fP() efficiently copes with this situation.
.PP
A simple example of the use of
.SH PORT FORWARDING EXAMPLE
Here is an example that better demonstrates the true utility of
\fBselect\fP().
The listing below is a TCP forwarding program that forwards
from one TCP port to another.
.PP
.nf
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

static int forward_port;

#define max(x,y) ((x) > (y) ? (x) : (y))

listen_socket(int listen_port)
    struct sockaddr_in a;
    if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR,
            (char *) &yes, sizeof(yes)) < 0) {
        perror("setsockopt");
    memset(&a, 0, sizeof(a));
    a.sin_port = htons(listen_port);
    a.sin_family = AF_INET;
    if (bind(s, (struct sockaddr *) &a, sizeof(a)) < 0) {
    printf("accepting connections on port %d\\n", listen_port);

connect_socket(int connect_port, char *address)
    struct sockaddr_in a;
    if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
    memset(&a, 0, sizeof(a));
    a.sin_port = htons(connect_port);
    a.sin_family = AF_INET;
    if (!inet_aton(address, (struct in_addr *) &a.sin_addr.s_addr)) {
        perror("bad IP address format");
    if (connect(s, (struct sockaddr *) &a, sizeof(a)) < 0) {
        shutdown(s, SHUT_RDWR);

#define SHUT_FD1 { \\
        shutdown(fd1, SHUT_RDWR); \\
#define SHUT_FD2 { \\
        shutdown(fd2, SHUT_RDWR); \\

#define BUF_SIZE 1024

main(int argc, char **argv)
    int fd1 = \-1, fd2 = \-1;
    char buf1[BUF_SIZE], buf2[BUF_SIZE];
    int buf1_avail, buf1_written;
    int buf2_avail, buf2_written;
            "Usage\\n\\tfwd <listen-port> "
            "<forward-to-port> <forward-to-ip-address>\\n");
    signal(SIGPIPE, SIG_IGN);
    forward_port = atoi(argv[2]);
    h = listen_socket(atoi(argv[1]));
        if (fd1 > 0 && buf1_avail < BUF_SIZE) {
            nfds = max(nfds, fd1);
        if (fd2 > 0 && buf2_avail < BUF_SIZE) {
            nfds = max(nfds, fd2);
            && buf2_avail \- buf2_written > 0) {
            nfds = max(nfds, fd1);
            && buf1_avail \- buf1_written > 0) {
            nfds = max(nfds, fd2);
            nfds = max(nfds, fd1);
            nfds = max(nfds, fd2);
        r = select(nfds + 1, &rd, &wr, &er, NULL);
        if (r == \-1 && errno == EINTR)
        if (FD_ISSET(h, &rd)) {
            struct sockaddr_in client_address;
            memset(&client_address, 0, l = sizeof(client_address));
            r = accept(h, (struct sockaddr *) &client_address, &l);
                buf1_avail = buf1_written = 0;
                buf2_avail = buf2_written = 0;
                    connect_socket(forward_port, argv[3]);
                    printf("connect from %s\\n",
                        inet_ntoa(client_address.sin_addr));
        /* NB: read oob data before normal reads */
            if (FD_ISSET(fd1, &er)) {
                r = recv(fd1, &c, 1, MSG_OOB);
                    send(fd2, &c, 1, MSG_OOB);
            if (FD_ISSET(fd2, &er)) {
                r = recv(fd2, &c, 1, MSG_OOB);
                    send(fd1, &c, 1, MSG_OOB);
            if (FD_ISSET(fd1, &rd)) {
                    read(fd1, buf1 + buf1_avail,
                        BUF_SIZE \- buf1_avail);
            if (FD_ISSET(fd2, &rd)) {
                    read(fd2, buf2 + buf2_avail,
                        BUF_SIZE \- buf2_avail);
            if (FD_ISSET(fd1, &wr)) {
                    write(fd1, buf2 + buf2_written,
                        buf2_avail \- buf2_written);
            if (FD_ISSET(fd2, &wr)) {
                    write(fd2, buf1 + buf1_written,
                        buf1_avail \- buf1_written);
        /* check if write data has caught read data */
        if (buf1_written == buf1_avail)
            buf1_written = buf1_avail = 0;
        if (buf2_written == buf2_avail)
            buf2_written = buf2_avail = 0;
        /* one side has closed the connection, keep
           writing to the other side until empty */
        if (fd1 < 0 && buf1_avail \- buf1_written == 0) {
        if (fd2 < 0 && buf2_avail \- buf2_written == 0) {
.fi
.PP
The above program properly forwards most kinds of TCP connections
including OOB signal data transmitted by \fBtelnet\fP servers. It
handles the tricky problem of having data flow in both directions
simultaneously. You might think it more efficient to use a \fBfork\fP()
call and devote a process to each stream. This becomes more tricky than
you might suspect. Another idea is to set non-blocking I/O using an
\fBioctl\fP() call. This also has its problems because you end up
relying on inefficient timeouts.
.PP
The program does not handle more than one connection at a
time, although it could easily be extended to do this with a linked list
of buffers, one for each connection. At the moment, new
connections cause the current connection to be dropped.
.PP
Many people who try to use \fBselect\fP() come across behavior that is
difficult to understand and produces non-portable or borderline
results. For instance, the above program is carefully written not to
block at any point, even though it does not set its file descriptors to
non-blocking mode at all (see \fBioctl\fP(2)). It is easy to introduce
subtle errors that will remove the advantage of using \fBselect\fP(),
hence I will present a list of essentials to watch for when using the
\fBselect\fP() call.
.TP
\fB1.\fP
You should always try to use \fBselect\fP() without a timeout. Your program
should have nothing to do if there is no data available. Code that
depends on timeouts is not usually portable and is difficult to debug.
.TP
\fB2.\fP
The value \fInfds\fP must be properly calculated for efficiency as
explained above.
.TP
\fB3.\fP
No file descriptor must be added to any set if you do not intend
to check its result after the \fBselect\fP() call, and respond
appropriately. See next rule.
.TP
\fB4.\fP
After \fBselect\fP() returns, all file descriptors in all sets
should be checked to see if they are ready.
.\" mtk, May 2006: the following isn't really true.
.\" Any file descriptor that is available
.\" for writing \fImust\fP be written to, and any file descriptor
.\" available for reading \fImust\fP be read, etc.
.TP
\fB5.\fP
The functions \fBread\fP(), \fBrecv\fP(), \fBwrite\fP(), and
\fBsend\fP() do \fInot\fP necessarily read/write the full amount of data
that you have requested. If they do read/write the full amount, it is
because you have a low traffic load and a fast stream. This is not
always going to be the case. You should cope with the case of your
functions only managing to send or receive a single byte.
.TP
\fB6.\fP
Never read/write only in single bytes at a time unless you are really
sure that you have a small amount of data to process. It is extremely
inefficient not to read/write as much data as you can buffer each time.
The buffers in the example above are 1024 bytes although they could
easily be made larger.
.TP
\fB7.\fP
The functions \fBread\fP(), \fBrecv\fP(), \fBwrite\fP(), and
\fBsend\fP() as well as the \fBselect\fP() call can return \-1 with
\fIerrno\fP set to \fBEINTR\fP, or with \fIerrno\fP
set to \fBEAGAIN\fP (\fBEWOULDBLOCK\fP).
These results must be properly managed (not done properly
above). If your program is not going to receive any signals then
it is unlikely you will get \fBEINTR\fP. If your program does not
set non-blocking I/O, you will not get \fBEAGAIN\fP. Nonetheless
you should still cope with these errors for completeness.
.TP
\fB8.\fP
Never call \fBread\fP(), \fBrecv\fP(), \fBwrite\fP(), or \fBsend\fP()
with a buffer length of zero.
.TP
\fB9.\fP
If the functions \fBread\fP(),
\fBrecv\fP(), \fBwrite\fP(), and \fBsend\fP() fail
with errors other than those listed in \fB7.\fP,
or one of the input functions returns 0, indicating end of file,
then you should \fInot\fP pass that descriptor to
\fBselect\fP() again.
In the above example,
I close the descriptor immediately, and then set it to \-1
to prevent it being included in a set.
.TP
\fB10.\fP
The timeout value must be initialized with each new call to \fBselect\fP(),
since some operating systems modify the structure. \fBpselect\fP(),
however, does not modify its timeout structure.
.TP
\fB11.\fP
I have heard that the Windows socket layer does not cope with OOB data
properly. It also does not cope with \fBselect\fP() calls when no file
descriptors are set at all. Having no file descriptors set is a useful
way to sleep the process with sub-second precision by using the timeout.
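.PP
Several of the rules above (short writes, \fBEINTR\fP and \fBEAGAIN\fP, and
never issuing a zero-length write) can be combined in one small helper.
The following is a sketch under assumptions rather than a definitive
implementation: write_all and demo_write_all are hypothetical names, and
the pipe is used only so that the example has a descriptor to write to.

```c
#include <errno.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes of buf to fd, coping with
   short writes, EINTR and EAGAIN.  Returns 0 on success, -1 on error. */
static int
write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {         /* never issues a zero-length write */
        fd_set wr;
        ssize_t n;

        FD_ZERO(&wr);            /* sets are rebuilt before every call */
        FD_SET(fd, &wr);
        if (select(fd + 1, NULL, &wr, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        n = write(fd, buf + done, len - done);
        if (n == -1) {
            if (errno == EINTR || errno == EAGAIN)
                continue;        /* transient: wait and try again */
            return -1;
        }
        done += (size_t) n;      /* may be fewer bytes than requested */
    }
    return 0;
}

/* Hypothetical check: push five bytes through a pipe and read them back. */
static int
demo_write_all(void)
{
    int p[2], ok;
    char in[5];

    if (pipe(p) == -1)
        return -1;
    ok = write_all(p[1], "hello", 5) == 0
         && read(p[0], in, sizeof(in)) == 5
         && memcmp(in, "hello", 5) == 0;
    close(p[0]);
    close(p[1]);
    return ok;
}
```

The helper blocks until everything is written, so it suits a simple client
better than a forwarder like the one above, which must keep servicing its
other descriptors while a write is incomplete.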
.PP
On systems that do not have a \fBusleep\fP() function, you can call
\fBselect\fP() with a finite timeout and no file descriptors as
follows:
.PP
.nf
    struct timeval tv;
    tv.tv_sec = 0;
    tv.tv_usec = 200000;  /* 0.2 seconds */
    select(0, NULL, NULL, NULL, &tv);
.fi
.PP
This is only guaranteed to work on Unix systems, however.
.SH RETURN VALUE
On success, \fBselect\fP() returns the total number of file descriptors
still present in the file descriptor sets.
.PP
If \fBselect\fP() timed out, then
the return value will be zero.
The file descriptor sets should all be
empty (but may not be on some systems).
.PP
A return value of \-1 indicates an error, with \fIerrno\fP being
set appropriately. In the case of an error, the returned sets and
the timeout struct contents are undefined and should not be used.
\fBpselect\fP(), however, never modifies \fItimeout\fP.
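.PP
The zero and positive return values can be observed with a short sketch.
The names wait_readable and demo_return_value are hypothetical, and the
50 millisecond timeout is an arbitrary choice for illustration. The
timeout structure is filled in afresh on every call, since
\fBselect\fP() may modify it.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait up to msec milliseconds for fd to become
   readable and pass the return value of select() straight through. */
static int
wait_readable(int fd, long msec)
{
    fd_set rd;
    struct timeval tv;

    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    tv.tv_sec = msec / 1000;             /* set afresh for each call */
    tv.tv_usec = (msec % 1000) * 1000;
    return select(fd + 1, &rd, NULL, NULL, &tv);
}

/* Hypothetical check: an empty pipe times out, a filled one is ready. */
static int
demo_return_value(void)
{
    int p[2], timed_out, ready;

    if (pipe(p) == -1)
        return -1;
    timed_out = wait_readable(p[0], 50);   /* nothing to read yet */
    if (write(p[1], "x", 1) != 1)
        ready = -1;
    else
        ready = wait_readable(p[0], 50);   /* one descriptor is ready */
    close(p[0]);
    close(p[1]);
    return timed_out == 0 && ready == 1;
}
```

A caller that also handles signals would additionally check for \-1 with
\fIerrno\fP set to \fBEINTR\fP before looking at the sets.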
.SH NOTES
Generally speaking, all operating systems that support sockets also
support \fBselect\fP().
Many types of programs become
extremely complicated without the use of
\fBselect\fP().
\fBselect\fP() can be used to solve
many problems in a portable and efficient way that naive programmers try
to solve in a more complicated manner using
threads, forking, IPCs, signals, memory sharing, and so on.
.PP
The \fBpoll\fP(2)
system call has the same functionality as \fBselect\fP(),
and is somewhat more efficient when monitoring sparse
file descriptor sets.
It is nowadays widely available,
but historically was less portable than \fBselect\fP().
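.PP
For comparison, here is a minimal sketch of a readability check written
with \fBpoll\fP(2) instead of \fBselect\fP(). The function names
readable_with_poll and demo_poll are hypothetical. Note that
\fBpoll\fP() takes an array of descriptors and a timeout in milliseconds,
so there is no \fInfds\fP maximum to compute and no set to rebuild.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if fd is readable right now, 0 if it
   is not, and -1 on error.  A timeout of 0 makes poll() return at once. */
static int
readable_with_poll(int fd)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;         /* interested in readability only */
    pfd.revents = 0;
    if (poll(&pfd, 1, 0) == -1)
        return -1;
    return (pfd.revents & POLLIN) ? 1 : 0;
}

/* Hypothetical check using a pipe, before and after one byte is written. */
static int
demo_poll(void)
{
    int p[2], before, after;

    if (pipe(p) == -1)
        return -1;
    before = readable_with_poll(p[0]);
    if (write(p[1], "x", 1) != 1)
        after = -1;
    else
        after = readable_with_poll(p[0]);
    close(p[0]);
    close(p[1]);
    return before == 0 && after == 1;
}
```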
.PP
The Linux-specific \fBepoll\fP(7)
API provides an interface that is more efficient than
\fBselect\fP() and \fBpoll\fP(2)
when monitoring large numbers of file descriptors.
.\" This man page was written by Paul Sheer.