'\" t
.\" Don't change the first line, it tells man that we need tbl.
.\" This man page is Copyright (C) 1999 Andi Kleen <ak@muc.de>.
.\" and copyright (c) 1999 Matthew Wilcox.
.\" Permission is granted to distribute possibly modified copies
.\" of this page provided the header is included verbatim,
.\" and in case of nontrivial modification author and date
.\" of the modification is added to the header.
.\"
.\" 2002-10-30, Michael Kerrisk, <mtk-manpages@gmx.net>
.\" Added description of SO_ACCEPTCONN
.\" 2004-05-20, aeb, added SO_RCVTIMEO/SO_SNDTIMEO text.
.\" Modified, 27 May 2004, Michael Kerrisk <mtk-manpages@gmx.net>
.\" Added notes on capability requirements
.\" A few small grammar fixes
.\"
.\" FIXME probably all PF_* should be AF_* in this page, since
.\" POSIX only specifies the latter values.
.\"
.TH SOCKET 7 2004-05-27 "Linux 2.6.6" "Linux Programmer's Manual"
.SH NAME
socket \- Linux socket interface
.SH SYNOPSIS
.B #include <sys/socket.h>
.br
.IB mysocket " = socket(int " socket_family ", int " socket_type ", int " protocol );
.SH DESCRIPTION
This manual page describes the Linux networking socket layer user
interface.
The BSD-compatible sockets
are the uniform interface
between the user process and the network protocol stacks in the kernel.
The protocol modules are grouped into
.I protocol families
like
.BR PF_INET ", " PF_IPX ", " PF_PACKET
and
.I socket types
like
.B SOCK_STREAM
or
.BR SOCK_DGRAM .
See
.BR socket (2)
for more information on families and types.
.SH "SOCKET LAYER FUNCTIONS"
These functions are used by the user process to send or receive packets
and to do other socket operations.
For more information see their respective manual pages.

.BR socket (2)
creates a socket,
.BR connect (2)
connects a socket to a remote socket address,
.BR bind (2)
binds a socket to a local socket address,
.BR listen (2)
tells the socket that new connections shall be accepted, and
.BR accept (2)
is used to get a new socket with a new incoming connection.
.BR socketpair (2)
returns two connected anonymous sockets (only implemented for a few
local families like
.BR PF_UNIX ).
.PP
.BR send (2),
.BR sendto (2),
and
.BR sendmsg (2)
send data over a socket, and
.BR recv (2),
.BR recvfrom (2),
and
.BR recvmsg (2)
receive data from a socket.
.BR poll (2)
and
.BR select (2)
wait for arriving data or a readiness to send data.
In addition, the standard I/O operations like
.BR write (2),
.BR writev (2),
.BR sendfile (2),
.BR read (2),
and
.BR readv (2)
can be used to read and write data.
.PP
.BR getsockname (2)
returns the local socket address and
.BR getpeername (2)
returns the remote socket address.
.BR getsockopt (2)
and
.BR setsockopt (2)
are used to get or set socket layer or protocol options.
.BR ioctl (2)
can be used to set or read some other options.
.PP
.BR close (2)
is used to close a socket.
.BR shutdown (2)
closes parts of a full-duplex socket connection.
.PP
Seeking, or calling
.BR pread (2)
or
.BR pwrite (2)
with a non-zero position is not supported on sockets.
.PP
It is possible to do non-blocking I/O on sockets by setting the
.B O_NONBLOCK
flag on a socket file descriptor using
.BR fcntl (2).
Then all operations that would block will (usually)
return with
.B EAGAIN
(operation should be retried later);
.BR connect (2)
will return an
.B EINPROGRESS
error.
The user can then wait for various events via
.BR poll (2)
or
.BR select (2).
.TS
tab(:) allbox;
c s s
l l l.
I/O events
Event:Poll flag:Occurrence
Read:POLLIN:T{
New data arrived.
T}
Read:POLLIN:T{
A connection setup has been completed
(for connection-oriented sockets).
T}
Read:POLLHUP:T{
A disconnection request has been initiated by the other end.
T}
Read:POLLHUP:T{
A connection is broken (only for connection-oriented protocols).
When the socket is written to,
.B SIGPIPE
is also sent.
T}
Write:POLLOUT:T{
Socket has enough send buffer space for writing new data.
T}
Read/Write:T{
POLLIN|
.br
POLLOUT
T}:T{
An outgoing
.BR connect (2)
finished.
T}
Read/Write:POLLERR:An asynchronous error occurred.
Read/Write:POLLHUP:The other end has shut down one direction.
Exception:POLLPRI:T{
Urgent data arrived.
.B SIGURG
is then sent.
T}
.\" FIXME The following is not true currently:
.\" It is no I/O event when the connection
.\" is broken from the local end using
.\" .BR shutdown (2)
.\" or
.\" .BR close (2).
.TE
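.PP
The non-blocking behaviour and the POLLIN event described above can be sketched in C roughly as follows.
This is only an illustration under stated assumptions: the function name nonblocking_demo is hypothetical, and a PF_UNIX socket pair stands in for a real network connection.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a non-blocking read on an empty
   socket fails with EAGAIN (or EWOULDBLOCK) and poll() then reports
   POLLIN once the peer has written a byte; returns 0 otherwise. */
int nonblocking_demo(void)
{
    int sv[2];
    int flags;
    int ok = 0;
    char buf[16];
    struct pollfd pfd;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 0;

    /* Add O_NONBLOCK to the existing file status flags. */
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1)
        goto out;

    /* Nothing has been written yet, so this read would block. */
    if (read(sv[0], buf, sizeof(buf)) != -1 ||
        (errno != EAGAIN && errno != EWOULDBLOCK))
        goto out;

    if (write(sv[1], "x", 1) != 1)
        goto out;

    /* Wait up to one second for the data to become readable. */
    pfd.fd = sv[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLIN))
        ok = (read(sv[0], buf, sizeof(buf)) == 1);

out:
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

A real program would normally poll several descriptors in a loop and retry after each EAGAIN rather than perform a single check.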
.PP
An alternative to
.BR poll ()
and
.BR select ()
is to let the kernel inform the application about events
via a
.B SIGIO
signal.
For that, the
.B FASYNC
flag must be set on a socket file descriptor via
.BR fcntl (2)
and a valid signal handler for
.B SIGIO
must be installed via
.BR sigaction (2).
See the
.I SIGNALS
discussion below.
.SH "SOCKET OPTIONS"
These socket options can be set by using
.BR setsockopt (2)
and read with
.BR getsockopt (2)
with the socket level set to
.B SOL_SOCKET
for all sockets:
.\" SO_ACCEPTCONN is in POSIX.1-2001, and its origin is explained in
.\" W R Stevens, UNPv1
.TP
.B SO_ACCEPTCONN
Returns a value indicating whether or not this socket has been marked
to accept connections with
.BR listen ().
The value 0 indicates that this is not a listening socket;
the value 1 indicates that this is a listening socket.
Can only be read
with
.BR getsockopt ().
.TP
.B SO_BINDTODEVICE
Bind this socket to a particular device like \(lqeth0\(rq,
as specified in the passed interface name.
If the
name is an empty string or the option length is zero, the socket device
binding is removed.
The passed option is a variable-length null-terminated
interface name string with the maximum size of
.BR IFNAMSIZ .
If a socket is bound to an interface,
only packets received from that particular interface are processed by the
socket.
Note that this only works for some socket types, particularly
.B AF_INET
sockets.
It is not supported for packet sockets (use normal
.BR bind (2)
there).
.TP
.B SO_BROADCAST
Set or get the broadcast flag.
When enabled, datagram sockets
receive packets sent to a broadcast address and they are allowed to send
packets to a broadcast address.
This option has no effect on stream-oriented sockets.
.TP
.B SO_BSDCOMPAT
Enable BSD bug-to-bug compatibility.
This is used by the UDP protocol module in Linux 2.0 and 2.2.
If enabled, ICMP errors received for a UDP socket will not be passed
to the user program.
In later kernel versions, support for this option has been phased out:
Linux 2.4 silently ignores it, and Linux 2.6 generates a kernel warning
(printk()) if a program uses this option.
Linux 2.0 also enabled BSD bug-to-bug compatibility
options (random header changing, skipping of the broadcast flag) for raw
sockets with this option, but that was removed in Linux 2.2.
.TP
.B SO_DEBUG
Enable socket debugging.
Only allowed for processes with the
.B CAP_NET_ADMIN
capability or an effective user ID of 0.
.TP
.B SO_ERROR
Get and clear the pending socket error.
Only valid as a
.BR getsockopt ().
Expects an integer.
.TP
.B SO_DONTROUTE
Don't send via a gateway, only send to directly connected hosts.
The same effect can be achieved by setting the
.B MSG_DONTROUTE
flag on a socket
.BR send (2)
operation.
Expects an integer boolean flag.
.TP
.B SO_KEEPALIVE
Enable sending of keep-alive messages on connection-oriented sockets.
Expects an integer boolean flag.
.TP
.B SO_LINGER
Sets or gets the
.B SO_LINGER
option.
The argument is a
.I linger
structure.
.sp
.in +0.25i
.nf
struct linger {
    int l_onoff;    /* linger active */
    int l_linger;   /* how many seconds to linger for */
};
.fi
.in -0.25i
.IP
When enabled, a
.BR close (2)
or
.BR shutdown (2)
will not return until all queued messages for the socket have been
successfully sent or the linger timeout has been reached.
Otherwise,
the call returns immediately and the closing is done in the background.
When the socket is closed as part of
.BR exit (2),
it always lingers in the background.
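.IP
As a minimal sketch of how the linger structure might be passed to the kernel, the following C function enables lingering on a fresh TCP socket and reads the setting back.
The function name linger_roundtrip is hypothetical and the error handling is deliberately simplified.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: enables SO_LINGER with the given timeout on a new
   TCP socket and returns the l_linger value that getsockopt() reports,
   or -1 on any error. */
int linger_roundtrip(int seconds)
{
    struct linger in;
    struct linger out;
    socklen_t len = sizeof(out);
    int result = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    in.l_onoff = 1;         /* linger active */
    in.l_linger = seconds;  /* how many seconds to linger for */

    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &in, sizeof(in)) == 0 &&
        getsockopt(fd, SOL_SOCKET, SO_LINGER, &out, &len) == 0 &&
        out.l_onoff != 0)
        result = out.l_linger;

    close(fd);
    return result;
}
```

With lingering enabled this way, a later close on a connected socket with unsent data would wait up to the given number of seconds, as described above.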
.TP
.B SO_OOBINLINE
If this option is enabled,
out-of-band data is directly placed into the receive data stream.
Otherwise, out-of-band data is only passed when the
.B MSG_OOB
flag is set during receiving.
.\" don't document it because it can do too much harm.
.\".B SO_NO_CHECK
.TP
.B SO_PASSCRED
Enable or disable the receiving of the
.B SCM_CREDENTIALS
control message.
For more information see
.BR unix (7).
.TP
.B SO_PEERCRED
Return the credentials of the foreign process connected to this socket.
This is only possible for connected
.B PF_UNIX
stream sockets and
.B PF_UNIX
stream and datagram socket pairs created using
.BR socketpair (2);
see
.BR unix (7).
The returned credentials are those that were in effect at the time
of the call to
.BR connect (2)
or
.BR socketpair (2).
Argument is a
.I ucred
structure.
Only valid as a
.BR getsockopt ().
.TP
.B SO_PRIORITY
Set the protocol-defined priority for all packets to be sent on
this socket.
Linux uses this value to order the networking queues:
packets with a higher priority may be processed first depending
on the selected device queueing discipline.
For
.BR ip (7),
this also sets the IP type-of-service (TOS) field for outgoing packets.
Setting a priority outside the range 0 to 6 requires the
.B CAP_NET_ADMIN
capability.
.TP
.B SO_RCVBUF
Sets or gets the maximum socket receive buffer in bytes.
The kernel doubles this value (to allow space for bookkeeping overhead)
when it is set using
.\" Most (all?) other implementations do not do this -- MTK, Dec 05
.BR setsockopt (),
and this doubled value is returned by
.BR getsockopt ().
The default value is set by the
.B rmem_default
sysctl and the maximum allowed value is set by the
.B rmem_max
sysctl.
The minimum (doubled) value for this option is 256.
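.IP
The doubling can be observed with a short C sketch such as the one below, which requests a receive buffer size and returns whatever the kernel then reports.
The function name rcvbuf_after_request is hypothetical, and the exact value returned depends on the kernel and on the rmem_max sysctl.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: asks for a receive buffer of `requested` bytes on
   a new UDP socket and returns the size reported by getsockopt(), or -1
   on error.  On Linux the reported value is expected to be about twice
   the requested one, subject to the rmem_max limit. */
int rcvbuf_after_request(int requested)
{
    int actual = -1;
    socklen_t len = sizeof(actual);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd == -1)
        return -1;

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) == -1 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) == -1)
        actual = -1;

    close(fd);
    return actual;
}
```

Comparing the return value with the requested size shows whether the running system applies the doubling described above.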
.TP
.BR SO_RCVBUFFORCE " (since Linux 2.6.14)"
Using this socket option, a privileged
.RB ( CAP_NET_ADMIN )
process can perform the same task as
.BR SO_RCVBUF ,
but the
.B rmem_max
limit can be overridden.
.TP
.BR SO_RCVLOWAT " and " SO_SNDLOWAT
Specify the minimum number of bytes in the buffer until the socket layer
will pass the data to the protocol
.RB ( SO_SNDLOWAT )
or the user on receiving
.RB ( SO_RCVLOWAT ).
These two values are initialised to 1.
.B SO_SNDLOWAT
is not changeable on Linux
.RB ( setsockopt ()
fails with the error
.BR ENOPROTOOPT ).
.B SO_RCVLOWAT
is changeable
only since Linux 2.4.
The
.BR select (2)
and
.BR poll (2)
system calls currently do not respect the
.B SO_RCVLOWAT
setting on Linux,
and mark a socket readable when even a single byte of data is available.
A subsequent read from the socket will block until
.B SO_RCVLOWAT
bytes are available.
.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=111049368106984&w=2
.\" Tested on kernel 2.6.14 -- mtk, 30 Nov 05
.TP
.BR SO_RCVTIMEO " and " SO_SNDTIMEO
.\" Not implemented in 2.0.
.\" Implemented in 2.1.11 for getsockopt: always return a zero struct.
.\" Implemented in 2.3.41 for setsockopt, and actually used.
Specify the receiving or sending timeouts until reporting an error.
The parameter is a
.IR "struct timeval" .
If an input or output function blocks for this period of time, and
data has been sent or received, the return value of that function
will be the amount of data transferred; if no data has been transferred
and the timeout has been reached then \-1 is returned with
.I errno
set to
.B EAGAIN
or
.B EWOULDBLOCK
.\" in fact to EAGAIN
just as if the socket was specified to be nonblocking.
If the timeout is set to zero (the default)
then the operation will never time out.
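.IP
A receive timeout might be exercised as in the following C sketch, which sets SO_RCVTIMEO on one end of an idle socket pair and checks how recv fails.
The function name recv_times_out is hypothetical, and the argument should be greater than zero, since a zero timeout means the call never times out.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if recv() on an idle socket gives up
   after roughly timeout_ms milliseconds with EAGAIN or EWOULDBLOCK,
   and 0 otherwise. */
int recv_times_out(long timeout_ms)
{
    int sv[2];
    char buf[8];
    struct timeval tv;
    int timed_out = 0;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return 0;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0) {
        /* No data is ever sent, so this call can only end by timing out. */
        ssize_t n = recv(sv[0], buf, sizeof(buf), 0);
        timed_out = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    }

    close(sv[0]);
    close(sv[1]);
    return timed_out;
}
```

The same pattern would apply to SO_SNDTIMEO with a sending call in place of recv.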
.TP
.B SO_REUSEADDR
Indicates that the rules used in validating addresses supplied in a
.BR bind (2)
call should allow reuse of local addresses.
For
.B PF_INET
sockets this
means that a socket may bind, except when there
is an active listening socket bound to the address.
When the listening socket is bound to
.B INADDR_ANY
with a specific port then it is not possible
to bind to this port for any local address.
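.IP
A typical server start-up sequence using this option could look like the C sketch below.
The function name bind_reusable is hypothetical, the loopback address is chosen only for illustration, and AF_INET is used where this page writes PF_INET (the two have the same value).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: creates a TCP socket, enables SO_REUSEADDR, and
   binds it to the loopback address on the given port (0 lets the kernel
   choose a port).  Returns the bound descriptor, or -1 on error. */
int bind_reusable(unsigned short port)
{
    int on = 1;
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* The option has to be set before bind(2) to affect the bind. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The caller owns the returned descriptor and would normally go on to call listen on it.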
.TP
.B SO_SNDBUF
Sets or gets the maximum socket send buffer in bytes.
The kernel doubles this value (to allow space for bookkeeping overhead)
when it is set using
.\" Most (all?) other implementations do not do this -- MTK, Dec 05
.BR setsockopt (),
and this doubled value is returned by
.BR getsockopt ().
The default value is set by the
.B wmem_default
sysctl and the maximum allowed value is set by the
.B wmem_max
sysctl.
The minimum (doubled) value for this option is 2048.
.TP
.BR SO_SNDBUFFORCE " (since Linux 2.6.14)"
Using this socket option, a privileged
.RB ( CAP_NET_ADMIN )
process can perform the same task as
.BR SO_SNDBUF ,
but the
.B wmem_max
limit can be overridden.
.TP
.B SO_TIMESTAMP
Enable or disable the receiving of the
.B SO_TIMESTAMP
control message.
The timestamp control message is sent with level
.B SOL_SOCKET
and the
.I cmsg_data
field is a
.I "struct timeval"
indicating the
reception time of the last packet passed to the user in this call.
See
.BR cmsg (3)
for details on control messages.
.TP
.B SO_TYPE
Gets the socket type as an integer (like
.BR SOCK_STREAM ).
Can only be read
with
.BR getsockopt ().
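.PP
The read-only options in the list above (SO_TYPE, SO_ERROR and SO_ACCEPTCONN) all return an integer, so they can be queried through one small helper.
The following C sketch is an illustration only; the names get_int_option and readonly_options_demo are hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: reads an integer-valued SOL_SOCKET option and
   returns its value, or -1 if getsockopt() fails. */
static int get_int_option(int fd, int optname)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_SOCKET, optname, &value, &len) == -1)
        return -1;
    return value;
}

/* Returns 1 if a new TCP socket reports SOCK_STREAM as its type, has no
   pending error, and SO_ACCEPTCONN goes from 0 to 1 once listen(2) has
   been called; returns 0 otherwise. */
int readonly_options_demo(void)
{
    int ok;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return 0;

    ok = get_int_option(fd, SO_TYPE) == SOCK_STREAM
        && get_int_option(fd, SO_ERROR) == 0
        && get_int_option(fd, SO_ACCEPTCONN) == 0
        && listen(fd, 1) == 0
        && get_int_option(fd, SO_ACCEPTCONN) == 1;

    close(fd);
    return ok;
}
```

Note that reading SO_ERROR also clears the pending error, so a program should store the value rather than query it twice.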
.SH SIGNALS
When writing onto a connection-oriented socket that has been shut down
(by the local or the remote end),
.B SIGPIPE
is sent to the writing process and
.B EPIPE
is returned.
The signal is not sent when the write call
specified the
.B MSG_NOSIGNAL
flag.
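.PP
The effect of MSG_NOSIGNAL can be sketched in C as follows, using a PF_UNIX stream socket pair whose other end has already been closed.
The function name send_without_sigpipe is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if sending on a socket whose peer has
   been closed fails with EPIPE.  Because MSG_NOSIGNAL is passed, no
   SIGPIPE is raised and the function is able to return at all. */
int send_without_sigpipe(void)
{
    int sv[2];
    int got_epipe;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 0;

    close(sv[1]);   /* the remote end goes away */

    got_epipe = (send(sv[0], "x", 1, MSG_NOSIGNAL) == -1 &&
                 errno == EPIPE);

    close(sv[0]);
    return got_epipe;
}
```

Without the flag, the default action of SIGPIPE would terminate the process unless the signal were ignored or handled.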
.PP
When requested with the
.B FIOSETOWN
.BR fcntl ()
or
.B SIOCSPGRP
.BR ioctl (),
.B SIGIO
is sent when an I/O event occurs.
It is possible to use
.BR poll (2)
or
.BR select (2)
in the signal handler to find out which socket the event occurred on.
An alternative (in Linux 2.2) is to set a real-time signal using the
.B F_SETSIG
.BR fcntl ();
the handler of the real-time signal will be called with
the file descriptor in the
.I si_fd
field of its
.IR siginfo_t .
See
.BR fcntl (2)
for more information.
.PP
Under some circumstances (e.g., multiple processes accessing a
single socket), the condition that caused the
.B SIGIO
may have already disappeared when the process reacts to the signal.
If this happens, the process should wait again because Linux
will resend the signal later.
.\" .SH ANCILLARY MESSAGES
.SH SYSCTLS
The core socket networking sysctls can be accessed using the
.I /proc/sys/net/core/*
files or with the
.BR sysctl (2)
interface.
.TP
.B rmem_default
contains the default setting in bytes of the socket receive buffer.
.TP
.B rmem_max
contains the maximum socket receive buffer size in bytes which a user may
set by using the
.B SO_RCVBUF
socket option.
.TP
.B wmem_default
contains the default setting in bytes of the socket send buffer.
.TP
.B wmem_max
contains the maximum socket send buffer size in bytes which a user may
set by using the
.B SO_SNDBUF
socket option.
.TP
.BR message_cost " and " message_burst
configure the token bucket filter used to load limit warning messages
caused by external network events.
.TP
.B netdev_max_backlog
Maximum number of packets in the global input queue.
.TP
.B optmem_max
Maximum length of ancillary data and user control data like the iovecs
per socket.
.\" netdev_fastroute is not documented because it is experimental
.SH IOCTLS
These operations can be accessed using
.BR ioctl (2):

.in +0.25i
.nf
.IB error " = ioctl(" ip_socket ", " ioctl_type ", " &value_result ");"
.fi
.in -0.25i
.TP
.B SIOCGSTAMP
Return a
.I "struct timeval"
with the receive timestamp of the last packet passed to the user.
This is useful for accurate round trip time measurements.
See
.BR setitimer (2)
for a description of
.IR "struct timeval" .
.\"
This ioctl should only be used if the socket option
.B SO_TIMESTAMP
is not set on the socket.
Otherwise, it returns the timestamp of the
last packet that was received while
.B SO_TIMESTAMP
was not set, or it fails if no such packet has been received
(i.e.,
.BR ioctl ()
returns \-1 with
.I errno
set to
.BR ENOENT ).
.TP
.B SIOCSPGRP
Set the process or process group to send
.B SIGIO
or
.B SIGURG
signals
to when an
asynchronous I/O operation has finished or urgent data is available.
The argument is a pointer to a
.IR pid_t .
If the argument is positive, send the signals to that process.
If the
argument is negative, send the signals to the process group with the ID
of the absolute value of the argument.
The process may only choose itself or its own process group to receive
signals unless it has the
.B CAP_KILL
capability or an effective UID of 0.
.TP
.B FIOASYNC
Change the
.B O_ASYNC
flag to enable or disable asynchronous I/O mode of the socket.
Asynchronous I/O mode means that the
.B SIGIO
signal or the signal set with
.B F_SETSIG
is raised when a new I/O event occurs.
.IP
Argument is an integer boolean flag.
.\"
.TP
.B SIOCGPGRP
Get the current process or process group that receives
.B SIGIO
or
.B SIGURG
signals,
or 0
when none is set.
.PP
Valid
.BR fcntl ()
operations:
.TP
.B FIOGETOWN
The same as the
.B SIOCGPGRP
.BR ioctl ().
.TP
.B FIOSETOWN
The same as the
.B SIOCSPGRP
.BR ioctl ().
.SH NOTES
Linux assumes that half of the send/receive buffer is used for internal
kernel structures; thus the sysctls are twice what can be observed
on the wire.

Linux will only allow port re-use with the
.B SO_REUSEADDR
option
when this option was set both in the previous program that performed a
.BR bind ()
to the port and in the program that wants to re-use the port.
This differs from some implementations (e.g., FreeBSD)
where only the later program needs to set the
.B SO_REUSEADDR
option.
Typically this difference is invisible, since, for example, a server
program is designed to always set this option.
.SH BUGS
The
.B CONFIG_FILTER
socket options
.B SO_ATTACH_FILTER
and
.B SO_DETACH_FILTER
are
not documented.
The suggested interface to use them is via the libpcap
library.
.SH VERSIONS
.B SO_BINDTODEVICE
was introduced in Linux 2.0.30.
.B SO_PASSCRED
is new in Linux 2.2.
The sysctls are new in Linux 2.2.
.B SO_RCVTIMEO
and
.B SO_SNDTIMEO
have been supported since Linux 2.3.41.
Earlier, timeouts were fixed to
a protocol-specific setting, and could not be read or written.
.\" .SH AUTHORS
.\" This man page was written by Andi Kleen.
.SH "SEE ALSO"
.BR getsockopt (2),
.BR setsockopt (2),
.BR socket (2),
.BR capabilities (7),
.BR ddp (7),
.BR ip (7),
.BR packet (7)
699 .BR ddp (7),
700 .BR ip (7),
701 .BR packet (7)