1 '\" t
.\" Don't change the line above; it tells man that tbl is needed.
3 .\" This man page is Copyright (C) 1999 Andi Kleen <ak@muc.de>.
4 .\" Permission is granted to distribute possibly modified copies
5 .\" of this page provided the header is included verbatim,
6 .\" and in case of nontrivial modification author and date
7 .\" of the modification is added to the header.
8 .\" $Id: ip.7,v 1.19 2000/12/20 18:10:31 ak Exp $
9 .\"
10 .\" FIXME: Document IP_ORIGDSTADDR+IP_RECVORIGDSTADDR, added in Linux 2.6.29
11 .\" FIXME: Document IP_MINTTL, added in Linux 2.6.34
12 .\"
13 .TH IP 7 2010-09-11 "Linux" "Linux Programmer's Manual"
14 .SH NAME
15 ip \- Linux IPv4 protocol implementation
16 .SH SYNOPSIS
17 .B #include <sys/socket.h>
18 .br
19 .\" .B #include <net/netinet.h> -- does not exist anymore
20 .\" .B #include <linux/errqueue.h> -- never include <linux/foo.h>
21 .B #include <netinet/in.h>
22 .br
23 .B #include <netinet/ip.h> \fR/* superset of previous */
24 .sp
25 .IB tcp_socket " = socket(AF_INET, SOCK_STREAM, 0);"
26 .br
27 .IB udp_socket " = socket(AF_INET, SOCK_DGRAM, 0);"
28 .br
29 .IB raw_socket " = socket(AF_INET, SOCK_RAW, " protocol ");"
30 .SH DESCRIPTION
31 Linux implements the Internet Protocol, version 4,
32 described in RFC\ 791 and RFC\ 1122.
33 .B ip
34 contains a level 2 multicasting implementation conforming to RFC\ 1112.
35 It also contains an IP router including a packet filter.
36 .\" FIXME has someone verified that 2.1 is really 1812 compliant?
37 .PP
38 The programming interface is BSD-sockets compatible.
39 For more information on sockets, see
40 .BR socket (7).
41 .PP
42 An IP socket is created by calling the
43 .BR socket (2)
44 function as
45 .BR "socket(AF_INET, socket_type, protocol)" .
46 Valid socket types are
47 .B SOCK_STREAM
48 to open a
49 .BR tcp (7)
50 socket,
51 .B SOCK_DGRAM
52 to open a
53 .BR udp (7)
54 socket, or
55 .B SOCK_RAW
56 to open a
57 .BR raw (7)
58 socket to access the IP protocol directly.
59 .I protocol
60 is the IP protocol in the IP header to be received or sent.
61 The only valid values for
62 .I protocol
63 are 0 and
64 .B IPPROTO_TCP
65 for TCP sockets, and 0 and
66 .B IPPROTO_UDP
67 for UDP sockets.
68 For
69 .B SOCK_RAW
70 you may specify a valid IANA IP protocol defined in
71 RFC\ 1700 assigned numbers.
72 .PP
73 .\" FIXME ip current does an autobind in listen, but I'm not sure
74 .\" if that should be documented.
75 When a process wants to receive new incoming packets or connections, it
76 should bind a socket to a local interface address using
77 .BR bind (2).
78 Only one IP socket may be bound to any given local (address, port) pair.
79 When
80 .B INADDR_ANY
81 is specified in the bind call, the socket will be bound to
82 .I all
83 local interfaces.
84 When
85 .BR listen (2)
86 or
87 .BR connect (2)
is called on an unbound socket, it is automatically bound to a
89 random free port with the local address set to
90 .BR INADDR_ANY .
91
92 A TCP local socket address that has been bound is unavailable for
93 some time after closing, unless the
94 .B SO_REUSEADDR
95 flag has been set.
96 Care should be taken when using this flag as it makes TCP less reliable.
97 .SS Address Format
98 An IP socket address is defined as a combination of an IP interface
99 address and a 16-bit port number.
The basic IP protocol does not supply port numbers; they
are implemented by higher-level protocols like
102 .BR udp (7)
103 and
104 .BR tcp (7).
105 On raw sockets
106 .I sin_port
107 is set to the IP protocol.
108 .PP
109 .in +4n
110 .nf
struct sockaddr_in {
    sa_family_t    sin_family; /* address family: AF_INET */
    in_port_t      sin_port;   /* port in network byte order */
    struct in_addr sin_addr;   /* internet address */
};

/* Internet address. */
struct in_addr {
    uint32_t       s_addr;     /* address in network byte order */
};
121 .fi
122 .in
123 .PP
124 .I sin_family
125 is always set to
126 .BR AF_INET .
127 This is required; in Linux 2.2 most networking functions return
128 .B EINVAL
129 when this setting is missing.
130 .I sin_port
131 contains the port in network byte order.
132 The port numbers below 1024 are called
133 .IR "privileged ports"
134 (or sometimes:
135 .IR "reserved ports" ).
136 Only privileged processes (i.e., those having the
137 .B CAP_NET_BIND_SERVICE
138 capability) may
139 .BR bind (2)
140 to these sockets.
Note that the raw IPv4 protocol as such has no concept of a
port; ports are implemented only by higher protocols like
143 .BR tcp (7)
144 and
145 .BR udp (7).
146 .PP
147 .I sin_addr
148 is the IP host address.
149 The
150 .I s_addr
151 member of
152 .I struct in_addr
153 contains the host interface address in network byte order.
154 .I in_addr
155 should be assigned one of the INADDR_* values (e.g.,
156 .BR INADDR_ANY )
157 or set using the
158 .BR inet_aton (3),
159 .BR inet_addr (3),
160 .BR inet_makeaddr (3)
161 library functions or directly with the name resolver (see
162 .BR gethostbyname (3)).
163
164 IPv4 addresses are divided into unicast, broadcast
165 and multicast addresses.
166 Unicast addresses specify a single interface of a host,
167 broadcast addresses specify all hosts on a network and multicast
168 addresses address all hosts in a multicast group.
Datagrams to broadcast addresses can only be sent or received when the
170 .B SO_BROADCAST
171 socket flag is set.
172 In the current implementation, connection-oriented sockets are only allowed
173 to use unicast addresses.
174 .\" Leave a loophole for XTP @)
175
176 Note that the address and the port are always stored in
177 network byte order.
178 In particular, this means that you need to call
179 .BR htons (3)
180 on the number that is assigned to a port.
181 All address/port manipulation
182 functions in the standard library work in network byte order.
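.PP
The byte-order rules above can be sketched as follows.
This is a minimal illustration, not a definitive implementation;
the helper names make_ipv4_addr and bind_any are hypothetical and
do not belong to any library.
.PP
.in +4n
.nf
```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *sa from a dotted-quad string and a
   port given in host byte order.
   Returns 0 on success, -1 if the string is not a valid address. */
int make_ipv4_addr(const char *dotted, unsigned short port,
                   struct sockaddr_in *sa)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;        /* always required */
    sa->sin_port = htons(port);      /* host to network byte order */
    if (inet_aton(dotted, &sa->sin_addr) == 0)
        return -1;
    return 0;
}

/* Hypothetical helper: bind fd to a port on all local interfaces.
   A port of 0 asks the kernel to pick a free port. */
int bind_any(int fd, unsigned short port)
{
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    return bind(fd, (struct sockaddr *) &sa, sizeof(sa));
}
```
.fi
.in
.PP
Both helpers take the port in host byte order and convert it once,
which keeps the conversion in a single place.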
183
184 There are several special addresses:
185 .B INADDR_LOOPBACK
186 (127.0.0.1)
187 always refers to the local host via the loopback device;
188 .B INADDR_ANY
189 (0.0.0.0)
190 means any address for binding;
191 .B INADDR_BROADCAST
192 (255.255.255.255)
193 means any host and has the same effect on bind as
194 .B INADDR_ANY
195 for historical reasons.
196 .SS Socket Options
197 IP supports some protocol-specific socket options that can be set with
198 .BR setsockopt (2)
199 and read with
200 .BR getsockopt (2).
201 The socket option level for IP is
202 .BR IPPROTO_IP .
203 .\" or SOL_IP on Linux
204 A boolean integer flag is zero when it is false, otherwise true.
205 .TP
206 .BR IP_ADD_MEMBERSHIP " (since Linux 1.2)"
207 Join a multicast group.
208 Argument is an
209 .I ip_mreqn
210 structure.
211 .sp
212 .in +4n
213 .nf
struct ip_mreqn {
    struct in_addr imr_multiaddr; /* IP multicast group
                                     address */
    struct in_addr imr_address;   /* IP address of local
                                     interface */
    int            imr_ifindex;   /* interface index */
};
221 .fi
222 .in
223 .sp
224 .I imr_multiaddr
225 contains the address of the multicast group the application
226 wants to join or leave.
227 It must be a valid multicast address
228 .\" (i.e., within the 224.0.0.0-239.255.255.255 range)
229 (or
230 .BR setsockopt (2)
231 fails with the error
232 .BR EINVAL ).
233 .I imr_address
234 is the address of the local interface with which the system
235 should join the multicast group; if it is equal to
236 .B INADDR_ANY
237 an appropriate interface is chosen by the system.
238 .I imr_ifindex
239 is the interface index of the interface that should join/leave the
240 .I imr_multiaddr
241 group, or 0 to indicate any interface.
242 .IP
243 The
244 .I ip_mreqn
245 structure is available only since Linux 2.2.
246 For compatibility, the old
247 .I ip_mreq
248 structure (present since Linux 1.2) is still supported;
249 it differs from
250 .I ip_mreqn
251 only by not including the
252 .I imr_ifindex
253 field.
254 Only valid as a
255 .BR setsockopt (2).
256 .\"
257 .TP
258 .BR IP_DROP_MEMBERSHIP " (since Linux 1.2)"
259 Leave a multicast group.
260 Argument is an
261 .I ip_mreqn
262 or
263 .I ip_mreq
264 structure similar to
265 .BR IP_ADD_MEMBERSHIP .
266 .TP
267 .BR IP_HDRINCL " (since Linux 2.0)"
268 If enabled,
269 the user supplies an IP header in front of the user data.
270 Only valid for
271 .B SOCK_RAW
272 sockets.
273 See
274 .BR raw (7)
275 for more information.
276 When this flag is enabled the values set by
277 .BR IP_OPTIONS ,
278 .B IP_TTL
279 and
280 .B IP_TOS
281 are ignored.
282 .TP
283 .BR IP_FREEBIND " (since Linux 2.4)"
284 .\" Precisely: 2.4.0-test10
285 If enabled, this boolean option allows binding to an IP address
286 that is nonlocal or does not (yet) exist.
287 This permits listening on a socket,
288 without requiring the underlying network interface or the
289 specified dynamic IP address to be up at the time that
290 the application is trying to bind to it.
291 This option is the per-socket equivalent of the
292 .IR ip_nonlocal_bind
293 .I /proc
294 interface described below.
295 .\"
296 .\" FIXME Document IP_IPSEC_POLICY
297 .\" Since Linux 2.5.47
298 .\" Needs CAP_NET_ADMIN
299 .TP
300 .BR IP_MTU " (since Linux 2.2)"
301 .\" Precisely: 2.1.124
302 Retrieve the current known path MTU of the current socket.
303 Only valid when the socket has been connected.
304 Returns an integer.
305 Only valid as a
306 .BR getsockopt (2).
307 .TP
308 .BR IP_MTU_DISCOVER " (since Linux 2.2)"
309 .\" Precisely: 2.1.124
310 Set or receive the Path MTU Discovery setting for a socket.
311 When enabled, Linux will perform Path MTU Discovery
312 as defined in RFC\ 1191
313 on this socket.
314 The don't-fragment flag is set on all outgoing datagrams.
315 The system-wide default is controlled by the
316 .I /proc/sys/net/ipv4/ip_no_pmtu_disc
317 file for
318 .B SOCK_STREAM
319 sockets, and disabled on all others.
320 For
321 .RB non- SOCK_STREAM
322 sockets, it is the user's responsibility to packetize the data
323 in MTU sized chunks and to do the retransmits if necessary.
The kernel will reject packets that are bigger than the known
path MTU if this flag is set (with
.BR EMSGSIZE ).
328 .TS
329 tab(:);
330 c l
331 l l.
332 Path MTU discovery flags:Meaning
333 IP_PMTUDISC_WANT:Use per-route settings.
334 IP_PMTUDISC_DONT:Never do Path MTU Discovery.
335 IP_PMTUDISC_DO:Always do Path MTU Discovery.
336 IP_PMTUDISC_PROBE:Set DF but ignore Path MTU.
337 .TE
338
339 When PMTU discovery is enabled, the kernel automatically keeps track of
340 the path MTU per destination host.
341 When it is connected to a specific peer with
342 .BR connect (2),
343 the currently known path MTU can be retrieved conveniently using the
344 .B IP_MTU
345 socket option (e.g., after a
346 .B EMSGSIZE
347 error occurred).
348 It may change over time.
349 For connectionless sockets with many destinations,
350 the new MTU for a given destination can also be accessed using the
351 error queue (see
352 .BR IP_RECVERR ).
353 A new error will be queued for every incoming MTU update.
354
355 While MTU discovery is in progress, initial packets from datagram sockets
356 may be dropped.
357 Applications using UDP should be aware of this and not
358 take it into account for their packet retransmit strategy.
359
360 To bootstrap the path MTU discovery process on unconnected sockets, it
361 is possible to start with a big datagram size
362 (up to 64K-headers bytes long) and let it shrink by updates of the path MTU.
363 .\" FIXME this is an ugly hack
364
365 To get an initial estimate of the
366 path MTU, connect a datagram socket to the destination address using
367 .BR connect (2)
368 and retrieve the MTU by calling
369 .BR getsockopt (2)
370 with the
371 .B IP_MTU
372 option.
373
374 It is possible to implement RFC 4821 MTU probing with
375 .B SOCK_DGRAM
376 or
377 .B SOCK_RAW
378 sockets by setting a value of
379 .BR IP_PMTUDISC_PROBE
380 (available since Linux 2.6.22).
381 This is also particularly useful for diagnostic tools such as
382 .BR tracepath (8)
383 that wish to deliberately send probe packets larger than
384 the observed Path MTU.
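.IP
The interplay of
.B IP_MTU_DISCOVER
and
.B IP_MTU
can be sketched as follows.
The helper names set_pmtu_mode and get_path_mtu are hypothetical;
this is an illustration under those assumptions rather than a
recommended interface.
.IP
.in +4n
.nf
```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: select a Path MTU Discovery mode,
   one of the IP_PMTUDISC_* values from the table above. */
int set_pmtu_mode(int fd, int mode)
{
    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER,
                      &mode, sizeof(mode));
}

/* Hypothetical helper: return the currently known path MTU of a
   connected socket, or -1 on failure (for example, when the
   socket has not been connected). */
int get_path_mtu(int fd)
{
    int mtu = 0;
    socklen_t len = sizeof(mtu);

    if (getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) == -1)
        return -1;
    return mtu;
}
```
.fi
.in
.IP
A datagram sender would typically call get_path_mtu after
.BR connect (2),
and again after a send fails with
.BR EMSGSIZE ,
since the value may change over time.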
385 .TP
386 .BR IP_MULTICAST_IF " (since Linux 1.2)"
387 Set the local device for a multicast socket.
388 Argument is an
389 .I ip_mreqn
390 or
391 .I ip_mreq
392 structure similar to
393 .BR IP_ADD_MEMBERSHIP .
394 .IP
395 When an invalid socket option is passed,
396 .B ENOPROTOOPT
397 is returned.
398 .TP
399 .BR IP_MULTICAST_LOOP " (since Linux 1.2)"
400 Set or read a boolean integer argument that determines whether
401 sent multicast packets should be looped back to the local sockets.
402 .TP
403 .BR IP_MULTICAST_TTL " (since Linux 1.2)"
404 Set or read the time-to-live value of outgoing multicast packets for this
405 socket.
406 It is very important for multicast packets to set the smallest TTL possible.
407 The default is 1 which means that multicast packets don't leave the local
408 network unless the user program explicitly requests it.
409 Argument is an integer.
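.IP
As a small sketch, the multicast TTL can be read and changed as
shown below.
The helper names get_mcast_ttl and set_mcast_ttl are hypothetical.
.IP
.in +4n
.nf
```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: read the multicast TTL of a socket.
   Returns the TTL, or -1 on failure. */
int get_mcast_ttl(int fd)
{
    int ttl = 0;
    socklen_t len = sizeof(ttl);

    if (getsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, &len) == -1)
        return -1;
    return ttl;
}

/* Hypothetical helper: set the multicast TTL of a socket.
   Keep the value as small as the application allows. */
int set_mcast_ttl(int fd, int ttl)
{
    return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL,
                      &ttl, sizeof(ttl));
}
```
.fi
.in
.IP
On a freshly created socket, get_mcast_ttl is expected to return
the default of 1 described above.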
410 .TP
411 .BR IP_NODEFRAG " (since Linux 2.6.36)"
412 If enabled (argument is nonzero),
413 the reassembly of outgoing packets is disabled in the netfilter layer.
414 This option is only valid for
415 .B SOCK_RAW
416 sockets.
417 The argument is an integer.
418 .TP
419 .BR IP_OPTIONS " (since Linux 2.0)"
420 .\" Precisely: 1.3.30
421 Set or get the IP options to be sent with every packet from this socket.
422 The arguments are a pointer to a memory buffer containing the options
423 and the option length.
424 The
425 .BR setsockopt (2)
426 call sets the IP options associated with a socket.
427 The maximum option size for IPv4 is 40 bytes.
428 See RFC\ 791 for the allowed options.
429 When the initial connection request packet for a
430 .B SOCK_STREAM
431 socket contains IP options, the IP options will be set automatically
432 to the options from the initial packet with routing headers reversed.
433 Incoming packets are not allowed to change options after the connection
434 is established.
435 The processing of all incoming source routing options
436 is disabled by default and can be enabled by using the
437 .I accept_source_route
438 .I /proc
439 interface.
440 Other options like timestamps are still handled.
441 For datagram sockets, IP options can be only set by the local user.
442 Calling
443 .BR getsockopt (2)
444 with
445 .B IP_OPTIONS
446 puts the current IP options used for sending into the supplied buffer.
447 .\" FIXME Document IP_PASSSEC
448 .\" Boolean
449 .\" Since Linux 2.6.17
450 .\" commit 2c7946a7bf45ae86736ab3b43d0085e43947945c
451 .\" Author: Catherine Zhang <cxzhang@watson.ibm.com>
452 .TP
453 .BR IP_PKTINFO " (since Linux 2.2)"
454 .\" Precisely: 2.1.68
455 Pass an
456 .B IP_PKTINFO
457 ancillary message that contains a
.I in_pktinfo
459 structure that supplies some information about the incoming packet.
460 This only works for datagram oriented sockets.
461 The argument is a flag that tells the socket whether the
462 .B IP_PKTINFO
463 message should be passed or not.
464 The message itself can only be sent/retrieved
465 as control message with a packet using
466 .BR recvmsg (2)
467 or
468 .BR sendmsg (2).
469 .IP
470 .in +4n
471 .nf
struct in_pktinfo {
    unsigned int   ipi_ifindex;  /* Interface index */
    struct in_addr ipi_spec_dst; /* Local address */
    struct in_addr ipi_addr;     /* Header Destination
                                    address */
};
478 .fi
479 .in
480 .IP
481 .\" FIXME elaborate on that.
482 .I ipi_ifindex
483 is the unique index of the interface the packet was received on.
484 .I ipi_spec_dst
485 is the local address of the packet and
486 .I ipi_addr
487 is the destination address in the packet header.
488 If
489 .B IP_PKTINFO
490 is passed to
491 .BR sendmsg (2)
492 and
493 .\" This field is grossly misnamed
494 .I ipi_spec_dst
495 is not zero, then it is used as the local source address for the routing
496 table lookup and for setting up IP source route options.
497 When
498 .I ipi_ifindex
499 is not zero, the primary local address of the interface specified by the
500 index overwrites
501 .I ipi_spec_dst
502 for the routing table lookup.
503 .TP
504 .BR IP_RECVERR " (since Linux 2.2)"
505 .\" Precisely: 2.1.15
506 Enable extended reliable error message passing.
507 When enabled on a datagram socket, all
508 generated errors will be queued in a per-socket error queue.
509 When the user receives an error from a socket operation,
510 the errors can be received by calling
511 .BR recvmsg (2)
512 with the
513 .B MSG_ERRQUEUE
514 flag set.
515 The
516 .I sock_extended_err
517 structure describing the error will be passed in an ancillary message with
518 the type
519 .B IP_RECVERR
520 and the level
521 .BR IPPROTO_IP .
522 .\" or SOL_IP on Linux
523 This is useful for reliable error handling on unconnected sockets.
524 The received data portion of the error queue contains the error packet.
525 .IP
526 The
527 .B IP_RECVERR
528 control message contains a
529 .I sock_extended_err
530 structure:
531 .IP
532 .in +4n
533 .ne 18
534 .nf
#define SO_EE_ORIGIN_NONE    0
#define SO_EE_ORIGIN_LOCAL   1
#define SO_EE_ORIGIN_ICMP    2
#define SO_EE_ORIGIN_ICMP6   3

struct sock_extended_err {
    uint32_t ee_errno;   /* error number */
    uint8_t  ee_origin;  /* where the error originated */
    uint8_t  ee_type;    /* type */
    uint8_t  ee_code;    /* code */
    uint8_t  ee_pad;
    uint32_t ee_info;    /* additional information */
    uint32_t ee_data;    /* other data */
    /* More data may follow */
};

struct sockaddr *SO_EE_OFFENDER(struct sock_extended_err *);
552 .fi
553 .in
554 .IP
555 .I ee_errno
556 contains the
557 .I errno
558 number of the queued error.
559 .I ee_origin
560 is the origin code of where the error originated.
561 The other fields are protocol-specific.
562 The macro
563 .B SO_EE_OFFENDER
564 returns a pointer to the address of the network object
565 where the error originated from given a pointer to the ancillary message.
566 If this address is not known, the
567 .I sa_family
568 member of the
569 .I sockaddr
570 contains
571 .B AF_UNSPEC
572 and the other fields of the
573 .I sockaddr
574 are undefined.
575 .IP
576 IP uses the
577 .I sock_extended_err
578 structure as follows:
579 .I ee_origin
580 is set to
581 .B SO_EE_ORIGIN_ICMP
582 for errors received as an ICMP packet, or
583 .B SO_EE_ORIGIN_LOCAL
584 for locally generated errors.
585 Unknown values should be ignored.
586 .I ee_type
587 and
588 .I ee_code
589 are set from the type and code fields of the ICMP header.
590 .I ee_info
591 contains the discovered MTU for
592 .B EMSGSIZE
593 errors.
594 The message also contains the
.I sockaddr_in
of the node that caused the error, which can be accessed with the
597 .B SO_EE_OFFENDER
598 macro.
599 The
600 .I sin_family
601 field of the SO_EE_OFFENDER address is
602 .B AF_UNSPEC
603 when the source was unknown.
604 When the error originated from the network, all IP options
605 .RI ( IP_OPTIONS ", " IP_TTL ", "
606 etc.) enabled on the socket and contained in the
607 error packet are passed as control messages.
608 The payload of the packet causing the error is returned as normal payload.
609 .\" FIXME . Is it a good idea to document that? It is a dubious feature.
610 .\" On
611 .\" .B SOCK_STREAM
612 .\" sockets,
613 .\" .B IP_RECVERR
614 .\" has slightly different semantics. Instead of
615 .\" saving the errors for the next timeout, it passes all incoming
616 .\" errors immediately to the user.
617 .\" This might be useful for very short-lived TCP connections which
618 .\" need fast error handling. Use this option with care:
619 .\" it makes TCP unreliable
620 .\" by not allowing it to recover properly from routing
621 .\" shifts and other normal
622 .\" conditions and breaks the protocol specification.
623 Note that TCP has no error queue;
624 .B MSG_ERRQUEUE
625 is not permitted on
626 .B SOCK_STREAM
627 sockets.
628 .B IP_RECVERR
629 is valid for TCP, but all errors are returned by socket function return or
630 .B SO_ERROR
631 only.
632 .IP
633 For raw sockets,
634 .B IP_RECVERR
enables passing of all received ICMP errors to the
application; otherwise errors are reported only on connected sockets.
637 .IP
638 It sets or retrieves an integer boolean flag.
639 .B IP_RECVERR
640 defaults to off.
641 .TP
642 .BR IP_RECVOPTS " (since Linux 2.2)"
643 .\" Precisely: 2.1.15
644 Pass all incoming IP options to the user in a
645 .B IP_OPTIONS
646 control message.
647 The routing header and other options are already filled in
648 for the local host.
649 Not supported for
650 .B SOCK_STREAM
651 sockets.
652 .TP
653 .BR IP_RECVTOS " (since Linux 2.2)"
654 .\" Precisely: 2.1.68
655 If enabled the
656 .B IP_TOS
657 ancillary message is passed with incoming packets.
658 It contains a byte which specifies the Type of Service/Precedence
659 field of the packet header.
660 Expects a boolean integer flag.
661 .TP
662 .BR IP_RECVTTL " (since Linux 2.2)"
663 .\" Precisely: 2.1.68
664 When this flag is set, pass a
665 .B IP_TTL
666 control message with the time to live
667 field of the received packet as a byte.
668 Not supported for
669 .B SOCK_STREAM
670 sockets.
671 .TP
672 .BR IP_RETOPTS " (since Linux 2.2)"
673 .\" Precisely: 2.1.15
674 Identical to
675 .BR IP_RECVOPTS ,
676 but returns raw unprocessed options with timestamp and route record
677 options not filled in for this hop.
678 .TP
679 .BR IP_ROUTER_ALERT " (since Linux 2.2)"
680 .\" Precisely: 2.1.68
681 Pass all to-be forwarded packets with the
682 IP Router Alert option set to this socket.
683 Only valid for raw sockets.
684 This is useful, for instance, for user-space RSVP daemons.
685 The tapped packets are not forwarded by the kernel; it is
686 the user's responsibility to send them out again.
687 Socket binding is ignored,
688 such packets are only filtered by protocol.
689 Expects an integer flag.
690 .TP
691 .BR IP_TOS " (since Linux 1.0)"
692 Set or receive the Type-Of-Service (TOS) field that is sent
693 with every IP packet originating from this socket.
694 It is used to prioritize packets on the network.
695 TOS is a byte.
696 There are some standard TOS flags defined:
697 .B IPTOS_LOWDELAY
698 to minimize delays for interactive traffic,
699 .B IPTOS_THROUGHPUT
700 to optimize throughput,
701 .B IPTOS_RELIABILITY
702 to optimize for reliability,
703 .B IPTOS_MINCOST
704 should be used for "filler data" where slow transmission doesn't matter.
705 At most one of these TOS values can be specified.
706 Other bits are invalid and shall be cleared.
707 Linux sends
708 .B IPTOS_LOWDELAY
709 datagrams first by default,
710 but the exact behavior depends on the configured queueing discipline.
711 .\" FIXME elaborate on this
712 Some high priority levels may require superuser privileges (the
713 .B CAP_NET_ADMIN
714 capability).
715 The priority can also be set in a protocol independent way by the
716 .RB ( SOL_SOCKET ", " SO_PRIORITY )
717 socket option (see
718 .BR socket (7)).
719 .\" FIXME Document IP_TRANSPARENT
720 .\" Needs CAP_NET_ADMIN
721 .\" Boolean
722 .\" Since Linux 2.6.27
723 .\" commit f5715aea4564f233767ea1d944b2637a5fd7cd2e
724 .\" Author: KOVACS Krisztian <hidden@sch.bme.hu>
725 .TP
726 .BR IP_TTL " (since Linux 1.0)"
727 Set or retrieve the current time-to-live field that is used in every packet
728 sent from this socket.
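.IP
Setting
.B IP_TTL
together with
.B IP_TOS
can be sketched as follows.
The helper name set_ttl_and_tos is hypothetical.
.IP
.in +4n
.nf
```c
#include <netinet/in.h>
#include <netinet/ip.h>
#include <sys/socket.h>

/* Hypothetical helper: set the TTL and the TOS byte used for
   packets sent from fd.  tos is one of the IPTOS_* values.
   Returns 0 on success, -1 on failure with errno set. */
int set_ttl_and_tos(int fd, int ttl, int tos)
{
    if (setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
        return -1;
    return setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));
}
```
.fi
.in
.IP
For example, set_ttl_and_tos(fd, 32, IPTOS_LOWDELAY) requests a TTL
of 32 and low-delay handling for interactive traffic.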
729 .\" FIXME Document IP_XFRM_POLICY
730 .\" Since Linux 2.5.48
731 .\" Needs CAP_NET_ADMIN
732 .SS /proc interfaces
733 The IP protocol
734 supports a set of
735 .I /proc
736 interfaces to configure some global parameters.
737 The parameters can be accessed by reading or writing files in the directory
738 .IR /proc/sys/net/ipv4/ .
739 .\" FIXME As at 2.6.12, 14 Jun 2005, the following are undocumented:
740 .\" ip_queue_maxlen
741 .\" ip_conntrack_max
742 Interfaces described as
743 .I Boolean
744 take an integer value, with a nonzero value ("true") meaning that
745 the corresponding option is enabled, and a zero value ("false")
746 meaning that the option is disabled.
747 .\"
748 .TP
749 .IR ip_always_defrag " (Boolean; since Linux 2.2.13)"
750 [New with kernel 2.2.13; in earlier kernel versions this feature
751 was controlled at compile time by the
752 .B CONFIG_IP_ALWAYS_DEFRAG
753 option; this option is not present in 2.4.x and later]
754
755 When this boolean flag is enabled (not equal 0), incoming fragments
756 (parts of IP packets
757 that arose when some host between origin and destination decided
758 that the packets were too large and cut them into pieces) will be
759 reassembled (defragmented) before being processed, even if they are
760 about to be forwarded.
761
762 Only enable if running either a firewall that is the sole link
763 to your network or a transparent proxy; never ever use it for a
764 normal router or host.
765 Otherwise fragmented communication can be disturbed
766 if the fragments travel over different links.
767 Defragmentation also has a large memory and CPU time cost.
768
This is automatically turned on when masquerading or transparent
proxying is configured.
771 .\"
772 .TP
.IR ip_autoconfig " (Linux 2.2 to 2.6.17)"
774 .\" Precisely: since 2.1.68
775 .\" FIXME document ip_autoconfig
776 Not documented.
777 .\"
778 .TP
779 .IR ip_default_ttl " (integer; default: 64; since Linux 2.2)"
780 .\" Precisely: 2.1.15
781 Set the default time-to-live value of outgoing packets.
782 This can be changed per socket with the
783 .B IP_TTL
784 option.
785 .\"
786 .TP
787 .IR ip_dynaddr " (Boolean; default: disabled; since Linux 2.0.31)"
788 Enable dynamic socket address and masquerading entry rewriting on interface
789 address change.
This is useful for dialup interfaces with changing IP addresses.
791 0 means no rewriting, 1 turns it on and 2 enables verbose mode.
792 .\"
793 .TP
794 .IR ip_forward " (Boolean; default: disabled; since Linux 1.2)"
795 Enable IP forwarding with a boolean flag.
IP forwarding can also be set on a per-interface basis.
797 .\"
798 .TP
799 .IR ip_local_port_range " (since Linux 2.2)"
800 .\" Precisely: since 2.1.68
801 Contains two integers that define the default local port range
802 allocated to sockets.
803 Allocation starts with the first number and ends with the second number.
804 Note that these should not conflict with the ports used by masquerading
805 (although the case is handled).
806 Also arbitrary choices may cause problems with some firewall packet
807 filters that make assumptions about the local ports in use.
The first number should be greater than 1024,
809 or better, greater than 4096, to avoid clashes
810 with well known ports and to minimize firewall problems.
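.IP
Reading the two integers from a program can be sketched as follows.
The helper name read_local_port_range is hypothetical; the path is
the file described above.
.IP
.in +4n
.nf
```c
#include <stdio.h>

/* Hypothetical helper: read the local port range.
   Returns 0 and fills *low and *high on success, -1 if the
   file cannot be opened or parsed. */
int read_local_port_range(int *low, int *high)
{
    FILE *fp = fopen("/proc/sys/net/ipv4/ip_local_port_range", "r");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "%d %d", low, high) == 2);
    fclose(fp);
    return ok ? 0 : -1;
}
```
.fi
.in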
811 .\"
812 .TP
813 .IR ip_no_pmtu_disc " (Boolean; default: disabled; since Linux 2.2)"
814 .\" Precisely: 2.1.15
815 If enabled, don't do Path MTU Discovery for TCP sockets by default.
816 Path MTU discovery may fail if misconfigured firewalls (that drop
all ICMP packets) or misconfigured interfaces (e.g., a point-to-point
link where both ends don't agree on the MTU) are on the path.
819 It is better to fix the broken routers on the path than to turn off
820 Path MTU Discovery globally, because not doing it incurs a high cost
821 to the network.
822 .\"
823 .\" The following is from 2.6.12: Documentation/networking/ip-sysctl.txt
824 .TP
825 .IR ip_nonlocal_bind " (Boolean; default: disabled; since Linux 2.4)"
826 .\" Precisely: patch-2.4.0-test10
827 If set, allows processes to
828 .BR bind (2)
829 to nonlocal IP addresses,
830 which can be quite useful, but may break some applications.
831 .\"
832 .\" The following is from 2.6.12: Documentation/networking/ip-sysctl.txt
833 .TP
834 .IR ip6frag_time " (integer; default: 30)"
835 Time in seconds to keep an IPv6 fragment in memory.
836 .\"
837 .\" The following is from 2.6.12: Documentation/networking/ip-sysctl.txt
838 .TP
839 .IR ip6frag_secret_interval " (integer; default: 600)"
840 Regeneration interval (in seconds) of the hash secret (or lifetime
841 for the hash secret) for IPv6 fragments.
842 .TP
843 .IR ipfrag_high_thresh " (integer), " ipfrag_low_thresh " (integer)"
844 If the amount of queued IP fragments reaches
845 .IR ipfrag_high_thresh ,
846 the queue is pruned down to
847 .IR ipfrag_low_thresh .
848 Contains an integer with the number of bytes.
849 .TP
850 .I neigh/*
851 See
852 .BR arp (7).
853 .\" FIXME Document the conf/*/* interfaces
854 .\" FIXME Document the route/* interfaces
855 .\" FIXME document them all
856 .SS Ioctls
857 All ioctls described in
858 .BR socket (7)
859 apply to
860 .BR ip .
861 .\" 2006-04-02, mtk
862 .\" commented out the following because ipchains is obsolete
863 .\" .PP
864 .\" The ioctls to configure firewalling are documented in
865 .\" .BR ipfw (4)
866 .\" from the
867 .\" .B ipchains
868 .\" package.
869 .PP
870 Ioctls to configure generic device parameters are described in
871 .BR netdevice (7).
872 .\" FIXME Add a discussion of multicasting
873 .SH ERRORS
874 .\" FIXME document all errors.
875 .\" We should really fix the kernels to give more uniform
876 .\" error returns (ENOMEM vs ENOBUFS, EPERM vs EACCES etc.)
877 .TP
878 .B EACCES
879 The user tried to execute an operation without the necessary permissions.
880 These include:
881 sending a packet to a broadcast address without having the
882 .B SO_BROADCAST
883 flag set;
884 sending a packet via a
885 .I prohibit
886 route;
887 modifying firewall settings without superuser privileges (the
888 .B CAP_NET_ADMIN
889 capability);
890 binding to a privileged port without superuser privileges (the
891 .B CAP_NET_BIND_SERVICE
892 capability).
893 .TP
894 .B EADDRINUSE
895 Tried to bind to an address already in use.
896 .TP
897 .B EADDRNOTAVAIL
898 A nonexistent interface was requested or the requested source
899 address was not local.
900 .TP
901 .B EAGAIN
902 Operation on a nonblocking socket would block.
903 .TP
904 .B EALREADY
A connection operation on a nonblocking socket is already in progress.
906 .TP
907 .B ECONNABORTED
908 A connection was closed during an
909 .BR accept (2).
910 .TP
911 .B EHOSTUNREACH
912 No valid routing table entry matches the destination address.
This error can be caused by an ICMP message from a remote router or
by the local routing table.
915 .TP
916 .B EINVAL
917 Invalid argument passed.
918 For send operations this can be caused by sending to a
919 .I blackhole
920 route.
921 .TP
922 .B EISCONN
923 .BR connect (2)
924 was called on an already connected socket.
925 .TP
926 .B EMSGSIZE
927 Datagram is bigger than an MTU on the path and it cannot be fragmented.
928 .TP
929 .BR ENOBUFS ", " ENOMEM
930 Not enough free memory.
931 This often means that the memory allocation is limited by the socket
932 buffer limits, not by the system memory, but this is not 100% consistent.
933 .TP
934 .B ENOENT
935 .B SIOCGSTAMP
936 was called on a socket where no packet arrived.
937 .TP
938 .B ENOPKG
939 A kernel subsystem was not configured.
940 .TP
941 .BR ENOPROTOOPT " and " EOPNOTSUPP
942 Invalid socket option passed.
943 .TP
944 .B ENOTCONN
945 The operation is only defined on a connected socket, but the socket wasn't
946 connected.
947 .TP
948 .B EPERM
949 User doesn't have permission to set high priority, change configuration,
950 or send signals to the requested process or group.
951 .TP
952 .B EPIPE
953 The connection was unexpectedly closed or shut down by the other end.
954 .TP
955 .B ESOCKTNOSUPPORT
956 The socket is not configured or an unknown socket type was requested.
957 .PP
958 Other errors may be generated by the overlaying protocols; see
959 .BR tcp (7),
960 .BR raw (7),
961 .BR udp (7)
962 and
963 .BR socket (7).
964 .SH NOTES
965 .BR IP_FREEBIND ,
966 .BR IP_MTU ,
967 .BR IP_MTU_DISCOVER ,
968 .BR IP_PKTINFO ,
969 .B IP_RECVERR
970 and
971 .B IP_ROUTER_ALERT
972 are Linux-specific.
973 .\" IP_PASSSEC is Linux-specific
974 .\" IP_XFRM_POLICY is Linux-specific
975 .\" IP_IPSEC_POLICY is a nonstandard extension, also present on some BSDs
976 Be very careful with the
977 .B SO_BROADCAST
978 option \- it is not privileged in Linux.
979 It is easy to overload the network
980 with careless broadcasts.
981 For new application protocols
982 it is better to use a multicast group instead of broadcasting.
983 Broadcasting is discouraged.
984 .PP
985 Some other BSD sockets implementations provide
986 .B IP_RCVDSTADDR
987 and
988 .B IP_RECVIF
989 socket options to get the destination address and the interface of
990 received datagrams.
991 Linux has the more general
992 .B IP_PKTINFO
993 for the same task.
994 .PP
995 Some BSD sockets implementations also provide an
996 .B IP_RECVTTL
997 option, but an ancillary message with type
998 .B IP_RECVTTL
999 is passed with the incoming packet.
1000 This is different from the
1001 .B IP_TTL
1002 option used in Linux.
1003 .PP
Using the
.B SOL_IP
socket option level isn't portable; BSD-based stacks use the
1007 .B IPPROTO_IP
1008 level.
1009 .SS Compatibility
1010 For compatibility with Linux 2.0, the obsolete
1011 .BI "socket(AF_INET, SOCK_PACKET, " protocol )
1012 syntax is still supported to open a
1013 .BR packet (7)
1014 socket.
1015 This is deprecated and should be replaced by
1016 .BI "socket(AF_PACKET, SOCK_RAW, " protocol )
1017 instead.
1018 The main difference is the new
1019 .I sockaddr_ll
1020 address structure for generic link layer information instead of the old
1021 .BR sockaddr_pkt .
1022 .SH BUGS
1023 There are too many inconsistent error values.
1024 .PP
1025 The ioctls to configure IP-specific interface options and ARP tables are
1026 not described.
1027 .PP
1028 Some versions of glibc forget to declare
1029 .IR in_pktinfo .
The current workaround is to copy it into your program from this man page.
1031 .PP
1032 Receiving the original destination address with
1033 .B MSG_ERRQUEUE
1034 in
1035 .I msg_name
1036 by
1037 .BR recvmsg (2)
1038 does not work in some 2.2 kernels.
1039 .\" .SH AUTHORS
1040 .\" This man page was written by Andi Kleen.
1041 .SH "SEE ALSO"
1042 .BR recvmsg (2),
1043 .BR sendmsg (2),
1044 .BR byteorder (3),
1045 .BR ipfw (4),
1046 .BR capabilities (7),
1047 .BR netlink (7),
1048 .BR raw (7),
1049 .BR socket (7),
1050 .BR tcp (7),
1051 .BR udp (7)
1052 .PP
1053 RFC\ 791 for the original IP specification.
1054 .br
1055 RFC\ 1122 for the IPv4 host requirements.
1056 .br
1057 RFC\ 1812 for the IPv4 router requirements.
1058 .\" FIXME autobind INADDR REUSEADDR