1 '\" t
2 .\" Don't change the line above. it tells man that tbl is needed.
3 .\" This man page is Copyright (C) 1999 Andi Kleen <ak@muc.de>.
4 .\" Permission is granted to distribute possibly modified copies
5 .\" of this page provided the header is included verbatim,
6 .\" and in case of nontrivial modification author and date
7 .\" of the modification is added to the header.
8 .\" $Id: ip.7,v 1.19 2000/12/20 18:10:31 ak Exp $
9 .TH IP 7 2001-06-19 "Linux" "Linux Programmer's Manual"
10 .SH NAME
11 ip \- Linux IPv4 protocol implementation
12 .SH SYNOPSIS
13 .B #include <sys/socket.h>
14 .br
15 .\" .B #include <net/netinet.h> -- does not exist anymore
16 .\" .B #include <linux/errqueue.h> -- never include <linux/foo.h>
17 .B #include <netinet/in.h>
18 .br
19 .B #include <netinet/ip.h> \fR/* superset of previous */
20 .sp
21 .IB tcp_socket " = socket(PF_INET, SOCK_STREAM, 0);"
22 .br
23 .IB udp_socket " = socket(PF_INET, SOCK_DGRAM, 0);"
24 .br
25 .IB raw_socket " = socket(PF_INET, SOCK_RAW, " protocol ");"
26 .SH DESCRIPTION
27 Linux implements the Internet Protocol, version 4,
28 described in RFC\ 791 and RFC\ 1122.
29 .B ip
30 contains a level 2
31 multicasting implementation conforming to RFC\ 1112.
32 It also contains an IP router including a packet filter.
33 .\" FIXME has someone verified that 2.1 is really 1812 compliant?
34 .PP
35 The programming interface is BSD sockets compatible.
36 For more information on sockets, see
37 .BR socket (7).
38 .PP
39 An IP socket is created by calling the
40 .BR socket (2)
41 function as
42 .BR "socket(PF_INET, socket_type, protocol)" .
43 Valid socket types are
44 .B SOCK_STREAM
45 to open a
46 .BR tcp (7)
47 socket,
48 .B SOCK_DGRAM
49 to open a
50 .BR udp (7)
51 socket, or
52 .B SOCK_RAW
53 to open a
54 .BR raw (7)
55 socket to access the IP protocol directly.
56 .I protocol
57 is the IP protocol in the IP header to be received or sent.
58 The only valid values for
59 .I protocol
60 are
61 .B 0
62 and
63 .B IPPROTO_TCP
64 for TCP sockets and
65 .B 0
66 and
67 .B IPPROTO_UDP
68 for UDP sockets.
69 For
70 .B SOCK_RAW
71 you may specify
72 a valid IANA IP protocol defined in
73 RFC\ 1700
74 assigned numbers.
75 .PP
76 .\" FIXME ip current does an autobind in listen, but I'm not sure
77 .\" if that should be documented.
78 When a process wants to receive new incoming packets or connections, it
79 should bind a socket to a local interface address using
80 .BR bind (2).
81 Only one IP socket may be bound to any given local (address, port) pair.
82 When
83 .B INADDR_ANY
84 is specified in the bind call the socket will be bound to
85 .I all
86 local interfaces.
87 When
88 .BR listen (2)
89 or
90 .BR connect (2)
91 is called on an unbound socket, the socket is automatically bound to a
92 random free port with the local address set to
93 .BR INADDR_ANY .
94
95 A TCP local socket address that has been bound is unavailable for
96 some time after closing,
97 unless the
98 .B SO_REUSEADDR
99 flag has been set.
100 Care should be taken when using this flag as it
101 makes TCP less reliable.
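.PP
The binding rules above can be sketched as follows.
This is a minimal illustration, not a reference implementation:
the helper name open_tcp_listener and the backlog of 16 are
assumptions made for the example.
It sets SO_REUSEADDR, binds to INADDR_ANY and starts listening;
passing port 0 lets the kernel pick a free port.
.PP
.in +4n
.nf
```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP socket listening on all local
   interfaces.  port is in host byte order; 0 asks the kernel to
   choose a free port.  Returns the descriptor, or -1 on error. */
int open_tcp_listener(unsigned short port)
{
    struct sockaddr_in sa;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    /* Allow rebinding the address shortly after a previous close. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1)
        goto fail;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_ANY);     /* all local interfaces */

    if (bind(fd, (struct sockaddr *) &sa, sizeof(sa)) == -1)
        goto fail;
    if (listen(fd, 16) == -1)                   /* backlog is arbitrary */
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```
.fi
.in
.PP
The port actually assigned can be read back with getsockname(2).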
102 .SS Address Format
103 An IP socket address is defined as a combination of an IP interface
104 address and a 16-bit port number.
105 The basic IP protocol does not supply port numbers; they
106 are implemented by higher-level protocols like
107 .BR udp (7)
108 and
109 .BR tcp (7).
110 On raw sockets
111 .I sin_port
112 is set to the IP protocol.
113 .PP
114 .in +4n
115 .nf
116 struct sockaddr_in {
117 sa_family_t sin_family; /* address family: AF_INET */
118 uint16_t sin_port; /* port in network byte order */
119 struct in_addr sin_addr; /* internet address */
120 };
121
122 /* Internet address. */
123 struct in_addr {
124 uint32_t s_addr; /* address in network byte order */
125 };
126 .fi
127 .in
128 .PP
129 .I sin_family
130 is always set to
131 .BR AF_INET .
132 This is required; in Linux 2.2 most networking functions return
133 .B EINVAL
134 when this setting is missing.
135 .I sin_port
136 contains the port in network byte order.
137 The port numbers below 1024 are called
138 .IR "reserved ports" .
139 Only privileged processes (i.e., those having the
140 .B CAP_NET_BIND_SERVICE
141 capability) may
142 .BR bind (2)
143 to these ports.
144 Note that the raw IPv4 protocol as such has no concept of a
145 port; ports are implemented only by higher-level protocols like
146 .BR tcp (7)
147 and
148 .BR udp (7).
149 .PP
150 .I sin_addr
151 is the IP host address.
152 The
153 .I s_addr
154 member of
155 .I struct in_addr
156 contains the host interface address in network byte order.
157 .I in_addr
158 should be assigned one of the INADDR_* values (e.g.,
159 .BR INADDR_ANY )
160 or set using the
161 .BR inet_aton (3),
162 .BR inet_addr (3),
163 .BR inet_makeaddr (3)
164 library functions or directly with the name resolver (see
165 .BR gethostbyname (3)).
166 IPv4 addresses are divided into unicast, broadcast
167 and multicast addresses.
168 Unicast addresses specify a single interface of a host,
169 broadcast addresses specify all hosts on a network and multicast
170 addresses address all hosts in a multicast group.
171 Datagrams to broadcast addresses can only be sent or received when the
172 .B SO_BROADCAST
173 socket flag is set.
174 In the current implementation connection oriented sockets are only allowed
175 to use unicast addresses.
176 .\" Leave a loophole for XTP @)
177
178 Note that the address and the port are always stored in
179 network byte order.
180 In particular, this means that you need to call
181 .BR htons (3)
182 on the number that is assigned to a port.
183 All address/port manipulation
184 functions in the standard library work in network byte order.
185
186 There are several special addresses:
187 .B INADDR_LOOPBACK
188 (127.0.0.1)
189 always refers to the local host via the loopback device;
190 .B INADDR_ANY
191 (0.0.0.0)
192 means any address for binding;
193 .B INADDR_BROADCAST
194 (255.255.255.255)
195 means any host and has the same effect on bind as
196 .B INADDR_ANY
197 for historical reasons.
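.PP
As an illustration of the byte-order rules above, the following
sketch fills in a sockaddr_in from a dotted-quad string and a
host-order port.
The helper name make_ipv4_addr is an assumption made for this example;
only inet_aton(3) and htons(3) come from the text.
.PP
.in +4n
.nf
```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *sa from a dotted-quad string and a port
   given in host byte order.  Returns 0 on success, -1 if ip is not a
   valid IPv4 address. */
int make_ipv4_addr(struct sockaddr_in *sa, const char *ip,
                   unsigned short port)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;       /* must always be set */
    sa->sin_port = htons(port);     /* store in network byte order */
    if (inet_aton(ip, &sa->sin_addr) == 0)
        return -1;                  /* inet_aton() returns 0 on failure */
    return 0;
}
```
.fi
.in
.PP
inet_aton(3) already stores the address in network byte order,
so no further conversion of sin_addr is needed.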
198 .SS Socket Options
199 IP supports some protocol-specific socket options that can be set with
200 .BR setsockopt (2)
201 and read with
202 .BR getsockopt (2).
203 The socket option level for IP is
204 .BR IPPROTO_IP .
205 .\" or SOL_IP on Linux
206 A boolean integer flag is zero when it is false, otherwise true.
207 .\"
208 .\" FIXME Document IP_FREEBIND
209 .\"
210 .TP
211 .B IP_OPTIONS
212 Set or get the IP options to be sent with every packet from this
213 socket.
214 The arguments are a pointer to a memory buffer containing the options
215 and the option length.
216 The
217 .BR setsockopt (2)
218 call sets the IP options associated with a socket.
219 The maximum option size for IPv4 is 40 bytes.
220 See RFC\ 791 for the allowed
221 options.
222 When the initial connection request packet for a
223 .B SOCK_STREAM
224 socket contains IP options, the IP options will be set automatically
225 to the options from the initial packet with routing headers reversed.
226 Incoming packets are not allowed to change options after the connection
227 is established.
228 The processing of all incoming source routing options
229 is disabled by default and can be enabled by using the
230 .B accept_source_route
231 sysctl.
232 Other options like timestamps are still handled.
233 For datagram sockets, IP options can only be set by the local user.
234 Calling
235 .BR getsockopt (2)
236 with
237 .B IP_OPTIONS
238 puts the current IP options used for sending into the supplied buffer.
239 .TP
240 .B IP_PKTINFO
241 Pass an
242 .B IP_PKTINFO
243 ancillary message that contains a
244 .I pktinfo
245 structure that supplies some information about the incoming packet.
246 This only works for datagram oriented sockets.
247 The argument is a flag that tells the socket whether the
248 .B IP_PKTINFO
249 message should be passed or not.
250 The message itself can only be sent/retrieved
251 as a control message with a packet using
252 .BR recvmsg (2)
253 or
254 .BR sendmsg (2).
255 .IP
256 .in +4n
257 .nf
258 struct in_pktinfo {
259 unsigned int ipi_ifindex; /* Interface index */
260 struct in_addr ipi_spec_dst; /* Local address */
261 struct in_addr ipi_addr; /* Header Destination
262 address */
263 };
264 .fi
265 .in
266 .IP
267 .\" FIXME elaborate on that.
268 .I ipi_ifindex
269 is the unique index of the interface the packet was received on.
270 .I ipi_spec_dst
271 is the local address of the packet and
272 .I ipi_addr
273 is the destination address in the packet header.
274 If
275 .B IP_PKTINFO
276 is passed to
277 .BR sendmsg (2)
278 and
279 .\" This field is grossly misnamed
280 .I ipi_spec_dst
281 is not zero, then it is used as the local source address for the routing
282 table lookup and for setting up IP source route options.
283 When
284 .I ipi_ifindex
285 is not zero the primary local address of the interface specified by the
286 index overwrites
287 .I ipi_spec_dst
288 for the routing table lookup.
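.IP
A minimal sketch of the receive side, assuming a datagram socket:
the flag is enabled with setsockopt(2) and the in_pktinfo structure
is then extracted from the control data returned by recvmsg(2).
The helper names enable_pktinfo and recv_with_pktinfo are hypothetical.
.IP
.in +4n
.nf
```c
#define _GNU_SOURCE     /* some glibc setups hide struct in_pktinfo otherwise */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: ask for IP_PKTINFO messages on fd. */
int enable_pktinfo(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Hypothetical helper: receive one datagram into buf and copy the
   accompanying IP_PKTINFO message into *info (zeroed if none was
   delivered).  Returns the datagram length, or -1 on error. */
ssize_t recv_with_pktinfo(int fd, void *buf, size_t len,
                          struct in_pktinfo *info)
{
    union {                         /* union keeps the buffer aligned */
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } control;
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t n;

    iov.iov_base = buf;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    n = recvmsg(fd, &msg, 0);
    if (n == -1)
        return -1;

    memset(info, 0, sizeof(*info));
    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == IP_PKTINFO) {
            memcpy(info, CMSG_DATA(cmsg), sizeof(*info));
            break;
        }
    }
    return n;
}
```
.fi
.in
.IP
After a successful call, ipi_addr holds the destination address from
the packet header and ipi_ifindex the receiving interface.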
289 .TP
290 .B IP_RECVTOS
291 If enabled the
292 .B IP_TOS
293 ancillary message is passed with incoming packets.
294 It contains a byte which specifies the Type of Service/Precedence
295 field of the packet header.
296 Expects a boolean integer flag.
297 .TP
298 .B IP_RECVTTL
299 When this flag is set
300 pass a
301 .B IP_TTL
302 control message with the time to live
303 field of the received packet as a byte.
304 Not supported for
305 .B SOCK_STREAM
306 sockets.
307 .TP
308 .B IP_RECVOPTS
309 Pass all incoming IP options to the user in a
310 .B IP_OPTIONS
311 control message.
312 The routing header and other options are already filled in
313 for the local host.
314 Not supported for
315 .B SOCK_STREAM
316 sockets.
317 .TP
318 .B IP_RETOPTS
319 Identical to
320 .B IP_RECVOPTS
321 but returns raw unprocessed options with timestamp and route record
322 options not filled in for this hop.
323 .TP
324 .B IP_TOS
325 Set or receive the Type-Of-Service (TOS) field that is sent
326 with every IP packet originating from this socket.
327 It is used to prioritize packets on the network.
328 TOS is a byte.
329 There are some standard TOS flags defined:
330 .B IPTOS_LOWDELAY
331 to minimize delays for interactive traffic,
332 .B IPTOS_THROUGHPUT
333 to optimize throughput,
334 .B IPTOS_RELIABILITY
335 to optimize for reliability,
336 .B IPTOS_MINCOST
337 should be used for "filler data" where slow transmission doesn't matter.
338 At most one of these TOS values can be specified.
339 Other bits are invalid and shall be cleared.
340 Linux sends
341 .B IPTOS_LOWDELAY
342 datagrams first by default,
343 but the exact behavior depends on the configured queueing discipline.
344 .\" FIXME elaborate on this
345 Some high priority levels may require superuser privileges (the
346 .B CAP_NET_ADMIN
347 capability).
348 The priority can also be set in a protocol independent way by the
349 .RB ( SOL_SOCKET ", " SO_PRIORITY )
350 socket option (see
351 .BR socket (7)).
352 .TP
353 .B IP_TTL
354 Set or retrieve the current time to live field that is used in every packet
355 sent from this socket.
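.IP
The IP_TTL and IP_TOS options both take an integer argument at level
IPPROTO_IP.
The following sketch sets both and reads an option back;
the helper names set_ttl_and_tos and get_ip_int_option are assumptions
made for the example.
.IP
.in +4n
.nf
```c
#include <netinet/in.h>
#include <netinet/ip.h>         /* IPTOS_* constants */
#include <sys/socket.h>

/* Hypothetical helper: set the TTL and TOS used for packets sent
   from fd.  Returns 0 on success, -1 on error. */
int set_ttl_and_tos(int fd, int ttl, int tos)
{
    if (setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) == -1)
        return -1;
    return 0;
}

/* Hypothetical helper: read back an integer-valued IPPROTO_IP option. */
int get_ip_int_option(int fd, int optname, int *value)
{
    socklen_t len = sizeof(*value);

    return getsockopt(fd, IPPROTO_IP, optname, value, &len);
}
```
.fi
.in
.IP
For example, set_ttl_and_tos(fd, 32, IPTOS_LOWDELAY) requests a TTL
of 32 and low-delay service for an interactive datagram socket.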
356 .TP
357 .B IP_HDRINCL
358 If enabled
359 the user supplies an IP header in front of the user data.
360 Only valid for
361 .B SOCK_RAW
362 sockets.
363 See
364 .BR raw (7)
365 for more information.
366 When this flag is enabled the values set by
367 .BR IP_OPTIONS ,
368 .B IP_TTL
369 and
370 .B IP_TOS
371 are ignored.
372 .TP
373 .BR IP_RECVERR " (defined in \fI<linux/errqueue.h>\fP)"
374 Enable extended reliable error message passing.
375 When enabled on a datagram socket all
376 generated errors will be queued in a per-socket error queue.
377 When the user
378 receives an error from a socket operation the errors can
379 be received by calling
380 .BR recvmsg (2)
381 with the
382 .B MSG_ERRQUEUE
383 flag set.
384 The
385 .I sock_extended_err
386 structure describing the error will be passed in an ancillary message with
387 the type
388 .B IP_RECVERR
389 and the level
390 .BR IPPROTO_IP .
391 .\" or SOL_IP on Linux
392 This is useful for reliable error handling on unconnected sockets.
393 The received data portion of the error queue
394 contains the error packet.
395 .IP
396 The
397 .B IP_RECVERR
398 control message contains a
399 .I sock_extended_err
400 structure:
401 .IP
402 .in +4n
403 .ne 18
404 .nf
405 #define SO_EE_ORIGIN_NONE 0
406 #define SO_EE_ORIGIN_LOCAL 1
407 #define SO_EE_ORIGIN_ICMP 2
408 #define SO_EE_ORIGIN_ICMP6 3
409
410 struct sock_extended_err {
411 uint32_t ee_errno; /* error number */
412 uint8_t ee_origin; /* where the error originated */
413 uint8_t ee_type; /* type */
414 uint8_t ee_code; /* code */
415 uint8_t ee_pad;
416 uint32_t ee_info; /* additional information */
417 uint32_t ee_data; /* other data */
418 /* More data may follow */
419 };
420
421 struct sockaddr *SO_EE_OFFENDER(struct sock_extended_err *);
422 .fi
423 .in
424 .IP
425 .I ee_errno
426 contains the
427 .I errno
428 number of the queued error.
429 .I ee_origin
430 is the origin code of where the error originated.
431 The other fields are protocol specific.
432 The macro
433 .B SO_EE_OFFENDER
434 returns a pointer to the address of the network object
435 where the error originated from given a pointer to the ancillary message.
436 If this address is not known, the
437 .I sa_family
438 member of the
439 .I sockaddr
440 contains
441 .B AF_UNSPEC
442 and the other fields of the
443 .I sockaddr
444 are undefined.
445 .IP
446 IP uses the
447 .I sock_extended_err
448 structure as follows:
449 .I ee_origin
450 is set to
451 .B SO_EE_ORIGIN_ICMP
452 for errors received as an ICMP packet, or
453 .B SO_EE_ORIGIN_LOCAL
454 for locally generated errors.
455 Unknown values should be ignored.
456 .I ee_type
457 and
458 .I ee_code
459 are set from the type and code fields of the ICMP header.
460 .I ee_info
461 contains the discovered MTU for
462 .B EMSGSIZE
463 errors.
464 The message also contains the
465 .I sockaddr_in
466 of the node that caused the error, which can be accessed with the
467 .B SO_EE_OFFENDER
468 macro.
469 The
470 .I sin_family
471 field of the SO_EE_OFFENDER address is
472 .B AF_UNSPEC
473 when the source was unknown.
474 When the error originated from the network, all IP options
475 .RI ( IP_OPTIONS ", " IP_TTL ", "
476 etc.) enabled on the socket and contained in the
477 error packet are passed as control messages.
478 The payload of the packet
479 causing the error is returned as normal payload.
480 .\" FIXME . Is it a good idea to document that? It is a dubious feature.
481 .\" On
482 .\" .B SOCK_STREAM
483 .\" sockets,
484 .\" .B IP_RECVERR
485 .\" has slightly different semantics. Instead of
486 .\" saving the errors for the next timeout, it passes all incoming
487 .\" errors immediately to the user.
488 .\" This might be useful for very short-lived TCP connections which
489 .\" need fast error handling. Use this option with care:
490 .\" it makes TCP unreliable
491 .\" by not allowing it to recover properly from routing
492 .\" shifts and other normal
493 .\" conditions and breaks the protocol specification.
494 Note that TCP has no error queue;
495 .B MSG_ERRQUEUE
496 is illegal on
497 .B SOCK_STREAM
498 sockets.
499 .B IP_RECVERR
500 is valid for TCP, but all errors are
501 returned by socket function return or
502 .B SO_ERROR
503 only.
504 .IP
505 For raw sockets,
506 .B IP_RECVERR
507 enables passing of all received ICMP errors to the
508 application; otherwise errors are reported only on connected sockets.
509 .IP
510 It sets or retrieves an integer boolean flag.
511 .B IP_RECVERR
512 defaults to off.
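.IP
A minimal sketch of reading the error queue on a datagram socket,
under the assumption that a 512-byte control buffer is large enough
for the sock_extended_err structure and the offender address.
The helper names enable_recverr and read_sock_error are hypothetical.
.IP
.in +4n
.nf
```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>               /* older kernel headers need timespec first */
#include <linux/errqueue.h>     /* struct sock_extended_err */

/* Hypothetical helper: turn on the per-socket error queue. */
int enable_recverr(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on));
}

/* Hypothetical helper: fetch one queued error into *ee.
   Returns 0 on success.  Returns -1 with errno set otherwise;
   errno is EAGAIN when the queue is empty. */
int read_sock_error(int fd, struct sock_extended_err *ee)
{
    union {                     /* union keeps the buffer aligned */
        char buf[512];          /* assumed large enough */
        struct cmsghdr align;
    } control;
    char payload[576];          /* receives the offending packet's payload */
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    iov.iov_base = payload;
    iov.iov_len = sizeof(payload);
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(fd, &msg, MSG_ERRQUEUE) == -1)
        return -1;

    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == IP_RECVERR) {
            memcpy(ee, CMSG_DATA(cmsg), sizeof(*ee));
            return 0;
        }
    }
    errno = ENOMSG;             /* no IP_RECVERR message was attached */
    return -1;
}
```
.fi
.in
.IP
A caller would typically invoke read_sock_error after a send or
receive call on the socket has failed, and then inspect ee_errno
and ee_origin.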
513 .TP
514 .B IP_MTU_DISCOVER
515 Sets or receives the Path MTU Discovery setting
516 for a socket.
517 When enabled, Linux will perform Path MTU Discovery
518 as defined in RFC\ 1191
519 on this socket.
520 The don't fragment flag is set on all outgoing datagrams.
521 The system-wide default is controlled by the
522 .B ip_no_pmtu_disc
523 sysctl for
524 .B SOCK_STREAM
525 sockets, and disabled on all others.
526 For non
527 .B SOCK_STREAM
528 sockets it is the user's responsibility to packetize the data
529 in MTU sized chunks and to do the retransmits if necessary.
530 The kernel will reject packets that are bigger than the known
531 path MTU if this flag is set (with
532 .BR EMSGSIZE ).
534 .TS
535 tab(:);
536 c l
537 l l.
538 Path MTU discovery flags:Meaning
539 IP_PMTUDISC_WANT:Use per-route settings.
540 IP_PMTUDISC_DONT:Never do Path MTU Discovery.
541 IP_PMTUDISC_DO:Always do Path MTU Discovery.
542 IP_PMTUDISC_PROBE:Set DF but ignore Path MTU.
543 .TE
544
545 When PMTU discovery is enabled the kernel automatically keeps track of
546 the path MTU per destination host.
547 When it is connected to a specific peer with
548 .BR connect (2)
549 the currently known path MTU can be retrieved conveniently using the
550 .B IP_MTU
551 socket option (e.g., after a
552 .B EMSGSIZE
553 error occurred).
554 It may change over time.
555 For connectionless sockets with many destinations
556 the new MTU for a given destination can also be accessed using the
557 error queue (see
558 .BR IP_RECVERR ).
559 A new error will be queued for every incoming MTU update.
560
561 While MTU discovery is in progress initial packets from datagram sockets
562 may be dropped.
563 Applications using UDP should be aware of this and not
564 take it into account for their packet retransmit strategy.
565
566 To bootstrap the path MTU discovery process on unconnected sockets it
567 is possible to start with a big datagram size
568 (up to 64K-headers bytes long) and let it shrink by updates of the
569 path MTU.
570 .\" FIXME this is an ugly hack
571
572 To get an initial estimate of the
573 path MTU connect a datagram socket to the destination address using
574 .BR connect (2)
575 and retrieve the MTU by calling
576 .BR getsockopt (2)
577 with the
578 .B IP_MTU
579 option.
580
581 It is possible to implement RFC\ 4821 MTU probing with
582 .B SOCK_DGRAM
583 or
584 .B SOCK_RAW
585 sockets by setting a value of
586 .BR IP_PMTUDISC_PROBE .
587 This is also particularly useful for diagnostic tools such as
588 .BR tracepath (8)
589 that wish to deliberately send probe packets larger than
590 the observed Path MTU.
591 .TP
592 .B IP_MTU
593 Retrieve the current known path MTU of the current socket.
594 Only valid when the socket has been connected.
595 Returns an integer.
596 Only valid as a
597 .BR getsockopt (2).
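.IP
The procedure described under IP_MTU_DISCOVER for obtaining an initial
path MTU estimate can be sketched as follows.
The helper name query_path_mtu and the use of port 9 are assumptions
made for the example; no datagram is actually sent.
.IP
.in +4n
.nf
```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: return the kernel's currently known path MTU
   toward the dotted-quad address ip, or -1 on error. */
int query_path_mtu(const char *ip)
{
    struct sockaddr_in sa;
    int pmtud = IP_PMTUDISC_DO;     /* always do Path MTU Discovery */
    int mtu = -1;
    socklen_t len = sizeof(mtu);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd == -1)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(9);         /* arbitrary; connect() sends nothing */

    /* IP_MTU is only valid once the socket has been connected. */
    if (inet_aton(ip, &sa.sin_addr) == 0 ||
        setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER,
                   &pmtud, sizeof(pmtud)) == -1 ||
        connect(fd, (struct sockaddr *) &sa, sizeof(sa)) == -1 ||
        getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) == -1)
        mtu = -1;

    close(fd);
    return mtu;
}
```
.fi
.in
.IP
The value returned is only an estimate and may change over time as
the kernel learns more about the path.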
598 .\"
599 .TP
600 .B IP_ROUTER_ALERT
601 Pass all to-be forwarded packets with the
602 IP Router Alert
603 option
604 set to this socket.
605 Only valid for raw sockets.
606 This is useful, for instance, for user
607 space RSVP daemons.
608 The tapped packets are not forwarded by the kernel; it is
609 the user's responsibility to send them out again.
610 Socket binding is ignored;
611 such packets are filtered only by protocol.
612 Expects an integer flag.
613 .\"
614 .TP
615 .B IP_MULTICAST_TTL
616 Set or read the time-to-live value of outgoing multicast packets for this
617 socket.
618 It is very important for multicast packets to set the smallest TTL possible.
619 The default is 1 which means that multicast packets don't leave the local
620 network unless the user program explicitly requests it.
621 Argument is an
622 integer.
623 .\"
624 .TP
625 .B IP_MULTICAST_LOOP
626 Set or read a boolean integer argument that determines whether sent multicast
627 packets should be looped back to the local sockets.
628 .\"
629 .TP
630 .B IP_ADD_MEMBERSHIP
631 Join a multicast group.
632 Argument is an
633 .I ip_mreqn
634 structure.
635 .sp
636 .in +4n
637 .nf
638 struct ip_mreqn {
639 struct in_addr imr_multiaddr; /* IP multicast group
640 address */
641 struct in_addr imr_address; /* IP address of local
642 interface */
643 int imr_ifindex; /* interface index */
644 };
645 .fi
646 .in
647 .sp
648 .I imr_multiaddr
649 contains the address of the multicast group the application
650 wants to join or leave.
651 It must be a valid multicast address.
652 .I imr_address
653 is the address of the local interface with which the system
654 should join the multicast
655 group; if it is equal to
656 .B INADDR_ANY
657 an appropriate interface is chosen by the system.
658 .I imr_ifindex
659 is the interface index of the interface that should join/leave the
660 .I imr_multiaddr
661 group, or 0 to indicate any interface.
662 .IP
663 For compatibility, the old
664 .I ip_mreq
665 structure is still supported.
666 It differs from
667 .I ip_mreqn
668 only by not including
669 the
670 .I imr_ifindex
671 field.
672 Only valid as a
673 .BR setsockopt (2).
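.IP
A sketch of joining a group with the ip_mreqn structure.
The helper names fill_mreqn and join_group are assumptions made for
the example; the multicast check uses the IN_MULTICAST macro from
<netinet/in.h>.
.IP
.in +4n
.nf
```c
#define _GNU_SOURCE     /* some glibc setups hide struct ip_mreqn otherwise */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: prepare *mr for the dotted-quad group address.
   ifindex 0 means "any interface".  Returns 0 on success, -1 if group
   is not a valid multicast address. */
int fill_mreqn(struct ip_mreqn *mr, const char *group, int ifindex)
{
    memset(mr, 0, sizeof(*mr));
    if (inet_aton(group, &mr->imr_multiaddr) == 0)
        return -1;
    /* IN_MULTICAST() expects the address in host byte order. */
    if (!IN_MULTICAST(ntohl(mr->imr_multiaddr.s_addr)))
        return -1;
    mr->imr_address.s_addr = htonl(INADDR_ANY); /* let the system choose */
    mr->imr_ifindex = ifindex;
    return 0;
}

/* Hypothetical helper: join group on socket fd. */
int join_group(int fd, const char *group, int ifindex)
{
    struct ip_mreqn mr;

    if (fill_mreqn(&mr, group, ifindex) == -1)
        return -1;
    return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mr, sizeof(mr));
}
```
.fi
.in
.IP
The same structure, filled in the same way, can be passed with
IP_DROP_MEMBERSHIP to leave the group again.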
674 .\"
675 .TP
676 .B IP_DROP_MEMBERSHIP
677 Leave a multicast group.
678 Argument is an
679 .I ip_mreqn
680 or
681 .I ip_mreq
682 structure similar to
683 .BR IP_ADD_MEMBERSHIP .
684 .\"
685 .TP
686 .B IP_MULTICAST_IF
687 Set the local device for a multicast socket.
688 Argument is an
689 .I ip_mreqn
690 or
691 .I ip_mreq
692 structure similar to
693 .BR IP_ADD_MEMBERSHIP .
694 .IP
695 When an invalid socket option is passed,
696 .B ENOPROTOOPT
697 is returned.
698 .SS Sysctls
699 The IP protocol
700 supports the sysctl interface to configure some global options.
701 The sysctls can be accessed by reading or writing the
702 .I /proc/sys/net/ipv4/*
703 files or using the
704 .\" FIXME As at 2.6.12, 14 Jun 2005, the following are undocumented:
705 .\" ip_queue_maxlen
706 .\" ip_conntrack_max
707 .BR sysctl (2)
708 interface.
709 Variables described as
710 .I Boolean
711 take an integer value, with a non-zero value ("true") meaning that
712 the corresponding option is enabled, and a zero value ("false")
713 meaning that the option is disabled.
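.PP
Reading one of these variables through the /proc interface can be
sketched as follows.
The helper name read_ipv4_sysctl is an assumption made for the example,
and the sketch assumes the variable holds a single integer and that
name comes from a trusted source.
.PP
.in +4n
.nf
```c
#include <stdio.h>

/* Hypothetical helper: read the integer sysctl
   /proc/sys/net/ipv4/<name> into *value.
   Returns 0 on success, -1 on error. */
int read_ipv4_sysctl(const char *name, long *value)
{
    char path[256];
    FILE *fp;
    int n, ok;

    n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;                  /* name too long */

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;                  /* no such variable, or no access */
    ok = (fscanf(fp, "%ld", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```
.fi
.in
.PP
For instance, read_ipv4_sysctl("ip_default_ttl", &ttl) retrieves the
default time-to-live described below.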
714 .\"
715 .TP
716 .BR ip_always_defrag " (Boolean)"
717 [New with kernel 2.2.13; in earlier kernel versions this feature
718 was controlled at compile time by the
719 .B CONFIG_IP_ALWAYS_DEFRAG
720 option; this option is not present in 2.4.x and later]
721
722 When this boolean flag is enabled (not equal to 0), incoming fragments
723 (parts of IP packets
724 that arose when some host between origin and destination decided
725 that the packets were too large and cut them into pieces) will be
726 reassembled (defragmented) before being processed, even if they are
727 about to be forwarded.
728
729 Only enable if running either a firewall that is the sole link
730 to your network or a transparent proxy; never ever use it for a
731 normal router or host.
732 Otherwise fragmented communication can be disturbed
733 if the fragments travel over different links.
734 Defragmentation also has a large memory and CPU time cost.
735
736 This is automagically turned on when masquerading or transparent
737 proxying are configured.
738 .\"
739 .TP
740 .B ip_autoconfig
741 .\" FIXME document ip_autoconfig
742 Not documented.
743 .\"
744 .TP
745 .BR ip_default_ttl " (integer; default: 64)"
746 Set the default time-to-live value of outgoing packets.
747 This can be changed per socket with the
748 .B IP_TTL
749 option.
750 .\"
751 .TP
752 .BR ip_dynaddr " (Boolean; default: disabled)"
753 Enable dynamic socket address and masquerading entry rewriting on interface
754 address change.
755 This is useful for dialup interfaces with changing IP addresses.
756 0 means no rewriting, 1 turns it on and 2 enables verbose mode.
757 .\"
758 .TP
759 .BR ip_forward " (Boolean; default: disabled)"
760 Enable IP forwarding with a boolean flag.
761 IP forwarding can also be set on a per-interface basis.
762 .\"
763 .TP
764 .B ip_local_port_range
765 Contains two integers that define the default local port range
766 allocated to sockets.
767 Allocation starts with the first number and ends with the second number.
768 Note that these should not conflict with the ports used by masquerading
769 (although the case is handled).
770 Also arbitrary choices may cause problems with some firewall packet
771 filters that make assumptions about the local ports in use.
772 The first number should be greater than 1024, or better, greater than 4096, to avoid clashes
773 with well known ports and to minimize firewall problems.
774 .\"
775 .TP
776 .BR ip_no_pmtu_disc " (Boolean; default: disabled)"
777 If enabled, don't do Path MTU Discovery for TCP sockets by default.
778 Path MTU discovery may fail if misconfigured firewalls (that drop
779 all ICMP packets) or misconfigured interfaces (e.g., a point-to-point
780 link where both ends don't agree on the MTU) are on the path.
781 It is better to fix the broken routers on the path than to turn off
782 Path MTU Discovery globally, because not doing it incurs a high cost
783 to the network.
784 .\"
785 .\" The following is from 2.6.12: Documentation/networking/ip-sysctl.txt
786 .TP
787 .BR ip_nonlocal_bind " (Boolean; default: disabled)"
788 If set, allows processes to
789 .BR bind (2)
790 to non-local IP addresses,
791 which can be quite useful, but may break some applications.
792 .\"
793 .\" The following is from 2.6.12: Documentation/networking/ip-sysctl.txt
794 .TP
795 .BR ip6frag_time " (integer; default 30)"
796 Time in seconds to keep an IPv6 fragment in memory.
797 .\"
798 .\" The following is from 2.6.12: Documentation/networking/ip-sysctl.txt
799 .TP
800 .BR ip6frag_secret_interval " (integer; default 600)"
801 Regeneration interval (in seconds) of the hash secret (or lifetime
802 for the hash secret) for IPv6 fragments.
803 .TP
804 .BR ipfrag_high_thresh " (integer), " ipfrag_low_thresh " (integer)"
805 If the amount of queued IP fragments reaches
806 .BR ipfrag_high_thresh ,
807 the queue
808 is pruned down to
809 .BR ipfrag_low_thresh .
810 Contains an integer with the number of
811 bytes.
812 .TP
813 .B neigh/*
814 See
815 .BR arp (7).
816 .\" FIXME Document the conf/*/* sysctls
817 .\" FIXME Document the route/* sysctls
818 .\" FIXME document them all
819 .SS Ioctls
820 All ioctls described in
821 .BR socket (7)
822 apply to ip.
823 .\" 2006-04-02, mtk
824 .\" commented out the following because ipchains is obsolete
825 .\" .PP
826 .\" The ioctls to configure firewalling are documented in
827 .\" .BR ipfw (4)
828 .\" from the
829 .\" .B ipchains
830 .\" package.
831 .PP
832 Ioctls to configure generic device parameters are described in
833 .BR netdevice (7).
834 .\" FIXME Add a discussion of multicasting
835 .SH ERRORS
836 .\" FIXME document all errors.
837 .\" We should really fix the kernels to give more uniform
838 .\" error returns (ENOMEM vs ENOBUFS, EPERM vs EACCES etc.)
839 .TP
840 .B EACCES
841 The user tried to execute an operation without the necessary permissions.
842 These include:
843 sending a packet to a broadcast address without having the
844 .B SO_BROADCAST
845 flag set;
846 sending a packet via a
847 .I prohibit
848 route;
849 modifying firewall settings without superuser privileges (the
850 .B CAP_NET_ADMIN
851 capability);
852 binding to a reserved port without superuser privileges (the
853 .B CAP_NET_BIND_SERVICE
854 capability).
855 .TP
856 .B EADDRINUSE
857 Tried to bind to an address already in use.
858 .TP
859 .B EADDRNOTAVAIL
860 A non-existent interface was requested or the requested source
861 address was
862 not local.
863 .TP
864 .B EAGAIN
865 Operation on a non-blocking socket would block.
866 .TP
867 .B EALREADY
868 A connection operation on a non-blocking socket is already in progress.
869 .TP
870 .B ECONNABORTED
871 A connection was closed during an
872 .BR accept (2).
873 .TP
874 .B EHOSTUNREACH
875 No valid routing table entry matches the destination address.
876 This error can be caused by an ICMP message from a remote router or
877 by the local routing table.
878 .TP
879 .B EINVAL
880 Invalid argument passed.
881 For send operations this can be caused by sending to a
882 .I blackhole
883 route.
884 .TP
885 .B EISCONN
886 .BR connect (2)
887 was called on an already connected socket.
888 .TP
889 .B EMSGSIZE
890 Datagram is bigger than an MTU on the path and it cannot be fragmented.
891 .TP
892 .BR ENOBUFS ", " ENOMEM
893 Not enough free memory.
894 This often means that the memory allocation is limited by the socket
895 buffer limits, not by the system memory, but this is not
896 100% consistent.
897 .TP
898 .B ENOENT
899 .B SIOCGSTAMP
900 was called on a socket where no packet arrived.
901 .TP
902 .B ENOPKG
903 A kernel subsystem was not configured.
904 .TP
905 .BR ENOPROTOOPT " and " EOPNOTSUPP
906 Invalid socket option passed.
907 .TP
908 .B ENOTCONN
909 The operation is only defined on a connected socket, but the socket wasn't
910 connected.
911 .TP
912 .B EPERM
913 User doesn't have permission to set high priority, change configuration,
914 or send signals to the requested process or group.
915 .TP
916 .B EPIPE
917 The connection was unexpectedly closed or shut down by the other end.
918 .TP
919 .B ESOCKTNOSUPPORT
920 The socket is not configured or an unknown socket type was requested.
921 .PP
922 Other errors may be generated by the overlaying protocols; see
923 .BR tcp (7),
924 .BR raw (7),
925 .BR udp (7)
926 and
927 .BR socket (7).
928 .SH VERSIONS
929 .BR IP_MTU ,
930 .BR IP_MTU_DISCOVER ,
931 .BR IP_PKTINFO ,
932 .B IP_RECVERR
933 and
934 .B IP_ROUTER_ALERT
935 are new options in Linux 2.2.
936 They are also all Linux-specific and should not be used in
937 programs intended to be portable.
938 .PP
939 .\" FIXME
940 .\" To be confirmed that IP_PMTUDISC_PROBE makes it into kernel 2.6.22
941 .B IP_PMTUDISC_PROBE
942 is new in Linux 2.6.22.
943 .PP
944 .I struct ip_mreqn
945 is new in Linux 2.2.
946 Linux 2.0 only supported
947 .BR ip_mreq .
948 .PP
949 The sysctls were introduced with Linux 2.2.
950 .SH NOTES
951 Be very careful with the
952 .B SO_BROADCAST
953 option \- it is not privileged in Linux.
954 It is easy to overload the network
955 with careless broadcasts.
956 For new application protocols
957 it is better to use a multicast group instead of broadcasting.
958 Broadcasting is discouraged.
959 .PP
960 Some other BSD sockets implementations provide
961 .B IP_RCVDSTADDR
962 and
963 .B IP_RECVIF
964 socket options to get the destination address and the interface of
965 received datagrams.
966 Linux has the more general
967 .B IP_PKTINFO
968 for the same task.
969 .PP
970 Some BSD sockets implementations also provide an
971 .B IP_RECVTTL
972 option, but an ancillary message with type
973 .B IP_RECVTTL
974 is passed with the incoming packet.
975 This is different from the
976 .B IP_TTL
977 option used in Linux.
978 .PP
979 Using
980 .B SOL_IP
981 socket options level isn't portable; BSD-based stacks use
982 .B IPPROTO_IP
983 level.
984 .SS Compatibility
985 For compatibility with Linux 2.0, the obsolete
986 .BI "socket(PF_INET, SOCK_PACKET, " protocol )
987 syntax is still supported to open a
988 .BR packet (7)
989 socket.
990 This is deprecated and should be replaced by
991 .BI "socket(PF_PACKET, SOCK_RAW, " protocol )
992 instead.
993 The main difference is the new
994 .I sockaddr_ll
995 address structure for generic link layer information instead of the old
996 .BR sockaddr_pkt .
997 .SH BUGS
998 There are too many inconsistent error values.
999 .PP
1000 The ioctls to configure IP-specific interface options and ARP tables are
1001 not described.
1002 .PP
1003 Some versions of glibc forget to declare
1004 .IR in_pktinfo .
1005 The current workaround is to copy it into your program from this man page.
1006 .PP
1007 Receiving the original destination address with
1008 .B MSG_ERRQUEUE
1009 in
1010 .I msg_name
1011 by
1012 .BR recvmsg (2)
1013 does not work in some 2.2 kernels.
1014 .\" .SH AUTHORS
1015 .\" This man page was written by Andi Kleen.
1016 .SH "SEE ALSO"
1017 .BR recvmsg (2),
1018 .BR sendmsg (2),
1019 .BR byteorder (3),
1020 .BR ipfw (4),
1021 .BR capabilities (7),
1022 .BR netlink (7),
1023 .BR raw (7),
1024 .BR socket (7),
1025 .BR tcp (7),
1026 .BR udp (7)
1027 .PP
1028 RFC\ 791 for the original IP specification.
1029 .br
1030 RFC\ 1122 for the IPv4 host requirements.
1031 .br
1032 RFC\ 1812 for the IPv4 router requirements.
1033 .\" FIXME autobind INADDR REUSEADDR