.\" $Id: ip.7,v 1.19 2000/12/20 18:10:31 ak Exp $
.\"
.\" FIXME The following socket options are yet to be documented
+.\"
.\" IP_XFRM_POLICY (2.5.48)
.\" Needs CAP_NET_ADMIN
+.\"
.\" IP_IPSEC_POLICY (2.5.47)
.\" Needs CAP_NET_ADMIN
+.\"
.\" IP_PASSSEC (2.6.17)
.\" Boolean
.\" commit 2c7946a7bf45ae86736ab3b43d0085e43947945c
.\" Author: Catherine Zhang <cxzhang@watson.ibm.com>
+.\"
.\" IP_MINTTL (2.6.34)
.\" commit d218d11133d888f9745802146a50255a4781d37a
.\" Author: Stephen Hemminger <shemminger@vyatta.com>
+.\"
.\" MCAST_JOIN_GROUP (2.4.22 / 2.6)
+.\"
.\" MCAST_BLOCK_SOURCE (2.4.22 / 2.6)
+.\"
.\" MCAST_UNBLOCK_SOURCE (2.4.22 / 2.6)
+.\"
.\" MCAST_LEAVE_GROUP (2.4.22 / 2.6)
+.\"
.\" MCAST_JOIN_SOURCE_GROUP (2.4.22 / 2.6)
+.\"
.\" MCAST_LEAVE_SOURCE_GROUP (2.4.22 / 2.6)
+.\"
.\" MCAST_MSFILTER (2.4.22 / 2.6)
+.\"
.\" IP_UNICAST_IF (3.4)
.\" commit 76e21053b5bf33a07c76f99d27a74238310e3c71
.\" Author: Erich E. Hoover <ehoover@mines.edu>
.\"
-.TH IP 7 2015-05-07 "Linux" "Linux Programmer's Manual"
+.TH IP 7 2019-03-06 "Linux" "Linux Programmer's Manual"
.SH NAME
ip \- Linux IPv4 protocol implementation
.SH SYNOPSIS
.B #include <netinet/in.h>
.br
.B #include <netinet/ip.h> \fR/* superset of previous */
-.sp
+.PP
.IB tcp_socket " = socket(AF_INET, SOCK_STREAM, 0);"
.br
.IB udp_socket " = socket(AF_INET, SOCK_DGRAM, 0);"
.B ip
contains a level 2 multicasting implementation conforming to RFC\ 1112.
It also contains an IP router including a packet filter.
-.\" FIXME . has someone verified that 2.1 is really 1812 compliant?
.PP
The programming interface is BSD-sockets compatible.
For more information on sockets, see
.PP
An IP socket is created using
.BR socket (2):
-
+.PP
socket(AF_INET, socket_type, protocol);
-
-Valid socket types are
+.PP
+Valid socket types include
.B SOCK_STREAM
-to open a
-.BR tcp (7)
-socket,
+to open a stream socket,
.B SOCK_DGRAM
-to open a
-.BR udp (7)
-socket, or
+to open a datagram socket, and
.B SOCK_RAW
to open a
.BR raw (7)
socket to access the IP protocol directly.
+.PP
.I protocol
is the IP protocol in the IP header to be received or sent.
-The only valid values for
+Valid values for
.I protocol
-are 0 and
+include:
+.IP \(bu 2
+0 and
.B IPPROTO_TCP
-for TCP sockets, and 0 and
+for
+.BR tcp (7)
+stream sockets;
+.IP \(bu
+0 and
.B IPPROTO_UDP
-for UDP sockets.
+for
+.BR udp (7)
+datagram sockets;
+.IP \(bu
+.B IPPROTO_SCTP
+for
+.BR sctp (7)
+stream sockets; and
+.IP \(bu
+.B IPPROTO_UDPLITE
+for
+.BR udplite (7)
+datagram sockets.
+.PP
For
.B SOCK_RAW
you may specify a valid IANA IP protocol defined in
to a random free port or to a usable shared port with the local address
set to
.BR INADDR_ANY .
-
+.PP
A TCP local socket address that has been bound is unavailable for
some time after closing, unless the
.B SO_REUSEADDR
is set to the IP protocol.
.PP
.in +4n
-.nf
+.EX
struct sockaddr_in {
sa_family_t sin_family; /* address family: AF_INET */
in_port_t sin_port; /* port in network byte order */
struct in_addr {
uint32_t s_addr; /* address in network byte order */
};
-.fi
+.EE
.in
.PP
.I sin_family
.IR "privileged ports"
(or sometimes:
.IR "reserved ports" ).
-Only privileged processes (i.e., those having the
+Only a privileged process
+(on Linux: a process that has the
.B CAP_NET_BIND_SERVICE
-capability) may
+capability in the user namespace governing its network namespace) may
.BR bind (2)
to these sockets.
Note that the raw IPv4 protocol as such has no concept of a
.I in_addr
should be assigned one of the
.BR INADDR_*
-values (e.g.,
-.BR INADDR_ANY )
+values
+(e.g.,
+.BR INADDR_LOOPBACK )
+using
+.BR htonl (3)
or set using the
.BR inet_aton (3),
.BR inet_addr (3),
.BR inet_makeaddr (3)
library functions or directly with the name resolver (see
.BR gethostbyname (3)).
-
-IPv4 addresses are divided into unicast, broadcast
+.PP
+IPv4 addresses are divided into unicast, broadcast,
and multicast addresses.
Unicast addresses specify a single interface of a host,
-broadcast addresses specify all hosts on a network and multicast
+broadcast addresses specify all hosts on a network, and multicast
addresses address all hosts in a multicast group.
Datagrams to broadcast addresses can be sent or received only when the
.B SO_BROADCAST
In the current implementation, connection-oriented sockets are allowed
to use only unicast addresses.
.\" Leave a loophole for XTP @)
-
+.PP
Note that the address and the port are always stored in
network byte order.
In particular, this means that you need to call
on the number that is assigned to a port.
All address/port manipulation
functions in the standard library work in network byte order.
-
+.PP
There are several special addresses:
.B INADDR_LOOPBACK
(127.0.0.1)
.BR IPPROTO_IP .
.\" or SOL_IP on Linux
A boolean integer flag is zero when it is false, otherwise true.
-
+.PP
When an invalid socket option is specified,
.BR getsockopt (2)
and
Argument is an
.I ip_mreqn
structure.
-.sp
+.IP
.in +4n
-.nf
+.EX
struct ip_mreqn {
struct in_addr imr_multiaddr; /* IP multicast group
address */
interface */
int imr_ifindex; /* interface index */
};
-.fi
+.EE
.in
-.sp
+.IP
.I imr_multiaddr
contains the address of the multicast group the application
wants to join or leave.
(The kernel determines which structure is being passed based
on the size passed in
.IR optlen .)
-
+.IP
.B IP_ADD_MEMBERSHIP
is valid only for
.BR setsockopt (2).
Argument is an
.I ip_mreq_source
structure.
-.sp
+.IP
.in +4n
-.nf
+.EX
struct ip_mreq_source {
struct in_addr imr_multiaddr; /* IP multicast group
address */
struct in_addr imr_sourceaddr; /* IP address of
multicast source */
};
-.fi
+.EE
.in
-.sp
+.IP
The
.I ip_mreq_source
structure is similar to
.I ip_mreqn
described under
-.BR IP_ADD_MEMBERSIP .
+.BR IP_ADD_MEMBERSHIP .
The
.I imr_multiaddr
field contains the address of the multicast group the application
This option can be used multiple times to allow
receiving data from more than one source.
.TP
-.BR IP_BIND_ADDRESS_NO_PORT " (since Linux 4.2)
-Instruct kernel to not reserve an ephemeral port at bind() time.
-The port will be automatically chosen at connect() time, in a way
-that allows sharing a source port as long as the 4-tuples are unique.
+.BR IP_BIND_ADDRESS_NO_PORT " (since Linux 4.2)"
+.\" commit 90c337da1524863838658078ec34241f45d8394d
+Inform the kernel to not reserve an ephemeral port when using
+.BR bind (2)
+with a port number of 0.
+The port will later be automatically chosen at
+.BR connect (2)
+time,
+in a way that allows sharing a source port as long as the 4-tuple is unique.
.TP
.BR IP_BLOCK_SOURCE " (since Linux 2.4.22 / 2.5.68)"
Stop receiving multicast data from a specific source in a given group.
If the application has subscribed to multiple sources within
the same group, data from the remaining sources will still be delivered.
To stop receiving data from all sources at once, use
-.BR IP_LEAVE_GROUP .
+.BR IP_DROP_MEMBERSHIP .
.IP
Argument is an
.I ip_mreq_source
Argument is an
.I ip_msfilter
structure.
-.sp
+.IP
.in +4n
-.nf
+.EX
struct ip_msfilter {
struct in_addr imsf_multiaddr; /* IP multicast group
address */
struct in_addr imsf_slist[1]; /* Array of source
addresses */
};
-.fi
+.EE
.in
-.sp
+.IP
There are two macros,
.BR MCAST_INCLUDE
and
.\" Precisely: 2.1.124
Retrieve the current known path MTU of the current socket.
Returns an integer.
-
+.IP
.B IP_MTU
is valid only for
.BR getsockopt (2)
.B IP_PMTUDISC_WANT
will fragment a datagram if needed according to the path MTU,
or will set the don't-fragment flag otherwise.
-
+.IP
The system-wide default can be toggled between
.B IP_PMTUDISC_WANT
and
IP_PMTUDISC_DO:Always do Path MTU Discovery.
IP_PMTUDISC_PROBE:Set DF but ignore Path MTU.
.TE
-
+.sp 1
When PMTU discovery is enabled, the kernel automatically keeps track of
the path MTU per destination host.
When it is connected to a specific peer with
error queue (see
.BR IP_RECVERR ).
A new error will be queued for every incoming MTU update.
-
+.IP
While MTU discovery is in progress, initial packets from datagram sockets
may be dropped.
Applications using UDP should be aware of this and not
take it into account for their packet retransmit strategy.
-
+.IP
To bootstrap the path MTU discovery process on unconnected sockets, it
is possible to start with a big datagram size
-(up to 64K-headers bytes long) and let it shrink by updates of the path MTU.
-.\" FIXME . this is an ugly hack
-
+(up to 64 kilobytes, minus headers) and let it shrink by updates of the path MTU.
+.IP
To get an initial estimate of the
path MTU, connect a datagram socket to the destination address using
.BR connect (2)
with the
.B IP_MTU
option.
-
+.IP
It is possible to implement RFC 4821 MTU probing with
.B SOCK_DGRAM
or
If enabled (argument is nonzero),
the reassembly of outgoing packets is disabled in the netfilter layer.
The argument is an integer.
-
+.IP
This option is valid only for
.B SOCK_RAW
sockets.
.BR sendmsg (2).
.IP
.in +4n
-.nf
+.EX
struct in_pktinfo {
unsigned int ipi_ifindex; /* Interface index */
struct in_addr ipi_spec_dst; /* Local address */
struct in_addr ipi_addr; /* Header Destination
address */
};
-.fi
+.EE
.in
.IP
-.\" FIXME . elaborate on that.
.I ipi_ifindex
is the unique index of the interface the packet was received on.
.I ipi_spec_dst
structure:
.IP
.in +4n
-.ne 18
-.nf
+.EX
#define SO_EE_ORIGIN_NONE 0
#define SO_EE_ORIGIN_LOCAL 1
#define SO_EE_ORIGIN_ICMP 2
};
struct sockaddr *SO_EE_OFFENDER(struct sock_extended_err *);
-.fi
+.EE
.in
.IP
.I ee_errno
When this flag is set, pass a
.B IP_TTL
control message with the time-to-live
-field of the received packet as a byte.
+field of the received packet as a 32-bit integer.
Not supported for
.B SOCK_STREAM
sockets.
The parameters can be accessed by reading or writing files in the directory
.IR /proc/sys/net/ipv4/ .
.\" FIXME As at 2.6.12, 14 Jun 2005, the following are undocumented:
-.\" ip_queue_maxlen
-.\" ip_conntrack_max
+.\" ip_queue_maxlen
+.\" ip_conntrack_max
Interfaces described as
.I Boolean
take an integer value, with a nonzero value ("true") meaning that
was controlled at compile time by the
.B CONFIG_IP_ALWAYS_DEFRAG
option; this option is not present in 2.4.x and later]
-
+.IP
When this boolean flag is enabled (not equal 0), incoming fragments
(parts of IP packets
that arose when some host between origin and destination decided
that the packets were too large and cut them into pieces) will be
reassembled (defragmented) before being processed, even if they are
about to be forwarded.
-
-Only enable if running either a firewall that is the sole link
+.IP
+Enable only if running either a firewall that is the sole link
to your network or a transparent proxy; never ever use it for a
normal router or host.
Otherwise, fragmented communication can be disturbed
if the fragments travel over different links.
Defragmentation also has a large memory and CPU time cost.
-
+.IP
This is automagically turned on when masquerading or transparent
proxying are configured.
.\"
Operation on a nonblocking socket would block.
.TP
.B EALREADY
-An connection operation on a nonblocking socket is already in progress.
+A connection operation on a nonblocking socket is already in progress.
.TP
.B ECONNABORTED
A connection was closed during an
.TP
.B EHOSTUNREACH
No valid routing table entry matches the destination address.
-This error can be caused by a ICMP message from a remote router or
+This error can be caused by an ICMP message from a remote router or
for the local routing table.
.TP
.B EINVAL
.\" IP_PASSSEC is Linux-specific
.\" IP_XFRM_POLICY is Linux-specific
.\" IP_IPSEC_POLICY is a nonstandard extension, also present on some BSDs
-
+.PP
Be very careful with the
.B SO_BROADCAST
option \- it is not privileged in Linux.
.B IP_TTL
option used in Linux.
.PP
-Using
+Using the
.B SOL_IP
-socket options level isn't portable, BSD-based stacks use
+socket options level isn't portable; BSD-based stacks use the
.B IPPROTO_IP
level.
+.PP
+.B INADDR_ANY
+(0.0.0.0) and
+.B INADDR_BROADCAST
+(255.255.255.255) are byte-order-neutral.
+This means
+.BR htonl (3)
+has no effect on them.
.SS Compatibility
For compatibility with Linux 2.0, the obsolete
.BI "socket(AF_INET, SOCK_PACKET, " protocol )
.BR capabilities (7),
.BR icmp (7),
.BR ipv6 (7),
+.BR netdevice (7),
.BR netlink (7),
.BR raw (7),
.BR socket (7),
.BR tcp (7),
-.BR udp (7)
+.BR udp (7),
+.BR ip (8)
+.PP
+The kernel source file
+.IR Documentation/networking/ip-sysctl.txt .
.PP
RFC\ 791 for the original IP specification.
RFC\ 1122 for the IPv4 host requirements.