BSD: Use MSG_DONTROUTE for unicast packets on FreeBSD
BSD systems cannot use SO_DONTROUTE, because it does not work properly
with multicast packets (perhaps it tries to find iface based on multicast
group address). But we can use MSG_DONTROUTE sendmsg() flag for unicast
packets. Works on FreeBSD, is ignored on OpenBSD and is broken on NetBSD
(i guess due to integrated routing table and ARP table).
When use of LLGR is negotiated, handle hold timeout by LLGR instead of by
hard restart. Allow to configure whether BFD session down event should be
handled by GR/LLGR or by hard restart.
Nest: Fix race condition during reconfiguration, part 2
If export filter is changed during reconfiguration and a route disappears
between reconfiguration and refeed (e.g., if the route is a static route
also removed during the reconfiguration), the route is not withdrawn.
The issue was fixed for regular channels by an earlier patch. This patch
fixes the issue for channels in RA_ACCEPTED mode (first-pass-the-filter),
used by BGP with 'secondary' option.
If export filter is changed during reconfiguration and a route disappears
between reconfiguration and refeed (e.g., if the route is a static route
also removed during the reconfiguration), the route is not withdrawn.
The patch fixes that by adding tx reconfiguration timestamp.
The bgpmask literals can include expressions. This is OK but they have
to be interpreted as soon as the code is run, not in the time the code
is used as value.
This led to strange behavior like rewriting bgpmasks when they shan't
be rewritten:
function mask_generator(int as)
{
return [= * as * =];
}
function another()
bgpmask m1;
bgpmask m2;
{
m1 = mask_generator(10);
m2 = mask_generator(20);
if (m1 == m2) {
print("strange"); # this would happen
}
}
Moreover, sending this to CLI would cause stack overflow and knock down the
whole BIRD, as soon as there is at least one route to execute the given
filter on.
The magic match operator (~) inside the bgpmask literal would try to
resolve mmm, which points to the same bgpmask so it would resolve
itself, call the magic match operator and vice versa.
After this patch, the bgpmask literal will get resolved as soon as it's
assigned to mmm and it also will return a type error as bool is not
convertible to ASN in BIRD.
It was supposed to do tail-recursion in interpret() but it didn't
compile as such. Converting it to loop makes a significant filter
performance improvement for flat filters.
The two-letter instructions were quite messy but they could be easily
read from memory dumps. Now GDB (since 2012) supports pretty printing
enum values and GCC checks the switch construction for missing enum
values so we are converting the nice two-byte values to enums.
Anyway, the enum still keeps the old two-byte values to be able to read
the instruction codes even without GDB from plain memory dump.
On Linux, setting the ToS will also set the priority and the range of
accepted values is quite limited (masked by 0x1e). Therefore, 0xc0 is
translated to a priority of 0, not something we want, overriding the
"7" priority which was set previously explicitely. To avoid that, just
move setting priority later in the code.
A filter should log messages only if executed explicitly (e.g., during
route export or route import). When a filter is executed for technical
reasons (e.g., to establish whether a route was exported before), it
should run silently.
RFC6126bis introduces a flags field for the Hello TLV, and adds a unicast flag
that is used to signify that a hello was sent as unicast. This adds parsing of
the flags field and ignores such unicast hellos, which preserves compatibility
until we can add a proper implementation of the unicast hello mechanism.
The patch implements Default Router Preferences and More-Specific Routes
(RFC 4191) for RAdv protocol, allowing to announce router preference and
more specific routes in router advertisements. Routes can be exported to
RAdv like to regular routing protocols.
Some cleanups, bugfixes and other changes done by Ondrej Zajicek.
The patch implements BGP Administrative Shutdown Communication (RFC 8203)
allowing BGP operators to pass messages related to BGP session
administrative shutdown/restart. It handles both transmit and receive of
shutdown messages. Messages are logged and may be displayed by show
protocol all command.
Add basic VRF (virtual routing and forwarding) support. Protocols can be
associated with VRFs, such protocols will be restricted to interfaces
assigned to the VRF (as reported by Linux kernel) and will use sockets
bound to the VRF. E.g., different multihop BGP instances can use diffent
kernel routing tables to handle BGP TCP connections.
The VRF support is preliminary, currently there are several limitations:
- Recent Linux kernels (4.11) do not handle correctly sockets bound
to interaces that are part of VRF, so most protocols other than multihop
BGP do not work. This will be fixed by future kernel versions.
- Neighbor cache ignores VRFs. Breaks config with the same prefix on
local interfaces in different VRFs. Not much problem as single hop
protocols do not work anyways.
- Olock code ignores VRFs. Breaks config with multiple BGP peers with the
same IP address in different VRFs.
- Incoming BGP connections are not dispatched according to VRFs.
Breaks config with multiple BGP peers with the same IP address in
different VRFs. Perhaps we would need some kernel API to read VRF of
incoming connection? Or probably use multiple listening sockets in
int-new branch.
- We should handle master VRF interface up/down events and perhaps
disable associated protocols when VRF goes down. Or at least disable
associated interfaces.
- Also we should check if the master iface is really VRF iface and
not some other kind of master iface.
- BFD session request dispatch should be aware of VRFs.
- Perhaps kernel protocol should read default kernel table ID from VRF
iface so it is not necessary to configure it.
Starting from Linux 4.11, IPv6 ECMP routes are now notified using
RTA_MULTIPATH, like IPv4 ones. The patch adds support for RTA_MULTIPATH
parsing for IPv6 routes. This also enables to parse ECMP alien routes
correctly.
Keep a cache of all the relevant prefixes we send out. When a prefix
appears, insert it into the cache. If it dies, keep it there for a
while, marked as dead.
Conf: Replace keyword and symbol hash table with generic hash table.
The old hash table had fixed size, which makes it slow for config files
with large number of symbols and symbol lookups. The new one is growing
according to needs.
Make indentation and quotation consistent in configure macros.
Also remove --with-sysinclude option, which was broken for 7 years
and nobody complained.
The patch allows to use autoreconf, replaces some long obsolete
constructs and does some other minor cleanups. Also, the file
configure.in is renamed to configure.ac, as the old name has been
deprecated for a long time.
When a BGP session with ADD_PATH is restarted and the neighbor do not
announce ADD_PATH capability during reconnect, the accept_ra_types is
still set to RA_ANY.
Static: Fix bug in static route filter expressions
During reconfiguration, old and new filter expressions in static routes
are compared using i_same() function. When filter expressions contain
function calls, it is necessary that old filter expressions are the
second argument in i_same(), as it is internally modified by i_same().
Otherwise pointers to old (and freed) data appear in the config
structure.
Thanks to Lennert Buytenhek for tracking and reporting the bug.
OSPF: Fix net-summary origination combined with stubnet option
Stubnet nodes in OSPF FIB were removed during rt_sync(), but the pointer
remained in top_hash_entry.nf, so net-summary LSA origination was
confused, reported 'LSA ID collision' and net-summary LSAs were not
originated properly.
Thanks to Naveen Chowdary Yerramneni for bugreport and analysis.
Prefix and bucket tables are initialized when entering established state
but not explicitly freed when leaving it (that is handled by protocol
restart). With graceful restart, BGP may enter and leave established
state multiple times without hard protocol restart causing memory leak.
Commit 3c09af41... changed behavior of int_set_add() from prepend to
append, which makes more sense for community list, but prepend must be
used for cluster list. Add int_set_prepend() and use it in cluster list
handling code.
BIRD passed string from configuration to openlog(), which kept it
internally. After reconfiguration the old string was freed, therefore
openlog had invalid copy.