Add basic VRF (virtual routing and forwarding) support. Protocols can be
associated with VRFs, such protocols will be restricted to interfaces
assigned to the VRF (as reported by Linux kernel) and will use sockets
bound to the VRF. E.g., different multihop BGP instances can use diffent
kernel routing tables to handle BGP TCP connections.
The VRF support is preliminary, currently there are several limitations:
- Recent Linux kernels (4.11) do not handle correctly sockets bound
to interaces that are part of VRF, so most protocols other than multihop
BGP do not work. This will be fixed by future kernel versions.
- Neighbor cache ignores VRFs. Breaks config with the same prefix on
local interfaces in different VRFs. Not much problem as single hop
protocols do not work anyways.
- Olock code ignores VRFs. Breaks config with multiple BGP peers with the
same IP address in different VRFs.
- Incoming BGP connections are not dispatched according to VRFs.
Breaks config with multiple BGP peers with the same IP address in
different VRFs. Perhaps we would need some kernel API to read VRF of
incoming connection? Or probably use multiple listening sockets in
int-new branch.
- We should handle master VRF interface up/down events and perhaps
disable associated protocols when VRF goes down. Or at least disable
associated interfaces.
- Also we should check if the master iface is really VRF iface and
not some other kind of master iface.
- BFD session request dispatch should be aware of VRFs.
- Perhaps kernel protocol should read default kernel table ID from VRF
iface so it is not necessary to configure it.
Starting from Linux 4.11, IPv6 ECMP routes are now notified using
RTA_MULTIPATH, like IPv4 ones. The patch adds support for RTA_MULTIPATH
parsing for IPv6 routes. This also enables to parse ECMP alien routes
correctly.
Keep a cache of all the relevant prefixes we send out. When a prefix
appears, insert it into the cache. If it dies, keep it there for a
while, marked as dead.
Conf: Replace keyword and symbol hash table with generic hash table.
The old hash table had fixed size, which makes it slow for config files
with large number of symbols and symbol lookups. The new one is growing
according to needs.
Make indentation and quotation consistent in configure macros.
Also remove --with-sysinclude option, which was broken for 7 years
and nobody complained.
The patch allows to use autoreconf, replaces some long obsolete
constructs and does some other minor cleanups. Also, the file
configure.in is renamed to configure.ac, as the old name has been
deprecated for a long time.
When a BGP session with ADD_PATH is restarted and the neighbor do not
announce ADD_PATH capability during reconnect, the accept_ra_types is
still set to RA_ANY.
Static: Fix bug in static route filter expressions
During reconfiguration, old and new filter expressions in static routes
are compared using i_same() function. When filter expressions contain
function calls, it is necessary that old filter expressions are the
second argument in i_same(), as it is internally modified by i_same().
Otherwise pointers to old (and freed) data appear in the config
structure.
Thanks to Lennert Buytenhek for tracking and reporting the bug.
OSPF: Fix net-summary origination combined with stubnet option
Stubnet nodes in OSPF FIB were removed during rt_sync(), but the pointer
remained in top_hash_entry.nf, so net-summary LSA origination was
confused, reported 'LSA ID collision' and net-summary LSAs were not
originated properly.
Thanks to Naveen Chowdary Yerramneni for bugreport and analysis.
Prefix and bucket tables are initialized when entering established state
but not explicitly freed when leaving it (that is handled by protocol
restart). With graceful restart, BGP may enter and leave established
state multiple times without hard protocol restart causing memory leak.
Commit 3c09af41... changed behavior of int_set_add() from prepend to
append, which makes more sense for community list, but prepend must be
used for cluster list. Add int_set_prepend() and use it in cluster list
handling code.
BIRD passed string from configuration to openlog(), which kept it
internally. After reconfiguration the old string was freed, therefore
openlog had invalid copy.
Pavel Tvrdik [Mon, 26 Sep 2016 16:05:51 +0000 (18:05 +0200)]
Doc: Fix deprecated unescaped braces in perl script
This commit should fix warning `make docs'
./sgml2html bird.sgml Unescaped left brace in regex is deprecated,
passed through in regex; marked by <-- HERE in m/\\nameurl{ <-- HERE
(.*)}{(.*)}/ at fmt_latex2e.pl line 287.
Pavel Tvrdik [Sat, 1 Oct 2016 10:50:29 +0000 (12:50 +0200)]
Tree/Trie: Check the end of buffer
We set buffer->pos to buffer->end in function buffer_print() when
bvsnprintf() failed, so there would be uninitialized memory between
the old buffer->pos and the current buffer->pos.
Add a new route attribute, krt_scope, to expose the Linux kernel route
scope. Constants from /etc/iproute2/rt_scopes (prefixed by "ips_") are
expected to be used with the attribute. Both import and export are
supported.
Also, the patch fixes device route export to the kernel, by setting link
scope automatically.
Pavel Tvrdik [Thu, 8 Sep 2016 11:45:36 +0000 (13:45 +0200)]
BFD: Fix invalid read from pollfd array
It is possible that sockets_add() are called between sockets_prepare()
and sockets_fire() during poll loop in birdloop_main(), so we need to
use loop->poll_fd.used instead of loop->sock_num to find the last field.
Kernel protocol calls rt_export_merged(), which used @rte_update_pool for
temporary allocations, supposing it is called from other functions from
rt-table.c that handles locking and flushing of the linpool. Therefore,
linpool was not flushed properly and memory leaked.
Add linpool argument to rt_export_merged() and use @krt_filter_lp when
called from kernel protocol.
Thanks to Justin Cattle and Alexander Frolkin for the bugreport.
Kernel routes with different metrics do not clash with each other,
therefore using dedicated metric value is a reliable way to avoid
overwriting routes from other sources (e.g. kernel device routes).
Although kernel route metric could already be set as a route attribute by
filters, that is not consistent with the way how Linux kernel handles
route metric - not just a route attribute, but a part of a route key.
Linux represents IPv6 ECMP routes as a sequence of unipath routes with
the same prefix. We have to translate between our representation (one
route with multipath next hop) and the Linux representation in both
directions.
Proper learning of alien IPv6 ECMP routes still not supported.
Thanks to Mikhail Sennikovskii for the original patch.