Add a new route attribute, krt_scope, to expose the Linux kernel route
scope. Constants from /etc/iproute2/rt_scopes (prefixed by "ips_") are
expected to be used with the attribute. Both import and export are
supported.
Also, the patch fixes device route export to the kernel, by setting link
scope automatically.
Pavel Tvrdik [Thu, 8 Sep 2016 11:45:36 +0000 (13:45 +0200)]
BFD: Fix invalid read from pollfd array
It is possible that sockets_add() are called between sockets_prepare()
and sockets_fire() during poll loop in birdloop_main(), so we need to
use loop->poll_fd.used instead of loop->sock_num to find the last field.
Kernel protocol calls rt_export_merged(), which used @rte_update_pool for
temporary allocations, supposing it is called from other functions from
rt-table.c that handles locking and flushing of the linpool. Therefore,
linpool was not flushed properly and memory leaked.
Add linpool argument to rt_export_merged() and use @krt_filter_lp when
called from kernel protocol.
Thanks to Justin Cattle and Alexander Frolkin for the bugreport.
Kernel routes with different metrics do not clash with each other,
therefore using dedicated metric value is a reliable way to avoid
overwriting routes from other sources (e.g. kernel device routes).
Although kernel route metric could already be set as a route attribute by
filters, that is not consistent with the way how Linux kernel handles
route metric - not just a route attribute, but a part of a route key.
Linux represents IPv6 ECMP routes as a sequence of unipath routes with
the same prefix. We have to translate between our representation (one
route with multipath next hop) and the Linux representation in both
directions.
Proper learning of alien IPv6 ECMP routes still not supported.
Thanks to Mikhail Sennikovskii for the original patch.
Babel: Immediately update hello interval on interface reconfigure
An interface reconfiguration may change both the hello and update
intervals. An update interval change is immediately put into effect,
while a hello interval change is not. This also updates the hello
interval immediately (if the new interval is shorter than the old one),
and sends a hello to notify peers of the change.
Ignore tentative IPv6 addresses and wait until finish of Duplicate
Address Detection (We got notification when an address is no longer
tentative) to avoid problems when protocols try to use interfaces
with tentative link-local addresses.
Babel: Do not maintain feasibility distance for our own routes
We do not need to maintain feasibility distances for our own router
ID (we ignore the updates anyway). Not doing so makes the routes be
garbage collected sooner when export filters change.
Babel: Do not keep an infeasible route as selected
When a route becomes infeasible it should not be kept as selected; this
is forbidden by section 3.6 of the RFC and prevents subsequent updates
from the same router ID from replacing it.
Babel: Send wildcard retractions on shutdown and startup
This makes BIRD send a wildcard retraction on all interfaces before
shutting down and right after starting up. This helps ensure that
neighbours will discard the announced routes as soon as possible,
rather than only after the normal timeout procedures.
An update with wildcard AE and infinite metric should be treated as a
global retraction of all prefixes announced by that neighbour, per
section 4.4.9 of the RFC. In addition, router ID and seqno in retraction
updates should be ignored. This reworks the handling of retractions and
adjusts the parser to handle all this correctly.
This updates the documentation to correctly mention Babel when protocols
are listed, and adds examples and route attribute documentation to the
Babel section of the docs.
Intervals are carried as 16-bit centisecond values, but kept internally
in 16-bit second values, which causes a potential for overflow. This adds
some checks to make sure this does not happen.
Although RFC 4271 does not forbid empty path segments, they are useless
and some implementations consider them invalid. It is clarified in RFC 7606,
specifying that AS_PATH with empty segment is considered malformed.
Stijn Tintel [Tue, 10 May 2016 13:45:35 +0000 (16:45 +0300)]
netlink: update struct msghdr
The netlink code assumes an order for the members of struct msghdr.
This breaks recvmsg and sendmsg with musl libc on mips64. Fix this by
using designated initializers instead.
Pavel Tvrdik [Tue, 3 May 2016 07:32:49 +0000 (09:32 +0200)]
Initialize variable ifr in sk_setup()
==00:00:00:02.831 2468== Syscall param socketcall.setsockopt(optval) points to uninitialised byte(s)
==00:00:00:02.831 2468== at 0x513BDEA: setsockopt (in /usr/lib/libc-2.23.so)
==00:00:00:02.831 2468== by 0x45C7AF: sk_setup (io.c:1216)
==00:00:00:02.831 2468== by 0x45CDFF: sk_open (io.c:1417)
==00:00:00:02.831 2468== by 0x44B562: rip_open_socket (packets.c:740)
==00:00:00:02.831 2468== by 0x4481A7: rip_iface_locked (rip.c:616)
==00:00:00:02.831 2468== by 0x4133E4: olock_run_event (locks.c:177)
==00:00:00:02.831 2468== by 0x45A6DE: ev_run (event.c:85)
==00:00:00:02.831 2468== by 0x45A7AD: ev_run_list (event.c:142)
==00:00:00:02.831 2468== by 0x45E0FC: io_loop (io.c:2066)
==00:00:00:02.831 2468== by 0x463B56: main (main.c:845)
==00:00:00:02.831 2468== Address 0xffefffd24 is on thread 1's stack
==00:00:00:02.831 2468== in frame #1, created by sk_setup (io.c:1188)
==00:00:00:02.831 2468== Uninitialised value was created by a stack allocation
==00:00:00:02.831 2468== at 0x45C6BB: sk_setup (io.c:1188)
This patch implements the IPv6 subset of the Babel routing protocol.
Based on the patch from Toke Hoiland-Jorgensen, with some heavy
modifications and bugfixes.
Thanks to Toke Hoiland-Jorgensen for the original patch.
Add code for manipulation with TCP-MD5 keys in the IPsec SA/SP database
at FreeBSD systems. Now, BGP MD5 authentication (RFC 2385) keys are
handled automatically on both Linux and FreeBSD.
Nest: Reset export route counter during graceful restart
Counter exp_routes is increased during initial route feed after GR
recovery, so it has to start with zero, otherwise BIRD will end with
double value in exp_routes.
IO: Replace RX priority heuristic with explicit mark
In BIRD, RX has lower priority than TX with the exception of RX from
control socket. The patch replaces heuristic based on socket type with
explicit mark and uses it for both control socket and BGP session waiting
to be established.
This should avoid an issue when during heavy load, outgoing connection
could connect (TX event), send open, but then failed to receive OPEN /
establish in time, not sending notifications between and therefore
got hold timer expired error from the neighbor immediately after it
finally established the connection.
When a kernel route changed, function krt_learn_scan() noticed that and
replaced the route in internal kernel FIB, but after that, function
krt_learn_prune() failed to propagate the new route to the nest, because
it confused the new route with the (removed) old best route and decided
that the best route did not changed.
Wow, the original code (and the bug) is almost 17 years old.
Birdlib: Modify lists to avoid problems with pointer aliasing rules
The old linked list implementation used some wild typecasts and required
GCC option -fno-strict-aliasing to work properly. This patch fixes that.
However, we still keep the option due to other potential problems.
OSPF: Fix bogus LSA ID collisions between received and originated LSAs
After restart, LSAs locally originated by the previous instance are
received from neighbors. They are installed to LSA db and flushed. If
export of a route triggers origination of a new external LSA before flush
of the received one is complete, the check in ospf_originate_lsa() causes
origination to fail (because en->nf is NULL for the old LSA and non-NULL
for the new LSA). The patch fixes this by updating the en->nf for LSAs
being flushed (as is already done for empty ones). Generally, en->nf
field deserves some better description in the code.
When a BGP session was established by an outgoing connection with
Graceful Restart behavior negotiated, a pending incoming connection in
OpenSent state, and another incoming connection was received, then the
outgoing connection (and whole BGP session) was closed, but the old
incoming connection was just overwritten by the new one. That later
caused a crash when the hold timer from the old connection fired.
Since 2.6.19, the netlink API defines RTA_TABLE routing attribute to
allow 32-bit routing table IDs. Using this attribute to index routing
tables at Linux, instead of 8-bit rtm_table field.
Conf: Fixes bug in symbol lookup during reconfiguration
Symbol lookup by cf_find_symbol() not only did the lookup but also added
new void symbols allocated from cfg_mem linpool, which gets broken when
lookups are done outside of config parsing, which may lead to crashes
during reconfiguration.
The patch separates lookup-only cf_find_symbol() and config-modifying
cf_get_symbol(), while the later is called only during parsing. Also
new_config and cfg_mem global variables are NULLed outside of parsing.
If the number of sockets is too much for select(), we should at least
handle it with proper error messages and reject new sockets instead of
breaking the event loop.