Maria Matejka [Tue, 27 Sep 2022 10:17:05 +0000 (12:17 +0200)]
KRT: Fix route learning
This is a reimplementation of commit 0f2be469f897b6d9f925563bbf522438c83522ea
by Alexander Zubkov. In the master branch, changes in commit eb937358
broke setting of channel preference for alien routes learned during
scan. The preference was set only for async routes.
The original solution is extended here to accomodate for v3 specifics.
Maria Matejka [Wed, 21 Sep 2022 14:53:17 +0000 (16:53 +0200)]
Caching eattrs in filters is not needed anymore.
After flattening the route attribute structure, the ea_list ** is derivable
from rte * by arithmetics. Caching the derived value doesn't help performance
and therefore is removed as unnecessary.
Maria Matejka [Mon, 12 Sep 2022 09:09:43 +0000 (11:09 +0200)]
Fixing several race-conditions in event code.
After a suggestion by Santiago, I added the direct list pointer into
events and the events are now using this value to check whether the
route is active or not. Also the whole trick with sentinel node unioned
with event list is now gone.
For debugging, there is also an internal circular buffer to store what
has been recently happening in event code before e.g. a crash happened.
By default, this debug is off and must be manually enabled in
lib/event.c as it eats quite some time and space.
Maria Matejka [Fri, 22 Oct 2021 17:43:55 +0000 (19:43 +0200)]
Better profylaction recursive route loops
In some specific configurations, it was possible to send BIRD into an
infinite loop of recursive next hop resolution. This was caused by route
priority inversion.
To prevent priority inversions affecting other next hops, we simply
refuse to resolve any next hop if the best route for the matching prefix
is recursive or any other route with the same preference is recursive.
Next hop resolution doesn't change route priority, therefore it is
perfectly OK to resolve BGP next hops e.g. by an OSPF route, yet if the
same (or covering) prefix is also announced by iBGP, by retraction of
the OSPF route we would get a possible priority inversion.
Maria Matejka [Wed, 31 Aug 2022 14:04:36 +0000 (16:04 +0200)]
Flowspec revalidate notification converted to an export hook
Instead of synchronous notifications, we use the asynchronous export
framework to notify flowspec src route updates. This allows us to
invoke flowspec revalidation without locking collisions.
Maria Matejka [Wed, 31 Aug 2022 12:01:59 +0000 (14:01 +0200)]
Hostcache update notification converted to an export hook
Instead of synchronous notifications, we use the asynchronous export
framework to notify also hostcache updates. This allows us to do the
hostcache update and the subsequent next hop update notification without
locking collisions.
Maria Matejka [Tue, 30 Aug 2022 16:05:00 +0000 (18:05 +0200)]
Tables: Requesting prune only after export cleanup
We can't free the network structures before the export has been cleaned
up, therefore it makes more sense to request prune only after export
cleanup. This change also reduces prune calls on table shutdown.
Had to fix route source locking inside BGP export table as we need to
keep the route sources properly allocated until even last BGP pending
update is sent out, therefore the export table printout is accurate.
Maria Matejka [Tue, 2 Aug 2022 15:58:14 +0000 (17:58 +0200)]
Merge branch 'ballygarvan' into HEAD
Replacing the old 3.0-alpha0 cork mechanism with another one inside the
routing table. This version should be simpler and also quite clear what
it does, why and when.
Maria Matejka [Thu, 28 Jul 2022 11:50:59 +0000 (13:50 +0200)]
Route table cork: Indicate whether the export queues are congested.
These routines detect the export congestion (as defined by configurable
thresholds) and propagate the state to readers. There are no readers for
now, they will be added in following commits.
Maria Matejka [Mon, 1 Aug 2022 13:17:41 +0000 (15:17 +0200)]
Fixed main birdloop init in unit tests
Some unit tests weren't initializing the birdloop, trying to write the
birdloop ping into stdin. Fixed this and also forced stdin close on
startup of every test just to be sure that CI and local build behave the
same in this. (CI was failing on this while local build not.)
Ondrej Zajicek [Tue, 26 Jul 2022 16:45:20 +0000 (18:45 +0200)]
Netlink: Restrict route replace for IPv6
Seems like the previous patch was too optimistic, as route replace is
still broken even in Linux 4.19 LTS (but fixed in Linux 5.10 LTS) for:
ip route add 2001:db8::/32 via fe80::1 dev eth0
ip route replace 2001:db8::/32 dev eth0
It ends with two routes instead of just the second.
The issue is limited to direct and special type (e.g. unreachable)
routes, the patch restricts route replace for cases when the new route
is a regular route (with a next hop address).
Ondrej Zajicek [Sun, 24 Jul 2022 22:11:40 +0000 (00:11 +0200)]
Netlink: Simplify handling of IPv6 ECMP routes
When IPv6 ECMP support first appeared in Linux kernel, it used different
API than IPv4 ECMP. Individual next hops were updated and announced
separately, instead of using RTA_MULTIPATH as in IPv4. This has several
drawbacks and requires complex code to merge received notifications to
one multipath route.
When Linux came with IPv6 RTA_MULTIPATH support, the initial versions
were somewhat buggy, so we kept using the old API for updates (splitting
multipath routes to sequences of route updates), while accepting both
old-style routes and RTA_MULTIPATH routes in scans / notifications.
As IPv6 RTA_MULTIPATH support is here for a long time, this patch fully
switches Netlink to the IPv6 RTA_MULTIPATH API and removes old complex
code for handling individual next hop announces.
The required Linux version is at least 4.11 for reliable operation.
Ondrej Zajicek [Sun, 24 Jul 2022 00:15:20 +0000 (02:15 +0200)]
KRT: Scan routing tables separetely on linux to avoid congestion
Remove compile-time sysdep option CONFIG_ALL_TABLES_AT_ONCE, replace it
with runtime ability to run either separate table scans or shared scan.
On Linux, use separate table scans by default when the netlink socket
option NETLINK_GET_STRICT_CHK is available, but retreat to shared scan
when it fails.
Running separate table scans has advantages where some routing tables are
managed independently, e.g. when multiple routing daemons are running on
the same machine, as kernel routing table modification performance is
significantly reduced when the table is modified while it is being
scanned.
Thanks Daniel Gröber for the original patch and Toke Høiland-Jørgensen
for suggestions.
Maria Matejka [Fri, 22 Jul 2022 13:37:21 +0000 (15:37 +0200)]
Revert "Export table: Delay freeing of old stored route."
This reverts commit cee0cd148c9b71bf47d007c850193b5fbf9486c1.
This change is not needed in version 2 and the surrounding code has
disappeared mostly in version 3.
Maria Matejka [Fri, 24 Jun 2022 17:53:34 +0000 (19:53 +0200)]
Event lists rewritten to a single linked list
In multithreaded environment, we need to pass messages between workers.
This is done by queuing events to their respective queues. The
double-linked list is not really useful for that as it needs locking
everywhere.
This commit rewrites the event subsystem to use a single-linked list
where events are enqueued by a single atomic instruction and the queue
is processed after atomically moving the whole queue aside.