It was pretty messy; I had to rewrite a good chunk of it.
== Problem 1 ==
It was discarding meaningful validation results when miscellaneous
errors prevented the deltas array from being built.
Deltas are optional; as long as Fort has the snapshot of the latest
tree, it doesn't technically need deltas. They speed up synchronization,
but in the worst case scenario, the RTR server can keep pushing Cache
Resets.
Severity: Warning. Memory allocation failures are the only eventuality
that might prevent the deltas array from being built.
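
As a rough sketch of the idea (not Fort's actual code), the database can commit the snapshot unconditionally and treat the deltas as optional. Every name below (`vrp_db`, `db_commit`, `reply_to_serial_query`, the `REPLY_*` constants) is hypothetical and only illustrates the fallback to Cache Reset:

```c
#include <stddef.h>

/* Hypothetical stand-ins; Fort's real structures differ. */
struct snapshot { size_t vrp_count; };
struct deltas { size_t len; };

struct vrp_db {
	const struct snapshot *base;  /* latest validated tree */
	const struct deltas *deltas;  /* NULL if they could not be built */
};

enum rtr_reply { REPLY_CACHE_RESET, REPLY_DELTAS };

/* Keeps the validation results even when new_deltas is NULL. */
void
db_commit(struct vrp_db *db, const struct snapshot *new_base,
    const struct deltas *new_deltas)
{
	db->base = new_base;
	db->deltas = new_deltas;
}

/*
 * Without deltas, answer a Serial Query with a Cache Reset, so the
 * router falls back to downloading the whole snapshot.
 */
enum rtr_reply
reply_to_serial_query(const struct vrp_db *db)
{
	return (db->deltas != NULL) ? REPLY_DELTAS : REPLY_CACHE_RESET;
}
```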
== Problem 2 ==
The database always kept one serial's worth of obsolete deltas.
This has been cleaned up, which saves a potentially large amount of
memory.
Severity: Fine. Not a memory leak.
== Problem 3 ==
The code computed deltas even when there were no routers listening.
Routers are the only delta consumers, so there was no need to waste all
that time.
Severity: Fine; performance quirk.
== Problem 4 ==
I found an RTR client implementation (Cloudflare's rpki-rtr-client) that
hangs when the first serial is zero. Fort's first serial is now 1.
Severity: Warning. This is rpki-rtr-client's fault, but other client
implementations may be prone to the same bug. Starting at 1 is more
future-proof.
== Problem 5 ==
It seems it wasn't cleaning the deltas array when all routers were known
to have bogus serials. This was the code:
	/* Its the first element or reached end, nothing to purge */
	if (group == state.deltas.array ||
	    (group - state.deltas.array) == state.deltas.len)
		return 0;
If you reached the end of the deltas array, and the minimum router
serial is larger than all the array serials, then all deltas are
useless; you're supposed to purge all of them.
Severity: Fine. It was pretty hard to trigger, and not a memory leak.
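
For illustration only, here is a minimal sketch of a purge that also handles the "reached the end" case. The types and the `purge_obsolete` function are hypothetical, not Fort's real code. It assumes the groups are sorted by ascending serial and that a group is obsolete when its serial is lower than the minimum router serial, and it ignores serial wraparound:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; Fort's real structures differ. */
struct delta_group {
	uint32_t serial;
};

struct delta_array {
	struct delta_group *array;
	size_t len;
};

/*
 * Drops every group whose serial is lower than min_router_serial and
 * returns the number of purged groups.
 */
size_t
purge_obsolete(struct delta_array *deltas, uint32_t min_router_serial)
{
	size_t purged = 0;

	assert(deltas != NULL);

	/* Count the leading groups that no router can use anymore. */
	while (purged < deltas->len &&
	    deltas->array[purged].serial < min_router_serial)
		purged++;

	/* purged == deltas->len is valid: every group is obsolete. */
	if (purged > 0) {
		memmove(deltas->array, deltas->array + purged,
		    (deltas->len - purged) * sizeof(deltas->array[0]));
		deltas->len -= purged;
	}

	return purged;
}
```

Only the "first element" case means there is nothing to purge; reaching the end of the array empties it.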