- Rename `file_exists()` to `file_stat_errno()`
- Make the result string of `node2file()` consistently freeable
- Move `valid_file_or_dir()` to file.c, rename to `file_is_valid()`
- Make cache threshold configurable
- Recover from errors in `__vfprintf()`
- Remove recursion from `cer_cleanup()` and `cer_free()`
- Delete snapshots and deltas after exploding them
- Return error code when remove() fails, but keep deleting files until
the tree traversal is complete.
- At least one platform thinks `nftw(a, b, c, d)` is an error when `a`
is not a directory, so fall back to unlinking when that happens.
- Rename `struct sia_uris` to `struct extension_uris`
- Merge `certificate_refs` module into `certificate`
(Includes a bunch of API review all over `certificate_refs`,
`certificate`, `signed_object` and `signed_data`.)
- Deprecate rsync and HTTP configuration priorities
(HTTP/RRDP is now hardcoded as the preferred protocol, to simplify things.)
- Add some comments to `struct cache_mapping`
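The `remove()`/`nftw()` items in the list above can be sketched roughly as follows. This is a minimal illustration, not Fort's actual code: `delete_tree()`, `rm_entry()` and `first_error` are hypothetical names, and checking for `ENOTDIR` is my assumption about how the picky platform reports a non-directory root.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* First error seen during the traversal; 0 if every removal succeeded. */
static int first_error;

/* nftw() callback: remove the entry, remember the first failure, keep going. */
static int
rm_entry(const char *path, const struct stat *sb, int flag, struct FTW *ftw)
{
	(void)sb;
	(void)flag;
	(void)ftw;

	if (remove(path) != 0 && first_error == 0)
		first_error = errno;
	return 0; /* A nonzero return would abort the walk. */
}

/* Hypothetical helper: deletes @path, whether it's a file or a directory. */
int
delete_tree(const char *path)
{
	assert(path != NULL);
	first_error = 0;

	/* FTW_DEPTH: children before their directory. FTW_PHYS: no symlinks. */
	if (nftw(path, rm_entry, 32, FTW_DEPTH | FTW_PHYS) != 0) {
		if (errno != ENOTDIR) /* Assumed errno for a non-directory root */
			return errno;
		/* The root is not a directory; fall back to unlinking it. */
		return (unlink(path) != 0) ? errno : 0;
	}

	return first_error;
}
```

The return value is the first errno found, or 0 if everything was deleted; the traversal itself never stops early.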
To reduce review friction: clarify when result codes matter, force
callers to worry about them, and prevent them from getting jumbled with
other error code types.
(In particular, the cache code was returning `EBUSY` to signal rsync
deferral. Since the code frequently propagates errno, it risked regular
standard library `EBUSY`s being mistaken for rsync deferrals.)
Also, remove negative error codes when they're not needed. (Though
error codes themselves are steadily becoming slop.)
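As an illustration of the idea (not the actual types in the codebase), a dedicated result enum keeps "rsync deferred" out of the errno namespace. `dl_status`, `MUST_CHECK` and `classify()` are hypothetical names, and using the compiler's `warn_unused_result` attribute is only one assumed way to force callers to look at the result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical result type; distinct from the int errno values. */
typedef enum {
	DLS_OK,       /* Download finished */
	DLS_DEFERRED, /* rsync still running; suspend the task, retry later */
	DLS_FAILED    /* Download failed; the errno travels separately */
} dl_status;

#if defined(__GNUC__) || defined(__clang__)
#define MUST_CHECK __attribute__((warn_unused_result))
#else
#define MUST_CHECK
#endif

/*
 * Toy classifier: @rsync_pending and @sys_error stand in for whatever the
 * real download code observes. The errno, if any, is stored in @error_out.
 */
MUST_CHECK dl_status
classify(int rsync_pending, int sys_error, int *error_out)
{
	assert(error_out != NULL);
	*error_out = 0;

	if (rsync_pending)
		return DLS_DEFERRED;
	if (sys_error != 0) {
		/* A genuine EBUSY lands here; it can't be read as a deferral. */
		*error_out = sys_error;
		return DLS_FAILED;
	}
	return DLS_OK;
}
```

With this shape, a standard library `EBUSY` comes out as `DLS_FAILED` plus an errno, and only the cache itself can produce `DLS_DEFERRED`.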
1. Propagate EBUSY so the main loop suspends the task (and takes care of
other tasks) while the rsync runs.
2. The spawner now reports the rsync URL and path back to the parent, so
the cache can update the download state.
The single thread requirement and lack of polling were preventing the
spawner from running multiple rsyncs at the same time (as their output
needs to be exhausted for them to end), and more importantly, from
consuming the request stream while the one rsync was running. (The
latter might result in dropped requests if too many rsyncs are queued.)
Therefore, poll both the request stream and the rsync pipes. All input
is now consumed immediately, and multiple rsyncs can be forked at the
same time. (Still needs a limit.)
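A minimal sketch of the polling described above, assuming plain `poll()` over one request descriptor plus the rsync pipes. `poll_once()` and `MAX_RSYNCS` are hypothetical; the real spawner would parse the requests and handle the rsync output instead of discarding both.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

#define MAX_RSYNCS 4 /* Hypothetical cap, standing in for the missing limit */

/*
 * Waits until the request stream or any rsync pipe is readable, then reads
 * one chunk from every ready descriptor. Returns the number of descriptors
 * serviced, 0 on timeout, or -1 if poll() failed.
 */
int
poll_once(int request_fd, const int *rsync_fds, int nrsyncs, int timeout_ms)
{
	struct pollfd pfds[1 + MAX_RSYNCS];
	char buf[4096];
	int i, ready, serviced = 0;

	assert(nrsyncs >= 0 && nrsyncs <= MAX_RSYNCS);

	pfds[0].fd = request_fd;
	pfds[0].events = POLLIN;
	for (i = 0; i < nrsyncs; i++) {
		pfds[1 + i].fd = rsync_fds[i];
		pfds[1 + i].events = POLLIN;
	}

	ready = poll(pfds, 1 + nrsyncs, timeout_ms);
	if (ready <= 0)
		return ready;

	for (i = 0; i < 1 + nrsyncs; i++) {
		if (pfds[i].revents & (POLLIN | POLLHUP)) {
			/* i == 0 is the request stream; the rest are rsyncs. */
			if (read(pfds[i].fd, buf, sizeof(buf)) >= 0)
				serviced++;
		}
	}
	return serviced;
}
```

Because no descriptor is read unless `poll()` flagged it, one slow rsync cannot stall the request stream or the other rsyncs.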
I haven't actually found much incentive to justify the normalization,
but libcurl provides a (still flawed as of 8.12.1, but workable) API to
do it effortlessly.
This is better than the previous implementation, and future-proof
enough.
1. rsync is a bit of a pain as a retrieval tool for RPKI,
and I'd like to avoid it when I can get away with it.
2. Refresh by SIA was already prioritizing RRDP over rsync,
so this makes the overall behavior more consistent.
3. Always preferring one protocol over the other tends to
reduce bandwidth & cache usage.
So, mirror the SIA refresh order for TAs. From highest to lowest
priority:
Stop rejecting RPPs if unrecognizable absent files are fileListed
RFC 9286:
> The RP MUST acquire all of the files enumerated in the manifest
> (fileList) from the publication point. If there are files listed in
> the manifest that cannot be retrieved from the publication point,
> the RP MUST treat this as a failed fetch.
This was clashing with Fort's default rsync filters because they were
preventing unknown extensions from being downloaded:
Which will be a problem whenever the IETF defines new legal repository
extensions, such as .asa.
Therefore, ignore unknown manifest fileList extensions. This technically
violates RFC 9286, but it's a necessary evil given that we can't trust
repositories to always only serve proper RPKI content.
- Fort shouldn't lose the cache index when a signal interrupts it.
- Writing the index during the signal handler is not possible,
because of the async-signal-safe requirement.
- Writing the index outside of the signal handler is seemingly not
viable, because of the infelicities between the signal and
multithreading APIs in C.
I haven't completely discarded the "dropping multithreading" option,
but since it seems disproportionate, I've been rethinking the index.
This commit scatters the index across several files, to minimize lost
information during a stopping signal. This will exacerbate the inode
problem, but that's temporary.
There are many ways in which a mismatching cache index can cause erratic
behavior that's hard to detect. Since the index is written at the end of
the validation cycle, crashing at any point between a cache refresh and
the index write results in a misindexed cache.
Deleting the index after loading it seems to be a reliable way to force
Fort to reset the cache after a crash.
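The load-then-delete idea can be sketched like this. It's only an illustration under assumptions: `load_and_drop_index()` is a hypothetical name, the index is treated as an opaque blob, and unlinking even after a failed read is my own choice for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Reads up to @size - 1 bytes of the index at @path into @buf (always
 * NUL-terminated), then unlinks the file. Returns 0 or an errno value.
 */
int
load_and_drop_index(const char *path, char *buf, size_t size)
{
	FILE *file;
	size_t n;
	int error = 0;

	assert(path != NULL && buf != NULL && size > 0);

	file = fopen(path, "rb");
	if (file == NULL)
		return errno; /* ENOENT: no index, so the cache must be reset */

	n = fread(buf, 1, size - 1, file);
	if (ferror(file))
		error = EIO;
	buf[n] = '\0';
	fclose(file);

	/*
	 * From here on, a crash leaves no index behind, so the next run
	 * starts from an empty cache instead of a mismatched one.
	 */
	if (unlink(path) != 0 && error == 0)
		error = errno;
	return error;
}
```

The index only reappears when the validation cycle finishes and writes it again, so its absence at startup doubles as a "previous run did not end cleanly" flag.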
- Rename extension.h to ext.h; the former collides with Extension.h.
- Move _DEFAULT_SOURCE to the source; it's not widespread enough for
Makefile.am.
- Add _DARWIN_C_SOURCE, needed by macOS for timegm() and mkdtemp().
- Add -flto to unit test AM_CFLAGS. This minimizes superfluous #includes
and mocks needed, and will hopefully make them consistent across
platforms.
- Delete _BSD_SOURCE; it seems orphaned. (Though see below.)
Works on Linux and Mac. Might have broken the BSDs; I can't test them
ATM.
- Separate node->mtim into attempt_ts and success_ts.
Because they're really two different timestamps: the former is meant
for node expiration, the latter for HTTP IMS.
- Move removal of orphaned fallbacks to remove_abandoned().
Because orphaned refreshes need the same logic.
- Add the (randomly missing) expiration threshold for orphans.
It's still missing the implementation of remove_orphaned_files(),
but I'm still weighing options, as it seems it's going to be an
expensive operation that's rarely going to do anything.
Both used to be indexed by caRepository, which could cause collisions.
RRDP fallbacks are now indexed by rpkiNotify+caRepository, ensuring
they're caged separately.
The fork()s (needed to spawn rsyncs) duplicate Fort's process.
Which is messy in a multithreaded program. Quoting the Linux man page:
> * The child process is created with a single thread—the one that
> called fork(). The entire virtual address space of the parent is
> replicated in the child, including the states of mutexes, condition
> variables, and other pthreads objects. (...)
> * After a fork() in a multithreaded program, the child can safely call
> only async-signal-safe functions (...) until such time as it calls
> execve(2).
As far as I can tell, since the forked child was, in fact, careful to
only invoke async-signal-safe functions, this wasn't really a bug.
Still, it wasn't quality architecture either.
Moving the rsync spawner to a dedicated subprocess should stop the forks
from threatening to clash with the multithreading completely.
Relies on the new core loop design, so this won't work properly until
that's implemented.
I feel like I need to relearn signals every time I have to interact with
them. Best get this done while the iron's hot.
1. The ROA file is first written as `<cache>/.roa`.
The RK file is first written as `<cache>/.rk`.
2. When the validation run is done, `.roa` is renamed to `--output.roa`,
and `.rk` becomes `--output.bgpsec`.
3. Most terminating signals unlink `.roa` and `.rk`.
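The three steps above can be sketched as follows. This is a simplified illustration, not the actual implementation: `write_output()`, `install_cleanup()` and `on_terminate()` are hypothetical names, the temporary file lives in the current directory rather than `<cache>`, and re-raising the signal after cleanup is my assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for `<cache>/.roa`. */
static const char *tmp_path = ".roa";

/* Only async-signal-safe calls are allowed here: unlink, signal, raise. */
static void
on_terminate(int signum)
{
	unlink(tmp_path);
	signal(signum, SIG_DFL); /* Restore the default action... */
	raise(signum);           /* ...and let the signal kill us normally. */
}

/* Step 3: unlink the temporary file when @signum arrives. */
int
install_cleanup(int signum)
{
	struct sigaction sa;

	sa.sa_handler = on_terminate;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	return sigaction(signum, &sa, NULL);
}

/* Steps 1 and 2: write to the temporary file, then rename it into place. */
int
write_output(const char *final_path, const char *content)
{
	FILE *file;

	assert(final_path != NULL && content != NULL);

	file = fopen(tmp_path, "w");
	if (file == NULL)
		return -1;
	if (fputs(content, file) == EOF) {
		fclose(file);
		unlink(tmp_path);
		return -1;
	}
	if (fclose(file) != 0 || rename(tmp_path, final_path) != 0) {
		unlink(tmp_path);
		return -1;
	}
	return 0;
}
```

Readers of the final path see either the previous complete file or the new complete file, never a half-written one.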
The sigaction() code was in logging because it was originally conceived
for the SIGSEGV stack trace printing hack. The SIGPIPE ignorer was also
incidentally moved there at some point, but it has never had anything
to do with logging.
And I'm going to catch more signals in the upcoming commits, so this
really needs to be formalized into its own module.
It seems I'm finally done making dramatic wide-reaching changes to the
codebase. There's still plenty to add and test, but I would like to
start pushing atomic commits from now on.
This is a squashed version of development branch "issue82". It includes
a few merges with main.
- `cache/rsync`, `cache/https` and `cache/rrdp` contain "refreshes"
(the exact latest files according to the servers). RRDP withdraws are
honored, and rsyncs run without --compare-dest.
- "Refresh" files marked as valid are backed up in `cache/fallback`
at the end of each validation cycle.
- Validation first tests fallback+refresh. (If a file exists in both,
refresh wins.) If that fails, it retries with fallback only.
- The index is not a tree; everything is caged in numbered directories
and indexed by exact URL, to prevent file overwriting by URL hacking.
There's also a `cache/tmp` directory, where Fort temporarily dumps
notifications, snapshots and deltas. This directory will be removed
once #127 is fixed.
The code was assuming the object was DER-encoded, and the relevant
integer was therefore in short form.
Because I postponed the DER enforcement in
deef7b7823f21914b17838f152a8bd510a348f54, the code should not make
reckless assumptions about the signedAttrs encoding.
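For reference, here is a sketch of a length decoder that accepts both the short and the long form. It assumes the integer in question is a BER length octet sequence, which the message above doesn't state outright; `ber_read_length()` is a hypothetical name, not a function from the codebase or from any ASN.1 library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decodes a BER definite length at @buf (@size octets available). Stores
 * the value in @len and returns the number of octets consumed, or 0 if the
 * input is truncated, indefinite, or too large for a size_t.
 */
size_t
ber_read_length(const uint8_t *buf, size_t size, size_t *len)
{
	size_t i, n, value = 0;

	assert(buf != NULL && len != NULL);
	if (size == 0)
		return 0;

	if ((buf[0] & 0x80) == 0) { /* Short form: 0..127 in one octet */
		*len = buf[0];
		return 1;
	}

	n = buf[0] & 0x7F; /* Long form: n length octets follow */
	if (n == 0)        /* 0x80 is the indefinite form; not handled */
		return 0;
	if (n > sizeof(size_t) || n > size - 1)
		return 0;

	for (i = 1; i <= n; i++)
		value = (value << 8) | buf[i];
	*len = value;
	return 1 + n;
}
```

DER would always encode a length of 5 as `05`, but plain BER also allows `81 05`; the decoder above returns 5 in both cases instead of assuming the single-octet layout.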