Stop rejecting RPPs if unrecognizable absent files are fileListed
RFC 9286:
> The RP MUST acquire all of the files enumerated in the manifest
> (fileList) from the publication point. If there are files listed in
> the manifest that cannot be retrieved from the publication point,
> the RP MUST treat this as a failed fetch.
This was clashing with Fort's default rsync filters because they were
preventing unknown extensions from being downloaded:
Which will be a problem whenever the IETF defines new legal repository
extensions, such as .asa.
Therefore, ignore unknown manifest fileList extensions. This technically
violates RFC 9286, but it's necessary evil given that we can't trust
repositories to always only serve proper RPKI content.
The code was assuming the object was DER-encoded, and the relevant
integer was therefore in short form.
Because I postponed the DER enforcement in deef7b7823f21914b17838f152a8bd510a348f54, the code should not make
reckless assumptions about the signedAttrs encoding.
Job Snijders [Tue, 25 Jun 2024 05:21:39 +0000 (05:21 +0000)]
Generate all permutations of the list with equal probability
@botovq was kind enough to point out that although my earlier
implementation produced random-ish ordering, it strictly speaking
wasn't Fisher-Yates.
We need to ensure `j` is a random number between `i` and `list.count`
see the second example in the 'Modern Algorithm'
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
Job Snijders [Thu, 13 Jun 2024 18:21:36 +0000 (18:21 +0000)]
Shuffle the order in which Manifest entries are processed
Previously work items were enqueued in the order the CA intended them
to appear on a Manifest. However, there is no obvious benefit to letting
third parties decide the order in which objects are processed.
Instead, randomize the list of FileAndHashes, its ordering has no meaning
anyway. As they say, a fox is not taken twice in the same snare
Job Snijders [Fri, 7 Jun 2024 17:09:44 +0000 (17:09 +0000)]
Verify the signature on a self-signed TA cert against it's own pubkey
X509_verify_cert() doesn't check the purported root certificate itself
unless X509_V_FLAG_CHECK_SS_SIGNATURE is set.
The pubkey was compared against the TAL, so check that the signature is
right as required by RFC 6487, section 7, additional condition 1,
applied to self-issued certs.
The error check looks weird, but OpenSSL 3 broke yet another API.
X509V3_EXT_print() was being summoned to print extensions unrelated to
RPKI. The TODO wanted me to pick a suitable flag for extensions unknown
even to libcrypto.
For reference, this is how X509V3_EXT_print() prints an AIA, as a known
extension:
CA Issuers - URI:rsync://rpki.ripe.net/repository/aca/KpSo3VVK5wEHIJnHC2QHVV3d5mk.cer
This is how X509V3_EXT_print() prints the same AIA, as an unknown
extension, X509V3_EXT_PARSE_UNKNOWN enabled:
These two quirks made the validation mostly a no-op.
There's also the issue that this implementation seems inefficient,
especially since Fort doesn't need to DER-encode anywhere else. By
checking the encoding while parsing, I would save a lot of memory
in addition to being able to delete that mess of encoding functions.
But I'm going to have to push that to the future. This is growing more
ambitious than I can afford during a release review, and given that the
code wasn't really doing anything productive in the first place, I'm not
losing much by simply axing it for now.
- Employ libssl's OID parsing rather than implement it from scratch.
- Rename `struct signed_object_args` to `struct ee_cert`, since it's
just a bunch of EE certificate data.
- Remove `struct signed_data`, because it wasn't actually contributing
anything.
rsync cannot download into standard output... which means rsync'd files
cannot be elegantly piped as standard output to --mode=print. So either
the rsync has to be done manually by the user... or --mode=print has to
do it internally by itself.
And looking at the code that resulted... I now wish I had gone with the
former option. Because of the long overdue cache refactors, the user
needs to include --tal for this rsync to be compatible with the cache.
This sucks.
As a workaround, Fort will rsync into /tmp if --tal and/or --local-cache
aren't supplied:
$ fort --mode=print \
--validation-log.enabled \
--validation-log.level debug \
rsync://a.b.c/d/CRL.crl
...
May 10 13:32:44 DBG [Validation]: Executing rsync:
May 10 13:32:44 DBG [Validation]: rsync
May 10 13:32:44 DBG [Validation]: ...
May 10 13:32:44 DBG [Validation]: rsync://a.b.c/d/CRL.crl
May 10 13:32:44 DBG [Validation]: /tmp/fort-Q7tMhz/CRL.crl
...
{
"tbsCertList": {
"version": 1,
...
Fort used to clear the --output.roa and --output.bgpsec files to make
sure they were writable, during early validations.
So this is why the files spent so much time being empty! This was not
acceptable. It didn't even guarantee the files would still remain
writable by the time Fort needed to properly populate them.
This function was always including the binary flag ("b") during
fopen(2), which seems to be inappropriate for the --output.roa and
--output.bgpsec files.
Well, the Unixes don't do anything with this flag, so this is more of a
semantic fine-tune than a bugfix.
The code was writing dates in Zulu format, which was fine. But then, to
read them back, it was loading them with mktime(), which is a local
timezone function. The effects of this bug depend on the time zone.
Files would expire from the cache up to 12 hours too early (UTC+) or
late (UTC-), or be updated up to 12 hours late (UTC-). (There's no such
thing as "updating too early" in this context, since Fort cannot refresh
before the present.)
I fixed this by swapping mktime() for timegm(), which is not great,
because the latter is still a nonstandard function. But it'll have to
do, because the other options are worse.
Below is my thought process on timegm() and the other options:
1. I want to store timestamps as human-readable strings.
2. Converting a timestamp to Zulu string is trivial: time_t ->
gmtime_r() -> struct tm (GMT) -> strftime() -> char*.
3. But converting a string to timestamp relies on timegm(), which is
nonstandard: char* -> strptime() -> struct tm (GMT) -> timegm() ->
time_t.
Brainstorm:
1. Store the dates in local time.
Hideous option, but not ENTIRELY insane.
Storing in local time will render the dates up to 24 hours off, I think.
But this only happens when the cache changes timezone, which isn't
really often.
But it's still pretty clumsy, and also not future-proof, as new date
fields would have to be constrained by the same limitation.
2/10 at best.
2. Ditch time_t, use struct tm in UTC for everything.
So I would rely on gmtime_r() only to get out of time_t, and never need
timegm() for anything.
Comparing field by field would be a pain, but it's interesting to note
that Fort is actually already doing it somewhere: tm_cmp(). I guess I
have to admire the audaciousness of past me.
What mainly scares me is that mktime() seems to be the only standard
function that is contractually obligated to normalize the tm, which
means I would have to keep mktime()ing every time I need to compare
them.
And mktime() is... a local time function. Probably wouldn't actually
work.
4/10. I hate this API.
3. Bite the bullet: Go modern POSIX; assume time_t is integer seconds
since the 1970 epoch.
I'm fantasizing about an early environment assertion that checks the
gmtime_r() for a known time_t, that shows people the door if the
implementation isn't sane. Would work even in POSIX-adjacent systems
like most Linuxes.
This would have other benefits, like the ability to ditch difftime(),
and perform arithmetic operations on time_t's.
All the officially supported platforms get this aspect of POSIX right,
so what's stopping me?
Well, it would mean I would have to store the date as a seconds-since-
epoch integer, which is not as human-friendly and defeats the whole
point of the JSON format.
And yet... this feels so right, I might end up doing it even regardless
of the JSON data type and timegm().
But not today. 7/10.
4. Bite the bullet: Use timegm().
Interesting. timegm() might be added to C23:
> Changes integrated into the latest working draft of C23 are listed
> below.
> (...)
> - Add timegm() function in <time.h> to convert time structure into
> calendar time value - similar to function in glibc and musl libraries.