As discovered in 566835e8da0ce52b6bded39db72667eeb2e41001, this
validation was implemented incorrectly. Fort should locate the parent
certificate in the local cache by URI, not force-redownload by rsync.
The URI indexing will be implemented as part of #78. I'll reimplement
the validation properly then.
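For reference, this is roughly the kind of lookup I have in mind: map the
AIA's rsync URI onto the local cache and read the parent from disk. It's
only a sketch; the cache root and layout here are made up, not Fort's
actual ones.

	#include <stdio.h>
	#include <string.h>

	/*
	 * Sketch: resolve an rsync URI to its local cache path, so the
	 * parent certificate can be located without redownloading it.
	 * Assumes the cache mirrors the URI's host/path structure.
	 */
	static int
	uri_to_cache_path(char const *uri, char *path, size_t size)
	{
		static char const PFX[] = "rsync://";

		if (strncmp(uri, PFX, strlen(PFX)) != 0)
			return -1; /* Not an rsync URI */

		/* rsync://a.b/c/d.cer -> /var/cache/fort/rsync/a.b/c/d.cer */
		if (snprintf(path, size, "/var/cache/fort/rsync/%s",
		    uri + strlen(PFX)) >= (int)size)
			return -1; /* Truncated */

		return 0;
	}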
It's a collection of failed downloads, and it serves two purposes:
1. To list the failed downloads once the validation is over.
(Which is the functionality I deleted in the previous commit.)
2. To clue the AIA validation code (through an EREQFAILED) not to...
uh... redownload... the AIA certificate's... parent.
Huh.
Yeah, I have a few issues with the implementation:
A. Purpose 1 is obsolete.
B. Regarding purpose 2: Fort should never redownload a file that was
already traversed in the same validation cycle. This purpose is
plainly wrong.
Oh, you know what? I get it.
The original coder was probably concerned that the parent might have
been downloaded via RRDP, yet the child's AIA is always an rsync URI...
and because RRDP and rsync are cached into separate namespaces, well,
Fort wasn't going to find the parent.
But the thing is, it's a URI, not a URL. RRDP also refers to these files
by way of their "rsync" *URI* in its snapshots and deltas. RRDP might
be HTTP, but there is no such thing as http://path.to/certificate.cer.
This should be fixed by way of clever local cache resolution, not an
awkward redownload.
C. The scope of this table should be a single validation run, not
Fort's entire lifetime.
I think the reason why it's global is so the code could warn if a
particular resource had been down for several iterations.
But I wouldn't say that's Fort's job, and even if it is, it's probably
best to move it to Prometheus somehow. We're relying too much on the
logs.
The "summary" was a list of failed downloads, and from what I gather,
it was intended to alert the admin to a possible disconnection problem.
Thing is, the implementation was rather bewilderingly large and
obtrusive, and now that we're going to have a Prometheus endpoint, it's
most likely replaceable by an elegant counter.
And I haven't implemented Prometheus yet, but I would definitely love to
stop running into this invasive code all over the place, thanks.
Trying to recover is incorrect because we don't want to advertise an
incomplete or outdated VRP table to the routers. We don't want to rely
on the OOM-killer; we NEED to die on memory allocation failures ASAP.
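The pattern is the usual xmalloc-style wrapper; a minimal sketch (the
function name is illustrative, not necessarily what the code ends up
using):

	#include <stdio.h>
	#include <stdlib.h>

	/*
	 * Sketch: allocate or die. Better to abort right here than to
	 * keep building (and advertising) a broken VRP table.
	 */
	static void *
	xmalloc(size_t size)
	{
		void *result;

		result = malloc(size);
		if (result == NULL) {
			fprintf(stderr, "Out of memory (%zu bytes).\n", size);
			exit(EXIT_FAILURE);
		}

		return result;
	}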
Though this doesn't have much to do with the RRDP refactor, I'm doing it
early to get plenty of time for testing and review.
Partially F1xes #40. (Still need to update some dependency usages.)
- Fort SHOULD die as soon as it realizes the VRP table is corrupted, as
we should not send garbage to the routers.
- Also, I'm not entirely sure the code would not crash later anyway,
since the table is, in fact, corrupted.
- Plus, if it doesn't crash, there would be no core dump to further
analyze the bug.
2. Point bug output to the currently active bug report
I was hesitant to put this on main because it seemed like it would tank
performance, but truth be told, I'm being a wuss. These table iterations
are nothing compared to the amount of time Fort has to spend downloading.
And it looks like I'm never going to find this bug with the stack trace
alone.
Does not fix #83 nor #89, but prevents the crash at least.
Job Snijders [Sat, 21 Jan 2023 11:43:02 +0000 (11:43 +0000)]
Ensure X509 extensions are hashed & cached, before deciding a cert is CA or EE
If X509_check_ca() fails to cache X509v3 extension values, the return
value may be incorrect, leading to erroneously assuming a given certificate
is a CA or EE cert (while in reality it is the other, or neither).
This failure mode can arise because X509_check_ca() doesn't verify
whether libcrypto's (void)x509v3_cache_extensions(x) flipped the EXFLAG_INVALID
flag in x->ex_flags. Unfortunately, X509_check_ca() doesn't have a return code
to indicate an error, so this can't be fixed in libcrypto - the API is broken.
The workaround is to call X509_check_purpose(3) with a purpose argument of -1
before calling X509_check_ca(); this ensures the X509v3 extensions are cached.
Since X509_check_purpose() does have a return code to indicate errors, we can
use that to supplement X509_check_ca()'s shortcomings.
OpenBSD's rpki-client also uses the above approach.
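In code, the sequence looks roughly like this (a sketch; the wrapper name
is illustrative):

	#include <openssl/x509v3.h>

	/*
	 * Returns 1 if the certificate is a CA, 0 if it isn't, -1 on error.
	 * X509_check_purpose(x, -1, 0) forces the X509v3 extensions to be
	 * parsed and cached, and unlike X509_check_ca() it reports failure,
	 * so we get to bail out before trusting X509_check_ca()'s answer.
	 */
	static int
	is_ca_cert(X509 *x)
	{
		if (X509_check_purpose(x, -1, 0) <= 0)
			return -1; /* Extension caching failed */
		return X509_check_ca(x) ? 1 : 0;
	}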
backtrace() is a glibc-only feature. Some systems, such as Alpine,
do not use glibc.
It seems one solution is to rely on community ports, but those aren't
the safest, and I imagine it'd be best to leave such a decision to the
user.
Instead, if backtrace() is not available, just delete stack traces from
the binary. It's going to be a pain to debug, but that's the world we
live in, I guess.
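The gist is plain conditional compilation (a sketch; HAVE_BACKTRACE is
whatever macro the configure check ends up defining):

	#ifdef HAVE_BACKTRACE
	#include <execinfo.h>
	#endif
	#include <unistd.h>

	/* Sketch: print a stack trace only if the platform has backtrace(). */
	static void
	print_stack_trace(void)
	{
	#ifdef HAVE_BACKTRACE
		void *frames[64];
		int count;

		count = backtrace(frames, 64);
		backtrace_symbols_fd(frames, count, STDERR_FILENO);
	#else
		/* No backtrace() (eg. musl); the trace is simply omitted. */
	#endif
	}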
Turns libexec into an optional dependency. Fixes #87.
Also, the commit contains a review and update of the documentation's
Alpine dependency list. There was a lot of fat in there.
Certificate: Remove subject name uniqueness validation
RFC 6487:
> An issuer SHOULD use a different subject name if the subject's key
> pair has changed (i.e., when the CA issues a certificate as part of
> re-keying the subject.)
Fort's implementation was problematic. The code was comparing the
certificate's subject name and public key to siblings that had
potentially not been validated yet. It seems to me this would make it
possible for attackers to crash FORT (by posting invalid objects) or to
invalidate legitimate objects (by publishing siblings that contained
conflicting subject names and public keys, without having to worry about
the rest of the fields).
This would be somewhat difficult to fix. I asked on the mailing list and
Discord ("RPKI Community"), and it seems the consensus is "don't
validate it." Subject Names don't really matter that much, because
RPKI's primary concern is resource ownership, not identity. Furthermore,
I'm not convinced that chopping branches off the tree because of a
clumsy key rollover is a good idea.
On the other hand, it looks like the notfatal hash table API was being
used incorrectly. HASH_ADD_KEYPTR can OOM, but `errno` wasn't being
checked.
Fixing this is nontrivial, however, because strange `reqs_error`
functions are in the way, and that's a spaghetti I decided to avoid.
Instead, I converted HASH_ADD_KEYPTR usage to the fatal hash table API.
That's the future according to #40, anyway.
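For the record, checking for OOM with uthash's nonfatal mode looks
roughly like this. It's a sketch of uthash's documented HASH_NONFATAL_OOM
hook, not Fort's actual wrapper:

	#include <errno.h>
	#include <string.h>

	/* Report allocation failures through errno instead of exiting. */
	#define HASH_NONFATAL_OOM 1
	#define uthash_nonfatal_oom(obj) do { errno = ENOMEM; } while (0)
	#include "uthash.h"

	struct node {
		char const *key;
		UT_hash_handle hh;
	};

	/* Returns 0 on success, ENOMEM if uthash couldn't allocate. */
	static int
	add_node(struct node **table, struct node *new)
	{
		errno = 0;
		HASH_ADD_KEYPTR(hh, *table, new->key, strlen(new->key), new);
		return errno; /* This is the check the old code was missing. */
	}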
I don't think this has anything to do with #83, though.
When the file doesn't need to be downloaded because the IMS header did
its job, there's nothing to delete, so Fort shouldn't be complaining.
Like the previous commit, this is not a great solution, because IMS is
not the only trigger of file deletes, and the error message might be
helpful in other cases. Then again, I don't really agree with this eager
repository cleaning technique; it complicates debugging.
The proper solution is a WIP in the rrdp-refactor branch.
Lots of error messages were referring to the wrong file, and several of
them printed the correct file manually, in what I can only describe as a
quick workaround.
It's not perfect; not all the RRDP code has been patched. That'll have
to wait until the deep RRDP refactor.
HTTP: Warn if a download exceeds 50% of the file size limit
Requested by Ties de Kock:
> Since some RPs now includes an upper limit on object size (some use
> 2MB if I recall correctly) I would appreciate a warning if an object
> goes over "a large fraction" of this limit (and a sample of the
> warning in the changelog and metrics if possible) - so people know
> what they need to alert on. In this situation operators can monitor
> for "natural growth" of an object and intervene, while the case that
> this check prevents (maliciously large objects) is still covered.
>
> The largest object I could find in the wild is 1.2MB (APNIC AS0 ROA).
> The RIPE NCC's largest object is smaller at the moment (but the CRL
> grows quickly if we do member CA keyrolls - since it adds all object
> on it).
>
> In summary:
>
> - I would recommend a warning (and preferably a metric) when an object
> of 50% of the object size limit is encountered.
> - I would like it if the hard limit is "safe" - especially CRLs can
> grow in some cases.
The metric will be added later, as part of #50. The warning looks like
this:

	File size exceeds 50% of the configured limit (10/20 bytes).

50% is hardcoded at the moment.
Notice that this is an HTTP-only patch. rsync does not warn.
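The check itself is tiny; roughly this, around libcurl's write callback
(a sketch with made-up context; only the 50% comparison is the point):

	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <curl/curl.h>

	struct download_ctx {
		curl_off_t total;	/* Bytes received so far */
		curl_off_t limit;	/* Configured file size limit */
		bool warned;		/* Warn only once per download */
	};

	/* libcurl write callback: count bytes, warn at 50% of the limit. */
	static size_t
	write_cb(char *data, size_t size, size_t nmemb, void *arg)
	{
		struct download_ctx *ctx = arg;
		size_t delivered = size * nmemb;

		(void)data; /* The actual file write is omitted here */

		ctx->total += delivered;
		if (!ctx->warned && ctx->total > ctx->limit / 2) {
			fprintf(stderr, "File size exceeds 50%% of the "
			    "configured limit (%jd/%jd bytes).\n",
			    (intmax_t)ctx->total, (intmax_t)ctx->limit);
			ctx->warned = true;
		}

		return delivered;
	}

The callback gets plugged in through CURLOPT_WRITEFUNCTION and
CURLOPT_WRITEDATA, as usual.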
Helps the code review. Some structs and functions (such as
struct delta_router_key and router_key_print()) were bleeding into
mostly unrelated modules, and there were a couple of data types (struct
v4_address and struct v6_address) that were only used once, and induced
needless copying.
Philip Paeps [Wed, 27 Oct 2021 07:41:28 +0000 (15:41 +0800)]
Documentation: update FreeBSD build instructions
While binary packages are available, some people like to build from
source. Update the instructions for building from a port, a release
tarball or a Git checkout.
- Increase default remote object file size limit, because RRDP snapshot
files can be very large. (Current: ~148 MB, double on key rollover)
- Treat HTTP redirects as errors.
- Before: Redirects were treated as successes, but Fort didn't
bother to follow the redirect. As a result, the code was either
not finding the file, or finding an empty file.
- Now: Redirects are treated as errors. (Not sure if I'm meant to do
something here; curl doesn't follow them automatically, and the RFCs
are silent. In particular, I'm not in the mood to have to deal with
redirect loops and whatnot.) A sketch of the check follows this list.
- Remove ROA database clone operation in the SLURM code.
According to the code, it was working on a clone "so that updates can
be reverted" (on error, I presume.) But it was never reverting them.
- Refactor SLURM cleanup code.
I don't remember this one very well. Starting from the clone removal,
I got distracted with some inconsistent cleanups. Patched a buggy
cleanup somewhere.
There's still more I want to do to the SLURM code; it looks somewhat
slow.
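Regarding the redirect handling above, this is roughly what the check
looks like (a sketch; libcurl is left at its default of not following
redirects, and everything else is simplified):

	#include <curl/curl.h>

	/* Sketch: perform the request and refuse to follow redirects. */
	static int
	http_fetch(CURL *curl)
	{
		long code;

		if (curl_easy_perform(curl) != CURLE_OK)
			return -1;
		if (curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &code)
		    != CURLE_OK)
			return -1;

		switch (code) {
		case 301: case 302: case 303: case 307: case 308:
			return -1; /* Redirect: treat as an error */
		}

		return 0; /* Other status handling omitted */
	}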
It seems the #58 and #59 problem is a stray defer separator pop.
The comment above x509stack_cancel() clearly states that the function
should only be called shortly after an x509stack_push(), but there's one
in certificate_traverse() that isn't.
Removing this x509stack_cancel() seems to prevent the crash. I'm still
investigating the original intent of this code.
HTTP: Patch CURLOPT_LOW_SPEED_LIMIT and CURLOPT_LOW_SPEED_TIME
Likely due to some misunderstanding, Fort was managing both of these
variables using one command-line argument (--http.idle-timeout). This
unnecessarily limited the configurability of minimum transfer speeds for
HTTP connections.
--http.idle-timeout is now deprecated, and has been replaced by
--http.low-speed-limit and --http.low-speed-time, which correlate
verbatim to the corresponding curl arguments (CURLOPT_LOW_SPEED_LIMIT
and CURLOPT_LOW_SPEED_TIME).
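In curl terms, the two new arguments map directly to the two options (a
sketch; the wrapper is illustrative):

	#include <curl/curl.h>

	/*
	 * Sketch: abort the transfer if it stays below low_speed_limit
	 * bytes/second for low_speed_time consecutive seconds.
	 */
	static void
	set_low_speed_opts(CURL *curl, long low_speed_limit, long low_speed_time)
	{
		/* --http.low-speed-limit: bytes per second */
		curl_easy_setopt(curl, CURLOPT_LOW_SPEED_LIMIT, low_speed_limit);
		/* --http.low-speed-time: seconds */
		curl_easy_setopt(curl, CURLOPT_LOW_SPEED_TIME, low_speed_time);
	}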
It was allocating the deltas array twice, for seemingly no reason.
Also, the array slots were pointers, and the two arrays pointed to
different instances of the same objects. For seemingly no reason.
Now there's only one array, and it stores the objects directly.
1. Update the TAL URLs. (The old ones were very obsolete.)
2. Add --init-as0-tals. (Used to download the ASN0 TALs.)
3. Deprecate and no-op --init-locations. (Didn't make sense.
If the user needs a different URL, they can do wget instead.)
4. Deprecate setup_fort.sh. (Seems to be redundant. --init-tals
already takes care of downloading TALs.)
In truth, this data structure should technically be a linked list.
But I'm not sure if sacrificing cache locality for faster removal is
worth the tradeoff.
RTR Server: thread-pool.server.max now refers to RTR requests
Apparently, there was a huge misunderstanding when the thread pool was
implemented.
The intended model was
> When the RTR server receives a request, it borrows a thread from the
> thread pool, and tasks it with the request.
Which is logical and a typical thread pool use case. However, what was
actually implemented was
> When the RTR server opens a connection, it borrows a thread from the
> thread pool, and tasks it with the whole connection.
So `thread-pool.server.max` was a hard limit for simultaneous RTR
clients (routers), but now it's just a limit to simultaneous RTR
requests. (Surplus requests will queue.) This is much less taxing to the
CPU when there are hundreds of clients.
Thanks to Mark Tinka for basically spelling this out to me.
-----------------------
Actually, this commit is an almost entire rewrite of the RTR server
core. Here's a (possibly incomplete) list of other problems I had to fix
in the process:
== Problem 1 ==
sockaddr2str() was returning a pointer to invalid memory on success.
== Problem 2 ==
The old conditional was "keep track of the clients, expire deltas when
all clients outgrow them." I see two problems with that:
1. It'll lead to bad performance if a client misbehaves by not
maintaining the connection. (ie. the server will have to fall back to
too many cache resets.)
2. It might keep the deltas forever if a client bugs out without killing
the connection.
The new conditional is "keep deltas for server.deltas.lifetime iterations."
"server.deltas.lifetime" is a new configuration argument.
== Problem 3 ==
Serials weren't being compared according to RFC 1982 serial arithmetic.
This was going to cause mayhem when the integer wrapped.
(Though Fort always starts at 1, and serials are 32-bit unsigned
integers, so this wasn't going to be a problem for a very long time.)
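For reference, the RFC 1982 "less than" for 32-bit serials is just the
RFC's definition:

	#include <stdbool.h>
	#include <stdint.h>

	/*
	 * RFC 1982 serial arithmetic, SERIAL_BITS = 32: s1 precedes s2.
	 * (The comparison is undefined when the two serials are exactly
	 * 2^31 apart.)
	 */
	static bool
	serial_lt(uint32_t s1, uint32_t s2)
	{
		return (s1 < s2 && (s2 - s1) < 0x80000000u)
		    || (s1 > s2 && (s1 - s2) > 0x80000000u);
	}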
== Problem 4 ==
The thread pool had an awkward termination bug. When threads were
suspended, they were meant to be ended through a pthread signal, but
when they were running, they were supposed to be terminated through
pthread_cancel(). (Because, since each client was assigned a thread,
they would spend most of their time sleeping.) These termination methods
don't play well with each other.
Apparently, threads waiting on a signal cannot be canceled, because of
this strange quirk from man 3 pthread_cond_wait:
> a side effect of acting upon a cancellation request while in a
> condition wait is that the mutex is (in effect) re-acquired before
> calling the first cancellation cleanup handler.
(So the first thread dies with the mutex locked, and no other threads
can be canceled because no one can ever lock the mutex again.)
And of course, you can't stop a server thread through a signal, because
they aren't listening to it; they're sleeping in wait for a request.
I still don't really know how I would fix this, but luckily, the problem
no longer exists since working threads are mapped to single requests,
and therefore no longer sleep. (For long periods of time, anyway.)
So always using the signal works fine.
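(For the record, the textbook way to make a condition wait
cancellation-safe is a cleanup handler that releases the re-acquired
mutex. A generic sketch, not something Fort needs anymore:)

	#include <pthread.h>

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

	static void
	cleanup_unlock(void *mutex)
	{
		pthread_mutex_unlock(mutex);
	}

	/*
	 * If the thread is canceled during pthread_cond_wait(), the mutex
	 * is re-acquired before the cleanup handlers run, so the handler
	 * has to release it; otherwise every other waiter stays stuck.
	 */
	static void
	wait_for_work(void)
	{
		pthread_mutex_lock(&lock);
		pthread_cleanup_push(cleanup_unlock, &lock);
		pthread_cond_wait(&cond, &lock);	/* Cancellation point */
		pthread_cleanup_pop(1);			/* Runs the handler: unlocks */
	}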
It was pretty messy; I had to rewrite a good chunk of it.
== Problem 1 ==
It was discarding meaningful validation results when miscellaneous
errors prevented the deltas array from being built.
Deltas are optional; as long as Fort has the snapshot of the latest
tree, it doesn't technically need deltas. They speed up synchronization,
but in the worst case scenario, the RTR server can keep pushing Cache
Resets.
Severity: Warning. Memory allocation failures are the only eventuality
that might prevent the deltas array from being built.
== Problem 2 ==
The database was always keeping one serial's worth of obsolete deltas.
Cleaned up, saves a potentially large amount of memory.
Severity: Fine. Not a memory leak.
== Problem 3 ==
The code computed deltas even when there were no routers listening.
Routers are the only delta consumers, so there was no need to waste all
that time.
Severity: Fine; performance quirk.
== Problem 4 ==
I found an RTR client implementation (Cloudflare's rpki-rtr-client) that
hangs when the first serial is zero. Fort's first serial is now 1.
Severity: Warning. This is rpki-rtr-client's fault, but other client
implementations may be prone to the same bug. The new solution is more
future-proof.
== Problem 5 ==
It seems it wasn't cleaning the deltas array when all routers were known
to have bogus serials. This was the code:
/* Its the first element or reached end, nothing to purge */
if (group == state.deltas.array ||
(group - state.deltas.array) == state.deltas.len)
return 0;
If you reached the end of the deltas array, and the minimum router
serial is larger than all the array serials, then all deltas are
useless; you're supposed to purge all of them.
Severity: Fine. It was pretty hard to trigger, and not a memory leak.
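The corrected decision looks roughly like this (same names as the
snippet above; purge_all() and purge_up_to() are illustrative, not the
actual functions):

	/*
	 * Sketch: reaching the start means nothing is obsolete; reaching
	 * the end means every delta is obsolete and the whole array goes.
	 */
	if (group == state.deltas.array)
		return 0; /* Nothing to purge */
	if ((size_t)(group - state.deltas.array) == state.deltas.len)
		return purge_all(&state); /* All deltas are useless */
	return purge_up_to(&state, group); /* Purge only the older ones */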