Peter Müller [Sun, 14 Aug 2022 16:02:55 +0000 (16:02 +0000)]
location-importer.in: Fix dangling variable
This fixes:
Traceback (most recent call last):
  File "/usr/bin/location-importer", line 1607, in <module>
    main()
  File "/usr/bin/location-importer", line 1605, in main
    c.run()
  File "/usr/bin/location-importer", line 140, in run
    ret = args.func(args)
  File "/usr/bin/location-importer", line 1234, in handle_update_overrides
    self._update_overrides_for_spamhaus_drop()
  File "/usr/bin/location-importer", line 1504, in _update_overrides_for_spamhaus_drop
    for sline in t.readlines():
NameError: name 't' is not defined
Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Michael Tremer [Fri, 12 Aug 2022 15:47:29 +0000 (15:47 +0000)]
importer: Tolerate that data might exist from other RIRs
Since we are breaking the import into smaller chunks now, it might be
that some data already exists in the database. This is now being
ignored and data won't be replaced.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
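A minimal sketch of the "keep what is already there" behaviour described above. It uses an in-memory SQLite database and a hypothetical autnums table purely for illustration; the importer itself may use a different database and schema (in PostgreSQL the equivalent clause would be ON CONFLICT DO NOTHING).

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical table; the primary key is what makes a second insert conflict
db.execute("CREATE TABLE autnums(number INTEGER PRIMARY KEY, name TEXT)")

def import_autnum(db, number, name):
    """Insert a row, silently skipping it if the key already exists."""
    db.execute(
        "INSERT OR IGNORE INTO autnums(number, name) VALUES (?, ?)",
        (number, name),
    )

# The same ASN arrives from two different import chunks
import_autnum(db, 64496, "imported with the first chunk")
import_autnum(db, 64496, "imported with a later chunk")

rows = db.execute("SELECT number, name FROM autnums").fetchall()
```

Only the first row survives, so a later chunk neither fails on the conflict nor replaces existing data.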
Michael Tremer [Fri, 12 Aug 2022 13:56:17 +0000 (13:56 +0000)]
importer: Move importing extended sources/ARIN into transaction
All imports should have been conducted in one large transaction so that
we can remove any previous data.
This was not the case because of an indentation issue, which could have
caused the transaction to be committed without all data being
successfully re-imported.
Fixes: #12852
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Michael Tremer [Fri, 12 Aug 2022 13:51:20 +0000 (13:51 +0000)]
importer: Change download behaviour
The downloader used to open a connection to the web server hosting our
content which would have been decompressed (if necessary) on the fly and
also been parsed on the fly so that it could have been fed into the
database easily.
Some webservers do not seem to be patient enough to keep the connection
open if things take a little bit longer than usual. That caused the
import to fail.
This patch changes the behaviour so that we download all content
first, store it locally, and only then start processing it.
Fixes: #12852
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Cc: Peter Müller <peter.mueller@ipfire.org>
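The download-first approach described above could look roughly like this. This is a sketch, not the importer's actual code: spool_to_tempfile and process are hypothetical names, and any file-like object stands in for the HTTP response.

```python
import shutil
import tempfile

def spool_to_tempfile(response, chunk_size=64 * 1024):
    """Copy a file-like response completely into a temporary file.

    The remote connection is only needed while this function runs;
    slow parsing afterwards works on the local copy.
    """
    tmp = tempfile.TemporaryFile()
    shutil.copyfileobj(response, tmp, chunk_size)
    tmp.seek(0)  # rewind so the caller can read from the start
    return tmp

def process(f):
    """Placeholder for the slow part: parse lines one at a time."""
    return [line.rstrip(b"\n") for line in f]
```

With urllib, the response returned by urllib.request.urlopen(url) could be passed to spool_to_tempfile() and closed right after, before any parsing starts.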
Peter Müller [Sun, 5 Jun 2022 10:04:50 +0000 (10:04 +0000)]
location-importer: Only delete override data if we are sure to have a valid replacement
The current way of truncating all override data straight away leaves us
with no data at all, should a source turn out to be unreachable or to
return bogus files (yes, Cloudflare, I _am_ looking at you).
It is therefore better to only delete data we know to have a valid
replacement for, rather than just dropping the source altogether.
Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
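The idea above, sketched under assumptions: the table layout, the replace_overrides helper and the "non-empty result" sanity check are hypothetical, and SQLite stands in for the real database. Old rows are only deleted once a usable replacement has been fetched, and delete and insert happen in one transaction.

```python
import sqlite3

def replace_overrides(db, source, fetch):
    """Replace override rows for `source` only if fetch() yields usable data.

    Returns True if the data was replaced, False if the old data was kept.
    """
    try:
        rows = [line.strip() for line in fetch() if line.strip()]
    except OSError:
        rows = []  # source unreachable: keep what we have

    # Hypothetical sanity check: an empty result counts as bogus
    if not rows:
        return False

    with db:  # one transaction: commit on success, roll back on error
        db.execute("DELETE FROM overrides WHERE source = ?", (source,))
        db.executemany(
            "INSERT INTO overrides(network, source) VALUES (?, ?)",
            [(row, source) for row in rows],
        )
    return True
```

A failing or empty fetch therefore leaves the previous override data for that source untouched.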
Michael Tremer [Thu, 14 Apr 2022 18:31:56 +0000 (18:31 +0000)]
Make sources around that we can run tests without location installed
In order to run the test suite, we need to make the Python module
loadable from the build directory so that we first of all test the right
code and that it just works without running "make install" first.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Michael Tremer [Mon, 11 Apr 2022 17:57:22 +0000 (17:57 +0000)]
export: Enable flattening for everything
When performing checks, it is useful to be able to rely on a flat
network plan so that any larger parent networks in some countries/ASes
won't match any subnets.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Michael Tremer [Wed, 30 Mar 2022 15:19:10 +0000 (15:19 +0000)]
network: loc_network_subnets: Use correct prefix
The prefix is now stored as a total number of bits since that makes any
bitwise maths later on easier. However, this causes an incorrect prefix
to be computed when splitting a network into two subnets for IPv4.
To get the correct prefix, loc_network_prefix must be called.
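To illustrate the kind of mix-up described above, here is a small Python model. It assumes that "total number of bits" means an IPv4 prefix is stored with an offset of 96 bits, as it would be for IPv4-mapped IPv6 addresses; that offset and all helper names are assumptions for illustration, not the library's C API.

```python
import ipaddress

IPV4_OFFSET = 96  # assumed: IPv4 embedded in a 128-bit address

def stored_prefix(net):
    """Prefix as a total number of bits (hypothetical internal form)."""
    return net.prefixlen + IPV4_OFFSET if net.version == 4 else net.prefixlen

def family_prefix(version, stored):
    """Convert the stored form back into the prefix of the address family."""
    return stored - IPV4_OFFSET if version == 4 else stored

def split(network):
    """Split a network into its two halves using the family prefix."""
    net = ipaddress.ip_network(network)
    prefix = family_prefix(net.version, stored_prefix(net))
    return [str(s) for s in net.subnets(new_prefix=prefix + 1)]
```

Using the stored value directly would yield 105 instead of 9 as the new prefix for 10.0.0.0/8, which is why the conversion step matters.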
Michael Tremer [Wed, 30 Mar 2022 14:58:36 +0000 (14:58 +0000)]
database: Allocate subnets list only once
This is a performance improvement when exporting networks flattened. For
the subnet search, we allocate an empty list many times which is often
not required.
This patch changes this behaviour so that the list is only allocated
when needed, is then kept around, and is cleared when necessary.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Michael Tremer [Wed, 9 Mar 2022 10:29:11 +0000 (10:29 +0000)]
importer: Improve performance of network export query
This patch moves the subqueries out of the large query, so that the
database will materialize them for faster lookup.
We also drop the "UNION ALL" and replace it with just "UNION" because we
do not want any duplicate networks. That will save us many iterations
later on.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
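The difference between the two set operators mentioned above can be seen in a tiny example. The tables are hypothetical and SQLite is used only because it ships with Python; the semantics of UNION and UNION ALL are the same in standard SQL.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE announcements(network TEXT);
    CREATE TABLE overrides(network TEXT);
    INSERT INTO announcements VALUES ('192.0.2.0/24'), ('198.51.100.0/24');
    INSERT INTO overrides VALUES ('192.0.2.0/24');
""")

# UNION ALL keeps every row, including the network present in both tables
with_duplicates = db.execute("""
    SELECT network FROM announcements
    UNION ALL
    SELECT network FROM overrides
""").fetchall()

# UNION removes duplicate rows from the combined result
deduplicated = db.execute("""
    SELECT network FROM announcements
    UNION
    SELECT network FROM overrides
""").fetchall()
```

Here with_duplicates has three rows and deduplicated has two, so anything iterating over the result has less work to do.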
Michael Tremer [Mon, 7 Mar 2022 11:12:17 +0000 (11:12 +0000)]
bogons: Refactor algorithms
Instead of comparing each network with the previous one, we now look
for gaps, starting from the first possible IP address and ending at the
last possible one.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
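A gap search of the kind described above might be sketched as follows. This is an illustration using Python's ipaddress module, not the actual implementation; find_gaps is a hypothetical name.

```python
import ipaddress

def find_gaps(networks, first, last):
    """Yield networks covering every address in [first, last]
    that is not covered by any of the given networks."""
    first = ipaddress.ip_address(first)
    last = ipaddress.ip_address(last)
    make = type(first)  # IPv4Address or IPv6Address

    cursor = int(first)  # first address not yet accounted for
    end = int(last)

    for net in sorted(ipaddress.ip_network(n) for n in networks):
        start = int(net.network_address)
        stop = int(net.broadcast_address)

        # Everything between the cursor and this network is a gap
        if start > cursor:
            gap_end = min(start - 1, end)
            yield from ipaddress.summarize_address_range(
                make(cursor), make(gap_end))

        # Skip past this network (max() handles overlapping input)
        cursor = max(cursor, stop + 1)
        if cursor > end:
            return

    # Trailing gap after the last known network
    yield from ipaddress.summarize_address_range(make(cursor), make(end))
```

Because the cursor only ever moves forward, overlapping or nested input networks do not produce bogus gaps.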
Michael Tremer [Sat, 5 Mar 2022 11:56:40 +0000 (11:56 +0000)]
importer: Parse aggregated networks
This patch adds code to parse any aggregated networks.
Bird does not automatically show the last ASN of the path, but we can
collect all networks that we can see without any ASN and perform
"show route <network> all" on them to gather this information.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Michael Tremer [Thu, 3 Mar 2022 08:48:14 +0000 (08:48 +0000)]
export: Fix filtering logic
It is possible to filter for what kind of network should be exported.
This worked well when the filter list only contained country codes, or
when it only contained ASNs. If there was a mix, only networks that
match both (i.e. virtually nothing) matched.
This patch fixes this so that a network is exported if it matches
either of them.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Reported-by: Michael Tremer <michael.tremer@ipfire.org>
Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
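The corrected filter semantics can be illustrated with a short sketch. The matches function and the dict-based network records are hypothetical, and treating an empty filter as "export everything" is an assumption.

```python
def matches(network, countries=None, asns=None):
    """Return True if the network should be exported.

    A network passes if its country is in `countries` OR its ASN is in
    `asns`. Combining both checks with AND is the bug described above:
    with a mixed filter list, virtually nothing would match.
    """
    # Assumption: no filter at all means everything is exported
    if not countries and not asns:
        return True

    return (
        network.get("country") in (countries or ())
        or network.get("asn") in (asns or ())
    )
```

For example, with countries={"DE"} and asns={64511}, a German network announced by AS64496 is now exported, as is a French network announced by AS64511.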
Michael Tremer [Wed, 2 Mar 2022 10:26:41 +0000 (10:26 +0000)]
export: Conditionally enable flattening
By default, we enabled flattening of the network tree when we export it.
However, this is only required for xt_geoip since the other formats can
deal with overlapping networks and would even benefit from a shorter
list.
Therefore this is now only enabled when needed which results in shorter
export times (9 seconds instead of 2.5 minutes) and the full ipset is
about 20% smaller when loaded into memory than before.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Michael Tremer [Wed, 2 Mar 2022 10:18:16 +0000 (10:18 +0000)]
ipset: Set maxelem to a fixed size
When we try to load a changed set which might have more entries, a
previous maxelem could have been smaller preventing us from adding new
entries.
We also cannot run the "create" command with a changed maxelem
parameter, which is why this patch sets the value to something that
should be large enough for everything.
The downside of this is that we cannot modify the hashsize when we
reload a set, which is probably okay, since sets should not change too
much in size and therefore will only run *slightly* less efficiently -
if at all.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Michael Tremer [Tue, 1 Mar 2022 12:44:21 +0000 (12:44 +0000)]
ipset: Optimise hash table size
ipset uses a hash table internally which can be sized to trade space
efficiency against performance.
Prior to this patch, we always set the size of the hash table to 1024
buckets. For large sets with almost half a million entries, this does
not perform well since we spend a lot of time searching the linked
lists.
This will probably perform even slower on systems with smaller cache
sizes like the IPFire Mini Appliance.
Having more buckets that are sparsely filled will result in fewer
memory fetches at the cost of more wastage. Throughout the whole IPv4
set, this ranges from about 50 MB for a factor of 4, to about 100 MB for
a factor of 0.75.
Since memory of this quantity is cheap and since we want to increase
throughput, I have chosen to set the fill factor to 0.75.
Logistically, it is a little bit complicated to know this in advance
when we have to write the header, so we will write the entire file
first, and then come back to write the header again. This is required to
keep memory consumption down during the export.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
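The two steps described above (deriving the hash size from the number of entries and a fill factor, then going back to rewrite the header) might be sketched like this. All names, the fixed header width and the maxelem default are hypothetical; rounding up to a power of two reflects that ipset expects hash sizes to be powers of two.

```python
import math

HEADER_WIDTH = 80  # hypothetical: header is padded to a fixed width

def hashsize_for(entries, fill_factor=0.75):
    """Buckets needed for `entries` at the given fill factor,
    rounded up to the next power of two."""
    wanted = max(1, math.ceil(entries / fill_factor))
    return 1 << (wanted - 1).bit_length()

def header(name, hashsize, maxelem):
    """Build a fixed-width "create" line so it can be overwritten in place."""
    line = (f"create {name} hash:net family inet "
            f"hashsize {hashsize} maxelem {maxelem}")
    return line.ljust(HEADER_WIDTH - 1) + "\n"

def export(f, name, networks, maxelem=1048576):
    """Stream networks to a seekable text file, then fix up the header."""
    f.write(header(name, 0, maxelem))  # placeholder, size not known yet

    count = 0
    for network in networks:  # nothing is buffered in memory
        f.write(f"add {name} {network}\n")
        count += 1

    f.seek(0)  # come back and write the real header over the placeholder
    f.write(header(name, hashsize_for(count), maxelem))
    return count
```

Padding the header to a fixed width is one way to make the in-place rewrite safe; whether the consuming tool accepts the trailing spaces is an assumption that would need checking.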