Phil Sutter [Thu, 20 May 2021 13:11:37 +0000 (15:11 +0200)]
expr_postprocess: Avoid an unintended fall through
When parsing a range expression, the switch case fell through to the
prefix expression case, thereby recursing once more for expr->left. This
seems not to have caused harm, but is certainly not intended.
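A minimal sketch of the pattern described above, not the actual expr_postprocess() code. The struct, the enum values and the postprocess() name are hypothetical stand-ins; the visits counter exists only to make the extra recursion observable.

```c
#include <stddef.h>

/* Hypothetical, simplified expression node; not the nftables struct expr. */
enum expr_type { EXPR_VALUE, EXPR_RANGE, EXPR_PREFIX };

struct expr {
	enum expr_type type;
	struct expr *left;
	struct expr *right;
	int visits;	/* how many times postprocess() reached this node */
};

void postprocess(struct expr *e)
{
	if (e == NULL)
		return;

	e->visits++;

	switch (e->type) {
	case EXPR_RANGE:
		postprocess(e->left);
		postprocess(e->right);
		break;	/* without this break, control continues into EXPR_PREFIX */
	case EXPR_PREFIX:
		postprocess(e->left);
		break;
	default:
		break;
	}
}
```

With the break in place, the left child of a range node is visited once. Removing it would make the EXPR_PREFIX branch run as well and visit the left child a second time.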
Fixes: ee4391d0ac1e7 ("nat: transform range to prefix expression when possible")
Signed-off-by: Phil Sutter <phil@nwl.cc>
The fuzzy lookup is exercised from the error path, when no object is
found. Remove branch that checks for exact matching since that should
not ever happen.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
libnftables: location-based error reporting for chain type
Store the location of the chain type for better error reporting.
Several users that compile custom kernels reported that error
reporting is misleading when accidentally selecting
CONFIG_NFT_NAT=n.
After this patch, a better hint is provided:
# nft 'add chain x y { type nat hook prerouting priority dstnat; }'
Error: Could not process rule: No such file or directory
add chain x y { type nat hook prerouting priority dstnat; }
^^^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
datatype.c: In function ‘cgroupv2_type_print’:
datatype.c:1387:22: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
nft_print(octx, "%lu", id);
~~^ ~~
%llu
meta.c: In function ‘date_type_print’:
meta.c:411:21: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
nft_print(octx, "%lu", tstamp);
~~^ ~~~~~~
%llu
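The warnings above arise because uint64_t is not unsigned long on every target. One portable way to print such a value is the PRIu64 macro from <inttypes.h>; whether the patch itself took this route is not shown here. The format_id() helper below is hypothetical and uses snprintf() in place of nft_print().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit id into buf, returns snprintf()'s result. */
int format_id(char *buf, size_t len, uint64_t id)
{
	/* PRIu64 expands to the matching conversion for uint64_t on this target. */
	return snprintf(buf, len, "%" PRIu64, id);
}
```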
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
parser_bison: add shortcut syntax for matching flags without binary operations
This patch adds the following shortcut syntax:
expression flags / flags
instead of:
expression and flags == flags
For example:
tcp flags syn,ack / syn,ack,fin,rst
^^^^^^^ ^^^^^^^^^^^^^^^
value mask
instead of:
tcp flags and (syn|ack|fin|rst) == syn|ack
The second list of comma-separated flags is the mask of flags that are
examined; the first list gives the flags that must be set among them.
You can also use the != operator with this syntax:
tcp flags != fin,rst / syn,ack,fin,rst
This shortcut is based on the prefix notation, but it is also similar to
the iptables tcp matching syntax.
This patch introduces the flagcmp expression to print the tcp flags in
this new notation. The delinearize path transforms the binary expression
to this new flagcmp expression whenever possible.
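The semantics of "value / mask" can be sketched as a plain bitwise test. This is an illustration only, not the flagcmp implementation; flags_match() is a hypothetical helper, and the flag constants follow the bit layout of the TCP header.

```c
#include <stdbool.h>
#include <stdint.h>

/* TCP flag bits as laid out in the TCP header. */
enum {
	TCP_FIN = 0x01,
	TCP_SYN = 0x02,
	TCP_RST = 0x04,
	TCP_PSH = 0x08,
	TCP_ACK = 0x10,
};

/* "flags value / mask": only bits in mask are examined and must equal value. */
bool flags_match(uint8_t flags, uint8_t value, uint8_t mask)
{
	return (flags & mask) == value;
}
```

For "tcp flags syn,ack / syn,ack,fin,rst", value is TCP_SYN | TCP_ACK and mask is TCP_SYN | TCP_ACK | TCP_FIN | TCP_RST. A segment with syn, ack and psh set still matches, since psh is outside the mask. The != form corresponds to negating the result.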
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Marco Oliverio [Thu, 13 May 2021 14:10:32 +0000 (16:10 +0200)]
cache: check errno before invoking cache_release()
If genid changes during cache_init(), check_genid() sets errno to EINTR to force
a re-init of the cache.
cache_release() may inadvertently change errno by calling free(). Indeed, free()
may invoke madvise(), which changes errno to ENOSYS on systems where the kernel is
configured without support for this syscall.
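The general pattern is to read errno before running any cleanup that might overwrite it. The sketch below is not the nftables code: release_resources() is a hypothetical stand-in for cache_release() that deliberately clobbers errno, and cleanup_and_check_retry() is an invented name.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cleanup routine that may clobber errno. */
static void release_resources(void *p)
{
	free(p);
	errno = ENOSYS;	/* simulate a side effect inside libc or a syscall */
}

/* Returns 1 if the caller should retry; errno is read before the cleanup. */
int cleanup_and_check_retry(void *p)
{
	int saved_errno = errno;	/* capture before anything can overwrite it */

	release_resources(p);
	errno = saved_errno;		/* restore for later readers */

	return saved_errno == EINTR;
}
```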
Signed-off-by: Marco Oliverio <marco.oliverio@tanaza.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netlink_delinearize: fix binary operation postprocessing with sets
If the right-hand side expression of the binary expression is a set,
then skip the postprocessing step; otherwise tests/py reports the
following warning:
tests: shell: don't assume fixed handle value in cache/0008_delete_by_handle_0
This test occasionally reports a warning on one of my test boxes.
Update this test to extract the handle from the ruleset listing, using a
rudimentary invocation of the cut command to work around this.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 6 May 2021 08:12:45 +0000 (10:12 +0200)]
doc: Reduce size of NAT statement synopsis
Introduce non-terminals representing address and port which may
represent ranges as well. Combined with dropping the distinction between
PR_FLAGS and PRF_FLAGS, all the lines for each nat statement type can be
combined.
Stefano Brivio [Wed, 5 May 2021 22:23:14 +0000 (00:23 +0200)]
tests: Introduce 0043_concatenated_ranges_1 for subnets of different sizes
The report from https://bugzilla.netfilter.org/show_bug.cgi?id=1520
showed a display issue with particular IPv6 mask lengths in elements
of sets with concatenations. Make sure we cover insertion and listing
of different mask lengths in concatenated set elements for IPv4 and
IPv6.
Stefano Brivio [Wed, 5 May 2021 22:23:13 +0000 (00:23 +0200)]
segtree: Fix range_mask_len() for subnet ranges exceeding unsigned int
As concatenated ranges are fetched from kernel sets and displayed to
the user, range_mask_len() evaluates whether the range is suitable for
display as a netmask. If so, it calculates the mask length by
right-shifting the endpoints until no set bits are left. In the
existing version, however, the temporary copies of the endpoints are
derived by copying their unsigned int representation, which in general
doesn't suffice for IPv6 netmask lengths.
PetrB reports that, after inserting a /56 subnet in a concatenated set
element, it's listed as a /64 range. In fact, this happens for any
IPv6 mask shorter than 64 bits.
Fix this issue by simply sourcing the range endpoints provided by the
caller and setting the temporary copies with mpz_init_set(), instead
of fetching the unsigned int representation. The issue only affects
how the masks are displayed; setting elements already works as expected.
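To illustrate why full-width endpoints matter, here is a sketch of the right-shift approach without GMP. It is not range_mask_len() itself: struct u128, its helpers and prefix_len() are hypothetical, with a 128-bit value stored as two 64-bit halves.

```c
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves. */
struct u128 {
	uint64_t hi;
	uint64_t lo;
};

static struct u128 u128_shr1(struct u128 v)
{
	/* Shift right by one bit, carrying the lowest bit of hi into lo. */
	struct u128 r = { v.hi >> 1, (v.lo >> 1) | (v.hi << 63) };
	return r;
}

static int u128_eq(struct u128 a, struct u128 b)
{
	return a.hi == b.hi && a.lo == b.lo;
}

/*
 * Returns the prefix length if [start, end] covers exactly one prefix of a
 * len-bit address, or -1 otherwise. Each step drops one host bit, which must
 * be 0 in start and 1 in end.
 */
int prefix_len(struct u128 start, struct u128 end, unsigned int len)
{
	while (!u128_eq(start, end) &&
	       !(start.lo & 1) && (end.lo & 1) && len > 0) {
		start = u128_shr1(start);
		end = u128_shr1(end);
		len--;
	}

	return u128_eq(start, end) ? (int)len : -1;
}
```

For the endpoints of 2001:db8:0:100::/56 this returns 56. Passing only the low 64 bits of the same endpoints (hi set to 0) returns 64 in this sketch, which is consistent with the /64 listing reported above.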
If the cache does not contain this object that is defined in this batch,
add it to the cache. This allows for references to this new object in
the same batch.
This patch also adds missing handle_merge() to set the object name,
otherwise object name is NULL and obj_cache_find() crashes.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If the cache does not contain this flowtable that is defined in this
batch, then add it to the cache. This allows for references to this new
flowtable in the same batch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If the cache does not contain the set that is defined in this batch, add
it to the cache. This allows for references to this new set in the same
batch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Actually, I am not expecting enough flowtables to be created to benefit
from the hashtable, but this streamlines the code with tables, chains,
sets and policy objects.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
You can identify chains through the unique handle in deletions. Update
this interface to take a string instead of the handle, to prepare for
the introduction of 64-bit handle chain lookups.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: check if nat statement map specifies a transport header expr
Importing the systemd nat table fails:
table ip io.systemd.nat {
map map_port_ipport {
type inet_proto . inet_service : ipv4_addr . inet_service
elements = { tcp . 8088 : 192.168.162.117 . 80 }
}
chain prerouting {
type nat hook prerouting priority dstnat + 1; policy accept;
fib daddr type local dnat ip addr . port to meta l4proto . th dport map @map_port_ipport
}
}
ruleset:9:48-59: Error: transport protocol mapping is only valid after transport protocol match
To resolve this (no transport header base specified), check if the
map itself contains a network base protocol expression.
This allows nft to import the ruleset.
Import still fails with the same error if 'inet_service' is removed
from the map, as it should.
Another process might race to add chains after chain_cache_init().
The generation check does not help since it comes after cache_init().
NLM_F_DUMP_INTR only guarantees consistency within one single netlink
dump operation, so it does not help either (cache population requires
several netlink dump commands).
Let's be safe and not assume that the chain exists in the cache when
populating the rule cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
- Chains that reside in the cache are stored in the new
tables->cache_chain and tables->cache_chain_ht. The hashtable chain
cache allows for fast chain lookups.
- Chains that are defined via command line / ruleset file reside in
tables->chains.
Note that chains in the cache (already in the kernel) are not placed in
table->chains.
By keeping separate lists, chains defined via command line / ruleset
file can be added to the cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Tue, 30 Mar 2021 23:26:19 +0000 (01:26 +0200)]
netlink: don't crash when set elements are not evaluated as expected
define foo = 2001:db8:123::/48
table inet filter {
set foo {
typeof ip6 saddr
elements = $foo
}
}
gives a crash. This now exits with:
stdin:1:14-30: Error: Unexpected initial set type prefix
define foo = 2001:db8:123::/48
^^^^^^^^^^^^^^^^^
For literals, bison parser protects us, as it enforces
'elements = { 2001:... '.
For 'elements = $foo' we can't detect it at parsing stage as the '$foo'
symbol might as well evaluate to "{ 2001, ...}" (i.e. we can't do a
set element allocation).
This is an alternative way to print the datatype values when no symbol
table is available. Use it to print the protocols available via
getprotobynumber(), which actually refers to /etc/protocols.
Not very efficient, getprotobynumber() causes a series of open()/close()
calls on /etc/protocols, but this is called from a non-critical path.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1503
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Simon Ruderich [Tue, 9 Mar 2021 10:53:30 +0000 (11:53 +0100)]
doc: use symbolic names for chain priorities
This replaces the numbers with the matching symbolic names, with one
exception: the NAT example used "priority 0" for the prerouting
priority. This is replaced by "dstnat", which has priority -100 and is
the new recommended priority.
Also use spaces instead of tabs for consistency in lines which require
updates.
Signed-off-by: Simon Ruderich <simon@ruderich.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Tue, 16 Mar 2021 23:40:34 +0000 (00:40 +0100)]
scanner: add support for scope nesting
Adding a COUNTER scope introduces parsing errors. Example:
add rule ... counter ip saddr 1.2.3.4
This is supposed to be
COUNTER IP SADDR SYMBOL
but it will be parsed as
COUNTER IP STRING SYMBOL
... and rule fails with unknown saddr.
This is because IP state change gets popped right after it was pushed.
bison parser invokes scanner_pop_start_cond() helper via
'close_scope_counter' rule after it has processed the entire 'counter' rule.
But that happens *after* flex has executed the 'IP' rule.
IOW, the sequence of events is not the expected
"COUNTER close_scope_counter IP SADDR SYMBOL close_scope_ip", it is
"COUNTER IP close_scope_counter".
close_scope_counter pops the just-pushed SCANSTATE_IP and returns the
scanner to SCANSTATE_COUNTER, so the next input token (saddr) gets parsed
as a string, which then gets rejected by bison.
To resolve this, defer the pop operation until the current state is done.
scanner_pop_start_cond() already gets the scope that has been
completed as an argument, so we can compare it to the active state.
If those are not the same, just defer the pop operation until
bison reports it's done with the active flex scope.
This leads to the following sequence of events:
1. flex switches to SCANSTATE_COUNTER
2. flex switches to SCANSTATE_IP
3. bison calls scanner_pop_start_cond(SCANSTATE_COUNTER)
4. flex remains in SCANSTATE_IP, bison continues
5. bison calls scanner_pop_start_cond(SCANSTATE_IP) once the entire
ip rule has completed: this pops both IP and COUNTER.
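The deferred pop can be sketched with a small explicit state stack. This is an illustration under assumptions, not the flex/bison integration itself: struct scanner, scanner_push(), scanner_pop() and the STATE_* names are hypothetical and only mirror the roles of the scan states and scanner_pop_start_cond() described above.

```c
#define MAX_DEPTH 8

/* Hypothetical scanner states, named after the ones in the text. */
enum scan_state { STATE_INITIAL, STATE_COUNTER, STATE_IP };

struct scanner {
	enum scan_state stack[MAX_DEPTH];
	int depth;	/* states pushed on top of INITIAL */
	int deferred;	/* pops requested while another state was active */
};

enum scan_state scanner_current(const struct scanner *s)
{
	return s->depth > 0 ? s->stack[s->depth - 1] : STATE_INITIAL;
}

void scanner_push(struct scanner *s, enum scan_state st)
{
	if (s->depth < MAX_DEPTH)
		s->stack[s->depth++] = st;
}

/* Called when the parser has finished the rule that opened scope 'done'. */
void scanner_pop(struct scanner *s, enum scan_state done)
{
	if (scanner_current(s) != done) {
		s->deferred++;	/* a nested scope is still active: pop later */
		return;
	}

	if (s->depth > 0)
		s->depth--;	/* leave the scope that just completed */

	while (s->deferred > 0 && s->depth > 0) {
		s->deferred--;
		s->depth--;	/* now run the pops that were postponed */
	}
}
```

Replaying the five steps: push COUNTER, push IP, then scanner_pop(STATE_COUNTER) leaves the scanner in STATE_IP with one deferred pop, and scanner_pop(STATE_IP) removes both states and returns to STATE_INITIAL.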
Florian Westphal [Thu, 11 Mar 2021 13:23:02 +0000 (14:23 +0100)]
scanner: ct: move to own scope
This allows moving multiple ct specific keywords out of INITIAL scope.
The next few patches follow the same pattern:
1. add a close_scope_XXX rule
2. add a SCANSTATE_XXX & make flex switch to it when
encountering XXX keyword
3. make bison leave SCANSTATE_XXX when it has seen the complete
expression.