Check family when filtering out listing of tables and sets.
Fixes: 3f1d3912c3a6 ("cache: filter out tables that are not requested")
Fixes: 635ee1cad8aa ("cache: filter out sets and maps that are not requested")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 28 Oct 2021 15:36:06 +0000 (17:36 +0200)]
doc: update ct timeout section with the state names
The docs were too terse and lacked the list of valid timeout states.
While at it, adjust the default stream timeout of udp to 120, which is
the current kernel default.
evaluate: clone variable expression if there is more than one reference
Clone the expression that defines the variable value if there are
multiple references to it in the ruleset. This saves heap memory
consumption in case the variable defines a set with a huge number of
elements.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Do not call alloc_setelem_cache() to build the set element list in
nftnl_set. Instead, translate one single set element expression to
nftnl_set_elem object at a time and use this object to build the netlink
header.
Using a huge test set containing a blocklist of 1.1 million elements,
this patch reduces userspace memory consumption by 40%.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
tests: py: remove verdict from closing end interval
Kernel does not allow for NFT_SET_ELEM_INTERVAL_END flag and
NFTA_SET_ELEM_DATA. The closing end interval represents a mismatch,
therefore, no verdict can be applied. The existing payload files show
the drop verdict when this is unset (because NF_DROP=0).
This update is required to fix payload warnings in tests/py after
libnftnl's ("set: use NFTNL_SET_ELEM_VERDICT to print verdict").
Fixes: 6671d9d137f6 ("mnl: Set NFTNL_SET_DATA_TYPE before dumping set elements")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Štěpán Němec [Fri, 5 Nov 2021 11:39:11 +0000 (12:39 +0100)]
tests: shell: $NFT needs to be invoked unquoted
The variable has to undergo word splitting, otherwise the shell tries
to find the variable value as an executable, which breaks in cases that 7c8a44b25c22 ("tests: shell: Allow wrappers to be passed as nft command")
intends to support.
Mention this in the shell tests README.
Fixes: d8ccad2a2b73 ("tests: cover baecd1cf2685 ("segtree: Fix segfault when restoring a huge interval set")")
Signed-off-by: Štěpán Němec <snemec@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
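The effect of quoting $NFT can be sketched in Python. This is only an illustration: `unquoted_argv` and `quoted_argv` are hypothetical helper names, and `str.split()` stands in for the shell's default whitespace word splitting (it does not model globbing or a custom IFS).

```python
def unquoted_argv(nft_value, *args):
    """Mimic `$NFT args...`: the value is split on whitespace first,
    so a wrapper such as 'valgrind nft' becomes two argv entries."""
    return nft_value.split() + list(args)


def quoted_argv(nft_value, *args):
    """Mimic `"$NFT" args...`: the whole value stays one word, so the
    shell would look for an executable literally named 'valgrind nft'."""
    return [nft_value] + list(args)


if __name__ == "__main__":
    print(unquoted_argv("valgrind nft", "list", "ruleset"))
    print(quoted_argv("valgrind nft", "list", "ruleset"))
```

With the quoted form, argv[0] contains a space and does not name any program, which is the breakage the commit above describes.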
Štěpán Němec [Fri, 5 Nov 2021 11:39:10 +0000 (12:39 +0100)]
tests: shell: README: clarify test file name convention
Since commit 4d26b6dd3c4c, test file name suffix no longer reflects
expected exit code in all cases.
Move the sentence "Since they are located with `find', test files can
be put in any subdirectory." to a separate paragraph.
Fixes: 4d26b6dd3c4c ("tests: shell: change all test scripts to return 0")
Signed-off-by: Štěpán Němec <snemec@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Štěpán Němec [Fri, 5 Nov 2021 11:39:09 +0000 (12:39 +0100)]
tests: shell: README: $NFT does not have to be a path to a binary
Since commit 7c8a44b25c22, $NFT can contain an arbitrary command,
e.g. 'valgrind nft'.
Fixes: 7c8a44b25c22 ("tests: shell: Allow wrappers to be passed as nft command")
Signed-off-by: Štěpán Němec <snemec@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
evaluate: postpone transport protocol match check after nat expression evaluation
Fix bogus error report when using transport protocol as map key.
Fixes: 50780456a01a ("evaluate: check for missing transport protocol match in nat map with concatenations")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jeremy Sowden [Fri, 29 Oct 2021 20:40:08 +0000 (21:40 +0100)]
parser: add `limit_rate_pkts` and `limit_rate_bytes` rules
Factor the `N / time-unit` and `N byte-unit / time-unit` expressions
from limit expressions out into separate `limit_rate_pkts` and
`limit_rate_bytes` rules respectively.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Štěpán Němec [Wed, 20 Oct 2021 12:44:09 +0000 (14:44 +0200)]
tests: run-tests.sh: ensure non-zero exit when $failed != 0
POSIX [1] does not specify the behavior of `exit' with arguments
outside the 0-255 range, but what generally (bash, dash, zsh, OpenBSD
ksh, busybox) seems to happen is the shell exiting with status & 255
[2], which results in zero exit for certain non-zero arguments.
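The wrap-around can be sketched in Python. The `& 255` truncation is the behavior observed in the shells listed above, not something POSIX guarantees, and both function names are hypothetical; `safe_exit_status` is one possible guard, not necessarily the one the patch uses.

```python
def effective_exit_status(requested: int) -> int:
    """Status the parent sees if the shell keeps only the low 8 bits
    of the argument given to `exit` (assumed: status & 255)."""
    return requested & 255


def safe_exit_status(failed: int) -> int:
    """Hypothetical guard: collapse any non-zero failure count to 1,
    so a multiple of 256 can never be reported as success."""
    return 1 if failed != 0 else 0


if __name__ == "__main__":
    for failed in (0, 1, 255, 256, 512):
        print(failed, effective_exit_status(failed), safe_exit_status(failed))
```

A failure count of 256 or 512 truncates to 0 in the first function, while the guard still yields a non-zero status.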
Phil Sutter [Tue, 2 Nov 2021 19:53:53 +0000 (20:53 +0100)]
tests: shell: Fix bogus testsuite failure with 250Hz
Previous fix for HZ=100 was not sufficient, a kernel with HZ=250 rounds
the 10ms to 8ms it seems. Do as Lukas suggests and accept the occasional
input/output asymmetry instead of continuing the hide'n'seek game.
Fixes: c9c5b5f621c37 ("tests: shell: Fix bogus testsuite failure with 100Hz")
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
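A rough model of the rounding, in Python. It assumes that a timeout is truncated to a whole number of kernel ticks and that HZ divides 1000 evenly; the commit message itself only says the value "seems" to be rounded, so this is a sketch of the arithmetic rather than of the kernel code. `tick_rounded_ms` is a hypothetical name.

```python
def tick_rounded_ms(timeout_ms: int, hz: int) -> int:
    """Truncate timeout_ms to a whole number of ticks at the given HZ.
    Assumption: one tick lasts 1000 // hz milliseconds."""
    ms_per_tick = 1000 // hz
    return (timeout_ms // ms_per_tick) * ms_per_tick


if __name__ == "__main__":
    for hz in (100, 250, 1000):
        print(hz, tick_rounded_ms(10, hz), tick_rounded_ms(5, hz))
```

Under these assumptions a 10ms timeout survives at HZ=100 and HZ=1000 but becomes 8ms at HZ=250 (4ms ticks), and a 5ms timeout cannot be represented at all at HZ=100.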
Lukas Wunner [Wed, 11 Mar 2020 12:20:06 +0000 (13:20 +0100)]
src: Support netdev egress hook
Add userspace support for the netdev egress hook which is queued up for
v5.16-rc1, complete with documentation and tests. Usage is identical to
the ingress hook.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Štěpán Němec [Wed, 20 Oct 2021 12:42:20 +0000 (14:42 +0200)]
tests: cover baecd1cf2685 ("segtree: Fix segfault when restoring a huge interval set")
Test inspired by [1] with both the set and stack size reduced by the
same power of 2, to preserve the (pre-baecd1cf2685) segfault on one
hand, and make the test successfully complete (post-baecd1cf2685) in a
few seconds even on weaker hardware on the other.
(The reason I stopped at 128kB stack size is that with 64kB I was
getting segfaults even with baecd1cf2685 applied.)
Florian Westphal [Tue, 19 Oct 2021 12:07:25 +0000 (14:07 +0200)]
tests: shell: auto-removal of chain hook on netns removal
This is the nft equivalent of the syzbot report that led to
kernel commit 68a3765c659f8
("netfilter: nf_tables: skip netdev events generated on netns removal").
Jeremy Sowden [Thu, 7 Oct 2021 20:12:21 +0000 (21:12 +0100)]
rule: fix stateless output after listing sets containing counters
Before outputting counters in set definitions the
`NFT_CTX_OUTPUT_STATELESS` flag was set to suppress output of the
counter state and unconditionally cleared afterwards, regardless of
whether it had been originally set. Record the original set of flags
and restore it.
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994273
Fixes: 6d80e0f15492 ("src: support for counter in set definition")
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
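The save-and-restore pattern described in the commit above, sketched in Python. `OutputCtx`, `stateless_output` and the flag values are illustrative stand-ins, not the libnftables definitions; the point is only that the original flag word is restored instead of the stateless bit being cleared unconditionally.

```python
from contextlib import contextmanager

STATELESS = 1 << 0  # stand-in for NFT_CTX_OUTPUT_STATELESS (value is illustrative)
HANDLE = 1 << 1     # some unrelated output flag (illustrative)


class OutputCtx:
    """Minimal stand-in for an output context carrying a flag word."""

    def __init__(self, flags=0):
        self.flags = flags


@contextmanager
def stateless_output(ctx):
    """Force stateless output for the duration of the block, then put
    back whatever flags the caller had set before."""
    saved = ctx.flags
    ctx.flags |= STATELESS
    try:
        yield ctx
    finally:
        ctx.flags = saved  # restore, do not blindly clear STATELESS


if __name__ == "__main__":
    ctx = OutputCtx(STATELESS | HANDLE)  # user already asked for stateless output
    with stateless_output(ctx):
        pass  # counters in the set definition would be printed here
    print(bool(ctx.flags & STATELESS))
```

If the block cleared the bit instead of restoring `saved`, a caller that had requested stateless output would get stateful output for everything printed afterwards.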
Jeremy Sowden [Thu, 7 Oct 2021 20:12:20 +0000 (21:12 +0100)]
rule: remove fake stateless output of named counters
When `-s` is passed, no state is output for named quotas and counter and
quota rules, but fake zero state is output for named counters. Remove
the output of named counters to match the remaining stateful objects.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Štěpán Němec [Mon, 11 Oct 2021 11:59:04 +0000 (13:59 +0200)]
doc: libnftables-json: make the example valid libnftables JSON input
- Add missing comma between array elements.
- Fix chain 'name' property.
- Match 'op' property is mandatory.
Fixes: 2e56f533b36a ("doc: Improve example in libnftables-json(5)")
Fixes: 90d4ee087171 ("JSON: Make match op mandatory, introduce 'in' operator")
Signed-off-by: Štěpán Němec <snemec@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Set the cache flags for the nested notation too. This fixes nft -f
with two files, one that contains the set declaration and another that
adds a rule that refers to that set.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1474
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: check for missing transport protocol match in nat map with concatenations
Restore this error with NAT maps:
# nft add rule 'ip ipfoo c dnat to ip daddr map @y'
Error: transport protocol mapping is only valid after transport protocol match
add rule ip ipfoo c dnat to ip daddr map @y
~~~~ ^^^^^^^^^^^^^^^
Allow for transport protocol match in the map too, which is implicitly
pulling in a transport protocol dependency.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netlink: dynset: set compound expr dtype based on set key definition
"nft add rule ... add @t { ip saddr . 22 ..." will be listed as
"ip saddr . 0x16 [ invalid type]".
This is a display bug, the compound expression created during netlink
deserialization lacks correct datatypes for the value expression.
Avoid this by setting the individual expressions' datatype.
The set key has the needed information, so walk over the types and set
them in the dynset statement.
Also add a test case.
Reported-by: Paulo Ricardo Bruck <paulobruck1@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
payload: don't adjust offsets of autogenerated dependency expressions
Pablo says:
user reports that this is broken:
nft --debug=netlink add rule bridge filter forward vlan id 100 vlan id set 200
[..]
[ payload load 2b @ link header + 14 => reg 1 ]
[..]
[ payload load 2b @ link header + 28 => reg 1 ]
[ bitwise reg 1 = ( reg 1 & 0x000000f0 ) ^ 0x0000c800 ]
[ payload write reg 1 => 2b @ link header + 14 csum_type 0 csum_off 0 csum_flags 0x0 ]
offset says 28, it is assuming q-in-q, in this case it is mangling the
existing header.
The problem here is that 'vlan id set 200' needs a read-modify-write
cycle because 'vlan id set' has to preserve bits located in the same byte area
as the vlan id.
The first 'payload load' at offset 14 is generated via 'vlan id 100',
this part is ok.
The second 'payload load' at offset 28 is the bogus one.
It's added as a dependency, but then adjusted because nft evaluation
considers this identical to 'vlan id 1 vlan id 2', where nft assumes
q-in-q.
To fix this, skip offset adjustments for raw expressions and mark the
dependency-generated payload instruction as such.
This is fine because raw payload operations assume that user specifies
base/offset/length manually.
Also add a test case for this.
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
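The read-modify-write cycle behind 'vlan id set' can be sketched in Python. The layout assumed here is the standard 16-bit VLAN TCI, with the VLAN ID in the low 12 bits and the priority/DEI bits in the high 4. `set_vlan_id` and `as_le_register` are hypothetical helper names, and the byte-order remark assumes the debug output was taken on a little-endian host.

```python
VID_MASK = 0x0FFF  # low 12 bits of the TCI carry the VLAN ID


def set_vlan_id(tci: int, new_vid: int) -> int:
    """Return the TCI with its VLAN ID replaced and the upper 4 bits
    (priority and DEI) preserved: load, mask, merge, write back."""
    if not 0 <= new_vid <= VID_MASK:
        raise ValueError("VLAN ID must fit in 12 bits")
    preserved = tci & ~VID_MASK & 0xFFFF
    return preserved | new_vid


def as_le_register(value16: int) -> int:
    """How a 2-byte network-order value reads when its bytes are
    interpreted as a little-endian integer (assumed host order)."""
    return int.from_bytes(value16.to_bytes(2, "big"), "little")


if __name__ == "__main__":
    tci = (5 << 13) | (1 << 12) | 100  # priority 5, DEI 1, VLAN ID 100
    print(hex(set_vlan_id(tci, 200)))
    print(hex(as_le_register(0xF000)), hex(as_le_register(200)))
```

Under that byte-order assumption, the preserved-bits mask 0xf000 and the new ID 200 read as 0xf0 and 0xc800, which matches the constants in the bitwise instruction of the debug output. The load feeding that instruction has to come from the same offset as the write (14 here), which is what the fix restores.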
Since 309785674b25 ("datatype: time_print() ignores -T"), time_type
honors -T option. Given tests/py run in numeric format, this patch
fixes a warning since the ct expiration is now expressed in seconds.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Partially revert 913979f882d1 ("src: add expression handler hashtable")
which is causing a crash with two instances of the nftables handler.
$ sudo python
[sudo] password for echerkashin:
Python 3.9.7 (default, Sep 3 2021, 06:18:44)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from nftables import Nftables
>>> n1=Nftables()
>>> n2=Nftables()
>>> <Ctrl-D>
double free or corruption (top)
Aborted
src: Check range bounds before converting to prefix
The lower bound must be the first value of the prefix to be converted.
For example, range "10.0.0.15-10.0.0.240" can not be converted to
"10.0.0.15/24". Validate it by checking if the lower bound value has
enough trailing zeros.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
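A minimal Python sketch of the bounds check, using the standard `ipaddress` module. `range_to_prefix` is a hypothetical helper and not the nft implementation; it only illustrates the two conditions a range has to meet before it can be printed as a prefix.

```python
import ipaddress


def range_to_prefix(low: str, high: str):
    """Return 'addr/len' if [low, high] is exactly one prefix, else None."""
    lo = int(ipaddress.ip_address(low))
    hi = int(ipaddress.ip_address(high))
    size = hi - lo + 1
    # The range must cover a power-of-two number of addresses ...
    if size <= 0 or size & (size - 1):
        return None
    # ... and the lower bound needs enough trailing zero bits, i.e. it
    # must be the first address of the prefix.
    if lo & (size - 1):
        return None
    total_bits = ipaddress.ip_address(low).max_prefixlen
    prefix_len = total_bits - (size.bit_length() - 1)
    return f"{low}/{prefix_len}"


if __name__ == "__main__":
    print(range_to_prefix("10.0.0.0", "10.0.0.255"))
    print(range_to_prefix("10.0.0.15", "10.0.0.240"))
    print(range_to_prefix("10.0.0.1", "10.0.0.2"))
```

The last call shows why the trailing-zeros test matters on its own: the range spans two addresses, a power of two, yet it starts on an odd address and therefore is not a /31.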
# nft list ruleset
table ip x {
        set y {
                type ipv4_addr
                flags timeout
                elements = { 1.1.1.1 timeout 5m expires 1m49s40ms }
        }
}
# sudo nft -T list ruleset
table ip x {
        set y {
                type ipv4_addr
                flags timeout
                elements = { 1.1.1.1 timeout 300s expires 108s }
        }
}
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1561
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netlink_delinearize: incorrect meta protocol dependency kill again
This patch adds __meta_dependency_may_kill() to consolidate inspection
of the meta protocol, nfproto and ether type expression to validate
dependency removal on listings.
Phil reports that 567ea4774e13 includes an update on the ip and ip6
families that is not described in the patch, moreover, it flips the
default verdict from true to false.
Fixes: 567ea4774e13 ("netlink_delinearize: incorrect meta protocol dependency kill")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 1 Sep 2021 14:41:44 +0000 (16:41 +0200)]
parser_json: Fix error reporting for invalid syntax
Errors emitted by the JSON parser caused BUG() in erec_print() due to
input descriptor values being bogus.
Due to lack of 'include' support, JSON parser uses a single input
descriptor only and it lived inside the json_ctx object on stack of
nft_parse_json_*() functions.
By the time errors are printed though, that scope is not valid anymore.
Move the static input descriptor object to avoid this.
Fixes: 586ad210368b7 ("libnftables: Implement JSON parser")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Wed, 11 Aug 2021 16:14:06 +0000 (18:14 +0200)]
tests: json_echo: Print errors to stderr
Apart from the obvious, this fixes exit_dump() which tried to dump the
wrong variable ('out' instead of 'obj') and missed that json.dumps()
doesn't print but just returns a string. Make it call exit_err() to
share some code, which changes the prefix from 'FAIL' to 'Error' as a
side-effect.
While at it, fix a syntax warning with newer Python in
unrelated code.
Fixes: bb32d8db9a125 ("JSON: Add support for echo option")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Print queue statement using the 'queue ... to' syntax to consolidate the
syntax around Florian's proposal introduced in 6cf0f2c17bfb ("src:
queue: allow use of arbitrary queue expressions").
Retain backward compatibility, 'queue num' syntax is still allowed.
Update and add new tests.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Fri, 20 Aug 2021 09:52:35 +0000 (11:52 +0200)]
parser: permit symbolic define for 'queue num' again
When I simplified the parser to restrict 'queue num' I forgot that
instead of a range or an immediate value it's also allowed to pass in
a variable expression, e.g.
Update this command to display the hook datapath for a packet depending
on its family.
This patch also includes:
- Group of existing hooks based on the hook location.
- Order hooks by priority, from INT_MIN to INT_MAX.
- Do not add sign to priority zero.
- Refresh include/linux/netfilter/nfnetlink_hook.h cache copy.
- Use NFNLA_CHAIN_* attributes to print the chain family, table and name.
If NFNLA_CHAIN_* attributes are not available, display the hookfn name.
- Update syntax: remove optional hook parameter, promote the 'device'
argument.
The following example shows the hook datapath for IPv4 packets coming in
from netdevice 'eth0':
# nft list hooks ip device eth0
family ip {
        hook ingress {
                +0000000010 chain netdev x y [nf_tables]
                +0000000300 chain inet m w [nf_tables]
        }
        hook input {
                -0000000100 chain ip a b [nf_tables]
                +0000000300 chain inet m z [nf_tables]
        }
        hook forward {
                -0000000225 selinux_ipv4_forward
                 0000000000 chain ip a c [nf_tables]
        }
        hook output {
                -0000000225 selinux_ipv4_output
        }
        hook postrouting {
                +0000000225 selinux_ipv4_postroute
        }
}
Note that the listing above includes the existing netdev and inet
hooks/chains which *might* interfere in the travel of an incoming IPv4
packet. This allows users to debug the pipeline, basically, to
understand in what order the hooks/chains are evaluated for the IPv4
packets.
If the netdevice is not specified, then the ingress hooks are not
shown.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
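The ordering and sign rules from the list above can be sketched in Python. `format_priority` and `sort_hooks` are hypothetical helpers written to reproduce the shape of the listing in this commit message; the leading space used for priority zero is an assumption made for column alignment, and nothing here is taken from the nft sources.

```python
def format_priority(prio: int) -> str:
    """Render a hook priority as a sign plus ten zero-padded digits.
    Priority zero gets no sign; a space keeps the column aligned
    (alignment choice is an assumption)."""
    if prio == 0:
        return " " + "0" * 10
    return format(prio, "+011d")


def sort_hooks(hooks):
    """Order (priority, description) pairs from lowest to highest
    priority, i.e. from INT_MIN towards INT_MAX."""
    return sorted(hooks, key=lambda hook: hook[0])


if __name__ == "__main__":
    forward = [
        (0, "chain ip a c [nf_tables]"),
        (-225, "selinux_ipv4_forward"),
    ]
    for prio, desc in sort_hooks(forward):
        print(format_priority(prio), desc)
```

Sorting by the numeric priority rather than by the formatted string keeps negative priorities ahead of zero and positive ones.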
The negation was introduced to provide a simple shortcut. Extend
e6c32b2fa0b8 ("src: add negation match on singleton bitmask value") to
disallow negation with binary operations too.
evaluate: error reporting for missing statements in set/map declaration
Assuming this map:
map y {
        type ipv4_addr : verdict
}
This patch slightly improves error reporting to refer to the missing
'counter' statement in the map declaration.
# nft 'add element x y { 1.2.3.4 counter packets 1 bytes 1 : accept, * counter : drop }'
Error: missing statement in map declaration
add element x y { 1.2.3.4 counter packets 10 bytes 640 : accept, * counter : drop }
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The kernel already assumes that the ICMP type to reject a packet is
destination-unreachable, hence the user specifies the *ICMP code*.
Simplify the syntax to:
... reject with icmp port-unreachable
this removes the 'type' keyword before the ICMP code to reject the
packet with.
IIRC, the original intention was to leave room for future extensions that
allow specifying both the ICMP type and the ICMP code; this is however
not possible with the current inconsistent syntax.
Phil Sutter [Mon, 26 Jul 2021 13:27:32 +0000 (15:27 +0200)]
tests: shell: Fix bogus testsuite failure with 100Hz
On kernels with CONFIG_HZ=100, clock granularity does not allow tracking
timeouts in single digit ms range. Change sets/0031set_timeout_size_0 to
not expose this detail.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Florian Westphal <fw@strlen.de>