src: drop obsolete hook argument form hook dump functions
since commit b98fee20bfe2 ("mnl: revisit hook listing"), handle.chain is
never set in this path, so 'hook' is always set to -1, so the hook arg
can be dropped.
mnl_nft_dump_nf_hooks() can call itself for the UNSPEC case, this
avoids the second switch/case to handle printing for inet/unspec.
As for the error handling, 'nft list hooks' should not print an error,
even if nothing is printed, UNLESS there was also a lowlevel (syscall)
error from the kernel.
We don't want to indicate failure just because e.g. kernel doesn't support
NFPROTO_ARP.
This also fixes a display bug, 'nft list hooks device foo' would show hooks
registered for that device as 'bridge' family instead of the expected
'netdev' family.
This was because UNSPEC handling did not query 'netdev' family and did
pass the device name to the lowlevel function. Add it, and pass NULL
device name for those families that don't support device attachment.
The lowelevel function currently always queries NFPROTO_NETDEV to handle
the 'inet' ingress case.
This is dubious, as 'inet ingress' is a pseudo-alias to netdev family
(inet itself is a pseudo-family that ends up registering for both ipv4
and ipv6 hooks).
ERR: "tests/shell/testcases/chains/jump_to_base_chain" has no "tests/shell/testcases/chains/dumps/jump_to_base_chain.{nft,nodump}" file
For all of those, add the relevant .nft dump file.
Add a 'nodump' file in case the test doesn't print anything (e.g.
because the test checks that invalid ruleset fails validation).
Some tests have a .nft but not .json-nft, this is because json lacks
some features, in particular "typeof" and anonymous/implicit chains.
ERR: "tests/shell/testcases/maps/delete_element_catchall" has no "tests/shell/testcases/maps/dumps/delete_element_catchall.{nft,nodump}" file
ERR: "tests/shell/testcases/maps/dumps/delete_elem_catchall.nft" has no test "tests/shell/testcases/maps/delete_elem_catchall"
these two are related, rename the dump file to match the script name.
- index with rule breaks, because NFT_CACHE_REFRESH is missing.
- simple set updates.
Moreover, the current process could populate the cache with objects for
listing commands (no generation ID is bumped), while another process
could update the ruleset. Leading to a inconsistent cache due to the
genid + 1 check.
This optimization needs more work and more tests for -i/--interactive,
revert it.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
tests: shell: add more ruleset validation test cases
Passes fine on all tested kernel releases.
Same as existing tests, but try harder to fool the validation:
1. Add a ruleset where the jump that that exceeds 16 is "broken", i.e.
c0 -> c1 ... -> c8
c9-> c1 ... -> c16
Where c0 is a base chain, with a graph thats really a linear list
from c0 to c8 and c9 to c16 is a linear list not connected to the former
or a hook point.
Then try to link them either directly via jump/goto rule or indirectly
with a verdict map.
Try both unbound map with element doing 'goto c9' and then trying to add
vmap rule to c8 (must fail, creates link).
Then try reverse: with empty map, add vmap rule to c8 (should work, no
elements...).
Then, add map element with jump or goto to c9. This should be rejected.
Try the same thing with a tproxy expression in a user-defined chain:
attempt to make it reachable from c0 (filter input), which is illegal.
/dev/stdin is a placeholder, read() from STDIN_FILENO is used to fetch
the standard input into a buffer.
Since 5c2b2b0a2ba7 ("src: error reporting with -f and read from stdin")
stdin is stored in a buffer to fix error reporting.
This patch requires: ("parser_json: use stdin buffer if available")
Fixes: 149b1c95d129 ("libnftables: refuse to open onput files other than named pipes or regular files") Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Since 5c2b2b0a2ba7 ("src: error reporting with -f and read from stdin")
stdin is stored in a buffer, update json support to use it instead of
reading from /dev/stdin.
Some systems do not provide /dev/stdin symlink to /proc/self/fd/0
according to reporter (that mentions Yocto Linux as example).
Fixes: 935f82e7dd49 ("Support 'nft -f -' to read from stdin") Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Set element deletion in maps (including catchall elements) does not work.
# nft delete element ip x m { \* }
BUG: invalid range expression type catch-all set element
nft: src/expression.c:1472: range_expr_value_low: Assertion `0' failed.
Aborted
Call interval_expr_key() to fetch expr->left in the mapping but use the
expression that represents the mapping because it provides access to the
EXPR_F_REMOVE flags.
Moreover, assume maximum value for catchall expression by means of the
expr->len to reuse the existing code to check if the element to be
deleted really exists.
Fixes: 3e8d934e4f72 ("intervals: support to partial deletion with automerge") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: set on expr->len for catchall set elements
Catchall elements coming from the parser provide expr->len == 0.
However, the existing mergesort implementation requires expr->len to be
set up to the length of the set key to properly sort elements.
In particular, set element deletion leverages such list sorting to find
if elements exists in the set.
Fixes: 419d19688688 ("src: add set element catch-all support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The default include path is searched for files before include paths
specified via -I/--include.
Search for default include path after user-specified include paths to
allow users for test nftables configurations spanning multiple files
without overwriting the globally installed ones.
# cat /path/to/files/ruleset.nft
include "file1.nft"
include "file2.nft"
include "file3.nft"
The follow up patch ("libnftables: search for default include path last")
is also required according to what it is described in the manpage update
coming with this patch.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1661 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This is because we pass the EXPR_SET_ELEM expr to the register allocation,
which will make it reserve 1 128 bit register / 16 bytes.
This happens to be enough for most cases, but its not for ipv6 concat data.
Pass the actual key and data instead: This will reserve enough space to
hold a possible concat expression.
Also add test cases.
Signed-off-by: Son Dinh <dinhtrason@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Fri, 9 Feb 2024 12:13:04 +0000 (13:13 +0100)]
tests: use common shebang in "packetpath/flowtables" test
"./tools/check-tree.sh" checks for a certain shebang. Either `/bin/bash` or
`/bin/bash -e`. No other are currently allowed, because it makes sense to be
strict/consistent and there is no need such flexibility.
Move the "-x" to a later command.
Note that "set -x" may not be a good choice anyway. If you want to debug
a test and see the shell commands, you could just run
That will automatically use `/bin/bash -x` as interpreter. And that
works for all tests the same. This is also the reason why
"check-tree.sh" checks for a well-known shebang. Because the "-x" option
of the test runner mangles the shebang, but for that it needs to
understand it.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
cmd: skip variable set elements when collapsing commands
ASAN reports an issue when collapsing commands that represent an element
through a variable:
include/list.h:60:13: runtime error: member access within null pointer of type 'struct list_head'
AddressSanitizer:DEADLYSIGNAL
=================================================================
==11398==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7ffb77cf09c2 bp 0x7ffc818267c0 sp 0x7ffc818267a0 T0)
==11398==The signal is caused by a WRITE memory access.
==11398==Hint: address points to the zero page.
#0 0x7ffb77cf09c2 in __list_add include/list.h:60
#1 0x7ffb77cf0ad9 in list_add_tail include/list.h:87
#2 0x7ffb77cf0e72 in list_move_tail include/list.h:169
#3 0x7ffb77cf86ad in nft_cmd_collapse src/cmd.c:478
#4 0x7ffb77da9f16 in nft_evaluate src/libnftables.c:531
#5 0x7ffb77dac471 in __nft_run_cmd_from_filename src/libnftables.c:720
#6 0x7ffb77dad703 in nft_run_cmd_from_filename src/libnftables.c:807
Skip such commands to address this issue.
This patch also extends tests/shell to cover for this bug.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1754 Fixes: 498a5f0c219d ("rule: collapse set element commands") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
cmd: provide better hint if chain is already declared with different type/hook/priority
Display the following error in such case:
ruleset.nft:7:9-52: Error: Chain "input" already exists in table ip 'filter' with different declaration
type filter hook postrouting priority filter;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
instead of reporting a misleading unsupported chain type when updating
an existing chain with different type/hook/priority.
Fixes: 573788e05363 ("src: improve error reporting for unsupported chain type") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
scanner: inet_pton() allows for broader IPv4-Mapped IPv6 addresses
inet_pton() allows for broader IPv4-Mapped IPv6 address syntax than
those specified by rfc4291 Sect.2.5.5. This patch extends the scanner to
support them for compatibility reasons. This allows to represent the
last 4 bytes of an IPv6 address as an IPv4 address.
cache: recycle existing cache with incremental updates
Cache tracking has improved over time by incrementally adding/deleting
objects when evaluating commands that are going to be sent to the kernel.
nft_cache_is_complete() already checks that the cache contains objects
that are required to handle this batch of commands by comparing cache
flags.
Infer from the current generation ID if no other transaction has
invalidated the existing cache, this allows to skip unnecessary cache
flush then refill situations which slow down incremental updates.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
cache: check for NFT_CACHE_REFRESH in current requested cache too
NFT_CACHE_REFRESH is set on inconditionally by ruleset list commands to
deal with stateful information in this ruleset. This flag results in
dropping the existing cache and fully fetching all objects from the
kernel.
Set on this flag for reset commands too, this is missing.
List/reset commands allow for filtering by specific family and object,
therefore, NFT_CACHE_REFRESH also signals that the cache is partially
populated.
Check if this flag is requested by the current list/reset command, as
well as cache->flags which represents the cache after the _previous_
list of commands.
A follow up patch allows to recycle the existing cache if the flags
report that the same objects are already available in the cache,
NFT_CACHE_REFRESH is useful to report that cache cannot be recycled.
Fixes: 407c54f71255 ("src: cache gets out of sync in interactive mode") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: bogus protocol conflicts in vlan with implicit dependencies
The following command:
# nft add rule netdev x y ip saddr 10.1.1.1 icmp type echo-request vlan id set 321
fails with:
Error: conflicting link layer protocols specified: ether vs. vlan
netdev x y ip saddr 10.1.1.1 icmp type echo-request vlan id set 321
^^^^^^^
Users can work around this issue by prepending an explicit match for
vlan ethertype field, that is:
ether type vlan ip saddr 10.1.1.1 ...
^-------------^
but nft should really handle this itself.
The error above is triggered by the following check in
resolve_ll_protocol_conflict():
/* This payload and the existing context don't match, conflict. */
if (pctx->protocol[base + 1].desc != NULL)
return 1;
This check was added by 39f15c243912 ("nft: support listing expressions
that use non-byte header fields") and f7d5590688a6 ("tests: vlan tests")
to deal with conflicting link layer protocols, for instance:
ether type ip vlan id 1
this is matching ethertype ip at offset 12, but then it matches for vlan
id at offset 14 which is not present given the previous check.
One possibility is to remove such check, but nft does not bail out for
the example above and it results in bytecode that never matches:
# nft --debug=netlink netdev x y ether type ip vlan id 10
netdev x y
[ meta load iiftype => reg 1 ]
[ cmp eq reg 1 0x00000001 ]
[ payload load 2b @ link header + 12 => reg 1 ] <---- ether type
[ cmp eq reg 1 0x00000008 ] <---- ip
[ payload load 2b @ link header + 12 => reg 1 ] <---- ether type
[ cmp eq reg 1 0x00000081 ] <---- vlan
[ payload load 2b @ link header + 14 => reg 1 ]
[ bitwise reg 1 = ( reg 1 & 0x0000ff0f ) ^ 0x00000000 ]
[ cmp eq reg 1 0x00000a00 ]
This is due to resolve_ll_protocol_conflict() which deals with the
conflict by updating protocol context and emitting an implicit
dependency, but there is already an explicit match coming from the user.
This patch adds a new helper function to check if an implicit dependency
clashes with an existing statement, which results in:
# nft add rule netdev x y ether type ip vlan id 1
Error: conflicting statements
add rule netdev x y ether type ip vlan id 1
^^^^^^^^^^^^^ ~~~~~~~
Theoretically, no duplicated implicit dependency should ever be emitted
if protocol context is correctly handled.
Only implicit payload expressions are considered at this stage for this
conflict check, this patch can be extended to deal with other dependency
types.
Fixes: 39f15c243912 ("nft: support listing expressions that use non-byte header fields") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 24 Apr 2024 21:48:56 +0000 (23:48 +0200)]
doc: nft.8: Highlight "hook" in flowtable description
Lacking an explicit description of possible hook values, emphasising the
word in the description text should draw readers' attention in the right
direction.
Phil Sutter [Wed, 24 Apr 2024 21:41:59 +0000 (23:41 +0200)]
tests: shell: Fix for maps/typeof_maps_add_delete with ASAN
With both KASAN and ASAN enabled, my VM is too slow so the ping-induced
set entry times out before the test checks its existence. Increase its
timeout to 2s, seems to do the trick.
Phil Sutter [Wed, 24 Apr 2024 21:35:00 +0000 (23:35 +0200)]
json: Fix for memleak in __binop_expr_json
When merging the JSON arrays generated for LHS and RHS of nested binop
expressions, the emptied array objects leak if their reference is not
decremented.
Fix this and tidy up other spots which did it right already by
introducing a json_array_extend wrapper.
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Fixes: 0ac39384fd9e4 ("json: Accept more than two operands in binary expressions") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Fri, 22 Mar 2024 13:00:26 +0000 (14:00 +0100)]
tests: py: Fix some JSON equivalents
Make sure they match the standard syntax input as much as possible.
For some reason inet/tcp.t.json was using plain arrays in place of
binary OR expressions in many cases. These arrays are interpreted as
list expressions, which seems to be semantically identical but the goal
here is to present an accurate equivalent to the rule in standard
syntax.
Phil Sutter [Fri, 22 Mar 2024 12:31:10 +0000 (13:31 +0100)]
mergesort: Avoid accidental set element reordering
In corner cases, expr_msort_cmp() may return 0 for two non-identical
elements. An example are ORed tcp flags: 'syn' and 'syn | ack' are
considered the same value since expr_msort_value() reduces the latter to
its LHS.
Keeping the above in mind and looking at how list_expr_sort() works: The
list in 'head' is cut in half, the first half put into the temporary
list 'list' and finally 'list' is merged back into 'head' considering
each element's position. Shall expr_msort_cmp() return 0 for two
elements, the one from 'list' ends up after the one in 'head', thus
reverting their previous ordering.
The practical implication is that output never matches input for the
sample set '{ syn, syn | ack }' as the sorting after delinearization in
netlink_list_setelems() keeps swapping the elements. Out of coincidence,
the commit this fixes itself illustrates the use-case this breaks,
namely tracking a ruleset in git: Each ruleset reload will trigger an
update to the stored dump.
This change breaks interval set element deletion because __set_delete()
implicitly relies upon this reordering of duplicate entries by inserting
a clone of the one to delete into the start (via list_move()) and after
sorting assumes the clone will end up right behind the original. Fix
this by calling list_move_tail() instead.
Fixes: 14ee0a979b622 ("src: sort set elements in netlink_get_setelems()") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Thu, 21 Mar 2024 13:57:41 +0000 (14:57 +0100)]
doc: nft.8: Two minor synopsis fixups
The curly braces in 'add table' are to be put literally, so need to be
bold. Also, they are optional unless either one (or both) of 'comment'
and 'flags' are specified.
The 'add chain' synopsis contained a stray tick, messing up the
following markup.
Fixes: 7fd67ce121f86 ("doc: fix synopsis of named counter, quota and ct {helper,timeout,expect}") Signed-off-by: Phil Sutter <phil@nwl.cc>
Currently, ICMP{v4,v6,inet} code datatypes only describe those that are
supported by the reject statement, but they can also be used for icmp
code matching. Moreover, ICMP code types go hand-to-hand with ICMP
types, that is, ICMP code symbols depend on the ICMP type.
Thus, the output of:
nft describe icmp_code
look confusing because that only displays the values that are supported
by the reject statement.
Disentangle this by adding internal datatypes for the reject statement
to handle the ICMP code symbol conversion to value as well as ruleset
listing.
The existing icmp_code, icmpv6_code and icmpx_code remain in place. For
backward compatibility, a parser function is defined in case an existing
ruleset relies on these symbols.
As for the manpage, move existing ICMP code tables from the DATA TYPES
section to the REJECT STATEMENT section, where this really belongs to.
But the icmp_code and icmpv6_code table stubs remain in the DATA TYPES
section because that describe that this is an 8-bit integer field.
netlink_delinearize: unused code in reverse cross-day meta hour range
f8f32deda31d ("meta: Introduce new conditions 'time', 'day' and 'hour'")
reverses a cross-day range expressed as "22:00"-"02:00" UTC time into
!= "02:00"-"22:00" so meta hour ranges works.
Listing is however confusing, hence, 44d144cd593e ("netlink_delinearize:
reverse cross-day meta hour range") introduces code to reverse a cross-day.
However, it also adds code to reverse a range in == to-from form
(assuming OP_IMPLICIT) which is never exercised from the listing path
because the range expression is not currently used, instead two
instructions (cmp gte and cmp lte) are used to represent the range.
Remove this branch otherwise a reversed notation will be used to display
meta hour ranges once the range instruction is to represent this.
Add test for cross-day scenario in EADT timezone.
Fixes: 44d144cd593e ("netlink_delinearize: reverse cross-day meta hour range") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jeremy Sowden [Sun, 24 Mar 2024 14:59:08 +0000 (14:59 +0000)]
tests: shell: packetpath/flowtables: open all temporary files in /tmp
The test used to do I/O over a named pipe in $PWD, until Phil changed it
to create the pipe in /tmp. However, he missed one `socat` command.
Update that too.
Fixes: 3a9f29e21726 ("tests: shell: packetpath/flowtables: Avoid spurious EPERM") Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netlink_delinearize: reverse cross-day meta hour range
f8f32deda31d ("meta: Introduce new conditions 'time', 'day' and 'hour'")
reverses the hour range in case that a cross-day range is used, eg.
meta hour "03:00"-"14:00" counter accept
which results in (Sidney, Australia AEDT time):
meta hour != "14:00"-"03:00" counter accept
kernel handles time in UTC, therefore, cross-day range may not be
obvious according to local time.
The ruleset listing above is not very intuitive to the reader depending
on their timezone, therefore, complete netlink delinearize path to
reverse the cross-day meta range.
Update manpage to recommend to use a range expression when matching meta
hour range. Recommend range expression for meta time and meta day too.
Extend testcases/listing/meta_time to cover for this scenario.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1737 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netlink_delinearize: restore binop syntax when listing ruleset for flags
c3d57114f119 ("parser_bison: add shortcut syntax for matching flags
without binary operations") provides a similar syntax to iptables using
a prefix representation for flag matching.
Restore original representation using binop when listing the ruleset.
The parser still accepts the prefix notation for backward compatibility.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Fri, 12 Jan 2024 12:19:26 +0000 (13:19 +0100)]
src: do not merge a set with a erroneous one
The included sample causes a crash because we attempt to
range-merge a prefix expression with a symbolic expression.
The first set is evaluated, the symbol expression evaluation fails
and nft queues an error message ("Could not resolve hostname").
However, nft continues evaluation.
nft then encounters the same set definition again and merges the
new content with the preceeding one.
But the first set structure is dodgy, it still contains the
unresolved symbolic expression.
That then makes nft crash (assert) in the set internals.
There are various different incarnations of this issue, but the low
level set processing code does not allow for any partially transformed
expressions to still remain.
Before:
nft --check -f tests/shell/testcases/bogons/nft-f/invalid_range_expr_type_binop
BUG: invalid range expression type binop
nft: src/expression.c:1479: range_expr_value_low: Assertion `0' failed.
After:
nft --check -f tests/shell/testcases/bogons/nft-f/invalid_range_expr_type_binop
invalid_range_expr_type_binop:4:18-25: Error: Could not resolve hostname: Name or service not known
elements = { 1&.141.0.1 - 192.168.0.2}
^^^^^^^^
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Fri, 8 Mar 2024 19:22:37 +0000 (20:22 +0100)]
tests: shell: Regenerate all json-nft dumps
Ordering of 'nft -j list ruleset' output has changed, Regenerate
existing json-nft dumps. No functional change intended, merely the
position of chain objects should have moved up in the "nftables" array.
Phil Sutter [Thu, 7 Mar 2024 17:40:12 +0000 (18:40 +0100)]
json: Order output like nft_cmd_expand()
Print empty chain add commands early in list so following verdict maps
and rules referring to them won't cause spurious errors when loading the
resulting ruleset dump.
On my system for testing, called socat is not allowed to create the pipe
file in local directory (probably due to sshfs). Specify a likely unique
path in /tmp to avoid such problems.
Fixes: 419c0199774c6 ("tests: shell: add test to cover ct offload by using nft flowtables") Signed-off-by: Phil Sutter <phil@nwl.cc>