git.ipfire.org Git - thirdparty/nftables.git/log (branch: master)
Łukasz Stelmach [Thu, 21 Aug 2025 10:38:40 +0000 (12:38 +0200)] 
doc: Add a note about route_localnet sysctl

See ip_route_input_slow() in net/ipv4/route.c in the Linux
kernel sources.

Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Florian Westphal [Wed, 20 Aug 2025 12:44:43 +0000 (14:44 +0200)] 
mnl: silence compiler warning

gcc 14.3.0 reports this:

src/mnl.c: In function 'mnl_nft_chain_add':
src/mnl.c:916:25: warning: 'nest' may be used uninitialized [-Wmaybe-uninitialized]
  916 |                         mnl_attr_nest_end(nlh, nest);

I guess it's because the compiler can't know that the conditions cannot change
in-between and assumes nest_end() can be called without nest_start().
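
The pattern behind this kind of warning can be sketched as follows. This is
a minimal illustration, not the code in src/mnl.c: the commit message does
not say how the warning was silenced, so initializing the pointer and
testing it before the end call is an assumption. struct nest, the two
helpers and build_chain_msg() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a nested netlink attribute. */
struct nest {
	int open;
};

static struct nest *nest_start(struct nest *n)
{
	n->open = 1;
	return n;
}

static void nest_end(struct nest *n)
{
	n->open = 0;
}

/* Returns 1 if a nest was left open, 0 otherwise. */
static int build_chain_msg(bool has_hook, bool has_policy)
{
	struct nest storage = { 0 };
	/* Assumed fix: without this initializer, gcc may warn that 'nest'
	 * is used uninitialized in the second conditional below. */
	struct nest *nest = NULL;

	if (has_hook || has_policy)
		nest = nest_start(&storage);

	/* ... attributes would be added here ... */

	/* Testing the pointer ties the end call to the start call. */
	if (nest != NULL)
		nest_end(nest);

	return storage.open;
}
```

Both conditionals guard the same state, but the compiler only sees two
separate branches, hence the warning when the variable has no initial value.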

Fixes: 01277922fede ("src: ensure chain policy evaluation when specified")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 20 Aug 2025 11:17:22 +0000 (13:17 +0200)] 
tests: shell: coverage for simple verdict map merger

Add a test case to cover merging two rules into a verdict map, added by

  345d9260f7fe ("optimize: merge several selectors with different verdict into verdict map").

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 20 Aug 2025 11:06:30 +0000 (13:06 +0200)] 
tests: shell: cover sets as set elems evaluation

Extend tests/shell coverage to exercise merging nested sets, provided
by fixes such as:

  a6b75b837f5e ("evaluate: set: Allow for set elems to be sets")

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 19 Aug 2025 09:23:42 +0000 (11:23 +0200)] 
fib: restore JSON output for relational expressions

JSON output for the fib expression changed:

-                    "result": "check"
+                    "result": "oif"

This breaks third-party JSON parsers. Revert this change for relational
expressions only, via a workaround, until there are clear rules on how to
proceed with JSON schema updates.

As for set and map statements, keep the new "check" result type, since
it is not possible to peek at the RHS in such cases to guess whether the
NFT_FIB_F_PRESENT flag needs to be set.

Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1806
Fixes: f4b646032acf ("fib: allow to check if route exists in maps")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jan Engelhardt [Thu, 17 Apr 2025 14:48:33 +0000 (16:48 +0200)] 
tools: add a systemd unit for static rulesets

There is a customer request (bug report) to trivially load a ruleset
from a well-known location on boot, forwarded to me by M. Gerstner. A systemd
service unit is hereby added to provide that functionality. This is based on
various distributions attempting to do the same, for example,

https://src.fedoraproject.org/rpms/nftables/tree/rawhide
https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/main/nftables/nftables.initd
https://gitlab.archlinux.org/archlinux/packaging/packages/nftables

Acked-by: Eric Garver <eric@garver.life>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 24 Jul 2025 10:22:02 +0000 (12:22 +0200)] 
src: fix memory leak in anon chain error handling

chain_stmt_destroy is called from bison destructor, but it turns out
this function won't free the associated chain.

There is no memory leak when bison can parse the input because the chain
statement evaluation step queues the embedded anon chain via cmd_alloc.
Then, a later cmd_free() releases the chain and the embedded statements.

In case of a parser error, the evaluation step is never reached and the
chain object leaks, e.g. in

  foo bar jump { return }

Bison calls the right destructor but the anon chain and all
statements/expressions in it are not released:

HEAP SUMMARY:
    in use at exit: 1,136 bytes in 4 blocks
  total heap usage: 98 allocs, 94 frees, 840,255 bytes allocated

1,136 (568 direct, 568 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 4
   at: calloc (vg_replace_malloc.c:1675)
   by: xzalloc (in libnftables.so.1.1.0)
   by: chain_alloc (in libnftables.so.1.1.0)
   by: nft_parse (in libnftables.so.1.1.0)
   by: __nft_run_cmd_from_filename (in libnftables.so.1.1.0)
   by: nft_run_cmd_from_filename (in libnftables.so.1.1.0)

To resolve this, make chain_stmt_destroy also release the embedded
chain.  This in turn requires a chain refcount increase whenever a chain
is associated with a chain statement, else we get a double-free of the
chain.
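
The ownership rule described above can be sketched like this. It is a
simplified model under stated assumptions, not the nftables implementation:
the struct layouts, chain_get(), chain_stmt_set() and the chains_freed
counter are hypothetical and exist only to show why the extra reference
avoids both the leak and the double-free.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified chain object; real chains carry rules and statements. */
struct chain {
	unsigned int refcnt;
};

struct chain_stmt {
	struct chain *chain;
};

static int chains_freed;	/* illustration only: counts real frees */

static struct chain *chain_alloc(void)
{
	struct chain *c = calloc(1, sizeof(*c));

	if (c == NULL)
		abort();
	c->refcnt = 1;		/* reference held by the creator */
	return c;
}

static struct chain *chain_get(struct chain *c)
{
	c->refcnt++;
	return c;
}

static void chain_free(struct chain *c)
{
	assert(c->refcnt > 0);
	if (--c->refcnt > 0)
		return;		/* someone else still holds a reference */
	free(c);
	chains_freed++;
}

/* Associating a chain with a statement takes an extra reference ... */
static void chain_stmt_set(struct chain_stmt *stmt, struct chain *c)
{
	stmt->chain = chain_get(c);
}

/* ... so the statement destructor can always drop its own reference. */
static void chain_stmt_destroy(struct chain_stmt *stmt)
{
	if (stmt->chain != NULL)
		chain_free(stmt->chain);
	stmt->chain = NULL;
}
```

With this scheme, the destructor frees the chain on the parser error path
when it holds the last reference, while on the success path the chain stays
alive until the other holder drops its reference as well.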

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Sun, 17 Aug 2025 19:01:30 +0000 (21:01 +0200)] 
src: ensure chain policy evaluation when specified

Set CHAIN_F_BASECHAIN when a policy is specified in the chain, otherwise
chain priority is not evaluated.

Toggling this flag requires three adjustments to work though:

1) chain_evaluate() needs to skip evaluation of hook name and priority if
   not specified to allow for updating the default chain policy, e.g.

chain ip x y { policy accept; }

2) update netlink bytecode generation for chain to skip NFTA_CHAIN_HOOK
   so update path is exercised in the kernel.

3) error reporting needs to check if basechain priority and type are
   set, otherwise skip further hints.

Fixes: acdfae9c3126 ("src: allow to specify the default policy for base chains")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 13 Aug 2025 13:19:23 +0000 (15:19 +0200)] 
segtree: incorrect type when aggregating concatenated set ranges

Uncovered by the compound_expr_remove() replacement by type safe function
coming after this patch.

Add expression to the concatenation which is reachable via expr_value().

This bug is subtle, I could not spot any reproducible buggy behaviour
when using the wrong type when running the existing tests.

Fixes: 8ac2f3b2fca3 ("src: Add support for concatenated set ranges")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 12 Aug 2025 15:31:47 +0000 (17:31 +0200)] 
json: Do not reduce single-item arrays on output

This is a partial revert of commit a740f2036ad0d ("json: Introduce
json_add_array_new()"), keeping the function but eliminating its primary
task which is to replace arrays of size 1 by their only item. While
support for this on input is convenient for users, it means extra casing
in JSON output parsers to cover for it. The minor reduction in output
size does not justify that.

Fixes: a740f2036ad0d ("json: Introduce json_add_array_new()")
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1806
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 14:14:08 +0000 (16:14 +0200)] 
tests: py: Fix tests added for 'icmpv6 taddr' support

There was a duplicate test, also stored JSON equivalents should match
input as much as possible. The expected deviation in output (just like
with standard syntax) is stored in the .json.output file instead.

Fixes: 2e86f45d0260a ("icmpv6: Allow matching target address in NS/NA, redirect and MLD")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 14:06:46 +0000 (16:06 +0200)] 
tests: py: Drop stale entry from ip/snat.t.payload

This payload actually belongs to ip/dnat.t.payload, fixed commit added
it to the wrong file.

Fixes: 8f3048954d40d ("evaluate: postpone transport protocol match check after nat expression evaluation")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 13:50:54 +0000 (15:50 +0200)] 
tests: py: Drop stale entries from ip6/{ct,meta}.t.json

Looks like these were added by accident, fixed commit did not add these
test cases.

Fixes: 8221d86e616bd ("tests: py: add test-cases for ct and packet mark payload expressions")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 13:03:29 +0000 (15:03 +0200)] 
tests: py: Drop stale entry from ip/snat.t.json

The test syntax was changed, but the respective JSON equivalent remained
in place.

Fixes: 9b169bfc650eb ("src: remove STMT_NAT_F_INTERVAL flags and interval keyword")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:51:39 +0000 (14:51 +0200)] 
tests: py: Drop redundant payloads for ip/ip.t

Each was present multiple times, introduced probably by copying from a
respective .got file.

Fixes: 77def2d43466e ("netlink_delinearize: support for bitfield payload statement with binary operation")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:38:22 +0000 (14:38 +0200)] 
tests: py: Drop duplicate test from inet/vxlan.t

The test was duplicate since day 1. The duplicate JSON equivalent was
added later (semi-automated), remove it as well.

Fixes: df81baa4c2bef ("tests: py: add vxlan tests")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:32:11 +0000 (14:32 +0200)] 
tests: py: Drop stale entry from inet/tcp.t.json

The test was changed but JSON equivalents not updated. Commit
c0b685951fabb ("json: fix parse of flagcmp expression") then added an
equivalent matching the changed test, so just drop the old one.

Fixes: c3d57114f119b ("parser_bison: add shortcut syntax for matching flags without binary operations")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:23:30 +0000 (14:23 +0200)] 
tests: py: Drop duplicate test from inet/gretap.t

The test was duplicate since day 1. The duplicate JSON equivalent was
added later (semi-automated), remove it as well.

Fixes: 39a68d9ffd25c ("tests: py: add gretap tests")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:22:07 +0000 (14:22 +0200)] 
tests: py: Drop duplicate test from inet/gre.t

The test was duplicate since day 1. The duplicate JSON equivalent was
added later (semi-automated), remove it as well.

Fixes: c04ef8d104ec6 ("tests: py: add gre tests")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:19:31 +0000 (14:19 +0200)] 
tests: py: Drop duplicate test from inet/geneve.t

The test was duplicate since day 1. The duplicate JSON equivalent was
added later (semi-automated), remove it as well.

Fixes: 2b9143bc7ab81 ("tests: py: add geneve tests")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:17:46 +0000 (14:17 +0200)] 
tests: py: Drop stale payload from any/rawpayload.t.payload

There never was a test corresponding to this payload.

Fixes: 857904bdfaf7a ("tests: py: extend raw payload match tests")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:14:45 +0000 (14:14 +0200)] 
tests: py: Drop stale entries since redundant test case removal

Fixed commit left stale JSON equivalents and payload records in place,
drop them.

Fixes: ec1ea13314fa5 ("tests: remove redundant test cases")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:12:06 +0000 (14:12 +0200)] 
tests: py: Drop duplicate test in any/meta.t

The expected invalid meta hour argument of 24:00 is tested already.

Fixes: a6717ae094db2 ("evaluate: Fix for 'meta hour' ranges spanning date boundaries")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Tue, 5 Aug 2025 19:40:15 +0000 (21:40 +0200)] 
tests: shell: add parser and packetpath test

One to validate parsing, and one to test that packets match the
expected mapping.

The JSON file is omitted because of:
internal:0:0-0: Error: Expression type payload not allowed in context (RHS, STMT).

i.e. there is more work to be done on the JSON side to support this.

Signed-off-by: Florian Westphal <fw@strlen.de>
Florian Westphal [Tue, 5 Aug 2025 19:40:14 +0000 (21:40 +0200)] 
evaluate: check XOR RHS operand is a constant value

Now that we support a non-constant RHS in binary operations,
reject XOR with a non-constant key: we cannot transfer the expression.
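
As a rough illustration of what "transfer" can mean here (an interpretation,
not taken from the patch): with a constant key, a match on (field ^ key)
can be rewritten once into a plain comparison against a folded constant.
With a key that is only known per packet, no such constant exists at rule
load time. All function names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Original form: XOR the packet field with the key, then compare. */
static bool match_xor(uint32_t field, uint32_t key, uint32_t value)
{
	return (field ^ key) == value;
}

/* Transfer step: fold the constant key into the compared value. */
static uint32_t transfer_xor(uint32_t key, uint32_t value)
{
	return value ^ key;
}

/* Transferred form: a plain comparison, no XOR at match time. */
static bool match_transferred(uint32_t field, uint32_t folded)
{
	return field == folded;
}
```

Both forms agree for every field value because XOR with the same key is its
own inverse, which only helps when the key is known up front.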

Fixes: 54bfc38c522b ("src: allow binop expressions with variable right-hand operands")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 14 Aug 2025 11:22:41 +0000 (13:22 +0200)] 
tests: shell: update comment to name the right commit.

At the time the comment was written the patch wasn't yet upstream
so replace this with the right id and title.

Signed-off-by: Florian Westphal <fw@strlen.de>
Phil Sutter [Fri, 8 Aug 2025 12:21:41 +0000 (14:21 +0200)] 
src: netlink: netlink_delinearize_table() may return NULL

Catch the error condition in callers to avoid crashes.

Fixes: c156232a530b3 ("src: add comment support when adding tables")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Florian Westphal [Mon, 11 Aug 2025 09:25:06 +0000 (11:25 +0200)] 
tests: py: revert dccp python tests

These fail for kernels with 'CONFIG_NFT_EXTHDR_DCCP is not set', remove
the tests in anticipation of a future removal from both kernel and
nftables.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 6 Aug 2025 14:21:36 +0000 (16:21 +0200)] 
tests: shell: Fix packetpath/rate_limit for old socat

The test would spuriously fail on RHEL9 due to the penultimate socat
call exiting 0 despite the connection being expected to fail. Florian
writes:

| It's the socat version in rhel9. With plain reject (icmp error):
|
|   read(0, "AAA\n", 8192)                  = 4
|   recvfrom(3, 0x7ffd59cf1ab0, 519, MSG_DONTWAIT, NULL, NULL) = -1
| EAGAIN (Resource temporarily unavailable)
| [..]
|   write(5, "AAA\n", 4)                    = 4
|   recvfrom(3, 0x7ffd59cf1f90, 519, MSG_DONTWAIT, NULL, NULL) = -1
| EAGAIN (Resource temporarily unavailable)
| [..]
|   read(0, "", 8192)                       = 0
|   recvfrom(3, 0x7ffd59cf1ab0, 519, MSG_DONTWAIT, NULL, NULL) = -1
| EAGAIN (Resource temporarily unavailable)
|   shutdown(5, SHUT_WR)                    = 0
|   shutdown(5, SHUT_RDWR)                  = 0
|   recvfrom(3, 0x7ffd59cf2260, 519, MSG_DONTWAIT, NULL, NULL) = -1
| EAGAIN (Resource temporarily unavailable)
|   exit_group(0)
|
| ---> indicates success, even though it did not receive any data.
[...]
| Replacing "reject" with a "reject with tcp reset" gives:
|   read(0, "AAA\n", 8192)                  = 4
|   recvfrom(3, 0x7ffcffd04220, 519, MSG_DONTWAIT, NULL, NULL) = -1
| EAGAIN (Resource temporarily unavailable)
| [..]
|   write(5, "AAA\n", 4)                    = -1 ECONNREFUSED (Connection refused)
|   recvfrom(3, 0x7ffcffd04700, 519, MSG_DONTWAIT, NULL, NULL) = -1
| EAGAIN (Resource temporarily unavailable)
| [..]                               = 10212
|   write(2, "2025/08/06 08:34:29 socat[10212]"..., 832025/08/06
| 08:34:29 socat[10212] E write(5, 0x55a4f0652000, 4): Connection
| refused
|   ) = 83
|   shutdown(5, SHUT_RDWR)                  = -1 ENOTCONN (Transport
| endpoint is not connected)
|   exit_group(1)                           = ?
|
| -> so failure is detected and the script passes.

While this is likely a bug in socat, working around it is simple so
let's tackle it on this side, too.

Note: The second chunk is sufficient to resolve the issue, probably
because the initial ruleset's rate limiter does not trigger during TCP
handshake. Adjust it anyway to keep things consistent.

Suggested-by: Florian Westphal <fw@strlen.de>
Fixes: 9352fa7fb0a31 ("test: shell: Add rate_limit test case for 'limit statement'.")
Cc: Yi Chen <yiche@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Pablo Neira Ayuso [Wed, 6 Aug 2025 10:12:49 +0000 (12:12 +0200)] 
build: Bump version to 1.1.4

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 6 Aug 2025 10:51:51 +0000 (12:51 +0200)] 
build: add trace.h to Makefile

so `make distcheck` works fine.

Fixes: 8e03d59b5aa4 ("src: split monitor trace code into new trace.c")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 31 Jul 2025 10:40:11 +0000 (12:40 +0200)] 
doc: nft.8: Minor NAT STATEMENTS section review

Synopsis insinuates an IP address argument is mandatory in snat/dnat
statements although specifying ports alone is perfectly fine. Adjust it
accordingly and add a paragraph briefly describing the behaviour.

While at it, update the redirect statement description with more
relevant examples, the current one is wrong: To *only* alter the
destination port, dnat statement must be used, not redirect.

Fixes: 6908a677ba04c ("nft.8: Enhance NAT documentation")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Fri, 25 Jul 2025 15:28:29 +0000 (17:28 +0200)] 
evaluate: Fix for 'meta hour' ranges spanning date boundaries

Introduction of EXPR_RANGE_SYMBOL type inadvertently disabled sanitizing
of meta hour ranges where the lower boundary has a higher value than the
upper boundary. This may happen outside of user control due to the fact
that given ranges are converted to UTC which is the kernel's native
timezone.

Perform the conditional match and op inversion with the new RHS
expression type as well after expanding it so values are comparable.
Since this replaces the whole range expression, make it replace the
relational's RHS entirely.
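
The inversion can be sketched as follows. This is a simplified model, not
the evaluate.c code: times are seconds since midnight, the UTC offset is
passed in explicitly, and the exact boundary handling in nftables is not
stated in this commit, so the inclusive bounds used here are an assumption.
struct hour_match and both functions are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECONDS_PER_DAY 86400

struct hour_match {
	uint32_t lo, hi;	/* inclusive bounds, seconds since midnight UTC */
	bool negate;		/* true: match times outside [lo, hi] */
};

/* Shift a local time of day to UTC; utc_offset is seconds east of UTC. */
static uint32_t to_utc(uint32_t local, int32_t utc_offset)
{
	int64_t t = (int64_t)local - utc_offset + SECONDS_PER_DAY;

	return (uint32_t)(t % SECONDS_PER_DAY);
}

static struct hour_match hour_range_to_utc(uint32_t from, uint32_t to,
					   int32_t utc_offset)
{
	struct hour_match m = {
		.lo = to_utc(from, utc_offset),
		.hi = to_utc(to, utc_offset),
		.negate = false,
	};

	if (m.lo > m.hi) {
		/* Range wraps past midnight in UTC: match everything
		 * outside the gap between the two boundaries instead. */
		uint32_t gap_lo = m.hi + 1;
		uint32_t gap_hi = m.lo - 1;

		m.lo = gap_lo;
		m.hi = gap_hi;
		m.negate = true;
	}
	return m;
}

static bool hour_matches(struct hour_match m, uint32_t utc_seconds)
{
	bool in_range = utc_seconds >= m.lo && utc_seconds <= m.hi;

	return m.negate ? !in_range : in_range;
}
```

For example, a local range of 00:00-02:00 at UTC+2 becomes 22:00-00:00 in
UTC, where the lower bound exceeds the upper one, so the match is inverted.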

While at it extend testsuites to cover these corner-cases.

Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1805
Fixes: 347039f64509e ("src: add symbol range expression to further compact intervals")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 29 Jul 2025 15:55:17 +0000 (17:55 +0200)] 
parser_json: Parse into symbol range expression if possible

Apply the bison parser changes in commit 347039f64509e ("src: add symbol
range expression to further compact intervals") to JSON parser as well.

Fixes: 347039f64509e ("src: add symbol range expression to further compact intervals")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 29 Jul 2025 15:52:35 +0000 (17:52 +0200)] 
expression: Introduce is_symbol_value_expr() macro

Annotate and combine the 'etype' and 'symtype' checks done in bison
parser for readability and because JSON parser will start doing the same
in a follow-up patch.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 23 Jul 2025 15:00:11 +0000 (17:00 +0200)] 
parser_bison: fix memory leak when parsing flowtable hook declaration

When the hook location is invalid we error out but we do leak both
the priority expression and the flowtable name.  Example:

valgrind --leak-check=full nft -f flowtable-parser-err-memleak
[..] Error: unknown chain hook
hook enoent priority filter + 10
     ^^^^^^
[..]
2 bytes in 1 blocks are definitely lost in loss record 1 of 3
   at: malloc (vg_replace_malloc.c:446)
   by: strdup (in libc.so.6)
   by: xstrdup (in libnftables.so.1.1.0)
   by: nft_lex (in libnftables.so.1.1.0)
   by: nft_parse (in libnftables.so.1.1.0)
   by: __nft_run_cmd_from_filename (in libnftables.so.1.1.0)
   by: nft_run_cmd_from_filename (in libnftables.so.1.1.0)

First two reports are due to the priority expression: this needs to call
expr_free().  Third report is due to the flowtable name, the destructor
was missing so add one.

After fix:
All heap blocks were freed -- no leaks are possible

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 21 Jul 2025 11:36:03 +0000 (13:36 +0200)] 
parser_json: fix assert due to empty interface name

Before:
nft: src/mnl.c:744: nft_dev_add: Assertion `ifname_len > 0' failed.

After:
internal:0:0-0: Error: empty interface name

The bison parser checks this upfront; do the same in the JSON parser.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 21 Jul 2025 11:09:55 +0000 (13:09 +0200)] 
parser_json: reject non-concat expression

Before "src: detach set, list and concatenation expression layout":
internal:0:0-0: Error: Concatenation with 0 elements is illegal

After this change, expr->size access triggers assert() failure, add
explicit test for etype to avoid this and error out:

internal:0:0-0: Error: Expected concat element, got symbol.

Fixes: e0d92243be1c ("src: detach set, list and concatenation expression layout")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 21 Jul 2025 10:57:07 +0000 (12:57 +0200)] 
evaluate: maps: check element data mapping matches set data definition

This change is similar to
7f4d7fef31bd ("evaluate: check element key vs. set definition")

but this time for data mappings.

The included bogon asserts with:
BUG: invalid data expression type catch-all set element
nft: src/netlink.c:596: __netlink_gen_data: Assertion `0' failed.

after:
internal:0:0-0: Error: Element mapping mismatches map definition, expected packet mark, not 'invalid'

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 14 Jul 2025 11:48:24 +0000 (13:48 +0200)] 
json: BASECHAIN flag no longer implies presence of priority expression

This is a followup to
44ea19364637 ("src: BASECHAIN flag no longer implies presence of priority expression"):
feeding the same bogon file into nft -j we get a very similar crash.

Signed-off-by: Florian Westphal <fw@strlen.de>
Florian Westphal [Sun, 13 Jul 2025 21:59:30 +0000 (23:59 +0200)] 
evaluate: fix crash with invalid elements in set

ctx->ectx.key can be cleared, causing a crash:

src/nft --check -f tests/shell/testcases/bogons/nft-f/set_with_bad_elem
AddressSanitizer:DEADLYSIGNAL
    #0 0x7ffb57098c0d in elem_key_compatible src/evaluate.c:1934
    #1 0x7ffb5709926d in expr_evaluate_set_elem src/evaluate.c:1979
    #2 0x7ffb570a540f in expr_evaluate src/evaluate.c:3159
    #3 0x7ffb57095f33 in list_member_evaluate src/evaluate.c:1652
    #4 0x7ffb57099f92 in expr_evaluate_set src/evaluate.c:2066
    #5 0x7ffb570a53f7 in expr_evaluate src/evaluate.c:3157
    ..
AddressSanitizer: SEGV src/evaluate.c:1934 in elem_key_compatible

After:
set_with_bad_elem:4:39-46: Error: Element mismatches set definition, expected IPv4 address, not 'integer'
  elements = { 1.2.3.4, tcp << 8 }
                        ^^^^^^^^

Use ctx->set->key instead.

Fixes: 7f4d7fef31bd ("evaluate: check element key vs. set definition")
Signed-off-by: Florian Westphal <fw@strlen.de>
Yi Chen [Tue, 15 Jul 2025 09:19:13 +0000 (17:19 +0800)] 
tests: shell: add type route chain test case

This test case verifies the functionality of nft type route chain
when used with policy routing based on dscp and fwmark.

Signed-off-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Phil Sutter [Wed, 16 Jul 2025 12:26:08 +0000 (14:26 +0200)] 
mnl: Call mnl_attr_nest_end() just once

Calling the function after each added nested attribute is harmless but
pointless.
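
Why repeated calls are harmless can be seen in a small model of nested
attributes. This is a hypothetical, simplified TLV buffer and not the libmnl
API: closing a nest merely rewrites the nest header's length to cover
everything appended so far, so only the last call determines the result.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute header: length covers header plus payload. */
struct tlv_hdr {
	uint16_t len;
	uint16_t type;
};

struct tlv_buf {
	uint8_t data[256];
	size_t len;
};

/* Open a nest: reserve its header and remember where it starts. */
static size_t nest_start(struct tlv_buf *b, uint16_t type)
{
	struct tlv_hdr h = { .len = 0, .type = type };
	size_t off = b->len;

	assert(off + sizeof(h) <= sizeof(b->data));
	memcpy(b->data + off, &h, sizeof(h));
	b->len += sizeof(h);
	return off;
}

/* Append one 32-bit attribute inside the currently open nest. */
static void put_u32(struct tlv_buf *b, uint16_t type, uint32_t value)
{
	struct tlv_hdr h = {
		.len = (uint16_t)(sizeof(struct tlv_hdr) + sizeof(value)),
		.type = type,
	};

	assert(b->len + h.len <= sizeof(b->data));
	memcpy(b->data + b->len, &h, sizeof(h));
	memcpy(b->data + b->len + sizeof(h), &value, sizeof(value));
	b->len += h.len;
}

/* Close a nest: its length covers everything written since nest_start(). */
static void nest_end(struct tlv_buf *b, size_t off)
{
	struct tlv_hdr h;

	memcpy(&h, b->data + off, sizeof(h));
	h.len = (uint16_t)(b->len - off);
	memcpy(b->data + off, &h, sizeof(h));
}

/* Read back the length stored in the nest header at 'off'. */
static uint16_t nest_len(const struct tlv_buf *b, size_t off)
{
	struct tlv_hdr h;

	memcpy(&h, b->data + off, sizeof(h));
	return h.len;
}
```

Calling nest_end() after every put_u32() or once at the very end yields the
same bytes in this model, which matches the "harmless but pointless" remark.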

Fixes: a66b5ad9540dd ("src: allow for updating devices on existing netdev chain")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Tue, 8 Jul 2025 13:00:34 +0000 (15:00 +0200)] 
mnl: Support NFNL_HOOK_TYPE_NFT_FLOWTABLE

New kernels dump info for flowtable hooks the same way as for base
chains.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Florian Westphal [Mon, 14 Jul 2025 18:37:57 +0000 (20:37 +0200)] 
tests: bogons: fix missing file name when logging

When the JSON is parsed without returning an error, the test
fails.  It's supposed to log the name of the failed input, which
it does for -f but not for -j -f.

Signed-off-by: Florian Westphal <fw@strlen.de>
Florian Westphal [Wed, 9 Jul 2025 23:07:52 +0000 (01:07 +0200)] 
doc: expand on gc-interval, size and a few other set/map keywords

Reported-by: <pavelpribylov01@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 26 Jun 2025 14:52:31 +0000 (16:52 +0200)] 
evaluate: check element key vs. set definition

Included bogon asserts with:
 src/datatype.c:253: symbolic_constant_print: Assertion `expr->len / BITS_PER_BYTE <= sizeof(val)' failed.

Resolve this by validating that the set element key matches the set key
definition.

After this, loading the bogon file gives:
Error: Element mismatches set definition, expected concatenation of (IPv4 address, integer), not 'ICMP type'
elements = {redirect }
           ^^^^^^^^

Signed-off-by: Florian Westphal <fw@strlen.de>
Pablo Neira Ayuso [Thu, 10 Jul 2025 00:53:50 +0000 (02:53 +0200)] 
tests: monitor: enclose device names in quotes

Update test to enclose flowtable device names in quotes, otherwise,
it reports a spurious issue:

@@ -1,2 +1,3 @@
 add table ip t
-add flowtable ip t ft { hook ingress priority 0; devices = { lo }; }
+add flowtable ip t ft { hook ingress priority 0; devices = { "lo" }; }

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 8 Jul 2025 22:51:24 +0000 (00:51 +0200)] 
src: detach set, list and concatenation expression layout

These three expressions use the same layout, but they have a different
purpose. Several fields are specific to a given expression:

- set_flags is only required by set expressions.
- field_len and field_count are only used by concatenation expressions.

Add accessors to validate the expression type before accessing the union
fields:

 #define expr_set(__expr)       (assert((__expr)->etype == EXPR_SET), &(__expr)->expr_set)
 #define expr_concat(__expr)    (assert((__expr)->etype == EXPR_CONCAT), &(__expr)->expr_concat)
 #define expr_list(__expr)      (assert((__expr)->etype == EXPR_LIST), &(__expr)->expr_list)

This should help catch subtle bugs due to type confusion.

assert() could later be enabled only in debugging builds to run tests,
keep it for now.

compound_expr_*() still works and it needs the same initial layout for
all of these expressions:

      struct list_head        expressions;
      unsigned int            size;

This is implicitly reducing the size of one of the largest structs
in the union area of struct expr, still EXPR_SET_ELEM remains the
largest so no gain is achieved in this iteration.
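
A self-contained sketch of the accessor idea, for illustration only. The
three macros follow the ones quoted above; the reduced struct expr, the
anonymous union and the concat_field_count() helper are assumptions and
much smaller than the real definitions.

```c
#include <assert.h>

enum expr_types { EXPR_SET, EXPR_CONCAT, EXPR_LIST };

struct list_head {
	struct list_head *next, *prev;
};

/* Reduced model: all three variants start with the same two fields. */
struct expr {
	enum expr_types etype;
	union {
		struct {
			struct list_head expressions;
			unsigned int size;
			unsigned int set_flags;
		} expr_set;
		struct {
			struct list_head expressions;
			unsigned int size;
			unsigned int field_count;
		} expr_concat;
		struct {
			struct list_head expressions;
			unsigned int size;
		} expr_list;
	};
};

/* Validate the expression type, then hand out the matching union member. */
#define expr_set(e)	(assert((e)->etype == EXPR_SET), &(e)->expr_set)
#define expr_concat(e)	(assert((e)->etype == EXPR_CONCAT), &(e)->expr_concat)
#define expr_list(e)	(assert((e)->etype == EXPR_LIST), &(e)->expr_list)

/* Hypothetical caller: aborts if handed anything but a concatenation. */
static unsigned int concat_field_count(const struct expr *e)
{
	return expr_concat(e)->field_count;
}
```

Accessing a set-only field through expr_set() on a concatenation now trips
the assertion instead of silently reading unrelated union data.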

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 8 Jul 2025 22:32:13 +0000 (00:32 +0200)] 
src: convert set to list expression

The following definition:

 define xyz = { "dummy0", "dummy1" }

is represented as a set expression to ease integration with sets.

However, this definition can be used in chains and flowtables to specify
the devices, for instance:

  table netdev x {
    chain y {
      type filter hook ingress devices = $xyz priority 0; policy drop;
    }
  }

in this context, $xyz defines a _list_ of devices, not a set.

Transform the set to list expression from the evaluation step for chains
and flowtables.

This patch also handles:

 define xyz = { "dummy0", $abc }

where $abc is also transformed to a list expression in the context of
chains and flowtables.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 8 Jul 2025 22:21:49 +0000 (00:21 +0200)] 
evaluate: validate set expression type before accessing flags

Validate set->init is of EXPR_SET expression type before accessing
set_flags.

Fixes: 81e36530fcac ("src: replace interval segment tree overlap and automerge")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 8 Jul 2025 22:14:44 +0000 (00:14 +0200)] 
evaluate: mappings require set expression

While EXPR_CONCAT and EXPR_LIST share the same layout in struct expr,
these expressions are not possible at this stage.

Fall back to error out with "invalid mapping expression".

Fixes: 02d44b4f9917 ("evaluate: fix expression data corruption")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 8 Jul 2025 22:13:56 +0000 (00:13 +0200)] 
rule: print chain and flowtable devices in quotes

Print devices in quotes, for consistency with:

- the existing chain listing with single device:

  type filter hook ingress device "lo" priority filter; policy accept

- the ifname datatype used in sets.

In general, tokens that are user-defined, not coming in the datatype
symbol list, are enclosed in quotes.

Fixes: 3fdc7541fba0 ("src: add multidevice support for netdev chain")
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
6 weeks agosrc: add conntrack information to trace monitor mode
Florian Westphal [Mon, 7 Jul 2025 20:38:13 +0000 (22:38 +0200)] 
src: add conntrack information to trace monitor mode

An upcoming kernel change provides the packet's conntrack state in the
trace message data.

This allows seeing whether the packet is seen as original or reply, the
conntrack state (new, established, related) and the status bits, which
show if e.g. NAT was applied.  Also include the conntrack ID so users can
use the conntrack tool to query the kernel for more information via ctnetlink.

This improves debugging when e.g. packets do not pick up the expected
NAT mapping, which could e.g. also happen because of expectations
following the NAT binding of the owning conntrack entry.

Example output ("conntrack: " lines are new):

trace id 32 t PRE_RAW packet: iif "enp0s3" ether saddr [..]
trace id 32 t PRE_RAW rule tcp flags syn meta nftrace set 1 (verdict continue)
trace id 32 t PRE_RAW policy accept
trace id 32 t PRE_MANGLE conntrack: ct direction original ct state new ct id 2641368242
trace id 32 t PRE_MANGLE packet: iif "enp0s3" ether saddr [..]
trace id 32 t ct_new_pre rule jump rpfilter (verdict jump rpfilter)
trace id 32 t PRE_MANGLE policy accept
trace id 32 t INPUT conntrack: ct direction original ct state new ct status dnat-done ct id 2641368242
trace id 32 t INPUT packet: iif "enp0s3" [..]
trace id 32 t public_in rule tcp dport 443 accept (verdict accept)

v3: remove clash bit again, kernel won't expose it anymore.
v2: add more status bits: helper, clash, offload, hw-offload.
    add flag explanation to documentation.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
6 weeks agosrc: split monitor trace code into new trace.c
Florian Westphal [Mon, 7 Jul 2025 09:47:13 +0000 (11:47 +0200)] 
src: split monitor trace code into new trace.c

Preparation patch to avoid putting more trace functionality into
netlink.c.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
6 weeks agotests: py: re-enables nft-test.py to load the local nftables.py
Zhongqiu Duan [Fri, 4 Jul 2025 03:12:16 +0000 (03:12 +0000)] 
tests: py: re-enables nft-test.py to load the local nftables.py

This is a needed follow-up of commit ce443afc21455 ("py: move
package source into src directory") from 2023. Since that change,
nft-test.py started using the host's nftables.py instead of the local
one. But since nft-test.py passes the local src/.libs/libnftables.so.1
as parameter when instantiating the Nftables class, we did nevertheless
use the local libnftables.

Fixes: ce443afc21455 ("py: move package source into src directory")
Reviewed-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Zhongqiu Duan <dzq.aishenghu0@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 weeks agotests: shell: use binary defined by run-tests.sh
Zhongqiu Duan [Thu, 3 Jul 2025 13:57:17 +0000 (13:57 +0000)] 
tests: shell: use binary defined by run-tests.sh

Remove hardcoded binary in testcases/transactions/handle_bad_family.

Fixes: aa44b61a560d ("tests: shell: check for removing table via handle with incorrect family")
Signed-off-by: Zhongqiu Duan <dzq.aishenghu0@gmail.com>
7 weeks agodoc: Clarify cgroup meta variable
Michal Koutný [Mon, 30 Jun 2025 14:15:26 +0000 (16:15 +0200)] 
doc: Clarify cgroup meta variable

The documentation mentions control group id where the meaning is a class
id associated with the cgroup of a socket. This used to be fine until
cgroup v2 arrived, which uses similar terminology (cgroup id) for a very
different thing -- a numeric identifier of a particular (v2) cgroup.

This contemporary cgroup id isn't exposed by netfilter (v2 matching is
based on paths externally). Fix the docs and decrease confusion with a more
precise description of the metavariable.

[ Added comment in description to refer to socket cgroupv2 --pablo ]

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 weeks agoMerge branch 'tests_shell_check_tree_fixes'
Florian Westphal [Mon, 30 Jun 2025 12:39:22 +0000 (14:39 +0200)] 
Merge branch 'tests_shell_check_tree_fixes'

Add many more json dump files, now that json input parser handles
'typeof', add dump files for all of them.

Also some tests lacked a .nft dump file too, add them.

Other tests can't have dump files because they produce unstable output
(e.g. due to timeouts or because test is randomized).

Add 'nodump' files for those so tools/check-tree.sh won't complain
about them.

Furthermore check-tree.sh should not report the json bogon inputs in
tests/shell/testcases/bogons/nft-j-f as "Unexpected files".

Finally, two bogon inputs were in the wrong directory (and thus were
not used as test inputs); move them to the right location.

After this merge, only 10 check-tree.sh warnings remain and all errors
are gone.

There are still missing json test files, this is mostly due to missing
anonymous (implicit) chain support in the json parser.

Leave the warnings as-is so it can be properly resolved (add support)
at a later time.

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: add json dump files
Florian Westphal [Sun, 29 Jun 2025 10:39:14 +0000 (12:39 +0200)] 
tests: shell: add json dump files

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: move bogons to correct directory
Florian Westphal [Sun, 29 Jun 2025 10:13:38 +0000 (12:13 +0200)] 
tests: shell: move bogons to correct directory

These two bogons were never loaded, they have to be placed in the "nft-f"
subdir.

Also add the "nft-j-f" bogon input dir to the ignored files list so their
existence is not reported as an error by check-tree.sh.

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: add a few nodump files
Florian Westphal [Sun, 29 Jun 2025 10:09:56 +0000 (12:09 +0200)] 
tests: shell: add a few nodump files

These tests produce no rules.

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: add include dumps
Florian Westphal [Sun, 29 Jun 2025 10:07:00 +0000 (12:07 +0200)] 
tests: shell: add include dumps

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: add maps dumps
Florian Westphal [Sun, 29 Jun 2025 09:11:25 +0000 (11:11 +0200)] 
tests: shell: add maps dumps

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: add nft-i dumps
Florian Westphal [Sun, 29 Jun 2025 08:52:53 +0000 (10:52 +0200)] 
tests: shell: add nft-i dumps

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: add sets dumps
Florian Westphal [Sun, 29 Jun 2025 08:50:01 +0000 (10:50 +0200)] 
tests: shell: add sets dumps

Add a nodump file for the interval_size_random test, it has no stable output.

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: add optimize dump files
Florian Westphal [Sun, 29 Jun 2025 08:30:41 +0000 (10:30 +0200)] 
tests: shell: add optimize dump files

nomerge_vmap gains a nodump file, the test uses --check.

Signed-off-by: Florian Westphal <fw@strlen.de>
7 weeks agotests: shell: add bitwise json dump files
Florian Westphal [Sun, 29 Jun 2025 08:15:03 +0000 (10:15 +0200)] 
tests: shell: add bitwise json dump files

Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agofib: allow to use it in set statements
Pablo Neira Ayuso [Tue, 24 Jun 2025 16:11:10 +0000 (18:11 +0200)] 
fib: allow to use it in set statements

Allow to use fib expression in set statements, eg.

 meta mark set ip saddr . fib daddr check map { 1.2.3.4 . exists : 0x00000001 }

Fixes: 4a75ed32132d ("src: add fib expression")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agofib: allow to check if route exists in maps
Pablo Neira Ayuso [Tue, 24 Jun 2025 16:11:06 +0000 (18:11 +0200)] 
fib: allow to check if route exists in maps

f686a17eafa0 ("fib: Support existence check") adds EXPR_F_BOOLEAN as a
workaround to infer from the rhs of the relational expression if the fib
lookup wants to check for a specific output interface or, instead,
simply check for existence. This, however, does not work with maps.

The NFT_FIB_F_PRESENT flag can be used both with NFT_FIB_RESULT_OIF and
NFT_FIB_RESULT_OIFNAME. My understanding is that they serve the same
purpose, which is to check if a route exists, so they are redundant.

Add a 'check' fib result to check for routes while still keeping the
inference workaround for backward compatibility, but prefer the new
syntax in the listing.

Update man nft(8) and tests/py.

Fixes: f686a17eafa0 ("fib: Support existence check")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agotests: shell: Fix ifname_based_hooks feature check
Phil Sutter [Wed, 25 Jun 2025 16:53:36 +0000 (18:53 +0200)] 
tests: shell: Fix ifname_based_hooks feature check

The test was technically incorrect: Instead of detecting whether
interface hooks are name-based or not, it actually tested whether
netdev-family chains are removed along with their last hook.

Since the latter behaviour is established in kernel commit fc0133428e7a
("netfilter: nf_tables: Tolerate chains with no remaining hooks") and
thus independent from the name-based hooks change, treating both as the
same kernel feature is not acceptable.

Fix this by detecting whether a netdev-family chain may be added despite
specifying a non-existent interface to hook into. Keep the old check
around with a better name, although unused for now.

Reported-by: Florian Westphal <fw@strlen.de>
Fixes: f27e5abd81f29 ("tests: shell: Adjust to ifname-based hooks")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agoevaluate: prevent merge of sets with incompatible keys
Florian Westphal [Thu, 26 Jun 2025 00:52:48 +0000 (02:52 +0200)] 
evaluate: prevent merge of sets with incompatible keys

It's not enough to check for the interval flag; this would assert in the
interval code due to a concat being passed to it:
BUG: unhandled key type 13

After fix:
same_set_name_but_different_keys_assert:8:6-7: Error: set already exists with
different datatype (concatenation of (IPv4 address, network interface index) vs
network interface index)
        set s4 {
            ^^

This also improves error verbosity when mixing datamap and objref maps:

invalid_transcation_merge_map_and_objref_map:9:13-13:
Error: map already exists with different datatype (IPv4 address vs string)

.. instead of 'Cannot merge map with incompatible existing map of same name'.
The 'Cannot merge map with incompatible existing map of same name' check
is kept in place to catch when ruleset contains a set and map with same name
and same key definition.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agoevaluate: check that set type is identical before merging
Florian Westphal [Mon, 23 Jun 2025 19:37:31 +0000 (21:37 +0200)] 
evaluate: check that set type is identical before merging

Reject maps and sets of the same name:
 BUG: invalid range expression type catch-all set element
 nft: src/expression.c:1704: range_expr_value_low: Assertion `0' failed.

After:
Error: Cannot merge set with existing datamap of same name
  set z {
      ^

v2:
Pablo points out that we shouldn't merge datamaps (plain value) and objref
maps either, catch this too and add another test:

nft --check -f invalid_transcation_merge_map_and_objref_map
invalid_transcation_merge_map_and_objref_map:9:13-13: Error: Cannot merge map with incompatible existing map of same name

We should also make sure that both data (for map case) and
set keys are identical, this is added in a followup patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agoevaluate: avoid double-free on error handling of bogus objref maps
Florian Westphal [Tue, 24 Jun 2025 21:20:58 +0000 (23:20 +0200)] 
evaluate: avoid double-free on error handling of bogus objref maps

commit 98c51aaac42b ("evaluate: bail out if anonymous concat set defines a non concat expression")
clears set->init to avoid a double-free.

Extend this to also handle object maps.
The included bogon triggers a double-free of set->init expression:

Error: unqualified type invalid specified in map definition. Try "typeof expression" instead of "type datatype".
ct helper set ct  saddr map { 1c3:: : "p", dead::beef : "myftp" }
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This might not crash, depending on libc/malloc, but ASAN reports this:
==17728==ERROR: AddressSanitizer: heap-use-after-free on address 0x50b0000005e8 at ..
READ of size 4 at 0x50b0000005e8 thread T0
    #0 0x7f1be3cb7526 in expr_free src/expression.c:87
    #1 0x7f1be3cbdf29 in map_expr_destroy src/expression.c:1488
    #2 0x7f1be3cb74d5 in expr_destroy src/expression.c:80
    #3 0x7f1be3cb75c6 in expr_free src/expression.c:96
    #4 0x7f1be3d5925e in objref_stmt_destroy src/statement.c:331
    #5 0x7f1be3d5831f in stmt_free src/statement.c:56
    #6 0x7f1be3d583c2 in stmt_list_free src/statement.c:66
    #7 0x7f1be3d42805 in rule_free src/rule.c:495
    #8 0x7f1be3d48329 in cmd_free src/rule.c:1417
    #9 0x7f1be3cd2c7c in __nft_run_cmd_from_filename src/libnftables.c:759
    #10 0x7f1be3cd340c in nft_run_cmd_from_filename src/libnftables.c:847
    #11 0x55dcde0440be in main src/main.c:535

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agoevaluate: make sure chain jump name comes with a null byte
Florian Westphal [Tue, 24 Jun 2025 21:01:13 +0000 (23:01 +0200)] 
evaluate: make sure chain jump name comes with a null byte

There is a stack oob read access in netlink_gen_chain():

mpz_export_data(chain, expr->chain->value,
BYTEORDER_HOST_ENDIAN, len);
snprintf(data->chain, NFT_CHAIN_MAXNAMELEN, "%s", chain);

There is no guarantee that chain[] is null terminated, so snprintf
can read past chain[] array.  ASAN report is:

AddressSanitizer: stack-buffer-overflow on address 0x7ffff5f00520 at ..
READ of size 257 at 0x7ffff5f00520 thread T0
    #0 0x00000032ffb6 in printf_common(void*, char const*, __va_list_tag*) (src/nft+0x32ffb6)
    #1 0x00000033055d in vsnprintf (src/nft+0x33055d)
    #2 0x000000332071 in snprintf (src/nft+0x332071)
    #3 0x0000004eef03 in netlink_gen_chain src/netlink.c:454:2
    #4 0x0000004eef03 in netlink_gen_verdict src/netlink.c:467:4

Reject chain jumps that exceed 255 characters, which matches the netlink
policy on the kernel side.

The included reproducer fails without ASAN too because the kernel will
reject the too-long chain name. But that happens after ASAN has detected
the bogus read.
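
The length rule above can be modelled outside of C. The following is a
minimal Python sketch, not the nft implementation: it assumes a
256-byte destination buffer (only the 255-character limit is stated
above), and export_chain_name and c_string are hypothetical helpers.

```python
# Assumed buffer size; the text above only states the 255-character limit.
NFT_CHAIN_MAXNAMELEN = 256


def export_chain_name(name: str) -> bytes:
    """Return a fixed-size, NUL-terminated buffer, or raise if too long."""
    raw = name.encode("utf-8")
    # One byte is reserved for the terminating NUL.
    if len(raw) > NFT_CHAIN_MAXNAMELEN - 1:
        raise ValueError("chain name too long: %d bytes" % len(raw))
    return raw.ljust(NFT_CHAIN_MAXNAMELEN, b"\0")


def c_string(buf: bytes) -> bytes:
    """Model a C "%s" read: stop at the first NUL, fail if there is none."""
    end = buf.find(b"\0")
    if end < 0:
        raise ValueError("buffer is not NUL-terminated")
    return buf[:end]


ok = c_string(export_chain_name("a" * 255))
```

Rejecting the name before it is copied guarantees that any later "%s"
style read finds a terminator inside the buffer.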

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agojson: reject too long interface names
Florian Westphal [Tue, 24 Jun 2025 21:46:59 +0000 (23:46 +0200)] 
json: reject too long interface names

Blamed commit added a length check on ifnames to the bison parser.
Unfortunately that wasn't enough, json parser has the same issue.

Bogon results in:
BUG: Interface length 44 exceeds limit
nft: src/mnl.c:742: nft_dev_add: Assertion `0' failed.

After patch, included bogon results in:
Error: Invalid device at index 0. name d2345678999999999999999999999999999999012345 too long

I intentionally did not extend evaluate.c to catch this, past sentiment
was that frontends should not send garbage.

I'll send a followup patch to also catch this from eval stage in case there
are further reports for frontends passing in such long names.

Fixes: fa52bc225806 ("parser: reject zero-length interface names")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agotests/py: clean up set backend support fallout
Florian Westphal [Tue, 24 Jun 2025 19:39:46 +0000 (21:39 +0200)] 
tests/py: clean up set backend support fallout

Pablo reports failing py tests with recent kernel and userland:
 any/objects.t: OK
WARNING: line 3: 'add rule ip6 test-ip6 input ..
mismatches 'family 2 __set0 test-ip4 3 backend nft_set_bitmap_type [nf_tables] count 7'

When nf_tables is built as a module, the set backend name coming
from kernel contains the module name ([nf_tables]), this makes the
test script treat it as part of the pseudo instructions.

Skip this line explicitly to avoid these warnings.
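
A minimal sketch of such a filter, assuming the set debug line always
has the shape shown in the warning above. The regular expression and the
strip_set_backend_lines helper are illustrative and are not the code
used by nft-test.py.

```python
import re

# Assumed line shape, taken from the warning quoted above:
#   family <n> <set> <table> <n> backend <type> [<module>] count <n>
# The "[module]" part is only present when nf_tables is built as a module.
SET_BACKEND_RE = re.compile(
    r"^family \d+ \S+ \S+ \d+ backend \S+( \[\w+\])? count \d+$"
)


def strip_set_backend_lines(lines):
    """Drop set backend debug lines, keep everything else untouched."""
    return [l for l in lines if not SET_BACKEND_RE.match(l.strip())]


sample = [
    "family 2 __set0 test-ip4 3 backend nft_set_bitmap_type [nf_tables] count 7",
    "family 2 __set0 test-ip4 3 backend nft_set_bitmap_type count 7",
    "[ payload load 4b @ network header + 12 => reg 1 ]",
]
filtered = strip_set_backend_lines(sample)
```

Matching the whole line, rather than splitting on brackets, keeps the
module name from being mistaken for a "[ expr ... ]" instruction.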

Fixes: 7cec20e45a75 ("tests/py: prepare for set debug change")
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agosrc: use EXPR_RANGE_VALUE in interval maps
Pablo Neira Ayuso [Mon, 16 Jun 2025 20:48:04 +0000 (22:48 +0200)] 
src: use EXPR_RANGE_VALUE in interval maps

Remove the restriction on maps to use EXPR_RANGE_VALUE to reduce
memory consumption.

With 100k map with concatenation:

  table inet x {
         map y {
                    typeof ip saddr . tcp dport :  ip saddr
                    flags interval
                    elements = {
                        1.0.2.0-1.0.2.240 . 0-2 : 1.0.2.10,
...
 }
  }

Before: 153.6 Mbytes
After: 108.9 Mbytes (-29.11%)

With 100k map without concatenation:

  table inet x {
         map y {
                    typeof ip saddr :  ip saddr
                    flags interval
                    elements = {
                        1.0.2.0-1.0.2.240 : 1.0.2.10,
...
 }
  }

Before: 74.36 Mbytes
After: 62.39 Mbytes (-16.10%)

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agoexpression: constant range is not a singleton
Pablo Neira Ayuso [Mon, 16 Jun 2025 20:48:02 +0000 (22:48 +0200)] 
expression: constant range is not a singleton

Remove the EXPR_F_SINGLETON flag in EXPR_RANGE_VALUE so it can be used
in maps.

expr_evaluate_set() does not toggle NFT_SET_INTERVAL for anonymous sets
because a singleton is assumed to be in place, leading to this BUG:

 BUG: invalid data expression type range_value
 nft: src/netlink.c:577: netlink_gen_key: Assertion `0' failed.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agosrc: use constant range expression for interval+concatenation sets
Pablo Neira Ayuso [Mon, 16 Jun 2025 20:47:57 +0000 (22:47 +0200)] 
src: use constant range expression for interval+concatenation sets

Expand 347039f64509 ("src: add symbol range expression to further
compact intervals") to use constant range expression for elements with
concatenation of intervals.

Ruleset with 100k elements of this type:

 table inet x {
        set y {
                typeof ip saddr . tcp dport
                flags interval
                elements = {
0.1.2.0-0.1.2.240 . 0-1,
...
}
}
 }

Memory consumption for this set:

Before: 123.80 Mbytes
After:   80.19 Mbytes (-35.23%)

This patch keeps the workaround 2fbade3cd990 ("netlink: bogus
concatenated set ranges with netlink message overrun") in place.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agoparser_bison: allow delete command with map via handle
Pablo Neira Ayuso [Sun, 15 Jun 2025 09:36:28 +0000 (11:36 +0200)] 
parser_bison: allow delete command with map via handle

For consistency with sets, allow delete via handle for maps too.

Fixes: f4a34d25f6d5 ("src: list set handle and delete set via set handle")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agoparser_bison: only reset by name is supported by now
Pablo Neira Ayuso [Sun, 15 Jun 2025 09:34:11 +0000 (11:34 +0200)] 
parser_bison: only reset by name is supported by now

NFT_MSG_GETSET does not support handle lookup yet; restrict this to
reset by name for now.

Add a bogon test reported by Florian Westphal.

Fixes: 83e0f4402fb7 ("Implement 'reset {set,map,element}' commands")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agocache: pass name to cache_add()
Pablo Neira Ayuso [Sun, 15 Jun 2025 09:34:04 +0000 (11:34 +0200)] 
cache: pass name to cache_add()

Consolidate the name hash in the cache_add() function.

No functional changes are intended.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agocache: assert name is non-nul when looking up
Pablo Neira Ayuso [Sun, 15 Jun 2025 09:33:49 +0000 (11:33 +0200)] 
cache: assert name is non-nul when looking up

{table,chain,set,obj,flowtable}_cache_find() should not be called when
handles are used

Fixes: 5ec5c706d993 ("cache: add hashtable cache for table")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agorule: skip fuzzy lookup if object name is not available
Pablo Neira Ayuso [Sun, 15 Jun 2025 09:33:42 +0000 (11:33 +0200)] 
rule: skip fuzzy lookup if object name is not available

Skip fuzzy lookup for suggestions when handles are used.

Note that 4cf97abfee61 ("rule: Avoid segfault with anonymous chains")
already skips it for chain.

Fixes: 285bb67a11ad ("src: introduce simple hints on incorrect set")
Fixes: 9f7817a4e022 ("src: introduce simple hints on incorrect chain")
Fixes: d7476ddd5f7d ("src: introduce simple hints on incorrect table")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8 weeks agotests: shell: add feature check for count output change
Florian Westphal [Tue, 8 Apr 2025 14:21:32 +0000 (16:21 +0200)] 
tests: shell: add feature check for count output change

New kernels with latest nft release will print the number
of set elements allocated on the kernel side.

This causes shell test dump validation to fail in several
places.  We can't just update the affected dump files
because the test cases are also supposed to pass on current
-stable releases.

Add a feature check for this.  On dump failure, the test can then use
sed to postprocess the stored dump file and call diff a second time.
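
The idea of normalising a listing before the second comparison can be
sketched as follows. This is a Python stand-in for the sed step, under
the assumption that the annotation always appears as a trailing
"# count N" comment; strip_count is a hypothetical helper name.

```python
import re

# Trailing "# count N" annotation, including the whitespace before it.
COUNT_COMMENT_RE = re.compile(r"[ \t]*# count \d+$", re.MULTILINE)


def strip_count(listing: str) -> str:
    """Remove the informational element count from a set listing."""
    return COUNT_COMMENT_RE.sub("", listing)


with_count = "set s {\n    size 65535      # count 1\n    flags dynamic\n}"
without_count = "set s {\n    size 65535\n    flags dynamic\n}"
normalised = strip_count(with_count)
```

With both sides reduced to the same form, a dump recorded on one kernel
can still be compared against output produced on another.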

Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agosrc: print count variable in normal set listings
Florian Westphal [Tue, 8 Apr 2025 14:21:31 +0000 (16:21 +0200)] 
src: print count variable in normal set listings

Also print the number of allocated set elements if the set provided
an upper size limit and there is at least one element.

Example:

table ip t {
   set s {
       type ipv4_addr
       size 65535      # count 1
       flags dynamic
       counter
       elements = { 1.1.1.1 counter packets 1 bytes 11 }
   }
   ...

JSON output is unchanged as this only has informational purposes.

This change breaks tests, followup patch addresses this.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agodebug: include kernel set information on cache fill
Florian Westphal [Tue, 8 Apr 2025 14:21:30 +0000 (16:21 +0200)] 
debug: include kernel set information on cache fill

Honor --debug=netlink flag also when doing initial set dump
from the kernel.

With recent libnftnl update this will include the chosen
set backend name that is used by the kernel.

Because set names are scoped by table and protocol family,
also include the family protocol number.

Dumping this information breaks tests/py as the recorded
debug output no longer matches, this is fixed in previous
change.

Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agotests/py: prepare for set debug change
Florian Westphal [Tue, 8 Apr 2025 14:21:29 +0000 (16:21 +0200)] 
tests/py: prepare for set debug change

Next patch will make initial set dump from kernel emit set debug
information, so the obtained netlink debug file won't match what is
recorded in tests/py.

Furthermore, as the python script adds rules for each family the test
covers, subsequent dumps will include debug information of the
other/previous families.

Change the script to skip all unrelated information to only compare the
relevant set element information and the generated expressions.

This change still finds changes in [ expr ... ] and set elem debug output.

Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agosrc: BASECHAIN flag no longer implies presence of priority expression
Florian Westphal [Thu, 12 Jun 2025 18:17:15 +0000 (20:17 +0200)] 
src: BASECHAIN flag no longer implies presence of priority expression

The included bogon will crash nft because print side assumes that BASECHAIN
flag presence also means that priority expression is available.

Make the print side conditional.

Fixes: a66b5ad9540d ("src: allow for updating devices on existing netdev chain")
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agoevaluate: restrict allowed subtypes of concatenations
Florian Westphal [Fri, 6 Jun 2025 12:12:37 +0000 (14:12 +0200)] 
evaluate: restrict allowed subtypes of concatenations

We need to restrict this, included bogon asserts with:
BUG: unknown expression type prefix
nft: src/netlink_linearize.c:940: netlink_gen_expr: Assertion `0' failed.

Prefix expressions are only allowed if the concatenation is used within
a set element, not when specifying the lookup key.

For the former, anything that represents a value is allowed.
For the latter, only what will generate data (fill a register) is
permitted.

At this time we do not have an annotation that tells if the expression
is on the left hand side (lookup key) or right hand side (set element).

Add a new list recursion counter for this. If it is 0, we are building
the lookup key; if it is nonzero, the concatenation is the RHS part
of a relational expression and prefixes, ranges and so on are allowed.

IOW, we don't really need a recursion counter, another type of annotation
that would tell if the expression is placed on the left or right hand side
of another expression would work too.
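
A simplified model of the check, written in Python for brevity. The
expression types, the VALUE_ONLY set and the error text are assumptions
made for this sketch; the real evaluation code uses its own expression
types and messages.

```python
from enum import Enum, auto


class ExprType(Enum):
    PAYLOAD = auto()  # loads packet data into a register
    VALUE = auto()
    PREFIX = auto()
    RANGE = auto()


# Types that describe a value but cannot fill a register (assumed set).
VALUE_ONLY = {ExprType.PREFIX, ExprType.RANGE}


def check_concat(parts, list_depth):
    """Return an error string, or None if the concatenation is acceptable.

    list_depth == 0 models "building the lookup key"; any other value
    models a concatenation on the right-hand side (a set element).
    """
    for part in parts:
        if list_depth == 0 and part in VALUE_ONLY:
            return "Error: %s is not allowed in a lookup key" % part.name.lower()
    return None


key_error = check_concat([ExprType.PAYLOAD, ExprType.PREFIX], list_depth=0)
elem_error = check_concat([ExprType.VALUE, ExprType.PREFIX], list_depth=1)
```

The counter only serves as a position marker here, which is why any
other left/right annotation would work equally well.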

v2: explicitly list all 'illegal' expression types instead of
using a default label for them.

This will raise a compiler warning to remind us to adjust the case
labels in case a new expression type gets added in the future.

Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agoevaluate: rename recursion counter to recursion.binop
Florian Westphal [Fri, 6 Jun 2025 12:12:36 +0000 (14:12 +0200)] 
evaluate: rename recursion counter to recursion.binop

The existing recursion counter is used by the binop expression to detect
if we've completely followed all the binops.

We can only chain up to NFT_MAX_EXPR_RECURSION binops, but the evaluation
step can perform constant-folding, so we must recurse until we found the
rightmost (last) binop in the chain.

Then we can check the post-eval chain to see if it is something that can
be serialized later (i.e., if we are within the NFT_MAX_EXPR_RECURSION
after constant folding) or not.
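
To illustrate why the limit can only be checked after the whole chain
has been walked, here is a small Python sketch. It is not the nft
evaluation logic: it models a chain as (operator, constant) steps, folds
only consecutive '&' steps, and takes the limit as a parameter because
the value of NFT_MAX_EXPR_RECURSION is not given above.

```python
def fold_constants(chain):
    """Merge consecutive '&' steps, since (x & a) & b == x & (a & b)."""
    folded = []
    for op, const in chain:
        if op == "&" and folded and folded[-1][0] == "&":
            folded[-1] = ("&", folded[-1][1] & const)
        else:
            folded.append((op, const))
    return folded


def fits_limit(chain, limit):
    # The pre-fold length says nothing; only the folded chain is serialized.
    return len(fold_constants(chain)) <= limit


chain = [("&", 0xFF), ("&", 0x0F), ("|", 0x10), ("&", 0x3F), ("&", 0x30)]
folded = fold_constants(chain)
```

Here a five-step chain shrinks to three steps, so a limit of three is
met even though the unfolded chain exceeds it.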

Thus we can't reuse the existing ctx->recursion counter for other
expressions; entering the initial expr_evaluate_binop with
ctx->recursion > 0 would break things.

Therefore rename this to an embedded structure.
This allows us to add a new recursion counter in a followup patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agotest: shell: Add rate_limit test case for 'limit statement'.
Yi Chen [Sun, 22 Jun 2025 12:55:54 +0000 (20:55 +0800)] 
test: shell: Add rate_limit test case for 'limit statement'.

Signed-off-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agotest: shell: Add wait_local_port_listen() helper to lib.sh
Yi Chen [Sun, 22 Jun 2025 12:55:53 +0000 (20:55 +0800)] 
test: shell: Add wait_local_port_listen() helper to lib.sh

Introduce a new helper function wait_local_port_listen() in helpers/lib.sh.
Update the flowtables and nat_ftp test cases to use this helper.

Signed-off-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agotest: shell: Introduce $NFT_TEST_LIBRARY_FILE, helper/lib.sh
Yi Chen [Sun, 22 Jun 2025 12:55:52 +0000 (20:55 +0800)] 
test: shell: Introduce $NFT_TEST_LIBRARY_FILE, helper/lib.sh

Consolidate frequently used functions in helper/lib.sh
switch nat_ftp and flowtables to use it.

Signed-off-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agotest: shell: nat_ftp: test files must be world-readable
Florian Westphal [Sun, 22 Jun 2025 13:19:39 +0000 (15:19 +0200)] 
test: shell: nat_ftp: test files must be world-readable

Directory and test files need to be readable: with a default umask of 077,
this test fails because vsftp can't open the curl-requested file.

Signed-off-by: Florian Westphal <fw@strlen.de>
8 weeks agotest: shell: Don't use system nft binary
Yi Chen [Sun, 22 Jun 2025 12:55:51 +0000 (20:55 +0800)] 
test: shell: Don't use system nft binary

Use the defined $NFT variable instead of calling the system nft binary directly.
Add a nat_ftp.nodump file to avoid the following check-tree.sh error:
ERR: "tests/shell/testcases/packetpath/nat_ftp" has no "tests/shell/testcases/packetpath/dumps/nat_ftp.{nft,nodump}" file.

Signed-off-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2 months agoevaluate: don't BUG on unexpected base datatype
Florian Westphal [Fri, 13 Jun 2025 14:46:06 +0000 (16:46 +0200)] 
evaluate: don't BUG on unexpected base datatype

Included bogon will cause a crash but this is the evaluation stage where
we can just emit an error instead.

Signed-off-by: Florian Westphal <fw@strlen.de>
2 months agonetlink: Avoid crash upon missing NFTNL_OBJ_CT_TIMEOUT_ARRAY attribute
Phil Sutter [Thu, 12 Jun 2025 18:17:22 +0000 (20:17 +0200)] 
netlink: Avoid crash upon missing NFTNL_OBJ_CT_TIMEOUT_ARRAY attribute

If missing, the memcpy call ends up reading from address zero.

Fixes: c7c94802679cd ("src: add ct timeout support")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 months agotests: py: Properly fix JSON equivalents for netdev/reject.t
Phil Sutter [Thu, 12 Jun 2025 10:59:29 +0000 (12:59 +0200)] 
tests: py: Properly fix JSON equivalents for netdev/reject.t

Revert commit d1a7b9e19fe65 ("tests: py: update netdev reject test
file"), the stored JSON equivalents were correct in that they matched
the standard syntax input.

In fact, we missed a .json.output file recording the expected deviation
in JSON output.

Fixes: d1a7b9e19fe65 ("tests: py: update netdev reject test file")
Fixes: 7ca3368cd7575 ("reject: Unify inet, netdev and bridge delinearization")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>