git.ipfire.org Git - thirdparty/nftables.git/log

netlink_delinearize: restore binop syntax when listing ruleset for flags

commit b11b6c68e61ea294eb4c313705ccfe3e7b0eda87 upstream.

c3d57114f119 ("parser_bison: add shortcut syntax for matching flags
without binary operations") provides a similar syntax to iptables using
a prefix representation for flag matching.

Restore original representation using binop when listing the ruleset.
The parser still accepts the prefix notation for backward compatibility.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: do not merge a set with a erroneous one

commit ea011231c06cbe828cf6056bc9c3d116e1f528d5 upstream.

The included sample causes a crash because we attempt to
range-merge a prefix expression with a symbolic expression.

The first set is evaluated, the symbol expression evaluation fails
and nft queues an error message ("Could not resolve hostname").

However, nft continues evaluation.

nft then encounters the same set definition again and merges the
new content with the preceeding one.

But the first set structure is dodgy, it still contains the
unresolved symbolic expression.

That then makes nft crash (assert) in the set internals.

There are various different incarnations of this issue, but the low
level set processing code does not allow for any partially transformed
expressions to still remain.

Before:
nft --check -f tests/shell/testcases/bogons/nft-f/invalid_range_expr_type_binop
BUG: invalid range expression type binop
nft: src/expression.c:1479: range_expr_value_low: Assertion `0' failed.

After:
nft --check -f tests/shell/testcases/bogons/nft-f/invalid_range_expr_type_binop
invalid_range_expr_type_binop:4:18-25: Error: Could not resolve hostname: Name or service not known
elements = { 1&.141.0.1 - 192.168.0.2}
^^^^^^^^

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

parser: json: Support for synproxy objects

commit 938a135f2200766661e42a20e02d87555f5bacfa upstream.

Parsing code was there already, merely the entry in json_parse_cmd_add()
missing.

To support maps with synproxy target, an entry in string_to_nft_object()
is required. While being at it, add other missing entries as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>

json: Support maps with concatenated data

commit c56d77cc82988391ab8f2514214c0088cbc7d89e upstream.

Dump such maps with an array of types in "map" property, make the parser
aware of this.

Signed-off-by: Phil Sutter <phil@nwl.cc>

json: Order output like nft_cmd_expand()

commit 38f04196ebef64a6672c55c27a6afe9af811c8f7 upstream.

Print empty chain add commands early in list so following verdict maps
and rules referring to them won't cause spurious errors when loading the
resulting ruleset dump.

Signed-off-by: Phil Sutter <phil@nwl.cc>

nftables: do mot merge payloads on negation

commit f35a0d78fe870737fa39d859bd2e3ac25bf1b12e upstream.

else, a rule like
tcp sport != 22 tcp dport != 23

will match even if the destination is 23 as long as sport is != 22.
(or vice versa).

Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

parser: allow to define maps that contain ct helpers

commit acbac5e3ada4bdf16cf24056a8a7a7739429a8ab upstream.

Its currently not possible to use ct helpers in objref maps.

Simply adding "CT HELPER" to "map_block_obj_type" does not work
due to a conflict with the "ct helper" ct_expr block.

  map m {
    type ipv4_addr : ct helper ..

... declares a map storing ip addresses and conntrack helper names
(string type).  This does not make sense, there is no way
to use the associated value (the names) in any sensible way:

ct helper "ftp" - this matches if the packet has a conntrack entry that
was accepted via the "ftp" conntrack helper. In nft vm terms, this is
translated to:

[ ct load helper => reg 1 ]
[ cmp eq reg 1 0x00707466 0x00000000 0x00000000 0x00000000 ]

Or one can query a set, e.g. 'ct helper { "ftp", "sip" }'.
"ftp" and "sip" are the kernel-defined names of these connection
tracking helpers.

ct helper set "ftp" is something else, however:

This is used to assign a *userspace defined helper objrect reference*.
Nftables will translate this to:

[ objref type 3 name ftp ]

.. where "ftp" is a arbitrary user-chosen name.

  ct helper "ftp" {
    type "ftp" protocol tcp
    l3proto ip
  }

IOW, "ct helper" is ambiguous.  Without the "set" keyword (first case),
it places the kernel-defined name of the active connection tracking helper
in the chosen register (or it will cancel rule evaluation if no helper was
active).

With the set keyword (second case), the expected argument is a user-defined
object reference which will then tell the connection tracking engine to
monitor all further packets of the new flow with the given helper template.

This change makes it so that

  map m {
    type ipv4_addr : ct helper ..

declares a map storing ct helper object references suitable for
'ct helper set'.

The better alternative would be to resolve the ambiguity
by adding an additional postfix keyword, for example

ct helper name (case one)
ct helper object (case two).

But this needs a kernel change that adds
NFT_CT_HELPER_NAME and NFT_CT_HELPER_OBJECT to enum nft_ct_keys.

While a new kernel could handle old nftables binaries that still use
NFT_CT_HELPER key, new nftables would need to probe support first.

Furthermore,

ct helper name set "foo"

... would make no sense, as the kernel-defined helper names are
readonly.

ct helper object "foo"

... would make no sense, unless we extend the kernel to store
the nftables userspace-defined name in a well-known location
in the kernel.  Userdata area cannot work for this, because the
nft conntrack expression in the kernel would need to know how to
retrieve this info again.

Also, I cannot think of a sensible use case for this.
So the only remaining, useful commands are:

ct helper name "ftp"
ct helper object set "foo"

... which is identical to what we already support, just with
extra keyword.

So a much simpler solution that does not need any kernel changes
is make "ct helper" have different meaning depending on wheter it
is placed on the key side, i.e.:

    "typeof ct helper", "typeof ct helper : $value"

versus when its on placed on the data (value) side of maps:

    "typeof $key : ct helper".

Signed-off-by: Florian Westphal <fw@strlen.de>

parser: allow to define maps that contain timeouts and expectations

commit bb2054cfd99097973d98e231066297e9e8632a61 upstream.

Its currently not possible to use ct timeouts/expectations/helpers
in objref maps, bison parser lacks the relevant keywords.

This change adds support for timeouts and expectations.
Ct helpers are more problematic, this will come in a different change.

Support is only added for the "typeof" keyword, otherwise we'd
need to add pseudo-datatypes as well, but making "ct expectation"
available as "type" as well might be confusing.

Signed-off-by: Florian Westphal <fw@strlen.de>

rule: fix ASAN errors in chain priority to textual names

commit ff6135270616ccf4712990246cae850e64253516 upstream.

ASAN reports several errors when listing this ruleset:

table ip x {
        chain y {
                type filter hook input priority -2147483648; policy accept;
        }
}

src/rule.c:1002:8: runtime error: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
src/rule.c:1001:11: runtime error: signed integer overflow: -2147483648 - 50 cannot be represented in type 'int'

Use int64_t for the offset to avoid an underflow when calculating
closest existing priority definition.

Use llabs() because abs() is undefined with INT32_MIN.

Fixes: c8a0e8c90e2d ("src: Set/print standard chain prios with textual names")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

proto: use hexadecimal to display ip frag-off field

commit 67edd1f2330a31e93789639d31315b64d01a79a3 upstream.

The ip frag-off field in the protocol definition is 16-bits long
and it contains the DF (0x2000) and MF (0x4000) bits too.

iptables-translate also suggests:

ip frag-off & 0x1ffff != 0

to match on fragments. Use hexadecimal for listing this header field.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

parser: deduplicate map with data interval

commit 28d202216535ac54216f825e511a92d9acea5d3c upstream.

Its copypasted, the copy is same as original
except that it specifies a map key that maps to an interval.

Add an exra rule that returns 0 or EXPR_F_INTERVAL, then
use that in a single rule.

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: allow typeof keywords with objref maps during listing

commit 070ec7ce350f8139f9b97dbcb78ad1d7b7bf1196 upstream.

Without this,

  typeof meta l4proto . ip saddr . tcp sport : limit

... is shown as

  type inet_proto . ipv4_addr . inet_service : limit

The "data" element is a value (the object type number).
It doesn't support userinfo data.

There is no reason to add it, the value is the object type
number that the object-reference map stores.

So, if we have an objref map, DO NOT discard the key part,
as we do for normal maps.

For normal maps, we support either typeof notation, i.e.:

  typeof meta l4proto . ip saddr . tcp sport : ip saddr

or the data type version:
  type inet_proto . ipv4_addr . inet_service : ipv4_addr

... but not a mix, a hyptothetical

  typeof meta l4proto . ip saddr . tcp sport : ipv4_addr

... does not work.

If nft finds no udata attached to the data element, for normal
map case, it has to fall back to the "type" form.

But for objref maps this is expected, udata for key but not for data.
Hence, for objref case, keep the typeof part if its valid.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser: allow typeof in objref maps

commit 25f1e6ccc0590e1081879022809afb83fc45f673 upstream.

Its currently not possible to declare a map that
stores object references with the "typeof" keyword, e.g.

map m {
type ipv4_addr : limit

will work, but

map m {
typeof ip saddr : limit

will give a syntax error ("unexpected limit").
Followup pach will add support for listing side too.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser: compact type/typeof set rules

commit fb98cb91c575880617e5da2cb6f094525516cb78 upstream.

Set/maps keys can be declared either by 'type' or 'typeof' keyword.

Compact this to use a common block for both cases.

The datatype_set call is redundant, remove it:
at this point $3 == $1->key, so this is a no-op.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser: compact interval typeof rules

commit 81fc7aee0d523d61077518534036ff10beddb3e9 upstream.

There are two nearly identical blocks for typeof maps:
one with INTERVAL keyword present and one without.

Compact this into a single block.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_json: allow 0 offsets again

commit e4f59f2dc68fb2e6d62a6c37e04366a8ebd199cf upstream.

Its valid in case of tcp option removal:

[ {
   "reset": {
     "tcp option": {
       "base": 123,
       "len": 0,
       "offset": 0
   }

This makes nft-test.py -j pass again.

Fixes: e08627257ecf ("parser: reject raw payload expressions with 0 length")
Signed-off-by: Florian Westphal <fw@strlen.de>

expression: missing line in describe command with invalid expression

commit 2b24dd29c5fa1c7e4cf44f0753752d25106273a0 upstream.

Before:

duh@testbed:~# nft describe blah
symbol expression, datatype invalid (invalid)duh@testbed:#

After:

duh@testbed:~# nft describe blah
symbol expression, datatype invalid (invalid)
duh@testbed:#

Fixes: 48aca2de80a7 ("iptopt: fix crash with invalid field/type combo")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: permit use of host-endian constant values in set lookup keys

commit c0080feb0d034913409944d23873cce4bf9edf9e upstream.

AFL found following crash:

table ip filter {
map ipsec_in {
typeof ipsec in reqid . iif : verdict
flags interval
}

chain INPUT {
type filter hook input priority filter; policy drop;
ipsec in reqid . 100 @ipsec_in
}
}

Which yields:
nft: evaluate.c:1213: expr_evaluate_unary: Assertion `!expr_is_constant(arg)' failed.

All existing test cases with constant values use big endian values, but
"iif" expects host endian values.

As raw values were not supported before, concat byteorder conversion
doesn't handle constants.

Fix this:

1. Add constant handling so that the number is converted in-place,
   without unary expression.

2. Add the inverse handling on delinearization for non-interval set
   types.
   When dissecting the concat data soup, watch for integer constants where
   the datatype indicates host endian integer.

Last, extend an existing test case with the afl input to cover
in/output.

A new test case is added to test linearization, delinearization and
matching.

Based on original patch from Florian Westphal, patch subject and
description wrote by him.

Fixes: b422b07ab2f9 ("src: permit use of constant values in set lookup keys")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: permit use of constant values in set lookup keys

commit b422b07ab2f96436001f33dfdfd937238033c799 upstream.

Something like:

Given: set s { type ipv4_addr . ipv4_addr . inet_service .. } something
like
add rule ip saddr . 1.2.3.4 . 80 @s goto c1

fails with: "Error: Can't parse symbolic invalid expressions".

This fails because the relational expression first evaluates
the left hand side, so when concat evaluation sees '1.2.3.4'
no key context is available.

Check if the RHS is a set reference, and, if so, evaluate
the right hand side.

This sets a pointer to the set key in the evaluation context
structure which then makes the concat evaluation step parse
1.2.3.4 and 80 as ipv4 address and 16bit port number.

On delinearization, extend relop postprocessing to
copy the datatype from the rhs (set reference, has
proper datatype according to set->key) to the lhs (concat
expression).

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink_delinearize: move concat and value postprocessing to helpers

commit 7ef933d16fd1b14afb276ab26f4114a2c9d6a3ea upstream.

No functional changes intended.

Signed-off-by: Florian Westphal <fw@strlen.de>

cache: Always set NFT_CACHE_TERSE for list cmd with --terse

commit cd4e947032a57a585b1a457ce03f546afc7ba033 upstream.

This fixes at least 'nft -t list table ...' and 'nft -t list set ...'.

Note how --terse handling for 'list sets/maps' remains in place since
setting NFT_CACHE_TERSE does not fully undo NFT_CACHE_SETELEM: setting
both enables fetching of anonymous sets which is pointless for that
command.

Reported-by: anton.khazan@gmail.com
Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1735
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Phil Sutter <phil@nwl.cc>

netlink_linearize: add assertion to catch for buggy byteorder

commit 8a34cca265a290a0b23cec27d5258bc432cef3d3 upstream.

Add assertion to catch buggy bytecode where unary expression is present
with 1-byte selectors, where no byteorder conversion is required.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: skip byteorder conversion for selector smaller than 2 bytes

commit 9fe58952c45a1643dc7df1a0b1f5d88e8ae1a978 upstream.

Add unary expression to trigger byteorder conversion for host byteorder
selectors only if selectors length is larger or equal than 2 bytes.

# cat test.nft
table ip x {
        set test {
                type ipv4_addr . ether_addr . inet_proto
                flags interval
        }

        chain y {
                ip saddr . ether saddr . meta l4proto @test counter
        }
}

# nft -f test.nft
ip x y
  [ meta load iiftype => reg 1 ]
  [ cmp eq reg 1 0x00000001 ]
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ payload load 6b @ link header + 6 => reg 9 ]
  [ meta load l4proto => reg 11 ]
  [ byteorder reg 11 = hton(reg 11, 2, 1) ] <--- should not be here
  [ lookup reg 1 set test ]
  [ counter pkts 0 bytes 0 ]

Fixes: 1017d323cafa ("src: support for selectors with different byteorder with interval concatenations")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

cache: Optimize caching for 'list tables' command

commit 674eb7fa2895813b25f6fbfcc9417fc0788fade1 upstream.

No point in fetching anything other than existing tables from kernel:
'list tables' merely prints existing table names, no contents.

Also populate filter's family field to reduce overhead when listing
tables in one family with many tables in another one. It works without
further adjustments because nftnl_nlmsg_build_hdr() will use the value
for nfgen_family.

Reported-by: anton.khazan@gmail.com
Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1735
Signed-off-by: Phil Sutter <phil@nwl.cc>

Makefile.am: don't silence -Wimplicit-function-declaration

This becomes an error in GCC 14 and Clang 16. It's a common
misconception that these warnings are invalid or simply noise for
Bison/parser files, but even if that were true, we'd need to handle it
somehow anyway. Silencing them does nothing, so stop doing that.

Further, I don't actually get any warnings to fix with bison-3.8.2. This
mirrors changes we've done in other netfilter.org projects.

Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Phil Sutter <phil@nwl.cc>

evaluate: fix check for unknown in cmd_op_to_name

commit 9f76bb63c0c706ea5c0d55931ee690ca5dccaf16 upstream.

Example:
nft --debug=all destroy table ip missingtable

Before:
Evaluate unknown

After:
Evaluate destroy

Fixes: e1dfd5cc4c46 ("src: add support to command "destroy"")
Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>

evaluate: don't assert on net/transport header conflict

commit 3a734d60813193974a4a0e8ed0af3349f8857ec9 upstream.

before:
nft: evaluate.c:467: conflict_resolution_gen_dependency: Assertion `expr->payload.base == PROTO_BASE_LL_HDR' failed.
Aborted (core dumped)

conflict_resolution_gen_dependency() can only handle linklayer
conflicts, hence the assert.

Rename it accordingly. Also rename resolve_protocol_conflict, it doesn't
do anything for != PROTO_BASE_LL_HDR and extend the assertion to that
function too.

Callers now enforce PROTO_BASE_LL_HDR prerequisite.

after:
Error: conflicting transport layer protocols specified: comp vs. udp
ip6 nexthdr comp udp dport 4789
^^^^^^^^^

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

datatype: display 0s time datatype

commit 072c83e9a18399249fbc4343f7d0b5b04c29e6fb upstream.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

json: Support sets' auto-merge option

commit 0021d264d975ac7db1ed32699487f95b41e4c2bf upstream.

If enabled, list the option as additional attribute with boolean value.

Fixes: e70354f53e9f6 ("libnftables: Implement JSON output support")
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1734
Signed-off-by: Phil Sutter <phil@nwl.cc>

rule: fix sym refcount assertion

commit b73298405cda74b3a87a1818bb92f53298d34170 upstream.

Scope release must happen last.
afl provided a reproducer where policy is a define, because
scope is released too early we get:
nft: src/rule.c:559: scope_release: Assertion `sym->refcnt == 1' failed.

... because chain->policy is EXPR_SYMBOL.

Fixes: 627c451b2351 ("src: allow variables in the chain priority specification")
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: error out when store needs more than one 128bit register of align fixup

commit 8a66de2a15943b2fbf960967cdbcbd0a148cb114 upstream.

Else this gives:
nft: evaluate.c:2983: stmt_evaluate_payload: Assertion `sizeof(data) * BITS_PER_BYTE >= masklen' failed.

For loads, this is already prevented via expr_evaluate_bits() which has:

  if (masklen > NFT_REG_SIZE * BITS_PER_BYTE)
      return expr_error(ctx->msgs, expr, "mask length %u exceeds allowed maximum of %u\n",
                        masklen, NFT_REG_SIZE * BITS_PER_BYTE);

But for the store path this isn't called.
The reproducer asks to store a 128 bit integer at bit offset 1, i.e.
17 bytes would need to be munged, but we can only handle up to 16 bytes
(one pseudo-register).

Fixes: 78936d50f306 ("evaluate: add support to set IPv6 non-byte header fields")
Signed-off-by: Florian Westphal <fw@strlen.de>

parser: reject raw payload expressions with 0 length

commit e08627257ecfa7dfb68a34a1c8866e7a7e012b15 upstream.

Reject this at parser stage. Fix up the json input side too, else
reproducer gives:
nft: src/netlink.c:243: netlink_gen_raw_data: Assertion `len > 0' failed.

Signed-off-by: Florian Westphal <fw@strlen.de>

payload: only assert if l2 header base has no length

commit 9cc41467c75ab6beb35e0d7c34d04acd1a44861b upstream.

nftables will assert in some cases because the sanity check is done even
for network and transport header bases.

However, stacked headers are only supported for the link layer.
Move the assertion around and add a test case for this.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: release mpz type in expr_evaluate_list() error path

commit 172b660843501463a0894b0d2ca1dd48c898dc4d upstream.

Detected when running:

# nft -f tests/shell/testcases/bogons/nft-f/no_integer_basetype_crash
==383222==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x7fe7b54a9e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7fe7b538b9a9 in __gmp_default_allocate (/lib/x86_64-linux-gnu/libgmp.so.10+0xc9a9)

Fixes: 3671c4897003 ("evaluate: guard against NULL basetype")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: release key expression in error path of implicit map with unknown datatype

commit 4622a8c6372ccf3103e38d1f15227eadb0485f54 upstream.

Detected when running:

# nft -f tests/shell/testcases/bogons/nft-f/mapping_with_invalid_datatype_crash
==382584==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 144 byte(s) in 1 object(s) allocated from:
    #0 0x7fde06ca9e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7fde062924af in xmalloc src/utils.c:31
    #2 0x7fde0629266c in xzalloc src/utils.c:70
    #3 0x7fde06167299 in expr_alloc src/expression.c:46
    #4 0x7fde0616b014 in constant_expr_alloc src/expression.c:420
    #5 0x7fde06128e43 in expr_evaluate_map src/evaluate.c:2027
    #6 0x7fde06137b06 in expr_evaluate src/evaluate.c:2891
    #7 0x7fde06132417 in expr_evaluate_relational src/evaluate.c:2497
    #8 0x7fde06137b36 in expr_evaluate src/evaluate.c:2895
    #9 0x7fde06137d5f in stmt_evaluate_expr src/evaluate.c:2914
    #10 0x7fde061524c8 in stmt_evaluate src/evaluate.c:4646
    #11 0x7fde0615c9ee in rule_evaluate src/evaluate.c:5202
    #12 0x7fde061600c7 in cmd_evaluate_add src/evaluate.c:5422

Fixes: 70054e6e1c87 ("evaluate: catch implicit map expressions without known datatype")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: bail out if anonymous concat set defines a non concat expression

commit 98c51aaac42b6d180f198d3d2f5b3425ab63ad72 upstream.

Iterate over the element list in the anonymous set to validate that all
expressions are concatenations, otherwise bail out.

  ruleset.nft:3:46-53: Error: expression is not a concatenation
               ip protocol . th dport vmap { tcp / 22 : accept, tcp . 80 : drop}
                                             ^^^^^^^^

This is based on a patch from Florian Westphal.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: do not fetch next expression on runaway number of concatenation components

commit 955bb6d31c90453e43043346c917646ddc4e5c4e upstream.

If this is the last expression, then the runaway flag is set on and
evaluation bails in the next iteration, do not fetch next list element
which refers to the list head.

I found this by code inspection, I could not trigger any crash with this
one.

Fixes: ae1d54d1343f ("evaluate: do not crash on runaway number of concatenation components")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: skip anonymous set optimization for concatenations

commit 6bc6673fc88c8a3e3dd5504b2d24a6d6bc2f8427 upstream.

Concatenation is only supported with sets. Moreover, stripping of the
set leads to broken ruleset listing, therefore, skip this optimization
for the concatenations.

Fixes: fa17b17ea74a ("evaluate: revisit anonymous set with single element optimization")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: add missing range checks for dup,fwd and payload statements

commit 4121175cc243a15bdb8c226a335f67cedd98680e upstream.

Else we assert with:
BUG: unknown expression type range
nft: src/netlink_linearize.c:912: netlink_gen_expr: Assertion `0' failed.

While at it, condense meta and exthdr to reuse the same helper.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: tproxy: move range error checks after arg evaluation

commit 1d03ab5267bdbc7c0bcb041efaad42a462fdeb5f upstream.

Testing for range before evaluation will still crash us later during
netlink linearization, prefixes turn into ranges, symbolic expression
might hide a range/prefix.

So move this after the argument has been evaluated.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: error out when expression has no datatype

commit 666018e71ebb5df376b1b013c1ca859eaed66f1a upstream.

add rule ip6 f i rt2 addr . ip6 daddr { dead:: . dead:: }

... will cause a segmentation fault, we assume expr->dtype is always
set.

rt2 support is incomplete, the template is uninitialised.

This could be fixed up, but rt2 (a subset of the deperecated type 0),
like all other routing headers, lacks correct dependency tracking.

Currently such routing headers are always assumed to be segment routing
headers, we would need to add dependency on 'Routing Type' field in the
routing header, similar to icmp type/code.

Signed-off-by: Florian Westphal <fw@strlen.de>

doc: clarify reject is supported at prerouting stage

commit 52d1346d41c51abf2f77b81d21cb683a6477e817 upstream.

It's supported since kernel commit f53b9b0bdc59 ("netfilter: introduce
support for reject at prerouting stage").

Reported-by: Dan Winship <danwinship@redhat.com>
Signed-off-by: Quan Tian <tianquan23@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

doc: incorrect datatype description for icmpv6_type and icmpvx_code

commit 9caa5142f95557decc19925ad37411f1df4589a2 upstream.

Fix incorrect description in manpage:

ICMPV6 TYPE TYPE is icmpv6_type
ICMPVX CODE TYPE is icmpx_code

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

datatype: do not assert when value exceeds expected width

commit a852022d719e5a414df41a10c331f4492e9d3073 upstream.

Inputs:
ip protocol . th dport { tcp / 22,  }'
or
th dport . ip protocol { tcp / 22,  }'

are not rejected at this time. 'list ruleset' yields:
ip protocol & nft: src/gmputil.c:77: mpz_get_uint8: Assertion `cnt <= 1' failed.
or
th dport & nft: src/gmputil.c:87: mpz_get_be16: Assertion `cnt <= 1' failed.

While this should be caught at input too, the print path should be more
robust, e.g. when there are direct nfnetlink users.

After this patch, the print functions fall back to
'integer_type_print' which can handle large numbers too.

Note that the output printed this way cannot be read back by nft;
it will dump something like:

  tcp dport & 18446739675663040512 . ip protocol 0 . 0

but thats better than assert().

v2: same problem exists for service too.

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: fix stack overflow due to erroneous rounding

commit b9e19cc396347df8c7f8cf5d14ba1d6172040f16 upstream.

Byteorder switch in this function may undersize the conversion
buffer by one byte, this needs to use div_round_up().

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: don't crash if object map does not refer to a value

commit 588470e00539404fd793fe22718067721f5754be upstream.

Before:
BUG: Value export of 512 bytes would overflownft: src/netlink.c:474: netlink_gen_prefix: Assertion `0' failed.

After:
66: Error: Object mapping data should be a value, not prefix
synproxy name ip saddr map { 192.168.1.0/24 : "v*" }

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: error out on duplicated type/typeof/element keywords

commit 6c04e5ceb95068bb459b07307ecc3629d97a2043 upstream.

Otherwise nft will leak the previous definition (expressions).
Also remove the nonsensical

datatype_set($1->key, $3->dtype);

This is a no-op, at this point: $1->key and $3 are identical.

Signed-off-by: Florian Westphal <fw@strlen.de>

intervals: BUG on prefix expressions without value

commit 49721e28db3e7b28f443bf963547c12f4dcf856d upstream.

Its possible to end up with prefix expressions that have
a symbolic expression, e.g.:

table t {
        set s {
                type inet_service
                flags interval
                elements = { 172.16.0.0/16 }
        }

        set s {
                type inet_service
                flags interval
                elements = { 0-1024, 8080-8082, 10000-40000 }
        }
}

Without this change, nft will crash.  We end up in setelem_expr_to_range()
with prefix "/16" for the symbolic expression "172.16.0.0".

We than pass invalid mpz_t pointer into libgmp.

This isn't a real fix, but instead of blindly assuming that the attached
expression has a gmp value die with at least some info.

Signed-off-by: Florian Westphal <fw@strlen.de>

tcpopt: don't create exthdr expression without datatype

commit c9f934ca446de5041ce6f19e4ee6a0c74b120186 upstream.

The reproducer crashes during concat evaluation, as the
exthdr expression lacks a datatype.

This should never happen, i->dtype must be set.

In this case the culprit is tcp option parsing, it will
wire up a non-existent template, because the "nop" option
has no length field (1 byte only).

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: fix stack overflow with huge priority string

commit b626c86abaf294fcf1ec788f722071dc90da68c4 upstream.

Alternative would be to refactor this and move this into the parsers
(bison, json) instead of this hidden re-parsing.

Fixes: 627c451b2351 ("src: allow variables in the chain priority specification")
Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: fix stack buffer overflow with sub-reg sized prefixes

commit 600b84631410a6e853c208795246ea0c9df95c12 upstream.

The calculation of the dynamic on-stack array is incorrect,
the scratch space can be too low which gives stack corruption:

AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7ffdb454f064..
    #1 0x7fabe92aaac4 in __mpz_export_data src/gmputil.c:108
    #2 0x7fabe92d71b1 in netlink_export_pad src/netlink.c:251
    #3 0x7fabe92d91d8 in netlink_gen_prefix src/netlink.c:476

div_round_up() cannot be used here, it fails to account for register
padding.  A 16 bit prefix will need 2 registers (start, end -- 8 bytes
in total).

Remove the dynamic sizing and add an assertion in case upperlayer
ever passes invalid expr sizes down to us.

After this fix, the combination is rejected by the kernel
because of the maps' wrong data size, before the fix userspace
may crash before.

Signed-off-by: Florian Westphal <fw@strlen.de>

src: reject large raw payload and concat expressions

commit 8eeedce89d8bf0ad58da398782c2ca8a91d83a32 upstream.

The kernel will reject this too, but unfortunately nft may try
to cram the data into the underlying libnftnl expr.

This causes heap corruption or
BUG: nld buffer overflow: want to copy 132, max 64

After:

Error: Concatenation of size 544 exceeds maximum size of 512
udp length . @th,0,512 . @th,512,512 { 47-63 . 0xe373135363130 . 0x33131303735353203 }
^^^^^^^^^

resp. same warning for an over-sized raw expression.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: exthdr: statement arg must be not be a range

commit 8eeedce89d8bf0ad58da398782c2ca8a91d83a32 upstream.

Else we get:
BUG: unknown expression type range
nft: src/netlink_linearize.c:909: netlink_gen_expr: Assertion `0' failed.

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: don't crash if prefix for < byte is requested

commit 0404ff08b3c18052e6689d75fa85275d3cef7e8e upstream.

If prefix is used with a datatype that has less than 8 bits an
assertion is triggered:

src/netlink.c:243: netlink_gen_raw_data: Assertion `len > 0' failed.

This is esoteric, the alternative would be to restrict prefixes
to ipv4/ipv6 addresses.

Simpler fix is to use round_up instead of divide.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: fix gmp assertion with too-large reject code

commit 060ed8655d64874a92e6fba2ba9452b2aa94849e upstream.

Before:
nft: gmputil.c:77: mpz_get_uint8: Assertion `cnt <= 1' failed.
After: Error: reject code must be integer in range 0-255

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: stmt_nat: set reference must point to a map

commit 3eb0a73a9ee32897290d4097c0ec29377e25859e upstream.

nat_concat_map() requires a datamap, else we crash:
set->data is dereferenced.

Also update expr_evaluate_map() so that EXPR_SET_REF is checked there
too.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: fix memory leaks on hookspec error processing

commit d755c2a3ae7fe8272321a1d81eafbd90052c4f14 upstream.

prio_spec may contain an embedded expression, release it.
We also need to release the device expr and the hook string.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: fix ct scope underflow if ct helper section is duplicated

commit 037d58a27d675802286aafb23e409b8c1d3eef56 upstream.

table inet filter {
ct helper sip-5060u {
type "sip" protocol udp
l3proto ip
}5060t {
type "sip" protocol tcp
l3pownerip
}

Will close the 'ct' scope twice, it has to be closed AFTER the separator
has been parsed.

While not strictly needed, also error out if the protocol is already
given, this provides a better error description.

Also make sure we release the string in all error branches.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: make sure obj_free releases timeout policies

commit d5a06af393eaf47571c884a265d1f6e6ba34ed97 upstream.

obj_free() won't release them because ->type is still 0 at this
point.

Init this to CT_TIMEOUT.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: turn assert into real error check

commit 58904b8b55a2a7941287f0267601eb54c75432a0 upstream.

large '& VAL' results in:
src/evaluate.c:531: expr_evaluate_bits: Assertion `masklen <= NFT_REG_SIZE * BITS_PER_BYTE' failed.

Turn this into expr_error().

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: prevent assert when evaluating very large shift values

commit 26723202e600604ab7cf48915507cfcb7a313620 upstream.

Error out instead of 'nft: gmputil.c:67: mpz_get_uint32: Assertion `cnt <= 1' failed.'.

Fixes: edecd58755a8 ("evaluate: support shifts larger than the width of the left operand")
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: bogus error when adding devices to flowtable

commit 59d304f47a121afda867d792c709bc2c81946979 upstream.

Bail out if flowtable declaration is missing and no devices are
specified.

Otherwise, this reports a bogus error when adding new devices to an
existing flowtable.

# nft -v
nftables v1.0.9 (Old Doc Yak #3)
# ip link add dummy1 type dummy
# ip link set dummy1 up
# nft 'create flowtable inet filter f1 { hook ingress priority 0; counter }'
# nft 'add flowtable inet filter f1 { devices = { dummy1 } ; }'
Error: missing hook and priority in flowtable declaration
add flowtable inet filter f1 { devices = { dummy1 } ; }
^^^^^^^^^^^^^^^^^^^^^^^^

Fixes: 5ad475fce5a1 ("evaluate: bail out if new flowtable does not specify hook and priority")
Reported-by: Martin Gignac <martin.gignac@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: place byteorder conversion before rshift in payload expressions

commit cb9b72a43c5684379c027908d9f332170bf8dd15 upstream.

Use the key from the evaluation context to perform the byteorder
conversion in case that this expression is used for lookups and updates
on explicit sets.

# nft --debug=netlink add rule ip6 t output ip6 dscp @mapv6
ip6 t output
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
  [ byteorder reg 1 = ntoh(reg 1, 2, 2) ]   <-------------- this was missing!
  [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
  [ lookup reg 1 set mapv6 ]

Also with set statements (updates from packet path):

# nft --debug=netlink add rule ip6 t output update @mapv6 { ip6 dscp }
ip6 t output
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
  [ byteorder reg 1 = ntoh(reg 1, 2, 2) ]   <------------- also here!
  [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
  [ dynset update reg_key 1 set mapv6 ]

Simple matches on values and implicit sets rely on the binary transfer
mechanism to propagate the shift to the constant, no explicit byteorder
is required in such case.

Fixes: 668c18f67203 ("evaluate: place byteorder conversion before rshift in payload statement")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: reset statement length context only for set mappings

commit 57f092a87fc4bc61e29cff31dfff976e1f2005ab upstream.

map expression (which is used a key to look up for the mapping) needs to
consider the statement length context, otherwise incorrect bytecode is
generated when {ct,meta} statement is generated.

# nft -f - <<EOF
add table ip6 t
add chain ip6 t c
add map ip6 t mapv6 { typeof ip6 dscp : meta mark; }
EOF

# nft -d netlink add rule ip6 t c meta mark set ip6 dscp map @mapv6
ip6 t c
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
   ... missing byteorder conversion here before shift ...
   [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
   [ lookup reg 1 set mapv6 dreg 1 ]
   [ meta set mark with reg 1 ]

Reset statement length context only for the mapping side for the
elements in the set.

Fixes: edecd58755a8 ("evaluate: support shifts larger than the width of the left operand")
Reported-by: Brian Davidson <davidson.brian@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: place byteorder conversion before rshift in payload statement

commit 668c18f672038dffa72b67d834445e0fe5ae286d upstream.

For bitfield that spans more than one byte, such as ip6 dscp, byteorder
conversion needs to be done before rshift. Add unary expression for this
conversion only in the case of meta and ct statements.

Before this patch:

# nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp'
ip6 x y
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
  [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
  [ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <--------- incorrect
  [ meta set mark with reg 1 ]

After this patch:

# nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp'
ip6 x y
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
  [ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <-------- correct
  [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
  [ meta set mark with reg 1 ]

For the matching case, binary transfer already deals with the rshift to
adjust left and right hand side of the expression, the unary conversion
is not needed in such case.

Fixes: 8221d86e616b ("tests: py: add test-cases for ct and packet mark payload expressions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_linearize: avoid strict-overflow warning in netlink_gen_bitwise()

commit d07c874797295f425541464ac84864e591fc0614 upstream.

With gcc-13.2.1-1.fc38.x86_64:

  $ gcc -Iinclude -c -o tmp.o src/netlink_linearize.c -Werror -Wstrict-overflow=5 -O3
  src/netlink_linearize.c: In function ‘netlink_gen_bitwise’:
  src/netlink_linearize.c:1790:1: error: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C2 -+ C1 [-Werror=strict-overflow]
   1790 | }
        | ^
  cc1: all warnings being treated as errors

It also makes more sense this way, where "n" is the hight of the
"binops" stack, and we check for a non-empty stack with "n > 0" and pop
the last element with "binops[--n]".

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: fix memleak in prefix evaluation with wildcard interface name

commit fe727d5da18c40cb9f002eeaf0116f59e9600659 upstream.

The following ruleset:

  table ip x {
        chain y {
                meta iifname { abcde*, xyz }
        }
  }

triggers the following memleak:

==6871== 16 bytes in 1 blocks are definitely lost in loss record 1 of 1
==6871==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==6871==    by 0x48AD898: xmalloc (utils.c:37)
==6871==    by 0x4BC8B22: __gmpz_init2 (in /usr/lib/x86_64-linux-gnu/libgmp.so.10.4.1)
==6871==    by 0x4887E67: constant_expr_alloc (expression.c:424)
==6871==    by 0x488EF1F: expr_evaluate_prefix (evaluate.c:1138)
==6871==    by 0x488EF1F: expr_evaluate (evaluate.c:2725)
==6871==    by 0x488E76D: expr_evaluate_set_elem (evaluate.c:1662)
==6871==    by 0x488E76D: expr_evaluate (evaluate.c:2739)
==6871==    by 0x4891033: list_member_evaluate (evaluate.c:1454)
==6871==    by 0x488E2B6: expr_evaluate_set (evaluate.c:1757)
==6871==    by 0x488E2B6: expr_evaluate (evaluate.c:2737)
==6871==    by 0x48910D0: elems_evaluate (evaluate.c:4605)
==6871==    by 0x4891432: set_evaluate (evaluate.c:4711)
==6871==    by 0x48915BC: implicit_set_declaration (evaluate.c:122)
==6871==    by 0x488F18A: expr_evaluate_relational (evaluate.c:2503)
==6871==    by 0x488F18A: expr_evaluate (evaluate.c:2745)

expr_evaluate_prefix() calls constant_expr_alloc() which have already
called mpz_init2(), the second call to mpz_init2() overlaps the existing
mpz_t data memory area.

Remove extra mpz_init2() call to fix this memleak.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: fix bogus assertion failure with boolean datatype

commit 567937b5560fbcc7f6b74fb43c52e1cab2ac425a upstream.

The assertion is too strict, as found by afl++:

typeof iifname . ip saddr . meta ipsec
elements = { "eth0" . 10.1.1.2 . 1 }

meta ipsec is boolean (1 bit), but datasize of 1 is set at 8 bit.

Fixes: 22b750aa6dc9 ("src: allow use of base integer types as set keys in concatenations")
Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: add and use nft_data_memcpy helper

commit 130060afa9f6f11e14ea5cf372545407179f16ac upstream.

There is a stack overflow somewhere in this code, we end
up memcpy'ing a way too large expr into a fixed-size on-stack
buffer.

This is hard to diagnose, most of this code gets inlined so
the crash happens later on return from alloc_nftnl_setelem.

Condense the mempy into a helper and add a BUG so we can catch
the overflow before it occurs.

->value is too small (4, should be 16), but for normal
cases (well-formed data must fit into max reg space, i.e.
64 byte) the chain buffer that comes after value in the
structure provides a cushion.

In order to have the new BUG() not trigger on valid data,
bump value to the correct size, this is userspace so the additional
60 bytes of stack usage is no concern.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: reject set in concatenation

commit 4b6a4ad9134fa71277c2ff7f92776e1faeb83000 upstream.

Consider the following ruleset.

define ext_if = { "eth0", "eth1" }
table ip filter {
    chain c {
        iifname . tcp dport { $ext_if . 22 } accept
    }
}

Attempting to load this ruleset results in:

BUG: invalid expression type 'set' in setnft: netlink.c:304: __netlink_gen_concat_key: Assertion `0' failed.
Aborted (core dumped)

After this patch:

# nft -f ruleset.nft
ruleset.nft:1:17-40: Error: cannot use set in concatenation
define ext_if = { "eth0", "eth1" }
                 ^^^^^^^^^^^^^^^^^^

Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1715
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: perform mark datatype compatibility check from maps

commit 7b491e0c068c9881accfb571db3fb8f2f5799ca2 upstream.

Wrap datatype compatibility check into a helper function and use it for
map evaluation, otherwise the following bogus error message is
displayed:

Error: datatype mismatch, map expects packet mark, mapping expression has type integer

Add unit tests to improve coverage for this usecase.

Fixes: 5d8e33ddb112 ("evaluate: relax type-checking for integer arguments in mark statements")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

json: allow to specify comment on chain

commit bd976ab13b4d57ebf5d02459c360905a76af9e58 upstream.

Allow users to add a comment when declaring a chain.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

json: allow to specify comment on table

commit fd595286a68e2c676657d4ac753da517a4d0c3a2 upstream.

Allow users to add a comment when declaring a table:

# sudo nft add table inet test3 '{comment "this is a comment";}'
# nft list ruleset
table inet test3 {
comment "this is a comment"
}
# nft -j list ruleset
{"nftables": [{"metainfo": {"version": "1.0.7", "release_name": "Old Doc Yak", "json_schema_version": 1}}, {"table": {"family": "inet", "name": "test3", "handle": 3, "comment": "this is a comment"}}]}
# nft -j list ruleset > test.json
# nft flush ruleset
# nft -j -f test.json
# nft -j list ruleset
{"nftables": [{"metainfo": {"version": "1.0.7", "release_name": "Old Doc Yak", "json_schema_version": 1}}, {"table": {"family": "inet", "name": "test3", "handle": 4, "comment": "this is a comment"}}]}

Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1670
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: bail out if new flowtable does not specify hook and priority

commit 5ad475fce5a138d3a8b58bde4a41b0537d15b952 upstream.

If user forgets to specify the hook and priority and the flowtable does
not exist, then bail out:

# cat flowtable-incomplete.nft
table t {
  flowtable f {
   devices = { lo }
  }
}
# nft -f /tmp/k
flowtable-incomplete.nft:2:12-12: Error: missing hook and priority in flowtable declaration
flowtable f {
           ^

Update one existing tests/shell to specify a hook and priority.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_delinearize: correct type and byte-order of shifts

commit cab08c47e687104c3774e196240f3da1ad2834e7 upstream.

Downgrade to base type integer instead of the specific type from the
expression that is used in the shift operation.

Without this, listing a rule like:

  ct mark set ip dscp lshift 2 or 0x10

will return:

  ct mark set ip dscp << 2 | cs2

because the type of the OR's right operand will be transitively derived
from `ip dscp`.  However, this is not valid syntax:

  # nft add rule t c ct mark set ip dscp '<<' 2 '|' cs2
  Error: Could not parse integer
  add rule t c ct mark set ip dscp << 2 | cs2
                                          ^^^

Use xinteger_type to print the output in hexadecimal.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

payload: set byteorder when completing expression

commit fd76cdcb9a62552c7c8b6cdf6f1591f12aa9b482 upstream.

Otherwise payload expression remains in invalid byteorder which is
handled as network byteorder for historical reason.

No functional change is intended.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_delinerize: incorrect byteorder in mark statement listing

commit 6696599e104098b61e45f99d161275883885b199 upstream.

When using ip dscp in combination with bitwise operation:

# nft --debug=netlink add rule ip x y 'ct mark set ip dscp | 0x4'
ip x y
   [ payload load 1b @ network header + 1 => reg 1 ]
   [ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
   [ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
   [ bitwise reg 1 = ( reg 1 & 0xfffffffb ) ^ 0x00000004 ]
   [ ct set mark with reg 1 ]

the listing is showing in the incorrect byteorder:

# nft list ruleset
table ip x {
        chain y {
ct mark set ip dscp | 0x4000000
}
}

handle and and or operations in host byteorder.

The following command:

# nft --debug=netlink add rule ip6 x y 'ct mark set ip6 dscp | 0x4'
ip6 x y
   [ payload load 2b @ network header + 0 => reg 1 ]
   [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
   [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
   [ byteorder reg 1 = ntoh(reg 1, 2, 1) ]
   [ bitwise reg 1 = ( reg 1 & 0xfffffffb ) ^ 0x00000004 ]
   [ ct set mark with reg 1 ]

works fine (without requiring this patch) because there is an explicit
byteorder expression.

However, ip dscp takes only 1-byte, so it does not require the byteorder
expression. Use host byteorder if the rhs of bitwise AND OR is larger
than lhs payload expression and such expression is equal or less than
1-byte.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: honor statement length in bitwise evaluation

commit f60ef911f1e05a88fae650c487965e7b85d17d2a upstream.

Get length from statement, instead infering it from the expression that
is used to set the value. In the particular case of {ct|meta} mark, this
is 32 bits.

Otherwise, bytecode generation is not correct:

# nft -c --debug=netlink 'add rule ip6 x y ct mark set ip6 dscp << 2 | 0x10'
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
  [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
  [ byteorder reg 1 = ntoh(reg 1, 2, 1) ]
  [ bitwise reg 1 = ( reg 1 << 0x00000002 ) ]
  [ bitwise reg 1 = ( reg 1 & 0x00000fef ) ^ 0x00000010 ]    <--- incorrect!
  [ ct set mark with reg 1 ]

the previous bitwise shift already upgraded to 32-bits (not visible from
the netlink debug output above).

After this patch, the last | 0x10 uses 32-bits:

[ bitwise reg 1 = ( reg 1 & 0xffffffef ) ^ 0x00000010 ]

note that mask 0xffffffef is used instead of 0x00000fef.

Patch ("evaluate: support shifts larger than the width of the left operand")
provides the statement length through eval context. Use it to evaluate the
bitwise expression accordingly, otherwise bytecode is incorrect:

# nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1 | 0xff000000'
ip x y
  [ payload load 1b @ network header + 1 => reg 1 ]
  [ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
  [ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
  [ bitwise reg 1 = ( reg 1 & 0x1e000000 ) ^ 0x000000ff ]  <-- incorrect byteorder for OR
  [ byteorder reg 1 = ntoh(reg 1, 4, 4) ]    <-- no needed for single ip dscp byte
  [ ct set mark with reg 1 ]

Correct bytecode:

# nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1 | 0xff000000
ip x y
  [ payload load 1b @ network header + 1 => reg 1 ]
  [ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
  [ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
  [ bitwise reg 1 = ( reg 1 & 0x0000001e ) ^ 0xff000000 ]
  [ ct set mark with reg 1 ]

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: set up integer type to shift expression

commit 1cdd8249f1c9c462b40ccde8d56191ea481ce610 upstream.

Otherwise expr_evaluate_value() fails with invalid datatype:

# nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1'
BUG: invalid basetype invalid
nft: evaluate.c:440: expr_evaluate_value: Assertion `0' failed.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: relax type-checking for integer arguments in mark statements

commit 5d8e33ddb1125ccdc258c01d2d869be6eb9ed3a9 upstream.

In order to be able to set ct and meta marks to values derived from
payload expressions, we need to relax the requirement that the type of
the statement argument must match that of the statement key.  Instead,
we require that the base-type of the argument is integer and that the
argument is small enough to fit.

Moreover, swap expression byteorder before to make it compatible with
the statement byteorder, to ensure rulesets are portable.

# nft --debug=netlink add rule ip t c 'meta mark set ip saddr'
ip t c
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ byteorder reg 1 = ntoh(reg 1, 4, 4) ] <----------- byteorder swap
  [ meta set mark with reg 1 ]

Based on original work from Jeremy Sowden.

The following patches are required for this to work:

evaluate: get length from statement instead of lhs expression
evaluate: don't eval unary arguments
evaluate: support shifts larger than the width of the left operand
netlink_delinearize: correct type and byte-order of shifts
evaluate: insert byte-order conversions for expressions between 9 and 15 bits

Add one testcase for tests/py.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: don't eval unary arguments

commit 6bea6864b799905c8cc5f7a3f9488c88e219c481 upstream.

When a unary expression is inserted to implement a byte-order
conversion, the expression being converted has already been evaluated
and so `expr_evaluate_unary` doesn't need to do so.

This is required by {ct|meta} statements with bitwise operations, which
might result in byteorder conversion of the expression.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: support shifts larger than the width of the left operand

commit edecd58755a84dbe8f36795f619189490eeb3ef1 upstream.

If we want to left-shift a value of narrower type and assign the result
to a variable of a wider type, we are constrained to only shifting up to
the width of the narrower type.  Thus:

  add rule t c meta mark set ip dscp << 2

works, but:

  add rule t c meta mark set ip dscp << 8

does not, even though the lvalue is large enough to accommodate the
result.

Upgrade the maximum length based on the statement datatype length, which
is provided via context, if it is larger than expression lvalue.

Update netlink_delinearize.c to handle the case where the length of a
shift expression does not match that of its left-hand operand.

Based on patch from Jeremy Sowden.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

meta: don't crash if meta key isn't known

commit cd2d1fc4e8b114f8526b0ce61cfd5cfce8eacac6 upstream.

If older nft version is used for dumping, 'key' might be
outside of the range of known templates.

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink_delinearize: add postprocessing for payload binops

commit b3d4028a27edff0684a47356b13da494f7ec08ff upstream.

If a user uses a payload expression as a statement argument:

nft add rule t c meta mark set ip dscp lshift 2 or 0x10

we may need to undo munging during delinearization.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>

tests: shell: use /bin/bash in sets/elem_opts_compat_0

commit d3e941668be1d3922832fd70788fb248fd11f6c7 upstream.

[ All 1.0.6.y tests/shell report success after this ]

So running this test with /bin/sh != /bin/bash works.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

parser_bison: fix memleak in meta set error handling

commit 21608263cc1ae489326e743957bfe34b05414a44 upstream.

We must release the expression here, found via afl++ and
-fsanitize-address build.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: fix objref statement corruption

commit 78dffb470fcf7b1c0b1b3d6f43fcc056c337a808 upstream.

Consider this:

counter_stmt            :       counter_stmt_alloc
                        |       counter_stmt_alloc      counter_args

counter_stmt_alloc      :       COUNTER { $$ = counter_stmt_alloc(&@$); }
                        |       COUNTER         NAME    stmt_expr
                        {
                                $$ = objref_stmt_alloc(&@$);
                                $$->objref.type = NFT_OBJECT_COUNTER;
                                $$->objref.expr = $3;
                        }
                        ;

counter_args            :       counter_arg { $<stmt>$        = $<stmt>0; }
                        |       counter_args    counter_arg
                        ;

counter_arg             :       PACKETS NUM { $<stmt>0->counter.packets = $2; }

[..]

This has 'counter_stmt_alloc' EITHER return counter or objref statement.
Both are the same structure but with different (union'd) trailer content.

counter_stmt permits the 'packet' and 'byte' argument.

But the 'counter_arg' directive only works with a statement
coming from counter_stmt_alloc().

afl++ came up with following input:

table inet x {
        chain y {
                counter name ip saddr bytes 1.1.1. 1024
        }
}

This clobbers $<stmt>->objref.expr pointer, we then crash when
calling expr_evaluate() on it.

Split the objref related statements into their own directive.

After this, the input will fail with:
"syntax error, unexpected bytes, expecting newline or semicolon".

Also split most of the other objref statements into their own blocks.
synproxy seems to have same problem, limit and quota appeared to be ok.

v1 added objref_stmt to stateful_stmt list, this is wrong, we will
assert when generating the 'counter' statement.
Place it in the normal statement list so netlink_gen_stmt_stateful_assert
throws the expected parser error.

Fixes: dccab4f646b4 ("parser_bison: consolidate stmt_expr rule")
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: validate chain max length

commit 08925ba0daf19753df933fed69f4572a7c9d3d47 upstream.

The includes test files cause:
BUG: chain is too large (257, 256 max)nft: netlink.c:418: netlink_gen_chain: Assertion `0' failed.

Error out in evaluation step instead.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser: tcpopt: fix tcp option parsing with NUM + length field

commit 59a33d08ab3a75b2ae370b6816942793f49fa8db upstream.

tcp option 254 length ge 4

... will segfault.
The crash bug is that tcpopt_expr_alloc() can return NULL if we cannot
find a suitable template for the requested kind + field combination,
so add the needed error handling in the bison parser.

However, we can handle this. NOP and EOL have templates, all other
options (known or unknown) must also have a length field.

So also add a fallback template to handle both kind and length, even
if only a numeric option is given that nft doesn't recognize.

Don't bother with output, above will be printed via raw syntax, i.e.
tcp option @254,8,8 >= 4.

Fixes: 24d8da308342 ("tcpopt: allow to check for presence of any tcp option")
Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: reject set definition with no key

commit 1949a63215b423b914d3a7a9de7511cb48af3c09 upstream.

tests/shell/testcases/bogons/nft-f/set_definition_with_no_key_assert
BUG: unhandled key type 2
nft: src/intervals.c:59: setelem_expr_to_range: Assertion `0' failed.

This patch adds a new unit tests/shell courtesy of Florian Westphal.

Fixes: 3975430b12d9 ("src: expand table command before evaluation")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

monitor: add support for concatenated set ranges

commit 0d9392eef5f2c79ac7c19f59754a0aee574b5617 upstream.

monitor is missing concatenated set ranges support.

Fixes: 8ac2f3b2fca3 ("src: Add support for concatenated set ranges")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: fix double free on dtype release

commit ee73e9dcd46dc5a1fe3be7caa8b9323819e394b8 upstream.

We release ->dtype twice, will either segfault or assert
on dtype->refcount != 0 check in datatype_free().

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: catch implicit map expressions without known datatype

commit 70054e6e1c879475779d82d19f450ac521700fe4 upstream.

mapping_With_invalid_datatype_crash:1:8-65: Error: Implicit map expression without known datatype
bla to tcp dport map { 80 : 1.1.1.1 . 8001, 81 : 2.2.2.2 . 9001 } bla
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: reject attempt to update a set

commit 5f43ea807bb0f5b30f332c2c96f13e33c9243d22 upstream.

This will crash as set->data is NULL, so check that SET_REF is pointing
to a map:

Error: candidates_ipv4 is not a map
tcp dport 10003 ip saddr . tcp dport @candidates_ipv4 add @candidates_ipv4 { ip saddr . 10 :0004 timeout 1s }
~~~~~~~~~~~~~~~~

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: error out if basetypes are different

commit 45a4d4434742b425d019623812f2cce293033cdf upstream.

prefer
binop_with_different_basetype_assert:3:29-35: Error: Binary operation (<<) with different base types (string vs integer) is not supported
oifname set ip9dscp << 26 | 0x10
^^^^^^^~~~~~~
to assertion failure.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: guard against NULL basetype

commit 3671c48970031e617ee713b79caf8ef0a1b096c2 upstream.

i->dtype->basetype can be NULL.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: handle invalid mapping expressions gracefully

commit 778e4e113673c2a4daa798634c554c40f2808276 upstream.

Before:
BUG: invalid mapping expression binop
nft: src/evaluate.c:2027: expr_evaluate_map: Assertion `0' failed.

After:
tests/shell/testcases/bogons/nft-f/invalid_mapping_expr_binop_assert:1:22-25: Error: invalid mapping expression binop
xy mame ip saddr map h& p p
~~~~~~~~ ^^^^
Signed-off-by: Florian Westphal <fw@strlen.de>

json: deal appropriately with multidevice in chain

commit 4dfb5b2010917da3f9f11c83931069931a7d6fb3 upstream.

Chain device support is broken in JSON: listing does not include devices
and parser only deals with one single device.

Use existing json_parse_flowtable_devs() function, rename it to
json_parse_devs() to parse the device array.

Use the dev_array that contains the device names (as string) for the
listing.

Update incorrect .json-nft files in tests/shell.

Fixes: 3fdc7541fba0 ("src: add multidevice support for netdev chain")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: clone unary expression datatype to deal with dynamic datatype

upstream faa6908fad6053ae9549c45b88d0402cc69cf1ed commit.

When allocating a unary expression, clone the datatype to deal with
dynamic datatypes.

Fixes: 6b01bb9ff798 ("datatype: concat expression only releases dynamically allocated datatype")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: fix rule replacement with anon sets

commit 256904b1ded6314974dddc75726149f7b19d33f4 upstream.

nft replace rule t c handle 3 'jhash ip protocol . ip saddr mod 170 vmap { 0-94 : goto wan1, 95-169 : goto wan2, 170-269 }"'
BUG: unhandled op 2
nft: src/evaluate.c:1748: interval_set_eval: Assertion `0' failed.

Fixes: 81e36530fcac ("src: replace interval segment tree overlap and automerge")
Reported-by: Tino Reichardt <milky-netfilter@mcmilk.de>
Signed-off-by: Florian Westphal <fw@strlen.de>