This patch allows you to specify an interval of IP addresses in maps.
table ip x {
chain y {
type nat hook postrouting priority srcnat; policy accept;
snat ip interval to ip saddr map { 10.141.11.4 : 192.168.2.2-192.168.2.4 }
}
}
The example above performs SNAT on packets that come from 10.141.11.4
to an interval of IP addresses from 192.168.2.2 to 192.168.2.4 (both
included).
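To make "both included" concrete, here is a minimal Python sketch (not nft code; expand_range is a hypothetical helper) that enumerates the addresses covered by such an interval. It says nothing about how the kernel picks one of them for a given packet.

```python
import ipaddress

def expand_range(spec):
    """Return every address in an inclusive 'first-last' IPv4 range."""
    first_text, last_text = spec.split("-")
    first = int(ipaddress.IPv4Address(first_text))
    last = int(ipaddress.IPv4Address(last_text))
    if first > last:
        raise ValueError("range start is above range end")
    # range() excludes its upper bound, so add 1 to keep the end address.
    return [str(ipaddress.IPv4Address(value)) for value in range(first, last + 1)]

snat_pool = expand_range("192.168.2.2-192.168.2.4")
print(snat_pool)
```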
You can also combine this with dynamic maps:
table ip x {
map y {
type ipv4_addr : interval ipv4_addr
flags interval
elements = { 10.141.10.0/24 : 192.168.2.2-192.168.2.4 }
}
chain y {
type nat hook postrouting priority srcnat; policy accept;
snat ip interval to ip saddr map @y
}
}
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Matt Turner [Tue, 7 Apr 2020 19:05:08 +0000 (12:05 -0700)]
build: Include generated man pages in dist tarball
Most projects ship pre-generated man pages in the distribution tarball
so that builders don't need the documentation tools installed, similar
to how bison-generated sources are included.
To do this, we conditionalize the presence check of a2x on whether nft.8
already exists in the source directory, as it would exist if included in
the distribution tarball.
Secondly, we move the 'if BUILD_MAN' conditional to around the man page
generation rules. This ensures that the man pages are unconditionally
installed. Also only add the man pages to CLEANFILES if their generation
is enabled.
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
src: Set NFT_SET_CONCAT flag for sets with concatenated ranges
Pablo reports that nft, after commit 8ac2f3b2fca3 ("src: Add support
for concatenated set ranges"), crashes with older kernels (< 5.6)
without support for concatenated set ranges: those sets will be sent
to the kernel, which adds them without notion of the fact that
different concatenated fields are actually included, and nft crashes
while trying to list this kind of malformed concatenation.
Use the NFT_SET_CONCAT flag introduced by kernel commit ef516e8625dd
("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag") when
sets including concatenated ranges are sent to the kernel, so that
older kernels (with no knowledge of this flag itself) will refuse set
creation.
Note that, in expr_evaluate_set(), we have to check for the presence
of the flag, also on empty sets that might carry it in context data,
and actually set it in the actual set flags.
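The effect of the flag on older kernels can be modelled in a few lines of Python. This is only a sketch: the flag values are written as I recall them from the kernel's nf_tables UAPI header and should be treated as assumptions, and kernel_accepts() is a hypothetical stand-in for the kernel's check that a set carries no unknown flags.

```python
# Assumed flag values (recalled from the nf_tables UAPI header, not verified here).
NFT_SET_ANONYMOUS = 0x01
NFT_SET_CONSTANT = 0x02
NFT_SET_INTERVAL = 0x04
NFT_SET_MAP = 0x08
NFT_SET_TIMEOUT = 0x10
NFT_SET_EVAL = 0x20
NFT_SET_OBJECT = 0x40
NFT_SET_CONCAT = 0x80

# Flags a kernel without NFT_SET_CONCAT knows about, and the same plus the new flag.
OLD_KERNEL_FLAGS = (NFT_SET_ANONYMOUS | NFT_SET_CONSTANT | NFT_SET_INTERVAL |
                    NFT_SET_MAP | NFT_SET_TIMEOUT | NFT_SET_EVAL | NFT_SET_OBJECT)
NEW_KERNEL_FLAGS = OLD_KERNEL_FLAGS | NFT_SET_CONCAT

def kernel_accepts(set_flags, known_flags):
    """Hypothetical model: refuse set creation if any unknown flag bit is set."""
    return (set_flags & ~known_flags) == 0

concat_range_set = NFT_SET_INTERVAL | NFT_SET_CONCAT
print(kernel_accepts(concat_range_set, OLD_KERNEL_FLAGS))
print(kernel_accepts(concat_range_set, NEW_KERNEL_FLAGS))
```

With the flag set, the older kernel refuses the set up front instead of storing a malformed concatenation that later crashes the listing path.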
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Segfault on error reporting when intervals overlap.
ip saddr vmap {
10.0.1.0-10.0.1.255 : accept,
10.0.1.1-10.0.2.255 : drop
}
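Why the two elements above overlap can be checked with a short Python sketch. This is not the segtree code; parse_range() and ranges_overlap() are hypothetical helpers that only illustrate the interval comparison.

```python
import ipaddress

def parse_range(spec):
    """Turn 'first-last' into a pair of integers (both bounds inclusive)."""
    first, last = spec.split("-")
    return int(ipaddress.IPv4Address(first)), int(ipaddress.IPv4Address(last))

def ranges_overlap(a, b):
    """Two inclusive ranges overlap if each one starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]

accept_range = parse_range("10.0.1.0-10.0.1.255")
drop_range = parse_range("10.0.1.1-10.0.2.255")
print(ranges_overlap(accept_range, drop_range))
```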
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1415
Fixes: 4d6ad0f310d6 ("segtree: check for overlapping elements at insertion")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
concat: provide proper dtype when parsing typeof udata
Pablo reports the following listing bug:
table ip foo {
map whitelist {
typeof ip saddr . ip daddr : meta mark
elements = { 0x0 [invalid type] . 0x0 [invalid type] : 0x00000001,
0x0 [invalid type] . 0x0 [invalid type] : 0x00000002 }
}
}
The problem is that concat provided an 'invalid' dtype.
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Luis Ressel [Thu, 26 Mar 2020 15:22:29 +0000 (15:22 +0000)]
netlink: Show the handles of unknown rules in "nft monitor trace"
When "nft monitor trace" doesn't know a rule (because it was only added
to the ruleset after nft was invoked), that rule is silently omitted in
the trace output, which can come as a surprise when debugging issues.
Instead, we can at least show the information we got via netlink, i.e.
the family, table and chain name, rule handle and verdict.
Signed-off-by: Luis Ressel <aranea@aixah.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stefano Brivio [Thu, 5 Mar 2020 20:34:11 +0000 (21:34 +0100)]
tests: shell: Introduce test for insertion of overlapping and non-overlapping ranges
Insertion of overlapping ranges should return success only if the new
elements are identical to existing ones, or, for concatenated ranges,
if the new element is less specific (in all its fields) than any
existing one.
Note that, in case the range is identical to an existing one, insertion
won't actually be performed, but no error will be returned either on
'add element'.
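For plain (non-concatenated) ranges, the expected outcomes can be sketched in Python as follows. check_insert() is a hypothetical helper, not the test script or set_overlap() itself, and it leaves out the "less specific" rule for concatenated ranges.

```python
def check_insert(existing, new):
    """Classify adding inclusive range `new` to the list of ranges `existing`."""
    for start, end in existing:
        if (start, end) == new:
            return "noop"    # identical: success, but nothing is inserted
        if new[0] <= end and start <= new[1]:
            return "error"   # partial or entire overlap with another element
    return "insert"          # no overlap at all: success, element is added

elements = [(10, 20), (30, 40)]
print(check_insert(elements, (30, 40)))
print(check_insert(elements, (15, 25)))
print(check_insert(elements, (21, 29)))
```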
This was inspired by a failing case reported by Phil Sutter (where
concatenated overlapping ranges would fail insertion silently) and is
fixed by kernel series with subject:
nftables: Consistently report partial and entire set overlaps
With that series, these tests now pass also if the call to set_overlap()
on insertion is skipped. Partial or entire overlapping was already
detected by the kernel for concatenated ranges (nft_set_pipapo) from
the beginning, and that series makes the nft_set_rbtree implementation
consistent in terms of detection and reporting. Without that, overlap
checks are performed by nft but not guaranteed by the kernel.
However, we can't just drop set_overlap() now, as we need to preserve
compatibility with older kernels.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: add range specified flag setting (missing NF_NAT_RANGE_PROTO_SPECIFIED)
Sergey reports:
With nf_tables it is not possible to use a port range for masquerading.
The masquerade statement has the option "to [:port-port]", which has no
effect on translation behavior. But it must change the source port of the
packet to one from the ":port-port" range.
Address translation works fine, but the source ports do not belong to the
specified range.
I see in similar source code (i.e. nft_redir.c, nft_nat.c) that
the NF_NAT_RANGE_PROTO_SPECIFIED flag is set. After adding this, I
repeated the test on a kernel with this patch and got this dump:
Phil Sutter [Fri, 6 Mar 2020 15:15:48 +0000 (16:15 +0100)]
parser_json: Support ranges in concat expressions
Duplicate commit 8ac2f3b2fca38's changes to bison parser into JSON
parser by introducing a new context flag signalling we're parsing
concatenated expressions.
Fixes: 8ac2f3b2fca38 ("src: Add support for concatenated set ranges")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Eric Garver <eric@garver.life>
Jeremy Sowden [Tue, 3 Mar 2020 09:48:31 +0000 (09:48 +0000)]
evaluate: no need to swap byte-order for values of fewer than 16 bits.
Endianness is not meaningful for objects smaller than 2 bytes and the
byte-order conversions are no-ops in the kernel, so just update the
expression as if it were constant.
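The point about single-byte values is easy to verify outside nft. The Python sketch below is only an illustration; it encodes one value in both byte orders and compares the results.

```python
# A one-byte value has the same representation in either byte order.
small = 0x2A
one_byte_same = small.to_bytes(1, "big") == small.to_bytes(1, "little")

# A two-byte value does not, which is where conversions start to matter.
wide = 0x1234
two_byte_same = wide.to_bytes(2, "big") == wide.to_bytes(2, "little")

print(one_byte_same, two_byte_same)
```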
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
This patch extends the basechain definition to allow users to specify
the offload flag. This flag enables hardware offload if your driver
supports it.
# cat file.nft
table netdev x {
chain y {
type filter hook ingress device eth0 priority 10; flags offload;
}
}
# nft -f file.nft
Note: You have to enable offload via ethtool:
# ethtool -K eth0 hw-tc-offload on
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 24 Feb 2020 00:03:23 +0000 (01:03 +0100)]
src: allow nat maps containing both ip(6) address and port
nft will now be able to handle
map destinations {
type ipv4_addr . inet_service : ipv4_addr . inet_service
}
chain f {
dnat to ip daddr . tcp dport map @destinations
}
Something like this won't work though:
meta l4proto tcp dnat ip6 to numgen inc mod 4 map { 0 : dead::f001 . 8080, ..
as we lack the type info to properly dissect "dead::f001" as an ipv6
address.
For the named map case, this info is available in the map
definition, but for the anon case we'd need to resort to guesswork.
Support is added by peeking into the map definition when evaluating
a nat statement with a map.
Right now, when a map is provided as address, we will only check that
the mapped-to data type matches the expected size (of an ipv4 or ipv6
address).
After this patch, if the mapped-to type is a concatenation, it will
take a peek at the individual concat expressions. If it's a combination
of address and service, nft will translate this so that the kernel nat
expression looks at the returned register that would store the
inet_service part of the octet soup returned from the lookup expression.
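As a rough picture of that "octet soup", the Python sketch below splits the data returned for an ipv4_addr . inet_service mapping into its two parts. The layout is an assumption for illustration only (each field padded to a multiple of 4 bytes, values in network byte order), and split_addr_port() is a hypothetical helper, not part of nft.

```python
import ipaddress
import struct

REGISTER_BYTES = 4  # assumption: concatenated fields are padded to 32-bit units

def split_addr_port(data):
    """Split assumed lookup data into (IPv4 address, port)."""
    address = ipaddress.IPv4Address(data[:REGISTER_BYTES])
    # The port occupies the first 2 bytes of the next 4-byte unit.
    (port,) = struct.unpack("!H", data[REGISTER_BYTES:REGISTER_BYTES + 2])
    return str(address), port

# Build sample data: 4 address bytes, 2 port bytes, 2 padding bytes.
sample = ipaddress.IPv4Address("192.168.2.10").packed + struct.pack("!H", 8080) + b"\x00\x00"
print(split_addr_port(sample))
```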
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 24 Feb 2020 00:03:22 +0000 (01:03 +0100)]
evaluate: add two new helpers
In order to support 'dnat to ip saddr map @foo', where @foo returns
both an address and an inet_service, we will need to peek into the map
and process the concatenation's sub-expressions.
Add two helpers for this; they will be used in follow-up patches.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 20 Feb 2020 11:58:40 +0000 (12:58 +0100)]
evaluate: print correct statement name on family mismatch
nft add rule inet filter c ip daddr 1.2.3.4 dnat ip6 to f00::1
Error: conflicting protocols specified: ip vs. unknown. You must specify ip or ip6 family in tproxy statement
Should be: ... "in nat statement".
Fixes: fbe27464dee4588d90 ("src: add nat support for the inet family")
Signed-off-by: Florian Westphal <fw@strlen.de>
mnl: do not use expr->identifier to fetch device name
This string might not be nul-terminated, resulting in spurious errors
when adding netdev chains.
Fixes: 3fdc7541fba0 ("src: add multidevice support for netdev chain")
Fixes: 92911b362e90 ("src: add support to add flowtables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
==1135425== 9 bytes in 1 blocks are definitely lost in loss record 1 of 1
==1135425== at 0x483577F: malloc (vg_replace_malloc.c:309)
==1135425== by 0x4BE846A: strdup (strdup.c:42)
==1135425== by 0x48A5EDD: xstrdup (utils.c:75)
==1135425== by 0x48C9A20: nft_lex (scanner.l:640)
==1135425== by 0x48BC1A4: nft_parse (parser_bison.c:5682)
==1135425== by 0x48AC336: nft_parse_bison_buffer (libnftables.c:375)
==1135425== by 0x48AC336: nft_run_cmd_from_buffer (libnftables.c:443)
==1135425== by 0x10A707: main (main.c:384)
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stefano Brivio [Fri, 14 Feb 2020 15:27:25 +0000 (16:27 +0100)]
tests: Introduce test for set with concatenated ranges
This test checks that set elements can be added, deleted, that
addition and deletion are refused when appropriate, that entries
time out properly, and that they can be fetched by matching values
in the given ranges.
v5:
- speed this up by performing the timeout test for one single
permutation (Phil Sutter), by decreasing the number of
permutations from 96 to 12 if this is invoked by run-tests.sh
(Pablo Neira Ayuso) and by combining some commands into single
nft calls where possible: with dash 0.5.8 on AMD Epyc 7351 the
test now takes 1.8s instead of 82.5s
- renumber test to 0043, 0042 was added meanwhile
v4: No changes
v3:
- renumber test to 0042, 0041 was added meanwhile
v2:
- actually check an IPv6 prefix, instead of specifying everything
as explicit ranges in ELEMS_ipv6_addr
- renumber test to 0041, 0038 already exists
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
# nft delete rule ip y z handle 7
Error: Could not process rule: No such file or directory
delete rule ip y z handle 7
^
# nft delete rule ip x z handle 7
Error: Could not process rule: No such file or directory
delete rule ip x z handle 7
^
# nft delete rule ip x x handle 7
Error: Could not process rule: No such file or directory
delete rule ip x x handle 7
^
# nft replace rule x y handle 10 ip saddr 1.1.1.2 counter
Error: Could not process rule: No such file or directory
replace rule x y handle 10 ip saddr 1.1.1.2 counter
^^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 13 Feb 2020 11:45:55 +0000 (12:45 +0100)]
src: maps: update data expression dtype based on set
What we want:
- update @sticky-set-svc-M53CN2XYVUHRQ7UB { ip saddr : 0x00000002 }
what we got:
+ update @sticky-set-svc-M53CN2XYVUHRQ7UB { ip saddr : 0x2000000 [invalid type] }
Phil Sutter [Thu, 6 Feb 2020 11:31:56 +0000 (12:31 +0100)]
scanner: Extend asteriskstring definition
Also accept escaped asterisks mid-string and as the only character.
Especially the latter will help when translating from iptables where
asterisk has no special meaning.
Jeremy Sowden [Mon, 3 Feb 2020 11:20:20 +0000 (11:20 +0000)]
evaluate: change shift byte-order to host-endian.
The byte-order of the righthand operands of the right-shifts generated
for payload and exthdr expressions is big-endian. However, all right
operands should be host-endian. Since evaluation of the shift binop
will insert a byte-order conversion to enforce this, change the
endianness in order to avoid the extra operation.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jeremy Sowden [Mon, 3 Feb 2020 11:20:18 +0000 (11:20 +0000)]
parser: add parenthesized statement expressions.
Primary and primary RHS expressions support parenthesized basic and
basic RHS expressions. However, primary statement expressions do not
support parenthesized basic statement expressions. Add them.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stefano Brivio [Thu, 30 Jan 2020 00:16:57 +0000 (01:16 +0100)]
src: Add support for concatenated set ranges
After exporting field lengths via NFTNL_SET_DESC_CONCAT attributes,
we now need to adjust parsing of user input and generation of
netlink key data to complete support for concatenation of set
ranges.
Instead of using separate elements for start and end of a range,
denoting the end element by the NFT_SET_ELEM_INTERVAL_END flag,
as it's currently done for ranges without concatenation, we'll use
the new attribute NFTNL_SET_ELEM_KEY_END as suggested by Pablo. It
behaves in the same way as NFTNL_SET_ELEM_KEY, but it indicates
that the included key represents the upper bound of a range.
For example, "packets with an IPv4 address between 192.0.2.0 and
192.0.2.42, with destination port between 22 and 25", needs to be
expressed as a single element with two keys:
- adjust the lexer rules to allow multiton expressions as elements
of a concatenation. As wildcards are not allowed (semantics would
be ambiguous), exclude wildcard expressions from the set of
possible multiton expressions, and allow them directly where
needed. Concatenations now admit prefixes and ranges
- generate, for each element in a range concatenation, a second key
attribute, that includes the upper bound for the range
- also expand prefixes and non-ranged values in the concatenation
to ranges: given a set with interval and concatenation support,
the kernel has no way to tell which elements are ranged, so they
all need to be. For example, 192.0.2.0 . 192.0.2.9 : 1024 is
sent as:
- aggregate ranges when elements received by the kernel represent
concatenated ranges, see concat_range_aggregate()
- perform a few minor adjustments where interval expressions
are already handled: we have intervals in these sets, but
the set specification isn't just an interval, so we can't
just aggregate and deaggregate interval ranges linearly
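The idea of one element carrying a start key and an end key, with prefixes and single values expanded to ranges, can be sketched in Python. This is an illustration only: field_bounds() and concat_keys() are hypothetical helpers, and the strings they return are not the netlink encoding.

```python
import ipaddress

def field_bounds(text):
    """Return (low, high) for a single value, an 'a-b' range, or a prefix."""
    if "-" in text:
        low, high = text.split("-")
        return low, high
    if "/" in text:
        network = ipaddress.ip_network(text, strict=True)
        return str(network.network_address), str(network.broadcast_address)
    # A non-ranged value becomes a range whose two bounds are equal.
    return text, text

def concat_keys(fields):
    """Build the start key and the end key of one concatenated element."""
    bounds = [field_bounds(field) for field in fields]
    key = " . ".join(low for low, _ in bounds)
    key_end = " . ".join(high for _, high in bounds)
    return key, key_end

print(concat_keys(["192.0.2.0-192.0.2.42", "22-25"]))
print(concat_keys(["192.0.2.0/24", "1024"]))
```

The first call corresponds to the address and port example given earlier in this message: the lower bounds of both fields form one key and the upper bounds form the other.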
v4: No changes
v3:
- rework to use a separate key for closing element of range instead of
a separate element with EXPR_F_INTERVAL_END set (Pablo Neira Ayuso)
v2:
- reworked netlink_gen_concat_data(), moved loop body to a new function,
netlink_gen_concat_data_expr() (Phil Sutter)
- dropped repeated pattern in bison file, replaced by a new helper,
compound_expr_alloc_or_add() (Phil Sutter)
- added set_is_nonconcat_range() helper (Phil Sutter)
- in expr_evaluate_set(), we need to set NFT_SET_SUBKEY also on empty
sets where the set in the context already has the flag
- dropped additional 'end' parameter from netlink_gen_data(),
temporarily set EXPR_F_INTERVAL_END on expressions and use that from
netlink_gen_concat_data() to figure out we need to add the 'end'
element (Phil Sutter)
- replace range_mask_len() by a simplified version, as we don't need
to actually store the composing masks of a range (Phil Sutter)
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stefano Brivio [Thu, 30 Jan 2020 00:16:56 +0000 (01:16 +0100)]
src: Add support for NFTNL_SET_DESC_CONCAT
To support arbitrary range concatenations, the kernel needs to know
how long each field in the concatenation is. The new libnftnl
NFTNL_SET_DESC_CONCAT set attribute describes this as an array of
lengths, in bytes, of concatenated fields.
While evaluating concatenated expressions, export the datatype size
into the new field_len array, and hand the data over via libnftnl.
Similarly, when data is passed back from libnftnl, parse it into
the set description.
When set data is cloned, we now need to copy the additional fields
in set_clone(), too.
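As an illustration of the "array of lengths, in bytes" described above, the Python sketch below derives such an array from a concatenated type. The DATATYPE_BITS table and field_lengths() are hypothetical and cover only a few datatypes; the actual code works on datatype structures, not on strings.

```python
# Sizes in bits of a few nft datatypes (illustrative subset).
DATATYPE_BITS = {
    "ipv4_addr": 32,
    "ipv6_addr": 128,
    "ether_addr": 48,
    "inet_service": 16,
}

def field_lengths(concat_type):
    """Return the length in bytes of each field of a 'a . b . c' type."""
    names = [name.strip() for name in concat_type.split(".")]
    # Round each size up to whole bytes.
    return [(DATATYPE_BITS[name] + 7) // 8 for name in names]

print(field_lengths("ipv4_addr . inet_service"))
print(field_lengths("ipv6_addr . ether_addr . inet_service"))
```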
This change depends on the libnftnl patch with title:
set: Add support for NFTA_SET_DESC_CONCAT attributes
v4: No changes
v3: Rework to use set description data instead of a stand-alone
attribute
v2: No changes
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jeremy Sowden [Sun, 19 Jan 2020 22:57:09 +0000 (22:57 +0000)]
netlink: add support for handling shift expressions.
The kernel supports bitwise shift operations, so add support to the
netlink linearization and delinearization code. The number of bits (the
righthand operand) is expected to be a 32-bit value in host endianness.
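A small Python sketch of both halves of this: what a right shift is useful for, and what "a 32-bit value in host endianness" looks like on the wire. dscp_from_tos() and shift_operand() are hypothetical helpers for illustration; they are not the netlink linearization code.

```python
import struct
import sys

def dscp_from_tos(tos):
    """Right-shift past the two ECN bits, then mask the six DSCP bits."""
    return (tos >> 2) & 0x3F

def shift_operand(bits):
    """Pack a shift amount as a 32-bit value in host byte order."""
    return struct.pack("=I", bits)

print(dscp_from_tos(0xB8))
packed = shift_operand(2)
# Decoding with the host's own byte order gives back the original amount.
print(len(packed), int.from_bytes(packed, sys.byteorder))
```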
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>