Florian Westphal [Thu, 11 Mar 2021 13:23:02 +0000 (14:23 +0100)]
scanner: ct: move to own scope
This allows moving multiple ct specific keywords out of INITIAL scope.
Next few patches follow same pattern:
1. add a scope_close_XXX rule
2. add a SCANSTATE_XXX & make flex switch to it when
encountering XXX keyword
3. make bison leave SCANSTATE_XXX when it has seen the complete
expression.
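The three steps above can be pictured with a small sketch. It is not flex or bison code: the Scanner class and the keyword sets are hypothetical and only mirror the pattern, in which a keyword such as "original" is a token only while the ct scope is active, and the parser side closes the scope once the expression is complete.

```python
# Hypothetical illustration of scoped keyword scanning (not nftables code).
INITIAL_KEYWORDS = {"ct", "meta"}
# Keywords that are only recognized while the "ct" scope is active.
SCOPE_KEYWORDS = {"ct": {"original", "reply", "direction"}}


class Scanner:
    def __init__(self):
        self.scopes = ["INITIAL"]  # stack of active scanner states

    def token(self, word):
        scope = self.scopes[-1]
        if scope in SCOPE_KEYWORDS and word in SCOPE_KEYWORDS[scope]:
            return ("KEYWORD", word)
        if word in INITIAL_KEYWORDS:
            # Step 2: seeing the keyword switches to its scope.
            if word in SCOPE_KEYWORDS:
                self.scopes.append(word)
            return ("KEYWORD", word)
        return ("STRING", word)

    def close_scope(self, name):
        # Steps 1 and 3: the parser leaves the scope once the
        # expression is complete.
        assert self.scopes[-1] == name
        self.scopes.pop()


def scan_demo():
    s = Scanner()
    inside = [s.token(w) for w in ["ct", "original"]]
    s.close_scope("ct")
    outside = s.token("original")
    return inside, outside
```

In this sketch "original" is a keyword right after "ct" and an ordinary string again once the scope has been closed.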
nftables: xt: fix misprint in nft_xt_compatible_revision
The rev variable is used here instead of opt obviously by mistake.
Please see iptables:nft_compatible_revision() for an example how it
should be.
This breaks revision compatibility checks completely when reading
compat-target rules from nft utility. That's why nftables can't work on
"old" kernels which don't support new revisions. That's a problem for
containers.
E.g.: 0 and 1 is supported but not 2:
https://git.sw.ru/projects/VZS/repos/vzkernel/browse/net/netfilter/xt_nat.c#111
Reproducing the problem on Virtuozzo 7 kernel
3.10.0-1160.11.1.vz7.172.18 in centos 8 container:
iptables-nft -t nat -N TEST
iptables-nft -t nat -A TEST -j DNAT --to-destination 172.19.0.2
nft list ruleset > nft.ruleset
nft -f - < nft.ruleset
#/dev/stdin:19:67-81: Error: Range has zero or negative size
# meta l4proto tcp tcp dport 81 counter packets 0 bytes 0 dnat to 3.0.0.0-0.0.0.0
# ^^^^^^^^^^^^^^^
But nft reads this as rev 2 format (nf_nat_range2) which does not have
rangesize, and thus flags 3 is treated as ip 3.0.0.0, which is wrong and
can't be restored later.
(Should probably be the same on Centos 7 kernel 3.10.0-1160.11.1)
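The misreading described above can be reproduced with a byte-level sketch. The two struct layouts below are simplified assumptions made for illustration (little-endian host, only the leading fields of each revision), not a copy of the kernel headers, and misread_as_rev2() is a hypothetical helper.

```python
import socket
import struct

# Simplified layouts, stated as assumptions for illustration:
#   rev 0/1: rangesize, flags, min_ip, max_ip, min_port, max_port
#   rev 2:   flags, min_addr (16 bytes), max_addr (16 bytes), ...
OLD_FMT = "<II4s4sHH"
NEW_FMT = "<I16s16s"


def misread_as_rev2(target_ip, flags=3):
    """Pack a rev 0/1 style range and decode it with the rev 2 layout."""
    addr = socket.inet_aton(target_ip)
    old = struct.pack(OLD_FMT, 1, flags, addr, addr, 0, 0)
    # Pad with zeros so the larger rev 2 layout can be read from the buffer.
    buf = old.ljust(struct.calcsize(NEW_FMT), b"\0")
    _flags, min_addr, max_addr = struct.unpack(NEW_FMT, buf)
    return socket.inet_ntoa(min_addr[:4]), socket.inet_ntoa(max_addr[:4])
```

Under these assumptions, misread_as_rev2("172.19.0.2") yields the pair ("3.0.0.0", "0.0.0.0"), which matches the range in the error message: the flags word lands where the rev 2 layout expects the minimum address.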
Fixes: fbc0768cb696 ("nftables: xt: don't use hard-coded AF_INET")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 4 Feb 2021 01:20:23 +0000 (02:20 +0100)]
mnl: Set NFTNL_SET_DATA_TYPE before dumping set elements
In combination with libnftnl's commit "set_elem: Fix printing of verdict
map elements", this adds the vmap target to netlink dumps. Adjust dumps
in tests/py accordingly.
Simon Ruderich [Sun, 7 Mar 2021 09:51:35 +0000 (10:51 +0100)]
doc: remove duplicate tables in synproxy example
The "outcome ruleset" is the same as the two tables in the example.
Don't duplicate this information which just wastes space in the
documentation and can confuse the reader (it took me a while to realize
the tables are the same).
In addition, use the same table name for both tables to make it clear
that they can be the same. They will be merged in the resulting ruleset.
Signed-off-by: Simon Ruderich <simon@ruderich.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nft_mnl_socket_reopen() was introduced to deal with the EINTR case.
By reopening the netlink socket, pending netlink messages that are part of
a stale netlink dump are implicitly dropped. This patch replaces the
nft_mnl_socket_reopen() strategy by pulling out all of the remaining
netlink messages to restart in a clean state.
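As a rough analogy for the new strategy, the sketch below drains a socket instead of closing and reopening it. A local datagram socket pair stands in for the netlink socket, and drain() is a hypothetical helper, not the function used in nft.

```python
import socket


def drain(sock, bufsize=4096):
    """Read and discard every message currently queued on sock."""
    dropped = 0
    sock.setblocking(False)
    try:
        while True:
            try:
                sock.recv(bufsize)
            except BlockingIOError:
                break  # queue is empty
            dropped += 1
    finally:
        sock.setblocking(True)
    return dropped


def drain_demo():
    # A local datagram socket pair stands in for the netlink socket.
    kernel, nft = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        for part in (b"dump-1", b"dump-2", b"dump-3"):
            kernel.send(part)
        return drain(nft)
    finally:
        kernel.close()
        nft.close()
```

The socket stays open throughout, so anything tied to its lifetime is unaffected by the cleanup.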
This is implicitly fixing up a bug in the table ownership support, which
assumes that the netlink socket remains open until nft_ctx_free() is
invoked.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add new flag to allow userspace process to own tables: Tables that have
an owner can only be updated/destroyed by the owner. The table is
destroyed either if the owner process calls nft_ctx_free() or owner
process is terminated (implicit table release).
The ruleset listing includes the program name that owns the table:
nft> list ruleset
table ip x { # progname nft
flags owner
chain y {
type filter hook input priority filter; policy accept;
counter packets 1 bytes 309
}
}
Original code to pretty print the netlink portID to program name has
been extracted from the conntrack userspace utility.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: 719e44277f8e ("main: use one data-structure to initialize getopt_long(3) arguments and help.")
Cc: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Štěpán Němec <snemec@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 17 Feb 2021 11:38:42 +0000 (12:38 +0100)]
monitor: Don't print newgen message with JSON output
Iff this should be printed, it must adhere to output format settings. In
its current form it breaks JSON syntax, so skip it for non-default
output formats.
Fixes: cb7e02f44d6a6 ("src: enable json echo output when reading native syntax")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Tue, 29 Dec 2020 18:33:44 +0000 (19:33 +0100)]
tests/py: Add a test sanitizer and fix its findings
This is just basic housekeeping:
- Remove duplicate tests in any of the *.t files
- Remove explicit output if equal to command itself in *.t files
- Remove duplicate payload records in any of the *.t.payload* files
- Remove stale payload records (for which no commands exist in the
respective *.t file)
- Remove duplicate/stale entries in any of the *.t.json files
In some cases, tests were added instead of removing a stale payload
record if it fit nicely into the sequence of tests.
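Two of the checks listed above can be sketched as follows. This is not the sanitizer added by the patch: sanitize() is a hypothetical helper, and it assumes that a test line has the form "rule;ok" with an optional third field holding the expected output, and that lines starting with "#", ":" or "*" are comments or directives.

```python
def sanitize(lines):
    """Return (line_number, finding) pairs for a list of *.t lines."""
    seen = set()
    findings = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        # Skip blanks and lines assumed to be comments or directives.
        if not text or text.startswith(("#", ":", "*")):
            continue
        fields = text.split(";")
        rule = fields[0]
        if rule in seen:
            findings.append((number, "duplicate test"))
        seen.add(rule)
        # Explicit output that merely repeats the command is redundant.
        if len(fields) == 3 and fields[2] == rule:
            findings.append((number, "output equals command"))
    return findings
```

Rules that themselves contain a semicolon would need a smarter split; the sketch ignores that case.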
Phil Sutter [Tue, 15 Dec 2020 12:52:47 +0000 (13:52 +0100)]
tests/py: Write dissenting payload into the right file
The testsuite supports diverging payloads depending on table family.
This is necessary since for some families, dependency matches are
created.
If a payload mismatch happens, record it into a "got"-file which matches
the family-specific payload file, not the common one. This eases use of
diff-tools a lot as the extra other families' payloads confuse the
tools.
payload: check icmp dependency before removing previous icmp expression
nft is too greedy when removing icmp dependencies.
'icmp code 1 type 2' did remove the type when printing.
Be more careful and check that the icmp type dependency of the
candidate expression (earlier icmp payload expression) has the same
type dependency as the new expression.
Reported-by: Eric Garver <eric@garver.life>
Reported-by: Michael Biebl <biebl@debian.org>
Tested-by: Eric Garver <eric@garver.life>
Fixes: d0f3b9eaab8d77e ("payload: auto-remove simple icmp/icmpv6 dependency expressions")
Signed-off-by: Florian Westphal <fw@strlen.de>
Phil Sutter [Tue, 26 Jan 2021 17:37:12 +0000 (18:37 +0100)]
reject: Unify inet, netdev and bridge delinearization
Postprocessing for inet family did not attempt to kill any existing
payload dependency, although it is perfectly fine to do so. The only
catch is not to abbreviate default code rejects, as that would drop
needed protocol info as a side effect. Since postprocessing is then
almost identical to that of bridge and netdev families, merge them.
While at it, extend tests/py/netdev/reject.t by a few more tests
taken from inet/reject.t so this covers icmpx rejects as well.
Cc: Jose M. Guisado Gomez <guigom@riseup.net>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Tue, 26 Jan 2021 16:06:33 +0000 (17:06 +0100)]
reject: Fix for missing dependencies in netdev family
Like with bridge family, rejecting with either icmp or icmpv6 must
create a dependency match on meta protocol. Upon delinearization, treat
netdev reject identical to bridge as well so no family info is lost.
This makes reject statement in netdev family fully symmetric so fix
the tests in tests/py/netdev/reject.t, adjust the related payload dumps
and add JSON equivalents which were missing altogether.
Fixes: 0c42a1f2a0cc5 ("evaluate: add netdev support for reject default")
Fixes: a51a0bec1f698 ("tests: py: add netdev folder and reject.t icmp cases")
Cc: Jose M. Guisado Gomez <guigom@riseup.net>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Florian Westphal [Tue, 26 Jan 2021 14:45:47 +0000 (15:45 +0100)]
json: ct: add missing test input
ERROR: did not find JSON equivalent for rule 'meta mark set ct original ip saddr . meta mark map { 1.1.1.1 . 0x00000014 : 0x0000001e }'
ERROR: did not find JSON equivalent for rule 'ct original ip saddr . meta mark { 1.1.1.1 . 0x00000014 }'
Florian Westphal [Thu, 21 Jan 2021 15:46:27 +0000 (16:46 +0100)]
json: icmp: move expected parts to json.output
Phil Sutter says:
In general, *.t.json files should contain JSON equivalents for rules as
they are *input* into nft. So we want them to be as close to the
introductory standard syntax comment as possible.
Undo earlier change and place the expected dependency added by
nft internals to json.output rather than icmp.t.json.
evaluate: disallow ct original {s,d}ddr from concatenations
Extend 8b043938e77b ("evaluate: disallow ct original {s,d}ddr from
maps") to cover concatenations too.
Error: specify either ip or ip6 for address matching
add rule x y meta mark set ct original saddr . meta mark map { 1.1.1.1 . 20 : 30 }
^^^^^^^^^^^^^^^^^
The old syntax for ct original saddr without either ip or ip6 results
in unknown key size, which breaks the listing. The old syntax is only
allowed in simple rules for backward compatibility.
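Why the family matters in a concatenation can be shown with a little arithmetic. The sketch below is a hypothetical model, not nft's evaluation code: it assumes a key made of a ct original address plus a 32-bit meta mark, and it treats the legacy syntax as an address of unknown length.

```python
# Address length in bytes per family; None models the legacy syntax
# without ip/ip6, where the length is not known.
ADDR_LEN = {"ip": 4, "ip6": 16, None: None}
MARK_LEN = 4  # meta mark is a 32-bit value


def concat_key_len(ct_family):
    """Total key length of 'ct original <family> saddr . meta mark'."""
    addr_len = ADDR_LEN[ct_family]
    if addr_len is None:
        raise ValueError("specify either ip or ip6 for address matching")
    return addr_len + MARK_LEN
```

With "ip" the key is 8 bytes and with "ip6" it is 20 bytes; without a family there is no length to add up, which is the situation the new error reports.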
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1489
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
test.nft:6:55-71: Error: specify either ip or ip6 for address matching
add rule ip mangle manout ct direction reply mark set ct original daddr map { $ext1_ip : 0x11, $ext2_ip : 0x12 }
^^^^^^^^^^^^^^^^^
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1489
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
I spent a bit of time debugging an issue with libedit support 9420423900a2 ("cli: add libedit support") that broke tests/shell.
This is the reproducer:
# nft -i << EOF
list ruleset
EOF
which makes rl_callback_read_char() loop forever on read() as shown by
strace. The rl_line_buffer variable does not accumulate the typed
characters as it should when redirecting the standard input for some
reason.
Given our interactive interface is fairly simple at this stage, switch
to use the readline() interface instead of rl_callback_read_char().
Florian Westphal [Sat, 12 Dec 2020 18:33:09 +0000 (19:33 +0100)]
nft: trace: print packet unconditionally
The kernel includes the packet dump once for each base hook.
This means that in case a table contained no matching rule(s),
the packet dump will be included in the base policy dump.
Simply move the packet dump request out of the switch statement
so the debug output shows current packet even with no matched rule.
The ICMP header has field values that only exist
for certain types.
Mark the icmp proto 'type' field as a nextheader field
and add a new th description to store the icmp type
dependency. This can later be re-used for other protocol
dependent definitions such as mptcp options -- which all share the
same tcp option number and have a special 4 bit marker inside the
mptcp option space that tells what the remaining option looks like.
parser_bison: double close_scope() call for implicit chains
Call close_scope() from chain_block_alloc only.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1485
Fixes: c330152b7f77 ("src: support for implicit chain bindings")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When --echo and --json are specified and native syntax is read, only the
last instruction is printed. This happens because the reference to the
json_echo is reassigned each time netlink_echo_callback is executed for
an instruction to be echoed.
Add an assignment check for json_echo to avoid reassigning it.
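The effect of the missing check can be mimicked in a few lines. The names below (Context, echo_callback_buggy, echo_callback_fixed) are hypothetical stand-ins, not the nft sources; the sketch only shows why assigning the reference on every callback keeps nothing but the last instruction.

```python
class Context:
    def __init__(self):
        self.json_echo = None  # shared list of echoed commands


def echo_callback_buggy(ctx, cmd):
    ctx.json_echo = []  # reassigned on every call: earlier output is lost
    ctx.json_echo.append(cmd)


def echo_callback_fixed(ctx, cmd):
    if ctx.json_echo is None:  # assign only once
        ctx.json_echo = []
    ctx.json_echo.append(cmd)


def run(callback, cmds):
    ctx = Context()
    for cmd in cmds:
        callback(ctx, cmd)
    return ctx.json_echo
```

Running both callbacks over the same two commands leaves one entry in the buggy case and two in the fixed one.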
Fixes: cb7e02f44d6a (src: enable json echo output when reading native syntax)
Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
mnl: reply netlink error message might be larger than MNL_SOCKET_BUFFER_SIZE
Netlink attribute maximum size is 65536 bytes (given nla_len is
16-bits). NFTA_SET_ELEM_LIST_ELEMENTS stores as many set elements as
possible that can fit into this netlink attribute.
Netlink messages with NLMSG_ERROR type originating from the kernel
contain the original netlink message as payload, so they might be larger
than 65536 bytes.
Add NFT_MNL_ACK_MAXSIZE which estimates the maximum Netlink header
coming as (error) reply from the kernel. This estimate is based on the
maximum netlink message size that nft sends from userspace.
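The size argument can be checked with simple arithmetic. The sketch below is not the definition of NFT_MNL_ACK_MAXSIZE; it only adds up header sizes (16 bytes for struct nlmsghdr, 4 bytes for the error code, 4 bytes for struct nfgenmsg) around an attribute at the limit quoted above, to show that an error reply carrying the full request no longer fits into a buffer sized for a single attribute.

```python
NLMSG_HDRLEN = 16     # sizeof(struct nlmsghdr)
NLMSGERR_ERRNO = 4    # the int error field of struct nlmsgerr
NFGENMSG_LEN = 4      # sizeof(struct nfgenmsg)
ATTR_LIMIT = 1 << 16  # attribute size limit as quoted in the text


def error_reply_len(request_len):
    """Length of an NLMSG_ERROR reply that echoes the whole request."""
    return NLMSG_HDRLEN + NLMSGERR_ERRNO + request_len


def demo():
    # A request whose single attribute is as large as the limit permits.
    request = NLMSG_HDRLEN + NFGENMSG_LEN + ATTR_LIMIT
    return request, error_reply_len(request)
```

The reply is always 20 bytes longer than the request it echoes, so the receive buffer has to be sized from the largest message userspace can send rather than from the attribute limit alone.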
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1464
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 2 Dec 2020 22:07:11 +0000 (23:07 +0100)]
json: Fix seqnum_to_json() functionality
Introduction of json_cmd_assoc_hash missed that by the time the hash
table insert happens, the struct cmd object's 'seqnum' field which is
used as key is not initialized yet. This doesn't happen until
nft_netlink() prepares the batch object which records the lowest seqnum.
Therefore push all json_cmd_assoc objects into a temporary list until
the first lookup happens. At this time, all referenced cmd objects have
their seqnum set and the list entries can be moved into the hash table
for fast lookups.
To expose such problems in the future, make json_events_cb() emit an
error message if the passed message has a handle but no assoc entry is
found for its seqnum.
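The deferred insertion can be sketched like this. CmdAssoc and AssocTable are hypothetical names and a dict stands in for the hash table; the point is only that entries wait in a list until the first lookup, by which time every command has its seqnum.

```python
class CmdAssoc:
    """Pairs a command with the JSON it came from (hypothetical names)."""

    def __init__(self, cmd, json_obj):
        self.cmd = cmd
        self.json_obj = json_obj


class AssocTable:
    def __init__(self):
        self.pending = []    # entries whose seqnum may not be set yet
        self.by_seqnum = {}  # hash table used once seqnums are known

    def add(self, assoc):
        self.pending.append(assoc)

    def lookup(self, seqnum):
        # First lookup: every cmd has its seqnum by now, so index them.
        for assoc in self.pending:
            self.by_seqnum[assoc.cmd["seqnum"]] = assoc
        self.pending.clear()
        return self.by_seqnum.get(seqnum)


def demo():
    table = AssocTable()
    cmd = {"seqnum": None}
    table.add(CmdAssoc(cmd, {"add": "table"}))
    cmd["seqnum"] = 7  # assigned later, when the batch is prepared
    found = table.lookup(7)
    missing = table.lookup(8)
    return found.json_obj, missing
```

A lookup that returns nothing is the case in which the caller would report an error, as described above.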
Fixes: 389a0e1edc89a ("json: echo: Speedup seqnum_to_json()")
Cc: Derek Dai <daiderek@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
src: enable json echo output when reading native syntax
This patch fixes a bug in which nft did not print any output when
specifying --echo and --json and reading nft native syntax.
This patch respects behavior when input is json, in which the output
would be the identical input plus the handles.
Adds a json_echo member inside struct nft_ctx to build and store the json object
containing the json command objects. The object is built using a mock
monitor to reuse monitor json code. This json object is only used when
we are sure we have not read json from input.
[ added json_alloc_echo() to compile without json support --pablo ]
Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=1446
Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net>
Tested-by: Eric Garver <eric@garver.life>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Fri, 20 Nov 2020 19:01:59 +0000 (20:01 +0100)]
json: echo: Speedup seqnum_to_json()
Derek Dai reports:
"If there are a lot of command in JSON node, seqnum_to_json() will slow
down application (eg: firewalld) dramatically since it iterate whole
command list every time."
He sent a patch implementing a lookup table, but we can do better: Speed
this up by introducing a hash table to store the struct json_cmd_assoc
objects in, taking their netlink sequence number as key.
Quickly tested restoring a ruleset containing about 19k rules:
Jeremy Sowden [Sun, 15 Nov 2020 15:11:47 +0000 (15:11 +0000)]
tests: py: update format of registers in bitwise payloads.
libnftnl has been changed to bring the format of registers in bitwise
dumps in line with those in other types of expression. Update the
expected output of Python test-cases.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 10 Nov 2020 12:07:49 +0000 (13:07 +0100)]
proto: Fix ARP header field ordering
In ARP header, destination ether address sits between source IP and
destination IP addresses. Enum arp_hdr_fields had this wrong, which
in turn caused wrong ordering of entries in proto_arp->templates. When
expanding a combined payload expression, code assumes that template
entries are ordered by header offset, therefore the destination ether
address match was printed as raw if an earlier field was matched as
well:
| arp saddr ip 192.168.1.1 arp daddr ether 3e:d1:3f:d6:12:0b