git.ipfire.org Git - thirdparty/nftables.git/log

parser: restrict relational rhs expression recursion

The relational expression allows recursion from both sides, so we cannot
tell which side the input is coming from. This patch adds a new expr_rhs
rule that specifies what can be found on the constant side of the
relational.

Besides making it easier to understand what is actually supported, this
allows us to use reserved words both as constants and as statements. The
following patch uses this to allow redirect as a constant in the icmp
payload match.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: py: check set value from selector and map

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add new netdev protocol description

This relies on NFT_META_PROTOCOL instead of ethernet protocol type
header field to prepare support for non-ethernet protocols in the
future.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests/shell: add tests for handles and comments

Here are some tests for optional things like rule handles and comments.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests/shell: add test case for cache bug

This test case for sets catches a cache bug.

At the time of this commit this test is failing, so the test suite shows:

% sudo ./run-tests.sh
I: using nft binary /usr/local/sbin/nft

I: [OK] ./testcases/maps/anonymous_snat_map_0
I: [OK] ./testcases/maps/named_snat_map_0
W: [FAILED] ./testcases/sets/cache_handling_0
I: [OK] ./testcases/optionals/comments_0
I: [OK] ./testcases/optionals/comments_handles_monitor_0
I: [OK] ./testcases/optionals/handles_1
I: [OK] ./testcases/optionals/handles_0
I: [OK] ./testcases/optionals/comments_handles_0

I: results: [OK] 7 [FAILED] 1 [TOTAL] 8

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests/shell: add maps tests cases

Let's add some test cases for maps.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests/: add shell test-suite

This new test-suite is intended to perform higher-level tests than
the other regression test-suite.

It can run arbitrary executables, which can perform any test apart from testing
the nft syntax or netlink code (which is what the regression tests do).

To run the test suite (as root):
% cd tests/shell
% ./run-tests.sh

Test files are executable files named with the pattern <<name_N>>, where N is the
expected return code of the executable. Since they are located with `find',
test files can be spread across any subdirectories.

You can turn on a verbose execution by calling:
% ./run-tests.sh -v

Before each call to the test-files, `nft flush ruleset' will be called.
Also, test-files will receive the environment variable $NFT which contains the
path to the nftables binary being tested.

You can pass an arbitrary $NFT value as well:
% NFT=../../src/nft ./run-tests.sh
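
The naming convention above can be sketched in Python. This is only an
illustration: the real run-tests.sh is a shell script, the helper names
(expected_rc, find_tests, run_test) are hypothetical, and the
`nft flush ruleset' call made before each test is left out.

```python
import os
import re
import subprocess

# Hypothetical helpers; they mirror the <<name_N>> convention only.
NAME_RE = re.compile(r"^(?P<name>.+)_(?P<rc>\d+)$")

def expected_rc(path):
    """Return the exit code encoded in a test file name, or None."""
    match = NAME_RE.match(os.path.basename(path))
    return int(match.group("rc")) if match else None

def find_tests(root):
    """Walk root recursively and yield executable files named <name>_<N>."""
    for dirpath, _dirs, files in os.walk(root):
        for fname in sorted(files):
            path = os.path.join(dirpath, fname)
            if expected_rc(path) is not None and os.access(path, os.X_OK):
                yield path

def run_test(path, nft="nft"):
    """Run one test with $NFT exported; True if the exit code matches N."""
    env = dict(os.environ, NFT=nft)
    result = subprocess.run([path], env=env)
    return result.returncode == expected_rc(path)
```

With this convention, a file such as handles_1 passes only when it exits
with status 1.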

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_delinearize: add previous statement to rule_pp_ctx

564b0e7c13f9 ("netlink_delinearize: postprocess expression before range
merge") crashes nft when the previous statement is removed via
payload_dependency_kill() as this pointer is not valid anymore.

Move the pointer to the previous statement to rule_pp_ctx and invalidate
it when required.

Reported-by: "Pablo M. Bermudo Garay" <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests/: rearrange tests directory

Rearrange the directory to obtain a better organization of files and
test suites.

We end up with a tree like this:

tests
  |
  .--- py
  .--- shell
  .--- files

This was suggested by Pablo.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_delinearize: postprocess expression before range merge

Dependency statements go away after postprocessing, so we should consider
them for possible range merges.

This problem was uncovered when adding support for sub-byte payload
ranges.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: fix sub-byte protocol header definitions

Update bitfield definitions to match the way they are expressed in the
RFC and IEEE specifications.

This required a small update to c3f0501 ("src: netlink_linearize:
handle sub-byte lengths").

From the linearize step, to calculate the shift based on the bitfield
offset, we need to obtain the length of the word, rounded up to a whole
number of bytes:

len = round_up(expr->len, BITS_PER_BYTE);

Then, we subtract the offset bits and the bitfield length.

shift = len - (offset + expr->len);

From the delinearize, payload_expr_trim() needs to obtain the real
offset through:

off = round_up(mask->len, BITS_PER_BYTE) - mask_len;

For vlan id (offset 12), this gets the position of the last bit set in
the mask (i.e. 12), then we subtract it from the length we fetch, rounded
up to whole bytes (16), so we obtain the real bitfield offset (4).

Then, we add that to the original payload offset that was expressed in
bytes:

payload_offset += off;

Note that payload_expr_trim() now also adjusts the payload expression to
its real length and offset so we don't need to propagate the mask
expression.
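
The arithmetic above can be checked with a small Python sketch. It is not
the nft code: the function names are hypothetical, bit_offset is assumed to
be the offset of the field inside the fetched word, and the vlan layout
used in the usage note (pcp 3 bits, cfi 1 bit, id 12 bits) is the one the
text describes.

```python
BITS_PER_BYTE = 8

def round_up(n, b):
    """Round n up to the next multiple of b."""
    return ((n + b - 1) // b) * b

def linearize_shift(bit_offset, field_len):
    """shift = len - (offset + expr->len), with len rounded up to bytes."""
    word_len = round_up(field_len, BITS_PER_BYTE)
    return word_len - (bit_offset + field_len)

def delinearize_offset(mask, mask_width):
    """off = round_up(mask->len, BITS_PER_BYTE) - mask_len."""
    mask_len = mask.bit_length()  # 1-based position of the highest set bit
    return round_up(mask_width, BITS_PER_BYTE) - mask_len
```

For vlan id, linearize_shift(4, 12) gives 0 and
delinearize_offset(0x0fff, 16) gives 4; for vlan pcp,
linearize_shift(0, 3) gives 5.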

Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: vlan pcp and cfi are located in the first byte

Adjust tests to fix wrong payloads, both pcp and cfi are located in the
first nibble of the first byte.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: fix crash when rule test is malformed

The test script crashes when a rule test line is malformed
(e.g. if the expected result is missing). This commit fixes these crashes:
the line is now skipped and a warning is printed.

While at it, fix a malformed test line too.

Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: remove useless logic

In the test files, some lines defining tables were commented out with a
minus "-" sign, which is also used to mark broken rules. This commit replaces
these signs with actual comments "#" and removes the code that handled
the situation.

Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: add *.got files to .gitignore

During tests execution, some *.payload.got files may be generated. To
avoid annoyances, this commit adds the pattern to .gitignore. Also, the
file "dup.t.payload.got", that was presumably included by mistake, has
been deleted.

Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_delinearize: fix use-after-free

We have to clone the payload expression before attaching it to the lhs
of the relational expression. This payload expression is located at the
lhs of the binary operation that is released thereafter.

Fixes: 39f15c2 ("nft: support listing expressions that use non-byte header fields")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: fix up indentation damage

The conversion to the new libnftnl API has left a lot of indentation damage
in the netlink functions. Fix it up.

Signed-off-by: Patrick McHardy <kaber@trash.net>

tests: regression: fix arp.t expected payload

The previous commit fixed the arp header definition, so fix the payload as well.

Signed-off-by: Florian Westphal <fw@strlen.de>

proto: fix arpop symbol table endianess

The symbols need to be in big endian.

Signed-off-by: Patrick McHardy <kaber@trash.net>

payload: add payload statement

Add support for payload mangling using the payload statement. The syntax
is similar to the other data changing statements:

nft filter output tcp dport set 25

Signed-off-by: Patrick McHardy <kaber@trash.net>

proto: add checksum key information to struct proto_desc

The checksum key is used to determine the correct position where to update
the checksum for the payload statement.

Signed-off-by: Patrick McHardy <kaber@trash.net>

tests: regression: allow to run tests from anywhere

Since 357d8cf "tests: use the src/nft binary instead of $PATH one", the
test script has had to be executed from the nftables repository root. Now
the script can be run from any location and also checks that the binary
exists.

To run a single test file, the path must be relative to the directory
where you launch the script.

Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests: regression: homogenize indentation style

The Python interpreter doesn't like mixed indentation. So in order to
prevent future problems, this commit replaces some tabs found in the
script with space indentation.

Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

rule: move comment out of handle

The comment does not belong to the handle, it belongs to the rule.

Signed-off-by: Patrick McHardy <kaber@trash.net>

evaluate: fix string matching on big endian

We need to reallocate the constant expression with the right expression
length when evaluating the string. Otherwise the linearization step
generates a wrong comparison on big endian. We cannot do this any
earlier since we don't know the maximum string length for this datatype
at the parsing stage.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: add inet test for ip/ether concatenation

Test rule from Pablo, it caused assertion failure with earlier
versions of nft (caused by 7ead4932f9ab, later fixed via
775e7ff1f5ddaa32).

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: add test cases for ethernet header matching

Adds ether saddr statements for inet, bridge and ip/ip6 families.

Signed-off-by: Florian Westphal <fw@strlen.de>

rule: don't reorder protocol payload expressions when merging

An instruction like

bridge filter input ip saddr 1.2.3.4 ether saddr a:b:c:d:e:f

is displayed as

unknown unknown 0x1020304 [invalid type] ether saddr 00:0f:54:0c:11:04 ether type ip

.. because the (implicit) 'ether type ip' that is injected before the
network header match gets merged into the ether saddr instruction.

This inverts the merge in case the merge candidate contains
a next header protocol field.

After this change, the rule will be displayed as

bridge filter input ether saddr a:b:c:d:e:f ip saddr 1.2.3.4

Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests: add tests matching on ether saddr for inet, bridge, ip, ip6

Signed-off-by: Florian Westphal <fw@strlen.de>

src: allow filtering on L2 header in inet family

Error: conflicting protocols specified: inet vs. ether
tcp dport 22 iiftype ether ether saddr 00:0f:54:0c:11:4
^^^^^^^^^^^

This allows the implicit inet proto dependency to get replaced
by an ethernet one.

This is possible since by the time we detect the conflict the
meta dependency for the network protocol has already been added.

So we only need to add another dependency on the link-layer frame type.

Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=981
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>

src: add interface wildcard matching

Contrary to iptables, we use the asterisk character '*' as wildcard.

# nft --debug=netlink add rule test test iifname eth\*
ip test test
  [ meta load iifname => reg 1 ]
  [ cmp eq reg 1 0x00687465 ]

Note that this generates an optimized comparison without bitwise.
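
The value 0x00687465 in the debug output is simply the string "eth" read
as a register word. A minimal Python sketch, assuming the debug output
prints register data as 32-bit words on a little-endian host (the helper
name register_word is hypothetical):

```python
def register_word(prefix):
    """Format up to four bytes of a string as one little-endian 32-bit word."""
    data = prefix.encode("ascii")
    if len(data) > 4:
        raise ValueError("sketch only handles prefixes of up to four bytes")
    # Pad with NUL bytes, as an unused tail of the register would be.
    word = int.from_bytes(data.ljust(4, b"\0"), "little")
    return "0x%08x" % word
```

Here 'e' is 0x65, 't' is 0x74 and 'h' is 0x68, so register_word("eth")
yields the 0x00687465 shown above.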

In case you want to match a device that contains an asterisk, you have
to escape the asterisk, ie.

# nft add rule test test iifname eth\\*

The wildcard string handling occurs from the evaluation step, where we
convert from:

          relational
             /  \
            /    \
         meta   value
       oifname   eth*

to:
          relational
           /    \
          /      \
        meta    prefix
      oifname

As Patrick suggested, this is not actually a wildcard but a prefix, since it
only applies to the string when placed at the end.
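
The rule can be sketched as follows. This is an illustration in Python,
not the evaluation code: classify_ifname is a hypothetical name, and the
input is the string as nft sees it after the shell has removed one level
of escaping.

```python
def classify_ifname(pattern):
    """Classify an interface name as a plain value or a prefix match."""
    if pattern.endswith("\\*"):
        # Escaped asterisk: match a device whose name really ends in '*'.
        return ("value", pattern[:-2] + "*")
    if pattern.endswith("*"):
        # Trailing asterisk: match every name starting with the prefix.
        return ("prefix", pattern[:-1])
    # An asterisk anywhere else has no special meaning.
    return ("value", pattern)
```

So eth* becomes a prefix match on "eth", while eth\* stays a plain value
match on the literal name "eth*".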

More comments:

* This relaxes the left->size > right->size check in netlink_parse_cmp()
  for strings, since the optimization that this patch applies may now
  result in bogus errors.

* This patch can be later on extended to apply a similar optimization to
  payload expressions when:

expr->len % BITS_PER_BYTE == 0

  For meta and ct, the kernel checks for the exact length of the attributes
  (it expects integer 32 bits) so we can't do it unless we relax that.

* Wildcard strings are not supported from sets and maps yet. Error
  reporting is not very good at this stage since expr_evaluate_prefix()
  doesn't have enough context (ctx->set is NULL, the set object is
  currently created later after evaluating the lhs and rhs of the
  relational). I'll be following up on this later.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: Add command "replace" for rules

Modify the parser and add necessary functions to provide the command "nft
replace rule <ruleid_spec> <new_rule>"

Example of use:

# nft list ruleset -a
table ip filter {
chain output {
ip daddr 8.8.8.7 counter packets 0 bytes 0 # handle 3
}
}
# nft replace rule filter output handle 3 ip daddr 8.8.8.8 counter
# nft list ruleset -a
table ip filter {
chain output {
ip daddr 8.8.8.8 counter packets 0 bytes 0 # handle 3
}
}

Signed-off-by: Carlos Falgueras García <carlosfg@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: fix mapping evaluation

# cat ruleset.file
table ip mangle {
        map CLASS05 {
                type ipv4_addr : mark
                elements = { 192.168.0.10 : 0x00000001}
        }

        chain OUTPUT {
                type route hook output priority 0; policy accept;
                mark set ip saddr map @CLASS05
        }
}
# nft -f ruleset.file
ruleset.file:4:28-54: Error: mapping outside of map context
                elements = { 192.168.0.10 : 0x00000001}
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^

This actually is fixing two problems:

1) Validate datatype of the rhs before evaluating the map definition,
   this is also setting set->datalen which is needed for the element
   evaluation.

2) Add missing set context.

Reported-by: Andreas Schultz <aschultz@tpip.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_linearize: factor out prefix generation

Add a new netlink_gen_prefix() function that encapsulates the prefix
generation.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: check if table and chain exists when adding rules

Assuming a table 'test' that contains a chain 'test':

# nft add rule test1 test2 counter
<cmdline>:1:1-28: Error: Could not process rule: Table 'test1' does not exist
add rule test1 test2 counter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# nft add rule test test2 counter
<cmdline>:1:1-27: Error: Could not process rule: Chain 'test2' does not exist
add rule test test2 counter
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

parser_bison: show all sets via list sets with no family

Default to the same behaviour that we get through `list ruleset', ie.

# nft list sets
table ip test1 {
        set foo {
                type ipv4_addr
        }
}
table ip6 test2 {
        set bar {
                type ipv6_addr
        }
}

# nft list sets ip
table ip test1 {
        set foo {
                type ipv4_addr
        }
}

# nft list sets ip6
table ip6 test2 {
        set bar {
                type ipv6_addr
        }
}

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

rule: rework list chain

After this patch:

# nft list chain inet filter forward
table inet filter {
        chain forward {
                type filter hook forward priority 0; policy drop;
                ct state established,related counter packets 39546074 bytes 11566126287 accept
        }
}

Before this patch, this was showing the full table definition, including
all chains, which is not what the user is asking for.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

rule: `list sets' only displays declaration, not definition

# nft list sets
table ip nat {
        set libssh {
                type ipv4_addr
        }
}
table inet filter {
        set set0 {
                type inet_service
                flags constant
        }
        set set1 {
                type inet_service
                flags constant
        }
        set set2 {
                type icmpv6_type
                flags constant
        }
}

So in case you want to inspect the definition, you have to use `list set'
and the specific set that you want to inspect:

# nft list set inet filter set0
table inet filter {
        set set0 {
                type inet_service
                flags constant
                elements = { 2200, ssh}
        }
}

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

evaluate: check if set exists before listing it

After this patch, we obtain:

# nft list set ip6 test foo
<cmdline>:1:1-22: Error: Could not process rule: Set 'foo' does not exist
list set ip6 test foo
^^^^^^^^^^^^^^^^^^^^^

So we get things aligned with table and chain listing commands.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

rule: display table when listing one set

After:

# nft list set ip6 test foo
table ip6 test {
        set foo {
                type ipv4_addr
        }
}

Before:

  # nft list set ip6 test foo
        set foo {
                type ipv4_addr
        }

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

src: add `list chains' command

# nft list chains
table ip filter {
        chain test1 {
        }
        chain test2 {
        }
        chain input {
                type filter hook input priority 0; policy accept;
        }
}
table ip6 filter {
        chain test1 {
        }
        chain input {
                type filter hook input priority 0; policy accept;
        }
}

You can also filter per family:

# nft list chains ip
table ip x {
        chain y {
        }
        chain xz {
        }
        chain input {
                type filter hook input priority 0; policy accept;
        }
}

# nft list chains ip6
table ip6 filter {
        chain x {
        }
        chain input {
                type filter hook input priority 0; policy accept;
        }
}

This command only shows the chain declarations, so the content (the
definition) is omitted.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

rule: display table when listing sets

After this patch:

# nft list sets ip
table ip test {
        set pepe {
                type ipv4_addr
        }
}

Before:

# nft list sets ip
        set pepe {
                type ipv4_addr
        }

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

rule: fix printing of rule comments

Several fixes:
* handles are printed last
* simplify space games (an extra space was being printed)
* comments are shown with `nft monitor' as well (missing before this patch)

Before this patch:

% nft list ruleset -a
[...]
chain test {
iifname eth0 # handle 1 comment "test"

}
[...]

% nft list ruleset
[...]
chain test {
iifname eth0  comment "test"
    ^^
}
[...]

% nft monitor &
% nft add rule test test iifname eth0 comment "test"
add rule test test iifname eth0

After this patch:

% nft list ruleset -a
chain test {
iifname eth0 comment "test" # handle 1
    ^
}

% nft monitor -a &
% nft add rule test test iifname eth0 comment "test"
add rule test test iifname eth0 comment "test" # handle 1

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

expression: provide clone operation for set element ops

define addrs={ 1.2.3.4 }
table ip filter {
chain input {
type filter hook input priority 0;
ip saddr $addrs accept
}
}

segfaults. Using saddr { 1.2.3.4 } instead of $addrs works.

Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=801087
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests: add tests for dup

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add dup statement support

This allows you to clone packets to a destination address, e.g.

... dup to 172.20.0.2
... dup to 172.20.0.2 device eth1
... dup to ip saddr map { 192.168.0.2 : 172.20.0.2, ... } device eth1

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

parser: show all tables via list tables with no family

Default to the same behaviour that we get through `list ruleset', ie.

# nft list tables
table ip filter
table ip6 filter

# nft list tables ip
table ip filter

# nft list tables ip6
table ip6 filter

Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=1033
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

rule: filter out tables depending on family

# nft list tables ip
table ip filter

# nft list tables ip6
table ip6 filter

Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=1033
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: limit: extend them to validate new bytes/second and burst parameters

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add burst parameter to limit

... limit rate 1024 mbytes/second burst 10240 bytes
... limit rate 1/second burst 3 packets

This parameter is optional.

You need a Linux kernel >= 4.3-rc1.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add per-bytes limit

This example shows how to accept packets below the ratelimit:

... limit rate 1024 mbytes/second counter accept

You need a Linux kernel >= 4.3-rc1.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

vlan: make != tests work

tests: vlan tests

add a few regression tests that match vlan id/dei/pcp fields
of the vlan header.

Signed-off-by: Florian Westphal <fw@strlen.de>

nft: support listing expressions that use non-byte header fields

This allows listing rules that check fields that are not aligned on a byte
boundary.

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: add tests for ip version/hdrlength/tcp doff

Header fields of 4 bit lengths. Requires implicit masks and
shifting of RHS constant.

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: cmp: shift rhs constant if lhs offset doesn't start on byte boundary

if we have payload(someoffset) == 42, then shift 42 in case someoffset
doesn't start on a byte boundary.

We already insert a mask instruction to only load those bits into
the register that we were interested in, but the cmp will fail without
also adjusting rhs accordingly.

Needs additional patch in reverse direction to undo the shift again
when dumping ruleset.
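
A minimal Python sketch of why the shift is needed, using tcp doff (the
upper four bits of byte 12 of the TCP header) as the example. The function
names are hypothetical and fields are assumed to be numbered from the most
significant bit of the loaded word.

```python
BITS_PER_BYTE = 8

def mask_and_rhs(bit_offset, field_len, value):
    """Return (mask, shifted value) for a bitfield in its byte-aligned word."""
    used = bit_offset + field_len
    word_len = ((used + BITS_PER_BYTE - 1) // BITS_PER_BYTE) * BITS_PER_BYTE
    shift = word_len - used
    mask = ((1 << field_len) - 1) << shift
    return mask, value << shift

def matches(word, bit_offset, field_len, value):
    """Mask the loaded word, then compare against the shifted constant."""
    mask, rhs = mask_and_rhs(bit_offset, field_len, value)
    return (word & mask) == rhs
```

For `tcp doff 8' the mask is 0xf0 and the constant becomes 0x80. Comparing
the masked byte against the unshifted 8 could never match.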

Signed-off-by: Florian Westphal <fw@strlen.de>

nft: fill in doff and fix ihl/version template entries

This allows using

nft add rule ip filter input tcp doff 8

or similar.

Furthermore, ip version looked at hdrlen and vice versa.

Signed-off-by: Florian Westphal <fw@strlen.de>

src: netlink: don't truncate set key lengths

If the key is e.g. 12 bits, pretend it's 16 instead of 8. This is needed
to make sets work with header fields whose size is not divisible by 8.

Signed-off-by: Florian Westphal <fw@strlen.de>

src: netlink_linearize: handle sub-byte lengths

Currently length is expr->len / BITS_PER_BYTE, i.e. expr->len
has to be a multiple of 8.

When core asks for e.g. '9 bits', we truncate this to 8.
Round up to 16 and inject a 9-bit mask to zero out the parts we're not
interested in.
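
The difference between truncating and rounding up can be shown with a few
lines of Python. The helper names are hypothetical, and the mask is shown
right-aligned; where it sits in the loaded word depends on the field
offset.

```python
BITS_PER_BYTE = 8

def truncated_bytes(bits):
    """Old behaviour: integer division drops the partial byte."""
    return bits // BITS_PER_BYTE

def rounded_bytes(bits):
    """New behaviour: load enough whole bytes to cover every bit."""
    return (bits + BITS_PER_BYTE - 1) // BITS_PER_BYTE

def field_mask(bits):
    """Mask that keeps only the requested number of bits."""
    return (1 << bits) - 1
```

For a 9-bit request, truncated_bytes(9) is 1 (8 bits), rounded_bytes(9) is
2 (16 bits), and field_mask(9) is 0x1ff.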

This will also need a change to the delinearization step to
remove the extra op when dumping rules from the kernel.

Signed-off-by: Florian Westphal <fw@strlen.de>

payload: disable payload merge if offsets are not on byte boundary.

... because it doesn't work, we attempt to merge it into wrong
place, we would have to merge the second value at a specific location.

F.e. vlan hdr 4094 gives us

0xfe0f

Merging in the CFI should yield 0xfe1f, but the constant merging
doesn't know how to achieve that; at the moment 'vlan id 4094'
and 'vlan id 4094 vlan cfi 1' give the same result -- 0xfe0f.

For now just turn off the optimization step unless everything is
byte divisible (the common case).
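
The two constants can be reproduced with a short Python sketch. It assumes
the usual 802.1Q layout of the 16-bit tag control field (pcp 3 bits, cfi
1 bit, id 12 bits), sent in network byte order and displayed as a
little-endian register value. The function name is hypothetical.

```python
def tci_register(vid, cfi=0, pcp=0):
    """Build the 16-bit TCI and show it as a little-endian register value."""
    tci = (pcp << 13) | (cfi << 12) | vid
    wire = tci.to_bytes(2, "big")          # bytes as they appear on the wire
    return int.from_bytes(wire, "little")  # same bytes read on the host
```

tci_register(4094) is 0xfe0f, and tci_register(4094, cfi=1) is 0xfe1f: the
CFI bit lands in the upper nibble of the first wire byte, which is why a
simple append of the second constant cannot produce it.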

Signed-off-by: Florian Westphal <fw@strlen.de>

nft: allow stacking vlan header on top of ethernet

currently 'vlan id 42' or even 'vlan type ip' doesn't work since
we expect ethernet header but get vlan.

So if we want to add another protocol header to the same base, we
attempt to figure out if the new header can fit on top of the existing
one (i.e. proto_find_num gives a protocol number when asking to find
link between the two).

We also annotate protocol description for eth and vlan with the full
header size and track the offset from the current base.

Otherwise, 'vlan type ip' fetches the protocol field from mac header
offset 0, which is some mac address.

Instead, we must consider full size of ethernet header.
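
A small Python sketch of the offset problem, assuming a 14-byte Ethernet
header followed by a 4-byte vlan header whose type field sits at byte 2.
The frame contents and constant names are made up for illustration.

```python
ETH_HLEN = 14         # dst mac (6) + src mac (6) + ether type (2)
VLAN_TYPE_OFFSET = 2  # type field follows the 2-byte tag control field

# Illustrative frame: ether type 0x8100 (vlan), vlan id 42, vlan type ip.
frame = (
    bytes.fromhex("000f540c1104")    # destination mac
    + bytes.fromhex("0a0b0c0d0e0f")  # source mac
    + bytes.fromhex("8100")          # ether type: 802.1Q
    + bytes.fromhex("002a")          # tag control field, vlan id 42
    + bytes.fromhex("0800")          # vlan type: ip
)

def read_u16(data, offset):
    """Read a big-endian 16-bit field at a byte offset."""
    return int.from_bytes(data[offset:offset + 2], "big")

# Base not advanced: the read lands inside the destination mac address.
wrong = read_u16(frame, VLAN_TYPE_OFFSET)
# Base advanced by the full Ethernet header size: the real vlan type.
right = read_u16(frame, ETH_HLEN + VLAN_TYPE_OFFSET)
```

Here wrong is 0x540c, part of the mac address, while right is 0x0800.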

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: don't depend on set element order

Pablo reported test failures because the order of returned set entries
is not deterministic.

This sorts set elements before comparison.
Patrick suggested to move ordering into libnftnl (since we could f.e.
also get duplicate entries due to how netlink dumps work), but that's a bit
more work. Hence this quick workaround.
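
The workaround amounts to something like the following Python sketch. It
is not the code from the test script: the function names are hypothetical,
and it only handles a flat "{ a, b, c }" listing without nested braces or
duplicate handling.

```python
def set_elements(text):
    """Extract the elements of a '{ a, b, c }' listing in sorted order."""
    inner = text[text.index("{") + 1:text.rindex("}")]
    return sorted(item.strip() for item in inner.split(",") if item.strip())

def same_set(expected, got):
    """Compare two set listings regardless of element order."""
    return set_elements(expected) == set_elements(got)
```

With this, "{ 2200, ssh}" and "{ ssh, 2200 }" compare as equal.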

Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>

Bump version to v0.5

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: use new symbols in libnftnl

Adapt the nftables code to use the new symbols in libnftnl. This patch contains
quite some renaming to reserve the nft_ prefix for our high level library.

Explicitly request libnftnl 1.0.5 at configure stage.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: add concatenations and maps; combine them too

This patch adds simple tests for concatenation and maps, including more
advanced tests that combine them.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: use existing table object from evaluation context

Skip the table object lookup if we are already in the context of a table
declaration: ctx->table already points to the right table we have to use during
the evaluation. Otherwise, a list corruption occurs when using the wrong table
object when it already exists in the kernel.

http://marc.info/?l=netfilter-devel&m=144179814209295&w=2

Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

mnl: rework netlink socket receive path for events

This patch reworks two aspects of the netlink socket event receive path:

1) In case of ENOBUFS, stay in the loop to keep receiving messages. The tool
   displays a message so the user knows that event messages were lost.

2) Raise the default size of the receive socket buffer up to 16 MBytes to reduce
   chances of hitting ENOBUFS. Assuming that the netlink event message size is
   ~150 bytes, we can bear with ~111848 rules without message loss.
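
The estimate in 2) is plain division, taking 16 MBytes as 16 * 1024 * 1024
bytes. A Python sketch with hypothetical constant names:

```python
RCVBUF_BYTES = 16 * 1024 * 1024  # receive socket buffer size from the text
EVENT_BYTES = 150                # approximate size of one event message

def events_that_fit(buffer_bytes=RCVBUF_BYTES, event_bytes=EVENT_BYTES):
    """Number of whole event messages that fit in the receive buffer."""
    return buffer_bytes // event_bytes
```

events_that_fit() returns 111848, the figure quoted above. Per-message
socket overhead is ignored, so the real limit is likely lower.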

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: flush stdout after each event in monitor mode

So we get all events when redirecting them to file, ie.

# nft monitor > file

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: display error when trying to run tests out of the root directory

Since 357d8cfcceb2 ("tests: use the src/nft binary instead of $PATH one"), the
tests fail if you try to run them if you are not under the root directory of
the nftables repository.

Display an error so I don't forget I have to do it like this.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: fix build with debug off

mnl.c:241:1: error: expected identifier or '(' before '}' token

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: add 'awkward' prefix match expression

It's just a more complicated way of saying 'ip saddr 255.255.0.0/16'.

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: use the src/nft binary instead of $PATH one

... so one doesn't need to install new binary into $PATH (or
change PATH... ) during development.

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: redirect: fix payload display

This has to be related to libnftnl's 0edeb667a2cf ("expr: redir: fix snprintf
to return the number of bytes printed").

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: sets: don't include listing in payload tests

Since e715f6d1241c ("netlink: don't call netlink_dump_*() from listing
functions with --debug=netlink"), there is no debugging from the listing path.
Thus, we can remove the set line from the test files.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: don't call netlink_dump_*() from listing functions with --debug=netlink

Now that we always retrieve the object list to build a cache before executing
the command, this results in a full listing of existing objects in the kernel.

This is confusing when adding a simple rule, so better not to call
netlink_dump_*() from listing functions.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: display error on unexisting chain when listing

nft list chain ip test output
<cmdline>:1:1-25: Error: Could not process rule: Chain 'output' does not exist
list chain ip test output
^^^^^^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: get rid of EINTR handling for nft_netlink()

The only remaining caller that needs this is netlink_dump_ruleset(), that is
used to export the ruleset using markup representation. We can remove it and
handle this from do_command_export() now that we have a centralized point to
build up the object cache.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: use cache infrastructure for set element objects

Populate the cache iff the user requests a ruleset listing.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: use cache infrastructure for rule objects

Populate the cache iff the user requests a ruleset listing.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add chain declarations to cache

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: add cmd_evaluate_rename()

Make sure the table that we want to rename already exists. This is required by
the follow up patch that adds chains to the cache.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: use cache infrastructure for chain objects

The chain list is obtained if the user requests a listing.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

rule: add chain reference counter

When adding declared chains to the cache, we may hold more than one single
reference from struct cmd and the cache.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: early allocation of the set ID

Allocate the set ID by the time the set is created, so elements in the batch
can use this set ID as a reference.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add set declaration to cache

This patch adds set objects to the cache if they don't exist in the kernel, so
they can be referenced from this batch. This occurs from the evaluation step.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: use cache infrastructure for set objects

This patch populates the cache only once through netlink_list_sets() during
evaluation. As a result, there is a single call to netlink_list_sets().

After this change, we can get rid of get_set(). This function was fine back
when we had no transaction support, but it doesn't work for set objects that
are declared in this batch, so inquiring the kernel doesn't help since they are
not yet available.

As a result of this update, the monitor code gets simplified quite a lot
since it can rely on the set cache. Moreover, we can now validate that the
table and set exist from the evaluation path.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add table declaration to cache

Add declared table objects to the cache, thus we can refer to objects that
come in this batch but that are not yet available in the kernel. This happens
from the evaluation step.

Get rid of code that is doing this from the later do_command_*() stage.
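
The idea of referring to objects declared earlier in the same batch can be
illustrated with a minimal Python sketch. This is not the nft code, which is
written in C; the command tuples, the function name evaluate_batch() and the
error text are all hypothetical and only model the behaviour described above.

```python
def evaluate_batch(kernel_tables, commands):
    """Check each command against a cache seeded from kernel state.

    kernel_tables: iterable of (family, name) pairs already in the kernel.
    commands: list of (verb, family, name) tuples (hypothetical shape).
    Returns a list of error strings; empty means the batch evaluated cleanly.
    """
    cache = set(kernel_tables)
    errors = []
    for verb, family, name in commands:
        if verb == "add":
            # Declared in this batch: remember it before it reaches the kernel.
            cache.add((family, name))
        elif verb == "list" and (family, name) not in cache:
            # Illustrative message only, not nft's actual wording.
            errors.append(f"Table '{name}' does not exist")
    return errors


batch = [("add", "ip", "test"), ("list", "ip", "test")]
errors_ok = evaluate_batch([], batch)
errors_missing = evaluate_batch([], [("list", "ip", "test")])
```

In this toy model the "list" command passes when the table was declared
earlier in the same batch, and fails when it exists neither in the kernel
state nor in the batch.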

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

rule: add reference counter to the table object

We may hold multiple references to table objects in follow up patches when
adding object declarations to the cache.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add cmd_evaluate_list()

This function validates that the table that we want to list already exists by
looking it up from the cache.

This also adds cmd_error() to display an error from the evaluation step, when
the objects that the rule indicates do not exist.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add cache infrastructure and use it for table objects

This patch introduces the generic object cache that is populated during the
evaluation phase.

Table objects are the first client of this infrastructure. As a result, there
is a single call to netlink_list_tables().
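
A rough Python sketch of the populate-once behaviour follows. The real
implementation is C code inside nft; the ObjectCache class and
fake_list_tables() are hypothetical names, with fake_list_tables() standing in
for the kernel dump that netlink_list_tables() performs.

```python
class ObjectCache:
    """Toy object cache: filled once, then consulted by every lookup."""

    def __init__(self, list_tables):
        self._list_tables = list_tables  # stand-in for the kernel dump
        self._tables = None              # None means "not populated yet"

    def lookup(self, family, name):
        if self._tables is None:
            # First lookup triggers the only dump; later lookups reuse it.
            self._tables = set(self._list_tables())
        return (family, name) in self._tables


calls = []

def fake_list_tables():
    calls.append(1)                      # count how often we hit "the kernel"
    return [("ip", "filter")]

cache = ObjectCache(fake_list_tables)
found_filter = cache.lookup("ip", "filter")
found_test = cache.lookup("ip", "test")
dump_calls = len(calls)
```

Both lookups are answered from the same dump, so the stand-in for the kernel
is queried only once.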

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Merge branch 'next-4.2'

This branch adds support for the new 'netdev' family. This also resolves a
simple conflict with the default chain policy printing.

Conflicts:
src/rule.c

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: add two test cases using binop w. payload

[ payload load 4b @ network header + 12 => reg 1 ]
[ bitwise reg 1 = (reg=1 & 0xff000000 ) ^ 0x00000000 ]
[ cmp eq reg 1 ...

.. to make sure that later support to match header elements that have odd
(non-byte aligned) lengths/offsets doesn't erroneously eliminate explicitly
added binops while searching expressions for implicit binops.

Signed-off-by: Florian Westphal <fw@strlen.de>

src: restore nft list tables

Iterate over the ctx->list which is where the table objects are after
calling netlink_list_tables().

Fixes: e4d21958c835 ("rule: add do_list_tables()")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: validate generated netlink instructions

Compare the netlink instructions generated by a given nft command line
with the recorded version.

Example: udp dport 80 accept in ip family should look like

ip test-ip4 input
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ cmp eq reg 1 0x00000011 ]
  [ payload load 2b @ transport header + 2 => reg 1 ]
  [ cmp eq reg 1 0x00005000 ]
  [ immediate reg 0 accept ]

This is stored in udp.t.payload.ip
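
The compared values above can be reproduced with a small Python sketch. It
assumes that the dump prints each 32-bit register word in host byte order and
that the recording was made on a little-endian machine; register_word() is a
hypothetical helper, not part of nft or the test suite.

```python
import struct

def register_word(value, size):
    """Return the hex form of a big-endian field read as a little-endian u32."""
    raw = value.to_bytes(size, "big")        # field as it appears on the wire
    padded = raw.ljust(4, b"\x00")           # register is wider than the field
    (word,) = struct.unpack("<I", padded)    # host view on a little-endian CPU
    return f"0x{word:08x}"

proto_udp = register_word(17, 1)   # IPv4 protocol field, 17 is UDP
dport_80 = register_word(80, 2)    # UDP destination port 80
```

Under these assumptions proto_udp matches the first cmp value in the listing
and dport_80 matches the second one.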

Other suffixes:
.payload.ip6
.payload.inet
.payload ('any')

The test script first looks for 'testname.t.payload.$family', if that
doesn't exist 'testname.t.payload' is used.

This allows for family-independent tests (e.g. meta), where we don't
expect/have any family-specific expressions.
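
The lookup order described above can be sketched in Python as follows. The
helper payload_path() is a hypothetical name and is not taken from the test
script; the temporary files only exist to exercise both branches.

```python
import os
import tempfile

def payload_path(test_file, family):
    """Prefer 'testname.t.payload.$family', else fall back to 'testname.t.payload'."""
    specific = f"{test_file}.payload.{family}"
    if os.path.exists(specific):
        return specific
    return f"{test_file}.payload"

with tempfile.TemporaryDirectory() as tmp:
    udp = os.path.join(tmp, "udp.t")
    meta = os.path.join(tmp, "meta.t")
    # udp has a family-specific recording, meta only a generic one.
    for path in (udp + ".payload.ip", meta + ".payload"):
        open(path, "w").close()
    udp_choice = os.path.basename(payload_path(udp, "ip"))
    meta_choice = os.path.basename(payload_path(meta, "ip"))
```

Here udp.t resolves to its ip-specific recording, while meta.t falls back to
the family-independent file.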

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink_delinearize: meta l4proto range printing broken on 32bit

Florian Westphal says:

09565a4b1ed4863d44c4509a93c50f44efd12771 ("netlink_delinearize: consolidate
range printing") causes nft to segfault on 32bit machine when printing l4proto
ranges.

The problem is that meta_expr_pctx_update() assumes that right is a value, but
after this change it can also be a range.

Thus, expr->value contents are undefined (it's a union). On x86_64 this is also
broken but by virtue of struct layout and pointer sizes, value->_mp_size will
almost always be 0 so mpz_get_uint8() returns 0.

But on x86-32 _mp_size will be a huge value (contains expr->right pointer of
range), so we crash in libgmp.

Pablo says:

We shouldn't call pctx_update(). Before the transformation we had
there an expr->op == { OP_GT, OP_GTE, OP_LT, OP_LTE }. So we never
entered that path as the assert in payload_expr_pctx_update()
indicates.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Florian Westphal <fw@strlen.de>

tests: meta: use root for uid/gid checks

I get failures here since 'man' has different uid vs. what
test suite expects.
Furthermore, this box does not have a 'backup' user.

Switch to root/bin/daemon -- those exist on both debian and fedora.

After this meta.t passes on all my machines.

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: avoid more warnings

... 2001:838:35f:1::-2001:838:35f:2:: :80-100' mismatches
... 2001:838:35f:1::-2001:838:35f:2:::80-100'

nft accepts both, so just alter test to not complain.

Also, fix test script to display the expected output rather than
the input. Otherwise, a rule like
some_input;ok;expected_output

may display a nonsensical message like
warning: some_input mismatches some_input

This also fixes the icmpv6 test accordingly, nft displays ranges
correctly.

Signed-off-by: Florian Westphal <fw@strlen.de>

main: return error to shell on evaluation problems

# nft add chain filter input { type filter hook inputt priority 0\; }
<cmdline>:1:43-48: Error: unknown chain hook inputt
add chain filter input { type filter hook inputt priority 0; }
^^^^^^

Before:

# echo $?
0

After:

# echo $?
1

Note that nft_parse() returns 1 on parsing errors and 0 + state->errs on
evaluation problems, so return -1 as other functions do here to pass up the
error to the main routine.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

configure: fix --enable-debug

As the documentation indicates "The most common mistake for this macro is to
consider the two actions as action-if-enabled and action-if-disabled."

Use AS_IF in the action-if-present to check the real argument that we're
getting from the user.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: release table object via table_free() in netlink_get_table()

Instead of xfree().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>