git.ipfire.org Git - thirdparty/nftables.git/log

src: add `flush ruleset'

This patch adds the `flush ruleset' operation to nft.

The syntax is:
% nft flush ruleset [family]

To flush the entire ruleset (all families):
% nft flush ruleset

To flush the ruleset of a given family:
% nft flush ruleset ip
% nft flush ruleset inet

This flush is a shortcut operation which deletes all rules, sets, tables
and chains.
It is made possible by the kernel changes to the NFT_MSG_DELTABLE API
call.

Users can benefit from this operation when doing an atomic replacement of
the entire ruleset, loading a file like this:

=========
flush ruleset
table ip filter {
chain input {
counter accept
}
}
=========

Also, users who simply want to clean the ruleset for whatever reason can now
do so without having to iterate over families/tables.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Fix typo in chain hook parsing

Just a typo in chain hook parsing

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: Add devgroup support in meta expression

This adds device group support in meta expression.

The new attributes of meta are "iifgroup" and "oifgroup"
- iifgroup: Match device group of incoming device.
- oifgroup: Match device group of outgoing device.

Example of use:
nft add rule ip test input meta iifgroup 2 counter
nft add rule ip test output meta oifgroup 2 counter

The kernel and libnftnl support were added in these commits:
netfilter: nf_tables: add devgroup support in meta expresion
src: meta: Add devgroup support to meta expresion

Signed-off-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: meta: Fix the size of cpu attribute

Fix the size of cpu attribute in meta_template struct.

Signed-off-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

doc: nft: Fix trivial error in man page where flush should be rename

Trivial fix, but someone filed a bug on it, and it should be fixed. ;)
https://bugzilla.redhat.com/show_bug.cgi?id=1132917

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: Add support for cpu in meta expression

This allows you to match on the CPU that is handling the packet.

This is an example of the syntax for this new attribute:

nft add rule ip test input meta cpu 1 counter
nft add rule ip test input meta cpu 1-3 counter
nft add rule ip test input meta cpu { 1, 3} counter

Signed-off-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: Add support for pkttype in meta expression

If you want to match the pkttype field of the skbuff, you have to
use the following syntax:

nft add rule ip filter input meta pkttype PACKET_TYPE

where PACKET_TYPE can be unicast, broadcast or multicast.

Joint work with Alvaro Neira Ayuso <alvaroneay@gmail.com>

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

payload: use proto_unknown for raw protocol header

Otherwise payload.desc would be NULL, which causes the crash in bug 915.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: don't return error in netlink_linearize_rule()

This function converts the rule from the list of statements to the
netlink message format. The only two errors that can make this
function fail are memory exhaustion and malformed statements, both of
which immediately stop the execution of nft.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

verdict type: handle verdict flags and encoded additional information

The kernel can handle this, nftables should also.

Signed-off-by: Patrick McHardy <kaber@trash.net>

proto: fix byteorder of ETH_P_* values

The ethernet header type is in big endian byte order, whereas the ETH_P_*
values are in host byte order. Fix this using __constant_htons().
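
The mismatch can be illustrated with a small Python sketch (not the
nftables code). It assumes ETH_P_IP = 0x0800 as defined in
<linux/if_ether.h>, uses socket.htons() in the role of
__constant_htons(), and the helper name is made up for illustration:

```python
import socket
import struct

ETH_P_IP = 0x0800  # host byte order constant, as in <linux/if_ether.h>


def wire_bytes_of_native_u16(value: int) -> bytes:
    """Bytes produced when a native-order 16-bit integer is copied as-is."""
    return struct.pack("=H", value)


# What the ethernet header carries: the type field in big endian.
on_wire = struct.pack("!H", ETH_P_IP)

# Copying htons(ETH_P_IP) verbatim reproduces the wire bytes on both
# little endian and big endian hosts.
converted = wire_bytes_of_native_u16(socket.htons(ETH_P_IP))
```

Without the conversion, a little endian host would store the bytes in
the opposite order and the comparison against the header would fail.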

Signed-off-by: Patrick McHardy <kaber@trash.net>

datatype: take endianness into account in symbolic_constant_print()

symbolic_constant_print() uses mpz_cmp_ui() to find the matching symbol.
Since GMP internally treats all values as being in host byte order, this
doesn't work when the constant value is in non-host byte order, such as
the ethernet protocol type.

Export the expression's value in its original byteorder for comparison
to fix this.
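
A rough sketch of the idea in Python, under the assumption that the
symbol table is keyed on plain integer values. The table contents and
the function name are illustrative only, not the real symbol table:

```python
# Hypothetical symbol table keyed on numeric values.
SYMBOLS = {0x0800: "ip", 0x0806: "arp", 0x86DD: "ip6"}


def symbol_for(raw: bytes, byteorder: str) -> str:
    """Interpret raw bytes in the expression's own byte order, then look up."""
    value = int.from_bytes(raw, byteorder)
    return SYMBOLS.get(value, hex(value))


# The ethernet type field is big endian; reading it in that order matches.
matched = symbol_for(b"\x08\x00", "big")

# Reading the same bytes as little endian yields 0x0008 and misses the symbol.
missed = symbol_for(b"\x08\x00", "little")
```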

Signed-off-by: Patrick McHardy <kaber@trash.net>

payload: take endianness into account when updating the payload context

payload_expr_pctx_update() uses the numeric protocol value in host byte
order to find the upper layer protocol. This obviously doesn't work for
protocol expressions in other byte orders, such as the ethernet protocol
on little endian.

Export the protocol value in the correct byte order and use that value
to look up the upper layer protocol.

Signed-off-by: Patrick McHardy <kaber@trash.net>

linearize: generate unary expression with the appropriate operation

If we add a unary expression whose operation is ntoh, hton is used
instead. This looks like a typo.

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Fix memory leak in nft get operation

Some memory is not released on the error path in the get operation.
Just release it. Also, in netlink_get_chain, it's better to return
immediately when an error is detected.

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

doc: update documentation with 'monitor' and 'export'

Let's add info about 'monitor' and 'export'.

While at it, fix other minor things, like the no-netlink return code and
the indentation of the document.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add level option to the log statement

This patch is required if you use upcoming Linux kernels >= 3.17,
which come with complete logging support for nf_tables.

If you use 'log' without options, the kernel logging buffer is used:

nft> add rule filter input log

You can also specify the logging prefix string:

nft> add rule filter input log prefix "input: "

You may want to specify the log level:

nft> add rule filter input log prefix "input: " level notice

If not specified, the level defaults to 'warn' (just like in
iptables).

If you specify the group, then nft uses the nfnetlink_log instead:

nft> add rule filter input log prefix "input: " group 10

You can also specify the snaplen and qthreshold for nfnetlink_log.
However, level and group cannot be combined; they are mutually
exclusive.

Default values for both snaplen and qthreshold are 0 (just like in
iptables).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

mnl: fix crashes when using sets with many elements

nft crashes when adding many elements into a set for two reasons:

1) The overflow of the nla_len field for the NFTA_SET_ELEM_LIST_ELEMENTS
   attribute.

2) Out-of-bound memory writes to the reserved area for the netlink
   message, which is solved by the patch entitled ("mnl: introduce
   NFT_NLMSG_MAXSIZE").

This patch adds the corresponding nla_len overflow check for
NFTA_SET_ELEM_LIST_ELEMENTS and splits the elements into several
netlink messages. This should be enough when set updates are handled
by the transaction infrastructure.

With this patch, nft should now be capable of adding an unlimited
number of elements to a given set.
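
The splitting can be sketched in Python as follows. This is a
simplified model, not the mnl code: it assumes each element is
described only by its already-aligned encoded size, and the function
name is hypothetical. The 4-byte header is the size of struct nlattr
and 0xFFFF is the largest value the 16-bit nla_len field can hold:

```python
NLA_HDRLEN = 4        # size of struct nlattr (nla_len + nla_type)
NLA_LEN_MAX = 0xFFFF  # nla_len is a 16-bit field


def split_elements(elem_sizes, max_len=NLA_LEN_MAX):
    """Group element sizes so that each nested attribute fits in nla_len."""
    batches, current, used = [], [], NLA_HDRLEN
    for size in elem_sizes:
        if size + NLA_HDRLEN > max_len:
            raise ValueError("single element does not fit in one attribute")
        # Start a new message when this element would overflow nla_len.
        if used + size > max_len and current:
            batches.append(current)
            current, used = [], NLA_HDRLEN
        current.append(size)
        used += size
    if current:
        batches.append(current)
    return batches


# 10000 elements of 16 bytes each no longer fit in a single nest.
batches = split_elements([16] * 10000)
```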

Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=898
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

mnl: introduce NFT_NLMSG_MAXSIZE

The NFT_NLMSG_MAXSIZE constant defines the maximum nf_tables netlink
message size. Currently, the largest is the set element message, which
contains the NFTA_SET_ELEM_LIST_ELEMENTS attribute. This attribute is
a nest that describes the set elements. Given that the netlink attribute
length (nla_len) is 16 bits, the largest message is a bit larger than
64 KBytes. Thus, the proposed value of NFT_NLMSG_MAXSIZE is set to
(1 << 16) + getpagesize().

This new constant is used to calculate the length of:

1) the batch page length, when the batching mode is used.

2) the buffer that stores the netlink message in the send (when no
batching is used) and receive paths.
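
The arithmetic behind the proposed value, as a Python sketch.
mmap.PAGESIZE stands in for getpagesize() here, and NLA_LEN_BITS is a
name introduced only for this illustration:

```python
import mmap

NLA_LEN_BITS = 16  # width of the nla_len field

# Largest attribute payload plus one page of headroom for the headers.
NFT_NLMSG_MAXSIZE = (1 << NLA_LEN_BITS) + mmap.PAGESIZE

# Room left once a maximum-sized attribute has been stored.
headroom = NFT_NLMSG_MAXSIZE - 0xFFFF
```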

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

main: propagate error to shell

Before:

# nft add rule ip test input ip hdrlength 3
<cmdline>:1:1-37: Error: Could not process rule: Invalid argument
add rule ip test input ip hdrlength 3
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# echo $?
0

After:

# nft add rule ip test input ip hdrlength 3
<cmdline>:1:1-37: Error: Could not process rule: Invalid argument
add rule ip test input ip hdrlength 3
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# echo $?
1

Reported-by: Ana Rey Botello <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: rework batching logic to fix possible use of uninitialized pages

This patch reworks the batching logic in several aspects:

1) New batch pages are now always added to the batch page list
   first. Then, in the send path, if the last batch page is
   empty, it is removed from the batch list.

2) nft_batch_page_add() is only called if the current batch page is
   full. Therefore, it is guaranteed to find a valid netlink message
   in the batch page when moving the tail that didn't fit into a new
   batch page.

3) The batch paging is initialized and released from the nft_netlink()
   path.

4) No more global struct mnl_nlmsg_batch *batch that points to the
   current batch page. Instead, it is retrieved from the tail of the
   batch list, which indicates the current batch page.

This patch fixes a crash caused by access to an uninitialized memory area
when calling batch_page_add() with an empty batch in the send path,
and the memleak of the batch page contents. Reported in:

http://patchwork.ozlabs.org/patch/367085/
http://patchwork.ozlabs.org/patch/367774/

The patch is larger, but this saves the zeroing of the batch page area.

Reported-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

mnl: add nft_nlmsg_batch_current() helper

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

mnl: add nft_batch_continue() helper

Save some LOC with this function that wraps typical handling
after pushing the netlink message into the batch.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: monitor: fix how rules with intervals are printed

Prior to this patch, if we add a rule like this:
nft add rule filter test ip saddr { 1.1.1.1-2.2.2.2 }

The monitor operation output shows:
add rule ip filter test ip saddr { 0.0.0.0, 1.1.1.1, 2.2.2.3}

The fix suggested by Pablo is to call interval_map_decompose().
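
To see why the raw elements look the way they do, here is a simplified
Python model. It is not interval_map_decompose() itself: it assumes
that every element carries an "interval end" flag, with 0.0.0.0 and
2.2.2.3 being end markers and 1.1.1.1 a start, and the function name
is hypothetical:

```python
import ipaddress


def decompose(elements):
    """Turn (address, is_interval_end) pairs into closed (first, last) ranges."""
    ordered = sorted((int(ipaddress.IPv4Address(a)), end) for a, end in elements)
    ranges, start = [], None
    for value, is_end in ordered:
        if is_end:
            # An end marker closes the open range just before its address.
            if start is not None:
                ranges.append((str(ipaddress.IPv4Address(start)),
                               str(ipaddress.IPv4Address(value - 1))))
                start = None
        else:
            start = value
    return ranges


raw = [("0.0.0.0", True), ("1.1.1.1", False), ("2.2.2.3", True)]
ranges = decompose(raw)
```

A start element without a matching end marker is silently dropped in
this sketch; real code would have to handle that case.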

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: monitor: add a helper function to handle sets referenced by a rule

This patch adds a helper function to handle lookup expressions with a callback,
so we can perform an action for each set referenced by the rule.

Basically this is a refactoring, useful for follow-up patches.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

mnl: check for NLM_F_DUMP_INTR when dumping object lists

This flag makes it possible to detect that an update has occurred while
dumping any of the object lists. In case of interference, nft cancels the
netlink socket to skip processing the remaining stale entries and
it retries to obtain a fresh list of objects.
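
The retry logic can be sketched in Python like this. It is a model
only: dump_once is a hypothetical callable returning (flags, object)
pairs, and the max_tries bound is an assumption that the commit does
not mention. NLM_F_DUMP_INTR is 0x10 in <linux/netlink.h>:

```python
NLM_F_DUMP_INTR = 0x10  # from <linux/netlink.h>


def dump_with_retry(dump_once, max_tries=5):
    """Repeat a dump until no message reports an interrupted dump."""
    for _ in range(max_tries):
        objs = []
        interrupted = False
        for flags, obj in dump_once():
            if flags & NLM_F_DUMP_INTR:
                # The list changed under us: drop the stale entries.
                interrupted = True
                break
            objs.append(obj)
        if not interrupted:
            return objs
    raise RuntimeError("dump was interrupted on every attempt")


# First attempt is interrupted half way, the second one is consistent.
attempts = iter([
    [(0, "table a"), (NLM_F_DUMP_INTR, "table b"), (0, "table c")],
    [(0, "table a"), (0, "table b")],
])
result = dump_with_retry(lambda: next(attempts))
```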

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

mnl: immediately return on errors in mnl_nft_ruleset_dump()

If this fails to fetch any of the objects, stop handling immediately.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

proto: initialize result expression in ethertype_parse()

Otherwise, you may crash in:

nft add rule bridge filter input ether type ip

Reported-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: fix concat expressions as map arguments

The type in the evaluation context needs to be reset to avoid treating
the concatenation as a right hand side relational expression.

# nft filter output mark set ip daddr . tcp dport map { 192.168.0.1 . 22 : 1 }
<cmdline>:1:24-43: Error: datatype mismatch, expected packet mark, expression has type concatenation of (IPv4 address, internet network service)
filter output mark set ip daddr . tcp dport map { 192.168.0.1 . 22 : 1 }
^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Patrick McHardy <kaber@trash.net>

netlink: check and handle errors from netlink_delinearize_set()

Fix segfaults when delinearizing the set fails and abort on error when
listing sets.

Signed-off-by: Patrick McHardy <kaber@trash.net>

Bump version to v0.3

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: Correct initial value of bytes counter in nftables rule

Packets can be accounted by nftables with a command such as:
% nft add rule filter output ip daddr 8.8.8.8 counter

You can also give the initial values of packets and bytes.
% nft add rule filter output ip daddr 8.8.8.8 counter packets 10 bytes 20

However, due to a mistake in the program, packets and bytes are both
initialized to 10 by the above command.

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: revert broken reject icmp code support

This patch reverts Alvaro's 34040b1 ("reject: add ICMP code parameter
for indicating the type of error") and 11b2bb2 ("reject: Use protocol
context for indicating the reject type").

These patches have two flaws:

1) IPv6 support is broken, only ICMP codes are considered.
2) If you don't specify any transport context, the utility exits without
adding the rule, eg. nft add rule ip filter input reject.

The kernel is also flawed when it comes to the inet table. Let's revert
this until we can provide decent reject reason support.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

reject: add ICMP code parameter for indicating the type of error

This patch allows specifying the ICMP code to use when rejecting.
Before, a network unreachable error was always sent as the ICMP code;
now the ICMP code can be indicated explicitly. Examples:

nft add rule filter input tcp dport 22 reject with host-unreach
nft add rule filter input udp dport 22 reject with host-unreach

In this case, it will use the host unreachable code to reject traffic.

The default code field is still network unreachable, and the `with'
clause can be omitted, like this:

nft add rule filter input udp dport 22 reject

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

reject: Use protocol context for indicating the reject type

This patch uses the protocol context to initialize the reject type
considering if the transport protocol is tcp, udp, etc. Before this
patch, this was left unset.

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

queue: More compact syntax

This patch introduces a new, more compact syntax and breaks the
current one. The new syntax is closer to the syntax used elsewhere
in nftables: ranges can be used as in other statements. Here are
some examples:

Before, a queue was declared with a syntax like this:

nft add rule test input queue num 1 total 3 options bypass,fanout

To use queue number 1 and the next two (3 in total), the new syntax
uses a range, for example:

nft add rule test input queue num 1-3 bypass fanout

To use only one queue, the new rules look like:

nft add rule test input queue num 1 # queue 1

or

nft add rule test input queue # queue 0

To add specific flags, we only need to list the flags we want
to use:

nft add rule test input queue bypass

There is no need for the options keyword or for commas to indicate
the flags.
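
The mapping from the old num/total pair to the new range can be
sketched in Python. The helper name is hypothetical and this is not
the parser code:

```python
def queue_range(num: int, total: int = 1) -> str:
    """Translate the old 'num N total T' form into the new range form."""
    if total < 1:
        raise ValueError("total must be at least 1")
    last = num + total - 1
    return f"queue num {num}" if total == 1 else f"queue num {num}-{last}"


# 'queue num 1 total 3' in the old syntax covers queues 1, 2 and 3.
new_syntax = queue_range(1, 3)
```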

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

cli: fix nft -i command crashes when try to input multi line command

Entering a multiline command in "nft -i" crashes.

The issue is that cli_append_multiline() returns NULL for a multiline
command. The caller, cli_complete(), then calls cli_exit(), which in
turn calls rl_callback_handler_remove(), so the handler is removed.

[root@localhost ~]# nft -i
nft> add table filter
nft> list table \

readline: readline_callback_read_char() called with no handler!
Aborted (core dumped)
[root@localhost ~]#

After this patch, it shows:

nft> list table \
.... filter
table ip filter {
}
nft>

The ".... " prompt is used to indicate a multiline command, similar to
what Python does.

Signed-off-by: Guruswamy Basavaiah <guru2018@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: change type of chain.priority from unsigned int to int

This removes a bug that displays strange hook priorities
like "type route hook output priority 4294967146".
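
The strange value is simply a negative priority read as unsigned. A
short Python sketch of the reinterpretation (the helper name is made
up for illustration):

```python
import struct


def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("=i", struct.pack("=I", value))[0]


priority = as_signed32(4294967146)
```

Here priority evaluates to -150, since 4294967146 equals 2**32 - 150.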

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: Allow to invert the ranges

This patch fixes the bug:

http://bugzilla.netfilter.org/show_bug.cgi?id=924

Before, nftables didn't permit inverting ranges. This patch allows
adding rules like this:

nft add rule ip test input ip daddr != 192.168.1.2-192.168.1.55
or
nft add rule ip test input ip daddr == 192.168.1.2-192.168.1.55

Also, we still have the option for adding rules like this:

sudo nft add rule ip test output frag id 33-45

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

payload: Update the context only in equality relations

If we add this rule:

sudo nft add rule ip test input ip protocol != icmp

and we try to list the rules in the table test, nftables
shows this error:

nft: src/payload.c:76: payload_expr_pctx_update: Assertion `expr->op == OP_EQ' failed.

This patch changes the function payload_match_postprocess to update
the context only for equality relations.

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

scanner: Add udplite token

Adding a udplite rule fails because we forgot to add this token
to the scanner.

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: fix crash if kernel doesn't support nfnetlink / nf_tables

The crash happens when trying to close a descriptor which failed to be opened.

==6231== Process terminating with default action of signal 11 (SIGSEGV)
==6231==  Access not within mapped region at address 0x0
==6231==    at 0x5503E21: mnl_socket_close (socket.c:248)
==6231==    by 0x40517F: netlink_close_sock (netlink.c:68)
==6231==    by 0x400EFEE: _dl_fini (dl-fini.c:253)
==6231==    by 0x5740AA0: __run_exit_handlers (exit.c:77)
==6231==    by 0x5740B24: exit (exit.c:99)
==6231==    by 0x40F16F: netlink_open_error (netlink.c:105)
==6231==    by 0x405642: netlink_open_sock (netlink.c:54)
==6231==    by 0x424E6C: __libc_csu_init (in /usr/sbin/nft)
==6231==    by 0x5728924: (below main) (libc-start.c:219)
==6231==  If you believe this happened as a result of a stack
==6231==  overflow in your program's main thread (unlikely but
==6231==  possible), you can try to increase the size of the
==6231==  main thread stack using the --main-stacksize= flag.
==6231==  The main thread stack size used in this run was 8388608.

Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=881
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

parser: use symbolic expression for ether too

Like in 0dbced3 ("parser: use symbolic expressions for parsing
keywords as protocol values"), convert `ether' to use a symbolic
expression.

This fixes:

# nft add rule ip filter input meta iiftype ether
# nft list table filter
table ip filter {
...
iiftype 256

which was converted to network byte order.
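
Where 256 comes from can be shown with a small Python sketch, assuming
ARPHRD_ETHER = 1 as in <linux/if_arp.h> and a little endian host. The
swab16 helper is introduced only for this illustration:

```python
ARPHRD_ETHER = 1  # from <linux/if_arp.h>


def swab16(value: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return int.from_bytes(value.to_bytes(2, "big"), "little")


# Byte-swapping the 16-bit value 1 (0x0001) gives 0x0100.
listed = swab16(ARPHRD_ETHER)
```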

Reported-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: don't add table/chain/set to ctx->list in the event path

The delinearize functions for tables, chains and sets add these objects
to the ctx->list. In the chain case, this is not required. Regarding
tables and sets, those are added to the cache.

This patch implicitly fixes a use-after-free of the chain object that
results in random crashes.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_delinearize: fix double free in relational_binop_postprocess()

expr->right and value point to the same object, so one single free()
is enough.

This manifests in valgrind with:

==4020== Invalid read of size 4
==4020==    at 0x40A429: expr_free (expression.c:65)
==4020==    by 0x414032: expr_postprocess (netlink_delinearize.c:747)
==4020==    by 0x414C33: netlink_delinearize_rule (netlink_delinearize.c:883)
==4020==    by 0x411305: netlink_events_cb (netlink.c:1692)
==4020==    by 0x55040AD: mnl_cb_run (callback.c:77)
==4020==    by 0x4171E4: nft_mnl_recv (mnl.c:45)
==4020==    by 0x407B44: do_command (rule.c:895)
==4020==    by 0x405C6C: nft_run (main.c:183)
==4020==    by 0x405849: main (main.c:334)
==4020==  Address 0x5d126f8 is 56 bytes inside a block of size 120 free'd
==4020==    at 0x4C2AF5C: free (vg_replace_malloc.c:446)
==4020==    by 0x41402A: expr_postprocess (netlink_delinearize.c:746)
==4020==    by 0x414C33: netlink_delinearize_rule (netlink_delinearize.c:883)
==4020==    by 0x411305: netlink_events_cb (netlink.c:1692)
==4020==    by 0x55040AD: mnl_cb_run (callback.c:77)
==4020==    by 0x4171E4: nft_mnl_recv (mnl.c:45)
==4020==    by 0x407B44: do_command (rule.c:895)
==4020==    by 0x405C6C: nft_run (main.c:183)
==4020==    by 0x405849: main (main.c:334)
==4020==

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

parser: remove the "new" and "destroy" tokens from the scanner

These new tokens were introduced in f9563c0 ("src: add events reporting")
to allow filtering based on the event type.

This confuses the parser when parsing the "new" token:

test:32:33-35: Error: syntax error, unexpected new
add rule filter output ct state new,established counter
^^^

This patch fixes this by replacing these event type tokens by the
generic string token, which is then interpreted during the parsing.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add table netlink messages to the batch

This patch moves the table messages to the netlink batch that
is sent to kernel-space.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add chain netlink messages to the batch

This patch moves the chain netlink messages to the big netlink
batch that is sent to kernel-space.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add set netlink message to the batch

This patch moves the netlink set messages to the batch that contains
the rules. This helps to speed up rule-set restoration time by
changing the mode of operation. To achieve this, an internal set ID
which is unique to the batch is allocated as suggested by Patrick.

To retain backward compatibility, nft initially guesses if the
kernel supports sets in batches. Otherwise, it falls back to the
previous (slower) mode of operation.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

mnl: split talk() and recv() functions

Let's split talk() and recv() functions, so they can be used independently.

While at it, let's rename mnl_talk() to nft_mnl_talk() so we avoid potential
clashes with other functions in external libs.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: add events reporting

This patch adds a basic events reporting option to nft.

The syntax is:
% nft monitor [new|destroy] [tables|chains|rules|sets|elements] [xml|json]

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: add socket error reporting helper function

This patch adds a simple helper function to report errors while
opening the Netlink socket.

To help users to diagnose problems, a new NFT_EXIT_NONL exit code is included,
which is 3.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: refactorize set_elem conversion from netlink

Let's refactor set_elem handling.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: add netlink_delinearize_table() func

This code is suitable for reuse.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: add netlink_delinearize_chain() func

Let's make this code reusable.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

rule: generalize chain_print()

Let's generalize the chain_print() function, so we can print a plain chain
as the user typed it in the basic CLI.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: add netlink_delinearize_set() func

Let's factor out this code, so it can be reused.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

rule: allow to print sets in plain format

Allow to print sets with or without format.

This is useful in situations where we want to print more or less what
the user typed (IOW, in one single line, and with family/table info).

While at it, make family2str() function public, so it can be used in
other places.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

meta: Add support for input and output bridge interface name

Add support to get an input or output bridge interface name through the
relevant meta keys.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Bump version to v0.2

Signed-off-by: Patrick McHardy <kaber@trash.net>

doc: fix make install problems

-e INSTALL doc
/usr/bin/install: cannot stat 'doc/nftables.8': No such file or directory
make[1]: *** [install] Error 1
make: *** [doc] Error 2

Rename everything to nft.* to fix this up.

Reported-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

doc: change documentation license to CC BY-SA 4.0

Signed-off-by: Patrick McHardy <kaber@trash.net>

doc: fix programlisting indentation

Since programlistings are used literally, they should not be indented.

Signed-off-by: Patrick McHardy <kaber@trash.net>

datatypes: rename some types for more consistency

Add some more consistency by using _addr for all address types, _proto
for all protocol types and iface_ for all interface types.

Signed-off-by: Patrick McHardy <kaber@trash.net>

doc: documentation update

Signed-off-by: Patrick McHardy

build: fix documentation build

Handle the docbook2x-man mess that is called differently on different distributions.
Also switch to dblatex since db2pdf is unable to handle XML on Fedora (and probably
other distributions).

Signed-off-by: Patrick McHardy <kaber@trash.net>

netlink: fix length value of concat data

The length is measured in bytes, not bits.

Signed-off-by: Patrick McHardy <kaber@trash.net>

gmputil: use MSF/LSF in import/export functions dependent on host byte order

For data of byteorder BYTEORDER_HOST_ENDIAN we need to set the word order
dependent on the host byte order.

Signed-off-by: Patrick McHardy <kaber@trash.net>

expression: fix constant expression allocation on big endian

When allocating a constant expression, a pointer to the data is passed
to the allocation function. When the variable used to store the data
is larger than the size of the data type, this fails on big endian since
the most significant bytes (being zero) come first.

Add a helper function to calculate the proper address for the cases
where this is needed.

This currently affects symbolic tables for values < u64 and payload
dependency generation for protocol values < u32.
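
The problem and the fix can be modelled in Python. This is a sketch,
not the actual helper: the function name and its big_endian_host
parameter are assumptions made for illustration, and the value 6 in a
32-bit variable is only an example:

```python
import struct


def constant_data(storage: bytes, length: int, big_endian_host: bool) -> bytes:
    """Return the `length` bytes that hold the value inside a wider variable."""
    # On big endian the significant bytes sit at the end of the variable.
    offset = len(storage) - length if big_endian_host else 0
    return storage[offset:offset + length]


# The value 6 kept in a 32-bit variable, used for an 8-bit data type.
little = struct.pack("<I", 6)
big = struct.pack(">I", 6)

naive_big = big[:1]                      # what taking the plain address reads
fixed_big = constant_data(big, 1, True)  # address adjusted for big endian
fixed_little = constant_data(little, 1, False)
```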

Signed-off-by: Patrick McHardy <kaber@trash.net>

Merge branch 'master' of git.netfilter.org:nftables

parser: fix ether keyword clash

Due to the renaming of the "eth" keyword to "ether", parsing of arphrd
type "ether" fails.

Fix similar to TCP, UDP etc by allocating a constant arphrd expression for
the ether keyword without a following key.

Signed-off-by: Patrick McHardy <kaber@trash.net>

build: drop AC_FUNC_MALLOC/REALLOC

Two issues with these:
1. They compile & run a test program, which won't work when cross-compiling
2. When libnftnl has just been installed and is not (yet) in linker path, the
test fails since loader won't find libnftnl.

In that case configure will succeed without obvious errors, but config.h
re-defines malloc/realloc with rpl_ prefix, which then results in a
linker error ("undefined reference to `rpl_realloc'") on 'make'.

These macros are only useful to check that malloc(0) returns non-NULL
and that realloc(NULL, ... works.

For nftables the former is irrelevant and the latter a safe assumption,
so let's just remove them.

Signed-off-by: Florian Westphal <fw@strlen.de>

utils: fix -Wcast-align warnings on sparc

The cast to char * in the container_of() macro causes warnings for all
list iteration helpers on sparc:

warning: cast increases required alignment of target type [-Wcast-align]

Fix by using a void * for address calculations.

Signed-off-by: Patrick McHardy <kaber@trash.net>

rule: fix crash in set listing

It fixes an invalid read that is shown by valgrind.

==3962== Invalid read of size 4
==3962==    at 0x407040: do_command (rule.c:692)
==3962==    by 0x40588C: nft_run (main.c:183)
==3962==    by 0x405469: main (main.c:334)
==3962==  Address 0x10 is not stack'd, malloc'd or (recently) free'd

Signed-off-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

expr: do not suppress OP_EQ when RHS is bitmask type

Bitmask types default to flagcmp now, thus do not suppress OP_EQ. Otherwise,

rule filter output tcp flags syn
rule filter output tcp flags == syn

are both displayed as 'flags syn'.
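
The two rules are not equivalent, which is why the listing must keep
them apart. A Python sketch, assuming that flagcmp means "at least one
of the given flag bits is set"; the function names are illustrative
and the flag values are the standard TCP header bits:

```python
TCP_FLAG_SYN = 0x02
TCP_FLAG_ACK = 0x10


def flagcmp(flags: int, mask: int) -> bool:
    """Implicit comparison for bitmask types: any bit of mask is set."""
    return (flags & mask) != 0


def equals(flags: int, value: int) -> bool:
    """Explicit '==' comparison: the flags field matches exactly."""
    return flags == value


syn_ack = TCP_FLAG_SYN | TCP_FLAG_ACK
implicit_match = flagcmp(syn_ack, TCP_FLAG_SYN)  # 'tcp flags syn'
explicit_match = equals(syn_ack, TCP_FLAG_SYN)   # 'tcp flags == syn'
```

A SYN/ACK segment matches the first rule but not the second.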

Signed-off-by: Florian Westphal <fw@strlen.de>

nftables: Fix list of sets by family

Fix the result of the command line 'nft list sets FAMILY'. It showed the
following error message:

"Error: syntax error, unexpected end of file, expecting string"

Now this information is shown correctly:

$ sudo nft -nna list sets ip
        set set_test {
                type ipv4_address
                elements = { 192.168.3.45, 192.168.3.43, 192.168.3.42, 192.168.3.4}
        }
set set_test2 {
                type ipv4_address
                elements = { 192.168.3.43, 192.168.3.42, 192.168.3.4}
        }
set set0 {
                type ipv4_address
                flags constant
                elements = { 127.0.0.12, 12.11.11.11}
        }

Signed-off-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

ct: add support for setting ct mark

This patch adds the possibility to set ct keys using nft. Currently, the
connection mark is supported. This functionality enables creating rules
performing the same action as iptables -j CONNMARK --save-mark. For example:

table ip filter {
        chain postrouting {
                type filter hook postrouting priority 0;
                ip protocol icmp ip daddr 8.8.8.8 ct mark set meta mark
        }
}

My patch is based on the original http://patchwork.ozlabs.org/patch/307677/
by Kristian Evensen <kristian.evensen@gmail.com>.

I simply did a rebase and some testing. To test, I added rules like these:
counter meta mark set 1 counter
counter ct mark set mark counter
counter ct mark 1 counter

The last matching worked as expected, which means the second rule is also
working as expected.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Acked-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

src: fix expr_binary_error()-related compilation warnings

The commit e7b43ec0 [expr: make expr_binary_error() usable outside of evaluation]
seems to have changed the expr_binary_error() interface.

Since then, several compilation warnings appear.

The expr_binary_error() function and expr_error() macro both expect
`struct list_head *', so I simply changed callers to send `ctx->msgs'.

[...]
src/evaluate.c: In function ‘byteorder_conversion’:
src/evaluate.c:166:3: warning: passing argument 1 of ‘expr_binary_error’ from incompatible pointer type [enabled by default]
In file included from src/evaluate.c:21:0:
include/expression.h:275:12: note: expected ‘struct list_head *’ but argument is of type ‘struct eval_ctx *’
src/evaluate.c: In function ‘expr_evaluate_symbol’:
src/evaluate.c:204:4: warning: passing argument 1 of ‘expr_binary_error’ from incompatible pointer type [enabled by default]
In file included from src/evaluate.c:21:0:
include/expression.h:275:12: note: expected ‘struct list_head *’ but argument is of type ‘struct eval_ctx *’
[...]

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

datatype: don't leak file ptr on error

Signed-off-by: Florian Westphal <fw@strlen.de>

segtree: sort set elements before decomposition

The decomposition phase currently depends on the kernel returning elements
in sorted order. This is a fragile assumption, change the code to sort the
elements itself.
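
A minimal sketch of the idea, not the actual segtree code: sort the
elements in userspace before decomposing them, so the result no longer
depends on the order in which the kernel returns them. `struct elem`,
`elem_cmp()` and `sort_elems()` are hypothetical names used for
illustration only, and plain unsigned integers stand in for the values.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an interval set element. */
struct elem {
	unsigned int left;	/* start of the interval */
	unsigned int right;	/* end of the interval (inclusive) */
};

/* qsort() comparator: order by interval start, then by interval end. */
static int elem_cmp(const void *a, const void *b)
{
	const struct elem *e1 = a, *e2 = b;

	if (e1->left != e2->left)
		return e1->left < e2->left ? -1 : 1;
	if (e1->right != e2->right)
		return e1->right < e2->right ? -1 : 1;
	return 0;
}

/* Sort the elements ourselves instead of trusting the incoming order. */
static void sort_elems(struct elem *elems, size_t n)
{
	qsort(elems, n, sizeof(*elems), elem_cmp);
}
```

The comparator avoids subtracting the values, so it cannot overflow for
large unsigned bounds.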

Signed-off-by: Patrick McHardy <kaber@trash.net>

set: properly account set size when merging recursive set definitions

Signed-off-by: Patrick McHardy <kaber@trash.net>

parser: add grammatical distinction for verdict maps

Currently the parser accepts verdicts in regular maps and non-verdicts
in verdict maps and we have to check matching types during evaluation.
Add grammar rules for verdict maps and separate them from regular maps.
This has a couple of advantages:

- We recognize verdict maps completely in the parser and any attempt to
  mix verdicts and other expressions will result in a syntax error.
  So far this hasn't actually been checked.

- Using verdicts in non-verdict mappings will also result in a syntax
  error instead of a datatype mismatch.

- There's a grammatical distinction between dictionaries and verdict
  maps, which are actually statements.

This is needed as preparation for a following patch to turn verdicts
into pure statements, which in turn is needed to reinstate support for
using the queue verdict in maps, which was broken by the introduction
of the queue statement.

Signed-off-by: Patrick McHardy <kaber@trash.net>

netlink: use set location for IO errors

We currently crash when reporting a permission denied error for set additions.
This is due to using the wrong location, fix by passing in the set location.

Signed-off-by: Patrick McHardy <kaber@trash.net>

set: abort on interval conflicts

We currently print a debug message (with debugging) and continue. Output
a proper error message and abort.

While at it, make sure we only report a conflict if there actually is one.
This is not the case for similar actions, IOW in case of sets, never; in
case of maps, only if the mapping differs.
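
The conflict rule described above could be sketched roughly as follows.
This is an illustration under stated assumptions, not the code from this
patch: `struct interval`, `intervals_overlap()` and `interval_conflict()`
are hypothetical names, and the mapped data is reduced to an integer.

```c
#include <stdbool.h>

/* Hypothetical interval element; `data` is only meaningful for maps. */
struct interval {
	unsigned int left, right;	/* inclusive bounds */
	unsigned int data;		/* mapped value (maps only) */
};

static bool intervals_overlap(const struct interval *a,
			      const struct interval *b)
{
	return a->left <= b->right && b->left <= a->right;
}

/*
 * Overlap alone is not a conflict: for sets it never is, for maps it is
 * one only if the overlapping intervals map to different data.
 */
static bool interval_conflict(const struct interval *a,
			      const struct interval *b, bool is_map)
{
	if (!intervals_overlap(a, b))
		return false;
	return is_map && a->data != b->data;
}
```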

Signed-off-by: Patrick McHardy <kaber@trash.net>

expr: add comparison function for singleton expressions

Signed-off-by: Patrick McHardy <kaber@trash.net>

expr: make expr_binary_error() usable outside of evaluation

Turn the eval_ctx argument into a list_head to queue the error to.

Signed-off-by: Patrick McHardy <kaber@trash.net>

src: add support for rule human-readable comments

This patch adds support for human-readable comments:

nft add rule filter input accept comment \"accept all traffic\"

Note that comments *always* come at the end of the rule. This uses
the new data area that allows you to attach information to the rule
via netlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: fix chain attribute parsing

The handle's table was being set to the chain name instead of the
chain table attribute.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ct: connlabel matching support

Takes advantage of the fact that the current maximum label storage area
is 128 bits, i.e. the dynamically allocated extension area in the
kernel will always fit into a nft register.

Currently this re-uses rt_symbol_table_init() to read connlabel.conf.
This works since the format is pretty much the same.
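
To illustrate the size argument: 128 label bits occupy 16 bytes, so a
fixed-size buffer is enough to hold every possible label. The sketch
below is not the nft or kernel representation; `struct labels` and its
helpers are hypothetical, and the bit ordering within the buffer is an
assumption made for the example.

```c
#include <stdbool.h>
#include <string.h>

#define LABEL_BITS	128			/* maximum label storage area */
#define LABEL_BYTES	(LABEL_BITS / 8)	/* 16 bytes in total */

/* Hypothetical fixed-size label area. */
struct labels {
	unsigned char bits[LABEL_BYTES];
};

static void labels_init(struct labels *l)
{
	memset(l->bits, 0, sizeof(l->bits));
}

/* Returns false if the label number does not fit into the 128-bit area. */
static bool label_set(struct labels *l, unsigned int bit)
{
	if (bit >= LABEL_BITS)
		return false;
	l->bits[bit / 8] |= 1u << (bit % 8);
	return true;
}

/* Returns true if the given label bit is set. */
static bool label_test(const struct labels *l, unsigned int bit)
{
	if (bit >= LABEL_BITS)
		return false;
	return (l->bits[bit / 8] & (1u << (bit % 8))) != 0;
}
```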

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: delete unused variable

The table object that is allocated is unused.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ct: direction should be integer, not bitmask

The direction should always generate a cmp op (it is an enum with values
0, 1 in the kernel).

Note: 'original,reply' will no longer work after this patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink_delinearize: meta: fix wrong type in attributes

We segfault on 'list filter' when a meta expr is used, as _u8
returns invalid register 0.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netlink: fix prefix expression handling

The prefix expression handling is full of bugs:

- netlink_gen_data() is used to construct the prefix mask from the full
  prefix expression. This is both conceptually wrong, the prefix expression
  is *not* data, and buggy, it only assumes network masks and thus only
  handles big endian types.

- Prefix expression reconstruction doesn't check whether the mask is a
  valid prefix and reconstructs crap otherwise. It doesn't reconstruct
  prefixes for anything but network addresses. On top of that it's
  needlessly complicated; using the mpz values directly, it's a simple
  matter of finding the sequence of 1's that extends up to the full width.

- Unnecessary cloning of expressions where a simple refcount increase would
  suffice.

Rewrite that code properly.
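
The prefix check mentioned above can be sketched as follows. This is a
simplified illustration, not the rewritten code: it uses a 32-bit integer
where the real code works on mpz values, and `mask_to_prefix()` is a
hypothetical name.

```c
#include <stdint.h>

/*
 * Return the prefix length if `mask` is a run of 1s that extends up to
 * the most significant bit of a `width`-bit field, or -1 if it is not a
 * valid prefix mask. Only widths from 1 to 32 are handled in this sketch.
 */
static int mask_to_prefix(uint32_t mask, unsigned int width)
{
	unsigned int i;
	int len = 0, seen_zero = 0;

	if (width == 0 || width > 32)
		return -1;
	/* Bits set beyond the field width can never form a prefix. */
	if (width < 32 && (mask >> width) != 0)
		return -1;

	/* Walk from the top bit down: 1s must not follow a 0. */
	for (i = width; i-- > 0; ) {
		if (mask & ((uint32_t)1 << i)) {
			if (seen_zero)
				return -1;
			len++;
		} else {
			seen_zero = 1;
		}
	}
	return len;
}
```

Under these assumptions, 0xffffff00 with a width of 32 yields a prefix
length of 24, while 0xff00ff00 is rejected.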

Signed-off-by: Patrick McHardy <kaber@trash.net>

netlink_delinearize: convert *all* bitmask values into individual bit values

We're currently only converting bitmask types as direct argument to a
relational expression in the form of a flagcmp (expr & mask neq 0) back
into a list of bit values. This means expressions like:

tcp flags & (syn | ack) == syn | ack

won't be shown symbolically. Convert *all* bitmask values back to a sequence
of inclusive or expressions of the individual bits. In case of a flagcmp,
this sequence is further converted to a list (tcp flags syn,ack).
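
A rough sketch of the conversion, assuming the standard TCP flag bit
values: each set bit is looked up in a symbol table and the names are
joined with a separator (" | " for an inclusive or sequence, "," for the
list form). `struct flag_sym`, `tcp_flags` and `bitmask_to_str()` are
hypothetical names, not the ones used in nft.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical symbol table entry: flag name and its bit value. */
struct flag_sym {
	const char	*name;
	unsigned int	value;
};

static const struct flag_sym tcp_flags[] = {
	{ "fin", 0x01 }, { "syn", 0x02 }, { "rst", 0x04 },
	{ "psh", 0x08 }, { "ack", 0x10 }, { "urg", 0x20 },
};

/*
 * Write the symbolic form of `value` into `buf`, joining the names of
 * the individual bits with `sep`. Returns `buf`; output is cut short if
 * the buffer is too small.
 */
static char *bitmask_to_str(unsigned int value, const char *sep,
			    char *buf, size_t len)
{
	size_t off = 0, i;
	int n;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(tcp_flags) / sizeof(tcp_flags[0]); i++) {
		if (!(value & tcp_flags[i].value))
			continue;
		n = snprintf(buf + off, len - off, "%s%s",
			     off ? sep : "", tcp_flags[i].name);
		if (n < 0 || (size_t)n >= len - off)
			return buf;	/* truncated, stop here */
		off += (size_t)n;
	}
	return buf;
}
```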

Signed-off-by: Patrick McHardy <kaber@trash.net>

binop: take care of operator precedence when printing binop arguments

When the argument of a binop is a binop itself, we may need to add parens
if the precedence of the argument is lower than that of the binop.

Before:

tcp flags & syn | ack == syn | ack
tcp flags & syn | ack != syn | ack

After:

tcp flags & (syn | ack) == syn | ack
tcp flags & (syn | ack) != syn | ack
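
The parenthesization rule can be sketched like this. It is an
illustration only: `struct expr`, `enum op` and the print helpers are
hypothetical, and the precedence order (| lowest, then ^, then &) is
assumed to follow C.

```c
#include <stdio.h>
#include <string.h>

/* Operators in increasing precedence (assumed to be C-like). */
enum op { OP_NONE, OP_OR, OP_XOR, OP_AND };

/* Hypothetical expression node: a leaf symbol or a binary operation. */
struct expr {
	enum op			op;	/* OP_NONE for a leaf */
	const char		*sym;	/* leaf text */
	const struct expr	*left, *right;
};

static const char *op_str[] = {
	[OP_OR] = "|", [OP_XOR] = "^", [OP_AND] = "&",
};

/* Append `s` at offset `off`, return the new (untruncated) offset. */
static size_t emit(char *buf, size_t len, size_t off, const char *s)
{
	if (off < len)
		snprintf(buf + off, len - off, "%s", s);
	return off + strlen(s);
}

static size_t expr_print(const struct expr *e, char *buf, size_t len,
			 size_t off);

/* Print an argument, adding parens if it binds less tightly than parent. */
static size_t arg_print(const struct expr *parent, const struct expr *arg,
			char *buf, size_t len, size_t off)
{
	int parens = arg->op != OP_NONE && arg->op < parent->op;

	if (parens)
		off = emit(buf, len, off, "(");
	off = expr_print(arg, buf, len, off);
	if (parens)
		off = emit(buf, len, off, ")");
	return off;
}

static size_t expr_print(const struct expr *e, char *buf, size_t len,
			 size_t off)
{
	if (e->op == OP_NONE)
		return emit(buf, len, off, e->sym);

	off = arg_print(e, e->left, buf, len, off);
	off = emit(buf, len, off, " ");
	off = emit(buf, len, off, op_str[e->op]);
	off = emit(buf, len, off, " ");
	return arg_print(e, e->right, buf, len, off);
}
```

With this rule, an OR nested inside an AND is wrapped in parens, while an
AND nested inside an OR is printed as is.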

Signed-off-by: Patrick McHardy <kaber@trash.net>

evaluate: use flagcmp for single RHS bitmask expression

Always use flagcmp for RHS bitmask expressions, independent of whether
only one or an entire list of bitmask expressions is specified.

This makes sure that f.i. "tcp flags ack" will match any combinations
of ACK instead of ACK and only ACK.
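
The difference between the two comparisons, as a small sketch. The
helper names `flagcmp()` and `eqcmp()` and the TCP_* macros are
hypothetical; the flag values are the standard TCP header bits.

```c
#include <stdbool.h>

#define TCP_SYN	0x02
#define TCP_ACK	0x10

/* flagcmp: true if any of the bits in `mask` is set in `value`. */
static bool flagcmp(unsigned int value, unsigned int mask)
{
	return (value & mask) != 0;
}

/* Plain equality: true only if `value` is exactly `mask`. */
static bool eqcmp(unsigned int value, unsigned int mask)
{
	return value == mask;
}
```

For a SYN+ACK packet, flagcmp(flags, TCP_ACK) is true, whereas
eqcmp(flags, TCP_ACK) is false because SYN is set as well.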

Signed-off-by: Patrick McHardy <kaber@trash.net>

Merge branch 'next-3.14' of git.netfilter.org:nftables into next-3.14

src: proto: fixed a rreply symbol

There is a bug with the rreply symbol: the rreply and reply symbols had
the same value.

The bug can be reproduced as follows:
$ sudo nft add rule arp art-t filter arp operation reply
$ sudo nft list table arp art-t
table arp art-t {
        chain filter {
                type filter hook input priority 0;
                arp operation 512
        }
}

$ sudo nft add rule arp art-t filter arp operation rreply
$ sudo nft list table arp art-t
table arp art-t {
        chain filter {
                type filter hook input priority 0;
                arp operation 512  <=====
                arp operation 512  <=====
        }
}

Signed-off-by: Patrick McHardy <kaber@trash.net>

meta: remove line break when printing priority

The line break is added after printing the rule.