Assuming that pending extensions are sorted by first name and family,
then descending revision, the decision where to insert a newly
registered extension may be simplified by memorizing the previous
registration (which obviously is of same name and family and higher
revision).
As a side-effect, fix for unsupported old extension revisions lingering
in pending extension list forever and being retried with every use of
the given extension. Any revision being rejected by the kernel may
safely be dropped iff a previous (read: higher) revision was accepted
already.
Yet another side-effect of this change is the removal of an unwanted
recursion by xtables_fully_register_pending_*() into itself via
xtables_find_*().
Phil Sutter [Mon, 21 Sep 2020 11:42:06 +0000 (13:42 +0200)]
libxtables: Make sure extensions register in revision order
Insert extensions into pending lists in ordered fashion: Group by
extension name (and, for matches, family) and order groups by descending
revision number.
This allows to simplify the later full registration considerably. Since
that involves kernel compatibility checks, the extra cycles here pay off
eventually.
Phil Sutter [Tue, 6 Oct 2020 17:07:19 +0000 (19:07 +0200)]
extensions: libipt_icmp: Fix translation of type 'any'
By itself, '-m icmp --icmp-type any' is a noop, it matches any icmp
types. Yet nft_ipv4_xlate() does not emit an 'ip protocol' match if
there's an extension with same name present in the rule. Luckily, legacy
iptables demands icmp match to be prepended by '-p icmp', so we can
assume this is present and just emit the 'ip protocol' match from icmp
xlate callback.
Fixes: aa158ca0fda65 ("extensions: libipt_icmp: Add translation to nft") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Florian Westphal <fw@strlen.de>
Phil Sutter [Mon, 28 Sep 2020 16:57:18 +0000 (18:57 +0200)]
nft: Fix for broken address mask match detection
Trying to decide whether a bitwise expression is needed to match parts
of a source or destination address only, add_addr() checks if all bytes
in 'mask' are 0xff or not. The check is apparently broken though as each
byte in 'mask' is cast to a signed char before comparing against 0xff,
therefore the bitwise is always added:
Phil Sutter [Wed, 8 Jul 2020 21:03:12 +0000 (23:03 +0200)]
nft: Fold nftnl_rule_list_chain_save() into caller
Existence of this function was mostly code-duplication: Caller already
branches depending on whether 'chain' is NULL or not and even does the
chain list lookup.
While being at it, simplify __nftnl_rule_list_chain_save function name a
bit now that the non-prefixed name is gone.
Phil Sutter [Fri, 31 Jul 2020 16:20:17 +0000 (18:20 +0200)]
nft: Fix for ruleset flush while restoring
If ruleset is flushed while an instance of iptables-nft-restore is
running and has seen a COMMIT line once, it doesn't notice the
disappeared table while handling the next COMMIT. This is due to table
existence being tracked via 'initialized' boolean which is only reset
by nft_table_flush().
To fix this, drop the dedicated 'initialized' boolean and switch users
to the recently introduced 'exists' one.
As a side-effect, this causes base chain existence being checked for
each command calling nft_xt_builtin_init() as the old 'initialized' bit
was used to track if that function has been called before or not.
Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Fri, 7 Aug 2020 14:42:07 +0000 (16:42 +0200)]
xtables-monitor: Fix ip6tables rule printing
When printing an ip6tables rule event, false family ops are used as they
are initially looked up for AF_INET and reused no matter the current
rule's family. In practice, this means that nft_rule_print_save() calls
the wrong rule_to_cs, save_rule and clear_cs callbacks. Therefore, if a
rule specifies a source or destination address, the address is not
printed.
Fix this by performing a family lookup each time rule_cb is called.
Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 6 Aug 2020 16:52:34 +0000 (18:52 +0200)]
tests: shell: Merge and extend return codes test
Merge scripts for iptables and ip6tables, they were widely identical.
Also extend the test by one check (removing a non-existent rule with
valid chain and target) and quote the error messages where differences
are deliberately ignored.
Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Fri, 7 Aug 2020 11:48:28 +0000 (13:48 +0200)]
nft: Fix command name in ip6tables error message
Upon errors, ip6tables-nft would prefix its error messages with
'iptables:' instead of 'ip6tables:'. Turns out the command name was
hard-coded, use 'progname' variable instead.
While being at it, merge the two mostly identical fprintf() calls into
one.
Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 30 Jul 2020 09:54:36 +0000 (11:54 +0200)]
nft: Eliminate table list from cache
The full list of tables in kernel is not relevant, only those used by
iptables-nft and for those, knowing if they exist or not is sufficient.
For holding that information, the already existing 'table' array in
nft_cache suits well.
Consequently, nft_table_find() merely checks if the new 'exists' boolean
is true or not and nft_for_each_table() iterates over the builtin_table
array in nft_handle, additionally checking the boolean in cache for
whether to skip the entry or not.
iptables: replace libnftnl table list by linux list
This patch removes the libnftnl table list by linux list. This comes
with an extra memory allocation to store the nft_table object. Probably,
there is no need to cache the entire nftnl_table in the near future.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 7 Jul 2020 16:40:11 +0000 (18:40 +0200)]
nft: Reorder enum nft_table_type
This list of table types is used internally only, the actual values
don't matter that much. Reorder them to match the order in which
iptables-legacy-save prints them (if present). As a consequence, entries
in builtin_table array 'xtables_ipv4' are correctly sorted as well.
Phil Sutter [Wed, 8 Jul 2020 13:18:48 +0000 (15:18 +0200)]
nft: Turn nft_chain_save() into a foreach-callback
Let nftnl_chain_list_foreach() do the chain list iterating instead of
open-coding it. While being at it, simplify the policy value selection
code as well.
Phil Sutter [Fri, 10 Jul 2020 19:53:08 +0000 (21:53 +0200)]
nft: cache: Drop duplicate chain check
When fetching chains from kernel, checking for duplicate chain names is
not needed: Nftables doesn't support them in the first place. This is
merely a leftover from when multiple cache fetches could happen and so a
bit of sanity checking was in order.
Phil Sutter [Fri, 10 Jul 2020 18:08:35 +0000 (20:08 +0200)]
nft: Make table creation purely implicit
While asserting a required builtin chain exists, its table is created
implicitly if missing. Exploit this from xtables-restore, too: The only
actions which need adjustment are chain_new and chain_restore, i.e. when
restoring (either builtin or custom) chains.
Note: The call to nft_table_builtin_add() wasn't sufficient as it
doesn't set the table as initialized and therefore a following call to
nft_xt_builtin_init() would override non-default base chain policies.
Note2: The 'table_new' callback in 'nft_xt_restore_cb' is left in place
as xtables-translate uses it to print an explicit 'add table' command.
Note3: nft_table_new() function was already unused since a7f1e208cdf9c
("nft: split parsing from netlink commands").
allow users to override at runtime the lock file to use through the
XTABLES_LOCKFILE environment variable.
It allows to use iptables when the user has granted enough
capabilities (e.g. a user+network namespace) to configure the network
but that lacks access to the XT_LOCK_NAME (by default placed under
/run).
libxtables: compiler warning fixes for NO_SHARED_LIBS
Fixes two issues with NO_SHARED_LIBS:
- #include <dlfcn.h> is ifdef'ed out and thus dlclose()
triggers an undeclared function compiler warning
- dlreg_add() is unused and thus triggers an unused
function warning
Test: builds without warnings Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: d4409d449c10fa ("nft: Don't exit early after printing help texts") Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Arturo Borrero Gonzalez <arturo@netfilter.org>
Phil Sutter [Tue, 12 May 2020 10:59:42 +0000 (12:59 +0200)]
xtables-restore: Fix verbose mode table flushing
When called with --verbose mode, iptables-nft-restore did not print
anything when flushing the table. Fix this by adding a "manual" mode to
nft_cmd_table_flush(), turning it into a wrapper around '-F' and '-X'
commands, which is exactly what iptables-legacy-restore does to flush a
table. This though requires a real cache, so don't set NFT_CL_FAKE then.
Phil Sutter [Wed, 3 Jun 2020 09:41:55 +0000 (11:41 +0200)]
build: bump dependency on libnftnl
Recently added full among match support depends on concatenated ranges
in nftables sets, a feature which was not available in libnftnl before
version 1.1.6.
Fixes: c33bae9c6c7a4 ("ebtables: among: Support mixed MAC and MAC/IP entries") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Thu, 5 Dec 2019 12:35:25 +0000 (13:35 +0100)]
include: Avoid undefined left-shift in xt_sctp.h
Pull the fix in kernel commit 164166558aace ("netfilter: uapi: Avoid
undefined left-shift in xt_sctp.h") into iptables repository. The
original description is:
With 'bytes(__u32)' being 32, a left-shift of 31 may happen which is
undefined for the signed 32-bit value 1. Avoid this by declaring 1 as
unsigned.
Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 19 May 2020 23:00:57 +0000 (01:00 +0200)]
doc: libxt_MARK: OUTPUT chain is fine, too
In order to route packets originating from the host itself based on
fwmark, mangle table's OUTPUT chain must be used. Mention this chain as
alternative to PREROUTING.
Fixes: c9be7f153f7bf ("doc: libxt_MARK: no longer restricted to mangle table") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Thu, 7 May 2020 16:53:47 +0000 (18:53 +0200)]
nft: Merge nft_*_rule_find() functions
Both ebtables and arptables are fine with using nft_ipv46_rule_find()
instead of their own implementations. Take the chance and move the
former into nft.c as a static helper since it is used in a single place,
only. Then get rid of the callback from family_ops.
Phil Sutter [Sat, 9 May 2020 11:42:56 +0000 (13:42 +0200)]
nfnl_osf: Improve error handling
For some error cases, no log message was created - hence apart from the
return code there was no indication of failing execution.
If a line load fails, don't abort but continue with the remaining
file contents. The current pf.os file in this repository serves as
proof-of-concept:
Lines 700 and 701: Duplicates of lines 698 and 699 because 'W*' and 'W0'
parse into the same data.
Line 704: Duplicate of line 702 because apart from 'W*' and 'W0', only
the first three fields on right-hand side are sent to the kernel.
When loading, these dups are ignored (they would bounce if NLM_F_EXCL
was given). Upon deletion, they cause ENOENT response from kernel. In
order to align duplicate-tolerance in both modes, just ignore that
ENOENT.
Phil Sutter [Sat, 9 May 2020 11:36:49 +0000 (13:36 +0200)]
nfnl_osf: Fix broken conversion to nfnl_query()
Due to missing NLM_F_ACK flag in request, nfnetlink code in kernel
didn't create an own ACK message but left it upon subsystem to ACK or
not. Since nfnetlink_osf doesn't ACK by itself, nfnl_query() got stuck
waiting for a reply.
Whoever did the conversion from deprecated nfnl_talk() obviously didn't
even test basic functionality of the tool.
Fixes: 52aa15098ebd6 ("nfnl_osf: Replace deprecated nfnl_talk() by nfnl_query()") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Fri, 8 May 2020 12:57:36 +0000 (14:57 +0200)]
iptables-test: Don't choke on empty lines
The script code wasn't expecting empty lines:
| Traceback (most recent call last):
| File "./iptables-test.py", line 380, in <module>
| main()
| File "./iptables-test.py", line 370, in main
| file_tests, file_passed = run_test_file(filename, args.netns)
| File "./iptables-test.py", line 265, in run_test_file
| if item[1] == "=":
| IndexError: list index out of range
Fix this by ignoring empty lines or those consisting of whitespace only.
While being at it, remove the empty line from libxt_IDLETIMER.t which
exposed the problem.
Phil Sutter [Wed, 6 May 2020 11:33:20 +0000 (13:33 +0200)]
nft: Fix leak when replacing a rule
If nft_rule_append() is called with a reference rule, it is supposed to
insert the new rule at the reference position and then remove the
reference from cache. Instead, it removed the new rule from cache again
right after inserting it. Also, it missed to free the removed rule.
Fixes: 5ca9acf51adf9 ("xtables: Fix position of replaced rules in cache") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Tue, 5 May 2020 17:36:13 +0000 (19:36 +0200)]
nft: Use clear_cs() instead of open coding
In a few places, initialized struct iptables_command_state was not fully
deinitialized. Change them to call nft_clear_iptables_command_state()
which does it properly.
Phil Sutter [Tue, 5 May 2020 11:56:11 +0000 (13:56 +0200)]
libxtables: Introduce xtables_fini()
Record handles of loaded shared objects in a linked list and dlclose()
them from the newly introduced function. While functionally not
necessary, this clears up valgrind's memcheck output when also
displaying reachable memory.
Since this is an extra function that doesn't change the existing API,
increment both current and age.
Phil Sutter [Tue, 5 May 2020 11:45:06 +0000 (13:45 +0200)]
ebtables: Free statically loaded extensions again
All ebtables extensions are loaded upon program start as due to the lack
of '-m' parameters, loading on demand is not possible. Introduce
nft_fini_eb() to counteract nft_init_eb() and free dynamic memory in
matches and targets from there.
Phil Sutter [Mon, 4 May 2020 17:30:29 +0000 (19:30 +0200)]
nft: Fix leaks in ebt_add_policy_rule()
The function leaked memory allocated in temporary struct
iptables_command_state, clean it immediately after use.
In any of the udata-related error cases, allocated nftnl_rule would
leak, fix this by introducing a common error path to goto.
In regular code path, the allocated nftnl_rule would still leak:
batch_obj_del() does not free rules in NFT_COMPAT_RULE_APPEND jobs, as
they typically sit in cache as well. Policy rules in turn weren't added
to cache: They are created immediately before commit and never
referenced from other rules. Add them now so they are freed just like
regular rules.
Fixes: aff1162b3e4b7 ("ebtables-nft: Support user-defined chain policies") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Mon, 4 May 2020 17:20:52 +0000 (19:20 +0200)]
nft: Clear all lists in nft_fini()
Remove and free any pending entries in obj_list and err_list as well. To
get by without having to declare list-specific cursors, use generic
list_head types and call list_entry() explicitly.
Phil Sutter [Fri, 1 May 2020 05:59:36 +0000 (07:59 +0200)]
nft: cache: Re-establish cache consistency check
Restore code ensuring __nft_build_cache() returns a consistent cache in
which all ruleset elements belong to the same generation.
This check was removed by commit 200bc39965149 ("nft: cache: Fix
iptables-save segfault under stress") as it could lead to segfaults if a
partial cache fetch was done while cache's chain list was traversed.
With the new cache fetch logic, __nft_build_cache() is never called
while holding references to cache entries.
Phil Sutter [Tue, 5 May 2020 16:37:49 +0000 (18:37 +0200)]
tests: shell: Implement --valgrind mode
Wrap every call to $XT_MULTI with valgrind, or actually a wrapper script
which does the valgrind wrap and stores the log if it contains something
relevant.
Carefully name the wrapper script(s) so that test cases' checks on
$XT_MULTI name stay intact.
This mode slows down testsuite execution horribly. Luckily, it's not
meant for constant use, though.
For now, ignore commands with non-zero exit status - error paths
typically hit direct exit() calls and therefore leave reachable memory
in place.
Phil Sutter [Fri, 24 Apr 2020 09:32:08 +0000 (11:32 +0200)]
nft: Fix for '-F' in iptables dumps
When restoring a dump which contains an explicit flush command,
previously added rules are removed from cache and the following commit
will try to create netlink messages based on freed memory.
Fix this by weeding any rule-based commands from obj_list if they
address the same chain.
Phil Sutter [Mon, 27 Apr 2020 10:08:59 +0000 (12:08 +0200)]
nft: cache: Optimize caching for flush command
When flushing all chains and verbose mode is not enabled,
nft_rule_flush() uses a shortcut: It doesn't specify a chain name for
NFT_MSG_DELRULE, so the kernel will flush all existing chains without
user space needing to know which they are.
The above allows to avoid a chain cache, but there's a caveat:
nft_xt_builtin_init() will create base chains as it assumes they are
missing and thereby possibly overrides any non-default chain policies.
Solve this by making nft_xt_builtin_init() cache-aware: If a command
doesn't need a chain cache, there's no need to bother with creating any
non-existing builtin chains, either. For the sake of completeness, also
do nothing if cache is not initialized (although that shouldn't happen).
Phil Sutter [Thu, 19 Mar 2020 17:58:29 +0000 (18:58 +0100)]
nft: cache: Fetch cache for specific chains
Iterate over command list and collect chains to cache. Insert them into
a sorted list to pass to __nft_build_cache().
If a command is interested in all chains (e.g., --list), cmd->chain
remains unset. To record this case reliably, use a boolean
('all_chains'). Otherwise, it is hard to distinguish between first call
to nft_cache_level_set() and previous command with NULL cmd->chain
value.
When caching only specific chains, manually add builtin ones for the
given table as well - otherwise nft_xt_builtin_init() will act as if
they don't exist and possibly override non-default chain policies.
Phil Sutter [Wed, 18 Mar 2020 16:08:31 +0000 (17:08 +0100)]
nft-cache: Fetch cache per table
Restore per-table operation of cache routines as initially implemented
in commit e2883c5531e6e ("nft-cache: Support partial cache per table").
As before, this doesn't limit fetching of tables (their number is
supposed to be low) but instead limits fetching of sets, chains and
rules to the specified table.
For this to behave correctly when restoring without flushing over
multiple tables, cache must be freed fully after each commit - otherwise
the previous table's cache level is reused for the current one. The
exception being fake cache, used for flushing restore: NFT_CL_FAKE is
set just once at program startup, so it must stay set otherwise
consecutive tables cause pointless cache fetching.
The sole use-case requiring a multi-table cache, iptables-save, is
indicated by req->table being NULL. Therefore, req->table assignment is
a bit sloppy: All calls to nft_cache_level_set() are assumed to set the
same table value, collision detection exists merely to catch programming
mistakes.
Make nft_fini() call nft_release_cache() instead of flush_chain_cache(),
the former does a full cache deinit including cache_req contents.
Phil Sutter [Tue, 7 Apr 2020 11:47:54 +0000 (13:47 +0200)]
nft: cache: Improve fake cache integration
With NFT_CL_FAKE being highest cache level while at the same time
__nft_build_cache() treating it equal to NFT_CL_TABLES, no special
handling for fake cache is required anymore.
Phil Sutter [Thu, 26 Mar 2020 03:47:11 +0000 (04:47 +0100)]
nft: cache: Simplify rule and set fetchers
Since no incremental cache fetching happens anymore, code fetching rules
for chains or elements for sets may safely assume that whatever is in
cache also didn't get populated with rules or elements before.
Therefore no (optional) chain name needs to be passed on to
fetch_rule_cache() and fetch_set_cache() doesn't have to select for
which sets in a table to call set_fetch_elem_cb().
The cache requirements are now calculated once from the parsing phase.
There is no need to call __nft_build_cache() from several spots in the
codepath anymore.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Phil Sutter <phil@nwl.cc>
nft: calculate cache requirements from list of commands
This patch uses the new list of commands to calculate the cache
requirements, the rationale after this updates is the following:
#1 Parsing, that builds the list of commands and it also calculates
cache level requirements.
#2 Cache building.
#3 Translate commands to jobs
#4 Translate jobs to netlink
This patch removes the pre-parsing code in xtables-restore.c to
calculate the cache.
After this patch, cache is calculated only once, there is no need
to cancel and refetch for an in-transit transaction.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Phil Sutter <phil@nwl.cc>
This patch updates the parser to generate a list of command objects.
This list of commands is then transformed to a list of netlink jobs.
This new command object stores the rule using the nftnl representation
via nft_rule_new().
To reduce the number of updates in this patch, the nft_*_rule_find()
functions have been updated to restore the native representation to
skip the update of the rule comparison code.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Tue, 7 Apr 2020 19:17:21 +0000 (21:17 +0200)]
ebtables-restore: Table line to trigger implicit commit
Cache code is suited for holding multiple tables' data at once. The only
users of that are xtables-save and ebtables-restore with its support for
multiple tables and lack of explicit COMMIT lines.
Remove the second user by introducing implicit commits upon table line
parsing. This would allow to make cache single table only, but then
xtables-save would fetch cache multiple times (once for each table) and
therefore lose atomicity with regards to the acquired kernel ruleset
image.
Ensures that each logged line is flushed to stdout after it's
written, and not held in any buffer.
Places to modify found via:
git grep -C5 'fputs[(]buffer, stdout[)];'
On Android iptables-restore -v is run as netd daemon's child process
and fed actions via pipe. '#PING' is used to verify the child
is still responsive, and thus needs to be unbuffered.
Luckily if you're running iptables-restore in verbose mode you
probably either don't care about performance or - like Android
- actually need this.
Test: builds, required on Android for ip6?tables-restore netd
subprocess health monitoring. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
iptables: include sys/time.h to fix lack of struct timeval declaration
This fixes clang compiler warnings:
iptables/xshared.h:176:50: error: declaration of 'struct timeval' will not be visible outside of this function [-Werror,-Wvisibility]
extern int xtables_lock_or_exit(int wait, struct timeval *tv);
^
iptables/xshared.h:179:57: error: declaration of 'struct timeval' will not be visible outside of this function [-Werror,-Wvisibility]
void parse_wait_interval(int argc, char *argv[], struct timeval *wait_interval);
^
Test: builds with less warnings Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
extensions: include strings.h for the definition of ffs()
This resolves clang compiler warnings:
extensions/libext4_srcs/gen/gensrcs/external/iptables/extensions/libipt_ULOG.c:89:32: error: implicit declaration of function 'ffs' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
printf(" --ulog-nlgroup %d", ffs(loginfo->nl_group));
^
extensions/libext4_srcs/gen/gensrcs/external/iptables/extensions/libipt_ULOG.c:105:9: error: implicit declaration of function 'ffs' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
ffs(loginfo->nl_group));
^
extensions/libext_srcs/gen/gensrcs/external/iptables/extensions/libxt_addrtype.c:263:14: error: implicit declaration of function 'ffs' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
int first = ffs(val);
^
Test: builds with less warnings Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add a second table to dump/restore. This triggers failures after
reverting c550c81fd373e ("nft: cache: Fix nft_release_cache() under
stress"), hence acts as a reproducer for the bug fixed by that commit as
well.
Phil Sutter [Sun, 22 Sep 2019 11:10:10 +0000 (13:10 +0200)]
tests: shell: Improve ipt-restore/0001load-specific-table_0 a bit
Instead of reading from stdin, pass dump file as regular parameter. This
way dump file name occurs in 'bash -x' output which helps finding out
where things fail.
Introduce "--alarm" option for idletimer rule.
If it is present, hardidle-timer is used, else default timer.
The default idletimer starts a deferrable timer or in other
words the timer will cease to run when cpu is in suspended
state. This change introduces the option to start a
non-deferrable or alarm timer which will continue to run even
when the cpu is in suspended state.
Signed-off-by: Manoj Basapathi <manojbm@codeaurora.org> Signed-off-by: Sauvik Saha <ssaha@codeaurora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
bpf: relax inode permission check for retrieving bpf program
For iptable module to load a bpf program from a pinned location, it
only retrieve a loaded program and cannot change the program content so
requiring a write permission for it might not be necessary.
Also when adding or removing an unrelated iptable rule, it might need to
flush and reload the xt_bpf related rules as well and triggers the inode
permission check. It might be better to remove the write premission
check for the inode so we won't need to grant write access to all the
processes that flush and restore iptables rules.
kernel/bpf/inode.c:
- int ret = inode_permission(inode, MAY_READ | MAY_WRITE);
+ int ret = inode_permission(inode, MAY_READ);
In practice, AFAICT, the xt_bpf match .fd field isn't even used by new
kernels, but I believe it might be needed for compatibility with old ones
(though I'm pretty sure table modifications on them will outright fail).
Test: builds, passes Android test suite (albeit on an older iptables base),
git grep bpf_obj_get - finds no other users Cc: Chenbo Feng <fengc@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Fri, 13 Mar 2020 12:02:12 +0000 (13:02 +0100)]
nft: cache: Fix iptables-save segfault under stress
If kernel ruleset is constantly changing, code called by
nft_is_table_compatible() may crash: For each item in table's chain
list, nft_is_chain_compatible() is called. This in turn calls
nft_build_cache() to fetch chain's rules. Though if kernel genid has changed
meanwhile, cache is flushed and rebuilt from scratch, thereby freeing
table's chain list - the foreach loop in nft_is_table_compatible() then
operates on freed memory.
To fix the problem, basically revert commit ab1cd3b510fa5 ("nft: ensure
cache consistency") so that __nft_build_cache() no longer flushes the
cache. Instead just record kernel's genid when fetching for the first
time. If kernel rule set changes until the changes are committed, the
commit simply fails and local cache is being rebuilt.
Phil Sutter [Mon, 2 Mar 2020 17:29:54 +0000 (18:29 +0100)]
nft: cache: Review flush_cache()
While fixing for iptables-nft-restore under stress, I managed to hit
NULL-pointer deref in flush_cache(). Given that nftnl_*_list_free()
functions are not NULL-pointer tolerant, better make sure such are not
passed by accident.
Phil Sutter [Mon, 2 Mar 2020 17:17:51 +0000 (18:17 +0100)]
nft: cache: Simplify chain list allocation
Allocate chain lists right after fetching table cache, regardless of
whether partial cache is fetched or not. Chain list pointers reside in
struct nft_cache's table array and hence are present irrespective of
actual tables in kernel. Given the small number of tables, there wasn't
much overhead avoided by the conditional in fetch_chain_cache().
Phil Sutter [Sat, 29 Feb 2020 01:08:26 +0000 (02:08 +0100)]
nft: cache: Make nft_rebuild_cache() respect fake cache
If transaction needed a refresh in nft_action(), restore with flush
would fetch a full cache instead of merely refreshing table list
contained in "fake" cache.
To fix this, nft_rebuild_cache() must distinguish between fake cache and
full rule cache. Therefore introduce NFT_CL_FAKE to be distinguished
from NFT_CL_RULES.
Phil Sutter [Fri, 28 Feb 2020 19:32:13 +0000 (20:32 +0100)]
nft: cache: Fix nft_release_cache() under stress
iptables-nft-restore calls nft_action(h, NFT_COMPAT_COMMIT) for each
COMMIT line in input. When restoring a dump containing multiple large
tables, chances are nft_rebuild_cache() has to run multiple times.
If the above happens, consecutive table contents are added to __cache[1]
which nft_rebuild_cache() then frees, so next commit attempt accesses
invalid memory.
Fix this by making nft_release_cache() (called after each successful
commit) return things into pre-rebuild state again, but keeping the
fresh cache copy.
Fixes: f6ad231d698c7 ("nft: keep original cache in case of ERESTART") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Wed, 4 Mar 2020 01:43:27 +0000 (02:43 +0100)]
connlabel: Allow numeric labels even if connlabel.conf exists
Existing code is a bit quirky: If no connlabel.conf was found, the local
function connlabel_value_parse() is called which tries to interpret
given label as a number. If the config exists though,
nfct_labelmap_get_bit() is called instead which doesn't care about
"undefined" connlabel names. So unless installed connlabel.conf contains
entries for all possible numeric labels, rules added by users may stop
working if a connlabel.conf is created.
Related man page snippet states: "Using a number always overrides
connlabel.conf", so try numeric parsing and fall back to nfct only if
that failed.
Fixes: 51340f7b6a110 ("extensions: libxt_connlabel: use libnetfilter_conntrack") Fixes: 3a3bb480a738a ("extensions: connlabel: Fallback on missing connlabel.conf") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Fri, 21 Feb 2020 13:55:52 +0000 (14:55 +0100)]
xtables: Review nft_init()
Move common code into nft_init(), such as:
* initial zeroing nft_handle fields
* family ops lookup and assignment to 'ops' field
* setting of 'family' field
This requires minor adjustments in xtables_restore_main() so extra field
initialization doesn't happen before nft_init() call.
As a side-effect, this fixes segfaulting xtables-monitor binary when
printing rules for trace event as in that code-path 'ops' field wasn't
initialized.
Phil Sutter [Fri, 21 Feb 2020 12:18:32 +0000 (13:18 +0100)]
xtables: Align effect of -4/-6 options with legacy
Legacy iptables doesn't accept -4 or -6 if they don't match the
symlink's native family. The only exception to that is iptables-restore
which simply ignores the lines introduced by non-matching options, which
is useful to create combined dump files for feeding into both
iptables-restore and ip6tables-restore.
Phil Sutter [Tue, 18 Feb 2020 15:43:16 +0000 (16:43 +0100)]
iptables-test.py: Fix --host mode
In some cases, the script still called repo binaries. Avoid this when in
--host mode to allow testing without the need to compile sources in
beforehand.
Fixes: 1b5d762c1865e ("iptables-test: Support testing host binaries") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Thu, 13 Feb 2020 16:49:53 +0000 (17:49 +0100)]
ebtables: among: Support mixed MAC and MAC/IP entries
Powered by Stefano's support for concatenated ranges, a full among match
replacement can be implemented. The trick is to add MAC-only elements as
a concatenation of MAC and zero-length prefix, i.e. a range from
0.0.0.0 till 255.255.255.255.
Although not quite needed, detection of pure MAC-only matches is left in
place. For those, no implicit 'meta protocol' match is added (which is
required otherwise at least to keep nft output correct) and no concat
type is used for the set.
Phil Sutter [Thu, 13 Feb 2020 13:01:50 +0000 (14:01 +0100)]
xtables-translate: Fix for iface++
In legacy iptables, only the last plus sign remains special, any
previous ones are taken literally. Therefore xtables-translate must not
replace all of them with asterisk but just the last one.
Fixes: e179e87a1179e ("xtables-translate: Fix for interface name corner-cases") Signed-off-by: Phil Sutter <phil@nwl.cc>