datatype: fix leak and cleanup reference counting for struct datatype
Test `./tests/shell/run-tests.sh -V tests/shell/testcases/maps/nat_addr_port`
fails:
==118== 195 (112 direct, 83 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3
==118== at 0x484682C: calloc (vg_replace_malloc.c:1554)
==118== by 0x48A39DD: xmalloc (utils.c:37)
==118== by 0x48A39DD: xzalloc (utils.c:76)
==118== by 0x487BDFD: datatype_alloc (datatype.c:1205)
==118== by 0x487BDFD: concat_type_alloc (datatype.c:1288)
==118== by 0x488229D: stmt_evaluate_nat_map (evaluate.c:3786)
==118== by 0x488229D: stmt_evaluate_nat (evaluate.c:3892)
==118== by 0x488229D: stmt_evaluate (evaluate.c:4450)
==118== by 0x488328E: rule_evaluate (evaluate.c:4956)
==118== by 0x48ADC71: nft_evaluate (libnftables.c:552)
==118== by 0x48AEC29: nft_run_cmd_from_buffer (libnftables.c:595)
==118== by 0x402983: main (main.c:534)
I think the reference handling for datatype is wrong. It was introduced
by commit
01a13882bb59 ('src: add reference counter for dynamic
datatypes').
We don't notice it most of the time, because instances are statically
allocated, where datatype_get()/datatype_free() is a NOP.
Fix and rework.
- Commit
01a13882bb59 comments "The reference counter of any newly
allocated datatype is set to zero". That seems not workable.
Previously, functions like datatype_clone() would have returned the
refcnt set to zero. Some callers would then then set the refcnt to one, but
some wouldn't (set_datatype_alloc()). Calling datatype_free() with a
refcnt of zero will overflow to UINT_MAX and leak:
if (--dtype->refcnt > 0)
return;
While there could be schemes with such asymmetric counting that juggle the
appropriate number of datatype_get() and datatype_free() calls, this is
confusing and error prone. The common pattern is that every
alloc/clone/get/ref is paired with exactly one unref/free.
Let datatype_clone() return references with refcnt set 1 and in
general be always clear about where we transfer ownership (take a
reference) and where we need to release it.
- set_datatype_alloc() needs to consistently return ownership to the
reference. Previously, some code paths would and others wouldn't.
- Replace
datatype_set(key, set_datatype_alloc(dtype, key->byteorder))
with a __datatype_set() with takes ownership.
Fixes: 01a13882bb59 ('src: add reference counter for dynamic datatypes')
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>