mm/kmemleak: dedupe verbose scan output by allocation backtrace
Patch series "mm/kmemleak: dedupe verbose scan output", v3.
I am starting to run kmemleak with verbose mode enabled at some "probe
points" across my employer's fleet so that suspected leaks land in
dmesg without needing a separate read of /sys/kernel/debug/kmemleak.
The downside is that workloads which leak many objects from a single
allocation site flood the console with byte-for-byte identical backtraces.
Hundreds of duplicates per scan are common, drowning out distinct leaks
and unrelated kernel messages, while adding no signal beyond the first
occurrence.
This series collapses those duplicates inside kmemleak itself. Each
unique stackdepot trace_handle prints once per scan, followed by a short
summary line when more than one object shares it:
kmemleak: unreferenced object 0xff110001083beb00 (size 192):
kmemleak: comm "modprobe", pid 974, jiffies 4294754196
kmemleak: ...
kmemleak: backtrace (crc 6f361828):
kmemleak: __kmalloc_cache_noprof+0x1af/0x650
kmemleak: ...
kmemleak: ... and 71 more object(s) with the same backtrace
The "N new suspected memory leaks" tally and the contents of
/sys/kernel/debug/kmemleak are unchanged - the per-object detail is still
available on demand, only the verbose (dmesg) output is collapsed.
Patch 1 is the kmemleak change.
Patch 2 adds a selftest that loads the kmemleak-test module from
samples/kmemleak (CONFIG_SAMPLE_KMEMLEAK) to generate ten leaks sharing
one call site and checks that the printed count is strictly less than the
reported leak total. I am not sure whether patch 2 is useful; if it is
not, it is easy to discard.
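The check described for patch 2 can be pictured with a small userspace
sketch. This is not the selftest itself: count_occurrences() and
dedup_observed() are hypothetical helpers, and the captured log is assumed
to be available as one NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in log. */
static size_t count_occurrences(const char *log, const char *needle)
{
	size_t n = 0, len;
	const char *p;

	assert(log && needle);
	len = strlen(needle);
	if (len == 0)
		return 0;
	for (p = strstr(log, needle); p; p = strstr(p + len, needle))
		n++;
	return n;
}

/*
 * Return 1 when the verbose log shows fewer object headers than the
 * reported leak total, i.e. at least one duplicate was collapsed.
 */
static int dedup_observed(const char *log, size_t reported_leaks)
{
	return count_occurrences(log, "unreferenced object") < reported_leaks;
}
```

With ten leaks from one call site, a deduplicated log carries one header
and the check passes; a log with one header per leak would fail it.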
This patch (of 2):
In kmemleak's verbose mode, every unreferenced object found during a scan
is logged with its full header, hex dump and 16-frame backtrace.
Workloads that leak many objects from a single allocation site flood dmesg
with byte-for-byte identical backtraces, drowning out distinct leaks and
other kernel messages.
Dedupe within each scan using stackdepot's trace_handle as the key: for
every leaked object with a recorded stack trace, look up the
representative kmemleak_object in a per-scan xarray keyed by trace_handle.
The first sighting stores the object pointer (with a get_object()
reference) and sets object->dup_count to 1; later sightings just bump
dup_count on the representative. After the scan, walk the xarray once and
emit each unique backtrace, followed by a single summary line when more
than one object shares it.
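The record-then-flush flow above can be sketched in userspace C. This is
an illustration only, not the kernel code: struct leak, dedup_record() and
dedup_flush() are hypothetical names, a fixed-size open-addressing table
stands in for the xarray, and reference counting and locking are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the fields of kmemleak_object used here. */
struct leak {
	uint32_t trace_handle;	/* dedup key, assumed non-zero */
	unsigned long pointer;	/* address of the leaked allocation */
	unsigned int dup_count;	/* meaningful on the representative only */
};

#define TABLE_SIZE 64		/* power of two; toy replacement for the xarray */

struct dedup_table {
	struct leak *slot[TABLE_SIZE];
};

/* Record one leak. Returns 0 on success, -1 if the table is full. */
static int dedup_record(struct dedup_table *t, struct leak *obj)
{
	size_t i, pos;

	assert(obj->trace_handle != 0);
	for (i = 0; i < TABLE_SIZE; i++) {
		pos = (obj->trace_handle + i) & (TABLE_SIZE - 1);
		if (!t->slot[pos]) {
			/* First sighting: obj becomes the representative. */
			obj->dup_count = 1;
			t->slot[pos] = obj;
			return 0;
		}
		if (t->slot[pos]->trace_handle == obj->trace_handle) {
			/* Later sighting: only bump the counter. */
			t->slot[pos]->dup_count++;
			return 0;
		}
	}
	return -1;
}

/* Walk the table once; returns the number of unique traces printed. */
static unsigned int dedup_flush(const struct dedup_table *t, FILE *out)
{
	unsigned int printed = 0;
	size_t i;

	for (i = 0; i < TABLE_SIZE; i++) {
		const struct leak *rep = t->slot[i];

		if (!rep)
			continue;
		fprintf(out, "unreferenced object 0x%lx\n", rep->pointer);
		if (rep->dup_count > 1)
			fprintf(out, "... and %u more object(s) with the same backtrace\n",
				rep->dup_count - 1);
		printed++;
	}
	return printed;
}
```

In this sketch a full table is reported to the caller, which mirrors the
idea of falling back to an inline print rather than dropping the leak.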
Leaks whose trace_handle is 0 (early-boot allocations tracked before
kmemleak_init() set up object_cache, or stack_depot_save() failures under
memory pressure) cannot be deduped, so they are still printed inline via
the same locked OBJECT_ALLOCATED-checked helper. The contents of
/sys/kernel/debug/kmemleak are unchanged - only the verbose console output
is collapsed.
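To make the effect of the trace_handle == 0 rule concrete, the following
userspace sketch counts how many object headers one scan would print for a
given list of handles. printed_headers() is a hypothetical helper, and the
quadratic search is only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Leaks with handle 0 have no dedup key and are each printed inline;
 * leaks with a non-zero handle are printed once per distinct handle.
 */
static size_t printed_headers(const uint32_t *handles, size_t n)
{
	size_t i, j, printed = 0;

	assert(handles || n == 0);
	for (i = 0; i < n; i++) {
		if (handles[i] == 0) {
			printed++;	/* cannot be deduped */
			continue;
		}
		for (j = 0; j < i; j++)
			if (handles[j] == handles[i])
				break;
		if (j == i)
			printed++;	/* first sighting of this handle */
	}
	return printed;
}
```

For the handles {0, 5, 5, 0, 8, 5} this yields four headers: two inline
prints for the zero handles and one each for handles 5 and 8.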
Safety notes:
- The xarray store happens outside object->lock: object->lock is a
raw spinlock, while xa_store() may take xa_node slab locks at a
higher wait-context level, which lockdep flags as invalid.
trace_handle is captured under object->lock (which serialises with
kmemleak_update_trace()'s writer), so it is safe to use after
dropping the lock.
- get_object() pins the kmemleak_object metadata across
rcu_read_unlock(), but the underlying tracked allocation can still
be freed concurrently. The deferred print path therefore re-acquires
object->lock and re-checks OBJECT_ALLOCATED via print_leak_locked()
before touching object->pointer; __delete_object() clears that flag
under the same lock before the user memory goes away. The same
helper is used by the trace_handle == 0 and xa_store() failure
fallbacks, so every printer in the new path has identical safety
guarantees.
- If get_object() fails after we set OBJECT_REPORTED, the object is
already being torn down (use_count hit zero); the leak count is
still accurate but the verbose line is dropped, which is correct:
the memory was freed concurrently and is no longer a leak.
- If xa_store() fails to allocate an xa_node under memory pressure,
we fall back to printing inline via print_leak_locked() instead of
silently dropping the leak.
- The hex dump is skipped for coalesced entries (dup_count > 1):
bytes would differ across objects sharing a backtrace anyway, and
skipping it removes the only remaining read of object->pointer's
contents in the deferred path. The representative's reported size
may also differ from the coalesced objects' sizes. Likewise, the
printed trace_handle reflects the representative's current value
rather than the value used as the dedup key, which is normally - but
not strictly - identical.
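The first two safety notes follow a common pattern: snapshot the key under
the object lock, do the store after dropping it, and re-check liveness
under the same lock before a deferred print. A minimal userspace sketch,
assuming a pthread mutex in place of the raw spinlock and hypothetical
names (struct tracked, OBJ_ALLOCATED, snapshot_handle(),
print_if_still_allocated(), mark_freed()):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define OBJ_ALLOCATED 0x1u	/* hypothetical mirror of OBJECT_ALLOCATED */

struct tracked {
	pthread_mutex_t lock;	/* stands in for object->lock */
	unsigned int flags;
	uint32_t trace_handle;
	unsigned long pointer;
};

/*
 * Read the dedup key under the lock. The caller performs the (possibly
 * allocating) table store only after this function has returned.
 */
static uint32_t snapshot_handle(struct tracked *obj)
{
	uint32_t handle;

	assert(obj);
	pthread_mutex_lock(&obj->lock);
	handle = obj->trace_handle;
	pthread_mutex_unlock(&obj->lock);
	return handle;
}

/* Deferred print: re-take the lock and print only if still allocated. */
static bool print_if_still_allocated(struct tracked *obj, FILE *out)
{
	bool printed = false;

	pthread_mutex_lock(&obj->lock);
	if (obj->flags & OBJ_ALLOCATED) {
		fprintf(out, "unreferenced object 0x%lx\n", obj->pointer);
		printed = true;
	}
	pthread_mutex_unlock(&obj->lock);
	return printed;
}

/* Free path: clear the flag under the same lock before memory goes away. */
static void mark_freed(struct tracked *obj)
{
	pthread_mutex_lock(&obj->lock);
	obj->flags &= ~OBJ_ALLOCATED;
	pthread_mutex_unlock(&obj->lock);
}
```

Because the flag is cleared and tested under the same lock, a print that
races with a free either completes before the flag is cleared or sees the
cleared flag and is skipped.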
Link: https://lore.kernel.org/20260506-kmemleak_dedup-v3-0-2d36aafc34da@debian.org
Link: https://lore.kernel.org/20260506-kmemleak_dedup-v3-1-2d36aafc34da@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>