mm/damon/stat: calculate and expose idle time percentiles
Knowing how much memory is how cold can be useful for understanding the
coldness and utilization efficiency of memory. DAMON's raw monitoring
results contain this information. Convert the raw results into a
per-byte idle time distribution and expose it to users as percentiles,
via a read-only DAMON_STAT parameter.
In detail, the metrics are calculated as follows. First, DAMON's
per-region access frequency and age information are converted into
per-byte idle times. If the access frequency of a region is higher than
zero, every byte of the region has zero idle time. If the access
frequency of a region is zero, every byte of the region has an idle time
equal to the age of the region. Then the logic sorts the per-byte idle
times and provides the values at the 0/100, 1/100, ..., 99/100 and
100/100 locations of the sorted array.
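
The steps above can be sketched in userspace C as follows. This is only
an illustration of the described calculation, not the code added by this
commit. 'struct region', its fields, and idle_time_percentiles() are
hypothetical names, and the time unit of 'age' is left abstract. Because
every byte of a region shares one idle time, the sketch sorts regions
and walks cumulative sizes instead of building a real per-byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a monitored region; not the kernel struct. */
struct region {
	unsigned long long sz_bytes;	/* size of the region */
	unsigned int nr_accesses;	/* access frequency */
	unsigned long age;		/* age, in an arbitrary time unit */
};

/* Accessed regions have zero idle time; others are idle for their age. */
static unsigned long region_idle_time(const struct region *r)
{
	return r->nr_accesses ? 0 : r->age;
}

static int cmp_idle_time(const void *a, const void *b)
{
	unsigned long ia = region_idle_time(a);
	unsigned long ib = region_idle_time(b);

	return (ia > ib) - (ia < ib);
}

/*
 * Fill percentiles[p] (p in 0..100) with the idle time of the byte at
 * the p/100 location of the idle-time-sorted bytes.  Sorts 'regions' in
 * place.
 */
void idle_time_percentiles(struct region *regions, size_t nr_regions,
			   unsigned long percentiles[101])
{
	unsigned long long total = 0, accounted = 0;
	size_t i;
	int p = 0;

	for (i = 0; i < nr_regions; i++)
		total += regions[i].sz_bytes;
	assert(total > 0);

	qsort(regions, nr_regions, sizeof(*regions), cmp_idle_time);

	for (i = 0; i < nr_regions; i++) {
		if (!regions[i].sz_bytes)
			continue;
		accounted += regions[i].sz_bytes;
		/* This region covers every location up to accounted/total. */
		while (p <= 100 && accounted * 100 >= total * p)
			percentiles[p++] = region_idle_time(&regions[i]);
	}
}
```

For instance, with a 50-byte accessed region and two 25-byte unaccessed
regions of age 30 and 120, percentiles 0 to 50 are 0, percentiles 51 to
75 are 30, and percentiles 76 to 100 are 120.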
The metric can be easily aggregated and compared on large scale production
systems. For example, if the average of the 75th percentile idle times
collected from machines at a similar time is two minutes, it means that,
on average, 25 percent of each system's memory has not been accessed at
all for two minutes or more. If a workload considers two minutes as its
unit work time, we can conclude its working set size is only 75 percent
of the memory. If the system utilizes proactive reclamation that
supports coldness-based thresholds, like DAMON_RECLAIM, the idle time
percentiles can be used to find a safer or more aggressive coldness
threshold for the targeted memory saving.
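
As a rough sketch of such usage, the userspace C helpers below read a
101-entry percentiles array the same way as the example above. The
function names are hypothetical and not part of DAMON, the array is
assumed to be sorted in ascending order with entry p holding the p-th
percentile idle time, and the results are only estimates at one percent
granularity.

```c
#include <assert.h>

#define NR_PERCENTILES 101

/*
 * Estimate the percent of memory that has been idle for at least
 * 'min_idle'.  If the p-th percentile is the lowest one reaching
 * 'min_idle', about (100 - p) percent of the memory is that cold.
 */
int cold_memory_percent(const unsigned long pct[NR_PERCENTILES],
			unsigned long min_idle)
{
	int p;

	for (p = 0; p < NR_PERCENTILES; p++) {
		if (pct[p] >= min_idle)
			return 100 - p;
	}
	return 0;
}

/*
 * Pick a coldness threshold that would cover about 'saving_percent' of
 * the memory: the idle time at the (100 - saving_percent)-th percentile.
 */
unsigned long threshold_for_saving(const unsigned long pct[NR_PERCENTILES],
				   int saving_percent)
{
	assert(saving_percent >= 0 && saving_percent <= 100);
	return pct[100 - saving_percent];
}
```

With a 75th percentile idle time of 120 seconds, cold_memory_percent()
for 120 seconds gives 25, so the working set estimate is the remaining
75 percent, and threshold_for_saving() for a 25 percent saving returns
120 seconds. Asking for a larger saving returns a shorter, more
aggressive threshold.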
Link: https://lkml.kernel.org/r/20250604183127.13968-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>