core: Use oom_group_kill attribute if OOMPolicy=kill
For managed oom kills, we check the user.oomd_ooms property which
reports how many times systemd-oomd recursively killed the entire
cgroup. For kernel OOM kills, we check the oom_kill property from
memory.events which reports how many processes were killed by the
kernel OOM killer in the corresponding cgroup and its child cgroups.
For units with Delegate=yes, this is problematic because OOM kills
in child cgroups that were handled by the delegated unit are still
treated as unit OOM kills by systemd.
Specifically, suppose systemd runs inside the delegated unit and manages
the delegated cgroup, and memory.oom.group=1 is set on both the service
cgroup and a child cgroup. If the child cgroup is OOM killed and the
systemd running inside the delegated unit handles this, the unit will
still be treated as oom-killed when it exits later, because oom_kill in
memory.events includes the OOM kills that happened in the child cgroup.
To address this, the oom_group_kill property was added to the
memory.events and memory.events.local files. It reports how many times
the entire cgroup was OOM killed by the kernel when memory.oom.group=1
is set.
If we read this from memory.events.local, we know how many times the unit's
entire cgroup (plus child cgroups) got OOM killed by the kernel. This matches
what we report for systemd-oomd managed OOM kills and avoids reporting the
unit as oom-killed when only a child cgroup with memory.oom.group=1 set was
OOM killed by the kernel.
Since this is only available from kernel 5.12 onwards, we fall back to
reading the oom_kill field from memory.events if the oom_group_kill property
is not available.
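
The fallback described above could look roughly like the sketch below.
This is an illustration only, not the code in this change: the helper
names keyed_lookup() and unit_oom_kill_count() are hypothetical, and the
callers are assumed to have already read memory.events.local and
memory.events into NUL-terminated strings (NULL if a file is missing).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find "key value" in the text of a flat-keyed
 * cgroup file. Returns 0 on success, -ENOENT if the key is absent,
 * -EINVAL if the value does not parse as an unsigned integer. */
static int keyed_lookup(const char *text, const char *key, uint64_t *ret) {
        size_t klen = strlen(key);

        for (const char *line = text; line && *line;) {
                const char *eol = strchr(line, '\n');

                /* Require a space after the key so that "oom" does not
                 * match "oom_kill". */
                if (strncmp(line, key, klen) == 0 && line[klen] == ' ') {
                        const char *val = line + klen + 1;
                        char *end;

                        errno = 0;
                        unsigned long long v = strtoull(val, &end, 10);
                        if (errno != 0 || end == val || (*end != '\n' && *end != '\0'))
                                return -EINVAL;

                        *ret = (uint64_t) v;
                        return 0;
                }

                line = eol ? eol + 1 : NULL;
        }

        return -ENOENT;
}

/* Hypothetical helper: prefer oom_group_kill from memory.events.local,
 * fall back to oom_kill from memory.events if the kernel does not
 * expose oom_group_kill. */
static int unit_oom_kill_count(const char *events_local, const char *events, uint64_t *ret) {
        if (events_local) {
                int r = keyed_lookup(events_local, "oom_group_kill", ret);
                if (r != -ENOENT)
                        return r;
        }

        if (!events)
                return -ENOENT;

        return keyed_lookup(events, "oom_kill", ret);
}
```

With this shape, a kernel OOM kill that only hit a child cgroup leaves
the local oom_group_kill counter of the unit's cgroup untouched, while
on older kernels the result is the same hierarchical oom_kill count as
before.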