When looking at flamegraphs, there is a pretty large entry for the
function call drop_in_place::<Option<Allocation>> which in turn calls
drop_in_place::<Allocation>. Combined with the looper_need_return
condition, this means that the generated code looks like this:
if let Some(buffer) = buffer {
if buffer.looper_need_return_on_free() {
self.inner.lock().looper_need_return = true;
}
}
drop_in_place::<Option<Allocation>>() { // not inlined
if let Some(buffer) = buffer {
drop_in_place::<Allocation>(buffer);
}
}
This kind of situation where you check X and then check X again is
normally optimized into a single condition, but in this case due to the
non-inlined function call to drop_in_place::<Option<Allocation>>, that
optimization does not happen.
Furthermore, the drop_in_place::<Allocation> call is only two-thirds of
the drop_in_place::<Option<Allocation>> call in the flamegraph. This
indicates that this double condition is not performing well. Also, last
time I looked at Binder perf, I remember finding that the destructor of
Allocation was involved with many branch mispredictions.
Thus, change this code to look like this:
if let Some(buffer) = buffer {
if buffer.looper_need_return_on_free() {
self.inner.lock().looper_need_return = true;
}
drop_in_place::<Allocation>(buffer);
}
by dropping the Allocation directly. Flamegraphs confirm that the
drop_in_place::<Option<Allocation>> call disappears from this change.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Carlos Llamas <cmllamas@google.com>
Link: https://patch.msgid.link/20251029-binder-bcfreebuf-option-v1-1-4d282be0439f@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>