riscv: Clean up & optimize unaligned scalar access probe
check_unaligned_access_speed_all_cpus() is more complicated than it should
be:
- It uses on_each_cpu() to probe unaligned memory access on all CPUs but
  excludes CPU0 with a check in the callback function, so the IPI sent to
  CPU0 is wasted.
- Probing on CPU0 is done separately with smp_call_on_cpu(), which is not
  as fast as on_each_cpu().
This design exists because the probe is timed with jiffies: CPU0 needs to
tend to jiffies while the other CPUs are probing, so on_each_cpu()
excludes it.
Instead, replace jiffies usage with ktime_get_mono_fast_ns(). With jiffies
out of the way, on_each_cpu() can be used for all CPUs and
smp_call_on_cpu() can be dropped.
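
For illustration only, the shape of a probe loop that is bounded by a
monotonic nanosecond clock rather than by a tick counter can be sketched in
userspace C. This is not the kernel code: clock_gettime(CLOCK_MONOTONIC)
stands in for ktime_get_mono_fast_ns(), memcpy() stands in for the kernel's
copy routines, and mono_ns() and best_copy_ns() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Userspace stand-in for a monotonic nanosecond clock. */
static uint64_t mono_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Repeat the copy until budget_ns has elapsed and return the shortest
 * single-copy duration seen. The loop reads the clock directly, so it
 * does not rely on another CPU advancing a tick counter.
 */
static uint64_t best_copy_ns(void *dst, const void *src, size_t len,
			     uint64_t budget_ns)
{
	uint64_t start = mono_ns();
	uint64_t best = UINT64_MAX;
	uint64_t t0, t1;

	do {
		t0 = mono_ns();
		memcpy(dst, src, len);
		t1 = mono_ns();
		if (t1 - t0 < best)
			best = t1 - t0;
	} while (t1 - start < budget_ns);

	return best;
}
```

Because the loop condition depends only on a clock that each CPU can read
by itself, every CPU can run the same callback at the same time.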
To make ktime_get_mono_fast_ns() usable, move this probe to late_initcall.
Any initcall level after the clocksource's fs_initcall would work, but
late_initcall is chosen to avoid depending on the clocksource staying at
fs_initcall.
The probe time is now 8000000 ns, which is the same as before (2 jiffies)
for riscv defconfig. This is excessive for the CPUs I have and should
probably be reduced, but that is a different discussion.
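
As a sanity check on the number above, the conversion can be written out in
a few lines of userspace C. The helper name jiffies_to_ns() is hypothetical
and is not the kernel's conversion API. The tick rate of 250 Hz is an
assumption: it is the value that makes 2 jiffies equal 8000000 ns.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Convert a jiffy count to nanoseconds for a given tick rate.
 * The per-tick period is computed with integer division, so tick rates
 * that do not divide 10^9 evenly are truncated.
 */
static uint64_t jiffies_to_ns(uint64_t jiffies, unsigned int hz)
{
	assert(hz > 0);
	return jiffies * (NSEC_PER_SEC / hz);
}
```

jiffies_to_ns(2, 250) gives 8000000. The same 2 jiffies would be 20000000 ns
at 100 Hz and 2000000 ns at 1000 Hz, so a fixed nanosecond budget also makes
the probe time independent of the configured tick rate.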
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Link: https://patch.msgid.link/9b9a20affe2e4f5c380926ceb885a47e20a59395.1770830596.git.namcao@linutronix.de
Signed-off-by: Paul Walmsley <pjw@kernel.org>