tools/nolibc: Optimise and common up the number to ascii functions
Implement u[64]to[ah]_r() using a common function that multiplies by
the reciprocal of the base to generate the least significant digit
first, and then reverses the string.
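As a rough illustration of the approach (not the code in this patch), the
sketch below handles only 32-bit inputs, so a single 64-bit product per
digit is enough. The name sketch_utoa_base(), the floor(2^32 / base)
reciprocal and the correction loop are assumptions made for the example;
the real helper also covers 64-bit values and takes its constants from
the caller.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the nolibc implementation: converts a 32-bit
 * value to text in bases 2..16 and returns the length without the NUL.
 * buf must hold at least 33 bytes (32 binary digits plus the NUL).
 */
static size_t sketch_utoa_base(uint32_t in, char *buf, unsigned int base)
{
	/* floor(2^32 / base); a real caller would pass this as a constant */
	uint32_t recip = (uint32_t)((UINT64_C(1) << 32) / base);
	size_t len = 0, i, j;

	do {
		/* estimated quotient: may be too small, never too large */
		uint32_t q = (uint32_t)(((uint64_t)in * recip) >> 32);
		uint32_t digit = in - q * base;

		/* fix up the estimate so that digit ends up below base */
		while (digit >= base) {
			digit -= base;
			q++;
		}
		buf[len++] = "0123456789abcdef"[digit];
		in = q;
	} while (in);

	/* digits were produced least significant first: reverse in place */
	for (i = 0, j = len - 1; i < j; i++, j--) {
		char tmp = buf[i];

		buf[i] = buf[j];
		buf[j] = tmp;
	}
	buf[len] = '\0';
	return len;
}
```

Because the digits come out least significant first, the loop stops as
soon as the quotient reaches zero, so small values cost only a few
iterations plus a short reversal.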
On 32-bit this is five multiplies (each with a 64-bit product) for each
output digit. I think the old utoa_r() always did 36 multiplies and a lot
of subtractions, so this is likely faster even for 32-bit values.
It is definitely better for 64-bit values (especially small ones).
Clearly shifts are faster for base 16, but reversing the output buffer
makes a big difference.
Sharing the code reduces the footprint (unless gcc decides to
constant-fold the functions).
It definitely helps vfprintf(), where the constants get loaded and a
single call is made.
It also makes it cheap to add octal support to vfprintf() for completeness.
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Link: https://patch.msgid.link/20260223101735.2922-3-david.laight.linux@gmail.com
Acked-by: Willy Tarreau <w@1wt.eu>
[Thomas: skip int128 multiplication on SPARC and clang]
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>