]> git.ipfire.org Git - thirdparty/glibc.git/commit
math: Use lgamma from CORE-MATH
authorAdhemerval Zanella <adhemerval.zanella@linaro.org>
Fri, 10 Oct 2025 18:15:27 +0000 (15:15 -0300)
committerAdhemerval Zanella <adhemerval.zanella@linaro.org>
Mon, 27 Oct 2025 12:34:04 +0000 (09:34 -0300)
commitd67d2f468872c3fe9d3ba2b60eab0e421f906ff2
treed50f40928ade9d64da04f6ea5dfa48a631b0c127
parent140e802cb3e5d5e23b297d2ccf0505b4d348ae4b
math: Use lgamma from CORE-MATH

The current implementation precision shows the following accuracy, on
one range ([-1,1]) with 10e9 uniform randomly generated numbers for
each range (first column is the accuracy in ULP, with '0' being
correctly rounded, second is the number of samples with the
corresponding precision):

* Range [-20, 20]
 * FE_TONEAREST
     0:       6701254075  67.01%
     1:       3230897408  32.31%
     2:         63986940   0.64%
     3:          3605417   0.04%
     4:           233189   0.00%
     5:            20973   0.00%
     6:             1869   0.00%
     7:              125   0.00%
     8:                4   0.00%
 * FE_UPWARDA
     0:       4207428861  42.07%
     1:       5001137116  50.01%
     2:        740542213   7.41%
     3:         49116304   0.49%
     4:          1715617   0.02%
     5:            54464   0.00%
     6:             4956   0.00%
     7:              451   0.00%
     8:               16   0.00%
     9:                2   0.00%
 * FE_DOWNWARD
     0:       4155925193  41.56%
     1:       4989821364  49.90%
     2:        770312796   7.70%
     3:         72014726   0.72%
     4:         11040522   0.11%
     5:           872811   0.01%
     6:            12480   0.00%
     7:              106   0.00%
     8:                2   0.00%
 * FE_TOWARDZERO
     0:       4225861532  42.26%
     1:       5027051105  50.27%
     2:        706443411   7.06%
     3:         39877908   0.40%
     4:           713109   0.01%
     5:            47513   0.00%
     6:             4961   0.00%
     7:              438   0.00%
     8:               23   0.00%

* Range [20, 0x5.d53649e2d4674p+1012]
 * FE_TONEAREST
     0:       7262241995  72.62%
     1:       2737758005  27.38%
 * FE_UPWARD
     0:       4690392401  46.90%
     1:       5143728216  51.44%
     2:        165879383   1.66%
 * FE_DOWNWARD
     0:       4690333331  46.90%
     1:       5143794937  51.44%
     2:        165871732   1.66%
 * FE_TOWARDZERO
     0:       4690343071  46.90%
     1:       5143786761  51.44%
     2:        165870168   1.66%

The CORE-MATH implementation is correctly rounded for any rounding mode.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).

Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1) shows:

reciprocal-throughput        master        patched   improvement
x86_64                     112.9740       135.8640       -20.26%
x86_64v2                   111.8910       131.7590       -17.76%
x86_64v3                   108.2800        68.0935        37.11%
aarch64                     61.3759        49.2403        19.77%
power10                     42.4483        24.1943        43.00%

Latency                      master        patched   improvement
x86_64                     144.0090       167.9750       -16.64%
x86_64v2                   139.2690       167.1900       -20.05%
x86_64v3                   130.1320        96.9347        25.51%
aarch64                     66.8538        53.2747        20.31%
power10                     49.5076        29.6917        40.03%

For x86_64/x86_64-v2, most performance hit came from the fma call
through the ifunc mechanism.

Checked on x86_64-linux-gnu, aarch64-linux-gnu, and
powerpc64le-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
SHARED-FILES
math/Makefile
sysdeps/i386/Makefile
sysdeps/ieee754/dbl-64/e_lgamma_r.c
sysdeps/ieee754/dbl-64/lgamma_neg.c [deleted file]
sysdeps/ieee754/dbl-64/lgamma_product.c [deleted file]
sysdeps/ieee754/dbl-64/libm-test-ulps
sysdeps/ieee754/dbl-64/math_config.h
sysdeps/ieee754/flt-32/lgamma_negf.c [deleted file]
sysdeps/ieee754/flt-32/lgamma_productf.c [deleted file]
sysdeps/ieee754/ldbl-96/lgamma_product.c [deleted file]