FIPS 186-4 has 5 different algorithms for key generation,
and all of them rely on testing GCD(a,n) == 1 many times.
Cachegrind was showing that during a RSA keygen operation,
the function BN_gcd() was taking a considerable percentage
of the total cycles.
The default provider uses multiprime keygen, which seemed to
be much faster. This is because it uses BN_mod_inverse()
instead.
For a 4096 bit key, the entropy of a key that was taking a
long time to generate was recorded and fed back into subsequent
runs. Roughly 40% of the cycle time was BN_gcd() with most of the
remainder in the prime testing. Changing to use the inverse
resulted in the cycle count being 96% in the prime testing.
Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19578)