Allow group methods to customize initialization for speed
This commit also adds an implementation for P256 that avoids some
expensive initialization of Montgomery arithmetic structures in favor
of precomputation. Since ECC groups are not always cached by higher
layers this brings significant savings to TLS handshakes.
Reviewed-by: Shane Lontis <shane.lontis@oracle.com> Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22746)