bn_nist: Fix strict-aliasing violations in little-endian optimizations
The little-endian optimization is doing some type-punning in a way
violating the C standard aliasing rule by loading or storing through a
lvalue with type "unsigned int" but the memory location has effective
type "unsigned long" or "unsigned long long" (BN_ULONG). Convert these
accesses to use memcpy instead, as memcpy is defined as-is "accessing
through the lvalues with type char" and char is aliasing with all types.
GCC does a good job to optimize away the temporary copies introduced
with the change. Ideally copying to a temporary unsigned int array,
doing the calculation, and then copying back to `r_d` will make the code
look better, but unfortunately GCC would fail to optimize away this
temporary array then.
I've not touched the LE optimization in BN_nist_mod_224 because it's
guarded by BN_BITS2!=64, then BN_BITS2 must be 32 and BN_ULONG must be
unsigned int, thus there is no aliasing issue in BN_nist_mod_224.
Fixes #12247.
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com> Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22816)