From: Ken Raeburn
Date: Wed, 2 Dec 2009 23:09:33 +0000 (+0000)
Subject: Perform the AES-CBC XOR operations 4 bytes at a time, using the helper
X-Git-Tag: krb5-1.8-alpha1~103
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=5b3a9ffa46c229814552b30dbb9d1b4a0bba9c46;p=thirdparty%2Fkrb5.git

Perform the AES-CBC XOR operations 4 bytes at a time, using the helper
functions for loading and storing potentially-unaligned values.
Improves bulk AES encryption performance by 2% or so on 32-bit x86
with gcc 4.

git-svn-id: svn://anonsvn.mit.edu/krb5/trunk@23432 dc483132-0cff-0310-8789-dd5450dbe970
---

diff --git a/src/lib/crypto/builtin/enc_provider/aes.c b/src/lib/crypto/builtin/enc_provider/aes.c
index 396f653756..e635f517d2 100644
--- a/src/lib/crypto/builtin/enc_provider/aes.c
+++ b/src/lib/crypto/builtin/enc_provider/aes.c
@@ -51,8 +51,24 @@ static void
 xorblock(unsigned char *out, const unsigned char *in)
 {
     int z;
-    for (z = 0; z < BLOCK_SIZE; z++)
-        out[z] ^= in[z];
+    for (z = 0; z < BLOCK_SIZE/4; z++) {
+        unsigned char *outptr = &out[z*4];
+        unsigned char *inptr = &in[z*4];
+        /* Use unaligned accesses.  On x86, this will probably still
+           be faster than multiple byte accesses for unaligned data,
+           and for aligned data should be far better.  (One test
+           indicated about 2.4% faster encryption for 1024-byte
+           messages.)
+
+           If some other CPU has really slow unaligned-word or byte
+           accesses, perhaps this function (or the load/store
+           helpers?) should test for alignment first.
+
+           If byte accesses are faster than unaligned words, we may
+           need to conditionalize on CPU type, as that may be hard to
+           determine automatically. */
+        store_32_n (load_32_n(outptr) ^ load_32_n(inptr), outptr);
+    }
 }
 
 krb5_error_code
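
[Editor's note: load_32_n() and store_32_n() are the krb5 helpers referred to
above for loading and storing 32-bit words at potentially unaligned addresses
in native byte order.  The sketch below shows one common way such helpers are
written, using memcpy() so the compiler can emit an unaligned word access where
the CPU allows it and fall back to byte copies where it does not.  It is an
illustration only, not the exact krb5 definitions, which live in the tree's
platform headers.]

    /* Illustrative sketch of memcpy-based accessors for possibly
       unaligned 32-bit values, in the spirit of load_32_n/store_32_n;
       not the actual krb5 definitions. */
    #include <stdint.h>
    #include <string.h>

    static inline uint32_t
    load_32_n(const void *p)
    {
        uint32_t n;

        memcpy(&n, p, 4);       /* unaligned-safe load, native byte order */
        return n;
    }

    static inline void
    store_32_n(uint32_t val, void *p)
    {
        memcpy(p, &val, 4);     /* unaligned-safe store, native byte order */
    }

[Because the native-order variants do not byte-swap, XORing two words loaded
this way produces exactly the same output bytes as the original
byte-at-a-time loop.]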