The Lpartial subfunction is entered with plain call instructions,
so it must return with a plain ret. The win64 epilogue should only
run when actually exiting the whole salsa20_crypt function.
2013-04-23 Niels Möller <nisse@lysator.liu.se>
From Martin Storsjö:
+ * x86_64/salsa20-crypt.asm (Lpartial): Don't return via W64_EXIT
+ within this subfunction.
* x86_64/machine.m4 (W64_ENTRY): Use movdqu instead of movdqa for
saving xmm registers, since the stack is not guaranteed to be
16-byte aligned on win64.
shr $16, XREG(T64)
.Llt2:
test $1, LENGTH
- jz .Lend
+ jz .Lret
xor (SRC, POS), LREG(T64)
mov LREG(T64), (DST, POS)
- jmp .Lend
+.Lret:
+ ret
EPILOGUE(nettle_salsa20_crypt)