[librm] Work around two errata in the 386's "popal" instruction
Detailed experiments show that at least one model of 386 CPU has a
previously undocumented errata in the "popal" instruction.
Specifically: when the stack-address size is 16 bits and the operand
size is 32 bits, the "popal" instruction will erroneously load the
high 16 bits of %esp from the value stored on the stack.
The "movl -20(%esp), %esp" instruction near the end of virt_call()
currently relies on the assumption that the high 16 bits of %esp will
already be zero, since they were set to zero by the "movzwl %bp, %esp"
instruction at the end of prot_to_real() and will not have been
subsequently modified by the "popal". This 386 CPU errata invalidates
that assumption, with the result that we end up loading the stack
pointer from an essentially undefined memory location.
Fix by inserting a "movzwl %sp, %esp" after the "popal" to explicitly
zero the high 16 bits of %esp.
Inserting this instruction also happens to work around another (known
and documented) errata in the 386, in which the CPU may malfunction if
"popal" is followed immediately by an instruction that uses a base
address register to form an effective address.
Debugged-by: Jaromir Capik <jaromir.capik@email.cz> Signed-off-by: Michael Brown <mcb30@ipxe.org>