Julian Seward [Wed, 14 May 2014 23:38:23 +0000 (23:38 +0000)]
Implement VFPv4 VFMA and VFMS (F32 and F64 versions). Fixes #331057.
Patch from Janne Hellsten (jjhellst@gmail.com) with algebraic
rearrangement for the VFMS cases so as to make result signs match with
the hardware when some of the inputs are infinities.
Mark Wielaard [Fri, 9 May 2014 11:41:06 +0000 (11:41 +0000)]
Recognize MPX instructions and bnd prefix. Bug #333666.
Recognize and parse operands of new MPX instructions BNDMK, BNDCL,
BNDCU, BNDCN, BNDMOV, BNDLDX and BNDSTX. Also recognize bnd (F2) prefix
for CALL (E8,FF/2), RET (C2,C3), JMP (EB,E9,FF/4) and Jcc (70-7F,0F 80-8F).
All new MPX instructions are currently NOPs and the bnd prefix is ignored.
Julian Seward [Mon, 5 May 2014 10:03:56 +0000 (10:03 +0000)]
Fix assertion failures resulting from change of arity of
Iop_{Add,Sub,Mul}32Fx4 introduced in r2809, in which said IROps
acquired a rounding-mode argument.
Julian Seward [Sun, 4 May 2014 10:52:11 +0000 (10:52 +0000)]
Renaming only (no functional change): rename IR artefacts to do
with i-cache invalidation to be more consistent with new d-cache
invalidation functionality:
Ijk_TInval -> Ijk_InvalICache
TISTART -> CMSTART (CM == "Cache Management")
TILEN -> CMLEN
VEX_TRC_JMP_TINVAL -> VEX_TRC_JMP_INVALICACHE
Julian Seward [Sun, 9 Mar 2014 09:40:23 +0000 (09:40 +0000)]
Do early writeback of the base register for the following instruction
forms, to stop Memcheck complaining about writes below the stack
pointer:
str x3, [sp,#-16]!
stp q0, q1, [sp,#-512]!
Julian Seward [Fri, 7 Mar 2014 22:52:19 +0000 (22:52 +0000)]
Support extra instruction bits and pieces, enough to get Firefox started:
* more scalar int <-> FP conversions
* more vector integer narrowing
* a few more vector shift by imm cases
* FCVTAS (kludged)
Dejan Jevtic [Thu, 27 Feb 2014 14:17:19 +0000 (14:17 +0000)]
mips32: FPU guest registers are ULong and the initial values need to be
extended.
Because we support both big-endian and little-endian mips32, we need to
make sure that the initial values of the FPU registers are the same for
both endiannesses.
Florian Krohm [Fri, 14 Feb 2014 08:55:32 +0000 (08:55 +0000)]
Fix comments and code snippets that were making incorrect claims about
the alignment requirement of the guest state, shadow areas, and register
spill area sizes.
The size of these areas ought to be a multiple of 16 bytes.
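As an illustration of the size requirement above, a minimal sketch of
rounding a byte count up to a multiple of 16 and checking for it. The
helper names are hypothetical and are not taken from the VEX sources.

```c
#include <stddef.h>

/* Hypothetical helper (not from VEX): round a byte count up to the
   next multiple of 16.  Works because 16 is a power of two. */
static size_t round_up_to_16(size_t n)
{
    return (n + 15u) & ~(size_t)15u;
}

/* Hypothetical helper: returns 1 if n is a multiple of 16, else 0. */
static int is_multiple_of_16(size_t n)
{
    return (n & 15u) == 0;
}
```

A size that is already a multiple of 16 is returned unchanged, so the
rounding can be applied unconditionally.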
Florian Krohm [Tue, 11 Feb 2014 09:23:01 +0000 (09:23 +0000)]
s390: Fix s390_amode_for_guest_state. In general the offset relative
to the guest state pointer may be more than the B12 addressing mode can
handle. Fall back and use a B20 addressing mode in those cases.
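The fallback described above can be sketched as follows, assuming the
usual z/Architecture ranges: a 12-bit unsigned displacement for B12 and
a 20-bit signed displacement for B20. The enum and function names are
hypothetical and do not reflect the actual s390_amode_for_guest_state
code.

```c
/* Hypothetical names for illustration; not the VEX definitions. */
typedef enum { AMODE_B12, AMODE_B20, AMODE_NONE } AmodeKind;

/* Pick the smallest addressing mode whose displacement field can hold
   the given offset from the guest state pointer. */
static AmodeKind choose_amode(long offset)
{
    if (offset >= 0 && offset <= 4095)
        return AMODE_B12;   /* fits a 12-bit unsigned displacement */
    if (offset >= -524288 && offset <= 524287)
        return AMODE_B20;   /* fits a 20-bit signed displacement */
    return AMODE_NONE;      /* neither mode can encode the offset */
}
```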
Fix up the x86 and amd64 front ends to add fake rounding modes
(Irrm_NEAREST) when generating expressions using these primops.
Fix up the x86 and amd64 back ends to accept these as triops
rather than as binops, and ignore the first arg.
Add three more ir_opt folding rules to remove memcheck
instrumentation arising from instrumentation of known-defined
rounding modes.
Overall functional and performance effects should be zero.
Julian Seward [Wed, 15 Jan 2014 10:25:21 +0000 (10:25 +0000)]
arm64: rename guest_SP to guest_XSP so as to avoid a name clash with
guest_SP from s390 world. Also back out the rename of guest_SP to
guest_s390_SP that caused s390 build breakage in r2803.
Florian Krohm [Tue, 10 Dec 2013 16:51:15 +0000 (16:51 +0000)]
The result of rounding a 128-bit BFP/DFP value to 32/64 bit needs to
be stored in a register pair. This constraint was not observed previously
and the result was stored in any FPR that happened to be chosen. If the
selected FPR did not identify a proper FPR pair, a SIGILL was delivered.
Fixes BZ #328455.
Fix Bug 327284. The condition code of risbg was not correct.
This instruction might be used by gcc for masking out bits,
e.g. code like
n &= 3;
if (n == 0)
might result in
risbg %r4,%r4,62,128+63,0
je <target>
The old code set the condition code depending on the operand before
masking. Fix it. This patch also indicates that we need test suite
coverage for risbg and friends.
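A minimal sketch of the difference, modelling only the "is the result
zero" test from the n &= 3 example and not the full risbg semantics.
Both function names are hypothetical.

```c
#include <stdint.h>

/* Correct behaviour: derive the zero test from the masked result. */
static int zero_flag_from_result(uint64_t n, uint64_t mask)
{
    uint64_t result = n & mask;
    return result == 0;
}

/* The kind of mistake described above: the test looks at the operand
   before masking, so the mask has no influence on the outcome. */
static int zero_flag_from_operand(uint64_t n, uint64_t mask)
{
    (void)mask;
    return n == 0;
}
```

For n = 4 and mask = 3 the masked result is zero, so the branch should
be taken, but a test on the unmasked operand sees a nonzero value.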
Julian Seward [Mon, 21 Oct 2013 10:05:33 +0000 (10:05 +0000)]
In 64 bit mode, allow 64 bit return values from clean helper calls.
This makes SMC checking calls work (even though they are irrelevant
on PPC targets). Fixes #309430.
Carl Love [Fri, 18 Oct 2013 01:19:06 +0000 (01:19 +0000)]
This commit adds support for the following instructions:
vaddcuq, vadduqm, vaddecuq, vaddeuqm,
vsubcuq, vsubuqm, vsubecuq, vsubeuqm,
vbpermq and vgbbd.
The vgbbd instruction required a new Iop -- Iop_PwBitMtxXpose64x2.
All other instructions were emulated using existing Iops.
The following Iops were added to support the above instructions:
Iop_BCDAdd, Iop_BCDSub,
Iop_PolynomialMulAdd8x16, Iop_PolynomialMulAdd16x8,
Iop_PolynomialMulAdd32x4, Iop_PolynomialMulAdd64x2,
Iop_CipherV128, Iop_CipherLV128, Iop_CipherSV128,
Iop_NCipherV128, Iop_NCipherLV128,
Iop_SHA512, Iop_SHA256, Iop_Clz64x2
Carl Love [Wed, 9 Oct 2013 17:52:01 +0000 (17:52 +0000)]
Power PC, add the two privileged Transactional Memory instructions.
The initial Transactional Memory instruction patch did not include the two
privileged (OS) instructions. This patch adds support for the two
instructions, treclaim and trechkpt.
Carl Love [Wed, 2 Oct 2013 16:25:57 +0000 (16:25 +0000)]
Power PC, Approach 1, add Transactional Memory instruction support
The following Transactional Memory instructions are added:
tbegin., tend., tsr., tcheck., tabortwc.,
tabortdc., tabortwci., tabortdci., tabort.
The patch implements the first proposal by Julian on how to handle the
TM instructions. The proposal is as follows:
translate "XBEGIN fail-addr" as "goto fail-addr"; that is: push
simulated execution directly onto the failure path. This is simple
but will have poor performance, if (as is likely) the failure path
uses normal locking and is not tuned for speed.
The tbegin instruction on Power sets the condition code register to
indicate if the tbegin instruction succeeded or failed. The compiler
then generates a conditional branch instruction to take the success
or failure code path for the tbegin instruction. In order to fail the
tbegin instruction, the condition code register is updated to indicate
that the tbegin instruction failed. This patch assumes that there is
always an error handler for the tbegin instruction. The other TM
instructions are all treated as no-ops, as we shouldn't be executing the
success transactional code path.
Signed-off-by: Carl Love <cel@us.ibm.com>
Bugzilla 323803
The following Iops were added to support the above instructions:
Iop_MullEven32Ux4, Iop_MullEven32Sx4, Iop_Max64Sx2, Iop_Max64Ux2,
Iop_Min64Sx2, Iop_Min64Ux2, Iop_CmpGT64Ux2, Iop_Rol64x2,
Iop_QNarrowBin64Sto32Ux4, Iop_QNarrowBin64Uto32Ux4, Iop_NarrowBin64to32x4,
Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Bugzilla 324894
Add a kludgey implementation of XTEST to go with the kludgey
implementation of XBEGIN. Also kludge the CPUID output for AVX
capable targets so as to claim we support HTM.
(Mark Wielaard, mjw@redhat.com)
Petar Jovanovic [Tue, 24 Sep 2013 22:27:23 +0000 (22:27 +0000)]
mips64: finetune mips_dirtyhelper_calculate_FCSR
Several MIPS32 Revision 2 instructions also belong to Revision 1 of MIPS64.
Modify parts of mips_dirtyhelper_calculate_FCSR to be active for MIPS64R1.
This fixes none/tests/mips64/round when Valgrind is compiled for MIPS64 R1.
Petar Jovanovic [Sat, 21 Sep 2013 01:47:18 +0000 (01:47 +0000)]
mips32: protect mips32r2 instructions with a flag
Fix a regression introduced when mips_dirtyhelper_calculate_FCSR was added.
Inline assembly with MIPS32r2 instructions needs to be protected by flags
that disable it for non-MIPS32r2 platforms such as some Broadcom boards.
Add support for the Intel TM "xbegin" instruction, by jumping directly
to the failure address. Currently disabled pending finding hardware
that can actually execute xbegin, for testing purposes.
x86 front ends: tighten up decoding of MOV Ib,Eb and MOV Iv,Ev. This
failed to check the g-register in the modrm byte, with the result that
it will mis-decode the AVX2 XABORT and XBEGIN instructions as these
instead, with obviously-bizarre consequences.
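A sketch of the tightened check, assuming the standard encodings: MOV
Ib,Eb is C6 /0 and MOV Iv,Ev is C7 /0, while XABORT is C6 F8 and XBEGIN
is C7 F8 (reg field 7). The function names are hypothetical and are not
the VEX decoder's.

```c
#include <stdint.h>

/* Extract the reg ("g-register") field, bits 5..3 of a ModRM byte. */
static unsigned modrm_reg(uint8_t modrm)
{
    return (modrm >> 3) & 7u;
}

/* Hypothetical predicate: opcodes C6/C7 denote MOV immediate to r/m
   only when the reg field is 0; other reg values must be rejected so
   that encodings such as C6 F8 / C7 F8 are not mis-decoded as MOV. */
static int is_mov_imm_to_rm(uint8_t opcode, uint8_t modrm)
{
    return (opcode == 0xC6 || opcode == 0xC7) && modrm_reg(modrm) == 0;
}
```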
Petar Jovanovic [Mon, 16 Sep 2013 18:11:59 +0000 (18:11 +0000)]
mips: clean-up in hardware detection (Cavium/DSP ASEs)
This change is a clean up in MIPS hardware detection code.
A new flag for the Cavium Company ID is added, as well as the codes for 34K
and 74K processors (MIPS Company ID). The latter two represent platforms
with DSP ASEs implemented (Rev 1 and Rev 2 respectively). Macros to detect
these two platforms have been added as well.
Additional macros to extract the Company ID out of hwcaps have been added
too, and are used where possible.
Carl Love [Thu, 12 Sep 2013 17:26:42 +0000 (17:26 +0000)]
The Power ISA 2.07 document includes a correction to the description for the
behavior of the xscvspdp instruction, indicating that if the source argument
is a SNaN, it is first changed to a QNaN before being converted from
single-precision to double-precision. This updated information about the
xscvspdp instruction exposed a bug in the VEX implementation for that
instruction and also a bug in the testing for all instructions having
special behavior for single-precision SNaN arguments.
This patch fixes the VEX bug in the xscvspdp implementation:
The current implementation of xscvspdp emulates the instruction by
extracting the single-precision floating point from the vector register,
storing it in single-precision, and then loading the data just stored using
the lfsx instruction. But the lfsx instruction does not change SNaN input
arguments to QNaN inputs before conversion to double-precision, so this
emulation is not sufficient for the xscvspdp instruction as described in the
current documentation. This patch fixes that issue by recognizing a SNaN input
and changing it to a QNaN before performing the emulation using lfsx.
While fixing the bug in xscvspdp implementation, it was also discovered that
xvcvspdp had the same issue where SNaN inputs were not being handled correctly,
so this patch fixes its implementation, too.
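The SNaN-to-QNaN step described above can be sketched on raw IEEE 754
single-precision bit patterns. This is an illustration only, not the IR
that VEX generates, and the helper names are hypothetical.

```c
#include <stdint.h>

/* A single-precision value is a signalling NaN when the exponent is
   all ones, the fraction is nonzero, and the quiet bit (bit 22, the
   top fraction bit) is clear. */
static int is_snan_f32(uint32_t bits)
{
    uint32_t exp  = (bits >> 23) & 0xFFu;
    uint32_t frac = bits & 0x007FFFFFu;
    return exp == 0xFFu && frac != 0 && (bits & 0x00400000u) == 0;
}

/* Turn an SNaN into the corresponding QNaN by setting the quiet bit;
   every other value is returned unchanged. */
static uint32_t quiet_if_snan_f32(uint32_t bits)
{
    return is_snan_f32(bits) ? (bits | 0x00400000u) : bits;
}
```

Applying quiet_if_snan_f32 before the single-to-double conversion
leaves the sign and the remaining fraction bits of the NaN untouched.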