Julian Seward [Sun, 9 Mar 2014 09:40:23 +0000 (09:40 +0000)]
Do early writeback of the base register for the following instruction
forms, to stop Memcheck complaining about writes below the stack
pointer:
str x3, [sp,#-16]!
stp q0, q1, [sp,#-512]!
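A toy model of why the ordering matters, assuming a checker that flags any store to an address below the stack pointer value visible at the time of the store. The function names are hypothetical and this is not VEX code; it only shows that doing the writeback first makes the store land exactly at the new SP instead of below the old one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy checker: a store is flagged if its address lies below the
   stack pointer value visible when the store is issued. */
static bool store_is_below_sp(uint64_t sp_at_store, uint64_t addr)
{
    return addr < sp_at_store;
}

/* Model a pre-indexed store such as "str x3, [sp,#-16]!".
   If writeback_first is true, SP is updated before the store. */
static bool preindexed_store_flagged(uint64_t sp, int64_t imm,
                                     bool writeback_first)
{
    assert(imm < 0);                        /* only the push-style forms */
    uint64_t new_sp  = sp + (uint64_t)imm;  /* store address == new SP */
    uint64_t sp_seen = writeback_first ? new_sp : sp;
    return store_is_below_sp(sp_seen, new_sp);
}
```

With the store issued first, the model flags both example forms; with the writeback issued first, neither is flagged.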
Julian Seward [Fri, 7 Mar 2014 22:52:19 +0000 (22:52 +0000)]
Support extra instruction bits and pieces, enough to get Firefox started:
* more scalar int <-> FP conversions
* more vector integer narrowing
* a few more vector shift by imm cases
* FCVTAS (kludged)
Dejan Jevtic [Thu, 27 Feb 2014 14:17:19 +0000 (14:17 +0000)]
mips32: FPU guest registers are ULong, so the initial values need to be
extended.
Because we support both big- and little-endian mips32, we need to
make sure that the initial values for the FPU registers are the same for
both endiannesses.
Florian Krohm [Fri, 14 Feb 2014 08:55:32 +0000 (08:55 +0000)]
Fix comments and code snippets that were making incorrect claims about
the alignment requirement of the guest state, shadow areas, and register
spill area sizes.
The size of these areas ought to be a multiple of 16 bytes.
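A minimal sketch of the size rule, assuming nothing about the real VEX structures: one helper rounds a byte count up to the next multiple of 16 and another checks the property. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round a byte count up to the next multiple of 16. */
static size_t round_up_16(size_t n)
{
    assert(n <= (size_t)-1 - 15);   /* guard against wrap-around */
    return (n + 15u) & ~(size_t)15;
}

/* True if n is a multiple of 16 bytes. */
static int is_multiple_of_16(size_t n)
{
    return (n % 16) == 0;
}
```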
Florian Krohm [Tue, 11 Feb 2014 09:23:01 +0000 (09:23 +0000)]
s390: Fix s390_amode_for_guest_state. In general the offset relative
to the guest state pointer may be more than the B12 addressing mode can
handle. Fall back and use a B20 addressing mode in those cases.
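The fallback can be sketched as follows, assuming the usual s390 displacement ranges (unsigned 12-bit for B12, signed 20-bit for B20). The enum and function names are hypothetical and are not the identifiers used in the VEX sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags for the two addressing modes. */
typedef enum { AMODE_B12, AMODE_B20 } AmodeKind;

/* Pick the addressing mode for an offset relative to the
   guest state pointer. */
static AmodeKind pick_amode(int32_t offset)
{
    /* B20 carries a signed 20-bit displacement. */
    assert(offset >= -524288 && offset <= 524287);
    /* B12 carries an unsigned 12-bit displacement. */
    if (offset >= 0 && offset <= 4095)
        return AMODE_B12;
    return AMODE_B20;
}
```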
Fix up the x86 and amd64 front ends to add fake rounding modes
(Irrm_NEAREST) when generating expressions using these primops.
Fix up the x86 and amd64 back ends to accept these as triops
rather than as binops, and ignore the first arg.
Add three more ir_opt folding rules to remove memcheck
instrumentation arising from instrumentation of known-defined
rounding modes.
Overall functional and performance effects should be zero.
Julian Seward [Wed, 15 Jan 2014 10:25:21 +0000 (10:25 +0000)]
arm64: rename guest_SP to guest_XSP so as to avoid a name clash with
guest_SP from s390 world. Also back out the rename of guest_SP to
guest_s390_SP that caused s390 build breakage in r2803.
Florian Krohm [Tue, 10 Dec 2013 16:51:15 +0000 (16:51 +0000)]
The result of rounding a 128-bit BFP/DFP value to 32/64 bit needs to
be stored in a register pair. This constraint was not observed previously
and the result was stored in any FPR that happened to be chosen. If the
selected FPR was not identifying a proper FPR pair, a SIGILL was delivered.
Fixes BZ #328455.
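A sketch of the pairing constraint, assuming the s390 convention that a 128-bit value occupies the floating-point registers (r, r+2) and that r must be one of 0, 1, 4, 5, 8, 9, 12, 13. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* True if FPR number r can be the first register of a pair (r, r+2). */
static bool is_fpr_pair_start(unsigned r)
{
    assert(r < 16);            /* there are 16 FPRs, numbered 0..15 */
    return (r & 2u) == 0;      /* bit 1 clear: 0,1,4,5,8,9,12,13 */
}
```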
Fix Bug 327284. The condition code of risbg was not correct.
This instruction might be used by gcc for masking out bits,
e.g. code like
n &= 3;
if (n == 0)
might result in
risbg %r4,%r4,62,128+63,0
je <target>
The old code set the condition code depending on the operand before
masking. Fix it. This patch also indicates that we need test suite
coverage for risbg and friends.
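A simplified model of the example above, assuming the zero flag is set (the 128 added to the end-bit operand) and the rotate amount is 0, so the instruction reduces to a mask over bits 62..63 in s390 bit numbering (bit 0 is the most significant). The helper names are hypothetical; the point is only that the condition code has to be derived from the masked result, not from the operand.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits start..end, s390 numbering (bit 0 = MSB).
   Only the non-wrapping case start <= end is modelled. */
static uint64_t bit_range_mask(unsigned start, unsigned end)
{
    assert(start <= end && end <= 63);
    uint64_t from_start = ~UINT64_C(0) >> start;       /* bits start..63 */
    uint64_t up_to_end  = ~UINT64_C(0) << (63 - end);  /* bits 0..end */
    return from_start & up_to_end;
}

/* Condition code: 0 = zero, 1 = negative, 2 = positive. */
static int cc_from_value(uint64_t v)
{
    if (v == 0)
        return 0;
    return (v >> 63) ? 1 : 2;
}

/* risbg with zero flag set and rotate amount 0: keep the selected bits. */
static uint64_t risbg_zero_norotate(uint64_t src, unsigned start, unsigned end)
{
    return src & bit_range_mask(start, end);
}
```

For n = 4, the masked result is 0 and the condition code is 0, so the `je` is taken; computing the condition code from the unmasked operand would give 2 instead.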
Julian Seward [Mon, 21 Oct 2013 10:05:33 +0000 (10:05 +0000)]
In 64 bit mode, allow 64 bit return values from clean helper calls.
This makes SMC checking calls work (even though they are irrelevant
on PPC targets). Fixes #309430.
Carl Love [Fri, 18 Oct 2013 01:19:06 +0000 (01:19 +0000)]
This commit adds support for the following instructions:
vaddcuq, vadduqm, vaddecuq, vaddeuqm,
vsubcuq, vsubuqm, vsubecuq, vsubeuqm,
vbpermq and vgbbd.
The vgbbd instruction required a new Iop -- Iop_PwBitMtxXpose64x2.
All other instructions were emulated using existing Iops.
The following Iops were added to support the above instructions:
Iop_BCDAdd, Iop_BCDSub,
Iop_PolynomialMulAdd8x16, Iop_PolynomialMulAdd16x8,
Iop_PolynomialMulAdd32x4, Iop_PolynomialMulAdd64x2,
Iop_CipherV128, Iop_CipherLV128, Iop_CipherSV128,
Iop_NCipherV128, Iop_NCipherLV128,
Iop_SHA512, Iop_SHA256, Iop_Clz64x2
Carl Love [Wed, 9 Oct 2013 17:52:01 +0000 (17:52 +0000)]
Power PC, add the two privileged Transactional Memory instructions.
The initial Transactional Memory instruction patch did not include the two
privileged (OS) instructions. This patch adds support for the two
instructions, treclaim and trechkpt.
Carl Love [Wed, 2 Oct 2013 16:25:57 +0000 (16:25 +0000)]
Power PC, Approach 1, add Transactional Memory instruction support
The following Transactional Memory instructions are added:
tbegin., tend., tsr., tcheck., tabortwc.,
tabortdc., tabortwci., tabortdci., tabort.
The patch implements the first proposal by Julian on how to handle the
TM instructions. The proposal is as follows:
translate "XBEGIN fail-addr" as "goto fail-addr"; that is: push
simulated execution directly onto the failure path. This is simple
but will have poor performance, if (as is likely) the failure path
uses normal locking and is not tuned for speed.
The tbegin instruction on Power sets the condition code register to
indicate if the tbegin instruction succeeded or failed. The compiler
then generates a conditional branch instruction to take the success
or failure code path for the tbegin instruction. In order to fail the
tbegin instruction, the condition code register is updated to indicate
that the tbegin instruction failed. This patch assumes that there is
always an error handler for the tbegin instruction. The other TM
instructions are all treated as no ops as we shouldn't be executing the
success transactional code path.
Signed-off-by: Carl Love <cel@us.ibm.com>
Bugzilla 323803
The following Iops were added to support the above instructions:
Iop_MullEven32Ux4, Iop_MullEven32Sx4, Iop_Max64Sx2, Iop_Max64Ux2,
Iop_Min64Sx2, Iop_Min64Ux2, Iop_CmpGT64Ux2, Iop_Rol64x2,
Iop_QNarrowBin64Sto32Ux4, Iop_QNarrowBin64Uto32Ux4, Iop_NarrowBin64to32x4,
Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Bugzilla 324894
Add a kludgey implementation of XTEST to go with the kludgey
implementation of XBEGIN. Also kludge the CPUID output for AVX
capable targets so as to claim we support HTM.
(Mark Wielaard, mjw@redhat.com)
Petar Jovanovic [Tue, 24 Sep 2013 22:27:23 +0000 (22:27 +0000)]
mips64: finetune mips_dirtyhelper_calculate_FCSR
Several MIPS32 Revision 2 instructions also belong to Revision 1 of MIPS64.
Modify parts of mips_dirtyhelper_calculate_FCSR to be active for MIPS64R1.
This fixes none/tests/mips64/round when Valgrind is compiled for MIPS64 R1.
Petar Jovanovic [Sat, 21 Sep 2013 01:47:18 +0000 (01:47 +0000)]
mips32: protect mips32r2 instructions with a flag
Regression issue that came when mips_dirtyhelper_calculate_FCSR was added.
Inline assembly with MIPS32r2 instructions needs to be protected by flags
that disable it for non-MIPS32r2 platforms such as some Broadcom boards.
Add support for the Intel TM "xbegin" instruction, by jumping directly
to the failure address. Currently disabled pending finding hardware
that can actually execute xbegin, for testing purposes.
x86 front ends: tighten up decoding of MOV Ib,Eb and MOV Iv,Ev. This
failed to check the g-register in the modrm byte, with the result that
it will mis-decode the AVX2 XABORT and XBEGIN instructions as these
instead, with obviously-bizarre consequences.
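A sketch of the tightened check, assuming the standard encodings: MOV Ib,Eb is C6 /0 and MOV Iv,Ev is C7 /0, while XABORT is C6 F8 ib and XBEGIN is C7 F8 followed by a relative offset. The function names are hypothetical and do not come from the VEX decoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the reg ("g-register") field, bits 5..3 of a ModRM byte. */
static unsigned modrm_reg(uint8_t modrm)
{
    return (modrm >> 3) & 7u;
}

/* True if opcode + modrm really encode MOV Ib,Eb or MOV Iv,Ev,
   which both require the reg field to be 0. */
static bool is_mov_imm_to_E(uint8_t opcode, uint8_t modrm)
{
    assert(opcode == 0xC6 || opcode == 0xC7);
    return modrm_reg(modrm) == 0;
}
```

The ModRM byte F8 has a reg field of 7, so C6 F8 and C7 F8 are rejected as MOVs and can be decoded as XABORT and XBEGIN instead.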
Petar Jovanovic [Mon, 16 Sep 2013 18:11:59 +0000 (18:11 +0000)]
mips: clean-up in hardware detection (Cavium/DSP ASEs)
This change is a clean up in MIPS hardware detection code.
A new flag for the Cavium Company ID is added, as well as the codes for the
34K and 74K processors (MIPS Company ID). The latter two represent platforms
with DSP ASEs implemented (Rev 1 and Rev 2 respectively). Macros to detect
these two platforms have been added as well.
Additional macros to extract the Company ID from hwcaps have been added too,
and are used where possible.
Carl Love [Thu, 12 Sep 2013 17:26:42 +0000 (17:26 +0000)]
The Power ISA 2.07 document includes a correction to the description for the
behavior of the xscvspdp instruction, indicating that if the source argument
is a SNaN, it is first changed to a QNaN before being converted from
single-precision to double-precision. This updated information about the
xscvspdp instruction exposed a bug in the VEX implementation for that
instruction and also a bug in the testing for all instructions having
special behavior for single-precision SNaN arguments.
This patch fixes the VEX bug in the xscvspdp implementation:
The current implementation of xscvspdp emulates the instruction by
extracting the single-precision floating point from the vector register,
storing it in single-precision, and then loading the data just stored using
the lfsx instruction. But the lfsx instruction does not change SNaN input
arguments to QNaN inputs before conversion to double-precision, so this
emulation is not sufficient for the xscvspdp instruction as described in the
current documentation. This patch fixes that issue by recognizing a SNaN input
and changing it to a QNaN before performing the emulation using lfsx.
While fixing the bug in xscvspdp implementation, it was also discovered that
xvcvspdp had the same issue where SNaN inputs were not being handled correctly,
so this patch fixes its implementation, too.
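The SNaN handling can be sketched on the raw IEEE 754 single-precision bit pattern: a NaN has an all-ones exponent and a non-zero fraction, and it is signalling when the top fraction bit is clear. The helper names are hypothetical and this is plain C, not the IR that VEX generates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define F32_EXP_MASK  UINT32_C(0x7F800000)
#define F32_FRAC_MASK UINT32_C(0x007FFFFF)
#define F32_QUIET_BIT UINT32_C(0x00400000)

/* True if bits encode any NaN: exponent all ones, fraction non-zero. */
static bool f32_is_nan(uint32_t bits)
{
    return (bits & F32_EXP_MASK) == F32_EXP_MASK
           && (bits & F32_FRAC_MASK) != 0;
}

/* True if bits encode a signalling NaN (quiet bit clear). */
static bool f32_is_snan(uint32_t bits)
{
    return f32_is_nan(bits) && (bits & F32_QUIET_BIT) == 0;
}

/* Turn an SNaN into the corresponding QNaN; leave other values alone. */
static uint32_t f32_quiet_if_snan(uint32_t bits)
{
    uint32_t out = f32_is_snan(bits) ? (bits | F32_QUIET_BIT) : bits;
    assert(!f32_is_snan(out));
    return out;
}
```

Applying `f32_quiet_if_snan` before the single-to-double conversion corresponds to the step the patch adds ahead of the lfsx-based emulation.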
Enhance ado_treebuild_BB to allow an expression preceding a Put
statement and containing one or more Get expressions to be
substituted in an expression following the Put statement.
That transformation is harmless as long as the guest state areas being
accessed by the Put and Get(s) do not overlap.
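The non-overlap condition amounts to an interval test on guest state byte ranges. A minimal sketch, with hypothetical helper names and each access described by an offset and a size in bytes:

```c
#include <assert.h>
#include <stdbool.h>

/* True if [off_a, off_a + size_a) and [off_b, off_b + size_b) intersect. */
static bool ranges_overlap(int off_a, int size_a, int off_b, int size_b)
{
    assert(size_a > 0 && size_b > 0);
    return off_a < off_b + size_b && off_b < off_a + size_a;
}

/* A Get may be carried past a Put only if the two areas are disjoint. */
static bool get_may_move_past_put(int get_off, int get_size,
                                  int put_off, int put_size)
{
    return !ranges_overlap(get_off, get_size, put_off, put_size);
}
```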
Carl Love [Tue, 10 Sep 2013 18:46:40 +0000 (18:46 +0000)]
Bugzilla 323437, this is phase 2 in a series of patches adding support for IBM
Power ISA 2.07. The first bugzilla in the series was: 322294: Add initial
support for IBM Power ISA 2.07
Phase 2 adds support for the following new instructions to
VEX/priv/guest_ppc_toIR.c:
- lq, stq, lqarx, stqcx.
- mfvsrwz, mtvsrwz
- fmrgew, fmrgow
There is a corresponding test case for these instructions, see the bugzilla
for the commit number.
Carl Love [Fri, 6 Sep 2013 22:27:34 +0000 (22:27 +0000)]
The existing overflow detection in VEX/priv/guest_ppc_toIR.c/set_XER_OV_64()
under the case PPCG_FLAG_OP_MULLW: does not apply to the mulldo as we need to
detect overflow when performing a Multiply Low Doubleword (not Multiply Low
Word). Hence, we added a new enumeration value PPCG_FLAG_OP_MULLD in
VEX/priv/guest_ppc_defs.h and a corresponding new case under which the
computation for detecting overflow for mulldo/mulldo. is added in
set_XER_OV_64(). The tests have been added to: none/tests/ppc32/jm-insns.c
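For illustration, signed 64-bit multiply overflow can be detected in portable C with division-based range checks. This is a sketch of the condition only; the helper names are hypothetical and the IR that set_XER_OV_64() generates may compute it differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a * b does not fit in a signed 64-bit integer. */
static bool mul64_overflows(int64_t a, int64_t b)
{
    if (a == 0 || b == 0)
        return false;
    if (a > 0) {
        if (b > 0)
            return a > INT64_MAX / b;   /* product too large */
        return b < INT64_MIN / a;       /* product too negative */
    }
    if (b > 0)
        return a < INT64_MIN / b;       /* product too negative */
    return a < INT64_MAX / b;           /* both negative: too large */
}

/* Multiply only when the check above says the result is representable. */
static int64_t mul64_checked(int64_t a, int64_t b)
{
    assert(!mul64_overflows(a, b));
    return a * b;
}
```

Note that 0x10000 * 0x10000 is representable in 64 bits although it does not fit in 32 bits, which is why a word-sized check cannot be reused for the doubleword case.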
Carl Love [Fri, 6 Sep 2013 16:49:42 +0000 (16:49 +0000)]
The patch used the binary constants 0b10000 and 0b10001. The 0b prefix
is a GCC extension that not all compilers support. Therefore, the binary
constants were changed to their equivalent hex values, as suggested by
Florian.
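The equivalence of the two notations can be checked without relying on the 0b extension. A small sketch with a hypothetical helper that parses a string of binary digits:

```c
#include <assert.h>

/* Parse a string of '0'/'1' characters as an unsigned binary number. */
static unsigned parse_binary(const char *s)
{
    unsigned v = 0;
    for (; *s != '\0'; s++) {
        assert(*s == '0' || *s == '1');
        v = (v << 1) | (unsigned)(*s - '0');
    }
    return v;
}
```

Under this helper, "10000" parses to 0x10 and "10001" parses to 0x11.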
Carl Love [Thu, 5 Sep 2013 19:47:40 +0000 (19:47 +0000)]
The current code is not properly handling a non-zero TH field in the
dcbt instruction, which is valid for several forms of data cache block
touch instructions. This patch adds the needed support to
VEX/priv/guest_ppc_toIR.c.
Mark Wielaard [Tue, 27 Aug 2013 10:19:03 +0000 (10:19 +0000)]
Support mmxext (integer sse) subset on i386 (athlon).
Some processors like the AMD Athlon "Classic" support mmxext,
a sse1 subset. This subset is not properly detected by VEX.
The subset uses the same encoding as the sse1 instructions.
The subset is described at:
http://support.amd.com/us/Embedded_TechDocs/22466.pdf
https://en.wikipedia.org/wiki/3DNow!#3DNow.21_extensions
This introduces a new VEX_HWCAPS_X86_MMXEXT that sits between
the baseline (0) and VEX_HWCAPS_X86_SSE1. There is also a new
x86g_dirtyhelper_CPUID_mmxext to mimic an Athlon "Classic"
(Model 2, K75 "Pluto/Orion").
Groups all mmxext instructions together in one block.