Florian Krohm [Fri, 14 Feb 2014 08:55:32 +0000 (08:55 +0000)]
Fix comments and code snippets that were making incorrect claims about
the alignment requirement of the guest state, shadow areas, and register
spill area sizes.
The size of these areas ought to be a multiple of 16 bytes.
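The 16-byte rule can be checked or enforced with plain mask arithmetic. A minimal sketch follows; the helper names are hypothetical and are not taken from VEX.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers (not VEX functions) illustrating the
   "size must be a multiple of 16 bytes" requirement. */

/* True if size is a multiple of 16. */
bool size_is_16_multiple(size_t size)
{
    return (size & (size_t)15) == 0;
}

/* Round size up to the next multiple of 16; a size that already
   is a multiple of 16 is returned unchanged. */
size_t round_up_to_16(size_t size)
{
    return (size + 15) & ~(size_t)15;
}
```

For example, an area of 40 bytes would have to be padded to 48.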
Florian Krohm [Tue, 11 Feb 2014 09:23:01 +0000 (09:23 +0000)]
s390: Fix s390_amode_for_guest_state. In general the offset relative
to the guest state pointer may be more than the B12 addressing mode can
handle. Fall back and use a B20 addressing mode in those cases.
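The fallback can be pictured as a range check on the offset: a B12 amode carries an unsigned 12-bit displacement, a B20 amode a signed 20-bit one. The sketch below only illustrates that selection; the enum and function names are hypothetical and this is not the code of s390_amode_for_guest_state.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum { DEMO_AMODE_B12, DEMO_AMODE_B20, DEMO_AMODE_NONE } DemoAmodeKind;

/* Pick the smallest base+displacement form that can hold offset. */
DemoAmodeKind demo_choose_amode(int64_t offset)
{
    /* B12: unsigned 12-bit displacement, 0 .. 4095. */
    if (offset >= 0 && offset <= 4095)
        return DEMO_AMODE_B12;
    /* B20: signed 20-bit displacement, -524288 .. 524287. */
    if (offset >= -524288 && offset <= 524287)
        return DEMO_AMODE_B20;
    /* Neither form fits. */
    return DEMO_AMODE_NONE;
}
```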
Fix up the x86 and amd64 front ends to add fake rounding modes
(Irrm_NEAREST) when generating expressions using these primops.
Fix up the x86 and amd64 back ends to accept these as triops
rather than as binops, and ignore the first arg.
Add three more ir_opt folding rules to remove memcheck
instrumentation arising from instrumentation of known-defined
rounding modes.
Overall functional and performance effects should be zero.
Julian Seward [Wed, 15 Jan 2014 10:25:21 +0000 (10:25 +0000)]
arm64: rename guest_SP to guest_XSP so as to avoid a name clash with
guest_SP from s390 world. Also back out the rename of guest_SP to
guest_s390_SP that caused s390 build breakage in r2803.
Florian Krohm [Tue, 10 Dec 2013 16:51:15 +0000 (16:51 +0000)]
The result of rounding a 128-bit BFP/DFP value to 32/64 bit needs to
be stored in a register pair. This constraint was not observed previously
and the result was stored in any FPR that happened to be chosen. If the
selected FPR was not identifying a proper FPR pair, a SIGILL was delivered.
Fixes BZ #328455.
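For reference, the pairing constraint can be stated as a bit test on the register number: an FPR pair consists of registers r and r+2, and only 0, 1, 4, 5, 8, 9, 12 and 13 may name a pair. The helpers below are a hypothetical sketch of that rule, not the register allocator change made by this commit.

```c
#include <stdbool.h>

/* Hypothetical helpers illustrating the s390 FPR pair rule. */

/* True if FPR number r (0..15) may designate a register pair,
   i.e. r is one of 0, 1, 4, 5, 8, 9, 12, 13. */
bool demo_fpr_starts_pair(unsigned r)
{
    return r < 16 && (r & 2) == 0;
}

/* The second register of the pair that starts at r. */
unsigned demo_fpr_pair_partner(unsigned r)
{
    return r + 2;
}
```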
Fix Bug 327284. The condition code of risbg was not correct.
This instruction might be used by gcc for masking out bits,
e.g. code like
n &= 3;
if (n == 0)
might result in
risbg %r4,%r4,62,128+63,0
je <target>
The old code set the condition code depending on the operand before
masking. Fix it. This patch also indicates that we need test suite
coverage for risbg and friends.
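The difference between the old and the fixed behaviour can be modelled in a few lines of C. The sketch covers only the form shown above (zero flag set, rotate amount 0, start bit <= end bit); the function names are hypothetical and this is not the VEX implementation.

```c
#include <stdint.h>

/* Mask covering bits start..end in IBM numbering (bit 0 is the most
   significant bit). Assumes start <= end <= 63. */
uint64_t demo_ibm_bit_mask(unsigned start, unsigned end)
{
    return (~UINT64_C(0) >> start) & (~UINT64_C(0) << (63 - end));
}

/* Model of "risbg r,r,start,128+end,0": keep the selected bits of
   src, zero everything else. */
uint64_t demo_risbg_z_rot0(uint64_t src, unsigned start, unsigned end)
{
    return src & demo_ibm_bit_mask(start, end);
}

/* Condition code for a value treated as signed:
   0 = zero, 1 = less than zero, 2 = greater than zero. */
int demo_cc_from_value(uint64_t v)
{
    if (v == 0)
        return 0;
    return (v >> 63) ? 1 : 2;
}
```

With src = 4 the masked result is 0, so the condition code must be 0 and the je is taken; deriving the condition code from the unmasked operand would give 2.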
Julian Seward [Mon, 21 Oct 2013 10:05:33 +0000 (10:05 +0000)]
In 64 bit mode, allow 64 bit return values from clean helper calls.
This makes SMC checking calls work (even though they are irrelevant
on PPC targets). Fixes #309430.
Carl Love [Fri, 18 Oct 2013 01:19:06 +0000 (01:19 +0000)]
This commit adds support for the following instructions:
vaddcuq, vadduqm, vaddecuq, vaddeuqm,
vsubcuq, vsubuqm, vsubecuq, vsubeuqm,
vbpermq and vgbbd.
The vgbbd instruction required a new Iop -- Iop_PwBitMtxXpose64x2.
All other instructions were emulated using existing Iops.
The following Iops were added to support the above instructions:
Iop_BCDAdd, Iop_BCDSub,
Iop_PolynomialMulAdd8x16, Iop_PolynomialMulAdd16x8,
Iop_PolynomialMulAdd32x4, Iop_PolynomialMulAdd64x2,
Iop_CipherV128, Iop_CipherLV128, Iop_CipherSV128,
Iop_NCipherV128, Iop_NCipherLV128,
Iop_SHA512, Iop_SHA256, Iop_Clz64x2
Carl Love [Wed, 9 Oct 2013 17:52:01 +0000 (17:52 +0000)]
Power PC, add the two privileged Transactional Memory instructions.
The initial Transactional Memory instruction patch did not include the two
privileged (OS) instructions. This patch adds support for the two
instructions, treclaim and trechkpt.
Carl Love [Wed, 2 Oct 2013 16:25:57 +0000 (16:25 +0000)]
Power PC, Approach 1, add Transactional Memory instruction support
The following Transactional Memory instructions are added:
tbegin., tend., tsr., tcheck., tabortwc.,
tabortdc., tabortwci., tabortdci., tabort.
The patch implements the first proposal by Julian on how to handle the
TM instructions. The proposal is as follows:
translate "XBEGIN fail-addr" as "goto fail-addr"; that is: push
simulated execution directly onto the failure path. This is simple
but will have poor performance, if (as is likely) the failure path
uses normal locking and is not tuned for speed.
The tbegin instruction on Power sets the condition code register to
indicate whether the tbegin instruction succeeded or failed. The compiler
then generates a conditional branch instruction to take the success
or failure code path for the tbegin instruction. In order to fail the
tbegin instruction, the condition code register is updated to indicate
that the tbegin instruction failed. This patch assumes that there is
always an error handler for the tbegin instruction. The other TM
instructions are all treated as no-ops, as we shouldn't be executing the
transactional success code path.
Signed-off-by: Carl Love <cel@us.ibm.com>
Bugzilla 323803
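Seen from the guest program, always failing the transaction begin simply means that the fallback path runs every time. The sketch below models this with a hypothetical demo_tx_begin() that always reports failure; it uses no real TM intrinsics and is not the VEX code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a transaction-begin primitive as it
   behaves under the emulation described above: it always fails. */
bool demo_tx_begin(void)
{
    return false;
}

int demo_shared_counter = 0;   /* data updated by the critical section */
int demo_fallback_runs  = 0;   /* how often the failure handler ran    */

void demo_increment_shared(void)
{
    if (demo_tx_begin()) {
        /* Transactional path: never reached under the emulation. */
        demo_shared_counter++;
    } else {
        /* Failure handler: a real program would take a lock here. */
        demo_fallback_runs++;
        demo_shared_counter++;
    }
}
```

The result is still correct; only the speed of the fallback path matters.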
The following Iops were added to support the above instructions:
Iop_MullEven32Ux4, Iop_MullEven32Sx4, Iop_Max64Sx2, Iop_Max64Ux2,
Iop_Min64Sx2, Iop_Min64Ux2, Iop_CmpGT64Ux2, Iop_Rol64x2,
Iop_QNarrowBin64Sto32Ux4, Iop_QNarrowBin64Uto32Ux4, Iop_NarrowBin64to32x4,
Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Bugzilla 324894
Add a kludgey implementation of XTEST to go with the kludgey
implementation of XBEGIN. Also kludge the CPUID output for AVX
capable targets so as to claim we support HTM.
(Mark Wielaard, mjw@redhat.com)
Petar Jovanovic [Tue, 24 Sep 2013 22:27:23 +0000 (22:27 +0000)]
mips64: finetune mips_dirtyhelper_calculate_FCSR
Several MIPS32 Revision 2 instructions also belong to Revision 1 of MIPS64.
Modifying parts of mips_dirtyhelper_calculate_FCSR to be active for MIPS64R1.
This fixes none/tests/mips64/round when Valgrind is compiled for MIPS64 R1.
Petar Jovanovic [Sat, 21 Sep 2013 01:47:18 +0000 (01:47 +0000)]
mips32: protect mips32r2 instructions with a flag
Fixes a regression introduced when mips_dirtyhelper_calculate_FCSR was added.
Inline assembly with MIPS32r2 instructions needs to be protected by flags
that disable it for non-MIPS32r2 platforms such as some Broadcom boards.
Add support for the Intel TM "xbegin" instruction, by jumping directly
to the failure address. Currently disabled pending finding hardware
that can actually execute xbegin, for testing purposes.
x86 front ends: tighten up decoding of MOV Ib,Eb and MOV Iv,Ev. This
failed to check the g-register in the modrm byte, with the result that
it will mis-decode the AVX2 XABORT and XBEGIN instructions as these
instead, with obviously-bizarre consequences.
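The missing check amounts to looking at the reg field (bits 5:3) of the modrm byte. MOV Ib,Eb is C6 /0 and MOV Iv,Ev is C7 /0, whereas XABORT is C6 F8 and XBEGIN is C7 F8, i.e. reg = 7. A minimal sketch with hypothetical function names, not the actual decoder code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the reg ("g-register") field, bits 5:3 of the modrm byte. */
unsigned demo_modrm_reg(uint8_t modrm)
{
    return (modrm >> 3) & 7;
}

/* Accept opcode/modrm as MOV Ib,Eb (C6 /0) or MOV Iv,Ev (C7 /0)
   only if the reg field is 0. */
bool demo_is_mov_imm_to_E(uint8_t opcode, uint8_t modrm)
{
    if (opcode != 0xC6 && opcode != 0xC7)
        return false;
    return demo_modrm_reg(modrm) == 0;
}
```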
Petar Jovanovic [Mon, 16 Sep 2013 18:11:59 +0000 (18:11 +0000)]
mips: clean-up in hardware detection (Cavium/DSP ASEs)
This change cleans up the MIPS hardware detection code.
A new flag for the Cavium Company ID is added, as well as the codes for the 34K and
74K processors (MIPS Company ID). The latter two represent platforms with DSP
ASEs implemented (Rev 1 and Rev 2 respectively). Macros to detect these two
platforms have been added as well.
Additional macros to extract the Company ID from hwcaps have been added too, and
are used where possible.
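A sketch of what such macros can look like, assuming hwcaps follows the layout of the MIPS PRId register (Company ID in bits 23:16, Processor ID in bits 15:8). All DEMO_ names are hypothetical and the constants are the PRId values as I understand them, not copied from the VEX headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PRId-style layout; names are illustrative only. */
#define DEMO_COMP_ID_MASK  0x00FF0000u
#define DEMO_IMP_MASK      0x0000FF00u

#define DEMO_COMP_MIPS     0x00010000u   /* MIPS Company ID   */
#define DEMO_COMP_CAVIUM   0x000D0000u   /* Cavium Company ID */
#define DEMO_IMP_34K       0x00009500u   /* 34K: DSP ASE Rev 1 */
#define DEMO_IMP_74K       0x00009700u   /* 74K: DSP ASE Rev 2 */

#define DEMO_GET_COMP_ID(hw)  ((hw) & DEMO_COMP_ID_MASK)
#define DEMO_GET_IMP(hw)      ((hw) & DEMO_IMP_MASK)

/* True for a MIPS 34K, i.e. a platform with DSP ASE Rev 1. */
bool demo_is_34k(uint32_t hw)
{
    return DEMO_GET_COMP_ID(hw) == DEMO_COMP_MIPS
        && DEMO_GET_IMP(hw) == DEMO_IMP_34K;
}

/* True for a MIPS 74K, i.e. a platform with DSP ASE Rev 2. */
bool demo_is_74k(uint32_t hw)
{
    return DEMO_GET_COMP_ID(hw) == DEMO_COMP_MIPS
        && DEMO_GET_IMP(hw) == DEMO_IMP_74K;
}
```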
Carl Love [Thu, 12 Sep 2013 17:26:42 +0000 (17:26 +0000)]
The Power ISA 2.07 document includes a correction to the description for the
behavior of the xscvspdp instruction, indicating that if the source argument
is a SNaN, it is first changed to a QNaN before being converted from
single-precision to double-precision. This updated information about the
xscvspdp instruction exposed a bug in the VEX implementation for that
instruction and also a bug in the testing for all instructions having
special behavior for single-precision SNaN arguments.
This patch fixes the VEX bug in the xscvspdp implementation:
The current implementation of xscvspdp emulates the instruction by
extracting the single-precision floating point from the vector register,
storing it in single-precision, and then loading the data just stored using
the lfsx instruction. But the lfsx instruction does not change SNaN input
arguments to QNaN inputs before conversion to double-precision, so this
emulation is not sufficient for the xscvspdp instruction as described in the
current documentation. This patch fixes that issue by recognizing a SNaN input
and changing it to a QNaN before performing the emulation using lfsx.
While fixing the bug in the xscvspdp implementation, it was also discovered that
xvcvspdp had the same issue where SNaN inputs were not being handled correctly,
so this patch fixes its implementation, too.
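The SNaN recognition and quieting step can be expressed on the raw IEEE 754 single-precision bit pattern: a NaN has all exponent bits set and a non-zero fraction, and it is signalling when the top fraction bit (bit 22) is clear. The following is a minimal sketch with hypothetical function names, not the IR generated by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bits encodes a single-precision signalling NaN. */
bool demo_f32_is_snan(uint32_t bits)
{
    uint32_t exp  = (bits >> 23) & 0xFFu;
    uint32_t frac = bits & 0x7FFFFFu;
    return exp == 0xFFu && frac != 0 && (frac & 0x400000u) == 0;
}

/* Turn an SNaN into the corresponding QNaN by setting the quiet
   bit; every other value is returned unchanged. */
uint32_t demo_f32_quiet_snan(uint32_t bits)
{
    return demo_f32_is_snan(bits) ? (bits | 0x400000u) : bits;
}
```

The quieted value can then go through the existing store/lfsx path.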
Enhance ado_treebuild_BB to allow an expression preceding a Put
statement and containing one or more Get expressions to be
substituted in an expression following the Put statement.
That transformation is harmless as long as the guest state areas being
accessed by the Put and Get(s) do not overlap.
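The safety condition is an ordinary interval test on guest state offsets. A minimal sketch, with a hypothetical function name and byte ranges given as offset plus size; it is not the check used inside ado_treebuild_BB.

```c
#include <stdbool.h>

/* True if the byte ranges [offA, offA+sizeA) and [offB, offB+sizeB)
   share at least one byte. */
bool demo_gs_ranges_overlap(unsigned offA, unsigned sizeA,
                            unsigned offB, unsigned sizeB)
{
    return offA < offB + sizeB && offB < offA + sizeA;
}
```

Moving a Get past a Put is only harmless when this returns false for the two accesses.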
Carl Love [Tue, 10 Sep 2013 18:46:40 +0000 (18:46 +0000)]
Bugzilla 323437, this is phase 2 in a series of patches adding support for IBM
Power ISA 2.07. The first bugzilla in the series was: 322294: Add initial
support for IBM Power ISA 2.07
Phase 2 adds support for the following new instructions to
VEX/priv/guest_ppc_toIR.c:
- lq, stq, lqarx, stqcx.
- mfvsrwz, mtvsrwz
- fmrgew, fmrgow
There is a corresponding test case for these instructions, see the bugzilla
for the commit number.
Carl Love [Fri, 6 Sep 2013 22:27:34 +0000 (22:27 +0000)]
The existing overflow detection in VEX/priv/guest_ppc_toIR.c/set_XER_OV_64()
under the case PPCG_FLAG_OP_MULLW: does not apply to the mulldo as we need to
detect overflow when performing a Multiply Low Doubleword (not Multiply Low
Word). Hence, we added a new enumeration value PPCG_FLAG_OP_MULLD in
VEX/priv/guest_ppc_defs.h and a corresponding new case under which the
computation for detecting overflow for mulldo/mulldo. is added in
set_XER_OV_64(). The tests have been added to: none/tests/ppc32/jm-insns.c
Carl Love [Fri, 6 Sep 2013 16:49:42 +0000 (16:49 +0000)]
The patch used the binary constants 0b10000 and 0b10001. The 0b prefix
is a GCC extension that not all compilers support. Therefore, the binary
constants were changed to their equivalent hex values, as suggested by
Florian.
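For the two constants mentioned, the portable spelling looks as follows. The enumerator names are hypothetical; only the values come from the text above.

```c
/* Hex spellings accepted by every C compiler; the binary forms are
   kept in comments for readability. Names are illustrative only. */
enum {
    DEMO_CONST_A = 0x10,   /* was 0b10000 */
    DEMO_CONST_B = 0x11    /* was 0b10001 */
};
```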
Carl Love [Thu, 5 Sep 2013 19:47:40 +0000 (19:47 +0000)]
The current code is not properly handling a non-zero TH field in the
dcbt instruction, which is valid for several forms of data cache block
touch instructions. This patch adds the needed support to
VEX/priv/guest_ppc_toIR.c.
Mark Wielaard [Tue, 27 Aug 2013 10:19:03 +0000 (10:19 +0000)]
Support mmxext (integer sse) subset on i386 (athlon).
Some processors like the AMD Athlon "Classic" support mmxext,
a sse1 subset. This subset is not properly detected by VEX.
The subset uses the same encoding as the sse1 instructions.
The subset is described at:
http://support.amd.com/us/Embedded_TechDocs/22466.pdf
https://en.wikipedia.org/wiki/3DNow!#3DNow.21_extensions
This introduces a new VEX_HWCAPS_X86_MMXEXT that sits between
the baseline (0) and VEX_HWCAPS_X86_SSE1. There is also a new
x86g_dirtyhelper_CPUID_mmxext to mimic an Athlon "Classic"
(Model 2, K75 "Pluto/Orion").
Groups all mmxext instructions together in one block.
Florian Krohm [Thu, 15 Aug 2013 20:54:52 +0000 (20:54 +0000)]
Eliminate IRExprP__VECRET and IRExprP__BBPTR and introduce two new
IRExpr kinds instead: Iex_VECRET and Iex_BBPTR. Add constructor
functions and adjust ppIRExpr, typeOfIRExpr and deepCopyExpr. The
rest is mechanics.
Carl Love [Mon, 12 Aug 2013 18:01:40 +0000 (18:01 +0000)]
Initial ISA 2.07 support for POWER8-tuned libc
The IBM Power ISA 2.07 has been published on power.org, and IBM's new POWER8
processor is under development to implement that ISA. This patch provides
initial VEX support for running Valgrind on POWER8 systems running a soon-to-be
released Linux distribution. This Linux distro will include a POWER8-tuned
libc that uses a subset of the new instructions from ISA 2.07. Since virtually
all applications link with libc, it would be impossible to run an application
under Valgrind on this distro without adding support for these new instructions
to Valgrind, so that's the intent of this patch. Note that applications built
on this distro will *not* employ new POWER8 instructions by default. There are
roughly 150 new instructions in the Power ISA 2.07, including hardware
transaction management (HTM). Support for these new instructions (modulo the
subset included in this bug) will be added to Valgrind in a phased approach,
similar to what we did for Power ISA 2.06.
Julian Seward [Thu, 8 Aug 2013 10:28:59 +0000 (10:28 +0000)]
Add infrastructural support (IR, VEX) to allow returns of 128-
and 256-bit values from dirty helper functions, in a way which is
independent of the target ABIs and of compilers generating
correct struct return code.
Is a prereq for bug #294285.
MIPS fixes: Petar Jovanovic, mips32r2@gmail.com
S390 fixes: Maran, maranp@linux.vnet.ibm.com
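The usual way to return a wide value without relying on the ABI's struct-return convention is to have the caller pass a pointer to a result area that the helper fills in. The sketch below only illustrates that calling pattern; the type and function names are hypothetical and are not the VEX interface.

```c
#include <stdint.h>

/* Hypothetical 128-bit value type, for illustration only. */
typedef struct {
    uint32_t w32[4];
} DemoV128;

/* The helper writes its 128-bit result through res instead of
   returning a struct by value, so no struct-return code from the
   compiler is involved. */
void demo_helper_splat32(DemoV128 *res, uint32_t x)
{
    for (int i = 0; i < 4; i++)
        res->w32[i] = x;
}
```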
Florian Krohm [Sat, 3 Aug 2013 20:39:32 +0000 (20:39 +0000)]
Do not use the 0b notation as older versions of GCC do not accept it.
Fixes BZ 322851 and also unbreaks the OS X nightly build (hopefully).
Patch by Thomas Rast (trast@student.ethz.ch).
mips32: Add support for mips32 DSP instruction set.
Add support for mips32 DSP and DSP revision 2 ASE.
More details about the mips32 DSP(r2) ASE:
http://www.mips.com/media/files/MD00566-2B-MIPSDSP-QRC-01.00.pdf
Applied patch provided by Maja Gagic <maja.gagic@rt-rk.com>
Implement the following instructions, in both ARM and Thumb
encodings:
SSAX SXTAB16 SHASX SHSAX SHSUB16 SHSUB8
UASX USAX UQADD16 UQASX UQSAX UHASX UHSAX REVSH
Add support for
(T1) STRBT reg+#imm8
(T1) STRHT reg+#imm8
(T1) LDRBT reg+#imm8
(T1) LDRSBT reg+#imm8
(T1) PLI reg+#imm12
(T2) PLI reg-#imm8
(T3) PLI PC+/-#imm12
Florian Krohm [Mon, 17 Jun 2013 21:03:56 +0000 (21:03 +0000)]
s390: Support some more BFP <-> DFP conversions (the ones
that were added in VEX r2727).
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ 307113.
Florian Krohm [Mon, 17 Jun 2013 18:59:51 +0000 (18:59 +0000)]
Add some more IRops to convert between binary floating point and
decimal floating point values. Needed to complete s390 DFP support.
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ 307113.
Petar Jovanovic [Sun, 9 Jun 2013 16:46:14 +0000 (16:46 +0000)]
mips64: fix 'unused variable' warning
In a couple of places, the code expected either the _MIPSEB or the _MIPSEL flag
to be set in order to use some variables, but neither of these flags is set when
the code is compiled for non-MIPS architectures.
Florian Krohm [Thu, 6 Jun 2013 19:12:46 +0000 (19:12 +0000)]
Eliminate IRRoundingModeDFP by merging its values into IRRoundingMode.
Retain encodings. The rationale is that a rounding mode is an abstraction
and as such independent of formats used to represent numeric values.
This was triggered by the need for a rounding mode to express conversions
between binary floating point values and decimal floating point values.
Florian Krohm [Fri, 31 May 2013 15:41:55 +0000 (15:41 +0000)]
s390x: Make the CC_DEP1 field appear completely initialised when
writing a 32-bit floating point value into it.
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ 307113.
Petar Jovanovic [Fri, 31 May 2013 15:09:56 +0000 (15:09 +0000)]
mips32/mips64: implement sdl, sdr, swl and swr without reading memory
New implementation of the SDL, SDR, SWL and SWR instructions in a way in which
no memory read is required. The memory read was an issue for programs that map
memory as write-exec only.
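One way to avoid the read is to store only the affected bytes, one at a time, instead of reading the word, merging and writing it back. The sketch below models SWR and SWL for a little-endian guest on a plain byte array; the function names are hypothetical and this is not the IR emitted by the patch.

```c
#include <stdint.h>

/* SWR, little-endian: store the (4 - n) least significant bytes of
   rt at addr .. end of the aligned word, where n = addr & 3. */
void demo_swr_le(uint8_t *mem, uint32_t addr, uint32_t rt)
{
    uint32_t n = addr & 3;
    for (uint32_t i = 0; i < 4 - n; i++)
        mem[addr + i] = (uint8_t)(rt >> (8 * i));
}

/* SWL, little-endian: store the (n + 1) most significant bytes of
   rt at start of the aligned word .. addr, where n = addr & 3. */
void demo_swl_le(uint8_t *mem, uint32_t addr, uint32_t rt)
{
    uint32_t n = addr & 3;
    for (uint32_t i = 0; i <= n; i++)
        mem[addr - i] = (uint8_t)(rt >> (8 * (3 - i)));
}
```

An unaligned word store to address A is then demo_swr_le(mem, A, rt) followed by demo_swl_le(mem, A + 3, rt), and no byte of memory is ever read.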
Florian Krohm [Sat, 11 May 2013 15:02:58 +0000 (15:02 +0000)]
s390: First round of changes to support the PFPO insn.
Support these IROps:
Iop_F64toD64, Iop_D64toF64
Iop_F64toD128, Iop_D128toF64,
Iop_F128toD128, Iop_D128toF128,
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ #307113