Julian Seward [Sun, 24 Aug 2014 14:00:19 +0000 (14:00 +0000)]
Rename IROps for reciprocal estimate, reciprocal step, reciprocal sqrt
estimate and reciprocal sqrt step, to be more consistent. Remove
64FxWhatever versions of those ops since they are never used. As a
side effect, observe that RSqrt32Fx4 and Rsqrte32Fx4 are the same and
hence fix the duplication at the same time. No functional change.
Julian Seward [Wed, 20 Aug 2014 08:54:06 +0000 (08:54 +0000)]
putGST_masked: correctly handle the case where the mask is for
FPSCR.RN or FPSCR.DRN, but does not cover the entire field. In that
case it is important to update the exposed parts but leave the
not-exposed parts unchanged. This fixes a regression dating back
roughly 5 years.
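The partial-field update described above can be sketched as a plain
bit merge. This is an illustration only, not the putGST_masked code;
the helper name merge_masked and the field layout used in the test
values are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper (not from VEX): take the bits selected by 'mask'
   from 'src' and keep every other bit of 'old' unchanged. */
uint64_t merge_masked(uint64_t old, uint64_t src, uint64_t mask)
{
    return (old & ~mask) | (src & mask);
}
```

With a 2-bit field holding 0b11 and a mask exposing only the low bit,
writing 0 changes the field to 0b10 rather than clearing both bits.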
Julian Seward [Fri, 15 Aug 2014 09:11:08 +0000 (09:11 +0000)]
Rename Iop_QSalN*, Iop_QShlN* and Iop_QShlN*S so as to more accurately
reflect what they actually do, which is a zero-fill shift left followed
by one of three flavours of saturation (S->S, U->U or S->U).
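A minimal sketch of the three saturation flavours on a single 8-bit
lane, assuming the usual reading of S->S, U->U and S->U (signed or
unsigned input, signed or unsigned saturated output). The function
names are hypothetical and are not the renamed IROps.

```c
#include <stdint.h>

/* All three helpers assume a shift amount n in 0..7. */

/* U->U: unsigned input, result saturated to 0..255. */
uint8_t qshl_u2u8(uint8_t x, unsigned n)
{
    uint32_t wide = (uint32_t)x << n;
    return wide > 0xFF ? 0xFF : (uint8_t)wide;
}

/* S->S: signed input, result saturated to -128..127.
   Multiplying by 2^n avoids left-shifting a negative value. */
int8_t qshl_s2s8(int8_t x, unsigned n)
{
    int32_t wide = (int32_t)x * (1 << n);
    if (wide > 127)  return 127;
    if (wide < -128) return -128;
    return (int8_t)wide;
}

/* S->U: signed input, result saturated to 0..255. */
uint8_t qshl_s2u8(int8_t x, unsigned n)
{
    int32_t wide = (int32_t)x * (1 << n);
    if (wide < 0)   return 0;
    if (wide > 255) return 255;
    return (uint8_t)wide;
}
```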
Small cleanups in VEX:
* rm unused arm64 function
* ijk_nodecode: always set the 4 components of the result
(avoid a compiler warning that a part is not initialised)
Unbreak the build
priv/guest_ppc_toIR.c: In function 'disInstr_PPC':
priv/guest_ppc_toIR.c:20160:7: error: 'dis' undeclared (first use in this function)
dis.continueAt = 0;
^
Carl Love [Thu, 7 Aug 2014 23:25:23 +0000 (23:25 +0000)]
This commit is for Bugzilla 334834. The Bugzilla contains patch 2 of 3
to add PPC64 LE support. The other two patches can be found in Bugzillas
334384 and 334836.
POWER PC, add the functional Little Endian support, patch 2 VEX part
The IBM POWER processor now supports both Big Endian and Little Endian.
The ABI for Little Endian also changes. Specifically, the function
descriptor is not used, the stack size has changed, and the way the TOC
is accessed has changed. Functions now have a local and a global entry point. Register
r2 contains the TOC for local calls and register r12 contains the TOC
for global calls. This patch makes the functional changes to the
Valgrind tool. The patch makes the changes needed for the
none/tests/ppc32 and none/tests/ppc64 Makefile.am. A number of the
ppc specific tests have Endian dependencies that are not fixed in
this patch. They are fixed in the next patch.
Per Julian's comments, renamed coregrind/m_dispatch/dispatch-ppc64-linux.S
to coregrind/m_dispatch/dispatch-ppc64be-linux.S and created a new file
for LE, coregrind/m_dispatch/dispatch-ppc64le-linux.S. The same was done
for coregrind/m_syswrap/syscall-ppc-linux.S.
Signed-off-by: Carl Love <carll@us.ibm.com>
git-svn-id: svn://svn.valgrind.org/vex/trunk@2914
Improve infrastructure for dealing with endianness in VEX. This patch
removes all decisions about endianness from VEX. Instead, it requires
that the LibVEX_* calls pass in information about the guest or host
endianness (depending on context) and in turn it passes that info
through to all the places that need it:
* the front ends (xx_toIR.c)
* the back ends (xx_isel.c)
* the patcher functions (Chain, UnChain, PatchProfInc)
Mostly it is boring and ugly plumbing. As far as types go, there is a
new type "VexEndness" that carries the endianness. This also makes it
possible to stop using Bools to indicate endianness. VexArchInfo has
a new field of type VexEndness. Apart from that, no other changes in
types.
Followups: MIPS front and back ends have not yet been fixed up to use
the passed-in endianness information. Currently they assume that the
endianness of both host and guest is the same as the endianness of the
target for which VEX is being compiled.
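The idea of passing endianness in explicitly, rather than deciding it
at compile time or with a Bool, might look like the sketch below. The
enum Endness and the function emit32 are hypothetical stand-ins and do
not reproduce the VexEndness definition or any VEX emitter.

```c
#include <stdint.h>

/* Hypothetical stand-in for an endianness type passed in by the caller. */
typedef enum { END_LE, END_BE } Endness;

/* Store a 32-bit word at p in the byte order the caller asks for,
   independent of the endianness of the machine running this code. */
void emit32(uint8_t *p, uint32_t w, Endness e)
{
    for (int i = 0; i < 4; i++) {
        int shift = (e == END_LE) ? 8 * i : 8 * (3 - i);
        p[i] = (uint8_t)(w >> shift);
    }
}
```

A dedicated enum makes call sites self-describing, whereas a Bool
argument leaves it unclear which value means little-endian.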
Initialise a couple of scalars that gcc -Og thinks might be
uninitialised, presumably because at -Og it doesn't do enough
block straightening-outening or whatever to see that they are
always assigned before use.
Julian Seward [Sat, 28 Jun 2014 22:11:16 +0000 (22:11 +0000)]
arm64: change the representation of FPSR.QC so that it can be
used efficiently to record SIMD saturation, and remove support
for all other bits of FPSR, since we don't model them anyway.
Julian Seward [Fri, 27 Jun 2014 10:43:22 +0000 (10:43 +0000)]
arm64:
* implement: rev32, rev64, saba, uaba, sabd, uabd.
* factor out a large number of duplicated expressions of the form
bitQ == 0 ? unop(Iop_ZeroHI64ofV128, mkexpr(t)) : mkexpr(t)
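The factored-out pattern can be modelled outside VEX as follows. This
is a sketch on plain values, not IR: the V128 struct and the helper
name zero_hi64_if_q0 are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit vector as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } V128;

/* When bitQ is 0 only the low 64 bits are meaningful, so zero the
   upper half; when bitQ is 1 return the value unchanged. */
V128 zero_hi64_if_q0(V128 v, unsigned bitQ)
{
    if (bitQ == 0)
        v.hi = 0;
    return v;
}
```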
Julian Seward [Thu, 26 Jun 2014 08:18:08 +0000 (08:18 +0000)]
The vector versions of the count leading zeros/sign bits primops
(Iop_Cls* and Iop_Clz*) misleadingly imply a signedness in the
incoming lanes. Rename them to fix this. Fixes #326026.
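Both operations work on the raw bit pattern of a lane, which is why a
signedness in the name is misleading. A sketch for one 8-bit lane
follows; the function names are hypothetical, and cls8 assumes the
convention of counting the bits below the top bit that equal it.

```c
#include <stdint.h>

/* Count leading zero bits of an 8-bit pattern (8 for an all-zero lane). */
unsigned clz8(uint8_t x)
{
    unsigned n = 0;
    for (uint8_t bit = 0x80; bit != 0 && !(x & bit); bit >>= 1)
        n++;
    return n;
}

/* Count the consecutive bits directly below the top bit that are equal
   to the top bit. The top bit itself is not counted (assumed convention). */
unsigned cls8(uint8_t x)
{
    unsigned top = x >> 7;
    unsigned n = 0;
    for (int i = 6; i >= 0 && ((x >> i) & 1u) == top; i--)
        n++;
    return n;
}
```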
Julian Seward [Sun, 15 Jun 2014 08:17:35 +0000 (08:17 +0000)]
Remove temporary front end scaffolding for Cat{Even,Odd}Lanes
and Interleave{LO,HI} operations, and instead generate real
UZP1/UZP2/ZIP1/ZIP2 instructions in the back end.
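For reference, the lane movement performed by the UZP and ZIP
instruction pairs can be sketched on 4-lane byte vectors as below. The
helper names, the lane count and the operand order are assumptions;
the sketch does not claim which IROp maps to which instruction or in
which argument order.

```c
#include <stdint.h>

#define LANES 4

/* UZP1 (part == 0) takes the even-numbered lanes of n then of m;
   UZP2 (part == 1) takes the odd-numbered lanes.
   d must not alias n or m. */
void uzp(const uint8_t *n, const uint8_t *m, uint8_t *d, int part)
{
    for (int i = 0; i < LANES / 2; i++) {
        d[i]             = n[2 * i + part];
        d[i + LANES / 2] = m[2 * i + part];
    }
}

/* ZIP1 (part == 0) interleaves the low halves of n and m;
   ZIP2 (part == 1) interleaves the high halves.
   d must not alias n or m. */
void zip(const uint8_t *n, const uint8_t *m, uint8_t *d, int part)
{
    int base = part * (LANES / 2);
    for (int i = 0; i < LANES / 2; i++) {
        d[2 * i]     = n[base + i];
        d[2 * i + 1] = m[base + i];
    }
}
```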
Julian Seward [Wed, 14 May 2014 23:38:23 +0000 (23:38 +0000)]
Implement VFPv4 VFMA and VFMS (F32 and F64 versions). Fixes #331057.
Patch from Janne Hellsten (jjhellst@gmail.com) with algebraic
rearrangement for the VFMS cases so as to make result signs match with
the hardware when some of the inputs are infinities.
Mark Wielaard [Fri, 9 May 2014 11:41:06 +0000 (11:41 +0000)]
Recognize MPX instructions and bnd prefix. Bug #333666.
Recognize and parse operands of new MPX instructions BNDMK, BNDCL,
BNDCU, BNDCN, BNDMOV, BNDLDX and BNDSTX. Also recognize bnd (F2) prefix
for CALL (E8,FF/2), RET (C2,C3), JMP (EB,E9,FF/4) and Jcc (70-7F,0F 80-8F).
All new MPX instructions are currently NOPs and the bnd prefix is ignored.
Julian Seward [Mon, 5 May 2014 10:03:56 +0000 (10:03 +0000)]
Fix assertion failures resulting from change of arity of
Iop_{Add,Sub,Mul}32Fx4 introduced in r2809, in which said IROps
acquired a rounding-mode argument.
Julian Seward [Sun, 4 May 2014 10:52:11 +0000 (10:52 +0000)]
Renaming only (no functional change): rename IR artefacts to do
with i-cache invalidation to be more consistent with new d-cache
invalidation functionality:
Ijk_TInval -> Ijk_InvalICache
TISTART -> CMSTART (CM == "Cache Management")
TILEN -> CMLEN
VEX_TRC_JMP_TINVAL -> VEX_TRC_JMP_INVALICACHE
Julian Seward [Sun, 9 Mar 2014 09:40:23 +0000 (09:40 +0000)]
Do early writeback of the base register for the following instruction
forms, to stop Memcheck complaining about writes below the stack
pointer:
str x3, [sp,#-16]!
stp q0, q1, [sp,#-512]!
Julian Seward [Fri, 7 Mar 2014 22:52:19 +0000 (22:52 +0000)]
Support extra instruction bits and pieces, enough to get Firefox started:
* more scalar int <-> FP conversions
* more vector integer narrowing
* a few more vector shift by imm cases
* FCVTAS (kludged)