Florian Krohm [Fri, 5 Dec 2014 18:55:39 +0000 (18:55 +0000)]
Encountering a PFPO insn in a client program while running on a host
that does not have that insn now causes an emulation error.
Previously, it triggered an assertion failure, which was incorrect.
Florian Krohm [Sat, 22 Nov 2014 20:10:21 +0000 (20:10 +0000)]
Add function s390_isel_amode_b12_b20 to compile an expression into an
amode that is either S390_AMODE_B12 or S390_AMODE_B20. This is needed
for compare-and-swap insns. As we're currently not generating amodes
using an index register, there was never a problem.
This change future-proofs the code.
Also add a few more asserts for amodes in the s390_insns supporting
translation chaining.
Fixes BZ #269360.
Florian Krohm [Thu, 20 Nov 2014 15:08:56 +0000 (15:08 +0000)]
This change was triggered by BZ #247974, which suggested including
VEX/test_main.* in the tarball. We don't want to do that because those
files are really just scaffolding for developers to play with and not
meant for general consumption (and are also bitrotting ATM). Therefore,
this patch moves them to the "useful" subdirectory and adds a crude
Makefile there to build the executable.
Makefile-gcc updated accordingly.
Julian Seward [Tue, 11 Nov 2014 12:49:21 +0000 (12:49 +0000)]
Add a nasty temporary kludge to CPUID that allows 64-bit MacOSX 10.10
(Yosemite) to run, until such time as XSAVE and XRSTOR are implemented.
Detailed in the comments. All other targets should be unaffected.
Florian Krohm [Sat, 11 Oct 2014 14:48:38 +0000 (14:48 +0000)]
Merge the memory allocation bits from libvex.h into main_util.c.
This is to avoid linkage problems due to unresolved symbols for
some compilers. See also valgrind r14600 and BZ #339542.
Carl Love [Thu, 9 Oct 2014 21:08:25 +0000 (21:08 +0000)]
This patch makes the needed changes to the lxvw4x for Little Endian.
The data was being loaded in the Big Endian data order for most
cases. The code in host_ppc_isel.c was changed to do a right
shift and to permute the hi and lo registers in the other order
to ensure the data was always loaded in BE order. The lxvw4x
emulation in guest_ppc_toIR.c was changed to permute the data from
the BE order to LE order when running on an LE system.
Carl Love [Tue, 7 Oct 2014 18:20:39 +0000 (18:20 +0000)]
This commit just makes white space changes to the three files in commit
r2966 so I can fix the commit message for that commit. The previous
commit message was "msg". The "msg" was the file with the commit message below.
The first attempt to fix the false positive message "Invalid read of size"
was to change to a V128 read instead of four 32-bit reads. Unfortunately,
this caused some regression test failures that were not caught before
committing the change.
This patch implements the V128 read without creating any regression failures.
The issue with the previous fix is that the lvx instruction was used to
do the V128 fetch. Unfortunately, that instruction takes the effective
address, masks it to make it 16-byte aligned, and then does the fetch. So,
non-aligned fetches do not work correctly. The fix in this patch does
two aligned fetches with the lvx instruction, calculates how to permute
the data from the two loads, and then permutes the data so the result in the
vector register is the correct value for an unaligned fetch.
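The two-aligned-fetches-plus-permute idea can be sketched in scalar C. This is only an illustration, not the code from the patch: aligned_load16 and unaligned_load16 are hypothetical names, memory is modelled as a byte array indexed by offset, and the second fetch at addr + 15 is an assumption borrowed from the usual AltiVec idiom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of an lvx-style load (hypothetical helper): the low 4 address
   bits are ignored, so the enclosing 16-byte aligned block is fetched.
   Addresses are offsets into mem. */
static void aligned_load16(const uint8_t *mem, size_t addr, uint8_t out[16])
{
    memcpy(out, mem + (addr & ~(size_t)15), 16);
}

/* Unaligned 16-byte load built from two aligned loads and a byte select. */
static void unaligned_load16(const uint8_t *mem, size_t mem_size,
                             size_t addr, uint8_t out[16])
{
    uint8_t pair[32];
    size_t shift = addr & 15;   /* misalignment within the first block */
    size_t i;

    /* Both aligned blocks must lie inside the modelled memory. */
    assert(((addr + 15) & ~(size_t)15) + 16 <= mem_size);

    /* First block contains addr; second contains addr + 15. When addr is
       already aligned, both loads fetch the same block. */
    aligned_load16(mem, addr, pair);
    aligned_load16(mem, addr + 15, pair + 16);

    /* The "permute": pick 16 consecutive bytes starting at the misalignment. */
    for (i = 0; i < 16; i++)
        out[i] = pair[shift + i];
}
```

Fetching the second block at addr + 15 rather than addr + 16 means an already aligned access never touches the following block.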
Julian Seward [Thu, 2 Oct 2014 16:15:30 +0000 (16:15 +0000)]
guest_amd64_spechelper: fill in a number of missing cases for
conditions after SUBQ/SUBL/SUBW. Also, add cases for
Overflow-after-ADDL/SUBL for the benefit of code generated by
Javascript JITs.
Julian Seward [Thu, 2 Oct 2014 16:13:20 +0000 (16:13 +0000)]
Add folding rules for: Sar64(x,0) and Sar32(x,0). Immediate
shifts by zero seem to have a surprisingly large perf hit on
Intels, possibly due to the bizarre eflags/rflags semantics
involved.
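As an illustration of what such a folding rule does, here is a minimal sketch over a toy expression type. The Expr struct, its tags, and fold_sar64 are hypothetical and much simpler than VEX's real IR; only the rule itself (a shift by a constant zero is replaced by its operand) comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy expression tree (hypothetical, not VEX's IRExpr). */
typedef enum { EX_TMP, EX_CONST, EX_SAR64 } ExprTag;

typedef struct Expr {
    ExprTag tag;
    uint64_t value;            /* EX_CONST: the constant; EX_TMP: temp number */
    const struct Expr *arg1;   /* EX_SAR64: value being shifted */
    const struct Expr *arg2;   /* EX_SAR64: shift amount */
} Expr;

/* Sar64(x, 0) ==> x; any other expression is returned unchanged. */
static const Expr *fold_sar64(const Expr *e)
{
    assert(e != NULL);
    if (e->tag == EX_SAR64
        && e->arg2 != NULL
        && e->arg2->tag == EX_CONST
        && e->arg2->value == 0)
        return e->arg1;
    return e;
}
```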
Julian Seward [Thu, 2 Oct 2014 11:32:39 +0000 (11:32 +0000)]
guest_amd64_spechelper: number (in comments) and reorder the spec
cases for arbitrary-condition-after-sub32/sub64. This makes it easier
to see which cases are missing. No functional change.
Carl Love [Mon, 29 Sep 2014 19:33:00 +0000 (19:33 +0000)]
ppc64: the lxvw4x instruction uses four 32-bit loads. When run on an
application that does partial loads, Valgrind generates an error message
about "Invalid read of size 4". This is a false positive. The four loads
were replaced by a single 128-bit load. The invalid read message can now be
suppressed using the command line option "--partial-loads-ok=yes".
Carl Love [Thu, 25 Sep 2014 15:57:31 +0000 (15:57 +0000)]
The function mk_AvDuplicateRI() stores 16 bytes to memory and then
fetches the data into a vector register. The load was being
generated as a lvewx instead of a lvx instruction by the code:
/* Effectively splat the r_src value to dst */
addInstr(env, PPCInstr_AvLdSt( True/*ld*/, 4, dst, am_offset_zero ) );
The second argument controls which load instruction is generated. It
should have been 16, which generates the lvx instruction, not 4, which
generates lvewx. The issue was reported on the Freescale processor
for the vspltb instruction. The issue was not detected before because
the backend code generation used the same vector register to load into
as was used previously to create the data. However, the code generation
is dependent on the HW/Distro/compiler. If the same register isn't used
the bug appears. The issue was found with Valgrind 3.10.0 on the Freescale
processor as the Valgrind code generation didn't happen to pick the same
register to do the load into.
Carl Love [Tue, 23 Sep 2014 16:22:36 +0000 (16:22 +0000)]
The PPC64 store quad instruction is updating the address register with the
effective address of the store. The instruction should not update the
address register. The issue is due to the two putIReg() calls at the end of
the instruction. The two putIReg() calls were removed to fix the bug.
Remove the valgrind_support parameter from LibVEX_Init. It's unused
and looks like an anachronism. VEX is also cleaner without valgrind things
creeping in.
Couple of fixes:
- deepCopyIRConst failed to copy Ico_V256 constants
- deepCopyIRExpr did not copy Iex_Binder expressions
- handle_gets_Stmt should also handle an Ist_Put statement
Change how FXSAVE and FXRSTOR are done, so as to avoid pushing the XMM
register contents themselves through the helper functions. This
avoids the false positives reported in #291310.
arm64: route all whole-vector shift/rotate/slice operations
through Iop_SliceV128, so as to give it some testing. Implement
Iop_SliceV128 in the back end.
Rename Iop_Extract{64,V128} to Iop_Slice{64,V128}, improve their
documentation, and swap the sense of the first and second args
so as to be more in keeping with the rest of the ops here, so
that the more significant arg is arg1 rather than arg2.
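A scalar model of the new argument order might look like the sketch below. It assumes, and this is an assumption rather than something stated above, that the slice operation concatenates arg1:arg2 with arg1 as the more significant half and returns 8 contiguous bytes starting at a given byte offset from the least significant end. The function name slice64 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: treat hi:lo as a 128-bit value (hi is arg1, the more
   significant half) and return the 64 bits starting at byte offset n. */
static uint64_t slice64(uint64_t hi, uint64_t lo, unsigned n)
{
    assert(n <= 8);
    if (n == 0) return lo;     /* avoid an undefined shift by 64 */
    if (n == 8) return hi;
    return (hi << (64 - 8 * n)) | (lo >> (8 * n));
}
```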
Julian Seward [Sun, 24 Aug 2014 14:00:19 +0000 (14:00 +0000)]
Rename IROps for reciprocal estimate, reciprocal step, reciprocal sqrt
estimate and reciprocal sqrt step, to be more consistent. Remove
64FxWhatever versions of those ops since they are never used. As a
side effect, observe that RSqrt32Fx4 and Rsqrte32Fx4 are the same and
hence fix the duplication, at the same time. No functional change.
Julian Seward [Wed, 20 Aug 2014 08:54:06 +0000 (08:54 +0000)]
putGST_masked: correctly handle the case where the mask is for
FPSCR.RN or FPSCR.DRN, but does not cover the entire field. Then it
is important to update the exposed parts but leave the not-exposed
parts unchanged. This is a regression relative to circa 5 years ago.
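The "update the exposed parts, leave the rest unchanged" behaviour is the usual masked read-modify-write. The sketch below uses a hypothetical put_field_masked helper and generic 32-bit masks; it does not reproduce the actual putGST_masked code or the real FPSCR layout.

```c
#include <assert.h>
#include <stdint.h>

/* Write only the bits of a field that the mask exposes.
   field_mask:   all bits belonging to the field (for example, a 2-bit field)
   exposed_mask: the subset of those bits the caller may modify */
static uint32_t put_field_masked(uint32_t old, uint32_t src,
                                 uint32_t field_mask, uint32_t exposed_mask)
{
    /* Exposed bits must lie inside the field. */
    assert((exposed_mask & ~field_mask) == 0);
    return (old & ~exposed_mask) | (src & exposed_mask);
}
```

With a 2-bit field of which only the low bit is exposed, writing 0 over an old value of 3 gives 2: the hidden high bit survives.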
Julian Seward [Fri, 15 Aug 2014 09:11:08 +0000 (09:11 +0000)]
Rename Iop_QSalN*, Iop_QShlN* and Iop_QShlN*S so as to more accurately
reflect what they actually do, which is a zero-fill shift left followed
by one of three flavours of saturation (S->S, U->U or S->U).
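The three saturation flavours can be modelled on a single 8-bit lane as follows. This is a sketch under stated assumptions: the function names and the 8-bit lane width are hypothetical, and the real IROps work on whole vectors.

```c
#include <assert.h>
#include <stdint.h>

/* S->S: signed input, result saturated to the signed range. */
static int8_t qshl_s8_to_s8(int8_t x, unsigned n)
{
    assert(n < 8);
    /* Multiply instead of shifting so negative inputs stay well defined. */
    int32_t wide = (int32_t)x * (1 << n);
    if (wide > INT8_MAX) return INT8_MAX;
    if (wide < INT8_MIN) return INT8_MIN;
    return (int8_t)wide;
}

/* U->U: unsigned input, result saturated to the unsigned range. */
static uint8_t qshl_u8_to_u8(uint8_t x, unsigned n)
{
    assert(n < 8);
    uint32_t wide = (uint32_t)x << n;
    return wide > UINT8_MAX ? UINT8_MAX : (uint8_t)wide;
}

/* S->U: signed input, result saturated to the unsigned range. */
static uint8_t qshl_s8_to_u8(int8_t x, unsigned n)
{
    assert(n < 8);
    int32_t wide = (int32_t)x * (1 << n);
    if (wide < 0) return 0;
    if (wide > UINT8_MAX) return UINT8_MAX;
    return (uint8_t)wide;
}
```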
Small cleanups in VEX:
* rm unused arm64 function
* ijk_nodecode: always set the 4 components of the result
(avoid a compiler warning that a part is not initialised)
Unbreak the build
priv/guest_ppc_toIR.c: In function 'disInstr_PPC':
priv/guest_ppc_toIR.c:20160:7: error: 'dis' undeclared (first use in this function)
dis.continueAt = 0;
^
Carl Love [Thu, 7 Aug 2014 23:25:23 +0000 (23:25 +0000)]
This commit is for Bugzilla 334834. The Bugzilla contains patch 2 of 3
to add PPC64 LE support. The other two patches can be found in Bugzillas
334384 and 334836.
POWER PC, add the functional Little Endian support, patch 2 VEX part
The IBM POWER processor now supports both Big Endian and Little Endian.
The ABI for Little Endian also changes. Specifically, the function
descriptor is not used, the stack size changed, and the way the TOC is
accessed changed. Functions now have a local and a global entry point. Register
r2 contains the TOC for local calls and register r12 contains the TOC
for global calls. This patch makes the functional changes to the
Valgrind tool. The patch makes the changes needed for the
none/tests/ppc32 and none/tests/ppc64 Makefile.am. A number of the
ppc specific tests have Endian dependencies that are not fixed in
this patch. They are fixed in the next patch.
Per Julian's comments, renamed coregrind/m_dispatch/dispatch-ppc64-linux.S
to coregrind/m_dispatch/dispatch-ppc64be-linux.S and created a new file for LE,
coregrind/m_dispatch/dispatch-ppc64le-linux.S. The same was done for
coregrind/m_syswrap/syscall-ppc-linux.S.
Signed-off-by: Carl Love <carll@us.ibm.com>
git-svn-id: svn://svn.valgrind.org/vex/trunk@2914
Improve infrastructure for dealing with endianness in VEX. This patch
removes all decisions about endianness from VEX. Instead, it requires
that the LibVEX_* calls pass in information about the guest or host
endianness (depending on context) and in turn it passes that info
through to all the places that need it:
* the front ends (xx_toIR.c)
* the back ends (xx_isel.c)
* the patcher functions (Chain, UnChain, PatchProfInc)
Mostly it is boring and ugly plumbing. As far as types go, there is a
new type "VexEndness" that carries the endianness. This also makes it
possible to stop using Bools to indicate endianness. VexArchInfo has
a new field of type VexEndness. Apart from that, no other changes in
types.
Followups: MIPS front and back ends have not yet been fixed up to use
the passed-in endianness information. Currently they assume that the
endianness of both host and guest is the same as the endianness of the
target for which VEX is being compiled.
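To show why passing an explicit endianness value is clearer than a Bool, here is a minimal sketch. The Endness enum and load32 function are hypothetical stand-ins for illustration; they are not the VexEndness definition or any LibVEX_* signature.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an endianness type passed in by the caller. */
typedef enum { ENDNESS_LE, ENDNESS_BE } Endness;

/* Assemble a 32-bit value from memory using the endianness the caller
   supplies, instead of assuming the endianness of the build machine. */
static uint32_t load32(const uint8_t *p, Endness e)
{
    assert(e == ENDNESS_LE || e == ENDNESS_BE);
    if (e == ENDNESS_LE)
        return (uint32_t)p[0]
             | ((uint32_t)p[1] << 8)
             | ((uint32_t)p[2] << 16)
             | ((uint32_t)p[3] << 24);
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

A call such as load32(bytes, ENDNESS_BE) states the byte order at the call site, which a bare True or False argument does not.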
Initialise a couple of scalars that gcc -Og thinks might be
uninitialised, presumably because at -Og it doesn't do enough
block straightening-outening or whatever to see that they are
always assigned before use.