Julian Seward [Tue, 27 Jan 2015 23:35:58 +0000 (23:35 +0000)]
Change AMD64Instr_CMov64 so that the source can only be a register
instead of register-or-memory (an AMD64RM). This avoids duplicating
conditional load functionality introduced in r3075 via
AMD64Instr_CLoad and in practice has no effect on the quality of the
generated code.
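A sketch of the constructor change in host_amd64_defs.h (exact parameter
names are assumptions):

    /* Before: the source could be a register or a memory amode. */
    AMD64Instr* AMD64Instr_CMov64 ( AMD64CondCode cond, AMD64RM* src, HReg dst );

    /* After: register-to-register only; conditional loads from memory
       go through AMD64Instr_CLoad instead. */
    AMD64Instr* AMD64Instr_CMov64 ( AMD64CondCode cond, HReg src, HReg dst );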
Julian Seward [Tue, 27 Jan 2015 23:17:02 +0000 (23:17 +0000)]
AMD64 front end: translate AVX2 PMASKMOV load instructions (vector
conditional loads) using IR conditional load statements (IRLoadG)
instead of the previous rather ingenious hack.
AMD64 back end:
* Add instruction selection etc. for 32- and 64-bit conditional loads (IRLoadG)
* Handle dirty helper calls that are conditional and return a value. These
result from Memcheck's instrumentation of IRLoadGs.
No functional change. This is a cleanup as part of supporting AVX2
PMASKMOV loads and stores by using the existing IR facilities for
conditional loads and stores.
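For illustration, a sketch of how one 64-bit lane of a masked load can be
expressed as a guarded load (helper names as in guest_amd64_toIR.c; the
lane plumbing here is an assumption, not the actual code):

    /* Guarded load: if the lane's mask bit is 0, no memory access is
       made and the lane receives the alternative value (zero). */
    IRTemp lane = newTemp(Ity_I64);
    stmt( IRStmt_LoadG( Iend_LE, ILGop_Ident64, lane,
                        mkexpr(ea),          /* effective address       */
                        mkU64(0),            /* value if guard is false */
                        mkexpr(maskBit) ) ); /* lane's mask bit, :: I1  */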
toUInt() should only be used when running in 32-bit mode. The lines were
changed so that the pointer is narrowed to 32 bits only when running in
32-bit mode.
There is no bugzilla for this issue. It was noticed by Florian Krohm.
Fix assert
vex: priv/guest_generic_bb_to_IR.c:224 (bb_to_IR): Assertion `vex_control.guest_max_insns < 100' failed.
caused by giving --vex-guest-max-insns=100
100 should be allowed as described by --help-debug:
--vex-guest-max-insns=<1..100> [50]
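Presumably the fix is the obvious off-by-one change to the assertion:

    /* bb_to_IR: was vex_control.guest_max_insns < 100 */
    vassert(vex_control.guest_max_insns <= 100);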
Florian Krohm [Sun, 4 Jan 2015 17:20:19 +0000 (17:20 +0000)]
Change the remaining uses of Addr64 in the VEX API to Addr. This reduces
the size of VexGuestExtent to 20 bytes on a 32-bit platform.
Change the prototypes of x86g_dirtyhelper_loadF80le and
x86g_dirtyhelper_storeF80le so that the address parameter has
type Addr. Likewise for amd64g_dirtyhelper_loadF80le and
amd64g_dirtyhelper_storeF80le.
Also update switchback.c (not tested).
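A sketch of the adjusted prototypes (the non-address parameters are
assumptions):

    extern ULong amd64g_dirtyhelper_loadF80le  ( Addr addrU );
    extern void  amd64g_dirtyhelper_storeF80le ( Addr addrU, ULong f64 );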
Florian Krohm [Wed, 31 Dec 2014 12:09:38 +0000 (12:09 +0000)]
It has long been assumed that host and guest architectures
are the same, even though the initial design goal was likely
different, allowing a cross-valgrind of sorts. But as Julian
put it:
But it's been 12+ years and I've never once heard any mention of
such a thing. So perhaps it's time to give up on that one.
Now let's take advantage of this decision and tighten up the VEX
API using Addr instead of Addr64. As a first step move the definition
of Addr into VEX proper and change the chase_into_ok callback
accordingly.
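The callback field in VexTranslateArgs, before and after (a sketch):

    /* Before */  Bool (*chase_into_ok) ( /*callback_opaque*/ void*, Addr64 );
    /* After  */  Bool (*chase_into_ok) ( /*callback_opaque*/ void*, Addr   );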
Florian Krohm [Mon, 29 Dec 2014 22:18:58 +0000 (22:18 +0000)]
As a library, VEX should not export the offsetof and vg_alignof
macros. The latter isn't even used by VEX.
Move them to pub_tool_basics.h.
offsetof also goes to VEX's private header main_util.h.
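For reference, a sketch of the definitions being moved (the exact
spellings in pub_tool_basics.h may differ):

    #define offsetof(type, member) \
            ((SizeT)(HWord) & ((type*)0)->member)
    #define vg_alignof(type) \
            (sizeof(struct { char c; type t; }) - sizeof(type))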
On amd64, we handle GS similarly to FS, i.e. we consider it to be constant.
Note that FS is not always 0 on Linux. Rather, it appears to be constant
across all threads, and zero in the main thread.
As the values of FS and/or GS differ between platforms (Linux or Darwin),
FS_CONST and GS_CONST are used.
Note that we cannot easily verify that the value of GS or FS is the
expected one, as the value might not be set at the beginning of execution
but only after prctl has been executed.
So we just hope that GS and FS are effectively constant.
Some attempts to set GS to values other than the expected constant value
on Linux caused a SEGV, so it looks like this is all effectively protected.
In summary: we were counting somewhat on luck for FS;
we now similarly count on luck for GS.
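Sketched effect on guest address computation (the OFFB_* names and the
prefix test are assumptions based on the description above):

    /* When an FS or GS override prefix is present, add the
       assumed-constant segment base from the guest state. */
    if (pfx & PFX_FS)
       ea = binop(Iop_Add64, ea, IRExpr_Get(OFFB_FS_CONST, Ity_I64));
    if (pfx & PFX_GS)
       ea = binop(Iop_Add64, ea, IRExpr_Get(OFFB_GS_CONST, Ity_I64));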
Florian Krohm [Mon, 15 Dec 2014 21:55:16 +0000 (21:55 +0000)]
Remove quote.txt and newline.txt as they are no longer needed.
Once upon a time those files were used to construct a
header file vex_svnversion.h, but that was more hassle than it
was worth and eventually it got nuked.
With this change, the user experience will be somewhat better, e.g.:
VEX: Support for AVX2 requires AVX capabilities
Found: amd64-cx16-rdtscp-sse3-avx2
Cannot continue. Good-bye
Specifically, the patch decouples showing hwcaps and deciding their validity.
show_hwcaps_<ARCH> reports the hwcaps it finds. It never returns NULL.
check_hwcaps checks the hwcaps for feasibility and does not return if
VEX cannot deal with them.
The function are_valid_hwcaps no longer exists.
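So the interface now looks roughly like this (prototypes are a sketch):

    /* Report the hwcaps found; never returns NULL. */
    static const HChar* show_hwcaps ( VexArch arch, UInt hwcaps );

    /* Check the hwcaps for feasibility; does not return if VEX
       cannot deal with them. */
    static void check_hwcaps ( VexArch arch, UInt hwcaps );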
Florian Krohm [Wed, 10 Dec 2014 16:08:09 +0000 (16:08 +0000)]
New function vfatal which should be used for user messages
to indicate a situation that can legitimately occur but that
we cannot handle today. The function does not return.
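A sketch of its interface:

    /* Print a user-facing message for a legitimate-but-unhandled
       situation, then exit; never returns. */
    __attribute__((noreturn))
    extern void vfatal ( const HChar* format, ... );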
Florian Krohm [Mon, 8 Dec 2014 14:01:33 +0000 (14:01 +0000)]
The long displacement facility is now required. There were a
few spots in the code where this was assumed implicitly.
Ugly fixes were possible, but requiring this facility is not
unreasonable as it has been around since 2003. So let's just
do this.
Florian Krohm [Fri, 5 Dec 2014 18:55:39 +0000 (18:55 +0000)]
Encountering a PFPO insn in a client program while running on a host
that does not have that insn now causes an emulation error.
Previously, it caused a failing assertion which was incorrect.
Florian Krohm [Sat, 22 Nov 2014 20:10:21 +0000 (20:10 +0000)]
Add function s390_isel_amode_b12_b20 to compile an expression into an
amode that is either S390_AMODE_B12 or S390_AMODE_B20. This is needed
for compare-and-swap insns. As we're currently not generating amodes
using an index register, there was never a problem.
This change future-proofs the code.
Also add a few more asserts for amodes in the s390_insns supporting
translation chaining.
Fixes BZ #269360.
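A sketch of the new function's contract (interface only; implementation
details are not shown here):

    /* Compile 'expr' into an amode that is guaranteed to be either
       S390_AMODE_B12 or S390_AMODE_B20 (no index register), as
       required by compare-and-swap insns. */
    static s390_amode *s390_isel_amode_b12_b20(ISelEnv *env, IRExpr *expr);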
Florian Krohm [Thu, 20 Nov 2014 15:08:56 +0000 (15:08 +0000)]
This change was triggered by BZ #247974 which suggested to include
VEX/test_main.* in the tarball. We don't want to do that because those
files are really just scaffolding for developers to play with and not
meant for general consumption (and are also bitrotting ATM). Therefore,
this patch moves them to the "useful" subdirectory and adds a crude
Makefile there to build the executable.
Makefile-gcc updated accordingly.
Julian Seward [Tue, 11 Nov 2014 12:49:21 +0000 (12:49 +0000)]
Add a nasty temporary kludge to CPUID that allows 64-bit MacOSX 10.10
(Yosemite) to run, until such time as XSAVE and XRSTOR are implemented.
Detailed in the comments. All other targets should be unaffected.
Florian Krohm [Sat, 11 Oct 2014 14:48:38 +0000 (14:48 +0000)]
Merge the memory allocation bits from libvex.h into main_util.c.
This is to avoid linkage problems due to unresolved symbols for
some compilers. See also valgrind r14600 and BZ #339542.
Carl Love [Thu, 9 Oct 2014 21:08:25 +0000 (21:08 +0000)]
This patch makes the needed changes to the lxvw4x for Little Endian.
The data was being loaded in the Big Endian data order for most
cases. The code in host_ppc_isel.c was changed to do a right
shift and to permute the hi and lo registers in the other order
to ensure the data was always loaded in BE order. The lxvw4x
emulation in guest_ppc_toIR.c was changed to permute the data from
the BE order to LE order when running on an LE system.
Carl Love [Tue, 7 Oct 2014 18:20:39 +0000 (18:20 +0000)]
This commit just makes whitespace changes to the three files of commit
r2966 so I can fix the commit message for that commit. The previous
commit message was "msg"; "msg" was the file containing the commit
message below.
The first attempt to fix the false positive message "Invalid read of size"
was to change to a V128 read instead of four 32-bit reads. Unfortunately,
this caused some regression test failures that were not caught before
committing the change.
This patch implements the V128 read without creating any regression failures.
The issue with the previous fix is that the lvx instruction was used to
do the V128 fetch. Unfortunately, that instruction takes the effective
address, masks it to make it 16-byte aligned, and then does the fetch, so
non-aligned fetches do not work correctly. The fix in this patch does
two aligned fetches with the lvx instruction, calculates how to permute
the data from the two loads, and then permutes the data so that the value
in the result vector register is correct for an unaligned fetch.
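The classic AltiVec unaligned-load sequence this describes, sketched with
compiler intrinsics rather than the actual isel code:

    #include <altivec.h>

    /* Unaligned 16-byte load from p: two aligned lvx fetches plus a
       permute whose control vector (from lvsl) encodes p's misalignment. */
    static vector unsigned char load16_unaligned(const unsigned char *p)
    {
       vector unsigned char lo   = vec_ld(0, p);    /* aligned load covering start */
       vector unsigned char hi   = vec_ld(15, p);   /* aligned load covering end   */
       vector unsigned char perm = vec_lvsl(0, p);  /* permute control from addr   */
       return vec_perm(lo, hi, perm);               /* stitch the two halves       */
    }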
Julian Seward [Thu, 2 Oct 2014 16:15:30 +0000 (16:15 +0000)]
guest_amd64_spechelper: fill in a number of missing cases for
conditions after SUBQ/SUBL/SUBW. Also, add cases for
Overflow-after-ADDL/SUBL for the benefit of code generated by
Javascript JITs.
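One representative case, sketched in the style of the existing rules in
guest_amd64_spechelper:

    /* Z after SUBL: the result is zero iff the 32-bit args are equal. */
    if (isU64(cc_op, AMD64G_CC_OP_SUBL) && isU64(cond, AMD64CondZ)) {
       return unop(Iop_1Uto64,
                   binop(Iop_CmpEQ32,
                         unop(Iop_64to32, cc_dep1),
                         unop(Iop_64to32, cc_dep2)));
    }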
Julian Seward [Thu, 2 Oct 2014 16:13:20 +0000 (16:13 +0000)]
Add folding rules for: Sar64(x,0) and Sar32(x,0). Immediate
shifts by zero seem to have a surprisingly large perf hit on
Intels, possibly due to the bizarre eflags/rflags semantics
involved.
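Sketched as the rules might appear in the IR optimiser's folding code
(the surrounding structure is an assumption):

    /* Sar64(x,0) and Sar32(x,0): a shift by an immediate zero is
       the identity, so just return x. */
    case Iop_Sar32:
    case Iop_Sar64:
       if (e->Iex.Binop.arg2->tag == Iex_Const
           && e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8
           && e->Iex.Binop.arg2->Iex.Const.con->Ico.U8 == 0)
          return e->Iex.Binop.arg1;
       break;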
Julian Seward [Thu, 2 Oct 2014 11:32:39 +0000 (11:32 +0000)]
guest_amd64_spechelper: number (in comments) and reorder the spec
cases for arbitrary-condition-after-sub32/sub64. This makes it easier
to see which cases are missing. No functional change.
Carl Love [Mon, 29 Sep 2014 19:33:00 +0000 (19:33 +0000)]
ppc64: the lxvw4x instruction was implemented using four 32-bit loads.
When run on an application that does partial loads, this caused Valgrind
to report "Invalid read of size 4" even though the read is valid; Valgrind
was incorrectly detecting an invalid read. The four loads were replaced
by a single 128-bit load. The invalid-read message can now be
suppressed using the command line option --partial-loads-ok=yes.
Carl Love [Thu, 25 Sep 2014 15:57:31 +0000 (15:57 +0000)]
The function mk_AvDuplicateRI() stores 16 bytes to memory and then
fetches the data into a vector register. The load was being generated
as an lvewx instead of an lvx instruction by the code:
/* Effectively splat the r_src value to dst */
addInstr(env, PPCInstr_AvLdSt( True/*ld*/, 4, dst, am_offset_zero ) );
The second argument controls which load instruction is generated; it
should have been 16 to generate the lvx instruction rather than lvewx.
The issue was reported on the Freescale processor for the vspltb
instruction. It was not detected before because the backend happened
to load into the same vector register that was used previously to
create the data. However, the code generation depends on the
HW/distro/compiler, and if a different register is picked the bug
appears. The issue was found with Valgrind 3.10.0 on the Freescale
processor, where the generated code did not happen to pick the same
register to do the load into.
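So the corrected call is presumably:

    /* Effectively splat the r_src value to dst */
    addInstr(env, PPCInstr_AvLdSt( True/*ld*/, 16, dst, am_offset_zero ) );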
Carl Love [Tue, 23 Sep 2014 16:22:36 +0000 (16:22 +0000)]
The PPC64 store quad instruction was updating the address register with
the effective address of the store, but the instruction should not update
the address register. The issue was due to two putIReg() calls at the end
of the instruction's implementation; removing those two calls fixes the bug.
Remove the valgrind_support parameter from LibVEX_Init. It's unused
and looks like an anachronism. VEX is also cleaner without valgrind things
creeping in.
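A sketch of the resulting signature (everything except the removed
parameter is an assumption from memory):

    extern void LibVEX_Init (
       void (*failure_exit) ( void ),                     /* assertion failures */
       void (*log_bytes) ( const HChar*, SizeT nbytes ),  /* logging sink       */
       Int debuglevel,
       /* Bool valgrind_support,  <-- removed */
       const VexControl* vcon
    );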
Couple of fixes:
- deepCopyIRConst failed to copy Ico_V256 constants
- deepCopyIRExpr did not copy Iex_Binder expressions
- handle_gets_Stmt should also handle an Ist_Put statement
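For the first of these, the missing case is presumably just:

    /* deepCopyIRConst: Ico_V256 constants were falling through. */
    case Ico_V256: return IRConst_V256(c->Ico.V256);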