Carl Love [Thu, 12 Sep 2013 17:26:42 +0000 (17:26 +0000)]
The Power ISA 2.07 document includes a correction to the description for the
behavior of the xscvspdp instruction, indicating that if the source argument
is a SNaN, it is first changed to a QNaN before being converted from
single-precision to double-precision. This updated information about the
xscvspdp instruction exposed a bug in the VEX implementation for that
instruction and also a bug in the testing for all instructions having
special behavior for single-precision SNaN arguments.
This patch fixes the VEX bug in the xscvspdp implementation:
The current implementation of xscvspdp emulates the instruction by
extracting the single-precision floating point from the vector register,
storing it in single-precision, and then loading the data just stored using
the lfsx instruction. But the lfsx instruction does not change SNaN input
arguments to QNaN inputs before conversion to double-precision, so this
emulation is not sufficient for the xscvspdp instruction as described in the
current documentation. This patch fixes that issue by recognizing a SNaN input
and changing it to a QNaN before performing the emulation using lfsx.
While fixing the bug in xscvspdp implementation, it was also discovered that
xvcvspdp had the same issue where SNaN inputs were not being handled correctly,
so this patch fixes its implementation as well.
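The SNaN-to-QNaN step described above can be sketched on the raw single-precision bit pattern. This is only an illustration in plain C, not the VEX IR the patch generates; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* IEEE 754 single precision: 1 sign bit, 8 exponent bits, 23 fraction bits. */
#define SP_EXP_MASK   0x7F800000u
#define SP_FRAC_MASK  0x007FFFFFu
#define SP_QUIET_BIT  0x00400000u  /* most significant fraction bit */

/* Return nonzero if the bit pattern encodes a signalling NaN. */
static int is_sp_snan(uint32_t bits)
{
    return (bits & SP_EXP_MASK) == SP_EXP_MASK   /* exponent all ones   */
        && (bits & SP_FRAC_MASK) != 0            /* NaN, not infinity   */
        && (bits & SP_QUIET_BIT) == 0;           /* quiet bit is clear  */
}

/* Turn an SNaN into the corresponding QNaN; leave all other values alone. */
static uint32_t quieten_sp_snan(uint32_t bits)
{
    return is_sp_snan(bits) ? (bits | SP_QUIET_BIT) : bits;
}
```

Applying such a step before the single-to-double conversion keeps the sign and the remaining fraction bits of the NaN intact.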
Enhance ado_treebuild_BB to allow an expression preceding a Put
statement and containing one or more Get expressions to be
substituted in an expression following the Put statement.
That transformation is harmless as long as the guest state areas being
accessed by the Put and Get(s) do not overlap.
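The non-overlap condition can be sketched as a byte-range check. The GuestAccess type and both function names are hypothetical and are not taken from ado_treebuild_BB; the sketch assumes each access is described by a guest state offset and a size in bytes.

```c
#include <stdbool.h>

/* A guest state access: byte offset into the guest state and size in bytes. */
typedef struct {
    int offset;
    int size;
} GuestAccess;

/* True if the byte ranges [offset, offset+size) share at least one byte. */
static bool accesses_overlap(GuestAccess a, GuestAccess b)
{
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

/* True if an expression containing the given Gets may be moved past the Put. */
static bool can_move_past_put(GuestAccess put, const GuestAccess *gets, int n_gets)
{
    for (int i = 0; i < n_gets; i++) {
        if (accesses_overlap(put, gets[i]))
            return false;
    }
    return true;
}
```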
Carl Love [Tue, 10 Sep 2013 18:46:40 +0000 (18:46 +0000)]
Bugzilla 323437: this is phase 2 in a series of patches adding support for IBM
Power ISA 2.07. The first bugzilla in the series was: 322294: Add initial
support for IBM Power ISA 2.07
Phase 2 adds support for the following new instructions to
VEX/priv/guest_ppc_toIR.c:
- lq, stq, lqarx, stqcx.
- mfvsrwz, mtvsrwz
- fmrgew, fmrgow
There is a corresponding test case for these instructions, see the bugzilla
for the commit number.
Carl Love [Fri, 6 Sep 2013 22:27:34 +0000 (22:27 +0000)]
The existing overflow detection in VEX/priv/guest_ppc_toIR.c/set_XER_OV_64()
under the case PPCG_FLAG_OP_MULLW does not apply to mulldo, because we need to
detect overflow when performing a Multiply Low Doubleword (not a Multiply Low
Word). Hence, we added a new enumeration value PPCG_FLAG_OP_MULLD in
VEX/priv/guest_ppc_defs.h and a corresponding new case under which the
computation for detecting overflow for mulldo/mulldo. is added in
set_XER_OV_64(). The tests have been added to: none/tests/ppc32/jm-insns.c
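To illustrate why the word-sized check cannot be reused, here is a minimal sketch of both overflow conditions in plain C. It is not the IR built in set_XER_OV_64(), and the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Multiply Low Word: the full product fits in 64 bits, so it is enough
   to test whether it also fits in 32 bits. */
static bool mullw_overflows(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return product < INT32_MIN || product > INT32_MAX;
}

/* Multiply Low Doubleword: the full product needs up to 128 bits, so the
   check is done with divisions that cannot themselves overflow. */
static bool mulld_overflows(int64_t a, int64_t b)
{
    if (a == 0 || b == 0)
        return false;
    if (a > 0) {
        if (b > 0)
            return a > INT64_MAX / b;   /* positive product too large  */
        return b < INT64_MIN / a;       /* negative product too small  */
    }
    if (b > 0)
        return a < INT64_MIN / b;       /* negative product too small  */
    return b < INT64_MAX / a;           /* both negative, too large    */
}
```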
Carl Love [Fri, 6 Sep 2013 16:49:42 +0000 (16:49 +0000)]
The patch used the binary constants 0b10000 and 0b10001. The 0b prefix
is a GCC extension that not all compilers support. Therefore, the binary
constants were changed to their equivalent hex values, as suggested by
Florian.
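For reference, the two values written portably; the macro names are hypothetical and say nothing about how the constants are used in the patch.

```c
/* Portable hexadecimal spellings of the two binary constants. */
#define BITS_10000  0x10  /* binary 1 0000, decimal 16 */
#define BITS_10001  0x11  /* binary 1 0001, decimal 17 */
```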
Carl Love [Thu, 5 Sep 2013 19:47:40 +0000 (19:47 +0000)]
The current code is not properly handling a non-zero TH field in the
dcbt instruction, which is valid for several forms of data cache block
touch instructions. This patch adds the needed support to
VEX/priv/guest_ppc_toIR.c.
Mark Wielaard [Tue, 27 Aug 2013 10:19:03 +0000 (10:19 +0000)]
Support mmxext (integer sse) subset on i386 (athlon).
Some processors like the AMD Athlon "Classic" support mmxext,
an sse1 subset. This subset is not properly detected by VEX.
The subset uses the same encoding as the sse1 instructions.
The subset is described at:
http://support.amd.com/us/Embedded_TechDocs/22466.pdf
https://en.wikipedia.org/wiki/3DNow!#3DNow.21_extensions
This introduces a new VEX_HWCAPS_X86_MMXEXT that sits between
the baseline (0) and VEX_HWCAPS_X86_SSE1. There is also a new
x86g_dirtyhelper_CPUID_mmxext to mimic an Athlon "Classic"
(Model 2, K75 "Pluto/Orion").
Groups all mmxext instructions together in one block.
Florian Krohm [Thu, 15 Aug 2013 20:54:52 +0000 (20:54 +0000)]
Eliminate IRExprP__VECRET and IRExprP__BBPTR and introduce two new
IRExpr kinds instead: Iex_VECRET and Iex_BBPTR. Add constructor
functions and adjust ppIRExpr, typeOfIRExpr and deepCopyExpr. The
rest is mechanics.
Carl Love [Mon, 12 Aug 2013 18:01:40 +0000 (18:01 +0000)]
Initial ISA 2.07 support for POWER8-tuned libc
The IBM Power ISA 2.07 has been published on power.org, and IBM's new POWER8
processor is under development to implement that ISA. This patch provides
initial VEX support for running Valgrind on POWER8 systems running a soon-to-be
released Linux distribution. This Linux distro will include a POWER8-tuned
libc that uses a subset of the new instructions from ISA 2.07. Since virtually
all applications link with libc, it would be impossible to run an application
under Valgrind on this distro without adding support for these new instructions
to Valgrind, so that's the intent of this patch. Note that applications built
on this distro will *not* employ new POWER8 instructions by default. There are
roughly 150 new instructions in the Power ISA 2.07, including hardware
transaction management (HTM). Support for these new instructions (modulo the
subset included in this bug) will be added to Valgrind in a phased approach,
similar to what we did for Power ISA 2.06.
Julian Seward [Thu, 8 Aug 2013 10:28:59 +0000 (10:28 +0000)]
Add infrastructural support (IR, VEX) to allow returns of 128-
and 256-bit values from dirty helper functions, in a way which is
independent of the target ABIs and of compilers generating
correct struct return code.
Is a prereq for bug #294285.
MIPS fixes: Petar Jovanovic, mips32r2@gmail.com
S390 fixes: Maran, maranp@linux.vnet.ibm.com
Florian Krohm [Sat, 3 Aug 2013 20:39:32 +0000 (20:39 +0000)]
Do not use the 0b notation as older GCC versions do not accept it.
Fixes BZ 322851 and also unbreaks the OS X nightly build (hopefully).
Patch by Thomas Rast (trast@student.ethz.ch).
mips32: Add support for mips32 DSP instruction set.
Add support for mips32 DSP and DSP revision 2 ASE.
More details about the mips32 DSP(r2) ASE:
http://www.mips.com/media/files/MD00566-2B-MIPSDSP-QRC-01.00.pdf
Applied patch provided by Maja Gagic <maja.gagic@rt-rk.com>
Implement the following instructions, in both ARM and Thumb
encodings:
SSAX SXTAB16 SHASX SHSAX SHSUB16 SHSUB8
UASX USAX UQADD16 UQASX UQSAX UHASX UHSAX REVSH
Add support for
(T1) STRBT reg+#imm8
(T1) STRHT reg+#imm8
(T1) LDRBT reg+#imm8
(T1) LDRSBT reg+#imm8
(T1) PLI reg+#imm12
(T2) PLI reg-#imm8
(T3) PLI PC+/-#imm12
Florian Krohm [Mon, 17 Jun 2013 21:03:56 +0000 (21:03 +0000)]
s390: Support some more BFP <-> DFP conversions (the ones
that were added in VEX r2727).
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ 307113.
Florian Krohm [Mon, 17 Jun 2013 18:59:51 +0000 (18:59 +0000)]
Add some more IRops to convert between binary floating point and
decimal floating point values. Needed to complete s390 DFP support.
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ 307113.
Petar Jovanovic [Sun, 9 Jun 2013 16:46:14 +0000 (16:46 +0000)]
mips64: fix 'unused variable' warning
In a couple of places, variables were used only when either the _MIPSEB or
the _MIPSEL flag was set, but neither flag is set when the code is compiled
for non-MIPS architectures.
Florian Krohm [Thu, 6 Jun 2013 19:12:46 +0000 (19:12 +0000)]
Eliminate IRRoundingModeDFP by merging its values into IRRoundingMode.
Retain encodings. The rationale is that a rounding mode is an abstraction
and as such independent of formats used to represent numeric values.
This was triggered by the need for a rounding mode to express conversions
between binary floating point values and decimal floating point values.
Florian Krohm [Fri, 31 May 2013 15:41:55 +0000 (15:41 +0000)]
s390x: Make the CC_DEP1 field appear completely initialised when
writing a 32-bit floating point value into it.
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ 307113.
Petar Jovanovic [Fri, 31 May 2013 15:09:56 +0000 (15:09 +0000)]
mips32/mips64: implement sdl, sdr, swl and swr without reading memory
New implementation of the SDL, SDR, SWL and SWR instructions that requires
no memory read. Reading memory was a problem for programs that map
memory as write-exec only.
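One way to avoid the memory read is to issue individual byte stores instead of a load, merge and store sequence. The sketch below shows that idea for SWL and SWR under big-endian assumptions, with memory modelled as a byte array; the function names are hypothetical and this is not the IR generated by VEX.

```c
#include <stdint.h>

/* SWL, big-endian: store the most significant bytes of reg from addr up to
   the end of the aligned word, using byte stores only. */
static void swl_be(uint8_t *mem, uint32_t addr, uint32_t reg)
{
    uint32_t n = 4u - (addr & 3u);          /* number of bytes to store */
    for (uint32_t i = 0; i < n; i++)
        mem[addr + i] = (uint8_t)(reg >> (24u - 8u * i));
}

/* SWR, big-endian: store the least significant bytes of reg from the start
   of the aligned word up to addr, using byte stores only. */
static void swr_be(uint8_t *mem, uint32_t addr, uint32_t reg)
{
    uint32_t n = (addr & 3u) + 1u;          /* number of bytes to store */
    for (uint32_t i = 0; i < n; i++)
        mem[addr - i] = (uint8_t)(reg >> (8u * i));
}
```

An unaligned word store at address A is then the usual pair swl_be(mem, A, reg) followed by swr_be(mem, A + 3, reg), and no byte outside A..A+3 is touched.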
Florian Krohm [Sat, 11 May 2013 15:02:58 +0000 (15:02 +0000)]
s390: First round of changes to support the PFPO insn.
Support these IROps:
Iop_F64toD64, Iop_D64toF64
Iop_F64toD128, Iop_D128toF64,
Iop_F128toD128, Iop_D128toF128,
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ #307113
Florian Krohm [Sun, 5 May 2013 15:04:30 +0000 (15:04 +0000)]
Add the following IROPs which are needed for s390 DFP support:
Iop_F64toD64, Iop_D64toF64
Iop_F64toD128, Iop_D128toF64,
Iop_F128toD128, Iop_D128toF128,
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ #307113
Petar Jovanovic [Sat, 27 Apr 2013 01:15:48 +0000 (01:15 +0000)]
mips: fix corner case for INS instruction
This change fixes a corner case for the INS instruction when lsb = 0.
The test in none/tests/mips32/MIPS32int.c will be extended to include
additional test cases that trigger this condition.
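The commit does not say what went wrong for lsb = 0. As a reference for the intended semantics, here is a minimal sketch of a 32-bit bit-field insert in plain C; the function name is hypothetical, and it assumes 1 <= size and lsb + size <= 32.

```c
#include <stdint.h>

/* Insert the low 'size' bits of rs into rt, starting at bit position lsb.
   Assumes 1 <= size and lsb + size <= 32. */
static uint32_t mips_ins(uint32_t rt, uint32_t rs, unsigned lsb, unsigned size)
{
    /* Build the field mask without shifting by 32, which is undefined in C. */
    uint32_t field = (size == 32) ? 0xFFFFFFFFu : ((1u << size) - 1u);
    uint32_t mask  = field << lsb;
    return (rt & ~mask) | ((rs << lsb) & mask);
}
```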
STRD (both ARM and Thumb): for push-like cases -- specifically, STRD
rD1,rD2, [sp, #-8], generate IR for the SP writeback before the
stores. This loses restartability of the instruction but avoids
Memcheck complaining that we're writing below the stack pointer.
Petar Jovanovic [Fri, 19 Apr 2013 12:35:00 +0000 (12:35 +0000)]
mips: fix endian issues for LWL, LWR, LDR and LDL for mips64
This change:
- fixes endian issues for unaligned loads for MIPS64,
- (re)moves endian dependencies in guest-to-IR for Iop_ReinterpI32asF32
and Iop_ReinterpI64asF64 to host-mips-isel,
- adds minor style changes in the area touched by the code.
Improved front end translations for Neon V{LD,ST}{1,2} instructions,
that do deinterleaving/interleaving via IROps and so generate far
fewer memory references. As a side effect, fix incorrect ARM back end
implementation of many of the SIMD lane interleaving/deinterleaving
and concatenation IROps.
Implement ARM SDIV and UDIV instructions. Fixes #314178. Partially
based on a patch by Ben Cheng, bccheng@android.com. Also renames two
misnamed PPC helpers.
Carl Love [Wed, 20 Mar 2013 15:51:34 +0000 (15:51 +0000)]
VEX, ppc code cleanup
This patch removes some dead code left behind when the code was restructured
to make the implementation compliant with the iop definitions.
The patch makes no functional changes as it is just removing code that is not
reachable.
This patch is for Bugzilla 314269.
Signed-off-by: Carl Love <cel@us.ibm.com>
git-svn-id: svn://svn.valgrind.org/vex/trunk@2697
Julian Seward [Tue, 5 Mar 2013 10:35:44 +0000 (10:35 +0000)]
Handle "vmov qDest.I32 V128{0xFFFF}" so to speak, and make the case
for a zero immediate more similar. Verify assembled output against
GNU as. Fixes #311318.
Petar Jovanovic [Wed, 27 Feb 2013 22:57:17 +0000 (22:57 +0000)]
mips: adding MIPS64LE support to VEX
Necessary changes to VEX to support MIPS64LE on Linux.
Minor cleanup/style changes embedded in the patch as well.
Patch written by Dejan Jevtic and Petar Jovanovic.
More information about this issue:
https://bugs.kde.org/show_bug.cgi?id=313267
Florian Krohm [Thu, 14 Feb 2013 14:27:12 +0000 (14:27 +0000)]
s390: Support the following DFP insns:
- extract biased exponent
- insert biased exponent
- quantize
- reround to significance
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ #307113.
Florian Krohm [Mon, 11 Feb 2013 00:47:35 +0000 (00:47 +0000)]
Make HReg a struct. In the past there were several occurrences where
an HReg was assigned to an integer. This worked by accident because the
bits representing the register number (which was meant to be accessed)
happened to be in the right place.
Two new functions: hregIsInvalid and sameHReg.
The HReg struct just wraps the integer that was previously used to
represent a register without changing the encoding.
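A minimal sketch of the wrapping idea follows. The names hregIsInvalid and sameHReg come from the commit; the field name, the invalid encoding, the mk_hreg helper and the function bodies are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Wrap the integer so that an HReg can no longer be mixed up with an int. */
typedef struct {
    uint32_t reg;                            /* field name is assumed */
} HReg;

#define INVALID_HREG_BITS 0xFFFFFFFFu        /* assumed invalid encoding */

/* Hypothetical helper: build an HReg from its raw encoding. */
static HReg mk_hreg(uint32_t bits)
{
    HReg r = { bits };
    return r;
}

/* Structs cannot be compared with ==, so comparison needs helpers. */
static bool sameHReg(HReg a, HReg b)
{
    return a.reg == b.reg;
}

static bool hregIsInvalid(HReg r)
{
    return r.reg == INVALID_HREG_BITS;
}
```

With this shape, an accidental assignment such as `int n = some_hreg;` no longer compiles, which is the kind of mistake the change is meant to catch.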
Florian Krohm [Mon, 11 Feb 2013 00:03:27 +0000 (00:03 +0000)]
s390: Be consistent with emulation warnings about unsupported
rounding modes in absence of the floating-point extension facility.
For some insns we would vassert; for others we'd give a warning.
Now we always issue an emulation warning.
Florian Krohm [Fri, 8 Feb 2013 20:22:03 +0000 (20:22 +0000)]
s390: Change get_dfp_rounding_mode to map IR rounding modes to
S390_DEP_ROUND_.. values in the range [8;15]. See comments in code.
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Florian Krohm [Sat, 2 Feb 2013 22:58:25 +0000 (22:58 +0000)]
s390: It is not necessary to save/restore the link register when
making a helper call. The link register needs to be saved when
switching between valgrind and client code and the dispatcher code
already does that. Julian suggested this change when he merged the
COMEM branch.
This saves between 6% and 13% of insns on the perf bucket.
Runtime difference is within noise margin.
Florian Krohm [Sat, 2 Feb 2013 00:16:58 +0000 (00:16 +0000)]
s390: Change insn selection to recognize memcpy-like statements.
Add S390_INSN_MEMCPY and generate MVC for that later on. Saves between
0.1% and 1.5% of insns. Observed runtime differences on the perf bucket were
within noise margin.
Carl Love [Wed, 30 Jan 2013 18:39:57 +0000 (18:39 +0000)]
The Coverity tool was run against the Valgrind source code and identified a
problem in VEX/priv/guest_ppc_toIR.c saying the variable 'insn_suffix' was
assigned but not used. The function _do_vsx_fp_roundToInt() has an
HChar * parameter named 'insn_suffix', and the intention of this function was
to set the insn_suffix appropriately for the passed opcode so that the caller
could use that suffix as needed (some callers needed, and others didn't).
However, since the parameter type is a simple pointer, passed by value,
insn_suffix was only modified locally, and the caller did not see the new
value. Since most of the callers of _do_vsx_fp_roundToInt() ignore the
insn_suffix, I have removed that from the parameter list and moved the code
for ascertaining the appropriate suffix into a new function called
_get_vsx_rdpi_suffix().
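The pass-by-value problem can be reproduced in a few lines. The function names, opcode numbers and suffix strings below are hypothetical; only the pattern matches the description above.

```c
/* Broken pattern: 'suffix' is a copy of the caller's pointer, so the
   assignment below is never seen by the caller. */
static void set_suffix_broken(int opcode, const char *suffix)
{
    suffix = (opcode == 1) ? "m" : "p";
    (void)suffix;                 /* assigned but not used */
}

/* Fixed pattern: a separate function returns the suffix to the caller. */
static const char *get_suffix(int opcode)
{
    switch (opcode) {
    case 0:  return "";
    case 1:  return "m";
    case 2:  return "p";
    default: return "?";
    }
}
```

Returning the string from a dedicated function also lets callers that do not need the suffix skip the lookup entirely.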
This patch is for Bugzilla 314099
The patch was written by Maynard Johnson.
The patch does not add any additional regtest errors. The vbit tester
was also run. No issues were found.
The patch was reviewed, tested and committed by Carl Love
Julian Seward [Sat, 26 Jan 2013 11:47:55 +0000 (11:47 +0000)]
Infrastructure cleanup: change type of the condition field of
IRExpr_Mux0X from Ity_I8 to Ity_I1. This makes more sense, makes it
consistent with condition fields in IRStmt_Dirty and IRStmt_Exit, and
avoids some pointless 1Uto8 casting of the condition, in many cases.
Fixes for s390 are from Florian.
Also, make a small extension to ir_opt.c, that allows the constant
folder to look backwards through arbitrary expressions even in flat
IR. This makes it possible to do arbitrary tree folding in ir_opt,
which is where it belongs. Use this to implement the folding rule
CmpNE32(1Uto32(b), 0) ==> b.
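The rule is valid because 1Uto32(b) is either 0 or 1, so comparing it against 0 for inequality yields b again. A minimal sketch on a toy expression tree follows; the types and names are hypothetical and much simpler than the IRExpr structures used in ir_opt.c.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { EX_TMP, EX_CONST32, EX_1UTO32, EX_CMPNE32 } ExTag;

typedef struct Expr {
    ExTag              tag;
    uint32_t           value;   /* temp number or constant value */
    const struct Expr *arg1;    /* first operand, or NULL        */
    const struct Expr *arg2;    /* second operand, or NULL       */
} Expr;

/* Apply CmpNE32(1Uto32(b), 0) ==> b; return e unchanged otherwise. */
static const Expr *fold_cmpne32(const Expr *e)
{
    if (e->tag == EX_CMPNE32
        && e->arg1->tag == EX_1UTO32
        && e->arg2->tag == EX_CONST32
        && e->arg2->value == 0)
        return e->arg1->arg1;   /* the original 1-bit expression b */
    return e;
}
```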