Florian Krohm [Sun, 5 May 2013 15:04:30 +0000 (15:04 +0000)]
Add the following IROps, which are needed for s390 DFP support:
Iop_F64toD64, Iop_D64toF64,
Iop_F64toD128, Iop_D128toF64,
Iop_F128toD128, Iop_D128toF128.
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ #307113
Petar Jovanovic [Sat, 27 Apr 2013 01:15:48 +0000 (01:15 +0000)]
mips: fix corner case for INS instruction
This change fixes a corner case for the INS instruction when lsb = 0.
The test in none/tests/mips32/MIPS32int.c will be extended to include
additional test cases that trigger this condition.
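For reference, the architectural behaviour of INS can be sketched in C as
below. This is only an illustration of the bit-field insert semantics, not
the VEX IR that the patch generates, and the commit message does not say
exactly what went wrong for lsb = 0. The function name mips_ins_sketch is
hypothetical. The sketch handles the full-width case (lsb = 0, msb = 31)
separately so that no shift by 32 occurs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model of MIPS32 INS: insert the low
   (msb - lsb + 1) bits of rs into rt, starting at bit position lsb.
   Requires 0 <= lsb <= msb <= 31. */
static uint32_t mips_ins_sketch(uint32_t rt, uint32_t rs,
                                unsigned lsb, unsigned msb)
{
    unsigned size = msb - lsb + 1;   /* field width, 1..32 */
    /* A width of 32 implies lsb == 0; avoid the undefined 1u << 32. */
    uint32_t field = (size == 32) ? 0xFFFFFFFFu : ((1u << size) - 1u);
    uint32_t mask  = field << lsb;
    return (rt & ~mask) | ((rs << lsb) & mask);
}

/* Small self-check covering lsb = 0 and the full-width insert. */
static void check_mips_ins_sketch(void)
{
    assert(mips_ins_sketch(0xAAAAAAAAu, 0xFu, 0, 3) == 0xAAAAAAAFu);
    assert(mips_ins_sketch(0x12345678u, 0xDEADBEEFu, 0, 31) == 0xDEADBEEFu);
}
```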
STRD (both ARM and Thumb): for push-like cases -- specifically, STRD
rD1,rD2, [sp, #-8], generate IR for the SP writeback before the
stores. This loses restartability of the instruction but avoids
Memcheck complaining that we're writing below the stack pointer.
Petar Jovanovic [Fri, 19 Apr 2013 12:35:00 +0000 (12:35 +0000)]
mips: fix endian issues for LWL, LWR, LDR and LDL for mips64
This change:
- fixes endian issues for unaligned loads for MIPS64,
- (re)moves endian dependencies in guest-to-IR for Iop_ReinterpI32asF32
and Iop_ReinterpI64asF64 to host-mips-isel,
- adds minor style changes in the area touched by the code.
Improved front end translations for Neon V{LD,ST}{1,2} instructions,
that do deinterleaving/interleaving via IROps and so generate far
fewer memory references. As a side effect, fix incorrect ARM back end
implementation of many of the SIMD lane interleaving/deinterleaving
and concatenation IROps.
Implement ARM SDIV and UDIV instructions. Fixes #314178. Partially
based on a patch by Ben Cheng, bccheng@android.com. Also renames two
misnamed PPC helpers.
Carl Love [Wed, 20 Mar 2013 15:51:34 +0000 (15:51 +0000)]
VEX, ppc code cleanup
This patch removes some dead code left behind when the code was restructured
to make the implementation compliant with the Iop definitions.
The patch makes no functional changes as it is just removing code that is not
reachable.
This patch is for Bugzilla 314269.
Signed-off-by: Carl Love <cel@us.ibm.com>
git-svn-id: svn://svn.valgrind.org/vex/trunk@2697
Julian Seward [Tue, 5 Mar 2013 10:35:44 +0000 (10:35 +0000)]
Handle "vmov qDest.I32 V128{0xFFFF}" so to speak, and make the case
for a zero immediate more similar. Verify assembled output against
GNU as. Fixes #311318.
Petar Jovanovic [Wed, 27 Feb 2013 22:57:17 +0000 (22:57 +0000)]
mips: adding MIPS64LE support to VEX
Necessary changes to VEX to support MIPS64LE on Linux.
Minor cleanup/style changes embedded in the patch as well.
Patch written by Dejan Jevtic and Petar Jovanovic.
More information about this issue:
https://bugs.kde.org/show_bug.cgi?id=313267
Florian Krohm [Thu, 14 Feb 2013 14:27:12 +0000 (14:27 +0000)]
s390: Support the following DFP insns:
- extract biased exponent
- insert biased exponent
- quantize
- reround to significance
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ #307113.
Florian Krohm [Mon, 11 Feb 2013 00:47:35 +0000 (00:47 +0000)]
Make HReg a struct. In the past there were several occurrences where
an HReg was assigned to an integer. This worked by accident because the
bits representing the register number (which was meant to be accessed)
happened to be in the right place.
Two new functions: hregIsInvalid and sameHReg.
The HReg struct just wraps the integer that was previously used to
represent a register without changing the encoding.
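The idea can be sketched as follows. This is a minimal illustration, not
the actual VEX definitions: the field name reg, the constructor
mkHRegSketch and the sentinel INVALID_HREG_BITS are assumptions. Only the
names HReg, hregIsInvalid and sameHReg come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wrapping the encoding in a struct turns an accidental
   "integer = HReg" assignment into a compile-time error. */
typedef struct {
    uint32_t reg;   /* assumed field name; encoding is unchanged */
} HReg;

/* Assumed sentinel for "no register"; the real value may differ. */
#define INVALID_HREG_BITS 0xFFFFFFFFu

/* Hypothetical constructor used only by this sketch. */
static HReg mkHRegSketch(uint32_t bits)
{
    HReg r = { bits };
    return r;
}

static bool hregIsInvalid(HReg r)
{
    return r.reg == INVALID_HREG_BITS;
}

/* Structs cannot be compared with ==, hence an explicit helper. */
static bool sameHReg(HReg a, HReg b)
{
    return a.reg == b.reg;
}

static void check_hreg_sketch(void)
{
    assert(sameHReg(mkHRegSketch(5), mkHRegSketch(5)));
    assert(hregIsInvalid(mkHRegSketch(INVALID_HREG_BITS)));
}
```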
Florian Krohm [Mon, 11 Feb 2013 00:03:27 +0000 (00:03 +0000)]
s390: Be consistent with emulation warnings about unsupported
rounding modes in absence of the floating-point extension facility.
For some insns we would vassert; for others we'd give a warning.
Now we always issue an emulation warning.
Florian Krohm [Fri, 8 Feb 2013 20:22:03 +0000 (20:22 +0000)]
s390: Change get_dfp_rounding_mode to map IR rounding modes to
S390_DFP_ROUND_.. values in the range [8;15]. See comments in code.
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Florian Krohm [Sat, 2 Feb 2013 22:58:25 +0000 (22:58 +0000)]
s390: It is not necessary to save/restore the link register when
making a helper call. The link register needs to be saved when
switching between valgrind and client code and the dispatcher code
already does that. Julian suggested this change when he merged the
COMEM branch.
This saves between 6% and 13% of insns on the perf bucket.
Runtime difference is within noise margin.
Florian Krohm [Sat, 2 Feb 2013 00:16:58 +0000 (00:16 +0000)]
s390: Change insn selection to recognize memcpy-like statements.
Add S390_INSN_MEMCPY and generate MVC for that later on. Saves between
0.1% and 1.5% of insns. Observed runtime differences on the perf bucket were
within noise margin.
Carl Love [Wed, 30 Jan 2013 18:39:57 +0000 (18:39 +0000)]
The Coverity tool was run against the Valgrind source code and identified a
problem in VEX/priv/guest_ppc_toIR.c saying the variable 'insn_suffix' was
assigned but not used. The function _do_vsx_fp_roundToInt() has an
HChar * parameter named 'insn_suffix', and the intention of this function was
to set the insn_suffix appropriately for the passed opcode so that the caller
could use that suffix as needed (some callers needed it, and others didn't).
However, since the parameter type is a simple pointer, passed by value,
insn_suffix was only modified locally, and the caller did not see the new
value. Since most of the callers of _do_vsx_fp_roundToInt() ignore the
insn_suffix, I have removed that from the parameter list and moved the code
for ascertaining the appropriate suffix into a new function called
_get_vsx_rdpi_suffix().
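The pass-by-value problem described above can be reproduced with a small
C sketch. The function names, opcode values and suffix strings below are
hypothetical and chosen only for illustration; they are not taken from
guest_ppc_toIR.c.

```c
#include <assert.h>
#include <string.h>

/* Broken pattern: 'suffix' is a copy of the caller's pointer, so the
   assignment below never reaches the caller. */
static void set_suffix_broken(int opc, const char *suffix)
{
    suffix = (opc == 1) ? "m" : "p";
    (void)suffix;   /* assigned but not used, as the tool reported */
}

/* Fixed pattern: a separate function returns the suffix, and only the
   callers that need it call this function. */
static const char *get_suffix_sketch(int opc)
{
    switch (opc) {
    case 1:  return "m";
    case 2:  return "p";
    default: return "";
    }
}

static void check_suffix_sketch(void)
{
    const char *s = "unset";
    set_suffix_broken(1, s);
    assert(strcmp(s, "unset") == 0);                 /* caller sees no change */
    assert(strcmp(get_suffix_sketch(1), "m") == 0);  /* value is returned */
}
```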
This patch is for Bugzilla 314099
The patch was written by Maynard Johnson.
The patch does not add any additional regtest errors. The vbit tester
was also run. No issues were found.
The patch was reviewed, tested and committed by Carl Love
Julian Seward [Sat, 26 Jan 2013 11:47:55 +0000 (11:47 +0000)]
Infrastructure cleanup: change type of the condition field of
IRExpr_Mux0X from Ity_I8 to Ity_I1. This makes more sense, makes it
consistent with condition fields in IRStmt_Dirty and IRStmt_Exit, and
avoids some pointless 1Uto8 casting of the condition, in many cases.
Fixes for s390 are from Florian.
Also, make a small extension to ir_opt.c, that allows the constant
folder to look backwards through arbitrary expressions even in flat
IR. This makes it possible to do arbitrary tree folding in ir_opt,
which is where it belongs. Use this to implement the folding rule
CmpNE32(1Uto32(b), 0) ==> b.
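Why the folding rule is sound can be checked with a small C model. This
is not code from ir_opt.c; the helper names are hypothetical stand-ins
for the IROps. A 1-bit value has only two possible states, so the check
is exhaustive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models 1Uto32: zero-extend a 1-bit value to 32 bits. */
static uint32_t widen_1Uto32(bool b)
{
    return b ? 1u : 0u;
}

/* Models CmpNE32: 32-bit inequality producing a 1-bit result. */
static bool cmp_ne32(uint32_t x, uint32_t y)
{
    return x != y;
}

/* The unfolded expression CmpNE32(1Uto32(b), 0). */
static bool unfolded(bool b)
{
    return cmp_ne32(widen_1Uto32(b), 0u);
}

/* Exhaustive check: the expression equals b for both values of b. */
static void check_fold_rule(void)
{
    assert(unfolded(false) == false);
    assert(unfolded(true) == true);
}
```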
Julian Seward [Fri, 25 Jan 2013 09:46:43 +0000 (09:46 +0000)]
Annotate ARMNImm_to_Imm64 with fallthrough markers following
verification against the table in host_arm_defs.h, "Neon Immediate
operand". A particularly nasty piece of code.
Carl Love [Tue, 22 Jan 2013 20:25:31 +0000 (20:25 +0000)]
Fix implementation of the DFP integer operands.
The implementation of integer operands doesn't really match the documentation
for the Iop. Take for example Iop_ExtractExpD64. It is documented as
D64 -> I64 but the implementation of the UNARY is defined as
UNARY(Ity_D64, Ity_D64). The result is an integer that is stored in an integer
format in a floating point register. On the IBM s390 however, the architecture
stores the integer value in a general purpose register (GPR) not a floating
point register. This issue exists with the implementation of 11 Iops where the
PPC implementation has either a source or destination whose value is an integer
but the value is stored in a floating point register in an integer format. After
reviewing the PPC implementation with the s390 developer, it was agreed the
cleanest way to fix this is to change the PPC implementation. The BINOP will be
changed to be consistent with the Iop description. This means the
implementation of the PPC instruction in guest_ppc_toIR.c will need
to reinterpret integer source operands as integers, which will move the value
from a floating point register to an integer register before calling binop().
The underlying PPC implementation of the unop() for the specific Iop will also
need to change to move the value from the integer register back to the floating
point register so the native instruction can be issued with the integer value
in a floating point register. It was decided that making the change in PPC,
rather than having the s390 reinterpret integers as DFP and then move the value
back to an integer register, was preferable as it makes the implementation of
the unop(), binop(), triop() consistent with the definition of the Iop.
This patch also includes the needed changes for the vbit tester. The Iop
definitions in memcheck/tests/vbit-test/util.c had to be updated to be consistent
with the changes in the Iops as documented below. Also, the function mkLazy3()
in memcheck/mc_translate.c had to be updated to handle the I32 x I8 x I64 -> I64
and I32 x I8 x I128 -> I128 cases.
The specific list of changes are as follows:
Iop name in pub/libvex_ir.h
documented type
type of UNARY/BINARY/TERNARY in priv/ir_defs.c
-------------------------------------------------------
Carl Love [Mon, 21 Jan 2013 18:12:31 +0000 (18:12 +0000)]
The 32-bit DFP value is stored in a 64-bit register in
ppc. The D32 to D64 and D64 to D32 definitions for the
Iop type was specified in VEX/priv/ir_defs.c, function
typeOfPrimop() as:
case Iop_D32toD64:
UNARY(Ity_D64, Ity_D64);
case Iop_D64toD32:
BINARY(ity_RMode, Ity_D64, Ity_D64);
since the values resided in a 64-bit register. As part of the s390 DFP support
the definitions were changed to:
case Iop_D32toD64:
UNARY(Ity_D32, Ity_D64);
case Iop_D64toD32:
BINARY(ity_RMode, Ity_D64, Ity_D32);
to reflect what they really should be. However, this broke the ppc
implementation. Valgrind would fail and report a mismatch on the types as the
ppc code was using a D64 instead of a D32.
This patch adds support for fetching and storing the Dfp32 operand as a 32-bit
value. The support includes adding the functions iselDfp32Expr() and
iselDfp32Expr_wrk() and additional code to support the DFP32 bit iops.
Petar Jovanovic [Sun, 20 Jan 2013 18:16:45 +0000 (18:16 +0000)]
mips: fix for mips-disassembler when branch is at block_size-2 position
The check whether the last instruction in the block is a branch or jump
instruction should happen only if the disassembler has not already stopped.
An incorrect conditional led to a boundary case in which jumps/branches were not
executed when placed at the "max_insns - 2" position in the block.
none/tests/mips32/block_size test will be added to Valgrind to describe the case
and check for regressions in future.
Florian Krohm [Sun, 20 Jan 2013 03:51:04 +0000 (03:51 +0000)]
Improve the tree builder in IR optimisation. Allow load expressions to be
moved past Put/I statements and dirty helpers, when it is safe to do so.
It is safe when the statement does not require exact memory exceptions.
New functions stmt_modifies_guest_state and dirty_helper_puts have been
added to determine the side effect on the guest state.
This optimisation enables the use of memory-to-memory insns on
architectures that have those.
Note that this macro behaves slightly differently for some types
from the gcc __alignof__ and from the equivalent (new) standardised
alignof.
vg_alignof macro is needed for the "perm_malloc" callers (next commit)
to determine the alignment of small blocks, but might be useful
for other purposes => placed in libvex_basictypes.h, close to offsetof.
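One common way to write such a macro without compiler extensions is
sketched below. The definition is an assumption made for illustration
and may not match the one in libvex_basictypes.h; the name
ALIGNOF_SKETCH is hypothetical. It reports the alignment a type gets
when it is a struct member, which is one reason a macro of this kind can
differ from __alignof__ for some types on some ABIs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition, for illustration only: place the type after a
   single char in a struct and measure the padding-induced growth. */
#define ALIGNOF_SKETCH(type) \
    (sizeof(struct { char c; type member; }) - sizeof(type))

/* Basic sanity checks that hold on any conforming implementation. */
static void check_alignof_sketch(void)
{
    assert(ALIGNOF_SKETCH(char) == 1);
    assert(ALIGNOF_SKETCH(long) >= 1);
    assert(ALIGNOF_SKETCH(long) <= sizeof(long));
}
```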
Julian Seward [Wed, 16 Jan 2013 22:11:13 +0000 (22:11 +0000)]
mips32 equivalent to r2636: fix up the mips32 back end to be in sync
with infrastructural changes w.r.t. conditional dirty helpers that
return values. Does not actually handle such cases since the mips32
front end does not generate them.
Julian Seward [Wed, 16 Jan 2013 21:10:01 +0000 (21:10 +0000)]
Fix up the s390 back end to be in sync with infrastructural changes
w.r.t. conditional dirty helpers that return values. Does not
actually handle such cases since the s390 front end does not generate
them. Further ahead, it would be more general to redo this by
incorporating a RetLoc as part of the helper_call struct. This change
is OK for now, though.
Julian Seward [Wed, 16 Jan 2013 14:56:06 +0000 (14:56 +0000)]
ppc32/64 equivalents to r2636: fix up the ppc back end to be in sync
with infrastructural changes w.r.t. conditional dirty helpers that
return values. Does not actually handle such cases since the ppc
front end does not generate them.
Julian Seward [Wed, 16 Jan 2013 09:29:37 +0000 (09:29 +0000)]
x86 equivalent to r2636: fix up the x86 back end to be in sync with
infrastructural changes w.r.t. conditional dirty helpers that return
values. Does not actually handle such cases since the x86 front end
does not generate them.
Julian Seward [Tue, 15 Jan 2013 22:30:39 +0000 (22:30 +0000)]
Fix up the amd64 back end to be in sync with infrastructural changes
w.r.t. conditional dirty helpers that return values. Does not
actually handle such cases since the amd64 front end does not generate
them.
Florian Krohm [Sun, 13 Jan 2013 02:29:05 +0000 (02:29 +0000)]
s390: Support insns to convert between DFP values and signed/unsigned
integers. Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ 307113.
Florian Krohm [Sat, 12 Jan 2013 22:02:07 +0000 (22:02 +0000)]
Add 12 IROps for converting between DFP values and signed/unsigned integers.
Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
Part of fixing BZ 307113.
Julian Seward [Tue, 8 Jan 2013 14:09:04 +0000 (14:09 +0000)]
Get rid of selectable default (return) values in conditional dirty
calls, as introduced in r2594. It is overkill -- unnecessary
complexity. Instead have a pre-assumed default bit pattern of 0101010
(0x55..) to be returned in such cases.
Carl Love [Thu, 3 Jan 2013 23:34:18 +0000 (23:34 +0000)]
The call to set the rounding mode for DFP iops: Iop_AddD128, Iop_SubD128,
Iop_MulD128, Iop_DivD128, and Iop_D128toI64 is wrong. The call being used is
set_FPU_rounding_mode(). This call is used to set the two rounding mode bits
for the Floating point instructions. The call set_FPU_DFP_rounding_mode()
should have been used to set the three rounding mode bits for the DFP
instructions.
This patch changes the call to the correct function to set the DFP
rounding mode bits.
Florian Krohm [Sun, 30 Dec 2012 18:17:18 +0000 (18:17 +0000)]
Improve handling of dirty helper calls when building trees in ado_treebuild_BB.
This function took an overly conservative approach and always assumed
that calling a dirty helper would modify both guest state and memory. This
patch introduces two new functions dirty_helper_stores and dirty_helper_puts,
to determine the actual side effects of a helper call. Using these functions
increases precision and allows the tree builder to move a GET past a dirty
helper call.
Florian Krohm [Thu, 27 Dec 2012 20:14:03 +0000 (20:14 +0000)]
s390: Support the "test data class/group" and "extract significance"
insns. Patch by Maran Pakkirisamy (maranp@linux.vnet.ibm.com).
This is part of fixing BZ 307113.
Florian Krohm [Thu, 27 Dec 2012 00:59:43 +0000 (00:59 +0000)]
s390: Do not waste a register when assigning a constant to a memory
location. If available, use MVHI and friends. If those are not available,
load the constant value into register r0 and store that. r0 is not visible
to register allocation and therefore using it does not increase register
pressure.
Remove S390_INSN_MZERO and replace it with S390_INSN_MIMM. Assigning zero
is just a special case.
Saves between 0.9% and 2.4% of insns as measured with the perf regression
bucket.