Tweak the IR injector so it can handle an immediate operand for
shift operations. This is needed for Iop_ShlD64 and the like on
powerpc where the shift amount is an immediate field in the insn.
Part of fixing bugzilla #305948.
s390: Change the handling of S390_ROUND_PER_FPC (which indicates that the
actual rounding mode is to be taken from the FPC register). Previously, this
was just mapped to S390_ROUND_NEAREST_EVEN, which is wrong whenever the FPC
specifies a different rounding mode.
First, we add a function get_bfp_rounding_mode_from_fpc to extract the
rounding mode from the guest FPC when building IR.
Second, have encode_bfp_rounding_mode invoke get_bfp_rounding_mode_from_fpc
whenever a S390_ROUND_PER_FPC is requested.
Third, in insn selection track whether (and if so to what value) the
rounding mode was set for the IRSB at hand. That way redundant assignments
can be avoided. This works well because the IR optimiser does a fine job
recognising and eliminating the expressions returned earlier from
get_bfp_rounding_mode_from_fpc, so they all get mapped to the same IRTemp.
Note, VEX r2524 is essential to get this behaviour.
Fourth, remove the rounding mode from the bfp_unop/binop/triop s390_insns.
Fifth, if the rounding mode can be set on the insn directly, prefer that
over setting it in the FPC and picking it up from there.
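The redundancy tracking described in the third step can be sketched roughly
as follows. This is a simplified illustration, not the VEX code: SelEnv,
RoundMode, env_reset and set_rounding_mode are hypothetical names, and the
sketch keys on a plain mode value, whereas the commit tracks the value that
was set for the IRSB at hand.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not the VEX types. */
typedef enum { RM_NEAREST_EVEN, RM_ZERO, RM_POSINF, RM_NEGINF } RoundMode;

typedef struct {
    bool      mode_known;     /* has a mode been set in this block yet? */
    RoundMode current_mode;   /* only meaningful if mode_known is true */
    int       insns_emitted;  /* counts emitted "set rounding mode" insns */
} SelEnv;

/* At the start of a new block nothing is known about the rounding mode. */
static void env_reset(SelEnv *env)
{
    env->mode_known    = false;
    env->current_mode  = RM_NEAREST_EVEN;
    env->insns_emitted = 0;
}

/* Emit a "set rounding mode" insn only if the requested mode differs from
   the one already known to be in effect. Returns true if one was emitted. */
static bool set_rounding_mode(SelEnv *env, RoundMode wanted)
{
    if (env->mode_known && env->current_mode == wanted)
        return false;                 /* redundant assignment avoided */
    env->mode_known   = true;
    env->current_mode = wanted;
    env->insns_emitted++;             /* stands in for emitting the insn */
    return true;
}
```

Requesting the same mode twice in a row emits one insn; switching to a
different mode emits another.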
s390: More prep work for the bfp reorg. In the future unary/binary/ternary
operations on bfp data will no longer require a rounding mode in the
s390_insn. Only type conversion operations need a rounding mode.
So in this patch S390_BFP_CONVERT is introduced and
S390_BFP128_CONVERT_TO/FROM are consolidated to S390_BFP128_CONVERT.
This also makes the representation of bfp and bfp128 symmetric.
s390_insn gets a new variant: s390_convert.
The type conversion ops get their own data type now: s390_conv_t.
s390: Prepare for bfp reorg. Change the emit functions for the
convert-to-fixed and load-rounded instructions to emit the extended
form. E.g. change s390_emit_CEFBR to s390_emit_CEFBRA. In the future
we will take advantage of those insns if the host's hardware facilities
allow it.
Petar Jovanovic [Sun, 9 Sep 2012 01:10:59 +0000 (01:10 +0000)]
Correcting how load/store doubles are modelled on MIPS for big-endian.
One of the previous changes, r2511, was correct for little-endian and introduced
a regression for big-endian MIPS. This corrects the endianness issues.
s390: Fix condition code computation for convert-to-fixed/logical
insns. Previously the condition code was computed based on the
to-be-converted value only. But that is not sufficient as testcase
none/tests/s390x/rounding-1 shows. The rounding mode needs to be
considered, too. Therefore, the rounding mode is now stored in the
flags thunk as well (in IRRoundingMode encoding). Note that this is
done for *all* convert-to-fixed/logical insns. It's possible that some
of them do not need the rounding mode but I did not bother exploring
the fine print. Setting the rounding mode as it was on the incoming
insn certainly will not be detrimental, so we may as well do it.
This patch fixes bugzilla #306054.
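A small sketch of why the source value alone is not enough. It assumes a
condition code encoding of 0 = zero, 1 = negative, 2 = positive, 3 = special
case (rounded result does not fit the target); that encoding, the RoundMode
enum and both function names are assumptions made for illustration, not
taken from the patch. The input is assumed finite with |x| < 2^63.

```c
#include <stdint.h>

/* Hypothetical rounding-mode enum; not the IRRoundingMode encoding. */
typedef enum { RM_NEAREST_EVEN, RM_ZERO, RM_POSINF, RM_NEGINF } RoundMode;

/* Round x to an integral value. Assumes x is finite and |x| < 2^63. */
static double round_with_mode(double x, RoundMode m)
{
    double t  = (double)(int64_t)x;        /* truncate toward zero */
    double lo = (t > x) ? t - 1.0 : t;     /* floor */
    double hi = (lo < x) ? lo + 1.0 : lo;  /* ceiling */

    switch (m) {
    case RM_ZERO:   return t;
    case RM_POSINF: return hi;
    case RM_NEGINF: return lo;
    default:
        if (x - lo < hi - x) return lo;
        if (x - lo > hi - x) return hi;
        /* Exact tie (or x already integral): pick the even neighbour. */
        return ((int64_t)lo % 2 == 0) ? lo : hi;
    }
}

/* Assumed cc encoding: 0 zero, 1 negative, 2 positive, 3 special case. */
static int cc_convert_to_int32(double x, RoundMode m)
{
    double r = round_with_mode(x, m);
    if (r < (double)INT32_MIN || r > (double)INT32_MAX)
        return 3;                          /* rounded value does not fit */
    if (x == 0.0)
        return 0;
    return (x < 0.0) ? 1 : 2;
}
```

With these assumptions, 2147483647.5 yields cc=2 when rounding toward zero
but cc=3 when rounding to nearest even, so the same source value produces
different condition codes.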
s390: Non-functional change.
Rename enable_rounding_mode to enable_bfp_rounding_mode in
anticipation of dfp coming. Change its return value to be an
IRTemp which will be handy soon. Fix all call-sites.
Binary floating point cleanup. This was an area that was hushed up
a bit when the s390 port was first committed. Time to get it in shape.
This patch
- completes s390_round_t to list all rounding modes that can appear
in a convert-to instruction
- adapts function encode_rounding_mode accordingly
- ensures that all s390_round_t -> IRRoundingMode conversions go through
encode_rounding_mode
Carl Love [Tue, 4 Sep 2012 22:09:48 +0000 (22:09 +0000)]
Add a vassert for the DFP shift value to make sure it is an immediate value.
The V-bit tester was putting the shift value in a register for the DFP shift
instructions, causing the test to crash; see bugzilla #305948.
Petar Jovanovic [Tue, 4 Sep 2012 13:45:42 +0000 (13:45 +0000)]
Load/store doubles on MIPS are modeled through Ity_F64 rather than two Ity_F32.
This patch changes how the load/store doublewords are modeled on MIPS.
Previously, this was modeled through two Ity_F32s, which caused test output to
differ from the expected output.
This fixes memcheck/tests/fprw.
s390: Undo part of r2501. The "convert to fixed" opcodes always have an m3
field -- independent of the floating point extension facility.
So do not issue an emulation warning for those opcodes.
Support the variety of "convert to/from fixed" and "load rounded" opcodes
that have an additional m3 and/or m4 field.
Add emulation warning EmWarn_S390X_fpext_rounding and issue it in case
the current opcode cannot be emulated correctly (i.e. with the specified
rounding mode).
New function: emulation_warning.
Part of fixing bugzilla #306098.
Remove alignment checks for VMPSADBW, VPHMINPOSUW, VPALIGNR since they
do not apply to the AVX versions of these instructions. Fixes #305926.
(Jakub Jelinek, jakub@redhat.com)
s390: Generate an emulation failure if an insn is encountered that
requires the floating point extension facility but the host does not
have it. Factored out function emulation_failure.
s390: Add support for the "convert from/to logical" instruction family.
A few (7) new IROps are introduced.
Patch by Christian Borntraeger (borntraeger@de.ibm.com).
Fixes bugzilla #274695.
Florian Krohm [Wed, 29 Aug 2012 17:40:52 +0000 (17:40 +0000)]
Fix address computation in IR injection. When loading / storing a
128-bit value as 2 64-bit values, the two memory locations are 8 bytes
apart. Always. Everywhere. Due to a thinko this was busted on 32-bit
machines.
Also add an assert to catch values requiring more than 128 bits, which are
currently not supported.
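A minimal sketch of the fixed-offset rule. The commit does not say what the
thinko was; deriving the offset from the host word size is only one
plausible way to get it wrong, and store_wide and HALF_BYTES are
hypothetical names, not the injector's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Distance between the two halves. Fixed at 8; deliberately not
   sizeof(void *) or sizeof(long), which are 4 on a 32-bit host. */
enum { HALF_BYTES = 8 };

/* Hypothetical helper: store a value of up to 128 bits as one or two
   64-bit pieces starting at addr. */
static void store_wide(uint8_t *addr, unsigned size_in_bits,
                       uint64_t first, uint64_t second)
{
    /* Values wider than 128 bits are not supported. */
    assert(size_in_bits <= 128);

    memcpy(addr, &first, HALF_BYTES);
    if (size_in_bits > 64)
        memcpy(addr + HALF_BYTES, &second, HALF_BYTES);
}
```

Calling store_wide on a 16-byte buffer with size_in_bits set to 128 places
the second piece at byte offset 8 on any host.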
Florian Krohm [Tue, 28 Aug 2012 16:49:30 +0000 (16:49 +0000)]
VEX-side support for the V-bit tester.
- recognise the new "special instruction" for all architectures
(ARM needs implementation work; x86 and ARM are untested)
- inject IR into the superblock
- type definition for the IR injection control block
Florian Krohm [Sun, 26 Aug 2012 18:58:13 +0000 (18:58 +0000)]
s390: Add support for the ecag insn. Patch from Divya Vyas
(divyvyas@linux.vnet.ibm.com) with mods to terminate the super block
with EmFail in case the insn is not available on the host.
Part of fixing bugzilla #275800.
Florian Krohm [Sat, 25 Aug 2012 21:48:04 +0000 (21:48 +0000)]
Rename libvex_emwarn.h to libvex_emnote.h and fix all
#include's. The renaming of guest_EMWARN, VexemWarn etc will
be done in a followup patch.
The rationale for all this is that we want to reuse the existing
machinery for emulation warnings also for emulation failures.
And that calls for some kind of neutral naming scheme.
Florian Krohm [Mon, 6 Aug 2012 13:35:33 +0000 (13:35 +0000)]
The arguments in a helper call need to be sign/zero-extended
to 64 bit. Fix helper calls accordingly. And because I keep forgetting
this, add checking machinery in the insn selector so it won't happen again.
Diagnosed by Christian Borntraeger.
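To illustrate what goes wrong without the extension, here is a small sketch
with hypothetical helper names; it models a 64-bit register in plain C and
is not the insn selector's code.

```c
#include <stdint.h>

/* Sign-extend a 32-bit argument to the full 64-bit register width. */
static uint64_t widen_signed32(int32_t v)
{
    return (uint64_t)(int64_t)v;
}

/* Zero-extend a 32-bit argument to the full 64-bit register width. */
static uint64_t widen_unsigned32(uint32_t v)
{
    return (uint64_t)v;
}

/* The failure case: only the low 32 bits of the register are written,
   so whatever was in the upper half is passed to the helper as well. */
static uint64_t write_low_half_only(uint64_t old_reg, uint32_t v)
{
    return (old_reg & 0xFFFFFFFF00000000ULL) | (uint64_t)v;
}
```

A helper that reads the full 64-bit register sees 5 only in the two
extended cases; with a stale upper half it sees a different value.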
Florian Krohm [Sun, 5 Aug 2012 02:59:55 +0000 (02:59 +0000)]
Support the cu14 insn. That insn is very much like cu12 except the
converted value is always 4 byte wide. The only other difference is
the encoding of a 4-byte UTF-8 character.
Some code refactoring does the trick.
Part of fixing #289839.
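For reference, a sketch of how a 4-byte UTF-8 character maps to a 4-byte
wide value. This follows the standard UTF-8 layout only; it does not model
the insn's validity checks or condition codes, and decode_utf8_4byte is a
hypothetical name.

```c
#include <stdint.h>

/* Decode one 4-byte UTF-8 sequence (11110uuu 10xxxxxx 10xxxxxx 10xxxxxx)
   into a 32-bit code point. The caller is assumed to have checked that
   the bytes form a well-formed 4-byte sequence. */
static uint32_t decode_utf8_4byte(const uint8_t b[4])
{
    return ((uint32_t)(b[0] & 0x07) << 18) |
           ((uint32_t)(b[1] & 0x3F) << 12) |
           ((uint32_t)(b[2] & 0x3F) << 6)  |
            (uint32_t)(b[3] & 0x3F);
}
```

For example, the bytes F0 9F 98 80 decode to the code point U+1F600.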
Florian Krohm [Sat, 4 Aug 2012 04:25:30 +0000 (04:25 +0000)]
Fix a bug in insn selection. For some reason Iop_1UtoXYZ did not
zero-extend. That is essential, as not all computation is done
using 8-byte values.
For example
- do a 64-bit computation in r1; assume leftmost 4 bytes != 0
- do a 32-bit computation in r1; leftmost 4 bytes are untouched != 0
- do 32to1 on r1; rightmost 4 bytes == 1; leftmost 4 bytes != 0
- do 1Uto64 on r1
Without zero-extension, r1 will contain a value that is not boolean
(neither 0 nor 1).
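The sequence above can be replayed on a modelled 64-bit register. The
function names are hypothetical and only mimic the described behaviour;
they are not the insn selector's code.

```c
#include <stdint.h>

/* Models 32to1 as a 32-bit operation: the low 4 bytes become the least
   significant bit of the input, the upper 4 bytes are left untouched. */
static uint64_t op_32to1(uint64_t reg)
{
    return (reg & 0xFFFFFFFF00000000ULL) | (reg & 1ULL);
}

/* 1Uto64 without zero-extension: the stale upper bytes survive. */
static uint64_t op_1Uto64_buggy(uint64_t reg)
{
    return reg;
}

/* 1Uto64 with zero-extension: only the boolean bit is kept. */
static uint64_t op_1Uto64_fixed(uint64_t reg)
{
    return reg & 1ULL;
}
```

Starting from a register whose upper 4 bytes are non-zero, the buggy
variant yields a value that is neither 0 nor 1, while the fixed one
yields 1.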
Upon decode failure set the guest_IA to the address of the insn that
could not be decoded (current insn). Otherwise, a wrong address will be
reported in the complaint.
Clean up IR construction for insns that do not have straight-line
internal control flow. Introduce a few new convenience functions: iterate,
iterate_if, next_insn_if that are now used to build IR for insns with implicit
loops. These functions behave like if_condition_goto, except they do
not terminate the super block and do not call put_IA(next insn). Previously,
the guest_IA was possibly assigned several times for such an insn. Now we do
it exactly once as it should be. This improves complaints for cu21 which was
the motivation behind this patch.
As a side effect insns with an implicit loop no longer terminate the super
block. All calls to dummy_put_IA have been eliminated.
No changes in runtime performance were observed.
Change IR generation for SRST, CLST, and CLCLE to not generate cc=3.
Two reasons:
(1) Consistency in implementation (we don't generate cc=3 for "translate",
"convert to unicode" and possibly other insns)
(2) There is nothing to be gained. A program that does not handle cc=3
correctly (by looping back to the insn that generated it) may exhibit
unpredictable behaviours. And there is no way for us to match that (as
we cannot know when hardware decides to interrupt the insn). So why
add complexity for that?
Back out special handling for opcode 00 (VEX r2189).
This was added based on the following analysis at the time:
(1) during decoding a sequence of insns we run into a 00 opcode (as that
opcode is sometimes used on purpose to force an abort)
(2) #1 only happens when chasing through unconditional gotos
(3) the path that was decoded in #1 would not be executed because an earlier
side exit in the super block was taken
But chasing through an unconditional branch should not reach an insn that is
not reached at execution time, because
(a) conditional gotos are supposed to terminate a superblock
(b) side exits that appear in the IR of complex insns will transfer control
to the very same address (for insns that have implicit loops) and/or to
the address that immediately follows the current insn (fall through)
Therefore, the special handling of opcode 00 was just fighting the
symptom but not the cause.
Most likely a super block was not correctly terminated.
Change logic in computed gotos to use if_condition_goto_computed
instead of if_not_condition_goto_computed. Hide the implementation
detail of inverting the condition in if_condition_goto_computed and
fix the call sites. This is clearer as it better matches the semantic
description in the POP.
Handle UD2 a bit better. This change causes Vex to decode UD2 like
any other instruction -- so it doesn't complain -- but Valgrind still
complains when synthesising the SIGILL for the guest. Marginally less
confusing than it was before.
Increase max allowed pre-allocation (vreg-ised) block size from 10000
to 15000. In very extreme circumstances the JIT pipeline can create
huge blocks. Fixes #303250, at least for the time being.
Make the IR sanity checker complain about dirty helpers that return
a value and are executed under a condition. That case is not handled
properly and will cause asserts down the road. As pointed out by Julian.