Carl Love [Tue, 1 Oct 2013 15:50:09 +0000 (15:50 +0000)]
Add tests for the phase 3 ISA 2.07 code patch
This patch adds testcases to an existing testcase
source file to test the new instructions which were
added to VEX support in the phase 3 ISA 2.07 code patch.
The patch also makes a small change to memcheck's
vbit tester code to allow successful execution.
Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Bugzilla 324894. Corresponding VEX commit 2779
The following Iops were added to support the above instructions:
Iop_MullEven32Ux4, Iop_MullEven32Sx4, Iop_Max64Sx2, Iop_Max64Ux2,
Iop_Min64Sx2, Iop_Min64Ux2, Iop_CmpGT64Ux2, Iop_Rol64x2,
Iop_QNarrowBin64Sto32Ux4, Iop_QNarrowBin64Uto32Ux4, Iop_NarrowBin64to32x4.
Dejan Jevtic [Tue, 1 Oct 2013 10:34:54 +0000 (10:34 +0000)]
mips32: Fix the align problem with mmap.
Valgrind always performs mmap with MAP_FIXED. On mips32 we need to check arg4.
If arg4 is MAP_SHARED, we need to align the address to SHMLBA.
If the program itself tries to do mmap with VKI_FIXED, Valgrind doesn't need to align
the address to SHMLBA.
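The rule described above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the flag values and the SHMLBA value are hypothetical stand-ins for the platform's kernel headers, and rounding the address up (rather than down) is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real values come from the kernel headers. */
#define SKETCH_MAP_SHARED 0x001u
#define SKETCH_MAP_FIXED  0x010u
#define SKETCH_SHMLBA     0x40000u   /* assumed to be a power of two */

/* Return the address to use for the mapping: rounded up to a multiple of
   SHMLBA for a shared mapping that the program did not ask to be fixed,
   otherwise unchanged. */
uintptr_t sketch_mmap_addr(uintptr_t addr, unsigned flags)
{
    if ((flags & SKETCH_MAP_SHARED) && !(flags & SKETCH_MAP_FIXED)) {
        return (addr + SKETCH_SHMLBA - 1) & ~(uintptr_t)(SKETCH_SHMLBA - 1);
    }
    return addr;
}
```

With these stand-in values, a shared request at 0x12345 would be moved to 0x40000, while the same request with the fixed flag set would be left alone.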
Use global vars to point at possibly leaked blocks.
Depending on the compiler or optimisation level, the blocks that
are supposed to be possibly leaked are still reachable.
=> change the pointers to be global variables,
and do the allocation in a function, not in main.
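A minimal sketch of the pattern described above, assuming a test that wants a block to be reported as possibly leaked; the names are hypothetical and not taken from the actual testcase.

```c
#include <stdlib.h>

/* A global, so the only surviving reference does not depend on what the
   compiler keeps in registers or stack slots of main. */
char *sketch_interior_ptr;

/* Allocate in a helper instead of main and keep only an interior pointer:
   once this returns, no variable holds the start of the block. */
void sketch_make_possibly_leaked(size_t size, size_t offset)
{
    char *block = malloc(size);
    sketch_interior_ptr = block ? block + offset : NULL;
}
```

A test would call `sketch_make_possibly_leaked(64, 8)` from main and then run the leak check; whether a stale copy of the start pointer still lingers somewhere remains compiler-dependent, which is the problem the commit is working around.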
Add heuristics decreasing false "possible leaks" in C++ code.
The option --leak-check-heuristics=heur1,heur2,... can activate
various heuristics to decrease the number of false positive
"possible leaks" for C++ code. The available heuristics are
detecting valid interior pointers to std::string, to new[] allocated
arrays with elements having destructors and to interior pointers pointing
to an inner part of a C++ object using multiple inheritance.
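To illustrate the idea behind the new[] case, here is a rough sketch. It is not Valgrind's implementation. It assumes a C++ ABI in which new[] for element types with destructors stores the element count in a size_t at the start of the block and hands out a pointer just past it; the function name and the plausibility checks are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if 'interior' looks like the pointer a new[] expression would
   have returned for the heap block [block, block + block_size). */
int sketch_looks_like_new_array(const unsigned char *block, size_t block_size,
                                const unsigned char *interior)
{
    size_t count;

    if (block_size <= sizeof(size_t))
        return 0;                        /* no room for count plus elements */
    if (interior != block + sizeof(size_t))
        return 0;                        /* not just past the count word */
    memcpy(&count, block, sizeof count); /* read the assumed element count */
    if (count == 0)
        return 0;
    /* The remaining bytes must divide evenly into 'count' elements. */
    return (block_size - sizeof(size_t)) % count == 0;
}
```

An interior pointer that passes such a check can be treated like a pointer to the start of the block, so the block is not reported as possibly leaked.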
This fixes 280271 Valgrind reports possible memory leaks on still-reachable
std::string
This has been tested on x86/amd64/ppc32/ppc64.
First performance measurements seem to show a negligible impact on
the leak search.
More feedback welcome both on performance and functional aspects
(false positive 'possibly leaked' rate decrease and/or
false negative 'possibly leaked' rate increase).
Note that the heuristic is not checking that the memory has been
allocated with "new" or "new[]", as it is expected that in some cases,
specific allocation functions are used for C++ objects instead of the standard new/new[].
If needed, we might add an option to check the alloc functions
to be new/new[].
Add a kludgey implementation of XTEST to go with the kludgey
implementation of XBEGIN. Also kludge the CPUID output for AVX
capable targets so as to claim we support HTM.
(Mark Wielaard, mjw@redhat.com)
Petar Jovanovic [Tue, 24 Sep 2013 22:27:23 +0000 (22:27 +0000)]
mips64: finetune mips_dirtyhelper_calculate_FCSR
Several MIPS32 Revision 2 instructions also belong to Revision 1 of MIPS64.
Modifying parts of mips_dirtyhelper_calculate_FCSR to be active for MIPS64R1.
This fixes none/tests/mips64/round when Valgrind is compiled for MIPS64 R1.
Petar Jovanovic [Sat, 21 Sep 2013 01:47:18 +0000 (01:47 +0000)]
mips32: protect mips32r2 instructions with a flag
Regression issue that came when mips_dirtyhelper_calculate_FCSR was added.
Inline assembly with MIPS32r2 instructions needs to be protected by flags
that disable it for non-MIPS32r2 platforms such as some Broadcom boards.
In an attempt to fix the accounting for dynamic memory allocation
it turned out that coregrind freely allocates memory on the tool
arena (which it should not, conceptually) and tools rely on coregrind
doing so (by VG_(free)'ing memory allocated by coregrind).
Disentangling this mess is risky and provides little benefit except
architectural cleanliness.
Thinking more about it... It isn't really all that interesting how
much memory is allocated by tool code in and by itself. What is
interesting is the total memory impact a tool has, e.g. as compared
to running "none".
So in this patch the number of memory arenas is consolidated by
subsuming VG_AR_TOOL/ERRORS/EXECCTXT into VG_AR_CORE.
VG_(malloc) and friends have been modified to operate on VG_AR_CORE.
Add a script 'check_headers_and_includes' to check that #include directives
are not against the grain.
Wrap this script together with 'check_makefile_consistency' into
'post_regtest_checks' and invoke that from the toplevel Makefile, so that
new checkers can easily be added in the future.
Add a new make target 'post-regtest-checks' to just run those checks
and nothing else.
Double the size of the (already huge) translation cache on all
non-phone/tablet targets. The previous apparently-huge sizing is
evidently not huge enough for recent apps, eg, recent Firefox requires
circa 350k translations to get started and almost fills an 8-sector
cache merely starting up and then idling.
On Android targets, fall back to 6 sectors; space is critical.
Add support for the Intel TM "xbegin" instruction, by jumping directly
to the failure address. Currently disabled pending finding hardware
that can actually execute xbegin, for testing purposes.
x86 front ends: tighten up decoding of MOV Ib,Eb and MOV Iv,Ev. This
failed to check the g-register in the modrm byte, with the result that
it will mis-decode the AVX2 XABORT and XBEGIN instructions as these
instead, with obviously-bizarre consequences.
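The tightened check can be sketched as below. This is an illustration, not the VEX decoder: the function name is hypothetical. It relies on MOV Ib,Eb being opcode C6 /0 and MOV Iv,Ev being C7 /0, whereas XABORT is encoded as C6 F8 and XBEGIN as C7 F8, i.e. with a g (reg) field of 7.

```c
#include <stdint.h>

/* Extract the g-register (reg) field, bits 5..3 of the modrm byte. */
static unsigned sketch_modrm_reg(uint8_t modrm)
{
    return (modrm >> 3) & 7u;
}

/* Accept the byte pair as MOV Ib,Eb / MOV Iv,Ev only if the reg field
   is 0; any other value belongs to a different instruction. */
int sketch_is_mov_imm(uint8_t opcode, uint8_t modrm)
{
    if (opcode != 0xC6 && opcode != 0xC7)
        return 0;
    return sketch_modrm_reg(modrm) == 0;
}
```

With this check, the pairs C6 F8 and C7 F8 are rejected as MOV and can be handed to whatever decodes XABORT and XBEGIN.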
Carl Love [Wed, 18 Sep 2013 16:06:46 +0000 (16:06 +0000)]
The patch fixes the assembly of the Power dcbtst and dcbt instructions.
The assembly of these instructions is not always being done correctly as
described in the following email reply.
Re: Assembling Power instructions: dcbtst/dcbt.
From: Peter Bergner <bergner at vnet dot ibm dot com>
To: Paralkar Anmol-B07584 <B07584 at freescale dot com> Cc: "amodra at bigpond dot net dot au" <amodra at bigpond dot net dot au>, "binutils at sourceware dot org" <binutils at sourceware dot org>
Date: Fri, 13 Sep 2013 15:22:35 -0500
Subject: Re: Assembling Power instructions: dcbtst/dcbt.
Authentication-results: sourceware.org; auth=none
References: <DC6D7B34688246489A6578981A5ADEB9302A07 at 039-SN2MPN1-012 dot 039d dot mgd dot msft dot net>
On Fri, 2013-09-13 at 18:32 +0000, Paralkar Anmol-B07584 wrote:
> Hello,
>
> Per Power ISA Version 2.07 (May 3, 2013) "4.3.2 Data Cache Instructions",
> the assembly language syntax for the dcbtst instruction (pp. 771) is:
>
> dcbtst RA,RB,TH [Category: Server]
> dcbtst TH,RA,RB [Category: Embedded]
>
> and it's layout in the object code is:
>
> +------+------+------+------+------------+---+
> | 31 | TH | RA | RB | 246(0xF6) | / |
> |0 |6 |11 |16 |21 |31 |
> +------+------+------+------+------------+---+
>
> (Analogously: dcbt pp. 770)
>
> However, GAS (as of version 2.23.52.20130912) decides on the syntax to use based on
> processor/architecture dialect (not Power ISA Category), using the Server syntax in
> the case of POWER4 and the Embedded syntax for generic PPC or VLE.
> Consequently (e.g.),
>
> dcbtst 17, 14, 6
>
> in the assembly file gets "misassembled" under -many for a user-space program on Linux:
When you only specify -many (and not one of -mpower4, -mpower5, etc.),
the assembler/disassembler will choose a default -m<CPU> value for
you. That has changed over time, but is generally one of the newer
server cpus. For example, for binutils trunk, the default is now
-mpower8 and for your 2.23.x binutils, it is -mpower7.
That should force the assembler and disassembler to assemble
the instruction using the server operand order you want, but the bug
above (which is in 2.23) basically resets it to an old cpu, so it
chooses to use the embedded/old cpu setting.
The patch from Amodra fixes the issue by manually generating the correct
hex value for the instruction rather than leaving it to the assembler to
generate the hex value from the symbolic assembly instruction name.
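As an illustration of generating the instruction word by hand, the sketch below packs the dcbtst fields according to the layout quoted in the email above (opcode 31 in bits 0-5, TH in 6-10, RA in 11-15, RB in 16-20, extended opcode 246 in 21-30, using Power's big-endian bit numbering). The function name is hypothetical and this is not the code from the patch.

```c
#include <stdint.h>

/* Build the 32-bit dcbtst instruction word from its fields. Power numbers
   bits from the most significant end, so field bits 6-10 sit at a left
   shift of 21, and so on. */
uint32_t sketch_encode_dcbtst(unsigned th, unsigned ra, unsigned rb)
{
    return (31u << 26)             /* primary opcode, bits 0-5     */
         | ((th & 0x1Fu) << 21)    /* TH, bits 6-10                */
         | ((ra & 0x1Fu) << 16)    /* RA, bits 11-15               */
         | ((rb & 0x1Fu) << 11)    /* RB, bits 16-20               */
         | (246u << 1);            /* extended opcode, bits 21-30  */
}
```

For RA=17, RB=14, TH=6 this yields 0x7CD171EC, which a test could emit with a `.long` directive instead of relying on the assembler's choice of operand order.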
Change the existing tests to print the value of the FCSR
register after the mips fpu instruction is executed.
Add tests that are testing the value of FCSR register.
Followup to r13553 which caused some build failures.
(1) Detect availability of pthread_setname_np. Ignore testcases
memcheck/tests/threadname[_xml] if not available.
(2) Enable _GNU_SOURCE to avoid compiler warnings.
(3) In threadname_xml filter out stackframes referring to system
libraries. Added tests/filter_xml_frames to do that.
(4) Adjust .exp files as needed
(5) Do not ship stdout.exp for memcheck/tests/threadname[_xml].
Petar Jovanovic [Mon, 16 Sep 2013 18:11:59 +0000 (18:11 +0000)]
mips: clean-up in hardware detection (Cavium/DSP ASEs)
This change is a clean up in MIPS hardware detection code.
New flag for Cavium Company ID is added, as well as the codes for 34K and
74K processors (MIPS Company ID). The latter two represent platforms with DSP
ASEs implemented (Rev 1 and Rev 2 respectively). Macros to detect these two
platforms have been added as well.
Additional macros to extract the Company ID out of hwcaps have been added as
well, and are used where possible.
Intercept prctl(PR_SET_NAME, name) and store the thread name so it
can be used in error messages. That should be helpful when debugging
multithreaded applications.
Patch by Matthias Schwarzott <zzam@gentoo.org> with some minor
modifications. Fixes BZ 322254.
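For reference, this is roughly what the intercepted call looks like from the application side on Linux. The helper name is hypothetical; `prctl`, `PR_SET_NAME` and `PR_GET_NAME` are the standard Linux interfaces, and thread names are limited to 16 bytes including the terminating NUL.

```c
#include <string.h>
#include <sys/prctl.h>

/* Name the calling thread, then read the name back into 'out'.
   Returns 0 on success, -1 on failure. */
int sketch_name_thread(const char *name, char out[16])
{
    if (prctl(PR_SET_NAME, (unsigned long)name, 0UL, 0UL, 0UL) != 0)
        return -1;
    memset(out, 0, 16);
    return prctl(PR_GET_NAME, (unsigned long)out, 0UL, 0UL, 0UL) == 0 ? 0 : -1;
}
```

A thread that calls `sketch_name_thread("worker-1", buf)` before doing its work would then have that name available to be shown in error messages.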
Petar Jovanovic [Sun, 15 Sep 2013 22:49:01 +0000 (22:49 +0000)]
mips32/mips64: rename mips32_features to mips_features
As this file is now detecting mips64/Cavium boards, we are renaming it to
reflect that. The functional change is that mips_features can now detect
Cavium boards and allow Cavium-specific tests to be run.
Fix inclusion of header files in coregrind. No pub_tool_*.h should be
included here.
Added pub_core_poolalloc.h and renamed pub_tool_inner.h to pub_core_inner.h.
clarify that vg-in-place cannot be used as an outer in outer/inner setup
If you use a vg-in-place outer, then you obtain errors such as:
valgrind: mmap(0x38000000, 3293184) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
What must be used is the "make install"-ed valgrind
Carl Love [Thu, 12 Sep 2013 17:38:13 +0000 (17:38 +0000)]
The Power ISA 2.07 document includes a correction to the description for the
behavior of the xscvspdp instruction, indicating that if the source argument
is a SNaN, it is first changed to a QNaN before being converted from
single-precision to double-precision. This updated information about the
xscvspdp instruction exposed a bug in the VEX implementation for that
instruction and also a bug in the testing for all instructions having
special behavior for single-precision SNaN arguments.
The VEX code fix for this issue is r2760.
This patch fixes the test cases for the ISA 2.07.
Testing bug: In several ppc[64] test cases, an array of special
double-precision floating point values is set up, and then all elements of
that array are copied via assignment to a single-precision array ('float'
type). Assignment from a double to a float works fine for all cases, except for
SNaN values. In the case of a SNaN, the source is changed to a QNaN and then
converted to single-precision. So the end result was that our array of floats
did not have an actual SNaN value, and, therefore, any instructions that had
special behavior for a single-precision SNaN input argument were never being
properly tested. This patch makes some functional changes in the following
testcases:
These changes impacted the associated *.stdout.exp files, so the patch also
updates those files. Additionally, there were several errors in testcase
source comments that misidentified QNaN and SNaN bit patterns which this patch
corrects.
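One way to get a genuine single-precision SNaN into a float array is to store the raw bit pattern instead of assigning from a double. The sketch below illustrates that idea. It is not the code from the testcases: it assumes IEEE 754 binary32 floats and the convention that the top fraction bit (bit 22) set means quiet, and the names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Single-precision SNaN: exponent all ones, quiet bit (bit 22) clear,
   and at least one of the remaining fraction bits set. */
int sketch_is_snan_bits(uint32_t bits)
{
    return (bits & 0x7F800000u) == 0x7F800000u
        && (bits & 0x00400000u) == 0
        && (bits & 0x003FFFFFu) != 0;
}

/* Store a raw bit pattern into a float slot by copying bytes, so that no
   floating point conversion gets a chance to quiet the NaN. */
void sketch_store_float_bits(float *dst, uint32_t bits)
{
    memcpy(dst, &bits, sizeof *dst);
}
```

Storing 0x7FA00000 this way leaves an SNaN in the array, whereas assigning the corresponding double value would store a QNaN, as described above.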
Carl Love [Thu, 12 Sep 2013 17:26:42 +0000 (17:26 +0000)]
The Power ISA 2.07 document includes a correction to the description for the
behavior of the xscvspdp instruction, indicating that if the source argument
is a SNaN, it is first changed to a QNaN before being converted from
single-precision to double-precision. This updated information about the
xscvspdp instruction exposed a bug in the VEX implementation for that
instruction and also a bug in the testing for all instructions having
special behavior for single-precision SNaN arguments.
This patch fixes the VEX bug in the xscvspdp implementation:
The current implementation of xscvspdp emulates the instruction by
extracting the single-precision floating point from the vector register,
storing it in single-precision, and then loading the data just stored using
the lfsx instruction. But the lfsx instruction does not change SNaN input
arguments to QNaN inputs before conversion to double-precision, so this
emulation is not sufficient for the xscvspdp instruction as described in the
current documentation. This patch fixes that issue by recognizing a SNaN input
and changing it to a QNaN before performing the emulation using lfsx.
While fixing the bug in xscvspdp implementation, it was also discovered that
xvcvspdp had the same issue where SNaN inputs were not being handled correctly,
so this patch fixes its implementation as well.
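The "recognize a SNaN and change it to a QNaN" step can be sketched on the raw single-precision bits as follows. This is an illustration in plain C rather than the VEX IR the patch generates. It assumes IEEE 754 binary32 with bit 22 as the quiet bit, and the function name is hypothetical.

```c
#include <stdint.h>

/* If 'bits' encodes a NaN, set the quiet bit so that an SNaN becomes the
   corresponding QNaN; every other value is returned unchanged. */
uint32_t sketch_quiet_snan(uint32_t bits)
{
    int is_nan = (bits & 0x7F800000u) == 0x7F800000u   /* exponent all ones */
              && (bits & 0x007FFFFFu) != 0;            /* nonzero fraction  */
    return is_nan ? (bits | 0x00400000u) : bits;
}
```

Applying this before the store/lfsx sequence means the value that reaches the single-to-double conversion is already quiet.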
Enhance ado_treebuild_BB to allow an expression preceding a Put
statement and containing one or more Get expressions to be
substituted in an expression following the Put statement.
That transformation is harmless as long as the guest state areas being
accessed by the Put and Get(s) do not overlap.
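The non-overlap condition amounts to an interval test on guest state offsets. A minimal sketch, with hypothetical names and each access described by an offset and a size in bytes:

```c
#include <stddef.h>

/* Return 1 if the guest state bytes written by a Put at
   [put_off, put_off + put_size) intersect the bytes read by a Get at
   [get_off, get_off + get_size), else 0. */
int sketch_put_get_overlap(size_t put_off, size_t put_size,
                           size_t get_off, size_t get_size)
{
    return put_off < get_off + get_size && get_off < put_off + put_size;
}
```

Under this sketch, an expression containing a Get may be moved past the Put only when the function returns 0 for every Get it contains.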
Carl Love [Tue, 10 Sep 2013 19:01:00 +0000 (19:01 +0000)]
Bugzilla 323437, this is phase 2 in a series of patches adding support for IBM
Power ISA 2.07. The first bugzilla in the series was: 322294: Add initial
support for IBM Power ISA 2.07
Phase 2 VEX commit 2756 added support for the following new instructions to
VEX/priv/guest_ppc_toIR.c:
- lq, stq, lqarx, stqcx.
- mfvsrwz, mtvsrwz
- fmrgew, fmrgow
This commit adds the corresponding test cases for these instructions.
Carl Love [Tue, 10 Sep 2013 18:46:40 +0000 (18:46 +0000)]
Bugzilla 323437, this is phase 2 in a series of patches adding support for IBM
Power ISA 2.07. The first bugzilla in the series was: 322294: Add initial
support for IBM Power ISA 2.07
Phase 2 adds support for the following new instructions to
VEX/priv/guest_ppc_toIR.c:
- lq, stq, lqarx, stqcx.
- mfvsrwz, mtvsrwz
- fmrgew, fmrgow
There is a corresponding test case for these instructions, see the bugzilla
for the commit number.
Carl Love [Fri, 6 Sep 2013 22:29:55 +0000 (22:29 +0000)]
The existing overflow detection in VEX/priv/guest_ppc_toIR.c/set_XER_OV_64()
under the case PPCG_FLAG_OP_MULLW: does not apply to the mulldo as we need to
detect overflow when performing a Multiply Low Doubleword (not Multiply Low
Word). Hence, we added a new enumeration value PPCG_FLAG_OP_MULLD in
VEX/priv/guest_ppc_defs.h and a corresponding new case under which the
computation for detecting overflow for mulldo/mulldo. is added in
set_XER_OV_64(). The tests have been added to: none/tests/ppc32/jm-insns.c
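The difference between the two overflow conditions can be illustrated in plain C. This is a sketch, not the IR that set_XER_OV_64() builds. It uses the GCC/Clang `__int128` extension, which is an assumption about the compiler and a 64-bit target, and the function names are hypothetical.

```c
#include <stdint.h>

/* Multiply Low Word: overflow if the product of the 32-bit operands does
   not fit in a signed 32-bit result. */
int sketch_mullw_overflows(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return product > INT32_MAX || product < INT32_MIN;
}

/* Multiply Low Doubleword: overflow if the product of the 64-bit operands
   does not fit in a signed 64-bit result. */
int sketch_mulld_overflows(int64_t a, int64_t b)
{
    __int128 product = (__int128)a * (__int128)b;
    return product > INT64_MAX || product < INT64_MIN;
}
```

For example, 65536 * 65536 overflows the word form but not the doubleword form, which is why the existing MULLW case cannot simply be reused for mulldo.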
Carl Love [Fri, 6 Sep 2013 22:27:34 +0000 (22:27 +0000)]
The existing overflow detection in VEX/priv/guest_ppc_toIR.c/set_XER_OV_64()
under the case PPCG_FLAG_OP_MULLW: does not apply to the mulldo as we need to
detect overflow when performing a Multiply Low Doubleword (not Multiply Low
Word). Hence, we added a new enumeration value PPCG_FLAG_OP_MULLD in
VEX/priv/guest_ppc_defs.h and a corresponding new case under which the
computation for detecting overflow for mulldo/mulldo. is added in
set_XER_OV_64(). The tests have been added to: none/tests/ppc32/jm-insns.c
Carl Love [Fri, 6 Sep 2013 16:49:42 +0000 (16:49 +0000)]
The patch used the binary constants 0b10000 and 0b10001. The 0b prefix
is a GCC extension that not all compilers support. Therefore, the binary
constants were changed to their equivalent hex values, as suggested by Florian.
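For reference, the conversion is direct: 0b10000 is 16 and 0b10001 is 17. A tiny sketch with hypothetical names for the two values:

```c
/* Portable hex spellings of the two former binary constants. */
enum {
    SKETCH_CONST_A = 0x10,   /* was 0b10000 */
    SKETCH_CONST_B = 0x11    /* was 0b10001 */
};
```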
Carl Love [Thu, 5 Sep 2013 19:50:41 +0000 (19:50 +0000)]
The current VEX code is not properly handling a non-zero TH field in the
dcbt instruction, which is valid for several forms of data cache block
touch instructions. The VEX commit 2761 fixed the missing support in
VEX/priv/guest_ppc_toIR.c. This commit adds tests for the non-zero
fields to the test cases for 32 and 64-bit modes.
Carl Love [Thu, 5 Sep 2013 19:47:40 +0000 (19:47 +0000)]
The current code is not properly handling a non-zero TH field in the
dcbt instruction, which is valid for several forms of data cache block
touch instructions. This patch adds the needed support to
VEX/priv/guest_ppc_toIR.c.
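Reading the TH field out of an instruction word can be sketched as below. The sketch assumes the X-form layout in which TH occupies bits 6-10 in Power's big-endian bit numbering; the function name is hypothetical and this is not the VEX decoder code.

```c
#include <stdint.h>

/* Extract the 5-bit TH field (bits 6-10, counted from the most
   significant bit) from a 32-bit instruction word. */
unsigned sketch_th_field(uint32_t instr)
{
    return (instr >> 21) & 0x1Fu;
}
```

A decoder along these lines would dispatch on the returned value instead of rejecting every non-zero TH.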
Carl Love [Thu, 5 Sep 2013 17:59:03 +0000 (17:59 +0000)]
The flag for compiling test none/tests/ppc32/test_isa_2_07_part2.c was
incorrectly set to FLAG_M64 instead of FLAG_M32. Fixed the flag. The
issue was reported in Bugzilla 324546.
Fix 324514 gdbserver monitor cmd output behaviour consistency + allow user
to put a "marker" msg in process log output
* v.info n_errs_found accepts optional msg, added in the output of
the monitor command.
* use VG_(printf) rather than VG_(gdb_printf) when output of command
should be redirected according to v.set gdb_output|log_output|mixed_output
* also avoid calling gdb_printf in output sink processing
to output zero bytes, as gdb_printf expects to have a null terminated
string, which is not ensured when 0 bytes have to be output.
* some minor reformatting (replace char* xxx by char *xxx).
Add a bunch of suppressions for 64-bit OSX 10.8 processes. This is a
huge kludge in that the right fix is to write proper syscall wrappers
for the new threading syscalls in 10.8, but that hasn't happened yet.