git.ipfire.org Git - thirdparty/valgrind.git/log

]> git.ipfire.org Git - thirdparty/valgrind.git/log

projects / thirdparty / valgrind.git / log

commit | commitdiff | tree

Philippe Waroquiers [Sun, 27 Jan 2019 10:04:01 +0000 (11:04 +0100)]

Sort the bug entries by bug nr, add a entry for a fixed bug.

commit | commitdiff | tree

Julian Seward [Sat, 26 Jan 2019 17:19:50 +0000 (18:19 +0100)]

Enable warning flag -Wenum-conversion if the compiler supports it.

This picks up some enum type confusion, and so looks useful. Unfortunately
only Clang seems to have it; gcc doesn't.

commit | commitdiff | tree

Julian Seward [Sat, 26 Jan 2019 17:18:28 +0000 (18:18 +0100)]

s390 front end: remove unused function 'put_gpr_int'. n-i-bz.

commit | commitdiff | tree

Julian Seward [Sat, 26 Jan 2019 17:00:41 +0000 (18:00 +0100)]

amd64 pipeline: generate a much better translation for PMADDUBSW.

This seems pretty common in some codecs, and the existing translation
was somewhat longwinded.

commit | commitdiff | tree

Julian Seward [Sat, 26 Jan 2019 16:38:01 +0000 (17:38 +0100)]

Rename some int<->fp conversion IROps for consistency.  No functional change.  n-i-bz.

2018-Dec-27: some of int<->fp conversion operations have been renamed so as to
have a trailing _DEP, meaning "deprecated".  This is because they don't
specify a rounding mode to be used for the conversion and so are
underspecified.  Their use should be replaced with equivalents that do specify
a rounding mode, either as a first argument or using a suffix on the name,
that indicates the rounding mode to use.

commit | commitdiff | tree

Julian Seward [Fri, 25 Jan 2019 11:06:37 +0000 (12:06 +0100)]

VG_(discard_translations): try to avoid invalidating the entire VG_(tt_fast) cache. n-i-bz.

It is very commonly the case that a call to VG_(discard_translations) results
in the discarding of exactly one superblock. In such cases, it's much cheaper
to find and invalidate the VG_(tt_fast) cache entry associated with the block,
than it is to invalidate the entire cache, because

(1) invalidating the fast cache is expensive, and

(2) repopulating the fast cache after invalidation is even more expensive.

For QEMU, which intensively invalidates individual translations (presumably
due to patching them), this reduces the fast-cache miss rate from circa one in
33 lookups to around one in 130 lookups.

commit | commitdiff | tree

Julian Seward [Fri, 25 Jan 2019 08:31:19 +0000 (09:31 +0100)]

Update.

commit | commitdiff | tree

Julian Seward [Fri, 25 Jan 2019 08:27:23 +0000 (09:27 +0100)]

Bug 402781 - Redo the cache used to process indirect branch targets.

Implementation for x86-solaris and amd64-solaris. This completes the
implementations for all targets. Note these two are untested because I don't
have any way to test them.

commit | commitdiff | tree

Julian Seward [Fri, 25 Jan 2019 08:14:56 +0000 (09:14 +0100)]

Bug 402781 - Redo the cache used to process indirect branch targets.

[This commit contains an implementation for all targets except amd64-solaris
and x86-solaris, which will be completed shortly.]

In the baseline simulator, jumps to guest code addresses that are not known at
JIT time have to be looked up in a guest->host mapping table.  That means:
indirect branches, indirect calls and most commonly, returns.  Since there are
huge numbers of these (often 10+ million/second) the mapping mechanism needs
to be extremely cheap.

Currently, this is implemented using a direct-mapped cache, VG_(tt_fast), with
2^15 (guest_addr, host_addr) pairs.  This is queried in handwritten assembly
in VG_(disp_cp_xindir) in dispatch-<arch>-<os>.S.  If there is a miss in the
cache then we fall back out to C land, and do a slow lookup using
VG_(search_transtab).

Given that the size of the translation table(s) in recent years has expanded
significantly in order to keep pace with increasing application sizes, two bad
things have happened: (1) the cost of a miss in the fast cache has risen
significantly, and (2) the miss rate on the fast cache has also increased
significantly.  This means that large (~ one-million-basic-blocks-JITted)
applications that run for a long time end up spending a lot of time in
VG_(search_transtab).

The proposed fix is to increase associativity of the fast cache, from 1
(direct mapped) to 4.  Simulations of various cache configurations using
indirect-branch traces from a large application show that is the best of
various configurations.  In an extreme case with 5.7 billion indirect
branches:

* The increase of associativity from 1 way to 4 way, whilst keeping the
  overall cache size the same (32k guest/host pairs), reduces the miss rate by
  around a factor of 3, from 4.02% to 1.30%.

* The use of a slightly better hash function than merely slicing off the
  bottom 15 bits of the address, reduces the miss rate further, from 1.30% to
  0.53%.

Overall the VG_(tt_fast) miss rate is almost unchanged on small workloads, but
reduced by a factor of up to almost 8 on large workloads.

By implementing each (4-entry) cache set using a move-to-front scheme in the
case of hits in ways 1, 2 or 3, the vast majority of hits can be made to
happen in way 0.  Hence the cost of having this extra associativity is almost
zero in the case of a hit.  The improved hash function costs an extra 2 ALU
shots (a shift and an xor) but overall this seems performance neutral to a
win.

commit | commitdiff | tree

Andreas Arnez [Mon, 21 Jan 2019 13:10:00 +0000 (14:10 +0100)]

Bug 403552 s390x: Fix vector facility bit number

The wrong bit number was used when checking for the vector facility.  This
can result in a fatal emulation error: "Encountered an instruction that
requires the vector facility.  That facility is not available on this
host."

In many cases the wrong facility bit was usually set as well, hence
nothing bad happened.  But when running Valgrind within a Qemu/KVM guest,
the wrong bit was not (always?) set and the emulation error occurred.

This fix simply corrects the vector facility bit number, changing it from
128 to 129.

commit | commitdiff | tree

Philippe Waroquiers [Sat, 12 Jan 2019 14:08:59 +0000 (15:08 +0100)]

Fix false positive 'Conditional jump or move' on amd64 64 bits ptracing 32 bits.

PTRACE_GET_THREAD_AREA is not handled by amd64 linux syswrap, which leads
to false positive errors in 64 bits program ptrace-ing 32 bits processes.

For example, the below error was wrongly reported on GDB:
==25377== Conditional jump or move depends on uninitialised value(s)
==25377==    at 0x8A1D7EC: td_thr_get_info (td_thr_get_info.c:35)
==25377==    by 0x526819: thread_from_lwp(thread_info*, ptid_t) (linux-thread-db.c:417)
==25377==    by 0x5281D4: thread_db_notice_clone(ptid_t, ptid_t) (linux-thread-db.c:442)
==25377==    by 0x51773B: linux_handle_extended_wait(lwp_info*, int) (linux-nat.c:2027)
....
==25377==  Uninitialised value was created by a stack allocation
==25377==    at 0x69A360: x86_linux_get_thread_area(int, void*, unsigned int*) (x86-linux-nat.c:278)

Fix this by implementing PTRACE_GET|SET_THREAD_AREA on amd64.

commit | commitdiff | tree

Mark Wielaard [Fri, 11 Jan 2019 20:52:26 +0000 (21:52 +0100)]

readdwarf3.c (parse_type_DIE): Accept DW_TAG_subrange_type with DW_AT_count

GCC9 generates a subrange_type with a lower_bound and count, but no
upper_bound attribute. This simply means the upper bound is lower
plus count.

commit | commitdiff | tree

Mark Wielaard [Fri, 11 Jan 2019 19:00:17 +0000 (20:00 +0100)]

Bug 402480 Do not use %esp in clobber list.

This is the same fix as for amd64-linux, but now for x86-linux.

commit | commitdiff | tree

Mark Wielaard [Mon, 31 Dec 2018 21:26:31 +0000 (22:26 +0100)]

Bug 402519 - POWER 3.0 addex instruction incorrectly implemented

addex uses OV as carry in and carry out. For all other instructions
OV is the signed overflow flag. And instructions like adde use CA
as carry.

Replace set_XER_OV_OV32 with set_XER_OV_OV32_ADDEX, which will
call calculate_XER_CA_64 and calculate_XER_CA_32, but with OV
as input, and sets OV and OV32.

Enable test_addex in none/tests/ppc64/test_isa_3_0.c and update
the expected output. test_addex would fail to match the expected
output before this patch.

commit | commitdiff | tree

Philippe Waroquiers [Sat, 29 Dec 2018 09:23:35 +0000 (10:23 +0100)]

Add memcheck/tests/vbit-test/vbit-test-sec in .gitignore

commit | commitdiff | tree

Philippe Waroquiers [Sat, 29 Dec 2018 09:20:33 +0000 (10:20 +0100)]

Some more .exp changes following --show-error-list new option

A few .exp files (not tested on amd64) have to be changed to
have the messages in the new order:
Use --track-origins=yes to see where uninitialised values come from
For lists of detected and suppressed errors, rerun with: -s

commit | commitdiff | tree

Philippe Waroquiers [Fri, 28 Dec 2018 23:25:34 +0000 (00:25 +0100)]

Fix the name of the option in the FIXED BUGS section

commit | commitdiff | tree

Philippe Waroquiers [Sun, 23 Dec 2018 22:48:41 +0000 (23:48 +0100)]

Document new options --show-error-list=no|yes and -s in NEWS

commit | commitdiff | tree

Philippe Waroquiers [Sun, 23 Dec 2018 22:44:54 +0000 (23:44 +0100)]

Document the new options --show-error-list and -s

commit | commitdiff | tree

Philippe Waroquiers [Sun, 23 Dec 2018 22:35:17 +0000 (23:35 +0100)]

Modify .exp files following the new error message.

Change:
For counts of detected and suppressed errors, rerun with: -v
to
For lists of detected and suppressed errors, rerun with: -s

commit | commitdiff | tree

Philippe Waroquiers [Sun, 23 Dec 2018 21:56:38 +0000 (22:56 +0100)]

Implement option --show-error-list=no|yes -s

This option allows to list the detected errors and show the used
suppressions without increasing the verbosity.
Increasing the verbosity also activates a lot of messages that
are often not very useful for the user.
So, this option allows to see the list of errors and used suppressions
independently of the verbosity.

Note if a high verbosity is selected, the behaviour is unchanged.
In other words, when specifying -v, the list of detected errors
and the used suppressions are still shown, even if
--show-error-list=yes and -s are not used.

commit | commitdiff | tree

Philippe Waroquiers [Sun, 23 Dec 2018 19:35:09 +0000 (20:35 +0100)]

Factorize producing the 'For counts of detected and suppressed errors' msg

Each tool producing errors had identical code to produce this msg.
Factorize the production of the message in m_main.c

This prepares the work to have a specific option to show the list
of detected errors and the count of suppressed errors.

This has a (small) visible effect on the output of memcheck:
Instead of producing
  For counts of detected and suppressed errors, rerun with: -v
  Use --track-origins=yes to see where uninitialised values come from
memcheck now produces:
  Use --track-origins=yes to see where uninitialised values come from
  For counts of detected and suppressed errors, rerun with: -v

i.e. the track origin and counts of errors msg are inverted.

commit | commitdiff | tree

Mark Wielaard [Sun, 23 Dec 2018 22:42:27 +0000 (23:42 +0100)]

Add vbit-test-sec.vgtest and vbit-test-sec.stderr.exp to EXTRA_DIST.

commit | commitdiff | tree

Mark Wielaard [Sun, 23 Dec 2018 22:11:42 +0000 (23:11 +0100)]

Mention 402481 as fixed in NEWS.

commit | commitdiff | tree

Khem Raj [Sat, 22 Dec 2018 23:28:40 +0000 (15:28 -0800)]

tests/amd64: Do not clobber %rsp register

This is seen with gcc-9.0 compiler now which is fix that gcc community
did recently
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52813

Signed-off-by: Khem Raj <raj.khem@gmail.com>

commit | commitdiff | tree

Mark Wielaard [Sun, 23 Dec 2018 12:29:27 +0000 (13:29 +0100)]

Also test memcheck/tests/vbit-test on any secondary arch.

If we are building a secondary arch then also build and run the
memcheck vbit-test for that architecture.

commit | commitdiff | tree

Julian Seward [Sun, 23 Dec 2018 21:02:03 +0000 (22:02 +0100)]

Bug 402481 - vbit-test fails on x86 for Iop_CmpEQ64 iselInt64Expr Sar64(Sub64(t14,Shr64(t14,0x1:I8)),0x3F:I8).

Fixes the failure by implementing Iop_Sar64 in the x86 back end.

commit | commitdiff | tree

Philippe Waroquiers [Sun, 23 Dec 2018 13:49:25 +0000 (14:49 +0100)]

Fix 402395 coregrind/vgdb-invoker-solaris.c: 2 * poor error checking

2 size_t variables were used as return value of read syscalls,
while ssize_t must be used.

commit | commitdiff | tree

Julian Seward [Sat, 22 Dec 2018 18:01:50 +0000 (19:01 +0100)]

amd64 back end: generate improved SIMD64 code.

For most SIMD operations that happen on 64-bit values (as would arise from MMX
instructions, for example, such as Add16x4, CmpEQ32x2, etc), generate code
that performs the operation using SSE/SSE2 instructions on values in the low
halves of XMM registers. This is much more efficient than the previous scheme
of calling out to helper functions written in C. There are still a few SIMD64
operations done via helpers, though.

commit | commitdiff | tree

Julian Seward [Sat, 22 Dec 2018 17:04:42 +0000 (18:04 +0100)]

amd64 back end: generate better code for 2x64<-->V128 and 4x64<-->V256 transfers ..

.. by adding support for MOVQ xmm/ireg and using that to implement 64HLtoV128,
4x64toV256 and their inverses. This reduces the number of instructions,
removes the use of memory as an intermediary, and avoids store-forwarding
stalls.

commit | commitdiff | tree

Julian Seward [Sat, 22 Dec 2018 15:11:39 +0000 (16:11 +0100)]

amd64 pipeline: improve performance of cvtdq2ps and cvtps2dq (128 and 256 bit versions) ..

.. by giving them their own vector IROps rather than doing each lane individually.

commit | commitdiff | tree

Julian Seward [Sat, 22 Dec 2018 12:34:11 +0000 (13:34 +0100)]

amd64 back end: generate better code for 128/256 bit vector shifts by immediate. n-i-bz.

commit | commitdiff | tree

Julian Seward [Sat, 22 Dec 2018 06:23:00 +0000 (07:23 +0100)]

amd64 pipeline: generate much better code for pshufb mm/xmm/ymm.  n-i-bz.

pshufb mm/xmm/ymm rearranges byte lanes in vector registers.  It's fairly
widely used, but we generated terrible code for it.  With this patch, we just
generate, at the back end, pshufb plus a bit of masking, which is a great
improvement.

commit | commitdiff | tree

Julian Seward [Sat, 22 Dec 2018 05:06:19 +0000 (06:06 +0100)]

amd64 hosts: detect SSSE3 (not SSE3) capabilities on the host. As-yet unused. n-i-bz.

commit | commitdiff | tree

Julian Seward [Wed, 12 Dec 2018 12:55:01 +0000 (13:55 +0100)]

Fix memcheck/tests/undef_malloc_args failure.

Try harder to trigger a memcheck error if a value is (partially) undefined.

commit | commitdiff | tree

Julian Seward [Mon, 10 Dec 2018 16:18:20 +0000 (17:18 +0100)]

Adjust ppc set_AV_CR6 computation to help Memcheck instrumentation.

* changes set_AV_CR6 so that it does scalar comparisons against zero,
  rather than sometimes against an all-ones word.  This is something
  that Memcheck can instrument exactly.

* in Memcheck, requests expensive instrumentation of Iop_Cmp{EQ,NE}64
  by default on ppc64le.

https://bugs.kde.org/show_bug.cgi?id=386945#c62

commit | commitdiff | tree

Mark Wielaard [Sun, 9 Dec 2018 22:25:05 +0000 (23:25 +0100)]

Implement ppc64 lxvb16x as 128-bit vector load with reversed double words.

This makes it possible for memcheck to know which part of the 128bit
vector is defined, even if the load is partly beyond an addressable block.

Partially resolves bug 386945.

commit | commitdiff | tree

Mark Wielaard [Sun, 9 Dec 2018 13:26:39 +0000 (14:26 +0100)]

memcheck: Allow unaligned loads of 128bit vectors on ppc64[le].

On powerpc partial unaligned loads of vectors from partially invalid
addresses are OK and could be generated by our translation of lxvd2x.

Adjust partial_load memcheck tests to allow partial loads of 16 byte
vectors on powerpc64.

Part of resolving bug #386945.

commit | commitdiff | tree

Mark Wielaard [Sat, 8 Dec 2018 23:55:42 +0000 (00:55 +0100)]

Implement ppc64 lxvd2x as 128-bit load with double word swap for ppc64le.

This makes it possible for memcheck to know which part of the 128bit
vector is defined, even if the load is partly beyond an addressable block.

Partially resolves bug 386945.

commit | commitdiff | tree

Mark Wielaard [Sat, 8 Dec 2018 18:47:43 +0000 (13:47 -0500)]

memcheck: Allow unaligned loads of words on ppc64[le].

On powerpc partial unaligned loads of words from partially invalid
addresses are OK and could be generated by our translation of ldbrx.

Adjust partial_load memcheck tests to allow partial loads of words
on powerpc64.

Part of resolving bug #386945.

commit | commitdiff | tree

Mark Wielaard [Fri, 7 Dec 2018 15:42:22 +0000 (10:42 -0500)]

Implement ppc64 ldbrx as 64-bit load and Iop_Reverse8sIn64_x1.

This makes it possible for memcheck to analyse the new gcc strcmp
inlined code correctly even if the ldbrx load is partly beyond an
addressable block.

Partially resolves bug 386945.

commit | commitdiff | tree

Bart Van Assche [Thu, 20 Dec 2018 02:13:14 +0000 (18:13 -0800)]

drd/tests/tsan_thread_wrappers_pthread.h: Fix MyThread::ThreadBody()

See also https://bugs.kde.org/show_bug.cgi?id=402341.

commit | commitdiff | tree

Mark Wielaard [Wed, 19 Dec 2018 19:52:29 +0000 (20:52 +0100)]

PR402134 assert fail in mc_translate.c (noteTmpUsesIn) Iex_VECRET on arm64

This happens when processing openssl aes_v8_set_encrypt_key
(aesv8-armx.S:133). The noteTmpUsesIn () function is new since
PR387664 Memcheck: make expensive-definedness-checks be the default.
It didn't handle Iex_VECRET which is used in the arm64 crypto
instruction dirty handlers.

commit | commitdiff | tree

Mark Wielaard [Wed, 19 Dec 2018 19:14:03 +0000 (20:14 +0100)]

PR402327 Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x13 DW_OP_drop

readdwarf.c (dwarfexpr_to_dag) didn't handle DW_OP_drop.
Implement it by simply popping the last element on the stack.

commit | commitdiff | tree

Mark Wielaard [Wed, 12 Dec 2018 13:11:29 +0000 (14:11 +0100)]

Implement minimal ptrace support for ppc64[le]-linux.

commit | commitdiff | tree

Mark Wielaard [Fri, 14 Dec 2018 13:32:27 +0000 (14:32 +0100)]

arm64: Fix PTRACE_TRACEME memcheck/tests/linux/getregset.vgtest testcase.

The sys_ptrace post didn't mark the thread as being in traceme mode.
This occassionally would make the memcheck/tests/linux/getregset.vgtest
testcase fail. With this patch it reliably passes.

commit | commitdiff | tree

Petar Jovanovic [Thu, 13 Dec 2018 15:20:28 +0000 (16:20 +0100)]

mips64: fix build break introduced by be7a730

Follow up to
commit be7a73004583aab5d4c97cf55276ca58d5b3090b

that broke the build for mips64.

commit | commitdiff | tree

Petar Jovanovic [Wed, 12 Dec 2018 17:45:34 +0000 (17:45 +0000)]

make outputs of drd/tests/fork* deterministic

Wait for children to finish before terminating the main process.

This fixes occasional failures of the following tests:

drd/tests/fork-parallel (stderr)
drd/tests/fork-serial (stderr)

commit | commitdiff | tree

Mark Wielaard [Wed, 12 Dec 2018 13:15:28 +0000 (14:15 +0100)]

Mark helper regs defined in final_tidyup before freeres_wrapper call.

In final_tidyup we setup the guest to call the freeres_wrapper, which
will (possibly) call __gnu_cxx::__freeres() and/or __libc_freeres().

In a couple of cases (ppc64be, ppc64le and mips32) this involves setting
up one or more helper registers. Since we setup these guest registers
we should make sure to mark them as fully defined. Otherwise we might
see spurious warnings about undefined value usage if the guest register
happened to not be fully defined before.

This fixes PR402006.

commit | commitdiff | tree

Nicholas Nethercote [Wed, 12 Dec 2018 09:52:33 +0000 (20:52 +1100)]

Fix path handling in the new Cachegrind and Callgrind tests.

commit | commitdiff | tree

Nicholas Nethercote [Fri, 23 Nov 2018 04:16:02 +0000 (15:16 +1100)]

Add a --show-percs option to cg_annotate and callgrind_annotate.

Because it's very useful. As part of this, the "percentage of events
annotated" numbers at the bottom of the output is changed to "events
annotated" so that --show-percs doesn't compute a percentage of a
percentage.

Example output lines:
```
4,967,137,442 (100.0%)  PROGRAM TOTALS

4,543 (25.23%)    17,566 ( 0.43%)    47,993 ( 0.92%) /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c

1 ( 0.01%) 2,000,001 (49.29%) 3,000,004 (57.36%)     for (int i = 0; i < 1000000; i++) {
```

The commit also adds some much-needed tests for cg_annotate and
callgrind_annotate.

commit | commitdiff | tree

Mark Wielaard [Fri, 7 Dec 2018 13:01:20 +0000 (14:01 +0100)]

Fix sigkill.stderr.exp for glibc-2.28.

glibc 2.28 filters out some bad signal numbers and returns
Invalid argument instead of passing such bad signal numbers
the kernel sigaction syscall. So we won't see such bad signal
numbers and won't print "bad signal number" ourselves.

Add a new memcheck/tests/sigkill.stderr.exp-glibc-2.28 to catch
this case.

commit | commitdiff | tree

Mark Wielaard [Thu, 6 Dec 2018 19:52:22 +0000 (20:52 +0100)]

Bug 401822 Fix asm constraints for ppc64 jm-vmx jm-insns.c test.

The mfvscr and vor instructions in jm-insns.c had a "=vr" constraint.
This should have been an "=v" constraint. This resolved assembler
warnings and the testcase failing on ppc64le with gcc 8.2 and
binutils 2.30.

commit | commitdiff | tree

Mark Wielaard [Sat, 1 Dec 2018 22:54:40 +0000 (23:54 +0100)]

Bug 401627 - Add wcsncmp override and testcase.

glibc 2.28 added an avx2 optimized variant of wstrncmp which memcheck
cannot proof correct. Add a simple override in vg_replace_strmem.c.

commit | commitdiff | tree

Andreas Arnez [Wed, 5 Dec 2018 16:07:05 +0000 (17:07 +0100)]

Add Emacs configuration files

This adds a configuration file ".dir-locals.el" for Emacs to the topmost
directory of the Valgrind source tree, and another such file to the
directory drd/tests.  These files contain per-directory local Emacs
variables.

The following settings are performed:

* The base C style is set to "Linux", indentation is set to 3 columns
  per level, the use of tabs for indentation is disabled, and the fill
  column is set to 80.

* The source files in drd/tests use 2 instead of 3 columns per indentation
  level.

commit | commitdiff | tree

Mark Wielaard [Sun, 2 Dec 2018 11:39:27 +0000 (12:39 +0100)]

Fix tsan_unittest.cpp compile error with older compilers.

Older compilers (g++ 4.8.5) don't like '>>':
error: ‘>>’ should be ‘> >’ within a nested template argument list.
Add an extra space.

commit | commitdiff | tree

Bart Van Assche [Sun, 2 Dec 2018 05:51:06 +0000 (21:51 -0800)]

drd/tests: Fix remaining gcc 8 compiler warnings

commit | commitdiff | tree

Bart Van Assche [Sat, 1 Dec 2018 16:42:55 +0000 (08:42 -0800)]

Add drd/tests/fork-serial and drd/tests/fork-parallel

commit | commitdiff | tree

Bart Van Assche [Sat, 1 Dec 2018 04:49:27 +0000 (20:49 -0800)]

drd: Add fork test program

commit | commitdiff | tree

Bart Van Assche [Sat, 1 Dec 2018 18:43:06 +0000 (10:43 -0800)]

drd: Fix fork() handling

The thread ID passed to DRD_(drd_thread_atfork_child)() is a Valgrind
thread ID instead of a DRD thread ID. This patch fixes bug 401578.

commit | commitdiff | tree

Bart Van Assche [Sat, 1 Dec 2018 05:15:39 +0000 (21:15 -0800)]

drd/test: Fix most gcc 8 compiler warnings

commit | commitdiff | tree

Vadim Barkov [Fri, 5 Oct 2018 10:46:44 +0000 (13:46 +0300)]

Bug 385411 s390x: Tests and internals for z13 vector FP support

Add test cases for the z13 vector FP support. Bring s390-opcodes.csv
up-to-date, reflecting that the z13 vector instructions are now supported.
Also remove the non-support disclaimer for the vector facility from
README.s390.

The patch was contributed by Vadim Barkov, with some clean-up and minor
adjustments by Andreas Arnez.

commit | commitdiff | tree

Vadim Barkov [Fri, 5 Oct 2018 10:51:49 +0000 (13:51 +0300)]

Bug 385411 s390x: Add z13 vector floating point support

This adds support for the z/Architecture vector FP instructions that were
introduced with z13.

The patch was contributed by Vadim Barkov, with some clean-up and minor
adjustments by Andreas Arnez.

commit | commitdiff | tree

Julian Seward [Wed, 28 Nov 2018 13:15:06 +0000 (14:15 +0100)]

Bug 401112 - LLVM 5.0 generates comparison against partially initialized data.

This generalises the existing spec rules for W of 32 bits:

W <u 0---(N-1)---0 1 0---0 or

(that is, B/NB after SUBL, where dep2 has the above form), to also cover

W <=u 0---(N-1)---0 0 1---1

(that is, BE/NBE after SUBL, where dept2 has the specified form).

Patch from Nicolas B. Pierron (nicolas.b.pierron@nbp.name).

commit | commitdiff | tree

Philippe Waroquiers [Sun, 25 Nov 2018 18:51:53 +0000 (19:51 +0100)]

Always output all leak kinds in a xtree leak result file.

- The option --xtree-leak=yes (to output leak result in xtree format)
  automatically activates the option --show-leak-kinds=all,
  as xtree visualisation tools such as kcachegrind can in any case
  select what kind of leak to visualise.

commit | commitdiff | tree

Andreas Arnez [Thu, 26 Jul 2018 14:35:24 +0000 (16:35 +0200)]

s390x: More fixes for z13 support

This patch addresses the following:

* Fix the implementation of LOCGHI.  Previously Valgrind performed 32-bit
  sign extension instead of 64-bit sign extension on the immediate value.

* Advertise VXRS in HWCAP.  If no VXRS are advertised, but the program
  uses vector registers, this could cause problems with a glibc built with
  "-march=z13".

commit | commitdiff | tree

Julian Seward [Tue, 20 Nov 2018 11:09:03 +0000 (12:09 +0100)]

Add support for Iop_{Sar,Shr}8 on ppc. --expensive-definedness-checks=yes needs them.

commit | commitdiff | tree

Julian Seward [Tue, 20 Nov 2018 10:46:55 +0000 (11:46 +0100)]

VEX/priv/ir_opt.c

fold_Expr: transform PopCount64(And64(Add64(x,-1),Not64(x))) into CtzNat64(x).

This is part of the fix for bug 386945.

commit | commitdiff | tree

Julian Seward [Tue, 20 Nov 2018 10:36:53 +0000 (11:36 +0100)]

ppc front end: use new IROps added in 42719898.

This pertains to bug 386945.

VEX/priv/guest_ppc_toIR.c:

gen_POPCOUNT: use Iop_PopCount{32,64} where possible.

gen_vpopcntd_mode32: use Iop_PopCount32.

for cntlz{w,d}, use Iop_CtzNat{32,64}.

gen_byterev32: use Iop_Reverse8sIn32_x1 instead of lengthy sequence.

verbose_Clz32: remove (was unused anyway).

commit | commitdiff | tree

Julian Seward [Tue, 20 Nov 2018 10:28:42 +0000 (11:28 +0100)]

Add Memcheck support for IROps added in 42719898.

memcheck/mc_translate.c:

Add mkRight{32,64} as right-travelling analogues to mkLeft{32,64}.

doCmpORD: for the cases of a signed comparison against zero, compute
definedness of the 3 result bits (lt,gt,eq) separately, and, for the lt and eq
bits, do it exactly accurately.

expensiveCountTrailingZeroes: no functional change. Re-analyse/verify and add
comments.

expensiveCountLeadingZeroes: add. Very similar to
expensiveCountTrailingZeroes.

Add some comments to mark unary ops which are self-shadowing.

Route Iop_Ctz{,Nat}{32,64} through expensiveCountTrailingZeroes.
Route Iop_Clz{,Nat}{32,64} through expensiveCountLeadingZeroes.

Add instrumentation for Iop_PopCount{32,64} and Iop_Reverse8sIn32_x1.

memcheck/tests/vbit-test/irops.c

Add dummy new entries for all new IROps, just enough to make it compile and
run.

commit | commitdiff | tree

Julian Seward [Tue, 20 Nov 2018 10:07:37 +0000 (11:07 +0100)]

Add ppc host-side isel and instruction support for IROps added in previous commit.

VEX/priv/host_ppc_defs.c, VEX/priv/host_ppc_defs.h:

Dont emit cnttz{w,d}. We may need them on a target which doesn't support
them. Instead we can generate a fairly reasonable alternative sequence with
cntlz{w,d} instead.

Add support for emitting popcnt{w,d}.

VEX/priv/host_ppc_isel.c

Add support for: Iop_ClzNat32 Iop_ClzNat64

Redo support for: Iop_Ctz{32,64} and their Nat equivalents, so as to not use
cnttz{w,d}, as mentioned above.

Add support for: Iop_PopCount64 Iop_PopCount32 Iop_Reverse8sIn32_x1

commit | commitdiff | tree

Julian Seward [Tue, 20 Nov 2018 09:52:33 +0000 (10:52 +0100)]

Add some new IROps to support improved Memcheck analysis of strlen etc.

This is part of the fix for bug 386945.  It adds the following IROps, plus
their supporting type- and printing- fragments:

Iop_Reverse8sIn32_x1: 32-bit byteswap.  A fancy name, but it is consistent
with naming for the other swapping IROps that already exist.

Iop_PopCount64, Iop_PopCount32: population count

Iop_ClzNat64, Iop_ClzNat32, Iop_CtzNat64, Iop_CtzNat32: counting leading and
trailing zeroes, with "natural" (Nat) semantics for a zero input, meaning, in
the case of zero input, return the number of bits in the word.  These
functionally overlap with the existing Iop_Clz64, Iop_Clz32, Iop_Ctz64,
Iop_Ctz32.  The existing operations are undefined in case of a zero input.
Adding these new variants avoids the complexity of having to change the
declared semantics of the existing operations.  Instead they are deprecated
but still available for use.

commit | commitdiff | tree

Julian Seward [Tue, 20 Nov 2018 09:18:29 +0000 (10:18 +0100)]

get_otrack_shadow_offset_wrk for ppc32 and ppc64: add missing cases for XER_OV32, XER_CA32 and C_FPCC.

The missing cases were discovered whilst testing fixes for bug 386945, but are
otherwise unrelated to that bug.

commit | commitdiff | tree

Nicholas Nethercote [Fri, 16 Nov 2018 05:48:13 +0000 (16:48 +1100)]

Fix Cachegrind's --help message.

commit | commitdiff | tree

Andreas Arnez [Thu, 18 Oct 2018 15:51:57 +0000 (17:51 +0200)]

Bug 397187 s390x: Add vector register support for vgdb

On s390x machines with a vector facility, Valgrind's gdbserver didn't
represent the vector registers. This is fixed.

commit | commitdiff | tree

Andreas Arnez [Tue, 30 Oct 2018 16:06:38 +0000 (17:06 +0100)]

Bug 400491 s390x: Sign-extend immediate operand of LOCHI and friends

The VEX implementation of each of the z/Architecture instructions LOCHI,
LOCHHI, and LOCGHI treats the immediate 16-bit operand as an unsigned
integer instead of a signed integer. This is fixed.

commit | commitdiff | tree

Andreas Arnez [Thu, 25 Oct 2018 11:47:12 +0000 (13:47 +0200)]

Bug 400490 s390x: Fix register allocation for VRs vs FPRs

On s390x, if vector registers are available, they are fed to the register
allocator as if they were separate from the floating-point registers. But
in fact the FPRs are embedded in the VRs. So for instance, if both f3 and
v3 are allocated and used at the same time, corruption will result.

This is fixed by offering only the non-overlapping VRs, v16 to v31, to the
register allocator instead.

commit | commitdiff | tree

Philippe Waroquiers [Tue, 6 Nov 2018 20:40:43 +0000 (21:40 +0100)]

Make white space style more consistent, no functional impact.

But this time, rather consistently use *no* space between function name and
function arg list.

commit | commitdiff | tree

Philippe Waroquiers [Sun, 4 Nov 2018 10:48:00 +0000 (11:48 +0100)]

Make white space style more consistent, no functional impact.

Consistently use a space between function name and function arg list.

commit | commitdiff | tree

Philippe Waroquiers [Sun, 28 Oct 2018 17:35:11 +0000 (18:35 +0100)]

Fix dependencies between libcoregrind*.a and *m_main.o/*m_libcsetjmp.o

The primary and secondary coregrind libraries must be updated
when m_main.c or m_libcsetjmp.c are changed.

A dependency was missing between libcoregrind*.a and libnolto_coregrind*.a,
and so tools were not relinked when m_main.c or m_libcsetjmp.c were
changed.

commit | commitdiff | tree

Philippe Waroquiers [Sat, 27 Oct 2018 18:28:59 +0000 (20:28 +0200)]

Fix 399301 - Use inlined frames in Massif XTree output.

Author: Nicholas Nethercote <nnethercote@mozilla.com>

Use inlined frames in Massif XTree output.

    This makes Massif's output much easier to follow.

    The commit also removes a -1 used on all Massif stack frame addresses.
    There was a big comment questioning the presence of that -1, and with it
    gone the addresses now match those produced by DHAT.

commit | commitdiff | tree

Philippe Waroquiers [Sat, 20 Oct 2018 14:54:19 +0000 (16:54 +0200)]

Update configure.ac to next version 3.15.GIT ...

As pointed out by Rhys, we need a .GIT postfix waiting
for the release ...

commit | commitdiff | tree

Philippe Waroquiers [Sat, 20 Oct 2018 09:44:00 +0000 (11:44 +0200)]

Update configure.ac to next version 3.15 ...

commit | commitdiff | tree

Philippe Waroquiers [Sat, 20 Oct 2018 09:37:26 +0000 (11:37 +0200)]

Prepare for the next 3.15 release

* Create the 3.15 section in the NEWS file
  (the idea is that this section is maintained during the development,
   i.e. document user visible changes and/or the fixed bugs, as part of
   the commit).

* start the fixed bug list with 399322  Improve callgrind_annotate output

* update vg-entities.xml for 3.15 next release.

commit | commitdiff | tree

Nicholas Nethercote [Fri, 19 Oct 2018 05:27:07 +0000 (16:27 +1100)]

Implement VG_(apply_ExeContext)().

It's been declared for a long time, but was lacking a definition.

commit | commitdiff | tree

Nicholas Nethercote [Wed, 3 Oct 2018 00:00:10 +0000 (10:00 +1000)]

Improve callgrind_annotate output.

This commit makes two changes:

- it adds commas to call counts (e.g. `65658x` becomes `65,658x`);

- it sorts callers/callees in the tree by the --sort order.

An example, old output:
```
72,142,945  < /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c:_dl_lookup_symbol_x (65658x)
68,977,760  *  /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c:do_lookup_x
       340  >   /build/glibc-OTsEL5/glibc-2.27/malloc/malloc.c:free (3x)
     4,564  >   /build/glibc-OTsEL5/glibc-2.27/nptl/pthread_mutex_unlock.c:pthread_mutex_unlock (163x)
1,282,381  >   /build/glibc-OTsEL5/glibc-2.27/string/../sysdeps/x86_64/strcmp.S:strcmp (12893x)
    13,310  >   /build/glibc-OTsEL5/glibc-2.27/malloc/malloc.c:calloc (4x)
       223  >   /build/glibc-OTsEL5/glibc-2.27/elf/dl-misc.c:_dl_higher_prime_number (3x)
1,741,689  >   /build/glibc-OTsEL5/glibc-2.27/elf/dl-misc.c:_dl_name_match_p (16842x)
     5,705  >   /build/glibc-OTsEL5/glibc-2.27/nptl/../nptl/pthread_mutex_lock.c:pthread_mutex_lock (163x)

    51,454  < /build/glibc-OTsEL5/glibc-2.27/elf/../elf/dl-runtime.c:_dl_fixup (33x)
     2,456  < /build/glibc-OTsEL5/glibc-2.27/elf/dl-sym.c:_dl_sym (2x)
100,313,502  < /build/glibc-OTsEL5/glibc-2.27/elf/../sysdeps/x86_64/dl-machine.h:_dl_relocate_object (39094x)
28,224,467  *  /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c:_dl_lookup_symbol_x
72,142,945  >   /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c:do_lookup_x (65658x)

[...]

  567        setLastUser(LastPMUses, P->getResolver()->getPMDataManager().getAsPass());
   18  => /home/njn/moz/rust0/src/llvm/include/llvm/IR/LegacyPassManagers.h:non-virtual thunk to llvm::FPPassManager::getAsPass() (9x)
1,971  => /home/njn/moz/rust0/src/llvm/lib/IR/LegacyPassManager.cpp:llvm::PMTopLevelManager::setLastUser(llvm::ArrayRef<llvm::Pass*>, llvm::Pass*)'2 (63x)
  108  => ???:non-virtual thunk to (anonymous namespace)::MPPassManager::getAsPass() (54x)
```

New output:
```
72,142,945  < /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c:_dl_lookup_symbol_x (65,658x)
68,977,760  *  /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c:do_lookup_x
1,741,689  >   /build/glibc-OTsEL5/glibc-2.27/elf/dl-misc.c:_dl_name_match_p (16,842x)
1,282,381  >   /build/glibc-OTsEL5/glibc-2.27/string/../sysdeps/x86_64/strcmp.S:strcmp (12,893x)
    13,310  >   /build/glibc-OTsEL5/glibc-2.27/malloc/malloc.c:calloc (4x)
     5,705  >   /build/glibc-OTsEL5/glibc-2.27/nptl/../nptl/pthread_mutex_lock.c:pthread_mutex_lock (163x)
     4,564  >   /build/glibc-OTsEL5/glibc-2.27/nptl/pthread_mutex_unlock.c:pthread_mutex_unlock (163x)
       340  >   /build/glibc-OTsEL5/glibc-2.27/malloc/malloc.c:free (3x)
       223  >   /build/glibc-OTsEL5/glibc-2.27/elf/dl-misc.c:_dl_higher_prime_number (3x)

100,313,502  < /build/glibc-OTsEL5/glibc-2.27/elf/../sysdeps/x86_64/dl-machine.h:_dl_relocate_object (39,094x)
    51,454  < /build/glibc-OTsEL5/glibc-2.27/elf/../elf/dl-runtime.c:_dl_fixup (33x)
     2,456  < /build/glibc-OTsEL5/glibc-2.27/elf/dl-sym.c:_dl_sym (2x)
28,224,467  *  /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c:_dl_lookup_symbol_x
72,142,945  >   /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c:do_lookup_x (65,658x)

[...]

  567        setLastUser(LastPMUses, P->getResolver()->getPMDataManager().getAsPass());
1,971  => /home/njn/moz/rust0/src/llvm/lib/IR/LegacyPassManager.cpp:llvm::PMTopLevelManager::setLastUser(llvm::ArrayRef<llvm::Pass*>, llvm::Pass*)'2 (63x)
  108  => ???:non-virtual thunk to (anonymous namespace)::MPPassManager::getAsPass() (54x)
   18  => /home/njn/moz/rust0/src/llvm/include/llvm/IR/LegacyPassManagers.h:non-virtual thunk to llvm::FPPassManager::getAsPass() (9x)
```

commit | commitdiff | tree

Andreas Arnez [Tue, 9 Oct 2018 09:22:27 +0000 (11:22 +0200)]

Bug 399444 s390x: Drop unnecessary check in s390_irgen_VSLDB

In s390_irgen_VSLDB there was special handling for the case that the
immediate operand i4 has the value 16, which would mean that the result v1
were a full copy of the third operand v3. However, this is impossible
because i4 can only assume values from 0 to 15; thus the special handling
can be removed.

commit | commitdiff | tree

Julian Seward [Tue, 9 Oct 2018 11:52:05 +0000 (13:52 +0200)]

--> 3.14.0 final

commit | commitdiff | tree

Rhys Kidd [Mon, 8 Oct 2018 01:26:35 +0000 (21:26 -0400)]

Fix macOS 10.13 building from tarball

Fixes: 1ce04c3 ("Preliminary support for Darwin 17.x (macOS 10.13)")

commit | commitdiff | tree

Julian Seward [Wed, 3 Oct 2018 13:37:06 +0000 (15:37 +0200)]

--> 3.14.0.RC2

commit | commitdiff | tree

Julian Seward [Wed, 3 Oct 2018 13:29:42 +0000 (15:29 +0200)]

sigframe construction for x86-linux: ensure that ESP is correctly aligned before entering the handler. n-i-bz.

Without this, a signal handler compiled by Clang 6, which uses movdqa to load/store
relative to ESP, segfaults because the resulting address isn't 16-aligned.

commit | commitdiff | tree

Julian Seward [Wed, 3 Oct 2018 13:26:48 +0000 (15:26 +0200)]

x86 front end: handle UD2 as an officially recognised, unimplemented instruction, like the amd64 front end does. n-i-bz.

This doesn't change anything downstream -- a SIGILL is still raised -- but
there's a bit less debug printing now.

commit | commitdiff | tree

Julian Seward [Sun, 30 Sep 2018 08:21:27 +0000 (10:21 +0200)]

--> 3.14.0.RC1

commit | commitdiff | tree

Julian Seward [Sun, 30 Sep 2018 07:29:43 +0000 (09:29 +0200)]

Memcheck on amd64; fix false positive associated with spec cases {Z,NZ} after {LOGICB,LOGICW}.  n-i-bz.

For the spec cases {Z,NZ} after {LOGICB,LOGICW}, which are simply comparisons
of the result against zero, use Cmp{EQ,NE}32 rather than their 64-bit
counterparts.  This is because Memcheck on amd64 instruments the 32 bit
versions exactly, at the default --expensive-definedness-checks=auto setting.
The alternative would have been to make Memcheck also do exact instrumentation
of the 64 bit versions, but that would also burden all other 64 bit eq/ne
comparisons with that cost for no purpose.  So this is a cheaper solution.

commit | commitdiff | tree

Mark Wielaard [Thu, 27 Sep 2018 03:09:42 +0000 (05:09 +0200)]

Fix s390x_dirtyhelper_vec_op signature for non-s390x case.

The definition of s390x_dirtyhelper_vec_op in guest_s390_helpers.c
didn't match the one from guest_s390_defs.h for the non-s390x case.
Causing a compiler warning/error.

commit | commitdiff | tree

Andreas Arnez [Mon, 24 Sep 2018 17:08:01 +0000 (19:08 +0200)]

s390x: Vector integer and string insn support -- tests

This adds test cases and some internal stuff to the z/Architecture vector
integer and string instruction support.

Contributed by Vadim Barkov <vbrkov@gmail.com>.

commit | commitdiff | tree

Andreas Arnez [Mon, 24 Sep 2018 16:56:07 +0000 (18:56 +0200)]

s390x: Vector integer and string instruction support

This adds z/Architecture vector integer and string instruction support.

The main author of this patch is Vadim Barkov <vbrkov@gmail.com>. Some
fixes were provided by Andreas Arnez <arnez@linux.ibm.com>.

commit | commitdiff | tree

Philippe Waroquiers [Wed, 26 Sep 2018 16:04:43 +0000 (18:04 +0200)]

Fix 398028 Assertion `cfsi_fits` failing in simple C program

At least with libopenblas, we can have several rx mappings
with some holes between mappings.
Change the invariant (2) checking so that such holes are ok,
as long as no cfsi refers to such an hole.

commit | commitdiff | tree

Andreas Arnez [Mon, 24 Sep 2018 16:03:47 +0000 (18:03 +0200)]

Update bug status for s390x conditional trap insn support

This updates the bug status for git commit 20976f432, "s390x: Implement
conditional trap instructions".

commit | commitdiff | tree

Andreas Arnez [Wed, 25 Jul 2018 12:23:02 +0000 (14:23 +0200)]

s390x: Implement conditional trap instructions

This implements various z/Architecture instructions that conditionally
yield a data exception ("trap").  The condition is either based on a
comparison being true ("compare and trap") or on a loaded value being
zero ("load and trap").  These instructions haven't been widely used in
the past, but may now be emitted by newer compilers.  Note that the
resulting signal for a data exception is SIGFPE, not SIGTRAP.  Thus this
patch also adds a new jump kind Ijk_SigFPE.

Mirror of https://sourceware.org/git/valgrind.git

RSS Atom