Mark Wielaard [Tue, 18 Aug 2020 21:58:55 +0000 (23:58 +0200)]
Fix epoll_ctl setting of array event and data fields.
Fix for https://bugs.kde.org/show_bug.cgi?id=422623 in commit ecf5ba119
epoll_ctl warns for uninitialized padding on non-amd64 64bit arches
contained a bug. A pointer to an array is not a pointer to a pointer to
an array. Found by a Fedora user:
https://bugzilla.redhat.com/show_bug.cgi?id=1844778#c10
Mark Wielaard [Sun, 26 Jul 2020 19:17:23 +0000 (21:17 +0200)]
Handle REX prefixed JMP instruction.
The .NET Core runtime might generate a JMP with a REX prefix.
For Jv (32bit offset) and Jb (8bit offset) this is valid.
Prefixes that change operand size are ignored for such JMPs.
So remove the check for sz == 4 and force sz = 4 for Jv.
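As a rough illustration of the decoding rule described above (not the VEX decoder itself), the sketch below skips one optional REX prefix and then decodes JMP Jv (opcode 0xE9) or JMP Jb (opcode 0xEB). The function name decode_rex_jmp and its signature are hypothetical; the displacement width depends only on the opcode, never on the prefix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Valgrind code.  Decodes a near JMP that may be
   preceded by one REX prefix (0x40..0x4F).  Returns the instruction length
   and stores the branch target in *target, or returns 0 if the bytes are
   not such a JMP. */
static size_t decode_rex_jmp(const uint8_t *code, size_t avail,
                             uint64_t addr, uint64_t *target)
{
    size_t i = 0;

    /* The REX prefix is skipped; it does not change the displacement size. */
    if (i < avail && (code[i] & 0xF0) == 0x40)
        i++;
    if (i >= avail)
        return 0;

    if (code[i] == 0xE9) {              /* JMP Jv: always a 32-bit offset */
        if (avail - i < 5)
            return 0;
        uint32_t raw = (uint32_t)code[i + 1]
                     | ((uint32_t)code[i + 2] << 8)
                     | ((uint32_t)code[i + 3] << 16)
                     | ((uint32_t)code[i + 4] << 24);
        size_t len = i + 5;
        *target = addr + len + (uint64_t)(int64_t)(int32_t)raw;
        return len;
    }
    if (code[i] == 0xEB) {              /* JMP Jb: 8-bit offset */
        if (avail - i < 2)
            return 0;
        size_t len = i + 2;
        *target = addr + len + (uint64_t)(int64_t)(int8_t)code[i + 1];
        return len;
    }
    return 0;
}
```

The target is relative to the end of the whole instruction, so the prefix byte counts towards the length.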
Fix warning in syswrap sched_getattr print format.
m_syswrap/syswrap-linux.c:3716:10: warning: format '%ld' expects argument of type 'long int', but argument 4 has type 'RegWord' {aka 'long unsigned int'} [-Wformat=]
Mark Wielaard [Sun, 26 Jul 2020 20:40:22 +0000 (22:40 +0200)]
epoll_ctl warns for uninitialized padding on non-amd64 64bit arches
struct vki_epoll_event is packed on x86_64, but not on other 64bit
arches. This means that on 64bit arches there can be padding in the
epoll_event struct. Separately, the data field is only used by user
space (which might not set the data field if it doesn't need to).
Only check the events field on epoll_ctl. But assume that events
and data are both written to by epoll_[p]wait (excluding padding).
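To make the padding issue concrete, here is a small sketch with two stand-in structs (hypothetical names, not the vki_epoll_event definition). It assumes a GCC or Clang compiler for __attribute__((packed)). The packed layout has no gap after events, while the natural layout may have one on arches where a 64-bit integer is 8-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the packed x86_64 layout: data follows events directly. */
struct epoll_event_packed {
    uint32_t events;
    uint64_t data;
} __attribute__((packed));

/* Stand-in for the natural layout used on other arches. */
struct epoll_event_natural {
    uint32_t events;
    uint64_t data;
};

/* Number of padding bytes between the events and data fields. */
static size_t padding_packed(void)
{
    return offsetof(struct epoll_event_packed, data) - sizeof(uint32_t);
}

static size_t padding_natural(void)
{
    return offsetof(struct epoll_event_natural, data) - sizeof(uint32_t);
}
```

Any bytes counted by padding_natural() are never written by user code, which is why checking the whole struct for definedness can produce a false warning.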
Mark Wielaard [Tue, 30 Jun 2020 15:51:49 +0000 (17:51 +0200)]
Add new auxchecks target that runs GNU Scientific Library tests.
Replace the gsl16test script under auxprogs that you run by hand
with a new make target auxchecks which fetches the source code,
patches, reconfigures and builds all tests. Then run all tests
under valgrind.
Carl Love [Tue, 9 Jun 2020 15:42:03 +0000 (10:42 -0500)]
Power PC Fix extraction of the L field for sync instruction
The L field is currently a two-bit field (bits [22:21]) in ISA 3.0. The size of the
L field has changed over time.
Currently the ISA 3.0 Valgrind sync instruction support code sets
flag_L for the instruction L field to a five-bit value that includes bits
that are marked reserved in the sync instruction. This patch fixes the issue for
ISA 3.0 by setting flag_L from only the specified two bits.
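A minimal sketch of the difference, assuming the bit positions [22:21] are counted from the least significant bit of the 32-bit instruction word. The helper names are hypothetical and are not the macros used in guest_ppc_toIR.c.

```c
#include <stdint.h>

/* Extract `width` bits starting at bit `lsb` (LSB-0 numbering assumed). */
static uint32_t field(uint32_t insn, unsigned lsb, unsigned width)
{
    return (insn >> lsb) & ((1u << width) - 1u);
}

/* Two-bit L field: only bits [22:21]. */
static uint32_t sync_L_two_bit(uint32_t insn)
{
    return field(insn, 21, 2);
}

/* Five-bit extraction: also picks up the three bits above the L field. */
static uint32_t sync_L_five_bit(uint32_t insn)
{
    return field(insn, 21, 5);
}
```

With a reserved bit set, for example bit 24, the five-bit value no longer equals the architected L value, while the two-bit value is unaffected.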
Mark Wielaard [Tue, 9 Jun 2020 09:23:46 +0000 (11:23 +0200)]
docs: Make sure all elements that need it have an id tag.
When generating HTML it is useful if every element that can be referenced
has a stable id. If it doesn't a random one is generated which makes it
harder to link to parts of the manual on the website. It also generates
spurious diffs. Explicitly add an id tag for the sect2 and sect3 elements
in dh-manual, a unique id for each legalnotice element and for each
FAQ question and answer.
Mark Wielaard [Mon, 8 Jun 2020 13:14:04 +0000 (15:14 +0200)]
doc/Makefile.am: Turn valid-manual and valid-manpages into real targets
Make valid-manual and valid-manpages real, separate make targets.
This means they can be run in parallel and they will only be run
once when doing make check, unless one of the manual and manpages
files has been touched.
Mark Wielaard [Mon, 8 Jun 2020 11:27:28 +0000 (13:27 +0200)]
guest_ppc_toIR: Call vpanic not just vex_printf when the impossible happens
is_Zero_Vector, is_Denorm_Vector, is_NaN_Vector and negate_Vector
only handle an Ity_I32 element size. And that is also what they are
currently being called with. In case they would ever be called with
a different element_size they would simply vex_printf and continue
(producing bogus/impossible results). To make this a bit more future
proof (and to silence a static analyzer) vpanic instead.
Mark Wielaard [Mon, 8 Jun 2020 11:24:47 +0000 (13:24 +0200)]
helgrind: If hg_cli__realloc fails, return NULL.
helgrind did not handle a failing realloc correctly; it assumed
cli_malloc would always succeed. If cli_malloc fails in hg_cli__realloc,
do as dh and massif do and fail the realloc call by returning NULL.
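The pattern can be sketched as follows. This is not the helgrind code; sketch_realloc and failing_alloc are hypothetical names, and the allocator is passed in as a parameter so that a failure can be simulated.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*alloc_fn)(size_t);

/* Hypothetical realloc built on an allocator that may fail.  On failure
   it returns NULL and leaves the old block untouched. */
static void *sketch_realloc(void *old, size_t old_size, size_t new_size,
                            alloc_fn alloc)
{
    void *fresh = alloc(new_size);
    if (fresh == NULL)
        return NULL;            /* fail the realloc instead of continuing */

    if (old != NULL) {
        size_t n = old_size < new_size ? old_size : new_size;
        memcpy(fresh, old, n);
        free(old);
    }
    return fresh;
}

/* Allocator that always fails, used only to exercise the error path. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```

Returning NULL before touching the old block matches the usual realloc contract: the caller still owns the original allocation when the call fails.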
Assad Hashmi [Fri, 15 May 2020 14:44:14 +0000 (16:44 +0200)]
Enable v8.1 atomics and fix SWP and LDUMAX instructions.
The atomics test drd/tests/std_mutex hangs on Arm v8.1 when built
with GCC10. Add HWCAP_ATOMICS to ARM64_SUPPORTED_HWCAP and fix the
ldumax and swp instructions to make it work.
Mark Wielaard [Tue, 12 May 2020 14:58:36 +0000 (16:58 +0200)]
gcc10 arm64 build needs __getauxval for linking with libgcc
Provide a new library libgcc-sup-<platform>.a that contains symbols
needed by libgcc. This needs to be linked after -lgcc to provide
any symbols missing which would normally be provided by glibc.
At the moment this only provides __getauxval on arm64 linux.
Mark Wielaard [Thu, 14 May 2020 16:47:57 +0000 (18:47 +0200)]
Resolve id conflicts in See Also sections in valgrind and vgdb manpages.
Because the manpages are processed together they cannot contain the
same ids. Both the valgrind and vgdb manpages reference the vgdb main
manual URL. That conflicts even though the valgrind.1 and vgdb.1 manual
pages are separate. Prefixing the vgdb ids (with "vgdb-") works around
the conflict. It still works fine, since in vgdb the references are only
directly used in the "See Also" refsect. The labels and urls still come
out as intended.
With this fix make valid validates both the manual index.xml and
manpages-index.xml without errors.
Mark Wielaard [Thu, 14 May 2020 14:07:04 +0000 (16:07 +0200)]
Turn manpages-index.xml into a "real" book, so it can be validated.
manpages-index.xml is just to easily get at each individual man page
with xsltproc. It wasn't a complete docbookx xml file. Now that it is
we can validate it with xmllint. It doesn't fully validate, but we
are close.
Mark Wielaard [Thu, 14 May 2020 13:11:56 +0000 (15:11 +0200)]
Use DTD DocBook XML V4.5 everywhere.
This makes the rule for xmllint easier since it doesn't need to
override the DTD to validate against. It also helps with other tools
trying to process the docbookx xml files.
Mark Wielaard [Thu, 14 May 2020 10:54:23 +0000 (12:54 +0200)]
Update README_DEVELOPERS and references to --vex-guest-chase-thresh.
Add a hint about using lldb in README_DEVELOPERS and change any old references
to --vex-guest-chase-thresh=0 into --vex-guest-chase=no (mirroring the change
in commit 56e04256a "Rationalise --vex-guest* flags in the new IRSB
construction framework").
If you do `malloc(100)` followed by `realloc(200)`, DHAT now adds 100
bytes to the read and write counts for the implicit `memcpy`. This gives
more reasonable results.
I have long been surprised by low writes-per-byte values of around 0.35
for vectors that are grown by doubling. Counting the implicit `memcpy`
increases those numbers to well above 0.5, which is what you'd expect.
The commit also adds a section to the DHAT docs about `realloc`, because
there is some non-obvious behaviour, some of which confused me just a
couple of days ago.
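The accounting described above can be sketched like this. The struct and function names are hypothetical and this is not DHAT's implementation; in particular, charging min(old size, new size) bytes when the block shrinks is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical per-block access counters. */
struct access_counts {
    unsigned long long bytes_read;
    unsigned long long bytes_written;
};

/* Charge the implicit memcpy of a realloc to both counters. */
static void account_realloc(struct access_counts *c,
                            size_t old_size, size_t new_size)
{
    size_t copied = old_size < new_size ? old_size : new_size;
    c->bytes_read += copied;     /* the copy reads the old block */
    c->bytes_written += copied;  /* and writes the new block */
}
```

For `malloc(100)` followed by `realloc(200)`, both counters grow by 100.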
Michal Privoznik [Fri, 15 Nov 2019 09:37:53 +0000 (10:37 +0100)]
Add support for setns syscall
I've tested this on amd64 and arm but I'm enabling it on all
arches since the syscall should work identically on all of them.
This was requested by users for a long time (almost 5 years) and
in fact, some programs (like libvirt) use namespaces and fork off
to enter other namespaces. Lack of implementation means valgrind
can't be used with these programs (or their configuration must be
changed to not use namespaces, which defeats the purpose).
Without knowing it, I've converged on the same patch as mentioned in
the bugs below.
Martin Storsjö [Mon, 27 Apr 2020 18:31:51 +0000 (21:31 +0300)]
mingw: Fix arch detection ifdefs for non-x86 mingw platforms
Don't assume that __MINGW32__ implies x86; Windows runs on ARM/ARM64
as well, and there are mingw toolchains that target those architectures.
This mirrors how the MSVC parts of the same expressions are written,
as (defined(_WIN32) && defined(_M_IX86)) and
(defined(_WIN64) && defined(_M_X64)) - not relying on _WIN32/_WIN64
or __MINGW32__/__MINGW64__ alone to indicate architecture.
Change the __MINGW64__ and _WIN64 ifdefs into plain __MINGW32__
and _WIN32 as well, for clarity - these defines mostly imply
platform.
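A sketch of the idea, pairing the platform macro with a separate architecture macro. It assumes the GCC/Clang predefined macros __i386__ and __x86_64__; the SKETCH_TARGET macro and the sketch_target function are hypothetical and do not appear in the Valgrind headers.

```c
/* __MINGW32__ only says "mingw toolchain"; the architecture is tested
   separately, so ARM and ARM64 mingw builds are not mistaken for x86. */
#if defined(__MINGW32__) && defined(__i386__)
#  define SKETCH_TARGET "mingw/x86"
#elif defined(__MINGW32__) && defined(__x86_64__)
#  define SKETCH_TARGET "mingw/amd64"
#elif defined(__MINGW32__)
#  define SKETCH_TARGET "mingw/other-arch"
#else
#  define SKETCH_TARGET "not-mingw"
#endif

static const char *sketch_target(void)
{
    return SKETCH_TARGET;
}
```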
Do not fix '417075 - pwritev(vector[...]) suppression ignored' but produce a warning.
- The release 3.15 introduced a backward incompatible change for
some suppression entries related to preadv and pwritev syscalls.
When reading a suppression entry using the unsupported 3.14 format,
valgrind will now produce a warning to say the suppression entry will not
work, and suggest the needed change.
For example, in 3.14, the extra line was:
pwritev(vector[...])
while in 3.15, it became e.g.
pwritev(vector[2])
3 possible fixes were discussed:
* revert the 3.15 change to go back to 3.14 format.
This is ugly because valgrind 3.16 would be incompatible
with the supp entries for 3.15.
* make the suppression matching logic consider that ... is a wildcard
equivalent to a *.
This is ugly because the suppression matching logic/functionality
is already very complex, and ... would mean 2 different things
in a suppression entry: wildcard in the extra line, and whatever
nr of stackframes in the backtrace portion of the supp entry.
* keep the 3.15 format, and accept the incompatibility with 3.14 and before.
This is ugly as valgrind 3.16 and above are still incompatible with 3.14
and before.
The third option was deemed the least ugly, in particular because it was possible
to detect the incompatible unsupported supp entry and produce a warning.
So, now, valgrind reports a warning when such an entry is detected, giving
e.g. a behaviour such as:
==21717== WARNING: pwritev(vector[...]) is an obsolete suppression line not supported in valgrind 3.15 or later.
==21717== You should replace [...] by a specific index such as [0] or [1] or [2] or similar
==21717==
....
==21717== Syscall param pwritev(vector[1]) points to unaddressable byte(s)
==21717== at 0x495B65A: pwritev (pwritev64.c:30)
==21717== by 0x1096C5: main (sys-preadv_pwritev.c:69)
==21717== Address 0xffffffffffffffff is not stack'd, malloc'd or (recently) free'd
So, we can hope that users having incompatible entries will easily understand
the problem of the supp entry not matching anymore.
In future releases of valgrind, we must take care to:
* never change the extra string produced for an error, unless *really* necessary
* minimise as much as possible 'variable' information generated dynamically
in error extra string. Such extra information can be reported in the rest
of the error message (like the address above for example).
The user can use e.g. GDB + vgdb to examine in detail the offending
data or parameter values or uninitialised bytes or ...
A comment is added in pub_tool_errormgr.h to remind tool developers of the above.
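The detection step can be sketched as a simple string check on the extra line of a suppression entry. This is only an illustration; the function name is hypothetical, and it assumes the obsolete form always contains the literal text "(vector[...])" as in the pwritev example above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical check: true if the extra line uses the obsolete 3.14
   form with a literal "[...]" instead of a specific index. */
static bool is_obsolete_vector_supp(const char *extra_line)
{
    return strstr(extra_line, "(vector[...])") != NULL;
}
```

An entry such as pwritev(vector[2]) is not flagged, so only the old format triggers the warning.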
Petar Jovanovic [Thu, 23 Apr 2020 16:42:26 +0000 (16:42 +0000)]
Amend the recent update to VG_(getrlimit) and VG_(setrlimit)
[get|set]rlimit system calls are becoming deprecated.
Coregrind should use prlimit64 as the first candidate in order to
achieve "rlimit" functionality.
There are also systems that do not even support older "rlimits".
Modify the previously added support for VG_(getrlimit) and VG_(setrlimit)
using __NR_prlimit64 by making it similar to the glibc implementation.
It fixes none/tests/stackgrowth and none/tests/sigstackgrowth
tests on nanoMIPS.
Make memcheck/tests/linux/sigqueue usable with musl
Remove offsetof(siginfo_t, _sifields) from the test.
"_sifields" is not a mandatory field of struct siginfo_t so
it should not be used in regular user programs.
We do not need the u1 member of the bits union as long as we use u32 for the same
purpose. Overlapping uint8_t with uint32_t causes a problem on BE platforms,
since the LSB of u32 does not overlap with u1.
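The overlap problem can be shown with a small stand-in union (hypothetical, not the one from the patched file). The one-byte member shares storage with the lowest-addressed byte of the 32-bit member, which is the least significant byte only on little-endian hosts.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a union that overlaps a uint8_t with a uint32_t. */
union bits {
    uint8_t  u1;
    uint32_t u32;
};

/* Returns 1 if u1 aliases the least significant byte of u32. */
static int u1_overlaps_lsb(void)
{
    union bits b;
    b.u32 = 0x00000001u;
    return b.u1 == 1;
}

/* Runtime endianness probe: 1 on little-endian hosts. */
static int host_is_little_endian(void)
{
    uint32_t x = 1;
    unsigned char first;
    memcpy(&first, &x, 1);
    return first == 1;
}
```

On a big-endian host u1_overlaps_lsb() returns 0, so code that writes u32 and reads u1 sees the most significant byte instead.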
mips: treat delay slot as part of the previous instruction
Do so by recursively calling disInstr_MIPS_WRK() if the instruction
currently being disassembled is a branch/jump, effectively combining them
into one IR instruction.
A notable change is that the branch/jump + delay slot combination now forms
an eight-byte instruction.
It only ever worked on x86 and amd64, and even on those it had a high false
positive rate and was slow. Everything it does, ASan can do faster, better,
and on more architectures. So there's no reason to keep this tool any more.
drd/drd_pthread_intercepts: Add a workaround for what is probably a compiler bug
Without this patch drd produces incorrect output for some test cases. It
seems like without this patch an incorrect value is passed as the sixth
argument of VALGRIND_DO_CLIENT_REQUEST_STMT(VG_USERREQ__POST_SEM_OPEN, ...):
$ ./vg-in-place --tool=drd --trace-semaphore=yes drd/tests/sem_open -m -p
drd, a thread error detector
Copyright (C) 2006-2017, and GNU GPL'd, by Bart Van Assche.
Using Valgrind-3.16.0.GIT and LibVEX; rerun with -h for copyright info
Command: drd/tests/sem_open -m -p
[1] sem_open 0x4029000 name /drd-sem-open-test-27725 oflag 0xc0 mode 0600 value 0
s_d1 = 1 (should be 1)
[2] sem_wait 0x4029000 value 0 -> 4294967295
Thread 2:
Invalid semaphore: semaphore 0x4029000
at 0x484ADC7: sem_wait_intercept (drd_pthread_intercepts.c:1436)
by 0x484ADC7: sem_wait@* (drd_pthread_intercepts.c:1441)
by 0x4014A9: thread_func (sem_open.c:114)
by 0x483FEA6: vgDrd_thread_wrapper (drd_pthread_intercepts.c:449)
by 0x4886EF9: start_thread (in /lib64/libpthread-2.31.so)
by 0x499F3BE: clone (in /lib64/libc-2.31.so)
semaphore 0x4029000 was first observed at:
at 0x484A395: sem_open_intercept (drd_pthread_intercepts.c:1403)
by 0x484A395: sem_open (drd_pthread_intercepts.c:1409)
by 0x4012CE: main (sem_open.c:63)
[2] sem_post 0x4029000 value 4294967295 -> 0
[1] sem_wait 0x4029000 value 0 -> 4294967295
Thread 1:
Invalid semaphore: semaphore 0x4029000
at 0x484ADC7: sem_wait_intercept (drd_pthread_intercepts.c:1436)
by 0x484ADC7: sem_wait@* (drd_pthread_intercepts.c:1441)
by 0x40139D: main (sem_open.c:90)
semaphore 0x4029000 was first observed at:
at 0x484A395: sem_open_intercept (drd_pthread_intercepts.c:1403)
by 0x484A395: sem_open (drd_pthread_intercepts.c:1409)
by 0x4012CE: main (sem_open.c:63)
Conflicting load by thread 1 at 0x00404108 size 8
at 0x40139E: main (sem_open.c:91)
Allocation context: BSS section of /home/bart/software/valgrind.git/drd/tests/sem_open
Other segment start (thread 2)
(thread finished, call stack no longer available)
Other segment end (thread 2)
(thread finished, call stack no longer available)
Conflicting store by thread 1 at 0x00404108 size 8
at 0x4013B2: main (sem_open.c:91)
Allocation context: BSS section of /home/bart/software/valgrind.git/drd/tests/sem_open
Other segment start (thread 2)
(thread finished, call stack no longer available)
Other segment end (thread 2)
(thread finished, call stack no longer available)
[1] sem_post 0x4029000 value 4294967295 -> 0
s_d2 = 2 (should be 2)
s_d3 = 5 (should be 5)
[1] sem_close 0x4029000 value 0
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 18 from 8)
The test code in drd/tests/trylock.c attempts to write-lock a POSIX rwlock
twice. The code expects the second attempt to return an error, but POSIX
doesn't require that behaviour, and FreeBSD's implementation deadlocks
instead.
See also https://bugs.kde.org/show_bug.cgi?id=403212
Andreas Arnez [Fri, 3 Apr 2020 17:16:01 +0000 (19:16 +0200)]
s390x: Drop spurious register moves in CDAS instruction selector
The s390x instruction selector for Ist_CAS, in its handling of "compare
double and swap", adds spurious register moves after the CDAS operation
itself. These moves overwrite registers returned by calls to
s390_isel_int_expr(), potentially causing corruption of temp values.
Andreas Arnez [Thu, 2 Apr 2020 18:40:02 +0000 (20:40 +0200)]
s390x: Fix Iex_Load instruction selectors for F128/D128 types
The s390x instruction selectors for Iex_Load of Ity_F128 and Ity_D128
types had a common typo that would lead to crashes when used. So far this
bug didn't surface because Iex_Load is not emitted on s390x with these
types.
Andreas Arnez [Thu, 2 Apr 2020 16:00:13 +0000 (18:00 +0200)]
s390x: Introduce and exploit new ALU operator S390_ALU_ILIH
The handlers of Iop_8HLto16, Iop_16HLto32, and Iop_32HLto64 in
s390_isel_int_wrk() yield a sequence of "shift", "and", and "or" ALU
operations, the second of which modifies a register returned from a call
to s390_isel_int_expr(). While this approach does not lead to wrong code
generation (because only the register's upper bits are changed which are
not relevant to the IR type), it violates the general "no-modify" rule.
Replace this sequence of ALU operations by a single ALU operation
S390_ALU_ILIH that inserts the low half of its second operand into the
high half of its first operand. Use the z/Architecture instruction
RISBG ("rotate then insert selected bits") for implementing it.
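In plain C, the semantics of such an "insert low into high" operation for the 64-bit case might look like the sketch below. The function names are hypothetical; this models only the arithmetic, not the instruction selector or the RISBG encoding.

```c
#include <stdint.h>

/* Single-operation model: put the low 32 bits of op2 into the high half
   of op1 and keep the low half of op1. */
static uint64_t ilih64(uint64_t op1, uint64_t op2)
{
    return (op1 & 0xFFFFFFFFull) | (op2 << 32);
}

/* The older three-step form: shift the high part, mask the low part,
   then or them together.  The masking step is the one that would modify
   a register the caller does not own. */
static uint64_t shift_and_or64(uint64_t hi, uint64_t lo)
{
    uint64_t shifted = hi << 32;
    uint64_t masked  = lo & 0xFFFFFFFFull;
    return shifted | masked;
}
```

Both forms produce the same value; the single operation just avoids writing to the input register.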
Andreas Arnez [Wed, 18 Mar 2020 17:59:15 +0000 (18:59 +0100)]
s390x: Drop register arg to s390_isel_int1_expr()
Restructure the interface of s390_isel_int1_expr() such that no
destination register is passed to it any more. Adjust all its callers
accordingly. Ensure that callers never modify the returned register, but
make a copy and modify that instead.
Andreas Arnez [Tue, 11 Feb 2020 17:02:38 +0000 (18:02 +0100)]
s390x: Activate "grail"
Now that the known problems with activating "grail" on s390x have been
fixed, there is no need to disable it for s390x guests any more. Remove
the appropriate check in "guest_generic_bb_to_IR.c".
Andreas Arnez [Thu, 19 Mar 2020 16:35:55 +0000 (17:35 +0100)]
Bug 418997 - s390x: Support Iex_ITE for float and vector expressions
The s390x backend supports Iex_ITE expressions for integer types I8, I16,
I32, and I64 only. But "grail" can now generate such expressions for
guarding any kind of Ist_Put statements; see add_guarded_stmt_to_end_of()
in "guest_generic_bb_to_IR.c". On s390x this means that F64 and V128 can
occur as well, in which case a crash would result. And such crashes are
actually seen when running the test suite with "grail" enabled.
Extend Iex_ITE support to the floating-point types F32 and F64 and to the
vector type V128. Do this by extending S390_INSN_COND_MOVE as needed.
Andreas Arnez [Tue, 3 Mar 2020 15:42:04 +0000 (16:42 +0100)]
s390x: Add directReload function for the register allocator
This adds the function directReload_S390() and wires it up to the register
allocator, enabling "direct reloading" for various types of instructions.
Direct reloading, when applied, avoids loading an operand into a register
and thus may reduce spilling. On s390x this slightly reduces the
generated code.
In order to determine which instructions are relevant for direct
reloading, it was tested which direct reloads the register allocator tries
to perform in simple programs.
Andreas Arnez [Wed, 18 Mar 2020 11:24:25 +0000 (12:24 +0100)]
Bug 417281 - s390x: Fix register usage of conditional moves
The s390x register usage callback marks the target register of a
conditional move as HRmWrite only. It fails to mention the fact that the
target register is also an input to the insn (unless the condition is
"never" or "always").
This was discovered while investigating "grail" failures on s390x and
fixes the majority of them.
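The reason the target counts as an input can be seen from a one-line model of a conditional move (a hypothetical function, not the s390x backend code): when the condition is false, the result is the old value of the destination.

```c
#include <stdint.h>

/* Model of a conditional move: the new value of dst depends on the old
   value of dst whenever the condition does not hold. */
static uint64_t cond_move(int cond, uint64_t dst, uint64_t src)
{
    return cond ? src : dst;
}
```

If the register allocator believes dst is write-only, it may not preserve the old value, and the "condition false" path then yields garbage.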
Andreas Arnez [Fri, 13 Mar 2020 16:20:20 +0000 (17:20 +0100)]
s390x: Actually use "load on condition" for conditional moves
Although the implementation of the cond_move insn is prepared to emit
"load on condition" instructions, it doesn't, because of a reversed check.
The check is supposed to prevent emitting LOCx instructions when the
condition code mask is set to "always", but it's accidentally negated.
Fix the reversal of the check, so LOCx instructions are actually emitted
when applicable.
Andreas Arnez [Mon, 9 Mar 2020 16:26:26 +0000 (17:26 +0100)]
s390x: Mark register usage with HRmModify when applicable
Instead of marking register usage for the same register with HRmRead and
HRmWrite separately, use HRmModify instead. This makes the code a bit
easier to read.
Andreas Arnez [Mon, 9 Mar 2020 14:14:16 +0000 (15:14 +0100)]
s390x: Enable 1- and 2-byte operands for v-test
The v-test operation tests its operand against zero and sets the condition
code accordingly. So far the operation was only supported for 4- and
8-byte operands.
Lift this restriction and enable 1- and 2-byte operands for v-test, using
the z/Architecture "test under mask" instructions TM, TMY, and TMLL.
Exploit this in the instruction selector, getting rid of the conversion to
a 4-byte operand. This slightly reduces the generated code on s390x.