Mark Wielaard [Tue, 12 Oct 2021 16:51:23 +0000 (18:51 +0200)]
drd/tests: Extract start_thread which can come from libpthread or libc
The drd/tests/tc21_pthonce and drd/tests/annotate_barrier tests
would fail if start_thread came from libc (as it does in glibc 2.34)
instead of from libpthread. Extract start_thread in filter_stderr.in
and update the backtraces in annotate_barrier.stderr.exp and in
tc21_pthonce.stderr.exp
Tested against glibc 2.34, 2.33 and 2.17 on x86_64.
Paul Floyd [Sun, 10 Oct 2021 19:56:49 +0000 (21:56 +0200)]
Fix the remaining easily fixable warnings with clang
There's one remaining
memalign2.c:29:9: warning: unused variable 'piece' [-Wunused-variable]
because of a block of #if FreeBSD for memalign that looks unnecessary
Otherwise all that is left is a few like
warning: unknown warning option '-Wno-alloc-size-larger-than'; did you mean '-Wno-frame-larger-than='? [-Wunknown-warning-option]
because there is no standard for compiler arguments.
Mark Wielaard [Sun, 10 Oct 2021 15:13:43 +0000 (17:13 +0200)]
Remove more warnings from tests
GCC12 catches various issues in tests at compile time that we want to
catch at runtime. Also glibc 2.34 deprecated various mallinfo related
functions. Add the relevant -Wno-foobar flags to those tests. In one
case, unit_oset.c, the warning was correct and the uninitialized
variable was explicitly set.
Mark Wielaard [Sun, 10 Oct 2021 14:35:37 +0000 (16:35 +0200)]
Fix printf warning in libmpiwrap.c
libmpiwrap.c:1379:45: warning: format '%d' expects argument of type 'int',
but argument 5 has type 'MPI_Request' {aka 'struct ompi_request_t *'}
Unfortunately MPI_Request is an opaque type (we don't really know what
is in struct ompi_request_t) so we cannot simply print it as int. In
other places we print an MPI_Request as 0x%lx by casting it to an
unsigned long. Do the same here.
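As an illustration of the cast described above, here is a minimal sketch. The names request_t and format_request are hypothetical stand-ins, not taken from libmpiwrap.c, and the cast assumes that a pointer fits in an unsigned long (true on LP64 Linux targets).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for an opaque handle such as MPI_Request:
   the struct is declared but never defined here. */
struct opaque_request;
typedef struct opaque_request *request_t;

/* Format the handle as 0x%lx by casting it to unsigned long, instead
   of passing the pointer to a %d conversion.
   Returns what snprintf returns (the length of the formatted text). */
static int format_request(char *buf, size_t len, request_t req)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%lx", (unsigned long)req);
}
```

Calling format_request(buf, sizeof buf, req) with a handle whose value is 0x1234 writes the text 0x1234 into buf.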
Mark Wielaard [Sun, 10 Oct 2021 13:56:50 +0000 (15:56 +0200)]
Remove some warnings from tests
Various tests do things which we want to detect at runtime, like
ignoring the result of malloc or doing a deliberate impossibly large
allocation or operations that would result in overflowing or
truncated strings, that generate a warning from gcc.
In one case, mq_setattr called with new and old attrs overlapping,
this was explicitly fixed; in the others -Wno-foobar was added to
silence the warning. This is safe even for older gcc, since a compiler
ignores any -Wno-foobar option it doesn't know about: if it doesn't
know the warning, it won't warn for foobar.
Mark Wielaard [Thu, 7 Oct 2021 11:43:19 +0000 (13:43 +0200)]
Fix make distcheck by removing references to uncommitted files
Some files for the freebsd port have not yet been committed, but were
already referenced in the Makefiles. Remove those references for
now to make distcheck happy.
Paul Floyd [Thu, 7 Oct 2021 05:53:33 +0000 (07:53 +0200)]
FreeBSD support, patch 2
Files in the root directory
Several Makefile.am files that have dependencies on FreeBSD autoconf
variables. Included a few new filter files to act as placeholders
to create new freebsd subdirectories.
Updated NEWS with the FreeBSD bugzilla items plus a couple of other
items fixed indirectly.
Andreas Arnez [Fri, 1 Oct 2021 18:10:54 +0000 (20:10 +0200)]
s390x: Add missing "cc" clobbers in test case inline asms
Some inline assemblies in various s390x test cases do not specify the
condition code "cc" in the clobber list. Although this has not actually
been seen to cause wrong code generation, it certainly might, so fix this.
Andreas Arnez [Fri, 24 Sep 2021 18:06:39 +0000 (20:06 +0200)]
s390x: Fix compile warnings in test cases
Some GCC versions emit the following warnings for some s390x-specific test
cases:
warning: listing the stack pointer register '15' in a clobber list is
deprecated
warning: this 'else' clause does not
guard... [-Wmisleading-indentation] ...this statement, but...
Fix these.
Most of the inline assemblies declaring r15 as clobbered do not actually
change its value. Only in stmg_wrap() does it become necessary to save
and restore r15.
Mark Wielaard [Sat, 2 Oct 2021 10:03:46 +0000 (12:03 +0200)]
Adjust filter_gdb for arm64 with eglibc 2.19 and gdb 7.7.1
Older ubuntu arm64 setups used eglibc 2.19 and gdb 7.7.1. In that
case select.c could be under linux/generic and the select argument
list could be split up differently over several lines. Adjust
filter_gdb to catch those differences.
Also checked against a Debian arm64 with glibc 2.31 and gdb 10.1.
Mark Wielaard [Fri, 1 Oct 2021 20:25:40 +0000 (22:25 +0200)]
Add none/tests/scripts/shell.stderr.exp-dash4 for dash 0.5.11
dash 0.5.11 produces slightly different error messages.
The new exp file is similar to shell.stderr.exp-dash3 but
with the extra (second) "shell: " output removed.
Carl Love [Tue, 28 Sep 2021 15:49:10 +0000 (15:49 +0000)]
Fix tests for mfspr
Split out the mfspr tests into a separate test using command line option
"-M". The value in the LR and CTR registers changed. It appears the
changes are due to changes in the test program jm-insns.c. Splitting
these instructions out will help to minimize the size of future updates
when the test program changes.
Carl Love [Thu, 9 Sep 2021 23:10:07 +0000 (23:10 +0000)]
fix sraw, srawi, srad, sradi instructions
For ISA 3.0 and beyond, the instructions also write the XER register.
Split the instructions out to a new command line option so we can create
an ISA 2.07 expect file, ISA 3.0 LE and ISA 3.0 BE expect file. The new
command line option is "-s" to run just these four instructions.
Carl Love [Thu, 9 Sep 2021 19:06:00 +0000 (19:06 +0000)]
Add support for the mcrxrx instruction.
The mcrxrx instruction was introduced in ISA 3.0. It was missed when the
ISA 3.0 support was added to Valgrind.
The mcrxr instruction is not supported on ISA 3.0 and beyond. Both
instructions do a move to the condition register; however, the mcrxrx
instruction moves [OV|OV32|CA|CA32], whereas the mcrxr instruction moves
XER[32:35] (SO, OV, and CA bits) to the CR.
Carl Love [Wed, 8 Sep 2021 22:01:05 +0000 (22:01 +0000)]
Fix dfp tests.
Due to changes between the compiler and linker, we need to add .machine
arguments to the configure file to properly detect the availability of
the dfp instructions.
Add a print statement for the case where HAS_DFP is not enabled, to make
that case easier to spot.
powerpc: Add .machine directives for scv, copy, paste, cpabort instructions
GCC is no longer passing the "-many" flag to the assembler. So, the
inline assembly statements need to use the .machine directives
for the specific platform.
Andreas Arnez [Thu, 30 Sep 2021 12:10:29 +0000 (14:10 +0200)]
configure.ac: Avoid the use of "which"
The "which" command is not always installed, but configure.ac uses it in
the function AC_HWCAP_CONTAINS_FLAG to force invocation of the executable
"true" rather than the shell builtin with the same name. (The point here
is to get LD_SHOW_AUXV=1 evaluated by the dynamic loader.)
Another option might be to hard-wire the location /bin/true, because the
filesystem hierarchy standard requires it to be there. However, the FHS
doesn't apply to BSDs and at least some FreeBSD versions do not stick to
that specific rule.
On the other hand, the "env" command seems to be available on all relevant
platforms, so use that instead.
- prevent null dereferencing on dlang_type
- prevent buffer overflow when decoding user input
- Add support for demangling local D template declarations
- Add support for demangling D function literals as template
value parameters
- Add support for D `typeof(*null)' types
- Fix -Wundef warnings in ansidecl.h
- Fix endian bug in rust demangler
- Adjust mangling of __alignof__
- Avoid -Wstringop-truncation
Update libiberty demangler to support Rust v0 name mangling
Update the libiberty demangler using the auxprogs/update-demangler
script to the gcc git 01d92cfd79872e4cffc78bf233bb9b767336beb8.
Updates rust demangling to support the new v0 mangling scheme.
This includes the following changes:
- Update the update-demangler script to use gcc git instead of svn.
- The result of running the updated script to get an updated
demangler and resolving the merge conflicts.
- A change to long_namespace_xml.stderr.exp because two overly long
symbols aren't demangled anymore, but just returned as is.
- an update to the m_demangle/demangle.c source to deal with Rust
demangling in cp_demangle, which now directly demangles old and
new style rust symbols.
Mark Wielaard [Sun, 19 Sep 2021 12:30:19 +0000 (14:30 +0200)]
readdwarf3: Introduce abbv_state to read .debug_abbrev more lazily
With the inline parser often a lot of DIEs are skipped, so reading
all abbrevs up front wastes time and memory. A lot of time and memory
can be saved by reading the abbrevs on demand. Do this by introducing
an abbv_state that is used to keep track of the abbrevs already read.
This does technically make the CUConst struct not const.
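The on-demand idea can be sketched as follows. This is a simplified model, not the readdwarf3 code: Abbrev, AbbrevState and lookup_abbrev are hypothetical names, and an array of already-decoded entries stands in for the raw .debug_abbrev bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified abbrev entry: just a code and a tag. */
typedef struct { unsigned code; unsigned tag; } Abbrev;

/* Lazy reader state: remembers how far into the table we have parsed. */
typedef struct {
    const Abbrev *raw;    /* stand-in for the .debug_abbrev contents */
    size_t        n_raw;  /* number of entries available */
    size_t        n_read; /* number of entries parsed so far */
} AbbrevState;

/* Look up an abbrev by code, parsing further only when needed. */
static const Abbrev *lookup_abbrev(AbbrevState *st, unsigned code)
{
    assert(st != NULL);
    /* First check the entries that have already been parsed. */
    for (size_t i = 0; i < st->n_read; i++)
        if (st->raw[i].code == code)
            return &st->raw[i];
    /* Not seen yet: parse forward until the code turns up. */
    while (st->n_read < st->n_raw) {
        const Abbrev *a = &st->raw[st->n_read++];
        if (a->code == code)
            return a;
    }
    return NULL; /* no such abbrev in this table */
}
```

A lookup mutates the state, which is why a structure that holds such a state can no longer be treated as strictly const.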
Mark Wielaard [Sat, 18 Sep 2021 20:16:33 +0000 (22:16 +0200)]
readdwarf3: Reuse abbrev if possible between units
Instead of destroying the ht_abbrvs after processing a CU save it
and the offset so it can be reused for the next CU if that happens
to have the same abbrev offset. dwz compressed DWARF often reuses
the same abbrev for multiple CUs.
Mark Wielaard [Fri, 17 Sep 2021 22:24:38 +0000 (00:24 +0200)]
readdwarf3: Reuse fndn_ix_Table as much as possible
Both the var parser and the inl parser kept a fndn_ix_Table.
Initialize only one per debuginfo read pass and reuse if the stmt offset
is the same as last time (CUs can share the same line table and alt
files do share one for all units).
Mark Wielaard [Thu, 16 Sep 2021 20:01:47 +0000 (22:01 +0200)]
readdwarf3: Skip units without addresses when looking for inlined functions
When a unit doesn't cover any addresses skip it because no actual code
will be inside. Also use skip_DIE instead of read_DIE when not parsing
(skipping) children.
Andreas Arnez [Fri, 17 Sep 2021 16:48:12 +0000 (18:48 +0200)]
s390x: Fix 64-bit shift in s390_irgen_VSTRS
The function s390_irgen_VSTRS in guest_s390_toIR.c contains a shift
operation that is intended to yield a 64-bit number but uses 1UL instead
of 1ULL. This doesn't work on systems where 'unsigned long' is only 32
bits wide. Fix by replacing 1UL by 1ULL.
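The difference matters because the type of the constant decides the width of the shift. A minimal sketch (bit64 is a hypothetical helper, not the code in guest_s390_toIR.c):

```c
#include <assert.h>
#include <stdint.h>

/* Return a 64-bit value with only bit n set.
   1ULL is at least 64 bits wide on every platform, whereas 1UL is only
   32 bits wide where 'unsigned long' is 32 bits, and shifting it by 32
   or more is then undefined behaviour. */
static uint64_t bit64(unsigned n)
{
    assert(n < 64);
    return 1ULL << n;
}
```

For example, bit64(40) yields 0x10000000000 regardless of the width of unsigned long.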
Carl Love [Fri, 10 Sep 2021 22:15:57 +0000 (17:15 -0500)]
Remove deprecated regression tests for mftgpr and mffgpr.
The mftgpr and mffgpr instructions are deprecated. Added comments in
VEX/priv/guest_ppc_toIR.c for the instructions stating the instructions
are deprecated. Valgrind support can be removed if the opcodes get reused
in the future. For now, leaving the functional support in Valgrind
for the instructions.
Removed the regression test power6_mf_gpr.c, expect files and vgtest file
from none/tests/ppc64.
Carl Love [Fri, 3 Sep 2021 17:14:50 +0000 (17:14 +0000)]
Fix impossible constraint issue in P10 testcase.
This reworks the modulo operation as seen in
valgrind/none/tests/ppc64/test_isa_3_1_common.c:
initialize_source_registers().
Due to a GCC issue (PR101882), we will try to avoid a modulo operation with
both input and outputs set to a hard register. In this case, we can apply
the modulo operation to the args[] array value used to initialize the ra
value.
Andreas Arnez [Tue, 18 May 2021 17:59:32 +0000 (19:59 +0200)]
s390x: Wrap up misc-insn-3 and vec-enh-2 support
Wrap up support for the miscellaneous-instruction-extensions facility 3
and the vector-enhancements facility 2: Add 'case' statements for the
remaining unhandled arch13 instructions to 'guest_s390_toIR.c', document
the new support in 's390-opcodes.csv', adjust 's390-check-opcodes.pl', and
announce the new feature in 'NEWS'.
Andreas Arnez [Mon, 17 May 2021 13:34:15 +0000 (15:34 +0200)]
s390x: Vec-enh-2, test cases
Add test cases for verifying the new/enhanced instructions in the
vector-enhancements facility 2. For "vector string search" VSTRS add a
memcheck test case.
Andreas Arnez [Tue, 16 Feb 2021 16:52:09 +0000 (17:52 +0100)]
s390x: Mark arch13 features as supported
Make the STFLE instruction report the miscellaneous-instruction-extensions
facility 3 and the vector-enhancements facility 2 as supported. Indicate
support for the latter in the HWCAP vector as well.
Andreas Arnez [Wed, 10 Mar 2021 18:22:51 +0000 (19:22 +0100)]
s390x: Vec-enh-2, VSTRS
Support the new "vector string search" instruction VSTRS. The
implementation is a full emulation and follows a similar approach as for
the other vector string instructions.
Andreas Arnez [Tue, 16 Feb 2021 15:19:31 +0000 (16:19 +0100)]
s390x: Vec-enh-2, VLBR and friends
Add support for the new byte- and element-swapping vector load/store
instructions VLEBRH, VLEBRG, VLEBRF, VLLEBRZ, VLBRREP, VLBR, VLER,
VSTEBRH, VSTEBRG, VSTEBRF, VSTBR, and VSTER.
Andreas Arnez [Thu, 11 Feb 2021 19:02:03 +0000 (20:02 +0100)]
s390x: Vec-enh-2, extend VCDG, VCDLG, VCGD, and VCLGD
The vector-enhancements facility 2 extends the vector floating-point
conversion instructions VCDG, VCDLG, VCGD, and VCLGD. In addition to
64-bit elements, they now also handle 32-bit elements. Add support for
these new forms.
Andreas Arnez [Wed, 7 Apr 2021 10:29:32 +0000 (12:29 +0200)]
s390x: Vec-enh-2, extend VSL, VSRA, and VSRL
The vector-enhancements facility 2 extends the existing bitwise vector
shift instructions VSL, VSRA, and VSRL. Now they allow the shift
vector (the third operand) to contain different shift amounts for each
byte. Add support for these new forms.
Add support for the instructions NCRK, NCGRK, NNRK, NNGRK, NORK, NOGRK,
NXRK, NXGRK, OCRK, and OCGRK. Introduce a common helper and use it for
the existing instructions NRK, NGRK, XRK, XGRK, ORK, and OGRK as well.
Mark Wielaard [Fri, 6 Aug 2021 17:08:17 +0000 (19:08 +0200)]
unhandled ppc64le-linux syscall: 252 (statfs64) and 253 (fstatfs64)
glibc 2.34 consolidated all statfs implementations. All other arches
that have statfs64/fstatfs64 (including ppc32) already had that syscall
hooked up, it was just ppc64 that was missing it.
Mark Wielaard [Wed, 21 Jul 2021 17:53:13 +0000 (19:53 +0200)]
Generate a ENOSYS (sys_ni_syscall) for clone3 on all linux arches
glibc 2.34 will try to use clone3 first before falling back to
the clone syscall. So implement clone3 as sys_ni_syscall which
simply returns ENOSYS without producing a warning.
Mark Wielaard [Fri, 16 Jul 2021 19:47:08 +0000 (15:47 -0400)]
Update helgrind and drd suppression libc and libpthread paths in glibc 2.34
glibc 2.34 moved all pthread functions into the main libc library.
And it changed the (in memory) path of the main libc library to
libc.so.6 (before it was libc-2.xx.so).
This breaks various standard suppressions for helgrind and drd.
Fix this by doing a configure check for whether we are using glibc
2.34 by checking whether pthread_create is in libc instead of in
libpthread. If we are using glibc then define GLIBC_LIBC_PATH and
GLIBC_LIBPTHREAD_PATH variables that point to the (regexp) path
of the library that contains all libc functions and pthread functions
(which will be the same path for glibc 2.34+).
Rename glibc-2.34567-NPTL-helgrind.supp to glibc-2.X-helgrind.supp.in
and glibc-2.X-drd.supp to glibc-2.X-drd.supp.in and replace the
GLIBC_LIBC_PATH and GLIBC_LIBPTHREAD_PATH at configure time.
The same could be done for the glibc-2.X.supp.in file, but hasn't
yet because it looks like most suppressions in that file are obsolete.
Mark Wielaard [Fri, 16 Jul 2021 19:37:21 +0000 (21:37 +0200)]
gdbserver_tests: update filters for newer glibc/gdb
With newer glibc/gdb we might see a __select call without anything
following on the line. Also when gdb cannot find a file it might
now print "Inappropriate ioctl for device" instead of the message
"No such file or directory"
amd64 front end: Make uses of 8- and 16-bit GPRs GET the entire containing register.
Until now, a read of a 32-bit GPR (eg, %ecx) in the amd64 front end actually
involved GETting the containing 64-bit reg (%rcx) and dropping off its top
32-bits, in the IR translation. This makes IR optimisation work well for code
that mixes 32 and 64 bit integer operations, which is very common. In
particular it helps guarantee that PUT-to-GET and redundant-GET optimisations
work, hence that constant propagation/folding across such boundaries works,
and indirectly helps to avoid generating code in the back end that suffers
from store-forwarding or partial-register-read stalls.
This commit partially extends those advantages to 8- and 16-bit GPR reads. In
particular, all 16-bit GPR fetches are now a GET of the whole 64-bit register
followed by an Iop_64to16 cast. The same scheme is used for 8-bit register
fetches, except for the "anomalous four" (%ah, %bh, %ch, %dh), whose handling
is left unchanged.
With this in place, a wider write followed by a narrower read now plays
nicely with constant folding and propagation. For example (somewhat
artificially):
movl $17, %ecx // 32-bit write of %rcx
shrl %cl, %r15 // 8-bit read of %rcx
The 17 will be propagated, in IR, up to the shift.
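The scheme can be modelled in plain C by treating the register as one 64-bit value and expressing every narrower read as a truncation of it. This is a sketch of the idea only, not VEX IR; put32, get16, get8 and shr_by_cl are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit write on amd64 zero-extends into the full 64-bit register. */
static uint64_t put32(uint32_t v) { return (uint64_t)v; }

/* Narrow reads: fetch the whole 64-bit register, then truncate
   (the analogue of a GET followed by an Iop_64to16 style cast). */
static uint16_t get16(uint64_t reg) { return (uint16_t)reg; }
static uint8_t  get8(uint64_t reg)  { return (uint8_t)reg; }

/* Shift right by the low byte of rcx. This sketch only handles counts
   below 64 and does not model the hardware's masking of the count. */
static uint64_t shr_by_cl(uint64_t value, uint64_t rcx)
{
    unsigned count = get8(rcx);
    assert(count < 64);
    return value >> count;
}
```

Because the 8-bit read is derived from the same 64-bit value that the 32-bit write produced, a constant written by put32(17) is directly visible to the shift.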
The commit also adds a couple more rewrite rules in ir_opt.c to remove some of
the resulting pointless conversion pairings.
Consistently set CC_NDEP when setting the flags thunk.
For most settings of the flags thunk (guest_CC_{OP,DEP1,DEP2,NDEP}), the value
of the NDEP field is irrelevant, because of the setting of the OP field, and
so it is usually not set in such cases, which are the vast majority. This
saves a store (a PUT) in the final generated code. But it has the bad effect
that the IR optimiser cannot know that preceding PUTs to the field are
possibly dead and can be removed. Most of the time that is not important, but
just occasionally it can cause a lot of pointless extra computation (calling
of amd64g_calculate_rflags_all) to happen. This was observed in a long basic
block involved in a hash calculation, like this:
rolq .. // sets CC_NDEP to the previous value of the flags,
// as calculated by amd64g_calculate_rflags_all
mulq ..
(rolq/mulq repeated several times)
addq .. // effect is, all of the flag computation done for the rol/mul
// sequence is irrelevant, but iropt can't see that
Setting CC_NDEP consistently to zero, even if it isn't needed, avoids the
problem.
amd64 front end: more spec rules: S/NS after LOGICW, S after SHRL, Z after SHRW, C after SUBW.
This adds a few more spec rules that seem useful for running Firefox built
with gcc-O3 and clang-O3. At least one of them removes a false Memcheck
error.
There is also some improved debug printing, currently #if 0'd.
Remove redundant assertions and conditionals in move_CEnt_to_top.
move_CEnt_to_top is on the hot path when reading large amounts of debug info,
especially Dwarf inlined-function info. It shows up in 'perf' profiles. This
commit removes assertions which are asserted elsewhere, and tries to avoid a
couple of conditional branches.
Reimplement h_generic_calc_GetMSBs8x16 to be more efficient.
h_generic_calc_GetMSBs8x16 concatenates the top bit of each 8-bit lane in a
128-bit value, producing a 16-bit scalar value. (It is PMOVMSKB, really).
The existing implementation is excessively inefficient and shows up sometimes
in 'perf' profiles of V. This commit replaces it with a logarithmic (4-stage)
algorithm which is hopefully much faster.
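One way to gather the sign bits in a logarithmic number of steps is sketched below. It is not necessarily the code that was committed: msbs8x8, msbs8x16 and msbs8x16_ref are hypothetical names. Three shift-and-mask stages compress each 64-bit half to 8 bits, and a fourth combines the halves.

```c
#include <assert.h>
#include <stdint.h>

/* Gather the top bit of each of the 8 bytes of x into the low 8 bits. */
static uint32_t msbs8x8(uint64_t x)
{
    x = (x >> 7) & 0x0101010101010101ULL;        /* MSB of byte i now at bit 8*i */
    x = (x | (x >> 7))  & 0x0003000300030003ULL; /* stage 1: 2 bits per 16-bit lane */
    x = (x | (x >> 14)) & 0x0000000F0000000FULL; /* stage 2: 4 bits per 32-bit lane */
    x = (x | (x >> 28)) & 0x00000000000000FFULL; /* stage 3: 8 bits in total */
    return (uint32_t)x;
}

/* 128-bit input as two 64-bit halves; result bit i is the MSB of byte i. */
static uint32_t msbs8x16(uint64_t hi, uint64_t lo)
{
    uint32_t r = (msbs8x8(hi) << 8) | msbs8x8(lo); /* stage 4: combine halves */
    assert(r <= 0xFFFF);
    return r;
}

/* Straightforward bit-by-bit reference, useful for cross-checking. */
static uint32_t msbs8x16_ref(uint64_t hi, uint64_t lo)
{
    uint32_t r = 0;
    for (int i = 0; i < 8; i++) {
        r |= (uint32_t)((lo >> (8 * i + 7)) & 1) << i;
        r |= (uint32_t)((hi >> (8 * i + 7)) & 1) << (i + 8);
    }
    return r;
}
```

Each stage halves the number of separate bit groups, so the work per 64-bit half is three shift, OR and mask steps instead of one step per byte.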