Carl Love [Tue, 9 Jun 2020 15:42:03 +0000 (10:42 -0500)]
Power PC Fix extraction of the L field for sync instruction
The L field is currently a two-bit field (bits [22:21]) in ISA 3.0. The size of the
L field has changed over time.
Currently the ISA 3.0 Valgrind sync instruction support code sets
flag_L for the instruction L field to a five-bit value that includes bits
that are marked reserved in the sync instruction. This patch fixes the issue for ISA 3.0
by setting flag_L from only the specified two bits.
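As a rough illustration of the fix, the sketch below contrasts a five-bit extraction with the two-bit extraction described above. It assumes bit 0 is the least significant bit of the 32-bit instruction word, so the L field sits at bits [22:21]. The function names are hypothetical and are not Valgrind's.

```c
#include <stdint.h>

/* Hypothetical helpers, not Valgrind code. Bit 0 is taken to be the
   least significant bit of the instruction word. */

/* Five-bit extraction: also picks up bits [25:23], which fall outside
   the two-bit L field. */
uint32_t sync_L_five_bits(uint32_t insn)
{
    return (insn >> 21) & 0x1Fu;
}

/* Two-bit extraction: only bits [22:21], the ISA 3.0 L field. */
uint32_t sync_L_two_bits(uint32_t insn)
{
    return (insn >> 21) & 0x3u;
}
```

With the lwsync encoding 0x7C2004AC (L = 1), both helpers return 1. If bit 23 is also set (0x7CA004AC), the five-bit version returns 5 while the two-bit version still returns 1.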
Mark Wielaard [Tue, 9 Jun 2020 09:23:46 +0000 (11:23 +0200)]
docs: Make sure all elements that need it have an id tag.
When generating HTML it is useful if every element that can be referenced
has a stable id. If it doesn't, a random one is generated, which makes it
harder to link to parts of the manual on the website. It also generates
spurious diffs. Explicitly add an id tag for the sect2 and sect3 elements
in dh-manual, a unique id for each legalnotice element and for each
FAQ question and answer.
Assad Hashmi [Fri, 15 May 2020 14:44:14 +0000 (16:44 +0200)]
Enable v8.1 atomics and fix SWP and LDUMAX instructions.
The atomics test drd/tests/std_mutex hangs on Arm v8.1 when built
with GCC10. Add HWCAP_ATOMICS to ARM64_SUPPORTED_HWCAP and fix the
ldumax and swp instructions to make it work.
Mark Wielaard [Tue, 12 May 2020 14:58:36 +0000 (16:58 +0200)]
gcc10 arm64 build needs __getauxval for linking with libgcc
Provide a new library libgcc-sup-<platform>.a that contains symbols
needed by libgcc. This needs to be linked after -lgcc to provide
any symbols missing which would normally be provided by glibc.
At the moment this only provides __getauxval on arm64 linux.
Mark Wielaard [Thu, 14 May 2020 16:47:57 +0000 (18:47 +0200)]
Resolve id conflicts in See Also sections in valgrind and vgdb manpages.
Because the manpages are processed together they cannot contain the
same ids. Both the valgrind and vgdb manpage reference the vgdb main
manual URL. That conflicts even though the valgrind.1 and vgdb.1 manual
pages are separate. Prefixing the vgdb ids (with "vgdb-") works around
the conflict. It still works fine, since in vgdb the references are only
directly used in the "See Also" refsect. The labels and urls still come
out as intended.
With this fix, "make valid" validates both the manual index.xml and
manpages-index.xml without errors.
Mark Wielaard [Thu, 14 May 2020 14:07:04 +0000 (16:07 +0200)]
Turn manpages-index.xml into a "real" book, so it can be validated.
manpages-index.xml is just to easily get at each individual man page
with xsltproc. It wasn't a complete docbookx xml file. Now that it is,
we can validate it with xmllint. It doesn't fully validate, but we
are close.
Mark Wielaard [Thu, 14 May 2020 13:11:56 +0000 (15:11 +0200)]
Use DTD DocBook XML V4.5 everywhere.
This makes the rule for xmllint easier since it doesn't need to
override the DTD to validate against. It also helps with other tools
trying to process the docbookx xml files.
Mark Wielaard [Thu, 14 May 2020 10:54:23 +0000 (12:54 +0200)]
Update README_DEVELOPERS and references to --vex-guest-chase-thresh.
Add a hint about using lldb in README_DEVELOPERS and change any old references
to --vex-guest-chase-thresh=0 into --vex-guest-chase=no (mirroring the change
in commit 56e04256a "Rationalise --vex-guest* flags in the new IRSB
construction framework").
If you do `malloc(100)` followed by `realloc(200)`, DHAT now adds 100
bytes to the read and write counts for the implicit `memcpy`. This gives
more reasonable results.
I have long been surprised by low writes-per-byte values of around 0.35
for vectors that are grown by doubling. Counting the implicit `memcpy`
increases those numbers to well above 0.5, which is what you'd expect.
The commit also adds a section to the DHAT docs about `realloc`, because
there is some non-obvious behaviour, some of which confused me just a
couple of days ago.
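The accounting described above can be sketched as a small model. This is not DHAT's code: the struct and function names are hypothetical, and the choice to charge the smaller of the old and new sizes is an assumption, since the text only states the growing case.

```c
#include <stddef.h>

/* Hypothetical per-block counters; the field names are illustrative
   and are not DHAT's own data structures. */
struct block_counts {
    size_t size;          /* current requested size in bytes */
    size_t bytes_read;    /* bytes read from the block */
    size_t bytes_written; /* bytes written to the block */
};

/* Model a realloc: charge the implicit memcpy to both counters.
   Assumption: the copy covers the smaller of the old and new sizes;
   the text above only states the growing case (100 -> 200 adds 100). */
void count_realloc(struct block_counts *b, size_t new_size)
{
    size_t copied = b->size < new_size ? b->size : new_size;
    b->bytes_read += copied;
    b->bytes_written += copied;
    b->size = new_size;
}
```

Starting from a 100-byte block with zero counts, count_realloc(&b, 200) leaves both counters at 100, matching the malloc(100) then realloc(200) example.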
Michal Privoznik [Fri, 15 Nov 2019 09:37:53 +0000 (10:37 +0100)]
Add support for setns syscall
I've tested this on amd64 and arm but I'm enabling it on all
arches since the syscall should work identically on all of them.
This was requested by users for a long time (almost 5 years) and
in fact, some programs (like libvirt) use namespaces and fork off
to enter other namespaces. Lack of implementation means valgrind
can't be used with these programs (or their configuration must be
changed to not use namespaces, which defeats the purpose).
Without knowing it, I've converged on the same patch as the one mentioned in
the bugs below.
Martin Storsjö [Mon, 27 Apr 2020 18:31:51 +0000 (21:31 +0300)]
mingw: Fix arch detection ifdefs for non-x86 mingw platforms
Don't assume that __MINGW32__ implies x86; Windows runs on ARM/ARM64
as well, and there are mingw toolchains that target those architectures.
This mirrors how the MSVC parts of the same expressions are written,
as (defined(_WIN32) && defined(_M_IX86)) and
(defined(_WIN64) && defined(_M_X64)) - not relying on _WIN32/_WIN64
or __MINGW32__/__MINGW64__ alone to indicate architecture.
Change the __MINGW64__ and _WIN64 ifdefs into plain __MINGW32__
and _WIN32 as well, for clarity - these defines mostly imply
platform.
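A minimal sketch of the pattern, assuming the mingw side pairs __MINGW32__ with the GCC-style architecture macros __i386__ and __x86_64__ (that pairing is an assumption; the text only quotes the MSVC side). EXAMPLE_PLATFORM and example_platform() are hypothetical names.

```c
/* Hypothetical macro and function names; only the compiler-predefined
   macros (__MINGW32__, _WIN32, __i386__, __x86_64__, _M_IX86, _M_X64)
   are real. */
#if (defined(__MINGW32__) && defined(__i386__)) \
    || (defined(_WIN32) && defined(_M_IX86))
#  define EXAMPLE_PLATFORM "x86-windows"
#elif (defined(__MINGW32__) && defined(__x86_64__)) \
    || (defined(_WIN32) && defined(_M_X64))
#  define EXAMPLE_PLATFORM "amd64-windows"
#else
#  define EXAMPLE_PLATFORM "other"
#endif

/* Returns the platform label selected at compile time. */
const char *example_platform(void)
{
    return EXAMPLE_PLATFORM;
}
```

Because every branch tests an architecture macro alongside the platform macro, an ARM or ARM64 mingw toolchain falls through to the last branch rather than being treated as x86.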
Do not fix '417075 - pwritev(vector[...]) suppression ignored' but produce a warning.
- Release 3.15 introduced a backward-incompatible change for
some suppression entries related to preadv and pwritev syscalls.
When reading a suppression entry using the unsupported 3.14 format,
valgrind will now produce a warning to say the suppression entry will not
work, and suggest the needed change.
For example, in 3.14, the extra line was:
pwritev(vector[...])
while in 3.15, it became e.g.
pwritev(vector[2])
3 possible fixes were discussed:
* revert the 3.15 change to go back to 3.14 format.
This is ugly because valgrind 3.16 would be incompatible
with the supp entries for 3.15.
* make the suppression matching logic consider that ... is a wildcard
equivalent to a *.
This is ugly because the suppression matching logic/functionality
is already very complex, and ... would mean 2 different things
in a suppression entry: a wildcard in the extra line, and any
number of stack frames in the backtrace portion of the supp entry.
* keep the 3.15 format, and accept the incompatibility with 3.14 and before.
This is ugly as valgrind 3.16 and above are still incompatible with 3.14
and before.
The third option was deemed the least ugly, in particular because it was possible
to detect the incompatible unsupported supp entry and produce a warning.
So, now, valgrind reports a warning when such an entry is detected, giving
e.g. a behaviour such as:
==21717== WARNING: pwritev(vector[...]) is an obsolete suppression line not supported in valgrind 3.15 or later.
==21717== You should replace [...] by a specific index such as [0] or [1] or [2] or similar
==21717==
....
==21717== Syscall param pwritev(vector[1]) points to unaddressable byte(s)
==21717== at 0x495B65A: pwritev (pwritev64.c:30)
==21717== by 0x1096C5: main (sys-preadv_pwritev.c:69)
==21717== Address 0xffffffffffffffff is not stack'd, malloc'd or (recently) free'd
So, we can hope that users having incompatible entries will easily understand
the problem of the supp entry not matching anymore.
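Detecting the obsolete form can be as simple as a substring check on the extra line. The sketch below is a hypothetical helper, not the check Valgrind actually performs.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical check, not Valgrind's implementation: report whether a
   suppression "extra" line uses the 3.14-style "vector[...]" form. */
bool is_obsolete_vector_line(const char *extra)
{
    return strstr(extra, "(vector[...])") != NULL;
}
```

Under this sketch, "pwritev(vector[...])" is flagged, while "pwritev(vector[2])" is not.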
In future releases of valgrind, we must take care to:
* never change the extra string produced for an error, unless *really* necessary
* minimise as much as possible 'variable' information generated dynamically
in error extra string. Such extra information can be reported in the rest
of the error message (like the address above for example).
The user can use e.g. GDB + vgdb to examine in detail the offending
data or parameter values or uninitialised bytes or ...
A comment is added in pub_tool_errormgr.h to remind tool developers of the above.
Petar Jovanovic [Thu, 23 Apr 2020 16:42:26 +0000 (16:42 +0000)]
Amend the recent update to VG_(getrlimit) and VG_(setrlimit)
[get|set]rlimit system calls are becoming deprecated.
Coregrind should use prlimit64 as the first candidate in order to
achieve "rlimit" functionality.
There are also systems that do not even support older "rlimits".
Modify the previously added __NR_prlimit64 support in VG_(getrlimit) and
VG_(setrlimit) to make it similar to the glibc implementation.
It fixes none/tests/stackgrowth and none/tests/sigstackgrowth
tests on nanoMIPS.
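The "prlimit64 first, legacy call second" ordering might look roughly like the user-space sketch below. It is Linux-specific and is not coregrind's code: example_rlimit64 and example_getrlimit are hypothetical names, and the exact fallback conditions are an assumption.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same layout as the Linux kernel's struct rlimit64: two unsigned
   64-bit values. The type and function names here are hypothetical. */
struct example_rlimit64 {
    uint64_t cur;
    uint64_t max;
};

/* Convert a legacy rlim_t, mapping "no limit" to all ones. */
static uint64_t widen(rlim_t v)
{
    return v == RLIM_INFINITY ? UINT64_MAX : (uint64_t)v;
}

/* Try prlimit64 first; fall back to getrlimit only if the kernel
   reports that prlimit64 does not exist. Returns 0 on success. */
int example_getrlimit(int resource, struct example_rlimit64 *out)
{
#ifdef SYS_prlimit64
    /* pid 0 means the calling process; NULL means "do not set". */
    if (syscall(SYS_prlimit64, 0, resource, NULL, out) == 0)
        return 0;
    if (errno != ENOSYS)
        return -1;
#endif
    struct rlimit old;
    if (getrlimit(resource, &old) != 0)
        return -1;
    out->cur = widen(old.rlim_cur);
    out->max = widen(old.rlim_max);
    return 0;
}
```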
Make memcheck/tests/linux/sigqueue usable with musl
Remove offsetof(siginfo_t, _sifields) from the test.
"_sifields" is not a mandatory field of struct siginfo_t so
it should not be used in regular user programs.
We do not need the u1 member of the bits union as long as we use u32 for the same
purpose. Overlapping uint8_t with uint32_t causes a problem on BE platforms,
since the LSB of u32 does not overlap with u1.
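The byte-order dependence can be seen with a small stand-alone union. The union below is a hypothetical stand-in that only mirrors the overlap described above; it is not the structure from the patched code.

```c
#include <stdint.h>

/* Hypothetical union mirroring the overlap described above. */
union bits {
    uint32_t u32;
    uint8_t  u1;
};

/* Returns 1 if u1 aliases the least significant byte of u32
   (little-endian hosts), 0 otherwise (big-endian hosts). */
int u1_overlaps_lsb(void)
{
    union bits b;
    b.u32 = 0x11223344u;
    return b.u1 == 0x44;
}
```

On a big-endian host u1 reads back 0x11, the most significant byte, so code that writes a small value through u32 and reads it through u1 sees zero.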
mips: treat delay slot as part of the previous instruction
Do so by recursively calling disInstr_MIPS_WRK() if the instruction
currently being disassembled is a branch/jump, effectively combining them
into one IR instruction.
A notable change is that the branch/jump + delay slot combination now forms
an eight-byte instruction.
It only ever worked on x86 and amd64, and even on those it had a high false
positive rate and was slow. Everything it does, ASan can do faster, better,
and on more architectures. So there's no reason to keep this tool any more.
drd/drd_pthread_intercepts: Add a workaround for what is probably a compiler bug
Without this patch drd produces incorrect output for some test cases. It
seems like without this patch an incorrect value is passed as the sixth
argument of VALGRIND_DO_CLIENT_REQUEST_STMT(VG_USERREQ__POST_SEM_OPEN, ...):
$ ./vg-in-place --tool=drd --trace-semaphore=yes drd/tests/sem_open -m -p
drd, a thread error detector
Copyright (C) 2006-2017, and GNU GPL'd, by Bart Van Assche.
Using Valgrind-3.16.0.GIT and LibVEX; rerun with -h for copyright info
Command: drd/tests/sem_open -m -p
[1] sem_open 0x4029000 name /drd-sem-open-test-27725 oflag 0xc0 mode 0600 value 0
s_d1 = 1 (should be 1)
[2] sem_wait 0x4029000 value 0 -> 4294967295
Thread 2:
Invalid semaphore: semaphore 0x4029000
at 0x484ADC7: sem_wait_intercept (drd_pthread_intercepts.c:1436)
by 0x484ADC7: sem_wait@* (drd_pthread_intercepts.c:1441)
by 0x4014A9: thread_func (sem_open.c:114)
by 0x483FEA6: vgDrd_thread_wrapper (drd_pthread_intercepts.c:449)
by 0x4886EF9: start_thread (in /lib64/libpthread-2.31.so)
by 0x499F3BE: clone (in /lib64/libc-2.31.so)
semaphore 0x4029000 was first observed at:
at 0x484A395: sem_open_intercept (drd_pthread_intercepts.c:1403)
by 0x484A395: sem_open (drd_pthread_intercepts.c:1409)
by 0x4012CE: main (sem_open.c:63)
[2] sem_post 0x4029000 value 4294967295 -> 0
[1] sem_wait 0x4029000 value 0 -> 4294967295
Thread 1:
Invalid semaphore: semaphore 0x4029000
at 0x484ADC7: sem_wait_intercept (drd_pthread_intercepts.c:1436)
by 0x484ADC7: sem_wait@* (drd_pthread_intercepts.c:1441)
by 0x40139D: main (sem_open.c:90)
semaphore 0x4029000 was first observed at:
at 0x484A395: sem_open_intercept (drd_pthread_intercepts.c:1403)
by 0x484A395: sem_open (drd_pthread_intercepts.c:1409)
by 0x4012CE: main (sem_open.c:63)
Conflicting load by thread 1 at 0x00404108 size 8
at 0x40139E: main (sem_open.c:91)
Allocation context: BSS section of /home/bart/software/valgrind.git/drd/tests/sem_open
Other segment start (thread 2)
(thread finished, call stack no longer available)
Other segment end (thread 2)
(thread finished, call stack no longer available)
Conflicting store by thread 1 at 0x00404108 size 8
at 0x4013B2: main (sem_open.c:91)
Allocation context: BSS section of /home/bart/software/valgrind.git/drd/tests/sem_open
Other segment start (thread 2)
(thread finished, call stack no longer available)
Other segment end (thread 2)
(thread finished, call stack no longer available)
[1] sem_post 0x4029000 value 4294967295 -> 0
s_d2 = 2 (should be 2)
s_d3 = 5 (should be 5)
[1] sem_close 0x4029000 value 0
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 18 from 8)
The test code in drd/tests/trylock.c attempts to write-lock a POSIX rwlock
twice. The code expects the second attempt to return an error, but POSIX
doesn't require that behaviour, and FreeBSD's implementation deadlocks
instead.
See also https://bugs.kde.org/show_bug.cgi?id=403212
Andreas Arnez [Fri, 3 Apr 2020 17:16:01 +0000 (19:16 +0200)]
s390x: Drop spurious register moves in CDAS instruction selector
The s390x instruction selector for Ist_CAS, in its handling of "compare
double and swap", adds spurious register moves after the CDAS operation
itself. These moves overwrite registers returned by calls to
s390_isel_int_expr(), potentially causing corruption of temp values.
Andreas Arnez [Thu, 2 Apr 2020 18:40:02 +0000 (20:40 +0200)]
s390x: Fix Iex_Load instruction selectors for F128/D128 types
The s390x instruction selectors for Iex_Load of Ity_F128 and Ity_D128
types had a common typo that would lead to crashes when used. So far this
bug didn't surface because Iex_Load is not emitted on s390x with these
types.
Andreas Arnez [Thu, 2 Apr 2020 16:00:13 +0000 (18:00 +0200)]
s390x: Introduce and exploit new ALU operator S390_ALU_ILIH
The handlers of Iop_8HLto16, Iop_16HLto32, and Iop_32HLto64 in
s390_isel_int_wrk() yield a sequence of "shift", "and", and "or" ALU
operations, the second of which modifies a register returned from a call
to s390_isel_int_expr(). While this approach does not lead to wrong code
generation (because only the register's upper bits are changed which are
not relevant to the IR type), it violates the general "no-modify" rule.
Replace this sequence of ALU operations by a single ALU operation
S390_ALU_ILIH that inserts the low half of its second operand into the
high half of its first operand. Use the z/Architecture instruction
RISBG ("rotate then insert selected bits") for implementing it.
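For the 64-bit case, the effect of the two approaches can be modelled in plain C. This is a sketch of the semantics only, under the assumption that the "low" value is passed as the first operand of ILIH and the "high" value as the second. The function names are hypothetical.

```c
#include <stdint.h>

/* One plausible reading of the older sequence for the 64-bit case,
   as separate shift/and/or steps on copies of the inputs. */
uint64_t hl_to_64_shift_and_or(uint32_t hi, uint32_t lo)
{
    uint64_t h = (uint64_t)hi << 32;          /* shift */
    uint64_t l = (uint64_t)lo & 0xFFFFFFFFu;  /* and   */
    return h | l;                             /* or    */
}

/* Model of "insert low half into high half" for 64-bit operands:
   the high half of op1 is replaced by the low half of op2, and the
   low half of op1 is kept. */
uint64_t ilih64(uint64_t op1, uint64_t op2)
{
    return (op1 & 0x00000000FFFFFFFFull) | (op2 << 32);
}
```

Note that ilih64 ignores whatever is in the high half of either input, which is why stale upper bits in the source registers do not matter.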
Andreas Arnez [Wed, 18 Mar 2020 17:59:15 +0000 (18:59 +0100)]
s390x: Drop register arg to s390_isel_int1_expr()
Restructure the interface of s390_isel_int1_expr() such that no
destination register is passed to it any more. Adjust all its callers
accordingly. Ensure that callers never modify the returned register, but
make a copy and modify that instead.
Andreas Arnez [Tue, 11 Feb 2020 17:02:38 +0000 (18:02 +0100)]
s390x: Activate "grail"
Now that the known problems with activating "grail" on s390x have been
fixed, there is no need to disable it for s390x guests any more. Remove
the appropriate check in "guest_generic_bb_to_IR.c".
Andreas Arnez [Thu, 19 Mar 2020 16:35:55 +0000 (17:35 +0100)]
Bug 418997 - s390x: Support Iex_ITE for float and vector expressions
The s390x backend supports Iex_ITE expressions for integer types I8, I16,
I32, and I64 only. But "grail" can now generate such expressions for
guarding any kind of Ist_Put statements; see add_guarded_stmt_to_end_of()
in "guest_generic_bb_to_IR.c". On s390x this means that F64 and V128 can
occur as well, in which case a crash would result. And such crashes are
actually seen when running the test suite with "grail" enabled.
Extend Iex_ITE support to the floating-point types F32 and F64 and to the
vector type V128. Do this by extending S390_INSN_COND_MOVE as needed.
Andreas Arnez [Tue, 3 Mar 2020 15:42:04 +0000 (16:42 +0100)]
s390x: Add directReload function for the register allocator
This adds the function directReload_S390() and wires it up to the register
allocator, enabling "direct reloading" for various types of instructions.
Direct reloading, when applied, avoids loading an operand into a register
and thus may reduce spilling. On s390x this slightly reduces the
generated code.
In order to determine which instructions are relevant for direct
reloading, it was tested which direct reloads the register allocator tries
to perform in simple programs.
Andreas Arnez [Wed, 18 Mar 2020 11:24:25 +0000 (12:24 +0100)]
Bug 417281 - s390x: Fix register usage of conditional moves
The s390x register usage callback marks the target register of a
conditional move as HRmWrite only. It fails to mention the fact that the
target register is also an input to the insn (unless the condition is
"never" or "always").
This was discovered while investigating "grail" failures on s390x and
fixes the majority of them.
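Why the target counts as an input becomes clear when the conditional move is written as a C expression. This is a hypothetical model of the semantics, not backend code.

```c
#include <stdint.h>

/* Semantics of a conditional move, as a hypothetical C model.
   When cond is false the old value of dst is the result, so dst is
   read as well as written. */
uint64_t cond_move(int cond, uint64_t dst, uint64_t src)
{
    return cond ? src : dst;
}
```

If a register allocator believes dst is write-only, it may leave dst uninitialised before the move, and the cond == 0 path then yields garbage.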
Andreas Arnez [Fri, 13 Mar 2020 16:20:20 +0000 (17:20 +0100)]
s390x: Actually use "load on condition" for conditional moves
Although the implementation of the cond_move insn is prepared to emit
"load on condition" instructions, it doesn't, because of a reversed check.
The check is supposed to prevent emitting LOCx instructions when the
condition code mask is set to "always", but it's accidentally negated.
Fix the reversal of the check, so LOCx instructions are actually emitted
when applicable.
Andreas Arnez [Mon, 9 Mar 2020 16:26:26 +0000 (17:26 +0100)]
s390x: Mark register usage with HRmModify when applicable
Rather than marking register usage for the same register with HRmRead and
HRmWrite separately, use HRmModify. This makes the code a bit
easier to read.
Andreas Arnez [Mon, 9 Mar 2020 14:14:16 +0000 (15:14 +0100)]
s390x: Enable 1- and 2-byte operands for v-test
The v-test operation tests its operand against zero and sets the condition
code accordingly. So far the operation was only supported for 4- and
8-byte operands.
Lift this restriction and enable 1- and 2-byte operands for v-test, using
the z/Architecture "test under mask" instructions TM, TMY, and TMLL.
Exploit this in the instruction selector, getting rid of the conversion to
a 4-byte operand. This slightly reduces the generated code on s390x.
Andreas Arnez [Wed, 5 Feb 2020 17:18:49 +0000 (18:18 +0100)]
s390x: Support And1/Or1, improve handling of Int1 expressions
This provides an instruction selector for Int1-expressions that supports
And1 and Or1. This implementation tries to keep values in registers as
much as possible, to avoid too many conversions from a Boolean value to a
condition code or vice versa. To this end, the new function
s390_isel_int1_expr() is added, which handles bit-typed expressions that
are supposed to end up in a register.
Also change the representation of Int1 values in registers and always
sign-extend them to 64 bits.
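The sign-extended representation can be modelled as follows. The helper names are hypothetical, and treating And1/Or1 as plain bitwise operations on the full register is an assumption about how such a representation can be used, not a description of the emitted s390x code.

```c
#include <stdint.h>

/* Hypothetical model: a 1-bit value sign-extended to 64 bits is
   either all zeros (false) or all ones (true). */
uint64_t int1_to_reg(int bit)
{
    return (bit & 1) ? UINT64_MAX : 0;
}

/* With that representation, And1 and Or1 can be modelled as plain
   bitwise operations on the full register. */
uint64_t and1(uint64_t a, uint64_t b) { return a & b; }
uint64_t or1(uint64_t a, uint64_t b)  { return a | b; }
```

The result of either operation is again all zeros or all ones, so no re-normalisation is needed between chained Boolean operations.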
Andreas Arnez [Tue, 10 Mar 2020 16:18:48 +0000 (17:18 +0100)]
s390x: Fix down-cast from memory operand with size < 8
A down-cast always copies 8 bytes from the source operand, even if the
operand is actually smaller. This doesn't matter for register operands,
but it does for memory operands. Fix this and copy the correct number of
bytes instead.
Andreas Arnez [Fri, 13 Mar 2020 16:18:55 +0000 (17:18 +0100)]
s390x: Mark VRs as clobbered by helper calls
According to the s390x ABI, all vector registers are call-clobbered
(except for their portions that overlap with the call-saved FPRs). But
the s390x backend doesn't mark them as such when determining the register
usage of helper call insns.
Fix this in s390_insn_get_reg_usage when handling S390_INSN_HELPER_CALL.
Julian Seward [Mon, 9 Mar 2020 08:22:31 +0000 (09:22 +0100)]
Bug 415136 - ARMv8.1 Compare-and-Swap instructions are not supported. (TEST CASES).
This commit provides test cases for ARMv8.1 CAS instructions, support for
which was added in the previous commit.
Patch by Assad Hashmi <assad.hashmi@linaro.org>.
Julian Seward [Mon, 9 Mar 2020 08:18:09 +0000 (09:18 +0100)]
Bug 415136 - ARMv8.1 Compare-and-Swap instructions are not supported.
This commit implements ARMv8.1 CAS instructions. It does not contain
test cases; those will be in a subsequent commit.
Patch by Assad Hashmi <assad.hashmi@linaro.org>.
Andreas Arnez [Mon, 2 Mar 2020 15:22:59 +0000 (16:22 +0100)]
Bug 418435 - s390x: Avoid extra value dependency in CLC implementation
The test memcheck/tests/memcmp currently fails on s390x because it yields
the expected "conditional jump or move depends on uninitialised value(s)"
message twice instead of just once.
This is caused by the handling of the s390x instruction CLC, see
s390_irgen_CLC_EX(). When comparing two bytes from the two input strings,
the implementation uses the comparison result for a conditional branch to
the next instruction. But if no further bytes need to be compared, the
comparison result is also used for generating the resulting condition
code.
There are two cases: Either the inputs are equal; then the resulting
condition code is zero. This is what happens in the memcmp test case. Or
the inputs are different; then the resulting condition code is 1 if the
first operand is low, or 2 if it is high.
At least in the first case it is easy to avoid the additional dependency,
by clearing the condition code explicitly. Just do this.
Mark Wielaard [Fri, 28 Feb 2020 12:36:31 +0000 (13:36 +0100)]
Add 32bit time64 syscalls for arm, mips32, ppc32 and x86.
This patch adds syscall wrappers for the following syscalls which
use a 64bit time_t on 32bit arches: clock_gettime64, clock_settime64,
clock_getres_time64, clock_nanosleep_time64, timer_gettime64,
timer_settime64, timerfd_gettime64, timerfd_settime64,
utimensat_time64, pselect6_time64, ppoll_time64, recvmmsg_time64,
mq_timedsend_time64, mq_timedreceive_time64, semtimedop_time64,
rt_sigtimedwait_time64, futex_time64 and sched_rr_get_interval_time64.
Still missing are clock_adjtime64 and io_pgetevents_time64.
For the more complicated syscalls futex[_time64], pselect6[_time64]
and ppoll[_time64] there are shared pre and/or post helper functions.
Other functions just have their own PRE and POST handler.
Note that the vki_timespec64 struct really is the struct as used
by glibc (it internally translates a 32bit timespec struct to a 64bit
timespec64 struct before passing it to any of the time64 syscalls).
The kernel uses a 64-bit signed int, but is ignoring the upper 32 bits
of the tv_nsec field. It does always write the full struct though.
So skipping the check of the padding is only needed for PRE_MEM_READ.
There are two helpers, pre_read_timespec64 and pre_read_itimerspec64,
to check the new structs.
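A layout consistent with that description might look like the sketch below. It is written from the description only and is not copied from Valgrind or glibc: the struct name, the tv_pad field, and the helper are hypothetical, and the byte-order test relies on the GCC/Clang predefined macro __BYTE_ORDER__.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, modelled on the description above: a 64-bit
   tv_sec, a 32-bit tv_nsec, and 32 bits of padding whose position
   depends on byte order. Not copied from Valgrind or glibc. */
struct example_timespec64 {
    int64_t tv_sec;
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    int32_t tv_pad;   /* upper 32 bits of the kernel's 64-bit field */
    int32_t tv_nsec;
#else
    int32_t tv_nsec;
    int32_t tv_pad;   /* upper 32 bits of the kernel's 64-bit field */
#endif
};

/* Number of bytes a read check needs to cover for tv_nsec: only the
   32-bit field, not the padding the kernel ignores. */
size_t nsec_bytes_to_check(void)
{
    return sizeof(((struct example_timespec64 *)0)->tv_nsec);
}
```

A read check would then cover tv_sec and the four bytes of tv_nsec, while a write check would cover all 16 bytes because the kernel writes the full struct.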
Mark Wielaard [Wed, 4 Mar 2020 13:23:37 +0000 (14:23 +0100)]
Add suppressions for glibc DTV leaks
The glibc DTV (Dynamic Thread Vector) for the main thread is never
released, not even through __libc_freeres. This causes it to always
show up as a reachable block when used, and sometimes, when it is
extended and then reduced, as a possible leak when memcheck cannot
find a pointer to the start of the block.
Improve line info tracing, in particular when using lto.
With gcc 9 and --enable-lto, we now have spurious warnings saying
that the line information in the debug info has huge line numbers,
greater than the (valgrind) maximum of 2^20.
These spurious warnings cause all tests to fail.
This change modifies the tracing/debugging of the line info to:
* disable by default the warning for line info greater than 2^20.
When using -d, such warnings are however still shown (once).
* show all such warnings when using at least -d -d -d -d
Allow valgrind to find debug info in a 'usr merge' setup.
On ubuntu 19.10, valgrind fails, reporting that it cannot find
the mandatory redirection for strlen in ld-linux-x86-64.so.2.
This is due to /bin being a symlink to usr/bin: ld is found
in /usr/lib/x86_64-linux-gnu/ld-2.30.so
but its debug info is
in /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.30.so
Without this patch, valgrind searches for the debug info (among other places)
in /usr/lib/debug/usr/lib/x86_64-linux-gnu/ld-2.30.so
so using the concatenation of /usr/lib/debug
and /usr/lib/x86_64-linux-gnu/ld-2.30.so,
but the debug info is located at the concatenation of
/usr/lib/debug and /lib/x86_64-linux-gnu/ld-2.30.so
(so without the leading /usr).
Modify the debug info search so as to try with and without the /usr.
Patch derived from the patch done by Mathieu Trudel-Lapierre
to solve https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1808508
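The "with and without /usr" search can be sketched as a helper that builds both candidate paths. This is a hypothetical illustration, not the code from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not Valgrind's code. Writes the debug-info
   candidate for objpath under debugdir into out. If strip_usr is
   nonzero and objpath starts with "/usr/", that leading component is
   dropped first. Returns 0 on success, -1 if out is too small. */
int debug_candidate(char *out, size_t outsz, const char *debugdir,
                    const char *objpath, int strip_usr)
{
    if (strip_usr && strncmp(objpath, "/usr/", 5) == 0)
        objpath += 4;   /* keep the '/' that follows "/usr" */
    int n = snprintf(out, outsz, "%s%s", debugdir, objpath);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

For debugdir "/usr/lib/debug" and objpath "/usr/lib/x86_64-linux-gnu/ld-2.30.so", the two calls produce the two locations named above, and a caller would try each in turn.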
Andreas Arnez [Thu, 27 Feb 2020 14:52:53 +0000 (15:52 +0100)]
s390x: Add CPU model for z15
Make the z15 CPU models known to Valgrind. Add test case output for z15
to the "ecag" test. Also ensure that the facility bits for CPU facilities
unsupported by Valgrind are unset, particularly for the new
deflate-conversion facility.
mips: Fix linking errors for none/tests/mips[32|64]/msa_fpu
Some older toolchains (e.g. Codescape GNU Tools 2016.05-03 for MIPS
MTI Linux 4.9.2) require explicit inclusion of the "math" library in
order to link to the fpclassify() function.
Andreas Arnez [Wed, 26 Feb 2020 16:46:45 +0000 (17:46 +0100)]
s390x: Fix possible false positives with mul-z14 test case
The output of the tests for msrkc and msgrkc in "none/tests/s390x/mul-z14"
can differ from the expected output, because it depends on undetermined
data. The test always prints the register pair r2/r3, but the
instructions msrkc and msgrkc only write to r2, and msrkc even affects
only its lower half.
Fix the undetermined output by initializing r2 and r3 with zero first.
Andreas Arnez [Wed, 5 Feb 2020 18:28:53 +0000 (19:28 +0100)]
s390x: Exploit LOCGHI for converting from CC to Int1
Whenever converting a condition code to a Boolean value, the current
implementation in s390_insn_cc2bool_emit() generates six instructions
including "insert program mask" (IPM). On systems with the
load/store-on-condition facility 2, this can be done in two instructions
instead, using "load halfword immediate on condition" (LOCGHI).
Add the new hardware capability VEX_HWCAPS_S390X_LSC2 and the respective
macro s390_host_has_lsc2. In s390_insn_cc2bool_emit(), check for the
facility and exploit it if available.
A conditional move from an immediate value can be slightly improved with
LOCGHI as well, so do that in s390_insn_cond_move_emit() if possible.