Uros Bizjak [Thu, 23 Nov 2023 14:45:59 +0000 (15:45 +0100)]
i386: Fix ICE with -mforce-indirect-call and -fsplit-stack [PR89316]
With the above two options, use a temporary register regno (as returned
from split_stack_prologue_scratch_regno) as an indirect call scratch
register to hold __morestack function address. On 64-bit targets, two
temporary registers are always available, so load the function addres in
%r11 and call __morestack_large_model with its one-argument-register value
rn %r10. On 32-bit targets, bail out with a "sorry" if the temporary
register can not be obtained.
On 32-bit targets, also emit PIC sequence that re-uses the obtained indirect
call scratch register before moving the function address to it. We can
not set up %ebx PIC register in this case, but __morestack is prepared
for this situation and sets it up by itself.
PR target/89316
gcc/ChangeLog:
* config/i386/i386.cc (ix86_expand_split_stack_prologue): Obtain
scratch regno when flag_force_indirect_call is set. On 64-bit
targets, call __morestack_large_model when flag_force_indirect_call
is set and on 32-bit targets with -fpic, manually expand PIC sequence
to call __morestack. Move the function address to an indirect
call scratch register.
gcc/testsuite/ChangeLog:
* g++.target/i386/pr89316.C: New test.
* gcc.target/i386/pr112605-1.c: New test.
* gcc.target/i386/pr112605-2.c: New test.
* gcc.target/i386/pr112605.c: New test.
Juergen Christ [Mon, 20 Nov 2023 08:13:10 +0000 (09:13 +0100)]
s390: implement flags output
Implement flags output for inline assemblies. Only use one output constraint
that captures the whole condition code. No breakout into different condition
codes is allowed. Also, only one condition code variable is allowed.
Add further logic to canonicalize various cases where we combine different
cases of possible condition codes.
Juergen Christ [Mon, 20 Nov 2023 08:12:18 +0000 (09:12 +0100)]
s390: Fix ICE in testcase pr89233
When using GNU vector extensions, an access outside of the vector size
caused an ICE on s390. Fix this by aligning with the vec_extract
builtin, i.e., computing constant index modulo number of lanes.
Di Zhao [Thu, 9 Nov 2023 07:06:37 +0000 (15:06 +0800)]
swap ops in reassoc to reduce cross backedge FMA
Previously for ops.length >= 3, when FMA is present, we don't
rank the operands so that more FMAs can be preserved. But this
brings more FMAs with loop dependency, which lead to worse
performance on some targets.
Rank the oprands (set width=2) when:
1. avoid_fma_max_bits is set.
2. And loop dependent FMA sequence is found.
In this way, we don't have to discard all the FMA candidates
in the bad shaped sequence in widening_mul, instead we can keep
fewer FMAs without loop dependency.
With this patch, there's about 2% improvement in 510.parest_r
1-copy run on ampere1 (with "-Ofast -mcpu=ampere1 -flto
--param avoid-fma-max-bits=512").
PR tree-optimization/110279
gcc/ChangeLog:
* tree-ssa-reassoc.cc (get_reassociation_width): check
for loop dependent FMAs.
(reassociate_bb): For 3 ops, refine the condition to call
swap_ops_for_binary_stmt.
void foo (void *out, void *in, int64_t a, int64_t b)
{
v1024b v = {a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a};
v1024b v2 = {b,b,b,b,b,b,b,b,b,b,b,b,b,b,b,b};
v1024b index = *(v1024b*)in;
v1024b v3 = __builtin_shuffle (v, v2, index);
__riscv_vse64_v_i64m1 (out, (vint64m1_t)v3, 10);
}
Incorrect ASM:
foo:
li a5,31
vsetivli zero,10,e64,m1,ta,mu
vmv.v.x v2,a5
vl1re64.v v1,0(a1)
vmv.v.x v4,a2
vand.vv v1,v1,v2
vmv.v.x v3,a3
vmsgeu.vi v0,v1,16
vrgather.vv v2,v4,v1 --> AVL = VLMAX according to codes.
vadd.vi v1,v1,-16
vrgather.vv v2,v3,v1,v0.t --> AVL = VLMAX according to codes.
vse64.v v2,0(a0) --> AVL = 10 according to codes.
ret
For vrgather dest, source, index instruction, when index may has the value > the following store AVL
that is index value > 10. In this situation, the codes above will end up with:
The source vector of vrgather has undefined value on index >= AVL (which is 10 in this case).
So disable AVL propagation for vrgather instruction.
PR target/112599
PR target/112670
gcc/ChangeLog:
* config/riscv/riscv-avlprop.cc (alv_can_be_propagated_p): New function.
(vlmax_ta_p): Disable vrgather AVL propagation.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr112599-1.c: New test.
Jakub Jelinek [Thu, 23 Nov 2023 11:59:54 +0000 (12:59 +0100)]
expr: Fix &bitint_var handling in initializers [PR112336]
As the following testcase shows, we ICE when trying to emit ADDR_EXPR of
a bitint variable which doesn't have mode width.
The problem is in the EXTEND_BITINT stuff which makes sure we treat the
padding bits on memory reads from user bitint vars as undefined.
When expanding ADDR_EXPR on such vars inside outside of initializers,
expand_expr_addr* uses EXPAND_CONST_ADDRESS modifier and EXTEND_BITINT
does nothing, but in initializers it keeps using EXPAND_INITIALIZER
modifier. So, we need to treat EXPAND_INITIALIZER the same as
EXPAND_CONST_ADDRESS for this regard.
2023-11-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/112336
* expr.cc (EXTEND_BITINT): Don't call reduce_to_bit_field_precision
if modifier is EXPAND_INITIALIZER.
Jonathan Wakely [Wed, 22 Nov 2023 21:24:08 +0000 (21:24 +0000)]
c++: Require C++11 for g++.dg/opt/pr110879.C [PR110879]
The _M_realloc_insert member does not have the trivial relocation
optimization for C++98, which seems to be why the _M_end_of_storage
member does not get optimized away. Make this test unsupported for
C++98.
gcc/testsuite/ChangeLog:
PR libstdc++/110879
* g++.dg/opt/pr110879.C: Require C++11 or later.
Jakub Jelinek [Thu, 23 Nov 2023 09:12:30 +0000 (10:12 +0100)]
c: Add __builtin_stdc_* builtins
As discussed in the
https://sourceware.org/pipermail/libc-alpha/2023-November/152756.html
thread, including e.g.
https://sourceware.org/pipermail/libc-alpha/2023-November/152795.html
patch, while one can use the new __builtin_{clz,ctz,popcount}g builtins
to implement the stdbit.h type-generic macros, there are certain problems
with that implementation if those macros must be usable outside of
function bodies (e.g. int a = sizeof (stdc_bit_floor (0ULL));), must not
evaluate their arguments multiple times and especially for deep stdc_*
macro nesting don't expand the argument more than once. Plus ideally are
usable in constant expressions for all the types if they have constant
arguments. The above second URL satisfies it all but the last two (the
last one satisfies for many of them). While we could get away with just
adding __biultin_stdc_bit_{ceil,floor,width} which are complicated and
2 further extensions (some way to say that __builtin_c{l,t}zg should
imply bit precision of the first argument for the second argument without
using __builtin_popcountg ((__typeof (x)) -1) in there because that
causes another expansion of the macro argument and say __builtin_bit_complement
type-generic builtin which would be like (__typeof (x)) ~(x)), it was decided
we want to implement builtins for all the stdc type-generic macros.
As we are close to running out of 8-bit enum rid (when adding the 14 new
RID_* we have 7 too many), this patch implements those 14 keywords using
a single RID_BUILTIN_STDC and simply in the rare case this is being
parsed check values of 1-2 characters from the builtin names to see which
one it is.
Richard Biener [Thu, 23 Nov 2023 07:54:56 +0000 (08:54 +0100)]
middle-end/32667 - document cpymem and memcpy exact overlap requirement
The following amends the cpymem documentation to mention that exact
overlap needs to be handled gracefully, also noting that the target
runtime is expected to behave the same way where -ffreestanding
docs mention the set of routines required.
PR middle-end/32667
* doc/md.texi (cpymem): Document that exact overlap of source
and destination needs to work.
* doc/standards.texi (ffreestanding): Mention memcpy is required
to handle the exact overlap case.
The following patch implements the user generated static_assert messages next
to string literals.
As I wrote already in the PR, in addition to looking through the paper
I looked at the clang++ testcase for this feature implemented there from
paper's author and on godbolt played with various parts of the testcase
coverage below, and there are some differences between what the patch
implements and what clang++ implements.
The first is that clang++ diagnoses if M.size () or M.data () methods
are present, but aren't constexpr; while the paper introduction talks about
that, the standard wording changes don't seem to require that, all they say
is that those methods need to exist (assuming accessible and the like)
and be implicitly convertible to std::size_t or const char *, but rest is
only if the static assertion fails. If there is intent to change that
wording, the question is how far to go, e.g. while M.size () could be
constexpr, they could e.g. return some class object which wouldn't have
constexpr conversion operator to size_t/const char * and tons of other
reasons why the constant evaluation could fail. Without actually evaluating
it I don't see how we could guarantee anything for non-failed static_assert.
The second difference is that
static_assert (false, "foo"_myd);
in the testcase is normal failed static assertion and
static_assert (true, "foo"_myd);
would be accepted, while clang++ rejects it. IMHO
"foo"_myd doesn't match the syntactic requirements of unevaluated-string
as mentioned in http://eel.is/c++draft/dcl.pre#10 , and because
a constexpr udlit operator can return something which is valid, it shouldn't
be rejected just in case.
Last is clang++ ICEs on non-static data members size/data.
The first version of this support had a difference where M.data () was not
a constant expression but a core constant expression, but if M.size () != 0
M.data ()[0] ... M.data ()[M.size () - 1] were integer constant expressions.
We don't have any routine to test whether an expression is a core constant
expression, so what the code does is try silently whether M.data () is
a constant expression (maybe_constant_value), if it is, nice, we can use
that result to attempt to optimize the extraction of the message from it
if it is some recognized form involving a STRING_CST and just to double-check
try to constant evaluate M.data ()[0] and M.data ()[M.size () - 1] expressions
as boundaries but not anything in between. If M.data () is not a constant
expression, we don't fail, but use a slower method of evaluating M.data ()[i]
for i 0, 1, ... M.size () - 1. And if M.size () == 0, the above wouldn't
evaluate anything, so we try to constant evaluate (M.data (), 0) as constant
expression, which should succeed if M.data () is a core constant expression
and fail otherwise.
The patch assumes that these expressions are manifestly constant evaluated.
The patch implements what I see in the paper, because it is unclear what
further changes will be voted in (and the changes can be done at that
point).
The initial patch used tf_none in 6 spots so that just the static_assert
specific errors were emitted and not others, but during review this has been
changed, so that we emit both the more detailed errors why something wasn't
found or wasn't callable or wasn't convertible and diagnostics that
static_assert second argument needs to satisfy some of the needed properties.
2023-11-23 Jakub Jelinek <jakub@redhat.com>
PR c++/110348
gcc/
* doc/invoke.texi (-Wno-c++26-extensions): Document.
gcc/c-family/
* c.opt (Wc++26-extensions): New option.
* c-cppbuiltin.cc (c_cpp_builtins): For C++26 predefine
__cpp_static_assert to 202306L rather than 201411L.
gcc/cp/
* parser.cc: Implement C++26 P2741R3 - user-generated static_assert
messages.
(cp_parser_static_assert): Parse message argument as
conditional-expression if it is not a pure string literal or
several of them concatenated followed by closing paren.
* semantics.cc (finish_static_assert): Handle message which is not
STRING_CST. For condition with bare parameter packs return early.
* pt.cc (tsubst_expr) <case STATIC_ASSERT>: Also tsubst_expr
message and make sure that if it wasn't originally STRING_CST, it
isn't after tsubst_expr either.
gcc/testsuite/
* g++.dg/cpp26/static_assert1.C: New test.
* g++.dg/cpp26/feat-cxx26.C (__cpp_static_assert): Expect
202306L rather than 201411L.
* g++.dg/cpp0x/udlit-error1.C: Expect different diagnostics for
static_assert with user-defined literal.
Manolis Tsamis [Tue, 21 Nov 2023 09:05:02 +0000 (10:05 +0100)]
ifcvt: remove obsolete SUBREG handling in noce_convert_multiple_sets
This code used to handle SUBREG for register replacement when ifcvt
was doing the replacements manually. This special handling is not
needed anymore because simplify_replace_rtx is used for the
replacements and it properly handles these cases.
gcc/ChangeLog:
* ifcvt.cc (noce_convert_multiple_sets_1): Remove old code.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Pan Li [Mon, 13 Nov 2023 03:22:37 +0000 (11:22 +0800)]
DSE: Allow vector type for get_stored_val when read < store
Update in v4:
* Merge upstream and removed some independent changes.
Update in v3:
* Take known_le instead of known_lt for vector size.
* Return NULL_RTX when gap is not equal 0 and not constant.
Update in v2:
* Move vector type support to get_stored_val.
Original log:
This patch would like to allow the vector mode in the
get_stored_val in the DSE. It is valid for the read
rtx if and only if the read bitsize is less than the
stored bitsize.
Given below example code with
--param=riscv-autovec-preference=fixed-vlmax.
Before this patch:
test:
lui a5,%hi(.LANCHOR0)
addi sp,sp,-32
addi a5,a5,%lo(.LANCHOR0)
li a3,32
vl2re64.v v2,0(a5)
vsetvli zero,a3,e8,m1,ta,ma
vs2r.v v2,0(sp) <== Unnecessary store to stack
vle8.v v1,0(sp) <== Ditto
vs1r.v v1,0(a0)
addi sp,sp,32
jr ra
After this patch:
test:
lui a5,%hi(.LANCHOR0)
addi a5,a5,%lo(.LANCHOR0)
li a4,32
addi sp,sp,-32
vsetvli zero,a4,e8,m1,ta,ma
vle8.v v1,0(a5)
vs1r.v v1,0(a0)
addi sp,sp,32
jr ra
Below tests are passed within this patch:
* The risc-v regression test.
* The x86 bootstrap and regression test.
* The aarch64 regression test.
PR target/111720
gcc/ChangeLog:
* dse.cc (get_stored_val): Allow vector mode if read size is
less than or equal to stored size.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
* gcc.target/riscv/rvv/base/pr111720-9.c: New test.
Costas Argyris [Mon, 20 Nov 2023 17:58:16 +0000 (17:58 +0000)]
mingw: Exclude utf8 manifest [PR111170, PR108865]
Make the utf8 manifest optional (on by default and
explicitly off with --disable-win32-utf8-manifest)
in the mingw hosts.
Also eliminate duplication between the 32-bit and
64-bit mingw hosts by putting them both in the
same branch and special-case only the 64-bit long
long setting.
PR mingw/111170
PR mingw/108865
Signed-off-by: Costas Argyris <costas.argyris@gmail.com> Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/Changelog:
* configure.ac: Handle new --enable-win32-utf8-manifest
option.
* config.host: allow win32 utf8 manifest to be disabled
by user.
* configure: Regenerate.
I made a mistake in the previous change to integer_store_memory_operand.
There is no support pa_emit_move sequence to handle secondary reloads of
integer REG+D instructions. Further, the Q constraint is used for some
non-simple instructions (movb and addib). Thus, we need to return true
when reload is in progress.
2023-11-22 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR target/112617
* config/pa/predicates.md (integer_store_memory_operand): Return
true for REG+D addresses when reload_in_progress is true.
Patrick Palka [Wed, 22 Nov 2023 18:54:29 +0000 (13:54 -0500)]
c++: alias template of non-template class [PR112633]
The entering_scope adjustment in tsubst_aggr_type assumes if an alias is
dependent, then so is the aliased type (and therefore it has template info)
but that's not true for the dependent alias template specialization ty1<T>
below which aliases the non-template class A. In this case no adjustment
is needed anyway, so we can just punt.
PR c++/112633
gcc/cp/ChangeLog:
* pt.cc (tsubst_aggr_type): Handle empty TYPE_TEMPLATE_INFO
in the entering_scope adjustment.
Iain Sandoe [Tue, 21 Nov 2023 10:19:29 +0000 (10:19 +0000)]
testsuite: Update path to intl include.
When we are building libintl in-tree, we need to pass the path
to the generated libintl.h include to the plugin tests. This
path has changed with the use of gettext directly.
gcc/testsuite/ChangeLog:
* lib/plugin-support.exp: Update the expected path to an
in-tree build of libintl.
Iain Sandoe [Thu, 26 Oct 2023 08:52:04 +0000 (09:52 +0100)]
testsuite, Darwin: Add support for Mach-O function body scans.
We need to process the source slightly differently from ELF, especially
in that we have __USER_LABEL_PREFIX__ and there are no function start
and end assembler directives. This means we cannot delineate functions
when frame output is switched off.
TODO: consider adding -mtest-markers or something similar to inject
assembler comments that can be scanned for.
gcc/testsuite/ChangeLog:
* lib/scanasm.exp: Initial handling for Mach-O function body scans.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
Richard Biener [Wed, 22 Nov 2023 10:10:41 +0000 (11:10 +0100)]
tree-optimization/112344 - wrong final value replacement
When performing final value replacement chrec_apply that's used to
compute the overall effect of niters to a CHREC doesn't consider that
the overall increment of { -2147483648, +, 2 } doesn't fit in
a signed integer when the loop iterates until the value of the IV
of 20. The following fixes this mistake, carrying out the multiply
and add in an unsigned type instead, avoiding undefined overflow
and thus later miscompilation by path range analysis.
PR tree-optimization/112344
* tree-chrec.cc (chrec_apply): Perform the overall increment
calculation and increment in an unsigned type.
Andrew Stubbs [Wed, 22 Nov 2023 13:46:12 +0000 (13:46 +0000)]
amdgcn: Fix vector TImode reload loop
I've only observed the problem on the devel/omp/gcc-13 branch, but this
could theoretically affect mainline also. The mov insns for the other modes
already have '$', so this completes the set.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (*mov<mode>_4reg): Disparage AVGPR use when a
reload is required.
[IRA]: Fix using undefined dump file in IRA code during insn scheduling
Part of IRA code is used for register pressure sensitive insn
scheduling and live range shrinkage. Numerous changes of IRA resulted
in that this IRA code uses dump file passed by the scheduler and
internal ira dump file (in called functions) which can be undefined or
freed by the scheduler during compiling previous functions. The patch
fixes this problem. To reproduce the error valgrind should be used
and GCC should be compiled with valgrind annotations. Therefor the
patch does not contain the test case.
gcc/ChangeLog:
PR rtl-optimization/112610
* ira-costs.cc: (find_costs_and_classes): Remove arg.
Use ira_dump_file for printing.
(print_allocno_costs, print_pseudo_costs): Ditto.
(ira_costs): Adjust call of find_costs_and_classes.
(ira_set_pseudo_classes): Set up and restore ira_dump_file.
Florian Weimer [Wed, 22 Nov 2023 13:26:53 +0000 (14:26 +0100)]
gcc.misc-tests/linkage-y.c: Compatibility with C99+ system compilers
This program is compiled with an installed "cc" compiler, not the
built GCC compiler, so it should be as compatible as possible across a
wide range of compilers.
gcc/testsuite/
* gcc.misc-tests/linkage-y.c (puts): Declare.
(main): Add int return type and return 0.
Juzhe-Zhong [Wed, 22 Nov 2023 10:53:22 +0000 (18:53 +0800)]
RISC-V: Fix incorrect use of vcompress in permutation auto-vectorization
This patch fixes following FAILs on zvl512b of RV32 system:
FAIL: gcc.target/riscv/rvv/autovec/struct/struct_vect_run-12.c execution test
FAIL: gcc.target/riscv/rvv/autovec/struct/struct_vect_run-9.c execution test
The root cause is that for permutation indice = {0,3,7,0} use vcompress optimization
which is incorrect. Fix vcompress optimization bug.
The stray line defining enable_darwin_at_rpath outside of the scope of
_LT_DARWIN_LINKER_FEATURES is a mistake and should be removed. It leads
to a wrong line in fixincludes/ChangeLog because there is no $1 argument
at that point.
Christophe Lyon [Wed, 22 Nov 2023 09:50:11 +0000 (09:50 +0000)]
arm: [MVE intrinsics] Fix typo
In commt 0c2037d9d93a8f768cb11698ff794278246bb31f (Add support for
contiguous loads and stores), I added a spurious line which broke
bootstrap because of an unused variable error.
Xi Ruoyao [Sat, 18 Nov 2023 22:12:22 +0000 (06:12 +0800)]
LoongArch: Optimize LSX vector shuffle on floating-point vector
The vec_perm expander was wrongly defined. GCC internal says:
Operand 3 is the “selector”. It is an integral mode vector of the same
width and number of elements as mode M.
But we made operand 3 in the same mode as the shuffled vectors, so it
would be a FP mode vector if the shuffled vectors are FP mode.
With this mistake, the generic code manages to work around and it ends
up creating some very nasty code for a simple __builtin_shuffle (a, b,
c) where a and b are V4SF, c is V4SI:
This is obviously stupid. Fix the expander definition and adjust
loongarch_expand_vec_perm to handle it correctly.
gcc/ChangeLog:
* config/loongarch/lsx.md (vec_perm<mode:LSX>): Make the
selector VIMODE.
* config/loongarch/loongarch.cc (loongarch_expand_vec_perm):
Use the mode of the selector (instead of the shuffled vector)
for truncating it. Operate on subregs in the selector mode if
the shuffled vector has a different mode (i. e. it's a
floating-point vector).
Hongyu Wang [Fri, 17 Nov 2023 07:30:16 +0000 (15:30 +0800)]
[APX PUSH2POP2] Adjust operand order for PUSH2POP2
The push2/pop2 operand order does not match the binutils implementation
for AT&T syntax that it will first push operands[2] then operands[1].
Correct it by reverse operand order for AT&T syntax.
gcc/ChangeLog:
* config/i386/i386.md (push2_di): Adjust operand order for AT&T
syntax.
(pop2_di): Likewise.
(push2p_di): Likewise.
(pop2p_di): Likewise.
Juzhe-Zhong [Wed, 22 Nov 2023 03:27:52 +0000 (11:27 +0800)]
RISC-V: Fix permutation indice mode bug
This patch fixes following FAILs on zvl512b:
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-16.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-16.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-17.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-17.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-5.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-5.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-6.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/slp_run-6.c execution test
The root cause is that we are using vrgather.vv on vector QI mode which
is incorrect for zvl512b since it exceed 256.
Jason Merrill [Tue, 21 Nov 2023 23:22:53 +0000 (18:22 -0500)]
c++: start_preparsed_function tweak
In review of the deducing 'this' patch, it came up that the logic in
start_preparsed_function around the ctype variable was convoluted, being
set for non-static member functions and friends, but not for static member
functions. Let's set it for any member function, and not rely on it to
decide whether to set up 'this'.
where the expression is ((1 << R3) + R10), which does not match a valid
machine addressing mode. Consequently `print_operand_address' chokes.
This can be reduced to the testcase included, where it triggers the same
ICE in `p'. Preincrements are required so that their results land in
registers and consequently an indexed addressing mode is tried or
otherwise doing operations piecemeal on stack-based function arguments
as direct input operands turns out more profitable in terms of RTX costs
and the ICE is avoided.
The ultimate cause has been commit c605a8bf9270 ("VAX: Accept ASHIFT in
address expressions"), where a shift of an immediate value by a register
has been mistakenly allowed as an index expression as if the shift
operation was commutative such as multiplication is. So with ASHIFT the
scaler in an index expression has to be the right-hand operand, and the
backend has to enforce that, whereas with MULT the scaler can be either
operand.
Fix this by only accepting the index scaler as the RHS operand to
ASHIFT.
gcc/
PR target/111815
* config/vax/vax.cc (index_term_p): Only accept the index scaler
as the RHS operand to ASHIFT.
gcc/testsuite/
PR target/111815
* gcc.dg/torture/pr111815.c: New test.
RISC-V/testsuite: Add branchless cases for FP NE cond-add operation
Verify, for the generic floating-point NE conditional-add operation,
that if-conversion triggers via `noce_try_addcc' at `-mbranch-cost=3'
setting, which makes branchless code sequences emitted by if-conversion
cheaper than their original branched equivalents, and that extraneous
instructions such as SNEZ, etc. are not present in output.
The reason to XFAIL the SImode test for RV64 targets is GCC thinks it
has to sign-extend addends, which causes if-conversion to give up.
gcc/testsuite/
* gcc.target/riscv/adddifne.c: New test.
* gcc.target/riscv/addsifne.c: New test.
RISC-V/testsuite: Add branched cases for FP NE cond-add operation
Verify, for the generic floating-point NE conditional-add operation,
that if-conversion does *not* trigger at `-mbranch-cost=2' setting,
which makes original branched code sequences cheaper than their
branchless equivalents if-conversion would emit.
gcc/testsuite/
* gcc.target/riscv/adddibfne.c: New test.
* gcc.target/riscv/addsibfne.c: New test.
RISC-V/testsuite: Add branched cases for FP NE cond-move operations
Verify, for the floating-point NE conditional-move operation, that
if-conversion triggers via `noce_try_cmove' at the respective
sufficiently high `-mbranch-cost=' settings that make branchless code
sequences produced by if-conversion cheaper than their original branched
equivalents, and that extraneous instructions such as SNEZ, etc. are not
present in output.
gcc/testsuite/
* gcc.target/riscv/movdifeq-sfb.c: New test.
* gcc.target/riscv/movdifeq-thead.c: New test.
* gcc.target/riscv/movdifeq-ventana.c: New test.
* gcc.target/riscv/movdifeq-zicond.c: New test.
* gcc.target/riscv/movdifeq.c: New test.
* gcc.target/riscv/movsifeq-sfb.c: New test.
* gcc.target/riscv/movsifeq-thead.c: New test.
* gcc.target/riscv/movsifeq-ventana.c: New test.
* gcc.target/riscv/movsifeq-zicond.c: New test.
* gcc.target/riscv/movsifeq.c: New test.
RISC-V/testsuite: Add branched cases for FP NE cond-move operations
Verify, for generic, Ventana and Zicond targets and the floating-point
NE conditional-move operation, that if-conversion does *not* trigger at
the respective sufficiently low `-mbranch-cost=' settings that make
original branched code sequences cheaper than their branchless
equivalents if-conversion would emit.
gcc/testsuite/
* gcc.target/riscv/movdibfeq-ventana.c: New test.
* gcc.target/riscv/movdibfeq-zicond.c: New test.
* gcc.target/riscv/movdibfeq.c: New test.
* gcc.target/riscv/movsibfeq-ventana.c: New test.
* gcc.target/riscv/movsibfeq-zicond.c: New test.
* gcc.target/riscv/movsibfeq.c: New test.
RISC-V: Handle FP NE operator via inversion in cond-operation expansion
We have no FNE.fmt machine instructions, but we can emulate them for the
purpose of conditional-move and conditional-add operations by using the
respective FEQ.fmt instruction and then swapping the data input operands
or complementing the mask for the conditional addend respectively, so
update our handlers accordingly.
gcc/
* config/riscv/riscv-protos.h (riscv_expand_float_scc): Add
`invert_ptr' parameter.
* config/riscv/riscv.cc (riscv_emit_float_compare): Add NE
inversion handling.
(riscv_expand_float_scc): Pass `invert_ptr' through to
`riscv_emit_float_compare'.
(riscv_expand_conditional_move): Pass `&invert' to
`riscv_expand_float_scc'.
* config/riscv/riscv.md (add<mode>cc): Likewise.
RISC-V/testsuite: Add branchless cases for generic FP cond adds
Verify, for generic floating-point conditional-add operations that have
a corresponding conditional-set machine instruction, that if-conversion
triggers via `noce_try_addcc' at `-mbranch-cost=3' setting, which makes
branchless code sequences emitted by if-conversion cheaper than their
original branched equivalents, and that extraneous instructions such as
SNEZ, etc. are not present in output.
The reason to XFAIL SImode tests for RV64 targets is the compiler thinks
it has to sign-extend addends, which causes if-conversion to give up.
gcc/testsuite/
* gcc.target/riscv/adddifeq.c: New test.
* gcc.target/riscv/adddifge.c: New test.
* gcc.target/riscv/adddifgt.c: New test.
* gcc.target/riscv/adddifle.c: New test.
* gcc.target/riscv/adddiflt.c: New test.
* gcc.target/riscv/addsifeq.c: New test.
* gcc.target/riscv/addsifge.c: New test.
* gcc.target/riscv/addsifgt.c: New test.
* gcc.target/riscv/addsifle.c: New test.
* gcc.target/riscv/addsiflt.c: New test.
RISC-V/testsuite: Add branched cases for generic FP cond adds
Verify, for generic floating-point conditional-add operations that have
a corresponding conditional-set machine instruction, that if-conversion
does *not* trigger at `-mbranch-cost=2' setting, which makes original
branched code sequences cheaper than their branchless equivalents
if-conversion would emit. Cover all the relevant floating-point
relational operations to make sure no corner case escapes.
gcc/testsuite/
* gcc.target/riscv/adddibfeq.c: New test.
* gcc.target/riscv/adddibfge.c: New test.
* gcc.target/riscv/adddibfgt.c: New test.
* gcc.target/riscv/adddibfle.c: New test.
* gcc.target/riscv/adddibflt.c: New test.
* gcc.target/riscv/addsibfeq.c: New test.
* gcc.target/riscv/addsibfge.c: New test.
* gcc.target/riscv/addsibfgt.c: New test.
* gcc.target/riscv/addsibfle.c: New test.
* gcc.target/riscv/addsibflt.c: New test.
RISC-V/testsuite: Add branchless cases for generic FP cond moves
Verify, for generic floating-point conditional-move operations that have
a corresponding conditional-set machine instruction, that if-conversion
triggers (via `cond_move_convert_if_block', which doesn't report) at
`-mbranch-cost=5' setting, which makes branchless code sequences emitted
by if-conversion cheaper than their original branched equivalents, and
that extraneous instructions such as SNEZ, etc. are not present in
output.
gcc/testsuite/
* gcc.target/riscv/movdifge.c: New test.
* gcc.target/riscv/movdifgt.c: New test.
* gcc.target/riscv/movdifle.c: New test.
* gcc.target/riscv/movdiflt.c: New test.
* gcc.target/riscv/movdifne.c: New test.
* gcc.target/riscv/movsifge.c: New test.
* gcc.target/riscv/movsifgt.c: New test.
* gcc.target/riscv/movsifle.c: New test.
* gcc.target/riscv/movsiflt.c: New test.
* gcc.target/riscv/movsifne.c: New test.
RISC-V/testsuite: Add branched cases for generic FP cond moves
Verify, for generic floating-point conditional-move operations that have
a corresponding conditional-set machine instruction, that if-conversion
does *not* trigger at `-mbranch-cost=4' setting, which makes original
branched code sequences cheaper than their branchless equivalents
if-conversion would emit. Cover all the relevant floating-point
relational operations to make sure no corner case escapes.
gcc/testsuite/
* gcc.target/riscv/movdibfge.c: New test.
* gcc.target/riscv/movdibfgt.c: New test.
* gcc.target/riscv/movdibfle.c: New test.
* gcc.target/riscv/movdibflt.c: New test.
* gcc.target/riscv/movdibfne.c: New test.
* gcc.target/riscv/movsibfge.c: New test.
* gcc.target/riscv/movsibfgt.c: New test.
* gcc.target/riscv/movsibfle.c: New test.
* gcc.target/riscv/movsibflt.c: New test.
* gcc.target/riscv/movsibfne.c: New test.
RISC-V: Avoid extraneous integer comparison for FP comparisons
We have floating-point coditional-set machine instructions for a subset
of FP comparisons, so avoid going through a comparison against constant
zero in `riscv_expand_float_scc' where not necessary, preventing an
extraneous RTL instruction from being produced that counts against the
cost of the replacement branchless code sequence in if-conversion, e.g.:
lowering the cost of the code sequence produced (even though combine
would swallow the second insn anyway).
We still need to produce a comparison against constant zero where the
instruction following a floating-point coditional-set operation is a
branch, so add canonicalization to `riscv_expand_conditional_branch'
instead.
gcc/
* config/riscv/riscv.cc (riscv_emit_float_compare) <NE>: Handle
separately.
<EQ, LE, LT, GE, GT>: Return operands supplied as is.
(riscv_emit_binary): Call `riscv_emit_binary' directly rather
than going through a temporary register for word-mode targets.
(riscv_expand_conditional_branch): Canonicalize the comparison
if not against constant zero.
RISC-V: Provide FP conditional-branch instructions for if-conversion
Do not expand floating-point conditional-branch RTL instructions right
away that use a comparison operation that is either directly available
as a machine conditional-set instruction or is NE, which can be emulated
by EQ. This is so that if-conversion sees them in their original form
and can produce fewer operations tried in a branchless code sequence
compared to when such an instruction has been already converted to a
sequence of a floating-point conditional-set RTL instruction followed by
an integer conditional-branch RTL instruction. Split any floating-point
conditional-branch RTL instructions still remaining after reload then.
Adjust the testsuite accordingly: since the middle end uses the inverse
condition internally, an inverse conditional-set instruction may make it
to assembly output and also `cond_move_process_if_block' will be used by
if-conversion rather than `noce_process_if_block', because the latter
function not yet been updated to handle inverted conditions.
gcc/
* config/riscv/predicates.md (ne_operator): New predicate.
* config/riscv/riscv.cc (riscv_insn_cost): Handle branches on a
floating-point condition.
* config/riscv/riscv.md (@cbranch<mode>4): Rename expander to...
(@cbranch<ANYF:mode>4): ... this. Only expand the RTX via
`riscv_expand_conditional_branch' for `!signed_order_operator'
operators, otherwise let it through.
(*cbranch<ANYF:mode>4, *cbranch<ANYF:mode>4): New insns and
splitters.
RISC-V: Also allow FP conditions in `riscv_expand_conditional_move'
In `riscv_expand_conditional_move' we only let integer conditions
through at the moment, even though code has already been prepared to
handle floating-point conditions as well.
Lift this restriction and only bail out if a non-word-mode integer
condition has been requested, as we cannot handle this specific case
owing to machine instruction set restriction. We already take care of
the non-integer, non-floating-point case later on.
gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Don't
bail out in floating-point conditions.
RISC-V: Only use SUBREG if applicable in `riscv_expand_float_scc'
A subsequent change to enable the processing of conditional moves on a
floating-point condition by `riscv_expand_conditional_move' will cause
`riscv_expand_float_scc' to be called for word-mode target RTX with RV64
targets. In that case an invalid insn such as:
would be produced, which would crash the compiler later on. Since the
output operand of the SET operation to be produced already has the same
mode as the input operand does, just omit the use of SUBREG and assign
directly.
gcc/
* config/riscv/riscv.cc (riscv_expand_float_scc): Suppress the
use of SUBREG if the conditional-set target is word-mode.
RISC-V/testsuite: Add branchless cases for generic integer cond adds
Verify, for generic integer conditional-add operations, if-conversion
to trigger via `noce_try_addcc' at the respective sufficiently high
`-mbranch-cost=' settings that make branchless code sequences produced
by if-conversion cheaper than their original branched equivalents, and,
where applicable, that extraneous instructions such as SNEZ, etc. are
not present in output. Cover all integer relational operations to make
sure no corner case escapes.
The reason to XFAIL SImode tests for RV64 targets is the compiler thinks
it has to sign-extend addends, which causes if-conversion to give up.
gcc/testsuite/
* gcc.target/riscv/adddieq.c: New test.
* gcc.target/riscv/adddige.c: New test.
* gcc.target/riscv/adddigeu.c: New test.
* gcc.target/riscv/adddigt.c: New test.
* gcc.target/riscv/adddigtu.c: New test.
* gcc.target/riscv/adddile.c: New test.
* gcc.target/riscv/adddileu.c: New test.
* gcc.target/riscv/adddilt.c: New test.
* gcc.target/riscv/adddiltu.c: New test.
* gcc.target/riscv/adddine.c: New test.
* gcc.target/riscv/addsieq.c: New test.
* gcc.target/riscv/addsige.c: New test.
* gcc.target/riscv/addsigeu.c: New test.
* gcc.target/riscv/addsigt.c: New test.
* gcc.target/riscv/addsigtu.c: New test.
* gcc.target/riscv/addsile.c: New test.
* gcc.target/riscv/addsileu.c: New test.
* gcc.target/riscv/addsilt.c: New test.
* gcc.target/riscv/addsiltu.c: New test.
* gcc.target/riscv/addsine.c: New test.
RISC-V/testsuite: Add branched cases for generic integer cond adds
Verify, for generic integer conditional-add operations, if-conversion
*not* to trigger at the respective sufficiently low `-mbranch-cost='
settings that make original branched code sequences cheaper than their
branchless equivalents if-conversion would emit. Cover all integer
relational operations to make sure no corner case escapes.
gcc/testsuite/
* gcc.target/riscv/adddibeq.c: New test.
* gcc.target/riscv/adddibge.c: New test.
* gcc.target/riscv/adddibgeu.c: New test.
* gcc.target/riscv/adddibgt.c: New test.
* gcc.target/riscv/adddibgtu.c: New test.
* gcc.target/riscv/adddible.c: New test.
* gcc.target/riscv/adddibleu.c: New test.
* gcc.target/riscv/adddiblt.c: New test.
* gcc.target/riscv/adddibltu.c: New test.
* gcc.target/riscv/adddibne.c: New test.
* gcc.target/riscv/addsibeq.c: New test.
* gcc.target/riscv/addsibge.c: New test.
* gcc.target/riscv/addsibgeu.c: New test.
* gcc.target/riscv/addsibgt.c: New test.
* gcc.target/riscv/addsibgtu.c: New test.
* gcc.target/riscv/addsible.c: New test.
* gcc.target/riscv/addsibleu.c: New test.
* gcc.target/riscv/addsiblt.c: New test.
* gcc.target/riscv/addsibltu.c: New test.
* gcc.target/riscv/addsibne.c: New test.
RISC-V: Add `addMODEcc' implementation for generic targets
Provide RTL expansion of conditional-add operations for generic targets
using a suitable sequence of base integer machine instructions according
to cost evaluation by if-conversion. Use existing `-mmovcc' command
line option to enable this transformation.
gcc/
* config/riscv/riscv.md (add<mode>cc): New expander.
RISC-V/testsuite: Add branchless cases for generic integer cond moves
Verify, for generic integer conditional-move operations, if-conversion
to trigger via `noce_try_cmove' at the respective sufficiently high
`-mbranch-cost=' settings that make branchless code sequences produced
by if-conversion cheaper than their original branched equivalents, and,
where applicable, that extraneous instructions such as SNEZ, etc. are
not present in output. Cover all integer relational operations to make
sure no corner case escapes.
gcc/testsuite/
* gcc.target/riscv/movdieq.c: New test.
* gcc.target/riscv/movdige.c: New test.
* gcc.target/riscv/movdigeu.c: New test.
* gcc.target/riscv/movdigt.c: New test.
* gcc.target/riscv/movdigtu.c: New test.
* gcc.target/riscv/movdile.c: New test.
* gcc.target/riscv/movdileu.c: New test.
* gcc.target/riscv/movdilt.c: New test.
* gcc.target/riscv/movdiltu.c: New test.
* gcc.target/riscv/movdine.c: New test.
* gcc.target/riscv/movsieq.c: New test.
* gcc.target/riscv/movsige.c: New test.
* gcc.target/riscv/movsigeu.c: New test.
* gcc.target/riscv/movsigt.c: New test.
* gcc.target/riscv/movsigtu.c: New test.
* gcc.target/riscv/movsile.c: New test.
* gcc.target/riscv/movsileu.c: New test.
* gcc.target/riscv/movsilt.c: New test.
* gcc.target/riscv/movsiltu.c: New test.
* gcc.target/riscv/movsine.c: New test.
RISC-V/testsuite: Add branched cases for generic integer cond moves
Verify, for generic integer conditional-move operations, if-conversion
*not* to trigger at the respective sufficiently low `-mbranch-cost='
settings that make original branched code sequences cheaper than their
branchless equivalents if-conversion would emit. Cover all integer
relational operations to make sure no corner case escapes.
gcc/testsuite/
* gcc.target/riscv/movdibeq.c: New test.
* gcc.target/riscv/movdibge.c: New test.
* gcc.target/riscv/movdibgeu.c: New test.
* gcc.target/riscv/movdibgt.c: New test.
* gcc.target/riscv/movdibgtu.c: New test.
* gcc.target/riscv/movdible.c: New test.
* gcc.target/riscv/movdibleu.c: New test.
* gcc.target/riscv/movdiblt.c: New test.
* gcc.target/riscv/movdibltu.c: New test.
* gcc.target/riscv/movdibne.c: New test.
* gcc.target/riscv/movsibeq.c: New test.
* gcc.target/riscv/movsibge.c: New test.
* gcc.target/riscv/movsibgeu.c: New test.
* gcc.target/riscv/movsibgt.c: New test.
* gcc.target/riscv/movsibgtu.c: New test.
* gcc.target/riscv/movsible.c: New test.
* gcc.target/riscv/movsibleu.c: New test.
* gcc.target/riscv/movsiblt.c: New test.
* gcc.target/riscv/movsibltu.c: New test.
* gcc.target/riscv/movsibne.c: New test.
RISC-V: Add `movMODEcc' implementation for generic targets
Provide RTL expansion of conditional-move operations for generic targets
using a suitable sequence of base integer machine instructions according
to cost evaluation by if-conversion. Add `-mmovcc' command line option
to enable this transformation, off by default.
For the generic sequences small immediates as per the `arith_operand'
predicate are cost-equivalent to registers as we can use them as input,
alternative to a register, to the respective AND[I] machine operations,
however we need to reject immediates fulfilling `lui_operand', because
they would require reloading into a register, making the operation more
costly. Therefore add `movcc_operand' predicate and use it accordingly.
There is a need to adjust zbs-bext-02.c, which can also serve as emitted
code example, because with certain compilation options an AND operation
can now legitimately appear in output despite BEXT having been produced
as expected, such as with `-march=rv64gc -O2':
foo:
mv a3,a0
li a5,0
mv a0,a1
li a2,64
li a1,1
.L3:
sll a4,a1,a5
and a4,a4,a3
addiw a5,a5,1
beq a4,zero,.L2
addiw a0,a0,1
.L2:
bne a5,a2,.L3
ret
vs `-march=rv64gc_zbs -O2':
foo:
mv a4,a0
li a5,0
mv a0,a1
li a3,64
.L3:
bext a2,a4,a5
beq a2,zero,.L2
addiw a0,a0,1
.L2:
addiw a5,a5,1
bne a5,a3,.L3
ret
and then with `-march=rv64gc -mmovcc -mbranch-cost=7':
foo:
mv a6,a0
li a4,0
mv a0,a1
li a7,1
li a1,64
.L3:
sll a5,a7,a4
and a5,a5,a6
snez a5,a5
neg a5,a5
not a2,a5
addiw a3,a0,1
and a5,a5,a3
and a0,a2,a0
addiw a4,a4,1
or a0,a5,a0
bne a4,a1,.L3
ret
vs `-march=rv64gc_zbs -mmovcc -mbranch-cost=7':
foo:
mv a6,a0
li a4,0
mv a0,a1
li a1,64
.L3:
bext a5,a6,a4
neg a5,a5
not a2,a5
addiw a3,a0,1
and a5,a5,a3
and a0,a2,a0
addiw a4,a4,1
or a0,a5,a0
bne a4,a1,.L3
ret
However BEXT is supposed to replace an SLL operation so adjust the test
case to reject SLL rather than AND, letting the test case pass even with
`/-mmovcc/-mbranch-cost=7' specified as DejaGNU test flags (and in the
absence of target-specific conditional-move operations enabled either by
default or with other test flags).
gcc/
* config/riscv/predicates.md (movcc_operand): New predicate.
* config/riscv/riscv.cc (riscv_expand_conditional_move): Handle
generic targets.
* config/riscv/riscv.md (mov<mode>cc): Likewise.
* config/riscv/riscv.opt (mmovcc): New option.
* doc/invoke.texi (Option Summary): Document it.
gcc/testsuite/
* gcc.target/riscv/zbs-bext-02.c: Adjust to reject SLL rather
than AND.
RISC-V/testsuite: Add branchless cases for T-Head non-equality cond moves
Verify, for T-Head targets and the non-equality integer conditional-move
operations, that if-conversion triggers via `noce_try_cmove' at
`-mbranch-cost=2' setting, which makes branchless code sequences
produced by if-conversion cheaper than their original branched
equivalents, and that extraneous instructions such as SNEZ, etc. are not
present in output.
gcc/testsuite/
* gcc.target/riscv/movdige-thead.c: New test.
* gcc.target/riscv/movdigeu-thead.c: New test.
* gcc.target/riscv/movdigt-thead.c: New test.
* gcc.target/riscv/movdigtu-thead.c: New test.
* gcc.target/riscv/movdile-thead.c: New test.
* gcc.target/riscv/movdileu-thead.c: New test.
* gcc.target/riscv/movdilt-thead.c: New test.
* gcc.target/riscv/movdiltu-thead.c: New test.
* gcc.target/riscv/movsige-thead.c: New test.
* gcc.target/riscv/movsigeu-thead.c: New test.
* gcc.target/riscv/movsigt-thead.c: New test.
* gcc.target/riscv/movsigtu-thead.c: New test.
* gcc.target/riscv/movsile-thead.c: New test.
* gcc.target/riscv/movsileu-thead.c: New test.
* gcc.target/riscv/movsilt-thead.c: New test.
* gcc.target/riscv/movsiltu-thead.c: New test.
RISC-V/testsuite: Add branched cases for T-Head non-equality cond moves
Verify, for T-Head targets and the non-equality integer conditional-move
operations, that if-conversion does *not* trigger at `-mbranch-cost=1'
setting, which makes original branched code sequences cheaper than their
branchless equivalents if-conversion would emit.
gcc/testsuite/
* gcc.target/riscv/movdibge-thead.c: New test.
* gcc.target/riscv/movdibgeu-thead.c: New test.
* gcc.target/riscv/movdibgt-thead.c: New test.
* gcc.target/riscv/movdibgtu-thead.c: New test.
* gcc.target/riscv/movdible-thead.c: New test.
* gcc.target/riscv/movdibleu-thead.c: New test.
* gcc.target/riscv/movdiblt-thead.c: New test.
* gcc.target/riscv/movdibltu-thead.c: New test.
* gcc.target/riscv/movsibge-thead.c: New test.
* gcc.target/riscv/movsibgeu-thead.c: New test.
* gcc.target/riscv/movsibgt-thead.c: New test.
* gcc.target/riscv/movsibgtu-thead.c: New test.
* gcc.target/riscv/movsible-thead.c: New test.
* gcc.target/riscv/movsibleu-thead.c: New test.
* gcc.target/riscv/movsiblt-thead.c: New test.
* gcc.target/riscv/movsibltu-thead.c: New test.
Code in `riscv_expand_conditional_move' for Ventana and Zicond targets
seems like bolted on as an afterthought rather than properly merged so
as to handle all the cases together.
Fold the existing code pieces together then (observing that for short
forward branch targets no integer comparisons need to be canonicalized),
letting T-Head targets produce branchless sequences for all the integer
comparisons rather than for equality ones only, and preparing for the
handling of floating-point comparisons here across all conditional-move
targets.
gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Unify
conditional-move handling across all the relevant targets.
RISC-V: Also accept constants for T-Head cond-move data input operands
There is no need for the requirement for conditional-move data input
operands to be stricter for T-Head targets than for short forward branch
targets and limit them to registers only. They are keyed according to
the `sfb_alu_operand' predicate, which lets certain constants through.
Such constants are already forced into a register for the `cons' operand
in the analogous short forward branch case and we can force them for the
`alt' operand and T-Head as well. This enables more opportunities for a
branchless sequence to be produced.
gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Also
accept constants for T-Head data input operands.
RISC-V: Also accept constants for T-Head cond-move comparison operands
There is no need for the requirement for conditional-move comparison
operands to be stricter for T-Head targets than for other targets and
limit them to registers only. Constants will be reloaded if required
just as with branches or other-target conditional-move operations and
there is no extra overhead specific to the T-Head case. This enables
more opportunities for a branchless sequence to be produced.
gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Also
accept constants for T-Head comparison operands.
RISC-V/testsuite: Add branchless cases for equality cond-move operations
Verify, for Ventana and Zicond targets and the equality conditional-move
operations, that if-conversion triggers via `noce_try_cmove' at the
respective sufficiently high `-mbranch-cost=' settings that make
branchless code sequences produced by if-conversion cheaper than their
original branched equivalents, and that extraneous instructions such as
SNEZ, etc. are not present in output.
gcc/testsuite/
* gcc.target/riscv/movdieq-ventana.c: New test.
* gcc.target/riscv/movdieq-zicond.c: New test.
* gcc.target/riscv/movdine-ventana.c: New test.
* gcc.target/riscv/movdine-zicond.c: New test.
* gcc.target/riscv/movsieq-ventana.c: New test.
* gcc.target/riscv/movsieq-zicond.c: New test.
* gcc.target/riscv/movsine-ventana.c: New test.
* gcc.target/riscv/movsine-zicond.c: New test.
RISC-V/testsuite: Add branched cases for equality cond-move operations
Verify, for Ventana and Zicond targets and the equality conditional-move
operations, that if-conversion does *not* trigger at the respective
sufficiently low `-mbranch-cost=' settings that make original branched
code sequences cheaper than their branchless equivalents if-conversion
would emit.
gcc/testsuite/
* gcc.target/riscv/movdibeq-ventana.c: New test.
* gcc.target/riscv/movdibeq-zicond.c: New test.
* gcc.target/riscv/movdibne-ventana.c: New test.
* gcc.target/riscv/movdibne-zicond.c: New test.
* gcc.target/riscv/movsibeq-ventana.c: New test.
* gcc.target/riscv/movsibeq-zicond.c: New test.
* gcc.target/riscv/movsibne-ventana.c: New test.
* gcc.target/riscv/movsibne-zicond.c: New test.
RISC-V: Avoid extraneous EQ or NE operation in cond-move expansion
In the non-zero case there is no need for the conditional value used by
Ventana and Zicond integer conditional operations to be specifically 1.
Regardless we canonicalize it by producing an extraneous conditional-set
operation, such as with the sequence below:
where insn 23 can well be removed without changing the semantics of the
sequence. This is actually fixed up later on by combine and the insn
does not make it to output meaning no SNEZ (or SEQZ in the reverse case)
appears in the assembly produced, however it counts towards the cost of
the sequence calculated by if-conversion, raising the trigger level for
the branchless sequence to be chosen. Arguably to emit this extraneous
operation it can be also considered rather sloppy of our backend's.
Remove the check for operand 1 being constant 0 in the Ventana/Zicond
case for equality comparisons then, observing that `riscv_zero_if_equal'
called via `riscv_emit_int_compare' will canonicalize the comparison if
required, removing the extraneous insn from output:
Adjust branch costs across the test cases affected accordingly.
gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Remove
the check for operand 1 being constant 0 in the Ventana/Zicond
case for equality comparisons.
RISC-V/testsuite: Add branchless cases for GEU and LEU cond-move operations
Verify, for Ventana and Zicond targets and the GEU and LEU
conditional-move operations, that if-conversion triggers via
`noce_try_cmove' at `-mbranch-cost=4' setting, which makes branchless
code sequences produced by if-conversion cheaper than their original
branched equivalents, and that extraneous instructions such as SEQZ,
etc. are not present in output.
gcc/testsuite/
* gcc.target/riscv/movdigtu-ventana.c: New test.
* gcc.target/riscv/movdigtu-zicond.c: New test.
* gcc.target/riscv/movdiltu-ventana.c: New test.
* gcc.target/riscv/movdiltu-zicond.c: New test.
* gcc.target/riscv/movsigtu-ventana.c: New test.
* gcc.target/riscv/movsigtu-zicond.c: New test.
* gcc.target/riscv/movsiltu-ventana.c: New test.
* gcc.target/riscv/movsiltu-zicond.c: New test.
RISC-V/testsuite: Add branched cases for GEU and LEU cond-move operations
Verify, for Ventana and Zicond targets and the GEU and LEU
conditional-move operations, that if-conversion does *not* trigger at
`-mbranch-cost=3' setting, which makes original branched code sequences
cheaper than their branchless equivalents if-conversion would emit.
gcc/testsuite/
* gcc.target/riscv/movdibgtu-ventana.c: New test.
* gcc.target/riscv/movdibgtu-zicond.c: New test.
* gcc.target/riscv/movdibltu-ventana.c: New test.
* gcc.target/riscv/movdibltu-zicond.c: New test.
* gcc.target/riscv/movsibgtu-ventana.c: New test.
* gcc.target/riscv/movsibgtu-zicond.c: New test.
* gcc.target/riscv/movsibltu-ventana.c: New test.
* gcc.target/riscv/movsibltu-zicond.c: New test.
RISC-V: Also invert the cond-move condition for GEU and LEU
Update `riscv_expand_conditional_move' and handle the missing GEU and
LEU operators there, avoiding an extraneous conditional set operation,
such as with this output:
sgtu a0,a0,a1
seqz a1,a0
czero.eqz a3,a3,a1
czero.nez a1,a2,a1
or a0,a1,a3
produced when optimizing for Zicond targets from:
int
movsigtu (int w, int x, int y, int z)
{
return w > x ? y : z;
}
These operators can be inverted producing optimal code such as this:
sgtu a1,a0,a1
czero.nez a3,a3,a1
czero.eqz a1,a2,a1
or a0,a1,a3
which this change causes to happen.
gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Also
invert the condition for GEU and LEU.
RISC-V/testsuite: Add branchless cases for FP cond-move operations
Verify, for short forward branch, T-Head, Ventana and Zicond targets and
the ordered floating-point conditional-move operations that already work
as expected, that if-conversion triggers via `noce_try_cmove' at the
respective sufficiently high `-mbranch-cost=' settings that make
branchless code sequences produced by if-conversion cheaper than their
original branched equivalents, and that extraneous instructions such as
SNEZ, etc. are not present in output. Cover all ordered floating-point
relational operations to make sure no corner case escapes.
gcc/testsuite/
* gcc.target/riscv/movdifge-sfb.c: New test.
* gcc.target/riscv/movdifge-thead.c: New test.
* gcc.target/riscv/movdifge-ventana.c: New test.
* gcc.target/riscv/movdifge-zicond.c: New test.
* gcc.target/riscv/movdifgt-sfb.c: New test.
* gcc.target/riscv/movdifgt-thead.c: New test.
* gcc.target/riscv/movdifgt-ventana.c: New test.
* gcc.target/riscv/movdifgt-zicond.c: New test.
* gcc.target/riscv/movdifle-sfb.c: New test.
* gcc.target/riscv/movdifle-thead.c: New test.
* gcc.target/riscv/movdifle-ventana.c: New test.
* gcc.target/riscv/movdifle-zicond.c: New test.
* gcc.target/riscv/movdiflt-sfb.c: New test.
* gcc.target/riscv/movdiflt-thead.c: New test.
* gcc.target/riscv/movdiflt-ventana.c: New test.
* gcc.target/riscv/movdiflt-zicond.c: New test.
* gcc.target/riscv/movdifne-sfb.c: New test.
* gcc.target/riscv/movdifne-thead.c: New test.
* gcc.target/riscv/movdifne-ventana.c: New test.
* gcc.target/riscv/movdifne-zicond.c: New test.
* gcc.target/riscv/movsifge-sfb.c: New test.
* gcc.target/riscv/movsifge-thead.c: New test.
* gcc.target/riscv/movsifge-ventana.c: New test.
* gcc.target/riscv/movsifge-zicond.c: New test.
* gcc.target/riscv/movsifgt-sfb.c: New test.
* gcc.target/riscv/movsifgt-thead.c: New test.
* gcc.target/riscv/movsifgt-ventana.c: New test.
* gcc.target/riscv/movsifgt-zicond.c: New test.
* gcc.target/riscv/movsifle-sfb.c: New test.
* gcc.target/riscv/movsifle-thead.c: New test.
* gcc.target/riscv/movsifle-ventana.c: New test.
* gcc.target/riscv/movsifle-zicond.c: New test.
* gcc.target/riscv/movsiflt-sfb.c: New test.
* gcc.target/riscv/movsiflt-thead.c: New test.
* gcc.target/riscv/movsiflt-ventana.c: New test.
* gcc.target/riscv/movsiflt-zicond.c: New test.
* gcc.target/riscv/movsifne-sfb.c: New test.
* gcc.target/riscv/movsifne-thead.c: New test.
* gcc.target/riscv/movsifne-ventana.c: New test.
* gcc.target/riscv/movsifne-zicond.c: New test.
RISC-V/testsuite: Add branched cases for FP cond-move operations
Verify, for Ventana and Zicond targets and the ordered floating-point
conditional-move operations that already work as expected, that
if-conversion does *not* trigger at `-mbranch-cost=2' setting, which
makes original branched code sequences cheaper than their branchless
equivalents if-conversion would emit. Cover all ordered floating-point
relational operations to make sure no corner case escapes.
gcc/testsuite/
* gcc.target/riscv/movdibfge-ventana.c: New test.
* gcc.target/riscv/movdibfge-zicond.c: New test.
* gcc.target/riscv/movdibfgt-ventana.c: New test.
* gcc.target/riscv/movdibfgt-zicond.c: New test.
* gcc.target/riscv/movdibfle-ventana.c: New test.
* gcc.target/riscv/movdibfle-zicond.c: New test.
* gcc.target/riscv/movdibflt-ventana.c: New test.
* gcc.target/riscv/movdibflt-zicond.c: New test.
* gcc.target/riscv/movdibfne-ventana.c: New test.
* gcc.target/riscv/movdibfne-zicond.c: New test.
* gcc.target/riscv/movsibfge-ventana.c: New test.
* gcc.target/riscv/movsibfge-zicond.c: New test.
* gcc.target/riscv/movsibfgt-ventana.c: New test.
* gcc.target/riscv/movsibfgt-zicond.c: New test.
* gcc.target/riscv/movsibfle-ventana.c: New test.
* gcc.target/riscv/movsibfle-zicond.c: New test.
* gcc.target/riscv/movsibflt-ventana.c: New test.
* gcc.target/riscv/movsibflt-zicond.c: New test.
* gcc.target/riscv/movsibfne-ventana.c: New test.
* gcc.target/riscv/movsibfne-zicond.c: New test.
RISC-V/testsuite: Add branchless cases for integer cond-move operations
Verify, for T-Head, Ventana and Zicond targets and the integer
conditional-move operations that already work as expected, if-conversion
to trigger via `noce_try_cmove' at the respective sufficiently high
`-mbranch-cost=' settings that make branchless code sequences produced
by if-conversion cheaper than their original branched equivalents, and
that extraneous instructions such as SNEZ, etc. are not present in
output. Cover all integer relational operations to make sure no corner
case escapes.
gcc/testsuite/
* gcc.target/riscv/movdieq-thead.c: New test.
* gcc.target/riscv/movdige-ventana.c: New test.
* gcc.target/riscv/movdige-zicond.c: New test.
* gcc.target/riscv/movdigeu-ventana.c: New test.
* gcc.target/riscv/movdigeu-zicond.c: New test.
* gcc.target/riscv/movdigt-ventana.c: New test.
* gcc.target/riscv/movdigt-zicond.c: New test.
* gcc.target/riscv/movdile-ventana.c: New test.
* gcc.target/riscv/movdile-zicond.c: New test.
* gcc.target/riscv/movdileu-ventana.c: New test.
* gcc.target/riscv/movdileu-zicond.c: New test.
* gcc.target/riscv/movdilt-ventana.c: New test.
* gcc.target/riscv/movdilt-zicond.c: New test.
* gcc.target/riscv/movdine-thead.c: New test.
* gcc.target/riscv/movsieq-thead.c: New test.
* gcc.target/riscv/movsige-ventana.c: New test.
* gcc.target/riscv/movsige-zicond.c: New test.
* gcc.target/riscv/movsigeu-ventana.c: New test.
* gcc.target/riscv/movsigeu-zicond.c: New test.
* gcc.target/riscv/movsigt-ventana.c: New test.
* gcc.target/riscv/movsigt-zicond.c: New test.
* gcc.target/riscv/movsile-ventana.c: New test.
* gcc.target/riscv/movsile-zicond.c: New test.
* gcc.target/riscv/movsileu-ventana.c: New test.
* gcc.target/riscv/movsileu-zicond.c: New test.
* gcc.target/riscv/movsilt-ventana.c: New test.
* gcc.target/riscv/movsilt-zicond.c: New test.
* gcc.target/riscv/movsine-thead.c: New test.
RISC-V/testsuite: Add branched cases for integer cond-move operations
Verify, for T-Head, Ventana and Zicond targets and the integer
conditional-move operations that already work as expected, that
if-conversion does *not* trigger at the respective sufficiently low
`-mbranch-cost=' settings that make original branched code sequences
cheaper than their branchless equivalents if-conversion would emit.
Cover all integer relational operations to make sure no corner case
escapes.
The reason to XFAIL movdibne-thead.c and movsibne-thead.c is the
branchless T-Head sequence:
sub a1,a0,a1
th.mveqz a2,a3,a1
mv a0,a2
ret
produced rather than its original branched counterpart:
beq a0,a1,.L3
mv a0,a2
ret
.L3:
mv a0,a3
ret
at `-mbranch-cost=1', even though under this setting the latter sequence
is obviously cheaper performance-wise. This is because the final move
instruction in the branchless sequence is not counted towards its cost
and consequently the cost of both sequences works out at 8 each, making
if-conversion prefer the branchless variant. Use the XFAIL mark to keep
track of these cases for future consideration.
gcc/testsuite/
* gcc.target/riscv/movdibeq-thead.c: New test.
* gcc.target/riscv/movdibge-ventana.c: New test.
* gcc.target/riscv/movdibge-zicond.c: New test.
* gcc.target/riscv/movdibgeu-ventana.c: New test.
* gcc.target/riscv/movdibgeu-zicond.c: New test.
* gcc.target/riscv/movdibgt-ventana.c: New test.
* gcc.target/riscv/movdibgt-zicond.c: New test.
* gcc.target/riscv/movdible-ventana.c: New test.
* gcc.target/riscv/movdible-zicond.c: New test.
* gcc.target/riscv/movdibleu-ventana.c: New test.
* gcc.target/riscv/movdibleu-zicond.c: New test.
* gcc.target/riscv/movdiblt-ventana.c: New test.
* gcc.target/riscv/movdiblt-zicond.c: New test.
* gcc.target/riscv/movdibne-thead.c: New test.
* gcc.target/riscv/movsibeq-thead.c: New test.
* gcc.target/riscv/movsibge-ventana.c: New test.
* gcc.target/riscv/movsibge-zicond.c: New test.
* gcc.target/riscv/movsibgeu-ventana.c: New test.
* gcc.target/riscv/movsibgeu-zicond.c: New test.
* gcc.target/riscv/movsibgt-ventana.c: New test.
* gcc.target/riscv/movsibgt-zicond.c: New test.
* gcc.target/riscv/movsible-ventana.c: New test.
* gcc.target/riscv/movsible-zicond.c: New test.
* gcc.target/riscv/movsibleu-ventana.c: New test.
* gcc.target/riscv/movsibleu-zicond.c: New test.
* gcc.target/riscv/movsiblt-ventana.c: New test.
* gcc.target/riscv/movsiblt-zicond.c: New test.
* gcc.target/riscv/movsibne-thead.c: New test.
RISC-V: Rework branch costing model for if-conversion
The generic branch costing model for if-conversion assumes a fixed cost
of COSTS_N_INSNS (2) for a conditional branch, and that one half of that
cost comes from a preceding condition-set instruction, such as with
MODE_CC targets, and then the other half of that cost is for the actual
branch instruction. This is hardcoded for `if_info.original_cost' in
`noce_find_if_block' and regardless of the cost set for branches via
BRANCH_COST.
Then `default_max_noce_ifcvt_seq_cost' instructs if-conversion to prefer
a branchless sequence as costly as high as triple the BRANCH_COST value
set. This is apparently to make up for the inability to accurately
guess the branch penalty.
Consequently for the BRANCH_COST of 3 we commonly set for tuning,
if-conversion will consider branchless sequences costing 3 * 3 - 2 = 7
instruction units more than a corresponding branch sequence. For the
BRANCH_COST of 4 such as with `sifive-7-series' tuning this is even
worse, at 3 * 4 - 2 = 10. Effectively it means a branchless sequence
will always be chosen if available, even a very inefficient one.
Rework the branch costing model to better match our architecture,
observing in particular that we have no preparatory instructions for
branches so that the cost of a branch is naked BRANCH_COST plus any
extra overhead the processing of a branch's source RTX might incur.
Provide TARGET_INSN_COST and TARGET_MAX_NOCE_IFCVT_SEQ_COST handlers
than that return suitable cost based on BRANCH_COST. The latter hook
usually returns a value that is lower than the cost of the corresponding
branched sequence. This is because we don't really want to produce a
branchless sequence that is more expensive than the original branched
sequence. If this turns out too conservative for some corner case, then
this choice might be revisited.
Then we don't want to fiddle with `noce_find_if_block' without a lot of
cross-target verification, so add TARGET_NOCE_CONVERSION_PROFITABLE_P
defined such that it subtracts the fixed COSTS_N_INSNS (2) cost from the
cost of the original branched sequence supplied and instead adds actual
branch cost calculated from the conditional branch instruction used. It
is then further tweaked according to simple analysis of the replacement
branchless sequence produced so as to cancel the cost of an extraneous
zero extend operation produced by `noce_try_store_flag_mask' as observed
with gcc/testsuite/gcc.target/riscv/pr105314.c.
Tweak the testsuite accordingly and set `-mbranch-cost=' explicitly for
the relevant cases so that the expected if-conversion transformation is
made regardless of the default BRANCH_COST value of tuning in effect.
Some of these settings will be lowered later on as deficiencies in
branchless sequence generation have been fixed that lower their cost
calculated by if-conversion.
gcc/
* config/riscv/riscv.cc (riscv_insn_cost): New function.
(riscv_max_noce_ifcvt_seq_cost): Likewise.
(riscv_noce_conversion_profitable_p): Likewise.
(TARGET_INSN_COST): New macro.
(TARGET_MAX_NOCE_IFCVT_SEQ_COST): New macro.
(TARGET_NOCE_CONVERSION_PROFITABLE_P): New macro.
RISC-V: Fix `mode' usage in `riscv_expand_conditional_move'
In `riscv_expand_conditional_move' `mode' is initialized right away from
`GET_MODE (dest)', so remove needless references that refrain from using
the local variable.
gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Use
`mode' for `GET_MODE (dest)' throughout.
RISC-V: Sanitise NEED_EQ_NE_P case with `riscv_emit_int_compare'
For the NEED_EQ_NE_P `riscv_emit_int_compare' is documented to only emit
EQ or NE comparisons against zero, however it does not catch incorrect
use where a non-equality comparison has been requested and falls through
to the general case then. Add a safety guard to catch such a case then.
Arguably the NEED_EQ_NE_P case would best be moved into a function of
its own, but let's leave it for a separate cleanup.
gcc/
* config/riscv/riscv.cc (riscv_emit_int_compare): Bail out if
NEED_EQ_NE_P but the comparison is neither EQ nor NE.
RISC-V/testsuite: Add cases for integer SFB cond-move operations
Verify, for short forward branch targets and the conditional-move
operations that already work as expected, that if-conversion triggers
via `noce_try_cmove' already at `-mbranch-cost=1' and that extraneous
instructions such as SNEZ, etc. are not present in output. Cover all
integer relational operations to make sure no corner case escapes.
gcc/testsuite/
* gcc.target/riscv/movdieq-sfb.c: New test.
* gcc.target/riscv/movdige-sfb.c: New test.
* gcc.target/riscv/movdigeu-sfb.c: New test.
* gcc.target/riscv/movdigt-sfb.c: New test.
* gcc.target/riscv/movdigtu-sfb.c: New test.
* gcc.target/riscv/movdile-sfb.c: New test.
* gcc.target/riscv/movdileu-sfb.c: New test.
* gcc.target/riscv/movdilt-sfb.c: New test.
* gcc.target/riscv/movdiltu-sfb.c: New test.
* gcc.target/riscv/movdine-sfb.c: New test.
* gcc.target/riscv/movsieq-sfb.c: New test.
* gcc.target/riscv/movsige-sfb.c: New test.
* gcc.target/riscv/movsigeu-sfb.c: New test.
* gcc.target/riscv/movsigt-sfb.c: New test.
* gcc.target/riscv/movsigtu-sfb.c: New test.
* gcc.target/riscv/movsile-sfb.c: New test.
* gcc.target/riscv/movsileu-sfb.c: New test.
* gcc.target/riscv/movsilt-sfb.c: New test.
* gcc.target/riscv/movsiltu-sfb.c: New test.
* gcc.target/riscv/movsine-sfb.c: New test.
testsuite: Add cases for conditional-move and conditional-add operations
Add generic execution tests for expressions that are expected to expand
to conditional-move and conditional-add operations where supported. To
ensure no corner case escapes all relational operators are extensively
covered for integer comparisons and all ordered operators are covered
for floating-point comparisons. Unordered operators are not covered at
this point as they'd require a different input data set.
gcc/testsuite/
* gcc.dg/torture/addieq.c: New test.
* gcc.dg/torture/addifeq.c: New test.
* gcc.dg/torture/addifge.c: New test.
* gcc.dg/torture/addifgt.c: New test.
* gcc.dg/torture/addifle.c: New test.
* gcc.dg/torture/addiflt.c: New test.
* gcc.dg/torture/addifne.c: New test.
* gcc.dg/torture/addige.c: New test.
* gcc.dg/torture/addigeu.c: New test.
* gcc.dg/torture/addigt.c: New test.
* gcc.dg/torture/addigtu.c: New test.
* gcc.dg/torture/addile.c: New test.
* gcc.dg/torture/addileu.c: New test.
* gcc.dg/torture/addilt.c: New test.
* gcc.dg/torture/addiltu.c: New test.
* gcc.dg/torture/addine.c: New test.
* gcc.dg/torture/addleq.c: New test.
* gcc.dg/torture/addlfeq.c: New test.
* gcc.dg/torture/addlfge.c: New test.
* gcc.dg/torture/addlfgt.c: New test.
* gcc.dg/torture/addlfle.c: New test.
* gcc.dg/torture/addlflt.c: New test.
* gcc.dg/torture/addlfne.c: New test.
* gcc.dg/torture/addlge.c: New test.
* gcc.dg/torture/addlgeu.c: New test.
* gcc.dg/torture/addlgt.c: New test.
* gcc.dg/torture/addlgtu.c: New test.
* gcc.dg/torture/addlle.c: New test.
* gcc.dg/torture/addlleu.c: New test.
* gcc.dg/torture/addllt.c: New test.
* gcc.dg/torture/addlltu.c: New test.
* gcc.dg/torture/addlne.c: New test.
* gcc.dg/torture/movieq.c: New test.
* gcc.dg/torture/movifeq.c: New test.
* gcc.dg/torture/movifge.c: New test.
* gcc.dg/torture/movifgt.c: New test.
* gcc.dg/torture/movifle.c: New test.
* gcc.dg/torture/moviflt.c: New test.
* gcc.dg/torture/movifne.c: New test.
* gcc.dg/torture/movige.c: New test.
* gcc.dg/torture/movigeu.c: New test.
* gcc.dg/torture/movigt.c: New test.
* gcc.dg/torture/movigtu.c: New test.
* gcc.dg/torture/movile.c: New test.
* gcc.dg/torture/movileu.c: New test.
* gcc.dg/torture/movilt.c: New test.
* gcc.dg/torture/moviltu.c: New test.
* gcc.dg/torture/movine.c: New test.
* gcc.dg/torture/movleq.c: New test.
* gcc.dg/torture/movlfeq.c: New test.
* gcc.dg/torture/movlfge.c: New test.
* gcc.dg/torture/movlfgt.c: New test.
* gcc.dg/torture/movlfle.c: New test.
* gcc.dg/torture/movlflt.c: New test.
* gcc.dg/torture/movlfne.c: New test.
* gcc.dg/torture/movlge.c: New test.
* gcc.dg/torture/movlgeu.c: New test.
* gcc.dg/torture/movlgt.c: New test.
* gcc.dg/torture/movlgtu.c: New test.
* gcc.dg/torture/movlle.c: New test.
* gcc.dg/torture/movlleu.c: New test.
* gcc.dg/torture/movllt.c: New test.
* gcc.dg/torture/movlltu.c: New test.
* gcc.dg/torture/movlne.c: New test.
Robin Dapp [Tue, 21 Nov 2023 11:51:12 +0000 (12:51 +0100)]
vect: Allow reduc_index != 1 for COND_OPs.
In PR112406 Tamar found another problem with COND_OP reductions.
I wrongly assumed that the reduction variable will always remain in
operand 1, just as we create the COND_OP in ifcvt. But of course,
addition being commutative, we are free to swap operand 1 and 2 and we
end up with e.g.
Robin Dapp [Mon, 20 Nov 2023 15:19:46 +0000 (16:19 +0100)]
RISC-V: testsuite: Add rv64 requirement for bug-9 and bug-14.
This adds an effective target requirement to compile the tests. Since
we disabled 64-bit indices on rv32 targets those tests should be
unsupported on rv32.
Patrick O'Neill [Thu, 2 Nov 2023 17:20:43 +0000 (10:20 -0700)]
gfortran: Rely on dg-do-what-default to avoid running pr85853.f90, pr107254.f90 and vect-alias-check-1.F90 on non-vector targets
Testcases in gfortran.dg/vect/vect.exp rely on
check_vect_support_and_set_flags to set dg-do-what-default and avoid
running vector tests on non-vector targets. The three testcases in this
patch overwrite the default with dg-do run which causes issues
for non-vector targets.
Removing the dg-do run directive resolves this issue for non-vector
targets (while still running the tests on vector targets).
Jonathan Wakely [Tue, 21 Nov 2023 11:49:22 +0000 (11:49 +0000)]
libstdc++: Do not declare strtok for C++26 freestanding (P2937R0)
This was recently approved for C++26.
We should define the __cpp_lib_freestanding_cstring macro in <string.h>
as well as <cstring>, but we do not currently install our own <string.h>
for most targets.
libstdc++-v3/ChangeLog:
* include/bits/version.def (freestanding_cstring): Add.
* include/bits/version.h: Regenerate.
* include/c_compatibility/string.h (strtok): Do not declare for
C++26 freestanding.
* include/c_global/cstring (strtok): Likewise.
* testsuite/21_strings/headers/cstring/version.cc: New test.
Jonathan Wakely [Mon, 20 Nov 2023 21:39:58 +0000 (21:39 +0000)]
libstdc++: Add freestanding feature test macros (P2407R5)
This C++26 change makes several classes "partially freestanding", but we
already fully supported them in freestanding mode. All we need to do is
define the new feature test macros and add tests for them.
libstdc++-v3/ChangeLog:
* include/bits/version.def (freestanding_algorithm)
(freestanding_array, freestanding_optional)
(freestanding_string_view, freestanding_variant): Add.
* include/bits/version.h: Regenerate.
* include/std/algorithm (__glibcxx_want_freestanding_algorithm):
Define.
* include/std/array (__glibcxx_want_freestanding_array):
Define.
* include/std/optional (__glibcxx_want_freestanding_optional):
Define.
* include/std/string_view
(__glibcxx_want_freestanding_string_view): Define.
* include/std/variant (__glibcxx_want_freestanding_variant):
Define.
* testsuite/20_util/optional/version.cc: Add checks for
__cpp_lib_freestanding_optional.
* testsuite/20_util/variant/version.cc: Add checks for
__cpp_lib_freestanding_variant.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.
* testsuite/21_strings/basic_string_view/requirements/version.cc:
New test.
* testsuite/23_containers/array/requirements/version.cc: New
test.
* testsuite/25_algorithms/fill_n/requirements/version.cc: New
test.
* testsuite/25_algorithms/swap_ranges/requirements/version.cc:
New test.
Jonathan Wakely [Sat, 18 Nov 2023 21:07:47 +0000 (21:07 +0000)]
libstdc++: Add std::span::at for C++26 (P2821R5)
Also define the new feature test macros from P2833R2, indicating that
std::span and std::expected are supported for freestanding mode.
libstdc++-v3/ChangeLog:
* include/bits/version.def (freestanding_expected): New macro.
(span): Add C++26 value.
* include/bits/version.h: Regenerate.
* include/std/expected (__glibcxx_want_freestanding_expected):
Define.
* include/std/span (span::at): New member function.
* testsuite/20_util/expected/version.cc: Add checks for
__cpp_lib_freestanding_expected.
* testsuite/23_containers/span/2.cc: Moved to...
* testsuite/23_containers/span/version.cc: ...here. Add checks
for __cpp_lib_span in <span> as well as in <version>.
* testsuite/23_containers/span/1.cc: Removed.
* testsuite/23_containers/span/at.cc: New test.
Jonathan Wakely [Sat, 18 Nov 2023 21:09:53 +0000 (21:09 +0000)]
libstdc++: Fix std::tr2::dynamic_bitset support for alternate characters
libstdc++-v3/ChangeLog:
* include/tr2/dynamic_bitset (dynamic_bitset): Pass zero and one
characters to _M_copy_from_string.
* testsuite/tr2/dynamic_bitset/string.cc: New test.
This patch adds a target-independent aligned_register_operand
predicate, for use with register constraints that use filters
to impose an alignment. The definition deliberately jetisons
some of the historical baggage in general_operand.
gcc/
* common.md (aligned_register_operand): New predicate.