git.ipfire.org Git - thirdparty/gcc.git/log

libstdc++: Prevent unwanted ADL in std::to_array [PR111512]

As noted in PR c++/111512, GCC does ADL for __builtin_memcpy if it is
unqualified, which can cause errors for template argument types which
cannot be completed.

Casting the memcpy arguments to void* prevents ADL from considering the
problem type.
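
A minimal sketch of the idea, not the actual libstdc++ code: copy_to_array
is a hypothetical stand-in for the trivially copyable path of std::to_array.
Both pointer arguments are converted to void*, which has no associated
namespaces or classes, so argument-dependent lookup has nothing to inspect
and never needs to complete the element type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper for illustration only; not the libstdc++ implementation.
template <typename T, std::size_t N>
std::array<T, N> copy_to_array(const T (&src)[N])
{
    static_assert(std::is_trivially_copyable_v<T>,
                  "the memcpy path is only valid for trivially copyable types");
    std::array<T, N> dst;
    // __builtin_memcpy is a GCC/Clang builtin and is called unqualified here,
    // as in the commit. The void* casts keep T out of the argument types.
    __builtin_memcpy(static_cast<void*>(dst.data()),
                     static_cast<const void*>(src), sizeof(src));
    return dst;
}

// Usage example: copy a built-in array and add up the copied elements.
inline int sum_of_copied_array()
{
    const int values[3] = {1, 2, 3};
    const std::array<int, 3> copy = copy_to_array(values);
    assert(copy.size() == 3);
    return copy[0] + copy[1] + copy[2];
}
```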

libstdc++-v3/ChangeLog:

PR libstdc++/111511
PR c++/111512
* include/std/array (to_array): Cast memcpy arguments to void*.
* testsuite/23_containers/array/creation/111512.cc: New test.

libstdc++: Define C++23 std::forward_like (P2445R1)

libstdc++-v3/ChangeLog:

* include/bits/move.h (forward_like): Define for C++23.
* include/bits/version.def (forward_like): Define.
* include/bits/version.h: Regenerate.
* include/std/utility (__glibcxx_want_forward_like): Define.
* testsuite/20_util/forward_like/1.cc: New test.
* testsuite/20_util/forward_like/2_neg.cc: New test.
* testsuite/20_util/forward_like/version.cc: New test.
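
For context, the behaviour of std::forward_like can be sketched in C++17
as below. forward_like_sketch and Owner are hypothetical names used only
for illustration; this is not the code added to bits/move.h. The function
casts its argument so that it takes the const-ness and value category of
the model type T.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical C++17 re-implementation for illustration; the real function
// is std::forward_like, declared in <utility> since C++23.
template <typename T, typename U>
constexpr auto&& forward_like_sketch(U&& x) noexcept
{
    constexpr bool adds_const = std::is_const_v<std::remove_reference_t<T>>;
    if constexpr (std::is_lvalue_reference_v<T&&>) {
        if constexpr (adds_const)
            return std::as_const(x);            // const lvalue
        else
            return static_cast<U&>(x);          // mutable lvalue
    } else {
        if constexpr (adds_const)
            return std::move(std::as_const(x)); // const rvalue
        else
            return std::move(x);                // mutable rvalue
    }
}

struct Owner { int value = 0; };  // hypothetical "model" type for T

inline void forward_like_demo()
{
    int v = 42;
    static_assert(std::is_same_v<decltype(forward_like_sketch<Owner&>(v)), int&>);
    static_assert(std::is_same_v<decltype(forward_like_sketch<const Owner&>(v)), const int&>);
    static_assert(std::is_same_v<decltype(forward_like_sketch<Owner>(v)), int&&>);
    static_assert(std::is_same_v<decltype(forward_like_sketch<const Owner>(v)), const int&&>);
    assert(forward_like_sketch<const Owner&>(v) == 42);
}
```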

LoongArch: doc: Update -m[no-]explicit-relocs for r14-4160

gcc/ChangeLog:

* doc/invoke.texi: Update -m[no-]explicit-relocs for r14-4160.

Fix PR 110386: backprop vs ABSU_EXPR

The issue here is that when backprop tries to go
and strip sign ops, it skips over ABSU_EXPR but
ABSU_EXPR not only does an ABS, it also changes the
type to unsigned.
Since strip_sign_op_1 is only supposed to strip off
sign changing operands and not ones that change types,
removing ABSU_EXPR here is correct. We don't handle
nop conversions so this does not cause any missed optimizations either.
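
A scalar sketch of why ABSU_EXPR differs from a pure sign operation.
absu_model is a hypothetical name, and the sketch assumes a 32-bit int. The
result lives in the unsigned type, so handing the operand back in place of
the ABSU_EXPR would put a signed value where an unsigned one is expected.

```cpp
#include <cassert>
#include <climits>

// Hypothetical scalar model of ABSU_EXPR: signed input, magnitude returned
// in the corresponding unsigned type (so even INT_MIN has a valid result).
inline unsigned absu_model(int x)
{
    const unsigned ux = static_cast<unsigned>(x);
    return x < 0 ? 0u - ux : ux;
}

inline void absu_demo()
{
    assert(absu_model(5) == 5u);
    assert(absu_model(-5) == 5u);
    // The magnitude of INT_MIN does not fit in int, but it fits in unsigned.
    assert(absu_model(INT_MIN) == static_cast<unsigned>(INT_MAX) + 1u);
    // A sign-insensitive user such as squaring gives the same bits either
    // way, yet the operand types still differ (unsigned versus int).
    const int x = -7;
    assert(absu_model(x) * absu_model(x)
           == static_cast<unsigned>(x) * static_cast<unsigned>(x));
}
```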

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/110386

gcc/ChangeLog:

* gimple-ssa-backprop.cc (strip_sign_op_1): Remove ABSU_EXPR.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr110386-1.c: New test.
* gcc.c-torture/compile/pr110386-2.c: New test.

RISC-V: Fix AVL/VL bug of VSETVL PASS[PR111548]

This patch fixes an incorrect fetch of the AVL/VL register in the VSETVL pass.

C/C++ regression passed.

gfortran has not been run yet; I am still looking for a way to run it.

I will commit this once the Fortran regression tests pass.

PR target/111548

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (earliest_pred_can_be_fused_p): Bugfix

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr111548.c: New test.

rs6000: Skip empty inline asm in rs6000_update_ipa_fn_target_info [PR111366]

PR111366 exposes one possible improvement in function
rs6000_update_ipa_fn_target_info: skip an empty inline asm
string, since an empty string cannot make use of any
hardware features (so far HTM).

Since this rs6000_update_ipa_fn_target_info related approach
exists in GCC12 and later, the affected project highway has
updated its target pragma with ",htm", see the link:
https://github.com/google/highway/commit/15e63d61eb535f478bc
I'd not bother to consider an inline asm parser for now but
will file a separate PR for further enhancement.

PR target/111366

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_update_ipa_fn_target_info): Skip
empty inline asm.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr111366.C: New test.

rs6000: Use default target option node for callee by default [PR111380]

As PR111380 (and the discussion in related PRs) shows, the way
rs6000_can_inline_p currently treats a callee without any
target option node is wrong.  It assumes such a callee is
always safe to inline, but the callee's target flags actually
come from the command line options
(target_option_default_node).  Those flags may not satisfy
the inlining condition, yet the callee is still inlined,
with unexpected consequences.

As the associated test case pr111380-1.c shows, the caller
main has the power8 target attribute, but the callee foo is
compiled for power9 from the command line.  Inlining foo into
main is unexpected, since foo can contain something that
requires power9 capability.  Without this patch, LTO builds
(with -flto) report an error (as LTO forces the callee to have
a target option node), but non-LTO builds inline foo
unexpectedly.

This patch makes the callee adopt target_option_default_node
when it doesn't have a target option node, which avoids the
wrong inlining decision and fixes the inconsistency between
LTO and non-LTO.  It also aligns with what the other ports do.
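
The decision described above can be modelled roughly as follows. This is a
simplified sketch, not the code in rs6000.cc: IsaFlags, kPower8, kPower9 and
can_inline_model are hypothetical, and the subset rule is an assumption
standing in for the real checks.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

using IsaFlags = std::uint32_t;        // hypothetical bitmask of ISA features
constexpr IsaFlags kPower8 = 1u << 0;  // hypothetical flag values
constexpr IsaFlags kPower9 = 1u << 1;

// Hypothetical model: the callee may be inlined only if every feature it may
// use is also enabled in the caller. A callee without its own option node
// falls back to the command-line defaults instead of counting as always safe.
inline bool can_inline_model(IsaFlags caller,
                             std::optional<IsaFlags> callee_node,
                             IsaFlags command_line_default)
{
    const IsaFlags callee = callee_node.value_or(command_line_default);
    return (callee & ~caller) == 0;
}

inline void inline_model_demo()
{
    const IsaFlags command_line = kPower8 | kPower9;  // compiled for power9
    const IsaFlags caller_main = kPower8;             // main restricted to power8
    // foo has no option node, so it inherits the power9 command-line flags
    // and must not be inlined into the power8 caller.
    assert(!can_inline_model(caller_main, std::nullopt, command_line));
    // A callee explicitly limited to power8 can still be inlined.
    assert(can_inline_model(caller_main, kPower8, command_line));
}
```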

PR target/111380

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_can_inline_p): Adopt
target_option_default_node when the callee has no option
attributes, also simplify the existing code accordingly.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr111380-1.c: New test.
* gcc.target/powerpc/pr111380-2.c: New test.

LoongArch: Optimizations of vector construction.

gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_vecinit_merge_<LASX:mode>): New
pattern for vector construction.
(vec_set<mode>_internal): Ditto.
(lasx_xvinsgr2vr_<mode256_i_half>_internal): Ditto.
(lasx_xvilvl_<lasxfmt_f>_internal): Ditto.
* config/loongarch/loongarch.cc (loongarch_expand_vector_init):
Optimized the implementation of vector construction.
(loongarch_expand_vector_init_same): New function.
* config/loongarch/lsx.md (lsx_vilvl_<lsxfmt_f>_internal): New
pattern for vector construction.
(lsx_vreplvei_mirror_<lsxfmt_f>): New pattern for vector
construction.
(vec_concatv2df): Ditto.
(vec_concatv4sf): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c: New test.

Daily bump.

RISC-V: Fix fortran ICE/PR111546 when RV32 vec_init

When broadcasting the repeated element, we take the mask_int_mode
by mistake. This patch fixes it by using the machine mode of the
element instead.

This fixes the following test case on RV32.

* gcc/testsuite/gfortran.dg/overload_5.f90

PR target/111546

gcc/ChangeLog:

* config/riscv/riscv-v.cc
(expand_vector_init_merge_repeating_sequence): Bugfix

Signed-off-by: Pan Li <pan2.li@intel.com>

Fortran: Pad mismatched charlens in component initializers [PR68155]

2023-09-24 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/68155
* decl.cc (fix_initializer_charlen): New function broken out of
add_init_expr_to_sym.
(add_init_expr_to_sym, build_struct): Call the new function.

PR fortran/111271
* trans-expr.cc (gfc_conv_intrinsic_to_class): Remove repeated
condition.

gcc/testsuite/
PR fortran/68155
* gfortran.dg/pr68155.f90: New test.

MATCH: Add `(X & ~Y) & Y` and `(X | ~Y) | Y`

Even though this gets optimized by reassociation, catching it more often
will always be better.

Note the reason why I didn't add `(X ^ ~Y) ^ Y` is that it gets caught
by preferring `~(X ^ Y)` to `(X ^ ~Y)`, which is then caught by
the existing pattern for `(X ^ Y) ^ Y`.
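
The identities behind these patterns can be checked exhaustively on 8-bit
values. The sketch below only verifies the arithmetic facts ((X & ~Y) & Y is
zero, (X | ~Y) | Y is all ones, (X ^ ~Y) ^ Y is ~X); it does not show how
match.pd expresses them, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if all three identities hold for every pair of 8-bit values.
inline bool bitops_identities_hold()
{
    for (unsigned xi = 0; xi < 256; ++xi) {
        for (unsigned yi = 0; yi < 256; ++yi) {
            const auto x = static_cast<std::uint8_t>(xi);
            const auto y = static_cast<std::uint8_t>(yi);
            // Operands promote to int, so cast back to 8 bits after each step.
            const auto not_y = static_cast<std::uint8_t>(~y);
            const auto not_x = static_cast<std::uint8_t>(~x);
            const auto and_result = static_cast<std::uint8_t>((x & not_y) & y);
            const auto or_result = static_cast<std::uint8_t>((x | not_y) | y);
            const auto xor_result = static_cast<std::uint8_t>((x ^ not_y) ^ y);
            if (and_result != 0x00 || or_result != 0xFF || xor_result != not_x)
                return false;
        }
    }
    return true;
}
```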

PR tree-optimization/111543

gcc/ChangeLog:

* match.pd (`(X & ~Y) & Y`, `(X | ~Y) | Y`): New patterns.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-4.c: New test.

Daily bump.

RISC-V: Full coverage VLS combine support

Add full coverage VLS combine support.

Committed.

gcc/ChangeLog:

* config/riscv/autovec-opt.md: Extend VLS modes
* config/riscv/vector-iterators.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h:
* gcc.target/riscv/rvv/autovec/vls/cond_convert-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-9.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_copysign-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_mulh-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_narrow-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_narrow-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wadd-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wadd-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wadd-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wadd-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wfma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wfma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wfms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wfnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wmul-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wmul-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wmul-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wsub-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wsub-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wsub-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wsub-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/narrow-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/narrow-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/narrow-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wred-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wred-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wred-3.c: New test.

fortran: error recovery on duplicate declaration of class variable [PR95710]

gcc/fortran/ChangeLog:

PR fortran/95710
* class.cc (gfc_build_class_symbol): Do not try to build class
container for invalid typespec.
* resolve.cc (resolve_fl_var_and_proc): Prevent NULL pointer
dereference.
(resolve_symbol): Likewise.

gcc/testsuite/ChangeLog:

PR fortran/95710
* gfortran.dg/pr95710.f90: New test.

d: Merge upstream dmd, druntime 4574d1728d, phobos d7e79f024.

D front-end changes:

- Import dmd v2.105.0.
- Catch clause must take only `const' or mutable exceptions.
- Creating a `scope' class instance with a non-scope constructor
is now `@system' only with `-fpreview=dip1000'.
- Global `const' variables can no longer be initialized from a
non-shared static constructor

D runtime changes:

- Import druntime v2.105.0.

Phobos changes:

- Import phobos v2.105.0.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 4574d1728d.
* dmd/VERSION: Bump version to v2.105.0.
* d-diagnostic.cc (verror): Remove.
(verrorSupplemental): Remove.
(vwarning): Remove.
(vwarningSupplemental): Remove.
(vdeprecation): Remove.
(vdeprecationSupplemental): Remove.
(vmessage): Remove.
(vtip): Remove.
(verrorReport): New function.
(verrorReportSupplemental): New function.
* d-lang.cc (d_parse_file): Update for new front-end interface.
* decl.cc (d_mangle_decl): Update for new front-end interface.
* intrinsics.cc (maybe_set_intrinsic): Update for new front-end
interface.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 4574d1728d.
* src/MERGE: Merge upstream phobos d7e79f024.

testsuite: Add new test for already fixed PR111455

The following testcase has been fixed by r14-4231.

2023-09-23 Jakub Jelinek <jakub@redhat.com>

PR c++/111455
* g++.dg/ext/integer-pack8.C: New test.

RISC-V: Add VLS unary combine patterns

gcc/ChangeLog:

* config/riscv/autovec-opt.md: Add VLS modes for conditional ABS/SQRT.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/cond_abs-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_sqrt-1.c: New test.

RISC-V: Support FP floor auto-vectorization

This patch would like to support auto-vectorization for the
floor API in math.h. It depends on the -ffast-math option.

When we would like to call floor/floorf like v2 = floor (v1), we will
convert it into the insns below (following the LLVM implementation).

* vfcvt.x.f v3, v1, RDN
* vfcvt.f.x v2, v3

However, a floating-point value does not need the conversions above if
it has no fractional part. Consider the single-precision values below.

  +-----------+---------------+-------------+
  | raw float | binary layout | after floor |
  +-----------+---------------+-------------+
  | 8388607.5 | 0x4affffff    | 8388607.0   |
  | 8388608.0 | 0x4b000000    | 8388608.0   |
  | 8388609.0 | 0x4b000001    | 8388609.0   |
  +-----------+---------------+-------------+

All single-precision values whose magnitude is at least 8388608.0 (2^23)
have no fractional part. We use vmflt and a mask to filter them out of
the vector and only do the cvt on the lanes selected by the mask.
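
A scalar sketch of the masked sequence, one lane at a time. floor_model is
a hypothetical name. The round-down conversion is emulated with a
truncating cast plus an adjustment, which is an assumption made to keep the
example portable; it is not how the hardware rounding mode works.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical per-lane model of the vectorized floor sequence.
inline float floor_model(float x)
{
    const float limit = 8388608.0f;  // 2^23
    // vfabs.v + vmflt.vf: lanes with |x| >= 2^23 (and NaN lanes) are masked
    // off and keep their input value.
    if (!(std::fabs(x) < limit))
        return x;
    // vfcvt.x.f with round-down, emulated as truncate then adjust.
    std::int32_t i = static_cast<std::int32_t>(x);
    if (static_cast<float>(i) > x)
        --i;
    // vfcvt.f.x, then vfsgnj.vv restores the input sign so -0.0 stays -0.0.
    return std::copysign(static_cast<float>(i), x);
}

inline void floor_model_demo()
{
    assert(floor_model(2.5f) == 2.0f);
    assert(floor_model(-2.5f) == -3.0f);
    assert(floor_model(8388607.5f) == 8388607.0f);  // below the limit
    assert(floor_model(8388609.0f) == 8388609.0f);  // masked off, unchanged
    assert(std::signbit(floor_model(-0.0f)));
}
```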

Before this patch:
math-floor-1.c:21:1: missed: couldn't vectorize loop
  ...
.L3:
  flw     fa0,0(s0)
  addi    s0,s0,4
  addi    s1,s1,4
  call    floorf
  fsw     fa0,-4(s1)
  bne     s0,s2,.L3

After this patch:
  ...
  fsrmi       2   // Rounding Down
.L4:
  vfabs.v     v1,v2
  vmflt.vf    v0,v1,fa5
  vfcvt.x.f.v v3,v2,v0.t
  vfcvt.f.x.v v1,v3,v0.t
  vfsgnj.vv   v1,v1,v2
  bne         .L4
.L14:
  fsrm        a6
  ret

Please note VLS mode is also involved in this patch and covered by the
test cases.

gcc/ChangeLog:

* config/riscv/autovec.md (floor<mode>2): New pattern.
* config/riscv/riscv-protos.h (enum insn_flags): New enum type.
(enum insn_type): Ditto.
(expand_vec_floor): New function decl.
* config/riscv/riscv-v.cc (gen_floor_const_fp): New function impl.
(expand_vec_floor): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-floor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-floor-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Remove FP run test for ceil.

FP16 is not well reconciled when linking.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c: Remove.

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

c++ __integer_pack conversion again [PR111357]

As Jakub pointed out, the real problem here is that in a partial
substitution we're forgetting the conversion to the type of the non-type
template argument, because maybe_convert_nontype_argument doesn't do
anything with value-dependent arguments. I'm experimenting with changing
that, but in the meantime we can work around it here.

PR c++/111357

gcc/cp/ChangeLog:

* pt.cc (expand_integer_pack): Use IMPLICIT_CONV_EXPR.

c++: constexpr and designated initializer

Changing the active member is non-constant before C++20, which results in a
CONSTRUCTOR with a null value for the first field; don't crash on it.

gcc/cp/ChangeLog:

* constexpr.cc (free_constructor): Handle null ce->value.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-union7.C: New test.

c++: unroll pragma in templates [PR111529]

We were failing to handle ANNOTATE_EXPR in tsubst_copy_and_build, leading to
problems with substitution of any wrapped expressions.
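
A minimal sketch of the kind of code involved: a loop carrying an unroll
pragma inside a template, so the annotated loop has to go through template
substitution. This is not the test case from the PR; sum_below is a
hypothetical example, and compilers without this pragma may warn and
ignore it.

```cpp
#include <cassert>

// The loop bound depends on the template parameter N, so the annotated loop
// is only fully known once the template is instantiated.
template <int N>
int sum_below()
{
    int total = 0;
#pragma GCC unroll 4
    for (int i = 0; i < N; ++i)
        total += i;
    return total;
}

inline void unroll_demo()
{
    assert(sum_below<5>() == 10);  // 0 + 1 + 2 + 3 + 4
    assert(sum_below<0>() == 0);   // the loop body never runs
}
```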

Let's also not tell users that lambda templates are available in C++14.

PR c++/111529

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_declarator_opt): Don't suggest
-std=c++14 for lambda templates.
* pt.cc (tsubst_expr): Move ANNOTATE_EXPR handling...
(tsubst_copy_and_build): ...here.

gcc/testsuite/ChangeLog:

* g++.dg/ext/unroll-4.C: New test.

RISC-V: Refine the code gen for ceil auto vectorization.

We already vectorize the ceil code below.

void
test_ceil (float *out, float *in, int count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_ceilf (in[i]);
}

Before this patch:
vfmv.v.f    v4,fa0     // can be removed
vfabs.v     v0,v1
vmv1r.v     v2,v1      // can be removed
vmflt.vv    v0,v0,v4   // can be refined to vmflt.vf
vfcvt.x.f.v v3,v1,v0.t
vfcvt.f.x.v v2,v3,v0.t
vfsgnj.vv   v2,v2,v1

After this patch:
vfabs.v     v1,v2
vmflt.vf    v0,v1,fa5
vfcvt.x.f.v v3,v2,v0.t
vfcvt.f.x.v v1,v3,v0.t
vfsgnj.vv   v1,v1,v2

The generated code improves in the following ways.

* Remove vfmv.v.f.
* Take vmflt.vf instead of vmflt.vv.
* Remove vmv1r.v.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vec_float_cmp_mask): Refactor.
(emit_vec_float_cmp_mask): Rename.
(expand_vec_copysign): Ditto.
(emit_vec_copysign): Ditto.
(emit_vec_abs): New function impl.
(emit_vec_cvt_x_f): Ditto.
(emit_vec_cvt_f_x): Ditto.
(expand_vec_ceil): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c: Adjust body check.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add VLS mode widen ternary tests

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS modes.
* gcc.target/riscv/rvv/autovec/vls/wfma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfma-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfnms-1.c: New test.

RISC-V: Add VLS widen binary combine patterns

Regression passed.

Committed.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Extend VLS modes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS modes cond tests.
* gcc.target/riscv/rvv/autovec/vls/wadd-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wadd-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wadd-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wadd-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wmul-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wmul-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wmul-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wsub-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wsub-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wsub-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wsub-4.c: New test.

c++: missing SFINAE in grok_array_decl [PR111493]

We should guard both the diagnostic and backward compatibility fallback
code with tf_error, so that in a SFINAE context we don't issue any
diagnostics and correctly treat ill-formed C++23 multidimensional
subscript operator expressions as such.

PR c++/111493

gcc/cp/ChangeLog:

* decl2.cc (grok_array_decl): Guard diagnostic and backward
compatibility fallback code paths with tf_error.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/subscript15.C: New test.

c++: constraint rewriting during ttp coercion [PR111485]

In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters. The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2. This patch
fixes this by including the outer template arguments in the substitution,
which ought to match the depth of the ttp.

The second testcase demonstrates it's better to substitute the concrete
outer template arguments instead of generic ones since a ttp's constraints
could depend on outer parameters.

PR c++/111485

gcc/cp/ChangeLog:

* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.

RISC-V: Move ceil test cases to unop folder

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/math-ceil-0.c: Moved to...
* gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c: ...here.
* gcc.target/riscv/rvv/autovec/math-ceil-1.c: Moved to...
* gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c: ...here.
* gcc.target/riscv/rvv/autovec/math-ceil-2.c: Moved to...
* gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c: ...here.
* gcc.target/riscv/rvv/autovec/math-ceil-3.c: Moved to...
* gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c: ...here.
* gcc.target/riscv/rvv/autovec/math-ceil-run-0.c: Moved to...
* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-0.c: ...here.
* gcc.target/riscv/rvv/autovec/math-ceil-run-1.c: Moved to...
* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c: ...here.
* gcc.target/riscv/rvv/autovec/math-ceil-run-2.c: Moved to...
* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c: ...here.
* gcc.target/riscv/rvv/autovec/test-math.h: Moved to...
* gcc.target/riscv/rvv/autovec/unop/test-math.h: ...here.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Remove @ of vec_duplicate pattern

The @ on the vec_duplicate pattern is clearly redundant.

Regression passed.

Committed.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (gen_const_vector_dup): Use global expand function.
* config/riscv/vector.md (@vec_duplicate<mode>): Remove @.
(vec_duplicate<mode>): Ditto.

RISC-V: Add VLS conditional patterns support

Regression passed.

Committed.

gcc/ChangeLog:

* config/riscv/autovec.md: Add VLS conditional patterns.
* config/riscv/riscv-protos.h (expand_cond_unop): Ditto.
(expand_cond_binop): Ditto.
(expand_cond_ternop): Ditto.
* config/riscv/riscv-v.cc (expand_cond_unop): Ditto.
(expand_cond_binop): Ditto.
(expand_cond_ternop): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS conditional tests.
* gcc.target/riscv/rvv/autovec/vls/cond_add-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_add-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_and-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_div-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_div-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fnma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fnms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ior-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_max-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_max-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_min-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_min-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_mod-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_mul-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_mul-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_neg-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_neg-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_not-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_shift-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_shift-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_sub-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_sub-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_xor-1.c: New test.

RISC-V: Rename the test macro for math autovec test

Rename TEST_CEIL to TEST_UNARY_CALL for the underlying function
autovec patch testing.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/test-math.h: Rename.
* gcc.target/riscv/rvv/autovec/math-ceil-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/math-ceil-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/math-ceil-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/math-ceil-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/math-ceil-run-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/math-ceil-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/math-ceil-run-2.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Optimization of vrgather.vv into vrgatherei16.vv[PR111451]

Consider the following case:

typedef int32_t vnx32si __attribute__ ((vector_size (128)));

  __attribute__ ((noipa)) void permute_##TYPE (TYPE values1, TYPE values2,     \
       TYPE *out)                      \
  {                                                                            \
    TYPE v                                                                     \
      = __builtin_shufflevector (values1, values2, MASK_##NUNITS (0, NUNITS)); \
    *(TYPE *) out = v;                                                         \
  }

  T (vnx32si, 32)                                                              \

TEST_ALL (PERMUTE)

Before this patch:
  li a4,31
  vsetvli a5,zero,e32,m8,ta,ma
  vl8re32.v v24,0(a0)
  vid.v v8
  vrsub.vx v8,v8,a4
  vrgather.vv v16,v24,v8
  vs8r.v v16,0(a2)
  ret

The index vector register "v8" occupies 8 registers.
We should optimize it into vrgatherei16.vv, which
uses int16 index elements.

After this patch:
  vsetvli a5,zero,e16,m4,ta,ma
  li a4,31
  vid.v v4
  vl8re32.v v16,0(a0)
  vrsub.vx v4,v4,a4
  vsetvli zero,zero,e32,m8,ta,ma
  vrgatherei16.vv v8,v16,v4
  vs8r.v v8,0(a2)
  ret

With vrgatherei16.vv, the index vector (v4 above) occupies 4 registers
instead of 8, which lowers register consumption and register pressure.
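
A scalar sketch of what the optimized sequence computes, using the index
computation visible in the assembly (31 - i). The names are hypothetical.
It shows that 16-bit indices are wide enough to address all 32 elements
while taking half the storage of 32-bit indices.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kUnits = 32;  // vnx32si: 32 x int32_t elements

using Data = std::array<std::int32_t, kUnits>;
using Index16 = std::array<std::uint16_t, kUnits>;

// Storage needed by the index vector with 32-bit versus 16-bit elements.
constexpr std::size_t kIndexBytes32 = kUnits * sizeof(std::uint32_t);
constexpr std::size_t kIndexBytes16 = kUnits * sizeof(std::uint16_t);
static_assert(kIndexBytes16 * 2 == kIndexBytes32, "16-bit indices halve the footprint");

// Hypothetical model of vid.v + vrsub.vx + vrgatherei16.vv.
inline Data reverse_with_16bit_indices(const Data& src)
{
    Index16 idx{};
    for (std::size_t i = 0; i < kUnits; ++i)
        idx[i] = static_cast<std::uint16_t>(kUnits - 1 - i);  // 31 - i
    Data out{};
    for (std::size_t i = 0; i < kUnits; ++i)
        out[i] = src[idx[i]];  // gather through the 16-bit index
    return out;
}

// Usage example: reversing 0..31 must give 31..0.
inline bool reverse_demo()
{
    Data src{};
    for (std::size_t i = 0; i < kUnits; ++i)
        src[i] = static_cast<std::int32_t>(i);
    const Data out = reverse_with_16bit_indices(src);
    return out[0] == 31 && out[31] == 0 && out[10] == 21;
}
```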

PR target/111451

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_gather_insn): Optimization of vrgather.vv
into vrgatherei16.vv.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Adjust case.
* gcc.target/riscv/rvv/autovec/vls/perm-4.c: Ditto.

RISC-V: Remove arch and abi option for run test case.

Remove the -march and -mabi.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/math-ceil-run-0.c: Remove arch and abi.
* gcc.target/riscv/rvv/autovec/math-ceil-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/math-ceil-run-2.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Support combine cond extend and reduce sum to widen reduce sum

This patch supports combining cond extend and reduce_sum into cond widen
reduce_sum, i.e. combining the following three insns:
   (set (reg:RVVM2HI 149)
        (if_then_else:RVVM2HI
          (unspec:RVVMF8BI [
            (const_vector:RVVMF8BI repeat [
              (const_int 1 [0x1])
            ])
            (reg:DI 146)
            (const_int 2 [0x2]) repeated x2
            (const_int 1 [0x1])
            (reg:SI 66 vl)
            (reg:SI 67 vtype)
          ] UNSPEC_VPREDICATE)
         (const_vector:RVVM2HI repeat [
           (const_int 0 [0])
         ])
         (unspec:RVVM2HI [
           (reg:SI 0 zero)
         ] UNSPEC_VUNDEF)))
  (set (reg:RVVM2HI 138)
    (if_then_else:RVVM2HI
      (reg:RVVMF8BI 135)
      (reg:RVVM2HI 148)
      (reg:RVVM2HI 149)))
  (set (reg:HI 150)
    (unspec:HI [
      (reg:RVVM2HI 138)
    ] UNSPEC_REDUC_SUM))
into one insn:
  (set (reg:SI 147)
    (unspec:SI [
      (if_then_else:RVVM2SI
        (reg:RVVMF16BI 135)
        (sign_extend:RVVM2SI (reg:RVVM1HI 136))
        (if_then_else:RVVM2HI
          (unspec:RVVMF8BI [
            (const_vector:RVVMF8BI repeat [
              (const_int 1 [0x1])
            ])
            (reg:DI 146)
            (const_int 2 [0x2]) repeated x2
            (const_int 1 [0x1])
            (reg:SI 66 vl)
            (reg:SI 67 vtype)
          ] UNSPEC_VPREDICATE)
         (const_vector:RVVM2HI repeat [
           (const_int 0 [0])
         ])
         (unspec:RVVM2HI [
           (reg:SI 0 zero)
         ] UNSPEC_VUNDEF)))
    ] UNSPEC_REDUC_SUM))

Consider the following C code:

int16_t foo (int8_t *restrict a, int8_t *restrict pred)
{
  int16_t sum = 0;
  for (int i = 0; i < 16; i += 1)
    if (pred[i])
      sum += a[i];
  return sum;
}

assembly before this patch:

foo:
        vsetivli        zero,16,e16,m2,ta,ma
        li      a5,0
        vmv.v.i v2,0
        vsetvli zero,zero,e8,m1,ta,ma
        vl1re8.v        v0,0(a1)
        vmsne.vi        v0,v0,0
        vsetvli zero,zero,e16,m2,ta,mu
        vle8.v  v4,0(a0),v0.t
        vmv.s.x v1,a5
        vsext.vf2       v2,v4,v0.t
        vredsum.vs      v2,v2,v1
        vmv.x.s a0,v2
        slliw   a0,a0,16
        sraiw   a0,a0,16
        ret

assembly after this patch:

foo:
        li a5,0
        vsetivli zero,16,e16,m1,ta,ma
        vmv.s.x v3,a5
        vsetivli zero,16,e8,m1,ta,ma
        vl1re8.v v0,0(a1)
        vmsne.vi v0,v0,0
        vle8.v v2,0(a0),v0.t
        vwredsum.vs v1,v2,v3,v0.t
        vsetivli zero,0,e16,m1,ta,ma
        vmv.x.s a0,v1
        slliw a0,a0,16
        sraiw a0,a0,16
        ret
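
The equivalence that the combine relies on can be sketched in scalar code.
Both functions are hypothetical models: the first mirrors the unfused form
(masked sign-extend with zero for inactive lanes, then a reduction), the
second mirrors the masked widening reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 16;

// Unfused model: widen active lanes, use 0 for inactive ones, then sum.
inline std::int16_t extend_then_reduce(const std::int8_t* a, const std::int8_t* pred)
{
    std::int16_t widened[kLanes];
    for (std::size_t i = 0; i < kLanes; ++i)
        widened[i] = pred[i] ? static_cast<std::int16_t>(a[i]) : std::int16_t{0};
    std::int16_t sum = 0;
    for (std::size_t i = 0; i < kLanes; ++i)
        sum = static_cast<std::int16_t>(sum + widened[i]);
    return sum;
}

// Fused model: a masked widening reduction that skips inactive lanes.
inline std::int16_t widening_reduce(const std::int8_t* a, const std::int8_t* pred)
{
    std::int16_t sum = 0;
    for (std::size_t i = 0; i < kLanes; ++i)
        if (pred[i])
            sum = static_cast<std::int16_t>(sum + a[i]);
    return sum;
}

// Usage example: both models agree on a small input with mixed signs.
inline bool widen_reduce_demo()
{
    std::int8_t a[kLanes];
    std::int8_t pred[kLanes];
    for (std::size_t i = 0; i < kLanes; ++i) {
        a[i] = static_cast<std::int8_t>(i % 2 == 0 ? 100 : -3);
        pred[i] = static_cast<std::int8_t>(i % 3 != 0);
    }
    return extend_then_reduce(a, pred) == widening_reduce(a, pred);
}
```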

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*cond_widen_reduc_plus_scal_<mode>):
New combine patterns.
* config/riscv/riscv-protos.h (enum insn_type): New insn_type.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c: New test.

RISC-V: Split VLS avl_type from NONVLMAX avl_type

This patch splits a VLS avl_type from the NONVLMAX avl_type, denoting
those RVV insns whose length is set to the number of units of a VLS mode.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum avl_type): New VLS avl_type.
* config/riscv/riscv-v.cc (autovec_use_vlmax_p): Move comments.

RISC-V: Leverage __builtin_xx instead of math.h for test

math.h may cause problems in some environments, so use __builtin_xx
instead for testing.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/floating-point-max-5.c:
Remove reference to math.h.
* gcc.target/riscv/rvv/autovec/vls/floating-point-min-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sgnjx-2.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Support ceil and ceilf auto-vectorization

Update in v4:

* Add test for _Float16.
* Remove unnecessary macro in def.h for test.

Original log:

This patch would like to support auto-vectorization for both the
ceil and ceilf of math.h. It depends on the -ffast-math option.

When we would like to call ceil/ceilf like v2 = ceil (v1), we will
convert it into the insns below (following the LLVM implementation).

* vfcvt.x.f v3, v1, RUP
* vfcvt.f.x v2, v3

However, a floating-point value does not need the conversions above if
it has no fractional part. Consider the single-precision values below.

  +-----------+---------------+
  | float     | binary layout |
  +-----------+---------------+
  | 8388607.5 | 0x4affffff    |
  | 8388608.0 | 0x4b000000    |
  | 8388609.0 | 0x4b000001    |
  +-----------+---------------+

All single-precision values whose magnitude is at least 8388608.0 (2^23)
have no fractional part. We use vmflt and a mask to filter them out of
the vector and only do the cvt on the lanes selected by the mask.
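
The binary layouts in the table can be reproduced with a small helper.
float_bits is a hypothetical name, and the sketch assumes a 32-bit IEEE 754
float. It also checks the spacing between adjacent floats around 2^23,
which is why values at or above that point cannot carry a fractional part.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of a float (hypothetical helper).
inline std::uint32_t float_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "assumes a 32-bit IEEE 754 float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

inline void float_layout_demo()
{
    assert(float_bits(8388607.5f) == 0x4affffffu);
    assert(float_bits(8388608.0f) == 0x4b000000u);
    assert(float_bits(8388609.0f) == 0x4b000001u);
    // From 2^23 upwards, adjacent floats are at least 1.0 apart.
    assert(std::nextafter(8388608.0f, INFINITY) - 8388608.0f == 1.0f);
    // Just below 2^23 the spacing is still 0.5, so fractions can occur.
    assert(8388608.0f - std::nextafter(8388608.0f, 0.0f) == 0.5f);
}
```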

Before this patch:
math-ceil-1.c:21:1: missed: couldn't vectorize loop
  ...
.L3:
  flw     fa0,0(s0)
  addi    s0,s0,4
  addi    s1,s1,4
  call    ceilf
  fsw     fa0,-4(s1)
  bne     s0,s2,.L3

After this patch:
  ...
  fsrmi   3
.L4:
  vfabs.v     v0,v1
  vmv1r.v     v2,v1
  vmflt.vv    v0,v0,v4
  sub         a3,a3,a4
  vfcvt.x.f.v v3,v1,v0.t
  vfcvt.f.x.v v2,v3,v0.t
  vfsgnj.vv   v2,v2,v1
  bne         .L4
.L14:
  fsrm    a6
  ret

Please note VLS mode is also involved in this patch and covered by the
test cases.

gcc/ChangeLog:

* config/riscv/autovec.md (ceil<mode>2): New pattern.
* config/riscv/riscv-protos.h (enum insn_flags): New enum type.
(enum insn_type): Ditto.
(expand_vec_ceil): New function decl.
* config/riscv/riscv-v.cc (gen_ceil_const_fp): New function impl.
(expand_vec_float_cmp_mask): Ditto.
(expand_vec_copysign): Ditto.
(expand_vec_ceil): Ditto.
* config/riscv/vector.md: Add VLS mode support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/math-ceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/math-ceil-1.c: New test.
* gcc.target/riscv/rvv/autovec/math-ceil-2.c: New test.
* gcc.target/riscv/rvv/autovec/math-ceil-3.c: New test.
* gcc.target/riscv/rvv/autovec/math-ceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/math-ceil-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/math-ceil-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/test-math.h: New test.
* gcc.target/riscv/rvv/autovec/vls/math-ceil-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

RISC-V: Add VLS integer ABS support

Regression passed.

Committed.

gcc/ChangeLog:

* config/riscv/autovec.md: Extend VLS modes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/abs-2.c: New test.

RISC-V: Add more VLS unary tests

Notice we are missing these tests.

Committed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/abs-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/not-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/sqrt-1.c: New test.

RISC-V: Support VLS mult high

Regression passed.

Committed.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Extend VLS modes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS mult high.
* gcc.target/riscv/rvv/autovec/vls/mulh-1.c: New test.

RISC-V: Adjusting the comments of the emit_vlmax_insn/emit_vlmax_insn_lra/emit_nonvlmax_insn functions

V2 Change: Use Robin's comments.

This patch adjusts the comments of the
emit_vlmax_insn/emit_vlmax_insn_lra/emit_nonvlmax_insn functions.
The purpose of the adjustment is to make it clear that vlmax here is not
VLMAX as defined in the RVV ISA. This is because these functions are used
by RVV modes (e.g. RVVM1SImode) in addition to VLS modes (e.g. V16QI). For an
RVV mode it means the same thing; for a VLS mode it indicates setting the vl to
the number of units of the mode. Only the comments are changed because I could
not think of a better name. If there is a suitable name, feel free to discuss it.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_insn): Adjust comments.
(emit_nonvlmax_insn): Adjust comments.
(emit_vlmax_insn_lra): Adjust comments.

Co-Authored-By: Robin Dapp <rdapp.gcc@gmail.com>

rust: Implement TARGET_RUST_OS_INFO for *-*-*linux*.

gcc/ChangeLog:

* config.gcc (*linux*): Set rust_target_objs and
target_has_targetrustm.
* config/t-linux (linux-rust.o): New rule.
* config/linux-rust.cc: New file.

rust: Implement TARGET_RUST_OS_INFO for i[34567]86-*-mingw* and x86_64-*-mingw*.

gcc/ChangeLog:

* config.gcc (i[34567]86-*-mingw* | x86_64-*-mingw*): Set
rust_target_objs and target_has_targetrustm.
* config/t-winnt (winnt-rust.o): New rule.
* config/winnt-rust.cc: New file.

rust: Implement TARGET_RUST_OS_INFO for *-*-fuchsia*.

gcc/ChangeLog:

* config.gcc (*-*-fuchsia): Set tmake_rule, rust_target_objs,
and target_has_targetrustm.
* config/fuchsia-rust.cc: New file.
* config/t-fuchsia: New file.

rust: Implement TARGET_RUST_OS_INFO for *-*-vxworks*

gcc/ChangeLog:

* config.gcc (*-*-vxworks*): Set rust_target_objs and
target_has_targetrustm.
* config/t-vxworks (vxworks-rust.o): New rule.
* config/vxworks-rust.cc: New file.

rust: Implement TARGET_RUST_OS_INFO for *-*-dragonfly*

gcc/ChangeLog:

* config.gcc (*-*-dragonfly*): Set rust_target_objs and
target_has_targetrustm.
* config/t-dragonfly (dragonfly-rust.o): New rule.
* config/dragonfly-rust.cc: New file.

rust: Implement TARGET_RUST_OS_INFO for *-*-solaris2*.

gcc/ChangeLog:

* config.gcc (*-*-solaris2*): Set rust_target_objs and
target_has_targetrustm.
* config/t-sol2 (sol2-rust.o): New rule.
* config/sol2-rust.cc: New file.

rust: Implement TARGET_RUST_OS_INFO for *-*-openbsd*

gcc/ChangeLog:

* config.gcc (*-*-openbsd*): Set rust_target_objs and
target_has_targetrustm.
* config/t-openbsd (openbsd-rust.o): New rule.
* config/openbsd-rust.cc: New file.

rust: Implement TARGET_RUST_OS_INFO for *-*-netbsd*

gcc/ChangeLog:

* config.gcc (*-*-netbsd*): Set rust_target_objs and
target_has_targetrustm.
* config/t-netbsd (netbsd-rust.o): New rule.
* config/netbsd-rust.cc: New file.

rust: Implement TARGET_RUST_OS_INFO for *-*-freebsd*

gcc/ChangeLog:

* config.gcc (*-*-freebsd*): Set rust_target_objs and
target_has_targetrustm.
* config/t-freebsd (freebsd-rust.o): New rule.
* config/freebsd-rust.cc: New file.

rust: Implement TARGET_RUST_OS_INFO for *-*-darwin*

gcc/ChangeLog:

* config.gcc (*-*-darwin*): Set rust_target_objs and
target_has_targetrustm.
* config/t-darwin (darwin-rust.o): New rule.
* config/darwin-rust.cc: New file.

rust: Implement TARGET_RUST_CPU_INFO for i[34567]86-*-* and x86_64-*-*

There is still quite a lot of the previously reverted i386-rust.cc
missing, so this is only a partial reimplementation.

gcc/ChangeLog:

* config/i386/t-i386 (i386-rust.o): New rule.
* config/i386/i386-rust.cc: New file.
* config/i386/i386-rust.h: New file.

rust: Reintroduce TARGET_RUST_OS_INFO hook

gcc/ChangeLog:

* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Document TARGET_RUST_OS_INFO.

gcc/rust/ChangeLog:

* rust-session-manager.cc (Session::init): Call
targetrustm.rust_os_info.
* rust-target.def (rust_os_info): New hook.

rust: Reintroduce TARGET_RUST_CPU_INFO hook

gcc/ChangeLog:

* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Add @node for Rust language and ABI, and document
TARGET_RUST_CPU_INFO.

gcc/rust/ChangeLog:

* rust-lang.cc (rust_add_target_info): Remove sorry.
* rust-session-manager.cc: Replace include of target.h with
include of tm.h and rust-target.h.
(Session::init): Call targetrustm.rust_cpu_info.
* rust-target.def (rust_cpu_info): New hook.
* rust-target.h (rust_add_target_info): Declare.

rust: Add skeleton support and documentation for targetrustm hooks.

gcc/ChangeLog:

* Makefile.in (tm_rust_file_list, tm_rust_include_list, TM_RUST_H,
RUST_TARGET_DEF, RUST_TARGET_H, RUST_TARGET_OBJS): New variables.
(tm_rust.h, cs-tm_rust.h, default-rust.o,
rust/rust-target-hooks-def.h, s-rust-target-hooks-def-h): New rules.
(s-tm-texi): Also check timestamp on rust-target.def.
(generated_files): Add TM_RUST_H and rust-target-hooks-def.h.
(build/genhooks.o): Also depend on RUST_TARGET_DEF.
* config.gcc (tm_rust_file, rust_target_objs, target_has_targetrustm):
New variables.
* configure: Regenerate.
* configure.ac (tm_rust_file_list, tm_rust_include_list,
rust_target_objs): Add substitutes.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (targetrustm): Document.
(target_has_targetrustm): Document.
* genhooks.cc: Include rust/rust-target.def.
* config/default-rust.cc: New file.

gcc/rust/ChangeLog:

* rust-target-def.h: New file.
* rust-target.def: New file.
* rust-target.h: New file.

RISC-V: Enable undefined support for RVV auto-vectorization[PR110751]

The GCC middle-end can now support an undefined value, which is translated into (scratch:mode).

This patch enables the RISC-V backend to accept an undefined value as the ELSE value of COND_LEN_xxx/COND_xxx.

Consider the following case:

  __attribute__((noipa))
  void vrem_int8_t (int8_t * __restrict dst, int8_t * __restrict a, int8_t * __restrict b, int n)
  {
    for (int i = 0; i < n; i++)
      dst[i] = a[i] % b[i];
  }

Before this patch:

vrem_int8_t:
        ble     a3,zero,.L5
        vsetvli a5,zero,e8,m1,ta,ma
        vmv.v.i v4,0                          ---> redundant.
.L3:
        vsetvli a5,a3,e8,m1,tu,ma             ---> should be TA.
        vmv1r.v v1,v4                         ---> redundant.
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        sub     a3,a3,a5
        vrem.vv v1,v3,v2
        vse8.v  v1,0(a0)
        add     a1,a1,a5
        add     a2,a2,a5
        add     a0,a0,a5
        bne     a3,zero,.L3
.L5:
        ret

After this patch:

vrem_int8_t:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e8,m1,ta,ma
        vle8.v  v1,0(a1)
        vle8.v  v2,0(a2)
        sub     a3,a3,a5
        vrem.vv v1,v1,v2
        vse8.v  v1,0(a0)
        add     a1,a1,a5
        add     a2,a2,a5
        add     a0,a0,a5
        bne     a3,zero,.L3
.L5:
        ret

PR target/110751

gcc/ChangeLog:

* config/riscv/autovec.md: Enable scratch rtx in ELSE operand.
* config/riscv/predicates.md (autovec_else_operand): New predicate.
* config/riscv/riscv-v.cc (get_else_operand): New function.
(expand_cond_len_unop): Adapt ELSE value.
(expand_cond_len_binop): Ditto.
(expand_cond_len_ternop): Ditto.
* config/riscv/riscv.cc (riscv_preferred_else_value): New function.
(TARGET_PREFERRED_ELSE_VALUE): New targethook.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adapt test.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-9.c: Ditto.

RISC-V: Fix SUBREG move of VLS mode[PR111486]

This patch fixes this bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111486

Before this patch, we can only handle (subreg:DI (reg:V8QI))

The PR ICE:

during RTL pass: reload
testcase.c: In function 'foo':
testcase.c:8:1: internal compiler error: in require, at machmode.h:313
    8 | }
      | ^
0xa40cd2 opt_mode<machine_mode>::require() const
        /repo/gcc-trunk/gcc/machmode.h:313
0xa47091 opt_mode<machine_mode>::require() const
        /repo/gcc-trunk/gcc/config/riscv/riscv.cc:2546
0xa47091 riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
        /repo/gcc-trunk/gcc/config/riscv/riscv.cc:2543
0x1f1df10 gen_movdi(rtx_def*, rtx_def*)
        /repo/gcc-trunk/gcc/config/riscv/riscv.md:2024
0x10f1423 rtx_insn* insn_gen_fn::operator()<rtx_def*, rtx_def*>(rtx_def*, rtx_def*) const
        /repo/gcc-trunk/gcc/recog.h:411
0x10f1423 emit_move_insn_1(rtx_def*, rtx_def*)
        /repo/gcc-trunk/gcc/expr.cc:4164
0x10f183d emit_move_insn(rtx_def*, rtx_def*)
        /repo/gcc-trunk/gcc/expr.cc:4334
0x13070ec lra_emit_move(rtx_def*, rtx_def*)
        /repo/gcc-trunk/gcc/lra.cc:509
0x132295b curr_insn_transform
        /repo/gcc-trunk/gcc/lra-constraints.cc:4748
0x1324335 lra_constraints(bool)
        /repo/gcc-trunk/gcc/lra-constraints.cc:5488
0x130a3d4 lra(_IO_FILE*)
        /repo/gcc-trunk/gcc/lra.cc:2419
0x12bb629 do_reload
        /repo/gcc-trunk/gcc/ira.cc:5970
0x12bb629 execute
        /repo/gcc-trunk/gcc/ira.cc:6156

Because of (subreg:DI (reg:V2QI))

PR target/111486

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Fix bug.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr111486.c: New test.

check undefine_p for one more vr

The root cause of PR111355 and PR111482 is a missing check that vr0
is not undefined_p before calling vr0.lower_bound.

In the pattern "(X + C) / N",

    (if (INTEGRAL_TYPE_P (type)
         && get_range_query (cfun)->range_of_expr (vr0, @0))
     (if (...)
       (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
       (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ...
            && wi::geu_p (vr0.lower_bound (), -c))

In "(if (...)", there is code that guards against vr0 being undefined_p,
but in the "else" part, vr0's undefined_p is not checked before
"wi::geu_p (vr0.lower_bound (), -c)".

PR tree-optimization/111355

gcc/ChangeLog:

* match.pd ((X + C) / N): Update pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/pr111355.c: New test.

using overflow_free_p to simplify pattern

In r14-3582, an "overflow_free_p" interface is added.
The pattern of "(t * 2) / 2" in match.pd can be simplified
by using this interface.
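
For illustration only, the condition behind this fold can be sketched in
plain C: (t * 2) / 2 equals t exactly when t * 2 does not wrap. The helper
names are hypothetical, and __builtin_mul_overflow merely stands in for the
range-based overflow_free_p query used in match.pd.

```c
#include <stdbool.h>

/* Hypothetical helper: true when t * 2 does not wrap in unsigned arithmetic.  */
static bool mul2_overflow_free(unsigned t)
{
    unsigned tmp;
    return !__builtin_mul_overflow(t, 2u, &tmp);
}

/* The unsimplified expression, using wrapping unsigned arithmetic.  */
static unsigned mul2_div2(unsigned t)
{
    return (t * 2u) / 2u;
}
```

Whenever mul2_overflow_free (t) holds, mul2_div2 (t) returns t, which is the
case where the division can be dropped.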

gcc/ChangeLog:

* match.pd ((t * 2) / 2): Update to use overflow_free_p.

RISC-V: Optimized for strided load/store with stride == element width[PR111450]

When stride == element width, vlsse should be optimized into vle.v.
vsse should be optimized into vse.v.
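
To see why the two forms are equivalent, here is a scalar sketch of a strided
load (the function name and signature are illustrative, not taken from the
patch). When the byte stride equals the element size, consecutive accesses
touch adjacent elements, which is exactly what a unit-stride load does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a 32-bit strided load: read n elements that are
   stride_bytes apart, starting at base.  */
static void strided_load32(int32_t *dst, const void *base,
                           ptrdiff_t stride_bytes, size_t n)
{
    const unsigned char *p = base;
    for (size_t i = 0; i < n; i++)
        memcpy(&dst[i], p + (ptrdiff_t)i * stride_bytes, sizeof dst[i]);
}
```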

PR target/111450

gcc/ChangeLog:

* config/riscv/constraints.md (c01): const_int 1.
(c02): const_int 2.
(c04): const_int 4.
(c08): const_int 8.
* config/riscv/predicates.md (vector_eew8_stride_operand): New predicate for stride operand.
(vector_eew16_stride_operand): Ditto.
(vector_eew32_stride_operand): Ditto.
(vector_eew64_stride_operand): Ditto.
* config/riscv/vector-iterators.md: New iterator for stride operand.
* config/riscv/vector.md: Add stride = element width constraint.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111450.c: New test.

RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic names

This small patch renames vector_gs_scale_operand_16/32 to the more generic
names const_1_or_2/4_operand, so that they are easier to understand when
used elsewhere.

gcc/ChangeLog:

* config/riscv/predicates.md (const_1_or_2_operand): Rename.
(const_1_or_4_operand): Ditto.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
* config/riscv/vector-iterators.md: Adjust.

RISC-V: Support VLS INT <-> FP conversions

Support INT <-> FP VLS auto-vectorization patterns.

Regression passed.
Committed.

gcc/ChangeLog:

* config/riscv/autovec.md: Extend VLS modes.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/convert-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-9.c: New test.

Daily bump.

testsuite: Add test for already-fixed issue with _Pragma expansion [PR90400]

The PR was fixed by r12-5454. Since the fix was somewhat incidental,
although related, add a testcase from PR90400 too before closing it out.

gcc/testsuite/ChangeLog:

PR preprocessor/90400
* c-c++-common/cpp/pr90400.c: New test.

libcpp: Fix ICE on #include after a line marker directive [PR61474]

As noted in the PR, GCC will segfault if a file name is first seen in a
linemarker directive, and then later seen in a normal #include. This is
because the fake include process adds the file to the cache with a null PATH
member. The normal #include finds this file in the cache and then attempts
to use the null PATH. Resolve by adding the file to the cache with a unique
starting directory, so that the fake entry will only be found by a
subsequent fake include, not by a real one.

libcpp/ChangeLog:

PR preprocessor/61474
* files.cc (_cpp_find_file): Set DONT_READ to TRUE for fake
include files.
(_cpp_fake_include): Pass a unique cpp_dir* address so
the fake file will not be found when looked up for real.

gcc/testsuite/ChangeLog:

PR preprocessor/61474
* c-c++-common/cpp/pr61474-2.h: New test.
* c-c++-common/cpp/pr61474.c: New test.
* c-c++-common/cpp/pr61474.h: New test.

Tweak merge_range API.

merge_range used to return TRUE if there was already a range. Now it
returns TRUE if a new range is added, OR if it updates an existing range
with a new value. FALSE is returned when the range already matches.
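
The new contract can be modelled with a tiny cache slot in C. This is only a
sketch: the struct, its field names, and the use of interval intersection as
the merge step are assumptions made for illustration, not the ssa_cache
implementation.

```c
#include <stdbool.h>

/* Hypothetical cache slot holding a closed interval [lo, hi].  */
struct range { bool set; int lo, hi; };

/* Returns true if a new range was added or the stored range changed,
   false if the stored range already matches the merged result.  */
static bool merge_range(struct range *slot, int lo, int hi)
{
    if (!slot->set) {
        slot->set = true;
        slot->lo = lo;
        slot->hi = hi;
        return true;                    /* new range added */
    }
    int new_lo = lo > slot->lo ? lo : slot->lo;
    int new_hi = hi < slot->hi ? hi : slot->hi;
    if (new_lo == slot->lo && new_hi == slot->hi)
        return false;                   /* range already matches */
    slot->lo = new_lo;
    slot->hi = new_hi;
    return true;                        /* existing range updated */
}
```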

* gimple-range-cache.cc (ssa_cache::merge_range): Change meaning
of the return value.
(ssa_cache::dump): Don't print GLOBAL RANGE header.
(ssa_lazy_cache::merge_range): Adjust return value meaning.
(ranger_cache::dump): Print GLOBAL RANGE header.

aarch64: Ensure const and sign correctness

Be const and sign correct by using a matching CIE augmentation type.
Use a builtin instead of relying on <string.h> being included.

libgcc/ChangeLog:

* config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key):
Use const unsigned type and a builtin.

Signed-off-by: Pekka Seppänen <pexu@gcc.mail.kapsi.fi>

RISC-V: Remove math.h import to resolve missing stubs failures

Resolves some of the missing stubs failures:
fatal error: gnu/stubs-lp64d.h: No such file or directory
compilation terminated.

2023-09-20 Juzhe Zhong <juzhe.zhong@rivai.ai>

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Remove unneeded math.h
import.

Tested-by: Patrick O'Neill <patrick@rivosinc.com>

[frange] Remove special casing from unordered operators.

In coming up with testcases for the unordered folders, I realized that
we were already handling them correctly, even in the absence of my
work in this area lately.

All of the unordered fold_range() methods try to fold with the ordered
variants first, and if they return TRUE, we are guaranteed to be able
to fold, even in the presence of NANs.  For example:

if (x_5 >= y_8)
  if (x_5 __UNLE y_8)

On the true side of the first conditional we know that either x_5 < y_8
or that one or more operands is a NAN.  Since UNLE_EXPR returns true
for precisely this scenario, we can fold as true.
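
As a reminder of the semantics involved, UNLE_EXPR is true when x <= y or
when the operands are unordered. A small C model using the standard
isgreater macro (the helper name unle is made up for this sketch):

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of UNLE_EXPR: true if x <= y or if either operand
   is a NaN.  */
static bool unle(float x, float y)
{
    return !isgreater(x, y);   /* isgreater is false for unordered operands */
}
```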

This is handled in the fold_range() methods as follows:

    if (!range_op_handler (LE_EXPR).fold_range (r, type, op1_no_nan,
                                                op2_no_nan, trio))
      return false;
    // The result is the same as the ordered version when the
    // comparison is true or when the operands cannot be NANs.
    if (!maybe_isnan (op1, op2) || r == range_true (type))
      return true;

This code has been there since the last release, and makes the special
casing I am deleting obsolete.  I have added tests to make sure we
keep track of this behavior.

gcc/ChangeLog:

* range-op-float.cc (foperator_unordered_ge::fold_range): Remove
special casing.
(foperator_unordered_gt::fold_range): Same.
(foperator_unordered_lt::fold_range): Same.
(foperator_unordered_le::fold_range): Same.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp-float-relations-5.c: New test.
* gcc.dg/tree-ssa/vrp-float-relations-6.c: New test.

c, c++: Accept __builtin_classify_type (typename)

As mentioned in my stdckdint.h mail, __builtin_classify_type has
a problem that argument promotion (the argument is passed to ...
prototyped builtin function) means that certain type classes will
simply never appear.
I think it is too late to change how it behaves, lots of code in the
wild might rely on the current behavior.

So, the following patch adds an option to use a typename rather than an
expression as the operand to the builtin, making it behave similarly
to sizeof, typeof or say the clang _Generic extension, where the
first argument can be not just an expression but also a typename.

I think we have other prior art here, e.g. __builtin_va_arg also
expects typename.

I've added this to both C and C++, because it would be weird if it
supported it only in C and not in C++.
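
A small C sketch of the promotion problem described above. The wrapper names
are hypothetical, and the typename form is only compiled when GCC 14 or later
is detected, on the assumption that older compilers reject it.

```c
/* Expression form: the argument undergoes default argument promotion,
   so a char expression is classified the same way as an int.  */
static int classify_char_expr(void) { return __builtin_classify_type((char)0); }
static int classify_int_expr(void)  { return __builtin_classify_type(0); }

#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 14
/* Typename form added by this patch: no promotion takes place, whereas
   an expression of type _Bool would be promoted to int first.  */
static int classify_bool_type(void) { return __builtin_classify_type(_Bool); }
#endif
```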

2023-09-20 Jakub Jelinek <jakub@redhat.com>

gcc/
* builtins.h (type_to_class): Declare.
* builtins.cc (type_to_class): No longer static. Return
int rather than enum.
* doc/extend.texi (__builtin_classify_type): Document.
gcc/c/
* c-parser.cc (c_parser_postfix_expression_after_primary): Parse
__builtin_classify_type call with typename as argument.
gcc/cp/
* parser.cc (cp_parser_postfix_expression): Parse
__builtin_classify_type call with typename as argument.
* pt.cc (tsubst_copy_and_build): Handle __builtin_classify_type
with dependent typename as argument.
gcc/testsuite/
* c-c++-common/builtin-classify-type-1.c: New test.
* g++.dg/ext/builtin-classify-type-1.C: New test.
* g++.dg/ext/builtin-classify-type-2.C: New test.
* gcc.dg/builtin-classify-type-1.c: New test.

internal-fn: Support undefined rtx for uninitialized SSA_NAME[PR110751]

According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

As Richard and Richi suggested, we recognize uninitialized SSA_NAME and convert it
into SCRATCH rtx if the target predicate allows SCRATCH.

It can help to reduce redundant data move instructions of targets like RISC-V.

Bootstrap and Regression on x86 passed.

gcc/ChangeLog:
PR target/110751

* internal-fn.cc (expand_fn_using_insn): Support undefined rtx value.
* optabs.cc (maybe_legitimize_operand): Ditto.
(can_reuse_operands_p): Ditto.
* optabs.h (enum expand_operand_type): Ditto.
(create_undefined_input_operand): Ditto.

c++: improve class NTTP object pretty printing [PR111471]

1. Move class NTTP object pretty printing to a more general spot in
   the pretty printer, so that we always print its value instead of
   its (mangled) name even when it appears outside of a template
   argument list.
2. Print the type of a class NTTP object alongside its CONSTRUCTOR
   value, like dump_expr would have done.
3. Don't print const VIEW_CONVERT_EXPR wrappers for class NTTPs.

PR c++/111471

gcc/cp/ChangeLog:

* cxx-pretty-print.cc (cxx_pretty_printer::expression)
<case VAR_DECL>: Handle class NTTP objects by printing
their type and value.
<case VIEW_CONVERT_EXPR>: Strip const VIEW_CONVERT_EXPR
wrappers for class NTTPs.
(pp_cxx_template_argument_list): Don't handle class NTTP
objects here.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/diagnostic19.C: New test.

c++: further optimize tsubst_template_decl

This patch makes tsubst_template_decl use use_spec_table=false also in
the non-class non-function template case, to avoid computing 'argvec' and
doing a hash table lookup from tsubst_decl (when partially instantiating
a member variable/alias template).

This change reveals that for function templates, tsubst_template_decl
registers the partially instantiated TEMPLATE_DECL, whereas for other
non-class templates it registers the corresponding DECL_TEMPLATE_RESULT
which is an interesting inconsistency that I decided to preserve for now.
Trying to consistently register the TEMPLATE_DECL (or DECL_TEMPLATE_RESULT)
causes modules ICEs which I didn't look into.

In passing, in tsubst_function_decl I noticed 'argvec' is unused
when 'lambda_fntype' is set (since lambdas aren't recorded in the
specializations table), so we can avoid computing it in that case.

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Don't bother computing 'argvec'
when 'lambda_fntype' is set.
(tsubst_template_decl): Make sure we return a TEMPLATE_DECL
during specialization lookup. In the non-class non-function
template case, use tsubst_decl directly with use_spec_table=false,
update DECL_TI_ARGS and call register_specialization like
tsubst_decl would have done if use_spec_table=true.

OpenMP: Add ME support for 'omp allocate' stack variables

Call GOMP_alloc/free for 'omp allocate' allocated variables. This is
for C only as C++ and Fortran show a sorry already in the FE. Note that
this only applies to stack variables as the C FE shows a sorry for
static variables.

gcc/ChangeLog:

* gimplify.cc (gimplify_bind_expr): Call GOMP_alloc/free for
'omp allocate' variables; move stack cleanup after other
cleanup.
(omp_notice_variable): Process original decl when decl
of the value-expression for an 'omp allocate' variable is passed.
* omp-low.cc (scan_omp_1_op): Handle 'omp allocate' variables.

libgomp/ChangeLog:

* libgomp.texi (OpenMP 5.1 Impl.): Mark 'omp allocate' as
implemented for C only.
* testsuite/libgomp.c/allocate-4.c: New test.
* testsuite/libgomp.c/allocate-5.c: New test.
* testsuite/libgomp.c/allocate-6.c: New test.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/allocate-11.c: Remove C-only dg-message
for 'sorry, unimplemented'.
* c-c++-common/gomp/allocate-12.c: Likewise.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Likewise.
* c-c++-common/gomp/allocate-10.c: New test.
* c-c++-common/gomp/allocate-17.c: New test.

RISC-V: Support simplifying x/(-1) to neg for vector.
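
For illustration, a loop of the following shape is the kind of code this
simplification targets (a sketch with made-up function names; see the new
test for the actual cases). Dividing by -1 gives the same result as negation
for every element except the most negative value, where both operations
overflow in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Element-wise division by -1 ...  */
static void div_by_minus_one(int32_t *dst, const int32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] / -1;
}

/* ... computes the same values as element-wise negation.  */
static void negate(int32_t *dst, const int32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = -src[i];
}
```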

gcc/ChangeLog:

* simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
Support simplifying vector int, not only scalar int.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/simplify-vdiv.c: New test.

Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>

RISC-V: Support VLS floating-point extend/truncate

Regression passed.

Committed.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Extend VLS floating-point.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/widen/widen-10.c: Adapt test.
* gcc.target/riscv/rvv/autovec/widen/widen-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/ext-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/ext-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trunc-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trunc-5.c: New test.

RISC-V: Fix Demand comparison bug[VSETVL PASS]

This bug is exposed when we support VLS integer conversion patterns.

FAIL: c-c++-common/torture/pr53505.c execution.

This is caused by incorrect vsetvl elimination in Phase 4:

   10318:       0d207057                vsetvli zero,zero,e32,m4,ta,ma
   1031c:       5e003e57                vmv.v.i v28,0
   .....:       ........                missed e8,m1 vsetvl
   10320:       7b07b057                vmsgtu.vi       v0,v16,15
   10324:       03083157                vadd.vi v2,v16,-16

Regression testing on a release build of GCC shows no unexpected differences.

Committed.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vector_insn_info::operator==): Fix bug.

Darwin: Move checking of the 'shared' driver spec.

This avoids a bunch of irrelevant diagnostics if the user passes '-shared' to
gnatmake. Currently, we push '-dynamiclib' back onto the command line (since
that is the Darwin spelling of 'shared') but this is not handled by gnat1,
leading to a diagnostic for every character after the '-d'.

'-shared' has no effect on gnatmake (it needs to be passed to gnatbind).

This moves the handling of '-shared' to leaf specs so that we do not need to
push 'dynamiclib' onto the command line.

gcc/ChangeLog:

* config/darwin.h:
(SUBTARGET_DRIVER_SELF_SPECS): Move handling of 'shared' into the same
specs as 'dynamiclib'. (STARTFILE_SPEC): Handle 'shared'.

tree-optimization/111489 - raise --param uninit-max-chain-len to 8

This raises --param uninit-max-chain-len to avoid a bogus diagnostic
for the large testcase in PR111489.

PR tree-optimization/111489
* params.opt (-param uninit-max-chain-len=): Raise default to 8.

* gcc.dg/uninit-pr111489.c: New testcase.

tree-optimization/111489 - turn uninit limits to params

The following turns MAX_NUM_CHAINS and MAX_CHAIN_LEN into params, which
allows experimenting with raising them. For the testcase in PR111489,
raising MAX_CHAIN_LEN from 5 to 8 avoids the bogus diagnostics
at -O2; at -O3 we need a MAX_CHAIN_LEN of 6.

PR tree-optimization/111489
* doc/invoke.texi (--param uninit-max-chain-len): Document.
(--param uninit-max-num-chains): Likewise.
* params.opt (-param=uninit-max-chain-len=): New.
(-param=uninit-max-num-chains=): Likewise.
* gimple-predicate-analysis.cc (MAX_NUM_CHAINS): Define to
param_uninit_max_num_chains.
(MAX_CHAIN_LEN): Define to param_uninit_max_chain_len.
(uninit_analysis::init_use_preds): Avoid VLA.
(uninit_analysis::init_from_phi_def): Likewise.
(compute_control_dep_chain): Avoid using MAX_CHAIN_LEN in
template parameter.

middle-end: use MAX_FIXED_MODE_SIZE instead of precision of TImode/DImode

On Tue, Sep 19, 2023 at 05:50:59PM +0100, Richard Sandiford wrote:
> How about using MAX_FIXED_MODE_SIZE for things like this?

Seems like a good idea.

The following patch does that.

2023-09-20 Jakub Jelinek <jakub@redhat.com>

* match.pd ((x << c) >> c): Use MAX_FIXED_MODE_SIZE instead of
GET_MODE_PRECISION of TImode or DImode depending on whether
TImode is supported scalar mode.
* gimple-lower-bitint.cc (bitint_precision_kind): Likewise.
* expr.cc (expand_expr_real_1): Likewise.
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt): Likewise.
* ubsan.cc (ubsan_encode_value, ubsan_type_descriptor): Likewise.

RISC-V: Reorganize and rename combine patterns in autovec-opt.md

This patch reorganizes and renames the combine patterns in autovec-opt.md
by category. There shouldn't be any functional changes.
The current classification includes the following categories:

- Combine op + vmerge to cond_op
- Combine binop + trunc to narrow_binop
- Combine extend + binop to widen_binop
- Combine extend + ternop to widen_ternop
- Misc combine patterns

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*<optab>not<mode>): Move and rename.
(*n<optab><mode>): Ditto.
(*v<any_shiftrt:optab><any_extend:optab>trunc<mode>): Ditto.
(*<any_shiftrt:optab>trunc<mode>): Ditto.
(*narrow_<any_shiftrt:optab><any_extend:optab><mode>): Ditto.
(*narrow_<any_shiftrt:optab><mode>_scalar): Ditto.
(*single_widen_mult<any_extend:su><mode>): Ditto.
(*single_widen_mul<any_extend:su><mode>): Ditto.
(*single_widen_mult<mode>): Ditto.
(*single_widen_mul<mode>): Ditto.
(*dual_widen_fma<mode>): Ditto.
(*dual_widen_fma<su><mode>): Ditto.
(*single_widen_fma<mode>): Ditto.
(*single_widen_fma<su><mode>): Ditto.
(*dual_fma<mode>): Ditto.
(*single_fma<mode>): Ditto.
(*dual_fnma<mode>): Ditto.
(*dual_widen_fnma<mode>): Ditto.
(*single_fnma<mode>): Ditto.
(*single_widen_fnma<mode>): Ditto.
(*dual_fms<mode>): Ditto.
(*dual_widen_fms<mode>): Ditto.
(*single_fms<mode>): Ditto.
(*single_widen_fms<mode>): Ditto.
(*dual_fnms<mode>): Ditto.
(*dual_widen_fnms<mode>): Ditto.
(*single_fnms<mode>): Ditto.
(*single_widen_fnms<mode>): Ditto.

openmp: Add omp::decl attribute support [PR111392]

This patch adds support for the omp::decl attribute (so far for C++ only).  For
declare simd and declare variant directives it is essentially another
spelling of omp::directive, except that per discussions it is not allowed inside
of an omp::sequence attribute.  For threadprivate, declare target, allocate
and later groupprivate directives it should appertain to variable declarations
(or, for declare target, also to function definitions and declarations).  Where
the normal syntax specifies a list of variables (or variables and functions),
either as the argument of the directive or as a clause argument, that argument
is not specified and is implied to be the entity the attribute applies to.

2023-09-20  Jakub Jelinek  <jakub@redhat.com>

PR c++/111392
gcc/
* attribs.cc (decl_attributes): Don't warn on omp::directive attribute
on vars or function decls if -fopenmp or -fopenmp-simd.
gcc/c-family/
* c-omp.cc (c_omp_directives): Add commented out groupprivate
directive entry.
gcc/cp/
* parser.h (struct cp_lexer): Add in_omp_decl_attribute member.
* cp-tree.h (cp_maybe_parse_omp_decl): Declare.
* parser.cc (cp_parser_handle_statement_omp_attributes): Diagnose
omp::decl attribute on statements.  Adjust diagnostic wording for
omp::decl.
(cp_parser_omp_directive_args): Add DECL_P argument, set TREE_PUBLIC
to it on the DEFERRED_PARSE tree.
(cp_parser_omp_sequence_args): Adjust caller.
(cp_parser_std_attribute): Handle omp::decl attribute.
(cp_parser_omp_var_list): If parser->lexer->in_omp_decl_attribute
don't expect any arguments, instead create clause or TREE_LIST for
that decl.
(cp_parser_late_parsing_omp_declare_simd): Adjust diagnostic wording
for omp::decl.
(cp_maybe_parse_omp_decl): New function.
(cp_parser_omp_declare_target): If
parser->lexer->in_omp_decl_attribute and first token isn't name or
comma invoke cp_parser_omp_var_list.
* decl2.cc (cplus_decl_attributes): Adjust diagnostic wording for
omp::decl.  Handle omp::decl on declarations.
* name-lookup.cc (finish_using_directive): Adjust diagnostic wording
for omp::decl.
gcc/testsuite/
* g++.dg/gomp/attrs-19.C: New test.
* g++.dg/gomp/attrs-20.C: New test.
* g++.dg/gomp/attrs-21.C: New test.
libgomp/
* libgomp.texi: Mark decl attribute was added to the C++ attribute
syntax as implemented.

RISC-V: Fixed ICE caused by missing operand

This ICE appears in GCC compiled with -O2 flags.

PR target/111488

gcc/ChangeLog:

* config/riscv/autovec-opt.md: Add missed operand.

debug/111409 - don't generate COMDAT macro sections for split DWARF

Split DWARF files aren't processed by the linker, so DW_MACRO_import
offsets aren't relocated and the .debug_macro.dwo sections aren't
deduplicated and merged. There's no clear way for this to work for
split DWARF, so disable it.

gcc/ChangeLog:

PR debug/111409
* dwarf2out.cc (output_macinfo): Don't call optimize_macinfo_range if
dwarf_split_debug_info.

gcc/testsuite/ChangeLog:

PR debug/111409
* gcc.dg/pr111409.c: New test.

testcase: rename pr111303.c to pr111324.c

When committing the fix for pr111324, the test case was named pr111303.c
by mistake.  Rename it to pr111324.c.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr111303.c: Rename to ...
* gcc.dg/tree-ssa/pr111324.c: ... this.

RISC-V: Extend VLS modes in 'VWEXTI' iterator

This patch extends the 'VWEXTI' iterator so that we support
integer extension/integer truncate/integer average VLS patterns.

This patch eliminates the following FAILs:

FAIL: gcc.dg/pr92301.c execution test
XPASS: gcc.dg/vect/bb-slp-subgroups-3.c -flto -ffat-lto-objects scan-tree-dump-times slp2 "optimized: basic block" 2
XPASS: gcc.dg/vect/bb-slp-subgroups-3.c scan-tree-dump-times slp2 "optimized: basic block" 2

The pr92301.c failure is a latent bug in middle-end GIMPLE FOLD.
We are just lucky that this test passes with this patch, which no longer triggers the GIMPLE FOLD bug.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (can_find_related_mode_p): New function.
(vectorize_related_mode): Add VLS related modes.
* config/riscv/vector-iterators.md: Extend VLS modes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/narrow-1.c: Adapt testcase.
* gcc.target/riscv/rvv/autovec/binop/narrow-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/narrow-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cmp/vcond-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cmp/vcond-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cmp/vcond-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cmp/vcond-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/pr110950.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/def.h: Ditto.
* gcc.target/riscv/rvv/autovec/vls/div-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/shift-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32f-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/avg-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/avg-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/avg-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/avg-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/avg-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/ext-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/ext-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/ext-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trunc-3.c: New test.

ira: Consider save/restore costs of callee-save registers [PR110071]

In improve_allocation() routine, IRA checks for each allocno if spilling
any conflicting allocnos can improve the allocation of this allocno.
This routine computes the cost improvement for usage of each profitable
hard register for a given allocno. The existing code in
improve_allocation() does not consider the save/restore costs of callee
save registers while computing the cost improvement.

This can result in a callee save register being assigned to a pseudo
that is live in the entire function and across a call, overriding a
non-callee save register assigned to the pseudo by graph coloring. So
the entry basic block requires a prolog, thereby causing shrink wrap to
fail.

2023-09-14 Surya Kumari Jangala <jskumari@linux.ibm.com>

gcc/
PR rtl-optimization/110071
* ira-color.cc (improve_allocation): Consider cost of callee
save registers.

gcc/testsuite/
PR rtl-optimization/110071
* gcc.target/powerpc/pr110071.c: New test.

Modify gas uleb128 support test

Some assemblers (GNU as for LoongArch) generate relocations for leb128
symbol arithmetic for relaxation, so we need to disable relaxation when
probing leb128 support.

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: Check assembler for -mno-relax support.
Disable relaxation when probing leb128 support.

co-authored-by: Xi Ruoyao <xry111@xry111.site>

LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default.

gcc/ChangeLog:

* config.in: Regenerate.
* config/loongarch/genopts/loongarch.opt.in: Add compilation option
mrelax. And set the initial value of explicit-relocs according to the
detection status.
* config/loongarch/gnu-user.h: When compiling with -mno-relax, pass the
--no-relax option to the linker.
* config/loongarch/loongarch-driver.h (ASM_SPEC): When compiling with
-mno-relax, pass the -mno-relax option to the assembler.
* config/loongarch/loongarch-opts.h (HAVE_AS_MRELAX_OPTION): Define macro.
* config/loongarch/loongarch.opt: Regenerate.
* configure: Regenerate.
* configure.ac: Add detection of support for binutils relax function.

Daily bump.

c++modules: report module mapper files as a dependency

It affects the build, and if used as a static file, can reliably be
tracked using the `-MF` mechanism.

gcc/cp/:

* mapper-client.cc, mapper-client.h (open_module_client): Accept
dependency tracking and track module mapper files as
dependencies.
* module.cc (make_mapper, get_mapper): Pass the dependency
tracking class down.

gcc/testsuite/:

* g++.dg/modules/depreport-2.modmap: New test.
* g++.dg/modules/depreport-2_a.C: New test.
* g++.dg/modules/depreport-2_b.C: New test.
* g++.dg/modules/test-depfile.py: Support `:|` syntax output
when generating modules.

Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

c++modules: report imported CMI files as dependencies

They affect the build, so report them via `-MF` mechanisms.

gcc/cp/

* module.cc (do_import): Report imported CMI files as
dependencies.

gcc/testsuite/

* g++.dg/modules/depreport-1_a.C: New test.
* g++.dg/modules/depreport-1_b.C: New test.
* g++.dg/modules/test-depfile.py: New tool for validating depfile
information.
* lib/modules.exp: Support for validating depfile contents.

Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

p1689r5: initial support

This patch implements support for [P1689R5][] to communicate the C++20
module dependencies to build systems so that they may build `.gcm` files
in the proper order.

Support is communicated through the following three new flags:

- `-fdeps-format=` specifies the format for the output. Currently the only
  supported format is `p1689r5`.

- `-fdeps-file=` specifies the path to the file to write the format to.

- `-fdeps-target=` specifies the `.o` that will be written for the TU
  that is scanned. This is required so that the build system can
  correlate the dependency output with the actual compilation that will
  occur.
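
To illustrate why a build system wants this information, here is a minimal
sketch of the ordering step it might perform once it has collected, for each
scanned TU, the object named by `-fdeps-target=`, the module the TU provides,
and the modules it imports. The `Rule` struct, `build_order`, and `demo` are
hypothetical names for this example and are a simplification, not the
P1689R5 schema itself or any build system's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified record for one scanned TU.
struct Rule
{
  std::string primary_output;        // object passed via -fdeps-target=
  std::string provides;              // module name, empty if none
  std::vector<std::string> imports;  // modules this TU imports
};

// Return the outputs so that every provider precedes its importers.
// Assumes the import graph is acyclic; imports with no known provider
// are skipped.
std::vector<std::string> build_order (const std::vector<Rule> &rules)
{
  std::map<std::string, std::size_t> provider;  // module -> rule index
  for (std::size_t i = 0; i < rules.size (); ++i)
    if (!rules[i].provides.empty ())
      provider[rules[i].provides] = i;

  std::vector<bool> done (rules.size (), false);
  std::vector<std::string> order;

  // Depth-first visit: emit a rule only after everything it imports.
  auto visit = [&] (auto &&self, std::size_t i) -> void
  {
    if (done[i])
      return;
    done[i] = true;
    for (const std::string &m : rules[i].imports)
      {
        auto it = provider.find (m);
        if (it != provider.end ())
          self (self, it->second);
      }
    order.push_back (rules[i].primary_output);
  };

  for (std::size_t i = 0; i < rules.size (); ++i)
    visit (visit, i);
  return order;
}

// Hypothetical usage: main.o imports a, and a imports b.
void demo ()
{
  std::vector<Rule> rules = {
    {"main.o", "", {"a"}},
    {"a.o", "a", {"b"}},
    {"b.o", "b", {}},
  };
  std::vector<std::string> expected = {"b.o", "a.o", "main.o"};
  assert (build_order (rules) == expected);
}
```

The object name is what ties each record back to a concrete compilation,
which is the correlation `-fdeps-target=` exists to provide.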

CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.

Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst

TODO:

- header-unit information fields

Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.

- non-utf8 paths

The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously representable in UTF-8" case).

- figure out why junk gets placed at the end of the file

Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.

libcpp/

* include/cpplib.h: Add cpp_fdeps_format enum.
(cpp_options): Add fdeps_format field.
(cpp_finish): Add structured dependency fdeps_stream parameter.
* include/mkdeps.h (deps_add_module_target): Add flag for
whether a module is exported or not.
(fdeps_add_target): Add function.
(deps_write_p1689r5): Add function.
* init.cc (cpp_finish): Add new preprocessor parameter used for C++
module tracking.
* mkdeps.cc (mkdeps): Implement P1689R5 output.

gcc/

* doc/invoke.texi: Document -fdeps-format=, -fdeps-file=, and
-fdeps-target= flags.
* gcc.cc: Add defaults for -fdeps-target= and -fdeps-file= when
only -fdeps-format= is specified.
* json.h: Add a TODO item to refactor out to share with
`libcpp/mkdeps.cc`.

gcc/c-family/

* c-opts.cc (c_common_handle_option): Add fdeps_file variable and
-fdeps-format=, -fdeps-file=, and -fdeps-target= parsing.
* c.opt: Add -fdeps-format=, -fdeps-file=, and -fdeps-target=
flags.

gcc/cp/

* module.cc (preprocessed_module): Pass whether the module is
exported to dependency tracking.

gcc/testsuite/

* g++.dg/modules/depflags-f-MD.C: New test.
* g++.dg/modules/depflags-f.C: New test.
* g++.dg/modules/depflags-fi.C: New test.
* g++.dg/modules/depflags-fj-MD.C: New test.
* g++.dg/modules/depflags-fj.C: New test.
* g++.dg/modules/depflags-fjo-MD.C: New test.
* g++.dg/modules/depflags-fjo.C: New test.
* g++.dg/modules/depflags-fo-MD.C: New test.
* g++.dg/modules/depflags-fo.C: New test.
* g++.dg/modules/depflags-j-MD.C: New test.
* g++.dg/modules/depflags-j.C: New test.
* g++.dg/modules/depflags-jo-MD.C: New test.
* g++.dg/modules/depflags-jo.C: New test.
* g++.dg/modules/depflags-o-MD.C: New test.
* g++.dg/modules/depflags-o.C: New test.
* g++.dg/modules/p1689-1.C: New test.
* g++.dg/modules/p1689-1.exp.ddi: New test expectation.
* g++.dg/modules/p1689-2.C: New test.
* g++.dg/modules/p1689-2.exp.ddi: New test expectation.
* g++.dg/modules/p1689-3.C: New test.
* g++.dg/modules/p1689-3.exp.ddi: New test expectation.
* g++.dg/modules/p1689-4.C: New test.
* g++.dg/modules/p1689-4.exp.ddi: New test expectation.
* g++.dg/modules/p1689-5.C: New test.
* g++.dg/modules/p1689-5.exp.ddi: New test expectation.
* g++.dg/modules/modules.exp: Load new P1689 library routines.
* g++.dg/modules/test-p1689.py: New tool for validating P1689 output.
* lib/modules.exp: Support for validating P1689 outputs.

Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

spec: add a spec function to join arguments

When passing `-o` flags to other options, the typical `-o foo` spelling
leaves a leading whitespace when replacing elsewhere. This ends up
creating flags spelled as `-some-option-with-arg= foo.ext` which doesn't
parse properly. When attempting to make a spec function to just remove
the leading whitespace, the argument splitting ends up masking the
whitespace. However, the intended extension *also* ends up being its own
argument. To perform the desired behavior, the arguments need to be
concatenated together.
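
The idea of concatenating the split pieces can be sketched outside of the
driver as follows. This is a standalone illustration using `std::string`,
not the code added to `gcc.cc`; `join_args`, `demo`, and the sample pieces
are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Concatenate all arguments with no separator, so that pieces which
// argument splitting handed over separately become one word again.
std::string join_args (const std::vector<std::string> &args)
{
  std::string out;
  for (const std::string &a : args)
    out += a;
  return out;
}

// Hypothetical usage: a base name and an extension arrive as two
// arguments and are joined into a single file name.
void demo ()
{
  assert (join_args ({"foo", ".ext"}) == "foo.ext");
  assert (join_args ({"only"}) == "only");
}
```

Because the result contains no embedded whitespace, it can be appended
directly after an option prefix such as `-some-option-with-arg=`.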

gcc/:

* gcc.cc (join_spec_func): Add a spec function to join all
arguments.

Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Co-authored-by: Jason Merrill <jason@redhat.com>

RISC-V: Fix --enable-checking=rtl ICE on rv32gc bootstrap

Resolves PR 111461.

during RTL pass: expand
offtime.c: In function '__offtime':
offtime.c:79:6: internal compiler error: RTL check: expected elt 0 type 'e' or 'u', have 'w' (rtx const_int) in riscv_legitimize_const_move, at config/riscv/riscv.cc:2176
79 | ip = __mon_yday[__isleap(y)];

Tested on rv32gc glibc with --enable-checking=rtl.

2023-09-19 Juzhe Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_const_move): Eliminate
src_op_0 var to avoid rtl check error.

Tested-by: Patrick O'Neill <patrick@rivosinc.com>

[frange] Clean up floating point relational folding.

The following patch removes all the special casing from the floating
point relational folding code.  Now all the code relating to folding
of relationals is in frelop_early_resolve() and in
operator_not_equal::fold_range() which requires a small tweak.

I have written new relational tests, and moved them to
gcc.dg/tree-ssa/vrp-float-relations-* for easy reference.  In the
tests it's easy to see the type of things we need to handle:

(a)
if (x != y)
  if (x == y)
    link_error ();

(b)
if (a != b)
  if (a != b) // Foldable as true.

(c)
/* We can thread BB2->BB4->BB5 even though we have no knowledge
   of the NANness of either x_1 or a_5.  */
__BB(4):
  x_1 = __PHI (__BB2: a_5(D), __BB3: b_4(D));
  if (x_1 __UNEQ a_5(D))

(d)
/* Even though x_1 and a_4 are equivalent on the BB2->BB4 path,
   we cannot fold the conditional because of possible NANs:  */
__BB(4):
  # x_1 = __PHI (__BB2: a_4(D), __BB3: 8.0e+0(3));
  if (x_1 == a_4(D))

(e)
if (cond)
  x = a;
else
  x = 8.0;

/* We can fold this as false on the path coming out of cond==1,
   regardless of NANs on either "x" or "a".  */
if (x < a)
  stuff ();

[etc, etc]

We can implement everything without either special casing,
get_identity_relation(), or adding new unordered relationals.

The basic idea is that if we accurately reflect NANs in op[12]_range,
this information gets propagated to the relevant edges, and there's no
need for unordered relations (VREL_UN*), because the information is in
the range itself.  This information is then used in
frelop_early_resolve() to fold certain combinations.
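
The NAN behaviour behind cases (d) and (e) can be checked with a small
standalone program.  This is only a sketch of the language semantics the
folder has to respect, assuming IEEE floating point and no -ffast-math; the
function names are made up for this example and have nothing to do with the
range-op code itself.

```cpp
#include <cassert>
#include <limits>

// Case (d): x holds the same value as a, yet x == a cannot be folded to
// true, because it is false when a is a NAN.
bool same_value_compares_equal (double a)
{
  double x = a;
  return x == a;
}

// Case (e), on the path where x is a copy of a: x < a is false whether
// or not a is a NAN, so folding to false is safe.
bool copy_is_less (double a)
{
  double x = a;
  return x < a;
}

// Usage check for both an ordinary value and a NAN.
void demo ()
{
  const double nan = std::numeric_limits<double>::quiet_NaN ();
  assert (same_value_compares_equal (1.0));
  assert (!same_value_compares_equal (nan));
  assert (!copy_is_less (1.0));
  assert (!copy_is_less (nan));
}
```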

I don't mean this patch as a hard-no against implementing the
unordered relations Jakub preferred, but seeing that it's looking
cleaner and trivially simple without the added burden of more enums,
I'd like to flesh it out completely and then discuss if we still think
new codes are needed.

More testcases or corner cases are highly welcome.

In follow-up patches I will finish up unordered relation folding, and
come up with suitable tests.

gcc/ChangeLog:

* range-op-float.cc (frelop_early_resolve): Clean-up and remove
special casing.
(operator_not_equal::fold_range): Handle VREL_EQ.
(operator_lt::fold_range): Remove special casing for VREL_EQ.
(operator_gt::fold_range): Same.
(foperator_unordered_equal::fold_range): Same.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp-float-12.c: Moved to...
* gcc.dg/tree-ssa/vrp-float-relations-1.c: ...here.
* gcc.dg/tree-ssa/vrp-float-relations-2.c: New test.
* gcc.dg/tree-ssa/vrp-float-relations-3.c: New test.
* gcc.dg/tree-ssa/vrp-float-relations-4.c: New test.