git.ipfire.org Git - thirdparty/gcc.git/log
9 months ago  aarch64: libstdc++: Use shufflevector instead of shuffle in opt_random.h
Ricardo Jesus [Mon, 14 Oct 2024 13:28:02 +0000 (14:28 +0100)] 
aarch64: libstdc++: Use shufflevector instead of shuffle in opt_random.h

This patch modifies the implementation of the vectorized Mersenne
Twister random number generator to use __builtin_shufflevector instead
of __builtin_shuffle. This makes it (almost) compatible with Clang.

To make the implementation fully compatible with Clang, Clang will need
to support internal Neon types like __Uint8x16_t and __Uint32x4_t, which
currently it does not. This looks like an oversight in Clang and so will
be addressed separately.

I see no codegen change with this patch.

Bootstrapped and tested on aarch64-none-linux-gnu.

libstdc++-v3/ChangeLog:

* config/cpu/aarch64/opt/ext/opt_random.h (__VEXT): Replace uses
of __builtin_shuffle with __builtin_shufflevector.
(__aarch64_lsl_128): Move shift amount to a template parameter.
(__aarch64_lsr_128): Move shift amount to a template parameter.
(__aarch64_recursion): Update call sites of __aarch64_lsl_128
and __aarch64_lsr_128.

Signed-off-by: Ricardo Jesus <rjj@nvidia.com>
9 months ago  Record nonzero bits in the irange_bitmask of POLY_INT_CSTs
Richard Sandiford [Thu, 24 Oct 2024 13:22:34 +0000 (14:22 +0100)] 
Record nonzero bits in the irange_bitmask of POLY_INT_CSTs

At the moment, ranger punts entirely on POLY_INT_CSTs.  Numerical
ranges are a bit difficult, unless we do start modelling bounds on
the indeterminates.  But we can at least track the nonzero bits.

gcc/
* value-query.cc (range_query::get_tree_range): Use get_nonzero_bits
to populate the irange_bitmask of a POLY_INT_CST.

gcc/testsuite/
* gcc.target/aarch64/sve/cnt_fold_6.c: New test.

9 months ago  Try to simplify (X >> C1) * (C2 << C1) -> X * C2
Richard Sandiford [Thu, 24 Oct 2024 13:22:33 +0000 (14:22 +0100)] 
Try to simplify (X >> C1) * (C2 << C1) -> X * C2

This patch adds a rule to simplify (X >> C1) * (C2 << C1) -> X * C2
when the low C1 bits of X are known to be zero.  As with the earlier
X >> C1 << (C2 + C1) patch, any single conversion is allowed between
the shift and the multiplication.
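
The rule can be checked on plain unsigned integers.  What follows is a
minimal sketch rather than the match.pd pattern itself: the function
names are hypothetical and X is assumed to be a 32-bit unsigned value.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: (x >> c1) * (c2 << c1).  */
uint32_t
mul_of_shifts (uint32_t x, unsigned c1, uint32_t c2)
{
  assert (c1 < 32);   /* keep the shifts in range */
  return (x >> c1) * (c2 << c1);
}

/* Simplified form: x * c2.  Only equivalent to mul_of_shifts when the
   low c1 bits of x are zero.  */
uint32_t
mul_direct (uint32_t x, uint32_t c2)
{
  return x * c2;
}
```

With x = 48 (low four bits clear), c1 = 4 and c2 = 5, both forms give
240.  With x = 50 the right shift discards set bits, so the original
form still gives 240 while the direct product is 250.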

gcc/
* match.pd: Simplify (X >> C1) * (C2 << C1) -> X * C2 if the
low C1 bits of X are zero.

gcc/testsuite/
* gcc.dg/tree-ssa/shifts-3.c: New test.
* gcc.dg/tree-ssa/shifts-4.c: Likewise.
* gcc.target/aarch64/sve/cnt_fold_5.c: Likewise.

9 months ago  Handle POLY_INT_CSTs in get_nonzero_bits
Richard Sandiford [Thu, 24 Oct 2024 13:22:33 +0000 (14:22 +0100)] 
Handle POLY_INT_CSTs in get_nonzero_bits

This patch extends get_nonzero_bits to handle POLY_INT_CSTs.
The easiest (but also most useful) case is that the number
of trailing zeros in the runtime value is at least the number
of trailing zeros in each individual component.

In principle, we could do this for coeffs 1 and above only,
and then OR in coeff 0.  This would give ~0x11 for [14, 32], say.
But that's future work.
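
As an illustration of the trailing-zeros argument, the sketch below
models a two-coefficient poly_int as c0 + c1 * x for an unknown runtime
x.  It is not the code in tree-ssanames.cc; all names are hypothetical
and the 64-bit width is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Number of trailing zero bits in v, which must be nonzero.  */
unsigned
trailing_zeros (uint64_t v)
{
  unsigned n = 0;
  assert (v != 0);
  while ((v & 1) == 0)
    {
      v >>= 1;
      n++;
    }
  return n;
}

/* Mask of bits that may be nonzero in c0 + c1 * x for any x.  A zero
   coefficient places no constraint on the result.  */
uint64_t
poly_nonzero_bits (uint64_t c0, uint64_t c1)
{
  assert (c0 != 0 || c1 != 0);
  unsigned tz0 = c0 ? trailing_zeros (c0) : 64;
  unsigned tz1 = c1 ? trailing_zeros (c1) : 64;
  unsigned tz = tz0 < tz1 ? tz0 : tz1;
  return ~UINT64_C (0) << tz;
}

/* Runtime value of the modelled poly_int for a given x.  */
uint64_t
poly_value (uint64_t c0, uint64_t c1, uint64_t x)
{
  return c0 + c1 * x;
}
```

For [16, 16] the mask clears the low four bits, and every runtime value
16 + 16 * x fits inside it.  For [14, 32] this simple rule only clears
bit 0, which is weaker than the ~0x11 mentioned above as future work.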

gcc/
* tree-ssanames.cc (get_nonzero_bits): Handle POLY_INT_CSTs.
* match.pd (with_possible_nonzero_bits): Likewise.

gcc/testsuite/
* gcc.target/aarch64/sve/cnt_fold_4.c: New test.

9 months ago  Try to simplify (X >> C1) << (C1 + C2) -> X << C2
Richard Sandiford [Thu, 24 Oct 2024 13:22:32 +0000 (14:22 +0100)] 
Try to simplify (X >> C1) << (C1 + C2) -> X << C2

This patch adds a rule to simplify (X >> C1) << (C1 + C2) -> X << C2
when the low C1 bits of X are known to be zero.

Any single conversion can take place between the shifts.  E.g. for
a truncating conversion, any extra bits of X that are preserved by
truncating after the shift are immediately lost by the shift left.
And the sign bits used for an extending conversion are the same as
the sign bits used for the rshift.  (A double conversion of say
int->unsigned->uint64_t would be wrong though.)
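
The condition on the low bits can be seen with a small experiment.
This is only a sketch on 32-bit unsigned values with no conversion
between the shifts; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: (x >> c1) << (c1 + c2).  */
uint32_t
shift_pair (uint32_t x, unsigned c1, unsigned c2)
{
  assert (c1 + c2 < 32);   /* keep both shifts in range */
  return (x >> c1) << (c1 + c2);
}

/* Simplified form: x << c2.  Only equivalent to shift_pair when the
   low c1 bits of x are zero.  */
uint32_t
shift_single (uint32_t x, unsigned c2)
{
  assert (c2 < 32);
  return x << c2;
}
```

With x = 0xF0, c1 = 4 and c2 = 3, both forms give 0x780.  With
x = 0xF8, bit 3 is lost by the right shift, so shift_pair still gives
0x780 while shift_single gives 0x7C0.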

gcc/
* match.pd: Simplify (X >> C1) << (C1 + C2) -> X << C2 if the
low C1 bits of X are zero.

gcc/testsuite/
* gcc.dg/tree-ssa/shifts-1.c: New test.
* gcc.dg/tree-ssa/shifts-2.c: Likewise.

9 months ago  Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule
Richard Sandiford [Thu, 24 Oct 2024 13:22:32 +0000 (14:22 +0100)] 
Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule

match.pd had a rule to simplify ((X /[ex] A) +- B) * A -> X +- A * B
when A and B are INTEGER_CSTs.  This patch extends it to handle the
case where the outer multiplication is by a factor of A, not just
A itself.  It also handles addition and multiplication of poly_ints.
(Exact division by a poly_int seems unlikely.)
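
The generalised identity, in the form given in the ChangeLog entry
below, can be checked numerically.  This is a sketch with hypothetical
function names; it covers only the integer-constant case and assumes
values small enough not to overflow int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: ((x /[ex] c1) + c2) * (c1 * c3).  */
int64_t
mulexactdiv_before (int64_t x, int64_t c1, int64_t c2, int64_t c3)
{
  assert (c1 != 0 && x % c1 == 0);   /* the division must be exact */
  return (x / c1 + c2) * (c1 * c3);
}

/* Simplified form: (x * c3) + (c1 * c2 * c3).  */
int64_t
mulexactdiv_after (int64_t x, int64_t c1, int64_t c2, int64_t c3)
{
  return x * c3 + c1 * c2 * c3;
}
```

For x = 24, c1 = 4, c2 = 3 and c3 = 5, both forms give 180.  Passing a
negative c2 exercises the subtraction variant of the rule.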

gcc/
* match.pd: Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule
to ((X /[ex] C1) +- C2) * (C1 * C3) -> (X * C3) +- (C1 * C2 * C3).

gcc/testsuite/
* gcc.dg/tree-ssa/mulexactdiv-5.c: New test.
* gcc.dg/tree-ssa/mulexactdiv-6.c: Likewise.
* gcc.dg/tree-ssa/mulexactdiv-7.c: Likewise.
* gcc.dg/tree-ssa/mulexactdiv-8.c: Likewise.
* gcc.target/aarch64/sve/cnt_fold_3.c: Likewise.

9 months ago  Simplify (X /[ex] C1) * (C1 * C2) -> X * C2
Richard Sandiford [Thu, 24 Oct 2024 13:22:31 +0000 (14:22 +0100)] 
Simplify (X /[ex] C1) * (C1 * C2) -> X * C2

gcc/
* match.pd: Simplify (X /[ex] C1) * (C1 * C2) -> X * C2.

gcc/testsuite/
* gcc.dg/tree-ssa/mulexactdiv-1.c: New test.
* gcc.dg/tree-ssa/mulexactdiv-2.c: Likewise.
* gcc.dg/tree-ssa/mulexactdiv-3.c: Likewise.
* gcc.dg/tree-ssa/mulexactdiv-4.c: Likewise.
* gcc.target/aarch64/sve/cnt_fold_1.c: Likewise.
* gcc.target/aarch64/sve/cnt_fold_2.c: Likewise.

9 months ago  Use get_nonzero_bits to simplify trunc_div to exact_div
Richard Sandiford [Thu, 24 Oct 2024 13:22:31 +0000 (14:22 +0100)] 
Use get_nonzero_bits to simplify trunc_div to exact_div

There are a limited number of existing rules that benefit from
knowing that a division is exact.  Later patches will add more.
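
As a rough model of the test involved: a division by 1 << C is exact
when none of the low C bits of X can be set.  The sketch below takes a
mask of possibly-nonzero bits, in the spirit of get_nonzero_bits; the
function name and the 32-bit width are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if dividing by (1 << c) is known to be exact, given
   the mask of bits of X that may be nonzero.  */
int
trunc_div_is_exact (uint32_t nonzero_bits, unsigned c)
{
  assert (c < 32);
  uint32_t low_mask = (UINT32_C (1) << c) - 1;
  return (nonzero_bits & low_mask) == 0;
}
```

For a value known to be a multiple of 8 (mask 0xFFFFFFF8), division by
4 or 8 is exact, but division by 16 is not known to be.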

gcc/
* match.pd: Simplify X / (1 << C) to X /[ex] (1 << C) if the
low C bits of X are clear.

gcc/testsuite/
* gcc.dg/tree-ssa/cmpexactdiv-6.c: New test.

9 months ago  Make more places handle exact_div like trunc_div
Richard Sandiford [Thu, 24 Oct 2024 13:22:30 +0000 (14:22 +0100)] 
Make more places handle exact_div like trunc_div

I tried to look for places where we were handling TRUNC_DIV_EXPR
more favourably than EXACT_DIV_EXPR.

Most of the places that I looked at but didn't change were handling
div/mod pairs.  But there are bound to be others I missed...

gcc/
* match.pd: Extend some rules to handle exact_div like trunc_div.
* tree.h (trunc_or_exact_div_p): New function.
* tree-ssa-loop-niter.cc (is_rshift_by_1): Use it.
* tree-ssa-loop-ivopts.cc (force_expr_to_var_cost): Handle
EXACT_DIV_EXPR.

9 months ago  Implement pointer_or_operator.
Andrew MacLeod [Wed, 23 Oct 2024 14:59:13 +0000 (10:59 -0400)] 
Implement pointer_or_operator.

The class pointer_or is no longer used, and can be removed.  Its
functionality was never moved to the new dispatch system.
This implements operator_bitwise_or::fold_range() for prange operands.

* range-op-mixed.h (operator_bitwise_or::fold_range): Add prange
variant.
* range-op-ptr.cc (class pointer_or_operator): Remove.
(pointer_or_operator::op1_range): Remove.
(pointer_or_operator::op2_range): Remove.
(pointer_or_operator::wi_fold): Remove.
(operator_bitwise_or::fold_range): New prange variant.

9 months ago  Remove pointer_and_operator.
Andrew MacLeod [Mon, 21 Oct 2024 22:20:10 +0000 (18:20 -0400)] 
Remove pointer_and_operator.

This operator class predates the dispatch system, and is no longer used.
The functionality of wi_fold has been replaced by
operator_bitwise_and::fold_range with prange operands.

* range-op-ptr.cc (class pointer_and_operator): Remove.
(pointer_and_operator::wi_fold): Remove.

9 months ago  Remove pointer_min_max_operator.
Andrew MacLeod [Mon, 21 Oct 2024 22:11:43 +0000 (18:11 -0400)] 
Remove pointer_min_max_operator.

The pointer_min_max_operator class was used before the current dispatch
system was created.  These operations have been transferred to
operator_min::fold_range () and operator_max::fold_range () with prange
operands.

This class is no longer used for anything, so delete it.

* range-op-ptr.cc (class pointer_min_max_operator): Remove.
(pointer_min_max_operator::wi_fold): Remove.

9 months ago  Cleanup pointer_plus_operator.
Andrew MacLeod [Mon, 21 Oct 2024 20:47:32 +0000 (16:47 -0400)] 
Cleanup pointer_plus_operator.

The POINTER_PLUS operator still carries some remnants of the old
irange interface, which is now dead code with prange.

* range-op-ptr.cc (pointer_plus_operator::wi_fold): Remove.
(pointer_plus_operator::op2_range): Remove irange variant.
(pointer_plus_operator::update_bitmask): Likewise.

9 months ago  c++: Further fix for get_member_function_from_ptrfunc [PR117259]
Jakub Jelinek [Thu, 24 Oct 2024 10:56:19 +0000 (12:56 +0200)] 
c++: Further fix for get_member_function_from_ptrfunc [PR117259]

The following testcase shows that the previous get_member_function_from_ptrfunc
changes weren't sufficient and we still have cases where
-fsanitize=undefined with pointers to member functions can cause wrong code
to be generated, along with related false positive warnings.

The problem is that save_expr doesn't always create SAVE_EXPR, it can skip
some invariant arithmetics and in the end it could be really large
expressions which would be evaluated several times (and what is worse, with
-fsanitize=undefined those expressions then can have SAVE_EXPRs added to
their subparts for -fsanitize=bounds or -fsanitize=null or
-fsanitize=alignment instrumentation).  Tried to just build1 a SAVE_EXPR
+ add TREE_SIDE_EFFECTS instead of save_expr, but that doesn't work either,
because cp_fold happily optimizes those SAVE_EXPRs away when it sees
SAVE_EXPR operand is tree_invariant_p.

So, the following patch instead of using save_expr or building SAVE_EXPR
manually builds a TARGET_EXPR.  Both types are pointers, so it doesn't need
to be destroyed in any way, but TARGET_EXPR is what doesn't get optimized
away immediately.

2024-10-24  Jakub Jelinek  <jakub@redhat.com>

PR c++/117259
* typeck.cc (get_member_function_from_ptrfunc): Use force_target_expr
rather than save_expr for instance_ptr and function.  Don't call it
for TREE_CONSTANT.

* g++.dg/ubsan/pr117259.C: New test.

9 months ago  asan: Fix up build_check_stmt gsi handling [PR117209]
Jakub Jelinek [Thu, 24 Oct 2024 10:45:34 +0000 (12:45 +0200)] 
asan: Fix up build_check_stmt gsi handling [PR117209]

gsi_safe_insert_before properly updates gsi_bb in gimple_stmt_iterator
in case it splits objects, but unfortunately build_check_stmt was in
some places (but not others) using a copy of the iterator rather than
the iterator passed from callers and so didn't propagate that to callers.
I guess it didn't matter much before when it was just using
gsi_insert_before as that really didn't change the iterator.
The !before_p case is apparently dead code, nothing is calling it with
before_p=false since around 4.9.

2024-10-24  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/117209
* asan.cc (maybe_cast_to_ptrmode): Formatting fix.
(build_check_stmt): Don't copy *iter into gsi, perform all
the updates on iter directly.

* gcc.dg/asan/pr117209.c: New test.

9 months ago  SVE intrinsics: Fold svsra with op1 all zeros to svlsr/svasr.
Jennifer Schmitz [Thu, 17 Oct 2024 09:31:47 +0000 (02:31 -0700)] 
SVE intrinsics: Fold svsra with op1 all zeros to svlsr/svasr.

A common idiom in intrinsics loops is to have accumulator intrinsics
in an unrolled loop with an accumulator initialized to zero at the beginning.
Propagating the initial zero accumulator into the first iteration
of the loop and simplifying the first accumulate instruction is a
desirable transformation that we should teach GCC.
Therefore, this patch folds svsra to svlsr/svasr if op1 is all zeros,
producing the lower latency instructions LSR/ASR instead of USRA/SSRA.
We implemented this optimization in svsra_impl::fold.

Tests were added to check the produced assembly for use of LSR/ASR.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svsra_impl::fold): Fold svsra to svlsr/svasr if op1 is all zeros.

gcc/testsuite/
* gcc.target/aarch64/sve2/acle/asm/sra_s32.c: New test.
* gcc.target/aarch64/sve2/acle/asm/sra_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sra_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sra_u64.c: Likewise.

9 months ago  SVE intrinsics: Fold constant operands for svlsl.
Soumya AR [Thu, 17 Oct 2024 04:00:35 +0000 (09:30 +0530)] 
SVE intrinsics: Fold constant operands for svlsl.

This patch implements constant folding for svlsl. Test cases have been added to
check for the following cases:

Zero, merge, and don't care predication.
Shift by 0.
Shift by register width.
Overflow shift on signed and unsigned integers.
Shift on a negative integer.
Maximum possible shift, e.g. shift by 7 on an 8-bit integer.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-base.cc (svlsl_impl::fold):
Try constant folding.
* config/aarch64/aarch64-sve-builtins.cc (aarch64_const_binop):
Return 0 if shift is out of range.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/const_fold_lsl_1.c: New test.

9 months ago  SVE intrinsics: Fold division and multiplication by -1 to neg
Jennifer Schmitz [Tue, 1 Oct 2024 15:01:13 +0000 (08:01 -0700)] 
SVE intrinsics: Fold division and multiplication by -1 to neg

Because a neg instruction has lower latency and higher throughput than
sdiv and mul, svdiv and svmul by -1 can be folded to svneg. For svdiv,
this is already implemented on the RTL level; for svmul, the
optimization was still missing.
This patch implements folding to svneg for both operations using the
gimple_folder. For svdiv, the transform is applied if the divisor is -1.
Svmul is folded if either of the operands is -1. A case distinction of
the predication is made to account for the fact that svneg_m has 3 arguments
(argument 0 holds the values for the inactive lanes), while svneg_x and
svneg_z have only 2 arguments.
Tests were added or adjusted to check the produced assembly and runtime
tests were added to check correctness.
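
The predication distinction can be modelled one lane at a time.  This
is a scalar sketch of the ACLE semantics, not the gimple_folder code;
the function names are hypothetical, and passing op1 as the "inactive"
argument of svneg_m is an assumption about how the fold preserves the
merging behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of svmul_m (pg, op1, -1): an inactive lane keeps op1.  */
int32_t
lane_mul_m_by_minus1 (int pg, int32_t op1)
{
  assert (op1 != INT32_MIN);   /* avoid signed overflow in this model */
  return pg ? op1 * -1 : op1;
}

/* One lane of svneg_m (inactive, pg, op): an inactive lane takes its
   value from the extra first argument.  */
int32_t
lane_neg_m (int32_t inactive, int pg, int32_t op)
{
  assert (op != INT32_MIN);
  return pg ? -op : inactive;
}

/* One lane of svneg_z (pg, op): an inactive lane is zeroed.  */
int32_t
lane_neg_z (int pg, int32_t op)
{
  assert (op != INT32_MIN);
  return pg ? -op : 0;
}
```

In this model lane_mul_m_by_minus1 (pg, x) equals lane_neg_m (x, pg, x)
for both active and inactive lanes, which is why the merging form needs
the additional argument while the zeroing form does not.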

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
Fold division by -1 to svneg.
(svmul_impl::fold): Fold multiplication by -1 to svneg.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/mul_s16.c: Adjust expected outcome.
* gcc.target/aarch64/sve/acle/asm/mul_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/mul_s64.c: Adjust expected outcome.
* gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise.
* gcc.target/aarch64/sve/div_const_run.c: New test.
* gcc.target/aarch64/sve/mul_const_run.c: Likewise.

9 months ago  SVE intrinsics: Add constant folding for svindex.
Jennifer Schmitz [Tue, 15 Oct 2024 14:58:14 +0000 (07:58 -0700)] 
SVE intrinsics: Add constant folding for svindex.

This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
  return svindex_u64 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
    return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
was previously compiled to
f2:
        index   z0.d, #10, #3
        mul     z0.d, z0.d, #5
        ret
Now, it is compiled to
f2:
        mov     x0, 50
        index   z0.d, x0, #15
        ret
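
The arithmetic behind the new output can be checked with a scalar model
of the series: multiplying {10, 13, 16, ...} by 5 gives the series with
base 50 and step 15.  The helper names below are hypothetical and the
lane count is arbitrary, since the real vector length is only known at
run time.

```c
#include <assert.h>
#include <stdint.h>

/* Lane i of the linear series {base, base + step, base + 2 * step, ...}.  */
uint64_t
series_lane (uint64_t base, uint64_t step, unsigned i)
{
  return base + step * i;
}

/* Check, lane by lane, that scaling series (base, step) by factor gives
   series (base * factor, step * factor).  */
void
check_scaled_series (uint64_t base, uint64_t step, uint64_t factor,
                     unsigned lanes)
{
  for (unsigned i = 0; i < lanes; i++)
    assert (series_lane (base, step, i) * factor
            == series_lane (base * factor, step * factor, i));
}
```

Calling check_scaled_series (10, 3, 5, 32) passes for every lane, which
matches the "index z0.d, x0, #15" with x0 = 50 shown above.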

We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc
(svindex_impl::fold): Add constant folding.

gcc/testsuite/
* gcc.target/aarch64/sve/index_const_fold.c: New test.

9 months ago  [PATCH] RISC-V: override alignment of function/jump/loop
Wang Pengcheng [Thu, 24 Oct 2024 05:11:53 +0000 (23:11 -0600)] 
[PATCH] RISC-V: override alignment of function/jump/loop

Just like what AArch64 has done.

Signed-off-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com>
gcc/ChangeLog:

* config/riscv/riscv.cc (struct riscv_tune_param): Add new
tune options.
(riscv_override_options_internal): Override the default alignment
when not optimizing for size.

9 months ago  libffi: LoongArch: Fix soft-float builds of libffi
Yang Yujie [Sat, 27 Jan 2024 07:09:46 +0000 (15:09 +0800)] 
libffi: LoongArch: Fix soft-float builds of libffi

This patch corresponds to the upstream PR
https://github.com/libffi/libffi/pull/817
which has been merged.

libffi/ChangeLog:

* src/loongarch64/ffi.c: Avoid defining floats
in struct call_context if the ABI is soft-float.

9 months ago  testsuite: Fix up pr116488.c and pr117226.c tests [PR116488]
Jakub Jelinek [Thu, 24 Oct 2024 03:21:13 +0000 (21:21 -0600)] 
testsuite: Fix up pr116488.c and pr117226.c tests [PR116488]

Hi!

On Mon, Oct 21, 2024 at 01:39:52PM -0600, Jeff Law wrote:
>  * gcc.dg/torture/pr116488.c: New test.
>  * gcc.dg/torture/pr117226.c: New test.

These two tests FAIL on powerpc64le-linux (and I assume on all other
-funsigned-char defaulting targets).

The following patch fixes that, tested on powerpc64le-linux and
x86_64-linux (-m32/-m64); on x86_64 also tested before/after with
-funsigned-char.

Ok for trunk?

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/116488
PR rtl-optimization/117226
* gcc.dg/torture/pr116488.c (c, e): Change type from char to
signed char.
* gcc.dg/torture/pr117226.c (main): Change f type from char to
signed char.

9 months ago  RISC-V: Add testcases for form 4 of signed vector SAT_ADD
Pan Li [Mon, 23 Sep 2024 05:43:50 +0000 (13:43 +0800)] 
RISC-V: Add testcases for form 4 of signed vector SAT_ADD

Form 4:
  #define DEF_VEC_SAT_S_ADD_FMT_4(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_add_##T##_fmt_4 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T sum;                                                         \
        bool overflow = __builtin_add_overflow (x, y, &sum);           \
        out[i] = !overflow ? sum : x < 0 ? MIN : MAX;                  \
      }                                                                \
  }

DEF_VEC_SAT_S_ADD_FMT_4 (int8_t, uint8_t, INT8_MIN, INT8_MAX)
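
For reference, the per-element computation in form 4 can be written as
a scalar function and checked exhaustively for int8_t.  This is a
sketch, not part of the testsuite; the function names are hypothetical
and it relies on the GCC/Clang builtin __builtin_add_overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scalar body of form 4 for T = int8_t.  */
int8_t
sat_s_add_i8 (int8_t x, int8_t y)
{
  int8_t sum;
  bool overflow = __builtin_add_overflow (x, y, &sum);
  return !overflow ? sum : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Compare every input pair against a reference that adds in int and
   then clamps to the int8_t range.  */
void
check_sat_s_add_i8 (void)
{
  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      {
        int wide = x + y;
        int expect = wide < INT8_MIN ? INT8_MIN
                     : wide > INT8_MAX ? INT8_MAX : wide;
        assert (sat_s_add_i8 ((int8_t) x, (int8_t) y) == expect);
      }
}
```

Testing x < 0 is enough to pick the saturation bound because signed
addition can only overflow when both operands have the same sign.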

The tests below passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-16.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago  Daily bump.
GCC Administrator [Thu, 24 Oct 2024 00:20:23 +0000 (00:20 +0000)] 
Daily bump.

9 months ago  aarch64: Fix warning in aarch64_ptrue_reg
Andrew Pinski [Wed, 23 Oct 2024 23:39:21 +0000 (16:39 -0700)] 
aarch64: Fix warning in aarch64_ptrue_reg

After r15-4579-g9ffcf1f193b477, we get the following warning/error while bootstrapping on aarch64:
```
../../gcc/gcc/config/aarch64/aarch64.cc: In function ‘rtx_def* aarch64_ptrue_reg(machine_mode, unsigned int)’:
../../gcc/gcc/config/aarch64/aarch64.cc:3643:21: error: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Werror=sign-compare]
 3643 |   for (int i = 0; i < vl; i++)
      |                   ~~^~~~
```

This changes the type of i to unsigned to match the type of vl.

Pushed as obvious after a bootstrap/test on aarch64-linux-gnu.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_ptrue_reg): Fix type
of induction variable i.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
9 months ago  match: Reject non-const internal functions [PR117260]
Andrew Pinski [Tue, 22 Oct 2024 16:05:38 +0000 (09:05 -0700)] 
match: Reject non-const internal functions [PR117260]

When internal functions support was added to match (r6-4979-gc9e926ce2bdc8b),
the check for ECF_CONST was added only on the builtin function side.  Before
r15-4503-g8d6d6d537fdc, however, maybe_push_res_to_seq was never used with
non-const internal functions, so the missing check made no difference.

This adds the check for internal functions just as there is a check for builtins.

Note I didn't add a testcase because there was no non-const internal function
which could be used on x86_64 in a decent manner.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/117260
* gimple-match-exports.cc (maybe_push_res_to_seq): Reject non-const
internal functions.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
9 months ago  ginclude: stdalign.h should define __xxx_is_defined macros for C++
Jonathan Wakely [Tue, 22 Oct 2024 15:26:27 +0000 (16:26 +0100)] 
ginclude: stdalign.h should define __xxx_is_defined macros for C++

The __alignas_is_defined macro has been required by C++ since C++11, and
C++ Library DR 4036 clarified that __alignof_is_defined should be
defined too. The whole <stdalign.h> header was deprecated for C++23 (see
LWG 3827) and is likely to be removed for C++26 (see P3348), but we can
deal with that later.

The macros alignas and alignof should not be defined, as they're
keywords in C++.

gcc/ChangeLog:

* ginclude/stdalign.h (__alignas_is_defined): Define for C++.
(__alignof_is_defined): Likewise.

libstdc++-v3/ChangeLog:

* testsuite/18_support/headers/cstdalign/macros.cc: New test.

9 months ago  top-level: Add pull request template for Forgejo
Jonathan Wakely [Wed, 23 Oct 2024 14:20:27 +0000 (15:20 +0100)] 
top-level: Add pull request template for Forgejo

ChangeLog:

* .forgejo/PULL_REQUEST_TEMPLATE.md: New file.

9 months ago  jit: reset state in varasm.cc [PR117275]
David Malcolm [Wed, 23 Oct 2024 18:26:38 +0000 (14:26 -0400)] 
jit: reset state in varasm.cc [PR117275]

PR jit/117275 reports various jit test failures seen on
powerpc64le-unknown-linux-gnu due to hitting this assertion
in varasm.cc on the 2nd compilation in a process:

#2  0x00007ffff63e67d0 in assemble_external_libcall (fun=0x7ffff2a4b1d8)
    at ../../src/gcc/varasm.cc:2650
2650          gcc_assert (!pending_assemble_externals_processed);
(gdb) p pending_assemble_externals_processed
$1 = true

We're not properly resetting state in varasm.cc after a compile
for libgccjit.

Fixed thusly.

gcc/ChangeLog:
PR jit/117275
* toplev.cc (toplev::finalize): Call varasm_cc_finalize.
* varasm.cc (varasm_cc_finalize): New.
* varasm.h (varasm_cc_finalize): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
9 months ago  aarch64: Improve scalar mode popcount expansion by using SVE [PR113860]
Pengxuan Zheng [Mon, 14 Oct 2024 12:37:49 +0000 (05:37 -0700)] 
aarch64: Improve scalar mode popcount expansion by using SVE [PR113860]

This is similar to the recent improvements to the Advanced SIMD popcount
expansion by using SVE. We can utilize SVE to generate more efficient code for
scalar mode popcount too.

Changes since v1:
* v2: Add a new VNx1BI mode and a new test case for V1DI.
* v3: Abandon VNx1BI changes and add a new variant of aarch64_ptrue_reg.

PR target/113860

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_ptrue_reg): New function.
* config/aarch64/aarch64-simd.md (popcount<mode>2): Update pattern to
also support V1DI mode.
* config/aarch64/aarch64.cc (aarch64_ptrue_reg): New function.
* config/aarch64/aarch64.md (popcount<mode>2): Add TARGET_SVE support.
* config/aarch64/iterators.md (VDQHSD_V1DI): New mode iterator.
(SVE_VDQ_I): Add V1DI.
(bitsize): Likewise.
(VPRED): Likewise.
(VEC_POP_MODE): New mode attribute.
(vec_pop_mode): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/popcnt-sve.c: Update test.
* gcc.target/aarch64/popcnt11.c: New test.
* gcc.target/aarch64/popcnt12.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
9 months ago  Implement operator_pointer_diff::fold_range
Andrew MacLeod [Mon, 21 Oct 2024 20:32:00 +0000 (16:32 -0400)] 
Implement operator_pointer_diff::fold_range

prange has no default fold_range processing like irange does, so each
pointer specific operator needs to implement its own fold routine.

PR tree-optimization/117222
gcc/
* range-op-ptr.cc (operator_pointer_diff::fold_range): New.
(operator_pointer_diff::op1_op2_relation_effect): Remove irange
variant.
(operator_pointer_diff::update_bitmask): Likewise.

gcc/testsuite
* g++.dg/pr117222.C: New.

9 months ago  libstdc++: Add -D_GLIBCXX_ASSERTIONS default for -O0 to API history
Jonathan Wakely [Wed, 23 Oct 2024 15:01:04 +0000 (16:01 +0100)] 
libstdc++: Add -D_GLIBCXX_ASSERTIONS default for -O0 to API history

libstdc++-v3/ChangeLog:

* doc/xml/manual/evolution.xml: Document that assertions are
enabled for unoptimized builds.
* doc/html/*: Regenerate.

9 months ago  libstdc++: Add GLIBCXX_TESTSUITE_STDS example to docs
Jonathan Wakely [Tue, 22 Oct 2024 20:18:51 +0000 (21:18 +0100)] 
libstdc++: Add GLIBCXX_TESTSUITE_STDS example to docs

libstdc++-v3/ChangeLog:

* doc/xml/manual/test.xml: Add GLIBCXX_TESTSUITE_STDS example.
* doc/html/manual/test.html: Regenerate.

9 months ago  diagnostics: implement buffering for non-textual formats [PR105916]
David Malcolm [Wed, 23 Oct 2024 14:54:42 +0000 (10:54 -0400)] 
diagnostics: implement buffering for non-textual formats [PR105916]

PR fortran/105916 reports stray diagnostics appearing in JSON and SARIF
output from gfortran.

In order to handle various awkward parsing issues, the Fortran frontend
implements buffering of diagnostics, so that diagnostics reported to
global_dc can be either:
(a) immediately issued, or
(b) speculatively reported to global_dc, and stored in a buffer, to
either be issued later or discarded.

This buffering code in gcc/fortran/error.cc directly manipulates
implementation details of the diagnostic_context such as the
pretty_printer's buffer, and the counts of how many diagnostics have
been issued.  The issue is that this manipulation of pretty_printer's
buffer doesn't work for formats such as JSON and SARIF where diagnostics
are handled in a different way (such as by accumulating json::object
instances in an array).

This patch moves responsibility for such buffering of diagnostics from
fortran's error.cc to the diagnostic subsystem.  It introduces a new
class diagnostic_buffer representing a particular buffer of diagnostics
that have been reported but not yet issued.  Each diagnostic output
format implements buffering in a different way, and so there is a
new class hierarchy, diagnostic_per_format_buffer, representing the
various format-specific ways that buffering is to be implemented.  This
is hidden as an implementation detail of diagnostic_buffer.

The patch also updates how diagnostics of each kind (e.g. warnings vs
errors) are counted, so that if buffering is enabled, the count is
incremented within the buffer, and the counts in the diagnostic_context
are only updated if and when the buffer is flushed; checking for
max_errors is similarly updated to support both buffered and unbuffered
cases.
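
The counting scheme described above can be sketched in a few lines.
This is an illustrative C model only: the real implementation is C++
inside the diagnostics subsystem, and every type and function name
below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum diag_kind { DIAG_WARNING, DIAG_ERROR, DIAG_NUM_KINDS };

struct diag_counts { int n[DIAG_NUM_KINDS]; };

struct diag_context
{
  struct diag_counts issued;    /* diagnostics actually issued */
  struct diag_counts *buffer;   /* non-NULL while buffering is enabled */
};

/* Count a diagnostic in the buffer if one is active, otherwise in the
   context itself.  */
void
diag_report (struct diag_context *dc, enum diag_kind kind)
{
  struct diag_counts *dest = dc->buffer ? dc->buffer : &dc->issued;
  dest->n[kind]++;
}

/* Move the buffered counts into the context and empty the buffer.  */
void
diag_flush (struct diag_context *dc)
{
  assert (dc->buffer != NULL);
  for (int k = 0; k < DIAG_NUM_KINDS; k++)
    {
      dc->issued.n[k] += dc->buffer->n[k];
      dc->buffer->n[k] = 0;
    }
}

/* Report one error while buffering, optionally flush, and return how
   many errors the context counts as issued.  */
int
diag_demo_issued_errors (int flush_p)
{
  struct diag_counts buf = { { 0 } };
  struct diag_context dc = { { { 0 } }, &buf };
  diag_report (&dc, DIAG_ERROR);
  if (flush_p)
    diag_flush (&dc);
  return dc.issued.n[DIAG_ERROR];
}
```

Without a flush the context still reports zero issued errors, which
mirrors the "counts: (none)" line in the dump below while the buffer
holds "error: 1".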

For ease of debugging, the patch extends the "dump" functions within the
diagnostics subsystem, so that e.g. global_dc->dump () now prints the
buffering status, e.g.:

(gdb) call global_dc->dump()
diagnostic_context:
  counts:
    (none)
  output format:
    sarif_output_format
  printer:
    m_show_color: false
    m_url_format: bel
    m_buffer:
      m_formatted_obstack current object: length 0:
      m_chunk_obstack current object: length 0:
  diagnostic buffer:
    m_per_format_buffer:
      counts:
        error: 1
      diagnostic_sarif_format_buffer:
        result[0]:
{"ruleId": "error",
 "level": "error",
 "message": {"text": "Function ‘program’ requires an argument list at (1)"},
 "locations": [{"physicalLocation": {"artifactLocation": {"uri": "../../src/gcc/testsuite/gfortran.dg/pr105954.f90",
                                                          "uriBaseId": "PWD"},
                                     "region": {"startLine": 6,
                                                "startColumn": 8,
                                                "endColumn": 9},
                                     "contextRegion": {"startLine": 6,
        "snippet": {"text": "program p\n"}}}}]}

which shows that no diagnostics have been issued yet, but the active
diagnostic_buffer has a single error buffered within it, in SARIF form.

Similarly, it's possible to use "dump" on a diagnostic_buffer to directly
query its contents; here's the same example, this time with the text
output format:

(gdb) call error_buffer.buffer.dump()
m_per_format_buffer:
  counts:
    error: 1
  diagnostic_text_format_buffer:
    m_formatted_obstack current object: length 232:
      00000000: 1b 5b 30 31 6d 1b 5b 4b 2e 2e 2f 2e 2e 2f 73 72 | .[01m.[K../../sr
      00000010: 63 2f 67 63 63 2f 74 65 73 74 73 75 69 74 65 2f | c/gcc/testsuite/
      00000020: 67 66 6f 72 74 72 61 6e 2e 64 67 2f 70 72 31 30 | gfortran.dg/pr10
      00000030: 35 39 35 34 2e 66 39 30 3a 36 3a 38 3a 1b 5b 6d | 5954.f90:6:8:.[m
      00000040: 1b 5b 4b 0a 0a 20 20 20 20 36 20 7c 20 70 72 6f | .[K..    6 | pro
      00000050: 67 72 61 6d 20 70 0a 20 20 20 20 20 20 7c 20 20 | gram p.      |
      00000060: 20 20 20 20 20 20 1b 5b 30 31 3b 33 31 6d 1b 5b |       .[01;31m.[
      00000070: 4b 31 1b 5b 6d 1b 5b 4b 0a 1b 5b 30 31 3b 33 31 | K1.[m.[K..[01;31
      00000080: 6d 1b 5b 4b 45 72 72 6f 72 3a 1b 5b 6d 1b 5b 4b | m.[KError:.[m.[K
      00000090: 20 46 75 6e 63 74 69 6f 6e 20 e2 80 98 1b 5b 30 |  Function ....[0
      000000a0: 31 6d 1b 5b 4b 70 72 6f 67 72 61 6d 1b 5b 6d 1b | 1m.[Kprogram.[m.
      000000b0: 5b 4b e2 80 99 20 72 65 71 75 69 72 65 73 20 61 | [K... requires a
      000000c0: 6e 20 61 72 67 75 6d 65 6e 74 20 6c 69 73 74 20 | n argument list
      000000d0: 61 74 20 1b 5b 30 31 3b 33 31 6d 1b 5b 4b 28 31 | at .[01;31m.[K(1
      000000e0: 29 1b 5b 6d 1b 5b 4b 0a                         | ).[m.[K.
    m_chunk_obstack current object: length 0:

showing that we have an error in error_buffer, with colorized text.

gcc/ChangeLog:
PR fortran/105916
* diagnostic-buffer.h: New file.
* diagnostic-format-json.cc: Define INCLUDE_VECTOR.  Include
"diagnostic-buffer.h".
(class diagnostic_json_format_buffer): New subclass.
(class json_output_format): Add friend class
diagnostic_json_format_buffer.
(json_output_format::make_per_format_buffer): New vfunc
implementation.
(json_output_format::set_buffer): New vfunc implementation.
(json_output_format::json_output_format): Initialize m_buffer.
(json_output_format::m_buffer): New field.
(diagnostic_json_format_buffer::dump): New.
(diagnostic_json_format_buffer::empty_p): New.
(diagnostic_json_format_buffer::move_to): New.
(diagnostic_json_format_buffer::clear): New.
(diagnostic_json_format_buffer::flush): New.
(json_output_format::on_report_diagnostic): Implement optional
buffering.
* diagnostic-format-sarif.cc: Include "diagnostic-buffer.h".
(class diagnostic_sarif_format_buffer): New subclass.
(class sarif_builder): Add friend
class diagnostic_sarif_format_buffer.
(sarif_builder::num_results): New accessor.
(sarif_builder::get_result): New accessor.
(sarif_builder::on_report_diagnostic): Add param "buffer"; use it
to implement optional buffering.
(diagnostic_sarif_format_buffer::dump): New.
(diagnostic_sarif_format_buffer::empty_p): New.
(diagnostic_sarif_format_buffer::move_to): New.
(diagnostic_sarif_format_buffer::clear): New.
(diagnostic_sarif_format_buffer::flush): New.
(sarif_output_format::make_per_format_buffer): New vfunc
implementation.
(sarif_output_format::set_buffer): New vfunc implementation.
(sarif_output_format::on_report_diagnostic): Pass m_buffer to
sarif_builder::on_report_diagnostic.
(sarif_output_format::num_results): New accessor.
(sarif_output_format::get_result): New accessor.
(diagnostic_output_format::diagnostic_output_format): Initialize
m_buffer.
(diagnostic_output_format::m_buffer): New field.
(diagnostic_output_format::num_results): Get accessor.
(diagnostic_output_format::get_result): Get accessor.
(selftest::get_message_from_result): New.
(selftest::test_buffering): New.
(selftest::diagnostic_format_sarif_cc_tests): Call it.
* diagnostic-format-text.cc: Include
"diagnostic-client-data-hooks.h".
(class diagnostic_text_format_buffer): New subclass.
(diagnostic_text_format_buffer::diagnostic_text_format_buffer):
New.
(diagnostic_text_format_buffer::dump): New.
(diagnostic_text_format_buffer::empty_p): New.
(diagnostic_text_format_buffer::move_to): New.
(diagnostic_text_format_buffer::clear): New.
(diagnostic_text_format_buffer::flush): New.
(diagnostic_text_output_format::dump): Dump m_saved_output_buffer.
(diagnostic_text_output_format::set_buffer): New.
(diagnostic_text_output_format::make_per_format_buffer): New.
* diagnostic-format-text.h
(diagnostic_text_output_format::diagnostic_text_output_format):
Initialize m_saved_output_buffer.
(diagnostic_text_output_format::set_buffer): New decl.
(diagnostic_text_output_format::make_per_format_buffer): New decl.
(diagnostic_text_output_format::m_saved_output_buffer): New field.
* diagnostic-format.h (class diagnostic_per_format_buffer): New
forward decl.
(diagnostic_output_format::make_per_format_buffer): New vfunc.
(diagnostic_output_format::set_buffer): New vfunc.
* diagnostic.cc: Include "diagnostic-buffer.h".
(diagnostic_context::initialize): Replace memset with call to
"clear" on m_diagnostic_counters.  Initialize
m_diagnostic_buffer.
(diagnostic_context::finish): Call set_diagnostic_buffer with
nullptr.
(diagnostic_context::dump): Update for encapsulation of counts
into m_diagnostic_counters.  Dump m_diagnostic_buffer.
(diagnostic_context::execution_failed_p): Update for encapsulation of
counts into m_diagnostic_counters.
(diagnostic_context::check_max_errors): Likewise.
(diagnostic_context::report_diagnostic): Likewise.  Eliminate
diagnostic_check_max_errors in favor of check_max_errors.
Update increment of counter to support buffering.  Eliminate
diagnostic_action_after_output in favor of action_after_output.
Only add fixits to m_edit_context_ptr if buffering is disabled.
Only call diagnostic_output_format::after_diagnostic if buffering
is disabled.
(diagnostic_context::error_recursion):  Eliminate
diagnostic_action_after_output in favor of action_after_output.
(diagnostic_context::set_diagnostic_buffer): New.
(diagnostic_context::clear_diagnostic_buffer): New.
(diagnostic_context::flush_diagnostic_buffer): New.
(diagnostic_counters::diagnostic_counters): New.
(diagnostic_counters::dump): New.
(diagnostic_counters::move_to): New.
(diagnostic_counters::clear): New.
(diagnostic_buffer::diagnostic_buffer): New.
(diagnostic_buffer::~diagnostic_buffer): New.
(diagnostic_buffer::dump): New.
(diagnostic_buffer::empty_p): New.
(diagnostic_buffer::move_to): New.
(diagnostic_buffer::ensure_per_format_buffer): New.
(c_diagnostic_cc_tests): Remove stray newline.
* diagnostic.h (class diagnostic_buffer): New forward decl.
(struct diagnostic_counters): New.
(diagnostic_context::check_max_errors): Make private.
(diagnostic_context::action_after_output): Make private.
(diagnostic_context::get_output_format): Make non-const.
(diagnostic_context::diagnostic_count): Update for change
to m_diagnostic_counters.
(diagnostic_context::set_diagnostic_buffer): New decl.
(diagnostic_context::get_diagnostic_buffer): New decl.
(diagnostic_context::clear_diagnostic_buffer): New decl.
(diagnostic_context::flush_diagnostic_buffer): New decl.
(diagnostic_context::m_diagnostic_count): Replace array with...
(diagnostic_context::m_diagnostic_counters): ...this.
(diagnostic_context::m_diagnostic_buffer): New field.
(diagnostic_action_after_output): Delete.
(diagnostic_check_max_errors): Delete.

gcc/fortran/ChangeLog:
PR fortran/105916
* error.cc (pp_error_buffer, pp_warning_buffer): Convert from
output_buffer * to diagnostic_buffer *.
(warningcount_buffered, werrorcount_buffered): Eliminate.
(gfc_error_buffer::gfc_error_buffer): Move constructor definition
here, and initialize "buffer" using *global_dc.
(gfc_output_buffer_empty_p): Delete in favor of
diagnostic_buffer::empty_p.
(gfc_clear_pp_buffer): Replace with...
(gfc_clear_diagnostic_buffer): ...this, moving implementation
details to diagnostic_context::clear_diagnostic_buffer.
(gfc_warning): Replace buffering implementation with calls
to global_dc->get_diagnostic_buffer and
global_dc->set_diagnostic_buffer.
(gfc_clear_warning): Update for renaming of gfc_clear_pp_buffer
and elimination of warningcount_buffered and werrorcount_buffered.
(gfc_warning_check): Replace buffering implementation with calls
to pp_warning_buffer->empty_p and
global_dc->flush_diagnostic_buffer.
(gfc_error_opt): Replace buffering implementation with calls to
global_dc->get_diagnostic_buffer and set_diagnostic_buffer.
(gfc_clear_error): Update for renaming of gfc_clear_pp_buffer.
(gfc_error_flag_test): Replace call to gfc_output_buffer_empty_p
with call to diagnostic_buffer::empty_p.
(gfc_error_check): Replace buffering implementation with calls
to pp_error_buffer->empty_p and global_dc->flush_diagnostic_buffer.
(gfc_move_error_buffer_from_to): Replace buffering implementation
with usage of diagnostic_buffer.
(gfc_free_error): Update for renaming of gfc_clear_pp_buffer.
(gfc_diagnostics_init): Use "new" directly when creating
pp_warning_buffer.  Remove setting of m_flush_p on the two
buffers, as this is handled by diagnostic_buffer and by
diagnostic_text_format_buffer's constructor.
* gfortran.h: Replace #include "pretty-print.h" for output_buffer
with #include "diagnostic-buffer.h" for diagnostic_buffer.
(struct gfc_error_buffer): Change type of field "buffer" from
output_buffer to diagnostic_buffer.  Move definition of constructor
into error.cc so that it can use global_dc.

gcc/testsuite/ChangeLog:
PR fortran/105916
* gcc.dg/plugin/diagnostic_plugin_xhtml_format.c: Include
"diagnostic-buffer.h".
(class diagnostic_xhtml_format_buffer): New subclass.
(class xhtml_builder): Add friend
class diagnostic_xhtml_format_buffer.
(diagnostic_xhtml_format_buffer::dump): New.
(diagnostic_xhtml_format_buffer::empty_p): New.
(diagnostic_xhtml_format_buffer::move_to): New.
(diagnostic_xhtml_format_buffer::clear): New.
(diagnostic_xhtml_format_buffer::flush): New.
(xhtml_builder::on_report_diagnostic): Add "buffer" param, and use
it.
(xhtml_output_format::dump): Fix typo.
(xhtml_output_format::make_per_format_buffer): New.
(xhtml_output_format::set_buffer): New.
(xhtml_output_format::on_report_diagnostic): Fix whitespace.  Pass
m_buffer to xhtml_builder::on_report_diagnostic.
(xhtml_output_format::xhtml_output_format): Initialize m_buffer.
(xhtml_output_format::m_buffer): New field.
* gfortran.dg/diagnostic-format-json-pr105916.F90: New test.
* gfortran.dg/diagnostic-format-sarif-1.F90: New test.
* gfortran.dg/diagnostic-format-sarif-1.py: New support script.
* gfortran.dg/diagnostic-format-sarif-pr105916.f90: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
9 months agolibstdc++: Replace std::__to_address in C++20 branch in <string>
Jonathan Wakely [Tue, 22 Oct 2024 20:23:06 +0000 (21:23 +0100)] 
libstdc++: Replace std::__to_address in C++20 branch in <string>

As noted by Patrick, r15-4546-g85e5b80ee2de80 should have changed the
usage of std::__to_address to std::to_address in the C++20-specific
branch that works on types satisfying std::contiguous_iterator.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (assign(Iter, Iter)): Call
std::to_address instead of __to_address.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
9 months agoFortran: Generic processing of assumed rank objects (f202y) [PR116733]
Paul Thomas [Wed, 23 Oct 2024 13:34:20 +0000 (14:34 +0100)] 
Fortran: Generic processing of assumed rank objects (f202y) [PR116733]

2024-10-23  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
PR fortran/116733
* array.cc : White space corrections.
* expr.cc (gfc_check_pointer_assign): Permit assumed rank
target with -std=f202y. Add constraints that the data pointer
object must have rank remapping specified and that the data
target be contiguous.
* gfortran.h : Add a gfc_array_ref field 'ar' to the structure
'gfc_association_list'.
* interface.cc (gfc_compare_actual_formal): If -Wsurprising
is set, emit a warning if an assumed size array is passed to an
assumed rank dummy.
* intrinsic.cc (do_ts29113_check): Permit an assumed rank arg.
for reshape if -std=f202y and the argument is contiguous.
* invoke.texi : Introduce -std=f202y. Whitespace errors.
* lang.opt : Accept -std=f202y.
* libgfortran.h : Define GFC_STD_F202Y.
* match.cc (gfc_match_associate): If -std=f202y an assumed rank
selector is allowed if it is contiguous and the associate name
has rank remapping specified.
* options.cc (gfc_init_options): -std=f202y is equivalent to
-std=f2023 with experimental f202y features. White space issues.
* parse.cc (parse_associate): If the selector is assumed rank,
use the 'ar' field of the association list to build an array
specification.
* primary.cc (gfc_match_varspec): Do not resolve the assumed
rank selector of a class associate name at this stage to avoid
the rank change.
* resolve.cc (find_array_spec): If an array_ref dimension is -1
reset it with the rank in the object's array_spec.
(gfc_expression_rank): Do not check dimen types for an assumed
rank variable expression.
(resolve_variable): Do not emit the assumed rank context error
if the context is pointer assignment and the variable is a
target.
(resolve_assoc_var): Resolve the bounds and check for missing
bounds in the rank remap of an associate name with an assumed
rank selector. Do not correct the rank of an associate name
with an assumed rank selector.
(resolve_symbol): Allow the reference to an assumed rank object
if -std=f202y is enabled and the current operation is
EXEC_BLOCK.
* st.cc (gfc_free_association_list): Free bounds expressions
of the 'ar' field, if present.
* trans-array.cc (gfc_conv_ss_startstride): If -std=f202y and
bounds checking activated, do not apply the assertion.
* trans-expr.cc (gfc_trans_pointer_assignment): An assumed rank
target has its offset set to zero.
* trans-stmt.cc (trans_associate_var): If the selector is
assumed rank, call gfc_trans_pointer_assignment using the 'ar'
field in the association list as the array reference for expr1.
The data target, expr2, is a copy of the selector expression.

gcc/testsuite/
PR fortran/116733
* gfortran.dg/associate_3.f03: Change error message.
* gfortran.dg/f202y/f202y.exp: Enable tests of f202y features.
* gfortran.dg/f202y/generic_assumed_rank_1.f90: New test.
* gfortran.dg/f202y/generic_assumed_rank_2.f90: New test.
* gfortran.dg/f202y/generic_assumed_rank_3.f90: New test.

9 months agoAArch64: Remove redundant check in aarch64_simd_mov
Wilco Dijkstra [Thu, 17 Oct 2024 14:33:44 +0000 (14:33 +0000)] 
AArch64: Remove redundant check in aarch64_simd_mov

The split condition in aarch64_simd_mov uses aarch64_simd_special_constant_p.
While doing the split, it checks the mode before calling
aarch64_maybe_generate_simd_constant.  This is risky since it may result in
unexpectedly calling aarch64_split_simd_move instead of
aarch64_maybe_generate_simd_constant.  Since the mode is already checked,
remove the spurious explicit mode check.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (aarch64_simd_mov<VQMOV:mode>):
Remove redundant mode check.

9 months agoAArch64: Fix copysign patterns
Wilco Dijkstra [Tue, 15 Oct 2024 16:22:23 +0000 (16:22 +0000)] 
AArch64: Fix copysign patterns

The current copysign pattern has a mismatch in the predicates and constraints -
operand[2] is a register_operand but also has an alternative X which allows any
operand.  Since it is a floating point operation, having an integer alternative
makes no sense.  Change the expander to always use vector immediates which
results in better code and sharing of immediates between copysign and xorsign.
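
As an illustration of why an integer sign mask fits this operation, copysign
can be written as a bitwise select on the sign bit.  The following is a
minimal sketch in portable C, assuming IEEE 754 binary64 doubles; the helper
name copysign_bits is hypothetical, and this is not the code the expander
emits:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: take the magnitude bits from X and the sign bit
   from Y.  Assumes double is IEEE 754 binary64.  */
static double
copysign_bits (double x, double y)
{
  const uint64_t sign_mask = UINT64_C (1) << 63;
  uint64_t xb, yb, rb;
  double r;

  /* Copy the object representations into integers without aliasing
     problems.  */
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);

  /* Select: everything but the sign from X, only the sign from Y.  */
  rb = (xb & ~sign_mask) | (yb & sign_mask);

  memcpy (&r, &rb, sizeof r);
  return r;
}
```

The same mask, with only the sign bit set, is what an xorsign-style operation
would use, which is consistent with the sharing of immediates mentioned above.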

gcc/ChangeLog:

* config/aarch64/aarch64.md (copysign<GPF:mode>3): Widen immediate to
vector.
(copysign<GPF:mode>3_insn): Use VQ_INT_EQUIV in operand 3.
* config/aarch64/iterators.md (VQ_INT_EQUIV): New iterator.
(vq_int_equiv): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/copysign_3.c: New test.
* gcc.target/aarch64/copysign_4.c: New test.
* gcc.target/aarch64/fneg-abs_2.c: Fixup test.
* gcc.target/aarch64/sve/fneg-abs_2.c: Likewise.

9 months agodoc: remove obsolete deprecated info
Jason Merrill [Tue, 15 Oct 2024 13:04:23 +0000 (09:04 -0400)] 
doc: remove obsolete deprecated info

These formerly deprecated features eventually made it into the C++ standard.

gcc/ChangeLog:

* doc/extend.texi (Deprecated Features): Remove text about some
no-longer-deprecated features.

9 months agoAArch64: Add support for SIMD xor immediate (3/3)
Wilco Dijkstra [Mon, 14 Oct 2024 16:53:44 +0000 (16:53 +0000)] 
AArch64: Add support for SIMD xor immediate (3/3)

Add support for SVE xor immediate when generating AdvSIMD code and SVE is
available.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (enum simd_immediate_check): Add
AARCH64_CHECK_XOR.
(aarch64_simd_valid_xor_imm): New function.
(aarch64_output_simd_imm): Add AARCH64_CHECK_XOR support.
(aarch64_output_simd_xor_imm): New function.
* config/aarch64/aarch64-protos.h (aarch64_output_simd_xor_imm): New
prototype.
(aarch64_simd_valid_xor_imm): New prototype.
* config/aarch64/aarch64-simd.md (xor<mode>3<vczle><vczbe>):
Use aarch64_reg_or_xor_imm predicate and add an immediate alternative.
* config/aarch64/predicates.md (aarch64_reg_or_xor_imm): Add new
predicate.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/simd_imm.c: New test.

9 months agoAArch64: Improve SIMD immediate generation (2/3)
Wilco Dijkstra [Tue, 8 Oct 2024 15:55:25 +0000 (15:55 +0000)] 
AArch64: Improve SIMD immediate generation (2/3)

Allow use of SVE immediates when generating AdvSIMD code and SVE is available.
First check for a valid AdvSIMD immediate, and if SVE is available, try using
an SVE move or bitmask immediate.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (ior<mode>3<vczle><vczbe>):
Use aarch64_reg_or_orr_imm predicate.  Combine SVE/AdvSIMD immediates
and use aarch64_output_simd_orr_imm.
* config/aarch64/aarch64.cc (struct simd_immediate_info): Add SVE_MOV.
(aarch64_sve_valid_immediate): Use SVE_MOV for SVE move immediates.
(aarch64_simd_valid_imm): Enable SVE SIMD immediates when possible.
(aarch64_output_simd_imm): Support emitting SVE SIMD immediates.
* config/aarch64/predicates.md (aarch64_orr_imm_sve_advsimd): Remove.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/acle/asm/insr_s64.c: Allow SVE MOV imm.
* gcc.target/aarch64/sve/acle/asm/insr_u64.c: Likewise.
* gcc.target/aarch64/sve/fneg-abs_1.c: Update to check for ORRI.
* gcc.target/aarch64/sve/fneg-abs_2.c: Likewise.
* gcc.target/aarch64/sve/simd_imm_mov.c: New test.

9 months agoAArch64: Improve SIMD immediate generation (1/3)
Wilco Dijkstra [Tue, 8 Oct 2024 13:32:09 +0000 (13:32 +0000)] 
AArch64: Improve SIMD immediate generation (1/3)

Clean up the various interfaces related to SIMD immediate generation.  Introduce
new functions that make it clear which operation (AND, OR, MOV) we are testing
for rather than guessing the final instruction.  Reduce the use of overly long
names, unused and default parameters for clarity.  No changes to internals or
generated code.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (enum simd_immediate_check): Move to aarch64.cc.
(aarch64_output_simd_mov_immediate): Remove.
(aarch64_output_simd_mov_imm): New prototype.
(aarch64_output_simd_orr_imm): Likewise.
(aarch64_output_simd_and_imm): Likewise.
(aarch64_simd_valid_immediate): Remove.
(aarch64_simd_valid_and_imm): New prototype.
(aarch64_simd_valid_mov_imm): Likewise.
(aarch64_simd_valid_orr_imm): Likewise.
* config/aarch64/aarch64-simd.md: Use aarch64_output_simd_mov_imm.
* config/aarch64/aarch64.cc (enum simd_immediate_check): Moved from aarch64-protos.h.
Use AARCH64_CHECK_AND rather than AARCH64_CHECK_BIC.
(aarch64_expand_sve_const_vector): Use aarch64_simd_valid_mov_imm.
(aarch64_expand_mov_immediate): Likewise.
(aarch64_can_const_movi_rtx_p): Likewise.
(aarch64_secondary_reload): Likewise.
(aarch64_legitimate_constant_p): Likewise.
(aarch64_advsimd_valid_immediate): Simplify checks on 'which' param.
(aarch64_sve_valid_immediate): Add extra param for move vs logical.
(aarch64_simd_valid_immediate): Rename to aarch64_simd_valid_imm.
(aarch64_simd_valid_mov_imm): New function.
(aarch64_simd_valid_orr_imm): Likewise.
(aarch64_simd_valid_and_imm): Likewise.
(aarch64_mov_operand_p): Use aarch64_simd_valid_mov_imm.
(aarch64_simd_scalar_immediate_valid_for_move): Likewise.
(aarch64_simd_make_constant): Likewise.
(aarch64_expand_vector_init_fallback): Likewise.
(aarch64_output_simd_mov_immediate): Rename to aarch64_output_simd_imm.
(aarch64_output_simd_orr_imm): New function.
(aarch64_output_simd_and_imm): Likewise.
(aarch64_output_simd_mov_imm): Likewise.
(aarch64_output_scalar_simd_mov_immediate): Use aarch64_output_simd_mov_imm.
(aarch64_output_sve_mov_immediate): Use aarch64_simd_valid_imm.
(aarch64_output_sve_ptrues): Likewise.
* config/aarch64/constraints.md (Do): Use aarch64_simd_valid_orr_imm.
(Db): Use aarch64_simd_valid_and_imm.
* config/aarch64/predicates.md (aarch64_reg_or_bic_imm): Use aarch64_simd_valid_orr_imm.
(aarch64_reg_or_and_imm): Use aarch64_simd_valid_and_imm.

9 months agoFix ICE due to isa mismatch for the builtins.
liuhongt [Tue, 22 Oct 2024 08:54:40 +0000 (01:54 -0700)] 
Fix ICE due to isa mismatch for the builtins.

gcc/ChangeLog:

PR target/117240
* config/i386/i386-builtin.def: Add avx/avx512f to vaes
ymm/zmm builtins.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117240_avx.c: New test.
* gcc.target/i386/pr117240_avx512f.c: New test.

9 months agoFortran: Minor follow-up cleanup to error.cc
Tobias Burnus [Wed, 23 Oct 2024 10:25:00 +0000 (12:25 +0200)] 
Fortran: Minor follow-up cleanup to error.cc

Follow up to r15-4268-g459c6018d2308d, which removed dead code
but missed that terminal_width was only set and never used.

gcc/fortran/ChangeLog:

* error.cc (terminal_width, gfc_get_terminal_width): Remove.
(gfc_error_init_1): Do not call one to set the other.

9 months agotree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)
Martin Jambor [Wed, 23 Oct 2024 09:30:32 +0000 (11:30 +0200)] 
tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

PR 117142 shows that the current SRA probably never worked reliably
with arguments passed to a function returning twice, because it then
creates statements before the call which however needs to be at the
beginning of a basic block.

While it should be possible to make at least the case of passing
arguments by value work with SRA (the statements would need to be put
just on the non-abnormal edges leading to the BB), this would mean
large surgery of function sra_modify_expr and I guess the time would
better be spent re-organizing the whole pass.

gcc/ChangeLog:

2024-10-21  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/117142
* tree-sra.cc (build_access_from_call_arg): Disqualify any
candidate passed to a function returning twice.

gcc/testsuite/ChangeLog:

2024-10-21  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/117142
* gcc.dg/tree-ssa/pr117142.c: New test.

9 months agoc-family: Regenerate c.opt.urls
Jakub Jelinek [Wed, 23 Oct 2024 08:18:46 +0000 (10:18 +0200)] 
c-family: Regenerate c.opt.urls

Forgot to regenerate urls after -Wleading-whitespace addition.

2024-10-23  Jakub Jelinek  <jakub@redhat.com>

* c.opt.urls: Regenerate.

9 months agolibcpp: Add -Wleading-whitespace= warning
Jakub Jelinek [Wed, 23 Oct 2024 07:58:06 +0000 (09:58 +0200)] 
libcpp: Add -Wleading-whitespace= warning

The following patch on top of the r15-4346 patch adds
-Wleading-whitespace= warning option.
This warning doesn't care how much one actually indents each line
in the source (that is something that can't be easily done in the
preprocessor without doing syntactic analysis); it just does simple checks
on what kind of whitespace is used in the indentation.
I think it is still useful to get warnings about such issues early;
while git diagnoses some of them in patches (e.g. the tab after space
case), getting the warnings earlier might help avoid such issues
sooner.

There are projects which ban use of tabs and require just spaces,
others which require indentation just with horizontal tabs, and finally
projects which want indentation with tabs for multiples of tabstop size
followed by spaces (fewer than tabstop size), like GCC.
For all 3 kinds the warning diagnoses indentation with '\v' or '\f'
characters (unless line contains just whitespace), and for the last one
also cases where a space in the indentation is followed by horizontal
tab or where there are N or more consecutive spaces in the indentation
(for -ftabstop=N).
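
The checks for the third style (tabs followed by fewer than tabstop spaces)
can be sketched roughly as follows.  This is an illustration only, not the
libcpp implementation: the function check_indent and the WS_* flags are
hypothetical names, and skipping whitespace-only lines for every check is an
assumption made to keep the sketch short.

```c
#include <stddef.h>

/* Hypothetical issue flags; not the libcpp line note kinds.  */
enum
{
  WS_VT_FF = 1,            /* '\v' or '\f' in the indentation */
  WS_SPACE_BEFORE_TAB = 2, /* space followed by a horizontal tab */
  WS_TOO_MANY_SPACES = 4   /* tabstop or more consecutive spaces */
};

/* Hypothetical checker: scan the leading whitespace of LINE and return
   a mask of WS_* issues for the "tabs, then spaces" style.  */
static int
check_indent (const char *line, int tabstop)
{
  int issues = 0;
  int run = 0; /* length of the current run of consecutive spaces */
  size_t i;

  for (i = 0;; i++)
    {
      char c = line[i];
      if (c == ' ')
        {
          if (++run >= tabstop)
            issues |= WS_TOO_MANY_SPACES;
        }
      else if (c == '\t')
        {
          if (run > 0)
            issues |= WS_SPACE_BEFORE_TAB;
          run = 0;
        }
      else if (c == '\v' || c == '\f')
        {
          issues |= WS_VT_FF;
          run = 0;
        }
      else
        break; /* first non-whitespace character or end of line */
    }

  /* Assumption: a line containing only whitespace is not reported.  */
  if (line[i] == '\0' || line[i] == '\n')
    return 0;
  return issues;
}
```

For a project that bans tabs or bans spaces in indentation, only the '\v' and
'\f' check from this sketch would carry over unchanged.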

BTW, for additional testing I've enabled the warnings (without -Werror
for them) in stage3.  There are many warnings (both trailing and leading
whitespace), some of them something that can be easily fixed in the headers
or source files, but others with whitespace issues in generated sources,
so if we enable the warnings, either we'd need to adjust the generators
or disable the warnings in (some of the) generated files.

2024-10-23  Jakub Jelinek  <jakub@redhat.com>

libcpp/
* include/cpplib.h (struct cpp_options): Add
cpp_warn_leading_whitespace and cpp_tabstop members.
(enum cpp_warning_reason): Add CPP_W_LEADING_WHITESPACE.
* internal.h (struct _cpp_line_note): Document new
line note kinds.
* init.cc (cpp_create_reader): Set cpp_tabstop to 8.
* lex.cc (find_leading_whitespace_issues): New function.
(_cpp_clean_line): Use it.
(_cpp_process_line_notes): Handle 'L', 'S' and 'T' line notes.
(lex_raw_string): Clear type on 'L', 'S' and 'T' line notes
inside of raw string literals.
gcc/
* doc/invoke.texi (Wleading-whitespace=): Document.
gcc/c-family/
* c.opt (Wleading-whitespace=): New option.
* c-opts.cc (c_common_post_options): Set cpp_opts->cpp_tabstop
to global_dc->m_tabstop.
gcc/testsuite/
* c-c++-common/cpp/Wleading-whitespace-1.c: New test.
* c-c++-common/cpp/Wleading-whitespace-2.c: New test.
* c-c++-common/cpp/Wleading-whitespace-3.c: New test.
* c-c++-common/cpp/Wleading-whitespace-4.c: New test.

9 months agolibstdc++: Always instantiate key_type to compute hash code [PR115285]
François Dumont [Tue, 22 Oct 2024 17:13:34 +0000 (19:13 +0200)] 
libstdc++: Always instantiate key_type to compute hash code [PR115285]

Even if it is possible to compute a hash code from the inserted arguments
we need to instantiate the key_type to guarantee hash code consistency.

Preserve the lazy instantiation of the mapped_type in the context of
associative containers.

libstdc++-v3/ChangeLog:

PR libstdc++/115285
* include/bits/hashtable.h (_S_forward_key<_Kt>): Always return a temporary
key_type instance.
* testsuite/23_containers/unordered_map/96088.cc: Adapt to additional instantiation.
Also check that mapped_type is not instantiated when there is no insertion.
* testsuite/23_containers/unordered_multimap/96088.cc: Adapt to additional
instantiation.
* testsuite/23_containers/unordered_multiset/96088.cc: Likewise.
* testsuite/23_containers/unordered_set/96088.cc: Likewise.
* testsuite/23_containers/unordered_set/pr115285.cc: New test case.

9 months agoi386: Optimize EQ/NE comparison between avx512 kmask and -1.
liuhongt [Mon, 21 Oct 2024 09:22:08 +0000 (02:22 -0700)] 
i386: Optimize EQ/NE comparison between avx512 kmask and -1.

r15-974-gbf7745f887c765e06f2e75508f263debb60aeb2e has optimized for
jcc/setcc, but missed movcc.
The patch supports movcc.

gcc/ChangeLog:

PR target/117232
* config/i386/sse.md (*kortest_cmp<SWI1248_AVX512BWDQ_64:mode>_movqicc):
New define_insn_and_split.
(*kortest_cmp<SWI1248_AVX512BWDQ_64:mode>_mov<SWI248:mode>cc):
Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117232-1.c: New test.
* gcc.target/i386/pr117232-apx-1.c: New test.

9 months agoDaily bump.
GCC Administrator [Wed, 23 Oct 2024 00:19:43 +0000 (00:19 +0000)] 
Daily bump.

9 months agoc: Restore "originally defined" struct redefinition messages for C23
Joseph Myers [Wed, 23 Oct 2024 00:10:01 +0000 (00:10 +0000)] 
c: Restore "originally defined" struct redefinition messages for C23

One failure with a -std=gnu23 default that indicates a
quality-of-implementation regression in C23 mode is gcc.dg/pr39084.c,
which loses the expected "originally defined here" message on struct
redefinition errors (which occur in a different place in the front end
for C23 because it is necessary to see the members of the struct to
determine whether the redefinition is valid).  That message seems a
good thing to have both in and out of C23 mode, so add logic to
restore it in the C23 case.

Bootstrapped with no regressions for x86-64-pc-linux-gnu.

gcc/c/
* c-decl.cc (c_struct_parse_info): Add member refloc.
(start_struct): Store refloc in struct_parse_info.
(finish_struct): Give "originally defined" message for C23 struct
redefinition errors.

gcc/testsuite/
* gcc.dg/gnu17-tag-1.c, gcc.dg/gnu23-tag-5.c: New tests.

9 months agoc++: non-dep structured binding decltype again [PR117107]
Jason Merrill [Tue, 22 Oct 2024 20:37:49 +0000 (16:37 -0400)] 
c++: non-dep structured binding decltype again [PR117107]

The patch for PR92687 handled the usual case of a decomp variable not being
in the table, but missed the case of there being nothing in the table yet.

PR c++/117107
PR c++/92687

gcc/cp/ChangeLog:

* decl.cc (lookup_decomp_type): Handle null table.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/decomp10.C: New test.

9 months agoc++: add testcase [PR116929]
Jason Merrill [Tue, 22 Oct 2024 15:23:34 +0000 (11:23 -0400)] 
c++: add testcase [PR116929]

This testcase was fixed by r15-822-g0173dcce92baa6 .

PR c++/116929

gcc/testsuite/ChangeLog:

* g++.dg/modules/enum-14.C: New test.

9 months agolibstdc++: Implement LWG 4166 changes to concat_view::end()
Patrick Palka [Tue, 22 Oct 2024 21:01:59 +0000 (17:01 -0400)] 
libstdc++: Implement LWG 4166 changes to concat_view::end()

This patch proactively implements the proposed resolution for this LWG
issue, which seems straightforward and slated to get approved as-is.

(No _GLIBCXX_RESOLVE_LIB_DEFECTS code comment is added since concat_view
is C++26, so this isn't a defect against a published standard.)

libstdc++-v3/ChangeLog:

* include/std/ranges (concat_view::begin): Add space after
'requires' starting a requires-clause.
(concat_view::end): Likewise.  Refine condition for returning an
iterator rather than default_sentinel as per LWG 4166.
* testsuite/std/ranges/concat/1.cc (test03): Verify LWG 4166
example.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
9 months agoc: Better fix for speed up compilation of large char array initializers when not...
Jakub Jelinek [Tue, 22 Oct 2024 20:36:03 +0000 (22:36 +0200)] 
c: Better fix for speed up compilation of large char array initializers when not using #embed [PR117190]

On Wed, Oct 16, 2024 at 11:09:32PM +0200, Jakub Jelinek wrote:
> Apparently my
> c: Speed up compilation of large char array initializers when not using #embed
> patch broke building glibc.
>
> The issue is that when using CPP_EMBED, we are guaranteed by the
> preprocessor that there is CPP_NUMBER CPP_COMMA before it and
> CPP_COMMA CPP_NUMBER after it (or CPP_COMMA CPP_EMBED), so RAW_DATA_CST
> never ends up at the end of arrays of unknown length.
> Now, the c_parser_initval optimization attempted to preserve that property
> rather than changing everything that e.g. inferes array number of elements
> from the initializer etc. to deal with RAW_DATA_CST at the end, but
> it didn't take into account the possibility that there could be
> CPP_COMMA followed by CPP_CLOSE_BRACE (where the CPP_COMMA is redundant).
>
> As we are peeking already at 4 tokens in that code, peeking more would
> require using raw tokens and that seems to be expensive doing it for
> every pair of tokens due to vec_free done when we are out of raw tokens.

Sorry for rushing the previous patch too much; it turns out I was wrong.
Given that the c_parser_peek_nth_token numbering is 1-based, we can also
peek with c_parser_peek_nth_token (parser, 4), and the loop actually peeked
at just 3 tokens, not 4.

So, I think it is better to revert the previous patch (but keep the new
test) and instead peek the 4th non-raw token, which is what the following
patch does.

Additionally, PR117190 shows one further spot which missed the peek of
the token after CPP_COMMA, in case it is an incomplete array with exactly 65
elements with a redundant comma after it, which this patch handles too.
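
The input shape named above can be sketched as follows.  This is an
illustration written for this description, not the contents of the init-5.c
test; the array name table is hypothetical.

```c
/* An array of unknown length with exactly 65 char-sized initializers
   and a redundant trailing comma before the closing brace.  The length
   must still be inferred as 65 from the initializer.  */
static const unsigned char table[] = {
   0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12,
  13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
  39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
  52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64,
};
```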

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

PR c/117190
gcc/c/
* c-parser.cc (c_parser_initval): Revert 2024-10-17 changes.
Instead peek the 4th token and if it is not CPP_NUMBER,
handle it like 3rd token CPP_CLOSE_BRACE for orig_len == INT_MAX.
Also, check (2 + 2 * i)th raw token for the orig_len == INT_MAX
case and punt if it is not CPP_NUMBER.
gcc/testsuite/
* c-c++-common/init-5.c: New test.

9 months agoc-family: Fix up -Wsizeof-pointer-memaccess ICEs [PR117230]
Jakub Jelinek [Tue, 22 Oct 2024 18:30:41 +0000 (20:30 +0200)] 
c-family: Fix up -Wsizeof-pointer-memaccess ICEs [PR117230]

In the following testcases, we ICE on all 4 function calls.
The problem is using TYPE_PRECISION on vector types (but guess it
would be similarly problematic on structures/unions/arrays).
The test only decides which suggestion to make: whether
to supply an explicit size, because sizeof (*p) for
{,{,un}signed }char *p is not very likely what the user wants, or
to dereference the pointer, so I think limiting that suggestion
to integral types is ok.
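
For context, the kind of call this warning examines looks like the sketch
below, here with a pointer to a vector type.  This is not the new testcase;
the typedef v4si uses the GNU vector_size extension, and the function name
clear_vector is hypothetical.

```c
#include <string.h>

/* GNU vector extension: four ints in a 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper showing the intended form.  The likely-mistaken
   form would be memset (p, 0, sizeof (p)), which clears only as many
   bytes as a pointer occupies; that is the shape the warning looks for.  */
static void
clear_vector (v4si *p)
{
  memset (p, 0, sizeof (*p));
}
```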

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

PR c/117230
* c-warn.cc (sizeof_pointer_memaccess_warning): Only compare
TYPE_PRECISION of TREE_TYPE (type) to precision of char if
TREE_TYPE (type) is integral type.

* c-c++-common/Wsizeof-pointer-memaccess5.c: New test.

9 months agovarasm: Handle RAW_DATA_CST in compare_constant [PR117199]
Jakub Jelinek [Tue, 22 Oct 2024 18:21:56 +0000 (20:21 +0200)] 
varasm: Handle RAW_DATA_CST in compare_constant [PR117199]

On the following testcase without LTO we unnecessarily don't merge
two identical .LC* constants (constant hashing computes the same hash,
but as compare_constant returned false for the RAW_DATA_CST in it,
it never compares equal), and with LTO fails to link because LTO assumes such
constants have to be merged and so doesn't emit the other constant.

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/117199
* varasm.cc (compare_constant): Handle RAW_DATA_CST.  Formatting fix
in the STRING_CST case.

* gcc.dg/lto/pr117199_0.c: New test.

9 months agovarasm: Fix up RAW_DATA_CST handling in array_size_for_constructor [PR117190]
Jakub Jelinek [Tue, 22 Oct 2024 18:20:23 +0000 (20:20 +0200)] 
varasm: Fix up RAW_DATA_CST handling in array_size_for_constructor [PR117190]

CONSTRUCTOR indices for arrays have bitsize type, and the r15-4375
patch actually got it right in 6 other spots, but not in this function,
where it used size_int rather than bitsize_int and so size_binop can ICE
on type mismatch.

This is covered by the init-5.c testcase I've just posted, though the ICE
goes away when the C FE is fixed (and when it is not, there is another
ICE).

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

PR c/117190
* varasm.cc (array_size_for_constructor): For RAW_DATA_CST,
use bitsize_int rather than size_int.

9 months agoGCN: Initial generic-target handling, add more GCN macro defines
Tobias Burnus [Tue, 22 Oct 2024 18:06:50 +0000 (20:06 +0200)] 
GCN: Initial generic-target handling, add more GCN macro defines

Newer llvm-mc assemblers support the gfx*-generic targets, which permit
generating code for all GPUs belonging to the same generation, even if the
code is not optimal. This requires LLVM 19.

This patch adds the compiler-side support for generic gfx and also
adds -march=gfx10-3-generic and -march=gfx11-generic. However, those
-march= values are neither documented nor used anywhere yet.

Disclaimer: Not tested (as my ROCm does not support it); additionally,
libgomp/plugin/plugin-gcn.c has to be updated before it becomes useful.

For better compatibility with LLVM's Clang, this commit additionally adds
the macro definitions __GFX<9|10|11>__ for the architecture family,
__AMDGPU__ besides the existing __AMDGCN__ and the two strings-containing
macros __amdgcn_processor__ and __amdgcn_target_id__, where the former has
'-' replaced by '_' but otherwise both contain the lower case name. For the
new generic targets, the same happens, yielding, e.g., __gfx10_3_generic__.
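
As a rough illustration of how the family macros named above might be
consumed, here is a minimal sketch. The function name and the returned
strings are hypothetical and not part of the patch; only the macro names
__AMDGCN__, __AMDGPU__ and __GFX<9|10|11>__ come from the description.

```cpp
// Hypothetical helper: reports the AMD GPU architecture family the
// translation unit is being compiled for, based on predefined macros.
inline const char *amdgpu_arch_family() {
#if defined(__AMDGCN__) || defined(__AMDGPU__)
#  if defined(__GFX11__)
  return "gfx11";
#  elif defined(__GFX10__)
  return "gfx10";
#  elif defined(__GFX9__)
  return "gfx9";
#  else
  return "amdgcn (no family macro defined)";
#  endif
#else
  // Host compilers and other targets define none of the macros.
  return "not an AMD GPU target";
#endif
}
```

On a host compiler the last branch is taken, so the same source can be
shared between host and offload code.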

gcc/ChangeLog:

* config/gcn/gcn-devices.def: Add generic version/flag as additional
value and architecture family entry; update; add gfx10-3-generic
and gfx11-generic.
* config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Remove.
(ASM_SPEC): Use generated ABI_VERSION_OPT instead.
* config/gcn/gcn-tables.opt: Regenerate.
* config/gcn/gcn.h (gcn_device_def): Add generic_version and
arch_family members.
(TARGET_CPU_CPP_BUILTINS): Fix allocation bug, handle '-' in the
name and add additional macro defines.
* config/gcn/gcn.cc (gcn_devices): Handle it.
* config/gcn/gen-gcn-device-macros.awk: Likewise; use ELF name
for the macro name; generate ABI_VERSION_OPT.
* config/gcn/mkoffload.cc (ELFABIVERSION_AMDGPU_HSA_V6,
EF_AMDGPU_GENERIC_VERSION_V, EF_AMDGPU_GENERIC_VERSION_OFFSET,
GET_GENERIC_VERSION, SET_GENERIC_VERSION): Define.
(get_arch): Call SET_GENERIC_VERSION flag on elf_flags.
(copy_early_debug_info): If the arch sets the generic version,
use ELFABIVERSION_AMDGPU_HSA_V6.

9 months agotestsuite: arm: Use check-function-bodies in fp16-aapcs-* tests
Torbjörn SVENSSON [Sun, 20 Oct 2024 09:48:42 +0000 (11:48 +0200)] 
testsuite: arm: Use check-function-bodies in fp16-aapcs-* tests

Converted the tests to use check-function-bodies in order to ensure that
the sequence is correct.

gcc/testsuite/ChangeLog:

* gcc.target/arm/fp16-aapcs-1.c: Use check-function-bodies.
* gcc.target/arm/fp16-aapcs-2.c: Likewise.
* gcc.target/arm/fp16-aapcs-3.c: Likewise.
* gcc.target/arm/fp16-aapcs-4.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
9 months agotestsuite: arm: Relax expected asm in bitfield* and union-2 tests
Torbjörn SVENSSON [Sun, 20 Oct 2024 08:28:32 +0000 (10:28 +0200)] 
testsuite: arm: Relax expected asm in bitfield* and union-2 tests

Below -O2, lsls/lsrs are preferred. For -O2 and above, lsl/lsr are
preferred.

gcc/testsuite/ChangeLog:

* gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Allow lsl and
lsr instructions.
* gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise.
* gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise.
* gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: Likewise.
* gcc.target/arm/cmse/mainline/8_1m/union-2.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
9 months agotestsuite: arm: Use check-function-bodies in cmse-5 tests
Torbjörn SVENSSON [Sun, 20 Oct 2024 09:20:43 +0000 (11:20 +0200)] 
testsuite: arm: Use check-function-bodies in cmse-5 tests

Converted the tests to use check-function-bodies in order to ensure that
the sequence is correct.
This also allows both APSR_nzcvq and APSR_nzcvqg, as the target selector
does not work when -march and/or -mcpu override the target under test.

gcc/testsuite/ChangeLog:

* gcc.target/arm/cmse/mainline/8m/hard-sp/cmse-5.c: Use
check-function-bodies.
* gcc.target/arm/cmse/mainline/8m/hard/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/8m/soft/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/8m/softfp-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/8m/softfp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/8_1m/hard/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/8_1m/soft/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-5.c:
Likewise.
* gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-5.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
9 months agolibstdc++: Avoid using std::__to_address with iterators
Jonathan Wakely [Fri, 18 Oct 2024 11:11:10 +0000 (12:11 +0100)] 
libstdc++: Avoid using std::__to_address with iterators

In r12-3935-g82626be2d633a9 I added the partial specialization
std::pointer_traits<__normal_iterator<It, Cont>> so that __to_address
would work with __normal_iterator objects. Soon after that, François
replaced it in r12-6004-g807ad4bc854cae with an overload of __to_address
that served the same purpose, but was less complicated and less wrong.

I now think that both commits were mistakes, and that instead of adding
hacks to make __normal_iterator work with __to_address, we should not be
using __to_address with iterators at all before C++20.

The pre-C++20 std::__to_address function should only be used with
pointer-like types, specifically allocator_traits<A>::pointer types.
Those pointer-like types are guaranteed to be contiguous iterators, so
that getting a raw memory address from them is OK.

For arbitrary iterators, even random access iterators, we don't know
that it's safe to lower the iterator to a pointer e.g. for std::deque
iterators it's not, because (it + n) == (std::to_address(it) + n) only
holds within the same block of the deque's storage.

For C++20, std::to_address does work correctly for contiguous iterators,
including __normal_iterator, and __to_address just calls std::to_address
so also works. But we have to be sure we have an iterator that satisfies
the std::contiguous_iterator concept for it to be safe, and we can't
check that before C++20.

So for pre-C++20 code the correct way to handle iterators that might be
pointers or might be __normal_iterator is to call __niter_base, and if
necessary use is_pointer to check whether __niter_base returned a real
pointer.

We currently have some uses of std::__to_address with iterators where
we've checked that they're either pointers, or __normal_iterator
wrappers around pointers, or satisfy std::contiguous_iterator. But this
seems a little fragile, and it would be better to just use
std::__niter_base for the pointers and __normal_iterator cases, and use
C++20 std::to_address when the C++20 std::contiguous_iterator concept is
satisfied. This patch does that.
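
The std::deque point above can be observed directly. The following is a
minimal sketch, not code from the patch: the helper names are made up for
illustration, and the deque result depends on the library's block layout
(with 10000 ints the storage spans several blocks in the common
implementations, so the check is expected to fail there).

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <iterator>
#include <vector>

// Returns true if every element of [first, last) lives at the address
// obtained by plain address arithmetic from the first element.
template <typename RandomIt>
bool addresses_are_linear(RandomIt first, RandomIt last) {
  typedef typename std::iterator_traits<RandomIt>::value_type T;
  if (first == last)
    return true;  // never dereference anything in an empty range
  const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&*first);
  std::size_t i = 0;
  for (RandomIt it = first; it != last; ++it, ++i) {
    const std::uintptr_t actual = reinterpret_cast<std::uintptr_t>(&*it);
    if (actual != base + i * sizeof(T))
      return false;  // element is not where pointer arithmetic says
  }
  return true;
}

// Contiguous storage: lowering the iterator to a pointer is safe.
bool vector_is_linear() {
  std::vector<int> v(10000, 1);
  return addresses_are_linear(v.begin(), v.end());
}

// Block-wise storage: random access, but not contiguous.
bool deque_is_linear() {
  std::deque<int> d(10000, 1);
  return addresses_are_linear(d.begin(), d.end());
}
```

Both iterator types are random access, which is why the iterator category
alone cannot justify turning an iterator into a raw pointer.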

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string::assign): Replace
use of __to_address with __niter_base or std::to_address as
appropriate.
* include/bits/ptr_traits.h (__to_address): Add comment.
* include/bits/shared_ptr_base.h (__shared_ptr): Qualify calls
to __to_address.
* include/bits/stl_algo.h (find): Replace use of __to_address
with __niter_base or std::to_address as appropriate. Only use
either of them when the range is not empty.
* include/bits/stl_iterator.h (__to_address): Remove overload
for __normal_iterator.
* include/debug/safe_iterator.h (__to_address): Remove overload
for _Safe_iterator.
* include/std/ranges (views::counted): Replace use of
__to_address with std::to_address.
* testsuite/24_iterators/normal_iterator/to_address.cc: Removed.

9 months agotestsuite: Add test directive checking removal of link_error
Jennifer Schmitz [Tue, 22 Oct 2024 12:54:13 +0000 (05:54 -0700)] 
testsuite: Add test directive checking removal of link_error

This test needs a directive checking the removal of the link_error.
Committed as obvious.

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/testsuite/
* gcc.dg/tree-ssa/log_ident.c: Add scan for removal of
link_error in optimized tree dump.

9 months agoc++: redundant hashing in register_specialization
Patrick Palka [Tue, 22 Oct 2024 12:01:16 +0000 (08:01 -0400)] 
c++: redundant hashing in register_specialization

After r15-4050-g5dad738c1dd164 register_specialization needs to set
elt.hash to the (maybe) precomputed hash so that the lookup uses it
rather than redundantly computing it from scratch.

gcc/cp/ChangeLog:

* pt.cc (register_specialization): Set elt.hash.

Reviewed-by: Jason Merrill <jason@redhat.com>
9 months agotestsuite: Skip pr112305.c for -O[01] on simulators
Richard Sandiford [Tue, 22 Oct 2024 11:47:45 +0000 (12:47 +0100)] 
testsuite: Skip pr112305.c for -O[01] on simulators

gcc.dg/torture/pr112305.c contains an inner loop that executes
0x8000_0014 times and an outer loop that executes 5 times, giving about
10 billion total executions of the inner loop body.  At -O2 and above we
are able to remove the inner loop, but at -O1 we keep a no-op loop:

        dls     lr, r3
.L3:
        subs    r3, r3, #1
        le      lr, .L3

and at -O0 we of course don't optimise.

This can lead to long execution times on simulators, possibly
triggering a timeout.
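
The "about 10 billion" figure follows from the two trip counts quoted
above; a small sketch of the arithmetic (the constant and function names
are illustrative only):

```cpp
#include <cstdint>

// Trip counts quoted in the commit message.
constexpr std::uint64_t inner_trips = 0x80000014ULL;  // 2147483668
constexpr std::uint64_t outer_trips = 5;

// Total number of times the inner loop body runs.
constexpr std::uint64_t total_inner_body_executions() {
  return inner_trips * outer_trips;
}

static_assert(total_inner_body_executions() == 10737418340ULL,
              "roughly 10.7 billion iterations");
```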

gcc/testsuite
* gcc.dg/torture/pr112305.c: Skip at -O0 and -O1 for simulators.

9 months agoc++/modules: Handle forward-declared class types
Nathaniel Shead [Mon, 21 Oct 2024 11:55:46 +0000 (22:55 +1100)] 
c++/modules: Handle forward-declared class types

In some cases we can access members of a namespace-scope class without
ever having performed name-lookup on it; this can occur when a
forward-declaration of the class is used as a return type, for
instance, or with PIMPL.

One possible approach would be to do name lookup in complete_type to
force lazy loading to occur, but this seems overly expensive for a
relatively rare case.  Instead, this patch generalises the existing
pending-entity support to handle this case as well.

Unfortunately this does mean that almost every class definition will be
added to the pending-entity table, and almost always unnecessarily, but
I don't see a good way to avoid this.

gcc/cp/ChangeLog:

* module.cc (depset::DB_IS_MEMBER_BIT): Rename to...
(depset::DB_IS_PENDING_BIT): ...this.
(depset::is_member): Remove.
(depset::is_pending_entity): New function.
(depset::hash::make_dependency): Mark definitions of
namespace-scope types as maybe-pending entities.
(depset::hash::add_class_entities): Rename DB_IS_MEMBER_BIT to
DB_IS_PENDING_BIT.
(depset::hash::find_dependencies): Use is_pending_entity
instead of is_member.
(module_state::write_pendings): Likewise; adjust comment.

gcc/testsuite/ChangeLog:

* g++.dg/modules/inst-4_b.C: Adjust pending-entity count.
* g++.dg/modules/member-def-1_c.C: Likewise.
* g++.dg/modules/member-def-2_c.C: Likewise.
* g++.dg/modules/tpl-spec-3_b.C: Likewise.
* g++.dg/modules/tpl-spec-4_b.C: Likewise.
* g++.dg/modules/tpl-spec-5_b.C: Likewise.
* g++.dg/modules/class-9_a.H: New test.
* g++.dg/modules/class-9_b.H: New test.
* g++.dg/modules/class-9_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
9 months agotree-optimization/117254 - ICE with access diagnostics
Richard Biener [Tue, 22 Oct 2024 09:46:47 +0000 (11:46 +0200)] 
tree-optimization/117254 - ICE with access diagnostics

The diagnostics code fails to handle non-constant domain max.

PR tree-optimization/117254
* gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg):
Check the array domain max is constant before using it.

* gcc.dg/pr117254.c: New testcase.

9 months agoamdgcn: Refactor device settings into a def file
Andrew Stubbs [Tue, 17 Sep 2024 15:26:04 +0000 (15:26 +0000)] 
amdgcn: Refactor device settings into a def file

Almost all device-specific settings are now centralised into gcn-devices.def
for the compiler, mkoffload, and libgomp.  No longer will we have to touch 10
files in multiple places just to add another device without any exotic
features.  (New ISAs and devices with incompatible metadata will continue to
need a bit more.)

In order to remove the device-specific conditionals in the code a new value
HSACO_ATTR_UNSUPPORTED has been added, indicating that the assembler will
reject any setting of that option.

This incorporates some of Tobias's patch from March 2024.

Co-Authored-By: Tobias Burnus <tburnus@baylibre.com>
gcc/ChangeLog:

* config.gcc (amdgcn): Add gcn-device-macros.h to tm_file.
Add gcn-tables.opt to extra_options.
* config/gcn/gcn-hsa.h (NO_XNACK): Delete.
(NO_SRAM_ECC): Delete.
(SRAMOPT): Move definition to generated file gcn-device-macros.h.
(XNACKOPT): Likewise.
(ASM_SPEC): Redefine using generated values from gcn-device-macros.h.
* config/gcn/gcn-opts.h
(enum processor_type): Generate from gcn-devices.def.
(TARGET_VEGA10): Delete.
(TARGET_VEGA20): Delete.
(TARGET_GFX908): Delete.
(TARGET_GFX90a): Delete.
(TARGET_GFX90c): Delete.
(TARGET_GFX1030): Delete.
(TARGET_GFX1036): Delete.
(TARGET_GFX1100): Delete.
(TARGET_GFX1103): Delete.
(TARGET_XNACK): Redefine to allow for HSACO_ATTR_UNSUPPORTED.
(enum hsaco_attr_type): Add HSACO_ATTR_UNSUPPORTED.
(TARGET_TGSPLIT): New define.
* config/gcn/gcn.cc (gcn_devices): New constant table.
(gcn_option_override): Rework to use gcn_devices table.
(gcn_omp_device_kind_arch_isa): Likewise.
(output_file_start): Likewise.
(gcn_hsa_declare_function_name): Rework using TARGET_* macros.
* config/gcn/gcn.h (gcn_devices): Declare struct and table.
(TARGET_CPU_CPP_BUILTINS): Rework using gcn_devices.
* config/gcn/gcn.opt: Move enum data to generated file gcn-tables.opt.
Use new names for the default values.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX900): Delete.
(EF_AMDGPU_MACH_AMDGCN_GFX906): Delete.
(EF_AMDGPU_MACH_AMDGCN_GFX908): Delete.
(EF_AMDGPU_MACH_AMDGCN_GFX90a): Delete.
(EF_AMDGPU_MACH_AMDGCN_GFX90c): Delete.
(EF_AMDGPU_MACH_AMDGCN_GFX1030): Delete.
(EF_AMDGPU_MACH_AMDGCN_GFX1036): Delete.
(EF_AMDGPU_MACH_AMDGCN_GFX1100): Delete.
(EF_AMDGPU_MACH_AMDGCN_GFX1103): Delete.
(enum elf_arch_code): Define using gcn-devices.def.
(get_arch): Rework using gcn-devices.def.
(main): Rework using gcn-devices.def.
* config/gcn/t-gcn-hsa (gcn-tables.opt): Generate file.
(gcn-device-macros.h): Generate file.
* config/gcn/t-omp-device: Generate isa list from gcn-devices.def.
* config/gcn/gcn-devices.def: New file.
* config/gcn/gcn-tables.opt: New file.
* config/gcn/gcn-tables.opt.urls: New file.
* config/gcn/gen-gcn-device-macros.awk: New file.
* config/gcn/gen-opt-tables.awk: New file.

libgomp/ChangeLog:

* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Generate from gcn-devices.def.
(gcn_gfx803_s): Delete.
(gcn_gfx900_s): Delete.
(gcn_gfx906_s): Delete.
(gcn_gfx908_s): Delete.
(gcn_gfx90a_s): Delete.
(gcn_gfx90c_s): Delete.
(gcn_gfx1030_s): Delete.
(gcn_gfx1036_s): Delete.
(gcn_gfx1100_s): Delete.
(gcn_gfx1103_s): Delete.
(gcn_isa_name_len): Delete.
(isa_hsa_name): Rename ...
(isa_name): ... to this, and rework using gcn-devices.def.
(isa_gcc_name): Delete.
(isa_code): Rework using gcn-devices.def.
(max_isa_vgprs): Rework using gcn-devices.def.
(isa_matches_agent): Update isa_name usage.
(GOMP_OFFLOAD_init_device): Improve diagnostic using the name.

9 months agotree-optimization/117123 - missed PHI equivalence in VN
Richard Biener [Mon, 21 Oct 2024 12:01:23 +0000 (14:01 +0200)] 
tree-optimization/117123 - missed PHI equivalence in VN

Value-numbering can use its set of equivalences to prove that
a PHI node with args <a_1, 5, 10> is equal to a_1 iff on the
edges with the constants a_1 == 5 and a_1 == 10 hold.  This
breaks down when the order of PHI args is <5, 10, a_1> as then
we drop to VARYING early.  The following mitigates this by
shuffling a copy of the edge vector to always process a SSA name
argument first.  Which should also handle the special-case of
a two argument <5, a_1> we already had.
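
For illustration, a source-level shape that could give rise to such a PHI
is sketched below. This is an assumption, not the testcase from the PR:
the function name is made up, and the actual GIMPLE and the order of the
PHI arguments depend on the front end and on earlier passes.

```cpp
// On the edge from the first branch a == 5 holds, on the edge from the
// second a == 10 holds, so the merged value always equals a.
int select_value(int a) {
  int r;
  if (a == 5)
    r = 5;
  else if (a == 10)
    r = 10;
  else
    r = a;
  return r;  // equivalent to: return a;
}
```

Whatever order the three incoming values appear in, the function is the
identity, which is the fact value-numbering is trying to prove.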

PR tree-optimization/117123
* tree-ssa-sccvn.cc (visit_phi): First process a non-constant
argument edge to handle more equivalences.  Remove the
two-arg special case.

* g++.dg/tree-ssa/pr117123.C: New testcase.

9 months agotestsuite: Fix typo in ext-floating19.C
Stefan Schulze Frielinghaus [Tue, 22 Oct 2024 06:58:14 +0000 (08:58 +0200)] 
testsuite: Fix typo in ext-floating19.C

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/ext-floating19.C: Fix typo for bfloat16 guard.

9 months agoRISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = 1.
xuli [Mon, 21 Oct 2024 04:10:14 +0000 (04:10 +0000)] 
RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = 1.

form 1:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
{                                       \
  return (T)IMM >= y ? (T)IMM - y : 0;  \
}

Passed the rv64gcv regression test.

Change-Id: I8805225b445cdbbc685f4f54a4d66c7ee8f748e1
Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_u_sub_imm-1_4.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2_4.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3_4.c: New test.
* gcc.target/riscv/sat_u_sub_imm-4_2.c: New test.

9 months agoMatch: Support IMM=1 for unsigned scalar .SAT_SUB IMM form 1
xuli [Mon, 21 Oct 2024 04:08:46 +0000 (04:08 +0000)] 
Match: Support IMM=1 for unsigned scalar .SAT_SUB IMM form 1

This patch adds support for .SAT_SUB when one of the operands
is the immediate IMM = 1 in form 1.

Form 1:
 #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
 T __attribute__((noinline))             \
 sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
 {                                       \
   return IMM >= y ? IMM - y : 0;        \
 }

Take the form 1 instance below as an example:
DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 1)

Before this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y)
{
  uint8_t _1;
  uint8_t _3;

  <bb 2> [local count: 1073741824]:
  if (y_2(D) <= 1)
    goto <bb 3>; [41.00%]
  else
    goto <bb 4>; [59.00%]

  <bb 3> [local count: 440234144]:
  _3 = y_2(D) ^ 1;

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <0(2), _3(3)>
  return _1;

}

After this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y)
{
  uint8_t _1;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _1 = .SAT_SUB (1, y_2(D)); [tail call]
  return _1;
;;    succ:       EXIT

}
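
The branch-and-xor shape in the "before" dump computes the same value as
a saturating subtraction from 1. A minimal sketch that checks this over
all uint8_t inputs (the function names are hypothetical, and this is a
scalar model rather than the match.pd pattern itself):

```cpp
#include <cstdint>

// Reference unsigned saturating subtraction: x - y, clamped at 0.
inline std::uint8_t sat_sub_u8(std::uint8_t x, std::uint8_t y) {
  if (x >= y)
    return static_cast<std::uint8_t>(x - y);
  return 0;
}

// Shape seen before the patch for IMM = 1: y <= 1 ? y ^ 1 : 0.
inline std::uint8_t before_shape_imm1(std::uint8_t y) {
  if (y <= 1)
    return static_cast<std::uint8_t>(y ^ 1);
  return 0;
}

// Exhaustively compares both forms for every 8-bit input.
bool imm1_shapes_agree() {
  for (unsigned y = 0; y <= 255; ++y) {
    const std::uint8_t v = static_cast<std::uint8_t>(y);
    if (before_shape_imm1(v) != sat_sub_u8(1, v))
      return false;
  }
  return true;
}
```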

The test suites below passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/ChangeLog:

* match.pd: Support IMM=1.

9 months agoRISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = max -1.
xuli [Mon, 21 Oct 2024 04:01:01 +0000 (04:01 +0000)] 
RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = max -1.

form 1:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
{                                       \
  return (T)IMM >= y ? (T)IMM - y : 0;  \
}

Passed the rv64gcv regression test.

Change-Id: Idaa1ab41f2a5785112279ea8ee2c93236457b740
Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_u_sub_imm-1_3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2_3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3_3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-4_1.c: New test.

9 months agoMatch: Support IMM=max-1 for unsigned scalar .SAT_SUB IMM form 1
xuli [Tue, 22 Oct 2024 01:08:56 +0000 (01:08 +0000)] 
Match: Support IMM=max-1 for unsigned scalar .SAT_SUB IMM form 1

This patch adds support for .SAT_SUB when one of the operands
is the immediate IMM = max - 1 in form 1.

Form 1:
 #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
 T __attribute__((noinline))             \
 sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
 {                                       \
   return IMM >= y ? IMM - y : 0;        \
 }

Take the form 1 instance below as an example:
DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 254)

Before this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y)
{
  uint8_t _1;
  uint8_t _3;

  <bb 2> [local count: 1073741824]:
  if (y_2(D) != 255)
    goto <bb 3>; [66.00%]
  else
    goto <bb 4>; [34.00%]

  <bb 3> [local count: 708669600]:
  _3 = 254 - y_2(D);

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <0(2), _3(3)>
  return _1;

}

After this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y)
{
  uint8_t _1;

  <bb 2> [local count: 1073741824]:
  _1 = .SAT_SUB (254, y_2(D)); [tail call]
  return _1;

}
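
For IMM = max - 1 the comparison IMM >= y is only false for y == max,
which is why the "before" dump tests y != 255. A short sketch checking
that this shape equals a saturating subtraction from 254 for every
uint8_t input (function names are hypothetical):

```cpp
#include <cstdint>

// Reference: 254 - y, clamped at 0.
inline std::uint8_t sat_sub_from_254(std::uint8_t y) {
  if (254 >= y)
    return static_cast<std::uint8_t>(254 - y);
  return 0;
}

// Shape seen before the patch for IMM = 254: y != 255 ? 254 - y : 0.
inline std::uint8_t before_shape_imm254(std::uint8_t y) {
  if (y != 255)
    return static_cast<std::uint8_t>(254 - y);
  return 0;
}

// Exhaustively compares both forms for every 8-bit input.
bool imm254_shapes_agree() {
  for (unsigned y = 0; y <= 255; ++y) {
    const std::uint8_t v = static_cast<std::uint8_t>(y);
    if (before_shape_imm254(v) != sat_sub_from_254(v))
      return false;
  }
  return true;
}
```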

The test suites below passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/ChangeLog:

* match.pd: Support IMM=max-1.

9 months agoDaily bump.
GCC Administrator [Tue, 22 Oct 2024 00:20:27 +0000 (00:20 +0000)] 
Daily bump.

9 months ago[committed][PR rtl-optimization/116488] Fix SIGN_EXTEND source handling in ext-dce
Jeff Law [Mon, 21 Oct 2024 19:37:21 +0000 (13:37 -0600)] 
[committed][PR rtl-optimization/116488] Fix SIGN_EXTEND source handling in ext-dce

A while back I noticed that the code to call carry_backpropagate was being
called after the optimization step.  Which seemed wrong, but at the time I
didn't have a testcase showing it as a problem.  Now I have 4 :-)

The way things used to work, the extension would be stripped away before
calling carry_backpropagate, meaning carry_backpropagate would never see a
SIGN_EXTEND.  Thus the code trying to account for the sign extended bit was
never reached.

Getting that bit marked live is what's needed to fix these testcases. Fallout
is minor with just an adjustment needed to sensibly deal with vector modes in a
place where we didn't have them before.

I'm still somewhat concerned about this code.  Specifically whether or not we
can get in here with arbitrarily complex RTL, and if so do we need to recurse
down and look at those sub-expressions.

So while this patch fixes the most pressing issue, I wouldn't be terribly
surprised if we're back inside this code at some point.

Bootstrapped and regression tested on x86_64, ppc64le, riscv64, s390x, mips64,
loongarch, aarch64, m68k, alpha, hppa, sh4, sh4eb, perhaps something else that
I've forgotten...  Also tested on all the crosses in my tester.

PR rtl-optimization/116488
PR rtl-optimization/116579
PR rtl-optimization/116915
PR rtl-optimization/117226
gcc/
* ext-dce.cc (carry_backpropagate): Properly handle SIGN_EXTEND, add
ZERO_EXTEND handling as well.
(ext_dce_process_uses): Call carry_backpropagate before the optimization
step.

gcc/testsuite/
* gcc.dg/torture/pr116488.c: New test.
* gcc.dg/torture/pr116579.c: New test.
* gcc.dg/torture/pr116915.c: New test.
* gcc.dg/torture/pr117226.c: New test.

9 months agoRISC-V: Add testcases for form 8 of vector signed SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 07:23:57 +0000 (15:23 +0800)] 
RISC-V: Add testcases for form 8 of vector signed SAT_TRUNC

Form 8:
  #define DEF_VEC_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_8 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN >= x || x >= (WT)NT_MAX                     \
  ? x < 0 ? NT_MIN : NT_MAX                                     \
  : trunc;                                                      \
      }                                                                 \
  }
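
Form 8 uses >= on both bounds, yet it still behaves like an ordinary
clamp, because an input equal to a bound saturates to that same bound.
The sketch below is a scalar model for the i16-to-i8 case only; the
function names are hypothetical and it is not one of the added tests.

```cpp
#include <cstdint>
#include <limits>

// Scalar version of the form 8 expression for int16_t -> int8_t.
inline std::int8_t sat_trunc_form8(std::int16_t x) {
  const std::int16_t lo = std::numeric_limits<std::int8_t>::min();
  const std::int16_t hi = std::numeric_limits<std::int8_t>::max();
  if (lo >= x || x >= hi)
    return static_cast<std::int8_t>(x < 0 ? lo : hi);
  return static_cast<std::int8_t>(x);  // in range, plain truncation
}

// Straightforward clamp used as the reference.
inline std::int8_t sat_trunc_clamp(std::int16_t x) {
  if (x < -128)
    return -128;
  if (x > 127)
    return 127;
  return static_cast<std::int8_t>(x);
}

// Exhaustively compares both versions over every int16_t value.
bool form8_matches_clamp() {
  for (int v = -32768; v <= 32767; ++v) {
    const std::int16_t x = static_cast<std::int16_t>(v);
    if (sat_trunc_form8(x) != sat_trunc_clamp(x))
      return false;
  }
  return true;
}
```

The same reasoning covers the other comparison variants, which differ
only in whether each bound is tested with a strict or non-strict
comparison.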

The tests below passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agoRISC-V: Add testcases for form 7 of vector signed SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 07:10:46 +0000 (15:10 +0800)] 
RISC-V: Add testcases for form 7 of vector signed SAT_TRUNC

Form 7:
  #define DEF_VEC_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_7 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN > x || x >= (WT)NT_MAX                      \
  ? x < 0 ? NT_MIN : NT_MAX                                     \
  : trunc;                                                      \
      }                                                                 \
  }

The tests below passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agoRISC-V: Add testcases for form 6 of vector signed SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 06:55:56 +0000 (14:55 +0800)] 
RISC-V: Add testcases for form 6 of vector signed SAT_TRUNC

Form 6:
  #define DEF_VEC_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_6 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN >= x || x > (WT)NT_MAX                      \
  ? x < 0 ? NT_MIN : NT_MAX                                     \
  : trunc;                                                      \
      }                                                                 \
  }

The tests below passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agoRISC-V: Add testcases for form 5 of vector signed SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 06:41:22 +0000 (14:41 +0800)] 
RISC-V: Add testcases for form 5 of vector signed SAT_TRUNC

Form 5:
  #define DEF_VEC_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_5 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN > x || x > (WT)NT_MAX                       \
          ? x < 0 ? NT_MIN : NT_MAX                                     \
          : trunc;                                                      \
      }                                                                 \
  }
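
As a minimal sketch (not part of the patch), form 5 only inverts the range test of form 1 and swaps the two arms of the conditional, so both should give the same result. The scalar helpers below fix NT to int8_t and WT to int16_t; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Form 1 shape: test "in range" first (scalar, i16 -> i8).  */
int8_t sat_trunc_fmt_1 (int16_t x)
{
  return (int16_t) INT8_MIN <= x && x <= (int16_t) INT8_MAX
    ? (int8_t) x
    : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Form 5 shape: test "out of range" first, arms swapped.  */
int8_t sat_trunc_fmt_5 (int16_t x)
{
  return (int16_t) INT8_MIN > x || x > (int16_t) INT8_MAX
    ? x < 0 ? INT8_MIN : INT8_MAX
    : (int8_t) x;
}

/* Exhaustively check that both forms agree on every int16_t input.  */
void check_fmt_1_and_5_agree (void)
{
  for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
    assert (sat_trunc_fmt_1 ((int16_t) v) == sat_trunc_fmt_5 ((int16_t) v));
}
```

Calling check_fmt_1_and_5_agree () from a small driver walks all 65536 inputs, which is cheap enough to run as a sanity check.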

The tests below pass with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 4 of vector signed SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 03:41:02 +0000 (11:41 +0800)] 
RISC-V: Add testcases for form 4 of vector signed SAT_TRUNC

Form 4:
  #define DEF_VEC_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN <= x && x < (WT)NT_MAX                      \
          ? trunc                                                       \
          : x < 0 ? NT_MIN : NT_MAX;                                    \
      }                                                                 \
  }
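
The strict upper bound in form 4 does not change the result: when x equals NT_MAX the test fails, but the saturating arm then yields NT_MAX anyway. A small scalar sketch, assuming NT is int8_t and WT is int16_t (helper names are illustrative only), checks this against a plain clamp.

```c
#include <assert.h>
#include <stdint.h>

/* Form 4 shape: inclusive lower bound, strict upper bound.  */
int8_t sat_trunc_fmt_4 (int16_t x)
{
  return (int16_t) INT8_MIN <= x && x < (int16_t) INT8_MAX
    ? (int8_t) x
    : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Plain clamp used as the reference result.  */
int8_t clamp_i16_to_i8 (int16_t x)
{
  if (x < INT8_MIN)
    return INT8_MIN;
  if (x > INT8_MAX)
    return INT8_MAX;
  return (int8_t) x;
}

/* Compare form 4 with the clamp for every int16_t input.  */
void check_fmt_4_matches_clamp (void)
{
  for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
    assert (sat_trunc_fmt_4 ((int16_t) v) == clamp_i16_to_i8 ((int16_t) v));
}
```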

The tests below pass with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 3 of vector signed SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 03:26:06 +0000 (11:26 +0800)] 
RISC-V: Add testcases for form 3 of vector signed SAT_TRUNC

Form 3:
  #define DEF_VEC_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_3 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN < x && x < (WT)NT_MAX                       \
          ? trunc                                                       \
          : x < 0 ? NT_MIN : NT_MAX;                                    \
      }                                                                 \
  }

The tests below pass with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 2 of vector signed SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 03:09:55 +0000 (11:09 +0800)] 
RISC-V: Add testcases for form 2 of vector signed SAT_TRUNC

Form 2:
  #define DEF_VEC_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_2 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN < x && x < (WT)NT_MAX                       \
          ? trunc                                                       \
          : x < 0 ? NT_MIN : NT_MAX;                                    \
      }                                                                 \
  }

The tests below pass with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 1 of vector signed SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 02:21:39 +0000 (10:21 +0800)] 
RISC-V: Add testcases for form 1 of vector signed SAT_TRUNC

Form 1:
  #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX                     \
          ? trunc                                                       \
          : x < 0 ? NT_MIN : NT_MAX;                                    \
      }                                                                 \
  }

The tests below pass with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/vec_sat_data.h: Add test data for
signed SAT_TRUNC.
* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i16-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i32-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i32-to-i8.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i16.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i32.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Implement vector SAT_TRUNC for signed integer
Pan Li [Mon, 14 Oct 2024 02:14:31 +0000 (10:14 +0800)] 
RISC-V: Implement vector SAT_TRUNC for signed integer

This patch implements sstrunc for vector signed integers.

Form 1:
  #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX                     \
          ? trunc                                                       \
          : x < 0 ? NT_MIN : NT_MAX;                                    \
      }                                                                 \
  }

DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)

Before this patch:
  vsetvli a5,a2,e64,m1,ta,ma
  vle64.v v1,0(a1)
  slli    a3,a5,3
  slli    a4,a5,2
  sub a2,a2,a5
  add a1,a1,a3
  vadd.vv v0,v1,v5
  vsetvli zero,zero,e32,mf2,ta,ma
  vnsrl.wx    v2,v1,a6
  vncvt.x.x.w v1,v1
  vsetvli zero,zero,e64,m1,ta,ma
  vmsgtu.vv   v0,v0,v4
  vsetvli zero,zero,e32,mf2,ta,mu
  vneg.v  v2,v2
  vxor.vv v1,v2,v3,v0.t
  vse32.v v1,0(a0)
  add a0,a0,a4
  bne a2,zero,.L3

After this patch:
  vsetvli a5,a2,e32,mf2,ta,ma
  vle64.v v1,0(a1)
  slli    a3,a5,3
  slli    a4,a5,2
  sub a2,a2,a5
  add a1,a1,a3
  vnclip.wi   v1,v1,0
  vse32.v v1,0(a0)
  add a0,a0,a4
  bne a2,zero,.L3
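
The single vnclip.wi with a shift amount of 0 performs no shift and no rounding, only signed saturation to the narrower element width. The sketch below instantiates the form 1 macro quoted above and compares the loop output with a per-element clamp model. clip_i64_to_i32 and the check function are illustrative names, not GCC code, and __attribute__((noinline)) assumes a GCC-compatible compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Form 1 macro as quoted in the commit message.  */
#define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX)             \
void __attribute__((noinline))                                        \
vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
{                                                                     \
  unsigned i;                                                         \
  for (i = 0; i < limit; i++)                                         \
    {                                                                 \
      WT x = in[i];                                                   \
      NT trunc = (NT)x;                                               \
      out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX                     \
        ? trunc                                                       \
        : x < 0 ? NT_MIN : NT_MAX;                                    \
    }                                                                 \
}

DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)

/* Per-element model of a signed narrowing clip with shift amount 0:
   the value is only saturated to the int32_t range.  */
int32_t clip_i64_to_i32 (int64_t x)
{
  if (x < INT32_MIN)
    return INT32_MIN;
  if (x > INT32_MAX)
    return INT32_MAX;
  return (int32_t) x;
}

/* Run the loop on a few edge inputs and compare with the model.  */
void check_loop_matches_clip_model (void)
{
  int64_t in[] = { 0, -1, INT32_MAX, INT32_MIN,
                   (int64_t) INT32_MAX + 1, (int64_t) INT32_MIN - 1,
                   INT64_MAX, INT64_MIN };
  enum { N = sizeof in / sizeof in[0] };
  int32_t out[N];

  vec_sat_s_trunc_int32_t_int64_t_fmt_1 (out, in, N);
  for (unsigned i = 0; i < N; i++)
    assert (out[i] == clip_i64_to_i32 (in[i]));
}
```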

The test suites below pass with this patch.
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/autovec.md (sstrunc<mode><v_double_trunc>2): Add
new pattern sstrunc for double trunc.
(sstrunc<mode><v_quad_trunc>2): Ditto but for quad trunc.
(sstrunc<mode><v_oct_trunc>2): Ditto but for oct trunc.
* config/riscv/riscv-protos.h (expand_vec_double_sstrunc): Add
new func decl to expand double trunc.
(expand_vec_quad_sstrunc): Ditto but for quad trunc.
(expand_vec_oct_sstrunc): Ditto but for oct trunc.
* config/riscv/riscv-v.cc (expand_vec_double_sstrunc): Add new
func to expand double trunc.
(expand_vec_quad_sstrunc): Ditto but for quad trunc.
(expand_vec_oct_sstrunc): Ditto but for oct trunc.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago Vect: Try the pattern of vector signed integer SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 02:09:31 +0000 (10:09 +0800)] 
Vect: Try the pattern of vector signed integer SAT_TRUNC

As with vector unsigned integer SAT_TRUNC, try to match the signed
version during vector pattern matching.

The test suites below pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* tree-vect-patterns.cc (gimple_signed_integer_sat_trunc): Add
new func decl for signed SAT_TRUNC.
(vect_recog_sat_trunc_pattern): Try signed match pattern for
the SAT_TRUNC.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago Match: Support form 1 for vector signed integer SAT_TRUNC
Pan Li [Mon, 14 Oct 2024 02:03:25 +0000 (10:03 +0800)] 
Match: Support form 1 for vector signed integer SAT_TRUNC

This patch supports form 1 of the vector signed integer SAT_TRUNC,
as in the example below:

Form 1:
  #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX                     \
          ? trunc                                                       \
          : x < 0 ? NT_MIN : NT_MAX;                                    \
      }                                                                 \
  }

DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)

Before this patch:
  _87 = .SELECT_VL (ivtmp_85, POLY_INT_CST [2, 2]);
  ivtmp_64 = _87 * 8;
  vect_x_14.10_67 = .MASK_LEN_LOAD (vectp_in.8_65, 64B, { -1, ... }, _87, 0);
  vect_trunc_15.21_78 = (vector([2,2]) int) vect_x_14.10_67;
  _61 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned long>(vect_x_14.10_67);
  _32 = _61 >> 63;
  vect_patt_52.16_73 = (vector([2,2]) int) _32;
  vect__46.17_74 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>(vect_patt_52.16_73);
  vect__47.18_75 = -vect__46.17_74;
  vect__21.19_76 = VIEW_CONVERT_EXPR<vector([2,2]) int>(vect__47.18_75);
  vect_x.11_68 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned long>(vect_x_14.10_67);
  vect__5.12_69 = vect_x.11_68 + { 2147483648, ... };
  mask__34.13_70 = vect__5.12_69 > { 4294967295, ... };
  _25 = .COND_XOR (mask__34.13_70, vect__21.19_76, { 2147483647, ... }, vect_trunc_15.21_78);
  ivtmp_80 = _87 * 4;
  .MASK_LEN_STORE (vectp_out.23_81, 32B, { -1, ... }, _87, 0, _25);
  vectp_in.8_66 = vectp_in.8_65 + ivtmp_64;
  vectp_out.23_82 = vectp_out.23_81 + ivtmp_80;
  ivtmp_86 = ivtmp_85 - _87;
After this patch:
  _77 = .SELECT_VL (ivtmp_75, POLY_INT_CST [2, 2]);
  ivtmp_65 = _77 * 8;
  vect_x_14.10_68 = .MASK_LEN_LOAD (vectp_in.8_66, 64B, { -1, ... }, _77, 0);
  vect_patt_53.11_69 = .SAT_TRUNC (vect_x_14.10_68);
  ivtmp_70 = _77 * 4;
  .MASK_LEN_STORE (vectp_out.12_71, 32B, { -1, ... }, _77, 0, vect_patt_53.11_69);
  vectp_in.8_67 = vectp_in.8_66 + ivtmp_65;
  vectp_out.12_72 = vectp_out.12_71 + ivtmp_70;
  ivtmp_76 = ivtmp_75 - _77;
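
To see why a single .SAT_TRUNC can stand in for the longer sequence, the open-coded GIMPLE can be read per element as a bias-and-compare range check followed by a conditional XOR that builds the saturated value from the sign. The scalar sketch below is my own reading of that sequence for int64_t to int32_t, not code from the patch, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Branch-free sequence modelled on the "before" GIMPLE, one element.  */
int32_t sat_trunc_open_coded (int64_t x)
{
  uint64_t ux = (uint64_t) x;
  /* Biasing by 2^31 maps the int32_t range onto [0, 2^32 - 1].  */
  int out_of_range = ux + UINT64_C (2147483648) > UINT64_C (4294967295);
  /* 0 for non-negative x, -1 (all ones) for negative x.  */
  int32_t sign_mask = -(int32_t) (ux >> 63);
  /* all-ones ^ INT32_MAX is INT32_MIN; 0 ^ INT32_MAX is INT32_MAX.  */
  return out_of_range ? (sign_mask ^ INT32_MAX) : (int32_t) x;
}

/* Reference result: plain clamp to the int32_t range.  */
int32_t sat_trunc_reference (int64_t x)
{
  if (x < INT32_MIN)
    return INT32_MIN;
  if (x > INT32_MAX)
    return INT32_MAX;
  return (int32_t) x;
}

/* Compare both versions on boundary values.  */
void check_open_coded_matches_reference (void)
{
  const int64_t samples[] = { 0, 1, -1, INT32_MAX, INT32_MIN,
                              (int64_t) INT32_MAX + 1,
                              (int64_t) INT32_MIN - 1,
                              INT64_MAX, INT64_MIN };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (sat_trunc_open_coded (samples[i])
            == sat_trunc_reference (samples[i]));
}
```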

The test suites below pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Refine matching for vector signed SAT_TRUNC form 1.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago aarch64: Fix costing of move to/from MOVEABLE_SYSREGS
Andrew Carlotti [Thu, 22 Aug 2024 10:59:33 +0000 (11:59 +0100)] 
aarch64: Fix costing of move to/from MOVEABLE_SYSREGS

This is necessary to prevent reload assuming that a direct FP->FPMR move
is valid.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_register_move_cost):
Increase costs involving MOVEABLE_SYSREGS.

9 months ago amdgcn: silence warning
Andrew Stubbs [Mon, 16 Sep 2024 12:31:59 +0000 (12:31 +0000)] 
amdgcn: silence warning

FIRST_SGPR_REG is register zero so the compiler always claims this comparison
is redundant.  It's right, of course, but I'd have preferred to keep the
comparison for completeness.  Probably the "correct" solution is to use an enum
for these values.
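
For illustration only, the sketch below shows the kind of comparison that triggers the diagnostic and one possible way to keep both bounds without a test against zero. It is not necessarily what the patch does; LAST_SGPR_REG = 101 and the function names are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical register bounds: only "first SGPR is register zero"
   comes from the commit message; 101 is an assumed upper bound.  */
enum { FIRST_SGPR_REG = 0, LAST_SGPR_REG = 101 };

/* Two-test form: with an unsigned N the lower-bound test is always
   true, which is what the compiler points out.  */
int sgpr_regno_p_two_tests (unsigned n)
{
  return n >= FIRST_SGPR_REG && n <= LAST_SGPR_REG;
}

/* One-test form: subtracting the lower bound folds both tests into a
   single unsigned comparison, so no test against zero remains.  */
int sgpr_regno_p_one_test (unsigned n)
{
  return n - FIRST_SGPR_REG <= (unsigned) (LAST_SGPR_REG - FIRST_SGPR_REG);
}
```

The one-test form stays correct if the lower bound ever becomes non-zero, because values below it wrap around to large unsigned numbers and fail the comparison.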

gcc/ChangeLog:

* config/gcn/gcn.h (SGPR_REGNO_P): Silence warning.

9 months ago pair-fusion: Assume alias conflict if common address reg changes [PR116783]
Alex Coplan [Fri, 20 Sep 2024 16:39:39 +0000 (17:39 +0100)] 
pair-fusion: Assume alias conflict if common address reg changes [PR116783]

As the PR shows, pair-fusion was tricking memory_modified_in_insn_p into
returning false when a common base register (in this case, x1) was
modified between the mem and the store insn.  This led to wrong code as
the accesses really did alias.

To avoid this sort of problem, this patch avoids invoking RTL alias
analysis altogether (and assumes an alias conflict) if the two insns to
be compared share a common address register R, and the insns see different
definitions of R (i.e. it was modified in between).
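
The rule can be sketched with a simplified model. The struct, the integer definition ids and the stub alias query below are hypothetical stand-ins and do not reflect the RTL-SSA data structures used in pair-fusion.cc; only the idea of checking address-register definitions before consulting alias analysis comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one address-register use, tagged with the
   definition of that register which the instruction sees.  */
struct addr_use
{
  int regno;   /* register number used in the address */
  int def_id;  /* identifier of the reaching definition */
};

/* True if A and B share an address register but see different
   definitions of it, i.e. the register changed in between.  */
bool addr_reg_conflict_p (const struct addr_use *a, size_t na,
                          const struct addr_use *b, size_t nb)
{
  for (size_t i = 0; i < na; i++)
    for (size_t j = 0; j < nb; j++)
      if (a[i].regno == b[j].regno && a[i].def_id != b[j].def_id)
        return true;
  return false;
}

/* Stand-in for an alias query that (wrongly) reports no conflict.  */
bool alias_query_says_no (void)
{
  return false;
}

/* Conservative wrapper: assume a conflict without asking the alias
   query when the address-register check already fires.  */
bool conflict_p (const struct addr_use *a, size_t na,
                 const struct addr_use *b, size_t nb,
                 bool (*alias_conflict_p) (void))
{
  if (addr_reg_conflict_p (a, na, b, nb))
    return true;
  return alias_conflict_p ();
}
```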

gcc/ChangeLog:

PR rtl-optimization/116783
* pair-fusion.cc (def_walker::cand_addr_uses): New.
(def_walker::def_walker): Add parameter for candidate address
uses.
(def_walker::alias_conflict_p): Declare.
(def_walker::addr_reg_conflict_p): New.
(def_walker::conflict_p): New.
(store_walker::store_walker): Add parameter for candidate
address uses and pass to base ctor.
(store_walker::conflict_p): Rename to ...
(store_walker::alias_conflict_p): ... this.
(load_walker::load_walker): Add parameter for candidate
address uses and pass to base ctor.
(load_walker::conflict_p): Rename to ...
(load_walker::alias_conflict_p): ... this.
(pair_fusion_bb_info::try_fuse_pair): Collect address register
uses for candidate insns and pass down to alias walkers.

gcc/testsuite/ChangeLog:

PR rtl-optimization/116783
* g++.dg/torture/pr116783.C: New test.

9 months ago libstdc++: Improve 26_numerics/headers/cmath/types_std_c++0x_neg.cc
Jonathan Wakely [Fri, 18 Oct 2024 11:02:45 +0000 (12:02 +0100)] 
libstdc++: Improve 26_numerics/headers/cmath/types_std_c++0x_neg.cc

This test checks that the special functions in <cmath> are not declared
prior to C++17. But we can remove the target selector and allow it to be
tested for C++17 and later, and add target selectors to the individual
dg-error directives instead.

Also rename the test to match what it actually tests.

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/headers/cmath/types_std_c++0x_neg.cc:
Move to ...
* testsuite/26_numerics/headers/cmath/specfun_c++17.cc: here and
adjust test to be valid for all -std dialects.

9 months ago libstdc++: Simplify C++98 std::vector::_M_data_ptr overload set
Jonathan Wakely [Fri, 18 Oct 2024 10:55:08 +0000 (11:55 +0100)] 
libstdc++: Simplify C++98 std::vector::_M_data_ptr overload set

We don't need separate overloads for returning a const or non-const
pointer. We can make the member function const and return a non-const
pointer, and let vector::data() const convert it to const as needed.

libstdc++-v3/ChangeLog:

* include/bits/stl_vector.h (vector::_M_data_ptr): Remove
non-const overloads. Always return non-const pointer.

9 months ago libstdc++: Fix order of [[...]] and __attribute__((...)) attrs [PR117220]
Jonathan Wakely [Mon, 21 Oct 2024 11:09:36 +0000 (12:09 +0100)] 
libstdc++: Fix order of [[...]] and __attribute__((...)) attrs [PR117220]

GCC allows these in either order, but Clang doesn't like the C++11-style
[[__nodiscard__]] coming after __attribute__((__always_inline__)).

libstdc++-v3/ChangeLog:

PR libstdc++/117220
* include/bits/stl_iterator.h: Move _GLIBCXX_NODISCARD
annotations after __attribute__((__always_inline__)).

9 months ago rs6000: Correct the function code for _AMO_LD_DEC_BOUNDED
Jeevitha [Thu, 10 Oct 2024 19:42:45 +0000 (14:42 -0500)] 
rs6000: Correct the function code for _AMO_LD_DEC_BOUNDED

Corrected the function code for the Atomic Memory Operation "Fetch and Decrement
Bounded", changing it from 0x1A to 0x1C.

2024-10-11 Jeevitha Palanisamy <jeevitha@linux.ibm.com>

gcc/

* config/rs6000/amo.h (enum _AMO_LD): Correct the function code for
_AMO_LD_DEC_BOUNDED.

9 months ago i386: Refactor get_intel_cpu
Haochen Jiang [Mon, 21 Oct 2024 05:42:12 +0000 (13:42 +0800)] 
i386: Refactor get_intel_cpu

The ISE shows that Diamond Rapids will use family 0x13, so get_intel_cpu
needs to be refactored to accept new families.  Also reorder the switch
for clarity, putting earlier-added products first so they are easier to
find.
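
For background, a family value such as 0x13 comes from combining the base and extended family fields of CPUID leaf 1. The sketch below decodes the display family and model from a raw EAX value following the usual CPUID convention; the helper names are hypothetical and this is not the code in cpuinfo.h.

```c
#include <assert.h>
#include <stdint.h>

/* Display family: base family in bits 11:8; the extended family in
   bits 27:20 is added only when the base family is 0xF.  */
unsigned cpuid_display_family (uint32_t eax)
{
  unsigned family = (eax >> 8) & 0xf;
  unsigned ext_family = (eax >> 20) & 0xff;
  return family == 0xf ? family + ext_family : family;
}

/* Display model: base model in bits 7:4; the extended model in bits
   19:16 is prepended when the base family is 0x6 or 0xF.  */
unsigned cpuid_display_model (uint32_t eax)
{
  unsigned family = (eax >> 8) & 0xf;
  unsigned model = (eax >> 4) & 0xf;
  unsigned ext_model = (eax >> 16) & 0xf;
  return (family == 0x6 || family == 0xf) ? (ext_model << 4) + model : model;
}

/* Hypothetical two-level dispatch: choose the family first, then let a
   per-family switch look at the model.  Returns 1 for a known family.  */
int known_family_p (uint32_t eax)
{
  switch (cpuid_display_family (eax))
    {
    case 0x6:
    case 0x13:
      return 1;
    default:
      return 0;
    }
}
```

With this decoding, a base family of 0xF plus an extended family of 0x4 gives 0x13, so a dispatcher keyed only on family 0x6 would never reach such a part.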

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_intel_cpu): Refactor the
function for future expansion on different family.

9 months ago RISC-V: Skip flag -flto for all saturated arithmetic test cases.
xuli [Mon, 21 Oct 2024 03:22:01 +0000 (03:22 +0000)] 
RISC-V: Skip flag -flto for all saturated arithmetic test cases.

Skip flag -flto to address UNRESOLVED cases as follows:

gcc.target/riscv/sat_s_add-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects: output file does not exist
UNRESOLVED: gcc.target/riscv/sat_s_add-1.c
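
A test can opt out of particular option sets with the DejaGnu dg-skip-if directive. The header below is a sketch of that approach: whether the patch uses exactly this selector is an assumption, and the function body is a made-up placeholder rather than the contents of sat_s_add-1.c.

```c
/* { dg-do compile } */
/* { dg-skip-if "" { *-*-* } { "-flto" } } */

#include <stdint.h>

/* Hypothetical test body: signed 8-bit saturating add.  */
int8_t sat_s_add_i8 (int8_t a, int8_t b)
{
  int sum = a + b;
  if (sum > INT8_MAX)
    return INT8_MAX;
  if (sum < INT8_MIN)
    return INT8_MIN;
  return (int8_t) sum;
}
```

With such a directive the harness should report the -flto variants as UNSUPPORTED rather than UNRESOLVED, since it no longer tries to scan dump output that those variants do not produce.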

Change-Id: I7ff55197b6294cd473dfaa6cc350c5e2eb5960fe
Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_s_add-1.c: Skip flag -flto.
* gcc.target/riscv/sat_s_add-10.c: Ditto.
* gcc.target/riscv/sat_s_add-11.c: Ditto.
* gcc.target/riscv/sat_s_add-12.c: Ditto.
* gcc.target/riscv/sat_s_add-13.c: Ditto.
* gcc.target/riscv/sat_s_add-14.c: Ditto.
* gcc.target/riscv/sat_s_add-15.c: Ditto.
* gcc.target/riscv/sat_s_add-16.c: Ditto.
* gcc.target/riscv/sat_s_add-2.c: Ditto.
* gcc.target/riscv/sat_s_add-3.c: Ditto.
* gcc.target/riscv/sat_s_add-4.c: Ditto.
* gcc.target/riscv/sat_s_add-5.c: Ditto.
* gcc.target/riscv/sat_s_add-6.c: Ditto.
* gcc.target/riscv/sat_s_add-7.c: Ditto.
* gcc.target/riscv/sat_s_add-8.c: Ditto.
* gcc.target/riscv/sat_s_add-9.c: Ditto.
* gcc.target/riscv/sat_s_sub-1-i16.c: Ditto.
* gcc.target/riscv/sat_s_sub-1-i32.c: Ditto.
* gcc.target/riscv/sat_s_sub-1-i64.c: Ditto.
* gcc.target/riscv/sat_s_sub-1-i8.c: Ditto.
* gcc.target/riscv/sat_s_sub-2-i16.c: Ditto.
* gcc.target/riscv/sat_s_sub-2-i32.c: Ditto.
* gcc.target/riscv/sat_s_sub-2-i64.c: Ditto.
* gcc.target/riscv/sat_s_sub-2-i8.c: Ditto.
* gcc.target/riscv/sat_s_sub-3-i16.c: Ditto.
* gcc.target/riscv/sat_s_sub-3-i32.c: Ditto.
* gcc.target/riscv/sat_s_sub-3-i64.c: Ditto.
* gcc.target/riscv/sat_s_sub-3-i8.c: Ditto.
* gcc.target/riscv/sat_s_sub-4-i16.c: Ditto.
* gcc.target/riscv/sat_s_sub-4-i32.c: Ditto.
* gcc.target/riscv/sat_s_sub-4-i64.c: Ditto.
* gcc.target/riscv/sat_s_sub-4-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-1-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-1-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-1-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-1-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-1-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat_s_trunc-1-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-2-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-2-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-2-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-2-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-2-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat_s_trunc-2-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-3-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-3-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-3-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-4-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-4-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-4-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-5-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-5-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-5-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-6-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-6-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-6-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-7-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-7-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-7-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-8-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-8-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-8-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat_u_add-1.c: Ditto.
* gcc.target/riscv/sat_u_add-10.c: Ditto.
* gcc.target/riscv/sat_u_add-11.c: Ditto.
* gcc.target/riscv/sat_u_add-12.c: Ditto.
* gcc.target/riscv/sat_u_add-13.c: Ditto.
* gcc.target/riscv/sat_u_add-14.c: Ditto.
* gcc.target/riscv/sat_u_add-15.c: Ditto.
* gcc.target/riscv/sat_u_add-16.c: Ditto.
* gcc.target/riscv/sat_u_add-17.c: Ditto.
* gcc.target/riscv/sat_u_add-18.c: Ditto.
* gcc.target/riscv/sat_u_add-19.c: Ditto.
* gcc.target/riscv/sat_u_add-2.c: Ditto.
* gcc.target/riscv/sat_u_add-20.c: Ditto.
* gcc.target/riscv/sat_u_add-21.c: Ditto.
* gcc.target/riscv/sat_u_add-22.c: Ditto.
* gcc.target/riscv/sat_u_add-23.c: Ditto.
* gcc.target/riscv/sat_u_add-24.c: Ditto.
* gcc.target/riscv/sat_u_add-3.c: Ditto.
* gcc.target/riscv/sat_u_add-4.c: Ditto.
* gcc.target/riscv/sat_u_add-5.c: Ditto.
* gcc.target/riscv/sat_u_add-6.c: Ditto.
* gcc.target/riscv/sat_u_add-7.c: Ditto.
* gcc.target/riscv/sat_u_add-8.c: Ditto.
* gcc.target/riscv/sat_u_add-9.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-1.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-10.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-11.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-12.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-13.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-14.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-15.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-16.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-2.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-3.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-4.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-5.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-6.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-7.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-8.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-9.c: Ditto.
* gcc.target/riscv/sat_u_sub-1.c: Ditto.
* gcc.target/riscv/sat_u_sub-10.c: Ditto.
* gcc.target/riscv/sat_u_sub-11.c: Ditto.
* gcc.target/riscv/sat_u_sub-12.c: Ditto.
* gcc.target/riscv/sat_u_sub-13.c: Ditto.
* gcc.target/riscv/sat_u_sub-14.c: Ditto.
* gcc.target/riscv/sat_u_sub-15.c: Ditto.
* gcc.target/riscv/sat_u_sub-16.c: Ditto.
* gcc.target/riscv/sat_u_sub-17.c: Ditto.
* gcc.target/riscv/sat_u_sub-18.c: Ditto.
* gcc.target/riscv/sat_u_sub-19.c: Ditto.
* gcc.target/riscv/sat_u_sub-2.c: Ditto.
* gcc.target/riscv/sat_u_sub-20.c: Ditto.
* gcc.target/riscv/sat_u_sub-21.c: Ditto.
* gcc.target/riscv/sat_u_sub-22.c: Ditto.
* gcc.target/riscv/sat_u_sub-23.c: Ditto.
* gcc.target/riscv/sat_u_sub-24.c: Ditto.
* gcc.target/riscv/sat_u_sub-25.c: Ditto.
* gcc.target/riscv/sat_u_sub-26.c: Ditto.
* gcc.target/riscv/sat_u_sub-27.c: Ditto.
* gcc.target/riscv/sat_u_sub-28.c: Ditto.
* gcc.target/riscv/sat_u_sub-29.c: Ditto.
* gcc.target/riscv/sat_u_sub-3.c: Ditto.
* gcc.target/riscv/sat_u_sub-30.c: Ditto.
* gcc.target/riscv/sat_u_sub-31.c: Ditto.
* gcc.target/riscv/sat_u_sub-32.c: Ditto.
* gcc.target/riscv/sat_u_sub-33.c: Ditto.
* gcc.target/riscv/sat_u_sub-34.c: Ditto.
* gcc.target/riscv/sat_u_sub-35.c: Ditto.
* gcc.target/riscv/sat_u_sub-36.c: Ditto.
* gcc.target/riscv/sat_u_sub-37.c: Ditto.
* gcc.target/riscv/sat_u_sub-38.c: Ditto.
* gcc.target/riscv/sat_u_sub-39.c: Ditto.
* gcc.target/riscv/sat_u_sub-4.c: Ditto.
* gcc.target/riscv/sat_u_sub-40.c: Ditto.
* gcc.target/riscv/sat_u_sub-41.c: Ditto.
* gcc.target/riscv/sat_u_sub-42.c: Ditto.
* gcc.target/riscv/sat_u_sub-43.c: Ditto.
* gcc.target/riscv/sat_u_sub-44.c: Ditto.
* gcc.target/riscv/sat_u_sub-45.c: Ditto.
* gcc.target/riscv/sat_u_sub-46.c: Ditto.
* gcc.target/riscv/sat_u_sub-47.c: Ditto.
* gcc.target/riscv/sat_u_sub-48.c: Ditto.
* gcc.target/riscv/sat_u_sub-5.c: Ditto.
* gcc.target/riscv/sat_u_sub-6.c: Ditto.
* gcc.target/riscv/sat_u_sub-7.c: Ditto.
* gcc.target/riscv/sat_u_sub-8.c: Ditto.
* gcc.target/riscv/sat_u_sub-9.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-10.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-10_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-10_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-11.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-11_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-11_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-12.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-13.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-13_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-13_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-14.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-14_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-14_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-15.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-15_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-15_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-16.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-1_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-1_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-2_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-2_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-3.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-3_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-3_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-4.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-5.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-5_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-5_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-6.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-6_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-6_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-7.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-7_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-7_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-8.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-9.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-9_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-9_2.c: Ditto.
* gcc.target/riscv/sat_u_trunc-1.c: Ditto.
* gcc.target/riscv/sat_u_trunc-10.c: Ditto.
* gcc.target/riscv/sat_u_trunc-11.c: Ditto.
* gcc.target/riscv/sat_u_trunc-12.c: Ditto.
* gcc.target/riscv/sat_u_trunc-13.c: Ditto.
* gcc.target/riscv/sat_u_trunc-14.c: Ditto.
* gcc.target/riscv/sat_u_trunc-15.c: Ditto.
* gcc.target/riscv/sat_u_trunc-16.c: Ditto.
* gcc.target/riscv/sat_u_trunc-17.c: Ditto.
* gcc.target/riscv/sat_u_trunc-18.c: Ditto.
* gcc.target/riscv/sat_u_trunc-19.c: Ditto.
* gcc.target/riscv/sat_u_trunc-2.c: Ditto.
* gcc.target/riscv/sat_u_trunc-20.c: Ditto.
* gcc.target/riscv/sat_u_trunc-21.c: Ditto.
* gcc.target/riscv/sat_u_trunc-22.c: Ditto.
* gcc.target/riscv/sat_u_trunc-23.c: Ditto.
* gcc.target/riscv/sat_u_trunc-24.c: Ditto.
* gcc.target/riscv/sat_u_trunc-3.c: Ditto.
* gcc.target/riscv/sat_u_trunc-4.c: Ditto.
* gcc.target/riscv/sat_u_trunc-5.c: Ditto.
* gcc.target/riscv/sat_u_trunc-6.c: Ditto.
* gcc.target/riscv/sat_u_trunc-7.c: Ditto.
* gcc.target/riscv/sat_u_trunc-8.c: Ditto.
* gcc.target/riscv/sat_u_trunc-9.c: Ditto.

9 months ago[testsuite] [arm] add effective target and options for pacbti tests
Alexandre Oliva [Mon, 21 Oct 2024 03:12:10 +0000 (00:12 -0300)] 
[testsuite] [arm] add effective target and options for pacbti tests

Arm PAC and BTI tests that use -march=armv8.1-m.main get an implicit
-mthumb, which is incompatible with VxWorks kernel mode.  Declaring the
requirement for an 8.1-m.main-compatible toolchain is enough to avoid
those failures, because the toolchain feature test fails in kernel
mode.  However, taking the -march options from the standardized arch
tests, after testing for support for the corresponding effective
target, is generally safer, and it enables us to drop skip directives
and extraneous option variants.
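
As a rough sketch of the "require arch, use its opts" shape described
above, a test header might look like the following.  The effective
target name arm_arch_v8_1m_main is an assumption based on the usual
arm_arch_* naming in the testsuite, the -mbranch-protection option is
only a placeholder, and add_one is a hypothetical function; the actual
directives are in the files listed in the ChangeLog.

```c
/* Sketch only: the directive arguments below are assumptions, not
   copied from the patched tests.  */
/* { dg-do compile } */
/* Skip the test unless the toolchain supports the architecture.  */
/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
/* Test-specific options, with no explicit -march or -mthumb.  */
/* { dg-options "-mbranch-protection=bti" } */
/* Let the standardized arch test supply the -march options.  */
/* { dg-add-options arm_arch_v8_1m_main } */

/* Hypothetical function under test.  */
int
add_one (int x)
{
  return x + 1;
}
```

With this shape the explicit dg-skip-if directives and hand-written
-march variants are no longer needed, since the effective-target check
already rejects configurations that cannot build for the architecture.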

for  gcc/testsuite/ChangeLog

* gcc.target/arm/bti-1.c: Require arch, use its opts, drop skip.
* gcc.target/arm/bti-2.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.
* g++.target/arm/pac-1.C: Likewise.  Drop +mve.

9 months agoRefine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"
liuhongt [Wed, 16 Oct 2024 05:43:48 +0000 (13:43 +0800)] 
Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

r12-6103-g1a7ce8570997eb combines vpcmpuw + zero_extend into vpcmpuw
with the pre_reload splitter, but the splitter transforms the
zero_extend into a subreg, which makes reload think the upper part is
garbage.  That is not correct.

The patch changes the zero_extend define_insn_and_split patterns to
define_insn so that the zero_extend is kept.
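
The difference between the two forms can be illustrated with a small
scalar model.  This is only a sketch: cmp_lt_u16x8, mask_zero_extend and
mask_as_subreg are hypothetical helpers, not GCC internals or
intrinsics, and the "stale" argument merely stands in for whatever
happened to be in the upper bits of the wider register.

```c
#include <stdint.h>

/* Scalar model of an unsigned 16-bit compare over eight lanes:
   bit i of the result is set when a[i] < b[i].  */
static uint8_t
cmp_lt_u16x8 (const uint16_t a[8], const uint16_t b[8])
{
  uint8_t mask = 0;
  for (int i = 0; i < 8; i++)
    if (a[i] < b[i])
      mask |= (uint8_t) (1u << i);
  return mask;
}

/* zero_extend semantics: the upper 24 bits are defined to be zero.  */
static uint32_t
mask_zero_extend (uint8_t mask)
{
  return (uint32_t) mask;
}

/* Model of viewing the mask through a wider subreg: only the low
   8 bits are meaningful, the rest is whatever was there before.  */
static uint32_t
mask_as_subreg (uint8_t mask, uint32_t stale)
{
  return (stale & ~UINT32_C (0xff)) | mask;
}
```

In this model a consumer of mask_zero_extend may rely on the upper bits
being zero, while a consumer of mask_as_subreg may not, which is why
keeping the zero_extend in the pattern matters.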

gcc/ChangeLog:

PR target/117159
* config/i386/sse.md
(*<avx512>_cmp<V48H_AVX512VL:mode>3_zero_extend<SWI248x:mode>):
Change from define_insn_and_split to define_insn.
(*<avx512>_cmp<VI12_AVX512VL:mode>3_zero_extend<SWI248x:mode>):
Ditto.
(*<avx512>_ucmp<VI12_AVX512VL:mode>3_zero_extend<SWI248x:mode>):
Ditto.
(*<avx512>_ucmp<VI48_AVX512VL:mode>3_zero_extend<SWI248x:mode>):
Ditto.
(*<avx512>_cmp<V48H_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2):
Split to the zero_extend pattern.
(*<avx512>_cmp<VI12_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2):
Ditto.
(*<avx512>_ucmp<VI12_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2):
Ditto.
(*<avx512>_ucmp<VI48_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2):
Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117159.c: New test.
* gcc.target/i386/avx512bw-pr103750-1.c: Remove xfail.
* gcc.target/i386/avx512bw-pr103750-2.c: Remove xfail.

9 months agoDaily bump.
GCC Administrator [Mon, 21 Oct 2024 00:17:11 +0000 (00:17 +0000)] 
Daily bump.