git.ipfire.org Git - thirdparty/gcc.git/log

Sam James [Thu, 31 Oct 2024 01:37:47 +0000 (01:37 +0000)]

testsuite: fix syntax in Wstringop-overflow-59.c

Fix quoting issues, escaping, and dg directive types.

There were two issues here:
1) The incorrect quoting in an earlier dg-message was covering up that the
syntax in the next part was wrong;
2) Fix dg-warning -> dg-message to correctly pick up the notes. Once 1) was
fixed, this was exposed.

With this, I get:
```
+PASS: gcc.dg/Wstringop-overflow-59.c note (test for warnings, line 192)
+PASS: gcc.dg/Wstringop-overflow-59.c note (test for warnings, line 193)
```

gcc/testsuite/ChangeLog:
PR middle-end/92936

* gcc.dg/Wstringop-overflow-59.c: Fix dg-* syntax.

Andrew Pinski [Tue, 29 Oct 2024 21:43:42 +0000 (14:43 -0700)]

gimple: Remove special handling of COND_EXPR for COMPARISON_CLASS_P [PR116949, PR114785]

After r13-707-g68e0063397ba82, COND_EXPR for gimple assign no longer could contain a comparison.
The vectorizer was building gimple assigns with comparisons until r15-4695-gd17e672ce82e69
(which added an assert to make sure it no longer builds them).

So let's remove the special handling of COND_EXPR in a few places and add an assert to
gimple_build_assign_1 to make sure we don't build a gimple assign with a comparison any more.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

PR middle-end/114785
PR middle-end/116949
* gimple-match-exports.cc (maybe_push_res_to_seq): Remove special
handling of COMPARISON_CLASS_P in COND_EXPR/VEC_COND_EXPR.
(gimple_extract): Likewise.
* gimple-walk.cc (walk_stmt_load_store_addr_ops): Likewise.
* gimple.cc (gimple_build_assign_1): Add assert for COND_EXPR
so its 1st operand is not a comparison.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

GCC Administrator [Thu, 31 Oct 2024 00:18:53 +0000 (00:18 +0000)]

Daily bump.

Jonathan Wakely [Wed, 30 Oct 2024 19:27:54 +0000 (19:27 +0000)]

libstdc++: Fix copy&paste comments in vector range tests

These comments were copied from the std::vector<bool> tests, but the
value_type is not bool in these ones.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/vector/cons/from_range.cc: Fix copy &
paste error in comment.
* testsuite/23_containers/vector/modifiers/append_range.cc:
Likewise.
* testsuite/23_containers/vector/modifiers/assign/assign_range.cc:
Likewise.
* testsuite/23_containers/vector/modifiers/insert/insert_range.cc:
Likewise.

Jonathan Wakely [Wed, 30 Oct 2024 21:10:58 +0000 (21:10 +0000)]

libstdc++: Fix some typos and grammatical errors in docs

Also remove some redundant 'void' parameters from code examples.

libstdc++-v3/ChangeLog:

* doc/xml/manual/using_exceptions.xml: Fix typos and grammatical
errors.
* doc/html/manual/using_exceptions.html: Regenerate.

Kugan Vivekanandarajah [Wed, 30 Oct 2024 20:23:10 +0000 (07:23 +1100)]

[PATCH] Fix SLP when ifcvt versioned loop is not vectorized

When ifcvt versions a loop, it sets dont_vectorize on the scalar loop. If the
vector loop is not vectorized and is removed, the scalar loop is still left with
dont_vectorize set. As a result, BB vectorization will not happen.

This patch resets dont_vectorize on the scalar loop when IFN_LOOP_VECTORIZED
is set to false.

gcc/ChangeLog:

* tree-vectorizer.cc (pass_vectorize::execute): Reset dont_vectorize
to scalar loop when setting IFN_LOOP_VECTORIZED to false.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-77.c: New test.

Kugan Vivekanandarajah [Wed, 30 Oct 2024 20:20:49 +0000 (07:20 +1100)]

[PATCH] Adjust param_vect_max_version_for_alias_checks

This patch sets param_vect_max_version_for_alias_checks to 15.
The previous value was causing GCC to miss vectorization opportunities for an
application, making it slower than LLVM by about 14%.

The original default of 10 is itself arbitrary. Given that GCC's vectoriser does
consider the cost of alias checks, increasing this param is reasonable.

In this case we need a value of at least 11, whereas the current
default is 10.

gcc/ChangeLog:

* params.opt: Adjust param_vect_max_version_for_alias_checks

gcc/testsuite/ChangeLog:

* g++.dg/alias-checks.C: New test.

Signed-off-by: Kugan Vivekanandarajah <kvivekananda@nvidia.com>

Joseph Myers [Wed, 30 Oct 2024 18:50:11 +0000 (18:50 +0000)]

c: Do not document C23 support as experimental and incomplete

Since C23 support is substantially feature-complete, update
documentation to no longer refer to it as experimental and incomplete.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/
* doc/cpp.texi (__STDC_VERSION__): Do not refer to C23 support as
experimental.
* doc/invoke.texi (std=c23, std=gnu23): Do not document as
experimental and incomplete.
* doc/standards.texi: Do not refer to C23 support as experimental
and incomplete.

gcc/c-family/
* c.opt (std=c23, std=gnu23, std=iso9899:2024): Do not mark as
experimental and incomplete.

Ian Lance Taylor [Tue, 29 Oct 2024 22:39:02 +0000 (15:39 -0700)]

syscall: don't define syscall stub on Hurd

Patch from Samuel Thibault.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/623415

Andi Kleen [Wed, 2 Oct 2024 20:13:21 +0000 (13:13 -0700)]

Remove sys/user time in -ftime-report

Retrieving sys/user time in timevars is quite expensive because it
always needs a system call. Only getting the wall time is much
cheaper because operating systems have optimized paths for this.

The sys time isn't that interesting for a compiler and wall time
is usually close to user time except when the system is overloaded.
On the other hand, when the system is not overloaded, wall time is more
accurate because it has less overhead.

For building tramp3d with -O0 the -ftime-report overhead drops from
18% to 3%. For -O2 it drops from 8% to not measurable.

I changed the code to use gettimeofday as a fallback for clock_gettime
CLOCK_MONOTONIC. If a host has neither of those the time will not
be measured. Previously clock was the fallback.

This removes a lot of code in timevar.cc:

gcc/timevar.cc | 167 ++++++---------------------------------------------------
gcc/timevar.h | 10 +---

2 files changed, 17 insertions(+), 160 deletions(-)

gcc/ChangeLog:

* timevar.cc (struct tms): Remove.
(RUSAGE_SELF): Remove.
(TICKS_PER_SECOND): Remove.
(USE_TIMES): Remove.
(HAVE_USER_TIME): Remove.
(HAVE_SYS_TIME): Remove.
(HAVE_WALL_TIME): Remove.
(USE_GETRUSAGE): Remove.
(USE_CLOCK): Remove.
(NANOSEC_PER_SEC): Remove.
(TICKS_TO_NANOSEC): Remove.
(CLOCKS_TO_NANOSEC): Remove.
(timer::named_items::push): Remove sys/user.
(get_time): Remove clock and times and getrusage code.
(timevar_accumulate): Remove sys/user.
(timevar_diff): Ditto.
(timer::validate_phases): Ditto.
(timer::print_row): Ditto.
(timer::all_zero): Ditto.
(timer::print): Ditto.
(make_json_for_timevar_time_def): Ditto.
* timevar.h (struct timevar_time_def): Ditto.

Richard Biener [Wed, 30 Oct 2024 12:06:08 +0000 (13:06 +0100)]

Remove vectorizer finish_cost wrapper

The inline function wraps the vector_cost class API but is no longer a
good representation of the query style of that class, which also makes
it difficult to extend.

* tree-vectorizer.h (finish_cost): Inline everywhere and remove.
* tree-vect-loop.cc (vect_estimate_min_profitable_iters):
Inline finish_cost.
* tree-vect-slp.cc (vect_bb_vectorization_profitable_p): Likewise.

Yangyu Chen [Wed, 30 Oct 2024 14:33:57 +0000 (14:33 +0000)]

Fix function multiversioning dispatcher link error with LTO

We forgot to apply DECL_EXTERNAL to the __init_cpu_features_resolver decl. When
building with LTO, the linker cannot find the
__init_cpu_features_resolver.lto_priv* symbol, causing the link error.

This patch fixes that by adding DECL_EXTERNAL to the decl. To avoid a "used
but never defined" warning for this symbol, we also set TREE_PUBLIC on the decl
and mark the decl as having hidden visibility. The attributes of the
__aarch64_cpu_features identifier are fixed in the same way.

Minimal steps to reproduce the bug:

echo '__attribute__((target_clones("default", "aes"))) void func1() { }' > 1.c
echo '__attribute__((target_clones("default", "aes"))) void func2() { }' > 2.c
echo 'void func1();void func2();int main(){func1();func2();return 0;}' > main.c
gcc -flto -c 1.c 2.c
gcc -flto main.c 1.o 2.o

Fixes: 0cfde688e213 ("[aarch64] Add function multiversioning support")
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:

* config/aarch64/aarch64.cc (dispatch_function_versions): Adding
DECL_EXTERNAL, TREE_PUBLIC and hidden DECL_VISIBILITY to
__init_cpu_features_resolver and __aarch64_cpu_features.

Jakub Jelinek [Wed, 30 Oct 2024 13:51:02 +0000 (14:51 +0100)]

c: Diagnose char argument to __builtin_stdc_*

When working on __builtin_stdc_rotate_*, I've noticed that while the
second argument to those is explicitly allowed to have char type,
the first argument to all the stdc_* type-generic functions is
- standard unsigned integer type, excluding bool;
- extended unsigned integer type;
- or, bit-precise unsigned integer type whose width matches a standard
or extended integer type, excluding bool.
but the __builtin_stdc_* lowering code was diagnosing just
!INTEGRAL_TYPE_P
ENUMERAL_TYPE
BOOLEAN_TYPE
!TYPE_UNSIGNED
Now, with -funsigned-char plain char type is TYPE_UNSIGNED, yet it isn't
allowed because it isn't standard unsigned integer type, nor
extended unsigned integer type, nor bit-precise unsigned integer type.

The following patch diagnoses char arguments and adds testsuite coverage
for that.

Or should I make it a pedwarn instead?

2024-10-30 Jakub Jelinek <jakub@redhat.com>

gcc/c/
* c-parser.cc (c_parser_postfix_expression): Diagnose if
first __builtin_stdc_* argument has char type even when
-funsigned-char.
gcc/testsuite/
* gcc.dg/builtin-stdc-bit-3.c: New test.
* gcc.dg/builtin-stdc-rotate-3.c: New test.

Jeff Law [Wed, 30 Oct 2024 13:43:22 +0000 (07:43 -0600)]

[RISC-V] Aggressively hoist VXRM assignments

So a while back I was looking at pixel_avg for RISC-V where we try to
use vaaddu for the halfword-ceiling-average step.  The problem with
vaaddu is that you must set VXRM to a suitable rounding mode as it has
an undetermined state at function entry or after a function call.

It turns out some designs will fully flush their pipelines on a write to
VXRM which you can imagine is incredibly expensive.

VXRM assignments are handled by an LCM based algorithm to find "optimal"
placement points based on what insns in the stream need VXRM assignments
and the particular mode they need.

Unfortunately in pixel_avg an LCM algorithm only allows hoisting out of
the innermost loop, but not the outer loop.  The core issue is that LCM
does not allow any speculation and there are paths which would bypass
the inner loop (which don't actually trigger at runtime IIRC).

The expectation is that VXRM assignments should be exceedingly rare and
needing more than one mode even rarer.  So hoisting more aggressively
seems like a reasonable thing to do, but we don't want to burn too much
time trying to do something fancy.

So what this patch does is scan the IL once collecting any VXRM needs.
If the current function has precisely one VXRM mode needed, then we
pretend (for the sake of LCM) that the first instruction in the function
also has that need.

By doing so the VXRM assignment is essentially anticipated everywhere in
the function.  The standard LCM algorithm is run and has enough
information to hoist the VXRM assignment more aggressively, most often
to the prologue.
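
The singleton idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code in riscv.cc: the per-insn needs are modelled as a plain array of ints, and the names VXRM_NONE, singleton_need and mode_needed are hypothetical.

```c
#include <stddef.h>

/* Hypothetical marker: this insn has no VXRM need.  */
#define VXRM_NONE (-1)

/* Scan the per-insn needs once.  Return the mode if exactly one distinct
   mode is needed anywhere in the function, otherwise VXRM_NONE.  */
int singleton_need(const int *needs, size_t n)
{
  int found = VXRM_NONE;
  for (size_t i = 0; i < n; i++)
    {
      if (needs[i] == VXRM_NONE)
        continue;
      if (found == VXRM_NONE)
        found = needs[i];          /* first mode seen */
      else if (found != needs[i])
        return VXRM_NONE;          /* a second, different mode: no singleton */
    }
  return found;
}

/* Need reported to the placement algorithm for insn INDEX: if there is a
   singleton need, pretend the first insn also has it.  */
int mode_needed(const int *needs, size_t n, size_t index)
{
  int single = singleton_need(needs, n);
  if (index == 0 && single != VXRM_NONE)
    return single;
  return needs[index];
}
```

With needs of {none, 2, none, 2}, the first insn reports mode 2, so the need is anticipated from function entry; with two different modes in the array nothing changes.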

This helps the BPI in a measurable way (IIRC it was 2-3%).  It probably
helps some of the SiFive designs, but I've been told they still benefit
from the longer sequence of shifts & adds, hoisting just isn't enough
for those designs.  The Ventana design basically doesn't care where the
VXRM assignment is.  Point is we may want to have a tuning knob for the
patterns which need VXRM (vaadd[u], vasub[u]) at some point in the near
future.

Bootstrapped and regression tested on riscv64 and regression tested on
riscv32-elf and riscv64-elf.  We've been using this internally for a
while on spec as well.  Obviously I'll wait for the pre-commit
tester to do its thing.

gcc/
* config/riscv/riscv.cc (singleton_vxrm_need): New function.
(riscv_mode_needed): See if there is a singleton need and if so,
claim it happens on the first insn in the chain.

Iain Sandoe [Wed, 30 Oct 2024 10:29:49 +0000 (10:29 +0000)]

c++, contracts: Only check contracts attributes [PR116607].

The ICE described in the PR is caused by not filtering out non-
contract attributes before making the has_active_contract_condition
test. Fixed, as suggested by Andrew Pinski, by just using the
existing CONTRACT_CHAIN () macro to advance through the list.

PR c++/116607

gcc/cp/ChangeLog:

* contracts.cc (has_active_contract_condition): Use the
CONTRACT_CHAIN macro to advance through the attribute list.

gcc/testsuite/ChangeLog:

* g++.dg/contracts/pr116607.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

Jonathan Wakely [Thu, 24 Oct 2024 10:40:42 +0000 (11:40 +0100)]

libstdc++: Define config macros for additional IEEE formats

Some targets use IEEE binary64 for both double and long double, which
means we could use memmove to optimize a std::copy from a range of
double to a range of long double. We currently have no config macro to
detect when long double is binary64, so add that to <bits/c++config.h>.
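
The optimization being enabled can be sketched in plain C. This is not the libstdc++ implementation (which uses config macros and __memcpyable specializations in C++); the function names below are hypothetical, and the format check via <float.h> limits is an assumption made for illustration.

```c
#include <float.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 when long double appears to share double's representation:
   same size, same mantissa width and same exponent range.  */
int ldouble_matches_double(void)
{
  return sizeof(long double) == sizeof(double)
         && LDBL_MANT_DIG == DBL_MANT_DIG
         && LDBL_MAX_EXP == DBL_MAX_EXP
         && LDBL_MIN_EXP == DBL_MIN_EXP;
}

/* Copy N doubles into an array of long double.  When both types share a
   representation a single memmove preserves every value; otherwise fall
   back to element-wise conversion.  */
void copy_double_to_ldouble(long double *dst, const double *src, size_t n)
{
  if (ldouble_matches_double())
    memmove(dst, src, n * sizeof(double));
  else
    for (size_t i = 0; i < n; i++)
      dst[i] = src[i];
}
```

Either path yields the same values in dst; only the cost differs.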

This also adds config macros for the case where double and long double
both use the same binary32 format as float, which is true for the avr
target. No specializations of __memcpyable for that case are added by
this patch, but they could be added later.

libstdc++-v3/ChangeLog:

* include/bits/c++config (_GLIBCXX_DOUBLE_IS_IEEE_BINARY32):
Define.
(_GLIBCXX_LDOUBLE_IS_IEEE_BINARY64): Define.
(_GLIBCXX_LDOUBLE_IS_IEEE_BINARY32): Define.
* include/bits/cpp_type_traits.h (__memcpyable): Define
specializations when double and long double are compatible.

Reviewed-by: Patrick Palka <ppalka@redhat.com>

Jonathan Wakely [Thu, 24 Oct 2024 10:06:42 +0000 (11:06 +0100)]

libstdc++: Define __memcpyable<float*, _Float32*> as true

This allows optimizing copying ranges of floating-point types when they
have the same size and representation, e.g. between _Float32 and float
when we know that float uses the same IEEE binary32 format as _Float32.

On some targets double and long double both use IEEE binary64 format so
we could enable memcpy between those types, but we don't have existing
macros to check for that case.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__memcpyable): Add
specializations for compatible floating-point types.

Reviewed-by: Patrick Palka <ppalka@redhat.com>

liuhongt [Tue, 29 Oct 2024 09:09:39 +0000 (02:09 -0700)]

Fix ICE due to subreg:us_truncate.

force_operand issues an ICE when its input
is (subreg:DI (us_truncate:V8QI)), probably because that is an
invalid rtx, so refine the backend patterns to avoid it.

gcc/ChangeLog:

PR target/117318
* config/i386/sse.md (*avx512vl_<code>v2div2qi2_mask_store_1):
Rename to ..
(avx512vl_<code>v2div2qi2_mask_store_1): .. this.
(avx512vl_<code>v2div2qi2_mask_store_2): Change to
define_expand.
(*avx512vl_<code><mode>v4qi2_mask_store_1): Rename to ..
(avx512vl_<code><mode>v4qi2_mask_store_1): .. this.
(avx512vl_<code><mode>v4qi2_mask_store_2): Change to
define_expand.
(*avx512vl_<code><mode>v8qi2_mask_store_1): Rename to ..
(avx512vl_<code><mode>v8qi2_mask_store_1): .. this.
(avx512vl_<code><mode>v8qi2_mask_store_2): Change to
define_expand.
(*avx512vl_<code><mode>v4hi2_mask_store_1): Rename to ..
(avx512vl_<code><mode>v4hi2_mask_store_1): .. this.
(avx512vl_<code><mode>v4hi2_mask_store_2): Change to
define_expand.
(*avx512vl_<code>v2div2hi2_mask_store_1): Rename to ..
(avx512vl_<code>v2div2hi2_mask_store_1): .. this.
(avx512vl_<code>v2div2hi2_mask_store_2): Change to
define_expand.
(*avx512vl_<code>v2div2si2_mask_store_1): Rename to ..
(avx512vl_<code>v2div2si2_mask_store_1): .. this.
(avx512vl_<code>v2div2si2_mask_store_2): Change to
define_expand.
(*avx512f_<code>v8div16qi2_mask_store_1): Rename to ..
(avx512f_<code>v8div16qi2_mask_store_1): .. this.
(avx512f_<code>v8div16qi2_mask_store_2): Change to
define_expand.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117318.c: New test.

Harald Anlauf [Tue, 29 Oct 2024 20:52:27 +0000 (21:52 +0100)]

Fortran: fix several front-end memleaks

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_trans_class_init_assign): Free intermediate
gfc_expr's.
* trans.cc (get_final_proc_ref): Likewise.
(get_elem_size): Likewise.
(gfc_add_finalizer_call): Likewise.

Christophe Lyon [Wed, 30 Oct 2024 09:50:16 +0000 (09:50 +0000)]

arm: [MVE intrinsics] Remove unused builtins qualifiers

After the re-implementation of MVE vld/vst intrinsics, a few builtins
qualifiers became useless.

This patch removes them to restore bootstrap (otherwise the build
fails because of 'defined but not used' errors).

gcc/ChangeLog:

* config/arm/arm-builtins.cc (STRS_QUALIFIERS): Delete.
(STRU_QUALIFIERS): Delete.
(STRS_P_QUALIFIERS): Delete.
(STRU_P_QUALIFIERS): Delete.
(LDRS_QUALIFIERS): Delete.
(LDRU_QUALIFIERS): Delete.
(LDRS_Z_QUALIFIERS): Delete.
(LDRU_Z_QUALIFIERS): Delete.

Richard Biener [Sat, 26 Oct 2024 12:29:17 +0000 (14:29 +0200)]

Remove dead part of bool pattern recognition

Given we no longer want vcond[u]{,_eq} and VEC_COND_EXPR or COND_EXPR
with embedded GENERIC comparisons the whole check_bool_pattern
and adjust_bool_stmts machinery is dead. It is effectively dead
after r15-4713-g0942bb85fc5573 and the following patch removes it.

* tree-vect-patterns.cc (check_bool_pattern): Remove.
(adjust_bool_pattern_cast): Likewise.
(adjust_bool_pattern): Likewise.
(sort_after_uid): Likewise.
(adjust_bool_stmts): Likewise.
(vect_recog_bool_pattern): Remove calls to check_bool_pattern
and fold as if it returns false.

Soumya AR [Wed, 30 Oct 2024 08:57:46 +0000 (14:27 +0530)]

[MAINTAINERS] Add myself to write after approval and DCO.

ChangeLog:

* MAINTAINERS: Add myself to write after approval and DCO.

Jakub Jelinek [Wed, 30 Oct 2024 08:59:22 +0000 (09:59 +0100)]

function: Call do_pending_stack_adjust in assign_parms [PR117296]

Functions called by assign_parms call emit_block_move in two places,
which on some targets can be expanded as calls and so can result in
pending stack adjustments.

Now, during expansion we normally call do_pending_stack_adjust at the end
of expansion of each basic block or before emitting code that will branch
and/or has labels, and when emitting labels we assert that there are no
pending stack adjustments.

assign_parms is expanded before the first basic block and if the first
basic block starts with a label and at least one of those emit_block_move
calls resulted in the need of pending stack adjustments, we ICE when
emitting that label.

The following patch fixes that by calling do_pending_stack_adjust after
the assign_parms potential emit_block_move calls.

2024-10-30 Jakub Jelinek <jakub@redhat.com>

PR target/117296
* function.cc (assign_parms): Call do_pending_stack_adjust.

* gcc.target/i386/pr117296.c: New test.

Jakub Jelinek [Wed, 30 Oct 2024 08:58:26 +0000 (09:58 +0100)]

genmatch: Fix build on hppa64-hpux [PR117348]

Apparently autoconf defines the HAVE_DECL_* macros to 0
rather than not defining them at all, so defined(HAVE_DECL_FMEMOPEN)
test doesn't do much.

The following patch fixes it by testing HAVE_DECL_FMEMOPEN
for being non-zero instead.
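
The difference between the two preprocessor tests can be shown with a small sketch. HAVE_DECL_EXAMPLE_FN is a hypothetical stand-in for an autoconf-generated HAVE_DECL_* macro whose declaration was not found; the two function names are likewise made up for illustration.

```c
/* Hypothetical stand-in for an autoconf-generated macro: the declaration
   was NOT found, so the macro is defined, but to 0.  */
#define HAVE_DECL_EXAMPLE_FN 0

/* Returns 1 if a defined() test takes the "declaration available" branch.  */
int defined_test_passes(void)
{
#if defined(HAVE_DECL_EXAMPLE_FN)
  return 1;   /* taken: the macro exists, even though its value is 0 */
#else
  return 0;
#endif
}

/* Returns 1 if a value test takes the "declaration available" branch.  */
int value_test_passes(void)
{
#if HAVE_DECL_EXAMPLE_FN
  return 1;
#else
  return 0;   /* taken: the macro's value is 0 */
#endif
}
```

Here the defined() test wrongly reports the declaration as available, while the value test does not.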

2024-10-30 Jakub Jelinek <jakub@redhat.com>

PR middle-end/117348
* genmatch.cc: Replace defined(HAVE_DECL_FMEMOPEN)
test with HAVE_DECL_FMEMOPEN.

Paul Thomas [Wed, 30 Oct 2024 07:49:52 +0000 (07:49 +0000)]

Fortran: Move pr115070.f90 to ieee directory [PR117335].

2024-10-30 Paul Thomas <pault@gcc.gnu.org>

gcc/testsuite/
PR fortran/117335
* gfortran.dg/pr115070.f90: Delete.
* gfortran.dg/ieee/pr115070.f90: Moved to ieee directory to
prevent failures on incompatible architectures.

Uros Bizjak [Wed, 30 Oct 2024 07:17:15 +0000 (08:17 +0100)]

i386: Use assign_stack_temp instead of assign_386_stack_local with SLOT_TEMP

It is better to use assign_stack_temp instead of assign_386_stack_local
with SLOT_TEMP because assign_stack_temp also shares sub-space of stack
slots (e.g. HImode temp shares stack slot with SImode stack slot).

Use assign_386_stack_local only for special stack slots (SLOT_STV_TEMP that
can be nested inside other stack temp access, SLOT_FLOATxFDI_387 that has
relaxed alignment constraint) or slots that can't be shared (SLOT_CW_*).

The patch removes SLOT_TEMP. assign_stack_temp should be used instead.

gcc/ChangeLog:

* config/i386/i386.h (enum ix86_stack_slot): Remove SLOT_TEMP.
* config/i386/i386-expand.cc (ix86_expand_builtin)
<case IX86_BUILTIN_LDMXCSR>: Use assign_stack_temp instead of
assign_386_stack_local with SLOT_TEMP.
<case IX86_BUILTIN_LDMXCSR>: Ditto.
(ix86_expand_divmod_libfunc): Ditto.
* config/i386/i386.md (floatunssi<mode>2): Ditto.
* config/i386/sync.md (atomic_load<mode>): Ditto.
(atomic_store<mode>): Ditto.

Jakub Jelinek [Wed, 30 Oct 2024 06:59:52 +0000 (07:59 +0100)]

c: Add C2Y N3370 - Case range expressions support [PR117021]

The following patch adds the C2Y N3370 paper support.
We had the case ranges as a GNU extension for decades, so this patch
simply:
1) adds different diagnostics when it is used in C (depending on flag_isoc2y
   and pedantic and warn_c23_c2y_compat)
2) emits a pedwarn in C if in a range conversion changes the value of
   the low or high bounds and in that case doesn't emit -Woverflow and
   similar warnings anymore if the pedwarn has been diagnosed
3) changes the handling of empty ranges both in C and C++; previously
   we just warned but let the values be still looked up in the splay
   tree/entered into it (and let only gimplification throw away those
   empty cases), so e.g. case -6 ... -8: break; case -6: break;
   complained about duplicate case label.  But that actually isn't
   duplicate case label, case -6 ... -8: stands for nothing at all
   and that is how it is treated later on (thrown away)
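
For reference, a non-empty case range looks like the sketch below; the function name classify is hypothetical. It relies on the case range syntax that GCC (and Clang) already accept as an extension. Writing the bounds the other way round, as in case -6 ... -8:, gives the empty range discussed in point 3, which matches no value.

```c
/* Classify a value using a case range.  The low bound must not exceed
   the high bound for the range to match anything.  */
int classify(int v)
{
  switch (v)
    {
    case -8 ... -6:   /* matches -8, -7 and -6 */
      return 1;
    case 0:
      return 2;
    default:
      return 0;
    }
}
```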

2024-10-30  Jakub Jelinek  <jakub@redhat.com>

PR c/117021
gcc/c-family/
* c-common.cc (c_add_case_label): Emit different diagnostics for C
on case ranges.  Diagnose for C using pedwarn conversions of range
expressions changing value and don't emit further conversion
diagnostics if the pedwarn has been diagnosed.  For empty ranges
bail out after emitting warning, don't add anything into splay
trees nor add a CASE_LABEL_EXPR.
gcc/testsuite/
* gcc.dg/switch-6.c: Expect different diagnostics.  Add -std=gnu23
to dg-options.
* gcc.dg/switch-7.c: Expect different diagnostics.  Add -std=c23
to dg-options.
* gcc.dg/gnu23-switch-1.c: New test.
* gcc.dg/gnu23-switch-2.c: New test.
* gcc.dg/c23-switch-1.c: New test.
* gcc.dg/c2y-switch-1.c: New test.
* gcc.dg/c2y-switch-2.c: New test.
* gcc.dg/c2y-switch-3.c: New test.

Haochen Jiang [Tue, 29 Oct 2024 07:51:14 +0000 (15:51 +0800)]

testsuite: Adjust AVX10.2 check_effective_target

Since Binutils hasn't fully merged all AVX10.2 insts, testing only
one inst/intrin in AVX10.2 is not sufficient for check_effective_target.
Like APX_F, use inline asm to do the target check.

gcc/testsuite/ChangeLog:

PR target/117301
* lib/target-supports.exp (check_effective_target_avx10_2):
Use inline asm instead of intrin for check_effective_target.
(check_effective_target_avx10_2_512): Ditto.

xuli [Wed, 23 Oct 2024 01:57:51 +0000 (01:57 +0000)]

RISC-V: Add testcases for unsigned .SAT_SUB form 2 with IMM = 1.

form2:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x)  \
{                                       \
  return x >= (T)IMM ? x - (T)IMM : 0;  \
}

Passed the rv64gcv regression test.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_u_sub_imm-run-5.c: add run case for imm=1.
* gcc.target/riscv/sat_u_sub_imm-run-6.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-run-7.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-run-8.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-5_3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-6_3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-7_3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-8_1.c: New test.

xuli [Tue, 22 Oct 2024 09:48:03 +0000 (09:48 +0000)]

Match: Simplify (x != 0 ? x + ~0 : 0) to (x - x != 0).

When the imm operand op1=1 in the unsigned scalar sat_sub form2 below,
we can simplify (x != 0 ? x + ~0 : 0) to (x - x != 0), i.e. x - (x != 0),
thereby eliminating a branch instruction. This simplification also applies
to signed integers.

Form2:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x)  \
{                                       \
  return x >= (T)IMM ? x - (T)IMM : 0;  \
}
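
The equivalence for IMM = 1 can be checked exhaustively for uint8_t with a short sketch. The function names below are hypothetical and are not part of the patch or its tests.

```c
#include <stdint.h>

/* Branching form: saturating subtraction of 1 from an unsigned value.  */
uint8_t sat_sub1_branch(uint8_t x)
{
  return x >= (uint8_t)1 ? x - (uint8_t)1 : 0;
}

/* Branch-free form: x - (x != 0).  The comparison yields 0 or 1.  */
uint8_t sat_sub1_branchless(uint8_t x)
{
  return (uint8_t)(x - (x != 0));
}

/* Compare both forms over every uint8_t value; returns 1 if they agree
   everywhere, 0 otherwise.  */
int sat_sub1_forms_agree(void)
{
  for (unsigned v = 0; v <= UINT8_MAX; v++)
    if (sat_sub1_branch((uint8_t)v) != sat_sub1_branchless((uint8_t)v))
      return 0;
  return 1;
}
```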

Take below form 2 as example:
DEF_SAT_U_SUB_IMM_FMT_2(uint8_t, 1)

Before this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
{
  uint8_t _1;
  uint8_t _3;

  <bb 2> [local count: 1073741824]:
  if (x_2(D) != 0)
    goto <bb 3>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 3> [local count: 536870912]:
  _3 = x_2(D) + 255;

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <x_2(D)(2), _3(3)>
  return _1;

}

Assembly code:
sat_u_sub_imm1_uint8_t_fmt_2:
beq a0,zero,.L2
addiw a0,a0,-1
andi a0,a0,0xff
.L2:
ret

After this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
{
  _Bool _1;
  unsigned char _2;
  uint8_t _4;

  <bb 2> [local count: 1073741824]:
  _1 = x_3(D) != 0;
  _2 = (unsigned char) _1;
  _4 = x_3(D) - _2;
  return _4;

}

Assembly code:
sat_u_sub_imm1_uint8_t_fmt_2:
snez a5,a0
subw a0,a0,a5
andi a0,a0,0xff
ret

The test suites below pass with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/ChangeLog:

* match.pd: Simplify (x != 0 ? x + ~0 : 0) to (x - x != 0).

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/phi-opt-44.c: New test.
* gcc.dg/tree-ssa/phi-opt-45.c: New test.

GCC Administrator [Wed, 30 Oct 2024 00:19:35 +0000 (00:19 +0000)]

Daily bump.

Andi Kleen [Tue, 29 Oct 2024 23:41:57 +0000 (16:41 -0700)]

Revert "Simplify switch bit test clustering algorithm"

This reverts commit 3d06e9c3e07e13eab715e19dafbcfc1a0b7e43d6.

David Malcolm [Tue, 29 Oct 2024 23:12:02 +0000 (19:12 -0400)]

diagnostics: support multiple output formats simultaneously [PR116613]

This patch generalizes diagnostic_context so that rather than having
a single output format, it has a vector of zero or more.

It adds two new options:
-fdiagnostics-add-output=DIAGNOSTICS-OUTPUT-SPEC
-fdiagnostics-set-output=DIAGNOSTICS-OUTPUT-SPEC
which both take a new configuration syntax of the form SCHEME ("text" or
"sarif"), optionally followed by ":" and one or more KEY=VALUE pairs,
in this form:

  <SCHEME>
  <SCHEME>:<KEY>=<VALUE>
  <SCHEME>:<KEY>=<VALUE>,<KEY2>=<VALUE2>
  ...etc

where each SCHEME supports some set of keys.  For example, it's now
possible to use:

  -fdiagnostics-add-output=sarif:version=2.1,file=foo.2.1.sarif \
  -fdiagnostics-add-output=sarif:version=2.2-prerelease,file=foo.2.2.sarif

to add a pair of outputs, each writing to a different file, using
versions 2.1 and 2.2 of the SARIF standard respectively, whilst also
emitting the classic text form of the diagnostics to stderr.
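
To make the shape of a DIAGNOSTICS-OUTPUT-SPEC concrete, here is a minimal parsing sketch. It is not GCC's parser: struct output_spec, parse_output_spec, the size limits, and the decision to reject an empty key or value are all assumptions made for illustration, and values containing commas are not handled.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAIRS 8    /* hypothetical limit on KEY=VALUE pairs */
#define MAX_LEN   64   /* hypothetical limit on each token, incl. NUL */

struct output_spec
{
  char scheme[MAX_LEN];
  char keys[MAX_PAIRS][MAX_LEN];
  char values[MAX_PAIRS][MAX_LEN];
  int num_pairs;
};

/* Copy LEN characters starting at START into DST and NUL-terminate.
   Returns 0 if the token is empty or too long.  */
static int copy_token(char *dst, const char *start, size_t len)
{
  if (len == 0 || len >= MAX_LEN)
    return 0;
  memcpy(dst, start, len);
  dst[len] = '\0';
  return 1;
}

/* Parse "<SCHEME>" or "<SCHEME>:<KEY>=<VALUE>[,<KEY>=<VALUE>...]".
   Returns 1 on success and 0 on malformed input.  */
int parse_output_spec(const char *spec, struct output_spec *out)
{
  out->num_pairs = 0;

  const char *colon = strchr(spec, ':');
  size_t scheme_len = colon ? (size_t)(colon - spec) : strlen(spec);
  if (!copy_token(out->scheme, spec, scheme_len))
    return 0;
  if (!colon)
    return 1;                      /* scheme only, no key/value pairs */

  const char *p = colon + 1;
  for (;;)
    {
      const char *eq = strchr(p, '=');
      if (!eq || out->num_pairs == MAX_PAIRS)
        return 0;
      const char *val = eq + 1;
      const char *comma = strchr(val, ',');
      const char *end = comma ? comma : val + strlen(val);
      if (!copy_token(out->keys[out->num_pairs], p, (size_t)(eq - p))
          || !copy_token(out->values[out->num_pairs], val,
                         (size_t)(end - val)))
        return 0;
      out->num_pairs++;
      if (!comma)
        return 1;                  /* last pair consumed */
      p = comma + 1;
    }
}
```

Parsing "sarif:version=2.1,file=foo.2.1.sarif" with this sketch gives the scheme "sarif" and two pairs, version=2.1 and file=foo.2.1.sarif.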

I hope the new syntax gives us room to potentially add new kinds of
output sink in the future (e.g. RPC notifications), and to add new
key/value pairs as needed by the different sinks.

Implementation-wise, the diagnostic_context's m_printer which previously
was used directly by the single output format now becomes a "reference
printer", created by the client (such as the frontend), with defaults
modified by command-line options.  Each of the multiple output sinks has
its own pretty_printer instance, created by cloning the context's
reference printer.

gcc/ChangeLog:
PR other/116613
* Makefile.in (OBJS-libcommon-target): Add opts-diagnostic.o.
* common.opt (fdiagnostics-add-output=): New.
(fdiagnostics-set-output=): New.
(diagnostics_output_format): Drop sarif-file-2.2-prerelease from
enum.
* common.opt.urls: Regenerate.
* diagnostic-buffer.h (diagnostic_buffer::~diagnostic_buffer): New.
(diagnostic_buffer::ensure_per_format_buffer): Rename to...
(diagnostic_buffer::ensure_per_format_buffers): ...this.
(diagnostic_buffer::m_per_format_buffer): Replace with...
(diagnostic_buffer::m_per_format_buffers): ...this, updating type.
* diagnostic-format-json.cc (json_output_format::update_printer):
New.
(json_output_format::follows_reference_printer_p): New.
(diagnostic_output_format_init_json): Drop redundant call to
set_path_format, as this is not a text output format.
* diagnostic-format-sarif.cc: Include "diagnostic-format-text.h".
(sarif_builder::set_printer): New.
(sarif_builder::sarif_builder): Add "printer" param and use it for
m_printer.
(sarif_builder::make_location_object::escape_nonascii_renderer::render):
Rather than using dc.m_printer, create a
diagnostic_text_output_format instance and use its printer.
(sarif_output_format::follows_reference_printer_p): New.
(sarif_output_format::update_printer): New.
(sarif_output_format::sarif_output_format): Pass in correct
printer to m_builder's ctor.
(diagnostic_output_format_init_sarif): Drop redundant call to
set_path_format, as this is not a text output format.  Replace
calls to pp_show_color and set_token_printer with call to
update_printer.  Drop redundant call to set_show_highlight_colors,
as this printer does not show colors.
(diagnostic_output_format_init_sarif_file): Split out file opening
into...
(diagnostic_output_format_open_sarif_file): ...this new function.
(make_sarif_sink): New.
(selftest::test_make_location_object): Provide a pp for the
builder.
* diagnostic-format-sarif.h
(diagnostic_output_format_open_sarif_file): New decl.
(make_sarif_sink): New decl.
* diagnostic-format-text.cc (diagnostic_text_output_format::dump):
Dump sm_follows_reference_printer.
(diagnostic_text_output_format::on_report_verbatim): New.
(diagnostic_text_output_format::follows_reference_printer_p): New.
(diagnostic_text_output_format::update_printer): New.
* diagnostic-format-text.h
(diagnostic_text_output_format::diagnostic_text_output_format):
Add optional "follows_reference_printer" param.
(diagnostic_text_output_format::on_report_verbatim): New decl.
(diagnostic_text_output_format::after_diagnostic): Drop "final".
(diagnostic_text_output_format::follows_reference_printer_p): New
decl.
(class diagnostic_text_output_format): Convert private members to
protected.
(diagnostic_text_output_format::m_follows_reference_printer): New
field.
* diagnostic-format.h
(diagnostic_output_format::on_report_verbatim): New vfunc.
(diagnostic_output_format::follows_reference_printer_p): New vfunc.
(diagnostic_output_format::update_printer): New vfunc.
(diagnostic_output_format::get_printer): Use m_printer rather than
a printer from m_context.
(diagnostic_output_format::diagnostic_output_format): Initialize
m_printer by cloning the context's printer.
(diagnostic_output_format::m_printer): New field.
* diagnostic-global-context.cc (verbatim): Reimplement in terms of
global_dc->report_verbatim, moving existing implementation to
diagnostic_text_output_format::on_report_verbatim.
(fnotice): Support multiple output sinks by using a new
global_dc->supports_fnotice_on_stderr_p.
* diagnostic-output-file.h
(diagnostic_output_file::diagnostic_output_file): New default ctor.
(diagnostic_output_file::operator=): Implement move assignment.
* diagnostic-path.cc (selftest::test_interprocedural_path_1): Pass
false for new param of text_output's ctor.
* diagnostic-show-locus.cc
(selftest::test_layout_x_offset_display_utf8): Use reference
printer.
(selftest::test_layout_x_offset_display_tab): Likewise.
(selftest::test_one_liner_fixit_remove): Likewise.
* diagnostic.cc: Include "pretty-print-urlifier.h".
(diagnostic_set_caret_max_width): Update for global_dc's m_printer
becoming reference printer.
(diagnostic_context::initialize): Update for m_printer becoming
m_reference_printer.  Use ::make_unique to create it.  Update for
m_output_format becoming m_output_sinks.
(diagnostic_context::color_init): Update the reference printer,
then update the printers for any output sinks that follow it.
(diagnostic_context::urls_init): Likewise.
(diagnostic_context::finish): Update comment.  Update for
m_output_format becoming m_output_sinks.  Update for m_printer
becoming m_reference_printer and use "delete" on it rather than
XDELETE.
(diagnostic_context::dump): Update for m_printer becoming
reference printer, and for multiple output sinks.
(diagnostic_context::set_output_format): Reimplement for
supporting multiple output sinks.
(diagnostic_context::get_output_format): Likewise.
(diagnostic_context::add_sink): New.
(diagnostic_context::supports_fnotice_on_stderr_p): New.
(diagnostic_context::set_pretty_printer): New.
(diagnostic_context::refresh_output_sinks): New.
(diagnostic_context::set_format_decoder): New.
(diagnostic_context::set_show_highlight_colors): New.
(diagnostic_context::set_prefixing_rule): New.
(diagnostic_context::report_diagnostic): Update to support
multiple output sinks.
(diagnostic_context::report_verbatim): New.
(diagnostic_context::emit_diagram): Update to support multiple
output sinks.
(diagnostic_context::error_recursion): Update to use
m_reference_printer.
(fancy_abort): Likewise.
(diagnostic_context::end_group): Update to support multiple
output sinks.
(diagnostic_output_format::dump): Implement.
(diagnostic_output_format::on_report_verbatim): Likewise.
(diagnostic_output_format_init): Drop
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE_2_2_PRERELEASE.
(diagnostic_context::set_diagnostic_buffer): Reimplement to
support multiple output sinks.
(diagnostic_context::clear_diagnostic_buffer): Likewise.
(diagnostic_context::flush_diagnostic_buffer): Likewise.
(diagnostic_buffer::diagnostic_buffer): Initialize
m_per_format_buffers.
(diagnostic_buffer::~diagnostic_buffer): New dtor.
(diagnostic_buffer::dump): Reimplement to support multiple output
sinks.
(diagnostic_buffer::empty_p): Likewise.
(diagnostic_buffer::move_to): Likewise.
(diagnostic_buffer::ensure_per_format_buffer): Likewise, renaming
to...
(diagnostic_buffer::ensure_per_format_buffers): ...this.
* diagnostic.h
(DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE_2_2_PRERELEASE): Delete.
(class diagnostic_context): Add friend class diagnostic_buffer.
(diagnostic_context::set_pretty_printer): New decl.
(diagnostic_context::refresh_output_sinks): New decl.
(diagnostic_context::report_verbatim): New decl.
(diagnostic_context::get_output_format): Drop.
(diagnostic_context::set_show_highlight_colors): Drop body.
(diagnostic_context::set_format_decoder): New decl.
(diagnostic_context::set_prefixing_rule): New decl.
(diagnostic_context::clone_printer): Reimplement.
(diagnostic_context::get_reference_printer): New accessor.
(diagnostic_context::add_sink): New decl.
(diagnostic_context::supports_fnotice_on_stderr_p): New decl.
(diagnostic_context::m_printer): Replace with...
(diagnostic_context::m_reference_printer): ...this, and make
private.
(diagnostic_context::m_output_format): Replace with...
(diagnostic_context::m_output_sinks): ...this.
(diagnostic_format_decoder): Delete.
(diagnostic_prefixing_rule): Delete.
(diagnostic_ready_p): Delete.
* doc/invoke.texi: Document -fdiagnostics-add-output= and
-fdiagnostics-set-output=.
* gcc.cc: Include "opts-diagnostic.h".
(driver_handle_option): Handle cases OPT_fdiagnostics_add_output_
and OPT_fdiagnostics_set_output_.
* opts-diagnostic.cc: New file.
* opts-diagnostic.h (handle_OPT_fdiagnostics_add_output_): New decl.
(handle_OPT_fdiagnostics_set_output_): New decl.
* opts-global.cc (init_options_once): Update for global_dc's
m_printer becoming reference printer.  Call
global_dc->refresh_output_sinks.
* opts.cc (common_handle_option): Replace use of
diagnostic_prefixing_rule with dc->set_prefixing_rule.  Handle
cases OPT_fdiagnostics_add_output_ and
OPT_fdiagnostics_set_output_.  Update for m_printer becoming
reference printer.
* selftest-diagnostic.cc
(selftest::test_diagnostic_context::test_diagnostic_context):
Update for m_printer becoming reference printer.
(test_diagnostic_context::test_show_locus): Likewise.
* selftest-run-tests.cc (selftest::run_tests): Call
selftest::opts_diagnostic_cc_tests.
* selftest.h (selftest::opts_diagnostic_cc_tests): New decl.
* simple-diagnostic-path.cc
(selftest::simple_diagnostic_path_cc_tests): Use reference
printer.
* toplev.cc (announce_function): Update for global_dc's m_printer
becoming reference printer.
(toplev::main): Likewise.
* tree-diagnostic.cc (tree_diagnostics_defaults): Replace use of
diagnostic_format_decoder with context->set_format_decoder.
* tree-diagnostic.h
(tree_dump_pretty_printer::tree_dump_pretty_printer): Update for
global_dc's m_printer becoming reference printer.
* tree.cc (escaped_string::escape): Likewise.
(selftest::test_escaped_strings): Likewise.

gcc/ada/ChangeLog:
PR other/116613
* gcc-interface/misc.cc (internal_error_function): Update for
m_printer becoming reference printer.

gcc/analyzer/ChangeLog:
PR other/116613
* analyzer-language.cc (on_finish_translation_unit): Update for
m_printer becoming reference printer.
* engine.cc (run_checkers): Likewise.
* program-point.cc (function_point::print_source_line): Likewise.

gcc/c-family/ChangeLog:
PR other/116613
* c-format.cc (selftest::test_type_mismatch_range_labels): Update
for m_printer becoming reference printer.
(selftest::test_type_mismatch_range_labels): Likewise.

gcc/c/ChangeLog:
PR other/116613
* c-objc-common.cc: Include "make-unique.h".
(c_initialize_diagnostics): Use unique_ptr for pretty_printer.
Use context->set_format_decoder.

gcc/cp/ChangeLog:
PR other/116613
* error.cc (cxx_initialize_diagnostics): Use unique_ptr for
pretty_printer.  Use context->set_format_decoder.
* module.cc (noisy_p): Update for global_dc's m_printer becoming
reference printer.

gcc/d/ChangeLog:
PR other/116613
* d-diagnostic.cc (d_diagnostic_report_diagnostic): Update for
m_printer becoming reference printer.

gcc/fortran/ChangeLog:
PR other/116613
* error.cc (gfc_diagnostic_build_kind_prefix): Update for
global_dc's m_printer becoming reference printer.
(gfc_diagnostics_init): Replace usage of diagnostic_format_decoder
with global_dc->set_format_decoder.

gcc/jit/ChangeLog:
PR other/116613
* dummy-frontend.cc: Include "make-unique.h".
(class jit_diagnostic_listener): New.
(jit_begin_diagnostic): Update comment.
(jit_end_diagnostic): Drop call to add_diagnostic.
(jit_langhook_init): Set the output format to a new
jit_diagnostic_listener.
* jit-playback.cc (playback::context::add_diagnostic): Add "text"
param and use that rather than trying to get the text from a
pretty_printer.
* jit-playback.h (playback::context::add_diagnostic): Add "text"
param.

gcc/testsuite/ChangeLog:
PR other/116613
* gcc.dg/plugin/analyzer_cpython_plugin.c (dump_refcnt_info):
Update for global_dc's m_printer becoming reference printer.
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2.2.c: Replace usage
of -fdiagnostics-format=sarif-file-2.2-prerelease with
-fdiagnostics-set-output=sarif:version=2.2-prerelease.
* gcc.dg/plugin/diagnostic_plugin_test_paths.c: Update for
global_dc's m_printer becoming reference printer.
* gcc.dg/plugin/diagnostic_plugin_xhtml_format.c: Update for
changes to output formats.
* gcc.dg/plugin/expensive_selftests_plugin.c: Update for
global_dc's m_printer becoming reference printer.
* gcc.dg/sarif-output/add-output-sarif-defaults.c: New test.
* gcc.dg/sarif-output/bad-binary-op.c: New test.
* gcc.dg/sarif-output/bad-binary-op.py: New support script.
* gcc.dg/sarif-output/multiple-outputs.c: New test.
* gcc.dg/sarif-output/multiple-outputs.py: New support script.
* lib/scansarif.exp (verify-sarif-file): Add an optional second
argument specifying the expected filename of the .sarif file.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Andrew Pinski [Tue, 29 Oct 2024 16:16:18 +0000 (09:16 -0700)]

aarch64: Use canonicalize_comparison in ccmp expansion [PR117346]

While testing the patch for PR 85605 on aarch64, it was noticed that the
imm_choice_comparison.c test failed. This was because canonicalize_comparison
was not being called in the ccmp case. This can be noticed without the patch
for PR 85605, as evidenced by the new testcase.

Bootstrapped and tested on aarch64-linux-gnu.

PR target/117346

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_gen_ccmp_first): Call
canonicalize_comparison before figuring out the cmp_mode/cc_mode.
(aarch64_gen_ccmp_next): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/imm_choice_comparison-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Andi Kleen [Fri, 25 Oct 2024 22:04:06 +0000 (15:04 -0700)]

Simplify switch bit test clustering algorithm

The current switch bit test clustering enumerates all possible case
cluster combinations to find ones that fit the bit test constraints
best. This causes performance problems with very large switches.

For bit test clustering which happens naturally in word sized chunks
I don't think such an expensive algorithm is really needed.

This patch implements a simple greedy algorithm that walks
the sorted list and examines word sized windows and tries
to cluster them.

Surprisingly the new algorithm gives consistently better clusters
for the examples I tried.

For example from the gcc bootstrap:

old: 0-15 16-31 96-175
new: 0-31 96-175

I'm not fully sure why that is, probably some bug in the old
algorithm? This even shows up in the test suite, where if-to-switch-6
can now generate a switch, as well as a case in switch-1.c.

I don't have a proof that the new algorithm is always as good or better,
but so far at least I don't see any counter examples.

It also fixes the excessive compile time in PR117091,
however this was already fixed by an earlier patch
that doesn't run clustering when no targets have multiple
values.
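
The greedy walk described above can be sketched roughly as follows. This is
only an illustration, not the code in tree-switch-conversion.cc: the
case_range struct, the word_bits constant and count_greedy_clusters are
hypothetical names, and the real pass applies further checks (for example on
the number of distinct targets) that are omitted here.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for one case range [low, high]; the input is
// assumed to be sorted by low bound, as in the commit description.
struct case_range { long low; long high; };

// Assumed machine word width; a bit test covers at most this many values.
constexpr long word_bits = 64;

// Walk the sorted ranges once.  Each cluster starts at the current range
// and greedily absorbs following ranges while they still fit into a
// word-sized window that begins at the cluster's low bound.
template <std::size_t N>
constexpr std::size_t
count_greedy_clusters (const std::array<case_range, N> &cases)
{
  std::size_t clusters = 0;
  std::size_t i = 0;
  while (i < N)
    {
      const long window_low = cases[i].low;
      ++clusters;
      ++i;  // The first range always belongs to the new cluster.
      while (i < N && cases[i].high - window_low < word_bits)
        ++i;
    }
  return clusters;
}
```

The sketch makes a single pass over the list, so its cost is linear in the
number of case ranges, in contrast to enumerating all cluster combinations.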

gcc/ChangeLog:

PR middle-end/117091
* tree-switch-conversion.cc (bit_test_cluster::find_bit_tests):
Change clustering algorithm to simple greedy.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/if-to-switch-6.c: Allow condition chain.
* gcc.dg/tree-ssa/switch-1.c: Allow more bit tests.
* gcc.dg/pr21643.c: Use -fno-bit-tests
* gcc.target/aarch64/pr99988.c: Use -fno-bit-tests

Andi Kleen [Wed, 16 Oct 2024 21:07:18 +0000 (14:07 -0700)]

Only do switch bit test clustering when multiple labels point to same bb

The bit cluster code generation strategy is only beneficial when
multiple case labels point to the same code. Do a quick check if
that is the case before trying to cluster.

This fixes the switch part of PR117091, where all case labels are unique;
however, it doesn't address the performance problems for non-unique
cases.
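
A minimal sketch of such a quick check, assuming each case label is
represented by an integer id of its target block (any_shared_target and the
id representation are hypothetical and not taken from the patch, which
instead records the number of targets per label):

```cpp
#include <array>
#include <cstddef>

// Returns true when at least two case labels jump to the same target
// block, i.e. when bit test clustering could be worthwhile at all.
// Quadratic for brevity; a real implementation would count per target.
template <std::size_t N>
constexpr bool
any_shared_target (const std::array<int, N> &targets)
{
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t j = i + 1; j < N; ++j)
      if (targets[i] == targets[j])
        return true;
  return false;
}
```

When this returns false every label is unique and clustering can be skipped.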

gcc/ChangeLog:

PR middle-end/117091
* gimple-if-to-switch.cc (if_chain::is_beneficial): Update
find_bit_test call.
* tree-switch-conversion.cc (bit_test_cluster::find_bit_tests):
Get max_c argument and bail out early if all case labels are
unique.
(switch_decision_tree::compute_cases_per_edge): Record number of
targets per label and return.
(switch_decision_tree::analyze_switch_statement): ... pass to
find_bit_tests.
* tree-switch-conversion.h: Update prototypes.

Andi Kleen [Tue, 15 Oct 2024 20:15:09 +0000 (13:15 -0700)]

Disable -fbit-tests and -fjump-tables at -O0

gcc/ChangeLog:

* common.opt: Enable -fbit-tests and -fjump-tables only at -O1.
* opts.cc (default_options_table): Ditto.

Eric Botcazou [Tue, 29 Oct 2024 20:40:34 +0000 (21:40 +0100)]

Fix miscompilation of function containing __builtin_unreachable

This is a wrong-code generation on the SPARC for a function containing
a call to __builtin_unreachable caused by the delay slot scheduling pass,
and more specifically the find_end_label function which has these lines:

  /* Otherwise, see if there is a label at the end of the function. If there
     is, it must be that RETURN insns aren't needed, so that is our return
     label and we don't have to do anything else.  */

The comment was correct 20 years ago but no longer is nowadays in the
presence of RTL epilogues and calls to __builtin_unreachable, so the
patch just removes the associated two lines of code:

  else if (LABEL_P (insn))
    *plabel = as_a <rtx_code_label *> (insn);

and otherwise contains just adjustments to the commentary.

gcc/
PR rtl-optimization/117327
* reorg.cc (find_end_label): Do not return a dangling label at the
end of the function and adjust commentary.

gcc/testsuite/
* gcc.c-torture/execute/20241029-1.c: New test.

Andrew Pinski [Tue, 29 Oct 2024 20:01:30 +0000 (13:01 -0700)]

aarch64: Remove unnecessary casts to rtx_code [PR117349]

In aarch64_gen_ccmp_first/aarch64_gen_ccmp_next, the casts
were no longer needed after r14-3412-gbf64392d66f291 which
changed the type of the arguments to rtx_code.

In aarch64_rtx_costs, they were no longer needed since
r12-4828-g1d5c43db79b7ea which changed the type of code
to rtx_code.

Pushed as obvious after a build/test for aarch64-linux-gnu.

gcc/ChangeLog:

PR target/117349
* config/aarch64/aarch64.cc (aarch64_rtx_costs): Remove
unnecessary casts to rtx_code.
(aarch64_gen_ccmp_first): Likewise.
(aarch64_gen_ccmp_next): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Jakub Jelinek [Tue, 29 Oct 2024 19:14:09 +0000 (20:14 +0100)]

c-family: Handle RAW_DATA_CST in complete_array_type [PR117313]

The following testcase ICEs, because
add_flexible_array_elts_to_size -> complete_array_type
is done only after braced_lists_to_strings which optimizes
RAW_DATA_CST surrounded by INTEGER_CST into a larger RAW_DATA_CST
covering even the boundaries, while I thought it is done before
that.
So, RAW_DATA_CST now can be the last constructor_elt in a CONSTRUCTOR
and so we need the function to take it into account (handle it as
RAW_DATA_CST standing for RAW_DATA_LENGTH consecutive elements).

The function wants to support both CONSTRUCTORs without indexes and with
them (for non-RAW_DATA_CST elts it was just adding 1 for the current
index).  So, if the RAW_DATA_CST elt has ce->index, we need to add
RAW_DATA_LENGTH (ce->value) - 1, while if it doesn't (and it isn't cnt == 0
case where curindex is 0), add that plus 1, i.e. RAW_DATA_LENGTH (ce->value).
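
The index arithmetic above can be modelled as follows. This is a simplified
sketch rather than the code in c-common.cc: elt, has_index, length and
max_index are hypothetical names, with length standing for RAW_DATA_LENGTH
(and 1 for an ordinary element).

```cpp
#include <array>
#include <cstddef>

// Hypothetical model of one constructor element.  An ordinary element has
// length 1; a RAW_DATA_CST-like element covers `length` consecutive slots.
struct elt { bool has_index; long index; long length; };

// Returns the highest index covered by the elements.
template <std::size_t N>
constexpr long
max_index (const std::array<elt, N> &elts)
{
  long curindex = 0;
  long maxindex = 0;
  for (std::size_t cnt = 0; cnt < N; ++cnt)
    {
      const elt &e = elts[cnt];
      if (e.has_index)
        curindex = e.index + e.length - 1;  // explicit index: add length - 1
      else if (cnt == 0)
        curindex = e.length - 1;            // first element starts at 0
      else
        curindex += e.length;               // implicit index: add length
      if (curindex > maxindex)
        maxindex = curindex;
    }
  return maxindex;
}
```

For length 1 each branch reduces to the usual "advance by one" behaviour.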

2024-10-29  Jakub Jelinek  <jakub@redhat.com>

PR c/117313
gcc/c-family/
* c-common.cc (complete_array_type): For RAW_DATA_CST elements
advance curindex by RAW_DATA_LENGTH or one less than that if
ce->index is non-NULL.  Handle even the first element if
it is RAW_DATA_CST.  Formatting fix.
gcc/testsuite/
* c-c++-common/init-6.c: New test.

Jason Merrill [Tue, 22 Oct 2024 21:45:00 +0000 (17:45 -0400)]

c++: printing AGGR_INIT_EXPR args

PR30854 was about wrongly dumping the dummy object argument to a
constructor; r126582 in 4.3 fixed that by skipping the first argument. But
not all functions called by AGGR_INIT_EXPR are constructors, as observed in
PR116634; we shouldn't skip for non-member functions. And let's combine the
printing code for CALL_EXPR and AGGR_INIT_EXPR.

This doesn't make us accept the ill-formed 116634 testcase again with a
pedwarn, just fixes the diagnostic issue.

PR c++/30854
PR c++/116634

gcc/cp/ChangeLog:

* error.cc (dump_aggr_init_expr_args): Remove.
(dump_call_expr_args): Handle AGGR_INIT_EXPR.
(dump_expr): Combine AGGR_INIT_EXPR and CALL_EXPR cases.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro-bad-alloc-02-no-op-new-nt.C: Adjust
diagnostic.
* g++.dg/diagnostic/aggr-init1.C: New test.

Tsung Chun Lin [Tue, 29 Oct 2024 15:47:57 +0000 (09:47 -0600)]

[RISC-V] RISC-V: Add implication for M extension.

That is, M implies Zmmul.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: M implies Zmmul.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/attribute-15.c: Add _zmmul1p0 to arch string.
* gcc.target/riscv/attribute-16.c: Ditto.
* gcc.target/riscv/attribute-17.c: Ditto.
* gcc.target/riscv/attribute-18.c: Ditto.
* gcc.target/riscv/attribute-19.c: Ditto.
* gcc.target/riscv/pr110696.c: Ditto.
* gcc.target/riscv/target-attr-01.c: Ditto.
* gcc.target/riscv/target-attr-02.c: Ditto.
* gcc.target/riscv/target-attr-03.c: Ditto.
* gcc.target/riscv/target-attr-04.c: Ditto.
* gcc.target/riscv/target-attr-08.c: Ditto.
* gcc.target/riscv/target-attr-11.c: Ditto.
* gcc.target/riscv/target-attr-14.c: Ditto.
* gcc.target/riscv/target-attr-15.c: Ditto.
* gcc.target/riscv/target-attr-16.c: Ditto.
* gcc.target/riscv/rvv/base/pr114352-1.c: Likewise.
* gcc.target/riscv/rvv/base/pr114352-3.c: Likewise.
* gcc.dg/pr90838.c: Fix search string for rv64.

Co-Authored-By: Jeff Law <jlaw@ventanamicro.com>

Andrew Pinski [Tue, 29 Oct 2024 05:05:08 +0000 (22:05 -0700)]

testcase: Add testcase for tree-optimization/117341

Even though PR 117341 was a duplicate of PR 116768, another
testcase, this time in C++, does not hurt to have.
The testcase is self-contained and does not directly use libstdc++
except for operator new (it does not even call delete).

Tested on x86_64-linux-gnu with it working.

PR tree-optimization/117341

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr117341-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

yulong [Tue, 29 Oct 2024 14:44:45 +0000 (08:44 -0600)]

[PATCH 2/2] RISC-V: Add intrinsic cases for the CMOs extensions

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmo-32.c: New test.
* gcc.target/riscv/cmo-64.c: New test.

yulong [Tue, 29 Oct 2024 14:43:42 +0000 (08:43 -0600)]

[PATCH 1/2] RISC-V: Add intrinsic support for the CMOs extensions

gcc/ChangeLog:

* config.gcc: Add riscv_cmo.h.
* config/riscv/riscv_cmo.h: New file.

Pan Li [Wed, 23 Oct 2024 08:52:01 +0000 (16:52 +0800)]

RISC-V: Add testcases for form 1 of MASK_LEN_STRIDED_LOAD{STORE}

Form 1:
  void __attribute__((noinline))                                        \
  vec_strided_load_store_##T##_form_1 (T *restrict out, T *restrict in, \
       long stride, size_t size)        \
  {                                                                     \
    for (size_t i = 0; i < size; i++)                                   \
      out[i * stride] = in[i * stride];                                 \
  }

The test suites below passed with this patch:
* The riscv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Add strided folder.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u8.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f16.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f32.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f64.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u8.c: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st.h: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st_data.h: New test.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st_run.h: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Pan Li [Wed, 23 Oct 2024 08:46:53 +0000 (16:46 +0800)]

RISC-V: Implement the MASK_LEN_STRIDED_LOAD{STORE}

This patch implements MASK_LEN_STRIDED_LOAD{STORE} in
the RISC-V backend by leveraging the vector strided load/store insns.

For example:
void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
    for (int i = 0; i < n; i++)
      a[i*stride] = b[i*stride] + 100;
}

Before this patch:
    vsetvli a5,a3,e32,m1,ta,ma
    vluxei64.v  v1,(a1),v4
    mul a4,a2,a5
    sub a3,a3,a5
    vadd.vv v1,v1,v2
    vsuxei64.v  v1,(a0),v4
    add a1,a1,a4
    add a0,a0,a4

After this patch:
    vsetvli a5,a3,e32,m1,ta,ma
    vlse32.v    v1,0(a1),a2
    mul a4,a2,a5
    sub a3,a3,a5
    vadd.vv v1,v1,v2
    vsse32.v    v1,0(a0),a2
    add a1,a1,a4
    add a0,a0,a4

The test suites below passed with this patch:
* The riscv full regression test.

gcc/ChangeLog:

* config/riscv/autovec.md (mask_len_strided_load_<mode>): Add
new pattern for MASK_LEN_STRIDED_LOAD.
(mask_len_strided_store_<mode>): Ditto but for store.
* config/riscv/riscv-protos.h (expand_strided_load): Add new
func decl to expand strided load.
(expand_strided_store): Ditto but for store.
* config/riscv/riscv-v.cc (expand_strided_load): Add new
func impl to expand strided load.
(expand_strided_store): Ditto but for store.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Pan Li [Wed, 23 Oct 2024 08:43:37 +0000 (16:43 +0800)]

RISC-V: Adjust the gather-scatter testcases due to middle-end change

After we have MASK_LEN_STRIDED_LOAD{STORE} in the middle-end, the
strided cases need to be adjusted for the IR check.

The test suites below passed with this patch:
* The riscv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c:
Adjust IR for MASK_LEN_LOAD check.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c:
Ditto but for store.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c:
Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Pan Li [Wed, 23 Oct 2024 08:36:28 +0000 (16:36 +0800)]

Vect: Introduce MASK_LEN_STRIDED_LOAD{STORE} to loop vectorizer

This patch allows generation of MASK_LEN_STRIDED_LOAD{STORE} IR
for invariant-stride memory accesses.  For example:

void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
    for (int i = 0; i < n; i++)
      a[i*stride] = b[i*stride] + 100;
}

Before this patch:
  _73 = .SELECT_VL (ivtmp_71, POLY_INT_CST [4, 4]);
  _52 = _54 * _73;
  vect__5.16_61 = .MASK_LEN_GATHER_LOAD (vectp_b.14_59, _58, 4, { 0, ... }, { -1, ... }, _73, 0);
  vect__7.17_63 = vect__5.16_61 + { 100, ... };
  .MASK_LEN_SCATTER_STORE (vectp_a.18_67, _58, 4, vect__7.17_63, { -1, ... }, _73, 0);
  vectp_b.14_60 = vectp_b.14_59 + _52;
  vectp_a.18_68 = vectp_a.18_67 + _52;
  ivtmp_72 = ivtmp_71 - _73;

After this patch:
  _70 = .SELECT_VL (ivtmp_68, POLY_INT_CST [4, 4]);
  _52 = _54 * _70;
  vect__5.16_58 = .MASK_LEN_STRIDED_LOAD (vectp_b.14_56, _55, { 0, ... }, { -1, ... }, _70, 0);
  vect__7.17_60 = vect__5.16_58 + { 100, ... };
  .MASK_LEN_STRIDED_STORE (vectp_a.18_64, _55, vect__7.17_60, { -1, ... }, _70, 0);
  vectp_b.14_57 = vectp_b.14_56 + _52;
  vectp_a.18_65 = vectp_a.18_64 + _52;
  ivtmp_69 = ivtmp_68 - _70;

The test suites below passed with this patch:
* The x86 bootstrap test.
* The x86 full regression test.
* The riscv full regression test.

gcc/ChangeLog:

* tree-vect-stmts.cc (vect_get_strided_load_store_ops): Handle
MASK_LEN_STRIDED_LOAD{STORE} after supported check.
(vectorizable_store): Generate MASK_LEN_STRIDED_LOAD when the offset
of gather is not vector type.
(vectorizable_load): Ditto but for store.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Pan Li [Wed, 23 Oct 2024 08:24:19 +0000 (16:24 +0800)]

Internal-fn: Introduce new IFN MASK_LEN_STRIDED_LOAD{STORE}

This patch introduces new IFNs for strided load and store.

LOAD:  v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias)
STORE: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias)

The IFNs target code similar to the example below:

void foo (int * a, int * b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i * stride] = b[i * stride];
}
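
As a rough per-lane reading of the LOAD form, the sketch below models what
such an operation computes. Several things here are assumptions made for
illustration only and are not taken from the patch: a fixed vector of four
lanes, a stride counted in elements rather than bytes, inactive lanes being
zeroed, and the bias operand being ignored. The names int_vec, mask_vec,
sample_memory and strided_load_model are hypothetical.

```cpp
#include <cstddef>

// Assumed fixed number of lanes; real vectors may be length-agnostic.
constexpr std::size_t lanes = 4;

struct int_vec { int lane[lanes]; };
struct mask_vec { bool lane[lanes]; };

// Sample memory used for the usage example below.
constexpr int sample_memory[8] = {10, 11, 12, 13, 14, 15, 16, 17};

// Scalar model of v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias)
// with bias omitted.  Lane i is loaded from ptr[i * stride] when i < len
// and mask.lane[i] is set; other lanes are zeroed in this model only.
constexpr int_vec
strided_load_model (const int *ptr, std::size_t stride, mask_vec mask,
                    std::size_t len)
{
  int_vec v = {};
  for (std::size_t i = 0; i < lanes && i < len; ++i)
    if (mask.lane[i])
      v.lane[i] = ptr[i * stride];
  return v;
}
```

With sample_memory, a stride of 2, an all-true mask and len 3, the first
three lanes read elements 0, 2 and 4, and the fourth lane stays inactive.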

The test suites below passed with this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* internal-fn.cc (strided_load_direct): Add new define direct
for strided load.
(strided_store_direct): Ditto but for store.
(expand_strided_load_optab_fn): Add new func to expand the IFN
MASK_LEN_STRIDED_LOAD in middle-end.
(expand_strided_store_optab_fn): Ditto but for store.
(direct_strided_load_optab_supported_p): Add define for stride
load optab supported.
(direct_strided_store_optab_supported_p): Ditto but for store.
(internal_fn_len_index): Add strided load/store len index.
(internal_fn_mask_index): Ditto but for mask.
(internal_fn_stored_value_index): Add strided store value index.
* internal-fn.def (MASK_LEN_STRIDED_LOAD): Add new IFN for
strided load.
(MASK_LEN_STRIDED_STORE): Ditto but for store.
* optabs.def (mask_len_strided_load_optab): Add strided load optab.
(mask_len_strided_store_optab): Add strided store optab.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Richard Biener [Sat, 26 Oct 2024 12:27:14 +0000 (14:27 +0200)]

Remove dead vect_recog_mixed_size_cond_pattern

vect_recog_mixed_size_cond_pattern only applies to COMPARISON_CLASS_P
rhs1 COND_EXPRs which no longer appear - the following removes it.
Its testcases still pass, I believe the situation is mitigated by
bool pattern handling of the compare use in COND_EXPRs.

* tree-vect-patterns.cc (type_conversion_p): Remove.
(vect_recog_mixed_size_cond_pattern): Likewise.
(vect_vect_recog_func_ptrs): Remove vect_recog_mixed_size_cond_pattern
entry.

Richard Biener [Sat, 26 Oct 2024 12:23:15 +0000 (14:23 +0200)]

Remove dead code in vectorizer pattern recog

The following removes the code path in vect_recog_mask_conversion_pattern
dealing with comparisons in COND_EXPRs. That can no longer happen.

* tree-vect-patterns.cc (vect_recog_mask_conversion_pattern):
Remove COMPARISON_CLASS_P rhs1 of COND_EXPR case and assert
it doesn't happen.

Patrick Palka [Tue, 29 Oct 2024 13:26:19 +0000 (09:26 -0400)]

libstdc++: Fix complexity of drop_view::begin() const [PR112641]

Views are required to have an amortized O(1) begin(), but our drop_view's
const begin overload is O(n) for non-common ranges with a non-sized
sentinel. This patch reimplements it so that it's O(1) always. See
also LWG 4009.

PR libstdc++/112641

libstdc++-v3/ChangeLog:

* include/std/ranges (drop_view::begin): Reimplement const
overload so that it's O(1) always.
* testsuite/std/ranges/adaptors/drop.cc (test10): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

David Malcolm [Tue, 29 Oct 2024 12:25:56 +0000 (08:25 -0400)]

jit: fix leak of pending_assemble_externals_set [PR117275]

My recent r15-4580-g779c0390e3b57d fix for resetting state in
varasm.cc introduced some noise to "make selftest-valgrind" and,
presumably, a memory leak in libgccjit:

==2462086== 160 (56 direct, 104 indirect) bytes in 1 blocks are definitely lost in loss record 248 of 352
==2462086==    at 0x5270E7D: operator new(unsigned long) (vg_replace_malloc.c:342)
==2462086==    by 0x1D1EB89: init_varasm_once() (varasm.cc:6806)
==2462086==    by 0x181C845: backend_init() (toplev.cc:1826)
==2462086==    by 0x181D41A: do_compile() (toplev.cc:2193)
==2462086==    by 0x181D99C: toplev::main(int, char**) (toplev.cc:2371)
==2462086==    by 0x378391D: main (main.cc:39)

Fixed thusly.

gcc/ChangeLog:
PR jit/117275
* varasm.cc (process_pending_assemble_externals): Reset
pending_assemble_externals_set to nullptr after deleting it.
(varasm_cc_finalize): Delete pending_assemble_externals_set.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Richard Biener [Tue, 29 Oct 2024 10:26:13 +0000 (11:26 +0100)]

tree-optimization/117343 - decide_masked_load_lanes and stale graph

It turns out decide_masked_load_lanes accesses a stale SLP graph
so the following re-builds it instead.

PR tree-optimization/117343
* tree-vect-slp.cc (vect_optimize_slp_pass::build_vertices):
Support re-building the SLP graph.
(vect_optimize_slp_pass::run): Re-build the SLP graph before
decide_masked_load_lanes.

Richard Biener [Tue, 29 Oct 2024 08:42:12 +0000 (09:42 +0100)]

tree-optimization/117333 - ICE with NULL access size DR

dr_may_alias_p ICEs when TYPE_SIZE of DR->ref is NULL but this is
valid IL when the access size of an aggregate copy can be inferred
from the RHS.

PR tree-optimization/117333
* tree-data-ref.cc (dr_may_alias_p): Guard against NULL
access size.

* gcc.dg/torture/pr117333.c: New testcase.

Jakub Jelinek [Tue, 29 Oct 2024 10:14:12 +0000 (11:14 +0100)]

libstdc++: Use if consteval rather than if (std::__is_constant_evaluated()) for {,b}float16_t nextafter [PR117321]

The nextafter_c++23.cc testcase fails to link at -O0.
The problem is that even though std::__is_constant_evaluated() has
always_inline attribute, that at -O0 just means that we inline the
call, but its result is still assigned to a temporary which is tested
later, nothing at -O0 propagates that false into the if and optimizes
away the if body.  And the __builtin_nextafterf16{,b} calls are meant
to be used solely for constant evaluation, the C libraries don't
define nextafterf16 these days.

As __STDCPP_FLOAT16_T__ and __STDCPP_BFLOAT16_T__ are predefined right
now only by GCC, not by clang which doesn't implement the extended floating
point types paper, and as they are predefined in C++23 and later modes only,
I think we can just use if consteval which is folded already during the FE
and the body isn't included even at -O0.  I've added a feature test for
that just in case clang implements those and implements those in some weird
way.  Note, if (__builtin_is_constant_evaluated()) would work correctly too,
that is also folded to false at gimplification time and the corresponding
if block not emitted at all.  But for -O0 it can't be wrapped into a helper
inline function.
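
A minimal sketch of the difference, assuming a compiler that defines the standard `__cpp_if_consteval` feature-test macro in C++23 mode. `compile_time_only`, `runtime_impl` and `twice` are hypothetical stand-ins for the builtin, the ordinary code path and the library function; they are not libstdc++ names.

```cpp
#include <cassert>

// Hypothetical helper meant to be used only during constant evaluation
// (the role __builtin_nextafterf16 plays in the message above).
constexpr int compile_time_only (int x) { return x * 2; }

// Hypothetical ordinary implementation used at run time.
inline int runtime_impl (int x) { return x + x; }

constexpr int twice (int x)
{
#if defined(__cpp_if_consteval) && __cpp_if_consteval >= 202106L
  // Resolved by the front end: for a run-time call the body of this
  // branch is discarded, so compile_time_only is not referenced.
  if consteval { return compile_time_only (x); }
  return runtime_impl (x);
#else
  // Without if consteval, fall back to code valid in both contexts.
  return x * 2;
#endif
}

static_assert (twice (21) == 42, "constant evaluation yields 42");
```

The `#if` guard only keeps the sketch compilable in older language modes; the point is that the `if consteval` branch needs no optimization pass to disappear from run-time code.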

2024-10-29  Jakub Jelinek  <jakub@redhat.com>

PR libstdc++/117321
* include/c_global/cmath (nextafter(_Float16, _Float16)): Use
if consteval rather than if (std::__is_constant_evaluated()) around
the __builtin_nextafterf16 call.
(nextafter(__gnu_cxx::__bfloat16_t, __gnu_cxx::__bfloat16_t)): Use
if consteval rather than if (std::__is_constant_evaluated()) around
the __builtin_nextafterf16b call.
* testsuite/26_numerics/headers/cmath/117321.cc: New test.

Marc Poulhiès [Mon, 28 Oct 2024 15:10:25 +0000 (16:10 +0100)]

ada: Fix static_assert with one argument

Single argument static_assert is C++17 only and breaks the build using
older GCC (prerequisite is C++14).
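
For reference, a small sketch of the portable form: the two-argument static_assert is accepted from C++11 onwards, while the message-less form needs C++17. The constant `int_has_16_bits` is made up for this example and does not come from types.h.

```cpp
#include <cassert>
#include <climits>

// Hypothetical constant used only for illustration.
constexpr bool int_has_16_bits = sizeof (int) * CHAR_BIT >= 16;

// Valid in C++11 and later: condition plus a string literal.
static_assert (int_has_16_bits, "int must have at least 16 bits");

// Valid only in C++17 and later, so it is left commented out here:
// static_assert (int_has_16_bits);
```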

gcc/ada

* types.h: fix static_assert.

Alfie Richards [Wed, 11 Sep 2024 13:01:43 +0000 (15:01 +0200)]

arm: [MVE intrinsics] Rework MVE vld/vst intrinsics

Implement the mve vld and vst intrinsics using the MVE builtins framework.

The main part of the patch is to reimplement the vstr/vldr patterns
such that we now have far fewer of them:
- non-truncating stores
- predicated non-truncating stores
- truncating stores
- predicated truncating stores
- non-extending loads
- predicated non-extending loads
- extending loads
- predicated extending loads

This enables us to update the implementation of vld1/vst1 and use the
new vldr/vstr builtins.

The patch also adds support for the predicated vld1/vst1 versions.

gcc.target/arm/pr112337.c needs an update, to call the intrinsic
instead of the builtin, which this patch deletes.

2024-09-11 Alfie Richards <Alfie.Richards@arm.com>
Christophe Lyon <christophe.lyon@arm.com>

gcc/

* config/arm/arm-mve-builtins-base.cc (vld1q_impl): Add support
for predicated version.
(vst1q_impl): Likewise.
(vstrq_impl): New class.
(vldrq_impl): New class.
(vldrbq): New.
(vldrhq): New.
(vldrwq): New.
(vstrbq): New.
(vstrhq): New.
(vstrwq): New.
* config/arm/arm-mve-builtins-base.def (vld1q): Add predicated
version.
(vldrbq): New.
(vldrhq): New.
(vldrwq): New.
(vst1q): Add predicated version.
(vstrbq): New.
(vstrhq): New.
(vstrwq): New.
(vrev32q): Update types to float_16.
* config/arm/arm-mve-builtins-base.h (vldrbq): New.
(vldrhq): New.
(vldrwq): New.
(vstrbq): New.
(vstrhq): New.
(vstrwq): New.
* config/arm/arm-mve-builtins-functions.h (memory_vector_mode):
Remove conversion of floating point vectors to integer.
* config/arm/arm-mve-builtins.cc (TYPES_float16): Change to...
(TYPES_float_16): ...this.
(TYPES_float_32): New.
(float16): Change to...
(float_16): ...this.
(float_32): New.
(preds_z_or_none): New.
(function_resolver::check_gp_argument): Add support for _z
predicate.
* config/arm/arm_mve.h (vstrbq): Remove.
(vstrbq_p): Likewise.
(vstrhq): Likewise.
(vstrhq_p): Likewise.
(vstrwq): Likewise.
(vstrwq_p): Likewise.
(vst1q_p): Likewise.
(vld1q_z): Likewise.
(vldrbq_s8): Likewise.
(vldrbq_u8): Likewise.
(vldrbq_s16): Likewise.
(vldrbq_u16): Likewise.
(vldrbq_s32): Likewise.
(vldrbq_u32): Likewise.
(vstrbq_s8): Likewise.
(vstrbq_s32): Likewise.
(vstrbq_s16): Likewise.
(vstrbq_u8): Likewise.
(vstrbq_u32): Likewise.
(vstrbq_u16): Likewise.
(vstrbq_p_s8): Likewise.
(vstrbq_p_s32): Likewise.
(vstrbq_p_s16): Likewise.
(vstrbq_p_u8): Likewise.
(vstrbq_p_u32): Likewise.
(vstrbq_p_u16): Likewise.
(vldrbq_z_s16): Likewise.
(vldrbq_z_u8): Likewise.
(vldrbq_z_s8): Likewise.
(vldrbq_z_s32): Likewise.
(vldrbq_z_u16): Likewise.
(vldrbq_z_u32): Likewise.
(vldrhq_s32): Likewise.
(vldrhq_s16): Likewise.
(vldrhq_u32): Likewise.
(vldrhq_u16): Likewise.
(vldrhq_z_s32): Likewise.
(vldrhq_z_s16): Likewise.
(vldrhq_z_u32): Likewise.
(vldrhq_z_u16): Likewise.
(vldrwq_s32): Likewise.
(vldrwq_u32): Likewise.
(vldrwq_z_s32): Likewise.
(vldrwq_z_u32): Likewise.
(vldrhq_f16): Likewise.
(vldrhq_z_f16): Likewise.
(vldrwq_f32): Likewise.
(vldrwq_z_f32): Likewise.
(vstrhq_f16): Likewise.
(vstrhq_s32): Likewise.
(vstrhq_s16): Likewise.
(vstrhq_u32): Likewise.
(vstrhq_u16): Likewise.
(vstrhq_p_f16): Likewise.
(vstrhq_p_s32): Likewise.
(vstrhq_p_s16): Likewise.
(vstrhq_p_u32): Likewise.
(vstrhq_p_u16): Likewise.
(vstrwq_f32): Likewise.
(vstrwq_s32): Likewise.
(vstrwq_u32): Likewise.
(vstrwq_p_f32): Likewise.
(vstrwq_p_s32): Likewise.
(vstrwq_p_u32): Likewise.
(vst1q_p_u8): Likewise.
(vst1q_p_s8): Likewise.
(vld1q_z_u8): Likewise.
(vld1q_z_s8): Likewise.
(vst1q_p_u16): Likewise.
(vst1q_p_s16): Likewise.
(vld1q_z_u16): Likewise.
(vld1q_z_s16): Likewise.
(vst1q_p_u32): Likewise.
(vst1q_p_s32): Likewise.
(vld1q_z_u32): Likewise.
(vld1q_z_s32): Likewise.
(vld1q_z_f16): Likewise.
(vst1q_p_f16): Likewise.
(vld1q_z_f32): Likewise.
(vst1q_p_f32): Likewise.
(__arm_vstrbq_s8): Likewise.
(__arm_vstrbq_s32): Likewise.
(__arm_vstrbq_s16): Likewise.
(__arm_vstrbq_u8): Likewise.
(__arm_vstrbq_u32): Likewise.
(__arm_vstrbq_u16): Likewise.
(__arm_vldrbq_s8): Likewise.
(__arm_vldrbq_u8): Likewise.
(__arm_vldrbq_s16): Likewise.
(__arm_vldrbq_u16): Likewise.
(__arm_vldrbq_s32): Likewise.
(__arm_vldrbq_u32): Likewise.
(__arm_vstrbq_p_s8): Likewise.
(__arm_vstrbq_p_s32): Likewise.
(__arm_vstrbq_p_s16): Likewise.
(__arm_vstrbq_p_u8): Likewise.
(__arm_vstrbq_p_u32): Likewise.
(__arm_vstrbq_p_u16): Likewise.
(__arm_vldrbq_z_s8): Likewise.
(__arm_vldrbq_z_s32): Likewise.
(__arm_vldrbq_z_s16): Likewise.
(__arm_vldrbq_z_u8): Likewise.
(__arm_vldrbq_z_u32): Likewise.
(__arm_vldrbq_z_u16): Likewise.
(__arm_vldrhq_s32): Likewise.
(__arm_vldrhq_s16): Likewise.
(__arm_vldrhq_u32): Likewise.
(__arm_vldrhq_u16): Likewise.
(__arm_vldrhq_z_s32): Likewise.
(__arm_vldrhq_z_s16): Likewise.
(__arm_vldrhq_z_u32): Likewise.
(__arm_vldrhq_z_u16): Likewise.
(__arm_vldrwq_s32): Likewise.
(__arm_vldrwq_u32): Likewise.
(__arm_vldrwq_z_s32): Likewise.
(__arm_vldrwq_z_u32): Likewise.
(__arm_vstrhq_s32): Likewise.
(__arm_vstrhq_s16): Likewise.
(__arm_vstrhq_u32): Likewise.
(__arm_vstrhq_u16): Likewise.
(__arm_vstrhq_p_s32): Likewise.
(__arm_vstrhq_p_s16): Likewise.
(__arm_vstrhq_p_u32): Likewise.
(__arm_vstrhq_p_u16): Likewise.
(__arm_vstrwq_s32): Likewise.
(__arm_vstrwq_u32): Likewise.
(__arm_vstrwq_p_s32): Likewise.
(__arm_vstrwq_p_u32): Likewise.
(__arm_vst1q_p_u8): Likewise.
(__arm_vst1q_p_s8): Likewise.
(__arm_vld1q_z_u8): Likewise.
(__arm_vld1q_z_s8): Likewise.
(__arm_vst1q_p_u16): Likewise.
(__arm_vst1q_p_s16): Likewise.
(__arm_vld1q_z_u16): Likewise.
(__arm_vld1q_z_s16): Likewise.
(__arm_vst1q_p_u32): Likewise.
(__arm_vst1q_p_s32): Likewise.
(__arm_vld1q_z_u32): Likewise.
(__arm_vld1q_z_s32): Likewise.
(__arm_vldrwq_f32): Likewise.
(__arm_vldrwq_z_f32): Likewise.
(__arm_vldrhq_z_f16): Likewise.
(__arm_vldrhq_f16): Likewise.
(__arm_vstrwq_p_f32): Likewise.
(__arm_vstrwq_f32): Likewise.
(__arm_vstrhq_f16): Likewise.
(__arm_vstrhq_p_f16): Likewise.
(__arm_vld1q_z_f16): Likewise.
(__arm_vst1q_p_f16): Likewise.
(__arm_vld1q_z_f32): Likewise.
(__arm_vst2q_f32): Likewise.
(__arm_vst1q_p_f32): Likewise.
(__arm_vstrbq): Likewise.
(__arm_vstrbq_p): Likewise.
(__arm_vstrhq): Likewise.
(__arm_vstrhq_p): Likewise.
(__arm_vstrwq): Likewise.
(__arm_vstrwq_p): Likewise.
(__arm_vst1q_p): Likewise.
(__arm_vld1q_z): Likewise.
* config/arm/arm_mve_builtins.def:
(vstrbq_s): Delete.
(vstrbq_u): Likewise.
(vldrbq_s): Likewise.
(vldrbq_u): Likewise.
(vstrbq_p_s): Likewise.
(vstrbq_p_u): Likewise.
(vldrbq_z_s): Likewise.
(vldrbq_z_u): Likewise.
(vld1q_u): Likewise.
(vld1q_s): Likewise.
(vldrhq_z_u): Likewise.
(vldrhq_u): Likewise.
(vldrhq_z_s): Likewise.
(vldrhq_s): Likewise.
(vld1q_f): Likewise.
(vldrhq_f): Likewise.
(vldrhq_z_f): Likewise.
(vldrwq_f): Likewise.
(vldrwq_s): Likewise.
(vldrwq_u): Likewise.
(vldrwq_z_f): Likewise.
(vldrwq_z_s): Likewise.
(vldrwq_z_u): Likewise.
(vst1q_u): Likewise.
(vst1q_s): Likewise.
(vstrhq_p_u): Likewise.
(vstrhq_u): Likewise.
(vstrhq_p_s): Likewise.
(vstrhq_s): Likewise.
(vst1q_f): Likewise.
(vstrhq_f): Likewise.
(vstrhq_p_f): Likewise.
(vstrwq_f): Likewise.
(vstrwq_s): Likewise.
(vstrwq_u): Likewise.
(vstrwq_p_f): Likewise.
(vstrwq_p_s): Likewise.
(vstrwq_p_u): Likewise.
* config/arm/iterators.md (MVE_w_narrow_TYPE): New iterator.
(MVE_w_narrow_type): New iterator.
(MVE_wide_n_TYPE): New attribute.
(MVE_wide_n_type): New attribute.
(MVE_wide_n_sz_elem): New attribute.
(MVE_wide_n_VPRED): New attribute.
(MVE_elem_ch): New attribute.
(supf): Remove VSTRBQ_S, VSTRBQ_U, VLDRBQ_S, VLDRBQ_U, VLD1Q_S,
VLD1Q_U, VLDRHQ_S, VLDRHQ_U, VLDRWQ_S, VLDRWQ_U, VST1Q_S, VST1Q_U,
VSTRHQ_S, VSTRHQ_U, VSTRWQ_S, VSTRWQ_U.
(VSTRBQ, VLDRBQ, VLD1Q, VLDRHQ, VLDRWQ, VST1Q, VSTRHQ, VSTRWQ):
Delete.
* config/arm/mve.md (mve_vstrbq_<supf><mode>): Remove.
(mve_vldrbq_<supf><mode>): Likewise.
(mve_vstrbq_p_<supf><mode>): Likewise.
(mve_vldrbq_z_<supf><mode>): Likewise.
(mve_vldrhq_fv8hf): Likewise.
(mve_vldrhq_<supf><mode>): Likewise.
(mve_vldrhq_z_fv8hf): Likewise.
(mve_vldrhq_z_<supf><mode>): Likewise.
(mve_vldrwq_fv4sf): Likewise.
(mve_vldrwq_<supf>v4si): Likewise.
(mve_vldrwq_z_fv4sf): Likewise.
(mve_vldrwq_z_<supf>v4si): Likewise.
(@mve_vld1q_f<mode>): Likewise.
(@mve_vld1q_<supf><mode>): Likewise.
(mve_vstrhq_fv8hf): Likewise.
(mve_vstrhq_p_fv8hf): Likewise.
(mve_vstrhq_p_<supf><mode>): Likewise.
(mve_vstrhq_<supf><mode>): Likewise.
(mve_vstrwq_fv4sf): Likewise.
(mve_vstrwq_p_fv4sf): Likewise.
(mve_vstrwq_p_<supf>v4si): Likewise.
(mve_vstrwq_<supf>v4si): Likewise.
(@mve_vst1q_f<mode>): Likewise.
(@mve_vst1q_<supf><mode>): Likewise.
(@mve_vstrq_<mode>): New.
(@mve_vstrq_p_<mode>): New.
(@mve_vstrq_truncate_<mode>): New.
(@mve_vstrq_p_truncate_<mode>): New.
(@mve_vldrq_<mode>): New.
(@mve_vldrq_z_<mode>): New.
(@mve_vldrq_extend_<mode><US>): New.
(@mve_vldrq_z_extend_<mode><US>): New.
* config/arm/unspecs.md:
(VSTRBQ_S): Remove.
(VSTRBQ_U): Likewise.
(VLDRBQ_S): Likewise.
(VLDRBQ_U): Likewise.
(VLD1Q_F): Likewise.
(VLD1Q_S): Likewise.
(VLD1Q_U): Likewise.
(VLDRHQ_F): Likewise.
(VLDRHQ_U): Likewise.
(VLDRHQ_S): Likewise.
(VLDRWQ_F): Likewise.
(VLDRWQ_S): Likewise.
(VLDRWQ_U): Likewise.
(VSTRHQ_F): Likewise.
(VST1Q_S): Likewise.
(VST1Q_U): Likewise.
(VSTRHQ_U): Likewise.
(VSTRWQ_S): Likewise.
(VSTRWQ_U): Likewise.
(VSTRWQ_F): Likewise.
(VST1Q_F): Likewise.
(VLDRQ): New.
(VLDRQ_Z): Likewise.
(VLDRQ_EXT): Likewise.
(VLDRQ_EXT_Z): Likewise.
(VSTRQ): Likewise.
(VSTRQ_P): Likewise.
(VSTRQ_TRUNC): Likewise.
(VSTRQ_TRUNC_P): Likewise.

gcc/testsuite/
* gcc.target/arm/pr112337.c: Call intrinsic instead of builtin.

Alfie Richards [Wed, 11 Sep 2024 12:56:28 +0000 (14:56 +0200)]

arm: [MVE intrinsics] Add support for predicated contiguous loads and stores

This patch extends
function_expander::use_contiguous_load_insn and
function_expander::use_contiguous_store_insn functions to
support predicated versions.

2024-09-11 Alfie Richards <Alfie.Richards@arm.com>
Christophe Lyon <christophe.lyon@arm.com>

gcc/

* config/arm/arm-mve-builtins.cc
(function_expander::use_contiguous_load_insn): Add support for
PRED_z.
(function_expander::use_contiguous_store_insn): Add support for
PRED_p.

Alfie Richards [Wed, 11 Sep 2024 12:55:24 +0000 (14:55 +0200)]

arm: [MVE intrinsics] Add load_extending and store_truncating function bases

This patch adds the load_extending and store_truncating function bases
for MVE intrinsics.

The constructors have parameters describing the memory element
type/width which is part of the function base name (e.g. "h" in
vldrhq).

2024-09-11 Alfie Richards <Alfie.Richards@arm.com>

gcc/

* config/arm/arm-mve-builtins-functions.h
(load_extending): New class.
(store_truncating): New class.
* config/arm/arm-protos.h (arm_mve_data_mode): New helper function.
* config/arm/arm.cc (arm_mve_data_mode): New helper function.

Alfie Richards [Wed, 11 Sep 2024 10:32:06 +0000 (12:32 +0200)]

arm: [MVE intrinsics] Add load_ext intrinsic shape

This patch adds the extending load shape.
It also adds/fixes comments for the load and store shapes.

2024-09-11 Alfie Richards <Alfie.Richards@arm.com>
Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc:
(load_ext): New.
* config/arm/arm-mve-builtins-shapes.h:
(load_ext): New.

Alfie Richards [Wed, 11 Sep 2024 16:02:01 +0000 (18:02 +0200)]

arm: [MVE intrinsics] fix vst tests

The tests for vst* intrinsics use functions which return a void
expression, which can generate a warning. This hasn't come up previously
as the inlining presumably prevents the warning.

This change removes the unnecessary and incorrect returns.

2024-09-11 Alfie Richards <Alfie.Richards@arm.com>

gcc/testsuite/
* gcc.target/arm/mve/intrinsics/vst1q_p_f16.c: Remove `return`.
* gcc.target/arm/mve/intrinsics/vst1q_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst2q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst2q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst2q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst2q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst2q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst2q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst2q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst2q_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s8.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u8.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s8.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u8.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_p_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_p_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_p_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_p_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_p_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_p_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_f16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_f16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_s16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_u16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_s16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_u16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_f16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_f16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_s16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_u16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_s16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_u16.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrhq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_u32.c: Likewise.

Jakub Jelinek [Tue, 29 Oct 2024 08:06:25 +0000 (09:06 +0100)]

c: Add __builtin_stdc_rotate_{left,right} builtins [PR117030]

I believe the new C2Y <stdbit.h> type-generic functions
stdc_rotate_{left,right} have the same problems the other stdc_*
type-generic functions had.  If we want to support arbitrary
unsigned _BitInt(N), don't want to use statement expressions
(so that one can actually use them in static variable initializers),
don't want to evaluate the arguments multiple times and don't want
to expand the arguments multiple times during preprocessing to avoid the
old tgmath preprocessing bloat, we need a built-in for those.

The following patch adds those.  And as we need to support rotations by 0
and tree-ssa-forwprop.cc is only able to pattern recognize with BIT_AND_EXPR
for that case (i.e. for power of two widths), the patch just constructs
LROTATE_EXPR/RROTATE_EXPR right away.  Negative second arguments are
considered UB, while positive ones are modulo precision.
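
The semantics described above (rotation by 0 allowed, counts reduced modulo the precision) can be sketched portably. This is an illustration in C++14, not the implementation of the built-ins: `rotate_left` and `rotate_right` are hypothetical helpers, and they take an unsigned count, so the negative-count case does not arise.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Rotate VALUE left by COUNT bits; COUNT is reduced modulo the width.
template <typename T>
constexpr T rotate_left (T value, unsigned count)
{
  static_assert (std::is_unsigned<T>::value, "unsigned types only");
  const unsigned width = std::numeric_limits<T>::digits;
  const unsigned n = count % width;
  // n == 0 is handled separately: shifting by the full width would be
  // undefined behavior.
  return n == 0 ? value
                : static_cast<T> ((value << n) | (value >> (width - n)));
}

// Rotate VALUE right by COUNT bits; COUNT is reduced modulo the width.
template <typename T>
constexpr T rotate_right (T value, unsigned count)
{
  static_assert (std::is_unsigned<T>::value, "unsigned types only");
  const unsigned width = std::numeric_limits<T>::digits;
  const unsigned n = count % width;
  return n == 0 ? value
                : static_cast<T> ((value >> n) | (value << (width - n)));
}
```

The explicit `n == 0` branch is the part that a plain shift-and-or idiom gets wrong, which is why the message above singles out rotations by 0.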

2024-10-29  Jakub Jelinek  <jakub@redhat.com>

PR c/117030
gcc/
* doc/extend.texi (__builtin_stdc_rotate_left,
__builtin_stdc_rotate_right): Document.
gcc/c-family/
* c-common.cc (c_common_reswords): Add __builtin_stdc_rotate_left
and __builtin_stdc_rotate_right.
* c-ubsan.cc (ubsan_instrument_shift): For {L,R}ROTATE_EXPR
just check if op1 is negative.
gcc/c/
* c-parser.cc: Include asan.h and c-family/c-ubsan.h.
(c_parser_postfix_expression): Handle __builtin_stdc_rotate_left
and __builtin_stdc_rotate_right.
* c-fold.cc (c_fully_fold_internal): Handle LROTATE_EXPR and
RROTATE_EXPR.
gcc/testsuite/
* gcc.dg/builtin-stdc-rotate-1.c: New test.
* gcc.dg/builtin-stdc-rotate-2.c: New test.
* gcc.dg/ubsan/builtin-stdc-rotate-1.c: New test.
* gcc.dg/ubsan/builtin-stdc-rotate-2.c: New test.

GCC Administrator [Tue, 29 Oct 2024 00:18:25 +0000 (00:18 +0000)]

Daily bump.

David Malcolm [Mon, 28 Oct 2024 22:43:11 +0000 (18:43 -0400)]

testsuite: drop the "test-" prefix from sarif-output python scripts

Drop the "test-" prefix from the various gcc.dg/sarif-output/test-*.py
scripts so that the scripts are close to the .c files they are used by
when the files are sorted by name.

gcc/testsuite/ChangeLog:
* gcc.dg/sarif-output/test-bad-pragma.py: Rename to...
* gcc.dg/sarif-output/bad-pragma.py: ...this.
* gcc.dg/sarif-output/bad-pragma.c: Update for script renaming.
* gcc.dg/sarif-output/test-include-chain-1.py: Rename to...
* gcc.dg/sarif-output/include-chain-1.py: ...this.
* gcc.dg/sarif-output/include-chain-1.c: Update for script renaming.
* gcc.dg/sarif-output/test-include-chain-2.py: Rename to...
* gcc.dg/sarif-output/include-chain-2.py: ...this.
* gcc.dg/sarif-output/include-chain-2.c: Update for script renaming.
* gcc.dg/sarif-output/test-missing-semicolon.py: Rename to...
* gcc.dg/sarif-output/missing-semicolon.py: ...this.
* gcc.dg/sarif-output/missing-semicolon.c: Update for script renaming.
* gcc.dg/sarif-output/test-no-diagnostics.py: Rename to...
* gcc.dg/sarif-output/no-diagnostics.py: ...this.
* gcc.dg/sarif-output/no-diagnostics.c: Update for script renaming.
* gcc.dg/sarif-output/test-werror.py: Rename to...
* gcc.dg/sarif-output/werror.py: ...this.
* gcc.dg/sarif-output/werror.c: Update for script renaming.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Andrew Pinski [Mon, 28 Oct 2024 20:29:58 +0000 (13:29 -0700)]

testcase: Add testcase for PR 117330 [PR117330]

This testcase was causing an ICE during vectorization
due to r15-4695-gd17e672ce82e69 but was fixed with
r15-4713-g0942bb85fc5573.

Pushed as obvious after a quick test on x86_64-linux-gnu to
make sure the testcase passes.

PR tree-optimization/117330

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr117330-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Dimitar Dimitrov [Sun, 27 Oct 2024 07:49:49 +0000 (09:49 +0200)]

testsuite: Require atomic operations for pr47333_0

Since the test uses __sync_fetch_and_add, add a requirement for
target to support atomic operations on int and long types.

This fixes a spurious test failure on pru-unknown-elf, which lacks
atomic ops. The test still passes on x86_64-linux-gnu.

gcc/testsuite/ChangeLog:

* g++.dg/lto/pr47333_0.C: Require target that supports atomic
operations on int and long types.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

Sam James [Mon, 28 Oct 2024 18:24:14 +0000 (18:24 +0000)]

gcc: fix 'statements' comment typo

gcc/ChangeLog:

* opts-common.cc (prune_options): Fix typo.

Sam James [Mon, 21 Oct 2024 11:11:42 +0000 (12:11 +0100)]

testsuite: add testcase for fixed PR107467

PR107467 ended up being fixed by the fix for PR115110, but let's
add the testcase on top.

gcc/testsuite/ChangeLog:
PR tree-optimization/107467
PR middle-end/115110

* g++.dg/lto/pr107467_0.C: New test.

Andrew MacLeod [Mon, 28 Oct 2024 13:47:03 +0000 (09:47 -0400)]

Fix bitwise_or logic for prange.

Set non-zero only if at least one of the two operands does not contain zero.

* range-op-ptr.cc (operator_bitwise_or::fold_range): Fix logic
for setting nonzero.
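
The rule rests on a simple property of bitwise OR: x | y can be zero only when both x and y are zero. A minimal sketch, using a hypothetical `toy_range` that only tracks whether zero is a possible value (this is not the prange API):

```cpp
#include <cassert>

// Hypothetical, drastically simplified range: it records only whether
// the value zero is among the possible values.
struct toy_range
{
  bool may_be_zero;
};

// Range of (a | b): the result can be zero only if both operands can.
constexpr toy_range fold_bitwise_or (toy_range a, toy_range b)
{
  return toy_range { a.may_be_zero && b.may_be_zero };
}
```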

Kyrylo Tkachov [Mon, 28 Oct 2024 14:19:07 +0000 (15:19 +0100)]

aarch64: Use implementation namespace for vxarq_u64 immediate argument

Looks like this immediate variable was missed out when I last fixed the
namespace issues in arm_neon.h. Fixed in the obvious manner.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
* config/aarch64/arm_neon.h (vxarq_u64): Rename imm6 to __imm6.

Jonathan Wakely [Mon, 28 Oct 2024 13:05:53 +0000 (13:05 +0000)]

libstdc++: Fix tests for std::vector range operations

The commit I pushed was not the one I'd tested, so it had older versions
of the tests, with bugs that I'd already fixed locally. This commit has
the fixed tests that I'd intended to push in the first place.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/vector/bool/cons/from_range.cc: Use
dg-do run instead of compile.
(test_ranges): Use do_test instead of do_test_a for rvalue
range.
(test_constexpr): Call function template instead of just
instantiating it.
* testsuite/23_containers/vector/bool/modifiers/assign/assign_range.cc:
Use dg-do run instead of compile.
(do_test): Use same test logic for vector<bool> as for primary
template.
(test_constexpr): Call function template instead of just
instantiating it.
* testsuite/23_containers/vector/bool/modifiers/insert/append_range.cc:
Use dg-do run instead of compile.
(test_ranges): Use do_test instead of do_test_a for rvalue
range.
(test_constexpr): Call function template instead of just
instantiating it.
* testsuite/23_containers/vector/bool/modifiers/insert/insert_range.cc:
Use dg-do run instead of compile.
(do_test): Fix incorrect function arguments to match intended
results.
(test_ranges): Use do_test instead of do_test_a for rvalue
range.
(test_constexpr): Call function template instead of just
instantiating it.
* testsuite/23_containers/vector/cons/from_range.cc: Use dg-do
run instead of compile.
(test_ranges): Fix ill-formed call to do_test.
(test_constexpr): Call function template instead of just
instantiating it.
* testsuite/23_containers/vector/modifiers/append_range.cc:
Use dg-do run instead of compile.
(test_constexpr): Likewise.
* testsuite/23_containers/vector/modifiers/assign/assign_range.cc:
Use dg-do run instead of compile.
(do_test): Do not reuse input ranges.
(test_constexpr): Call function template instead of just
instantiating it.
* testsuite/23_containers/vector/modifiers/insert/insert_range.cc:
Use dg-do run instead of compile.
(do_test): Fix incorrect function arguments to match intended
results.
(test_constexpr): Call function template instead of just
instantiating it.

Jason Merrill [Tue, 17 Sep 2024 21:38:35 +0000 (17:38 -0400)]

build: update bootstrap req to C++14

We moved to a bootstrap requirement of C++11 in GCC 11, 8 years after
support was stable in GCC 4.8.

It is now 8 years since C++14 was the default mode in GCC 6 (and 9 years
since support was complete in GCC 5), and we have a few bits of optional
C++14 code in the compiler, so it seems a good time to update the bootstrap
requirement again.

The big benefit of the change is the greater constexpr power, but C++14 also
added variable templates, generic lambdas, lambda init-capture, binary
literals, and numeric literal digit separators.
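
As a quick illustration of the features listed above, here is a short sketch that needs C++14 to compile. All names in it (`kPi`, `count_set_bits`, `sum_via_lambdas`) are invented for the example and do not come from the GCC sources.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Variable template.
template <typename T>
constexpr T kPi = T (3.1415926535897932385L);

// Relaxed constexpr: local variables and loops are allowed.
constexpr unsigned count_set_bits (unsigned v)
{
  unsigned n = 0;
  while (v)
    {
      n += v & 1u;
      v >>= 1;
    }
  return n;
}

// Binary literal with a digit separator.
static_assert (count_set_bits (0b1011'0001u) == 4, "four bits are set");

inline int sum_via_lambdas ()
{
  // Generic lambda: the parameter types are deduced per call.
  auto add = [] (auto a, auto b) { return a + b; };
  // Lambda init-capture: move a unique_ptr into the closure.
  auto owned = std::make_unique<int> (1'000);
  auto get = [p = std::move (owned)] { return *p; };
  return add (get (), 0b10);  // 1000 + 2
}
```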

C++14 was feature-complete in GCC 5, and became the default in GCC 6. 5.4.0
bootstraps trunk correctly; trunk stage1 built with 5.3.0 breaks in
eh_data_format_name due to PR69995.

gcc/ChangeLog:

* doc/install.texi (Prerequisites): Update to C++14.

ChangeLog:

* configure.ac: Update requirement to C++14.
* configure: Regenerate.

Jeff Law [Mon, 28 Oct 2024 11:39:24 +0000 (05:39 -0600)]

[target/117316] Fix initializer for riscv code alignment handling

The construct used for initializing the code alignments in a recent change is
causing bootstrap problems on riscv64 as seen in the referenced bugzilla.

This patch adjusts the initializer by pushing the NULL down into each uarch
clause. Bootstrapped on riscv64, regression test in flight, but given
bootstrap is broken it seemed advisable to move this forward now.

I'm so much looking forward to the day when we have performant hardware for
bootstrap testing... Sigh.

Anyway, bootstrapped and installing on the trunk.

PR target/117316
gcc/
* config/riscv/riscv.cc (riscv_tune_param): Drop initializer.
(*_tune_info): Add initializers for code alignments.

Richard Biener [Mon, 28 Oct 2024 08:52:08 +0000 (09:52 +0100)]

tree-optimization/117307 - STMT_VINFO_SLP_VECT_ONLY mis-computation

STMT_VINFO_SLP_VECT_ONLY isn't properly computed as the union of all
group members, and when the group is later split due to duplicates,
not all sub-groups inherit the flag.

PR tree-optimization/117307
* tree-vect-data-refs.cc (vect_analyze_data_ref_accesses):
Properly compute STMT_VINFO_SLP_VECT_ONLY. Set it on all
parts of a split group.

* gcc.dg/vect/pr117307.c: New testcase.

Tobias Burnus [Mon, 28 Oct 2024 09:00:08 +0000 (10:00 +0100)]

tree-core.h (omp_clause_code): Comments regarding range checks for OMP_CLAUSE_...

gcc/ChangeLog:

* tree-core.h (enum omp_clause_code): Add comments to cross ref to
OMP_CLAUSE_DECL etc. and mark the ranges used in the range checks.

commit | commitdiff | tree

Andrew Pinski [Sun, 27 Oct 2024 20:16:22 +0000 (13:16 -0700)]

vec-lowering: Fix ABSU lowering [PR111285]

ABSU_EXPR lowering incorrectly used the result type for the new
expression, but for ABSU the result type is unsigned, and an ABSU
on an unsigned type is folded away. The fix is to use a signed type
for the expression instead.
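
A scalar model may help show why the operand type matters. This is only a sketch under the assumption that ABSU takes a signed operand and yields the unsigned absolute value; the function names are hypothetical and this is not the code in tree-vect-generic.cc:

```cpp
#include <cassert>
#include <climits>

// Hypothetical scalar model of ABSU: signed operand, unsigned result.
// Negating in the unsigned type keeps INT_MIN well defined.
constexpr unsigned absu_model (int x)
{
  return x < 0 ? 0u - static_cast<unsigned> (x) : static_cast<unsigned> (x);
}

// Model of what remains when the operand is already unsigned: there is
// no sign left to test, so the operation degenerates to the identity.
constexpr unsigned absu_on_unsigned_model (unsigned x)
{
  return x;
}

static_assert (absu_model (-5) == 5u, "negative operand");
static_assert (absu_model (INT_MIN) == static_cast<unsigned> (INT_MAX) + 1u,
	       "INT_MIN is representable in the unsigned result");
```

In this model, lowering with the unsigned type corresponds to calling absu_on_unsigned_model on the converted bits, which returns the operand unchanged rather than its absolute value.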

Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/111285

gcc/ChangeLog:

* tree-vect-generic.cc (do_unop): Use a signed type for the
operand if the operation was ABSU_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/torture/vect-absu-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Sun, 27 Oct 2024 03:37:36 +0000 (20:37 -0700)]

phiopt: Move check for maybe_undef_p slightly earlier

This moves the check for maybe_undef_p in match_simplify_replacement
slightly earlier before figuring out the true/false arg using arg0/arg1
instead.
In most cases this makes no difference to compile time; only when
there is an undef in the args is there a slight compile-time
improvement, as there is then no reason to figure out which arg
corresponds to the true/false side of the conditional.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (match_simplify_replacement): Move
check for maybe_undef_p earlier.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Richard Biener [Sat, 26 Oct 2024 12:18:37 +0000 (14:18 +0200)]

Remove code in vectorizer pattern recog relying on vec_cond{u,eq,}

With the intent to rely on vec_cond_mask and vec_cmp patterns
comparisons do not need rewriting into COND_EXPRs that eventually
combine to vec_cond{u,eq,}.

* tree-vect-patterns.cc (check_bool_pattern): For comparisons
we do nothing if we can expand them or we can't replace them
with a ? -1 : 0 condition - but the latter would require
expanding the comparison which we proved we can't. So do
nothing, i.e. do not assume vec_cond{u,eq,} will save us.

commit | commitdiff | tree

xuli [Mon, 28 Oct 2024 04:41:09 +0000 (04:41 +0000)]

RISC-V: Bugfix for vlmul_ext and vlmul_trunc with NULL return value [PR117286]

This patch fixes the following ICE:

test.c: In function 'func':
test.c:37:24: internal compiler error: Segmentation fault
   37 |     vfloat16mf2_t vc = __riscv_vlmul_trunc_v_f16m1_f16mf2(vb);
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The root cause is that vlmul_trunc has a null return value.
gimple_call <__riscv_vlmul_trunc_v_f16m1_f16mf2, NULL, vb_13>
                                                 ^^^

Passed the rv64gcv_zvfh regression test.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
PR target/117286

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Do not expand NULL return.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr117286.c: New test.

commit | commitdiff | tree

H.J. Lu [Fri, 11 Oct 2024 21:53:49 +0000 (05:53 +0800)]

gcc.target/i386/pr53533-[13].c: Adjust assembly scan

Before

1089d083117 Simplify (B * v + C) * D -> BD* v + CD when B,C,D are all INTEGER_CST.

the loop was

.L2:
movl (%rdi,%rdx), %eax
addl $12345, %eax
imull $-1564285888, %eax, %eax
leal -333519936(%rax), %eax
movl %eax, (%rsi,%rdx)
addq $4, %rdx
cmpq $1024, %rdx
jne .L2

There were 1 addl and 1 leal, plus 1 addq to update the loop counter. The
optimized loop is

.L2:
imull $-1564285888, (%rdi,%rax), %edx
subl $1269844480, %edx
movl %edx, (%rsi,%rax)
addq $4, %rax
cmpq $1024, %rax
jne .L2

1 addl is changed to subl and leal is removed. Adjust assembly scan to
check for 1 subl and 1 addl/addq as well as lea removal.
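
The equivalence of the two loop bodies can be checked in 32-bit wrap-around arithmetic. The sketch below reads the constants off the two assembly listings above; the function names are hypothetical and the source of the test itself is not shown here:

```cpp
#include <cassert>
#include <cstdint>

// Multiplier used by imull in both listings, as a 32-bit pattern.
constexpr std::uint32_t kMul = static_cast<std::uint32_t> (-1564285888);

// Before: addl $12345, imull $-1564285888, leal -333519936(%rax).
constexpr std::uint32_t before_simplify (std::uint32_t v)
{
  return (v + 12345u) * kMul - 333519936u;
}

// After: imull $-1564285888, subl $1269844480.
constexpr std::uint32_t after_simplify (std::uint32_t v)
{
  return v * kMul - 1269844480u;
}

// 12345 * -1564285888 - 333519936 is congruent to -1269844480 mod 2^32.
static_assert (before_simplify (0) == after_simplify (0), "constant term");
```

Because both forms are linear in v with the same multiplier, agreement of the constant term is enough for them to agree on every input.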

* gcc.target/i386/pr53533-1.c: Adjust assembly scan.
* gcc.target/i386/pr53533-3.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

commit | commitdiff | tree

GCC Administrator [Mon, 28 Oct 2024 00:17:33 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

Jonathan Wakely [Tue, 8 Oct 2024 20:15:18 +0000 (21:15 +0100)]

libstdc++: Add P1206R7 from_range members to std::vector [PR111055]

This is another piece of P1206R7, adding new members to std::vector and
std::vector<bool>.

The __uninitialized_copy_a extension needs to be enhanced to support
passing non-common ranges (i.e. a sentinel that is a different type from
the iterator) and move-only input iterators.

libstdc++-v3/ChangeLog:

PR libstdc++/111055
* include/bits/ranges_base.h (__container_compatible_range): New
concept.
* include/bits/stl_bvector.h (vector(from_range, R&&, const Alloc&))
(assign_range, insert_range, append_range): Define.
* include/bits/stl_uninitialized.h (__do_uninit_copy): Support
non-common ranges.
(__uninitialized_copy_a): Likewise.
* include/bits/stl_vector.h (_Vector_base::_M_append_range_to):
New function.
(_Vector_base::_M_append_range): Likewise.
(vector(from_range, R&&, const Alloc&), assign_range): Define.
(append_range): Define.
(insert_range): Declare.
* include/debug/vector (vector(from_range, R&&, const Alloc&))
(assign_range, insert_range, append_range): Define.
* include/bits/vector.tcc (insert_range): Define.
* testsuite/util/testsuite_iterators.h (input_iterator_wrapper_rval):
New class template.
* testsuite/23_containers/vector/bool/cons/from_range.cc: New test.
* testsuite/23_containers/vector/bool/modifiers/assign/assign_range.cc:
New test.
* testsuite/23_containers/vector/bool/modifiers/insert/append_range.cc:
New test.
* testsuite/23_containers/vector/bool/modifiers/insert/insert_range.cc:
New test.
* testsuite/23_containers/vector/cons/from_range.cc: New test.
* testsuite/23_containers/vector/modifiers/append_range.cc: New test.
* testsuite/23_containers/vector/modifiers/assign/assign_range.cc:
New test.
* testsuite/23_containers/vector/modifiers/insert/insert_range.cc:
New test.

Reviewed-by: Patrick Palka <ppalka@redhat.com>

commit | commitdiff | tree

Jonathan Wakely [Sat, 26 Oct 2024 20:24:58 +0000 (21:24 +0100)]

libstdc++: Fix std::vector<bool>::emplace to forward parameter

If the parameter is not lvalue-convertible to bool then the current code
will fail to compile. The parameter should be forwarded to restore the
original value category.
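
A minimal sketch of the failure mode, using a hypothetical type and helper rather than the actual stl_bvector.h code:

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that converts to bool only from an rvalue.
struct rvalue_only_flag
{
  bool value;
  explicit operator bool () && { return value; }
};

// Forwarding restores the caller's value category, so the rvalue-only
// conversion is usable.  Writing `bool (args...)` instead would name
// lvalues and would not compile for rvalue_only_flag.
template <typename... Args>
bool make_bool_forwarding (Args &&...args)
{
  return bool (std::forward<Args> (args)...);
}
```

With an empty pack the expression is `bool ()`, which value-initializes to false.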

libstdc++-v3/ChangeLog:

* include/bits/stl_bvector.h (emplace_back, emplace): Forward
parameter pack to preserve value category.
* testsuite/23_containers/vector/bool/emplace_rvalue.cc: New
test.

commit | commitdiff | tree

Fangrui Song [Sun, 27 Oct 2024 19:37:21 +0000 (12:37 -0700)]

arm: Support -mfdpic for more targets

Targets that are not arm*-*-uclinuxfdpiceabi can use -S -mfdpic, but -c
-mfdpic does not pass --fdpic to gas.  This is an unnecessary
restriction.  Just define the ASM_SPEC in bpabi.h.

Additionally, use armelf[b]_linux_fdpiceabi emulations for -mfdpic in
linux-eabi.h.  This will allow a future musl fdpic port to use the
desired BFD emulation.

gcc/ChangeLog:

* config/arm/bpabi.h (TARGET_FDPIC_ASM_SPEC): Transform -mfdpic.
* config/arm/linux-eabi.h (TARGET_FDPIC_LINKER_EMULATION): Define.
(SUBTARGET_EXTRA_LINK_SPEC): Use TARGET_FDPIC_LINKER_EMULATION
if -mfdpic.

commit | commitdiff | tree

Takayuki 'January June' Suwa [Wed, 23 Oct 2024 02:31:15 +0000 (11:31 +0900)]

xtensa: Define TARGET_DIFFERENT_ADDR_DISPLACEMENT_P target hook

In commit bc5a9dab55d13f888a3cdd150c8cf5c2244f35e0 ("gcc: xtensa: reorder
movsi_internal patterns for better code generation during LRA"), the
instruction order in "movsi_internal" MD definition was changed to make LRA
use load/store instructions with larger memory address displacements, but as
a side effect, it now uses the larger displacements (i.e., the larger
instructions) even outside of reload operations.

The underlying problem is that LRA assumes by default that there is only one
maximal legitimate displacement for the same address structure, meaning that
it has no choice but to use the first load/store instruction it finds.

To fix this, define TARGET_DIFFERENT_ADDR_DISPLACEMENT_P hook to always
return true.

gcc/ChangeLog:

* config/xtensa/xtensa.cc (TARGET_DIFFERENT_ADDR_DISPLACEMENT_P):
Add new target hook to always return true.
* config/xtensa/xtensa.md (movsi_internal):
Revert the previous changes.

commit | commitdiff | tree

Jakub Jelinek [Sun, 27 Oct 2024 15:44:35 +0000 (16:44 +0100)]

genmatch: Add selftests to genmatch for diag_vfprintf

The following patch adds selftests to genmatch to verify the new printing
routine there.
So that I can rely on HAVE_DECL_FMEMOPEN (host test), the tests are done
solely in stage2+ where we link the host libcpp etc. to genmatch.
The tests have been adjusted from pretty-print.cc (test_pp_format),
and I've added to that function two new tests because I've noticed nothing
was testing the %M$.*N$s etc. format specifiers.
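
For readers unfamiliar with this specifier shape, POSIX printf has an analogous form in which both the argument and its precision are selected by position. The sketch below uses POSIX snprintf, not GCC's pretty-printer or the new genmatch routine, and the helper name is hypothetical:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: print at most n characters of s.
// "%1$.*2$s" takes the string from argument 1 and the precision from
// argument 2 (a POSIX extension, not ISO C).
inline std::string format_prefix (const char *s, int n)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "%1$.*2$s", s, n);
  return buf;
}
```

The fixed-precision variant, analogous to %M$.Ns, would be written "%1$.3s" in the same notation.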

2024-10-27 Jakub Jelinek <jakub@redhat.com>

* configure.ac (gcc_AC_CHECK_DECLS): Add fmemopen.
* configure: Regenerate.
* config.in: Regenerate.
* Makefile.in (build/genmatch.o): Add -DGENMATCH_SELFTESTS to
BUILD_CPPFLAGS for stage2+ genmatch.
* genmatch.cc (test_diag_vfprintf, genmatch_diag_selftests): New
functions.
(main): Call genmatch_diag_selftests.
* pretty-print.cc (test_pp_format): Add two tests, one for %M$.*N$s
and one for %M$.Ns.

commit | commitdiff | tree

Jakub Jelinek [Sun, 27 Oct 2024 15:42:53 +0000 (16:42 +0100)]

c-family: -Wleading-whitespace= argument spelling

On Thu, Oct 24, 2024 at 03:33:25PM -0400, Eric Gallager wrote:
> On Thu, Oct 24, 2024 at 4:17 AM Jakub Jelinek <jakub@redhat.com> wrote:
> > I've tried to build stage3 with
> > -Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
>
> So wait, it's "blanks" (plural) when it's leading, but "blank"
> (singular) when it's trailing? That inconsistency bothers me...

I've mentioned it already in
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664664.html
Citing that here:
    Not sure about the kinds for the option, given -Wleading-whitespace=
    uses plural and this option singular and -Wleading-whitespace= spaces
    means literally just ' ' characters, while space in
    -Wtrailing-whitespace= was ' ', '\t', '\v' and '\f'; so category;
    perhaps just use any and blanks?
Other preferences?

Here is a patch to do the blank->blanks and space->any changes.

2024-10-27  Jakub Jelinek  <jakub@redhat.com>

gcc/
* doc/invoke.texi (Wtrailing-whitespace=): Change
blank argument to blanks and space argument to any.
gcc/c-family/
* c.opt (warn_trailing_whitespace_kind): Change blank
to blanks and space to any.
gcc/testsuite/
* c-c++-common/cpp/Wtrailing-whitespace-2.c: Use
-Wtrailing-whitespace=blanks rather than -Wtrailing-whitespace=blank.
* c-c++-common/cpp/Wtrailing-whitespace-3.c: Use
-Wtrailing-whitespace=any rather than -Wtrailing-whitespace=space.
* c-c++-common/cpp/Wtrailing-whitespace-7.c: Use
-Wtrailing-whitespace=blanks rather than -Wtrailing-whitespace=blank.
* c-c++-common/cpp/Wtrailing-whitespace-8.c: Use
-Wtrailing-whitespace=any rather than -Wtrailing-whitespace=space.

commit | commitdiff | tree

Jakub Jelinek [Sun, 27 Oct 2024 15:41:28 +0000 (16:41 +0100)]

testsuite: Fix up gcc.dg/vec-perm-lower.c test

On Tue, Oct 15, 2024 at 12:45:35PM +0000, Tamar Christina wrote:
> I'll write a gimple one and commit with this then.

The new test FAILs on i686-linux, with the usual

FAIL: gcc.dg/vec-perm-lower.c (test for excess errors)
Excess errors:
.../gcc/testsuite/gcc.dg/vec-perm-lower.c:9:1: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi]
.../gcc/testsuite/gcc.dg/vec-perm-lower.c:8:1: warning: MMX vector argument without MMX enabled changes the ABI [-Wpsabi]

The following patch fixes that.
Tested on x86_64-linux with
make check-gcc RUNTESTFLAGS='--target_board=unix/\{-m32,-m32/-mno-sse/-mno-mmx,-m64\} dg.exp=vec-perm-lower.c'
which previously FAILed, now PASSes, ok for trunk?

2024-10-27 Jakub Jelinek <jakub@redhat.com>

* gcc.dg/vec-perm-lower.c: Add -Wno-psabi to dg-options.

commit | commitdiff | tree

Paul Thomas [Sun, 27 Oct 2024 12:40:42 +0000 (12:40 +0000)]

Fortran: Fix regressions with intent(out) class [PR115070, PR115348].

2024-10-27 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/115070
PR fortran/115348
* trans-expr.cc (gfc_trans_class_init_assign): If all the
components of the default initializer are null for a scalar,
build an empty statement to prevent prior declarations from
disappearing.

gcc/testsuite/
PR fortran/115070
* gfortran.dg/pr115070.f90: New test.

PR fortran/115348
* gfortran.dg/pr115348.f90: New test.

commit | commitdiff | tree

Torbjörn SVENSSON [Tue, 3 Sep 2024 09:23:57 +0000 (11:23 +0200)]

testsuite: Sanitize pacbti test cases for Cortex-M

Some of the test cases were scanning for "bti", but that would
incorrectly match the ".arch_extension pacbti" directive.
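
The false match can be reproduced with any unanchored regular-expression search. The sketch below uses std::regex on made-up assembly text; the helper name and the patterns are assumptions for illustration, not the directives used in the test cases:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: does the pattern occur anywhere in the assembly?
inline bool asm_matches (const std::string &asm_text,
			 const std::string &pattern)
{
  return std::regex_search (asm_text, std::regex (pattern));
}
```

For the text "\t.arch_extension pacbti\n", the pattern "bti" matches inside "pacbti", while "\\tbti" does not, because the only tab is followed by the directive's leading dot. Requiring the leading tab therefore restricts the match to an actual instruction line.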

gcc/testsuite/ChangeLog:

* gcc.target/arm/bti-1.c: Check for asm instructions starting
with a tab.
* gcc.target/arm/bti-2.c: Likewise.
* gcc.target/arm/pac-1.c: Likewise.
* gcc.target/arm/pac-2.c: Likewise.
* gcc.target/arm/pac-3.c: Likewise.
* gcc.target/arm/pac-4.c: Likewise.
* gcc.target/arm/pac-6.c: Likewise.
* gcc.target/arm/pac-7.c: Likewise.
* gcc.target/arm/pac-8.c: Likewise.
* gcc.target/arm/pac-9.c: Likewise.
* gcc.target/arm/pac-10.c: Likewise.
* gcc.target/arm/pac-11.c: Likewise.
* gcc.target/arm/pac-15.c: Likewise.
* gcc.target/arm/pac-sibcall.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>

commit | commitdiff | tree

GCC Administrator [Sun, 27 Oct 2024 00:17:28 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

Iain Sandoe [Sat, 26 Oct 2024 22:06:09 +0000 (23:06 +0100)]

doc, fortran: Add a missing menu item.

The changes in r15-4697-g4727bfb37701 omit a menu entry, which causes a
bootstrap failure when Fortran is included, at least with makeinfo 6.7.
Fixed thus.

gcc/fortran/ChangeLog:

* intrinsic.texi: Add menu item for UINT.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

commit | commitdiff | tree

Andrew Pinski [Sat, 26 Oct 2024 09:14:18 +0000 (02:14 -0700)]

tree: Mark PAREN_EXPR and VEC_DUPLICATE_EXPR as non-trapping [PR117234]

While looking to fix a possible trapping issue in PHI-OPT's factor,
I noticed that some tree codes could be marked as trapping even
though they don't have a possibility to trap. In the case of PAREN_EXPR,
it is basically a nop except when it comes to association across it so
it can't trap.
In the case of VEC_DUPLICATE_EXPR, it is similar to a CONSTRUCTOR, so it
can't trap.

This fixes those 2 issues and adds 4 testcases, 2 of which are specific to aarch64
since the only way to get a VEC_DUPLICATE_EXPR is to use intrinsics currently.

Built and tested for aarch64-linux-gnu.

PR tree-optimization/117234

gcc/ChangeLog:

* tree-eh.cc (operation_could_trap_helper_p): Treat
PAREN_EXPR and VEC_DUPLICATE_EXPR like constructing
expressions.

gcc/testsuite/ChangeLog:

* g++.dg/eh/noncall-fp-1.C: New test.
* g++.target/aarch64/sve/noncall-eh-fp-1.C: New test.
* gcc.dg/tree-ssa/trapping-1.c: New test.
* gcc.target/aarch64/sve/trapping-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Thomas Koenig [Sat, 26 Oct 2024 17:20:14 +0000 (19:20 +0200)]

Add UNSIGNED for intrinsics.

gcc/fortran/ChangeLog:

* gfortran.texi: Correct reference to make clear that UNSIGNED
will not be part of F202Y.
Other clarifications.
Extend table of intrinsics, add links.
* intrinsic.texi: Add descriptions for UNSIGNED arguments.
* invoke.texi: Add anchor for -funsigned.

commit | commitdiff | tree

Eric Botcazou [Sat, 26 Oct 2024 13:16:57 +0000 (15:16 +0200)]

Fix old glitch in the GNAT Reference Manual

gcc/ada
PR ada/62122
* doc/gnat_rm/implementation_defined_attributes.rst
(Unrestricted_Access): Remove null exclusion.
* gnat_rm.texi: Regenerate.

commit | commitdiff | tree

Richard Biener [Fri, 25 Oct 2024 12:27:37 +0000 (14:27 +0200)]

Assert finished vectorizer pattern COND_EXPR transition

The following places a few strategic asserts so we do not end up
with COND_EXPRs with a comparison as the first operand during
vectorization.

* tree-vect-slp.cc (vect_get_operand_map): Mark
COMPARISON_CLASS_P COND_EXPR condition path unreachable.
* tree-vect-stmts.cc (vect_is_simple_use): Likewise.
(vectorizable_condition): Assert the COND_EXPR condition isn't
COMPARISON_CLASS_P.

commit | commitdiff | tree

Richard Biener [Fri, 25 Oct 2024 12:20:23 +0000 (14:20 +0200)]

Finish vectorizer pattern proper COND_EXPR transition

This fixes up vect_recog_ctz_ffs_pattern.

* tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern): Create
a separate pattern stmt for the comparison in the generated
COND_EXPR.

commit | commitdiff | tree

Richard Biener [Fri, 25 Oct 2024 11:42:08 +0000 (13:42 +0200)]

Finish vectorizer pattern proper COND_EXPR transition

The following tries to finish building proper GIMPLE COND_EXPRs
in vectorizer pattern recognition.

* tree-vect-patterns.cc (vect_recog_divmod_pattern): Build
separate comparison pattern for the condition of a COND_EXPR
pattern.

Mirror of https://gcc.gnu.org/git/gcc.git