git.ipfire.org Git - thirdparty/gcc.git/log
Richard Biener [Mon, 9 Dec 2024 13:09:12 +0000 (14:09 +0100)] 
Assign separate timevar to duplicate computed goto pass

It currently shares the timevar with bb-reorder but can use significant
memory and compile-time on its own.

* timevar.def (TV_DUP_COMPGOTO): Add.
* bb-reorder.cc (pass_data_duplicate_computed_gotos): Use
TV_DUP_COMPGOTO.

Juergen Christ [Fri, 6 Dec 2024 17:52:36 +0000 (18:52 +0100)] 
s390: Fix UNSPEC_CC_TO_INT canonicalization

Canonicalization of comparisons for UNSPEC_CC_TO_INT missed one case
causing unnecessarily complex code.  This especially seems to hit the
Linux kernel.

gcc/ChangeLog:

* config/s390/s390.cc (s390_canonicalize_comparison): Add
missing UNSPEC_CC_TO_INT case.

gcc/testsuite/ChangeLog:

* gcc.target/s390/ccusage.c: New test.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
Matthew Malcomson [Mon, 7 Oct 2024 15:42:41 +0000 (16:42 +0100)] 
c++: Allow overloaded builtins to be used in SFINAE context

This commit introduces the ability to use overloaded builtins in a C++
SFINAE context.

The goal is to give libstdc++ a single mechanism for determining
whether a given type can be used with the atomic fetch_add (and
similar) builtins.  I am working on another patch that hopes to use
this mechanism to identify whether fetch_add (and similar) work on
floating point types.

Current state of the world:

    GCC currently exposes resolved versions of these builtins to the
    user, so with GCC it is possible to use a test like the one below
    to check for atomic loads on a 2-byte object.
      #if __has_builtin(__atomic_load_2)
    Clang does not expose resolved versions of the atomic builtins.

    Clang currently allows SFINAE on builtins, so that C++ code can
    check whether a builtin is available on a given type.
    GCC does not (and that is what this patch aims to change).

    C libraries like libatomic can check whether a given atomic builtin
    can work on a given type by using autoconf to check for a
    compilation failure when attempting such a use.

My goal:
    I would like to enable floating point fetch_add (and similar) in
    GCC, in order to use those overloads in the libstdc++ implementation
    of atomic<float>::fetch_add.
    This should allow compilers targeting GPUs which have floating
    point fetch_add instructions to emit optimal code.

    In order to do that I need some consistent mechanism that libstdc++
    can use to identify whether the fetch_add builtins have floating
    point overloads (and for which types these exist).

    I would hence like to enable SFINAE on builtins, so that libstdc++
    can use that mechanism for the floating point fetch_add builtins.
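A minimal sketch of the kind of detection this would allow.  The names
probe_fetch_add, can_fetch_add and NotAtomicable are hypothetical and
not taken from libstdc++.  The negative check is guarded on the
assumption that the change ships in GCC 15; older GCC versions are
expected to report a hard error there instead of a substitution
failure.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical probe: this overload is viable only when the builtin
// call is well-formed for T.
template <typename T>
auto probe_fetch_add(int)
    -> decltype(__atomic_fetch_add(std::declval<T*>(), std::declval<T>(),
                                   __ATOMIC_SEQ_CST),
                std::true_type{});

// Fallback chosen when substitution into the overload above fails.
template <typename T>
std::false_type probe_fetch_add(...);

// Hypothetical trait: std::true_type if __atomic_fetch_add accepts T.
template <typename T>
using can_fetch_add = decltype(probe_fetch_add<T>(0));

// A class type, which the atomic fetch_add builtins do not accept.
struct NotAtomicable { int a; int b; int c; };

// Integer overloads exist, so this holds with or without the patch.
static_assert(can_fetch_add<int>::value, "int supports fetch_add");

// Assumption: SFINAE on builtins is available in Clang and in GCC 15+.
#if defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 15)
static_assert(!can_fetch_add<NotAtomicable>::value,
              "class types are rejected without a hard error");
#endif
```

The same trait could then be queried with float or double once floating
point overloads exist, which is the use case described above.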

The implementation follows the existing mechanism for handling SFINAE
contexts in c-common.cc.  A boolean is passed into the c-common.cc
functions indicating whether they should emit errors or not.
This boolean comes from `complain & tf_error` in the C++ frontend
(similar to other functions like valid_array_size_p and
c_build_vec_perm_expr).
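
The pattern can be sketched roughly as follows.  This is an
illustration only, not the GCC code: the enum values, validate_nargs
and check_call are hypothetical stand-ins, and only the names
`complain` and `tf_error` come from the description above.

```cpp
#include <cstdio>

// Hypothetical stand-in for the C++ frontend's substitution flags.
enum tsubst_flags : unsigned { tf_none = 0, tf_error = 1u << 0 };

// Hypothetical shared checker: it always returns the result, but only
// prints a diagnostic when the caller asked for errors.
bool validate_nargs(int given, int required, bool complain) {
    if (given == required)
        return true;
    if (complain)
        std::fprintf(stderr, "error: expected %d arguments, got %d\n",
                     required, given);
    return false;
}

// Hypothetical frontend caller: reduce the flag set to one boolean.
bool check_call(int given, int required, unsigned complain) {
    return validate_nargs(given, required, (complain & tf_error) != 0);
}
```

In a SFINAE context the caller would pass flags without tf_error, so a
failed check becomes a quiet substitution failure rather than a
diagnostic.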

This is done both for resolve_overloaded_builtin and
check_builtin_function_arguments, both of which can be used in SFINAE
contexts.
    I attempted to trigger something using the `reject_gcc_builtin`
    function in an SFINAE context.  Given the context where this
    function is called from the C++ frontend it looks like it may be
    possible, but I did not manage to trigger this in template context
    by attempting to do something similar to the testcases added around
    those calls.
    - I would appreciate any feedback on whether this is something that
      can happen in a template context, and if so some help writing a
      relevant testcase for it.

Both of these functions have target hooks for target specific builtins
that I have updated to take the extra boolean flag.  I have not adjusted
the functions implementing those target hooks (except to update the
declarations) so target specific builtins will still error in SFINAE
contexts.
- I could imagine not updating the target hook definition since nothing
  would use that change.  However I figure that allowing targets to
  decide this behaviour would be the right thing to do eventually, and
  since this is the target-independent part of the change to do that
  this patch should make that change.
  Could adjust if others disagree.

Other relevant points that I'd appreciate reviewers check:
- I did not pass this new flag through
  atomic_bitint_fetch_using_cas_loop since the _BitInt type is not
  available in the C++ frontend and I didn't want `if` conditions in
  the source that can never be executed.
- I only test non-compile-time-constant types with SVE types, since I do
  not know of a way to get a VLA into a SFINAE context.
- While writing tests I noticed a few differences with clang in this
  area.  I don't think they are problematic, but I am mentioning them
  for completeness and to allow others to judge whether they are a
  problem.
  - atomic_fetch_add on a boolean is allowed by clang.
  - When __atomic_load is passed an invalid memory model (i.e. too
    large), we give an SFINAE failure while clang does not.

Bootstrap and regression tested on AArch64 and x86_64.
Built first stage on targets whose target hook declarations needed
updating (though I did not regtest them).  Target triplets I built in
order to check the backend-specific changes I made:
   - arm-none-linux-gnueabihf
   - avr-linux-gnu
   - riscv-linux-gnu
   - powerpc-linux-gnu
   - s390x-linux-gnu

Ok for commit to trunk?

gcc/c-family/ChangeLog:

* c-common.cc (builtin_function_validate_nargs,
check_builtin_function_arguments,
speculation_safe_value_resolve_call,
speculation_safe_value_resolve_params, sync_resolve_size,
sync_resolve_params, get_atomic_generic_size,
resolve_overloaded_atomic_exchange,
resolve_overloaded_atomic_compare_exchange,
resolve_overloaded_atomic_load, resolve_overloaded_atomic_store,
resolve_overloaded_builtin):  Add `complain` boolean parameter
and determine whether to emit errors based on its value.
* c-common.h (check_builtin_function_arguments,
resolve_overloaded_builtin):  Mention `complain` boolean
parameter in declarations.  Give it a default of `true`.

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc
(aarch64_resolve_overloaded_builtin,aarch64_check_builtin_call):
Add new unused boolean parameter to match target hook
definition.
* config/arm/arm-builtins.cc (arm_check_builtin_call): Likewise.
* config/arm/arm-c.cc (arm_resolve_overloaded_builtin):
Likewise.
* config/arm/arm-protos.h (arm_check_builtin_call): Likewise.
* config/avr/avr-c.cc (avr_resolve_overloaded_builtin):
Likewise.
* config/riscv/riscv-c.cc (riscv_check_builtin_call,
riscv_resolve_overloaded_builtin): Likewise.
* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
Likewise.
* config/rs6000/rs6000-protos.h (altivec_resolve_overloaded_builtin):
Likewise.
* config/s390/s390-c.cc (s390_resolve_overloaded_builtin):
Likewise.
* doc/tm.texi: Regenerate.
* target.def (TARGET_RESOLVE_OVERLOADED_BUILTIN,
TARGET_CHECK_BUILTIN_CALL): Update prototype to include a
boolean parameter that indicates whether errors should be
emitted.  Update documentation to mention this fact.

gcc/cp/ChangeLog:

* call.cc (build_cxx_call):  Pass `complain` parameter to
check_builtin_function_arguments.  Take its value from the
`tsubst_flags_t` type `complain & tf_error`.
* semantics.cc (finish_call_expr):  Pass `complain` parameter to
resolve_overloaded_builtin.  Take its value from the
`tsubst_flags_t` type `complain & tf_error`.

gcc/testsuite/ChangeLog:

* g++.dg/template/builtin-atomic-overloads.def: New test.
* g++.dg/template/builtin-atomic-overloads1.C: New test.
* g++.dg/template/builtin-atomic-overloads2.C: New test.
* g++.dg/template/builtin-atomic-overloads3.C: New test.
* g++.dg/template/builtin-atomic-overloads4.C: New test.
* g++.dg/template/builtin-atomic-overloads5.C: New test.
* g++.dg/template/builtin-atomic-overloads6.C: New test.
* g++.dg/template/builtin-atomic-overloads7.C: New test.
* g++.dg/template/builtin-atomic-overloads8.C: New test.
* g++.dg/template/builtin-sfinae-check-function-arguments.C: New test.
* g++.dg/template/builtin-speculation-overloads.def: New test.
* g++.dg/template/builtin-speculation-overloads1.C: New test.
* g++.dg/template/builtin-speculation-overloads2.C: New test.
* g++.dg/template/builtin-speculation-overloads3.C: New test.
* g++.dg/template/builtin-speculation-overloads4.C: New test.
* g++.dg/template/builtin-speculation-overloads5.C: New test.
* g++.dg/template/builtin-validate-nargs.C: New test.

Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
Gaius Mulley [Mon, 9 Dec 2024 13:56:37 +0000 (13:56 +0000)] 
PR modula2/115328: use enable forward bool and set default true

This patch introduces GetEnableForward and SetEnableForward, against
which the forward procedure declaration feature is checked.
The setting currently defaults to true.

gcc/m2/ChangeLog:

PR modula2/115328
* gm2-compiler/M2Options.def (GetEnableForward): New procedure
function.
(SetEnableForward): New procedure.
* gm2-compiler/M2Options.mod (GetEnableForward): New procedure
function.
(SetEnableForward): New procedure.
(EnableForward): New boolean.
* gm2-compiler/P1SymBuild.mod (EndBuildForward): Check
GetEnableForward and generate an error message if false.

gcc/testsuite/ChangeLog:

PR modula2/115328
* gm2/pim/fail/forward.mod: Move to...
* gm2/pim/pass/forward.mod: ...here.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
Jakub Jelinek [Mon, 9 Dec 2024 13:17:39 +0000 (14:17 +0100)] 
docs: Clarify -fsanitize=hwaddress target support [PR117960]

Since GCC 13, -fsanitize=hwaddress is supported not only on AArch64 but
also on x86_64 (but only with -mlam=u48 or -mlam=u57).

2024-12-09  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/117960
* doc/invoke.texi (fsanitize=hwaddress): Clarify on which targets
it is supported.

Heiko Eißfeldt [Mon, 9 Dec 2024 09:39:50 +0000 (10:39 +0100)] 
replace atoi with strtoul in c_parser_gimple_parse_bb_spec [PR114541]

The full treatment of these invalid values was considered out of
scope for this patch.

PR c/114541
* gimple-parser.cc (c_parser_gimple_parse_bb_spec):
Use strtoul with ERANGE check instead of atoi to avoid UB
and detect invalid __BB#.
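
A minimal sketch of the strtoul-with-ERANGE technique named above.  It
is not the gimple-parser.cc code: parse_bb_index and its signature are
hypothetical, and the extra checks (leading digit, trailing characters,
fit in unsigned) are assumptions about what counts as a valid index.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: parse the decimal digits that follow a "__BB"
// prefix.  Returns true and stores the value only for a valid index.
bool parse_bb_index(const char *digits, unsigned *index_out) {
    // strtoul accepts leading whitespace and signs; reject those here.
    if (digits[0] < '0' || digits[0] > '9')
        return false;

    char *end = nullptr;
    errno = 0;  // strtoul only sets errno on failure, so clear it first
    unsigned long value = std::strtoul(digits, &end, 10);

    if (*end != '\0')
        return false;  // trailing characters after the number
    if (errno == ERANGE || value > UINT_MAX)
        return false;  // does not fit, where atoi would be undefined

    *index_out = static_cast<unsigned>(value);
    return true;
}
```

Unlike atoi, whose behaviour is undefined when the value is out of
range, strtoul reports overflow through errno, so the caller can reject
the input cleanly.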

Signed-off-by: Heiko Eißfeldt <heiko@hexco.de>
Richard Earnshaw [Fri, 6 Dec 2024 12:57:52 +0000 (12:57 +0000)] 
arm: remove obsolete vcond expanders

The vcond{,u} expander patterns have been declared obsolete.  Remove
them from the Arm backend.

gcc/ChangeLog:

PR target/114189
* config/arm/arm-protos.h (arm_expand_vcond): Delete prototype.
* config/arm/arm.cc (arm_expand_vcond): Delete function.
* config/arm/vec-common.md (vcond<mode><mode>): Delete pattern.
(vcond<V_cvtto><mode>): Likewise.
(vcond<VH_cvtto><mode>): Likewise.
(vcondu<mode><v_cmp_result>): Likewise.

Pan Li [Sun, 8 Dec 2024 01:32:30 +0000 (09:32 +0800)] 
RISC-V: Refine signed SAT_TRUNC testcase dump check to tree optimized

The saturating ALU testcases check the RTL dump for the presence of a
standard name such as .SAT_TRUNC.  However, the RTL expand dump is
sensitive to middle-end changes and to debug information; for example,
the following output appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_TRUNC (x_3(D), y_4(D)); [tail call]

Each such change forces the dump check counts to be adjusted again.
This patch switches the standard name check to the tree optimized
dump, which is somewhat more stable.
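
A rough sketch of what such a check might look like.  The function, the
options and the expected count are hypothetical and not copied from the
actual tests, and whether this exact form is recognised as .SAT_TRUNC
depends on the compiler version.  Only the directive names (dg-do,
dg-options, dg-final, scan-tree-dump-times) and -fdump-tree-optimized
are existing testsuite and compiler features.

```cpp
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

#include <cstdint>

// Hypothetical signed saturating truncation from 16 to 8 bits: clamp
// to the int8_t range instead of wrapping.
int8_t sat_s_trunc_i16_to_i8(int16_t x) {
    if (x > INT8_MAX)
        return INT8_MAX;
    if (x < INT8_MIN)
        return INT8_MIN;
    return static_cast<int8_t>(x);
}

// Count matches in the tree "optimized" dump rather than an RTL dump.
/* { dg-final { scan-tree-dump-times ".SAT_TRUNC " 1 "optimized" } } */
```

Scanning the tree dump keeps the expected count independent of extra
lines that the RTL expand dump may print.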

The following test suite passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_trunc-1-i16-to-i8.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_s_trunc-1-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-1-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-1-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-1-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-1-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i64-to-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
Pan Li [Sun, 8 Dec 2024 01:32:29 +0000 (09:32 +0800)] 
RISC-V: Refine signed SAT_SUB testcase dump check to tree optimized

The saturating ALU testcases check the RTL dump for the presence of a
standard name such as .SAT_SUB.  However, the RTL expand dump is
sensitive to middle-end changes and to debug information; for example,
the following output appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_SUB (x_3(D), y_4(D)); [tail call]

Each such change forces the dump check counts to be adjusted again.
This patch switches the standard name check to the tree optimized
dump, which is somewhat more stable.

The following test suite passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_sub-1-i16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_s_sub-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-4-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
Pan Li [Sun, 8 Dec 2024 01:32:28 +0000 (09:32 +0800)] 
RISC-V: Refine signed SAT_ADD testcase dump check to tree optimized

The saturating ALU testcases check the RTL dump for the presence of a
standard name such as .SAT_ADD.  However, the RTL expand dump is
sensitive to middle-end changes and to debug information; for example,
the following output appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_ADD (x_3(D), y_4(D)); [tail call]

Each such change forces the dump check counts to be adjusted again.
This patch switches the standard name check to the tree optimized
dump, which is somewhat more stable.

The following test suite passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_add-1-i16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_s_add-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-4-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-1-1.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-1.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-2-1.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-2.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-3-1.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-3.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-4.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
Pan Li [Sun, 8 Dec 2024 01:32:27 +0000 (09:32 +0800)] 
RISC-V: Refine unsigned SAT_TRUNC testcase dump check to tree optimized

The saturating ALU testcases check the RTL dump for the presence of a
standard name such as .SAT_TRUNC.  However, the RTL expand dump is
sensitive to middle-end changes and to debug information; for example,
the following output appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_TRUNC (x_3(D), y_4(D)); [tail call]

Each such change forces the dump check counts to be adjusted again.
This patch switches the standard name check to the tree optimized
dump, which is somewhat more stable.

The following test suite passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_u_trunc-1-u16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_u_trunc-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
Pan Li [Sun, 8 Dec 2024 01:32:26 +0000 (09:32 +0800)] 
RISC-V: Refine unsigned SAT_SUB testcase dump check to tree optimized

The saturating ALU testcases check the RTL dump for the presence of a
standard name such as .SAT_SUB.  However, the RTL expand dump is
sensitive to middle-end changes and to debug information; for example,
the following output appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_SUB (x_3(D), y_4(D)); [tail call]

Each such change forces the dump check counts to be adjusted again.
This patch switches the standard name check to the tree optimized
dump, which is somewhat more stable.

The following test suite passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_u_sub-1-u16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_u_sub-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-10-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-10-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-10-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-10-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-11-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-11-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-11-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-11-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-12-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-12-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-12-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-12-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-6-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-7-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-7-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-7-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-7-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-8-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-8-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-8-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-8-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-9-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-9-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-9-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-9-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16-4.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32-4.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u64-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u64-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8-4.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u16-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u16-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u16-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u32-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u32-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u32-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u64-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u8-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u8-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u8-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u16-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u16-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u32-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u32-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u8-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u8-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u16-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u16-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u32-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u32-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u8-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u8-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
Pan Li [Sun, 8 Dec 2024 01:32:25 +0000 (09:32 +0800)] 
RISC-V: Refine unsigned SAT_ADD testcase dump check to tree optimized

The saturating ALU testcases check the RTL dump for the presence of a
standard name such as .SAT_ADD.  However, the RTL expand dump is
sensitive to middle-end changes and to debug information; for example,
the following output appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_ADD (x_3(D), y_4(D)); [tail call]

Each such change forces the dump check counts to be adjusted again.
This patch switches the standard name check to the tree optimized
dump, which is somewhat more stable.

The following test suite passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_u_add-1-u16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_u_add-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-6-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-10.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-11.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-12.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-13.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-14.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-15.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-17.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-18.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-19.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-20.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-21.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-22.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-23.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-24.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-25.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-26.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-27.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-28.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-29.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-30.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-31.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-33.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-34.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-35.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-36.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-37.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-38.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-39.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-4.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-40.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-41.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-42.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-43.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-44.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-45.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-46.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-47.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-48.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-49.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-5.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-50.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-51.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-52.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-53.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-54.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-55.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-56.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-57.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-58.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-59.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-6.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-60.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-7.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-9.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
7 months agomiddle-end/117932 - further speedup DF worklist solver
Richard Biener [Sat, 7 Dec 2024 13:43:00 +0000 (14:43 +0100)] 
middle-end/117932 - further speedup DF worklist solver

The triple-indirect memory reference we perform for each incoming
edge, age <= last_change_age[bbindex_to_postorder[e->src->index]],
is pretty bad, and when there are a lot of small BBs, as in the
PR26854 testcase, this shows in the profile.  The following reduces
this by one level by making last_change_age indexed by BB index
rather than postorder number and realizing that for the first
iteration the age check is always true.  We pay for this by
allocating last_change_age for all BBs in the function, but we
do it as for sparsesets and avoid initializing it, given we check
the considered bitmap anyway.  We can also elide initializing
last_visit_age in an obvious way, given we separated the initial
iteration in the previous change.

Together this improves compile-time in the PR117932 setting by
another 2%.
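
As a rough illustration of the indexing change only (not the df-core.cc
code), the sketch below contrasts the two lookups.  Block, Edge and both
function names are hypothetical stand-ins invented for this example; only
last_change_age and bbindex_to_postorder are names taken from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for GCC's basic_block and edge; not the real types.
struct Block { int index; };
struct Edge { const Block *src; };

// Before: last_change_age is indexed by postorder number, so each check
// loads e.src->index, then bbindex_to_postorder[...], then last_change_age[...].
bool changed_since_by_postorder(const Edge &e, int age,
                                const std::vector<int> &bbindex_to_postorder,
                                const std::vector<int> &last_change_age)
{
  const std::size_t bb = static_cast<std::size_t>(e.src->index);
  assert(bb < bbindex_to_postorder.size());
  const std::size_t po = static_cast<std::size_t>(bbindex_to_postorder[bb]);
  assert(po < last_change_age.size());
  return age <= last_change_age[po];
}

// After: last_change_age is indexed by BB index, removing one dependent load.
bool changed_since_by_bbindex(const Edge &e, int age,
                              const std::vector<int> &last_change_age)
{
  const std::size_t bb = static_cast<std::size_t>(e.src->index);
  assert(bb < last_change_age.size());
  return age <= last_change_age[bb];
}
```

Both functions answer the same question as long as the two arrays hold the
same ages under their respective indexing; the second simply skips the
bbindex_to_postorder lookup.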

PR middle-end/117932
* df-core.cc (df_worklist_propagate_forward): Elide
age check for the first iteration, adjust for
last_change_age change.
(df_worklist_propagate_backward): Likewise.
(df_worklist_dataflow_doublequeue): Make last_change_age
indexed by BB index, avoid clearing both age arrays.

7 months agomiddle-end/117932 - speed up DF solver
Richard Biener [Fri, 6 Dec 2024 15:36:39 +0000 (16:36 +0100)] 
middle-end/117932 - speed up DF solver

The following addresses slow bitmap operations for maintaining the
iteration order of df_worklist_dataflow_doublequeue for a large number
of basic blocks.  The main complexity change is switching the
worklist and pending bitmaps to tree view, a secondary change is
avoiding the fully populated initial bitmap for the first iteration
and instead special-casing that plus avoiding all forward worklist
bitmap sets in that iteration.  Usually second or later iterations
are sparse, so optimizing the first iteration seems worthwhile.

For PR117932 when isolating from ext-dce and fold-mem-offset issues
this results in a 10% compile-time reduction.

PR middle-end/117932
* df-core.cc (df_worklist_propagate_forward): When WORKLIST
is NULL, do not set bits there.
(df_worklist_propagate_backward): Likewise.
(df_worklist_dataflow_doublequeue): Separate first pass
over all blocks with NULL worklist.
(df_worklist_dataflow): Do not initialize pending and adjust.

7 months agonvptx: Switch default from '-march=sm_30' to '-march=sm_52'
Thomas Schwinge [Mon, 11 Nov 2024 12:20:46 +0000 (13:20 +0100)] 
nvptx: Switch default from '-march=sm_30' to '-march=sm_52'

In preparation for GCC/nvptx code changes that require sm_52 features, this
commit raises nvptx code generation from sm_30 "Kepler" to sm_52 "Maxwell".
The latter has been supported as of CUDA 6.5 (2014-08), and is thus supported
by most Nvidia GPUs of the last decade, approximately.  (This commit doesn't
change the use of PTX ISA 6.0, which already requires CUDA 9.0 anyway.)

To continue building sm_30 multilib variants (for use via building/linking with
'-march=sm_30'), specify '--with-multilib-list=default,sm_30', for example.  Or,
to continue defaulting to sm_30 multilib variants, specify '--with-arch=sm_30'
(plus '--without-multilib-list', if applicable).  See the documentation,
<https://gcc.gnu.org/install/specific.html#nvptx-x-none>.

(Note that after a long deprecation time, eventually the
sm_3x "Kepler architecture support is removed from CUDA 12.0", 2022-12.)

gcc/
* config.gcc [nvptx-*]: Switch default from '-march=sm_30' to
'-march=sm_52'.
* doc/install.texi (Nvidia PTX Options): Update.

7 months agoGCN: Fix 'real_from_integer' usage
Thomas Schwinge [Thu, 5 Dec 2024 13:28:26 +0000 (14:28 +0100)] 
GCN: Fix 'real_from_integer' usage

The recent commit b3f1b9e2aa079f8ec73e3cb48143a16645c49566
"build: Remove INCLUDE_MEMORY [PR117737]" exposed an issue in code added in
2020 GCN back end commit 95607c12363712c39345e1d97f2c1aee8025e188
"Zero-initialise masked load destinations"; compilation now fails:

    [...]
    In file included from ../../source-gcc/gcc/coretypes.h:507:0,
                     from ../../source-gcc/gcc/config/gcn/gcn.cc:24:
    ../../source-gcc/gcc/real.h: In instantiation of ‘format_helper::format_helper(const T&) [with T = std::nullptr_t]’:
    ../../source-gcc/gcc/config/gcn/gcn.cc:1178:46:   required from here
    ../../source-gcc/gcc/real.h:233:17: error: no match for ‘operator==’ (operand types are ‘std::nullptr_t’ and ‘machine_mode’)
       : m_format (m == VOIDmode ? 0 : REAL_MODE_FORMAT (m))
                     ^
    [...]

That's with 'g++ (GCC) 5.5.0', and seen similarly with
'g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0', for example.

gcc/
* config/gcn/gcn.cc (gcn_vec_constant): Fix 'real_from_integer'
usage.

7 months agoRust: libformat_parser: Lower minimum Rust version to 1.49
Arthur Cohen [Tue, 23 Apr 2024 12:13:21 +0000 (14:13 +0200)] 
Rust: libformat_parser: Lower minimum Rust version to 1.49

libgrust/ChangeLog:

* libformat_parser/Cargo.toml: Change Rust edition from 2021 to 2018.
* libformat_parser/generic_format_parser/Cargo.toml: Likewise.
* libformat_parser/generic_format_parser/src/lib.rs: Remove usage of
then-unstable std features and language constructs.
* libformat_parser/src/lib.rs: Likewise, plus provide extension trait
for String::leak.

7 months agoRust: Work around 'error[E0599]: no method named `leak` found for struct `std::string...
Thomas Schwinge [Sat, 3 Aug 2024 14:39:17 +0000 (16:39 +0200)] 
Rust: Work around 'error[E0599]: no method named `leak` found for struct `std::string::String` in the current scope'

Compiling with Debian GNU/Linux 12 (bookworm) packages:

    $ apt-cache madison cargo rustc
         cargo | 0.66.0+ds1-1 | http://deb.debian.org/debian bookworm/main ppc64el Packages
         cargo | 0.66.0+ds1-1 | http://deb.debian.org/debian bookworm/main Sources
         rustc | 1.63.0+dfsg1-2 | http://deb.debian.org/debian bookworm/main ppc64el Packages
         rustc | 1.63.0+dfsg1-2 | http://deb.debian.org/debian bookworm/main Sources

..., we run into:

       Compiling libformat_parser v0.1.0 ([...]/source-gcc/libgrust/libformat_parser)
    error[E0599]: no method named `leak` found for struct `std::string::String` in the current scope
       --> src/lib.rs:396:18
        |
    396 |         ptr: str.leak().as_ptr(),
        |                  ^^^^ method not found in `std::string::String`

    error[E0599]: no method named `leak` found for struct `std::string::String` in the current scope
       --> src/lib.rs:434:7
        |
    434 |     s.leak();
        |       ^^^^ method not found in `std::string::String`

    error[E0599]: no method named `leak` found for struct `std::string::String` in the current scope
       --> src/lib.rs:439:23
        |
    439 |         ptr: cloned_s.leak().as_ptr(),
        |                       ^^^^ method not found in `std::string::String`

Locally replace 1.72.0+ method 'leak' for struct 'std::string::String'.

libgrust/
* libformat_parser/src/lib.rs: Work around 'error[E0599]:
no method named `leak` found for struct `std::string::String` in the current scope'.

7 months agoRust: Work around 'error[E0658]: `let...else` statements are unstable'
Thomas Schwinge [Sat, 3 Aug 2024 14:08:42 +0000 (16:08 +0200)] 
Rust: Work around 'error[E0658]: `let...else` statements are unstable'

Compiling with Debian GNU/Linux 12 (bookworm) packages:

    $ apt-cache madison cargo rustc
         cargo | 0.66.0+ds1-1 | http://deb.debian.org/debian bookworm/main ppc64el Packages
         cargo | 0.66.0+ds1-1 | http://deb.debian.org/debian bookworm/main Sources
         rustc | 1.63.0+dfsg1-2 | http://deb.debian.org/debian bookworm/main ppc64el Packages
         rustc | 1.63.0+dfsg1-2 | http://deb.debian.org/debian bookworm/main Sources

..., we run into:

       Compiling generic_format_parser v0.1.0 ([...]/source-gcc/libgrust/libformat_parser/generic_format_parser)
    error[E0658]: `let...else` statements are unstable
       --> generic_format_parser/src/lib.rs:994:5
        |
    994 | /     let Some(unescaped) = unescape_string(snippet) else {
    995 | |         return InputStringKind::NotALiteral;
    996 | |     };
        | |______^
        |
        = note: see issue #87335 <https://github.com/rust-lang/rust/issues/87335> for more information

Rewrite backwards, per <https://rust-lang.github.io/rfcs/3137-let-else.html>.

libgrust/
* libformat_parser/generic_format_parser/src/lib.rs: Work around
'error[E0658]: `let...else` statements are unstable'.

7 months agolibstdc++: Add missing equality comparison in new tests [PR117921]
Jonathan Wakely [Mon, 9 Dec 2024 09:36:15 +0000 (09:36 +0000)] 
libstdc++: Add missing equality comparison in new tests [PR117921]

These new tests fail in Debug Mode because the allocator types aren't
equality comparable.

libstdc++-v3/ChangeLog:

PR libstdc++/117921
* testsuite/23_containers/set/modifiers/swap/adl.cc: Add
equality comparison for Allocator.
* testsuite/23_containers/unordered_set/modifiers/swap-2.cc:
Likewise.

7 months agoaarch64: Update cpuinfo strings for some arch features
Kyrylo Tkachov [Tue, 3 Dec 2024 12:12:09 +0000 (04:12 -0800)] 
aarch64: Update cpuinfo strings for some arch features

The entries for some recently-added arch features were missing the cpuinfo
string used in -march=native detection.  Presumably the Linux kernel had not
specified such a string at the time the GCC support was added.
But I see that current versions of Linux do have strings for these features
in the arch/arm64/kernel/cpuinfo.c file in the kernel tree.

This patch adds them.  This fixes the strings for the f32mm and f64mm features
which I think were using the wrong string.  The kernel exposes them with an
"sve" prefix.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

* config/aarch64/aarch64-option-extensions.def (sve-b16b16,
f32mm, f64mm, sve2p1, sme-f64f64, sme-i16i64, sme-b16b16,
sme-f16f16, mops): Update FEATURE_STRING field.

7 months agotree-eh: Don't crash on GIMPLE_TRY_FINALLY with empty cleanup sequence [PR117845]
Simon Martin [Mon, 9 Dec 2024 08:21:25 +0000 (09:21 +0100)] 
tree-eh: Don't crash on GIMPLE_TRY_FINALLY with empty cleanup sequence [PR117845]

The following valid code triggers an ICE with -fsanitize=address

=== cut here ===
void l() {
    auto const ints = {0,1,2,3,4,5};
    for (auto i : { 3 } ) {
        __builtin_printf("%d ", i);
    }
}
=== cut here ===

The problem is that honor_protect_cleanup_actions does not expect the
cleanup sequence of a GIMPLE_TRY_FINALLY to be empty. It is however the
case here since r14-8681-gceb242f5302027, because lower_stmt removes the
only statement in the sequence (an ASAN_MARK statement for the array that
backs the initializer_list).

This patch simply checks that the finally block is not 0 before
accessing it in honor_protect_cleanup_actions.

PR c++/117845

gcc/ChangeLog:

* tree-eh.cc (honor_protect_cleanup_actions): Support empty
finally sequences.

gcc/testsuite/ChangeLog:

* g++.dg/asan/pr117845-2.C: New test.
* g++.dg/asan/pr117845.C: New test.

7 months agoFortran: Fix testsuite regressions after r15-5897 [PR116261/PR117901]
Paul Thomas [Mon, 9 Dec 2024 07:32:22 +0000 (07:32 +0000)] 
Fortran: Fix testsuite regressions after r15-5897 [PR116261/PR117901]

2024-12-09  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
PR fortran/116261
* trans-array.cc (gfc_array_init_size): New arg 'explicit_ts',
to suppress the use of the expr3 element size in the descriptor
dtype.
(gfc_array_allocate): New arg 'explicit_ts', used in call to
gfc_array_init_size.
* trans-array.h : Modify prototype for gfc_array_allocate for new
bool argument.
* trans-stmt.cc (gfc_trans_allocate): Set new argument if the
typespec is explicit.

gcc/testsuite/
PR fortran/117901
* gfortran.dg/class_transformational_1.f90: Temporary fix for
ICE with some compile options by setting dummy arg of
'unlimited rebar' to be allocatable.

7 months agoRISC-V: Fix incorrect optimization options passing to partial
Pan Li [Mon, 9 Dec 2024 06:07:22 +0000 (14:07 +0800)] 
RISC-V: Fix incorrect optimization options passing to partial

Like the strided load/store, the testcases of vector partial
are designed to pick up different sorts of optimization options, but
these options are actually ignored according to the Execution log of
gcc.log.

This patch corrects that in almost the same way as the fix for
strided load/store.

The following test suite passes with this patch.
* The rv64gcv full regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Fix the incorrect optimization
options passing to testcases.

Signed-off-by: Pan Li <pan2.li@intel.com>
7 months agoRISC-V: Refactor the testcases for rvv binop and cmp
Pan Li [Fri, 6 Dec 2024 04:22:53 +0000 (12:22 +0800)] 
RISC-V: Refactor the testcases for rvv binop and cmp

This patch refactors the testcases for rvv binop and cmp now that the
different sorts of optimization options are passed to the testcases,
so that the asm dump checks fit each optimization option.

The following test suite passes with this patch.
* The rv64gcv full regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c: Skip
m8 as it has a different body layout.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-1.c: Add build option
condition when checking asm dumps.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-9.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
7 months agoRISC-V: Fix incorrect optimization options passing to binop and cmp
Pan Li [Fri, 6 Dec 2024 04:22:52 +0000 (12:22 +0800)] 
RISC-V: Fix incorrect optimization options passing to binop and cmp

Like the strided load/store, the testcases of vector binop and cmp
are designed to pick up different sorts of optimization options, but
these options are actually ignored according to the Execution log of
gcc.log.

This patch corrects that in almost the same way as the fix for
strided load/store.

The following test suite passes with this patch.
* The rv64gcv full regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Fix the incorrect optimization
options passing to testcases.

Signed-off-by: Pan Li <pan2.li@intel.com>
7 months agoDaily bump.
GCC Administrator [Mon, 9 Dec 2024 00:17:22 +0000 (00:17 +0000)] 
Daily bump.

7 months agoSupport for 64-bit location_t: Activate 64-bit location_t
Lewis Hyatt [Sat, 16 Nov 2024 18:45:22 +0000 (13:45 -0500)] 
Support for 64-bit location_t: Activate 64-bit location_t

Change location_t to be a 64-bit integer instead of a 32-bit integer in
libcpp.

Also included in this change are the two other patches in the original
series which depended on this one; I am committing them all at once in case
it needs to be reverted later:

-Support for 64-bit location_t: gimple parts

The size of struct gimple increased by 8 bytes with the change in size of
location_t from 32- to 64-bit; adjust the WORD markings in the comments
accordingly. It seems that most of the WORD markings were off by one already,
probably not having been updated after a previous reduction in the size of a
gimple, so they have become retroactively correct again, and only a couple
needed adjustment actually.

Also add a comment that there are now 32 bits of unused padding available in
struct gimple for 64-bit hosts.

-Support for 64-bit location_t: Remove -flarge-source-files

The option -flarge-source-files became unnecessary with 64-bit location_t
and harms performance compared to the new default setting, so silently
ignore it.

libcpp/ChangeLog:

* include/cpplib.h (struct cpp_token): Adjust comment about the
struct size.
* include/line-map.h (location_t): Change typedef from 32-bit to 64-bit
integer.
(LINE_MAP_MAX_COLUMN_NUMBER): Increase size to be appropriate for
64-bit location_t.
(LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES): Likewise.
(LINE_MAP_MAX_LOCATION_WITH_COLS): Likewise.
(LINE_MAP_MAX_LOCATION): Likewise.
(MAX_LOCATION_T): Likewise.
(line_map_suggested_range_bits): Likewise.
(struct line_map): Adjust comment about the struct size.
(struct line_map_macro): Likewise.
(struct line_map_ordinary): Likewise. Rearrange fields to optimize
padding.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/pr77949.C: Adapt the test for 64-bit location_t,
when the previously expected failure doesn't actually happen.
* g++.dg/modules/loc-prune-4.C: Adjust the expected output for the
64-bit location_t case.
* gcc.dg/plugin/expensive_selftests_plugin.cc: Don't try to test
the maximum supported column number in 64-bit location_t mode.
* gcc.dg/plugin/location_overflow_plugin.cc: Adjust the base_location
so it can effectively test 64-bit location_t.

gcc/ChangeLog:

* gimple.h (struct gphi): Update word marking comments to reflect
the new size of location_t.
(struct gimple): Likewise. Add a comment about padding.
* common.opt: Mark -flarge-source-files as Ignored.
* common.opt.urls: Regenerate.
* doc/invoke.texi: Remove -flarge-source-files.
* toplev.cc (process_options): Remove support for
-flarge-source-files.

7 months agopru: Implement c and n asm operand modifiers
Dimitar Dimitrov [Sun, 8 Dec 2024 09:37:06 +0000 (11:37 +0200)] 
pru: Implement c and n asm operand modifiers

Fix c-c++-common/toplevel-asm-1.c failure for PRU backend, caused by
missing implementation of the "c" asm operand modifier.

gcc/ChangeLog:

* config/pru/pru.cc (pru_print_operand): Implement c and n
inline assembly operand modifiers.

gcc/testsuite/ChangeLog:

* gcc.target/pru/asm-op-modifier.c: New test.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
7 months agoDaily bump.
GCC Administrator [Sun, 8 Dec 2024 00:16:56 +0000 (00:16 +0000)] 
Daily bump.

7 months agoSPARC: Add functional comments for VIS4B instructions
Eric Botcazou [Sat, 7 Dec 2024 18:53:53 +0000 (19:53 +0100)] 
SPARC: Add functional comments for VIS4B instructions

gcc/
* config/sparc/sparc.md (VIS4B instructions): Add comments.

7 months agoAVR: Better location for late (during final) diagnostic.
Georg-Johann Lay [Sat, 7 Dec 2024 18:54:02 +0000 (19:54 +0100)] 
AVR: Better location for late (during final) diagnostic.

gcc/
* config/avr/avr.cc (avr_print_operand_address): Use
avr_insn_location as location for late (during final) diagnostic.

7 months agoPR modula2/117948: Forward procedure declaration should only be available in ISO
Gaius Mulley [Sat, 7 Dec 2024 14:04:44 +0000 (14:04 +0000)] 
PR modula2/117948: Forward procedure declaration should only be available in ISO

This patch restricts the forward procedure declaration to the ISO dialect.

gcc/m2/ChangeLog:

PR modula2/117948
* gm2-compiler/P1Build.bnf (ForwardDeclaration): Pass token
position of the FORWARD keyword to EndBuildForward.
* gm2-compiler/P1SymBuild.def (EndBuildForward): New parameter
forwardPos.
* gm2-compiler/P1SymBuild.mod (EndBuildForward): Issue an error at
forwardPos if the Iso boolean is false.

gcc/testsuite/ChangeLog:

PR modula2/117948
* gm2/pim/fail/forward.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
7 months agoi386: x r<< (c - y) to x r>> y etc. optimization [PR117930]
Jakub Jelinek [Sat, 7 Dec 2024 10:40:12 +0000 (11:40 +0100)] 
i386: x r<< (c - y) to x r>> y etc. optimization [PR117930]

The following patch optimizes x r<< (c - y) to x r>> y,
x r>> (c - y) to x r<< y, x r<< (c + y) to x r<< y and
x r>> (c + y) to x r>> y if c is a multiple of x's bitsize.
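
The four rewrites can be checked on ordinary integers.  The following is a
minimal sketch for a 32-bit x, not the i386.md patterns themselves; rotl32,
rotr32 and rotate_rewrites_hold are names made up for this example, and the
helpers reduce the count modulo 32 on the assumption that this mirrors what a
32-bit rotate instruction does with its count.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left by n, with the count reduced modulo 32.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n)
{
  return (n & 31) == 0 ? x : (x << (n & 31)) | (x >> (32 - (n & 31)));
}

// Rotate right by n, with the count reduced modulo 32.
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n)
{
  return (n & 31) == 0 ? x : (x >> (n & 31)) | (x << (32 - (n & 31)));
}

// Returns true if all four rewrites hold for this x, y and c,
// where c must be a multiple of the bit size (32 here).
bool rotate_rewrites_hold(std::uint32_t x, unsigned y, unsigned c)
{
  assert(c % 32 == 0);
  return rotl32(x, c - y) == rotr32(x, y)
      && rotr32(x, c - y) == rotl32(x, y)
      && rotl32(x, c + y) == rotl32(x, y)
      && rotr32(x, c + y) == rotr32(x, y);
}
```

Unsigned wraparound in c - y is harmless here because 2^32 is itself a
multiple of 32, so the reduced count is unchanged.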

2024-12-07  Jakub Jelinek  <jakub@redhat.com>

PR target/117930
* config/i386/i386.md (crotate): New define_code_attr.
(*<insn><mode>3_add, *<insn><mode>3_add_1,
*<insn><mode>3_sub, *<insn><mode>3_sub_1): New define_insn_and_split
patterns plus following define_split for constant first input
operand.

* gcc.target/i386/pr117930.c: New test.

7 months agolibstdc++: Fix typo in Doxygen comment in <format>
Jonathan Wakely [Sat, 7 Dec 2024 01:34:33 +0000 (01:34 +0000)] 
libstdc++: Fix typo in Doxygen comment in <format>

libstdc++-v3/ChangeLog:

* include/std/format: Fix typo in Doxygen comment.

7 months agoThe fix for PR116778:
Denis Chertykov [Sat, 7 Dec 2024 09:47:04 +0000 (13:47 +0400)] 
The fix for PR116778:

Brief:
The bug appears in LRA after the rematerialization pass, while creating live ranges.
File lra.cc:
*************************************************************
      /* Now we know what pseudos should be spilled.  Try to
         rematerialize them first.  */
      if (lra_remat ())
        {
          /* We need full live info -- see the comment above.  */
          lra_create_live_ranges (lra_reg_spill_p, true);
*************************************************************
The call `lra_create_live_ranges (lra_reg_spill_p, true)' is wrong.
It has to be `lra_create_live_ranges (true, true)'.

The explanation:
**********************************
int main (void)
{
  if (a.u33 * a.u33 != 0)
------^^^^^^^^^^^^^
    goto abrt;
  if (a.u33 * a.u40 * a.u33 != 0)
**********************************
The bug appears here.

Part of the expression `a.u33 * a.u33'
Before LRA:
*************************************************************
(insn 13 11 15 2 (set (reg:QI 184 [ _1+3 ])
        (mem/c:QI (const:HI (plus:HI (symbol_ref:HI ("a") [flags 0x2]  <var_decl 0x7c866435d000 a>)
                    (const_int 3 [0x3]))) [1 a+3 S1 A8])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 15 13 16 2 (set (reg:QI 64 [ a+4 ])
        (mem/c:QI (const:HI (plus:HI (symbol_ref:HI ("a") [flags 0x2]  <var_decl 0x7c866435d000 a>)
                    (const_int 4 [0x4]))) [1 a+4 S1 A8])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 16 15 20 2 (set (reg:QI 185 [ _1+4 ])
        (zero_extract:QI (reg:QI 64 [ a+4 ])
            (const_int 1 [0x1])
            (const_int 0 [0]))) "bf.c":11:8 985 {*extzvqi_split}
     (nil))
*************************************************************

After LRA:
*************************************************************
(insn 587 11 13 2 (set (reg:QI 24 r24 [368])
        (mem/c:QI (const:HI (plus:HI (symbol_ref:HI ("a") [flags 0x2]  <var_decl 0x7c866435d000 a>)
                    (const_int 3 [0x3]))) [1 a+3 S1 A8])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 13 587 15 2 (set (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 1 [0x1])) [4 %sfp+1 S1 A8])
        (reg:QI 24 r24 [368])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 15 13 16 2 (set (reg:QI 6 r6 [orig:64 a+4 ] [64])
        (mem/c:QI (const:HI (plus:HI (symbol_ref:HI ("a") [flags 0x2]  <var_decl 0x7c866435d000 a>)
                    (const_int 4 [0x4]))) [1 a+4 S1 A8])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 16 15 572 2 (set (reg:QI 24 r24 [orig:185 _1+4 ] [185])
        (zero_extract:QI (reg:QI 6 r6 [orig:64 a+4 ] [64])
            (const_int 1 [0x1])
            (const_int 0 [0]))) "bf.c":11:8 985 {*extzvqi_split}
     (nil))
(insn 572 16 20 2 (set (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 1 [0x1])) [4 %sfp+1 S1 A8])
        (reg:QI 24 r24 [orig:185 _1+4 ] [185])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
*************************************************************
Insn 13 and insn 572 both use sfp+1 as a spill slot, but in the IRA pass
these were two different pseudos, r184 and r185.
Insn 13 uses sfp+1 as a spill slot for r184.
Insn 572 uses the same slot for r185.  That is wrong.

Here we have a rematerialization.

Fragment from bf.c.317r.reload:
**************************************************************************************
******** Rematerialization #1: ********

df_worklist_dataflow_doublequeue: n_basic_blocks 14 n_edges 18 count 14 (    1)
df_worklist_dataflow_doublequeue: n_basic_blocks 14 n_edges 18 count 14 (    1)

Cands:
0 (nop=0, remat_regno=185, reload_regno=359):
(insn 16 15 572 2 (set (reg:QI 359 [orig:185 _1+4 ] [185])
                    (zero_extract:QI (reg:QI 64 [ a+4 ])
                        (const_int 1 [0x1])
                        (const_int 0 [0]))) "bf.c":11:8 985 {*extzvqi_split}
                 (nil))

**************************************************************************************
[...]
**************************************************************************************
Ranges after the compression:
 r185: [0..1]
   Frame pointer can not be eliminated anymore
   Spilling non-eliminable hard regs: 28 29
 Spilling r113(28)
 Spilling r184(29)
 Spilling r208(29)
 Spilling r209(28)
  Slot 0 regnos (width = 0):  185  209  208  184  113
**************************************************************************************

The bug is here: `r185: [0..1]' is a wrong live range after compression.
r185 and r184 can't have the same spill slot!

Rematerialization in bf.c.317r.reload looks like:
*************************************************************
   24: r14:QI=r185:QI
    Inserting rematerialization insn before:
  581: r14:QI=zero_extract(r64:QI,0x1,0)

deleting insn with uid = 24.
         Considering alt=0 of insn 16:   (0) =r  (1) rYil  (2) n
          overall=0,losers=0,rld_nregs=0
   32: r22:QI=r185:QI
    Inserting rematerialization insn before:
  582: r22:QI=zero_extract(r64:QI,0x1,0)

deleting insn with uid = 32.
*************************************************************

It happens because of the following.

Fragment from lra.cc (lra):
*************************************************************************
      if (! live_p)
        {
          /* We need full live info for spilling pseudos into
             registers instead of memory.  */
          lra_create_live_ranges (lra_reg_spill_p, true);
          live_p = true;
        }
      /* We should check necessity for spilling here as the above live
         range pass can remove spilled pseudos.  */
      if (! lra_need_for_spills_p ())
        break;
      /* Now we know what pseudos should be spilled.  Try to
         rematerialize them first.  */
      if (lra_remat ())
        {
          /* We need full live info -- see the comment above.  */
          lra_create_live_ranges (lra_reg_spill_p, true);
----------------------------------^^^^^^^^^^^^^^^
          live_p = true;
*************************************************************************

The bug is here.
Rematerialization sometimes can be like spilling pseudos into registers.
  582: r22:QI=zero_extract(r64:QI,0x1,0)

So, here we need live ranges for all pseudos.

PS: the patch will not affect any target with a usable definition of
    the TARGET_SPILL_CLASS hook.

PR target/116778
gcc/
* lra-lives.cc (complete_info_p): Clarification of the comment.
* lra.cc (lra): Create a full live info after rematerialization.

7 months agolibstdc++: editorconfig: Adjust wildcard patterns
Matthew Malcomson [Fri, 6 Dec 2024 17:16:42 +0000 (17:16 +0000)] 
libstdc++: editorconfig: Adjust wildcard patterns

According to the editorconfig file format description, a match against
one of multiple different strings is described with those different
strings separated by commas and within curly braces.  E.g.
    [{x,y}.txt]

https://editorconfig.org/, under "Wildcard Patterns".

The current libstdc++-v3/.editorconfig file has a few places where we
match against similar globs by using strings separated by commas but
without the curly braces.  E.g.
    [*.h,*.cc]

This doesn't take effect in neovim or emacs (as far as I can tell); I
haven't looked into other editors.
I would expect that following the standard syntax described in the
documentation would satisfy more editors.  Hence this patch suggests
following that standard by using something like:
    [*.{h,cc}]

libstdc++-v3/ChangeLog:

* .editorconfig: Adjust globbing style to standard syntax.

Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
7 months agoRevert "RISC-V: Add const to function_shape::get_name [NFC]"
Kito Cheng [Sat, 7 Dec 2024 00:23:58 +0000 (08:23 +0800)] 
Revert "RISC-V: Add const to function_shape::get_name [NFC]"

This reverts commit 9bf4cad4e4e1ec92c320a619c9bad35535596ced.

7 months agoDaily bump.
GCC Administrator [Sat, 7 Dec 2024 00:20:02 +0000 (00:20 +0000)] 
Daily bump.

7 months agoSupport for 64-bit location_t: libgdiagnostics parts
Lewis Hyatt [Sat, 7 Dec 2024 00:01:37 +0000 (19:01 -0500)] 
Support for 64-bit location_t: libgdiagnostics parts

Tweak libgdiagnostics.cc, which is necessarily sensitive to line-map
internals, to support 64-bit location_t as well.

gcc/ChangeLog:

* libgdiagnostics.cc (struct diagnostic_manager): Use location_t(-1)
instead of UINT_MAX to support 64-bit location_t as well.
(diagnostic_manager::diagnostic_manager): Change hard-coded "5" to
line_map_suggested_range_bits.
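
A minimal sketch of why a width-independent all-ones value is preferable to
UINT_MAX follows.  location32_t, location64_t, all_ones and check_sentinels
are hypothetical names for this example only (the real typedef is location_t
in libcpp), and how diagnostic_manager uses the value is not shown here.  The
last check assumes unsigned int is narrower than 64 bits.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for the old and new widths of location_t.
using location32_t = std::uint32_t;
using location64_t = std::uint64_t;

// Loc(-1) converts to the largest value of whatever unsigned type Loc is.
template <typename Loc>
constexpr Loc all_ones() { return Loc(-1); }

void check_sentinels()
{
  assert(all_ones<location32_t>() == std::numeric_limits<std::uint32_t>::max());
  assert(all_ones<location64_t>() == std::numeric_limits<std::uint64_t>::max());
  // UINT_MAX no longer reaches the top of the range once the type is 64-bit.
  assert(static_cast<location64_t>(UINT_MAX) < all_ones<location64_t>());
}
```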

7 months agoSupport for 64-bit location_t: RTL parts
Lewis Hyatt [Sat, 7 Dec 2024 00:01:34 +0000 (19:01 -0500)] 
Support for 64-bit location_t: RTL parts

Some RTL objects need to store a location_t. Currently, they store it in the
rt_int field of union rtunion, but in a world where location_t could be
64-bit, they need to store it in a larger variable. Unfortunately, rtunion
does not currently have a 64-bit int type for that purpose, so add one. In
order to avoid increasing any overhead when 64-bit locations are not in use,
the new field is dedicated for location_t storage only and has type
"location_t" so it will only be 64-bit if necessary. This necessitates
adding a new RTX format code 'L' for locations. There are very many switch
statements in the codebase that inspect the RTX format code. I took the
approach of finding all of them that handle code 'i' or 'n' and making sure
they handle 'L' too. I am sure that some of these call sites can never see
an 'L' code, but I thought it would be safer and more future-proof to handle
as many as possible, given it's just a line or two to add in most cases.
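
The shape of the change at each call site can be sketched as follows.
This is a simplified illustration under stated assumptions, not the real
GCC code: toy_rtunion and toy_hash_operands are hypothetical, and the
real rtunion has many more members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in; in GCC, location_t is only 64-bit if configured so.
using location_t = std::uint64_t;

union toy_rtunion
{
  int rt_int;         // read for format codes 'i' and 'n'
  location_t rt_loc;  // read for the new format code 'L'
};

// Hash the operands described by FORMAT, one format character per operand.
inline std::uint64_t
toy_hash_operands (const char *format, const toy_rtunion *ops)
{
  std::uint64_t h = 0;
  for (std::size_t i = 0; format[i] != '\0'; ++i)
    switch (format[i])
      {
      case 'i':
      case 'n':
        h = h * 31 + static_cast<std::uint64_t> (ops[i].rt_int);
        break;
      case 'L':
        // New case: read the location through its dedicated member.
        h = h * 31 + ops[i].rt_loc;
        break;
      default:
        assert (false && "unhandled format code");
      }
  return h;
}
```

A switch that handled 'i' and 'n' but not 'L' would fall into the default
case here, which is the kind of omission the audit was meant to catch.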

gcc/ChangeLog:

* rtl.def (DEBUG_INSN): Use new format code 'L' for location_t fields.
(INSN): Likewise.
(JUMP_INSN): Likewise.
(CALL_INSN): Likewise.
(ASM_INPUT): Likewise.
(ASM_OPERANDS): Likewise.
* rtl.h (union rtunion): Add new location_t RT_LOC member for use by
the 'L' format.
(struct rtx_debug_insn): Adjust comment.
(struct rtx_nonjump_insn): Adjust comment.
(struct rtx_call_insn): Adjust comment.
(XLOC): New accessor macro for rtunion::rt_loc.
(X0LOC): Likewise.
(XCLOC): Likewise.
(INSN_LOCATION): Use XLOC instead of XUINT to retrieve a location_t.
(NOTE_MARKER_LOCATION): Likewise for XCUINT -> XCLOC.
(ASM_OPERANDS_SOURCE_LOCATION): Likewise.
(ASM_INPUT_SOURCE_LOCATION): Likewise.
(gen_rtx_ASM_INPUT): Adjust to use sL format instead of si.
(gen_rtx_INSN): Adjust prototype to use location_t rather than int
for the location.
* cfgrtl.cc (force_nonfallthru_and_redirect): Change type of LOC
local variable from int to location_t.
* rtlhash.cc (add_rtx): Support 'L' format in the switch statement.
* var-tracking.cc (loc_cmp): Likewise.
* alias.cc (rtx_equal_for_memref_p): Likewise.
* config/alpha/alpha.cc (summarize_insn): Likewise.
* config/ia64/ia64.cc (rtx_needs_barrier): Likewise.
* config/rs6000/rs6000.cc (rs6000_hash_constant): Likewise.
* cse.cc (hash_rtx): Likewise.
(exp_equiv_p): Likewise.
* cselib.cc (rtx_equal_for_cselib_1): Likewise.
(cselib_hash_rtx): Likewise.
(cselib_expand_value_rtx_1): Likewise.
* emit-rtl.cc (copy_insn_1): Likewise.
(gen_rtx_INSN): Change the location argument from int to location_t,
and call the corresponding gen_rtx_fmt_* function.
* final.cc (leaf_renumber_regs_insn): Support 'L' format in the
switch statement.
* genattrtab.cc (attr_rtx_1): Likewise.
* genemit.cc (gen_exp): Likewise.
* gengenrtl.cc (type_from_format): Likewise.
(accessor_from_format): Likewise.
* gengtype.cc (adjust_field_rtx_def): Likewise.
* genpeep.cc (match_rtx): Likewise; just mark gcc_unreachable() for
now.
* genrecog.cc (find_operand): Support 'L' format in the switch statement.
(find_matching_operand): Likewise.
(validate_pattern): Likewise.
* gensupport.cc (subst_pattern_match): Likewise.
(get_alternatives_number): Likewise.
(collect_insn_data): Likewise.
(alter_predicate_for_insn): Likewise.
(alter_constraints): Likewise.
(subst_dup): Likewise.
* jump.cc (rtx_renumbered_equal_p): Likewise.
* loop-invariant.cc (hash_invariant_expr_1): Likewise.
* lra-constraints.cc (operands_match_p): Likewise.
* lra.cc (lra_rtx_hash): Likewise.
* print-rtl.cc (rtx_writer::print_rtx_operand_code_i): Refactor
location_t-relevant code to...
(rtx_writer::print_rtx_operand_code_L): ...new function here.
(rtx_writer::print_rtx_operand): Support 'L' format in the switch statement.
* print-rtl.h (rtx_writer::print_rtx_operand_code_L): Add prototype
for new function.
* read-rtl-function.cc (function_reader::read_rtx_operand): Support
'L' format in the switch statement.
(function_reader::read_rtx_operand_i_or_n): Rename to...
(function_reader::read_rtx_operand_inL): ...this, and support 'L' as
well.
* read-rtl.cc (apply_int_iterator): Support 'L' format in the switch
statement.
(rtx_reader::read_rtx_operand): Likewise.
* reload.cc (operands_match_p): Likewise.
* rtl.cc (rtx_format): Add new code 'L'.
(rtx_equal_p): Support 'L' in the switch statement. Remove dead code
in the handling for 'i' and 'n'.

7 months agofinal: Fix call to INSN_LOCATION on a NOTE rtl
Lewis Hyatt [Sat, 7 Dec 2024 00:01:32 +0000 (19:01 -0500)] 
final: Fix call to INSN_LOCATION on a NOTE rtl

This function has a code path that calls INSN_LOCATION on an rtl note. For a
note, this returns the note type enum rather than a location, but it runs
without complaint even with --enable-checking=rtl because both are stored in
the rt_int member of the rtunion. A subsequent commit will add a new rtl
format code specifically for locations, in which case attempting to call
INSN_LOCATION on a note will trigger an error. Fix it up by handling the
case of a note missing a location separately.

gcc/ChangeLog:

* final.cc (reemit_insn_block_notes): Don't try to call
INSN_LOCATION on a NOTE rtl object. Don't call change_scope () for a
NOTE missing a location.

7 months agomiddle-end: Handle resized PHI nodes in loop_version()
Lewis Hyatt [Sat, 7 Dec 2024 00:01:27 +0000 (19:01 -0500)] 
middle-end: Handle resized PHI nodes in loop_version()

While testing upcoming support for 64-bit location_t, I came across some
test failures on sparc (32-bit) that trigger when location_t is changed to
be 64-bit. The reason is that several call sites that make use of
loop_version() for performing loop optimizations assume that a gphi*
obtained prior to calling loop_version() will remain valid afterwards, but
this is not the case for a PHI that needs to be resized. It doesn't happen
usually, because PHI nodes usually have room for at least 4 arguments and
this is usually more than are needed, but this is not guaranteed.

Fix the affected callers by avoiding the assumption that a PHI node pointer
remains valid. For most cases, this is done by remembering instead the
gphi->result pointer, which contains a pointer back to the PHI node that is
kept up to date when the PHI is moved to a new address.
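
The idea of remembering the result rather than the PHI can be sketched
with toy types.  This is not the GCC data structure: toy_phi,
toy_ssa_name and toy_resize_phi are hypothetical, and the sketch only
assumes, as described above, that the result keeps a back-pointer to its
defining PHI.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct toy_phi;

// Hypothetical stand-in for an SSA name: it records its defining PHI.
struct toy_ssa_name
{
  toy_phi *def = nullptr;
};

struct toy_phi
{
  toy_ssa_name *result = nullptr;
  std::vector<int> args;
};

// "Resize" a PHI by allocating a larger copy at a new address,
// redirecting the result's back-pointer, and freeing the old node.
inline void
toy_resize_phi (std::unique_ptr<toy_phi> &slot, std::size_t new_len)
{
  assert (slot && slot->result);
  auto bigger = std::make_unique<toy_phi> (*slot);
  bigger->args.resize (new_len);
  bigger->result->def = bigger.get ();
  slot = std::move (bigger);
}
```

A caller that saved the raw toy_phi pointer before the resize would be
left with a dangling pointer, whereas one that saved the toy_ssa_name can
always reach the current node through its def member.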

gcc/ChangeLog:

* tree-parloops.cc (struct reduction_info): Store the result of the
reduction PHI rather than the PHI itself.
(reduction_info::reduc_phi): New member function.
(reduction_hasher::equal): Adapt to the change in struct reduction_info.
(reduction_phi): Likewise.
(initialize_reductions): Likewise.
(create_call_for_reduction_1): Likewise.
(transform_to_exit_first_loop_alt): Likewise.
(transform_to_exit_first_loop): Likewise.
(build_new_reduction): Likewise.
(set_reduc_phi_uids): Likewise.
(try_create_reduction_list): Likewise.
* tree-ssa-loop-split.cc (split_loop): Remember the PHI result
variable so that the PHI can be found in case it is resized and moved
to a new address.
* tree-vect-loop-manip.cc (vect_loop_versioning): After calling
loop_version(), fix up stored PHI pointers in case they have
changed.
* tree-vectorizer.cc (vec_info::resync_stmt_addr): New function.
* tree-vectorizer.h (vec_info::resync_stmt_addr): Declare.

7 months agoOnly add inferred ranges if they change the value.
Andrew MacLeod [Sat, 23 Nov 2024 19:05:54 +0000 (14:05 -0500)] 
Only add inferred ranges if they change the value.

Do not add an inferred range if it is already incorporated in the
current range of an SSA_NAME.
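
The check can be sketched with a toy interval type.  This is not the
ranger API: toy_range, toy_intersect and toy_inferred_range_needed are
hypothetical names, and real ranges are more general than one closed
interval.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical closed integer interval [lo, hi].
struct toy_range
{
  long lo;
  long hi;
};

inline bool
operator== (const toy_range &a, const toy_range &b)
{
  return a.lo == b.lo && a.hi == b.hi;
}

// Intersection of two overlapping intervals.
inline toy_range
toy_intersect (const toy_range &a, const toy_range &b)
{
  toy_range r = { std::max (a.lo, b.lo), std::min (a.hi, b.hi) };
  assert (r.lo <= r.hi);
  return r;
}

// An inferred range is worth recording only if it narrows CURRENT.
inline bool
toy_inferred_range_needed (const toy_range &current, const toy_range &inferred)
{
  return !(toy_intersect (current, inferred) == current);
}
```

If the current range is [1, 10] and the inferred range is [0, 100], the
intersection is still [1, 10], so nothing new would be recorded.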

PR tree-optimization/117467
* gimple-range-infer.cc (infer_range_manager::add_ranges): Check
range_of_expr to see if the inferred range is needed.

7 months agoAdd a range query to inferred ranges.
Andrew MacLeod [Sat, 16 Nov 2024 13:29:30 +0000 (08:29 -0500)] 
Add a range query to inferred ranges.

Provide a range_query for any inferred range processing which wants to
examine the range of an argument to make decisions.  Add some comments.

* gimple-range-cache.cc (ranger_cache::ranger_cache): Create the
infer oracle using THIS as the range_query.
* gimple-range-infer.cc (gimple_infer_range::gimple_infer_range):
Add a range_query to the constructor and use it.
(infer_range_manager::infer_range_manager): Add a range_query.
* gimple-range-infer.h (gimple_infer_range): Adjust prototype.
(infer_range_manager): Add a range_query.
* value-query.cc (range_query::create_infer_oracle): Add a range_query.
* value-query.h (range_query::create_infer_oracle): Update prototype.

7 months agoDo not calculate an entry range for invariant names.
Andrew MacLeod [Mon, 25 Nov 2024 14:50:33 +0000 (09:50 -0500)] 
Do not calculate an entry range for invariant names.

If an SSA_NAME is invariant, do not calculate an on_entry value.

PR tree-optimization/117467
* gimple-range-cache.cc (ranger_cache::entry_range): Do not
invoke range_from_dom for invariant ssa-names.

7 months ago[PR117248][LRA]: Rewriting reg notes update and fix calculation of conflict hard...
Vladimir N. Makarov [Fri, 6 Dec 2024 21:16:28 +0000 (16:16 -0500)] 
[PR117248][LRA]: Rewriting reg notes update and fix calculation of conflict hard regs of pseudo.

  LRA updates the conflict hard regs of a pseudo when a hard reg dies.  The
complicated PA div/mod insns reference explicit clobbered hard regs and
also use hard regs as operands.  This prevents some hard regs from dying,
although they still conflict with pseudos living through the insn.  In
addition, on such insns LRA wrongly updates the reg notes (REG_DEAD,
REG_UNUSED) that are used later in the rematerialization subpass.  The
patch fixes both problems.

gcc/ChangeLog:

PR rtl-optimization/117248
* lra-lives.cc (start_living, start_dying): Remove.
(insn_regnos, out_insn_regnos, insn_regnos_live_after): New.
(sparseset_contains_pseudos_p): Remove.
(make_hard_regno_live, make_hard_regno_dead): Return true if
something in liveness is changed.
(mark_pseudo_live,  mark_pseudo_dead): Ditto.
(mark_regno_live, mark_regno_dead): Ditto.
(clear_sparseset_regnos, regnos_in_sparseset_p): Use set instead
of dead_set.
(process_bb_lives): Rewrite dealing with reg notes.  Update
conflict hard regs even when clobber hard reg is not marked as
dead.
(lra_create_live_ranges_1): Add initialization/finalization of
insn_regnos, out_insn_regnos, insn_regnos_live_after.

7 months ago[PR tree-optimization/117895] Fix sparc libgo build failure with CRC opts enabled
Jeff Law [Fri, 6 Dec 2024 20:40:25 +0000 (13:40 -0700)] 
[PR tree-optimization/117895] Fix sparc libgo build failure with CRC opts enabled

So as noted in the BZ, sparc builds of the golang libraries were failing due to
the CRC code.

Ultimately this was another mode problem in the table expansion.  Essentially
when the mode of the resultant crc was different than the mode of the input
data we could create mixed mode operations which is a no-no.  Not entirely sure
how we were getting away with it before, but it was clearly wrong.

The mode of the crc will always be at least as large as the mode of the data
for the cases we support.  So the code has been adjusted to convert the data's
mode to the crc's mode and do all the ops in the crc mode.
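
As a rough source-level analogy (the actual fix is in RTL expansion, and
this is not the code GCC emits), one step of an MSB-first table-driven
CRC might look like the sketch below.  The names toy_crc16_table_entry
and toy_crc16_step are hypothetical, the table entry is computed on the
fly instead of being read from a precomputed table, and a 16-bit CRC over
8-bit data is assumed.

```cpp
#include <cstdint>

// Table entry for INDEX: the CRC of that single byte, computed bitwise.
constexpr std::uint16_t
toy_crc16_table_entry (std::uint8_t index, std::uint16_t poly)
{
  std::uint16_t r = static_cast<std::uint16_t> (index << 8);
  for (int bit = 0; bit < 8; ++bit)
    r = (r & 0x8000) ? static_cast<std::uint16_t> ((r << 1) ^ poly)
                     : static_cast<std::uint16_t> (r << 1);
  return r;
}

// One CRC step.  DATA is widened to the CRC's width first, so that all
// following operations are carried out on 16-bit values.
inline std::uint16_t
toy_crc16_step (std::uint16_t crc, std::uint8_t data, std::uint16_t poly)
{
  std::uint16_t wide = data;
  std::uint16_t index = static_cast<std::uint16_t> ((crc >> 8) ^ wide);
  std::uint16_t entry
    = toy_crc16_table_entry (static_cast<std::uint8_t> (index), poly);
  return static_cast<std::uint16_t> ((crc << 8) ^ entry);
}
```

The point of the sketch is the explicit widening of DATA before it is
combined with the CRC; in RTL, combining operands of different modes
without such a conversion is invalid.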

That fixes the libgo build problem on sparc and I've verified that there aren't
any regressions on x86_64 as well as all the embedded targets in my tester.

PR tree-optimization/117895
gcc/
* expr.cc (calculate_table_based_CRC): Drop CRC_MODE argument.
Convert DATA to CRC's mode, then do calculations in CRC's mode.
(expand_crc_table_based): Corresponding changes.
(expand_reversed_crc_table_based): Corresponding changes.

7 months agoc++: use diagnostic nesting [PR116253]
David Malcolm [Fri, 6 Dec 2024 18:40:55 +0000 (13:40 -0500)] 
c++: use diagnostic nesting [PR116253]

This patch uses the nested diagnostics capabilities added in the earlier
patch in the C++ frontend.

With this, and enabling the non-standard text formatting via:
  -fdiagnostics-set-output=text:experimental-nesting=yes
and using:
  -std=c++20 -fconcepts-diagnostics-depth=2
then the output for the example in SG15's P3358R0 ("SARIF for Structured
Diagnostics") is:

P3358R0.C: In function ‘int main()’:
P3358R0.C:26:6: error: no matching function for call to ‘pet(lizard)’
   26 |   pet(lizard{});
      |   ~~~^~~~~~~~~~
  • note: candidate: ‘template<class auto:1>  requires  pettable<auto:1> void pet(auto:1)’
    P3358R0.C:21:6:
       21 | void pet(pettable auto t);
          |      ^~~
    • note: template argument deduction/substitution failed:
      • note: constraints not satisfied
        • P3358R0.C: In substitution of ‘template<class auto:1>  requires  pettable<auto:1> void pet(auto:1) [with auto:1 = lizard]’:
        • required from here
          P3358R0.C:26:6:
             26 |   pet(lizard{});
                |   ~~~^~~~~~~~~~
        • required for the satisfaction of ‘pettable<auto:1>’ [with auto:1 = lizard]
          P3358R0.C:19:9:
             19 | concept pettable = has_member_pet<T> or has_default_pet<T>;
                |         ^~~~~~~~
        • note: no operand of the disjunction is satisfied
          P3358R0.C:19:38:
             19 | concept pettable = has_member_pet<T> or has_default_pet<T>;
                |                    ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
          • note: the operand ‘has_member_pet<T>’ is unsatisfied because
            P3358R0.C:19:20:
               19 | concept pettable = has_member_pet<T> or has_default_pet<T>;
                  |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            • required for the satisfaction of ‘has_member_pet<T>’ [with T = lizard]
              P3358R0.C:13:9:
                 13 | concept has_member_pet = requires(T t) { t.pet(); };
                    |         ^~~~~~~~~~~~~~
            • required for the satisfaction of ‘pettable<auto:1>’ [with auto:1 = lizard]
              P3358R0.C:19:9:
                 19 | concept pettable = has_member_pet<T> or has_default_pet<T>;
                    |         ^~~~~~~~
            • in requirements with ‘T t’ [with T = lizard]
              P3358R0.C:13:26:
                 13 | concept has_member_pet = requires(T t) { t.pet(); };
                    |                          ^~~~~~~~~~~~~~~~~~~~~~~~~~
            • note: the required expression ‘t.pet()’ is invalid, because
              P3358R0.C:13:47:
                 13 | concept has_member_pet = requires(T t) { t.pet(); };
                    |                                          ~~~~~^~
              • error: ‘struct lizard’ has no member named ‘pet’
                P3358R0.C:13:44:
                   13 | concept has_member_pet = requires(T t) { t.pet(); };
                      |                                          ~~^~~
          • note: the operand ‘has_default_pet<T>’ is unsatisfied because
            P3358R0.C:19:41:
               19 | concept pettable = has_member_pet<T> or has_default_pet<T>;
                  |                    ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
            • required for the satisfaction of ‘has_default_pet<T>’ [with T = lizard]
              P3358R0.C:16:9:
                 16 | concept has_default_pet = T::is_pettable;
                    |         ^~~~~~~~~~~~~~~
            • required for the satisfaction of ‘pettable<auto:1>’ [with auto:1 = lizard]
              P3358R0.C:19:9:
                 19 | concept pettable = has_member_pet<T> or has_default_pet<T>;
                    |         ^~~~~~~~
            • error: ‘is_pettable’ is not a member of ‘lizard’
              P3358R0.C:16:30:
                 16 | concept has_default_pet = T::is_pettable;
                    |                              ^~~~~~~~~~~
  • note: candidate: ‘void pet(dog)’
    P3358R0.C:9:6:
        9 | void pet(dog);
          |      ^~~
    • note: no known conversion for argument 1 from ‘lizard’ to ‘dog’
      P3358R0.C:9:10:
          9 | void pet(dog);
            |          ^~~
  • note: candidate: ‘void pet(cat)’
    P3358R0.C:10:6:
       10 | void pet(cat);
          |      ^~~
    • note: no known conversion for argument 1 from ‘lizard’ to ‘cat’
      P3358R0.C:10:10:
         10 | void pet(cat);
            |          ^~~

showing the hierarchical structure of the messages; ideally there
would be a UI here allowing the user to expand/collapse the messages
to drill out into the detail they are interested in.

The structure is also captured in SARIF output (via the "nestingLevel"
property).
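
The scoping pattern behind the nested output can be sketched with a toy
RAII guard.  This is an assumption-laden illustration, not the GCC
implementation: toy_context, toy_auto_nesting_level and toy_note are
hypothetical, and the real text sink also prints bullets and locations.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical diagnostic context: a nesting depth plus collected text.
struct toy_context
{
  int nesting_level = 0;
  std::string out;
};

// Raises the nesting level for the lifetime of the object.
class toy_auto_nesting_level
{
public:
  explicit toy_auto_nesting_level (toy_context &ctx) : m_ctx (ctx)
  {
    ++m_ctx.nesting_level;
  }
  ~toy_auto_nesting_level ()
  {
    assert (m_ctx.nesting_level > 0);
    --m_ctx.nesting_level;
  }
  toy_auto_nesting_level (const toy_auto_nesting_level &) = delete;
  toy_auto_nesting_level &operator= (const toy_auto_nesting_level &) = delete;

private:
  toy_context &m_ctx;
};

// Emit MSG indented by two spaces per nesting level.
inline void
toy_note (toy_context &ctx, const std::string &msg)
{
  ctx.out.append (static_cast<std::size_t> (2 * ctx.nesting_level), ' ');
  ctx.out += msg;
  ctx.out += '\n';
}
```

Declaring a guard before emitting the per-candidate notes, and another
inside each candidate, would reproduce the kind of indentation shown in
the example output above.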

gcc/cp/ChangeLog:
PR other/116253
* call.cc (print_conversion_rejection): Remove leading space from
diagnostic messages.
(print_conversion_rejection): Likewise.
(print_arity_information): Likewise.
(print_z_candidate): Likewise.  Add auto_diagnostic_nesting_level
before calls to fn_type_unification and diagnose_constraints.
(print_z_candidates): Add auto_diagnostic_nesting_level before
looping over candidates.
(conversion_null_warnings): Remove leading space from
diagnostic messages.
(maybe_inform_about_fndecl_for_bogus_argument_init): Likewise.
* constraint.cc (tsubst_valid_expression_requirement): Add
auto_diagnostic_nesting_level when showing why the expression is
invalid.
(satisfy_disjunction): Likewise when showing operands, and again
when replaying each branch of the disjunction.
(diagnose_constraints): Likewise when replaying satisfaction.
* error.cc (cp_diagnostic_text_starter): Set prefix.
(print_instantiation_full_context): Only show the file
if we're not showing nesting or the user has opted in to
showing location information in nested diagnostics.
(class auto_context_line): New.
(print_instantiation_partial_context_line): Replace calls to
print_location and to diagnostic_show_locus with an
auto_context_line.
(print_instantiation_partial_context): Replace calls to
print_location with an auto_context_line.
(maybe_print_constexpr_context): Likewise.
(print_constrained_decl_info): Likewise.
(print_concept_check_info): Likewise.
(print_constraint_context_head): Likewise.
(print_requires_expression_info): Likewise.

gcc/testsuite/ChangeLog:
PR other/116253
* g++.dg/concepts/nested-diagnostics-1-truncated.C: New test.
* g++.dg/concepts/nested-diagnostics-1.C: New test.
* g++.dg/concepts/nested-diagnostics-2.C: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agoi386: Add missing part from my previous commit.
Uros Bizjak [Fri, 6 Dec 2024 18:21:53 +0000 (19:21 +0100)] 
i386: Add missing part from my previous commit.

gcc/ChangeLog:

* config/i386/i386.cc (ix86_decompose_address):
Add missing part from my previous commit.

7 months agoi386: Fix gcc.target/i386/pr101716.c (and some related cleanups)
Uros Bizjak [Fri, 6 Dec 2024 18:00:34 +0000 (19:00 +0100)] 
i386: Fix gcc.target/i386/pr101716.c (and some related cleanups)

Fix pr101716.c testcase scan-assembler failure.  The combine pass will not
combine instructions that use registers in TARGET_CLASS_LIKELY_SPILLED
class, such as %eax return register in AREG class.

Change the testcase to use pseudos only and explicitly scan for
zero_extendsidi pattern name.

While looking there, also clean ix86_decompose_address a bit: eliminate
common code and use UINTVAL and HOST_WIDE_INT_UC macros in the condition
for AND wrapped address.

gcc/ChangeLog:

* config/i386/i386.cc (ix86_decompose_address): Eliminate
common code and use UINTVAL and HOST_WIDE_INT_UC macros
in the condition for AND wrapped address.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr101716.c (dg-options): Add -dp.
(dg-final): Scan for zero_extendsidi.
(sample1): Change the code to use pseudos only.

7 months agoarm,testsuite: Add -mtune=cortex-m55 to dlstp-int8x16.c
Christophe Lyon [Fri, 6 Dec 2024 15:59:25 +0000 (15:59 +0000)] 
arm,testsuite: Add -mtune=cortex-m55 to dlstp-int8x16.c

Like dlstp-compile-asm-1.c, this test would fail if GCC is configured
with non-default options, such as -mtune=cortex-a9.

Force -mtune=cortex-m55 to avoid this unexpected issue.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/dlstp-int8x16.c: Add -mtune=cortex-m55.

7 months agoi386: Fix unwanted fwprop to 3dNOW! insn [PR117926]
Uros Bizjak [Fri, 6 Dec 2024 15:59:16 +0000 (16:59 +0100)] 
i386: Fix unwanted fwprop to 3dNOW! insn [PR117926]

The compiler is able to forward propagate a partial vector V4SF instruction
using XMM registers to a 3dNOW! V2SF instruction using MM registers.  Prevent
unwanted transformation by tagging 3dNOW! V2SF instructions using generic
RTXes with "(unspec [(const_int 0)] UNSPEC_3DNOW)" tag.

PR target/117926

gcc/ChangeLog:

* config/i386/mmx.md (UNSPEC_3DNOW): New unspec.
(mmx_addv2sf3): Tag insn with UNSPEC_3DNOW tag.
(*mmx_addv2sf3): Ditto.
(mmx_sub2vsf3): Ditto.
(mmx_subrv2sf3): Ditto.
(*mmx_subv2sf3): Ditto.
(mmx_mulv2sf3): Ditto.
(mmx_<smaxmin:code>v2sf3): Ditto.
(*mmx_<smaxmin:code>v2sf3): Ditto.
(mmx_ieee_<ieee_maxmin>v2sf3): Ditto.
(mmx_eqv2sf3): Ditto.
(*mmx_eqv2sf3): Ditto.
(mmx_gtv2sf3): Ditto.
(mmx_gev2sf3): Ditto.
(mmx_fix_truncv2sfv2si2): Ditto.
(mmx_floatv2siv2sf2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117926.c: New test.

7 months agoarm: testsuite: fix some legacy C tests
Richard Earnshaw [Fri, 6 Dec 2024 17:05:27 +0000 (17:05 +0000)] 
arm: testsuite: fix some legacy C tests

These tests all lack ISO-C style function definitions.  Some
deliberately so.  Rather than try to adjust the code and risk changing
the nature of the test, add -std=c17 to the test options.

gcc/testsuite/ChangeLog:

* gcc.target/arm/20031108-1.c: Add -std=c17.
* gcc.target/arm/fp16-unprototyped-1.c: Likewise.
* gcc.target/arm/fp16-unprototyped-2.c: Likewise.
* gcc.target/arm/neon-thumb2-move.c: Likewise.
* gcc.target/arm/pr67756.c: Likewise.
* gcc.target/arm/pr81863.c: Likewise.

7 months agoclang-format BraceWrapping.AfterCaseLabel to true
Matthew Malcomson [Tue, 3 Dec 2024 22:13:40 +0000 (22:13 +0000)] 
clang-format BraceWrapping.AfterCaseLabel to true

This setting seems to better match the indentation that is used in GCC.

Adds an extra level of indentation after braces in a case statement.

Only manual testing done on the switch statements in
c-common.cc:resolve_overloaded_builtin and
alias.cc:record_component_aliases.

Ok for trunk?

contrib/ChangeLog:

* clang-format: Set BraceWrapping.AfterCaseLabel.

Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
7 months agodiagnostics: UX: add doc URLs for attributes (v2)
David Malcolm [Fri, 6 Dec 2024 16:48:43 +0000 (11:48 -0500)] 
diagnostics: UX: add doc URLs for attributes (v2)

This is v2 of the patch; v1 was here:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655541.html

Changed in v2:
* added a new TARGET_DOCUMENTATION_NAME hook for figuring out which
  documentation URL to use when there are multiple per-target docs,
  such as for __attribute__((interrupt)); implemented this for all
  targets that have target-specific attributes
* moved attribute_urlifier and its support code to a new
  gcc-attribute-urlifier.cc since it needs to use targetm for the
  above; gcc-urlifier.o is used by the driver.
* fixed extend.texi so that some attributes that failed to appear in
  attr-urls.def now do so (affected nvptx "kernel" and "shared" attrs)
* regenerated attr-urls.def for the above fix, and bringing in
  attributes added since v1 of the patch

In r14-5118-gc5db4d8ba5f3de I added a mechanism to automatically add
documentation URLs to quoted strings in diagnostics.
In r14-6920-g9e49746da303b8 I added a mechanism to generate URLs for
mentions of command-line options in quoted strings in diagnostics.

This patch does a similar thing for attributes.  It adds a new Python 3
script to scrape the generated HTML looking for documentation of
attributes, and uses this to (re)generate a new gcc/attr-urls.def file.

Running "make regenerate-attr-urls" after rebuilding the HTML docs will
regenerate gcc/attr-urls.def in the source directory.

The patch uses this to optionally add doc URLs for attributes in any
diagnostic emitted during the lifetime of an auto_urlify_attributes
instance, and adds such instances everywhere that a diagnostic refers
to an attribute within quotes (based on grepping the source tree
for references to attributes in strings and in code).

For example, given:

$ ./xgcc -B. -S ../../src/gcc/testsuite/gcc.dg/attr-access-2.c
../../src/gcc/testsuite/gcc.dg/attr-access-2.c:14:16: warning:
attribute ‘access(read_write, 2, 3)’ positional argument 2 conflicts
with previous designation by argument 1 [-Wattributes]

with this patch the quoted text `access(read_write, 2, 3)'
automatically gains the URL for our docs for "access":
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-access-function-attribute
in a sufficiently modern terminal.

Like r14-6920-g9e49746da303b8 this avoids the Makefile target
depending on the generated HTML, since a missing URL is a minor
problem, whereas requiring all users to build HTML docs seems more
involved.  Doing so also avoids Python 3 as a build requirement for
everyone, but instead just for developers adding attributes.
Like the options, we could add a CI test for this.

The patch gathers both general and target-specific attributes.
For example, the function attribute "interrupt" has 19 URLs within our
docs: one common, and 18 target-specific ones.
The patch adds a new target hook used when selecting the most
appropriate one.
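
One plausible selection rule can be sketched as below.  Everything here
is an assumption for illustration: the table contents, the target names
"TargetA" and "TargetB", the placeholder URL fragments and the function
toy_find_attr_url are all hypothetical, and the real lookup in
gcc-attribute-urlifier.cc may differ.  The sketch prefers an entry whose
target matches the documentation name and otherwise falls back to the
common entry.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical table row; a null target marks the common documentation.
struct toy_attr_url
{
  const char *name;
  const char *target;
  const char *url;
};

// Placeholder data, not real documentation URLs.
static const toy_attr_url toy_attr_urls[] = {
  { "interrupt", nullptr, "common#interrupt" },
  { "interrupt", "TargetA", "target-a#interrupt" },
  { "interrupt", "TargetB", "target-b#interrupt" },
};

// Return the URL for NAME, preferring the entry for TARGET_DOC_NAME.
inline const char *
toy_find_attr_url (const char *name, const char *target_doc_name)
{
  assert (name != nullptr);
  const char *fallback = nullptr;
  for (const toy_attr_url &e : toy_attr_urls)
    {
      if (std::strcmp (e.name, name) != 0)
        continue;
      if (e.target == nullptr)
        {
          if (fallback == nullptr)
            fallback = e.url;
        }
      else if (target_doc_name != nullptr
               && std::strcmp (e.target, target_doc_name) == 0)
        return e.url;
    }
  return fallback;
}
```

A target without its own entry would then get the common URL, and an
unknown attribute name would get no URL at all.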

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/ChangeLog:
* Makefile.in (OBJS): Add gcc-attribute-urlifier.o.
(ATTR_URLS_HTML_DEPS): New.
(regenerate-attr-urls): New.
(regenerate-attr-urls-unit-test): New.
* attr-urls.def: New file.
* attribs.cc: Include "gcc-urlifier.h".
(decl_attributes): Use auto_urlify_attributes.
* config/aarch64/aarch64.cc (TARGET_DOCUMENTATION_NAME): New.
* config/arc/arc.cc (TARGET_DOCUMENTATION_NAME): New.
* config/arm/arm.cc (TARGET_DOCUMENTATION_NAME): New.
* config/bfin/bfin.cc (TARGET_DOCUMENTATION_NAME): New.
* config/bpf/bpf.cc (TARGET_DOCUMENTATION_NAME): New.
* config/epiphany/epiphany.cc (TARGET_DOCUMENTATION_NAME): New.
* config/gcn/gcn.cc (TARGET_DOCUMENTATION_NAME): New.
* config/h8300/h8300.cc (TARGET_DOCUMENTATION_NAME): New.
* config/i386/i386.cc (TARGET_DOCUMENTATION_NAME): New.
* config/ia64/ia64.cc (TARGET_DOCUMENTATION_NAME): New.
* config/m32c/m32c.cc (TARGET_DOCUMENTATION_NAME): New.
* config/m32r/m32r.cc (TARGET_DOCUMENTATION_NAME): New.
* config/m68k/m68k.cc (TARGET_DOCUMENTATION_NAME): New.
* config/mcore/mcore.cc (TARGET_DOCUMENTATION_NAME): New.
* config/microblaze/microblaze.cc (TARGET_DOCUMENTATION_NAME):
New.
* config/mips/mips.cc (TARGET_DOCUMENTATION_NAME): New.
* config/msp430/msp430.cc (TARGET_DOCUMENTATION_NAME): New.
* config/nds32/nds32.cc (TARGET_DOCUMENTATION_NAME): New.
* config/nvptx/nvptx.cc (TARGET_DOCUMENTATION_NAME): New.
* config/riscv/riscv.cc (TARGET_DOCUMENTATION_NAME): New.
* config/rl78/rl78.cc (TARGET_DOCUMENTATION_NAME): New.
* config/rs6000/rs6000.cc (TARGET_DOCUMENTATION_NAME): New.
* config/rx/rx.cc (TARGET_DOCUMENTATION_NAME): New.
* config/s390/s390.cc (TARGET_DOCUMENTATION_NAME): New.
* config/sh/sh.cc (TARGET_DOCUMENTATION_NAME): New.
* config/stormy16/stormy16.cc (TARGET_DOCUMENTATION_NAME): New.
* config/v850/v850.cc (TARGET_DOCUMENTATION_NAME): New.
* config/visium/visium.cc (TARGET_DOCUMENTATION_NAME): New.

gcc/analyzer/ChangeLog:
* region-model.cc: Include "gcc-urlifier.h".
(reason_attr_access::emit): Use auto_urlify_attributes.
* sm-taint.cc: Include "gcc-urlifier.h".
(tainted_access_attrib_size::emit): Use auto_urlify_attributes.

gcc/c-family/ChangeLog:
* c-attribs.cc: Include "gcc-urlifier.h".
(positional_argument): Use auto_urlify_attributes.
* c-common.cc: Include "gcc-urlifier.h".
(parse_optimize_options): Use auto_urlify_attributes with
OPT_Wattributes.
(attribute_fallthrough_p): Use auto_urlify_attributes.
* c-warn.cc: Include "gcc-urlifier.h".
(diagnose_mismatched_attributes): Use auto_urlify_attributes.

gcc/c/ChangeLog:
* c-decl.cc: Include "gcc-urlifier.h".
(start_decl): Use auto_urlify_attributes with OPT_Wattributes.
(start_function): Likewise.
* c-parser.cc: Include "gcc-urlifier.h".
(c_parser_statement_after_labels): Use auto_urlify_attributes with
OPT_Wattributes.
* c-typeck.cc: Include "gcc-urlifier.h".
(maybe_warn_nodiscard): Use auto_urlify_attributes with
OPT_Wunused_result.

gcc/cp/ChangeLog:
* cp-gimplify.cc: Include "gcc-urlifier.h".
(process_stmt_hotness_attribute): Use auto_urlify_attributes with
OPT_Wattributes.
* cvt.cc: Include "gcc-urlifier.h".
(maybe_warn_nodiscard): Use auto_urlify_attributes with
OPT_Wunused_result.
* decl.cc: Include "gcc-urlifier.h".
(start_decl): Use auto_urlify_attributes.
(start_preparsed_function): Likewise.

gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::override_urlifier): New.
* diagnostic.h (diagnostic_context::override_urlifier): New decl.
* doc/extend.texi (Nvidia PTX Function Attributes): Update
@cindex to specify that "kernel" is a function attribute and
"shared" is a variable attribute, so that these entries are
recognized by the regex in regenerate-attr-urls.py.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_DOCUMENTATION_NAME): New.
* gcc-attribute-urlifier.cc: New file.
* gcc-urlifier.cc: Include diagnostic.h.
(gcc_urlifier::make_doc): Convert to...
(make_doc_url): ...this.
(auto_override_urlifier::auto_override_urlifier): New.
(auto_override_urlifier::~auto_override_urlifier): New.
(selftest::gcc_urlifier_cc_tests): Split out body into...
(selftest::test_gcc_urlifier): ...this.
* gcc-urlifier.h: Include "pretty-print-urlifier.h" and "label-text.h".
(make_doc_url): New decl.
(class auto_override_urlifier): New.
(class attribute_urlifier): New.
(class auto_urlify_attributes): New.
* gimple-ssa-warn-access.cc: Include "gcc-urlifier.h".
(pass_waccess::execute): Use auto_urlify_attributes.
* gimplify.cc: Include "gcc-urlifier.h".
(expand_FALLTHROUGH): Use auto_urlify_attributes.
* internal-fn.cc: Define INCLUDE_MEMORY and include
"gcc-urlifier.h".
(expand_FALLTHROUGH): Use auto_urlify_attributes.
* ipa-pure-const.cc: Include "gcc-urlifier.h".
(suggest_attribute): Use auto_urlify_attributes.
* ipa-strub.cc: Include "gcc-urlifier.h".
(can_strub_p): Use auto_urlify_attributes.
* regenerate-attr-urls.py: New file.
* selftest-run-tests.cc (selftest::run_tests): Call
gcc_attribute_urlifier_cc_tests.
* selftest.h (selftest::gcc_attribute_urlifier_cc_tests): New
decl.
* target.def (documentation_name): New DEFHOOKPOD.
* tree-cfg.cc: Include "gcc-urlifier.h".
(do_warn_unused_result): Use auto_urlify_attributes.
* tree-ssa-uninit.cc: Include "gcc-urlifier.h".
(maybe_warn_read_write_only): Use auto_urlify_attributes.
(maybe_warn_pass_by_reference): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agoc++: handle misspelled concepts and missing #include <concepts>
David Malcolm [Fri, 6 Dec 2024 16:33:59 +0000 (11:33 -0500)] 
c++: handle misspelled concepts and missing #include <concepts>

gcc/cp/ChangeLog:
* name-lookup.cc (suggest_alternative_in_explicit_scope):
Gracefully handle non-namespaces, such as scoped enums.
* parser.cc (cp_parser_name_lookup_error): Provide
a name_hint for the case where we're in an explicit scope.
* std-name-hint.gperf: Add <concepts>.
* std-name-hint.h: Regenerate.

gcc/testsuite/ChangeLog:
* g++.dg/concepts/missing-header.C: New test.
* g++.dg/concepts/misspelled-concept.C: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agoc++: consolidate location printing in error.cc [PR116253]
David Malcolm [Fri, 6 Dec 2024 16:29:54 +0000 (11:29 -0500)] 
c++: consolidate location printing in error.cc [PR116253]

Consolidate the location-printing logic in cp/error.cc, as preliminary
work towards supporting nested diagnostics (PR other/116253).

gcc/cp/ChangeLog:
PR other/116253
* error.cc (print_location): Move to earlier in the file.
(print_instantiation_partial_context_line): Replace
location-printing logic with a call to print_location.
(print_instantiation_partial_context): Likewise, splitting up
pp_verbatim calls.
(maybe_print_constexpr_context): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agoavr.opt.urls: Rebuild.
Georg-Johann Lay [Fri, 6 Dec 2024 13:45:38 +0000 (14:45 +0100)] 
avr.opt.urls: Rebuild.

gcc/
* config/avr/avr.opt.urls: Rebuild.

7 months agoAVR: Disable generation of CRC lookup tables.
Georg-Johann Lay [Fri, 6 Dec 2024 10:52:16 +0000 (11:52 +0100)] 
AVR: Disable generation of CRC lookup tables.

With -foptimize-crc, large lookup tables may be generated which
are placed in .rodata (RAM).  This patch disables such tables.

gcc/
* common/config/avr/avr-common.cc
(avr_option_optimization_table): Default to -fno-optimize-crc.

7 months agoavoid-store-forwarding: bail when an instruction may throw [PR117816]
kelefth [Thu, 5 Dec 2024 10:11:27 +0000 (11:11 +0100)] 
avoid-store-forwarding: bail when an instruction may throw [PR117816]

Avoid-store-forwarding doesn't handle the case where an instruction in
the store-load sequence contains a REG_EH_REGION note, leading to the
insertion of instructions after it, while it should be the last
instruction in the basic block. This causes an ICE when compiling
using `-O -fnon-call-exceptions -favoid-store-forwarding
-fno-forward-propagate -finstrument-functions`.

This patch rejects the transformation when there are instructions in
the sequence that may throw an exception.

PR rtl-optimization/117816

gcc/ChangeLog:

* avoid-store-forwarding.cc (store_forwarding_analyzer::avoid_store_forwarding):
Reject the transformation when having instructions that may
throw exceptions in the sequence.

gcc/testsuite/ChangeLog:

* gcc.dg/pr117816.c: New test.

7 months agonvptx: Support '-march=sm_89'
Thomas Schwinge [Tue, 12 Nov 2024 16:49:10 +0000 (17:49 +0100)] 
nvptx: Support '-march=sm_89'

gcc/
* config/nvptx/nvptx-sm.def: Add '89'.
* config/nvptx/nvptx-gen.h: Regenerate.
* config/nvptx/nvptx-gen.opt: Likewise.
* config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust.
* config/nvptx/nvptx.opt (-march-map=sm_89, -march-map=sm_90)
(-march-map=sm_90a): Likewise.
* config.gcc: Likewise.
* doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_89'.
* config/nvptx/gen-multilib-matches-tests: Extend.
gcc/testsuite/
* gcc.target/nvptx/march-map=sm_89.c: Adjust.
* gcc.target/nvptx/march-map=sm_90.c: Likewise.
* gcc.target/nvptx/march-map=sm_90a.c: Likewise.
* gcc.target/nvptx/march=sm_89.c: New.
libgomp/
* testsuite/libgomp.c/declare-variant-3-sm89.c: New.
* testsuite/libgomp.c/declare-variant-3.h: Adjust.

7 months agonvptx: Support '-mptx=7.8'
Thomas Schwinge [Tue, 12 Nov 2024 16:37:44 +0000 (17:37 +0100)] 
nvptx: Support '-mptx=7.8'

gcc/
* config/nvptx/nvptx-opts.h (enum ptx_version): Add
'PTX_VERSION_7_8'.
* config/nvptx/nvptx.cc (ptx_version_to_string)
(ptx_version_to_number): Adjust.
* config/nvptx/nvptx.h (TARGET_PTX_7_8): New.
* config/nvptx/nvptx.opt (Enum(ptx_version)): Add 'EnumValue'
'7.8' for 'PTX_VERSION_7_8'.
* doc/invoke.texi (Nvidia PTX Options): Document '-mptx=7.8'.
gcc/testsuite/
* gcc.target/nvptx/mptx=7.8.c: New.

7 months agonvptx: Support '-march=sm_52'
Thomas Schwinge [Sun, 10 Nov 2024 16:47:16 +0000 (17:47 +0100)] 
nvptx: Support '-march=sm_52'

gcc/
* config/nvptx/nvptx-sm.def: Add '52'.
* config/nvptx/nvptx-gen.h: Regenerate.
* config/nvptx/nvptx-gen.opt: Likewise.
* config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust.
* config/nvptx/nvptx.opt (-march-map=sm_52): Likewise.
* config.gcc: Likewise.
* doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_52'.
* config/nvptx/gen-multilib-matches-tests: Extend.
gcc/testsuite/
* gcc.target/nvptx/march-map=sm_52.c: Adjust.
* gcc.target/nvptx/march=sm_52.c: New.
libgomp/
* testsuite/libgomp.c/declare-variant-3-sm52.c: New.
* testsuite/libgomp.c/declare-variant-3.h: Adjust.

7 months agonvptx: Support '-march=sm_37'
Thomas Schwinge [Tue, 12 Nov 2024 07:31:53 +0000 (08:31 +0100)] 
nvptx: Support '-march=sm_37'

gcc/
* config/nvptx/nvptx-sm.def: Add '37'.
* config/nvptx/nvptx-gen.h: Regenerate.
* config/nvptx/nvptx-gen.opt: Likewise.
* config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust.
* config/nvptx/nvptx.opt (-march-map=sm_37, -march-map=sm_50):
Likewise.
* config.gcc: Likewise.
* doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_37'.
* config/nvptx/gen-multilib-matches-tests: Extend.
gcc/testsuite/
* gcc.target/nvptx/march-map=sm_37.c: Adjust.
* gcc.target/nvptx/march-map=sm_50.c: Likewise.
* gcc.target/nvptx/march-map=sm_52.c: Likewise.
* gcc.target/nvptx/march=sm_37.c: New.
libgomp/
* testsuite/libgomp.c/declare-variant-3-sm37.c: New.
* testsuite/libgomp.c/declare-variant-3.h: Adjust.

7 months agonvptx: Support '-mptx=4.1'
Thomas Schwinge [Sun, 10 Nov 2024 16:34:08 +0000 (17:34 +0100)] 
nvptx: Support '-mptx=4.1'

gcc/
* config/nvptx/nvptx-opts.h (enum ptx_version): Add
'PTX_VERSION_4_1'.
* config/nvptx/nvptx.cc (ptx_version_to_string)
(ptx_version_to_number): Adjust.
* config/nvptx/nvptx.h (TARGET_PTX_4_1): New.
* config/nvptx/nvptx.opt (Enum(ptx_version)): Add 'EnumValue'
'4.1' for 'PTX_VERSION_4_1'.
* doc/invoke.texi (Nvidia PTX Options): Document '-mptx=4.1'.
gcc/testsuite/
* gcc.target/nvptx/mptx=4.1.c: New.

7 months agonvptx: Expose '-mptx=4.2'
Thomas Schwinge [Sun, 10 Nov 2024 16:35:07 +0000 (17:35 +0100)] 
nvptx: Expose '-mptx=4.2'

'PTX_VERSION_4_2' was added in commit decde11183bdccc46587d6614b75f3d56a2f2e4a
"[nvptx] Choose -mptx default based on -misa" for use for '-march=sm_52'
('first_ptx_version_supporting_sm', 'PTX_ISA_SM53'), as documented by Nvidia.
However, '-mptx=4.2' wasn't exposed to the user, but there's no reason not to.

gcc/
* config/nvptx/nvptx.h (TARGET_PTX_4_2): New.
* config/nvptx/nvptx.opt (Enum(ptx_version)): Add 'EnumValue'
'4.2' for 'PTX_VERSION_4_2'.
* doc/invoke.texi (Nvidia PTX Options): Document '-mptx=4.2'.
gcc/testsuite/
* gcc.target/nvptx/mptx=4.2.c: New.

7 months agonvptx: Clarify that our baseline is PTX ISA Version 3.1
Thomas Schwinge [Sun, 10 Nov 2024 16:32:55 +0000 (17:32 +0100)] 
nvptx: Clarify that our baseline is PTX ISA Version 3.1

Added in commit decde11183bdccc46587d6614b75f3d56a2f2e4a
"[nvptx] Choose -mptx default based on -misa", 'PTX_VERSION_3_0' was added for
'first_ptx_version_supporting_sm' to return it for 'PTX_ISA_SM30' (as
documented by Nvidia).  It's however then immediately overridden to 3.1, which
in GCC/nvptx "has been the smallest version historically", and also '-mptx=3.0'
isn't exposed to the user.  As we also elsewhere (machine description etc.)
assume that our baseline is PTX ISA Version 3.1, there's no real value added in
maintaining 'PTX_VERSION_3_0' for purposes of 'first_ptx_version_supporting_sm'
only.

No change in behavior intended.

gcc/
* config/nvptx/nvptx-opts.h (enum ptx_version): Remove
'PTX_VERSION_3_0'.
* config/nvptx/nvptx.cc (first_ptx_version_supporting_sm)
(default_ptx_version_option, ptx_version_to_string)
(ptx_version_to_number): Adjust.
* config/nvptx/nvptx.h: Comment.

7 months agonvptx: Support '--with-multilib-list'
Thomas Schwinge [Fri, 27 Sep 2024 15:44:16 +0000 (17:44 +0200)] 
nvptx: Support '--with-multilib-list'

No change in behavior unless specifying it.

gcc/
* config.gcc: nvptx: Support '--with-multilib-list'.
* config/nvptx/gen-multilib-matches.sh: Adjust.
* configure.ac: Likewise.
* configure: Regenerate.
* doc/install.texi: Update.
* doc/invoke.texi: Align.
* config/nvptx/gen-multilib-matches-tests: Extend.

7 months agoarm,testsuite: Add -mtune=cortex-m55 to dlstp-compile-asm-1.c test.
Christophe Lyon [Fri, 6 Dec 2024 09:49:58 +0000 (09:49 +0000)] 
arm,testsuite: Add -mtune=cortex-m55 to dlstp-compile-asm-1.c test.

This test would fail if GCC is configured with non-default options,
such as -mtune=cortex-a9.

This 'unexpected' scheduling makes the DLSTP optimization generate
subs    lr, #16
bhi .L4
lctp
pop     {r4, r5, pc}
.L4:
sub     ip, ip, #16
b      <loop-begin>

instead of the expected
sub     ip, ip, #16
letp lr, <loop-begin>

Although GCC still optimizes all 144 loops, only 96 use letp, 48
others use lctp.

The patch simply forces -mtune=cortex-m55 to avoid this unexpected
issue.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/dlstp-compile-asm-1.c: Add -mtune=cortex-m55.

7 months agonvptx: Enhance '-march-map=[...]' test cases
Thomas Schwinge [Sun, 10 Nov 2024 17:29:25 +0000 (18:29 +0100)] 
nvptx: Enhance '-march-map=[...]' test cases

This expands upon the one test case added in
commit de0ef04419e90eacf0d1ddb265552a1b08c18d4b "[nvptx] Add march-map".

gcc/testsuite/
* gcc.target/nvptx/march-map.c: Remove; expanded into...
* gcc.target/nvptx/march-map=sm_50.c: ... this.
* gcc.target/nvptx/march-map=sm_30.c: New.
* gcc.target/nvptx/march-map=sm_32.c: Likewise.
* gcc.target/nvptx/march-map=sm_35.c: Likewise.
* gcc.target/nvptx/march-map=sm_37.c: Likewise.
* gcc.target/nvptx/march-map=sm_52.c: Likewise.
* gcc.target/nvptx/march-map=sm_53.c: Likewise.
* gcc.target/nvptx/march-map=sm_60.c: Likewise.
* gcc.target/nvptx/march-map=sm_61.c: Likewise.
* gcc.target/nvptx/march-map=sm_62.c: Likewise.
* gcc.target/nvptx/march-map=sm_70.c: Likewise.
* gcc.target/nvptx/march-map=sm_72.c: Likewise.
* gcc.target/nvptx/march-map=sm_75.c: Likewise.
* gcc.target/nvptx/march-map=sm_80.c: Likewise.
* gcc.target/nvptx/march-map=sm_86.c: Likewise.
* gcc.target/nvptx/march-map=sm_87.c: Likewise.
* gcc.target/nvptx/march-map=sm_89.c: Likewise.
* gcc.target/nvptx/march-map=sm_90.c: Likewise.
* gcc.target/nvptx/march-map=sm_90a.c: Likewise.
* gcc.target/nvptx/main.c: Remove.

7 months agonvptx: Enhance '-march=[...]' test cases
Thomas Schwinge [Sun, 10 Nov 2024 19:09:42 +0000 (20:09 +0100)] 
nvptx: Enhance '-march=[...]' test cases

This expands upon the test cases added in
commit 4706670cd3b06bb024da0683776bf86c79d55940
"[nvptx, testsuite] Add gcc.target/nvptx/sm*.c".

gcc/testsuite/
* gcc.target/nvptx/sm30.c: Remove; expanded into...
* gcc.target/nvptx/march=sm_30.c: ... this.
* gcc.target/nvptx/sm35.c: Remove; expanded into...
* gcc.target/nvptx/march=sm_35.c: ... this.
* gcc.target/nvptx/sm53.c: Remove; expanded into...
* gcc.target/nvptx/march=sm_53.c: ... this.
* gcc.target/nvptx/sm70.c: Remove; expanded into...
* gcc.target/nvptx/march=sm_70.c: ... this.
* gcc.target/nvptx/sm75.c: Remove; expanded into...
* gcc.target/nvptx/march=sm_75.c: ... this.
* gcc.target/nvptx/sm80.c: Remove; expanded into...
* gcc.target/nvptx/march=sm_80.c: ... this.
* gcc.target/nvptx/march.c: Remove.

7 months agonvptx: Enhance '-mptx=[...]' test cases
Thomas Schwinge [Sun, 10 Nov 2024 19:01:58 +0000 (20:01 +0100)] 
nvptx: Enhance '-mptx=[...]' test cases

This expands upon the test cases added in
commit a2eacdbd4c4a698b3b6f27ef5e1f8dd3d836b2e5
"[nvptx] Add __PTX_ISA_VERSION_{MAJOR,MINOR}__".

gcc/testsuite/
* gcc.target/nvptx/ptx31.c: Remove; expanded into...
* gcc.target/nvptx/mptx=3.1.c: ... this.
* gcc.target/nvptx/ptx60.c: Remove; expanded into...
* gcc.target/nvptx/mptx=6.0.c: ... this.
* gcc.target/nvptx/ptx63.c: Remove; expanded into...
* gcc.target/nvptx/mptx=6.3.c: ... this.
* gcc.target/nvptx/ptx70.c: Remove; expanded into...
* gcc.target/nvptx/mptx=7.0.c: ... this.
* gcc.target/nvptx/mptx=_.c: New.

7 months agoUse new RAW_DATA_{U,S}CHAR_ELT macros in the middle-end and C FE
Jakub Jelinek [Fri, 6 Dec 2024 10:00:52 +0000 (11:00 +0100)] 
Use new RAW_DATA_{U,S}CHAR_ELT macros in the middle-end and C FE

During the patch review of the C++ #embed optimization, Jason asked for
a macro for the common
((const unsigned char *) RAW_DATA_POINTER (value))[i]
and ditto with signed char patterns which appear in a lot of places.
In the just committed patch I've added
+#define RAW_DATA_UCHAR_ELT(NODE, I) \
+  (((const unsigned char *) RAW_DATA_POINTER (NODE))[I])
+#define RAW_DATA_SCHAR_ELT(NODE, I) \
+  (((const signed char *) RAW_DATA_POINTER (NODE))[I])
macros for that in tree.h.

The following patch is just a cleanup to use those macros where appropriate.
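
As a rough illustration of what the two macros do, the sketch below reuses their definitions on a stand-in node type.  The 'raw_data_node' struct, the simplified RAW_DATA_POINTER and the 'uchar_elt'/'schar_elt' helpers are assumptions made for this example only; GCC's real tree node and accessor in tree.h differ.

```cpp
#include <cassert>

// Hypothetical stand-in for a RAW_DATA_CST tree node; GCC's real node
// type and RAW_DATA_POINTER accessor live in tree.h and differ from this.
struct raw_data_node
{
  const char *ptr;
};

#define RAW_DATA_POINTER(NODE) ((NODE)->ptr)

// Same shape as the macros quoted above.
#define RAW_DATA_UCHAR_ELT(NODE, I) \
  (((const unsigned char *) RAW_DATA_POINTER (NODE))[I])
#define RAW_DATA_SCHAR_ELT(NODE, I) \
  (((const signed char *) RAW_DATA_POINTER (NODE))[I])

// Read byte I of NODE as an unsigned value (0..255).
inline int
uchar_elt (const raw_data_node *node, int i)
{
  return RAW_DATA_UCHAR_ELT (node, i);
}

// Read byte I of NODE as a signed value (-128..127).
inline int
schar_elt (const raw_data_node *node, int i)
{
  return RAW_DATA_SCHAR_ELT (node, i);
}
```

The same stored byte thus reads back differently depending on which macro is used, which is why both variants exist.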

2024-12-06  Jakub Jelinek  <jakub@redhat.com>

gcc/
* gimplify.cc (gimplify_init_ctor_eval): Use RAW_DATA_UCHAR_ELT
macro.
* gimple-fold.cc (fold_array_ctor_reference): Likewise.
* tree-pretty-print.cc (dump_generic_node): Use RAW_DATA_UCHAR_ELT
and RAW_DATA_SCHAR_ELT macros.
* fold-const.cc (fold): Use RAW_DATA_UCHAR_ELT macro.
gcc/c/
* c-parser.cc (c_parser_get_builtin_args, c_parser_expression,
c_parser_expr_list): Use RAW_DATA_UCHAR_ELT macro.
* c-typeck.cc (digest_init): Use RAW_DATA_UCHAR_ELT and
RAW_DATA_SCHAR_ELT macros.
(add_pending_init, maybe_split_raw_data): Use RAW_DATA_UCHAR_ELT
macro.

7 months agoMore duplicates reported by genmatch
Richard Biener [Thu, 5 Dec 2024 12:47:36 +0000 (13:47 +0100)] 
More duplicates reported by genmatch

Here are some less obvious cases of duplicates, mostly of the
form (op (op:c @0 @1) (op:c @0 @1)) where it's enough to have
one :c to get all relevant cases.
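
The counting argument can be checked with a small toy model.  This is not genmatch's algorithm: 'variant', 'canonicalize' and 'distinct_variants' are names invented for this sketch, which only enumerates the operand orders of (op (inner @0 @1) (inner @0 @1)) and treats variants that differ solely in capture naming as equal.

```cpp
#include <array>
#include <cassert>
#include <set>
#include <utility>

// One expanded variant of (op (inner A B) (inner C D)): the four capture
// indices in operand order.
using variant = std::array<int, 4>;

// Rename captures in order of first appearance so that variants differing
// only in capture naming compare equal.
inline variant
canonicalize (const variant &v)
{
  int rename[2] = { -1, -1 };
  int next = 0;
  variant out{};
  for (int i = 0; i < 4; ++i)
    {
      if (rename[v[i]] < 0)
        rename[v[i]] = next++;
      out[i] = rename[v[i]];
    }
  return out;
}

// Count structurally distinct variants of
//   (op (inner[:c] @0 @1) (inner[:c] @0 @1))
// where FIRST_C / SECOND_C say whether each inner operation carries :c.
inline int
distinct_variants (bool first_c, bool second_c)
{
  std::set<variant> seen;
  for (int swap1 = 0; swap1 <= (first_c ? 1 : 0); ++swap1)
    for (int swap2 = 0; swap2 <= (second_c ? 1 : 0); ++swap2)
      {
        variant v = { 0, 1, 0, 1 };
        if (swap1)
          std::swap (v[0], v[1]);
        if (swap2)
          std::swap (v[2], v[3]);
        seen.insert (canonicalize (v));
      }
  return static_cast<int> (seen.size ());
}
```

In this model two :c markers expand to four variants but only two distinct ones, the same two that a single :c already produces.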

* match.pd: Remove redundant :c, reported by genmatch as
duplicate patterns.

7 months agoRemove some duplicates reported by genmatch
Richard Biener [Thu, 5 Dec 2024 12:24:27 +0000 (13:24 +0100)] 
Remove some duplicates reported by genmatch

genmatch currently has difficulty deciding whether a duplicate
structural match is really duplicate as uses of captures within
predicates or in C code can be order dependent.  For example
a reported duplicate results in

 {
   tree captures[4] ATTRIBUTE_UNUSED = { _p1, _p0, _q20, _q21 };
   if (gimple_simplify_112 (res_op, seq, valueize, type, captures))
     return true;
 }
 {
   tree captures[4] ATTRIBUTE_UNUSED = { _p1, _p0, _q21, _q20 };
   if (gimple_simplify_112 (res_op, seq, valueize, type, captures))
     return true;
 }

where the only difference is that _q20 and _q21 are swapped, but that
results in a call to bitwise_inverted_equal_p (_p1, X) with X being
_q20 in one case and _q21 in the other.  That is, we treat bare
captures as equal for reporting duplicates.

Due to bitwise_inverted_equal_p there are meanwhile a _lot_ of
duplicates reported that are not actual duplicates.

The following removes some that are though, as the operands are
only passed to types_match.

* match.pd (.SAT_ADD patterns using IFN_ADD_OVERFLOW): Remove :c that
only causes duplicate patterns.

7 months agoRISC-V: Add --with-cmodel configure option
Hau Hsu [Fri, 2 Aug 2024 05:11:51 +0000 (13:11 +0800)] 
RISC-V: Add --with-cmodel configure option

Sometimes we want to use a default cmodel other than medlow. Add a GCC
configure option for that.

gcc/ChangeLog:

* config.gcc (riscv*-*-*): Add support for --with-cmodel configure option.
(all_defaults): Add cmodel.
* config/riscv/riscv.h (TARGET_DEFAULT_CMODEL): Remove.
* doc/install.texi: Document --with-cmodel configure option.
* doc/invoke.texi (-mcmodel): Mention --with-cmodel configure option.

Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
7 months ago'gcc/config/nvptx/gen-multilib-matches.sh': Support '--selftest'
Thomas Schwinge [Mon, 2 Dec 2024 15:50:16 +0000 (16:50 +0100)] 
'gcc/config/nvptx/gen-multilib-matches.sh': Support '--selftest'

..., and invoke that before actual use.

gcc/
* config/nvptx/gen-multilib-matches.sh: Support '--selftest'.
* config/nvptx/t-nvptx (t-nvptx-gen-multilib-matches:): Invoke it.
* config/nvptx/gen-multilib-matches-tests: New.

7 months ago'gcc/config/nvptx/gen-*.sh': Simplify interface
Thomas Schwinge [Mon, 2 Dec 2024 15:42:14 +0000 (16:42 +0100)] 
'gcc/config/nvptx/gen-*.sh': Simplify interface

What we currently pass in as '$1' is simply 'dirname "$0"'.

gcc/
* config/nvptx/gen-h.sh: Don't pass in '$1'; compute it locally.
* config/nvptx/gen-multilib-matches.sh: Likewise.
* config/nvptx/gen-omp-device-properties.sh: Likewise.
* config/nvptx/gen-opt.sh: Likewise.
* config/nvptx/t-nvptx (s-nvptx-gen-h:, s-nvptx-gen-opt:)
(t-nvptx-gen-multilib-matches:): Adjust.
* config/nvptx/t-omp-device (omp-device-properties-nvptx):
Likewise.

7 months ago'gcc/config/nvptx/gen-multilib-matches.sh': Encapsulate main logic
Thomas Schwinge [Mon, 2 Dec 2024 15:34:03 +0000 (16:34 +0100)] 
'gcc/config/nvptx/gen-multilib-matches.sh': Encapsulate main logic

Refactoring for later extension.  No change in behavior intended.

gcc/
* config/nvptx/gen-multilib-matches.sh: Encapsulate main logic.

7 months ago'gcc/config/nvptx/t-nvptx': Don't use the 'shell' function of 'make'
Thomas Schwinge [Mon, 2 Dec 2024 14:06:58 +0000 (15:06 +0100)] 
'gcc/config/nvptx/t-nvptx': Don't use the 'shell' function of 'make'

The exit status of the command invoked in a 'Makefile' via '$(shell [...])'
effectively gets discarded (unless explicitly checking the GNU Make 4.2+
'.SHELLSTATUS' variable or jumping through other hoops).  In order to be able
to catch errors in what the 'shell' function invokes, let's make things
explicit: similar to how 'gcc/config/avr/t-avr' is doing with 't-multilib-avr',
for example.

gcc/
* config/nvptx/t-nvptx (multilib_matches): Don't use the 'shell'
function of 'make'.
* config/nvptx/gen-multilib-matches.sh: Adjust.

7 months agonvptx: Tag '-misa=[...]', '-mptx=[...]' as 'Negative' of themselves [PR117916]
Thomas Schwinge [Wed, 4 Dec 2024 21:37:17 +0000 (22:37 +0100)] 
nvptx: Tag '-misa=[...]', '-mptx=[...]' as 'Negative' of themselves [PR117916]

This issue is similar to what a year ago I resolved for GCN in PR112669
"GCN: wrong 'LIBRARY_PATH' in presence of several different '-march=[...]' flags".

Given the current standard nvptx configuration, we get:

    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -mptx=6.3
    .
    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -mptx=3.1
    mptx-3.1

... as expected.  The following, however, is not:

    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -mptx=3.1 -mptx=6.3
    mptx-3.1

This should print '.'.

Or, in a '--with-arch=sm_70' configuration:

    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -misa=sm_70
    .
    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -misa=sm_30
    misa-sm_30

... as expected.  The following, however, are not:

    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -misa=sm_30 -misa=sm_70
    misa-sm_30
    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -misa=sm_30 -march=sm_70
    misa-sm_30
    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -march=sm_30 -march=sm_70
    misa-sm_30
    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -march=sm_30 -misa=sm_70
    misa-sm_30

These should all print '.'.

Even worse:

    $ build-gcc-offload-nvptx-none/gcc/xgcc -print-multi-directory -mgomp -mptx=3.1 -mptx=_
    .

This should print 'mgomp'.  Otherwise, for OpenMP offloading compilation
the wrong (non-'mgomp') multilib is linked in ('.'), and linking fails
due to 'unresolved symbol __nvptx_uni'.
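
The effect of tagging an option as 'Negative' of itself can be pictured with a much simplified model, in which a later occurrence of a tagged option removes earlier ones before the multilib lookup runs.  The helpers 'option_key', 'prune' and 'multi_directory' below are hypothetical and only mimic the first '-mptx=[...]' example; the real driver logic and multilib matching are considerably more involved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return the part of ARG before '=' (the whole string if there is none).
inline std::string
option_key (const std::string &arg)
{
  return arg.substr (0, arg.find ('='));
}

// Simplified model of driver pruning: when an option is tagged as
// 'Negative' of itself, a later occurrence removes earlier ones.
// SELF_NEGATIVE lists the keys treated that way.
inline std::vector<std::string>
prune (const std::vector<std::string> &args,
       const std::vector<std::string> &self_negative)
{
  std::vector<std::string> out;
  for (const std::string &arg : args)
    {
      const std::string key = option_key (arg);
      bool tagged = false;
      for (const std::string &k : self_negative)
        tagged |= (k == key);
      if (tagged)
        for (auto it = out.begin (); it != out.end ();)
          it = (option_key (*it) == key) ? out.erase (it) : it + 1;
      out.push_back (arg);
    }
  return out;
}

// Toy multilib lookup: the only non-default multilib is 'mptx-3.1',
// selected when '-mptx=3.1' survives pruning.
inline std::string
multi_directory (const std::vector<std::string> &args)
{
  for (const std::string &arg : args)
    if (arg == "-mptx=3.1")
      return "mptx-3.1";
  return ".";
}
```

With no tagged options, '-mptx=3.1 -mptx=6.3' still selects 'mptx-3.1' in this model; once '-mptx' is treated as self-negative, only '-mptx=6.3' survives and the lookup yields '.'.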

PR target/117916
gcc/
* config/nvptx/nvptx.opt (misa=, mptx=): Tag as 'Negative' of
themselves.

7 months agoClarify libgomp nvptx 'omp_low_lat_mem_space' documentation
Thomas Schwinge [Tue, 12 Nov 2024 08:54:35 +0000 (09:54 +0100)] 
Clarify libgomp nvptx 'omp_low_lat_mem_space' documentation

PTX '%dynamic_smem_size' was "Introduced in PTX ISA version 4.1", and
"Requires 'sm_20' or higher".  Given that GCC/nvptx generally supports
'sm_20', only the PTX ISA version matters here, and that's all fine if
just using GCC's defaults.  Follow-up to
commit e9a19ead498fcc89186b724c6e76854f7751a89b
"openmp, nvptx: low-lat memory access traits".

libgomp/
* libgomp.texi: Clarify nvptx 'omp_low_lat_mem_space'
documentation.

7 months agoFortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device...
Thomas Schwinge [Mon, 14 Oct 2024 08:45:06 +0000 (10:45 +0200)] 
Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Revert 'gimple_fold_builtin_acc_on_device' change

The motivation of the 'gimple_fold_builtin_acc_on_device' change in
commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11
"Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device"
is unclear, and it unnecessarily diverges GCC's (default)
'--disable-offload-targets' vs. '--enable-offload-targets=[...]'
configurations.

PR testsuite/82250
gcc/
* gimple-fold.cc (gimple_fold_builtin_acc_on_device): Revert last
change.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: Revert
last change.

7 months agotestsuite/117714 - gcc.dg/vect/slp-reduc-4.c FAILs on 32-bit SPARC
Richard Biener [Fri, 6 Dec 2024 08:37:35 +0000 (09:37 +0100)] 
testsuite/117714 - gcc.dg/vect/slp-reduc-4.c FAILs on 32-bit SPARC

The testcase tries to ensure we can elide all permutations when
vectorizing a MAX reduction.  For SPARC the issue is that the
MAX reduction isn't supported and since we're trying to fall back
to single-lane SLP the dumps contain VEC_PERM_EXPR for the
interleaving permute lowering.  Before all-SLP that wouldn't
be in the dumps when doing non-SLP, but eventually we'd fail to
vectorize so no VEC_PERM_EXPRs would be in the dumps either.

The following adds vect_no_int_min_max to the set of xfails for
this particular scan as well, like the existing check for vectorizing.

PR testsuite/117714
* gcc.dg/vect/slp-reduc-4.c: Add vect_no_int_min_max to the
XFAIL for the VEC_PERM_EXPR scan.

7 months agolibcpp, c++: Optimize initializers using #embed in C++
Jakub Jelinek [Fri, 6 Dec 2024 08:09:12 +0000 (09:09 +0100)] 
libcpp, c++: Optimize initializers using #embed in C++

This patch adds similar optimizations to the C++ FE as have been
implemented earlier in the C FE.
The libcpp hunk enables use of CPP_EMBED token even for C++, not just
C; the preprocessor guarantees there is always a CPP_NUMBER CPP_COMMA
before CPP_EMBED and CPP_COMMA CPP_NUMBER after it which simplifies
parsing (unless #embed is more than 2GB, in that case it could be
CPP_NUMBER CPP_COMMA CPP_EMBED CPP_COMMA CPP_EMBED CPP_COMMA CPP_EMBED
CPP_COMMA CPP_NUMBER etc. with each CPP_EMBED covering at most INT_MAX
bytes).
Similarly to the C patch, this patch parses it into RAW_DATA_CST tree
in the braced initializers (and from there peels into INTEGER_CSTs unless
it is an initializer of an std::byte array or integral array with CHAR_BIT
element precision), parses CPP_EMBED in cp_parser_expression into just
the last INTEGER_CST in it because I think users don't need millions of
-Wunused-value warnings because they did useless
  int a = (
  #embed "megabyte.dat"
  );
and so most of the inner INTEGER_CSTs would be there just for the warning,
and in the remaining contexts (template argument list, function argument
list, attribute argument list, ...) parses it into a sequence of INTEGER_CSTs
(I wrote range/iterator classes to simplify that).

My dumb
cat embed-11.c
constexpr unsigned char a[] = {
  #embed "cc1plus"
};
const unsigned char *b = a;
testcase where cc1plus is 492329008 bytes long when configured
--enable-checking=yes,rtl,extra against recent binutils with .base64 gas
support results in:
time ./xg++ -B ./ -S -O2 embed-11.c

real    0m4.350s
user    0m2.427s
sys     0m0.830s
time ./xg++ -B ./ -c -O2 embed-11.c

real    0m6.932s
user    0m6.034s
sys     0m0.888s
(compared to running out of memory or very long compilation).
On a shorter inclusion,
cat embed-12.c
constexpr unsigned char a[] = {
  #embed "xg++"
};
const unsigned char *b = a;
where xg++ is 15225904 bytes long, this takes using GCC with the #embed
patchset except for this patch:
time ~/src/gcc/obj36/gcc/xg++ -B ~/src/gcc/obj36/gcc/ -S -O2 embed-12.c

real    0m33.190s
user    0m32.327s
sys     0m0.790s
and with this patch:
time ./xg++ -B ./ -S -O2 embed-12.c

real    0m0.118s
user    0m0.090s
sys     0m0.028s

The patch doesn't change anything on what the first patch in the series
introduces even for C++, namely that #embed is expanded (actually or as if)
into a sequence of literals like
127,69,76,70,2,1,1,3,0,0,0,0,0,0,0,0,2,0,62,0,1,0,0,0,80,211,64,0,0,0,0,0,64,0,0,0,0,0,0,0,8,253
and so each element has int type.
That is how I believe it is in C23, and the different versions of the
C++ P1967 paper specified there some casts, P1967R12 in particular
"Otherwise, the integral constant expression is the value of std::fgetc’s return is cast
to unsigned char."
but please see
https://github.com/llvm/llvm-project/pull/97274#issuecomment-2230929277
comment and whether we really want the preprocessor to preprocess it for
C++ as (or as-if)
static_cast<unsigned char>(127),static_cast<unsigned char>(69),static_cast<unsigned char>(76),static_cast<unsigned char>(70),static_cast<unsigned char>(2),...
i.e. 9 tokens per byte rather than 2, or
(unsigned char)127,(unsigned char)69,...
or
((unsigned char)127),((unsigned char)69),...
etc.
Without a literal suffix for unsigned char constant literals it is horrible,
plus the incompatibility between C and C++.  Sure, we could use the magic
form more often for C++ to save the size and do the 9 or how many tokens
form only for the boundary constants and use #embed "." __gnu__::__base64__("...")
for what is in between if there are at least 2 tokens inside of it.
E.g. (unsigned char)127 vs. static_cast<unsigned char>(127) behaves
differently if there is constexpr long long p[] = { ... };
...
  #embed __FILE__
[p]
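
To make the difference concrete without relying on #embed itself, the sketch below writes out by hand what the last element followed by '[p]' would look like under each spelling.  The array contents and the index 3 are chosen purely for illustration.

```cpp
#include <cassert>
#include <type_traits>

constexpr long long p[] = { 10, 20, 30, 1000 };

// C-style cast: a cast-expression binds looser than the postfix '[p]',
// so this is (unsigned char) (3[p]), i.e. p[3] truncated to unsigned char.
constexpr auto c_style = (unsigned char) 3[p];

// static_cast<...>(3) is itself a postfix-expression, so '[p]' indexes
// with the already-cast value and the result keeps p's element type.
constexpr auto cxx_style = static_cast<unsigned char> (3)[p];

static_assert (std::is_same<decltype (c_style), const unsigned char>::value,
               "C-style form yields unsigned char");
static_assert (std::is_same<decltype (cxx_style), const long long>::value,
               "static_cast form yields long long");
```

Both spellings read p[3], but the C-style form converts the element to unsigned char afterwards, while the static_cast form only converts the index.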

2024-12-06  Jakub Jelinek  <jakub@redhat.com>

libcpp/
* files.cc (finish_embed): Use CPP_EMBED even for C++.
gcc/
* tree.h (RAW_DATA_UCHAR_ELT, RAW_DATA_SCHAR_ELT): Define.
gcc/cp/ChangeLog:
* cp-tree.h (class raw_data_iterator): New type.
(class raw_data_range): New type.
* parser.cc (cp_parser_postfix_open_square_expression): Handle
parsing of CPP_EMBED.
(cp_parser_parenthesized_expression_list): Likewise.  Use
cp_lexer_next_token_is.
(cp_parser_expression): Handle parsing of CPP_EMBED.
(cp_parser_template_argument_list): Likewise.
(cp_parser_initializer_list): Likewise.
(cp_parser_oacc_clause_tile): Likewise.
(cp_parser_omp_tile_sizes): Likewise.
* pt.cc (tsubst_expr): Handle RAW_DATA_CST.
* constexpr.cc (reduced_constant_expression_p): Likewise.
(raw_data_cst_elt): New function.
(find_array_ctor_elt): Handle RAW_DATA_CST.
(cxx_eval_array_reference): Likewise.
* typeck2.cc (digest_init_r): Emit -Wnarrowing and/or -Wconversion
diagnostics.
(process_init_constructor_array): Handle RAW_DATA_CST.
* decl.cc (maybe_deduce_size_from_array_init): Likewise.
(is_direct_enum_init): Fail for RAW_DATA_CST.
(cp_maybe_split_raw_data): New function.
(consume_init): New function.
(reshape_init_array_1): Add VECTOR_P argument.  Handle RAW_DATA_CST.
(reshape_init_array): Adjust reshape_init_array_1 caller.
(reshape_init_vector): Likewise.
(reshape_init_class): Handle RAW_DATA_CST.
(reshape_init_r): Likewise.
gcc/testsuite/
* c-c++-common/cpp/embed-22.c: New test.
* c-c++-common/cpp/embed-23.c: New test.
* g++.dg/cpp/embed-4.C: New test.
* g++.dg/cpp/embed-5.C: New test.
* g++.dg/cpp/embed-6.C: New test.
* g++.dg/cpp/embed-7.C: New test.
* g++.dg/cpp/embed-8.C: New test.
* g++.dg/cpp/embed-9.C: New test.
* g++.dg/cpp/embed-10.C: New test.
* g++.dg/cpp/embed-11.C: New test.
* g++.dg/cpp/embed-12.C: New test.
* g++.dg/cpp/embed-13.C: New test.
* g++.dg/cpp/embed-14.C: New test.

7 months agoSVE intrinsics: Fold calls with pfalse predicate.
Jennifer Schmitz [Fri, 15 Nov 2024 15:45:59 +0000 (07:45 -0800)] 
SVE intrinsics: Fold calls with pfalse predicate.

If an SVE intrinsic has predicate pfalse, we can fold the call to
a simplified assignment statement: For _m predication, the LHS can be assigned
the operand for inactive values and for _z, we can assign a zero vector.
For _x, the returned values can be arbitrary and as suggested by
Richard Sandiford, we fold to a zero vector.

For example,
svint32_t foo (svint32_t op1, svint32_t op2)
{
  return svadd_s32_m (svpfalse_b (), op1, op2);
}
can be folded to lhs = op1, such that foo is compiled to just a RET.
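
The _m and _z rules can be illustrated with a scalar model of predication.  This is only a sketch on fixed-size arrays, not SVE code: 'vec', 'pred', 'add_m' and 'add_z' are names made up for the example, and a four-lane vector is an arbitrary choice.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t lanes = 4;
using vec = std::array<int, lanes>;
using pred = std::array<bool, lanes>;

// Scalar model of merging (_m) predication: inactive lanes take the
// value of the first data operand.
inline vec
add_m (const pred &pg, const vec &op1, const vec &op2)
{
  vec res{};
  for (std::size_t i = 0; i < lanes; ++i)
    res[i] = pg[i] ? op1[i] + op2[i] : op1[i];
  return res;
}

// Scalar model of zeroing (_z) predication: inactive lanes become zero.
inline vec
add_z (const pred &pg, const vec &op1, const vec &op2)
{
  vec res{};
  for (std::size_t i = 0; i < lanes; ++i)
    res[i] = pg[i] ? op1[i] + op2[i] : 0;
  return res;
}
```

With an all-false predicate every lane is inactive, so the _m form reduces to a copy of op1 and the _z form to a zero vector, which is the simplification the fold performs.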

For implicit predication, a case distinction is necessary:
Intrinsics that read from memory can be folded to a zero vector.
Intrinsics that write to memory or prefetch can be folded to a no-op.
Other intrinsics need case-by-case implementation, which we added in
the corresponding svxxx_impl::fold.

We implemented this optimization during gimple folding by calling a new method
gimple_folder::fold_pfalse from gimple_folder::fold, which covers the generic
cases described above.

We tested the new behavior for each intrinsic with all supported predications
and data types and checked the produced assembly. There is a test file
for each shape subclass with scan-assembler-times tests that look for
the simplified instruction sequences, such as individual RET instructions
or zeroing moves. There is an additional directive counting the total number of
functions in the test, which must be the sum of counts of all other
directives. This is to check that all tested intrinsics were optimized.

A few intrinsics were not covered by this patch:
- svlasta and svlastb already have an implementation to cover a pfalse
predicate. No changes were made to them.
- svld1/2/3/4 return aggregate types and were excluded from the case
that folds calls with implicit predication to lhs = {0, ...}.
- svst1/2/3/4 already have an implementation in svstx_impl that precedes
our optimization, such that it is not triggered.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/ChangeLog:

PR target/106329
* config/aarch64/aarch64-sve-builtins-base.cc
(svac_impl::fold): Add folding if pfalse predicate.
(svadda_impl::fold): Likewise.
(class svaddv_impl): Likewise.
(class svandv_impl): Likewise.
(svclast_impl::fold): Likewise.
(svcmp_impl::fold): Likewise.
(svcmp_wide_impl::fold): Likewise.
(svcmpuo_impl::fold): Likewise.
(svcntp_impl::fold): Likewise.
(class svcompact_impl): Likewise.
(class svcvtnt_impl): Likewise.
(class sveorv_impl): Likewise.
(class svminv_impl): Likewise.
(class svmaxnmv_impl): Likewise.
(class svmaxv_impl): Likewise.
(class svminnmv_impl): Likewise.
(class svorv_impl): Likewise.
(svpfirst_svpnext_impl::fold): Likewise.
(svptest_impl::fold): Likewise.
(class svsplice_impl): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(class svcvtxnt_impl): Likewise.
(svmatch_svnmatch_impl::fold): Likewise.
* config/aarch64/aarch64-sve-builtins.cc
(is_pfalse): Return true if tree is pfalse.
(gimple_folder::fold_pfalse): Fold calls with pfalse predicate.
(gimple_folder::fold_call_to): Fold call to lhs = t for given tree t.
(gimple_folder::fold_to_stmt_vops): Helper function that folds the
call to given stmt and adjusts virtual operands.
(gimple_folder::fold): Call fold_pfalse.
* config/aarch64/aarch64-sve-builtins.h (is_pfalse): Declare is_pfalse.

gcc/testsuite/ChangeLog:

PR target/106329
* gcc.target/aarch64/pfalse-binary_0.h: New test.
* gcc.target/aarch64/pfalse-unary_0.h: New test.
* gcc.target/aarch64/sve/pfalse-binary.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_rotate.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binaryxn.c: New test.
* gcc.target/aarch64/sve/pfalse-clast.c: New test.
* gcc.target/aarch64/sve/pfalse-compare_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-count_pred.c: New test.
* gcc.target/aarch64/sve/pfalse-fold_left.c: New test.
* gcc.target/aarch64/sve/pfalse-load.c: New test.
* gcc.target/aarch64/sve/pfalse-load_ext.c: New test.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c: New test.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c: New test.
* gcc.target/aarch64/sve/pfalse-load_gather_sv.c: New test.
* gcc.target/aarch64/sve/pfalse-load_gather_vs.c: New test.
* gcc.target/aarch64/sve/pfalse-load_replicate.c: New test.
* gcc.target/aarch64/sve/pfalse-prefetch.c: New test.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c: New test.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c: New test.
* gcc.target/aarch64/sve/pfalse-ptest.c: New test.
* gcc.target/aarch64/sve/pfalse-rdffr.c: New test.
* gcc.target/aarch64/sve/pfalse-reduction.c: New test.
* gcc.target/aarch64/sve/pfalse-reduction_wide.c: New test.
* gcc.target/aarch64/sve/pfalse-shift_right_imm.c: New test.
* gcc.target/aarch64/sve/pfalse-store.c: New test.
* gcc.target/aarch64/sve/pfalse-store_scatter_index.c: New test.
* gcc.target/aarch64/sve/pfalse-store_scatter_offset.c: New test.
* gcc.target/aarch64/sve/pfalse-storexn.c: New test.
* gcc.target/aarch64/sve/pfalse-ternary_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-ternary_rotate.c: New test.
* gcc.target/aarch64/sve/pfalse-unary.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_convertxn.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_n.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_pred.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_to_uint.c: New test.
* gcc.target/aarch64/sve/pfalse-unaryxn.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_wide.c: New test.
* gcc.target/aarch64/sve2/pfalse-compare.c: New test.
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c:
New test.
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c:
New test.
* gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c: New test.
* gcc.target/aarch64/sve2/pfalse-load_gather_vs.c: New test.
* gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c: New test.
* gcc.target/aarch64/sve2/pfalse-shift_right_imm.c: New test.
* gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c:
New test.
* gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c:
New test.
* gcc.target/aarch64/sve2/pfalse-unary.c: New test.
* gcc.target/aarch64/sve2/pfalse-unary_convert.c: New test.
* gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c: New test.
* gcc.target/aarch64/sve2/pfalse-unary_to_int.c: New test.

7 months agortl-optimization/117922 - add timevar for fold-mem-offsets
Richard Biener [Fri, 6 Dec 2024 07:08:55 +0000 (08:08 +0100)] 
rtl-optimization/117922 - add timevar for fold-mem-offsets

The new fold-mem-offsets RTL pass takes a significant amount of time
and memory.  Add a timevar for it.

PR rtl-optimization/117922
* timevar.def (TV_FOLD_MEM_OFFSETS): New.
* fold-mem-offsets.cc (pass_data_fold_mem): Use TV_FOLD_MEM_OFFSETS.
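
For readers unfamiliar with how a .def entry becomes a usable timevar
identifier, the following is a minimal, self-contained sketch of the
X-macro technique that files such as timevar.def rely on.  Everything
in it is an assumption made for illustration: the SKETCH_/SK_ names,
the list macro standing in for the .def file, and the display strings
are hypothetical and are not copied from GCC.

```c
#include <assert.h>

/* Stand-in for a .def file: one X (...) entry per timing variable.
   In GCC the list lives in a separate file that is included several
   times; a list macro keeps this sketch in a single file.  The ids
   and strings below are hypothetical.  */
#define SKETCH_TIMEVARS(X)                          \
  X (SK_TV_BB_REORDER, "bb reorder")                \
  X (SK_TV_FOLD_MEM_OFFSETS, "fold mem offsets")

/* First expansion: one enumerator per entry.  */
enum sketch_timevar_id
{
#define X(id, name) id,
  SKETCH_TIMEVARS (X)
#undef X
  SK_TV_COUNT
};

/* Second expansion: the matching display names, in the same order.  */
static const char *const sketch_timevar_names[] =
{
#define X(id, name) name,
  SKETCH_TIMEVARS (X)
#undef X
};

/* Look up the display name for a timing variable.  */
const char *
sketch_timevar_name (enum sketch_timevar_id id)
{
  assert (id < SK_TV_COUNT);
  return sketch_timevar_names[id];
}
```

With this layout, giving a pass its own entry is a one-line change to
the list, after which the pass can refer to the new identifier instead
of sharing one with another pass.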

7 months agoc++: ICE with pack indexing empty pack [PR117898]
Marek Polacek [Wed, 4 Dec 2024 21:58:59 +0000 (16:58 -0500)] 
c++: ICE with pack indexing empty pack [PR117898]

Here we ICE with a partially-substituted pack indexing.  The pack
expanded to an empty pack, which we can't index.  It seems reasonable
to detect this case in tsubst_pack_index, even before we substitute
the index.  Other erroneous cases can wait until pack_index_element
where we have the index.

PR c++/117898

gcc/cp/ChangeLog:

* pt.cc (tsubst_pack_index): Detect indexing an empty pack.

gcc/testsuite/ChangeLog:

* g++.dg/cpp26/pack-indexing2.C: Adjust.
* g++.dg/cpp26/pack-indexing12.C: New test.

7 months agoRISC-V: Refactor the testcases for bswap16-0
Pan Li [Wed, 4 Dec 2024 02:08:12 +0000 (10:08 +0800)] 
RISC-V: Refactor the testcases for bswap16-0

This patch refactors the bswap16-0 testcase now that the different
optimization options are actually passed to it, so that the asm dump
check also fits a big LMUL such as m8.

The following test suite passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: Update
the vector register RE to cover v10 - v31.

Signed-off-by: Pan Li <pan2.li@intel.com>
7 months agoRISC-V: Fix incorrect optimization options passing to convert and unop
Pan Li [Wed, 4 Dec 2024 02:08:11 +0000 (10:08 +0800)] 
RISC-V: Fix incorrect optimization options passing to convert and unop

As with the strided load/store tests, the vector convert and unop
testcases are designed to pick up different sets of optimization
options, but according to the execution log in gcc.log these options
are actually ignored.

This patch corrects that in almost the same way as the fix for strided
load/store.

The following test suite passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Fix the incorrect optimization
options passing to testcases.

Signed-off-by: Pan Li <pan2.li@intel.com>
7 months agoDaily bump.
GCC Administrator [Fri, 6 Dec 2024 00:19:28 +0000 (00:19 +0000)] 
Daily bump.

7 months agoPR modula2/117904: cc1gm2 ICE when compiling a const built from VAL and SIZE
Gaius Mulley [Thu, 5 Dec 2024 20:31:34 +0000 (20:31 +0000)] 
PR modula2/117904: cc1gm2 ICE when compiling a const built from VAL and SIZE

This patch fixes an ICE which occurs when a positive ZType constant
increment is used during a FOR loop.

gcc/m2/ChangeLog:

PR modula2/117904
* gm2-compiler/M2GenGCC.mod (PerformLastForIterator): Add call to
BuildConvert when increment is > 0.

gcc/testsuite/ChangeLog:

PR modula2/117904
* gm2/iso/pass/forloopbyconst.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
7 months agoi386: Fix addcarry/subborrow issues [PR117860]
Uros Bizjak [Thu, 5 Dec 2024 16:02:46 +0000 (17:02 +0100)] 
i386: Fix addcarry/subborrow issues [PR117860]

Fix several things to enable combine to handle addcarry/subborrow patterns:

- Fix the wrong canonical form of the addcarry<mode> insn and friends. For a
commutative operator (PLUS RTX), a binary operand (LTU) takes precedence over
a unary operand (ZERO_EXTEND).

- Swap the operands of a GTU comparison to canonicalize the addcarry/subborrow
comparison. Again, the canonical form of the compare is PLUS RTX before
ZERO_EXTEND RTX. A GTU comparison is not a carry flag comparison, so we have
to swap operands in ix86_canonicalize_comparison to a non-canonical form
to use an LTU comparison.

- Return the correct compare mode (CCCmode) for the addcarry/subborrow pattern
from ix86_cc_mode, so combine is able to emit the required compare mode for
the combined insn.

- Add a *subborrow<mode>_1 pattern having the const_scalar_int_operand predicate.
Here, canonicalization of SUB (op1, const) RTX to PLUS (op1, -const) requires
negation of the constant operand when checking operands.

With the above changes, combine is able to create *addcarry_1/*subborrow_1
pattern with immediate operand for the testcase in the PR:

SomeAddFunc:
        addq    %rcx, %rsi      # 10    [c=4 l=3]  adddi3_cc_overflow_1/0
        movq    %rdi, %rax      # 33    [c=4 l=3]  *movdi_internal/3
        adcq    $5, %rdx        # 19    [c=4 l=4]  *addcarrydi_1/0
        movq    %rsi, (%rdi)    # 23    [c=4 l=3]  *movdi_internal/5
        movq    %rdx, 8(%rdi)   # 24    [c=4 l=4]  *movdi_internal/5
        setc    %dl     # 39    [c=4 l=3]  *setcc_qi
        movzbl  %dl, %edx       # 40    [c=4 l=3]  zero_extendqidi2/0
        movq    %rdx, 16(%rdi)  # 26    [c=4 l=4]  *movdi_internal/5
        ret             # 43    [c=0 l=1]  simple_return_internal

SomeSubFunc:
        subq    %rcx, %rsi      # 10    [c=4 l=3]  *subdi_3/0
        movq    %rdi, %rax      # 42    [c=4 l=3]  *movdi_internal/3
        sbbq    $17, %rdx       # 19    [c=4 l=4]  *subborrowdi_1/0
        movq    %rsi, (%rdi)    # 33    [c=4 l=3]  *movdi_internal/5
        sbbq    %rcx, %rcx      # 29    [c=8 l=3]  *x86_movdicc_0_m1_neg
        movq    %rdx, 8(%rdi)   # 34    [c=4 l=4]  *movdi_internal/5
        movq    %rcx, 16(%rdi)  # 35    [c=4 l=4]  *movdi_internal/5
        ret             # 51    [c=0 l=1]  simple_return_internal
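
The listings above pair a low-half add/sub with a high-half adc/sbb
that takes an immediate.  As a rough illustration of source code with
that shape, here is a portable C sketch.  It is not the testcase from
the PR: the sketch_ names, the struct layout and the function
signatures are assumptions, and only the immediates 5 and 17 are taken
from the listings.  Whether a given compiler turns this exact code
into adc/sbb with an immediate is not something the sketch claims.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: low half, high half, final carry/borrow.  */
typedef struct
{
  uint64_t lo, hi, carry;
} sketch_wide;

/* a + b + cin, reporting the carry-out (0 or 1) through *cout.  */
uint64_t
sketch_addc (uint64_t a, uint64_t b, unsigned cin, unsigned *cout)
{
  assert (cin <= 1);
  uint64_t partial = a + b;
  unsigned c1 = partial < a;      /* carry out of a + b */
  uint64_t sum = partial + cin;
  unsigned c2 = sum < partial;    /* carry out of adding cin */
  *cout = c1 | c2;
  return sum;
}

/* a - b - bin, reporting the borrow-out (0 or 1) through *bout.  */
uint64_t
sketch_subb (uint64_t a, uint64_t b, unsigned bin, unsigned *bout)
{
  assert (bin <= 1);
  uint64_t partial = a - b;
  unsigned b1 = a < b;            /* borrow out of a - b */
  uint64_t diff = partial - bin;
  unsigned b2 = partial < bin;    /* borrow out of subtracting bin */
  *bout = b1 | b2;
  return diff;
}

/* Low halves are added normally; the high half adds the constant 5
   plus the carry from the low half.  */
sketch_wide
sketch_add (uint64_t a_lo, uint64_t a_hi, uint64_t b_lo)
{
  sketch_wide r;
  unsigned c;
  r.lo = sketch_addc (a_lo, b_lo, 0, &c);
  r.hi = sketch_addc (a_hi, 5, c, &c);
  r.carry = c;
  return r;
}

/* Same shape for subtraction, with the constant 17 in the high half.  */
sketch_wide
sketch_sub (uint64_t a_lo, uint64_t a_hi, uint64_t b_lo)
{
  sketch_wide r;
  unsigned b;
  r.lo = sketch_subb (a_lo, b_lo, 0, &b);
  r.hi = sketch_subb (a_hi, 17, b, &b);
  r.carry = b;
  return r;
}
```

Note that the sketch stores the final borrow as 0 or 1, whereas the
SomeSubFunc listing materializes it with sbbq %rcx, %rcx, which yields
0 or -1.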

PR target/117860

gcc/ChangeLog:

* config/i386/i386.cc (ix86_canonicalize_comparison): Swap
operands of GTU comparison to canonicalize addcarry/subborrow
comparison.
(ix86_cc_mode): Return CCCmode for the comparison of
addcarry/subborrow pattern.
* config/i386/i386.md (addcarry<mode>): Swap operands of
PLUS RTX to make it canonical.
(*addcarry<mode>_1): Ditto.
(addcarry peephole2s): Update RTXes for addcarry<mode>_1 change.
(*add<dwi>3_doubleword_cc_overflow_1): Ditto.
(*subborrow<mode>_1): New insn pattern.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117860.c: New test.

7 months agoarm: remove support for iWMMX/iWMMX2 intrinsics
Richard Earnshaw [Thu, 5 Dec 2024 15:14:09 +0000 (15:14 +0000)] 
arm: remove support for iWMMX/iWMMX2 intrinsics

The mmintrin.h header was adjusted for GCC-14 to generate a
(suppressible) warning if it was used, saying that support would be
removed in GCC-15.

Make that come true by removing the contents of this header and
emitting an error.

At this point in time I've not removed the internal support for the
intrinsics, just the wrappers that enable access to them.  That can be
done at leisure from now on.

gcc/ChangeLog:

* config/arm/mmintrin.h: Raise an error if this header is used.
Remove other content.

7 months agoaarch64: Mark vluti* intrinsics as QUIET
Richard Sandiford [Thu, 5 Dec 2024 15:33:11 +0000 (15:33 +0000)] 
aarch64: Mark vluti* intrinsics as QUIET

This patch fixes the vluti* definitions to say that they don't
raise FP exceptions even for floating-point modes.

gcc/
* config/aarch64/aarch64-simd-pragma-builtins.def
(ENTRY_TERNARY_VLUT8): Use FLAG_QUIET rather than FLAG_DEFAULT.
(ENTRY_TERNARY_VLUT16): Likewise.

7 months agoaarch64: Reintroduce FLAG_AUTO_FP
Richard Sandiford [Thu, 5 Dec 2024 15:33:10 +0000 (15:33 +0000)] 
aarch64: Reintroduce FLAG_AUTO_FP

The flag now known as FLAG_QUIET is an odd-one-out in that it
removes side-effects rather than adding them.  This patch inverts
it and gives it the old name FLAG_AUTO_FP.  FLAG_QUIET now means
"no flags" instead.

gcc/
* config/aarch64/aarch64-builtins.cc (FLAG_QUIET): Redefine to 0,
replacing the old flag with...
(FLAG_AUTO_FP): ...this.
(FLAG_DEFAULT): Redefine to FLAG_AUTO_FP.
(aarch64_call_properties): Update accordingly.
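
The flag semantics described above can be sketched as follows.  This
is only an illustration under stated assumptions: the SKETCH_ names,
the bit values, the two individual FP property bits and the boolean
standing in for a floating-point mode test are all hypothetical, and
the function is not the body of aarch64_call_properties.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit assignments, chosen only for this sketch.  */
#define SKETCH_FLAG_READ_FPCR  (1u << 0)
#define SKETCH_FLAG_RAISE_FP   (1u << 1)
#define SKETCH_FLAG_FP         (SKETCH_FLAG_READ_FPCR | SKETCH_FLAG_RAISE_FP)
#define SKETCH_FLAG_AUTO_FP    (1u << 2)

/* After the inversion: QUIET means "no flags", and DEFAULT asks for
   the FP properties to be derived from the mode.  */
#define SKETCH_FLAG_QUIET      0u
#define SKETCH_FLAG_DEFAULT    SKETCH_FLAG_AUTO_FP

#define SKETCH_FLAG_ALL        (SKETCH_FLAG_FP | SKETCH_FLAG_AUTO_FP)

/* Return the effective properties of a call: AUTO_FP adds the FP
   side-effects when the mode is a floating-point mode, and is then
   dropped because it is a request rather than a property.  */
unsigned
sketch_call_properties (unsigned flags, bool is_float_mode)
{
  assert ((flags & ~SKETCH_FLAG_ALL) == 0);
  if ((flags & SKETCH_FLAG_AUTO_FP) && is_float_mode)
    flags |= SKETCH_FLAG_FP;
  return flags & ~SKETCH_FLAG_AUTO_FP;
}
```

Under these assumptions every flag adds side-effects and none removes
them, which is the property the inversion is meant to restore.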

7 months agoaarch64: Rename FLAG_NONE to FLAG_DEFAULT
Richard Sandiford [Thu, 5 Dec 2024 15:33:10 +0000 (15:33 +0000)] 
aarch64: Rename FLAG_NONE to FLAG_DEFAULT

This patch renames FLAG_NONE to FLAG_DEFAULT.  "NONE" suggests
that the function has no side-effects, whereas it actually means
that floating-point operations are assumed to read FPCR and to
raise FP exceptions.

gcc/
* config/aarch64/aarch64-builtins.cc (FLAG_NONE): Rename to...
(FLAG_DEFAULT): ...this and update all references.
* config/aarch64/aarch64-simd-builtins.def: Update all references
here too.
* config/aarch64/aarch64-simd-pragma-builtins.def: Likewise.

7 months agoaarch64: Rename FLAG_AUTO_FP to FLAG_QUIET
Richard Sandiford [Thu, 5 Dec 2024 15:33:09 +0000 (15:33 +0000)] 
aarch64: Rename FLAG_AUTO_FP to FLAG_QUIET

I'd suggested the name "FLAG_AUTO_FP" to mean "automatically derive
FLAG_FP from the mode", i.e. automatically decide whether the function
might read the FPCR or might raise FP exceptions.  However, the flag
currently suppresses that behaviour instead.

This patch renames FLAG_AUTO_FP to FLAG_QUIET.  That's probably not a
great name, but it's also what the SVE code means by "quiet", and is
borrowed from "quiet NaNs".

gcc/
* config/aarch64/aarch64-builtins.cc (FLAG_AUTO_FP): Rename to...
(FLAG_QUIET): ...this and update all references.
* config/aarch64/aarch64-simd-builtins.def: Update all references
here too.