]> git.ipfire.org Git - thirdparty/gcc.git/log
thirdparty/gcc.git
7 months agolibstdc++: Add missing character to __to_wstring_numeric map
Jonathan Wakely [Mon, 16 Dec 2024 09:45:40 +0000 (09:45 +0000)] 
libstdc++: Add missing character to __to_wstring_numeric map

The mapping from char to wchar_t needs to handle 'i' and 'I' but those
were absent from the table that is used for some non-ASCII encodings.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (__to_wstring_numeric): Add 'i'
and 'I' to mapping.

7 months agolibstdc++: Call regex_traits::transform_primary() only when necessary [PR98723]
Luca Bacci [Tue, 17 Dec 2024 18:57:30 +0000 (18:57 +0000)] 
libstdc++: Call regex_traits::transform_primary() only when necessary [PR98723]

This is both a performance optimization and a partial fix for PR 98723.

This commit fixes the issue for bracket expressions that do not depend
on the locale's collation facet. Examples:

 * Character ranges ([a-z]) when std::regex::collate is not set
 * Character classes ([:alnum:])
 * Individual characters ([abc])

Signed-off-by: Luca Bacci <luca.bacci982@gmail.com>
libstdc++-v3/ChangeLog:

PR libstdc++/98723
* include/bits/regex_compiler.tcc (_BracketMatcher::_M_apply):
Only use transform_primary when an equivalence set is used.

7 months agoDocumentation: Fix paste-o in recent OpenMP/OpenACC patch
Sandra Loosemore [Wed, 18 Dec 2024 03:51:54 +0000 (03:51 +0000)] 
Documentation: Fix paste-o in recent OpenMP/OpenACC patch

gcc/ChangeLog
* doc/extend.texi (OpenACC): Fix paste-o.

7 months agoc++: modules: Fix 32-bit overflow with 64-bit location_t [PR117970]
Lewis Hyatt [Wed, 18 Dec 2024 02:26:18 +0000 (21:26 -0500)] 
c++: modules: Fix 32-bit overflow with 64-bit location_t [PR117970]

With the move to 64-bit location_t in r15-6016, I missed a spot in module.cc
where a location_t was still being stored in a 32-bit int. Fixed.

The xtreme-header* tests in modules.exp were still passing fine on lots of
architectures that were tested (x86-64, i686, aarch64, sparc, riscv64), but
the PR shows that they were failing in some particular risc-v multilib
configurations. They pass now.

gcc/cp/ChangeLog:

PR c++/117970
* module.cc (module_state::read_ordinary_maps): Change argument to
line_map_uint_t instead of unsigned int.

7 months agoDaily bump.
GCC Administrator [Wed, 18 Dec 2024 00:18:58 +0000 (00:18 +0000)] 
Daily bump.

7 months agoc++: print NONTYPE_ARGUMENT_PACK [PR118073]
Marek Polacek [Tue, 17 Dec 2024 18:44:22 +0000 (13:44 -0500)] 
c++: print NONTYPE_ARGUMENT_PACK [PR118073]

This PR points out that we're not pretty-printing NONTYPE_ARGUMENT_PACK
so the compiler emits the ugly:

  'nontype_argument_pack' not supported by dump_expr<expression error>>

Fixed thus.  I've wrapped the elements of the pack in { } because that's
what cxx_pretty_printer::expression does.

PR c++/118073

gcc/cp/ChangeLog:

* error.cc (dump_expr) <case NONTYPE_ARGUMENT_PACK>: New case.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/arg-pack1.C: New test.

7 months agolibstdc++: Fix -Wparentheses warning in Debug Mode macro
Jonathan Wakely [Fri, 13 Dec 2024 16:53:06 +0000 (16:53 +0000)] 
libstdc++: Fix -Wparentheses warning in Debug Mode macro

libstdc++-v3/ChangeLog:

* include/debug/safe_local_iterator.h (_GLIBCXX_DEBUG_VERIFY_OPERANDS):
Add parentheses to avoid -Wparentheses warning.

7 months agolibstdc++: Fix std::deque::insert(pos, first, last) undefined behaviour [PR118035]
Jonathan Wakely [Mon, 16 Dec 2024 17:42:24 +0000 (17:42 +0000)] 
libstdc++: Fix std::deque::insert(pos, first, last) undefined behaviour [PR118035]

Inserting an empty range into a std::deque results in undefined calls to
either std::copy, std::copy_backward, std::move, or std::move_backward.
We call those algos with invalid arguments where the output range is the
same as the input range, e.g.  std::copy(first, last, first) which
violates the preconditions for the algorithms.

This fix simply returns early if there's nothing to insert. Most callers
already ensure that we don't even call _M_range_insert_aux with an empty
range, but some callers don't. Rather than checking for n == 0 in each
of the callers, this just does the check once and uses __builtin_expect
to treat empty insertions as unlikely.

libstdc++-v3/ChangeLog:

PR libstdc++/118035
* include/bits/deque.tcc (_M_range_insert_aux): Return
immediately if inserting an empty range.
* testsuite/23_containers/deque/modifiers/insert/118035.cc: New
test.

7 months agoDocumentation: Make OpenMP/OpenACC docs easier to find [PR26154]
Sandra Loosemore [Tue, 17 Dec 2024 15:19:36 +0000 (15:19 +0000)] 
Documentation: Make OpenMP/OpenACC docs easier to find [PR26154]

PR c/26154 is one of our oldest documentation issues.  The only
discussion of OpenMP support in the GCC manual is buried in the "C
Dialect Options" section, with nothing at all under "Extensions".  The
Fortran manual does have separate sections for OpenMP and OpenACC
extensions so I have copy-edited/adapted that text for similar sections
in the GCC manual, as well as breaking out the OpenMP and OpenACC options
into their own section (they apply to all of C, C++, and Fortran).

I also updated the information about what versions of OpenMP and
OpenACC are supported and removed some redundant text from the Fortran
manual to prevent it from getting out of sync on future updates, and
inserted some cross-references to the new sections elsewhere.

gcc/c-family/ChangeLog
PR c/26154
* c.opt.urls: Regenerated.

gcc/ChangeLog
PR c/26154
* common.opt.urls: Regenerated.
* doc/extend.texi (C Extensions): Adjust menu for new sections.
(Attribute Syntax): Mention OpenMP directives.
(Pragmas): Mention OpenMP and OpenACC directives.
(OpenMP): New section.
(OpenACC): New section.
* doc/invoke.texi (Invoking GCC): Adjust menu for new section.
(Option Summary): Move OpenMP and OpenACC options to their own
category.
(C Dialect Options): Move documentation for -foffload, -fopenacc,
-fopenacc-dim, -fopenmp, -fopenmd-simd, and
-fopenmp-target-simd-clone to...
(OpenMP and OpenACC Options): ...this new section.  Light
copy-editing of the option descriptions.

gcc/fortran/ChangeLog:
PR c/26154
* gfortran.texi (Standards): Remove redundant info about
OpenMP/OpenACC standard support.
(OpenMP): Copy-editing and update version info.
(OpenACC): Likewise.
* lang.opt.urls: Regenerated.

7 months agomiddle-end/118062 - bogus lowering of vector compares
Richard Biener [Tue, 17 Dec 2024 10:23:02 +0000 (11:23 +0100)] 
middle-end/118062 - bogus lowering of vector compares

The generic expand_vector_piecewise routine supports lowering of
a vector operation to vector operations of smaller size.  When
computing the extract position from the larger vector it uses the
element size in bits of the original result vector to determine
the number of elements in the smaller vector.  That is wrong when
lowering a compare as the vector element size of a bool vector
does not have to agree with that of the compare operand.  The
following simplifies this, fixing the error.

PR middle-end/118062
* tree-vect-generic.cc (expand_vector_piecewise): Properly
compute delta.

7 months agoc++: ICE initializing array of aggrs [PR117985]
Marek Polacek [Thu, 12 Dec 2024 19:56:07 +0000 (14:56 -0500)] 
c++: ICE initializing array of aggrs [PR117985]

This crash started with my r12-7803 but I believe the problem lies
elsewhere.

build_vec_init has cleanup_flags whose purpose is -- if I grok this
correctly -- to avoid destructing an object multiple times.  Let's
say we are initializing an array of A.  Then we might end up in
a scenario similar to initlist-eh1.C:

  try
    {
      call A::A in a loop
      // #0
      try
        {
  call a fn using the array
}
      finally
{
  // #1
  call A::~A in a loop
}
    }
  catch
    {
      // #2
      call A::~A in a loop
    }

cleanup_flags makes us emit a statement like

  D.3048 = 2;

at #0 to disable performing the cleanup at #2, since #1 will take
care of the destruction of the array.

But if we are not emitting the loop because we can use a constant
initializer (and use a single { a, b, ...}), we shouldn't generate
the statement resetting the iterator to its initial value.  Otherwise
we crash in gimplify_var_or_parm_decl because it gets the stray decl
D.3048.

PR c++/117985

gcc/cp/ChangeLog:

* init.cc (build_vec_init): Pop CLEANUP_FLAGS if we're not
generating the loop.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-array23.C: New test.
* g++.dg/cpp0x/initlist-array24.C: New test.

7 months ago[PATCH] RISC-V: optimization on checking certain bits set ((x & mask) == val)
Oliver Kozul [Tue, 17 Dec 2024 14:44:33 +0000 (07:44 -0700)] 
[PATCH] RISC-V: optimization on checking certain bits set ((x & mask) == val)

The patch optimizes code generation for comparisons of the form
X & C1 == C2 by converting them to (X | ~C1) == (C2 | ~C1).
C1 is a constant that requires li and addi to be loaded,
while ~C1 requires a single lui instruction.
As the values of C1 and C2 are not visible within
the equality expression, a plus pattern is matched instead.

      PR target/114087

gcc/ChangeLog:

* config/riscv/riscv.md (*lui_constraint<ANYI:mode>_and_to_or): New pattern

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr114087-1.c: New test.

7 months agoRISC-V: Remove svvptc from riscv-ext-bitmask.def
Yangyu Chen [Tue, 17 Dec 2024 14:41:05 +0000 (07:41 -0700)] 
RISC-V: Remove svvptc from riscv-ext-bitmask.def

There should be no svvptc in the riscv-ext-bitmask.def file since it has
not yet been added to the RISC-V C API Specification or the Linux
hwprobe. And there is no need for userspace software to know that this
extension exists. So remove it from the riscv-ext-bitmask.def file.

Fixes: e4f4b2dc08 ("RISC-V: Minimal support for svvptc extension.")
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:

* common/config/riscv/riscv-ext-bitmask.def (RISCV_EXT_BITMASK): Remove svvptc.

7 months agotestsuite: arm: Mark pr81812.C as xfail for thumb1
Torbjörn SVENSSON [Sun, 10 Nov 2024 19:15:13 +0000 (20:15 +0100)] 
testsuite: arm: Mark pr81812.C as xfail for thumb1

Test fails for Cortex-M0 with:

.../pr81812.C:6:8: error: generic thunk code fails for method 'virtual void ChildNode::_ZTv0_n12_NK9ChildNode5errorEz(...) const' which uses '...'

According to PR108277, it's expected that thumb1 targets does not
support empty virtual functions with ellipsis.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr81812.C: Add xfail for thumb1.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
7 months ago[PATCH v2 2/2] RISC-V: Add Tenstorrent Ascalon 8 wide architecture
Anton Blanchard [Tue, 17 Dec 2024 14:34:20 +0000 (07:34 -0700)] 
[PATCH v2 2/2] RISC-V: Add Tenstorrent Ascalon 8 wide architecture

This adds the Tenstorrent Ascalon 8 wide architecture (tt-ascalon-d8)
to the list of known cores.

gcc/ChangeLog:

* config/riscv/riscv-cores.def: Add tt-ascalon-d8.
* config/riscv/riscv.cc (tt_ascalon_d8_tune_info): New.
* doc/invoke.texi (RISC-V): Add tt-ascalon-d8 to -mcpu.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/mcpu-tt-ascalon-d8.c: New test.

7 months ago[PATCH v2 1/2] RISC-V: Document thead-c906, xiangshan-nanhu, and generic-ooo
Anton Blanchard [Tue, 17 Dec 2024 14:30:55 +0000 (07:30 -0700)] 
[PATCH v2 1/2] RISC-V: Document thead-c906, xiangshan-nanhu, and generic-ooo

gcc/ChangeLog
* doc/invoke.texi (RISC-V): Add thead-c906, xiangshan-nanhu to
-mcpu, add generic-ooo and remove thead-c906 from -mtune.

7 months agotestsuite: arm: Add -mtune to all arm_cpu_* effective targets
Torbjörn SVENSSON [Mon, 16 Dec 2024 13:47:40 +0000 (14:47 +0100)] 
testsuite: arm: Add -mtune to all arm_cpu_* effective targets

Fixes Linaro CI reported regression on r15-6164-gbdf75257aad2 in
https://linaro.atlassian.net/browse/GNU-1463.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Added corresponding -mtune= option
for each fo the arm_cpu_* effective targets.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
7 months agoRISC-V: Add new constraint R for register even-odd pairs
Kito Cheng [Mon, 9 Dec 2024 06:55:20 +0000 (14:55 +0800)] 
RISC-V: Add new constraint R for register even-odd pairs

Although this constraint is not currently used for any instructions, it is very
useful for custom instructions. Additionally, some new standard extensions
(not yet upstream), such as `Zilsd` and `Zclsd`, are potential users of this
constraint. Therefore, I believe there is sufficient justification to add it
now.

gcc/ChangeLog:

* config/riscv/constraints.md (R): New constraint.
* doc/md.texi: Document new constraint `R`.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/constraint-R.c: New.

7 months agoRISC-V: Implment N modifier for printing the register number rather than the register...
Kito Cheng [Thu, 14 Nov 2024 09:24:45 +0000 (17:24 +0800)] 
RISC-V: Implment N modifier for printing the register number rather than the register name

The modifier `N`, to print the raw encoding of a register. This is used
when using `.insn <length>, <encoding>`, where the user wants to pass
a value to the instruction in a known register, but where the
instruction doesn't follow the existing instruction formats, so the
assembly parser is not expecting a register name, just a raw integer.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand): Add N.
* doc/extend.texi: Document for N,

gcc/testsuite/ChangeLog:

* gcc.target/riscv/modifier-N-fpr.c: New.
* gcc.target/riscv/modifier-N-vr.c: New.
* gcc.target/riscv/modifier-N.c: New.

7 months agoRISC-V: Rename internal operand modifier N to n
Kito Cheng [Thu, 14 Nov 2024 08:41:52 +0000 (16:41 +0800)] 
RISC-V: Rename internal operand modifier N to n

Here is a purposal that using N for printing register encoding number,
so let rename the existing internal operand modifier `N` to `n`.

gcc/ChangeLog:

* config/riscv/corev.md (*cv_branch<mode>): Update modifier.
(*branch<mode>): Ditto.
* config/riscv/riscv.cc (riscv_print_operand): Update modifier.
* config/riscv/riscv.md (*branch<mode>): Update modifier.

7 months agoRISC-V: Add cr and cf constraint
Kito Cheng [Wed, 13 Nov 2024 09:54:16 +0000 (17:54 +0800)] 
RISC-V: Add cr and cf constraint

gcc/ChangeLog:

* config/riscv/constraints.md (cr): New.
(cf): New.
* config/riscv/riscv.h (reg_class): Add RVC_GR_REGS and
RVC_FP_REGS.
(REG_CLASS_NAMES): Ditto.
(REG_CLASS_CONTENTS): Ditto.
* doc/md.texi: Document cr and cf constraint.
* config/riscv/riscv.cc (riscv_regno_to_class): Update
FP_REGS to RVC_FP_REGS since it smaller set.
(riscv_secondary_memory_needed): Handle RVC_FP_REGS.
(riscv_register_move_cost): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/constraint-cf-zfinx.c: New.
* gcc.target/riscv/constraint-cf.c: New.
* gcc.target/riscv/constraint-cr.c: New.

7 months agoRISC-V: Rename constraint c0* to k0*
Kito Cheng [Mon, 9 Dec 2024 07:05:37 +0000 (15:05 +0800)] 
RISC-V: Rename constraint c0* to k0*

Rename those constraint since we want define other constraint start with
`c`, those constraints are internal and undocumented, so it's fine to
rename.

gcc/ChangeLog:

* config/riscv/constraints.md (c01): Rename to...
(k01): ...this.
(c02): Rename to...
(k02): ...this.
(c03): Rename to...
(k03): ...this.
(c04): Rename to...
(k04): ...this.
(c08): Rename to...
(k08): ...this.
* config/riscv/corev.md (riscv_cv_simd_add_h_si): Update
constraints.
(riscv_cv_simd_sub_h_si): Ditto.
(riscv_cv_simd_cplxmul_i_si): Ditto.
(riscv_cv_simd_subrotmj_si): Ditto.
* config/riscv/riscv-v.cc (splat_to_scalar_move_p): Update
constraints.
* config/riscv/vector-iterators.md (stride_load_constraint):
Update constraints.
(stride_store_constraint): Ditto.

7 months agoipa: Improve how we derive value ranges from IPA invariants
Martin Jambor [Tue, 17 Dec 2024 10:17:14 +0000 (11:17 +0100)] 
ipa: Improve how we derive value ranges from IPA invariants

I believe that the current function ipa_range_set_and_normalize lacks
a check that a base of an ADDR_EXPR lacks a test whether the base
really cannot be NULL, so this patch adds it.  Moreover, I never liked
the name as I do not think it makes the value of ranges any more
normal but rather just special-cases non-zero ip_invariant pointers.
Therefore, I have given it a different name and moved it to a .cc
file, our LTO bootstrap should inline (and/or split) it if necessary
anyway.

Because, as Honza correctly pointed out, deriving non-NULLness from a
pointer depends on flag_delete_null_pointer_checks which is an
optimization flag and thus depends on a given function, in this
version of the patch ipa_get_range_from_ip_invariant gets a
context_node parameter for that purpose.  This then needs to be used
within symtab_node::nonzero_address which gets a special overload in
which the value of the flag can be provided as a parameter.

gcc/ChangeLog:

2024-12-11  Martin Jambor  <mjambor@suse.cz>

* cgraph.h (symtab_node): Add a new overload of nonzero_address.
* symtab.cc (symtab_node::nonzero_address): Add a new overload whith a
parameter for delete_null_pointer_checks.  Make the original overload
call the new one which has retains the actual implementation.
* ipa-prop.h (ipa_get_range_from_ip_invariant): Declare.
(ipa_range_set_and_normalize): Remove.
* ipa-prop.cc (ipa_get_range_from_ip_invariant): New function.
(ipa_range_set_and_normalize): Remove.
* ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Add a new parameter
context_node. Use ipa_get_range_from_ip_invariant instead of
ipa_range_set_and_normalize and pass to it the new parameter.
(ipa_value_range_from_jfunc): Pass cs->caller as the context_node to
ipa_vr_intersect_with_arith_jfunc.
(propagate_vr_across_jump_function): Likewise.
(ipa_get_range_from_ip_invariant): New function.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Use
ipa_get_range_from_ip_invariant instead of ipa_range_set_and_normalize

7 months agoipa: Better value ranges for pointer integer constants
Martin Jambor [Tue, 17 Dec 2024 10:17:14 +0000 (11:17 +0100)] 
ipa: Better value ranges for pointer integer constants

When looking into cases where we know an actual argument of a call is
a constant but we don't generate a singleton value-range for the jump
function, I found out that the special handling of pointer constants
does not work well for constant zero pointer values.  In fact the code
only attempts to see if it can figure out that an argument is not zero
and if it can figure out any alignment information.

With this patch, we try to use the value_range that ranger can give us
in the jump function if we can and we query ranger for all kinds of
arguments, not just SSA_NAMES (and so also pointer integer constants).
If we cannot figure out a useful range we fall back again on figuring
out non-NULLness with tree_single_nonzero_warnv_p.

With this patch, we generate

  [prange] struct S * [0, 0] MASK 0x0 VALUE 0x0

instead of for example:

  [prange] struct S * [0, +INF] MASK 0xfffffffffffffff0 VALUE 0x0

for a zero constant passed in a call.

If you are wondering why we check whether the value range obtained
from range_of_expr can be undefined, even when the function returns
true, that is because that can apparently happen fro default-definition
SSA_NAMEs.

gcc/ChangeLog:

2024-11-15  Martin Jambor  <mjambor@suse.cz>

* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Try harder to
use the value range obtained from ranger for pointer values.

7 months agoipa: Skip widening type conversions in jump function constructions
Martin Jambor [Tue, 17 Dec 2024 10:17:14 +0000 (11:17 +0100)] 
ipa: Skip widening type conversions in jump function constructions

Originally, we did not stream any formal parameter types into WPA and
were generally very conservative when it came to type mismatches in
IPA-CP.  Over the time, mismatches that happen in code and blew up in
WPA made us to be much more resilient and also to stream the types of
the parameters which we now use commonly.

With that information, we can safely skip conversions when looking at
the IL from which we build jump functions and then simply fold convert
the constants and ranges to the resulting type, as long as we are
careful that performing the corresponding folding of constants gives
the corresponding results.  In order to do that, we must ensure that
the old value can be represented in the new one without any loss.
With this change, we can nicely propagate non-NULLness in IPA-VR as
demonstrated with the new test case.

I have gone through all other uses of (all components of) jump
functions which could be affected by this and verified they do indeed
check types and can handle mismatches.

gcc/ChangeLog:

2024-12-11  Martin Jambor  <mjambor@suse.cz>

* ipa-prop.cc: Include vr-values.h.
(skip_a_safe_conversion_op): New function.
(ipa_compute_jump_functions_for_edge): Use it.

gcc/testsuite/ChangeLog:

2024-11-01  Martin Jambor  <mjambor@suse.cz>

* gcc.dg/ipa/vrp9.c: New test.

7 months agoc++: Diagnose earlier non-static data members with cv containing class type [PR116108]
Jakub Jelinek [Tue, 17 Dec 2024 09:13:24 +0000 (10:13 +0100)] 
c++: Diagnose earlier non-static data members with cv containing class type [PR116108]

In r10-6457 aka PR92593 fix a check has been added to reject
earlier non-static data members with current_class_type in templates,
as the deduction then can result in endless recursion in reshape_init.
It fixed the
template <class T>
struct S { S s = 1; };
S t{2};
crashes, but as the following testcase shows, didn't catch when there
are cv qualifiers on the non-static data member.

Fixed by using TYPE_MAIN_VARIANT.

2024-12-17  Jakub Jelinek  <jakub@redhat.com>

PR c++/116108
gcc/cp/
* decl.cc (grokdeclarator): Pass TYYPE_MAIN_VARIANT (type)
rather than type to same_type_p when checking if the non-static
data member doesn't have current class type.
gcc/testsuite/
* g++.dg/cpp1z/class-deduction117.C: New test.

7 months agoFortran: Fix associate with derived type array construtor [PR117347]
Andre Vehreschild [Fri, 13 Dec 2024 08:06:11 +0000 (09:06 +0100)] 
Fortran: Fix associate with derived type array construtor [PR117347]

gcc/fortran/ChangeLog:

PR fortran/117347

* primary.cc (gfc_match_varspec): Add array constructors for
guessing their type like with unresolved function calls.

gcc/testsuite/ChangeLog:

* gfortran.dg/associate_71.f90: New test.

7 months agoDaily bump.
GCC Administrator [Tue, 17 Dec 2024 00:19:06 +0000 (00:19 +0000)] 
Daily bump.

7 months agoUpdate cpplib sr.po
Joseph Myers [Mon, 16 Dec 2024 23:52:38 +0000 (23:52 +0000)] 
Update cpplib sr.po

* sr.po: Update.

7 months agoi386: Fix tabs vs. spaces in mmx.md
Uros Bizjak [Mon, 16 Dec 2024 19:58:09 +0000 (20:58 +0100)] 
i386: Fix tabs vs. spaces in mmx.md

gcc/ChangeLog:

* config/i386/mmx.md: Fix tabs vs. spaces.

7 months agoi386: Add HImode to VALID_SSE2_REG_MODE
Uros Bizjak [Mon, 16 Dec 2024 19:51:07 +0000 (20:51 +0100)] 
i386: Add HImode to VALID_SSE2_REG_MODE

Move explicit Himode handling for SSE2 XMM regnos from
ix86_hard_regno_mode_ok to VALID_SSE2_REG_MODE.

No functional change.

gcc/ChangeLog:

* config/i386/i386.cc (ix86_hard_regno_mode_ok):
Remove explicit HImode handling for SSE2 XMM regnos.
* config/i386/i386.h (VALID_SSE2_REG_MODE): Add HImode.

7 months agotestsuite: Force max-completely-peeled-insns=300 for CRIS, PR118055
Hans-Peter Nilsson [Mon, 16 Dec 2024 17:47:03 +0000 (18:47 +0100)] 
testsuite: Force max-completely-peeled-insns=300 for CRIS, PR118055

This handles fallout from r15-6097-gee2f19b0937b5e.  A brief
analysis shows that the metric used in that code is computed
by estimate_move_cost, differentiating on the target macro
MOVE_MAX_PIECES (which defaults to MOVE_MAX) which for most
"32-bit targets" is 4 and for "64-bit targets" is 8.  There
are some outliers, like pru, with MOVE_MAX set to 8 but
counting as a 32-bit target.

So, the main difference for this test-case, which is heavy
on 64-bit moves (most targets have "double" mapped to IEEE
64-bit), is between "32-bit" and "64-bit", with the cost up
to twice for the former compared to the latter.  I see no
effective_target_move_max_is_4 or equivalent, and this
instance falls below the threshold of adding one, so I'm
sticking to a list of targets.  For CRIS, it would suffice
with 210, but there's no need to be this specific, and it
would make the test even more brittle.

PR tree-optimization/118055
* gcc.dg/tree-ssa/pr83403-1.c, gcc.dg/tree-ssa/pr83403-2.c: Add
cris-*-* to targets passing --param=max-completely-peeled-insns=300.

7 months agosarif-replay: handle embedded links (§3.11.6)
David Malcolm [Mon, 16 Dec 2024 16:22:50 +0000 (11:22 -0500)] 
sarif-replay: handle embedded links (§3.11.6)

Handle embedded links in plain text messages.  For now, merely
use the link text and discard the destination.

gcc/ChangeLog:
* libsarifreplay.cc (struct embedded_link): New.
(maybe_consume_embedded_link): New.
(sarif_replayer::make_plain_text_within_result_message): Handle
embedded links by using the link text, for now.

gcc/testsuite/ChangeLog:
* sarif-replay.dg/2.1.0-valid/3.11.6-embedded-links.sarif: New test.
* sarif-replay.dg/2.1.0-valid/malloc-vs-local-4.c.sarif: Update
expected output for handling the embedded links.
* sarif-replay.dg/2.1.0-valid/spec-example-4.sarif: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agolibgdiagnostics: consolidate logical locations
David Malcolm [Mon, 16 Dec 2024 16:22:50 +0000 (11:22 -0500)] 
libgdiagnostics: consolidate logical locations

This patch updates diagnostic_manager_new_logical_location so
that repeated calls with the same input values yield the same
instance of diagnostic_logical_location.

Doing so allows the path-printing logic to properly consolidate runs of
events, whereas previously it could treat each event as having a
distinct logical location, and thus require them to be printed
separately; this greatly improves the output of sarif-replay when
displaying execution paths.

gcc/ChangeLog:
* doc/libgdiagnostics/topics/logical-locations.rst
(diagnostic_manager_new_logical_location): Add note about repeated
calls.
* libgdiagnostics.cc: Define INCLUDE_MAP.
(class owned_nullable_string): Add copy ctor and move ctor.
(owned_nullable_string::operator<): New.
(diagnostic_logical_location::operator<): New.
(diagnostic_manager::new_logical_location): Use m_logical_locs to
"uniquify" instances, converting it to a std::map.
(diagnostic_manager::logical_locs_map_t): New typedef.
(diagnostic_manager::t m_logical_locs): Convert from a std::vector
to a std::map.
(diagnostic_execution_path::same_function_p): Update comment.

gcc/testsuite/ChangeLog:
* libgdiagnostics.dg/test-logical-location.c: Include <assert.h>.
Verify that creating a diagnostic_logical_location with equal
values yields the same instance.
* sarif-replay.dg/2.1.0-valid/malloc-vs-local-4.c.sarif: New test.
* sarif-replay.dg/2.1.0-valid/signal-1.c.moved.sarif: Update
expected output to show logical location and for consolidation of
events into runs.
* sarif-replay.dg/2.1.0-valid/signal-1.c.sarif: Likewise.
* sarif-replay.dg/2.1.0-valid/spec-example-4.sarif: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agosarif-replay: quote source from artifact contents [PR117943]
David Malcolm [Mon, 16 Dec 2024 16:22:49 +0000 (11:22 -0500)] 
sarif-replay: quote source from artifact contents [PR117943]

The diagnostic source-quoting machinery uses class file_cache
implemented in gcc/input.cc for (re)reading the source when
issuing diagnostics.

When sarif-replay issues a saved diagnostic it might be running
in a different path to where the .sarif file was captured, or
on an entirely different machine.

Previously such invocations would lead to the source-quoting
silently failing, even if the content of the file is recorded
in the .sarif file in the artifact "contents" property (which
gcc populates when emitting .sarif output).

This patch:
- adds the ability for slots in file_cache to be populated from memory
rather than from the filesystem
- exposes it in libgdiagnostics via a new entrypoint
- uses this in sarif-replay for any artifacts with a "contents"
property, so that source-quoting uses that rather than trying to read
from the path on the filesystem

gcc/ChangeLog:
PR sarif-replay/117943
* doc/libgdiagnostics/topics/physical-locations.rst
(diagnostic_manager_new_file): Drop "const" from return type.
* doc/libgdiagnostics/tutorial/02-physical-locations.rst: Drop
"const" from "main_file" decl.
* input.cc (file_cache::add_buffered_content): New.
(file_cache_slot::set_content): New.
(file_cache_slot::dump): Use m_file_path being null rather than
m_fp to determine empty slots.  Dump m_fp.
(find_end_of_line): Drop "const" from return type and param.  Add
forward decl.
(file_cache_slot::get_next_line): Fix "const"-ness.
(selftest::test_reading_source_buffer): New.
(selftest::input_cc_tests): Call it.
* input.h (file_cache::add_buffered_content): New decl.
* libgdiagnostics++.h (class file): Drop const-ness from m_inner.
(file::set_buffered_content): New.
* libgdiagnostics.cc (class content_buffer): New.
(diagnostic_file::diagnostic_file): Add "mgr" param.
(diagnostic_file::get_content): New.
(diagnostic_file::set_buffered_content): New.
(diagnostic_file::m_mgr): New.
(diagnostic_file::m_content): New.
(diagnostic_manager::new_file): Drop const-ness.  Pass *this to
ctor.
(diagnostic_file::set_buffered_content): New.
(diagnostic_manager_new_file): Drop "const" from return type.
(diagnostic_file_set_buffered_content): New entrypoint.
(diagnostic_manager_debug_dump_file): Dump the content size,
if any.
* libgdiagnostics.h (diagnostic_manager_new_file): Drop "const"
from return type.
(diagnostic_file_set_buffered_content): New decl.
* libgdiagnostics.map (diagnostic_file_set_buffered_content): New
symbol.
* libsarifreplay.cc (sarif_replayer::m_artifacts_arr): Convert
from json::value to json::array.
(sarif_replayer::handle_run_obj): Call handle_artifact_obj
on all artifacts.
(sarif_replayer::handle_artifact_obj): New.

gcc/testsuite/ChangeLog:
PR sarif-replay/117943
* sarif-replay.dg/2.1.0-valid/error-with-note.sarif: Update
expected output to include quoted source code and underlines.
* sarif-replay.dg/2.1.0-valid/signal-1.c.moved.sarif: New test.
* sarif-replay.dg/2.1.0-valid/signal-1.c.sarif: Update expected
output to include quoted source code and underlines.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agodiagnostics: move libgdiagnostics dc from sinks into diagnostic_manager
David Malcolm [Mon, 16 Dec 2024 16:22:49 +0000 (11:22 -0500)] 
diagnostics: move libgdiagnostics dc from sinks into diagnostic_manager

libgdiagnostics was written before the fixes for PR other/116613 allowed
a diagnostic_context to have multiple output sinks.

Hence each libgdiagnostics sink had its own diagnostic_context with just
one diagnostic_output_format.

This wart is no longer necessary and makes it harder to move state
into the manager/context; in particular for quoting source code
from the .sarif file (PR sarif-replay/117943).

Simplify, by making libgdiagnostics' implementation more similar to
GCC's implementation, by moving the diagnostic_context from sink into
diagnostic_manager.

Doing so requires generalizing where the
diagnostic_source_printing_options comes from in class
diagnostic_text_output_format: for GCC we use
the instance within the diagnostic_context, whereas for
libgdiagnostics each diagnostic_text_sink has its own instance.

No functional change intended.

gcc/c-family/ChangeLog:
PR sarif-replay/117943
* c-format.cc (selftest::test_type_mismatch_range_labels): Use
dc.m_source_printing.
* c-opts.cc (c_diagnostic_text_finalizer): Use source-printing
options from text_output.

gcc/cp/ChangeLog:
PR sarif-replay/117943
* error.cc (auto_context_line::~auto_context_line): Use
source-printing options from text_output.

gcc/ChangeLog:
PR sarif-replay/117943
* diagnostic-format-text.cc
(diagnostic_text_output_format::append_note): Use source-printing
options from text_output.
(diagnostic_text_output_format::update_printer): Copy
source-printing options from dc.
(default_diagnostic_text_finalizer): Use source-printing
options from text_output.
* diagnostic-format-text.h
(diagnostic_text_output_format::diagnostic_text_output_format):
Add optional diagnostic_source_printing_options param, using
the context's if null.
(diagnostic_text_output_format::get_source_printing_options): New
accessor.
(diagnostic_text_output_format::m_source_printing): New field.
* diagnostic-path.cc (event_range::print): Use source-printing
options from text_output.
(selftest::test_interprocedural_path_1): Use source-printing
options from dc.
* diagnostic-show-locus.cc
(gcc_rich_location::add_location_if_nearby): Likewise.
(diagnostic_context::maybe_show_locus): Add "opts" param
and use in place of m_source_printing.  Pass it to source_policy
ctor.
(diagnostic_source_print_policy::diagnostic_source_print_policy):
Add overload taking a const diagnostic_source_printing_options &.
* diagnostic.cc (diagnostic_context::initialize): Pass nullptr
for source options when creating text sink, so that it uses
the dc's options.
(diagnostic_context::dump): Add an "output sinks:" heading and
print "(none)" if there aren't any.
(diagnostic_context::set_output_format): Split out code into...
(diagnostic_context::remove_all_output_sinks): ...this new
function.
* diagnostic.h
(diagnostic_source_print_policy::diagnostic_source_print_policy):
Add overload taking a const diagnostic_source_printing_options &.
(diagnostic_context::maybe_show_locus): Add "opts" param.
(diagnostic_context::remove_all_output_sinks): New decl.
(diagnostic_context::m_source_printing): New field.
(diagnostic_show_locus): Add "opts" param and pass to
maybe_show_locus.
* libgdiagnostics.cc (sink::~sink): Delete.
(sink::begin_group): Delete.
(sink::end_group): Delete.
(sink::emit): Delete.
(sink::m_dc): Drop field.
(diagnostic_text_sink::on_begin_text_diagnostic): Delete.
(diagnostic_text_sink::get_source_printing_options): Use
m_souece_printing.
(diagnostic_text_sink::m_current_logical_loc): Drop field.
(diagnostic_text_sink::m_inner_sink): New field.
(diagnostic_text_sink::m_source_printing): New field.
(diagnostic_manager::diagnostic_manager): Update for changes
to fields.  Initialize m_dc.
(diagnostic_manager::~diagnostic_manager): Call diagnostic_finish.
(diagnostic_manager::get_file_cache): Drop.
(diagnostic_manager::get_dc): New accessor.
(diagnostic_manager::begin_group): Reimplement.
(diagnostic_manager::end_group): Reimplement.
(diagnostic_manager::get_prev_diag_logical_loc): New accessor.
(diagnostic_manager::m_dc): New field.
(diagnostic_manager::m_file_cache): Drop field.
(diagnostic_manager::m_edit_context): Convert to a std::unique_ptr
so that object can be constructed after m_dc is initialized.
(diagnostic_manager::m_prev_diag_logical_loc): New field.
(diagnostic_text_sink::diagnostic_text_sink): Reimplement.
(get_color_rule): Delete.
(diagnostic_text_sink::set_colorize): Reimplement.
(diagnostic_text_sink::text_starter): New.
(sarif_sink::sarif_sink): Reimplement.
(diagnostic_manager::write_patch): Update for change to
m_edit_context.
(diagnostic_manager::emit): Update now that each sink has a
corresponding diagnostic_output_format object within m_dc.

gcc/fortran/ChangeLog:
PR sarif-replay/117943
* error.cc (gfc_diagnostic_text_starter): Use source-printing
options from text_output.

gcc/testsuite/ChangeLog:
PR sarif-replay/117943
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.cc
(custom_diagnostic_text_finalizer): Use source-printing options
from text_output.
* gcc.dg/plugin/diagnostic_plugin_xhtml_format.cc
(xhtml_builder::make_element_for_diagnostic): Use source-printing
options from diagnostic_context.
* gcc.dg/plugin/expensive_selftests_plugin.cc (test_richloc):
Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agodiagnostics: implement file_cache::dump
David Malcolm [Mon, 16 Dec 2024 16:22:49 +0000 (11:22 -0500)] 
diagnostics: implement file_cache::dump

This is purely for use when debugging.

gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::dump): Dump m_file_cache.
* input.cc (file_cache_slot::dump): New decls and implementations.
(file_cache::dump): New.
* input.h (file_cache::dump): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
7 months agotestsuite: Require int32plus target for gcc.dg/pr117816.c
Dimitar Dimitrov [Mon, 16 Dec 2024 15:27:53 +0000 (17:27 +0200)] 
testsuite: Require int32plus target for gcc.dg/pr117816.c

Memmove destination overflows if size of int is less than 3, resulting in
spurious test failures.  Fix by adding a requirement for effective
target int32plus.

gcc/testsuite/ChangeLog:

* gcc.dg/pr117816.c: Require effective target int32plus.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
7 months agodocs: Fix [us]abd pattern name.
Robin Dapp [Thu, 12 Dec 2024 10:46:32 +0000 (11:46 +0100)] 
docs: Fix [us]abd pattern name.

The uabd and sabd optab name is missing a 3 suffix (for its three
arguments).  This patch adds it.

gcc/ChangeLog:

* doc/md.texi: Add "3" suffix.

7 months agovect: Do not try to duplicate_and_interleave one-element mode.
Robin Dapp [Fri, 6 Sep 2024 14:04:03 +0000 (16:04 +0200)] 
vect: Do not try to duplicate_and_interleave one-element mode.

PR112694 shows that we try to create sub-vectors of single-element
vectors because can_duplicate_and_interleave_p returns true.
The problem resurfaced in PR116611.

This patch makes can_duplicate_and_interleave_p return false
if count / nvectors > 0 and removes the corresponding check in the riscv
backend.

This partially gets rid of the FAIL in slp-19a.c.  At least when built
with cost model we don't have LOAD_LANES anymore.  Without cost model,
as in the test suite, we choose a different path and still end up with
LOAD_LANES.

Bootstrapped and regtested on x86 and power10, regtested on
rv64gcv_zvfh_zvbb.  Still waiting for the aarch64 results.

Regards
 Robin

gcc/ChangeLog:

PR target/112694
PR target/116611.

* config/riscv/riscv-v.cc (expand_vec_perm_const): Remove early
return.
* tree-vect-slp.cc (can_duplicate_and_interleave_p): Return
false when we cannot create sub-elements.

7 months agoRISC-V: Fix compress shuffle pattern [PR117383].
Robin Dapp [Wed, 11 Dec 2024 19:48:30 +0000 (20:48 +0100)] 
RISC-V: Fix compress shuffle pattern [PR117383].

This patch makes vcompress use the tail-undisturbed policy by default
and also uses the proper VL.

PR target/117383

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum insn_type): Use TU policy.
* config/riscv/riscv-v.cc (shuffle_compress_patterns): Set VL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c:
Expect tu.
* gcc.target/riscv/rvv/autovec/pr117383.c: New test.

7 months agoRISC-V: Increase cost for vec_construct [PR118019].
Robin Dapp [Fri, 13 Dec 2024 10:23:03 +0000 (11:23 +0100)] 
RISC-V: Increase cost for vec_construct [PR118019].

For a generic vec_construct from scalar elements we need to load each
scalar element and move it over to a vector register.
Right now we only use a cost of 1 per element.

This patch uses register-move cost as well as scalar_to_vec and
multiplies it with the number of elements in the vector instead.

PR target/118019

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_builtin_vectorization_cost):
Increase vec_construct cost.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr118019.c: New test.

7 months agolibstdc++: Initialize all members of hashtable local iterators
Jonathan Wakely [Thu, 12 Dec 2024 20:42:19 +0000 (20:42 +0000)] 
libstdc++: Initialize all members of hashtable local iterators

Currently the _M_bucket members are left uninitialized for
default-initialized local iterators, and then copy construction copies
indeterminate values. We should just ensure they're initialized on
construction.

Setting them to zero makes default-initialization consistent with
value-initialization and avoids indeterminate values.

For the _Local_iterator_base<..., false> specialization we preserve the
existing behaviour of setting _M_bucket_count to -1 in the default
constructor, as a sentinel value to indicate there's no hash object
present.

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h (_Local_iterator_base): Use
default member-initializers.

7 months agolibstdc++: Use alias-declarations in bits/hashtable_policy,h
Jonathan Wakely [Thu, 12 Dec 2024 20:40:15 +0000 (20:40 +0000)] 
libstdc++: Use alias-declarations in bits/hashtable_policy,h

This file is only for C++11 and later, so replace typedefs with
alias-declarations for clarity. Also remove redundant std::
qualification on size_t, ptrdiff_t etc.

We can also remove the result_type, first_argument_type and
second_argument_type typedefs from the range hashers. We don't need
those types to follow the C++98 adaptable function object protocol.

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h: Replace typedefs with
alias-declarations. Remove redundant std:: qualification.
(_Mod_range_hashing, _Mask_range_hashing): Remove adaptable
function object typedefs.

7 months agolibstdc++: Simplify storage of hasher in local iterators
Jonathan Wakely [Thu, 5 Dec 2024 15:48:30 +0000 (15:48 +0000)] 
libstdc++: Simplify storage of hasher in local iterators

The fix for PR libstdc++/56267 (relating to the lifetime of the hash
object stored in a local iterator) has undefined behaviour, as it relies
on being able to call a member function on an empty object that never
started its lifetime. Although the member function probably doesn't care
about the empty object's state, this is still technically undefined
because there is no object of that type at that address. It's also
possible that the hash object would have a stricter alignment than the
_Hash_code_storage object, so that the reinterpret_cast would produce a
misaligned pointer.

This fix replaces _Local_iterator_base's _Hash_code_storage base-class
with a new class template containing a potentially-overlapping (i.e.
[[no_unique_address]]) union member.  This means that we always have
storage of the correct type, and it can be initialized/destroyed when
required. We no longer need a reinterpret_cast that gives us a pointer
that we should not dereference.

It would be nice if we could just use a union containing the _Hash
object as a data member of _Local_iterator_base, but that would be an
ABI change. The _Hash_code_storage that contains the _Hash object is the
first base-class, before the _Node_iterator_base base-class. Making the
union a data member of _Local_iterator_base would make it come after the
_Node_iterator_base base instead of before it, altering the layout.

Since we're changing _Hash_code_storage anyway, we can replace it with a
new class template that stores the _Hash object itself in the union,
rather than a _Hash_code_base that holds the _Hash. This removes an
unnecessary level of indirection in the class hierarchy. This change
requires the effects of _Hash_code_base::_M_bucket_index to be inlined
into the _Local_iterator_base::_M_incr function, but that's easy.

We don't need separate specializations of _Hash_obj_storage for an empty
hash function and a non-empty one. Using [[no_unique_address]] gives us
an empty base-class when possible.

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h (_Hash_code_storage): Remove.
(_Hash_obj_storage): New class template. Store the hash
function as a union member instead of using a byte buffer.
(_Local_iterator_base): Use _Hash_obj_storage instead of
_Hash_code_storage, adjust members that construct and destroy
the hash object.
(_Local_iterator_base::_M_incr): Calculate bucket index.

7 months agolibstdc++: Further simplify _Hashtable inheritance hierarchy
Jonathan Wakely [Wed, 4 Dec 2024 21:52:40 +0000 (21:52 +0000)] 
libstdc++: Further simplify _Hashtable inheritance hierarchy

The main change here is using [[no_unique_address]] instead of the Empty
Base-class Optimization. Using the attribute allows us to use data
members instead of base-classes. That simplifies the inheritance
hierarchy, which means less work for the compiler. It also means that
ADL has fewer associated classes and associated namespaces to consider,
further reducing the work the compiler has to do.

Reducing the differences between the _Hashtable_ebo_helper primary
template and the partial specialization means we no longer need to use
member functions to access the stored object, because it's now always a
data member called _M_obj.  This means we can also remove a number of
other helper functions that were using those member functions to access
the object, for example we can swap the _Hash and _Equal objects
directly in _Hashtable::swap instead of calling _Hashtable_base::_M_swap
which then calls _Hash_code_base::_M_swap.

Although [[no_unique_address]] would allow us to reduce the size for
empty types that are also 'final', doing so would be an ABI break
because those types were previously excluded from using the EBO. So we
still need the _Hashtable_ebo_helper class template and a partial
specialization, so that we only use the attribute under exactly the same
conditions as we previously used the EBO. This could be avoided with a
non-standard [[no_unique_address(expr)]] attribute that took a boolean
condition, or with reflection and token sequence injection, but we don't
have either of those things.

Because _Hashtable_ebo_helper is no longer used as a base-class we don't
need to disambiguate possible identical bases, so it doesn't need an
integral non-type template parameter.

libstdc++-v3/ChangeLog:

* include/bits/hashtable.h (_Hashtable::swap): Swap hash
function and equality predicate here. Inline allocator swap
instead of using __alloc_on_swap.
* include/bits/hashtable_policy.h (_Hashtable_ebo_helper):
Replace EBO with no_unique_address attribute. Remove NTTP.
(_Hash_code_base): Replace base class with data member using
no_unique_address attribute.
(_Hash_code_base::_M_swap): Remove.
(_Hash_code_base::_M_hash): Remove.
(_Hashtable_base): Replace base class with data member using
no_unique_address attribute.
(_Hashtable_base::_M_swap): Remove.
(_Hashtable_alloc): Replace base class with data member using
no_unique_address attribute.

7 months agolibstdc++: Fix fancy pointer support in linked lists [PR57272]
Jonathan Wakely [Sun, 8 Dec 2024 14:34:01 +0000 (14:34 +0000)] 
libstdc++: Fix fancy pointer support in linked lists [PR57272]

The union members I used in the new _Node types for fancy pointers only
work for value types that are trivially default constructible. This
change replaces the anonymous union with a named union so it can be
given a default constructor and destructor, to leave the variant member
uninitialized.

This also fixes the incorrect macro names in the alloc_ptr_ignored.cc
tests as pointed out by François, and fixes some std::list pointer
confusions that the fixed alloc_ptr_ignored.cc test revealed.

libstdc++-v3/ChangeLog:

PR libstdc++/57272
* include/bits/forward_list.h (__fwd_list::_Node): Add
user-provided special member functions to union.
* include/bits/stl_list.h (__list::_Node): Likewise.
(_Node_base::_M_hook, _Node_base::swap): Use _M_base() instead
of std::pointer_traits::pointer_to.
(_Node_base::_M_transfer): Likewise. Add noexcept.
(_List_base::_M_put_node): Use 'if constexpr' to avoid using
pointer_traits::pointer_to when not necessary.
(_List_base::_M_destroy_node): Fix parameter to be the pointer
type used internally, not the allocator's pointer.
(list::_M_create_node): Likewise.
* testsuite/23_containers/forward_list/requirements/explicit_instantiation/alloc_ptr.cc:
Check explicit instantiation of non-trivial value type.
* testsuite/23_containers/list/requirements/explicit_instantiation/alloc_ptr.cc:
Likewise.
* testsuite/23_containers/forward_list/requirements/explicit_instantiation/alloc_ptr_ignored.cc:
Fix macro name.
* testsuite/23_containers/list/requirements/explicit_instantiation/alloc_ptr_ignored.cc:
Likewise.

7 months agoFix non-aligned CodeView symbols
Mark Harmstone [Sat, 30 Nov 2024 22:35:24 +0000 (22:35 +0000)] 
Fix non-aligned CodeView symbols

CodeView symbols in PDB files are aligned to four-byte boundaries. It's
not really clear what logic MSVC uses to enforce this; sometimes the
symbols are padded in the object file, sometimes the linker seems to do
the work.

It makes more sense to do this in the compiler, so fix the two instances
where we can write symbols with a non-aligned length. S_FRAMEPROC is
unusually not a multiple of 4, so will always have 2 bytes padding.
S_INLINESITE is followed by variable-length "binary annotations", so
will also usually have padding.

gcc/
* dwarf2codeview.cc (write_s_frameproc): Align output.
(write_s_inlinesite): Align output.

7 months agoDaily bump.
GCC Administrator [Mon, 16 Dec 2024 00:16:48 +0000 (00:16 +0000)] 
Daily bump.

7 months agohppa: Implement TARGET_FRAME_POINTER_REQUIRED
John David Anglin [Sun, 15 Dec 2024 22:18:40 +0000 (17:18 -0500)] 
hppa: Implement TARGET_FRAME_POINTER_REQUIRED

If a function receives nonlocal gotos, it needs to save the frame
pointer in the argument save area.  This ensures that LRA sets
frame_pointer_needed when it saves arguments in the save area.

2024-12-15  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR target/118018
* config/pa/pa.cc (pa_frame_pointer_required): Declare and
implement.
(TARGET_FRAME_POINTER_REQUIRED): Define.

7 months agotestsuite: Enable TImode tests on hppa64
John David Anglin [Sun, 15 Dec 2024 21:02:54 +0000 (16:02 -0500)] 
testsuite: Enable TImode tests on hppa64

2024-12-15  John David Anglin  <danglin@gcc.gnu.org>

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ivopts-1.c: Enable TImode tests on hppa64.

7 months agotestsuite: xfail scan-assembler-times in c-c++-common/gomp/unroll-[45].c
John David Anglin [Sun, 15 Dec 2024 20:53:12 +0000 (15:53 -0500)] 
testsuite: xfail scan-assembler-times in c-c++-common/gomp/unroll-[45].c

Count differs on hppa*-*-hpux* due to hpux specific directives.

2024-12-15  John David Anglin  <danglin@gcc.gnu.org>

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/unroll-4.c: xfail scan-assembler-times
"dummy" for hppa*-*-hpux*.
* c-c++-common/gomp/unroll-5.c: Likewise.

7 months agotestsuite: Require lto in g++.dg/modules/enum-14.C
John David Anglin [Sun, 15 Dec 2024 20:39:50 +0000 (15:39 -0500)] 
testsuite: Require lto in g++.dg/modules/enum-14.C

2024-12-15  John David Anglin  <danglin@gcc.gnu.org>

gcc/testsuite/ChangeLog:

* g++.dg/modules/enum-14.C: Require lto.

7 months agoc++, coroutines: Use finish_if_stmt in a missed case.
Iain Sandoe [Tue, 3 Sep 2024 11:04:59 +0000 (12:04 +0100)] 
c++, coroutines: Use finish_if_stmt in a missed case.

Just shorter code.

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::wrap_original_function_body): Use
finish_if_stmt instead of manually applying the same process.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
7 months agoc++, coroutines: Make the resume index consistent for debug.
Iain Sandoe [Thu, 3 Oct 2024 08:02:59 +0000 (09:02 +0100)] 
c++, coroutines: Make the resume index consistent for debug.

At present, we only update the resume index when we actually are
at the stage that the coroutine is considered suspended. This is
on the basis that it is UB to resume or destroy a coroutines that
is not suspended (and therefore we never need to access this value
otherwise).  However, it is possible that someone could set a debug
breakpoint on the resume which can be reached without suspending
if await_ready() returns true.  In that case, the debugger would
read an incorrect resume index.  Fixed by moving the update to
just before the test for ready.

gcc/cp/ChangeLog:

* coroutines.cc (expand_one_await_expression): Update the
resume index before checking if the coroutine is ready.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
7 months agoc++, coroutines:Ensure bind exprs are visited once [PR98935].
Iain Sandoe [Fri, 1 Nov 2024 23:30:58 +0000 (23:30 +0000)] 
c++, coroutines:Ensure bind exprs are visited once [PR98935].

Recent changes in the C++ FE and the coroutines implementation have
exposed a latent issue in which a bind expression containing a var
that we need to capture in the coroutine state gets visited twice.
This causes an ICE (from a checking assert).  Fixed by adding a pset
to the relevant tree walk.  Exit the callback early when the tree is
not a BIND_EXPR.

PR c++/98935

gcc/cp/ChangeLog:

* coroutines.cc (register_local_var_uses): Add a pset to the
tree walk to avoid visiting the same BIND_EXPR twice.  Make
an early exit for cases that the callback does not apply.
(cp_coroutine_transform::apply_transforms): Add a pset to the
tree walk for register_local_vars.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr98935.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
7 months agoarm: fix bootstrap after MVE changes
Tamar Christina [Sun, 15 Dec 2024 13:21:44 +0000 (13:21 +0000)] 
arm: fix bootstrap after MVE changes

The recent commits for MVE on Saturday have broken armhf bootstrap due to a
-Werror false positive:

    inlined from 'virtual rtx_def* {anonymous}::vstrq_scatter_base_impl::expand(arm_mve::function_expander&) const' at /gcc/config/arm/arm-mve-builtins-base.cc:352:17:
./genrtl.h:38:16: error: 'new_base' may be used uninitialized [-Werror=maybe-uninitialized]
   38 |   XEXP (rt, 1) = arg1;
/gcc/config/arm/arm-mve-builtins-base.cc: In member function 'virtual rtx_def* {anonymous}::vstrq_scatter_base_impl::expand(arm_mve::function_expander&) const':
/gcc/config/arm/arm-mve-builtins-base.cc:311:26: note: 'new_base' was declared here
  311 |     rtx insns, base_ptr, new_base;
      |                          ^~~~~~~~
In function 'rtx_def* init_rtx_fmt_ee(rtx, machine_mode, rtx, rtx)',
    inlined from 'rtx_def* gen_rtx_fmt_ee_stat(rtx_code, machine_mode, rtx, rtx)' at ./genrtl.h:50:26,
    inlined from 'virtual rtx_def* {anonymous}::vldrq_gather_base_impl::expand(arm_mve::function_expander&) const' at /gcc/config/arm/arm-mve-builtins-base.cc:527:17:
./genrtl.h:38:16: error: 'new_base' may be used uninitialized [-Werror=maybe-uninitialized]
   38 |   XEXP (rt, 1) = arg1;
/gcc/config/arm/arm-mve-builtins-base.cc: In member function 'virtual rtx_def* {anonymous}::vldrq_gather_base_impl::expand(arm_mve::function_expander&) const':
/gcc/config/arm/arm-mve-builtins-base.cc:486:26: note: 'new_base' was declared here
  486 |     rtx insns, base_ptr, new_base;

To fix it I just initialize the value.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-base.cc (expand): Initialize new_base.

7 months agoShrink back size of tree_exp from 40 bytes to 32
Jakub Jelinek [Sun, 15 Dec 2024 12:13:07 +0000 (13:13 +0100)] 
Shrink back size of tree_exp from 40 bytes to 32

The following patch implements what I've mentioned in the 64-bit
location_t thread.
struct tree_exp had unsigned condition_uid member added for something
rarely used (-fcondition-coverage) and even there used only on very small
subset of trees only for the duration of the gimplification.

The following patch uses a hash_map instead, which allows shrinking
tree_exp to its previous size (32 bytes + (number of operands - 1) * sizeof
(tree)).

2024-12-15  Jakub Jelinek  <jakub@redhat.com>

* tree-core.h (struct tree_exp): Remove condition_uid member.
* tree.h (SET_EXPR_UID, EXPR_COND_UID): Remove.
* gimplify.cc (nextuid): Rename to ...
(nextconduid): ... this.
(cond_uids): New static variable.
(next_cond_uid, reset_cond_uid): Adjust for the renaming,
formatting fix.
(tree_associate_condition_with_expr): New function.
(shortcut_cond_r, tag_shortcut_cond, shortcut_cond_expr): Use it
instead of SET_EXPR_UID.
(gimplify_cond_expr): Look up cond_uid in cond_uids hash map if
non-NULL instead of using EXPR_COND_UID.
(gimplify_function_tree): Delete cond_uids and set it to NULL.

7 months agoFortran: Pointer fcn results must not be finalized [PR117897]
Paul Thomas [Sun, 15 Dec 2024 10:42:34 +0000 (10:42 +0000)] 
Fortran: Pointer fcn results must not be finalized [PR117897]

2024-12-15  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
PR fortran/117897
* trans-expr.cc (gfc_trans_assignment_1): RHS pointer function
results must not be finalized.

gcc/testsuite/
PR fortran/117897
* gfortran.dg/finalize_59.f90: New test.

7 months agoDaily bump.
GCC Administrator [Sun, 15 Dec 2024 00:17:24 +0000 (00:17 +0000)] 
Daily bump.

7 months agolibbacktrace: don't use ZSTD_CLEVEL_DEFAULT
Ian Lance Taylor [Sat, 14 Dec 2024 22:32:11 +0000 (14:32 -0800)] 
libbacktrace: don't use ZSTD_CLEVEL_DEFAULT

PR 117812 reports that testing GCC with zstd 1.3.4 fails because
ZSTD_CLEVEL_DEFAULT is not defined, so avoid using it.

PR libbacktrace/117812
* zstdtest.c (test_large): Use 3 rather than ZSTD_CLEVEL_DEFAULT

7 months ago[PATCH v3] match.pd: Add pattern to simplify `(a - 1) & -a` to `0`
Jovan Vukic [Sat, 14 Dec 2024 21:47:35 +0000 (14:47 -0700)] 
[PATCH v3] match.pd: Add pattern to simplify `(a - 1) & -a` to `0`

Thank you for the feedback. I have made the minor changes that were requested.
Additionally, I extracted the repetitive code into a reusable helper function,
match_plus_neg_pattern, making the code much more readable. Furthermore, the
logic, code, and tests remain the same as in version 2 of the patch.

gcc/ChangeLog:

* match.pd: New pattern.
* simplify-rtx.cc (match_plus_neg_pattern): New helper function.
(simplify_context::simplify_binary_operation_1): New
code to handle (a - 1) & -a, (a - 1) | -a and (a - 1) ^ -a.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-11.c: New test.

7 months agobpf: fix build adding new required arg to RESOLVE_OVERLOADED_BUILTIN
Jose E. Marchesi [Sat, 14 Dec 2024 18:15:34 +0000 (19:15 +0100)] 
bpf: fix build adding new required arg to RESOLVE_OVERLOADED_BUILTIN

gcc/ChangeLog

* config/bpf/bpf.cc (bpf_resolve_overloaded_builtin): Add argument
`complain'.

7 months agodoc: Fix typos for --enable-host-pie docs in install.texi
Heiko Eißfeldt [Sat, 14 Dec 2024 12:31:58 +0000 (12:31 +0000)] 
doc: Fix typos for --enable-host-pie docs in install.texi

gcc/ChangeLog:

* doc/install.texi (Configuration): Fix typos in documentation
for --enable-host-pie.

7 months agogimple-fold: Fix the recent ifcombine optimization for _BitInt [PR118023]
Jakub Jelinek [Sat, 14 Dec 2024 10:28:25 +0000 (11:28 +0100)] 
gimple-fold: Fix the recent ifcombine optimization for _BitInt [PR118023]

The BIT_FIELD_REF verifier has:
          if (INTEGRAL_TYPE_P (TREE_TYPE (op))
              && !type_has_mode_precision_p (TREE_TYPE (op)))
            {
              error ("%qs of non-mode-precision operand", code_name);
              return true;
            }
check among other things, so one can't extract something out of say
_BitInt(63) or _BitInt(4096).
The new ifcombine optimization happily creates such BIT_FIELD_REFs
and ICEs during their verification.

The following patch fixes that by rejecting those in decode_field_reference.

2024-12-14  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/118023
* gimple-fold.cc (decode_field_reference): Return NULL_TREE if
inner has non-type_has_mode_precision_p integral type.

* gcc.dg/bitint-119.c: New test.

7 months agowarn-access: Fix up matching_alloc_calls_p [PR118024]
Jakub Jelinek [Sat, 14 Dec 2024 10:27:20 +0000 (11:27 +0100)] 
warn-access: Fix up matching_alloc_calls_p [PR118024]

The following testcase ICEs because of a bug in matching_alloc_calls_p.
The loop was apparently meant to be walking the two attribute chains
in lock-step, but doesn't really do that.  If the first lookup_attribute
returns non-NULL, the second one is not done, so rmats in that case can
be some random unrelated attribute rather than "malloc" attribute; the
body assumes even rmats if non-NULL is "malloc" attribute and relies
on its argument to be a "malloc" argument and if it is some other
attribute with incompatible attribute, it just crashes.

Now, fixing that in the obvious way, instead of doing
(amats = lookup_attribute ("malloc", amats))
 || (rmats = lookup_attribute ("malloc", rmats))
in the condition do
((amats = lookup_attribute ("malloc", amats)),
 (rmats = lookup_attribute ("malloc", rmats)),
 (amats || rmats))
fixes the testcase but regresses Wmismatched-dealloc-{2,3}.c tests.
The problem is that walking the attribute lists in a lock-step is obviously
a very bad idea, there is no requirement that the same deallocators are
present in the same order on both decls, e.g. there could be an extra malloc
attribute without argument in just one of the lists, or the order of say
free/realloc could be swapped, etc.  We don't generally document nor enforce
any particular ordering of attributes (even when for some attributes we just
handle the first one rather than all).

So, this patch instead simply splits it into two loops, the first one walks
alloc_decl attributes, the second one walks dealloc_decl attributes.
If the malloc attribute argument is a built-in, that doesn't change
anything, and otherwise we have the chance to populate the whole
common_deallocs hash_set in the first loop and then can check it in the
second one (and don't need to use more expensive add method on it, can just
check contains there).  Not to mention that it also fixes the case when
the function would incorrectly return true if there wasn't a common
deallocator between the two, but dealloc_decl had 2 malloc attributes with
the same deallocator.

2024-12-14  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/118024
* gimple-ssa-warn-access.cc (matching_alloc_calls_p): Walk malloc
attributes of alloc_decl and dealloc_decl in separate loops rather
than in lock-step.  Use common_deallocs.contains rather than
common_deallocs.add in the second loop.

* gcc.dg/pr118024.c: New test.

7 months agoopts: Use OPTION_SET_P instead of magic value 2 for -fshort-enums default [PR118011]
Jakub Jelinek [Sat, 14 Dec 2024 10:25:08 +0000 (11:25 +0100)] 
opts: Use OPTION_SET_P instead of magic value 2 for -fshort-enums default [PR118011]

The magic values for default (usually -1 or sometimes 2) for some options
are from times we haven't global_options_set, I think we should eventually
get rid of all of those.

The PR is about gcc -Q --help=optimizers reporting -fshort-enums as
[enabled] when it is disabled.
For this the following patch is just partial fix; with explicit
gcc -Q --help=optimizers -fshort-enums
or
gcc -Q --help=optimizers -fno-short-enums
it already worked correctly before, with this patch it will report
even with just
gcc -Q --help=optimizers
correct value on most targets, except 32-bit arm with some options or
defaults, so I think it is a step in the right direction.

But, as I wrote in the PR, process_options isn't done before --help=
and even shouldn't be in its current form where it warns on some option
combinations or errors or emits sorry on others, so I think ideally
process_options should have some bool argument whether it is done for
--help= purposes or not, if yes, not emit warnings and just adjust the
options, otherwise do what it currently does.

2024-12-14  Jakub Jelinek  <jakub@redhat.com>

PR c/118011
gcc/
* opts.cc (init_options_struct): Don't set opts->x_flag_short_enums to
2.
* toplev.cc (process_options): Test !OPTION_SET_P (flag_short_enums)
rather than flag_short_enums == 2.
gcc/ada/
* gcc-interface/misc.cc (gnat_post_options): Test
!OPTION_SET_P (flag_short_enums) rather than flag_short_enums == 2.

7 months agoc++: Disallow decomposition of lambda bases [PR90321]
Nathaniel Shead [Thu, 7 Nov 2024 10:37:28 +0000 (21:37 +1100)] 
c++: Disallow decomposition of lambda bases [PR90321]

Decomposition of lambda closure types is not allowed by
[dcl.struct.bind] p6, since members of a closure have no name.

r244909 made this an error, but missed the case where a lambda is used
as a base.  This patch moves the check to find_decomp_class_base to
handle this case.

As a drive-by improvement, we also slightly improve the diagnostics to
indicate why a base class was being inspected.  Ideally the diagnostic
would point directly at the relevant base, but there doesn't seem to be
an easy way to get this location just from the binfo so I don't worry
about that here.

PR c++/90321

gcc/cp/ChangeLog:

* decl.cc (find_decomp_class_base): Check for decomposing a
lambda closure type.  Report base class chains if needed.
(cp_finish_decomp): Remove no-longer-needed check.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/decomp62.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Marek Polacek <polacek@redhat.com>
7 months agolibstdc++: Remove duplicate using-declaration in <wchar.h>
Abdo Eid [Sat, 14 Dec 2024 01:16:10 +0000 (01:16 +0000)] 
libstdc++: Remove duplicate using-declaration in <wchar.h>

libstdc++-v3/ChangeLog:

* include/c_compatibility/wchar.h (fgetwc): Remove duplicate
using-declaration.

7 months agoDaily bump.
GCC Administrator [Sat, 14 Dec 2024 00:19:52 +0000 (00:19 +0000)] 
Daily bump.

7 months agocse: Fix up record_jump_equiv checks [PR117095]
Jakub Jelinek [Fri, 13 Dec 2024 23:41:00 +0000 (00:41 +0100)] 
cse: Fix up record_jump_equiv checks [PR117095]

The following testcase is miscompiled on s390x-linux with -O2 -march=z15.
The problem happens during cse2, which sees in an extended basic block
(jump_insn 217 78 216 10 (parallel [
            (set (pc)
                (if_then_else (ne (reg:SI 165)
                        (const_int 1 [0x1]))
                    (label_ref 216)
                    (pc)))
            (set (reg:SI 165)
                (plus:SI (reg:SI 165)
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (scratch:SI))
            (clobber (reg:CC 33 %cc))
        ]) "t.c":14:17 discrim 1 2192 {doloop_si64}
     (int_list:REG_BR_PROB 955630228 (nil))
 -> 216)
...
(insn 99 98 100 12 (set (reg:SI 138)
        (const_int 1 [0x1])) "t.c":9:31 1507 {*movsi_zarch}
     (nil))
(insn 100 99 103 12 (parallel [
            (set (reg:SI 137)
                (minus:SI (reg:SI 138)
                    (subreg:SI (reg:HI 135 [ a ]) 0)))
            (clobber (reg:CC 33 %cc))
        ]) "t.c":9:31 1904 {*subsi3}
     (expr_list:REG_DEAD (reg:SI 138)
        (expr_list:REG_DEAD (reg:HI 135 [ a ])
            (expr_list:REG_UNUSED (reg:CC 33 %cc)
                (nil)))))
Note, cse2 has df_note_add_problem () before df_analyze, which add
     (expr_list:REG_UNUSED (reg:SI 165)
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
notes to the first insn (correctly so, %cc is clobbered there and pseudo
165 isn't used after the insn).
Now, cse_extended_basic_block has an extra optimization on conditional
jumps, where it records equivalence on the edge which continues in the ebb.
Here it sees (ne reg:SI 165) (const_int 1) is false on the edge and
remembers that pseudo 165 is comparison equivalent to (const_int 1),
so on insn 100 it decides to replace (reg:SI 138) with (reg:SI 165).

This optimization isn't correct here though, because the JUMP_INSN has
multiple sets.  Before r0-77890 record_jump_equiv has been called from
cse_insn guarded on n_sets == 1 && any_condjump_p (insn), so it wouldn't
be done on the above JUMP_INSN where n_sets == 2.  But since that change
it is guarded with single_set (insn) && any_condjump_p (insn) and that
is true because of the REG_UNUSED note.  Looking at that note is
inappropriate in CSE though, because the whole intent of the pass is to
extend the lifetimes of the pseudos if equivalence is found, so the fact
that there is REG_UNUSED note for (reg:SI 165) and that the reg isn't used
later doesn't imply that it won't be used after the optimization.
So, unless we manage to process the other sets on the JUMP_INSN (it wouldn't
be terribly hard in this exact case, the doloop insn decreases the register
by 1 and so we could just record equivalence to (const_int 0) instead, but
generally it might be hard), we should IMHO just punt if there are multiple
sets.

The patch below adds !multiple_sets (insn) check instead of replacing with
it the single_set (insn) check, because apparently any_condjump_p uses
pc_set which supports the case where PATTERN is a SET to PC (that is a
single_set (insn) && !multiple_sets (insn), PATTERN is a PARALLEL with a
single SET to PC (likewise) and some CLOBBERs, PARALLEL with two or more
SETs where the first one is SET to PC (that could be single_set (insn)
with REG_UNUSED notes but is not !multiple_sets (insn)) or PATTERN
is UNSPEC/UNSPEC_VOLATILE with SET inside of it.  For the last case
!multiple_sets (insn) will be true, but IMHO we shouldn't try to derive
anything from those because we haven't checked the rest of the UNSPEC*
and we don't really know what it does.

2024-12-13  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/117095
* cse.cc (cse_extended_basic_block): Don't call record_jump_equiv
if multiple_sets (insn).

* gcc.c-torture/execute/pr117095.c: New test.

7 months agolibstdc++: Avoid unnecessary copies in ranges::min/max [PR112349]
Patrick Palka [Fri, 13 Dec 2024 18:17:29 +0000 (13:17 -0500)] 
libstdc++: Avoid unnecessary copies in ranges::min/max [PR112349]

Use a local reference for the (now possibly lifetime extended) result of
*__first so that we copy it only when necessary.

PR libstdc++/112349

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__min_fn::operator()): Turn local
object __tmp into a reference.
* include/bits/ranges_util.h (__max_fn::operator()): Likewise.
* testsuite/25_algorithms/max/constrained.cc (test04): New test.
* testsuite/25_algorithms/min/constrained.cc (test04): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
7 months agoarm: [MVE intrinsics] Fix support for predicate constants [PR target/114801]
Christophe Lyon [Sun, 24 Nov 2024 18:08:48 +0000 (18:08 +0000)] 
arm: [MVE intrinsics] Fix support for predicate constants [PR target/114801]

In this PR, we have to handle a case where MVE predicates are supplied
as a const_int, where individual predicates have illegal boolean
values (such as 0xc for a 4-bit boolean predicate).  To avoid the ICE,
fix the constant (any non-zero value is converted to all 1s) and emit
a warning.

On MVE, V8BI and V4BI multi-bit masks are interpreted byte-by-byte at
instruction level, but end-users should describe lanes rather than
bytes (so all bytes of a true-predicated lane should be '1'), see the
section on MVE intrinsics in the Arm ACLE specification.

Since force_lowpart_subreg cannot handle const_int (because they have VOID mode),
use gen_lowpart on them, force_lowpart_subreg otherwise.

2024-11-20  Christophe Lyon  <christophe.lyon@linaro.org>
    Jakub Jelinek  <jakub@redhat.com>

PR target/114801
gcc/
* config/arm/arm-mve-builtins.cc
(function_expander::add_input_operand): Handle CONST_INT
predicates.

gcc/testsuite/
* gcc.target/arm/mve/pr108443.c: Update predicate constant.
* gcc.target/arm/mve/pr108443-run.c: Likewise.
* gcc.target/arm/mve/pr114801.c: New test.

7 months agoarm: [MVE intrinsics] rework vst2q vst4q vld2q vld4q
Christophe Lyon [Wed, 13 Nov 2024 15:33:18 +0000 (15:33 +0000)] 
arm: [MVE intrinsics] rework vst2q vst4q vld2q vld4q

Implement vst2q, vst4q, vld2q and vld4q using the new MVE builtins
framework.

Since MVE uses different tuple modes than Neon, we need to use
VALID_MVE_STRUCT_MODE because VALID_NEON_STRUCT_MODE is no longer a
super-set of it, for instance in output_move_neon and
arm_print_operand_address.

In arm_hard_regno_mode_ok, the change is similar but a bit more
intrusive.

Expand the VSTRUCT iterator, so that mov<mode> and neon_mov<mode>
patterns from neon.md still work for MVE.

Besides the small updates to the patterns in mve.md, we have to update
vec_load_lanes and vec_store_lanes in vec-common.md so that the
vectorizer can handle the new modes. These patterns are now different
from Neon's, so maybe we should move them back to neon.md and mve.md

The patch adds arm_array_mode, which is used by build_array_type_nelts
and makes it possible to support the new assert in
register_builtin_tuple_types.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-base.cc (class vst24_impl): New.
(class vld24_impl): New.
(vld2q, vld4q, vst2q, vst4q): New.
* config/arm/arm-mve-builtins-base.def (vld2q, vld4q, vst2q)
(vst4q): New.
* config/arm/arm-mve-builtins-base.h (vld2q, vld4q, vst2q, vst4q):
New.
* config/arm/arm-mve-builtins.cc (register_builtin_tuple_types):
Add more asserts.
* config/arm/arm.cc (TARGET_ARRAY_MODE): New.
(output_move_neon): Handle MVE struct modes.
(arm_print_operand_address): Likewise.
(arm_hard_regno_mode_ok): Likewise.
(arm_array_mode): New.
* config/arm/arm.h (VALID_MVE_STRUCT_MODE): Likewise.
* config/arm/arm_mve.h (vst4q): Delete.
(vst2q): Delete.
(vld2q): Delete.
(vld4q): Delete.
(vst4q_s8): Delete.
(vst4q_s16): Delete.
(vst4q_s32): Delete.
(vst4q_u8): Delete.
(vst4q_u16): Delete.
(vst4q_u32): Delete.
(vst4q_f16): Delete.
(vst4q_f32): Delete.
(vst2q_s8): Delete.
(vst2q_u8): Delete.
(vld2q_s8): Delete.
(vld2q_u8): Delete.
(vld4q_s8): Delete.
(vld4q_u8): Delete.
(vst2q_s16): Delete.
(vst2q_u16): Delete.
(vld2q_s16): Delete.
(vld2q_u16): Delete.
(vld4q_s16): Delete.
(vld4q_u16): Delete.
(vst2q_s32): Delete.
(vst2q_u32): Delete.
(vld2q_s32): Delete.
(vld2q_u32): Delete.
(vld4q_s32): Delete.
(vld4q_u32): Delete.
(vld4q_f16): Delete.
(vld2q_f16): Delete.
(vst2q_f16): Delete.
(vld4q_f32): Delete.
(vld2q_f32): Delete.
(vst2q_f32): Delete.
(__arm_vst4q_s8): Delete.
(__arm_vst4q_s16): Delete.
(__arm_vst4q_s32): Delete.
(__arm_vst4q_u8): Delete.
(__arm_vst4q_u16): Delete.
(__arm_vst4q_u32): Delete.
(__arm_vst2q_s8): Delete.
(__arm_vst2q_u8): Delete.
(__arm_vld2q_s8): Delete.
(__arm_vld2q_u8): Delete.
(__arm_vld4q_s8): Delete.
(__arm_vld4q_u8): Delete.
(__arm_vst2q_s16): Delete.
(__arm_vst2q_u16): Delete.
(__arm_vld2q_s16): Delete.
(__arm_vld2q_u16): Delete.
(__arm_vld4q_s16): Delete.
(__arm_vld4q_u16): Delete.
(__arm_vst2q_s32): Delete.
(__arm_vst2q_u32): Delete.
(__arm_vld2q_s32): Delete.
(__arm_vld2q_u32): Delete.
(__arm_vld4q_s32): Delete.
(__arm_vld4q_u32): Delete.
(__arm_vst4q_f16): Delete.
(__arm_vst4q_f32): Delete.
(__arm_vld4q_f16): Delete.
(__arm_vld2q_f16): Delete.
(__arm_vst2q_f16): Delete.
(__arm_vld4q_f32): Delete.
(__arm_vld2q_f32): Delete.
(__arm_vst2q_f32): Delete.
(__arm_vst4q): Delete.
(__arm_vst2q): Delete.
(__arm_vld2q): Delete.
(__arm_vld4q): Delete.
* config/arm/arm_mve_builtins.def (vst4q, vst2q, vld4q, vld2q):
Delete.
* config/arm/iterators.md (VSTRUCT): Add V2x16QI, V2x8HI, V2x4SI,
V2x8HF, V2x4SF, V4x16QI, V4x8HI, V4x4SI, V4x8HF, V4x4SF.
(MVE_VLD2_VST2, MVE_vld2_vst2, MVE_VLD4_VST4, MVE_vld4_vst4): New.
* config/arm/mve.md (mve_vst4q<mode>): Update into ...
(@mve_vst4q<mode>): ... this.
(mve_vst2q<mode>): Update into ...
(@mve_vst2q<mode>): ... this.
(mve_vld2q<mode>): Update into ...
(@mve_vld2q<mode>): ... this.
(mve_vld4q<mode>): Update into ...
(@mve_vld4q<mode>): ... this.
* config/arm/vec-common.md (vec_load_lanesoi<mode>) Remove MVE
support.
(vec_load_lanesxi<mode>): Likewise.
(vec_store_lanesoi<mode>): Likewise.
(vec_store_lanesxi<mode>): Likewise.
(vec_load_lanes<MVE_vld2_vst2><mode>):
New.
(vec_store_lanes<MVE_vld2_vst2><mode>): New.
(vec_load_lanes<MVE_vld4_vst4><mode>): New.
(vec_store_lanes<MVE_vld4_vst4><mode>): New.

7 months agoarm: [MVE intrinsics] fix store shape to support tuples
Christophe Lyon [Wed, 13 Nov 2024 15:31:21 +0000 (15:31 +0000)] 
arm: [MVE intrinsics] fix store shape to support tuples

Now that tuples are properly supported, we can update the store shape, to expect
"t0" instead of "v0" as last argument.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-shapes.cc (struct store_def): Add
support for tuples.

7 months agoarm: [MVE intrinsics] add support for tuples
Christophe Lyon [Wed, 13 Nov 2024 15:30:44 +0000 (15:30 +0000)] 
arm: [MVE intrinsics] add support for tuples

This patch is largely a copy/paste from the aarch64 SVE counterpart,
and adds support for tuples to the MVE intrinsics framework.

Introduce function_resolver::infer_tuple_type which will be used to
resolve overloaded vst2q and vst4q function names in a later patch.

Fix access to acle_vector_types in a few places, as well as in
infer_vector_or_tuple_type because we should shift the tuple size to
the right by one bit when computing the array index.

The new wrap_type_in_struct, register_type_decl and infer_tuple_type
are largely copies of the aarch64 versions, and
register_builtin_tuple_types is very similar.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-shapes.cc (parse_type): Fix access
to acle_vector_types.
* config/arm/arm-mve-builtins.cc (wrap_type_in_struct): New.
(register_type_decl): New.
(register_builtin_tuple_types): Fix support for tuples.
(function_resolver::infer_tuple_type): New.
* config/arm/arm-mve-builtins.h
(function_resolver::infer_tuple_type): Declare.
(function_instance::tuple_type): Fix access to acle_vector_types.

7 months agoarm: [MVE intrinsics] add modes for tuples
Christophe Lyon [Wed, 13 Nov 2024 15:28:29 +0000 (15:28 +0000)] 
arm: [MVE intrinsics] add modes for tuples

Add V2x and V4x modes, like we do on aarch64 for Advanced SIMD
q-registers.

gcc/ChangeLog:

* config/arm/arm-modes.def (MVE_STRUCT_MODES): New.

7 months agoarm: [MVE intrinsics] remove V2DF from MVE_vecs iterator
Christophe Lyon [Wed, 16 Aug 2023 13:43:16 +0000 (13:43 +0000)] 
arm: [MVE intrinsics] remove V2DF from MVE_vecs iterator

V2DF is not supported by MVE, so remove it from the only iterator
which contains it.

gcc/ChangeLog:

* config/arm/iterators.md (MVE_vecs): Remove V2DF.

7 months agoarm: [MVE intrinsics] Fix condition for vec_extract patterns
Christophe Lyon [Wed, 16 Aug 2023 13:42:53 +0000 (13:42 +0000)] 
arm: [MVE intrinsics] Fix condition for vec_extract patterns

Remove floating-point condition from mve_vec_extract_sext_internal and
mve_vec_extract_zext_internal, since the MVE_2 iterator does not
include any FP mode.

gcc/ChangeLog:

* config/arm/mve.md (mve_vec_extract_sext_internal): Fix
condition.
(mve_vec_extract_zext_internal): Likewise.

7 months agoarm: [MVE intrinsics] remove useless call_properties implementations.
Christophe Lyon [Tue, 5 Nov 2024 22:43:04 +0000 (22:43 +0000)] 
arm: [MVE intrinsics] remove useless call_properties implementations.

vstrq_impl derives from store_truncating and vldrq_impl derives from
load_extending which both implement call_properties.

No need to re-implement them in the derived classes.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-base.cc (vstrq_impl): Remove
call_properties.
(vldrq_impl): Likewise.

7 months agoarm: [MVE intrinsics] rework vldr gather_base_wb
Christophe Lyon [Thu, 31 Oct 2024 09:56:05 +0000 (09:56 +0000)] 
arm: [MVE intrinsics] rework vldr gather_base_wb

Implement vldr?q_gather_base_wb using the new MVE builtins framework.

gcc/ChangeLog:

* config/arm/arm-builtins.cc (arm_ldrgbwbxu_qualifiers)
(arm_ldrgbwbxu_z_qualifiers, arm_ldrgbwbs_qualifiers)
(arm_ldrgbwbu_qualifiers, arm_ldrgbwbs_z_qualifiers)
(arm_ldrgbwbu_z_qualifiers): Delete.
* config/arm/arm-mve-builtins-base.cc (vldrq_gather_base_impl):
Add support for MODE_wb.
* config/arm/arm-mve-builtins-shapes.cc (struct
load_gather_base_def): Likewise.
* config/arm/arm_mve.h (vldrdq_gather_base_wb_s64): Delete.
(vldrdq_gather_base_wb_u64): Delete.
(vldrdq_gather_base_wb_z_s64): Delete.
(vldrdq_gather_base_wb_z_u64): Delete.
(vldrwq_gather_base_wb_f32): Delete.
(vldrwq_gather_base_wb_s32): Delete.
(vldrwq_gather_base_wb_u32): Delete.
(vldrwq_gather_base_wb_z_f32): Delete.
(vldrwq_gather_base_wb_z_s32): Delete.
(vldrwq_gather_base_wb_z_u32): Delete.
(__arm_vldrdq_gather_base_wb_s64): Delete.
(__arm_vldrdq_gather_base_wb_u64): Delete.
(__arm_vldrdq_gather_base_wb_z_s64): Delete.
(__arm_vldrdq_gather_base_wb_z_u64): Delete.
(__arm_vldrwq_gather_base_wb_s32): Delete.
(__arm_vldrwq_gather_base_wb_u32): Delete.
(__arm_vldrwq_gather_base_wb_z_s32): Delete.
(__arm_vldrwq_gather_base_wb_z_u32): Delete.
(__arm_vldrwq_gather_base_wb_f32): Delete.
(__arm_vldrwq_gather_base_wb_z_f32): Delete.
* config/arm/arm_mve_builtins.def (vldrwq_gather_base_nowb_z_u)
(vldrdq_gather_base_nowb_z_u, vldrwq_gather_base_nowb_u)
(vldrdq_gather_base_nowb_u, vldrwq_gather_base_nowb_z_s)
(vldrwq_gather_base_nowb_z_f, vldrdq_gather_base_nowb_z_s)
(vldrwq_gather_base_nowb_s, vldrwq_gather_base_nowb_f)
(vldrdq_gather_base_nowb_s, vldrdq_gather_base_wb_z_s)
(vldrdq_gather_base_wb_z_u, vldrdq_gather_base_wb_s)
(vldrdq_gather_base_wb_u, vldrwq_gather_base_wb_z_s)
(vldrwq_gather_base_wb_z_f, vldrwq_gather_base_wb_z_u)
(vldrwq_gather_base_wb_s, vldrwq_gather_base_wb_f)
(vldrwq_gather_base_wb_u): Delete
* config/arm/iterators.md (supf): Remove VLDRWQGBWB_S,
VLDRWQGBWB_U, VLDRDQGBWB_S, VLDRDQGBWB_U.
(VLDRWGBWBQ, VLDRDGBWBQ): Delete.
* config/arm/mve.md (mve_vldrwq_gather_base_wb_<supf>v4si): Delete.
(mve_vldrwq_gather_base_nowb_<supf>v4si): Delete.
(mve_vldrwq_gather_base_wb_<supf>v4si_insn): Delete.
(mve_vldrwq_gather_base_wb_z_<supf>v4si): Delete.
(mve_vldrwq_gather_base_nowb_z_<supf>v4si): Delete.
(mve_vldrwq_gather_base_wb_z_<supf>v4si_insn): Delete.
(mve_vldrwq_gather_base_wb_fv4sf): Delete.
(mve_vldrwq_gather_base_nowb_fv4sf): Delete.
(mve_vldrwq_gather_base_wb_fv4sf_insn): Delete.
(mve_vldrwq_gather_base_wb_z_fv4sf): Delete.
(mve_vldrwq_gather_base_nowb_z_fv4sf): Delete.
(mve_vldrwq_gather_base_wb_z_fv4sf_insn): Delete.
(mve_vldrdq_gather_base_wb_<supf>v2di): Delete.
(mve_vldrdq_gather_base_nowb_<supf>v2di): Delete.
(mve_vldrdq_gather_base_wb_<supf>v2di_insn): Delete.
(mve_vldrdq_gather_base_wb_z_<supf>v2di): Delete.
(mve_vldrdq_gather_base_nowb_z_<supf>v2di): Delete.
(mve_vldrdq_gather_base_wb_z_<supf>v2di_insn): Delete.
(@mve_vldrq_gather_base_wb_<mode>): New.
(@mve_vldrq_gather_base_wb_z_<mode>): New.
* config/arm/unspecs.md (VLDRWQGBWB_S, VLDRWQGBWB_U, VLDRWQGBWB_F)
(VLDRDQGBWB_S, VLDRDQGBWB_U): Delete
(VLDRGBWBQ, VLDRGBWBQ_Z): New.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c:
Update expected output.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c:
Likewise.

7 months agoarm: [MVE intrinsics] rework vldr gather_base
Christophe Lyon [Wed, 30 Oct 2024 17:32:48 +0000 (17:32 +0000)] 
arm: [MVE intrinsics] rework vldr gather_base

Implement vldr?q_gather_base using the new MVE builtins framework.

The patch updates two testcases rather than using different iterators
for predicated and non-predicated versions. According to ACLE:
vldrdq_gather_base_s64 is expected to generate VLDRD.64
vldrdq_gather_base_z_s64 is expected to generate VLDRDT.U64

Both are equally valid, however.

gcc/ChangeLog:

* config/arm/arm-builtins.cc (arm_ldrgbs_qualifiers)
(arm_ldrgbu_qualifiers, arm_ldrgbs_z_qualifiers)
(arm_ldrgbu_z_qualifiers): Delete.
* config/arm/arm-mve-builtins-base.cc (class
vldrq_gather_base_impl): New.
(vldrdq_gather_base, vldrwq_gather_base): New.
* config/arm/arm-mve-builtins-base.def (vldrdq_gather_base)
(vldrwq_gather_base): New.
* config/arm/arm-mve-builtins-base.h: (vldrdq_gather_base)
(vldrwq_gather_base): New.
* config/arm/arm_mve.h (vldrwq_gather_base_s32): Delete.
(vldrwq_gather_base_u32): Delete.
(vldrwq_gather_base_z_u32): Delete.
(vldrwq_gather_base_z_s32): Delete.
(vldrdq_gather_base_s64): Delete.
(vldrdq_gather_base_u64): Delete.
(vldrdq_gather_base_z_s64): Delete.
(vldrdq_gather_base_z_u64): Delete.
(vldrwq_gather_base_f32): Delete.
(vldrwq_gather_base_z_f32): Delete.
(__arm_vldrwq_gather_base_s32): Delete.
(__arm_vldrwq_gather_base_u32): Delete.
(__arm_vldrwq_gather_base_z_s32): Delete.
(__arm_vldrwq_gather_base_z_u32): Delete.
(__arm_vldrdq_gather_base_s64): Delete.
(__arm_vldrdq_gather_base_u64): Delete.
(__arm_vldrdq_gather_base_z_s64): Delete.
(__arm_vldrdq_gather_base_z_u64): Delete.
(__arm_vldrwq_gather_base_f32): Delete.
(__arm_vldrwq_gather_base_z_f32): Delete.
* config/arm/arm_mve_builtins.def (vldrwq_gather_base_s)
(vldrwq_gather_base_u, vldrwq_gather_base_z_s)
(vldrwq_gather_base_z_u, vldrdq_gather_base_s)
(vldrwq_gather_base_f, vldrdq_gather_base_z_s)
(vldrwq_gather_base_z_f, vldrdq_gather_base_u)
(vldrdq_gather_base_z_u): Delete.
* config/arm/iterators.md (supf): Remove VLDRWQGB_S, VLDRWQGB_U,
VLDRDQGB_S, VLDRDQGB_U.
(VLDRWGBQ, VLDRDGBQ): Delete.
* config/arm/mve.md (mve_vldrwq_gather_base_<supf>v4si): Delete.
(mve_vldrwq_gather_base_z_<supf>v4si): Delete.
(mve_vldrdq_gather_base_<supf>v2di): Delete.
(mve_vldrdq_gather_base_z_<supf>v2di): Delete.
(mve_vldrwq_gather_base_fv4sf): Delete.
(mve_vldrwq_gather_base_z_fv4sf): Delete.
(@mve_vldrq_gather_base_<mode>): New.
(@mve_vldrq_gather_base_z_<mode>): New.
* config/arm/unspecs.md (VLDRWQGB_S, VLDRWQGB_U, VLDRDQGB_S)
(VLDRDQGB_U, VLDRWQGB_F): Delete.
(VLDRGBQ, VLDRGBQ_Z): New.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_s64.c: Update
expected output.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_u64.c:
Likewise.

7 months agoarm: [MVE intrinsics] add load_gather_base shape
Christophe Lyon [Wed, 30 Oct 2024 09:02:15 +0000 (09:02 +0000)] 
arm: [MVE intrinsics] add load_gather_base shape

This patch adds the load_gather_base shape description.

Unlike other load_gather shapes, this one does not support overloaded
forms.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-shapes.cc (struct
load_gather_base_def): New.
* config/arm/arm-mve-builtins-shapes.h: (load_gather_base): New.

7 months agoarm: [MVE intrinsics] rework vldr gather_shifted_offset
Christophe Lyon [Tue, 29 Oct 2024 21:10:10 +0000 (21:10 +0000)] 
arm: [MVE intrinsics] rework vldr gather_shifted_offset

Implement vldr?q_gather_shifted_offset using the new MVE builtins
framework.

gcc/ChangeLog:

* config/arm/arm-builtins.cc (arm_ldrgu_qualifiers)
(arm_ldrgs_qualifiers, arm_ldrgs_z_qualifiers)
(arm_ldrgu_z_qualifiers): Delete.
* config/arm/arm-mve-builtins-base.cc (vldrq_gather_impl): Add
support for shifted version.
(vldrdq_gather_shifted, vldrhq_gather_shifted)
(vldrwq_gather_shifted): New.
* config/arm/arm-mve-builtins-base.def (vldrdq_gather_shifted)
(vldrhq_gather_shifted, vldrwq_gather_shifted): New.
* config/arm/arm-mve-builtins-base.h (vldrdq_gather_shifted)
(vldrhq_gather_shifted, vldrwq_gather_shifted): New.
* config/arm/arm_mve.h (vldrhq_gather_shifted_offset): Delete.
(vldrhq_gather_shifted_offset_z): Delete.
(vldrdq_gather_shifted_offset): Delete.
(vldrdq_gather_shifted_offset_z): Delete.
(vldrwq_gather_shifted_offset): Delete.
(vldrwq_gather_shifted_offset_z): Delete.
(vldrhq_gather_shifted_offset_s32): Delete.
(vldrhq_gather_shifted_offset_s16): Delete.
(vldrhq_gather_shifted_offset_u32): Delete.
(vldrhq_gather_shifted_offset_u16): Delete.
(vldrhq_gather_shifted_offset_z_s32): Delete.
(vldrhq_gather_shifted_offset_z_s16): Delete.
(vldrhq_gather_shifted_offset_z_u32): Delete.
(vldrhq_gather_shifted_offset_z_u16): Delete.
(vldrdq_gather_shifted_offset_s64): Delete.
(vldrdq_gather_shifted_offset_u64): Delete.
(vldrdq_gather_shifted_offset_z_s64): Delete.
(vldrdq_gather_shifted_offset_z_u64): Delete.
(vldrhq_gather_shifted_offset_f16): Delete.
(vldrhq_gather_shifted_offset_z_f16): Delete.
(vldrwq_gather_shifted_offset_f32): Delete.
(vldrwq_gather_shifted_offset_s32): Delete.
(vldrwq_gather_shifted_offset_u32): Delete.
(vldrwq_gather_shifted_offset_z_f32): Delete.
(vldrwq_gather_shifted_offset_z_s32): Delete.
(vldrwq_gather_shifted_offset_z_u32): Delete.
(__arm_vldrhq_gather_shifted_offset_s32): Delete.
(__arm_vldrhq_gather_shifted_offset_s16): Delete.
(__arm_vldrhq_gather_shifted_offset_u32): Delete.
(__arm_vldrhq_gather_shifted_offset_u16): Delete.
(__arm_vldrhq_gather_shifted_offset_z_s32): Delete.
(__arm_vldrhq_gather_shifted_offset_z_s16): Delete.
(__arm_vldrhq_gather_shifted_offset_z_u32): Delete.
(__arm_vldrhq_gather_shifted_offset_z_u16): Delete.
(__arm_vldrdq_gather_shifted_offset_s64): Delete.
(__arm_vldrdq_gather_shifted_offset_u64): Delete.
(__arm_vldrdq_gather_shifted_offset_z_s64): Delete.
(__arm_vldrdq_gather_shifted_offset_z_u64): Delete.
(__arm_vldrwq_gather_shifted_offset_s32): Delete.
(__arm_vldrwq_gather_shifted_offset_u32): Delete.
(__arm_vldrwq_gather_shifted_offset_z_s32): Delete.
(__arm_vldrwq_gather_shifted_offset_z_u32): Delete.
(__arm_vldrhq_gather_shifted_offset_f16): Delete.
(__arm_vldrhq_gather_shifted_offset_z_f16): Delete.
(__arm_vldrwq_gather_shifted_offset_f32): Delete.
(__arm_vldrwq_gather_shifted_offset_z_f32): Delete.
(__arm_vldrhq_gather_shifted_offset): Delete.
(__arm_vldrhq_gather_shifted_offset_z): Delete.
(__arm_vldrdq_gather_shifted_offset): Delete.
(__arm_vldrdq_gather_shifted_offset_z): Delete.
(__arm_vldrwq_gather_shifted_offset): Delete.
(__arm_vldrwq_gather_shifted_offset_z): Delete.
* config/arm/arm_mve_builtins.def
(vldrhq_gather_shifted_offset_z_u, vldrhq_gather_shifted_offset_u)
(vldrhq_gather_shifted_offset_z_s, vldrhq_gather_shifted_offset_s)
(vldrdq_gather_shifted_offset_s, vldrhq_gather_shifted_offset_f)
(vldrwq_gather_shifted_offset_f, vldrwq_gather_shifted_offset_s)
(vldrdq_gather_shifted_offset_z_s)
(vldrhq_gather_shifted_offset_z_f)
(vldrwq_gather_shifted_offset_z_f)
(vldrwq_gather_shifted_offset_z_s, vldrdq_gather_shifted_offset_u)
(vldrwq_gather_shifted_offset_u, vldrdq_gather_shifted_offset_z_u)
(vldrwq_gather_shifted_offset_z_u): Delete.
* config/arm/iterators.md (supf): Remove VLDRHQGSO_S, VLDRHQGSO_U,
VLDRDQGSO_S, VLDRDQGSO_U, VLDRWQGSO_S, VLDRWQGSO_U.
(VLDRHGSOQ, VLDRDGSOQ, VLDRWGSOQ): Delete.
* config/arm/mve.md
(mve_vldrhq_gather_shifted_offset_<supf><mode>): Delete.
(mve_vldrhq_gather_shifted_offset_z_<supf><mode>): Delete.
(mve_vldrdq_gather_shifted_offset_<supf>v2di): Delete.
(mve_vldrdq_gather_shifted_offset_z_<supf>v2di): Delete.
(mve_vldrhq_gather_shifted_offset_fv8hf): Delete.
(mve_vldrhq_gather_shifted_offset_z_fv8hf): Delete.
(mve_vldrwq_gather_shifted_offset_fv4sf): Delete.
(mve_vldrwq_gather_shifted_offset_<supf>v4si): Delete.
(mve_vldrwq_gather_shifted_offset_z_fv4sf): Delete.
(mve_vldrwq_gather_shifted_offset_z_<supf>v4si): Delete.
(@mve_vldrq_gather_shifted_offset_<mode>): New.
(@mve_vldrq_gather_shifted_offset_extend_v4si<US>): New.
(@mve_vldrq_gather_shifted_offset_z_<mode>): New.
(@mve_vldrq_gather_shifted_offset_z_extend_v4si<US>): New.
* config/arm/unspecs.md (VLDRHQGSO_S, VLDRHQGSO_U, VLDRDQGSO_S)
(VLDRDQGSO_U, VLDRHQGSO_F, VLDRWQGSO_F, VLDRWQGSO_S, VLDRWQGSO_U):
Delete.
(VLDRGSOQ, VLDRGSOQ_Z, VLDRGSOQ_EXT, VLDRGSOQ_EXT_Z): New.

7 months agoarm: [MVE intrinsics] rework vldr gather_offset
Christophe Lyon [Tue, 29 Oct 2024 10:34:23 +0000 (10:34 +0000)] 
arm: [MVE intrinsics] rework vldr gather_offset

Implement vldr?q_gather_offset using the new MVE builtins framework.

The patch introduces a new attribute iterator (MVE_u_elem) to
accomodate the fact that ACLE's expected output description uses "uNN"
for all modes, except V8HF where it expects ".f16".  Using "V_sz_elem"
would work, but would require to update several testcases.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-base.cc (class vldrq_gather_impl):
New.
(vldrbq_gather, vldrdq_gather, vldrhq_gather, vldrwq_gather): New.
* config/arm/arm-mve-builtins-base.def (vldrbq_gather)
(vldrdq_gather, vldrhq_gather, vldrwq_gather): New.
* config/arm/arm-mve-builtins-base.h (vldrbq_gather)
(vldrdq_gather, vldrhq_gather, vldrwq_gather): New.
* config/arm/arm_mve.h (vldrbq_gather_offset): Delete.
(vldrbq_gather_offset_z): Delete.
(vldrhq_gather_offset): Delete.
(vldrhq_gather_offset_z): Delete.
(vldrdq_gather_offset): Delete.
(vldrdq_gather_offset_z): Delete.
(vldrwq_gather_offset): Delete.
(vldrwq_gather_offset_z): Delete.
(vldrbq_gather_offset_u8): Delete.
(vldrbq_gather_offset_s8): Delete.
(vldrbq_gather_offset_u16): Delete.
(vldrbq_gather_offset_s16): Delete.
(vldrbq_gather_offset_u32): Delete.
(vldrbq_gather_offset_s32): Delete.
(vldrbq_gather_offset_z_s16): Delete.
(vldrbq_gather_offset_z_u8): Delete.
(vldrbq_gather_offset_z_s32): Delete.
(vldrbq_gather_offset_z_u16): Delete.
(vldrbq_gather_offset_z_u32): Delete.
(vldrbq_gather_offset_z_s8): Delete.
(vldrhq_gather_offset_s32): Delete.
(vldrhq_gather_offset_s16): Delete.
(vldrhq_gather_offset_u32): Delete.
(vldrhq_gather_offset_u16): Delete.
(vldrhq_gather_offset_z_s32): Delete.
(vldrhq_gather_offset_z_s16): Delete.
(vldrhq_gather_offset_z_u32): Delete.
(vldrhq_gather_offset_z_u16): Delete.
(vldrdq_gather_offset_s64): Delete.
(vldrdq_gather_offset_u64): Delete.
(vldrdq_gather_offset_z_s64): Delete.
(vldrdq_gather_offset_z_u64): Delete.
(vldrhq_gather_offset_f16): Delete.
(vldrhq_gather_offset_z_f16): Delete.
(vldrwq_gather_offset_f32): Delete.
(vldrwq_gather_offset_s32): Delete.
(vldrwq_gather_offset_u32): Delete.
(vldrwq_gather_offset_z_f32): Delete.
(vldrwq_gather_offset_z_s32): Delete.
(vldrwq_gather_offset_z_u32): Delete.
(__arm_vldrbq_gather_offset_u8): Delete.
(__arm_vldrbq_gather_offset_s8): Delete.
(__arm_vldrbq_gather_offset_u16): Delete.
(__arm_vldrbq_gather_offset_s16): Delete.
(__arm_vldrbq_gather_offset_u32): Delete.
(__arm_vldrbq_gather_offset_s32): Delete.
(__arm_vldrbq_gather_offset_z_s8): Delete.
(__arm_vldrbq_gather_offset_z_s32): Delete.
(__arm_vldrbq_gather_offset_z_s16): Delete.
(__arm_vldrbq_gather_offset_z_u8): Delete.
(__arm_vldrbq_gather_offset_z_u32): Delete.
(__arm_vldrbq_gather_offset_z_u16): Delete.
(__arm_vldrhq_gather_offset_s32): Delete.
(__arm_vldrhq_gather_offset_s16): Delete.
(__arm_vldrhq_gather_offset_u32): Delete.
(__arm_vldrhq_gather_offset_u16): Delete.
(__arm_vldrhq_gather_offset_z_s32): Delete.
(__arm_vldrhq_gather_offset_z_s16): Delete.
(__arm_vldrhq_gather_offset_z_u32): Delete.
(__arm_vldrhq_gather_offset_z_u16): Delete.
(__arm_vldrdq_gather_offset_s64): Delete.
(__arm_vldrdq_gather_offset_u64): Delete.
(__arm_vldrdq_gather_offset_z_s64): Delete.
(__arm_vldrdq_gather_offset_z_u64): Delete.
(__arm_vldrwq_gather_offset_s32): Delete.
(__arm_vldrwq_gather_offset_u32): Delete.
(__arm_vldrwq_gather_offset_z_s32): Delete.
(__arm_vldrwq_gather_offset_z_u32): Delete.
(__arm_vldrhq_gather_offset_f16): Delete.
(__arm_vldrhq_gather_offset_z_f16): Delete.
(__arm_vldrwq_gather_offset_f32): Delete.
(__arm_vldrwq_gather_offset_z_f32): Delete.
(__arm_vldrbq_gather_offset): Delete.
(__arm_vldrbq_gather_offset_z): Delete.
(__arm_vldrhq_gather_offset): Delete.
(__arm_vldrhq_gather_offset_z): Delete.
(__arm_vldrdq_gather_offset): Delete.
(__arm_vldrdq_gather_offset_z): Delete.
(__arm_vldrwq_gather_offset): Delete.
(__arm_vldrwq_gather_offset_z): Delete.
* config/arm/arm_mve_builtins.def (vldrbq_gather_offset_u)
(vldrbq_gather_offset_s, vldrbq_gather_offset_z_s)
(vldrbq_gather_offset_z_u, vldrhq_gather_offset_z_u)
(vldrhq_gather_offset_u, vldrhq_gather_offset_z_s)
(vldrhq_gather_offset_s, vldrdq_gather_offset_s)
(vldrhq_gather_offset_f, vldrwq_gather_offset_f)
(vldrwq_gather_offset_s, vldrdq_gather_offset_z_s)
(vldrhq_gather_offset_z_f, vldrwq_gather_offset_z_f)
(vldrwq_gather_offset_z_s, vldrdq_gather_offset_u)
(vldrwq_gather_offset_u, vldrdq_gather_offset_z_u)
(vldrwq_gather_offset_z_u): Delete.
* config/arm/iterators.md (MVE_u_elem): New.
(supf): Remove VLDRBQGO_S, VLDRBQGO_U, VLDRHQGO_S, VLDRHQGO_U,
VLDRDQGO_S, VLDRDQGO_U, VLDRWQGO_S, VLDRWQGO_U.
(VLDRBGOQ, VLDRHGOQ, VLDRDGOQ, VLDRWGOQ): Delete.
* config/arm/mve.md (mve_vldrbq_gather_offset_<supf><mode>):
Delete.
(mve_vldrbq_gather_offset_z_<supf><mode>): Delete.
(mve_vldrhq_gather_offset_<supf><mode>): Delete.
(mve_vldrhq_gather_offset_z_<supf><mode>): Delete.
(mve_vldrdq_gather_offset_<supf>v2di): Delete.
(mve_vldrdq_gather_offset_z_<supf>v2di): Delete.
(mve_vldrhq_gather_offset_fv8hf): Delete.
(mve_vldrhq_gather_offset_z_fv8hf): Delete.
(mve_vldrwq_gather_offset_fv4sf): Delete.
(mve_vldrwq_gather_offset_<supf>v4si): Delete.
(mve_vldrwq_gather_offset_z_fv4sf): Delete.
(mve_vldrwq_gather_offset_z_<supf>v4si): Delete.
(@mve_vldrq_gather_offset_<mode>): New.
(@mve_vldrq_gather_offset_extend_<mode><US>): New.
(@mve_vldrq_gather_offset_z_<mode>): New.
(@mve_vldrq_gather_offset_z_extend_<mode><US>): New.
* config/arm/unspecs.md (VLDRBQGO_S, VLDRBQGO_U, VLDRHQGO_S)
(VLDRHQGO_U, VLDRDQGO_S, VLDRDQGO_U, VLDRHQGO_F, VLDRWQGO_F)
(VLDRWQGO_S, VLDRWQGO_U): Delete.
(VLDRGOQ, VLDRGOQ_Z, VLDRGOQ_EXT, VLDRGOQ_EXT_Z): New.

7 months agoarm: [MVE intrinsics] add load_ext_gather_offset shape
Christophe Lyon [Thu, 17 Oct 2024 19:35:21 +0000 (19:35 +0000)] 
arm: [MVE intrinsics] add load_ext_gather_offset shape

This patch adds the load_ext_gather_offset shape description.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-shapes.cc (struct load_ext_gather):
New.
(struct load_ext_gather_offset_def): New.
* config/arm/arm-mve-builtins-shapes.h (load_ext_gather_offset):
New.

7 months agoarm: [MVE intrinsics] rework vstr scatter_base_wb
Christophe Lyon [Tue, 15 Oct 2024 14:46:37 +0000 (14:46 +0000)] 
arm: [MVE intrinsics] rework vstr scatter_base_wb

Implement vstr?q_scatter_base_wb using the new MVE builtins framework.

The patch introduces a new 'b' type for signatures, which
represents the type of the 'base' argument of vstr?q_scatter_base_wb.

gcc/ChangeLog:

* config/arm/arm-builtins.cc (arm_strsbwbs_qualifiers)
(arm_strsbwbu_qualifiers, arm_strsbwbs_p_qualifiers)
(arm_strsbwbu_p_qualifiers): Delete.
* config/arm/arm-mve-builtins-base.cc (vstrq_scatter_base_impl):
Add support for MODE_wb.
* config/arm/arm-mve-builtins-shapes.cc (parse_type): Add support
for 'b' type.
(store_scatter_base): Add support for MODE_wb.
* config/arm/arm-mve-builtins.cc
(function_resolver::require_pointer_to_type): New.
* config/arm/arm-mve-builtins.h
(function_resolver::require_pointer_to_type): New.
* config/arm/arm_mve.h (vstrdq_scatter_base_wb): Delete.
(vstrdq_scatter_base_wb_p): Delete.
(vstrwq_scatter_base_wb_p): Delete.
(vstrwq_scatter_base_wb): Delete.
(vstrdq_scatter_base_wb_p_s64): Delete.
(vstrdq_scatter_base_wb_p_u64): Delete.
(vstrdq_scatter_base_wb_s64): Delete.
(vstrdq_scatter_base_wb_u64): Delete.
(vstrwq_scatter_base_wb_p_s32): Delete.
(vstrwq_scatter_base_wb_p_f32): Delete.
(vstrwq_scatter_base_wb_p_u32): Delete.
(vstrwq_scatter_base_wb_s32): Delete.
(vstrwq_scatter_base_wb_u32): Delete.
(vstrwq_scatter_base_wb_f32): Delete.
(__arm_vstrdq_scatter_base_wb_s64): Delete.
(__arm_vstrdq_scatter_base_wb_u64): Delete.
(__arm_vstrdq_scatter_base_wb_p_s64): Delete.
(__arm_vstrdq_scatter_base_wb_p_u64): Delete.
(__arm_vstrwq_scatter_base_wb_p_s32): Delete.
(__arm_vstrwq_scatter_base_wb_p_u32): Delete.
(__arm_vstrwq_scatter_base_wb_s32): Delete.
(__arm_vstrwq_scatter_base_wb_u32): Delete.
(__arm_vstrwq_scatter_base_wb_f32): Delete.
(__arm_vstrwq_scatter_base_wb_p_f32): Delete.
(__arm_vstrdq_scatter_base_wb): Delete.
(__arm_vstrdq_scatter_base_wb_p): Delete.
(__arm_vstrwq_scatter_base_wb_p): Delete.
(__arm_vstrwq_scatter_base_wb): Delete.
* config/arm/arm_mve_builtins.def (vstrwq_scatter_base_wb_u)
(vstrdq_scatter_base_wb_u, vstrwq_scatter_base_wb_p_u)
(vstrdq_scatter_base_wb_p_u, vstrwq_scatter_base_wb_s)
(vstrwq_scatter_base_wb_f, vstrdq_scatter_base_wb_s)
(vstrwq_scatter_base_wb_p_s, vstrwq_scatter_base_wb_p_f)
(vstrdq_scatter_base_wb_p_s): Delete.
* config/arm/iterators.md (supf): Remove VSTRWQSBWB_S,
VSTRWQSBWB_U, VSTRDQSBWB_S, VSTRDQSBWB_U.
(VSTRDSBQ, VSTRWSBWBQ, VSTRDSBWBQ): Delete.
* config/arm/mve.md (mve_vstrwq_scatter_base_wb_<supf>v4si): Delete.
(mve_vstrwq_scatter_base_wb_p_<supf>v4si): Delete.
(mve_vstrwq_scatter_base_wb_fv4sf): Delete.
(mve_vstrwq_scatter_base_wb_p_fv4sf): Delete.
(mve_vstrdq_scatter_base_wb_<supf>v2di): Delete.
(mve_vstrdq_scatter_base_wb_p_<supf>v2di): Delete.
(@mve_vstrq_scatter_base_wb_<mode>): New.
(@mve_vstrq_scatter_base_wb_p_<mode>): New.
* config/arm/unspecs.md (VSTRWQSBWB_S, VSTRWQSBWB_U, VSTRWQSBWB_F)
(VSTRDQSBWB_S, VSTRDQSBWB_U): Delete.
(VSTRSBWBQ, VSTRSBWBQ_P): New.

7 months agoarm: [MVE intrinsics] rework vstr scatter_base
Christophe Lyon [Sun, 13 Oct 2024 21:03:26 +0000 (21:03 +0000)] 
arm: [MVE intrinsics] rework vstr scatter_base

Implement vstr?q_scatter_base using the new MVE builtins framework.

We need to introduce a new iterator (MVE_4) to support the set needed
by vstr?q_scatter_base (V4SI V4SF V2DI).

gcc/ChangeLog:

* config/arm/arm-builtins.cc (arm_strsbs_qualifiers)
(arm_strsbu_qualifiers, arm_strsbs_p_qualifiers)
(arm_strsbu_p_qualifiers): Delete.
* config/arm/arm-mve-builtins-base.cc (class
vstrq_scatter_base_impl): New.
(vstrwq_scatter_base, vstrdq_scatter_base): New.
* config/arm/arm-mve-builtins-base.def (vstrwq_scatter_base)
(vstrdq_scatter_base): New.
* config/arm/arm-mve-builtins-base.h (vstrwq_scatter_base)
(vstrdq_scatter_base): New.
* config/arm/arm_mve.h (vstrwq_scatter_base): Delete.
(vstrwq_scatter_base_p): Delete.
(vstrdq_scatter_base_p): Delete.
(vstrdq_scatter_base): Delete.
(vstrwq_scatter_base_s32): Delete.
(vstrwq_scatter_base_u32): Delete.
(vstrwq_scatter_base_p_s32): Delete.
(vstrwq_scatter_base_p_u32): Delete.
(vstrdq_scatter_base_p_s64): Delete.
(vstrdq_scatter_base_p_u64): Delete.
(vstrdq_scatter_base_s64): Delete.
(vstrdq_scatter_base_u64): Delete.
(vstrwq_scatter_base_f32): Delete.
(vstrwq_scatter_base_p_f32): Delete.
(__arm_vstrwq_scatter_base_s32): Delete.
(__arm_vstrwq_scatter_base_u32): Delete.
(__arm_vstrwq_scatter_base_p_s32): Delete.
(__arm_vstrwq_scatter_base_p_u32): Delete.
(__arm_vstrdq_scatter_base_p_s64): Delete.
(__arm_vstrdq_scatter_base_p_u64): Delete.
(__arm_vstrdq_scatter_base_s64): Delete.
(__arm_vstrdq_scatter_base_u64): Delete.
(__arm_vstrwq_scatter_base_f32): Delete.
(__arm_vstrwq_scatter_base_p_f32): Delete.
(__arm_vstrwq_scatter_base): Delete.
(__arm_vstrwq_scatter_base_p): Delete.
(__arm_vstrdq_scatter_base_p): Delete.
(__arm_vstrdq_scatter_base): Delete.
* config/arm/arm_mve_builtins.def (vstrwq_scatter_base_s)
(vstrwq_scatter_base_u, vstrwq_scatter_base_p_s)
(vstrwq_scatter_base_p_u, vstrdq_scatter_base_s)
(vstrwq_scatter_base_f, vstrdq_scatter_base_p_s)
(vstrwq_scatter_base_p_f, vstrdq_scatter_base_u)
(vstrdq_scatter_base_p_u): Delete.
* config/arm/iterators.md (MVE_4): New.
(supf): Remove VSTRWQSB_S, VSTRWQSB_U.
(VSTRWSBQ): Delete.
* config/arm/mve.md (mve_vstrwq_scatter_base_<supf>v4si): Delete.
(mve_vstrwq_scatter_base_p_<supf>v4si): Delete.
(mve_vstrdq_scatter_base_p_<supf>v2di): Delete.
(mve_vstrdq_scatter_base_<supf>v2di): Delete.
(mve_vstrwq_scatter_base_fv4sf): Delete.
(mve_vstrwq_scatter_base_p_fv4sf): Delete.
(@mve_vstrq_scatter_base_<mode>): New.
(@mve_vstrq_scatter_base_p_<mode>): New.
* config/arm/unspecs.md (VSTRWQSB_S, VSTRWQSB_U, VSTRWQSB_F):
Delete.
(VSTRSBQ, VSTRSBQ_P): New.

7 months agoarm: [MVE intrinsics] Add store_scatter_base shape
Christophe Lyon [Sun, 13 Oct 2024 21:03:12 +0000 (21:03 +0000)] 
arm: [MVE intrinsics] Add store_scatter_base shape

This patch adds the store_scatter_base shape description.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-shapes.cc (store_scatter_base): New.
* config/arm/arm-mve-builtins-shapes.h (store_scatter_base): New.

7 months agoarm: [MVE intrinsics] Check immediate is a multiple in a range
Christophe Lyon [Mon, 4 Nov 2024 21:41:47 +0000 (21:41 +0000)] 
arm: [MVE intrinsics] Check immediate is a multiple in a range

This patch adds support to check that an immediate is a multiple of a
given value in a given range.

This will be used for instance by scatter_base to check that offset is
in +/-4*[0..127].

Unlike require_immediate_range, require_immediate_range_multiple
accepts signed range bounds to handle the above case.

gcc/ChangeLog:

* config/arm/arm-mve-builtins.cc (report_out_of_range_multiple):
New.
(function_checker::require_signed_immediate): New.
(function_checker::require_immediate_range_multiple): New.
* config/arm/arm-mve-builtins.h
(function_checker::require_immediate_range_multiple): New.
(function_checker::require_signed_immediate): New.

7 months agoarm: [MVE intrinsics] rework vstr_scatter_shifted_offset
Christophe Lyon [Fri, 11 Oct 2024 15:37:03 +0000 (15:37 +0000)] 
arm: [MVE intrinsics] rework vstr_scatter_shifted_offset

Implement vstr?q_scatter_shifted_offset intrinsics using the MVE
builtins framework.

We use the same approach as the previous patch, and we now have four
sets of patterns:
- vector scatter stores with shifted offset (non-truncating)
- predicated vector scatter stores with shifted offset (non-truncating)
- truncating vector scatter stores with shifted offset
- predicated truncating vector scatter stores with shifted offset

Note that the truncating patterns do not use an iterator since there
is only one such variant: V4SI to V4HI.

We need to introduce new iterators:
- MVE_VLD_ST_scatter_shifted, same as MVE_VLD_ST_scatter without V16QI
- MVE_scatter_shift to map the mode to the shift amount

gcc/ChangeLog:

* config/arm/arm-builtins.cc (arm_strss_qualifiers)
(arm_strsu_qualifiers, arm_strsu_p_qualifiers)
(arm_strss_p_qualifiers): Delete.
* config/arm/arm-mve-builtins-base.cc (class vstrq_scatter_impl):
Add support for shifted version.
(vstrdq_scatter_shifted, vstrhq_scatter_shifted)
(vstrwq_scatter_shifted): New.
* config/arm/arm-mve-builtins-base.def (vstrhq_scatter_shifted)
(vstrwq_scatter_shifted, vstrdq_scatter_shifted): New.
* config/arm/arm-mve-builtins-base.h (vstrhq_scatter_shifted)
(vstrwq_scatter_shifted, vstrdq_scatter_shifted): New.
* config/arm/arm_mve.h (vstrhq_scatter_shifted_offset): Delete.
(vstrhq_scatter_shifted_offset_p): Delete.
(vstrdq_scatter_shifted_offset_p): Delete.
(vstrdq_scatter_shifted_offset): Delete.
(vstrwq_scatter_shifted_offset_p): Delete.
(vstrwq_scatter_shifted_offset): Delete.
(vstrhq_scatter_shifted_offset_s32): Delete.
(vstrhq_scatter_shifted_offset_s16): Delete.
(vstrhq_scatter_shifted_offset_u32): Delete.
(vstrhq_scatter_shifted_offset_u16): Delete.
(vstrhq_scatter_shifted_offset_p_s32): Delete.
(vstrhq_scatter_shifted_offset_p_s16): Delete.
(vstrhq_scatter_shifted_offset_p_u32): Delete.
(vstrhq_scatter_shifted_offset_p_u16): Delete.
(vstrdq_scatter_shifted_offset_p_s64): Delete.
(vstrdq_scatter_shifted_offset_p_u64): Delete.
(vstrdq_scatter_shifted_offset_s64): Delete.
(vstrdq_scatter_shifted_offset_u64): Delete.
(vstrhq_scatter_shifted_offset_f16): Delete.
(vstrhq_scatter_shifted_offset_p_f16): Delete.
(vstrwq_scatter_shifted_offset_f32): Delete.
(vstrwq_scatter_shifted_offset_p_f32): Delete.
(vstrwq_scatter_shifted_offset_p_s32): Delete.
(vstrwq_scatter_shifted_offset_p_u32): Delete.
(vstrwq_scatter_shifted_offset_s32): Delete.
(vstrwq_scatter_shifted_offset_u32): Delete.
(__arm_vstrhq_scatter_shifted_offset_s32): Delete.
(__arm_vstrhq_scatter_shifted_offset_s16): Delete.
(__arm_vstrhq_scatter_shifted_offset_u32): Delete.
(__arm_vstrhq_scatter_shifted_offset_u16): Delete.
(__arm_vstrhq_scatter_shifted_offset_p_s32): Delete.
(__arm_vstrhq_scatter_shifted_offset_p_s16): Delete.
(__arm_vstrhq_scatter_shifted_offset_p_u32): Delete.
(__arm_vstrhq_scatter_shifted_offset_p_u16): Delete.
(__arm_vstrdq_scatter_shifted_offset_p_s64): Delete.
(__arm_vstrdq_scatter_shifted_offset_p_u64): Delete.
(__arm_vstrdq_scatter_shifted_offset_s64): Delete.
(__arm_vstrdq_scatter_shifted_offset_u64): Delete.
(__arm_vstrwq_scatter_shifted_offset_p_s32): Delete.
(__arm_vstrwq_scatter_shifted_offset_p_u32): Delete.
(__arm_vstrwq_scatter_shifted_offset_s32): Delete.
(__arm_vstrwq_scatter_shifted_offset_u32): Delete.
(__arm_vstrhq_scatter_shifted_offset_f16): Delete.
(__arm_vstrhq_scatter_shifted_offset_p_f16): Delete.
(__arm_vstrwq_scatter_shifted_offset_f32): Delete.
(__arm_vstrwq_scatter_shifted_offset_p_f32): Delete.
(__arm_vstrhq_scatter_shifted_offset): Delete.
(__arm_vstrhq_scatter_shifted_offset_p): Delete.
(__arm_vstrdq_scatter_shifted_offset_p): Delete.
(__arm_vstrdq_scatter_shifted_offset): Delete.
(__arm_vstrwq_scatter_shifted_offset_p): Delete.
(__arm_vstrwq_scatter_shifted_offset): Delete.
* config/arm/arm_mve_builtins.def
(vstrhq_scatter_shifted_offset_p_u)
(vstrhq_scatter_shifted_offset_u)
(vstrhq_scatter_shifted_offset_p_s)
(vstrhq_scatter_shifted_offset_s, vstrdq_scatter_shifted_offset_s)
(vstrhq_scatter_shifted_offset_f, vstrwq_scatter_shifted_offset_f)
(vstrwq_scatter_shifted_offset_s)
(vstrdq_scatter_shifted_offset_p_s)
(vstrhq_scatter_shifted_offset_p_f)
(vstrwq_scatter_shifted_offset_p_f)
(vstrwq_scatter_shifted_offset_p_s)
(vstrdq_scatter_shifted_offset_u, vstrwq_scatter_shifted_offset_u)
(vstrdq_scatter_shifted_offset_p_u)
(vstrwq_scatter_shifted_offset_p_u): Delete.
* config/arm/iterators.md (MVE_VLD_ST_scatter_shifted): New.
(MVE_scatter_shift): New.
(supf): Remove VSTRHQSSO_S, VSTRHQSSO_U, VSTRDQSSO_S, VSTRDQSSO_U,
VSTRWQSSO_U, VSTRWQSSO_S.
(VSTRHSSOQ, VSTRDSSOQ, VSTRWSSOQ): Delete.
* config/arm/mve.md (mve_vstrhq_scatter_shifted_offset_p_<supf><mode>): Delete.
(mve_vstrhq_scatter_shifted_offset_p_<supf><mode>_insn): Delete.
(mve_vstrhq_scatter_shifted_offset_<supf><mode>): Delete.
(mve_vstrhq_scatter_shifted_offset_<supf><mode>_insn): Delete.
(mve_vstrdq_scatter_shifted_offset_p_<supf>v2di): Delete.
(mve_vstrdq_scatter_shifted_offset_p_<supf>v2di_insn): Delete.
(mve_vstrdq_scatter_shifted_offset_<supf>v2di): Delete.
(mve_vstrdq_scatter_shifted_offset_<supf>v2di_insn): Delete.
(mve_vstrhq_scatter_shifted_offset_fv8hf): Delete.
(mve_vstrhq_scatter_shifted_offset_fv8hf_insn): Delete.
(mve_vstrhq_scatter_shifted_offset_p_fv8hf): Delete.
(mve_vstrhq_scatter_shifted_offset_p_fv8hf_insn): Delete.
(mve_vstrwq_scatter_shifted_offset_fv4sf): Delete.
(mve_vstrwq_scatter_shifted_offset_fv4sf_insn): Delete.
(mve_vstrwq_scatter_shifted_offset_p_fv4sf): Delete.
(mve_vstrwq_scatter_shifted_offset_p_fv4sf_insn): Delete.
(mve_vstrwq_scatter_shifted_offset_p_<supf>v4si): Delete.
(mve_vstrwq_scatter_shifted_offset_p_<supf>v4si_insn): Delete.
(mve_vstrwq_scatter_shifted_offset_<supf>v4si): Delete.
(mve_vstrwq_scatter_shifted_offset_<supf>v4si_insn): Delete.
(@mve_vstrq_scatter_shifted_offset_<mode>): New.
(@mve_vstrq_scatter_shifted_offset_p_<mode>): New.
(mve_vstrq_truncate_scatter_shifted_offset_v4si): New.
(mve_vstrq_truncate_scatter_shifted_offset_p_v4si): New.
* config/arm/unspecs.md (VSTRDQSSO_S, VSTRDQSSO_U, VSTRWQSSO_S)
(VSTRWQSSO_U, VSTRHQSSO_F, VSTRWQSSO_F, VSTRHQSSO_S, VSTRHQSSO_U):
Delete.
(VSTRSSOQ, VSTRSSOQ_P, VSTRSSOQ_TRUNC, VSTRSSOQ_TRUNC_P): New.

7 months agoarm: [MVE intrinsics] rework vstr?q_scatter_offset
Christophe Lyon [Tue, 29 Oct 2024 10:31:53 +0000 (10:31 +0000)] 
arm: [MVE intrinsics] rework vstr?q_scatter_offset

This patch implements vstr?q_scatter_offset using the new MVE builtins
framework.

It uses a similar approach to a previous patch which grouped
truncating and non-truncating stores in two sets of patterns, rather
than having groups of patterns depending on the destination size.

We need to add the 'integer_64' types of suffixes in order to support
vstrdq_scatter_offset.

The patch introduces the MVE_VLD_ST_scatter iterator, similar to
MVE_VLD_ST but which also includes V2DI (again, for
vstrdq_scatter_offset).

The new MVE_scatter_offset mode attribute is used to map the
destination type to the offset type (both are usually equal, except
when the destination is floating-point).

We end up with four sets of patterns:
- vector scatter stores with offset (non-truncating)
- predicated vector scatter stores with offset (non-truncating)
- truncating vector scatter stores with offset
- predicated truncating vector scatter stores with offset

gcc/ChangeLog:

* config/arm/arm-mve-builtins-base.cc (class vstrq_scatter_impl):
New.
(vstrbq_scatter, vstrhq_scatter, vstrwq_scatter, vstrdq_scatter):
New.
* config/arm/arm-mve-builtins-base.def (vstrbq_scatter)
(vstrhq_scatter, vstrwq_scatter, vstrdq_scatter): New.
* config/arm/arm-mve-builtins-base.h (vstrbq_scatter)
(vstrhq_scatter, vstrwq_scatter, vstrdq_scatter): New.
* config/arm/arm-mve-builtins.cc (integer_64): New.
* config/arm/arm_mve.h (vstrbq_scatter_offset): Delete.
(vstrbq_scatter_offset_p): Delete.
(vstrhq_scatter_offset): Delete.
(vstrhq_scatter_offset_p): Delete.
(vstrdq_scatter_offset_p): Delete.
(vstrdq_scatter_offset): Delete.
(vstrwq_scatter_offset_p): Delete.
(vstrwq_scatter_offset): Delete.
(vstrbq_scatter_offset_s8): Delete.
(vstrbq_scatter_offset_u8): Delete.
(vstrbq_scatter_offset_u16): Delete.
(vstrbq_scatter_offset_s16): Delete.
(vstrbq_scatter_offset_u32): Delete.
(vstrbq_scatter_offset_s32): Delete.
(vstrbq_scatter_offset_p_s8): Delete.
(vstrbq_scatter_offset_p_s32): Delete.
(vstrbq_scatter_offset_p_s16): Delete.
(vstrbq_scatter_offset_p_u8): Delete.
(vstrbq_scatter_offset_p_u32): Delete.
(vstrbq_scatter_offset_p_u16): Delete.
(vstrhq_scatter_offset_s32): Delete.
(vstrhq_scatter_offset_s16): Delete.
(vstrhq_scatter_offset_u32): Delete.
(vstrhq_scatter_offset_u16): Delete.
(vstrhq_scatter_offset_p_s32): Delete.
(vstrhq_scatter_offset_p_s16): Delete.
(vstrhq_scatter_offset_p_u32): Delete.
(vstrhq_scatter_offset_p_u16): Delete.
(vstrdq_scatter_offset_p_s64): Delete.
(vstrdq_scatter_offset_p_u64): Delete.
(vstrdq_scatter_offset_s64): Delete.
(vstrdq_scatter_offset_u64): Delete.
(vstrhq_scatter_offset_f16): Delete.
(vstrhq_scatter_offset_p_f16): Delete.
(vstrwq_scatter_offset_f32): Delete.
(vstrwq_scatter_offset_p_f32): Delete.
(vstrwq_scatter_offset_p_s32): Delete.
(vstrwq_scatter_offset_p_u32): Delete.
(vstrwq_scatter_offset_s32): Delete.
(vstrwq_scatter_offset_u32): Delete.
(__arm_vstrbq_scatter_offset_s8): Delete.
(__arm_vstrbq_scatter_offset_s32): Delete.
(__arm_vstrbq_scatter_offset_s16): Delete.
(__arm_vstrbq_scatter_offset_u8): Delete.
(__arm_vstrbq_scatter_offset_u32): Delete.
(__arm_vstrbq_scatter_offset_u16): Delete.
(__arm_vstrbq_scatter_offset_p_s8): Delete.
(__arm_vstrbq_scatter_offset_p_s32): Delete.
(__arm_vstrbq_scatter_offset_p_s16): Delete.
(__arm_vstrbq_scatter_offset_p_u8): Delete.
(__arm_vstrbq_scatter_offset_p_u32): Delete.
(__arm_vstrbq_scatter_offset_p_u16): Delete.
(__arm_vstrhq_scatter_offset_s32): Delete.
(__arm_vstrhq_scatter_offset_s16): Delete.
(__arm_vstrhq_scatter_offset_u32): Delete.
(__arm_vstrhq_scatter_offset_u16): Delete.
(__arm_vstrhq_scatter_offset_p_s32): Delete.
(__arm_vstrhq_scatter_offset_p_s16): Delete.
(__arm_vstrhq_scatter_offset_p_u32): Delete.
(__arm_vstrhq_scatter_offset_p_u16): Delete.
(__arm_vstrdq_scatter_offset_p_s64): Delete.
(__arm_vstrdq_scatter_offset_p_u64): Delete.
(__arm_vstrdq_scatter_offset_s64): Delete.
(__arm_vstrdq_scatter_offset_u64): Delete.
(__arm_vstrwq_scatter_offset_p_s32): Delete.
(__arm_vstrwq_scatter_offset_p_u32): Delete.
(__arm_vstrwq_scatter_offset_s32): Delete.
(__arm_vstrwq_scatter_offset_u32): Delete.
(__arm_vstrhq_scatter_offset_f16): Delete.
(__arm_vstrhq_scatter_offset_p_f16): Delete.
(__arm_vstrwq_scatter_offset_f32): Delete.
(__arm_vstrwq_scatter_offset_p_f32): Delete.
(__arm_vstrbq_scatter_offset): Delete.
(__arm_vstrbq_scatter_offset_p): Delete.
(__arm_vstrhq_scatter_offset): Delete.
(__arm_vstrhq_scatter_offset_p): Delete.
(__arm_vstrdq_scatter_offset_p): Delete.
(__arm_vstrdq_scatter_offset): Delete.
(__arm_vstrwq_scatter_offset_p): Delete.
(__arm_vstrwq_scatter_offset): Delete.
* config/arm/arm_mve_builtins.def (vstrbq_scatter_offset_s)
(vstrbq_scatter_offset_u, vstrbq_scatter_offset_p_s)
(vstrbq_scatter_offset_p_u, vstrhq_scatter_offset_p_u)
(vstrhq_scatter_offset_u, vstrhq_scatter_offset_p_s)
(vstrhq_scatter_offset_s, vstrdq_scatter_offset_s)
(vstrhq_scatter_offset_f, vstrwq_scatter_offset_f)
(vstrwq_scatter_offset_s, vstrdq_scatter_offset_p_s)
(vstrhq_scatter_offset_p_f, vstrwq_scatter_offset_p_f)
(vstrwq_scatter_offset_p_s, vstrdq_scatter_offset_u)
(vstrwq_scatter_offset_u, vstrdq_scatter_offset_p_u)
(vstrwq_scatter_offset_p_u) Delete.
* config/arm/iterators.md (MVE_VLD_ST_scatter): New.
(MVE_scatter_offset): New.
(MVE_elem_ch): Add entry for V2DI.
(supf): Remove VSTRBQSO_S, VSTRBQSO_U, VSTRHQSO_S, VSTRHQSO_U,
VSTRDQSO_S, VSTRDQSO_U, VSTRWQSO_U, VSTRWQSO_S.
(VSTRBSOQ, VSTRHSOQ, VSTRDSOQ, VSTRWSOQ): Delete.
* config/arm/mve.md (mve_vstrbq_scatter_offset_<supf><mode>):
Delete.
(mve_vstrbq_scatter_offset_<supf><mode>_insn): Delete.
(mve_vstrbq_scatter_offset_p_<supf><mode>): Delete.
(mve_vstrbq_scatter_offset_p_<supf><mode>_insn): Delete.
(mve_vstrhq_scatter_offset_p_<supf><mode>): Delete.
(mve_vstrhq_scatter_offset_p_<supf><mode>_insn): Delete.
(mve_vstrhq_scatter_offset_<supf><mode>): Delete.
(mve_vstrhq_scatter_offset_<supf><mode>_insn): Delete.
(mve_vstrdq_scatter_offset_p_<supf>v2di): Delete.
(mve_vstrdq_scatter_offset_p_<supf>v2di_insn): Delete.
(mve_vstrdq_scatter_offset_<supf>v2di): Delete.
(mve_vstrdq_scatter_offset_<supf>v2di_insn): Delete.
(mve_vstrhq_scatter_offset_fv8hf): Delete.
(mve_vstrhq_scatter_offset_fv8hf_insn): Delete.
(mve_vstrhq_scatter_offset_p_fv8hf): Delete.
(mve_vstrhq_scatter_offset_p_fv8hf_insn): Delete.
(mve_vstrwq_scatter_offset_fv4sf): Delete.
(mve_vstrwq_scatter_offset_fv4sf_insn): Delete.
(mve_vstrwq_scatter_offset_p_fv4sf): Delete.
(mve_vstrwq_scatter_offset_p_fv4sf_insn): Delete.
(mve_vstrwq_scatter_offset_p_<supf>v4si): Delete.
(mve_vstrwq_scatter_offset_p_<supf>v4si_insn): Delete.
(mve_vstrwq_scatter_offset_<supf>v4si): Delete.
(mve_vstrwq_scatter_offset_<supf>v4si_insn): Delete.
(@mve_vstrq_scatter_offset_<mode>): New.
(@mve_vstrq_scatter_offset_p_<mode>): New.
(@mve_vstrq_truncate_scatter_offset_<mode>): New.
(@mve_vstrq_truncate_scatter_offset_p_<mode>): New.
* config/arm/unspecs.md (VSTRBQSO_S, VSTRBQSO_U, VSTRHQSO_S)
(VSTRDQSO_S, VSTRDQSO_U, VSTRWQSO_S, VSTRWQSO_U, VSTRHQSO_F)
(VSTRWQSO_F, VSTRHQSO_U): Delete.
(VSTRQSO, VSTRQSO_P, VSTRQSO_TRUNC, VSTRQSO_TRUNC_P): New.

7 months agoarm: [MVE intrinsics] add store_scatter_offset shape
Christophe Lyon [Thu, 10 Oct 2024 10:48:42 +0000 (10:48 +0000)] 
arm: [MVE intrinsics] add store_scatter_offset shape

This patch adds the store_scatter_offset shape and uses a new helper
class (store_scatter), which will also be used by later patches.

gcc/ChangeLog:

* config/arm/arm-mve-builtins-shapes.cc (struct store_scatter): New.
(struct store_scatter_offset_def): New.
* config/arm/arm-mve-builtins-shapes.h (store_scatter_offset): New.

7 months agoarm: [MVE intrinsics] add mode_after_pred helper in function_shape
Christophe Lyon [Thu, 10 Oct 2024 16:35:23 +0000 (16:35 +0000)] 
arm: [MVE intrinsics] add mode_after_pred helper in function_shape

This new helper returns true if the mode suffix goes after the
predicate suffix.  This is true in most cases, so the base
implementations in nonoverloaded_base and overloaded_base return true.
For instance: vaddq_m_n_s32.

This will be useful in later patches to implement
vstr?q_scatter_offset_p (_p appears after _offset).

gcc/ChangeLog:

* config/arm/arm-mve-builtins-shapes.cc (struct
nonoverloaded_base): Implement mode_after_pred.
(struct overloaded_base): Likewise.
* config/arm/arm-mve-builtins.cc (function_builder::get_name):
Call mode_after_pred as needed.
* config/arm/arm-mve-builtins.h (function_shape): Add
mode_after_pred.

7 months agoC++: reject OpenMP directives in constexpr functions
Tobias Burnus [Fri, 13 Dec 2024 13:27:08 +0000 (14:27 +0100)] 
C++: reject OpenMP directives in constexpr functions

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_construct, cp_parser_pragma): Reject
OpenMP expressions in constexpr functions.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/pr108607.C: Update dg-error.
* g++.dg/gomp/pr79664.C: Update dg-error.
* g++.dg/gomp/omp-constexpr.C: New test.

7 months agogenrecog: Split into separate partitions [PR111600].
Robin Dapp [Tue, 26 Nov 2024 13:44:17 +0000 (14:44 +0100)] 
genrecog: Split into separate partitions [PR111600].

Hi,

this patch makes genrecog split its output into separate files (10 by
default) in the same vein genemit does.  The changes are mostly
mechanical again, changing printfs and puts to fprintf.
As insn-recog.cc relies on being able to call other recog functions a
header insn-recog.h is introduced that pre declares all of those.

For simplicity the number of files is determined by (re-using)
--with-insnemit-partitions.  Naming suggestions welcome :)

Bootstrapped and regtested on x86 and power10, regtested on riscv.
aarch64 bootstrap is currently blocked because of the
"maybe uninitialized" issue discussed on IRC.

Regards
 Robin

PR target/111600

gcc/ChangeLog:

* Makefile.in:  Add insn-recog split.
* configure: Regenerate.
* configure.ac: Document that the number of insnemit partitions is
used for insn-recog as well.
* genconditions.cc (write_one_condition): Use fprintf.
* genpreds.cc (write_predicate_expr): Ditto.
(write_init_reg_class_start_regs): Ditto.
* genrecog.cc (write_header): Add header file to includes.
(printf_indent): Use fprintf.
(change_state): Ditto.
(print_code): Ditto.
(print_host_wide_int): Ditto.
(print_parameter_value): Ditto.
(print_test_rtx): Ditto.
(print_nonbool_test): Ditto.
(print_label_value): Ditto.
(print_test): Ditto.
(print_decision): Ditto.
(print_state): Ditto.
(print_subroutine_call): Ditto.
(print_acceptance): Ditto.
(print_subroutine_start): Ditto.
(print_pattern): Ditto.
(print_subroutine): Ditto.
(print_subroutine_group): Ditto.
(handle_arg): Add -O and -H for output and header file handling.
(main): Use callback.
* gentarget-def.cc (def_target_insn): Use fprintf.
* read-md.cc (md_reader::print_c_condition): Ditto.
* read-md.h (class md_reader): Ditto.

7 months agolibstdc++: Fix uninitialized data in std::basic_spanbuf::seekoff
Jonathan Wakely [Fri, 13 Dec 2024 10:54:29 +0000 (10:54 +0000)] 
libstdc++: Fix uninitialized data in std::basic_spanbuf::seekoff

I noticed a -Wmaybe-uninitialized warning for this function, which turns
out to be correct. If the caller passes a valid std::ios_base::seekdir
value then there's no problem, but if they pass std::seekdir(999) then
we don't initialize the __base variable before adding it to __off.

Rather than initialize it to an arbitrary value, we should return an
error.

Also add [[unlikely]] attributes to the paths that return an error.

libstdc++-v3/ChangeLog:

* include/std/spanstream (basic_spanbuf::seekoff): Return an
error for invalid seekdir values.

7 months agolibstdc++: Swap expressions in noexcept-specifier of ranges::not_equal_to
Jonathan Wakely [Thu, 12 Dec 2024 23:24:39 +0000 (23:24 +0000)] 
libstdc++: Swap expressions in noexcept-specifier of ranges::not_equal_to

Although this should never make a difference for sensible code, we
should really make the expression in the noexcept-specifier match the
expression in the function body.

libstdc++-v3/ChangeLog:

* include/bits/ranges_cmp.h (not_equal_to): Make order of
expressions in noexcept-specifier match the body.
* testsuite/20_util/function_objects/range.cmp/not_equal_to.cc:
Check noexcept.

7 months agolibstdc++: Fix -Wsign-compare warning in <regex>
Jonathan Wakely [Fri, 13 Dec 2024 11:59:08 +0000 (11:59 +0000)] 
libstdc++: Fix -Wsign-compare warning in <regex>

libstdc++-v3/ChangeLog:

* include/bits/regex.tcc: Fix -Wsign-compare warning.

7 months agolibstdc++: Fix -Wreorder warning in <pstl/parallel_backend_tbb.h>
Jonathan Wakely [Fri, 13 Dec 2024 11:00:19 +0000 (11:00 +0000)] 
libstdc++: Fix -Wreorder warning in <pstl/parallel_backend_tbb.h>

libstdc++-v3/ChangeLog:

* include/pstl/parallel_backend_tbb.h (__merge_func): Fix order
of mem-initializers.