git.ipfire.org Git - thirdparty/gcc.git/log

ada: Remove unused parameter from volatile type queries

Routines Is_Effectively_Volatile and Is_Effectively_Volatile_For_Reading
were always called with the Ignore_Protected parameter set to True (or with
it passed unmodified on recursive calls), so this parameter wasn't
actually needed.

Code cleanup; semantics is unaffected.

gcc/ada/ChangeLog:

* sem_util.adb (Is_Effectively_Volatile,
Is_Effectively_Volatile_For_Reading): Remove Ignore_Protected
parameter.
(Is_Effectively_Volatile_Object,
Is_Effectively_Volatile_Object_For_Reading): Remove
single-parameter wrappers that were needed to instantiate
generic subprogram.
* sem_util.ads (Is_Effectively_Volatile,
Is_Effectively_Volatile_For_Reading): Remove parameter; adjust
comment.

ada: Elide copy for calls in allocators for nonlimited by-reference types

This prevents a temporary from being created on the primary stack to hold
the result of the function calls before it is copied to the newly allocated
memory in the nonlimited by-reference case.

That's already not done in the nonlimited non-by-reference case and there is
no reason to do it in the former case either. The main issue is the call to
Remove_Side_Effects in Expand_Allocator_Expression, but its only purpose is
to cover the problematic processing done in Build_Allocate_Deallocate_Proc
on (part of) the expression; once this is fixed, the call is unnecessary.

The change also contains another small fix to deal with the corner case of
allocators for access-to-access types.

gcc/ada/ChangeLog:

* exp_ch4.adb (Expand_Allocator_Expression): Do not preventively
call Remove_Side_Effects on the expression in the nonlimited
by-reference case. Always call Build_Allocate_Deallocate_Proc
in the default case.
* exp_ch6.adb (Expand_Ctrl_Function_Call): Bail out if the call
is the qualified expression of an allocator.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Replace all the
calls to Relocate_Node by calls to Duplicate_Subexpr_No_Checks.

ada: Remove last call to Preanalyze_And_Resolve from Exp_Aggr

All the expressions are now at least preanalyzed in a non-iterated context,
so we do not need to redo it in Aggr_Assignment_OK_For_Backend, given that
Is_OK_Aggregate explicitly rejects iterated component associations.

gcc/ada/ChangeLog:

* exp_aggr.adb (Aggr_Assignment_OK_For_Backend): Do not call
Preanalyze_And_Resolve on the expression again.

ada: Fix breakage of GNATprove introduced by latest change

gcc/ada/ChangeLog:

* sem_aggr.adb (Resolve_Aggr_Expr): Always perform a full analysis
of the expression in SPARK mode.

ada: Fix typo in reference manual

gcc/ada/ChangeLog:

* doc/gnat_rm/gnat_language_extensions.rst: Fix typo.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.

ada: Fix dangling reference with user-defined indexing of function call

This happens with a noncontrolled type because the user-defined indexing is
expanded into a function call that binds the lifetime of the original call
to its return value. The temporary must be created explicitly in this case,
so that the front-end can control its lifetime.

gcc/ada/ChangeLog:

* exp_ch6.adb (Expand_Call_Helper): Also create a temporary in the
case of a noncontrolled user-defined indexing.

ada: Fix documentation of Ada.Real_Time.Timing_Events

The GNAT reference manual stated that GNAT did not implement this
language-defined package, but GNAT in fact does offer an implementation
of it.

gcc/ada/ChangeLog:

* doc/gnat_rm/standard_library_routines.rst: Fix documentation.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.

ada: Exclude library units from gnatcov instrumentation

Before this patch, we instrumented code that's only used during the
build process to generate more code. This patch marks the
code-generating code so it's not instrumented for coverage.

gcc/ada/ChangeLog:

* gnat2.gpr: Add library units to coverage exclusion list.

ada: Further work in semantic analysis of iterated component associations

This finishes up the transition to preanalysis of a copy of the expression
for iterated component associations in all contexts, thus voiding the need
to clean things up afterward.

However, this requires a larger cleanup in semantic analysis of aggregates,
in particular for others choices, which are currently skipped in Sem_Aggr,
with Exp_Aggr trying to patch things up afterward but leaving some legality
loopholes in the end.  That's why this makes sure that all the expressions
appearing in aggregates are either analyzed or preanalyzed by Sem_Aggr, as
documented in the spec of Sem, modulo the copy in an iteration context.

gcc/ada/ChangeLog:

* exp_aggr.adb (Build_Array_Aggr_Code): Remove obsolete comment.
(Convert_To_Positional): Remove Ctyp local variable.
(Is_Static_Element): Remove Dims parameter and do not preanalyze the
expression there.
(Expand_Array_Aggregate): Make Ctyp a constant.
(Compute_Others_Present): Do not preanalyze the expression there.
* sem_aggr.adb (Resolve_Array_Aggregate): New Ctyp constant.  Use it
throughout the procedure to denote the component type.
(Resolve_Aggr_Expr): Always preanalyze a copy of the expression in
an iteration context.  Preanalyze it directly when the expander is
active and the choice may cover multiple components.  Otherwise,
fully analyze it.
Do not reanalyze an iterated component association with an others
choice either when there are positional components.
(Resolve_Iterated_Component_Association): Do not remove references
from the expression after invoking Resolve_Aggr_Expr on it.

ada: Remove implicit assumption in the double case

The assumption is fulfilled in all the instantiations of the package, but
it should not be made in the generic code.

gcc/ada/ChangeLog:

* libgnat/s-imager.adb (Set_Image_Real): In the case where a double
integer is needed, do not implicitly assume that it can contain up to
'Digits of the floating-point type.

ada: Adjust cut-off for scaling of floating-point numbers

The value needs to take into account denormals and encompass Maxdigs.

gcc/ada/ChangeLog:

* libgnat/s-imager.adb (Maxscaling): Change to Natural constant and
add Maxdigs to value.

Fix -fstrict-flex-arrays documentation, again [PR111659]

My previous attempt to fix this issue ended up garbling the text
instead. Trying again to make the descriptions of the attribute and
command-line option consistent.

gcc/ChangeLog
PR middle-end/111659
* doc/extend.texi (Common Variable Attributes): Copy-edit description
of the strict_flex_array attribute levels.
* doc/invoke.texi (C Dialect Options): Swap documented behavior for
levels 0 and 3. Copy the description for the other levels from the
attribute instead of indirecting to it.
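
For readers who want to see what the levels refer to, here is a minimal
sketch in C. The struct names, the STRICT_FLEX macro and msg_new are
hypothetical and not part of this change. As I understand the documented
levels, 0 treats every trailing array as a flexible array member, 1 only
[], [0] and [1], 2 only [] and [0], and 3 only the C99 form [].

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Use the attribute only where the compiler knows it; elsewhere the
   hypothetical STRICT_FLEX macro expands to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(strict_flex_array)
#  define STRICT_FLEX(level) __attribute__((strict_flex_array(level)))
# endif
#endif
#ifndef STRICT_FLEX
# define STRICT_FLEX(level)
#endif

/* C99 flexible array member: treated as one at every level.  */
struct msg_c99 { size_t len; char data[]; };

/* Trailing [1] array: a flexible array member at levels 0 and 1 only.  */
struct msg_one { size_t len; char data[1] STRICT_FLEX(1); };

/* Trailing [4] array: a flexible array member at level 0 only.  */
struct msg_four { size_t len; char data[4] STRICT_FLEX(0); };

/* Allocate a message whose flexible array holds a copy of TEXT,
   including the terminating NUL.  Returns NULL on allocation failure.  */
struct msg_c99 *msg_new (const char *text)
{
  size_t len = strlen (text);
  struct msg_c99 *m = malloc (sizeof *m + len + 1);
  if (m == NULL)
    return NULL;
  m->len = len;
  memcpy (m->data, text, len + 1);
  return m;
}
```

The attribute is attached to the trailing array field itself, so one
structure can opt into a looser or stricter level than the one chosen
with -fstrict-flex-arrays for the rest of the translation unit.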

Daily bump.

libstdc++: Fix some -Wsign-compare warnings in the testsuite

libstdc++-v3/ChangeLog:

* testsuite/23_containers/unordered_map/modifiers/reserve.cc:
Cast to size_t to fix -Wsign-compare warning.
* testsuite/23_containers/unordered_set/hash_policy/71181.cc:
Likewise.
* testsuite/23_containers/unordered_set/insert/move_range.cc:
Likewise.
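
As an illustration of the kind of fix involved (written in C rather than
the C++ of the testsuite), the sketch below compares a signed count with
an unsigned size. The function name sizes_match is hypothetical, and the
explicit negative check is an addition of this sketch; the tests
themselves only needed the cast.

```c
#include <stddef.h>

/* Return nonzero when the signed count EXPECTED equals the unsigned
   size ACTUAL.  Writing `expected == actual` directly compares int with
   size_t and triggers -Wsign-compare; rejecting negative values and then
   casting makes the conversion explicit.  */
int sizes_match (int expected, size_t actual)
{
  return expected >= 0 && (size_t) expected == actual;
}
```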

libstdc++: Fix -Wsign-compare warnings in bits/hashtable_policy.h

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h (_Local_iterator_base): Fix
-Wsign-compare warnings.

libstdc++: Fix typo in comment in src/c++17/fs_dir.cc

libstdc++-v3/ChangeLog:

* src/c++17/fs_dir.cc: Fix typo in comment.

hppa: Remove extra clobber from divsi3, udivsi3, modsi3 and umodsi3 patterns

The $$divI, $$divU, $$remI and $$remU millicode calls clobber r1,
r26, r25 and the return link register (r31 or r2). We don't need
to clobber any other registers.

2024-12-12 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.cc (pa_emit_hpdiv_const): Clobber r1, r26,
r25 and return register.
* config/pa/pa.md (divsi3): Revise clobbers and operands.
Remove second clobber from div:SI insns.
(udivsi3, modsi3, umodsi3): Likewise.

Regenerate attr-urls.def.

I noticed there is this new generated file that needs to be updated by
"make regenerate-attr-urls" similarly to "make regenerate-opt-urls", but
nobody had done that recently as the buildbot does not nag about it yet.

gcc/ChangeLog

* attr-urls.def: Regenerate.

Clean up documentation of -Wsuggest-attribute= [PR115532]

The list of -Wsuggest-attribute= variants was out of date in the option
summary (and getting too long to fit on one line), and an index entry was
missing for -Wsuggest-attribute=returns_nonnull.

gcc/c-family/ChangeLog
PR c/115532
* c.opt.urls: Regenerated.

gcc/ChangeLog
PR c/115532
* common.opt.urls: Regenerated.
* doc/invoke.texi (Option Summary): Don't try to list all the
-Wsuggest-attribute= variants inline here.
(Warning Options): Likewise. Add @opindex for
Wsuggest-attribute=returns_nonnull and its no- form. Remove
@itemx for no- form.

Co-Authored-By: Peter Eisentraut <peter@eisentraut.org>

match.pd: Defer some CTZ/CLZ foldings until after ubsan pass for -fsanitize=builtin [PR115127]

As the following testcase shows, -fsanitize=builtin instruments the
builtins in the ubsan pass which is done shortly after going into
SSA, but if optimizations optimize the builtins away before that,
nothing is instrumented. Now, I think it is just fine if the
result of the builtins isn't used in any way and we just DCE them,
but in the following optimizations the result is used.
So, the following patch for -fsanitize=builtin only defers the
optimizations that might turn single argument CLZ/CTZ (aka undefined
at zero) until the ubsan pass is done.
Now, we don't have PROP_ubsan and I am not sure it is worth adding it,
there is PROP_ssa set by the ssa pass which is 3 passes before
ubsan, but there are only 2 warning passes in between, so PROP_ssa
looked good enough to me.

2024-12-12 Jakub Jelinek <jakub@redhat.com>

PR sanitizer/115127
* match.pd (clz (X) == C, ctz (X) == C, ctz (X) >= C): Don't
optimize if -fsanitize=builtin and not yet in SSA form.

* c-c++-common/ubsan/builtin-2.c: New test.
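
A minimal sketch of the shapes named in the ChangeLog entry, assuming an
unsigned int without padding bits. The function names are hypothetical
and this is not the new test. In each function the builtin's result is
used, yet the comparison could be rewritten without the builtin (the
first one, for example, as a test of x against 1), which would leave
nothing for -fsanitize=builtin to instrument when x is zero.

```c
#include <limits.h>

#define UINT_BITS ((int) (sizeof (unsigned int) * CHAR_BIT))

/* clz (X) == C shape: true only for x == 1 among nonzero inputs.
   __builtin_clz (0) is undefined, so callers must not pass zero.  */
int clz_is_max (unsigned int x)
{
  return __builtin_clz (x) == UINT_BITS - 1;
}

/* ctz (X) >= C shape: true when the low four bits of a nonzero x are
   all clear.  __builtin_ctz (0) is undefined as well.  */
int ctz_at_least_4 (unsigned int x)
{
  return __builtin_ctz (x) >= 4;
}
```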

OpenMP: Enable has_device_addr clause for 'dispatch' in C/C++

The 'has_device_addr' of 'dispatch' has to be seen in conjunction with the
'need_device_addr' modifier to the 'adjust_args' clause of 'declare variant'.
As the latter has not yet been implemented, 'has_device_addr' has no real
effect. However, to prepare for 'need_device_addr' and as service to the user:

For C, where 'need_device_addr' is not permitted (contrary to C++ and Fortran),
a note is output when the user tries to use it (alongside the existing
error that either 'nothing' or 'need_device_ptr' was expected).

And, on the ME side, it is lightly handled by diagnosing when - for the
same argument - there is a mismatch between the variant's adjust_args
'need_device_ptr' modifier and dispatch having a 'has_device_addr' clause
(or likewise for need_device_addr with is_device_ptr) as, according to the
spec, those are completely separate.
Thus, 'dispatch' will still do the host to device pointer conversion for
a 'need_device_ptr' argument, even if it appeared in a 'has_device_addr'
clause.

gcc/c/ChangeLog:

* c-parser.cc (OMP_DISPATCH_CLAUSE_MASK): Add has_device_addr clause.
(c_finish_omp_declare_variant): Add an 'inform' telling the user that
'need_device_addr' is invalid for C.

gcc/cp/ChangeLog:

* parser.cc (OMP_DISPATCH_CLAUSE_MASK): Add has_device_addr clause.

gcc/ChangeLog:

* gimplify.cc (gimplify_call_expr): When handling OpenMP's dispatch,
add diagnostic when there is a ptr vs. addr mismatch between
need_device_{addr,ptr} and {is,has}_device_{ptr,addr}, respectively.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/adjust-args-3.c: New test.
* gcc.dg/gomp/adjust-args-2.c: New test.

Fortran: Fix testsuite regressions after r15-5083 [PR117797]

2024-12-12 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/117797
* trans-array.cc (class_array_element_size): New function.
(gfc_get_array_span): Refactor, using class_array_element_size
to return the span for descriptors that are the _data component
of a class expression and then class dummy references. Revert
the conditions to those before r15-5083 tidying up using 'sym'.

gcc/testsuite/
PR fortran/117797
* gfortran.dg/pr117797.f90: New test.

Fix precondition failure with Ada.Numerics.Generic_Real_Arrays.Eigenvalues

This fixes a precondition failure triggered when the Eigenvalues routine
of Ada.Numerics.Generic_Real_Arrays is instantiated with -gnata, because
it calls Sort_Eigensystem on an empty vector.

gcc/ada
PR ada/117996
* libgnat/a-ngrear.adb (Jacobi): Remove default value for
Compute_Vectors formal parameter.
(Sort_Eigensystem): Add Compute_Vectors formal parameter. Do not
modify the Vectors if Compute_Vectors is False.
(Eigensystem): Pass True as Compute_Vectors to Sort_Eigensystem.
(Eigenvalues): Pass False as Compute_Vectors to Sort_Eigensystem.

gcc/testsuite
* gnat.dg/matrix1.adb: New test.

AVR: target/118000 - Fix copymem from address-spaces.

* rampz_rtx et al. were missing MEM_VOLATILE_P. This is needed because
avr_emit_cpymemhi is setting RAMPZ explicitly with an own insn.

* avr_out_cpymem was missing a final RAMPZ = 0 on EBI devices.

This only affects the __flash1 ... __flash5 spaces since the other ASes
use different routines.

gcc/
PR target/118000
* config/avr/avr.cc (avr_init_expanders) <sreg_rtx>
<rampd_rtx, rampx_rtx, rampy_rtx, rampz_rtx>: Set MEM_VOLATILE_P.
(avr_out_cpymem) [ELPM && EBI]: Restore RAMPZ to 0 after.

ifcombine field-merge: set upper bound for get_best_mode

A bootstrap on aarch64-linux-gnu revealed that sometimes (for example,
when building shorten_branches in final.cc) we will find such things
as MEM <unsigned int>, where unsigned int happens to be a variant of
the original unsigned int type, but with 64-bit alignment. This
unusual alignment circumstance caused (i) get_inner_reference to not
look inside the MEM, (ii) get_best_mode to choose DImode instead of
SImode to access the object, so we built a BIT_FIELD_REF that
attempted to select all 64 bits of a 32-bit object, and that failed
gimple verification ("position plus size exceeds size of referenced
object") because there aren't that many bits in the unsigned int
object.

This patch avoids this failure mode by limiting the bitfield range
with the size of the inner object, if it is a known constant.

This enables us to avoid creating a BIT_FIELD_REF and reusing the load
expr, but we still introduced a separate load, that would presumably
get optimized out, but that is easy enough to avoid in the first place
by reusing the SSA_NAME it was originally loaded into, so I
implemented that in make_bit_field_load.
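
To make the unusual circumstance concrete, here is a small C sketch,
assuming a 32-bit unsigned int. The typedef u32_align8 and the helper
capped_access_bits are hypothetical; the helper only mirrors the idea of
the cap and is not the code in gimple-fold.cc.

```c
/* Hypothetical variant of unsigned int with 64-bit alignment.  The
   alignment grows to 8 bytes but the size stays that of unsigned int.  */
typedef unsigned int u32_align8 __attribute__ ((aligned (8)));

/* Cap the width suggested by alignment (MODE_BITS) at the size of the
   object actually being accessed (OBJECT_BITS).  */
unsigned int capped_access_bits (unsigned int mode_bits,
                                 unsigned int object_bits)
{
  return mode_bits < object_bits ? mode_bits : object_bits;
}
```

With a 64-bit candidate access and a 32-bit object, the capped width is
32 bits, so no reference can reach past the end of the object.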

for gcc/ChangeLog

* gimple-fold.cc (fold_truth_andor_for_ifcombine): Limit the
size of the bitregion in get_best_mode calls by the inner
object's type size, if known.
(make_bit_field_load): Reuse SSA_NAME if we're attempting to
issue an identical load.

fold fold_truth_andor field merging into ifcombine

This patch introduces various improvements to the logic that merges
field compares, while moving it into ifcombine.

Before the patch, we could merge:

  (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions.  Constants may be used
instead of the object B.
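
A hand-written C sketch of that transformation, assuming a hypothetical
4-byte struct pair in which x1 and y1 share one aligned word. The names
are made up for illustration, and the mask is built at run time so the
sketch does not depend on endianness.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, assumed to be 4 bytes with no padding.  */
struct pair { uint8_t x1; uint8_t y1; uint16_t rest; };

/* Source form: two separate field compares joined by &&.  */
int equal_fieldwise (const struct pair *a, const struct pair *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* MASK covering exactly the bits of x1 and y1 within the word.  */
static uint32_t field_mask (void)
{
  struct pair m = { 0xff, 0xff, 0 };
  uint32_t w;
  memcpy (&w, &m, sizeof w);
  return w;
}

/* Merged form: one word load per object, one mask, one compare.
   memcpy keeps the wide access well defined in portable C.  */
int equal_merged (const struct pair *a, const struct pair *b)
{
  uint32_t wa, wb, mask = field_mask ();
  memcpy (&wa, a, sizeof wa);
  memcpy (&wb, b, sizeof wb);
  return (wa & mask) == (wb & mask);
}
```

Both functions agree for every input; the merged form simply ignores the
bits of rest, which is what MASK achieves in the expression above.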

The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types.  We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.

Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges.  This patch
introduces handlers for several cases involving these.

The merging of multiple field accesses into wider bitfield-like
accesses is undesirable to do too early in compilation, so we move it
from folding to ifcombine, and guard its warnings with
-Wtautological-compare, turned into a common flag.

When the second of a noncontiguous pair of compares is the first that
accesses a word, we may merge the first compare with part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.

for  gcc/ChangeLog

* fold-const.cc (make_bit_field): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc: (fold_truth_andor_for_ifcombine) ... here,
with -Wtautological-compare warning guards, and...
(decode_field_reference): ... here.  Rework for gimple.
(gimple_convert_def_p, gimple_binop_def_p): New.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h: (make_bit_field_ref): Declare
(fold_truth_andor_for_ifcombine): Declare.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.
* common.opt (Wtautological-compare): Move here.

for  gcc/c-family/ChangeLog

* c.opt (Wtautological-compare): Move to ../common.opt.

for  gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.
* gcc.dg/field-merge-12.c: New.
* gcc.target/aarch64/long_branch_1.c: Disable ifcombine.

AVR: Assert minimal required bit width of section_common::flags.

gcc/
* config/avr/avr.cc (avr_ctz): New constexpr function.
(section_common::flags): Assert minimal bit width.

AVR: target/118001 - Add __flashx as 24-bit named address space.

This patch adds __flashx as a new named address space that allocates
objects in .progmemx.data.  The handling is mostly the same or similar
to that of 24-bit space __memx, except that the asm routines are
simpler and more efficient.  Loads are emitted inline when ELPMX or
LPMX is available.  The address space uses 24-bit addresses even
on devices with a program memory size of 64 KiB or less.

PR target/118001
gcc/
* doc/extend.texi (AVR Named Address Spaces): Document __flashx.
* config/avr/avr.h (ADDR_SPACE_FLASHX): New enum value.
* config/avr/avr-protos.h (avr_out_fload, avr_mem_flashx_p)
(avr_fload_libgcc_p, avr_load_libgcc_mem_p)
(avr_load_libgcc_insn_p): New.
* config/avr/avr.cc (avr_addrspace): Add ADDR_SPACE_FLASHX.
(avr_decl_flashx_p, avr_mem_flashx_p, avr_fload_libgcc_p)
(avr_load_libgcc_mem_p, avr_load_libgcc_insn_p, avr_out_fload):
New functions.
(avr_adjust_insn_length) [ADJUST_LEN_FLOAD]: Handle case.
(avr_progmem_p) [avr_decl_flashx_p]: return 2.
(avr_addr_space_legitimate_address_p) [ADDR_SPACE_FLASHX]:
Has the same behavior as ADDR_SPACE_MEMX.
(avr_addr_space_convert): Use pointer sizes rather than ASes.
(avr_addr_space_contains): New function.
(avr_convert_to_type): Use it.
(avr_emit_cpymemhi): Handle ADDR_SPACE_FLASHX.
* config/avr/avr.md (adjust_len) <fload>: New attr value.
(gen_load<mode>_libgcc): Renamed from load<mode>_libgcc.
(xload8<mode>_A): Iterate over MOVMODE rather than over ALL1.
(fxmov<mode>_A): New from xloadv<mode>_A.
(xmov<mode>_8): New from xload<mode>_A.
(fmov<mode>): New insns.
(fxload<mode>_A): New from xload<mode>_A.
(fxload_<mode>_libgcc): New from xload_<mode>_libgcc.
(*fxload_<mode>_libgcc): New from *xload_<mode>_libgcc.
(mov<mode>) [avr_mem_flashx_p]: Handle ADDR_SPACE_FLASHX.
(cpymemx_<mode>): Make sure the address space is not lost
when splitting.
(*cpymemx_<mode>) [ADDR_SPACE_FLASHX]: Use __movmemf_<mode> for asm.
(*ashlqi.1.zextpsi_split): New combine pattern.
* config/avr/predicates.md (nox_general_operand): Don't match
when avr_mem_flashx_p is true.
* config/avr/avr-passes.cc (AVR_LdSt_Props):
ADDR_SPACE_FLASHX has no post_inc.

gcc/testsuite/
* gcc.target/avr/torture/addr-space-1.h [AVR_HAVE_ELPM]:
Use a function to bump .progmemx.data to a high address.
* gcc.target/avr/torture/addr-space-2.h: Same.
* gcc.target/avr/torture/addr-space-1-fx.c: New test.
* gcc.target/avr/torture/addr-space-2-fx.c: New test.

libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _fload_1, _fload_2,
_fload_3, _fload_4, _movmemf.
* config/avr/lib1funcs.S (.branch_plus): New .macro.
(__xload_1, __xload_2, __xload_3, __xload_4): When the address is
located in flash, then forward to...
(__fload_1, __fload_2, __fload_3, __fload_4): ...these new
functions, respectively.
(__movmemx_hi): When the address is located in flash, forward to...
(__movmemf_hi): ...this new function.

Fix type compatibility for types with flexible array member 2/2 [PR113688,PR114713,PR117724]

For checking or computing TYPE_CANONICAL, ignore the array size when it is
the last element of a structure or union. To not get errors because of
an inconsistent number of members, zero-sized arrays which are the last
element are not ignored anymore when checking the fields of a struct.

PR c/113688
PR c/114014
PR c/114713
PR c/117724

gcc/ChangeLog:
* tree.cc (gimple_canonical_types_compatible_p): Add exception.

gcc/lto/ChangeLog:
* lto-common.cc (hash_canonical_type): Add exception.

gcc/testsuite/ChangeLog:
* gcc.dg/pr113688.c: New test.
* gcc.dg/pr114014.c: New test.
* gcc.dg/pr114713.c: New test.
* gcc.dg/pr117724.c: New test.

Fix type compatibility for types with flexible array member 1/2 [PR113688,PR114713,PR117724]

Allow the TYPE_MODE of a type with an array as last member to differ from
another compatible type.

gcc/ChangeLog:
* tree.cc (gimple_canonical_types_compatible_p): Add exception.
(verify_type): Add exception.

gcc/lto/ChangeLog:
* lto-common.cc (hash_canonical_type): Add exception.

testsuite: arm: Use -mtune=cortex-m4 for thumb-ifcvt.c test

On Cortex-M4, the code generated is:
     cmp     r0, r1
     itte    ne
     lslne   r0, r0, r1
     asrne   r0, r0, #1
     moveq   r0, r1
     add     r0, r0, r1
     bx      lr

On Cortex-M7, the code generated is:
     cmp     r0, r1
     beq     .L3
     lsls    r0, r0, r1
     asrs    r0, r0, #1
     add     r0, r0, r1
     bx      lr
.L3:
     mov     r0, r1
     add     r0, r0, r1
     bx      lr

As Cortex-M7 allows at most one conditional instruction, force
Cortex-M4 tuning to have a stable test case.
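
For orientation, a C function of roughly this shape would produce the
sequences above. It is reconstructed from the assembly and is an
assumption, not the source of thumb-ifcvt.c; the name shift_or_copy is
hypothetical.

```c
/* Reconstructed from the generated code shown above; the comments name
   the Cortex-M4 instruction each statement corresponds to.  */
int shift_or_copy (int a, int b)
{
  if (a != b)
    {
      a = a << b;   /* lslne r0, r0, r1 */
      a = a >> 1;   /* asrne r0, r0, #1 */
    }
  else
    a = b;          /* moveq r0, r1 */
  return a + b;     /* add r0, r0, r1 */
}
```

The two shifts on the taken path are what make the difference: they fit
one IT block with -mtune=cortex-m4, while the Cortex-M7 tuning falls
back to a branch.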

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb-ifcvt.c: Use -mtune=cortex-m4.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Fix build error for thumb2-slow-flash-data-3.c test

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb2-slow-flash-data-3.c: Added argument to
fn1 to avoid compile error.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Check that a far jump is used in thumb1-far-jump-2.c

With the changes in r15-1579-g792f97b44ff, the code used as "padding" in
the test case is optimized away. Prevent this optimization by forcing a
read of the volatile memory.
Also, validate that there is a far jump in the generated assembler.

Without this patch, the generated assembler is reduced to:
f3:
        cmp     r0, #0
        beq     .L1
        ldr     r4, .L6
.L1:
        bx      lr
.L7:
        .align  2
.L6:
        .word   g_0_1

With the patch, the generated assembler is:
f3:
        movs    r2, #1
        ldr     r3, .L6
        push    {lr}
        str     r2, [r3]
        cmp     r0, #0
        bne     .LCB10
        bl      .L1     @far jump
.LCB10:
        b       .L7
.L8:
        .align  2
.L6:
        .word   .LANCHOR0
.L7:
        str     r2, [r3]
        ...
        str     r2, [r3]
.L1:
        pop     {pc}

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb1-far-jump-2.c: Write to volatile memory
in macro to avoid optimization.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Use effective-target for pr96939 test

Update test case to use -mcpu=unset/-march=unset feature introduced in
r15-3606-g7d6c6a0d15c.

gcc/testsuite/ChangeLog:

* gcc.target/arm/lto/pr96939_0.c: Use effective-target
arm_arch_v8a.
* gcc.target/arm/lto/pr96939_1.c: Remove dg-options.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Use effective-target for its.c test [PR94531]

The test case gcc.target/arm/its.c was created together with restriction
of IT blocks for Cortex-M7. As the test case fails on all tunes that
do not match Cortex-M7, explicitly test it for Cortex-M7. To have some
additional faith that GCC does the correct thing, I also added another
variant of the test for Cortex-M3 that should allow longer IT blocks.

gcc/testsuite/ChangeLog:

PR testsuite/94531
* gcc.target/arm/its.c: Removed.
* gcc.target/arm/its-1.c: Copy of gcc.target/arm/its.c. Use
effective-target arm_cpu_cortex_m7.
* gcc.target/arm/its-2.c: Copy of gcc.target/arm/its.c. Use
effective-target arm_cpu_cortex_m3.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Use -mcpu=unset when overriding -march

Update test cases to use -mcpu=unset/-march=unset feature introduced in
r15-3606-g7d6c6a0d15c.

gcc/testsuite/ChangeLog:
* gcc.dg/pr41574.c: Added option "-mcpu=unset".
* gcc.dg/pr59418.c: Likewise.
* lib/target-supports.exp (add_options_for_vect_early_break):
Likewise.
(add_options_for_arm_v8_neon): Likewise.
(check_effective_target_arm_neon_ok_nocache): Likewise.
(check_effective_target_arm_simd32_ok_nocache): Likewise.
(check_effective_target_arm_sat_ok_nocache): Likewise.
(check_effective_target_arm_dsp_ok_nocache): Likewise.
(check_effective_target_arm_crc_ok_nocache): Likewise.
(check_effective_target_arm_v8_neon_ok_nocache): Likewise.
(check_effective_target_arm_v8_1m_mve_fp_ok_nocache): Likewise.
(check_effective_target_arm_v8_1a_neon_ok_nocache): Likewise.
(check_effective_target_arm_v8_2a_fp16_scalar_ok_nocache):
Likewise.
(check_effective_target_arm_v8_2a_fp16_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_1m_mve_ok_nocache): Likewise.
(check_effective_target_arm_v8_2a_i8mm_ok_nocache): Likewise.
(check_effective_target_arm_fp16fml_neon_ok_nocache): Likewise.
(check_effective_target_arm_v8_2a_bf16_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8m_main_cde_ok_nocache): Likewise.
(check_effective_target_arm_v8m_main_cde_fp_ok_nocache):
Likewise.
(check_effective_target_arm_v8_1m_main_cde_mve_ok_nocache):
Likewise.
(check_effective_target_arm_v8_1m_main_cde_mve_fp_ok_nocache):
Likewise.
(check_effective_target_arm_v8_3a_complex_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_3a_fp16_complex_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_1_lob_ok): Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Use -march=unset for bfloat16_scalar* tests

Update test cases to use -mcpu=unset/-march=unset feature introduced in
r15-3606-g7d6c6a0d15c.

gcc/testsuite/ChangeLog:

* gcc.target/arm/bfloat16_scalar_1_1.c: Use effective-target
arm_arch_v8_2a_bf16_hard.
* gcc.target/arm/bfloat16_scalar_2_1.c: Likewise.
* gcc.target/arm/bfloat16_scalar_3_1.c: Likewise.
* gcc.target/arm/bfloat16_scalar_1_2.c: Use effective-target
arm_arch_v8_2a_bf16.
* gcc.target/arm/bfloat16_scalar_2_2.c: Likewise.
* gcc.target/arm/bfloat16_scalar_3_2.c: Likewise.
* lib/target-supports.exp: Define effective-target
v8_2a_bf16 and v8_2a_bf16_hard.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Use effective-target for pr56184.C and pr59985.C

Update test cases to use -mcpu=unset/-march=unset feature introduced in
r15-3606-g7d6c6a0d15c.

gcc/testsuite/ChangeLog:

* g++.dg/other/pr56184.C: Use effective-target
arm_arch_v7a_neon_thumb.
* g++.dg/other/pr59985.C: Use effective-target
arm_arch_v7a_fp_hard.
* lib/target-supports.exp: Define effective-target
arm_arch_v7a_fp_hard, arm_arch_v7a_neon_thumb.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

i386: regenerate i386.opt.urls

r15-6128-gfa878dc8c45fa3 missed the regeneration of the URL doc map, so
regenerate it here to make the buildbots happy.

gcc/ChangeLog:

* config/i386/i386.opt.urls: Regenerate.

crc: Comment spelling fix

"replacement is succeeded" doesn't look correct, this patch drops the
is.

2024-12-12 Jakub Jelinek <jakub@redhat.com>

* gimple-crc-optimization.cc (crc_optimization::optimize_crc_loop):
Comment spelling fix, is succeeded -> succeeded.

ada: Fix reference to Ada 2020 in comment

Code cleanup.

gcc/ada/ChangeLog:

* par-ch5.adb (Test_Statement_Required): Fix comment.

ada: Elide the copy for bit-packed aggregates in object declarations

The in-place expansion has been historically disabled for them, but there
does not seem to be any good reason left for this. However, this requires
a small trick in order for the expanded code not to be flagged as using the
object uninitialized by the code generator.

gcc/ada/ChangeLog:

* exp_aggr.adb (Convert_Aggr_In_Object_Decl): Clear the component
referenced on the right-hand side of the first assignment generated
for a bit-packed array, if any.
(Expand_Array_Aggregate): Do not exclude aggregates of bit-packed
array types in object declarations from in-place expansion.
* sem_eval.adb (Eval_Indexed_Component): Do not attempt a constant
evaluation for a bit-packed array type.

ada: Defend against risk of infinite loop

A recently fixed bug caused an infinite loop when assertions were not
checked. With assertions checked, the symptom was just an internal
error caused by an assertion failure. This patch makes it so that if
another bug ever causes the same condition to fail, there will never be
an infinite loop with any assertion policy.

gcc/ada/ChangeLog:

* sem_ch3.adb (Access_Subprogram_Declaration): Replace assertion with
more defensive code.

ada: Avoid going through symlinks in the json report

gcc/ada/ChangeLog:

* errout.adb (Write_JSON_Location): Avoid going through
symbolic links when printing the full name.

ada: Fix minor display issue on invalid floats

GNAT implements a format with trailing '*' signs for the Image attribute
of NaN, +inf and -inf. It was probably always intended to be the same
length as the image of 1.0, but one '*' was actually missing. This patch
fixes this.

gcc/ada/ChangeLog:

* libgnat/s-imager.adb (Image_Floating_Point): Tweak display of
invalid floating point values.

ada: Improve task entry context detection

Access parameters are not allowed in specifications of task entries.
Before this patch, the compiler failed to detect that case in accept
statements that were not directly in their task body's scopes. This
patch fixes this issue.

gcc/ada/ChangeLog:

* sem_ch3.adb (Access_Definition): Remove test for task entry context.
* sem_ch6.adb (Process_Formals): Add improved test for task entry
context.

ada: Refactor warning about null loops

Code cleanup; semantics is unaffected.

gcc/ada/ChangeLog:

* sem_ch5.adb (Analyze_Loop_Parameter_Specification): Move call
to Comes_From_Source to the outer if-statement.

ada: Fix internal error on loop parameter specifications

Originally loop parameter specification only occurred in loops, but now
it also occurs in quantified expressions. This patch guards against
flagging non-loop nodes as null loop statements. This was causing
internal compiler errors that were only visible with switch -gnatdk,
which happens to be default in GNATprove testsuite.

gcc/ada/ChangeLog:

* sem_ch5.adb (Analyze_Loop_Parameter_Specification): Only set
flag Is_Null_Loop when loop parameter specification comes from
a loop and not from a quantified expression.

ada: Elide the copy for bit-packed aggregates in allocators

The in-place expansion has been historically disabled for them, but there
does not seem to be any good reason left for this.

gcc/ada/ChangeLog:

* exp_aggr.adb (Expand_Array_Aggregate): Do not exclude aggregates
of bit-packed array types in allocators from in-place expansion.

ada: Fix the level of the LLVM chapter in the User's Guide

gcc/ada/ChangeLog:

* doc/gnat_ugn/building_executable_programs_with_gnat.rst: Move
the LLVM chapter one level up.
* gnat_ugn.texi: Regenerate.

ada: Accept static strings with External_Initialization

Before this patch, the argument to the External_Initialization aspect
had to be a string literal. This patch extends the possibilities so that
any static string is accepted.

A new helper function, Is_OK_Static_Expression_Of_Type, is introduced,
and in addition to the main change of this patch a couple of calls to
that helper function are added in other places to replace equivalent
inline code.

gcc/ada/ChangeLog:

* sem_eval.ads (Is_OK_Static_Expression_Of_Type): New function.
* sem_eval.adb (Is_OK_Static_Expression_Of_Type): Likewise.
* sem_ch13.adb (Check_Expr_Is_OK_Static_Expression): Use new function.
* sem_prag.adb (Check_Expr_Is_OK_Static_Expression): Likewise.
* sem_ch3.adb (Apply_External_Initialization): Accept static strings
for the parameter.

ada: Fix reference manual clauses

The clauses in section 3.5 of the reference manual were moved around
along the different Ada versions, which caused some comments in our
source code to go out of date. This patch updates the references in
those comments.

gcc/ada/ChangeLog:

* libgnat/a-tifiio.adb: Fix comment.
* libgnat/a-tifiio__128.adb: Likewise.
* libgnat/s-imaged.ads (Image_Decimal): Likewise.
* libgnat/s-imagef.ads (Image_Fixed): Likewise.
* libgnat/s-imager.ads (Image_Fixed_Point): Likewise.
* libgnat/s-imde32.ads (Image_Decimal32): Likewise.
* libgnat/s-imfi64.ads (Image_Fixed64): Likewise.
* libgnat/s-imgcha.adb (Image_Character): Likewise.
* libgnat/s-valuer.adb (Scan_Raw_Real): Likewise.
* sem_attr.adb (Eval_Attribute): Likewise.

ada: Fix pragma Compile_Time_Error for sizes of nonstatic array types

The pragma is consistently rejected for the sizes of nonstatic array types
because Eval_Attribute does not evaluate it even if it is known.

gcc/ada/ChangeLog:

* sem_attr.adb (Eval_Attribute): Treat the various size attributes
like Component_Size for nonstatic array types.

ada: Refactor code of Check_Ambiguous_Call and Valid_Conversion

gcc/ada/ChangeLog:

* sem_res.adb (Is_Ambiguous_Operand): Add missing decoration of
the operand when it is labeled overloaded but has just one
interpretation.

ada: Minor refactoring in expansion of array aggregates

This just moves a couple of checks done in conjunction with the predicate
Aggr_Assignment_OK_For_Backend into its body and adds a couple of comments.

No functional changes.

gcc/ada/ChangeLog:

* exp_aggr.adb (Aggr_Assignment_OK_For_Backend): Add Target formal
parameter and check that it is not a bit-aligned component or slice.
Return False in CodePeer mode as well.
(Build_Array_Aggr_Code): Remove redundant tests done in conjunction
with a call to Aggr_Assignment_OK_For_Backend.
(Expand_Array_Aggregate): Likewise. Add a couple of comments and
improve formatting.

ada: Fix validity check for private types

Before this patch, the machinery to generate validity checks got
confused in some situations involving private views of types, and ended
up generating incorrect conversions from floating point types to integer
types. This patch fixes this.

gcc/ada/ChangeLog:

* exp_attr.adb (Expand_N_Attribute_Reference): Fix computation of type
category.

ada: Add minimal support for other delayed aspects on controlled objects

This extends the processing done for the Address aspect to other delayed
aspects. The External_Name aspect is also reclassified as a representation
aspect and the three representation aspects External_Name, Link_Name and
Linker_Section are moved from the Always_Delay to the Rep_Aspect category,
which makes it possible not to delay them in most cases with a small tweak.

gcc/ada/ChangeLog:

* aspects.ads (Is_Representation_Aspect): True for External_Name.
(Aspect_Delay): Use Rep_Aspect for External_Name, Link_Name and
Linker_Section.
* einfo.ads (Initialization_Statements): Document extended usage.
* exp_util.adb (Needs_Initialization_Statements): Return True for
all delayed aspects.
* freeze.adb (Check_Address_Clause): Do not move the initialization
expression here...
(Freeze_Object_Declaration): ...but here instead, as well as for all
delayed aspects. Remove test for pragma Linker_Section.
* sem_ch13.adb (Analyze_One_Aspect): Do not delay in the Rep_Aspect
case if the expression is a string literal.

ada: Fix documentation comment for Scan_Sign

This patch fixes a couple of details that were wrong in the
documentation comment for System.Val_Util.Scan_Sign.

gcc/ada/ChangeLog:

* libgnat/s-valuti.ads (Scan_Sign): Fix documentation comment.

ada: Crash on assignment of task allocator with expanded name

The compiler crashes on an assignment statement of the form
"X.Y := new T;", where X.Y is an expanded name (i.e. not a record
component or similar) and T is a type containing tasks.

gcc/ada/ChangeLog:

* exp_util.adb (Build_Task_Image_Decls):
Deal properly with the case of an expanded name.
Minor cleanup: use a case statement instead of if/elsif chain.

ada: Lift technical limitation in expansion of record aggregates

The mechanism deferring the expansion of record aggregates nested in other
aggregates with intermediate conditional expressions is disabled in the
case where they contain self-references, because of a technical limitation
in the replacements done by Build_Record_Aggr_Code. This change lifts it.

gcc/ada/ChangeLog:

* exp_aggr.adb (Traverse_Proc_For_Aggregate): New generic procedure.
(Replace_Discriminants): Instantiate it instead of Traverse_Proc.
(Replace_Self_Reference): Likewise.
(Convert_To_Assignments): Remove limitation for nested aggregates
that contain self-references.

ada: Small improvements to expansion of conditional expressions

They comprise using a nonnull access type for the indirect expansion to
avoid useless checks, simplifying the expansion of if expressions whose
condition is known at compile time to avoid an N_Expression_With_Actions,
using the indirect expansion for them in the indefinite case too, which
makes the special case for an unconstrained array type obsolete.

No functional changes.

gcc/ada/ChangeLog:

* exp_ch4.adb (Expand_N_Case_Expression): Remove obsolete comment
about C code generation.  Do not create a useless target type if
the parent statement is rewritten instead of the expression.  Use
a nonnull access type for the expansion done for composite types.
(Expand_N_If_Expression): Simplify the expansion when the condition
is known at compile time.  Apply the expansion done for by-reference
types to indefinite types and remove the obsolete special case for
unconstrained array types.  Use a nonnull access type in this case.
Rename New_If local variable to If_Stmt for the sake of consistency.

ada: Fix wrong finalization with private unconstrained array type

The address passed to the routine attaching a controlled object to the
finalization master must be that of its dope vector for an object whose
nominal subtype is an unconstrained array type, but this is not the case
when this subtype has a private declaration.

gcc/ada/ChangeLog:

* exp_ch7.adb (Make_Address_For_Finalize): Look at the underlying
subtype to detect the unconstrained array type case.
* sprint.adb (Write_Itype) <E_Private_Subtype>: New case.

ada: Update documentation for External_Initialization

This fixes an omission in the recent change that was made to file lookup
for External_Initialization.

gcc/ada/ChangeLog:

* doc/gnat_rm/gnat_language_extensions.rst: Update
External_Initialization section.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.

ada: Tweak Is_Predefined_File_Name

This patch slightly widens the set of filenames that the compiler
considers predefined. That makes it possible to build the GNAT runtime
using only the file mapping facilities of the compiler, without having
to rename files.

gcc/ada/ChangeLog:

* fname.adb (Is_Predefined_File_Name): Tweak test.

ada: Restrict External_Initialization file lookup

Before this patch, External_Initialization looked for files in all
directories of the source search path, which led to inconsistencies in
some cases. This patch restricts the file lookup so the argument is
interpreted as relative to the current source file's directory only.

gcc/ada/ChangeLog:

* sem_ch3.adb (Apply_External_Initialization): Restrict File lookup.

ada: Clean up and restrict usage of Initialization_Statements

This mechanism is the only producer of N_Compound_Statement in the expanded
code and parks the statements generated for the in-place initialization of
objects by an aggregate, so that they can be moved to the freeze point if
there is an address aspect/clause, or even cancelled if the aggregate has
been generated for Initialize_Scalars/Normalize_Scalars before a subsequent
pragma Import for the object is encountered.

The main condition for its triggering is that the object be not yet frozen,
but that's always the case when its declaration is being processed, so the
mechanism is triggered unnecessarily and the change restricts this but, on
the other hand, it also extends its usage to the in-place initialization by
a function call, which was implemented by means of a custom deferral.

There should be no functional changes.

gcc/ada/ChangeLog:

* einfo.ads (Initialization_Statements): Document usage precisely.
* exp_aggr.adb (Convert_Aggr_In_Object_Decl): Do not create a
compound statement in most cases, do it only if necessary.
* exp_ch3.adb (Expand_N_Object_Declaration): Remove a couple of
useless statements.
* exp_ch6.adb (Make_Build_In_Place_Call_In_Object_Declaration):
Use the Initialization_Statements mechanism if necessary.
* exp_ch7.adb: Remove clauses for Aspects package.
(Insert_Actions_In_Scope_Around): Use the support code of Exp_Util
for the Initialization_Statements mechanism.
* exp_prag.adb (Undo_Initialization): Remove obsolete code.
* exp_util.ads (Move_To_Initialization_Statements): New procedure.
(Needs_Initialization_Statements): New function.
* exp_util.adb (Move_To_Initialization_Statements): New procedure.
(Needs_Initialization_Statements): New predicate.

ada: Avoid expanding LHS assignments for controlled types

Expanding a function call that returns a controlled type
on the left-hand side of an assignment should be avoided.
Otherwise we will miss the diagnostic for
trying to assign something to a non-variable element.

gcc/ada/ChangeLog:

* exp_ch6.adb (Expand_Ctrl_Function_Call): Avoid expansion
of controlled types when the LHS is a function call.

ada: Add SIGPROT handler for CheriBSD

gcc/ada/ChangeLog:

* init.c (__gnat_error_handler): Handle SIGPROT
(__gnat_install_handler): Install SIGPROT handler

ada: Export CHERI exception IDs

This allows CHERI exceptions to be raised from C code in the runtime.

gcc/ada/ChangeLog:

* libgnat/i-cheri-exceptions.ads: Export CHERI exception IDs.

ada: Ensure minimum stack size for preallocated task stacks

On targets with preallocated task stacks the minimum stack size is
defined as a constant in System.Parameters. When adding preallocated
tasks to the expanded code the compiler does not have direct access to
that value. Instead generate the expression
Max (Task_Size, Minimum_Task_Size) in the expanded tree and let it be
resolved later in the compilation process.

gcc/ada/ChangeLog:

* exp_ch9.adb (Expand_N_Task_Type_Declaration): Take
Minimum_Stack_Size into account when preallocating task stacks.
* rtsfind.ads (RE_Id, RE_Unit_Table): Add RE_Minimum_Stack_Size.

Fix misplaced x86 -mstack-protector-guard-symbol documentation [PR117150]

Commit e1769bdd4cef522ada32aec863feba41116b183a accidentally inserted
the documentation for the x86 -mstack-protector-guard-symbol option in the
wrong place. Fixed thusly.

gcc/ChangeLog
PR target/117150
* doc/invoke.texi (RS/6000 and PowerPC Options): Move description
of -mstack-protector-guard-symbol from here...
(x86 Options): ...to here.

Daily bump.

libstdc++: Disable __gnu_debug::__is_singular(T*) in constexpr [PR109517]

Because of PR c++/85944 we have several bugs where _GLIBCXX_DEBUG causes
errors for constexpr code. Although Bug 117966 could be fixed by
avoiding redundant debug checks in std::span, and Bug 106212 could be
fixed by avoiding redundant debug checks in std::array, there are many
more cases where similar __glibcxx_requires_valid_range checks fail to
compile and removing the checks everywhere isn't desirable.

This just disables the __gnu_debug::__check_singular(T*) check during
constant evaluation. Attempting to dereference a null pointer will
certainly fail during constant evaluation (if it doesn't fail then it's
a compiler bug and not the library's problem). Disabling this check
during constant evaluation shouldn't do any harm.
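
As a rough illustration of the idea (not the libstdc++ source), a debug
check can be written so that it is skipped whenever the call is being
constant-evaluated. The name check_not_singular_sketch is hypothetical, and
__builtin_is_constant_evaluated is assumed to be available as a GCC/Clang
builtin:

```cpp
#include <cassert>

// Hypothetical stand-in for a Debug Mode pointer check; this is only a
// sketch of the approach, not the code in debug/helper_functions.h.
template<typename T>
constexpr bool check_not_singular_sketch(const T* p)
{
  // During constant evaluation the check is skipped and the pointer is
  // treated as valid; at run time the null test is still performed.
  return __builtin_is_constant_evaluated() ? true : p != nullptr;
}

// In a constant expression the check never rejects the pointer.
constexpr int values[] = {1, 2, 3};
static_assert(check_not_singular_sketch(values),
              "check is skipped during constant evaluation");
```

At run time the same function still reports a null pointer, so only constant
evaluation loses the check.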

libstdc++-v3/ChangeLog:

PR libstdc++/109517
PR libstdc++/109976
* include/debug/helper_functions.h (__valid_range_aux): Treat
all input iterator ranges as valid during constant evaluation.

libstdc++: Skip redundant assertions in std::array equality [PR106212]

As PR c++/106212 shows, the Debug Mode checks cause a compilation error
for equality comparisons involving std::array prvalues in constant
expressions. Those Debug Mode checks are redundant when
comparing two std::array objects, because we already know we have a
valid range. We can also avoid the unnecessary step of using
std::__niter_base to do __normal_iterator unwrapping, which isn't needed
because our std::array iterators are just pointers. Using
std::__equal_aux1 instead of std::equal avoids the redundant checks in
std::equal and std::__equal_aux.

libstdc++-v3/ChangeLog:

PR libstdc++/106212
* include/std/array (operator==): Use std::__equal_aux1 instead
of std::equal.
* testsuite/23_containers/array/comparison_operators/106212.cc:
New test.

libstdc++: Skip redundant assertions in std::span construction [PR117966]

As PR c++/117966 shows, the Debug Mode checks cause a compilation error
for a global constexpr std::span. Those debug checks are redundant when
constructing from an array or a range, because we already know we have a
valid range and we know its size. Instead of delegating to the
std::span(contiguous_iterator, contiguous_iterator) constructor, just
initialize the data members directly.

libstdc++-v3/ChangeLog:

PR libstdc++/117966
* include/std/span (span(T (&)[N])): Do not delegate to
constructor that performs redundant checks.
(span(array<T, N>&), span(const array<T, N>&)): Likewise.
(span(Range&&), span(const span<T, N>&)): Likewise.
* testsuite/23_containers/span/117966.cc: New test.

libstdc++: Remove constraints on std::generator::promise_type::operator new

This was approved in Wrocław as LWG 3900, so that passing an incorrect
argument intended as an allocator will be ill-formed, instead of
silently ignored.

This also renames the template parameters and function parameters for
the allocators, to align with the names in the standard. I found it too
confusing to have a parameter _Alloc which doesn't correspond to Alloc
in the standard. Rename _Alloc to _Allocator (which the standard calls
Allocator) and rename _Na to _Alloc (which the standard calls Alloc).

libstdc++-v3/ChangeLog:

* include/std/generator (_Promise_alloc): Rename template
parameter. Use __alloc_rebind to rebind allocator.
(_Promise_alloc::operator new): Replace constraints with a
static_assert in the body. Rename allocator parameter.
(_Promise_alloc<void>::_M_allocate): Rename allocator parameter.
Use __alloc_rebind to rebind allocator.
(_Promise_alloc<void>::operator new): Rename allocator
parameter.
* testsuite/24_iterators/range_generators/alloc.cc: New test.
* testsuite/24_iterators/range_generators/lwg3900.cc: New test.

Reviewed-by: Arsen Arsenović <arsen@aarsen.me>

[PR116778][LRA]: Check pseudos assigned to FP after rematerialization to build live ranges

This is a better fix for the PR, which makes it possible to avoid building
live ranges after rematerialization. It checks that FP cannot be
eliminated now and that pseudos assigned to FP will be spilled. In
this case we need to build live ranges after rematerialization for
correct assignments of stack slots to spilled pseudos involved in
rematerialization.

gcc/ChangeLog:

PR rtl-optimization/116778
* ira-int.h (x_ira_class_hard_reg_index): Fix comment typo.
* lra-eliminations.cc (lra_fp_pseudo_p): New function.
* lra-int.h (lra_fp_pseudo_p): External declaration.
* lra-spills.cc (lra_need_for_spills_p): Fix formatting.
* lra.cc (lra): Use lra_fp_pseudo_p in lra_create_live_range after
lra_remat.

Fortran: Add DECL_EXPR for variable length assoc name [PR117901]

2024-12-11 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/117901
* trans-stmt.cc (trans_associate_var): A variable character
length array associate name must generate a DECL expression for
the data pointer type.

gcc/testsuite/
PR fortran/117901
* gfortran.dg/pr117901.f90: New test.

gimple: Add limit after which slower switchlower algs are used [PR117091] [PR117352]

This patch adds a limit on the number of cases of a switch.  When this
limit is exceeded, switch lowering decides to use faster but less
powerful algorithms.

In particular this means that for finding bit tests switch lowering
decides between the old dynamic programming O(n^2) algorithm and the
new greedy algorithm that Andi Kleen recently added but then reverted
due to PR117352.  It also means that switch lowering may bail out on
finding jump tables if the switch is too large.  (Btw it also may not
bail!  It can happen that the greedy algorithm finds some bit tests
which then basically split the switch into multiple smaller switches and
those may be small enough to fit under the limit.)
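
For readers unfamiliar with the term, a "bit test" replaces several case
labels that share a target with a single mask test. The following is only a
sketch of that idea in C++; the function names are made up for illustration
and this is not the code that switch lowering emits:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: cases 1, 3, 4 and 9 all lead to the same target.
bool is_selected_switch(unsigned x)
{
  switch (x)
    {
    case 1: case 3: case 4: case 9:
      return true;
    default:
      return false;
    }
}

// The same decision expressed as one range check plus one bit test.
bool is_selected_bit_test(unsigned x)
{
  const std::uint32_t mask = (1u << 1) | (1u << 3) | (1u << 4) | (1u << 9);
  // The range check keeps the shift amount below the width of the mask.
  return x < 32 && ((mask >> x) & 1u) != 0;
}
```

Both functions agree for every input; deciding which groups of cases to turn
into such masks is the part that the algorithms above differ on.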

The limit is implemented as --param switch-lower-slow-alg-max-cases.
Exceeding the limit is reported through -Wdisabled-optimization.

This patch fixes the issue with the greedy algorithm described in
PR117352.  The problem was incorrect usage of the is_beneficial()
heuristic.

gcc/ChangeLog:

PR middle-end/117091
PR middle-end/117352
* doc/invoke.texi: Add switch-lower-slow-alg-max-cases.
* params.opt: Add switch-lower-slow-alg-max-cases.
* tree-switch-conversion.cc (jump_table_cluster::find_jump_tables):
Note in a comment that we are looking for jump tables in
case sequences delimited by the already found bit tests.
(bit_test_cluster::find_bit_tests): Decide between
find_bit_tests_fast() and find_bit_tests_slow().
(bit_test_cluster::find_bit_tests_fast): New function.
(bit_test_cluster::find_bit_tests_slow): New function.
(switch_decision_tree::analyze_switch_statement): Report
exceeding the limit.
* tree-switch-conversion.h: Add find_bit_tests_fast() and
find_bit_tests_slow().

Co-Authored-By: Andi Kleen <ak@gcc.gnu.org>
Signed-off-by: Filip Kastl <fkastl@suse.cz>

c++: allow stores to anon union vars to change current union member in constexpr [PR117614]

Since r14-4771 the FE tries to differentiate between cases where the lhs
of a store allows changing the current union member and cases where it
doesn't, and cases where it doesn't includes everything that has gone
through the cxx_eval_constant_expression path on the lhs.
As the testcase shows, DECL_ANON_UNION_VAR_P vars were handled like that
too, even when stores to them are the only way to change the current
union member in the sources.

So, the following patch just handles that case manually without calling
cxx_eval_constant_expression and without setting evaluated to true.

2024-12-11 Jakub Jelinek <jakub@redhat.com>

PR c++/117614
* constexpr.cc (cxx_eval_store_expression): For stores to
DECL_ANON_UNION_VAR_P vars just continue with DECL_VALUE_EXPR
of it, without setting evaluated to true or full
cxx_eval_constant_expression.

* g++.dg/cpp2a/constexpr-union8.C: New test.

c++: tweak colorization of incompatible declspecs

Introduce a helper function for complaining about "signed unsigned"
and "short long". Add colorization there so that e.g. the 'signed'
and 'unsigned' are given consistent contrasting colors in both the
message and the quoted source.

gcc/cp/ChangeLog:
* decl.cc: Add #include "diagnostic-highlight-colors.h"
and #include "pretty-print-markup.h".
(complain_about_incompatible_declspecs): New.
(grokdeclarator): Use it when complaining about both 'signed' and
'unsigned', and both 'long' and 'short'.

gcc/ChangeLog:
* diagnostic-highlight-colors.h: Tweak comment.
* pretty-print-markup.h (class pp_element_quoted_string): New,
based on pretty-print.cc's selftest::test_element, adding an
optional highlight color.
* pretty-print.cc (class test_element): Drop.
(selftest::test_pp_format): Use pp_element_quoted_string.
(selftest::test_urlification): Likewise.

gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/long-short-colorization.C: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: suppress "note: " prefix in nested diagnostics [PR116253]

This patch is a followup to:
"c++: use diagnostic nesting [PR116253]"

This patch tweaks how text output with experimental-nesting=yes
prints nested diagnostics, by omitting the leading "note: " from
nested notes.

This reduces the amount of visual cruft the user has to ignore when
reading C++ template errors; see the examples in the testsuite.

This doesn't affect the output for users who have not opted-in
to nested diagnostic-printing.

gcc/ChangeLog:
PR other/116253
* diagnostic-format-text.cc (build_prefix): Don't add the
"note: " prefix when showing nested diagnostics.

gcc/testsuite/ChangeLog:
PR other/116253
* g++.dg/concepts/nested-diagnostics-1-truncated.C: Update
expected output.
* g++.dg/concepts/nested-diagnostics-1.C: Likewise.
* g++.dg/concepts/nested-diagnostics-2.C: Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented-show-levels.c:
Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented-unicode.c:
Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented.c: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c++: print z candidate count and number them (v2)

Changed in v2: changed wording to "there is"/"there are" rather
than "we found".

This patch is a followup to:
"c++: use diagnostic nesting [PR116253]"

Following Sy Brand's UX suggestions in P2429R0 for example 1, this patch
tweaks print_z_candidates to add a note about the number of candidates,
and adds a candidate number to each one.

Various examples of output can be seen in the testsuite part of the
patch.

gcc/cp/ChangeLog:
* call.cc (print_z_candidates): Count the number of
candidates and issue a note stating the count at an
intermediate nesting level. Number the individual
candidates.

gcc/testsuite/ChangeLog:
* g++.dg/concepts/diagnostic9.C: Update expected
results for candidate count and numbering.
* g++.dg/concepts/nested-diagnostics-1-truncated.C: Likewise.
* g++.dg/concepts/nested-diagnostics-1.C: Likewise.
* g++.dg/concepts/nested-diagnostics-2.C: Likewise.
* g++.dg/cpp23/explicit-obj-lambda11.C: Likewise.
* g++.dg/cpp2a/desig4.C: Likewise.
* g++.dg/cpp2a/desig6.C: Likewise.
* g++.dg/cpp2a/spaceship-eq15.C: Likewise.
* g++.dg/diagnostic/function-color1.C: Likewise.
* g++.dg/diagnostic/param-type-mismatch-2.C: Likewise.
* g++.dg/diagnostic/pr100716-1.C: Likewise.
* g++.dg/diagnostic/pr100716.C: Likewise.
* g++.dg/lookup/operator-2.C: Likewise.
* g++.dg/lookup/pr80891-5.C: Likewise.
* g++.dg/modules/adhoc-1_b.C: Likewise.
* g++.dg/modules/err-1_c.C: Likewise.
* g++.dg/modules/err-1_d.C: Likewise.
* g++.dg/other/return2.C: Likewise.
* g++.dg/overload/error6.C: Likewise.
* g++.dg/template/local6.C: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: tweak output for nested messages [PR116253]

When printing nested messages with
-fdiagnostics-set-output=text:experimental-nesting=yes
avoid printing a line such as the "cc1plus:" in the following:
• note: set ‘-fconcepts-diagnostics-depth=’ to at least 2 for more detail
cc1plus:
for "special" locations such as UNKNOWN_LOCATION.

gcc/ChangeLog:
PR other/116253
* diagnostic-format-text.cc (on_report_diagnostic): When showing
locations for nested messages on new lines, don't print
UNKNOWN_LOCATION or BUILTINS_LOCATION.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

input.cc: rename file_cache::in_context

No functional change intended.

gcc/ChangeLog:
* input.cc (file_cache::initialize_input_context): Rename member
"in_context" to "m_input_context".
(file_cache::add_file): Likewise.
* input.h (class file_cache): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Ada: Add GNU/Hurd x86_64 support

This is essentially the same as the i386-pc-gnu section, the differences
are the same as between freebsd i386 and freebsd x86_64.

gcc/ada/ChangeLog:

* Makefile.rtl: Add x86_64-pc-gnu section.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

Ada: Fix GNU/Hurd priority range

GNU/Mach currently uses a 0..63 range.

gcc/ada/ChangeLog:

* libgnat/system-gnu.ads: New file.
* Makefile.rtl (x86-gnuhurd): Use libgnat/system-gnu.ads instead of
libgnat/system-freebsd.ads.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

Ada: Factorize bsd signal definitions

They are all the same on all BSD-like systems (including GNU/Hurd).

gcc/ada/ChangeLog:

* libgnarl/a-intnam__freebsd.ads: Rename to...
* libgnarl/a-intnam__bsd.ads: ... new file.
* libgnarl/a-intnam__dragonfly.ads: Remove file.
* Makefile.rtl (x86-kfreebsd, x86-gnuhurd, x86_64-kfreebsd,
aarch64-freebsd, x86-freebsd, x86_64-freebsd): Use
libgnarl/a-intnam__bsd.ads instead of libgnarl/a-intnam__freebsd.ads.
(x86_64-dragonfly): Use libgnarl/a-intnam__bsd.ads instead of
libgnarl/a-intnam__dragonfly.ads.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

ipa: Update value range jump functions during inlining

When inlining (during the analysis phase) a call graph edge, we update
all pass-through jump functions corresponding to edges going out of
the newly inlined function to be relative to the function into which
we are inlining or to expose the information originally captured for
the edge that is being inlined.

Similarly, we can combine the value range information in pass-through
jump functions corresponding to both edges, which is what this patch
adds - at least for the case when the inlined pass-through is a
simple, non-arithmetic one, which is the case that we also handle for
constant and aggregate jump function parts.

gcc/ChangeLog:

2024-11-01 Martin Jambor <mjambor@suse.cz>

* ipa-cp.h: Forward declare class ipa_vr.
(ipa_vr_operation_and_type_effects): Declare.
* ipa-cp.cc (ipa_vr_operation_and_type_effects): Make public.
* ipa-prop.cc (update_jump_functions_after_inlining): Also update
value range jump functions.

middle-end: Add initial support for poly_int64 BIT_FIELD_REF in expand pass [PR96342]

While `poly_int64' has been the default representation of bitfield size
and offset for some time, there was a lack of support for the use of
non-constant `poly_int64' values for those values throughout the
compiler, limiting the applicability of the BIT_FIELD_REF rtl expression
for variable length vectors, such as those used by SVE.

This patch starts work on extending the functionality of relevant
functions in the expand pass such as to enable their use by the compiler
for such vectors.

gcc/ChangeLog:

PR target/96342
* expr.cc (store_constructor): Enable poly_{u}int64 type usage.
(get_inner_reference): Ditto.

Co-authored-by: Tamar Christina <tamar.christina@arm.com>

middle-end: add vec_init support for variable length subvector concatenation. [PR96342]

For architectures where the vector length is a compile-time variable,
representing instead a runtime constant, as is the case with SVE, it is
perfectly reasonable for such a vector to be made up of two (or more)
subvector components of a compatible variable sub-length.

One example of this would be the concatenation of two VNx4QI vectors
into a single VNx8QI vector.

This patch adds initial support for the enablement of this feature in
the middle-end, removing the `.is_constant()' constraint on the vector's
number of elements, instead making the constant no. of elements the
multiple of the number of subvectors (which must then also be of
variable length, such that their polynomial ratio then results in a
compile-time constant) required to fill the vector.

gcc/ChangeLog:

PR target/96342
* expr.cc (store_constructor): add support for variable-length
vectors.

Co-authored-by: Tamar Christina <tamar.christina@arm.com>

middle-end: Fix mask length arg in call to vect_get_loop_mask [PR96342]

When issuing multiple calls to a simdclone in a vectorized loop,
TYPE_VECTOR_SUBPARTS(vectype) gives the incorrect number when compared
to the TYPE_VECTOR_SUBPARTS result we get from the mask type derived
from the relevant `rgroup_controls' entry within `vect_get_loop_mask'.

By passing `masktype' instead, we are able to get the correct number of
vector subparts and thus eliminate the ICE in the call to
`vect_get_loop_mask' when the data type for which we retrieve the mask
is wider than the one used when defining the mask at mask registration
time.

gcc/ChangeLog:

PR target/96342
* tree-vect-stmts.cc (vectorizable_simd_clone_call):
s/vectype/masktype/ in call to vect_get_loop_mask.

middle-end: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE [PR96342]

This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
target can reject a simd_clone based on the vector mode it is using.
This is needed because for VLS SVE vectorization the vectorizer accepts
Advanced SIMD simd clones when vectorizing using SVE types because the simdlens
might match. This will cause type errors later on.

Other targets do not currently need to use this argument.

gcc/ChangeLog:

PR target/96342
* target.def (TARGET_SIMD_CLONE_USABLE): Add argument.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass stmt_info to
call TARGET_SIMD_CLONE_USABLE.
* config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add argument
and use it to reject the use of SVE simd clones with Advanced SIMD
modes.
* config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused argument.
* config/i386/i386.cc (ix86_simd_clone_usable): Likewise.
* doc/tm.texi: Regenerate

Co-authored-by: Victor Do Nascimento <victor.donascimento@arm.com>
Co-authored-by: Tamar Christina <tamar.christina@arm.com>

middle-end: use two's complement equality when comparing IVs during candidate selection  [PR114932]

IVOPTS normally uses affine trees to perform comparisons between different IVs,
but these seem to have been missing in two key spots, where normal tree
equivalences were used instead.

In some cases where we have a two's complement equivalence but not a strict
signedness equivalence, we end up generating both a signed and an unsigned IV
for the same candidate.

This patch implements a new OEP flag called OEP_ASSUME_WRAPV.  This flag will
check if the operands would produce the same bit values after the computations
even if the final sign is different.
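
To make "same bit values even if the final sign is different" concrete, here
is a small C++ sketch. It is not GCC code: the helper names are hypothetical,
and the signed variant assumes the caller avoids signed overflow, which is
undefined in C++:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Signed form of the computation (caller must avoid signed overflow).
std::int64_t scaled_signed(std::int64_t stride)
{
  return stride * 4;
}

// Unsigned form of the same computation; wraps modulo 2^64.
std::uint64_t scaled_unsigned(std::int64_t stride)
{
  return static_cast<std::uint64_t>(stride) * 4u;
}

// Compare the object representations rather than the numeric values.
bool same_bits(std::int64_t a, std::uint64_t b)
{
  return std::memcmp(&a, &b, sizeof a) == 0;
}
```

For a negative stride the two results differ as numbers (one is negative,
the other a large positive value) but hold identical bits, which is the kind
of equivalence the new flag is meant to recognize.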

This happens quite a lot with Fortran but can also happen in C because this same
code is unable to figure out when one expression is a multiple of another.

As an example in the attached testcase we get:

Initial set of candidates:
  cost: 24 (complexity 3)
  reg_cost: 9
  cand_cost: 15
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6, 8
   group:0 --> iv_cand:6, cost=(0,1)
   group:1 --> iv_cand:1, cost=(0,0)
   group:2 --> iv_cand:8, cost=(0,1)
   group:3 --> iv_cand:8, cost=(0,1)
  invariant variables: 6
  invariant expressions: 1, 2

<Invariant Expressions>:
inv_expr 1:     stride.3_27 * 4
inv_expr 2:     (unsigned long) stride.3_27 * 4

These end up being used in the same group:

Group 1:
cand  cost    compl.  inv.expr.       inv.vars
1     0       0       NIL;    6
2     0       0       NIL;    6
3     0       0       NIL;    6

which ends up with IVOPTS picking both the signed and unsigned IVs:

Improved to:
  cost: 24 (complexity 3)
  reg_cost: 9
  cand_cost: 15
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6, 8
   group:0 --> iv_cand:6, cost=(0,1)
   group:1 --> iv_cand:1, cost=(0,0)
   group:2 --> iv_cand:8, cost=(0,1)
   group:3 --> iv_cand:8, cost=(0,1)
  invariant variables: 6
  invariant expressions: 1, 2

and so generates the same IV as both signed and unsigned:

;;   basic block 21, loop depth 3, count 214748368 (estimated locally, freq 58.2545), maybe hot
;;    prev block 28, next block 31, flags: (NEW, REACHABLE, VISITED)
;;    pred:       28 [always]  count:23622320 (estimated locally, freq 6.4080) (FALLTHRU,EXECUTABLE)
;;                25 [always]  count:191126046 (estimated locally, freq 51.8465) (FALLTHRU,DFS_BACK,EXECUTABLE)
  # .MEM_66 = PHI <.MEM_34(28), .MEM_22(25)>
  # ivtmp.22_41 = PHI <0(28), ivtmp.22_82(25)>
  # ivtmp.26_51 = PHI <ivtmp.26_55(28), ivtmp.26_72(25)>
  # ivtmp.28_90 = PHI <ivtmp.28_99(28), ivtmp.28_98(25)>

...

;;   basic block 24, loop depth 3, count 214748366 (estimated locally, freq 58.2545), maybe hot
;;    prev block 22, next block 25, flags: (NEW, REACHABLE, VISITED)
;;    pred:       22 [always]  count:95443719 (estimated locally, freq 25.8909) (FALLTHRU)
;;                21 [33.3% (guessed)]  count:71582790 (estimated locally, freq 19.4182) (TRUE_VALUE,EXECUTABLE)
;;                31 [33.3% (guessed)]  count:47721860 (estimated locally, freq 12.9455) (TRUE_VALUE,EXECUTABLE)
# .MEM_22 = PHI <.MEM_44(22), .MEM_31(21), .MEM_79(31)>
ivtmp.22_82 = ivtmp.22_41 + 1;
ivtmp.26_72 = ivtmp.26_51 + _80;
ivtmp.28_98 = ivtmp.28_90 + _39;

These two IVs are always used as unsigned, so IVOPTS generates:

  _73 = stride.3_27 * 4;
  _80 = (unsigned long) _73;
  _54 = (unsigned long) stride.3_27;
  _39 = _54 * 4;

Which means that in e.g. exchange2 we generate a lot of duplicate code.

This is because candidates 6 and 8 are equivalent under two's complement but
have different signs.
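
The equivalence can be illustrated with a small C sketch.  It is a scaled-down
model and not code from the patch: it uses 32-bit values instead of long, and
the function names are made up for illustration.  The signed product is
computed exactly in a wider type so that the C code itself has no undefined
overflow, then converted to unsigned; the other path converts first and
multiplies with unsigned wrapping.  Both paths yield the same bits, which is
the property the new flag is described as checking.

```c
#include <assert.h>
#include <stdint.h>

/* Signed path, modelling "_73 = stride * 4; _80 = (unsigned) _73".
   The product is formed exactly in 64 bits to avoid undefined overflow
   in this sketch; the conversion to uint32_t reduces it modulo 2^32.  */
static uint32_t
signed_then_convert (int32_t stride)
{
  int64_t exact = (int64_t) stride * 4;
  return (uint32_t) exact;
}

/* Unsigned path, modelling "_54 = (unsigned) stride; _39 = _54 * 4".
   Unsigned multiplication wraps modulo 2^32 by definition.  */
static uint32_t
convert_then_unsigned (int32_t stride)
{
  return (uint32_t) stride * 4u;
}

/* Nonzero when both paths produce identical bit patterns.  */
static int
same_bits (int32_t stride)
{
  return signed_then_convert (stride) == convert_then_unsigned (stride);
}

/* Spot-check a few strides, including the extremes.  */
static void
check_examples (void)
{
  assert (same_bits (0));
  assert (same_bits (7));
  assert (same_bits (-1));
  assert (same_bits (INT32_MAX));
  assert (same_bits (INT32_MIN));
}
```

Calling check_examples () passes for every stride tried, so keeping both a
signed and an unsigned copy of such an expression buys nothing at the bit
level.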

This patch changes it so that if two IVs are affine equivalent, we just pick
one over the other.  IVOPTS already has code for this, so the patch just uses
affine trees instead of trees for the check.

With it we get:

<Invariant Expressions>:
inv_expr 1:     stride.3_27 * 4

<Group-candidate Costs>:
Group 0:
  cand  cost    compl.  inv.expr.       inv.vars
  5     0       2       NIL;    NIL;
  6     0       3       NIL;    NIL;

Group 1:
  cand  cost    compl.  inv.expr.       inv.vars
  1     0       0       NIL;    6
  2     0       0       NIL;    6
  3     0       0       NIL;    6
  4     0       0       NIL;    6

Initial set of candidates:
  cost: 16 (complexity 3)
  reg_cost: 6
  cand_cost: 10
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6
   group:0 --> iv_cand:6, cost=(0,3)
   group:1 --> iv_cand:1, cost=(0,0)
  invariant variables: 6
  invariant expressions: 1

gcc/ChangeLog:

PR tree-optimization/114932
* fold-const.cc (operand_compare::operand_equal_p): Use it.
(operand_compare::verify_hash_value): Likewise.
(operand_compare::hash_operand): Likewise.
(test_operand_equality::test): New.
(fold_const_cc_tests): Use it.
* tree-core.h (enum operand_equal_flag): Add OEP_ASSUME_WRAPV.
* tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.

gcc/testsuite/ChangeLog:

PR tree-optimization/114932
* gfortran.dg/addressing-modes_2.f90: New test.

middle-end: refactor type to be explicit in operand_equal_p [PR114932]

This is a refactoring with no expected behavioral change.
The goal with this is to make the type of the expressions being used explicit.

I did not change all the recursive calls to operand_equal_p () to recurse
directly to the new function but instead this goes through the top level call
which re-extracts the types.

This was done because in most of the cases where we recurse type == arg.
The second patch makes use of this new flexibility to implement an overload
of operand_equal_p which checks for equality under two's complement.

gcc/ChangeLog:

PR tree-optimization/114932
* fold-const.cc (operand_compare::operand_equal_p): Split into one that
takes explicit type parameters and use that in public one.
* fold-const.h (class operand_compare): Add operand_equal_p private
overload.

MAINTAINERS: add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself to write after approval.

autoupdate: replace obsolete macros in libiberty

Autoreconf-2.72 warns about obsolete macros. This patch aims at removing
the noise from a future upgrade to autoreconf-2.72 or later. This is in
no way a complete patch allowing the upgrade to autoreconf-2.72.

- AC_GNU_SOURCE by AC_USE_SYSTEM_EXTENSIONS
  https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.72/autoconf.html#index-AC_005fGNU_005fSOURCE-1
- AC_CONFIG_HEADER by AC_CONFIG_HEADERS
  https://www.gnu.org/software/automake/manual/1.12.2/html_node/Obsolete-Macros.html#index-AM_005fCONFIG_005fHEADER

Those fixes were originally submitted in a patch series in binutils.
https://inbox.sourceware.org/binutils/878qthm6a0.fsf@gentoo.org/

libiberty/ChangeLog:

* configure: Regenerate.
* configure.ac: Fix autoupdate warnings.

libstdc++: Make std::println use locale from ostream (LWG 4088)

This was just approved in Wrocław.

libstdc++-v3/ChangeLog:

* include/std/ostream (println): Pass stream's locale to
std::format, as per LWG 4088.
* testsuite/27_io/basic_ostream/print/1.cc: Check std::println
with custom locale. Remove unused brit_punc class.

aix: Resolve build failure with default C23

The libiberty/getopt.c file defines _NO_PROTO, which causes
conflicting declarations for functions in AIX header files
such as stdio.h and stdlib.h.
It looks like the _NO_PROTO define was added long ago and the conflicting
declarations were always present until the C23 standard uncovered them.

Remove the block defining _NO_PROTO, as both Tru64 UNIX (ex-OSF/1)
and AIX 3.2 are no longer supported.

libiberty/ChangeLog:

* getopt.c: Remove _NO_PROTO block.

aarch64: Use SVE ASRD instruction with Neon modes.

The ASRD instruction on SVE performs an arithmetic shift right by an immediate
for divide.

This patch enables the use of ASRD with Neon modes.

For example:

int in[N], out[N];

void
foo (void)
{
  for (int i = 0; i < N; i++)
    out[i] = in[i] / 4;
}

compiles to:

ldr q31, [x1, x0]
cmlt v30.16b, v31.16b, #0
and z30.b, z30.b, 3
add v30.16b, v30.16b, v31.16b
sshr v30.16b, v30.16b, 2
str q30, [x0, x2]
add x0, x0, 16
cmp x0, 1024

but can just be:

ldp q30, q31, [x0], 32
asrd z31.b, p7/m, z31.b, #2
asrd z30.b, p7/m, z30.b, #2
stp q30, q31, [x1], 32
cmp x0, x2
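
As a scalar illustration of what both sequences compute, here is a hedged C
sketch of the bias-and-shift form of a signed divide by 4, mirroring the
cmlt/and/add/sshr steps of the first listing.  The function names are
hypothetical, and the sketch assumes that >> on a negative signed value is an
arithmetic shift, which GCC documents but ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Signed division by 4 that rounds toward zero, without a divide.
   Assumption: >> on a negative int32_t is an arithmetic shift.  */
static int32_t
div4_bias_shift (int32_t x)
{
  int32_t mask = -(x < 0);   /* all ones when x is negative (cmlt) */
  int32_t bias = mask & 3;   /* 2^2 - 1 for negative inputs (and)  */
  return (x + bias) >> 2;    /* add, then shift right by 2 (sshr)  */
}

/* Compare against C's truncating division over a small range and at
   the extremes of the type.  */
static void
check_div4 (void)
{
  for (int32_t x = -17; x <= 17; x++)
    assert (div4_bias_shift (x) == x / 4);
  assert (div4_bias_shift (INT32_MIN) == INT32_MIN / 4);
  assert (div4_bias_shift (INT32_MAX) == INT32_MAX / 4);
}
```

The bias is needed because a plain arithmetic shift rounds toward negative
infinity, whereas C division rounds toward zero.  A single ASRD per vector
replaces the separate compare, mask, add and shift steps.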

This patch also adds the following overload:
aarch64_ptrue_reg (machine_mode pred_mode, machine_mode data_mode)
Depending on the data mode, the function returns a predicate with the
appropriate bits set.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_ptrue_reg): New overload.
* config/aarch64/aarch64-protos.h (aarch64_ptrue_reg): Likewise.
* config/aarch64/aarch64-sve.md: Extend sdiv_pow2<mode>3
and *sdiv_pow2<mode>3 to support Neon modes.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/sve-asrd.c: New test.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
Signed-off-by: Soumya AR <soumyaa@nvidia.com>