git.ipfire.org Git - thirdparty/gcc.git/log

RISC-V: Fix factor in dwarf_poly_indeterminate_value [PR116305]

This patch fixes the bug (BugId:116305) introduced by commit bd93ef
for the RISC-V target.

Commit bd93ef changes chunk_num in riscv_convert_vector_bits from 1 to
TARGET_MIN_VLEN/128 when TARGET_MIN_VLEN is larger than 128, which
changes the value of BYTES_PER_RISCV_VECTOR. For example, with
TARGET_MIN_VLEN equal to 256, BYTES_PER_RISCV_VECTOR was [8, 8] before
commit bd93ef but is now [16, 16]. As a result, the values of
riscv_bytes_per_vector_chunk and BYTES_PER_RISCV_VECTOR are no longer
equal.

The prologue uses BYTES_PER_RISCV_VECTOR.coeffs[1] to estimate the vlenb
register value in riscv_legitimize_poly_move, and dwarf2cfi obtains the
same estimate in riscv_dwarf_poly_indeterminate_value to calculate the
factor by which the vlenb register value is multiplied.

So we need to change the factor from riscv_bytes_per_vector_chunk to
BYTES_PER_RISCV_VECTOR; otherwise we get incorrect DWARF information.
An example of the incorrect output follows:

```
csrr    t0,vlenb
slli    t1,t0,1
sub     sp,sp,t1

.cfi_escape 0xf,0xb,0x72,0,0x92,0xa2,0x38,0,0x34,0x1e,0x23,0x50,0x22
```

The sequence '0x92,0xa2,0x38,0' means the vlenb register, '0x34' means
the literal 4, and '0x1e' means the multiply operation. But in fact, the
vlenb register value only needs to be multiplied by the literal 2.
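
To make the byte sequence easier to check by hand, here is a minimal
sketch of a DWARF expression evaluator in Python. It is not GCC or
debugger code: it handles only the opcodes that appear in the
.cfi_escape above, and the helper names (read_uleb128, read_sleb128,
eval_dwarf_expr, cfa) are made up for illustration. The opcode values
are the standard DWARF encodings.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value starting at data[pos]."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(data, pos):
    """Decode a signed LEB128 value starting at data[pos]."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:              # sign bit of the final group
                result -= 1 << shift
            return result, pos


def eval_dwarf_expr(expr, regs):
    """Evaluate the small opcode subset used by the escape above."""
    stack, pos = [], 0
    while pos < len(expr):
        op = expr[pos]
        pos += 1
        if 0x70 <= op <= 0x8F:           # DW_OP_breg0 .. DW_OP_breg31
            offset, pos = read_sleb128(expr, pos)
            stack.append(regs[op - 0x70] + offset)
        elif op == 0x92:                 # DW_OP_bregx
            reg, pos = read_uleb128(expr, pos)
            offset, pos = read_sleb128(expr, pos)
            stack.append(regs[reg] + offset)
        elif 0x30 <= op <= 0x4F:         # DW_OP_lit0 .. DW_OP_lit31
            stack.append(op - 0x30)
        elif op == 0x1E:                 # DW_OP_mul
            rhs = stack.pop()
            stack.append(stack.pop() * rhs)
        elif op == 0x22:                 # DW_OP_plus
            rhs = stack.pop()
            stack.append(stack.pop() + rhs)
        elif op == 0x23:                 # DW_OP_plus_uconst
            addend, pos = read_uleb128(expr, pos)
            stack.append(stack.pop() + addend)
        else:
            raise ValueError(f"unhandled opcode {op:#x}")
    return stack.pop()


# Bytes copied from the .cfi_escape in the commit message.
ESCAPE = bytes([0xf, 0xb, 0x72, 0, 0x92, 0xa2, 0x38, 0,
                0x34, 0x1e, 0x23, 0x50, 0x22])
assert ESCAPE[0] == 0x0F                 # DW_CFA_def_cfa_expression
length, start = read_uleb128(ESCAPE, 1)  # expression length in bytes
EXPR = ESCAPE[start:start + length]
VLENB_REG, _ = read_uleb128(EXPR, 3)     # register operand of DW_OP_bregx


def cfa(sp, vlenb):
    """CFA described by the escape for sample sp and vlenb values."""
    return eval_dwarf_expr(EXPR, {2: sp, VLENB_REG: vlenb})
```

With these definitions, cfa(sp, vlenb) evaluates to sp + 4 * vlenb + 80,
whereas the prologue shown above subtracts only 2 * vlenb from sp. That
mismatch is the wrong factor this patch corrects.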

PR target/116305

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_dwarf_poly_indeterminate_value): Take
BYTES_PER_RISCV_VECTOR for *factor instead of riscv_bytes_per_vector_chunk.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalable_vector_cfi.c: New test.

Signed-off-by: Zhijin Zeng <zhijin.zeng@spacemit.com>

Daily bump.

Write CodeView information about stack variables

Outputs CodeView S_REGREL32 symbols for unoptimized local variables that
are stored on the stack. This includes a change to dwarf2out.cc to make
it easier to extract the function frame base without having to worry
about the function prologue or epilogue.

gcc/
* dwarf2codeview.cc (enum cv_sym_type): Add S_REGREL32.
(write_fbreg_variable): New function.
(write_unoptimized_local_variable): Add fblock parameter, and handle
DW_OP_fbreg locations.
(write_unoptimized_function_vars): Add fbloc parameter.
(write_function): Extract frame base from DWARF.
* dwarf2out.cc (convert_cfa_to_fb_loc_list): Output simplified frame
base information for CodeView.

Write CodeView information about enregistered variables

Outputs CodeView S_REGISTER symbols, representing local variables or
parameters that are held in a register.

gcc/
* dwarf2codeview.cc (enum cv_sym_type): Add S_REGISTER.
(enum cv_x86_register): New type.
(enum cv_amd64_register): New type.
(dwarf_reg_to_cv): New function.
(write_s_register): New function.
(write_unoptimized_local_variable): Handle parameters and DW_OP_reg*
location types.

Write CodeView information about local static variables

Outputs CodeView S_LDATA32 symbols, for static variables within
functions, along with S_BLOCK32 and S_END for the beginning and end of
lexical blocks.

gcc/
* dwarf2codeview.cc (enum cv_sym_type): Add S_END and S_BLOCK32.
(write_local_s_ldata32): New function.
(write_unoptimized_local_variable): New function.
(write_s_block32): New function.
(write_s_end): New function.
(write_unoptimized_function_vars): New function.
(write_function): Call write_unoptimized_function_vars.

Fix maybe-uninitialized CodeView LF_INDEX warning

Initialize last_type to 0 to silence two spurious maybe-uninitialized warnings.
We issue an LF_INDEX continuation subtype for any LF_FIELDLISTs that
overflow, so LF_INDEXes will always have a subtype preceding them (and
thus last_type will always be set).

gcc/
* dwarf2codeview.cc (get_type_num_enumeration_type): Initialize last_type
to 0.
(get_type_num_struct): Likewise.

AVR: target/85624 - Use HImode for clrmemqi alignment.

gcc/
PR target/85624
* config/avr/avr.md (*clrmemqi*): Use HImode for alignment operand.

(cherry picked from commit 507b4e147588c0fafe952b7226dd764ebeebb103)

Fortran: fix documentation of intrinsic RANDOM_INIT [PR114146]

gcc/fortran/ChangeLog:

PR fortran/114146
* intrinsic.texi: Fix documentation of arguments of RANDOM_INIT,
which is conforming to the F2018 standard.

modula2: change identifier names to avoid build warnings

This fix avoids the following warnings: In implementation module
‘StdChans’: either the identifier has the same name as a keyword or
alternatively a keyword has the wrong case (‘IN’ and ‘in’)
54 | stdnull: ChanId ;

the symbol name ‘in’ is legal as an identifier, however as such it
might cause confusion and is considered bad programming practice.

gcc/m2/ChangeLog:

* gm2-libs-iso/StdChans.mod (in): Rename to ...
(inch): ... this.
(out): Rename to ...
(outch): ... this.
(err): Rename to ...
(errch): ... this.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>

Fix using keywords as identifiers to prevent warnings during build

m2pim/DynamicStrings.mod:1358:27: note: In procedure ‘Slice’: the symbol
name ‘end’ is legal as an identifier, however as such it might cause
confusion and is considered bad programming practice
1358 | start, end, o: INTEGER ;

m2pim/DynamicStrings.mod:1358:27: note: either the identifier has the
same name as a keyword or alternatively a keyword has the wrong case
(‘END’ and ‘end’).

gcc/m2/ChangeLog:

* gm2-libs/DynamicStrings.mod (Slice): Rename end to stop.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>

testsuite: Verify -fshort-enums and -fno-short-enums in pr33738.C

For some targets, like Cortex-M on arm-none-eabi, the -fshort-enums is
enabled by default. For these targets, the test case fails as
sizeof(Alpha) < sizeof(int).
To make the test case behave identically for targets that enable
-fshort-enums and those that do not, add -fno-short-enums in the test
case and verify that the warning is not emitted. Then also create a copy
and run the test with -fshort-enums and verify that the warning is
emitted.

Regtested on x86_64-pc-linux-gnu and arm-none-eabi.

gcc/testsuite/ChangeLog:

* g++.dg/warn/pr33738.C: Added -fno-short-enums.
* g++.dg/warn/pr33738-2.C: Duplicate g++.dg/warn/pr33738.C with
-fshort-enums and removed xfail.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: Add -fno-short-enums to pr97315-1.C

The test case assumes that sizeof(tree_code) >= 2. On some targets, like
Cortex-M on arm-none-eabi, -fshort-enums is enabled by default and in
that case, sizeof(tree_code) will be 1 and the following warning is
emitted:

.../pr97315-1.C:8:13: warning: width of 'tree_base::code' exceeds its type

Avoid the warning by forcing -fno-short-enums.

gcc/testsuite/ChangeLog:

* g++.dg/opt/pr97315-1.C: Add -fno-short-enums.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: Add -fwrapv to signbit-5.c

On Cortex-M55 with MVE, the test case fails due to -INT_MAX being
undefined. Adding -fwrapv solves the issues.

Regtested on x86_64-pc-linux-gnu and arm-none-eabi for
Cortex-M0/M3/M4/M7/M33/M55/M85/A7.

gcc/testsuite/ChangeLog:

* gcc.dg/signbit-5.c: Add -fwrapv and remove x86 exception.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>

PR modula2/116378 m2 bootstrap fails on x86_64-darwin

This patch fixes m2 bootstrap failure on x86_64-darwin.  libc_open
is defined with three parameters the last of which is an int for
portability (rather than a vararg).  This avoids portability
problems by promoting mode_t to an int.  In the future it could
be tidied up by using the m2 optarg extension.

gcc/m2/ChangeLog:

PR modula2/116378
* gm2-libs-iso/TermFile.mod (termOpen): Add third argument
for open.
* gm2-libs/libc.def (open): Remove vararg and use INTEGER for
mode parameter three.
* mc-boot-ch/Glibc.c (tracedb_open): Replace mode_t with int.
(libc_open): Rewrite without varargs.
* mc-boot/Glibc.h (libc_open): Replace varargs with int mode.
* pge-boot/Glibc.cc (libc_open): Rewrite.
* pge-boot/Glibc.h (libc_open): Replace varargs with int mode.

gcc/testsuite/ChangeLog:

PR modula2/116378
* gm2/extensions/run/pass/testopen.mod: Add third argument
for open.
* gm2/isolib/run/pass/openlibc.mod: Ditto.
* gm2/pim/run/pass/testaddr3.mod: Ditto.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

c++: Pedwarn on [[]]; at class scope [PR110345]

For C++ 26 P2552R3 I went through all the spots (except modules) where
attribute-specifier-seq appears in the grammar and tried to construct
a testcase in all those spots, for now for [[deprecated]] attribute.

The fourth issue is that we emit (when enabled) the -Wextra-semi warning
not just for a lone semicolon at class scope (correct), but also for
[[]]; or [[whatever]]; there too.
A lone semicolon is valid in C++11 and newer, because
https://eel.is/c++draft/class.mem#nt:member-declaration
allows empty-declaration. Unlike namespace scope or block scope, however,
class scope supports nothing like an attribute-declaration or an empty
statement with attributes applied to it.
While [[]]; syntactically matches
attribute-specifier-seq [opt] decl-specifier-seq [opt] member-declarator-list [opt] ;
with the latter two omitted, there is
https://eel.is/c++draft/class.mem#general-3
which says that is not valid.

So, the following patch emits a pedwarn in that case.

2024-08-16 Jakub Jelinek <jakub@redhat.com>

PR c++/110345
* parser.cc (cp_parser_member_declaration): Call maybe_warn_extra_semi
only if it is empty-declaration, if there are some tokens like
attribute, pedwarn that the declaration doesn't declare anything.

* g++.dg/cpp0x/gen-attrs-84.C: New test.

i386: Fix some vex insns that prohibit egpr

Although insns such as those under TARGET_AVXVNNI, TARGET_IFMA and
TARGET_AVXNECONVERT have EVEX counterparts, their VEX versions should
not support APX EGPR when the explicit VEX prefix is used.
The TARGET_AVXVNNIINT8 and TARGET_AVXVNNIINT16 insns are also VEX
insns and should not support EGPR.

gcc/ChangeLog:

* config/i386/sse.md (vpmadd52<vpmadd52type><mode>):
Prohibit egpr for vex version.
(vpdpbusd_<mode>): Ditto.
(vpdpbusds_<mode>): Ditto.
(vpdpwssd_<mode>): Ditto.
(vpdpwssds_<mode>): Ditto.
(*vcvtneps2bf16_v4sf): Ditto.
(*vcvtneps2bf16_v8sf): Ditto.
(vpdp<vpdotprodtype>_<mode>): Ditto.
(vbcstnebf162ps_<mode>): Ditto.
(vbcstnesh2ps_<mode>): Ditto.
(vcvtnee<bf16_ph>2ps_<mode>): Ditto.
(vcvtneo<bf16_ph>2ps_<mode>): Ditto.
(vpdp<vpdpwprodtype>_<mode>): Ditto.

aarch64: Improve popcount for bytes [PR113042]

For popcount for bytes, we don't need the reduction addition
after the vector cnt instruction as we are only counting one
byte's popcount.
This changes the popcount extend to cover all ALLI rather than GPI.
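
The reasoning above can be checked with a small Python model. This is a
sketch, not the aarch64.md pattern: cnt_bytes stands in for the vector
CNT instruction over byte lanes, and the function names are
hypothetical.

```python
def cnt_bytes(value, nbytes=8):
    """Per-byte population counts, modelling CNT over byte lanes."""
    return [bin((value >> (8 * i)) & 0xFF).count("1") for i in range(nbytes)]


def popcount_with_reduction(value):
    """General case: count bits per byte, then add all lanes together."""
    return sum(cnt_bytes(value))


def popcount_byte(value):
    """Byte case: only lane 0 can be nonzero, so no reduction is needed."""
    assert 0 <= value <= 0xFF
    return cnt_bytes(value)[0]
```

For every byte value the two functions agree, which is why the
reduction addition can be dropped when the input is a single byte.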

Changes since v1:
* v2 - Use ALLI iterator and combine all into one pattern.
       Add new testcases popcnt[6-8].c.
* v3 - Simplify TARGET_CSSC path.
       Use convert_to_mode instead of gen_zero_extend* directly.
       Some other small cleanups.

Bootstrapped and tested on aarch64-linux-gnu with no regressions.

PR target/113042

gcc/ChangeLog:

* config/aarch64/aarch64.md (popcount<mode>2): Update pattern
to support ALLI modes.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/popcnt5.c: New test.
* gcc.target/aarch64/popcnt6.c: New test.
* gcc.target/aarch64/popcnt7.c: New test.
* gcc.target/aarch64/popcnt8.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libstdc++-v3: Handle iconv as optional for newlib builds [PR116362]

libstdc++-v3 seems to have always assumed that iconv support is
present in newlib, but it is off by default.

That assumption did no harm until recent libstdc++ changes
that actually call iconv functions. This now leads to
failures exposed by running the test-suite, unless the
newlib being used has been explicitly configured with
--enable-newlib-iconv. When failing, there are undefined
references to iconv, iconv_open or iconv_close for multiple
tests.

Thankfully there's a macro in newlib.h that we can check to
detect presence of iconv support for the newlib build that's
used.

libstdc++-v3:
PR libstdc++/116362
* configure.ac: Check newlib configuration whether iconv is enabled.
* configure: Regenerate.

libstdc++-v3: testsuite: Prune uncapitalized "in function" linker warning

Newer newlib versions trigger warnings about certain functions not being
implemented (_getentropy) when testing libstdc++-v3.

Since 2018 (circa binutils-2.31) the "in function" prefix isn't
capitalized for those "not implemented" warnings when generated from
the linker (a GNU ld feature used by newlib). Dejagnu up to and
including at least dejagnu-1.6.3 (and git @ 42979bd3b9) assumes a
capital "In function", leaving that part unpruned, and boom we have
thousands of "excess errors" from the libstdc++-v3 testsuite.

While gcc/testsuite/lib/prune.exp:prune_gcc_output already deals with
this quirk with a vastly more generic pattern, I choose this simpler
tweak.

libstdc++-v3:
* testsuite/lib/prune.exp (libstdc++-dg-prune): Prune
uncapitalized "in function" warning from linker.

Daily bump.

PHIOPT: Fix comment before factor_out_conditional_operation

I didn't update the comment before factor_out_conditional_operation
correctly. This updates it to be correct and mentions unary operations
rather than just conversions.

Pushed as obvious.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (factor_out_conditional_operation): Update
comment.

RISC-V: use fclass insns to implement isfinite,isnormal and isinf builtins

Currently these builtins use float compare instructions which require
FP flags to be saved/restored which could be costly in uarch.
RV Base ISA already has FCLASS.{d,s,h} instruction to compare/identify FP
values w/o disturbing FP exception flags.

Now that upstream supports the corresponding optabs, wire them up in the
backend.
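
As an illustration of how a classification mask can replace the
compares, here is a Python model of the idea. It is a sketch under
stated assumptions, not the riscv.md patterns: the bit positions follow
the FCLASS result table in the RISC-V ISA manual, fclass_d is a software
emulation for doubles, and the mask names are invented for this example.

```python
import struct

# FCLASS result bits as laid out in the RISC-V ISA manual.
NEG_INF, NEG_NORMAL, NEG_SUBNORMAL, NEG_ZERO = 1 << 0, 1 << 1, 1 << 2, 1 << 3
POS_ZERO, POS_SUBNORMAL, POS_NORMAL, POS_INF = 1 << 4, 1 << 5, 1 << 6, 1 << 7
SIGNALING_NAN, QUIET_NAN = 1 << 8, 1 << 9


def fclass_d(x):
    """Software model of FCLASS.D: return a one-hot class mask for x."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if exponent == 0x7FF:
        if fraction == 0:
            return NEG_INF if sign else POS_INF
        return QUIET_NAN if fraction >> 51 else SIGNALING_NAN
    if exponent == 0:
        if fraction == 0:
            return NEG_ZERO if sign else POS_ZERO
        return NEG_SUBNORMAL if sign else POS_SUBNORMAL
    return NEG_NORMAL if sign else POS_NORMAL


# Hypothetical mask names; each builtin becomes "classify, AND, test".
ISINF_MASK = NEG_INF | POS_INF
ISNORMAL_MASK = NEG_NORMAL | POS_NORMAL
ISFINITE_MASK = (NEG_NORMAL | NEG_SUBNORMAL | NEG_ZERO |
                 POS_ZERO | POS_SUBNORMAL | POS_NORMAL)


def isinf(x):
    return bool(fclass_d(x) & ISINF_MASK)


def isnormal(x):
    return bool(fclass_d(x) & ISNORMAL_MASK)


def isfinite(x):
    return bool(fclass_d(x) & ISFINITE_MASK)
```

Each predicate is one classification followed by a bitwise AND and a
test against zero. No floating-point compare is involved, so there are
no exception flags to save or restore.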

gcc/ChangeLog:
* config/riscv/riscv.md: define_insn for fclass insn.
define_expand for isfinite, isnormal, isinf.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/fclass.c: New tests.

Tested-by: Edwin Lu <ewlu@rivosinc.com> # pre-commit-CI #2060
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>

i386: Improve split of *extendv2di2_highpart_stv_noavx512vl.

This patch follows up on the previous patch to fix PR target/116275 by
improving the code STV (ultimately) generates for highpart sign extensions
like (x<<8)>>8.  The arithmetic right shift is able to take advantage of
the available common subexpressions from the preceding left shift.

Hence previously with -O2 -m32 -mavx -mno-avx512vl we'd generate:

        vpsllq  $8, %xmm0, %xmm0
        vpsrad  $8, %xmm0, %xmm1
        vpsrlq  $8, %xmm0, %xmm0
        vpblendw        $51, %xmm0, %xmm1, %xmm0

But with improved splitting, we now generate three instructions:

        vpslld  $8, %xmm1, %xmm0
        vpsrad  $8, %xmm0, %xmm0
        vpblendw        $51, %xmm1, %xmm0, %xmm0
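
The equivalence behind the shorter sequence can be modelled in Python
for a single 64-bit lane. This is a sketch of the arithmetic only, not
the i386.md splitter: the function names are hypothetical, and the
comments map steps to instructions on the assumption that the blend
takes the low dword of each lane from the unshifted input.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def reference(x):
    """(x << 8) >> 8 on a 64-bit lane, with an arithmetic right shift."""
    shifted = (x << 8) & MASK64
    return (to_signed(shifted, 64) >> 8) & MASK64


def split_by_dword(x):
    """Same result using only 32-bit shifts on the high dword."""
    low = x & MASK32                             # kept from the original input
    high = (x >> 32) & MASK32
    high = (high << 8) & MASK32                  # models vpslld $8
    high = (to_signed(high, 32) >> 8) & MASK32   # models vpsrad $8
    return (high << 32) | low                    # models the vpblendw merge
```

The low 32 bits of (x<<8)>>8 always equal the low 32 bits of x, so only
the high dword needs shifting. That is why the 64-bit shifts are not
needed in the improved split.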

This patch also implements Uros' suggestion that the pre-reload
splitter could introduce a new pseudo to hold the intermediate
to potentially help reload with register allocation, which applies
when not performing the above optimization, i.e. on TARGET_XOP.

2024-08-15  Roger Sayle  <roger@nextmovesoftware.com>
    Uros Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): Split
to an improved implementation on !TARGET_XOP.  On TARGET_XOP, use
a new pseudo for the intermediate to simplify register allocation.

gcc/testsuite/ChangeLog
* g++.target/i386/pr116275-2.C: New test case.

fortran: Fix bootstrap in resolve.cc [PR116387]

The r15-2934 change broke bootstrap:
../../gcc/fortran/resolve.cc: In function ‘bool resolve_operator(gfc_expr*)’:
../../gcc/fortran/resolve.cc:4649:22: error: too many arguments for format [-Werror=format-extra-args]
4649 |           gfc_error ("Inconsistent coranks for operator at %%L and %%L",
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following patch fixes that by using %L rather than %%L, the call has 2
location arguments.
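
As an analogy only, Python's %-formatting shows the same class of
mistake. gfc_error is not involved here, %s stands in for %L, and
format_diagnostic is a hypothetical helper. A doubled percent sign is a
literal percent, so the format consumes no arguments and the two
location arguments become extras.

```python
def format_diagnostic(template, *locations):
    """Apply printf-style formatting to a diagnostic template."""
    return template % locations


# "%%s" is a literal "%s": the two location arguments are never consumed.
BROKEN = "Inconsistent coranks for operator at %%s and %%s"
# "%s" is a real conversion: each one consumes a location argument.
FIXED = "Inconsistent coranks for operator at %s and %s"


def try_format(template, *locations):
    """Return the formatted text, or None if the arguments do not fit."""
    try:
        return format_diagnostic(template, *locations)
    except TypeError:
        return None
```

Python rejects the extra arguments at run time, while GCC's
-Werror=format-extra-args rejects them at compile time, which is what
broke the bootstrap.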

2024-08-15  Jakub Jelinek  <jakub@redhat.com>

PR bootstrap/116387
* resolve.cc (resolve_operator): Use %L rather than %%L in format
string.

c++: fix up cpp23/consteval-if3.C test [PR115583]

Compiling with optimizations is needed to trigger the bug fixed
by r15-2369.

PR c++/115583

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/consteval-if13.C: Compile with -O.

Tweak base/index disambiguation in decompose_normal_address [PR116236]

The PR points out that, for an address like:

(plus (zero_extend X) Y)

decompose_normal_address doesn't establish a strong preference
between treating X as the base or Y as the base. As the comment
in the patch says, zero_extend isn't enough on its own to assume
an index, at least not on POINTERS_EXTEND_UNSIGNED targets.
But in a construct like the one above, X and Y have different modes,
and it seems reasonable to assume that the one with the expected
address mode is the base.

This matters on targets like m68k that support index extension
and that require different classes for bases and indices.

gcc/
PR middle-end/116236
* rtlanal.cc (decompose_normal_address): Try to distinguish
bases and indices based on mode, before resorting to "baseness".

late-combine: Preserve INSN_CODE when modifying notes [PR116343]

When it removes a definition, late-combine tries to update all
uses in notes.  It does this using the same insn_propagation class
that it uses for patterns.

However, insn_propagation uses validate_change, which in turn
resets the INSN_CODE.  This is inefficient in the best case,
since it forces the pattern to be rerecognised even though
changing a note can't affect the INSN_CODE.  But in the PR
it's a correctness problem: resetting INSN_CODE means we lose
the NOOP_INSN_MOVE_CODE, which in turn means that rtl-ssa doesn't
queue it for deletion.

This patch adds a routine specifically for propagating into notes.
A belt-and-braces fix would be to rerecognise noop moves in
function_info::change_insns, but I can't think of a good reason
why that would be necessary, and it could paper over latent bugs.

gcc/
PR testsuite/116343
* recog.h (insn_propagation::apply_to_note): Declare.
* recog.cc (insn_propagation::apply_to_note): New function.
* late-combine.cc (insn_combination::substitute_note): Use
apply_to_note instead of apply_to_rvalue.
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Improve
dumping of costs for noop moves.

gcc/testsuite/
PR testsuite/116343
* gcc.dg/torture/pr116343.c: New test.

Fix Coarray in associate not a coarray. [PR110033]

A coarray used in an associate did not become a coarray in the block of
the associate. This patch fixes that and the same also in select type
statements.

PR fortran/110033

gcc/fortran/ChangeLog:

* class.cc (gfc_is_class_scalar_expr): Coarray refs that ref
only self, aka this image, are regarded as scalar, too.
* resolve.cc (resolve_assoc_var): Ignore this image coarray refs
and do not build a new class type.
* trans-expr.cc (gfc_get_caf_token_offset): Get the caf token
from the descriptor for associated variables.
(gfc_conv_variable): Same.
(gfc_trans_pointer_assignment): Assign token to temporary
associate variable, too.
(gfc_trans_scalar_assign): Add flag that assign is for associate
and use it to assign the token.
(is_assoc_assign): Detect that expressions are for associate
assign.
(gfc_trans_assignment_1): Treat associate assigns like pointer
assignments where possible.
* trans-stmt.cc (trans_associate_var): Set same_class only for
class-targets.
* trans.h (gfc_trans_scalar_assign): Add flag to
trans_scalar_assign for marking associate assignments.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/associate_1.f90: New test.

Add corank to gfc_expr.

Compute the corank of an expression alongside the regular rank.
This saves costly calls to gfc_get_corank (), which has consequently
been removed. In some locations the code needed some adaptation to model
the difference between expr.corank and gfc_get_corank correctly. The
latter always returned the codimension of the expression and not its
current corank, i.e. the resolution of all indices.

This commit is preparatory to fixing PR fortran/110033 and may contain
parts of that fix already.

gcc/fortran/ChangeLog:

* arith.cc (reduce_unary): Use expr.corank.
(reduce_binary_ac): Same.
(reduce_binary_ca): Same.
(reduce_binary_aa): Same.
* array.cc (gfc_match_array_ref): Same.
* check.cc (dim_corank_check): Same.
(gfc_check_move_alloc): Same.
(gfc_check_image_index): Same.
* class.cc (gfc_add_class_array_ref): Same.
(finalize_component): Same.
* data.cc (gfc_assign_data_value): Same.
* decl.cc (match_clist_expr): Same.
(add_init_expr_to_sym): Same.
* expr.cc (simplify_intrinsic_op): Same.
(simplify_parameter_variable): Same.
(gfc_check_assign_symbol): Same.
(gfc_get_variable_expr): Same.
(gfc_add_full_array_ref): Same.
(gfc_lval_expr_from_sym): Same.
(gfc_get_corank): Removed.
* frontend-passes.cc (callback_reduction): Use expr.corank.
(create_var): Same.
(combine_array_constructor): Same.
(optimize_minmaxloc): Same.
* gfortran.h (gfc_get_corank): Add corank to gfc_expr.
* intrinsic.cc (gfc_get_intrinsic_function_symbol): Use
expr.corank.
(gfc_convert_type_warn): Same.
(gfc_convert_chartype): Same.
* iresolve.cc (resolve_bound): Same.
(gfc_resolve_cshift): Same.
(gfc_resolve_eoshift): Same.
(gfc_resolve_logical): Same.
(gfc_resolve_matmul): Same.
* match.cc (copy_ts_from_selector_to_associate): Same.
* matchexp.cc (gfc_get_parentheses): Same.
* parse.cc (parse_associate): Same.
* primary.cc (gfc_match_rvalue): Same.
* resolve.cc (resolve_structure_cons): Same.
(resolve_actual_arglist): Same.
(resolve_elemental_actual): Same.
(resolve_generic_f0): Same.
(resolve_unknown_f): Same.
(resolve_operator): Same.
(gfc_expression_rank): Same and set dimen_type for coarray to
default.
(gfc_op_rank_conformable): Use expr.corank.
(add_caf_get_intrinsic): Same.
(resolve_variable): Same.
(gfc_fixup_inferred_type_refs): Same.
(check_host_association): Same.
(resolve_compcall): Same.
(resolve_expr_ppc): Same.
(resolve_assoc_var): Same.
(fixup_array_ref): Same.
(resolve_select_type): Same.
(add_comp_ref): Same.
(get_temp_from_expr): Same.
(resolve_fl_var_and_proc): Same.
(resolve_symbol): Same.
* symbol.cc (gfc_is_associate_pointer): Same.
* trans-array.cc (walk_coarray): Same.
(gfc_conv_expr_descriptor): Same.
(gfc_walk_array_ref): Same.
* trans-array.h (gfc_walk_array_ref): Same.
* trans-expr.cc (gfc_get_ultimate_alloc_ptr_comps_caf_token):
Same.
* trans-intrinsic.cc (trans_this_image): Same.
(trans_image_index): Same.
(conv_intrinsic_cobound): Same.
(gfc_walk_intrinsic_function): Same.
(conv_intrinsic_move_alloc): Same.
* trans-stmt.cc (gfc_trans_lock_unlock): Same.
(trans_associate_var): Same and adapt to slightly different
behaviour of expr.corank and gfc_get_corank.
(gfc_trans_allocate): Same.
* trans.cc (gfc_add_finalizer_call): Same.

c++: c->B::m access resolved through current inst [PR116320]

Here when checking the access of (the injected-class-name) B in c->B::m
at parse time, we notice its context B (now the type) is a base of the
object type C<T>, so we proceed to use C<T> as the effective qualifying
type. But this C<T> is the dependent specialization not the primary
template type, so it has empty TYPE_BINFO, which leads to a segfault later
from perform_or_defer_access_check.

The reason the DERIVED_FROM_P (B, C<T>) test guarding this code path works
despite C<T> having empty TYPE_BINFO is because of its currently_open_class
logic (added in r9-713-gd9338471b91bbe) which replaces a dependent
specialization with the primary template type if we're inside it. So the
safest fix seems to be to call currently_open_class in the caller as well.

PR c++/116320

gcc/cp/ChangeLog:

* semantics.cc (check_accessibility_of_qualified_id): Try
currently_open_class when using the object type as the
effective qualifying type.

gcc/testsuite/ChangeLog:

* g++.dg/template/access42.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++/coroutines: fix passing *this to promise type, again [PR116327]

In r15-2210 we got rid of the unnecessary cast to lvalue reference when
passing *this to the promise type ctor, and as a drive-by change we also
simplified the code to use cp_build_fold_indirect_ref.

But it turns out cp_build_fold_indirect_ref does too much here, namely
it has a shortcut for returning current_class_ref if the operand is
current_class_ptr. The problem with that shortcut is current_class_ref
might have gotten clobbered earlier if it appeared in the function body,
since rewrite_param_uses walks and rewrites in-place all local variable
uses to their corresponding frame copy.

So later cp_build_fold_indirect_ref for *this will instead return the
clobbered current_class_ref i.e. *frame_ptr->this, which doesn't make
sense here since we're in the ramp function and not the actor function
where frame_ptr is in scope.

This patch fixes this by using the build_fold_indirect_ref instead of
cp_build_fold_indirect_ref.

PR c++/116327
PR c++/104981
PR c++/115550

gcc/cp/ChangeLog:

* coroutines.cc (morph_fn_to_coro): Use build_fold_indirect_ref
instead of cp_build_fold_indirect_ref.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr104981-preview-this.C: Improve coverage by
adding a non-static data member use within the coroutine member
function.
* g++.dg/coroutines/pr116327-preview-this.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

LoongArch: Implement scalar isinf, isnormal, and isfinite via fclass

Doing so can avoid loading FP constants from the memory.  It also
partially fixes PR 66262 as fclass does not signal on sNaN.

gcc/ChangeLog:

* config/loongarch/loongarch.md (extendsidi2): Add ("=r", "f")
alternative and use movfr2gr.s for it.  The spec clearly states
movfr2gr.s sign extends the value to GRLEN.
(fclass_<fmt>): Make the result SImode instead of a floating
mode.  The fclass results are really not FP values.
(FCLASS_MASK): New define_int_iterator.
(fclass_optab): New define_int_attr.
(<FCLASS_MASK:fclass_optab><ANYF:mode>): New define_expand
template.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/fclass-compile.c: New test.
* gcc.target/loongarch/fclass-run.c: New test.

Movement between GENERAL_REGS and SSE_REGS for TImode doesn't need secondary reload.

It results in 2 failures for x86_64-pc-linux-gnu{\
-march=cascadelake};

gcc: gcc.target/i386/extendditi3-1.c scan-assembler cqt?o
gcc: gcc.target/i386/pr113560.c scan-assembler-times \tmulq 1

For pr113560.c, now GCC generates mulx instead of mulq with
-march=cascadelake, which should be optimal, so adjust testcase for
that.
For gcc.target/i386/extendditi2-1.c, RA happens to choose another
register instead of rax, resulting in

movq %rdi, %rbp
movq %rdi, %rax
sarq $63, %rbp
movq %rbp, %rdx

The patch adds a new define_peephole2 for that.

gcc/ChangeLog:

PR target/116274
* config/i386/i386-expand.cc (ix86_expand_vector_move):
Restrict special case TImode to 128-bit vector conversions via
V2DI under ix86_pre_reload_split ().
* config/i386/i386.cc (inline_secondary_memory_needed):
Movement between GENERAL_REGS and SSE_REGS for TImode doesn't
need secondary reload.
* config/i386/i386.md (*extendsidi2_rex64): Add a
define_peephole2 after it.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116274.c: New test.
* gcc.target/i386/pr113560.c: Scan either mulq or mulx.

aarch64: Rename svpext to svpext_lane [PR116371]

When implementing the SME2 ACLE, I somehow missed off the _lane
suffix on svpext.

gcc/
PR target/116371
* config/aarch64/aarch64-sve-builtins-sve2.h (svpext): Rename to...
(svpext_lane): ...this.
* config/aarch64/aarch64-sve-builtins-sve2.cc (svpext_impl): Rename
to...
(svpext_lane_impl): ...this and update instantiation accordingly.
* config/aarch64/aarch64-sve-builtins-sve2.def (svpext): Rename to...
(svpext_lane): ...this.

gcc/testsuite/
PR target/116371
* gcc.target/aarch64/sme2/acle-asm/pext_c16.c,
gcc.target/aarch64/sme2/acle-asm/pext_c16_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_c32.c,
gcc.target/aarch64/sme2/acle-asm/pext_c32_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_c64.c,
gcc.target/aarch64/sme2/acle-asm/pext_c64_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_c8.c,
gcc.target/aarch64/sme2/acle-asm/pext_c8_x2.c: Replace with...
* gcc.target/aarch64/sme2/acle-asm/pext_lane_c16.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c16_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c32.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c32_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c64.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c64_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c8.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c8_x2.c: ...these new tests,
testing for svpext_lane instead of svpext.

rs6000: Add TARGET_FLOAT128_HW guard for quad-precision insns

gcc/
* config/rs6000/rs6000.md (floatti<mode>2, floatunsti<mode>2,
fix_trunc<mode>ti2): Add guard TARGET_FLOAT128_HW.
* config/rs6000/vsx.md (xsxexpqp_<IEEE128:mode>_<V2DI_DI:mode>,
xsxsigqp_<IEEE128:mode>_<VEC_TI:mode>, xsiexpqpf_<mode>,
xsiexpqp_<IEEE128:mode>_<V2DI_DI:mode>, xscmpexpqp_<code>_<mode>,
*xscmpexpqp, xststdcnegqp_<mode>): Replace guard TARGET_P9_VECTOR
with TARGET_FLOAT128_HW.
(xststdc_<mode>, *xststdc_<mode>, isinf<mode>2): Add guard
TARGET_FLOAT128_HW for the IEEE128 modes.

gcc/testsuite/
* gcc.target/powerpc/float128-cmp2-runnable.c: Replace
ppc_float128_sw with ppc_float128_hw and remove p9vector_hw.

rs6000: Implement optab_isnormal for SFDF and IEEE128

gcc/
PR target/97786
* config/rs6000/vsx.md (isnormal<mode>2): New expand.

gcc/testsuite/
PR target/97786
* gcc.target/powerpc/pr97786-7.c: New test.
* gcc.target/powerpc/pr97786-8.c: New test.

rs6000: Implement optab_isfinite for SFDF and IEEE128

gcc/
PR target/97786
* config/rs6000/vsx.md (isfinite<mode>2): New expand.

gcc/testsuite/
PR target/97786
* gcc.target/powerpc/pr97786-4.c: New test.
* gcc.target/powerpc/pr97786-5.c: New test.

rs6000: Implement optab_isinf for SFDF and IEEE128

gcc/
PR target/97786
* config/rs6000/rs6000.md (constant VSX_TEST_DATA_CLASS_NAN,
VSX_TEST_DATA_CLASS_POS_INF, VSX_TEST_DATA_CLASS_NEG_INF,
VSX_TEST_DATA_CLASS_POS_ZERO, VSX_TEST_DATA_CLASS_NEG_ZERO,
VSX_TEST_DATA_CLASS_POS_DENORMAL, VSX_TEST_DATA_CLASS_NEG_DENORMAL):
Define.
(mode_attr sdq, vsx_altivec, wa_v, x): Define.
(mode_iterator IEEE_FP): Define.
* config/rs6000/vsx.md (isinf<mode>2): New expand.
(expand xststdcqp_<mode>, xststdc<sd>p): Combine into...
(expand xststdc_<mode>): ...this.
(insn *xststdcqp_<mode>, *xststdc<sd>p): Combine into...
(insn *xststdc_<mode>): ...this.
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Rename
CODE_FOR_xststdcqp_kf as CODE_FOR_xststdc_kf,
CODE_FOR_xststdcqp_tf as CODE_FOR_xststdc_tf.
* config/rs6000/rs6000-builtins.def: Rename xststdcdp as xststdc_df,
xststdcsp as xststdc_sf, xststdcqp_kf as xststdc_kf.

gcc/testsuite/
PR target/97786
* gcc.target/powerpc/pr97786-1.c: New test.
* gcc.target/powerpc/pr97786-2.c: New test.

Value Range: Add range op for builtin isnormal

The former patch adds optab for builtin isnormal. Thus builtin isnormal
might not be folded at front end. So the range op for isnormal is needed
for value range analysis. This patch adds range op for builtin isnormal.

gcc/
* gimple-range-op.cc (class cfn_isfinite): New.
(op_cfn_finite): New variables.
(gimple_range_op_handler::maybe_builtin_call): Handle
CFN_BUILT_IN_ISFINITE.
* value-range.h (class frange): Declare known_isnormal and
known_isdenormal_or_zero.
(frange::known_isnormal): Define.
(frange::known_isdenormal_or_zero): Define.

gcc/testsuite/
* gcc.dg/tree-ssa/range-isnormal.c: New test.

Value Range: Add range op for builtin isfinite

The former patch adds optab for builtin isfinite. Thus builtin isfinite
might not be folded at front end. So the range op for isfinite is needed
for value range analysis. This patch adds range op for builtin isfinite.

gcc/
* gimple-range-op.cc (class cfn_isfinite): New.
(op_cfn_finite): New variables.
(gimple_range_op_handler::maybe_builtin_call): Handle
CFN_BUILT_IN_ISFINITE.

gcc/testsuite/
* gcc.dg/tree-ssa/range-isfinite.c: New test.

Value Range: Add range op for builtin isinf

The builtin isinf is not folded at front end if the corresponding optab
exists. So the range op for isinf is needed for value range analysis.
This patch adds range op for builtin isinf.
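
As a rough illustration of what such a range op buys, the toy sketch below folds isinf when the operand's range already decides the answer. `ToyRange` and `fold_isinf` are hypothetical names invented for this sketch; this is not GCC's frange API, and NaN handling is reduced to "give up".

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <optional>

// Hypothetical toy stand-in for a floating-point value range [lo, hi].
// This is NOT GCC's frange class; it only illustrates the idea.
struct ToyRange {
  double lo;
  double hi;
};

// Returns 0 when no value in the range can be an infinity, 1 when every
// value in the range is an infinity, and nothing when the range cannot
// decide the answer.
std::optional<int> fold_isinf(const ToyRange &r) {
  if (std::isnan(r.lo) || std::isnan(r.hi))
    return std::nullopt;            // toy model: give up on NaN bounds
  if (!std::isinf(r.lo) && !std::isinf(r.hi))
    return 0;                       // e.g. [-1.0, 1.0]: isinf is always false
  if (r.lo == r.hi)
    return 1;                       // singleton [+Inf, +Inf] or [-Inf, -Inf]
  return std::nullopt;              // mixes finite and infinite values
}

void example_usage() {
  const double inf = std::numeric_limits<double>::infinity();
  assert(fold_isinf({-1.0, 1.0}).value() == 0);
  assert(fold_isinf({inf, inf}).value() == 1);
  assert(!fold_isinf({0.0, inf}).has_value());
}
```

When the front end no longer folds the builtin, a fold of this kind is what lets value range analysis still remove checks whose outcome is already known.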

gcc/
PR target/114678
* gimple-range-op.cc (class cfn_isinf): New.
(op_cfn_isinf): New variables.
(gimple_range_op_handler::maybe_builtin_call): Handle
CASE_FLT_FN (BUILT_IN_ISINF).

gcc/testsuite/
PR target/114678
* gcc.dg/tree-ssa/range-isinf.c: New test.
* gcc.dg/tree-ssa/range-sincos.c: Remove xfail for s390.
* gcc.dg/tree-ssa/vrp-float-abs-1.c: Likewise.

Daily bump.

c++: ICE with NSDMIs and fn arguments [PR116015]

The problem in this PR is that we ended up with

  {.rows=(&<PLACEHOLDER_EXPR struct Widget>)->n,
   .outer_stride=(&<PLACEHOLDER_EXPR struct MatrixLayout>)->rows}

that is, two PLACEHOLDER_EXPRs for different types on the same level
in one { }.  That should not happen; we may, for instance, neglect to
replace a PLACEHOLDER_EXPR due to CONSTRUCTOR_PLACEHOLDER_BOUNDARY on
the constructor.

The same problem happened in PR100252, which I fixed by introducing
replace_placeholders_for_class_temp_r.  That didn't work here, though,
because r_p_for_c_t_r only works for non-eliding TARGET_EXPRs: replacing
a PLACEHOLDER_EXPR with a temporary that is going to be elided will
result in a crash in gimplify_var_or_parm_decl when it encounters such
a loose decl.

But leaving the PLACEHOLDER_EXPRs in is also bad because then we end
up with this PR.

TARGET_EXPRs for function arguments are elided in gimplify_arg.  The
argument will get a real temporary only in get_formal_tmp_var.  One
idea was to use the temporary that is going to be elided anyway, and
then replace_decl it with the real object once we get it.  But that
didn't work out: one problem is that we elide the TARGET_EXPR for an
argument before we create the real temporary for the argument, and
when we get it, the context that this was a TARGET_EXPR for an argument
has been lost.  We're also in the middle end territory now, even though
this is a C++-specific problem.

A solution is to simply stop eliding TARGET_EXPRs whose initializer is
a CONSTRUCTOR.  Such copies can't be (at the moment) elided anyway.  But
not eliding all TARGET_EXPRs would be a pessimization.
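
For orientation, here is a minimal sketch of the kind of source that produces two nested NSDMIs in a function argument. The type names follow the dump above, but the code is an assumption made for illustration and is not the testcase added by this patch.

```cpp
#include <cassert>

// Hypothetical types modelled on the names in the dump above.
struct MatrixLayout {
  int rows = 0;
  int outer_stride = rows;          // NSDMI that reads MatrixLayout::rows
};

struct Widget {
  int n = 0;
  MatrixLayout layout = {n};        // NSDMI that reads Widget::n
};

// Passing the aggregate by value makes the argument a temporary whose
// initializer is a braced aggregate initialization.
int stride_sum(Widget w) {
  return w.layout.rows + w.layout.outer_stride;
}

void example_usage() {
  // Both NSDMIs must be resolved against the right object: n -> rows -> outer_stride.
  assert(stride_sum(Widget{5}) == 10);
}
```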

PR c++/116015

gcc/cp/ChangeLog:

* call.cc (convert_for_arg_passing): Don't set_target_expr_eliding
when the TARGET_EXPR initializer is a CONSTRUCTOR.

gcc/ChangeLog:

* gimplify.cc (gimplify_arg): Do not strip a TARGET_EXPR whose
initializer is a CONSTRUCTOR.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/nsdmi-aggr23.C: New test.

s390: Remove vector intrinsics

The following intrinsics are not implemented. Thus, remove them.

gcc/ChangeLog:

* config/s390/vecintrin.h (vec_vstbrh): Remove.
(vec_vstbrf): Remove.
(vec_vstbrg): Remove.
(vec_vstbrq): Remove.
(vec_vstbrf_flt): Remove.
(vec_vstbrg_dbl): Remove.
(vec_vsterb): Remove.
(vec_vsterh): Remove.
(vec_vsterf): Remove.
(vec_vsterg): Remove.
(vec_vsterf_flt): Remove.
(vec_vsterg_dbl): Remove.

s390: Fix high-level builtins vec_gfmsum{,_accum}_128

Starting with r14-9449-g9f2b16ce1efef0 builtins were streamlined with
those in LLVM.  In particular s390_vgfm{,a}g have been changed from
UV16QI to UINT128 in order to match those in LLVM.  However, these
low-level builtins are directly used by the high-level builtins
vec_gfmsum{,_accum}_128 which expect UV16QI instead.  Therefore,
introduce new low-level builtins s390_vgfm{,a}g_128 and make use of
them, respectively.

gcc/ChangeLog:

* config/s390/s390-builtin-types.def (BT_FN_UV16QI_UV2DI_UV2DI):
New.
(BT_FN_UV16QI_UV2DI_UV2DI_UV16QI): New.
* config/s390/s390-builtins.def (s390_vgfmg_128): New.
(s390_vgfmag_128): New.
* config/s390/vecintrin.h (vec_gfmsum_128): Use s390_vgfmg_128.
(vec_gfmsum_accum_128): Use s390_vgfmag_128.

Fortran: fix minor frontend GMP leaks

gcc/fortran/ChangeLog:

* simplify.cc (gfc_simplify_sizeof): Clear used gmp variable.
* target-memory.cc (gfc_target_expr_size): Likewise.

i386: Optimization for APX NDD is always zero-uppered for shift

gcc/ChangeLog:

PR target/113729
* config/i386/i386.md (*ashlqi3_1_zext<mode><nf_name>):
New define_insn.
(*ashlhi3_1_zext<mode><nf_name>): Ditto.
(*<insn>qi3_1_zext<mode><nf_name>): Ditto.
(*<insn>hi3_1_zext<mode><nf_name>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113729.c: Add testcase for shift and rotate.

i386: Optimization for APX NDD is always zero-uppered for logic

gcc/ChangeLog:

PR target/113729
* config/i386/i386.md (*andqi_1_zext<mode><nf_name>): New
define_insn.
(*andhi_1_zext<mode><nf_name>): Ditto.
(*<code>qi_1_zext<mode><nf_name>): Ditto.
(*<code>hi_1_zext<mode><nf_name>): Ditto.
(*negqi_1_zext<mode><nf_name>): Ditto.
(*neghi_1_zext<mode><nf_name>): Ditto.
(*one_cmplqi2_1_zext<mode>): Ditto.
(*one_cmplhi2_1_zext<mode>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113729.c: Add more tests.

i386: Optimization for APX NDD is always zero-uppered for sub/adc/sbb

gcc/ChangeLog:

PR target/113729
* config/i386/i386.md (*subqi_1_zext<mode><nf_name>): New
define_insn.
(*subhi_1_zext<mode><nf_name>): Ditto.
(*addqi3_carry_zext<mode>): Ditto.
(*addhi3_carry_zext<mode>): Ditto.
(*addqi3_carry_zext<mode>_0): Ditto.
(*addhi3_carry_zext<mode>_0): Ditto.
(*addqi3_carry_zext<mode>_0r): Ditto.
(*addhi3_carry_zext<mode>_0r): Ditto.
(*subqi3_carry_zext<mode>): Ditto.
(*subhi3_carry_zext<mode>): Ditto.
(*subqi3_carry_zext<mode>_0): Ditto.
(*subhi3_carry_zext<mode>_0): Ditto.
(*subqi3_carry_zext<mode>_0r): Ditto.
(*subhi3_carry_zext<mode>_0r): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113729.c: Add more tests.
* gcc.target/i386/pr113729-adc-sbb.c: New test.

i386: Optimization for APX NDD is always zero-uppered for ADD

gcc/ChangeLog:

PR target/113729
* config/i386/i386.md (*addqi_1_zext<mode><nf_name>): New
define.
(*addhi_1_zext<mode><nf_name>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113729.c: New test.

Restrict pr116202-run-1.c test to riscv_v target

The testcase uses -march=rv64gcv and dg-do run, so should be
restricted to a riscv_v target.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr116202-run-1.c (dg-do run):
Add target riscv_v.

Prevent future proc_ptr parsing issues in associate [PR102973]

A global variable is set when proc_ptr parsing in an associate is
expected. In the case of an error, that flag was not reset, which is
fixed now.

gcc/fortran/ChangeLog:

PR fortran/102973

* match.cc (gfc_match_associate): Reset proc_ptr parsing flag on
error.

Fix ICE in build_function_decl [PR116292]

Fix ICE by getting the vtype only when a derived or class type is
present. Also take care of the _len component for unlimited
polymorphics.

gcc/fortran/ChangeLog:

PR fortran/116292

* trans-intrinsic.cc (conv_intrinsic_move_alloc): Get the vtab
only for derived types and classes and adjust _len for class
types.

gcc/testsuite/ChangeLog:

* gfortran.dg/move_alloc_19.f90: New test.

genoutput: Accelerate the place_operands function.

With the increase in the number of modes and patterns for some
backend architectures, the place_operands function becomes a
bottleneck in the speed of genoutput, and may even become a
bottleneck in the overall speed of building the GCC project.
This patch aims to accelerate the place_operands function.
The optimizations it includes are:
1. Use a hash table to store operand information,
improving the lookup time for the first operand.
2. Move the mode comparison to the beginning to avoid most strcmp calls.

I tested the speed improvements for the following backends:

  Backend    Improvement Ratio
  x86_64     197.9%
  aarch64    954.5%
  riscv      2578.6%

If the build machine is slow, then this improvement can save a lot of time.

I tested the genoutput output for x86_64/aarch64/riscv backends,
and there was no difference compared to before the optimization,
so this shouldn't introduce any functional issues.
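
The two optimizations can be sketched with standard containers as follows. This is only an illustration under assumptions: `Operand`, `OperandHash`, `OperandEqual` and `place_operand` are hypothetical names, the record is much simpler than genoutput's operand_data, and the patch uses GCC's own hash table rather than std::unordered_set.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical, simplified operand record.
struct Operand {
  int mode;
  std::string predicate;
  std::string constraint;
};

struct OperandHash {
  std::size_t operator()(const Operand &op) const {
    std::size_t h = std::hash<int>{}(op.mode);
    h = h * 31 + std::hash<std::string>{}(op.predicate);
    h = h * 31 + std::hash<std::string>{}(op.constraint);
    return h;
  }
};

struct OperandEqual {
  bool operator()(const Operand &a, const Operand &b) const {
    // Cheap integer comparison first; the string comparisons only run
    // when the modes already match.
    return a.mode == b.mode
           && a.predicate == b.predicate
           && a.constraint == b.constraint;
  }
};

using OperandTable = std::unordered_set<Operand, OperandHash, OperandEqual>;

// Returns true if the operand was new, false if an equal one was already
// placed. The lookup is a hash probe instead of a scan over every operand.
bool place_operand(OperandTable &table, const Operand &op) {
  return table.insert(op).second;
}

void example_usage() {
  OperandTable table;
  assert(place_operand(table, {1, "register_operand", "=r"}));
  assert(!place_operand(table, {1, "register_operand", "=r"}));
  assert(place_operand(table, {2, "register_operand", "=r"}));
}
```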

gcc/
* genoutput.cc (struct operand_data): Add member 'eq_next' to
point to the next member with the same hash value in the
hash table.
(compare_operands): Move the comparison of the mode to the very
beginning to accelerate the comparison of the two operands.
(struct operand_data_hasher): New, a class that takes into account
the necessary elements for comparing the equality of two operands
in its hash value.
(operand_data_hasher::hash): New.
(operand_data_hasher::equal): New.
(operand_datas): New, hash table of known pattern operands.
(place_operands): Use a hash table instead of traversing the array
to find the same operand.
(main): Add initialization of the hash table 'operand_datas'.

Revert "[rtl-optimization/116244] Don't create bogus regs in alter_subreg"

This reverts commit e9738e77674e23f600315ca1efed7d1c7944d0cc.

testsuite: Fix fam-in-union-alone-in-struct-2.c with unsigned char [PR116148]

As PR116148#c7 shows, fam-in-union-alone-in-struct-2.c still
fails on hppa, which is a BE environment.  On closer inspection
(also confirmed by John in PR116148#c12), the failure is caused
by plain char being signed on hppa: the value "8f" of
with_fam_3_v.a[7] gets sign-extended to "ffffff8f", so the
verification fails.  This patch changes plain char to unsigned
char to avoid that.
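
The effect is easy to reproduce in isolation. The sketch below assumes a 32-bit int and uses explicit signed char and unsigned char so that the result does not depend on the target's choice for plain char; the helper names are made up for the illustration.

```cpp
#include <cassert>

// Widen a byte stored as signed char: the sign bit is copied upwards.
unsigned widen_signed(signed char c) {
  return static_cast<unsigned>(static_cast<int>(c));
}

// Widen a byte stored as unsigned char: the upper bits stay zero.
unsigned widen_unsigned(unsigned char c) {
  return static_cast<unsigned>(c);
}

void example_usage() {
  signed char s = static_cast<signed char>(0x8f);
  unsigned char u = 0x8f;
  assert(widen_signed(s) == 0xffffff8fu);   // what a signed plain char yields
  assert(widen_unsigned(u) == 0x8fu);       // what the test expects
}
```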

PR testsuite/116148

gcc/testsuite/ChangeLog:

* c-c++-common/fam-in-union-alone-in-struct-2.c: Change the type of
member a[] of union with_fam_3 to unsigned char.

Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area.

gcc/ChangeLog:

PR target/116174
* config/i386/i386.cc (ix86_align_loops): Move this to ..
* config/i386/i386-features.cc (ix86_align_loops): .. here.
(class pass_align_tight_loops): New class.
(make_pass_align_tight_loops): New function.
* config/i386/i386-passes.def: Insert pass_align_tight_loops
after pass_insert_endbr_and_patchable_area.
* config/i386/i386-protos.h (make_pass_align_tight_loops): New
declare.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116174.c: New test.

Daily bump.

testsuite: Fix struct size check [PR116155]

The size of "struct only_fam_2" is dependent on the alignment of the
flexible array member "b", and not on the type of the preceding
bit-fields. For most targets the two are equal. But on default_packed
targets like pru-unknown-elf, the alignment of int is not equal to the
size of int, so the test failed.

Patch was suggested by Qing Zhao. Tested on pru-unknown-elf and
x86_64-pc-linux-gnu.

PR testsuite/116155

gcc/testsuite/ChangeLog:

* c-c++-common/fam-in-union-alone-in-struct-1.c: Adjust
check to account for default_packed targets.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

ifcvt: Fix force_operand ICE in noce_convert_multiple_sets [PR116353]

Now that more operations are allowed for noce_convert_multiple_sets,
we need to check noce_can_force_operand on the sequence before calling
try_emit_cmove_seq. Otherwise an inappropriate argument may be given
to copy_to_mode_reg and result in an ICE.

PR tree-optimization/116353

gcc/ChangeLog:

* ifcvt.cc (bb_ok_for_noce_convert_multiple_sets): Check
noce_can_force_operand.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116353.c: New test.

Fortran: reject array constructor value of abstract type [PR114308]

gcc/fortran/ChangeLog:

PR fortran/114308
* array.cc (resolve_array_list): Reject array constructor value if
its declared type is abstract (F2018:C7114).

gcc/testsuite/ChangeLog:

PR fortran/114308
* gfortran.dg/abstract_type_10.f90: New test.

Co-Authored-By: Steven G. Kargl <kargl@gcc.gnu.org>

RISC-V: Fix non-obvious comment typos

This fixes the remainder of the typos I found when reading various parts of the
RISC-V backend.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (legitimize_move): extrac -> extract.
(expand_vec_cmp_float): Remove duplicate vmnor.mm.
* config/riscv/riscv-vector-builtins.cc: ins -> insns.
* config/riscv/riscv.cc (riscv_init_machine_status): mwrvv -> mrvv.
* config/riscv/vector-iterators.md: RVVM8QImde -> RVVM8QImode
* config/riscv/vector.md: Replaced non-existent vsetivl with vsetivli.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

Internal-fn: Handle vector bool type for type strict match mode [PR116103]

For some targets, such as amdgcn-amdhsa, we need to handle vector
bool types before general vector mode types.  Otherwise the asm
checks below fail.

gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump-not ivopts "zero if "

The following test suites pass with this patch.
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.
4. The amdgcn test cases above.

PR target/116103

gcc/ChangeLog:

* internal-fn.cc (type_strictly_matches_mode_p): Add handling
for vector bool type.

Signed-off-by: Pan Li <pan2.li@intel.com>

LRA: Don't emit move for substituted CONSTANT_P operand [PR116170]

Commit r15-2084 exposes one ICE in LRA.  Firstly, before
r15-2084 KFmode has 126 bit precision while V1TImode has 128
bit precision, so the subreg (subreg:V1TI (reg:KF 131) 0) is
paradoxical_subreg_p, which stops some passes from doing
some optimization.  After r15-2084, KFmode has the same mode
precision as V1TImode, passes are able to optimize more, but
it causes this ICE in LRA as described below:

For insn 106 (set (mem:V1TI ...) (subreg:V1TI (reg:KF 133) 0)),
which matches pattern

(define_insn "*vsx_le_perm_store_<mode>"
  [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
        (match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]
  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
   && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
  "@
   #
   #"
  [(set_attr "type" "vecstore,store")
   (set_attr "length" "12,8")
   (set_attr "isa" "<VSisa>,*")])

LRA makes equivalence substitution on r133 with const double
(const_double:KF 0.0), selects alternative 0 and fixes up
operand 1 for constraint "wa", because operand 1 is OP_INOUT,
so it considers assigning back to it as well, that is:

  lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);

But because old has been changed to const_double in equivalence
substitution, the move is actually assigning to const_double,
which is invalid and causes an ICE.

Considering reg:KF 133 is equivalent with (const_double:KF 0.0)
even though this operand is OP_INOUT, IMHO there should not be
any following uses of reg:KF 133, otherwise it doesn't have the
chance to be equivalent to (const_double:KF 0.0).  So this patch
is to guard the lra_emit_move with !CONSTANT_P to exclude such
case.

PR rtl-optimization/116170

gcc/ChangeLog:

* lra-constraints.cc (curr_insn_transform): Don't emit move back to
old operand if it's CONSTANT_P.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr116170.c: New test.

Regenerate avr.opt.urls

avr added an -mlra option, but the avr.opt.urls file wasn't
regenerated.

Note that commit 149a23ee2568 ("AVR: -mlra is not documeted in TEXI.")
did add the Undocumented flag, but that still needs the avr.opt.urls
file to be updated.

Fixes: 09a87ea666b2 ("AVR: ad target/113934 - Add option -mlra to enable LRA.")
gcc/ChangeLog:

* config/avr/avr.opt.urls: Regenerate.

Daily bump.

rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

Only disable shrink-wrapping when using -mrop-protect when we know we
will be emitting the ROP-protect hash instructions (ie, non-leaf functions).

2024-06-17 Peter Bergner <bergner@linux.ibm.com>

gcc/
PR target/114759
* config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move
the disabling of shrink-wrapping from here....
* config/rs6000/rs6000-logue.cc (rs6000_emit_prologue): ...to here.

gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-1.c: New test.

RISC-V: Fix missing abi arg in test

The following test was failing when building on 32 bit targets
due to not overwriting the mabi arg. This resulted in dejagnu
attempting to run the test with -mabi=ilp32d -march=rv64gcv_zvl256b

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr116202-run-1.c: Add mabi arg

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>

[rtl-optimization/116244] Don't create bogus regs in alter_subreg

So this is another nasty latent bug exposed by ext-dce.

Similar to the prior m68k failure it's another problem with how we handle
paradoxical subregs on big endian targets.

In this instance when we remove the hard subregs we take something like:

(subreg:DI (reg:SI 0) 0)

And turn it into

(reg:SI -1)

Which is clearly wrong.  (reg:SI 0) is correct.

The transformation happens in alter_subreg, but I really wanted to fix this in
subreg_regno since we could have similar problems in some of the other callers
of subreg_regno.

Unfortunately reload depends on the current behavior of subreg_regno; in the
cases where the return value is an invalid register, the wrong half of a
register pair, etc the resulting bogus value is detected by reload and triggers
reloading of the inner object.  So that's the new comment in subreg_regno.

The second best place to fix is alter_subreg which is what this patch does.  If
presented with a paradoxical subreg, then the base register number should
always be REGNO (SUBREG_REG (object)).  It's just how paradoxicals are designed
to work.

I haven't tried to fix the other places that call subreg_regno.  After being
burned by reload, I'm more than a bit worried about unintended fallout.

I must admit I'm surprised we haven't stumbled over this before and that it
didn't fix any failures on the big endian embedded targets.

Bootstrapped & regression tested on x86_64, also went through all the embedded
targets in my tester and bootstrapped on m68k & s390x to get some additional
big endian testing.

Pushing to the trunk.

rtl-optimization/116244
gcc/
* rtlanal.cc (subreg_regno): Update comment.
* final.cc (alter_subreg): Always use REGNO (SUBREG_REG ()) to get
the base register for paradoxical subregs.

gcc/testsuite/
* g++.target/m68k/m68k.exp: New test driver.
* g++.target/m68k/pr116244.C: New test.

borrowck: Fix debug prints on 32-bits architectures

gcc/rust/ChangeLog:

* checks/errors/borrowck/rust-bir-builder.h: Cast size_t values to unsigned
long before printing.
* checks/errors/borrowck/rust-bir-fact-collector.h: Likewise.
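
A minimal sketch of the cast described above, using a hypothetical helper `format_index` and plain snprintf rather than the borrow checker's own debug macros:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format a size_t with "%lu" by casting to unsigned long first.
// Passing a size_t straight to "%lu" is only correct on targets where
// size_t happens to be unsigned long; on 32-bit x86 it is unsigned int.
std::string format_index(std::size_t index) {
  char buf[32];
  std::snprintf(buf, sizeof buf, "%lu", static_cast<unsigned long>(index));
  return buf;
}

void example_usage() {
  assert(format_index(42) == "42");
  assert(format_index(0) == "0");
}
```

Where the formatting routine supports it, "%zu" is the standard alternative that avoids the cast.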

borrowck: Avoid overloading issues on 32bit architectures

On architectures where `size_t` is `unsigned int`, such as 32bit x86,
we encounter an issue with `PlaceId` and `FreeRegion` being aliases to
the same types. This poses an issue for overloading functions for these
two types, such as `push_subset` in that case. This commit renames one
of these `push_subset` functions to avoid the issue, but this should be
fixed with a newtype pattern for these two types.
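
The newtype pattern mentioned above could look roughly like this. It is a sketch under assumptions: the wrapper structs and the integer return values are invented for the illustration, and the real `PlaceId` and `FreeRegion` are not defined this way today.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical strong wrappers around the underlying index type.
struct PlaceId { std::size_t value; };
struct FreeRegion { std::size_t value; };

// The wrappers are distinct types, so these overloads never collide,
// whatever size_t is on the target.
int push_subset(PlaceId, PlaceId) { return 1; }
int push_subset(FreeRegion, FreeRegion) { return 2; }

void example_usage() {
  assert(push_subset(PlaceId{0}, PlaceId{1}) == 1);
  assert(push_subset(FreeRegion{0}, FreeRegion{1}) == 2);
}
```

With two plain aliases of the same integer type, both declarations would have the same signature and the second would be a redefinition, which is the problem this commit works around by renaming.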

gcc/rust/ChangeLog:

* checks/errors/borrowck/rust-bir-fact-collector.h (points): Rename
`push_subset(PlaceId, PlaceId)` to `push_subset_place(PlaceId, PlaceId)`

ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets

The existing implementation of need_cmov_or_rewire and
noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG.
This commit enhances them so they can handle/rewire arbitrary set statements.

To do that a new helper struct noce_multiple_sets_info is introduced which is
used by noce_convert_multiple_sets and its helper functions. This results in
cleaner function signatures, improved efficiency (a number of vecs and hash
set/map are replaced with a single vec of struct) and simplicity.

gcc/ChangeLog:

* ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_sets_info.
(init_noce_multiple_sets_info): Initialize noce_multiple_sets_info.
(noce_convert_multiple_sets_1): Use noce_multiple_sets_info and handle
rewiring of multiple registers.
(noce_convert_multiple_sets): Updated to use noce_multiple_sets_info.
* ifcvt.h (struct noce_multiple_sets_info): Introduce new struct
noce_multiple_sets_info to store info for noce_convert_multiple_sets.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ifcvt_multiple_sets_rewire.c: New test.

ifcvt: Allow more operations in multiple set if conversion

Currently the operations allowed for if conversion of a basic block
with multiple sets are few, namely REG, SUBREG and CONST_INT (as
controlled by bb_ok_for_noce_convert_multiple_sets).

This commit allows more operations (arithmetic, compare, etc) to
participate in if conversion. The target's profitability hook and
ifcvt's costing is expected to reject sequences that are unprofitable.

This is especially useful for targets which provide a rich selection
of conditional instructions (like aarch64 which has cinc, csneg,
csinv, ccmp, ...) which are currently not used in basic blocks with
more than a single set.

For targets that have a rich selection of conditional instructions,
like aarch64, we have seen an ~5x increase of profitable if
conversions for multiple set blocks in SPEC CPU 2017 benchmarks.
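
As an illustration, a block like the one below has two sets under one condition, both with arithmetic right-hand sides. The function is a made-up example, not one of the new tests, and whether it is actually if-converted depends on the target's cost model.

```cpp
#include <cassert>

// Both assignments are guarded by the same condition and neither is a
// plain register move.
int update_both(int a, int b, int x, int y) {
  if (a > b) {
    x = x + 1;   // candidate for a conditional increment
    y = -y;      // candidate for a conditional negate
  }
  return x + y;
}

void example_usage() {
  assert(update_both(2, 1, 10, 3) == 8);    // branch taken: 11 + (-3)
  assert(update_both(1, 2, 10, 3) == 13);   // branch not taken: 10 + 3
}
```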

gcc/ChangeLog:

* ifcvt.cc (try_emit_cmove_seq): Modify comments.
(noce_convert_multiple_sets_1): Modify comments.
(bb_ok_for_noce_convert_multiple_sets): Allow more operations.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ifcvt_multiple_sets_arithm.c: New test.

ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

This is an extension of what was done in PR106590.

Currently if a sequence generated in noce_convert_multiple_sets clobbers the
condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards
(sequences that emit the comparison itself). Since this applies only from the
next iteration it assumes that the sequences generated (in particular seq2)
don't clobber the condition rtx itself before using it in the if_then_else,
which is only true in specific cases (currently only register/subregister moves
are allowed).

This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in
the current iteration. It also checks whether the resulting sequence clobbers
the condition attached to the jump. This makes it possible to include arithmetic
operations in noce_convert_multiple_sets.

It also makes the code that checks whether the condition is used outside of the
if_then_else emitted more robust.

gcc/ChangeLog:

* ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
(noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.
Punt if seq clobbers cond. Refactor the code that sets read_comparison.

AVR: target/85624 - Fix non-matching alignment in clrmem* insns.

The clrmem* patterns don't use the provided alignment information,
hence the setmemhi expander can just pass down 0 as alignment to
the clrmem* insns.

PR target/85624
gcc/
* config/avr/avr.md (setmemhi): Set alignment to 0.

gcc/testsuite/
* gcc.target/avr/torture/pr85624.c: New test.

16-bit testsuite fixes - excessive code size

gcc/testsuite/
* gcc.c-torture/execute/20021120-1.c: Skip if not size20plus or -Os.
* gcc.dg/fixed-point/convert-float-4.c: Require size20plus.
* gcc.dg/torture/pr112282.c: Skip if -O0 unless size20plus.
* g++.dg/lookup/pr21802.C: Require size20plus.

This fixes problems with tests that exceed a data type or the maximum stack frame size on 16 bit targets.

Note: GCC has a limitation that a stack frame cannot exceed half the address space.

For two tests the decision to modify or skip them seems not so clear-cut;
I chose to modify gcc.dg/pr47893.c to use types that fit the numbers, as
that seemed to have little impact on the test, and skip gcc.dg/pr115646.c
for 16 bit, as layout of structs with bitfield members can have quite
subtle rules.

gcc/testsuite/
* gcc.dg/pr107523.c: Make sure variables can fit numbers.
* gcc.dg/pr47893.c: Add dg-require-effective-target size20plus clause.
* c-c++-common/torture/builtin-clear-padding-2.c:
dg-require-effective-target size20plus.
* gcc.dg/pr115646.c: dg-require-effective-target int32plus.
* c-c++-common/analyzer/coreutils-sum-pr108666.c:
For c++, expect a warning about exceeding maximum object size
if not size20plus.
* gcc.dg/torture/inline-mem-cpy-1.c:
Like the included file, dg-require-effective-target ptr32plus.
* gcc.dg/torture/inline-mem-cmp-1.c: Likewise.

Avoid cfg corruption when using sjlj exceptions where loops are present in the assign_params emitted code.

2024-08-06 Joern Rennecke <joern.rennecke@riscy-ip.com>

gcc/
* except.cc (sjlj_emit_function_enter):
Set fn_begin_outside_block again if encountering a jump instruction.

Use splay-tree-utils.h in tree-ssa-sccvn [PR30920]

This patch is an attempt to gauge opinion on one way of fixing PR30920.

The PR points out that the libiberty splay tree implementation does
not implement the algorithm described by Sleator and Tarjan and has
unclear complexity bounds.  (It's also somewhat dangerous in that
splay_tree_min and splay_tree_max walk the tree without splaying,
meaning that they are fully linear in the worst case, rather than
amortised logarithmic.)  These properties have been carried over
to typed-splay-tree.h.

We could fix those problems directly in the existing implementations,
and probably should for libiberty.  But when I added rtl-ssa, I also
added a third(!) splay tree implementation: splay-tree-utils.h.
In response to Jeff's understandable unease about having three
implementations, I was supposed to go back during the next stage 1
and reduce it to no more than two.  I never did that. :-(

splay-tree-utils.h is so called because rtl-ssa uses splay trees
in structures that are relatively small and very size-sensitive.
I therefore wanted to be able to embed the splay tree links directly
in the structures, rather than pay the penalty of using separate
nodes with one-way or two-way links between them.  There were also
operations for which it was convenient to treat the splay tree root
as an explicitly managed cursor, rather than treating the tree as
a pure ADT.  The interface is therefore a bit more low-level than
for the other implementations.

I wondered whether the same trade-offs might apply to users of
the libiberty splay trees.  The first one I looked at in detail
was SCC value numbering, which seemed like it would benefit from
using splay-tree-utils.h directly.

The patch does that.  It also adds a couple of new helper routines
to splay-tree-utils.h.

I don't expect this approach to be the right one for every use
of splay trees.  E.g. splay tree used for omp gimplification would
certainly need separate nodes.

gcc/
PR other/30920
* splay-tree-utils.h (rooted_splay_tree::insert_relative)
(rooted_splay_tree::lookup_le): New functions.
(rooted_splay_tree::remove_root_and_splay_next): Likewise.
* splay-tree-utils.tcc (rooted_splay_tree::insert_relative): New
function, extracted from...
(rooted_splay_tree::insert): ...here.
(rooted_splay_tree::lookup_le): New function.
(rooted_splay_tree::remove_root_and_splay_next): Likewise.
* tree-ssa-sccvn.cc (pd_range::m_children): New member variable.
(vn_walk_cb_data::vn_walk_cb_data): Initialize first_range.
(vn_walk_cb_data::known_ranges): Use a default_splay_tree.
(vn_walk_cb_data::~vn_walk_cb_data): Remove freeing of known_ranges.
(pd_range_compare, pd_range_alloc, pd_range_dealloc): Delete.
(vn_walk_cb_data::push_partial_def): Rewrite splay tree operations
to use splay-tree-utils.h.
* rtl-ssa/accesses.cc (function_info::add_use): Use insert_relative.

aarch64: Emit ADD X, Y, Y instead of SHL X, Y, #1 for Advanced SIMD

On many cores, including Neoverse V2 the throughput of vector ADD
instructions is higher than vector shifts like SHL.  We can lean on that
to emit code like:
  add     v0.4s, v0.4s, v0.4s
instead of:
  shl     v0.4s, v0.4s, 1

LLVM already does this trick.
In RTL the code gets canonicalised from (plus x x) to (ashift x 1) so I
opted to instead do this at the final assembly printing stage, similar
to how we emit CMLT instead of SSHR elsewhere in the backend.
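
The equivalence being relied on can be checked lane by lane. The sketch below models one 4 x 32-bit vector with std::array; `Lanes` and the two helpers are hypothetical names and no intrinsics are involved.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Lanes = std::array<std::uint32_t, 4>;   // stand-in for one v0.4s register

Lanes shift_left_by_one(Lanes v) {
  for (auto &lane : v) lane <<= 1;            // models shl v0.4s, v0.4s, 1
  return v;
}

Lanes add_to_self(Lanes v) {
  for (auto &lane : v) lane += lane;          // models add v0.4s, v0.4s, v0.4s
  return v;
}

void example_usage() {
  Lanes v = {1u, 0x80000000u, 0xffffffffu, 12345u};
  // Unsigned arithmetic wraps modulo 2^32, so both forms agree in every lane.
  assert(shift_left_by_one(v) == add_to_self(v));
}
```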

I'd like to also do this for SVE shifts, but those will have to be
separate patches.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/ChangeLog:

* config/aarch64/aarch64-simd.md
(aarch64_simd_imm_shl<mode><vczle><vczbe>): Rewrite to new
syntax.  Add =w,w,vs1 alternative.
* config/aarch64/constraints.md (vs1): New constraint.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/advsimd_shl_add.c: New test.

Fortran: Fix coarray in associate not linking [PR85510]

PR fortran/85510

gcc/fortran/ChangeLog:

* resolve.cc (resolve_variable): Mark the variable as host
associated only, when it is not in an associate block.
* trans-decl.cc (generate_coarray_init): Remove incorrect unused
flag on parameter.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/pr85510.f90: New test.

Initial support for AVX10.2

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features): Handle
avx10.2.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVX10_2_256_SET): New.
(OPTION_MASK_ISA2_AVX10_2_512_SET): Ditto.
(OPTION_MASK_ISA2_AVX10_1_256_UNSET):
Add OPTION_MASK_ISA2_AVX10_2_256_UNSET.
(OPTION_MASK_ISA2_AVX10_1_512_UNSET):
Add OPTION_MASK_ISA2_AVX10_2_512_UNSET.
(OPTION_MASK_ISA2_AVX10_2_256_UNSET): New.
(OPTION_MASK_ISA2_AVX10_2_512_UNSET): Ditto.
(ix86_handle_option): Handle avx10.2-256 and avx10.2-512.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_AVX10_2_256 and FEATURE_AVX10_2_512.
* common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY for
avx10.2-256 and avx10.2-512.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVX10_2_256__ and __AVX10_2_512__.
* config/i386/i386-isa.def (AVX10_2): Add DEF_PTA(AVX10_2_256)
and DEF_PTA(AVX10_2_512).
* config/i386/i386-options.cc (isa2_opts): Add -mavx10.2-256 and
-mavx10.2-512.
(ix86_valid_target_attribute_inner_p): Handle avx10.2-256 and
avx10.2-512.
* config/i386/i386.opt: Add option -mavx10.2, -mavx10.2-256 and
-mavx10.2-512.
* config/i386/i386.opt.urls: Regenerated.
* doc/extend.texi: Document avx10.2, avx10.2-256 and avx10.2-512.
* doc/invoke.texi: Document -mavx10.2, -mavx10.2-256 and
-mavx10.2-512.
* doc/sourcebuild.texi: Document target avx10.2, avx10.2-256,
avx10.2-512.

gcc/testsuite/ChangeLog:

* g++.dg/other/i386-2.C: Ditto.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/sse-12.c: Ditto.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.

PR target/116275: Handle STV of *extenddi2_doubleword_highpart on i386.

This patch resolves PR target/116275, a recent ICE-on-valid regression on
-m32 caused by my recent change to enable STV of DImode arithmetic right
shift on non-AVX512VL targets.  The oversight is that the i386 backend
contains an *extenddi2_doubleword_highpart instruction (whose pattern
is an arithmetic right shift of a left shift) that optimizes the case where
sign-extension need only update the highpart word of a DImode value when
generating 32-bit code (!TARGET_64BIT).  STV accepts this pattern as a
candidate, as there are patterns to handle this form of extension on SSE
using AVX512VL instructions (and previously ASHIFTRT was only allowed on
AVX512VL).  Now that ASHIFTRT is a candidate on non-AVX512VL targets, we
either need to check that the first operand is a register, or as done
below provide the define_insn_and_split that provides a non-AVX512VL
implementation of *extendv2di_highpart_stv.
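
To make the shape of that pattern concrete, the sketch below writes sign extension of the low 32 bits both ways: as an arithmetic right shift of a left shift, and by recomputing only the high word. The function names are hypothetical, and the code relies on GCC's behaviour for converting to signed types and for right-shifting negative values (both are guaranteed from C++20 on).

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low 32 bits of a 64-bit value, written as an
// arithmetic right shift of a left shift.
std::int64_t sext_via_shifts(std::int64_t x) {
  std::uint64_t shifted = static_cast<std::uint64_t>(x) << 32;  // avoids signed overflow
  return static_cast<std::int64_t>(shifted) >> 32;              // arithmetic shift
}

// The same result, obtained by keeping the low word and deriving only
// the high word from its sign bit.
std::int64_t sext_via_highpart(std::int64_t x) {
  std::uint32_t lo = static_cast<std::uint32_t>(x);
  std::uint32_t hi = (lo & 0x80000000u) ? 0xffffffffu : 0u;
  return static_cast<std::int64_t>((static_cast<std::uint64_t>(hi) << 32) | lo);
}

void example_usage() {
  assert(sext_via_shifts(0x80000000LL) == -2147483648LL);
  assert(sext_via_highpart(0x80000000LL) == -2147483648LL);
  assert(sext_via_shifts(0x1234567800000005LL) == 5);
  assert(sext_via_highpart(0x1234567800000005LL) == 5);
}
```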

The new testcase only ICEed with -m32, so this test could be limited to
target ia32, but there's no harm also running this test on -m64 to
provide a little extra test coverage.

2024-08-12  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR target/116275
* config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): New
define_insn_and_split to handle the STV conversion of the DImode
pattern *extendsi2_doubleword_highpart.

gcc/testsuite/ChangeLog
PR target/116275
* g++.target/i386/pr116275.C: New test case.
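
The "arithmetic right shift of a left shift" form described above can be
illustrated in plain C. This is only a sketch of the arithmetic, not the
machine description pattern itself; the function names are hypothetical, and
it assumes GCC's behaviour for signed right shifts (arithmetic) and for
narrowing conversions (wrap-around).

```c
#include <stdint.h>

/* Shift-pair form: shift the low 32 bits into the high half, then shift
   back arithmetically so the sign bit of the low word is replicated.
   The left shift is done on an unsigned value to avoid undefined
   behaviour.  */
static int64_t sext32_shift_pair (int64_t x)
{
  return (int64_t) ((uint64_t) x << 32) >> 32;
}

/* Highpart-only form: the low word is kept as-is, and only the high word
   is recomputed as the sign mask of the low word.  */
static int64_t sext32_highpart (int64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) ((int32_t) lo >> 31);
  return (int64_t) (((uint64_t) hi << 32) | lo);
}
```

Both functions compute the same value; the second shows why only the highpart
word needs updating when generating 32-bit code.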

LoongArch: Provide ashr lshr and ashl RTL pattern for vectors.

We support vashr, vlshr and vashl.  However, r15-1638 added support for
optimizing x < 0 ? -1 : 0 into (signed) x >> 31 and x < 0 ? 1 : 0 into
(unsigned) x >> 31.  To support this optimization, the vector ashr, lshr and
ashl patterns need to be implemented.

gcc/ChangeLog:

* config/loongarch/loongarch.md (insn): Added rotatert rotr pairs.
* config/loongarch/simd.md (rotr<mode>3): Remove to ...
(<optab><mode>3): This.

gcc/testsuite/ChangeLog:

* g++.target/loongarch/vect-ashr-lshr.C: New test.
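
The two rewrites mentioned above can be checked on scalars. This is a minimal
sketch with hypothetical function names, assuming 32-bit elements and that
right-shifting a negative signed value is an arithmetic shift (as GCC
defines it).

```c
#include <stdint.h>

/* Select form: -1 when x is negative, 0 otherwise.  */
static int32_t sign_mask_select (int32_t x) { return x < 0 ? -1 : 0; }

/* Shift form: an arithmetic right shift replicates the sign bit.  */
static int32_t sign_mask_shift (int32_t x) { return x >> 31; }

/* Select form: 1 when x is negative, 0 otherwise.  */
static uint32_t sign_bit_select (int32_t x) { return x < 0 ? 1u : 0u; }

/* Shift form: a logical right shift moves the sign bit down to bit 0.  */
static uint32_t sign_bit_shift (int32_t x) { return (uint32_t) x >> 31; }
```

The vector patterns apply the same shifts element-wise.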

LoongArch: Drop vcond{,u} expanders.

Optabs vcond{,u} will be removed for GCC 15.  Since regtest shows no
fallout, drop the expanders now.

gcc/ChangeLog:

PR target/114189
* config/loongarch/lasx.md (vcondu<LASX:mode><ILASX:mode>): Delete.
(vcond<LASX:mode><LASX_2:mode>): Likewise.
* config/loongarch/lsx.md (vcondu<LSX:mode><ILSX:mode>): Likewise.
(vcond<LSX:mode><LSX_2:mode>): Likewise.

LoongArch: Use iorn and andn standard pattern names.

r15-1890 introduced new optabs iorc and andc, and their corresponding
internal functions BIT_{ANDC,IORC}, which are used if the target defines such
optabs for vector modes.  In r15-2258 iorc and andc were renamed to
iorn and andn.
So we change the andn and iorn implementation templates to the standard
template names.

gcc/ChangeLog:

* config/loongarch/lasx.md (xvandn<mode>3): Rename to ...
(andn<mode>3): This.
(xvorn<mode>3): Rename to ...
(iorn<mode>3): This.
* config/loongarch/loongarch-builtins.cc
(CODE_FOR_lsx_vandn_v): Defined as the modified name.
(CODE_FOR_lsx_vorn_v): Likewise.
(CODE_FOR_lasx_xvandn_v): Likewise.
(CODE_FOR_lasx_xvorn_v): Likewise.
(loongarch_expand_builtin_insn): When the builtin function to be
called is __builtin_lasx_xvandn or __builtin_lsx_vandn, swap the
two operands.
* config/loongarch/loongarch.md (<optab>n<mode>): Rename to ...
(<optab>n<mode>3): This.
* config/loongarch/lsx.md (vandn<mode>3): Rename to ...
(andn<mode>3): This.
(vorn<mode>3): Rename to ...
(iorn<mode>3): This.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/lasx-andn-iorn.c: New test.
* gcc.target/loongarch/lsx-andn-iorn.c: New test.
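
The operand swap noted for the vandn builtins can be sketched on scalars.
The operand orders below are assumptions made for illustration: the standard
andn/iorn names are taken to complement their second input, and the vandn
instruction is taken to complement its first. All function names are
hypothetical.

```c
#include <stdint.h>

/* Assumed standard-name semantics: the second input is complemented.  */
static uint64_t andn_std (uint64_t op1, uint64_t op2) { return op1 & ~op2; }
static uint64_t iorn_std (uint64_t op1, uint64_t op2) { return op1 | ~op2; }

/* Assumed instruction-style semantics for vandn: the first input is
   complemented.  */
static uint64_t vandn_insn (uint64_t a, uint64_t b) { return ~a & b; }

/* Expressing the instruction-style operation through the standard name
   requires swapping the two operands.  */
static uint64_t vandn_via_std (uint64_t a, uint64_t b)
{
  return andn_std (b, a);
}
```

Under these assumptions no swap is needed for the iorn case, which matches
the ChangeLog mentioning a swap only for the vandn builtins.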

PR modula2/116181 fix ODR warnings for C/m2 interface library modules

This patch fixes many ODR warnings which appear when compiling the
interface files found in gcc/m2/*-ch/ and gcc/m2/{pge,mc}-boot
directories.

gcc/m2/ChangeLog:

PR modula2/116181
* gm2-compiler/ppg.mod (FindStr): Initialize j.
* gm2-libs-ch/UnixArgs.cc (_M2_UnixArgs_ctor): Replace
M2RTS_RegisterModule with M2RTS_RegisterModule_Cstr.
* gm2-libs-ch/dtoa.cc (_M2_dtoa_ctor): Ditto.
* gm2-libs-ch/ldtoa.cc (ldtoa_strtold): Cast parameter s
for strtod.
(_M2_ldtoa_ctor): Replace M2RTS_RegisterModule with
M2RTS_RegisterModule_Cstr.
* gm2-libs-ch/m2rts.h (M2RTS_RegisterModule_Cstr): New
define.
(M2RTS_RegisterModule): Remove const.
* mc-boot-ch/GSelective.c (Selective_FdIsSet): Return bool
rather than int.
* mc-boot-ch/Gldtoa.cc (ldtoa_strtold): Change const char to
void.
Cast s before passing as a parameter to strtod.
* mc-boot-ch/Glibc.c (tracedb_open): Replace const char with const
void.
(libc_perror): Replace char with const char.
(libc_printf): Replace char with void.
(libc_snprintf): Replace char with void.
Add const_cast for parameter to index.
Add reinterpret_cast for parameter to vsnprintf.
(libc_open): Replace first parameter type char with void.
Add vararg for the third parameter.
* mc-boot-ch/Gm2rtsdummy.cc (M2RTS_RequestDependant): Remove #if 0 code.
(m2pim_M2RTS_RegisterModule): Change const char parameters to void.
(M2RTS_RegisterModule): Ditto.
(_M2_M2RTS_init): Remove #if 0 code.
(M2RTS_ConstructModules): Ditto.
(M2RTS_Terminate): Ditto.
(M2RTS_DeconstructModules): Ditto.
(M2RTS_Halt): Ditto.
* mc-boot-ch/Gtermios.cc (SetFlag): Return bool.
* mc-boot-ch/m2rts.h (M2RTS_RegisterModule_Cstr): New define.
(M2RTS_RegisterModule): Change const char parameters to void.
* mc-boot/Gdecl.cc: Regenerate.
* mc/decl.mod (getNextConstExp): Reimplement.
* pge-boot/GDynamicStrings.cc: Regenerate.
* pge-boot/GDynamicStrings.h: Ditto.
* pge-boot/GM2RTS.h (M2RTS_RegisterModule_Cstr): New function.
(M2RTS_RegisterModule): Reformat.
* pge-boot/GSymbolKey.cc: Regenerate.
* pge-boot/GSysExceptions.cc (_M2_SysExceptions_init): Add correct parameters.
(_M2_SysExceptions_fini): Ditto.
* pge-boot/GUnixArgs.cc (_M2_UnixArgs_ctor::_M2_UnixArgs_ctor):
Replace call to M2RTS_RegisterModule with M2RTS_RegisterModuleCstr.
* pge-boot/Gerrno.cc (_M2_errno_init): Add correct parameters.
(_M2_errno_fini): Ditto.
* pge-boot/Gldtoa.cc (ldtoa_strtold): Replace const char with
void.
Use reinterpret_cast when passing s to strtod.
Replace true with TRUE.
* pge-boot/Gldtoa.h (ldtoa_strtold): Tidy up.
* pge-boot/Glibc.cc (libc_read): Use size_t as the return type.
(libc_write): Ditto.
(libc_strlen): Ditto.
(libc_perror): Replace char with const char.
(libc_printf): Replace char with const char.
Cast parameter to index using const_cast.
(libc_snprintf): Replace char with void.
Cast parameter to index using const_cast.
(libc_malloc): Replace parameter type with size_t.
(libc_memcpy): Replace third parameter type with size_t.
(libc_open): Use varargs.
* pge-boot/Glibc.h (libc_perror): Add _string_high parameter.
* pge-boot/Gpge.cc: Regenerate.
* pge-boot/Gtermios.cc (SetFlag): Replace return type with bool.
(_M2_termios_init): Add correct parameters.
(_M2_termios_fini): Ditto.
* pge-boot/m2rts.h (M2RTS_RegisterModule_Cstr): New define.
(M2RTS_RegisterModule): Replace const char with void.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Daily bump.

Fortran: silence Wmaybe-uninitialized warnings for LTO build [PR116221]

PR fortran/116221

gcc/fortran/ChangeLog:

* intrinsic.cc (gfc_get_intrinsic_sub_symbol): Initialize variable.
* symbol.cc (gfc_get_ha_symbol): Likewise.

AVR: -mlra is not documented in TEXI.

gcc/
* config/avr/avr.opt (mlra): Set Undocumented flag.

AVR: Add function avr.cc::ra_in_progress().

It returns lra_in_progress or reload_in_progress, depending on avr_lra_p.
Currently, direct use of ra_in_progress() is only made with -mlog=.

gcc/
* config/avr/avr.cc (ra_in_progress): New static function.
(avr_legitimate_address_p, avr_addr_space_legitimate_address_p)
(extra_constraint_Q): Use it with -mlog=.

Daily bump.

i386: testsuite: Adapt fentryname3.c for r14-811 change [PR70150]

After r14-811 "call *nop@GOTPCREL(%rip)" is only generated with
-mno-direct-extern-access even if --enable-default-pie. So the r13-1614
change to this file is not valid anymore.

gcc/testsuite/ChangeLog:

PR testsuite/70150
* gcc.target/i386/fentryname3.c (dg-final): Revert r13-1614
change.

i386: testsuite: Add -no-pie for pr113689-1.c [PR70150]

For a --enable-default-pie build, using -fno-pic (for compiler) but
not -no-pie (for linker) triggers some linker warnings counted as
excess errors:

    /usr/bin/ld: /tmp/cc8MgxiR.o: warning: relocation in read-only
    section `.text.startup'
    /usr/bin/ld: warning: creating DT_TEXTREL in a PIE

gcc/testsuite/ChangeLog:

PR testsuite/70150
* gcc.target/i386/pr113689-1.c (dg-options): Add -no-pie.

Fix reference to the dom walker function in the documentation

It now uses a class with a different name.

gcc/ChangeLog:

* doc/cfg.texi: Fix references to dom_walker.

gm2: add missing debug output guard

The Close() procedure in MemStream is missing a guard to prevent it from
printing in non-debug mode.

gcc/gm2:
* gm2-libs-iso/MemStream.mod: Guard debug output.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>

testsuite: Fix up sse3-addsubps.c

The testcase uses sizeof (vals) / sizeof (vals) as the number of vals to
handle (though it handles 8 vals at a time).  That is an obvious typo;
all similar testcases use sizeof (vals) / sizeof (vals[0]) properly.

2024-08-10 Jakub Jelinek <jakub@redhat.com>

* gcc.target/powerpc/sse3-addsubps.c (TEST): Divide by
sizeof (vals[0]) rather than sizeof (vals).
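
The difference between the two expressions can be seen in a small sketch.
The array below is a hypothetical stand-in, not the one from the testcase.

```c
#include <stddef.h>

/* Hypothetical 16-element array standing in for the test data.  */
static const float vals[16] = { 0 };

/* Typo form: an array's size divided by itself is always 1.  */
static size_t count_typo (void)
{
  return sizeof (vals) / sizeof (vals);
}

/* Intended form: total size divided by the size of one element.  */
static size_t count_fixed (void)
{
  return sizeof (vals) / sizeof (vals[0]);
}
```

With the typo form a loop bounded by the count would look at far fewer
elements than intended.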

AVR: ad target/113934 - Add option -mlra to enable LRA.

PR target/113934
gcc/
* config/avr/avr.opt (-mlra): New target option.
* config/avr/avr.cc (avr_use_lra_p): New function.
(TARGET_LRA_P): Use it.
(avr_hard_regno_mode_ok) [lra]: Don't disallow 4-byte modes for X.

c++: inherited CTAD fixes [PR116276]

This implements the overlooked inherited vs non-inherited guide
tiebreaker from P2582R1. This requires tracking inherited-ness of a
guide, for which it seems natural to reuse the lang_decl_fn::context
field which for a constructor tracks its inherited-ness.

This patch also works around CLASSTYPE_CONSTRUCTORS not reliably
returning all inherited constructors (due to some using-decl handling
quirks in push_class_level_binding) by iterating over TYPE_FIELDS
instead.

This patch also makes us recognize another written form of inherited
constructor, 'using Base<T>::Base::Base' whose USING_DECL_SCOPE is a
TYPENAME_TYPE.

PR c++/116276

gcc/cp/ChangeLog:

* call.cc (joust): Implement P2582R1 inherited vs non-inherited
guide tiebreaker.
* cp-tree.h (lang_decl_fn::context): Document usage in
deduction_guide_p FUNCTION_DECLs.
(inherited_guide_p): Declare.
* pt.cc (inherited_guide_p): Define.
(set_inherited_guide_context): Define.
(alias_ctad_tweaks): Use set_inherited_guide_context.
(inherited_ctad_tweaks): Recognize some inherited constructors
whose scope is a TYPENAME_TYPE.
(ctor_deduction_guides_for): For C++23 inherited CTAD, iterate
over TYPE_FIELDS instead of CLASSTYPE_CONSTRUCTORS to recognize
all inherited constructors.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/class-deduction-inherited4.C: Remove an xfail.
* g++.dg/cpp23/class-deduction-inherited5.C: New test.
* g++.dg/cpp23/class-deduction-inherited6.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P tweaks

DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P templates can only appear as part
of a template friend declaration, and in turn get partially instantiated
only from tsubst_friend_function or tsubst_friend_class. So rather than
having tsubst_template_decl clear the flag, let's leave it up to the
tsubst friend routines to clear it so that template friend handling stays
localized (note that tsubst_friend_function was already clearing it).

Also the template depth comparison test within tsubst_friend_function is
equivalent to DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P since such templates
belong to the class context (and so always have more levels than the
context), and conversely it isn't possible to directly refer to an
existing template that has more levels than the class context.

gcc/cp/ChangeLog:

* pt.cc (tsubst_friend_class): Simplify depth comparison test
in the redeclaration code path to
DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P. Clear the flag after
partial instantiation here ...
(tsubst_template_decl): ... instead of here.

Reviewed-by: Jason Merrill <jason@redhat.com>