Robin Dapp [Fri, 7 Nov 2025 16:18:02 +0000 (17:18 +0100)]
vect: Give up if there is no offset_vectype.
vect_gather_scatter_fn_p currently ICEs if offset_vectype is NULL.
This is an oversight in the patches that relax gather/scatter detection.
Catch this.
gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Bail if
offset_vectype is NULL.
Robin Dapp [Thu, 9 Oct 2025 15:25:59 +0000 (17:25 +0200)]
vect: Reduce group size of consecutive strided accesses.
Consecutive load permutations like {0, 1, 2, 3} or {4, 5, 6, 7} in a
group of 8 only read a part of the group, leaving a gap.
For strided accesses we can elide the permutation and, instead of
accessing the whole group, use the number of SLP lanes. This
effectively increases the vector size as we don't load gaps. On top of that, we
do not need to emit the permutes at all.
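As a rough illustration (a hypothetical loop shape, not the exact covered cases), think of a strided structure access where only a consecutive subset of each group's lanes is consumed and the remaining fields form a gap:
  struct rec { int a, b, c, d, e, f, g, h; };
  void f (int *dst, const rec *s, int n, int stride)
  {
    for (int i = 0; i < n; i++)
      {
        dst[4 * i + 0] = s[i * stride].a;   // lanes {0,1,2,3} of a group of 8
        dst[4 * i + 1] = s[i * stride].b;
        dst[4 * i + 2] = s[i * stride].c;
        dst[4 * i + 3] = s[i * stride].d;   // fields e..h are never loaded
      }
  }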
gcc/ChangeLog:
* tree-vect-slp.cc (vect_load_perm_consecutive_p): New function.
(vect_lower_load_permutations): Use.
(vect_optimize_slp_pass::remove_redundant_permutations): Use.
* tree-vect-stmts.cc (has_consecutive_load_permutation): New
function that uses vect_load_perm_consecutive_p.
(get_load_store_type): Use.
(vectorizable_load): Reduce group size.
* tree-vectorizer.h (struct vect_load_store_data): Add
subchain_p.
(vect_load_perm_consecutive_p): Declare.
Jakub Jelinek [Mon, 10 Nov 2025 11:52:45 +0000 (12:52 +0100)]
c++: Implement C++26 P3920R0 - Wording for NB comment resolution on trivial relocation
Trivial relocation was voted out of C++26; the following patch removes it
(note that the libstdc++ part was still waiting for patch review and so
doesn't need to be removed).
This isn't a mere revert of r16-2206: I've kept the -Wc++26-compat option,
the non-terminal from the earlier patches remains class-property-specifier,
and I also had to partially revert various follow-up changes, e.g. the
modules handling of the new flags and its tests, the -Wkeyword-macro etc.
diagnostics of the conditional keywords, the feature test macro, etc.
Jakub Jelinek [Mon, 10 Nov 2025 10:36:42 +0000 (11:36 +0100)]
c++: Diagnose #define/#undef indeterminate
While working on CWG3053 I noticed I had forgotten to enable diagnostics
on #define indeterminate or #undef indeterminate now that it is handled
as a valid C++26 attribute.
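A minimal sketch of the kind of code that is now diagnosed (e.g. with -Wkeyword-macro in C++26 mode):
  #define indeterminate 1   // now warned: identifier of a C++26 attribute
  #undef indeterminate      // likewise warned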
2025-11-10 Jakub Jelinek <jakub@redhat.com>
gcc/cp/
* lex.cc (cxx_init): For C++26 call cpp_warn on "indeterminate".
gcc/testsuite/
* g++.dg/warn/Wkeyword-macro-1.C: Expect diagnostics on define/undef
of indeterminate.
* g++.dg/warn/Wkeyword-macro-2.C: Likewise.
* g++.dg/warn/Wkeyword-macro-4.C: Likewise.
* g++.dg/warn/Wkeyword-macro-5.C: Likewise.
* g++.dg/warn/Wkeyword-macro-7.C: Likewise.
* g++.dg/warn/Wkeyword-macro-8.C: Likewise.
Jakub Jelinek [Mon, 10 Nov 2025 10:34:20 +0000 (11:34 +0100)]
c++, libcpp: Implement CWG3053
The following patch implements CWG3053, approved in Kona, which makes it
valid not just to #define likely(a) or #define unlikely(a, b, c) but also
to #undef likely or #undef unlikely.
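For illustration, a translation unit like the following is what CWG3053 makes valid (the function-like #define was already allowed):
  #define likely(x) __builtin_expect (!!(x), 1)   // valid: function-like macro
  #undef likely                                   // now also valid, no diagnostic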
2025-11-10 Jakub Jelinek <jakub@redhat.com>
libcpp/
* directives.cc: Implement CWG3053.
(do_undef): Don't pedwarn or warn about #undef likely or #undef
unlikely.
gcc/testsuite/
* g++.dg/warn/Wkeyword-macro-4.C: Don't diagnose for #undef likely
or #undef unlikely.
* g++.dg/warn/Wkeyword-macro-5.C: Likewise.
* g++.dg/warn/Wkeyword-macro-9.C: Likewise.
* g++.dg/warn/Wkeyword-macro-8.C: Likewise.
* g++.dg/warn/Wkeyword-macro-10.C: Likewise.
Lewis Hyatt [Wed, 30 Jul 2025 23:20:55 +0000 (19:20 -0400)]
libcpp: Improve locations for macros defined prior to PCH include [PR105608]
It is permissible to define macros prior to including a PCH, as long as
these definitions are disjoint from or identical to the macros in the
PCH. The PCH loading process replaces all libcpp data structures with those
from the PCH, so it is necessary to remember the extra macros separately and
then restore them after loading the PCH, all of which is handled by
cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration
process consists of pushing a buffer containing the macro definition and
then lexing it from there, similar to how a command-line -D option is
processed. The current implementation does not attempt to set up the
line_map for this process, and so the locations assigned to the macros are
often not meaningful. (Similar to what happened in the past with lexing the
tokens out of a _Pragma string, lexing out of a buffer rather than a file
produces "sorta" reasonable locations that are often close enough, but not
reliably correct.)
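(For reference, the setup in question is a translation unit along these lines, with hypothetical names:)
  #define EXTRA_DEBUG 1     // defined prior to the PCH include; disjoint from the PCH's macros
  #include "everything.h"   // hypothetical header compiled as a PCH
  int x = EXTRA_DEBUG;      // the re-injected macro should now carry a sensible location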
Fix that up by remembering enough additional information (more or less, an
expanded_location for each macro definition) to produce a reasonable
location for the newly restored macros.
One issue that came up is the treatment of command-line-defined macros. From
the perspective of the generic line_map data structures, the command-line
location is not distinguishable from other locations; it's just an ordinary
location created by the front ends with a fake file name by convention. (At
the moment, it is always the string `<command-line>', subject to
translation.) Since libcpp needs to assign macros to that location, it
needs to know what location to use, so I added a new member
line_maps::cmdline_location for the front ends to set, similar to how
line_maps::builtin_location is handled.
This revealed a small issue, in c-opts.cc we have:
/* All command line defines must have the same location. */
cpp_force_token_locations (parse_in, line_table->highest_line);
But contrary to the comment, all command line defines don't actually end up
with the same location anymore. This is because libcpp/lex.cc has been
expanded (r6-4873) to include range information on the returned
locations. That logic has never respected the request of
cpp_force_token_locations. I believe this was not intentional, and so I have
corrected that here. Prior to this patch, the range logic has been leading
to command-line macros all having similar locations in the same line map (or
ad-hoc locations based from there for sufficiently long tokens); with this
change, they all have exactly the same location and that location is
recorded in line_maps::cmdline_location.
With that change, then it works fine for pch.cc to restore macros whether
they came from the command-line or from the main file.
gcc/c-family/ChangeLog:
PR preprocessor/105608
* c-opts.cc (c_finish_options): Set new member
line_table->cmdline_location.
* c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in
libcpp pch.cc; it is now possible that the linemap is in a different
file after returning from cpp_read_state().
libcpp/ChangeLog:
PR preprocessor/105608
* include/line-map.h: Add new member CMDLINE_LOCATION.
* lex.cc (get_location_for_byte_range_in_cur_line): Do not expand
the token location to include range information if token location
override was requested.
(warn_about_normalization): Likewise.
(_cpp_lex_direct): Likewise.
* pch.cc (struct saved_macro): New local struct.
(struct save_macro_data): Change DEFNS vector to hold saved_macro
rather than uchar*.
(save_macros): Adapt to remember the location information for each
saved macro in addition to the definition.
(cpp_prepare_state): Likewise.
(cpp_read_state): Use the saved location information to generate
proper locations for the restored macros.
gcc/testsuite/ChangeLog:
PR preprocessor/105608
* g++.dg/pch/line-map-3.C: Remove xfails.
* g++.dg/pch/line-map-4.C: New test.
* g++.dg/pch/line-map-4.Hs: New test.
Mark Wielaard [Sun, 9 Nov 2025 21:12:19 +0000 (22:12 +0100)]
Regenerate libgfortran Makefile.in and aclocal.m4
Commit a1fe2cfa8965 ("fortran: [PR121628]") regenerated libgfortran
Makefile.in and aclocal.m4 files with automake 1.15 instead of 1.15.1.
Run autoreconf version 2.69 with automake 1.15.1 inside libgfortran.
Eric Botcazou [Sat, 8 Nov 2025 18:15:46 +0000 (19:15 +0100)]
Ada: Fix bogus error on limited with clause and private parent package
The implementation of the 10.1.2(8/2-11/2) subclauses that establish rules
for the legality of "with" clauses of private child units is done separately
for regular "with" clauses (in Check_Private_Child_Unit) and for limited
"with" clauses (in Check_Private_Limited_Withed_Unit). The testcase, which
contains the regular and the "limited" version of the same pattern, exhibits
a disagreement between them; the former implementation is correct and the
latter is wrong in this case.
The patch fixes the problem and also cleans up the latter implementation by
aligning it with the former as much as possible.
gcc/ada/
PR ada/34374
* sem_ch10.adb (Check_Private_Limited_Withed_Unit): Use a separate
variable for the private child unit, streamline the loop locating
the nearest private ancestor, fix a too early termination of the
loop traversing the ancestor of the current unit, and use the same
privacy test as Check_Private_Child_Unit.
Philipp Tomsich [Sat, 8 Nov 2025 16:28:07 +0000 (09:28 -0700)]
[RISC-V] Add testcase for shifted truthvalue
I was doing some cleanup on our internal tree and noticed a pattern that I
didn't think was actually useful in practice. Thankfully the internal commit
included a testcase clearly targeting that pattern.
I'm upstreaming the testcase, but not the unnecessary pattern.
gcc/testsuite
* gcc.target/riscv/snez.c: New test.
Avinash Jayakar [Sat, 8 Nov 2025 04:27:59 +0000 (09:57 +0530)]
isel: Check bounds before converting VIEW_CONVERT to VEC_SET.
The function gimple_expand_vec_set_expr in the isel pass converted
VIEW_CONVERT_EXPR to VEC_SET_EXPR without checking the bounds on the index,
which caused an ICE on targets that support VEC_SET_EXPR, like x86 and powerpc.
This patch adds a bounds check on the index operand and rejects the conversion
if the index is out of bounds.
Lulu Cheng [Mon, 3 Nov 2025 09:53:52 +0000 (17:53 +0800)]
LoongArch: Fix PR122097 (2).
r16-4703 did not completely fix PR122097: floating-point vectors
were not processed in the function loongarch_const_vector_same_bytes_p.
This patch completely resolves the issue.
PR target/122097
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_const_vector_same_bytes_p): Add processing for
floating-point vector data.
Avinash Jayakar [Sat, 8 Nov 2025 02:53:31 +0000 (08:23 +0530)]
vect: Complete implementation for MULT_EXPR vector lowering.
Use sequences of shifts and add/sub if the hardware does not have support for
vector multiplication. In a previous patch, bare-bones vector lowering had been
implemented which only worked when the constant value was a power of 2.
In this patch, a few more cases have been added: if a constant is a uniform
vector but not a power of 2, use choose_mult_variant, with the maximum cost
estimate set to the cost of a scalar multiplication operation times the number
of elements in the vector. This is similar to the logic used when expanding
MULT_EXPR in the expand pass or in the vector pattern recognition in
tree-vect-patterns.cc.
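As a scalar sketch of what such a synthetic multiply looks like (illustrative only, not the generated GIMPLE), multiplying by the non-power-of-2 constant 10 becomes shifts plus an add:
  unsigned mul_by_10 (unsigned x)
  {
    return (x << 3) + (x << 1);   // x*8 + x*2 == x*10
  }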
gcc/ChangeLog:
PR tree-optimization/122065
* tree-vect-generic.cc (target_supports_mult_synth_alg): Add helper to
check mult synth.
(expand_vector_mult): Optimize mult when const is uniform but not
power of 2.
Jerry DeLisle [Sat, 8 Nov 2025 02:46:54 +0000 (18:46 -0800)]
fortran: [PR121628]
The PR121628 deep-copy helper reused a static seen_derived_types set
across wrapper generation, so recursive allocatable arrays that appeared
multiple times in a derived type caused infinite compile-time recursion.
Save and restore the set around each wrapper build, polish follow-ups,
and add a regression test to keep the scenario covered.
gcc/fortran/ChangeLog:
PR fortran/121628
* trans-array.cc (seen_derived_types): Move to file scope and
preserve/restore around generate_element_copy_wrapper.
* trans-intrinsic.cc (conv_intrinsic_atomic_op): Reuse
gfc_trans_force_lval when forcing addressable CAF temps.
gcc/testsuite/ChangeLog:
PR fortran/121628
* gfortran.dg/alloc_comp_deep_copy_7.f90: New test.
libgfortran/ChangeLog:
PR fortran/121628
* Makefile.in: Keep continuation indentation within 80 columns.
* aclocal.m4: Regenerate.
* libgfortran.h: Drop unused forward declaration.
Signed-off-by: Christopher Albert <albert@tugraz.at>
Andrew Pinski [Fri, 7 Nov 2025 22:01:33 +0000 (14:01 -0800)]
sccp: Fix order of removal of phi (again) [PR122599]
This time we are gimplifying the expression and calling fold_stmt during
the gimplification (which is fine), but since we removed the phi, and the
expression indirectly references SSA names from the phi, things just fall
over inside the ranger. This moves the removal of the phi until after
gimplification, as the final value expression might refer back to the
SSA name that the phi defines.
Pushed as obvious after bootstrap test on x86_64-linux-gnu.
PR tree-optimization/122599
gcc/ChangeLog:
* tree-scalar-evolution.cc (final_value_replacement_loop): Move
the removal of the phi until after the gimplification of the final
value expression.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr122599-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
gcc/analyzer/ChangeLog:
* checker-event.cc
(region_creation_event_allocation_size::print_desc): Fix missing
"else" leading to stray trailing "allocated here" text in events.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Andrew Pinski [Tue, 28 Oct 2025 05:22:08 +0000 (22:22 -0700)]
Move build_call_nary away from va_list
Instead of a va_list here we can create a std::initializer_list that contains the
arguments and pass that.
This is just one quick version of what was mentioned during the "Reviewing
refactoring goals and acceptable abstractions" discussion.
The generated code should be similar or slightly better, plus there is extra
checking of the bounds of the std::initializer_list.
I didn't remove the n argument from build_call_nary at this stage, as I didn't
want to change the calls to build_call_nary, but I added a gcc_checking_assert
to make sure the number passed is the number of arguments.
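A generic sketch of the direction, with simplified stand-in types rather than GCC's actual tree declarations:
  #include <cassert>
  #include <initializer_list>
  #include <vector>

  struct node { std::vector<node *> ops; };

  // new entry point: arguments arrive as an initializer_list
  static node *
  build_call (node *fn, std::initializer_list<node *> args)
  {
    node *call = new node;
    call->ops.push_back (fn);                         // record the callee first
    call->ops.insert (call->ops.end (), args.begin (), args.end ());
    return call;
  }

  // the old N-ary interface keeps its count, but only to sanity-check it
  template <typename... Args>
  static node *
  build_call_nary (node *fn, int n, Args... args)
  {
    assert (n == static_cast<int> (sizeof... (args)));
    return build_call (fn, { args... });
  }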
Changes since v1:
* v2: Fix build_call's access of std::initializer_list.
gcc/ChangeLog:
* tree.cc (build_call_nary): Remove decl.
Add template definition that uses std::initializer_list<tree>
and call build_call.
(build_call): New declaration.
* tree.h (build_call_nary): Remove.
(build_call): New function.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Robin Dapp [Tue, 7 Oct 2025 15:17:22 +0000 (17:17 +0200)]
RISC-V: Remove gather scale and offset handling.
With the recent vectorizer changes upstream the vectorizer can take care
of offset extension and scaling (and its proper costing) itself.
Thus, we can remove all related handling in expand_gather_scatter and
set the predicates in the gather/scatter expanders to what our
instructions actually support.
gcc/ChangeLog:
* config/riscv/autovec.md: Use const_1_operand for scale and
extend predicates.
* config/riscv/riscv-v.cc (expand_gather_scatter): Remove scale
and extension handling.
Robin Dapp [Thu, 6 Nov 2025 08:14:35 +0000 (09:14 +0100)]
vect: Do not convert offset type in strided gather.
The gather/scatter relaxation patches introduced a bug with
vect_use_strided_gather_scatters_p. I didn't want to pass
supported_offset_vectype and supported scale all the way from
vect_truncate_gather_scatter_offset and
vect_use_strided_gather_scatters_p to get_load_store_type, so I just
called vect_gather_scatter_fn_p again afterwards to determine
the supported type and scale.
However, this doesn't take into account that
vect_use_strided_gather_scatters_p changes the offset type after
verifying that we can use gather/scatter.
The flow right now is
- vect_use_strided_gather_scatters_p calls vect_check_gather_scatter
with e.g. a char offset type.
- We actually need/support a short vector offset type and
vect_use_strided_gather_scatters_p fold converts the actual (scalar)
char offset to a short offset.
- We call vect_gather_scatter_fn_p with the new short offset instead of
the original char one, thinking we need an even larger offset type.
The last call is obviously not identical to the ones we used to check
gather/scatter in the first place and can fail if there is no offset
vectype.
There are several ways to fix this. The most obvious one is to bite the
bullet and just add the supported_offset_vectype and supported_scale to
all the intermediate functions. I wondered, however, if we need the
offset conversion at all. As far as I can tell we don't ever use
the scalar offset type and vect_get_strided_load_store_ops in particular
uses offset_vectype. Thus, this patch removes the conversion.
I bootstrapped and regtested this, before and after the relaxation
patches, on x86 and power10. Regtested on aarch64 and riscv.
gcc/ChangeLog:
* tree-vect-stmts.cc (vect_use_strided_gather_scatters_p):
Do not convert offset type.
Robin Dapp [Wed, 29 Oct 2025 15:02:51 +0000 (16:02 +0100)]
vect: Relax gather/scatter scale handling.
Similar to the signed/unsigned patch before, this one relaxes the
gather/scatter restrictions on scale factors. The basic idea is that a
natively unsupported scale factor can still be reached by emitting a
multiplication before the actual gather operation. As before, we need
to make sure that there is no overflow when multiplying.
Robin Dapp [Tue, 9 Sep 2025 09:41:51 +0000 (11:41 +0200)]
vect: Relax gather/scatter detection by swapping offset sign.
This patch adjusts vect_gather_scatter_fn_p to always check an offset
type with swapped signedness (vs. the original offset argument).
If the target supports the gather/scatter with the new offset type as
well as the conversion of the offset we now emit an explicit offset
conversion before the actual gather/scatter.
The relaxation is only done for the IFN path of gather/scatter and the
general idea roughly looks like:
- vect_gather_scatter_fn_p builds a list of all offset vector types
that the target supports for the current vectype. Then it goes
through that list, trying direct support first and sign-swapped
offset types next, taking precision requirements into account.
If successful it sets supported_offset_vectype to the type that actually
worked while offset_vectype_out is the type that was requested.
- vect_check_gather_scatter works as before but uses the relaxed
vect_gather_scatter_fn_p.
- get_load_store_type sets ls_data->supported_offset_vectype if the
requested type wasn't supported but another one was.
- check_load_store_for_partial_vectors uses the
supported_offset_vectype in order to validate what get_load_store_type
determined.
- vectorizable_load/store emit a conversion if
ls_data->supported_offset_vectype is nonzero and cost it.
The offset type is either of pointer size (if we started with a signed
offset) or twice the size of the original offset (when that one was
unsigned).
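As a hypothetical example of the kind of source this helps, a gather whose natural offset vector type is not directly supported by the target can now be handled by converting the offsets first:
  void gather (double *dst, const double *base, const unsigned short *idx, int n)
  {
    for (int i = 0; i < n; i++)
      dst[i] = base[idx[i]];   // offsets may be converted to a sign-swapped/wider supported type
  }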
gcc/ChangeLog:
* tree-vect-data-refs.cc (struct gather_scatter_config): New
struct to hold gather/scatter configurations.
(vect_gather_scatter_which_ifn): New function to determine which
IFN to use.
(vect_gather_scatter_get_configs): New function to enumerate all
target-supported configs.
(vect_gather_scatter_fn_p): Rework to use
vect_gather_scatter_get_configs and try sign-swapped offset.
(vect_check_gather_scatter): Use new supported offset vectype
argument.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
Ditto.
(vect_truncate_gather_scatter_offset): Ditto.
(vect_use_grouped_gather): Ditto.
(get_load_store_type): Ditto.
(vectorizable_store): Convert to sign-swapped offset type if
needed.
(vectorizable_load): Ditto.
* tree-vectorizer.h (struct vect_load_store_data): Add
supported_offset_vectype.
(vect_gather_scatter_fn_p): Add argument.
Andrew Pinski [Thu, 6 Nov 2025 20:04:30 +0000 (12:04 -0800)]
forwprop: Handle already true/false branches in optimize_unreachable [PR122588]
When optimize_unreachable was moved from fab to forwprop, I missed that due to
the integrated copy prop, we might end up with an already true branch leading
to a __builtin_unreachable block. optimize_unreachable would switch around
the if and things go downhill from there: since the other edge was already
marked as non-executable, forwprop didn't process those blocks, didn't
do copy prop into that block, and the original assignment statement was removed.
This fixes the problem by having optimize_unreachable not touch the if
statement if its condition was already changed to true/false.
Note I placed the testcase in gcc.c-torture/compile as gcc.dg/torture
is NOT currently testing -Og (see PR 122450 for that).
Changes since v1:
* v2: Add gimple testcase.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122588
gcc/ChangeLog:
* tree-ssa-forwprop.cc (optimize_unreachable): Don't touch
if the condition was already true or false.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr122588-1.c: New test.
* gcc.dg/tree-ssa/pr122588-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Eric Botcazou [Fri, 7 Nov 2025 19:42:57 +0000 (20:42 +0100)]
Ada: Fix bogus error on inherited operation for extension of type instance
It comes from a small discrepancy between class-wide subtypes and types:
they both have unknown discriminants, but only the latter may have
discriminants, which causes Subtypes_Statically_Match to return False.
gcc/ada/
PR ada/83188
* sem_eval.adb (Subtypes_Statically_Match): Deal with class-wide
subtypes whose class-wide types have discriminants.
gcc/testsuite/
* gnat.dg/class_wide6.ads, gnat.dg/class_wide6.adb: New test.
* gnat.dg/class_wide6_pkg.ads: New helper.
David Faust [Thu, 6 Nov 2025 22:24:14 +0000 (14:24 -0800)]
bpf: improve memmove inlining [PR122140]
The BPF backend inline memmove expansion was broken for certain
constructs. This patch addresses the two underlying issues:
1. Off-by-one in the "backwards" unrolled move loop offset.
2. Poor use of temporary register for the generated move loop, which
could result in some of the loads performing the move being optimized
away when the source and destination of the memmove are based off of
the same pointer.
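A scalar sketch of the overlap-safe copy the expansion models (an assumed helper, not the BPF expander itself); the backwards direction matters when dst overlaps src from above, and its last access is at offset len - 1, which is where the off-by-one crept in:
  void move_bytes (unsigned char *dst, const unsigned char *src, unsigned long len)
  {
    if (dst > src)
      for (unsigned long i = len; i-- > 0; )   // backwards: first copied byte is at len - 1
        dst[i] = src[i];
    else
      for (unsigned long i = 0; i < len; i++)  // forwards is safe when dst <= src
        dst[i] = src[i];
  }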
gcc/
PR target/122140
* config/bpf/bpf.cc (bpf_expand_cpymem): Fix off-by-one offset
in backwards loop. Improve src and dest addrs used for the
branch condition.
(emit_move_loop): Improve emitted set insns and remove the
explicit temporary register.
Richard Biener [Thu, 6 Nov 2025 13:24:34 +0000 (14:24 +0100)]
tree-optimization/122577 - missed vectorization of conversion from bool
We are currently overly restrictive with rejecting conversions from
bit-precision entities to mode precision ones. Similar to RTL expansion
we can focus on non-bit operations producing bit-precision results
which we currently do not properly handle by masking. Such checks
should already be present. The following relaxes vectorizable_conversion.
Actual bitfield accesses are caught and rejected by vectorizer dataref
analysis and converted during if-conversion into mode-size accesses
with appropriate sign- or zero-extension.
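The shape of loop this relaxation targets could be as simple as the following (illustrative only):
  void widen_bools (int *out, const bool *in, int n)
  {
    for (int i = 0; i < n; i++)
      out[i] = in[i];   // a conversion from a bit-precision (bool) value to mode precision
  }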
PR tree-optimization/122577
* tree-vect-stmts.cc (vectorizable_conversion): Allow conversions
from non-mode-precision types.
Pan Li [Thu, 6 Nov 2025 05:19:20 +0000 (13:19 +0800)]
Match: Refactor bit_ior based unsigned SAT_MUL pattern by widen mul helper [NFC]
There are 3 kinds of widen_mul during the unsigned SAT_MUL pattern, namely:
* widen_mul directly, like _3 w* _4
* convert and then widen_mul, like (uint64_t)_3 w* (uint64_t)_4
* convert and then mul, like (uint64_t)_3 * (uint64_t)_4
All of them will be referenced during different forms of the unsigned
SAT_MUL pattern match, but we can actually wrap them into a helper
which presents the "widening_mul" semantics. With this helper, some
unnecessary patterns and duplicated code can be eliminated. Like the
min based pattern before, this patch focuses on the bit_ior based pattern.
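For reference, one bit_ior based source shape in this family might look like the following sketch (illustrative; not the exact form matched in match.pd):
  #include <cstdint>
  uint32_t sat_mul_u32 (uint32_t a, uint32_t b)
  {
    uint64_t prod = (uint64_t) a * b;                // the widening multiply
    uint32_t hi = (uint32_t) (prod >> 32);           // nonzero exactly on overflow
    return (uint32_t) prod | -(uint32_t) (hi != 0);  // bit_ior forces all-ones on overflow
  }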
The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.
gcc/ChangeLog:
* match.pd: Leverage usmul_widen_mult by bit_ior based
unsigned SAT_MUL pattern.
Pan Li [Wed, 15 Oct 2025 14:16:11 +0000 (22:16 +0800)]
RISC-V: Combine vsext.vf2 and vsll.vi to vwsll.vi on ZVBB
The vwsll.vi of the zvbb extension takes a zero extend before the ashift,
but we can still do the combine based on a sign extend if and only
if the shift amount is an immediate and the sign-extended bits are all
shifted out. For example, as below:
vsetvli zero, zero, e32, m1, ta, ma
vsext.vf2 v1, v2
vsll.vi v1, v1, 16
If the ashift amount is greater than or equal to the truncated bitsize
(aka 16 for e32), the sign or zero extended bits will be shifted out
and never pollute the final result. Then we have:
vsetvli zero, zero, e32, m1, ta, ma
vwsll.vi v1, v2, 16
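A scalar sketch of why this is safe (illustrative): once the shift amount is at least the number of extension bits, sign- and zero-extension give the same shifted result.
  #include <cstdint>
  uint32_t widen_then_shift (int16_t x)
  {
    uint32_t sext = (uint32_t) (int32_t) x << 16;    // sign-extend, then shift
    uint32_t zext = (uint32_t) (uint16_t) x << 16;   // zero-extend, then shift
    // sext == zext for every x, because the extended bits are all shifted out.
    return sext;
  }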
PR target/121959
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*vwsll_sign_extend_<mode>): Add
pattern to combine vsext.vf2 and vsll.vi to vwsll.vi.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr121959-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-3.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-4.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-5.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959.h: New test.
Richard Biener [Fri, 7 Nov 2025 09:15:36 +0000 (10:15 +0100)]
tree-optimization/122589 - imm use iterator checking fallout
The following addresses the latent issue that gsi_replace_with_seq
causes debug info to unnecessarily degrade and in this process
break the new immediate use iterator sanity checking. In particular
gsi_remove has side-effects on debug stmts even when operating
in non-permanent operation. But as we are operating on a sequence
not in the IL here this should be avoided. Re-factoring
gsi_replace_with_seq to not rely on gsi_remove fulfills this.
I've noticed gsi_split_seq_before has misleading documentation.
Fixed thereby as well.
PR tree-optimization/122589
PR middle-end/122594
* gimple-iterator.cc (gsi_replace_with_seq): Instead of
removing the last stmt from the sequence with gsi_remove,
split it using gsi_split_seq_before.
(gsi_split_seq_before): Fix bogus documentation.
Alfie Richards [Wed, 15 Oct 2025 13:34:55 +0000 (13:34 +0000)]
aarch64: Add support for preserve_none function attribute [PR target/118328]
When applied to a function, preserve_none changes the procedure call standard
such that all registers except stack pointer, frame register, and link register
are caller saved. Additionally, it changes the argument passing registers.
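A minimal usage sketch, assuming the GNU attribute spelling added to aarch64_gnu_attributes:
  __attribute__ ((preserve_none)) void callback (void *ctx)
  {
    (void) ctx;   // body irrelevant; the attribute changes which registers the caller must save
  }

  void run (void *ctx)
  {
    callback (ctx);   // around this call, nearly all registers are treated as clobbered
  }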
PR target/118328
gcc/ChangeLog:
* config/aarch64/aarch64.cc (handle_aarch64_vector_pcs_attribute):
Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_pcs_exclusions): New definition.
(aarch64_gnu_attributes): Add entry for preserve_none and add
aarch64_pcs_exclusions to aarch64_vector_pcs entry.
(aarch64_preserve_none_abi): New function.
(aarch64_fntype_abi): Add handling for preserve_none.
(aarch64_reg_save_mode): Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_hard_regno_call_part_clobbered): Add handling for
ARM_PCS_PRESERVE_NONE.
(num_pcs_arg_regs): New helper function.
(get_pcs_arg_reg): New helper function.
(aarch64_function_ok_for_sibcall): Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_layout_arg): Add preserve_none argument layout.
(function_arg_preserve_none_regno_p): New helper function.
(aarch64_function_arg): Update to handle preserve_none.
(function_arg_preserve_none_regno_p): Update logic for preserve_none.
(aarch64_expand_builtin_va_start): Add preserve_none layout.
(aarch64_setup_incoming_varargs): Add preserve_none layout.
(aarch64_is_variant_pcs): Update for case of ARM_PCS_PRESERVE_NONE.
(aarch64_comp_type_attributes): Add preserve_none.
* config/aarch64/aarch64.h (NUM_PRESERVE_NONE_ARG_REGS): New macro.
(PRESERVE_NONE_REGISTERS): New macro.
(enum arm_pcs): Add ARM_PCS_PRESERVE_NONE.
* doc/extend.texi (preserve_none): Add docs for new attribute.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/preserve_none_1.c: New test.
* gcc.target/aarch64/preserve_none_mingw_1.c: New test.
* gcc.target/aarch64/preserve_none_2.c: New test.
* gcc.target/aarch64/preserve_none_3.c: New test.
* gcc.target/aarch64/preserve_none_4.c: New test.
* gcc.target/aarch64/preserve_none_5.c: New test.
* gcc.target/aarch64/preserve_none_6.c: New test.
Pan Li [Sun, 26 Oct 2025 07:21:15 +0000 (15:21 +0800)]
RISC-V: Combine vec_duplicate + vwmaccu.vv to vwmaccu.vx on GR2VR cost
This patch combines vec_duplicate + vwmaccu.vv into vwmaccu.vx, as in the
example code below. The related pattern depends on the cost of a
vec_duplicate from GR2VR: late-combine will take action if the GR2VR cost
is zero, and reject the combination if the GR2VR cost is greater than zero.
Assume we have asm code like below, with GR2VR cost 0.
11 beq a3,zero,.L8
...
14 .L3:
15 vsetvli a5,a3,e32,m1,ta,ma
...
20 vwmaccu.vx v1,a2,v3
...
23 bne a3,zero,.L3
Unfortunately, similar to vwaddu.vv, only widening from uint32_t to
uint64_t has the necessary zero-extend during combine; we lose the
extend op after expand for any other types.
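The kind of loop this targets could look as follows (illustrative; uint32_t to uint64_t widening per the note above):
  #include <cstdint>
  void wmacc (uint64_t *acc, const uint32_t *a, uint32_t s, int n)
  {
    for (int i = 0; i < n; i++)
      acc[i] += (uint64_t) a[i] * s;   // the duplicated scalar s can fold into vwmaccu.vx
  }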
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*widen_mul_plus_vx_<mode>): Add
new pattern to combine the vwmaccu.vx.
* config/riscv/vector.md (*pred_widen_mul_plus_u_vx<mode>_undef):
Add undef define_insn for vwmaccu.vx emitting.
(@pred_widen_mul_plus_u_vx<mode>): Ditto.
When the mode of the destination operand selected by the condition
is SImode, explicit sign extension is applied to both selected
source operands, and the destination operand is marked as
sign-extended.
This method can eliminate some of the sign extension instructions
caused by conditional selection optimization.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_sign_extend_if_subreg_prom_p): Determine if the
current operand is SUBREG and if the source of SUBREG is
the sign-extended value.
(loongarch_expand_conditional_move): Optimize.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/sign-extend-4.c: New test.
* gcc.target/loongarch/sign-extend-5.c: New test.
Lulu Cheng [Thu, 12 Dec 2024 08:21:38 +0000 (16:21 +0800)]
LoongArch: Implement sge and sgeu.
The original implementation of the function loongarch_extend_comparands
only prevented op1 from being loaded into the register when op1 was
const0_rtx. It has now been modified so that op1 is not loaded into
the register as long as op1 is an immediate value. This allows
slt{u}i to be generated instead of slt{u} if the conditions are met.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_canonicalize_int_order_test): Support GT GTU LT
and LTU.
(loongarch_extend_comparands): Expand the scope of op1 from
0 to all immediate values.
* config/loongarch/loongarch.md
(*sge<u>_<X:mode><GPR:mode>): New template.
AArch64, ARM: Clean up documentation of -mbranch-protection.
While working on other things, I noticed that the documentation for
the -mbranch-protection= option was pretty garbled on both aarch64 and
arm targets, with incorrect markup, too much syntax crammed into the
option summary, and confusion about which values the "+leaf" modifier
can apply to. I rewrote it to list all the valid option values
explicitly in the option description, checking this against the
implementation.
gcc/ChangeLog
* doc/invoke.texi (AArch64 Options): Clean up description of
-mbranch-protection= argument.
(ARM Options): Likewise.
Jerry DeLisle [Thu, 6 Nov 2025 20:44:18 +0000 (12:44 -0800)]
fortran: [PR121628]
This patch fixes PR121628 by implementing proper deep copy semantics for
derived types containing recursive allocatable array components, in
compliance with Fortran 2018+ standards.
The original implementation would generate infinitely recursive code at
compile time when encountering self-referential derived types with
allocatable components (e.g., type(t) containing allocatable type(t)
arrays). This patch solves the problem by generating a runtime helper
function that performs element-wise deep copying, avoiding compile-time
recursion while maintaining correct assignment semantics.
The trans-intrinsic.cc change enhances handling of constant values in
coarray atomic operations to ensure temporary variables are created when
needed, avoiding invalid address-of-constant expressions.
gcc/fortran/ChangeLog:
PR fortran/121628
* trans-array.cc (get_copy_helper_function_type): New function to
create function type for element copy helpers.
(get_copy_helper_pointer_type): New function to create pointer type
for element copy helpers.
(generate_element_copy_wrapper): New function to generate runtime
helper for element-wise deep copying of recursive types.
(structure_alloc_comps): Detect recursive allocatable array
components and use runtime helper instead of inline recursion.
Add includes for cgraph.h and function.h.
* trans-decl.cc (gfor_fndecl_cfi_deep_copy_array): New declaration
for runtime deep copy helper.
(gfc_build_builtin_function_decls): Initialize the runtime helper
declaration.
* trans-intrinsic.cc (conv_intrinsic_atomic_op): Enhance handling of
constant values in coarray atomic operations by detecting and
materializing address-of-constant expressions.
* trans.h (gfor_fndecl_cfi_deep_copy_array): Add external declaration.
libgfortran/ChangeLog:
PR fortran/121628
* Makefile.am: Add runtime/deep_copy.c to source files.
* Makefile.in: Regenerate.
* gfortran.map: Export _gfortran_cfi_deep_copy_array symbol.
* libgfortran.h: Add prototype for internal_deep_copy_array.
* runtime/deep_copy.c: New file implementing runtime deep copy
helper for recursive allocatable array components.
gcc/testsuite/ChangeLog:
PR fortran/121628
* gfortran.dg/alloc_comp_deep_copy_5.f90: New test for recursive
allocatable array deep copy.
* gfortran.dg/alloc_comp_deep_copy_6.f90: New test for multi-level
recursive allocatable deep copy.
* gfortran.dg/array_memcpy_2.f90: Fix test with proper allocation.
Signed-off-by: Christopher Albert <albert@tugraz.at>
Eric Botcazou [Thu, 6 Nov 2025 19:42:13 +0000 (20:42 +0100)]
Ada: Fix function call in object notation incorrectly rejected
This happens in the name of a procedure call, again when there
is an implicit dereference in this name, and the fix to apply to
Find_Selected_Component is again straightforward:
--- a/gcc/ada/sem_ch8.adb
+++ b/gcc/ada/sem_ch8.adb
@@ -8524,9 +8524,7 @@ package body Sem_Ch8 is
-- Error if the prefix is procedure or entry, as is P.X
if Ekind (P_Name) /= E_Function
- and then
- (not Is_Overloaded (P)
- or else Nkind (Parent (N)) = N_Procedure_Call_Statement)
+ and then not Is_Overloaded (P)
then
-- Prefix may mention a package that is hidden by a local
-- declaration: let the user know. Scan the full homonym
But this also changes the diagnostics in illegal cases because they are not
uniform in the procedure, so the change also factors them out so as to make
them uniform, which slightly improves them in the end.
gcc/ada/
PR ada/113352
* sem_ch4.adb (Diagnose_Call): Tweak error message.
* sem_ch8.adb (Find_Selected_Component): Remove bypass for calls
to procedures in the overloaded overloadable case. Factor out
the diagnostics code and invoke it uniformly in this case.
gcc/testsuite/
* gnat.dg/prefix3.adb: New test.
* gnat.dg/prefix3_pkg.ads: New helper.
* gnat.dg/prefix3_pkg.adb: Likewise.
Due to some quirks in crtstuff.c, attribute "retain" requires
some features that avr doesn't implement -- even though it
doesn't even use crtstuff. This patch works around that.
PR target/122516
gcc/
* config/avr/elf.h (SUPPORTS_SHF_GNU_RETAIN): Define if
HAVE_GAS_SHF_GNU_RETAIN.
In particular note insns 34, 42 and 43. Those are useless. Insns 36, 37, 38
are just a single bit extraction from a variable location (from one of the
if-converted blocks). I couldn't see a good way to fix the problem with insn
34/insn 42. The desire in cmove_arith to make the then/else blocks independent
is good; what's unclear is whether or not that code really cares about the
*destination* of the then/else blocks. But I set that aside.
We then thought that cleaning up the variable bit extraction would be the way
to go. So a pattern was constructed to match that form of variable bit extract
and the cost model was twiddled to return that it was a single fast
instruction. But even with those changes fwprop1 refused to make the
substitution. Sigh. At least combine recognizes the idiom later and cleans it
up.
Then we realized we really should just ignore the (set (reg) (const_int 0)) in
the if-converted sequence. We're going to be able to propagate that away in
nearly every case since we have a hard-wired zero register. Sure enough,
ignoring that insn was enough to tip the balance on this case and we get the
desired code.
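For context, the new test exercises source of roughly this flavor (a hypothetical sketch; the actual testcase is czero-bext.c), where one arm of the conditional is a plain zero and the other a variable single-bit extract:
  long cond_bit (long x, long n, long c)
  {
    return c ? ((x >> n) & 1) : 0;
  }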
Tested on riscv32-elf and riscv64-elf. Pioneer bootstrap is in flight, though
it won't really exercise this problem. The BPI's build hasn't started yet, so
it'll be at least 27 hours before it's done.
Waiting on pre-commit CI before moving forward.
gcc/
* config/riscv/riscv.cc (riscv_noce_conversion_profitable_p): Ignore
assignments of (const_int 0) to a register. They will get propagated
away.
gcc/testsuite
* gcc.target/riscv/czero-bext.c: New test.
Eric Botcazou [Thu, 6 Nov 2025 19:03:49 +0000 (20:03 +0100)]
Ada: Fix incorrect renaming of primitive subprogram in object notation
It is possible to declare a subprogram renaming whose name is a primitive
subprogram in object notation; in this case, the name is unconditionally
evaluated in the front-end (unlike for objects) so that, if an ad-hoc body
needs to be built for the renaming later, the name is not reevaluated for
every call to it.
This evaluation is skipped if the name contains an implicit dereference,
as reported in the first PR, and the fix is to make the dereference explicit
at the end of the processing done in Analyze_Renamed_Primitive_Operation,
as is done in the sibling procedure Analyze_Renamed_Entry. The patch also
makes a few consistency tweaks to them and also replaces a manual evaluation
of the name in Expand_N_Subprogram_Renaming_Declaration by a simple call to
Evaluate_Name, which is the procedure used for object renamings.
Analyze_Renamed_Primitive_Operation performs the resolution of the name
based on the declared profile, but it does not do that correctly in all
cases, as reported in the second PR; the fix is again straightforward.
gcc/ada/
PR ada/113350
PR ada/113551
* exp_ch2.adb (Expand_Renaming): Fix reference to Evaluate_Name.
* exp_ch8.adb (Expand_N_Subprogram_Renaming_Declaration): Call
Evaluate_Name to evaluate the name.
* sem_ch8.adb (Analyze_Renamed_Entry): Minor tweaks.
(Analyze_Renamed_Family_Member): Likewise.
(Analyze_Renamed_Primitive_Operation): Likewise.
Fix thinko in the function checking profile conformance, save the
result of the resolution and make implicit dereferences explicit.
gcc/testsuite
* gnat.dg/renaming19.adb: New test.
* gnat.dg/renaming19_pkg.ads: New helper.
* gnat.dg/renaming19_pkg.adb: Likewise.
AVR: AVR-SD: Put a valid opcode prior to gs() table in .subsection 1.
On functional safety devices (AVR-SD), each executed instruction must
be followed by a valid opcode. This is because instruction fetch and
decode for the next instruction runs while the 2-stage pipeline is
executing the current instruction.
There is only one case where avr-gcc generates code interspersed with
data, which is when a switch/case table is generated for a function
with a "section" attribute and AVR_HAVE_JMP_CALL. In that case, the
table with the gs() code label addresses is put in .subsection 1 so
that it belongs to the section as specified by the "section" attribute.
gcc/
* config/avr/avr.cc (avr_output_addr_vec): Output
a valid opcode prior to the first gs() label provided:
- The code is compiled for an arch that has AVR-SD mcus, and
- the function has a "section" attribute, and
- the function has a gs() label addresses switch/case table.
Your Name [Thu, 6 Nov 2025 16:50:22 +0000 (09:50 -0700)]
[RISC-V][PR 121136] Improve various tests which only need to examine upper bits in a GPR
So pre-commit CI flagged an issue with the initial version of this patch. In
particular the cmp-mem-const-{1,2} tests are failing.
I didn't see that in my internal testing, but that well could be an artifact of
having multiple patches touching in the same broad space that the tester is
evaluating. If I apply just this patch I can trigger the cmp-mem-const{1,2}
failures.
The code we're getting now is actually better than we were getting before, but
the new patterns avoid the path through combine that emits the message about
narrowing the load down to a byte load, hence the failure.
Given we're getting better code now than before, I'm just skipping this test on
risc-v. That's the only non-whitespace change since the original version of
this patch.
--
This addresses the first level issues seen in generating better performing code
for testcases derived from pr121136. It likely regresses code size in some
cases as in many cases it selects code sequences that should be better
performing, though larger to encode.
Improving -Os code generation should remain the primary focus of pr121136. Any
improvements in code size with this change are a nice side effect, but not the
primary goal.
--
Let's take this test (derived from the PR):
_Bool func1_0x1U (unsigned int x) { return x <= 0x1U; }
Those should produce the same output. We currently get these fragments for the
3 cases. In particular note how the second variant is a two instruction
sequence.
sltiu a0,a0,2
srliw a0,a0,1
seqz a0,a0
sltiu a0,a0,2
This patch will adjust that second sequence to match the first and third and is
optimal.
Let's take another case. This is interesting as it's right at the simm12
border:
_Bool func1_0x7ffU (unsigned long x) { return x <= 0x7ffU; }
In this case the second sequence is pretty good. Not perfect, but clearly
better than the other two. This patch will fix the code for case #1 and case
So anyway, that's the basic motivation here. So to be 100% clear, while the
bug is focused on code size, I'm focused on the performance of the resulting
code.
This has been tested on riscv32-elf and riscv64-elf. It's also bootstrapped
and regression tested on the Pioneer. The BPI won't have results for this
patch until late tomorrow.
--
PR rtl-optimization/121136
gcc/
* config/riscv/riscv.md: Add define_insn to test the
upper bits of a register against zero using sltiu when
the bits are extracted via zero_extract or logial right shift.
Add 3->2 define_splits for gtu/leu cases testing upper bits
against zero.
gcc/testsuite
* gcc.target/riscv/pr121136.c: New test.
* gcc.dg/cmp-mem-const-1.c: Skip for risc-v.
* gcc.dg/cmp-mem-const-2.c: Likewise.
Robert Dubner [Thu, 6 Nov 2025 12:26:18 +0000 (07:26 -0500)]
cobol: Mainly extends compilation and execution in finternal-ebcdic.
We expanded our extended testing regime to execute many testcases in
EBCDIC mode as well as in ASCII. This exposed hundreds of problems in
both compilation (where conversions must be made between the ASCII
source code and the EBCDIC execution environment) and in run-time
functionality, where results from calls to system routines and internal
calculations that must be done in ASCII have to be converted to EBCDIC.
These changes also switch to using FIXED_WIDE_INT(128) instead of
REAL_VALUE_TYPE when initializing fixed-point COBOL variable types.
This provides for accurate initialization up to 37 digits, instead of
losing accuracy after 33 digits.
These changes also support the implementation of the COBOL DELETE FILE
(Format 2) statement.
These changes also introduce expanded support for specifying character
encodings, including support for locales.
co-authored-by: Robert Dubner <rdubner@symas.com>
co-authored-by: James K. Lowden <jklowden@cobolworx.com>
gcc/cobol/ChangeLog:
* Make-lang.in: Repair documentation generation.
* cdf.y: Changes to tokens.
* cobol1.cc (cobol_langhook_handle_option): Add comment.
* genapi.cc (function_pointer_from_name): Use data.original() for
function name.
(parser_initialize_programs): Likewise.
(cobol_compare): Make sure encodings of comparands are the same.
(move_tree): Change name of DEFAULT_SOURCE_ENCODING macro.
(parser_enter_program): Typo.
(psa_FldLiteralN): Break out dirty_to_binary() support routine.
(dirty_to_binary): Likewise.
(parser_alphabet): Rename 'alphabet' to 'collation_sequence'.
(parser_allocate): Change wsclear() to be uint32_t instead of char.
(parser_label_label): Formatting.
(parser_label_goto): Likewise.
(get_the_filename): Breakout get_the_filename(), which handles
encoding.
(parser_file_open): Likewise.
(set_up_delete_file_label): Implement DELETE FILE (Format 2).
(parser_file_delete_file): Likewise.
(parser_file_delete_on_exception): Likewise.
(parser_file_delete_not_exception): Likewise.
(parser_file_delete_end): Likewise.
(parser_call): Use data.original().
(parser_entry): Use data.original().
(mh_source_is_literalN): Convert from
sourceref.field->codeset.encoding.
(binary_initial_from_float128): Change to "binary_initial".
(binary_initial): Calculate in FIXED_WIDE_INT(128) instead of
REAL_VALUE_TYPE.
(digits_from_int128): New routine uses binary_initial.
(digits_from_float128): Removed. Kept as comment for reference.
(initial_from_initial): Use binary_initial.
(actually_create_the_static_field): Use correct encoding.
(parser_symbol_add): Likewise.
* genapi.h (parser_file_delete_file): Implement FILE DELETE.
(parser_file_delete_on_exception): Implement FILE DELETE.
(parser_file_delete_not_exception): Implement FILE DELETE.
(parser_file_delete_end): Implement FILE DELETE.
* genmath.cc: Include charmaps.h.
* genutil.cc (get_literal_string): Change name of
DEFAULT_SOURCE_ENCODING macro.
* parse.y: Token changes; numerous changes in support of encoding;
support for DELETE FILE.
* parse_ante.h (name_of): Use data.original().
(class prog_descr_t): Support of locales.
(current_options): Formatting.
(current_encoding): Formatting.
(current_program_index): Formatting.
(current_section): Formatting.
(current_paragraph): Formatting.
(is_integer_literal): Use correct encoding.
(value_encoding_check): Handle encoding changes.
(alphabet_add): Likewise.
(data_division_ready): Likewise.
* scan.l: Use data.original().
* show_parse.h: Use correct encoding.
* symbols.cc (elementize): Likewise.
(symbol_elem_cmp): Handle locale.
(struct symbol_elem_t): Likewise.
(symbol_locale): Likewise.
(field_str): Change DEFAULT_SOURCE_ENCODING macro name.
(symbols_alphabet_set): Formatting.
(symbols_update): Modify consistency checks.
(symbol_locale_add): Locale support.
(cbl_locale_t::cbl_locale_t): Locale support.
(cbl_alphabet_t::cbl_alphabet_t): New structure.
(cbl_alphabet_t::reencode): Formatting.
(cbl_alphabet_t::assign): Change name of collation_sequence.
(cbl_alphabet_t::also): Likewise.
(new_literal_add): Anticipate the need for four-byte characters.
(guess_encoding): Eliminate.
(cbl_field_t::internalize): Refine conversion of data.initial to
specified encoding.
* symbols.h (enum symbol_type_t): Add SymLocale.
(struct cbl_field_data_t): Incorporate data.orig.
(struct cbl_field_t): Likewise.
(struct cbl_delete_file_t): New structure.
(struct cbl_label_t): Incorporate cbl_delete_file_t.
(struct cbl_locale_t): Support for locale.
(hex_decode): Comment.
(struct cbl_alphabet_t): Incorporate locale; change variable name
to collation_sequence.
(struct symbol_elem_t): Incorporate locale.
(cbl_locale_of): Likewise.
(cbl_alphabet_of): Likewise.
(symbol_locale_add): Likewise.
(wsclear): Type is now uint32_t instead of char.
* util.cc (symbol_type_str): Incorporate locale.
(cbl_field_t::report_invalid_initial_value): Change test so that
pure PIC A() variables are limited to [a-zA-Z] and space.
(valid_move): Use DEFAULT_SOURCE_ENCODING macro.
(cobol_filename): Formatting.
Richard Biener [Mon, 3 Nov 2025 13:04:55 +0000 (14:04 +0100)]
SSA immediate use iterator checking
The following implements additional checking around
SSA immediate use iteration. Specifically this prevents
- any nesting of FOR_EACH_IMM_USE_STMT inside another iteration
via FOR_EACH_IMM_USE_STMT or FOR_EACH_IMM_USE_FAST when iterating
on the same SSA name
- modification (for now unlinking of immediate uses) of a SSA
immediate use list when a fast iteration of the immediate uses
of the SSA name is active
- modification (for now unlinking of immediate uses) of the immediate
use list outside of the block of uses for the currently active stmt
of an ongoing FOR_EACH_IMM_USE_STMT of the SSA name
To implement this additional bookkeeping members are put into the
SSA name structure when ENABLE_GIMPLE_CHECKING is active. I have
kept the existing consistency checking of the fast iterator.
* ssa-iterators.h (imm_use_iterator::name): Add.
(delink_imm_use): When in a FOR_EACH_IMM_USE_STMT iteration
enforce we only remove uses from the current stmt.
(end_imm_use_stmt_traverse): Reset current stmt.
(first_imm_use_stmt): Assert no FOR_EACH_IMM_USE_STMT on
var is in progress. Set the current stmt.
(next_imm_use_stmt): Set the current stmt.
(auto_end_imm_use_fast_traverse): New, lower iteration
depth upon destruction.
(first_readonly_imm_use): Bump the iteration depth.
* tree-core.h (tree_ssa_name::active_iterated_stmt,
tree_ssa_name::fast_iteration_depth): New members when
ENABLE_GIMPLE_CHECKING.
* tree-ssanames.cc (make_ssa_name_fn): Initialize
immediate use verifier bookkeeping members.
Richard Biener [Fri, 31 Oct 2025 12:08:05 +0000 (13:08 +0100)]
Make FOR_EACH_IMM_USE_STMT work w/o fake imm use node
This is an attempt to fix PR122502 by making a FOR_EACH_IMM_USE_FAST
within a FOR_EACH_IMM_USE_STMT on _the same_ VAR work without
the former running into the FOR_EACH_IMM_USE_STMT inserted marker
use operand. It does this by getting rid of the marker.
The downside is that this in principle restricts the set of operations
that can be done on the immediate use list of VAR. Where previously
almost anything was OK (but technically not well-defined what happens
to the iteration) after this patch you may only remove immediate
uses of VAR on the current stmt from the FOR_EACH_IMM_USE_STMT
iteration. In particular things will break if you happen to remove
the one immediate use of VAR on the stmt immediately following
the set of immediate uses on the currrent stmt.
Additional checking to combat such cases is implemented in a
followup.
PR tree-optimization/122502
* ssa-iterators.h (imm_use_iterator::iter_node): Remove.
(imm_use_iterator::next_stmt_use): New.
(next_readonly_imm_use): Adjust checking code.
(end_imm_use_stmt_traverse): Simplify.
(link_use_stmts_after): Likewise. Return the last use
with the same stmt.
(first_imm_use_stmt): Simplify. Set next_stmt_use.
(next_imm_use_stmt): Likewise.
(end_imm_use_on_stmt_p): Adjust.
This effective-target does not need to check for arm32, but needs to
force -march=armv8-a, otherwise -mfpu=fp-armv8 has no useful meaning.
While fixing that, introduce
check_effective_target_arm_v8_vfp_ok_nocache, so that arm_v8_vfp_ok
behaves like arm_v8_neon_ok and many other effective-targets.
Without this patch, gcc.target/arm/attr-neon.c fails with a toolchain
configured with --with-mode=thumb --with-cpu=cortex-m0
--with-float=soft because arm_v8_vfp returns "" because arm32 is
false. As a result, the testcase is compiled with the options needed
for arm_neon_ok, which generates an extra ".fpu neon" directive
compared to what is expected.
The patch removes -march=armv8-a from dg-options in lceil-vcvt_1.c,
lfloor-vcvt_1.c lround-vcvt_1.c and vrinta-ce.c, because this could
override what arm_v8_vfp_ok detected (and lead to 'error: selected
architecture lacks an FPU').
With this patch, the test passes, and several others are enabled:
gcc.target/arm/lceil-vcvt_1.c
gcc.target/arm/lfloor-vcvt_1.c
gcc.target/arm/lround-vcvt_1.c
gcc.target/arm/pr69135_1.c
gcc.target/arm/vmaxnmdf.c
gcc.target/arm/vmaxnmsf.c
gcc.target/arm/vminnmdf.c
gcc.target/arm/vminnmsf.c
gcc.target/arm/vrinta-ce.c
gcc.target/arm/vrintaf32.c
gcc.target/arm/vrintaf64.c
gcc.target/arm/vrintmf32.c
gcc.target/arm/vrintmf64.c
gcc.target/arm/vrintpf32.c
gcc.target/arm/vrintpf64.c
gcc.target/arm/vrintrf32.c
gcc.target/arm/vrintrf64.c
gcc.target/arm/vrintxf32.c
gcc.target/arm/vrintxf64.c
gcc.target/arm/vrintzf32.c
gcc.target/arm/vrintzf64.c
gcc.target/arm/vseleqdf.c
gcc.target/arm/vseleqsf.c
gcc.target/arm/vselgedf.c
gcc.target/arm/vselgesf.c
gcc.target/arm/vselgtdf.c
gcc.target/arm/vselgtsf.c
gcc.target/arm/vselledf.c
gcc.target/arm/vsellesf.c
gcc.target/arm/vselltdf.c
gcc.target/arm/vselltsf.c
gcc.target/arm/vselnedf.c
gcc.target/arm/vselnesf.c
gcc.target/arm/vselvcdf.c
gcc.target/arm/vselvcsf.c
gcc.target/arm/vselvsdf.c
gcc.target/arm/vselvssf.c
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_arm_v8_vfp_ok_nocache): New.
(check_effective_target_arm_v8_vfp_ok): Call the above helper, and
use global flags.
(add_options_for_arm_v8_vfp): Use et_arm_v8_vfp_flags.
* gcc.target/arm/lceil-vcvt_1.c: Remove -march=armv8-a.
* gcc.target/arm/lfloor-vcvt_1.c: Likewise.
* gcc.target/arm/lround-vcvt_1.c: Likewise.
* gcc.target/arm/vrinta-ce.c: Likewise.
Peter Damianov [Thu, 6 Nov 2025 00:14:44 +0000 (00:14 +0000)]
libiberty: Add BigObj COFF support for LTO on Windows targets [PR122472]
This patch adds support for the BigObj COFF object file format to libiberty's
simple-object-coff.c. BigObj extends regular COFF to support a 32-bit section
count.
BigObj differs from COFF in a few ways:
* A different header structure
* 32-bit section counts instead of 16-bit
* 32-bit symbol section numbers instead of 16-bit
* 20-byte symbols instead of 18-byte symbols
(due to the extended section numbers)
For a more detailed summary, read my blog post on this subject:
https://peter0x44.github.io/posts/bigobj_format_explained/
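A rough layout sketch of the symbol-size difference (assumed field names, not libiberty's structures; the on-disk records are stored without padding):
  #include <cstdint>
  struct coff_symbol           // regular COFF symbol: 18 bytes on disk
  {
    char     name[8];
    uint32_t value;
    int16_t  section_number;   // 16-bit section number
    uint16_t type;
    uint8_t  storage_class;
    uint8_t  num_aux;
  };
  struct bigobj_symbol         // BigObj symbol: 20 bytes on disk
  {
    char     name[8];
    uint32_t value;
    int32_t  section_number;   // widened to 32 bits
    uint16_t type;
    uint8_t  storage_class;
    uint8_t  num_aux;
  };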
libiberty/ChangeLog:
PR target/122472
* simple-object-coff.c (struct external_filehdr_bigobj): New
structure for BigObj file header.
(bigobj_magic): New constant for BigObj magic bytes.
(struct external_syment_bigobj): New structure for BigObj
20-byte symbol table entries.
(union external_auxent_bigobj): New union for BigObj 20-byte
auxiliary symbol entries.
(struct simple_object_coff_read): Add is_bigobj flag and make
nscns 32-bit to support both formats.
(struct simple_object_coff_attributes): Add is_bigobj flag.
(simple_object_coff_match): Add BigObj format detection.
(simple_object_coff_read_strtab): Use format-specific symbol
size when calculating string table offset.
(simple_object_coff_attributes_merge): Check is_bigobj flag.
(simple_object_coff_write_filehdr_bigobj): New function.
(simple_object_coff_write_to_file): Add logic for writing
BigObj vs regular COFF format with appropriate symbol
and auxiliary entry structures.
Signed-off-by: Peter Damianov <peter0x44@disroot.org>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
Xi Ruoyao [Tue, 4 Nov 2025 13:03:18 +0000 (21:03 +0800)]
LoongArch: Switch the default code model to medium
It has turned out the normal code model isn't enough for some large
LoongArch link units in practice. Quoting WANG Rui's comment [1]:
We’ve actually been considering pushing for a change to the default
code model for LoongArch compilers (including GCC) for a while now.
In fact, this was one of the topics discussed in yesterday’s internal
compiler tool-chain meeting. The reason we haven’t moved forward with
it yet is that the medium code model generates a R_LARCH_CALL36
relocation, which had some issues with earlier versions of the linker.
We need to assess the impact on users before proceeding with the change.
In GCC we have build-time probe for linker call36 support and if the
linker does not support it, we fall back to pcalau12i + jirl or
la.{local,global} + jirl for the medium code model. I also had some
concern about a potential performance regression caused by the
conservative nature of the relaxation process, but when I tested this
patch it turned out the relaxation is powerful enough to eliminate all
the pcaddu18i instructions in cc1plus and libstdc++.so.
The Loong Arch Linux project has been using -mcmodel=medium in their
{C,CXX}FLAGS building packages for a while [2] and they've not reported
any issues with that.
The Linux kernel developers have already anticipated the change and
explicitly specified -mcmodel=normal for a while [3].
Thus to me it's safe to make GCC 16 the first release with the medium
code model as the default now. If someone must keep the normal code
model as the default for any reason, it's possible to configure GCC
using --with-cmodel=normal.
gcc/ChangeLog:
* config.gcc: Support --with-cmodel={medium,normal} and make
medium the default for LoongArch, define TARGET_DEFAULT_CMODEL
as the selected value.
* config/loongarch/loongarch-opts.cc: Use TARGET_DEFAULT_CMODEL
instead of hard coding CMODEL_NORMAL.
* doc/install.texi: Document that --with-cmodel= is supported
for LoongArch.
* doc/invoke.texi: Update the document about default code model
on LoongArch.
Nathaniel Shead [Sun, 2 Nov 2025 04:58:39 +0000 (15:58 +1100)]
c++/modules: Complain on imported GMF TU-local entities in instantiation [PR121574]
An unfortunate side effect of the previous patch is that even with
-pedantic-errors, unless the user specifies -Wtemplate-names-tu-local
when building the module interface there will be no diagnostic at all
from instantiating a template that exposes global TU-local entities,
either when building the module or its importer.
This patch solves this by recognising imported TU-local dependencies,
even if they weren't streamed as TU_LOCAL_ENTITY nodes. The warnings
here are deliberately conservative for when we can be sure this was
actually an imported TU-local entity; in particular, we bail on any
TU-local entity that originated from a header module, without attempting
to determine if the entity came via a named module first.
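A hedged sketch of the scenario (module and entity names are invented,
not taken from the actual testcase):

  // m.cppm: module interface unit
  module;
  static int gmf_helper () { return 42; }  // TU-local: internal linkage, GMF
  export module m;
  export template <typename T>
  int wrap () { return gmf_helper (); }    // only an exposure once instantiated

  // user.cpp: importer
  import m;
  int x = wrap<int> ();  // instantiating wrap exposes the imported TU-local
                         // entity, so the importer now gets a diagnostic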
PR c++/121574
gcc/cp/ChangeLog:
* cp-tree.h (instantiating_tu_local_entity): Declare.
* module.cc (is_tu_local_entity): Extract from depset::hash.
(is_tu_local_value): Likewise.
(has_tu_local_tmpl_arg): Likewise.
(depset::hash::is_tu_local_entity): Remove.
(depset::hash::has_tu_local_tmpl_arg): Remove.
(depset::hash::is_tu_local_value): Remove.
(instantiating_tu_local_entity): New function.
(depset::hash::add_binding_entity): No longer go through
depset::hash to check is_tu_local_entity.
* pt.cc (complain_about_tu_local_entity): Remove.
(tsubst): Use instantiating_tu_local_entity.
(tsubst_expr): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-17_b.C: Check for diagnostics when
instantiating imported TU-local entities.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Thu, 30 Oct 2025 12:13:21 +0000 (23:13 +1100)]
c++/modules: Allow ignoring some TU-local exposure errors in GMF [PR121574]
A frequent issue with migrating to C++20 modules has been dealing with
third-party libraries with internal functions or data. This causes GCC
to currently refuse to build the module if any references to these
internal-linkage declarations escape into the module CMI.
This can seem needlessly hostile, however, especially since we have the
capabilities to support this (to a degree) from header units, albeit
with the inherent ODR issues associated with their use. In aid of this,
this patch demotes the error to a pedwarn in various scenarios, by
treating some declarations as not being TU-local even if they otherwise
would have been.
Effort has been made to not alter semantics of valid programs, and to
continue to diagnose cases that the standard says we must. In
particular, any code in the module purview is still a hard error, due to
the inherent issues with exposing TU-local entities, and the lack of any
migration requirements.
Because this patch is just to assist migration, we only deal with the
simplest (yet most common) cases: namespace scope functions and
variables. Types are hard to handle neatly as we risk getting thousands
of unhelpful warnings as we continue to walk the type body and find new
TU-local entities to complain about. Templates are also tricky because
it's hard to tell if an instantiation that occurred in the module
purview only refers to global module entities or if it's inadvertently
exposing a purview entity as well. Neither of these is likely to occur
frequently in third-party code; if need be, this can be relaxed later as
well.
Similarly, even in the GMF a constexpr variable with a TU-local value
will not be usable in constant expressions in the importer, and since we
cannot easily warn about this from the importer we continue to make this
an error in the module interface.
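A hedged sketch of the migration scenario being relaxed (file and
function names are invented):

  // legacy.h: unchanged third-party header
  static int legacy_impl () { return 1; }  // internal linkage

  // iface.cppm: module interface unit
  module;
  #include "legacy.h"                      // lands in the global module fragment
  export module wrapper;
  export inline int api () { return legacy_impl (); }
  // api exposes a GMF TU-local entity in the CMI; previously a hard error,
  // with this patch it is only diagnosed with a pedwarn/warning.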
PR c++/121574
gcc/c-family/ChangeLog:
* c.opt: New warning '-Wexpose-global-module-tu-local'.
* c.opt.urls: Regenerate.
gcc/cp/ChangeLog:
* module.cc (depset::disc_bits): Replace 'DB_REFS_TU_LOCAL_BIT'
and 'DB_EXPOSURE_BIT' with four new flags
'DB_{REF,EXPOSE}_{GLOBAL,PURVIEW}_BIT'.
(depset::is_tu_local): Support checking either for only purview
TU-local entities or any entity described TU-local by standard.
(depset::refs_tu_local): Likewise.
(depset::is_exposure): Likewise.
(depset::hash::make_dependency): A constant initialized to a
TU-local variable is always considered a purview exposure.
(is_exposure_of_member_type): Adjust sanity checks to handle if
we ever relax requirements for TU-local types.
(depset::hash::add_dependency): Differentiate referencing
purview or GMF TU-local entities.
(depset::hash::diagnose_bad_internal_ref): New function.
(depset::hash::diagnose_template_names_tu_local): New function.
(depset::hash::finalize_dependencies): Handle new warnings that
might be needed for GMF TU-local entities.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-17_a.C: New test.
* g++.dg/modules/internal-17_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Eric Botcazou [Wed, 5 Nov 2025 20:15:35 +0000 (21:15 +0100)]
Ada: Fix qualified name of discriminant incorrectly accepted in constraint
The RM 3.8(12/3) subclause says that a discriminant mentioned in a
constraint must appear alone as a direct name. The last part is not
consistently checked and, while the first part is, it generates a
slightly different error message depending on the form of the input.
This fixes the last part and changes the first to use a single message.
gcc/ada/
PR ada/35793
* sem_res.adb (Check_Discriminant_Use): In a constraint context,
check that the discriminant appears alone as a direct name in all
cases and give a consistent error message when it does not.
gcc/testsuite/
* gnat.dg/specs/discr8.ads: New test.
Fix the unified shared memory test
libgomp.c++/target-std__multimap-concurrent-usm.C,
added in commit r16-1010-g83ca283853f195
"libgomp: Add testcases for concurrent access to standard C++ containers
on offload targets" as one of a number of USM variants.
This test includes the actual code of target-std__multimap-concurrent.C.
The issue is that multimap::insert allocates memory, which is freed by
the destructor. However, if the memory is allocated on a device
(in 'insert'), it also needs to be freed there (in 'clear'), as in general
freeing device-allocated memory is not possible on the host.
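A minimal sketch of the pattern the fix applies (not the actual testcase):

  #include <map>

  #pragma omp requires unified_shared_memory

  int main ()
  {
    std::multimap<int, int> m;
    #pragma omp target
    {
      m.insert ({1, 2});  // node memory is allocated on the device
      m.clear ();         // so it must also be freed on the device,
    }                     // before the host-side destructor runs
    return 0;
  }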
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-std__multimap-concurrent.C: Fix memory
freeing of device allocated memory with USM.
Paul Thomas [Wed, 5 Nov 2025 12:11:00 +0000 (12:11 +0000)]
Fortran: Fix PDT constructors in associate [PR122501, PR122524]
2025-11-05 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/122501
PR fortran/122524
* primary.cc (gfc_convert_to_structure_constructor): Correct
whitespace issue.
(gfc_match_rvalue): Remove the attempt to match specific procs
before filling out PDT constructor. Instead, defer this until
resolution with the condition that there not be a following
arglist and more than one procedure in the generic interface.
gcc/testsuite/
PR fortran/122501
* gfortran.dg/pdt_66.f03: New test.
PR fortran/122524
* gfortran.dg/pdt_67.f03: New test.
Artemiy Volkov [Sat, 1 Nov 2025 17:17:15 +0000 (17:17 +0000)]
forwprop: allow subvectors in simplify_vector_constructor ()
This is an attempt to fix
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/697879.html in the
middle-end; the motivation in that patch was to teach gcc to compile:
foo:
dup d31, v0.d[1]
uzp1 v0.2d, v31.2d, v0.2d
ret
Instead of adding a define_insn in the backend, this patch relaxes the
precondition of tree-ssa-forwprop.cc:simplify_vector_constructor () to
accept subvectors as constructor elements. During initial argument
processing (ll. 3817-3916), subvectors are decomposed into individual
elements before populating the ELTS array; this allows the rest of the
function to remain unchanged. Special handling is also implemented for
constant and splat subvector elements of a constructor (the latter with
the use of ssa_uniform_vector_p () from tree-vect-generic.cc, which this
patch moves to tree.cc).
Add GIMPLE tests to gcc.dg/tree-ssa demonstrating the intended behavior
with various combinations of subvectors as constructor arguments,
including constant and splat subvectors; also add some aarch64-specific
tests to show that the change leads to us picking the "ext" instruction
for the resulting VEC_PERM_EXPR.
Bootstrapped and regtested on aarch64 and x86_64, regtested on aarch64_be.
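A hedged source-level illustration (assuming an aarch64 target; not
necessarily the new testcase):

  #include <arm_neon.h>

  /* vget_high_f64/vget_low_f64 produce 64-bit subvectors and vcombine_f64
     builds a 128-bit vector from them, i.e. a CONSTRUCTOR whose elements
     are subvectors.  With the relaxed precondition forwprop can fold this
     into a single VEC_PERM_EXPR, which aarch64 matches as "ext".  */
  float64x2_t
  swap_halves (float64x2_t x)
  {
    return vcombine_f64 (vget_high_f64 (x), vget_low_f64 (x));
  }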
gcc/ChangeLog:
* tree-ssa-forwprop.cc (simplify_vector_constructor): Support
vector constructor elements.
* tree-vect-generic.cc (ssa_uniform_vector_p): Make non-static and
move ...
* tree.cc (ssa_uniform_vector_p): ... here.
* tree.h (ssa_uniform_vector_p): Declare it.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/forwprop-43.c: New test.
* gcc.target/aarch64/simd/combine_ext.c: New test.
Richard Biener [Mon, 3 Nov 2025 13:43:39 +0000 (14:43 +0100)]
Use gather_imm_use_stmts instead of FOR_EACH_IMM_USE_STMT in forwprop
The following fixes forwprop using FOR_EACH_IMM_USE_STMT to iterate
over stmts and then eventually removing the active stmt, releasing
its defs. This can cause debug stmt insertion with a RHS referencing
the SSA name we iterate over, adding to its immediate use list
but also adjusting all other debug stmts referring to the released
SSA name, updating those. And those can refer to the original
iterated-over variable.
In the end the destructive behavior of update_stmt is the problem
here: it unlinks all uses of a stmt and then links in the newly
computed ones instead of leaving in place the ones that are unchanged.
The solution is to not rely on FOR_EACH_IMM_USE_STMT but to gather the
stmt uses without duplicates up front and iterate over that list.
* tree-ssa-forwprop.cc (forward_propagate_addr_expr):
Use gather_imm_use_stmts instead of FOR_EACH_IMM_USE_STMT.
Richard Biener [Tue, 4 Nov 2025 09:48:44 +0000 (10:48 +0100)]
Add gather_imm_use_stmts helper
The following adds a helper function to gather SSA use stmts without
duplicates. It steals the only padding bit in gimple to be an
"infrastructure local flag" which should be used only temporarily
and kept cleared. I did not add accessor functions for the flag
to not encourage (ab-)uses.
I have used an auto_vec<gimple *, 2> in the API to avoid heap
allocations for most cases (without doing statistics). I have
verified that GCC 7 performs NRV optimization on the copy, but I'll
note that while auto_vec<gimple *> has copy and assign deleted,
auto_vec<gimple *, N> does not. Adding them breaks the pair-fusion.cc
compile. Without using 'auto' or range-for the API use is a bit
awkward as that exposes the number of auto-allocated elements.
The helper can be used in a range-for, see the followup for an
example.
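A hedged sketch of the intended range-for use (the exact signature is
inferred from the description above, so treat it as an approximation):

  /* Gather the immediate-use stmts of NAME once, without duplicates, so
     that later stmt modifications cannot disturb an ongoing iteration.  */
  for (gimple *use_stmt : gather_imm_use_stmts (name))
    {
      /* ... inspect or rewrite use_stmt; calling update_stmt here is safe.  */
    }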
* gimple.h (gimple::pad): Rename to ...
(gimple::ilf): ... this.
* ssa-iterators.h (gather_imm_use_stmts): Declare.
* tree-ssa-operands.cc (gather_imm_use_stmts): New function.
Richard Biener [Mon, 3 Nov 2025 12:59:36 +0000 (13:59 +0100)]
Fix unsafe stmt modifications in FOR_EACH_IMM_USE_STMT
The following fixes path isolation changing the immediate use list of
an SSA name that is currently iterated over via FOR_EACH_IMM_USE_STMT.
This happens when it duplicates a BB within this iteration and creates/modifies
stmts that contain SSA uses of the name and calls update_stmt which
re-builds SSA operands, including removal of SSA uses and re-inserting
them. This is not safe as that might cause missed iterated uses but
more importantly could cause the 'next' use to be removed.
For the case in question the fix is to collect interesting uses in
a vec and do the processing outside of the FOR_EACH_IMM_USE_STMT
iteration.
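A hedged sketch of the collect-then-process pattern (the predicate and
transform are invented placeholders):

  auto_vec<gimple *> to_transform;
  imm_use_iterator iter;
  gimple *use_stmt;
  FOR_EACH_IMM_USE_STMT (use_stmt, iter, name)
    if (interesting_use_p (use_stmt))      /* placeholder predicate */
      to_transform.safe_push (use_stmt);
  /* The imm-use list of NAME is no longer being walked here, so the
     transform may create/modify stmts and call update_stmt freely.  */
  for (gimple *stmt : to_transform)
    isolate_erroneous_path (stmt);         /* placeholder transform */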
* gimple-ssa-isolate-paths.cc (check_loadstore): Set
the volatile flag on the stmt manually.
(find_implicit_erroneous_behavior): Move code transform
outside of FOR_EACH_IMM_USE_STMT iteration.
The "vrepli.b" instruction is introduced by the init-regs pass (see
PR61810 and all the issues it references). To work it around, we can
use post-reload instead of define_expand: the "f" constraint will make
the compiler automatically move the scalar between GPR and FPR, and
reload is much later than init-regs so init-regs won't get in our way.
Now the code looks like:
movgr2fr.d $f0,$r4
vpcnt.d $vr0,$vr0
movfr2gr.d $r4,$f0
jr $r1
gcc/ChangeLog:
* config/loongarch/loongarch.md (cntmap): Change to uppercase.
(popcount<GPR:mode>2): Modify to a post reload split.
Eric Botcazou [Tue, 4 Nov 2025 18:54:45 +0000 (19:54 +0100)]
Ada: Fix incorrect legality check in instantiation of child generic unit
The problem arises when the generic unit has a formal access type parameter,
because the manual resolution implemented in Find_Actual_Type does not pick
the correct entity for the designated type. The fix replaces it with a bona
fide resolution and cleans up the associated code in the callers.
gcc/ada/
PR ada/18453
* sem_ch12.adb (Find_Actual_Type): Add Typ_Ref parameter and
perform a standard resolution on it in the fallback case.
Call Get_Instance_Of if the type is declared in a formal of
the child unit.
(Instantiate_Type.Validate_Access_Type_Instance): Adjust call
to Find_Actual_Type.
(Instantiate_Type.Validate_Array_Type_Instance): Likewise and
streamline the check for matching component subtypes.
gcc/testsuite/
* gnat.dg/specs/generic_inst9.ads: New test.
* gnat.dg/specs/generic_inst9_pkg1.ads: New helper.
* gnat.dg/specs/generic_inst9_pkg2.ads: Likewise.
* gnat.dg/specs/generic_inst9_pkg2-g.ads: Likewise.
This function acts on entire parameter declaration lists and iterates
over them. Use the plural in the name to clarify that it acts on
parameters, not just on a single parameter.
Kees Cook [Tue, 26 Aug 2025 04:09:41 +0000 (21:09 -0700)]
arc: Add const attribute support for mathematical ARC builtins
The ARC builtin functions __builtin_arc_ffs and __builtin_arc_fls
perform pure mathematical operations equivalent to the standard
GCC __builtin_ffs function, which is marked with the const attribute.
However, the ARC target-specific versions were not marked as const,
preventing compiler optimizations like common subexpression elimination.
Extend the ARC builtin infrastructure to support function attributes
and mark the appropriate mathematical builtins as const:
- __builtin_arc_ffs: Find first set bit (const)
- __builtin_arc_fls: Find last set bit (const)
- __builtin_arc_norm: Count leading zeros (const)
- __builtin_arc_normw: Count leading zeros for 16-bit (const)
- __builtin_arc_swap: Endian swap (const)
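A hedged example of the optimization this enables (assuming an ARC
target; the new test checks a similar property):

  /* With __builtin_arc_ffs marked const, the second identical call is
     eliminated by common subexpression elimination.  */
  int
  use_ffs_twice (int x)
  {
    return __builtin_arc_ffs (x) + __builtin_arc_ffs (x);
  }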
gcc/ChangeLog:
* config/arc/builtins.def: Add ATTRS parameter to DEF_BUILTIN
macro calls. Mark mathematical builtins (FFS, FLS, NORM, NORMW,
SWAP) with attr_const, leave others as NULL_TREE.
* config/arc/arc.cc: Add support for builtin function attributes.
Create attr_const using tree_cons. Update DEF_BUILTIN macro to
pass ATTRS parameter to add_builtin_function.
gcc/testsuite/ChangeLog:
* gcc.target/arc/builtin_fls_const.c: New test. Verify that
const attribute enables CSE optimization for mathematical ARC
builtins by checking that duplicate calls are eliminated and
results are optimized to shift operations.
OpenMP/Fortran: Revamp handling of labels in metadirectives [PR122369,PR122508]
When a label is matched in the first statement after the end of a metadirective
body, it is bound to the associated region. However, this prevents it from being
referenced elsewhere.
This patch fixes it by rebinding such labels to the outer region. It also
ensures that labels defined in an outer region can be referenced in a
metadirective body.
PR fortran/122369
PR fortran/122508
gcc/fortran/ChangeLog:
* gfortran.h (gfc_rebind_label): Declare new function.
* parse.cc (parse_omp_metadirective_body): Rebind labels to the outer
region. Maintain a vector of metadirective regions.
(gfc_parse_file): Initialise it.
* parse.h (GFC_PARSE_H): Declare it.
* symbol.cc (gfc_get_st_label): Look for existing labels in outer
metadirective regions.
(gfc_rebind_label): Define new function.
(gfc_define_st_label): Accept duplicate labels in metadirective body.
(gfc_reference_st_label): Accept shared DO termination labels in
metadirective body.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/pr122369-1.f90: New test.
* gfortran.dg/gomp/pr122369-2.f90: New test.
* gfortran.dg/gomp/pr122369-3.f90: New test.
* gfortran.dg/gomp/pr122369-4.f90: New test.
* gfortran.dg/gomp/pr122508-1.f90: New test.
* gfortran.dg/gomp/pr122508-2.f90: New test.
Pan Li [Mon, 3 Nov 2025 11:27:24 +0000 (19:27 +0800)]
Match: Refactor min based unsigned SAT_MUL pattern by widen mul helper [NFC]
There are 3 kinds of widen_mul during the unsigned SAT_MUL pattern, aka
* widen_mul directly, like _3 w* _4
* convert and then widen_mul, like (uint64_t)_3 w* (uint64_t)_4
* convert and then mul, like (uint64_t)_3 * (uint64_t)_4
All of them will be referenced during different forms of the unsigned
SAT_MUL pattern match, but actually we can wrap them into a helper
which presents the "widening_mul" semantics. With this helper, some
unnecessary patterns and duplicated code can be eliminated.
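A hedged source-level sketch of the saturation idiom such patterns match,
written in the convert-and-then-mul form (types chosen for illustration):

  #include <stdint.h>

  uint32_t
  sat_mul_u32 (uint32_t a, uint32_t b)
  {
    uint64_t w = (uint64_t) a * (uint64_t) b;           /* convert, then mul */
    return w > UINT32_MAX ? UINT32_MAX : (uint32_t) w;  /* saturate on overflow */
  }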
The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.
gcc/ChangeLog:
* match.pd: Add usmul_widen_mult helper and referenced by
min based unsigned SAT_MUL pattern.
On i686, offsets into object archives can be 64-bit, but they are
treated inconsistently across the LTO code, which may sometimes result in
truncation of those offsets for large archives.
Use int64_t/off_t consistently across all uses of archive offsets to
make sure that they're always read and mapped correctly.
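A hedged sketch of the failure mode (values are illustrative):

  #include <stdint.h>

  /* Archive member offsets can exceed 32 bits; storing one in a narrower
     type silently truncates it and later reads map the wrong bytes.  */
  int64_t member_offset = INT64_C (5) * 1024 * 1024 * 1024;  /* 5 GiB */
  unsigned int truncated = (unsigned int) member_offset;     /* wraps around */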
gcc/lto/ChangeLog
PR lto/122515
* lto.h (lto_section_slot): Set type of START to off_t.
* lto-common.cc (lto_read_section_data): Adjust.
* lto-object.cc (lto_obj_file_open): Set type of OFFSET to
int64_t.
gcc/ChangeLog
PR lto/122515
* lto-wrapper.cc (debug_objcopy): Set type of INOFF to int64_t.
(run_gcc): Set type of FILE_OFFSET to int64_t.
gcc/testsuite/ChangeLog
PR lto/122515
* lib/lto.exp (lto-build-archive): New procedure.
(lto-execute-1): Use it.
(lto-link-and-maybe-run, lto-get-options-main): Handle ar-link.
* gcc.dg/lto/pr122515_0.c: New test case.
* gcc.dg/lto/pr122515_1.c: New file.
* gcc.dg/lto/pr122515_2.c: Likewise.
* gcc.dg/lto/pr122515_3.c: Likewise.
* gcc.dg/lto/pr122515_4.c: Likewise.
* gcc.dg/lto/pr122515_5.c: Likewise.
* gcc.dg/lto/pr122515_6.c: Likewise.
* gcc.dg/lto/pr122515_7.c: Likewise.
* gcc.dg/lto/pr122515_8.c: Likewise.
* gcc.dg/lto/pr122515_9.c: Likewise.
Nathaniel Shead [Thu, 16 Oct 2025 11:51:23 +0000 (22:51 +1100)]
c++: Don't constrain template visibility using no-linkage variables [PR122253]
When finding the minimal visibility of a template, any reference to a
dependent automatic variable will cause the instantiation to be marked
as internal linkage. However, when processing the template decl we
don't yet know whether that should actually be the case, as a given
instantiation may not require referencing the local decl in its
mangling.
This patch fixes the issue by checking for no-linkage decls first, in
which case we just constrain using the type of the entity. We can't use
a check for lk_external/lk_internal in the other cases, as
instantiations referring to internal types can still have external
linkage as determined by the language, but should still constrain the
visibility of any declarations that refer to them.
PR c++/122253
gcc/cp/ChangeLog:
* decl2.cc (min_vis_expr_r): Don't mark no-linkage declarations
as VISIBILITY_ANON.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-16.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Tobias Burnus [Tue, 4 Nov 2025 09:23:31 +0000 (10:23 +0100)]
gfortran.dg/pr122513-2.f90: New test [PR122513]
This test is from PR122513; even though the actual error message was
already added in GCC 15, there was no testcase for the diagnostic type
Index variable 'i' at (1) cannot be specified in a locality-spec
Thus, this commit adds one.
gcc/testsuite/ChangeLog:
PR fortran/122513
* gfortran.dg/pr122513-2.f90: New test.
Kishan Parmar [Tue, 4 Nov 2025 07:11:28 +0000 (12:41 +0530)]
simplify-rtx: Canonicalize SUBREG and LSHIFTRT order for AND operations
For a given rtx expression
(and (lshiftrt (subreg X) shift) mask)
the combine pass tries to simplify the RTL form to
(and (subreg (lshiftrt X shift)) mask)
where the SUBREG wraps the result of the shift. This leaves the AND
and the shift in different modes, which complicates recognition.
The preferred canonical form is
(and (lshiftrt (subreg X) shift) mask)
where the SUBREG is inside the shift and both operations share the same
mode. This form is easier to recognize across targets and enables
cleaner pattern matching.
This patch makes simplify-rtx perform this transformation when it is
safe: the SUBREG must be a lowpart, the shift amount must be valid, and
the precision of the operation must be preserved.
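A hedged source-level example of code that can produce this RTL shape
(not taken from the PR):

  /* The shift happens in DImode; the truncation to unsigned int becomes a
     lowpart SUBREG and the mask an SImode AND, roughly
     (and (subreg (lshiftrt X 8)) 0xff), which the patch canonicalizes to
     (and (lshiftrt (subreg X) 8) 0xff).  */
  unsigned int
  extract_byte (unsigned long long x)
  {
    return (unsigned int) (x >> 8) & 0xff;
  }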
Tested on powerpc64le-linux-gnu, powerpc64-linux-gnu, and
x86_64-pc-linux-gnu with no regressions. On rs6000, the change reduces
insn counts due to improved matching.
2025-11-04 Kishan Parmar <kishan@linux.ibm.com>
gcc/ChangeLog:
PR rtl-optimization/93738
* simplify-rtx.cc (simplify_binary_operation_1): Canonicalize
SUBREG(LSHIFTRT) into LSHIFTRT(SUBREG) when valid.
David Malcolm [Tue, 4 Nov 2025 02:42:59 +0000 (21:42 -0500)]
analyzer: add event kinds for special control flow [PR122544]
The SARIF 3.38.8 "kinds" property has some verbs for expressing
control flow, but is missing some of the awkward special cases.
The spec says "If none of these values are appropriate, a SARIF
producer MAY use any value."
This patch adds the following new values:
* "throw" for throwing an exception
* "catch" for catching an exception
* "unwind" for unwinding stack frame(s) during exception-handling
* "setjmp" for calls to setjmp
* "longjmp" for calls to longjmp that rewind the program counter/stack
to the location of a previous setjmp call
gcc/analyzer/ChangeLog:
PR analyzer/122544
* checker-event.cc (catch_cfg_edge_event::get_meaning): New.
(setjmp_event::get_meaning): New.
(rewind_event::get_meaning): New.
(throw_event::get_meaning): New.
(unwind_event::get_meaning): New.
* checker-event.h (catch_cfg_edge_event::get_meaning): New decl.
(setjmp_event::get_meaning): New decl.
(rewind_event::get_meaning): New decl.
(throw_event::get_meaning): New decl.
(unwind_event::get_meaning): New decl.
gcc/ChangeLog:
PR analyzer/122544
* diagnostics/paths.cc (event::meaning::maybe_get_verb_str):
Handle the new verbs.
* diagnostics/paths.h (event::meaning::verb): Add new values
for special control flow operations.
(event::meaning::meaning): Add ctor taking just a verb.
gcc/testsuite/ChangeLog:
PR analyzer/122544
* g++.dg/analyzer/exception-path-1-sarif.py: New test script.
* g++.dg/analyzer/exception-path-1.C: Add SARIF output, and use
the above to check it.
* g++.dg/analyzer/exception-path-unwind-multiple-2-sarif.py: New
test script.
* g++.dg/analyzer/exception-path-unwind-multiple-2.C: Add SARIF
output, and use the above to check it.
* gcc.dg/analyzer/setjmp-3-sarif.py: New test script.
* gcc.dg/analyzer/setjmp-3.c: Add SARIF output, and use
the above to check it.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Eric Botcazou [Mon, 3 Nov 2025 23:40:39 +0000 (00:40 +0100)]
Ada: Fix segfault for instantiation on function call returning string
The problem is that a transient scope is created during the analysis of the
actual parameters of the instantiation and this discombobulates the complex
handling of scopes in Sem_Ch12.
gcc/ada/
PR ada/78175
* sem_ch12.adb (Hide_Current_Scope): Deal with a transient scope
as current scope.
(Remove_Parent): Likewise.
gcc/testsuite/
* gnat.dg/generic_inst15.adb: New test.
* gnat.dg/generic_inst15_pkg-g.ads: New helper.
* gnat.dg/generic_inst15_pkg.ads: Likewise.