git.ipfire.org Git - thirdparty/gcc.git/log

RISC-V: Add test for vec_duplicate + vssubu.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check test for vec_duplicate + vssubu.vv combine to
vssubu.vx, with GR2VR cost 0, 1 and 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vssubu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vssubu.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check and run test for vec_duplicate + vssubu.vv
combine to vssubu.vx, with GR2VR cost 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Reconcile the existing test due to cost model change

The cost model change sets the default cost of vx to 2, so
reconcile the asm checks with this change.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_trunc-1-u16.c:
Update the asm check due to cost model change.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_trunc-1-u32.c:
Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_trunc-1-u8.c:
Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Combine vec_duplicate + vssubu.vv to vssubu.vx on GR2VR cost

This patch combines vec_duplicate + vssubu.vv into vssubu.vx, as in
the example code below.  The related pattern depends on the cost of
vec_duplicate from GR2VR.  Late-combine performs the combination if
the GR2VR cost is zero and rejects it if the GR2VR cost is greater
than zero.

Assume we have example code like the following, with a GR2VR cost of 0.

  #define DEF_VX_BINARY(T, FUNC)                                      \
  void                                                                \
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = FUNC (in[i], x);                                       \
  }

  T sat_sub(T a, T b)
  {
    return (a - b) & (-(T)(a >= b));
  }

  DEF_VX_BINARY(uint32_t, sat_sub)
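
For readers who want to try the example outside the testsuite, here is a minimal self-contained sketch of the same code specialised to uint32_t. The names sat_sub_u32 and test_vx_binary_sat_sub_u32 are hypothetical and not part of the patch; the arithmetic is the branchless form shown above.

```c
#include <stdint.h>

/* Branchless unsigned saturating subtraction, same form as sat_sub above.
   (a >= b) is 0 or 1; converting it to uint32_t and negating it gives an
   all-zeros or all-ones mask, so the wrapped difference is kept only
   when no underflow happened.  */
uint32_t
sat_sub_u32 (uint32_t a, uint32_t b)
{
  return (a - b) & (-(uint32_t) (a >= b));
}

/* Hypothetical hand expansion of DEF_VX_BINARY (uint32_t, sat_sub):
   the loop-invariant scalar x is the operand that vec_duplicate would
   otherwise broadcast into a vector register.  */
void
test_vx_binary_sat_sub_u32 (uint32_t *restrict out, uint32_t *restrict in,
                            uint32_t x, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = sat_sub_u32 (in[i], x);
}
```

Because x is the same for every iteration, it can stay in a general-purpose register, which is what the .vx form in the "After this patch" listing uses.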

Before this patch:
  test_vx_binary_or_int32_t_case_0:
      beq a3,zero,.L8
      vsetvli a5,zero,e32,m1,ta,ma
      vmv.v.x v2,a2
      slli    a3,a3,32
      srli    a3,a3,32
  .L3:
      vsetvli a5,a3,e32,m1,ta,ma
      vle32.v v1,0(a1)
      slli    a4,a5,2
      sub a3,a3,a5
      add a1,a1,a4
      vssubu.vv v1,v1,v2
      vse32.v v1,0(a0)
      add a0,a0,a4
      bne a3,zero,.L3

After this patch:
  test_vx_binary_or_int32_t_case_0:
      beq a3,zero,.L8
      slli    a3,a3,32
      srli    a3,a3,32
  .L3:
      vsetvli a5,a3,e32,m1,ta,ma
      vle32.v v1,0(a1)
      slli    a4,a5,2
      sub a3,a3,a5
      add a1,a1,a4
      vssubu.vx v1,v1,a2
      vse32.v v1,0(a0)
      add a0,a0,a4
      bne a3,zero,.L3

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_vec_dup): Add
new case US_MINUS.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op us_minus.

Signed-off-by: Pan Li <pan2.li@intel.com>

ada: Fix Execution_Successful value with exceptions

Store the Exit_Code value and use that to generate
the Execution_Successful value in the SARIF report.

gcc/ada/ChangeLog:

* comperr.adb (Compiler_Abort): Pass the exit code in calls to
Output_Messages.
* errout.adb (Output_Messages): Add new parameter for the
Exit_Code and store its value.
* errout.ads (Output_Messages): Likewise.
* erroutc-sarif_emitter.adb (Print_Invocations): Set
Execution_Successful based on the exit code.
* erroutc.ads (Exit_Code): Store the exit code value.
* gnat1drv.adb (Gnat1drv): Pass the exit code in calls to
Output_Messages.
* prepcomp.adb (Parse_Preprocessing_Data_File, Prepare_To_Preprocess):
Likewise.

ada: Refine use of Has_Exit

The description of the Has_Exit field in Einfo makes it pretty clear
that it can only be meaningful for loop entities. It was however defined
in all entities until this patch, which restricts this field to E_Loop.

gcc/ada/ChangeLog:

* gen_il-gen-gen_entities.adb (Gen_Entities): Tweak Has_Exit.

ada: Make class-wide Max_Size_In_Storage_Elements return a large value

Max_Size_In_Storage_Elements is supposed to return a value greater than or
equal to what is passed for any heap allocation for an object of the
type. For a tagged type T, we don't know the allocation size for
descendants; therefore T'Class'Max_Size_In_Storage_Elements should
return a huge number. In particular, it now returns Storage_Count'Last,
which is greater than any possible heap allocation.

Previously, T'Class'Max_Size_In_Storage_Elements was returning
the same value as T'Max_Size_In_Storage_Elements, which was
wrong.

gcc/ada/ChangeLog:

* exp_attr.adb (Attribute_Max_Size_In_Storage_Elements):
Return Storage_Count'Last converted to universal_integer.

ada: Add documentation of implemented Ada 2022 features

gcc/ada/ChangeLog:

* doc/gnat_rm.rst: add entry point for the new chapter
* doc/gnat_rm/about_this_guide.rst: add reference to the new
chapter
* doc/gnat_rm/implementation_of_ada_2022_features.rst: new file
* doc/gnat_rm/implementation_of_ada_2012_features.rst: update
explanation about RM references
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.

ada: Reuse Snames classification of reserved words

Before this patch, Check_Future_Keyword had hardcoded lists of what
reserved words were introduced in what versions of the Ada language
specification. This patch makes it use the classification in Snames
instead.

gcc/ada/ChangeLog:

* par-util.adb (Check_Future_Keyword): Use Snames subtypes. Extend
comment.

ada: Remove obsolete comment

This patch removes a comment that was left over when an exception
declaration was removed.

gcc/ada/ChangeLog:

* sem_ch5.adb (Analyze_Loop_Statement): Remove obsolete comment.

ada: Fix bug in -gnatw.o switch (unreferenced out parameters)

Fixes this bug: If -gnatw.o is specified, it is ignored unless
-gnatwm is also specified (either directly, or as part of a
catch-all switch like -gnatwa).

gcc/ada/ChangeLog:

* sem_warn.adb (Warn_On_Useless_Assignments):
Enable Warn_On_Useless_Assignment in the case of
Warn_On_All_Unread_Out_Parameters.

ada: Remove uses of E_Void for subtype declarations

This patch slightly reorganizes Analyze_Subtype_Declaration so that the
proper Ekind of the new subtype's entity is set before anything else is
done with it. A new local subprogram is introduced in the process.

gcc/ada/ChangeLog:

* sem_ch3.adb (Analyze_Subtype_Declaration): Remove uses of E_Void.
(Copy_Parent_Attributes): New procedure.

ada: Remove useless Set_Scope calls

This patch removes calls to Set_Scope that have no effect because of
subsequent calls to Append_Entity, which calls Set_Scope itself.

gcc/ada/ChangeLog:

* cstand.adb (Make_Aliased_Component, Make_Formal, New_Operator,
Create_Standard): Remove useless calls.

ada: Fix missing rounding in System.Value_R.Scan_Raw_Real

The extra digit returned by the function is supposed to be rounded, either
by Scan_Integral_Digits or by Scan_Decimal_Digits, but that is not the case
when it is the last digit read by Scan_Integral_Digits.

The problem is fixed by rounding it in Scan_Decimal_Digits in this case.

gcc/ada/ChangeLog:

* libgnat/s-valuer.adb (Scan_Decimal_Digits): Also pretend that the
precision limit was just reached if it was already reached.
(Scan_Integral_Digits): Add Extra_Rounded out parameter, set it to
False on entry and to True when Extra is rounded.
(Scan_Raw_Real): New Extra_Rounded local variable. Pass it in the
calls to Scan_Integral_Digits. If it is True, pass a dummy extra
digit to Scan_Decimal_Digits.

ada: Ignore ghost predicate in Ada.Strings.Superbounded

Add an assertion policy to ignore the ghost predicates in
Ada.Strings.Superbounded.

gcc/ada/ChangeLog:

* libgnat/a-strsup.ads: Ignore Ghost_Predicate in the assertion policy.

ada: Array aggregates of mutably tagged objects

When an array of mutably tagged class-wide types is initialized
with an array aggregate, the compiler erroneously rejects it
reporting that the type of the aggregate cannot be a
class-wide type. In addition, Program_Error is not raised at
runtime on array type or record type objects when they have
mutably tagged abstract class-wide type components that are
initialized by default.

gcc/ada/ChangeLog:

* sem_aggr.adb (Resolve_Record_Aggregate): Adjust the code to
handle mutably tagged class-wide types since they don't have
discriminants, but all class-wide types are considered to have
unknown discriminants. Initialize mutably tagged class-wide
type components calling their IP subprogram.
* exp_aggr.adb (Gen_Assign): Handle mutably tagged class-wide type
components that have an initializing qualified expression, and
mutably tagged class-wide components default initialization.
(Gen_Loop): Handle mutably tagged class-wide types.
(Gen_Assign): ditto.
(Build_Record_Aggr_Code): Default initialization of mutably tagged
class-wide types is performed by their IP subprogram.
* exp_ch3.adb (Init_Component): Generate code to raise Program_Error
in the IP subprogram of arrays when the type of their components is
a mutably tagged abstract class-wide type.
(Build_Init_Procedure): ditto for the init procedure of record types.
(Build_Init_Statements): Ensure that the type of the expression
initializing a mutably class-wide tagged type component is frozen.
(Requires_Init_Proc): Mutably tagged class-wide types require the
init-proc since it takes care of their default initialization.
* sem_util.adb (Needs_Simple_Initialization): Mutably tagged class-wide
types don't require simple initialization.
* types.ads (PE_Abstract_Type_Component): New reason for Program_Error.
* types.h (PE_Abstract_Type_Component): ditto.
* exp_ch11.adb (Get_RT_Exception_Name): Handle new reason for
Program_Error.
* libgnat/a-except.adb (Rcheck_PE_Abstract_Type_Component): New
subprogram.

ada: Sort subprogram declaration in alphabetic order

Code cleanup; semantics is unaffected.

gcc/ada/ChangeLog:

* sem_util.ads (Get_Enclosing_Object, Get_Enum_Lit_From_Pos,
Is_Universal_Numeric_Type): Reorder declarations.

ada: Ignore unchecked type conversions while getting enclosing object

This patch both makes GNAT emit warnings on unused assignments where previously
they were suppressed for obscure reasons and synchronizes routine
Get_Enclosing_Object with a similar routine in GNATprove (which differs in
handling of explicit dereferences).

gcc/ada/ChangeLog:

* sem_util.adb (Get_Enclosing_Object): Traverse unchecked type
conversions since they come from the compiler and should be transparent
for semantic reasoning.

ada: Fix Itype-related predicate check omissions (part 2).

Add to the previous fix for this issue to better handle cases where
GNATProve calls Einfo.Utils.Predicate_Function, passing in an Itype.

gcc/ada/ChangeLog:

* einfo-utils.adb (Predicate_Function): Look through an Itype if
that takes us to another subtype of the same type.

ada: Filling in gaps in handling of inherited Pre/Post'Class aspects

The initial set of changes for doing proper mapping of calls to primitive
functions in Pre/Post'Class aspects inherited by derived types was not
handling some cases (such as when formals are referenced as part of
dereferences, certain aspects such as 'Old and 'Access, and conditional
and declare expressions), and mishandling other cases (such as nested
function calls).

This set of changes attempts to properly address those cases. It also
includes a change to suppress unneeded (and sometimes wrong) accessibility
checks on conversions of actual parameters of a derived type to the parent
type when passing them on calls to parent primitives (encountered while
developing these changes).

gcc/ada/ChangeLog:

* exp_util.adb (Must_Map_Call_To_Parent_Primitive): Change function
name (was Call_To_Parent_Dispatching_Op_Must_Be_Mapped). Move logic
for attributes and dereferences, plus testing for controlled formals,
into new function Expr_Has_Ctrl_Formal_Ref. Add handling for
access attributes, multiple levels of attributes/dereferences,
conditional_expressions, and declare_expressions. Properly account
for function calls with multiple operands and enclosing calls.
(Expr_Has_Ctrl_Formal_Ref): New function to determine whether
an expression is a reference to a controlling formal or has
a prefix that is such a reference.
(Is_Controlling_Formal_Ref): New function in Expr_Has_Ctrl_Formal_Ref
to determine if a node is a direct reference to a controlling formal.
* freeze.adb (Build_DTW_Body): Create an unchecked conversion instead
of a regular type conversion for converting actuals in calls to parent
inherited primitives that are wrapped for inherited pre/postconditions.
Avoids generating unnecessary checks (such as accessibility checks on
conversions for anonymous access formals).

ada: Tune name and comment of a ghost utility routine

Detection of ghost entities works similarly for names of objects (in assignment
statements) and for names of subprograms (in subprogram calls). Tune the routine
name and its comment to match this similarity.

gcc/ada/ChangeLog:

* sem_util.ads (Get_Enclosing_Ghost_Entity): Rename spec.
* sem_util.adb (Get_Enclosing_Ghost_Object): Rename body; reorder
alphabetically; adapt recursive call.
* ghost.adb: Adapt calls to Get_Enclosing_Ghost_Object.

ada: Fix detection of ghost objects in unusual procedure calls

When the name of a called procedure involves unusual constructs, e.g. type
conversions (like in "Typ (Obj).all"), we must look at the outermost construct
to decide whether the name denotes a ghost entity.

gcc/ada/ChangeLog:

* ghost.adb (Ghost_Entity): Remove; use Get_Enclosing_Ghost_Object
instead; adapt callers.

ada: Fix bogus error for pragma No_Component_Reordering on record type

This happens when the record type has an incomplete declaration before its
full declaration and is fixed by calling Find_Type appropriately.

gcc/ada/ChangeLog:

* sem_prag.adb (Analyze_Pragma) <Pragma_No_Component_Reordering>:
Call Find_Type on the first argument of the pragma.

ada: Remove redundant `gnatls -l` switch

gcc/ada/ChangeLog:

* gnatls.adb: remove -l switch

ada: 'Size'Class and interface types documentation

Update GNAT RM documentation of the Size'Class aspect.

gcc/ada/ChangeLog:

* doc/gnat_rm/gnat_language_extensions.rst: Update documentation for
mutably tagged types and the Size'Class aspect.
* gnat_rm.texi: Regenerate.

ada: Fix detection of ghost objects in assignment statements

Remove duplicated and inconsistent code for detecting ghost objects on the
left-hand side of assignment statements. Fix detection in the presence of
attribute references (e.g. "X'Access.all"), function calls (e.g. "F.all"),
qualified expressions (e.g. "T'(new Integer'(0)).all") and unchecked type
conversions (which come from expansion).

gcc/ada/ChangeLog:

* ghost.adb
(Whole_Object_Ref): Remove; use Get_Enclosing_Ghost_Object instead.
(Is_Ghost_Assignment): Handle more than object identifiers.
(Mark_And_Set_Ghost_Assignment): Likewise.
* sem_util.adb (Get_Enclosing_Ghost_Object): Detect more expressions
as ghost references; rename to better match the intended meaning.
* sem_util.ads (Get_Enclosing_Ghost_Object): Rename; adjust comment.

ada: Fix fallout of latest change to aggregate expansion

It exposed a small loophole in the Backend_Processing_Possible predicate.

gcc/ada/ChangeLog:

* exp_aggr.adb (Backend_Processing_Possible.Component_Check): Return
False for delayed conditional expressions.

ada: Elide copy for calls as values of nonlimited by-reference components

...in aggregates.  This prevents a temporary from being created on the
primary stack to hold the result of the function calls before it is copied
to the component of the aggregate in the nonlimited by-reference case.

This requires a small tweak to Check_Function_Writable_Actuals to avoid
giving a spurious error in a specific case.

gcc/ada/ChangeLog:

* exp_aggr.ads (Parent_Is_Regular_Aggregate): New predicate.
* exp_aggr.adb (In_Place_Assign_OK.Safe_Component): Implement more
accurate criterion for function calls.
(Convert_To_Assignments): Use Parent_Is_Regular_Aggregate predicate.
(Expand_Array_Aggregate): Likewise.  Remove obsolete comment.
(Initialize_Component): Do not adjust when the expression is a naked
function call and Back_End_Return_Slot is True.
(Parent_Is_Regular_Aggregate): New predicate.
* exp_ch3.adb (Build_Record_Init_Proc.Build_Assignment): Add test of
Back_End_Return_Slot in conjunction with a function call.
* exp_ch4.adb (Expand_Allocator_Expression): Likewise.  Use the
Is_Container_Aggregate predicate to detect container aggregates.
(Expand_N_Case_Expression): Delay the expansion if the parent is a
regular aggregate and the type should not be copied.
(Expand_N_If_Expression): Likewise.
(New_Assign_Copy): New function.
* exp_ch6.adb (Expand_Ctrl_Function_Call): Bail out when the parent
is a regular aggregate.
* sem_util.adb (Check_Function_Writable_Actuals): Do not take into
account attribute references created by the compiler.

ada: use pointer decay for socket address type compatibility

GCC 14 is stricter about type conversions. Taking the address of an
array and decaying the array to a pointer to its first element yield
the same address, but the types are no longer considered compatible.
The socket data structures want decayed pointers rather than addresses
of arrays, so drop the '&'s.
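
The distinction can be illustrated with a minimal sketch. The function fill_buffer is a hypothetical stand-in for an API that expects a decayed pointer; it is not the VxWorks socket interface touched by this patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for an API that expects a plain char * buffer.  */
size_t
fill_buffer (char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    buf[i] = 'x';
  return len;
}

/* Returns 16 when both spellings refer to the same address.  */
size_t
demo_decay (void)
{
  char storage[16];

  /* storage decays to char *, which matches the parameter type.  */
  size_t n = fill_buffer (storage, sizeof storage);

  /* &storage has type char (*)[16]: same address, different type.
     Passing it directly to fill_buffer is an incompatible pointer
     type, which GCC 14 diagnoses as an error by default.  */
  char (*whole)[16] = &storage;

  return ((void *) whole == (void *) storage) ? n : 0;
}
```

The address comparison goes through void * precisely because the two pointer types are not compatible with each other.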

gcc/ada/ChangeLog:

* socket.c [__vxworks]
(__gnat_gethostbyname): Drop excess '&'.
(__gnat_gethostbyaddr): Likewise.

ada: include header to declare isalpha in adaint

A vxworks-specific part of adaint.c calls isalpha without including
ctype.h. gcc-14 rejects calls of undeclared functions. Include the
required header file when compiling for vxworks.
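
As a small illustration of the rule being enforced, the sketch below calls isalpha with its declaring header in place. The helper starts_with_drive_letter is hypothetical and is not the code in adaint.c.

```c
#include <ctype.h>   /* declares isalpha; without it GCC 14 rejects the call */
#include <stddef.h>

/* Hypothetical helper: does the path start with a drive-letter-like
   prefix such as "c:"?  The cast to unsigned char keeps the argument
   inside the range that isalpha is defined for.  */
int
starts_with_drive_letter (const char *path)
{
  return path != NULL
         && isalpha ((unsigned char) path[0])
         && path[1] == ':';
}
```

For an empty string the second condition is already false, so path[1] is never read.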

gcc/ada/ChangeLog:

* adaint.c [__vxworks]: Include ctype.h.

ada: Record type Put_Image procedures omitting discriminant values

If a type T has a partial view with a known_discriminant_part and no
user-specified Put_Image aspect specification, then the output generated
by T'Put_Image would incorrectly omit the discriminant values.

gcc/ada/ChangeLog:

* exp_put_image.adb (Build_Record_Put_Image_Procedure): If
Discriminant_Specifications takes us from the full view of a type
to an (intentionally) unanalyzed subtree, then instead find
discriminant entities by calling Discriminant_Specifications on
the partial view of the type.

ada: Fix crash on nested access-to-subprogram types

This patch fixes a crash on some subprograms with anonymous
access-to-subprogram parameters by removing delayed freezing of
subprograms in some cases where it wasn't necessary. The -gnatD output
for itypes is also improved.

gcc/ada/ChangeLog:

* sem_ch6.adb (Check_Delayed_Subprogram, Possible_Freeze): Restrict
cases where freezing is delayed.
* sem_ch6.ads (Check_Delayed_Subprogram): Improve documentation
comment.
* sprint.adb (Write_Itype): Improve output.

ada: Remove redundant condition in test of System.Val_Real.Integer_To_Real

The second condition of the conjunction is redundant with the first.

gcc/ada/ChangeLog:

* libgnat/s-valrea.adb (Integer_to_Real): Rename to...
(Integer_To_Real): ...this. Remove the second condition of the
conjunction in the test for the zero value.
(Scan_Real): Adjust to above renaming.
(Value_Real): Likewise.
* libgnat/s-valuer.ads (Scan_Raw_Real): Add note about Val.

ada: Fix a couple of typos in the sanitizers for Ada

gcc/ada/ChangeLog:

* doc/gnat_ugn/gnat_and_program_execution.rst: Fix a
couple of minor formatting issues.
* gnat_ugn.texi: Regenerate.

ada: Add entity chain debug printing subprograms

This patch adds two pn-like subprograms that print entity chains.

gcc/ada/ChangeLog:

* treepr.ads (Print_Entity_Chain, pec, rpec): New subprograms.
* treepr.adb (Print_Entity_Chain, pec, rpec): Likewise.

ada: Fix typo in comment

gcc/ada/ChangeLog:

* atree.ads (Parent_Or_List_Containing): Fix typo.

ada: Tweak handling of Parent field in Print_Node

Before this patch, Print_Node failed to honor its Prefix_Char formal
parameter when printing the Parent field. This had no consequences
because Prefix_Char was only used to print members of Nlists, and those
don't have a parent in the tree. But this patch fixes it anyway in
preparation for new debug printing features.

gcc/ada/ChangeLog:

* treepr.adb (Print_Node): Tweak Parent field printing.

ada: Document sanitizers for Ada

gcc/ada/ChangeLog:

* doc/gnat_ugn/gnat_and_program_execution.rst: Add the
documentation about using sanitizers with Ada code.
* gnat_ugn.texi: Regenerate.

ada: Remove dead branch from Get_Enclosing_Object

The dead branch in routine Get_Enclosing_Object was most likely some
experiment from the early days of GNATprove. This routine is meant
to be called with the LHS of an assignment statement, where an implicit
dereference is always rewritten into an explicit one, regardless of
whether code is generated.

gcc/ada/ChangeLog:

* sem_util.adb (Get_Enclosing_Object): Remove dead code.

ada: Fix Itype-related predicate check omissions.

Clean up problematic interactions between Itype subtypes and predicates,
which were causing required predicate checks to be (incorrectly) omitted.

gcc/ada/ChangeLog:

* einfo-utils.adb (Predicate_Function): Improve handling of a case
where a predicate specified for a subtype of a partial view of a
type was incorrectly ignored.
(Set_Predicate_Function): If the attribute has already been set to
the same value, then do nothing (instead of raising P_E).
* sem_ch13.adb (Build_Predicate_Function): Add new function
Has_Source_Predicate. If a subtype inherits a predicate but also
has its own explicitly specified predicate, then avoid
misinterpreting the presence of the function built for the
inherited predicate to mean that no additional predicate function
is needed.
* sem_util.adb (Build_Subtype): In the case where we are given a
constrained record or array subtype and we need to construct a
different subtype, subject to a different constraint, the
subtype_mark of the constructed subtype needs to reference an
unconstrained subtype (because a new constraint is going to be
imposed). If the Predicated_Parent attribute of the given subtype
is present and refers to a suitable unconstrained subtype, then
use that subtype instead of setting the Predicated_Parent
attribute on a new node (and performing the associated attribute
copying).

ada: Fix internal error on Ghost aspect applied to Big_Integers

That's a regression introduced by the rewrite of the finalization machinery,
in the form of dangling references to Master_Node entities remaining in the
tree after the removal of the ignored Ghost code.

gcc/ada/ChangeLog:

* exp_ch7.adb (Process_Transient_In_Scope): Bail out if the object
is an ignored ghost entity.

ada: Fix internal error on expression function called for default expression

This happens for the default expression of a controlled component when an
aggregate is used for the record type, because of a freeze node generated
for the expression within an artificial block that is needed to implement
the cleanup actions attached to the assignment of the component.

This is fixed by extending the special treatment applied to freeze nodes
by Insert_Actions, in the case of loops generated for aggregates, to the
case of blocks generated for aggregates.

gcc/ada/ChangeLog:

* exp_util.adb (Insert_Actions): Extend special treatment applied
to freeze nodes to the case of blocks generated for aggregates.

ada: Adjust comparisons in if-statements according to coding style

The Ada coding style requires the use of short circuit forms in
if-statements. Use this form consistently for all if-statements.

gcc/ada/ChangeLog:

* libgnat/s-valuer.adb: Switch missing if-statements to
short-circuit form.
* libgnat/i-cpoint.adb: Ditto.

rs6000: Disassemble opaque modes using subregs to allow optimizations

PR109116 reveals missed optimizations when using unspecs to extract
vector components from opaque-mode variables. Since RTL optimizers do
not understand unspecs, this leads to redundant register copies. Replace
unspecs with subregs, which are well understood by RTL passes, allowing
optimizations to take place.

2025-06-30  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/109116
* config/rs6000/mma.md (unspec): Delete UNSPEC_MMA_EXTRACT.
(vsx_disassemble_pair): Expand into a vector register sized subreg.
(mma_disassemble_acc): Likewise.
(*vsx_disassemble_pair): Delete.
(*mma_disassemble_acc): Likewise.

RISC-V: Ignore -Oz for most rvv testcase [NFC]

Most testcases in the rvv folder already ignore -Oz, but some of them
do not. This patch makes them consistent.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/avl_single-21.c: Ignore -Oz.
* gcc.target/riscv/rvv/vsetvl/avl_single-26.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-36.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-39.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-41.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-22.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-2.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-4.c: Ditto.

RISC-V: Primary vector pipeline model for sifive 7 series

This commit introduces a primary vector pipeline model for the SiFive 7
series. The model is a simplified version: it only defines the vector
command queue, the arithmetic unit, and the vector load/store unit.

The latency of real hardware is LMUL-aware, but modeling that would
complicate the model considerably, so this uses a simplified version in
which all LMULs share the same latency. We may improve it later once we
have found a meaningful performance difference.

gcc/ChangeLog:

* config/riscv/sifive-7.md: Add primary vector pipeline model
for SiFive 7 series.

RISC-V: Adding B ext, fp16 and missing scalar instruction type for sifive-7 pipeline model [PR120659]

gcc/ChangeLog:

PR target/120659
* config/riscv/sifive-7.md: Add B extension, fp16 and missing
scalar instruction type for sifive-7 pipeline model.

gcc/testsuite/ChangeLog:

PR target/120659
* gcc.target/riscv/pr120659.c: New test.

Handle SLP build operand swapping for ternaries and calls

The following adds SLP build operand swapping for .FMA which is
a ternary operator and a call. The current code only handles
binary operators in assignments, thus the patch extends this to
handle both calls and assignments as well as binary and ternary
operators.
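
A minimal sketch of the kind of input where operand order differs between lanes is shown below. The function fma_pair is hypothetical and is not taken from the testcases; whether GCC actually vectorizes it depends on the target and flags in use.

```c
/* Two adjacent lanes whose multiplications name their operands in
   opposite order.  With floating-point contraction enabled, each
   statement can become a fused multiply-add, i.e. a ternary operation
   rather than a plain binary one.  */
void
fma_pair (double *restrict a, const double *restrict b,
          const double *restrict c, const double *restrict d)
{
  a[0] = b[0] * c[0] + d[0];
  a[1] = c[1] * b[1] + d[1];   /* multiplicands swapped relative to lane 0 */
}
```

Lining up b with b and c with c across the two lanes requires swapping the multiplicands of one lane, which is the situation the commit message describes for .FMA.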

* tree-vect-slp.cc (vect_build_slp_2): Handle ternary
and call operators when swapping operands.

* gcc.target/i386/vect-pr82426.c: Pass explicit -ffp-contract=fast.
* gcc.target/i386/vect-pr82426-2.c: New testcase variant with
-ffp-contract=on.

RISC-V: Vector-scalar negate-multiply-(subtract-)accumulate [PR119100]

This pattern enables the combine pass (or late-combine, depending on the case)
to merge a vec_duplicate into a (possibly negated) minus-mult RTL instruction.

Before this patch, we have two instructions, e.g.:
  vfmv.v.f        v6,fa0
  vfnmacc.vv      v2,v6,v4

After, we get only one:
  vfnmacc.vf      v2,fa0,v4
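
For reference, the per-element arithmetic of the two vector-scalar forms can be written as scalar loops. This is a sketch with hypothetical function names, using the element semantics given in the RVV specification; it is not the macro code from vf_mulop.h.

```c
/* vfnmacc.vf semantics: acc[i] = -(f * in[i]) - acc[i].  */
void
vf_nmacc_ref (float *restrict acc, const float *restrict in, float f,
              unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    acc[i] = -(f * in[i]) - acc[i];
}

/* vfnmsac.vf semantics: acc[i] = -(f * in[i]) + acc[i].  */
void
vf_nmsac_ref (float *restrict acc, const float *restrict in, float f,
              unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    acc[i] = -(f * in[i]) + acc[i];
}
```

In both loops f is loop-invariant, so it is the operand that would otherwise be broadcast with vfmv.v.f before a .vv instruction.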

PR target/119100

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*vfnmsub_<mode>,*vfnmadd_<mode>): Handle
both add and acc variants.
* config/riscv/vector.md (*pred_mul_neg_<optab><mode>_scalar_undef): New
pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f16.c: Add vfnmacc and
vfnmsac.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop.h (DEF_VF_MULOP_CASE_1):
Fix return type.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmacc-run-1-f16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmacc-run-1-f32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmacc-run-1-f64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsac-run-1-f16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsac-run-1-f32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsac-run-1-f64.c: New test.

aarch64: Add support for NVIDIA GB10

This adds support for -mcpu=gb10.  This is a big.LITTLE configuration
involving Cortex-X925 and Cortex-A725 cores.  The appropriate MIDR numbers
are added to detect them in -mcpu=native.  We did not add an
-mcpu=cortex-x925.cortex-a725 option because GB10 does include the crypto
instructions which we want on by default, and the current convention is to not
enable such extensions for Arm Cortex cores in -mcpu where they are optional
in the IP.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

* config/aarch64/aarch64-cores.def (gb10): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document the above.

Extend nonnull_if_nonzero attribute [PR120520]

C2Y voted in the
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3466.pdf
paper, which clarifies some of the conditional nonnull cases.
For strncat/__strncat_chk no changes are necessary, we already
use __attribute__((nonnull (1), nonnull_if_nonzero (2, 3))) attributes
on the builtin and glibc can do the same too, meaning that first
argument must be nonnull always and second must be nonnull if
the third one is nonzero.

The problem is with the fread/fwrite changes, where the paper adds:
If size or nmemb is zero,
+ptr may be a null pointer,
fread returns zero and the contents of the array and the state of
the stream remain unchanged.
and ditto for fwrite, so the two argument nonnull_if_nonzero attribute
isn't usable to express that, because whether the pointer can be null
depends on 2 integral arguments rather than one.

The following patch extends the nonnull_if_nonzero attribute, so that
instead of requiring 2 arguments it allows 2 or 3, the first one
is still the pointer argument index which sometimes must not be null
and the other one or two are integral arguments, if there are 2, the
invalid case is only if pointer is null and both the integral arguments
are nonzero.
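
As a rough illustration of the three-argument form (a sketch, not taken from the patch or its tests): fill_items below is a hypothetical fread-like helper, and the NONNULL_IF_BOTH_NONZERO macro with its GCC 16 version guard is an assumption made so that the snippet still compiles where the extended attribute is unavailable.

```cpp
#include <cstddef>
#include <cstring>

// Assumption: the three-argument form is only accepted by GCC 16 and later.
// Everywhere else the macro expands to nothing.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 16
# define NONNULL_IF_BOTH_NONZERO(ptr, n1, n2) \
    __attribute__((nonnull_if_nonzero(ptr, n1, n2)))
#else
# define NONNULL_IF_BOTH_NONZERO(ptr, n1, n2)
#endif

// Hypothetical fread-like helper: DST may be null only when SIZE or NMEMB
// is zero, which is what the (1, 2, 3) argument list expresses.
NONNULL_IF_BOTH_NONZERO(1, 2, 3)
std::size_t fill_items(void *dst, std::size_t size, std::size_t nmemb)
{
  if (size == 0 || nmemb == 0)
    return 0;                        // Nothing to do; DST is never touched.
  std::memset(dst, 0, size * nmemb); // Both counts nonzero: DST must be valid.
  return nmemb;
}
```

A call such as fill_items (nullptr, 0, 4) is valid under this contract; only a null pointer combined with two nonzero counts is the invalid case.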

2025-06-30 Jakub Jelinek <jakub@redhat.com>

PR c/120520
PR c/117023
gcc/
* builtin-attrs.def (DEF_LIST_INT_INT_INT): Define it and
use for 1,2,3.
(ATTR_NONNULL_IF123_LIST): New DEF_ATTR_TREE_LIST.
(ATTR_NONNULL_4_IF123_LIST): Likewise.
* builtins.def (BUILT_IN_FWRITE): Use ATTR_NONNULL_4_IF123_LIST
instead of ATTR_NONNULL_LIST.
(BUILT_IN_FWRITE_UNLOCKED): Likewise.
* gimple.h (infer_nonnull_range_by_attribute): Add another optional
tree * argument defaulted to NULL.
* gimple.cc (infer_nonnull_range_by_attribute): Add OP3 argument,
handle 3 argument nonnull_if_nonzero attribute.
* builtins.cc (validate_arglist): Handle 3 argument nonnull_if_nonzero
attribute.
* tree-ssa-ccp.cc (pass_post_ipa_warn::execute): Likewise.
* ubsan.cc (instrument_nonnull_arg): Adjust
infer_nonnull_range_by_attribute caller, handle 3 argument
nonnull_if_nonzero attribute.
* gimple-range-infer.cc (gimple_infer_range::gimple_infer_range):
Handle 3 argument nonnull_if_nonzero attribute.
* doc/extend.texi (nonnull_if_nonzero): Document 3 argument version
of the attribute.
gcc/c-family/
* c-attribs.cc (c_common_gnu_attributes): Allow 2 or 3 arguments for
nonnull_if_nonzero attribute instead of only 2.
(handle_nonnull_if_nonzero_attribute): Handle 3 argument
nonnull_if_nonzero.
* c-common.cc (struct nonnull_arg_ctx): Rename other member to other1,
add other2 member.
(check_function_nonnull): Clear a if nonnull attribute has an
argument. Adjust for nonnull_arg_ctx changes. Handle 3 argument
nonnull_if_nonzero attribute.
(check_nonnull_arg): Adjust for nonnull_arg_ctx changes, emit different
diagnostics for 3 argument nonnull_if_nonzero attributes.
(check_function_arguments): Adjust ctx var initialization.
gcc/analyzer/
* sm-malloc.cc (malloc_state_machine::on_stmt): Handle 3 argument
nonnull_if_nonzero attribute.
gcc/testsuite/
* gcc.dg/nonnull-9.c: Tweak for 3 argument nonnull_if_nonzero
attribute support, add further tests.
* gcc.dg/nonnull-12.c: New test.
* gcc.dg/nonnull-13.c: New test.
* gcc.dg/nonnull-14.c: New test.
* c-c++-common/ubsan/nonnull-8.c: New test.
* c-c++-common/ubsan/nonnull-9.c: New test.

lra: Check for null lowpart_subregs [PR120733]

lra-eliminations.cc:move_plus_up tries to:

   Transform (subreg (plus reg const)) to (plus (subreg reg) const)
   when it is possible.

Most of it is heavily conditional:

  if (!paradoxical_subreg_p (x)
      && GET_CODE (subreg_reg) == PLUS
      && CONSTANT_P (XEXP (subreg_reg, 1))
      && GET_MODE_CLASS (x_mode) == MODE_INT
      && GET_MODE_CLASS (subreg_reg_mode) == MODE_INT)
    {
      rtx cst = simplify_subreg (x_mode, XEXP (subreg_reg, 1), subreg_reg_mode,
subreg_lowpart_offset (x_mode,
subreg_reg_mode));
      if (cst && CONSTANT_P (cst))

but the final:

return gen_rtx_PLUS (x_mode, lowpart_subreg (x_mode,
     XEXP (subreg_reg, 0),
     subreg_reg_mode), cst);

assumed without checking that lowpart_subreg succeeded.  In the PR,
this led to creating a PLUS with a null operand.

In more detail, the testcase had:

    (var_location a (plus:SI (subreg:SI (reg/f:DI 64 sfp) 0)
        (const_int -4 [0xfffffffffffffffc])))

with sfp being eliminated to (plus:DI (reg:DI sp) (const_int 16)).
Initially, during the !subst_p phase, lra_eliminate_regs_1 sees
the PLUS and recurses into each operand.  The recursive call sees
the SUBREG and recurses into the SUBREG_REG.  Since !subst_p,
this final recursive call replaces (reg:DI sfp) with:

    (plus:DI (reg:DI sfp) (const_int 16))

(i.e. keeping the base register the same).  So the SUBREG is
eliminated to:

    (subreg:SI (plus:DI (reg:DI sfp) (const_int 16)) 0)

The PLUS handling in lra_eliminate_regs_1 then passes this to
move_plus_up, which tries to push the SUBREG into the PLUS.
This means trying to create:

    (plus:SI (simplify_gen_subreg:SI (reg:DI sfp) 0) (const_int 16))

The simplify_gen_subreg then returns null, because simplify_subreg_regno
fails both with allow_stack_regs==false (when trying to simplify the
SUBREG to a REG) and with allow_stack_regs=true (when validating
whether the SUBREG can be generated).  And that in turn happens
because aarch64 refuses to allow SImode to be stored in sfp:

  if (regno == SP_REGNUM)
    /* The purpose of comparing with ptr_mode is to support the
       global register variable associated with the stack pointer
       register via the syntax of asm ("wsp") in ILP32.  */
    return mode == Pmode || mode == ptr_mode;

  if (regno == FRAME_POINTER_REGNUM || regno == ARG_POINTER_REGNUM)
    return mode == Pmode;

This seems dubious.  If the frame pointer can hold a DImode value then it
can also hold an SImode value.  There might be limited cases when the low
32 bits of the frame pointer are useful, but aarch64_hard_regno_mode_ok
doesn't have the context to second-guess things like that.  It seemed
from a quick scan of other targets that they behave more as I'd expect.

So there might be a target bug here too.  But it seemed worth fixing the
unchecked use of lowpart_subreg independently of that.
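
The shape of the missing check can be sketched with a toy model. Nothing here is GCC code: ToyPlus, toy_lowpart, toy_move_plus_up and the register number are hypothetical, and returning the input unchanged on failure is an assumed fallback.

```cpp
#include <optional>

// Hypothetical register number standing in for the soft frame pointer.
constexpr int kSoftFramePointer = 64;

// Toy (plus (reg) (const_int)) expression in a mode that is BITS wide.
struct ToyPlus {
  int regno;
  int bits;
  long offset;
};

// Toy stand-in for lowpart_subreg: the result is empty when the target
// refuses to hold a narrower mode in the soft frame pointer.
std::optional<int> toy_lowpart(int regno, int bits)
{
  if (regno == kSoftFramePointer && bits < 64)
    return std::nullopt;
  return regno;
}

// Toy move_plus_up: push the narrowing into the PLUS only if narrowing the
// inner register succeeded; otherwise hand back the input unchanged.
ToyPlus toy_move_plus_up(const ToyPlus &wide, int bits)
{
  std::optional<int> low = toy_lowpart(wide.regno, bits);
  if (!low)
    return wide;                 // The check that was previously missing.
  return ToyPlus{*low, bits, wide.offset};
}
```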

The patch fixes an existing ICE in gcc.c-torture/compile/pass.c.

gcc/
PR rtl-optimization/120733
* lra-eliminations.cc (move_plus_up): Check whether lowpart_subreg
returns null.

Re-add logic to mitigate some afdo profile inconsistencies

This patch re-adds logic to increase counts of annotated basic blocks if otherwise
the Kirchhoff law cannot be solved. This is done only in easy cases where the total
count of in or out edges is smaller than the count of the BB or when the BB has a single
exit which is annotated by a small count.

This helps to solve problems seen e.g. in parest, where the header of loops gets too
low a count because the vectorizer replaced the IV conditional and did not preserve
debug info. We should solve the debug info issues as well, and similar problems
can now be tracked in afdo debug dumps.

gcc/ChangeLog:

* auto-profile.cc (autofdo_source_profile::offline_external_functions):
Add missing newline in dump.
(afdo_propagate_edge): If annotated BB or edge has too small count
bump it up to mitigate profile imprecisions caused by vectorizer.
(afdo_propagate): Increase number of iterations and fix dump.

Improve diagnostics of mismatched discriminators in auto-profile

We are missing discriminator info in auto-profiles, for example in exchange2.
I am not sure why, since I see the info still present in dwarf2out, so it may
be a bug on the create_gcov side.

This patch makes the workaround output better diagnostics (to actually show
the source location). This needs promotion of location info through the inline
stack API, so I turned it from a pair into an actual structure. Overall I think pairs
are overused in this source and make it harder to read.

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

* auto-profile.cc (struct decl_lineno): Turn to structure; add
location.
(dump_inline_stack): Update.
(get_inline_stack): Update.
(get_relative_location_for_locus): Fix up formatting.
(function_instance::get_function_instance_by_decl): Add
LOCATION parameter; improve dumping.
(autofdo_source_profile::get_callsite_total_count): Improve dumping;
update.
(walk_block): Update.
(autofdo_source_profile::offline_unrealized_inlines): Update.
(autofdo_source_profile::get_count_info): Update.

x86: Preserve frame pointer for no_callee_saved_registers attribute

Update functions with no_callee_saved_registers/preserve_none attribute
to preserve frame pointer since caller may use it to save the current
stack:

pushq %rbp
movq %rsp, %rbp
...
call function
...
leave
ret

If callee changes frame pointer without restoring it, caller will fail
to restore its stack after callee returns as LEAVE does

mov %rbp, %rsp
pop %rbp

The corrupted frame pointer will corrupt stack pointer in caller.

There are no regressions on Linux/x86-64. Also tested with

https://github.com/python/cpython

configured with "./configure --with-tail-call-interp".

gcc/

PR target/120840
* config/i386/i386-expand.cc (ix86_expand_call): Don't mark
hard frame pointer as clobber.
* config/i386/i386-options.cc (ix86_set_func_type): Use
TYPE_NO_CALLEE_SAVED_REGISTERS instead of
TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
* config/i386/i386.cc (ix86_function_ok_for_sibcall): Remove the
TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP check.
(ix86_save_reg): Merge TYPE_NO_CALLEE_SAVED_REGISTERS and
TYPE_PRESERVE_NONE with TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
* config/i386/i386.h (call_saved_registers_type): Remove
TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
* doc/extend.texi: Update no_callee_saved_registers documentation.

gcc/testsuite/

PR target/120840
* gcc.target/i386/no-callee-saved-1.c: Updated.
* gcc.target/i386/no-callee-saved-2.c: Likewise.
* gcc.target/i386/no-callee-saved-7.c: Likewise.
* gcc.target/i386/no-callee-saved-8.c: Likewise.
* gcc.target/i386/no-callee-saved-9.c: Likewise.
* gcc.target/i386/no-callee-saved-10.c: Likewise.
* gcc.target/i386/no-callee-saved-18.c: Likewise.
* gcc.target/i386/no-callee-saved-19a.c: Likewise.
* gcc.target/i386/no-callee-saved-19c.c: Likewise.
* gcc.target/i386/no-callee-saved-19d.c: Likewise.
* gcc.target/i386/pr119784a.c: Likewise.
* gcc.target/i386/preserve-none-6.c: Likewise.
* gcc.target/i386/preserve-none-7.c: Likewise.
* gcc.target/i386/preserve-none-12.c: Likewise.
* gcc.target/i386/preserve-none-13.c: Likewise.
* gcc.target/i386/preserve-none-14.c: Likewise.
* gcc.target/i386/preserve-none-15.c: Likewise.
* gcc.target/i386/preserve-none-23.c: Likewise.
* gcc.target/i386/pr120840-1a.c: New test.
* gcc.target/i386/pr120840-1b.c: Likewise.
* gcc.target/i386/pr120840-1c.c: Likewise.
* gcc.target/i386/pr120840-1d.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

RISC-V: Refactor the function bitmap_union_of_preds_with_entry

The current implementation of this function is somewhat difficult to
understand, as it uses a direct break statement within the for loop,
rendering the loop meaningless. Additionally, during the Coverity check
on the for loop, a warning appeared: "unreachable: Since the loop
increment ix++; is unreachable, the loop body will never execute more
than once." Therefore, I have made some simple refactoring to address
these issues.
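
The kind of loop described above, and one way to straighten it out, can be sketched on a simplified model. This is not the code from riscv-vsetvl.cc: plain 32-bit masks stand in for bitmaps, a vector stands in for the predecessor edges, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Before: the first loop always breaks on its first iteration, so its
// increment is unreachable and the "loop" is really an if statement.
std::uint32_t union_of_preds_before(const std::vector<std::uint32_t> &preds)
{
  std::uint32_t dst = 0;
  std::size_t ix;
  for (ix = 0; ix < preds.size(); ix++)
    {
      dst = preds[ix];           // Copy the first predecessor's set.
      break;
    }
  if (ix == preds.size())
    dst = 0;                     // No predecessors at all.
  else
    for (++ix; ix < preds.size(); ix++)
      dst |= preds[ix];          // Union in the remaining predecessors.
  return dst;
}

// After: the special case is spelled out and a single loop remains.
std::uint32_t union_of_preds_after(const std::vector<std::uint32_t> &preds)
{
  if (preds.empty())
    return 0;
  std::uint32_t dst = preds.front();
  for (std::size_t ix = 1; ix < preds.size(); ix++)
    dst |= preds[ix];
  return dst;
}
```

Both versions compute the same union; the second simply avoids a loop whose body can never run twice.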

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry):
Refactor.

Signed-off-by: Jin Ma <jinma@linux.alibaba.com>

RISC-V: Add pipeline-checker script

Pipeline checker utility for RISC-V architecture that validates processor
pipeline models. This tool analyzes machine description files to ensure all
instruction types are properly handled by pipeline scheduling models.

I wrote this tool because I am implementing the vector pipeline model for a
SiFive core, and it is hard to find which instruction types are not handled
by the pipeline scheduling models. This tool reports those types, so I can
fix them.

I think it may be useful for other RISC-V core developers too, so I
decided to upstream it :)

Usage:
```
./pipeline-checker <your-pipeline-model>
```
Example:
```
$ ./pipeline-checker sifive-7.md
Error: Some types are not consumed by the pipemodel
Missing types:
{'vfclass', 'vimovxv', 'vmov', 'rdfrm', 'wrfrm', 'ghost', 'wrvxrm', 'crypto', 'vwsll', 'vfmovfv', 'vimovvx', 'sf_vc', 'vfmovvf', 'sf_vc_se', 'rdvlenb', 'vbrev', 'vrev8', 'sf_vqmacc', 'sf_vfnrclip', 'vsetvl_pre', 'rdvl', 'vsetvl'}
```

gcc/ChangeLog:

* config/riscv/pipeline-checker: New file.

Daily bump.

[PR modula2/117203] Followup add Delete procedure function

This patch provides GetFileName procedure function for
FIO.File, FileSystem.File and IOChan.ChanId. The
return result from these procedures can be passed into
StringFileSysOp.Unlink to complete the required delete.

gcc/m2/ChangeLog:

PR modula2/117203
* gm2-libs-log/FileSystem.def (GetFileName): New
procedure function.
(WriteString): New procedure.
* gm2-libs-log/FileSystem.mod (GetFileName): New
procedure function.
(WriteString): New procedure.
* gm2-libs/SFIO.def (GetFileName): New procedure function.
* gm2-libs/SFIO.mod (GetFileName): New procedure function.
* gm2-libs-iso/IOChanUtils.def: New file.
* gm2-libs-iso/IOChanUtils.mod: New file.

libgm2/ChangeLog:

PR modula2/117203
* libm2iso/Makefile.am (M2DEFS): Add IOChanUtils.def.
(M2MODS): Add IOChanUtils.mod.
* libm2iso/Makefile.in: Regenerate.

gcc/testsuite/ChangeLog:

PR modula2/117203
* gm2/isolib/run/pass/testdelete2.mod: New test.
* gm2/pimlib/logitech/run/pass/testdelete2.mod: New test.
* gm2/pimlib/run/pass/testdelete.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

shrink_wrap_separate_check_lea.c: Scan lea(l|q)

Scan "lea(l|q)", instead of "leaq", to support x32.
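
The effect of the wider pattern can be checked outside the testsuite. The sketch below uses std::regex as a stand-in for the regular-expression engine the testsuite actually uses, and matches_lea is a hypothetical helper, not part of the test.

```cpp
#include <regex>
#include <string>

// Returns true if ASM_LINE contains either "leal" (as emitted for x32) or
// "leaq" (as emitted for the LP64 ABI).
bool matches_lea(const std::string &asm_line)
{
  static const std::regex pattern("lea(l|q)");
  return std::regex_search(asm_line, pattern);
}
```

With this pattern, lines containing leal or leaq both match, while an unrelated mnemonic such as leave does not.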

* gcc.target/i386/shrink_wrap_separate_check_lea.c: Scan
"lea(l|q)", instead of "leaq".

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

cobol: Normalize generating and using function_decls.

Because COBOL doesn't require function prototypes, it is possible to, for
example,

    CALL "getcwd" USING <parameters>

and then later

    CALL "getcwd" USING <parameters> RETURNING <alphanumeric>

The second call "knows" that the return value is a char*, but the first one
does not.  So, the first one gets a default return value type of SSIZE_t, which
later needs to be replaced with CHAR_P.

These [all too] extensive changes ensure that all references to a particular
function use the same function_decl, and take measures to make sure that one
function_decl is back-modified, if necessary, with the best return value type.
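
The idea can be sketched as a small table keyed by function name. This is only an illustration of the approach, not the gcobol implementation: ToyDecl, ToyDeclTable, ReturnKind and the member names are hypothetical.

```cpp
#include <map>
#include <string>

// Hypothetical return-type tags mirroring the SSIZE_t default and CHAR_P.
enum class ReturnKind { SSizeT, CharP };

struct ToyDecl {
  std::string name;
  ReturnKind ret;
};

class ToyDeclTable {
public:
  // Plain CALL: reuse the existing decl, or create one with the default
  // return type if this is the first reference.
  ToyDecl &lookup(const std::string &name)
  {
    auto it = decls_.find(name);
    if (it == decls_.end())
      it = decls_.emplace(name, ToyDecl{name, ReturnKind::SSizeT}).first;
    return it->second;
  }

  // CALL ... RETURNING: same decl object, back-modified with the better type.
  ToyDecl &lookup_returning(const std::string &name, ReturnKind ret)
  {
    ToyDecl &decl = lookup(name);
    decl.ret = ret;
    return decl;
  }

private:
  std::map<std::string, ToyDecl> decls_;  // References stay valid on insert.
};
```

Because both calls resolve to the same object, the earlier reference sees the corrected return type as soon as the later call supplies it.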

gcc/cobol/ChangeLog:

* Make-lang.in: Incorporate gcobol.clean.
* except.cc (cbl_enabled_exceptions_t::dump): Update debug message.
* genapi.cc (gg_attribute_bit_get): Formatting.
(file_static_variable): Formatting.
(trace1_init): Formatting.
(build_main_that_calls_something): Normalize function_decl use.
(parser_call_target): Likewise.
(set_call_convention): Likewise.
(parser_call_target_convention): Likewise.
(parser_call_targets_dump): Likewise.
(function_handle_from_name): Likewise.
(function_pointer_from_name): Likewise.
(parser_initialize_programs): Likewise.
(parser_statement_begin): Formatting.
(parser_leave_file): Use function_decl FIFO.
(enter_program_common): Normalize function_decl use.
(parser_enter_program): Normalize function_decl use.
(tree_type_from_field_type): Normalize function_decl use.
(is_valuable): Comment.
(pe_stuff): Change name to program_end_stuff.
(program_end_stuff): Likewise.
(parser_exit): Likewise.
(parser_division): Normalize function_decl use.
(create_and_call): Normalize function_decl use.
(parser_call): Normalize function_decl use.
(parser_set_pointers): Normalize function_decl use.
(parser_program_hierarchy): Normalize function_decl use.
(psa_FldLiteralA): Defeat attempt to re-use literals. (Fails on some aarch64).
(parser_symbol_add): Error message formatting.
* genapi.h: Formatting.
* gengen.cc (struct cbl_translation_unit_t): Add function_decl FIFO.
(show_type): Rename to gg_show_type.
(gg_show_type): Correct an error message.
(gg_assign): Formatting; change error handling.
(gg_modify_function_type): Normalize function_decl use.
(gg_define_function_with_no_parameters): Fold into gg_defint_function().
(function_decl_key): Normalize function_decl use.
(gg_peek_fn_decl): Normalize function_decl use.
(gg_build_fn_decl): Normalize function_decl use.
(gg_define_function): Normalize function_decl use.
(gg_tack_on_function_parameters): Remove.
(gg_finalize_function): Normalize function_decl use.
(gg_leaving_the_source_code_file): Normalize function_decl use.
(gg_call_expr_list): Normalize function_decl use.
(gg_trans_unit_var_decl): Normalize function_decl use.
(gg_insert_into_assemblerf): New function; formatting.
* gengen.h (struct gg_function_t): Eliminate "is_truly_nested" flag.
(gg_assign): Incorporate return value.
(gg_define_function): Normalize function_decl use.
(gg_define_function_with_no_parameters): Eliminate.
(gg_build_fn_decl): Normalize function_decl use.
(gg_peek_fn_decl): Normalize function_decl use.
(gg_modify_function_type): Normalize function_decl use.
(gg_call_expr_list): Normalize function_decl use.
(gg_get_function_decl): Normalize function_decl use.
(location_from_lineno): Prefix with "extern".
(gg_open): Likewise.
(gg_close): Likewise.
(gg_get_indirect_reference): Likewise.
(gg_insert_into_assembler): Likewise.
(gg_insert_into_assemblerf): Likewise.
(gg_show_type): New declaration.
(gg_leaving_the_source_code_file): New declaration.
* parse.y: Format debugging message.
* parse_ante.h: Normalize function_decl use.

contrib/mklog.py: Fix writing to a global variable

The last patch of mklog.py put top-level code into function 'main()'.
Because of this, writing to global variable 'root' has to be preceded by
explicitly declaring 'root' as global. Otherwise the write only has a
local effect.

Without this change, the '-d' cmdline flag would be broken.

Committed as obvious.

contrib/ChangeLog:

* mklog.py: In 'main()', specify variable 'root' as global.

Signed-off-by: Filip Kastl <fkastl@suse.cz>

Daily bump.

Add "void debug (tree)"

Add "void debug (tree)" to support:

(gdb) call debug (expr)
<parm_decl 0x7fffe9810bb0 f
    type <record_type 0x7fffe99cec78 c BLK
        size <integer_cst 0x7fffe98242d0 constant 256>
        unit-size <integer_cst 0x7fffe98243a8 constant 32>
        user align:256 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffe99cebd0
        fields <field_decl 0x7fffe98318c0 a type <real_type 0x7fffe982a3f0 long double>
            XF x.c:2:15
            size <integer_cst 0x7fffe9802fa8 constant 128>
            unit-size <integer_cst 0x7fffe9802fc0 constant 16>
            align:128 warn_if_not_align:0 offset_align 128 decl_not_flexarray: 1
            offset <integer_cst 0x7fffe9802f90 constant 0>
            bit-offset <integer_cst 0x7fffe9802fd8 constant 0> context <record_type 0x7fffe99cebd0> chain <field_decl 0x7fffe9831960 b>>>
    used read BLK x.c:7:6 size <integer_cst 0x7fffe98242d0 256> unit-size <integer_cst 0x7fffe98243a8 32>
    align:256 warn_if_not_align:0 context <function_decl 0x7fffe99d2900 e> arg-type <record_type 0x7fffe99cec78 c>>
(gdb)

PR debug/120849
* print-tree.cc (debug): New.
* print-tree.h (debug): Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Fix compilation of concatenation with illegal character constant

This fixes an error recovery issue, whereby the compilation of a string
concatenation with an illegal character constant hangs.

gcc/ada/
PR ada/120854
* sem_eval.adb (Get_String_Val): Be prepared for an integer literal
after a serious error is detected, and raise PE on other nodes.
gcc/testsuite/
* gnat.dg/concat6.adb: New test.

c++/modules: Make bitfield storage unit detection more robust

Modules streaming needs to handle these differently from other unnamed
FIELD_DECLs that are streamed for internal RECORD_DECLs, and there
doesn't seem to be a good way to detect this case otherwise.

This matters only to allow for compiler-generated type definitions that
build FIELD_DECLs with no name, as otherwise they get confused.
Currently the only such types left I hadn't earlier fixed by giving
names to are contextless, for which we have an early check to mark their
fields as MK_unique anyway, but there may be other cases in the future.

gcc/cp/ChangeLog:

* module.cc (trees_out::walking_bit_field_unit): New flag.
(trees_out::trees_out): Initialize it.
(trees_out::core_vals): Set it.
(trees_out::get_merge_kind): Use it, move previous ad-hoc check
into assertion.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

c++/modules: Ensure type of partial spec VAR_DECL is consistent with its template [PR120644]

We were erroring because the TEMPLATE_DECL of the existing partial
specialisation has an undeduced return type, but the imported
declaration did not.

The root cause is similar to what was fixed in r13-2744-g4fac53d6522189,
where modules streaming code assumes that a TEMPLATE_DECL and its
DECL_TEMPLATE_RESULT will always have the same TREE_TYPE. That commit
fixed the issue by ensuring that when the type of a variable is deduced
the TEMPLATE_DECL is updated as well, but missed handling partial
specialisations. This patch ensures that the same adjustment is made
there as well.

PR c++/120644

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Also propagate type to partial
templates.
* module.cc (trees_out::decl_value): Add assertion that the
TREE_TYPE of a streamed template decl matches its inner.
(trees_in::is_matching_decl): Clarify function return type
deduction should only occur for non-TEMPLATE_DECL.
* pt.cc (template_for_substitution): Handle partial specs.

gcc/testsuite/ChangeLog:

* g++.dg/modules/auto-7.h: New test.
* g++.dg/modules/auto-7_a.H: New test.
* g++.dg/modules/auto-7_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>

AVR: target/120856 - Deny R24:DI in avr_hard_regno_mode_ok with Reload.

This fixes an ICE with -mno-lra when split2 tries to split the following
zero_extendsidi2 insn:

   (set (reg:DI 24)
        (zero_extend:DI (reg:SI **)))

The ICE is because avr_hard_regno_mode_ok allows R24:DI but disallows
R28:SI when Reload is used.  R28:SI is a result of zero_extendsidi2.

This ICE only occurs with Reload (which will die before very long),
but it occurs when building libgcc.

gcc/
PR target/120856
* config/avr/avr.cc (avr_hard_regno_mode_ok) [-mno-lra]: Deny
hard regs >= 4 bytes that overlap Y.

Relax the testcase check for Solaris [PR120818]

gcc/testsuite/ChangeLog:

PR target/120818
* g++.target/i386/shrink_wrap_separate.C: Relax the check.

Fix handling of dwarf name and duplicated names

I have tested Kugan's patch on exchange2 and noticed multiple problems:
  1) with LTO the translation from dwarf names to symbol names is disabled
     since we free lang data sooner.  I moved the offline pass upstream which
     however also may make us miss clones introduced between free lang data
     and annotation.  This is not very important right now and may be further
     fixed by splitting off auto-profile-read and offline passes.
  2) I noticed that we miss a lot of AFDO inlines because some code compares
     name indexes for equality in belief that it compares symbol names.  This
     is not true if we drop prefixes.  For this reason I integrated get_original_name
     into the renaming machinery which actually updates indexes so the string table
     continues to work as a symbol table.
     This lets me drop the
        afdo_string_table->get_index (afdo_string_table->get_name (other->name ()))
     hops that were introduced in some places.

     Now after renaming all afdo instances should go by DECL_ASSEMBLER_NAME
     names.
  3) Detection of realized offline instances had an ordering issue where we
     omitted marking of those that were offlined later.  Since we can now
     look up assembler names, I simplified the logic into a single pass.

Autoprofiledbootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

* auto-profile.cc (get_original_name): Only strip suffixes introduced
after auto-fdo annotation.
(string_table::get_index_by_decl):  Simplify.
(string_table::add_name): New member function.
(string_table::read): Micro-optimize allocation.
(function_instance::get_function_instance_by_decl): Dump reasons
for failure; try to compensate lost discriminators.
(function_instance::merge): Simplify sanity check; do not check
for realized flag; fix merging of targets.
(function_instance::offline_if_in_set): Simplify.
(function_instance::dump): Sanity check that names are consistent.
(autofdo_source_profile::offline_external_functions): Also handle
stripping suffixes.
(walk_block): Move up in source.
(autofdo_source_profile::offline_unrealized_inlines): Also compute
realized functions.
(autofdo_source_profile::get_function_instance_by_name_index): Simplify.
(autofdo_source_profile::add_function_instance): Simplify.
(autofdo_source_profile::read): Do not strip suffixes; error on duplicates.
(mark_realized_functions): Remove.
(auto_profile): Do not call mark_realized_functions.
* passes.def: Move auto_profile_offline before free_lang_data.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/clone-test.c: New test.
* gcc.dg/tree-prof/clone-merge-1.c: Update template.

Co-authored-by: Kugan Vivekanandarajah <kvivekananda@nvidia.com>

Daily bump.

c++: fix ICE with [[deprecated]] [PR120756]

Here we end up with "error reporting routines re-entered" because
resolve_nondeduced_context isn't passing complain to mark_used.

PR c++/120756

gcc/cp/ChangeLog:

* pt.cc (resolve_nondeduced_context): Pass complain to mark_used.

gcc/testsuite/ChangeLog:

* g++.dg/warn/deprecated-22.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

testsuite: adjust for implicit constexpr

Jakub's constexpr virtual base patch allowed -fimplicit-constexpr to
interfere with these tests.

* g++.dg/abi/mangle81.C: Add -fno-implicit-constexpr.
* g++.dg/init/vbase1.C: Likewise.
* g++.dg/ipa/ipa-icf-4.C: Likewise.

Fix misoptimization of CONSTRUCTOR with reverse SSO

fold_ctor_reference already punts on a CONSTRUCTOR whose type has reverse
storage order, but it can be invoked in a couple of places on a CONSTRUCTOR
with native storage order that has been wrapped in a VIEW_CONVERT_EXPR to a
type with reverse storage order; this would require a post adjustment that
does not currently exist, thus yield wrong code for this admittedly quite
pathological (but supported) case.

gcc/
* gimple-fold.cc (fold_const_aggregate_ref_1) <COMPONENT_REF>:
Bail out immediately if the reference has reverse storage order.
* tree-ssa-sccvn.cc (fully_constant_vn_reference_p): Likewise.
gcc/testsuite/
* gnat.dg/sso20.adb: New test.

c++: Implement C++26 P3533R2 - constexpr virtual inheritance [PR120777]

The following patch implements the C++26
P3533R2 - constexpr virtual inheritance
paper.
The changes include not rejecting it for C++26, tweaking the
error wording to show that it is valid in C++26, adjusting
synthesized_method_walk not to make synthesized cdtors non-constexpr
just because of virtual base classes in C++26 and various tweaks in
constexpr.cc so that it can deal with the expressions used for
virtual base member accesses or cdtor calls which need __in_chrg
and/or __vtt_parm arguments to be passed in some cases implicitly when
they aren't passed explicitly.  And dynamic_cast constant evaluation
tweaks so that it handles also expressions with types with virtual bases.
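
A small sketch of what the paper permits (not one of the new tests): the class names and the VBASE_CONSTEXPR macro are hypothetical. The macro expands to constexpr only when the feature-test macro named in the ChangeLog below is defined, so the snippet also compiles in older modes, where the constructors are simply non-constexpr.

```cpp
#if defined(__cpp_constexpr_virtual_inheritance) \
    && __cpp_constexpr_virtual_inheritance >= 202506L
# define VBASE_CONSTEXPR constexpr
#else
# define VBASE_CONSTEXPR
#endif

struct Base {
  int value;
  constexpr explicit Base(int v) : value(v) {}
};

// Classes with a virtual base: their constructors may be constexpr only
// when P3533R2 is implemented.
struct Left : virtual Base {
  VBASE_CONSTEXPR Left() : Base(1) {}
};

struct Right : virtual Base {
  VBASE_CONSTEXPR Right() : Base(2) {}
};

// The most derived class initializes the virtual base, so Base(42) wins.
struct Most : Left, Right {
  VBASE_CONSTEXPR Most() : Base(42) {}
};

VBASE_CONSTEXPR int most_derived_value()
{
  Most m;
  return m.value;
}

#if defined(__cpp_constexpr_virtual_inheritance) \
    && __cpp_constexpr_virtual_inheritance >= 202506L
// Only checked at compile time where the feature is available.
static_assert(most_derived_value() == 42, "virtual base set by Most");
#endif
```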

2025-06-27  Jakub Jelinek  <jakub@redhat.com>

PR c++/120777
gcc/
* gimple-fold.cc (gimple_get_virt_method_for_vtable): Revert
2018-09-18 changes.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_constexpr_virtual_inheritance=202506L for C++26.
gcc/cp/
* constexpr.cc: Implement C++26 P3533R2 - constexpr virtual
inheritance.
(is_valid_constexpr_fn): Don't reject constexpr cdtors in classes
with virtual bases for C++26, adjust error wording.
(cxx_bind_parameters_in_call): Add ORIG_FUN argument, add
values for __in_chrg and __vtt_parm arguments when needed.
(cxx_eval_dynamic_cast_fn): Adjust function comment, HINT -1
should be possible.  For C++26 if obj is cast from POINTER_PLUS_EXPR,
attempt to use cxx_fold_indirect_ref to simplify it and if successful,
build ADDR_EXPR of that.
(cxx_eval_call_expression): Add orig_fun variable, set it to
fun before looking through clones, pass it to
cxx_bind_parameters_in_call.
(reduced_constant_expression_p): Add SZ argument, pass DECL_SIZE
of FIELD_DECL e.index to recursive calls and don't return false
if SZ is non-NULL and there are unfilled fields with bit position
at or above SZ.
(cxx_fold_indirect_ref_1): Handle reading of vtables using
ptrdiff_t dynamic type instead of some pointer type.  Set el_sz
to DECL_SIZE_UNIT value rather than TYPE_SIZE_UNIT of
DECL_FIELD_IS_BASE fields in classes with virtual bases.
(cxx_fold_indirect_ref): In canonicalize_obj_off lambda look
through COMPONENT_REFs with DECL_FIELD_IS_BASE in classes with
virtual bases and adjust off correspondingly.  Remove assertion that
off is integer_zerop, pass tree_to_uhwi (off) instead of 0 to the
cxx_fold_indirect_ref_1 call.
* cp-tree.h (publicly_virtually_derived_p): Declare.
(reduced_constant_expression_p): Add another tree argument defaulted
to NULL_TREE.
* method.cc (synthesized_method_walk): Don't clear *constexpr_p
if there are virtual bases for C++26.
* class.cc (build_base_path): Compute fixed_type_p and
virtual_access before checks for build_simple_base_path instead of
after that and conditional cp_build_addr_expr.  Use build_simple_path
if !virtual_access even when v_binfo is non-NULL.
(layout_virtual_bases): For build_base_field calls use
access_public_node rather than access_private_node if
publicly_virtually_derived_p.
(build_vtbl_initializer): Revert 2018-09-18 and 2018-12-11 changes.
(publicly_virtually_derived_p): New function.
gcc/testsuite/
* g++.dg/cpp26/constexpr-virt-inherit1.C: New test.
* g++.dg/cpp26/constexpr-virt-inherit2.C: New test.
* g++.dg/cpp26/constexpr-virt-inherit3.C: New test.
* g++.dg/cpp26/feat-cxx26.C: Add __cpp_constexpr_virtual_inheritance
tests.
* g++.dg/cpp2a/constexpr-dtor3.C: Don't expect one error for C++26.
* g++.dg/cpp2a/constexpr-dtor16.C: Don't expect errors for C++26.
* g++.dg/cpp2a/constexpr-dynamic10.C: Likewise.
* g++.dg/cpp0x/constexpr-ice21.C: Likewise.
* g++.dg/cpp0x/constexpr-ice4.C: Likewise.
* g++.dg/abi/mangle1.C: Guard the test on c++23_down.
* g++.dg/abi/mangle81.C: New test.
* g++.dg/ipa/ipa-icf-4.C (A::A): For
__cpp_constexpr_virtual_inheritance >= 202506L add user provided
non-constexpr constructor.

[sanitizer_common] Fix build on ppc64+musl (#120036)

Cherry picked from LLVM commit 801b519dfd01e21da0be17aa8f8dc2ceb0eb9e77.

In powerpc64-unknown-linux-musl, signal.h does not include asm/ptrace.h,
which causes "member access into incomplete type 'struct pt_regs'"
errors. Include the header explicitly to fix this.

Also in sanitizer_linux_libcdep.cpp, there is a usage of TlsPreTcbSize
which is not defined in such a platform. Guard the branch with macro.

Fortran: follow-up fix to checking of renamed-on-use interface name [PR120784]

Commit r16-1633 introduced a regression for imported interfaces that were
not renamed-on-use, since the related logic did not take into account that
the absence of renaming could be represented by an empty string.

PR fortran/120784

gcc/fortran/ChangeLog:

* interface.cc (gfc_match_end_interface): Detect empty local_name.

gcc/testsuite/ChangeLog:

* gfortran.dg/interface_63.f90: Extend testcase.

c++: fix decltype_p handling for binary expressions

With Jakub's constexpr virtual base patch,
23_containers/vector/bool/cmp_c++20.cc failed the assert I add to
fixed_type_or_null, meaning that it returned the wrong value. Let's fix the
result as well as adding the assert, and fix cp_parser_binary_expression to
properly wrap any class-type calls in the operands in TARGET_EXPR even
within a decltype so we don't hit the assert.

gcc/cp/ChangeLog:

* class.cc (fixed_type_or_null): Handle class-type CALL_EXPR.
* parser.cc (cp_parser_binary_expression): Fix decltype_p handling.

libstdc++: Directly implement ranges::shuffle [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (shuffle_fn::operator()):
Reimplement directly, based on the stl_algo.h implementation.
* testsuite/25_algorithms/shuffle/constrained.cc (test02):
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Directly implement ranges::sample [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__sample_fn::operator()):
Reimplement the forward_iterator branch directly, based
on the stl_algo.h implementation. Add explicit cast to
_Out's difference_type in the !forward_iterator branch.
* testsuite/25_algorithms/sample/constrained.cc (test02):
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Directly implement ranges::nth_element [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__introselect): New,
based on the stl_algo.h implementation.
(nth_element_fn::operator()): Reimplement in terms of the above.
* testsuite/25_algorithms/nth_element/constrained.cc:

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Directly implement ranges::stable_partition [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__find_if_not_n): New,
based on the stl_algo.h implementation.
(__detail::__stable_partition_adaptive): Likewise.
(__stable_partition_fn::operator()): Reimplement in terms of
the above.
* testsuite/25_algorithms/stable_partition/constrained.cc
(test03): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Directly implement ranges::stable_sort [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__move_merge): New,
based on the stl_algo.h implementation.
(__detail::__merge_sort_loop): Likewise.
(__detail::__chunk_insertion_sort): Likewise.
(__detail::__merge_sort_with_buffer): Likewise.
(__detail::__stable_sort_adaptive): Likewise.
(__detail::__stable_sort_adaptive_resize): Likewise.
(__detail::__inplace_stable_sort): Likewise.
(__stable_sort_fn::operator()): Reimplement in terms of the above.
* testsuite/25_algorithms/stable_sort/constrained.cc:

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Directly implement ranges::inplace_merge [PR100795]

As with the previous patch, this patch reimplements ranges::inplace_merge
directly instead of incorrectly forwarding to std::inplace_merge.  In
addition to the compatibility changes listed in the previous patch we
also:

  - explicitly cast the difference type (which can be an integer class) to
    ptrdiff_t when constructing a _Temporary_buffer

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__move_merge_adaptive):
New, based on the stl_algo.h implementation.
(__detail::__move_merge_adaptive_backward): Likewise.
(__detail::__rotate_adaptive): Likewise.
(__detail::__merge_adaptive): Likewise.
(__detail::__merge_adaptive_resize): Likewise.
(__detail::__merge_without_buffer): Likewise.
(__inplace_merge_fn::operator()): Reimplement in terms of the
above.
* testsuite/25_algorithms/inplace_merge/constrained.cc (test03):
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Directly implement ranges::sort [PR100795]

As with the previous patch, this patch reimplements ranges::sort
directly instead of incorrectly forwarding to std::sort.  In addition to
the compatibility changes listed in the previous patch we also:

  - use ranges::iter_swap instead of std::iter_swap
  - use ranges::move_backward instead of std::move_backward
  - use __bit_width and __to_unsigned_like instead of __lg

PR libstdc++/100795
PR libstdc++/118209

libstdc++-v3/ChangeLog:

* include/bits/max_size_type.h (__bit_width): New explicit
specialization for __max_size_type.
* include/bits/ranges_algo.h (__detail::__move_median_to_first):
New, based on the stl_algo.h implementation.
(__detail::__unguarded_linear_insert): Likewise.
(__detail::__insertion_sort): Likewise.
(__detail::__sort_threshold): Likewise.
(__detail::__unguarded_insertion_sort): Likewise.
(__detail::__final_insertion_sort): Likewise.
(__detail::__unguarded_partition): Likewise.
(__detail::__unguarded_partition_pivot): Likewise.
(__detail::__heap_select): Likewise.
(__detail::__partial_sort): Likewise.
(__detail::__introsort_loop): Likewise.
(__sort_fn::operator()): Reimplement in terms of the above.
* testsuite/25_algorithms/sort/118209.cc: New test.
* testsuite/25_algorithms/sort/constrained.cc (test03): New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Directly implement ranges::heap algos [PR100795]

ranges::push_heap, ranges::pop_heap, ranges::make_heap and
ranges::sort_heap are currently defined in terms of the corresponding
STL-style algorithms, but this is incorrect because the STL-style
algorithms rely on the legacy iterator system, and so misbehave when
passed a narrowly C++20 random access iterator.  The other ranges heap
algos, ranges::is_heap and ranges::is_heap_until, are implemented
directly already and have no known issues.

This patch reimplements these ranges:: algos directly instead, based
closely on the legacy stl_heap.h implementation, with the following
changes for compatibility with the C++20 iterator system:

  - handle non-common ranges by computing the corresponding end iterator
  - use ranges::iter_move instead of std::move(*iter)
  - use iter_value_t / iter_difference_t instead of iterator_traits

Besides these changes, the implementation of these algorithms is
intended to mirror the stl_heap.h implementations, for ease of
maintenance and review.

Note that we don't explicitly pass the projection function throughout,
instead we just create and pass a composite predicate via __make_comp_proj.

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__push_heap): New,
based on the stl_heap.h implementation.
(__push_heap_fn::operator()): Reimplement in terms of the above.
(__detail::__adjust_heap): New, based on the stl_heap.h
implementation.
(__detail::__pop_heap): Likewise.
(__pop_heap_fn::operator()): Reimplement in terms of the above.
(__make_heap_fn::operator()): Likewise.
(__sort_heap_fn::operator()): Likewise.
* testsuite/25_algorithms/heap/constrained.cc (test03): New
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Use runtime format for internal format calls in chrono [PR110739]

This patch adjusts all internal std::format calls inside __formatter_chrono
to use a runtime format string and thus avoid compile-time checking of the
format string's validity. The majority of cases are covered by calling the newly
introduced _S_empty_fs() function, which returns a __Runtime_format_string
containing _S_empty_spec, instead of passing the latter directly.

In the case of _M_j we use the _S_str_d3 function (extracted from _S_str_d2),
eliminating the call to std::format outside of the unlikely scenario in which the
day of year is greater than 1000 (this may happen for year_month_day with month
greater than 12). As a consequence, outside of handling subseconds, we no longer
delegate to std::format or construct temporary strings when formatting chrono
types with ok() values.
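
As a rough illustration of formatting a small number without std::format or
a temporary std::string, consider the sketch below. It is only in the
spirit of _S_str_d3; the name str_d3, its signature, and the
caller-provided buffer are assumptions, not the chrono_io.h code.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: write N as exactly three zero-padded decimal digits
// into a caller-provided buffer and return a view of it.  Values that need
// more than three digits are outside this sketch and would need a fallback.
constexpr std::string_view str_d3(char (&buf)[3], unsigned n)
{
  assert(n < 1000);
  buf[0] = static_cast<char>('0' + n / 100);
  buf[1] = static_cast<char>('0' + n / 10 % 10);
  buf[2] = static_cast<char>('0' + n % 10);
  return std::string_view(buf, 3);
}
```

The returned view refers to the caller's buffer, so it stays valid only as
long as that buffer does.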

PR libstdc++/110739

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_S_empty_fs): Define.
(__formatter_chrono::_S_str_d2): Use _S_str_d3 for 3+ digits and
place always_inline attribute after comment.
(__formatter_chrono::_S_str_d3): Extracted from _S_str_d2.
(__formatter_chrono::_M_H_I, __formatter_chrono::_M_R_X): Replace
_S_empty_spec with _S_empty_fs().
(__formatter_chrono::_M_j): Likewise and use _S_str_d3 in common
case.
(__format::operator-(_ChronoParts, _ChronoParts))
(__format::operator-=(_ChronoParts, _ChronoParts))
(__formatter_chrono::_S_fill_two_digits)
(__formatter_chrono::_S_str_d1): Place always_inline attribute
after comment.

c++/modules: Avoid name clashes when streaming internal labels [PR98735,PR118904]

The frontend creates some variables that need to be given unique names
for the TU so that they can unambiguously be accessed. Historically
this has been done with a global counter local to each place that needs
an internal label, but this doesn't work with modules as depending on
what declarations have been imported, some counter values may have
already been used.

This patch reworks the situation to instead have a single collection of
counters for the TU, and a new function 'generate_internal_label' that
gets the next label with given prefix using that counter. Modules
streaming can then use this function to regenerate new names on
stream-in for any such decls, guaranteeing uniqueness within the TU.
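
The idea of one shared counter table per TU, keyed by label prefix, can be
sketched as follows. This is a toy model, not the tree.cc implementation:
the class label_counters, its member generate, and the label spelling are
assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a per-translation-unit counter table: one
// counter per prefix, shared by every caller that needs a fresh label.
class label_counters
{
  std::unordered_map<std::string, unsigned> next_;

public:
  // Return the next unused label for PREFIX.  A missing entry starts at 0.
  std::string generate(const std::string& prefix)
  { return prefix + std::to_string(next_[prefix]++); }
};
```

In this model an importer would not reuse the name stored in the module; it
would call generate again with the same prefix, so the new name cannot
collide with labels already handed out in the importing TU.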

These labels should only be used for internal entities so there should
be no issues with the names differing from TU to TU; we will need to
handle this if we ever start checking ODR of definitions we're merging
but that's an issue for later.

For proof of concept, this patch makes use of the new API for
__builtin_source_location and ubsan; there are probably other places
in the frontend where this change will need to be made as well.
One other change this exposes is that both of these components rely
on the definition of the VAR_DECLs they create, so stream that too
for uncontexted variables.

PR c++/98735
PR c++/118904

gcc/cp/ChangeLog:

* cp-gimplify.cc (source_location_id): Remove.
(fold_builtin_source_location): Use generate_internal_label.
* module.cc (enum tree_tag): Add 'tt_internal_id' enumerator.
(trees_out::tree_value): Adjust assertion, write definitions
of uncontexted VAR_DECLs.
(trees_in::tree_value): Read variable definitions.
(trees_out::tree_node): Write internal labels, adjust assert.
(trees_in::tree_node): Read internal labels.

gcc/ChangeLog:

* tree.cc (struct identifier_hash): New type.
(struct identifier_count_traits): New traits.
(internal_label_nums): New hash map.
(generate_internal_label): New function.
(prefix_for_internal_label): New function.
* tree.h (IDENTIFIER_INTERNAL_P): New macro.
(generate_internal_label): Declare.
(prefix_for_internal_label): Declare.
* ubsan.cc (ubsan_ids): Remove.
(ubsan_type_descriptor): Use generate_internal_label.
(ubsan_create_data): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/src-loc-1.h: New test.
* g++.dg/modules/src-loc-1_a.H: New test.
* g++.dg/modules/src-loc-1_b.C: New test.
* g++.dg/modules/src-loc-1_c.C: New test.
* g++.dg/modules/ubsan-1_a.C: New test.
* g++.dg/modules/ubsan-1_b.C: New test.
* g++.dg/ubsan/module-1-aux.cc: New test.
* g++.dg/ubsan/module-1.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

c++/modules: Support streaming new size cookie for constexpr [PR120040]

This type currently has a TYPE_NAME of an IDENTIFIER_NODE. Although the
documentation indicates this is legal, this confuses modules streaming
which expects all RECORD_TYPEs to have a TYPE_DECL, which is used to
determine the context and merge key, etc.

PR c++/120040

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Handle TYPE_NAME
now being a TYPE_DECL rather than just an IDENTIFIER_NODE.
* init.cc (build_new_constexpr_heap_type): Build a TYPE_DECL for
the returned type; mark the type as artificial.
* module.cc (trees_out::type_node): Add some assertions.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr120040_a.C: New test.
* g++.dg/modules/pr120040_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

c++/modules: Implement streaming of uncontexted TYPE_DECLs [PR98735]

Currently, most declarations must have a DECL_CONTEXT for modules
streaming to behave correctly, so that they can have an appropriate
merge key generated and be correctly deduplicated on import.

There are a few exceptions, however, for internally generated
declarations that will never be merged and don't necessarily have an
appropriate parent to key off for the context.  One case that's come up
a few times is TYPE_DECLs, especially temporary RECORD_TYPEs used as
intermediaries within expressions.

Previously I've tried to give all such types a DECL_CONTEXT, but in some
cases that has ended up being infeasible, such as with the types
generated by UBSan (which are shared with the C frontend and don't know
their context, especially when created at global scope).  Additionally,
these types often don't have many of the parts that a normal struct
declaration created via parsing user code would have, which confuses
module streaming.

Given that these types are typically intended to be one-off and unique
anyway, this patch instead adds support for by-value streaming of
uncontexted TYPE_DECLs.  The patch only supports streaming the bare
minimum amount of fields needed for the cases I've come across so far;
in general the preference should still be to ensure that DECL_CONTEXT is
set where possible.

PR c++/98735
PR c++/120040

gcc/cp/ChangeLog:

* module.cc (trees_out::tree_value): Write TYPE_DECLs.
(trees_in::tree_value): Read TYPE_DECLs.
(trees_out::tree_node): Support uncontexted TYPE_DECLs, and
ensure that all parts of a by-value decl are marked for
streaming.
(trees_out::get_merge_kind): Treat members of uncontexted types
as always unique.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

libstdc++: Fix warnings introduced by type-erasing for chrono commits [PR110739]

The r16-1709-g4b3cefed1a08344495fedec4982d85168bd8173f commit caused a `-Woverflow`
warning in the empty_spec.cc file. This warning is not caused by any issue in
shipping code; it results from taking too many shortcuts when implementing a
test-only custom representation type Rep, where long was always used to store
the value. In particular, the common type for Rep and long long int was de facto
long. This is addressed by adding an Under template parameter that controls the
type of the stored value, and handling it properly in the common_type
specializations. No changes to shipping code are necessary.

Secondly, extracting the _M_locale_fmt calls in r16-1712-gcaac94 resulted in the
__ctx format parameter no longer being used. This patch removes that parameter
entirely, and replaces the _FormatContext template parameter with an _OutIter
parameter for __out. For consistency, the type of __out is decoupled from
_FormatContext for functions that still need the context:
* to extract locale (_M_A_a, _M_B_b, _M_c, _M_p, _M_r, _M_subsecs)
* perform formatting for duration/subseconds (_M_Q, _M_T, _M_S, _M_subsecs)

PR libstdc++/110739

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_M_format_to):
Rename _Out to _OutIter for consistency, and update calls
to specifier functions.
(__formatter_chrono::_M_wi, __formatter_chrono::_M_C_y_Y)
(__formatter_chrono::_M_D_x, __formatter_chrono::_M_d_e)
(__formatter_chrono::_M_F, __formatter_chrono::_M_g_G)
(__formatter_chrono::_M_H_I, __formatter_chrono::_M_j)
(__formatter_chrono::_M_m, __formatter_chrono::_M_M)
(__formatter_chrono::_M_q, __formatter_chrono::_M_R_X)
(__formatter_chrono::_M_u_w, __formatter_chrono::_M_U_V_W)
(__formatter_chrono::_M_z, __formatter_chrono::_M_Z):
Remove _FormatContext parameter, and introduce _OutIter
for __out type.
(__formatter_chrono::_M_a_A, __formatter_chrono::_M_B_b)
(__formatter_chrono::_M_p, __formatter_chrono::_M_Q)
(__formatter_chrono::_M_r, __formatter_chrono::_M_S)
(__formatter_chrono::_M_subsecs, __formatter_chrono::_M_T):
Introduce separate _OutIter template parameter for __out.
(__formatter_chrono::_M_c, __formatter_chrono::_M_T):
Likewise, and adjust calls to specifiers functions.
* testsuite/std/time/format/empty_spec.cc: Make underlying
type for Rep configurable.

Fix afdo profiles for functions that were not early-inlined

This patch should finish the offlining infrastructure by offlining
(prior to AFDO annotation) all inline function instances that were not
early inlined. This is mostly the case with recursive inlining or when
-fno-auto-profile-inlining is used, which should now produce comparable
code.

I also cleaned up offlining of self-recursive functions, which now
happens through the worklist and reduces problems with recursive invocation
of the function merging modifying data structures at unexpected places.

gcc/ChangeLog:

* auto-profile.cc (function_instance::set_name,
function_instance::set_realized, function_instance::realized_p,
function_instance::set_in_worklist,
function_instance::clear_in_worklist,
function_instance::in_worklist_p): New member functions.
(function_instance::in_worklist, function_instance::realized_):
new.
(get_relative_location_for_locus): Break out from ....
(get_relative_location_for_stmt): ... here.
(function_instance::~function_instance): Sanity check that
removed function is not in worklist.
(function_instance::merge): Do not offline realized instances.
(function_instance::offline): Make private; add duplicate functions
to worklist rather than merging immediately.
(function_instance::offline_if_in_set): Cleanup.
(function_instance::remove_external_functions): Likewise.
(function_instance::offline_if_not_realized): New member function.
(autofdo_source_profile::offline_external_functions): Handle delayed
functions.
(autofdo_source_profile::offline_unrealized_inlines): New member function.
(walk_block): New function.
(mark_realized_functions): New function.
(afdo_annotate_cfg): Fix dump.
(auto_profile): Mark realized functions and offline rest; do not compute
fn summary.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/afdo-crossmodule-1.c: Update template.

AVR: target/113934 - Use LRA per default.

Now that the patches for PR120424 are upstream, the last known bug
associated with avr+lra has been fixed: PR118591.  So we can pull the
switch that turns on LRA by default.

This patch only sets -mlra by default.  It doesn't do any Reload related
cleanup or removal from the avr backend, hence -mno-lra still works.

The only new problem is that gcc.dg/torture/pr64088.c fails with LRA
but not with Reload.  Though that test case is awkward since it is UB
but expects the compiler to behave in a specific way which avr-gcc
doesn't do: PR116780.

This patch also avoids a relatively recent ICE that breaks building libgcc:
R24:DI is allowed per hard_regno_mode_ok, but R26:SI is disallowed
for Reload for old reasons.  Outcome is that a split2 pattern for
R24:DI = zero_extend:DI (R22:SI) runs into an ICE.

AVR-LibC builds fine with this patch.
The AVR-LibC testsuite passes without errors.

gcc/
PR target/113934
* config/avr/avr.opt (-mlra): Turn on per default.

[RISC-V][PR target/119971] Avoid losing shift count masking

Fix typo spotted by Bernhard Reutner-Fischer.

PR target/119971

gcc/testsuite/
* gcc.target/riscv/pr119971.c: Fix typo.

tree-optimization/120808 - SLP patterns with FMA/FMS

The following amends the SLP addsub pattern to also match blends
of .FMA/.FMS and form .FMADDSUB even when -ffp-contract=off.
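
As a hedged illustration of the kind of source this is about, the function
below blends a fused multiply-subtract in the even lane with a fused
multiply-add in the odd lane. The name fmaddsub2 is hypothetical and this
is not the bb-slp-pr120808.c test case. Whether the compiler lowers the
std::fma calls to .FMA/.FMS and then forms .FMADDSUB depends on the target
and the options in use.

```cpp
#include <cassert>
#include <cmath>

// Even lane: a*b - c (an FMS); odd lane: a*b + c (an FMA).
// Explicit std::fma calls are used so the fused operations exist even when
// contraction of separate multiply and add is disabled.
void fmaddsub2(double* x, const double* a, const double* b, const double* c)
{
  x[0] = std::fma(a[0], b[0], -c[0]);
  x[1] = std::fma(a[1], b[1], c[1]);
}
```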

PR tree-optimization/120808
* tree-vect-slp-patterns.cc (vect_match_expression_p):
Take a code_helper and also match calls.
(addsub_pattern::recognize): Handle .FMA/.FMS pairs
in addition to PLUS/MINUS.
(addsub_pattern::build): Adjust.

* gcc.dg/vect/bb-slp-pr120808.c: Now also expect FMADDSUB
patterns to be matched.

Fixup vector epilog analysis skipping when not using partial vectors

The following avoids re-analyzing the loop as epilogue when not
using partial vectors and the mode is the same as the autodetected
vector mode and that has a too high VF for a non-predicated loop.
This situation occurs almost always on x86 and saves us one
re-analysis unless --param vect-partial-vector-usage is non-default.

* tree-vectorizer.h (vect_chooses_same_modes_p): New
overload.
* tree-vect-stmts.cc (vect_chooses_same_modes_p): Likewise.
* tree-vect-loop.cc (vect_analyze_loop): Prune epilogue
analysis further when not using partial vectors.

Fixup partial_vectors_supported_p use

The following fixes the computation of supports_partial_vectors which
is used to prune the set of modes to iterate over for epilog
vectorization. The used partial_vectors_supported_p predicate
only looks for while_ult, while we also support predication when
mask modes are integer modes, as for AVX512.

I've noticed this isn't very effective on x86_64 anyway since
if the main loop mode is autodetected we skip re-analyzing
mode_i == 0, but then mode_i == 1 is usually the very same
large mode. A patch for this will follow, but this will
regress without the fix below.

* tree-vect-loop.cc (vect_analyze_loop): Consider AVX512
style masking when computing supports_partial_vectors.

libstdc++: Fix Darwin bootstrap by simplifying ver file syntax.

The symbol parsing script does not handle the closing brace of a new
symbol group and the identifier for the inherited group to be on
different lines, which r16-1708-gaf5b72cf9f564 introduced. Fixed by
making the conditional encompass both the brace and the identifier.

libstdc++-v3/ChangeLog:

* config/abi/pre/gnu.ver: Keep the closing brace of the
CXXABI_1.3.17 symbol group together with the identifier
for the inherited group.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

c++: Add fix note for how to declare main in a module

This patch adds a note to help users unfamiliar with modules terminology
understand how to declare main in a named module since P3618.

There doesn't appear to be an easy robust location available for "the
start of this declaration" that I could find to attach a fixit to, but
the explanation should suffice.

gcc/cp/ChangeLog:

* decl.cc (grokfndecl): Add explanation of how to attach to
global module.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>