git.ipfire.org Git - thirdparty/gcc.git/log

ada: Fixup one more pattern of broken scope information

When an array's initialization contains an `others =>` clause with an
expression that involves finalization, the resulting scope information
is incorrect and can cause crashes with backends (i.e. gnat-llvm) that
also use unnesting. The observable symptom is a nested object
declaration (created by the compiler) within a loop wrapped in a
procedure created by the unnester that has incoherent scope information:
its Scope field points to the scope of the procedure (one level too high),
yet it is contained in the entity chain of some entity nested in the
procedure (correct).

The correct solution would be to fix the scope information when it is
created, but this turned out to be too large a task, with many
interactions with existing code.

This change adds another pattern to the Fixup_Inner_Scopes procedure to
detect the problematic case and fix the scope after the fact.

gcc/ada/

* exp_ch7.adb (Unnest_Loop::Fixup_Inner_Scopes): detect a new
problematic pattern and fixup the scope accordingly.

ada: Fix typo in CUDA error message

Fix typo in error message; semantics is unaffected.

gcc/ada/

* gnat_cuda.adb (Remove_CUDA_Device_Entities): Fix typo.

ada: Fix latent alignment issue for dynamically-allocated controlled objects

Dynamically-allocated controlled objects are attached to a finalization
collection by means of a hidden header placed right before the object,
which means that the size effectively allocated must naturally account
for the size of this header. But the allocation must also account for
the alignment of this header in order to have it properly aligned.
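
The size and alignment arithmetic described above can be sketched in C++ as follows. This is only an illustration under stated assumptions: the `Header` struct, `Layout`, `round_up` and `controlled_layout` are hypothetical names, and the real header layout lives in libgnat/s-finpri.ads and is not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the hidden header placed before the object.
struct Header {
  void (*finalize_address)(void *);
  Header *prev;
  Header *next;
};

struct Layout {
  std::size_t alignment;      // alignment to request from the allocator
  std::size_t header_offset;  // where the header starts inside the block
  std::size_t object_offset;  // where the object starts inside the block
  std::size_t total_size;     // number of bytes to request
};

// Round n up to a multiple of align (align must be a power of two).
constexpr std::size_t round_up(std::size_t n, std::size_t align) {
  return (n + align - 1) & ~(align - 1);
}

// Place the header right before the object so that both are aligned:
// the block is aligned to the maximum of the two alignments.
inline Layout controlled_layout(std::size_t object_size,
                                std::size_t object_align) {
  const std::size_t align = std::max(object_align, alignof(Header));
  const std::size_t object_offset = round_up(sizeof(Header), align);
  return {align, object_offset - sizeof(Header), object_offset,
          object_offset + object_size};
}
```

Because the block alignment is a multiple of the header alignment, and the header size is a multiple of its own alignment, the header start computed this way is aligned as well.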

gcc/ada/

* libgnat/s-finpri.ads (Header_Alignment): New function.
(Header_Size): Adjust description.
(Master_Node): Put Finalize_Address as first component.
(Collection_Node): Likewise.
* libgnat/s-finpri.adb (Header_Alignment): New function.
(Header_Size): Return the object size in storage units.
* libgnat/s-stposu.ads (Adjust_Controlled_Dereference): Replace
collection node with header in description.
* libgnat/s-stposu.adb (Adjust_Controlled_Dereference): Likewise.
(Allocate_Any_Controlled): Likewise. Pass the maximum of the
specified alignment and that of the header to the allocator.
(Deallocate_Any_Controlled): Likewise to the deallocator.

ada: Fix resolving tagged operations in array aggregates

In the Two_Pass_Aggregate_Expansion we were removing
all of the entity links in the Iterator_Specification
to avoid reusing the same Iterator_Definition in both
loops.

However this approach was also breaking the links to
calls with dot notation that had been transformed to
the regular call notation.

In order to circumvent this, explicitly create new
identifier definitions when copying the
Iterator_Specifications for both of the loops.

gcc/ada/

* exp_aggr.adb (Two_Pass_Aggregate_Expansion):
Explicitly create new Defining_Iterators for both
of the loops.

ada: Fix bogus error on function returning noncontrolling result in private part

This occurs in the additional case of RM 3.9.3(10) in Ada 2012, that is to
say the access controlling result, because the implementation does not use
the same (correct) conditions as in the original case.

This factors out these conditions and uses them in both cases, as well as
adjusts the wording of the message in the first case.

gcc/ada/

* sem_ch6.adb (Check_Private_Overriding): Implement the second part
of RM 3.9.3(10) consistently in both cases.

ada: Fix casing of CUDA in error messages

Error messages now capitalize CUDA.

gcc/ada/

* erroutc.adb (Set_Msg_Insertion_Reserved_Word): Fix casing for
CUDA appearing in error message strings.
(Set_Msg_Str): Likewise for CUDA being a part of a Name_Id.

ada: Fix crash with -gnatdJ and -gnatw_q

This commit makes the emission of -gnatw_q warnings pass node information
so as to handle the enclosing subprogram display of -gnatdJ instead of
crashing.

gcc/ada/

* exp_ch4.adb (Expand_Composite_Equality): Call Error_Msg_N
instead of Error_Msg.

ada: Follow up fixes for Put_Image/streaming regressions

A recent change to reduce duplication of compiler-generated Put_Image and
streaming subprograms introduced some regressions. The fix for one of them
was incomplete.

gcc/ada/

* exp_attr.adb (Build_And_Insert_Type_Attr_Subp): Further tweaking
of the point where a compiler-generated Put_Image or streaming
subprogram is to be inserted in the tree. If one such subprogram
calls another (as is often the case with, for example, Put_Image
procedures for composite type and for a component type thereof),
then we want to avoid use-before-definition problems that can
result from inserting the caller ahead of the callee.

ada: Implement per-finalization-collection spinlocks

This changes the implementation of finalization collections from using the
global task lock to using per-collection spinlocks. Spinlocks are a good
fit in this context because they are very cheap and therefore can be taken
with a fine granularity only around the portions of code implementing the
shuffling of pointers required by attachment and detachment actions.
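
As a rough C++ illustration of the idea (not the Ada implementation in libgnat/s-finpri.adb), a per-collection spinlock can be taken only around the pointer updates of a doubly-linked list. All names below (`Node`, `Collection`, `attach_object`, `detach_object`) are hypothetical, and the use of `std::atomic<bool>` is an assumption made for this sketch.

```cpp
#include <atomic>

struct Collection;

// Node linked into a collection; it remembers its enclosing collection so
// that detachment can find the right lock.
struct Node {
  Node *prev = nullptr;
  Node *next = nullptr;
  Collection *enclosing = nullptr;
};

struct Collection {
  Node head;                        // dummy head of a circular list
  std::atomic<bool> locked{false};  // the per-collection spinlock
  Collection() { head.prev = &head; head.next = &head; }
};

// Spin until the flag is acquired.
inline void lock_collection(Collection &c) {
  while (c.locked.exchange(true, std::memory_order_acquire)) {
  }
}

inline void unlock_collection(Collection &c) {
  c.locked.store(false, std::memory_order_release);
}

// Insert the node at the front of the list, holding the lock only around
// the pointer shuffling.
inline void attach_object(Node &n, Collection &c) {
  lock_collection(c);
  n.enclosing = &c;
  n.prev = &c.head;
  n.next = c.head.next;
  c.head.next->prev = &n;
  c.head.next = &n;
  unlock_collection(c);
}

// Unlink the node from whichever collection it was attached to.
inline void detach_object(Node &n) {
  Collection *c = n.enclosing;
  if (c == nullptr)
    return;
  lock_collection(*c);
  n.prev->next = n.next;
  n.next->prev = n.prev;
  n.prev = n.next = nullptr;
  n.enclosing = nullptr;
  unlock_collection(*c);
}
```

The critical sections contain only a handful of stores, which is the situation in which spinning is cheaper than taking a global lock.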

gcc/ada/

* libgnat/s-finpri.ads (Lock_Type): New modular type.
(Collection_Node): Add Enclosing_Collection component.
(Finalization_Collection): Add Lock component.
* libgnat/s-finpri.adb: Add clauses for System.Atomic_Primitives.
(Attach_Object_To_Collection): Lock and unlock the collection.
Save a pointer to the enclosing collection in the node.
(Detach_Object_From_Collection): Lock and unlock the collection.
(Finalize): Likewise.
(Initialize): Initialize the lock.
(Lock_Collection): New procedure.
(Unlock_Collection): Likewise.

ada: Formal_Derived_Type'Size is not static

In deciding whether a Size attribute reference is static, the compiler could
get confused about whether an implicitly-declared subtype of a generic formal
type is itself a generic formal type, possibly resulting in an assertion
failure and then a bugbox.

gcc/ada/

* sem_attr.adb (Eval_Attribute): Expand existing checks for
generic formal types for which Is_Generic_Type returns False. In
that case, mark the attribute reference as nonstatic.

ada: Fix bug in maintaining dimension info

Copying a node does not automatically propagate its associated dimension
information (if any). This must be done explicitly.

gcc/ada/

* sem_util.adb (Copy_Node_With_Replacement): Add call to
Copy_Dimensions so that any dimension information associated with
the copied node is also associated with the resulting copy.

ada: Remove Aspect_Specifications field from N_Procedure_Specification

Sync Has_Aspect_Specifications_Flag with the actual flags in the AST.
Code cleanup; behavior is unaffected.

gcc/ada/

* gen_il-gen-gen_nodes.adb (N_Procedure_Specification): Remove
Aspect_Specifications field.

ada: Reuse existing expression when rewriting aspects to pragmas

Code cleanup; semantics is unaffected.

gcc/ada/

* sem_ch13.adb (Analyze_Aspect_Specification): Consistently
reuse existing constant where possible.

ada: Cleanup reporting locations for Ada 2022 and GNAT extension aspects

Code cleanup; semantics is unaffected.

gcc/ada/

* sem_ch13.adb (Analyze_Aspect_Specification): Consistently
reuse existing constant where possible.

ada: Fix alphabetic ordering of aspect identifiers

Code cleanup.

gcc/ada/

* aspects.ads (Aspect_Id): Fix ordering.

ada: Fix ordering of code for pragma Preelaborable_Initialization

Code cleanup.

gcc/ada/

* sem_prag.adb (Analyze_Pragma): Move case alternative to match
alphabetic order.

ada: Fix casing in error messages

Error messages should not start with a capital letter.

gcc/ada/

* gnat_cuda.adb (Remove_CUDA_Device_Entities): Fix casing
(this primarily fixes a style, because the capitalization will
not be preserved by the error-reporting machinery anyway).
* sem_ch13.adb (Analyze_User_Aspect_Aspect_Specification): Fix
casing in error message.

ada: Fix docs and comments about pragmas for Boolean-valued aspects

Fix various inconsistencies in documentation and comments of
Boolean-valued aspects.

gcc/ada/

* doc/gnat_rm/implementation_defined_pragmas.rst: Fix
documentation.
* sem_prag.adb: Fix comments.
* gnat_rm.texi: Regenerate.

diagnostics: use unicode art for interprocedural depth

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/out-of-bounds-diagram-1-emoji.c: Update expected
output to use unicode for depth indication.
* gcc.dg/analyzer/out-of-bounds-diagram-1-unicode.c: Likewise.

gcc/ChangeLog:
* text-art/theme.cc (ascii_theme::get_cppchar): Add
cell_kind::INTERPROCEDURAL_*.
(unicode_theme::get_cppchar): Likewise.
* text-art/theme.h (theme::cell_kind): Likewise.
* tree-diagnostic-path.cc:
(thread_event_printer::print_swimlane_for_event_range): Use the
above to get characters for indicating interprocedural stack
depth activity, falling back to ascii.
(selftest::test_interprocedural_path_1): Test with both ascii
and unicode themes.
(selftest::test_interprocedural_path_2): Likewise.
(selftest::test_recursion): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: add warning emoji to events with VERB_danger

Tweak the printing of -fdiagnostics-path-format=inline-events so that
any event with diagnostic_event::VERB_danger gains a warning emoji,
provided that the text art theme enables emoji support.

VERB_danger is set by the analyzer on the last event in a path, and so
this emoji appears at the end of all analyzer execution paths
highlighting the location of the problem.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/out-of-bounds-diagram-1-emoji.c: Update expected
output to include warning emoji.
* gcc.dg/analyzer/warning-emoji.c: New test.

gcc/ChangeLog:
* tree-diagnostic-path.cc: Include "text-art/theme.h".
(path_label::get_text): If the event has
diagnostic_event::VERB_danger, and the theme enables emojis, then
add a warning emoji between the event number and the event text.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: simplify output of purely intraprocedural execution paths

Diagnostic path printing was added in r10-5901-g4bc1899b2e883f. As of
that commit, with -fdiagnostics-path-format=inline-events (the default),
we print a vertical line to the left of the source line numbering,
visualizing the stack depth and interprocedural calls and returns as
indentation changes.

For cases where the events on a thread are purely intraprocedural, this
line does nothing except take up space and complicate the output.

This patch adds logic to omit it for such cases, simplifying the output,
and, I believe, improving readability.

gcc/ChangeLog:
* diagnostic-path.h: Update leading comment to reflect
intraprocedural cases. Fix typo in comment.
* doc/invoke.texi: Update intraprocedural example.

gcc/testsuite/ChangeLog:
* c-c++-common/analyzer/allocation-size-multiline-1.c: Update
expected results for purely intraprocedural path.
* c-c++-common/analyzer/allocation-size-multiline-2.c: Likewise.
* c-c++-common/analyzer/allocation-size-multiline-3.c: Likewise.
* c-c++-common/analyzer/analyzer-verbosity-0.c: Likewise.
* c-c++-common/analyzer/analyzer-verbosity-1.c: Likewise.
* c-c++-common/analyzer/analyzer-verbosity-2.c: Likewise.
* c-c++-common/analyzer/analyzer-verbosity-3.c: Likewise.
* c-c++-common/analyzer/malloc-macro-inline-events.c: Likewise.
Doing so for this file requires a rewrite since the paths
prefixing the "in expansion of macro" lines become the only thing
on their line and so are no longer pruned by multiline.exp logic
for pruning extra content on non-blank lines.
* c-c++-common/analyzer/malloc-paths-9-noexcept.c: Likewise.
* c-c++-common/analyzer/setjmp-2.c: Likewise.
* gcc.dg/analyzer/malloc-paths-9.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-multiline-2.c: Likewise.
* gcc.dg/plugin/diagnostic-test-paths-2.c: Likewise.

gcc/ChangeLog:
* tree-diagnostic-path.cc (per_thread_summary::interprocedural_p):
New.
(thread_event_printer::print_swimlane_for_event_range): Don't
indent and print the stack depth line if this thread's events are
purely intraprocedural.
(selftest::test_intraprocedural_path): Update expected output.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: handle SGR codes in line_label::m_display_width

gcc/ChangeLog:
* diagnostic-show-locus.cc: Define INCLUDE_VECTOR and include
"text-art/types.h".
(line_label::line_label): Drop "policy" argument. Use
styled_string::calc_canvas_width when computing m_display_width,
as this skips SGR codes.
(layout::print_any_labels): Update for line_label ctor change.
(selftest::test_one_liner_labels_utf8): Update expected text to
reflect that the labels can fit on one line if we don't get
confused by SGR colorization codes.
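
To illustrate why SGR codes matter for width computation, here is a small sketch that counts display columns while skipping escape sequences of the form ESC `[` parameters `m`. It is a hypothetical helper (`visible_width`), restricted to ASCII text for simplicity, and is not the implementation of styled_string::calc_canvas_width.

```cpp
#include <cstddef>
#include <string>

// Count the columns an ASCII string occupies on screen, ignoring SGR
// sequences such as "\x1b[01;35m" and "\x1b[m".
inline std::size_t visible_width(const std::string &s) {
  std::size_t width = 0;
  for (std::size_t i = 0; i < s.size();) {
    if (s[i] == '\x1b' && i + 1 < s.size() && s[i + 1] == '[') {
      std::size_t j = i + 2;
      // SGR parameters are digits separated by semicolons.
      while (j < s.size() &&
             ((s[j] >= '0' && s[j] <= '9') || s[j] == ';'))
        ++j;
      if (j < s.size() && s[j] == 'm') {
        i = j + 1;  // the whole sequence takes up no columns
        continue;
      }
    }
    ++width;
    ++i;
  }
  return width;
}
```

A colorized label and its plain form then report the same width, so a layout decision such as "do the labels fit on one line" is not skewed by colorization.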

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

RISC-V: Add Zvfbfwma extension to the -march= option

This patch adds a new sub-extension (Zvfbfwma) to the
-march= option. It introduces a new data type BF16.

1 In spec: "Zvfbfwma requires the Zvfbfmin extension and the Zfbfmin extension."
  1.1 In Embedded    Processor: Zvfbfwma -> Zvfbfmin -> Zve32f
  1.2 In Application Processor: Zvfbfwma -> Zvfbfmin -> V
  1.3 In both scenarios, there are: Zvfbfwma -> Zfbfmin

2 Zvfbfmin's information is in:
<https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1ddf65c5fc6ba7cf5826e1c02c569c923a541c09>

3 Zfbfmin's information is in:
<https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=35224ead63732a3550ba4b1332c06e9dc7999c31>

4 Depending on different usage scenarios, the Zvfbfwma extension may
depend on 'V' or 'Zve32f'. This patch only implements dependencies in
scenario of Embedded Processor. This is consistent with the processing
strategy in Zvfbfmin. In scenario of Application Processor, it is
necessary to explicitly indicate the dependent 'V' extension.

5 You can locate more information about Zvfbfwma from below spec doc:
<https://github.com/riscv/riscv-bfloat16/releases/download/v59042fc71c31a9bcb2f1957621c960ed36fac401/riscv-bfloat16.pdf>
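
The dependency chain from item 1 (Embedded Processor scenario) can be pictured as a transitive closure over an implication table. The sketch below is only an illustration: `ImpliedTable`, `embedded_table` and `expand` are hypothetical names, the table contains just the three implications listed above, and it does not mirror the actual riscv_implied_info data structure.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using ImpliedTable = std::map<std::string, std::vector<std::string>>;

// Only the implications stated in the description above; anything the
// implied extensions themselves imply is deliberately left out.
inline ImpliedTable embedded_table() {
  return {{"zvfbfwma", {"zvfbfmin", "zfbfmin"}},
          {"zvfbfmin", {"zve32f"}}};
}

// Return the extension together with everything it transitively implies.
inline std::set<std::string> expand(const ImpliedTable &table,
                                    const std::string &ext) {
  std::set<std::string> result;
  std::vector<std::string> worklist{ext};
  while (!worklist.empty()) {
    const std::string cur = worklist.back();
    worklist.pop_back();
    if (!result.insert(cur).second)
      continue;  // already visited
    const auto it = table.find(cur);
    if (it != table.end())
      for (const auto &dep : it->second)
        worklist.push_back(dep);
  }
  return result;
}
```

With this table, expanding "zvfbfwma" yields zvfbfwma, zvfbfmin, zfbfmin and zve32f; in the Application Processor scenario 'V' would have to be given explicitly, as item 4 notes.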

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc:
(riscv_implied_info): Add zvfbfwma item.
(riscv_ext_version_table): Ditto.
(riscv_ext_flag_table): Ditto.
* config/riscv/riscv.opt:
(MASK_ZVFBFWMA): New macro.
(TARGET_ZVFBFWMA): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-37.c: New test.
* gcc.target/riscv/arch-38.c: New test.
* gcc.target/riscv/predef-36.c: New test.
* gcc.target/riscv/predef-37.c: New test.

Set d.one_operand_p to true when TARGET_SSSE3 in ix86_expand_vecop_qihi_partial.

pshufb is available under TARGET_SSSE3, so
ix86_expand_vec_perm_const_1 must return true when TARGET_SSSE3.

With the patch under -march=x86-64-v2

v8qi
foo (v8qi a)
{
return a >> 5;
}

< pmovsxbw %xmm0, %xmm0
< psraw $5, %xmm0
< pshufb .LC0(%rip), %xmm0

vs.

> movdqa %xmm0, %xmm1
> pcmpeqd %xmm0, %xmm0
> pmovsxbw %xmm1, %xmm1
> psrlw $8, %xmm0
> psraw $5, %xmm1
> pand %xmm1, %xmm0
> packuswb %xmm0, %xmm0

Although this adds a memory load from the constant pool, it should be
better inside a loop, where the load from the constant pool can be
hoisted out. It is then 1 instruction vs. 4 instructions.

< pshufb .LC0(%rip), %xmm0

vs.

> pcmpeqd %xmm0, %xmm0
> psrlw $8, %xmm0
> pand %xmm1, %xmm0
> packuswb %xmm0, %xmm0

gcc/ChangeLog:

PR target/114514
* config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial):
Set d.one_operand_p to true when TARGET_SSSE3.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr114514-shufb.c: New test.

Optimize ashift >> 7 to vpcmpgtb for vector int8.

Since there is no corresponding instruction, the shift operation for
vector int8 is implemented using the instructions for vector int16,
but for some special shift counts, it can be transformed into vpcmpgtb.
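
For one int8 lane, the equivalence behind this transformation can be checked with a short scalar sketch. The function names are hypothetical, and the code assumes that `>>` on a negative value is an arithmetic shift (true for GCC, and required since C++20).

```cpp
#include <cstdint>

// Arithmetic right shift by 7: every result bit is a copy of the sign bit.
constexpr std::int8_t shift_by_7(std::int8_t x) {
  return static_cast<std::int8_t>(x >> 7);
}

// The same value written as a signed "0 > x" comparison mask, which is
// what a byte-wise greater-than compare produces per lane.
constexpr std::int8_t compare_mask(std::int8_t x) {
  return static_cast<std::int8_t>(0 > x ? -1 : 0);
}

// Exhaustively compare both forms over all 256 lane values.
inline bool all_lanes_agree() {
  for (int v = -128; v <= 127; ++v) {
    const std::int8_t x = static_cast<std::int8_t>(v);
    if (shift_by_7(x) != compare_mask(x))
      return false;
  }
  return true;
}
```

Negative lanes become all ones and non-negative lanes become zero in both forms, which is why a single compare against zero can replace the widen, shift and narrow sequence for this shift count.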

gcc/ChangeLog:

PR target/114514
* config/i386/i386-expand.cc
(ix86_expand_vec_shift_qihi_constant): Optimize ashift >> 7 to
vpcmpgtb.
(ix86_expand_vecop_qihi_partial): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr114514-shift.c: New test.

Daily bump.

Add missing hunk in recent change.

gcc/
* config/riscv/riscv-string.cc: Add missing hunk from last change.

analyzer: fix ICE seen with -fsanitize=undefined [PR114899]

gcc/analyzer/ChangeLog:
PR analyzer/114899
* access-diagram.cc
(written_svalue_spatial_item::get_label_string): Bulletproof
against SSA_NAME_VAR being null.

gcc/testsuite/ChangeLog:
PR analyzer/114899
* c-c++-common/analyzer/out-of-bounds-diagram-pr114899.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

[v2,2/2] RISC-V: strcmp expansion: Use adjust_address() for address calculation

We have an arch-independent routine to generate an address with an offset.
Let's use that instead of doing the calculation in the backend.

gcc/ChangeLog:

* config/riscv/riscv-string.cc (emit_strcmp_scalar_load_and_compare):
Use adjust_address() to calculate MEM-PLUS pattern.

[v2,1/2] RISC-V: Add cmpmemsi expansion

GCC has a generic cmpmemsi expansion via the by-pieces framework,
which shows some room for target-specific optimizations.
E.g. for comparing two aligned memory blocks of 15 bytes
we get the following sequence:

my_mem_cmp_aligned_15:
        li      a4,0
        j       .L2
.L8:
        bgeu    a4,a7,.L7
.L2:
        add     a2,a0,a4
        add     a3,a1,a4
        lbu     a5,0(a2)
        lbu     a6,0(a3)
        addi    a4,a4,1
        li      a7,15    // missed hoisting
        subw    a5,a5,a6
        andi    a5,a5,0xff // useless
        beq     a5,zero,.L8
        lbu     a0,0(a2) // loading again!
        lbu     a5,0(a3) // loading again!
        subw    a0,a0,a5
        ret
.L7:
        li      a0,0
        ret

Diff first byte: 15 insns
Diff second byte: 25 insns
No diff: 25 insns

Possible improvements:
* unroll the loop and use load-with-displacement to avoid offset increments
* load and compare multiple (aligned) bytes at once
* Use the bitmanip/strcmp result calculation (reverse words and
  synthesize (a2 >= a3) ? 1 : -1 in a branchless sequence)

When applying these improvements we get the following sequence:

my_mem_cmp_aligned_15:
        ld      a5,0(a0)
        ld      a4,0(a1)
        bne     a5,a4,.L2
        ld      a5,8(a0)
        ld      a4,8(a1)
        slli    a5,a5,8
        slli    a4,a4,8
        bne     a5,a4,.L2
        li      a0,0
.L3:
        sext.w  a0,a0
        ret
.L2:
        rev8    a5,a5
        rev8    a4,a4
        sltu    a5,a5,a4
        neg     a5,a5
        ori     a0,a5,1
        j       .L3

Diff first byte: 11 insns
Diff second byte: 16 insns
No diff: 11 insns
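
The word-at-a-time scheme can be modelled in portable C++ as below. This is a sketch of the technique, not of the code in riscv-string.cc: `load_be`, `differing_result` and `block_compare` are hypothetical names, and the byte-by-byte packing stands in for what the assembly achieves with `ld` plus `rev8` on a little-endian target.

```cpp
#include <cstddef>
#include <cstdint>

// Pack up to 8 bytes into a word with the first byte most significant, so
// that an unsigned word comparison orders blocks the way memcmp does.
// Missing tail bytes are treated as zero in both operands.
inline std::uint64_t load_be(const unsigned char *p, std::size_t count) {
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < count; ++i)
    v |= static_cast<std::uint64_t>(p[i]) << (56 - 8 * i);
  return v;
}

// Branchless -1 / +1 for two words known to differ
// (mirrors the sltu, neg, ori 1 sequence).
inline int differing_result(std::uint64_t x, std::uint64_t y) {
  return -static_cast<int>(x < y) | 1;
}

// Compare n bytes in chunks of up to 8 bytes.
inline int block_compare(const void *a, const void *b, std::size_t n) {
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  for (std::size_t off = 0; off < n; off += 8) {
    const std::size_t chunk = n - off < 8 ? n - off : 8;
    const std::uint64_t x = load_be(pa + off, chunk);
    const std::uint64_t y = load_be(pb + off, chunk);
    if (x != y)
      return differing_result(x, y);
  }
  return 0;
}
```

For a 15-byte block this performs two word comparisons, matching the shape of the improved sequence above, where the second load is shifted to discard the sixteenth byte.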

This patch implements these improvements.

The tests consist of an execution test (similar to
gcc/testsuite/gcc.dg/torture/inline-mem-cmp-1.c) and a few tests
that test the expansion conditions (known length and alignment).

Similar to the cpymemsi expansion this patch does not introduce any
gating for the cmpmemsi expansion (on top of requiring the known length,
alignment and Zbb).

Bootstrapped and SPEC CPU 2017 tested.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_expand_block_compare): New
prototype.
* config/riscv/riscv-string.cc (GEN_EMIT_HELPER2): New helper
for zero_extendhi.
(do_load_from_addr): Add support for HI and SI/64 modes.
(do_load): Add helper for zero-extended loads.
(emit_memcmp_scalar_load_and_compare): New helper to emit memcmp.
(emit_memcmp_scalar_result_calculation): Likewise.
(riscv_expand_block_compare_scalar): Likewise.
(riscv_expand_block_compare): New RISC-V expander for memory compare.
* config/riscv/riscv.md (cmpmemsi): New cmpmem expansion.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmpmemsi-1.c: New test.
* gcc.target/riscv/cmpmemsi-2.c: New test.
* gcc.target/riscv/cmpmemsi-3.c: New test.
* gcc.target/riscv/cmpmemsi.c: New test.

c++: ICE with reference NSDMI [PR114854]

Here we crash on a cp_gimplify_expr/TARGET_EXPR assert:

      /* A TARGET_EXPR that expresses direct-initialization should have been
         elided by cp_gimplify_init_expr.  */
      gcc_checking_assert (!TARGET_EXPR_DIRECT_INIT_P (*expr_p));

the TARGET_EXPR in question is created for the NSDMI in:

  class Vector { int m_size; };
  struct S {
    const Vector &vec{};
  };

where we first need to create a Vector{} temporary, and then bind the
vec reference to it.  The temporary is represented by a TARGET_EXPR
and it cannot be elided.  When we create an object of type S, we get

  D.2848 = {.vec=(const struct Vector &) &TARGET_EXPR <D.2840, {.m_size=0}>}

where the TARGET_EXPR is no longer direct-initializing anything.

Fixed by not setting TARGET_EXPR_DIRECT_INIT_P in convert_like_internal/ck_user.

PR c++/114854

gcc/cp/ChangeLog:

* call.cc (convert_like_internal) <case ck_user>: Don't set
TARGET_EXPR_DIRECT_INIT_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/nsdmi-aggr22.C: New test.

c++: DR 569, DR 1693: fun with semicolons [PR113760]

Prompted by c++/113760, I started looking into a bogus "extra ;"
warning in C++11.  It quickly turned out that if I want to fix
this for good, the fix will not be so small.

This patch touches on DR 569, an extra ; at namespace scope should
be allowed since C++11:

  struct S {
  };
  ; // pedwarn in C++98

It also touches on DR 1693, which allows superfluous semicolons in
class definitions since C++11:

  struct S {
    int a;
    ; // pedwarn in C++98
  };

Note that a single semicolon is valid after a member function definition:

  struct S {
    void foo () {}; // only warns with -Wextra-semi
  };

There's a new function maybe_warn_extra_semi to handle all of the above
in a single place.  So now they all get a fix-it hint.

-Wextra-semi turns on all "extra ;" diagnostics.  Currently, options
like -Wc++11-compat or -Wc++11-extensions are not considered.

DR 1693
PR c++/113760
DR 569

gcc/c-family/ChangeLog:

* c.opt (Wextra-semi): Initialize to -1.

gcc/cp/ChangeLog:

* parser.cc (extra_semi_kind): New.
(maybe_warn_extra_semi): New.
(cp_parser_declaration): Call maybe_warn_extra_semi.
(cp_parser_member_declaration): Likewise.

gcc/ChangeLog:

* doc/invoke.texi: Update -Wextra-semi documentation.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/semicolon1.C: New test.
* g++.dg/diagnostic/semicolon10.C: New test.
* g++.dg/diagnostic/semicolon11.C: New test.
* g++.dg/diagnostic/semicolon12.C: New test.
* g++.dg/diagnostic/semicolon13.C: New test.
* g++.dg/diagnostic/semicolon14.C: New test.
* g++.dg/diagnostic/semicolon15.C: New test.
* g++.dg/diagnostic/semicolon16.C: New test.
* g++.dg/diagnostic/semicolon17.C: New test.
* g++.dg/diagnostic/semicolon2.C: New test.
* g++.dg/diagnostic/semicolon3.C: New test.
* g++.dg/diagnostic/semicolon4.C: New test.
* g++.dg/diagnostic/semicolon5.C: New test.
* g++.dg/diagnostic/semicolon6.C: New test.
* g++.dg/diagnostic/semicolon7.C: New test.
* g++.dg/diagnostic/semicolon8.C: New test.
* g++.dg/diagnostic/semicolon9.C: New test.

c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]

This patch reworks the cdtor alias optimization, such that we can create
aliases even when maybe_clone_body is called not at at_eof time, without trying
to repeat it in maybe_optimize_cdtor.

2024-05-15 Jakub Jelinek <jakub@redhat.com>
Jason Merrill <jason@redhat.com>

PR lto/113208
* cp-tree.h (maybe_optimize_cdtor): Remove.
* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
for implicit instantiations of maybe in charge ctors/dtors
declared inline.
(import_export_decl): Don't call maybe_optimize_cdtor.
(c_parse_final_cleanups): Formatting fixes.
* optimize.cc (can_alias_cdtor): Adjust condition, for
HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
if not DECL_INTERFACE_KNOWN.
(maybe_clone_body): Don't clear DECL_SAVED_TREE, instead set it
to void_node.
(maybe_optimize_cdtor): Remove.
* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
set.

* g++.dg/abi/comdat3.C: New test.
* g++.dg/abi/comdat4.C: New test.

combine: Fix up simplify_compare_const [PR115092]

The following testcases are miscompiled (with tons of GIMPLE
optimization disabled) because combine sees GE comparison of
1-bit sign_extract (i.e. something with [-1, 0] value range)
with (const_int -1) (which is always true) and optimizes it into
NE comparison of 1-bit zero_extract ([0, 1] value range) against
(const_int 0).
The reason is that simplify_compare_const first (correctly)
simplifies the comparison to
GE (ashift:SI something (const_int 31)) (const_int -2147483648)
and then an optimization for when the second operand is power of 2
triggers. That optimization is fine for power of 2s which aren't
the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
against the signed minimum of the mode, but for GE or LT optimizing
it into NE (or EQ) against const0_rtx is wrong, those cases
are always true or always false (but the function doesn't have
a standardized way to tell callers the comparison is now unconditional).
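
The value-range argument can be reproduced with a scalar model. The helpers below are hypothetical and only model bit 0 of a 32-bit value; they are not taken from combine.cc or from the testcases.

```cpp
#include <cstdint>

// 1-bit sign_extract of bit 0: the value is 0 or -1.
constexpr std::int32_t sign_extract_1bit(std::uint32_t x) {
  return (x & 1u) ? -1 : 0;
}

// 1-bit zero_extract of bit 0: the value is 0 or 1.
constexpr std::uint32_t zero_extract_1bit(std::uint32_t x) {
  return x & 1u;
}

// The source comparison: GE against -1 holds for both possible values.
constexpr bool original_compare(std::uint32_t x) {
  return sign_extract_1bit(x) >= -1;
}

// What the faulty rewrite computed: NE of the zero_extract against 0.
constexpr bool faulty_rewrite(std::uint32_t x) {
  return zero_extract_1bit(x) != 0u;
}

static_assert(original_compare(0u) && original_compare(1u),
              "GE against -1 is always true here");
static_assert(!faulty_rewrite(0u),
              "the rewrite is false when the bit is clear");
```

The two forms disagree whenever the extracted bit is clear, which is the miscompilation; the intermediate form, a GE comparison against the signed minimum of the mode, is likewise true for every input.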

The following patch just disables the optimization in that case.

2024-05-15 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/114902
PR rtl-optimization/115092
* combine.cc (simplify_compare_const): Don't optimize
GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
EQ op0 const0_rtx.

* gcc.dg/pr114902.c: New test.
* gcc.dg/pr115092.c: New test.

openmp: Diagnose using grainsize+num_tasks clauses together [PR115103]

I've noticed that while we diagnose many other OpenMP exclusive clauses,
we don't diagnose grainsize together with num_tasks on taskloop construct
in all of C, C++ and Fortran (the implementation simply ignored grainsize
in that case) and for Fortran also don't diagnose mixing nogroup clause
with reduction clause(s).
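
For reference, a sketch of the two clauses used separately on a taskloop, which remains valid. The function names are made up for this example; when compiled without OpenMP support the pragmas are ignored and the loops simply run serially.

```cpp
#include <cstddef>
#include <vector>

// Double every element, asking for at least 4 iterations per task.
inline void double_with_grainsize(std::vector<int> &v) {
  const std::size_t n = v.size();
#pragma omp parallel
#pragma omp single
#pragma omp taskloop grainsize(4)
  for (std::size_t i = 0; i < n; ++i)
    v[i] *= 2;
}

// Double every element, asking for the loop to be split into 2 tasks.
inline void double_with_num_tasks(std::vector<int> &v) {
  const std::size_t n = v.size();
#pragma omp parallel
#pragma omp single
#pragma omp taskloop num_tasks(2)
  for (std::size_t i = 0; i < n; ++i)
    v[i] *= 2;
}

// Writing both clauses on one construct, e.g.
//   #pragma omp taskloop grainsize(4) num_tasks(2)
// is the combination this patch diagnoses, so it is only shown as a comment.
```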

Fixed thusly.

2024-05-15 Jakub Jelinek <jakub@redhat.com>

PR c/115103
gcc/c/
* c-typeck.cc (c_finish_omp_clauses): Diagnose grainsize
used together with num_tasks.
gcc/cp/
* semantics.cc (finish_omp_clauses): Diagnose grainsize
used together with num_tasks.
gcc/fortran/
* openmp.cc (resolve_omp_clauses): Diagnose grainsize
used together with num_tasks or nogroup used together with
reduction.
gcc/testsuite/
* c-c++-common/gomp/clause-dups-1.c: Add 2 further expected errors.
* gfortran.dg/gomp/pr115103.f90: New test.

tree-optimization/114589 - remove profile based sink heuristics

The following removes the profile based heuristic limiting sinking
and instead uses post-dominators to avoid sinking to places that
are executed under the same conditions as the earlier location which
the profile based heuristic should have guaranteed as well.

To avoid regressing this moves the empty-latch check to cover all
sink cases.

It also streamlines the resulting select_best_block a bit but avoids
adjusting heuristics more with this change.  gfortran.dg/streamio_9.f90
starts execute failing with this on x86_64 with -m32 because the
(float)i * 9.9999...e-7 compute is sunk across a STOP causing it
to be no longer spilled and thus the compare failing due to excess
precision.  The patch adds -ffloat-store to avoid this, following
other similar testcases.

This change fixes the testcase in the PR only when using -fno-ivopts
as otherwise VRP is confused.

PR tree-optimization/114589
* tree-ssa-sink.cc (select_best_block): Remove profile-based
heuristics.  Instead reject sink locations that sink
to post-dominators.  Move empty latch check here from
statement_sink_location.  Also consider early_bb for the
loop depth check.
(statement_sink_location): Remove superfluous check.  Remove
empty latch check.
(pass_sink_code::execute): Compute/release post-dominators.

* gfortran.dg/streamio_9.f90: Use -ffloat-store to avoid
excess precision when not spilling.
* g++.dg/tree-ssa/pr114589.C: New testcase.

middle-end/111422 - wrong stack var coalescing, handle PHIs

The gcc.c-torture/execute/pr111422.c testcase after installing the
sink pass improvement reveals that we also need to handle

_65 = &g + _58; _44 = &g + _43;
# _59 = PHI <_65, _44>
*_59 = 8;
g = {v} {CLOBBER(eos)};
...
n[0] = &f;
*_59 = 8;
g = {v} {CLOBBER(eos)};

where we fail to see the conflict between n and g after the first
clobber of g. Before the sinking improvement there was a conflict
recorded on a path where _65/_44 are unused, so the real conflict
was missed but the fake one avoided the miscompile.

The following handles PHI defs in add_scope_conflicts_2 which
fixes the issue.

PR middle-end/111422
* cfgexpand.cc (add_scope_conflicts_2): Handle PHIs
by recursing to their arguments.

PR modula2/115057 TextIO.ReadRestLine raises an exception when buffer is exceeded

TextIO.ReadRestLine will raise an "attempting to read beyond end of file"
exception if the buffer is exceeded.  This bug is caused by
TextIO.ReadRestLine calling IOChan.Skip without a preceding IOChan.Look.
The Look procedure will update the status result whereas
Skip always sets read result to allRight.

gcc/m2/ChangeLog:

PR modula2/115057
* gm2-libs-iso/TextIO.mod (ReadRestLine): Use ReadChar to
skip unwanted characters as this calls IOChan.Look and updates
the cid result status.  A Skip without a Look does not update
the status.  Skip always sets read result to allRight.
* gm2-libs-iso/TextUtil.def (SkipSpaces): Improve comments.
(CharAvailable): Improve comments.
* gm2-libs-iso/TextUtil.mod (SkipSpaces): Improve comments.
(CharAvailable): Improve comments.

gcc/testsuite/ChangeLog:

PR modula2/115057
* gm2/isolib/run/pass/testrestline.mod: New test.
* gm2/isolib/run/pass/testrestline2.mod: New test.
* gm2/isolib/run/pass/testrestline3.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

c++: add test for DR 2855

Let

  int8_t x = 127;

This DR says that while

  x++;

invokes UB,

  ++x;

does not.  The resolution was to make the first one valid.  The
following test verifies that we don't report any errors in a constexpr
context.
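
A constexpr check along these lines might look as follows. This is a sketch written for this note, not the contents of g++.dg/DRs/dr2855.C; the function names are hypothetical and it needs C++14 or later for the relaxed constexpr rules.

```cpp
#include <cstdint>

// Postfix increment at the maximum value of int8_t.
constexpr std::int8_t post_increment_max() {
  std::int8_t x = 127;
  x++;
  return x;
}

// Prefix increment at the maximum value of int8_t.
constexpr std::int8_t pre_increment_max() {
  std::int8_t x = 127;
  ++x;
  return x;
}

// Both forms are expected to be accepted in a constant expression and to
// give the same result.
static_assert(post_increment_max() == pre_increment_max(),
              "postfix and prefix increment agree");
```

If the postfix form were treated as undefined behaviour, the first function could not be evaluated in the static_assert at all.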

DR 2855

gcc/testsuite/ChangeLog:

* g++.dg/DRs/dr2855.C: New test.

RISC-V: Test cbo.zero expansion for rv32

We had an issue when expanding via cmo-zero for RV32.
This was fixed upstream, but we don't have an RV32 test.
Therefore, this patch introduces such a test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmo-zicboz-zic64-1.c: Fix for rv32.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

AArch64: Use UZP1 instead of INS

Use UZP1 instead of INS when combining low and high halves of vectors.
UZP1 has 3 operands which improves register allocation, and is faster on
some microarchitectures.

gcc:
* config/aarch64/aarch64-simd.md (aarch64_combine_internal<mode>):
Use UZP1 instead of INS.
(aarch64_combine_internal_be<mode>): Likewise.

gcc/testsuite:
* gcc.target/aarch64/ldp_stp_16.c: Update to check for UZP1.
* gcc.target/aarch64/pr109072_1.c: Likewise.
* gcc.target/aarch64/vec-init-14.c: Likewise.
* gcc.target/aarch64/vec-init-9.c: Likewise.

Avoid pointer compares on TYPE_MAIN_VARIANT in TBAA

While building more testcases for ipa-icf I noticed that there are two places
in aliasing code where we still compare TYPE_MAIN_VARIANT for pointer equality.
This is not a good idea for LTO, since type merging may not happen, for example
when the pointed-to type is forward declared in one unit while it is fully
defined in another. We have same_type_for_tbaa for that.

Bootstrapped/regtested x86_64-linux, OK?

gcc/ChangeLog:

* alias.cc (reference_alias_ptr_type_1): Use view_converted_memref_p.
* alias.h (view_converted_memref_p): Declare.
* tree-ssa-alias.cc (view_converted_memref_p): Export.
(ao_compare::compare_ao_refs): Use same_type_for_tbaa.

testsuite: Require lto-plugin in gcc.dg/ipa/ipa-icf-38.c [PR85656]

gcc.dg/ipa/ipa-icf-38.c currently FAILs on Solaris (SPARC and x86, 32
and 64-bit):

FAIL: gcc.dg/ipa/ipa-icf-38.c scan-ltrans-tree-dump-not optimized "Function bar"

As it turns out, this only happens when the Solaris linker is used; with
GNU ld the test PASSes just fine. In fact, that happens because gld
supports the lto-plugin while ld does not: in a Solaris build with gld,
the test FAILs the same way as with ld when -fno-use-linker-plugin is
passed, so this patch requires linker_plugin.

Tested on i386-pc-solaris2.11 (ld and gld) and x86_64-pc-linux-gnu.

2024-05-15 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR ipa/85656
* gcc.dg/ipa/ipa-icf-38.c: Require linker_plugin.

libstdc++: Fix data race in std::basic_ios::fill() [PR77704]

The lazy caching in std::basic_ios::fill() updates a mutable member
without synchronization, which can cause a data race if two threads both
call fill() on the same stream object when _M_fill_init is false.

To avoid this we can just cache the _M_fill member and set _M_fill_init
early in std::basic_ios::init, instead of doing it lazily. As explained
by the comment in init, there's a good reason for doing it lazily. When
char_type is neither char nor wchar_t, the locale might not have a
std::ctype<char_type> facet, so getting the fill character would throw
an exception. The current lazy init allows using unformatted I/O with
such a stream, because the fill character is never needed and so it
doesn't matter if the locale doesn't have a ctype<char_type> facet. We
can maintain this property by only setting the fill character in
std::basic_ios::init if the ctype facet is present at that time. If
fill() is called later and the fill character wasn't set by init, we can
get it from the stream's current locale at the point when fill() is
called (and not try to cache it without synchronization). If the stream
hasn't been imbued with a locale that includes the facet when we need
the fill() character, then throw bad_cast at that point.

This causes a change in behaviour for the following program:

  std::ostringstream out;
  out.imbue(loc);
  auto fill = out.fill();

Previously the fill character would have been set when fill() is called,
and so would have used the new locale. This commit changes it so that
the fill character is set on construction and isn't affected by the new
locale being imbued later. This new behaviour seems to be what the
standard requires, and matches MSVC.

The new 27_io/basic_ios/fill/char/fill.cc test verifies that it's still
possible to use a std::basic_ios without the ctype<char_type> facet
being present at construction.
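
The caching strategy described above can be sketched with a toy model. ToyLocale and ToyStream are hypothetical stand-ins, not the libstdc++ code; an empty `space` member models a locale without a ctype<char_type> facet.

```cpp
#include <optional>
#include <typeinfo>

// Hypothetical stand-in for a locale; an empty optional models a missing
// ctype<char_type> facet.
struct ToyLocale {
    std::optional<char> space;
};

// Hypothetical stand-in for basic_ios, showing only the fill-character logic.
class ToyStream {
public:
    explicit ToyStream(const ToyLocale& loc) : loc_(&loc) {
        // Cache eagerly, but only if the character is available right now.
        if (loc_->space) {
            fill_ = *loc_->space;
            fill_init_ = true;
        }
    }

    void imbue(const ToyLocale& loc) { loc_ = &loc; }

    // Performs no writes, so concurrent calls on one object do not race.
    char fill() const {
        if (fill_init_)
            return fill_;
        if (loc_->space)
            return *loc_->space;  // looked up on demand, deliberately not cached
        throw std::bad_cast();
    }

private:
    const ToyLocale* loc_;
    char fill_ = 0;
    bool fill_init_ = false;
};
```

In this model a stream built with a usable locale keeps its original fill character after a later imbue, while a stream built without one picks the character up from whatever locale is current when fill() is called.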

libstdc++-v3/ChangeLog:

PR libstdc++/77704
* include/bits/basic_ios.h (basic_ios::fill()): Do not modify
_M_fill and _M_fill_init in a const member function.
(basic_ios::fill(char_type)): Use _M_fill directly instead of
calling fill(). Set _M_fill_init to true.
* include/bits/basic_ios.tcc (basic_ios::init): Set _M_fill and
_M_fill_init here instead.
* testsuite/27_io/basic_ios/fill/char/1.cc: New test.
* testsuite/27_io/basic_ios/fill/wchar_t/1.cc: New test.

testsuite: i386: Fix g++.target/i386/pr97054.C on Solaris

g++.target/i386/pr97054.C currently FAILs on 64-bit Solaris/x86:

FAIL: g++.target/i386/pr97054.C  -std=gnu++14 (test for excess errors)
UNRESOLVED: g++.target/i386/pr97054.C  -std=gnu++14 compilation failed to produce executable
FAIL: g++.target/i386/pr97054.C  -std=gnu++17 (test for excess errors)
UNRESOLVED: g++.target/i386/pr97054.C  -std=gnu++17 compilation failed to produce executable
FAIL: g++.target/i386/pr97054.C  -std=gnu++2a (test for excess errors)
UNRESOLVED: g++.target/i386/pr97054.C  -std=gnu++2a compilation failed to produce executable
FAIL: g++.target/i386/pr97054.C  -std=gnu++98 (test for excess errors)
UNRESOLVED: g++.target/i386/pr97054.C  -std=gnu++98 compilation failed to produce executable

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/g++.target/i386/pr97054.C:49:20: error: frame pointer required, but reserved

Since Solaris/x86 defaults to -fno-omit-frame-pointer, this patch
explicitly builds with -fomit-frame-pointer as is the default on other
x86 targets.

Tested on i386-pc-solaris2.11 (32 and 64-bit) and x86_64-pc-linux-gnu.

2024-05-15  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* g++.target/i386/pr97054.C (dg-options): Add -fomit-frame-pointer.

RISC-V: Allow by-pieces to do overlapping accesses in block_move_straight

The current implementation of riscv_block_move_straight() emits a couple
of loads/stores with maximum width (e.g. 8-byte for RV64).
The remainder is handed over to move_by_pieces().
The by-pieces framework utilizes target hooks to decide about the emitted
instructions (e.g. unaligned accesses or overlapping accesses).

Since the current implementation will always request less than XLEN bytes
to be handled by the by-pieces infrastructure, it is impossible that
overlapping memory accesses can ever be emitted (the by-pieces code does
not know of any previous instructions that were emitted by the backend).

This patch changes the implementation of riscv_block_move_straight()
such that it uses the by-pieces framework if the remaining data
is less than 2*XLEN bytes, which is sufficient to enable overlapping
memory accesses (if the requirements for them are given).
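
As an illustration of what an overlapping access buys, here is a minimal sketch in plain C++ rather than RTL. The helper name is hypothetical, and the 8-byte word models XLEN on RV64.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies n bytes, where 8 <= n <= 16, using exactly two 8-byte accesses.
// When n < 16 the second access overlaps the first instead of falling back
// to a chain of 4-, 2- and 1-byte accesses.
// Hypothetical helper; the word size of 8 models XLEN on RV64.
inline void copy_8_to_16_overlapping(unsigned char* dst,
                                     const unsigned char* src,
                                     std::size_t n) {
    std::uint64_t head, tail;
    std::memcpy(&head, src, 8);          // bytes [0, 8)
    std::memcpy(&tail, src + n - 8, 8);  // bytes [n-8, n), may overlap head
    std::memcpy(dst, &head, 8);
    std::memcpy(dst + n - 8, &tail, 8);
}
```

A tail shorter than one word cannot be covered this way without reaching back into bytes that were already copied. That is why the by-pieces code needs to see up to 2*XLEN bytes.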

The changes in the expansion can be seen in the adjustments of the
cpymem-NN-ooo test cases. The changes in the cpymem-NN tests are
caused by the different instruction ordering of the code emitted
by the by-pieces infrastructure, which emits alternating load/store
sequences.

gcc/ChangeLog:

* config/riscv/riscv-string.cc (riscv_block_move_straight):
Hand over up to 2xXLEN bytes to move_by_pieces().

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymem-32-ooo.c: Adjustments for overlapping
access.
* gcc.target/riscv/cpymem-32.c: Adjustments for code emitted by
by-pieces.
* gcc.target/riscv/cpymem-64-ooo.c: Adjustments for overlapping
access.
* gcc.target/riscv/cpymem-64.c: Adjustments for code emitted by
by-pieces.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

RISC-V: add tests for overlapping mem ops

A recent patch added the field overlap_op_by_pieces to the struct
riscv_tune_param, which is used by the TARGET_OVERLAP_OP_BY_PIECES_P()
hook. This hook is used by the by-pieces infrastructure to decide
if overlapping memory accesses should be emitted.

The changes in the expansion can be seen in the adjustments of the
cpymem test cases. These tests also reveal a limitation in the
RISC-V cpymem expansion that prevents this optimization as only
by-pieces cpymem expansions emit overlapping memory accesses.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymem-32-ooo.c: Adjust for overlapping
access.
* gcc.target/riscv/cpymem-64-ooo.c: Likewise.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

RISC-V: Allow unaligned accesses in cpymemsi expansion

The RISC-V cpymemsi expansion is called whenever the by-pieces
infrastructure will not take care of the builtin expansion.
The code emitted by the by-pieces infrastructure may include
unaligned accesses if riscv_slow_unaligned_access_p
is false.

The RISC-V cpymemsi expansion is handled via riscv_expand_block_move().
The current implementation of this function does not check
riscv_slow_unaligned_access_p and never emits unaligned accesses.

Since by-pieces emits unaligned accesses, it is reasonable to implement
the same behaviour in the cpymemsi expansion. And that's what this patch
is doing.

The patch checks riscv_slow_unaligned_access_p at the entry and sets
the allowed alignment accordingly. This alignment is then propagated
down to the routines that emit the actual instructions.
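
The alignment decision can be sketched as follows. The function and parameter names are hypothetical, and the real code works on RTL operands rather than plain integers.

```cpp
#include <algorithm>

// Hypothetical helper: returns the alignment (in bits) that the block-move
// expansion may assume when choosing access widths.
// known_align_bits: alignment proven for the operands.
// xlen_bits: register width (32 or 64).
// slow_unaligned: models riscv_slow_unaligned_access_p.
inline unsigned usable_align_bits(unsigned known_align_bits,
                                  unsigned xlen_bits,
                                  bool slow_unaligned) {
    if (slow_unaligned)
        return std::min(known_align_bits, xlen_bits);
    return xlen_bits;  // fast unaligned access: full-width accesses allowed
}
```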

The changes introduced by this patch can be seen in the adjustments
of the cpymem tests.

gcc/ChangeLog:

* config/riscv/riscv-string.cc (riscv_block_move_straight): Add
parameter align.
(riscv_adjust_block_mem): Replace parameter length by align.
(riscv_block_move_loop): Add parameter align.
(riscv_expand_block_move_scalar): Set alignment properly if the
target has fast unaligned access.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymem-32-ooo.c: Adjust for unaligned access.
* gcc.target/riscv/cpymem-64-ooo.c: Likewise.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

RISC-V: Add test cases for cpymem expansion

We have two mechanisms in the RISC-V backend that expand
cpymem pattern: a) by-pieces, b) riscv_expand_block_move()
in riscv-string.cc. The by-pieces framework has higher priority
and emits a sequence of up to 15 instructions
(see use_by_pieces_infrastructure_p() for more details).

As a rule-of-thumb, by-pieces emits alternating load/store sequences
and the cpymem expansion in the backend emits a sequence of loads
followed by a sequence of stores.

Let's add some test cases to document the current behaviour
and to have tests to identify regressions.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymem-32-ooo.c: New test.
* gcc.target/riscv/cpymem-32.c: New test.
* gcc.target/riscv/cpymem-64-ooo.c: New test.
* gcc.target/riscv/cpymem-64.c: New test.

[prange] Default pointers_handled_p() to true.

The pointers_handled_p() method is an internal range-op helper to help
catch dispatch type mismatches for pointer operands.  This is what
caught the IPA mismatch in PR114985.

This method is only a temporary measure to catch any incompatibilities
in the current pointer range-op entries.  This patch returns true for
any *new* entries in the range-op table, as the current ones are
already fleshed out.  This keeps us from having to implement this
boilerplate function for any new range-op entries.

PR tree-optimization/114995
* range-op-ptr.cc (range_operator::pointers_handled_p): Default to true.

libstdc++: Rewrite std::variant comparisons without macros

libstdc++-v3/ChangeLog:

* include/std/variant (__detail::__variant::__compare): New
function template.
(operator==, operator!=, operator<, operator>, operator<=)
(operator>=): Replace macro definition with handwritten function
calling __detail::__variant::__compare.
(operator<=>): Call __detail::__variant::__compare.

libstdc++: Give std::memory_order a fixed underlying type [PR89624]

Prior to C++20 this enum type doesn't have a fixed underlying type,
which means it can be modified by -fshort-enums, which then means the
HLE bits are outside the range of valid values for the type.

As it has a fixed type of int in C++20 and later, do the same for
earlier standards too. This is technically a change for C++17 down,
because the implicit underlying type (without -fshort-enums) was
unsigned before. I doubt it matters in practice. That incompatibility
already exists between C++17 and C++20 and nobody has noticed or
complained. Now at least the underlying type will be int for all -std
modes.
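
A small sketch of why a fixed underlying type matters. The enums and the extra bit below are hypothetical and are not the libstdc++ definitions.

```cpp
#include <type_traits>

// Hypothetical enums for illustration; these are not the libstdc++ definitions.
enum order_unfixed { relaxed_u, seq_cst_u = 5 };      // implementation picks the type
enum order_fixed : int { relaxed_f, seq_cst_f = 5 };  // always int

// Hypothetical extra flag bit stored outside the enumerator range.
constexpr int extra_bit = 0x10000;

static_assert(std::is_same<std::underlying_type<order_fixed>::type, int>::value,
              "a fixed underlying type does not depend on -fshort-enums");

// With a fixed underlying type, every int value is a valid value of the enum,
// so combining an enumerator with the extra bit is well defined.
constexpr order_fixed with_bit =
    static_cast<order_fixed>(static_cast<int>(seq_cst_f) | extra_bit);
```

For order_unfixed the same cast would only be valid if the value fits the range derived from the enumerators, which is what -fshort-enums can shrink.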

libstdc++-v3/ChangeLog:

PR libstdc++/89624
* include/bits/atomic_base.h (memory_order): Use int as
underlying type.
* testsuite/29_atomics/atomic/89624.cc: New test.

tree-cfg: Move the returns_twice check to be last statement only [PR114301]

While checking that all of the bugs dealing with the case where
gimple_can_duplicate_bb_p returns false were fixed, I noticed that the
code checking whether a call statement is returns_twice was checking
all call statements rather than just the last statement. Since calling
gimple_call_flags has a small non-zero overhead due to a few string
comparisons, removing uses of it can give a small performance
improvement. A returns_twice function call will always end the
basic block due to the check in stmt_can_terminate_bb_p (and others),
so checking only the last statement is a small optimization and is safe.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/114301
gcc/ChangeLog:

* tree-cfg.cc (gimple_can_duplicate_bb_p): Check returns_twice
only on the last call statement rather than all.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

[committed] Fix rv32 issues with recent zicboz work

I should have double-checked the CI system before pushing Christoph's patches
for memset-zero.  While I thought I'd checked CI state, I must have been
looking at the wrong patch from Christoph.

Anyway, this fixes the rv32 ICEs and disables one of the tests for rv32.

The test would need a revamp for rv32 as the expected output is all rv64 code
using "sd" instructions.  I'm just not vested deeply enough into rv32 to adjust
the test to work in that environment though it should be fairly trivial to copy
the test and provide new expected output if someone cares enough.

Verified this fixes the rv32 failures in my tester:
> New tests that FAIL (6 tests):
>
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O1  (internal compiler error: in extract_insn, at recog.cc:2812)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O1  (test for excess errors)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O2  (internal compiler error: in extract_insn, at recog.cc:2812)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O2  (test for excess errors)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O3 -g  (internal compiler error: in extract_insn, at recog.cc:2812)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O3 -g  (test for excess errors)

And after the ICE is fixed, these are eliminated by only running the test for
rv64:

> New tests that FAIL (3 tests):
>
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O1   check-function-bodies clear_buf_123
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O2   check-function-bodies clear_buf_123
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c   -O3 -g   check-function-bodies clear_buf_123

gcc/
* config/riscv/riscv-string.cc
(riscv_expand_block_clear_zicboz_zic64b): Handle rv32 correctly.

gcc/testsuite

* gcc.target/riscv/cmo-zicboz-zic64-1.c: Don't run on rv32.

x86: Add 3-instruction subroutine vector shift for V16QI in ix86_expand_vec_perm_const_1 [PR107563]

Hi All

We've introduced a new subroutine in ix86_expand_vec_perm_const_1
to optimize vector shifting for the V16QI type on x86.
This patch uses a three-instruction sequence psrlw, psllw, and por
to handle specific vector shuffle operations more efficiently.
The change aims to improve assembly code generation for configurations
supporting SSE2.
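
As a rough scalar model of such a sequence: a right shift by 8, a left shift by 8 and an OR on 16-bit lanes swap the two bytes of each lane. Whether this is exactly the shuffle the new subroutine matches is an assumption; the names below are hypothetical.

```cpp
#include <array>
#include <cstdint>

using V16QI = std::array<std::uint8_t, 16>;

// Scalar model of psrlw $8 / psllw $8 / por applied to eight 16-bit lanes.
inline V16QI swap_adjacent_bytes(const V16QI& v) {
    V16QI out{};
    for (int lane = 0; lane < 8; ++lane) {
        // Assemble one little-endian 16-bit lane from two bytes.
        std::uint16_t w = static_cast<std::uint16_t>(v[2 * lane] | (v[2 * lane + 1] << 8));
        std::uint16_t right = static_cast<std::uint16_t>(w >> 8);    // psrlw $8
        std::uint16_t left = static_cast<std::uint16_t>(w << 8);     // psllw $8
        std::uint16_t r = static_cast<std::uint16_t>(right | left);  // por
        out[2 * lane] = static_cast<std::uint8_t>(r & 0xff);
        out[2 * lane + 1] = static_cast<std::uint8_t>(r >> 8);
    }
    return out;
}
```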

Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?

Best
Levy

gcc/ChangeLog:

PR target/107563
* config/i386/i386-expand.cc (expand_vec_perm_psrlw_psllw_por): New
subroutine.
(ix86_expand_vec_perm_const_1): Call expand_vec_perm_psrlw_psllw_por.

gcc/testsuite/ChangeLog:

PR target/107563
* g++.target/i386/pr107563-a.C: New test.
* g++.target/i386/pr107563-b.C: New test.

c++: lvalueness of non-dependent assignment expr [PR114994]

r14-4111-g6e92a6a2a72d3b made us check non-dependent simple assignment
expressions ahead of time and give them a type, as was already done for
compound assignments. Unlike for compound assignments however, if a
simple assignment resolves to an operator overload we represent it as a
(typed) MODOP_EXPR instead of a CALL_EXPR to the selected overload.
(I reckoned this was at worst a pessimization -- we'll just have to repeat
overload resolution at instantiation time.)

But this turns out to break the below testcase ultimately because
MODOP_EXPR (of non-reference type) is always treated as an lvalue
according to lvalue_kind, which is incorrect for the MODOP_EXPR
representing x=42.

We can fix this by representing such class assignment expressions as
CALL_EXPRs as well, but this turns out to require some tweaking of our
-Wparentheses warning logic and may introduce other fallout making it
unsuitable for backporting.

So this patch instead fixes lvalue_kind to consider the type of a
MODOP_EXPR representing a class assignment.
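
The value-category difference at stake can be shown outside of any template. ByValue is a hypothetical class and this is not the committed testcase.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical class whose assignment operator returns by value.
struct ByValue {
    ByValue operator=(int) { return *this; }
};

// An overloaded assignment that returns by value is a prvalue, so decltype
// yields a non-reference type. A built-in assignment is an lvalue, so
// decltype yields a reference type.
using ClassAssign = decltype(std::declval<ByValue&>() = 42);
using BuiltinAssign = decltype(std::declval<int&>() = 42);

static_assert(!std::is_reference<ClassAssign>::value,
              "class assignment returning by value is a prvalue");
static_assert(std::is_lvalue_reference<BuiltinAssign>::value,
              "built-in assignment is an lvalue");
```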

PR c++/114994

gcc/cp/ChangeLog:

* tree.cc (lvalue_kind) <case MODOP_EXPR>: For a class
assignment, consider the result type.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent32.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

[to-be-committed,RISC-V] Remove redundant AND in shift-add sequence

So this patch allows us to eliminate a redundant AND in some shift-add
style sequences.   I think the testcase was reduced from xz by the RAU
team, but I'm not highly confident of that.

Specifically the AND is masking off the upper 32 bits of the un-shifted
value and there's an outer SIGN_EXTEND from SI to DI.  However in the
RTL it's working on the post-shifted value, so the constant is left
shifted, so we have to account for that in the pattern's condition.

We can just drop the AND in this case.  So instead we do a 64bit shift,
then a sign extending ADD utilizing the low part of that 64bit shift result.
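
The arithmetic argument can be checked with a scalar sketch. The function names are hypothetical and the RTL pattern itself is not reproduced here. The bits removed by the left-shifted mask sit at or above bit 32, so they cannot affect the sign-extended 32-bit sum.

```cpp
#include <cstdint>

// Sign-extends the low 32 bits of v to 64 bits (models SIGN_EXTEND from SI to DI).
inline std::int64_t sext32(std::uint64_t v) {
    return static_cast<std::int64_t>(
        static_cast<std::int32_t>(static_cast<std::uint32_t>(v)));
}

// With the AND: mask the shifted value with the left-shifted 32-bit constant.
inline std::int64_t shift_add_with_and(std::uint64_t x, std::uint64_t y, unsigned sh) {
    std::uint64_t mask = std::uint64_t{0xffffffff} << sh;
    return sext32(((x << sh) & mask) + y);
}

// Without the AND: a 64-bit shift, then a sign-extending 32-bit add.
inline std::int64_t shift_add_without_and(std::uint64_t x, std::uint64_t y, unsigned sh) {
    return sext32((x << sh) + y);
}
```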

This has run through Ventana's CI as well as my own.  I'll wait for it
to run through the larger CI system before pushing.

Jeff

gcc/
* config/riscv/riscv.md: Add pattern for sign extended shift-add
sequence with a masked input.

gcc/testsuite

* gcc.target/riscv/shift-add-2.c: New test.

Daily bump.

c++: ICE in build_deduction_guide for invalid template [PR105760]

We currently ICE upon the following invalid snippet because we fail to
properly handle tsubst_arg_types returning error_mark_node in
build_deduction_guide.

== cut ==
template<class... Ts, class>
struct A { A(Ts...); };
A a;
== cut ==

This patch fixes this, and has been successfully tested on x86_64-pc-linux-gnu.

PR c++/105760

gcc/cp/ChangeLog:

* pt.cc (build_deduction_guide): Check for error_mark_node
result from tsubst_arg_types.

gcc/testsuite/ChangeLog:

* g++.dg/parse/error66.C: New test.

c++ comment adjustments for 114935

gcc/cp/ChangeLog:

* decl.cc (wrap_cleanups_r): Clarify comment.
* init.cc (build_vec_init): Update comment.

pru: Implement TARGET_CLASS_LIKELY_SPILLED_P to fix PR115013

Commit r15-436-g44e7855e did not fix PR115013 for PRU because
SMALL_REGISTER_CLASS_P is not returning an accurate value for the PRU
backend.

Word mode for PRU backend is defined as 8-bit, yet all ALU operations
are preferred in 32-bit mode.  Thus checking whether a register class
contains a single word_mode register would not classify the actually
single SImode register classes as small.  This affected the
multiplication source and destination register classes.

Fix by implementing TARGET_CLASS_LIKELY_SPILLED_P to treat all register
classes with SImode or smaller size as likely spilled.  This in turn
corrects the behaviour of SMALL_REGISTER_CLASS_P for PRU.

PR rtl-optimization/115013

gcc/ChangeLog:

* config/pru/pru.cc (pru_class_likely_spilled_p): Implement
to mark classes containing one SImode register as likely
spilled.
(TARGET_CLASS_LIKELY_SPILLED_P): Define.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

RISC-V: avoid LUI based const materialization ... [part of PR/106265]

... if the constant can be represented as the sum of two S12 values.
The two S12 values could instead be fused with the subsequent ADD insn.
This helps to
- avoid an additional LUI insn
- avoid clobbering a reg (a side benefit)

e.g.
                      | w/o patch        | w/ patch
long                  |                  |
plus(unsigned long i) | li a5,4096       |
{                     | addi a5,a5,-2032 | addi a0,a0,2047
   return i + 2064;   | add a0,a0,a5     | addi a0,a0,17
}                     | ret              | ret
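
The range check and the split can be sketched in plain C++. The names are hypothetical; the real change is expressed as a predicate, a constraint and a define_insn_and_split in the machine description.

```cpp
#include <utility>

// True if v fits a signed 12-bit immediate (the ADDI range).
constexpr bool fits_s12(long v) { return v >= -2048 && v <= 2047; }

// True if v does not fit S12 but is the sum of two S12 values.
constexpr bool is_sum_of_two_s12(long v) {
    return (v >= 2048 && v <= 2 * 2047) || (v >= 2 * -2048 && v <= -2049);
}

// Splits v into two S12 addends; precondition: is_sum_of_two_s12(v).
constexpr std::pair<long, long> split_s12(long v) {
    return v > 0 ? std::pair<long, long>(2047, v - 2047)
                 : std::pair<long, long>(-2048, v + 2048);
}
```

For the constant 2064 used in the example, this split gives 2047 and 17, matching the two ADDI immediates shown.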

NOTE: In theory not having the const in a standalone reg might seem less
      CSE friendly, but for the workloads in consideration these
      materializations come from very late LRA reloads, and follow-on
      GCSE is not doing much currently.

The real benefit however is seen in base+offset computation for array
accesses and especially for stack accesses which are finalized late in
optim pipeline, during LRA register allocation. Often the finalized
offsets trigger LRA reloads resulting in mind boggling repetition of
exact same insn sequence including LUI based constant materialization.

This shaves off 290 billion dynamic instructions (QEMU icounts) in the
SPEC 2017 cactuBSSN benchmark, which is over 10% of the workload. In the
rest of the suite, an additional 10 billion are shaved, with both gains and
losses in individual workloads, as is usual with compiler changes.

500.perlbench_r-0 |  1,214,534,029,025 | 1,212,887,959,387 |
500.perlbench_r-1 |    740,383,419,739 |   739,280,308,163 |
500.perlbench_r-2 |    692,074,638,817 |   691,118,734,547 |
502.gcc_r-0       |    190,820,141,435 |   190,857,065,988 |
502.gcc_r-1       |    225,747,660,839 |   225,809,444,357 | <- -0.02%
502.gcc_r-2       |    220,370,089,641 |   220,406,367,876 | <- -0.03%
502.gcc_r-3       |    179,111,460,458 |   179,135,609,723 | <- -0.02%
502.gcc_r-4       |    219,301,546,340 |   219,320,416,956 | <- -0.01%
503.bwaves_r-0    |    278,733,324,691 |   278,733,323,575 | <- -0.01%
503.bwaves_r-1    |    442,397,521,282 |   442,397,519,616 |
503.bwaves_r-2    |    344,112,218,206 |   344,112,216,760 |
503.bwaves_r-3    |    417,561,469,153 |   417,561,467,597 |
505.mcf_r         |    669,319,257,525 |   669,318,763,084 |
507.cactuBSSN_r   |  2,852,767,394,456 | 2,564,736,063,742 | <+ 10.10%
508.namd_r        |  1,855,884,342,110 | 1,855,881,110,934 |
510.parest_r      |  1,654,525,521,053 | 1,654,402,859,174 |
511.povray_r      |  2,990,146,655,619 | 2,990,060,324,589 |
519.lbm_r         |  1,158,337,294,525 | 1,158,337,294,529 |
520.omnetpp_r     |  1,021,765,791,283 | 1,026,165,661,394 |
521.wrf_r         |  1,715,955,652,503 | 1,714,352,737,385 |
523.xalancbmk_r   |    849,846,008,075 |   849,836,851,752 |
525.x264_r-0      |    277,801,762,763 |   277,488,776,427 |
525.x264_r-1      |    927,281,789,540 |   926,751,516,742 |
525.x264_r-2      |    915,352,631,375 |   914,667,785,953 |
526.blender_r     |  1,652,839,180,887 | 1,653,260,825,512 |
527.cam4_r        |  1,487,053,494,925 | 1,484,526,670,770 |
531.deepsjeng_r   |  1,641,969,526,837 | 1,642,126,598,866 |
538.imagick_r     |  2,098,016,546,691 | 2,097,997,929,125 |
541.leela_r       |  1,983,557,323,877 | 1,983,531,314,526 |
544.nab_r         |  1,516,061,611,233 | 1,516,061,407,715 |
548.exchange2_r   |  2,072,594,330,215 | 2,072,591,648,318 |
549.fotonik3d_r   |  1,001,499,307,366 | 1,001,478,944,189 |
554.roms_r        |  1,028,799,739,111 | 1,028,780,904,061 |
557.xz_r-0        |    363,827,039,684 |   363,057,014,260 |
557.xz_r-1        |    906,649,112,601 |   905,928,888,732 |
557.xz_r-2        |    509,023,898,187 |   508,140,356,932 |
997.specrand_fr   |        402,535,577 |       403,052,561 |
999.specrand_ir   |        402,535,577 |       403,052,561 |

This should still be considered damage control, as the real/deeper fix
would be to reduce the number of LRA reloads or CSE/anchor those during
LRA constraint sub-pass (re)runs (that's a different PR/114729).

Implementation Details (for posterity)
--------------------------------------
- The basic idea is to have a splitter selected via a new predicate for a
   constant that is the sum of two S12 values, and to provide the transform.
   This is however a 2 -> 2 transform which combine can't handle.
   So we specify it using a define_insn_and_split.

- The initial loose "i" constraint caused LRA to accept invalid insns, thus
   needing a tighter new constraint as well.

- An additional fallback alternative with a catch-all "r" register
   constraint is also needed to allow any "reloads" that LRA might
   require for ADDI with a const larger than S12.

Testing
--------
This is testsuite clean (rv64 only).
I'll rely on the post-commit CI multilib run for any possible fallout for
other setups such as rv32.

|                                               |         gcc |          g++ |     gfortran |
| rv64imafdc_zba_zbb_zbs_zicond/  lp64d/ medlow |  41 /    17 |    8 /     3 |    7 /     2 |
| rv64imafdc_zba_zbb_zbs_zicond/  lp64d/ medlow |  41 /    17 |    8 /     3 |    7 /     2 |

I also threw this into a buildroot run; it obviously boots Linux to
userspace. bloat-o-meter on glibc and kernel shows an overall decrease in
static instruction counts with some minor spot increases.
These are generally in the category of

- LUI + ADDI are 2 bytes each vs. two ADD being 4 bytes each.
- Knock on effects due to inlining changes.
- Sometimes the slightly shorter 2-insn seq in a multi-exit function
   can cause in-place epilogue duplication (vs. a jump back).
   This is slightly larger but more efficient in execution.
In summary nothing to fret about.

| linux/scripts/bloat-o-meter build-gcc-240131/target/lib/libc.so.6 \
         build-gcc-240131-new-splitter-1-variant/target/lib/libc.so.6
|
| add/remove: 0/0 grow/shrink: 21/49 up/down: 520/-3056 (-2536)
| Function                                     old     new   delta
| getnameinfo                                 2756    2892    +136
...
| tempnam                                      136     144      +8
| padzero                                      276     284      +8
...
| __GI___register_printf_specifier             284     280      -4
| __EI_xdr_array                               468     464      -4
| try_file_lock                                268     260      -8
| pthread_create@GLIBC_2                      3520    3508     -12
| __pthread_create_2_1                        3520    3508     -12
...
| _nss_files_setnetgrent                       932     904     -28
| _nss_dns_gethostbyaddr2_r                   1524    1480     -44
| build_trtable                               3312    3208    -104
| printf_positional                          25000   22580   -2420
| Total: Before=2107999, After=2105463, chg -0.12%

Caveat:
------
Jeff noted during v2 review that the operand0 constraint !riscv_reg_frame_related
could potentially cause issues with hard reg cprop in future. If that
trips things up we will have to loosen the constraint while dialing down
the const range to (-2048 to 2032) as opposed to the full S12 range of
(-2048 to 2047) to keep stack regs aligned.

gcc/ChangeLog:
* config/riscv/riscv.h: New macros to check for sum of two S12
range.
* config/riscv/constraints.md: New constraint.
* config/riscv/predicates.md: New Predicate.
* config/riscv/riscv.md: New splitter.
* config/riscv/riscv.cc (riscv_reg_frame_related): New helper.
* config/riscv/riscv-protos.h: New helper prototype.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/sum-of-two-s12-const-1.c: New test: checks
for new patterns output.
* gcc.target/riscv/sum-of-two-s12-const-2.c: Ditto.
* gcc.target/riscv/sum-of-two-s12-const-3.c: New test: should not
ICE.

Tested-by: Edwin Lu <ewlu@rivosinc.com> # pre-commit-CI #1520
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>

tree-optimization/99954 - redo loop distribution memcpy recognition fix

The following revisits the fix for PR99954 which was observed as
causing missed memcpy recognition and instead using memmove for
non-aliasing copies.  While the original fix mitigated bogus
recognition of memcpy the root cause was not properly identified.
The root cause is dr_analyze_indices "failing" to handle union
references and leaving the DRs indices in a state that's not correctly
handled by dr_may_alias.  The following mitigates this there
appropriately, restoring memcpy recognition for non-aliasing copies.

This makes us run into a latent issue in ptr_deref_may_alias_decl_p
when the pointer is something like &MEM[0].a in which case we fail
to handle non-SSA name pointers.  Add code similar to what we have
in ptr_derefs_may_alias_p.

PR tree-optimization/99954
* tree-data-ref.cc (dr_may_alias_p): For bases that are
not completely analyzed fall back to TBAA and points-to.
* tree-loop-distribution.cc
(loop_distribution::classify_builtin_ldst): When there
is no dependence again classify as memcpy.
* tree-ssa-alias.cc (ptr_deref_may_alias_decl_p): Verify
the pointer is an SSA name.

* gcc.dg/tree-ssa/ldist-40.c: New testcase.

[PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero

The Zicboz extension offers the cbo.zero instruction, which can be used
to zero a memory region corresponding to a cache block.
The Zic64b extension defines the cache block size as 64 bytes.
If both extensions are available, it is possible to use cbo.zero
to clear memory, if the alignment and size constraints are met.
This patch implements this.
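
One way to picture the alignment and size constraints is the sketch below. The exact conditions used by the patch are not spelled out above, so both the checks and the names are assumptions.

```cpp
#include <cstddef>

struct ClearPlan {
    bool use_cbo_zero;       // false: fall back to another expansion
    std::size_t blocks;      // number of 64-byte cbo.zero operations
    std::size_t tail_bytes;  // bytes left over for ordinary stores
};

// Hypothetical planner. Assumes cbo.zero is only usable when the destination
// is aligned to the 64-byte block and at least one whole block is cleared.
inline ClearPlan plan_block_clear(std::size_t align_bytes, std::size_t length) {
    constexpr std::size_t block = 64;  // cache block size fixed by Zic64b
    if (align_bytes < block || length < block)
        return {false, 0, length};
    return {true, length / block, length % block};
}
```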

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_expand_block_clear): New prototype.
* config/riscv/riscv-string.cc (riscv_expand_block_clear_zicboz_zic64b):
New function to expand a block-clear with cbo.zero.
(riscv_expand_block_clear): New RISC-V block-clear expansion function.
* config/riscv/riscv.md (setmem<mode>): New setmem expansion.

[PATCH 2/3] RISC-V: testsuite: Make cmo tests LTO safe

Let's add '\t' to the instruction match pattern to avoid false positive
matches when compiling with -flto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmo-zicbom-1.c: Add \t to test pattern.
* gcc.target/riscv/cmo-zicbom-2.c: Likewise.
* gcc.target/riscv/cmo-zicbop-1.c: Likewise.
* gcc.target/riscv/cmo-zicbop-2.c: Likewise.
* gcc.target/riscv/cmo-zicboz-1.c: Likewise.
* gcc.target/riscv/cmo-zicboz-2.c: Likewise.

[1/3] expr: Export clear_by_pieces()

Make clear_by_pieces() available to other parts of the compiler,
similar to store_by_pieces().

gcc/ChangeLog:

* expr.cc (clear_by_pieces): Remove static from clear_by_pieces.
* expr.h (clear_by_pieces): Add prototype for clear_by_pieces.

[testsuite] Fix gcc.dg/pr115066.c fail on aarch64

On aarch64, I get this failure:
...
FAIL: gcc.dg/pr115066.c scan-assembler \\.byte\\t0xb\\t# Define macro strx
...

This happens because we expect to match:
...
        .byte   0xb     # Define macro strx
...
but instead we get:
...
        .byte   0xb     // Define macro strx
...

Fix this by not explicitly matching the comment marker.
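
The idea can be sketched with std::regex. The pattern below is hypothetical and need not be the one used in the test; it matches the directive and the comment text while accepting any comment marker between them.

```cpp
#include <regex>
#include <string>

// Matches the directive and its comment text, but accepts any comment marker
// between them. Hypothetical pattern; the one used in the test may differ.
inline bool has_define_strx(const std::string& asm_line) {
    static const std::regex pattern(R"(\.byte\s+0xb\s+\S+\s+Define macro strx)");
    return std::regex_search(asm_line, pattern);
}
```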

Tested on aarch64 and x86_64.

gcc/testsuite/ChangeLog:

2024-05-14  Tom de Vries  <tdevries@suse.de>

* gcc.dg/pr115066.c: Don't match comment marker.

testsuite: analyzer: Fix fd-glibc-byte-stream-connection-server.c on Solaris [PR107750]

gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c currently FAILs
on Solaris:

FAIL: gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c (test for
excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c:91:3:
error: implicit declaration of function 'memset'
[-Wimplicit-function-declaration]

Solaris <sys/select.h> has

but no declaration of memset. While one can argue that this should be
fixed, it's easy enough to just include <string.h> instead, which is
what this patch does.

Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.

2024-05-14 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR analyzer/107750
* gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c:
Include <string.h>.

libstdc++: Guard dynamic_cast use in src/c++23/print.cc [PR115015]

Do not use dynamic_cast unconditionally, in case libstdc++ is built with
-fno-rtti.
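
A minimal sketch of such a guard, using the __cpp_rtti feature-test macro. The helper is hypothetical and is not the code in print.cc.

```cpp
#include <fstream>
#include <streambuf>

// Returns sb as a std::filebuf if that can be determined, otherwise nullptr.
// Hypothetical helper, not the code in print.cc.
inline std::filebuf* as_filebuf(std::streambuf* sb) {
#ifdef __cpp_rtti
    return dynamic_cast<std::filebuf*>(sb);
#else
    (void)sb;
    return nullptr;  // without RTTI the dynamic type cannot be queried
#endif
}
```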

libstdc++-v3/ChangeLog:

PR libstdc++/115015
* src/c++23/print.cc (__open_terminal(streambuf*)) [!__cpp_rtti]:
Do not use dynamic_cast.

libstdc++: Document when std::string::shrink_to_fit was added

This section can be misread to say that shrink_to_fit is available from
GCC 3.4, but it was added later.

libstdc++-v3/ChangeLog:

* doc/xml/manual/strings.xml: Clarify that GCC 4.5 added
std::string::shrink_to_fit.
* doc/html/manual/strings.html: Regenerate.

[debug] Fix dwarf v4 .debug_macro.dwo

Consider a hello world, compiled with -gsplit-dwarf and dwarf version 4, and
-g3:
...
$ gcc -gdwarf-4 -gsplit-dwarf /data/vries/hello.c -g3 -save-temps -dA
...

In section .debug_macro.dwo, we have:
...
.Ldebug_macro0:
        .value  0x4     # DWARF macro version number
        .byte   0x2     # Flags: 32-bit, lineptr present
        .long   .Lskeleton_debug_line0
        .byte   0x3     # Start new file
        .uleb128 0      # Included from line number 0
        .uleb128 0x1    # file /data/vries/hello.c
        .byte   0x5     # Define macro strp
        .uleb128 0      # At line number 0
        .uleb128 0x1d0  # The macro: "__STDC__ 1"
...

Given that we use a DW_MACRO_define_strp, we'd expect 0x1d0 to be an
offset into a .debug_str.dwo section.

But in fact, 0x1d0 is an index into the string offset table in
section .debug_str_offsets.dwo:
...
        .long   0x34f0  # indexed string 0x1d0: __STDC__ 1
...

Add asserts that catch this inconsistency, and fix this by using
DW_MACRO_define_strx instead.
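
The distinction can be sketched as follows. This is not the dwarf2out.cc code; the enum StringRef and the function macro_define_opcode are hypothetical names, and only the two opcode values are taken from the DWARF 5 encoding.

```cpp
#include <cassert>
#include <cstdint>

// Macro opcodes as encoded by DWARF 5.
constexpr std::uint8_t DW_MACRO_define_strp = 0x05;  // operand: offset into a string section
constexpr std::uint8_t DW_MACRO_define_strx = 0x0b;  // operand: index into the string offsets table

// Hypothetical description of how the macro's string is referenced.
enum class StringRef { OffsetInDebugStr, IndexInStrOffsets };

// Pick the opcode that matches what the operand actually is.
constexpr std::uint8_t macro_define_opcode(StringRef ref)
{
  return ref == StringRef::IndexInStrOffsets ? DW_MACRO_define_strx
                                             : DW_MACRO_define_strp;
}
```

In the dump above the operand 0x1d0 is an index, so under this sketch the opcode byte would be 0xb rather than 0x5.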

Tested on x86_64.

gcc/ChangeLog:

2024-05-14  Tom de Vries  <tdevries@suse.de>

PR debug/115066
* dwarf2out.cc (output_macinfo_op): Fix DW_MACRO_define_strx/strp
choice for v4 .debug_macro.dwo.  Add asserts to check that choice.

gcc/testsuite/ChangeLog:

2024-05-14  Tom de Vries  <tdevries@suse.de>

PR debug/115066
* gcc.dg/pr115066.c: New test.

Reduce recursive inlining of always_inline functions

This patch tames the inliner on (multiply) self-recursive always_inline functions.
While we already have caps on recursive inlining, the testcase combines the early inliner
and the late inliner to get a very wide recursive inlining tree. The basic idea is to
ignore DISREGARD_INLINE_LIMITS when deciding on inlining self-recursive functions
(so we cut on the function being large) and to clear the flag once self-recursion is detected.

I did not include the testcase since it still produces a lot of code and would
slow down testing. It also outputs many "inlining failed" messages, which is not
very nice, but it is hard to detect self-recursion cycles in full generality
when indirect calls and other tricks may happen.

gcc/ChangeLog:

PR ipa/113291

* ipa-inline.cc (enum can_inline_edge_by_limits_flags): New enum.
(can_inline_edge_by_limits_p): Take flags instead of multiple bools; add flag
for forcing inline limits.
(can_early_inline_edge_p): Update.
(want_inline_self_recursive_call_p): Update; use FORCE_LIMITS mode.
(check_callers): Update.
(update_caller_keys): Update.
(update_callee_keys): Update.
(recursive_inlining): Update.
(add_new_edges_to_heap): Update.
(speculation_useful_p): Update.
(inline_small_functions): Clear DECL_DISREGARD_INLINE_LIMITS on self recursion.
(flatten_function): Update.
(inline_to_all_callers_1): Update.

libstdc++: Fix typo in std::stacktrace::max_size [PR115063]

libstdc++-v3/ChangeLog:

PR libstdc++/115063
* include/std/stacktrace (basic_stacktrace::max_size): Fix typo
in reference to _M_alloc member.
* testsuite/19_diagnostics/stacktrace/stacktrace.cc: Check
max_size() compiles.

rs6000: Enable overlapped by-pieces operations

This patch enables overlapped by-pieces operations by defining
TARGET_OVERLAP_OP_BY_PIECES_P to true. On rs6000, the default move/set/clear
ratio is 2, so the overlap is only enabled with compare by-pieces.
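
To illustrate what "overlapped" means here, the sketch below checks two 7-byte blocks for equality with two 4-byte loads whose ranges overlap by one byte, instead of a 4 + 2 + 1 byte sequence. It only shows the idea: the function names are hypothetical, and it says nothing about the instructions the rs6000 backend actually emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Load 4 bytes from p without alignment assumptions.
inline std::uint32_t load32(const unsigned char* p)
{
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Equality of two 7-byte blocks using loads at offsets 0 and 3.
// Bytes 0..3 and 3..6 are covered, so byte 3 is compared twice.
inline bool equal7_overlapped(const unsigned char* a, const unsigned char* b)
{
  return load32(a) == load32(b) && load32(a + 3) == load32(b + 3);
}
```

The sketch tests equality only; an ordered comparison like memcmp would also have to account for byte order.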

gcc/
* config/rs6000/rs6000.cc (TARGET_OVERLAP_OP_BY_PIECES_P): Define.

gcc/testsuite/
* gcc.target/powerpc/block-cmp-9.c: New.

MAINTAINERS: Fix an entry using spaces instead of tabs

In the MAINTAINERS file, names and emails are separated by tabs.  One of
the entries recently added used spaces.  This patch corrects this.

The check-MAINTAINERS.py script breaks a bit when this happens.  This
patch also adds a warning about this situation to the script.
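
The kind of check involved can be sketched like this. The real script is written in Python and is not shown here; the function separated_by_tabs and the assumption that an entry is a name followed by an address in angle brackets are made up for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical check: true if the whitespace between the name and the
// '<' that opens the email address consists of tabs only.
inline bool separated_by_tabs(const std::string& line)
{
  const std::string::size_type lt = line.find('<');
  if (lt == std::string::npos)
    return false;                       // no email on this line

  // Walk back over the whitespace run that precedes '<'.
  std::string::size_type start = lt;
  while (start > 0 && (line[start - 1] == '\t' || line[start - 1] == ' '))
    --start;
  if (start == lt || start == 0)
    return false;                       // no separator, or no name

  for (std::string::size_type i = start; i < lt; ++i)
    if (line[i] != '\t')
      return false;                     // a space crept into the separator
  return true;
}
```

A caller could print a warning for every entry line on which this returns false.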

ChangeLog:

* MAINTAINERS: Use tabs between name and email.

contrib/ChangeLog:

* check-MAINTAINERS.py: Add warning about not using tabs.

Signed-off-by: Filip Kastl <fkastl@suse.cz>

ada: Fix classification of SPARK Boolean aspects

The implementation of User_Aspect_Definition uses subtype
Boolean_Aspects to decide which existing aspects can be used to define
new aspects. This subtype didn't include many of the SPARK aspects,
notably Always_Terminates.

gcc/ada/

* aspects.ads (Aspect_Id, Boolean_Aspect): Change categorization
of Boolean-valued SPARK aspects.
* sem_ch13.adb (Analyze_Aspect_Specification): Adapt CASE
statements to new classification of Boolean-valued SPARK
aspects.

ada: Fix crash with -gnatdJ and -gnatf

This patch fixes a crash when the compiler emits a warning about
an unchecked conversion and -gnatdJ is enabled.

gcc/ada/

* sem_ch13.adb (Validate_Unchecked_Conversions): Add node
parameters to Error_Msg calls.

ada: Minor typo fix in comment

gcc/ada/

* sem_util.adb: Typo fix in comment.
* exp_aggr.adb: Likewise.

ada: Document more details of the implementation of finalization chains

gcc/ada/

* exp_ch7.adb (Finalization Management): Add a short description of
the implementation of finalization chains.

ada: Fix small inaccuracy in previous change

The call to Build_Allocate_Deallocate_Proc must occur before the special
accessibility check for class-wide allocation is generated, because this
check comes with cleanup code.

gcc/ada/

* exp_ch4.adb (Expand_Allocator_Expression): Move the first call to
Build_Allocate_Deallocate_Proc up to before the accessibility check.

ada: Fix pragma Warnings and -gnatD interaction

A recent change broke pragma Warnings when -gnatD is enabled in some
cases. This patch fixes this by caching more slocs at times when it's
known that they haven't been modified by -gnatD.

gcc/ada/

* errout.adb (Validate_Specific_Warnings): Adapt to record
definition change.
* erroutc.adb (Set_Specific_Warning_On, Set_Specific_Warning_Off,
Warning_Specifically_Suppressed): Likewise.
* erroutc.ads: Change record definition.

ada: Decouple attachment from dynamic allocation for controlled objects

This decouples the attachment to the appropriate finalization collection of
dynamically allocated objects that need finalization from their allocation.

The current implementation immediately attaches them after allocating them,
which means that they will be finalized even if their initialization does
not complete successfully.  The new implementation instead generates the
same sequence as the one generated for (statically) declared objects, that
is to say, allocation, initialization and attachment in this order.
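
The effect of the new ordering can be pictured with a loose C++ analogy. It is not the GNAT runtime: Widget, Collection and allocate_init_attach are hypothetical stand-ins, with a vector of pointers playing the role of a finalization collection.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <vector>

// Object whose initialization may fail.
struct Widget {
  explicit Widget(bool fail)
  {
    if (fail)
      throw std::runtime_error("initialization failed");
  }
};

// Stand-in for a finalization collection.
struct Collection {
  std::vector<Widget*> attached;
};

// Allocation, initialization and attachment, in this order. If the
// initialization throws, the object is never attached.
inline Widget* allocate_init_attach(Collection& c, bool fail)
{
  auto obj = std::make_unique<Widget>(fail);  // allocate + initialize
  c.attached.push_back(obj.get());            // attach only on success
  return obj.release();
}
```

With attachment done last, an object whose initialization fails is never seen by the code that later finalizes the collection.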

gcc/ada/

* exp_ch3.adb (Build_Default_Initialization): Do not generate the
protection for finalization collections.
(Build_Heap_Or_Pool_Allocator): Set the No_Initialization flag on
the declaration of the temporary.
* exp_ch4.adb (Build_Aggregate_In_Place): Do not build an allocation
procedure here.
(Expand_Allocator_Expression): Build an allocation procedure, if it
is required, only just before rewriting the allocator.
(Expand_N_Allocator): Do not build an allocation procedure if the
No_Initialization flag is set on the allocator, except for those
generated for special return objects.  In other cases, build an
allocation procedure, if it is required, only before rewriting
the allocator.
* exp_ch7.ads (Make_Address_For_Finalize): New function declaration.
* exp_ch7.adb (Finalization Management): Update description for
dynamically allocated objects.
(Make_Address_For_Finalize): Remove declaration.
(Find_Last_Init): Change to function and move to...
(Process_Object_Declaration): Adjust to above change.
* exp_util.ads (Build_Allocate_Deallocate_Proc): Add Mark parameter
with Empty default and document it.
(Find_Last_Init): New function declaration.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Add Mark parameter
with Empty default and pass it in recursive call.  Deal with type
conversions created for interface types.  Adjust call sequence to
Allocate_Any_Controlled by changing Collection to In/Out parameter
and removing Finalize_Address parameter.  For a controlled object,
generate a conditional call to Attach_Object_To_Collection for an
allocation and to Detach_Object_From_Collection for a deallocation.
(Find_Last_Init): ...here.  Compute the initialization type for an
allocator whose designated type is class-wide specifically and also
handle concurrent types.
* rtsfind.ads (RE_Id): Add RE_Attach_Object_To_Collection and
RE_Detach_Object_From_Collection.
(RE_Unit_Table): Add entries for RE_Attach_Object_To_Collection and
RE_Detach_Object_From_Collection.
* libgnat/s-finpri.ads (Finalization_Started): Delete.
(Attach_Node_To_Collection): Likewise.
(Detach_Node_From_Collection): Move to...
(Attach_Object_To_Collection): New procedure declaration.
(Detach_Object_From_Collection): Likewise.
(Finalization_Collection): Remove Atomic for Finalization_Started.
Add pragma Inline for Initialize.
* libgnat/s-finpri.adb: Add clause for Ada.Unchecked_Conversion.
(To_Collection_Node_Ptr): New instance of Ada.Unchecked_Conversion.
(Detach_Node_From_Collection): ...here.
(Attach_Object_To_Collection): New procedure.
(Detach_Object_From_Collection): Likewise.
(Finalization_Started): Delete.
(Finalize): Replace allocation with attachment in comments.
* libgnat/s-stposu.ads (Allocate_Any_Controlled): Rename parameter
Context_Subpool into Named_Subpool, parameter Context_Collection
into Collection and change it to In/Out, and remove Fin_Address.
* libgnat/s-stposu.adb: Remove clause for Ada.Unchecked_Conversion
and Finalization_Primitives.
(To_Collection_Node_Ptr): Delete.
(Allocate_Any_Controlled): Rename parameter Context_Subpool into
Named_Subpool, parameter Context_Collection into Collection and
change it to In/Out, and remove Fin_Address.  Do not lock/unlock
and do not attach the object, instead only displace its address.
(Deallocate_Any_Controlled): Do not lock/unlock and do not detach
the object.
(Header_Size_With_Padding): Use qualified name for Header_Size.

ada: Follow up fixes for Put_Image/streaming regressions

A recent change to reduce duplication of compiler-generated Put_Image and
streaming subprograms introduced two regressions. One is yet another of the
many cases where generating these routines "on demand" (as opposed to at the
point of the associated type declaration) requires loosening the compiler's
enforcement of privacy. The other is a use-before-definition issue that
occurs because the declaration of a Put_Image procedure is not hoisted far
enough.

gcc/ada/

* exp_attr.adb (Build_And_Insert_Type_Attr_Subp): If a subprogram
associated with a (library-level) type declared in another unit is
to be inserted somewhere in a list, then insert it at the head of
the list.
* sem_ch5.adb (Analyze_Assignment): Normally a limited-type
assignment is illegal. Relax this rule if Comes_From_Source is
False and the type is not immutably limited.

ada: Fix pragma Compile_Time_Error and -gnatdJ crash

This patch makes it so the diagnostics coming from occurrences of
pragma Compile_Time_Error and Compile_Time_Warning are emitted with
a node parameter so they don't cause a crash when -gnatdJ is enabled.

gcc/ada/

* errout.ads (Error_Msg): Add node parameter.
* errout.adb (Error_Msg): Add parameter and pass it to
the underlying call.
* sem_prag.adb (Validate_Compile_Time_Warning_Or_Error): Pass
pragma node when emitting errors.

ada: Fix crash with -gnatdJ and -gnatyz

This patch makes it so -gnatyz style check reports specify a node
ID. That is required since those checks are sometimes made during
semantic analysis of short-circuit operators, where the Current_Node
mechanism that -gnatdJ uses is not operational.

Check_Xtra_Parens_Precedence is moved from Styleg to Style to make
this possible.

gcc/ada/

* styleg.ads (Check_Xtra_Parens_Precedence): Moved ...
* style.ads (Check_Xtra_Parens_Precedence): ... here. Also
replace corresponding renaming.
* styleg.adb (Check_Xtra_Parens_Precedence): Moved ...
* style.adb (Check_Xtra_Parens_Precedence): ... here. Also use
Errout.Error_Msg and pass it a node parameter.

ada: Small cleanup about allocators and aggregates

This eliminates a few oddities present in the expander for allocators and
aggregates present in allocators:

  - Convert_Aggr_In_Allocator takes both Decl and Alloc parameters,
    and inserts new code before Alloc for records and after Decl for arrays
    through Convert_Array_Aggr_In_Allocator.  Now, for the 3 (duplicated)
    calls to the procedure, that's the same place.  It also creates a new
    list that it does not use in most cases.

  - Expand_Allocator_Expression uses the same code sequence in 3 places
    when the expression is an aggregate to build in place.

  - Build_Allocate_Deallocate_Proc takes an Is_Allocate parameter that is
    entirely determined by the N parameter: if N is an allocator, it must
    be true; if N is a free statement, it must be false.  Barring that,
    the procedure either raises an assertion or Program_Error.  It also
    contains useless pattern matching code in the second part.

No functional changes.

gcc/ada/

* exp_aggr.ads (Convert_Aggr_In_Allocator): Rename Alloc into N,
replace Decl with Temp and adjust description.
(Convert_Aggr_In_Object_Decl): Alphabetize.
(Is_Delayed_Aggregate): Likewise.
* exp_aggr.adb (Convert_Aggr_In_Allocator): Rename Alloc into N
and replace Decl with Temp.  Allocate a list only when needed.
(Convert_Array_Aggr_In_Allocator): Replace N with Decl and insert
new code before it.
* exp_ch4.adb (Build_Aggregate_In_Place): New procedure nested in
Expand_Allocator_Expression.
(Expand_Allocator_Expression): Call it to build aggregates in place.
Remove second parameter in calls to Build_Allocate_Deallocate_Proc.
(Expand_N_Allocator): Likewise.
* exp_ch13.adb (Expand_N_Free_Statement): Likewise.
* exp_util.ads (Build_Allocate_Deallocate_Proc): Remove Is_Allocate
parameter.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Remove Is_Allocate
parameter and replace it with local variable of same name.  Delete
useless pattern matching.

ada: Fix warning indicators in usage string

Before this patch, the default status of -gnatw.i and -gnatw.d was
reported incorrectly in the usage string used throughout GNAT tools.
This patch fixes this.

gcc/ada/

* usage.adb (Usage): Fix enabled-by-default indicators.

ada: Correct System.Win32.LocalFileTimeToFileTime wrapper typo

The parameters should be swapped to fit Fileapi.h documentation.
BOOL LocalFileTimeToFileTime(
[in] const FILETIME *lpLocalFileTime,
[out] LPFILETIME lpFileTime
);

gcc/ada/
* libgnat/s-win32.ads (LocalFileTimeToFileTime): Swap parameters.

ada: Fix crash with -gnatdJ and JSON output

This patch tweaks the calls made to Errout subprograms to report
violations of dependence restrictions, in order to fix a crash that
occurred with -gnatdJ and -fdiagnostics-format=json.

gcc/ada/

* restrict.adb (Violation_Of_No_Dependence): Tweak error
reporting calls.

ada: Fix crash with -gnatdJ and -gnatw.w

This patch fixes a crash when -gnatdJ is enabled and a warning
must be emitted about an ineffective pragma Warnings clause.

Some modifications are made to the specific warnings machinery so
that warnings carry the ID of the pragma node they're about, so the
-gnatdJ mechanism can find an appropriate enclosing subprogram.

gcc/ada/

* sem_prag.adb (Analyze_Pragma): Adapt call to new signature.
* erroutc.ads (Set_Specific_Warning_Off): Change signature
and update documentation.
(Validate_Specific_Warnings): Move ...
* errout.adb: ... here and change signature. Also move body
of Validate_Specific_Warnings from erroutc.adb.
(Finalize): Adapt call.
* errout.ads (Set_Specific_Warning_Off): Adapt signature of
renaming.
* erroutc.adb (Set_Specific_Warning_Off): Adapt signature and
body.
(Validate_Specific_Warnings): Move to the body of Errout.
(Warning_Specifically_Suppressed): Adapt body.

ada: Restore default size for dynamic allocations of discriminated type

The allocation strategy for objects of a discriminated type with defaulted
discriminants is not the same when the allocation is dynamic as when it is
static (i.e. a declaration): in the former case, the compiler allocates the
default size whereas, in the latter case, it allocates the maximum size.

This restores the default size, which was dropped during the refactoring.

gcc/ada/

* exp_aggr.adb (Build_Array_Aggr_Code): Pass N in the call to
Build_Initialization_Call.
(Build_Record_Aggr_Code): Likewise.
(Convert_Aggr_In_Object_Decl): Likewise.
(Initialize_Discriminants): Likewise.
* exp_ch3.ads (Build_Initialization_Call): Replace Loc with N.
* exp_ch3.adb (Build_Array_Init_Proc): Pass N in the call to
Build_Initialization_Call.
(Build_Default_Initialization): Likewise.
(Expand_N_Object_Declaration): Likewise.
(Build_Initialization_Call): Replace Loc with N parameter and add
Loc local variable. Build a default subtype for an allocator of
a discriminated type with defaulted discriminants.
(Build_Record_Init_Proc): Pass the declaration of components in the
call to Build_Initialization_Call.
* exp_ch6.adb (Make_CPP_Constructor_Call_In_Allocator): Pass the
allocator in the call to Build_Initialization_Call.

ada: Fix typo in diagnostic message

A previous change introduced an error in the diagnostic message about
overlapping actuals. This commit fixes this.

gcc/ada/

* sem_warn.adb (Warn_On_Overlapping_Actuals): Fix typo.

ada: Compiler crash or errors on if_expression in container aggregate

The compiler may either crash or incorrectly report errors when
a component association in a container aggregate is an if_expression
with an elsif part whose dependent expression is a call to a function
returning a result that requires finalization. The compiler complains
that a private type is expected, but a package or procedure name was
found. This is due to the compiler improperly associating expanded
calls to Finalize_Object with the aggregate, rather than the enclosing
object declaration being initialized by the aggregate, which can result
in the Finalize_Object procedure call being passed as an actual to
the Add_Unnamed operation of the container type and leading to a type
mismatch and the confusing error message. This is fixed by adjusting
the code that locates the proper context for insertion of Finalize_Object
calls to locate the enclosing declaration or statement rather than
stopping at the aggregate.

gcc/ada/

* exp_util.adb (Find_Hook_Context): Exclude N_*Aggregate Nkinds
of Parent (Par) from the early return in the second loop of the
In_Cond_Expr case, to prevent returning an aggregate from this
function rather than the enclosing declaration or statement.

ada: Replace "not Present" tests with "No".

Fix constructs that were flagged by CodePeer.

gcc/ada/

* exp_attr.adb: Replace 6 "not Present" tests with equivalent calls to "No".

ada: Follow-up adjustment after fix to Default_Initialize_Object

Now that Default_Initialize_Object honors the No_Initialization flag in all
cases, objects of an access type declared without initialization expression
can no longer be considered as being automatically initialized to null.

gcc/ada/

* exp_ch3.adb (Expand_N_Object_Declaration): Examine the Expression
field after the call to Default_Initialize_Object in order to set
Is_Known_Null, as well as Is_Known_Non_Null, on an access object.

ada: Reduce generated code duplication for streaming and Put_Image subprograms

In the case of an untagged composite type, the compiler does not generate
streaming-related subprograms or a Put_Image procedure when the type is
declared. Instead, these subprograms are declared "on demand" when a
corresponding attribute reference is encountered. In this case, hoist the
declaration of the implicitly declared subprogram out as far as possible
in order to maximize the chances that it can be reused (as opposed to
generating an identical second subprogram) in the case where a second
reference to the same attribute is encountered. Also relax some
privacy-related rules to allow these procedures to do what they need to do
even when constructed in a scope where some of those actions would
normally be illegal.

gcc/ada/

* exp_attr.adb: Change name of package Cached_Streaming_Ops to
reflect the fact that it is now also used for Put_Image
procedures. Similarly change other "Streaming_Op" names therein.
Add Validate_Cached_Candidate procedure to detect case where a
subprogram found in the cache cannot be reused. Add new generic
procedure Build_And_Insert_Type_Attr_Subp; the "Build" part is
handled by just calling a formal procedure; the bulk of this
(generic) procedure's code has to do with deciding where in the tree
to insert the newly-constructed subprogram. Replace each later
"Build" call (and the following Insert_Action or
Compile_Stream_Body_In_Scope call) with a declare block that
instantiates and then calls this generic procedure. Delete the
now-unused procedure Compile_Stream_Body_In_Scope. A constructed
subprogram is entered in the appropriate cache if the
corresponding type is untagged; this replaces more complex tests.
A new function Interunit_Ref_OK is added to determine whether an
attribute reference occurring in one unit can safely refer to a
cached subprogram declared in another unit.
* exp_ch3.adb (Build_Predefined_Primitive_Bodies): A formal
parameter was deleted, so delete the corresponding actual in a
call.
* exp_put_image.adb (Build_Array_Put_Image_Procedure): Because the
procedure being built may be referenced more than once, the
generated procedure takes its source position info from the type
declaration instead of the (first) attribute reference.
(Build_Record_Put_Image_Procedure): Likewise.
* exp_put_image.ads (Build_Array_Put_Image_Procedure): Eliminate
now-unused Nod parameter.
(Build_Record_Put_Image_Procedure): Eliminate now-unused Loc parameter.
* sem_ch3.adb (Constrain_Discriminated_Type): For declaring a
subtype with a discriminant constraint, ignore privacy if
Comes_From_Source is false (as is already done if Is_Instance is
true).
* sem_res.adb (Resolve): When passed two type entities that have
the same underlying base type, Sem_Type.Covers may return False in
some cases because of privacy. [This can happen even if
Is_Private_Type returns False both for Etype (N) and for Typ;
Covers calls Base_Type, which can take a non-private argument and
yield a private result.] If Comes_From_Source (N) is False
(e.g., for a compiler-generated Put_Image or streaming subprogram), then
avoid that scenario by not calling Covers. Covers already has tests for
doing this sort of thing (see the calls therein to Full_View_Covers),
but the Comes_From_Source test is too coarse to apply there. So instead
we handle the problem here at the call site.
(Original_Implementation_Base_Type): A new function. Same as
Implementation_Base_Type except if the Original_Node attribute of
a non-derived type declaration indicates that it once was a derived
type declaration. Needed for looking through privacy.
(Valid_Conversion): Ignore privacy when converting between different views
of the same type if Comes_From_Source is False for the conversion.
(Valid_Tagged_Conversion): An ancestor-to-descendant conversion is not an
illegal downward conversion if there is no type extension involved
(because the derivation was from an untagged view of the parent type).

ada: Better error message for bad general case statements

If -gnatX0 is specified, we allow case statements with a selector
expression of a record or array type, but not of a private type.
If the selector expression is of a private type then we should generate
an appropriate error message instead of a bugbox.

gcc/ada/

* sem_ch5.adb (Analyze_Case_Statement): Emit a message and return
early in the case where general case statements are allowed but
the selector expression is of a private type. This is done to
avoid a bugbox.

ada: Spurious unreferenced warning on selected component

This patch fixes an error in the compiler whereby a selected component on the
left-hand side of an assignment statement may not get marked as referenced,
leading to spurious unreferenced warnings on such objects.

gcc/ada/

* sem_util.adb (Set_Referenced_Modified): Use Original_Node to
avoid recursive calls on expanded / internal objects such that
source nodes get appropriately marked as referenced.

ada: Fix overlap warning suppression

Before this patch, some warnings about overlapping actuals were
emitted regardless of the value of
Warnsw.Warnings_Package.Warn_On_Overlap. This patch fixes this.

gcc/ada/

* sem_warn.adb (Warn_On_Overlapping_Actuals): Stop ignoring
warning suppression settings.

ada: Follow-up adjustment to earlier fix in Build_Allocate_Deallocate_Proc

The profile of the procedure built for an allocation on the secondary stack
now includes the alignment parameter, so the parameter can just be forwarded
in the call to Allocate_Any_Controlled.

gcc/ada/

* exp_util.adb (Build_Allocate_Deallocate_Proc): Pass the alignment
parameter in the inner call for a secondary stack allocation too.