git.ipfire.org Git - thirdparty/gcc.git/log

Add AMD znver5 processor enablement with scheduler model

2024-02-14 Jan Hubicka <jh@suse.cz>
Karthiban Anbazhagan <Karthiban.Anbazhagan@amd.com>

gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver5.
* common/config/i386/i386-common.cc (processor_names): Add znver5.
(processor_alias_table): Likewise.
* common/config/i386/i386-cpuinfo.h (processor_types): Add new zen
family.
(processor_subtypes): Add znver5.
* config.gcc (x86_64-*-* |...): Likewise.
* config/i386/driver-i386.cc (host_detect_local_cpu): Let
march=native detect znver5 cpu's.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add
znver5.
* config/i386/i386-options.cc (m_ZNVER5): New definition
(processor_cost_table): Add znver5.
* config/i386/i386.cc (ix86_reassociation_width): Likewise.
* config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER5
(PTA_ZNVER5): New definition.
* config/i386/i386.md (define_attr "cpu"): Add znver5.
(Scheduling descriptions) Add znver5.md.
* config/i386/x86-tune-costs.h (znver5_cost): New definition.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver5.
(ix86_adjust_cost): Likewise.
* config/i386/x86-tune.def (avx512_move_by_pieces): Add m_ZNVER5.
(avx512_store_by_pieces): Add m_ZNVER5.
* doc/extend.texi: Add znver5.
* doc/invoke.texi: Likewise.
* config/i386/znver4.md: Rename to zn4zn5.md; combine znver4 and znver5 Scheduler.

gcc/testsuite/ChangeLog:
* g++.target/i386/mv29.C: Handle znver5 arch.
* gcc.target/i386/funcspec-56.inc:Likewise.

avr.md - Tweak xor insn constraints.

xor insn can handle some more values without the requirement of a
scratch register. This patch adds a new constraint alternative for
such values. The output function avr_out_bitop already handles
these cases, so no change is needed there.

gcc/
* config/avr/constraints.md (CX2, CX3, CX4): New constraints.
* config/avr/avr-protos.h (avr_xor_noclobber_dconst): New proto.
* config/avr/avr.cc (avr_xor_noclobber_dconst): New function.
* config/avr/avr.md (xorhi3, *xorhi3): Add "d,0,CX2,X" alternative.
(xorpsi3, *xorpsi3): Add "d,0,CX3,X" alternative.
(xorsi3, *xorsi3): Add "d,0,CX4,X" alternative.

testsuite: Define _POSIX_C_SOURCE for test

As the tests assume that strndup() is visible (only part of
POSIX.1-2008) define the guard to ensure that it's visible. Currently,
glibc appears to always have this defined in C++, newlib does not.

Without this patch, fails like this can be seen:

Testing analyzer/strndup-1.c, -std=c++98
.../strndup-1.c: In function 'void test_1(const char*)':
.../strndup-1.c:11:13: error: 'strndup' was not declared in this scope; did you mean 'strncmp'?
.../strndup-1.c: In function 'void test_2(const char*)':
.../strndup-1.c:16:13: error: 'strndup' was not declared in this scope; did you mean 'strncmp'?
.../strndup-1.c: In function 'void test_3(const char*)':
.../strndup-1.c:21:13: error: 'strndup' was not declared in this scope; did you mean 'strncmp'?

Patch has been verified on Linux.

gcc/testsuite/ChangeLog:

* c-c++-common/analyzer/strndup-1.c: Define _POSIX_C_SOURCE.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

Add missing <any_logic>hf/bf patterns.

It will be used by copysignm3/xorsignm3/lroundmn2 expanders.

gcc/ChangeLog:

PR target/114334
* config/i386/i386.md (mode): Add new number V8BF,V16BF,V32BF.
(MODEF248): New mode iterator.
(ssevecmodesuffix): Hanlde BF and HF.
* config/i386/sse.md (andnot<mode>3): Extend to HF/BF.
(<code><mode>3): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr114334.c: New test.

hppa: Improve handling of REG+D addresses when generating PA 2.0 code

In looking at PR 112415, it became clear that improvements could be
made in the handling of loads and stores using REG+D addresses.  A
change in 2002 conflated two issues:

1) We can't generate insns with 14-bit displacements before reload
completes when generating PA 1.x code since floating-point loads and
stores only support 5-bit offsets in PA 1.x.

2) The GNU ELF 32-bit linker lacks relocation support for PA 2.0
floating point instructions with 14-bit displacements.  These
relocations affect instructions with symbolic references.

The result of the change was to block creation of PA 2.0 instructions
with 14-bit REG_D displacements for SImode, DImode, SFmode and DFmode
on the GNU linux target before reload.  This was unnecessary as these
instructions don't need relocation.

This change revises the INT14_OK_STRICT define to allow creation
of instructions with 14-bit REG+D addresses before reload when
generating PA 2.0 code.

2024-03-17  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR rtl-optimization/112415
* config/pa/pa.cc (pa_emit_move_sequence): Revise condition
for symbolic memory operands.
(pa_legitimate_address_p): Revise LO_SUM condition.
* config/pa/pa.h (INT14_OK_STRICT): Revise define.  Move
comment about GNU linker to predicates.md.
* config/pa/predicates.md (floating_point_store_memory_operand):
Revise condition for symbolic memory operands.  Update
comment.

Daily bump.

Fortran: fix for absent array argument passed to optional dummy [PR101135]

gcc/fortran/ChangeLog:

PR fortran/101135
* trans-array.cc (gfc_get_dataptr_offset): Check for optional
arguments being present before dereferencing data pointer.

gcc/testsuite/ChangeLog:

PR fortran/101135
* gfortran.dg/missing_optional_dummy_6a.f90: Adjust diagnostic pattern.
* gfortran.dg/ubsan/missing_optional_dummy_8.f90: New test.

hppa: Fix complaint about non-delegitimized UNSPEC UNSPEC_TP

2024-03-17 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.cc (pa_delegitimize_address): Delegitimize UNSPEC_TP.

libstdc++: Implement N3644 on _Safe_iterator<> [PR114316]

Consider range of value-initialized iterators as valid and empty.

libstdc++-v3/ChangeLog:

PR libstdc++/114316
* include/debug/safe_iterator.tcc (_Safe_iterator<>::_M_valid_range):
First check if both iterators are value-initialized before checking if
singular.
* testsuite/23_containers/set/debug/114316.cc: New test case.
* testsuite/23_containers/vector/debug/114316.cc: New test case.

PR modula2/114296 ICE when attempting to create a constant set with a variable element

This patch corrects the virtual token creation for the aggregate constant
and also corrects tokens for constructor components.

gcc/m2/ChangeLog:

PR modula2/114296
* gm2-compiler/M2ALU.mod (ElementsSolved): Add tokenno parameter.
Add constant checks and generate error messages.
(EvalSetValues): Pass tokenno parameter to ElementsSolved.
* gm2-compiler/M2LexBuf.mod (stop): New procedure.
(MakeVirtualTok): Call stop if caret = BadTokenNo.
* gm2-compiler/M2Quads.def (BuildNulExpression): Add tokpos
parameter.
(BuildSetStart): Ditto.
(BuildEmptySet): Ditto.
(BuildConstructorEnd): Add startpos parameter.
(BuildTypeForConstructor): Add tokpos parameter.
* gm2-compiler/M2Quads.mod (BuildNulExpression): Add tokpos
parameter and push tokpos to the quad stack.
(BuildSetStart): Add tokpos parameter and push tokpos.
(BuildSetEnd): Rewrite.
(BuildEmptySet): Add tokpos parameter and push tokpos with
the set type.
(BuildConstructorStart): Pop typepos.
(BuildConstructorEnd): Add startpos parameter.
Create valtok from startpos and cbratokpos.
(BuildTypeForConstructor): Add tokpos parameter.
* gm2-compiler/M2Range.def (InitAssignmentRangeCheck): Rename
d to des and e to expr.
Add destok and exprtok parameters.
* gm2-compiler/M2Range.mod (InitAssignmentRangeCheck): Rename
d to des and e to expr.
Add destok and exprtok parameters.
Save destok and exprtok into range record.
(FoldAssignment): Pass exprtok to TryDeclareConstant.
* gm2-compiler/P3Build.bnf (ComponentValue): Rewrite.
(Constructor): Rewrite.
(ConstSetOrQualidentOrFunction): Rewrite.
(SetOrQualidentOrFunction): Rewrite.
* gm2-compiler/PCBuild.bnf (ConstSetOrQualidentOrFunction): Rewrite.
(SetOrQualidentOrFunction): Rewrite.
* gm2-compiler/PHBuild.bnf (Constructor): Rewrite.
(ConstSetOrQualidentOrFunction): Rewrite.

gcc/testsuite/ChangeLog:

PR modula2/114296
* gm2/pim/fail/badtype2.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

d: Merge upstream dmd, druntime 855353a1d9

D front-end changes:

- Import dmd v2.108.0-rc.1.
- Add support for Named Arguments for functions.
- Hex strings now convert to integer arrays.

D runtime changes:

- Import druntime v2.108.0-rc.1.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 855353a1d9.
* dmd/VERSION:

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 855353a1d9.

Daily bump.

i386: Fix setup of incoming varargs for (...) functions which return large aggregates [PR114175]

The c23-stdarg-6.c testcase I've added recently apparently works fine with
-O0 but aborts with -O1 and higher on x86_64-linux.
The problem is in setup of incoming varargs.

Like function.cc before r14-9249 even ix86_setup_incoming_varargs assumes
that TYPE_NO_NAMED_ARGS_STDARG_P don't have any named arguments and there
is nothing to advance, but that is not the case for (...) functions
returning by hidden reference which have one such artificial argument.
If the setup_incoming_varargs hook is called from the
  if (TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (fndecl))
      && fnargs.is_empty ())
    {
      struct assign_parm_data_one data = {};
      assign_parms_setup_varargs (&all, &data, false);
    }
spot, i.e. where there is no hidden return argument passed, arg.type
is always NULL, while when it is called in the
      if (cfun->stdarg && !DECL_CHAIN (parm))
        assign_parms_setup_varargs (&all, &data, false);
spot, even when it is TYPE_NO_NAMED_ARGS_STDARG_P arg.type will be non-NULL.
The tree-stdarg.cc pass in f in c23-stdarg-6.cc at -O1 or higher determines
that va_arg is used on integral types at most twice (loads 2 words),
and because ix86_setup_incoming_varargs doesn't advance, the code saves
just the %rdi and %rsi registers to the save area.  But that isn't correct,
it should save %rsi and %rdx because %rdi is the hidden return argument.
With -O0 tree-stdarg.cc doesn't attempt to optimize and we save all the
registers, so it works fine in that case.

Now, I think we'll need the same fix also on
aarch64, alpha, arc, csky, ia64, loongarch, mips, mmix, nios2, riscv, visium
which have pretty much the similarly looking snippet in their hooks
changed by the r13-3549 commit.
Then arm, epiphany, fr30, frv, ft32, m32r, mcore, nds32, rs6000, sh
have different changes but most likely need something similar too.
I don't have access to most of those, could test aarch64 and rs6000 I guess.

2024-03-16  Jakub Jelinek  <jakub@redhat.com>

PR target/114175
* config/i386/i386.cc (ix86_setup_incoming_varargs): Only skip
ix86_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

* gcc.dg/c23-stdarg-7.c: New test.
* gcc.dg/c23-stdarg-8.c: New test.

bitint: Fix up stores to large/huge _BitInt bitfields [PR114329]

The verifier requires BIT_FIELD_REFs with INTEGRAL_TYPE_P first operand
to have mode precision. In most cases for the large/huge _BitInt bitfield
stores the code uses bitfield representatives, which are typically arrays
of chars, but if the bitfield starts at byte boundary on big endian,
the code uses as nlhs in lower_mergeable_store COMPONENT_REF of the
bitfield FIELD_DECL instead, which is fine for the limb accesses,
but when used for the most significant limb can result in invalid
BIT_FIELD_REF because the first operand then has BITINT_TYPE and
usually VOIDmode.

The following patch adds a helper method for the 4 creatikons of
BIT_FIELD_REF which when needed adds a VIEW_CONVERT_EXPR.

2024-03-16 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/114329
* gimple-lower-bitint.cc (struct bitint_large_huge): Declare
build_bit_field_ref method.
(bitint_large_huge::build_bit_field_ref): New method.
(bitint_large_huge::lower_mergeable_stmt): Use it.

* gcc.dg/bitint-101.c: New test.

c++: Check module attachment instead of just purview when necessary [PR112631]

Block-scope declarations of functions or extern values are not allowed
when attached to a named module. Similarly, class member functions are
not inline if attached to a named module. However, in both these cases
we currently only check if the declaration is within the module purview;
it is possible for such a declaration to occur within the module purview
but not be attached to a named module (e.g. in an 'extern "C++"' block).
This patch makes the required adjustments.

PR c++/112631

gcc/cp/ChangeLog:

* cp-tree.h (named_module_attach_p): New function.
* decl.cc (start_decl): Check for attachment not purview.
(grokmethod): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/block-decl-1_a.C: New test.
* g++.dg/modules/block-decl-1_b.C: New test.
* g++.dg/modules/block-decl-2.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

libcc1: fix <vector> include

Use INCLUDE_VECTOR before including system.h, instead of directly
including <vector>, to avoid running into poisoned identifiers.

Signed-off-by: Dimitry Andric <dimitry@andric.com>
libcc1/ChangeLog:

PR middle-end/111632
* libcc1plugin.cc: Fix include.
* libcp1plugin.cc: Fix include.

Daily bump.

libgcc: Fix quotient and/or remainder negation in __divmodbitint4 [PR114327]

While for __mulbitint3 we actually don't negate anything and perform the
multiplication in unsigned style always, for __divmodbitint4 if the operands
aren't unsigned and are negative, we negate them first and then try to
negate them as needed at the end.
quotient is negated if just one of the operands was negated and the other
wasn't or vice versa, and remainder is negated if the first operand was
negated.
The case which doesn't work correctly is if due to limited range of the
operands we perform the division/modulo in some smaller number of limbs
and then extend it to the desired precision of the quotient and/or
remainder results.  If they aren't negated, the extension is done with
memset to 0, if they are negated, the extension was done with memset
to -1.  The problem is that if the quotient or remainder is zero,
then bitint_negate negates it again to zero (that is ok), but we should
then extend with memset to 0, not memset to -1.

The following patch achieves that by letting bitint_negate also check if
the negated operand is zero and changes the memset argument based on that.

2024-03-15  Jakub Jelinek  <jakub@redhat.com>

PR libgcc/114327
* libgcc2.c (bitint_negate): Return UWtype bitwise or of all the limbs
before negation rather than void.
(__divmodbitint4): Determine whether to fill in the upper limbs after
negation based on whether bitint_negate returned 0 or non-zero, rather
then always filling with -1.

* gcc.dg/torture/bitint-63.c: New test.

testsuite: Fix pr113431.c FAIL on sparc* [PR113431]

As mentioned in the PR, the new testcase FAILs on sparc*-* due to
lack of support of misaligned store.

This patch restricts that to vect_hw_misalign targets.

2024-03-15 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113431
* gcc.dg/vect/pr113431.c: Restrict scan-tree-dump-times to
vect_hw_misalign targets.

Regenerate opt.urls

Fixes: acc38ff59976 ("MIPS: Add -m(no-)strict-align option")
gcc/ChangeLog:

* config/riscv/riscv.opt.urls: Regenerated.
* config/rs6000/sysv4.opt.urls: Likewise.
* config/xtensa/xtensa.opt.urls: Likewise.

lower-subreg, edit-context: Fix comment typos

When backporting r14-9315 to 13 branch, I've noticed I've missed
one letter in a comment. And grepping for similar issues I found
one word with too many.

2024-03-15 Jakub Jelinek <jakub@redhat.com>

* lower-subreg.cc (resolve_simple_move): Fix comment typo,
betwee -> between.
* edit-context.cc (class line_event): Fix comment typo,
betweeen -> between.

i386: Fix a pasto in ix86_expand_int_sse_cmp [PR114339]

In r13-3803-gfa271afb58 I've added an optimization for LE/LEU/GE/GEU
comparison against CONST_VECTOR.  As the comments say:
         /* x <= cst can be handled as x < cst + 1 unless there is
            wrap around in cst + 1.  */
...
                     /* For LE punt if some element is signed maximum.  */
...
                 /* For LEU punt if some element is unsigned maximum.  */
and
         /* x >= cst can be handled as x > cst - 1 unless there is
            wrap around in cst - 1.  */
...
                     /* For GE punt if some element is signed minimum.  */
...
                 /* For GEU punt if some element is zero.  */
Apparently I wrote the GE/GEU (second case) first and then
copied/adjusted it for LE/LEU, most of the adjustments look correct, but
I've left if (code == GE) comparison when testing if it should punt for
signed maximum.  That condition is never true, because this is in
switch (code) { ... case LE: case LEU: block and we really meant to
be what the comment says, for LE punt if some element is signed maximum,
as then cst + 1 wraps around.

The following patch fixes the pasto.

2024-03-15  Jakub Jelinek  <jakub@redhat.com>

PR target/114339
* config/i386/i386-expand.cc (ix86_expand_int_sse_cmp) <case LE>: Fix
a pasto, compare code against LE rather than GE.

* gcc.target/i386/pr114339.c: New test.

match.pd: Only merge truncation with conversion for -fno-signed-zeros

This optimisation does not honour signed zeros, so should not be
enabled except with -fno-signed-zeros.

gcc/ChangeLog:

* match.pd: Fix truncation pattern for -fno-signed-zeroes

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/no_merge_trunc_signed_zero.c: New test.

expand: EXTEND_BITINT CALL_EXPR results [PR114332]

The x86-64 and aarch64 psABIs (and the unwritten ia64 psABI part) say that
the padding bits of _BitInt are undefined, while the expansion internally
typically assumes that non-mode precision integers are sign/zero extended
and extends after operations. We handle that mismatch with EXTEND_BITINT
done when reading from untrusted sources like function arguments, reading
_BitInt from memory etc. but otherwise keep relying on stuff being extended
internally (say in pseudos).
The return value of a function is an ABI boundary though too and we need
to extend that too.

2024-03-15 Jakub Jelinek <jakub@redhat.com>

PR middle-end/114332
* expr.cc (expand_expr_real_1): EXTEND_BITINT also CALL_EXPR results.

testsuite: Fix up pr104601.C for recent libstdc++ changes

r14-9478 added [[nodiscard]] to various <algorithm> APIs including find_if
the pr104601.C testcase uses. As it is an optimization bug fix testcase,
haven't tried to adjust the testcase to use the find_if result, but instead
have added -Wno-unused-result flag to quiet the warning.

2024-03-15 Jakub Jelinek <jakub@redhat.com>

* g++.dg/torture/pr104601.C: Add -Wno-unused-result to dg-options.

bitint: Fix up adjustment of large/huge _BitInt arguments of returns_twice calls [PR113466]

This patch (on top of the just posted gsi_safe_insert* fixes patch)
fixes the instrumentation of large/huge _BitInt SSA_NAME arguments of
returns_twice calls.

In this case it isn't just a matter of using gsi_safe_insert_before instead
of gsi_insert_before, we need to do more.

One thing is that unlike the asan/ubsan instrumentation which does just some
checking, here we want the statement before the call to load into a SSA_NAME
which is passed to the call. With another edge we need to add a PHI,
with one PHI argument the loaded SSA_NAME, another argument an uninitialized
warning free SSA_NAME and a result and arrange for all 3 SSA_NAMEs to be
preserved (i.e. stay as is, be no longer lowered afterwards).

Unfortunately, edge_before_returns_twice_call can create new SSA_NAMEs using
copy_ssa_name and while we can have a reasonable partition for them (same
partition as PHI result correspoding to the PHI argument newly added), adding
SSA_NAMEs into a partition after the partitions are finalized is too ugly.
So, this patch takes a different approach suggested by Richi, just emit
the argument loads before the returns_twice call normally (i.e. temporarily
create invalid IL) and just remember that we did that, and when the bitint
lowering is otherwise done fix this up, gsi_remove those statements,
gsi_safe_insert_before and and create the needed new PHIs.

2024-03-15 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113466
* gimple-lower-bitint.cc (bitint_large_huge): Add m_returns_twice_calls
member.
(bitint_large_huge::bitint_large_huge): Initialize it.
(bitint_large_huge::~bitint_large_huge): Release it.
(bitint_large_huge::lower_call): Remember ECF_RETURNS_TWICE call stmts
before which at least one statement has been inserted.
(gimple_lower_bitint): Move argument loads before ECF_RETURNS_TWICE
calls to a different block and add corresponding PHIs.

* gcc.dg/bitint-100.c: New test.

Fortran: Fix class/derived/complex function associate selectors [PR87477]

2024-03-15 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/87477
PR fortran/89645
PR fortran/99065
PR fortran/114141
PR fortran/114280
* class.cc (gfc_change_class): New function needed for
associate names, when rank changes or a derived type is
produced by resolution
* dump-parse-tree.cc (show_code_node): Make output for SELECT
TYPE more comprehensible.
* expr.cc (find_inquiry_ref): Do not simplify expressions of
an inferred type.
* gfortran.h : Add 'gfc_association_list' to structure
'gfc_association_list'. Add prototypes for
'gfc_find_derived_types', 'gfc_fixup_inferred_type_refs' and
'gfc_change_class'. Add macro IS_INFERRED_TYPE.
* match.cc (copy_ts_from_selector_to_associate): Add bolean arg
'select_type' with default false. If this is a select type name
and the selector is a inferred type, build the class type and
apply it to the associate name.
(build_associate_name): Pass true to 'select_type' in call to
previous.
* parse.cc (parse_associate): If the selector is inferred type
the associate name is too. Make sure that function selector
class and rank, if known, are passed to the associate name. If
a function result exists, pass its typespec to the associate
name.
* primary.cc (resolvable_fcns): New function to check that all
the function references are resolvable.
(gfc_match_varspec): If a scalar derived type select type
temporary has an array reference, match the array reference,
treating this in the same way as an equivalence member. Do not
set 'inquiry' if applied to an unknown type the inquiry name
is ambiguous with the component of an accessible derived type.
Check that resolution of the target expression is OK by testing
if the symbol is declared or is an operator expression, then
using 'resolvable_fcns' recursively. If all is well, resolve
the expression. If this is an inferred type with a component
reference, call 'gfc_find_derived_types' to find a suitable
derived type. If there is an inquiry ref and the symbol either
is of unknown type or is inferred to be a derived type, set the
primary and symbol TKR appropriately.
* resolve.cc (resolve_variable): Call new function below.
(gfc_fixup_inferred_type_refs): New function to ensure that the
expression references for a inferred type are consistent with
the now fixed up selector.
(resolve_assoc_var): Ensure that derived type or class function
selectors transmit the correct arrayspec to the associate name.
(resolve_select_type): If the selector is an associate name of
inferred type and has no component references, the associate
name should have its typespec. Simplify the conversion of a
class array to class scalar by calling 'gfc_change_class'.
Make sure that a class, inferred type selector with an array
ref transfers the typespec from the symbol to the expression.
* symbol.cc (gfc_set_default_type): If an associate name with
unknown type has a selector expression, try resolving the expr.
(find_derived_types, gfc_find_derived_types): New functions
that search for a derived type with a given name.
* trans-expr.cc (gfc_conv_variable): Some inferred type exprs
escape resolution so call 'gfc_fixup_inferred_type_refs'.
* trans-stmt.cc (trans_associate_var): Tidy up expression for
'class_target'. Finalize and free class function results.
Correctly handle selectors that are class functions and class
array references, passed as derived types.

gcc/testsuite/
PR fortran/87477
PR fortran/89645
PR fortran/99065
* gfortran.dg/associate_64.f90 : New test
* gfortran.dg/associate_66.f90 : New test
* gfortran.dg/associate_67.f90 : New test

PR fortran/114141
* gfortran.dg/associate_65.f90 : New test

PR fortran/114280
* gfortran.dg/associate_68.f90 : New test

MIPS: Add -m(no-)strict-align option

We support options -m(no-)unaligned-access 2 years ago, while
currently most of other ports prefer -m(no-)strict-align.
Let's support -m(no-)strict-align, and keep -m(no-)unaligned-access
as alias.

gcc
* config/mips/mips.opt: Support -mstrict-align, and use
TARGET_STRICT_ALIGN as the flag; keep -m(no-)unaligned-access
as alias.
* config/mips/mips.h: Use TARGET_STRICT_ALIGN.
* config/mips/mips.opt.urls: Regenerate.
* doc/invoke.texi: Document -m(no-)strict-algin for MIPSr6.

vect: Call vect_convert_output with the right vecitype [PR114108]

This patch fixes a bug where vect_recog_abd_pattern called vect_convert_output
with the incorrect vecitype for the corresponding pattern_stmt.
vect_convert_output expects vecitype to be the vector form of the scalar type
of the LHS of pattern_stmt, but we were passing in the vector form of the LHS
of the new impending conversion statement. This caused a skew in ABD's
pattern_stmt having the vectype of the following gimple pattern_stmt.

2024-03-06 Tejas Belagod <tejas.belagod@arm.com>

gcc/ChangeLog:

PR middle-end/114108
* tree-vect-patterns.cc (vect_recog_abd_pattern): Call
vect_convert_output with the correct vecitype.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr114108.c: New test.

LoongArch: Remove masking process for operand 3 of xvpermi.q.

The behavior of non-zero unused bits in xvpermi.q instruction's
third operand is undefined on LoongArch, according to our
discussion (https://github.com/llvm/llvm-project/pull/83540),
we think that keeping original insn operand as unmodified
state is better solution.

This patch partially reverts 7b158e036a95b1ab40793dd53bed7dbd770ffdaf.

gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_xvpermi_q_<LASX:mode>):
Remove masking of operand 3.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c:
Reposition operand 3's value into instruction's defined accept range.

Daily bump.

tree-core: clarify clobber comments

It came up on the mailing list that OBJECT_BEGIN/END are described as
marking object lifetime, but mark the beginning of the constructor and end
of the destructor, whereas the C++ notion of lifetime is between the end of
the constructor and beginning of the destructor. So let's fix the comments.

gcc/ChangeLog:

* tree-core.h (enum clobber_kind): Clarify CLOBBER_OBJECT_*
comments.

PR modula2/114294 expression causes ICE

This patch fixes an ICE when encountering an expression:
1 + HIGH (a[0]). The fix was to assign a type to the constant
created by BuildConstHighFromSym in M2Quads.mod.

gcc/m2/ChangeLog:

PR modula2/114294
* gm2-compiler/M2Quads.mod (BuildConstHighFromSym):
Call PutConst to assign the type Cardinal in the result
constant.

gcc/testsuite/ChangeLog:

PR modula2/114294
* gm2/pim/pass/log: Removed.
* gm2/pim/pass/highexp.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

hppa: Fix REG+D address support before reload

When generating PA 1.x code or code for GNU ld, floating-point
accesses only support 5-bit displacements but integer accesses
support 14-bit displacements. I mistakenly assumed reload
could fix an invalid 14-bit displacement in a floating-point
access but this is not the case.

2024-03-14 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR target/114288
* config/pa/pa.cc (pa_legitimate_address_p): Don't allow
14-bit displacements before reload for modes that may use
a floating-point load or store.

bpf: define INT8_TYPE as signed char

Change the BPF backend to define INT8_TYPE with an explicit sign, rather
than a plain char. This is in line with other targets and removes the
risk of int8_t being affected by the signedness of the plain char type
of the host system.

The motivation for this change is that even if `char' is defined to be
signed in BPF targets, some BPF programs use the (mal)practice of
including internal libc headers, either directly or indirectly via
kernel headers, which in turn may trigger compilation errors regarding
redefinitions of types.

gcc/

* config/bpf/bpf.h (INT8_TYPE): Change to signed char.

gcc: xtensa: reorder movsi_internal patterns for better code generation during LRA

After switching to LRA xtensa backend generates the following code for
saving/loading registers:

    movi     a9, 0x190
    add      a9, a9, sp
    s32i.n   a3, a9, 0

instead of the shorter and more efficient

    s32i     a3, a9, 0x190

E.g. the following code can be used to reproduce it:

    int f1(int a, int b, int c, int d, int e, int f, int *p);
    int f2(int a, int b, int c, int d, int e, int f, int *p);
    int f3(int a, int b, int c, int d, int e, int f, int *p);

    int foo(int a, int b, int c, int d, int e, int f)
    {
        int g[100];
        return
            f1(a, b, c, d, e, f, g) +
            f2(a, b, c, d, e, f, g) +
            f3(a, b, c, d, e, f, g);
    }

This happens in the LRA pass because s32i.n and l32i.n are listed before
the s32i and l32i in the movsi_internal pattern and alternative
consideration loop stops early.

gcc/

* config/xtensa/xtensa.md (movsi_internal): Move l32i and s32i
patterns ahead of the l32i.n and s32i.n.

libstdc++: Fix std::format("{}", negative_integer) [PR114325]

The fast path for "{}" format strings has a bug for negative integers
where the length passed to std::to_chars is too long.

libstdc++-v3/ChangeLog:

PR libstdc++/114325
* include/std/format (_Scanner::_M_scan): Pass correct length to
__to_chars_10_impl.
* testsuite/std/format/functions/format.cc: Check negative
integers with empty format-spec.

libstdc++: Add nodiscard in <algorithm>

Add the [[nodiscard]] attribute to several functions in <algorithm>.
These all have no side effects and are only called for their return
value (e.g. std::count) or produce a result that must not be discarded
for correctness (e.g. std::remove).

I was intending to add the attribute to a number of other functions like
std::copy_if, std::unique_copy, std::set_union, and std::set_difference.
I stopped when I noticed that MSVC doesn't use it on those functions,
which I suspect is because they're often used with an insert iterator
(e.g. std::back_insert_iterator). In that case it doesn't matter if
you discard the result, because you have the container to tell you how
many elements were copied to the output range.

libstdc++-v3/ChangeLog:

* include/bits/stl_algo.h (find_end, all_of, none_of, any_of)
(find_if_not, is_partitioned, partition_point, remove)
(remove_if, unique, lower_bound, upper_bound, equal_range)
(binary_search, includes, is_sorted, is_sorted_until, minmax)
(minmax_element, is_permutation, clamp, find_if, find_first_of)
(adjacent_find, count, count_if, search, search_n, min_element)
(max_element): Add nodiscard attribute.
* include/bits/stl_algobase.h (min, max, lower_bound, equal)
(lexicographical_compare, lexicographical_compare_three_way)
(mismatch): Likewise.
* include/bits/stl_heap.h (is_heap, is_heap_until): Likewise.
* testsuite/25_algorithms/equal/debug/1_neg.cc: Add dg-warning.
* testsuite/25_algorithms/equal/debug/2_neg.cc: Likewise.
* testsuite/25_algorithms/equal/debug/3_neg.cc: Likewise.
* testsuite/25_algorithms/find_first_of/concept_check_1.cc:
Likewise.
* testsuite/25_algorithms/is_permutation/2.cc: Likewise.
* testsuite/25_algorithms/lexicographical_compare/71545.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/33613.cc: Likewise.
* testsuite/25_algorithms/lower_bound/debug/irreflexive.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/debug/partitioned_neg.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/debug/partitioned_pred_neg.cc:
Likewise.
* testsuite/25_algorithms/minmax/3.cc: Likewise.
* testsuite/25_algorithms/search/78346.cc: Likewise.
* testsuite/25_algorithms/search_n/58358.cc: Likewise.
* testsuite/25_algorithms/unique/1.cc: Likewise.
* testsuite/25_algorithms/unique/11480.cc: Likewise.
* testsuite/25_algorithms/upper_bound/33613.cc: Likewise.
* testsuite/25_algorithms/upper_bound/debug/partitioned_neg.cc:
Likewise.
* testsuite/25_algorithms/upper_bound/debug/partitioned_pred_neg.cc:
Likewise.
* testsuite/ext/concept_checks.cc: Likewise.
* testsuite/ext/is_heap/47709.cc: Likewise.
* testsuite/ext/is_sorted/cxx0x.cc: Likewise.

gcn: Fix a comment typo

I've noticed a typo in the comment above ABI_VERSION_SPEC.

Fixed thusly.

2024-03-14 Jakub Jelinek <jakub@redhat.com>

* config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Fix comment typo.

icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

AFAIK we have no code in LTO streaming to stream out or in
SSA_NAME_{RANGE,PTR}_INFO, so LTO effectively throws it all away
and let vrp1 and alias analysis after IPA recompute that.  There is
just one spot, for IPA VRP and IPA bit CCP we save/restore ranges
and set SSA_NAME_{PTR,RANGE}_INFO e.g. on parameters depending on what
we saved and propagated, but that is after streaming in bodies for the
post IPA optimizations.

Now, without LTO SSA_NAME_{RANGE,PTR}_INFO is already computed from
earlier in many cases (er.g. evrp and early alias analysis but other spots
too), but IPA ICF is ignoring the ranges and points-to details when
comparing the bodies.  I think ignoring that is just fine, that is
effectively what we do for LTO where we throw that information away
before the analysis, and not ignoring it could lead to fewer ICF merging
possibilities.

So, the following patch instead verifies that for LTO SSA_NAME_{PTR,RANGE}_INFO
just isn't there on SSA_NAMEs in functions into which other functions have
been ICFed, and for non-LTO throws that information away (which matches the
LTO behavior).

Another possibility would be to remember the SSA_NAME <-> SSA_NAME mapping
vector (just one of the 2) on successful sem_function::equals on the
sem_function which is not the chosen leader (e.g. how SSA_NAMEs in the
leader map to SSA_NAMEs in the other function) and use that vector
to union the ranges in sem_function::merge.  I can implement that for
comparison, but wanted to post this first if there is an agreement on
doing that or if Honza thinks we should take SSA_NAME_{RANGE,PTR}_INFO
into account.  I think we can compare SSA_NAME_RANGE_INFO, but have
no idea how to try to compare points to info.  And I think it will result
in less effective ICF for non-LTO vs. LTO unnecessarily.

2024-03-12  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/113907
* ipa-icf.cc (sem_item_optimizer::merge_classes): Reset
SSA_NAME_RANGE_INFO and SSA_NAME_PTR_INFO on successfully ICF merged
functions.

* gcc.dg/pr113907-1.c: New test.

PR modula2/114333 set type comparison against cardinal should cause error addendum

This patch applies the new stricter type checking procedure function to
the remaining 6 comparisons: less, greater, lessequ, greequ, ifin and
ifnotin.

gcc/m2/ChangeLog:

PR modula2/114333
* gm2-compiler/M2GenGCC.mod (CodeStatement): Remove op1, op2 and
op3 parameters to CodeIfLess, CodeIfLessEqu, CodeIfGreEqu, CodeIfGre,
CodeIfIn, CodeIfNotIn.
(CodeIfLess): Rewrite.
(PerformCodeIfLess): New procedure.
(CodeIfLess): Rewrite.
(PerformCodeIfLess): New procedure.
(CodeIfLessEqu): Rewrite.
(PerformCodeIfLessEqu): New procedure.
(CodeIfGreEqu): Rewrite.
(PerformCodeIfGreEqu): New procedure.
(CodeIfGre): Rewrite.
(PerformCodeIfGre): New procedure.
(CodeIfIn): Rewrite.
(PerformCodeIfIn): New procedure.
(CodeIfNotIn): Rewrite.
(PerformCodeIfNotIn): New procedure.

gcc/testsuite/ChangeLog:

PR modula2/114333
* gm2/pim/fail/badset5.mod: New test.
* gm2/pim/fail/badset6.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

LoongArch: Remove unused and incorrect "sge<u>_<X:mode><GPR:mode>" define_insn

If this insn is really used, we'll have something like

    slti $r4,$r0,$r5

in the code.  The assembler will reject it because slti wants 2
register operands and 1 immediate operand.  But we've not got any bug
report for this, indicating this define_insn is unused at all.

Note that do_store_flag (in expr.cc) is already converting x >= 1 to
x > 0 unconditionally, so this define_insn is indeed unused and we can
just remove it.

gcc/ChangeLog:

* config/loongarch/loongarch.md (any_ge): Remove.
(sge<u>_<X:mode><GPR:mode>): Remove.

libstdc++: Add missing clear_padding in __atomic_float constructor

For 80-bit long double we need to clear the padding bits on
construction.

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h (__atomic_float::__atomic_float(Fp)):
Clear padding.
* testsuite/29_atomics/atomic_float/compare_exchange_padding.cc:
New test.

Signed-off-by: xndcn <xndchn@gmail.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

OpenACC 2.7: front-end support for readonly modifier: Add basic OpenACC 'declare' testing

... to complement commit ddf852dac2abaca317c10b8323f338123b0585c8
"OpenACC 2.7: front-end support for readonly modifier".

gcc/testsuite/
* c-c++-common/goacc/readonly-1.c: Add basic OpenACC 'declare'
testing.
* gfortran.dg/goacc/readonly-1.f90: Likewise.

Minor fixes for OpenACC/Fortran 'self' clause for compute constructs

... to fix up recent commit 3a3596389c2e539cb8fd5dc5784a4e2afe193a2a
"OpenACC 2.7: Implement self clause for compute constructs".

gcc/fortran/
* dump-parse-tree.cc (show_omp_clauses): Handle 'self_expr'.
* openmp.cc (gfc_free_omp_clauses): Likewise.
* trans-openmp.cc (gfc_split_omp_clauses): Don't handle 'self_expr'.

Fix 'char' initialization, copy, check in 'libgomp.oacc-fortran/acc-memcpy.f90'

Our dear friend '-Wuninitialized' reported:

    [...]/libgomp.oacc-fortran/acc-memcpy.f90:18:27:

       18 |     char(j) = int (j, int8)
          |                           ^
    Warning: ‘j’ may be used uninitialized [-Wmaybe-uninitialized]
    [...]/libgomp.oacc-fortran/acc-memcpy.f90:14:20:

       14 |   integer(int8) :: j
          |                    ^
    note: ‘j’ was declared here

..., but actually there were other issues.

libgomp/
* testsuite/libgomp.oacc-fortran/acc-memcpy.f90: Fix 'char'
initialization, copy, check.

aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310]

The following testcase ICEs with LSE atomics.
The problem is that the @atomic_compare_and_swap<mode> expander uses
aarch64_reg_or_zero predicate for the desired operand, which is fine,
given that for most of the modes and even for TImode in some cases
it can handle zero immediate just fine, but the TImode
@aarch64_compare_and_swap<mode>_lse just uses register_operand for
that operand instead, again intentionally so, because the casp,
caspa, caspl and caspal instructions need to use a pair of consecutive
registers for the operand and xzr is just one register and we can't
just store zero into the link register to emulate pair of zeros.

So, the following patch fixes that by forcing the newval operand into
a register for the TImode LSE case.

2024-03-14 Jakub Jelinek <jakub@redhat.com>

PR target/114310
* config/aarch64/aarch64.cc (aarch64_expand_compare_and_swap): For
TImode force newval into a register.

* gcc.dg/pr114310.c: New test.

s390: fix htm-builtins test cases

Transactional and non-transactional stores to the same cache line cause
transactions to abort on newer generations. Add sufficient padding to make
sure another cache line is used.

Tested on s390.

gcc/testsuite/ChangeLog:

* gcc.target/s390/htm-builtins-1.c: Fix.
* gcc.target/s390/htm-builtins-2.c: Fix.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>

libstdc++: Correct notes about std::call_once in manual [PR66146]

The bug with exceptions thrown during a std::call_once call affects all
targets, so fix the docs that say it only affects non-Linux targets.

libstdc++-v3/ChangeLog:

PR libstdc++/66146
* doc/xml/manual/status_cxx2011.xml: Remove mention of Linux in
note about std::call_once.
* doc/xml/manual/status_cxx2014.xml: Likewise.
* doc/xml/manual/status_cxx2017.xml: Likewise.
* doc/html/manual/status.html: Regenerate.

libstdc++: Update C++23 status in the manual

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2023.xml: Update C++23 status table.
* doc/html/manual/status.html: Regenerate.
* include/bits/version.def: Fix typo in comment.

libcpp: Fix macro expansion for argument of __has_include [PR110558]

When the file name for a #include directive is the result of stringifying a
macro argument, libcpp needs to take some care to get the whitespace
correct; in particular stringify_arg() needs to see a CPP_PADDING token
between macro tokens so that it can figure out when to output space between
tokens. The CPP_PADDING tokens are not normally generated when handling a
preprocessor directive, but for #include-like directives, libcpp sets the
state variable pfile->state.directive_wants_padding to TRUE so that the
CPP_PADDING tokens will be output, and then everything works fine for
computed includes.

As the PR points out, things do not work fine for __has_include. Fix that by
setting the state variable the same as is done for #include.

libcpp/ChangeLog:

PR preprocessor/110558
* macro.cc (builtin_has_include): Set
pfile->state.directive_wants_padding prior to lexing the
file name, in case it comes from macro expansion.

gcc/testsuite/ChangeLog:

PR preprocessor/110558
* c-c++-common/cpp/has-include-2.c: New test.
* c-c++-common/cpp/has-include-2.h: New test.

libcpp: Fix __has_include_next ICE in the last directory of the path [PR80755]

In libcpp/files.cc, the function _cpp_has_header(), which implements
__has_include and __has_include_next, does not check for a NULL return value
from search_path_head(), leading to an ICE tripping an assert when
_cpp_find_file() tries to use it. Fix it by checking for that case and
silently returning false instead.

As suggested by the PR author, it is easiest to make a testcase by using
the -idirafter option. To enable that, also modify the dg-additional-options
testsuite procedure to make the global $srcdir available, since -idirafter
requires the full path.

libcpp/ChangeLog:

PR preprocessor/80755
* files.cc (search_path_head): Add SUPPRESS_DIAGNOSTIC argument
defaulting to false.
(_cpp_has_header): Silently return false if the search path has been
exhausted, rather than issuing a diagnostic and then hitting an
assert.

gcc/testsuite/ChangeLog:

* lib/gcc-defs.exp (dg-additional-options): Make $srcdir usable in a
dg-additional-options directive.
* c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h: New test.
* c-c++-common/cpp/has-include-next-2.c: New test.

PR modula2/114333 set type comparison against a cardinal should cause an error

The type checker M2Check.mod needs extending to detect if a set, array or
record is in either operand at the end of the cascaded test list.

gcc/m2/ChangeLog:

PR modula2/114333
* gm2-compiler/M2Check.mod (checkUnbounded): New procedure
function.
(checkArrayTypeEquivalence): Extend checking to cover unbounded
arrays, arrays and constants.
(IsTyped): Simplified the expression and corrected a test for
IsConstructor.
(checkTypeKindViolation): New procedure function.
(doCheckPair): Call checkTypeKindViolation.
* gm2-compiler/M2GenGCC.mod (CodeStatement): Remove parameters
to CodeEqu and CodeNotEqu.
(PerformCodeIfEqu): New procedure.
(CodeIfEqu): Rewrite.
(PerformCodeIfNotEqu): New procedure.
(CodeIfNotEqu): Rewrite.
* gm2-compiler/M2Quads.mod (BuildRelOpFromBoolean): Correct
comment.

gcc/testsuite/ChangeLog:

PR modula2/114333
* gm2/cse/pass/testcse54.mod: New test.
* gm2/iso/run/pass/array9.mod: New test.
* gm2/iso/run/pass/strcons3.mod: New test.
* gm2/iso/run/pass/strcons4.mod: New test.
* gm2/pim/fail/badset1.mod: New test.
* gm2/pim/fail/badset2.mod: New test.
* gm2/pim/fail/badset3.mod: New test.
* gm2/pim/fail/badset4.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

OpenACC 2.7: front-end support for readonly modifier

This patch implements the front-end support for the 'readonly' modifier for the
OpenACC 'copyin' clause and 'cache' directive.

This currently only includes front-end parsing for C/C++/Fortran and setting of
new bits OMP_CLAUSE_MAP_READONLY, OMP_CLAUSE__CACHE__READONLY. Further linking
of these bits to points-to analysis and/or utilization of read-only memory in
accelerator target are for later patches.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_oacc_data_clause): Add parsing support for
'readonly' modifier, set OMP_CLAUSE_MAP_READONLY if readonly modifier
found, update comments.
(c_parser_oacc_cache): Add parsing support for 'readonly' modifier,
set OMP_CLAUSE__CACHE__READONLY if readonly modifier found, update
comments.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_oacc_data_clause): Add parsing support for
'readonly' modifier, set OMP_CLAUSE_MAP_READONLY if readonly modifier
found, update comments.
(cp_parser_oacc_cache): Add parsing support for 'readonly' modifier,
set OMP_CLAUSE__CACHE__READONLY if readonly modifier found, update
comments.

gcc/fortran/ChangeLog:

* dump-parse-tree.cc (show_omp_namelist): Print "readonly," for
OMP_LIST_MAP and OMP_LIST_CACHE if n->u.map.readonly is set.
Adjust 'n->u.map_op' to 'n->u.map.op'.
* gfortran.h (typedef struct gfc_omp_namelist): Adjust map_op as
'ENUM_BITFIELD (gfc_omp_map_op) op:8', add 'bool readonly' field,
change to named struct field 'map'.
* openmp.cc (gfc_match_omp_map_clause): Adjust 'n->u.map_op' to
'n->u.map.op'.
(gfc_match_omp_clause_reduction): Likewise.
(gfc_match_omp_clauses): Add readonly modifier parsing for OpenACC
copyin clause, set 'n->u.map.op' and 'n->u.map.readonly' for parsed
clause. Adjust 'n->u.map_op' to 'n->u.map.op'.
(gfc_match_oacc_declare): Adjust 'n->u.map_op' to 'n->u.map.op'.
(gfc_match_oacc_cache): Add readonly modifier parsing for OpenACC
cache directive.
(resolve_omp_clauses): Adjust 'n->u.map_op' to 'n->u.map.op'.
* trans-decl.cc (add_clause): Adjust 'n->u.map_op' to 'n->u.map.op'.
(finish_oacc_declare): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_CLAUSE_MAP_READONLY,
OMP_CLAUSE__CACHE__READONLY to 1 when readonly is set. Adjust
'n->u.map_op' to 'n->u.map.op'.
(gfc_add_clause_implicitly): Adjust 'n->u.map_op' to 'n->u.map.op'.

gcc/ChangeLog:

* tree.h (OMP_CLAUSE_MAP_READONLY): New macro.
(OMP_CLAUSE__CACHE__READONLY): New macro.
* tree-core.h (struct GTY(()) tree_base): Adjust comments for new
uses of readonly_flag bit in OMP_CLAUSE_MAP_READONLY and
OMP_CLAUSE__CACHE__READONLY.
* tree-pretty-print.cc (dump_omp_clause): Add support for printing
OMP_CLAUSE_MAP_READONLY and OMP_CLAUSE__CACHE__READONLY.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/readonly-1.c: New test.
* gfortran.dg/goacc/readonly-1.f90: New test.

IBM Z: Fix -munaligned-symbols

With this fix we make sure that only symbols with a natural alignment
smaller than 2 are considered misaligned with
-munaligned-symbols. Background is that -munaligned-symbols is only
supposed to affect symbols whose natural alignment wouldn't be enough
to fulfill our ABI requirement of having all symbols at even
addresses. Because only these are the cases where we differ from other
architectures.

gcc/ChangeLog:

* config/s390/s390.cc (s390_encode_section_info): Adjust the check
for misaligned symbols.
* config/s390/s390.opt: Improve documentation.

gcc/testsuite/ChangeLog:

* gcc.target/s390/aligned-1.c: Add weak and void variables
incorporating the cases from unaligned-2.c.
* gcc.target/s390/unaligned-1.c: Likewise.
* gcc.target/s390/unaligned-2.c: Removed.

gimple-iterator: Some gsi_safe_insert_*before fixes

When trying to use the gsi_safe_insert*before APIs in bitint lowering,
I've discovered 3 issues and the following patch addresses those:

1) both split_block and split_edge update CDI_DOMINATORS if they are
   available, but because edge_before_returns_twice_call first splits
   and then adds an extra EDGE_ABNORMAL edge and then removes another
   one, the immediate dominators of both the new bb and the bb with
   returns_twice call need to change
2) the new EDGE_ABNORMAL edge had uninitialized probability; this patch
   copies the probability from the edge that is going to be removed
   and similarly copies other flags (EDGE_EXECUTABLE, EDGE_DFS_BACK,
   EDGE_IRREDUCIBLE_LOOP etc.)
3) if edge_before_returns_twice_call splits a block, then the bb with
   returns_twice call changes, so the gimple_stmt_iterator for it is
   no longer accurate, it points to the right statement, but gsi_bb
   and gsi_seq are no longer correct; the patch updates it

2024-03-14  Jakub Jelinek  <jakub@redhat.com>

* gimple-iterator.cc (edge_before_returns_twice_call): Copy all
flags and probability from ad_edge to e edge.  If CDI_DOMINATORS
are computed, recompute immediate dominator of other_edge->src
and other_edge->dest.
(gsi_safe_insert_before, gsi_safe_insert_seq_before): Update *iter
for the returns_twice call case to the gsi_for_stmt (stmt) to deal
with update it for bb splitting.

i386[stv]: Handle REG_EH_REGION note

When we split
(insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
        (mem:DI (reg/f:SI 98 [ CallNative_nclosure.0_1 ]) [6 MEM[(struct SQRefCounted *)CallNative_nclosure.0_1]._uiRef+0 S8 A32])) "test.C":22:42 84 {*movdi_internal}
     (expr_list:REG_EH_REGION (const_int -11 [0xfffffffffffffff5])

into

(insn 104 36 37 10 (set (subreg:V2DI (reg:DI 124) 0)
        (vec_concat:V2DI (mem:DI (reg/f:SI 98 [ CallNative_nclosure.0_1 ]) [6 MEM[(struct SQRefCounted *)CallNative_nclosure.0_1]._uiRef+0 S8 A32])
            (const_int 0 [0]))) "test.C":22:42 -1
        (nil)))
(insn 37 104 105 10 (set (subreg:V2DI (reg:DI 104 [ _18 ]) 0)
        (subreg:V2DI (reg:DI 124) 0)) "test.C":22:42 2024 {movv2di_internal}
     (expr_list:REG_EH_REGION (const_int -11 [0xfffffffffffffff5])
        (nil)))

we must copy the REG_EH_REGION note to the first insn and split the block
after the newly added insn.  The REG_EH_REGION on the second insn will be
removed later since it no longer traps.

gcc/ChangeLog:

* config/i386/i386-features.cc
(general_scalar_chain::convert_op): Handle REG_EH_REGION note.
(convert_scalars_to_vector): Ditto.
* config/i386/i386-features.h (class scalar_chain): New
memeber control_flow_insns.

gcc/testsuite/ChangeLog:

* g++.target/i386/pr111822.C: New test.

Daily bump.

libstdc++: Move test error_category to global scope

A recent GDB change causes this test to fail due to missing RTTI for the
custom_cast type. This is presumably because the custom_cat type was
defined as a local class, so has no linkage. Moving it to local scope
seems to fix the test regressions, and probably makes the test more
realistic as a local class with no linkage isn't practical to use as an
error category that almost certainly needs to be referred to in other
scopes.

libstdc++-v3/ChangeLog:

* testsuite/libstdc++-prettyprinters/cxx11.cc: Move custom_cat
to namespace scope.

libstdc++: Improve documentation on debugging with libstdc++

libstdc++-v3/ChangeLog:

* doc/xml/manual/debug.xml: Improve docs on debug builds and
using ASan. Mention _GLIBCXX_ASSERTIONS. Reorder sections to put
the most relevant ones first.
* doc/xml/manual/using.xml: Add comma.
* doc/html/*: Regenerate.

libstdc++: Document that _GLIBCXX_CONCEPT_CHECKS might be removed in future

The macro-based concept checks are unmaintained and do not support C++11
or later, so reject valid code. If nobody plans to update them we should
consider removing them. Alternatively, we could ignore the macro for
C++11 and later, so they have no effect and don't reject valid code.

libstdc++-v3/ChangeLog:

* doc/xml/manual/debug.xml: Document that concept checking might
be removed in future.
* doc/xml/manual/extensions.xml: Likewise.

Fortran: fix IS_CONTIGUOUS for polymorphic dummy arguments [PR114001]

gcc/fortran/ChangeLog:

PR fortran/114001
* expr.cc (gfc_is_simply_contiguous): Adjust logic so that CLASS
symbols are also handled.

gcc/testsuite/ChangeLog:

PR fortran/114001
* gfortran.dg/is_contiguous_4.f90: New test.

store-merging: Match bswap64 on 32-bit targets with bswapsi2 [PR114319]

gimple-ssa-store-merging.cc tests bswap_optab in 3 different places,
in 2 of them it has special exception for double-word bswap using pair
of word-mode bswap optabs, but in the last one it doesn't.

The following patch changes even the last spot.
We don't handle 128-bit bswaps in the passes at all, because currently we
just use uint64_t to represent the byte reshuffling (we'd need to use
offset_int or something like that instead) and we don't have
__builtin_bswap128 nor type-generic __builtin_bswap, so there is nothing
for 64-bit targets there.

2024-03-13 Jakub Jelinek <jakub@redhat.com>

PR middle-end/114319
* gimple-ssa-store-merging.cc
(imm_store_chain_info::try_coalesce_bswap): For 32-bit targets
allow matching __builtin_bswap64 if there is bswapsi2 optab.

* gcc.target/i386/pr114319.c: New test.

testsuite: target test for short_enums

On arm-none-eabi, the test case fails with below warning on GCC13
.../null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:63:65: warning: converting a packed 'enum obj_type' pointer (alignment 1) to a 'struct connection' pointer (alignment 4) may result in an unaligned pointer value [-Waddress-of-packed-member]

Add a dg-bogus to ensure that the warning is not reintroduced.

gcc/testsuite/ChangeLog:

* c-c++-common/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:
Added dg-bogus with target on offending line for short_enums.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

s390: Fix TARGET_SECONDARY_RELOAD for non-SYMBOL_REFs

RTX X need not necessarily be a SYMBOL_REF and may e.g. be an
UNSPEC_GOTENT for which SYMBOL_FLAG_NOTALIGN2_P fails.

gcc/ChangeLog:

* config/s390/s390.cc (s390_secondary_reload): Guard
SYMBOL_FLAG_NOTALIGN2_P.

s390: Fix tests rosbg_si_srl and rxsbg_si_srl

Starting with r14-2047-gd0e891406b16dc two SI mode tests are optimized
into DI mode.  Thus, the scan-assembler directives fail.  For example
RTL expression

(ior:SI (subreg:SI (lshiftrt:DI (reg:DI 69)
            (const_int 2 [0x2])) 4)
    (subreg:SI (reg:DI 68) 4))

is optimized into

(ior:DI (lshiftrt:DI (reg:DI 69)
        (const_int 2 [0x2]))
    (reg:DI 68))

Fixed by moving operands into memory in order to enforce SI mode
computation.

Furthermore, in r9-6056-g290dfd9bc7bea2 the starting bit position of the
scan-assembler directive for rosbg was incorrectly set to 32 which
actually should be 32+SHIFT_AMOUNT, i.e., in this particular case 34.

gcc/testsuite/ChangeLog:

* gcc.target/s390/md/rXsbg_mode_sXl.c: Fix tests rosbg_si_srl
and rxsbg_si_srl.

s390: Streamline vector builtins with LLVM

Similar as to s390_lcbb, s390_vll, s390_vstl, et al. make use of a
signed vector type for vlbb. Furthermore, a const void pointer seems
more common and an integer for the mask.

For s390_vfi(s,d)b make use of integers for masks, too.

Use unsigned integers for all s390_vlbr/vstbr variants.

Make use of type UV16QI for the length operand of s390_vstrs(,z)(h,f).

Following the Principles of Operation, change from signed to unsigned
type for s390_va(c,cc,ccc)q and s390_vs(,c,bc)biq and s390_vmslg.

Make use of scalar type UINT128 instead of UV16QI for s390_vgfm(,a)g,
and s390_vsumq(f,g).

gcc/ChangeLog:

* config/s390/s390-builtin-types.def: Update to reflect latest
changes.
* config/s390/s390-builtins.def: Streamline vector builtins with
LLVM.

s390: Deprecate some vector builtins

According to IBM Open XL C/C++ for z/OS version 1.1 builtins

- vec_permi
- vec_ctd
- vec_ctsl
- vec_ctul
- vec_ld2f
- vec_st2f

are deprecated. Also deprecate helper builtins vec_ctd_s64 and
vec_ctd_u64.

Furthermore, the overloads of vec_insert which make use of a bool vector
are deprecated, too.

gcc/ChangeLog:

* config/s390/s390-builtins.def (vec_permi): Deprecate.
(vec_ctd): Deprecate.
(vec_ctd_s64): Deprecate.
(vec_ctd_u64): Deprecate.
(vec_ctsl): Deprecate.
(vec_ctul): Deprecate.
(vec_ld2f): Deprecate.
(vec_st2f): Deprecate.
(vec_insert): Deprecate overloads with bool vectors.

bitint: Fix up lowering of bitfield loads/stores [PR114313]

The following testcase ICEs, because for large/huge _BitInt bitfield
loads/stores we use the DECL_BIT_FIELD_REPRESENTATIVE as the underlying
"var" and indexes into it can be larger than the precision of the
bitfield might normally allow.

The following patch fixes that by passing NULL_TREE type in that case
to limb_access, so that we always return m_limb_type type and don't
do the extra assertions, after all, the callers expect that too.
I had to add the first hunk to avoid ICE, it was using type in one place
even when it was NULL. But TYPE_SIZE (TREE_TYPE (var)) seems like the
right size to use anyway because the code uses VIEW_CONVERT_EXPR on it.

2024-03-13 Jakub Jelinek <jakub@redhat.com>

PR middle-end/114313
* gimple-lower-bitint.cc (bitint_large_huge::limb_access): Use
TYPE_SIZE of TREE_TYPE (var) rather than TYPE_SIZE of type.
(bitint_large_huge::handle_load): Pass NULL_TREE rather than
rhs_type to limb_access for the bitfield load cases.
(bitint_large_huge::lower_mergeable_stmt): Pass NULL_TREE rather than
lhs_type to limb_access if nlhs is non-NULL.

* gcc.dg/torture/bitint-62.c: New test.

OpenMP/Fortran: Fix defaultmap(none) issue with dummy procedures [PR114283]

Dummy procedures look similar to variables but aren't - neither in Fortran
nor in OpenMP. As the middle end sees PARM_DECLs, mark them as predetermined
firstprivate for mapping (as already done in gfc_omp_predetermined_sharing).

This does not address the isses related to procedure pointers, which are
still discussed on spec level [see PR].

PR fortran/114283

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_omp_predetermined_mapping): Map dummy
procedures as firstprivate.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/declare-target-indirect-4.f90: New test.

asan: Fix ICE during instrumentation of returns_twice calls [PR112709]

The following patch on top of the previously posted ubsan/gimple-iterator
one handles asan the same. While the case of returning by hidden reference
is handled differently because of the first recently posted asan patch,
this deals with instrumentation of the aggregates returned in registers
case as well as instrumentation of loads from aggregate memory in the
function arguments of returns_twice calls.

2024-03-13 Jakub Jelinek <jakub@redhat.com>

PR sanitizer/112709
* asan.cc (maybe_create_ssa_name, maybe_cast_to_ptrmode,
build_check_stmt, maybe_instrument_call, asan_expand_mark_ifn): Use
gsi_safe_insert_before instead of gsi_insert_before.

* gcc.dg/asan/pr112709-2.c: New test.

gimple-iterator, ubsan: Fix ICE during instrumentation of returns_twice calls [PR112709]

ubsan, asan (both PR112709) and _BitInt lowering (PR113466) want to
insert some instrumentation or adjustment statements before some statement.
This unfortunately creates invalid IL if inserting before a returns_twice
call, because we require that such calls are the first statement in a basic
block and that we have an edge from the .ABNORMAL_DISPATCHER block to
the block containing the returns_twice call (in addition to other edge(s)).

The following patch adds helper functions for such insertions and uses it
for now in ubsan (I'll post a follow up which uses it in asan and will
work later on the _BitInt lowering PR).

In particular, if the bb with returns_twice call at the start has just
2 edges, one EDGE_ABNORMAL from .ABNORMAL_DISPATCHER and another
(non-EDGE_ABNORMAL/EDGE_EH) from some other bb, it just inserts the
statement or sequence on that other edge.
If the bb has more predecessor edges or the one not from
.ABNORMAL_DISPATCHER is e.g. an EH edge (this latter case likely shouldn't
happen, one would need labels or something like that), the patch splits the
block with returns_twice call such that there is just one edge next to
.ABNORMAL_DISPATCHER edge and adjusts PHIs as needed to make it happen.
The functions also replace uses of PHIs from the returns_twice bb with
the corresponding PHI arguments, because otherwise it would be invalid IL.

E.g. in ubsan/pr112709-2.c (qux) we have before the ubsan pass
  <bb 10> :
  # .MEM_5(ab) = PHI <.MEM_4(9), .MEM_25(ab)(11)>
  # _7(ab) = PHI <_20(9), _8(ab)(11)>
  # .MEM_21(ab) = VDEF <.MEM_5(ab)>
  _22 = bar (*_7(ab));
where bar is returns_twice call and bb 11 has .ABNORMAL_DISPATCHER call,
this patch instruments it like:
  <bb 9> :
  # .MEM_4 = PHI <.MEM_17(ab)(4), .MEM_10(D)(5), .MEM_14(ab)(8)>
  # DEBUG BEGIN_STMT
  # VUSE <.MEM_4>
  _20 = p;
  # .MEM_27 = VDEF <.MEM_4>
  .UBSAN_NULL (_20, 0B, 0);
  # VUSE <.MEM_27>
  _2 = __builtin_dynamic_object_size (_20, 0);
  # .MEM_28 = VDEF <.MEM_27>
  .UBSAN_OBJECT_SIZE (_20, 1024, _2, 0);

  <bb 10> :
  # .MEM_5(ab) = PHI <.MEM_28(9), .MEM_25(ab)(11)>
  # _7(ab) = PHI <_20(9), _8(ab)(11)>
  # .MEM_21(ab) = VDEF <.MEM_5(ab)>
  _22 = bar (*_7(ab));
The edge from .ABNORMAL_DISPATCHER is there just to represent the
returning for 2nd and later times, the instrumentation can't be
done at that point as there is no code executed during that point.
The ubsan/pr112709-1.c testcase includes non-virtual PHIs to cover
the handling of those as well.

2024-03-13  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/112709
* gimple-iterator.h (gsi_safe_insert_before,
gsi_safe_insert_seq_before): Declare.
* gimple-iterator.cc: Include gimplify.h.
(edge_before_returns_twice_call, adjust_before_returns_twice_call,
gsi_safe_insert_before, gsi_safe_insert_seq_before): New functions.
* ubsan.cc (instrument_mem_ref, instrument_pointer_overflow,
instrument_nonnull_arg, instrument_nonnull_return): Use
gsi_safe_insert_before instead of gsi_insert_before.
(maybe_instrument_pointer_overflow): Use force_gimple_operand,
gimple_seq_add_seq_without_update and gsi_safe_insert_seq_before
instead of force_gimple_operand_gsi.
(instrument_object_size): Likewise.  Use gsi_safe_insert_before
instead of gsi_insert_before.

* gcc.dg/ubsan/pr112709-1.c: New test.
* gcc.dg/ubsan/pr112709-2.c: New test.

Daily bump.

Fortran: handle procedure pointer component in DT array [PR110826]

gcc/fortran/ChangeLog:

PR fortran/110826
* array.cc (gfc_array_dimen_size): When walking the ref chain of an
array and the ultimate component is a procedure pointer, do not try
to figure out its dimension even if it is a array-valued function.

gcc/testsuite/ChangeLog:

PR fortran/110826
* gfortran.dg/proc_ptr_comp_53.f90: New test.

libgomp/libgomp.texi: Fix @node order in @menu

While texinfo 7.0.3 does not warn, an older texinfo did complain about:
libgomp.texi:1964: warning: node next `omp_target_memcpy' in menu
`omp_target_memcpy_rect' and in sectioning `omp_target_memcpy_async' differ

libgomp/

* libgomp.texi (Device Memory Routines): Swap item order to match
the order of the '@node's of the '@subsection's.

tree-optimization/114121 - chrec_fold_{plus,multiply} and recursion

The following addresses endless recursion in the
chrec_fold_{plus,multiply} functions when handling sign-conversions.
We only need to apply tricks when we'd fail (there's a chrec in the
converted operand) and we need to make sure to not turn the other
operand into something worse (for the chrec-vs-chrec case).

PR tree-optimization/114121
* tree-chrec.cc (chrec_fold_plus_1): Guard recursion with
converted operand properly.
(chrec_fold_multiply): Likewise. Handle missed recursion.

* gcc.dg/torture/pr114312.c: New testcase.

c++: Support target-specific nodes when streaming modules [PR111224]

Some targets make use of POLY_INT_CSTs and other custom builtin types,
which currently violate some assumptions when streaming. This patch adds
support for them, such as types like Aarch64 __fp16, PowerPC __ibm128,
and vector types thereof.

This patch doesn't provide "full" support of AArch64 SVE, however, since
for that we would need to support 'target' nodes (tracked in PR108080).

Adding the new builtin types means that on Aarch64 we now have 217
global trees created on initialisation (up from 191), so this patch also
slightly bumps the initial size of the fixed_trees allocation to 250.

PR c++/98645
PR c++/98688
PR c++/111224

gcc/cp/ChangeLog:

* module.cc (enum tree_tag): Add new tag for builtin types.
(trees_out::start): POLY_INT_CSTs can be emitted.
(trees_in::start): Likewise.
(trees_out::core_vals): Stream POLY_INT_CSTs.
(trees_in::core_vals): Likewise.
(trees_out::type_node): Handle vectors with multiple coeffs.
(trees_in::tree_node): Likewise.
(init_modules): Register target-specific builtin types. Bump
initial capacity slightly.

gcc/testsuite/ChangeLog:

* g++.dg/modules/target-aarch64-1_a.C: New test.
* g++.dg/modules/target-aarch64-1_b.C: New test.
* g++.dg/modules/target-powerpc-1_a.C: New test.
* g++.dg/modules/target-powerpc-1_b.C: New test.
* g++.dg/modules/target-powerpc-2_a.C: New test.
* g++.dg/modules/target-powerpc-2_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>

asan: Instrument <retval> stores in callees rather than callers [PR112709]

asan currently instruments since PR69276 r6-6758 fix calls which store
the return value into memory on the caller side, before the call it
verifies the memory is writable.
Now PR112709 where we ICE on trying to instrument such calls made me
think about whether that is what we want to do.

There are 3 different cases.

One is when a function returns an aggregate which is passed e.g. in
registers, say like struct S { int a[4]; }; returning on x86_64.
That would be ideally instrumented in between the actual call and
storing of the aggregate into memory, but asan currently mostly
works as a GIMPLE pass and arranging for the instrumentation to happen
at that spot would be really hard.  We could diagnose after the call
but generally asan attempts to diagnose stuff before something is
overwritten rather than after, or keep the current behavior (that is
what this patch does, which has the disadvantage that it can complain
about UB even for functions which never return and so never actually store,
and doesn't check whether the memory wasn't e.g. poisoned during the call)
or could e.g. instrument both before and after the call (that would have
the disadvantage the current state has but at least would check post-factum
the store again afterwards).

Another case is when a function returns an aggregate through a hidden
reference, struct T { int a[128]; }; on x86_64 or even the above struct S
on ia32 as example.  In the actual program such stores happen when storing
something to <retval> or its parts in the callee, because <retval> there
expands to *hidden_retval.  So, IMHO we should instrument those in the
callee rather than caller, that is where the writes are and we can do that
easily.  This is what the patch below does.

And the last case is for builtins/internal functions.  Usually those don't
return aggregates, but in case they'd do and can be expanded inline, it is
better to instrument them in the caller (as before) rather than not
instrumenting the return stores at all.

I had to tweak the expected output on the PR69276 testcase, because
with the patch it keeps previous behavior on x86_64 (structure returned
in registers, stored in the caller, so reported as UB in A::A()),
while on i686 it changed the behavior and is reported as UB in the
vnull::operator vec which stores the structure, A::A() is then a frame
above it in the backtrace.

2024-03-12  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/112709
* asan.cc (has_stmt_been_instrumented_p): Don't instrument call
stores on the caller side unless it is a call to a builtin or
internal function or function doesn't return by hidden reference.
(maybe_instrument_call): Likewise.
(instrument_derefs): Instrument stores to RESULT_DECL if
returning by hidden reference.

* gcc.dg/asan/pr112709-1.c: New test.
* g++.dg/asan/pr69276.C: Adjust expected output for some targets.

strlen: Fix another spot that can create invalid ranges [PR114293]

This PR is similar to PR110603 fixed with r14-8487, except in a different
spot.  From the memset with -1 size of non-zero value we determine minimum
of (size_t) -1 and the code uses PTRDIFF_MAX - 2 (not really sure I
understand why it is - 2 and not - 1, e.g. heap allocated array
with PTRDIFF_MAX char elements which contain '\0' in the last element
should be fine, no?  One can still represent arr[PTRDIFF_MAX] - arr[0]
and arr[0] - arr[PTRDIFF_MAX] in ptrdiff_t and
strlen (arr) == PTRDIFF_MAX - 1) as the maximum, so again invalid range.
As in the other case, it is just UB that can lead to that, and we have
choice to only keep the min and use +inf for max, or only keep max
and use 0 for min, or not set the range at all, or use [min, min] or
[max, max] etc.  The following patch uses [min, +inf].

2024-03-12  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/114293
* tree-ssa-strlen.cc (strlen_pass::handle_builtin_strlen): If
max is smaller than min, set max to ~(size_t)0.

* gcc.dg/pr114293.c: New test.

RISC-V: Fix some code style issue(s) in riscv-c.cc [NFC]

Notice some code style issue(s) when add __riscv_v_fixed_vlen, includes:

* Meanless empty line.
* Line greater than 80 chars.
* Indent with 3 space(s).
* Argument unalignment.

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_ext_version_value): Fix
code style greater than 80 chars.
(riscv_cpu_cpp_builtins): Fix useless empty line, indent
with 3 space(s) and argument unalignment.

Signed-off-by: Pan Li <pan2.li@intel.com>

tree-optimization/114297 - SLP reduction with early break fix

The following makes sure to pass in the SLP node for the live stmts
we are generating the reduction epilogue for to
vect_create_epilog_for_reduction. This follows the previous fix for
the non-SLP path.

PR tree-optimization/114297
* tree-vect-loop.cc (vectorizable_live_operation): Pass in the
live stmts SLP node to vect_create_epilog_for_reduction.

* gcc.dg/vect/vect-early-break_123-pr114297.c: New testcase.

Reject -fno-multiflags [PR114314]

When -fmultiflags option support was added in r13-3693-g6b1a2474f9e422,
it accidently allowed -fno-multiflags which then would pass on to cc1.
This fixes that oversight.

Committed as obvious after bootstrap/test on x86_64-linux-gnu.

gcc/ChangeLog:

PR driver/114314
* common.opt (fmultiflags): Add RejectNegative.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Daily bump.

libgfortran: [PR114304] Revert portion of PR105347 change.

PR libfortran/105437
PR libfortran/114304

libgfortran/ChangeLog:

* io/list_read.c (eat_separator): Remove check for decimal
point mode and semicolon used as a seprator. Removes
the regression.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr105473.f90: Add additional checks to address
the case of semicolon at the end of a line.

Update gcc sv.po

* sv.po: Update.

gomp: testsuite: improve compatibility of bad-array-section-3.c [PR113428]

This test generates different warnings on ilp32 targets because the size
of an integer matches the size of a pointer. Avoid this by using
signed char.

gcc/testsuite:

PR testsuite/113428
* gcc.dg/gomp/bad-array-section-c-3.c: Use signed char instead
of int.

PR modula2/114295 Incorrect location if compiling implementation without definition

This patch fixes a bug which occurred if gm2 was asked to compile an
implementation module and could not find the definition module.  The error
location would be set to the SYSTEM module.  The bug occurred as the
module sym was created during the peep phase after which the few tokens are
destroyed and recreated during parsing.  The bug fix is to call
PutDeclared when the module is encountered during parsing which updates
the tokenno associated with the module.

gcc/m2/ChangeLog:

PR modula2/114295
* gm2-compiler/M2Batch.mod (MakeProgramSource): Call PutDeclared
if the module is known.
(MakeDefinitionSource): Ditto.
(MakeImplementationSource): Ditto.
* gm2-compiler/M2Comp.mod (ExamineHeader): New procedure.
(ExamineCompilationUnit): Rewrite.
(PeepInto): Rewrite.
* gm2-compiler/M2Error.mod (NewError): Remove default call to
GetTokenNo.
* gm2-compiler/M2Quads.mod (callRequestDependant): Push tokno with
Adr.
(BuildStringAdrParam): Ditto.
(doBuildBinaryOp): Push OperatorPos on the bool stack.
(BuildRelOp): Ditto.
* gm2-compiler/P2Build.bnf (SetType): Pass set token pos to
BuildSetType.
(PointerType): Pass pointer token pos to BuildPointerType.
* gm2-compiler/P2SymBuild.def (BuildPointerType): Add parameter
pointerpos.
(BuildSetType): Add parameter setpos.
* gm2-compiler/P2SymBuild.mod (BuildPointerType): Add parameter
pointerpos.  Build combined token and use it when creating a
pointer type.
(BuildSetType): Add parameter setpos.  Build combined token and
use it when creating a set type.
* gm2-compiler/SymbolTable.mod (DebugUnknownToken): New constant.
(CheckTok): New procedure function.
(MakeProcedure): Call CheckTok.
(MakeRecord): Ditto.
(MakeVarient): Ditto.
(MakeEnumeration): Ditto.
(MakeHiddenType): Ditto.
(MakeConstant): Ditto.
(MakeConstStringCnul): Ditto.
(MakeSubrange): Ditto.
(MakeTemporary): Ditto.
(MakeVariableForParam): Ditto.
(MakeParameterHeapVar): Ditto.
(MakePointer): Ditto.
(MakeSet): Ditto.
(MakeUnbounded): Ditto.
(MakeProcType): Ditto.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

testsuite: vect: Require vect_hw_misalign in gcc.dg/vect/vect-cost-model-1.c etc. [PR98238]

Several gcc.dg/vect/vect-cost-model-?.c tests FAIL on 32 and 64-bit
Solaris/SPARC:

FAIL: gcc.dg/vect/vect-cost-model-1.c -flto -ffat-lto-objects
scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-1.c scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-3.c -flto -ffat-lto-objects
scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-3.c scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-5.c -flto -ffat-lto-objects
scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-5.c scan-tree-dump vect "LOOP VECTORIZED"

The dumps show

/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:7:30:
note: ==> examining statement: _3 = *_2;
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:7:30:
missed: unsupported unaligned access
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:8:6:
missed: not vectorized: relevant stmt not supported: _3 = *_2;

so I think the tests need to require vect_hw_misalign. This is what
this patch does.

Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.

2024-02-22 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR tree-optimization/98238
* gcc.dg/vect/vect-cost-model-1.c (scan-tree-dump): Also require
vect_hw_misalign.
* gcc.dg/vect/vect-cost-model-3.c: Likewise.
* gcc.dg/vect/vect-cost-model-5.c: Likewise.

testsuite: vect: Require vect_perm in several tests [PR114071, PR113557, PR96109]

Several vectorization tests FAIL on 32 and 64-bit Solaris/SPARC:

FAIL: gcc.dg/vect/pr37027.c -flto -ffat-lto-objects scan-tree-dump-times
vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/pr37027.c -flto -ffat-lto-objects scan-tree-dump-times
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorizing stmts
using SLP" 1
FAIL: gcc.dg/vect/pr67790.c -flto -ffat-lto-objects scan-tree-dump vect
"vectorizing stmts using SLP"
FAIL: gcc.dg/vect/pr67790.c scan-tree-dump vect "vectorizing stmts using SLP"
FAIL: gcc.dg/vect/slp-47.c -flto -ffat-lto-objects scan-tree-dump-times
vect "vectorizing stmts using SLP" 2
FAIL: gcc.dg/vect/slp-47.c scan-tree-dump-times vect "vectorizing stmts
using SLP" 2
FAIL: gcc.dg/vect/slp-48.c -flto -ffat-lto-objects scan-tree-dump-times
vect "vectorizing stmts using SLP" 2
FAIL: gcc.dg/vect/slp-48.c scan-tree-dump-times vect "vectorizing stmts
using SLP" 2
FAIL: gcc.dg/vect/slp-reduc-1.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-1.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "vectorizing
stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-2.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-2.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "vectorizing
stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "vectorizing
stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-8.c -flto -ffat-lto-objects scan-tree-dump vect
"vectorized 1 loops"
FAIL: gcc.dg/vect/slp-reduc-8.c scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/vect-multi-peel-gaps.c -flto -ffat-lto-objects
scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-multi-peel-gaps.c scan-tree-dump vect "LOOP VECTORIZED"

The dumps show variations of

/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17:
note: ==> examining statement: _4 = a[i_19].f2;
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17:
missed: unsupported vect permute { 1 0 3 2 5 4 }
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17:
missed: unsupported load permutation
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:27:17:
missed: not vectorized: relevant stmt not supported: _4 = a[i_19].f2;

so I think the tests should require vect_perm. This is what this patch does

Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.

2024-02-22 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR tree-optimization/114071
* gcc.dg/vect/pr37027.c: Require vect_perm.
* gcc.dg/vect/pr67790.c: Likewise.
* gcc.dg/vect/slp-reduc-1.c: Likewise.
* gcc.dg/vect/slp-reduc-2.c: Likewise.
* gcc.dg/vect/slp-reduc-7.c: Likewise.
* gcc.dg/vect/slp-reduc-8.c: Likewise.

PR tree-optimization/113557
* gcc.dg/vect/vect-multi-peel-gaps.c (scan-tree-dump): Also
require vect_perm.

PR testsuite/96109
* gcc.dg/vect/slp-47.c: Require vect_perm.
* gcc.dg/vect/slp-48.c: Likewise.

aarch64,arm: Move branch-protection data to targets

The branch-protection types are target specific, not the same on arm
and aarch64. This currently affects pac-ret+b-key, but there will be
a new type on aarch64 that is not relevant for arm.

After the move, change aarch_ identifiers to aarch64_ or arm_ as
appropriate.

Refactor aarch_validate_mbranch_protection to take the target specific
branch-protection types as an argument.

In case of invalid input currently no hints are provided: the way
branch-protection types and subtypes can be mixed makes it difficult
without causing confusion.

gcc/ChangeLog:

* config/aarch64/aarch64.md: Rename aarch_ to aarch64_.
* config/aarch64/aarch64.opt: Likewise.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Likewise.
* config/aarch64/aarch64.cc (aarch64_expand_prologue): Likewise.
(aarch64_expand_epilogue): Likewise.
(aarch64_post_cfi_startproc): Likewise.
(aarch64_handle_no_branch_protection): Copy and rename.
(aarch64_handle_standard_branch_protection): Likewise.
(aarch64_handle_pac_ret_protection): Likewise.
(aarch64_handle_pac_ret_leaf): Likewise.
(aarch64_handle_pac_ret_b_key): Likewise.
(aarch64_handle_bti_protection): Likewise.
(aarch64_override_options): Update branch protection validation.
(aarch64_handle_attr_branch_protection): Likewise.
* config/arm/aarch-common-protos.h (aarch_validate_mbranch_protection):
Pass branch protection type description as argument.
(struct aarch_branch_protect_type): Move from aarch-common.h.
* config/arm/aarch-common.cc (aarch_handle_no_branch_protection):
Remove.
(aarch_handle_standard_branch_protection): Remove.
(aarch_handle_pac_ret_protection): Remove.
(aarch_handle_pac_ret_leaf): Remove.
(aarch_handle_pac_ret_b_key): Remove.
(aarch_handle_bti_protection): Remove.
(aarch_validate_mbranch_protection): Pass branch protection type
description as argument.
* config/arm/aarch-common.h (enum aarch_key_type): Remove.
(struct aarch_branch_protect_type): Remove.
* config/arm/arm-c.cc (arm_cpu_builtins): Remove aarch_ra_sign_key.
* config/arm/arm.cc (arm_handle_no_branch_protection): Copy and rename.
(arm_handle_standard_branch_protection): Likewise.
(arm_handle_pac_ret_protection): Likewise.
(arm_handle_pac_ret_leaf): Likewise.
(arm_handle_bti_protection): Likewise.
(arm_configure_build_target): Update branch protection validation.
* config/arm/arm.opt: Remove aarch_ra_sign_key.

middle-end/114299 - missing error recovery from gimplify failure

When internal_get_tmp_var fails to gimplify the value the temporary
SSA name is supposed to be initialized with we can leak SSA names
with a NULL SSA_NAME_DEF_STMT into the IL. That's bad, so recover
from this by instead returning a decl in that case.

PR middle-end/114299
* gimplify.cc (internal_get_tmp_var): When gimplification
of VAL failed, return a decl.

* gcc.target/i386/pr114299.c: New testcase.

bitint: Avoid rewriting large/huge _BitInt vars into SSA after bitint lowering [PR114278]

The following testcase ICEs, because update-address-taken subpass of
fre5 rewrites
  _BitInt(128) b;
  vector(16) unsigned char _3;

  <bb 2> [local count: 1073741824]:
  _3 = MEM <vector(16) unsigned char> [(char * {ref-all})p_2(D)];
  MEM <vector(16) unsigned char> [(char * {ref-all})&b] = _3;
  b ={v} {CLOBBER(eos)};
to
  _BitInt(128) b;
  vector(16) unsigned char _3;

  <bb 2> [local count: 1073741824]:
  _3 = MEM <vector(16) unsigned char> [(char * {ref-all})p_2(D)];
  b_5 = VIEW_CONVERT_EXPR<_BitInt(128)>(_3);
but we can't have large/huge _BitInt vars in SSA form after the bitint
lowering except for function arguments loaded from memory, as expansion
isn't able to deal with those, it relies on bitint lowering to lower
those operations.
The following patch fixes that by setting DECL_NOT_GIMPLE_REG_P for
large/huge _BitInt vars after bitint lowering, such that we don't
rewrite them into SSA form.

2024-03-11  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/114278
* tree-ssa.cc (maybe_optimize_var): If large/huge _BitInt vars are no
longer addressable, set DECL_NOT_GIMPLE_REG_P on them.

* gcc.dg/bitint-99.c: New test.

Fix placement of recently implemented DIE

It's the DIE added for enumeration types with reverse scalar storage order.

gcc/
PR debug/113519
PR debug/113777
* dwarf2out.cc (gen_enumeration_type_die): In the reverse case,
generate the DIE with the same parent as in the regular case.

gcc/testsuite/
* gcc.dg/sso-20.c: New test.
* gcc.dg/sso-21.c: Likewise.

Fold: Fix up merge_truthop_with_opposite_arm for NaNs [PR95351]

The problem here is that merge_truthop_with_opposite_arm would
use the type of the result of the comparison rather than the operands
of the comparison to figure out if we are honoring NaNs.
This fixes that oversight and now we get the correct results in this
case.

Committed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR middle-end/95351

gcc/ChangeLog:

* fold-const.cc (merge_truthop_with_opposite_arm): Use
the type of the operands of the comparison and not the type
of the comparison.

gcc/testsuite/ChangeLog:

* gcc.dg/float_opposite_arm-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Daily bump.

d: Fix -fpreview=in ICEs with forward referenced parameter [PR112285]

The way that the target hook preferPassByRef is implemented, it relied
on the GCC "back-end" tree type to determine whether or not to use `ref'
ABI for D `in' parameters; e.g: prefer by value if it is expected that
the target will pass the type around in registers.

Building the GCC tree type depends on the AST type being complete - all
semantic processing is finished - but as this hook is called from the
front-end, this will not be the case for forward referenced or
self-referencing types.

The consensus in upstream is that `in' parameters should always be
implicitly `ref', but as the front-end does not yet support all types
being rvalue references, limit this just static arrays and structs.

PR d/112285
PR d/112290

gcc/d/ChangeLog:

* d-target.cc (Target::preferPassByRef): Return true for all static
array and struct types.

gcc/testsuite/ChangeLog:

* gdc.dg/pr112285.d: New test.
* gdc.dg/pr112290.d: New test.

[committed] [PR tree-optimization/110199] Simplify MIN/MAX more often

So as I mentioned in the BZ, the case of

t = MIN_EXPR (A, B)

where we know something about the relationship between A and B can be trivially
handled by some existing code in DOM.  That existing code would simplify when A
== B.  But by testing GE and LE instead of EQ we can cover more cases with
minimal effort.  When applicable the MIN/MAX turns into a simple copy.

I made one other change.  We have other binary operations that we simplify when
we know something about the relationship between the operands.  That code was
not canonicalizing the order of operands when building the expression to lookup
in the hash tables to discover that relationship.  Since those paths are only
testing for equality, we can trivially reverse them and not have to worry about
changing codes or anything like that.  So extremely safe and avoids having to
come back and fix that code to match the MIN_EXPR/MAX_EXPR case later.

Bootstrapped on x86 and also tested on the crosses.  I briefly thought there
was an sh regression, but that was actually the recent fwprop changes twiddling
code generation for one test.

PR tree-optimization/110199
gcc/
* tree-ssa-scopedtables.cc
(avail_exprs_stack::simplify_binary_operation): Generalize handling
of MIN_EXPR/MAX_EXPR to allow additional simplifications.  Canonicalize
comparison operands for other cases.

gcc/testsuite

* gcc.dg/tree-ssa/minmax-27.c: New test.
* gcc.dg/tree-ssa/minmax-28.c: New test.

VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

This patch would like to fix one ICE in vectorizable_store when both the
loop_masks and loop_lens are enabled.  The ICE looks like below when build
with "-march=rv64gcv -O3".

during GIMPLE pass: vect
test.c: In function ‘d’:
test.c:6:6: internal compiler error: in vectorizable_store, at
tree-vect-stmts.cc:8691
    6 | void d() {
      |      ^
0x37a6f2f vectorizable_store
        .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
_slp_tree*, _slp_instance*, vec<stmt_info_for_cost, va_heap, vl_ptr>*)
        .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
0x1db5dca vect_analyze_loop_operations
        .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
0x1db885b vect_analyze_loop_2
        .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
0x1dba029 vect_analyze_loop_1
        .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
        .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
0x1e389d1 try_vectorize_loop_1
        .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
0x1e38f3d try_vectorize_loop
        .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
0x1e39230 execute
        .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298

There are two ways to reach vectorizer LD/ST, one is the analysis and
the other is transform.  We cannot have both the lens and the masks
enabled during transform but it is valid during analysis.  Given the
transform doesn't required cost_vec,  we can only enable the assert
based on cost_vec is NULL or not.

Below testsuites are passed for this patch:
* The x86 bootstrap tests.
* The x86 fully regression tests.
* The aarch64 fully regression tests.
* The riscv fully regressison tests.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Enable the assert
during transform process.
(vectorizable_load): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr114195-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "[committed] Adjust expectations for pr59533-1.c"

This reverts commit 7e16f819ff413c48702f9087b62eaac39a060a14.

[committed] [target/102250] Document python requirement for risc-v

PR target/102250
gcc/

* doc/install.texi: Document need for python when building
RISC-V compilers.