git.ipfire.org Git - thirdparty/gcc.git/log

rtl-optimization/105231 - distribute_notes and REG_EH_REGION

The following mitigates a problem in combine distribute_notes which
places an original REG_EH_REGION based on only may_trap_p which is
good to test whether a non-call insn can possibly throw but not if
actually it does or we care.  That's something we decided at RTL
expansion time where we possibly still know the insn evaluates
to a constant.

In fact, the REG_EH_REGION note with lp > 0 can only come from the
original i3 and an assert is added to that effect.  That means we only
need to retain the note on i3 or, if that cannot trap, drop it but we
should never move it to i2.

The following places constraints on the insns to combine with
non-call exceptions since we cannot handle the case where we
have more than one EH side-effect in the IL.  The patch also
makes sure we can accumulate that on i3 and do not split
a possible exception raising part of it to i2.  As a special
case we do not place any restriction on all externally
throwing insns when there is no REG_EH_REGION present.

2022-04-22  Richard Biener  <rguenther@suse.de>

PR rtl-optimization/105231
* combine.cc (distribute_notes): Assert that a REG_EH_REGION
with landing pad > 0 is from i3.  Put any REG_EH_REGION note
on i3 or drop it if the insn can not trap.
(try_combine): Ensure that we can merge REG_EH_REGION notes
with non-call exceptions.  Ensure we are not splitting a
trapping part of an insn with non-call exceptions when there
is any REG_EH_REGION note to preserve.

* gcc.dg/torture/pr105231.c: New testcase.

AVX512F: Add missing macro for mask(z?)_scalf_s[sd] [PR 105339]

Add missing macro under O0 and adjust macro format for scalf
intrinsics.

gcc/ChangeLog:

PR target/105339
* config/i386/avx512fintrin.h (_mm512_scalef_round_pd):
Add parentheses for parameters and djust format.
(_mm512_mask_scalef_round_pd): Ditto.
(_mm512_maskz_scalef_round_pd): Ditto.
(_mm512_scalef_round_ps): Ditto.
(_mm512_mask_scalef_round_ps): Ditto.
(_mm512_maskz_scalef_round_ps): Ditto.
(_mm_scalef_round_sd): Use _mm_undefined_pd.
(_mm_scalef_round_ss): Use _mm_undefined_ps.
(_mm_mask_scalef_round_sd): New macro.
(_mm_mask_scalef_round_ss): Ditto.
(_mm_maskz_scalef_round_sd): Ditto.
(_mm_maskz_scalef_round_ss): Ditto.

gcc/testsuite/ChangeLog:

PR target/105339
* gcc.target/i386/sse-14.c: Add tests for new macro.

Daily bump.

[committed] exec-stack warning for test which wants executable stacks

gcc/testsuite
* gcc.dg/lto/pr94157_0.c: Also request executable stack from
the linker.

fortran: Detect duplicate unlimited polymorphic types [PR103662]

This fixes a type-based alias analysis issue with unlimited polymorphic
class descriptors (types behind class(*)) causing data initialisation to
be removed by optimization.

The fortran front-end may create multiple declarations for types, for
example if a type is redeclared in each program unit it is used in.
To avoid optimization seeing them as non-aliasing, a list of derived
types is created at resolution time, and used at translation to set
the same TYPE_CANONICAL type for each duplicate type declaration.

This mechanism didn’t work for unlimited polymorphic descriptors types,
as there is a short-circuit return skipping all the resolution handling
for them, including the type registration.

This change adds type registration at the short-circuit return, and
updates type comparison to handle specifically unlimited polymorphic
fake symbols, class descriptor types and virtual table types.

The test, which exhibited mismatching dynamic types had to be fixed as
well.

PR fortran/103662

gcc/fortran/ChangeLog:

* interface.cc (gfc_compare_derived_types): Support comparing
unlimited polymorphic fake symbols. Recursively compare class
descriptor types and virtual table types.
* resolve.cc (resolve_fl_derived): Add type to the types list
on unlimited polymorphic short-circuit return.

gcc/testsuite/ChangeLog:

* gfortran.dg/unlimited_polymorphic_3.f03 (foo): Separate
bind(c) and sequence checks to...
(foo_bc, foo_sq): ... two different procedures.
(main, foo*): Change type declarations so that type name,
component name, and either bind(c) or sequence attribute match
between the main type declarations and the procedure type
declarations.
(toplevel): Add optimization dump checks.

Co-Authored-By: Jakub Jelinek <jakub@redhat.com>

Daily bump.

i386: Improve ix86_expand_int_movcc [PR105338]

The following testcase regressed on x86_64 on the trunk, due to some GIMPLE
pass changes (r12-7687) we end up an *.optimized dump difference of:
@@ -8,14 +8,14 @@ int foo (int i)

   <bb 2> [local count: 1073741824]:
   if (i_2(D) != 0)
-    goto <bb 4>; [35.00%]
+    goto <bb 3>; [35.00%]
   else
-    goto <bb 3>; [65.00%]
+    goto <bb 4>; [65.00%]

-  <bb 3> [local count: 697932184]:
+  <bb 3> [local count: 375809640]:

   <bb 4> [local count: 1073741824]:
-  # iftmp.0_1 = PHI <5(2), i_2(D)(3)>
+  # iftmp.0_1 = PHI <5(3), i_2(D)(2)>
   return iftmp.0_1;

}
and similarly for the other functions.  That is functionally equivalent and
there is no canonical form for those.  The reason for i_2(D) in the PHI
argument as opposed to 0 is the uncprop pass, that is in many cases
beneficial for expansion as we don't need to load the value into some pseudo
in one of the if blocks.
Now, for the 11.x ordering we have the pseudo = i insn in the extended basic
block (it comes first) and so forwprop1 undoes what uncprop does by
propagating constant 0 there.  But for the 12.x ordering, the extended basic
block contains pseudo = 5 and pseudo = i is in the other bb and so fwprop1
doesn't change it.
During the ce1 pass, we attempt to emit a conditional move and we have very
nice code for the cases where both last operands of ?: are constant, and yet
another for !TARGET_CMOVE if at least one of them is.

The following patch will undo the uncprop behavior during
ix86_expand_int_movcc, but just for those spots that can benefit from both
or at least one operands being constant, leaving the pure cmov case as is
(because then it is useful not to have to load a constant into a pseudo
as it already is in one).  We can do that in the
op0 == op1 ? op0 : op3
or
op0 != op1 ? op2 : op0
cases if op1 is a CONST_INT by pretending it is
op0 == op1 ? op1 : op3
or
op0 != op1 ? op2 : op1

2022-04-23  Jakub Jelinek  <jakub@redhat.com>

PR target/105338
* config/i386/i386-expand.cc (ix86_expand_int_movcc): Handle
op0 == cst1 ? op0 : op3 like op0 == cst1 ? cst1 : op3 for the non-cmov
cases.

* gcc.target/i386/pr105338.c: New test.

Daily bump.

libstdc++: Make atomic notify_one and notify_all non-const

<recording this here for future reference>
PR102994 "atomics: std::atomic<ptr>::wait is not marked const" raises the
issue that the current libstdc++ implementation marks the notify members
const, the implementation strategy used by libstdc++, as well as libc++
and the Microsoft STL, do not require the atomic to be mutable (it is hard
to conceive of a desirable implementation approach that would require it).
The original paper proposing the wait/notify functionality for atomics
(p1185) also had these members marked const for the first three revisions,
but that was changed without explanation in r3 and subsequent revisions of
the paper.

After raising the issue to the authors of p1185 and the author of the
libc++ implementation, the consensus seems to be "meh, it's harmless" so
there seems little appetite for an LWG issue to revisit the subject.

This patch changes the libstdc++ implementation to be in agreement with
the standard by removing const from those notify_one/notify_all members.

libstdc++-v3/ChangeLog:

PR libstdc++/102994
* include/bits/atomic_base.h (atomic_flag::notify_one,
notify_all): Remove const qualification.
(__atomic_base::notify_one, notify_all): Likewise.
* include/std/atomic (atomic<bool>::notify_one, notify_all):
Likewise.
(atomic::notify_one, notify_all): Likewise.
(atomic<T*>::notify_one, notify_all): Likewise.
(atomic_notify_one, atomic_notify_all): Likewise.
* testsuite/29_atomics/atomic/wait_notify/102994.cc: Adjust test
to account for change in notify_one/notify_all signature.

fortran: Use pointer arithmetic to index arrays [PR102043]

The code generated for array references used to be ARRAY_REF trees as
could be expected.  However, the middle-end may conclude from those
trees that the indexes used are non-negative (more precisely not below
the lower bound), which is a wrong assumption in the case of "reversed-
order" arrays.

The problematic arrays are those with a descriptor and having a negative
stride for at least one dimension.  The descriptor data points to the
first element in array order (which is not the first in memory order in
that case), and the negative stride(s) makes walking the array backwards
(towards lower memory addresses), and we can access elements with
negative index wrt data pointer.

With this change, pointer arithmetic is generated by default for array
references, unless we are in a case where negative indexes can’t happen
(array descriptor’s dim element, substrings, explicit shape,
allocatable, or assumed shape contiguous).  A new flag is added to
choose between array indexing and pointer arithmetic, and it’s set
if the context can tell array indexing is safe (descriptor dim
element, substring, temporary array), or a new method is called
to decide on whether the flag should be set for one given array
expression.

PR fortran/102043

gcc/fortran/ChangeLog:

* trans.h (gfc_build_array_ref): Add non_negative_offset
argument.
* trans.cc (gfc_build_array_ref): Ditto. Use pointer arithmetic
if non_negative_offset is false.
* trans-expr.cc (gfc_conv_substring): Set flag in the call to
gfc_build_array_ref.
* trans-array.cc (gfc_get_cfi_dim_item,
gfc_conv_descriptor_dimension): Same.
(build_array_ref): Decide on whether to set the flag and update
the call.
(gfc_conv_scalarized_array_ref): Same.  New argument tmp_array.
(gfc_conv_tmp_array_ref): Update call to
gfc_conv_scalarized_ref.
(non_negative_strides_array_p): New function.

gcc/testsuite/ChangeLog:

* gfortran.dg/array_reference_3.f90: New.
* gfortran.dg/negative_stride_1.f90: New.
* gfortran.dg/vector_subscript_8.f90: New.
* gfortran.dg/vector_subscript_9.f90: New.
* gfortran.dg/c_loc_test_22.f90: Update dump patterns.
* gfortran.dg/finalize_10.f90: Same.

Co-Authored-By: Richard Biener <rguenther@suse.de>

fortran: Generate an array temporary reference [PR102043]

This avoids regressing on char_cast_1.f90 and char_cast_2.f90 later in
the patch series when the code generation for array references is
changed to use pointer arithmetic.

The regressing testcases match part of an array reference in the
generated tree dump and it’s not clear how the pattern should be
rewritten to match the equivalent with pointer arithmetic.

This change uses a method specific to array temporaries to generate
array-references, so that these array references are flagged as safe
for array indexing and will not be updated to use pointer arithmetic.

PR fortran/102043

gcc/fortran/ChangeLog:
* trans-array.cc (gfc_conv_expr_descriptor): Use
gfc_conv_tmp_array_ref.

fortran: Update index extraction code. [PR102043]

This avoids a regression on hollerith4.f90 and hollerith6.f90 later in
the patch series when code generation for array references is changed
to use pointer arithmetic.

The problem comes from the extraction of the array index from an
ARRAY_REF tree, which doesn’t work if the tree is not an ARRAY_REF
any more.

This updates the code generated for remaining size evaluation to work
with a source tree that uses either array indexing or pointer
arithmetic.

PR fortran/102043

gcc/fortran/ChangeLog:

* trans-io.cc: Add handling for the case where the array
is referenced using pointer arithmetic.

fortran: Pre-evaluate string pointers. [PR102043]

This avoids a regression on deferred_character_23.f90 later in the
patch series when array references are rewritten to use pointer
arithmetic.

The problem is a SAVE_EXPR tree as TYPE_SIZE_UNIT of one array element
type, which is used by the pointer arithmetic expressions. As these
expressions appear in both branches of an if-then-else block, the tree
is lowered to a variable in one of the branches but it’s used in both
branches, which is invalid middle-end code.

This change pre-evaluates the array references or pointer arithmetics
to variables before the if-then-else block, so that the SAVE_EXPR are
expanded to variables in the parent scope of the if-then-else block,
and expressions referencing the variables remain valid in both
branches.

PR fortran/102043

gcc/fortran/ChangeLog:
* trans-expr.cc: Pre-evaluate src and dest to variables
before using them.

gcc/testsuite/ChangeLog:
* gfortran.dg/dependency_49.f90: Update variable occurence
count.

rs6000: Fix pack for soft-float (PR105334)

For PR103623 I fixed unpack, but pack is broken as well, as reported in
PR105334. Fixing that is a bit more code, but it is pretty simple code
nonetheless.

2022-04-22 Segher Boessenkool <segher@kernel.crashing.org>

PR target/105334
* config/rs6000/rs6000.md (pack<mode> for FMOVE128): New expander.
(pack<mode> for FMOVE128): Rename and split the insn_and_split to...
(pack<mode>_hard for FMOVE128): ... this...
(pack<mode>_soft for FMOVE128): ... and this.

docs: Correct "This functions" to "These functions"

2022-04-22 Paul A. Clarke <pc@us.ibm.com>

gcc
* doc/extend.texi: Correct "This" to "These".

rtlanal: Fix up replace_rtx [PR105333]

The following testcase FAILs, because replace_rtx replaces a REG with
CONST_WIDE_INT inside of a SUBREG, which is an invalid transformation
because a SUBREG relies on SUBREG_REG having non-VOIDmode but
CONST_WIDE_INT has VOIDmode.

replace_rtx already has code to deal with it, but it was doing
it only for CONST_INTs. The following patch does it also for
VOIDmode CONST_DOUBLE or CONST_WIDE_INT.

2022-04-22 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/105333
* rtlanal.cc (replace_rtx): Use simplify_subreg or
simplify_unary_operation if CONST_SCALAR_INT_P rather than just
CONST_INT_P.

* gcc.dg/pr105333.c: New test.

Daily bump.

rs6000/testsuite: xfail bswap-brw.c

This testcase does not generate anywhere near optimal code for 32-bit
code.  For p10 it actually now fails this testcase, after the previous
patch.  Let's xfail it.

2022-04-21  Segher Boessenkool  <segher@kernel.crashing.org>

gcc/testsuite/
PR target/103197
PR target/102146
* gcc.target/powerpc/bswap-brw.c: Add xfail on scan-assembler for -m32.

rs6000: Disparage lfiwzx and similar

RA now chooses GEN_OR_VSX_REGS in most cases.  This is great in most
cases, but we often (or always?) use {l,st}{f,xs}iwzx now, which is
problematic because the integer load and store insns can use cheaper
addressing modes.  We can fix that by putting a small penalty on the
instruction alternatives for those.

2022-04-21  Segher Boessenkool  <segher@kernel.crashing.org>

PR target/103197
PR target/102146
* config/rs6000/rs6000.md (zero_extendqi<mode>2 for EXTQI): Disparage
the "Z" alternatives in {l,st}{f,xs}iwzx.
(zero_extendhi<mode>2 for EXTHI): Ditto.
(zero_extendsi<mode>2 for EXTSI): Ditto.
(*movsi_internal1): Ditto.
(*mov<mode>_internal1 for QHI): Ditto.
(movsd_hardfloat): Ditto.

rs6000: Add effective target has_arch_ppc64

This is true if we have -mpowerpc64.

2022-04-21 Segher Boessenkool <segher@kernel.crashing.org>

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_has_arch_ppc64): New.

d: Merge upstream dmd eb7bee331, druntime 27834edb, phobos ac296f80c.

D front-end changes:

    - Import dmd v2.100.0-beta.1.
    - Print deprecation messages for scope violations unless
      `-frevert=dip1000' is used.
    - Fixed a missed case of switch case fallthrough not being caught by
      the compiler.

D runtime changes:

    - Import druntime v2.100.0-beta.1.

Phobos changes:

    - Import phobos v2.100.0-beta.1.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd eb7bee331.
* dmd/VERSION: Update version to v2.100.0-beta.1.
* d-lang.cc (d_handle_option): Handle OPT_frevert_dip1000.
* lang.opt (frevert=dip1000): New option.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 27834edb.
* src/MERGE: Merge upstream phobos ac296f80c.
* src/Makefile.am (PHOBOS_DSOURCES): Add std/int128.d.
* src/Makefile.in: Regenerate.

libstdc++: Avoid ASCII assumptions in floating_from_chars.cc

In starts_with_ci and in __floating_from_chars_hex's inf/nan handling,
we were assuming that the letters are contiguous and that 'A' + 32 == 'a'
which is true for ASCII but not for other character encodings.

This patch fixes starts_with_ci by using a constexpr lookup table that
maps uppercase letters to lowercase, and fixes __floating_from_chars_hex
by using __from_chars_alnum_to_val.

libstdc++-v3/ChangeLog:

* include/std/charconv (__from_chars_alnum_to_val_table):
Simplify initialization of __lower/__upper_letters.
(__from_chars_alnum_to_val): Default the template parameter to
false.
* src/c++17/floating_from_chars.cc (starts_with_ci): Don't
assume the uppercase and lowercase letters are contiguous.
(__floating_from_chars_hex): Likewise.

c++: Remove unused parameter

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_logical_expression): Remove unused
parameter.
(cxx_eval_constant_expression) <case TRUTH_ANDIF_EXPR>,
<case TRUTH_OR_EXPR>: Adjust calls to cxx_eval_logical_expression.

c++: wrong error with constexpr COMPOUND_EXPR [PR105321]

Here we issue a bogus error for the first assert in the test.  Therein
we have

<retval> = (void) (VIEW_CONVERT_EXPR<bool>(yes) || handle_error ());, VIEW_CONVERT_EXPR<int>(value);

which has a COMPOUND_EXPR, so we get to cxx_eval_constant_expression
<case COMPOUND_EXPR>.  The problem here is that we call

7044             /* Check that the LHS is constant and then discard it.  */
7045             cxx_eval_constant_expression (ctx, op0,
7046                                           true, non_constant_p, overflow_p,
7047                                           jump_target);

where lval is always true, so the PARM_DECL 'yes' is not evaluated into
its value.

Fixed by always passing false for 'lval' in cxx_eval_logical_expression;
there's no case where we actually expect an lvalue from a TRUTH_*.

PR c++/105321

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_logical_expression): Always pass false for lval
to cxx_eval_constant_expression.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-105321.C: New test.

fortran: Fix conv of UNION constructors [PR105310]

This fixes an ICE when a UNION is the (1+8*2^n)-th field in a DEC
STRUCTURE when compiled with -finit-derived -finit-local-zero.
The problem was CONSTRUCTOR_APPEND_ELT from within gfc_conv_union_initializer
modified the vector pointer, but the pointer was passed by-value,
so the old pointer from the caller (gfc_conv_structure) pointed to freed
memory.

PR fortran/105310

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_union_initializer): Pass vec* by reference.

gcc/testsuite/ChangeLog:

* gfortran.dg/dec_union_12.f90: New test.

libstdc++: Work around modules ICE in <charconv> [PR105297]

This makes the initializer for __table in __from_chars_alnum_to_val
dependent in an artificial way, which works around the reported modules
testsuite ICE by preventing the compiler from evaluating the initializer
parse time.

Compared to the alternative workaround of using a non-local class type
for __table, this workaround has the advantage of slightly speeding up
compilation of <charconv>, since now the table won't get built (via
constexpr evaluation) until the integer std::from_chars overload is
instantiated.

PR c++/105297
PR c++/105322

libstdc++-v3/ChangeLog:

* include/std/charconv (__from_chars_alnum_to_val): Make
initializer for __table dependent in an artificial way.

libstdc++: Remove bogus assertion in std::from_chars [PR105324]

I'm not sure what I was thinking when I added this assertion, maybe it
was supposed to be alignment == 1 (which is what the pmr::string actually
uses). The simplest fix is to just remove the assertion.

The assertion is no longer enabled by default on trunk, but it's still
there for the --enablke-libstdcxx-debug build, and is still wrong. The
fix is needed on the gcc-11 branch.

libstdc++-v3/ChangeLog:

PR libstdc++/105324
* src/c++17/floating_from_chars.cc (buffer_resource::do_allocate):
Remove assertion.
* testsuite/20_util/from_chars/pr105324.cc: New test.

Support --compress-debug-sections for ld.mold.

gcc/ChangeLog:

* configure.ac: Enable compressed debug sections for mold
linker.
* configure: Regenerate.

emit-rtl: Fix -fcompare-debug bug with label references in debug insns [PR105203]

When we compute LABEL_NUSES from scratch, mark_all_labels doesn't call
mark_jump_label on DEBUG_INSNs:
              if (NONDEBUG_INSN_P (insn))
                mark_jump_label (PATTERN (insn), insn, 0);
and so doesn't increment LABEL_NUSES from references in DEBUG_INSNs.
But, when we call emit_copy_of_insn_after e.g. when duplicating some
DEBUG_INSNs, we call it even on those, which then results in LABEL_NUSES
differences and -fcompare-debug failures.

The following patch makes sure we don't call it on DEBUG_INSNs.

2022-04-21  Jakub Jelinek  <jakub@redhat.com>

PR debug/105203
* emit-rtl.cc (emit_copy_of_insn_after): Don't call mark_jump_label
on DEBUG_INSNs.

* gfortran.dg/g77/pr105203.f: New test.

runtime: use correct field name for PPC32 GLIBC registers

One of these days we will get this right.

Fixes PR go/105315

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/401374

Daily bump.

c++: Fall through for arrays of T vs T cv [PR104996]

If two arrays do not have the exact same element type including
qualification, this could be e.g. f(int (&&)[]) vs. f(int const (&)[]),
which can still be distinguished by the lvalue-rvalue tiebreaker.

By tightening this branch (in accordance with the letter of the Standard) we
fall through to the next branch, which tests whether they have different
element type ignoring qualification and returns 0 in that case; thus we only
actually fall through in the T[...] vs. T cv[...] case, eventually
considering the lvalue-rvalue tiebreaker at the end of compare_ics.

Signed-off-by: Ed Catmur <ed@catmur.uk>
PR c++/104996

gcc/cp/ChangeLog:

* call.cc (compare_ics): When comparing list-initialization
sequences, do not return early.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist129.C: New test.

libstdc++: Fix macro checked by test

The macro being tested here is wrong, but just happens to have the same
value as the one supposed to be tests.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string_view/operations/copy/char/constexpr.cc:
Check correct feature test macro.

libstdc++: Use LTLIBICONV when linking libstdc++.so [PR93602]

This fixes missing libiconv symbols when libstdc++ is built on a system
that has libiconv installed. If the libiconv headers are found then
libstdc++ depends on libiconv_open etc instead of libc's iconv_open. But
without this fix libstdc++ is not linked to the libiconv library that
provides the definitions of those symbols.

As discussed in PR 93602 this changed means that libstdc++.so.6 might
have an rpath pointing to the location of the libiconv.so library. If
that is not desired, then GCC must be configured to link to a static
libiconv.a instead, using either --with-libiconv-type=static or an
in-tree build of libiconv.

libstdc++-v3/ChangeLog:

PR libstdc++/93602
* doc/xml/manual/prerequisites.xml: Document libiconv
workarounds.
* doc/html/manual/setup.html: Regenerate.
* src/Makefile.am (CXXLINK): Add $(LTLIBICONV).
* src/Makefile.in: Regenerate.

tree-optimization/104912 - ensure cost model is checked first

The following makes sure that when we build the versioning condition
for vectorization including the cost model check, we check for the
cost model and branch over other versioning checks.  That is what
the cost modeling assumes, since the cost model check is the only
one accounted for in the scalar outside cost.  Currently we emit
all checks as straight-line code combined with bitwise ops which
can result in surprising ordering of checks in the final assembly.

Since loop_version accepts only a single versioning condition
the splitting is done after the fact.

The result is a 1.5% speedup of 416.gamess on x86_64 when compiling
with -Ofast and tuning for generic or skylake.  That's not enough
to recover from the slowdown when vectorizing but it now cuts off
the expensive alias versioning test.

2022-03-21  Richard Biener  <rguenther@suse.de>

PR tree-optimization/104912
* tree-vect-loop-manip.cc (vect_loop_versioning): Split
the cost model check to a separate BB to make sure it is
checked first and not combined with other version checks.

tree-optimization/105312 - fix ISEL VCOND expansion

The following aligns ISEL VEC_COND_EXPR expansion using VCOND
with the optab query done by vector lowering. Instead of only
allowing the signed optab to provide EQ/NE compares we allow both
here though since there seems to be no documented canonicalization.

2022-04-20 Richard Biener <rguenther@suse.de>

PR tree-optimization/105312
* gimple-isel.cc (gimple_expand_vec_cond_expr): Query both
VCOND and VCONDU for EQ and NE.

* gcc.target/arm/pr105312.c: New testcase.

Fix overflows in ipa-modref-tree.cc

gcc/ChangeLog:

2022-04-20 Jan Hubicka <hubicka@ucw.cz>

PR ipa/103818

* ipa-modref-tree.cc (modref_access_node::closer_pair_p): Use
poly_offset_int to avoid overflow.
(modref_access_node::update2): likewise.

gcc/testsuite/ChangeLog:

2022-04-20 Jan Hubicka <hubicka@ucw.cz>

* gcc.c-torture/compile/103818.c: New test.

cgraph: Fix up semantic_interposition handling [PR105306]

cgraph_node has a semantic_interposition flag which should mirror
opt_for_fn (decl, flag_semantic_interposition).  But it actually is
initialized not from that, but from flag_semantic_interposition in the
  explicit symtab_node (symtab_type t)
    : type (t), resolution (LDPR_UNKNOWN), definition (false), alias (false),
...
      semantic_interposition (flag_semantic_interposition),
...
      x_comdat_group (NULL_TREE), x_section (NULL)
  {}
ctor.  I think that might be fine for varpool nodes, but since
flag_semantic_interposition is now implied from -Ofast it isn't correct
for cgraph nodes, unless we guarantee that cgraph node for a particular
function decl is always created while that function is
current_function_decl.  That is often the case, but not always as the
following function shows.
Because symtab_node's ctor doesn't know for which decl the cgraph node
is being created, the following patch keeps that as is, but updates it from
opt_for_fn (decl, flag_semantic_interposition) when we know that, or for
clones copies that flag (often it is then overridden in
set_new_clone_decl_and_node_flags, but not always).

2022-04-20  Jakub Jelinek  <jakub@redhat.com>

PR ipa/105306
* cgraph.cc (cgraph_node::create): Set node->semantic_interposition
to opt_for_fn (decl, flag_semantic_interposition).
* cgraphclones.cc (cgraph_node::create_clone): Copy over
semantic_interposition flag.

* g++.dg/opt/pr105306.C: New test.

Daily bump.

libgo: make a couple of sed uses POSIX compliant

Patch from Jonathan Wakely.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/401054

gcov-profile: Allow negative counts of indirect calls [PR105282]

TOPN metrics are histograms that contain overall count and per-bucket
count. Overall count can be negative when two profiles merge and some
of per-bucket metrics are disacarded.

Noticed as an ICE on python PGO build where gcc crashes as:

    during IPA pass: modref
    a.c:36:1: ICE: in stream_out_histogram_value, at value-prof.cc:340
       36 | }
          | ^
    stream_out_histogram_value(output_block*, histogram_value_t*)
            gcc/value-prof.cc:340

gcc/ChangeLog:

PR gcov-profile/105282
* value-prof.cc (stream_out_histogram_value): Allow negative counts
on HIST_TYPE_INDIR_CALL.

MAINTAINERS: Update my email address.

2022-04-19 Richard Henderson <rth@gcc.gnu.org>

* MAINTAINERS: Update my email address.

sparc: Preserve ORIGINAL_REGNO in epilogue_renumber [PR105257]

The following testcase ICEs, because the pic register is
(reg:DI 24 %i0 [109]) and is used in the delay slot of a return.
We invoke epilogue_renumber and that changes it to
(reg:DI 8 %o0) which no longer satisfies sparc_pic_register_p
predicate, so we don't recognize the insn anymore.

The following patch fixes that by preserving ORIGINAL_REGNO if
specified, so we get (reg:DI 8 %o0 [109]) instead.

2022-04-19 Jakub Jelinek <jakub@redhat.com>

PR target/105257
* config/sparc/sparc.cc (epilogue_renumber): If ORIGINAL_REGNO,
use gen_raw_REG instead of gen_rtx_REG and copy over also
ORIGINAL_REGNO. Use return 0; instead of /* fallthrough */.

* gcc.dg/pr105257.c: New test.

c++: Fix up CONSTRUCTOR_PLACEHOLDER_BOUNDARY handling [PR105256]

The CONSTRUCTOR_PLACEHOLDER_BOUNDARY bit is supposed to separate
PLACEHOLDER_EXPRs that should be replaced by one object or subobjects of it
(variable, TARGET_EXPR slot, ...) from other PLACEHOLDER_EXPRs that should
be replaced by different objects or subobjects.
The bit is set when finding PLACEHOLDER_EXPRs inside of a CONSTRUCTOR, not
looking into nested CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctors, and we prevent
elision of TARGET_EXPRs (through TARGET_EXPR_NO_ELIDE) whose initializer
is a CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctor.  The following testcase ICEs
though, we don't replace the placeholders in there at all, because
CONSTRUCTOR_PLACEHOLDER_BOUNDARY isn't set on the TARGET_EXPR_INITIAL
ctor, but on a ctor nested in such a ctor.  replace_placeholders should be
run on the whole TARGET_EXPR slot.

So, the following patch fixes it by moving the CONSTRUCTOR_PLACEHOLDER_BOUNDARY
bit from nested CONSTRUCTORs to the CONSTRUCTOR containing those (but only
if it is closely nested, if there is some other tree sandwiched in between,
it doesn't do it).

2022-04-19  Jakub Jelinek  <jakub@redhat.com>

PR c++/105256
* typeck2.cc (process_init_constructor_array,
process_init_constructor_record, process_init_constructor_union): Move
CONSTRUCTOR_PLACEHOLDER_BOUNDARY flag from CONSTRUCTOR elements to the
containing CONSTRUCTOR.

* g++.dg/cpp0x/pr105256.C: New test.

tree-optimization/104010 - fix SLP scalar costing with patterns

When doing BB vectorization the scalar cost compute is derailed
by patterns, causing lanes to be considered live and thus not
costed on the scalar side.  For the testcase in PR104010 this
prevents vectorization which was done by GCC 11.  PR103941
shows similar cases of missed optimizations that are fixed by
this patch.

2022-04-13  Richard Biener  <rguenther@suse.de>

PR tree-optimization/104010
PR tree-optimization/103941
* tree-vect-slp.cc (vect_bb_slp_scalar_cost): When
we run into stmts in patterns continue walking those
for uses outside of the vectorized region instead of
marking the lane live.

* gcc.target/i386/pr103941-1.c: New testcase.
* gcc.target/i386/pr103941-2.c: Likewise.

libstdc++: Stop defining _GLIBCXX_ASSERTIONS in floating_to_chars.cc

Assertions were originally enabled in the compiled-in floating-point
std::to_chars implementation to help shake out any bugs, but they
apparently impose a significant performance penalty, most notably for
the hex formatting which is around 25% slower with assertions enabled.
This seems too high a cost for unconditionally enabling them.

The newly added calls to __builtin_unreachable work around the compiler
no longer knowing that the set of valid values of 'fmt' is limited (which
was previously upheld by an assert).

libstdc++-v3/ChangeLog:

* src/c++17/floating_to_chars.cc (_GLIBCXX_ASSERTIONS): Don't
define.
(__floating_to_chars_shortest): Add __builtin_unreachable calls to
squelch false-positive -Wmaybe-uninitialized and -Wreturn-type
warnings.
(__floating_to_chars_precision): Likewise.

libstdc++: Add pretty printer for std::span

This improves the debug output for C++20 spans.

Before:
{static extent = 18446744073709551615, _M_ptr = 0x7fffffffb9a8, _M_extent = {_M_extent_value = 2}}
Now with StdSpanPrinter:
std::span of length 2 = {1, 2}

Signed-off-by: Philipp Fent <fent@in.tum.de>
libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdSpanPrinter): Define.
* testsuite/libstdc++-prettyprinters/cxx20.cc: Test it.

tree-optimization/104880 - move testcase

This renames the testcase to something picked up by the suites regexp.

2022-04-19 Richard Biener <rguenther@suse.de>

PR tree-optimization/104880
* g++.dg/opt/pr104880.cc: Rename to ...
* g++.dg/opt/pr104880.C: ... this.

libstdc++: Fix syntax error in libbacktrace configuration

Using == instead of = causes a configuration error with dash as the
shell:

checking whether to build libbacktrace support... /home/devel/building/work/src/gcc-12-20220417/libstdc++-v3/configure: 77471: test: auto: unexpected operator
/home/devel/building/work/src/gcc-12-20220417/libstdc++-v3/configure: 77474: test: auto: unexpected operator
auto

This means we fail to change the value from "auto" to "no" and so this
test passes:
GLIBCXX_CONDITIONAL(ENABLE_BACKTRACE, [test "$enable_libstdcxx_backtrace" != no])

This leads to the libbacktrace directory being included in the build
without being configured properly, and bootstrap fails.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ENABLE_BACKTRACE): Fix shell operators.
* configure: Regenerate.

Daily bump.

doc/install.texi: CRIS: Remove gone websites. Adjust CRIS targets

That is, support for cris-linux-gnu was removed in gcc-11, but
install.texi wasn't adjusted accordingly. Also, unfortunately the
developer-related sites are gone with no replacements. And, CRIS is
used in other chip series as well, but allude rather than list.

The generated manpages, info, pdf and html were sanity-checked.

gcc:
* doc/install.texi <CRIS>: Remove references to removed websites and
adjust for cris-*-elf being the only remaining toolchain.

doc/invoke.texi: CRIS: Remove references to cris-axis-linux-gnu

...and related options. These stale bits were overlooked when support
for "Linux/GNU" and CRIS v32 was removed, before the gcc-11 release.

Resulting pdf, html and info inspected for sanity.

gcc:
* doc/invoke.texi <CRIS>: Remove references to options for removed
subtarget cris-axis-linux-gnu and tweak wording accordingly.

libgo: only add signum to siglist if it doesn't exist yet

This fixes a build issue on musl libc where the same signal number
is used for SIGIO and SIGPOLL. This causes a compilation error since
the signal numbers must be unique for the signal table.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/400595

runtime: add special handling for signal 34

The musl libc uses signal 34 internally for setgid (similar to how glibc
uses signal 32 and signal 33). For this reason, special handling is
needed for this signal in the runtime. The gc implementation already
handles the signal accordingly. As such, this commit intends to
simply copy the behavior of the Google Go implementation to libgo.

See https://go.dev/issues/39343

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/400594

libstdc++: Micro-optimize __from_chars_pow2_base

In the first iteration of __from_chars_pow2_base's main loop, we need
to remember the value of the leading significant digit for sake of the
overflow check at the end (for base > 2).

This patch manually unrolls this first iteration so as to not encumber
the entire loop with logic that only the first iteration needs.  This
seems to significantly improve performance:

Base  Before  After (seconds, lower is better)
   2    9.36   9.37
   8    3.66   2.93
  16    2.93   1.91
  32    2.39   2.24

libstdc++-v3/ChangeLog:

* include/std/charconv (__from_chars_pow2_base): Manually
unroll the first iteration of the main loop and simplify
accordingly.

testsuite: Skip pr105250.c for powerpc and s390 [PR105266]

This test case pr105250.c is like its related pr105140.c, which
suffers the error with message like "{AltiVec,vector} argument
passed to unprototyped" on powerpc and s390. So like commits
r12-8025 and r12-8039, this fix is to add the dg-skip-if for
powerpc*-*-* and s390*-*-*.

gcc/testsuite/ChangeLog:

PR testsuite/105266
* gcc.dg/pr105250.c: Skip for powerpc*-*-* and s390*-*-*.

Daily bump.

doc: Adjust mingw-w64 download link

gcc:
* doc/install.texi (Specific): Adjust mingw-w64 download link.

Daily bump.

compiler: revert `for package-scope "a = b; b = x" just set "a = x"`

Revert CL 245098.  It caused incorrect initialization ordering.

Adjust the runtime package to work even with the CL reverted.

Original description of CL 245098:

    This avoids requiring an init function to initialize the variable.
    This can only be done if x is a static initializer.

    The go1.15rc1 runtime package relies on this optimization.
    The package has a variable "var maxSearchAddr = maxOffAddr".
    The maxSearchAddr variable is used by code that runs before package
    initialization is complete.

For golang/go#51913

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/395994

libstdc++: Avoid double-deref of __first in ranges::minmax [PR104858]

PR libstdc++/104858

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__minmax_fn): Avoid dereferencing
__first twice at the start.
* testsuite/25_algorithms/minmax/constrained.cc (test06): New test.

rs6000: Move more g++.dg powerpc tests to g++.target

Also adjust DejaGnu directives, as specifically requiring "powerpc*-*-*" is no
longer required.

2022-04-15 Paul A. Clarke <pc@us.ibm.com>

gcc/testsuite
* g++.dg/debug/dwarf2/const2.C: Move to g++.target/powerpc.
* g++.dg/other/darwin-minversion-1.C: Likewise.
* g++.dg/eh/ppc64-sighandle-cr.C: Likewise.
* g++.dg/eh/simd-5.C: Likewise.
* g++.dg/eh/simd-4.C: Move to g++.target/powerpc, adjust dg directives.
* g++.dg/eh/uncaught3.C: Likewise.
* g++.dg/other/spu2vmx-1.C: Likewise.

c++: wrong error with variadic concept [PR105268]

Here we issue a wrong error for the

  template<typename T = S<C_many<int>>> void g();

line in the testcase.  I surmise that's because we mistakenly parse
C_many<int> as a placeholder-type-specifier, and things go wrong from
there.  We are in a default argument so we should reject parsing C_many<int>
as a placeholder-type-specifier, which would mean creating a new parameter.
We want C_many<int> to be a concept-id instead.

It's interesting to see why the same problem didn't occur for C_one<int>.
In that case, cp_parser_placeholder_type_specifier -> finish_type_constraints
-> build_type_constraint -> build_concept_check -> build_standard_check ->
coerce_template_parms fails the parse here:

8916   nargs = inner_args ? NUM_TMPL_ARGS (inner_args) : 0;
8917   if ((nargs - variadic_args_p > nparms && !variadic_p)
8918       || (nargs < nparms - variadic_p
8919           && require_all_args
8920           && !variadic_args_p
8921           && (!use_default_args
8922               || (TREE_VEC_ELT (parms, nargs) != error_mark_node
8923                   && !TREE_PURPOSE (TREE_VEC_ELT (parms, nargs))))))
8924     {
8925     bad_nargs:
...
8943       return error_mark_node;

because nargs is 2 (the targs are <WILDCARD_DECL, int>) while nparms is
1 (for the one 'typename' in the tparam list of C_one).  But for
C_many<int> variadic_p is true so we don't return error_mark_node but
<type_argument_pack>.

This patch does not issue any error for the !tentative case because
I didn't figure out how to trigger that.  So it adds an assert instead.

PR c++/105268

gcc/cp/ChangeLog:

* parser.cc (cp_parser_placeholder_type_specifier): Return
error_mark_node when trying to build up a constrained parameter in
a default argument.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/variadic6.C: New test.

libstdc++: Optimize integer std::from_chars

This applies the following optimizations to the integer std::from_chars
implementation:

  1. Use a lookup table for converting an alphanumeric digit to its
     base-36 value instead of using a range test (for 0-9) and switch
     (for a-z and A-Z).  The table is constructed using a C++14
     constexpr function which doesn't assume a particular character
     encoding or __CHAR_BIT__ value.  This new conversion function
     __from_chars_alnum_to_val is templated on whether we care
     only about the decimal digits, in which case we can perform the
     conversion with a single subtraction since the digit characters
     are guaranteed to be contiguous (unlike the letters).
  2. Generalize __from_chars_binary to handle all power-of-two bases.
     This function (now named __from_chars_pow2_base) is also templated
     on whether we care only about the decimal digits for the benefit of
     faster digit conversion for base 2, 4 and 8.
  3. In __from_chars_digit, use
       static_cast<unsigned char>(__c - '0') < __base
     instead of
       '0' <= __c && __c <= ('0' + (__base - 1)).
     as the digit recognition test (exhaustively verified that the two
     tests are equivalent).
  4. In __from_chars_alnum, use a nested loop to consume the rest of the
     digits in the overflow case (mirroring __from_chars_digit) so that
     the main loop doesn't have to maintain the overflow flag __valid.

At this point, __from_chars_digit is nearly identical to
__from_chars_alnum, so this patch merges the two functions by removing
the former and templatizing the latter according to whether we care only
about the decimal digits.  Finally,

  5. In __from_chars_alnum, maintain a lower bound on the number of
     unused bits in the result and use it to omit the overflow check
     when it's safe to do so.

In passing, this patch replaces the non-portable function ascii_to_hexit
used by __floating_from_chars_hex with the new conversion function.

Some runtime measurements for a simple 15-line benchmark that roundtrips
printing/parsing 200 million integers via std::to/from_chars (average of
5 runs):

  Base  Before  After (seconds, lower is better)
     2    9.37   9.37
     3   15.79  12.13
     8    4.15   3.67
    10    4.90   3.86
    11    6.84   5.03
    16    4.14   2.93
    32    3.85   2.39
    36    5.22   3.26

libstdc++-v3/ChangeLog:

* include/std/charconv (__from_chars_alnum_to_val_table): Define.
(__from_chars_alnum_to_val): Define.
(__from_chars_binary): Rename to ...
(__from_chars_pow2_base): ... this.  Generalize to handle any
power-of-two base using __from_chars_alnum_to_val.
(__from_chars_digit): Optimize digit recognition to a single
test instead of two tests.  Use [[__unlikely___]] attribute.
(__from_chars_alpha_to_num): Remove.
(__from_chars_alnum): Use __from_chars_alnum_to_val.  Use a
nested loop for the overflow case.  Maintain a lower bound
on the number of available bits in the result and use it to
omit the overflow check.
(from_chars): Adjust appropriately.
* src/c++17/floating_from_chars.cc (ascii_to_hexit): Remove.
(__floating_from_chars_hex): Use __from_chars_alnum_to_val
to recognize a hex digit instead.

i386: Correct target attribute for crc32 intrinsics

Complile _mm_crc32_u8/16/32/64 intrinsics with -mcrc32
would meet target specific option mismatch. Correct target pragma
to fix.

gcc/ChangeLog:

* config/i386/smmintrin.h: Correct target pragma from sse4.1
and sse4.2 to crc32 for crc32 intrinsics.

gcc/testsuite/ChangeLog:

* gcc.target/i386/crc32-6.c: Adjust dg-error message.
* gcc.target/i386/crc32-7.c: New test.

c++: unsigned int32_t enum promotion [PR102804]

There's been an extension for a long time to allow applying 'unsigned' to an
int typedef, but that was confusing the integer promotion code. Fixed by
forgetting about the typedef in that case.

I'm going to make this an unconditional pedwarn in stage 1.

PR c++/102804

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Drop typedef used with 'unsigned'.

gcc/testsuite/ChangeLog:

* g++.dg/ext/unsigned-typedef1.C: New test.

c++: using in diagnostics [PR102987]

The expression pretty-printing code crashed on a location wrapper with no
type, and didn't know what to do with a USING_DECL.

PR c++/102987

gcc/cp/ChangeLog:

* error.cc (dump_expr): Handle USING_DECL.
[VIEW_CONVERT_EXPR]: Just look through location wrapper.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/using1.C: New test.

Daily bump.

Update gcc de.po, fr.po, sv.po

* de.po, fr.po, sv.po: Update.

analyzer: fix escaping of pointer arithmetic [PR105264]

PR analyzer/105264 reports that the analyzer can fail to treat
(PTR + IDX) and PTR[IDX] as referring to the same memory under
some situations.

There are various ways in which this can happen when IDX is a
symbolic value, due to having several ways in which such memory
regions can be referred to symbolically. I attempted to fix this by
being smarter when folding svalues and regions, but this fix
seems too fiddly to attempt in stage 4.

Instead, this less ambitious patch fixes a false positive from
-Wanalyzer-use-of-uninitialized-value by making the analyzer's escape
analysis smarter, so that it treats *PTR as escaping when
(PTR + OFFSET) is passed to an external function, and thus
it treats *PTR as possibly-initialized (the "passing &PTR[IDX]" case
was already working).

gcc/analyzer/ChangeLog:
PR analyzer/105264
* region-model-reachability.cc (reachable_regions::handle_parm):
Use maybe_get_deref_base_region rather than just region_svalue, to
handle pointer arithmetic also.
* svalue.cc (svalue::maybe_get_deref_base_region): New.
* svalue.h (svalue::maybe_get_deref_base_region): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/105264
* gcc.dg/analyzer/torture/symbolic-10.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

runtime: use regset indexes for PPC register values

Using names depended on <asm/ptrace.h>, which glibc includes somewhere
but musl did not. Change to just always use indexes.

Based on patch by Sören Tempel.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/400214

c++: constexpr trivial -fno-elide-ctors [PR104646]

The constexpr constructor checking code got confused by the expansion of a
trivial copy constructor; we don't need to do that checking for defaulted
ctors, anyway.

PR c++/104646

gcc/cp/ChangeLog:

* constexpr.cc (maybe_save_constexpr_fundef): Don't do extra
checks for defaulted ctors.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-fno-elide-ctors1.C: New test.

libgccjit: Fix a bootstrap break for some targets.

Some targets use 'long long unsigned int' for unsigned HW int, and this
leads to a Werror=format= fail for two print cases in jit-playback.cc
introduced in r12-8117-g30f7c83e9cfe (Add support for bitcasts [PR104071])

As discussed on IRC, casting to (long) seems entirely reasonable for the
values (since they are type sizes).

tested that this fixes bootstrap on x86_64-darwin19 and running check-jit.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/jit/ChangeLog:

* jit-playback.cc (new_bitcast): Cast values returned by tree_to_uhwi
to 'long' to match the print format.

c++: lambda and the current instantiation [PR82980]

When a captured variable is type-dependent, we've expressed the type of the
capture field and proxy with a decltype variant.  But if the type is "the
current instantiation", we need to be able to see that so that we can do
lookup inside it just like we could with the captured variable itself.

I also tried looking through lambda capture in
cp_parser_postfix_dot_deref_expression, but this way seems cleaner.  I plan
to treat more types as deducible in stage 1.

I considered also using this in do_auto_deduction, but think that would be
wrong: [temp.dep.expr] says an id-expression is type-dependent if it is
"associated by name lookup with a variable declared with a type that
contains a placeholder type where the initializer is type-dependent".  That
doesn't clearly exclude deducing a dependent type from the initializer, but
it seems like a barrier, and other implementations agree.

PR c++/82980

gcc/cp/ChangeLog:

* lambda.cc (type_deducible_expression_p): New.
(lambda_capture_field_type): Check it.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-current-inst1.C: New test.

Refactor and update CTF testcases [PR105089]

This commit splits the ctf-array-2.c into ctf-array-5.c and
ctf-variables.c with the following responsibilities:

[1] ctf-array-2.c: Test CTF generation for unsized arrays.
[2] ctf-array-5.c: Test CTF generation for unsized but initialized array.
[3] ctf-variables-3.c: Test CTF generation for extern variable with defining
decl.

Earlier all three tests above were being done in ctf-array-2.c.  The
checks around [3] were very loose in the original version of ctf-array-2.c
in that the testcase was only checking that the types are as expected.  The
compiler was emitting two CTF variable records as follows:

Variables:
  _CTF_NEWSTR ->  5: const const char [0] (size 0x0) -> 4: const char [0] (size 0x0)
  _CTF_NEWSTR ->  8: const const char [8] (size 0x8) -> 7: const char [8] (size 0x8)

This is incorrect behaviour as it creates ambiguity.  The testcase
ctf-variables-3.c now has added checks that only one CTF variable record
is expected.

2022-04-14  Indu Bhagat  <indu.bhagat@oracle.com>

gcc/testsuite/ChangeLog:

PR debug/105089
* gcc.dg/debug/ctf/ctf-array-2.c: Refactor testcase.  Move some
checks ...
* gcc.dg/debug/ctf/ctf-array-5.c: ... to here.
* gcc.dg/debug/ctf/ctf-variables-3.c: ... and here.  Add
additional checks for one CTF variable and one CTF object info
record.

CTF for extern variable fix [PR105089]

The CTF format cannot differentiate between a non-defining extern
variable declaration vs. a defining variable declaration (unlike DWARF).
So, the correct behaviour wrt the compiler generating CTF for such
extern variables (i.e., when both the defining and non-defining decl
are present in the same CU) is to simply emit the CTF variable
correspoding to the defining declaration.

To carry out the above, following changes are introduced via the patch:

1. The CTF container (ctfc.h) now keeps track of the non-defining declarations
(by noting the DWARF attribute DW_AT_specification) in a new ctfc_ignore_vars
hashtable.  Such book-keeping is necessary because the CTF container should
not rely on the order of DWARF DIEs presented to it at generation time.

2. At the time of ctf_add_variable (), the DW_AT_specification DIE if present
is added in the ctfc_ignore_vars hashtable.  The CTF variable generation for
the defining declaration continues as normal.

3. If the ctf_add_variable () is asked to generate CTF variable for a DIE
present in the ctfc_ignore_vars, it skips generating CTF for it.

4. Recall that CTF variables are pre-processed before emission.  Till now, the
only pre-processing that was being done was to sort them in order of their
names.  Now an additional step is added:  If the CTF variable which
corresponds to the non-defining declaration is indeed present in the ctfc_vars
hashtable (because the corresponding DWARF DIE was encountered first by the
CTF generation engine), skip that CTF variable from output.

An important side effect of such a workflow above is that CTF for the C type
of the non-defining decl will remain in the CTF dictionary (and will be
emitted in the output section as well).  This type can be pruned by the
link-time de-duplicator as usual, if deemed unused.

2022-04-14  Indu Bhagat  <indu.bhagat@oracle.com>

gcc/ChangeLog:

PR debug/105089
* ctfc.cc (ctf_dvd_ignore_insert): New function.
(ctf_dvd_ignore_lookup): Likewise.
(ctf_add_variable): Keep track of non-defining decl DIEs.
(new_ctf_container): Initialize the new hash-table.
(ctfc_delete_container): Empty hash-table.
* ctfc.h (struct ctf_container): Add new hash-table.
(ctf_dvd_ignore_lookup): New declaration.
(ctf_add_variable): Add additional argument.
* ctfout.cc (ctf_dvd_preprocess_cb): Skip adding CTF variable
record for non-defining decl for which a defining decl exists
in the same TU.
(ctf_preprocess): Defer updating the number of global objts
until here.
(output_ctf_header): Use ctfc_vars_list_count as some CTF
variables may not make it to the final output.
(output_ctf_vars): Likewise.
* dwarf2ctf.cc (gen_ctf_variable): Skip generating CTF variable
if this is known to be a non-defining decl DIE.

ctfc: get rid of the static variable in ctf_list_add_ctf_vars ()

2022-04-14 Indu Bhagat <indu.bhagat@oracle.com>

gcc/ChangeLog:

* ctfc.h (struct ctf_container): Introduce a new member.
* ctfout.cc (ctf_list_add_ctf_vars): Use it instead of static
variable.

libstdc++: Default to mutex-based atomics on RISC-V

The RISC-V port requires libatomic to be linked in order to resolve
various atomic functions, which results in builds that have
"--with-libstdcxx-lock-policy=auto" defaulting to mutex-based locks.
Changing this to direct atomics breaks the ABI, this forces the auto
detection mutex-based atomics on RISC-V in order to avoid a silent ABI
break for users.

See Bug 84568 for more discussion. In the long run there may be a way
to get the higher-performance atomics without an ABI flag day, but
that's going to be a much more complicated operation. We don't even
have support for the inline atomics yet, but given that some folks have
been discussing hacks to make these libatomic routines appear implicitly
it seems prudent to just turn off the automatic detection for RISC-V.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Force auto to mutex
for RISC-V.
* configure: Regenerate.

libstdc++: Fix incorrect IS number in doc comment

libstdc++-v3/ChangeLog:

* doc/xml/manual/intro.xml: Fix comment.

analyzer: fix ICE comparing VECTOR_CSTs [PR105252]

gcc/analyzer/ChangeLog:
PR analyzer/105252
* svalue.cc (cmp_cst): When comparing VECTOR_CSTs, compare the
types of the encoded elements before calling cmp_cst on them.

gcc/testsuite/ChangeLog:
PR analyzer/105252
* gcc.dg/analyzer/pr105252.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

simplify-rtx: Don't assume shift count has the same mode as the shift [PR105247]

The following testcase ICEs on ia64.  It is UB at runtime, but we shouldn't
ICE on it...
The problem is that on ia64, the shift count (last operand of ASHIFT etc.)
is promoted to DImode (using zero-extension), while most other targets
use much narrower modes (say QImode).  If we try to simplify a shift
and the shift count is CONST_INT or other VOIDmode integer constant
which isn't properly sign extended for the first operand's mode
(in the testcase the shift count is 0xfffffff8U and it is a SImode shift),
then we ICE during wide_int wop1 = pop1; in the first hunk, INTVAL == 0xfffffff8U
is not valid for SImode.  I think in theory we could run into this even
on other targets, say if they use SImode or HImode shift counts for e.g.
QImode shifts.  I hope word size is the upper bound of what a reasonable
target should use, using e.g. multiple registers for the shift count is
insane, so the following patch if op1 has VOIDmode and int_mode
is narrower than word uses word_mode for extraction of the value.

2022-04-14  Jakub Jelinek  <jakub@redhat.com>

PR target/105247
* simplify-rtx.cc (simplify_const_binary_operation): For shifts
or rotates by VOIDmode constant integer shift count use word_mode
for the operand if int_mode is narrower than word.

* gcc.c-torture/compile/pr105247.c: New test.

testsuite/s390: Silence warning in pr80725.c

This test case checks that we do not ICE but FAILs because of
-Wint-to-pointer-cast. Silence this warning.

gcc/testsuite/ChangeLog:

* gcc.target/s390/pr80725.c: Add -Wno-int-to-pointer-cast.

s390: Add scheduler description for z16.

This patch adds the scheduler description for the z16 machine.

gcc/ChangeLog:

* config/s390/s390.cc (s390_get_sched_attrmask): Add z16.
(s390_get_unit_mask): Likewise.
(s390_is_fpd): Likewise.
(s390_is_fxd): Likewise.
* config/s390/s390.h (s390_tune_attr): Set max tune level to z16.
* config/s390/s390.md (z900,z990,z9_109,z9_ec,z10,z196,zEC12,z13,z14,z15):
Add z16.
(z900,z990,z9_109,z9_ec,z10,z196,zEC12,z13,z14,z15,z16):
Likewise.
* config/s390/3931.md: New file.

libstdc++: Add new headers to <bits/stdc++.h> PCH

libstdc++-v3/ChangeLog:

* include/precompiled/stdc++.h: Include <stacktrace> and
<stdatomic.h> for C++23.

libstdc++: Fix missing and incorrect feature test macros [PR105269]

libstdc++-v3/ChangeLog:

PR libstdc++/105269
* include/bits/stl_vector.h (__cpp_lib_constexpr_vector):
Define.
* include/c_compatibility/stdatomic.h (__cpp_lib_stdatomic_h):
Define.
* include/std/optional (__cpp_lib_optional): Define new value
for C++23.
(__cpp_lib_monadic_optional): Remove.
* include/std/version (__cpp_lib_constexpr_vector): Define.
(__cpp_lib_stdatomic_h): Define.
(__cpp_lib_optional): Define new value for C++23.
(__cpp_lib_monadic_optional): Remove.
* testsuite/20_util/optional/monadic/and_then.cc: Adjust.
* testsuite/20_util/optional/requirements.cc: Adjust for C++23.
* testsuite/20_util/optional/version.cc: Likewise.
* testsuite/23_containers/vector/cons/constexpr.cc: Check
feature test macro.
* testsuite/29_atomics/headers/stdatomic.h/c_compat.cc:
Likewise.
* testsuite/20_util/optional/monadic/version.cc: Removed.
* testsuite/23_containers/vector/requirements/version.cc: New test.
* testsuite/29_atomics/headers/stdatomic.h/version.cc: New test.

c++: alignment of local typedef in template [PR65211]

Because common_handle_aligned_attribute only applies the alignment to the
TREE_TYPE of a typedef, not the DECL_ORIGINAL_TYPE, we need to copy it
explicitly in tsubst.

PR c++/65211

gcc/cp/ChangeLog:

* pt.cc (tsubst_decl) [TYPE_DECL]: Copy TYPE_ALIGN.

gcc/testsuite/ChangeLog:

* g++.target/i386/vec-tmpl1.C: New test.

c++: local fn and generic lambda [PR97219]

When instantiating the op() for a generic lambda, we can no longer do name
lookup inside function scopes enclosing the lambda, so we need to remember
the lookup result from processing the definition of the lambda.  So the code
in finish_call_expr to throw away the lookup result and instead look it up
again at instantiation time needs to be adjusted.  The approach I take is to
only discard the result if the local extern comes from dependent scope; once
the enclosing function template is instantiated and we're regenerating the
lambda, then we can remember the result of lookup.  We also need any default
arguments to be instantiated at that point.

PR c++/97219

gcc/cp/ChangeLog:

* name-lookup.cc (dependent_local_decl_p): New.
* cp-tree.h (dependent_local_decl_p): Declare.
* semantics.cc (finish_call_expr): Use it.
* pt.cc (tsubst_arg_types): Also substitute default args
for local externs.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/lambda-generic-local-fn1.C: New test.

c++: template conversion op [PR101698]

Asking for conversion to a dependent type also makes a BASELINK dependent.

PR c++/101698

gcc/cp/ChangeLog:

* pt.cc (tsubst_baselink): Also check dependent optype.

gcc/testsuite/ChangeLog:

* g++.dg/template/conv19.C: New test.

c++: NRV and ref-extended temps [PR101442]

This issue goes back to r83221, where the cleanup for extended ref temps
changed from being unconditional to being tied to the declaration they
formed part of the initializer for.

The named return value optimization changes the cleanup for the NRV variable
to only run on the EH path; we don't want that change to affect temporary
cleanups. The perform_member_init change isn't necessary (there 'decl' is a
COMPONENT_REF), it's just for consistency.

PR c++/101442

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Don't pass decl to push_cleanup.
* init.cc (perform_member_init): Likewise.
* semantics.cc (push_cleanup): Adjust comment.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-nrv1.C: New test.

c++: add test [PR105265]

This was fixed by r12-1165, but good to have a test that doesn't need
-fno-elide-constructors.

PR c++/105265
PR c++/100838

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-new6.C: New test.

Daily bump.

go.test: update issue10441.go to current upstream version

This test only needs to be compiled, not linked.

aarch64: Make sure the UF divides the VF [PR105254]

In this PR, we were trying to set the unroll factor to a value higher
than the minimum VF (or more specifically, to a value that doesn't
divide the VF).  I guess there are two approaches to this: let the
target pick any value it likes and make target-independent code pare
it back to something that makes sense, or require targets to supply
sensible values from the outset.  This patch goes for the latter
approach.

gcc/
PR tree-optimization/105254
* config/aarch64/aarch64.cc
(aarch64_vector_costs::determine_suggested_unroll_factor): Take a
loop_vec_info as argument.  Restrict the unroll factor to values
that divide the VF.
(aarch64_vector_costs::finish_cost): Update call accordingly.

gcc/testsuite/
PR tree-optimization/105254
* g++.dg/vect/pr105254.cc: New test.

OpenMP/Fortran: Fix EXIT in loop diagnostic [PR105242]

gcc/fortran/ChangeLog:

PR fortran/105242
* match.cc (match_exit_cycle): Handle missing OMP LOOP, DO and SIMD
directives in the EXIT/CYCLE diagnostic.

gcc/testsuite/ChangeLog:

PR fortran/105242
* gfortran.dg/gomp/loop-exit.f90: New test.

c++: empty base constexpr -fno-elide-ctors [PR105245]

The patch for 100111 extended our handling of empty base elision to the case
where the derived class has no other fields, but we still need to make sure
that there's some initializer for the derived object.

PR c++/105245
PR c++/100111

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_store_expression): Build a CONSTRUCTOR
as needed in empty base handling.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-empty2.C: Add -fno-elide-constructors.

d: Merge upstream dmd 4d1bfcf14, druntime 9ba9a6ae, phobos c0cc5e917.

D front-end changes:

    - Import dmd v2.099.1.
    - Added `@mustuse' attribute, implmenting DIP 1038.
    - Added `.tupleof` property for static arrays

D runtime changes:

    - Import druntime v2.099.1.

Phobos changes:

    - Import phobos v2.099.1.
    - Zlib bindings have been updated to 1.2.12.

gcc/d/ChangeLog:

* Make-lang.in (D_FRONTEND_OBJS): Add d/common-bitfields.o,
d/mustuse.o.
* d-ctfloat.cc (CTFloat::isIdentical): Don't treat NaN values as
identical.
* dmd/MERGE: Merge upstream dmd 4d1bfcf14.
* expr.cc (ExprVisitor::visit (VoidInitExp *)): New.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 9ba9a6ae.
* src/MERGE: Merge upstream phobos c0cc5e917.

tree-optimization/105263 - reassoc and DFP

reassoc has certain tricks which in the end depend on the ability
to undo them.  For DFP creating a -1. constant is easy but
re-identifying is appearantly not - real_minus_onep rejects those
outright for DFP.  So we have to disable (at least) this one trick.

2022-04-13  Richard Biener  <rguenther@suse.de>

PR tree-optimization/105263
* tree-ssa-reassoc.cc (try_special_add_to_ops): Do not consume
negates in multiplication chains with DFP.

* gcc.dg/pr105263.c: New testcase.

tree.cc: Use useless_type_conversion_p in tree_builtin_call_types_compatible_p while in gimple form [PR105253]

tree_builtin_call_types_compatible_p uses TYPE_MAIN_VARIANT comparisons
or tree_nop_conversion_p to ensure a builtin has correct GENERIC arguments.
Unfortunately this regressed when get_call_combined_fn is called during
GIMPLE optimizations.  E.g. when number_of_iterations_popcount is called,
it doesn't ensure TYPE_MAIN_VARIABLE compatible argument type, it picks
__builtin_popcount{,l,ll} based just on types' precision and doesn't
fold_convert the arg to the right type.  We are in GIMPLE, such conversions
are useless...
So, either we'd need to fix number_of_iterations_popcount to add casts
and inspect anything else that creates CALL_EXPRs late, or we can
in tree_builtin_call_types_compatible_p just use the GIMPLE type
comparisons (useless_type_conversion_p) when we are in GIMPLE form and
the TYPE_MAIN_VARIANT comparison or tree_nop_conversion_p test otherwise.

I think especially this late in stage4 the latter seems safer to me.

2022-04-13  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/105253
* tree.cc (tree_builtin_call_types_compatible_p): If PROP_gimple,
use useless_type_conversion_p checks instead of TYPE_MAIN_VARIANT
comparisons or tree_nop_conversion_p checks.

* gcc.target/i386/pr105253.c: New test.

c++: Treat alignas align_expr and aligned attribute's operand as manifestly constant evaluation [PR105233]

The following testcase fails, because we only constant evaluate the
alignas argument as non-manifestly constant-evaluated and as
__builtin_is_constant_evaluated appears, we make it non-constant
(the reason is that we often try to evaluate some expression without
manifestly_const_eval perhaps even multiple times before actually
evaluating it with manifestly_const_eval (e.g. when folding for warnings
and in many other places), and we don't want __builtin_is_constant_evaluated
to evaluate to false in those cases, because we could get a different
result from when we actually evaluate it with manifestly_const_eval
set).
Now, for alignas the standard seems to be clear, it says the
argument is constant-expression, which means we should
manifestly-constant-eval it.
Attributes are a fuzzy area, they are extensions and various attributes
take e.g. identifiers, or string literals etc. as arguments.

Either we can just treat alignas as manifestly-const-eval, for that
we'd need some way how to differentiate between alignas and gnu::aligned
or aligned attribute.

Another possibility is what the patch below implements, treat
both alignas and gnu::aligned and aligned attribute's argument as
manifestly-const-eval and not do that for other attributes.

Another is to go through all attributes and figure out for which
such treatment is useful (e.g. those that expect INTEGER_CST as argument),
and either add a new column in the attribute table or have another table
in the C++ FE to find out which attribute needs that.

Another is do that for all the attribute arguments that are EXPR_P
and see what breaks (bet that it could be quite risky this late in
GCC 12 cycle and especially for backporting).

2022-04-13 Jakub Jelinek <jakub@redhat.com>

PR c++/105233
* decl2.cc (cp_check_const_attributes): For aligned attribute
pass manifestly_const_eval=true to fold_non_dependent_expr.

* g++.dg/cpp2a/is-constant-evaluated13.C: New test.