git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

c++: constraint rewriting during ttp coercion [PR111485]

In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters. The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2. This patch
fixes this by including the outer template arguments in the substitution,
which ought to match the depth of the ttp.

The second testcase demonstrates it's better to substitute the concrete
outer template arguments instead of generic ones since a ttp's constraints
could depend on outer parameters.

PR c++/111485

gcc/cp/ChangeLog:

* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.

(cherry picked from commit 6f902a42b0afe3f3145bcb864695fc290b5acc3e)

Daily bump.

c++: value dependence of by-ref lambda capture [PR108975]

We are still ICEing on the generic lambda version of the testcase from
this PR, even after r13-6743-g6f90de97634d6f, due to the by-ref capture
of the constant local variable 'dim' being considered value-dependent
when regenerating the lambda (at which point processing_template_decl is
set since the lambda is generic), which prevents us from constant folding
its uses.  Later during prune_lambda_captures we end up not thoroughly
walking the body of the lambda and overlook the (non-folded) uses of
'dim' within the array bound and using-decls.

We could fix this by making prune_lambda_captures walk the body of the
lambda more thoroughly so that it finds these uses of 'dim', but ideally
we should be able to constant fold all uses of 'dim' ahead of time and
prune the implicit capture after all.

To that end this patch makes value_dependent_expression_p return false
for such by-ref captures of constant local variables, allowing their
uses to get constant folded ahead of time.  It seems we just need to
disable the predicate's conservative early exit for reference variables
(added by r5-5022-g51d72abe5ea04e) when DECL_HAS_VALUE_EXPR_P.  This
effectively makes us treat by-value and by-ref captures more consistently
when it comes to value dependence.

PR c++/108975

gcc/cp/ChangeLog:

* pt.cc (value_dependent_expression_p) <case VAR_DECL>:
Suppress conservative early exit for reference variables
when DECL_HAS_VALUE_EXPR_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-const11a.C: New test.

(cherry picked from commit 3d674e29d7f89bf93fcfcc963ff0248c6347586d)

Daily bump.

i386: Fix mmx.md signbit expanders [PR112816]

Apparently when looking for "signbit<mode>2" vector expanders, I've only
looked at sse.md and forgot mmx.md, which has another one and the
following patch still ICEd.

2023-12-19 Jakub Jelinek <jakub@redhat.com>

PR target/112816
* config/i386/mmx.md (signbitv2sf2): Force operands[1] into a REG.

* gcc.target/i386/sse2-pr112816-2.c: New test.

(cherry picked from commit 80e1375ed7a7a05a5a60a57e72c5ad5eba005798)

Daily bump.

tree-object-size: Robustify alloc_size attribute handling [PR113013]

The following testcase ICEs because we aren't careful enough with
alloc_size attribute.  We do check that such an argument exists
(although wouldn't handle correctly functions with more than INT_MAX
arguments), but didn't check that it is scalar integer, the ICE is
trying to fold_convert a structure to sizetype.

Given that the attribute can also appear on non-prototyped functions
where the arguments aren't known, I don't see how the FE could diagnose
that and because we already handle the case where argument doesn't exist,
I think we should also verify the argument is scalar integer convertible
to sizetype.  Furthermore, given this is not just in diagnostics but
used for code generation, I think it is better to punt on arguments with
larger precision then sizetype, the upper bits are then truncated.

The patch also fixes some formatting issues and avoids duplication of the
fold_convert, plus removes unnecessary check for if (arg1 >= 0), that is
always the case after if (arg1 < 0) return ...;

2023-12-18  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/113013
* tree-object-size.cc (alloc_object_size): Return size_unknown if
corresponding argument(s) don't have integral type or have integral
type with higher precision than sizetype.  Don't check arg1 >= 0
uselessly.  Compare argument indexes against gimple_call_num_args
in unsigned type rather than int.  Formatting fixes.

* gcc.dg/pr113013.c: New test.

(cherry picked from commit 5347263b347d02e875879ca40ca6e289ac178919)

Daily bump.

c++: Unshare folded SAVE_EXPR arguments during cp_fold [PR112727]

The following testcase is miscompiled because two ubsan instrumentations
run into each other.
The first one is the shift instrumentation.  Before the C++ FE calls
it, it wraps the 2 shift arguments with cp_save_expr, so that side-effects
in them aren't evaluated multiple times.  And, ubsan_instrument_shift
itself uses unshare_expr on any uses of the operands to make sure further
modifications in them don't affect other copies of them (the only not
unshared ones are the one the caller then uses for the actual operation
after the instrumentation, which means there is no tree sharing).

Now, if there are side-effects in the first operand like say function
call, cp_save_expr wraps it into a SAVE_EXPR, and ubsan_instrument_shift
in this mode emits something like
if (..., SAVE_EXPR <foo ()>, SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR <foo ()>, ...);
and caller adds
SAVE_EXPR <foo ()> << SAVE_EXPR <op1>
after it in a COMPOUND_EXPR.  So far so good.

If there are no side-effects and cp_save_expr doesn't create SAVE_EXPR,
everything is ok as well because of the unshare_expr.
We have
if (..., SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., ptr->something[i], ...);
and
ptr->something[i] << SAVE_EXPR <op1>
where ptr->something[i] is unshared.

In the testcase below, the !x->s[j] ? 1 : 0 expression is wrapped initially
into a SAVE_EXPR though, and unshare_expr doesn't unshare SAVE_EXPRs nor
anything used in them for obvious reasons, so we end up with:
if (..., SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0>, SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0>, ...);
and
SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0> << SAVE_EXPR <op1>
So far good as well.  But later during cp_fold of the SAVE_EXPR we find
out that VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1 is actually
invariant (has TREE_READONLY set) and so cp_fold simplifies the above to
if (..., SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., (bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1, ...);
and
((bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1) << SAVE_EXPR <op1>
with the s[j] ARRAY_REFs and other expressions shared in between the two
uses (and obviously the expression optimized away from the COMPOUND_EXPR in
the if condition.

Then comes another ubsan instrumentation at genericization time,
this time to instrument the ARRAY_REFs with strict bounds checking,
and replaces the s[j] in there with s[.UBSAN_BOUNDS (0B, SAVE_EXPR<j>, 8), SAVE_EXPR<j>]
As the trees are shared, it does that just once though.
And as the if body is gimplified first, the SAVE_EXPR<j> is evaluated inside
of the if body and when it is used again after the if, it uses a potentially
uninitialized value of j.1 (always uninitialized if the shift count isn't
out of bounds).

The following patch fixes that by unshare_expr unsharing the folded argument
of a SAVE_EXPR if we've folded the SAVE_EXPR into an invariant and it is
used more than once.

2023-12-08  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/112727
* cp-gimplify.cc (cp_fold): If SAVE_EXPR has been previously
folded, unshare_expr what is returned.

* c-c++-common/ubsan/pr112727.c: New test.

(cherry picked from commit 6ddaf06e375e1c15dcda338697ab6ea457e6f497)

fold-const: Fix up multiple_of_p [PR112733]

We ICE on the following testcase when wi::multiple_of_p is called on
widest_int 1 and -128 with UNSIGNED.  I still need to work on the
actual wide-int.cc issue, the latest patch attached to the PR regressed
bitint-{38,39}.c, so will need to debug that, but there is a clear bug
on the fold-const.cc side as well - widest_int is a signed representation
by definition, using UNSIGNED with it certainly doesn't match what was
intended, because -128 as the second operand effectively means unsigned
131072 bit 0xfffff............ffff80 integer, not the signed char -128
that appeared in the source.

In the INTEGER_CST case a few lines above this we already use
    case INTEGER_CST:
      if (TREE_CODE (bottom) != INTEGER_CST || integer_zerop (bottom))
        return false;
      return wi::multiple_of_p (wi::to_widest (top), wi::to_widest (bottom),
                                SIGNED);
so I think using SIGNED with widest_int is best there (compared to the
other choices in the PR).

2023-11-29  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/112733
* fold-const.cc (multiple_of_p): Pass SIGNED rather than
UNSIGNED for wi::multiple_of_p on widest_int arguments.

* gcc.dg/pr112733.c: New test.

(cherry picked from commit 5c95bf945c632925efba86dd5dceccdb9da8884c)

i386: Fix -fcf-protection -Os ICE due to movabsq peephole2 [PR112845]

The following testcase ICEs in the movabsq $(i32 << shift), r64 peephole2
I've added a while back to use smaller code than movabsq if possible.
If i32 is 0xfa1e0ff3 and shift is not divisible by 8, then it creates
an invalid insn (as 0xfa1e0ff3 CONST_INT is not allowed as
x86_64_immediate_operand nor x86_64_zext_immediate_operand), the peephole2
even triggers on it again and again (this time with shift 0) until it gives
up.

The following patch fixes that. As ix86_endbr_immediate_operand needs a
CONST_INT and it is hopefully rare, I chose to use FAIL rather than handling
it in the condition (where I'd probably need to call ctz_hwi again etc.).

2023-12-05 Jakub Jelinek <jakub@redhat.com>

PR target/112845
* config/i386/i386.md (movabsq $(i32 << shift), r64 peephole2): FAIL
if the new immediate is ix86_endbr_immediate_operand.

(cherry picked from commit e0786ca9a18c50ad08c40936b228e325193664b8)

i386: Fix rtl checking ICE in ix86_elim_entry_set_got [PR112837]

The following testcase ICEs with RTL checking, because it sets if
XINT (SET_SRC (set), 1) is UNSPEC_SET_GOT without checking if SET_SRC (set)
is actually an UNSPEC, so any time we see any other insn with PARALLEL
and a SET in it which is not an UNSPEC we ICE during RTL checking or
access there some other union member as if it was an rt_int.
The rest is just small cleanup.

2023-12-04 Jakub Jelinek <jakub@redhat.com>

PR target/112837
* config/i386/i386.cc (ix86_elim_entry_set_got): Before checking
for UNSPEC_SET_GOT check that SET_SRC is UNSPEC. Use SET_SRC and
SET_DEST macros instead of XEXP, rename vec variable to set.

* gcc.dg/pr112837.c: New test.

(cherry picked from commit 4586d7d0a92e9b60d0c01043e0ae262b1e06f337)

i386: Fix up signbit<mode>2 expander [PR112816]

The following testcase ICEs, because the signbit<mode>2 expander uses an
explicit SUBREG in the pattern around match_operand with register_operand
predicate. If we are unlucky enough that expansion tries to expand it
with some SUBREG as operands[1], we have two nested SUBREGs in the IL,
which is not valid and causes ICE later.

2023-12-04 Jakub Jelinek <jakub@redhat.com>

PR target/112816
* config/i386/sse.md (signbit<mode>2): Force operands[1] into a REG.

* gcc.target/i386/sse2-pr112816.c: New test.

(cherry picked from commit 994d6dc64435d6b7c50accca9941ee7decd92a22)

c++: #pragma GCC unroll C++ fixes [PR112795]

foo in the unroll-5.C testcase ICEs because cp_parser_pragma_unroll
during parsing calls maybe_constant_value unconditionally, which is
fine if !processing_template_decl, but can ICE otherwise.

While just calling fold_non_dependent_expr there instead could be enough
to fix the ICE (and I guess the right thing to do for backports if any),
I don't see a reason why we couldn't handle a dependent #pragma GCC unroll
argument as well, the unrolling isn't done in the FE and all the middle-end
cares about is that ANNOTATE_EXPR has a 1..65534 last operand when it is
annot_expr_unroll_kind.

So, the following patch changes all the unsigned short unroll arguments
to tree unroll (and thus avoids the tree -> unsigned short -> tree
conversions), does the type and value checking during parsing only if
the argument isn't dependent and repeats it during instantiation.

2023-12-04 Jakub Jelinek <jakub@redhat.com>

PR c++/112795
gcc/cp/
* parser.cc (cp_parser_pragma_unroll): Use fold_non_dependent_expr
instead of maybe_constant_value.
gcc/testsuite/
* g++.dg/ext/unroll-5.C: New test.

(cherry picked from commit b6c78feea08c36e5754818c6a3d7536b3f8913dc)

i386: Fix up *jcc_bt*_mask{,_1} [PR111408]

The following testcase is miscompiled in GCC 14 because the
*jcc_bt<mode>_mask and *jcc_bt<SWI48:mode>_mask_1 patterns have just
one argument in (match_operator 0 "bt_comparison_operator" [...])
but as bt_comparison_operator is eq,ne, we need two.
The md readers don't warn about it, after all, some checks can
be done in the predicate rather than specified explicitly, and the
behavior is that anything is accepted as the second argument.

I went through all other i386.md match_operator uses and all others
looked right (extract_operator using 3 operands, all others 2).

I think we'll want to fix this at different spots in older releases
because I think the bug was introduced already in 2008, though most
likely just latent.

2023-11-25 Jakub Jelinek <jakub@redhat.com>

PR target/111408
* config/i386/i386.md (*jcc_bt<mode>_mask): Add (const_int 0) as
expected second operand of bt_comparison_operator.

* gcc.c-torture/execute/pr111408.c: New test.

(cherry picked from commit 9866c98e1015d98b8fc346d7cf73a0070cce5f69)

gimple-range-cache: Fix ICEs when dumping details [PR111967]

The following testcase ICEs when dumping details.
When m_ssa_ranges vector is created, it is safe_grow_cleared (num_ssa_names),
but when when some new SSA_NAME is added, we strangely grow it to
num_ssa_names + 1 instead and later on the 3 argument dump method
iterates from 1 to m_ssa_ranges.length () - 1 and uses ssa_name (x)
on each; but because set_bb_range grew it one too much, ssa_name
(m_ssa_ranges.length () - 1) might be after the end of the ssanames
vector and ICE.

The fix grows the vector consistently only to num_ssa_names,
doesn't waste time checking m_ssa_ranges[0] because there is no
ssa_names (0), it is always NULL, before using ssa_name (x) checks
if we'll need it at all (we check later if m_ssa_ranges[x] is non-NULL,
so we might check it earlier as well) and also in the last loop
iterates until m_ssa_ranges.length () rather than num_ssa_names, I don't
see a reason for the inconsistency and in theory some SSA_NAME could be
added without set_bb_range called for it and the vector could be shorter
than the ssanames vector.

To actually fix the ICE, either the first hunk or the last 2 hunks
would be enough, but I think it doesn't hurt to change all the spots.

2023-11-13  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/111967
* gimple-range-cache.cc (block_range_cache::set_bb_range): Grow
m_ssa_ranges to num_ssa_names rather than num_ssa_names + 1.
(block_range_cache::dump): Iterate from 1 rather than 0.  Don't use
ssa_name (x) unless m_ssa_ranges[x] is non-NULL.  Iterate to
m_ssa_ranges.length () rather than num_ssa_names.

* gcc.dg/tree-ssa/pr111967.c: New test.

(cherry picked from commit 5a0c302d2d721b9650c1e354695dbba87364c334)

attribs: Fix ICE with -Wno-attributes= [PR112339]

The following testcase ICEs, because with -Wno-attributes=foo::no_sanitize
(but generally any other non-gnu namespace and some gnu well known attribute
name within that other namespace) the FEs don't really parse attribute
arguments of such attribute, but lookup_attribute_spec is non-NULL with
NULL handler and such attributes are added to DECL_ATTRIBUTES or
TYPE_ATTRIBUTES and then when e.g. middle-end does lookup_attribute
on a particular attribute and expects the attribute to mean something
and/or have a particular verified arguments, it can crash when seeing
the foreign attribute in there instead.

The following patch fixes that by never adding ignored attributes
to DECL_ATTRIBUTES/TYPE_ATTRIBUTES, previously that was the case just
for attributes in ignored namespace (where lookup_attribute_space
returned NULL).  We don't really know anything about those attributes,
so shouldn't pretend we know something about them, especially when
the arguments are error_mark_node or NULL instead of something that
would have been parsed.  And it would be really weird if we normally
ignore say [[clang::unused]] attribute, but when people use
-Wno-attributes=clang::unused we actually treated it as gnu::unused.
All the user asked for is suppress warnings about that attribute being
unknown.

The first hunk is just playing safe, I'm worried people could
-Wno-attributes=gnu::
and get various crashes with known GNU attributes not being actually
parsed and recorded (or worse e.g. when we tweak standard attributes
into GNU attributes and we wouldn't add those).
The -Wno-attributes= documentation says that it suppresses warning about
unknown attributes, so I think -Wno-attributes=gnu:: should prevent
warning about say [[gnu::foobarbaz]] attribute, but not about
[[gnu::unused]] because the latter is a known attribute.
The routine would return true for any scoped attribute in the ignored
namespace, with the change it ignores only unknown attributes in ignored
namespace, known ones in there will be ignored only if they have
max_length of -2 (e.g.. with
-Wno-attributes=gnu:: -Wno-attributes=gnu::foobarbaz).

2023-11-09  Jakub Jelinek  <jakub@redhat.com>

PR c/112339
* attribs.cc (attribute_ignored_p): Only return true for
attr_namespace_ignored_p if as is NULL.
(decl_attributes): Never add ignored attributes.

* c-c++-common/ubsan/Wno-attributes-1.c: New test.

(cherry picked from commit 533241c6c60bc7c9f7dc47a94e94b5eed1b370e6)

libstdc++: Fix tr1/8_c_compatibility/cstdio/functions.cc regression with recent glibc

The following testcase started FAILing recently after the
https://sourceware.org/git/?p=glibc.git;a=commit;h=64b1a44183a3094672ed304532bedb9acc707554
glibc change which marked vfscanf with nonnull (1) attribute.
While vfwscanf hasn't been marked similarly (strangely), the patch changes
that too. By using va_arg one hides the value of it from the compiler
(volatile keyword would do too, or making the FILE* stream a function
argument, but then it might need to be guarded by #if or something).

2023-10-13 Jakub Jelinek <jakub@redhat.com>

* testsuite/tr1/8_c_compatibility/cstdio/functions.cc (test01):
Initialize stream to va_arg(ap, FILE*) rather than 0.
* testsuite/tr1/8_c_compatibility/cwchar/functions.cc (test01):
Likewise.

(cherry picked from commit badb798f5e96a995bb9fa8c4ea48071aa4f2b4b3)

wide-int: Fix up wi::divmod_internal [PR110731]

As the following testcase shows, wi::divmod_internal doesn't handle
correctly signed division with precision > 64 when the dividend (and likely
divisor as well) is the type's minimum and the precision isn't divisible
by 64.

A few lines above what the patch hunk changes is:
  /* Make the divisor and dividend positive and remember what we
     did.  */
  if (sgn == SIGNED)
    {
      if (wi::neg_p (dividend))
        {
          neg_dividend = -dividend;
          dividend = neg_dividend;
          dividend_neg = true;
        }
      if (wi::neg_p (divisor))
        {
          neg_divisor = -divisor;
          divisor = neg_divisor;
          divisor_neg = true;
        }
    }
i.e. we negate negative dividend or divisor and remember those.
But, after we do that, when unpacking those values into b_dividend and
b_divisor we need to always treat the wide_ints as UNSIGNED,
because divmod_internal_2 performs an unsigned division only.
Now, if precision <= 64, we don't reach here at all, earlier code
handles it.  If dividend or divisor aren't the most negative values,
the negation clears their most significant bit, so it doesn't really
matter if we unpack SIGNED or UNSIGNED.  And if precision is multiple
of HOST_BITS_PER_WIDE_INT, there is no difference in behavior, while
-0x80000000000000000000000000000000 negates to
-0x80000000000000000000000000000000 the unpacking of it as SIGNED
or UNSIGNED works the same.
In the testcase, we have signed precision 119 and the dividend is
val = { 0, 0xffc0000000000000 }, len = 2, precision = 119
both before and after negation.
Divisor is
val = { 2 }, len = 1, precision = 119
But we really want to divide 0x400000000000000000000000000000 by 2
unsigned and then negate at the end.
If it is unsigned precision 119 division
0x400000000000000000000000000000 by 2
dividend is
val = { 0, 0xffc0000000000000 }, len = 2, precision = 119
but as we unpack it UNSIGNED, it is unpacked into
0, 0, 0, 0x00400000

The following patch fixes it by always using UNSIGNED unpacking
because we've already negated negative values at that point if
sgn == SIGNED and so most negative constants should be treated as
positive.

2023-07-19  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/110731
* wide-int.cc (wi::divmod_internal): Always unpack dividend and
divisor as UNSIGNED regardless of sgn.

* gcc.dg/pr110731.c: New test.

(cherry picked from commit ece799607c841676f4e00c2fea98bbec6976da3f)

Daily bump.

debug/111080 - avoid outputting debug info for unused restrict qualified type

The following applies some maintainance with respect to type qualifiers
and kinds added by later DWARF standards to prune_unused_types_walk.
The particular case in the bug is not handling (thus marking required)
all restrict qualified type DIEs. I've found more DW_TAG_*_type that
are unhandled, looked up the DWARF docs and added them as well based
on common sense.

PR debug/111080
* dwarf2out.cc (prune_unused_types_walk): Handle
DW_TAG_restrict_type, DW_TAG_shared_type, DW_TAG_atomic_type,
DW_TAG_immutable_type, DW_TAG_coarray_type, DW_TAG_unspecified_type
and DW_TAG_dynamic_type as to only output them when referenced.

* gcc.dg/debug/dwarf2/pr111080.c: New testcase.

(cherry picked from commit bd2c4d6d8fffd5a6dae5217d6076cc4190bab13d)

tree-optimization/111137 - dependence checking for SLP

The following fixes a mistake with SLP dependence checking.  When
checking whether we can hoist loads to the first load place we
special-case stores of the same instance considering them sunk
to the last store place.  But we fail to consider that stores from
other SLP instances are sunk in a similar way.  This leads us to
miss the dependence between (A) and (B) in

  b[0][1] = 0;             (A)
...
  _6 = b[_5 /* 0 */][0];   (B')
  _7 = _6 ^ 1;
  b[_5 /* 0 */][0] = _7;
  b[0][2] = 0;             (A')
  _10 = b[_5 /* 0 */][1];  (B)
  _11 = _10 ^ 1;
  b[_5 /* 0 */][1] = _11;

where the zeroing stores are sunk to (A') and the loads hoisted
to (B').  The following fixes this, treating grouped stores from
other instances similar to stores from our own instance.  The
difference is - and this is more conservative than necessary - that
we don't know which stores of a group are in which SLP instance
(though I believe either all of the grouped stores will be in
a single SLP instance or in none at the moment), so we don't
know which stores are sunk where.  We simply assume they are
all sunk to the last store we run into.  Likewise we do not take
into account that an SLP instance might be cancelled (or a grouped
store not actually belong to any instance).

PR tree-optimization/111137
* tree-vect-data-refs.cc (vect_slp_analyze_load_dependences):
Properly handle grouped stores from other SLP instances.

* gcc.dg/torture/pr111137.c: New testcase.

(cherry picked from commit 845ee9c7107956845e487cb123fa581d9c70ea1b)

Apply some TLC to vect_slp_analyze_instance_dependence

This refactors things, separating load and store handing, adjusting
comments to reflect reality and removing some dead code.

* tree-vect-data-refs.cc (vect_slp_analyze_store_dependences):
Split out from vect_slp_analyze_node_dependences, remove
dead code.
(vect_slp_analyze_load_dependences): Split out from
vect_slp_analyze_node_dependences, adjust comments. Process
queued stores before any disambiguation.
(vect_slp_analyze_node_dependences): Remove.
(vect_slp_analyze_instance_dependence): Adjust.

(cherry picked from commit 470da3b27e6dbeb3286b09dcb1c1b810ac75b276)

middle-end/111253 - partly revert r11-6508-gabb1b6058c09a7

The following keeps dumping SSA def stmt RHS during diagnostic
reporting only for gimple_assign_single_p defs which means
memory loads.  This avoids diagnostics containing PHI nodes
like

  warning: 'realloc' called on pointer '*_42 = PHI <lcs.14_40(29), lcs.19_48(30)>.t_mem_caches' with nonzero offset 40

instead getting back the previous behavior:

  warning: 'realloc' called on pointer '*<unknown>.t_mem_caches' with nonzero offset 40

PR middle-end/111253
gcc/c-family/
* c-pretty-print.cc (c_pretty_printer::primary_expression):
Only dump gimple_assign_single_p SSA def RHS.

gcc/testsuite/
* gcc.dg/Wfree-nonheap-object-7.c: New testcase.

(cherry picked from commit e3ece7684b02c47d2b259899cf8009d6bdcccaf3)

Daily bump.

Don't assume it's AVX_U128_CLEAN after call_insn whose abi.mode_clobber(V4DImode) deosn't contains all SSE_REGS.

If the function desn't clobber any sse registers or only clobber
128-bit part, then vzeroupper isn't issued before the function exit.
the status not CLEAN but ANY after the function.

Also for sibling_call, it's safe to issue an vzeroupper. Also there
could be missing vzeroupper since there's no mode_exit for
sibling_call_p.

gcc/ChangeLog:

PR target/112891
* config/i386/i386.cc (ix86_avx_u128_mode_after): Return
AVX_U128_ANY if callee_abi doesn't clobber all_sse_regs to
align with ix86_avx_u128_mode_needed.
(ix86_avx_u128_mode_needed): Return AVX_U128_ClEAN for
sibling_call.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112891.c: New test.
* gcc.target/i386/pr112891-2.c: New test.

(cherry picked from commit fc189a08f5b7ad5889bd4c6b320c1dd99dd5d642)

Daily bump.

fixincludes: Update darwin_flt_eval_method for macOS 14

On macOS 14, a guard in <math.h> changed:

-- MacOSX13.3.sdk/usr/include/math.h 2023-04-19 01:54:44
+++ MacOSX14.0.sdk/usr/include/math.h 2023-08-01 08:42:43
@@ -22,0 +23 @@
+
@@ -43 +44 @@
-#if __FLT_EVAL_METHOD__ == 0
+#if __FLT_EVAL_METHOD__ == 0 || __FLT_EVAL_METHOD__ == -1
@@ -49 +50 @@
-#elif __FLT_EVAL_METHOD__ == 2 || __FLT_EVAL_METHOD__ == -1
+#elif __FLT_EVAL_METHOD__ == 2

Therefore the darwin_flt_eval_method fixincludes fix doesn't match any
longer, leading to a large number of testsuite failures like

/private/var/gcc/regression/master/14-gcc/build/gcc/include-fixed/math.h:69:5:
error: #error "Unsupported value of __FLT_EVAL_METHOD__."

where __FLT_EVAL_METHOD__ = 16.

This patch adjusts the fix to allow for both forms.

Tested with make check in fixincludes on x86_64-apple-darwin23.0.0 and
verifying that <math.h> has indeed been fixed as expected.

(backport of 93f803d53b5ccaabded9d7b4512b54da81c1c616)

2023-12-11 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

fixincludes:
* inclhack.def (darwin_flt_eval_method): Handle macOS 14 guard
variant.
* fixincl.x: Regenerate.
* tests/base/math.h [DARWIN_FLT_EVAL_METHOD_CHECK]: Update test.

Daily bump.

libstdc++: Add assertion to std::string_view::remove_suffix [PR112314]

libstdc++-v3/ChangeLog:

PR libstdc++/112314
* include/std/string_view (string_view::remove_suffix): Add
debug assertion.
* testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc:
New test.
* testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc:
New test.

(cherry picked from commit 6afa984f47e16e8bd958646d7407b74e61041f5d)

libstdc++: Adjust std::in_range template parameter name

This is more consistent with the specification in the standard.

libstdc++-v3/ChangeLog:

* include/std/utility (in_range): Rename _Up parameter to _Res.

(cherry picked from commit 97fc8851f60fda381ac3bf6213a1cc93d9fda4f0)

Daily bump.

Fortran: avoid obsolescence warning for COMMON with submodule [PR111880]

gcc/fortran/ChangeLog:

PR fortran/111880
* resolve.cc (resolve_common_vars): Do not call gfc_add_in_common
for symbols that are USE associated or used in a submodule.

gcc/testsuite/ChangeLog:

PR fortran/111880
* gfortran.dg/pr111880.f90: New test.

(cherry picked from commit c9d029ba2ceb435e31492c1f3f0fd3edf0e386be)

Daily bump.

c++: constantness of call to function pointer [PR111703]

potential_constant_expression for CALL_EXPR tests FUNCTION_POINTER_TYPE_P
on the callee rather than on the type of the callee, which means we
always pass want_rval=any when recursing and so may fail to identify a
non-constant function pointer callee as such. Fixing this turns out to
further work around PR111703.

PR c++/111703
PR c++/107939

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) <case CALL_EXPR>:
Fix FUNCTION_POINTER_TYPE_P test.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-fn8.C: Extend test.
* g++.dg/diagnostic/constexpr4.C: New test.

(cherry picked from commit 0077c0fb19981c108a01cd15af9b2d6d478c183b)

c++: constantness of local var in constexpr fn [PR111703, PR112269]

potential_constant_expression was incorrectly treating most local
variables from a constexpr function as constant because it wasn't
considering the 'now' parameter.  This patch fixes this by relaxing
its var_in_maybe_constexpr_fn checks accordingly, which turns out to
partially fix two recently reported regressions:

PR111703 is a regression caused by r11-550-gf65a3299a521a4 for restricting
constexpr evaluation during warning-dependent folding.  The mechanism is
intended to restrict only constant evaluation of the instantiated
non-dependent expression, but it also ends up restricting constant
evaluation occurring during instantiation of the expression, in particular
when instantiating the converted argument 'x' (a VIEW_CONVERT_EXPR) into
a copy constructor call.  This seems like a flaw in the mechanism, though
I don't know if we want to fix the mechanism or get rid of it completely
since the original testcases which motivated the mechanism are fixed more
simply by r13-1225-gb00b95198e6720.  In any case, this patch partially
fixes this by making us correctly treat 'x' as non-constant which prevents
the problematic warning-dependent folding from occurring at all.

PR112269 is caused by r14-4796-g3e3d73ed5e85e7 for merging tsubst_copy
into tsubst_copy_and_build.  tsubst_copy used to exit early when 'args'
was empty, behavior which that commit deliberately didn't preserve.
This early exit masked the fact that COMPLEX_EXPR wasn't handled by
tsubst at all, and is a tree code that apparently we could see during
warning-dependent folding on some targets.  A complete fix is to add
handling for this tree code in tsubst_expr, but this patch should fix
the reported testsuite failures since the COMPLEX_EXPRs that crop up
in <complex> are considered non-constant expressions after this patch.

PR c++/111703
PR c++/112269

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) <case VAR_DECL>:
Only consider var_in_maybe_constexpr_fn if 'now' is false.
<case INDIRECT_REF>: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-fn8.C: New test.

(cherry picked from commit 6665a8572c8f24bd55c6081c91f461442c94dcfb)

tree-optimization/111917 - bougs IL after guard hoisting

The unswitching code to hoist guards inserts conditions in wrong
places. The following fixes this, simplifying code.

PR tree-optimization/111917
* tree-ssa-loop-unswitch.cc (hoist_guard): Always insert
new conditional after last stmt.

* gcc.dg/torture/pr111917.c: New testcase.

(cherry picked from commit d96bd4aade170fcd86f5f09b68b770dde798e631)

middle-end/111818 - failed DECL_NOT_GIMPLE_REG_P setting of volatile

The following addresses a missed DECL_NOT_GIMPLE_REG_P setting of
a volatile declared parameter which causes inlining to substitute
a constant parameter into a context where its address is required.

The main issue is in update_address_taken which clears
DECL_NOT_GIMPLE_REG_P from the parameter but fails to rewrite it
because is_gimple_reg returns false for volatiles. The following
changes maybe_optimize_var to make the 1:1 correspondence between
clearing DECL_NOT_GIMPLE_REG_P of a register typed decl and
actually rewriting it to SSA.

PR middle-end/111818
* tree-ssa.cc (maybe_optimize_var): When clearing
DECL_NOT_GIMPLE_REG_P always rewrite into SSA.

* gcc.dg/torture/pr111818.c: New testcase.

(cherry picked from commit ce55521bcd149fdc431f1d78e706b66d470210ae)

tree-optimization/111614 - missing convert in undistribute_bitref_for_vector

The following adjusts a flawed guard for converting the first vector
of the sum we create in undistribute_bitref_for_vector.

PR tree-optimization/111614
* tree-ssa-reassoc.cc (undistribute_bitref_for_vector): Properly
convert the first vector when required.

* gcc.dg/torture/pr111614.c: New testcase.

(cherry picked from commit 88d79b9b03eccf39921d13c2cbd1acc50aeda126)

tree-optimization/111764 - wrong reduction vectorization

The following removes a misguided attempt to allow x + x in a reduction
path, also allowing x * x which isn't valid. x + x actually never
arrives this way but instead is canonicalized to 2 * x. This makes
reduction path handling consistent with how we handle the single-stmt
reduction case.

PR tree-optimization/111764
* tree-vect-loop.cc (check_reduction_path): Remove the attempt
to allow x + x via special-casing of assigns.

* gcc.dg/vect/pr111764.c: New testcase.

(cherry picked from commit 05f98310b54da95e468d799f4a910174320cccbb)

tree-optimization/111445 - simple_iv simplification fault

The following fixes a missed check in the simple_iv attempt
to simplify (signed T)((unsigned T) base + step) where it
allows a truncating inner conversion leading to wrong code.

PR tree-optimization/111445
* tree-scalar-evolution.cc (simple_iv_with_niters):
Add missing check for a sign-conversion.

* gcc.dg/torture/pr111445.c: New testcase.

(cherry picked from commit 9692309ed6b625f0fb358c0e230404b5603f69a6)

tree-optimization/111019 - invariant motion and aliasing

The following fixes a bad choice in representing things to the alias
oracle by LIM which while correct in pieces is inconsistent with itself.
When canonicalizing a ref to a bare deref instead of leaving the base
object and the extracted offset the same and just substituting an
alternate ref the following replaces the base and the offset as well,
avoiding the confusion that otherwise will arise in
aliasing_matching_component_refs_p.

PR tree-optimization/111019
* tree-ssa-loop-im.cc (gather_mem_refs_stmt): When canonicalizing
also scrap base and offset in case the ref is indirect.

* g++.dg/torture/pr111019.C: New testcase.

(cherry picked from commit 745ec2135aabfbe2c0fb7780309837d17e8986d4)

tree-optimization/110702 - avoid zero-based memory references in IVOPTs

Sometimes IVOPTs chooses a weird induction variable which downstream
leads to issues.  Most of the times we can fend those off during costing
by rejecting the candidate but it looks like the address description
costing synthesizes is different from what we end up generating so
the following fixes things up at code generation time.  Specifically
we avoid the create_mem_ref_raw fallback which uses a literal zero
address base with the actual base in index2.  For the case in question
we have the address

  type = unsigned long
  offset = 0
  elements = {
    [0] = &e * -3,
    [1] = (sizetype) a.9_30 * 232,
    [2] = ivtmp.28_44 * 4
  }

from which we code generate the problematical

  _3 = MEM[(long int *)0B + ivtmp.36_9 + ivtmp.28_44 * 4];

which references the object at address zero.  The patch below
recognizes the fallback after the fact and transforms the
TARGET_MEM_REF memory reference into a LEA for which this form
isn't problematic:

  _24 = &MEM[(long int *)0B + ivtmp.36_34 + ivtmp.28_44 * 4];
  _3 = *_24;

hereby avoiding the correctness issue.  We'd later conclude the
program terminates at the null pointer dereference and make the
function pure, miscompling the main function of the testcase.

PR tree-optimization/110702
* tree-ssa-loop-ivopts.cc (rewrite_use_address): When
we created a NULL pointer based access rewrite that to
a LEA.

* gcc.dg/torture/pr110702.c: New testcase.

(cherry picked from commit 13dfb01e5c30c3bd09333ac79d6ff96a617fea67)

tree-optimization/110556 - tail merging still pre-tuples

The stmt comparison function for GIMPLE_ASSIGNs for tail merging
still looks like it deals with pre-tuples IL. The following
attempts to fix this, not only comparing the first operand (sic!)
of stmts but all of them plus also compare the operation code.

PR tree-optimization/110556
* tree-ssa-tail-merge.cc (gimple_equal_p): Check
assign code and all operands of non-stores.

* gcc.dg/torture/pr110556.c: New testcase.

(cherry picked from commit 7b16686ef882ab141276f0e36a9d4ce1d755f64a)

tree-optimization/110515 - wrong code with LIM + PRE

In this PR we face the issue that LIM speculates a load when
hoisting it out of the loop (since it knows it cannot trap).
Unfortunately this exposes undefined behavior when the load
accesses memory with the wrong dynamic type. This later
makes PRE use that representation instead of the original
which accesses the same memory location but using a different
dynamic type leading to a wrong disambiguation of that
original access against another and thus a wrong-code transform.

Fortunately there already is code in PRE dealing with a similar
situation for code hoisting but that left a small gap which
when fixed also fixes the wrong-code transform in this bug even
if it doesn't address the underlying issue of LIM speculating
that load.

The upside is this fix is trivially safe to backport and chances
of code generation regressions are very low.

PR tree-optimization/110515
* tree-ssa-pre.cc (compute_avail): Make code dealing
with hoisting loads with different alias-sets more
robust.

* g++.dg/opt/pr110515.C: New testcase.

(cherry picked from commit 9f4f833455bb35c11d03e93f802604ac7cd8b740)

debug/110295 - mixed up early/late debug for member DIEs

When we process a scope typedef during early debug creation and
we have already created a DIE for the type when the decl is
TYPE_DECL_IS_STUB and this DIE is still in limbo we end up
just re-parenting that type DIE instead of properly creating
a DIE for the decl, eventually picking up the now completed
type and creating DIEs for the members. Instead this is currently
defered to the second time we come here, when we annotate the
DIEs with locations late where now the type DIE is no longer
in limbo and we fall through doing the job for the decl.

The following makes sure we perform the necessary early tasks
for this by continuing with the decl DIE creation after setting
a parent for the limbo type DIE.

PR debug/110295
* dwarf2out.cc (process_scope_var): Continue processing
the decl after setting a parent in case the existing DIE
was in limbo.

* g++.dg/debug/pr110295.C: New testcase.

(cherry picked from commit 963f87f8a65ec82f503ac4334a3da83b0a8a43b2)

ipa/109983 - (IPA) PTA speedup

This improves the edge avoidance heuristic by re-ordering the
topological sort of the graph to make sure the component with
the ESCAPED node is processed first. This improves the number
of created edges which directly correlates with the number
of bitmap_ior_into calls from 141447426 to 239596 and the
compile-time from 1083s to 3s. It also improves the compile-time
for the related PR109143 from 81s to 27s.

I've modernized the topological sorting API on the way as well.

PR ipa/109983
PR tree-optimization/109143
* tree-ssa-structalias.cc (struct topo_info): Remove.
(init_topo_info): Likewise.
(free_topo_info): Likewise.
(compute_topo_order): Simplify API, put the component
with ESCAPED last so it's processed first.
(topo_visit): Adjust.
(solve_graph): Likewise.

(cherry picked from commit 95e5c38a98cc64a797b1d766a20f8c0d0c807a74)

Daily bump.

i386: Wrong code with __builtin_parityl [PR112672]

gen_parityhi2_cmp instruction clobbers its input operand, so use
a temporary register in the call to gen_parityhi2_cmp.

PR target/112672

gcc/ChangeLog:

* config/i386/i386.md (parityhi2):
Use temporary register in the call to gen_parityhi2_cmp.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112672.c: New test.

(cherry picked from commit b2d17bdd45b582b93e89c00b04763a45f97d7a34)

Daily bump.

PR target/111815: VAX: Only accept the index scaler as the RHS operand to ASHIFT

As from commit 9df1ba9a35b8 ("libbacktrace: support zstd decompression")
GCC for the `vax-netbsdelf' target fails to complete building, with an
ICE:

during RTL pass: final
.../libbacktrace/elf.c: In function 'elf_zstd_decompress':
.../libbacktrace/elf.c:5006:1: internal compiler error: in print_operand_address, at config/vax/vax.cc:514
5006 | }
      | ^
0x1113df97 print_operand_address(_IO_FILE*, rtx_def*)
.../gcc/config/vax/vax.cc:514
0x10c2489b default_print_operand_address(_IO_FILE*, machine_mode, rtx_def*)
.../gcc/targhooks.cc:373
0x106ddd0b output_address(machine_mode, rtx_def*)
.../gcc/final.cc:3648
0x106ddd0b output_asm_insn(char const*, rtx_def**)
.../gcc/final.cc:3505
0x106e2143 output_asm_insn(char const*, rtx_def**)
.../gcc/final.cc:3421
0x106e2143 final_scan_insn_1
.../gcc/final.cc:2841
0x106e28e3 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
.../gcc/final.cc:2887
0x106e2bf7 final_1
.../gcc/final.cc:1979
0x106e3c67 rest_of_handle_final
.../gcc/final.cc:4240
0x106e3c67 execute
.../gcc/final.cc:4318
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

This is due to combine producing an invalid address RTX:

(plus:SI (ashift:SI (const_int 1 [0x1])
        (reg:QI 3 %r3 [1232]))
    (reg/v:SI 10 %r10 [orig:736 weight_mask ] [736]))

where the expression is ((1 << R3) + R10), which does not match a valid
machine addressing mode.  Consequently `print_operand_address' chokes.

This can be reduced to the testcase included, where it triggers the same
ICE in `p'.  Preincrements are required so that their results land in
registers and consequently an indexed addressing mode is tried or
otherwise doing operations piecemeal on stack-based function arguments
as direct input operands turns out more profitable in terms of RTX costs
and the ICE is avoided.

The ultimate cause has been commit c605a8bf9270 ("VAX: Accept ASHIFT in
address expressions"), where a shift of an immediate value by a register
has been mistakenly allowed as an index expression as if the shift
operation was commutative such as multiplication is.  So with ASHIFT the
scaler in an index expression has to be the right-hand operand, and the
backend has to enforce that, whereas with MULT the scaler can be either
operand.

Fix this by only accepting the index scaler as the RHS operand to
ASHIFT.

gcc/
PR target/111815
* config/vax/vax.cc (index_term_p): Only accept the index scaler
as the RHS operand to ASHIFT.

gcc/testsuite/
PR target/111815
* gcc.dg/torture/pr111815.c: New test.

(cherry picked from commit 56ff988e6be3fdba70cad86d73ec0038bc3b6b5a)

Daily bump.

LoongArch: Modify MUSL_DYNAMIC_LINKER.

Use no suffix at all in the musl dynamic linker name for hard
float ABI. Use -sf and -sp suffixes in musl dynamic linker name
for soft float and single precision ABIs. The following table
outlines the musl interpreter names for the LoongArch64 ABI names.

musl interpreter | LoongArch64 ABI
--------------------------- | -----------------
ld-musl-loongarch64.so.1 | loongarch64-lp64d
ld-musl-loongarch64-sp.so.1 | loongarch64-lp64f
ld-musl-loongarch64-sf.so.1 | loongarch64-lp64s

gcc/ChangeLog:

* config/loongarch/gnu-user.h (MUSL_ABI_SPEC): Modify suffix.

(cherry picked from commit 8bccee51f0deac64b79cd9ad75df599422f4c8ff)

LoongArch: Fix MUSL_DYNAMIC_LINKER

The system based on musl has no '/lib64', so change it.

https://wiki.musl-libc.org/guidelines-for-distributions.html,
"Multilib/multi-arch" section of this introduces it.

gcc/
* config/loongarch/gnu-user.h (MUSL_DYNAMIC_LINKER): Redefine.

Signed-off-by: Peng Fan <fanpeng@loongson.cn>
Suggested-by: Xi Ruoyao <xry111@xry111.site>
(cherry picked from commit a80c68a08604b0ac625ac7fc59eae40b551b1176)

Daily bump.

c++: retval dtor on rethrow [PR112301]

In r12-6333 for PR33799, I fixed the example in [except.ctor]/2. In that
testcase, the exception is caught and the function returns again,
successfully.

In this testcase, however, the exception is rethrown, and hits two separate
cleanups: one in the try block and the other in the function body. So we
destroy twice an object that was only constructed once.

Fortunately, the fix for the normal case is easy: we just need to clear the
"return value constructed by return" flag when we do it the first time.

This gets more complicated with the named return value optimization, since
we don't want to destroy the return value while the NRV variable is still in
scope.

PR c++/112301
PR c++/102191
PR c++/33799

gcc/cp/ChangeLog:

* except.cc (maybe_splice_retval_cleanup): Clear
current_retval_sentinel when destroying retval.
* semantics.cc (nrv_data): Add in_nrv_cleanup.
(finalize_nrv): Set it.
(finalize_nrv_r): Fix handling of throwing cleanups.

gcc/testsuite/ChangeLog:

* g++.dg/eh/return1.C: Add more cases.

c++: fix contracts with NRV

The NRV implementation was blindly replacing the operand of RETURN_EXPR,
clobbering anything that check_return_expr might have added on to the actual
initialization, such as checking the postcondition.

GCC 12 note: There are no contracts in GCC 12, but this issue also broke
setting current_retval_sentinel.

gcc/cp/ChangeLog:

* semantics.cc (finalize_nrv_r): [RETURN_EXPR]: Only replace the
INIT_EXPR.

c++: fix throwing cleanup with label

While looking at PR92407 I noticed that the expectations of
maybe_splice_retval_cleanup weren't being met; an sk_cleanup level was
confusing its attempt to recognize the outer block of the function. And
even if I fixed the detection, it failed to actually wrap the body of the
function because the STATEMENT_LIST it got only had the label, not anything
after it. So I moved the call after poplevel does pop_stmt_list on all the
sk_cleanup levels.

PR c++/33799

gcc/cp/ChangeLog:

* except.cc (maybe_splice_retval_cleanup): Change
recognition of function body and try scopes.
* semantics.cc (do_poplevel): Call it after poplevel.
(at_try_scope): New.
* cp-tree.h (maybe_splice_retval_cleanup): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/eh/return1.C: Add label cases.

Daily bump.

Fix warning on new Ada testcase

gcc/testsuite/
* gnat.dg/varsize4.adb (Func): Initialize Byte_Read parameter.

Fix internal error on function returning dynamically-sized type

This is a tree sharing issue for the internal return type synthesized for
a function returning a dynamically-sized type and taking an Out or In/Out
parameter passed by copy.

gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Also create a
TYPE_DECL for the return type built for the CI/CO mechanism.

gcc/testsuite/
* gnat.dg/varsize4.ads, gnat.dg/varsize4.adb: New test.
* gnat.dg/varsize4_pkg.ads: New helper.

LoongArch: Remove redundant barrier instructions before LL-SC loops

This is isomorphic to the LLVM changes [1-2].

On LoongArch, the LL and SC instructions has memory barrier semantics:

- LL: <memory-barrier> + <load-exclusive>
- SC: <store-conditional> + <memory-barrier>

But the compare and swap operation is allowed to fail, and if it fails
the SC instruction is not executed, thus the guarantee of acquiring
semantics cannot be ensured. Therefore, an acquire barrier needs to be
generated when failure_memorder includes an acquire operation.

On CPUs implementing LoongArch v1.10 or later, "dbar 0b10100" is an
acquire barrier; on CPUs implementing LoongArch v1.00, it is a full
barrier. So it's always enough for acquire semantics. OTOH if an
acquire semantic is not needed, we still needs the "dbar 0x700" as the
load-load barrier like all LL-SC loops.

[1]:https://github.com/llvm/llvm-project/pull/67391
[2]:https://github.com/llvm/llvm-project/pull/69339

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_memmodel_needs_release_fence): Remove.
(loongarch_cas_failure_memorder_needs_acquire): New static
function.
(loongarch_print_operand): Redefine 'G' for the barrier on CAS
failure.
* config/loongarch/sync.md (atomic_cas_value_strong<mode>):
Remove the redundant barrier before the LL instruction, and
emit an acquire barrier on failure if needed by
failure_memorder.
(atomic_cas_value_cmp_and_7_<mode>): Likewise.
(atomic_cas_value_add_7_<mode>): Remove the unnecessary barrier
before the LL instruction.
(atomic_cas_value_sub_7_<mode>): Likewise.
(atomic_cas_value_and_7_<mode>): Likewise.
(atomic_cas_value_xor_7_<mode>): Likewise.
(atomic_cas_value_or_7_<mode>): Likewise.
(atomic_cas_value_nand_7_<mode>): Likewise.
(atomic_cas_value_exchange_7_<mode>): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/cas-acquire.c: New test.

(cherry picked from commit 4d86dc51e34d2a5695b617afeb56e3414836a79a)

Daily bump.

libstdc++: Fix std::deque::operator[] Xmethod [PR112491]

The Xmethod for std::deque::operator[] has the same bug that I recently
fixed for the std::deque::size() Xmethod. The first node might have
unused capacity at the start, which needs to be accounted for when
indexing into the deque.

libstdc++-v3/ChangeLog:

PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.index):
Correctly handle unused capacity at the start of the first node.
* testsuite/libstdc++-xmethods/deque.cc: Check index operator
when elements have been removed from the front.

(cherry picked from commit 452476db0c705caeac8712d560fc16ced0ca5226)

libstdc++: std::stacktrace tweaks

Fix a typo in a string literal and make the new hash.cc test gracefully
handle missing stacktrace data (see PR 112541).

libstdc++-v3/ChangeLog:

* include/std/stacktrace (basic_stacktrace::at): Fix class name
in exception message.
* testsuite/19_diagnostics/stacktrace/hash.cc: Do not fail if
current() returns a non-empty stacktrace.

(cherry picked from commit cbd0fe22a5ced9751d2450dc4fd6fe3525c2fc02)

rs6000: Consider inline asm as safe if no assembler complains [PR111828]

As discussed in PR111828, rs6000_update_ipa_fn_target_info
is much conservative, currently for any non-empty inline
asm, without any parsing, it would take inline asm could
have HTM insns. It means for one function attributed with
power8 having inline asm, even if it has no HTM insns, we
don't make a function attributed with power10 inline it.

Peter pointed out an inline asm parser can be a slippery
slope, and noticed that the current gnu assembler still
allows HTM insns even with power10 machine type, so he
suggested that we can aggressively ignore the handling on
inline asm, this patch goes for this suggestion.

Considering that there are a few assembler alternatives
and assembler can update its behaviors (complaining HTM
insns at power10 and later cpus sounds reasonable from a
certain point of view), this patch also checks assembler
complains on HTM insns at power10 or not. For a case that
a caller attributed power10 calls a callee attributed
power8 having inline asm with HTM insn, without inlining
at least the compilation succeeds, but if assembler
complains HTM insns at power10, after inlining the
compilation would fail.

The two associated test cases are fine without and with
this patch (effective target takes effect or not).

PR target/111828

gcc/ChangeLog:

* config.in: Regenerate.
* config/rs6000/rs6000.cc (rs6000_update_ipa_fn_target_info): Guard
inline asm handling under !HAVE_AS_POWER10_HTM.
* configure: Regenerate.
* configure.ac: Detect assembler support for HTM insns at power10.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_powerpc_as_p10_htm): New proc.
* g++.target/powerpc/pr111828-1.C: New test.
* g++.target/powerpc/pr111828-2.C: New test.

(cherry picked from commit b2075291af8810794c7184fd125b991c2341cb1e)

Daily bump.

libstdc++: Fix std::hash<std::stacktrace> [PR112348]

libstdc++-v3/ChangeLog:

PR libstdc++/112348
* include/std/stacktrace (hash<basic_stacktrace<Alloc>>): Fix
type of hash function for entries.
* testsuite/19_diagnostics/stacktrace/hash.cc: New test.

(cherry picked from commit 6f2fc42d9e52e8322e718e0154cd235d00906f99)

libstdc++: Fix std::deque::size() Xmethod [PR112491]

The Xmethod for std::deque::size() assumed that the first element would
be at the start of the first node. That's only true if elements are only
added at the back. If an element is inserted at the front, or removed
from the front (or anywhere before the middle) then the first node will
not be completely populated, and the Xmethod will give the wrong result.

libstdc++-v3/ChangeLog:

PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.size): Fix
calculation to use _M_start._M_cur.
* testsuite/libstdc++-xmethods/deque.cc: Check failing cases.

(cherry picked from commit 4db820928065eccbeb725406450d826186582b9f)

Daily bump.

libstdc++: Correctly call _string_types function

flake8 points out that the new call to _string_types from
StdExpAnyPrinter.__init__ is not correct -- it needs to be qualified.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py
(StdExpAnyPrinter.__init__): Qualify call to
_string_types.

(cherry picked from commit 4bf77db70e2521dc89f9d7f51c7ae6e58a94b4f9)

libstdc++: _versioned_namespace is always non-None

Some code in the pretty-printers seems to assume that the
_versioned_namespace global might be None (or the empty string).
However, doesn't occur, as the variable is never reassigned.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Assume that
_versioned_namespace is non-None.
* python/libstdcxx/v6/xmethods.py (is_specialization_of):
Assume that _versioned_namespace is non-None.

(cherry picked from commit d342c9de6a1534cbce324bcc3c7c0898ff74d386)

libstdc++: Use Python "not in" operator

flake8 warns about code like

not something in "whatever"

Ordinarily in Python this should be written as:

something not in "whatever"

This patch makes this change.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (Printer.add_version)
(add_one_template_type_printer)
(FilteringTypePrinter.add_one_type_printer): Use Python
"not in" operator.

(cherry picked from commit 202810947ca793b17653658e584008fb151e0937)

libstdc++: Define _versioned_namespace in xmethods.py

flake8 pointed out that is_specialization_of in xmethods.py looks at a
global that wasn't added to the file. This patch correct the
oversight.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/xmethods.py (_versioned_namespace):
Define.

(cherry picked from commit 83ec6e80ff0d56342fc066c2ef649036c0983529)

libstdc++: Refactor Python Xmethods to use is_specialization_of

This copies the is_specialization_of function from printers.py (with
slight modification for versioned namespace handling) and reuses it in
xmethods.py to replace repetitive re.match calls in every class.

This fixes the problem that the regular expressions used \d without
escaping the backslash properly.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/xmethods.py (is_specialization_of): Define
new function.
(ArrayMethodsMatcher, DequeMethodsMatcher)
(ForwardListMethodsMatcher, ListMethodsMatcher)
(VectorMethodsMatcher, AssociativeContainerMethodsMatcher)
(UniquePtrGetWorker, UniquePtrMethodsMatcher)
(SharedPtrSubscriptWorker, SharedPtrMethodsMatcher): Use
is_specialization_of instead of re.match.

(cherry picked from commit 17d3477fa89466604bee5af2a2caf8de5441aeb5)