git.ipfire.org Git - thirdparty/gcc.git/log

wide-int: Fix up wi::divmod_internal [PR110731]

As the following testcase shows, wi::divmod_internal doesn't handle
correctly signed division with precision > 64 when the dividend (and likely
divisor as well) is the type's minimum and the precision isn't divisible
by 64.

A few lines above what the patch hunk changes is:
  /* Make the divisor and dividend positive and remember what we
     did.  */
  if (sgn == SIGNED)
    {
      if (wi::neg_p (dividend))
        {
          neg_dividend = -dividend;
          dividend = neg_dividend;
          dividend_neg = true;
        }
      if (wi::neg_p (divisor))
        {
          neg_divisor = -divisor;
          divisor = neg_divisor;
          divisor_neg = true;
        }
    }
i.e. we negate negative dividend or divisor and remember those.
But, after we do that, when unpacking those values into b_dividend and
b_divisor we need to always treat the wide_ints as UNSIGNED,
because divmod_internal_2 performs an unsigned division only.
Now, if precision <= 64, we don't reach here at all, earlier code
handles it.  If dividend or divisor aren't the most negative values,
the negation clears their most significant bit, so it doesn't really
matter if we unpack SIGNED or UNSIGNED.  And if precision is multiple
of HOST_BITS_PER_WIDE_INT, there is no difference in behavior, while
-0x80000000000000000000000000000000 negates to
-0x80000000000000000000000000000000 the unpacking of it as SIGNED
or UNSIGNED works the same.
In the testcase, we have signed precision 119 and the dividend is
val = { 0, 0xffc0000000000000 }, len = 2, precision = 119
both before and after negation.
Divisor is
val = { 2 }, len = 1, precision = 119
But we really want to divide 0x400000000000000000000000000000 by 2
unsigned and then negate at the end.
If it is unsigned precision 119 division
0x400000000000000000000000000000 by 2
dividend is
val = { 0, 0xffc0000000000000 }, len = 2, precision = 119
but as we unpack it UNSIGNED, it is unpacked into
0, 0, 0, 0x00400000

The following patch fixes it by always using UNSIGNED unpacking
because we've already negated negative values at that point if
sgn == SIGNED and so most negative constants should be treated as
positive.

2023-07-19  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/110731
* wide-int.cc (wi::divmod_internal): Always unpack dividend and
divisor as UNSIGNED regardless of sgn.

* gcc.dg/pr110731.c: New test.

(cherry picked from commit ece799607c841676f4e00c2fea98bbec6976da3f)

Daily bump.

debug/111080 - avoid outputting debug info for unused restrict qualified type

The following applies some maintainance with respect to type qualifiers
and kinds added by later DWARF standards to prune_unused_types_walk.
The particular case in the bug is not handling (thus marking required)
all restrict qualified type DIEs. I've found more DW_TAG_*_type that
are unhandled, looked up the DWARF docs and added them as well based
on common sense.

PR debug/111080
* dwarf2out.cc (prune_unused_types_walk): Handle
DW_TAG_restrict_type, DW_TAG_shared_type, DW_TAG_atomic_type,
DW_TAG_immutable_type, DW_TAG_coarray_type, DW_TAG_unspecified_type
and DW_TAG_dynamic_type as to only output them when referenced.

* gcc.dg/debug/dwarf2/pr111080.c: New testcase.

(cherry picked from commit bd2c4d6d8fffd5a6dae5217d6076cc4190bab13d)

tree-optimization/111137 - dependence checking for SLP

The following fixes a mistake with SLP dependence checking.  When
checking whether we can hoist loads to the first load place we
special-case stores of the same instance considering them sunk
to the last store place.  But we fail to consider that stores from
other SLP instances are sunk in a similar way.  This leads us to
miss the dependence between (A) and (B) in

  b[0][1] = 0;             (A)
...
  _6 = b[_5 /* 0 */][0];   (B')
  _7 = _6 ^ 1;
  b[_5 /* 0 */][0] = _7;
  b[0][2] = 0;             (A')
  _10 = b[_5 /* 0 */][1];  (B)
  _11 = _10 ^ 1;
  b[_5 /* 0 */][1] = _11;

where the zeroing stores are sunk to (A') and the loads hoisted
to (B').  The following fixes this, treating grouped stores from
other instances similar to stores from our own instance.  The
difference is - and this is more conservative than necessary - that
we don't know which stores of a group are in which SLP instance
(though I believe either all of the grouped stores will be in
a single SLP instance or in none at the moment), so we don't
know which stores are sunk where.  We simply assume they are
all sunk to the last store we run into.  Likewise we do not take
into account that an SLP instance might be cancelled (or a grouped
store not actually belong to any instance).

PR tree-optimization/111137
* tree-vect-data-refs.cc (vect_slp_analyze_load_dependences):
Properly handle grouped stores from other SLP instances.

* gcc.dg/torture/pr111137.c: New testcase.

(cherry picked from commit 845ee9c7107956845e487cb123fa581d9c70ea1b)

Apply some TLC to vect_slp_analyze_instance_dependence

This refactors things, separating load and store handing, adjusting
comments to reflect reality and removing some dead code.

* tree-vect-data-refs.cc (vect_slp_analyze_store_dependences):
Split out from vect_slp_analyze_node_dependences, remove
dead code.
(vect_slp_analyze_load_dependences): Split out from
vect_slp_analyze_node_dependences, adjust comments. Process
queued stores before any disambiguation.
(vect_slp_analyze_node_dependences): Remove.
(vect_slp_analyze_instance_dependence): Adjust.

(cherry picked from commit 470da3b27e6dbeb3286b09dcb1c1b810ac75b276)

middle-end/111253 - partly revert r11-6508-gabb1b6058c09a7

The following keeps dumping SSA def stmt RHS during diagnostic
reporting only for gimple_assign_single_p defs which means
memory loads.  This avoids diagnostics containing PHI nodes
like

  warning: 'realloc' called on pointer '*_42 = PHI <lcs.14_40(29), lcs.19_48(30)>.t_mem_caches' with nonzero offset 40

instead getting back the previous behavior:

  warning: 'realloc' called on pointer '*<unknown>.t_mem_caches' with nonzero offset 40

PR middle-end/111253
gcc/c-family/
* c-pretty-print.cc (c_pretty_printer::primary_expression):
Only dump gimple_assign_single_p SSA def RHS.

gcc/testsuite/
* gcc.dg/Wfree-nonheap-object-7.c: New testcase.

(cherry picked from commit e3ece7684b02c47d2b259899cf8009d6bdcccaf3)

Daily bump.

Don't assume it's AVX_U128_CLEAN after call_insn whose abi.mode_clobber(V4DImode) deosn't contains all SSE_REGS.

If the function desn't clobber any sse registers or only clobber
128-bit part, then vzeroupper isn't issued before the function exit.
the status not CLEAN but ANY after the function.

Also for sibling_call, it's safe to issue an vzeroupper. Also there
could be missing vzeroupper since there's no mode_exit for
sibling_call_p.

gcc/ChangeLog:

PR target/112891
* config/i386/i386.cc (ix86_avx_u128_mode_after): Return
AVX_U128_ANY if callee_abi doesn't clobber all_sse_regs to
align with ix86_avx_u128_mode_needed.
(ix86_avx_u128_mode_needed): Return AVX_U128_ClEAN for
sibling_call.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112891.c: New test.
* gcc.target/i386/pr112891-2.c: New test.

(cherry picked from commit fc189a08f5b7ad5889bd4c6b320c1dd99dd5d642)

Daily bump.

fixincludes: Update darwin_flt_eval_method for macOS 14

On macOS 14, a guard in <math.h> changed:

-- MacOSX13.3.sdk/usr/include/math.h 2023-04-19 01:54:44
+++ MacOSX14.0.sdk/usr/include/math.h 2023-08-01 08:42:43
@@ -22,0 +23 @@
+
@@ -43 +44 @@
-#if __FLT_EVAL_METHOD__ == 0
+#if __FLT_EVAL_METHOD__ == 0 || __FLT_EVAL_METHOD__ == -1
@@ -49 +50 @@
-#elif __FLT_EVAL_METHOD__ == 2 || __FLT_EVAL_METHOD__ == -1
+#elif __FLT_EVAL_METHOD__ == 2

Therefore the darwin_flt_eval_method fixincludes fix doesn't match any
longer, leading to a large number of testsuite failures like

/private/var/gcc/regression/master/14-gcc/build/gcc/include-fixed/math.h:69:5:
error: #error "Unsupported value of __FLT_EVAL_METHOD__."

where __FLT_EVAL_METHOD__ = 16.

This patch adjusts the fix to allow for both forms.

Tested with make check in fixincludes on x86_64-apple-darwin23.0.0 and
verifying that <math.h> has indeed been fixed as expected.

(backport of 93f803d53b5ccaabded9d7b4512b54da81c1c616)

2023-12-11 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

fixincludes:
* inclhack.def (darwin_flt_eval_method): Handle macOS 14 guard
variant.
* fixincl.x: Regenerate.
* tests/base/math.h [DARWIN_FLT_EVAL_METHOD_CHECK]: Update test.

Daily bump.

libstdc++: Add assertion to std::string_view::remove_suffix [PR112314]

libstdc++-v3/ChangeLog:

PR libstdc++/112314
* include/std/string_view (string_view::remove_suffix): Add
debug assertion.
* testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc:
New test.
* testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc:
New test.

(cherry picked from commit 6afa984f47e16e8bd958646d7407b74e61041f5d)

libstdc++: Adjust std::in_range template parameter name

This is more consistent with the specification in the standard.

libstdc++-v3/ChangeLog:

* include/std/utility (in_range): Rename _Up parameter to _Res.

(cherry picked from commit 97fc8851f60fda381ac3bf6213a1cc93d9fda4f0)

Daily bump.

Fortran: avoid obsolescence warning for COMMON with submodule [PR111880]

gcc/fortran/ChangeLog:

PR fortran/111880
* resolve.cc (resolve_common_vars): Do not call gfc_add_in_common
for symbols that are USE associated or used in a submodule.

gcc/testsuite/ChangeLog:

PR fortran/111880
* gfortran.dg/pr111880.f90: New test.

(cherry picked from commit c9d029ba2ceb435e31492c1f3f0fd3edf0e386be)

Daily bump.

c++: constantness of call to function pointer [PR111703]

potential_constant_expression for CALL_EXPR tests FUNCTION_POINTER_TYPE_P
on the callee rather than on the type of the callee, which means we
always pass want_rval=any when recursing and so may fail to identify a
non-constant function pointer callee as such. Fixing this turns out to
further work around PR111703.

PR c++/111703
PR c++/107939

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) <case CALL_EXPR>:
Fix FUNCTION_POINTER_TYPE_P test.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-fn8.C: Extend test.
* g++.dg/diagnostic/constexpr4.C: New test.

(cherry picked from commit 0077c0fb19981c108a01cd15af9b2d6d478c183b)

c++: constantness of local var in constexpr fn [PR111703, PR112269]

potential_constant_expression was incorrectly treating most local
variables from a constexpr function as constant because it wasn't
considering the 'now' parameter.  This patch fixes this by relaxing
its var_in_maybe_constexpr_fn checks accordingly, which turns out to
partially fix two recently reported regressions:

PR111703 is a regression caused by r11-550-gf65a3299a521a4 for restricting
constexpr evaluation during warning-dependent folding.  The mechanism is
intended to restrict only constant evaluation of the instantiated
non-dependent expression, but it also ends up restricting constant
evaluation occurring during instantiation of the expression, in particular
when instantiating the converted argument 'x' (a VIEW_CONVERT_EXPR) into
a copy constructor call.  This seems like a flaw in the mechanism, though
I don't know if we want to fix the mechanism or get rid of it completely
since the original testcases which motivated the mechanism are fixed more
simply by r13-1225-gb00b95198e6720.  In any case, this patch partially
fixes this by making us correctly treat 'x' as non-constant which prevents
the problematic warning-dependent folding from occurring at all.

PR112269 is caused by r14-4796-g3e3d73ed5e85e7 for merging tsubst_copy
into tsubst_copy_and_build.  tsubst_copy used to exit early when 'args'
was empty, behavior which that commit deliberately didn't preserve.
This early exit masked the fact that COMPLEX_EXPR wasn't handled by
tsubst at all, and is a tree code that apparently we could see during
warning-dependent folding on some targets.  A complete fix is to add
handling for this tree code in tsubst_expr, but this patch should fix
the reported testsuite failures since the COMPLEX_EXPRs that crop up
in <complex> are considered non-constant expressions after this patch.

PR c++/111703
PR c++/112269

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) <case VAR_DECL>:
Only consider var_in_maybe_constexpr_fn if 'now' is false.
<case INDIRECT_REF>: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-fn8.C: New test.

(cherry picked from commit 6665a8572c8f24bd55c6081c91f461442c94dcfb)

tree-optimization/111917 - bougs IL after guard hoisting

The unswitching code to hoist guards inserts conditions in wrong
places. The following fixes this, simplifying code.

PR tree-optimization/111917
* tree-ssa-loop-unswitch.cc (hoist_guard): Always insert
new conditional after last stmt.

* gcc.dg/torture/pr111917.c: New testcase.

(cherry picked from commit d96bd4aade170fcd86f5f09b68b770dde798e631)

middle-end/111818 - failed DECL_NOT_GIMPLE_REG_P setting of volatile

The following addresses a missed DECL_NOT_GIMPLE_REG_P setting of
a volatile declared parameter which causes inlining to substitute
a constant parameter into a context where its address is required.

The main issue is in update_address_taken which clears
DECL_NOT_GIMPLE_REG_P from the parameter but fails to rewrite it
because is_gimple_reg returns false for volatiles. The following
changes maybe_optimize_var to make the 1:1 correspondence between
clearing DECL_NOT_GIMPLE_REG_P of a register typed decl and
actually rewriting it to SSA.

PR middle-end/111818
* tree-ssa.cc (maybe_optimize_var): When clearing
DECL_NOT_GIMPLE_REG_P always rewrite into SSA.

* gcc.dg/torture/pr111818.c: New testcase.

(cherry picked from commit ce55521bcd149fdc431f1d78e706b66d470210ae)

tree-optimization/111614 - missing convert in undistribute_bitref_for_vector

The following adjusts a flawed guard for converting the first vector
of the sum we create in undistribute_bitref_for_vector.

PR tree-optimization/111614
* tree-ssa-reassoc.cc (undistribute_bitref_for_vector): Properly
convert the first vector when required.

* gcc.dg/torture/pr111614.c: New testcase.

(cherry picked from commit 88d79b9b03eccf39921d13c2cbd1acc50aeda126)

tree-optimization/111764 - wrong reduction vectorization

The following removes a misguided attempt to allow x + x in a reduction
path, also allowing x * x which isn't valid. x + x actually never
arrives this way but instead is canonicalized to 2 * x. This makes
reduction path handling consistent with how we handle the single-stmt
reduction case.

PR tree-optimization/111764
* tree-vect-loop.cc (check_reduction_path): Remove the attempt
to allow x + x via special-casing of assigns.

* gcc.dg/vect/pr111764.c: New testcase.

(cherry picked from commit 05f98310b54da95e468d799f4a910174320cccbb)

tree-optimization/111445 - simple_iv simplification fault

The following fixes a missed check in the simple_iv attempt
to simplify (signed T)((unsigned T) base + step) where it
allows a truncating inner conversion leading to wrong code.

PR tree-optimization/111445
* tree-scalar-evolution.cc (simple_iv_with_niters):
Add missing check for a sign-conversion.

* gcc.dg/torture/pr111445.c: New testcase.

(cherry picked from commit 9692309ed6b625f0fb358c0e230404b5603f69a6)

tree-optimization/111019 - invariant motion and aliasing

The following fixes a bad choice in representing things to the alias
oracle by LIM which while correct in pieces is inconsistent with itself.
When canonicalizing a ref to a bare deref instead of leaving the base
object and the extracted offset the same and just substituting an
alternate ref the following replaces the base and the offset as well,
avoiding the confusion that otherwise will arise in
aliasing_matching_component_refs_p.

PR tree-optimization/111019
* tree-ssa-loop-im.cc (gather_mem_refs_stmt): When canonicalizing
also scrap base and offset in case the ref is indirect.

* g++.dg/torture/pr111019.C: New testcase.

(cherry picked from commit 745ec2135aabfbe2c0fb7780309837d17e8986d4)

tree-optimization/110702 - avoid zero-based memory references in IVOPTs

Sometimes IVOPTs chooses a weird induction variable which downstream
leads to issues.  Most of the times we can fend those off during costing
by rejecting the candidate but it looks like the address description
costing synthesizes is different from what we end up generating so
the following fixes things up at code generation time.  Specifically
we avoid the create_mem_ref_raw fallback which uses a literal zero
address base with the actual base in index2.  For the case in question
we have the address

  type = unsigned long
  offset = 0
  elements = {
    [0] = &e * -3,
    [1] = (sizetype) a.9_30 * 232,
    [2] = ivtmp.28_44 * 4
  }

from which we code generate the problematical

  _3 = MEM[(long int *)0B + ivtmp.36_9 + ivtmp.28_44 * 4];

which references the object at address zero.  The patch below
recognizes the fallback after the fact and transforms the
TARGET_MEM_REF memory reference into a LEA for which this form
isn't problematic:

  _24 = &MEM[(long int *)0B + ivtmp.36_34 + ivtmp.28_44 * 4];
  _3 = *_24;

hereby avoiding the correctness issue.  We'd later conclude the
program terminates at the null pointer dereference and make the
function pure, miscompling the main function of the testcase.

PR tree-optimization/110702
* tree-ssa-loop-ivopts.cc (rewrite_use_address): When
we created a NULL pointer based access rewrite that to
a LEA.

* gcc.dg/torture/pr110702.c: New testcase.

(cherry picked from commit 13dfb01e5c30c3bd09333ac79d6ff96a617fea67)

tree-optimization/110556 - tail merging still pre-tuples

The stmt comparison function for GIMPLE_ASSIGNs for tail merging
still looks like it deals with pre-tuples IL. The following
attempts to fix this, not only comparing the first operand (sic!)
of stmts but all of them plus also compare the operation code.

PR tree-optimization/110556
* tree-ssa-tail-merge.cc (gimple_equal_p): Check
assign code and all operands of non-stores.

* gcc.dg/torture/pr110556.c: New testcase.

(cherry picked from commit 7b16686ef882ab141276f0e36a9d4ce1d755f64a)

tree-optimization/110515 - wrong code with LIM + PRE

In this PR we face the issue that LIM speculates a load when
hoisting it out of the loop (since it knows it cannot trap).
Unfortunately this exposes undefined behavior when the load
accesses memory with the wrong dynamic type. This later
makes PRE use that representation instead of the original
which accesses the same memory location but using a different
dynamic type leading to a wrong disambiguation of that
original access against another and thus a wrong-code transform.

Fortunately there already is code in PRE dealing with a similar
situation for code hoisting but that left a small gap which
when fixed also fixes the wrong-code transform in this bug even
if it doesn't address the underlying issue of LIM speculating
that load.

The upside is this fix is trivially safe to backport and chances
of code generation regressions are very low.

PR tree-optimization/110515
* tree-ssa-pre.cc (compute_avail): Make code dealing
with hoisting loads with different alias-sets more
robust.

* g++.dg/opt/pr110515.C: New testcase.

(cherry picked from commit 9f4f833455bb35c11d03e93f802604ac7cd8b740)

debug/110295 - mixed up early/late debug for member DIEs

When we process a scope typedef during early debug creation and
we have already created a DIE for the type when the decl is
TYPE_DECL_IS_STUB and this DIE is still in limbo we end up
just re-parenting that type DIE instead of properly creating
a DIE for the decl, eventually picking up the now completed
type and creating DIEs for the members. Instead this is currently
defered to the second time we come here, when we annotate the
DIEs with locations late where now the type DIE is no longer
in limbo and we fall through doing the job for the decl.

The following makes sure we perform the necessary early tasks
for this by continuing with the decl DIE creation after setting
a parent for the limbo type DIE.

PR debug/110295
* dwarf2out.cc (process_scope_var): Continue processing
the decl after setting a parent in case the existing DIE
was in limbo.

* g++.dg/debug/pr110295.C: New testcase.

(cherry picked from commit 963f87f8a65ec82f503ac4334a3da83b0a8a43b2)

ipa/109983 - (IPA) PTA speedup

This improves the edge avoidance heuristic by re-ordering the
topological sort of the graph to make sure the component with
the ESCAPED node is processed first. This improves the number
of created edges which directly correlates with the number
of bitmap_ior_into calls from 141447426 to 239596 and the
compile-time from 1083s to 3s. It also improves the compile-time
for the related PR109143 from 81s to 27s.

I've modernized the topological sorting API on the way as well.

PR ipa/109983
PR tree-optimization/109143
* tree-ssa-structalias.cc (struct topo_info): Remove.
(init_topo_info): Likewise.
(free_topo_info): Likewise.
(compute_topo_order): Simplify API, put the component
with ESCAPED last so it's processed first.
(topo_visit): Adjust.
(solve_graph): Likewise.

(cherry picked from commit 95e5c38a98cc64a797b1d766a20f8c0d0c807a74)

Daily bump.

i386: Wrong code with __builtin_parityl [PR112672]

gen_parityhi2_cmp instruction clobbers its input operand, so use
a temporary register in the call to gen_parityhi2_cmp.

PR target/112672

gcc/ChangeLog:

* config/i386/i386.md (parityhi2):
Use temporary register in the call to gen_parityhi2_cmp.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112672.c: New test.

(cherry picked from commit b2d17bdd45b582b93e89c00b04763a45f97d7a34)

Daily bump.

PR target/111815: VAX: Only accept the index scaler as the RHS operand to ASHIFT

As from commit 9df1ba9a35b8 ("libbacktrace: support zstd decompression")
GCC for the `vax-netbsdelf' target fails to complete building, with an
ICE:

during RTL pass: final
.../libbacktrace/elf.c: In function 'elf_zstd_decompress':
.../libbacktrace/elf.c:5006:1: internal compiler error: in print_operand_address, at config/vax/vax.cc:514
5006 | }
      | ^
0x1113df97 print_operand_address(_IO_FILE*, rtx_def*)
.../gcc/config/vax/vax.cc:514
0x10c2489b default_print_operand_address(_IO_FILE*, machine_mode, rtx_def*)
.../gcc/targhooks.cc:373
0x106ddd0b output_address(machine_mode, rtx_def*)
.../gcc/final.cc:3648
0x106ddd0b output_asm_insn(char const*, rtx_def**)
.../gcc/final.cc:3505
0x106e2143 output_asm_insn(char const*, rtx_def**)
.../gcc/final.cc:3421
0x106e2143 final_scan_insn_1
.../gcc/final.cc:2841
0x106e28e3 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
.../gcc/final.cc:2887
0x106e2bf7 final_1
.../gcc/final.cc:1979
0x106e3c67 rest_of_handle_final
.../gcc/final.cc:4240
0x106e3c67 execute
.../gcc/final.cc:4318
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

This is due to combine producing an invalid address RTX:

(plus:SI (ashift:SI (const_int 1 [0x1])
        (reg:QI 3 %r3 [1232]))
    (reg/v:SI 10 %r10 [orig:736 weight_mask ] [736]))

where the expression is ((1 << R3) + R10), which does not match a valid
machine addressing mode.  Consequently `print_operand_address' chokes.

This can be reduced to the testcase included, where it triggers the same
ICE in `p'.  Preincrements are required so that their results land in
registers and consequently an indexed addressing mode is tried or
otherwise doing operations piecemeal on stack-based function arguments
as direct input operands turns out more profitable in terms of RTX costs
and the ICE is avoided.

The ultimate cause has been commit c605a8bf9270 ("VAX: Accept ASHIFT in
address expressions"), where a shift of an immediate value by a register
has been mistakenly allowed as an index expression as if the shift
operation was commutative such as multiplication is.  So with ASHIFT the
scaler in an index expression has to be the right-hand operand, and the
backend has to enforce that, whereas with MULT the scaler can be either
operand.

Fix this by only accepting the index scaler as the RHS operand to
ASHIFT.

gcc/
PR target/111815
* config/vax/vax.cc (index_term_p): Only accept the index scaler
as the RHS operand to ASHIFT.

gcc/testsuite/
PR target/111815
* gcc.dg/torture/pr111815.c: New test.

(cherry picked from commit 56ff988e6be3fdba70cad86d73ec0038bc3b6b5a)

Daily bump.

LoongArch: Modify MUSL_DYNAMIC_LINKER.

Use no suffix at all in the musl dynamic linker name for hard
float ABI. Use -sf and -sp suffixes in musl dynamic linker name
for soft float and single precision ABIs. The following table
outlines the musl interpreter names for the LoongArch64 ABI names.

musl interpreter | LoongArch64 ABI
--------------------------- | -----------------
ld-musl-loongarch64.so.1 | loongarch64-lp64d
ld-musl-loongarch64-sp.so.1 | loongarch64-lp64f
ld-musl-loongarch64-sf.so.1 | loongarch64-lp64s

gcc/ChangeLog:

* config/loongarch/gnu-user.h (MUSL_ABI_SPEC): Modify suffix.

(cherry picked from commit 8bccee51f0deac64b79cd9ad75df599422f4c8ff)

LoongArch: Fix MUSL_DYNAMIC_LINKER

The system based on musl has no '/lib64', so change it.

https://wiki.musl-libc.org/guidelines-for-distributions.html,
"Multilib/multi-arch" section of this introduces it.

gcc/
* config/loongarch/gnu-user.h (MUSL_DYNAMIC_LINKER): Redefine.

Signed-off-by: Peng Fan <fanpeng@loongson.cn>
Suggested-by: Xi Ruoyao <xry111@xry111.site>
(cherry picked from commit a80c68a08604b0ac625ac7fc59eae40b551b1176)

Daily bump.

c++: retval dtor on rethrow [PR112301]

In r12-6333 for PR33799, I fixed the example in [except.ctor]/2. In that
testcase, the exception is caught and the function returns again,
successfully.

In this testcase, however, the exception is rethrown, and hits two separate
cleanups: one in the try block and the other in the function body. So we
destroy twice an object that was only constructed once.

Fortunately, the fix for the normal case is easy: we just need to clear the
"return value constructed by return" flag when we do it the first time.

This gets more complicated with the named return value optimization, since
we don't want to destroy the return value while the NRV variable is still in
scope.

PR c++/112301
PR c++/102191
PR c++/33799

gcc/cp/ChangeLog:

* except.cc (maybe_splice_retval_cleanup): Clear
current_retval_sentinel when destroying retval.
* semantics.cc (nrv_data): Add in_nrv_cleanup.
(finalize_nrv): Set it.
(finalize_nrv_r): Fix handling of throwing cleanups.

gcc/testsuite/ChangeLog:

* g++.dg/eh/return1.C: Add more cases.

c++: fix contracts with NRV

The NRV implementation was blindly replacing the operand of RETURN_EXPR,
clobbering anything that check_return_expr might have added on to the actual
initialization, such as checking the postcondition.

GCC 12 note: There are no contracts in GCC 12, but this issue also broke
setting current_retval_sentinel.

gcc/cp/ChangeLog:

* semantics.cc (finalize_nrv_r): [RETURN_EXPR]: Only replace the
INIT_EXPR.

c++: fix throwing cleanup with label

While looking at PR92407 I noticed that the expectations of
maybe_splice_retval_cleanup weren't being met; an sk_cleanup level was
confusing its attempt to recognize the outer block of the function. And
even if I fixed the detection, it failed to actually wrap the body of the
function because the STATEMENT_LIST it got only had the label, not anything
after it. So I moved the call after poplevel does pop_stmt_list on all the
sk_cleanup levels.

PR c++/33799

gcc/cp/ChangeLog:

* except.cc (maybe_splice_retval_cleanup): Change
recognition of function body and try scopes.
* semantics.cc (do_poplevel): Call it after poplevel.
(at_try_scope): New.
* cp-tree.h (maybe_splice_retval_cleanup): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/eh/return1.C: Add label cases.

Daily bump.

Fix warning on new Ada testcase

gcc/testsuite/
* gnat.dg/varsize4.adb (Func): Initialize Byte_Read parameter.

Fix internal error on function returning dynamically-sized type

This is a tree sharing issue for the internal return type synthesized for
a function returning a dynamically-sized type and taking an Out or In/Out
parameter passed by copy.

gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Also create a
TYPE_DECL for the return type built for the CI/CO mechanism.

gcc/testsuite/
* gnat.dg/varsize4.ads, gnat.dg/varsize4.adb: New test.
* gnat.dg/varsize4_pkg.ads: New helper.

LoongArch: Remove redundant barrier instructions before LL-SC loops

This is isomorphic to the LLVM changes [1-2].

On LoongArch, the LL and SC instructions has memory barrier semantics:

- LL: <memory-barrier> + <load-exclusive>
- SC: <store-conditional> + <memory-barrier>

But the compare and swap operation is allowed to fail, and if it fails
the SC instruction is not executed, thus the guarantee of acquiring
semantics cannot be ensured. Therefore, an acquire barrier needs to be
generated when failure_memorder includes an acquire operation.

On CPUs implementing LoongArch v1.10 or later, "dbar 0b10100" is an
acquire barrier; on CPUs implementing LoongArch v1.00, it is a full
barrier. So it's always enough for acquire semantics. OTOH if an
acquire semantic is not needed, we still needs the "dbar 0x700" as the
load-load barrier like all LL-SC loops.

[1]:https://github.com/llvm/llvm-project/pull/67391
[2]:https://github.com/llvm/llvm-project/pull/69339

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_memmodel_needs_release_fence): Remove.
(loongarch_cas_failure_memorder_needs_acquire): New static
function.
(loongarch_print_operand): Redefine 'G' for the barrier on CAS
failure.
* config/loongarch/sync.md (atomic_cas_value_strong<mode>):
Remove the redundant barrier before the LL instruction, and
emit an acquire barrier on failure if needed by
failure_memorder.
(atomic_cas_value_cmp_and_7_<mode>): Likewise.
(atomic_cas_value_add_7_<mode>): Remove the unnecessary barrier
before the LL instruction.
(atomic_cas_value_sub_7_<mode>): Likewise.
(atomic_cas_value_and_7_<mode>): Likewise.
(atomic_cas_value_xor_7_<mode>): Likewise.
(atomic_cas_value_or_7_<mode>): Likewise.
(atomic_cas_value_nand_7_<mode>): Likewise.
(atomic_cas_value_exchange_7_<mode>): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/cas-acquire.c: New test.

(cherry picked from commit 4d86dc51e34d2a5695b617afeb56e3414836a79a)

Daily bump.

libstdc++: Fix std::deque::operator[] Xmethod [PR112491]

The Xmethod for std::deque::operator[] has the same bug that I recently
fixed for the std::deque::size() Xmethod. The first node might have
unused capacity at the start, which needs to be accounted for when
indexing into the deque.

libstdc++-v3/ChangeLog:

PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.index):
Correctly handle unused capacity at the start of the first node.
* testsuite/libstdc++-xmethods/deque.cc: Check index operator
when elements have been removed from the front.

(cherry picked from commit 452476db0c705caeac8712d560fc16ced0ca5226)

libstdc++: std::stacktrace tweaks

Fix a typo in a string literal and make the new hash.cc test gracefully
handle missing stacktrace data (see PR 112541).

libstdc++-v3/ChangeLog:

* include/std/stacktrace (basic_stacktrace::at): Fix class name
in exception message.
* testsuite/19_diagnostics/stacktrace/hash.cc: Do not fail if
current() returns a non-empty stacktrace.

(cherry picked from commit cbd0fe22a5ced9751d2450dc4fd6fe3525c2fc02)

rs6000: Consider inline asm as safe if no assembler complains [PR111828]

As discussed in PR111828, rs6000_update_ipa_fn_target_info
is much conservative, currently for any non-empty inline
asm, without any parsing, it would take inline asm could
have HTM insns. It means for one function attributed with
power8 having inline asm, even if it has no HTM insns, we
don't make a function attributed with power10 inline it.

Peter pointed out an inline asm parser can be a slippery
slope, and noticed that the current gnu assembler still
allows HTM insns even with power10 machine type, so he
suggested that we can aggressively ignore the handling on
inline asm, this patch goes for this suggestion.

Considering that there are a few assembler alternatives
and assembler can update its behaviors (complaining HTM
insns at power10 and later cpus sounds reasonable from a
certain point of view), this patch also checks assembler
complains on HTM insns at power10 or not. For a case that
a caller attributed power10 calls a callee attributed
power8 having inline asm with HTM insn, without inlining
at least the compilation succeeds, but if assembler
complains HTM insns at power10, after inlining the
compilation would fail.

The two associated test cases are fine without and with
this patch (effective target takes effect or not).

PR target/111828

gcc/ChangeLog:

* config.in: Regenerate.
* config/rs6000/rs6000.cc (rs6000_update_ipa_fn_target_info): Guard
inline asm handling under !HAVE_AS_POWER10_HTM.
* configure: Regenerate.
* configure.ac: Detect assembler support for HTM insns at power10.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_powerpc_as_p10_htm): New proc.
* g++.target/powerpc/pr111828-1.C: New test.
* g++.target/powerpc/pr111828-2.C: New test.

(cherry picked from commit b2075291af8810794c7184fd125b991c2341cb1e)

Daily bump.

libstdc++: Fix std::hash<std::stacktrace> [PR112348]

libstdc++-v3/ChangeLog:

PR libstdc++/112348
* include/std/stacktrace (hash<basic_stacktrace<Alloc>>): Fix
type of hash function for entries.
* testsuite/19_diagnostics/stacktrace/hash.cc: New test.

(cherry picked from commit 6f2fc42d9e52e8322e718e0154cd235d00906f99)

libstdc++: Fix std::deque::size() Xmethod [PR112491]

The Xmethod for std::deque::size() assumed that the first element would
be at the start of the first node. That's only true if elements are only
added at the back. If an element is inserted at the front, or removed
from the front (or anywhere before the middle) then the first node will
not be completely populated, and the Xmethod will give the wrong result.

libstdc++-v3/ChangeLog:

PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.size): Fix
calculation to use _M_start._M_cur.
* testsuite/libstdc++-xmethods/deque.cc: Check failing cases.

(cherry picked from commit 4db820928065eccbeb725406450d826186582b9f)

Daily bump.

libstdc++: Correctly call _string_types function

flake8 points out that the new call to _string_types from
StdExpAnyPrinter.__init__ is not correct -- it needs to be qualified.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py
(StdExpAnyPrinter.__init__): Qualify call to
_string_types.

(cherry picked from commit 4bf77db70e2521dc89f9d7f51c7ae6e58a94b4f9)

libstdc++: _versioned_namespace is always non-None

Some code in the pretty-printers seems to assume that the
_versioned_namespace global might be None (or the empty string).
However, doesn't occur, as the variable is never reassigned.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Assume that
_versioned_namespace is non-None.
* python/libstdcxx/v6/xmethods.py (is_specialization_of):
Assume that _versioned_namespace is non-None.

(cherry picked from commit d342c9de6a1534cbce324bcc3c7c0898ff74d386)

libstdc++: Use Python "not in" operator

flake8 warns about code like

not something in "whatever"

Ordinarily in Python this should be written as:

something not in "whatever"

This patch makes this change.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (Printer.add_version)
(add_one_template_type_printer)
(FilteringTypePrinter.add_one_type_printer): Use Python
"not in" operator.

(cherry picked from commit 202810947ca793b17653658e584008fb151e0937)

libstdc++: Define _versioned_namespace in xmethods.py

flake8 pointed out that is_specialization_of in xmethods.py looks at a
global that wasn't added to the file. This patch correct the
oversight.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/xmethods.py (_versioned_namespace):
Define.

(cherry picked from commit 83ec6e80ff0d56342fc066c2ef649036c0983529)

libstdc++: Refactor Python Xmethods to use is_specialization_of

This copies the is_specialization_of function from printers.py (with
slight modification for versioned namespace handling) and reuses it in
xmethods.py to replace repetitive re.match calls in every class.

This fixes the problem that the regular expressions used \d without
escaping the backslash properly.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/xmethods.py (is_specialization_of): Define
new function.
(ArrayMethodsMatcher, DequeMethodsMatcher)
(ForwardListMethodsMatcher, ListMethodsMatcher)
(VectorMethodsMatcher, AssociativeContainerMethodsMatcher)
(UniquePtrGetWorker, UniquePtrMethodsMatcher)
(SharedPtrSubscriptWorker, SharedPtrMethodsMatcher): Use
is_specialization_of instead of re.match.

(cherry picked from commit 17d3477fa89466604bee5af2a2caf8de5441aeb5)

libstdc++: Reformat Python code

Some of these changes were suggested by autopep8's --aggressive
option, others are for readability.

Break long lines by splitting strings across multiple lines, or
introducing local variables to hold results.

Use raw strings for regular expressions, so that backslashes don't need
to be escaped.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Break long lines. Use raw
strings for regular expressions. Add whitespace around
operators.
(is_member_of_namespace): Use isinstance to check type.
(is_specialization_of): Likewise. Adjust template_name
for versioned namespace instead of duplicating the re.match
call.
(StdExpAnyPrinter._string_types): New static method.
(StdExpAnyPrinter.to_string): Use _string_types.

(cherry picked from commit 6b5c3f9b8139d9eee358b354b35da0b757a0270d)

libstdc++: Format Python docstrings according to PEP 357

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Format docstrings according
to PEP 257.
* python/libstdcxx/v6/xmethods.py: Likewise.

(cherry picked from commit 0ef4cc8225f8669463bca501f36b6951967a692a)

libstdc++: Format Python code according to PEP8

These files were filtered through autopep8 to reformat them more
conventionally.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Reformat.
* python/libstdcxx/v6/xmethods.py: Likewise.

(cherry picked from commit e08559271b2d797f658579ac8610dbf5e58bcfd8)

Fix wrong code due to vec_merge + pcmp to blendvb splitter.

gcc/ChangeLog:

PR target/112443
* config/i386/sse.md (*avx2_pcmp<mode>3_4): Fix swap condition
from LT to GT since there's not in the pattern.
(*avx2_pcmp<mode>3_5): Ditto.

gcc/testsuite/ChangeLog:

* g++.target/i386/pr112443.C: New test.

(cherry picked from commit 9a0cc04b9c9b02426762892b88efc5c44ba546bd)

Daily bump.

libphobos: Fix regression d21 loops in getCpuInfo0B in Solaris/x86 kernel zone

This function assumes that cpuid would return "invalid domain" when a
sub-leaf index greater than what's supported is requested. This turned
out not to always be the case when running on some virtual machines.

As the loop only does anything for levels 0 and 1, make that a hard
limit for number of times the loop is ran.

PR d/112408

libphobos/ChangeLog:

* libdruntime/core/cpuid.d (getCpuInfo0B): Limit number of times loop
runs.

(cherry picked from commit 0b25c1295d4e84af681f4b1f4af2ad37cd270da3)

Daily bump.

libstdc++: use -D_GNU_SOURCE when building libbacktrace

PR libbacktrace/111315
PR libbacktrace/112263
* acinclude.m4: Set -D_GNU_SOURCE in BACKTRACE_CPPFLAGS and when
grepping link.h for dl_iterate_phdr.
* configure: Regenerate.

hppa: Fix typo in PA 2.0 trampoline template

2023-11-06 John David Anglin <danglin@gcc.gnu.org>

* config/pa/pa.cc (pa_asm_trampoline_template): Fix typo.

Daily bump.

d: Fix ICE: verify_gimple_failed (conversion of register to a different size in 'view_convert_expr')

Static arrays in D are passed around by value, rather than decaying to a
pointer. On x86_64 __builtin_va_list is an exception to this rule, but
semantically it's still treated as a static array.

This makes certain assignment operations fail due a mismatch in types.
As all examples in the test program are rejected by C/C++ front-ends,
these are now errors in D too to be consistent.

PR d/110712

gcc/d/ChangeLog:

* d-codegen.cc (d_build_call): Update call to convert_for_argument.
* d-convert.cc (is_valist_parameter_type): New function.
(check_valist_conversion): New function.
(convert_for_assignment): Update signature. Add check whether
assigning va_list is permissible.
(convert_for_argument): Likewise.
* d-tree.h (convert_for_assignment): Update signature.
(convert_for_argument): Likewise.
* expr.cc (ExprVisitor::visit (AssignExp *)): Update call to
convert_for_assignment.

gcc/testsuite/ChangeLog:

* gdc.dg/pr110712.d: New test.

(cherry picked from commit ea8ffdcadb388b531adf4772287e7987a82a84b7)

Daily bump.

d: Fix ICE: in verify_gimple_in_seq on powerpc-darwin9 [PR112270]

This ICE was seen during stage2 on powerpc-darwin9 only. There were
still some uses of GCC's boolean_type_node in the D front-end, which
caused a type mismatch to trigger as D bool size is fixed to 1 byte on
all targets.

So two new nodes have been introduced - d_bool_false_node and
d_bool_true_node - which have replaced all remaining uses of
boolean_false_node and boolean_true_node respectively.

PR d/112270

gcc/d/ChangeLog:

* d-builtins.cc (d_build_d_type_nodes): Initialize d_bool_false_node,
d_bool_true_node.
* d-codegen.cc (build_array_struct_comparison): Use d_bool_false_node
instead of boolean_false_node.
* d-convert.cc (d_truthvalue_conversion): Use d_bool_false_node and
d_bool_true_node instead of boolean_false_node and boolean_true_node.
* d-tree.h (enum d_tree_index): Add DTI_BOOL_FALSE and DTI_BOOL_TRUE.
(d_bool_false_node): New macro.
(d_bool_true_node): New macro.
* modules.cc (build_dso_cdtor_fn): Use d_bool_false_node and
d_bool_true_node instead of boolean_false_node and boolean_true_node.
(register_moduleinfo): Use d_bool_type instead of boolean_type_node.

gcc/testsuite/ChangeLog:

* gdc.dg/pr112270.d: New test.

(cherry picked from commit 10f1489dcb3bd9adccc88898bc12f53398fa3583)

Daily bump.