git.ipfire.org Git - thirdparty/gcc.git/log

tree-optimization/111917 - bougs IL after guard hoisting

The unswitching code to hoist guards inserts conditions in wrong
places. The following fixes this, simplifying code.

PR tree-optimization/111917
* tree-ssa-loop-unswitch.cc (hoist_guard): Always insert
new conditional after last stmt.

* gcc.dg/torture/pr111917.c: New testcase.

(cherry picked from commit d96bd4aade170fcd86f5f09b68b770dde798e631)

middle-end/111818 - failed DECL_NOT_GIMPLE_REG_P setting of volatile

The following addresses a missed DECL_NOT_GIMPLE_REG_P setting of
a volatile declared parameter which causes inlining to substitute
a constant parameter into a context where its address is required.

The main issue is in update_address_taken which clears
DECL_NOT_GIMPLE_REG_P from the parameter but fails to rewrite it
because is_gimple_reg returns false for volatiles. The following
changes maybe_optimize_var to make the 1:1 correspondence between
clearing DECL_NOT_GIMPLE_REG_P of a register typed decl and
actually rewriting it to SSA.

PR middle-end/111818
* tree-ssa.cc (maybe_optimize_var): When clearing
DECL_NOT_GIMPLE_REG_P always rewrite into SSA.

* gcc.dg/torture/pr111818.c: New testcase.

(cherry picked from commit ce55521bcd149fdc431f1d78e706b66d470210ae)

tree-optimization/111614 - missing convert in undistribute_bitref_for_vector

The following adjusts a flawed guard for converting the first vector
of the sum we create in undistribute_bitref_for_vector.

PR tree-optimization/111614
* tree-ssa-reassoc.cc (undistribute_bitref_for_vector): Properly
convert the first vector when required.

* gcc.dg/torture/pr111614.c: New testcase.

(cherry picked from commit 88d79b9b03eccf39921d13c2cbd1acc50aeda126)

tree-optimization/111764 - wrong reduction vectorization

The following removes a misguided attempt to allow x + x in a reduction
path, also allowing x * x which isn't valid. x + x actually never
arrives this way but instead is canonicalized to 2 * x. This makes
reduction path handling consistent with how we handle the single-stmt
reduction case.

PR tree-optimization/111764
* tree-vect-loop.cc (check_reduction_path): Remove the attempt
to allow x + x via special-casing of assigns.

* gcc.dg/vect/pr111764.c: New testcase.

(cherry picked from commit 05f98310b54da95e468d799f4a910174320cccbb)

tree-optimization/111445 - simple_iv simplification fault

The following fixes a missed check in the simple_iv attempt
to simplify (signed T)((unsigned T) base + step) where it
allows a truncating inner conversion leading to wrong code.

PR tree-optimization/111445
* tree-scalar-evolution.cc (simple_iv_with_niters):
Add missing check for a sign-conversion.

* gcc.dg/torture/pr111445.c: New testcase.

(cherry picked from commit 9692309ed6b625f0fb358c0e230404b5603f69a6)

tree-optimization/111019 - invariant motion and aliasing

The following fixes a bad choice in representing things to the alias
oracle by LIM which while correct in pieces is inconsistent with itself.
When canonicalizing a ref to a bare deref instead of leaving the base
object and the extracted offset the same and just substituting an
alternate ref the following replaces the base and the offset as well,
avoiding the confusion that otherwise will arise in
aliasing_matching_component_refs_p.

PR tree-optimization/111019
* tree-ssa-loop-im.cc (gather_mem_refs_stmt): When canonicalizing
also scrap base and offset in case the ref is indirect.

* g++.dg/torture/pr111019.C: New testcase.

(cherry picked from commit 745ec2135aabfbe2c0fb7780309837d17e8986d4)

tree-optimization/110702 - avoid zero-based memory references in IVOPTs

Sometimes IVOPTs chooses a weird induction variable which downstream
leads to issues.  Most of the times we can fend those off during costing
by rejecting the candidate but it looks like the address description
costing synthesizes is different from what we end up generating so
the following fixes things up at code generation time.  Specifically
we avoid the create_mem_ref_raw fallback which uses a literal zero
address base with the actual base in index2.  For the case in question
we have the address

  type = unsigned long
  offset = 0
  elements = {
    [0] = &e * -3,
    [1] = (sizetype) a.9_30 * 232,
    [2] = ivtmp.28_44 * 4
  }

from which we code generate the problematical

  _3 = MEM[(long int *)0B + ivtmp.36_9 + ivtmp.28_44 * 4];

which references the object at address zero.  The patch below
recognizes the fallback after the fact and transforms the
TARGET_MEM_REF memory reference into a LEA for which this form
isn't problematic:

  _24 = &MEM[(long int *)0B + ivtmp.36_34 + ivtmp.28_44 * 4];
  _3 = *_24;

hereby avoiding the correctness issue.  We'd later conclude the
program terminates at the null pointer dereference and make the
function pure, miscompling the main function of the testcase.

PR tree-optimization/110702
* tree-ssa-loop-ivopts.cc (rewrite_use_address): When
we created a NULL pointer based access rewrite that to
a LEA.

* gcc.dg/torture/pr110702.c: New testcase.

(cherry picked from commit 13dfb01e5c30c3bd09333ac79d6ff96a617fea67)

tree-optimization/110556 - tail merging still pre-tuples

The stmt comparison function for GIMPLE_ASSIGNs for tail merging
still looks like it deals with pre-tuples IL. The following
attempts to fix this, not only comparing the first operand (sic!)
of stmts but all of them plus also compare the operation code.

PR tree-optimization/110556
* tree-ssa-tail-merge.cc (gimple_equal_p): Check
assign code and all operands of non-stores.

* gcc.dg/torture/pr110556.c: New testcase.

(cherry picked from commit 7b16686ef882ab141276f0e36a9d4ce1d755f64a)

tree-optimization/110515 - wrong code with LIM + PRE

In this PR we face the issue that LIM speculates a load when
hoisting it out of the loop (since it knows it cannot trap).
Unfortunately this exposes undefined behavior when the load
accesses memory with the wrong dynamic type. This later
makes PRE use that representation instead of the original
which accesses the same memory location but using a different
dynamic type leading to a wrong disambiguation of that
original access against another and thus a wrong-code transform.

Fortunately there already is code in PRE dealing with a similar
situation for code hoisting but that left a small gap which
when fixed also fixes the wrong-code transform in this bug even
if it doesn't address the underlying issue of LIM speculating
that load.

The upside is this fix is trivially safe to backport and chances
of code generation regressions are very low.

PR tree-optimization/110515
* tree-ssa-pre.cc (compute_avail): Make code dealing
with hoisting loads with different alias-sets more
robust.

* g++.dg/opt/pr110515.C: New testcase.

(cherry picked from commit 9f4f833455bb35c11d03e93f802604ac7cd8b740)

debug/110295 - mixed up early/late debug for member DIEs

When we process a scope typedef during early debug creation and
we have already created a DIE for the type when the decl is
TYPE_DECL_IS_STUB and this DIE is still in limbo we end up
just re-parenting that type DIE instead of properly creating
a DIE for the decl, eventually picking up the now completed
type and creating DIEs for the members. Instead this is currently
defered to the second time we come here, when we annotate the
DIEs with locations late where now the type DIE is no longer
in limbo and we fall through doing the job for the decl.

The following makes sure we perform the necessary early tasks
for this by continuing with the decl DIE creation after setting
a parent for the limbo type DIE.

PR debug/110295
* dwarf2out.cc (process_scope_var): Continue processing
the decl after setting a parent in case the existing DIE
was in limbo.

* g++.dg/debug/pr110295.C: New testcase.

(cherry picked from commit 963f87f8a65ec82f503ac4334a3da83b0a8a43b2)

ipa/109983 - (IPA) PTA speedup

This improves the edge avoidance heuristic by re-ordering the
topological sort of the graph to make sure the component with
the ESCAPED node is processed first. This improves the number
of created edges which directly correlates with the number
of bitmap_ior_into calls from 141447426 to 239596 and the
compile-time from 1083s to 3s. It also improves the compile-time
for the related PR109143 from 81s to 27s.

I've modernized the topological sorting API on the way as well.

PR ipa/109983
PR tree-optimization/109143
* tree-ssa-structalias.cc (struct topo_info): Remove.
(init_topo_info): Likewise.
(free_topo_info): Likewise.
(compute_topo_order): Simplify API, put the component
with ESCAPED last so it's processed first.
(topo_visit): Adjust.
(solve_graph): Likewise.

(cherry picked from commit 95e5c38a98cc64a797b1d766a20f8c0d0c807a74)

Daily bump.

i386: Wrong code with __builtin_parityl [PR112672]

gen_parityhi2_cmp instruction clobbers its input operand, so use
a temporary register in the call to gen_parityhi2_cmp.

PR target/112672

gcc/ChangeLog:

* config/i386/i386.md (parityhi2):
Use temporary register in the call to gen_parityhi2_cmp.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112672.c: New test.

(cherry picked from commit b2d17bdd45b582b93e89c00b04763a45f97d7a34)

Daily bump.

PR target/111815: VAX: Only accept the index scaler as the RHS operand to ASHIFT

As from commit 9df1ba9a35b8 ("libbacktrace: support zstd decompression")
GCC for the `vax-netbsdelf' target fails to complete building, with an
ICE:

during RTL pass: final
.../libbacktrace/elf.c: In function 'elf_zstd_decompress':
.../libbacktrace/elf.c:5006:1: internal compiler error: in print_operand_address, at config/vax/vax.cc:514
5006 | }
      | ^
0x1113df97 print_operand_address(_IO_FILE*, rtx_def*)
.../gcc/config/vax/vax.cc:514
0x10c2489b default_print_operand_address(_IO_FILE*, machine_mode, rtx_def*)
.../gcc/targhooks.cc:373
0x106ddd0b output_address(machine_mode, rtx_def*)
.../gcc/final.cc:3648
0x106ddd0b output_asm_insn(char const*, rtx_def**)
.../gcc/final.cc:3505
0x106e2143 output_asm_insn(char const*, rtx_def**)
.../gcc/final.cc:3421
0x106e2143 final_scan_insn_1
.../gcc/final.cc:2841
0x106e28e3 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
.../gcc/final.cc:2887
0x106e2bf7 final_1
.../gcc/final.cc:1979
0x106e3c67 rest_of_handle_final
.../gcc/final.cc:4240
0x106e3c67 execute
.../gcc/final.cc:4318
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

This is due to combine producing an invalid address RTX:

(plus:SI (ashift:SI (const_int 1 [0x1])
        (reg:QI 3 %r3 [1232]))
    (reg/v:SI 10 %r10 [orig:736 weight_mask ] [736]))

where the expression is ((1 << R3) + R10), which does not match a valid
machine addressing mode.  Consequently `print_operand_address' chokes.

This can be reduced to the testcase included, where it triggers the same
ICE in `p'.  Preincrements are required so that their results land in
registers and consequently an indexed addressing mode is tried or
otherwise doing operations piecemeal on stack-based function arguments
as direct input operands turns out more profitable in terms of RTX costs
and the ICE is avoided.

The ultimate cause has been commit c605a8bf9270 ("VAX: Accept ASHIFT in
address expressions"), where a shift of an immediate value by a register
has been mistakenly allowed as an index expression as if the shift
operation was commutative such as multiplication is.  So with ASHIFT the
scaler in an index expression has to be the right-hand operand, and the
backend has to enforce that, whereas with MULT the scaler can be either
operand.

Fix this by only accepting the index scaler as the RHS operand to
ASHIFT.

gcc/
PR target/111815
* config/vax/vax.cc (index_term_p): Only accept the index scaler
as the RHS operand to ASHIFT.

gcc/testsuite/
PR target/111815
* gcc.dg/torture/pr111815.c: New test.

(cherry picked from commit 56ff988e6be3fdba70cad86d73ec0038bc3b6b5a)

Daily bump.

LoongArch: Modify MUSL_DYNAMIC_LINKER.

Use no suffix at all in the musl dynamic linker name for hard
float ABI. Use -sf and -sp suffixes in musl dynamic linker name
for soft float and single precision ABIs. The following table
outlines the musl interpreter names for the LoongArch64 ABI names.

musl interpreter | LoongArch64 ABI
--------------------------- | -----------------
ld-musl-loongarch64.so.1 | loongarch64-lp64d
ld-musl-loongarch64-sp.so.1 | loongarch64-lp64f
ld-musl-loongarch64-sf.so.1 | loongarch64-lp64s

gcc/ChangeLog:

* config/loongarch/gnu-user.h (MUSL_ABI_SPEC): Modify suffix.

(cherry picked from commit 8bccee51f0deac64b79cd9ad75df599422f4c8ff)

LoongArch: Fix MUSL_DYNAMIC_LINKER

The system based on musl has no '/lib64', so change it.

https://wiki.musl-libc.org/guidelines-for-distributions.html,
"Multilib/multi-arch" section of this introduces it.

gcc/
* config/loongarch/gnu-user.h (MUSL_DYNAMIC_LINKER): Redefine.

Signed-off-by: Peng Fan <fanpeng@loongson.cn>
Suggested-by: Xi Ruoyao <xry111@xry111.site>
(cherry picked from commit a80c68a08604b0ac625ac7fc59eae40b551b1176)

Daily bump.

c++: retval dtor on rethrow [PR112301]

In r12-6333 for PR33799, I fixed the example in [except.ctor]/2. In that
testcase, the exception is caught and the function returns again,
successfully.

In this testcase, however, the exception is rethrown, and hits two separate
cleanups: one in the try block and the other in the function body. So we
destroy twice an object that was only constructed once.

Fortunately, the fix for the normal case is easy: we just need to clear the
"return value constructed by return" flag when we do it the first time.

This gets more complicated with the named return value optimization, since
we don't want to destroy the return value while the NRV variable is still in
scope.

PR c++/112301
PR c++/102191
PR c++/33799

gcc/cp/ChangeLog:

* except.cc (maybe_splice_retval_cleanup): Clear
current_retval_sentinel when destroying retval.
* semantics.cc (nrv_data): Add in_nrv_cleanup.
(finalize_nrv): Set it.
(finalize_nrv_r): Fix handling of throwing cleanups.

gcc/testsuite/ChangeLog:

* g++.dg/eh/return1.C: Add more cases.

c++: fix contracts with NRV

The NRV implementation was blindly replacing the operand of RETURN_EXPR,
clobbering anything that check_return_expr might have added on to the actual
initialization, such as checking the postcondition.

GCC 12 note: There are no contracts in GCC 12, but this issue also broke
setting current_retval_sentinel.

gcc/cp/ChangeLog:

* semantics.cc (finalize_nrv_r): [RETURN_EXPR]: Only replace the
INIT_EXPR.

c++: fix throwing cleanup with label

While looking at PR92407 I noticed that the expectations of
maybe_splice_retval_cleanup weren't being met; an sk_cleanup level was
confusing its attempt to recognize the outer block of the function. And
even if I fixed the detection, it failed to actually wrap the body of the
function because the STATEMENT_LIST it got only had the label, not anything
after it. So I moved the call after poplevel does pop_stmt_list on all the
sk_cleanup levels.

PR c++/33799

gcc/cp/ChangeLog:

* except.cc (maybe_splice_retval_cleanup): Change
recognition of function body and try scopes.
* semantics.cc (do_poplevel): Call it after poplevel.
(at_try_scope): New.
* cp-tree.h (maybe_splice_retval_cleanup): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/eh/return1.C: Add label cases.

Daily bump.

Fix warning on new Ada testcase

gcc/testsuite/
* gnat.dg/varsize4.adb (Func): Initialize Byte_Read parameter.

Fix internal error on function returning dynamically-sized type

This is a tree sharing issue for the internal return type synthesized for
a function returning a dynamically-sized type and taking an Out or In/Out
parameter passed by copy.

gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Also create a
TYPE_DECL for the return type built for the CI/CO mechanism.

gcc/testsuite/
* gnat.dg/varsize4.ads, gnat.dg/varsize4.adb: New test.
* gnat.dg/varsize4_pkg.ads: New helper.

LoongArch: Remove redundant barrier instructions before LL-SC loops

This is isomorphic to the LLVM changes [1-2].

On LoongArch, the LL and SC instructions has memory barrier semantics:

- LL: <memory-barrier> + <load-exclusive>
- SC: <store-conditional> + <memory-barrier>

But the compare and swap operation is allowed to fail, and if it fails
the SC instruction is not executed, thus the guarantee of acquiring
semantics cannot be ensured. Therefore, an acquire barrier needs to be
generated when failure_memorder includes an acquire operation.

On CPUs implementing LoongArch v1.10 or later, "dbar 0b10100" is an
acquire barrier; on CPUs implementing LoongArch v1.00, it is a full
barrier. So it's always enough for acquire semantics. OTOH if an
acquire semantic is not needed, we still needs the "dbar 0x700" as the
load-load barrier like all LL-SC loops.

[1]:https://github.com/llvm/llvm-project/pull/67391
[2]:https://github.com/llvm/llvm-project/pull/69339

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_memmodel_needs_release_fence): Remove.
(loongarch_cas_failure_memorder_needs_acquire): New static
function.
(loongarch_print_operand): Redefine 'G' for the barrier on CAS
failure.
* config/loongarch/sync.md (atomic_cas_value_strong<mode>):
Remove the redundant barrier before the LL instruction, and
emit an acquire barrier on failure if needed by
failure_memorder.
(atomic_cas_value_cmp_and_7_<mode>): Likewise.
(atomic_cas_value_add_7_<mode>): Remove the unnecessary barrier
before the LL instruction.
(atomic_cas_value_sub_7_<mode>): Likewise.
(atomic_cas_value_and_7_<mode>): Likewise.
(atomic_cas_value_xor_7_<mode>): Likewise.
(atomic_cas_value_or_7_<mode>): Likewise.
(atomic_cas_value_nand_7_<mode>): Likewise.
(atomic_cas_value_exchange_7_<mode>): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/cas-acquire.c: New test.

(cherry picked from commit 4d86dc51e34d2a5695b617afeb56e3414836a79a)

Daily bump.

libstdc++: Fix std::deque::operator[] Xmethod [PR112491]

The Xmethod for std::deque::operator[] has the same bug that I recently
fixed for the std::deque::size() Xmethod. The first node might have
unused capacity at the start, which needs to be accounted for when
indexing into the deque.

libstdc++-v3/ChangeLog:

PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.index):
Correctly handle unused capacity at the start of the first node.
* testsuite/libstdc++-xmethods/deque.cc: Check index operator
when elements have been removed from the front.

(cherry picked from commit 452476db0c705caeac8712d560fc16ced0ca5226)

libstdc++: std::stacktrace tweaks

Fix a typo in a string literal and make the new hash.cc test gracefully
handle missing stacktrace data (see PR 112541).

libstdc++-v3/ChangeLog:

* include/std/stacktrace (basic_stacktrace::at): Fix class name
in exception message.
* testsuite/19_diagnostics/stacktrace/hash.cc: Do not fail if
current() returns a non-empty stacktrace.

(cherry picked from commit cbd0fe22a5ced9751d2450dc4fd6fe3525c2fc02)

rs6000: Consider inline asm as safe if no assembler complains [PR111828]

As discussed in PR111828, rs6000_update_ipa_fn_target_info
is much conservative, currently for any non-empty inline
asm, without any parsing, it would take inline asm could
have HTM insns. It means for one function attributed with
power8 having inline asm, even if it has no HTM insns, we
don't make a function attributed with power10 inline it.

Peter pointed out an inline asm parser can be a slippery
slope, and noticed that the current gnu assembler still
allows HTM insns even with power10 machine type, so he
suggested that we can aggressively ignore the handling on
inline asm, this patch goes for this suggestion.

Considering that there are a few assembler alternatives
and assembler can update its behaviors (complaining HTM
insns at power10 and later cpus sounds reasonable from a
certain point of view), this patch also checks assembler
complains on HTM insns at power10 or not. For a case that
a caller attributed power10 calls a callee attributed
power8 having inline asm with HTM insn, without inlining
at least the compilation succeeds, but if assembler
complains HTM insns at power10, after inlining the
compilation would fail.

The two associated test cases are fine without and with
this patch (effective target takes effect or not).

PR target/111828

gcc/ChangeLog:

* config.in: Regenerate.
* config/rs6000/rs6000.cc (rs6000_update_ipa_fn_target_info): Guard
inline asm handling under !HAVE_AS_POWER10_HTM.
* configure: Regenerate.
* configure.ac: Detect assembler support for HTM insns at power10.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_powerpc_as_p10_htm): New proc.
* g++.target/powerpc/pr111828-1.C: New test.
* g++.target/powerpc/pr111828-2.C: New test.

(cherry picked from commit b2075291af8810794c7184fd125b991c2341cb1e)

Daily bump.

libstdc++: Fix std::hash<std::stacktrace> [PR112348]

libstdc++-v3/ChangeLog:

PR libstdc++/112348
* include/std/stacktrace (hash<basic_stacktrace<Alloc>>): Fix
type of hash function for entries.
* testsuite/19_diagnostics/stacktrace/hash.cc: New test.

(cherry picked from commit 6f2fc42d9e52e8322e718e0154cd235d00906f99)

libstdc++: Fix std::deque::size() Xmethod [PR112491]

The Xmethod for std::deque::size() assumed that the first element would
be at the start of the first node. That's only true if elements are only
added at the back. If an element is inserted at the front, or removed
from the front (or anywhere before the middle) then the first node will
not be completely populated, and the Xmethod will give the wrong result.

libstdc++-v3/ChangeLog:

PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.size): Fix
calculation to use _M_start._M_cur.
* testsuite/libstdc++-xmethods/deque.cc: Check failing cases.

(cherry picked from commit 4db820928065eccbeb725406450d826186582b9f)

Daily bump.

libstdc++: Correctly call _string_types function

flake8 points out that the new call to _string_types from
StdExpAnyPrinter.__init__ is not correct -- it needs to be qualified.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py
(StdExpAnyPrinter.__init__): Qualify call to
_string_types.

(cherry picked from commit 4bf77db70e2521dc89f9d7f51c7ae6e58a94b4f9)

libstdc++: _versioned_namespace is always non-None

Some code in the pretty-printers seems to assume that the
_versioned_namespace global might be None (or the empty string).
However, doesn't occur, as the variable is never reassigned.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Assume that
_versioned_namespace is non-None.
* python/libstdcxx/v6/xmethods.py (is_specialization_of):
Assume that _versioned_namespace is non-None.

(cherry picked from commit d342c9de6a1534cbce324bcc3c7c0898ff74d386)

libstdc++: Use Python "not in" operator

flake8 warns about code like

not something in "whatever"

Ordinarily in Python this should be written as:

something not in "whatever"

This patch makes this change.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (Printer.add_version)
(add_one_template_type_printer)
(FilteringTypePrinter.add_one_type_printer): Use Python
"not in" operator.

(cherry picked from commit 202810947ca793b17653658e584008fb151e0937)

libstdc++: Define _versioned_namespace in xmethods.py

flake8 pointed out that is_specialization_of in xmethods.py looks at a
global that wasn't added to the file. This patch correct the
oversight.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/xmethods.py (_versioned_namespace):
Define.

(cherry picked from commit 83ec6e80ff0d56342fc066c2ef649036c0983529)

libstdc++: Refactor Python Xmethods to use is_specialization_of

This copies the is_specialization_of function from printers.py (with
slight modification for versioned namespace handling) and reuses it in
xmethods.py to replace repetitive re.match calls in every class.

This fixes the problem that the regular expressions used \d without
escaping the backslash properly.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/xmethods.py (is_specialization_of): Define
new function.
(ArrayMethodsMatcher, DequeMethodsMatcher)
(ForwardListMethodsMatcher, ListMethodsMatcher)
(VectorMethodsMatcher, AssociativeContainerMethodsMatcher)
(UniquePtrGetWorker, UniquePtrMethodsMatcher)
(SharedPtrSubscriptWorker, SharedPtrMethodsMatcher): Use
is_specialization_of instead of re.match.

(cherry picked from commit 17d3477fa89466604bee5af2a2caf8de5441aeb5)

libstdc++: Reformat Python code

Some of these changes were suggested by autopep8's --aggressive
option, others are for readability.

Break long lines by splitting strings across multiple lines, or
introducing local variables to hold results.

Use raw strings for regular expressions, so that backslashes don't need
to be escaped.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Break long lines. Use raw
strings for regular expressions. Add whitespace around
operators.
(is_member_of_namespace): Use isinstance to check type.
(is_specialization_of): Likewise. Adjust template_name
for versioned namespace instead of duplicating the re.match
call.
(StdExpAnyPrinter._string_types): New static method.
(StdExpAnyPrinter.to_string): Use _string_types.

(cherry picked from commit 6b5c3f9b8139d9eee358b354b35da0b757a0270d)

libstdc++: Format Python docstrings according to PEP 357

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Format docstrings according
to PEP 257.
* python/libstdcxx/v6/xmethods.py: Likewise.

(cherry picked from commit 0ef4cc8225f8669463bca501f36b6951967a692a)

libstdc++: Format Python code according to PEP8

These files were filtered through autopep8 to reformat them more
conventionally.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Reformat.
* python/libstdcxx/v6/xmethods.py: Likewise.

(cherry picked from commit e08559271b2d797f658579ac8610dbf5e58bcfd8)

Fix wrong code due to vec_merge + pcmp to blendvb splitter.

gcc/ChangeLog:

PR target/112443
* config/i386/sse.md (*avx2_pcmp<mode>3_4): Fix swap condition
from LT to GT since there's not in the pattern.
(*avx2_pcmp<mode>3_5): Ditto.

gcc/testsuite/ChangeLog:

* g++.target/i386/pr112443.C: New test.

(cherry picked from commit 9a0cc04b9c9b02426762892b88efc5c44ba546bd)

Daily bump.

libphobos: Fix regression d21 loops in getCpuInfo0B in Solaris/x86 kernel zone

This function assumes that cpuid would return "invalid domain" when a
sub-leaf index greater than what's supported is requested. This turned
out not to always be the case when running on some virtual machines.

As the loop only does anything for levels 0 and 1, make that a hard
limit for number of times the loop is ran.

PR d/112408

libphobos/ChangeLog:

* libdruntime/core/cpuid.d (getCpuInfo0B): Limit number of times loop
runs.

(cherry picked from commit 0b25c1295d4e84af681f4b1f4af2ad37cd270da3)

Daily bump.

libstdc++: use -D_GNU_SOURCE when building libbacktrace

PR libbacktrace/111315
PR libbacktrace/112263
* acinclude.m4: Set -D_GNU_SOURCE in BACKTRACE_CPPFLAGS and when
grepping link.h for dl_iterate_phdr.
* configure: Regenerate.

hppa: Fix typo in PA 2.0 trampoline template

2023-11-06 John David Anglin <danglin@gcc.gnu.org>

* config/pa/pa.cc (pa_asm_trampoline_template): Fix typo.

Daily bump.

d: Fix ICE: verify_gimple_failed (conversion of register to a different size in 'view_convert_expr')

Static arrays in D are passed around by value, rather than decaying to a
pointer. On x86_64 __builtin_va_list is an exception to this rule, but
semantically it's still treated as a static array.

This makes certain assignment operations fail due a mismatch in types.
As all examples in the test program are rejected by C/C++ front-ends,
these are now errors in D too to be consistent.

PR d/110712

gcc/d/ChangeLog:

* d-codegen.cc (d_build_call): Update call to convert_for_argument.
* d-convert.cc (is_valist_parameter_type): New function.
(check_valist_conversion): New function.
(convert_for_assignment): Update signature. Add check whether
assigning va_list is permissible.
(convert_for_argument): Likewise.
* d-tree.h (convert_for_assignment): Update signature.
(convert_for_argument): Likewise.
* expr.cc (ExprVisitor::visit (AssignExp *)): Update call to
convert_for_assignment.

gcc/testsuite/ChangeLog:

* gdc.dg/pr110712.d: New test.

(cherry picked from commit ea8ffdcadb388b531adf4772287e7987a82a84b7)

Daily bump.

d: Fix ICE: in verify_gimple_in_seq on powerpc-darwin9 [PR112270]

This ICE was seen during stage2 on powerpc-darwin9 only. There were
still some uses of GCC's boolean_type_node in the D front-end, which
caused a type mismatch to trigger as D bool size is fixed to 1 byte on
all targets.

So two new nodes have been introduced - d_bool_false_node and
d_bool_true_node - which have replaced all remaining uses of
boolean_false_node and boolean_true_node respectively.

PR d/112270

gcc/d/ChangeLog:

* d-builtins.cc (d_build_d_type_nodes): Initialize d_bool_false_node,
d_bool_true_node.
* d-codegen.cc (build_array_struct_comparison): Use d_bool_false_node
instead of boolean_false_node.
* d-convert.cc (d_truthvalue_conversion): Use d_bool_false_node and
d_bool_true_node instead of boolean_false_node and boolean_true_node.
* d-tree.h (enum d_tree_index): Add DTI_BOOL_FALSE and DTI_BOOL_TRUE.
(d_bool_false_node): New macro.
(d_bool_true_node): New macro.
* modules.cc (build_dso_cdtor_fn): Use d_bool_false_node and
d_bool_true_node instead of boolean_false_node and boolean_true_node.
(register_moduleinfo): Use d_bool_type instead of boolean_type_node.

gcc/testsuite/ChangeLog:

* gdc.dg/pr112270.d: New test.

(cherry picked from commit 10f1489dcb3bd9adccc88898bc12f53398fa3583)

Daily bump.

LoongArch: Define macro CLEAR_INSN_CACHE.

LoongArch's microstructure ensures cache consistency by hardware.
Due to out-of-order execution, "ibar" is required to ensure the visibility of the
store (invalidated icache) executed by this CPU before "ibar" (to the instance).
"ibar" will not invalidate the icache, so the start and end parameters are not Affect
"ibar" performance.

gcc/ChangeLog:

* config/loongarch/loongarch.h (CLEAR_INSN_CACHE): New definition.

(cherry picked from commit 5697ed0327f23d2e2ec4f7beec3b3d02f463173c)

LoongArch: Implement __builtin_thread_pointer for TLS.

gcc/ChangeLog:

* config/loongarch/loongarch.md (get_thread_pointer<mode>):Adds the
instruction template corresponding to the __builtin_thread_pointer
function.
* doc/extend.texi:Add the __builtin_thread_pointer function support
description to the documentation.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/builtin_thread_pointer.c: New test.

(cherry picked from commit 1b30ef7cea773e0af527dbf821e0be42b6a264f8)

Disparage slightly for the alternative which move DFmode between SSE_REGS and GENERAL_REGS.

For testcase

void __cond_swap(double* __x, double* __y) {
  bool __r = (*__x < *__y);
  auto __tmp = __r ? *__x : *__y;
  *__y = __r ? *__y : *__x;
  *__x = __tmp;
}

GCC-14 with -O2 and -march=x86-64 options generates the following code:

__cond_swap(double*, double*):
        movsd   xmm1, QWORD PTR [rdi]
        movsd   xmm0, QWORD PTR [rsi]
        comisd  xmm0, xmm1
        jbe     .L2
        movq    rax, xmm1
        movapd  xmm1, xmm0
        movq    xmm0, rax
.L2:
        movsd   QWORD PTR [rsi], xmm1
        movsd   QWORD PTR [rdi], xmm0
        ret

rax is used to save and restore DFmode value. In RA both GENERAL_REGS
and SSE_REGS cost zero since we didn't disparage the
alternative in movdf_internal pattern, according to register
allocation order, GENERAL_REGS is allocated. The patch add ? for
alternative (r,v) and (v,r) just like we did for movsf/hf/bf_internal
pattern, after that we get optimal RA.

__cond_swap:
.LFB0:
.cfi_startproc
movsd (%rdi), %xmm1
movsd (%rsi), %xmm0
comisd %xmm1, %xmm0
jbe .L2
movapd %xmm1, %xmm2
movapd %xmm0, %xmm1
movapd %xmm2, %xmm0
.L2:
movsd %xmm1, (%rsi)
movsd %xmm0, (%rdi)
ret

gcc/ChangeLog:

PR target/110170
* config/i386/i386.md (movdf_internal): Disparage slightly for
2 alternatives (r,v) and (v,r) by adding constraint modifier
'?'.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110170-3.c: New test.

(cherry picked from commit 37a231cc7594d12ba0822077018aad751a6fb94e)

Daily bump.

libstdc++: Build libstdc++_libbacktrace.a as PIC [PR111936]

In order for std::stacktrace to be used in a shared library, the
libbacktrace symbols need to be built with -fPIC. Add the libtool
-prefer-pic flag to the commands in src/libbacktrace/Makefile so that
the archive contains PIC objects.

libstdc++-v3/ChangeLog:

PR libstdc++/111936
* src/libbacktrace/Makefile.am: Add -prefer-pic to libtool
compile commands.
* src/libbacktrace/Makefile.in: Regenerate.

(cherry picked from commit f32c1e1e96ffef6512ce025942b51f3967a3e7f2)

Daily bump.

libstdc++: [_Hashtable] Do not reuse untrusted cached hash code

On merge, reuse a merged node's possibly cached hash code only if we are on the
same type of hash and this hash is stateless.

Usage of function pointers or std::function as hash functor will prevent reusing
cached hash code.

libstdc++-v3/ChangeLog

* include/bits/hashtable_policy.h
(_Hash_code_base::_M_hash_code(const _Hash&, const _Hash_node_value<>&)): Remove.
(_Hash_code_base::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): Remove.
* include/bits/hashtable.h
(_M_src_hash_code<_H2>(const _H2&, const key_type&, const __node_value_type&)): New.
(_M_merge_unique<>, _M_merge_multi<>): Use latter.
* testsuite/23_containers/unordered_map/modifiers/merge.cc
(test04, test05, test06): New test cases.

SH: Fix PR 111001

gcc/ChangeLog:

PR target/111001
* config/sh/sh_treg_combine.cc (sh_treg_combine::record_set_of_reg):
Skip over nop move insns.

rs6000: Make 32 bit stack_protect support prefixed insn [PR111367]

As PR111367 shows, with prefixed insn supported, some of
checkings consider it's able to leverage prefixed insn
for stack protect related load/store, but since we don't
actually change the emitted assembly for 32 bit, it can
cause the assembler error as exposed.

Mike's commit r10-4547-gce6a6c007e5a98 has already handled
the 64 bit case (DImode), this patch is to treat the 32
bit case (SImode) by making use of mode iterator P and
ptrload attribute iterator, also fixes the constraints
to match the emitted operand formats.

PR target/111367

gcc/ChangeLog:

* config/rs6000/rs6000.md (stack_protect_setsi): Support prefixed
instruction emission and incorporate to stack_protect_set<mode>.
(stack_protect_setdi): Rename to ...
(stack_protect_set<mode>): ... this, adjust constraint.
(stack_protect_testsi): Support prefixed instruction emission and
incorporate to stack_protect_test<mode>.
(stack_protect_testdi): Rename to ...
(stack_protect_test<mode>): ... this, adjust constraint.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr111367.C: New test.

(cherry picked from commit 530babc2058be5f2b06b1541384e7b730c368b93)

Daily bump.

Fortran: out of bounds access with nested implied-do IO [PR111837]

gcc/fortran/ChangeLog:

PR fortran/111837
* frontend-passes.cc (traverse_io_block): Dependency check of loop
nest shall be triangular, not banded.

gcc/testsuite/ChangeLog:

PR fortran/111837
* gfortran.dg/implied_do_io_8.f90: New test.

(cherry picked from commit 5ac63ec5da2e93226457bea4dbb3a4f78d5d82c2)

Daily bump.

SH: Fix PR 101177

Fix accidentally inverted comparison.

gcc/ChangeLog:

PR target/101177
* config/sh/sh.md (unnamed split pattern): Fix comparison of
find_regno_note result.

Daily bump.

lra: Avoid unfolded plus-0

While backporting another patch to an earlier release, I hit a
situation in which lra_eliminate_regs_1 would eliminate an address to:

    (plus (reg:P R) (const_int 0))

This address compared not-equal to plain:

    (reg:P R)

which caused an ICE in a later peephole2.  (The ICE showed up in
gfortran.fortran-torture/compile/pr80464.f90 on the branch but seems
to be latent on trunk.)

These unfolded PLUSes shouldn't occur in the insn stream, and later code
in the same function tried to avoid them.

gcc/
PR target/111528
* lra-eliminations.cc (lra_eliminate_regs_1): Use simplify_gen_binary
rather than gen_rtx_PLUS.

(cherry picked from commit 10d59b802a7db9ae908291fb20627c1493cfa26c)

Daily bump.

rs6000: Use default target option node for callee by default [PR111380]

As PR111380 (and the discussion in related PRs) shows, for
now how function rs6000_can_inline_p treats the callee
without any target option node is wrong.  It considers it's
always safe to inline this kind of callee, but actually its
target flags are from the command line options
(target_option_default_node), it's possible that the flags
of callee don't satisfy the condition of inlining, but it
is still inlined, then result in unexpected consequence.

As the associated test case pr111380-1.c shows, the caller
main is attributed with power8, but the callee foo is
compiled with power9 from command line, it's unexpected to
make main inline foo since foo can contain something that
requires power9 capability.  Without this patch, for lto
(with -flto) we can get error message (as it forces the
callee to have a target option node), but for non-lto, it's
inlined unexpectedly.

This patch is to make callee adopt target_option_default_node
when it doesn't have a target option node, it can avoid wrong
inlining decision and fix the inconsistency between LTO and
non-LTO.  It also aligns with what the other ports do.

PR target/111380

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_can_inline_p): Adopt
target_option_default_node when the callee has no option
attributes, also simplify the existing code accordingly.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr111380-1.c: New test.
* gcc.target/powerpc/pr111380-2.c: New test.

(cherry picked from commit 266dfed68b881702e9660889f63408054b7fa9c0)

rs6000: Skip empty inline asm in rs6000_update_ipa_fn_target_info [PR111366]

PR111366 exposes one thing that can be improved in function
rs6000_update_ipa_fn_target_info is to skip the given empty
inline asm string, since it's impossible to adopt any
hardware features (so far HTM).

Since this rs6000_update_ipa_fn_target_info related approach
exists in GCC12 and later, the affected project highway has
updated its target pragma with ",htm", see the link:
https://github.com/google/highway/commit/15e63d61eb535f478bc
I'd not bother to consider an inline asm parser for now but
will file a separated PR for further enhancement.

PR target/111366

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_update_ipa_fn_target_info): Skip
empty inline asm.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr111366.C: New test.

(cherry picked from commit a65b38e361320e0aa45adbc969c704385ab1f45b)

Daily bump.