git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

aarch64: Fix memtag builtins vs GC [PR108174]

The memtag builtins were being GC'ed away so we end up
with a crash sometimes (maybe even wrong code).
This fixes that issue by adding GTY on the variable/struct
aarch64_memtag_builtin_data.

Committed as obvious after a build/test for aarch64-linux-gnu.

PR target/108174

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc (aarch64_memtag_builtin_data): Make
static and mark with GTY.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/memtag_4.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 5ec7740496a6908b32cd058c0520a2bd5a689bb5)

Daily bump.

Fix internal error on non-byte-aligned reference in GIMPLE DSE

This is a regression present on the mainline, 13 and 12 branches.  For the
attached Ada case, it's a tree checking failure on the mainline at -O:

+===========================GNAT BUG DETECTED==============================+
| 14.0.1 20240226 (experimental) [master r14-9171-g4972f97a265]  GCC error:|
| tree check: expected tree that contains 'decl common' structure,         |
| have 'component_ref' in tree_could_trap_p, at tree-eh.cc:2733        |
| Error detected around /home/eric/cvs/gcc/gcc/testsuite/gnat.dg/opt104.adb:

Time is a 10-byte record and Packed_Rec.T is placed at bit-offset 65 because
of the packing. so tree-ssa-dse.cc:setup_live_bytes_from_ref has computed a
const_size of 88 from ref->offset of 65 and ref->max_size of 80.

Then in tree-ssa-dse.cc:compute_trims:

411       int last_live = bitmap_last_set_bit (live);
(gdb) next
412       if (ref->size.is_constant (&const_size))
(gdb)
414           int last_orig = (const_size / BITS_PER_UNIT) - 1;
(gdb)
418           *trim_tail = last_orig - last_live;

(gdb) call debug_bitmap (live)
n_bits = 256, set = {0 1 2 3 4 5 6 7 8 9 10 }
(gdb) p last_live
$33 = 10
(gdb) p const_size
$34 = 80
(gdb) p last_orig
$35 = 9
(gdb) p *trim_tail
$36 = -1

In other words, compute_trims is overlooking the alignment adjustments that
setup_live_bytes_from_ref applied earlier.  Moveover it reads:

  /* We use sbitmaps biased such that ref->offset is bit zero and the bitmap
     extends through ref->size.  So we know that in the original bitmap
     bits 0..ref->size were true.  We don't actually need the bitmap, just
     the REF to compute the trims.  */

but setup_live_bytes_from_ref used ref->max_size instead of ref->size.

It appears that all the callers of compute_trims assume that ref->offset is
byte aligned and that the trimmed bytes are relative to ref->size, so the
patch simply adds an early return if either condition is not fulfilled.

gcc/
* tree-ssa-dse.cc (compute_trims): Fix description.  Return early
if either ref->offset is not byte aligned or ref->size is not known
to be equal to ref->max_size.
(maybe_trim_complex_store): Fix description.
(maybe_trim_constructor_store): Likewise.
(maybe_trim_partially_dead_store): Likewise.

gcc/testsuite/
* gnat.dg/opt104.ads, gnat.dg/opt104.adb: New test.

x86: Properly implement AMX-TILE load/store intrinsics

ldtilecfg and sttilecfg take a 512-byte memory block.  With
_tile_loadconfig implemented as

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

GCC sees:

(parallel [
  (asm_operands/v ("ldtilecfg %X0") ("") 0
   [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
                         (const_int -64 [0xffffffffffffffc0])) [1 MEM[(const void * *)&tile_data]+0 S8 A128])]
   [(asm_input:DI ("m"))]
   (clobber (reg:CC 17 flags))])

and the memory operand size is 1 byte.  As the result, the rest of 511
bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.

gcc/ChangeLog:

PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.

gcc/testsuite/ChangeLog:

PR target/114098
* gcc.target/i386/amxtile-4.c: New test.

(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)

Daily bump.

Finalization of object allocated by anonymous access designating local type

The finalization of objects dynamically allocated through an anonymous
access type is deferred to the enclosing library unit in the current
implementation and a warning is given on each of them.

However this cannot be done if the designated type is local, because this
would generate dangling references to the local finalization routine, so
the finalization needs to be dropped in this case and the warning adjusted.

gcc/ada/
PR ada/113893
* exp_ch7.adb (Build_Anonymous_Master): Do not build the master
for a local designated type.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Force Needs_Fin
to false if no finalization master is attached to an access type
and assert that it is anonymous in this case.
* sem_res.adb (Resolve_Allocator): Mention that the object might
not be finalized at all in the warning given when the type is an
anonymous access-to-controlled type.

gcc/testsuite/
* gnat.dg/access10.adb: New test.

Daily bump.

arm: fix ICE with vectorized reciprocal division [PR108120]

The expand pattern for reciprocal division was enabled for all math
optimization modes, but the patterns it was generating were not
enabled unless -funsafe-math-optimizations were enabled, this leads to
an ICE when the pattern we generate cannot be recognized.

Fixed by only enabling vector division when doing unsafe math.

gcc:

PR target/108120
* config/arm/neon.md (div<VCVTF:mode>3): Rename from div<mode>3.
Gate with ARM_HAVE_NEON_<MODE>_ARITH.

gcc/testsuite:
PR target/108120
* gcc.target/arm/neon-recip-div-1.c: New file.

(cherry picked from commit 016c4eed368b80a97101f6156ed99e4c5474fbb7)

RISC-V: Fix riscv/arch-19.c with different ISA spec version

In newer ISA spec, F will implied zicsr, add that into -march option to
prevent different test result on different default -misa-spec version.

gcc/testsuite/

* gcc.target/riscv/arch-19.c: Add -misa-spec.

(cherry picked from commit 9fde76a3be8e1717d9d38492c40675e742611e45)

LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

To improve Binutils compatibility we've had to backport relaxation
support. But if a user just updates to GCC 13.3 and sticks with
Binutils 2.41, there is no reason to use -mno-explicit-relocs as the
default because we are turning off relaxation for Binutils 2.41 (it
lacks conditional branch relaxation support) anyway.

So like GCC 14, make the default of -m[no-]explicit-relocs depend on
-m[no-]relax instead of HAVE_AS_MRELAX_OPTION. Also update the doc to
reflect the behavior change.

gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in
(TARGET_EXPLICIT_RELOCS): Init to M_OPTION_NOT_SEEN.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.cc
(loongarch_option_override_internal): Set the default of
TARGET_EXPLICIT_RELOCS to HAVE_AS_EXPLICIT_RELOCS
&& !loongarch_mrelax.
* doc/invoke.texi (-m[no-]explicit-relocs): Update for
LoongArch.

Daily bump.

testsuite: fix Wmismatched-new-delete-8.C with -m32

This fixes
error: 'operator new' takes type 'size_t' ('unsigned int') as first parameter [-fpermissive]

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wmismatched-new-delete-8.C: Use __SIZE_TYPE__.

(cherry picked from commit d34d7c74d51d365a3a4ddcd4383fc7c9f29020a1)

warn-access: Fix handling of unnamed types [PR109804]

This looks like an oversight of handling DEMANGLE_COMPONENT_UNNAMED_TYPE.
DEMANGLE_COMPONENT_UNNAMED_TYPE only has the u.s_number.number set while
the code expected newc.u.s_binary.left would be valid.
So this treats DEMANGLE_COMPONENT_UNNAMED_TYPE like we treat function paramaters
(DEMANGLE_COMPONENT_FUNCTION_PARAM) and template paramaters (DEMANGLE_COMPONENT_TEMPLATE_PARAM).

Note the code in the demangler does this when it sets DEMANGLE_COMPONENT_UNNAMED_TYPE:
ret->type = DEMANGLE_COMPONENT_UNNAMED_TYPE;
ret->u.s_number.number = num;

Committed as obvious after bootstrap/test on x86_64-linux-gnu

PR tree-optimization/109804

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (new_delete_mismatch_p): Handle
DEMANGLE_COMPONENT_UNNAMED_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wmismatched-new-delete-8.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 1076ffda6ce5e6d5fc9577deaf8233e549e5787a)

testsuite: Remove test that should not have been backported [PR105608]

This test (backported as part of r13-8257, from r14-8465) was not meant to
be backported, since it fails on some platforms without other GCC 14 patches
that will not be backported. Remove it from the 13 branch.

gcc/testsuite/ChangeLog:

PR preprocessor/105608
* g++.dg/pch/line-map-3.C: Removed.
* g++.dg/pch/line-map-3.Hs: Removed.

LoongArch: Define HAVE_AS_TLS to 0 if it's undefined [PR112299]

Now loongarch.md uses HAVE_AS_TLS, we need this to fix the failure
building a cross compiler if the cross assembler is not installed yet.

gcc/ChangeLog:

PR target/112299
* config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0
if not defined yet.

(cherry picked from commit 6bf2cebe2bf49919c78814cb447d3aa6e3550d89)

LoongArch: Disable relaxation if the assembler don't support conditional branch relaxation [PR112330]

As the commit message of r14-4674 has indicated, if the assembler does
not support conditional branch relaxation, a relocation overflow may
happen on conditional branches when relaxation is enabled because the
number of NOP instructions inserted by the assembler will be more than
the number estimated by GCC.

To work around this issue, disable relaxation by default if the
assembler is detected incapable to perform conditional branch relaxation
at GCC build time.  We also need to pass -mno-relax to the assembler to
really disable relaxation.  But, if the assembler does not support
-mrelax option at all, we should not pass -mno-relax to the assembler or
it will immediately error out.  Also handle this with the build time
assembler capability probing, and add a pair of options
-m[no-]pass-mrelax-to-as to allow using a different assembler from the
build-time one.

With this change, if GCC is built with GAS 2.41, relaxation will be
disabled by default.  So the default value of -mexplicit-relocs= is also
changed to 'always' if -mno-relax is specified or implied by the
build-time default, because using assembler macros for symbol addresses
produces no benefit when relaxation is disabled.

gcc/ChangeLog:

PR target/112330
* config/loongarch/genopts/loongarch.opt.in: Add
-m[no]-pass-relax-to-as.  Change the default of -m[no]-relax to
account conditional branch relaxation support status.
* config/loongarch/loongarch.opt: Regenerate.
* configure.ac (gcc_cv_as_loongarch_cond_branch_relax): Check if
the assembler supports conditional branch relaxation.
* configure: Regenerate.
* config.in: Regenerate.  Note that there are some unrelated
changes introduced by r14-5424 (which does not contain a
config.in regeneration).
* config/loongarch/loongarch-opts.h
(HAVE_AS_COND_BRANCH_RELAXATION): Define to 0 if not defined.
* config/loongarch/loongarch.h (ASM_MRELAX_DEFAULT): Define.
(ASM_MRELAX_SPEC): Define.
(ASM_SPEC): Use ASM_MRELAX_SPEC instead of "%{mno-relax}".
* doc/invoke.texi: Document -m[no-]relax and
-m[no-]pass-mrelax-to-as for LoongArch.

(cherry picked from commit fe23a2ff1f5072559552be0e41ab55bf72f5c79f)

LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default.

gcc/ChangeLog:

* config.in: Regenerate.
* config/loongarch/genopts/loongarch.opt.in: Add compilation option
mrelax. And set the initial value of explicit-relocs according to the
detection status.
* config/loongarch/gnu-user.h: When compiling with -mno-relax, pass the
--no-relax option to the linker.
* config/loongarch/loongarch-opts.h (HAVE_AS_MRELAX_OPTION): Define macro.
* config/loongarch/loongarch.opt: Regenerate.
* configure: Regenerate.
* configure.ac: Add detection of support for binutils relax function.

(cherry picked from commit 9bab65a77049edcc7afc59532173206ee816e726)

LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP.

There are two reasons for removing this macro definition:
1. The default in the assembler is to use the nop instruction for filling.
2. For assembly directives: .align [abs-expr[, abs-expr[, abs-expr]]]
   The third expression it is the maximum number of bytes that should be
   skipped by this alignment directive.
   Therefore, it will affect the display of the specified alignment rules
   and affect the operating efficiency.

This modification relies on binutils commit 1fb3cdd87ec61715a5684925fb6d6a6cf53bb97c.
(Since the assembler will add nop based on the .align information when doing relax,
it will cause the conditional branch to go out of bounds during the assembly process.
This submission of binutils solves this problem.)

gcc/ChangeLog:

* config/loongarch/loongarch.h (ASM_OUTPUT_ALIGN_WITH_NOP):
Delete.

Co-authored-by: Chenghua Xu <xuchenghua@loongson.cn>
(cherry picked from commit b20c7ee066cb7d952fa193972e8bc6362c6e4063)

Daily bump.

AVR: Improve documentation for -mmcu=.

gcc/
* doc/invoke.texi (AVR Options) <-mmcu>: Remove "Atmel".
Note on complete device support.

(cherry picked from commit e63ae9085aca9d306a2f16445b473289b9186e10)

AVR: Document option -mskip-bug.

gcc/
* doc/invoke.texi (AVR Options) [-mskip-bug]: Add documentation.

(cherry picked from commit 42503cc257fb841904917c9529f24c76511c4e0d)

AVR: invoke.texi: Put internal options in their own @subsubsection.

gcc/
* doc/invoke.texi (AVR Options): Move -mrmw, -mn-flash, -mshort-calls
and -msp8 to...
(AVR Internal Options): ...this new @subsubsection.

(cherry picked from commit 31461d2b651e1db6114cf7ab64ac1508410cf2f2)

Daily bump.

Update gcc zh_CN.po

* zh_CN.po: Update.

veclower: improve selection of vector mode when lowering [PR 112787]

This patch addresses the issue reported in PR target/112787 by improving the
compute type selection. We do this by not considering types with more elements
than the type we are lowering since we'd reject such types anyway.

gcc/ChangeLog:

PR target/112787
* tree-vect-generic.cc (type_for_widest_vector_mode): Change function to
use original vector type and check widest vector mode has at most the
same number of elements.
(get_compute_type): Pass original vector type rather than the element
type to type_for_widest_vector_mode and remove now obsolete check for
the number of elements.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr112787.c: New test.

(cherry picked from commit a3ff76278efe006dc0b50249c8e5baf565bff56b)

Daily bump.

libgcc: fix Win32 CV abnormal spurious wakeups in timed wait [PR113850]

Fix a typo in __gthr_win32_abs_to_rel_time that caused it to return a
relative time in seconds instead of milliseconds. As a consequence,
__gthr_win32_cond_timedwait called SleepConditionVariableCS with a
1000x shorter timeout; this caused ~1000x more spurious wakeups in
CV timed waits such as std::condition_variable::wait_for or wait_until,
resulting generally in much higher CPU usage.

This can be demonstrated by this sample program:

```

int main() {
    std::condition_variable cv;
    std::mutex mx;
    bool pass = false;

    auto thread_fn = [&](bool timed) {
        int wakeups = 0;
        using sc = std::chrono::system_clock;
        auto before = sc::now();
        std::unique_lock<std::mutex> ml(mx);
        if (timed) {
            cv.wait_for(ml, std::chrono::seconds(2), [&]{
                ++wakeups;
                return pass;
            });
        } else {
            cv.wait(ml, [&]{
                ++wakeups;
                return pass;
            });
        }
        printf("pass: %d; wakeups: %d; elapsed: %d ms\n", pass, wakeups,
                int((sc::now() - before) / std::chrono::milliseconds(1)));
        pass = false;
    };

    {
        // timed wait, let expire
        std::thread t(thread_fn, true);
        t.join();
    }

    {
        // timed wait, wake up explicitly after 1 second
        std::thread t(thread_fn, true);
        std::this_thread::sleep_for(std::chrono::seconds(1));
        {
            std::unique_lock<std::mutex> ml(mx);
            pass = true;
        }
        cv.notify_all();
        t.join();
    }

    {
        // non-timed wait, wake up explicitly after 1 second
        std::thread t(thread_fn, false);
        std::this_thread::sleep_for(std::chrono::seconds(1));
        {
            std::unique_lock<std::mutex> ml(mx);
            pass = true;
        }
        cv.notify_all();
        t.join();
    }
    return 0;
}
```

On builds based on non-affected threading models (e.g. POSIX on Linux,
or winpthreads or MCF on Win32) the output is something like
```
pass: 0; wakeups: 2; elapsed: 2000 ms
pass: 1; wakeups: 2; elapsed: 991 ms
pass: 1; wakeups: 2; elapsed: 996 ms
```

while with the Win32 threading model we get
```
pass: 0; wakeups: 1418; elapsed: 2000 ms
pass: 1; wakeups: 479; elapsed: 988 ms
pass: 1; wakeups: 2; elapsed: 992 ms
```
(notice the huge number of wakeups in the timed wait cases only).

This commit fixes the conversion, adjusting the final division by
NSEC100_PER_SEC to use NSEC100_PER_MSEC instead (already defined in the
file and not used in any other place, so probably just a typo).

libgcc/ChangeLog:

PR libgcc/113850
* config/i386/gthr-win32-cond.c (__gthr_win32_abs_to_rel_time):
fix absolute timespec to relative milliseconds count
conversion (it incorrectly returned seconds instead of
milliseconds); this avoids spurious wakeups in
__gthr_win32_cond_timedwait

(cherry picked from commit 05ad8fb55a55f1e201fd781c84682a7c0bbd4d97)

c++: ICE with reinterpret_cast and switch [PR113545]

Jason, this is the patch you proposed for PR113545. It looks very safe
so I'm posting it here so that it's not forgotten.

PR c++/113545

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_switch_expr): If the condition doesn't reduce
to an INTEGER_CST, consider it non-constant.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-reinterpret3.C: New test.
* g++.dg/cpp1y/constexpr-reinterpret4.C: New test.

(cherry picked from commit 39d989022dd0eacf1a7b95b7b20621acbe717d70)

libstdc++: Fix constexpr basic_string union member [PR113294]

A call to `basic_string::clear()` in the std::string move assignment
operator leads to a constexpr error from an access of inactive union
member `_M_local_buf` in the added test (`test_move()`). Changing
`__str._M_local_buf` to `__str._M_use_local_data()` in
`operator=(basic_string&& __str)` fixes this.

PR libstdc++/113294

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string::operator=): Use
_M_use_local_data() instead of _M_local_buf on the moved-from
string.
* testsuite/21_strings/basic_string/modifiers/constexpr.cc
(test_move): New test.

Signed-off-by: Paul Keir <paul.keir@uws.ac.uk>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 065dddc6e07a917c57c7955db13b1fe77abbcabc)

libstdc++: Avoid aliasing violation in std::valarray [PR99117]

The call to __valarray_copy constructs an _Array object to refer to
this->_M_data but that means that accesses to this->_M_data are through
a restrict-qualified pointer. This leads to undefined behaviour when
copying from an _Expr object that actually aliases this->_M_data.

Replace the call to __valarray_copy with a plain loop. I think this
removes the only use of that overload of __valarray_copy, so it could
probably be removed. I haven't done that here.

libstdc++-v3/ChangeLog:

PR libstdc++/99117
* include/std/valarray (valarray::operator=(const _Expr&)):
Use loop to copy instead of __valarray_copy with _Array.
* testsuite/26_numerics/valarray/99117.cc: New test.

(cherry picked from commit b58f0e5216a3053486e7f1aa96c3f2443b14d630)

libstdc++: Update tzdata to 2024a

Import the new 2024a tzdata.zi file. The leapseconds file was also
updated to have a new expiry (no new leap seconds were added).

libstdc++-v3/ChangeLog:

* src/c++20/tzdata.zi: Import new file from 2024a release.
* src/c++20/tzdb.cc (tzdb_list::_Node::_S_read_leap_seconds)
Update expiry date for leap seconds list.

(cherry picked from commit 4d6513f80bd2349559ec7e3844cf096cdb103e9a)

libstdc++: Guard tr2::bases and tr2::direct_bases with __has_builtin

These non-standard extensions use GCC-specific built-ins. Use
__has_builtin to avoid errors when Clang compiles this header.

See https://github.com/llvm/llvm-project/issues/24289

libstdc++-v3/ChangeLog:

* include/tr2/type_traits (bases, direct_bases): Use
__has_builtin to check if required built-ins are supported.

(cherry picked from commit 5fb204aaf34b68c427f5b2bfb933fed72fe3eafb)

Daily bump.

testsuite: Require lra effective target for pr107385.c

Old reload doesn't support asm goto with output operands.
We have lra effective target (though, strangely it returns
0 just for 2 targets out of at least 16 targets with no LRA support),
so this patch uses it, similarly how it is done in other asm goto
tests with output operands.

2024-02-15 Jakub Jelinek <jakub@redhat.com>

PR middle-end/107385
* gcc.dg/pr107385.c: Require lra effective target.

(cherry picked from commit 0d5d1c75f5c68b6064640c3154ae5f4c0b464905)

testsuite: Add testcase for already fixed PR [PR107385]

This testcase has been fixed by the PR113921 fix, but unlike testcase
in there this one is not target specific.

2024-02-15 Jakub Jelinek <jakub@redhat.com>

PR middle-end/107385
* gcc.dg/pr107385.c: New test.

(cherry picked from commit 5459a9074afabf700f055fc8164f88dadb1c39b0)

expand: Fix handling of asm goto outputs vs. PHI argument adjustments [PR113921]

The Linux kernel and the following testcase distilled from it is
miscompiled, because tree-outof-ssa.cc (eliminate_phi) emits some
fixups on some of the edges (but doesn't commit edge insertions).
Later expand_asm_stmt emits further instructions on the same edge.
Now the problem is that expand_asm_stmt uses insert_insn_on_edge
to add its own fixups, but that function appends to the existing
sequence on the edge if any.  And the bug triggers when the
fixup sequence emitted by eliminate_phi uses a pseudo which the
fixup sequence emitted by expand_asm_stmt later on sets.
So, we end up with
  (set (reg A) (asm_operands ...))
and on one of the edges queued sequence
  (set (reg C) (reg B)) // added by eliminate_phi
  (set (reg B) (reg A)) // added by expand_asm_stmt
That is wrong, what we emit by expand_asm_stmt needs to be as close
to the asm_operands as possible (they aren't known until expand_asm_stmt
is called, the PHI fixup code assumes it is reg B which holds the right
value) and the PHI adjustments need to be done after it.

So, the following patch introduces a prepend_insn_to_edge function and
uses it from expand_asm_stmt, so that we queue
  (set (reg B) (reg A)) // added by expand_asm_stmt
  (set (reg C) (reg B)) // added by eliminate_phi
instead and so the value from the asm_operands output propagates correctly
to the PHI result.

2024-02-15  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/113921
* cfgrtl.h (prepend_insn_to_edge): New declaration.
* cfgrtl.cc (insert_insn_on_edge): Clarify behavior in function
comment.
(prepend_insn_to_edge): New function.
* cfgexpand.cc (expand_asm_stmt): Use prepend_insn_to_edge instead of
insert_insn_on_edge.

* gcc.target/i386/pr113921.c: New test.

(cherry picked from commit 2b4efc5db2aedb59196987300e14951d08cd7106)

AVR: target 113927 - Simple code triggers stack frame for Reduced Tiny.

The -mmcu=avrtiny cores have no ADIW and SBIW instructions.  This was
implemented by clearing all regs out of regclass ADDW_REGS so that
constraint "w" never matched.  This corrupted the subset relations of
the register classes as they appear in enum reg_class.

This patch keeps ADDW_REGS like for all other cores, i.e. it contains
R24...R31.  Instead of tests like  test_hard_reg_class (ADDW_REGS, *)
the code now uses  avr_adiw_reg_p (*).  And all insns with constraint "w"
get "isa" insn attribute value of "adiw".

Plus, a new built-in macro __AVR_HAVE_ADIW__ is provided, which is more
specific than __AVR_TINY__.

gcc/
PR target/113927
* config/avr/avr.h (AVR_HAVE_ADIW): New macro.
* config/avr/avr-protos.h (avr_adiw_reg_p): New proto.
* config/avr/avr.cc (avr_adiw_reg_p): New function.
(avr_conditional_register_usage) [AVR_TINY]: Don't clear ADDW_REGS.
Replace test_hard_reg_class (ADDW_REGS, ...) with calls to
* config/avr/avr.md: Same.
(attr "isa") <tiny, no_tiny>: Remove.
<adiw, no_adiw>: Add.
(define_insn, define_insn_and_split): When an alternative has
constraint "w", then set attribute "isa" to "adiw".
* config/avr/avr-c.cc (avr_cpu_cpp_builtins) [AVR_HAVE_ADIW]:
Built-in define __AVR_HAVE_ADIW__.
* doc/invoke.texi (AVR Options): Document it.

(cherry picked from commit 5cff288c2dae4ea709df067cf398f23e214b2e80)

Daily bump.

c++: variable partial spec redeclaration [PR113612]

If register_specialization finds a previous declaration and throws the new
one away, we shouldn't still add the new one to
DECL_TEMPLATE_SPECIALIZATIONS.

PR c++/113612

gcc/cp/ChangeLog:

* pt.cc (process_partial_specialization): Return early
on redeclaration.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/var-templ85.C: New test.

(cherry picked from commit 19ac327de421fe05916e234e3450e6e1cc5c935c)

tree-optimization/113896 - testcase for fixed PR

The SLP permute optimization rewrite fixed this.

PR tree-optimization/113896
* g++.dg/torture/pr113896.C: New testcase.

(cherry picked from commit 4a1cd5560b9b545eb848eb1d1e06d345fb606f76)

libgfortran: Adjust bytes_left and pos for access="STREAM".

During tab edits, the pos (position) and bytes_used
Variables were not being set correctly for stream I/O.
Since stream I/O does not have 'real' records, the
format buffer active length must be used instead of
the record length variable.

PR libgfortran/109358

libgfortran/ChangeLog:

* io/transfer.c (formatted_transfer_scalar_write): Adjust
bytes_used and pos variable for stream access.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr109358.f90: New test.

(cherry picked from commit 153ce7a78edbe339fa0b1cd314dea0554f59faf0)

libgfortran: EN0.0E0 and ES0.0E0 format editing.

F2018 and F2023 standards added zero width exponents. This required
additional special handing in the process of building formatted
floating point strings.

G formatting uses either F or E formatting as documented in
write_float.def comments. This logic changes the format token from FMT_G
to FMT_F or FMT_E. The new formatting requirements interfere with this
process when a FMT_G float string is being built. To avoid this, a new
component called 'pushed' is added to the fnode structure to save this
condition. The 'pushed' condition is then used to bypass portions of
the new ES,E,EN, and D formatting, falling through to the existing
default formatting which is retained.

libgfortran/ChangeLog:
PR libfortran/111022
* io/format.c (get_fnode): Update initialization of fnode.
(parse_format_list): Initialization.
* io/format.h (struct fnode): Added the new 'pushed' component.
* io/write.c (select_buffer): Whitespace.
(write_real): Whitespace.
(write_real_w0): Adjust logic for the d == 0 condition.
* io/write_float.def (determine_precision): Whitespace.
(build_float_string): Calculate width of ..E0 exponents and
adjust logic accordingly.
(build_infnan_string): Whitespace.
(CALCULATE_EXP): Whitespace.
(quadmath_snprintf): Whitespace.
(determine_en_precision): Whitespace.

gcc/testsuite/ChangeLog:
PR libfortran/111022
* gfortran.dg/fmt_error_10.f: Show D+0 exponent.
* gfortran.dg/pr96436_4.f90: Show E+0 exponent.
* gfortran.dg/pr96436_5.f90: Show E+0 exponent.
* gfortran.dg/pr111022.f90: New test.

(cherry picked from commit d436e8e70dacd9c06247bb56d0abeded8fcb4242)

Daily bump.

avr: Fix wrong array bounds warning on SFR access

The warning was raised on accessing SFRs at addresses below the default
page size, as gcc considers accessing addresses in the first page of
memory as suspicious. This doesn't apply to an embedded target like the
avr, where both flash and RAM have zero as a valid address. Zero is also
a valid address in named address spaces (__memx, flash<n> etc..).

This commit implements TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID for the avr
target and reports to gcc that zero is a valid address on all
address spaces. It also disables flag_delete_null_pointer_checks
based on the target hook, and modifies target-supports.exp to add avr
to the list of targets that always keep null pointer checks. This fixes
a bunch of DejaGNU failures that occur otherwise.

PR target/105523

gcc/ChangeLog:

* common/config/avr/avr-common.cc: Remove setting
of OPT_fdelete_null_pointer_checks.
* config/avr/avr.cc (avr_option_override): Clear
flag_delete_null_pointer_checks if zero_address_valid.
(avr_addr_space_zero_address_valid): New function.
(TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID): Provide target
hook.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_keeps_null_pointer_checks): Add
avr.
* gcc.target/avr/pr105523.c: New test.

(cherry picked from commit 58e1bc2b1c8420773b16452d47932a6ca0d003fb)

Daily bump.

libgfortran: avoid duplicate libraries in spec

The linking of libgcc is already present in %(liborig), so the current
situation duplicates libraries. This was not an issue until macOS's new
linker started giving warnings for such cases.

libgfortran/ChangeLog:

PR libfortran/110651
* libgfortran.spec.in: Remove duplicate libraries.

Daily bump.

call maybe_return_this in build_clone

__dt_base doesn't get its body from a maybe_return_this caller, it's
rather cloned with the full body within build_clone, and then it's
left alone, without going through finish_function_body or
build_delete_destructor_body, that call maybe_return_this.

Now, this is correct as far as the generated code is concerned, since
the cloned body of a cdtor that returns this is also a cdtor body that
returns this.  The problem is that the decl for THIS is also cloned,
and it doesn't get the warning suppression introduced by
maybe_return_this, so Wuse-after-free3.C fails with an excess warning
at the closing brace of the dtor body.

I've split out the warning suppression from maybe_return_this, and
arranged to call that bit from the relevant build_clone case.
Unfortunately, because the warning is silenced for all uses of the
THIS decl, rather than only for the ABI-mandated return stmt, this
also silences the very warning that the testcase checks for.

I'm not revamping the warning suppression approach to overcome this,
so I'm xfailing the expected warning on ARM EABI, hoping that's the
only target with cdtor_return_this, and leaving it at that.

for  gcc/cp/ChangeLog

* decl.cc (maybe_prepare_return_this): Split out of...
(maybe_return_this): ... this.
* cp-tree.h (maybe_prepare_return_this): Declare.
* class.cc (build_clone): Call it.

for  gcc/testsuite/ChangeLog

* g++.dg/warn/Wuse-after-free3.C: xfail on arm_eabi.

(cherry picked from commit 0d24289d129639efdc79338a64188d6d404375e8)

[testsuite] tsvc: skip include malloc.h when unavailable

tsvc tests all fail on systems that don't offer a malloc.h, other than
those that explicitly rule that out.  Use the preprocessor to test for
malloc.h's availability.

tsvc.h also expects a definition for struct timeval, but it doesn't
include sys/time.h.  Add a conditional include thereof.

for  gcc/testsuite/ChangeLog

* gcc.dg/vect/tsvc/tsvc.h: Test for and conditionally include
malloc.h and sys/time.h.

(cherry picked from commit 2f20d6296087cae51f55eeecb3efefe786191fd6)

testsuite: Pattern does not match when using --specs=nano.specs

When running the testsuite for newlib nano, the --specs=nano.specs
option is used.  This option prepends cpp_unique_options with
"-isystem =/include/newlib-nano" so that the newlib nano header files
override the newlib standard ones.  As the -isystem option is prepended,
the -quiet option is no longer the first option to cc1.  Adjust the test
accordingly.

Patch has been verified on Windows and Linux.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/options.exp: Allow other options before the
-quite option for cc1.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit 1175d1b35ce7bf8ee7c9b37b334370f01eb95335)

c++: for contracts, cdtors never return this

When targetm.cxx.cdtor_return_this() holds, cdtors have a
non-VOID_TYPE_P result, but IMHO this ABI implementation detail
shouldn't leak to the abstract language conceptual framework, in which
cdtors don't have return values.  For contracts, specifically those
that establish postconditions on results, such a leakage is present,
and the present patch puts an end to it: with it, cdtors get an error
for result postconditions regardless of the ABI.  This fixes
g++.dg/contracts/contracts-ctor-dtor2.C on arm-eabi.

for  gcc/cp/ChangeLog

* contracts.cc (check_postcondition_result): Cope with
cdtor_return_this.

(cherry picked from commit 71804526d3a71a8c0f189a89ce3aa615784bfd8b)

libstdc++: Do not use assume attribute for Clang [PR112467]

Clang has an 'assume' attribute, but it's a function attribute not a
statement attribute. The recently-added use of the statement form causes
an error with Clang.

libstdc++-v3/ChangeLog:

PR libstdc++/112467
* include/bits/stl_bvector.h (_M_assume_normalized): Do not use
statement form of assume attribute for Clang.

(cherry picked from commit 807f47497f17ed50be91f0f879308cb6fa063966)

libstdc++: optimize bit iterators assuming normalization [PR110807]

The representation of bit iterators, using a pointer into an array of
words, and an unsigned bit offset into that word, makes for some
optimization challenges: because the compiler doesn't know that the
offset is always in a certain narrow range, beginning at zero and
ending before the word bitwidth, when a function loads an offset that
it hasn't normalized itself, it may fail to derive certain reasonable
conclusions, even to the point of retaining useless calls that elicit
incorrect warnings.

Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
list to a global bit vector variable.  Based on the compile-time
constant length of the list, we decide in _M_insert_range whether to
use the existing storage or to allocate new storage for the vector.
After allocation, we decide in _M_copy_aligned how to copy any
preexisting portions of the vector to the newly-allocated storage.
When copying two or more words, we use __builtin_memmove.

However, because we compute the available room using bit offsets
without range information, even comparing them with constants, we fail
to infer ranges for the preexisting vector depending on word size, and
may thus retain the memmove call despite knowing we've only allocated
one word.

Other parts of the compiler then detect the mismatch between the
constant allocation size and the much larger range that could
theoretically be copied into the newly-allocated storage if we could
reach the call.

Ensuring the compiler is aware of the constraints on the offset range
enables it to do a much better job at optimizing.  Using attribute
assume (_M_offset <= ...) didn't work, because gimple lowered that to
something that vrp could only use to ensure 'this' was non-NULL.
Exposing _M_offset as an automatic variable/gimple register outside
the unevaluated assume operand enabled the optimizer to do its job.

Rather than placing such load-then-assume constructs all over, I
introduced an always-inline member function in bit iterators that does
the job of conveying to the compiler the information that the
assumption is supposed to hold, and various calls throughout functions
pertaining to bit iterators that might not otherwise know that the
offsets have to be in range, so that the compiler no longer needs to
make conservative assumptions that prevent optimizations.

With the explicit assumptions, the compiler can correlate the test for
available storage in the vector with the test for how much storage
might need to be copied, and determine that, if we're not asking for
enough room for two or more words, we can omit entirely the code to
copy two or more words, without any runtime overhead whatsoever: no
traces remain of the undefined behavior or of the tests that inform
the compiler about the assumptions that must hold.

for  libstdc++-v3/ChangeLog

PR libstdc++/110807
* include/bits/stl_bvector.h (_Bit_iterator_base): Add
_M_assume_normalized member function.  Call it in _M_bump_up,
_M_bump_down, _M_incr, operator==, operator<=>, operator<, and
operator-.
(_Bit_iterator): Also call it in operator*.
(_Bit_const_iterator): Likewise.

(cherry picked from commit e39b3e02c27bd771a07e385f9672ecf1a45ced77)

Daily bump.

libstdc++: Prefer posix_memalign for aligned-new [PR113258]

As described in PR libstdc++/113258 there are old versions of tcmalloc
which replace malloc and related APIs, but do not repalce aligned_alloc
because it didn't exist at the time they were released. This means that
when operator new(size_t, align_val_t) uses aligned_alloc to obtain
memory, it comes from libc's aligned_alloc not from tcmalloc. But when
operator delete(void*, size_t, align_val_t) uses free to deallocate the
memory, that goes to tcmalloc's replacement version of free, which
doesn't know how to free it.

If we give preference to the older posix_memalign instead of
aligned_alloc then we're more likely to use a function that will be
compatible with the replacement version of free. Because posix_memalign
has been around for longer, it's more likely that old third-party malloc
replacements will also replace posix_memalign alongside malloc and free.

libstdc++-v3/ChangeLog:

PR libstdc++/113258
* libsupc++/new_opa.cc: Prefer to use posix_memalign if
available.

(cherry picked from commit f50f2efae9fb0965d8ccdb62cfdb698336d5a933)

libstdc++: Fix non-portable results from 64-bit std::subtract_with_carry_engine [PR107466]

I implemented the resolution of LWG 3809 in r13-4364-ga64775a0edd469 but
it was recently noted in the MSVC STL github repo that the change causes
possible truncation for 64-bit seeds. Whether the truncation occurs (and
to what value) depends on the width of uint_least32_t which is not
portable, so the output of the PRNG for 64-bit seed values is no longer
the same as in C++20, and no longer portable across platforms.

That new issue was filed as LWG 4014. I proposed a new change which
reduces the seed by the LCG's modulus before the conversion to
uint_least32_t. This ensures that 64-bit seed values are consistently
reduced by the modulus before any truncation. This removes the
platform-dependent behaviour and restores the old behaviour for
std::subtract_with_carry_engine specializations using a 64-bit result
type (such as std::ranlux48_base).

libstdc++-v3/ChangeLog:

PR libstdc++/107466
* include/bits/random.tcc (subtract_with_carry_engine::seed):
Implement proposed resolution of LWG 4014.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.
* testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc:
Check for expected result of 64-bit engine with seed that
doesn't fit in 32-bits.

(cherry picked from commit c224dec0e7c88e7a95633023018cdcb6ee87c65f)

libstdc++: Implement some missing functions for net::ip::network_v6

libstdc++-v3/ChangeLog:

* include/experimental/internet (network_v6::network): Define.
(network_v6::hosts): Finish implementing.
(network_v6::to_string): Do not concatenate std::string to
arbitrary std::basic_string specialization.
* testsuite/experimental/net/internet/network/v6/cons.cc: New
test.

(cherry picked from commit 5b069117e261ae0b438aee0cdff8707827810adf)

libstdc++: Avoid reusing moved-from iterators in PSTL tests [PR90276]

The reverse_invoker utility for PSTL tests uses forwarding references for
all parameters, but some of those parameters get forwarded to move
constructors which then leave the objects in a moved-from state. When
the parameters are forwarded a second time that results in making new
copies of moved-from iterators. For libstdc++ debug mode iterators, the
moved-from state is singular, which means copying them will abort at
runtime.

The fix is to make copies of iterator arguments instead of forwarding
them.

The callers of reverse_invoker::operator() also forward the iterators
multiple times, but that's OK because reverse_invoker accepts them by
forwarding reference but then breaks the chain of forwarding and copies
them as lvalues.

libstdc++-v3/ChangeLog:

PR libstdc++/90276
* testsuite/util/pstl/test_utils.h (reverse_invoker): Do not use
perfect forwarding for iterator arguments.

(cherry picked from commit 723a7c1ad29523b9ddff53c7b147bffea56fbb63)

libstdc++: Remove noexcept from std::osyncstream::operator=

This should not be noexcept because its _M_syncbuf member has a
potentially-throwing move assignment operator. The noexcept was removed
by LWG 3867.

libstdc++-v3/ChangeLog:

* include/std/syncstream (basic_osyncstream::operator=): Remove
noexcept, as per LWG 3867.

(cherry picked from commit 67f5a8c802463228da7c7bb2ab100095217560a6)

libstdc++: Allow explicit conversion of string views with different traits

This was changed by LWG 3857.

libstdc++-v3/ChangeLog:

* include/std/string_view (basic_string_view(R&&)): Remove
constraint that traits_type must be the same, as per LWG 3857.
* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc:
Explicit conversion between different specializations should be
allowed.
* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc:
Likewise.

(cherry picked from commit f60d7e1c64518936797ec1009cb49f72f8fe45b9)

AVR: target/113824 - Fix multilib set for ATA5795.

gcc/
PR target/113824
* config/avr/avr-mcus.def (ata5797): Move from avr5 to avr4.
* doc/avr-mmcu.texi: Rebuild.

(cherry picked from commit 969bc5805eecb04a278e751e861805cfa04a4d5f)

AVR: Always define __AVR_PM_BASE_ADDRESS__ in specs provided the core has it.

gcc/
* config/avr/gen-avr-mmcu-specs.cc (print_mcu) <*cpp_mcu>: Spec always
defines __AVR_PM_BASE_ADDRESS__ if the core has it.

(cherry picked from commit e515d813f080fb4c4e70d3c7b01815a909893688)

Daily bump.

Update gcc zh_CN.po

* zh_CN.po: Update.

Fix contracts-tmpl-spec2.C on targets where plain char is unsigned by default

Since contracts-tmpl-spec2.C is just testing contracts, I thought it would be better
to just add `-fsigned-char` to the options rather than change the testcase to support
both cases.

Committed after testing on aarch64-linux-gnu.

gcc/testsuite/ChangeLog:

PR testsuite/108321
* g++.dg/contracts/contracts-tmpl-spec2.C: Add -fsigned-char
to options.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 6e15e4e1abed02443a27a69455f4bfa49457c99e)

testsuite: Replace many dg-require-thread-fence with dg-require-atomic-cmpxchg-word

These tests actually use a form of atomic compare and exchange
operation, not just atomic loading and storing. Some targets (not
supported by e.g. libatomic) have atomic loading and storing, but not
compare and exchange, yielding linker errors for missing library
functions.

This change is just for existing uses of
dg-require-thread-fence. It does not fix any other tests
that should also be gated on dg-require-atomic-cmpxchg-word.

* testsuite/29_atomics/atomic/compare_exchange_padding.cc,
testsuite/29_atomics/atomic_flag/clear/1.cc,
testsuite/29_atomics/atomic_flag/cons/value_init.cc,
testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc,
testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc,
testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc,
testsuite/29_atomics/atomic_ref/generic.cc,
testsuite/29_atomics/atomic_ref/integral.cc,
testsuite/29_atomics/atomic_ref/pointer.cc: Replace
dg-require-thread-fence with dg-require-atomic-cmpxchg-word.

(cherry picked from commit ba0cde8ba2d93b7193050eb5ef3cc6f7a2cdfe61)

testsuite: Add dg-require-atomic-cmpxchg-word

Some targets (armv6-m) support inline atomic load and store,
i.e. dg-require-thread-fence matches, but not atomic operations like
compare and exchange.

This directive can be used to replace uses of dg-require-thread-fence
where an atomic operation is actually used.

* testsuite/lib/dg-options.exp (dg-require-atomic-cmpxchg-word):
New proc.
* testsuite/lib/libstdc++.exp (check_v3_target_atomic_cmpxchg_word):
Ditto.

(cherry picked from commit 2a4d9e4f533c77870cc0eb60fbbd8047da4c7386)

libstdc++: Add dg-require-thread-fence in several tests

Some targets like arm-eabi with newlib and default settings rely on
__sync_synchronize() to ensure synchronization. Newlib does not
implement it by default, to make users aware they have to take special
care.

This makes a few tests fail to link.

This patch requires the missing thread-fence effective target in the
tests that need it, making them UNSUPPORTED instead of FAIL and
UNRESOLVED.

2023-09-10 Christophe Lyon <christophe.lyon@linaro.org>

libstdc++-v3/
* testsuite/29_atomics/atomic/compare_exchange_padding.cc: Likewise.
* testsuite/29_atomics/atomic/cons/value_init.cc: Likewise.
* testsuite/29_atomics/atomic_float/value_init.cc: Likewise.
* testsuite/29_atomics/atomic_integral/cons/value_init.cc: Likewise.
* testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc: Likewise.
* testsuite/29_atomics/atomic_ref/generic.cc: Likewise.
* testsuite/29_atomics/atomic_ref/integral.cc: Likewise.
* testsuite/29_atomics/atomic_ref/pointer.cc: Likewise.

(cherry picked from commit 62b29347c38394ae32858f2301aa9aa65205984e)

aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

The PR shows us ICEing due to an unrecognizable TFmode save emitted by
aarch64_process_components.  The problem is that for T{I,F,D}mode we
conservatively require mems to be in range for x-register ldp/stp.  That
is because (at least for TImode) it can be allocated to both GPRs and
FPRs, and in the GPR case that is an x-reg ldp/stp, and the FPR case is
a q-register load/store.

As Richard pointed out in the PR, aarch64_get_separate_components
already checks that the offsets are suitable for a single load, so we
just need to choose a mode in aarch64_reg_save_mode that gives the full
q-register range.  In this patch, we choose V16QImode as an alternative
16-byte "bag-of-bits" mode that doesn't have the artificial range
restrictions imposed on T{I,F,D}mode.

Unlike for GCC 14 we need additional handling in the load/store pair
code as various cases are not expecting to see V16QImode (particularly
the writeback patterns, but also aarch64_gen_load_pair).

This is a backport of
r14-8657-g0529ba8168c89f24314e8750237d77bb132bea9c.

gcc/ChangeLog:

PR target/111677
* config/aarch64/aarch64.cc (aarch64_reg_save_mode): Use
V16QImode for the full 16-byte FPR saves in the vector PCS case.
(aarch64_gen_storewb_pair): Handle V16QImode.
(aarch64_gen_loadwb_pair): Likewise.
(aarch64_gen_load_pair): Likewise.
* config/aarch64/aarch64.md (loadwb_pair<TX:mode>_<P:mode>):
Rename to ...
(loadwb_pair<TX_V16QI:mode>_<P:mode>): ... this, extending to
V16QImode.
(storewb_pair<TX:mode>_<P:mode>): Rename to ...
(storewb_pair<TX_V16QI:mode>_<P:mode>): ... this, extending to
V16QImode.
* config/aarch64/iterators.md (TX_V16QI): New.

gcc/testsuite/ChangeLog:

PR target/111677
* gcc.target/aarch64/torture/pr111677.c: New test.

Daily bump.

tree-optimization/112618 - unused .MASK_CALL

We have to make sure to remove unused .MASK_CALL internal function
calls after vectorization.

PR tree-optimization/112618
* tree-vect-loop.cc (vect_transform_loop_stmt): For not
relevant and unused .MASK_CALL make sure we remove the
scalar stmt.

* gcc.dg/pr112618.c: New testcase.

(cherry picked from commit 57c028acbec4f7b594e6b024e02d6c799b51e03d)

tree-optimization/112505 - bit-precision induction vectorization

Vectorization of bit-precision inductions isn't implemented but we
don't check this, instead we ICE during transform.

PR tree-optimization/112505
* tree-vect-loop.cc (vectorizable_induction): Reject
bit-precision induction.

* gcc.dg/vect/pr112505.c: New testcase.

(cherry picked from commit ec345df53556ec581590347f71c3d9ff3cdbca76)

tree-optimization/112495 - alias versioning and address spaces

We are not correctly handling differing address spaces in dependence
analysis runtime alias check generation so refuse to do that.

PR tree-optimization/112495
* tree-data-ref.cc (runtime_alias_check_p): Reject checks
between different address spaces.

* gcc.target/i386/pr112495.c: New testcase.

(cherry picked from commit 0f593c0521caab8cfac53514b1a5e7d0d0dd1932)

tree-optimization/110221 - SLP and loop mask/len

The following fixes the issue that when SLP stmts are internal defs
but appear invariant because they end up only using invariant defs
then they get scheduled outside of the loop. This nice optimization
breaks down when loop masks or lens are applied since those are not
explicitly tracked as dependences. The following makes sure to never
schedule internal defs outside of the vectorized loop when the
loop uses masks/lens.

PR tree-optimization/110221
* tree-vect-slp.cc (vect_schedule_slp_node): When loop
masking / len is applied make sure to not schedule
intenal defs outside of the loop.

* gfortran.dg/pr110221.f: New testcase.

(cherry picked from commit e5f1956498251a4973d52c8aad3faf34d0443169)

middle-end/110176 - wrong zext (bool) <= (int) 4294967295u folding

The following fixes a wrong pattern that didn't match the behavior
of the original fold_widened_comparison in that get_unwidened
returned a constant always in the wider type. But here we're
using (int) 4294967295u without the conversion applied. Fixed
by doing as earlier in the pattern - matching constants only
if the conversion was actually applied.

PR middle-end/110176
* match.pd (zext (bool) <= (int) 4294967295u): Make sure
to match INTEGER_CST only without outstanding conversion.

* gcc.dg/torture/pr110176.c: New testcase.

(cherry picked from commit 22dbfbe8767ff4c1d93e39f68ec7c2d5b1358beb)

libstdc++: /dev/null is not accessible on Windows

When running the DejaGNU testsuite on a toolchain built for native
Windows, the path /dev/null can't be used to open a stream to void.
On native Windows, the resource is instead named "nul".

In 17_intro/tag_type_explicit_ctor.cc, the following statement would
fail to match when the DejaGNU testsuite is running in cygwin with a
native toolchain.
// dg-error 53 "explicit" "" { target hosted }

The "target hosted"-check is using cpp to verify if _GLIBCXX_HOSTED is
defined and discards the output by simply redirecting it to /dev/null.
In v3_target_compile, it's overridden to "nul" for MinGW targets, but
the same rule applies when host is cygwin, so replace the condition
with a check for Windows.

The error in the log would look like this for the "target hosted" check:
cc1plus.exe: fatal error: opening output file /dev/null: No such file or directory

The tag_type_explicit_ctor.cc test fails with this on Windows:
.../tag_type_explicit_ctor.cc:53: error: converting to 'std::defer_lock_t' from initializer list would use explicit constructor 'constexpr std::defer_lock_t::defer_lock_t()'
.../tag_type_explicit_ctor.cc:54: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:55: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:67: error: converting to 'std::defer_lock_t' from initializer list would use explicit constructor 'constexpr std::defer_lock_t::defer_lock_t()'
.../tag_type_explicit_ctor.cc:68: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:69: error: converting to 'std::adopt_lock_t' from initializer list would use explicit constructor 'constexpr std::adopt_lock_t::adopt_lock_t()'

Patch has been verified on Windows and Linux.

libstdc++-v3:

* testsuite/lib/libstdc++.exp: Use "nul" for Windows, "/dev/null"
for other environments.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit 1e4664b97daf85d3134baa47a5af3db23658df34)

c++: defaulted op== for incomplete class [PR107291]

After complaining about lack of friendship, we should not try to go on and
define the defaulted comparison operator anyway.

PR c++/107291

gcc/cp/ChangeLog:

* method.cc (early_check_defaulted_comparison): Fail if not friend.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-eq17.C: New test.

(cherry picked from commit c5d34912ad576be1ef19be92f7eabde54b9089eb)

Daily bump.

c++: prvalue of array type [PR111286]

Here we want to build a prvalue array to bind to the T reference, but we
were wrongly trying to strip cv-quals from the array prvalue, which should
be treated the same as a class prvalue.

PR c++/111286

gcc/cp/ChangeLog:

* tree.cc (rvalue): Don't drop cv-quals from an array.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-array22.C: New test.

(cherry picked from commit c7e8381748f78335e9fef23f363b6a9e4463ce7e)

varasm: check float size [PR109359]

In PR95226, the testcase was failing because we tried to output_constant a
NOP_EXPR to float from a double REAL_CST, and so we output a double where
the caller wanted a float. That doesn't happen anymore, but with the
output_constant hunk we will ICE in that situation rather than emit the
wrong number of bytes.

Part of the problem was that initializer_constant_valid_p_1 returned true
for that NOP_EXPR, because it compared the sizes of integer types but not
floating-point types. So the C++ front end assumed it didn't need to fold
the initializer.

This also fixed the test for PR109359.

PR c++/95226
PR c++/109359

gcc/ChangeLog:

* varasm.cc (output_constant) [REAL_TYPE]: Check that sizes match.
(initializer_constant_valid_p_1): Compare float precision.

gcc/testsuite/ChangeLog:

* g++.dg/ext/frounding-math1.C: New test.

(cherry picked from commit e7cc4d703bceb9095316c106eba0d1939c6c8044)

Update gcc zh_CN.po

* zh_CN.po: Update.

mips: Fix missing mode in neg<mode:MSA>2

I was too sleepy writting this :(.

gcc/ChangeLog:

* config/mips/mips-msa.md (neg<mode:MSA>2): Add missing mode for
neg.

(cherry picked from commit 55357960fbddd261e32f088f5dd328d58b0f25b3)

MIPS: Fix wrong MSA FP vector negation

We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is
wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
fail when Python is built with MSA enabled.

Use the bnegi.df instructions to simply reverse the sign bit instead.

gcc/ChangeLog:

* config/mips/mips-msa.md (elmsgnbit): New define_mode_attr.
(neg<mode>2): Change the mode iterator from MSA to IMSA because
in FP arithmetic we cannot use (0 - x) for -x.
(neg<mode>2): New define_insn to implement FP vector negation,
using a bnegi instruction to negate the sign bit.

(cherry picked from commit 4d7fe3cf82505b45719356a2e51b1480b5ee21d6)

Daily bump.

Revert use of accumulator type in expansion of 'Reduce attribute

gcc/ada
* exp_attr.adb (Expand_N_Attribute_Reference): Revert older change.

Revert fix for reduction expression with overloaded reducer subprogram

gcc/ada
* exp_attr.adb (Expand_N_Attribute_Reference): Revert latest change.

Daily bump.

c++: op== defaulted outside class [PR110084]

defaulted_late_check is for checks that need to happen after the class is
complete; we shouldn't call it sooner.

PR c++/110084

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Only check a function defaulted
outside the class if the class is complete.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-synth-neg3.C: Check error message.
* g++.dg/cpp2a/spaceship-eq16.C: New test.

(cherry picked from commit e17a122d417fc0d606bcb3a3705b93ee81745cab)

c++: variable template array of unknown bound [PR113638]

When we added variable templates, we didn't extend the VAR_HAD_UNKNOWN_BOUND
handling for class template static data members to handle them as well.

PR c++/113638

gcc/cp/ChangeLog:

* cp-tree.h: Adjust comment.
* pt.cc (instantiate_template): Set VAR_HAD_UNKNOWN_BOUND for
variable template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/var-templ-array1.C: New test.

(cherry picked from commit 0b786ff38ab398087820d91241e030a28c451df9)

c++: no_unique_address and constexpr [PR112439]

Here, because we don't build a CONSTRUCTOR for an empty base, we were
wrongly marking the Foo CONSTRUCTOR as complete after initializing the Empty
member. Fixed by checking empty_base here as well.

PR c++/112439

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_store_expression): Check empty_base
before marking a CONSTRUCTOR readonly.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/no_unique_address15.C: New test.

(cherry picked from commit f4998609908e4926fc095ce97cb84b187294fd1d)