git.ipfire.org Git - thirdparty/gcc.git/log

commit | commitdiff | tree

GCC Administrator [Tue, 8 Oct 2024 00:20:03 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

Jonathan Wakely [Mon, 28 Nov 2022 12:16:21 +0000 (12:16 +0000)]

libstdc++: Fix std::string_view for IL32P16 targets

For H8/300 with -msx -mn -mint32 the type of (_M_len - __pos) is int,
because int is wider than size_t so the operands are promoted.
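
The effect of the explicit template argument can be sketched as follows. This is
a minimal illustration, not the libstdc++ source: clamp_len and its parameter
names are hypothetical stand-ins for the length clamp in copy and substr.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the clamp done in basic_string_view::copy/substr.
// Naming the template argument converts both operands to std::size_t, so the
// call still compiles on a target where (len - pos) is promoted to int.
std::size_t clamp_len(std::size_t n, std::size_t len, std::size_t pos)
{
  return std::min<std::size_t>(n, len - pos);
}
```

Without the explicit <std::size_t>, template argument deduction for std::min
fails whenever the two argument types differ.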

libstdc++-v3/ChangeLog:

* include/std/string_view (basic_string_view::copy): Use explicit
template argument for call to std::min<size_t>.
(basic_string_view::substr): Likewise.

commit | commitdiff | tree

Jonathan Wakely [Thu, 20 Jun 2024 15:13:10 +0000 (16:13 +0100)]

libstdc++: Initialize base in test allocator's constructor

This fixes a warning from one of the test allocators:
warning: base class 'class std::allocator<__gnu_test::copy_tracker>' should be explicitly initialized in the copy constructor [-Wextra]

libstdc++-v3/ChangeLog:

* testsuite/util/testsuite_allocator.h (tracker_allocator):
Initialize base class in copy constructor.

(cherry picked from commit e2fb245b07f489ed5bfd9a945e0053b4a3211245)

commit | commitdiff | tree

Jonathan Wakely [Mon, 8 Apr 2024 16:41:00 +0000 (17:41 +0100)]

libstdc++: Handle EMLINK and EFTYPE in std::filesystem::remove_all

Although POSIX requires ELOOP, FreeBSD documents that openat with
O_NOFOLLOW returns EMLINK if the last component of a filename is a
symbolic link.  Check for EMLINK as well as ELOOP, so that the TOCTTOU
mitigation in remove_all works correctly.

See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214633 or the
FreeBSD man page for reference.

According to its man page, DragonFlyBSD also uses EMLINK for this error,
and NetBSD uses its own EFTYPE. OpenBSD follows POSIX and uses ELOOP.
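
A minimal sketch of the errno check described above. The helper name
is_symlink_open_error is hypothetical; the real logic lives in
src/c++17/fs_ops.cc and may be structured differently.

```cpp
#include <cerrno>

// Hypothetical helper: does this errno value from openat(..., O_NOFOLLOW)
// mean "the last path component is a symbolic link"?
bool is_symlink_open_error(int err)
{
  if (err == ELOOP)   // the value POSIX requires
    return true;
#if defined(__FreeBSD__) || defined(__DragonFly__)
  if (err == EMLINK)  // documented by FreeBSD and DragonFlyBSD
    return true;
#endif
#if defined(__NetBSD__)
  if (err == EFTYPE)  // NetBSD-specific value
    return true;
#endif
  return false;
}
```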

This fixes these failures on FreeBSD:
FAIL: 27_io/filesystem/operations/remove_all.cc  -std=gnu++17 execution test
FAIL: experimental/filesystem/operations/remove_all.cc  -std=gnu++17 execution test

libstdc++-v3/ChangeLog:

* src/c++17/fs_ops.cc (remove_all) [__FreeBSD__ || __DragonFly__]:
Check for EMLINK as well as ELOOP.
[__NetBSD__]: Check for EFTYPE as well as ELOOP.

commit | commitdiff | tree

Jonathan Wakely [Mon, 10 Jun 2024 13:08:16 +0000 (14:08 +0100)]

libstdc++: Fix std::tr2::dynamic_bitset shift operations [PR115399]

The shift operations for dynamic_bitset fail to zero out words where the
non-zero bits were shifted to a completely different word.

For a right shift we don't need to sanitize the unused bits in the high
word, because we know they were already clear and a right shift doesn't
change that.
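
The left-shift case can be illustrated with a small multi-word shift. This is a
sketch under assumptions, not the dynamic_bitset code: shift_left, the 64-bit
word size and the little-endian word order are chosen for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: shift an array of 64-bit words (word 0 is the least
// significant) left by `shift` bits.
void shift_left(std::vector<std::uint64_t>& w, std::size_t shift)
{
  const std::size_t bits = 64;
  const std::size_t wshift = shift / bits;  // whole words to move
  const std::size_t offset = shift % bits;  // remaining bit offset
  const std::size_t n = w.size();
  if (wshift >= n)
    {
      w.assign(n, 0);
      return;
    }
  // Walk from the high word down so sources are read before being overwritten.
  for (std::size_t i = n; i-- > wshift; )
    {
      std::uint64_t hi = w[i - wshift] << offset;
      std::uint64_t lo = 0;
      if (offset != 0 && i > wshift)
        lo = w[i - wshift - 1] >> (bits - offset);
      w[i] = hi | lo;
    }
  // The step the fix is about: words that should no longer be populated
  // must be zeroed, otherwise stale bits remain.
  for (std::size_t i = 0; i < wshift; ++i)
    w[i] = 0;
}
```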

libstdc++-v3/ChangeLog:

PR libstdc++/115399
* include/tr2/dynamic_bitset (operator>>=): Remove redundant
call to _M_do_sanitize.
* include/tr2/dynamic_bitset.tcc (_M_do_left_shift): Zero out
low bits in words that should no longer be populated.
(_M_do_right_shift): Likewise for high bits.
* testsuite/tr2/dynamic_bitset/pr115399.cc: New test.

(cherry picked from commit bd3a312728fbf8c35a09239b9180269f938f872e)

commit | commitdiff | tree

Kim Gräsman [Tue, 27 Aug 2024 16:08:47 +0000 (17:08 +0100)]

libstdc++: Fix @headername for bits/cpp_type_traits.h

There is no file ext/type_traits, point it to ext/type_traits.h instead.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h: Improve doxygen file docs.

(cherry picked from commit f6ed7a61a7c906f8fb7f8059132225c9bc41f3b2)

commit | commitdiff | tree

Kim Gräsman [Tue, 27 Aug 2024 16:11:29 +0000 (17:11 +0100)]

libstdc++: Fix @file for target-specific opt_random.h

A few of these files self-identified as ext/random.tcc, update to use
the actual basename.

libstdc++-v3/ChangeLog:

* config/cpu/aarch64/opt/ext/opt_random.h: Improve doxygen file
docs.
* config/cpu/i486/opt/ext/opt_random.h: Likewise.

(cherry picked from commit c2ad7b2d5247cf2ddee98d7f46274775a3fa1268)

commit | commitdiff | tree

Jonathan Wakely [Fri, 5 Jul 2024 19:00:04 +0000 (20:00 +0100)]

libstdc++: Use reserved form of [[__unlikely__]] in <variant>

We should not use [[unlikely]] before C++20, so use [[__unlikely__]]
instead.
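
A small sketch of the reserved spelling, assuming a compiler such as GCC that
accepts it before C++20. The function checked_index is hypothetical and is not
the code in _Variant_storage::_M_reset.

```cpp
// Hypothetical example: mark the error path as unlikely using the
// double-underscore spelling, which stays in the implementation's
// reserved namespace.
int checked_index(int index)
{
  if (index < 0) [[__unlikely__]]
    return 0;
  return index;
}
```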

libstdc++-v3/ChangeLog:

* include/std/variant (_Variant_storage::_M_reset): Use
__unlikely__ form of attribute instead of unlikely.

(cherry picked from commit 9f1cd51766f251aafe0f1b898892f79855892729)

commit | commitdiff | tree

Jonathan Wakely [Wed, 28 Aug 2024 11:38:18 +0000 (12:38 +0100)]

libstdc++: Fix autoconf check for O_NONBLOCK in <fcntl.h>

I misused the AC_CHECK_DECL macro, assuming that it behaved like
AC_CHECK_DECLS and always defined a HAVE_xxx macro if the decl was
found. Instead, the [action-if-found] shell commands are needed to
define HAVE_O_NONBLOCK explicitly.

libstdc++-v3/ChangeLog:

* configure.ac: Fix check for O_NONBLOCK.
* config.h.in: Regenerate.
* configure: Regenerate.

(cherry picked from commit b68561dd7925dfee1836f75d3fa8d33fff5c2498)

commit | commitdiff | tree

Jonathan Wakely [Tue, 10 Sep 2024 13:25:41 +0000 (14:25 +0100)]

libstdc++: std::string move assignment should not use POCCA trait [PR116641]

The changes to implement LWG 2579 (r10-327-gdb33efde17932f) made
std::string::assign use the propagate_on_container_copy_assignment
(POCCA) trait, for consistency with operator=(const basic_string&).
However, this also unintentionally affected operator=(basic_string&&)
which calls assign(str) to make a deep copy when performing a move is
not possible. The fix is for the move assignment operator to call
_M_assign(str) instead of assign(str), as this just does the deep copy
and doesn't check the POCCA trait first.

The bug only affects the unlikely/useless combination of POCCA==true and
POCMA==false, but we should fix it for correctness anyway. It should
also make move assignment slightly cheaper to compile and execute,
because we skip the extra code in assign(const basic_string&).
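
The affected trait combination can be written down as a minimal allocator. This
is a sketch for illustration: PoccaOnlyAlloc and pocca_without_pocma are
hypothetical names and are not taken from the new test.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical allocator that propagates on copy assignment (POCCA == true)
// but not on move assignment (POCMA == false).
template<typename T>
struct PoccaOnlyAlloc
{
  using value_type = T;
  using propagate_on_container_copy_assignment = std::true_type;
  using propagate_on_container_move_assignment = std::false_type;
  using is_always_equal = std::false_type;

  int id = 0;  // allocators with different ids compare unequal

  PoccaOnlyAlloc() = default;
  explicit PoccaOnlyAlloc(int i) : id(i) { }
  template<typename U>
    PoccaOnlyAlloc(const PoccaOnlyAlloc<U>& other) : id(other.id) { }

  T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
  void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

template<typename T, typename U>
bool operator==(const PoccaOnlyAlloc<T>& a, const PoccaOnlyAlloc<U>& b)
{ return a.id == b.id; }

template<typename T, typename U>
bool operator!=(const PoccaOnlyAlloc<T>& a, const PoccaOnlyAlloc<U>& b)
{ return a.id != b.id; }

// True for the combination the commit message describes.
template<typename Alloc>
constexpr bool pocca_without_pocma()
{
  using traits = std::allocator_traits<Alloc>;
  return traits::propagate_on_container_copy_assignment::value
    && !traits::propagate_on_container_move_assignment::value;
}
```

With such an allocator, move assignment between strings whose allocators compare
unequal has to fall back to a deep copy and must leave the destination's
allocator unchanged.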

libstdc++-v3/ChangeLog:

PR libstdc++/116641
* include/bits/basic_string.h (operator=(basic_string&&)): Call
_M_assign instead of assign.
* testsuite/21_strings/basic_string/allocator/116641.cc: New
test.

(cherry picked from commit c07cf418fdde0c192e370a8d76a991cc7215e9c4)

commit | commitdiff | tree

Jonathan Wakely [Fri, 28 Jun 2024 14:14:15 +0000 (15:14 +0100)]

libstdc++: Define __glibcxx_assert_fail for non-verbose build [PR115585]

When the library is configured with --disable-libstdcxx-verbose the
assertions just abort instead of calling __glibcxx_assert_fail, and so I
didn't export that function for the non-verbose build. However, that
option is documented to not change the library ABI, so we still need to
export the symbol from the library. It could be needed by programs
compiled against the headers from a verbose build.

The non-verbose definition can just call abort so that it doesn't pull
in I/O symbols, which are unwanted in a non-verbose build.

libstdc++-v3/ChangeLog:

PR libstdc++/115585
* src/c++11/assert_fail.cc (__glibcxx_assert_fail): Add
definition for non-verbose builds.

(cherry picked from commit 52370c839edd04df86d3ff2b71fcdca0c7376a7f)

commit | commitdiff | tree

GCC Administrator [Mon, 7 Oct 2024 00:19:15 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 6 Oct 2024 00:20:33 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

John David Anglin [Sat, 5 Oct 2024 22:18:31 +0000 (18:18 -0400)]

hppa: Fix indirect_goto constraint

Noticed testing LRA.

2024-10-05 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.md: Fix indirect_goto constraint.

commit | commitdiff | tree

GCC Administrator [Sat, 5 Oct 2024 00:19:25 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

H.J. Lu [Fri, 4 Oct 2024 08:21:15 +0000 (16:21 +0800)]

x86: Disable stack protector for naked functions

Since naked functions should not enable stack protector, define
TARGET_STACK_PROTECT_RUNTIME_ENABLED_P to disable stack protector
for naked functions.

gcc/

PR target/116962
* config/i386/i386.cc (ix86_stack_protect_runtime_enabled_p): New
function.
(TARGET_STACK_PROTECT_RUNTIME_ENABLED_P): New.

gcc/testsuite/

PR target/116962
* gcc.target/i386/pr116962.c: New file.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 7d2845da112214f064e7b24531cc67e256b5177e)

commit | commitdiff | tree

GCC Administrator [Fri, 4 Oct 2024 00:18:37 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Thu, 3 Oct 2024 00:19:24 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

Richard Biener [Wed, 18 Sep 2024 07:52:55 +0000 (09:52 +0200)]

tree-optimization/116585 - SSA corruption with split_constant_offset

split_constant_offset when looking through SSA defs can end up
picking SSA leafs that are subject to abnormal coalescing. This
can lead downstream consumers to insert code based on the
result (like from dataref analysis) in places that violate constraints
for abnormal coalescing. It's best not to expand defs whose operands
are subject to abnormal coalescing, and also to do no further processing
when a subexpression already has operands like that.

PR tree-optimization/116585
* tree-data-ref.cc (split_constant_offset_1): When either
operand is subject to abnormal coalescing do no further
processing.

* gcc.dg/torture/pr116585.c: New testcase.

(cherry picked from commit 1d0cb3b5fca69b81e69cfdb4aea0eebc1ac04750)

commit | commitdiff | tree

GCC Administrator [Wed, 2 Oct 2024 00:18:26 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Tue, 1 Oct 2024 00:21:17 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

Jan Hubicka [Tue, 3 Sep 2024 11:38:33 +0000 (13:38 +0200)]

Zen5 tuning part 1: avoid FMA chains

Testing matrix multiplication benchmarks shows that FMA on a critical chain
is a performance loss over separate multiply and add. While the latency of 4
is lower than multiply + add (3+2), the problem is that all values need to
be ready before computation starts.

While on znver4 AVX512 code fared well with FMA, it was because of the split
registers. Znver5 benefits from avoiding FMA on all widths.  This may be different
with the mobile version though.

On a naive matrix multiplication benchmark the difference is 8%, but only with
-O3, since with -Ofast loop interchange solves the problem differently.
It is a 30% win, for example, on S323 from TSVC:

real_t s323(struct args_t * func_args)
{

//    recurrences
//    coupled recurrence

    initialise_arrays(__func__);
    gettimeofday(&func_args->t1, NULL);

    for (int nl = 0; nl < iterations/2; nl++) {
        for (int i = 1; i < LEN_1D; i++) {
            a[i] = b[i-1] + c[i] * d[i];
            b[i] = a[i] + c[i] * e[i];
        }
        dummy(a, b, c, d, e, aa, bb, cc, 0.);
    }

    gettimeofday(&func_args->t2, NULL);
    return calc_checksum(__func__);
}
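
The latency argument can be made concrete with a deliberately simplified model.
It uses only the latencies quoted above (FMA 4, multiply 3, add 2) and assumes
that the multiplications do not depend on the chain, so they can be issued
ahead of time; throughput and register pressure are ignored. The function names
are hypothetical.

```cpp
// Latencies in cycles, as quoted in the commit message.
constexpr int fma_latency = 4;
constexpr int mul_latency = 3;
constexpr int add_latency = 2;

// Critical path of `links` dependent steps x = prev + p * q using FMA:
// every FMA has to wait for `prev`, so the full FMA latency is paid per step.
constexpr int chain_cycles_fma(int links)
{
  return links * fma_latency;
}

// Same chain with separate multiply and add: only the first multiply is
// exposed; later multiplies overlap with the chain, leaving one add per step.
constexpr int chain_cycles_split(int links)
{
  return links == 0 ? 0 : mul_latency + links * add_latency;
}
```

In this model a single step favours FMA (4 cycles against 5), while a chain of
ten dependent steps takes 40 cycles with FMA against 23 with separate
instructions. Each iteration of the inner loop in S323 contributes two such
dependent steps.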

gcc/ChangeLog:

* config/i386/x86-tune.def (X86_TUNE_AVOID_128FMA_CHAINS): Enable for
znver5.
(X86_TUNE_AVOID_256FMA_CHAINS): Likewise.
(X86_TUNE_AVOID_512FMA_CHAINS): Likewise.

(cherry picked from commit d6360b4083695970789fd65b9c515c11a5ce25b4)

commit | commitdiff | tree

GCC Administrator [Mon, 30 Sep 2024 00:19:50 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 29 Sep 2024 00:19:38 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

Richard Biener [Tue, 16 Jul 2024 08:45:27 +0000 (10:45 +0200)]

Fixup unaligned load/store cost for znver5

Currently unaligned YMM and ZMM load and store costs are cheaper than
aligned which causes the vectorizer to purposely mis-align accesses
by adding an alignment prologue. It looks like the unaligned costs
were simply copied from the bogus znver4 costs. The following makes
the unaligned costs equal to the aligned costs like in the fixed znver4
version.

* config/i386/x86-tune-costs.h (znver5_cost): Update unaligned
load and store cost from the aligned costs.

(cherry picked from commit 896393791ee34ffc176c87d232dfee735db3aaab)

commit | commitdiff | tree

Jan Hubicka [Mon, 18 Mar 2024 09:22:44 +0000 (10:22 +0100)]

Add AMD znver5 processor enablement with scheduler model

2024-02-14 Jan Hubicka <jh@suse.cz>
Karthiban Anbazhagan <Karthiban.Anbazhagan@amd.com>

gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver5.
* common/config/i386/i386-common.cc (processor_names): Add znver5.
(processor_alias_table): Likewise.
* common/config/i386/i386-cpuinfo.h (processor_types): Add new zen
family.
(processor_subtypes): Add znver5.
* config.gcc (x86_64-*-* |...): Likewise.
* config/i386/driver-i386.cc (host_detect_local_cpu): Let
march=native detect znver5 cpu's.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add
znver5.
* config/i386/i386-options.cc (m_ZNVER5): New definition.
(processor_cost_table): Add znver5.
* config/i386/i386.cc (ix86_reassociation_width): Likewise.
* config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER5.
(PTA_ZNVER5): New definition.
* config/i386/i386.md (define_attr "cpu"): Add znver5.
(Scheduling descriptions): Add znver5.md.
* config/i386/x86-tune-costs.h (znver5_cost): New definition.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver5.
(ix86_adjust_cost): Likewise.
* config/i386/x86-tune.def (avx512_move_by_pieces): Add m_ZNVER5.
(avx512_store_by_pieces): Add m_ZNVER5.
* doc/extend.texi: Add znver5.
* doc/invoke.texi: Likewise.
* config/i386/znver4.md: Rename to zn4zn5.md; combine znver4 and znver5 Scheduler.

gcc/testsuite/ChangeLog:
* g++.target/i386/mv29.C: Handle znver5 arch.
* gcc.target/i386/funcspec-56.inc: Likewise.

(cherry picked from commit d0aa0af9a9b7dd709a8c7ff6604ed6b7da0fc23a)

commit | commitdiff | tree

H.J. Lu [Wed, 25 Sep 2024 08:39:04 +0000 (16:39 +0800)]

x86: Don't use address override with segment register

Address override only applies to the (reg32) part in the thread address
fs:(reg32).  Don't rewrite thread address like

(set (reg:CCZ 17 flags)
    (compare:CCZ (reg:SI 98 [ __gmpfr_emax.0_1 ])
        (mem/c:SI (plus:SI (plus:SI (unspec:SI [
                            (const_int 0 [0])
                        ] UNSPEC_TP)
                    (reg:SI 107))
                (const:SI (unspec:SI [
                            (symbol_ref:SI ("previous_emax") [flags 0x1a] <var_decl 0x7fffe9a11cf0 previous_emax>)
                        ] UNSPEC_DTPOFF))) [1 previous_emax+0 S4 A32])))

if address override is used to avoid the invalid memory operand like

cmpl %fs:previous_emax@dtpoff(%eax), %r12d

gcc/

PR target/116839
* config/i386/i386.cc (ix86_rewrite_tls_address_1): Make it
static.  Return if TLS address is thread register plus an integer
register.

gcc/testsuite/

PR target/116839
* gcc.target/i386/pr116839.c: New file.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit c79cc30862d7255ca15884aa956d1ccfa279d86a)

commit | commitdiff | tree

GCC Administrator [Sat, 28 Sep 2024 00:20:43 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

Stefan Schulze Frielinghaus [Fri, 27 Sep 2024 10:45:42 +0000 (12:45 +0200)]

s390: Fix TF to FPRX2 conversion [PR115860]

Currently subregs originating from *tf_to_fprx2_0 and *tf_to_fprx2_1
survive register allocation.  This in turn leads to wrong register
renaming.  Keeping the current approach would mean we need two insns for
*tf_to_fprx2_0 and *tf_to_fprx2_1, respectively.  Something along the
lines

(define_insn "*tf_to_fprx2_0"
  [(set (subreg:DF (match_operand:FPRX2 0 "nonimmediate_operand" "=f") 0)
        (unspec:DF [(match_operand:TF 1 "general_operand" "v")]
                   UNSPEC_TF_TO_FPRX2_0))]
  "TARGET_VXE"
  "#")

(define_insn "*tf_to_fprx2_0"
  [(set (match_operand:DF 0 "nonimmediate_operand" "=f")
        (unspec:DF [(match_operand:TF 1 "general_operand" "v")]
                   UNSPEC_TF_TO_FPRX2_0))]
  "TARGET_VXE"
  "vpdi\t%v0,%v1,%v0,1"
  [(set_attr "op_type" "VRR")])

and similar for *tf_to_fprx2_1.  Note, pre register allocation operand 0
has mode FPRX2 and afterwards DF once subregs have been eliminated.

Since we always copy a whole vector register into a floating-point
register pair, another way to fix this is to merge *tf_to_fprx2_0 and
*tf_to_fprx2_1 into a single insn which means we don't have to use
subregs at all.  The downside of this is that the assembler template
contains two instructions, now.  The upside is that we don't have to
come up with some artificial insn before RA which might be more
readable/maintainable.  That is implemented by this patch.

In commit r11-4872-ge627cda5686592, the output operand specifier %V was
introduced which is used in tf_to_fprx2 only, now.  Instead of coming up
with its counterpart %F for floating-point registers, which would also
only be used in tf_to_fprx2, I print the operands directly.  This
renders %V unused which is why it is removed by this patch.

gcc/ChangeLog:

PR target/115860
* config/s390/s390.cc (print_operand): Remove operand specifier
%V.
* config/s390/s390.md (UNSPEC_TF_TO_FPRX2): New.
* config/s390/vector.md (*tf_to_fprx2_0): Remove.
(*tf_to_fprx2_1): Remove.
(tf_to_fprx2): New.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/long-double-asm-abi.c: Adapt
scan-assembler directive.
* gcc.target/s390/vector/long-double-to-i64.c: Adapt
scan-assembler directive.
* gcc.target/s390/pr115860-1.c: New test.

(cherry picked from commit 46c2538435dfc50dd5c67c4e03ce387d1f6ebe9b)

commit | commitdiff | tree

Stefan Schulze Frielinghaus [Fri, 27 Sep 2024 10:45:42 +0000 (12:45 +0200)]

s390: Fix AQ and AR constraints

Ensure for AQ and AR constraints that the resulting displacement after
adding any positive offset less than the size of the object being
referenced is still valid.
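
The check can be sketched as a range test. This assumes the short-displacement
form, where a displacement is an unsigned 12-bit value (0 to 4095); the names
below are hypothetical and the actual code in s390_mem_constraint may differ.

```cpp
// Assumption: a short displacement is an unsigned 12-bit value.
constexpr long max_short_disp = 4095;

constexpr bool disp_valid(long disp)
{
  return disp >= 0 && disp <= max_short_disp;
}

// Every offset in [0, size) may be added to the displacement, so it is
// enough to check the displacement itself and the largest possible sum.
constexpr bool offsettable_disp_valid(long disp, long size)
{
  return size > 0 && disp_valid(disp) && disp_valid(disp + size - 1);
}
```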

gcc/ChangeLog:

* config/s390/s390.cc (s390_mem_constraint): Check displacement
for AQ and AR constraints.

(cherry picked from commit 1a71ff3b89aadc7fa0af0bca269d74bb23c1a957)

commit | commitdiff | tree

GCC Administrator [Fri, 27 Sep 2024 00:20:45 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Thu, 26 Sep 2024 00:21:00 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Wed, 25 Sep 2024 00:20:31 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Tue, 24 Sep 2024 00:20:06 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Mon, 23 Sep 2024 00:19:53 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 22 Sep 2024 00:20:46 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 21 Sep 2024 00:19:46 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

Harald Anlauf [Thu, 5 Sep 2024 19:30:25 +0000 (21:30 +0200)]

Fortran: fix ICE in gfc_create_module_variable [PR100273]

gcc/fortran/ChangeLog:

PR fortran/100273
* trans-decl.cc (gfc_create_module_variable): Handle module
variable also when it is needed for the result specification
of a contained function.

gcc/testsuite/ChangeLog:

PR fortran/100273
* gfortran.dg/pr100273.f90: New test.

(cherry picked from commit 1f462b5072a5e82c35921f7e3bdf3959c4a49dc9)

commit | commitdiff | tree

GCC Administrator [Fri, 20 Sep 2024 17:37:53 +0000 (17:37 +0000)]

Daily bump.

commit | commitdiff | tree

Eric Botcazou [Fri, 20 Sep 2024 10:32:13 +0000 (12:32 +0200)]

Fix small thinko in IPA mod/ref pass

When a memory copy operation is analyzed by analyze_ssa_name, if both the
load and store are made through the same SSA name, the store is overlooked.

gcc/
* ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Always
process both the load and the store of a memory copy operation.

gcc/testsuite/
* gcc.dg/ipa/modref-4.c: New test.

commit | commitdiff | tree

Stefan Schulze Frielinghaus [Fri, 20 Sep 2024 12:08:32 +0000 (14:08 +0200)]

s390: Fix strict_low_part generation

In s390_expand_insv(), if generating code for ICM et al. src is a MEM
and gen_lowpart might force src into a register such that we end up with
patterns which do not match anymore.  Use adjust_address() instead in
order to preserve a MEM.

Furthermore, it is not straightforward to enforce a subreg.  For
example, in case of a paradoxical subreg, gen_lowpart() may return a
register.  In order to compensate for this, s390_gen_lowpart_subreg() emits
a reference to a pseudo which does not coincide with its definition
which is wrong.  Additionally, if dest is a paradoxical subreg, then do
not try to emit a strict_low_part since it could mean that dest was not
initialized even though this might be fixed up later by init-regs.

Splitters for insns *get_tp_64, *zero_extendhisi2_31,
*zero_extendqisi2_31, *zero_extendqihi2_31 are applied after reload.
Thus, operands[0] is a hard register and gen_lowpart (m, operands[0])
just returns the hard register for mode m which is fine to use as an
argument for strict_low_part, i.e., we do not need to enforce subregs
here since after reload subregs are supposed to be eliminated anyway.

This fixes gcc.dg/torture/pr111821.c.

gcc/ChangeLog:

* config/s390/s390-protos.h (s390_gen_lowpart_subreg): Remove.
* config/s390/s390.cc (s390_gen_lowpart_subreg): Remove.
(s390_expand_insv): Use adjust_address() and emit a
strict_low_part only in case of a natural subreg.
* config/s390/s390.md: Use gen_lowpart() instead of
s390_gen_lowpart_subreg().

(cherry picked from commit 9ebc9fbdddfe1ec85355b068354315a4da8e1ca0)

commit | commitdiff | tree

Haochen Jiang [Wed, 18 Sep 2024 03:20:15 +0000 (11:20 +0800)]

doc: Add more alias options and reorder Intel CPU -march documentation

This patch is backported from GCC15 with some tweaks.

Since r15-3539, there are requests coming in to add other alias option
documentation. This patch will add all of them, including corei7, corei7-avx,
core-avx-i, core-avx2, atom and slm.

Also in the patch, I reordered that part of the documentation; currently
the CPUs/products are all over the place. I regrouped them into
date-to-now products (from the very first CPU to the latest Panther Lake), P-core
(since the clients became hybrid cores, starting from Sapphire Rapids) and
E-core (since Bonnell). In GCC 14 and earlier, Xeon Phi CPUs are still
there, so I put them after the E-core CPUs.

And in the patch, I refined the product names in documentation.

gcc/ChangeLog:

* doc/invoke.texi: Add corei7, corei7-avx, core-avx-i,
core-avx2, atom, and slm. Reorder the -march documentation by
splitting them into date-to-now products, P-core, E-core and
Xeon Phi. Refine the product names in documentation.

commit | commitdiff | tree

GCC Administrator [Thu, 19 Sep 2024 00:20:51 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Wed, 18 Sep 2024 00:19:22 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

Marek Polacek [Mon, 16 Sep 2024 20:42:38 +0000 (16:42 -0400)]

c++: crash with anon VAR_DECL [PR116676]

r12-3495 added maybe_warn_about_constant_value which will crash if
it gets a nameless VAR_DECL, which is what happens in this PR.

We created this VAR_DECL in cp_parser_decomposition_declaration.

PR c++/116676

gcc/cp/ChangeLog:

* constexpr.cc (maybe_warn_about_constant_value): Check DECL_NAME.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-116676.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit dfe0d4389a3ce43179563a63046ad3e74d615a08)

commit | commitdiff | tree

GCC Administrator [Tue, 17 Sep 2024 00:18:27 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Mon, 16 Sep 2024 00:18:36 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

H.J. Lu [Fri, 6 Sep 2024 12:24:07 +0000 (05:24 -0700)]

x86-64: Don't use temp for argument in a TImode register

Don't use temp for a PARALLEL BLKmode argument of an EXPR_LIST expression
in a TImode register. Otherwise, the TImode variable will be put in
the GPR save area which guarantees only 8-byte alignment.

gcc/

PR target/116621
* config/i386/i386.cc (ix86_gimplify_va_arg): Don't use temp for
a PARALLEL BLKmode container of an EXPR_LIST expression in a
TImode register.

gcc/testsuite/

PR target/116621
* gcc.target/i386/pr116621.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit fa7bbb065c63aa802e0bbb04d605407dad58cf94)

commit | commitdiff | tree

GCC Administrator [Sun, 15 Sep 2024 00:18:06 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 14 Sep 2024 00:18:08 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Fri, 13 Sep 2024 00:19:01 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Thu, 12 Sep 2024 00:18:11 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Wed, 11 Sep 2024 00:19:52 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Tue, 10 Sep 2024 00:25:59 +0000 (00:25 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Mon, 9 Sep 2024 00:18:31 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 8 Sep 2024 00:20:33 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 7 Sep 2024 00:18:42 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Fri, 6 Sep 2024 00:19:57 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

H.J. Lu [Tue, 27 Aug 2024 20:11:39 +0000 (13:11 -0700)]

ipa: Don't disable function parameter analysis for fat LTO

Update analyze_parms not to disable function parameter analysis for
-ffat-lto-objects. Tested on x86-64, there are no differences in zstd
with "-O2 -flto=auto -g" vs "-O2 -flto=auto -g -ffat-lto-objects".

PR ipa/116410
* ipa-modref.cc (analyze_parms): Always analyze function parameter
for LTO.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 2f1689ea8e631ebb4ff3720d56ef0362f5898ff6)

commit | commitdiff | tree

GCC Administrator [Thu, 5 Sep 2024 00:20:00 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Wed, 4 Sep 2024 00:25:58 +0000 (00:25 +0000)]

Daily bump.

commit | commitdiff | tree

Haochen Jiang [Mon, 2 Sep 2024 07:00:22 +0000 (15:00 +0800)]

i386: Fix vfpclassph non-optimized intrin

The intrinsic for non-optimized builds has a typo in the mask type, which causes
the high bits of __mmask32 to be unexpectedly zeroed.
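
A sketch of why a wrong mask type loses bits. It assumes the typo was a
narrower mask type than __mmask32; mask8 and mask32 below are plain integer
stand-ins for illustration, not the intrinsic header's types.

```cpp
#include <cstdint>

using mask8 = std::uint8_t;    // stand-in for a narrow mask type
using mask32 = std::uint32_t;  // stand-in for __mmask32

// Passing a 32-bit mask through a narrower type keeps only the low bits.
constexpr mask32 through_narrow_type(mask32 m)
{
  return static_cast<mask8>(m);
}

// With the correct type all 32 bits survive.
constexpr mask32 through_correct_type(mask32 m)
{
  return m;
}
```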

The test does not fail under O0 with current 1b since the testcase is
wrong. We need to include avx512-mask-type.h after SIZE is defined, or
it will always be __mmask8. That problem also happened in AVX10.2 testcases.
I will write a separate patch to fix that.

gcc/ChangeLog:

* config/i386/avx512fp16intrin.h
(_mm512_mask_fpclass_ph_mask): Correct mask type to __mmask32.
(_mm512_fpclass_ph_mask): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-vfpclassph-1c.c: New test.

commit | commitdiff | tree

GCC Administrator [Tue, 3 Sep 2024 00:23:54 +0000 (00:23 +0000)]

Daily bump.

commit | commitdiff | tree

liuhongt [Thu, 29 Aug 2024 03:39:20 +0000 (11:39 +0800)]

Check avx upper register for parallel.

For function arguments/return, when it's BLK mode, it's put in a
parallel with an expr_list, and the expr_list contains the real mode
and registers.
Current ix86_check_avx_upper_register only checked for SSE_REG_P, and
failed to handle that. The patch extends the handling to each subrtx.

gcc/ChangeLog:

PR target/116512
* config/i386/i386.cc (ix86_check_avx_upper_register): Iterate
subrtx to scan for avx upper register.
(ix86_check_avx_upper_stores): Inline old
ix86_check_avx_upper_register.
(ix86_avx_u128_mode_needed): Ditto, and replace
FOR_EACH_SUBRTX with call to new
ix86_check_avx_upper_register.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116512.c: New test.

(cherry picked from commit ab214ef734bfc3dcffcf79ff9e1dd651c2b40566)

commit | commitdiff | tree

GCC Administrator [Mon, 2 Sep 2024 00:20:58 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 1 Sep 2024 00:29:35 +0000 (00:29 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 31 Aug 2024 00:20:04 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Fri, 30 Aug 2024 00:24:32 +0000 (00:24 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Thu, 29 Aug 2024 00:21:13 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Wed, 28 Aug 2024 00:21:21 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Mon, 26 Aug 2024 00:20:27 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 25 Aug 2024 00:20:22 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 24 Aug 2024 00:19:20 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Fri, 23 Aug 2024 00:18:43 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

liuhongt [Thu, 22 Aug 2024 06:31:40 +0000 (14:31 +0800)]

Fix testcase failure.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pieces-memcpy-10.c: Use -mmove-max=256 and
-mstore-max=256.
* gcc.target/i386/pieces-memcpy-6.c: Ditto.
* gcc.target/i386/pieces-memset-38.c: Ditto.
* gcc.target/i386/pieces-memset-40.c: Ditto.
* gcc.target/i386/pieces-memset-41.c: Ditto.
* gcc.target/i386/pieces-memset-42.c: Ditto.
* gcc.target/i386/pieces-memset-43.c: Ditto.
* gcc.target/i386/pieces-strcpy-2.c: Ditto.

(cherry picked from commit ea9c508927ec032c6d67a24df59ffa429e4d3d95)

commit | commitdiff | tree

liuhongt [Thu, 15 Aug 2024 04:54:07 +0000 (12:54 +0800)]

Align ix86_{move_max,store_max} with vectorizer.

When none of mprefer-vector-width, avx256_optimal/avx128_optimal,
avx256_store_by_pieces/avx512_store_by_pieces is specified, GCC sets
ix86_{move_max,store_max} to the maximum available vector length, except
in the AVX case.

      if (TARGET_AVX512F_P (opts->x_ix86_isa_flags)
          && TARGET_EVEX512_P (opts->x_ix86_isa_flags2))
        opts->x_ix86_move_max = PVW_AVX512;
      else
        opts->x_ix86_move_max = PVW_AVX128;

So for -mavx2, the vectorizer will choose 256-bit for vectorization, but
128-bit is used for struct copy, so there could be a potential STLF issue
due to this "misalignment".

The patch fixes that.
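
The intended default after the patch can be sketched as a small selection
function. This is an assumption-laden illustration: MoveMax and
default_move_max are hypothetical names, and the fallback to 128 bits when AVX
is unavailable is assumed rather than quoted from the patch.

```cpp
// Hypothetical stand-ins for PVW_AVX128, PVW_AVX256 and PVW_AVX512.
enum class MoveMax { avx128 = 128, avx256 = 256, avx512 = 512 };

// Pick the default width: 512-bit when AVX512F with EVEX512 is enabled,
// 256-bit when only AVX is enabled (128-bit before the patch), else 128-bit.
constexpr MoveMax default_move_max(bool has_avx512f_evex512, bool has_avx)
{
  return has_avx512f_evex512 ? MoveMax::avx512
       : has_avx             ? MoveMax::avx256
                             : MoveMax::avx128;
}
```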

gcc/ChangeLog:

* config/i386/i386-options.cc (ix86_option_override_internal):
Set ix86_{move_max,store_max} to PVW_AVX256 when TARGET_AVX
instead of PVW_AVX128.

gcc/testsuite/ChangeLog:
* gcc.target/i386/pieces-memcpy-10.c: Add -mprefer-vector-width=128.
* gcc.target/i386/pieces-memcpy-6.c: Ditto.
* gcc.target/i386/pieces-memset-38.c: Ditto.
* gcc.target/i386/pieces-memset-40.c: Ditto.
* gcc.target/i386/pieces-memset-41.c: Ditto.
* gcc.target/i386/pieces-memset-42.c: Ditto.
* gcc.target/i386/pieces-memset-43.c: Ditto.
* gcc.target/i386/pieces-strcpy-2.c: Ditto.
* gcc.target/i386/pieces-memcpy-22.c: New test.
* gcc.target/i386/pieces-memset-51.c: New test.
* gcc.target/i386/pieces-strcpy-3.c: New test.

(cherry picked from commit aea374238cec1a1e53fb79575d2f998e16926999)

commit | commitdiff | tree

GCC Administrator [Thu, 22 Aug 2024 00:19:44 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

Alexandre Oliva [Wed, 26 Jun 2024 05:08:18 +0000 (02:08 -0300)]

[testsuite] [arm] [vect] adjust mve-vshr test [PR113281]

The test was too optimistic, alas.  We used to vectorize shifts by
clamping the shift counts below the bit width of the types (e.g. at 15
for 16-bit vector elements), but (uint16_t)32768 >> (uint16_t)16 is
well defined (because of promotion to 32-bit int) and must yield 0,
not 1 (as before the fix).
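
The difference between the two behaviours can be reproduced in scalar code. This
sketch assumes int is at least 32 bits wide and shift counts below 32; the
function names are hypothetical.

```cpp
#include <cstdint>

// Scalar semantics: both operands are promoted to int before shifting, so a
// count of 16 is well defined and shifts every bit of a 16-bit value out.
constexpr std::uint16_t shr_promoted(std::uint16_t x, std::uint16_t n)
{
  return static_cast<std::uint16_t>(x >> n);
}

// Old vectorized behaviour as described above: the count is clamped to one
// less than the element width, which keeps the top bit for a count of 16.
constexpr std::uint16_t shr_clamped(std::uint16_t x, std::uint16_t n)
{
  return static_cast<std::uint16_t>(x >> (n > 15 ? 15 : n));
}
```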

Unfortunately, in the gimple model of vector units, such large shift
counts wouldn't be well-defined, so we won't vectorize such shifts any
more, unless we can tell they're in range or undefined.

So the test that expected the vectorization we no longer performed
needs to be adjusted.  Instead of nobbling the test, Richard Earnshaw
suggested annotating the test with the expected ranges so as to enable
the optimization, and Christophe Lyon suggested a further
simplification.

Co-Authored-By: Richard Earnshaw <Richard.Earnshaw@arm.com>
for  gcc/testsuite/ChangeLog

PR tree-optimization/113281
* gcc.target/arm/simd/mve-vshr.c: Add expected ranges.

(cherry picked from commit 54d2339c9f87f702e02e571a5460e11c19e1c02f)

GCC Administrator [Wed, 21 Aug 2024 00:20:08 +0000 (00:20 +0000)]

Daily bump.

GCC Administrator [Tue, 20 Aug 2024 00:19:54 +0000 (00:19 +0000)]

Daily bump.

GCC Administrator [Mon, 19 Aug 2024 00:19:05 +0000 (00:19 +0000)]

Daily bump.

GCC Administrator [Sun, 18 Aug 2024 00:18:49 +0000 (00:18 +0000)]

Daily bump.

GCC Administrator [Sat, 17 Aug 2024 00:18:53 +0000 (00:18 +0000)]

Daily bump.

Richard Sandiford [Fri, 16 Aug 2024 14:37:50 +0000 (15:37 +0100)]

aarch64: Fix bogus cnot optimisation [PR114603]

aarch64-sve.md had a pattern that combined:

cmpeq pb.T, pa/z, zc.T, #0
mov zd.T, pb/z, #1

into:

cnot zd.T, pa/m, zc.T

But this is only valid if pa.T is a ptrue. In other cases, the
original would set inactive elements of zd.T to 0, whereas the
combined form would copy elements from zc.T.
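
The difference can be modelled per lane in plain C. This is only a
sketch: LANES, cmpeq_then_mov and cnot_merging are hypothetical names,
and the merging form is modelled on the assumption that its merge input
is tied to zc, as the description above implies.

```c
#include <stdbool.h>
#include <stdint.h>

enum { LANES = 4 };  /* Illustrative lane count, not a real SVE vector length.  */

/* cmpeq pb, pa/z, zc, #0 followed by mov zd, pb/z, #1:
   lanes that are inactive in pa end up as 0.  */
void
cmpeq_then_mov (const bool pa[LANES], const int32_t zc[LANES],
                int32_t zd[LANES])
{
  for (int i = 0; i < LANES; i++)
    {
      bool pb = pa[i] && zc[i] == 0;
      zd[i] = pb ? 1 : 0;
    }
}

/* cnot zd, pa/m, zc, with the merge input assumed to be zc:
   lanes that are inactive in pa keep the value from zc.  */
void
cnot_merging (const bool pa[LANES], const int32_t zc[LANES],
              int32_t zd[LANES])
{
  for (int i = 0; i < LANES; i++)
    zd[i] = pa[i] ? (zc[i] == 0) : zc[i];
}
```

The two functions agree whenever every element of pa is true, and can
differ in any lane where pa is false and zc is nonzero.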

gcc/
PR target/114603
* config/aarch64/aarch64-sve.md (@aarch64_pred_cnot<mode>): Replace
with...
(@aarch64_ptrue_cnot<mode>): ...this, requiring operand 1 to be
a ptrue.
(*cnot<mode>): Require operand 1 to be a ptrue.
* config/aarch64/aarch64-sve-builtins-base.cc (svcnot_impl::expand):
Use aarch64_ptrue_cnot<mode> for _x operations that are predicated
with a ptrue. Represent other _x operations as fully-defined _m
operations.

gcc/testsuite/
PR target/114603
* gcc.target/aarch64/sve/acle/general/cnot_1.c: New test.

(cherry picked from commit 67cbb1c638d6ab3a9cb77e674541e2b291fb67df)

Richard Sandiford [Fri, 16 Aug 2024 14:37:50 +0000 (15:37 +0100)]

aarch64: Fix expansion of svsudot [PR114607]

Not sure how this happened, but: svsudot is supposed to be expanded
as USDOT with the operands swapped. However, a thinko in the
expansion of svsudot meant that the arguments weren't in fact
swapped; the attempted swap was just a no-op. And the testcases
blithely accepted that.
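
Why the operand order matters can be seen from a scalar model of one
32-bit lane. This is a sketch only: usdot_lane and sudot_lane are
hypothetical names, not GCC or ACLE functions, and the model ignores
everything except the signedness of the two multiplicands.

```c
#include <stdint.h>

/* Model of one 32-bit lane of USDOT: four unsigned bytes multiplied
   by four signed bytes and accumulated.  */
int32_t
usdot_lane (int32_t acc, const uint8_t u[4], const int8_t s[4])
{
  for (int i = 0; i < 4; i++)
    acc += (int32_t) u[i] * (int32_t) s[i];
  return acc;
}

/* svsudot takes the signed operand first, so it maps to USDOT with
   the two multiplicands swapped.  */
int32_t
sudot_lane (int32_t acc, const int8_t s[4], const uint8_t u[4])
{
  return usdot_lane (acc, u, s);
}
```

If the swap is skipped, the signed bytes are read as unsigned and vice
versa, so any input with a negative signed byte or an unsigned byte
above 127 gives a different sum.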

gcc/
PR target/114607
* config/aarch64/aarch64-sve-builtins-base.cc
(svusdot_impl::expand): Fix botched attempt to swap the operands
for svsudot.

gcc/testsuite/
PR target/114607
* gcc.target/aarch64/sve/acle/asm/sudot_s32.c: New test.

(cherry picked from commit 2c1c2485a4b1aca746ac693041e51ea6da5c64ca)

GCC Administrator [Fri, 16 Aug 2024 00:19:08 +0000 (00:19 +0000)]

Daily bump.

GCC Administrator [Thu, 15 Aug 2024 00:22:33 +0000 (00:22 +0000)]

Daily bump.

GCC Administrator [Wed, 14 Aug 2024 00:19:54 +0000 (00:19 +0000)]

Daily bump.

GCC Administrator [Tue, 13 Aug 2024 00:20:00 +0000 (00:20 +0000)]

Daily bump.

liuhongt [Wed, 24 Jul 2024 03:29:23 +0000 (11:29 +0800)]

Refine constraint "Bk" to define_special_memory_constraint.

For the pattern below, RA may still allocate r162 as a v/k register and
try to reload the address with leaq __libc_tsd_CTYPE_B@gottpoff(%rip), %rsi,
which results in a linker error.

(set (reg:DI 162)
     (mem/u/c:DI
       (const:DI (unspec:DI
[(symbol_ref:DI ("a") [flags 0x60]  <var_decl 0x7f621f6e1c60 a>)]
UNSPEC_GOTNTPOFF))

Quote from H.J. on why the linker issues an error:
>What do these do:
>
>        leaq    __libc_tsd_CTYPE_B@gottpoff(%rip), %rax
>        vmovq   (%rax), %xmm0
>
>From x86-64 TLS psABI:
>
>The assembler generates for the x@gottpoff(%rip) expressions a
>R_X86_64_GOTTPOFF relocation for the symbol x which requests the linker to
>generate a GOT entry with a R_X86_64_TPOFF64 relocation. The offset of
>the GOT entry relative to the end of the instruction is then used in
>the instruction. The R_X86_64_TPOFF64 relocation is processed at
>program startup time by the dynamic linker by looking up the symbol x
>in the modules loaded at that point. The offset is written in the GOT
>entry and later loaded by the addq instruction.
>
>The above code sequence looks wrong to me.

gcc/ChangeLog:

PR target/116043
* config/i386/constraints.md (Bk): Refine to
define_special_memory_constraint.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116043.c: New test.

(cherry picked from commit bc1fda00d5f20e2f3e77a50b2822562b6e0040b2)

GCC Administrator [Mon, 12 Aug 2024 00:20:04 +0000 (00:20 +0000)]

Daily bump.

GCC Administrator [Sun, 11 Aug 2024 00:19:37 +0000 (00:19 +0000)]

Daily bump.

GCC Administrator [Sat, 10 Aug 2024 00:20:01 +0000 (00:20 +0000)]

Daily bump.

GCC Administrator [Fri, 9 Aug 2024 00:20:31 +0000 (00:20 +0000)]

Daily bump.

GCC Administrator [Thu, 8 Aug 2024 00:20:19 +0000 (00:20 +0000)]

Daily bump.

GCC Administrator [Wed, 7 Aug 2024 00:18:20 +0000 (00:18 +0000)]

Daily bump.

John David Anglin [Tue, 6 Aug 2024 17:40:26 +0000 (13:40 -0400)]

hppa: Fix (plus (plus (mult (a) (mem_shadd_constant)) (b)) (c)) optimization

The constant C must be an integral multiple of the shift value in
the above optimization. Non-integral multiples can occur when evaluating
IMAGPART_EXPR when the shadd constant is 8 and we have SFmode.
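
The arithmetic behind this requirement can be sketched in C, assuming
the optimization folds the constant into the index by dividing it by
the scale. The function names are hypothetical and are not taken from
pa.cc.

```c
#include <stdbool.h>

/* Address before the rewrite: (a * scale + b) + c.  */
long
address_original (long a, long scale, long b, long c)
{
  return a * scale + b + c;
}

/* Assumed rewrite that folds c into the index:
   (a + c / scale) * scale + b.  */
long
address_folded (long a, long scale, long b, long c)
{
  return (a + c / scale) * scale + b;
}

/* The guard: folding only preserves the address when c is an
   integral multiple of scale.  */
bool
can_fold_constant (long scale, long c)
{
  return c % scale == 0;
}
```

With scale 8, a = 3 and b = 100, a constant of 16 gives 140 from both
functions. A constant of 4, the offset of the imaginary part of an
SFmode complex value, gives 128 from address_original but 124 from
address_folded, so the guard has to reject it.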

2024-08-06 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR target/113384
* config/pa/pa.cc (hppa_legitimize_address): Add check to
ensure constant is an integral multiple of the shift value.

Andrew Pinski [Sat, 3 Aug 2024 16:30:57 +0000 (09:30 -0700)]

sh: Don't call make_insn_raw in sh_recog_treg_set_expr [PR116189]

This was an interesting compare-debug failure to debug. The first symptom
was in gcse, which would create pseudo-registers in a different order. This
was caused by a different order of a hash table walk, due to the hash table
having a different number of entries, which in turn was due to the max insn
uid being different between the two runs. The difference in max insn uid came
from sh_recog_treg_set_expr, which is called via rtx_costs; fwprop would call
rtx_costs in some cases for debug insn related stuff.

Built and tested for sh4-linux-gnu.

PR target/116189

gcc/ChangeLog:

* config/sh/sh.cc (sh_recog_treg_set_expr): Don't call make_insn_raw,
make the insn with a fake uid.

gcc/testsuite/ChangeLog:

* c-c++-common/torture/pr116189-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 0355c943b9e954e8f59068971d934f1b91ecb729)

GCC Administrator [Tue, 6 Aug 2024 00:18:50 +0000 (00:18 +0000)]

Daily bump.

Paul Thomas [Fri, 19 Jul 2024 15:58:33 +0000 (16:58 +0100)]

libgomp: Remove bogus warnings from privatized-ref-2.f90.

2024-07-19 Paul Thomas <pault@gcc.gnu.org>

libgomp/ChangeLog

* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Cut
dg-note about 'a' and remove bogus warnings about its array
descriptor components being used uninitialized.

(cherry picked from commit 8d6994f33a98a168151a57a3d21395b19196cd9d)

Mirror of https://gcc.gnu.org/git/gcc.git
