git.ipfire.org Git - thirdparty/gcc.git/log
4 months agoDaily bump.
GCC Administrator [Mon, 10 Mar 2025 00:18:58 +0000 (00:18 +0000)] 
Daily bump.

4 months agoDaily bump.
GCC Administrator [Sun, 9 Mar 2025 00:19:08 +0000 (00:19 +0000)] 
Daily bump.

4 months agoDaily bump.
GCC Administrator [Sat, 8 Mar 2025 00:19:50 +0000 (00:19 +0000)] 
Daily bump.

4 months agoarm: Handle fixed PIC register in require_pic_register (PR target/115485)
Christophe Lyon [Mon, 3 Mar 2025 11:12:18 +0000 (11:12 +0000)] 
arm: Handle fixed PIC register in require_pic_register (PR target/115485)

Commit r9-4307-g89d7557202d25a forgot to accept a fixed PIC register
when extending the assert in require_pic_register.

arm_pic_register can be set explicitly by the user
(e.g. -mpic-register=r9) or implicitly as the default value with
-fpic/-fPIC/-fPIE and -mno-pic-data-is-text-relative -mlong-calls, and
we want to use/accept it when recording cfun->machine->pic_reg as used
to be the case.

PR target/115485
gcc/
* config/arm/arm.cc (require_pic_register): Fix typos in
comment. Handle fixed arm_pic_register.

gcc/testsuite/
* g++.target/arm/pr115485.C: New test.

(cherry picked from commit b1d0ac28de643e7c810e407a0668737131cdcc00)

4 months agoDaily bump.
GCC Administrator [Fri, 7 Mar 2025 00:19:12 +0000 (00:19 +0000)] 
Daily bump.

4 months agoDaily bump.
GCC Administrator [Thu, 6 Mar 2025 00:20:56 +0000 (00:20 +0000)] 
Daily bump.

5 months agotestsuite: Add tests for already fixed PR [PR119071]
Jakub Jelinek [Tue, 4 Mar 2025 08:52:22 +0000 (09:52 +0100)] 
testsuite: Add tests for already fixed PR [PR119071]

Uros' r15-7793 fixed this PR as well; I'm just committing the tests
from the PR so that it can be closed.

2025-03-04  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/119071
* gcc.dg/pr119071.c: New test.
* gcc.c-torture/execute/pr119071.c: New test.

(cherry picked from commit ccf9db9a6fa4b5bc7aad5e9603e2ac71984142a0)

5 months agocombine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]
Uros Bizjak [Wed, 12 Feb 2025 10:19:57 +0000 (11:19 +0100)] 
combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

The combine pass is trying to combine:

Trying 16, 22, 21 -> 23:
   16: r104:QI=flags:CCNO>0
   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC
   21: r119:QI=flags:CCNO<=0
      REG_DEAD flags:CCNO
   23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
      REG_DEAD r120:QI
      REG_DEAD r119:QI
      REG_UNUSED flags:CC

and creates the following two-insn sequence:

modifying insn i2    22: r104:QI=flags:CCNO>0
      REG_DEAD flags:CC
deferring rescan insn with uid = 22.
modifying insn i3    23: r110:QI=flags:CCNO<=0
      REG_DEAD flags:CC
deferring rescan insn with uid = 23.

where the REG_DEAD note in i2 is not correct, because the flags
register is still referenced in i3.  In the try_combine() megafunction,
we have this part:

--cut here--
    /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
    if (i3notes)
      distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, elim_i0);
    if (i2notes)
      distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, elim_i0);
    if (i1notes)
      distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
elim_i2, local_elim_i1, local_elim_i0);
    if (i0notes)
      distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, local_elim_i0);
    if (midnotes)
      distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, elim_i0);
--cut here--

where the compiler distributes the REG_UNUSED note from i2:

   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC

via distribute_notes() using the following:

--cut here--
  /* Otherwise, if this register is used by I3, then this register
     now dies here, so we must put a REG_DEAD note here unless there
     is one already.  */
  else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
   && ! (REG_P (XEXP (note, 0))
 ? find_regno_note (i3, REG_DEAD,
    REGNO (XEXP (note, 0)))
 : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
    {
      PUT_REG_NOTE_KIND (note, REG_DEAD);
      place = i3;
    }
--cut here--

The flags register is used in I3, but there is already a REG_DEAD note in I3.
The above condition therefore doesn't trigger, and control continues in the
"else" part, where the REG_DEAD note is put on I2.  The proposed solution
corrects the above logic to trigger every time the register is referenced in
I3, avoiding the "else" part.

PR rtl-optimization/118739

gcc/ChangeLog:

* combine.cc (distribute_notes) <case REG_UNUSED>: Correct the
logic when the register is used by I3.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118739.c: New test.

(cherry picked from commit a92dc3fe31c95d56019b2fb95a58414bca06241f)
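
The REG_UNUSED decision described in this commit can be sketched as a toy
model.  This is not GCC code: Place, NoteContext and both functions are
hypothetical names, and the "Dropped" outcome is an assumption about what
happens once the "else" part is avoided.

```cpp
// Toy model only: these types are hypothetical and are not GCC's RTL API.
enum class Place { I3AsRegDead, Dropped, ElsePath };

struct NoteContext {
  bool referenced_in_i3;  // stands in for reg_referenced_p (..., PATTERN (i3))
  bool i3_has_reg_dead;   // stands in for find_regno_note (i3, REG_DEAD, ...)
};

// Old shape: both tests share one condition, so a register that is
// referenced in I3 but already has a REG_DEAD note there falls through
// to the trailing "else" handling.
Place place_unused_note_old(const NoteContext &c) {
  if (c.referenced_in_i3 && !c.i3_has_reg_dead)
    return Place::I3AsRegDead;
  return Place::ElsePath;
}

// Assumed fixed shape: the branch is taken whenever the register is
// referenced in I3; the REG_DEAD check only decides whether a new note
// is needed (assumption: otherwise the note is not placed anywhere).
Place place_unused_note_new(const NoteContext &c) {
  if (c.referenced_in_i3)
    return c.i3_has_reg_dead ? Place::Dropped : Place::I3AsRegDead;
  return Place::ElsePath;
}
```

In this model the two versions differ only for the case from the dump
above, where the register is referenced in I3 and I3 already carries a
REG_DEAD note.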

5 months agoDaily bump.
GCC Administrator [Wed, 5 Mar 2025 00:22:21 +0000 (00:22 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Tue, 4 Mar 2025 00:20:32 +0000 (00:20 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Mon, 3 Mar 2025 00:20:33 +0000 (00:20 +0000)] 
Daily bump.

5 months agod: Fix comparing uninitialized memory in dstruct.d [PR116961]
Iain Buclaw [Fri, 28 Feb 2025 18:22:36 +0000 (19:22 +0100)] 
d: Fix comparing uninitialized memory in dstruct.d [PR116961]

Floating-point emulation in the D front-end is done via a type named
`struct longdouble`, which in GDC is a small interface around the
real_value type. Because the D code cannot include gcc/real.h directly,
a big enough buffer is used for the data instead.

On x86_64, this buffer is actually bigger than real_value itself, so
when a new longdouble object is created with

    longdouble r;
    real_from_string3 (&r.rv (), buffer, mode);
    return r;

there is uninitialized padding at the end of `r`.  This was never a
problem when D was implemented in C++ (until GCC 12) as comparing two
longdouble objects with `==' would be forwarded to the relevant
operator== overload that extracted the underlying real_value.

However, when the front-end was translated to D, such conditions were
instead rewritten into identity comparisons

    return exp.toReal() is CTFloat.zero

The `is` operator gets lowered as a call to `memcmp() == 0', which is
where the read of uninitialized memory occurs, as seen by valgrind.

==26778== Conditional jump or move depends on uninitialised value(s)
==26778==    at 0x911F41: dmd.dstruct._isZeroInit(dmd.expression.Expression) (dstruct.d:635)
==26778==    by 0x9123BE: StructDeclaration::finalizeSize() (dstruct.d:373)
==26778==    by 0x86747C: dmd.aggregate.AggregateDeclaration.determineSize(ref const(dmd.location.Loc)) (aggregate.d:226)
[...]

To avoid accidentally reading uninitialized data, explicitly initialize
all `longdouble` variables with an empty constructor on the C++ side of the
implementation before initializing the underlying real_value they hold.
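
A minimal sketch of the idea, assuming hypothetical stand-in types: Payload
plays the role of real_value and Boxed the role of the oversized longdouble
buffer.  Neither is the actual front-end type; the point is only that
zeroing the whole buffer first makes a bytewise comparison well defined.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical payload with no internal padding (two 32-bit fields).
struct Payload {
  std::int32_t exponent;
  std::int32_t mantissa;
};

// Hypothetical wrapper whose buffer is deliberately larger than Payload.
struct Boxed {
  unsigned char storage[16];
  // Zero every byte so the tail beyond the payload is never indeterminate.
  Boxed() { std::memset(storage, 0, sizeof storage); }
  void set(const Payload &p) { std::memcpy(storage, &p, sizeof p); }
};

static_assert(sizeof(Payload) < sizeof(Boxed), "buffer must exceed payload");

Boxed make_boxed(std::int32_t exponent, std::int32_t mantissa) {
  Boxed r;                             // all 16 bytes are zero here
  r.set(Payload{exponent, mantissa});  // only the first 8 bytes change
  return r;
}

// Bytewise identity, analogous to a memcmp-based `is` comparison.
bool identical(const Boxed &a, const Boxed &b) {
  return std::memcmp(a.storage, b.storage, sizeof a.storage) == 0;
}
```

Without the zeroing constructor, the trailing bytes of `storage` would be
indeterminate and the memcmp result could not be relied upon.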

PR d/116961

gcc/d/ChangeLog:

* d-codegen.cc (build_float_cst): Change new_value type from real_t to
real_value.
* d-ctfloat.cc (CTFloat::fabs): Default initialize the return value.
(CTFloat::ldexp): Likewise.
(CTFloat::parse): Likewise.
* d-longdouble.cc (longdouble::add): Likewise.
(longdouble::sub): Likewise.
(longdouble::mul): Likewise.
(longdouble::div): Likewise.
(longdouble::mod): Likewise.
(longdouble::neg): Likewise.
* d-port.cc (Port::isFloat32LiteralOutOfRange): Likewise.
(Port::isFloat64LiteralOutOfRange): Likewise.

gcc/testsuite/ChangeLog:

* gdc.dg/pr116961.d: New test.

(cherry picked from commit f7bc17ebc9ef89700672ed7125da719f3558f3b7)

5 months agoDaily bump.
GCC Administrator [Sun, 2 Mar 2025 00:20:29 +0000 (00:20 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Sat, 1 Mar 2025 13:06:08 +0000 (13:06 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Fri, 28 Feb 2025 00:20:33 +0000 (00:20 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Thu, 27 Feb 2025 00:20:42 +0000 (00:20 +0000)] 
Daily bump.

5 months agos390: Fix s390_valid_shift_count() for TI mode [PR118835]
Stefan Schulze Frielinghaus [Thu, 13 Feb 2025 08:13:06 +0000 (09:13 +0100)] 
s390: Fix s390_valid_shift_count() for TI mode [PR118835]

During combine we may end up with

(set (reg:DI 66 [ _6 ])
     (ashift:DI (reg:DI 72 [ x ])
                (subreg:QI (and:TI (reg:TI 67 [ _1 ])
                                   (const_wide_int 0x0aaaaaaaaaaaaaabf))
                           15)))

where the shift count operand does not trivially fit the scheme of
address operands.  Reject those operands, especially since
strip_address_mutations() expects expressions of the form
(and ... (const_int ...)) and fails for (and ... (const_wide_int ...)).

Thus, be more strict here and accept only CONST_INT operands.  Done by
replacing immediate_operand() with const_int_operand() which is enough
since the former only additionally checks for LEGITIMATE_PIC_OPERAND_P
and targetm.legitimate_constant_p which are always true for CONST_INT
operands.

While at it, fix indentation of the if block.

gcc/ChangeLog:

PR target/118835
* config/s390/s390.cc (s390_valid_shift_count): Reject shift
count operands which do not trivially fit the scheme of
address operands.

gcc/testsuite/ChangeLog:

* gcc.target/s390/pr118835.c: New test.

(cherry picked from commit ac9806dae30d07ab082ac341fe5646987753adcb)

5 months agoDaily bump.
GCC Administrator [Wed, 26 Feb 2025 00:20:22 +0000 (00:20 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Tue, 25 Feb 2025 00:20:11 +0000 (00:20 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Mon, 24 Feb 2025 00:20:39 +0000 (00:20 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Sun, 23 Feb 2025 00:19:56 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Sat, 22 Feb 2025 00:19:13 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Fri, 21 Feb 2025 00:19:33 +0000 (00:19 +0000)] 
Daily bump.

5 months agolibgcc: On FreeBSD use GCC's crt objects for static linking
Dimitry Andric [Tue, 28 Jan 2025 17:36:16 +0000 (18:36 +0100)] 
libgcc: On FreeBSD use GCC's crt objects for static linking

Add crtbeginT.o to extra_parts on FreeBSD. This ensures we use GCC's
crt objects for static linking. Otherwise, the link could mix crtbeginT.o
from the base system with libgcc's crtend.o, possibly leading to
segfaults.

libgcc:
PR target/118685
* config.host (*-*-freebsd*): Add crtbeginT.o to extra_parts.

Signed-off-by: Dimitry Andric <dimitry@andric.com>

5 months agoDaily bump.
GCC Administrator [Thu, 20 Feb 2025 00:19:13 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Wed, 19 Feb 2025 00:19:10 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Tue, 18 Feb 2025 00:19:17 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Mon, 17 Feb 2025 00:18:55 +0000 (00:18 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Sun, 16 Feb 2025 00:19:25 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Sat, 15 Feb 2025 00:18:50 +0000 (00:18 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Fri, 14 Feb 2025 00:19:31 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Thu, 13 Feb 2025 00:19:41 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Wed, 12 Feb 2025 00:19:33 +0000 (00:19 +0000)] 
Daily bump.

5 months agox86: Correct ASM_OUTPUT_SYMBOL_REF
H.J. Lu [Tue, 11 Feb 2025 05:47:54 +0000 (13:47 +0800)] 
x86: Correct ASM_OUTPUT_SYMBOL_REF

x is not a macro argument.  It only happens to work because final.cc
passes x as the 2nd argument:

final.cc:      ASM_OUTPUT_SYMBOL_REF (file, x);
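
The same class of bug can be shown with a pair of hypothetical macros
(TWICE_BAD and TWICE_GOOD are illustrative names, not part of i386.h).
The broken one ignores its parameter and silently picks up whatever `x`
is in scope at the call site.

```cpp
// Bug: the body names x, which is not a macro parameter, so the macro
// reads the caller's variable x and ignores VAL entirely.
#define TWICE_BAD(VAL) ((x) * 2)

// Fixed: the body uses the declared parameter.
#define TWICE_GOOD(VAL) ((VAL) * 2)

// Works only because the argument happens to be called x.
int bad_same_name(int x) { return TWICE_BAD(x); }

// Silently doubles x instead of y.
int bad_other_name(int x, int y) {
  (void)y;  // y is never read by the buggy macro
  return TWICE_BAD(y);
}

// Doubles y as intended.
int good_other_name(int x, int y) {
  (void)x;  // x is unused once the macro respects its parameter
  return TWICE_GOOD(y);
}
```

With these definitions, bad_other_name (1, 21) yields 2, while
good_other_name (1, 21) yields 42.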

PR target/118825
* config/i386/i386.h (ASM_OUTPUT_SYMBOL_REF): Replace x with
SYM.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 7317fc0b03380a83ad03a5fc4fabef5f38c44c9d)

5 months agoDaily bump.
GCC Administrator [Tue, 11 Feb 2025 00:19:35 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Mon, 10 Feb 2025 00:19:20 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Sun, 9 Feb 2025 00:18:50 +0000 (00:18 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Sat, 8 Feb 2025 00:19:15 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Fri, 7 Feb 2025 00:20:42 +0000 (00:20 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Thu, 6 Feb 2025 00:19:58 +0000 (00:19 +0000)] 
Daily bump.

5 months agoDaily bump.
GCC Administrator [Wed, 5 Feb 2025 00:20:24 +0000 (00:20 +0000)] 
Daily bump.

5 months agooptions: Adjust cl_optimization_compare to avoid checking ICE [PR115913]
Lewis Hyatt [Sun, 26 Jan 2025 23:57:00 +0000 (18:57 -0500)] 
options: Adjust cl_optimization_compare to avoid checking ICE [PR115913]

At the end of a sequence like:
 #pragma GCC push_options
 ...
 #pragma GCC pop_options

the handler for pop_options calls cl_optimization_compare() (as generated by
optc-save-gen.awk) to make sure that all global state has been restored to
the value it had prior to the push_options call. The verification is
performed for almost all entries in the global_options struct. This leads to
unexpected checking asserts, as discussed in the PR, in case the state of
warnings-related options has been intentionally modified in between
push_options and pop_options via a call to #pragma GCC diagnostic. Address
that by skipping the verification for CL_WARNING-flagged options.
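
As a rough illustration of the check being relaxed, here is a toy model
of a save/compare pass that skips warning-flagged entries.  Everything in
it is hypothetical (OptionState, kWarningFlag, the option names); it is
not the code generated by optc-save-gen.awk.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flag standing in for CL_WARNING.
constexpr unsigned kWarningFlag = 1u << 0;

struct OptionState {
  std::string name;
  unsigned flags;
  int value;
};

// Count entries whose value differs between the saved and current state,
// optionally ignoring entries that carry the warning flag.
std::size_t count_mismatches(const std::vector<OptionState> &saved,
                             const std::vector<OptionState> &current,
                             bool skip_warning_options) {
  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < saved.size() && i < current.size(); ++i) {
    if (skip_warning_options && (saved[i].flags & kWarningFlag))
      continue;
    if (saved[i].value != current[i].value)
      ++mismatches;
  }
  return mismatches;
}

// Models a warning option changed between push_options and pop_options.
std::size_t demo_mismatches(bool skip_warning_options) {
  const std::vector<OptionState> saved = {
      {"optimize", 0u, 2}, {"warn_example", kWarningFlag, 0}};
  const std::vector<OptionState> current = {
      {"optimize", 0u, 2}, {"warn_example", kWarningFlag, 1}};
  return count_mismatches(saved, current, skip_warning_options);
}
```

In this model the strict comparison reports one mismatch, while the
comparison that skips warning-flagged entries reports none.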

gcc/ChangeLog:

PR middle-end/115913
* optc-save-gen.awk (cl_optimization_compare): Skip options with
CL_WARNING flag.

gcc/testsuite/ChangeLog:

PR middle-end/115913
* c-c++-common/cpp/pr115913.c: New test.

5 months agoDaily bump.
GCC Administrator [Tue, 4 Feb 2025 00:19:57 +0000 (00:19 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Mon, 3 Feb 2025 00:19:35 +0000 (00:19 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Sun, 2 Feb 2025 00:19:46 +0000 (00:19 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Sat, 1 Feb 2025 00:20:04 +0000 (00:20 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Fri, 31 Jan 2025 00:21:40 +0000 (00:21 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Thu, 30 Jan 2025 00:21:03 +0000 (00:21 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Wed, 29 Jan 2025 00:21:37 +0000 (00:21 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Tue, 28 Jan 2025 00:24:05 +0000 (00:24 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Mon, 27 Jan 2025 00:21:19 +0000 (00:21 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Sun, 26 Jan 2025 00:20:34 +0000 (00:20 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Sat, 25 Jan 2025 00:22:01 +0000 (00:22 +0000)] 
Daily bump.

6 months agors6000: Fix ICE for invalid constants in built-in functions
Peter Bergner [Thu, 16 Jan 2025 16:53:27 +0000 (10:53 -0600)] 
rs6000: Fix ICE for invalid constants in built-in functions

For invalid constant operand values used in built-in functions, return
const0_rtx to signify an error occurred during expansion.

2025-01-16  Peter Bergner  <bergner@linux.ibm.com>

gcc/
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Return
const0_rtx when there is an error.

gcc/testsuite/
* gcc.target/powerpc/mma-builtin-error.c: New test.

(cherry picked from commit 0696af74b3392e2178215607337b116d1bb53e34)

6 months agors6000: Fix loop limit for built-in constant checking
Peter Bergner [Thu, 16 Jan 2025 16:49:45 +0000 (10:49 -0600)] 
rs6000: Fix loop limit for built-in constant checking

The loop checking for built-in constant operand restrictions was missing
some operands due to the loop limit being too small.  Fixing that exposed
a testsuite failure which is caused by a typo in the pmxvi4ger8pp definition
where we had made the PMASK field too small.

2025-01-16  Peter Bergner  <bergner@linux.ibm.com>

gcc/
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Use correct
array size for the loop limit.
* config/rs6000/rs6000-builtins.def: Fix field size for PMASK operand.

(cherry picked from commit 1a2d63a78f99b7fdc2eff5bf9065682d5bbbaaca)

6 months agoDaily bump.
GCC Administrator [Fri, 24 Jan 2025 00:21:29 +0000 (00:21 +0000)] 
Daily bump.

6 months agohppa: Fix typo in ADDITIONAL_REGISTER_NAMES in pa32-regs.h
John David Anglin [Thu, 23 Jan 2025 19:35:22 +0000 (14:35 -0500)] 
hppa: Fix typo in ADDITIONAL_REGISTER_NAMES in pa32-regs.h

2025-01-23  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa32-regs.h (ADDITIONAL_REGISTER_NAMES): Change
register 86 name to "%fr31L".

6 months agoDaily bump.
GCC Administrator [Thu, 23 Jan 2025 00:21:48 +0000 (00:21 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Wed, 22 Jan 2025 00:22:59 +0000 (00:22 +0000)] 
Daily bump.

6 months agod: Fix ICE in build_deref, at d/d-codegen.cc:1650 [PR111650]
Iain Buclaw [Fri, 19 Apr 2024 08:51:12 +0000 (10:51 +0200)] 
d: Fix ICE in build_deref, at d/d-codegen.cc:1650 [PR111650]

PR d/111650

gcc/d/ChangeLog:

* decl.cc (get_fndecl_arguments): Move generation of frame type to ...
(DeclVisitor::visit (FuncDeclaration *)): ... here, after the call to
build_closure.

gcc/testsuite/ChangeLog:

* gdc.dg/pr111650.d: New test.

(cherry picked from commit 4d4929fe0654d51b52a2bf6e6188d7aad0bf17ac)

6 months agoZen5 tuning part 2: disable gather and scatter
Jan Hubicka [Tue, 3 Sep 2024 13:07:41 +0000 (15:07 +0200)] 
Zen5 tuning part 2: disable gather and scatter

We disable gathers for zen4.  Gather seems to have improved a bit compared
to zen4, and the Zen5 optimization manual suggests "Avoid GATHER instructions
when the indices are known ahead of time. Vector loads followed by shuffles
result in a higher load bandwidth."  However, the situation seems to be more
complicated.

Gather is a 5-10% loss on the parest benchmark and a 30% loss on sparse dot
products in TSVC.  Curiously enough, breaking these out into microbenchmarks
reversed the situation, and it turns out that the performance depends on
how the indices are distributed.  Gather is a loss if the indices are
sequential, neutral if they are random, and a win for some strides (4, 8).

This seems to be similar to earlier zens, so I think (especially for
backporting znver5 support) that it makes sense to be consistent and disable
gather unless we work out a good heuristic for when to use it.  Since we
typically do not know the indices in advance, I don't see how that can be done.

I opened PR116582 with some examples of wins and losses.

gcc/ChangeLog:

* config/i386/x86-tune.def (X86_TUNE_USE_GATHER_2PARTS): Disable for
ZNVER5.
(X86_TUNE_USE_SCATTER_2PARTS): Disable for ZNVER5.
(X86_TUNE_USE_GATHER_4PARTS): Disable for ZNVER5.
(X86_TUNE_USE_SCATTER_4PARTS): Disable for ZNVER5.
(X86_TUNE_USE_GATHER_8PARTS): Disable for ZNVER5.
(X86_TUNE_USE_SCATTER_8PARTS): Disable for ZNVER5.

(cherry picked from commit d82edbe92eed53a479736fcbbe6d54d0fb42daa4)

6 months agoDaily bump.
GCC Administrator [Tue, 21 Jan 2025 00:20:51 +0000 (00:20 +0000)] 
Daily bump.

6 months agod: Fix failing test with 32-bit compiler [PR114434]
Iain Buclaw [Mon, 20 Jan 2025 19:01:03 +0000 (20:01 +0100)] 
d: Fix failing test with 32-bit compiler [PR114434]

The introduction of gdc.test/runnable/test23514.d exposed an incorrect
compilation when adding a 64-bit constant to a link-time address.  The
current cast to size_t causes a loss of precision, which can result in
incorrect compilation.
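
The loss of precision can be illustrated in isolation.  In this sketch
std::uint32_t stands in for size_t on a host where size_t is 32 bits wide,
and std::uint64_t stands in for dinteger_t; both function names are
hypothetical.

```cpp
#include <cstdint>

// Models the old behaviour: the 64-bit offset is narrowed to a 32-bit
// size type first, which discards the upper 32 bits.
std::uint64_t offset_via_narrow_cast(std::uint64_t offset) {
  std::uint32_t narrowed = static_cast<std::uint32_t>(offset);
  return narrowed;
}

// Models the fix: the offset stays in a 64-bit integer type throughout.
std::uint64_t offset_via_wide_type(std::uint64_t offset) {
  return offset;
}
```

For an offset of 0x100000004 the narrowed path yields 4, while the wide
path preserves the full value.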

PR d/114434

gcc/d/ChangeLog:

* expr.cc (ExprVisitor::visit (PtrExp *)): Get the offset as a
dinteger_t rather than a size_t.
(ExprVisitor::visit (SymOffExp *)): Likewise.

gcc/testsuite/ChangeLog:

* gdc.test/runnable/test23514.d: New test.

(cherry picked from commit 9ab38952a2033d6d4a8e31c3c4d2ab1a25a406c6)

6 months agoi386: Disable SImode/DImode moves from/to mask regs without avx512bw [PR118067]
Uros Bizjak [Mon, 20 Jan 2025 15:19:43 +0000 (16:19 +0100)] 
i386: Disable SImode/DImode moves from/to mask regs without avx512bw [PR118067]

SImode and DImode moves from/to mask registers are valid only with AVX512BW,
so mark relevant alternatives in *movsi_internal and *movdi_internal as such.

PR target/118067

gcc/ChangeLog:

* config/i386/i386.md (*movdi_internal):
Disable alternatives from/to mask registers without AVX512BW.
(*movsi_internal): Ditto.

6 months agoc++: Friend classes don't shadow enclosing template class parameter [PR118255]
Simon Martin [Sun, 5 Jan 2025 09:36:47 +0000 (10:36 +0100)] 
c++: Friend classes don't shadow enclosing template class parameter [PR118255]

We currently reject the following code

=== code here ===
template <int non_template> struct S { friend class non_template; };
class non_template {};
S<0> s;
=== code here ===

While EDG agrees with the current behaviour, clang and MSVC don't (see
https://godbolt.org/z/69TGaabhd), and I believe that this code is valid,
since the friend clause does not actually declare a type, so it cannot
shadow anything. The fact that we didn't error out if the non_template
class was declared before S backs this up as well.

This patch fixes this by skipping the call to check_template_shadow for
hidden bindings.

PR c++/118255

gcc/cp/ChangeLog:

* name-lookup.cc (pushdecl): Don't call check_template_shadow
for hidden bindings.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/pr99116-1.C: Adjust test expectation.
* g++.dg/template/friend84.C: New test.

(cherry picked from commit b5a069203fc074ab75d994c4a7e0f2db6a0a00fd)

6 months agoDaily bump.
GCC Administrator [Mon, 20 Jan 2025 00:20:46 +0000 (00:20 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Sun, 19 Jan 2025 00:20:22 +0000 (00:20 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Sat, 18 Jan 2025 00:20:52 +0000 (00:20 +0000)] 
Daily bump.

6 months agoFix setting of call graph node AutoFDO count
Eugene Rozenfeld [Sat, 11 Jan 2025 03:48:52 +0000 (19:48 -0800)] 
Fix setting of call graph node AutoFDO count

We are initializing both the call graph node count and
the entry block count of the function with the head_count value
from the profile.

The count propagation algorithm may refine the entry block count
and we may end up with a case where the call graph node count
is set to zero but the entry block count is non-zero. That becomes
a problem because we have this code in execute_fixup_cfg:

 profile_count num = node->count;
 profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
 bool scale = num.initialized_p () && !(num == den);

Here if num is 0 but den is not 0, scale becomes true and we
lose the counts in

if (scale)
  bb->count = bb->count.apply_scale (num, den);

This is what happened in the issue reported in PR116743
(a 10% regression in MySQL HAMMERDB tests).
3d9e6767939e9658260e2506e81ec32b37cba041 made an improvement in
AutoFDO count propagation, which caused a mismatch between
the call graph node count (zero) and the entry block count (non-zero)
and subsequent loss of counts as described above.

The fix is to update the call graph node count once we've done count propagation.
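
The effect of the mismatch can be reproduced with a toy model that uses
plain integers instead of profile_count.  The function names below are
hypothetical and the initialized_p () check is omitted for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Integer stand-in for profile_count::apply_scale: count * num / den.
std::uint64_t apply_scale(std::uint64_t count, std::uint64_t num,
                          std::uint64_t den) {
  assert(den != 0);
  return count * num / den;
}

// Mirrors the shape of the snippet above: scale every block count when the
// call graph node count (num) differs from the entry block count (den).
std::vector<std::uint64_t> fixup_counts(std::vector<std::uint64_t> bb_counts,
                                        std::uint64_t node_count,
                                        std::uint64_t entry_count) {
  const bool scale = node_count != entry_count;
  if (scale)
    for (std::uint64_t &c : bb_counts)
      c = apply_scale(c, node_count, entry_count);
  return bb_counts;
}
```

With a node count of 0 and an entry count of 100, every block count is
scaled to 0.  Once the node count is updated to match the entry count,
no scaling happens and the counts survive.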

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
PR gcov-profile/116743
* auto-profile.cc (afdo_annotate_cfg): Fix mismatch between the call graph node count
and the entry block count.

(cherry picked from commit e683c6b029f809c7a1981b4341c95d9652c22e18)

6 months agoc++: Allow pragmas in NSDMIs [PR118147]
Nathaniel Shead [Fri, 20 Dec 2024 11:09:39 +0000 (22:09 +1100)] 
c++: Allow pragmas in NSDMIs [PR118147]

This patch removes the (unnecessary) CPP_PRAGMA_EOL case from
cp_parser_cache_defarg, which currently has the result that any pragmas
in the NSDMI cause an error.

PR c++/118147

gcc/cp/ChangeLog:

* parser.cc (cp_parser_cache_defarg): Don't error when
CPP_PRAGMA_EOL.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-defer7.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
(cherry picked from commit f3ccc57e5f044031a1b07e79330de9220e93afe7)

6 months agotree-optimization/117417 - ICE with complex load optimization
Richard Biener [Tue, 12 Nov 2024 10:15:15 +0000 (11:15 +0100)] 
tree-optimization/117417 - ICE with complex load optimization

When we decompose a complex load only used as real and imaginary
parts we fail to honor IL constraints which are that a BIT_FIELD_REF
of register type should be outermost in a ref.  The following
simply avoids the transform when the complex load has such a
BIT_FIELD_REF.

PR tree-optimization/117417
* tree-ssa-forwprop.cc (pass_forwprop::execute): Avoid
decomposing BIT_FIELD_REF complex load.

* gcc.dg/torture/pr117417.c: New testcase.

(cherry picked from commit d976daa931642d940b7b27032ca6139210c07eed)

6 months agotree-optimization/117307 - STMT_VINFO_SLP_VECT_ONLY mis-computation
Richard Biener [Mon, 28 Oct 2024 08:52:08 +0000 (09:52 +0100)] 
tree-optimization/117307 - STMT_VINFO_SLP_VECT_ONLY mis-computation

STMT_VINFO_SLP_VECT_ONLY isn't properly computed as union of all
group members and when the group is later split due to duplicates
not all sub-groups inherit the flag.

PR tree-optimization/117307
* tree-vect-data-refs.cc (vect_analyze_data_ref_accesses):
Properly compute STMT_VINFO_SLP_VECT_ONLY.  Set it on all
parts of a split group.

* gcc.dg/vect/pr117307.c: New testcase.

(cherry picked from commit 19722308a286d9a00eead8ac82b948da8c4ca38b)

6 months agotree-optimization/117254 - ICE with access diagnostics
Richard Biener [Tue, 22 Oct 2024 09:46:47 +0000 (11:46 +0200)] 
tree-optimization/117254 - ICE with access diagnostics

The diagnostics code fails to handle non-constant domain max.

PR tree-optimization/117254
* gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg):
Check the array domain max is constant before using it.

* gcc.dg/pr117254.c: New testcase.

(cherry picked from commit d464a52d0678dfea523a60efe8b792ba1b8d40db)

6 months agotree-optimization/117104 - add missed guards to max(a,b) != a simplification
Richard Biener [Sat, 12 Oct 2024 12:51:37 +0000 (14:51 +0200)] 
tree-optimization/117104 - add missed guards to max(a,b) != a simplification

For vector types we have to make sure the comparison result is a vector
type and the resulting compare operation is supported.  As the resulting
compare is never an equality compare I didn't bother to check for the
cbranch case.

PR tree-optimization/117104
* match.pd ((cmp:c (minmax:c @0 @1) @0) -> (out @0 @1)): Properly
guard the vector case.

* gcc.dg/pr117104.c: New testcase.

(cherry picked from commit f54d42e00007e7a558b273d87f95b3e5b1938f5a)

6 months agomatch.pd: Further fma negation fixes [PR116891]
Jakub Jelinek [Tue, 15 Oct 2024 17:38:46 +0000 (19:38 +0200)] 
match.pd: Further fma negation fixes [PR116891]

On Mon, Oct 14, 2024 at 08:53:29AM +0200, Jakub Jelinek wrote:
> >     PR middle-end/116891
> >     * match.pd ((negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)):
> >     Only enable for !HONOR_SIGN_DEPENDENT_ROUNDING.
>
> Guess it would be nice to have a testcase which FAILs without the patch and
> PASSes with it, but it can be added later.

I've added such a testcase now, and additionally found the fix only fixed
one of the 4 problematic similar cases.

Here is a patch which fixes the others too and adds the testcases.
fma-pr116891.c FAILed without your patch, FAILs with your patch too (but
only due to the bar/baz/qux checks) and PASSes with the patch.

2024-10-15  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/116891
* match.pd ((negate (fmas@3 @0 @1 @2)) -> (IFN_FNMS @0 @1 @2)):
Only enable for !HONOR_SIGN_DEPENDENT_ROUNDING.
((negate (IFN_FMS@3 @0 @1 @2)) -> (IFN_FNMA @0 @1 @2)): Likewise.
((negate (IFN_FNMA@3 @0 @1 @2)) -> (IFN_FMS @0 @1 @2)): Likewise.

* gcc.dg/pr116891.c: New test.
* gcc.target/i386/fma-pr116891.c: New test.

(cherry picked from commit 4366f0c7e296ea0d7279343c9b0a1d597588a1da)

6 months agomiddle-end/116891 - fix (negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)
Richard Biener [Mon, 14 Oct 2024 06:11:22 +0000 (08:11 +0200)] 
middle-end/116891 - fix (negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)

Transforming -fma (-a, b, -c) to fma (a, b, c) is only valid when
not rounding towards -inf or +inf as the sign of the multiplication
changes.
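
Why the sign matters can be seen in a toy model where "rounding" means
rounding to an integer towards +inf (std::ceil) rather than rounding to
the nearest representable float.  The function names are hypothetical and
this only illustrates the asymmetry, not the actual FMA semantics.

```cpp
#include <cmath>

// Toy directed rounding: round the exact result up to an integer.
double round_up(double exact) { return std::ceil(exact); }

// fma (a, b, c) under the toy rounding: round_up (a*b + c).
double fma_up(double a, double b, double c) { return round_up(a * b + c); }

// fnms (a, b, c) = fma (-a, b, -c) under the toy rounding:
// round_up (-(a*b) - c).
double fnms_up(double a, double b, double c) {
  return round_up(-(a * b) - c);
}

// The expression the transform would replace with fma_up.
double negated_fnms_up(double a, double b, double c) {
  return -fnms_up(a, b, c);
}
```

For a = b = 0.5 and c = 0 the exact value is 0.25: fma_up gives 1, but
negated_fnms_up gives 0, because rounding -0.25 upwards and then negating
is the same as rounding 0.25 downwards.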

PR middle-end/116891
* match.pd ((negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)):
Only enable for !HONOR_SIGN_DEPENDENT_ROUNDING.

(cherry picked from commit c53bd48c6920bc1f4039b6682aafbf414a600e47)

6 months agotree-optimization/116768 - wrong dependence analysis
Richard Biener [Thu, 19 Sep 2024 12:58:18 +0000 (14:58 +0200)] 
tree-optimization/116768 - wrong dependence analysis

The following reverts a bogus fix done for PR101009 and instead makes
sure we get into the same_access_functions () case when computing
the distance vector for g[1] and g[1] where the constants ended up
having different types.  The generic code doesn't seem to handle
loop invariant dependences.  The special case gets us both
( 0 ) and ( 1 ) as distance vectors while formerly we got ( 1 ),
which the PR101009 fix changed to ( 0 ) with bad effects on other
cases as shown in this PR.

PR tree-optimization/116768
* tree-data-ref.cc (build_classic_dist_vector_1): Revert
PR101009 change.
* tree-chrec.cc (eq_evolutions_p): Make sure (sizetype)1
and (int)1 compare equal.

* gcc.dg/torture/pr116768.c: New testcase.

(cherry picked from commit 5b5a36b122e1205449f1512bf39521b669e713ef)

6 months agotree-optimization/116290 - fix compare-debug issue in ldist
Richard Biener [Sun, 13 Oct 2024 13:12:44 +0000 (15:12 +0200)] 
tree-optimization/116290 - fix compare-debug issue in ldist

Loop distribution does different analysis with -g0/-g due to counting
a debug stmt starting a BB against a limit, which will eventually
lead to different IVOPTs choices.  I've fixed a possible IVOPTs
issue on the way even though it doesn't make a difference here.

PR tree-optimization/116290
* tree-loop-distribution.cc (determine_reduction_stmt_1): PHIs
have no debug variants.  Start with first non-debug real stmt.
* tree-ssa-loop-ivopts.cc (find_givs_in_bb): Do not analyze
debug stmts.

* gcc.dg/pr116290.c: New testcase.

(cherry picked from commit 566740013b3445162b8c4bc2205e4e568d014968)

6 months agomiddle-end/69482 - not preserving volatile accesses
Richard Biener [Mon, 9 Jan 2023 11:46:28 +0000 (12:46 +0100)] 
middle-end/69482 - not preserving volatile accesses

The following addresses a long-standing issue with not preserving
accesses to non-volatile objects through volatile-qualified
pointers in the case that the object gets expanded to a register.  The
fix is to treat accesses to an object with a volatile-qualified
access as forcing that object to memory.  This issue has become more
exposed recently, so it has regressed further since GCC 11.
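
The kind of code affected looks roughly like the following sketch.  It is
not one of the PR testcases, and the function name is made up for
illustration.

```cpp
// The object itself is not volatile, but every access below goes through
// a volatile-qualified pointer, so each store and the final load are
// volatile accesses that the compiler is expected to preserve.
int store_and_reload_through_volatile() {
  int object = 1;
  volatile int *p = &object;
  *p = 2;     // first volatile store
  *p = 3;     // second volatile store, must not be merged with the first
  return *p;  // volatile load
}
```

The returned value is 3 either way; the difference the fix makes is only
visible in the generated code, where the accesses should remain as memory
operations instead of being folded into a register.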

PR middle-end/69482
* cfgexpand.cc (discover_nonconstant_array_refs_r): Volatile
qualified accesses also force objects to memory.

* gcc.target/i386/pr69482-1.c: New testcase.
* gcc.target/i386/pr69482-2.c: Likewise.

(cherry picked from commit a5a8242153d078f1ebe60f00409415da260a29ee)

6 months agoDaily bump.
GCC Administrator [Fri, 17 Jan 2025 00:22:41 +0000 (00:22 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Thu, 16 Jan 2025 00:22:28 +0000 (00:22 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Wed, 15 Jan 2025 00:19:25 +0000 (00:19 +0000)] 
Daily bump.

6 months agoZen5 tuning part 5: update instruction latencies in x86-tune-costs
Jan Hubicka [Wed, 4 Sep 2024 07:19:08 +0000 (09:19 +0200)] 
Zen5 tuning part 5: update instruction latencies in x86-tune-costs

There is nothing exciting in this patch.  I measured latencies and also compared
them with the newly released optimization guide.  There are no dramatic changes
compared to zen4.  One interesting new bit is that addss is faster and can be
2 cycles when fed by another addss.

I also increased the large insn bound since the decoders no longer seem to
require instructions to be 8 bytes or less.

gcc/ChangeLog:

* config/i386/x86-tune-costs.h (znver5_cost): Update instruction
costs.

(cherry picked from commit 4292297a0f938ffc953422fa246ff00fe345fe3d)

6 months agoDaily bump.
GCC Administrator [Tue, 14 Jan 2025 00:20:26 +0000 (00:20 +0000)] 
Daily bump.

6 months agoFortran: Cray pointer comparison wrongly optimized away [PR106692]
Harald Anlauf [Thu, 2 Jan 2025 19:22:23 +0000 (20:22 +0100)] 
Fortran: Cray pointer comparison wrongly optimized away [PR106692]

PR fortran/106692

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_expr_op): Inhibit excessive optimization
of Cray pointers by treating them as volatile in comparisons.

gcc/testsuite/ChangeLog:

* gfortran.dg/cray_pointers_13.f90: New test.

(cherry picked from commit c7754a2fb2e60987524947fe189f3ffac035ea1d)

6 months agoDaily bump.
GCC Administrator [Mon, 13 Jan 2025 00:19:09 +0000 (00:19 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Sun, 12 Jan 2025 00:19:38 +0000 (00:19 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Sat, 11 Jan 2025 00:21:21 +0000 (00:21 +0000)] 
Daily bump.

6 months agotree-optimization/116057 - wrong code with CCP and vector CTORs
Richard Biener [Wed, 24 Jul 2024 11:16:35 +0000 (13:16 +0200)] 
tree-optimization/116057 - wrong code with CCP and vector CTORs

The following fixes an issue with CCP's likely_value when faced with
a vector CTOR containing undef SSA names and constants.  This should
be classified as CONSTANT and not UNDEFINED.
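
The rule stated above can be pictured with a toy model.  This is only a
sketch under simplifying assumptions, not the code in tree-ssa-ccp.cc: the
enum names and classify_ctor are hypothetical, and the real likely_value
considers far more cases.

```c
#include <stddef.h>

/* Toy lattice values, named after the ones mentioned in the text.  */
enum lattice { UNDEFINED, CONSTANT };

/* Hypothetical element kinds for a vector constructor.  */
enum elt_kind { ELT_UNDEF_SSA, ELT_CONSTANT };

/* Classify a constructor: a single constant element is enough to make
   the whole constructor CONSTANT; a constructor made only of undefined
   SSA names stays UNDEFINED.  */
enum lattice
classify_ctor (const enum elt_kind *elts, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (elts[i] == ELT_CONSTANT)
      return CONSTANT;
  return UNDEFINED;
}
```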

PR tree-optimization/116057
* tree-ssa-ccp.cc (likely_value): Also walk CTORs in stmt
operands to look for constants.

* gcc.dg/torture/pr116057.c: New testcase.

(cherry picked from commit 1ea551514b9c285d801ac5ab8d78b22483ff65af)

6 months agotree-optimization/115669 - fix SLP reduction association
Richard Biener [Thu, 27 Jun 2024 09:26:08 +0000 (11:26 +0200)] 
tree-optimization/115669 - fix SLP reduction association

The following avoids associating a reduction path as that might
get STMT_VINFO_REDUC_IDX out-of-sync with the SLP operand order.
This is a latent issue with SLP reductions but now easily exposed
as we're doing single-lane SLP reductions.

Once we have achieved SLP-only vectorization we can move and update this
meta-data.

PR tree-optimization/115669
* tree-vect-slp.cc (vect_build_slp_tree_2): Do not reassociate
chains that participate in a reduction.

* gcc.dg/vect/pr115669.c: New testcase.

(cherry picked from commit 7886830bb45c4f5dca0496d4deae9a45204d78f5)

6 months agotree-optimization/115646 - ICE with pow shrink-wrapping from bitfield
Richard Biener [Tue, 25 Jun 2024 14:13:02 +0000 (16:13 +0200)] 
tree-optimization/115646 - ICE with pow shrink-wrapping from bitfield

The following makes analysis and transform agree on constraints.

PR tree-optimization/115646
* tree-call-cdce.cc (check_pow): Check for bit_sz values
as allowed by transform.

* gcc.dg/pr115646.c: New testcase.

(cherry picked from commit 453b1d291d1a0f89087ad91cf6b1bed1ec68eff3)

6 months agodoc: cpp: fix version test example syntax
Sam James [Wed, 1 Jan 2025 17:16:17 +0000 (17:16 +0000)] 
doc: cpp: fix version test example syntax

gcc/ChangeLog:

* doc/cpp.texi (Common Predefined Macros): Fix syntax.

6 months agoDaily bump.
GCC Administrator [Fri, 10 Jan 2025 00:20:08 +0000 (00:20 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Thu, 9 Jan 2025 00:20:03 +0000 (00:20 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Wed, 8 Jan 2025 00:20:50 +0000 (00:20 +0000)] 
Daily bump.

6 months agoZen5 tuning part 4: update reassociation width
Jan Hubicka [Tue, 3 Sep 2024 16:20:34 +0000 (18:20 +0200)] 
Zen5 tuning part 4: update reassociation width

Zen5 has 6 instead of 4 ALUs and the integer multiplication can now execute in
3 of them.  FP units can do 2 additions and 2 multiplications with latency 2
and 3.  This patch updates reassociation width accordingly.  This has the
potential to increase register pressure, but unlike when benchmarking znver1
tuning, I did not notice this actually causing problems on SPEC, so this patch
bumps up reassociation width to 6 for everything except for integer vectors,
where there are 4 units with typical latency of 1.
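
As a rough illustration of what a larger reassociation width permits, the
sketch below splits one long addition chain into several independent
accumulators that the hardware can work on in parallel.  It is a hand-written
model, not compiler output; sum_with_width and MAX_WIDTH are hypothetical
names.

```c
#include <stddef.h>

#define MAX_WIDTH 8

/* Sum 'n' ints using 'width' independent accumulator chains.
   An out-of-range width falls back to a single chain.  */
long
sum_with_width (const int *a, size_t n, unsigned width)
{
  long acc[MAX_WIDTH] = { 0 };
  if (width == 0 || width > MAX_WIDTH)
    width = 1;
  /* Element i goes to chain i % width, so consecutive additions do not
     depend on each other.  */
  for (size_t i = 0; i < n; i++)
    acc[i % width] += a[i];
  /* Combine the partial sums at the end.  */
  long total = 0;
  for (unsigned k = 0; k < width; k++)
    total += acc[k];
  return total;
}
```

With width 6 there are six partial sums live at once instead of one, which is
where the extra register pressure mentioned above comes from.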

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

* config/i386/i386.cc (ix86_reassociation_width): Update for Znver5.
* config/i386/x86-tune-costs.h (znver5_costs): Update reassociation
widths.

(cherry picked from commit f0ab3de6ec0e3540f2e57f3f5628005f0a4e3fa5)

6 months agoZen5 tuning part 3: scheduler tweaks
Jan Hubicka [Tue, 3 Sep 2024 14:26:16 +0000 (16:26 +0200)] 
Zen5 tuning part 3: scheduler tweaks

This patch adds support for the new fusion in znver5 documented in the
optimization manual:

   The Zen5 microarchitecture adds support to fuse reg-reg MOV Instructions
   with certain ALU instructions. The following conditions need to be met for
   fusion to happen:
     - The MOV should be reg-reg mov with Opcode 0x89 or 0x8B
     - The MOV is followed by an ALU instruction where the MOV and ALU destination register match.
     - The ALU instruction may source only registers or immediate data. There cannot be any memory source.
     - The ALU instruction sources either the source or dest of MOV instruction.
     - If ALU instruction has 2 reg sources, they should be different.
     - The following ALU instructions can fuse with an older qualified MOV instruction:
       ADD ADC AND XOR OP SUB SBB INC DEC NOT SAL / SHL SHR SAR
       (I assume OP is OR)
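
The conditions quoted above can be sketched as a predicate over a simplified
instruction model.  This is not the ix86_fuse_mov_alu_p added by the patch;
the structs, the enum and mov_alu_fusible_p are hypothetical and only encode
the listed rules, with registers represented as plain integers.

```c
#include <stdbool.h>

/* ALU operations from the quoted list, plus OP_OTHER for anything else.  */
enum alu_op { OP_ADD, OP_ADC, OP_AND, OP_XOR, OP_OR, OP_SUB, OP_SBB,
              OP_INC, OP_DEC, OP_NOT, OP_SHL, OP_SHR, OP_SAR, OP_OTHER };

struct mov_insn
{
  int dest, src;        /* register numbers */
  bool reg_reg;         /* true for a reg-reg MOV (opcode 0x89 or 0x8B) */
};

struct alu_insn
{
  enum alu_op op;
  int dest;             /* destination register */
  int src[2];           /* register sources; immediates are not listed */
  int nsrc_regs;        /* number of valid entries in src[], 0..2 */
  bool has_mem_source;  /* true if any source is a memory operand */
};

/* Return true if MOV followed by ALU meets the quoted fusion conditions.  */
bool
mov_alu_fusible_p (const struct mov_insn *mov, const struct alu_insn *alu)
{
  if (!mov->reg_reg || alu->op == OP_OTHER)
    return false;
  if (alu->dest != mov->dest)           /* destinations must match */
    return false;
  if (alu->has_mem_source)              /* registers or immediates only */
    return false;
  if (alu->nsrc_regs == 2 && alu->src[0] == alu->src[1])
    return false;                       /* two reg sources must differ */
  /* The ALU must read the source or the destination of the MOV.  */
  for (int i = 0; i < alu->nsrc_regs && i < 2; i++)
    if (alu->src[i] == mov->src || alu->src[i] == mov->dest)
      return true;
  return false;
}
```

Under this model, a reg-reg MOV into a register followed by a shift of that
same register by an immediate qualifies, while an ADD whose two register
sources are identical does not.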

I also increased issue rate from 4 to 6.  Theoretically znver5 can do more, but
with our model we can't really use it.
Increasing issue rate to 8 leads to an infinite loop in the scheduler.

Finally, I also enabled fuse_alu_and_branch since it is supported by
znver5 (I think by earlier zens too).

The new fusion pattern moves quite a few instructions around in common code:
@@ -2210,13 +2210,13 @@
        .cfi_offset 3, -32
        leaq    63(%rsi), %rbx
        movq    %rbx, %rbp
+       shrq    $6, %rbp
+       salq    $3, %rbp
        subq    $16, %rsp
        .cfi_def_cfa_offset 48
        movq    %rdi, %r12
-       shrq    $6, %rbp
-       movq    %rsi, 8(%rsp)
-       salq    $3, %rbp
        movq    %rbp, %rdi
+       movq    %rsi, 8(%rsp)
        call    _Znwm
        movq    8(%rsp), %rsi
        movl    $0, 8(%r12)
@@ -2224,8 +2224,8 @@
        movq    %rax, (%r12)
        movq    %rbp, 32(%r12)
        testq   %rsi, %rsi
-       movq    %rsi, %rdx
        cmovns  %rsi, %rbx
+       movq    %rsi, %rdx
        sarq    $63, %rdx
        shrq    $58, %rdx
        sarq    $6, %rbx
which should help decoder bandwidth and perhaps also cache, though I was not
able to measure an effect above the noise on SPEC.

gcc/ChangeLog:

* config/i386/i386.h (TARGET_FUSE_MOV_AND_ALU): New tune.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Update for znver5.
(ix86_adjust_cost): Add TODO about znver5 memory latency.
(ix86_fuse_mov_alu_p): New.
(ix86_macro_fusion_pair_p): Use it.
* config/i386/x86-tune.def (X86_TUNE_FUSE_ALU_AND_BRANCH): Add ZNVER5.
(X86_TUNE_FUSE_MOV_AND_ALU): New tune.

(cherry picked from commit e2125a600552bc6e0329e3f1224eea14804db8d3)

6 months agoDaily bump.
GCC Administrator [Tue, 7 Jan 2025 00:21:31 +0000 (00:21 +0000)] 
Daily bump.

6 months agoDaily bump.
GCC Administrator [Mon, 6 Jan 2025 00:19:13 +0000 (00:19 +0000)] 
Daily bump.

6 months agoAda: Fix build for dummy s-taprop
Estevan Castilho (Tevo) [Sat, 28 Dec 2024 20:37:37 +0000 (20:37 +0000)] 
Ada: Fix build for dummy s-taprop

gcc/ada
* libgnarl/s-taprop__dummy.adb: Remove use clause for
System.Parameters.
(Unlock): Remove Global_Lock formal parameter.
(Write_Lock): Likewise.