git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

c++: constexpr, array, private ctor [PR120800]

Here cxx_eval_vec_init_1 wants to recreate the default constructor call that
we previously built and threw away in build_vec_init_elt, but we aren't in
the same access context at this point. Since we already checked access,
let's just suppress access control here.

Redoing overload resolution at constant evaluation time is sketchy, but
should usually be fine for a default/copy constructor.

PR c++/120800

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_vec_init_1): Suppress access control.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-array30.C: New test.

(cherry picked from commit f6462f664725844faa97ae7e8690e4fedee65788)

Daily bump.

aarch64: Prevent streaming-compatible code from assembler rejection [PR121028]

Streaming-compatible functions can be compiled without SME enabled, but need
to use "SMSTART SM" and "SMSTOP SM" to temporarily switch into the streaming
state of a callee. These switches are conditional on the current mode being
opposite to the target mode, so no SME instructions are executed if SME is not
available.

However, in GAS, "SMSTART SM" and "SMSTOP SM" always require +sme. A call
from a streaming-compatible function, compiled without SME enabled, to a non
-streaming function will be rejected as:

Error: selected processor does not support `smstop sm'..

To work around this, we make use of the .inst directive to insert the literal
encodings of "SMSTART SM" and "SMSTOP SM".

gcc/ChangeLog:
PR target/121028
* config/aarch64/aarch64-sme.md (aarch64_smstart_sm): Use the .inst
directive if !TARGET_SME.
(aarch64_smstop_sm): Likewise.

gcc/testsuite/ChangeLog:
PR target/121028
* gcc.target/aarch64/sme/call_sm_switch_1.c: Tell check-function
-bodies not to ignore .inst directives, and replace the test for
"smstart sm" with one for it's encoding.
* gcc.target/aarch64/sme/call_sm_switch_11.c: Likewise.
* gcc.target/aarch64/sme/pr121028.c: New test.

(cherry picked from commit d52e9ef98bb30872482a46e7a2ec6a20c3ca4a4c)

testsuite: Fix gcc.target/powerpc/vsx-builtin-7.c test [PR119382]

The test vsx-builtin-7.c failed on powerpc64le-linux due to Identical
Code Folding (ICF) merging the functions insert_di_0_v2 and insert_di_0.
This behavior was introduced by commit r15-7961-gdc47161c1f32c3, which
enhanced alias analysis in ao_compare::compare_ao_refs, enabling the
compiler to identify and optimize structurally identical functions. As a
result, the compiler replaced insert_di_0_v2 with a tail call to
insert_di_0, altering the expected test behavior.

This patch adds -fno-ipa-icf to the test's dg-options to disable ICF,
avoiding function merging and ensuring the test executes correctly.

2025-06-24 Jeevitha Palanisamy <jeevitha@linux.ibm.com>

gcc/testsuite/
PR testsuite/119382
* gcc.target/powerpc/vsx-builtin-7.c: Add '-fno-ipa-icf' to dg-options.

(cherry picked from commit a1fb757342058111fddcdc5ca2f866c61e87c9b8)

x86: Transform to "pushq $-1; popq reg" for -Oz

commit 4c80062d7b8c272e2e193b8074a8440dbb4fe588
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun May 25 07:40:29 2025 +0800

    x86: Enable *mov<mode>_(and|or) only for -Oz

disabled transformation from "movq $-1,reg" to "pushq $-1; popq reg" for
-Oz.  But for legacy integer registers, the former is 4 bytes and the
latter is 3 bytes.  Enable such transformation for -Oz.

gcc/

PR target/120427
* config/i386/i386.md (peephole2): Transform "movq $-1,reg" to
"pushq $-1; popq reg" for -Oz if reg is a legacy integer register.

gcc/testsuite/

PR target/120427
* gcc.target/i386/pr120427-5.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 71dae74158d05b75e367629ce21da3f0a2945576)

Eliminate redundant vpextrq/vpinsrq when move TI to V4SI.

r14-1902-g96c3539f2a3813 split TImode move with 2 DImode move, it's
supposed to optimize TImode in parameter/return since accoring to
psABI it's stored into 2 general registers.

But when TImode is not in parameter/return, it could create redundancy
in the PR.

The patch add a splitter to handle that.

.i.e.
(insn 10 9 14 2 (set (subreg:V2DI (reg:V4SI 98 [ <retval> ]) 0)
(vec_concat:V2DI (subreg:DI (reg:TI 101) 0)
(subreg:DI (reg:TI 101) 8)))
8442 {vec_concatv2di}
(expr_list:REG_DEAD (reg:TI 101)

gcc/ChangeLog:

PR target/121274
* config/i386/sse.md (*vec_concatv2di_0): Add a splitter
before it.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr121274.c: New test.

(cherry picked from commit 6a466839340dce3b596b3ae5ce85bd05a067ae00)

[sanitizer_common] Remove reference to obsolete termio ioctls (#138822)

Cherry picked from LLVM commit c99b1bcd505064f2e086e6b1034ce0b0c91ea5b9.

The termio ioctls are no longer used after commit 59978b21ad9c
("[sanitizer_common] Remove interceptors for deprecated struct termio
(#137403)"), remove them.  Fixes this build error:

../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:765:27: error: invalid application of ‘sizeof’ to incomplete type ‘__sanitizer::termio’
  765 |   unsigned IOCTL_TCGETA = TCGETA;
      |                           ^~~~~~
../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:769:27: error: invalid application of ‘sizeof’ to incomplete type ‘__sanitizer::termio’
  769 |   unsigned IOCTL_TCSETA = TCSETA;
      |                           ^~~~~~
../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:770:28: error: invalid application of ‘sizeof’ to incomplete type ‘__sanitizer::termio’
  770 |   unsigned IOCTL_TCSETAF = TCSETAF;
      |                            ^~~~~~~
../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:771:28: error: invalid application of ‘sizeof’ to incomplete type ‘__sanitizer::termio’
  771 |   unsigned IOCTL_TCSETAW = TCSETAW;
      |                            ^~~~~~~

(cherry picked from commit dbe0ba6c90d53229613c7eb3f476580ae1b9aae1)

libsanitizer: Fix build with glibc 2.42

The termio structure will be removed from glibc 2.42. It has
been deprecated since the late 80s/early 90s.

Cherry-picked from LLVM commit 59978b21ad9c65276ee8e14f26759691b8a65763
("[sanitizer_common] Remove interceptors for deprecated struct termio
(#137403)").

Co-Authored-By: Tom Stellard <tstellar@redhat.com>
libsanitizer/

* sanitizer_common/sanitizer_common_interceptors_ioctl.inc: Cherry
picked from LLVM commit 59978b21ad9c65276ee8e14f26759691b8a65763.
* sanitizer_common/sanitizer_platform_limits_posix.cpp: Likewise.
* sanitizer_common/sanitizer_platform_limits_posix.h: Likewise.

(cherry picked from commit 1789c57dc97ea2f9819ef89e28bf17208b6208e7)

Daily bump.

x86: Enable *mov<mode>_(and|or) only for -Oz

commit ef26c151c14a87177d46fd3d725e7f82e040e89f
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Thu Dec 23 12:33:07 2021 +0000

    x86: PR target/103773: Fix wrong-code with -Oz from pop to memory.

added "*mov<mode>_and" and extended "*mov<mode>_or" to transform
"mov $0,mem" to the shorter "and $0,mem" and "mov $-1,mem" to the shorter
"or $-1,mem" for -Oz.  But the new pattern:

(define_insn "*mov<mode>_and"
  [(set (match_operand:SWI248 0 "memory_operand" "=m")
    (match_operand:SWI248 1 "const0_operand"))
   (clobber (reg:CC FLAGS_REG))]
  "reload_completed"
  "and{<imodesuffix>}\t{%1, %0|%0, %1}"
  [(set_attr "type" "alu1")
   (set_attr "mode" "<MODE>")
   (set_attr "length_immediate" "1")])

and the extended pattern:

(define_insn "*mov<mode>_or"
  [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm")
    (match_operand:SWI248 1 "constm1_operand"))
   (clobber (reg:CC FLAGS_REG))]
  "reload_completed"
  "or{<imodesuffix>}\t{%1, %0|%0, %1}"
  [(set_attr "type" "alu1")
   (set_attr "mode" "<MODE>")
   (set_attr "length_immediate" "1")])

aren't guarded for -Oz.  As a result, "and $0,mem" and "or $-1,mem" are
generated without -Oz.

1. Change *mov<mode>_and" to define_insn_and_split and split it to
"mov $0,mem" if not -Oz.
2. Change "*mov<mode>_or" to define_insn_and_split and split it to
"mov $-1,mem" if not -Oz.
3. Don't transform "mov $-1,reg" to "push $-1; pop reg" for -Oz since it
should be transformed to "or $-1,reg".

gcc/

PR target/120427
* config/i386/i386.md (*mov<mode>_and): Changed to
define_insn_and_split.  Split it to "mov $0,mem" if not -Oz.
(*mov<mode>_or): Changed to define_insn_and_split.  Split it
to "mov $-1,mem" if not -Oz.
(peephole2): Don't transform "mov $-1,reg" to "push $-1; pop reg"
for -Oz since it will be transformed to "or $-1,reg".

gcc/testsuite/

PR target/120427
* gcc.target/i386/cold-attribute-4.c: Compile with -Oz.
* gcc.target/i386/pr120427-1.c: New test.
* gcc.target/i386/pr120427-2.c: Likewise.
* gcc.target/i386/pr120427-3.c: Likewise.
* gcc.target/i386/pr120427-4.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 4c80062d7b8c272e2e193b8074a8440dbb4fe588)

Daily bump.

LoongArch: Fix wrong code generated by TARGET_VECTORIZE_VEC_PERM_CONST [PR121064]

When TARGET_VECTORIZE_VEC_PERM_CONST is called, target may be the
same pseudo as op0 and/or op1.  Loading the selector into target
would clobber the input, producing wrong code like

    vld     $vr0, $t0
    vshuf.w $vr0, $vr0, $vr1

So don't load the selector into d->target, use a new pseudo to hold the
selector instead.  The reload pass will load the pseudo for selector and
the pseudo for target into the same hard register (following our
constraint '0' on the shuf instructions) anyway.

gcc/ChangeLog:

PR target/121064
* config/loongarch/lsx.md (lsx_vshuf_<lsxfmt_f>): Add '@' to
generate a mode-aware helper.  Use <VIMODE> as the mode of the
operand 1 (selector).
* config/loongarch/lasx.md (lasx_xvshuf_<lasxfmt_f>): Likewise.
* config/loongarch/loongarch.cc
(loongarch_try_expand_lsx_vshuf_const): Create a new pseudo for
the selector.  Use the mode-aware helper to simplify the code.
(loongarch_expand_vec_perm_const): Likewise.

gcc/testsuite/ChangeLog:

PR target/121064
* gcc.target/loongarch/pr121064.c: New test.

(cherry picked from commit d626debcb3717f18bf2ee88f4281b109b13e1181)

Daily bump.

aarch64: Tweak handling of general SVE permutes [PR121027]

This PR is partly about a code quality regression that was triggered
by g:caa7a99a052929d5970677c5b639e1fa5166e334.  That patch taught the
gimple optimisers to fold two VEC_PERM_EXPRs into one, conditional
upon either (a) the original permutations not being "native" operations
or (b) the combined permutation being a "native" operation.

Whether something is a "native" operation is tested by calling
can_vec_perm_const_p with allow_variable_p set to false.  This requires
the permutation to be supported directly by TARGET_VECTORIZE_VEC_PERM_CONST,
rather than falling back to the general vec_perm optab.

This exposed a problem with the way that we handled general 2-input
permutations for SVE.  Unlike Advanced SIMD, base SVE does not have
an instruction to do general 2-input permutations.  We do still implement
the vec_perm optab for SVE, but only when the vector length is known at
compile time.  The general expansion is pretty expensive: an AND, a SUB,
two TBLs, and an ORR.  It certainly couldn't be considered a "native"
operation.

However, if a VEC_PERM_EXPR has a constant selector, the indices can
be wider than the elements being permuted.  This is not true for the
vec_perm optab, where the indices and permuted elements must have the
same precision.

This leads to one case where we cannot leave a general 2-input permutation
to be handled by the vec_perm optab: when permuting bytes on a target
with 2048-bit vectors.  In that case, the indices of the elements in
the second vector are in the range [256, 511], which cannot be stored
in a byte index.

TARGET_VECTORIZE_VEC_PERM_CONST therefore has to handle 2-input SVE
permutations for one specific case.  Rather than check for that
specific case, the code went ahead and used the vec_perm expansion
whenever it worked.  But that undermines the !allow_variable_p
handling in can_vec_perm_const_p; it becomes impossible for
target-independent code to distinguish "native" operations from
the worst-case fallback.

This patch instead limits TARGET_VECTORIZE_VEC_PERM_CONST to the
cases that it has to handle.  It fixes the PR for all vector lengths
except 2048 bits.

A better fix would be to introduce some sort of costing mechanism,
which would allow us to reject the new VEC_PERM_EXPR even for
2048-bit targets.  But that would be a significant amount of work
and would not be backportable.

gcc/
PR target/121027
* config/aarch64/aarch64.cc (aarch64_evpc_sve_tbl): Punt on 2-input
operations that can be handled by vec_perm.

gcc/testsuite/
PR target/121027
* gcc.target/aarch64/sve/acle/general/perm_1.c: New test.

(cherry picked from commit 1f52396c6fc940224e9d858d49e41310a6dfa43d)

aarch64: Fix general permutes of svbfloat16_ts [PR121027]

Testing gcc.target/aarch64/sve/permute_2.c without the associated GCC
patch triggered an unrecognisable insn ICE for the svbfloat16_t tests.
This was because the implementation of general two-vector permutes
requires two TBLs and an ORR, with the ORR being represented as an
unspec for floating-point modes. The associated pattern did not
cover VNx8BF.

gcc/
PR target/121027
* config/aarch64/iterators.md (SVE_I): Move further up file.
(SVE_F): New mode iterator.
(SVE_ALL): Redefine in terms of SVE_I and SVE_F.
* config/aarch64/aarch64-sve.md (*<LOGICALF:optab><mode>3): Extend
to all SVE_F.

gcc/testsuite/
PR target/121027
* gcc.target/aarch64/sve/permute_5.c: New test.

(cherry picked from commit 4fd473f66faf5bd95c84fe5c0fa41be735a7c09f)

aarch64: Fix ZIP1 order in aarch64_expand_vector_init [PR118891]

aarch64_expand_vector_init contains some divide-and-conquer code
that tries to load the odd and even elements into 64-bit registers
and then ZIP them together. On big-endian targets, the even elements
are more significant than the odd elements and so should come second
in the ZIP.

This fixes many execution failures on aarch64_be-elf, including
gcc.c-torture/execute/pr28982a.c.

gcc/
PR target/118891
* config/aarch64/aarch64.cc (aarch64_expand_vector_init): Fix the
ZIP1 operand order for big-endian targets.

(cherry picked from commit cb2b5471516c3c469f65d927a2a30eb15357e429)

aarch64: Fix neon-sve-bridge.c failures for big-endian

Lowpart subregs are generally disallowed on big-endian SVE vector
registers, since the first memory element is stored at the least
significant end of the register, rather than the most significant end.
(See the comment at the head of aarch64-sve.md for details,
and aarch64_modes_compatible_p for the implementation.)

This means that arm_sve_neon_bridge.h needs to use custom define_insns
for big-endian targets, in lieu of using lowpart subregs.  However,
one of those define_insns relied on the prohibited lowparts internally,
to convert an Advanced SIMD register to an SVE register.  Since the
lowpart is not allowed, the lowpart_subreg would return null, leading
to a later ICE.

The simplest fix seems to be to use %Z instead, to force the Advanced
SIMD register to be written as an SVE register.

gcc/
* config/aarch64/aarch64-sve.md (@aarch64_sve_set_neonq_<mode>):
Use %Z instead of lowpart_subreg.  Tweak formatting.

(cherry picked from commit 69c839c7361430ec27d1f13f909531b872588f27)

vect: Fix VEC_WIDEN_PLUS_HI/LO choice for big-endian [PR118891]

In the tree codes and optabs, the "hi" in a vector hi/lo pair means
"most significant" and the "lo" means "least significant", with
sigificance following GCC's normal endian expectations.  Thus on
big-endian targets, the hi part handles the first half of the elements
in memory order and the lo part handles the second half.

For tree codes, supportable_widening_operation first chooses hi/lo
pairs based on little-endian order and then uses:

  if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
    std::swap (c1, c2);

to adjust.  However, the handling for internal functions was missing
an equivalent fixup.  This led to several execution failures in vect.exp
on aarch64_be-elf.

If the hi/lo code fails, the internal function handling goes on to try
even/odd.  But I couldn't see anything obvious that would put the even/
odd results back into the right order later, so there might be a latent
bug there too.

gcc/
PR tree-optimization/118891
* tree-vect-stmts.cc (supportable_widening_operation): Swap the
hi and lo internal functions on big-endian targets.

(cherry picked from commit ec54a14239b12d03c600c14f3ce9710e65cd33f1)

c++: constexpr uninitialized union [PR120577]

This was failing for two reasons:

1) We were wrongly treating the basic_string constructor as
zero-initializing the object, which it doesn't.
2) Given that, when we went to look for a value for the anonymous union,
we concluded that it was value-initialized, and trying to evaluate that
broke because we weren't setting ctx->ctor for it.

This patch fixes both issues, #1 by setting CONSTRUCTOR_NO_CLEARING and #2
by inserting a new CONSTRUCTOR for the member rather than evaluate it out of
context, which is consistent with cxx_eval_store_expression.

PR c++/120577

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Set
CONSTRUCTOR_NO_CLEARING on initial value for ctor.
(cxx_eval_component_reference): Make value-initialization
of an aggregate member explicit.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-union9.C: New test.

(cherry picked from commit f23b5df56e237df9f66b615ca4babc564d5f75de)

Daily bump.

rs6000: Add TARGET_FLOAT128_HW guard for quad-precision insns

gcc/
* config/rs6000/rs6000.md (floatti<mode>2, floatunsti<mode>2,
fix_trunc<mode>ti2): Add guard TARGET_FLOAT128_HW.
* config/rs6000/vsx.md (xsxexpqp_<IEEE128:mode>_<V2DI_DI:mode>,
xsxsigqp_<IEEE128:mode>_<VEC_TI:mode>, xsiexpqpf_<mode>,
xsiexpqp_<IEEE128:mode>_<V2DI_DI:mode>, xscmpexpqp_<code>_<mode>,
*xscmpexpqp, xststdcnegqp_<mode>): Replace guard TARGET_P9_VECTOR
with TARGET_FLOAT128_HW.

gcc/testsuite/
* gcc.target/powerpc/float128-cmp2-runnable.c: Replace
ppc_float128_sw with ppc_float128_hw and remove p9vector_hw.

(cherry picked from commit bf891fcabca7a59ce71e85c8f2eea2bfabbffe59)

Daily bump.

gimple-fold: Fix up big endian _BitInt adjustment [PR121131]

The following testcase ICEs because SCALAR_INT_TYPE_MODE of course
doesn't work for large BITINT_TYPE types which have BLKmode.
native_encode* as well as e.g. r14-8276 use in cases like these
GET_MODE_SIZE (SCALAR_INT_TYPE_MODE ()) and TREE_INT_CST_LOW (TYPE_SIZE_UNIT
()) for the BLKmode ones.
In this case, it wants bits rather than bytes, so I've used
GET_MODE_BITSIZE like before and TYPE_SIZE otherwise.

Furthermore, the patch only computes encoding_size for big endian
targets, for little endian we don't really adjust anything, so there
is no point computing it.

2025-07-18 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/121131
* gimple-fold.cc (fold_nonarray_ctor_reference): Use
TREE_INT_CST_LOW (TYPE_SIZE ()) instead of
GET_MODE_BITSIZE (SCALAR_INT_TYPE_MODE ()) for BLKmode BITINT_TYPEs.
Don't compute encoding_size at all for little endian targets.

* gcc.dg/bitint-124.c: New test.

(cherry picked from commit 90955b2f61f787ebc446f0a105b5f49672388d89)

Daily bump.

x86-64: Add RDI clobber to 64-bit dynamic TLS patterns

*tls_global_dynamic_64_largepic, *tls_local_dynamic_64_<mode> and
*tls_local_dynamic_base_64_largepic use RDI as the __tls_get_addr
argument. Add RDI clobber to these patterns to show it.

gcc/

PR target/120908
* config/i386/i386.cc (legitimize_tls_address): Pass RDI to
gen_tls_local_dynamic_64.
* config/i386/i386.md (*tls_global_dynamic_64_largepic): Add
RDI clobber and use it to generate LEA.
(*tls_local_dynamic_64_<mode>): Likewise.
(*tls_local_dynamic_base_64_largepic): Likewise.
(@tls_local_dynamic_64_<mode>): Add a clobber.

gcc/testsuite/

PR target/120908
* gcc.target/i386/pr120908.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit d8d5e2a8031e74f08f61ccdd727476f97940c5a6)

x86-64: Add RDI clobber to tls_global_dynamic_64 patterns

*tls_global_dynamic_64_<mode> uses RDI as the __tls_get_addr argument.
Add RDI clobber to tls_global_dynamic_64 patterns to show it.

PR target/120908
* config/i386/i386.cc (legitimize_tls_address): Pass RDI to
gen_tls_global_dynamic_64.
* config/i386/i386.md (*tls_global_dynamic_64_<mode>): Add RDI
clobber and use it to generate LEA.
(@tls_global_dynamic_64_<mode>): Add a clobber.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 7710d513a552f1fa1b7485ec6b318bafaa6d4cd7)

c++: constexpr array testcase [PR87097]

This seems to have been fixed by r15-7260 for PR118285, but is sufficiently
different to merit its own test.

PR c++/87097

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-array29.C: New test.

Daily bump.

Fortran: Fix ICE in ASSOCIATE with user defined operator [PR121060]

2025-07-16 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/121060
* interface.cc (matching_typebound_op): Defer determination of
specific procedure until resolution by returning NULL.

gcc/testsuite/
PR fortran/121060
* gfortran.dg/associate_75.f90: New test.

(cherry picked from commit 82e912344d28cf1a69e5f8e047203ea7eb625302)

Daily bump.

i386: Remove KEYLOCKER related feature since Panther Lake and Clearwater Forest

According to July 2025 SDM, Key locker will no longer be supported on
hardware 2025 onwards. This means for Panther Lake and Clearwater Forest,
the feature will not be enabled. Remove them from those two platforms.

gcc/ChangeLog:

* config/i386/i386.h (PTA_PANTHERLAKE): Revmoe KL and WIDEKL.
(PTA_CLEARWATERFOREST): Ditto.
* doc/invoke.texi: Revise documentation.

Daily bump.

AVR: Add support for AVR32DAxxS, AVR64DAxxS, AVR128DAxxS devices.

gcc/
* config/avr/avr-mcus.def (avr32da28s, avr32da32s, avr32da48s)
(avr64da28s, avr64da32s, avr64da48s avr64da64s)
(avr128da28s, avr128da32s, avr128da48s, avr128da64s): Add devices.
* doc/avr-mmcu.texi: Rebuild.

Daily bump.

c++: Fix a pasto in the PR120471 fix [PR120940]

No idea how this slipped in, I'm terribly sorry.
Strangely nothing in the testsuite has caught this, so I've added
a new test for that.

2025-07-03 Jakub Jelinek <jakub@redhat.com>

PR c++/120940
* typeck.cc (cp_build_array_ref): Fix a pasto.

* g++.dg/parse/pr120940.C: New test.
* g++.dg/warn/Wduplicated-branches9.C: New test.

(cherry picked from commit dc90649466a54ab61926d88500a05f59a55cb055)

Ada: Remove left-overs of front-end exception mechanism

It was removed from the compiler a few releases ago.

gcc/ada/
* gcc-interface/Makefile.in (gnatlib-sjlj): Delete.
(gnatlib-zcx): Do not modify Frontend_Exceptions constant.
* libgnat/system-linux-loongarch.ads (Frontend_Exceptions): Delete.

ada: Fix missing error on too large Component_Size not multiple of storage unit

This is a small regression introduced a few years ago.

gcc/ada/ChangeLog:

* gcc-interface/decl.cc (gnat_to_gnu_component_type): Validate the
Component_Size like the size of a type only if the component type
is actually packed.

ada: Fix wrong conversion of controlled array with representation change

The problem is that a temporary is created for the conversion because of the
representation change, and it is finalized without having been initialized.

gcc/ada/ChangeLog:

* exp_ch4.adb (Handle_Changed_Representation): Alphabetize local
variables. Set the No_Finalize_Actions flag on the assignment.

c++: Fix up cp_build_array_ref COND_EXPR handling [PR120471]

The following testcase is miscompiled since the introduction of UBSan,
cp_build_array_ref COND_EXPR handling replaces
(cond ? a : b)[idx] with cond ? a[idx] : b[idx], but if there are
SAVE_EXPRs inside of idx, they will be evaluated just in one of the
branches and the other uses uninitialized temporaries.

Fixed by keeping doing what it did if idx doesn't have side effects
and is invariant.  Otherwise if op1/op2 are ARRAY_TYPE arrays with
invariant addresses or pointers with invariant values, use
SAVE_EXPR <op0>, SAVE_EXPR <idx>, SAVE_EXPR <op0> as a new condition
and SAVE_EXPR <idx> instead of idx for the recursive calls.
Otherwise punt, but if op1/op2 are ARRAY_TYPE, furthermore call
cp_default_conversion on array, so that COND_EXPR with ARRAY_TYPE doesn't
survive in the IL until expansion.

2025-07-01  Jakub Jelinek  <jakub@redhat.com>

PR c++/120471
gcc/cp/
* typeck.cc (cp_build_array_ref) <case COND_EXPR>: If idx is not
INTEGER_CST, don't optimize the case (but cp_default_conversion on
array early if it has ARRAY_TYPE) or use
SAVE_EXPR <op0>, SAVE_EXPR <idx>, SAVE_EXPR <op0> as new op0 depending
on flag_strong_eval_order and whether op1 and op2 are arrays with
invariant address or tree invariant pointers.  Formatting fixes.
gcc/testsuite/
* g++.dg/ubsan/pr120471.C: New test.
* g++.dg/parse/pr120471.C: New test.

(cherry picked from commit 988e87b66882875b14a6cab11c17516863c74a63)

aarch64: Incorrect removal of ZA restore [PR120624]

The PCS defines a lazy save scheme for managing ZA across normal
"private-ZA" functions.  GCC currently uses this scheme for calls
to all private-ZA functions (rather than using caller-save).

Therefore, before a sequence of calls to private-ZA functions, GCC emits
code to set up a lazy save.  After the sequence of calls, GCC emits code
to check whether lazy save was committed and restore the ZA contents
if so.

These sequences are emitted by the mode-switching pass, in an attempt
to reduce the number of redundant saves and restores.

The lazy save scheme also means that, before a function can use ZA,
it must first conditionally store the old contents of ZA to the caller's
lazy save buffer, if any.

This all creates some relatively complex dependencies between
setup code, save/restore code, and normal reads from and writes to ZA.
These dependencies are modelled using special fake hard registers:

    ;; Sometimes we use placeholder instructions to mark where later
    ;; ABI-related lowering is needed.  These placeholders read and
    ;; write this register.  Instructions that depend on the lowering
    ;; read the register.
    (LOWERING_REGNUM 87)

    ;; Represents the contents of the current function's TPIDR2 block,
    ;; in abstract form.
    (TPIDR2_BLOCK_REGNUM 88)

    ;; Holds the value that the current function wants PSTATE.ZA to be.
    ;; The actual value can sometimes vary, because it does not track
    ;; changes to PSTATE.ZA that happen during a lazy save and restore.
    ;; Those effects are instead tracked by ZA_SAVED_REGNUM.
    (SME_STATE_REGNUM 89)

    ;; Instructions write to this register if they set TPIDR2_EL0 to a
    ;; well-defined value.  Instructions read from the register if they
    ;; depend on the result of such writes.
    ;;
    ;; The register does not model the architected TPIDR2_ELO, just the
    ;; current function's management of it.
    (TPIDR2_SETUP_REGNUM 90)

    ;; Represents the property "has an incoming lazy save been committed?".
    (ZA_FREE_REGNUM 91)

    ;; Represents the property "are the current function's ZA contents
    ;; stored in the lazy save buffer, rather than in ZA itself?".
    (ZA_SAVED_REGNUM 92)

    ;; Represents the contents of the current function's ZA state in
    ;; abstract form.  At various times in the function, these contents
    ;; might be stored in ZA itself, or in the function's lazy save buffer.
    ;;
    ;; The contents persist even when the architected ZA is off.  Private-ZA
    ;; functions have no effect on its contents.
    (ZA_REGNUM 93)

Every normal read from ZA and write to ZA depends on SME_STATE_REGNUM,
in order to sequence the code with the initial setup of ZA and
with the lazy save scheme.

The code to restore ZA after a call involves several instructions,
including conditional control flow.  It is initially represented as
a single define_insn and is split late, after shrink-wrapping and
prologue/epilogue insertion.

The split form of the restore instruction includes a conditional call
to __arm_tpidr2_restore:

(define_insn "aarch64_tpidr2_restore"
  [(set (reg:DI ZA_SAVED_REGNUM)
(unspec:DI [(reg:DI R0_REGNUM)] UNSPEC_TPIDR2_RESTORE))
   (set (reg:DI SME_STATE_REGNUM)
(unspec:DI [(reg:DI SME_STATE_REGNUM)] UNSPEC_TPIDR2_RESTORE))
  ...
)

The write to SME_STATE_REGNUM indicates the end of the region where
ZA_REGNUM might differ from the real contents of ZA.  In other words,
it is the point at which normal reads from ZA and writes to ZA
can safely take place.

To finally get to the point, the problem in this PR was that the
unsplit aarch64_restore_za pattern was missing this change to
SME_STATE_REGNUM.  It could therefore be deleted as dead before
it had chance to be split.  The split form had the correct dataflow,
but the unsplit form didn't.

Unfortunately, the tests for this code tended to use calls and asms
to model regions of ZA usage, and those don't seem to be affected
in the same way.

gcc/
PR target/120624
* config/aarch64/aarch64.md (SME_STATE_REGNUM): Expand on comments.
* config/aarch64/aarch64-sme.md (aarch64_restore_za): Also set
SME_STATE_REGNUM

gcc/testsuite/
PR target/120624
* gcc.target/aarch64/sme/za_state_7.c: New test.

(cherry picked from commit 8546265e2ee386ea8a4b2f9150ddfed32c9d15ea)

Daily bump.

Fix misoptimization of CONSTRUCTOR with reverse SSO

fold_ctor_reference already punts on a CONSTRUCTOR whose type has reverse
storage order, but it can be invoked in a couple of places on a CONSTRUCTOR
with native storage order that has been wrapped in a VIEW_CONVERT_EXPR to a
type with reverse storage order; this would require a post adjustment that
does not currently exist, thus yield wrong code for this admittedly quite
pathological (but supported) case.

gcc/
* gimple-fold.cc (fold_const_aggregate_ref_1) <COMPONENT_REF>:
Bail out immediately if the reference has reverse storage order.
* tree-ssa-sccvn.cc (fully_constant_vn_reference_p): Likewise.
gcc/testsuite/
* gnat.dg/sso20.adb: New test.

Daily bump.

i386: Remove CLDEMOTE for clients

CLDEMOTE is not enabled on clients according to SDM. SDM only mentioned
it will be enabled on Xeon and Atom servers, not clients. Remove them
since Alder Lake (where it is introduced).

gcc/ChangeLog:

* config/i386/i386.h (PTA_ALDERLAKE): Use PTA_GOLDMONT_PLUS
as base to remove PTA_CLDEMOTE.
(PTA_SIERRAFOREST): Add PTA_CLDEMOTE since PTA_ALDERLAKE
does not include that anymore.
* doc/invoke.texi: Update texi file.

Daily bump.

tree-optimization/116674 - vectorizable_simd_clone_call and re-analysis

When SLP analysis scraps an instance because it fails to analyze we
can end up calling vectorizable_* in analysis mode on a node that
was analyzed during the analysis of that instance again.
vectorizable_simd_clone_call wasn't expecting that and instead
guarded analysis/transform code on populated data structures.
The following changes it so it survives re-analysis.

PR tree-optimization/116674
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Support
re-analysis.

* g++.dg/vect/pr116674.cc: New testcase.

(cherry picked from commit 09a514fbb67caf7e33a6ceddf524ee21024c33c5)

Daily bump.

dfp: Further decimal_real_to_integer fixes [PR120631]

Unfortunately, the following further testcase shows that there aren't
problems only with very large precisions and large exponents, but pretty
much anything larger than 64-bits. After all, before _BitInt support dfp
didn't even have {,unsigned }__int128 <-> _Decimal{32,64,128,64x} support,
and the testcase again shows some of the conversions yielding zeros.
While the pr120631.c test worked even without the earlier patch.

So, this patch assumes 64-bit precision at most is ok and for anything
larger it just uses exponent 0 and multiplies afterwards.

2025-06-19 Jakub Jelinek <jakub@redhat.com>

PR middle-end/120631
* dfp.cc (decimal_real_to_integer): Use result multiplication not just
when precision > 128 and dn.exponent > 19, but when precision > 64
and dn.exponent > 0.

* gcc.dg/dfp/bitint-10.c: New test.
* gcc.dg/dfp/pr120631.c: New test.

(cherry picked from commit e2eb9da5546b5e2fccb86586cda3beee8f69f5c9)

dfp, real: Fix up FLOAT_EXPR/FIX_TRUNC_EXPR constant folding between dfp and large _BitInt [PR120631]

The following testcase shows that while at runtime we handle conversions
between _Decimal{64,128} and large _BitInt correctly, at compile time we
mishandle them in both directions, in one direction we end up in ICE in
decimal_from_integer callee because the char buffer is too short for the
needed number of decimal digits, in the conversion of dfp to large _BitInt
we return 0 in the wide_int.

The following patch fixes the ICE by using larger buffer (XALLOCAVEC
allocated, it will be never larger than 65536 / 3 bytes) in the larger
_BitInt case, and the other direction by setting exponent to exp % 19
and instead multiplying the result by needed powers of 10^19 (10^19 chosen
as largest power of ten that can fit into UHWI).

2025-06-18 Jakub Jelinek <jakub@redhat.com>

PR middle-end/120631
* real.cc (decimal_from_integer): Add digits argument, if larger than
256, use XALLOCAVEC allocated buffer.
(real_from_integer): Pass val_in's precision divided by 3 to
decimal_from_integer.
* dfp.cc (decimal_real_to_integer): For precision > 128 if finite
and exponent is large, decrease exponent and multiply resulting
wide_int by powers of 10^19.

* gcc.dg/dfp/bitint-9.c: New test.

(cherry picked from commit f3002d664d1137844c714645a841a48ab57d0eaa)

Daily bump.

Fix test case for PR117811 which failed for int < 32 bit.

PR middle-end/117811
PR testsuite/52641
gcc/testsuite/
* gcc.dg/torture/pr117811.c: Fix for int < 32 bit.

(cherry picked from commit 07f229c2d7ee6b604e5a86092e675d5d36c1ba4e)

opcodes: fix wrong code in expand_binop_directly [PR117811]

If expand_binop_directly fails to add a REG_EQUAL note it tries to
unwind and restart.  But it can unwind too far if expand_binop changed
some of the operands before calling it.  We don't need to unwind that
far anyway since we should end up taking exactly the same route next
time, just without a target rtx.

To fix this we remove LAST from the argument list and let the callers
(all in expand_binop) do their own unwinding if the call fails.
Instead we unwind just as far as the entry to expand_binop_directly
and recurse within this function instead of all the way back up.

gcc/ChangeLog:

PR middle-end/117811
* optabs.cc (expand_binop_directly): Remove LAST as an argument,
instead record the last insn on entry.  Only delete insns if
we need to restart and restart by calling ourself, not expand_binop.
(expand_binop): Update callers to expand_binop_directly.  If it
fails to expand the operation, delete back to LAST.

gcc/testsuite:

PR middle-end/117811
* gcc.dg/torture/pr117811.c: New test.

(cherry picked from commit 7679b826840c58343d72d05922355b646db4bdcc)

recip: Reset range info when replacing sqrt with rsqrt [PR120638]

This pass reuses a SSA_NAME on the lhs of sqrt etc. call as lhs
of .RSQRT etc. call. The following testcase is miscompiled since my recent
ranger cast changes, because we compute (correct) range for sqrtf argument
as well as result but then recip pass keeps using that range for the .RQSRT
call which returns 1. / sqrt, so the function then returns 0.5f
unconditionally.
Note, on foo this is a regression from GCC 15, but on bar it regressed
already with the r14-536 change.

2025-06-12 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/120638
* tree-ssa-math-opts.cc (pass_cse_reciprocals::execute): Call
reset_flow_sensitive_info on arg1.

* gcc.dg/pr120638.c: New test.

(cherry picked from commit 8804e5b5b127b27d099d0c361fa2161d0b13edef)

real: Fix up real_from_integer [PR120547]

The function has 2 problems, one is _BitInt specific and the other is
most likely also reproduceable only with it.

The first issue is that I've missed updating the function for _BitInt,
maxbitlen as MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT
obviously isn't guaranteed to be larger than any integral type we might
want to convert at compile time from wide_int to REAL_VALUE_FORMAT.
Just using len instead of it works fine, at least when used after
HOST_BITS_PER_WIDE_INT is added to it and it is truncated to multiples
of HOST_BITS_PER_WIDE_INT.

The other bug is that if the value has too many significant bits (formerly
maxbitlen - cnt_l_z, now len - cnt_l_z), the code just shifts it right and
adds the shift count to the future exponent.  That isn't correct for
rounding as the testcase attempts to show, the internal real format has more
bits than any precision in supported format, but we still need to
distinguish bewtween values exactly half way between representable floating
point values (those should be rounded to even) and the case when we've
shifted away some non-zero bits, so the value was tiny bit larger than half
way and then we should round up.

The patch uses something like e.g. soft-fp uses in these cases, right shift
with sticky bit in the least significant bit.

2025-06-05  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/120547
* real.cc (real_from_integer): Remove maxbitlen variable, use
len instead of that.  When shifting right, or in 1 if any of the
shifted away bits are non-zero.  Formatting fix.

* gcc.dg/bitint-123.c: New test.

(cherry picked from commit ea9ea72e448e391d4be781b74956a0190f93afc8)

tree-chrec: Use signed_type_for in convert_affine_scev

On s390x-linux I've run into the gcc.dg/torture/bitint-27.c test ICEing in
build_nonstandard_integer_type called from convert_affine_scev (not sure
why it doesn't trigger on x86_64/aarch64).
The problem is clear, when ct is a BITINT_TYPE with some large
TYPE_PRECISION, build_nonstandard_integer_type won't really work on it.

The patch fixes it similarly what has been done for GCC 14 in various
other spots.

2025-05-20 Jakub Jelinek <jakub@redhat.com>

* tree-chrec.cc (convert_affine_scev): Use signed_type_for instead of
build_nonstandard_integer_type.

(cherry picked from commit e38027c8ff449ffadaca449004bb891b9094ad00)

Daily bump.

libstdc++: Fix std::format thousands separators when sign present [PR120548]

The leading sign character should be skipped when deciding whether to
insert thousands separators into a floating-point format.

libstdc++-v3/ChangeLog:

PR libstdc++/120548
* include/std/format (__formatter_fp::_M_localize): Do not
include a leading sign character in the string to be grouped.
* testsuite/std/format/functions/format.cc: Check grouping when
sign is present in the output.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
(cherry picked from commit 2c3559839d70df6311da18fd93237050405580c3)

libstdc++: Tweak localized formatting for floating-point types

libstdc++-v3/ChangeLog:

* include/std/format (__formatter_fp::_M_localize): Add comments
and micro-optimize string copy.

(cherry picked from commit 99b8be43d7c548db127ee4f4d0918c55edc68b3f)

libstdc++: Make system_clock::to_time_t always_inline [PR99832]

For some 32-bit targets Glibc supports changing the size of time_t to be
64 bits by defining _TIME_BITS=64. That causes an ABI change which
would affect std::chrono::system_clock::to_time_t. Because to_time_t is
not a function template, its mangled name does not depend on the return
type, so it has the same mangled name whether it returns a 32-bit time_t
or a 64-bit time_t. On targets where the size of time_t can be selected
at preprocessing time, that can cause ODR violations, e.g. the linker
selects a definition of to_time_t that returns a 32-bit value but a
caller expects 64-bit and so reads 32 bits of garbage from the stack.

This commit adds always_inline to to_time_t so that all callers inline
the conversion to time_t, and will do so using whatever type time_t
happens to be in that translation unit.

Existing objects compiled before this change will either have inlined
the function anyway (which is likely if compiled with any optimization
enabled) or will contain a COMDAT definition of the inline function and
so still be able to find it at link-time.

The attribute is also added to system_clock::from_time_t, because that's
an equally simple function and it seems reasonable for them to both be
always inlined.

libstdc++-v3/ChangeLog:

PR libstdc++/99832
* include/bits/chrono.h (system_clock::to_time_t): Add
always_inline attribute to be agnostic to the underlying type of
time_t.
(system_clock::from_time_t): Add always_inline for consistency
with to_time_t.
* testsuite/20_util/system_clock/99832.cc: New test.

(cherry picked from commit d045eb13b0b42870a1f081895df3901112a358f0)

Daily bump.

libstdc++: Fix incorrect links to archived SGI STL docs

In r8-7777-g25949ee33201f2 I updated some URLs to point to copies of the
SGI STL docs in the Wayback Machine, because the original pags were no
longer hosted on sgi.com. However, I incorrectly assumed that if one
archived page was at https://web.archive.org/web/20171225062613/... then
all the other pages would be too. Apparently that's not how the Wayback
Machine works, and each page is archived on a different date. That meant
that some of our links were redirecting to archived copies of the
announcement that the SGI STL docs have gone away.

This fixes each URL to refer to a correctly archived copy of the
original docs.

libstdc++-v3/ChangeLog:

* doc/xml/faq.xml: Update URL for archived SGI STL docs.
* doc/xml/manual/containers.xml: Likewise.
* doc/xml/manual/extensions.xml: Likewise.
* doc/xml/manual/using.xml: Likewise.
* doc/xml/manual/utilities.xml: Likewise.
* doc/html/*: Regenerate.

(cherry picked from commit 501e6e786652748ff0ad9a322f74b9b47970031f)

Daily bump.