git.ipfire.org Git - thirdparty/gcc.git/log

forwprop: Fix copy prop for alignment after the final folding [PR122086]

After r16-4081-g966cdec2b2, which added folding of __builtin_assume_aligned,
forwprop would propagate pointers with lower alignment, replacing ones with
greater alignment. This causes us to lose the alignment information that
__builtin_assume_aligned provided to expand. Normally this just loses some
optimizations, except in the s390 case, where the alignment is specifically
checked for inlining of the atomics; without this patch an infinite
loop would happen.
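
As a hedged illustration of the kind of code affected (this is not the PR
testcase, and the function name is made up for the sketch), `q` below has the
same value as `p` but additionally carries a 16-byte alignment guarantee.
Replacing uses of `q` with `p` preserves the value but discards the alignment
that expand could otherwise use. __builtin_assume_aligned is a GCC/Clang
builtin, so the sketch assumes one of those compilers:

```c
#include <stdint.h>

/* Hypothetical example: sum four ints through a pointer that the caller
   promises is 16-byte aligned.  */
int32_t sum4_aligned16(const void *p)
{
    /* q has the same value as p, plus a known alignment of 16 bytes.  */
    const int32_t *q = __builtin_assume_aligned(p, 16);
    return q[0] + q[1] + q[2] + q[3];
}
```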

Note this was previously broken for -Og before r16-4081-g966cdec2b2. This
also fixes the -Og case, as forwprop is used there instead of copyprop.

This moves the testcase pr107389.c to torture to get a generic testcase.
pr107389.c was originally for the -O0 case, but we should test other
optimization levels too so this is not lost again.

align-5.c is xfailed because __builtin_assume_aligned is not instrumented for ubsan
alignment, and ubsan checks whether the pointer is aligned before emitting a check for the
load (based on the alignment known at compile time). See PR 122038 too. I had mentioned
this issue previously in r16-4081-g966cdec2b2 too.

PR middle-end/107389
PR tree-optimization/122086
gcc/ChangeLog:

* tree-ssa-forwprop.cc (forwprop_may_propagate_copy): New function.
(pass_forwprop::execute): Use forwprop_may_propagate_copy
instead of may_propagate_copy.

gcc/testsuite/ChangeLog:

* gcc.dg/pr107389.c: Move to...
* gcc.dg/torture/pr107389.c: ...here. Skip for lto.
Use dg-additional-options rather than dg-options.
* c-c++-common/ubsan/align-5.c: xfail.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

Daily bump.

riscv: Fix gimple folding of the vset* intrinsics [PR122270]

The problem here is that when the backend folds the vset intrinsics,
it tries to keep the lhs of the new statement the same as that of the old
statement, due to the check in gsi_replace. With a MEM_REF lhs, however,
vset::fold was unsharing the lhs of the replacement statement and using the
original lhs in the other new statement, so the check in gsi_replace would
fail. This fixes that oversight by switching around which statement gets the
unshared version.

Note the comment in vset::fold was already correct; it just did not match the code:
    /* Replace the call with two statements: a copy of the full tuple
       to the call result, followed by an update of the individual vector.

       The fold routines expect the replacement statement to have the
       same lhs as the original call, so return the copy statement
       rather than the field update.  */

Changes since v1:
* v2: Fix testcase.

PR target/122270

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc (vset::fold): Use the
unshare_expr for the statement that will be added separately rather
than the one which will be used for the replacement.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr122270-1.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

install.texi: Update COBOL requirements.

gcc/ChangeLog:

* doc/install.texi: Add libxml2 dependency for COBOL
library. Clarify 128-bit numeric dependency.

poly-int: Fix struct vs class confusion

We currently issue a good deal of warnings of the following kind:

  In file included from /scratch/tmp/gerald/GCC-HEAD/gcc/coretypes.h:500:
  .../GCC-HEAD/gcc/poly-int.h:378:1: warning: 'poly_int' defined as a
  struct template here but previously declared as a class template; this
  is valid, but may result in linker errors under the Microsoft C++ ABI
  [-Wmismatched-tags]
    378 | struct poly_int
        | ^
  .../GCC-HEAD/gcc/poly-int.h:32:38: note: did you mean struct here?
     32 | template<unsigned int N, typename T> class poly_int;
        |                                      ^~~~~
        |                                      struct

A `grep "'poly_int' defined as a struct template" | cut -d: -f1 | uniq -c`
shows 749 issues for a typical bootstrap.

Addressing this brings down the bootstrap log by 8.5% - from 80454 lines
down to 73613.

gcc:
* poly-int.h: Change struct poly_int to class poly_int.

x86-64: Use `movsxd` to perform SI-to-DI extension in Intel syntax

Although there's no possibility of ambiguity, Intel manual says the mnemonic
for DWORD-to-QWORD sign-extension operation should be MOVSXD. Some assemblers
(GNU AS, NASM) also overload MOVSX, but some others don't accept MOVSX (LLVM,
MASM, YASM in NASM mode) and require MOVSXD.
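
For orientation, a minimal sketch of source code that needs a DWORD-to-QWORD
sign extension (the function name is hypothetical). Which instruction the
compiler actually selects depends on the optimization level and surrounding
code; the mnemonic spelling only matters when the output is Intel syntax
(-masm=intel):

```c
#include <stdint.h>

/* SI-to-DI sign extension: widen a 32-bit signed value to 64 bits.  */
int64_t widen_si_to_di(int32_t x)
{
    return (int64_t)x;
}
```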

This mnemonic was introduced in r0-34259-g123bf9e3f4056d in 2001, and has not
been updated ever since.

gcc/ChangeLog:

PR target/119079
* config/i386/i386.md: Use `movsxd` to perform SI-to-DI extension in Intel
syntax.

Signed-off-by: LIU Hao <lh_mouse@126.com>

[PATCH v2] RISC-V: Fix moving data from V_REGS to GR_REGS by scalar move.

When the ELEN is 32 in rv64, it can move a V2SI vector register into a DI GPR.
But the low part of the extracted result must be zero-extended.

The following gimples are from pr111391-2.c:
_25 = (char) d.1_23;
_17 = {_25, _25, _25, _25, _25, _25, _25, _25};
a_26 = VIEW_CONVERT_EXPR<long int>(_17);

The final assembly:
vsetivli        zero,2,e32,m1,ta,ma
vslidedown.vi   v2,v4,1
vmv.x.s s1,v4 -> s1 must be zero-extended in case bit 31 of v4[0] is 1
vmv.x.s a4,v2
slli    a5,a4,32
or      s1,a5,s1
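
The effect of the missing zero-extension can be modelled in plain C. This is
only a sketch of the final slli/or sequence, with hypothetical function names;
it assumes that vmv.x.s sign-extends a 32-bit element into the 64-bit
destination register:

```c
#include <stdint.h>

/* Buggy combination: the low element is left sign-extended, so a set
   bit 31 smears into bits 63..32 and survives the OR.  */
uint64_t combine_sign_extended(int32_t lo, int32_t hi)
{
    uint64_t lo_reg = (uint64_t)(int64_t)lo;  /* models vmv.x.s s1,v4 */
    uint64_t hi_reg = (uint64_t)(int64_t)hi;  /* models vmv.x.s a4,v2 */
    return (hi_reg << 32) | lo_reg;           /* slli + or */
}

/* Fixed combination: the low element is zero-extended first.  */
uint64_t combine_zero_extended(int32_t lo, int32_t hi)
{
    uint64_t lo_reg = (uint32_t)lo;           /* upper 32 bits cleared */
    uint64_t hi_reg = (uint64_t)(int64_t)hi;
    return (hi_reg << 32) | lo_reg;
}
```

With lo = INT32_MIN and hi = 1, the first variant loses the high element
entirely, while the second produces the intended 64-bit value.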

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Append extend.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr111391-2.c: Add expected asm.

[PATCH] middle-end: Fix typos in comments

This patch fixes spelling errors in comments:
- "accomodate" -> "accommodate" in wide-int.h and value-range-storage.h
- "the the" -> "the" in tree-vectorizer.h

gcc/ChangeLog:

* wide-int.h: Fix typo "accomodate" to "accommodate" in comment.
* value-range-storage.h: Likewise.
* tree-vectorizer.h (dr_set_safe_speculative_read_required):
Fix duplicate "the the" to "the" in comment.

Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>

Ada: Fix internal error on pragma Machine_Attribute with string constant

This was reported a long time ago and is a fairly pathological case,
so the fix is purposely ad hoc: when the attribute name of a pragma
Machine_Attribute is not a string literal, its processing needs to
be delayed for the back-end.

gcc/ada/
PR ada/13370
* sem_prag.adb (Analyze_Pragma) <Pragma_Machine_Attribute>: Set the
Has_Delayed_Freeze flag if the argument is not a literal.

gcc/testsuite/
* gnat.dg/machine_attr3.ads, gnat.dg/machine_attr3.adb: New test.

Cobol: Suppress recipe echoing in Makefile

gcc/cobol/
* Make-lang.in ($(srcdir)/cobol/token_names.h): Silence recipe.

Fortran: Fix generic user operators in PDTs [PR122290]

2025-10-26 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/122290
* decl.cc (variable_decl): Matching component initializer
expressions in PDT templates should be done with gfc_match_expr
to avoid reduction too early. If the expression type is unknown
copy the component typespec.
(gfc_get_pdt_instance): Change comment from a TODO to an
explanation. Insert parameter values in initializers. For
components that are not marked with PDT attributes, do the
full reduction for init expressions.
* primary.cc (gfc_match_actual_arglist): Convert PDT kind exprs
using the component initializer.
* resolve.cc (resolve_typebound_intrinsic_op): Preempt
gfc_check_new_interface for pdt_types as well as entities used
in submodules.
* simplify.cc (get_kind): Remove PDT kind conversion.

gcc/testsuite/
PR fortran/122290
* gfortran.dg/pdt_60.f03: New test.

libgccjit: Fix bootstrap fail from format specifiers.

The changes in r16-4527-gc11d9eaa8ac9ee caused format mismatches
in jit-recording.cc because Darwin’s uint64_t == long long unsigned int
(and the format specifiers were %ld and %li).
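
A minimal sketch of the portable idiom (the helper name is hypothetical and
this is not the jit-recording.cc code): PRIu64 from <inttypes.h> expands to
the correct conversion specifier for uint64_t on each platform, whether the
underlying type is unsigned long or unsigned long long.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a uint64_t portably; returns the number of characters that
   snprintf would have written.  */
int format_u64(char *buf, size_t size, uint64_t value)
{
    return snprintf(buf, size, "%" PRIu64, value);
}
```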

gcc/jit/ChangeLog:

* jit-recording.cc (recording::array_type::make_debug_string,
recording::array_type::write_reproducer): Use PRIu64 format
specifier for uint64_t.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

[aarch64] [testsuite] tolerate alternate insn selection [PR121599]

On gcc-14, instead of 'movi\td[0-9]*,#0', we select
'mov\tz[0-9]*\.b,#0', and the testcase fails.
As in pfalse* tests, tolerate the difference.

for gcc/testsuite/ChangeLog

PR target/121599
* gcc.target/aarch64/sve2/pr121599.c: Tolerate alternate insn
selection.

Daily bump.

doc: fix __attribute__((nocf_check)) documentation

Fix two syntax errors (missing '(' and ')' and misplaced '{').

gcc/ChangeLog:

* doc/extend.texi (nocf_check): Fix syntax errors in example.

libgcobol: fix compat w/ >=libxml2-2.12

libgcobol/ChangeLog:
PR cobol/122398

* xmlparse.cc (__gg__xml_parse): Make 'msg' const.

Fortran: IS_CONTIGUOUS and pointers to non-contiguous targets [PR114023]

PR fortran/114023

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_trans_pointer_assignment): Always set dtype
when remapping a pointer. For unlimited polymorphic LHS use
elem_len from RHS.
* trans-intrinsic.cc (gfc_conv_is_contiguous_expr): Extend inline
generated code for IS_CONTIGUOUS for pointer arguments to detect
when span differs from the element size.

gcc/testsuite/ChangeLog:

* gfortran.dg/is_contiguous_5.f90: New test.

Replace VSB_DIR by sysroot refs in VxWorks LIBGCC2_INCLUDES

This matches what VXWORKS_ADDITIONAL_CPP_SPEC does, allowing libgcc to
build without VSB_DIR defined in the environment as long as the configure
switches featured a proper --with-build-sysroot.

Tested with a successful build for --target=powerpc-wrs-vxworks7r2
using mainline sources.

2025-10-23 Olivier Hainque <hainque@adacore.com>

libgcc/
* config/t-vxworks (LIBGCC2_INCLUDES): Replace $(VSB_DIR)
by sysroot references.

LoongArch: Implement vector reduction from 256-bit to 128-bit

gcc/ChangeLog:

* config/loongarch/lasx.md (vec_extract<mode><lasxhalf>): New define_expand.
(vec_extract_lo_<mode>): New define_insn_and_split.
(vec_extract_hi_<mode>): New define_insn.
* config/loongarch/loongarch-protos.h (loongarch_check_vect_par_cnst_half):
New function prototype.
* config/loongarch/loongarch.cc (loongarch_split_reduction):
Implement TARGET_VECTORIZE_SPLIT_REDUCTION.
(loongarch_check_vect_par_cnst_half): New function.
* config/loongarch/predicates.md
(vect_par_cnst_low_half): New predicate.
(vect_par_cnst_high_half): New predicate.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/lasx-reduc-1.c: New test.

Daily bump.

Add test case for PR 110405

PR was fixed by one of the other patches.

* gcc.dg/pr110405.c: New.

Refine COND_EXPR ranges better.

Recognize COND_EXPRs where there is only one ssa_name used in the
condition as well as one of the fields in the COND_EXPR.  ie:

  cond = ssa_name < 20
  result = cond ? ssa_name : 20

Adjust the range of ssa_name to reflect the conditional value of ssa_name
relative to whether it is in the TRUE or FALSE part of the COND_EXPR.
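
In C terms, the pattern above corresponds to something like the following
sketch (the function name is hypothetical, and this is not the PR testcase).
In the TRUE arm x is known to be at most 19, and the FALSE arm is the
constant 20, so the result can never exceed 20:

```c
/* cond and the TRUE arm of the conditional use the same variable.  */
int clamp_to_20(int x)
{
    int cond = x < 20;
    return cond ? x : 20;   /* result is always <= 20 */
}
```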

PR tree-optimization/114025
gcc/
* gimple-range-fold.cc (fold_using_range::condexpr_adjust): Handle
the same ssa_name in the condition and the COND_EXPR better.

gcc/testsuite/
* g++.dg/pr114025.C: New.

PR modula2/122407: Followup to spell check remaining intrinsics

This followup patch ensures that any unknown symbol spell check
error in the intrinsics uses the parameter token rather than the
procedure name token. In turn this allows the filter module to
detect and remove multiple unknowns at the same token.
The patch also adds spell checking to the intrinsic parameters.

gcc/m2/ChangeLog:

PR modula2/122407
* gm2-compiler/FilterError.def (Copyright): Use correct
licence.
* gm2-compiler/FilterError.mod (Copyright): Ditto.
* gm2-compiler/M2Quads.mod (BuildNewProcedure): Rewrite.
(BuildIncProcedure): Ditto.
(BuildDecProcedure): Ditto.
(BuildInclProcedure): Ditto.
(BuildExclProcedure): Ditto.
(BuildAbsFunction): Ditto.
(BuildCapFunction): Ditto.
(BuildChrFunction): Ditto.
(BuildOrdFunction): Ditto.
(BuildIntFunction): Ditto.
(BuildMinFunction): Ditto.
(BuildMaxFunction): Ditto.
(BuildTruncFunction): Ditto.
(BuildTBitSizeFunction): Ditto.
(BuildTSizeFunction): Ditto.
(BuildSizeFunction): Ditto.

gcc/testsuite/ChangeLog:

PR modula2/122407
* gm2.dg/spell/iso/fail/badspellabs.mod: New test.
* gm2.dg/spell/iso/fail/badspelladr.mod: New test.
* gm2.dg/spell/iso/fail/badspellcap.mod: New test.
* gm2.dg/spell/iso/fail/badspellchr.mod: New test.
* gm2.dg/spell/iso/fail/badspellchr2.mod: New test.
* gm2.dg/spell/iso/fail/badspelldec.mod: New test.
* gm2.dg/spell/iso/fail/badspellexcl.mod: New test.
* gm2.dg/spell/iso/fail/badspellinc.mod: New test.
* gm2.dg/spell/iso/fail/badspellincl.mod: New test.
* gm2.dg/spell/iso/fail/badspellnew.mod: New test.
* gm2.dg/spell/iso/fail/badspellsize.mod: New test.
* gm2.dg/spell/iso/fail/dg-spell-iso-fail.exp: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

aarch64/testsuite: fix auto-init-padding-[24].c testcases [PR122402]

After r16-3956-gf613fdc6920c83, these 2 testcases started to fail, as
SRA now fully scalarizes the structs and only provides the piece
that was used. But the testcases are testing the padding, so turning
off SRA is correct in this case.

Pushed as obvious after testing that the testcases now pass.

PR target/122402
gcc/testsuite/ChangeLog:

* gcc.target/aarch64/auto-init-padding-2.c: Turn off SRA.
* gcc.target/aarch64/auto-init-padding-4.c: Likewise.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

libstdc++: Fix <mdspan> export in std module

When exporting declarations in a namespace with using declarations, the
name must be fully qualified. Recently introduced std::full_extent{,_t},
std::strided_slice, and std::submdspan_mapping_result broke module std
and module std.compat.

libstdc++-v3/ChangeLog:

* src/c++23/std.cc.in (std::strided_slice, std::full_extent_t)
(std::full_extent, std::submdspan_mapping_result): Add std
qualification.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

Ada: Small cleanup in Makefile

gcc/ada/
PR ada/80033
* gcc-interface/Makefile.in (force): Restore.

Fortran: fix TRANSFER of subarray component references [PR122386]

Commit r16-518 introduced a change that fixed inquiry references of complex
arrays as argument to the TRANSFER intrinsic by forcing a temporary. The
solution taken, however, turned out not to be generalizable to component
references of nested derived-type arrays. A better way is to revert that
patch and force the generation of a temporary when the SOURCE expression is
not a simply-contiguous array.

PR fortran/122386

gcc/fortran/ChangeLog:

* dependency.cc (gfc_ref_needs_temporary_p): Revert r16-518.
* trans-intrinsic.cc (gfc_conv_intrinsic_transfer): Force temporary
for SOURCE not being a simply-contiguous array.

gcc/testsuite/ChangeLog:

* gfortran.dg/transfer_array_subref_2.f90: New test.

Ada: Small cleanup in Makefile

gcc/ada/
PR ada/80033
* gcc-interface/Makefile.in (deftarg.o): Delete.
(init-vxsim.o): Likewise.
(force): Likewise.

Ada: Fix argument expansion with unbalanced quote on Windows

The last character of the argument is chopped as if it was a quote.

gcc/ada/
PR ada/122367
* rtinit.c (__gnat_runtime_initialize) [__MINGW32__]: Fix detection
of quoted arguments.

Ada: Fix segfault on file without final EOL with -gnatyc

The compiler overruns the source file buffer.

gcc/ada/
PR ada/118782
* styleg.adb (Is_Box_Comment): Also stop the loop at EOF.

Ada: Fix warning for redefinition of POLLPRI macro on Windows

The macro is explicitly forced to 0 on Windows.

gcc/ada/
PR ada/113516
* s-oscons-tmplt.c [_WIN32]: Undefine POLLPRI before redefining it.

Ada: Fix strange control flow in terminals.c

This was caught by a static analyzer some time ago.

gcc/ada/
PR ada/98879
* terminals.c (__gnat_setup_child_communication) [_WIN32]: Add else
blocks in the processing of the data returned by ReadFile.

Ada: Fix other instances of incorrect String lower bound in gnatlink

This also reverts an unintentional change introduced by the initial fix.

gcc/ada/
PR ada/81087
* gnatlink.adb (Is_Prefix): Move around, streamline and return false
when the prefix is not strict.
(Gnatlink): Fix other instances of incorrect lower bound assumption.

Split signed bitwise AND operations.

The algorithm for bitwise AND struggles with complex signed operations
which cross the signed/unsigned barrier. When this happens, process it
as 2 separate ranges [LB, -1] and [0, UB], and combine the results.
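
The splitting idea can be sketched in C as follows. This is a brute-force
model for small ranges only, with hypothetical names; it is not the wi_fold
algorithm in range-op.cc, which works on wide-int bounds rather than by
enumeration:

```c
#include <limits.h>

struct int_range { int lo, hi; };

/* Brute-force the range of (x & mask) for x in [lb, ub].
   Only intended for small ranges with ub < INT_MAX.  */
static struct int_range and_range(int lb, int ub, int mask)
{
    struct int_range r = { INT_MAX, INT_MIN };
    for (int x = lb; x <= ub; x++) {
        int v = x & mask;
        if (v < r.lo) r.lo = v;
        if (v > r.hi) r.hi = v;
    }
    return r;
}

/* If [lb, ub] crosses zero, handle [lb, -1] and [0, ub] separately
   and take the union of the two results.  */
struct int_range and_range_split(int lb, int ub, int mask)
{
    if (lb < 0 && ub >= 0) {
        struct int_range neg = and_range(lb, -1, mask);
        struct int_range pos = and_range(0, ub, mask);
        struct int_range r;
        r.lo = neg.lo < pos.lo ? neg.lo : pos.lo;
        r.hi = neg.hi > pos.hi ? neg.hi : pos.hi;
        return r;
    }
    return and_range(lb, ub, mask);
}
```

For example, [-4, 3] & 6 splits into [-4, -1] & 6 = [4, 6] and
[0, 3] & 6 = [0, 2], which combine to [0, 6].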

PR tree-optimization/114725
gcc/
* range-op.cc (operator_bitwise_and::wi_fold): Split signed
operations crossing zero into 2 operations.

gcc/testsuite/
* gcc.dg/pr114725.c: New.

Create and apply bitmasks for truncating casts.

When folding a cast, we were not applying the bitmask if we reached
a VARYING result.
We were also not creating a bitmask to represent the lower bits of a
truncating cast in op1_range. So GORI was losing bits.
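
As a hedged illustration of the kind of fact involved (not one of the PR
testcases; the function name is made up), knowing the result of a truncating
cast pins down the low bits of the operand, even though its upper bits stay
unknown:

```c
#include <stdint.h>

/* If the truncated value equals 0x2a, the low 8 bits of x must be 0x2a,
   so the inner comparison is always true on that path.  */
int low_byte_known(uint32_t x)
{
    uint8_t t = (uint8_t)x;          /* truncating cast */
    if (t == 0x2a)
        return (x & 0xffu) == 0x2au; /* always 1 here */
    return -1;
}
```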

PR tree-optimization/118254
PR tree-optimization/114331
gcc/
* range-op.cc (operator_cast::fold_range): Update the bitmask
when VARYING is reached.
(operator_cast::op1_range): For truncating casts, create a
bitmask bit in LHS.

gcc/testsuite/
* gcc.dg/pr114331.c: New.
* gcc.dg/pr118254.c: New.

testsuite: Add test for ICE fixed by r16-4571

I recently ran into an ICE that was fixed by richi's
r16-4571-g1ceda79ca5fe1a1a296624a98de8fd04958fbe55.

This adds a testcase for that fix.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/vect-permute-ice.c: New test.

libstdc++: Forward arguments for bind_front<f>,bind_back<f>,nttp<f> [PR122022]

This patch fixes a missing forwarding-reference (&&) in _Bind_fn_t::operator()
and lambda returned from not_fn<f>.

The bind_front<f>/bind_back<f> tests were updated to use a structure similar
to r16-3398-g250dd5b5604fbc to cover cases involving zero, one, and many bound
arguments.

PR libstdc++/122022

libstdc++-v3/ChangeLog:

* include/std/functional (_Bind_fn_t): Use forwarding reference as
parameter.
(std::not_fn<f>): Use forwarding reference as lambda parameter.
* testsuite/20_util/function_objects/bind_back/nttp.cc: Rework tests.
* testsuite/20_util/function_objects/bind_front/nttp.cc: Likewise.
* testsuite/20_util/function_objects/not_fn/nttp.cc: Add test for
argument forwarding.

xtensa: Remove redundant use of 'n'-constraint for call insns

Because 'i'-constraint clearly includes 'n'.

gcc/ChangeLog:

* config/xtensa/xtensa.md (call_internal, call_value_internal,
sibcall_internal, sibcall_value_internal): Remove 'n'-constraint.

xtensa: Remove redundant use of 'i'-constraint

Because it is redundant to specify 'i'-constraints on operands in single-
alternative match templates whose predicates are "const_int_operand" itself
or those that imply CONST_INT_P().

This patch also removes the 'i'-constraints on the next argument of the
callee (the number of bytes of arguments) in the four "call_internal"
patterns, since we are not interested in these arguments.

gcc/ChangeLog:

* config/xtensa/xtensa.md (*addsubx, *subsi3_from_const,
*xtensa_clamps, *andsi3_const_pow2_minus_one,
*andsi3_const_negative_pow2, *andsi3_const_shifted_mask,
*splice_bits, extvsi_internal, extzvsi_internal,
*extzvsi-1bit_ashlsi3, *extzvsi-1bit_addsubx, insvsi, *lsiu, *ssiu,
*lsip, *ssip, *shift_per_byte_omit_AND_0, *shift_per_byte_omit_AND_1,
*shlrd_const, *shlrd_per_byte_omit_AND, *masktrue_const_bitcmpl,
*masktrue_const_pow2_minus_one, *masktrue_const_negative_pow2,
*masktrue_const_shifted_mask, call_internal, call_value_internal,
sibcall_internal, sibcall_value_internal, entry,
*eqne_zero_masked_bits, *eqne_in_range): Remove 'i'-constraint.

PR modula2/122407: similar error messages are emitted for an unknown symbol

This followup to PR modula2/122241 reduces error message clutter by
filtering unknown symbol errors, ensuring that only one error message
is emitted for an unknown symbol at a particular location.
The filter is implemented using two binary trees. A new generic
(based on the address type) binary dictionary module is added to
the base libraries.

gcc/m2/ChangeLog:

PR modula2/122407
* Make-lang.in (GM2-LIBS-BOOT-DEFS): Add BinDict.def.
(GM2-LIBS-BOOT-MODS): Add BinDict.mod.
(GM2-COMP-BOOT-DEFS): Add FilterError.def.
(GM2-COMP-BOOT-MODS): Add FilterError.mod.
(GM2-LIBS-DEFS): Add BinDict.def.
(GM2-LIBS-MODS): Add BinDict.mod.
* gm2-compiler/M2Error.def (KillError): New procedure.
* gm2-compiler/M2Error.mod (WriteFormat3): Reformat.
(NewError): Rewrite and call AddToList.
(AddToList): New procedure.
(SubFromList): Ditto.
(WipeReferences): Ditto.
(KillError): Ditto.
* gm2-compiler/M2LexBuf.mod (MakeVirtualTok): Return
caret if all token positions are identical.
* gm2-compiler/M2MetaError.mod (KillError): Import.
(FilterError): Import.
(FilterUnknown): New global.
(initErrorBlock): Initialize symcause and token.
(push): Capitalize comments.
(pop): Copy symcause to toblock if discovered.
(doError): Add parameter sym.
(defaultError): Assign token if discovered.
Pass NulSym to doError.
(updateTokSym): New procedure.
(chooseError): Call updateTokSym.
(doErrorScopeModule): Pass sym to doError.
(doErrorScopeForward): Ditto.
(doErrorScopeMod): Ditto.
(doErrorScopeFor): Ditto.
(doErrorScopeDefinition): Ditto.
(doErrorScopeDef): Ditto.
(doErrorScopeProc): Ditto.
(used): Pass sym[bol] to doError.
(op): Assign symcause when encountering
an error, warning or note.
(MetaErrorStringT1): Rewrite.
(MetaErrorStringT2): Ditto.
(MetaErrorStringT3): Ditto.
(MetaErrorStringT4): Ditto.
(isUniqueError): New procedure function.
(wrapErrors): Rewrite.
(FilterUnknown): Initialize.
* gm2-compiler/M2Quads.mod (BuildTSizeFunction): Add spell check
hint specifier.
* gm2-compiler/FilterError.def: New file.
* gm2-compiler/FilterError.mod: New file.
* gm2-libs/BinDict.def: New file.
* gm2-libs/BinDict.mod: New file.

libgm2/ChangeLog:

PR modula2/122407
* libm2pim/Makefile.am (M2MODS): Add BinDict.mod.
(M2DEFS): Add BinDict.def.
* libm2pim/Makefile.in: Regenerate.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Simplify 'Makefile' dependencies for libatomic [PR81358]

I noticed that commit r16-4315-ge63cf4b130b86dd7dde1bf499d3d40faca10ea2e
"PR81358: Enable automatic linking of libatomic" had introduced a lot of
repeated 'Makefile' dependencies for libatomic, including some nonsensical
ones, like 'configure-stage1-target-libada: maybe-all-stage1-target-libatomic'
(libada isn't bootstrapped). That's because the code for generation of
dependencies had been put inside an existing loop over 'target_modules'.

PR driver/81358
* Makefile.tpl: Move generation of dependencies for libatomic out
of loop over 'target_modules'.
* Makefile.in: Regenerate.

middle-end/122392 - Remove erroneous PASS_MEM_STAT annotation.

Hi,
this patch removes the annotation which causes the build to fail when
configured with --enable-gather-detailed-mem-stats. I am not very familiar
with the mem stat system yet, so I'd leave annotating these functions
for a future patch.

gcc/ChangeLog:

PR middle-end/122392
* attr-callback.cc (callback_build_attr): Remove erroneous
annotation.

Signed-off-by: Josef Melcr <jmelcr02@gmail.com>

Fix reduction validation for associated reduction chains

The code checking whether we have a single cycle and tracking the
reduction chain was not transitioned to full SLP which now shows
when having a SLP reduction chain built after associating the
reduction operation.

* tree-vect-loop.cc (vectorizable_reduction): SLP-ify reduction
operation processing a bit more.

* gcc.dg/vect/vect-pr122406-1.c: Adjust to expect reduction
chain vectorization.
* gcc.dg/vect/vect-pr122406-2.c: Likewise.

tree-optimization/122406 - incomplete handling of reduction chain with conversion

The following fixes the mixup between reduction operation and conversion
wrapping a reduction chain. This also exposes a missed optimization
but I'm going to fix that in a followup.

PR tree-optimization/122406
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): Create
the SLP nodes for the conversions around the reduction
operation if required.

* gcc.dg/vect/vect-pr122406-1.c: New testcase.
* gcc.dg/vect/vect-pr122406-2.c: Likewise.

OpenMP: Fix bogus diagnostics with intervening code [PR121452]

The introduction in r14-3488-ga62c8324e7e31a of OMP_STRUCTURED_BLOCK (to
diagnose invalid intervening code) caused a regression rejecting the valid use
of the Fortran CONTINUE statement to end a collapsed loop.
This patch fixes the incorrect error checking in the OMP lowering pass. It also
fixes a check in the Fortran front end that erroneously rejects a similar
statement in an ordered loop.

Co-authored by: Tobias Burnus <tburnus@baylibre.com>

PR fortran/121452

gcc/fortran/ChangeLog:

* openmp.cc (resolve_omp_do): Allow CONTINUE as end statement of a
perfectly nested loop.

gcc/ChangeLog:

* omp-low.cc (check_omp_nesting_restrictions): Accept an
OMP_STRUCTURED_BLOCK in a collapsed simd region and in an ordered loop.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/pr121452-1.c: New test.
* c-c++-common/gomp/pr121452-2.c: New test.
* gfortran.dg/gomp/pr121452-1.f90: New test.
* gfortran.dg/gomp/pr121452-2.f90: New test.
* gfortran.dg/gomp/pr121452-3.f90: New test.

x86: builtin-fabs-2.c: Also scan (%edi) for x32

Adjust gcc.target/i386/builtin-fabs-2.c to scan both (%rdi) and (%edi).

PR target/122323
* gcc.target/i386/builtin-fabs-2.c: Also scan (%edi) for x32.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

match.pd: Fold VEC_PERM_EXPR chains implementing concat-and-extract

When compiling the following code with SIMDe on AArch64:

__m128i lo = _mm_srli_si128(a, 12);
__m128i hi = _mm_slli_si128(b, 4);
__m128i res = _mm_blend_epi16(hi, lo, 3);

current GCC produces:

mov     v31.4s, 0
ext     v30.16b, v0.16b, v31.16b, #12
ext     v0.16b, v31.16b, v1.16b, #12
ins     v0.s[0], v30.s[0]

instead of the more efficient:

ext     v0.16b, v0.16b, v1.16b, #12

GCC builds three VEC_PERM_EXPRs for the intrinsic calls. The first two
implement vector shifts and the final one implements the blend, but they
use different vector modes. The forward propagation fails to optimize
this case because VIEW_CONVERT_EXPRs in between block the folding.

This patch adds a match.pd pattern to recognize the concat-and-extract
idiom and folds the VEC_PERM_EXPR chain, even when VIEW_CONVERT_EXPRs
split the chain.
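
The equivalence being exploited can be checked with a byte-level model in C.
This is a sketch under stated assumptions: the helper names are hypothetical,
vectors are modelled as 16-byte arrays in little-endian lane order, and the
three steps follow the usual semantics of the byte shifts and of a blend with
mask 3 (the two low 16-bit lanes come from the second operand):

```c
#include <stdint.h>
#include <string.h>

/* srli by 12 bytes, slli by 4 bytes, then blend the low 4 bytes.  */
static void three_step(const uint8_t a[16], const uint8_t b[16],
                       uint8_t out[16])
{
    uint8_t lo[16] = {0}, hi[16] = {0};
    memcpy(lo, a + 12, 4);   /* lo = a shifted right by 12 bytes */
    memcpy(hi + 4, b, 12);   /* hi = b shifted left by 4 bytes */
    memcpy(out, hi, 16);     /* blend: start from hi ... */
    memcpy(out, lo, 4);      /* ... and take 16-bit lanes 0 and 1 from lo */
}

/* Concatenate a:b and extract 16 bytes starting at byte 12.  */
static void concat_extract(const uint8_t a[16], const uint8_t b[16],
                           uint8_t out[16])
{
    uint8_t cat[32];
    memcpy(cat, a, 16);
    memcpy(cat + 16, b, 16);
    memcpy(out, cat + 12, 16);
}

/* Returns 1 if both formulations agree on a sample input.  */
int fold_is_equivalent(void)
{
    uint8_t a[16], b[16], r1[16], r2[16];
    for (int i = 0; i < 16; i++) {
        a[i] = (uint8_t)i;
        b[i] = (uint8_t)(100 + i);
    }
    three_step(a, b, r1);
    concat_extract(a, b, r2);
    return memcmp(r1, r2, 16) == 0;
}
```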

Bootstrapped and tested on aarch64-linux-gnu and x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd: Fold VEC_PERM_EXPR chains implementing vector
concat-and-extract.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-vecperm-1.c: New test.

Undefine SET_CMODEL before #define in rs6000/vxworks.h

This prevents warnings complaining about the redefinition
on top of the base version.
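
The idiom looks like this in isolation. EXAMPLE_CMODEL is a made-up macro
standing in for SET_CMODEL, whose real definitions are not reproduced here:

```c
/* Base definition, standing in for the one from a generic header.  */
#define EXAMPLE_CMODEL 1

/* Target override: undefine first so the preprocessor does not warn
   about an incompatible redefinition.  */
#undef EXAMPLE_CMODEL
#define EXAMPLE_CMODEL 2

int example_cmodel(void)
{
    return EXAMPLE_CMODEL;   /* the override wins */
}
```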

2025-10-16 Olivier Hainque <hainque@adacore.com>

* config/rs6000/vxworks.h (SET_CMODEL): Undefine before
(re)defining.

Adjust VxWorks special case in testsuite check_weak_available

check_weak_available was reporting weak symbols unsupported
for VxWorks unconditionally, while they are actually supported
on VxWorks 7 now (assumed >= r2). This change adjusts the
predicate to reflect that.

We used to believe we should distinguish kernel and rtp modes,
but experiments showed that this distinction is actually
counterproductive for the testsuite's purposes.

This allows a few extra tests to run (and pass :), in particular
in g++.dg/modules.

2021-02-03 Olivier Hainque <hainque@adacore.com>

gcc/testsuite/

* lib/target-supports.exp (check_weak_available):
Return 1 for VxWorks7.

c: Implement C2y static assertions in expressions

C2y has added support for static assertions as void expressions, in
addition to use as declarations (N3715 was accepted in an online vote
between meetings).

Implement the feature in GCC. There is a syntactic ambiguity between
a static assertion as a declaration and one as an expression
statement, which the accepted feature resolves by making such a usage
a declaration (this only affects the sequence of syntax productions by
which the code is parsed, not the actual semantics of the assertion);
I've raised the similar ambiguity in for loops on the WG14 reflector.

If just concerned with C2y, and not with diagnosing the use of a
feature not supported in older standard versions, the feature might be
simpler to implement by defaulting to treating static assertions in
ambiguous contexts as expressions rather than declarations, but that
would make it hard to diagnose exactly the cases that are new in C2y
(those depend on the static assertion either not being the whole
expression statement, or being in a context where an expression
statement is allowed but a declaration is not, e.g. the body of an if
statement). Instead, to support such diagnostics, the implementation
follows the standard in what is considered a declaration and what is
considered an expression, by looking ahead to what follows the closing
parenthesis when a static assertion starts in a context where a
declaration is permitted.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-parser.cc (c_parser_next_tokens_start_typename)
(c_parser_next_tokens_start_declaration): Add argument for token
to start from
(c_parser_next_tokens_start_declaration): Check for whether static
assertion followed by semicolon.
(c_parser_check_balanced_raw_token_sequence): Declare earlier.
(c_parser_compound_statement_nostart, c_parser_for_statement): Use
c_parser_next_tokens_start_declaration not
c_token_starts_declaration on second token.
(c_parser_unary_expression): Handle static assertions.
* c-parser.h (c_parser_next_tokens_start_declaration): Add
argument.

gcc/testsuite/
* gcc.dg/c23-static-assert-5.c, gcc.dg/c23-static-assert-6.c,
gcc.dg/c23-static-assert-7.c, gcc.dg/c23-static-assert-8.c,
gcc.dg/c2y-static-assert-2.c, gcc.dg/c2y-static-assert-3.c,
gcc.dg/c2y-static-assert-4.c: New tests.

Daily bump.

cobol: Corrected FUNCTION CHAR and FUNCTION ORD.

The functions CHAR and ORD have been changed to correctly report on
character positions within the collation sequence.

The use of the LOW-VALUE and HIGH-VALUE figurative constants has been
corrected.

Some establishment of DISPLAY and NATIONAL encodings has been done
in anticipation of changes soon to come.

Some new testsuite tests have been added.

gcc/cobol/ChangeLog:

* genapi.cc (parser_alphabet): Alphabet encoding.
(parser_alphabet_use): Likewise.
(parser_xml_parse): Use correct debugging macro; encoding.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(initialize_the_data): Encoding.
(parser_label_label): Debugging macros.
(parser_label_goto): Likewise.
(parser_file_add): Encoding.
(parser_intrinsic_call_1): Special handling for __gg__char.
(parser_intrinsic_call_2): Formatting.
* parse.y: Response from FUNCTION ORD is flagged "unsigned".
* symbols.cc (cbl_alphabet_t::reencode): Establish
low_char & high_char.
* symbols.h (struct cbl_alphabet_t): Likewise.

libgcobol/ChangeLog:

* charmaps.cc: Encoding.
* charmaps.h (class charmap_t): Encoding.
* intrinsic.cc (__gg__char): Report the character at the
collation position.
(__gg__ord): Report the collation position of a character.
* libgcobol.cc (struct program_state): Add encodings;
Remove obsolete defines.
(__gg__current_collation): New function for encoding/collation.
(__gg__pop_program_state): Encoding.
(__gg__init_program_state): Encoding.
(format_for_display_internal): Handle LOW-VALUE and HIGH-VALUE.
(__gg__compare_2): Encoding.
(__gg__alphabet_use): Likewise.
* libgcobol.h (__gg__current_collation): New declaration.
* xmlparse.cc (__gg__xml_parse): Make a parameter const.

gcc/testsuite/ChangeLog:

* cobol.dg/group2/Length_overflow__2_.out: Updated test result.
* cobol.dg/group2/Length_overflow_with_offset__1_.out: Likewise.
* cobol.dg/group2/Offset_overflow.out: Likewise.
* cobol.dg/group2/CALL_with_OCCURS_DEPENDING_ON.cob: New test.
* cobol.dg/group2/CALL_with_OCCURS_DEPENDING_ON.out: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_ASCII.cob: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_ASCII.out: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_EBCDIC.cob: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_EBCDIC.out: New test.
* cobol.dg/group2/EC-BOUND-REF-MOD_checking_process_termination.cob: New test.
* cobol.dg/group2/EC-BOUND-REF-MOD_checking_process_termination.out: New test.
* cobol.dg/group2/Intrinsics_without_FUNCTION_keyword__3_.cob: New test.
* cobol.dg/group2/Occurs_DEPENDING_ON__source_and_dest.cob: New test.
* cobol.dg/group2/Occurs_DEPENDING_ON__source_and_dest.out: New test.
* cobol.dg/group2/Recursive_subscripts.cob: New test.
* cobol.dg/group2/Recursive_subscripts.out: New test.
* cobol.dg/group2/SEARCH_ALL_with_OCCURS_DEPENDING_ON.cob: New test.
* cobol.dg/group2/SEARCH_ALL_with_OCCURS_DEPENDING_ON.out: New test.
* cobol.dg/group2/Subscript_by_arithmetic_expression.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__1_.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__1_.out: New test.
* cobol.dg/group2/Subscript_out_of_bounds__2_.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__2_.out: New test.
* cobol.dg/group2/Subscripted_refmods.cob: New test.
* cobol.dg/group2/Subscripted_refmods.out: New test.
* cobol.dg/group2/length_of_ODO_Rules_7__8A__and_8B.cob: New test.
* cobol.dg/group2/length_of_ODO_Rules_7__8A__and_8B.out: New test.
* cobol.dg/group2/length_of_ODO_w_-_reference_modification.cob: New test.

match: improve handling of `((signed)x) < 0` to `x >= (unsigned)SIGNED_TYPE_MIN` in `(type1)x CMP CST1 ? (type2)x : CST2` pattern.

This is a follow-up to r16-4534-g07800a565abd20 based on the review of
the other pattern (https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698336.html),
as the same issues mentioned in that review apply here.

This changes the pattern to use the new version of minmax_from_comparison, so we don't
need to create a tree for the constant, and to use wi::mask instead of TYPE_MIN_VALUE.
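
The equivalence named in the subject line can be checked on a fixed-width type.
A minimal sketch, assuming a 32-bit unsigned x; the function names are
illustrative and are not part of the patch:

```cpp
#include <cassert>
#include <cstdint>

// Sign test written through a cast to the signed type.
bool negative_via_cast(std::uint32_t x) {
  return static_cast<std::int32_t>(x) < 0;
}

// The same test as an unsigned comparison against the signed minimum
// converted to unsigned (0x80000000 for 32 bits).
bool negative_via_unsigned(std::uint32_t x) {
  return x >= static_cast<std::uint32_t>(INT32_MIN);
}

// Spot-check that both forms agree around the boundary values.
void check_sign_test_forms() {
  const std::uint32_t samples[] = {0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu};
  for (std::uint32_t x : samples)
    assert(negative_via_cast(x) == negative_via_unsigned(x));
}
```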

gcc/ChangeLog:

* match.pd (`(type1)x CMP CST1 ? (type2)x : CST2`): Better handling
of `((signed)x) < 0`.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

phiopt: Remove minmax_replacement [PR101024]

All of the optimizations from minmax_replacement are now done in match,
so we can remove minmax_replacement. :)
This keeps around the special case for fp `a CMP b ? a : b` that was
added with r14-2699-g9f8f37f5490076 (PR 88540) and moves it to
match_simplify_replacement.

Bootstrapped and tested on x86_64-linux-gnu.

Note bool-12.c needed to be updated since phiopt1 now rejects the
BIT_AND/BIT_IOR with a cast and no longer gets MIN/MAX.

gcc/ChangeLog:

PR tree-optimization/101024
* tree-ssa-phiopt.cc (match_simplify_replacement): Special
case fp `a CMP b ? a : b` when not creating a min/max.
(strip_bit_not): Remove.
(invert_minmax_code): Remove.
(minmax_replacement): Remove.
(pass_phiopt::execute): Update pass comment.
Don't call minmax_replacement.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-12.c: Update based on when BIT_AND/BIT_IOR
is created and no longer MIN/MAX.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

match: Add support for `((signed)a </>= 0) ? min/max (a, c) : b` [PR101024]

This is the last patch needed to support removing minmax_replacement.
This fixes pr101024-1.c, which fails when minmax_replacement is removed.

The next patch will remove it.

Changes since v1:
* v2: Add new version of minmax_from_comparison that takes widest_int.
      Constrain the pattern to constant integers in some cases.
      Use mask to create the SIGNED_MAX and use GT/LE instead.
      Use wi::le_p/wi::ge_p instead of fold_build to do the comparison.

gcc/ChangeLog:

PR tree-optimization/101024
* fold-const.cc (minmax_from_comparison): New version that takes widest_int
instead of tree.
(minmax_from_comparison):  Call minmax_from_comparison for integer cst case.
* fold-const.h (minmax_from_comparison): New declaration.
* match.pd (`((signed)a </>= 0) ? min/max (a, c) : b`): New pattern.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

cobol: Implement the XML PARSE statement.

These changes implement the XML PARSE statement as described in the IBM
specification.

A repair to exception handling is included.  Up until now, an exception
after a successful file operation wasn't handled properly.

A repair to value declarations for BINARY / COMP / COMP-4 / COMP-5
values now allows them to have digits to the right of the implied
decimal point.  Processing of the "S" PICTURE character has been
normalized as well.

Co-Authored-By: James K. Lowden <jklowden@cobolworx.com>
Co-Authored-By: Robert Dubner <rdubner@symas.com>
gcc/cobol/ChangeLog:

* Make-lang.in: Incorporate new token_names.h file.
* cdf.y: Modify tokens.
* gcobol.1: Document XML PARSE statement.
* genapi.cc (parser_enter_program): Verify that every goto has a
matching label.
(parser_end_program): Likewise.
(parser_alphabet): Refine handling codeset encodings.
(parser_alphabet_use): Likewise.
(label_fetch): Moved from later in the source code.
(parser_xml_parse): New routine for XML PARSE.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(parser_label_label): Verify goto/label matching.
(parser_label_goto): Likewise.
(parser_entry): Minor change to SHOW_PARSE report.
* genapi.h (parser_alphabet): Set parameter to const.
(parser_xml_parse): Declare new function.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(parser_label_addr): Likewise.
* parse.y: label_pair_t structure; locale processing; new token
processing for alphabets and XML PARSE.
* parse_ante.h (name_of): Return field->name when initial is NULL.
(new_tempnumeric): Make signable_e optional.
(ast_save_locale): New function.
(data_division_ready): Warning for "no alphabet".
* scan.l: Repair interpretation of BINARY, COMP, COMP-4, and
COMP-5.
* scan_ante.h (struct bint_t): Likewise.
* scan_post.h (current_tokens_t::tokenset_t::tokenset_t):
Include token_names.h.
* symbols.cc (symbols_alphabet_set): Revert to prior alphabet
determination.
(symbol_table_init): New XML special registers.
(new_temporary): Make signable_e controllable, not fixed.
* symbols.h (__gg__encoding_iconv_valid): New declaration.
(enum cbl_label_type_t): New LblXml label type.
(struct cbl_xml_parse_t):
(struct cbl_label_t): Implement XML PARSE.
(new_temporary): Incorporate boolean for signable_e.
(symbol_elem_of): Change label field type handling.
(cbl_section_of): Likewise.
(cbl_field_of): Likewise.
(cbl_label_of): Likewise.
(cbl_special_name_of):  Likewise.
(cbl_alphabet_of):  Likewise.
(cbl_file_of):  Likewise.
* token_names.h: New file.
* util.cc (gcc_location_set_impl): Improve location_t calculations
when entering and leaving COPYBOOKs.

libgcobol/ChangeLog:

* Makefile.am: Changes for XML PARSE and POSIX functions.
* Makefile.in: Likewise.
* charmaps.cc: Augment encodings[] table with "supported" boolean.
(__gg__encoding_iconv_name): Modify how encodings are identified.
(encoding_descr): Likewise.
(__gg__encoding_iconv_valid): Likewise.
* common-defs.h (callback_t): Define function pointer.
* constants.cc: Use named cbl_attr_e constants instead of magic
numbers; new definitions for XML special registers.
* encodings.h (struct encodings_t): Declare "supported" boolean.
* libgcobol.cc (format_for_display_internal): Use std::ptrdiff_t.
(__gg__alphabet_use): Add case for iconv_CP1252_e.
(default_exception_handler): Repair exception handling after a
successful file operation.
* posix/errno.cc: New file.
* posix/localtime.cc: New file.
* posix/stat.cc: New file.
* posix/stat.h: New file.
* posix/tm.h: New file.
* xmlparse.cc: New file to support XML PARSE statement.

gcc/testsuite/ChangeLog:

* cobol.dg/typo-1.cob: New test for squiggles and carets.

aarch64: Add __HAVE_FUNCTION_MULTI_VERSIONING macro.

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Add
__HAVE_FUNCTION_MULTI_VERSIONING macro.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

aarch64: Remove unnecessary sort from dispatch_function_versions.

The version data-structure already stores the versions in a sorted order so
sorting here is unnecessary.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (dispatch_function_versions): Remove
unnecessary sorting and data structure.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

aarch64: testsuite: Add test for supported FMV extensions.

Add tests that check the aarch64 version features are supported, that they
have the correct priority ordering, and that the generated resolver is correct.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/fmv_priority1.c: New test.
* gcc.target/aarch64/fmv_priority2.c: New test.
* gcc.target/aarch64/fmv_priority.in: Support file.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

aarch64: Fix fmv priority ordering [PR target/122190]

This fixes the versioning rules for aarch64.

Previously this would prioritize the version string with more extensions
specified regardless of the extension.

The ACLE rules are that any two version strings should be ordered by the
highest priority feature that the versions don't have in common.
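
The ordering rule can be sketched as a comparison over feature bitmasks. This
is a minimal illustration, not the code in compare_feature_masks: the feature
names, their bit values and their relative priorities below are made up for
the example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical features; the real set and its priorities come from the ACLE.
constexpr std::uint32_t FEAT_LOW  = 1u << 0;
constexpr std::uint32_t FEAT_MID  = 1u << 1;
constexpr std::uint32_t FEAT_HIGH = 1u << 2;

// Features listed from highest to lowest priority.
constexpr std::array<std::uint32_t, 3> by_priority = {FEAT_HIGH, FEAT_MID, FEAT_LOW};

// Returns 1 if lhs outranks rhs, -1 if rhs outranks lhs, 0 if they are equal.
int compare_versions(std::uint32_t lhs, std::uint32_t rhs) {
  const std::uint32_t not_shared = lhs ^ rhs;  // features in exactly one version
  for (std::uint32_t feature : by_priority)
    if (not_shared & feature)
      return (lhs & feature) ? 1 : -1;
  return 0;
}

// One high-priority feature beats two lower-priority ones.
void check_priority_rule() {
  assert(compare_versions(FEAT_HIGH, FEAT_MID | FEAT_LOW) == 1);
}
```

Under this rule a version with only FEAT_HIGH outranks one with FEAT_MID and
FEAT_LOW together, whereas counting extensions would rank them the other way
round.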

PR target/122190

gcc/ChangeLog:

* config/aarch64/aarch64.cc (compare_feature_masks): Fix version rules.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr122190.c: New test.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

aarch64: Dump version ordering for FMV.

This adds the fmv function versions to the targetclone dump.

This is useful for debugging and tests checking function version priority
ordering.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_generate_version_dispatcher_body):
Dump function versions and the ordering.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

match.pd: Fold pattern of round semantics.

In the 538.imagick_r benchmark of SPEC 2017, I found these patterns in the
MagickRound function. This patch implements four rules in match.pd to fold
them:
1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5)
2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5)
3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5)
4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5)

These patterns express the semantics of a round(x) function, and the patch
replaces them with a floor(x+0.5) operation.
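
As an illustration, rule 1 and its folded form can be written side by side. This
is only a sketch with made-up function names. The two forms can differ for
inputs where x + 0.5 is not exactly representable, and the sketch does not
model whatever conditions the pattern places on the fold.

```cpp
#include <cassert>
#include <cmath>

// Rule 1 as it might appear in source code.
double round_by_branch(double x) {
  return (x - std::floor(x)) < (std::ceil(x) - x) ? std::floor(x) : std::ceil(x);
}

// The folded form.
double round_by_floor(double x) {
  return std::floor(x + 0.5);
}

// Both forms agree on these exactly representable inputs.
void check_round_forms() {
  const double samples[] = {0.25, 1.5, 2.75, 3.0, -0.25, -1.5};
  for (double x : samples)
    assert(round_by_branch(x) == round_by_floor(x));
}
```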

The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu,
SPEC 2017 and SPEC 2006 were run:
As for SPEC 2017, 538.imagick_r benchmark performance increased by 3%+
in base test of ratio mode.
As for SPEC 2006, while the transform does not seem to be triggered,
we also see no non-noise impact on performance.

gcc/ChangeLog:

* match.pd: Add new pattern for round.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-round-1.c: New test.

libgomp: fine-grained pinned memory allocator

This patch introduces a new custom memory allocator for use with pinned
memory (in the case where the Cuda allocator isn't available).  In future,
this allocator will also be used for Managed Memory.  Both memories are
incompatible with the system malloc because allocated memory cannot share a
page with memory allocated for other purposes.

This means that small allocations will no longer consume an entire page of
pinned memory.  Unfortunately, it also means that pinned memory pages will
never be unmapped (although they may be reused).  This isn't a technical
limitation; the "free" algorithm could be extended in future, if needed.

The implementation is not perfect; there are various corner cases (especially
related to extending onto new pages) where allocations and reallocations may
be sub-optimal, but it should still be a step forward in support for small
allocations.

I have considered using libmemkind's "fixed" memory but rejected it for three
reasons: 1) libmemkind may not always be present at runtime, 2) there's no
currently documented means to extend a "fixed" kind one page at a time
(although the code appears to have an undocumented function that may do the
job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its
regular kinds would be straight-forward), 3) Managed Memory benefits from
having the metadata located in different memory and using an external
implementation makes it hard to guarantee this.

libgomp/ChangeLog:

* Makefile.am (libgomp_la_SOURCES): Add simple-allocator.c.
* Makefile.in: Regenerate.
* basic-allocator.c: Mention simple-allocator in the comment.
* config/linux/allocator.c: Include unistd.h.
(pin_ctx): New variable.
(ctxlock): New variable.
(linux_init_pin_ctx): New function.
(linux_memspace_alloc): Use simple-allocator for pinned memory.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise.
* libgomp.h (gomp_simple_alloc_init_context): New prototype.
(gomp_simple_alloc_register_memory): New prototype.
(gomp_simple_alloc): New prototype.
(gomp_simple_free): New prototype.
(gomp_simple_realloc): New prototype.
* libgomp.texi: Update pinned memory trait documentation.
* testsuite/libgomp.c/alloc-pinned-8.c: New test.
* simple-allocator.c: New file.

libgomp, nvptx: Cuda pinned memory

Use Cuda to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX. At present, the other supported devices do
not have equivalent capabilities (or requirements).

libgomp/ChangeLog:

* config/linux/allocator.c: Include assert.h.
(using_device_for_page_locked): New variable.
(linux_memspace_alloc): Add init0 parameter. Support device pinning.
(linux_memspace_calloc): Set init0 to true.
(linux_memspace_free): Support device pinning.
(linux_memspace_realloc): Support device pinning.
(MEMSPACE_ALLOC): Set init0 to false.
* libgomp-plugin.h
(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* libgomp.h (gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(struct gomp_device_descr): Add page_locked_host_alloc_func and
page_locked_host_free_func.
* libgomp.texi: Adjust the docs for the pinned trait.
* plugin/plugin-nvptx.c
(GOMP_OFFLOAD_page_locked_host_alloc): New function.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* target.c (device_for_page_locked): New variable.
(get_device_for_page_locked): New function.
(gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(gomp_load_plugin_for_device): Add page_locked_host_alloc and
page_locked_host_free.
* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
devices.
* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>

Remove LOOP_VINFO_SLP_UNROLLING_FACTOR

The following removes LOOP_VINFO_SLP_UNROLLING_FACTOR in favor of
using LOOP_VINFO_VECT_FACTOR directly, now that no difference between
the two is possible.

* tree-vectorizer.h (_loop_vec_info::slp_unrolling_factor): Remove.
(LOOP_VINFO_SLP_UNROLLING_FACTOR): Likewise.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Adjust.
(vect_analyze_loop_2): Likewise.
* tree-vect-slp.cc (vect_make_slp_decision): Set
LOOP_VINFO_VECT_FACTOR directly.

Move SLP permute optimization until after VF is final

The following moves SLP permute optimization until after we applied
a possible extra unroll factor.

* tree-vect-loop.cc (vect_analyze_loop_2): Move vect_optimize_slp
after applying suggested_unroll_factor.

Fix possible segfault in load/store-lane analysis

The following fixes a segfault that appeared with a patch applying
additional permutes to a reduction SLP instance root.

* tree-vect-loop.cc (vect_analyze_loop_2): Deal with NULL
element in SLP_TREE_SCALAR_STMTS.

testsuite: arm: [MVE] Relax expected code for vbicq_f [PR122223]

The original versions of the pr122223.c test only took into account
code generated with -mfloat-abi=hard, which uses q0.

With -mfloat-abi=softfp, this can be any Q register, so replace q0
with a suitable regex.

gcc/testsuite/ChangeLog:

PR target/122223
* gcc.target/arm/mve/intrinsics/pr122223.c: Relax expected code.

Support reduc_sbool_and_scal_m for V{QI,SI,DI}mode.

gcc/ChangeLog:

PR target/101639
* config/i386/sse.md
(VI_AVX): New mode iterator.
(VI_AVX_CMP): Ditto.
(ssebytemode): Add V16HI, V32QI, V16QI.
(reduc_sbool_and_scal_<mode>): New expander.
(reduc_sbool_ior_scal_<mode>): Ditto.
(reduc_sbool_xor_scal_<mode>): Ditto.
(*eq<mode>3_2_negate): New pre_reload splitter.
(*ptest<mode>_ccz): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr101639_reduc_mask_vdi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_vqi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_vsi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_ior_vqi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_and_vqi.c: New test.

Support reduc_sbool_{and,ior,xor}_scal_m for avx512 kmask.

gcc/ChangeLog:

PR target/101639
* config/i386/sse.md
(reduc_sbool_and_scal_<mode>): New expander.
(reduc_sbool_ior_scal_<mode>): Ditto.
(reduc_sbool_xor_scal_<mode>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr101639_reduc_mask_di.c: New test.
* gcc.target/i386/pr101639_reduc_mask_hi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_qi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_si.c: New test.

Daily bump.

x86: Use HOST_WIDE_INT_(0|M1)U to initialize unsigned HOST_WIDE_INT

Use HOST_WIDE_INT_0U, instead of 0, HOST_WIDE_INT_M1U, instead of -1, to
initialize unsigned HOST_WIDE_INT.

* config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): Use
HOST_WIDE_INT_0U and HOST_WIDE_INT_M1U to initialize unsigned
HOST_WIDE_INT.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

testsuite: Fix local labels [PR122378]

r16-4540-g80af807e52e4f4 exposed a bug in two testcases where the declaration of
local labels was wrongly commented out. That caused "duplicate label" errors.
Uncommenting declarations fixes it.

PR middle-end/122378

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/attrs-metadirective-2.c: Uncomment local label
declaration.
* c-c++-common/gomp/metadirective-2.c: Likewise.

libstdc++: Avoid incrementing input iterators with std::prev [PR122224]

As explained in PR libstdc++/122224 we do not make it ill-formed to call
std::prev with a non-Cpp17BidirectionalIterator. Instead we just use a
runtime assertion to check the std::advance precondition that the
distance is not negative.

This allows us to support std::prev on types which model the C++20
std::bidirectional_iterator concept but do not meet the
Cpp17BidirectionalIterator requirements, e.g. iota_view's iterators.

It also allows us to support std::prev(iter, -1) which is admittedly
weird, but there's no reason it shouldn't be equivalent to
std::next(iter), which is perfectly fine to use on non-bidirectional
iterators. In other words, "reverse decrementing" is valid for
non-bidirectional iterators.

However, the current implementation of std::advance for
non-bidirectional iterators uses a loop that does `while (n--) ++i;`
which assumes that n is not negative and so will eventually reach zero.
When the assertion for the precondition is not enabled, incrementing the
iterator while n is non-zero means that using std::prev(iter) or
std::next(iter, -1) on a non-bidirectional iterator will keep
incrementing the iterator until n reaches INT_MIN, overflows, and then
keeps decrementing until it eventually reaches zero. Incrementing most
iterators that many times will cause memory safety errors long before
the integer reaches zero and terminates the loop.

This commit changes the loop to use `while (n-- > 0)` which means that
the loop doesn't execute at all if a negative n is used. We still
consider such calls to be erroneous, but when the precondition isn't
checked by an assertion, the function now has no effects. The undefined
behaviour resulting from incrementing the iterator is prevented.
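
The guarded loop can be sketched outside the library. The cursor type and the
function name below are hypothetical; only the `while (n-- > 0)` condition is
taken from the description above.

```cpp
#include <cassert>

// Hypothetical forward-only cursor that counts how often it is incremented.
struct counting_cursor {
  int steps = 0;
  counting_cursor& operator++() { ++steps; return *this; }
};

// Sketch of the guarded loop: a negative n results in zero iterations.
template <typename Iter, typename Distance>
void advance_forward_only(Iter& it, Distance n) {
  while (n-- > 0)
    ++it;
}

// A negative distance leaves the cursor untouched.
void check_negative_distance() {
  counting_cursor c;
  advance_forward_only(c, -1);
  assert(c.steps == 0);
}
```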

libstdc++-v3/ChangeLog:

PR libstdc++/122224
* include/bits/stl_iterator_base_funcs.h (prev): Compare
distance as n > 0 instead of n != 0.
* testsuite/24_iterators/range_operations/122224.cc: New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

MAINTAINERS: Update my contact info.

ChangeLog:

* MAINTAINERS: Update my contact information.

Signed-off-by: Josef Melcr <jmelcr02@gmail.com>

c++: Fix up RAW_DATA_CST handling in braced_list_to_string [PR122302]

The following testcase is miscompiled, because a RAW_DATA_CST tree
node is shared by multiple CONSTRUCTORs and when the braced_list_to_string
function changes one to extend the RAW_DATA_CST over the single preceding
and single succeeding INTEGER_CST, it changes the RAW_DATA_CST in
the other CONSTRUCTOR where the elts around it are still present.

Fixed by tweaking a copy of it instead, like we handle it in other spots.

2025-10-22 Jakub Jelinek <jakub@redhat.com>

PR c++/122302
* c-common.cc (braced_list_to_string): Call copy_node on RAW_DATA_CST
before changing RAW_DATA_POINTER and RAW_DATA_LENGTH on it.

* g++.dg/cpp0x/pr122302.C: New test.
* g++.dg/cpp/embed-27.C: New test.

AArch64: Add support for boolean reductions for Adv. SIMD using SVE

When doing boolean reductions for Adv. SIMD vectors and SVE is available
we can use SVE instructions instead of Adv. SIMD ones to do the reduction.

For instance OR-reductions are

        umaxp v3.4s, v3.4s, v3.4s
        fmov x1, d3
        cmp x1, 0
        cset w0, ne

and with SVE we generate:

        ptrue p1.b, vl16
        cmpne p1.b, p1/z, z3.b, #0
        cset w0, any

The ptrue is normally executed much earlier, so it's not a bottleneck for
the compare.

For the remaining codegen see test vect-reduc-bool-18.c.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (reduc_sbool_and_scal_<mode>,
reduc_sbool_ior_scal_<mode>, reduc_sbool_xor_scal_<mode>): Use SVE if
available.
* config/aarch64/aarch64-sve.md (*cmp<cmp_op><mode>_ptest): Rename ...
(@aarch64_pred_cmp<cmp_op><mode>_ptest): ... To this.
(reduc_sbool_xor_scal_<mode>): Rename ...
(@reduc_sbool_xor_scal_<mode>): ... To this.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/vect-reduc-bool-10.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-11.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-12.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-13.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-14.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-15.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-16.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-17.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-18.c: New test.

AArch64: Add support for boolean reductions for Adv. SIMD

The vectorizer has learned how to do boolean reductions of masks to a C bool
for the operations OR, XOR and AND.

This implements the new optabs for Adv.SIMD.  Adv.SIMD today can already
vectorize such loops but does so through SHIFT-AND-INSERT to perform the
reductions step-wise and inorder.  As an example, an OR reduction today does:

        movi    v3.4s, 0
        ext     v5.16b, v30.16b, v3.16b, #8
        orr     v5.16b, v5.16b, v30.16b
        ext     v29.16b, v5.16b, v3.16b, #4
        orr     v29.16b, v29.16b, v5.16b
        ext     v4.16b, v29.16b, v3.16b, #2
        orr     v4.16b, v4.16b, v29.16b
        ext     v3.16b, v4.16b, v3.16b, #1
        orr     v3.16b, v3.16b, v4.16b
        fmov    w1, s3
        and     w1, w1, 1

For reducing to a boolean however we don't need the stepwise reduction and can
just look at the bit patterns. For e.g. OR we now generate:

        umaxp v3.4s, v3.4s, v3.4s
        fmov x1, d3
        cmp x1, 0
        cset w0, ne

For the remaining codegen see test vect-reduc-bool-9.c.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (reduc_sbool_and_scal_<mode>,
reduc_sbool_ior_scal_<mode>, reduc_sbool_xor_scal_<mode>): New.
* config/aarch64/iterators.md (VALLI): New.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vect-reduc-bool-1.c: New test.
* gcc.target/aarch64/vect-reduc-bool-2.c: New test.
* gcc.target/aarch64/vect-reduc-bool-3.c: New test.
* gcc.target/aarch64/vect-reduc-bool-4.c: New test.
* gcc.target/aarch64/vect-reduc-bool-5.c: New test.
* gcc.target/aarch64/vect-reduc-bool-6.c: New test.
* gcc.target/aarch64/vect-reduc-bool-7.c: New test.
* gcc.target/aarch64/vect-reduc-bool-8.c: New test.
* gcc.target/aarch64/vect-reduc-bool-9.c: New test.

AArch64: Add support for boolean reductions for SVE

The vectorizer has learned how to do boolean reductions of masks to a C bool
for the operations OR, XOR and AND.

This implements the new optabs for SVE.

For SVE, the & and | cases use the CC registers.

or_reduc:
        ptest   p0, p0.b
        cset    w0, any

and_reduc:
        ptrue   p3.b, all
        nots    p3.b, p3/z, p0.b
        cset    w0, none

For the ^ case, we check whether the number of active predicate lanes
is a multiple of two.

xor_reduc:
        ptrue   p3.b, all
        cntp    x0, p3, p0.b
        and     w0, w0, 1
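
For reference, the scalar meaning of the three reductions can be sketched as
source loops that reduce comparison results to a bool. The function names and
the element type are illustrative and are not taken from the tests.

```cpp
#include <cassert>
#include <cstddef>

// AND reduction: true only if every element is nonzero.
bool all_nonzero(const int* a, std::size_t n) {
  bool r = true;
  for (std::size_t i = 0; i < n; ++i)
    r &= (a[i] != 0);
  return r;
}

// OR reduction: true if any element is nonzero.
bool any_nonzero(const int* a, std::size_t n) {
  bool r = false;
  for (std::size_t i = 0; i < n; ++i)
    r |= (a[i] != 0);
  return r;
}

// XOR reduction: true if an odd number of elements is nonzero.
bool odd_count_nonzero(const int* a, std::size_t n) {
  bool r = false;
  for (std::size_t i = 0; i < n; ++i)
    r ^= (a[i] != 0);
  return r;
}

// Three of the four elements are nonzero.
void check_bool_reductions() {
  const int v[] = {1, 0, 3, 4};
  assert(!all_nonzero(v, 4));
  assert(any_nonzero(v, 4));
  assert(odd_count_nonzero(v, 4));
}
```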

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (reduc_sbool_and_scal_<mode>,
reduc_sbool_ior_scal_<mode>, reduc_sbool_xor_scal_<mode>): New.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/vect-reduc-bool-1.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-2.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-3.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-4.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-5.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-6.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-7.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-8.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-9.c: New test.

vect: Add support for boolean reductions for VLA

The support for the new boolean reduction optabs didn't quite work for VLA
because the code later on insists on the target still having a shift-and-insert
optab.

This is however not needed if the target can do the reduction using the new
optabs, and the initial reduction value matches the neutral value and we
have one SLP lane while not having a reduction chain.

gcc/ChangeLog:

* tree-vect-loop.cc (vectorizable_reduction): Don't always require
IFN_VEC_SHL_INSERT when using reduc sbool optabs.

aarch64: Add autoregenerated files for AArch64 options.

In the previously committed patch to "add support for
menable-sysreg-checking flag", I made changes to
config/aarch64/aarch64.opt but failed to update the
autoregenerated files.

This patch adds the updated autoregenerated aarch64.opt.urls
changes.

gcc/ChangeLog:

* config/aarch64/aarch64.opt.urls: Regenerate.

tree-optimization/122364 - reduction chain with conversion

The following handles detecting of a reduction chain wrapped in a
conversion. This does not yet try to combine operands with different
signedness, but we should now handle signed integer accumulation
to both a signed and unsigned accumulator fine.

PR tree-optimization/122364
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): Re-try
linearization on a conversion source.

* gcc.dg/vect/vect-reduc-chain-5.c: New testcase.

tree-optimization/122370 - ICE with reduction and masks

The following fixes bad interaction with mask demotion to data
and the code dealing with UB on signed reductions by making sure
to also update compute_vectype when updating vectype.

PR tree-optimization/122370
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Also update compute_vectype when demoting masks to an
integer vector.

* gcc.dg/vect/vect-pr122370.c: New testcase.

libstdc++: Add missing constraints to views::indices

Calling views::indices(n) should be expression equivalent to
views::iota(decltype(n)(0), n), which means it should have the same
constraints as views::iota and be SFINAE-friendly.

libstdc++-v3/ChangeLog:

* include/std/ranges (indices::operator()): Constrain using
__can_iota_view concept.
* testsuite/std/ranges/indices/1.cc: Check SFINAE-friendliness
required by expression equivalence. Replace unused <vector>
header with <stddef.h> needed for size_t.

tree-optimization/122371 - ICE with reduction chain and fold-left reduction

The fold-left reduction transform relies on preserving the scalar
cycle PHI. The following rewrites how we connect this to the
involved stmt-infos instead of relying on (the actually bogus for
reduction chain) scalar stmts in SLP nodes more than absolutely
necessary. This also makes sure to not re-associate to form a
reduction chain when a fold-left reduction is required.

PR tree-optimization/122371
* tree-vect-loop.cc (vectorize_fold_left_reduction): Get
to the scalar def to replace via the scalar PHI backedge def.
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): Do not
re-associate to form a reduction chain if a fold-left
reduction is required.

* gcc.dg/vect/vect-pr122371.c: New testcase.

libstdc++: Implement optional<T&> from P2988R12 [PR121748]

This patch implements optional<T&> based on the P2988R12 paper, incorporating
corrections from LWG4300, LWG4304, and LWG3467. The resolution for LWG4015
is also extended to cover optional<T&>.

We introduce the _M_fwd() helper, which is equivalent to operator*() except that
it does not check the non-empty precondition. It is used to correctly propagate
the value during move construction from optional<T&>. This is necessary because
moving an optional<T&> must not move the contained object, which is the key
distinction between *std::move(opt) and std::move(*opt).
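
The distinction can be shown with a hypothetical pointer-backed holder standing
in for optional<T&>. This is a sketch of the idea only, not the libstdc++
implementation; ref_holder and both function names are invented for the
example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for optional<T&>: it stores a pointer, not a value.
template <typename T>
struct ref_holder {
  T* ptr = nullptr;
  // Dereferencing yields an lvalue reference whatever the value category
  // of the holder itself is.
  T& operator*() const { return *ptr; }
};

// *std::move(h): the holder is an rvalue, but the result is still T&,
// so the referent is copied.
std::string copy_through_holder(ref_holder<std::string>&& h) {
  return *std::move(h);
}

// std::move(*h): the referent itself is cast to an rvalue and moved from.
std::string move_from_referent(ref_holder<std::string>&& h) {
  return std::move(*h);
}

// Copying through the holder leaves the referenced string intact.
void check_holder_copy() {
  std::string s = "hello";
  std::string t = copy_through_holder(ref_holder<std::string>{&s});
  assert(t == "hello");
  assert(s == "hello");
}
```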

The implementation deviates from the standard by providing a separate std::swap
overload for std::optional<T&>, which simplifies preserving the resolution of
LWG2766.

This introduces a few changes to make_optional behavior (see included test):
* some previously valid uses of make_optional<T>({...}) (where T is not a
  reference type) now become ill-formed (see optional/make_optional_neg.cc).
* make_optional<T&>(t) and make_optional<const T&>(ct), where decltype(t) is T&,
  and decltype(ct) is const T& now produce optional<T&> and optional<const T&>
  respectively, instead of optional<T>.
* a few other uses of make_optional<R> with reference type R are now ill-formed.

PR libstdc++/121748

libstdc++-v3/ChangeLog:

* include/bits/version.def: Bump value for optional.
* include/bits/version.h: Regenerate.
* include/std/optional (std::__is_valid_contained_type_for_optional):
Define.
(std::optional<T>): Use __is_valid_contained_type_for_optional.
(optional<T>(const optional<_Up>&), optional<T>(optional<_Up>&&))
(optional<T>::operator=(const optional<_Up>&))
(optional<T>::operator=(optional<_Up>&&)): Replace x._M_get() with
x._M_fwd(), and std::move(x._M_get()) with std::move(x)._M_fwd().
(optional<T>::and_then): Remove unnecessary remove_cvref_t.
(optional<T>::_M_fwd): Define.
(std::optional<T&>): Define new partial specialization.
(std::swap(std::optional<T&>, std::optional<T&>)): Define.
(std::make_optional(_Tp&&)): Add non-type template parameter.
(std::make_optional): Use parentheses to construct optional.
(std::hash<optional<T>>): Add comment.
* testsuite/20_util/optional/make_optional-2.cc: Guard no longer
working example.
* testsuite/20_util/optional/relops/constrained.cc: Expand test to
cover optionals of reference.
* testsuite/20_util/optional/requirements.cc: Amend for
optional<T&>.
* testsuite/20_util/optional/requirements_neg.cc: Likewise.
* testsuite/20_util/optional/version.cc: Test new value of
__cpp_lib_optional.
* testsuite/20_util/optional/make_optional_neg.cc: New test.
* testsuite/20_util/optional/monadic/ref_neg.cc: New test.
* testsuite/20_util/optional/ref/access.cc: New test.
* testsuite/20_util/optional/ref/assign.cc: New test.
* testsuite/20_util/optional/ref/cons.cc: New test.
* testsuite/20_util/optional/ref/internal_traits.cc: New test.
* testsuite/20_util/optional/ref/make_optional/1.cc: New test.
* testsuite/20_util/optional/ref/make_optional/from_args_neg.cc:
New test.
* testsuite/20_util/optional/ref/make_optional/from_lvalue_neg.cc:
New test.
* testsuite/20_util/optional/ref/make_optional/from_rvalue_neg.cc:
New test.
* testsuite/20_util/optional/ref/monadic.cc: New test.
* testsuite/20_util/optional/ref/relops.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Add comparison operators between tuple<> and array<T, 0> [PR119721]

This fixes the C++23 compliance issue where std::tuple<> cannot be compared
with other empty tuple-like types such as std::array<T, 0>.

The operators correctly allow comparison with array<T, 0> even when T is not
comparable, because empty tuple-like types don't compare element values.

PR libstdc++/119721

libstdc++-v3/ChangeLog:

* include/std/tuple (tuple<>::operator==, tuple<>::operator<=>):
Define.
* testsuite/23_containers/tuple/comparison_operators/119721.cc:
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>

tree-optimization/122365 - deal with bool SLP reductions

I hadn't thought of these but at least added an assert which now
tripped.  Fixed thus.  There's also a latent issue with AVX512
mask types.  The by-pieces reduction code used the wrong element
sizes.

PR tree-optimization/122365
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Convert all inputs.  Use the proper vector element sizes
for the elementwise reduction.

* gcc.dg/vect/vect-reduc-bool-9.c: New testcase.

Initial Nova Lake Support

This patch will add initial support for Nova Lake according to Intel
ISE.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h
(get_intel_cpu): Handle Nova Lake.
* common/config/i386/i386-common.cc (processor_name):
Add Nova Lake.
(processor_alias_table): Ditto.
* common/config/i386/i386-cpuinfo.h (enum processor_types):
Add INTEL_COREI7_NOVALAKE.
* config.gcc: Add -march=novalake.
* config/i386/driver-i386.cc (host_detect_local_cpu): Handle
novalake.
* config/i386/i386-c.cc (ix86_target_macros_internal): Ditto.
* config/i386/i386-options.cc (processor_cost_table): Ditto.
(m_NOVALAKE): New.
(m_CORE_HYBRID): Add novalake.
* config/i386/i386.h (enum processor_type): Ditto.
* doc/extend.texi: Ditto.
* doc/invoke.texi: Ditto.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv16.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Handle new march.

i386: Correct cpu codename value for unknown model number

Several recent changes altered the features enabled on CPUs: r16-1666
disabled CLDEMOTE on clients, r16-2224 removed Key Locker since Panther Lake
and Clearwater Forest, and r16-4436 disabled PREFETCHI on Panther Lake.

These patches left the value that host_detect_local_cpu guesses for an
unknown model number out of step with the enabled features. Correct the
logic according to the features enabled.

This patch will also be backported to GCC 14 and GCC 15.

gcc/ChangeLog:

* config/i386/driver-i386.cc (host_detect_local_cpu): Correct
the logic for unknown model number cpu guess value.

Simplify avx512 vector integer comparison when 2 operands are known equal

For comparison NEQ/LT/NLE, it's simplified to 0.
For comparison LE/EQ/NLT, it's simplified to (1u << nelt) - 1.
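
The two rules above can be modelled in scalar C. This is only a sketch of the
expected mask values, not the define_insn_and_split itself; the enum cmp_pred
and the function same_operand_mask are hypothetical names introduced for
illustration, and the enum values are not the hardware immediate encodings.

```c
#include <assert.h>
#include <stdint.h>

/* Comparison predicates named in the commit message.  Hypothetical enum:
   the values are NOT the instruction's immediate encodings.  */
enum cmp_pred { CMP_EQ, CMP_LT, CMP_LE, CMP_NEQ, CMP_NLT, CMP_NLE };

/* Mask expected when both operands of an nelt-element integer comparison
   are known to be equal, i.e. every lane compares x with x.  */
uint64_t
same_operand_mask (enum cmp_pred pred, unsigned nelt)
{
  assert (nelt >= 1 && nelt <= 64);

  /* All-ones mask of nelt bits, written so that nelt == 64 does not
     shift by the full width of the type.  */
  uint64_t all = nelt == 64 ? UINT64_MAX : (UINT64_C (1) << nelt) - 1;

  switch (pred)
    {
    case CMP_EQ:
    case CMP_LE:
    case CMP_NLT:
      return all;	/* x == x, x <= x and !(x < x) hold in every lane.  */
    case CMP_NEQ:
    case CMP_LT:
    case CMP_NLE:
      return 0;		/* x != x, x < x and !(x <= x) hold in no lane.  */
    }
  return 0;
}
```

For a 16-element comparison, same_operand_mask (CMP_EQ, 16) gives 0xffff and
same_operand_mask (CMP_LT, 16) gives 0.
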
gcc/ChangeLog:

PR target/122320
* config/i386/sse.md (*<avx512>_cmp<mode>3_dup_op): New define_insn_and_split.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr122320-mask16.c: New test.
* gcc.target/i386/pr122320-mask2.c: New test.
* gcc.target/i386/pr122320-mask32.c: New test.
* gcc.target/i386/pr122320-mask4.c: New test.
* gcc.target/i386/pr122320-mask64.c: New test.
* gcc.target/i386/pr122320-mask8.c: New test.

libgccjit: Add _Float16, _Float32, _Float64 and __float128 support for jit

gcc/ChangeLog:

* config/i386/i386-jit.cc: Mark new float types as supported.

gcc/jit/ChangeLog:

* docs/topics/types.rst: Document new types.
* dummy-frontend.cc: Support new types in tree_type_to_jit_type.
* jit-common.h: Update NUM_GCC_JIT_TYPES.
* jit-playback.cc: Support new types in get_tree_node_for_type.
* jit-recording.cc: Support new types.
* libgccjit.h (gcc_jit_types): Add new types.

gcc/testsuite/ChangeLog:

* jit.dg/all-non-failing-tests.h: Mention new test.
* jit.dg/test-sized-float.c: New test.

Daily bump.

libgccjit: Fix error on Power architectures caused by wrong jit_target_objs

gcc/ChangeLog:
* config.gcc (jit_target_objs): Don't set this variable since
the object files don't exist.

c2y: Allow unspecified arrays in generic association.

To allow unspecified arrays in generic association add a new
declaration context GENERIC_ASSOC for grokdeclarator and new
function grokgenassoc to be used by the parser. The error
about unspecified array is moved from build_array_declarator
to grokdeclarator to be able to check for this.

gcc/c/ChangeLog:
* c-decl.cc (build_array_declarator): Remove error.
(grokgenassoc): New function.
(grokdeclarator): Add error.
* c-parser.cc (c_parser_generic_selection): Use grokgenassoc.
* c-tree.h (grokgenassoc): Add prototype.

gcc/testsuite/ChangeLog:
* gcc.dg/c2y-generic-6.c: New test.
* gcc.dg/c2y-generic-7.c: New test.

c++: Implement C++23 P2674R1 - A trait for implicit lifetime types

The following patch attempts to implement the compiler side of the
C++23 P2674R1 paper.  As mentioned in the paper, since CWG2605
the trait isn't really implementable purely on the library side.

Because it is implemented completely on the compiler side, it
just uses SCALAR_TYPE_P and so can e.g. accept __int128 even in
-std=c++23 mode, even when std::is_scalar_v<__int128> is false in
that case.  And as an extension it (like Clang) accepts _Complex
types and vector types.
I must say I'm quite surprised that any array types are considered
implicit-lifetime, even if their element type is not, but perhaps
there is some reason for that.
Because std::is_array_v<int[0]> is false, it returns false for that
as well, dunno if that shouldn't be changed for implicit-lifetime.
It accepts also VLAs.

The library part has been split into a separate patch still pending
review; committing it now so that reflection can use it in its
std::meta::is_implicit_lifetime_type implementation.

2025-10-21  Jakub Jelinek  <jakub@redhat.com>

gcc/cp/
* cp-tree.h: Implement C++23 P2674R1 - A trait for implicit lifetime
types.
(implicit_lifetime_type_p): Declare.
* tree.cc (implicit_lifetime_type_p): New function.
* cp-trait.def (IS_IMPLICIT_LIFETIME): New unary trait.
* semantics.cc (trait_expr_value): Handle CPTK_IS_IMPLICIT_LIFETIME.
(finish_trait_expr): Likewise.
* constraint.cc (diagnose_trait_expr): Likewise.
gcc/testsuite/
* g++.dg/ext/is_implicit_lifetime.C: New test.

arm: testsuite: [MVE] Fix expected code for vadcq_m and vsbcq_m [PR122189]

The original versions of these tests only took into account code
generated with -mfloat-abi=hard.

Depending on how the toolchain is configured, arm_v8_1m_mve may use
-mfloat-abi=softfp, which generates a different instruction order.

Depending on the -mtune setting, the order can also vary, so the patch
adds -fno-schedule-insns -fno-schedule-insns2 to avoid such
maintenance issues.

In particular, this fixes the failures with:
-mthumb -march=armv7e-m+fp.dp -mtune=cortex-m7 -mfloat-abi=hard -mfpu=auto
-mthumb -march=armv6s-m -mtune=cortex-m0 -mfloat-abi=soft -mfpu=auto

gcc/testsuite/ChangeLog:

PR target/122189
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c

OpenMP: Handle non-executable directives in intervening code [PR120180,PR122306]

OpenMP 6 permits non-executable directives in intervening code; this commit adds
support for a sensible subset, namely metadirectives, nothing, assume, and
'error at(compilation)'.
Also handle the special case where a metadirective can be resolved at parse time
to 'omp nothing'.
This fixes a build issue that affects 10 out of 12 SPECaccel benchmarks.
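
As an illustration of what "intervening code" means here, the sketch below
places an 'omp nothing' directive between the two loops of a collapsed loop
nest. The function fill_matrix is a hypothetical example and is not taken from
the testsuite. Without -fopenmp the pragmas are ignored and the code runs
serially; with -fopenmp, a compiler without this change may reject the
directive.

```c
#include <assert.h>

/* Fill an n-by-m row-major matrix with i + j and return the sum of all
   elements.  Hypothetical example function.  */
long
fill_matrix (int *a, int n, int m)
{
  assert (a && n > 0 && m > 0);

  long sum = 0;
#pragma omp parallel for collapse(2) reduction(+:sum)
  for (int i = 0; i < n; i++)
    {
      /* Non-executable directive as intervening code between the loops.  */
#pragma omp nothing
      for (int j = 0; j < m; j++)
	{
	  a[i * m + j] = i + j;
	  sum += i + j;
	}
    }
  return sum;
}
```
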

Co-authored-by: Tobias Burnus <tburnus@baylibre.com>

PR c/120180
PR fortran/122306

gcc/c/ChangeLog:

* c-parser.cc (c_parser_pragma): Accept a subset of non-executable
OpenMP directives in intervening code.
(c_parser_omp_error): Reject 'error at(execution)' in intervening code.
(c_parser_omp_metadirective): Return early if only one selector matches
and it resolves to 'omp nothing'.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_metadirective): Return early if only one
selector matches and it resolves to 'omp nothing'.
(cp_parser_omp_error): Reject 'error at(execution)' in intervening code.
(cp_parser_pragma): Accept a subset of non-executable OpenMP directives
as intervening code.

gcc/fortran/ChangeLog:

* gfortran.h (enum gfc_exec_op): Add EXEC_OMP_FIRST_OPENMP_EXEC and
EXEC_OMP_LAST_OPENMP_EXEC.
* openmp.cc (gfc_match_omp_context_selector): Remove static. Remove
checks on score. Add cleanup. Remove checks on trait properties.
(gfc_match_omp_context_selector_specification): Remove static. Adjust
calls to gfc_match_omp_context_selector.
(gfc_match_omp_declare_variant): Adjust call to
gfc_match_omp_context_selector_specification.
(match_omp_metadirective): Likewise.
(icode_code_error_callback): Reject all statements except
'assume' and 'metadirective'.
(gfc_resolve_omp_context_selector): New function.
(resolve_omp_metadirective): Skip metadirectives whose context selectors
can be statically resolved to false. Replace metadirective by its body
if only 'nothing' remains.
(gfc_resolve_omp_declare): Call gfc_resolve_omp_context_selector for
each variant.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/imperfect1.c: Adjust dg-error.
* c-c++-common/gomp/imperfect4.c: Likewise.
* c-c++-common/gomp/pr120180.c: Move to...
* c-c++-common/gomp/pr120180-1.c: ...here. Remove dg-error.
* g++.dg/gomp/attrs-imperfect1.C: Adjust dg-error.
* g++.dg/gomp/attrs-imperfect4.C: Likewise.
* gfortran.dg/gomp/declare-variant-2.f90: Adjust dg-error.
* gfortran.dg/gomp/declare-variant-20.f90: Likewise.
* c-c++-common/gomp/pr120180-2.c: New test.
* g++.dg/gomp/pr120180-1.C: New test.
* gfortran.dg/gomp/pr120180-1.f90: New test.
* gfortran.dg/gomp/pr120180-2.f90: New test.
* gfortran.dg/gomp/pr122306-1.f90: New file.
* gfortran.dg/gomp/pr122306-2.f90: New file.

x86_64: Start TImode STV chains from zero-extension or *concatditi.

Currently x86_64's TImode STV pass has the restriction that candidate
chains must start with a TImode load from memory.  This patch improves
the functionality of STV to allow zero-extensions and construction of
TImode pseudos from two DImode values (i.e. *concatditi) to both be
considered candidate chain initiators.  For example, this allows chains
starting from an __int128 function argument to be processed by STV.

Compiled with -O2 on x86_64:

__int128 m0,m1,m2,m3;
void foo(__int128 m)
{
    m0 = m;
    m1 = m;
    m2 = m;
    m3 = m;
}

Previously generated:

foo:    xchgq   %rdi, %rsi
        movq    %rsi, m0(%rip)
        movq    %rdi, m0+8(%rip)
        movq    %rsi, m1(%rip)
        movq    %rdi, m1+8(%rip)
        movq    %rsi, m2(%rip)
        movq    %rdi, m2+8(%rip)
        movq    %rsi, m3(%rip)
        movq    %rdi, m3+8(%rip)
        ret

With the patch, we now generate:

foo:    movq    %rdi, %xmm0
        movq    %rsi, %xmm1
        punpcklqdq      %xmm1, %xmm0
        movaps  %xmm0, m0(%rip)
        movaps  %xmm0, m1(%rip)
        movaps  %xmm0, m2(%rip)
        movaps  %xmm0, m3(%rip)
        ret

or with -mavx2:

foo:    vmovq   %rdi, %xmm1
        vpinsrq $1, %rsi, %xmm1, %xmm0
        vmovdqa %xmm0, m0(%rip)
        vmovdqa %xmm0, m1(%rip)
        vmovdqa %xmm0, m2(%rip)
        vmovdqa %xmm0, m3(%rip)
        ret

Likewise, for zero-extension:

__int128 m0,m1,m2,m3;
void bar(unsigned long x)
{
    __int128 m = x;
    m0 = m;
    m1 = m;
    m2 = m;
    m3 = m;
}

Previously with -O2:

bar:    movq    %rdi, m0(%rip)
        movq    $0, m0+8(%rip)
        movq    %rdi, m1(%rip)
        movq    $0, m1+8(%rip)
        movq    %rdi, m2(%rip)
        movq    $0, m2+8(%rip)
        movq    %rdi, m3(%rip)
        movq    $0, m3+8(%rip)
        ret

with this patch:

bar:    movq    %rdi, %xmm0
        movaps  %xmm0, m0(%rip)
        movaps  %xmm0, m1(%rip)
        movaps  %xmm0, m2(%rip)
        movaps  %xmm0, m3(%rip)
        ret

As shown in the examples above, the scalar-to-vector (STV) conversion of
*concatditi has an overhead [treating two DImode registers as a TImode
value is free on x86_64], but specifying this penalty allows the STV
pass to make an informed decision if the total cost/gain of the chain
is a net win.
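
The net-win decision described above can be pictured with a toy cost model.
This is a sketch under stated assumptions, not the logic of
timode_scalar_chain::compute_convert_gain: the struct chain_cost, both
functions and all cost units are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy cost model.  All fields are in hypothetical units, not GCC's
   actual instruction costs.  */
struct chain_cost
{
  int n_insns;		/* Instructions in the candidate chain.  */
  int gain_per_insn;	/* Saving from doing each one in a vector register.  */
  int concat_penalty;	/* One-off cost of building the vector from two
			   DImode values.  */
};

/* Total gain of converting the chain, after paying the penalty once.  */
int
chain_net_gain (const struct chain_cost *c)
{
  assert (c && c->n_insns >= 0);
  return c->n_insns * c->gain_per_insn - c->concat_penalty;
}

/* Convert only if the chain as a whole comes out ahead.  */
bool
chain_is_net_win (const struct chain_cost *c)
{
  return chain_net_gain (c) > 0;
}
```

With a penalty of 3 units and a gain of 1 unit per store, a chain of four
stores (as in foo above) is a net win, while a chain with a single store is
not.
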

2025-10-21  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc (timode_concatdi_p): New
function to recognize the various variants of *concatditi3_[1-7].
(scalar_chain::add_insn): Like VEC_SELECT, ZERO_EXTEND and
timode_concatdi_p instructions don't require their input
operands to be converted (to TImode).
(timode_scalar_chain::compute_convert_gain): Split/clone XOR and
IOR cases from AND case, to handle timode_concatdi_p costs.
<case PLUS>: Handle timode_concatdi_p conversion costs.
<case ZERO_EXTEND>: Provide costs of DImode to TImode extension.
(timode_convert_concatdi): Helper function to transform
a *concatditi3 instruction into a vec_concatv2di instruction.
(timode_scalar_chain::convert_insn): Split/clone XOR and IOR
cases from AND case, to handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
<case ZERO_EXTEND>: Convert zero_extendditi2 to *vec_concatv2di_0.
<case PLUS>: Handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
(timode_scalar_to_vector_candidate_p): Support timode_concatdi_p
instructions in IOR, XOR and PLUS cases.
<case ZERO_EXTEND>: Consider zero extension of a register from
DImode to TImode to be a candidate.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-stv-10.c: New test case.
* gcc.target/i386/sse4_1-stv-11.c: Likewise.
* gcc.target/i386/sse4_1-stv-12.c: Likewise.

OpenMP: Update directive arrays used for 'omp assume(s)' with contains/absent

Both Fortran and C/C++ have an array with classifications of directives;
currently, this array is only used to handle the restrictions of the
contains/absent clauses to the assume/assumes directives.

For C/C++, uncommenting 'declare mapper' was missed. Additionally,
'end ...' is a directive but not a directive name; hence, those
are now rejected as 'unknown directive' instead of as 'invalid'
directive.

Additionally, both lists now list newer entries (commented out) for
OpenMP 6.x - and a note (comment) was added for C/C++'s
'begin metadirective' and for Fortran's 'allocate', respectively.

gcc/c-family/ChangeLog:

* c-omp.cc (c_omp_directives): Uncomment 'declare mapper',
add comment to 'begin metadirective', add 6.x unimplemented
directives as commented-out entries.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_assumption_clauses): Switch to
'unknown' not 'invalid' directive name for end directives.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_assumption_clauses): Switch to
'unknown' not 'invalid' directive name for end directives.

gcc/fortran/ChangeLog:

* openmp.cc (gfc_omp_directive): Add comment to 'allocate';
add 6.x unimplemented directives as commented-out entries.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/assumes-2.c: Adjust for the 'invalid'
to 'unknown' change for end directives.
* c-c++-common/gomp/begin-assumes-2.c: Likewise.
* c-c++-common/gomp/assume-2.c: Likewise. Check 'declare
mapper'.