Andrew Pinski [Sat, 25 Oct 2025 00:13:26 +0000 (17:13 -0700)]
forwprop: Fix copy prop for alignment after the final folding [PR122086]
After r16-4081-g966cdec2b2 which added folding of __builtin_assume_aligned,
forwprop would propagate pointers that lower alignment replacing ones with
greater alignment. This causes us to lose alignment information that
__builtin_assume_aligned provided to expand. Normally this just loses some
optimizations except in the s390 case where the alignment is specifically
checked and was for inlining of the atomics; without this patch an infininite
loop would happen.
Note this was previously broken for -Og before r16-4081-g966cdec2b2. This
fixes -Og case as forwprop is used instead of copyprop.
This moves the testcase for pr107389.c to torture to get a generic testcase.
pr107389.c was originally for -O0 case but we should test for other
optimization levels so this is not lost again.
align-5.c is xfailed because __builtin_assume_aligned is not instrumented for ubsan
alignment and ubsan check to see pointer is aligned before emitting a check for the
load (based on the known alignment in compiling). See PR 122038 too. I had mentioned
this issue previously in r16-4081-g966cdec2b2 too.
* tree-ssa-forwprop.cc (forwprop_may_propagate_copy): New function.
(pass_forwprop::execute): Use forwprop_may_propagate_copy
instead of may_propagate_copy.
gcc/testsuite/ChangeLog:
* gcc.dg/pr107389.c: Move to...
* gcc.dg/torture/pr107389.c: ...here. Skip for lto.
Use dg-additional-options rather than dg-options.
* c-c++-common/ubsan/align-5.c: xfail.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Wed, 15 Oct 2025 16:59:25 +0000 (09:59 -0700)]
riscv: Fix gimple folding of the vset* intrinsics [PR122270]
The problem here is that when the backend folds the vset intrinsics,
it tries to keep the lhs of the new statement to be the same as the old statement
due to the check in gsi_replace. The problem is with a MEM_REF vset::fold was
unsharing the new lhs here and using the original lhs in the other new statement.
This meant the check in gsi_replace would fail.
This fixes that oversight by switching around which statement gets the unshared
version.
Note the comment in vset::fold was already correct just not matching the code:
/* Replace the call with two statements: a copy of the full tuple
to the call result, followed by an update of the individual vector.
The fold routines expect the replacement statement to have the
same lhs as the original call, so return the copy statement
rather than the field update. */
Changes since v1:
* v2: Fix testcase.
PR target/122270
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc (vset::fold): Use the
unshare_expr for the statement that will be added seperately rather
the one which will be used for the replacement.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr122270-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Gerald Pfeifer [Sun, 26 Oct 2025 17:01:02 +0000 (18:01 +0100)]
poly-int: Fix struct vs class confusion
We currently issue a good deal of warnings of the following kind:
In file included from /scratch/tmp/gerald/GCC-HEAD/gcc/coretypes.h:500:
.../GCC-HEAD/gcc/poly-int.h:378:1: warning: 'poly_int' defined as a
struct template here but previously declared as a class template; this
is valid, but may result in linker errors under the Microsoft C++ ABI
[-Wmismatched-tags]
378 | struct poly_int
| ^
.../GCC-HEAD/gcc/poly-int.h:32:38: note: did you mean struct here?
32 | template<unsigned int N, typename T> class poly_int;
| ^~~~~
| struct
A `grep "'poly_int' defined as a struct template" | cut -d: -f1 | uniq -c`
shows 749 issue for a typical bootstrap.
Addressing this brings down the bootstrap log by 8.5% - from 80454 lines
down to 73613.
gcc:
* poly-int.h: Change struct poly_int to class poly_int.
LIU Hao [Sat, 25 Oct 2025 09:19:34 +0000 (17:19 +0800)]
x86-64: Use `movsxd` to perform SI-to-DI extension in Intel syntax
Although there's no possibility of ambiguity, Intel manual says the mnemonic
for DWORD-to-QWORD sign-extension operation should be MOVSXD. Some assemblers
(GNU AS, NASM) also overload MOVSX, but some others don't accept MOVSX (LLVM,
MASM, YASM in NASM mode) and require MOVSXD.
This mnemonic was introduced in r0-34259-g123bf9e3f4056d in 2001, and has not
been updated ever since.
gcc/ChangeLog:
PR target/119079
* config/i386/i386.md: Use `movsxd` to perform SI-to-DI extension in Intel
syntax.
Kuan-Lin Chen [Sun, 26 Oct 2025 15:09:03 +0000 (09:09 -0600)]
[PATCH v2] RISC-V: Fix moving data from V_REGS to GR_REGS by scalar move.
When the ELEN is 32 in rv64, it can move a V2SI vector register into a DI GPR.
But the extract result in low-part must be zero-extended.
The following gimples are from pr111391-2.c:
_25 = (char) d.1_23;
_17 = {_25, _25, _25, _25, _25, _25, _25, _25};
a_26 = VIEW_CONVERT_EXPR<long int>(_17);
The final assembly:
vsetivli zero,2,e32,m1,ta,ma
vslidedown.vi v2,v4,1
vmv.x.s s1,v4 -> s1 must be zero-extended to prevent the bit 31 of v4[0] is 1
vmv.x.s a4,v2
slli a5,a4,32
or s1,a5,s1
Osama Abdelkader [Sun, 26 Oct 2025 14:38:06 +0000 (08:38 -0600)]
[PATCH] middle-end: Fix typos in comments
This patch fixes spelling errors in comments:
- "accomodate" -> "accommodate" in wide-int.h and value-range-storage.h
- "the the" -> "the" in tree-vectorizer.h
gcc/ChangeLog:
* wide-int.h: Fix typo "accomodate" to "accommodate" in comment.
* value-range-storage.h: Likewise.
* tree-vectorizer.h (dr_set_safe_speculative_read_required):
Fix duplicate "the the" to "the" in comment.
Eric Botcazou [Sun, 26 Oct 2025 09:21:31 +0000 (10:21 +0100)]
Ada: Fix internal error on pragma Machine_Attribute with string constant
This was reported a long time ago and is a fairly pathological case,
so the fix is purposely ad hoc: when the attribute name of a pragma
Machine_Attribute is not a string literal, its processing needs to
be delayed for the back-end.
gcc/ada/
PR ada/13370
* sem_prag.adb (Analyze_Pragma) <Pragma_Machine_Attribute>: Set the
Has_Delayed_Freeze flag if the argument is not a literal.
gcc/testsuite/
* gnat.dg/machine_attr3.ads, gnat.dg/machine_attr3.adb: New test.
Paul Thomas [Sun, 26 Oct 2025 07:49:09 +0000 (07:49 +0000)]
Fortran: Fix generic user operators in PDTs [PR122290]
2025-10-26 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/122290
* decl.cc (variable_decl): Matching component initializer
expressions in PDT templates should be done with gfc_match_expr
to avoid reduction too early. If the expression type is unknown
copy the component typespec.
(gfc_get_pdt_instance): Change comment from a TODO to an
explanation. Insert parameter values in initializers. For
components that are not marked with PDT attributes, do the
full reduction for init expressions.
* primary.cc (gfc_match_actual_arglist): Convert PDT kind exprs
using the component initializer.
* resolve.cc (resolve_typebound_intrinsic_op): Preempt
gfc_check_new_interface for pdt_types as well as entities used
in submodules.
* simplify.cc (get_kind): Remove PDT kind conversion.
gcc/testsuite/
PR fortran/122290
* gfortran.dg/pdt_60.f03: New test.
Rainer Orth [Sat, 25 Oct 2025 19:15:41 +0000 (20:15 +0100)]
libgccjit: Fix bootstrap fail from format specifiers.
The changes in r16-4527-gc11d9eaa8ac9ee caused format mismatches
in jit-recording.cc because Darwin’s uint64_t == long long unsigned int
(and the format specifiers were %ld and %li).
gcc/jit/ChangeLog:
* jit-recording.cc (recording::array_type::make_debug_string,
recording::array_type::write_reproducer): Use PRIu64 format
specifier for uint64_t.
Harald Anlauf [Fri, 24 Oct 2025 19:33:08 +0000 (21:33 +0200)]
Fortran: IS_CONTIGUOUS and pointers to non-contiguous targets [PR114023]
PR fortran/114023
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_trans_pointer_assignment): Always set dtype
when remapping a pointer. For unlimited polymorphic LHS use
elem_len from RHS.
* trans-intrinsic.cc (gfc_conv_is_contiguous_expr): Extend inline
generated code for IS_CONTIGUOUS for pointer arguments to detect
when span differs from the element size.
Olivier Hainque [Fri, 17 Oct 2025 17:53:02 +0000 (17:53 +0000)]
Replace VSB_DIR by sysroot refs in VxWorks LIBGCC2_INCLUDES
This matches what VXWORKS_ADDITIONAL_CPP_SPEC does, allowing libgcc to
build without VSB_DIR defined in the environment as soon as the configure
switches featured a proper --with-build-sysroot was provided.
Tested with a successful build for --target=powerpc-wrs-vxworks7r2
using mainline sources.
2025-10-23 Olivier Hainque <hainque@adacore.com>
libgcc/
* config/t-vxworks (LIBGCC2_INCLUDES): Replace $(VSB_DIR)
by sysroot references.
Jiahao Xu [Thu, 23 Oct 2025 06:29:06 +0000 (14:29 +0800)]
LoongArch: Implement vector reduction from 256-bit to 128-bit
gcc/ChangeLog:
* config/loongarch/lasx.md (vec_extract<mode><lasxhalf>): New define_expand.
(vec_extract_lo_<mode>): New define_insn_and_split.
(vec_extract_hi_<mode>): New define_insn.
* config/loongarch/loongarch-protos.h (loongarch_check_vect_par_cnst_half)
New function prototype.
* config/loongarch/loongarch.cc (loongarch_split_reduction):
Implement TARGET_VECTORIZE_SPLIT_REDUCTION.
(loongarch_check_vect_par_cnst_half): New function.
* config/loongarch/predicates.md
(vect_par_cnst_low_half): New predicate.
(vect_par_cnst_high_half): New predicate.
Adjust the range of ssa_name to reflect the conditional value of ssa_name
relative to whether its in the TRUE or FALSE part of the COND_EXPR.
PR tree-optimization/114025
gcc/
* gimple-range-fold.cc (fold_using_range::condexpr_adjust): Handle
the same ssa_name in the condition and the COND_EXPR better.
Gaius Mulley [Fri, 24 Oct 2025 20:51:36 +0000 (21:51 +0100)]
PR modula2/122407: Followup to spell check remaining intrinsics
This followup patch ensures that any unknown symbol spell check
error in the instrinsics uses the parameter token rather than the
procedure name token. In turn this allows the filter module to
detect and remove multiple unknowns at the same token.
The patch also adds spell checking to the instrinsic parameters.
PR modula2/122407
* gm2.dg/spell/iso/fail/badspellabs.mod: New test.
* gm2.dg/spell/iso/fail/badspelladr.mod: New test.
* gm2.dg/spell/iso/fail/badspellcap.mod: New test.
* gm2.dg/spell/iso/fail/badspellchr.mod: New test.
* gm2.dg/spell/iso/fail/badspellchr2.mod: New test.
* gm2.dg/spell/iso/fail/badspelldec.mod: New test.
* gm2.dg/spell/iso/fail/badspellexcl.mod: New test.
* gm2.dg/spell/iso/fail/badspellinc.mod: New test.
* gm2.dg/spell/iso/fail/badspellincl.mod: New test.
* gm2.dg/spell/iso/fail/badspellnew.mod: New test.
* gm2.dg/spell/iso/fail/badspellsize.mod: New test.
* gm2.dg/spell/iso/fail/dg-spell-iso-fail.exp: New test.
After r16-3956-gf613fdc6920c83, these 2 testcases start to fail as
now SRA will fully scalarize the structs and only provide with the piece
that was used. But the testcase is testing about the padding so turning
off SRA is correct in this case.
Pushed as obvious after testing the testcases now pass.
PR target/122402
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/auto-init-padding-2.c: Turn off SRA.
* gcc.target/aarch64/auto-init-padding-4.c: Likewise.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
When exporting declarations in a namespace with using declarations, the
name must be fully qualified. Recently introduced std::full_extent{,_t},
std:strided_slice, and std::submdspan_mapping_result broke module std
and module std.compat.
Harald Anlauf [Thu, 23 Oct 2025 19:21:04 +0000 (21:21 +0200)]
Fortran: fix TRANSFER of subarray component references [PR122386]
Commit r16-518 introduced a change that fixed inquiry references of complex
arrays as argument to the TRANSFER intrinsic by forcing a temporary. The
solution taken however turned out not to be generalizable to component
references of nested derived-type arrays. A better way is the revert that
patch and force the generation of a temporary when the SOURCE expression is
a not simply-contiguous array.
PR fortran/122386
gcc/fortran/ChangeLog:
* dependency.cc (gfc_ref_needs_temporary_p): Revert r16-518.
* trans-intrinsic.cc (gfc_conv_intrinsic_transfer): Force temporary
for SOURCE not being a simply-contiguous array.
gcc/testsuite/ChangeLog:
* gfortran.dg/transfer_array_subref_2.f90: New test.
Ada: Fix other instances of incorrect String lower bound in gnatlink
This also reverts an unintentional change introduced by the initial fix.
gcc/ada/
PR ada/81087
* gnatlink.adb (Is_Prefix): Move around, streamline and return false
when the prefix is not strict.
(Gnatlink): Fix other instances of incorrect lower bound assumption.
Andrew MacLeod [Thu, 23 Oct 2025 21:19:02 +0000 (17:19 -0400)]
Split signed bitwise AND operations.
The algorithm for bitwise AND struggles with complex signed operations
which cross the signed/unsigned barrier. When this happens, process it
as 2 seperate ranges [LB, -1] and [0, UB], and combine the results.
PR tree-optimization/114725.c
gcc/
* range-op.cc (operator_bitwise_and::wi_fold): Split signed
operations crossing zero into 2 operations.
Andrew MacLeod [Tue, 21 Oct 2025 20:05:22 +0000 (16:05 -0400)]
Create and apply bitmasks for truncating casts.
When folding a cast, we were not applying the bitmask if we reached
a VARYING result.
We were also not creating a bitmask to represent the lower bits of a
truncating cast in op1_range. So GORI was losing bits.
PR tree-optimization/118254
PR tree-optimization/114331
gcc/
* range-op.cc (operator_cast::fold_range): When VARYING is
reached, update the bitmask if we reach VARYING.
(operator_cast::op1_range): For truncating casts, create a
bitmask bit in LHS.
Tomasz Kamiński [Fri, 24 Oct 2025 07:37:13 +0000 (09:37 +0200)]
libstdc++: Forward arguments for bind_front<f>,bind_back<f>,nttp<f> [PR122022]
This patch fixes a missing forwarding-reference (&&) in _Bind_fn_t::operator()
and lambda returned from not_fn<f>.
The bind_front<f>/bind_back<f> tests were updated to use a structure similar
to r16-3398-g250dd5b5604fbc to cover cases involving zero, one, and many bound
arguments.
PR libstdc++/122022
libstdc++-v3/ChangeLog:
* include/std/functional (_Bind_fn_t): Use forwarding reference as
paremeter.
(std::not_fn<f>): Use forwarding reference as lambda parameter.
* testsuite/20_util/function_objects/bind_back/nttp.cc: Rework tests.
* testsuite/20_util/function_objects/bind_front/nttp.cc: Likewise.
* testsuite/20_util/function_objects/not_fn/nttp.cc: Add test for
argument forwarding.
Because it is redundant to specify 'i'-constraints on operands in single-
alternative match templates whose predicates are "const_int_operand" itself
or those that imply CONST_INT_P().
This patch also removes the 'i'-constraints on the next argument of the
callee (the number of bytes of arguments) in the four "call_internal"
patterns, since we are not interested in these arguments.
Gaius Mulley [Fri, 24 Oct 2025 12:04:10 +0000 (13:04 +0100)]
PR modula2/122407: similar error messages are emitted for an unknown symbol
This followup to PR modula2/122241 reduces error message clutter by
filtering unknown symbol error ensuring that only one error message
is emitted for an unknown symbol at a particular location.
The filter is implemented using two binary trees. A new generic
(based on the address type) binary dictionary module is added to
the base libraries.
gcc/m2/ChangeLog:
PR modula2/122407
* Make-lang.in (GM2-LIBS-BOOT-DEFS): Add BinDict.def.
(GM2-LIBS-BOOT-MODS): Add BinDict.mod.
(GM2-COMP-BOOT-DEFS): Add FilterError.def.
(GM2-COMP-BOOT-MODS): Add FilterError.mod.
(GM2-LIBS-DEFS): Add BinDict.def.
(GM2-LIBS-MODS): Add BinDict.mod.
* gm2-compiler/M2Error.def (KillError): New procedure.
* gm2-compiler/M2Error.mod (WriteFormat3): Reformat.
(NewError): Rewrite and call AddToList.
(AddToList): New procedure.
(SubFromList): Ditto.
(WipeReferences): Ditto.
(KillError): Ditto.
* gm2-compiler/M2LexBuf.mod (MakeVirtualTok): Return
caret if all token positions are identical.
* gm2-compiler/M2MetaError.mod (KillError): Import.
(FilterError): Import.
(FilterUnknown): New global.
(initErrorBlock): Initialize symcause and token.
(push): Capitalize comments.
(pop): Copy symcause to toblock if discovered.
(doError): Add parameter sym.
(defaultError): Assign token if discovered.
Pass NulSym to doError.
(updateTokSym): New procedure.
(chooseError): Call updateTokSym.
(doErrorScopeModule): Pass sym to doError.
(doErrorScopeForward): Ditto.
(doErrorScopeMod): Ditto.
(doErrorScopeFor): Ditto.
(doErrorScopeDefinition): Ditto.
(doErrorScopeDef): Ditto.
(doErrorScopeProc): Ditto.
(used): Pass sym[bol] to doError.
(op): Assign symcause when encountering
an error, warning or note.
(MetaErrorStringT1): Rewrite.
(MetaErrorStringT2): Ditto.
(MetaErrorStringT3): Ditto.
(MetaErrorStringT4): Ditto.
(isUniqueError): New procedure function.
(wrapErrors): Rewrite.
(FilterUnknown): Initialize.
* gm2-compiler/M2Quads.mod (BuildTSizeFunction): Add spell check
hint specifier.
* gm2-compiler/FilterError.def: New file.
* gm2-compiler/FilterError.mod: New file.
* gm2-libs/BinDict.def: New file.
* gm2-libs/BinDict.mod: New file.
Thomas Schwinge [Tue, 21 Oct 2025 07:46:32 +0000 (09:46 +0200)]
Simplify 'Makefile' dependencies for libatomic [PR81358]
I noticed that commit r16-4315-ge63cf4b130b86dd7dde1bf499d3d40faca10ea2e
"PR81358: Enable automatic linking of libatomic" had introduced a lot of
repeated 'Makefile' dependencies for libatomic, including some nonsensical
ones, like 'configure-stage1-target-libada: maybe-all-stage1-target-libatomic'
(libada isn't bootstrapped). That's because the code for generation of
dependencies had been put into inside an existing loop over 'target_modules'.
PR driver/81358
* Makefile.tpl: Move generation of dependencies for libatomic out
of loop over 'target_modules'.
* Makefile.in: Regenerate.
Hi,
this patch remvoes the annotation which causes the build to fail when
configured with --enable-gather-detailed-mem-stats. I am very familiar
with the mem stat system yet, so I'd leave annotating these functions
for a future patch.
Richard Biener [Fri, 24 Oct 2025 08:13:14 +0000 (10:13 +0200)]
Fix reduction validation for associated reduction chains
The code checking whether we have a single cycle and tracking the
reduction chain was not transitioned to full SLP which now shows
when having a SLP reduction chain built after associating the
reduction operation.
* tree-vect-loop.cc (vectorizable_reduction): SLP-ify reduction
operation processing a bit more.
Richard Biener [Fri, 24 Oct 2025 07:07:56 +0000 (09:07 +0200)]
tree-optimization/122406 - incomplete handling of reduction chain with conversion
The following fixes the mixup between reduction operation and conversion
wrapping a reduction chain. This also exposes a missed optimization
but I'm going to fix that in a followup.
PR tree-optimization/122406
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): Create
the SLP nodes for the conversions around the reduction
operation if required.
* gcc.dg/vect/vect-pr122406-1.c: New testcase.
* gcc.dg/vect/vect-pr122406-2.c: Likewise.
OpenMP: Fix bogus diagnostics with intervening code [PR121452]
The introduction in r14-3488-ga62c8324e7e31a of OMP_STRUCTURED_BLOCK (to
diagnose invalid intervening code) caused a regression rejecting the valid use
of the Fortran CONTINUE statement to end a collapsed loop.
This patch fixes the incorrect error checking in the OMP lowering pass. It also
fixes a check in the Fortran front end that erroneously rejects a similar
statement in an ordered loop.
* openmp.cc (resolve_omp_do): Allow CONTINUE as end statement of a
perfectly nested loop.
gcc/ChangeLog:
* omp-low.cc (check_omp_nesting_restrictions): Accept an
OMP_STRUCTURED_BLOCK in a collapsed simd region and in an ordered loop.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/pr121452-1.c: New test.
* c-c++-common/gomp/pr121452-2.c: New test.
* gfortran.dg/gomp/pr121452-1.f90: New test.
* gfortran.dg/gomp/pr121452-2.f90: New test.
* gfortran.dg/gomp/pr121452-3.f90: New test.
GCC builds three VEC_PERM_EXPRs for the intrinsic calls. The first two
implement vector shifts and the final one implements the blend, but they
use different vector modes. The forward propagation fails to optimize
this case because VIEW_CONVERT_EXPRs in between block the folding.
This patch adds a match.pd pattern to recognize the concat-and-extract
idiom and folds the VEC_PERM_EXPR chain, even when VIEW_CONVERT_EXPRs
split the chain.
Bootstrapped and tested on aarch64-linux-gnu and x86_64-linux-gnu.
Olivier Hainque [Tue, 3 Jun 2025 21:10:38 +0000 (18:10 -0300)]
Adjust VxWorks special case in testsuite check_weak_available
check_weak_available was reporting weak symbols unsupported
for vxworks unconditionally while they are actually supported
vxworks 7 now (assumed >= r2). This change adjusts the
predicate to reflect that.
We used to believe we should distinguish kernel and rtp modes,
and experiments showed that this distinction is actually
counterproductive for the testsuite's purposes.
This allows a few extra tests to run (and pass :), in particular
in g++.dg/modules.
2021-02-03 Olivier Hainque <hainque@adacore.com>
gcc/testsuite/
* lib/target-supports.exp (check_weak_available):
Return 1 for VxWorks7.
Joseph Myers [Fri, 24 Oct 2025 00:47:54 +0000 (00:47 +0000)]
c: Implement C2y static assertions in expressions
C2y has added support for static assertions as void expressions, in
addition to use as declarations (N3715 was accepted in an online vote
between meetings).
Implement the feature in GCC. There is a syntactic ambiguity between
a static assertion as a declaration and one as an expression
statement, which the accepted feature resolves by making such a usage
a declaration (this only affects the sequence of syntax productions by
which the code is parsed, not the actual semantics of the assertion);
I've raised the similar ambiguity in for loops on the WG14 reflector.
If just concerned with C2y, and not with diagnosing the use of a
feature not supported in older standard versions, the feature might be
simpler to implement by defaulting to treating static assertions in
ambiguous contexts as expressions rather than declarations, but that
would make it hard to diagnose exactly the cases that are new in C2y
(those depend on the static assertion either not being the whole
expression statement, or being in a context where an expression
statement is allowed but a declaration is not, e.g. the body of an if
statement). Instead, to support such diagnostics, the implementation
follows the standard in what is considered a declaration and what is
considered an expression, by looking ahead to what follows the closing
parenthesis when a static assertion starts in a context where a
declaration is permitted.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/c/
* c-parser.cc (c_parser_next_tokens_start_typename)
(c_parser_next_tokens_start_declaration): Add argument for token
to start from
(c_parser_next_tokens_start_declaration): Check for whether static
assertion followed by semicolon.
(c_parser_check_balanced_raw_token_sequence): Declare earlier.
(c_parser_compound_statement_nostart, c_parser_for_statement): Use
c_parser_next_tokens_start_declaration not
c_token_starts_declaration on second token.
(c_parser_unary_expression): Handle static assertions.
* c-parser.h (c_parser_next_tokens_start_declaration): Add
argument.
Robert Dubner [Thu, 23 Oct 2025 18:18:41 +0000 (14:18 -0400)]
cobol: Corrected FUNCTION CHAR and FUNCTION ORD.
The functions CHAR and ORD have been changed to correctly report on
character positions within the collation sequence.
The use of the LOW-VALUE and HIGH-VALUE figurative constants has been
corrected.
Some establishment of DISPLAY and NATIONAL encodings has been done
in anticipation of changes soon to come.
Some new testsuite tests have been added.
gcc/cobol/ChangeLog:
* genapi.cc (parser_alphabet): Alphabet encoding.
(parser_alphabet_use): Likewise.
(parser_xml_parse): Use correct debugging macro; encoding.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(initialize_the_data): Encoding.
(parser_label_label): Debugging macros.
(parser_label_goto): Likewise.
(parser_file_add): Encoding.
(parser_intrinsic_call_1): Special handling for __gg__char.
(parser_intrinsic_call_2): Formatting.
* parse.y: Response from FUNCTION ORD is flagged "unsigned".
* symbols.cc (cbl_alphabet_t::reencode): Establish
low_char & high_char.
* symbols.h (struct cbl_alphabet_t): Likewise.
libgcobol/ChangeLog:
* charmaps.cc: Encoding.
* charmaps.h (class charmap_t): Encoding.
* intrinsic.cc (__gg__char): Report the character at the
collation position.
(__gg__ord): Report the collation position of a character.
* libgcobol.cc (struct program_state): Add encodings;
Remove obsolete defines.
(__gg__current_collation): New function for encoding/collation.
(__gg__pop_program_state): Encoding.
(__gg__init_program_state): Encoding.
(format_for_display_internal): Handle LOW-VALUE and HIGH-VALUE.
(__gg__compare_2): Encoding.
(__gg__alphabet_use): Likewise.
* libgcobol.h (__gg__current_collation): New declaration.
* xmlparse.cc (__gg__xml_parse): Make a parameter const.
gcc/testsuite/ChangeLog:
* cobol.dg/group2/Length_overflow__2_.out: Updated test result.
* cobol.dg/group2/Length_overflow_with_offset__1_.out: Likewise.
* cobol.dg/group2/Offset_overflow.out: Likewise.
* cobol.dg/group2/CALL_with_OCCURS_DEPENDING_ON.cob: New test.
* cobol.dg/group2/CALL_with_OCCURS_DEPENDING_ON.out: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_ASCII.cob: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_ASCII.out: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_EBCDIC.cob: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_EBCDIC.out: New test.
* cobol.dg/group2/EC-BOUND-REF-MOD_checking_process_termination.cob: New test.
* cobol.dg/group2/EC-BOUND-REF-MOD_checking_process_termination.out: New test.
* cobol.dg/group2/Intrinsics_without_FUNCTION_keyword__3_.cob: New test.
* cobol.dg/group2/Occurs_DEPENDING_ON__source_and_dest.cob: New test.
* cobol.dg/group2/Occurs_DEPENDING_ON__source_and_dest.out: New test.
* cobol.dg/group2/Recursive_subscripts.cob: New test.
* cobol.dg/group2/Recursive_subscripts.out: New test.
* cobol.dg/group2/SEARCH_ALL_with_OCCURS_DEPENDING_ON.cob: New test.
* cobol.dg/group2/SEARCH_ALL_with_OCCURS_DEPENDING_ON.out: New test.
* cobol.dg/group2/Subscript_by_arithmetic_expression.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__1_.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__1_.out: New test.
* cobol.dg/group2/Subscript_out_of_bounds__2_.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__2_.out: New test.
* cobol.dg/group2/Subscripted_refmods.cob: New test.
* cobol.dg/group2/Subscripted_refmods.out: New test.
* cobol.dg/group2/length_of_ODO_Rules_7__8A__and_8B.cob: New test.
* cobol.dg/group2/length_of_ODO_Rules_7__8A__and_8B.out: New test.
* cobol.dg/group2/length_of_ODO_w_-_reference_modification.cob: New test.
Andrew Pinski [Wed, 22 Oct 2025 22:46:03 +0000 (15:46 -0700)]
match: improve handling of `((signed)x) < 0` to `x >= (unsigned)SIGNED_TYPE_MIN` in `(type1)x CMP CST1 ? (type2)x : CST2` pattern.
This is a follow on r16-4534-g07800a565abd20 based on the review of
the other pattern (https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698336.html)
as the same issue mentioned in that review apply here.
This changes to use the new version of minmax_from_comparison so we don't need to create
a tree for the constant. and use wi::mask instead of TYPE_MIN_VALUE.
Andrew Pinski [Tue, 21 Oct 2025 05:59:51 +0000 (22:59 -0700)]
phiopt: Remove minmax_replacement [PR101024]
Now all of the optimizations are done in match from
minmax_replacement. We can now remove minmax_replacement. :)
This keeps around the special case for fp `a CMP b ? a : b` that was
added with r14-2699-g9f8f37f5490076 (PR 88540) and moves it to
match_simplify_replacement.
Bootsrapped and tested on x86_64-linux-gnu.
Note bool-12.c needed to be updated since phiopt1 rejecting the
BIT_AND/BIT_IOR with a cast and not getting MIN/MAX any more.
gcc/ChangeLog:
PR tree-optimization/101024
* tree-ssa-phiopt.cc (match_simplify_replacement): Special
case fp `a CMP b ? a : b` when not creating a min/max.
(strip_bit_not): Remove.
(invert_minmax_code): Remove.
(minmax_replacement): Remove.
(pass_phiopt::execute): Update pass comment.
Don't call minmax_replacement.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/bool-12.c: Update based on when BIT_AND/BIT_IOR
is created and no longer MIN/MAX.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Tue, 21 Oct 2025 05:45:47 +0000 (22:45 -0700)]
match: Add support for `((signed)a </>= 0) ? min/max (a, c) : b` [PR101024]
This is the last patch that is needed to support to remove minmax_replacement.
This fixes pr101024-1.c which is failing when minmax_replacement is removed.
This next patch will remove it.
Changes since v1:
* v2: Add new version of minmax_from_comparison that takes widest_int.
Constraint the pattern to constant integers in some cases.
Use mask to create the SIGNED_MAX and use GT/LE instead.
Use wi::le_p/wi::ge_p instead of fold_build to do the comparison.
gcc/ChangeLog:
PR tree-optimization/101024
* fold-const.cc (minmax_from_comparison): New version that takes widest_int
instead of tree.
(minmax_from_comparison): Call minmax_from_comparison for integer cst case.
* fold-const.h (minmax_from_comparison): New declaration.
* match.pd (`((signed)a </>= 0) ? min/max (a, c) : b`): New pattern.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Robert Dubner [Tue, 21 Oct 2025 17:33:30 +0000 (13:33 -0400)]
cobol: Implement the XML PARSE statement.
These changes implement the XML PARSE statement as described in the IBM
specification.
A repair to exception handling is included. Up until now, an exception
after a successful file operation wasn't handled properly.
A repair to value declarations for BINARY / COMP / COMP-4 / COMP-5
values now allows them to have digits to the right of the implied
decimal point. Processing of the "S" PICTURE character has been
normalized as well.
Co-Authored-By: James K. Lowden <jklowden@cobolworx.com> Co-Authored-By: Robert Dubner <rdubner@symas.com>
gcc/cobol/ChangeLog:
* Make-lang.in: Incorporate new token_names.h file.
* cdf.y: Modify tokens.
* gcobol.1: Document XML PARSE statement
* genapi.cc (parser_enter_program): Verify that every goto has a
matching label.
(parser_end_program): Likewise.
(parser_alphabet): Refine handling codeset encodings.
(parser_alphabet_use): Likewise.
(label_fetch): Moved from later in the source code.
(parser_xml_parse): New routine for XML PARSE.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(parser_label_label): Verify goto/label matching.
(parser_label_goto): Likewise.
(parser_entry): Minor change to SHOW_PARSE report.
* genapi.h (parser_alphabet): Set parameter to const.
(parser_xml_parse): Declare new function.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(parser_label_addr): Likewise.
* parse.y: label_pair_t structure; locale processing; new token
processing for alphabets and XML PARSE.
* parse_ante.h (name_of): Return field->name when initial is NULL.
(new_tempnumeric): Make signable_e optional.
(ast_save_locale): New function.
(data_division_ready): Warning for "no alphabet".
* scan.l: Repair interpretation of BINARY, COMP, COMP-4, and
COMP-5.
* scan_ante.h (struct bint_t): Likewise.
* scan_post.h (current_tokens_t::tokenset_t::tokenset_t):
Include token_names.h.
* symbols.cc (symbols_alphabet_set): Revert to prior alphabet
determination.
(symbol_table_init): New XML special registers.
(new_temporary): Make signable_e controllable, not fixed.
* symbols.h (__gg__encoding_iconv_valid): New declaration.
(enum cbl_label_type_t): New LblXml label type.
(struct cbl_xml_parse_t):
(struct cbl_label_t): Implement XML PARSE.
(new_temporary): Incorporate boolean for signable_e.
(symbol_elem_of): Change label field type handling.
(cbl_section_of): Likewise.
(cbl_field_of): Likewise.
(cbl_label_of): Likewise.
(cbl_special_name_of): Likewise.
(cbl_alphabet_of): Likewise.
(cbl_file_of): Likewise.
* token_names.h: New file.
* util.cc (gcc_location_set_impl): Improve location_t calculations
when entering and leaving COPYBOOKs.
libgcobol/ChangeLog:
* Makefile.am: Changes for XML PARSE and POSIX functions.
* Makefile.in: Likewise.
* charmaps.cc: Augment encodings[] table with "supported" boolean.
(__gg__encoding_iconv_name): Modify how encodings are identified.
(encoding_descr): Likewise.
(__gg__encoding_iconv_valid): Likewise.
* common-defs.h (callback_t): Define function pointer.
* constants.cc: Use named cbl_attr_e constants instead of magic
numbers.; New definitions for XML special registers.
* encodings.h (struct encodings_t): Declare "supported" boolean.
* libgcobol.cc (format_for_display_internal): Use std::ptrdiff_t.
(__gg__alphabet_use): Add case for iconv_CP1252_e.
(default_exception_handler): Repair exception handling after a
successful file operation.
* posix/errno.cc: New file.
* posix/localtime.cc: New file.
* posix/stat.cc: New file.
* posix/stat.h: New file.
* posix/tm.h: New file.
* xmlparse.cc: New file to support XML PARSE statement.
gcc/testsuite/ChangeLog:
* cobol.dg/typo-1.cob: New test for squiggles and carets.
Alfie Richards [Tue, 7 Oct 2025 12:42:41 +0000 (12:42 +0000)]
aarch64: testsuite: Add test for supported FMV extensions.
Add tests that check the aarch64 version features are supported, that they
have the correct priority ordering, and that the generated resolver is correct.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/fmv_priority1.c: New test.
* gcc.target/aarch64/fmv_priority2.c: New test.
* gcc.target/aarch64/fmv_priority.in: Support file.
zhaozhou [Thu, 23 Oct 2025 10:52:22 +0000 (18:52 +0800)]
match.pd: Fold pattern of round semantics.
In the 538.imagick_r benchmark of Spec2017, I find these pattern from
MagickRound function. This patch implements these pattern in match.pd
for 4 rules:
1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5)
2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5)
3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5)
4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5)
The patch implements floor(x+0.5) operation to replace these pattern
that semantics of round(x) function.
The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu,
SPEC 2017 and SPEC 2006 were run:
As for SPEC 2017, 538.imagick_r benchmark performance increased by 3%+
in base test of ratio mode.
As for SPEC 2006, while the transform does not seem to be triggered,
we also see no non-noise impact on performance.
Andrew Stubbs [Mon, 20 Oct 2025 14:57:41 +0000 (14:57 +0000)]
libgomp: fine-grained pinned memory allocator
This patch introduces a new custom memory allocator for use with pinned
memory (in the case where the Cuda allocator isn't available). In future,
this allocator will also be used for Managed Memory. Both memories are
incompatible with the system malloc because allocated memory cannot share a
page with memory allocated for other purposes.
This means that small allocations will no longer consume an entire page of
pinned memory. Unfortunately, it also means that pinned memory pages will
never be unmapped (although they may be reused). This isn't a technical
limitation; the "free" algorithm could be extended in future, if needed.
The implementation is not perfect; there are various corner cases (especially
related to extending onto new pages) where allocations and reallocations may
be sub-optimal, but it should still be a step forward in support for small
allocations.
I have considered using libmemkind's "fixed" memory but rejected it for three
reasons: 1) libmemkind may not always be present at runtime, 2) there's no
currently documented means to extend a "fixed" kind one page at a time
(although the code appears to have an undocumented function that may do the
job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its
regular kinds would be straight-forward), 3) Managed Memory benefits from
having the metadata located in different memory and using an external
implementation makes it hard to guarantee this.
libgomp/ChangeLog:
* Makefile.am (libgomp_la_SOURCES): Add simple-allocator.c.
* Makefile.in: Regenerate.
* basic-allocator.c: Mention simple-allocator in the comment.
* config/linux/allocator.c: Include unistd.h.
(pin_ctx): New variable.
(ctxlock): New variable.
(linux_init_pin_ctx): New function.
(linux_memspace_alloc): Use simple-allocator for pinned memory.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise.
* libgomp.h (gomp_simple_alloc_init_context): New prototype.
(gomp_simple_alloc_register_memory): New prototype.
(gomp_simple_alloc): New prototype.
(gomp_simple_free): New prototype.
(gomp_simple_realloc): New prototype.
* libgomp.texi: Update pinned memory trait documentation.
* testsuite/libgomp.c/alloc-pinned-8.c: New test.
* simple-allocator.c: New file.
Andrew Stubbs [Tue, 14 Oct 2025 11:22:05 +0000 (11:22 +0000)]
libgomp, nvptx: Cuda pinned memory
Use Cuda to pin memory, instead of Linux mlock, when available.
There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.
The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX. At present, the other supported devices do
not have equivalent capabilities (or requirements).
libgomp/ChangeLog:
* config/linux/allocator.c: Include assert.h.
(using_device_for_page_locked): New variable.
(linux_memspace_alloc): Add init0 parameter. Support device pinning.
(linux_memspace_calloc): Set init0 to true.
(linux_memspace_free): Support device pinning.
(linux_memspace_realloc): Support device pinning.
(MEMSPACE_ALLOC): Set init0 to false.
* libgomp-plugin.h
(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* libgomp.h (gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(struct gomp_device_descr): Add page_locked_host_alloc_func and
page_locked_host_free_func.
* libgomp.texi: Adjust the docs for the pinned trait.
* plugin/plugin-nvptx.c
(GOMP_OFFLOAD_page_locked_host_alloc): New function.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* target.c (device_for_page_locked): New variable.
(get_device_for_page_locked): New function.
(gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(gomp_load_plugin_for_device): Add page_locked_host_alloc and
page_locked_host_free.
* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
devices.
* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Richard Biener [Thu, 23 Oct 2025 09:13:09 +0000 (11:13 +0200)]
Remove LOOP_VINFO_SLP_UNROLLING_FACTOR
The following removes LOOP_VINFO_SLP_UNROLLING_FACTOR in favor of
using LOOP_VINFO_VECT_FACTOR directly now that there's no difference
between the two possible.
liuhongt [Wed, 15 Oct 2025 09:00:30 +0000 (02:00 -0700)]
Support reduc_sbool_and_scal_m for V{QI,SI,DI}mode.
gcc/ChangeLog:
PR target/101639
* config/i386/sse.md
(VI_AVX): New mode iterator.
(VI_AVX_CMP): Ditto.
(ssebytemode): Add V16HI, V32QI, V16QI.
(reduc_sbool_and_scal_<mode>): New expander.
(reduc_sbool_ior_scal_<mode>): Ditto.
(reduc_sbool_xor_scal_<mode>): Ditto.
(*eq<mode>3_2_negate): New pre_reload splitter.
(*ptest<mode>_ccz): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr101639_reduc_mask_vdi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_vqi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_vsi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_ior_vqi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_and_vqi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_di.c: New test.
* gcc.target/i386/pr101639_reduc_mask_hi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_qi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_si.c: New test.
r16-4540-g80af807e52e4f4 exposed a bug in two testcases where the declaration of
local labels was wrongly commented out. That caused "duplicate label" errors.
Uncommenting declarations fixes it.
PR middle-end/122378
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/attrs-metadirective-2.c: Uncomment local label
declaration.
* c-c++-common/gomp/metadirective-2.c: Likewise.
Jonathan Wakely [Tue, 21 Oct 2025 16:06:49 +0000 (17:06 +0100)]
libstdc++: Avoid incrementing input iterators with std::prev [PR122224]
As explained in PR libstdc++/122224 we do not make it ill-formed to call
std::prev with a non-Cpp17BidirectionalIterator. Instead we just use a
runtime assertion to check the std::advance precondition that the
distance is not negative.
This allows us to support std::prev on types which model the C++20
std::bidirectional_iterator concept but do not meet the
Cpp17BidirectionalIterator requirements, e.g. iota_view's iterators.
It also allows us to support std::prev(iter, -1) which is admittedly
weird, but there's no reason it shouldn't be equivalent to
std::next(iter), which is perfectly fine to use on non-bidirectional
iterators. In other words, "reverse decrementing" is valid for
non-bidirectional iterators.
However, the current implementation of std::advance for
non-bidirectional iterators uses a loop that does `while (n--) ++i;`
which assumes that n is not negative and so will eventually reach zero.
When the assertion for the precondition is not enabled, incrementing the
iterator while n is non-zero means that using std::prev(iter) or
std::next(iter, -1) on a non-bidirectional iterator will keep
incrementing the iterator until n reaches INT_MIN, overflows, and then
keeps decrementing until it eventually reaches zero. Incrementing most
iterators that many times will cause memory safety errors long before
the integer reaches zero and terminates the loop.
This commit changes the loop to use `while (n-- > 0)` which means that
the loop doesn't execute at all if a negative n is used. We still
consider such calls to be erroneous, but when the precondition isn't
checked by an assertion, the function now has no effects. The undefined
behaviour resulting from incrementing the iterator is prevented.
libstdc++-v3/ChangeLog:
PR libstdc++/122224
* include/bits/stl_iterator_base_funcs.h (prev): Compare
distance as n > 0 instead of n != 0.
* testsuite/24_iterators/range_operations/122224.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jakub Jelinek [Wed, 22 Oct 2025 11:11:52 +0000 (13:11 +0200)]
c++: Fix up RAW_DATA_CST handling in braced_list_to_string [PR122302]
The following testcase is miscompiled, because a RAW_DATA_CST tree
node is shared by multiple CONSTRUCTORs and when the braced_list_to_string
function changes one to extend the RAW_DATA_CST over the single preceding
and single succeeding INTEGER_CST, it changes the RAW_DATA_CST in
the other CONSTRUCTOR where the elts around it are still present.
Fixed by tweaking a copy of it instead, like we handle it in other spots.
2025-10-22 Jakub Jelinek <jakub@redhat.com>
PR c++/122302
* c-common.cc (braced_list_to_string): Call copy_node on RAW_DATA_CST
before changing RAW_DATA_POINTER and RAW_DATA_LENGTH on it.
* g++.dg/cpp0x/pr122302.C: New test.
* g++.dg/cpp/embed-27.C: New test.
Where the ptrue is normally executed much earlier so it's not a bottleneck for
the compare.
For the remaining codegen see test vect-reduc-bool-18.c.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (reduc_sbool_and_scal_<mode>,
reduc_sbool_ior_scal_<mode>, reduc_sbool_xor_scal_<mode>): Use SVE if
available.
* config/aarch64/aarch64-sve.md (*cmp<cmp_op><mode>_ptest): Rename ...
(@aarch64_pred_cmp<cmp_op><mode>_ptest): ... To this.
(reduc_sbool_xor_scal_<mode>): Rename ...
(@reduc_sbool_xor_scal_<mode>): ... To this.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/vect-reduc-bool-10.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-11.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-12.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-13.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-14.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-15.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-16.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-17.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-18.c: New test.
Tamar Christina [Wed, 22 Oct 2025 09:51:41 +0000 (10:51 +0100)]
AArch64: Add support for boolean reductions for Adv. SIMD
The vectorizer has learned how to do boolean reductions of masks to a C bool
for the operations OR, XOR and AND.
This implements the new optabs for Adv.SIMD. Adv.SIMD today can already
vectorize such loops but does so through SHIFT-AND-INSERT to perform the
reductions step-wise and inorder. As an example, an OR reduction today does:
* gcc.target/aarch64/vect-reduc-bool-1.c: New test.
* gcc.target/aarch64/vect-reduc-bool-2.c: New test.
* gcc.target/aarch64/vect-reduc-bool-3.c: New test.
* gcc.target/aarch64/vect-reduc-bool-4.c: New test.
* gcc.target/aarch64/vect-reduc-bool-5.c: New test.
* gcc.target/aarch64/vect-reduc-bool-6.c: New test.
* gcc.target/aarch64/vect-reduc-bool-7.c: New test.
* gcc.target/aarch64/vect-reduc-bool-8.c: New test.
* gcc.target/aarch64/vect-reduc-bool-9.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-1.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-2.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-3.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-4.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-5.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-6.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-7.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-8.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-9.c: New test.
Tamar Christina [Wed, 22 Oct 2025 09:49:26 +0000 (10:49 +0100)]
vect: Add support for boolean reductions for VLA
The support for the new boolean reduction optabs didn't quite work for VLA
because the code later on insists on the target still having a shift-and-insert
optab.
This is however not needed if the target can do the reduction using the new
optabs, and the initial reduction value matches the neutral value and we
have one SLP lane while not having a reduction chain.
gcc/ChangeLog:
* tree-vect-loop.cc (vectorizable_reduction): Don't always require
IFN_VEC_SHL_INSERT when using reduc sbool optabs.
aarch64: Add autoregenerated files for AArch64 options.
In the previous committed patch to "add support for
menable-sysreg-checking flag", I have made changes to
config/aarch64/aarch64.opt, but missed to update the
autoregenerated files.
This patch adds the updated autoregenerated aarch64.opt.urls
changes.
Richard Biener [Tue, 21 Oct 2025 13:23:18 +0000 (15:23 +0200)]
tree-optimization/122364 - reduction chain with conversion
The following handles detecting of a reduction chain wrapped in a
conversion. This does not yet try to combine operands with different
signedness, but we should now handle signed integer accumulation
to both a signed and unsigned accumulator fine.
PR tree-optimization/122364
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): Re-try
linearization on a conversion source.
Richard Biener [Wed, 22 Oct 2025 08:05:01 +0000 (10:05 +0200)]
tree-optimization/122370 - ICE with reduction and masks
The following fixes bad interaction with mask demotion to data
and the code dealing with UB on signed reductions by making sure
to also update compute_vectype when updating vectype.
PR tree-optimization/122370
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Also update compute_vectype when demoting masks to an
integer vector.
Jonathan Wakely [Tue, 21 Oct 2025 20:21:45 +0000 (21:21 +0100)]
libstdc++: Add missing constraints to views::indices
Calling views::indices(n) should be expression equivalent to
views::iota(decltype(n)(0), n), which means it should have the same
constraints as views::iota and be SFINAE-friendly.
libstdc++-v3/ChangeLog:
* include/std/ranges (indices::operator()): Constrain using
__can_iota_view concept.
* testsuite/std/ranges/indices/1.cc: Check SFINAE-friendliness
required by expression equivalence. Replace unused <vector>
header with <stddef.h> needed for size_t.
Richard Biener [Wed, 22 Oct 2025 07:41:15 +0000 (09:41 +0200)]
tree-optimization/122371 - ICE with reduction chain and fold-left reduction
The fold-left reduction transform relies on preserving the scalar
cycle PHI. The following rewrites how we connect this to the
involved stmt-infos instead of relying on (the actually bogus for
reduction chain) scalar stmts in SLP nodes more than absolutely
necessary. This also makes sure to not re-associate to form a
reduction chain when a fold-left reduction is required.
PR tree-optimization/122371
* tree-vect-loop.cc (vectorize_fold_left_reduction): Get
to the scalar def to replace via the scalar PHI backedge def.
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): Do not
re-associate to for a reduction chain if a fold-left
reduction is required.
libstdc++: Implement optional<T&> from P2988R12 [PR121748]
This patch implements optional<T&> based on the P2988R12 paper, incorporating
corrections from LWG4300, LWG4304, and LWG3467. The resolution for LWG4015
is also extended to cover optional<T&>.
We introduce _M_fwd() helper, that is equivalent to operator*(), except that
it does not check non-empty precondition. It is used in to correctly propagate
the value during move construction from optional<T&>. This is necessary because
moving an optional<T&> must not move the contained object, which is the key
distinction between *std::move(opt) and std::move(*opt).
The implementation deviates from the standard by providing a separate std::swap
overload for std::optional<T&>, which simplifies preserving the resolution of
LWG2766.
This introduces a few changes to make_optional behavior (see included test):
* some previously valid uses of make_optional<T>({...}) (where T is not a
reference type) now become ill-formed (see optional/make_optional_neg.cc).
* make_optional<T&>(t) and make_optional<const T&>(ct), where decltype(t) is T&,
and decltype(ct) is const T& now produce optional<T&> and optional<const T&>
respectively, instead of optional<T>.
* a few other uses of make_optional<R> with reference type R are now ill-formed.
PR libstdc++/121748
libstdc++-v3/ChangeLog:
* include/bits/version.def: Bump value for optional,
* include/bits/version.h: Regenerate.
* include/std/optional (std::__is_valid_contained_type_for_optional):
Define.
(std::optional<T>): Use __is_valid_contained_type_for_optional.
(optional<T>(const optional<_Up>&), optional<T>(optional<_Up>&&))
(optional<T>::operator=(const optional<_Up>&))
(optional<T>::operator=(optional<_Up>&&)): Replacex._M_get() with
x._M_fwd(), and std::move(x._M_get()) with std::move(x)._M_fwd().
(optional<T>::and_then): Remove uncessary remove_cvref_t.
(optional<T>::_M_fwd): Define.
(std::optional<T&>): Define new partial specialization.
(std::swap(std::optional<T&>, std::optional<T&>)): Define.
(std::make_optional(_Tp&&)): Add non-type template parameter.
(std::make_optional): Use parenthesis to constructor optional.
(std::hash<optional<T>>): Add comment.
* testsuite/20_util/optional/make_optional-2.cc: Guarded not longer
working example.
* testsuite/20_util/optional/relops/constrained.cc: Expand test to
cover optionals of reference.
* testsuite/20_util/optional/requirements.cc: Ammend for
optional<T&>.
* testsuite/20_util/optional/requirements_neg.cc: Likewise.
* testsuite/20_util/optional/version.cc: Test new value of
__cpp_lib_optional.
* testsuite/20_util/optional/make_optional_neg.cc: New test.
* testsuite/20_util/optional/monadic/ref_neg.cc: New test.
* testsuite/20_util/optional/ref/access.cc: New test.
* testsuite/20_util/optional/ref/assign.cc: New test.
* testsuite/20_util/optional/ref/cons.cc: New test.
* testsuite/20_util/optional/ref/internal_traits.cc: New test.
* testsuite/20_util/optional/ref/make_optional/1.cc: New test.
* testsuite/20_util/optional/ref/make_optional/from_args_neg.cc:
New test.
* testsuite/20_util/optional/ref/make_optional/from_lvalue_neg.cc:
New test.
* testsuite/20_util/optional/ref/make_optional/from_rvalue_neg.cc:
New test.
* testsuite/20_util/optional/ref/monadic.cc: New test.
* testsuite/20_util/optional/ref/relops.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Biener [Tue, 21 Oct 2025 13:40:41 +0000 (15:40 +0200)]
tree-optimization/122365 - deal with bool SLP reductions
I hadn't thought of these but at least added an assert which now
tripped. Fixed thus. There's also a latent issue with AVX512
mask types. The by-pieces reduction code used the wrong element
sizes.
PR tree-optimization/122365
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Convert all inputs. Use the proper vector element sizes
for the elementwise reduction.
Haochen Jiang [Tue, 21 Oct 2025 03:21:45 +0000 (11:21 +0800)]
i386: Correct cpu codename value for unknown model number
There are several changes for features enabled on cpus. r16-1666 disabled
CLDEMOTE on clients. r16-2224 removed Key locker since Panther Lake and
Clearwater forest. r16-4436 disabled PREFETCHI on Panther Lake.
The patches caused the current return guess value not aligned for
host_detect_local_cpu meeting the unknown model number. Correct the
logic according to the features enabled.
This patch will also backport to GCC14 and GCC15.
gcc/ChangeLog:
* config/i386/driver-i386.cc (host_detect_local_cpu): Correct
the logic for unknown model number cpu guess value.
liuhongt [Mon, 20 Oct 2025 08:42:32 +0000 (01:42 -0700)]
Simplify avx512 vector integer comparison when 2 operands are known equal
For comparison NEQ/LT/NLE, it's simplified to 0.
For comparison LE/EQ/NLT, it's simplied to (1u << nelt) - 1
gcc/ChangeLog:
PR target/122320
* config/i386/sse.md (*<avx512>_cmp<mode>3_dup_op): New define_insn_and_split.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr122320-mask16.c: New test.
* gcc.target/i386/pr122320-mask2.c: New test.
* gcc.target/i386/pr122320-mask32.c: New test.
* gcc.target/i386/pr122320-mask4.c: New test.
* gcc.target/i386/pr122320-mask64.c: New test.
* gcc.target/i386/pr122320-mask8.c: New test.
Antoni Boucher [Fri, 19 Sep 2025 12:44:04 +0000 (08:44 -0400)]
libgccjit: Add _Float16, _Float32, _Float64 and __float128 support for jit
gcc/ChangeLog:
* config/i386/i386-jit.cc: Mark new float types as supported.
gcc/jit/ChangeLog:
* docs/topics/types.rst: Document new types.
* dummy-frontend.cc: Support new types in tree_type_to_jit_type.
* jit-common.h: Update NUM_GCC_JIT_TYPES.
* jit-playback.cc: Support new types in get_tree_node_for_type.
* jit-recording.cc: Support new types.
* libgccjit.h (gcc_jit_types): Add new types.
gcc/testsuite/ChangeLog:
* jit.dg/all-non-failing-tests.h: Mention new test.
* jit.dg/test-sized-float.c: New test.
Martin Uecker [Sun, 12 Oct 2025 18:00:22 +0000 (20:00 +0200)]
c2y: Allow unspecified arrays in generic association.
To allow unspecified arrays in generic association add a new
declaration context GENERIC_ASSOC for grokdeclarator and new
function grokgenassoc to be used by the parser. The error
about unspecified array is moved from build_array_declarator
to grokdeclarator to be able to check for this.
Jakub Jelinek [Tue, 21 Oct 2025 17:17:11 +0000 (19:17 +0200)]
c++: Implement C++23 P2674R1 - A trait for implicit lifetime types
The following patch attempts to implement the compiler side of the
C++23 P2674R1 paper. As mentioned in the paper, since CWG2605
the trait isn't really implementable purely on the library side.
Because it is implemented completely on the compiler side, it
just uses SCALAR_TYPE_P and so can e.g. accept __int128 even in
-std=c++23 mode, even when std::is_scalar_v<__int128> is false in
that case. And as an extention it (like Clang) accepts _Complex
types and vector types.
I must say I'm quite surprised that any array types are considered
implicit-lifetime, even if their element type is not, but perhaps
there is some reason for that.
Because std::is_array_v<int[0]> is false, it returns false for that
as well, dunno if that shouldn't be changed for implicit-lifetime.
It accepts also VLAs.
The library part has been split into a separate patch still pending
review; committing it now so that reflection can use it in its
std::meta::is_implicit_lifetime_type implementation.
2025-10-21 Jakub Jelinek <jakub@redhat.com>
gcc/cp/
* cp-tree.h: Implement C++23 P2674R1 - A trait for implicit lifetime
types.
(implicit_lifetime_type_p): Declare.
* tree.cc (implicit_lifetime_type_p): New function.
* cp-trait.def (IS_IMPLICIT_LIFETIME): New unary trait.
* semantics.cc (trait_expr_value): Handle CPTK_IS_IMPLICIT_LIFETIME.
(finish_trait_expr): Likewise.
* constraint.cc (diagnose_trait_expr): Likewise.
gcc/testsuite/
* g++.dg/ext/is_implicit_lifetime.C: New test.
OpenMP: Handle non-executable directives in intervening code [PR120180,PR122306]
OpenMP 6 permits non-executable directives in intervening code; this commit adds
support for a sensible subset, namely metadirectives, nothing, assume, and
'error at(compilation)'.
Also handle the special case where a metadirective can be resolved at parse time
to 'omp nothing'.
This fixes a build issue that affects 10 out 12 SPECaccel benchmarks.
* c-parser.cc (c_parser_pragma): Accept a subset of non-executable
OpenMP directives in intervening code.
(c_parser_omp_error): Reject 'error at(execution)' in intervening code.
(c_parser_omp_metadirective): Return early if only one selector matches
and it resolves to 'omp nothing'.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_metadirective): Return early if only one
selector matches and it resolves to 'omp nothing'.
(cp_parser_omp_error): Reject 'error at(execution)' in intervening code.
(cp_parser_pragma): Accept a subset of non-executable OpenMP directives
as intervening code.
gcc/fortran/ChangeLog:
* gfortran.h (enum gfc_exec_op): Add EXEC_OMP_FIRST_OPENMP_EXEC and
EXEC_OMP_LAST_OPENMP_EXEC.
* openmp.cc (gfc_match_omp_context_selector): Remove static. Remove
checks on score. Add cleanup. Remove checks on trait properties.
(gfc_match_omp_context_selector_specification): Remove static. Adjust
calls to gfc_match_omp_context_selector.
(gfc_match_omp_declare_variant): Adjust call to
gfc_match_omp_context_selector_specification.
(match_omp_metadirective): Likewise.
(icode_code_error_callback): Reject all statements except
'assume' and 'metadirective'.
(gfc_resolve_omp_context_selector): New function.
(resolve_omp_metadirective): Skip metadirectives which context selectors
can be statically resolved to false. Replace metadirective by its body
if only 'nothing' remains.
(gfc_resolve_omp_declare): Call gfc_resolve_omp_context_selector for
each variant.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/imperfect1.c: Adjust dg-error.
* c-c++-common/gomp/imperfect4.c: Likewise.
* c-c++-common/gomp/pr120180.c: Move to...
* c-c++-common/gomp/pr120180-1.c: ...here. Remove dg-error.
* g++.dg/gomp/attrs-imperfect1.C: Adjust dg-error.
* g++.dg/gomp/attrs-imperfect4.C: Likewise.
* gfortran.dg/gomp/declare-variant-2.f90: Adjust dg-error.
* gfortran.dg/gomp/declare-variant-20.f90: Likewise.
* c-c++-common/gomp/pr120180-2.c: New test.
* g++.dg/gomp/pr120180-1.C: New test.
* gfortran.dg/gomp/pr120180-1.f90: New test.
* gfortran.dg/gomp/pr120180-2.f90: New test.
* gfortran.dg/gomp/pr122306-1.f90: New file.
* gfortran.dg/gomp/pr122306-2.f90: New file.
Roger Sayle [Tue, 21 Oct 2025 12:14:58 +0000 (13:14 +0100)]
x86_64: Start TImode STV chains from zero-extension or *concatditi.
Currently x86_64's TImode STV pass has the restriction that candidate
chains must start with a TImode load from memory. This patch improves
the functionality of STV to allow zero-extensions and construction of
TImode pseudos from two DImode values (i.e. *concatditi) to both be
considered candidate chain initiators. For example, this allows chains
starting from an __int128 function argument to be processed by STV.
Compiled with -O2 on x86_64:
__int128 m0,m1,m2,m3;
void foo(__int128 m)
{
m0 = m;
m1 = m;
m2 = m;
m3 = m;
}
bar: movq %rdi, %xmm0
movaps %xmm0, m0(%rip)
movaps %xmm0, m1(%rip)
movaps %xmm0, m2(%rip)
movaps %xmm0, m3(%rip)
ret
As shown in the examples above, the scalar-to-vector (STV) conversion of
*concatditi has an overhead [treating two DImode registers as a TImode
value is free on x86_64], but specifying this penalty allows the STV
pass to make an informed decision if the total cost/gain of the chain
is a net win.
2025-10-21 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-features.cc (timode_concatdi_p): New
function to recognize the various variants of *concatditi3_[1-7].
(scalar_chain::add_insn): Like VEC_SELECT, ZERO_EXTEND and
timode_concatdi_p instructions don't require their input
operands to be converted (to TImode).
(timode_scalar_chain::compute_convert_gain): Split/clone XOR and
IOR cases from AND case, to handle timode_concatdi_p costs.
<case PLUS>: Handle timode_concatdi_p conversion costs.
<case ZERO_EXTEND>: Provide costs of DImode to TImode extension.
(timode_convert_concatdi): Helper function to transform
a *concatditi3 instruction into a vec_concatv2di instruction.
(timode_scalar_chain::convert_insn): Split/clone XOR and IOR
cases from ANS case, to handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
<case ZERO_EXTEND>: Convert zero_extendditi2 to *vec_concatv2di_0.
<case PLUS>: Handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
(timode_scalar_to_vector_candidate_p): Support timode_concatdi_p
instructions in IOR, XOR and PLUS cases.
<case ZERO_EXTEND>: Consider zero extension of a register from
DImode to TImode to be a candidate.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-stv-10.c: New test case.
* gcc.target/i386/sse4_1-stv-11.c: Likewise.
* gcc.target/i386/sse4_1-stv-12.c: Likewise.
Tobias Burnus [Tue, 21 Oct 2025 11:31:19 +0000 (13:31 +0200)]
OpenMP: Update directive arrays used for 'omp assume(s)' with contains/absent
Both Fortran and C/C++ have an array with classifications of directives;
currently, this array is only used to handle the restrictions of the
contains/absent clauses to the assume/assumes directives.
For C/C++, uncommenting 'declare mapper' was missed. Additionally,
'end ...' is a directive but not a directive name; hence, those
are now rejected as 'unknown directive' instead of as 'invalid'
directive.
Additionally, both lists now list newer entries (commented out) for
OpenMP 6.x - and a note (comment) was added for C/C++'s
'begin metadirective' and for Fortran's 'allocate', respectively.
gcc/c-family/ChangeLog:
* c-omp.cc (c_omp_directives): Uncomment 'declare mapper',
add comment to 'begin metadirective', add 6.x unimplemented
directives as comment-out entries.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_assumption_clauses): Switch to
'unknown' not 'invalid' directive name for end directives.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_assumption_clauses): Switch to
'unknown' not 'invalid' directive name for end directives.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_omp_directive): Add comment to 'allocate';
add 6.x unimplemented directives as comment-out entries.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/assumes-2.c: Change for 'invalid'
to 'unknown' change for end directives.
* c-c++-common/gomp/begin-assumes-2.c: Likewise.
* c-c++-common/gomp/assume-2.c: Likewise. Check 'declare
mapper'.