git.ipfire.org Git - thirdparty/gcc.git/log

x86: builtin-fabs-2.c: Also scan (%edi) for x32

Adjust gcc.target/i386/builtin-fabs-2.c to scan both (%rdi) and (%edi).

PR target/122323
* gcc.target/i386/builtin-fabs-2.c: Also scan (%edi) for x32.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

match.pd: Fold VEC_PERM_EXPR chains implementing concat-and-extract

When compiling the following code with SIMDe on AArch64:

__m128i lo = _mm_srli_si128(a, 12);
__m128i hi = _mm_slli_si128(b, 4);
__m128i res = _mm_blend_epi16(hi, lo, 3);

current GCC produces:

mov     v31.4s, 0
ext     v30.16b, v0.16b, v31.16b, #12
ext     v0.16b, v31.16b, v1.16b, #12
ins     v0.s[0], v30.s[0]

instead of the more efficient:

ext     v0.16b, v0.16b, v1.16b, #12

GCC builds three VEC_PERM_EXPRs for the intrinsic calls. The first two
implement vector shifts and the final one implements the blend, but they
use different vector modes. The forward propagation fails to optimize
this case because VIEW_CONVERT_EXPRs in between block the folding.

This patch adds a match.pd pattern to recognize the concat-and-extract
idiom and folds the VEC_PERM_EXPR chain, even when VIEW_CONVERT_EXPRs
split the chain.
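
A minimal sketch of the idiom in portable C, modelling the three intrinsics
on plain 16-byte arrays instead of real SIMD types. The helper names
(shift_right_bytes, shift_left_bytes, blend_epi16, concat_extract,
chain_matches_extract) are illustrative stand-ins, not GCC or SIMDe
functions; the sketch only shows that the shift/shift/blend chain selects
the same bytes as one 16-byte extract at offset 12 from the pair a:b.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };

/* Model of _mm_srli_si128: move bytes towards index 0, zero-fill the top.  */
static void shift_right_bytes(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < VEC_BYTES; i++)
        dst[i] = (i + n < VEC_BYTES) ? src[i + n] : 0;
}

/* Model of _mm_slli_si128: move bytes away from index 0, zero-fill the
   bottom.  */
static void shift_left_bytes(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < VEC_BYTES; i++)
        dst[i] = (i >= n) ? src[i - n] : 0;
}

/* Model of _mm_blend_epi16 (x, y, mask): 16-bit lane k is taken from y
   when bit k of mask is set, otherwise from x.  */
static void blend_epi16(uint8_t *dst, const uint8_t *x, const uint8_t *y,
                        unsigned mask)
{
    for (int i = 0; i < VEC_BYTES; i++)
        dst[i] = ((mask >> (i / 2)) & 1u) ? y[i] : x[i];
}

/* Concat-and-extract: 16 bytes starting at byte OFF of the pair a:b.  */
static void concat_extract(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                           int off)
{
    assert(off >= 0 && off <= VEC_BYTES);
    for (int i = 0; i < VEC_BYTES; i++) {
        int j = i + off;
        dst[i] = (j < VEC_BYTES) ? a[j] : b[j - VEC_BYTES];
    }
}

/* Returns 1 when the shift/shift/blend chain equals the single extract.  */
static int chain_matches_extract(const uint8_t *a, const uint8_t *b)
{
    uint8_t lo[VEC_BYTES], hi[VEC_BYTES], res[VEC_BYTES], ext[VEC_BYTES];

    shift_right_bytes(lo, a, 12);   /* lo  = _mm_srli_si128 (a, 12)     */
    shift_left_bytes(hi, b, 4);     /* hi  = _mm_slli_si128 (b, 4)      */
    blend_epi16(res, hi, lo, 3);    /* res = _mm_blend_epi16 (hi, lo, 3) */
    concat_extract(ext, a, b, 12);  /* ext = bytes 12..27 of a:b        */
    return memcmp(res, ext, VEC_BYTES) == 0;
}
```

Calling chain_matches_extract on two distinct inputs is enough to see that
both forms pick bytes 12..15 of a followed by bytes 0..11 of b.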

Bootstrapped and tested on aarch64-linux-gnu and x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd: Fold VEC_PERM_EXPR chains implementing vector
concat-and-extract.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-vecperm-1.c: New test.

Undefine SET_CMODEL before #define in rs6000/vxworks.h

This prevents warnings complaining about the redefinition
on top of the base version.

2025-10-16 Olivier Hainque <hainque@adacore.com>

* config/rs6000/vxworks.h (SET_CMODEL): Undefine before
(re)defining.

Adjust VxWorks special case in testsuite check_weak_available

check_weak_available was reporting weak symbols unsupported
for vxworks unconditionally while they are actually supported
on vxworks 7 now (assumed >= r2). This change adjusts the
predicate to reflect that.

We used to believe we should distinguish kernel and rtp modes,
and experiments showed that this distinction is actually
counterproductive for the testsuite's purposes.

This allows a few extra tests to run (and pass :), in particular
in g++.dg/modules.

2021-02-03 Olivier Hainque <hainque@adacore.com>

gcc/testsuite/

* lib/target-supports.exp (check_weak_available):
Return 1 for VxWorks7.

c: Implement C2y static assertions in expressions

C2y has added support for static assertions as void expressions, in
addition to use as declarations (N3715 was accepted in an online vote
between meetings).

Implement the feature in GCC. There is a syntactic ambiguity between
a static assertion as a declaration and one as an expression
statement, which the accepted feature resolves by making such a usage
a declaration (this only affects the sequence of syntax productions by
which the code is parsed, not the actual semantics of the assertion);
I've raised the similar ambiguity in for loops on the WG14 reflector.

If just concerned with C2y, and not with diagnosing the use of a
feature not supported in older standard versions, the feature might be
simpler to implement by defaulting to treating static assertions in
ambiguous contexts as expressions rather than declarations, but that
would make it hard to diagnose exactly the cases that are new in C2y
(those depend on the static assertion either not being the whole
expression statement, or being in a context where an expression
statement is allowed but a declaration is not, e.g. the body of an if
statement). Instead, to support such diagnostics, the implementation
follows the standard in what is considered a declaration and what is
considered an expression, by looking ahead to what follows the closing
parenthesis when a static assertion starts in a context where a
declaration is permitted.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-parser.cc (c_parser_next_tokens_start_typename)
(c_parser_next_tokens_start_declaration): Add argument for token
to start from.
(c_parser_next_tokens_start_declaration): Check for whether static
assertion followed by semicolon.
(c_parser_check_balanced_raw_token_sequence): Declare earlier.
(c_parser_compound_statement_nostart, c_parser_for_statement): Use
c_parser_next_tokens_start_declaration not
c_token_starts_declaration on second token.
(c_parser_unary_expression): Handle static assertions.
* c-parser.h (c_parser_next_tokens_start_declaration): Add
argument.

gcc/testsuite/
* gcc.dg/c23-static-assert-5.c, gcc.dg/c23-static-assert-6.c,
gcc.dg/c23-static-assert-7.c, gcc.dg/c23-static-assert-8.c,
gcc.dg/c2y-static-assert-2.c, gcc.dg/c2y-static-assert-3.c,
gcc.dg/c2y-static-assert-4.c: New tests.

Daily bump.

cobol: Corrected FUNCTION CHAR and FUNCTION ORD.

The functions CHAR and ORD have been changed to correctly report on
character positions within the collation sequence.

The use of the LOW-VALUE and HIGH-VALUE figurative constants has been
corrected.

Some establishment of DISPLAY and NATIONAL encodings has been done
in anticipation of changes soon to come.

Some new testsuite tests have been added.

gcc/cobol/ChangeLog:

* genapi.cc (parser_alphabet): Alphabet encoding.
(parser_alphabet_use): Likewise.
(parser_xml_parse): Use correct debugging macro; encoding.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(initialize_the_data): Encoding.
(parser_label_label): Debugging macros.
(parser_label_goto): Likewise.
(parser_file_add): Encoding.
(parser_intrinsic_call_1): Special handling for __gg__char.
(parser_intrinsic_call_2): Formatting.
* parse.y: Response from FUNCTION ORD is flagged "unsigned".
* symbols.cc (cbl_alphabet_t::reencode): Establish
low_char & high_char.
* symbols.h (struct cbl_alphabet_t): Likewise.

libgcobol/ChangeLog:

* charmaps.cc: Encoding.
* charmaps.h (class charmap_t): Encoding.
* intrinsic.cc (__gg__char): Report the character at the
collation position.
(__gg__ord): Report the collation position of a character.
* libgcobol.cc (struct program_state): Add encodings;
Remove obsolete defines.
(__gg__current_collation): New function for encoding/collation.
(__gg__pop_program_state): Encoding.
(__gg__init_program_state): Encoding.
(format_for_display_internal): Handle LOW-VALUE and HIGH-VALUE.
(__gg__compare_2): Encoding.
(__gg__alphabet_use): Likewise.
* libgcobol.h (__gg__current_collation): New declaration.
* xmlparse.cc (__gg__xml_parse): Make a parameter const.

gcc/testsuite/ChangeLog:

* cobol.dg/group2/Length_overflow__2_.out: Updated test result.
* cobol.dg/group2/Length_overflow_with_offset__1_.out: Likewise.
* cobol.dg/group2/Offset_overflow.out: Likewise.
* cobol.dg/group2/CALL_with_OCCURS_DEPENDING_ON.cob: New test.
* cobol.dg/group2/CALL_with_OCCURS_DEPENDING_ON.out: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_ASCII.cob: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_ASCII.out: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_EBCDIC.cob: New test.
* cobol.dg/group2/CHAR_and_ORD_with_COLLATING_sequence_-_EBCDIC.out: New test.
* cobol.dg/group2/EC-BOUND-REF-MOD_checking_process_termination.cob: New test.
* cobol.dg/group2/EC-BOUND-REF-MOD_checking_process_termination.out: New test.
* cobol.dg/group2/Intrinsics_without_FUNCTION_keyword__3_.cob: New test.
* cobol.dg/group2/Occurs_DEPENDING_ON__source_and_dest.cob: New test.
* cobol.dg/group2/Occurs_DEPENDING_ON__source_and_dest.out: New test.
* cobol.dg/group2/Recursive_subscripts.cob: New test.
* cobol.dg/group2/Recursive_subscripts.out: New test.
* cobol.dg/group2/SEARCH_ALL_with_OCCURS_DEPENDING_ON.cob: New test.
* cobol.dg/group2/SEARCH_ALL_with_OCCURS_DEPENDING_ON.out: New test.
* cobol.dg/group2/Subscript_by_arithmetic_expression.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__1_.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__1_.out: New test.
* cobol.dg/group2/Subscript_out_of_bounds__2_.cob: New test.
* cobol.dg/group2/Subscript_out_of_bounds__2_.out: New test.
* cobol.dg/group2/Subscripted_refmods.cob: New test.
* cobol.dg/group2/Subscripted_refmods.out: New test.
* cobol.dg/group2/length_of_ODO_Rules_7__8A__and_8B.cob: New test.
* cobol.dg/group2/length_of_ODO_Rules_7__8A__and_8B.out: New test.
* cobol.dg/group2/length_of_ODO_w_-_reference_modification.cob: New test.

match: improve handling of `((signed)x) < 0` to `x >= (unsigned)SIGNED_TYPE_MIN` in `(type1)x CMP CST1 ? (type2)x : CST2` pattern.

This is a follow-up to r16-4534-g07800a565abd20 based on the review of
the other pattern (https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698336.html),
as the same issue mentioned in that review applies here.

This changes the pattern to use the new version of minmax_from_comparison, so we
don't need to create a tree for the constant, and to use wi::mask instead of
TYPE_MIN_VALUE.
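
The equivalence named in the subject line can be checked with a small C
sketch for a 32-bit type. The function names are hypothetical, and the
cast of an out-of-range value to int32_t is implementation-defined in ISO C
(GCC wraps modulo 2^32, which is what the sketch assumes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Form with a cast: view x as signed and test the sign.  */
static int negative_via_cast(uint32_t x)
{
    return (int32_t)x < 0;
}

/* Unsigned form: x >= (unsigned)SIGNED_TYPE_MIN, i.e. the top bit is set.  */
static int negative_via_unsigned_cmp(uint32_t x)
{
    return x >= (uint32_t)INT32_MIN;
}

/* Compare both forms on the values around the sign boundary.  */
static void check_sign_test_forms(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x7fffffffu, 0x80000000u, 0x80000001u, 0xffffffffu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(negative_via_cast(samples[i])
               == negative_via_unsigned_cmp(samples[i]));
}
```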

gcc/ChangeLog:

* match.pd (`(type1)x CMP CST1 ? (type2)x : CST2`): Better handling
of `((signed)x) < 0`.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

phiopt: Remove minmax_replacement [PR101024]

Now all of the optimizations are done in match from
minmax_replacement. We can now remove minmax_replacement. :)
This keeps around the special case for fp `a CMP b ? a : b` that was
added with r14-2699-g9f8f37f5490076 (PR 88540) and moves it to
match_simplify_replacement.

Bootstrapped and tested on x86_64-linux-gnu.

Note bool-12.c needed to be updated since phiopt1 now rejects the
BIT_AND/BIT_IOR with a cast and no longer gets MIN/MAX.

gcc/ChangeLog:

PR tree-optimization/101024
* tree-ssa-phiopt.cc (match_simplify_replacement): Special
case fp `a CMP b ? a : b` when not creating a min/max.
(strip_bit_not): Remove.
(invert_minmax_code): Remove.
(minmax_replacement): Remove.
(pass_phiopt::execute): Update pass comment.
Don't call minmax_replacement.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-12.c: Update based on when BIT_AND/BIT_IOR
is created and no longer MIN/MAX.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

match: Add support for `((signed)a </>= 0) ? min/max (a, c) : b` [PR101024]

This is the last patch needed to support removing minmax_replacement.
This fixes pr101024-1.c, which fails when minmax_replacement is removed.

The next patch will remove it.

Changes since v1:
* v2: Add new version of minmax_from_comparison that takes widest_int.
      Constrain the pattern to constant integers in some cases.
      Use mask to create the SIGNED_MAX and use GT/LE instead.
      Use wi::le_p/wi::ge_p instead of fold_build to do the comparison.

gcc/ChangeLog:

PR tree-optimization/101024
* fold-const.cc (minmax_from_comparison): New version that takes widest_int
instead of tree.
(minmax_from_comparison):  Call minmax_from_comparison for integer cst case.
* fold-const.h (minmax_from_comparison): New declaration.
* match.pd (`((signed)a </>= 0) ? min/max (a, c) : b`): New pattern.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

cobol: Implement the XML PARSE statement.

These changes implement the XML PARSE statement as described in the IBM
specification.

A repair to exception handling is included.  Up until now, an exception
after a successful file operation wasn't handled properly.

A repair to value declarations for BINARY / COMP / COMP-4 / COMP-5
values now allows them to have digits to the right of the implied
decimal point.  Processing of the "S" PICTURE character has been
normalized as well.

Co-Authored-By: James K. Lowden <jklowden@cobolworx.com>
Co-Authored-By: Robert Dubner <rdubner@symas.com>
gcc/cobol/ChangeLog:

* Make-lang.in: Incorporate new token_names.h file.
* cdf.y: Modify tokens.
* gcobol.1: Document XML PARSE statement.
* genapi.cc (parser_enter_program): Verify that every goto has a
matching label.
(parser_end_program): Likewise.
(parser_alphabet): Refine handling codeset encodings.
(parser_alphabet_use): Likewise.
(label_fetch): Moved from later in the source code.
(parser_xml_parse): New routine for XML PARSE.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(parser_label_label): Verify goto/label matching.
(parser_label_goto): Likewise.
(parser_entry): Minor change to SHOW_PARSE report.
* genapi.h (parser_alphabet): Set parameter to const.
(parser_xml_parse): Declare new function.
(parser_xml_on_exception): Likewise.
(parser_xml_not_exception): Likewise.
(parser_xml_end): Likewise.
(parser_label_addr): Likewise.
* parse.y: label_pair_t structure; locale processing; new token
processing for alphabets and XML PARSE.
* parse_ante.h (name_of): Return field->name when initial is NULL.
(new_tempnumeric): Make signable_e optional.
(ast_save_locale): New function.
(data_division_ready): Warning for "no alphabet".
* scan.l: Repair interpretation of BINARY, COMP, COMP-4, and
COMP-5.
* scan_ante.h (struct bint_t): Likewise.
* scan_post.h (current_tokens_t::tokenset_t::tokenset_t):
Include token_names.h.
* symbols.cc (symbols_alphabet_set): Revert to prior alphabet
determination.
(symbol_table_init): New XML special registers.
(new_temporary): Make signable_e controllable, not fixed.
* symbols.h (__gg__encoding_iconv_valid): New declaration.
(enum cbl_label_type_t): New LblXml label type.
(struct cbl_xml_parse_t):
(struct cbl_label_t): Implement XML PARSE.
(new_temporary): Incorporate boolean for signable_e.
(symbol_elem_of): Change label field type handling.
(cbl_section_of): Likewise.
(cbl_field_of): Likewise.
(cbl_label_of): Likewise.
(cbl_special_name_of):  Likewise.
(cbl_alphabet_of):  Likewise.
(cbl_file_of):  Likewise.
* token_names.h: New file.
* util.cc (gcc_location_set_impl): Improve location_t calculations
when entering and leaving COPYBOOKs.

libgcobol/ChangeLog:

* Makefile.am: Changes for XML PARSE and POSIX functions.
* Makefile.in: Likewise.
* charmaps.cc: Augment encodings[] table with "supported" boolean.
(__gg__encoding_iconv_name): Modify how encodings are identified.
(encoding_descr): Likewise.
(__gg__encoding_iconv_valid): Likewise.
* common-defs.h (callback_t): Define function pointer.
* constants.cc: Use named cbl_attr_e constants instead of magic
numbers; new definitions for XML special registers.
* encodings.h (struct encodings_t): Declare "supported" boolean.
* libgcobol.cc (format_for_display_internal): Use std::ptrdiff_t.
(__gg__alphabet_use): Add case for iconv_CP1252_e.
(default_exception_handler): Repair exception handling after a
successful file operation.
* posix/errno.cc: New file.
* posix/localtime.cc: New file.
* posix/stat.cc: New file.
* posix/stat.h: New file.
* posix/tm.h: New file.
* xmlparse.cc: New file to support XML PARSE statement.

gcc/testsuite/ChangeLog:

* cobol.dg/typo-1.cob: New test for squiggles and carets.

aarch64: Add __HAVE_FUNCTION_MULTI_VERSIONING macro.

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Add
__HAVE_FUNCTION_MULTI_VERSIONING macro.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

aarch64: Remove unnecessary sort from dispatch_function_versions.

The version data-structure already stores the versions in a sorted order so
sorting here is unnecessary.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (dispatch_function_versions): Remove
unnecessary sorting and data structure.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

aarch64: testsuite: Add test for supported FMV extensions.

Add tests that check the aarch64 version features are supported, that they
have the correct priority ordering, and that the generated resolver is correct.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/fmv_priority1.c: New test.
* gcc.target/aarch64/fmv_priority2.c: New test.
* gcc.target/aarch64/fmv_priority.in: Support file.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

aarch64: Fix fmv priority ordering [PR target/122190]

This fixes the versioning rules for aarch64.

Previously this would prioritize the version string with more extensions
specified regardless of the extension.

The ACLE rules are that any two version strings should be ordered by the
highest priority feature that the versions don't have in common.
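
A minimal sketch of the two orderings in C, assuming a hypothetical
encoding where bit i of a mask means "feature i is required" and a higher
bit index means a higher priority. The names feature_mask,
compare_by_top_difference and compare_by_count are illustrative only and
are not the code in aarch64.cc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: bit i set = feature i required,
   higher bit index = higher priority.  */
typedef uint64_t feature_mask;

/* Rule described above: order two versions by the highest priority
   feature they do not have in common.  Returns 1 if A wins, -1 if B
   wins, 0 if the masks are identical.  */
static int compare_by_top_difference(feature_mask a, feature_mask b)
{
    feature_mask diff = a ^ b;   /* features present in exactly one mask */
    for (int bit = 63; bit >= 0; bit--) {
        feature_mask m = (feature_mask)1 << bit;
        if (diff & m)
            return (a & m) ? 1 : -1;
    }
    return 0;
}

static int count_features(feature_mask m)
{
    int n = 0;
    for (; m != 0; m &= m - 1)   /* clear the lowest set bit each round */
        n++;
    return n;
}

/* Previous behaviour described above: more extensions wins, whatever
   the extensions are.  */
static int compare_by_count(feature_mask a, feature_mask b)
{
    int ca = count_features(a), cb = count_features(b);
    return (ca > cb) - (ca < cb);
}

/* One high-priority feature against three low-priority ones.  */
static void check_orderings_differ(void)
{
    feature_mask one_high = (feature_mask)1 << 5;
    feature_mask three_low = 0x7;
    assert(compare_by_top_difference(one_high, three_low) == 1);
    assert(compare_by_count(one_high, three_low) == -1);
}
```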

PR target/122190

gcc/ChangeLog:

* config/aarch64/aarch64.cc (compare_feature_masks): Fix version rules.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr122190.c: New test.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

aarch64: Dump version ordering for FMV.

This adds the fmv function versions to the targetclone dump.

This is useful for debugging and tests checking function version priority
ordering.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_generate_version_dispatcher_body):
Dump function versions and the ordering.

Reviewed-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

match.pd: Fold pattern of round semantics.

In the 538.imagick_r benchmark of SPEC 2017, I found these patterns in the
MagickRound function. This patch implements these patterns in match.pd
as 4 rules:
1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5)
2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5)
3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5)
4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5)

The patch uses the floor(x+0.5) operation to replace these patterns,
which have the semantics of the round(x) function.
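
The four rules can be compared against floor(x+0.5) with a small C sketch.
To keep it free of libm, floor_small and ceil_small are hypothetical
stand-ins for floor() and ceil() that are only valid for small magnitudes;
the sample values are exactly representable, so the sketch says nothing
about the floating-point conditions under which GCC applies the fold.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for floor()/ceil(), valid only while x fits in long long.  */
static double floor_small(double x)
{
    double d = (double)(long long)x;   /* truncates towards zero */
    return d > x ? d - 1.0 : d;
}

static double ceil_small(double x)
{
    double d = (double)(long long)x;
    return d < x ? d + 1.0 : d;
}

/* The four source patterns listed above.  */
static double rule1(double x)
{
    double f = floor_small(x), c = ceil_small(x);
    return (x - f) < (c - x) ? f : c;
}

static double rule2(double x)
{
    double f = floor_small(x), c = ceil_small(x);
    return (x - f) >= (c - x) ? c : f;
}

static double rule3(double x)
{
    double f = floor_small(x), c = ceil_small(x);
    return (c - x) > (x - f) ? f : c;
}

static double rule4(double x)
{
    double f = floor_small(x), c = ceil_small(x);
    return (c - x) <= (x - f) ? c : f;
}

/* The replacement form.  */
static double folded(double x)
{
    return floor_small(x + 0.5);
}

static void check_round_rules(void)
{
    static const double samples[] = {
        0.0, 3.0, 2.25, 2.5, 2.75, -2.25, -2.5, -2.75
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        double x = samples[i];
        assert(rule1(x) == folded(x));
        assert(rule2(x) == folded(x));
        assert(rule3(x) == folded(x));
        assert(rule4(x) == folded(x));
    }
}
```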

The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu,
SPEC 2017 and SPEC 2006 were run:
As for SPEC 2017, 538.imagick_r benchmark performance increased by 3%+
in base test of ratio mode.
As for SPEC 2006, while the transform does not seem to be triggered,
we also see no non-noise impact on performance.

gcc/ChangeLog:

* match.pd: Add new pattern for round.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-round-1.c: New test.

libgomp: fine-grained pinned memory allocator

This patch introduces a new custom memory allocator for use with pinned
memory (in the case where the Cuda allocator isn't available).  In future,
this allocator will also be used for Managed Memory.  Both memories are
incompatible with the system malloc because allocated memory cannot share a
page with memory allocated for other purposes.

This means that small allocations will no longer consume an entire page of
pinned memory.  Unfortunately, it also means that pinned memory pages will
never be unmapped (although they may be reused).  This isn't a technical
limitation; the "free" algorithm could be extended in future, if needed.

The implementation is not perfect; there are various corner cases (especially
related to extending onto new pages) where allocations and reallocations may
be sub-optimal, but it should still be a step forward in support for small
allocations.

I have considered using libmemkind's "fixed" memory but rejected it for three
reasons: 1) libmemkind may not always be present at runtime, 2) there's no
currently documented means to extend a "fixed" kind one page at a time
(although the code appears to have an undocumented function that may do the
job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its
regular kinds would be straight-forward), 3) Managed Memory benefits from
having the metadata located in different memory and using an external
implementation makes it hard to guarantee this.

libgomp/ChangeLog:

* Makefile.am (libgomp_la_SOURCES): Add simple-allocator.c.
* Makefile.in: Regenerate.
* basic-allocator.c: Mention simple-allocator in the comment.
* config/linux/allocator.c: Include unistd.h.
(pin_ctx): New variable.
(ctxlock): New variable.
(linux_init_pin_ctx): New function.
(linux_memspace_alloc): Use simple-allocator for pinned memory.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise.
* libgomp.h (gomp_simple_alloc_init_context): New prototype.
(gomp_simple_alloc_register_memory): New prototype.
(gomp_simple_alloc): New prototype.
(gomp_simple_free): New prototype.
(gomp_simple_realloc): New prototype.
* libgomp.texi: Update pinned memory trait documentation.
* testsuite/libgomp.c/alloc-pinned-8.c: New test.
* simple-allocator.c: New file.

libgomp, nvptx: Cuda pinned memory

Use Cuda to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX. At present, the other supported devices do
not have equivalent capabilities (or requirements).

libgomp/ChangeLog:

* config/linux/allocator.c: Include assert.h.
(using_device_for_page_locked): New variable.
(linux_memspace_alloc): Add init0 parameter. Support device pinning.
(linux_memspace_calloc): Set init0 to true.
(linux_memspace_free): Support device pinning.
(linux_memspace_realloc): Support device pinning.
(MEMSPACE_ALLOC): Set init0 to false.
* libgomp-plugin.h
(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* libgomp.h (gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(struct gomp_device_descr): Add page_locked_host_alloc_func and
page_locked_host_free_func.
* libgomp.texi: Adjust the docs for the pinned trait.
* plugin/plugin-nvptx.c
(GOMP_OFFLOAD_page_locked_host_alloc): New function.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* target.c (device_for_page_locked): New variable.
(get_device_for_page_locked): New function.
(gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(gomp_load_plugin_for_device): Add page_locked_host_alloc and
page_locked_host_free.
* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
devices.
* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>

Remove LOOP_VINFO_SLP_UNROLLING_FACTOR

The following removes LOOP_VINFO_SLP_UNROLLING_FACTOR in favor of
using LOOP_VINFO_VECT_FACTOR directly now that no difference
between the two is possible.

* tree-vectorizer.h (_loop_vec_info::slp_unrolling_factor): Remove.
(LOOP_VINFO_SLP_UNROLLING_FACTOR): Likewise.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Adjust.
(vect_analyze_loop_2): Likewise.
* tree-vect-slp.cc (vect_make_slp_decision): Set
LOOP_VINFO_VECT_FACTOR directly.

Move SLP permute optimization until after VF is final

The following moves SLP permute optimization until after we applied
a possible extra unroll factor.

* tree-vect-loop.cc (vect_analyze_loop_2): Move vect_optimize_slp
after applying suggested_unroll_factor.

Fix possible segfault in load/store-lane analysis

The following fixes a segfault that appeared with a patch applying
additional permutes to a reduction SLP instance root.

* tree-vect-loop.cc (vect_analyze_loop_2): Deal with NULL
element in SLP_TREE_SCALAR_STMTS.

testsuite: arm: [MVE] Relax expected code for vbicq_f [PR122223]

The original versions of the pr122223.c test only took into account
code generated with -mfloat-abi=hard, which uses q0.

With -mfloat-abi=softfp, this can be any Q register, so replace q0
with a suitable regex.

gcc/testsuite/ChangeLog:

PR target/122223
* gcc.target/arm/mve/intrinsics/pr122223.c: Relax expected code.

Support reduc_sbool_and_scal_m for V{QI,SI,DI}mode.

gcc/ChangeLog:

PR target/101639
* config/i386/sse.md
(VI_AVX): New mode iterator.
(VI_AVX_CMP): Ditto.
(ssebytemode): Add V16HI, V32QI, V16QI.
(reduc_sbool_and_scal_<mode>): New expander.
(reduc_sbool_ior_scal_<mode>): Ditto.
(reduc_sbool_xor_scal_<mode>): Ditto.
(*eq<mode>3_2_negate): New pre_reload splitter.
(*ptest<mode>_ccz): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr101639_reduc_mask_vdi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_vqi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_vsi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_ior_vqi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_and_vqi.c: New test.

Support reduc_sbool_{and,ior,xor}_scal_m for avx512 kmask.

gcc/ChangeLog:

PR target/101639
* config/i386/sse.md
(reduc_sbool_and_scal_<mode>): New expander.
(reduc_sbool_ior_scal_<mode>): Ditto.
(reduc_sbool_xor_scal_<mode>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr101639_reduc_mask_di.c: New test.
* gcc.target/i386/pr101639_reduc_mask_hi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_qi.c: New test.
* gcc.target/i386/pr101639_reduc_mask_si.c: New test.

Daily bump.

x86: Use HOST_WIDE_INT_(0|M1)U to initialize unsigned HOST_WIDE_INT

Use HOST_WIDE_INT_0U, instead of 0, HOST_WIDE_INT_M1U, instead of -1, to
initialize unsigned HOST_WIDE_INT.

* config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): Use
HOST_WIDE_INT_0U and HOST_WIDE_INT_M1U to initialize unsigned
HOST_WIDE_INT.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

testsuite: Fix local labels [PR122378]

r16-4540-g80af807e52e4f4 exposed a bug in two testcases where the declaration of
local labels was wrongly commented out. That caused "duplicate label" errors.
Uncommenting declarations fixes it.
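
For background, a sketch of how a GNU C local label declaration avoids a
"duplicate label" error. This is not the code of the affected testcases;
the macro FIND_INDEX and the function find_two are hypothetical. The
__label__ declaration limits the label to the enclosing block, so the
macro can be expanded twice in one function; without that line both
expansions would define the same function-scope label.

```c
#include <assert.h>

/* Store in OUT the first index of KEY in ARR, or -1 if absent.
   __label__ is a GNU C extension; it must come first in the block.  */
#define FIND_INDEX(arr, n, key, out)            \
    do {                                        \
        __label__ found;                        \
        int i_;                                 \
        (out) = -1;                             \
        for (i_ = 0; i_ < (n); i_++)            \
            if ((arr)[i_] == (key)) {           \
                (out) = i_;                     \
                goto found;                     \
            }                                   \
    found:;                                     \
    } while (0)

/* Two expansions in one function: fine because `found' is local to
   each do-block.  */
static void find_two(const int *arr, int n, int k1, int k2,
                     int *out1, int *out2)
{
    assert(arr != 0 && out1 != 0 && out2 != 0);
    FIND_INDEX(arr, n, k1, *out1);
    FIND_INDEX(arr, n, k2, *out2);
}
```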

PR middle-end/122378

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/attrs-metadirective-2.c: Uncomment local label
declaration.
* c-c++-common/gomp/metadirective-2.c: Likewise.

libstdc++: Avoid incrementing input iterators with std::prev [PR122224]

As explained in PR libstdc++/122224 we do not make it ill-formed to call
std::prev with a non-Cpp17BidirectionalIterator. Instead we just use a
runtime assertion to check the std::advance precondition that the
distance is not negative.

This allows us to support std::prev on types which model the C++20
std::bidirectional_iterator concept but do not meet the
Cpp17BidirectionalIterator requirements, e.g. iota_view's iterators.

It also allows us to support std::prev(iter, -1) which is admittedly
weird, but there's no reason it shouldn't be equivalent to
std::next(iter), which is perfectly fine to use on non-bidirectional
iterators. In other words, "reverse decrementing" is valid for
non-bidirectional iterators.

However, the current implementation of std::advance for
non-bidirectional iterators uses a loop that does `while (n--) ++i;`
which assumes that n is not negative and so will eventually reach zero.
When the assertion for the precondition is not enabled, incrementing the
iterator while n is non-zero means that using std::prev(iter) or
std::next(iter, -1) on a non-bidirectional iterator will keep
incrementing the iterator until n reaches INT_MIN, overflows, and then
keeps decrementing until it eventually reaches zero. Incrementing most
iterators that many times will cause memory safety errors long before
the integer reaches zero and terminates the loop.

This commit changes the loop to use `while (n-- > 0)` which means that
the loop doesn't execute at all if a negative n is used. We still
consider such calls to be erroneous, but when the precondition isn't
checked by an assertion, the function now has no effects. The undefined
behaviour resulting from incrementing the iterator is prevented.
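
The two loop shapes can be modelled in plain C. This is a sketch, not the
libstdc++ code: struct fwd_iter and both function names are hypothetical,
and the old loop carries an extra step cap so that the runaway case can be
observed safely instead of running until the counter overflows.

```c
#include <assert.h>

/* Stand-in for a forward-only iterator: it can only be incremented.  */
struct fwd_iter {
    long pos;
};

/* Old shape, `while (n--) ++i;', with a cap added for the sketch.
   Returns the number of increments performed.  */
static long advance_old_capped(struct fwd_iter *it, int n, long cap)
{
    long steps = 0;
    assert(cap >= 0);
    while (steps < cap && n--) {
        ++it->pos;
        ++steps;
    }
    return steps;
}

/* New shape, `while (n-- > 0) ++i;': a negative n performs no
   increments at all.  */
static long advance_new(struct fwd_iter *it, int n)
{
    long steps = 0;
    while (n-- > 0) {
        ++it->pos;
        ++steps;
    }
    return steps;
}
```

With n = -1 the old shape keeps incrementing until the cap stops it, while
the new shape leaves the iterator untouched; for non-negative n both
perform exactly n increments.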

libstdc++-v3/ChangeLog:

PR libstdc++/122224
* include/bits/stl_iterator_base_funcs.h (prev): Compare
distance as n > 0 instead of n != 0.
* testsuite/24_iterators/range_operations/122224.cc: New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

MAINTAINERS: Update my contact info.

ChangeLog:

* MAINTAINERS: Update my contact information.

Signed-off-by: Josef Melcr <jmelcr02@gmail.com>

c++: Fix up RAW_DATA_CST handling in braced_list_to_string [PR122302]

The following testcase is miscompiled, because a RAW_DATA_CST tree
node is shared by multiple CONSTRUCTORs and when the braced_list_to_string
function changes one to extend the RAW_DATA_CST over the single preceding
and single succeeding INTEGER_CST, it changes the RAW_DATA_CST in
the other CONSTRUCTOR where the elts around it are still present.

Fixed by tweaking a copy of it instead, like we handle it in other spots.
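
The hazard can be sketched in C with a hypothetical node type. struct
raw_data, extend_in_place and extend_copy are illustrative names only; the
real code works on GCC tree nodes via copy_node, RAW_DATA_POINTER and
RAW_DATA_LENGTH.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a RAW_DATA_CST: a window into a byte buffer.  */
struct raw_data {
    const unsigned char *ptr;
    size_t len;
};

/* Buggy shape: widen the shared node itself by one byte on each side.
   Every other user of the node sees the change.  */
static void extend_in_place(struct raw_data *node)
{
    node->ptr -= 1;
    node->len += 2;
}

/* Fixed shape: widen a copy and leave the shared node alone.  */
static struct raw_data extend_copy(const struct raw_data *node)
{
    struct raw_data copy = *node;
    copy.ptr -= 1;
    copy.len += 2;
    return copy;
}

static void check_copy_keeps_shared_node(void)
{
    static const unsigned char buf[5] = {1, 2, 3, 4, 5};
    struct raw_data shared = {buf + 1, 3};   /* covers bytes 2, 3, 4 */

    struct raw_data wide = extend_copy(&shared);
    assert(wide.ptr == buf && wide.len == 5);
    assert(shared.ptr == buf + 1 && shared.len == 3);   /* untouched */

    extend_in_place(&shared);
    assert(shared.ptr == buf && shared.len == 5);       /* now changed */
}
```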

2025-10-22 Jakub Jelinek <jakub@redhat.com>

PR c++/122302
* c-common.cc (braced_list_to_string): Call copy_node on RAW_DATA_CST
before changing RAW_DATA_POINTER and RAW_DATA_LENGTH on it.

* g++.dg/cpp0x/pr122302.C: New test.
* g++.dg/cpp/embed-27.C: New test.

AArch64: Add support for boolean reductions for Adv. SIMD using SVE

When doing boolean reductions for Adv. SIMD vectors and SVE is available
we can use SVE instructions instead of Adv. SIMD ones to do the reduction.

For instance OR-reductions are

        umaxp v3.4s, v3.4s, v3.4s
        fmov x1, d3
        cmp x1, 0
        cset w0, ne

and with SVE we generate:

        ptrue p1.b, vl16
        cmpne p1.b, p1/z, z3.b, #0
        cset w0, any

Where the ptrue is normally executed much earlier so it's not a bottleneck for
the compare.

For the remaining codegen see test vect-reduc-bool-18.c.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (reduc_sbool_and_scal_<mode>,
reduc_sbool_ior_scal_<mode>, reduc_sbool_xor_scal_<mode>): Use SVE if
available.
* config/aarch64/aarch64-sve.md (*cmp<cmp_op><mode>_ptest): Rename ...
(@aarch64_pred_cmp<cmp_op><mode>_ptest): ... To this.
(reduc_sbool_xor_scal_<mode>): Rename ...
(@reduc_sbool_xor_scal_<mode>): ... To this.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/vect-reduc-bool-10.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-11.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-12.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-13.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-14.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-15.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-16.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-17.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-18.c: New test.

AArch64: Add support for boolean reductions for Adv. SIMD

The vectorizer has learned how to do boolean reductions of masks to a C bool
for the operations OR, XOR and AND.

This implements the new optabs for Adv.SIMD.  Adv.SIMD today can already
vectorize such loops but does so through SHIFT-AND-INSERT to perform the
reductions step-wise and in order.  As an example, an OR reduction today does:

        movi    v3.4s, 0
        ext     v5.16b, v30.16b, v3.16b, #8
        orr     v5.16b, v5.16b, v30.16b
        ext     v29.16b, v5.16b, v3.16b, #4
        orr     v29.16b, v29.16b, v5.16b
        ext     v4.16b, v29.16b, v3.16b, #2
        orr     v4.16b, v4.16b, v29.16b
        ext     v3.16b, v4.16b, v3.16b, #1
        orr     v3.16b, v3.16b, v4.16b
        fmov    w1, s3
        and     w1, w1, 1

For reducing to a boolean however we don't need the stepwise reduction and can
just look at the bit patterns. For e.g. OR we now generate:

        umaxp v3.4s, v3.4s, v3.4s
        fmov x1, d3
        cmp x1, 0
        cset w0, ne

For the remaining codegen see test vect-reduc-bool-9.c.
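
For orientation, a sketch of the kind of scalar loop that reduces a
comparison mask to a C bool. These functions are illustrative and are not
taken from the vect-reduc-bool tests; whether a given loop is vectorized
this way depends on the target and options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* OR reduction: true if any element is non-zero.  */
static bool any_nonzero(const int *a, size_t n)
{
    bool r = false;
    for (size_t i = 0; i < n; i++)
        r |= (a[i] != 0);
    return r;
}

/* AND reduction: true if every element is non-zero.  */
static bool all_nonzero(const int *a, size_t n)
{
    bool r = true;
    for (size_t i = 0; i < n; i++)
        r &= (a[i] != 0);
    return r;
}

static void check_bool_reductions(void)
{
    static const int mixed[4] = {0, 3, 0, 7};
    static const int zeros[4] = {0, 0, 0, 0};
    static const int full[4] = {1, 2, 3, 4};

    assert(any_nonzero(mixed, 4) && !all_nonzero(mixed, 4));
    assert(!any_nonzero(zeros, 4) && !all_nonzero(zeros, 4));
    assert(any_nonzero(full, 4) && all_nonzero(full, 4));
}
```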

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (reduc_sbool_and_scal_<mode>,
reduc_sbool_ior_scal_<mode>, reduc_sbool_xor_scal_<mode>): New.
* config/aarch64/iterators.md (VALLI): New.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vect-reduc-bool-1.c: New test.
* gcc.target/aarch64/vect-reduc-bool-2.c: New test.
* gcc.target/aarch64/vect-reduc-bool-3.c: New test.
* gcc.target/aarch64/vect-reduc-bool-4.c: New test.
* gcc.target/aarch64/vect-reduc-bool-5.c: New test.
* gcc.target/aarch64/vect-reduc-bool-6.c: New test.
* gcc.target/aarch64/vect-reduc-bool-7.c: New test.
* gcc.target/aarch64/vect-reduc-bool-8.c: New test.
* gcc.target/aarch64/vect-reduc-bool-9.c: New test.

AArch64: Add support for boolean reductions for SVE

The vectorizer has learned how to do boolean reductions of masks to a C bool
for the operations OR, XOR and AND.

This implements the new optabs for SVE.

For SVE the & and | cases would use the CC registers.

or_reduc:
        ptest   p0, p0.b
        cset    w0, any

and_reduc:
        ptrue   p3.b, all
        nots    p3.b, p3/z, p0.b
        cset    w0, none

For the ^ case we check whether the number of active predicate lanes
is a multiple of two.

xor_reduc:
        ptrue   p3.b, all
        cntp    x0, p3, p0.b
        and     w0, w0, 1
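
A scalar model of the three sequences above may help when reading them.  This
is only a sketch: the predicate is modelled as an array of bool and all
function names are hypothetical.  It shows that OR is "any lane active", AND is
"no lane inactive", and XOR is the parity of the active-lane count.

```c
#include <stdbool.h>
#include <stddef.h>

/* Count the active lanes of a predicate modelled as an array of bool.  */
static size_t count_active(const bool *p, size_t lanes)
{
    size_t n = 0;
    for (size_t i = 0; i < lanes; i++)
        n += p[i] ? 1 : 0;
    return n;
}

/* OR reduction: at least one lane is active.  */
bool or_reduc_model(const bool *p, size_t lanes)
{
    return count_active(p, lanes) > 0;
}

/* AND reduction: no lane is inactive.  */
bool and_reduc_model(const bool *p, size_t lanes)
{
    return count_active(p, lanes) == lanes;
}

/* XOR reduction: low bit of the active-lane count.  */
bool xor_reduc_model(const bool *p, size_t lanes)
{
    return (count_active(p, lanes) & 1) != 0;
}

/* Reference: XOR the lanes one by one, for comparison with the model.  */
bool xor_reduc_reference(const bool *p, size_t lanes)
{
    bool r = false;
    for (size_t i = 0; i < lanes; i++)
        r ^= p[i];
    return r;
}
```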

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (reduc_sbool_and_scal_<mode>,
reduc_sbool_ior_scal_<mode>, reduc_sbool_xor_scal_<mode>): New.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/vect-reduc-bool-1.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-2.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-3.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-4.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-5.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-6.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-7.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-8.c: New test.
* gcc.target/aarch64/sve/vect-reduc-bool-9.c: New test.

vect: Add support for boolean reductions for VLA

The support for the new boolean reduction optabs didn't quite work for VLA
because the code later on insists on the target still having a shift-and-insert
optab.

This is however not needed if the target can do the reduction using the new
optabs, and the initial reduction value matches the neutral value and we
have one SLP lane while not having a reduction chain.

gcc/ChangeLog:

* tree-vect-loop.cc (vectorizable_reduction): Don't always require
IFN_VEC_SHL_INSERT when using reduc sbool optabs.

aarch64: Add autoregenerated files for AArch64 options.

In the previously committed patch to "add support for
menable-sysreg-checking flag", I changed
config/aarch64/aarch64.opt but did not update the
autoregenerated files.

This patch adds the updated autoregenerated aarch64.opt.urls
changes.

gcc/ChangeLog:

* config/aarch64/aarch64.opt.urls: Regenerate.

tree-optimization/122364 - reduction chain with conversion

The following handles detection of a reduction chain wrapped in a
conversion. This does not yet try to combine operands with different
signedness, but we should now handle signed integer accumulation
to both a signed and unsigned accumulator fine.
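
A loop of roughly the following shape is one plausible example of signed
accumulation into an unsigned accumulator.  It is a sketch, not the contents of
the new testcase: the function names are hypothetical.  In the first function
the inner addition is done in int and its result is converted to unsigned
before it is added to the accumulator, so the chain of additions is interrupted
by a conversion.

```c
/* Signed inputs summed in int, then accumulated into an unsigned value.  */
unsigned sum_to_unsigned(const int *a, int n)
{
    unsigned sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[2 * i] + a[2 * i + 1];
    return sum;
}

/* The same accumulation with a signed accumulator, for comparison.  */
int sum_to_signed(const int *a, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[2 * i] + a[2 * i + 1];
    return sum;
}
```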

PR tree-optimization/122364
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): Re-try
linearization on a conversion source.

* gcc.dg/vect/vect-reduc-chain-5.c: New testcase.

tree-optimization/122370 - ICE with reduction and masks

The following fixes bad interaction with mask demotion to data
and the code dealing with UB on signed reductions by making sure
to also update compute_vectype when updating vectype.

PR tree-optimization/122370
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Also update compute_vectype when demoting masks to an
integer vector.

* gcc.dg/vect/vect-pr122370.c: New testcase.

libstdc++: Add missing constraints to views::indices

Calling views::indices(n) should be expression equivalent to
views::iota(decltype(n)(0), n), which means it should have the same
constraints as views::iota and be SFINAE-friendly.

libstdc++-v3/ChangeLog:

* include/std/ranges (indices::operator()): Constrain using
__can_iota_view concept.
* testsuite/std/ranges/indices/1.cc: Check SFINAE-friendliness
required by expression equivalence. Replace unused <vector>
header with <stddef.h> needed for size_t.

tree-optimization/122371 - ICE with reduction chain and fold-left reduction

The fold-left reduction transform relies on preserving the scalar
cycle PHI. The following rewrites how we connect this to the
involved stmt-infos instead of relying on (the actually bogus for
reduction chain) scalar stmts in SLP nodes more than absolutely
necessary. This also makes sure to not re-associate to form a
reduction chain when a fold-left reduction is required.

PR tree-optimization/122371
* tree-vect-loop.cc (vectorize_fold_left_reduction): Get
to the scalar def to replace via the scalar PHI backedge def.
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): Do not
re-associate to form a reduction chain if a fold-left
reduction is required.

* gcc.dg/vect/vect-pr122371.c: New testcase.

libstdc++: Implement optional<T&> from P2988R12 [PR121748]

This patch implements optional<T&> based on the P2988R12 paper, incorporating
corrections from LWG4300, LWG4304, and LWG3467. The resolution for LWG4015
is also extended to cover optional<T&>.

We introduce an _M_fwd() helper that is equivalent to operator*(), except that
it does not check the non-empty precondition. It is used to correctly propagate
the value during move construction from optional<T&>. This is necessary because
moving an optional<T&> must not move the contained object, which is the key
distinction between *std::move(opt) and std::move(*opt).

The implementation deviates from the standard by providing a separate std::swap
overload for std::optional<T&>, which simplifies preserving the resolution of
LWG2766.

This introduces a few changes to make_optional behavior (see included test):
* some previously valid uses of make_optional<T>({...}) (where T is not a
  reference type) now become ill-formed (see optional/make_optional_neg.cc).
* make_optional<T&>(t) and make_optional<const T&>(ct), where decltype(t) is T&,
  and decltype(ct) is const T& now produce optional<T&> and optional<const T&>
  respectively, instead of optional<T>.
* a few other uses of make_optional<R> with reference type R are now ill-formed.

PR libstdc++/121748

libstdc++-v3/ChangeLog:

* include/bits/version.def: Bump value for optional.
* include/bits/version.h: Regenerate.
* include/std/optional (std::__is_valid_contained_type_for_optional):
Define.
(std::optional<T>): Use __is_valid_contained_type_for_optional.
(optional<T>(const optional<_Up>&), optional<T>(optional<_Up>&&))
(optional<T>::operator=(const optional<_Up>&))
(optional<T>::operator=(optional<_Up>&&)): Replace x._M_get() with
x._M_fwd(), and std::move(x._M_get()) with std::move(x)._M_fwd().
(optional<T>::and_then): Remove unnecessary remove_cvref_t.
(optional<T>::_M_fwd): Define.
(std::optional<T&>): Define new partial specialization.
(std::swap(std::optional<T&>, std::optional<T&>)): Define.
(std::make_optional(_Tp&&)): Add non-type template parameter.
(std::make_optional): Use parentheses to construct optional.
(std::hash<optional<T>>): Add comment.
* testsuite/20_util/optional/make_optional-2.cc: Guard no longer
working example.
* testsuite/20_util/optional/relops/constrained.cc: Expand test to
cover optionals of reference.
* testsuite/20_util/optional/requirements.cc: Amend for
optional<T&>.
* testsuite/20_util/optional/requirements_neg.cc: Likewise.
* testsuite/20_util/optional/version.cc: Test new value of
__cpp_lib_optional.
* testsuite/20_util/optional/make_optional_neg.cc: New test.
* testsuite/20_util/optional/monadic/ref_neg.cc: New test.
* testsuite/20_util/optional/ref/access.cc: New test.
* testsuite/20_util/optional/ref/assign.cc: New test.
* testsuite/20_util/optional/ref/cons.cc: New test.
* testsuite/20_util/optional/ref/internal_traits.cc: New test.
* testsuite/20_util/optional/ref/make_optional/1.cc: New test.
* testsuite/20_util/optional/ref/make_optional/from_args_neg.cc:
New test.
* testsuite/20_util/optional/ref/make_optional/from_lvalue_neg.cc:
New test.
* testsuite/20_util/optional/ref/make_optional/from_rvalue_neg.cc:
New test.
* testsuite/20_util/optional/ref/monadic.cc: New test.
* testsuite/20_util/optional/ref/relops.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Add comparison operators between tuple<> and array<T, 0> [PR119721]

This fixes the C++23 compliance issue where std::tuple<> cannot be compared
with other empty tuple-like types such as std::array<T, 0>.

The operators correctly allow comparison with array<T, 0> even when T is not
comparable, because empty tuple-like types don't compare element values.

PR libstdc++/119721

libstdc++-v3/ChangeLog:

* include/std/tuple (tuple<>::operator==, tuple<>::operator<=>):
Define.
* testsuite/23_containers/tuple/comparison_operators/119721.cc:
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>

tree-optimization/122365 - deal with bool SLP reductions

I hadn't thought of these but at least added an assert which now
tripped.  Fixed thus.  There's also a latent issue with AVX512
mask types.  The by-pieces reduction code used the wrong element
sizes.

PR tree-optimization/122365
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Convert all inputs.  Use the proper vector element sizes
for the elementwise reduction.

* gcc.dg/vect/vect-reduc-bool-9.c: New testcase.

Initial Nova Lake Support

This patch will add initial support for Nova Lake according to Intel
ISE.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h
(get_intel_cpu): Handle Nova Lake.
* common/config/i386/i386-common.cc (processor_name):
Add Nova Lake.
(processor_alias_table): Ditto.
* common/config/i386/i386-cpuinfo.h (enum processor_types):
Add INTEL_COREI7_NOVALAKE.
* config.gcc: Add -march=novalake.
* config/i386/driver-i386.cc (host_detect_local_cpu): Handle
novalake.
* config/i386/i386-c.cc (ix86_target_macros_internal): Ditto.
* config/i386/i386-options.cc (processor_cost_table): Ditto.
(m_NOVALAKE): New.
(m_CORE_HYBRID): Add novalake.
* config/i386/i386.h (enum processor_type): Ditto.
* doc/extend.texi: Ditto.
* doc/invoke.texi: Ditto.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv16.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Handle new march.

i386: Correct cpu codename value for unknown model number

Several recent changes altered the features enabled on CPUs: r16-1666 disabled
CLDEMOTE on clients, r16-2224 removed Key Locker starting with Panther Lake and
Clearwater Forest, and r16-4436 disabled PREFETCHI on Panther Lake.

These patches left the guessed value that host_detect_local_cpu returns for an
unknown model number out of step with the enabled features. Correct the
logic according to the features enabled.

This patch will also be backported to GCC 14 and GCC 15.

gcc/ChangeLog:

* config/i386/driver-i386.cc (host_detect_local_cpu): Correct
the logic for unknown model number cpu guess value.

Simplify avx512 vector integer comparison when 2 operands are known equal

For comparison NEQ/LT/NLE, it's simplified to 0.
For comparison LE/EQ/NLT, it's simplified to (1u << nelt) - 1
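
The two rules can be checked with a scalar model of a lane-wise compare.  This
is a sketch under stated assumptions: the enum, its values and the function
names are hypothetical and do not reflect the instruction's immediate encoding,
and a 64-bit mask type is used so that nelt up to 64 does not overflow.

```c
#include <stdint.h>

/* Predicates named as in the commit message; the values are arbitrary.  */
enum cmp_pred { CMP_EQ, CMP_LT, CMP_LE, CMP_NEQ, CMP_NLT, CMP_NLE };

/* Evaluate one predicate on one pair of lane values.  */
static int lane_holds(enum cmp_pred pred, int a, int b)
{
    switch (pred) {
    case CMP_EQ:  return a == b;
    case CMP_LT:  return a < b;
    case CMP_LE:  return a <= b;
    case CMP_NEQ: return a != b;
    case CMP_NLT: return !(a < b);
    case CMP_NLE: return !(a <= b);
    }
    return 0;
}

/* Reference: build the result mask lane by lane (nelt <= 64).  */
uint64_t cmp_lanes(enum cmp_pred pred, const int *a, const int *b,
                   unsigned nelt)
{
    uint64_t mask = 0;
    for (unsigned i = 0; i < nelt; i++)
        if (lane_holds(pred, a[i], b[i]))
            mask |= UINT64_C(1) << i;
    return mask;
}

/* Simplified result when both operands are known to be equal.  */
uint64_t cmp_same_operand_mask(enum cmp_pred pred, unsigned nelt)
{
    uint64_t all_ones = nelt >= 64 ? UINT64_MAX
                                   : (UINT64_C(1) << nelt) - 1;
    switch (pred) {
    case CMP_EQ:
    case CMP_LE:
    case CMP_NLT:
        return all_ones;
    default:
        return 0;
    }
}
```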

gcc/ChangeLog:

PR target/122320
* config/i386/sse.md (*<avx512>_cmp<mode>3_dup_op): New define_insn_and_split.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr122320-mask16.c: New test.
* gcc.target/i386/pr122320-mask2.c: New test.
* gcc.target/i386/pr122320-mask32.c: New test.
* gcc.target/i386/pr122320-mask4.c: New test.
* gcc.target/i386/pr122320-mask64.c: New test.
* gcc.target/i386/pr122320-mask8.c: New test.

libgccjit: Add _Float16, _Float32, _Float64 and __float128 support for jit

gcc/ChangeLog:

* config/i386/i386-jit.cc: Mark new float types as supported.

gcc/jit/ChangeLog:

* docs/topics/types.rst: Document new types.
* dummy-frontend.cc: Support new types in tree_type_to_jit_type.
* jit-common.h: Update NUM_GCC_JIT_TYPES.
* jit-playback.cc: Support new types in get_tree_node_for_type.
* jit-recording.cc: Support new types.
* libgccjit.h (gcc_jit_types): Add new types.

gcc/testsuite/ChangeLog:

* jit.dg/all-non-failing-tests.h: Mention new test.
* jit.dg/test-sized-float.c: New test.

Daily bump.

libgccjit: Fix error on Power architectures caused by wrong jit_target_objs

gcc/ChangeLog:
* config.gcc (jit_target_objs): Don't set this variable since
the object files don't exist.

c2y: Allow unspecified arrays in generic association.

To allow unspecified arrays in generic association add a new
declaration context GENERIC_ASSOC for grokdeclarator and new
function grokgenassoc to be used by the parser. The error
about unspecified array is moved from build_array_declarator
to grokdeclarator to be able to check for this.

gcc/c/ChangeLog:
* c-decl.cc (build_array_declarator): Remove error.
(grokgenassoc): New function.
(grokdeclarator): Add error.
* c-parser.cc (c_parser_generic_selection): Use grokgenassoc.
* c-tree.h (grokgenassoc): Add prototype.

gcc/testsuite/ChangeLog:
* gcc.dg/c2y-generic-6.c: New test.
* gcc.dg/c2y-generic-7.c: New test.

c++: Implement C++23 P2674R1 - A trait for implicit lifetime types

The following patch attempts to implement the compiler side of the
C++23 P2674R1 paper.  As mentioned in the paper, since CWG2605
the trait isn't really implementable purely on the library side.

Because it is implemented completely on the compiler side, it
just uses SCALAR_TYPE_P and so can e.g. accept __int128 even in
-std=c++23 mode, even when std::is_scalar_v<__int128> is false in
that case.  And as an extension it (like Clang) accepts _Complex
types and vector types.
I must say I'm quite surprised that any array types are considered
implicit-lifetime, even if their element type is not, but perhaps
there is some reason for that.
Because std::is_array_v<int[0]> is false, it returns false for that
as well, dunno if that shouldn't be changed for implicit-lifetime.
It accepts also VLAs.

The library part has been split into a separate patch still pending
review; committing it now so that reflection can use it in its
std::meta::is_implicit_lifetime_type implementation.

2025-10-21  Jakub Jelinek  <jakub@redhat.com>

gcc/cp/
* cp-tree.h: Implement C++23 P2674R1 - A trait for implicit lifetime
types.
(implicit_lifetime_type_p): Declare.
* tree.cc (implicit_lifetime_type_p): New function.
* cp-trait.def (IS_IMPLICIT_LIFETIME): New unary trait.
* semantics.cc (trait_expr_value): Handle CPTK_IS_IMPLICIT_LIFETIME.
(finish_trait_expr): Likewise.
* constraint.cc (diagnose_trait_expr): Likewise.
gcc/testsuite/
* g++.dg/ext/is_implicit_lifetime.C: New test.

arm: testsuite: [MVE] Fix expected code for vadcq_m and vsbcq_m [PR122189]

The original versions of these tests only took into account code
generated with -mfloat-abi=hard.

Depending on how the toolchain is configured, arm_v8_1m_mve may use
-mfloat-abi=softfp, which generates a different instruction order.

Depending on the -mtune setting, the order can also vary, so the patch
adds -fno-schedule-insns -fno-schedule-insns2 to avoid such
maintenance issues.

In particular, this fixes the failures with:
-mthumb -march=armv7e-m+fp.dp -mtune=cortex-m7 -mfloat-abi=hard -mfpu=auto
-mthumb -march=armv6s-m -mtune=cortex-m0 -mfloat-abi=soft -mfpu=auto

gcc/testsuite/ChangeLog:

PR target/122189
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c

OpenMP: Handle non-executable directives in intervening code [PR120180,PR122306]

OpenMP 6 permits non-executable directives in intervening code; this commit adds
support for a sensible subset, namely metadirectives, nothing, assume, and
'error at(compilation)'.
Also handle the special case where a metadirective can be resolved at parse time
to 'omp nothing'.
This fixes a build issue that affects 10 out of 12 SPECaccel benchmarks.
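
For illustration, a collapsed loop nest with a non-executable directive as
intervening code could look like the sketch below.  It is not one of the new
testcases; the function name is hypothetical.  Without OpenMP enabled the
pragmas are ignored and the function runs as plain serial C.

```c
/* Sum a rows x cols matrix stored in row-major order.  */
long sum_matrix(const int *m, int rows, int cols)
{
    long total = 0;
    #pragma omp parallel for collapse(2) reduction(+:total)
    for (int i = 0; i < rows; i++) {
        /* Non-executable directive placed between the two loops.  */
        #pragma omp nothing
        for (int j = 0; j < cols; j++)
            total += m[i * cols + j];
    }
    return total;
}
```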

Co-authored-by: Tobias Burnus <tburnus@baylibre.com>

PR c/120180
PR fortran/122306

gcc/c/ChangeLog:

* c-parser.cc (c_parser_pragma): Accept a subset of non-executable
OpenMP directives in intervening code.
(c_parser_omp_error): Reject 'error at(execution)' in intervening code.
(c_parser_omp_metadirective): Return early if only one selector matches
and it resolves to 'omp nothing'.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_metadirective): Return early if only one
selector matches and it resolves to 'omp nothing'.
(cp_parser_omp_error): Reject 'error at(execution)' in intervening code.
(cp_parser_pragma): Accept a subset of non-executable OpenMP directives
as intervening code.

gcc/fortran/ChangeLog:

* gfortran.h (enum gfc_exec_op): Add EXEC_OMP_FIRST_OPENMP_EXEC and
EXEC_OMP_LAST_OPENMP_EXEC.
* openmp.cc (gfc_match_omp_context_selector): Remove static. Remove
checks on score. Add cleanup. Remove checks on trait properties.
(gfc_match_omp_context_selector_specification): Remove static. Adjust
calls to gfc_match_omp_context_selector.
(gfc_match_omp_declare_variant): Adjust call to
gfc_match_omp_context_selector_specification.
(match_omp_metadirective): Likewise.
(icode_code_error_callback): Reject all statements except
'assume' and 'metadirective'.
(gfc_resolve_omp_context_selector): New function.
(resolve_omp_metadirective): Skip metadirectives whose context selectors
can be statically resolved to false. Replace metadirective by its body
if only 'nothing' remains.
(gfc_resolve_omp_declare): Call gfc_resolve_omp_context_selector for
each variant.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/imperfect1.c: Adjust dg-error.
* c-c++-common/gomp/imperfect4.c: Likewise.
* c-c++-common/gomp/pr120180.c: Move to...
* c-c++-common/gomp/pr120180-1.c: ...here. Remove dg-error.
* g++.dg/gomp/attrs-imperfect1.C: Adjust dg-error.
* g++.dg/gomp/attrs-imperfect4.C: Likewise.
* gfortran.dg/gomp/declare-variant-2.f90: Adjust dg-error.
* gfortran.dg/gomp/declare-variant-20.f90: Likewise.
* c-c++-common/gomp/pr120180-2.c: New test.
* g++.dg/gomp/pr120180-1.C: New test.
* gfortran.dg/gomp/pr120180-1.f90: New test.
* gfortran.dg/gomp/pr120180-2.f90: New test.
* gfortran.dg/gomp/pr122306-1.f90: New file.
* gfortran.dg/gomp/pr122306-2.f90: New file.

x86_64: Start TImode STV chains from zero-extension or *concatditi.

Currently x86_64's TImode STV pass has the restriction that candidate
chains must start with a TImode load from memory.  This patch improves
the functionality of STV to allow zero-extensions and construction of
TImode pseudos from two DImode values (i.e. *concatditi) to both be
considered candidate chain initiators.  For example, this allows chains
starting from an __int128 function argument to be processed by STV.

Compiled with -O2 on x86_64:

__int128 m0,m1,m2,m3;
void foo(__int128 m)
{
    m0 = m;
    m1 = m;
    m2 = m;
    m3 = m;
}

Previously generated:

foo:    xchgq   %rdi, %rsi
        movq    %rsi, m0(%rip)
        movq    %rdi, m0+8(%rip)
        movq    %rsi, m1(%rip)
        movq    %rdi, m1+8(%rip)
        movq    %rsi, m2(%rip)
        movq    %rdi, m2+8(%rip)
        movq    %rsi, m3(%rip)
        movq    %rdi, m3+8(%rip)
        ret

With the patch, we now generate:

foo:    movq    %rdi, %xmm0
        movq    %rsi, %xmm1
        punpcklqdq      %xmm1, %xmm0
        movaps  %xmm0, m0(%rip)
        movaps  %xmm0, m1(%rip)
        movaps  %xmm0, m2(%rip)
        movaps  %xmm0, m3(%rip)
        ret

or with -mavx2:

foo:    vmovq   %rdi, %xmm1
        vpinsrq $1, %rsi, %xmm1, %xmm0
        vmovdqa %xmm0, m0(%rip)
        vmovdqa %xmm0, m1(%rip)
        vmovdqa %xmm0, m2(%rip)
        vmovdqa %xmm0, m3(%rip)
        ret

Likewise, for zero-extension:

__int128 m0,m1,m2,m3;
void bar(unsigned long x)
{
    __int128 m = x;
    m0 = m;
    m1 = m;
    m2 = m;
    m3 = m;
}

Previously with -O2:

bar:    movq    %rdi, m0(%rip)
        movq    $0, m0+8(%rip)
        movq    %rdi, m1(%rip)
        movq    $0, m1+8(%rip)
        movq    %rdi, m2(%rip)
        movq    $0, m2+8(%rip)
        movq    %rdi, m3(%rip)
        movq    $0, m3+8(%rip)
        ret

with this patch:

bar:    movq    %rdi, %xmm0
        movaps  %xmm0, m0(%rip)
        movaps  %xmm0, m1(%rip)
        movaps  %xmm0, m2(%rip)
        movaps  %xmm0, m3(%rip)
        ret

As shown in the examples above, the scalar-to-vector (STV) conversion of
*concatditi has an overhead [treating two DImode registers as a TImode
value is free on x86_64], but specifying this penalty allows the STV
pass to make an informed decision if the total cost/gain of the chain
is a net win.

2025-10-21  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc (timode_concatdi_p): New
function to recognize the various variants of *concatditi3_[1-7].
(scalar_chain::add_insn): Like VEC_SELECT, ZERO_EXTEND and
timode_concatdi_p instructions don't require their input
operands to be converted (to TImode).
(timode_scalar_chain::compute_convert_gain): Split/clone XOR and
IOR cases from AND case, to handle timode_concatdi_p costs.
<case PLUS>: Handle timode_concatdi_p conversion costs.
<case ZERO_EXTEND>: Provide costs of DImode to TImode extension.
(timode_convert_concatdi): Helper function to transform
a *concatditi3 instruction into a vec_concatv2di instruction.
(timode_scalar_chain::convert_insn): Split/clone XOR and IOR
cases from AND case, to handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
<case ZERO_EXTEND>: Convert zero_extendditi2 to *vec_concatv2di_0.
<case PLUS>: Handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
(timode_scalar_to_vector_candidate_p): Support timode_concatdi_p
instructions in IOR, XOR and PLUS cases.
<case ZERO_EXTEND>: Consider zero extension of a register from
DImode to TImode to be a candidate.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-stv-10.c: New test case.
* gcc.target/i386/sse4_1-stv-11.c: Likewise.
* gcc.target/i386/sse4_1-stv-12.c: Likewise.

OpenMP: Update directive arrays used for 'omp assume(s)' with contains/absent

Both Fortran and C/C++ have an array with classifications of directives;
currently, this array is only used to handle the restrictions of the
contains/absent clauses to the assume/assumes directives.

For C/C++, uncommenting 'declare mapper' was missed. Additionally,
'end ...' is a directive but not a directive name; hence, those
are now rejected as 'unknown directive' instead of as 'invalid'
directive.

Additionally, both lists now list newer entries (commented out) for
OpenMP 6.x - and a note (comment) was added for C/C++'s
'begin metadirective' and for Fortran's 'allocate', respectively.

gcc/c-family/ChangeLog:

* c-omp.cc (c_omp_directives): Uncomment 'declare mapper',
add comment to 'begin metadirective', add 6.x unimplemented
directives as commented-out entries.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_assumption_clauses): Switch to
'unknown' not 'invalid' directive name for end directives.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_assumption_clauses): Switch to
'unknown' not 'invalid' directive name for end directives.

gcc/fortran/ChangeLog:

* openmp.cc (gfc_omp_directive): Add comment to 'allocate';
add 6.x unimplemented directives as commented-out entries.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/assumes-2.c: Adjust for the 'invalid'
to 'unknown' change for end directives.
* c-c++-common/gomp/begin-assumes-2.c: Likewise.
* c-c++-common/gomp/assume-2.c: Likewise. Check 'declare
mapper'.

tree-optimization/120687 - reduction chain with UB on signed overflow

The following adds the ability to discover a reduction chain on a
series of statements that invoke undefined behavior on integer overflow.
This inhibits the reassoc pass from associating stmts in the way
naturally leading to a reduction chain. The common mistake on the
source side is to rely on the += operator to sum multiple inputs.

After the refactoring of how we handle reduction chains we can
easily use vect_slp_linearize_chain to do this ourselves and
rely on the vectorizer punning operations to unsigned given reduction
vectorization always associates.
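
The source pattern in question can be sketched as follows; the function names
are hypothetical and the loop is not taken from the new testcase.  In the first
function the single += evaluates as sum + (((a0 + a1) + a2) + a3), and because
signed overflow is undefined the additions cannot simply be re-associated into
the chain that the second function spells out by hand.

```c
/* One += summing several inputs: the inputs are added to each other
   first, and only their total is added to the accumulator.  */
int sum4_single_stmt(const int *a, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[4 * i] + a[4 * i + 1] + a[4 * i + 2] + a[4 * i + 3];
    return sum;
}

/* The same sum written as an explicit chain on the accumulator.  */
int sum4_chain(const int *a, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += a[4 * i];
        sum += a[4 * i + 1];
        sum += a[4 * i + 2];
        sum += a[4 * i + 3];
    }
    return sum;
}
```

Both functions return the same value whenever no intermediate addition
overflows.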

PR tree-optimization/120687
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): When
there's no natural reduction chain see if vect_slp_linearize_chain
can recover one and build the SLP instance manually in that
case.
(vect_schedule_slp): Deal with NULL lanes when looking for
stores to remove.
* tree-vect-loop.cc (vect_transform_cycle_phi): Dump when we
are successfully transforming a reduction chain.

* gcc.dg/vect/vect-reduc-chain-4.c: New testcase.

Fix partial epilog for bool vectors

When we do epilogue vectorization the partial reduction of a bool
vector via vect_create_partial_epilog ends up being done on an
integer vector but we fail to pun back to a bool vector at the end,
causing an ICE later. I couldn't manage to create a testcase
running into the failure but a pending patch will expose this on
gcc.dg/vect/vect-switch-ifcvt-3.c

* tree-vect-loop.cc (vect_create_partial_epilog): Pun back
to the requested type if necessary.

vect: Fix regression for PR104116

The commit gcc-16-4464-g6883d51304f added 30 new tests for testing
vectorization of {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR. A few of them failed
for certain targets due to the vectorization of the runtime-check loop, which
was not intended.
This patch disables optimization for all of the run-time check loops so
that the count of vectorized loops is always 1.

2025-10-21 Avinash Jayakar <avinashd@linux.ibm.com>

gcc/testsuite/ChangeLog:
PR target/104116
* gcc.dg/vect/pr104116-ceil-div-2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-div-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-div.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-mod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-mod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-mod.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-udiv-2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-udiv-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-udiv.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-umod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-umod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-umod.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-div-2.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-div-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-div.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-mod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-mod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-mod.c: disable vectorization.
* gcc.dg/vect/pr104116-round-div-2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-div-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-div.c: disable vectorization.
* gcc.dg/vect/pr104116-round-mod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-mod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-mod.c: disable vectorization.
* gcc.dg/vect/pr104116-round-udiv-2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-udiv-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-udiv.c: disable vectorization.
* gcc.dg/vect/pr104116-round-umod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-umod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-umod.c: disable vectorization.
* gcc.dg/vect/pr104116.h (init_arr): use std idiom, correct
indentation.
(init_uarr): use std idiom.

match: Add support for convert `((signed)x) < 0` to `x >= (unsigned)SIGNED_TYPE_MIN` while detecting min/max [PR110068]

This copies the optimization which was done to fix PR 95699 to match detection of MIN/MAX
from minmax_replacement to match.
This is another step in getting rid of minmax_replacement in phiopt. There are still a few
more min/max detections that need to be handled before the removal. pr101024-1.c adds one
example of that but since the testcase currently passes I didn't xfail it.

pr110068-1.c adds a testcase which was not detected beforehand either.
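
The equivalence behind the transformation can be checked with a small sketch.
The function and macro names are hypothetical and the code is not taken from
the new testcases.  It relies on GCC's documented behaviour of converting an
out-of-range unsigned value to int by wrapping; ISO C leaves that conversion
implementation-defined.

```c
#include <limits.h>

/* The bit pattern of INT_MIN viewed as an unsigned value.  */
#define SIGNED_MIN_AS_UNSIGNED ((unsigned)INT_MAX + 1u)

/* Sign test on the converted value selects between x and the constant.  */
unsigned max_via_sign_test(unsigned x)
{
    return (int)x < 0 ? x : SIGNED_MIN_AS_UNSIGNED;
}

/* The same selection written as an unsigned comparison, which is a
   plain MAX of x and the constant.  */
unsigned max_via_unsigned_compare(unsigned x)
{
    return x >= SIGNED_MIN_AS_UNSIGNED ? x : SIGNED_MIN_AS_UNSIGNED;
}
```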

Changes since v1:
* v2: Fix comment about how it is transformed.
Use SIGNED_TYPE_MIN everywhere instead of mixing in SIGNED_TYPE_MAX too.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/95699
PR tree-optimization/101024
PR tree-optimization/110068

gcc/ChangeLog:

* match.pd (`(type1)x CMP CST1 ? (type2)x : CST2`): Treat
`(signed)x </>= 0` as `x >=/< SIGNED_TYPE_MIN`

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr101024-1.c: New test.
* gcc.dg/tree-ssa/pr110068-1.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

Redefine ASM_PREFERRED_EH_DATA_FORMAT for ppc[64]-vxworks

This patch redefines ASM_PREFERRED_EH_DATA_FORMAT from the
otherwise inherited linux variant, preventing DW_EH_PE_indirect
in 64bit DKMs, where they are not strictly
needed and where the runtime load could resolve the DW.refs to
symbols of the same name within a different DKM loaded previously.

gcc/
* config/rs6000/vxworks.h (ASM_PREFERRED_EH_DATA_FORMAT):
Redefine.

Replace VSB_DIR by sysroot ref in VXWORKS_ADDITIONAL_CPP_SPEC

VXWORKS_ADDITIONAL_CPP_SPEC has an artificial guard on
-fself-test to prevent all-gcc build failures from self-tests
in environments where VSB_DIR is not defined.

The libraries are not built during such
checks; having a VxWorks installation at hand is not necessary, and
requiring VSB_DIR to be defined is inappropriate.

This patch replaces the use of %:getenv(VSB_DIR) by $sysroot references,
which allows removing the artificial guard on -fself-test.

gcc/
* config/vxworks.h (VXWORKS_ADDITIONAL_CPP_SPEC):
Remove guard on -fself-test and replace %:getenv(VSB_DIR) by
sysroot references.

Daily bump.

Fix minor RISC-V testsuite failure

This fixes reduc-8 yet again. This time the required "a2" moved to the other source operand of the add. So the regexp is further expanded to allow add anyreg,anyreg,a2 or add anyreg,a2,anyreg.

gcc/testsuite
* gcc.target/riscv/rvv/autovec/reduc/reduc-8.c: Adjust expected output.

Ada: Add missing qualifier for integer literal

gcc/ada/
PR ada/102078
* affinity.c (__gnat_set_affinity_mask): Add U qualifier.

ipa: Delete callback edges when redirecting to unreachable.

When a callback-carrying edge is redirected to __builtin_unreachable,
the associated callbacks will never get called, so the corresponding
callback edges must be deleted, as they no longer reflect the reality.

The line in analyze_function_body is an obvious typo I discovered during
debugging, so I decided to bundle it in.

gcc/ChangeLog:

* ipa-fnsummary.cc (redirect_to_unreachable): Purge callback
edges when redirecting the carrying edge.
(analyze_function_body): Fix typo.

Signed-off-by: Josef Melcr <jmelcr02@gmail.com>

libgccjit: Add gcc_jit_context_new_array_type_u64

gcc/jit/ChangeLog:

* docs/topics/compatibility.rst (LIBGCCJIT_ABI_37): New ABI tag.
* docs/topics/types.rst: Document
gcc_jit_context_new_array_type_u64.
* jit-playback.cc (new_array_type): Change num_elements type to
uint64_t.
* jit-playback.h (new_array_type): Change num_elements type to
uint64_t.
* jit-recording.cc (recording::context::new_array_type): Change
num_elements type to uint64_t.
(recording::array_type::make_debug_string): Use uint64_t
format.
(recording::array_type::write_reproducer): Switch to
gcc_jit_context_new_array_type_u64.
* jit-recording.h (class array_type): Change num_elements type
to uint64_t.
(new_array_type): Change num_elements type to uint64_t.
(num_elements): Change return type to uint64_t.
* libgccjit.cc (gcc_jit_context_new_array_type_u64):
New function.
* libgccjit.h (gcc_jit_context_new_array_type_u64):
New function.
* libgccjit.exports: New function.
* libgccjit.map: New function.

gcc/testsuite/ChangeLog:

* jit.dg/all-non-failing-tests.h: Add test-arrays-u64.c.
* jit.dg/test-arrays-u64.c: New test.

testsuite: Move ipcp-cb* from ipa to libgomp

This patch addresses the incorrectly placed tests, which fail if the
testsuite is run and gcc has not been installed yet, as discussed
here:
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698095.html.

gcc/testsuite/ChangeLog:
* gcc.dg/ipa/ipcp-cb-spec1.c: Moved to libgomp/testsuite/libgomp.c/.
* gcc.dg/ipa/ipcp-cb-spec2.c: Likewise.
* gcc.dg/ipa/ipcp-cb1.c: Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.c/ipcp-cb-spec1.c: Moved from
gcc/testsuite/gcc.dg/ipa/.
* testsuite/libgomp.c/ipcp-cb-spec2.c: Likewise.
* testsuite/libgomp.c/ipcp-cb1.c: Likewise.

Signed-off-by: Josef Melcr <jmelcr02@gmail.com>

Ada: Fix incorrect specification of GNAT.Calendar.Time_IO "%c"

The timezone is not printed by the "%c" specifier.

gcc/ada/
PR ada/32318
* libgnat/g-catiio.adb (Image_Helper) <'c'>: Fix comment.

libgccjit: Do not treat warnings as errors

gcc/jit/ChangeLog:

* jit-playback.cc (add_error, add_error_va): Send DK_ERROR to
add_error_va.
(add_diagnostic): Call add_diagnostic instead of add_error.
* jit-recording.cc (DEFINE_DIAGNOSTIC_KIND): New define.
(recording::context::add_diagnostic): New function.
(recording::context::add_error): Send DK_ERROR to add_error_va.
(recording::context::add_error_va): New parameter diagnostic_kind.
* jit-recording.h (add_diagnostic): New function.
(add_error_va): New parameter diagnostic_kind.
* libgccjit.cc (jit_error): Send DK_ERROR to add_error_va.

gcc/testsuite/ChangeLog:

* jit.dg/test-error-array-bounds.c: Fix test.

libgccjit: Fix infinite recursion in gt_ggc_mx_lang_tree_node

2022-06-02 Antoni Boucher <bouanto@zoho.com>

gcc/jit/
PR jit/105827
* dummy-frontend.cc: Fix lang_tree_node.
* jit-common.h: New function (jit_tree_chain_next) used by
lang_tree_node.

libgccjit: Support more target builtin types

This also adds an option to abort on an unsupported type in order to be
able to detect new unsupported types more easily.

gcc/jit/ChangeLog:
PR jit/117886
* dummy-frontend.cc: Support some missing types.
* jit-playback.h (get_abort_on_unsupported_target_builtin): New
function.
* jit-recording.cc (get_abort_on_unsupported_target_builtin,
set_abort_on_unsupported_target_builtin): New functions.
* jit-recording.h (get_abort_on_unsupported_target_builtin,
set_abort_on_unsupported_target_builtin): New functions.
(m_abort_on_unsupported_target_builtin): New field.
* libgccjit.cc
(gcc_jit_context_set_abort_on_unsupported_target_builtin): New
function.
* libgccjit.h
(gcc_jit_context_set_abort_on_unsupported_target_builtin): New
function.
* libgccjit.exports (LIBGCCJIT_ABI_36): New ABI tag.
* libgccjit.map (LIBGCCJIT_ABI_36): New ABI tag.
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_36): New ABI tag.
* docs/topics/contexts.rst: Document new function.

hurd: Add OPTION_GLIBC_P and OPTION_GLIBC

GNU/Hurd uses glibc just like GNU/Linux.

This is needed for gcc to notice that glibc supports split stack in
finish_options.

PR go/104290
gcc/ChangeLog:
* config/gnu.h (OPTION_GLIBC_P, OPTION_GLIBC): Define.

c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads: Adjust 'libgomp.c++/{target-flex-101.C,target-std__flat_map-concurrent.C,target-std__flat_multimap-concurrent.C}' [PR114457, PR122268, PR120450]

With commit r16-4212-gf256a13f8aed833fe964a2ba541b7b30ad9b4a76
"c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads [PR114457]",
we acquired:

    {+FAIL: libgomp.c++/target-flex-101.C (internal compiler error: in assign_temp, at function.cc:990)+}
    [-PASS:-]{+FAIL:+} libgomp.c++/target-flex-101.C (test for excess errors)
    [-PASS:-]{+UNRESOLVED:+} libgomp.c++/target-flex-101.C [-execution test-]{+compilation failed to produce executable+}

... for GCN, nvptx offloading compilation, and on the other hand:

    [-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_map-concurrent.C (internal compiler error[-: in assign_temp, at function.cc:990)-]
    [-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_map-concurrent.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-std__flat_map-concurrent.C [-compilation failed to produce executable-]{+execution test+}

    [-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C (internal compiler error[-: in assign_temp, at function.cc:990)-]
    [-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C [-compilation failed to produce executable-]{+execution test+}

... for GCN offloading compilation (already PASSed for nvptx).

Note that these test cases explicitly use '-std=c++23', so don't undergo the
new C++26 P2795R5 functionality.  Yet, comparing before vs. after that commit,
in the 'gimple' dumps (that is, early host compilation), there are a lot of
changes where 'gimple_assign <constructor, [...], {CLOBBER(bob)}, NULL, NULL>'s
and relatedly 'gimple_bind's newly appear/no longer appear elsewhere.  This
leads to correspondingly different code at the beginning of offloading
compilation.  Why/how that now ('libgomp.c++/target-flex-101.C') vs. before
('libgomp.c++/{target-std__flat_map-concurrent.C,target-std__flat_multimap-concurrent.C}')
translates into 'expand' ICEs, I can't tell.

PR c++/114457
PR c++/122268
PR c++/120450
libgomp/
* testsuite/libgomp.c++/target-flex-101.C: XFAIL GCN, nvptx
offloading compilation.
* testsuite/libgomp.c++/target-std__flat_map-concurrent.C:
Un-XFAIL GCN offloading compilation.
* testsuite/libgomp.c++/target-std__flat_multimap-concurrent.C:
Likewise.

c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads: Adjust 'c-c++-common/goacc/kernels-decompose-pr100280-1.c' [PR114457]

With commit r16-4212-gf256a13f8aed833fe964a2ba541b7b30ad9b4a76
"c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads [PR114457]",
we acquired:

    @@ -181180,8 +184423,8 @@ PASS: c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26  at line 14
    PASS: c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26  at line 15 (test for warnings, line 12)
    PASS: c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26  at line 16 (test for warnings, line 12)
    PASS: c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26 (test for excess errors)
    [-XFAIL:-]{+XPASS:+} c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26 TODO at line 18 (test for warnings, line 19)
    [-XFAIL:-]{+XPASS:+} c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26 TODO location at line 17 (test for bogus messages, line 10)

As in other OpenACC 'kernels' test cases, the underlying issue again is
PR121975 "Various goacc failures with -ftrivial-auto-var-init=zero" (to be
resolved later on).

PR c++/114457
gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-pr100280-1.c: Skip for
c++26 until PR121975 is fixed.

Ada: Fix Default_Component_Value aspect wrongly ignored on derived type

This is again an old issue, which was mostly fixed a few releases ago except
for the specific case of an array type derived from String.

gcc/ada/
PR ada/68179
* exp_ch3.adb (Expand_Freeze_Array_Type): Build an initialization
procedure for a type derived from String declared with the aspect
Default_Aspect_Component_Value.

gcc/testsuite/
* gnat.dg/component_value1.adb: New test.

Ada: Fix use type clause invalidated by use clause in nested package

This is an old issue, whereby a use type clause is partially invalidated by
a use clause in a nested package, a variant of PR ada/64869 recently fixed.
The problem occurs only for unusual primitive operators because of a small
oversight in the implementation. The fix simply aligns this implementation
with the one exercised by PR ada/64869, which is more robust.

gcc/ada/
PR ada/52319
* sem_ch7.adb (Uninstall_Declarations): Use direct test on Nkind
to spot operators.
* sem_ch8.adb (End_Use_Package): Also test the Etype of operators
to spot those which are primitive operators of use-visible types.

gcc/testsuite/
* gnat.dg/use_type3.adb: New test.

Ensure use of gcc's version of stdatomic.h in gthr-vxworks

VxWorks provides its own version of the standard stdatomic.h, possibly
relying on non-gcc builtins, and our implementation of the gthr API resorts
to VxWorks specific functions for atomicity features.

When compiling libgcc (with gcc), make sure gcc's version of stdatomic.h
is used: #include it here, first, then define the macro used to guard the
system version so it doesn't get expanded when included indirectly by
other system headers.

2025-10-20 Olivier Hainque <hainque@adacore.com>
Ashley Gay <gay@adacore.com>

libgcc/
* config/gthr-vxworks.h: Include stdatomic.h and prevent indirect
inclusion of contents from the system version of that header.

Tidy bits of libgcc/config/gthr-vxworks

This addresses a variety of warnings about missing prototypes
or suspicious ptr-to-function conversions.

libgcc/
* config/gthr-vxworks-thread.c (__init_gthread_tcb): Make static.
(__delete_gthread_tcb): Likewise.
(__task_wrapper): Likewise.
(__gthread_create): Convert __task_wrapper to (void *) before going
to (FUNCPTR).
* config/gthr-vxworks-tls.c (tls_delete_hook): Accommodate prototype
variations between kernel and rtp. Return STATUS.

xtensa: Make all memory constraints special

In a previous commit (fb7b82964f54192d0723a45c0657d2eb7c5ac97c), we fixed an issue
where loads from literal pool to a hardware floating-point register were double-
indirected; that is, the address of the literal pool entry was temporarily loaded
from another entry into the address (GP) register, and then loaded from that
address into the FP register. However, we discovered that the same issue could
occur in rare cases when loading FP constants into address registers.

Similarly, this problem can be avoided by prefixing the corresponding alternative
constraint with '^' to increase the cost of Reload/LRA, but as a more fundamental
and comprehensive solution, this patch defines all memory constraint definitions
using define_special_memory_constraint, so that reloads cannot occur for addresses
(based on a good suggestion from Jeff Law).

gcc/ChangeLog:

* config/xtensa/constraints.md (R, U):
Change define_memory_constraint to define_special_memory_constraint.
* config/xtensa/xtensa.md
(movsi_internal, movhi_internal, movqi_internal):
Rearrange their alternatives in the order of constant assignment, register-
register move, load, store and special. And also consolidate overlapping
alternatives.
(movsf_internal): Rearrange the alternatives as above, and remove the '^'
alternative character which is no longer needed.

xtensa: Make individual use of CONST16 instruction

Until now, in Xtensa ISA, the CONST16 machine instruction (which shifts a
specified register left by half a word and stores a 16-bit constant value
in the low halfword of the register) has always been used in pairs and
only for full-word constant value assignments.

This patch provides a new insn definition for using CONST16 alone, and
also adds a constantsynth method that saves one byte for constant
assignments within a certain range when TARGET_DENSITY is also enabled.
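
The CONST16 semantics described above can be modelled in plain C.  This
is only an illustrative sketch: the helper names const16 and load_const32
are hypothetical and do not appear in the GCC sources.

```c
#include <stdint.h>

/* Model of one CONST16 instruction: shift the register left by half a
   word and store the 16-bit immediate in the low halfword.  */
static uint32_t
const16 (uint32_t reg, uint16_t imm)
{
  return (reg << 16) | imm;
}

/* Traditional paired use: two CONST16s build a full-word constant,
   whatever the register held before.  */
static uint32_t
load_const32 (uint32_t old, uint32_t value)
{
  uint32_t r = const16 (old, (uint16_t) (value >> 16));
  return const16 (r, (uint16_t) (value & 0xffff));
}
```

In this model a lone CONST16 gives a predictable result only when the
previous register contents are known; for example, const16 (0x12, 0x3456)
yields 0x123456.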

gcc/ChangeLog:

* config/xtensa/xtensa.cc
(constantsynth_method_const16): New.
(constantsynth_methods): Append constantsynth_method_const16().
(constantsynth_info): Add cost calculation for full-word constant
assignment when TARGET_CONST16 is enabled.
(constantsynth_pass1): Change it so that it works regardless of
TARGET_CONST16.
* config/xtensa/xtensa.md (*xtensa_const16): New.

xtensa: Apply split_DI_SF_DF_const() even if TARGET_CONST16 or TARGET_AUTOLITPOOLS

Otherwise, if TARGET_CONST16 or TARGET_AUTOLITPOOLS is enabled, DI/SF/DFmode
constant assignments will not benefit from their splitting or constantsynth.

gcc/ChangeLog:

* config/xtensa/xtensa.cc (do_largeconst):
Change split_DI_SF_DF_const() to be called unconditionally.

libstdc++: Implement P3060R3: Add std::views::indices(n)

This patch adds the views::indices function using iota.

libstdc++-v3/ChangeLog:

* include/bits/version.def: Add ranges_indices FTM.
* include/bits/version.h: Regenerate.
* include/std/ranges: Implement views::indices.
* testsuite/std/ranges/indices/1.cc: New test.

Include linux-protos.h for ppc*vxworks7r2

This provides prototypes for target hooks dragged in through linux.h,
in a similar fashion as the ppc*-linux ports do.

gcc/
* config.gcc (powerpc*-wrs-vxworks7r*): Add linux-protos.h
to tm_p_file.

libstdc++: Deduce function_ref<M&() noexcept> from member object pointers.

Implement resolution of LWG4425.

libstdc++-v3/ChangeLog:

* include/bits/funcwrap.h (__polyfunc::__deduce_funcref):
Adjust signature produced for member object pointers.
* testsuite/20_util/function_ref/deduction.cc: Update tests.

Infer TOOL/TOOL_FAMILY from vxworks-predef.h on VxWorks7

This change moves, for VxWorks 7, the setting of the TOOL
and TOOL_FAMILY macros from a builtin_define to a run-time
computation from vxworks-predef.h.

This is useful on Vx7 to allow a single toolchain to be used
for instances of VxWorks based on either a gnu or an llvm system
toolchain for a given cpu (typically, powerpc).

This is achieved by leveraging the existence of a very basic
autoconf.h file in all VxWorks 7 VSBs, #included directly from
vxworks-predef.h.

gcc/
* config/vxworks.h (VXWORKS_OS_CPP_BUILTINS): Only
builtin_define TOOL and TOOL_FAMILY for !TARGET_VXWORKS7.
Augment comment on VXWORKS_PERSONALITY.
* config/vxworks/vxworks-predef.h: Infer TOOL and TOOL_FAMILY
from the VSB autoconf.h when we have one, determined by the presence
of a _VSB_CONFIG_FILE definition.

libgcc/
* config/t-vxworks: -include vxworks-predef.h explicitly, as the
automatic inclusion is disabled by -nostdinc.

aarch64: Add support for menable-sysreg-checking flag.

In the current Binutils we have disabled the feature gating for sysreg
by default and we have introduced a new flag "-menable-sysreg-checking"
to re-enable some of this checking.

However in GCC, we have disabled the feature gating of sysreg to read/write
intrinsics __arm_[wr]sr* and we have not added any mechanism to check the
feature gating if needed similar to Binutils.

This patch adds the support for the flag "-menable-sysreg-checking" which
re-enables some of the feature checking of sysreg to read/write intrinsics
__arm_[wr]sr* similar to Binutils.

For inline assembly, sysreg checks are not performed by CC1 and are
instead delegated to the assembler. By default, the assembler does not
perform these checks either. With this patch, the -menable-sysreg-checking
flag passed to the compiler will also be propagated to the assembler,
enabling sysreg checking for inline assembly.

gcc/ChangeLog:

* config/aarch64/aarch64-elf.h (ASM_SPEC): Update the macro.
* config/aarch64/aarch64.cc (aarch64_valid_sysreg_name_p):
Add feature check condition.
(aarch64_retrieve_sysreg): Likewise.
* config/aarch64/aarch64.opt (menable-sysreg-checking):
Define new flag.
* doc/invoke.texi (menable-sysreg-checking): Document new flag.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/asm-inlined-sysreg-1.c: New test.
* gcc.target/aarch64/acle/asm-inlined-sysreg-2.c: Likewise.
* gcc.target/aarch64/acle/rwsr-gated-1.c: Likewise.
* gcc.target/aarch64/acle/rwsr-gated-2.c: Likewise.
* lib/target-supports.exp
(check_effective_target_aarch64_sysreg_guarding_ok): Check
assembler support of -menable-sysreg-checking flag.

MAINTAINERS: Add myself to vectorizer maintainer list

Following the announcement on
https://gcc.gnu.org/pipermail/gcc/2025-October/246833.html
adding myself to vectorizer maintainer list.

ChangeLog:

* MAINTAINERS (Various Maintainers): Add myself for the vectorizer.

Fix minor testsuite scan failures for RISC-V

This fixes minor testsuite fallout after some of Jan's recent changes, nothing
of real significance, just minor changes in codegen causing scan tests to fail.
It's mostly an -O1/-Og problem and we can just skip the tests for those.

gcc/testsuite
* gcc.target/riscv/rvv/vsetvl/imm_switch-6.c: Skip scan-asm test for -O1 too.
* gcc.target/riscv/rvv/vsetvl/imm_switch-7.c: Likewise.
* gcc.target/riscv/shrink-wrap-1.c: Likewise. Skip for -Og as well.
* gcc.target/riscv/xandes/xandesperf-1.c: Adjust expected output.

Ada: Use Osint.Program_Name in gnatchop

This aligns gnatchop with the other GNAT tools when it comes to locating
GCC's driver executable.

gcc/ada/
PR ada/87777
* gnatchop.adb: Add with clause for Osint.
(Locate_Executable): Delete.
(Gnatchop): Use Osint.Program_Name and Locate_Exec_On_Path instead
of Locate_Executable to locate GCC's driver executable.

top-level: Add forgejo sanity checks

Add a sample workflow for Forgejo, as an example of integrated CI.

To keep it lightweight, we run only two small checks on each patch of
the series:
- contrib/check_GNU_style.py
  which catches common mistakes (spaces vs tab, missing spaces, ...)
  but has some false positive warnings.

- contrib/gcc-changelog/git_check_commit.py
  which checks the commit message and ChangeLog entry

In order to run both checks even if the other fails, we use two steps
with 'continue-on-error: true', and we need a 'final-result'
consolidation step to generate the global status.

ChangeLog:
* .forgejo/workflows/sanity-checks.yaml: New file.

libstdc++: Remove undeclared macros from configure.ac [PR122322]

The additions in r16-4443-g651bf5126da124 cause errors when running
autoreconf.

libstdc++-v3/ChangeLog:

PR libstdc++/122322
* configure.ac (with_newlib) <*-rtems*>: Remove
HAVE_SYS_IOCTL_H, _GLIBCXX_USE_LINK, _GLIBCXX_USE_READLINK,
_GLIBCXX_USE_SYMLINK, _GLIBCXX_USE_TRUNCATE, and
_GLIBCXX_USE_FDOPENDIR. Remove duplicates.
* configure: Regenerate.

Ada: Fix spurious warning for renaming of component of VFA record

This is a regression present on the mainline and all active branches: the
compiler gives a spurious "is not referenced" warning for the renaming of
a component of a Volatile_Full_Access record.

gcc/ada/
PR ada/107536
* exp_ch2.adb (Expand_Renaming): Mark the entity as referenced.

gcc/testsuite/
* gnat.dg/renaming18.adb: New test.

tree-optimization/121631 - UB in vector epilogue

The vectorizer fails to take UB due to signed overflow into account
when generating code for the epilogue of a signed reduction. The
following tries to make sure to perform the actual reduction
computations in an unsigned type. I did not bother to adjust
inputs to internal functions like .REDUC_PLUS.
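
The idea can be shown at the source level.  This is a minimal sketch,
assuming a four-lane accumulator; reduce_lanes is a hypothetical helper
and not the vectorizer's code.  Adding the lanes in an order different
from the scalar loop may overflow 'int' even when the final result is
representable, so the additions are done in an unsigned type.

```c
#include <stdint.h>

#define LANES 4

/* Sum the per-lane partial results of a signed PLUS reduction.  The
   additions use uint32_t, where wraparound is well defined, instead of
   int32_t, where overflow is undefined behavior.  */
static int32_t
reduce_lanes (const int32_t lanes[LANES])
{
  uint32_t acc = 0;
  for (int i = 0; i < LANES; i++)
    acc += (uint32_t) lanes[i];
  /* Converting an out-of-range value back to a signed type is
     implementation-defined in ISO C; GCC documents it as reduction
     modulo 2^N.  */
  return (int32_t) acc;
}
```

With lanes { INT32_MAX, 1, -1, -1 } the first signed addition would
overflow, while the unsigned computation gives INT32_MAX - 1.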

PR tree-optimization/121631
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
When the reduction operation invokes UB on signed overflow
make sure to perform operations with it on an unsigned type.

Implement bool reduction vectorization

Currently we mess up here in two places.  One is pattern recognition
which computes a mask-precision for a bool reduction PHI that's
inconsistent with that of the latch definition.  This is solved by
iterating the mask-precision computation.  The second is that the
reduction epilogue generation and the code querying support for it
isn't ready for mask inputs.  The following fixes this by falling
back to doing all the epilogue processing on a data type again, if
the target does not support a direct mask reduction.  For that we
utilize the newly added reduc_sbool_{and,ior,xor}_scal optabs
so we can go the direct IFN path on masks if the target supports
that.  In the future we can also implement an additional fallback
for IOR and AND reductions using a scalar cond-expr like
mask != 0 ? true : false, but the new optabs provide more information
to the target.
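
For illustration, the loops below are bool reductions of the general
shape discussed here, followed by a scalar model of the
mask != 0 ? true : false fallback mentioned above.  The function names
are hypothetical, and whether a given loop is actually vectorized depends
on the target and the options used.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IOR reduction: true if any element is positive.  */
bool
any_positive (const int *a, size_t n)
{
  bool r = false;
  for (size_t i = 0; i < n; i++)
    r |= a[i] > 0;
  return r;
}

/* AND reduction: true if every element is positive.  */
bool
all_positive (const int *a, size_t n)
{
  bool r = true;
  for (size_t i = 0; i < n; i++)
    r &= a[i] > 0;
  return r;
}

/* Scalar model of the cond-expr fallback for an IOR reduction over a
   mask held in an integer.  */
bool
ior_reduce_mask (uint32_t mask)
{
  return mask != 0 ? true : false;
}
```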

PR tree-optimization/101639
PR tree-optimization/103495
* tree-vectorizer.h (vect_reduc_info_s): Add reduc_type_for_mask.
(VECT_REDUC_INFO_VECTYPE_FOR_MASK): New.
* tree-vect-patterns.cc (vect_determine_mask_precision):
Return whether the mask precision changed.
(vect_determine_precisions): Iterate mask precision computation
for loop vectorization.
* tree-vect-loop.cc (get_initial_defs_for_reduction): Properly
convert non-mask initial values to a mask initial def for
the reduction.
(sbool_reduction_fn_for_fn): New function.
(vect_create_epilog_for_reduction): For a mask input convert
it to the vector type analysis decided to use.  Use a regular
conversion for the final convert to the scalar code type.
(vectorizable_reduction): Support mask reductions.  Verify
we can compute a data vector from the mask result or a direct
mask reduction is provided by the target.

* gcc.dg/vect/vect-reduc-bool-1.c: New testcase.
* gcc.dg/vect/vect-reduc-bool-2.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-3.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-4.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-5.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-6.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-7.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-8.c: Likewise.

Add reduc_sbool_{and,ior,xor}_scal optabs

The following adds named patterns for reducing vector masks with
AND, IOR and XOR to be used by the vectorizer. A slight complication
is targets using scalar integer modes as mask modes, as for those
the mode for low-precision masks is ambiguous. For this reason the
optab follows what vec_pack_sbool_trunc does and passes an additional
CONST_INT operand indicating the number of lanes in the input mask.
Note this is done always when the vector mask mode is an integer mode
and never otherwise.
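
The ambiguity can be sketched in C by modelling an integer-mode mask as
a uint32_t.  Everything here is an assumption made for illustration: the
function names are hypothetical, and masking off the bits above the lane
count is a choice of this model rather than a statement about what the
optabs require.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keep only the bits of MASK that belong to the NLANES low lanes.  */
static uint32_t
live_bits (uint32_t mask, unsigned nlanes)
{
  assert (nlanes >= 1 && nlanes <= 32);
  uint32_t all = nlanes == 32 ? UINT32_MAX : (UINT32_C (1) << nlanes) - 1;
  return mask & all;
}

/* AND reduction: true if every one of the NLANES lanes is set.  */
static bool
reduc_and (uint32_t mask, unsigned nlanes)
{
  return live_bits (mask, nlanes) == live_bits (UINT32_MAX, nlanes);
}

/* IOR reduction: true if any of the NLANES lanes is set.  */
static bool
reduc_ior (uint32_t mask, unsigned nlanes)
{
  return live_bits (mask, nlanes) != 0;
}

/* XOR reduction: parity of the set lanes.  */
static bool
reduc_xor (uint32_t mask, unsigned nlanes)
{
  bool parity = false;
  for (uint32_t m = live_bits (mask, nlanes); m != 0; m &= m - 1)
    parity = !parity;
  return parity;
}
```

The same integer value 0x0f reduces differently depending on the lane
count: reduc_and (0x0f, 4) is true while reduc_and (0x0f, 8) is false,
which is why the integer mode alone does not determine the result.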

* doc/md.texi (reduc_sbool_{and,ior,xor}_scal_<mode>): Document.
* optabs.def (reduc_sbool_and_scal_optab,
reduc_sbool_ior_scal_optab, reduc_sbool_xor_scal_optab): New.
* internal-fn.def (REDUC_SBOOL_AND, REDUC_SBOOL_IOR,
REDUC_SBOOL_XOR): Likewise.
* internal-fn.cc (reduc_sbool_direct): New initializer.
(expand_reduc_sbool_optab_fn): New expander.
(direct_reduc_sbool_optab_supported_p): New.

Update auto-vectorizer maintenance area

The following adjusts the attribution of the auto-vectorizer area
to say 'vectorizer (+ tree-if-conv)' as approved by the SC.

* MAINTAINERS (auto-vectorizer): Change attribution to
vectorizer (+ tree-if-conv).

x86: Optimize copysign (x, const_double)

After

commit 3f176e1adc6bc9cc2c21222d776b51d9f43cb66b
Author: Tamar Christina <tamar.christina@arm.com>
Date:   Thu Nov 9 13:59:39 2023 +0000

    middle-end: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

fneg (fabs (x)) is expanded to copysign (x, -1).  Swap constraints for
operands[1] and operands[2] in copysign<mode>3 pattern to optimize

  y = copysign (x, const_double)

instead of

  y = copysign (const_double, x)

Simplify

  y = copysign (x, positive_const_double)

to

  y = ~signbit_mask & x

and

  y = copysign (x, negative_const_double)

to

  y = signbit_mask | x
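
The two identities can be checked with a small C model.  This is a
sketch that assumes 'double' is IEEE 754 binary64; the helper names are
hypothetical and the code is not the x86 expander's implementation.

```c
#include <stdint.h>
#include <string.h>

#define SIGNBIT_MASK UINT64_C (0x8000000000000000)

/* Reinterpret a double as its 64-bit pattern and back.  */
static uint64_t
bits_of (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

static double
double_of (uint64_t u)
{
  double x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* copysign (x, positive_const_double): clear the sign bit of x.  */
static double
copysign_pos (double x)
{
  return double_of (bits_of (x) & ~SIGNBIT_MASK);
}

/* copysign (x, negative_const_double): set the sign bit of x.  */
static double
copysign_neg (double x)
{
  return double_of (bits_of (x) | SIGNBIT_MASK);
}
```

Only the sign of the constant matters, so its magnitude never has to be
loaded: copysign_pos behaves like fabs and copysign_neg like
fneg (fabs (x)).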

gcc/

PR target/99930
PR target/122323
* config/i386/i386-expand.cc (ix86_expand_copysign): Swap
operands[1] with operands[2].  Optimize copysign (x, const_double)
instead of copysign (const_double, x).
* config/i386/i386.md (copysign<mode>3): Swap constraints for
operands[1] and operands[2].

gcc/testsuite/

PR target/99930
PR target/122323
* gcc.target/i386/builtin-copysign-2.c: New test.
* gcc.target/i386/builtin-copysign-3.c: Likewise.
* gcc.target/i386/builtin-copysign-4.c: Likewise.
* gcc.target/i386/builtin-copysign-5.c: Likewise.
* gcc.target/i386/builtin-copysign-6.c: Likewise.
* gcc.target/i386/builtin-copysign-7.c: Likewise.
* gcc.target/i386/builtin-copysign-8a.c: Likewise.
* gcc.target/i386/builtin-copysign-8b.c: Likewise.
* gcc.target/i386/builtin-fabs-1.c: Likewise.
* gcc.target/i386/builtin-fabs-2.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Daily bump.

PR modula2/122333: m2spellcheck.cc remove memset and tidy up

This patch removes memset from m2spellcheck_InitCandidates.
It corrects a comment boilerplate and removes an unused local
variable. Finally, it frees the memory used by the candidates_array
in KillCandidates.

gcc/m2/ChangeLog:

PR modula2/122333
* gm2-compiler/M2MetaError.mod (JoinSentances): Remove
unused variable.
* gm2-gcc/m2spellcheck.cc (m2spellcheck_InitCandidates): Rewrite.
(KillCandidates): Deallocate auto_vec candidates_array.
(candidates_array_vec_t): New declaration.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

AVR: The nzb=1 patterns with IOR, XOR, AND work the same way with PLUS.

gcc/
* config/avr/avr.cc (avr_nonzero_bits_lsr_operands_p): Also
handle PLUS.
* config/avr/avr.md (pixaop): New code iterator for PLUS,
IOR, XOR, AND.
(nzb=1 insns): Use pixaop instead of bitop code iterator.
Handle PLUS in outputs.