git.ipfire.org Git - thirdparty/gcc.git/log

libstdc++: Adjust flat_set::swap swapping order

In r17-908 I accidentally made us swap the comparator first, but we
decided that the container should be swapped first.
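
A minimal sketch of the ordering described above, using a toy adaptor rather than libstdc++'s _Flat_set_impl.  All names here (swap_log, logging_container, logging_compare, toy_flat_set, observed_swap_order) are hypothetical; the members log their own swaps so the order is observable.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Shared log recording which member was swapped, in order.
inline std::vector<std::string>& swap_log() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical container wrapper whose swap is observable.
struct logging_container {
  std::vector<int> data;
  friend void swap(logging_container& a, logging_container& b) {
    swap_log().push_back("container");
    a.data.swap(b.data);
  }
};

// Hypothetical stateful comparator whose swap is observable.
struct logging_compare {
  bool descending = false;
  bool operator()(int a, int b) const { return descending ? b < a : a < b; }
  friend void swap(logging_compare& a, logging_compare& b) {
    swap_log().push_back("comparator");
    std::swap(a.descending, b.descending);
  }
};

// Toy adaptor (an assumption for illustration, not the library's code).
struct toy_flat_set {
  logging_container cont;
  logging_compare comp;

  void swap(toy_flat_set& other) {
    using std::swap;
    swap(cont, other.cont);  // container first
    swap(comp, other.comp);  // comparator second
  }
};

// Swaps two toy sets and returns the recorded member order.
inline std::vector<std::string> observed_swap_order() {
  swap_log().clear();
  toy_flat_set a, b;
  a.cont.data = {1, 2, 3};
  b.comp.descending = true;
  a.swap(b);
  return swap_log();
}
```

Calling observed_swap_order() yields the sequence "container", "comparator".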

libstdc++-v3/ChangeLog:

* include/std/flat_set (_Flat_set_impl::swap): Swap _M_cont
first.

Fortran: checking of passed character length [PR125393]

Commit r16-3462 enhanced checking of character length passed to a character
dummy.  However, when the actual argument was an array element, its storage
size was estimated from all elements up to the end of the array.  This
could give a bogus warning when the dummy argument was of a scalar
character type.  Fix check for this case to actually compare the character
lengths of actual and dummy.

PR fortran/125393

gcc/fortran/ChangeLog:

* interface.cc (get_expr_storage_size): Additionally return
character length.
(gfc_compare_actual_formal): When the formal is a scalar character
variable, use character lengths, not array storage size for check.

gcc/testsuite/ChangeLog:

* gfortran.dg/argument_checking_28.f90: New test.

libstdc++: allocate_at_least ask only what it reports (P0401)

allocate_at_least is rounding up the allocation request size to
its default alignment, which may be more than an integral
multiple of the object size requested. When the memory is freed,
what the container reports it is freeing differs from the amount
that was allocated. This patch rounds the request size back down
to what will be reported to the caller.
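
The arithmetic described above can be sketched as follows.  This is an illustration under stated assumptions, not the code in new_allocator.h: the names allocation_plan and plan_allocate_at_least are hypothetical, the alignment is passed in explicitly, and the sketch uses plain division rather than the multiplication-based computation mentioned below.

```cpp
#include <cassert>
#include <cstddef>

struct allocation_plan {
  std::size_t count;  // number of objects reported to the caller
  std::size_t bytes;  // number of bytes actually requested
};

// Hypothetical helper: round the raw request up to the alignment, see how
// many whole objects fit, then round the request back down so that the
// bytes requested equal count * object_size.
constexpr allocation_plan plan_allocate_at_least(std::size_t n,
                                                 std::size_t object_size,
                                                 std::size_t alignment) {
  const std::size_t rounded =
      (n * object_size + alignment - 1) / alignment * alignment;
  const std::size_t count = rounded / object_size;
  return allocation_plan{count, count * object_size};
}

static_assert(plan_allocate_at_least(3, 12, 16).count == 4,
              "36 bytes round up to 48, which holds four 12-byte objects");
```

For one 24-byte object and 16-byte alignment, the request rounds up to 32 bytes, only one object fits, and the sketch requests 24 bytes, so the size later passed to deallocation matches the size allocated.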

The algorithm to compute the allocation is altered in response
to findings on godbolt.org, which indicate dropping to uint8 to
perform the division is a pessimization everywhere other than
x86. The new version emits code for multiplication, instead.

In addition, the remaining -m32 test that failed under the new
allocation method is fixed, and guards are added for building
with -fno-aligned-new and for -fno-sized-deallocation.

Tested on x86 -m64/-m32.

libstdc++-v3/ChangeLog:
* include/bits/new_allocator.h (allocate_at_least): Reduce
allocation to match what is reported.
* testsuite/20_util/allocator/allocate_at_least2.cc: Add tests.
* testsuite/23_containers/vector/modifiers/insert_vs_emplace.cc:
Fix, for -m32 and new allocation results.

cobol: Speed improvements; function prototypes; POSIX compatibility.

1) The execution speed of ADD N TO VAR and SUBTRACT N FROM VAR, where N
is an integer in the range -9 through +9 and VAR is of type Numeric
Display, is improved through specialized code in genmath.cc.

2) The execution speed of FILE READ of line-sequential files is improved
by using a 64K read buffer.

3) COBOL function prototypes are implemented.

4) These changes include the beginning of implementing the POSIX
compatibility layer.

5) Added the ability to detect GOTO_EXPR that lack matching LABEL_EXPR.
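
One way a fast path like the one in item 1 could look is sketched below.  This is not the code in genmath.cc: it assumes an unsigned field stored as one ASCII digit per character, and the function name add_small_to_display is hypothetical.

```cpp
#include <cassert>
#include <string>

// Adds a small integer n (-9..9) to an unsigned field of ASCII digits,
// working from the least significant digit and stopping as soon as no
// carry or borrow remains.  Returns false and leaves the field untouched
// when n is out of range or the result would not fit in the field
// (overflow, or a negative result for this unsigned field).
inline bool add_small_to_display(std::string& digits, int n) {
  if (n < -9 || n > 9) return false;  // outside the fast-path range
  std::string work = digits;
  int carry = n;  // a negative value acts as a borrow
  for (auto it = work.rbegin(); it != work.rend() && carry != 0; ++it) {
    int d = (*it - '0') + carry;
    if (d > 9) {
      d -= 10;
      carry = 1;
    } else if (d < 0) {
      d += 10;
      carry = -1;
    } else {
      carry = 0;
    }
    *it = static_cast<char>('0' + d);
  }
  if (carry != 0) return false;  // result does not fit
  digits = work;
  return true;
}
```

Adding 1 to "0099" touches three digits and gives "0100"; adding 1 to "0042" touches only the last digit, which is where the saving over a general conversion to binary and back would come from.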

Co-authored-by: Robert Dubner <rdubner@symas.com>
Co-authored-by: James K. Lowden <jklowden@cobolworx.com>
Co-authored-by: Xavier Del Campo <xdelcampo@symas.com>
gcc/cobol/ChangeLog:

* Make-lang.in: Include gcobc script.
* cdf.y: Change formal parameters of cdf_literalize().
* cobol1.cc (cobol_langhook_handle_option): Add OPT_ftrunc option.
* compare.cc (total_digits_tree): Remove debugging statements.
(float_compare): Likewise.
* copybook.h (class copybook_elem_t): Update conditional close().
* dts.h: Change copyright notice.
* gcobc: Likewise.
* gcobol.1: Likewise.
* gcobol.3: Likewise.
* gcobolspec.cc (COMPAT_LIBRARY): POSIX compatibility.
(POSIX_LIBRARY): Likewise.
(lang_specific_driver): Likewise.
* genapi.cc (section_label): Missing LABEL_EXPR detection.
(paragraph_label): Likewise.
(internal_perform_through): Likewise.
(enter_program_common): Add comment.
(parser_enter_program): Change current_program_index() handling.
(build_alter_switch): Missing LABEL_EXPR detection.
(parser_display_internal): Handle REFER_T_ADDRESS_OF flag.
(create_and_call): ADDRESS OF is passed BY VALUE.
* gengen.cc (LOOK_FOR_MISSING_LABELS_not): Missing LABEL_EXPR
detection.
(dump_missing_labels): Likewise.
(gg_append_statement): Likewise.
(gg_struct_field_ref): Likewise.
(LABEL_ROOT): Likewise.
(gg_create_goto_pair): Likewise.
(scm_dump_generic_nodes): Forward declaration.
(gg_leaving_the_source_code_file): Missing LABEL_EXPR detection.
(label_decl_text_from_expr): New function.
* gengen.h (gg_create_assembler_name): New declaration.
(label_decl_text_from_expr): New declaration.
* genmath.cc (uchar_f_node): Fast ADD N TO NUMERIC-DISPLAY.
(uchar_ten_node): Likewise.
(fast_add): Likewise.
(fast_subtract): Likewise.
(parser_add): Likewise.
(add_floats): Likewise.
(ordinary_add_format_1): Likewise.
(ordinary_subtract_format_1): Likewise.
(add_case_1): Likewise.
(add_case_2): Likewise.
(add_case_3): Likewise.
(parser_multiply): Likewise.
(add_case_4): Likewise.
(add_litN_to_numdisp): Likewise.
(add_format_1): Likewise.
(add_format_2): Likewise.
(add_format_3): Likewise.
(subtract_floats): Likewise.
(subtract_format_1): Likewise.
(subtract_format_2): Likewise.
(subtract_format_3): Likewise.
(parser_subtract): Likewise.
* genutil.cc (refer_has_depends): False when type == FldIndex.
* lang-specs.h: Add fdefaultbyte, fstatic-call, ftrunc.
* lang.opt: Add ftrunc.
* lexio.cc (cdftext::open_input): Improved error message.
* parse.y: CDF support, POSIX support.
* parse_ante.h (cbl_division_t): Different enum.
(mode_syntax_only): New implementation of syntax_only.
(parse_error_inc): Likewise.
(resume_parsing): Likewise.
(successful_parse): Likewise.
(name_of): Formal parameter is now const.
(nice_name_of): Likewise.
(ast_op): Change formal parameters.
(prototype_ok): COBOL function prototypes.
(struct prototype_type_t): Likewise.
(is_allowed_name): Likewise.
(prototype_add): Likewise.
(prototype_args): Likewise.
(verify_args): Likewise.
(valid_pointer_relop): New function.
(field_value_all): Eliminate.
(current_field): COBOL function prototypes.
(ast_enter_exit_section): Improved error messages.
(data_division_ready): Improved mode_syntax_only.
(file_section_fd_set): Change "return false" to "return 0".
(ast_end_program): Improved mode_syntax_only.
* scan_ante.h (symbol_function_token): Use symbol_function_any().
(symbol_exists): Change for() loop termination.
(typed_name): COBOL function prototypes.
* structs.cc: Support for buffered FILE READ.
* symbols.cc (symbol_field_location): Use field_locs[] map.
(symbol_table_extend): Likewise.
(is_prototypical): COBOL function prototypes.
(symbol_elem_cmp): Likewise.
(symbol_program): Likewise.
(struct symbol_elem_t): Likewise.
(symbol_function): Likewise.
(enum protoreq_t): Likewise.
(symbol_function_impl): Likewise.
(struct cbl_label_t): Likewise.
(symbol_function_any): Likewise.
(symbols_dump): Likewise.
(cbl_field_t::attr_str): Likewise.
(field_str): Likewise.
(symbols_update): Likewise.
(symbol_field_add): Likewise.
(symbol_field_same_as): Likewise.
(cbl_alphabet_t::reencode): Detect iconv() errors.
(symbol_program_add): COBOL function prototypes.
* symbols.h (enum dspc_t): Enum for Division, Section, Paragraph,
Clause.
(cbl_prototype_ok): COBOL function prototypes.
(valid_move): Handle strong typing.
(struct parameter_t): Improved function parameter handling.
(struct cbl_ffi_arg_t): Likewise.
(struct cbl_label_t): COBOL function prototypes.
(struct function_descr_t): Likewise.
(struct cbl_alphabet_t): Detect iconv() errors.
(struct cbl_file_t): Support for LINAGE and the like.
(prototype_args): COBOL function prototypes.
(is_prototypical): COBOL function prototypes.
(is_numeric): Refmods are not numeric.
(struct symbol_elem_t): Additional declarations.
* symfind.cc (update_symbol_map2): Use symbols map.
* token_names.h: New comment.
* util.cc (cbl_prototype_ok): COBOL function prototypes.
(cdf_literalize): New formal parameters.
(effective_type): New function.
(valid_move): Handle strong typing.
(cobol_trunc_binary): Handle new ftrunc option.
(parse_error_reset): Forward declaration.
(parse_file): Formatting.
* util.h (cobol_trunc_binary): New declaration.

libgcobol/ChangeLog:

* Makefile.am: Add AM_COBC and AM_COBFLAGS; update
toolexeclib_LTLIBRARIES with libgcobol_posix.la and
libgcobol_compat_gnu.la.
* Makefile.in: POSIX compatibility support.
* aclocal.m4: Regenerate.
* charmaps.cc (__gg__iconverter): Restore map of encoding pairs.
(__gg__get_charmap): Change how encodings are mapped.
* charmaps.h (CHARMAPS_H): Include #include <map>.
(DEFAULT_32_ENCODING): Wrap in __FreeBSD__ conditional.
(error_msg_direct): Wrap in IN_TARGET_LIBS.
(class cbl_iconv_t): Wrapper for iconv() calls.
(class charmap_t): Explicit constructor.
* compat/README.md: POSIX compatibility layer.
* compat/gnu/lib/CBL_ALLOC_MEM.cbl: Likewise.
* compat/gnu/lib/CBL_CHECK_FILE_EXIST.cbl: Likewise.
* compat/gnu/lib/CBL_DELETE_FILE.cbl: Likewise.
* compat/gnu/lib/CBL_FREE_MEM.cbl: Likewise.
* compat/gnu/udf/stored-char-length.cbl: Likewise.
* compat/t/Makefile: Likewise.
* compat/t/smoke.cbl: Likewise.
* configure: Regenerate.
* configure.ac: New macros.
* configure.tgt: Likewise.
* ec.h (enum ec_type_t): New implementor-defined ec_imp_iconv_open_e
exception.
* encodings.h (_ENCODINGS_H_): #include <type_traits> for mapping
the cbl_encoding_t values.
(struct cbl_encoding_t_hash): Likewise.
* exceptl.h (ec_type_of): Remove "extern" from declaration.
* gcobolio.h (FILE_BUFFER_SIZE): READ FILE buffer size.
* gfileio.cc (sequential_file_write): Honor non-ascii encodings.
(line_sequential_file_read): Buffered FILE READ.
(line_sequential_file_read_sbc): Buffered FILE READ.
* intrinsic.cc (string_to_dest): Eliminate function.
(get_all_time): Replace __gg__convert_encoding() with
__gg__iconverter().
(__gg__when_compiled): Likewise.
* io.cc (__compat_file_status_word): POSIX compatibility layer.
* io.h (enum file_high_t): Likewise.
(enum file_status_t): Likewise.
* libgcobol.cc (init_var_both): Eliminate call to
initialize_program_state().
(__gg__move): Eliminate call to __gg__convert_encoding_length;
handle REFER_T_ADDRESS_OF.
(display_both): Handle REFER_T_ADDRESS_OF.
(__gg__display_clean): Likewise.
(__gg__convert_encoding): Eliminate function.
(__gg__convert_encoding_length): Likewise.
(default_exception_handler): Improve exception handling.
(ec_type_descr): Likewise.
(ec_type_disposition): Likewise.
(ec_is_fatal): Likewise.
(__gg__check_fatal_exception): Likewise.
(__gg__set_env_value): Remove call to __gg__convert_encoding.
* libgcobol.h (__gg__convert_encoding): Eliminate.
(__gg__convert_encoding_length): Eliminate.
* posix/bin/udf-gen: POSIX compatibility.
* posix/cpy/posix-errno.cbl: Likewise.
* posix/cpy/psx-lseek.cpy: Likewise.
* posix/cpy/psx-open.cpy: Likewise.
* posix/cpy/statbuf.cpy: Likewise.
* posix/cpy/tm.cpy: Likewise.
* posix/shim/lseek.cc (offsetof): Likewise.
(posix_lseek): Likewise.
* posix/shim/open.cc (posix_open): Likewise.
* posix/t/errno.cbl: Likewise.
* posix/t/exit.cbl: Likewise.
* posix/t/localtime.cbl: Likewise.
* posix/t/stat.cbl: Likewise.
* posix/udf/posix-exit.cbl: Likewise.
* posix/udf/posix-ftruncate.cbl: Likewise.
* posix/udf/posix-localtime.cbl: Likewise.
* posix/udf/posix-lseek.cbl: Likewise.
* posix/udf/posix-mkdir.cbl: Likewise.
* posix/udf/posix-open.cbl: Likewise.
* posix/udf/posix-read.cbl: Likewise.
* posix/udf/posix-stat.cbl: Likewise.
* posix/udf/posix-unlink.cbl: Likewise.
* posix/udf/posix-write.cbl: Likewise.
* valconv.cc: New exceptions.
* compat/gnu/cpy/cblproto.cpy: New file.
* compat/gnu/cpy/cbltypes.cpy: New file.
* compat/gnu/cpy/stored-char-length.cpy: New file.
* compat/gnu/lib/CBL_CLOSE_FILE.cbl: New file.
* compat/gnu/lib/CBL_CREATE_FILE.cbl: New file.
* compat/gnu/lib/CBL_OPEN_FILE.cbl: New file.
* compat/gnu/lib/CBL_READ_FILE.cbl: New file.
* compat/gnu/lib/CBL_WRITE_FILE.cbl: New file.
* compat/gnu/lib/cbl_alloc_mem.3: New file.
* compat/gnu/lib/cbl_alloc_mem.cbl3: New file.
* compat/gnu/lib/cbl_check_file_exist.3: New file.
* compat/gnu/lib/cbl_close_file.3: New file.
* compat/gnu/lib/cbl_create_file.3: New file.
* compat/gnu/lib/cbl_delete_file.3: New file.
* compat/gnu/lib/cbl_free_mem.3: New file.
* compat/gnu/lib/cbl_open_file.3: New file.
* compat/gnu/lib/cbl_read_file.3: New file.
* compat/gnu/lib/cbl_write_file.3: New file.
* compat/gnu/udf/cobrt-file-status.cbl: New file.
* posix/cpy/posix-close.cpy: New file.
* posix/cpy/posix-errno.cpy: New file.
* posix/cpy/posix-exit.cpy: New file.
* posix/cpy/posix-fstat.cpy: New file.
* posix/cpy/posix-ftruncate.cpy: New file.
* posix/cpy/posix-localtime.cpy: New file.
* posix/cpy/posix-lseek.cpy: New file.
* posix/cpy/posix-mkdir.cpy: New file.
* posix/cpy/posix-open.cpy: New file.
* posix/cpy/posix-read.cpy: New file.
* posix/cpy/posix-stat.cpy: New file.
* posix/cpy/posix-unlink.cpy: New file.
* posix/cpy/posix-write.cpy: New file.
* posix/shim/fstat.cc: New file.
* posix/udf/posix-close.cbl: New file.
* posix/udf/posix-errno.cbl: New file.
* posix/udf/posix-fstat.cbl: New file.

gcc/testsuite/ChangeLog:

* cobol.dg/group2/37-digit_Initialization_of_fundamental_types.cob:
Updated compiler error message.
* cobol.dg/group2/BINARY_and_COMP-5.cob:
Likewise.
* cobol.dg/group2/Check_for_equality_of_COMP-1___COMP-2.cob:
Likewise.
* cobol.dg/group2/Multi-target_MOVE_with_subscript_re-evaluation.cob:
Likewise.
* cobol.dg/group2/Named_conditionals_-_fixed__float__and_alphabetic.cob:
Likewise.
* cobol.dg/group2/Simple_p-scaling.cob:
Likewise.
* cobol.dg/group2/access_to_OPTIONAL_LINKAGE_item_not_passed.cob:
Likewise.
* cobol.dg/group2/compare_national_to_display.cob:
Likewise.
* cobol.dg/group2/comprensive_compare_comp-1_comp-5.cob:
Likewise.
* cobol.dg/group2/CBL_ALLOC_MEM___CBL_FREE_MEM.cob: New test.
* cobol.dg/group2/CBL_ALLOC_MEM___CBL_FREE_MEM.out: New test.
* cobol.dg/group2/CBL_CHECK_FILE_EXIST.cob: New test.
* cobol.dg/group2/CBL_CHECK_FILE_EXIST.out: New test.
* cobol.dg/group2/CBL_CREATE_FILE___CBL_WRITE_FILE___CBL_CLOSE_FILE.cob: New test.
* cobol.dg/group2/CBL_DELETE_FILE.cob: New test.
* cobol.dg/group2/CBL_DELETE_FILE.out: New test.
* cobol.dg/group2/CBL_OPEN_FILE___CBL_CLOSE_FILE.cob: New test.
* cobol.dg/group2/CBL_OPEN_FILE___CBL_CLOSE_FILE.out: New test.
* cobol.dg/group2/CBL_OPEN_FILE___CBL_READ_FILE___CBL_CLOSE_FILE.cob: New test.
* cobol.dg/group2/CBL_OPEN_FILE___CBL_READ_FILE___CBL_CLOSE_FILE.out: New test.
* cobol.dg/group2/CBL_READ_FILE__check_file_size_with_flags___128.cob: New test.
* cobol.dg/group2/Complex_HEX__VALUE_and_MOVE_-_UTF-16.cob: New test.
* cobol.dg/group2/Complex_HEX__VALUE_and_MOVE_-_UTF-16.out: New test.
* cobol.dg/group2/MOVE_LEVEL_78.cob: New test.
* cobol.dg/group2/MOVE_LEVEL_78.out: New test.
* cobol.dg/group2/add_-1_to_negative_pic_S9999.cob: New test.
* cobol.dg/group2/add_-1_to_negative_pic_S9999.out: New test.
* cobol.dg/group2/add_-1_to_pic_9999.cob: New test.
* cobol.dg/group2/add_-1_to_pic_9999.out: New test.
* cobol.dg/group2/add_-1_to_positive_pic_S9999.cob: New test.
* cobol.dg/group2/add_-1_to_positive_pic_S9999.out: New test.
* cobol.dg/group2/add_1_to_pic_9999.cob: New test.
* cobol.dg/group2/add_1_to_pic_9999.out: New test.
* cobol.dg/group2/add_1_to_positive_pic_S9999.cob: New test.
* cobol.dg/group2/add_1_to_positive_pic_S9999.out: New test.
* cobol.dg/group2/add__1_to_negative_pic_S9999.cob: New test.
* cobol.dg/group2/add__1_to_negative_pic_S9999.out: New test.
* cobol.dg/group2/ambiguous_PERFORM.cob: New test.
* cobol.dg/group2/ambiguous_PERFORM.out: New test.
* cobol.dg/group2/cbltypes.cpy: New test.
* cobol.dg/group2/compare_float_to_other_types.cob: New test.
* cobol.dg/group2/compare_float_to_other_types.out: New test.
* cobol.dg/group2/move_numeric_to_alphanumeric.cob: New test.
* cobol.dg/group2/move_numeric_to_alphanumeric.out: New test.

OpenMP: Reject omp_{cgroup,pteam,thread}_mem_alloc for static vars in ALLOCATE directive [PR122892]

Using omp_{cgroup,pteam,thread}_mem_alloc for static variables was not
very useful as currently worded in the spec; hence, OpenMP 6.1 will
also disallow it for local static variables, while OpenMP 6.0 already
disallowed it for other static variables.  Cf. OpenMP specification
issue #4665.

For Fortran, the check is modified, while for C the check was completely
missing.  Both have been rectified by this commit.  For C++, the allocate
directive still has to be added.
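
The rule being enforced can be sketched as a simple predicate.  This is only an illustration: the enum below is a hypothetical mirror of the predefined allocator names (real code would use omp_allocator_handle_t from <omp.h>), and allowed_for_static_variable is not a function in the compiler.

```cpp
#include <cassert>

// In C, a directive of the kind this commit rejects would look like:
//   static int counter;
//   #pragma omp allocate(counter) allocator(omp_thread_mem_alloc)

// Hypothetical stand-in for the OpenMP predefined allocators.
enum class predefined_allocator {
  default_mem,
  large_cap_mem,
  const_mem,
  high_bw_mem,
  low_lat_mem,
  cgroup_mem,
  pteam_mem,
  thread_mem
};

// True if the allocator may be named in an ALLOCATE directive that
// applies to a static variable, following the rule in this commit.
constexpr bool allowed_for_static_variable(predefined_allocator a) {
  return a != predefined_allocator::cgroup_mem &&
         a != predefined_allocator::pteam_mem &&
         a != predefined_allocator::thread_mem;
}

static_assert(!allowed_for_static_variable(predefined_allocator::thread_mem),
              "thread allocator is rejected for static variables");
```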

PR c/122892

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_allocate): Reject
omp_{cgroup,pteam,thread}_mem_alloc for static variables.

gcc/fortran/ChangeLog:

* openmp.cc (gfc_resolve_omp_allocate): Reject
omp_{cgroup,pteam,thread}_mem_alloc also for local static
variables.

gcc/ChangeLog:

* gimplify.cc (gimplify_scan_omp_clauses): Update for removed
plural -S in GOMP_OMP_PREDEF_ALLOC_THREAD.

include/ChangeLog:

* gomp-constants.h (GOMP_OMP_PREDEF_ALLOC_THREADS): Rename to ...
(GOMP_OMP_PREDEF_ALLOC_THREAD): ... this.
(GOMP_OMP_PREDEF_ALLOC_CGROUP, GOMP_OMP_PREDEF_ALLOC_PTEAM): Define
with the value of omp_{cgroup,pteam}_mem_alloc.

libgomp/ChangeLog:

* allocator.c (_Static_assert): Add asserts for the values of
GOMP_OMP_PREDEF_ALLOC_CGROUP and GOMP_OMP_PREDEF_ALLOC_PTEAM.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/allocate-static-3.f90: Modify to also
disallow local static variables.
* c-c++-common/gomp/allocate-20.c: New test.

libgcc: Support -mcall-ms2sysv-xlogues on FreeBSD/x86

With the bulk of the gcc.target/x86_64/abi tests fixed on FreeBSD/amd64,
a couple remain:

FAIL: gcc.target/x86_64/abi/ms-sysv/ms-sysv.c -mcall-ms2sysv-xlogues -O2 "-DGEN_ARGS=-p0\ --omit-rbp-clobbers" (test for excess errors)

and five more.  They all fail to link like

gld-2.46: /tmp//ccXprYoX.o: in function `msabi_00_0':
ms-sysv.c:(.text+0x30): undefined reference to `__sse_savms64f_12'

and many more missing symbols.  Those are usually provided in libgcc.a
by i386/t-msabi, so this patch includes them on FreeBSD/x86, too.  As
with the previous fixes, the resms64*.h and savms64*.h files need to
include .note.GNU-stack, too.

Bootstrapped without regressions on amd64-pc-freebsd15.0 with both gld
and /usr/bin/ld (lld), and x86_64-pc-linux-gnu.

2026-03-19  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

libgcc:
* config.host <i[34567]86-*-freebsd*> (tmake_file): Add
i386/t-msabi.
<x86_64-*-freebsd*>: Likewise.
* config/i386/i386-asm.h: Update comment.

* config/i386/resms64.h: Use .note.GNU-stack on FreeBSD, too.
* config/i386/resms64f.h: Likewise.
* config/i386/resms64fx.h: Likewise.
* config/i386/resms64x.h: Likewise.
* config/i386/savms64.h: Likewise.
* config/i386/savms64f.h: Likewise.

gccrs: work around -Wrestrict false positive [PR114385]

Recent change gives:

In file included from /gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/string:45,
                 from ../../gcc/rust/rust-system.h:34,
                 from ../../gcc/rust/lex/rust-token.cc:19:
In static member function 'static constexpr std::char_traits<char>::char_type* std::char_traits<char>::copy(char_type*, const char_type*, std::size_t)',
    inlined from 'static constexpr void std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_S_copy(_CharT*, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' at /gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:487:21,
    inlined from 'static constexpr void std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_S_copy(_CharT*, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' at /gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:482:7,
    inlined from 'constexpr void std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_M_mutate(size_type, size_type, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' at /gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:403:15,
    inlined from 'constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_M_append(const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' at /gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:498:17,
    inlined from 'constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::append(const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' at /gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:1624:18,
    inlined from 'constexpr _Str std::__str_concat(const typename _Str::value_type*, typename _Str::size_type, const typename _Str::value_type*, typename _Str::size_type, const typename _Str::allocator_type&) [with _Str = __cxx11::basic_string<char>]' at /gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:3908:19,
    inlined from 'constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc> std::operator+(const __cxx11::basic_string<_CharT, _Traits, _Alloc>&, const _CharT*) [with _CharT = char; _Traits = char_traits<char>; _Alloc = allocator<char>]' at /gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:3984:31,
    inlined from 'std::string Rust::Token::as_string() const' at ../../gcc/rust/lex/rust-token.cc:251:45:
/gccrs/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:432:56: error: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' accessing 18446744073709551609 or more bytes at offsets 0 and 0 overlaps 9223372036854775795 bytes at offset -9223372036854775802 [-Werror=restrict]
  432 |         return static_cast<char_type*>(__builtin_memcpy(__s1, __s2, __n));
      |                                        ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~

Split the concatenation to avoid the warning.
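
The general shape of such a split can be sketched as follows.  This is not the actual change to Token::as_string; the function names and the " (token)" suffix are hypothetical, chosen only to show a single operator+ expression next to an equivalent copy followed by +=.

```cpp
#include <cassert>
#include <string>

// Single-expression form: goes through
// operator+(const std::string&, const char*), the path named in the
// diagnostic above.
inline std::string describe_joined(const std::string& text) {
  return text + " (token)";
}

// Split form: copy first, then append in a separate statement.  The
// result is the same string; only the sequence of calls differs.
inline std::string describe_split(const std::string& text) {
  std::string result = text;
  result += " (token)";
  return result;
}
```

Both functions return the same value for any input, so the split is purely a change in how the string is built.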

Fix comes from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125404#c2

gcc/rust/ChangeLog:
PR tree-optimization/114385
* lex/rust-token.cc (Token::as_string): split concatenation.

Co-authored-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Marc Poulhiès <dkm@kataplop.net>

[riscv] Fix sync builtins unspec->unspecv

gcc/
* config/riscv/sync.md: Move & rename atomic unspec enums to unspecv
enum. Use renamed UNSPECV_$NAME enums.
* config/riscv/sync-rvwmo.md: Use renamed UNSPECV_$NAME enums.
* config/riscv/sync-ztso.md: Use renamed UNSPECV_$NAME enums.

aarch64: add __ARM_FEATURE_ macros for SVE2.2 and SME2.2

This patch defines __ARM_FEATURE_ macros for the SVE2.2 and SME2.2
extensions, together with the necessary new tests.  In v1 of the series
(https://gcc.gnu.org/pipermail/gcc-patches/2026-February/707393.html),
this was part of the first patch, but it has now been moved to the tail
end of the series so that these definitions aren't visible before the
contents of the extensions are actually available.
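
User code could test for the new macros along these lines.  The macro names are taken from the ChangeLog entry below; the helper functions has_sve2p2, has_sme2p2 and feature_summary are hypothetical, and on targets without the extensions the sketch simply reports "no".

```cpp
#include <cassert>
#include <string>

// True when the compiler predefines the SVE2.2 feature macro.
constexpr bool has_sve2p2() {
#ifdef __ARM_FEATURE_SVE2p2
  return true;
#else
  return false;
#endif
}

// True when the compiler predefines the SME2.2 feature macro.
constexpr bool has_sme2p2() {
#ifdef __ARM_FEATURE_SME2p2
  return true;
#else
  return false;
#endif
}

// Human-readable summary of what this translation unit was compiled with.
inline std::string feature_summary() {
  return std::string("SVE2.2: ") + (has_sve2p2() ? "yes" : "no") +
         ", SME2.2: " + (has_sme2p2() ? "yes" : "no");
}
```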

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
Emit definitions for __ARM_FEATURE_{SVE,SME}2p2.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pragma_cpp_predefs_3.c: Add SVE2p2 tests.
* gcc.target/aarch64/pragma_cpp_predefs_4.c: Add SME2p2 test.

aarch64: implement FMUL SME instruction

The SME2.2 extension introduces the following variants of a new
streaming-mode instruction:

- FMUL (Multi-vector floating-point multiply by vector)
- FMUL (Multi-vector floating-point multiply)

The first operand is a multi-vector consisting of two or four vectors, and
the second operand either has the same type, or is a single vector of the
underlying type.  New intrinsics are documented in the ACLE manual [0] and
are as follows:

svfloat{16,32,64}x{2,4}_t svmul[_single_f{16,32,64}_x{2,4}]
  (svfloat{16,32,64}x{2,4}_t zd, svfloat{16,32,64}_t zm) __arm_streaming;

svfloat{16,32,64}x{2,4}_t svmul[_f{16,32,64}_x{2,4}]
  (svfloat{16,32,64}x{2,4}_t zd, svfloat{16,32,64}x{2,4}_t zm) __arm_streaming;

This patch implements the above changes throughout the SVE builtin
description files and aarch64-sve2.md.
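
A scalar reference model of the two variants is sketched below, assuming a lane-wise multiply as the instruction names suggest.  It does not use the ACLE intrinsics: fixed-size arrays stand in for the scalable vector types, and the names vec, multi_vec, mul_single_model and mul_model are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One vector of Lanes floats, and a group of Count such vectors.
template <std::size_t Lanes>
using vec = std::array<float, Lanes>;
template <std::size_t Lanes, std::size_t Count>
using multi_vec = std::array<vec<Lanes>, Count>;

// "Multiply by vector": every vector in the group is multiplied
// lane-wise by the same single vector zm.
template <std::size_t Lanes, std::size_t Count>
multi_vec<Lanes, Count> mul_single_model(multi_vec<Lanes, Count> zd,
                                         const vec<Lanes>& zm) {
  for (auto& v : zd)
    for (std::size_t i = 0; i < Lanes; ++i) v[i] *= zm[i];
  return zd;
}

// "Multiply": each vector in zd is multiplied lane-wise by the
// corresponding vector in zm.
template <std::size_t Lanes, std::size_t Count>
multi_vec<Lanes, Count> mul_model(multi_vec<Lanes, Count> zd,
                                  const multi_vec<Lanes, Count>& zm) {
  for (std::size_t k = 0; k < Count; ++k)
    for (std::size_t i = 0; i < Lanes; ++i) zd[k][i] *= zm[k][i];
  return zd;
}
```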

[0] https://github.com/ARM-software/acle

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-sve2.def (svmul): Define new
SVE function variant.
* config/aarch64/aarch64-sve2.md (@aarch64_sve_<optab><mode>): New
instruction pattern.
(@aarch64_sve_<optab><mode>_single): Likewise.
* config/aarch64/aarch64.h (TARGET_STREAMING_SME2p2): New macro.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sme2/acle-asm/mul_f16_x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mul_f16_x4.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/mul_f32_x2.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/mul_f32_x4.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/mul_f64_x2.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/mul_f64_x4.c: Likewise.

aarch64: implement changes for COMPACT and EXPAND SVE instructions

SVE2.2 and SME2.2 extensions introduce the following changes related to
COMPACT/EXPAND instructions:

- COMPACT (Copy Active vector elements to lower-numbered elements) for 8-
  and 16-bit-wide vector elements: these variants of an existing instruction
  are new in SVE2.2 (or in streaming mode, SME2.2)
- COMPACT (Copy Active vector elements to lower-numbered elements) for 32-
  and 64-bit-wide vector elements: previously only legal in non-streaming
  mode, these variants are now allowed in streaming mode under SME2.2
- EXPAND (Copy lower-numbered vector elements to Active elements): this
  instruction is new in SVE2.2 (or in streaming mode, SME2.2)

The new supporting intrinsics are documented in the ACLE manual [0] and
are as follows:

sv{uint,int}{8,16}_t svcompact[_{u,s}{8,16}]
  (svbool_t pg, sv{uint,int}{8,16}_t zn);
sv{mfloat8,bfloat16,float16}_t svcompact[_{mf8,bf16,f16}]
  (svbool_t pg, sv{mfloat8,bfloat16,float16}_t zn);

sv{uint,int}{8,16,32,64}_t svexpand[_{u,s}{8,16,32,64}]
  (svbool_t pg, sv{uint,int}{8,16,32,64}_t zn);
svfloat{16,32,64}_t svexpand[_f{16,32,64}]
  (svbool_t pg, svfloat{16,32,64}_t zn);
sv{mfloat8,bfloat16}_t svexpand[_{mf8,bf16}]
  (svbool_t pg, sv{mfloat8,bfloat16}_t zn);

This patch implements the above changes throughout the SVE builtin
description files and aarch64-sve{,2}.md.
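
The data movement performed by the two instructions can be modelled in scalar code as below.  This is a sketch, not the intrinsics themselves: fixed-size arrays replace the scalable types, the names compact_model and expand_model are hypothetical, and zeroing of the elements that receive no data is an assumption of this model.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// COMPACT model: copy the active elements of zn, in order, to the
// lowest-numbered elements of the result; the rest are zero here.
template <std::size_t N>
std::array<int, N> compact_model(const std::array<bool, N>& pg,
                                 const std::array<int, N>& zn) {
  std::array<int, N> out{};
  std::size_t next = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (pg[i]) out[next++] = zn[i];
  return out;
}

// EXPAND model: copy the lowest-numbered elements of zn, in order, to
// the active element positions of the result; inactive ones are zero here.
template <std::size_t N>
std::array<int, N> expand_model(const std::array<bool, N>& pg,
                                const std::array<int, N>& zn) {
  std::array<int, N> out{};
  std::size_t next = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (pg[i]) out[i] = zn[next++];
  return out;
}
```

With this model, expanding a compacted vector under the same predicate restores the original active elements.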

New ASM tests have been added as usual; also, an adjustment has been made
to aarch64-ssve.exp in g++.target/ to reflect the fact that the svcompact
intrinsic is no longer nonstreaming-only.

[0] https://github.com/ARM-software/acle

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-base.cc (class svexpand_impl):
Define new SVE function base.
* config/aarch64/aarch64-sve-builtins-base.def (svcompact): Allow
execution in streaming mode when SME2p2 is enabled.
* config/aarch64/aarch64-sve-builtins-base.h (svexpand): Declare
new SVE function base.
* config/aarch64/aarch64-sve-builtins-sve2.def (svcompact): Define
new SVE function.
(svexpand): Likewise.
* config/aarch64/aarch64-sve.md (@aarch64_sve_compact<mode>):
Enable 32- and 64-bit element variants under SME2p2.  New
insn pattern for 8- and 16-bit elements.
(@aarch64_sve_expand<mode>): New insn pattern.
* config/aarch64/aarch64.h (TARGET_SVE_OR_SME2p2): New macro.
* config/aarch64/aarch64.md (UNSPEC_SVE_EXPAND): New UNSPEC.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/aarch64-ssve.exp: Add sve2p2 to the
target string.  Move svcompact from $nonstreaming_only to
$streaming_ok.
* gcc.target/aarch64/sve2/acle/asm/compact_bf16.c: New test.
* gcc.target/aarch64/sve2/acle/asm/compact_f32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_f64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/compact_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_bf16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_f32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_f64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/expand_u8.c: Likewise.

aarch64: implement FIRSTP and LASTP SVE instructions

This commit implements patterns and intrinsics for these two instructions
new in SVE2.2 (or in streaming mode, SME2.2):

- FIRSTP (Scalar index of first true predicate element (predicated))
- LASTP (Scalar index of last true predicate element (predicated))

The new intrinsics are documented in the ACLE manual [0] and have the
following signatures:

int64_t svfirstp_b{8,16,32,64} (svbool_t pg, svbool_t pn);
int64_t svlastp_b{8,16,32,64} (svbool_t pg, svbool_t pn);

The intrinsics are implemented in the usual way; the new
svfirst_lastp_impl base class is used for both families. The ->fold ()
method implements constant folding except for LASTP under
-msve-vector-bits=scalable. On the .md side, the patterns for both new
instructions are implemented using UNSPECs as they can't be expressed in
terms of standard RTL.
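
A scalar model of what the two intrinsics compute, of the kind constant folding would need to evaluate, is sketched below.  The names firstp_model and lastp_model are hypothetical, fixed-size arrays stand in for svbool_t, and returning -1 when no active element is true is an assumption of this sketch rather than something stated above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Index of the first element that is true in pn and active in pg,
// or -1 (assumed) if there is none.
template <std::size_t N>
std::int64_t firstp_model(const std::array<bool, N>& pg,
                          const std::array<bool, N>& pn) {
  for (std::size_t i = 0; i < N; ++i)
    if (pg[i] && pn[i]) return static_cast<std::int64_t>(i);
  return -1;
}

// Index of the last element that is true in pn and active in pg,
// or -1 (assumed) if there is none.
template <std::size_t N>
std::int64_t lastp_model(const std::array<bool, N>& pg,
                         const std::array<bool, N>& pn) {
  for (std::size_t i = N; i-- > 0;)
    if (pg[i] && pn[i]) return static_cast<std::int64_t>(i);
  return -1;
}
```

The LASTP result depends on where the vector ends, which is consistent with folding being skipped for LASTP when the vector length is not known at compile time.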

Included are standard asm tests (which are heavily based on the cntp_*
tests from the sve directory), as well as some general C tests
demonstrating the aforementioned constant folding when PG and/or PN are
constant vectors.

[0] https://github.com/ARM-software/acle

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-sve2.cc
(class svfirst_lastp_impl): Define new SVE function base class.
(svfirstp): Define new SVE function base.
(svlastp): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.def (svfirstp): Define
new SVE function.
(svlastp): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.h (svfirstp): Declare
new SVE function base.
* config/aarch64/aarch64-sve2.md (@aarch64_pred_firstp<mode>): New
insn pattern.
(@aarch64_pred_lastp<mode>): Likewise.
* config/aarch64/iterators.md (UNSPEC_FIRSTP): New UNSPEC.
(UNSPEC_LASTP): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/firstp_b16.c: New test.
* gcc.target/aarch64/sve2/acle/asm/firstp_b32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/firstp_b64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/firstp_b8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/lastp_b16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/lastp_b32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/lastp_b64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/lastp_b8.c: Likewise.
* gcc.target/aarch64/sve2/acle/general/firstp.c: Likewise.
* gcc.target/aarch64/sve2/acle/general/lastp.c: Likewise.

aarch64: implement FRINT32/64 SVE instructions

SVE2.2 (or in streaming mode, SME2.2) adds the following SVE
instructions:

- FRINT32X (Floating-point round to 32-bit integer (predicated))
- FRINT32Z (Floating-point round to 32-bit integer, rounding toward zero
  (predicated))
- FRINT64X (Floating-point round to 64-bit integer (predicated))
- FRINT64Z (Floating-point round to 64-bit integer, rounding toward zero
  (predicated))

The intrinsics that expand to them are defined in the ACLE manual [0]:

svfloat{32,64}_t svrint32x{_f32,_f64}_z
  (svbool_t pg, svfloat{32,64}_t zn);
svfloat{32,64}_t svrint32x{_f32,_f64}_x
  (svbool_t pg, svfloat{32,64}_t zn);
svfloat{32,64}_t svrint32x{_f32,_f64}_m
  (svfloat{32,64}_t inactive, svbool_t pg, svfloat{32,64}_t zn);

svfloat{32,64}_t svrint32z{_f32,_f64}_z
  (svbool_t pg, svfloat{32,64}_t zn);
svfloat{32,64}_t svrint32z{_f32,_f64}_x
  (svbool_t pg, svfloat{32,64}_t zn);
svfloat{32,64}_t svrint32z{_f32,_f64}_m
  (svfloat{32,64}_t inactive, svbool_t pg, svfloat{32,64}_t zn);

svfloat{32,64}_t svrint64x{_f32,_f64}_z
  (svbool_t pg, svfloat{32,64}_t zn);
svfloat{32,64}_t svrint64x{_f32,_f64}_x
  (svbool_t pg, svfloat{32,64}_t zn);
svfloat{32,64}_t svrint64x{_f32,_f64}_m
  (svfloat{32,64}_t inactive, svbool_t pg, svfloat{32,64}_t zn);

svfloat{32,64}_t svrint64z{_f32,_f64}_z
  (svbool_t pg, svfloat{32,64}_t zn);
svfloat{32,64}_t svrint64z{_f32,_f64}_x
  (svbool_t pg, svfloat{32,64}_t zn);
svfloat{32,64}_t svrint64z{_f32,_f64}_m
  (svfloat{32,64}_t inactive, svbool_t pg, svfloat{32,64}_t zn);

The implementation of new intrinsics and RTL patterns is quite
straightforward, and a standard set of ASM tests has been added to the
sve2/acle/asm directory.

[0] https://github.com/ARM-software/acle

Changes since v1:
- Append extension names to comments in aarch64-sve2.md.

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-sve2.cc (svrint32x): Define
new function base.
(svrint32z): Likewise.
(svrint64x): Likewise.
(svrint64z): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.def (svrint32x):
Define new SVE function.
(svrint32z): Likewise.
(svrint64x): Likewise.
(svrint64z): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.h (svrint32x): Declare
new function base.
(svrint32z): Likewise.
(svrint64x): Likewise.
(svrint64z): Likewise.
* config/aarch64/aarch64-sve-builtins.cc (TYPES_sd_float): New
type set.
(sd_float): New SVE type array.
* config/aarch64/aarch64-sve2.md (@cond_<frintnzs_op><mode>): New
insn pattern.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/rint32x_f32.c: New test.
* gcc.target/aarch64/sve2/acle/asm/rint32x_f64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rint32z_f32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rint32z_f64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rint64x_f32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rint64x_f64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rint64z_f32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rint64z_f64.c: Likewise.

aarch64: add zeroing forms for predicated SVE top FP conversions

SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing
predication for the following SVE FP conversion instructions:

SVE1:
- BFCVTNT (Single-precision convert to BFloat16 (top, predicated))

SVE2:
- FCVTLT (Floating-point widening convert (top, predicated))
- FCVTNT (Floating-point narrowing convert (top, predicated))
- FCVTXNT (Double-precision convert to single-precision, rounding
  to odd (top, predicated))

Additionally, this patch implements corresponding intrinsics documented in
the ACLE manual [0] with the following signatures:

svfloat{32,64}_t svcvtlt_{f32[_f16],_f64[_f32]}_z
  (svbool_t pg, svfloat{16,32}_t op);

sv{bfloat16,float16,float32}_t svcvtnt_{f16[_f32],_f32[_f64],_bf16[_f32]}_z
  (sv{bfloat16,float16,float32}_t even, svbool_t pg, svfloat{32,64}_t op);

svfloat32_t svcvtxnt_f32[_f64]_z
  (svfloat32_t even, svbool_t pg, svfloat64_t op);

This patch adds an alternative that emits a single zeroing-predication
form of the instructions mentioned above (as long as the sve2p2_or_sme2p2
condition holds) to corresponding RTL patterns.  For narrowing conversions
([B]FCVTNT and FCVTXNT), since an additional merge operand controlling the
values of inactive lanes is required, the intrinsics have been changed to
use the new narrowing_top_convert SVE function base class; this new class
injects a const_vector selector operand at expand time.  Depending on the
value of this operand, either the destination vector or a constant zero
vector is used to supply values for inactive lanes.
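
As a rough illustration of the inactive-lane choice described above, here
is a plain-C scalar model of a double-to-single "top" narrowing convert.
It is a sketch only, not GCC's implementation: the function name, the
array layout and the `zeroing` flag are hypothetical, and the lane
layout (results in odd-numbered elements, even-numbered elements passed
through from `even`) is an assumption based on the intrinsic signatures
quoted earlier.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of a top narrowing convert (f64 -> f32).
   `n` is the number of 64-bit source lanes; `dst` and `even` each hold
   2 * n single-precision elements.  */
static void
cvtnt_f32_f64_model (float *dst, const float *even, const bool *pg,
                     const double *op, size_t n, bool zeroing)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Even (bottom) elements are passed through unchanged.  */
      dst[2 * i] = even[2 * i];

      if (pg[i])
        /* Active lane: the narrowed result lands in the odd element.  */
        dst[2 * i + 1] = (float) op[i];
      else
        /* Inactive lane: zero, or the value already in the destination.  */
        dst[2 * i + 1] = zeroing ? 0.0f : even[2 * i + 1];
    }
}
```

With `zeroing` false the model keeps the old odd element for inactive
lanes; with `zeroing` true it writes zero there, which is the only
difference between the two forms in this sketch.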

The new tests all have "_z" in their names since they only cover the
zeroing-predication versions of their respective intrinsics.

[0] https://github.com/ARM-software/acle

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-base.cc (class svcvtnt_impl):
Remove.
(svcvtnt): Redefine using narrowing_top_convert.
* config/aarch64/aarch64-sve-builtins-functions.h
(class narrowing_top_convert): New SVE function base class.
(NARROWING_TOP_CONVERT0): New function-like macro for specializing
narrowing_top_convert.
(NARROWING_TOP_CONVERT1): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc (class svcvtxnt_impl):
Remove.
(svcvtxnt): Redefine using narrowing_top_convert.
* config/aarch64/aarch64-sve-builtins-sve2.def (svcvtlt): Allow
zeroing predication.
(svcvtnt): Likewise.
(svcvtxnt): Likewise.
* config/aarch64/aarch64-sve.md (@aarch64_sve_cvtnt<mode>):
Convert to compact syntax. Add operand 4 for values of
inactive lanes.  New alternative for zeroing predication.
* config/aarch64/aarch64-sve2.md
(*cond_<sve_fp_op><mode>_relaxed): Convert to compact syntax.
New alternative for zeroing predication.
(*cond_<sve_fp_op><mode>_strict): Likewise.
(@aarch64_sve_cvtnt<mode>): Convert to compact syntax. Add
operand 4 for values of inactive lanes.  New alternative for
zeroing predication.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/cvtlt_f32_z.c: New test.
* gcc.target/aarch64/sve2/acle/asm/cvtlt_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtnt_bf16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtnt_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtnt_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtxnt_f32_z.c: Likewise.

aarch64: add zeroing forms for predicated SVE int-/FP-to-FP conversions

SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing
predication for the following SVE FP conversion instructions:

SVE1:
- SCVTF (Signed integer convert to floating-point (predicated))
- UCVTF (Unsigned integer convert to floating-point (predicated))
- FCVT (Floating-point convert (predicated))
- BFCVT (Single-precision convert to BFloat16 (predicated))

SVE2:
- FCVTX (Double-precision convert to single-precision, rounding to
  odd (predicated))

The SVE1 instructions are spread over several patterns for various
combinations of source/destination widths and FP semantics, and the FCVTX
instruction is serviced by two patterns in the aarch64-sve2.md file via
the SVE2_COND_FP_UNARY_NARROWB iterator (one for strict, the other for
relaxed FP semantics).  The patch adds an alternative that emits a single
zeroing-predication version of an instruction whenever the merge operand
is a constant zero vector and the sve2p2_or_sme2p2 condition holds.

As with the original cvt_b?f* tests in the sve/acle/asm directory,
testcases for conversions from both integral and floating-point types
coexist in the same files and are grouped only by the destination type.
FCVTX tests are added in a separate file.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md
(*cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>_relaxed):
New alternative for zeroing predication.  Add `arch` attribute
to every alternative.
(*cond_<optab>_nonextend<SVE_HSDI:mode><SVE_PARTIAL_F:mode>_relaxed):
Likewise.
(*cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>_strict):
Likewise.
(*cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>):
Likewise.
(*cond_<optab>_trunc<SVE_FULL_SDF:mode><SVE_FULL_HSF:mode>):
Likewise.
(*cond_<optab>_trunc<SVE_SDF:mode><SVE_PARTIAL_HSF:mode>):
Likewise.
(*cond_<optab>_trunc<VNx4SF_ONLY:mode><VNx8BF_ONLY:mode>):
Likewise.
(*cond_<optab>_nontrunc<SVE_FULL_HSF:mode><SVE_FULL_SDF:mode>):
Likewise.
(*cond_<optab>_nontrunc<SVE_PARTIAL_HSF:mode><SVE_SDF:mode>_relaxed):
Likewise.
* config/aarch64/aarch64-sve2.md
(*cond_<sve_fp_op><mode>_any_relaxed): Likewise.
(*cond_<sve_fp_op><mode>_any_strict): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/cvt_bf16_z.c: New test.
* gcc.target/aarch64/sve2/acle/asm/cvt_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvt_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvt_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtx_f32_z.c: Likewise.

aarch64: add zeroing forms for predicated SVE FP-to-integer conversions

SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing predication
for all variants of the following FP-to-integer conversion instructions:

- FCVTZU (Floating-point convert to unsigned integer, rounding toward zero
  (predicated))
- FCVTZS (Floating-point convert to signed integer, rounding toward zero
  (predicated))

To implement this change, this patch adds a new alternative to patterns
involving the SVE_COND_FCVTI iterator and accepting an independent value
as the merge operand.  The new alternative has the new zeroing-predication
forms as the output string and is only enabled when sve2p2_or_sme2p2 is
true in the target architecture.

The new ASM tests only cover the "_z" versions of the intrinsics and as
such all have the "_z" suffix in their name, and are grouped by type of
the destination operand.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md
(*cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>_relaxed):
New alternative for zeroing predication.  Add `arch` attribute
to every alternative.
(*cond_<optab>_nontrunc<SVE_PARTIAL_F:mode><SVE_HSDI:mode>_relaxed):
Likewise.
(*cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>_strict):
Likewise.
(*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
Likewise.
(*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx2SI_ONLY:mode>_relaxed):
Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/cvt_s16_z.c: New test.
* gcc.target/aarch64/sve2/acle/asm/cvt_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvt_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvt_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvt_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvt_u64_z.c: Likewise.

aarch64: add zeroing forms for predicated SVE FP unary operations

SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing predication
for the following floating-point unary instructions:

SVE:

- FABS (Floating-point absolute value (predicated))
- FNEG (Floating-point negate (predicated))
- FRECPX (Floating-point reciprocal exponent (predicated))
- FRINT<r> (Floating-point round to integral value (predicated))
- FSQRT (Floating-point square root (predicated))

SVE2:
- FLOGB (Floating-point base 2 logarithm as integer (predicated))

These instructions are covered by SVE_COND_FP_UNARY for SVE and
SVE2_COND_INT_UNARY_FP for SVE2, thus this change is limited to two
patterns in each of aarch64-sve.md and aarch64-sve2.md (one for relaxed,
and one for strict FP semantics). The change is to add a new alternative
with Dz as operand 3 (the merge operand), enabled only if the
sve2p2_or_sme2p2 condition holds and emitting a single instruction with
zeroing predication.

The tests that have been added are based on the original SVE tests
for corresponding instructions, but all have a "_z" suffix in their name
since they only test codegen for the "_z" variants of the corresponding
intrinsics.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_any_relaxed):
New alternative for zeroing predication. Add `arch` attribute
to every alternative.
(*cond_<optab><mode>_any_strict): Likewise.
* config/aarch64/aarch64-sve2.md (*cond_<sve_fp_op><mode>):
Likewise.
(*cond_<sve_fp_op><mode>_strict): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/abs_f16_z.c: New test.
* gcc.target/aarch64/sve2/acle/asm/abs_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/abs_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/logb_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/logb_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/logb_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/neg_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/neg_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/neg_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/recpx_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/recpx_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/recpx_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rinta_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rinta_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rinta_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rinti_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rinti_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rinti_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintm_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintm_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintm_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintn_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintn_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintn_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintp_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintp_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintp_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintx_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintx_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintx_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintz_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintz_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rintz_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sqrt_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sqrt_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sqrt_f64_z.c: Likewise.

aarch64: add zeroing forms for predicated SVE bit reversal operations

SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing
predication for the following SVE bit reversal instructions:

- REVB, REVH, REVW (Reverse bytes / halfwords / words within elements
  (predicated))
- REVD (Reverse 64-bit doublewords in elements (predicated)) (SVE2 only)

The first three are covered by the SVE_INT_UNARY code iterator, and REVD,
being SVE2-only, has a standalone pattern in aarch64-sve2.md.  This patch
adds an alternative for the zeroing-predication forms of the original
instructions.  The pattern for REVD also required changes to the predicate
for operand 3 to accept constant zero RTX whenever SVE2.2 is enabled.
Additionally, use the /z form of the REVD instruction for PRED_X
predication to save a data dependency.

The tests that have been added are based on the original SVE/SVE2 tests
for corresponding instructions, but all have a "_z" suffix in their name
since they only test codegen for the "_z" variants of the corresponding
intrinsics.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (@cond_<optab><mode>):
New alternative for zeroing predication.  Add `arch` attribute
to every alternative.
* config/aarch64/aarch64-sve2.md (@aarch64_pred_<optab><mode>):
Use zeroing predication variant for PRED_X.
(@cond_<optab><mode>): Accept constant zero as operand 3.  New
alternative for zeroing predication.  Add `arch` attribute to
every alternative.
* config/aarch64/predicates.md (aarch64_simd_reg_or_direct_zero):
New predicate.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/revb_s16_z.c: New test.
* gcc.target/aarch64/sve2/acle/asm/revb_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revb_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revb_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revb_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revb_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_bf16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revd_u8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revh_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revh_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revh_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revh_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revw_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/revw_u64_z.c: Likewise.

aarch64: add zeroing forms for predicated SVE integer extends

SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing
predication for the following SVE integer extension instructions:

- SXTB, SXTH, SXTW (Signed byte / halfword / word extend (predicated))
- UXTB, UXTH, UXTW (Unsigned byte / halfword / word extend (predicated))

The functional change is limited to two patterns in aarch64-sve.md
handling SVE extends merging with an independent value, to which this
patch adds a new alternative that emits a single zeroing-predication form
of an instruction as long as the sve2p2_or_sme2p2 condition holds.

The tests that have been added are based on the original SVE tests
for corresponding instructions, but all have a "_z" suffix in their name
since they only test codegen for the "_z" variants of the corresponding
intrinsics.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md
(@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>):
New alternative for zeroing predication. Add `arch` attribute
to every alternative.
(*cond_uxt<mode>_any): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/extb_s16_z.c: New test.
* gcc.target/aarch64/sve2/acle/asm/extb_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/extb_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/extb_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/extb_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/extb_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/exth_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/exth_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/exth_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/exth_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/extw_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/extw_u64_z.c: Likewise.

aarch64: add zeroing forms for predicated SVE integer unary operations

SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing predication
for the following integer unary instructions:

SVE:
- ABS (Absolute value (predicated))
- CLS (Count leading sign bits (predicated))
- CLZ (Count leading zero bits (predicated))
- CNT (Count non-zero bits (predicated))
- CNOT (Logically invert boolean condition (predicated))
- NEG (Negate (predicated))
- NOT (Bitwise invert (predicated))
- RBIT (Reverse bits (predicated))

SVE2:
- SQABS (Signed saturating absolute value)
- SQNEG (Signed saturating negate)
- URECPE (Unsigned reciprocal estimate (predicated))
- URSQRTE (Unsigned reciprocal square root estimate (predicated))

These instructions are covered by the SVE_INT_UNARY and SVE2_U32_UNARY
iterators, except for CNOT, which has a standalone pattern.  Therefore,
three patterns across aarch64-sve.md and aarch64-sve2.md had to be
provided with a new alternative, having Dz (const_vector of all zeroes) as
the merge operand.  The new alternatives are conditional upon the
sve2p2_or_sme2p2 test added earlier, and emit the new zeroing-predication
forms of the original instructions.
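
The relationship between the merging and zeroing forms can be sketched
with a plain-C scalar model of a predicated bitwise NOT.  This is only an
illustration under stated assumptions: the function names and the
array-of-bool predicate are hypothetical and say nothing about how the
RTL patterns are written.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Merging form: inactive lanes take their value from `inactive`.  */
static void
not_merge_model (uint32_t *dst, const uint32_t *inactive, const bool *pg,
                 const uint32_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = pg[i] ? ~src[i] : inactive[i];
}

/* Zeroing form: inactive lanes are set to zero.  */
static void
not_zero_model (uint32_t *dst, const bool *pg, const uint32_t *src,
                size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = pg[i] ? ~src[i] : 0;
}
```

Calling not_merge_model with an all-zero `inactive` array gives the same
result as not_zero_model, which is the case an all-zeroes merge operand
identifies.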

The tests that have been added are based on the original SVE/SVE2 tests
for corresponding instructions, but all have a "_z" suffix in their name
since they only test codegen for the "_z" variants of the corresponding
intrinsics.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_any):
New alternative for zeroing predication.  Add `arch` attribute
to every alternative.
(*cond_cnot<mode>_any): Likewise.
* config/aarch64/aarch64-sve2.md (*cond_<sve_int_op><mode>):
Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/abs_s16_z.c: New test.
* gcc.target/aarch64/sve2/acle/asm/abs_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/abs_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/abs_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cls_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cls_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cls_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cls_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/clz_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/clz_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/clz_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/clz_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/clz_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/clz_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/clz_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/clz_u8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnot_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnot_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnot_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnot_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnot_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnot_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnot_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnot_u8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnt_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnt_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnt_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnt_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnt_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnt_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnt_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cnt_u8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/neg_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/neg_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/neg_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/neg_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/not_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/not_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/not_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/not_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/not_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/not_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/not_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/not_u8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qabs_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qabs_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qabs_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qabs_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qneg_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qneg_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qneg_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qneg_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rbit_s16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rbit_s32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rbit_s64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rbit_s8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rbit_u16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rbit_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rbit_u64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rbit_u8_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/recpe_u32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/rsqrte_u32_z.c: Likewise.

aarch64: add preliminary definitions for SVE2.2/SME2.2

This is a preparatory patch for the bulk of the SVE2.2/SME2.2 support
series, putting into place some machinery used by the later patches. This
includes TARGET_* constants that are set based on ISA flags, and new
match_test definitions that are used to enable/disable individual
instruction patterns/alternatives.

On the testsuite side of things, this patch adds two new effective-target
checks in lib/target-supports.exp, one for each of SVE2.2-capable HW and
toolchain.

v1 of this patch also contained __ARM_FEATURE_* macro definitions for
SVE2.2 and SME2.2, but these have been moved to the end of the series to
improve bisection.

gcc/ChangeLog:

* config/aarch64/aarch64.h (TARGET_SVE2p2): New macro.
(TARGET_SME2p2): Likewise.
(TARGET_SVE2p2_OR_SME2p2): Likewise.
* config/aarch64/aarch64.md (arches): Add sve2p2_or_sme2p2 enum
constant.
(arch): Add test for sve2p2_or_sme2p2.
* doc/invoke.texi: Document sve2p2 and sme2p2 extensions.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_aarch64_sve2p2_hw): New target check.
(check_effective_target_aarch64_sve2p2_ok): New target check.
(exts_sve2): Add sme2p2.

i386: Rewrite index*1+disp into base+disp

Sometimes, GCC may synthesize an address like [index * 1 + displacement]. This
commit rewrites that into [base + displacement], to eliminate the requirement
of an SIB byte (which is always the case, as RSP isn't a valid index), and to
allow a small displacement to be encoded in one byte.
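
To make the size difference concrete, the following C sketch counts the
addressing bytes (ModRM, optional SIB, displacement) for the two forms.
It is a simplified model of the x86-64 encoding rules, not code from
i386.cc; the function names are hypothetical, registers are given by
their hardware numbers (rax = 0, rsp = 4, rbp = 5, r12 = 12, r13 = 13),
and prefixes and opcode bytes are ignored.

```c
#include <stdint.h>

/* Addressing bytes for [index * 1 + disp] with no base register.
   This form needs ModRM plus an SIB byte whose base field means
   "no base", and that encoding always carries a 32-bit displacement.  */
static int
addr_bytes_index_scale1_disp (void)
{
  return 1 /* ModRM */ + 1 /* SIB */ + 4 /* disp32 */;
}

/* Addressing bytes for [base + disp] with no index register.  */
static int
addr_bytes_base_disp (int base_reg, int32_t disp)
{
  int low3 = base_reg & 7;
  int bytes = 1;                /* ModRM */

  if (low3 == 4)
    bytes += 1;                 /* rsp/r12 as base still need an SIB byte */

  if (disp == 0 && low3 != 5)
    return bytes;               /* no displacement (not for rbp/r13) */
  if (disp >= -128 && disp <= 127)
    return bytes + 1;           /* disp8 */
  return bytes + 4;             /* disp32 */
}
```

For example, in this model [rax * 1 + 8] takes 6 addressing bytes while
[rax + 8] takes 2.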

gcc/ChangeLog:

* config/i386/i386.cc (ix86_decompose_address): Add a special case where
there's no base, there's an index, and the scale is 1.

gcc/testsuite/ChangeLog:

* gcc.target/i386/rewrite-sib-without-base.c: New test.

Signed-off-by: LIU Hao <lh_mouse@126.com>

testsuite: Tweak sse2-p{add,sub}[bdw]-2.c tests for -march=cascadelake

Update the recently added gcc.target/i386/sse2-p{add,sub}[bdw]-2.c
tests for -march=cascadelake. Committed as obvious.

2026-05-29 Roger Sayle <roger@nextmovesoftware.com>

gcc/testsuite/ChangeLog:
* gcc.target/i386/sse2-paddb-2.c: Support -march=cascadelake.
* gcc.target/i386/sse2-paddd-2.c: Likewise.
* gcc.target/i386/sse2-paddw-2.c: Likewise.
* gcc.target/i386/sse2-psubb-2.c: Likewise.
* gcc.target/i386/sse2-psubd-2.c: Likewise.
* gcc.target/i386/sse2-psubw-2.c: Likewise.

ada: Fix bug when reading multibyte utf-8 character

Every byte of a multibyte UTF-8 character has its most significant bit
set, which is the sign bit when the byte is held in a signed type.

The getc_immediate C function, on Linux (and other systems), uses read()
when the character is read from a terminal.  It stored the byte in a
plain "char", which can be either signed or unsigned depending on the
target.  On targets where char is signed, reading a multibyte UTF-8
character produces a negative value.  For example:

€ = 0xE2 0x82 0xAC

The first byte is 0xE2, which is -30 for a signed char.

The value is then stored in a signed int, still as -30 (0xFFFF_FFE2),
and the caller fails a range check because 0xFFFF_FFE2 is not in the
range of Character (0..255).

Changing the variable to unsigned char avoids the conversion to a
signed value.
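
The effect can be reproduced with a small C sketch.  The helper names are
hypothetical and this is not the code in sysdep.c; `signed char` is used
explicitly so that the behaviour does not depend on whether plain char is
signed on the build target.

```c
/* Widen a byte to int through a signed char, as happens on targets
   where plain char is signed.  Converting values above 127 to signed
   char is implementation-defined; mainstream targets wrap modulo 256.  */
static int
widen_via_signed_char (unsigned char raw)
{
  signed char c = (signed char) raw;
  return c;                     /* sign-extended */
}

/* Widen a byte to int through an unsigned char.  */
static int
widen_via_unsigned_char (unsigned char raw)
{
  unsigned char c = raw;
  return c;                     /* zero-extended */
}

/* Stand-in for the caller's range check on Character (0..255).  */
static int
in_character_range (int value)
{
  return value >= 0 && value <= 255;
}
```

With the first byte of the euro sign, 0xE2, the signed path produces -30
and fails the range check, while the unsigned path produces 226 and
passes.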

gcc/ada/ChangeLog:

* sysdep.c (getc_immediate_common): Read character as unsigned
value.

ada: Cleanup of Analyze_Aspect_Specifications and related code

Rename Decorate to be Decorate_Aspect_Links; seems more readable.
Change it to support N_Attribute_Definition_Clause in addition
to N_Pragma. Move most calls to it into Insert_Aitem.

Move call to Set_Has_Delayed_Rep_Aspects to be near
calls to Set_Has_Delayed_Aspects.

Make Anod and Eloc variables more local to where they are used.

Misc comment improvements, including removing some useless ones.

gcc/ada/ChangeLog:

* sem_ch13.adb (Delay_Aspect): Remove the side effect.
(Decorate): Rename to be Decorate_Aspect_Links.
Generalize.
(Insert_Aitem): Call Decorate_Aspect_Links.
* aspects.ads: Minor comment improvement: we don't need to worry;
we just need to do it.
* einfo.ads: Minor comment improvement.

ada: Implement AI22-0154 (Revised resolution of indexing aspects)

Customer code was running into an error due to a violation of the
rule for indexing aspects that any functions declared in the same
package spec that do not satisfy the legality rules for eligible
indexing functions make the aspects illegal. In this case it was
due to a derived type inheriting a function of the parent type
that had indexing aspects. Consideration of this problem led
to proposing language changes in AI22-0154, which revises the
resolution rules to take the indexing profile requirements into
account (rather than allowing resolution of indexing aspect names
to consider any available function declared within the scope).
This set of changes implements the revised resolution rules,
allowing the compiler to accept the customer code.

In some cases the compiler will now issue a warning instead of
ignoring an ineligible candidate entity. Specifically this is
done when a candidate interpretation is a function that has at
least a first formal of the type associated with the aspect,
but doesn't satisfy other requirements of the particular
indexing aspect. We impose this limitation so as to avoid
issuing too many false-positive warnings.

These changes also reduce technical debt by removing code in
Sem_Util.Inherit_Nonoverridable_Aspect that was handling checking
and addition of new indexing functions for derived types via calls
to Check_Function_For_Indexing_Aspect. That handling is now covered
fully by Check_Indexing_Functions (which itself makes calls to
Check_Function_For_Indexing_Aspect).

Additionally, these changes attempt to implement rule changes
specified by AI22-0159/01 (Inheritance for aspects allowed to
denote multiple subprograms), an AI that was added to address
problems identified while finalizing AI22-0154.

gcc/ada/ChangeLog:

* sem_ch6.adb (New_Overloaded_Entity): Add missing call to
Check_For_Primitive_Subprogram (Is_Primitive must be set).
* sem_ch13.ads (Check_Function_For_Indexing_Aspect): Move declaration
to package body.
* sem_ch13.adb (Check_Indexing_Functions): Remove early return for
derived types. Pass appropriate values for the new Boolean parameters
on existing calls to Check_Function_For_Indexing_Aspect. Perform a
second interpretation loop, calling Check_Function_For_Indexing_Aspect
and passing Indexing_Found for the Has_Eligible_Func parameter and True
for the Error_On_Ineligible parameter, and remove the existing call
to Error_Msg_NE that was flagging nonlocal entities (a similar error
is now reported inside procedure Check_Function_For_Indexing_Aspect).
Suppress call to Check_Inherited_Indexing in derived type cases.
(Check_Nonoverridable_Aspect_Subprograms): Remove early return when
the aspect spec does not come from source, so aspects of derived types
will also go through this procedure. Check restrictions of AI22-0159/01
for derived types and inheritance of aspects. Replace iteration over
overloaded interpretations with iteration over Aspect_Subprograms (and
only do that for indexing aspects). Condition Sloc for existing error
check for nonprimitive operations based on whether the aspect comes
from source, posting the error on the entity rather than the aspect
if the aspect is not given explicitly.
(Analyze_Aspects_At_Freeze_Point): Split off a new case alternative
for iterator aspects, and specialize treatment for indexing aspects
by forcing a search for new indexing functions. When none are found,
issue an error only in the case where the type has no inherited
indexing functions. Test that the version is at least Ada_2012 rather
than Ada_2022 for calling Check_Nonoverridable_Aspect_Subprograms.
(Check_Function_For_Indexing_Aspect): Move declaration from the package
spec to the body. Add Has_Eligible_Func and Error_On_Ineligible formals
and update spec comment.
Return early if the candidate subprogram was already inherited (present
in Aspect_Subprograms).
For a scope mismatch on Subp, report error only when Has_Eligible_Func
is False and Error_On_Ineligible is True (and never a warning).
Add "<<" in several calls to Report_Ineligible_Indexing_Function
(formerly Illegal_Indexing) to allow either warnings or errors.
Return without adding subprogram to Aspect_Subprograms when
Error_On_Ineligible is False.
(Report_Ineligible_Indexing_Function): Name changed from
Illegal_Indexing.
Return early when only a warning can be issued and the ineligible
subprogram is inherited, or if its first formal (if any) does not match
the aspect's associated type (to reduce false-positive warnings).
Set Error_Msg_Warn based on Error_On_Ineligible formal.
Report a continuation message identifying the ineligible entity.
Remove comment preceding body that has been obviated by AI22-0154.
* sem_util.adb (Inherit_Nonoverridable_Aspect): Remove the loop over
primitives that was checking and adding eligible primitives. That code
was incomplete, and collection of new indexing functions for derived
types is now handled by Check_Indexing_Functions. Also remove the
associated "???" comment.

ada: Document the gnatg switch

gcc/ada/ChangeLog:

* doc/gnat_rm.rst: update toctree
* doc/gnat_rm/about_this_guide.rst: add reference
* doc/gnat_rm/gnat_implementation_mode.rst: new file
* opt.ads: remove redundant comment
* gnat_rm.texi: Regenerate.

ada: Minor typo fix in documentation

gcc/ada/ChangeLog:

* doc/gnat_rm/gnat_language_extensions.rst (Destructors): fix
typo.
* gnat_rm.texi: Regenerate.

ada: Complete previous light runtime configuration fix

A recent fix for light runtime configuration was missing a crucial part:
it left an #include directive that needed to be removed. This patch
completes that fix.

gcc/ada/ChangeLog:

* argv.c: Remove unused include directive.

ada: Create a boolean version of Warnings_Suppressed

Add a Boolean overload of Warnings_Suppressed that wraps the existing
String_Id version, simplifying call sites that only need to know whether
warnings are suppressed at a location rather than the suppression reason.

gcc/ada/ChangeLog:

* erroutc.ads (Warnings_Suppressed): New Boolean overload.
* errout.adb (Error_Msg_Internal): Use Boolean Warnings_Suppressed.
* errutil.adb (Error_Msg): Likewise.

ada: Improve error message insertion methods

Extract the error chain insertion logic into dedicated subprograms.
Insert_Error_Msg adds a new message into the chain and adds the next and
previous pointers, making the deferred Set_Prev_Pointers pass in Finalize
redundant. Find_Msg_Insertion_Point and Is_Before extract the existing
logic for finding the insertion point in Error_Msg_Internal.
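
The idea of linking both directions at insertion time can be sketched in C as
follows.  This is only an illustration under assumptions: struct msg,
is_before, find_insertion_point, insert_msg and assert_links_consistent are
hypothetical stand-ins, not the Ada subprograms in errout.adb and erroutc.adb,
and ordering by a single line number is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message record; a real error message carries far more state. */
struct msg {
    int line;              /* stand-in for the source location used for ordering */
    struct msg *next;
    struct msg *prev;
};

/* Nonzero if A must appear before B in the chain. */
static int is_before(const struct msg *a, const struct msg *b)
{
    return a->line < b->line;
}

/* Return the node after which NEW_MSG belongs, or NULL if it goes first. */
static struct msg *find_insertion_point(struct msg *first, const struct msg *new_msg)
{
    struct msg *prev = NULL;
    for (struct msg *cur = first; cur != NULL && !is_before(new_msg, cur); cur = cur->next)
        prev = cur;
    return prev;
}

/* Link NEW_MSG after PREV, setting next and prev pointers in one step.
   Returns the head of the chain, which changes when PREV is NULL. */
static struct msg *insert_msg(struct msg *first, struct msg *prev, struct msg *new_msg)
{
    struct msg *next = (prev != NULL) ? prev->next : first;

    new_msg->prev = prev;
    new_msg->next = next;
    if (next != NULL)
        next->prev = new_msg;
    if (prev != NULL) {
        prev->next = new_msg;
        return first;
    }
    return new_msg;
}

/* Walk the chain and assert that every back link mirrors its forward link. */
static void assert_links_consistent(const struct msg *first)
{
    assert(first == NULL || first->prev == NULL);
    for (const struct msg *m = first; m != NULL; m = m->next)
        assert(m->next == NULL || m->next->prev == m);
}
```

Because insert_msg keeps the back links valid after every insertion, a separate
pass that recomputes them at the end has nothing left to do in this sketch.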

gcc/ada/ChangeLog:

* errout.adb (Is_Before): New helper function.
(Find_Msg_Insertion_Point): New procedure.
(Error_Msg_Internal): Use Find_Msg_Insertion_Point and Insert_Error_Msg.
(Finalize): Remove call to Set_Prev_Pointers.
(Set_Prev_Pointers): Removed.
* erroutc.adb (Insert_Error_Msg): New procedure.
* erroutc.ads (Insert_Error_Msg): New declaration.

ada: Do not set Global_Discard_Names in GNATprove_Mode

GNATprove now supports the Image attribute of enumerated types. Hence,
it is important to keep the names of the enumeration literals to be able
to properly reason about them.

gcc/ada/ChangeLog:

* gnat1drv.adb (Adjust_Global_Switches): Do not set
Global_Discard_Names in GNATprove_Mode.

ada: Fix light runtime configurations

A recent change added a dependency from the environment-related
functions in argv.c to env.c. This broke some runtime configurations that
provide command line support but no environment variable support.

This fixes the issue by moving all of argv.c's environment-related code
to env.c. It also tweaks a comment in passing.

gcc/ada/ChangeLog:

* argv.c (__gnat_env_count) (__gnat_len_env) (__gnat_fill_env): Move
to...
* env.c (__gnat_env_count) (__gnat_len_env) (__gnat_fill_env):
...here. Tweak comment.

ada: Do not disable conformance warning in GNAT_Mode

The switch -gnatw_p enables a warning during conformance
checking that is stricter than the standard Ada conformance
rules. This patch removes the test for -gnatg mode when
issuing the warning, because that is redundant -- -gnatg
already turns off -gnatw_p.

We do not want this warning enabled in GNAT sources, but there is
no need to have -gnatg involved explicitly.
The same goes for In_Internal_Unit.

gcc/ada/ChangeLog:

* sem_ch6.adb (Subprogram_Subtypes_Have_Same_Declaration):
Remove tests for In_Internal_Unit and GNAT_Mode.

ada: VAST Check_Entity_Chain

Add Check_Entity_Chain to VAST: it checks that the Next_Entity/Prev_Entity
links are consistent for entity chains.

Currently only checked for entities that are used as Scope.

Fixing existing inconsistencies is not straightforward.

Any call to Copy_And_Swap creates an incorrect chain, where the new node
has its Prev/Next/First/Last links copied from the original node, but
back links are not changed, leading to something like this for
Copy_And_Swap (Priv, Full):

  ,----,       ,----,       ,----,     ,----,
  | A  |------>| B  |------>|Priv|---->| D  |---> Empty
  |    |<------|    |<------|    |<----|    |
  '----'       '----'       '----'     '----'
                   ^                    ^
                   |        ,----,      |
                   `--------|Full|------`
                            |    |
                            '----'

And then after a while, probably after Exchange_Entities() the links are
incorrect and traversing the chain from First to Last or from Last to
First does not yield the same elements.
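
The stale back links in the diagram can be reproduced with a small C model.
This is a sketch under assumptions: struct entity, link_entities,
copy_links_only and first_inconsistency are hypothetical names, and the model
ignores the First/Last links of the enclosing scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entity node with forward and backward links. */
struct entity {
    const char *name;
    struct entity *next;
    struct entity *prev;
};

/* Link A directly before B, keeping both directions in sync. */
static void link_entities(struct entity *a, struct entity *b)
{
    a->next = b;
    b->prev = a;
}

/* Mimic the problematic copy: COPY takes ORIG's links, but ORIG's
   neighbours are left pointing back at ORIG. */
static void copy_links_only(struct entity *copy, const struct entity *orig)
{
    copy->next = orig->next;
    copy->prev = orig->prev;
}

/* Return the first entity E reachable from FIRST with Prev (Next (E)) /= E,
   or NULL when the forward walk finds no mismatch. */
static const struct entity *first_inconsistency(const struct entity *first)
{
    for (const struct entity *e = first; e != NULL; e = e->next)
        if (e->next != NULL && e->next->prev != e)
            return e;
    return NULL;
}

static void demo_entity_chain(void)
{
    struct entity a = {"A", NULL, NULL}, b = {"B", NULL, NULL};
    struct entity priv = {"Priv", NULL, NULL}, d = {"D", NULL, NULL};
    struct entity full = {"Full", NULL, NULL};

    link_entities(&a, &b);
    link_entities(&b, &priv);
    link_entities(&priv, &d);
    assert(first_inconsistency(&a) == NULL);

    copy_links_only(&full, &priv);
    /* Full points at D, but D still points back at Priv. */
    assert(first_inconsistency(&full) == &full);
}
```

Walking from A still looks consistent in this model; the mismatch only shows
up once a traversal goes through the copied node.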

gcc/ada/ChangeLog:

* vast.adb (Check_Enum)<Check_Entity_Chain>: Add.
(Status)<Check_Entity_Chain>: Set to Print_And_Continue.
(Check_Entity_Chain): New.
(Check_Scope): Call Check_Entity_Chain.

ada: Add Delete_Error_And_Continuation_Msgs and refactor duplicate code in errout and errutil

Packages errout and errutil were sharing a lot of code. Extract all of
the common functionality to erroutc.
Extract Delete_Specifically_Suppressed_Warnings and Set_Prev_Pointers.

gcc/ada/ChangeLog:

* errout.adb (Delete_Warning_And_Continuations): use
Delete_Error_And_Continuation_Msgs.
(Output_Messages): Call new refactored subprograms.
(Delete_Specifically_Suppressed_Warnings): New
procedure.
(Set_Prev_Pointers): New procedure.
(Finalize): use Delete_Specifically_Suppressed_Warnings and
Set_Prev_Pointers.
(Finalize): use Delete_Error_And_Continuation_Msgs.
* erroutc.adb (Delete_Error_And_Continuation_Msgs): New procedure.
(Remove_Duplicate_Errors): New function.
(Write_All_Errors_In_Brief_Format): New function.
(Write_All_Errors_In_Verbose_Format): New function.
(Write_Error_Summary): New function.
* erroutc.ads (Delete_Error_And_Continuation_Msgs): Likewise.
(Remove_Duplicate_Errors): Likewise.
(Write_All_Errors_In_Brief_Format): Likewise.
(Write_All_Errors_In_Verbose_Format): Likewise.
(Write_Error_Summary): Likewise.
* errutil.adb (Finalize): Call new refactored subprograms.

ada: Add Filter_And_Delete_Errors

gcc/ada/ChangeLog:

* errout.adb (Remove_Warning_Messages): Use
Filter_And_Delete_Errors.
* errout.ads (Purge_Messages): Renamed to
Delete_Error_Msgs_In_Range.
* erroutc.adb (Filter_And_Delete_Errors): New procedure.
(Purge_Messages): Renamed to Delete_Error_Msgs_In_Range.
* erroutc.ads (Filter_And_Delete_Errors): New procedure.
(Purge_Messages): Renamed to Delete_Error_Msgs_In_Range.
* par-ch5.adb (Missing_Begin): call Delete_Error_Msgs_In_Range.

ada: Simplify Warning_Specifically_Suppressed calls.

In most places we only care about whether the warning was suppressed or
not and we never care what the exact reason was. Add a new subprogram
Warning_Is_Suppressed for that purpose.

gcc/ada/ChangeLog:

* errout.adb (Finalize): use Warning_Is_Suppressed.
* erroutc.adb (Warning_Is_Suppressed): New subprogram.
* erroutc.ads (Warning_Is_Suppressed): Likewise.

ada: Refactor error message deletion

Extract the common code from multiple places where we deleted
messages into one common subprogram.

gcc/ada/ChangeLog:

* errout.adb: Use Delete_Error_Msg.
* erroutc.adb (Delete_Error_Msg): New subprogram.
* erroutc.ads (Delete_Error_Msg): Likewise.

ada: Fix crash on qualified bounds during unnesting

The problem is that the Activation_Record_Component field is accessed for
an E_Package entity, which does not contain any.

gcc/ada/ChangeLog:

* exp_unst.adb (Note_Uplevel_Bound_Trav): Do not register an uplevel
reference for a package. Use a single if_statement in the body.

ada: Remove obsolete trick in Analyze_Function_Return

The compiler no longer creates a temporary for controlled aggregate returns.

gcc/ada/ChangeLog:

* sem_ch6.adb (Analyze_Function_Return): Remove obsolete code that
wraps the return in a block when the expression is an aggregate.

ada: Create a function for checking Suppressed loop warnings

gcc/ada/ChangeLog:

* errout.adb (Error_Msg): Add new function
In_Loop_With_Suppressed_Warnings.

ada: Simplify implementation of instantiation messages

Remove duplication and extra variables and simplify control flow.

gcc/ada/ChangeLog:

* errout.adb (Error_Msg_N): Simplify code.

ada: Improve dmsg

Add missing attributes to dmsg. Additionally add support for
printing locations and fixes.

gcc/ada/ChangeLog:

* erroutc-pretty_emitter.adb (To_String): Relocated to erroutc.
(To_File_Name): Likewise.
(Line_To_String): Likewise.
(Column_To_String): Likewise.
* erroutc.adb (dedit): New function for debugging edits.
(dfix): New function for debugging fixes.
(dloc): New function for debugging locations.
(dmsg): Print missing Error_Msg_Object attributes.
(To_String): New function for printing spans
(To_String): Relocated from erroutc-pretty_emitter.adb
(To_File_Name): Likewise.
* erroutc.ads: Likewise.

ada: Stop using gnat_envp

First, a bit of context: Ada has only had support for manipulating
environment variables in the standard library since Ada 2005 and the
introduction of Ada.Environment_Variables.

Prior to that, GNAT had introduced the implementation-specific
Ada.Command_Line.Environment, which still exists today. Until now,
Ada.Command_Line.Environment used a global variable, gnat_envp, which
must be initialized with envp, the optional third parameter to main in C.
When the main was in Ada, the binder generated the appropriate assignment.
The rest of the time, it was the responsibility of the user to write this
assignment. Failure to do so would cause null pointer dereferences when
using Ada.Command_Line.Environment. Although documented in the spec of
Ada.Command_Line, this was rather easy to miss.

Worse, the assignment caused linking failures in the rather common case
of a C GPR project with'ing an Ada GPR project and linking dynamically.

Also, Ada.Command_Line.Environment was inconsistent across platforms with
regard to how it was affected by calls to putenv.

When we added support for the standard Ada.Environment_Variables, the
gnat_envp machinery wasn't reused. Instead, another mechanism based on
the Unix global variable environ (and its close equivalents on other
platforms) was introduced.

What this patch does is switch Ada.Command_Line.Environment over to this
new environ-based mechanism. All uses of gnat_envp are removed, but the
definition itself is kept for backwards compatibility.
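
For readers unfamiliar with the environ-based approach, the following C sketch
shows how the count, length and contents of environment entries can be read
from a NULL-terminated array such as the POSIX global environ.  The helper
names env_count, env_len and env_fill are hypothetical and are not the
__gnat_* functions touched by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* POSIX global holding the process environment as a NULL-terminated array. */
extern char **environ;

/* Count the entries of an envp-style array; a NULL array counts as empty. */
static size_t env_count(char *const *env)
{
    size_t n = 0;
    if (env != NULL)
        while (env[n] != NULL)
            n++;
    return n;
}

/* Length of entry I ("NAME=value"), not counting the terminating NUL. */
static size_t env_len(char *const *env, size_t i)
{
    return strlen(env[i]);
}

/* Copy entry I into BUF, which must hold at least env_len (env, i) bytes.
   No NUL is appended, since the caller already knows the length. */
static void env_fill(char *const *env, size_t i, char *buf)
{
    memcpy(buf, env[i], strlen(env[i]));
}

static void demo_env(void)
{
    char first[] = "HOME=/tmp", second[] = "LANG=C";
    char *fake_env[] = { first, second, NULL };
    char buf[16];

    assert(env_count(fake_env) == 2);
    assert(env_len(fake_env, 1) == 6);
    env_fill(fake_env, 1, buf);
    assert(memcmp(buf, "LANG=C", 6) == 0);
}
```

Passing environ instead of fake_env reads the live process environment, with
no need for a variable that the main program has to initialize from envp.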

gcc/ada/ChangeLog:

* argv-lynxos178-raven-cert.c: Update comments.
* argv.c (gnat_envp): Add comment about it being unused.
(__gnat_env_count, __gnat_len_env, __gnat_fill_env): Use
__gnat_environ instead of gnat_envp.
* bindgen.adb (Command_Line_Used): Update comment.
(Gen_Main): Remove gnat_envp assignment generation. Remove generated
envp parameter.
(Gen_Output_File_Ada): Remove generated envp parameter.
* env.h: Make usable as C++.
* libgnat/a-colien.ads: Remove comment.
* libgnat/a-comlin.ads: Update comment.
* targparm.ads: Update comment.

ada: Require compilation unit to have no indentation

We had a style check requiring a compilation unit to start at a column number
that is a multiple of the indentation value. Now we require compilation units
to have no indentation.

gcc/ada/ChangeLog:

* par-ch10.adb (P_Compilation_Unit): Require no indentation.

ada: Fix compiler crash on primitive completed by expression function

This further restricts the special bypass for the freezing of the profile
in Analyze_Subprogram_Body_Helper to the case of wrapper functions.

gcc/ada/ChangeLog:

PR ada/93702
* exp_ch3.adb (Make_Controlling_Function_Wrappers): Do not set the
Was_Expression_Function flag on the body.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Avoid freezing the
profile only for wrapper functions.

ada: Pretty-print filter of loop parameter specification

The filter was only pretty-printed for iterator specifications, but it can also
appear in loop parameter specifications. This only affects debug output.

gcc/ada/ChangeLog:

* sprint.adb (Sprint_Node_Actual): Print filter in loop parameter
specification.

ada: Suppress warning for quantified expression with filters

If a quantified expression has a filter, it becomes less clear whether we should
warn about the quantified variable not being used.

gcc/ada/ChangeLog:

* sem_ch4.adb (Analyze_Quantified_Expression): If there is a filter,
then suppress the warning.

ada: Suppress warning about unused variable in trivial quantification

When the condition of a quantified expression is written as True or False, then
the user has likely done this on purpose and there is no need for a warning.

gcc/ada/ChangeLog:

* sem_ch4.adb (Analyze_Quantified_Expression): Suppress warning for
trivial conditions.

ada: Unset Comes_From_Source in inlined static functions

Unset Comes_From_Source in the inlined expression in
order to avoid spurious resolution errors.

gcc/ada/ChangeLog:

* inline.adb (Adjust_Node): Renamed from Adjust_Sloc and
additionally unset Comes_From_Source.

ada: Missing overflow check on Integer_128 under GNATProve mode

Under GNATProve mode the frontend does not generate overflow
checks on type conversions of Universal Integer numbers to
128-bit integer type numbers.

gcc/ada/ChangeLog:

* checks.adb (Apply_Scalar_Range_Check): When the type of the expression
is Universal Integer we cannot statically determine if the expression
is in the range of the target type.
* sem_eval.adb (In_Subrange_Of): Do not consider T2 in the range of
Universal Integer (since theoretically they are not).
(Test_In_Range): Do not consider Universal type expressions in range
of subtype Typ.

ada: Add (r)pech debug routines for entity chains and simple check

(r)pech (Print Entity Chain - Header) can be used to dump the entity
chains with one node header per line:

- N_Defining_Identifier "system__use_ada_main_program_name" (Entity_Id=2804) (source)
- N_Defining_Identifier "system__zcx_by_default" (Entity_Id=2808) (source)
- N_Defining_Identifier "system__standard_library" (Entity_Id=108628) (source)
- N_Defining_Identifier "system__exception_table" (Entity_Id=109523) (source)

Also add a simple consistency check to all routines that dump the
entity chain: if Prev (Next (E)) /= E (or Next (Prev (E)) /= E in the
reverse order), an extra line is printed:

- N_Defining_Identifier "system__tick" (Entity_Id=2550) (source)
!! - Prev (Next (^^^^)) = N_Defining_Identifier "system__default_priority" (Entity_Id=2700) (source)
- N_Defining_Identifier "system__address" (Entity_Id=2553) (source)

This example shows that the next links have 2550->2553, but the previous
links have 2700 <- 2553.

gcc/ada/ChangeLog:

* treepr.ads (pech, rpech): New.
(Print_Entity_Chain): Adjust signature and comment to handle
printing only header and doing the simple check.
* treepr.adb (pech, rpech): New.
(Print_Entity_Chain): Support for printing only headers and doing
simple check.

ada: Enable resolution of overloading on Last and Previous for Iterable

The resolution of overloading for the optional Last and Previous primitives
of an Iterable aspect should be done like for other primitives.

gcc/ada/ChangeLog:

* sem_ch13.adb (Resolve_Iterable_Operation): Handle Previous and Last
like Next and First.

ada: Add volatile abstract state to creation functions in Interfaces.C.Strings

The additional volatile abstract state is necessary to model the value of the
new pointer.

gcc/ada/ChangeLog:

* libgnat/i-cstrin.ads: New C_Addresses volatile state to use as
input of the New_String and New_Char_Array.

ada: Fix assertion failure on call in object notation in entry barrier

The problem is that the Original_Record_Component field is accessed without
checking that the selector is a component or discriminant.

gcc/ada/ChangeLog:

* sem_util.adb (Statically_Names_Object) <N_Selected_Component>:
Return False if the selector is neither component nor discriminant.

ada: Inheritance of pragma/aspect unchecked_union

Derived types inherit pragma/aspect Unchecked_Union.

gcc/ada/ChangeLog:

* sem_ch3.adb (Build_Derived_Record_Type): Record type derivations
inherit Is_Unchecked_Union and Has_Unchecked_Union flags.
(Inherit_Component): Add discriminals to the associations list.
* exp_ch3.adb (Build_Record_Init_Proc): Derivations of Unchecked_Union
types don't need an initialization procedure; they reuse the init proc
of their parent type.

ada: Distribute declaration of return object into conditional expressions

This lifts one of the limitations of the distribution of a declaration of
an object into the dependent expressions of its initialization expression
when it is a conditional expression, namely the case of the return object
of an extended return statement.

gcc/ada/ChangeLog:

* exp_ch4.adb (Expand_N_Case_Expression): Deal with initialization
expression of return object.
(Expand_N_If_Expression): Likewise.
(Insert_Conditional_Object_Declaration): Likewise.
* exp_util.adb (Is_Distributable_Declaration): Lift limitation for
return objects, including those with a class-wide type.
* sem_ch3.adb (Analyze_Object_Declaration): Set Return_Applies_To
on artificial return objects created from within a transient scope.
Remove test on Expander_Active for better error recovery.

ada: Fix reStructuredText markup

gcc/ada/ChangeLog:

* doc/gnat_ugn/building_executable_programs_with_gnat.rst: Fix
markup.
* gnat_ugn.texi: Regenerate.

ada: Suppress warning about unused quantified variables with junk names

For quantified expressions like "for all Dummy in ... => True" we don't want
to warn about unused variable when it has a junk name.

gcc/ada/ChangeLog:

* sem_ch4.adb (Analyze_Quantified_Expression): Suppress warning for
variables with junk names.

ada: Refactor Inline_Static_Function_Call

gcc/ada/ChangeLog:

* inline.adb (Inline_Static_Function_Call): Reduce source code nesting.
* inline.ads (Inline_Static_Function_Call): Likewise.

ada: Calculate the sloc adjustment for inlined static functions

First (and last) node calculation is done by traversing the original
nodes of the given node. This is fine for expanding existing code.
However when inlining static functions this can lead to a node that is
in a completely different location (e.g. the spec) being considered the
first node in the location of the inlined call. This means that in this
type of scenario resetting the slocs is not enough.

The correct approach to use here would be to calculate the Adjustment
in the Source File Index between the function and the inlined call. This
approach is also used in inlining regular subprograms.

Once there is an entry in the Source File Index for the inlined call the
error message mechanism will both highlight the call and the expression
function if an error is present in the inlined call.

gcc/ada/ChangeLog:

* inline.adb (Inline_Static_Function_Call): Add a Source File Index
entry for the call and apply the necessary sloc adjustment values
for all of the inlined nodes.

ada: Make __gnat_copy_attribs non-blocking on windows

gcc/ada/ChangeLog:

* adaint.c (__gnat_copy_attribs): use GetFileAttributesEx to
fetch attributes.

c++: Fix build_value_init_noctor anon aggr handling

As I've mentioned on Saturday, the CWG3130 patchset fails to bootstrap.
The problem is that we try to call build_value_init on anonymous unions
or structs, which doesn't work well when they don't have a default
constructor.
Now, if some non-trivial construction is needed, the type will already have
either a user-provided constructor or at least a non-trivial one, and in that
case build_value_init_noctor isn't called at all.  So, this patch
just zero-initializes the anonymous aggregate members.

2026-05-29  Jakub Jelinek  <jakub@redhat.com>

* init.cc (build_value_init_noctor): Zero initialize anonymous
union/struct subobjects.  Formatting fix.

libgfortran: Use MapViewOfFileEx instead of MapViewOfFileExNuma in caf_shmem

MapViewOfFileExNuma is only present when _WIN32_WINNT >= 0x0600 (Windows Vista
or later). The code is passing NUMA_NO_PREFERRED_MODE, and that
is documented as:

No NUMA node is preferred. This is the same as calling the MapViewOfFileEx
function.

https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-mapviewoffileexnuma

So, MapViewOfFileEx will behave identically, while still allowing Windows XP
support.

libgfortran/ChangeLog:

* caf/shmem/shared_memory.c (shared_memory_init): Use
MapViewOfFileEx instead of MapViewOfFileExNuma.

bb-slp-complex-mla-half-float.c: Add the missing end brace

commit 44a31df54837adf2f7815e7966dfe8ac32eb8f3b
Author: Artemiy Volkov <artemiy.volkov@arm.com>
Date:   Mon May 18 10:21:18 2026 +0000

    aarch64: introduce partial AdvSIMD vector modes

changed gcc.dg/vect/complex/bb-slp-complex-mla-half-float.c to:

-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1"  { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" { xfail arm*-*-* } } */

The end brace was missing.  Add the missing end brace to fix it.

PR testsuite/125489
* gcc.dg/vect/complex/bb-slp-complex-mla-half-float.c: Add the
missing end brace.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Daily bump.

Fortran: f_c_string intrinsic improvements

The existing implementation of f_c_string is quite inefficient, doing
either 2 or 3 allocations and copies of the input string prefix. This
rewrite adds folding for constant string arguments and handles other
cases with a single allocation and copy.
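
The runtime effect of the single-allocation path can be sketched in C.  This
assumes the Fortran 2023 semantics, where trailing blanks are trimmed unless
ASIS is true and a NUL character is appended; f_c_string_sketch is a
hypothetical helper, not the code that trans-intrinsic.cc generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Build a NUL-terminated copy of the LEN-character Fortran string STR with
   one allocation and one copy.  Trailing blanks are dropped unless ASIS is
   nonzero.  Returns NULL on allocation failure; the caller frees the result. */
static char *f_c_string_sketch(const char *str, size_t len, int asis)
{
    if (!asis)
        while (len > 0 && str[len - 1] == ' ')
            len--;

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, str, len);
    out[len] = '\0';
    return out;
}

static void demo_f_c_string(void)
{
    char *trimmed = f_c_string_sketch("abc   ", 6, 0);
    char *as_is = f_c_string_sketch("abc   ", 6, 1);

    assert(trimmed != NULL && as_is != NULL);
    assert(strcmp(trimmed, "abc") == 0);
    assert(strcmp(as_is, "abc   ") == 0);
    free(trimmed);
    free(as_is);
}
```

Computing the trimmed length before allocating is what makes a second
allocation or an intermediate trimmed copy unnecessary in this sketch.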

This patch also adds the missing documentation for this intrinsic to the
gfortran manual.

gcc/fortran/ChangeLog
* intrinsic.texi (F_C_STRING): New section.
* trans-intrinsic.cc (conv_trim): Delete.
(conv_isocbinding_function): Rewrite the F_C_STRING case.

gcc/testsuite/ChangeLog
* gfortran.dg/f_c_string3.f90: New.
* gfortran.dg/f_c_string4.f90: New.
* gfortran.dg/f_c_string5.f90: New.

Fortran: Add c_f_strpointer intrinsic

This is a missing Fortran 2023 feature.

gcc/fortran/ChangeLog
* check.cc (gfc_check_c_f_strpointer): New.
* f95-lang.cc (gfc_init_builtin_functions): Add BUILT_IN_STRNLEN.
* gfortran.h (enum gfc_isym_id): Add GFC_ISYM_C_F_STRPOINTER.
* gfortran.texi (Interoperable Subroutines and Functions): Mention
f_c_string and c_f_strpointer.
* intrinsic.cc (add_subroutines): Add c_f_strpointer. Fix nearby
whitespace errors.
(sort_actual): Handle first argument to c_f_strpointer specially.
* intrinsic.h (gfc_check_c_f_strpointer): Declare.
* intrinsic.texi (C_F_STRPOINTER): New section. Add entry to menu
and cross-references from similar functions.
* iso-c-binding.def: Add c_f_strpointer.
* trans-intrinsic.cc (conv_isocbinding_subroutine_strpointer): New.
(gfc_conv_intrinsic_subroutine): Call it.

gcc/testsuite/ChangeLog
* gfortran.dg/c_f_strpointer-1.f90: New.
* gfortran.dg/c_f_strpointer-2.f90: New.
* gfortran.dg/c_f_strpointer-3.f90: New.
* gfortran.dg/c_f_strpointer-4.f90: New.
* gfortran.dg/c_f_strpointer-5.f90: New.
* gfortran.dg/c_f_strpointer-6.f90: New.
* gfortran.dg/c_f_strpointer-7.f90: New.
* gfortran.dg/c_f_strpointer-8.f90: New.
* gfortran.dg/c_f_strpointer-9.f90: New.
* gfortran.dg/c_f_strpointer-10.f90: New.
* gfortran.dg/pr108961.f90: Rename locally-defined c_f_strpointer.

Co-authored-by: Tobias Burnus <tburnus@baylibre.com>

libcody: allow non-ASCII module names [PR120458]

Before this commit, attempting to use non-ASCII characters in quoted
words failed, even though the protocol allows the usage of such
characters in quoted words. To fix this:

1. Remove `c >= 0x7f` comparison when parsing a quoted word.
2. Use `unsigned char` instead of `char` such that `c < 0x20` fails for
non-ASCII characters.
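
Why the second step matters can be seen in a short C sketch.  The functions
is_control_signed and is_control_unsigned are hypothetical and only model the
`c < 0x20` comparison; they are not the S2C code in libcody.  The signed
variant stands in for targets where plain `char` is signed.

```c
#include <assert.h>

/* Control-character test on a signed byte: bytes of 0x80 and above are
   negative here, so they also compare below 0x20. */
static int is_control_signed(signed char c)
{
    return c < 0x20;
}

/* The same test on an unsigned byte: only 0x00..0x1f compare below 0x20. */
static int is_control_unsigned(unsigned char c)
{
    return c < 0x20;
}

static void demo_quoted_byte(void)
{
    /* 0xC3 is the first byte of the UTF-8 encoding of U+00CE. */
    unsigned char byte = 0xC3;
    /* The conversion yields -61 on two's-complement targets. */
    signed char as_signed = (signed char) byte;

    assert(is_control_signed(as_signed));    /* wrongly treated as control */
    assert(!is_control_unsigned(byte));      /* accepted as ordinary byte */
    assert(is_control_unsigned('\n'));       /* real control chars still caught */
    assert(!is_control_unsigned('a'));
}
```

With a signed type, dropping the `c >= 0x7f` comparison alone would not be
enough, because every non-ASCII byte would still satisfy `c < 0x20`.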

PR c++/120458

libcody/ChangeLog:

* buffer.cc (S2C): Allow non-ASCII chars in quoted words.
* cody.hh: Use unsigned char for S2C().

gcc/testsuite/ChangeLog:

* g++.dg/README: Explain purpose of modules/ dir.
* g++.dg/modules/pr120458-1_a.C: Define non-ASCII module with
default mapper.
* g++.dg/modules/pr120458-1_b.C: Import non-ASCII module with
default mapper.
* g++.dg/modules/pr120458-2_a.C: Define non-ASCII module with
a file as mapper.
* g++.dg/modules/pr120458-2_b.C: Import non-ASCII module with
a file as mapper.
* g++.dg/modules/pr120458-2.map: Define mapping for pr120458-2
test case.

Signed-off-by: Jean-Christian CÎRSTEA <jean-christian.cirstea@tuta.com>

vect-early-break-no-epilog_11.c: Require avx512f_runtime

Require avx512f_runtime instead of avx512f_hw to fix

ERROR: gcc.dg/vect/vect-early-break-no-epilog_11.c -flto -ffat-lto-objects: unknown effective target keyword `avx512f_hw' for " dg-require-effective-target 7 avx512f_hw { target i?86-*-* x86_64-*-* } "

* gcc.dg/vect/vect-early-break-no-epilog_11.c: Require
avx512f_runtime instead of avx512f_hw.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

tree-ssa: Loop store motion micro-optimizations.

ref_always_accessed_p is (currently) only ever called with stored_p being
true, so specializing for this case, renaming ref_always_accessed{,_p} to
ref_always_stored{,_p} saves storage and some redundant checks at run-time.

2026-05-28 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
* tree-ssa-loop-im.cc (ref_always_accessed_p): Rename to...
(ref_always_stored_p): New function specialized to determine if
REF is a store that is always executed in LOOP.
(execute_sm): Use ref_always_stored_p instead of
ref_always_accessed_p.
(class ref_always_accessed): Rename to...
(class ref_always_stored): Remove (always true) stored_p field.
(ref_always_stored::operator ()): Always check for a store.
Move hash table lookup, get_lim_data, after store test.
(can_sm_ref_p): Use ref_always_stored_p instead of
ref_always_accessed_p.

x86_64 SSE: Tweak/correct STV cost of 128-bit rotate by constant.

This one line change resolves the failure of gcc.target/i386/rotate-2.c
when compiled with -march=cascadelake triggered by recent STV improvements.
https://gcc.gnu.org/pipermail/gcc-patches/2026-May/716996.html

The decision of whether to perform STV is finely balanced, and affected
by the microarchitecture's timings/costs, but in this case the underlying
issue appears to be the parameterized cost for performing a 128-bit
rotation by a constant in SSE registers.  Depending upon the number
of bits to rotate by, SSE requires either 1 or 2 shuffles, followed
by a left shift, a right shift and an any_or_plus to combine the result.
This is therefore 4 or 5 instructions, but currently returns
COSTS_N_INSNS(1) instead of COSTS_N_INSNS(4) [probably a typo].

As an aside, it might be more useful for this gain to be based on latency;
as both the shuffles and the shifts can each be performed in parallel,
so a reasonable vcost may therefore be COSTS_N_INSNS(3), but such fine
tuning might require microbenchmarking.  I mention it here just in case
using COSTS_N_INSNS(4) is bisected as a performance regression.

2026-05-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc (compute_convert_gain): Tweak
the cost of a 128-bit rotation to be 4 (or 5) instructions.

x86_64 SSE: Handle SUBREG conversions in TImode STV (for ptest).

This patch teaches i386's STV pass how to handle SUBREG conversions,
i.e. that a TImode SUBREG can be transformed into a V1TImode SUBREG,
without worrying about other DEFs and USEs.

One example where this is useful is

typedef long long __m128i __attribute__ ((__vector_size__ (16)));
int foo (__m128i x, __m128i y) {
  return (__int128)x == (__int128)y;
}

where with -O2 -msse4 we can now scalar-to-vector transform:

(insn 7 4 8 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (subreg:TI (reg/v:V2DI 86 [ x ]) 0)
            (subreg:TI (reg/v:V2DI 87 [ y ]) 0))) {*cmpti_doubleword}

into

(insn 17 4 7 2 (set (reg:V1TI 91)
        (xor:V1TI (subreg:V1TI (reg/v:V2DI 86 [ x ]) 0)
            (subreg:V1TI (reg/v:V2DI 87 [ y ]) 0)))
     (nil))
(insn 7 17 8 2 (set (reg:CCZ 17 flags)
        (unspec:CCZ [
                (reg:V1TI 91) repeated x2
            ] UNSPEC_PTEST)) {*sse4_1_ptestv1ti}
     (expr_list:REG_DEAD (reg/v:V2DI 87 [ y ])
        (expr_list:REG_DEAD (reg/v:V2DI 86 [ x ])
            (nil))))

with the dramatic effect that the assembly output before:

foo: movaps  %xmm0, -40(%rsp)
        movq    -32(%rsp), %rdx
        movq    %xmm0, %rax
        movq    %xmm1, %rsi
        movaps  %xmm1, -24(%rsp)
        movq    -16(%rsp), %rcx
        xorq    %rsi, %rax
        xorq    %rcx, %rdx
        orq     %rdx, %rax
        sete    %al
        movzbl  %al, %eax
        ret

now becomes

foo: pxor    %xmm1, %xmm0
        xorl    %eax, %eax
        ptest   %xmm0, %xmm0
        sete    %al
        ret

i.e. a 128-bit vector doesn't need to be transferred to the
scalar unit to be tested for equality.  The new test case includes
additional related examples that show similar improvements.

Previously we explicitly checked *cmpti_doubleword operands to be
either immediate constants, or a TImode REG or a TImode MEM.  By
enhancing this to allow a TImode SUBREG, we now handle everything
that would match the general_operand predicate, making this part
of STV more like other RTL passes (lra/reload).  The big change is
that unlike a regular DF USE, a SUBREG USE doesn't require us to
analyze and convert the rest of the DEF-USE chain.

2026-05-28  Roger Sayle  <roger@nextmovesoftware.com>
    Hongtao Liu  <hongtao.liu@intel.com>

gcc/ChangeLog
* config/i386/i386-features.cc (scalar_chain::add_insn): Don't
call analyze_register_chain if the USE is a SUBREG.
(scalar_chain::convert_op): Call gen_lowpart to convert
scalar (TImode) SUBREGs to vector (V1TImode) SUBREGs.
(convertible_comparison_p): We can now handle all general_operands
of *cmp<dwi>_doubleword.
(timode_remove_non_convertible_regs): We only need to check TImode
uses that aren't TImode SUBREGs of registers in other modes.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-ptest-7.c: New test case.

x86 SSE: Improve vector increment/decrement on x86.

This patch improves the code generated by the i386 backend for incrementing
(adding one to) and decrementing (subtracting one from) a vector.  With SSE
materializing the vector -1 is more efficient than materializing the
vector +1, hence x + 1 (increment) is better expressed as x - (-1), and
x - 1 (decrement) is better expressed as x + (-1).  Conveniently the
relevant additions and subtractions are specified as a single pattern,
using a plusminus iterator, in the machine description.
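
The arithmetic identity behind this rewrite can be checked per element in
scalar C.  This is only a sketch of the modular arithmetic, with hypothetical
helpers inc_via_sub and dec_via_add; it says nothing about the machine
description patterns themselves.  The constant 0xFF is the all-ones byte, i.e.
-1 in two's complement.

```c
#include <assert.h>
#include <stdint.h>

/* x + 1 computed as x - (-1), with -1 represented as the byte 0xFF. */
static uint8_t inc_via_sub(uint8_t x)
{
    return (uint8_t) (x - (uint8_t) 0xFF);
}

/* x - 1 computed as x + (-1), with -1 represented as the byte 0xFF. */
static uint8_t dec_via_add(uint8_t x)
{
    return (uint8_t) (x + (uint8_t) 0xFF);
}

/* Check the identity for every 8-bit value, including the wrap-around cases. */
static void demo_inc_dec(void)
{
    for (int v = 0; v < 256; v++) {
        uint8_t x = (uint8_t) v;
        assert(inc_via_sub(x) == (uint8_t) (x + 1));
        assert(dec_via_add(x) == (uint8_t) (x - 1));
    }
}
```

Since the results agree modulo 256 for every input, signedness of the element
type does not affect which instruction can be used.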

For the four example functions:

typedef char v16sqi __attribute__ ((vector_size(16)));
typedef unsigned char v16uqi __attribute__ ((vector_size(16)));

v16sqi sadd1(v16sqi x) { return x+1; }
v16uqi uadd1(v16uqi x) { return x+1; }
v16sqi saddm1(v16sqi x) { return x-1; }
v16uqi uaddm1(v16uqi x) { return x-1; }

GCC with -O2 -mavx2 previously generated:

sadd1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpabsb  %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

uadd1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpabsb  %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

saddm1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpabsb  %xmm1, %xmm1
        vpsubb  %xmm1, %xmm0, %xmm0
        ret

uaddm1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

With this patch, we now consistently generate:

sadd1:  vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpsubb  %xmm1, %xmm0, %xmm0
        ret

uadd1:  vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpsubb  %xmm1, %xmm0, %xmm0
        ret

saddm1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

uaddm1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

2026-05-28  Roger Sayle  <roger@nextmovesoftware.com>
    Hongtao Liu  <hongtao.liu@intel.com>
    Uros Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/i386.md (inv_insn): New define_code_attr.
* config/i386/sse.md (<plusminus><mode>3): Accept a CONST_VECTOR
as the second operand.  If the second operand is CONST1_RTX,
canonicalize to use CONSTM1_RTX instead.
(*add<mode>3_one): New define_insn_and_split to convert padd +1
to psub -1.
(*sub<mode>3_one): Likewise, a new define_insn_and_split to
convert psub +1 to padd -1.

gcc/testsuite/ChangeLog
* gcc.target/i386/avx512f-simd-1.c: Tweak test case.
* gcc.target/i386/sse2-paddb-2.c: New test case.
* gcc.target/i386/sse2-paddd-2.c: Likewise.
* gcc.target/i386/sse2-paddw-2.c: Likewise.
* gcc.target/i386/sse2-psubb-2.c: Likewise.
* gcc.target/i386/sse2-psubd-2.c: Likewise.
* gcc.target/i386/sse2-psubw-2.c: Likewise.

c++: fix infinite looping with arr[arr] [PR125454]

Here r16-3466 moved the canonicalization step that transforms
idx[array] to array[idx] to the beginning of cp_build_array_ref.
When we have array[array], we'll be swapping till we blow the stack.

Previously, we'd give the !INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P
error so there was no problem.
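
The failure mode can be illustrated with a toy model that tracks only the
kind of each operand.  This is a sketch under stated assumptions, not the
code in typeck.cc: Kind, Result and build_array_ref are hypothetical names,
and the guard shown is just one plausible way to avoid swapping when both
operands are arrays.

```cpp
enum class Kind { Array, Integer, Other };
enum class Result { Ok, NoArrayOperand, IndexNotInteger };

// Toy stand-in for the canonicalization step: idx[array] is rewritten to
// array[idx], but only when the swap makes progress.  Without the
// "base != Kind::Array" test, array[array] would swap forever.
inline Result build_array_ref(Kind base, Kind index) {
    if (index == Kind::Array && base != Kind::Array)
        return build_array_ref(index, base);  // at most one swap
    if (base != Kind::Array)
        return Result::NoArrayOperand;
    if (index != Kind::Integer)
        return Result::IndexNotInteger;       // array[array] ends up here
    return Result::Ok;
}
```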

PR c++/125454

gcc/cp/ChangeLog:

* typeck.cc (cp_build_array_ref): Don't recurse for array[array].

gcc/testsuite/ChangeLog:

* g++.dg/other/array8.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

fortran: module-contained PRIVATE procedures must have global ELF linkage [PR125430]

Assisted by: Claude Sonnet 4.6

gcc/fortran/ChangeLog:

PR fortran/125430
* trans-decl.cc (build_function_decl): Set TREE_PUBLIC for all
module-contained procedures so submodules compiled as separate
translation units can reach them via host association. Also set
DECL_VISIBILITY to VISIBILITY_HIDDEN for PRIVATE procedures,
matching the existing treatment of module variables.

gcc/testsuite/ChangeLog:

PR fortran/125430
* gfortran.dg/module_private_2.f90: Remove scan-tree-dump-times
assertion for 'priv'; PRIVATE module procedures now have global
linkage with hidden visibility and are no longer optimized away.
* gfortran.dg/public_private_module_2.f90: Add xfail markers to
scan-assembler-not for 'two' and 'six'; update comment to mention
procedures alongside variables.
* gfortran.dg/public_private_module_7.f90: Add xfail marker to
scan-assembler-not for '__m_common_attrs_MOD_other'.
* gfortran.dg/public_private_module_8.f90: Add xfail marker to
scan-assembler-not for '__m_MOD_myotherlen'.
* gfortran.dg/submodule_private_host.f90: New test.
* gfortran.dg/submodule_private_host_aux.f90: New auxiliary file.
* gfortran.dg/warn_unused_function_2.f90: Remove 'defined but not
used' expectation for s1; PRIVATE module procedures now have
global linkage and no longer trigger the unused-function warning.

[RISC-V] Fix expected testsuite output after ext-dce changes

The recent changes to ext-dce can transform sign extension to zero extension in
some cases.  As a result tests which previously expected a signed load can now
see an unsigned load.  Of course on rv32 "lw" loads a full word, so this
doesn't show up there.  So instead of looking for "lw" we instead look for
"(lwu|lw)".  This fixes the "regressions" after the ext-dce changes.

gcc/testsuite
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: Adjust expected
output.
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: Likewise.
* gcc.target/riscv/amo/a-rvwmo-store-release.c: Likewise.
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: Likewise.
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: Likewise.
* gcc.target/riscv/amo/a-ztso-store-release.c: Likewise.
* gcc.target/riscv/amo/zalasr-rvwmo-store-compat-seq-cst.c: Likewise.
* gcc.target/riscv/amo/zalasr-rvwmo-store-relaxed.c: Likewise.
* gcc.target/riscv/amo/zalasr-rvwmo-store-release.c: Likewise.
* gcc.target/riscv/amo/zalasr-ztso-store-compat-seq-cst.c: Likewise.
* gcc.target/riscv/amo/zalasr-ztso-store-relaxed.c: Likewise.
* gcc.target/riscv/amo/zalasr-ztso-store-release.c: Likewise.
* gcc.target/riscv/cpymem-64-ooo.c: Likewise.
* gcc.target/riscv/cpymem-64.c: Likewise.
* gcc.target/riscv/memcpy-nonoverlapping.c: Likewise.
* gcc.target/riscv/pr67731.c: Likewise.

c++: add fixed test [PR106957]

Fixed by r16-8015:
c++: error routines re-entered with uneval lambda [PR124397]

PR c++/106957

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-uneval32.C: New test.

libstdc++: Fix -fno-exceptions support in testsuite_allocator.h

This fixes the error

.../testsuite_allocator.h:402:13: error: exception handling disabled, use '-fexceptions' to enable
402 | catch(...)
| ^~~~~

seen when running some C++23 library tests with -fno-exceptions.
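
For readers unfamiliar with the idiom, libstdc++'s __try and __catch
macros expand to try/catch when exceptions are enabled and to plain
if (true)/if (false) otherwise.  The sketch below mimics that with
hypothetical SKETCH_* macros and a hypothetical tracked_allocate helper;
it is not the code in testsuite_allocator.h.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-ins for libstdc++'s __try/__catch macros.
#if defined(__cpp_exceptions)
#  define SKETCH_TRY try
#  define SKETCH_CATCH(X) catch (X)
#  define SKETCH_RETHROW throw
#else
#  define SKETCH_TRY if (true)
#  define SKETCH_CATCH(X) if (false)
#  define SKETCH_RETHROW
#endif

// Allocates storage and records it in live_count.  The increment stands
// in for bookkeeping that could throw in a real allocator.
inline void* tracked_allocate(std::size_t bytes, int& live_count) {
    void* p = ::operator new(bytes);
    SKETCH_TRY {
        ++live_count;
    }
    SKETCH_CATCH(...) {
        ::operator delete(p);  // undo the allocation before propagating
        SKETCH_RETHROW;
    }
    return p;
}
```

With -fno-exceptions the handler body becomes dead code under if (false),
so the same source compiles in both modes.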

libstdc++-v3/ChangeLog:

* testsuite/util/testsuite_allocator.h
(uneq_allocator::allocate): Use __try/__catch instead.
(uneq_allocator::allocate_at_least): Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Fix availability of flat_meow::operator=(initializer_list)

This assignment operator was not being brought in from the private base
class causing assignments from {...} to be inefficiently treated as
construction + move assignment.
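
The effect can be reproduced with a toy class hierarchy.  This is a
minimal sketch with hypothetical names (ToyFlatBase, WithUsing,
WithoutUsing), not the libstdc++ implementation; it only shows how a
using-declaration changes which operator= is selected for = {...}.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

// Counts how often the initializer_list constructor runs.
static int init_list_constructions = 0;

class ToyFlatBase {
public:
    ToyFlatBase() = default;
    ToyFlatBase(std::initializer_list<int> il) : items_(il) {
        ++init_list_constructions;
    }
    // Direct assignment from a braced list, no temporary container.
    void operator=(std::initializer_list<int> il) {
        items_.assign(il.begin(), il.end());
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<int> items_;
};

// Brings the base operator= overloads into scope.
class WithUsing : private ToyFlatBase {
public:
    WithUsing() = default;
    WithUsing(std::initializer_list<int> il) : ToyFlatBase(il) {}
    using ToyFlatBase::operator=;
    using ToyFlatBase::size;
};

// Only has the implicit copy/move assignment operators.
class WithoutUsing : private ToyFlatBase {
public:
    WithoutUsing() = default;
    WithoutUsing(std::initializer_list<int> il) : ToyFlatBase(il) {}
    using ToyFlatBase::size;
};
```

Assigning {1, 2, 3} to a WithUsing object calls the initializer_list
overload directly, while the same assignment to a WithoutUsing object
first constructs a temporary and then move-assigns from it.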

libstdc++-v3/ChangeLog:

* include/std/flat_map (flat_map): Bring in operator= from
_Flat_map_base.
(flat_multimap): Likewise.
* include/std/flat_set (flat_set): Bring in operator= from
_Flat_set_base.
(flat_multiset): Likewise.
* testsuite/23_containers/flat_map/1.cc (test11): Simplify by
using = {...}.
(test12): New test.
* testsuite/23_containers/flat_multimap/1.cc (test10): Simplify
by using = {...}.
(test11): New test.
* testsuite/23_containers/flat_multiset/1.cc (test10): Simplify
by using = {...}.
(test11): New test.
* testsuite/23_containers/flat_set/1.cc (test10): Simplify by
using = {...}.
(test11): New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Implement P3567R2 flat_meow fixes

This implements the changes in sections 5, 6 and 8 of P3567R2; the other
changes (in section 4 and 7) are effectively already implemented.

libstdc++-v3/ChangeLog:

* include/bits/version.def (flat_map): Bump to 202511.
(flat_set): Likewise.
* include/bits/version.h: Regenerate.
* include/std/flat_map (_Flat_map_impl): Remove
is_nothrow_swappable_v assertions.
(_Flat_map_impl::_Flat_map_impl): Explicitly default copy ctor.
Define move ctor with corrected exception handling as per
P3567R2.
(_Flat_map_impl::operator=): Likewise.
(_Flat_map_impl::insert_range): Define new __sorted_t overload
as per P3567R2.
(_Flat_map_impl::swap): Make conditionally noexcept as per
P3567R2.
* include/std/flat_set (_Flat_set_impl): Remove
is_nothrow_swappable_v assertion.
(_Flat_set_impl::_Flat_set_impl): Explicitly default copy ctor.
Define move ctor with correct invariant preserving behavior as
per P3567R2.
(_Flat_set_impl::operator=): Likewise.
(_Flat_set_impl::_M_insert_range): Factored out from
insert_range. Add bool parameter __is_sorted defaulted to
false.
(_Flat_set_impl::insert_range): Define new __sorted_t overload
as per P3567R2.
(_Flat_set_impl::swap): Make conditionally noexcept as per
P3567R2. Correct to use ranges::swap instead of ADL swap.
* testsuite/23_containers/flat_map/1.cc (test11, test12):
New tests.
* testsuite/23_containers/flat_multimap/1.cc (test10, test11):
New tests.
* testsuite/23_containers/flat_multiset/1.cc (test10, test11):
New tests.
* testsuite/23_containers/flat_set/1.cc (test10, test11):
New tests.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Fix suboptimal complexity of flat_map::_M_insert

Since r16-1742, ranges::inplace_merge is correctly C++20 iterator aware,
which allows us to implement this helper idiomatically with the optimal
complexity N + M log M instead of N log N.
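
The shape of the algorithm can be sketched on a single sorted vector of
keys.  This is a simplified illustration, not the flat_map code: it uses
std::inplace_merge on one container instead of views::zip over separate
key and value containers, and insert_sorted_unique is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Inserts `incoming` (M elements) into `sorted` (N elements, already
// sorted and unique).  Only the new tail is sorted (M log M); the two
// sorted runs are then merged, which is linear when the merge can obtain
// a temporary buffer.
inline void insert_sorted_unique(std::vector<int>& sorted,
                                 const std::vector<int>& incoming,
                                 bool is_sorted = false) {
    const std::ptrdiff_t old_size =
        static_cast<std::ptrdiff_t>(sorted.size());
    sorted.insert(sorted.end(), incoming.begin(), incoming.end());
    auto mid = sorted.begin() + old_size;
    if (!is_sorted)
        std::sort(mid, sorted.end());
    std::inplace_merge(sorted.begin(), mid, sorted.end());
    // Drop duplicate keys so the container stays unique.
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());
}
```

Passing is_sorted = true skips the sort of the tail, which corresponds to
what the __sorted_t overloads can do.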

libstdc++-v3/ChangeLog:

* include/std/flat_map (_Flat_map_impl::_M_insert): New bool
parameter __is_sorted defaulted to false. Reimplement using
views::zip and ranges::inplace_merge.
(_Flat_map_impl::insert): In the __sorted_t overload, pass
__is_sorted=true to _M_insert.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

cobol: Add assertion to suppress -Warray-bounds false positive [PR125404]

This works around a warning from std::vector code, which seems to be
assuming that the vector is empty and therefore calling back() would be
invalid:

/home/test/src/gcc/gcc/cobol/symfind.cc:526:45: error: array subscript -1 is outside array bounds of ‘long unsigned int [1152921504606846975]’ [-Werror=array-bounds=]
526 | return ancestors.back() == i01;
| ~~~~~~~~~~~~~~~~~^~~~~~

Compiling with -D_GLIBCXX_ASSERTIONS also fixes the warning.
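
The shape of the workaround is roughly the following.  This is a sketch
only: last_ancestor_is is a hypothetical helper, and it uses the standard
assert macro rather than whatever assertion the COBOL front end actually
uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stating the precondition explicitly tells both the reader and the
// optimizer that back() is never reached on an empty vector.
inline bool last_ancestor_is(const std::vector<std::size_t>& ancestors,
                             std::size_t i01) {
    assert(!ancestors.empty());
    return ancestors.back() == i01;
}
```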

gcc/cobol/ChangeLog:

PR cobol/125404
* symfind.cc (symbol_find): Add assertion that ancestors vector
is not empty.

RISC-V: Fix REGNO_REG_CLASS for FP hard registers

The GCC Internals Manual, section 19.8 "Register Classes", documents
REGNO_REG_CLASS as:

  REGNO_REG_CLASS (regno)                                      [Macro]
    A C expression whose value is a register class containing hard
    register regno.  In general there is more than one such class;
    choose a class which is minimal, meaning that no smaller class
    also contains the register.

riscv_regno_to_class[] currently maps every FP hard register to
RVC_FP_REGS, but RVC_FP_REGS only contains f8-f15.  The entries for
f0-f7 and f16-f31 therefore violate the "containing hard register
regno" half of the contract: the returned class does not contain the
register at all.

The mismatch corrupts IRA's cost model.  setup_allocno_cost_vector
indexes the per-hard-reg cost slot via REGNO_REG_CLASS:

  rclass = REGNO_REG_CLASS (hard_regno);
  num = cost_classes_ptr->index[rclass];
  ...
  reg_costs[j] = COSTS (costs, i)->cost[num];

After setup_regno_cost_classes_by_mode adds RVC_FP_REGS to the cost
classes, the cost for e.g. f16 is silently read from the RVC_FP_REGS
slot.

The new fp-reg-class.c testcase puts eight "cf"- and sixteen "f"-
constrained doubles live across a call.  In the buggy state IRA
places the cf pseudos outside the cf class and LRA recovers with
sixteen fmv.d to fs* registers; with the fix IRA spills those values
honestly and the IRA "+++Costs" line reports a non-zero "mem"
component.

Fix it by giving each FP hard register its minimal class: FP_REGS for
f0-f7 and f16-f31, RVC_FP_REGS for f8-f15.  As a companion change,
switch riscv_secondary_memory_needed from class-equality tests to
reg_class_subset_p so it still recognises the FP side regardless of
which subclass the table returns.
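
The contract can be checked with a small model.  The following is a
sketch under simplifying assumptions: only two classes are modelled, FP
registers are numbered 0-31 rather than by their real hard register
numbers, and all function names are hypothetical.

```cpp
enum RegClass { FP_REGS, RVC_FP_REGS };

// True if FP register f<n> belongs to the class.
inline bool class_contains(RegClass cls, int n) {
    if (n < 0 || n > 31)
        return false;
    return cls == FP_REGS || (n >= 8 && n <= 15);
}

// Old table: every FP register mapped to RVC_FP_REGS.
inline RegClass buggy_class(int) { return RVC_FP_REGS; }

// Fixed table: the smallest class that contains the register.
inline RegClass minimal_class(int n) {
    return (n >= 8 && n <= 15) ? RVC_FP_REGS : FP_REGS;
}

// Toy analogue of reg_class_subset_p: every register of a is in b.
inline bool class_subset(RegClass a, RegClass b) {
    for (int n = 0; n < 32; ++n)
        if (class_contains(a, n) && !class_contains(b, n))
            return false;
    return true;
}
```

In this model, testing class_subset (cls, FP_REGS) recognises both
classes as FP classes, whereas an equality test against a single class
would miss one of them.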

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_regno_to_class): Use the minimal
class containing each FP hard register: FP_REGS for f0-f7 and
f16-f31, RVC_FP_REGS for f8-f15.
(riscv_secondary_memory_needed): Use reg_class_subset_p to
detect FP classes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/fp-reg-class.c: New test.

RISC-V: Support VLS LMUL cost scaling

Make VLS (fixed-length) vector modes use the same LMUL cost scaling as
VLA modes. This sometimes makes the vectorizer pick smaller LMULs.

The tests that failed in regression testing are updated as follows:
  - dyn-lmul-conv-[1-2].c: The cost model now prefers smaller LMULs,
    so update expectations.
  - pr123414.c: This test relies on large LMULs to trigger a specific bug,
    so it is fixed by adding -fno-vect-cost-model.

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (get_lmul_cost_scaling):
Enable scaling for all vector modes (VLA and VLS).

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/dyn-lmul-conv-1.c: Update expected LMUL counts.
* gcc.target/riscv/rvv/autovec/dyn-lmul-conv-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/pr123414.c: Disable vector cost model.

Signed-off-by: Zhongyao Chen <chen.zhongyao@zte.com.cn>

avr.opt.urls: Add -masm-len-notes and -Wasm-len-notes.

gcc/
* config/avr/avr.opt.urls (-masm-len-notes, -Wasm-len-notes): Add.

testsuite: add AVX512 requirement to vect-early-break-no-epilog_11.c

This testcase on x86_64 needs AVX512 to vectorize.
My original testing used -march=native so it was on by default.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-early-break-no-epilog_11.c: Add AVX512 for x86_64.

libstdc++: Optimize operator<< for piecewise distributions.

This avoids creating a temporary vector and uses the _M_int and _M_den
members of _M_param. Empty _M_int (default) values are handled by
printing the values directly.
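
The observable behavior that such an optimization has to preserve is the
round trip through operator<< and operator>>.  A minimal sketch of that
check, using only the standard <random> interface (round_trips is a
hypothetical helper name):

```cpp
#include <random>
#include <sstream>

// Writes the distribution to a stream, reads it back into a fresh
// object, and reports whether the restored object compares equal.
inline bool round_trips(
    const std::piecewise_constant_distribution<double>& d) {
    std::stringstream ss;
    ss << d;
    std::piecewise_constant_distribution<double> restored;
    ss >> restored;
    return !ss.fail() && restored == d;
}
```

The values used in such a check are best chosen so that the densities
are exactly representable, for example interval boundaries 0, 1, 2 with
weights 1 and 3.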

libstdc++-v3/ChangeLog:

* include/bits/random.h (piecewise_constant_distribution::param_type)
(piecewise_linear_distribution::param_type): Befriend operator<<.
* include/bits/random.tcc
(operator<<(basic_ostream&, piecewise_linear_distribution))
(operator<<(basic_ostream&, piecewise_constant_distribution)):
Use __x._M_param._M_int and __x._M_param._M_den instead of accessors.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Expand serialization test for piecewise distributions.

Due to the variability of the results, the tests are currently limited
to x86_64 architectures. The float/double tests are disabled for -m32
as I was getting unstable results.

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/piecewise_constant_distribution/operators/serialize2.cc:
New test.
* testsuite/26_numerics/random/piecewise_linear_distribution/operators/serialize2.cc:
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

aarch64/sve: combine AdvSIMD and SVE vec_duplicates

Currently, to duplicate a 64-bit or narrower value into a SVE register, we
choose to go via an intermediate 128-bit AdvSIMD register, viz.:

svfloat32_t foo(float x) {
    return svdupq_n_f32(x, x, x, x);
}

which will produce the following code:

        dup     v0.4s, v0.s[0]
        dup     z0.q, z0.q[0]
        ret

when compiled with -O2 -march=armv9-a+sve.

This can be simplified into a single dup instruction going to an SVE
register directly from a scalar (or a smaller vector) value:

mov z0.s, s0
ret

To facilitate this, this patch adds a pattern that combine can use to
merge two vec_duplicate instructions (scalar -> AdvSIMD and AdvSIMD ->
SVE) into a single one (scalar -> SVE).

To demonstrate the effect of this patch, the vec-init-23.c test from
AdvSIMD was reused as a new SVE test (vec_init_5.c).

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md
(*aarch64_vec_duplicate_subvector<vconsv><vconq><mode>):
New pattern.
* config/aarch64/iterators.md (VCONSV): New mode attribute.
(vconsv): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/vec_init_5.c: New test.

aarch64: implement vec_concat support for sub-64-bit types

This patch improves handling of 2-element vec_concats in
aarch64_vector_init_fallback (); where previously the aarch64_vec_concat
insn was emitted only for pairs of vectors, we now allow scalar operands
as well. Furthermore, if the two operands are the same, we can now emit a
vec_duplicate instead of a vec_concat, leading to better code generation.

This is backed by the new combine{z,_internal}{,_be} insn patterns, that
were each split between integral 16- and 32-bit modes (only involving GPRs
and memory), and the rest (requiring the "w" alternatives as well).

The effect of the changes is illustrated by the changes to vec-init-23.c,
introduced in the previous patch (and a handful of other vector-init
related tests).

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (*aarch64_combine_internal<mode>):
New insn pattern.
(*aarch64_combine_internal_be<mode>): Likewise.
(*aarch64_combinez<mode>): Likewise.
(*aarch64_combinez_be<mode>): Likewise.
(@aarch64_vec_concat<mode>): Support smaller vector and scalar modes.
* config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback):
Handle the case of two scalar elements.
* config/aarch64/iterators.md (SSUB64): New mode iterator.
(VSSUB64): Likewise.
(VSSUB32_I): Likewise.
(VSSUB64_F): Likewise.
(VS32_I_SUB64_F): Likewise.
(single_wx): Define attribute for sub-64-bit vector and scalar modes.
(bitsize): Likewise.
(VDBL): Likewise.
(single_dwx): New mode attribute.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/gather_load_10.c: Adjust testcase.
* gcc.target/aarch64/sve/slp_1.c: Likewise.
* gcc.target/aarch64/vec-init-18.c: Likewise.
* gcc.target/aarch64/vec-init-23.c: Likewise.

aarch64: initialize vectors from starting subsequence

Now that we have 2- and 4-element vector modes for all the sub-word scalar
modes, we can emit more efficient code when the elements of a vector
constructor can be generated from a common starting subsequence of length
power of two.  To do this, first detect the shortest possible starting
subsequence by repeatedly folding the initial constructor element array
in half, as long as the left and the right halves are equal.  Afterwards,
after emitting the subsequence, duplicate it by generating a
vec_duplicate with the correct source mode.
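
The folding step can be sketched on a plain array of element values.
This is an illustration only, with a hypothetical function name and int
elements standing in for the constructor's RTL operands:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the length of the shortest starting subsequence from which the
// whole array can be rebuilt by repeated doubling.  The array is folded
// in half for as long as the left and right halves are equal.
inline std::size_t starting_subsequence_length(const std::vector<int>& elts) {
    std::size_t n = elts.size();
    while (n > 1 && n % 2 == 0) {
        auto half = elts.begin() + static_cast<std::ptrdiff_t>(n / 2);
        if (!std::equal(elts.begin(), half, half))
            break;
        n /= 2;
    }
    return n;
}
```

For a constructor of the form {x, y, z, w, x, y, z, w} with distinct
values this returns 4, so only four elements need to be materialized
before the duplication.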

On the MD side, this requires implementing the vec_duplicate optab to
duplicate an arbitrary sub-128-bit value into a full 64- or a 128-bit
AdvSIMD register, as well as the vec_set insn for the VSUB64 modes (needed
as fallback for the divide-and-conquer approach).  The latter uses a
properly scaled and shifted "bfi" for integer values, and a properly
indexed "ins" for FP elements.

This change allows us to get rid of long chains of inserts and compile
things like:

int16x8_t f (int16_t x, int16_t y, int16_t z, int16_t w)
{
return (int16x8_t) {x, y, z, w, x, y, z, w};
}

into:
bfi     w0, w2, 16, 16
bfi     w1, w3, 16, 16
dup     v31.2s, w0
dup     v0.2s, w1
zip1    v0.8h, v31.8h, v0.8h
ret

rather than:

dup     v31.4h, w0
dup     v0.4h, w1
ins     v31.h[1], w2
ins     v0.h[1], w3
ins     v31.h[3], w2
ins     v0.h[3], w3
zip1    v0.8h, v31.8h, v0.8h
ret

This patch also includes an extensive new test, which includes the above
case, as well as adjustments to existing codegen tests as necessary.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (*aarch64_simd_dup_subvector<vconq><mode>):
New insn pattern.
(*aarch64_simd_dup_subvector<vcond><mode>): Likewise.
(@aarch64_simd_vec_set<mode>): Likewise.
(vec_set<mode>): Handle 16- and 32-bit vector modes in the expander.
* config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback): Add
logic to initialize vector from starting subsequence.  Make static.
(scalar_move_insn_p): Consider sub-64-bit vector moves scalar.
* config/aarch64/iterators.md (VDDUP): New iterator.
(VQDUP): Likewise.
(elem_bits): Define attribute for sub-64-bit vector modes.
(Vetype): Likewise.
(VEL): Likewise.
(single_wx): Define attribute for sub-64-bit vector and scalar modes.
(single_type): Likewise.
(VCOND): Likewise.
(VCONQ): Likewise.
(Vqduptype): New mode attribute.
(Vdduptype): Likewise.
(vcond): Likewise.
(vconq): Likewise.
(vstype): Define attribute for 64-bit vector and sub-128-bit scalar
modes.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ldp_stp_16.c: Adjust testcase.
* gcc.target/aarch64/sve/slp_1.c: Likewise.
* gcc.target/aarch64/vec-init-18.c: Likewise.
* gcc.target/aarch64/vec-init-23.c: New test.

aarch64: introduce partial AdvSIMD vector modes

In addition to V2HF that already exists, this patch adds 4 more partial
(16- and 32-bit) AdvSIMD vector modes: V4QI, V2QI, V2HI, and V2BF.  For
now, these are intended only for duplication into full-sized (32-, 64-,
and 128-bit) registers.  As a minimal closure required to bootstrap the
compiler, this also implements the "mov" expand and the "aarch64_simd_mov"
insn_and_split for the new modes (gathered under the VSUB64 iterator).

This patch also adds the new aarch64_advsimd_sub_dword_mode_p () helper to
facilitate detecting the new modes; that is then used (a) to disable
vec_perm_const vectorization for those modes, (b) in the "mov" expander
for those modes, and (c) to define the new "Da" constraint.

Some existing testcases were adjusted where needed.  (The _Float16
testcase in sve/slp_1.c temporarily expects GPRs to be used for V2HF,
which is corrected to FPRs by the succeeding patch; and the half-float
complex tests now recognize some of the patterns, but check that V2BF
still can't be used for vectorization.)

gcc/ChangeLog:

* config/aarch64/aarch64-modes.def (VECTOR_MODE): Remove V2HF.
(VECTOR_MODES): Define V2QI, V4QI, V2HI, V2HF, V2BF.
* config/aarch64/aarch64-protos.h
(aarch64_advsimd_sub_dword_mode_p): Declare new predicate.
* config/aarch64/aarch64-simd.md (*aarch64_simd_mov<mode>): New
define_insn_and_split pattern.
(mov<mode>): Add sub-64-bit vector modes to the VALL_F16 expander.
Forego const vector expansion for those modes.
* config/aarch64/aarch64.cc (aarch64_classify_vector_mode):
Handle 16- and 32-bit vector modes.
(aarch64_advsimd_sub_dword_mode_p): Define new predicate.
(aarch64_vectorize_vec_perm_const): Refuse for partial vector modes.
* config/aarch64/constraints.md (Da): New constraint.
* config/aarch64/iterators.md (VSUB64): New iterator.
(VALL_F16_SUB64): Likewise.
(size): Define attribute for sub-64-bit vector modes.
(VSC): New mode attribute.
(vstype): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/complex/bb-slp-complex-add-half-float.c: Adjust testcase.
* gcc.dg/vect/complex/bb-slp-complex-mla-half-float.c: Likewise.
* gcc.dg/vect/complex/bb-slp-complex-mul-half-float.c: Likewise.
* gcc.target/aarch64/sve/slp_1.c: Likewise.

i386: Refine c86-4g fdiv scheduling model

Commit r17-258 introduced separate c86-4g fdiv units to avoid the
automaton explosion caused by modeling the whole divider latency on
normal FPU pipes.  But the real hardware may keep the associated FPU
pipe occupied for some cycles at both the beginning and the end of
an fdiv or sqrt operation.  Following Alexander's suggestion in [1],
this patch still keeps the long-latency part on the dedicated fdiv
unit but models only a bounded part of the FPU pipe occupancy.  It
makes the first four cycles reserve both the selected FPU pipe and
the fdiv unit, then keep only the fdiv unit for the remaining cycles.

Taking r17-258 as baseline, I tried K = 1,2,3,4 for

  fpu,divider*N -> (fpu+divider)*K, divider*(N-K)

and measured the time for build/genautomata and the top 100 symbol
sizes of insn-automata.o (baseline normalized as 100) as below:

1) without any other changes:
              time     size
  baseline    100      100
  r17-203     340.0    629.3
  K1          100.3    100
  K2          105.5    112.5
  K3          112.8    129
  K4          119.4    141

2) Splitting fpu0/fpu2 and fpu1/fpu3 to paired automatons:
              time     size
  baseline    100      100
  r17-203     340.0    629.3
  KS1         79.6     43.3
  KS2         79.8     43.3
  KS3         79.6     43.3
  KS4         79.4     43.3

It turns out that if we want to model the FPU occupancy for some
beginning cycles, separating the involved fpu1/fpu3 from the
original fpu looks better.  So this patch splits fpu0/fpu2 and
fpu1/fpu3 into two paired automata and this extra coupling does
not grow the main FPU automata significantly.
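
For reference, the bounded form (fpu+divider)*K, divider*(N-K) can be
generated mechanically.  In GCC's pipeline description regexps, ","
advances to the next cycle, "+" reserves units in the same cycle and
"*N" repeats a reservation N times.  The helper below is a hypothetical
sketch that only builds the string; the unit names passed to it are
placeholders, not the names used in the .md files.

```cpp
#include <string>

// Builds "(pipe+divider)*K,divider*(N-K)".  K is clamped to [0, N], and
// empty parts are omitted.
inline std::string bounded_reservation(const std::string& pipe,
                                       const std::string& divider,
                                       int n, int k) {
    if (k < 0) k = 0;
    if (k > n) k = n;
    std::string out;
    if (k > 0)
        out = "(" + pipe + "+" + divider + ")*" + std::to_string(k);
    if (n - k > 0) {
        if (!out.empty())
            out += ",";
        out += divider + "*" + std::to_string(n - k);
    }
    return out;
}
```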

This patch also corrects some other modeling omissions like:

  - Fix a one-cycle latency typo in c86_4g_fp_op_idiv_load.
  - Merge the old c86_4g_m7 idiv DI/SI/HI reservations after
    aligning their latency and divider unit occupancy (with
    updated values), while keeping QI separate.
  - Adjust reservation units in templates like
    c86_4g_m7_avx_vpinsr_reg_load and c86_4g_m7_avx512_sseadd_xy
    etc.
  - Add missing reservation units and unit occupancy in templates
    like c86_4g_m7_avx512_permi2_ymm and
    c86_4g_m7_sse_sseiadd_hplus_load etc.
  - Adjust reservation units and unit occupancy in templates like
    c86_4g_m7_avx512_perm_zmm_imm, c86_4g_m7_avx512_expand and
    c86_4g_m7_avx512_ssemul etc.

It also introduces some reusable reservation aliases to simplify
the modeling.

I tested build time for i686 bootstrapping in a docker container:
  - r17-202: 2437s (before c86-4g support)
  - r17-203: 7291s (c86-4g support)
  - r17-258: 2646s (tweaking for build time)
  - this: 2358s
It looks like this patch improves build time (it is even better than
r17-202, though that small gap may be due to jitter).

The symbol sizes are improved as below:

nm -CS -t d --defined-only gcc/insn-automata.o \
    | sed 's/^[0-9]* 0*//' \
    | sort -n | tail -20

with r17-258:

  20068 r bdver1_fp_transitions
  22354 r c86_4g_m7_ieu_min_issue_delay
  26208 r slm_min_issue_delay
  26580 t internal_min_issue_delay(int, DFA_chip*)
  26869 t internal_state_transition(int, DFA_chip*)
  27244 r bdver1_fp_min_issue_delay
  28518 r glm_check
  28518 r glm_transitions
  33690 r geode_min_issue_delay
  33728 r c86_4g_fp_transitions
  45436 r znver4_fpu_min_issue_delay
  46980 r bdver3_fp_min_issue_delay
  49428 r glm_min_issue_delay
  53730 r btver2_fp_min_issue_delay
  53760 r znver1_fp_transitions
  89414 r c86_4g_m7_ieu_transitions
  93960 r bdver3_fp_transitions
  181744 r znver4_fpu_transitions
  326322 r c86_4g_m7_fpu_min_issue_delay
  1305288 r c86_4g_m7_fpu_transitions

with this:

  17872 r print_reservation(_IO_FILE*, rtx_insn*)::...
  20068 r bdver1_fp_check
  20068 r bdver1_fp_transitions
  22016 r c86_4g_m7_fpu02_transitions
  22354 r c86_4g_m7_ieu_min_issue_delay
  26208 r slm_min_issue_delay
  27244 r bdver1_fp_min_issue_delay
  28199 t internal_min_issue_delay(int, DFA_chip*)
  28362 t internal_state_transition(int, DFA_chip*)
  28518 r glm_check
  28518 r glm_transitions
  33690 r geode_min_issue_delay
  45436 r znver4_fpu_min_issue_delay
  46980 r bdver3_fp_min_issue_delay
  49428 r glm_min_issue_delay
  53730 r btver2_fp_min_issue_delay
  53760 r znver1_fp_transitions
  89414 r c86_4g_m7_ieu_transitions
  93960 r bdver3_fp_transitions
  181744 r znver4_fpu_transitions

Based on random sampling of SPEC2017 benchmarks 525.x264_r and
521.wrf_r, I verified that the new modeling introduces no
significant compilation overhead.  Testing with a single job on a
c86-4g-m7 machine revealed no impact on x264 and a tiny increase
for wrf (~0.3%).

[1] https://gcc.gnu.org/pipermail/gcc-patches/2026-May/716681.html

gcc/ChangeLog:

* config/i386/c86-4g-m7.md (c86_4g_m7_fpu): Remove automaton.
(c86_4g_m7_fpu02): New automaton.
(c86_4g_m7_fpu13): Ditto.
(c86-4g-m7-fpu0): Move to c86_4g_m7_fpu02 automaton.
(c86-4g-m7-fpu1): Move to c86_4g_m7_fpu13 automaton.
(c86-4g-m7-fpu2): Move to c86_4g_m7_fpu02 automaton.
(c86-4g-m7-fpu3): Move to c86_4g_m7_fpu13 automaton.
(c86-4g-m7-fdiv): Remove cpu unit.
(c86-4g-m7-fdiv1): New cpu unit.
(c86-4g-m7-fdiv3): Ditto.
(c86-4g-m7-fpu_0_3): New reservation.
(c86-4g-m7-fpu_1_3x2): Ditto.
(c86-4g-m7-fpu_1_3x3): Ditto.
(c86-4g-m7-fpu_1_3x6): Ditto.
(c86-4g-m7-fpux2): Ditto.
(c86-4g-m7-fpux4): Ditto.
(c86-4g-m7-fpux6): Ditto.
(c86-4g-m7-fpux8): Ditto.
(c86-4g-m7-fpux16): Ditto.
(c86-4g-m7-fp1fdiv1x4): Ditto.
(c86-4g-m7-fp3fdiv3x4): Ditto.
(c86-4g-m7-fdiv13): Ditto.
(c86-4g-m7-fp13div13): Ditto.
(c86-4g-m7-fp13div13x4): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x8): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x9): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x11): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x15): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x18): Ditto.
(c86_4g_m7_idiv): New reservation.
(c86_4g_m7_idiv_QI): Adjust reservation latency and unit occupancy.
(c86_4g_m7_idiv_load): New reservation.
(c86_4g_m7_idiv_QI_load): Adjust reservation latency and unit
occupancy.
(c86_4g_m7_idiv_DI): Remove reservation.
(c86_4g_m7_idiv_SI): Ditto.
(c86_4g_m7_idiv_HI): Ditto.
(c86_4g_m7_idiv_DI_load): Ditto.
(c86_4g_m7_idiv_SI_load): Ditto.
(c86_4g_m7_idiv_HI_load): Ditto.
(c86_4g_m7_sse_insertimm): Adjust reservation units and unit
occupancy.
(c86_4g_m7_sse_insert): Ditto.
(c86_4g_m7_fp_sqrt): Adjust reservation.
(c86_4g_m7_fp_div): Ditto.
(c86_4g_m7_fp_div_load): Ditto.
(c86_4g_m7_fp_idiv_load): Ditto.
(c86_4g_m7_sse_pinsr_reg): Adjust reservation units and unit
occupancy.
(c86_4g_m7_sse_pinsr_reg_load): Ditto.
(c86_4g_m7_avx_vpinsr_reg): Ditto.
(c86_4g_m7_avx_vpinsr_reg_load): Ditto.
(c86_4g_m7_avx512_perm_xmm): Delete the prefix condition.
(c86_4g_m7_avx512_perm_xmm_opload): Ditto.
(c86_4g_m7_avx512_permi2_ymm): Adjust reservation units and unit
occupancy.
(c86_4g_m7_avx512_permi2_zmm): Ditto.
(c86_4g_m7_avx512_permi2_ymm_load): Ditto.
(c86_4g_m7_avx512_permi2_zmm_load): Ditto.
(c86_4g_m7_avx512_perm_zmm_imm): Ditto.
(c86_4g_m7_avx512_perm_zmm_imm_load): Ditto.
(c86_4g_m7_avx512_perm_zmm_noimm): Ditto.
(c86_4g_m7_sse_perm_zmm_noimm_load): Ditto.
(c86_4g_m7_avx_perm_ymm): Remove.
(c86_4g_m7_avx_perm_ymem): Ditto.
(c86_4g_m7_avx512_shuf_zmm): Adjust reservation units and unit
occupancy.
(c86_4g_m7_avx512_shuf_zmem): Ditto.
(c86_4g_m7_avx512_cmpestr): Ditto.
(c86_4g_m7_avx512_cmpestr_load): Ditto.
(c86_4g_m7_avx512_vdbpsadbw_zmm): Ditto.
(c86_4g_m7_avx512_vdbpsadbw_zmem): Ditto.
(c86_4g_m7_avx_ssecomi_comi): Ditto.
(c86_4g_m7_avx_ssecomi_comi_load): Ditto.
(c86_4g_m7_avx512_expand): Ditto.
(c86_4g_m7_avx512_expand_load): Ditto.
(c86_4g_m7_avx512_expand_z): Ditto.
(c86_4g_m7_avx512_expand_z_load): Ditto.
(c86_4g_m7_sse_movnt_xy): Rename to c86_4g_m7_sse_movnt.
(c86_4g_m7_avx512_sseadd_xy): Adjust reservation units.
(c86_4g_m7_avx512_sseadd_xy_load): Ditto.
(c86_4g_m7_sse_sseiadd_hplus): Adjust reservation units and unit
occupancy.
(c86_4g_m7_sse_sseiadd_hplus_load): Ditto.
(c86_4g_m7_avx512_ssemul): Adjust reservation units.
(c86_4g_m7_avx512_ssemul_load): Ditto.
(c86_4g_m7_avx512_ssediv): Remove.
(c86_4g_m7_avx512_ssediv_mem): Remove.
(c86_4g_m7_avx512_ssediv_x): New.
(c86_4g_m7_avx512_ssediv_xmem): New.
(c86_4g_m7_avx512_ssediv_y): New.
(c86_4g_m7_avx512_ssediv_ymem): New.
(c86_4g_m7_avx512_ssediv_z): Adjust reservation units.
(c86_4g_m7_avx512_ssediv_zmem): Ditto.
(c86_4g_m7_avx512_ssecmp_z): Add reservation units and unit
occupancy.
(c86_4g_m7_avx512_ssecmp_z_load): Ditto.
(c86_4g_m7_avx512_ssecmp_vp_z): New reservation.
(c86_4g_m7_avx512_ssecmp_vp_z_load): Ditto.
(c86_4g_m7_avx512_ssecmp_test_z): Remove reservation.
(c86_4g_m7_avx512_ssecmp_test_z_load): Ditto.
(c86_4g_m7_avx512_muladd): Broaden matching condition.
(c86_4g_m7_avx512_muladd_load): Ditto.
(c86_4g_m7_fma_muladd): Remove reservation.
(c86_4g_m7_fma_muladd_load): Ditto.
(c86_4g_m7_avx512_sse_conflict_x): Add reservation units and unit
occupancy.
(c86_4g_m7_avx512_sse_conflict_x_load): Ditto.
(c86_4g_m7_avx512_sse_conflict_y): Ditto.
(c86_4g_m7_avx512_sse_conflict_y_load): Ditto.
(c86_4g_m7_avx512_sse_conflict_z): Ditto.
(c86_4g_m7_avx512_sse_conflict_z_load): Ditto.
(c86_4g_m7_avx512_sse_class_z): Add reservation units and unit
occupancy.
(c86_4g_m7_avx512_sse_class_z_load): Ditto.
(c86_4g_m7_avx512_sse_sqrt): Remove.
(c86_4g_m7_avx512_sse_sqrt_load): Remove.
(c86_4g_m7_avx512_sse_sqrt_sf_x): New.
(c86_4g_m7_avx512_sse_sqrt_sf_xload): New.
(c86_4g_m7_avx512_sse_sqrt_sf_y): New.
(c86_4g_m7_avx512_sse_sqrt_sf_yload): New.
(c86_4g_m7_avx512_sse_sqrt_sf_z): New.
(c86_4g_m7_avx512_sse_sqrt_sf_zload): New.
(c86_4g_m7_avx512_sse_sqrt_df_x): New.
(c86_4g_m7_avx512_sse_sqrt_df_xload): New.
(c86_4g_m7_avx512_sse_sqrt_df_y): New.
(c86_4g_m7_avx512_sse_sqrt_df_yload): New.
(c86_4g_m7_avx512_sse_sqrt_df_z): New.
(c86_4g_m7_avx512_sse_sqrt_df_zload): New.
(c86_4g_m7_avx512_msklog_vector): Add reservation units and unit
occupancy.
(c86_4g_m7_avx512_mskmov_z_k): Ditto.
(c86_4g_m7_avx512_mskmov_k_reg): Ditto.
* config/i386/c86-4g.md (c86_4g_fp): Remove automaton.
(c86_4g_fp024): New automaton.
(c86_4g_fp1): Ditto.
(c86-4g-fp0): Move to c86_4g_fp024 automaton.
(c86-4g-fp1): Move to c86_4g_fp1 automaton.
(c86-4g-fp2): Move to c86_4g_fp024 automaton.
(c86-4g-fp3): Ditto.
(c86-4g-fp1fdivx4): New reservation.
(c86_4g_fp_sqrt): Adjust reservation.
(c86_4g_sse_sqrt_sf): Ditto.
(c86_4g_sse_sqrt_sf_mem): Ditto.
(c86_4g_sse_sqrt_df): Ditto.
(c86_4g_sse_sqrt_df_mem): Ditto.
(c86_4g_fp_op_div): Ditto.
(c86_4g_fp_op_div_load): Ditto.
(c86_4g_fp_op_idiv_load): Adjust reservation latency.
(c86_4g_ssediv_ss_ps): Adjust reservation.
(c86_4g_ssediv_ss_ps_load): Ditto.
(c86_4g_ssediv_sd_pd): Ditto.
(c86_4g_ssediv_sd_pd_load): Ditto.
(c86_4g_ssediv_avx256_ps): Ditto.
(c86_4g_ssediv_avx256_ps_load): Ditto.
(c86_4g_ssediv_avx256_pd): Ditto.
(c86_4g_ssediv_avx256_pd_load): Ditto.

Co-authored-by: Xin Liu <liulxx@hygon.cn>
Signed-off-by: Xin Liu <liulxx@hygon.cn>
Signed-off-by: Kewen Lin <linkewen@hygon.cn>

RISC-V: Add RISC-V RVV main-loop overhead comparison in cost model

Add an RVV-specific loop-overhead comparison in the RISC-V cost model and
use it after inside-loop cost comparison.

The RISC-V implementation prefers RVV modes that eliminate the main
loop, and otherwise compares their main-loop head overhead.
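
The ordering described above can be sketched roughly as follows.  This is
only an illustration, not the code in riscv-vector-costs.cc: the struct,
its field names and better_candidate_p are hypothetical stand-ins, and it
assumes each candidate mode has already been reduced to three numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one candidate RVV mode.  The real cost model
   works on vectorizer cost objects, not on a plain struct like this.  */
struct loop_candidate
{
  unsigned inside_cost;        /* inside-loop cost, lower is better */
  bool eliminates_main_loop;   /* true if no main loop remains */
  unsigned head_overhead;      /* main-loop head overhead, lower is better */
};

/* Return true when A should be preferred over B.  The inside-loop cost is
   compared first; the loop-overhead comparison is only a tie-breaker.  */
bool
better_candidate_p (const struct loop_candidate *a,
                    const struct loop_candidate *b)
{
  assert (a != NULL && b != NULL);

  if (a->inside_cost != b->inside_cost)
    return a->inside_cost < b->inside_cost;

  /* Equal inside-loop cost: prefer the mode that eliminates the main loop.  */
  if (a->eliminates_main_loop != b->eliminates_main_loop)
    return a->eliminates_main_loop;

  /* Otherwise fall back to the smaller main-loop head overhead.  */
  return a->head_overhead < b->head_overhead;
}
```

With equal inside-loop cost, a candidate that eliminates the main loop wins
even if its head overhead is larger; a difference in inside-loop cost always
decides first.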

Local testing shows no regressions. This is likely because few testcases
have equal inside-loop cost, especially before VLS lmul cost scaling support.

I also ran regression tests with temporary VLS lmul cost scaling support.
Only 3 regressions found:
- dyn-lmul-conv-1.c & dyn-lmul-conv-2.c: The cost model now prefers smaller
LMULs due to VLS lmul scaling, so this is reasonable; the expectations just
need to be updated.
- pr123414.c: This test relies on large LMULs to trigger a specific bug,
so this is reasonable too and can be fixed by adding -fno-vect-cost-model.

The VLS LMUL cost scaling patch will be updated after this is pushed.

gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc
(estimated_loop_iters): New function.
(compare_loop_overhead): New function.
(costs::better_main_loop_than_p): Compare RVV loop overhead after
inside-loop cost.

Signed-off-by: Zhongyao Chen <chen.zhongyao@zte.com.cn>

aarch64: Make more use of UINTVAL

I noticed while reviewing some other code that we have existing code of
the form (unsigned HOST_WIDE_INT) INTVAL (x). Such expressions are (by
definition of UINTVAL) equivalent to UINTVAL (x), and the latter is both
more succinct and (IMO) more readable, so this patch replaces those
instances in the aarch64 backend accordingly.
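
As a minimal stand-alone illustration of why the two spellings agree, the
sketch below mimics the macros outside of GCC.  The typedefs, struct fake_rtx
and the simplified INTVAL are assumptions made for the example; only the
shape of UINTVAL (a cast of INTVAL to unsigned HOST_WIDE_INT) follows the
definition referred to above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for GCC's types, assuming a 64-bit HOST_WIDE_INT.  */
typedef int64_t host_wide_int;
typedef uint64_t unsigned_host_wide_int;

/* Hypothetical replacement for a CONST_INT rtx.  */
struct fake_rtx
{
  host_wide_int value;
};

/* Simplified accessor; the real INTVAL also checks the rtx code.  */
#define INTVAL(X) ((X)->value)
/* UINTVAL is just the unsigned cast of INTVAL.  */
#define UINTVAL(X) ((unsigned_host_wide_int) INTVAL (X))

/* Old spelling: explicit cast at the use site.  */
static unsigned_host_wide_int
old_form (const struct fake_rtx *x)
{
  assert (x != NULL);
  return (unsigned_host_wide_int) INTVAL (x);
}

/* New spelling: the same value through UINTVAL.  */
static unsigned_host_wide_int
new_form (const struct fake_rtx *x)
{
  assert (x != NULL);
  return UINTVAL (x);
}

/* Return 1 when both spellings yield the same value for V.  */
int
forms_agree (host_wide_int v)
{
  struct fake_rtx r = { v };
  return old_form (&r) == new_form (&r);
}
```

Since new_form expands to exactly the expression written out in old_form,
forms_agree holds for every input, including negative values that wrap to
large unsigned ones.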

There are also many occurrences of this outside of aarch64, I see:

$ git grep -nE "\(unsigned HOST_WIDE_INT\)\s?INTVAL" | wc -l
73

with this patch applied, but this patch just fixes the aarch64 cases for
now.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_strip_extend): Replace
(unsigned HOST_WIDE_INT) INTVAL (x) with UINTVAL (x).
* config/aarch64/predicates.md (aarch64_shift_imm_si): Likewise.
(aarch64_shift_imm_di): Likewise.
(aarch64_shift_imm64_di): Likewise.
(aarch64_imm3): Likewise.

AVR: Support [[len=<words>]] notes in inline asm to specify its size.

This patch adds support for [[len=<words>]] in (the comments of) inline
asm constructs.  It serves several purposes:

- Cases where the expanded asm is longer than determined from the number
  of physical and logical line breaks.  Such cases can lead to errors
  when a jump that uses a too optimistic jump offset is crossing an asm.

- Better code generation for jumps that are crossing an asm.  The default
  length of an asm is (1 + NL) * 2 words, where NL denotes the sum of
  physical and logical line breaks.  However, almost all AVR instructions
  occupy only one 16-bit word.

The feature is implemented in ADJUST_INSN_LENGTH.  The length of
an asm is the sum over all [[len=<words>]] notes, except when an
unrecognized construct is found or an error occurred.  In the latter
case, the default insn length is used.  These <words> are supported:

<words> = [0-9]+
   Specifies a non-negative decimal integer.

<words> = %[0-9]+
<words> = %[<name>]   # Already resolved to %[0-9]+ by the middle-end.
   Refers to the respective asm operand, which must be CONST_INT.

<words> = lds
<words> = sts
   Specifies the length of a LDS or STS instruction, i.e.
   1 word if AVR_TINY, and 2 words otherwise.

<words> = %~
<words> = %~call
<words> = %~jmp
   Specifies the length of a %~call resp. %~jmp instruction, i.e.
   2 words if AVR_HAVE_JMP_CALL, and 1 word otherwise.

In order to observe the assigned lengths, see -fdump-rtl-shorten or the
";; ADDR = ..." insn addresses in the asm output with -mlog=insn_addresses.

The benefits of using magic comments are:

- The feature is backwards compatible, and the target code can use
  the same asm syntax since only asm comments have to be adjusted.
  No #ifdef feature test macros are needed.  The only case where the
  feature is not fully backwards compatible is when asm templates
  already contain invalid "[[len=" notes for some reason.  In that
  case, -mno-asm-len-notes restores the old behavior.

- Since the asm size is the sum over all notes, the final size can
  be stitched together from multiple annotations / parts of an asm
  template, and there is no need to support operations like plus.
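
The summing rule can be sketched as below.  This is not avr_length_of_asm:
sketch_asm_len and default_len are hypothetical names, only the decimal
<words> form is handled, NL is passed in by the caller instead of being
counted, and falling back to the default length when no note is present is
an assumption of the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Default size of an asm in words: (1 + NL) * 2, where NL is the number
   of physical and logical line breaks.  */
static int
default_len (int nl)
{
  return (1 + nl) * 2;
}

/* Hypothetical helper: sum all "[[len=<decimal>]]" notes in TEMPL.
   Anything this sketch does not understand yields the default length.  */
int
sketch_asm_len (const char *templ, int nl)
{
  static const char tag[] = "[[len=";
  const char *p = templ;
  int sum = 0;
  int seen = 0;

  assert (templ != NULL && nl >= 0);

  while ((p = strstr (p, tag)) != NULL)
    {
      p += strlen (tag);

      /* Only <words> = [0-9]+ is supported here; lds, sts, %~ and
         operand references are treated as unrecognized.  */
      if (!isdigit ((unsigned char) *p))
        return default_len (nl);

      int n = 0;
      while (isdigit ((unsigned char) *p))
        n = n * 10 + (*p++ - '0');

      /* A note must be closed by "]]".  */
      if (p[0] != ']' || p[1] != ']')
        return default_len (nl);

      p += 2;
      sum += n;
      seen = 1;
    }

  /* Assumption of this sketch: no note at all means default length.  */
  return seen ? sum : default_len (nl);
}
```

For a two-line template such as "a ; [[len=2]]\nb ; [[len=3]]" the sketch
adds the two notes, while an unannotated single-line template keeps the
default of (1 + 0) * 2 words.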

gcc/
* config/avr/avr.cc (avr_read_number, avr_length_of_asm)
(avr_maybe_length_of_asm): New static functions.
(avr_adjust_insn_length): Call avr_maybe_length_of_asm on
unrecognized insns.
* config/avr/avr.opt (-masm-len-notes, -Wasm-len-notes): New
options.
* doc/invoke.texi (AVR Options): Add -masm-len-notes,
-Wasm-len-notes.
* doc/extend.texi (Size of an asm): Add @subsubheading
"Specifying the size of an asm on AVR".

libgcc/config/avr/libf7/
* libf7.h: Add "[[len=...]]" notes to all non-empty inline asm's.
* libf7.c: Ditto.

AVR: ad target/121343 - Use hard-reg constraints in [u]divmod insns.

PR target/121343
gcc/
* config/avr/avr.md (divmod<mode>4, udivmod<mode>4): Use
hard-reg constraints instead of explicit hard-regs.
(*divmodqi4_call_split, *udivmodqi4_call_split): Remove.
(*divmodhi4_call_split, *udivmodhi4_call_split): Remove.
(*divmodpsi4_call_split, *udivmodpsi4_call_split): Remove.
(*divmodsi4_call_split, *udivmodsi4_call_split): Remove.

i386: Fix up *add<mode>_1<nf_name> [PR125469]

The following testcase ICEs, because combine matches
(set (reg:DI 108) (plus:DI (reg:DI 104 [ s ]) (subreg:DI (reg:TI 103 [ _2 ]) 8)))
Now, because ix86_validate_address_register has:
      /* Don't allow SUBREGs that span more than a word.  It can
         lead to spill failures when the register is one word out
         of a two word structure.  */
      if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
        return NULL_RTX;
this isn't recognized as *leadi, but is recognized as *adddi_1_nf pattern
instead.  Now, later on the RA turns it into:
(set (reg:DI 2 cx [108]) (plus:DI (reg:DI 0 ax [orig:104 s ] [104]) (reg:DI 5 di [ _2+8 ])))
which would be valid *leadi, but given that INSN_CODE is already set to the
*adddi_1_nf and that also satisfies it, nothing re-recognizes it as *leadi.
But in that case without TARGET_APX_NDD the pattern has return "#";
That is a bug, because there is no splitter to split that
(set (reg:DI 2 cx [108]) (plus:DI (reg:DI 0 ax [orig:104 s ] [104]) (reg:DI 5 di [ _2+8 ])))
into itself so that it is re-recognized as *leadi, so it just ICEs.
I think having a splitter to split to the same thing would be just weird, so
this just outputs lea insn directly.

2026-05-28  Jakub Jelinek  <jakub@redhat.com>

PR target/125469
* config/i386/i386.md (*add<mode>_1<nf_name>): Don't return "#" for
the lea non-TARGET_APX_NDD case, instead emit a lea directly.

* gcc.target/i386/apx-nf-pr125469.c: New test.

Reviewed-by: Uros Bizjak <ubizjak@gmail.com>