Tobias Burnus [Wed, 17 Jun 2026 10:16:55 +0000 (12:16 +0200)]
Fortran/OpenMP: Rename declare-mapper struct members
The usm->mapper_id vs. usm->usm->mapper_id and
also the usm->usm itself was a bit confusing.
Hence, this is now:
usm->requested_mapper_id
usm->resolved_usm
where the latter has
usm->resolved_usm->mapper_id
Hereby, usm->requested_mapper_id is set for a map/to/from
clause such as 'map(mapper(my_name), to: x)' - while
the resolved_usm points to an object that has been
created by 'declare mapper'.
gcc/fortran/ChangeLog:
* gfortran.h (struct gfc_omp_udm): Add comment.
(struct gfc_omp_namelist_udm): Likewise; rename members
mapper_id to requested_mapper_id and usm to resolved_usm.
* module.cc (load_omp_udms, write_omp_udm): Update accordingly.
* openmp.cc (gfc_match_omp_clauses, resolve_omp_clauses): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
Dhruv Chawla [Tue, 16 Jun 2026 12:05:39 +0000 (12:05 +0000)]
lto: Fix streaming of loop->can_be_parallel [PR125717]
The can_be_parallel flag was not being streamed out, so any frontend
setting the flag (via annot_expr_parallel_kind) would not observe it doing
anything when LTO was enabled.
Zhongjie Guo [Wed, 17 Jun 2026 03:29:15 +0000 (03:29 +0000)]
i386: Adjust c86-4g-m7 512-bit memory costs
c86-4g-m7 is a split-regs AVX512 target. A 512-bit memory
operation is implemented as two 256-bit halves, so the vectorizer cost
model should not make 512-bit loads and stores almost as cheap as
256-bit ones.
The old c86_4g_m7_cost values made 512-bit loads/stores cost 12/12,
close to or equal to the 256-bit 10/12 costs. This can make 64-byte
vectorization win in the loop body cost comparison even when 32-byte
vectors avoid extra reduction epilogue work.
Set the 512-bit load/store and unaligned load/store costs to twice the
256-bit costs. This removes the artificial 64-byte body-cost advantage;
for dot-product style reduction loops, the reduction epilogue cost can
then make 32-byte vectorization preferable.
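
A worked instance of the body-cost comparison above, as a minimal sketch.
Only the load costs of 10 (256-bit) and 12 (old 512-bit) come from the
message; the function names, the two-input dot-product shape, and the
idea that body cost is a plain sum of per-load costs are illustrative
assumptions, not the vectorizer's actual cost model.

```c
#include <assert.h>

/* Hypothetical helper: load cost of covering `bytes` bytes from each of
   `streams` input arrays with vectors that are `vec_bytes` bytes wide.  */
int body_load_cost(int bytes, int vec_bytes, int streams, int cost_per_load)
{
    assert(vec_bytes > 0 && bytes % vec_bytes == 0);
    return (bytes / vec_bytes) * streams * cost_per_load;
}

/* Nonzero when 64-byte vectors look strictly cheaper than 32-byte vectors
   for the same 64 bytes of a two-input dot product.  */
int wide_wins_on_body_cost(int cost_256, int cost_512)
{
    int narrow = body_load_cost(64, 32, 2, cost_256);
    int wide = body_load_cost(64, 64, 2, cost_512);
    return wide < narrow;
}
```

With the old costs (10 and 12) the 64-byte body costs 24 against 40, so it
wins outright. With the 512-bit cost doubled to 20 both bodies cost 40, and
the decision is left to the remaining terms such as the reduction epilogue.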
Compared with gcc-trunk without this tuning, local SPEC2006/SPEC2017
testing shows improvements in several vector-width sensitive workloads.
SPEC2006 1-copy fp_speed improved by 2.32%, including 436.cactusADM
+12.39%, 433.milc +7.33%, and 459.GemsFDTD +4.76%. SPEC2017 32-copy
fprate improved by 0.40%, with 526.blender_r improving by 2.80%.
gcc/ChangeLog:
* config/i386/x86-tune-costs.h (c86_4g_m7_cost): Increase
512-bit load/store and unaligned load/store costs.
gcc/testsuite/ChangeLog:
* gcc.target/i386/c86-4g-m7-vect-load-cost-reduc.c: New test.
Xin Liu [Wed, 17 Jun 2026 03:28:39 +0000 (03:28 +0000)]
i386: Fix some c86-4g-m7 reservations
Fix several existing c86-4g-m7 scheduling reservation issues.
The fixes correct decode unit selection, branch and call execution units,
missing store resources for store or load/store forms, and several FPU
pipeline resource descriptions. They also rename a few reservations so the
template names better match the instructions they cover, and simplify
duplicate memory attribute checks.
gcc/ChangeLog:
* config/i386/c86-4g-m7.md (c86_4g_m7_imov_xchg): Adjust
reservation units.
(c86_4g_m7_imov_xchg_load): Ditto.
(c86_4g_m7_call): Ditto.
(c86_4g_m7_branch): Ditto.
(c86_4g_m7_branch_load): Ditto.
(c86_4g_m7_fp_spc_direct): Add missing store unit.
(c86_4g_m7_sse_pinsr_reg): Adjust reservation units.
(c86_4g_m7_avx512_insertx_ymm): Ditto.
(c86_4g_m7_avx512_insertx_ymem): Ditto.
(c86_4g_m7_avx512_insertx_zxmm): Ditto.
(c86_4g_m7_avx512_insertx_zxmem): Ditto.
(c86_4g_m7_avx512_abs_load): Add missing store unit.
(c86_4g_m7_avx_sign): Use combined FPU reservation.
(c86_4g_m7_avx_sign_load): Ditto.
(c86_4g_m7_avx_aes): Ditto.
(c86_4g_m7_avx_aes_load): Ditto.
(c86_4g_m7_extr_load): Rename to ...
(c86_4g_m7_extr_store): ... this and restrict to store memory.
(c86_4g_m7_avx_imul): Use combined FPU reservation.
(c86_4g_m7_avx_imul_mem): Ditto.
(c86_4g_m7_avx512_vpmovx_y_load): Add missing store unit.
(c86_4g_m7_avx_vpmovx_xx_load): Ditto.
(c86_4g_m7_avx512_sseadd_maxmin_xy): Rename to ...
(c86_4g_m7_avx512_sseadd_maxmin): ... this and simplify
memory attribute check.
(c86_4g_m7_avx512_sseadd_maxmin_xy_load): Rename to ...
(c86_4g_m7_avx512_sseadd_maxmin_load): ... this and simplify
memory attribute check.
(c86_4g_m7_avx512_sseadd_xy): Rename to ...
(c86_4g_m7_avx512_sseadd): ... this.
(c86_4g_m7_avx512_sseadd_xy_load): Rename to ...
(c86_4g_m7_avx512_sseadd_load): ... this.
(c86_4g_m7_sse_sseiadd_sadbw): Use combined FPU reservation.
(c86_4g_m7_sse_sseiadd_sadbw_mem): Ditto.
(c86_4g_m7_avx512_ssecmp_vp_z): Adjust reservation units.
(c86_4g_m7_avx512_ssecmp_vp_z_load): Ditto.
(c86_4g_m7_avx512_ssecmp_test_load): Ditto.
(c86_4g_m7_avx512_mskmov_k_m): Adjust latency.
Robin Dapp [Wed, 17 Jun 2026 04:40:34 +0000 (22:40 -0600)]
[PATCH] RISC-V: Fix more scalar mode_idx instances [PR125478].
Hi,
This is another case of PR123022 and PR116149 where we query a scalar source
operand instead of a vector operand, leading to a wrong AVL during avlprop.
This patch moves viwalu, vfwalu, viwmul, and vfwmul to the proper
bucket. I wonder why I didn't do that the last two times but it seems
to be the correct choice now 🙂
Regtested on rv64gcv_zvl512b and waiting for the CI.
Regards
Robin
PR target/125478
gcc/ChangeLog:
* config/riscv/vector.md: Set widen-alu mode_idx to 3.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr125478.c: New test.
Jeff Law [Wed, 17 Jun 2026 02:34:49 +0000 (20:34 -0600)]
[committed][RISC-V] Tighten and xfail test for pr120811
So the test for pr120811 has actually been failing for several months, but the
scan-asm regexp was too general in what it accepted and as a result we weren't
aware it was failing. What's a bit weird, is it looks like post-reload combine
just never tries the relevant insns, though it does look like a viable
post-reload combine opportunity.
Then changes in the last day or so scrambled the output even more to the point
where the faulty scan-asm wouldn't match and the regression has come to light.
This adds the missing escapes to the scan-asm. I've confirmed it works with
the old (desired) code generation and it fails with the change from a couple
months ago which seems to be preventing the post-reload combine. It xfails the
test as well.
I'll get a bug opened to track the performance regression.
gcc/testsuite
* gcc.target/riscv/pr120811.c: Add missing escapes and xfail.
Andrew MacLeod [Thu, 11 Jun 2026 17:16:13 +0000 (13:16 -0400)]
Pointer global ranges of nonzero are no longer invariant.
To preserve processing, the cache did not track pointer ranges any more
once it was set to non-zero. With prange now tracking pointers, any
conditional may provide points-to info to a nonzero range, so we
must track these values now.
* gimple-range-cache.cc (ranger_cache::set_global_range): Nonzero
pointer ranges are no longer invariant.
Andrew MacLeod [Mon, 1 Jun 2026 18:28:33 +0000 (14:28 -0400)]
Switch VRP to use prange.
Use the points-to info in prange instead of the value-range-equiv side
table in VRP. Adjust testcases:
- PTA causes fewer mem*_chk routines to be generated.
- Gimple fold is causing issues by not properly handling pointer_plus.
See PR 123160. XFAILing until resolved.
- Adding noipa prevents VRP from propagating IPA info causing a failure.
gcc/
* tree-vrp.cc (rvrp_folder::value_of_expr): Use prange PTA info.
(value_on_edge): Likewise.
Jose E. Marchesi [Tue, 16 Jun 2026 20:52:14 +0000 (22:52 +0200)]
a68: fix deduplication of in-MOIF modes
Of course I got the in-MOIF deduplication of modes wrong due to a
stupid thinko. This patch fixes the thinko and also adds a little
extra sanity check.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-imports.cc (a68_replace_equivalent_mode): Get a moif and
check extracts.
(a68_open_packet): Fix deduplication of in-moif modes and move
extract replacement code to a68_replace_equivalent_mode.
* a68-low-moids.cc (a68_lower_moids): Check that all known modes
have an associated ctype as part of the sanity checks.
H.J. Lu [Fri, 8 May 2026 04:20:02 +0000 (12:20 +0800)]
SSP: Check UINTPTR_TYPE to get uintptr_t type
default_stack_protect_guard calls
lang_hooks.types.type_for_mode (ptr_mode, 1);
to get an integer type for __stack_chk_guard which is declared as a
global symbol of type uintptr_t. For 32-bit systems, uintptr_t may
be either unsigned int or unsigned long int. On 32-bit Darwin, we get
$ cat /tmp/x.c
__UINTPTR_TYPE__ __stack_chk_guard = 0x1000;
$ ./xgcc -B./ -S /tmp/x.c -m32
/tmp/x.c:1:18: error: conflicting types for ‘__stack_chk_guard’; have ‘long unsigned int’
1 | __UINTPTR_TYPE__ __stack_chk_guard = 0x1000;
| ^~~~~~~~~~~~~~~~~
cc1: note: previous declaration of ‘__stack_chk_guard’ with type ‘unsigned int’
$
since lang_hooks.types.type_for_mode returns unsigned int while Darwin's
uintptr_t is unsigned long int. Update default_stack_protect_guard to
call unsigned_integer_tree_node_for_type with UINTPTR_TYPE to get unsigned
integer type for uintptr_t instead.
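
The conflict above arises because 'unsigned int' and 'unsigned long int'
are distinct C types even on targets where both are 32 bits wide. A
minimal C11 sketch that reports which standard type uintptr_t maps to on
the current target; the macro and function names are made up for
illustration and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* uintptr_t is an unsigned type on every conforming implementation.  */
static_assert((uintptr_t)-1 > 0, "uintptr_t is unsigned");

/* Select a name by type, not by width: unsigned int and unsigned long
   stay separate entries even when they have the same size.  */
#define UNSIGNED_TYPE_NAME(x) _Generic((x),        \
    unsigned int: "unsigned int",                  \
    unsigned long: "unsigned long",                \
    unsigned long long: "unsigned long long",      \
    default: "other")

/* Returns the name of the type behind uintptr_t on this target.  */
const char *uintptr_type_name(void)
{
    return UNSIGNED_TYPE_NAME((uintptr_t)0);
}
```

A declaration of __stack_chk_guard only matches the user's declaration if
it uses the type this function names, which is the point of consulting
UINTPTR_TYPE rather than picking a type by mode.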
gcc/
PR c/125226
* targhooks.cc (default_stack_protect_guard): If UINTPTR_TYPE
isn't NULL, call unsigned_integer_tree_node_for_type with
UINTPTR_TYPE to get unsigned integer type for uintptr_t.
* tree.cc (unsigned_integer_tree_node_for_type): New function.
(build_common_tree_nodes): Call unsigned_integer_tree_node_for_type with
SIZE_TYPE to get unsigned integer type for size_t.
* tree.h (unsigned_integer_tree_node_for_type): New prototype.
Robert Dubner [Tue, 16 Jun 2026 18:26:24 +0000 (14:26 -0400)]
cobol: Simplify tree type determination; eliminate file-static variables.
Use tree_type_from_field() and tree_type_from_refer() to replace a number
of similar routines. Replace a number of specific file-static
variable definitions with non-specific variable definitions.
Repair calculation of DIVIDE ... GIVING REMAINDER where parameters are
negative.
* gmath.cc (multiply_int256_by_int64): Comment.
(int256_as_decimal): Handle negative values.
(__gg__dividef45): Repair remainder logic when numeric-display
parameters are negative.
gcc/testsuite/ChangeLog:
* cobol.dg/group2/DIVIDE_binary-long_giving_remainder.cob: New test.
* cobol.dg/group2/DIVIDE_binary-long_giving_remainder.out: New test.
* cobol.dg/group2/DIVIDE_numeric-display_giving_remainder.cob: New test.
* cobol.dg/group2/DIVIDE_numeric-display_giving_remainder.out: New test.
* cobol.dg/group2/GCC_125616_RT3601_-_IBM_incorrect_DISPLAY_of_COMP-1_COMP-2_float.cob:
New test.
* cobol.dg/group2/PR39_RT3573_-_Parser_issue_w_concat._source_lines.cob: New test.
Jakub Jelinek [Tue, 16 Jun 2026 20:24:57 +0000 (22:24 +0200)]
c++, libcpp: Add -std=c++2[9d] and -std=gnu++2[9d] options
The following patch adds new -std= option arguments for C++29 and
changes __cplusplus value for C++26 from 202400L to 202603L.
2026-06-16 Jakub Jelinek <jakub@redhat.com>
gcc/
* doc/invoke.texi (-std=c++2d, -std=c++29, -std=gnu++2d,
-std=gnu++29): Document.
(-std=c++2c, -std=c++26, -std=gnu++2c, -std=gnu++26): Tweak so that
26 comes first and is meant as the supported option.
* doc/cpp.texi: Adjust expected __cplusplus value for -std=c++26,
document value for -std=c++29.
* dwarf2out.cc (highest_c_language): Handle C++29.
(gen_compile_unit_die): Likewise. Adjust lversion for C++26.
gcc/c-family/
* c-common.h (enum cxx_dialect): Add cxx29.
* c-opts.cc (set_std_cxx29): New function.
(c_common_handle_option): Handle -std=c++2d, -std=c++29, -std=gnu++2d
and -std=gnu++29.
* c.opt (std=c++2d, std=c++29, std=gnu++2d, std=gnu++29): New.
(std=c++2c, std=c++26, std=gnu++2c, std=gnu++26): Move Undocumented
to the 2c cases, tweak description.
gcc/testsuite/
* g++.dg/cpp26/cplusplus.C: Expect __cplusplus == 202603L.
* g++.dg/cpp29/cplusplus.C: New test.
* g++.dg/cpp29/feat-cxx29.C: New test.
* lib/target-supports.exp (check_effective_target_c++26): Also
true for C++29.
(check_effective_target_c++26_down, check_effective_target_c++29_only,
check_effective_target_c++29): New procedures.
* lib/g++-dg.exp (g++-std-flags): Replace 26 with 29 in the list
and add 26 to the end.
libcpp/
* include/cpplib.h (enum c_lang): Add CLK_GNUCXX29 and CLK_CXX29.
* init.cc (lang_defaults): Add CLK_GNUCXX29 and CLK_CXX29 row.
(cpp_init_builtins): Change __cplusplus value for C++26 from 202400L
to 202603L and set __cplusplus value for C++29 to 202700L.
James K. Lowden [Tue, 16 Jun 2026 15:15:46 +0000 (11:15 -0400)]
cobol: Use cdf namespace in CDF Bison parser output.
Fixes PR 119215 LTO error by removing enumerated types and other
names from the global namespace, and by deleting use of YYLTYPE. A
single location type cbl_loc_t is used by both parsers.
Also fixes PR 122466 using conditional compilation for non-POSIX
symbols.
Require Bison 3.8.2 for C++ output. The generated C++ uses
iostreams to format error messages. The cdf.y file includes the
iostream header and does not use any GCC header files.
gcc/cobol/ChangeLog:
* Make-lang.in: Report Bison version.
* cbldiag.h (defined): Define possibly missing macros.
(ATTRIBUTE_GCOBOL_DIAG): Define if missing.
(ATTRIBUTE_PRINTF_1): Same.
(ATTRIBUTE_PRINTF_3): Same.
(yyerror): Remove.
(struct YYLTYPE): Remove.
(enum cbl_gcobol_feature_t): Relocate from symbols.h.
(YYLTYPE_IS_DECLARED): Remove.
(YYLTYPE_IS_TRIVIAL): Remove.
(cobol_location): Relocate.
(cobol_gcobol_feature_set): Declare.
(struct YDFLTYPE): Remove.
(enum cbl_call_convention_t): Relocate from symbols.h.
(YDFLTYPE_IS_DECLARED): Remove.
(YDFLTYPE_IS_TRIVIAL): Remove.
(current_call_convention): Declare.
(cdf_push): Declare.
(cdf_push_call_convention): Declare.
(cdf_push_current_tokens): Declare.
(cdf_push_dictionary): Declare.
(cdf_push_enabled_exceptions): Declare.
(cdf_push_source_format): Declare.
(cdf_pop): Declare.
(cdf_pop_call_convention): Declare.
(cdf_pop_current_tokens): Declare.
(cdf_pop_dictionary): Declare.
(cdf_pop_source_format): Declare.
(cdf_pop_enabled_exceptions): Declare.
(current_program_index): Declare.
(struct cbl_loc_t): Derive from cbl_loc_base_t.
(struct cbl_loc_base_t): Define.
(cbl_err): Declare.
(cbl_errx): Declare.
(error_msg): Use cbl_loc_t.
(warn_msg): Same.
(cbl_unimplemented_at): Same.
(gcc_location_set): Same.
* cdf.y: Require Bison 3.8.2 and generate C++ in cdf namespace.
* cdfval.h (struct YDFLTYPE): Remove.
(struct cbl_loc_t): Forward declaration.
(struct cdfval_base_t): User-defined conversion from derived.
* copybook.h (gcc_assert): Use assert(3) within cdf.y.
(gcc_unreachable): Declare within cdf.y.
(class copybook_elem_t): Use cbl_loc_t.
(CTOUPPER): Use toupper(3) in uppername_t helper.
(TOUPPER): Same.
(class copybook_t): Use cbl_loc_t.
* exceptg.h (struct cbl_label_t): Declare cbl_label_t.
* gcobc: Support -fno-ec.
* gcobol.1: Reword -fsyntax-only slightly.
* genapi.cc (parser_label_label): Use cobol_location().
* lang.opt: Add comment in re lang.opt.urls.
* lexio.cc (struct replacing_term_t): Use cbl_loc_t.
(location_in): Same.
(parse_copy_directive): Same.
* lexio.h (struct filespan_t): Same.
* messages.cc (cbl_message): Same.
* parse.y: Same, and propagate location variously.
* parse_ante.h (current_data_section_set): Use cbl_loc_t.
(namcpy): Same.
(reject_refmod): Same.
(require_pointer): Same.
(require_integer): Same.
(ast_op): Same.
(perform_tgt_set): Same.
(label_add): Same.
(paragraph_reference): Same.
(tee_up_name): Same.
(ast_inspect): Same.
(ast_enter_section): Same.
(ast_enter_paragraph): Same.
(prototype_add): Same.
(verify_args): Same.
(subscript_dimension_error): Same.
(literal_subscripts_valid): Same.
(literal_refmod_valid): Same.
(struct cbl_fieldloc_t): Remove.
(intrinsic_call_1): Use cbl_loc_t.
(symbol_find): Same.
(valid_redefine): Same.
(field_add): Same.
(field_type_update): Same.
(field_capacity_error): Same.
(field_alloc): Same.
(file_add): Same.
(alphabet_add): Same.
(set_real_from_capacity): Same.
(procedure_division_ready): Same.
(file_section_fd_set): Same.
(ast_call): Same.
(field_binary_usage): Same.
(ast_end_program): Same.
(cobol_location): Same.
(location_set): Same.
(statement_begin): Same.
(ast_first_statement): Same.
* scan.l: Qualify tokens with new cdf namespace.
* scan_ante.h (ydfparse): Now static.
(cdf_context): Declare.
(ydfltype_of): Remove.
(update_location): Use cbl_loc_t.
(reset_location): Same.
(YY_USER_INIT): Same.
(class picture_t): Same.
* scan_post.h (ydfparse): Wrapper for cdf_parser::parse() method.
(ydfchar): Lookahead helper.
(ydfdebug): Same.
(run_cdf): Clearer debug messages.
(struct pending_token_t): Renamed to recent_token_t.
(struct recent_token_t): As above.
(PENDING): Renames to RECENT.
(RECENT): As above.
(next_token): Removed.
(recent_tokens_t): Capture abandoned lookahead tokens.
(prelex): Use recent_tokens queue.
(yylex): Drop normal parsing idea.
* symbols.cc (symbol_field_location): Use cbl_loc_t.
(symbol_alphabet): Same.
(cbl_alphabet_t::cbl_alphabet_t): Same.
(cbl_alphabet_t::assign): Same.
(cbl_alphabet_t::also): Same.
(symbol_temporary_location): Same.
(cbl_field_t::encode): Same.
* symbols.h (enum cbl_gcobol_feature_t): Remove.
(cobol_gcobol_feature_set): Remove.
(struct cbl_field_t): Use cbl_loc_t.
(struct cbl_refer_t): Same.
(struct cbl_alphabet_t): Same.
(struct cbl_perform_tgt_t): Same.
(struct cbl_nameloc_t): Same.
(class name_queue_t): Same.
(tee_up_name): Same.
(symbol_field_location): Same.
(enum cbl_call_convention_t): Remove.
(class current_tokens_t): Use cbl_loc_t.
(current_call_convention): Same.
(gcc_location_set): Same.
* util.cc (class cdf_directives_t): Use cbl_loc_t.
(cdf_unreachable): Define as gcc_unreachable.
(cdf_literalize): Do not handle location.
(cdf_file): New function.
(cdf_file_index): Same.
(cdf_file_name): Same.
(cdf_add_field): Same.
(cbl_field_t::encode_numeric): Remove unused parameter.
(cbl_field_t::report_invalid_initial_value): Use cbl_loc_t.
(match_proc): Namespace for local prototype-verification functions.
(DUMP_PROCEDURE_CALLS): Guard macro for debug function.
(procedure_calls_dump): New function to show uses of PERFORM.
(gcc_location_set_impl): Use cbl_loc_t.
(gcc_location_set): Same.
(class temp_loc_t): Same.
(error_msg): Same.
(warn_msg): Same.
(ydfdebug): Same.
(cobol_set_debugging): Same.
(cbl_unimplemented_at): Same.
* util.h (cbl_err): Remove declaration.
(cbl_errx): Same.
(cdf_push): Same.
(cdf_push_call_convention): Same.
(cdf_push_current_tokens): Same.
(cdf_push_dictionary): Same.
(cdf_push_enabled_exceptions): Same.
(cdf_push_source_format): Same.
(cdf_pop): Same.
(cdf_pop_call_convention): Same.
(cdf_pop_current_tokens): Same.
(cdf_pop_dictionary): Same.
(cdf_pop_source_format): Same.
(cdf_pop_enabled_exceptions): Same.
libgcobol/ChangeLog:
* posix/shim/open.cc (defined): Honor _GNU_SOURCE and _POSIX_C_SOURCE.
Marek Polacek [Fri, 12 Jun 2026 19:46:10 +0000 (15:46 -0400)]
c++: bogus error with meta::members_of [PR125770]
mark_used produces the
use of built-in parameter pack '__integer_pack' outside of a template
error even when called with tf_none (in this case, from
resolve_type_of_reflected_decl).
I played with the idea of skipping all fndecl_built_in_p functions
in resolve_type_of_reflected_decl but it doesn't seem to make any
reasonable difference on compile time.
PR c++/125770
gcc/cp/ChangeLog:
* decl2.cc (mark_used): Check complain & tf_error before giving
an error.
Marek Polacek [Mon, 1 Jun 2026 21:32:38 +0000 (17:32 -0400)]
c++: ICE in tsubst_expr with comma expr [PR125539]
Here we crash because a TARGET_EXPR gets into tsubst_expr with something
like:
template<typename>
constexpr static A a = (A{}, A{});
Pre r14-4796, when we had tsubst_copy, we didn't crash because we had
an early exit when substituting with args=NULL_TREE. tsubst_expr
deliberately doesn't have that early exit.
Note that
template<typename>
constexpr static A a = A{};
works because expand_aggr_init_1 sees a COMPOUND_LITERAL_P and does the
early exit without calling expand_default_init. But with a COMPOUND_EXPR
we don't take that path.
The TARGET_EXPR is created via check_initializer -> build_aggr_init_full_exprs
-> build_aggr_init -> expand_aggr_init_1 -> expand_default_init -> ocp_convert
-> build_cplus_new -> build_target_expr. I tried adjusting the big
and ugly check in check_initializer before build_aggr_init_full_exprs
but that didn't work out. I also tried avoiding the call to ocp_convert
but that broke some reflection tests.
We can fix this by calling perform_implicit_conversion in a template to
create an IMPLICIT_CONV_EXPR, then build_cplus_new won't create a TARGET_EXPR.
PR c++/125539
gcc/cp/ChangeLog:
* cvt.cc (ocp_convert): In a template, always call
perform_implicit_conversion. Pass flags to
perform_implicit_conversion_flags.
* decl.cc (check_initializer): Remove a call to
build_implicit_conv_flags.
This patch fixes a mistake in gather/scatter discovery. In the third
"phase" we check for a larger offset type for the needed scaling but
fail to let the vectorizer know it. The patch just sets
supported_offset_vectype and also skips costing an instruction in case
the necessary conversion is a nop.
Bootstrapped and regtested on x86, power10, and aarch64.
Regtested on riscv64.
Regards
Robin
PR tree-optimization/125516
gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Set
supported_offset_vectype.
* tree-vect-stmts.cc (vectorizable_store): Skip nop conversions
when costing scatters.
(vectorizable_load): Ditto for gathers.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr125516.c: New test.
Abhishek Kaushik [Tue, 16 Jun 2026 13:43:50 +0000 (07:43 -0600)]
[PATCH] tree-optimization: Query ranger on edge for niter bound expressions
determine_value_range only queried ranger when VAR was an SSA_NAME.
However, number_of_iterations_exit_assumptions calls
expand_simple_operations on the IV bases before asking for the range.
This can expose simple GENERIC expressions, including conversions, even
when the original value had an SSA_NAME with useful range information.
Ranger can analyze such expressions at a program point or edge, so query
it for integral bound expressions on the loop preheader edge rather than
restricting the query to SSA_NAMEs. This lets niter analysis recover the
narrower range for value-preserving conversions such as uint8_t to int,
while still handling non-value-preserving conversions and wrapping
expressions conservatively.
The new test covers a direct converted uint8_t bound, a masked uint16_t
bound whose useful range comes from ranger, a guarded uint16_t bound
whose range is context-sensitive, a non-value-preserving uint8_t to
int8_t conversion, and a wrapping unsigned expression.
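
The kinds of bounds listed above can be illustrated with a few small
loops. This is a sketch only: the function names and loop bodies are
invented here and are not taken from niter-convert-range.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value-preserving conversion: n is promoted from uint8_t to int in the
   comparison, so the bound lies in [0, 255] and the loop runs at most
   255 times.  */
int sum_u8_bound(const int *a, uint8_t n)
{
    assert(a != NULL);
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Masked bound: the uint16_t type alone allows [0, 65535], but the mask
   narrows the bound to [0, 15].  */
int sum_masked_bound(const int *a, uint16_t n)
{
    assert(a != NULL);
    int bound = n & 15;
    int sum = 0;
    for (int i = 0; i < bound; i++)
        sum += a[i];
    return sum;
}

/* Non-value-preserving conversion: uint8_t to int8_t.  The result for
   n > 127 is implementation-defined; GCC wraps it to a negative value,
   so the [0, 255] range of n must not be reused for the bound.  */
int count_s8_bound(uint8_t n)
{
    int8_t bound = (int8_t)n;
    int count = 0;
    for (int i = 0; i < bound; i++)
        count++;
    return count;
}
```

In the first two functions a tighter iteration bound is available from the
range of the converted expression; in the third the analysis has to stay
conservative.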
Bootstrapped and regression tested on aarch64-unknown-linux-gnu.
gcc/ChangeLog:
* tree-ssa-loop-niter.cc (determine_value_range): Query ranger
for integral expressions, not only SSA_NAMEs. Query ranges on the
loop preheader edge.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve2/niter-convert-range.c: New test.
Kyrylo Tkachov [Tue, 16 Jun 2026 10:16:42 +0000 (12:16 +0200)]
vect: consult the alias oracle for unanalyzable BB SLP dependences
vect_slp_analyze_data_ref_dependence conservatively reported a dependence
whenever the classical (affine) data-dependence test returned chrec_dont_know,
e.g. when one of the accesses has a non-affine or runtime array subscript. In
the BB SLP region check this is overly pessimistic: the unanalyzable subscript
says nothing about whether the two references can actually alias, and the alias
oracle can frequently still prove they cannot (distinct restrict parameters,
distinct non-escaping objects, and so on). When that happens a perfectly good
SLP group is torn down. The motivating case is the deal.II
VectorizedArray<double,N> reciprocal in SPEC CPU 2026 766.femflow_r.
The store-sink and load-hoist walkers already fall back to the alias oracle
(stmt_may_clobber_ref_p_1 / ref_maybe_used_by_stmt_p) for statements that have
no single recorded data reference. Extend that fallback to the chrec_dont_know
case: vect_slp_analyze_data_ref_dependence now returns a three-way result
(chrec_known when the references are provably independent, chrec_dont_know when
the affine test cannot analyze them, and the dependence otherwise) so each
caller can tell "unknown" apart from "dependent", and on "unknown" runs the same
oracle query it already uses for the no-data-reference case, with the TBAA
setting appropriate to what is being moved: no TBAA when sinking a store (a
moving store may change the dynamic type), TBAA when hoisting a load.
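
A small sketch of the kind of code this concerns, assuming a two-lane
shape for brevity. The function is hypothetical and is not the new
testcase; whether GCC handles this exact function as described is not
claimed here.

```c
#include <assert.h>

/* The loads in[idx[0]] and in[idx[1]] use runtime subscripts, so an
   affine dependence test has nothing to say about them relative to the
   stores.  The restrict qualifiers still promise that out does not
   overlap in or idx, which is the kind of fact an alias query can use.  */
void recip_gather2(double *restrict out, const double *restrict in,
                   const int *restrict idx)
{
    assert(idx[0] >= 0 && idx[1] >= 0);
    out[0] = 1.0 / in[idx[0]];
    out[1] = 1.0 / in[idx[1]];
}
```

Grouping the two stores means moving the store to out[0] past the loads
that feed out[1], which is only valid if those loads cannot read out[0].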
On the new gcc.dg/vect/bb-slp-dep-oracle.c the 8-lane reciprocal group is torn
down and emitted scalar without the patch and vectorizes to four vector
divides with it.
Bootstrapped and tested on aarch64-none-linux-gnu.
* tree-vect-data-refs.cc (vect_slp_analyze_data_ref_dependence):
Return a three-way tree result (chrec_known when independent,
chrec_dont_know when the affine test cannot analyze the pair, the
dependence otherwise) instead of a bool.
(vect_slp_analyze_store_dependences): Resort to the alias oracle on
an unknown dependence as well as on a missing data reference; a
store is being moved so do not use TBAA.
(vect_slp_analyze_load_dependences): Likewise on the load-hoist
paths, using TBAA as a load is being hoisted; also record that the
ao_ref has been initialized in check_hoist.
Robin Dapp [Tue, 16 Jun 2026 12:54:48 +0000 (06:54 -0600)]
[PATCH] RISC-V: Fix splitter [PR125670].
Hi,
In the PR we ICE during vsetvl, expecting a register in the VL operand
slot which only contains an immediate 4. Non-VLMAX insns with immediate
length have a NULL_RTX in that slot.
However, during a split, we erroneously use operand[5] instead of
operand[6]. operand[5] is the mask policy and happened to be "1".
"1" indicates a VLMAX insn in the avl_type operand. This caused the
wrong turn in vsetvl.
The patch just corrects the operand number.
Regtested on rv64gcv_zvl512b. Going to wait for the CI.
Regards
Robin
PR target/125670
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Use avl_type operand number.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr125670.c: New test.
Rainer Orth [Tue, 16 Jun 2026 12:04:04 +0000 (14:04 +0200)]
testsuite: Limit shared memory use of coarray tests [PR125584]
As discussed in PR testsuite/125584, the gfortran.dg/coarray tests with
-lcaf_shmem require excessive amounts of backing store for their shared
memory images: for 64-bit tests, each test uses 4 GB images.
On Solaris, the mapped files reside in /tmp/.LIBRT/SHM, which is tmpfs.
During parallel testing, several such tests can run in parallel, leading
to excessive VM use, in particular on targets like Solaris that don't
use lazy allocation. This is unacceptable since the testsuite defaults
need to be safe for all targets.
The tests PASS just fine when limiting GFORTRAN_SHARED_MEMORY_SIZE to 2M
instead. To allow for testing with either the default or a different
size, setting GCC_TEST_RUN_EXPENSIVE to a non-empty string allows
overriding this.
Tested on amd64-pc-solaris2.11 and x86_64-pc-linux-gnu.
Richard Biener [Thu, 11 Jun 2026 13:51:02 +0000 (15:51 +0200)]
tree-optimization/125730 - avoid losing track of base in IVOPTs
The PR highlights several places where IVOPTs gets lost in tracking
an appropriate base to use for a TARGET_MEM_REF. The following
addresses the initial place, enough to fix the bug, but already
showing difficulties in avoiding fallout. The first place is
alloc_iv where we convert the nice pointer IV to an unsigned integer
before applying affine canonicalization.
Removing that resolves the PR and with the fold_plusminus_mult_expr
change tests without regressions on x86_64.
PR tree-optimization/125730
* tree-ssa-loop-ivopts.cc (alloc_iv): Do not convert pointer
IVs to unsigned before canonicalizing.
Richard Biener [Mon, 15 Jun 2026 09:29:08 +0000 (11:29 +0200)]
Improve fold_plusminus_mult_expr for 64bit and larger types
The following enhances fold_plusminus_mult_expr to catch
a * -4U + b * 4 and factor it as (a * -1u + b) * 4. This
does not work currently because we're using HOST_WIDE_INT
arithmetic. Switch that to wide_int, which makes the
folding apply more consistently.
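
The factoring is an identity in modular unsigned arithmetic, which a
short sketch can check. Using uint64_t here is an assumption that follows
the commit title's focus on 64-bit types; the function names are
illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* As a 64-bit unsigned value, -4 is 2^64 - 4.  */
static_assert((uint64_t)-4 == UINT64_C(0xFFFFFFFFFFFFFFFC),
              "-4 wraps modulo 2^64");

/* a * -4 + b * 4, computed modulo 2^64.  */
uint64_t unfactored(uint64_t a, uint64_t b)
{
    return a * (uint64_t)-4 + b * 4;
}

/* (a * -1 + b) * 4, the factored form, also modulo 2^64.  */
uint64_t factored(uint64_t a, uint64_t b)
{
    return (a * (uint64_t)-1 + b) * 4;
}
```

Both functions return the same value for every a and b, because
multiplication distributes over addition modulo 2^64.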
For gcc.dg/loop-versioning-13.c the heuristics in
gimple-loop-versioning.cc get confused as they fail to
truncate some computations. I did not investigate further.
* fold-const.cc (fold_plusminus_mult_expr): Use
wide_int for the case of two INTEGER_CST multiplicands.
* gcc.dg/loop-versioning-13.c: XFAIL one transform for ilp32.
* gcc.dg/pr109393.c: Remove XFAIL for ilp32.
Kyrylo Tkachov [Tue, 16 Jun 2026 07:58:28 +0000 (00:58 -0700)]
aarch64: Fix wrong code for high-64-zero Advanced SIMD constants [PR125794]
r17-1491-gf152cf1734f808 (PR113926) taught aarch64_simd_valid_imm to
materialize a 128-bit Advanced SIMD MOV constant whose high 64 bits are
zero with a 64-bit MOVI/FMOV, which zeroes the upper half of the
register. It records this with simd_immediate_info::width == 64
(output_width).
However, when the low 64 bits are not themselves a valid Advanced SIMD
(MOVI/MVNI/FMOV) immediate, the function fell through to the SVE
immediate forms (aarch64_sve_valid_immediate). Those use a replicating
"mov zN.<T>, #imm", which sets the whole vector, including the high 64
bits that were required to be zero, to the repeated low-64-bit value.
For e.g. the V4SI constant { 0, 1, 0, 0 } this emitted
instead of the intended { 0, 1, 0, 0 }, producing wrong code.
Fix it by not falling through to the SVE forms when output_width is set:
such a constant must be formed by a 64-bit Advanced SIMD MOVI/FMOV
(handled by the Advanced SIMD and floating-point paths just above) or
not at all, in which case the caller materializes it some other way
(e.g. a literal-pool load), which is the pre-r17-1491 behavior for these
constants.
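
The difference between the two kinds of move can be modelled in plain C.
This is a sketch under the assumption that the replicating form repeats a
64-bit element; the function names are invented and nothing here is taken
from aarch64.cc.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit Advanced SIMD move: the low 64 bits are written and
   the high 64 bits of the 128-bit register become zero.  */
void mov64_zero_high(const uint32_t lo[2], uint32_t out[4])
{
    assert(lo && out);
    out[0] = lo[0];
    out[1] = lo[1];
    out[2] = 0;
    out[3] = 0;
}

/* Model of a replicating move: the same 64-bit value fills every 64-bit
   element, including the high half.  */
void mov_replicate64(const uint32_t lo[2], uint32_t out[4])
{
    assert(lo && out);
    out[0] = lo[0];
    out[1] = lo[1];
    out[2] = lo[0];
    out[3] = lo[1];
}
```

For low lanes { 0, 1 } the first model leaves the high lanes at zero while
the second copies the low lanes into them, so only the first is a valid
way to form a constant whose high 64 bits must be zero.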
The PR113926 optimization is unaffected: it only applies when the
Advanced SIMD or floating-point path accepts the low 64 bits, and those
still return true before the new check.
Bootstrapped and regression-tested on aarch64-linux-gnu.
Pushing to trunk.
PR target/125794
* config/aarch64/aarch64.cc (aarch64_simd_valid_imm): Do not fall
through to the replicating SVE immediate forms for a 128-bit
Advanced SIMD constant whose high 64 bits are zero (output_width
!= 0).
gcc/testsuite/ChangeLog:
PR target/125794
* gcc.target/aarch64/sve/pr125794.c: New test.
There are certain patterns that are not recognized by the method
optimize_spaceship. For example,
a == b ? 0 : (a > b) ? 1 : -1;
is correctly recognized as a spaceship operator. But
a <= b ? -(a < b) : 1 or
a >= b ? (a > b) : -1
are not currently recognized.
This patch recognizes such patterns and chooses to emit the spaceship
optab if target supports it, which improves code-generation for such
targets.
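
That the variants compute the same three-way result can be checked
directly. A minimal sketch with illustrative function names; it says
nothing about how match.pd expresses the patterns.

```c
#include <assert.h>

/* The form that is already recognized.  */
int cmp3_canonical(int a, int b)
{
    return a == b ? 0 : (a > b) ? 1 : -1;
}

/* a <= b selects -(a < b): -1 when a < b, 0 when a == b.  */
int cmp3_le_variant(int a, int b)
{
    return a <= b ? -(a < b) : 1;
}

/* a >= b selects (a > b): 1 when a > b, 0 when a == b.  */
int cmp3_ge_variant(int a, int b)
{
    return a >= b ? (a > b) : -1;
}

/* Aborts if either variant disagrees with the canonical form.  */
void assert_variants_agree(int a, int b)
{
    int expected = cmp3_canonical(a, b);
    assert(cmp3_le_variant(a, b) == expected);
    assert(cmp3_ge_variant(a, b) == expected);
}
```

Calling assert_variants_agree on a less-than, an equal, and a
greater-than pair covers all three outcomes.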
gcc/ChangeLog:
PR tree-optimization/59429
* match.pd: New match patterns to recognize spaceship variants.
* tree-ssa-math-opts.cc (gimple_spaceship): Match function declaration.
(match_spaceship): New function to recognize spaceship given phi node.
(math_opts_dom_walker::after_dom_children): Add match_spaceship check.
gcc/testsuite/ChangeLog:
PR tree-optimization/59429
* lib/target-supports.exp (check_effective_target_spaceship): Add new
proc for spaceship optab. x86, aarch64 and s390 included.
* gcc.dg/spaceship_int_variants.c: New test.
* gcc.dg/spaceship_uint_variants.c: New test.
* gcc.dg/spaceship_mixed_variants.c: New test.
Zhongyao Chen [Fri, 12 Jun 2026 03:59:56 +0000 (11:59 +0800)]
RISC-V: Adjust testcase asm check for vx-[5|6]-i[8|16].c
After commit 9f8409f2e2c, SLP discovery can retry swapped operands for
commutative parents before falling back to an external scalar.
These tests can be vectorized again, so update asm check.
Roger Sayle [Mon, 15 Jun 2026 19:09:55 +0000 (20:09 +0100)]
i386: Tweak cost of SSE fabs/fneg in ix86_insn_cost.
This patch fixes a poor interaction between the splitters for SSE
floating point abs/neg in the i386 backend, and the late-combine pass.
Before reload, these patterns exist as a PARALLEL containing the USE
of a value (pseudo) holding the sign-bit. Currently late-combine
propagates this sign-bit mask from the constant pool, changing the
USE of a REG to the USE of a MEM. This unCSE is reasonable if this
MEM is used only once, but less than optimal if this MEM is accessed
many times.
The problem is that this USE doesn't currently have a cost in
ix86_insn_cost, so propagating this load from memory into the USE
makes it free (to combine's profitable replacement calculation).
This patch improves things by providing a nominal cost for USEs of
MEM.
As an example, consider the following function:
float x, y, z;
void foo()
{
x = -x;
y = -y;
z = -z;
}
Currently with -O2 GCC generates three loads from the constant pool:
Note this is one more instruction, but code size is smaller and
the total cost (as calculated by the i386 backend) is lower.
For a single neg/abs the memory address is still propagated.
2026-06-15 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.cc (ix86_insn_cost): Add a suitable penalty
for USE of a MEM in a PARALLEL (for *<absneg>[sd]f2_1 splitter).
gcc/testsuite/ChangeLog
* gcc.target/i386/fabsneg-2.c: New test case.
When a color group spans the whole register file (size == 32), as can
happen for a heavily unrolled, vectorized loop, "1U << 32" is undefined
and evaluates to 1 on AArch64 hosts, so the expression sets no bits at all.
The 32 FPRs of the group are therefore not recorded as allocated.
Subsequent colors (and broaden_colors) then reuse those registers, which
breaks the invariant that distinct colors receive disjoint FPRs.
In PR125795 this let the loop-invariant TBL permute index, which is live
across the whole loop, share v28 with the LD2 tuple destinations, so the
index was clobbered mid-loop and the loop produced wrong results.
Fix this by using a 64-bit shift base: unsigned long long is at least
64 bits on every host, so "1ULL << 32" is well-defined.
best + size <= 32 is guaranteed by the candidate search, which the patch
also asserts, so the result still fits in the 32-bit m_allocated_fprs
When the full-width group can no longer be hidden, allocate_colors correctly
fails to find a register for the other color and the region is left to the
real register allocator, matching -mearly-ra=none.
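A minimal sketch of the mask computation, assuming a helper of this
shape; fpr_group_mask is a hypothetical name and this is not the code
from the patch. It only shows why the 64-bit shift base matters when
size == 32.

```c
#include <stdint.h>

/* Hypothetical helper: build a mask with SIZE consecutive bits set,
   starting at bit BEST, for a 32-register file.  Returns 0 when the
   arguments do not satisfy 1 <= size and best + size <= 32.  */
uint32_t fpr_group_mask (unsigned best, unsigned size)
{
  if (size == 0 || size > 32 || best > 32 - size)
    return 0;
  /* unsigned long long is at least 64 bits wide, so a shift by 32 is
     well-defined here; "1U << 32" is undefined when unsigned int is
     32 bits wide.  */
  return (uint32_t) (((1ULL << size) - 1) << best);
}
```

With size == 32 and best == 0 this yields a mask with all 32 bits set,
where the 32-bit shift described above set none.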
Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk and later to the branches after testing.
Jose E. Marchesi [Mon, 15 Jun 2026 14:02:57 +0000 (16:02 +0200)]
a68: handle duplicated modes in module imports
In principle all the modes emitted by the module exporter are
deduplicated. However, in certain cases in which the same unions
result from unraveling, duplicates may occur.
This patch makes a68_open_packet deduplicate duplicate modes from
the same moif.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-imports.cc (INCLUDE_VECTOR): Define.
(a68_open_packet): Deduplicate within-moif modes while importing.
gcc/testsuite/ChangeLog
* algol68/execute/modules/module26.a68: New file.
* algol68/execute/modules/program-26.a68: New test.
[PATCH] match: For nonnegative X and Y, relax condition on X % Y < Y to true [PR125737]
From ae75421fd6c7d50e5b1e9aafea2ae3cbcd4ebc1c Mon Sep 17 00:00:00 2001
From: Kael Andrew Alonzo Franco <kaelfandrew@gmail.com>
Date: Sun, 14 Jun 2026 06:28:01 -0400
Subject: [PATCH] match: For nonnegative X and Y, relax condition on X % Y < Y to true [PR125737]
tree_expr_nonnegative_p covers TYPE_UNSIGNED (type) or when X and Y are known to be nonnegative.
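A small C check of the underlying fact, written for illustration only;
the function names are not from the patch. It skips Y == 0, where X % Y
is undefined in C, and shows why the sign of Y matters.

```c
/* Return 1 if x % y < y holds for every 0 <= x <= limit and
   0 < y <= limit.  y == 0 is skipped because x % 0 is undefined.  */
int mod_less_than_divisor_holds (int limit)
{
  for (int x = 0; x <= limit; x++)
    for (int y = 1; y <= limit; y++)
      if (!(x % y < y))
        return 0;
  return 1;
}

/* Evaluate x % y < y for one pair; y must be nonzero.  With a negative
   y the comparison can be false, e.g. 5 % -3 is 2 and 2 < -3 is false.  */
int mod_less_than_divisor (int x, int y)
{
  return x % y < y;
}
```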
Bootstrapped and tested on x86_64-pc-linux-gnu
PR tree-optimization/125737
gcc/ChangeLog:
PR tree-optimization/125737
* match.pd: Use tree_expr_nonnegative_p for X % Y < Y to true.
gcc/testsuite/ChangeLog:
PR tree-optimization/125737
* gcc.dg/pr125737.c: New test.
Andrew Pinski [Fri, 12 Jun 2026 18:40:35 +0000 (11:40 -0700)]
phiopt: reorganize factoring/cselim-limited for phiopt
This is in preparation for adding factoring out loads
for phiopt, where we want to loop over all 3 factoring
cases if one of them made a change so that a load elimination
might allow for a store elimination.
This moves the cs-elim limited loop and the factoring out
operation into the same loop.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-ssa-phiopt.cc (factor_out_all): New function.
(pass_phiopt::execute): Call factor_out_all instead
of factor_out_conditional_operation and
cond_if_else_store_replacement_limited.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
store_integral_bit_field: Graceful fallback when SUBREG narrowing fails [PR123754]
The multi-word narrowing path in store_integral_bit_field uses
simplify_gen_subreg followed by gcc_assert (op0). The symmetric path
in extract_integral_bit_field was switched to force_subreg, but the
store side was deliberately left on simplify_gen_subreg because op0
is an lvalue. When the subreg simplification fails (e.g. a vector
op0 punned through an int mode whose word-aligned subregs are rejected
by validate_subreg, as happens for V8SI on -mbig-endian aarch64),
the assert fires.
The avoid-store-forwarding pass (-favoid-store-forwarding) triggers
this: it routes such a vector op0 through store_integral_bit_field.
Replace the assert with a graceful fallback to store_split_bit_field,
mirroring the cross-word branch immediately above. No change for
inputs where the narrowing succeeds.
Tested on AArch64, x86-64 and PowerPC BE.
PR rtl-optimization/123754
gcc/ChangeLog:
* expmed.cc (store_integral_bit_field): When the SUBREG
narrowing fails, defer to store_split_bit_field instead of
asserting.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr123754.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-be-2.c: New test.
Pan Li [Thu, 4 Jun 2026 14:47:49 +0000 (22:47 +0800)]
RISC-V: Add testcase for unsigned scalar SAT_MUL form 12
The form 12 of unsigned scalar SAT_MUL has been supported since
the previous change. Thus, add the test cases to make sure
it works well.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat/sat_u_mul-13-u16.c: New test.
* gcc.target/riscv/sat/sat_u_mul-13-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-13-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-13-u8.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-13-u16.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-13-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-13-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-13-u8.c: New test.
RISC-V: Run smart multilib match even when generic matcher picked a dir
The generic textual matcher in gcc.cc:set_multilib_dir does not
understand RISC-V arch supersetting and treats MULTILIB_DEFAULTS
entries as if they were on the command line via default_arg(). When
the user passes a -march= that is a superset of one of MULTILIB_OPTIONS'
arches but does not textually match, default_arg can rescue the wrong
entry and pick a multilib that is not the closest match.
riscv_compute_multilib used to early-return whenever multilib_dir was
already set, accepting that incorrect generic pick. Drop the early
return and run the match-score-based selection unconditionally.
There is no need to fall back to the generic-matched multilib_dir
after the smart matcher runs: the default "." multilib is parsed into
multilib_infos with the compiler's default arch/abi, so the smart
matcher handles every case the generic matcher can reach. If it
still returns NULL the request is genuinely incompatible with all
configured multilibs and riscv_multi_lib_check fires the proper
"Cannot find suitable multilib" diagnostic instead of silently
linking against incompatible default-arch libraries.
With the pre-fix driver, "-march=rv64g_zba_zcmp_zcmt -mabi=lp64f"
selects the rv64gc multilib (textual default rescue); after the fix
the smart matcher correctly picks the rv64g_zcmp_zcmt multilib.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_select_multilib):
Don't set riscv_no_matched_multi_lib here; let the caller own
the flag.
(riscv_compute_multilib): Drop the early return that accepted
the generic-matched multilib_dir; always run the smart matcher
and set riscv_no_matched_multi_lib when it finds no candidate.
Filip Kastl [Mon, 15 Jun 2026 12:20:58 +0000 (14:20 +0200)]
toplev: Ask for 128MB stack when compiled with ASAN [PR124206]
64MB stack is not enough for running
gcc/testsuite/gcc.c-torture/compile/limits-exprparen.c with an
ASAN-instrumented GCC. Ask for more stack if GCC was compiled with ASAN
instrumentation.
PR sanitizer/124206
gcc/ChangeLog:
* gcc.cc (driver::global_initializations): Ask for 128MB stack
instead of just 64MB when __SANITIZE_ADDRESS__ is defined.
* toplev.cc (toplev::main): Ditto.
Rainer Orth [Mon, 15 Jun 2026 12:18:59 +0000 (14:18 +0200)]
libgcc: Fix _mcount on 32-bit Solaris/x86 [PR38239]
Profiling on 32-bit Solaris/x86 has been broken since
Save call-clobbered registers in _mcount on 32-bit Solaris/x86 (PR target/38239)
https://gcc.gnu.org/pipermail/gcc-patches/2016-March/444175.html
This was only recently noticed when setting up a Solaris/i386 binutils
buildbot.
Since internal_mcount is a regular function on Solaris, the selfpc and
frompcindex args need to be pushed to the stack. Besides, the patch fixes
a couple of warnings in gmon.c.
Bootstrapped without regressions on i386-pc-solaris2.11. With this
patch, the binutils gprof tests PASS.
ld: fatal: relocation error: R_SPARC_32: file c_lto_toplevel-extended-asm-1_1.o: symbol .text (section): value 0x100001340 does not fit
The same error has been present with gld all along, but was only
recently introduced in Solaris ld as discussed in binutils PR ld/25802.
It doesn't occur on Linux/sparc64 which uses a different text start
address.
Tested on sparcv9-sun-solaris2.11, sparc64-unknown-linux-gnu, and
x86_64-pc-linux-gnu.
Richard Biener [Mon, 15 Jun 2026 08:40:10 +0000 (10:40 +0200)]
Fix VEC_COND_EXPR matching with inverted compare
The following fixes detecting of VEC_COND_EXPR <cmp, {0,..}, {-1,...}>
which we recognize in ovce_extract_ops by inverting 'cmp'. But
after checking that the false value is {-1,...} we then continue
verifying it is also {0,...} which it of course is not. Fixed
by checking the true value in that case.
* tree-ssa-reassoc.cc (ovce_extract_ops): Fixup
false value matching for the inverted comparison case.
In 2-byte loops, don't force scratch regs into constraint "w".
Use a C function for asm output. It will emit SBIW if possible.
gcc/
* config/avr/avr-protos.h (avr_out_delay_loop): New proto.
* config/avr/avr.cc (avr_out_delay_loop): New function.
(avr_adjust_insn_length) [ADJUST_LEN_DELAY_LOOP]: Handle case.
(avr_expand_delay_cycles): Overhaul. Allow loop counts of
zero; they represent a power of 2.
* config/avr/avr.md (adjust_len) [delay_loop]: Add.
(*delay_cycles_1): Use avr_out_delay_loop for asm out.
(*delay_cycles_4): Same.
(*delay_cycles_3): Same.
(*delay_cycles_2): Same. Relax constraints to "d".
Use two QImode scratch regs instead of one HImode one.
Jose E. Marchesi [Mon, 15 Jun 2026 07:38:57 +0000 (09:38 +0200)]
a68: do not pub non-publicized indicants in exports
Even non-publicized modes were being added to the module interfaces.
This was because for all other kinds of taxes (identifiers, operator
indicants, etc) the PUBLICIZED tax attribute is set at taxes
collection time, but for mode indicants the flag shall be set at
extraction time.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-extract.cc (a68_extract_indicants): Set PUBLICIZED in
the new tags for mode indicants.
* a68-parser.cc (a68_new_tag): Initialize PUBLICIZED to false.
Andrew Pinski [Sun, 14 Jun 2026 20:19:47 +0000 (13:19 -0700)]
match: Fix up `(~x) >> (type)x` pattern for truncation [PR125790]
I missed this during the review and when I suggested adding support
for the cast. But with a truncation of the shifter operand the value
could be defined.
Since the front-end adds a cast to unsigned int, we need to split
pr125707.c into two and xfail the long case and change it to
`long long` so it would xfail for ilp32 [and llp64il32] targets.
PR tree-optimization/125790
gcc/ChangeLog:
* match.pd (`(~x)>>x`): Reject truncation of shifter.
gcc/testsuite/ChangeLog:
* gcc.dg/pr125707.c: Move the long over to pr125707-1.c.
* gcc.dg/pr125707-1.c: New test; xfailed.
* gcc.dg/pr125790-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Monk Chiang [Wed, 21 Jan 2026 03:29:13 +0000 (11:29 +0800)]
RISC-V: Ensure Zicfilp lpad 4-byte alignment
Add LABEL_ALIGN to align non-local goto target labels, removing the
explicit gen_lpad_align () call. Extend gpr_save to emit the full
CFI guard (.p2align 2, .option push/norelax/norvc, call, .option pop,
lpad 0) when Zicfilp is active. Add lpad_align before the thunk
entry lpad.
gcc/ChangeLog:
* config/riscv/riscv-zicfilp.cc (rest_of_insert_landing_pad):
Remove gen_lpad_align before non-local goto labels; LABEL_ALIGN
now handles their alignment. Simplify gpr_save handling: the
pattern now outputs the full CFI sequence itself.
* config/riscv/riscv.cc (riscv_output_mi_thunk): Add
gen_lpad_align before entry lpad.
* config/riscv/riscv.h (LABEL_ALIGN): New macro; ensures 4-byte
alignment for Zicfilp non-local goto target labels.
* config/riscv/riscv.md (gpr_save): When is_zicfilp_p, output
.p2align 2, .option push/norelax/norvc, call, .option pop, lpad 0.
gcc/testsuite/ChangeLog:
* g++.target/riscv/zicfilp-thunk.C: New test.
* gcc.target/riscv/zicfilp-func-entry.c: New test.
* gcc.target/riscv/zicfilp-gpr-save.c: New test.
* gcc.target/riscv/zicfilp-nonlocal-goto.c: New test.
Monk Chiang [Wed, 10 Jun 2026 01:34:44 +0000 (18:34 -0700)]
RISC-V: Add Zicfilp LPAD protection for setjmp and indirect_return
Add call-site LPAD insertion for two cases:
1. setjmp / __attribute__((returns_twice)) calls, which may return a
second time via longjmp.
2. A new "indirect_return" attribute for functions that may return to
an unexpected address.
Detection uses riscv_call_needs_lpad_p() at expand time. When needed,
call_internal_cfi / call_value_internal_cfi emit .p2align 2,
.option push/norelax/norvc, the call, .option pop, and lpad 0 as a
single insn. This prevents the assembler (c.jal) or linker (jal
relaxation) from shifting the return address off the lpad.
Indirect calls to returns_twice or indirect_return functions are not
covered.
Changes in v4:
- Revert the v2 change, make indirect_return a type attribute.
Changes in v3:
- Fix coding style in riscv.cc
Changes in v2:
- Change indirect_return from a type attribute to a decl attribute,
consistent with returns_twice.
- Add length attributes to call_internal_cfi and call_value_internal_cfi
to reflect the multi-instruction sequence.
- Copy the explaining comment to call_value_internal_cfi.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_call_needs_lpad_p): Declare.
* config/riscv/riscv.cc (riscv_gnu_attributes): Register new
indirect_return attribute for function declarations.
(riscv_call_needs_lpad_p): New function.
* config/riscv/riscv.md (call_internal_cfi): New insn pattern with
length attribute.
(call_value_internal_cfi): Likewise for call-with-return-value,
with comment and length attribute.
(define_expand "call"): Emit call_internal_cfi when
riscv_call_needs_lpad_p returns true.
(define_expand "call_value"): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zicfilp-indirect-return.c: New test.
* gcc.target/riscv/zicfilp-setjmp.c: New test.
Between combine and late_combine1 optimization passes I found a case
where a cbranchsi4 was getting injected between the "set sr_f" and
"jump if sr_f" instructions. The middle-end thought this was ok as it
could not see that the cbranchsi4 pattern will clobber the sr_f register
after splitting.
Add a clause to the cbranchsi4 pattern to indicate that it will clobber
sr_f. This allows the middle-end to avoid placing the cbranchsi4
pattern incorrectly.
After this patch these tests pass.
make check-gcc RUNTESTFLAGS='dg-torture.exp=builtin-arith-overflow-12.c'
Stafford Horne [Thu, 4 Jun 2026 17:36:39 +0000 (18:36 +0100)]
or1k: Fix 64-bit shifts on OpenRISC
On OpenRISC, 64-bit shift tests shiftdi-2.c and vshift-1.c were failing
and appear to have always been broken.
After investigation it was found that OpenRISC fails to define
SHIFT_COUNT_TRUNCATED which is needed as both register and immediate
shift amounts are unsigned and only use the low-order 5 bits. Also,
the immediate used for 32-bit shifts is an unsigned 5-bit value not a
6-bit value; update the predicate.
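To illustrate the behavior described above, here is a small C model of
a shifter that only looks at the low-order 5 bits of the shift amount.
It is a sketch, not target code, and the function names are invented.

```c
#include <stdint.h>

/* Model of a 32-bit logical left shift that uses only the low-order
   5 bits of AMOUNT, so a count of 33 behaves like a count of 1.  */
uint32_t shl_low5 (uint32_t value, uint32_t amount)
{
  return value << (amount & 31);
}

/* Does AMOUNT fit an unsigned 5-bit immediate (0..31)?  This mirrors
   the idea behind accepting u5 rather than u6 constants.  */
int fits_u5 (long amount)
{
  return amount >= 0 && amount <= 31;
}
```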
After these changes these tests pass.
make check-gcc RUNTESTFLAGS='dg.exp=vshift-1.c execute.exp=shiftdi-2.c'
gcc/ChangeLog:
* config/or1k/or1k.h (SHIFT_COUNT_TRUNCATED): Define.
* config/or1k/or1k.md (rotrsi3): Rename reg_or_u6_operand to
reg_or_u5_operand.
(<shift_op>si3): Ditto.
* config/or1k/predicates.md (reg_or_u6_operand): Remove.
(reg_or_u5_operand): New predicate.
Robert Dubner [Sun, 14 Jun 2026 20:25:38 +0000 (16:25 -0400)]
cobol: Improve MOVE routines.
Implement MOVE COMP-3 to NumericDisplay. Expand test routines verifying SIZE
ERROR behavior for the new MOVE algorithms. Fix long-standing errors in
processing truncated MOVEs to numeric-display and packed-decimal that resulted
in "negative zero" constructions.
gcc/cobol/ChangeLog:
* move.cc (hex_of): Move the routine.
(hex_msg): Likewise.
(clear_negative_zero): New routine for clearing "negative zero"
after certain MOVEs.
(mh_numeric_display): Use clear_negative_zero().
(mh_packed_to_packed): Check for SIZE-ERROR; use
clear_negative_zero().
(mh_packed_to_numdisp): New routine.
(move_helper): Use mh_packed_to_numdisp().
(parser_move): Move the parser_move routine.
(parser_move_multi): Likewise.
(mh_numdisp_to_packed): Move routine; use clear_negative_zero().
* parse.y: Set separate_e for COMP-6 variables.
libgcobol/ChangeLog:
* libgcobol.cc (int128_to_field): Set packed-decimal sign nybble to
"positive" when value is zero.
gcc/testsuite/ChangeLog:
* cobol.dg/group2/COMP-3_to_COMP-3_size_error.cob: New test.
* cobol.dg/group2/COMP-3_to_COMP-3_size_error.out: New test.
* cobol.dg/group2/COMP-3_to_numeric-display_size_error.cob: New test.
* cobol.dg/group2/COMP-3_to_numeric-display_size_error.out: New test.
* cobol.dg/group2/Clear_negative_zero_after_truncated_MOVE.cob: New test.
* cobol.dg/group2/Clear_negative_zero_after_truncated_MOVE.out: New test.
* cobol.dg/group2/numeric-display_to_COMP-3_size_error.cob: New test.
* cobol.dg/group2/numeric-display_to_COMP-3_size_error.out: New test.
Jerry DeLisle [Sun, 14 Jun 2026 02:10:04 +0000 (19:10 -0700)]
fortran: Fix double free in ASSOCIATE over allocatable char function [PR125782]
When an ASSOCIATE selector is a function call returning an allocatable
deferred-length character, trans_associate_var unconditionally added an
extra free of the associate-name's backend decl. That free was added in
2017 (PR60458/77296) to release the result of a POINTER-valued character
function, which is not otherwise freed. For an ALLOCATABLE-valued
character function, however, the result temporary is already freed by the
procedure call's own cleanup code, and the associate name aliases that
same temporary, so the extra free caused a double free at the end of the
ASSOCIATE block.
Restrict the extra free to POINTER-valued function results, where it is
still needed.
Assisted by: Claude Sonnet 4.6
PR fortran/125782
gcc/fortran/ChangeLog:
* trans-stmt.cc (trans_associate_var): Only free the associate
name's backend decl for a deferred-length character function
result when the result is a POINTER, not when it is
ALLOCATABLE, since the latter is already freed by the
procedure call's cleanup.
macOS 27 (Golden Gate) corresponds to darwin27. Previously, macOS 26 was
darwin25, and darwin26 never existed. We need to adapt the driver to
this new numbering scheme.
Jose E. Marchesi [Sun, 14 Jun 2026 15:31:16 +0000 (17:31 +0200)]
a68: use INCLUDE_FOO before system.h for standard C++ headers
Define INCLUDE_* preprocessor symbols rather than including some
standard C++ headers directly.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-brackets.cc (INCLUDE_STRING): Define.
Do not include <string>.
* a68-parser-bottom-up.cc (INCLUDE_STRING): Define.
Do not include <string>.
* a68-moids-diagnostics.cc (INCLUDE_STRING): Define.
Do not include <string>.
* a68-imports.cc (INCLUDE_STRING): Define.
Do not include <string>.
* a68-imports-archive.cc (INCLUDE_MAP): Define.
(INCLUDE_STRING): Likewise.
Do not include <string> nor <map>.
Jose E. Marchesi [Sun, 14 Jun 2026 11:03:19 +0000 (13:03 +0200)]
a68: fix type of flex.sub_offset in struct encoded_mode
The type of flex.sub_offset in struct encoded_mode shall of course be
uint64_t rather than uint8_t. This was triggering corrupted exports
sections once a certain amount of modes were reached.
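The effect of the too-narrow field can be reproduced in isolation. The
following sketch uses invented function names and plain integers rather
than struct encoded_mode; it only shows what an 8-bit field does to an
offset above 255.

```c
#include <stdint.h>

/* Store OFFSET in an 8-bit field and read it back.  Only the low
   8 bits survive, so any offset above 255 comes back wrong.  */
uint64_t roundtrip_u8 (uint64_t offset)
{
  uint8_t narrow = (uint8_t) offset;
  return narrow;
}

/* The same round trip through a 64-bit field preserves the offset.  */
uint64_t roundtrip_u64 (uint64_t offset)
{
  uint64_t wide = offset;
  return wide;
}
```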
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-imports.cc (struct encoded_mode): Type of flex.sub_offset
shall be uint64_t.
Thomas Koenig [Sun, 14 Jun 2026 07:33:41 +0000 (09:33 +0200)]
Add a few cases where -Wwarn-unused-but-set-variable should not warn.
When variables are use-associated, volatile or asynchronous,
reference or definition may not be visible in the local namespace.
Thus, they need to be excluded from unused vs set warnings.
gcc/fortran/ChangeLog:
PR fortran/30438
* resolve.cc (find_unused_vs_set): Exclude variables from
warnings if use_assoc, volatile_ or asynchronous are set.
gcc/testsuite/ChangeLog:
PR fortran/30438
* gfortran.dg/warn_unused_but_set_variable_3.f90: New test.
Ben Boeckel [Tue, 2 Apr 2024 02:14:44 +0000 (22:14 -0400)]
libcpp/init: remove unnecessary `struct` keyword
The initial P1689 patches were written in 2019, and code moving
around over time ended up introducing a `struct` keyword to the
implementation of `cpp_finish`. Remove it to match the rest of the file
and its declaration in the header.
Souradipto Das [Mon, 8 Jun 2026 03:45:24 +0000 (09:15 +0530)]
tree-optimization: Add bitop reduction simplifications against zero
This patch introduces a simplification rule in match.pd to reduce bitwise
expressions against zero. Specifically, it simplifies patterns where a
variable checked against zero is combined via bitwise AND/OR with a compounded
bitwise OR check against zero.
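Both identities can be checked exhaustively over a small range. This is
an illustrative C check, not one of the new tests; the function name is
made up.

```c
/* Return 1 if, for all a, b in [0, 63]:
     (a == 0) | ((a | b) == 0)  equals  (a == 0), and
     (a != 0) & ((a | b) != 0)  equals  (a != 0).  */
int bitop_zero_identities_hold (void)
{
  for (unsigned a = 0; a < 64; a++)
    for (unsigned b = 0; b < 64; b++)
      {
        int or_form = (a == 0) | ((a | b) == 0);
        int and_form = (a != 0) & ((a | b) != 0);
        if (or_form != (a == 0) || and_form != (a != 0))
          return 0;
      }
  return 1;
}
```

The reasoning is that (a | b) == 0 implies a == 0, and a != 0 implies
(a | b) != 0, so the compound check adds nothing in either case.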
PR tree-optimization/125442
gcc/ChangeLog:
* match.pd: Add simplification rules for
(a == 0) | ((a | b) == 0) -> (a == 0) and
(a != 0) & ((a | b) != 0) -> (a != 0).
gcc/testsuite/ChangeLog:
* gcc.dg/int-bwise-opt-3.c: New test.
* gcc.dg/int-bwise-opt-4.c: New test.
Suggested-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Signed-off-by: Souradipto Das <souradiptodas6@gmail.com>
Andrew Pinski [Sat, 13 Jun 2026 21:42:30 +0000 (14:42 -0700)]
range fold: Fix relation folding of |/& when reversed operands [PR125774]
This showed up in GCC 13 in the original testcase but became latent in GCC 14.
So I created a simple gimple testcase to show the issue.
So what we have is:
_21 = _20 > lower_9;
_22 = lower_9 > _20;
_23 = _21 | _22;
And this would incorrectly be folded into 1 and that is because we treated one of
those `>` as `<=` rather than as just `<`. This was due to an incorrect use
of relation_negate rather than relation_swap when dealing with swapping the operands.
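In C terms, the difference between swapping and negating looks like
this. The sketch is only an illustration of the two readings; the
function names are invented.

```c
/* The shape of _23 above: (x > y) | (y > x).  Swapping the operands
   of the second compare gives x < y, so this is x != y.  */
int either_greater (int x, int y)
{
  return (x > y) | (y > x);
}

/* The mistaken reading: treating y > x as x <= y (a negation of
   x > y instead of a swap).  This is 1 for every input.  */
int mistaken_reading (int x, int y)
{
  return (x > y) | (x <= y);
}
```

The two functions differ exactly when x == y, which is the case the
incorrect fold to 1 got wrong.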
Pushed as obvious after a bootstrap/test on x86_64-linux-gnu.
PR tree-optimization/125774
gcc/ChangeLog:
* gimple-range-fold.cc (fold_using_range::relation_fold_and_or): Use
relation_swap rather than relation_negate when the operands are exchanged.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr125774-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jose E. Marchesi [Sat, 13 Jun 2026 23:42:02 +0000 (01:42 +0200)]
a68: fix brackets parser diagnostics
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-brackets.cc: Include <string>.
(bracket_check_error): Get additional argument `item'.
(bracket_check_diagnose): Likewise.
(bracket_check_parse): Fix call to a68_error passing the `item'.
Andrew Pinski [Sat, 13 Jun 2026 16:38:08 +0000 (09:38 -0700)]
phiopt: Fix is_factor_profitable for debug stmts [PR125776]
This is a latent bug in is_factor_profitable where we would
consider a debug statement as a use for lifetime usage
afterwards. So you would get a compare debug failure in some
cases. pr125776-2.c fails since r15-4503-g8d6d6d537fdc75.
Pushed as obvious after a bootstrap/test.
PR tree-optimization/125776
gcc/ChangeLog:
* tree-ssa-phiopt.cc (is_factor_profitable): A
usage in a debug stmt should be ignored.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr125776-1.c: New test.
* gcc.dg/torture/pr125776-2.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Tomasz Kamiński [Thu, 30 Apr 2026 18:07:14 +0000 (20:07 +0200)]
libstdc++: Implement P4206R0: Revert string support in std::constant_wrapper.
We need to apply remove_cvref_t on decltype(_Xv) for default template argument
due to PR115314. The constant_wrapper::value is declared as decltype((__Xv)) due
to PR125188.
libstdc++-v3/ChangeLog:
* include/bits/version.def (constant_wrapper): Bump to 202606L.
* include/bits/version.h: Regenerate.
* include/bits/funcref_impl.h (function_ref::function_ref): Rename
template parameter from __cwfn to __fn and use it directly.
* include/bits/funcwrap.h (function_ref): Rename template parameter
to __fn.
(std::constant_wrapper): Use auto as non-type template parameter,
and reference it as value.
* include/bits/utility.h (__CwFixedValue): Remove.
* testsuite/20_util/constant_wrapper/generic.cc: Remove arrays
and string literal tests. Add test for address of value.
* testsuite/20_util/constant_wrapper/other_wrappers.cc:
Remove test_array.
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jose E. Marchesi [Sat, 13 Jun 2026 16:44:27 +0000 (18:44 +0200)]
a68: avoid coalescing of stmt_lists in a68_lower_unit_list
As it happens, a68_lower_unit_list collects units as members of a
stmt_list, lowering them and appending them. Problem is, stmt_lists
get coalesced when appended (or prepended) to another stmt_list.
One of the units that may lower in a stmt_list are generators, and the
coalescing manifests itself when the generators are found in
collateral clauses.
This patch puts in place a temporary workaround for this, which is to
wrap the stmt_list into a NOP_EXPR. This avoids the coalescing, but a
less hackish solution will probably consist in changing the way
unit_lists get collected instead.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-low-generator.cc (a68_low_generator): Wrap the resulting
stmt_list into a nop_expr.
gcc/testsuite/ChangeLog
* algol68/execute/gen-in-constructor-1.a68: New test.
* algol68/execute/gen-in-constructor-2.a68: Likewise.
Jeff Law [Sat, 13 Jun 2026 15:05:32 +0000 (09:05 -0600)]
[PATCH v6 4/4] find_a_program: Search with machine prefix in some cases
Prefatory note:
I've since learned that this is quite similar to
https://inbox.sourceware.org/gcc-patches/20240522095404.1825269-1-syq@gcc.gnu.org/,
postdating my original patch series, but predating this version. See
that thread for additional motivation. That patch updated a few specific
callsites for various programs; I instead opted to enhance find_a_program
(which I myself originally factored out of find_a_file in 5fee8a0a9223d030c66d53c104fb0a431369248f for this purpose) to catch all
such cases programmatically.
I want this change because when cross compiling, users expect prefixed
tools, like
- $AS = $machine-as
- $LD = $machine-ld
etc. and it would be less confusing if GCC would find those same tools
in a similar way. GCC instead looking for its own (less widely used than
prefixing) nested-directory way of disambiguating, and then falling back
looking for *unprefixed* tools (which typically are for the wrong
platform in cross cases) is quite confusing. In my distro, Nixpkgs, and
elsewhere, I've seen people draw the wrong conclusions because of this,
which is that one must use absolute paths hard-coded at build time in
order to get the right behavior, and dodge the incorrect unprefixed
tools.
For what it's worth, Clang/LLVM also look for prefixed tools in this
manner, so this isn't the first time it would be done. The difference is
that they will look for prefixed tools in *all* cases --- i.e. also
within the machine-specific locations as I outlined above (a thing
this patch does *not* do). I think that was done for mere coding
convenience, and there is no actual motivation for "doubly disambiguating"
tools with machine directories and machine prefixes. So in the interest
of sticking strictly to the motivation / being conservative in how much
new behavior is implemented, I did not implement that part.
gcc/ChangeLog:
* gcc.cc (find_a_program): Implement the new behavior described
above.
(driver::set_up_specs): Initialize variable. It would be nice to
have less spooky-action-at-a-distance using C++ features, but I
just matched how the corresponding suffix variable worked for
now for uniformity.
John Ericson [Sat, 13 Jun 2026 14:56:16 +0000 (08:56 -0600)]
[PATCH v6 3/4] for_each_path: Pass to callback whether dir is machine-disambiguated
We will use this in the subsequent patch to control what filenames we
search for.
gcc/ChangeLog:
* gcc.cc (for_each_path): Pass an additional boolean argument to
the callback specifying whether the current directory being
searched is machine-specific, as described above.
(build_search_list): Add unused parameter to lambda to match new
callback type.
(find_a_file): Add unused parameter to lambda to match new callback
type.
(find_a_program): Add unused parameter to lambda to match new
callback type.
(spec_path::operator()): Add unused parameter to match new
callback type.
Kyrylo Tkachov [Fri, 12 Jun 2026 00:00:00 +0000 (00:00 +0000)]
match.pd: fold (0/1) * -(0/1) into -((0/1) & (0/1))
For operands known to be in the range [0, 1], multiplying a 0/1 value by a
negated 0/1 value is the negation of their bitwise AND:
x * -y == -(x & y) when x, y are in { 0, 1 }.
This complements the existing "{ 0, 1 } * { 0, 1 } -> { 0, 1 } & { 0, 1 }"
simplification, which does not handle a negated operand. For the
comparison-derived 0/1 masks produced by if-conversion this exposes a plain
bitwise AND of the original conditions to later passes (replacing a
COND_EXPR).
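The identity, and the reason it needs the [0, 1] range, can be seen in
a few lines of C. This is an illustration with invented function names,
not the match.pd pattern itself.

```c
/* Left-hand side of the fold: x * -y.  */
int mul_by_negated (int x, int y)
{
  return x * -y;
}

/* Right-hand side of the fold: -(x & y).  */
int negated_and (int x, int y)
{
  return -(x & y);
}

/* Return 1 if both sides agree for all x, y in { 0, 1 }.  */
int fold_holds_for_bool_range (void)
{
  for (int x = 0; x <= 1; x++)
    for (int y = 0; y <= 1; y++)
      if (mul_by_negated (x, y) != negated_and (x, y))
        return 0;
  return 1;
}
```

Outside that range the two sides differ; for x == 2 and y == 1 the
product is -2 while -(2 & 1) is 0, which is why the fold is limited to
operands known to be 0 or 1.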
This triggers a few times in astcenc in SPEC2026 where it simplifies the
codegen of one of the hot kernels and gives a ~2.4% improvement on
aarch64, though the real winners for that kernel are described in
PR125750. This is just a small cleanup.
Sunil Dora [Thu, 11 Jun 2026 12:50:20 +0000 (18:20 +0530)]
driver: Spill long COLLECT_GCC_OPTIONS to a response file [PR111527]
Many kernels enforce a per-string length limit on argv and envp
strings passed to execve(). On Linux, MAX_ARG_STRLEN limits each
string to 32 * PAGE_SIZE (~128KB); Windows limits individual
environment variables to 32767 characters. When the assembled
value exceeds such a limit, the build fails.
When the assembled value would exceed COLLECT2_OPTIONS_MAX_LENGTH
(default 1024, host-overridable via defaults.h), the driver writes
the option list to a temporary response file via writeargv() and
exports "COLLECT_GCC_OPTIONS=@<path>" instead. collect2,
lto-wrapper and lto-plugin transparently expand the @file form using
existing expandargv() infrastructure, so the change is invisible to
normal builds.
Bootstrapped and regression tested on x86_64-pc-linux-gnu.
PR driver/111527
gcc/ChangeLog:
* defaults.h (COLLECT2_OPTIONS_MAX_LENGTH): New macro.
* collect-utils.cc (read_collect_gcc_options): New function.
* collect-utils.h (read_collect_gcc_options): Declare.
* collect2.cc (main): Use read_collect_gcc_options instead
of getenv.
* doc/hostconfig.texi (Host Misc): Document
COLLECT2_OPTIONS_MAX_LENGTH.
* doc/invoke.texi (Environment Variables): Document the
@file form of COLLECT_GCC_OPTIONS.
* gcc.cc (xsetenv_collect_gcc_options): New function.
(set_collect_gcc_options): Use xsetenv_collect_gcc_options.
* lto-wrapper.cc (run_gcc): Use read_collect_gcc_options
instead of getenv.
gcc/testsuite/ChangeLog:
* gcc.misc-tests/pr111527.exp: New test.
include/ChangeLog:
* libiberty.h (expandargstr): Declare.
libiberty/ChangeLog:
* argv.c (expandargstr): New function.
lto-plugin/ChangeLog:
* lto-plugin.c (read_collect_gcc_options): New function.
(onload): Use read_collect_gcc_options instead of getenv.
Signed-off-by: Sunil Dora <sunilkumar.dora@windriver.com>
Tobias Burnus [Fri, 12 Jun 2026 21:30:35 +0000 (23:30 +0200)]
OpenMP/Fortran: Fix module-use renaming with declare mapper/reduction
Use the same logic as in gfc_compare_derived_types to compare the
types. Additionally, the 'declare' part only permits derived types
(per syntax) and not class - while using the mapper/reduction with
CLASS variables is possible.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_omp_udm_find, gfc_omp_udr_find): Fix
to handle derived-type renaming via module use.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/declare-mapper-6.f90: New test.
* gfortran.dg/gomp/declare-mapper-7.f90: New test.
* gfortran.dg/gomp/declare-reduction-3.f90: New test.
* gfortran.dg/gomp/declare-reduction-4.f90: New test.
Jakub Jelinek [Fri, 12 Jun 2026 20:43:06 +0000 (22:43 +0200)]
c++: Diagnose invalid type of bitfield widths in templates [PR125674]
As the first testcase shows, outside of templates or when
the bitfield width is not type dependent, we diagnose it
in grokbitfield:
  if (width != error_mark_node)
    {
      /* The width must be an integer type. */
      if (!type_dependent_expression_p (width)
          && !INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P (TREE_TYPE (width)))
        error ("width of bit-field %qD has non-integral type %qT", value,
               TREE_TYPE (width));
      else if (!check_for_bare_parameter_packs (width))
        {
          /* Temporarily stash the width in DECL_BIT_FIELD_REPRESENTATIVE.
             check_bitfield_decl picks it from there later and sets DECL_SIZE
             accordingly. */
          DECL_BIT_FIELD_REPRESENTATIVE (value) = width;
          SET_DECL_C_BIT_FIELD (value);
        }
    }
Later on in check_bitfield_decl we verify it is a constant expression,
folded into INTEGER_CST, non-negative etc.
But during instantiation we don't repeat that check and only call
check_bitfield_decl later on, which can sometimes emit different diagnostics
(e.g.
bit-field ‘D<1.0e+0>::d’ width not an integer constant
instead of
width of bit-field ‘D<N>::d’ has non-integral type ‘double’
) but, as the second testcase shows, we can also ICE during cxx_constant_value
even before that if the type is even more problematic.
The following patch fixes that by repeating the test from grokbitfield
during tsubst_decl.
Tobias Burnus [Fri, 12 Jun 2026 17:12:58 +0000 (19:12 +0200)]
Fortran: Improve OpenMP/OpenACC syntax diagnostic
The OpenMP and OpenACC parsers are written such that, once the
directive name has been successfully matched, any error returned
by the match function is a real error.
However, 'match_word' resets the locus to the before-match locus such
that all information is lost, except that error vs. no match data is
still available. Thus, for OpenMP and OpenACC, the error often was
Unclassifiable OpenMP directive at (1)
which is odd when knowing that one used a supported directive; that
the caret pointed to the directive name did not really help, either.
With this commit, the match errors for OpenMP and OpenACC yield the
following error if no buffered message exists:
Syntax error in statement at (1)
pointing to the current locus. (A more explicit error would be better
still, e.g. for many errors in 'omp declare reduction', but this is an
improvement over the previous behavior.)
gcc/fortran/ChangeLog:
* parse.cc (match_word): Add no_substring and
reject_stmt_on_error arguments, defaulting to false and true,
respectively.
(match_word_omp_simd): Do not reject_statement on error and
enable no-substring matching.
(matcha, matcho, matchdo): Call match_word with no_substring
set to true and reject_stmt_on_error set to false.
(decode_omp_directive): Distinguish unknown directive name from
errors found during matching.
(decode_oacc_directive): Likewise; use matcha not match.
(matcha, matcho, matchdo, matchs, matchds): #undef after use.
Julian Brown [Fri, 12 Jun 2026 17:08:43 +0000 (19:08 +0200)]
Fortran/OpenMP: Add module support for 'declare mapper'
This commit fixes some issues, moves resolution of the mapper_id
from the parsing to the resolution stage, and saves the mapper in the
module file.
As a side effect, there is no longer a 'sorry, unimplemented'
for 'declare mapper'; the 'sorry' is now printed when using
explicit map clauses that require a mapper. Note that no
error is printed if the code only uses implicit maps, even
though the mapper is ignored.
gcc/fortran/ChangeLog:
* gfortran.h (gfc_omp_namelist): Change udm member into
a pointer type.
(gfc_omp_namelist_udm): Add mapper_id member and move
down in the file below the related ..._udr struct.
(gfc_get_omp_namelist_udm): New convenience macro.
* match.cc (gfc_free_omp_namelist): Free udm.
* module.cc (MOD_VERSION_NUMERIC): Add.
(load_omp_udrs): Add diagnostic_group.
(omp_map_clause_ops, load_omp_udms, check_omp_declare_mappers,
write_omp_udm, write_omp_udms): New.
(read_module, write_module): Support 'declare mapper'.
* openmp.cc (gfc_find_omp_udm, gfc_match_omp_clauses,
resolve_omp_clauses): Handle mapper_id and do later
resolution.
* resolve.cc (resolve_types): Remove 'declare mapper' sorry.
* trans-openmp.cc (gfc_trans_omp_clauses): Add sorry for
map clauses with mapper.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/declare-mapper-1.f90: Remove no longer
expected 'sorry, unimplemented'.
* gfortran.dg/gomp/declare-mapper-3.f90: New test.
* gfortran.dg/gomp/declare-mapper-31.f90: New test.
* gfortran.dg/gomp/declare-mapper-4.f90: New test.
* gfortran.dg/gomp/declare-mapper-5.f90: New test.
Ramin Moussavi [Tue, 2 Jun 2026 22:00:00 +0000 (00:00 +0200)]
microblaze: add Linux signal frame unwinding support
libgcc has no MD_FALLBACK_FRAME_STATE_FOR for microblaze*-linux*, so the
DWARF unwinder cannot step through signal frames at all. Anything that
unwinds out of a signal handler -- most prominently NPTL asynchronous
pthread cancellation (SIGCANCEL) -- either stops early with
_URC_END_OF_STACK (cleanup handlers below the signal frame never run) or
misinterprets the on-stack signal trampoline and crashes with SIGSEGV.
Add the standard fallback: recognize the two-instruction trampoline the
kernel writes into struct rt_sigframe on the stack
addik r12, r0, __NR_rt_sigreturn
brki r14, 0x8
and rebuild the frame state from the sigcontext's pt_regs. The ucontext
is anchored relative to the trampoline (its last member) rather than to
the CFA, so the layout of the frame head does not matter.
The interrupted PC is recorded in DWARF column 36, one past the hard
registers, because column 15 must keep the interrupted r15 (unrelated to
the resume address of a signal frame). Declaring it as
DWARF_ALT_FRAME_RETURN_COLUMN makes init_dwarf_reg_size_table size the
column; without that _Unwind_GetGR reads a zero size and aborts.
Tested with a microblazeel-linux-uclibc cross compiler against uClibc-ng
git, running its NPTL test suite under qemu-system-microblazeel -M
petalogix-s3adsp1800. Without the fix 17 tests fail (tst-cancel{1..5,7,
9,16,20,x4,x7}, tst-cleanup{1..3}, tst-cond{16,17}) by SIGSEGV or by
hanging in the unwinder; with it all 17 pass and the rest of the suite
is unchanged. The implementation follows the mips/aarch64
linux-unwind.h pattern.
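The trampoline check described above can be sketched as a comparison of
two instruction words. This is a minimal sketch, not the libgcc code: the
function and macro names are hypothetical, the two base encodings are
derived from the MicroBlaze instruction format and should be verified
against the ISA reference, and the syscall number is passed in by the
caller so that no value for __NR_rt_sigreturn is assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encodings (type B format, 16-bit immediate in the low
   half-word); check against the MicroBlaze reference before relying
   on them.  */
#define DEMO_ADDIK_R12_R0 0x31800000u	/* addik r12, r0, imm16 */
#define DEMO_BRKI_R14_8   0xb9cc0008u	/* brki r14, 0x8 */

/* Return true if INSN0 and INSN1 are the two words of the signal
   trampoline for syscall number NR_RT_SIGRETURN.  */
static bool
demo_is_rt_sigreturn_tramp (uint32_t insn0, uint32_t insn1,
			    uint16_t nr_rt_sigreturn)
{
  return insn0 == (DEMO_ADDIK_R12_R0 | (uint32_t) nr_rt_sigreturn)
	 && insn1 == DEMO_BRKI_R14_8;
}
```

A fallback handler would read the two words at the current PC and apply a
test of this kind before touching the sigcontext, and would report that no
frame state was found when the words do not match.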
Thomas Schwinge [Thu, 21 May 2026 16:54:08 +0000 (18:54 +0200)]
'#define _GNU_SOURCE' in 'libgomp/plugin/plugin-gcn.c'
'#define _GNU_SOURCE' in 'libgomp/plugin/plugin-gcn.c', like all other
libgomp source files do, instead of via the 'Makefile'. Minor fix-up for
Subversion r278138 (Git commit 237957cc2c1818f30207f02747a880bd1cd28d0b)
"GCN Libgomp Plugin" (..., which, back then, likely had inherited that
from the HSA libgomp plugin).
Currently an "entire" address is reloaded even in cases where section
anchors are involved. This makes it harder to share section anchors
which is the whole point of them. For example, in cases where
offsetable MEMs are valid do not reload .LANCHOR42+offset but only
.LANCHOR42 and replace the address with the resulting reload register
and the offset. As a consequence subsequent passes only have to deal
with register equivalences in order to share section anchors. For
example, consider testsuite/gcc.target/s390/section-anchors-4.c.
Without this patch, after LRA we end up with