git.ipfire.org Git - thirdparty/gcc.git/log

Andrew Pinski [Sat, 10 May 2025 03:56:42 +0000 (20:56 -0700)]

aarch64: Fix narrowing warning in aarch64_detect_vector_stmt_subtype

There is a narrowing warning in aarch64_detect_vector_stmt_subtype
about gather_load_x32_cost and gather_load_x64_cost converting from int to unsigned.
These fields are always unsigned and even the constructor for sve_vec_cost takes
an unsigned. So let's just move the fields over to unsigned.
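As a rough illustration of the kind of diagnostic involved, here is a minimal sketch. Only the two field names come from the message above; the struct name `gather_costs`, the helper `pick_gather_cost` and the brace-initialised local are hypothetical, since the log does not show the expression in aarch64_detect_vector_stmt_subtype that actually triggered the warning.

```cpp
#include <cassert>

// Hypothetical stand-in for the cost structure; only the two field names
// are taken from the commit message.
struct gather_costs
{
  constexpr gather_costs (unsigned x32, unsigned x64)
    : gather_load_x32_cost (x32), gather_load_x64_cost (x64) {}

  // If these were declared `const int`, the brace-initialisation in
  // pick_gather_cost would be a narrowing conversion (int -> unsigned),
  // which GCC diagnoses under -Wnarrowing.
  const unsigned gather_load_x32_cost;
  const unsigned gather_load_x64_cost;
};

// List-initialisation of an unsigned from a non-constant expression:
// well-formed only because the fields are already unsigned.
inline unsigned pick_gather_cost (const gather_costs &c, bool is_x64)
{
  unsigned cost = { is_x64 ? c.gather_load_x64_cost : c.gather_load_x32_cost };
  return cost;
}

// Small usage check for the sketch above.
inline void gather_costs_demo ()
{
  gather_costs c (12, 24);
  assert (pick_gather_cost (c, true) == 24u);
}
```

Making the fields match the constructor's parameter type removes the conversion altogether rather than hiding it behind a cast.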

Built and tested for aarch64-linux-gnu.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (struct sve_vec_cost): Change gather_load_x32_cost
and gather_load_x64_cost fields to unsigned.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Andrew Pinski [Mon, 21 Apr 2025 20:00:19 +0000 (13:00 -0700)]

forwprop: Add alias walk limit to optimize_memcpy_to_memset.

As suggested in https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681507.html,
this adds the aliasing walk limit.

gcc/ChangeLog:

* tree-ssa-forwprop.cc (optimize_memcpy_to_memset): Add a limit on the alias walk.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Andrew Pinski [Mon, 21 Apr 2025 19:19:49 +0000 (12:19 -0700)]

forwprop: Move memcpy_to_memset from gimple fold to forwprop

Since this optimization now walks the vops, it is better to do it
only in forwprop rather than on every call to fold_stmt.
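For readers unfamiliar with the transformation, the following sketch shows the source-level shape it is assumed to target: a memcpy whose source was just filled by memset can be treated as a memset of the destination. The type `buf` and both function names are made up for illustration, and whether GCC actually rewrites a given function this way depends on the version and optimization flags.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical aggregate used only for this illustration.
struct buf
{
  char bytes[64];
};

// The pattern of interest: copy from an object that was just cleared.
void clear_then_copy (buf *dst)
{
  buf tmp;
  std::memset (&tmp, 0, sizeof tmp);
  std::memcpy (dst, &tmp, sizeof tmp);
}

// What the copy is equivalent to: clearing the destination directly.
void clear_directly (buf *dst)
{
  std::memset (dst, 0, sizeof *dst);
}

// Both forms must leave the destination with identical contents.
bool same_result ()
{
  buf a, b;
  std::memset (&a, 0x5a, sizeof a);
  std::memset (&b, 0x5a, sizeof b);
  clear_then_copy (&a);
  clear_directly (&b);
  return std::memcmp (&a, &b, sizeof a) == 0;
}
```

Proving that nothing modifies the source between the memset and the memcpy is what requires walking the virtual operands.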

The next patch will add the limit to the alias walk.

gcc/ChangeLog:

* gimple-fold.cc (optimize_memcpy_to_memset): Move to
tree-ssa-forwprop.cc.
(gimple_fold_builtin_memory_op): Remove call to
optimize_memcpy_to_memset.
(fold_stmt_1): Likewise.
* tree-ssa-forwprop.cc (optimize_memcpy_to_memset): Move from
gimple-fold.cc.
(simplify_builtin_call): Try to optimize memcpy/memset.
(pass_forwprop::execute): Try to optimize memcpy like assignment
from a previous memset.

gcc/testsuite/ChangeLog:

* gcc.dg/pr78408-1.c: Update scan to forwprop1 only.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Iain Sandoe [Sat, 10 May 2025 16:22:55 +0000 (17:22 +0100)]

c++, coroutines: Allow NVRO in more cases for ramp functions.

The constraints of the c++ coroutines specification require the ramp
to construct a return object early in the function.  This will be returned
at some later time.  This is implemented as NVRO but requires that copying
be well-formed even though it will be elided.  Special-case ramp functions
to allow this.

gcc/cp/ChangeLog:

* typeck.cc (check_return_expr): Suppress conversions for NVRO
in coroutine ramp functions.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

Iain Sandoe [Sat, 10 May 2025 16:12:44 +0000 (17:12 +0100)]

c++: Set the outer brace marker for missed cases.

In some cases, a function might be declared as FUNCTION_NEEDS_BODY_BLOCK
but all the content is contained within that block. However, poplevel
is currently assuming that such cases would always contain subblocks.

In the case that we do have a body block, but there are no subblocks,
set the outer brace marker on the body block. This situation occurs
for at least coroutine lambda ramp functions and empty constructors.

gcc/cp/ChangeLog:

* decl.cc (poplevel): Set BLOCK_OUTER_CURLY_BRACE_P on the
body block for functions with no subblocks.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

Nathaniel Shead [Fri, 28 Mar 2025 12:30:31 +0000 (23:30 +1100)]

c++/modules: Clean up importer_interface

This patch removes some no longer needed special casing in linkage
determination, and makes the distinction between "always_emit" and
"internal" for better future-proofing.

gcc/cp/ChangeLog:

* module.cc (importer_interface): Adjust flags.
(get_importer_interface): Rename flags.
(trees_out::core_bools): Clean up special casing.
(trees_out::write_function_def): Rename flag.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

Jason Merrill [Fri, 16 May 2025 12:22:08 +0000 (08:22 -0400)]

c++: one more coro test tweak

After my r16-670, running the testsuite with explicit --stds didn't run this
one in C++17 mode, but the default did. Let's remove the { target c++17 }
so it doesn't by default, either.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr94760-mismatched-traits-and-promise-prev.C:
Remove { target c++17 }.

Richard Sandiford [Fri, 16 May 2025 12:24:03 +0000 (13:24 +0100)]

Manual tweak of some end_sequence callers

This patch mops up obvious redundancies that weren't caught by the
automatic regexp replacements in earlier patches. It doesn't do
anything with genemit.cc, since that will be part of a later series.

gcc/
* config/arm/arm.cc (arm_gen_load_multiple_1): Simplify use of
end_sequence.
(arm_gen_store_multiple_1): Likewise.
* expr.cc (gen_move_insn): Likewise.
* gentarget-def.cc (main): Likewise.

Richard Sandiford [Fri, 16 May 2025 12:24:02 +0000 (13:24 +0100)]

Automatic replacement of end_sequence/return pairs

This is the result of using a regexp to replace:

  rtx( |_insn *)<stuff> = end_sequence ();
  return <stuff>;

with:

  return end_sequence ();
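The regexp actually used is not given in the message. As a sketch of how such a rewrite could be expressed with std::regex, assuming <stuff> is a plain identifier (the function name `fold_return_pair` is hypothetical):

```cpp
#include <cassert>
#include <regex>
#include <string>

// Replace
//   rtx <var> = end_sequence ();      (or: rtx_insn *<var> = ...)
//   return <var>;
// with a direct "return end_sequence ();".
std::string fold_return_pair (const std::string &src)
{
  // Group 1 captures the variable name; \1 requires the return
  // statement to name the same variable.
  static const std::regex pair (
    R"re(rtx(?: |_insn \*)(\w+) = end_sequence \(\);\n[ \t]*return \1;)re");
  return std::regex_replace (src, pair, "return end_sequence ();");
}

// Small usage check: the indentation before "rtx" is left untouched.
inline void fold_return_pair_demo ()
{
  assert (fold_return_pair ("  rtx seq = end_sequence ();\n  return seq;\n")
          == "  return end_sequence ();\n");
}
```

The backreference is what keeps the rewrite safe: a return of some other variable is left alone.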

gcc/
* asan.cc (asan_emit_allocas_unpoison): Directly return the
result of end_sequence.
(hwasan_emit_untag_frame): Likewise.
* config/aarch64/aarch64-speculation.cc
(aarch64_speculation_clobber_sp): Likewise.
(aarch64_speculation_establish_tracker): Likewise.
* config/arm/arm.cc (arm_call_tls_get_addr): Likewise.
* config/avr/avr-passes.cc (avr_parallel_insn_from_insns): Likewise.
* config/sh/sh_treg_combine.cc
(sh_treg_combine::make_not_reg_insn): Likewise.
* tree-outof-ssa.cc (emit_partition_copy): Likewise.

Richard Sandiford [Fri, 16 May 2025 12:24:01 +0000 (13:24 +0100)]

Automatic replacement of get_insns/end_sequence pairs

This is the result of using a regexp to replace instances of:

  <stuff> = get_insns ();
  end_sequence ();

with:

  <stuff> = end_sequence ();

where the indentation is the same for both lines, and where there
might be blank lines in between.
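The exact regexp is not recorded here. One way to express the stated conditions (same indentation on both lines, optional blank lines between them) is sketched below with std::regex; the function name `fold_get_insns` is hypothetical and the pattern is an approximation, not the one used for the patch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Fold "<stuff> = get_insns ();" followed by "end_sequence ();" at the
// same indentation into "<stuff> = end_sequence ();".
std::string fold_get_insns (const std::string &src)
{
  // Group 1: start of input or the preceding newline, which pins
  //          group 2 to the beginning of a line.
  // Group 2: the indentation of the get_insns line.
  // Group 3: the left-hand side of the assignment.
  // (?:[ \t]*\n)* skips blank lines; \2 then demands identical
  // indentation in front of end_sequence.
  static const std::regex pair (
    R"re((^|\n)([ \t]*)(\S.*) = get_insns \(\);\n(?:[ \t]*\n)*\2end_sequence \(\);)re");
  return std::regex_replace (src, pair, "$1$2$3 = end_sequence ();");
}

// Small usage check for the simplest case.
inline void fold_get_insns_demo ()
{
  assert (fold_get_insns ("  insns = get_insns ();\n  end_sequence ();\n")
          == "  insns = end_sequence ();\n");
}
```

Requiring equal indentation avoids merging a get_insns call with an end_sequence that belongs to a different block.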

gcc/
* asan.cc (asan_clear_shadow): Use the return value of end_sequence,
rather than calling get_insns separately.
(asan_emit_stack_protection, asan_emit_allocas_unpoison): Likewise.
(hwasan_frame_base, hwasan_emit_untag_frame): Likewise.
* auto-inc-dec.cc (attempt_change): Likewise.
* avoid-store-forwarding.cc (process_store_forwarding): Likewise.
* bb-reorder.cc (fix_crossing_unconditional_branches): Likewise.
* builtins.cc (expand_builtin_apply_args): Likewise.
(expand_builtin_return, expand_builtin_mathfn_ternary): Likewise.
(expand_builtin_mathfn_3, expand_builtin_int_roundingfn): Likewise.
(expand_builtin_int_roundingfn_2, expand_builtin_saveregs): Likewise.
(inline_string_cmp): Likewise.
* calls.cc (expand_call): Likewise.
* cfgexpand.cc (expand_asm_stmt, pass_expand::execute): Likewise.
* cfgloopanal.cc (init_set_costs): Likewise.
* cfgrtl.cc (insert_insn_on_edge, prepend_insn_to_edge): Likewise.
(rtl_lv_add_condition_to_bb): Likewise.
* config/aarch64/aarch64-speculation.cc
(aarch64_speculation_clobber_sp): Likewise.
(aarch64_speculation_establish_tracker): Likewise.
(aarch64_do_track_speculation): Likewise.
* config/aarch64/aarch64.cc (aarch64_load_symref_appropriately)
(aarch64_expand_vector_init, aarch64_gen_ccmp_first): Likewise.
(aarch64_gen_ccmp_next, aarch64_mode_emit): Likewise.
(aarch64_md_asm_adjust): Likewise.
(aarch64_switch_pstate_sm_for_landing_pad): Likewise.
(aarch64_switch_pstate_sm_for_jump): Likewise.
(aarch64_switch_pstate_sm_for_call): Likewise.
* config/alpha/alpha.cc (alpha_legitimize_address_1): Likewise.
(alpha_emit_xfloating_libcall, alpha_gp_save_rtx): Likewise.
* config/arc/arc.cc (hwloop_optimize): Likewise.
* config/arm/aarch-common.cc (arm_md_asm_adjust): Likewise.
* config/arm/arm-builtins.cc: Likewise.
* config/arm/arm.cc (require_pic_register): Likewise.
(arm_call_tls_get_addr, arm_gen_load_multiple_1): Likewise.
(arm_gen_store_multiple_1, cmse_clear_registers): Likewise.
(cmse_nonsecure_call_inline_register_clear): Likewise.
(arm_attempt_dlstp_transform): Likewise.
* config/avr/avr-passes.cc (bbinfo_t::optimize_one_block): Likewise.
(avr_parallel_insn_from_insns): Likewise.
* config/avr/avr.cc (avr_prologue_setup_frame): Likewise.
(avr_expand_epilogue): Likewise.
* config/bfin/bfin.cc (hwloop_optimize): Likewise.
* config/c6x/c6x.cc (c6x_expand_compare): Likewise.
* config/cris/cris.cc (cris_split_movdx): Likewise.
* config/cris/cris.md: Likewise.
* config/csky/csky.cc (csky_call_tls_get_addr): Likewise.
* config/epiphany/resolve-sw-modes.cc
(pass_resolve_sw_modes::execute): Likewise.
* config/fr30/fr30.cc (fr30_move_double): Likewise.
* config/frv/frv.cc (frv_split_scc, frv_split_cond_move): Likewise.
(frv_split_minmax, frv_split_abs): Likewise.
* config/frv/frv.md: Likewise.
* config/gcn/gcn.cc (move_callee_saved_registers): Likewise.
(gcn_expand_prologue, gcn_restore_exec, gcn_md_reorg): Likewise.
* config/i386/i386-expand.cc
(ix86_expand_carry_flag_compare, ix86_expand_int_movcc): Likewise.
(ix86_vector_duplicate_value, expand_vec_perm_interleave2): Likewise.
(expand_vec_perm_vperm2f128_vblend): Likewise.
(expand_vec_perm_2perm_interleave): Likewise.
(expand_vec_perm_2perm_pblendv): Likewise.
(expand_vec_perm2_vperm2f128_vblend, ix86_gen_ccmp_first): Likewise.
(ix86_gen_ccmp_next): Likewise.
* config/i386/i386-features.cc
(scalar_chain::make_vector_copies): Likewise.
(scalar_chain::convert_reg, scalar_chain::convert_op): Likewise.
(timode_scalar_chain::convert_insn): Likewise.
* config/i386/i386.cc (ix86_init_pic_reg, ix86_va_start): Likewise.
(ix86_get_drap_rtx, legitimize_tls_address): Likewise.
(ix86_md_asm_adjust): Likewise.
* config/ia64/ia64.cc (ia64_expand_tls_address): Likewise.
(ia64_expand_compare, spill_restore_mem): Likewise.
(expand_vec_perm_interleave_2): Likewise.
* config/loongarch/loongarch.cc
(loongarch_call_tls_get_addr): Likewise.
* config/m32r/m32r.cc (gen_split_move_double): Likewise.
* config/m32r/m32r.md: Likewise.
* config/m68k/m68k.cc (m68k_call_tls_get_addr): Likewise.
(m68k_call_m68k_read_tp, m68k_sched_md_init_global): Likewise.
* config/m68k/m68k.md: Likewise.
* config/microblaze/microblaze.cc
(microblaze_call_tls_get_addr): Likewise.
* config/mips/mips.cc (mips_call_tls_get_addr): Likewise.
(mips_ls2_init_dfa_post_cycle_insn): Likewise.
(mips16_split_long_branches): Likewise.
* config/nvptx/nvptx.cc (nvptx_gen_shuffle): Likewise.
(nvptx_gen_shared_bcast, nvptx_propagate): Likewise.
(workaround_uninit_method_1, workaround_uninit_method_2): Likewise.
(workaround_uninit_method_3): Likewise.
* config/or1k/or1k.cc (or1k_init_pic_reg): Likewise.
* config/pa/pa.cc (legitimize_tls_address): Likewise.
* config/pru/pru.cc (pru_expand_fp_compare, pru_reorg_loop): Likewise.
* config/riscv/riscv-shorten-memrefs.cc
(pass_shorten_memrefs::transform): Likewise.
* config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): Likewise.
* config/riscv/riscv.cc (riscv_call_tls_get_addr): Likewise.
(riscv_frm_emit_after_bb_end): Likewise.
* config/rl78/rl78.cc (rl78_emit_libcall): Likewise.
* config/rs6000/rs6000.cc (rs6000_debug_legitimize_address): Likewise.
* config/s390/s390.cc (legitimize_tls_address): Likewise.
(s390_two_part_insv, s390_load_got, s390_va_start): Likewise.
* config/sh/sh_treg_combine.cc
(sh_treg_combine::make_not_reg_insn): Likewise.
* config/sparc/sparc.cc (sparc_legitimize_tls_address): Likewise.
(sparc_output_mi_thunk, sparc_init_pic_reg): Likewise.
* config/stormy16/stormy16.cc (xstormy16_split_cbranch): Likewise.
* config/xtensa/xtensa.cc (xtensa_copy_incoming_a7): Likewise.
(xtensa_expand_block_set_libcall): Likewise.
(xtensa_expand_block_set_unrolled_loop): Likewise.
(xtensa_expand_block_set_small_loop, xtensa_call_tls_desc): Likewise.
* dse.cc (emit_inc_dec_insn_before, find_shift_sequence): Likewise.
(replace_read): Likewise.
* emit-rtl.cc (reorder_insns, gen_clobber, gen_use): Likewise.
* except.cc (dw2_build_landing_pads, sjlj_mark_call_sites): Likewise.
(sjlj_emit_function_enter, sjlj_emit_function_exit): Likewise.
(sjlj_emit_dispatch_table): Likewise.
* expmed.cc (expmed_mult_highpart_optab, expand_sdiv_pow2): Likewise.
* expr.cc (convert_mode_scalar, emit_move_multi_word): Likewise.
(gen_move_insn, expand_cond_expr_using_cmove): Likewise.
(expand_expr_divmod, expand_expr_real_2): Likewise.
(maybe_optimize_pow2p_mod_cmp, maybe_optimize_mod_cmp): Likewise.
* function.cc (emit_initial_value_sets): Likewise.
(instantiate_virtual_regs_in_insn, expand_function_end): Likewise.
(get_arg_pointer_save_area, make_split_prologue_seq): Likewise.
(make_prologue_seq, gen_call_used_regs_seq): Likewise.
(thread_prologue_and_epilogue_insns): Likewise.
(match_asm_constraints_1): Likewise.
* gcse.cc (prepare_copy_insn): Likewise.
* ifcvt.cc (noce_emit_store_flag, noce_emit_move_insn): Likewise.
(noce_emit_cmove): Likewise.
* init-regs.cc (initialize_uninitialized_regs): Likewise.
* internal-fn.cc (expand_POPCOUNT): Likewise.
* ira-emit.cc (emit_move_list): Likewise.
* ira.cc (ira): Likewise.
* loop-doloop.cc (doloop_modify): Likewise.
* loop-unroll.cc (compare_and_jump_seq): Likewise.
(unroll_loop_runtime_iterations, insert_base_initialization): Likewise.
(split_iv, insert_var_expansion_initialization): Likewise.
(combine_var_copies_in_loop_exit): Likewise.
* lower-subreg.cc (resolve_simple_move,resolve_shift_zext): Likewise.
* lra-constraints.cc (match_reload, check_and_process_move): Likewise.
(process_addr_reg, insert_move_for_subreg): Likewise.
(process_address_1, curr_insn_transform): Likewise.
(inherit_reload_reg, process_invariant_for_inheritance): Likewise.
(inherit_in_ebb, remove_inheritance_pseudos): Likewise.
* lra-remat.cc (do_remat): Likewise.
* mode-switching.cc (commit_mode_sets): Likewise.
(optimize_mode_switching): Likewise.
* optabs.cc (expand_binop, expand_twoval_binop_libfunc): Likewise.
(expand_clrsb_using_clz, expand_doubleword_clz_ctz_ffs): Likewise.
(expand_doubleword_popcount, expand_ctz, expand_ffs): Likewise.
(expand_absneg_bit, expand_unop, expand_copysign_bit): Likewise.
(prepare_float_lib_cmp, expand_float, expand_fix): Likewise.
(expand_fixed_convert, gen_cond_trap): Likewise.
(expand_atomic_fetch_op): Likewise.
* ree.cc (combine_reaching_defs): Likewise.
* reg-stack.cc (compensate_edge): Likewise.
* reload1.cc (emit_input_reload_insns): Likewise.
* sel-sched-ir.cc (setup_nop_and_exit_insns): Likewise.
* shrink-wrap.cc (emit_common_heads_for_components): Likewise.
(emit_common_tails_for_components): Likewise.
(insert_prologue_epilogue_for_components): Likewise.
* tree-outof-ssa.cc (emit_partition_copy): Likewise.
(insert_value_copy_on_edge): Likewise.
* tree-ssa-loop-ivopts.cc (computation_cost): Likewise.

Richard Sandiford [Fri, 16 May 2025 12:24:01 +0000 (13:24 +0100)]

Make end_sequence return the insn sequence

The start_sequence/end_sequence interface was a big improvement over
the previous state, but one slightly awkward thing about it is that
you have to call get_insns before end_sequence in order to get the
insn sequence itself:

   To get the contents of the sequence just made, you must call
   `get_insns' *before* calling here.

We therefore have quite a lot of code like this:

  insns = get_insns ();
  end_sequence ();
  return insns;

It would seem simpler to write:

  return end_sequence ();

instead.

I can see three main potential objections to this:

(1) It isn't obvious whether ending the sequence would return the first
    or the last instruction.  But although some code reads *both* the
    first and the last instruction, I can't think of a specific case
    where code would want *only* the last instruction.  All the emit
    functions take the first instruction rather than the last.

(2) The "end" in end_sequence might imply the C++ meaning of an exclusive
    endpoint iterator.  But for an insn sequence, the exclusive endpoint
    is always the null pointer, so it would never need to be returned.
    That said, we could rename the function to something like
    "finish_sequence" or "complete_sequence" if this is an issue.

(3) There might have been an intention that start_sequence/end_sequence
    could in future reclaim memory for unwanted sequences, and so an
    explicit get_insns was used to indicate that the caller does want
    the sequence.

    But that sort of memory reclamation has never been added,
    and now that the codebase is C++, it would be easier to handle
    using RAII.  I think reclaiming memory would be difficult to do in
    any case, since some code records the individual instructions that
    it emits, rather than using get_insns.
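To make the interface change concrete, here is a toy model of a sequence stack, written under the assumption that a sequence is a singly linked list identified by its first instruction. The types `toy_insn` and `toy_sequence` and the two `make_pair_*` helpers are invented for this sketch; GCC's real rtx_insn and emit-rtl.cc machinery are considerably more involved.

```cpp
#include <cassert>
#include <vector>

// Toy model only; not GCC's data structures.
struct toy_insn
{
  int code;
  toy_insn *next;
};

struct toy_sequence
{
  toy_insn *first = nullptr;
  toy_insn *last = nullptr;
};

static toy_sequence current_seq;
static std::vector<toy_sequence> seq_stack;

// Save the enclosing sequence and start an empty one.
void start_sequence ()
{
  seq_stack.push_back (current_seq);
  current_seq = toy_sequence ();
}

// Append INSN to the sequence currently being built.
void emit_insn (toy_insn *insn)
{
  insn->next = nullptr;
  if (current_seq.last)
    current_seq.last->next = insn;
  else
    current_seq.first = insn;
  current_seq.last = insn;
}

toy_insn *get_insns () { return current_seq.first; }

// New-style interface: restore the enclosing sequence and hand back
// the first instruction of the sequence that was just finished.
toy_insn *end_sequence ()
{
  assert (!seq_stack.empty ());
  toy_insn *first = current_seq.first;
  current_seq = seq_stack.back ();
  seq_stack.pop_back ();
  return first;
}

// Old idiom: fetch the insns before ending the sequence.
toy_insn *make_pair_old (toy_insn *a, toy_insn *b)
{
  start_sequence ();
  emit_insn (a);
  emit_insn (b);
  toy_insn *insns = get_insns ();
  end_sequence ();
  return insns;
}

// New idiom: end_sequence returns the insns directly.
toy_insn *make_pair_new (toy_insn *a, toy_insn *b)
{
  start_sequence ();
  emit_insn (a);
  emit_insn (b);
  return end_sequence ();
}
```

Both helpers return the first instruction of the finished sequence, so callers written in the old style keep working while new callers can drop the separate get_insns call.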

gcc/
* rtl.h (end_sequence): Return the sequence.
* emit-rtl.cc (end_sequence): Likewise.

Jonathan Wakely [Thu, 15 May 2025 15:03:53 +0000 (16:03 +0100)]

libstdc++: Fix proc check_v3_target_namedlocale for "" locale [PR65909]

When the last formal argument to a Tcl proc is named 'args' it has
special meaning and is a list that accepts any number of arguments[1].
This means when "" is passed to the proc and then we expand "$args" we
get an empty list formatted as "{}". My r16-537-g3e2b83faeb6b14 change
broke all uses of dg-require-namedlocale with empty locale names, "".

By changing the name of the formal argument to 'locale' we avoid the
special behaviour for 'args' and now it only accepts a single argument
(as was always intended). When expanded as "$locale" we get "" as I
expected.

[1] https://www.tcl-lang.org/man/tcl9.0/TclCmd/proc.html

libstdc++-v3/ChangeLog:

PR libstdc++/65909
* testsuite/lib/libstdc++.exp (check_v3_target_namedlocale):
Change name of formal argument to locale.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

Pan Li [Tue, 13 May 2025 14:54:17 +0000 (22:54 +0800)]

RISC-V: Reuse test name for vx combine test data [NFC]

The run tests already use a name such as add/sub to identify
the test case, so we can reuse it to identify the
test data instead of introducing a new one.

The following test suite passed for this patch:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Take
test name for the vx combine test data.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i16.c: Leverage
the test name to identify the test data.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Tue, 13 May 2025 14:47:13 +0000 (22:47 +0800)]

RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 2

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The following test suite passed for this patch:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Add test cases
for vsub vx combine case 1 with GR2VR cost 2.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Tue, 13 May 2025 14:38:57 +0000 (22:38 +0800)]

RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 1

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The following test suite passed for this patch:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add test cases
for vsub vx combine case 1 with GR2VR cost 1.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Tue, 13 May 2025 14:32:03 +0000 (22:32 +0800)]

RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 0

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The following test suite passed for this patch:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add test cases
for vsub vx combine case 1 with GR2VR cost 0.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Sun, 11 May 2025 08:32:51 +0000 (16:32 +0800)]

RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 15

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The following test suite passed for this patch:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Add test cases
for vsub vx combine with GR2VR cost 15.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Sun, 11 May 2025 08:31:16 +0000 (16:31 +0800)]

RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 1

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The following test suite passed for this patch:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add test cases
for vsub vx combine with GR2VR cost 1.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Sun, 11 May 2025 08:27:48 +0000 (16:27 +0800)]

RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 0

Add asm dump check and run test for vec_duplicate + vsub.vv
combine to vsub.vx.
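For orientation, the loop shape that such tests are assumed to exercise is a vector minus a loop-invariant scalar. The function name `vx_sub_i32` below is hypothetical; the real tests generate their functions from helper macros in vx_binary.h, which are not shown in this log. With RVV auto-vectorization the scalar would first be broadcast (vec_duplicate) and subtracted with vsub.vv; the combine being tested is expected to fold that pair into a single vsub.vx, depending on the GR2VR cost in effect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// out[i] = in[i] - x for a loop-invariant scalar x.
void vx_sub_i32 (std::int32_t *out, const std::int32_t *in,
                 std::int32_t x, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    out[i] = in[i] - x;
}

// Small usage check with values that cannot overflow.
inline void vx_sub_i32_demo ()
{
  const std::int32_t in[4] = { 10, 20, 30, 40 };
  std::int32_t out[4] = { 0, 0, 0, 0 };
  vx_sub_i32 (out, in, 3, 4);
  assert (out[0] == 7 && out[3] == 37);
}
```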

The following test suite passed for this patch:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add vector sub
vx combine asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for vector sub vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Tue, 13 May 2025 03:12:53 +0000 (11:12 +0800)]

RISC-V: Adjust vx combine test case to avoid name conflict

Given that we will put all vx combine tests for int8 in a single file,
we need to make sure the functions generated for different
types and ops have different names. Thus, refactor
the test helper macros to avoid possible function name
conflicts.
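A minimal sketch of the idea, assuming a helper macro that pastes both the op name and the element type into the generated function name. The macro name `DEF_VX_BINARY_SKETCH` and the `test_vx_binary_` prefix are hypothetical; the real macros are in vx_binary.h and are not reproduced here.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: the generated name contains both NAME and T,
// so instantiations for different ops or types never collide.
#define DEF_VX_BINARY_SKETCH(T, NAME, OP)                         \
  void test_vx_binary_##NAME##_##T (T *out, const T *in, T x,     \
                                    size_t n)                     \
  {                                                               \
    for (size_t i = 0; i < n; i++)                                \
      out[i] = static_cast<T> (in[i] OP x);                       \
  }

// Three instantiations that can share one translation unit.
DEF_VX_BINARY_SKETCH (int8_t, add, +)
DEF_VX_BINARY_SKETCH (int8_t, sub, -)
DEF_VX_BINARY_SKETCH (uint8_t, add, +)

// Small usage check for one of the generated functions.
inline void def_vx_binary_sketch_demo ()
{
  const int8_t in[2] = { 1, 2 };
  int8_t out[2] = { 0, 0 };
  test_vx_binary_add_int8_t (out, in, 1, 2);
  assert (out[0] == 2 && out[1] == 3);
}
```

If the name included only the type, the add and sub instantiations for int8_t would both define the same function and fail to compile in a single file.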

The following test suite passed for this patch series:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add
type and op name to generate test function name.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Refine the
test helper macros to avoid conflict.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_run.h: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Tue, 13 May 2025 02:00:35 +0000 (10:00 +0800)]

RISC-V: Rename vx_vadd-* testcase to vx-* for all vx combine [NFC]

We would like to collect all vx combine asm check tests into
one file for easier management. Thus, rename vx_vadd-* to
vx-*.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: ...here.

Signed-off-by: Pan Li <pan2.li@intel.com>

commit | commitdiff | tree

Pan Li [Sun, 11 May 2025 08:20:28 +0000 (16:20 +0800)]

RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

This patch combines vec_duplicate + vsub.vv into vsub.vx, as in the
example code below.  The related pattern depends on the cost of
vec_duplicate from GR2VR: late-combine performs the combination if the
GR2VR cost is zero, and rejects it if the GR2VR cost is greater than
zero.

Assume we have the example code below, with a GR2VR cost of 0.

  #define DEF_VX_BINARY(T, OP)                                        \
  void                                                                \
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = in[i] OP x;                                            \
  }

  DEF_VX_BINARY(int32_t, -)

Before this patch:
  test_binary_vx_sub:
      beq a3,zero,.L8
      vsetvli a5,zero,e32,m1,ta,ma // Deleted if GR2VR cost zero
      vmv.v.x v2,a2                // Ditto.
      slli    a3,a3,32
      srli    a3,a3,32
  .L3:
      vsetvli a5,a3,e32,m1,ta,ma
      vle32.v v1,0(a1)
      slli    a4,a5,2
      sub a3,a3,a5
      add a1,a1,a4
      vsub.vv v1,v2,v1
      vse32.v v1,0(a0)
      add a0,a0,a4
      bne a3,zero,.L3

After this patch:
  test_binary_vx_sub:
      beq a3,zero,.L8
      slli    a3,a3,32
      srli    a3,a3,32
  .L3:
      vsetvli a5,a3,e32,m1,ta,ma
      vle32.v v1,0(a1)
      slli    a4,a5,2
      sub a3,a3,a5
      add a1,a1,a4
      vsub.vx v1,v1,a2
      vse32.v v1,0(a0)
      add a0,a0,a4
      bne a3,zero,.L3

The below test suites are passed for this patch.
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*<optab>_vx_<mode>): Add new
pattern to convert vec_duplicate + vsub.vv to vsub.vx.
* config/riscv/riscv.cc (riscv_rtx_costs): Add minus as plus op.
* config/riscv/vector-iterators.md: Add minus to iterator
any_int_binop_no_shift_vx.

Signed-off-by: Pan Li <pan2.li@intel.com>

commit | commitdiff | tree

GCC Administrator [Fri, 16 May 2025 00:18:46 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Jason Merrill [Sat, 10 May 2025 15:24:38 +0000 (11:24 -0400)]

c++: remove coroutines.exp

coroutines.exp was basically only there to add -std=c++20 to all the tests;
removing it lets us use the general support for running tests under multiple
standards. Doing this revealed that some tests that specifically run in
C++17 mode were relying on -std=c++20 followed by -std=c++17 leaving
flag_coroutines set, which seems unintentional, and different from how we
handle other feature flags. So this changes that, and adds the missing
-fcoroutines to those tests.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/co-await-syntax-09-convert.C: Add -fcoroutines.
* g++.dg/coroutines/co-await-syntax-10.C
* g++.dg/coroutines/co-await-syntax-11.C
* g++.dg/coroutines/co-await-void_type.C
* g++.dg/coroutines/co-return-warning-1.C
* g++.dg/coroutines/ramp-return-a.C
* g++.dg/coroutines/ramp-return-c.C: Likewise.
* g++.dg/coroutines/coroutines.exp: Removed.
* lib/g++-dg.exp: Start at C++20 for coroutines/

gcc/c-family/ChangeLog:

* c-opts.cc (c_common_post_options): Set flag_coroutines.
(set_std_cxx20, set_std_cxx23, set_std_cxx26): Not here.

commit | commitdiff | tree

Harald Anlauf [Thu, 15 May 2025 19:07:07 +0000 (21:07 +0200)]

Fortran: default-initialization and functions returning derived type [PR85750]

Functions with non-pointer, non-allocatable result and of derived type did
not always get initialized although the type had default-initialization,
and a derived type component had the allocatable or pointer attribute.
Rearrange the logic when to apply default-initialization.

PR fortran/85750

gcc/fortran/ChangeLog:

* resolve.cc (resolve_symbol): Reorder conditions when to apply
default-initializers.

gcc/testsuite/ChangeLog:

* gfortran.dg/alloc_comp_auto_array_3.f90: Adjust scan counts.
* gfortran.dg/alloc_comp_class_3.f03: Remove bogus warnings.
* gfortran.dg/alloc_comp_class_4.f03: Likewise.
* gfortran.dg/allocate_with_source_14.f03: Adjust scan count.
* gfortran.dg/derived_constructor_comps_6.f90: Likewise.
* gfortran.dg/derived_result_5.f90: New test.

commit | commitdiff | tree

Joseph Myers [Thu, 15 May 2025 18:02:26 +0000 (18:02 +0000)]

Update gcc zh_CN.po

* zh_CN.po: Update.

commit | commitdiff | tree

Robert Dubner [Thu, 15 May 2025 17:33:16 +0000 (13:33 -0400)]

cobol: One additional edit to testsuite/cobol.dg/group1/check_88.cob [PR120251]

Missed one edit. This fixes that.

gcc/testsuite/ChangeLog:

PR cobol/120251
* cobol.dg/group1/check_88.cob: One final regex "." instead of "ß"

commit | commitdiff | tree

Joseph Myers [Thu, 15 May 2025 17:19:48 +0000 (17:19 +0000)]

Update cpplib zh_CN.po

* zh_CN.po: Update.

commit | commitdiff | tree

Andrew MacLeod [Wed, 14 May 2025 15:32:58 +0000 (11:32 -0400)]

Enhance bitwise_and::op1_range

Any known bits from the LHS range can be used to specify known bits in
the non-mask operand.

PR tree-optimization/116546
gcc/
* range-op.cc (operator_bitwise_and::op1_range): Utilize bitmask
from the LHS to improve op1's bitmask.

gcc/testsuite/
* gcc.dg/pr116546.c: New.
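
The idea of reusing known LHS bits can be sketched outside of GCC. The
block below is a minimal illustration, not the code in range-op.cc: the
KnownBits struct and op1_known_bits function are hypothetical names, and
the mask operand is assumed to be a constant. For lhs = op1 & m, every
bit position where m is 1 passes op1's bit through unchanged, so a bit
that is known in lhs at such a position is also known in op1.

```cpp
#include <cstdint>

// Hypothetical known-bits summary: a set bit in `mask` means "unknown",
// and `value` holds the bits that are known.
struct KnownBits
{
  std::uint64_t mask;
  std::uint64_t value;
};

// Sketch for lhs = op1 & m with a constant m.  Where m has a 1 bit,
// op1's bit equals the lhs bit, so known lhs bits carry over to op1.
// Where m has a 0 bit, nothing can be learned about op1.
constexpr KnownBits
op1_known_bits (KnownBits lhs, std::uint64_t m)
{
  std::uint64_t known = m & ~lhs.mask;
  return { ~known, lhs.value & known };
}

// (op1 & 0xF0) == 0x30 pins bits 4..7 of op1 to 0x3.
static_assert (op1_known_bits ({ 0, 0x30 }, 0xF0).value == 0x30);
static_assert (op1_known_bits ({ 0, 0x30 }, 0xF0).mask
	       == ~std::uint64_t { 0xF0 });
```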

commit | commitdiff | tree

Andrew MacLeod [Wed, 14 May 2025 15:13:15 +0000 (11:13 -0400)]

Allow bitmask intersection to process unknown masks.

bitmask_intersection should not return immediately if the current mask is
unknown. Unknown may mean it's the default for a range, and this may
interact in interesting ways with the other bitmask.

PR tree-optimization/116546
* value-range.cc (irange::intersect_bitmask): Allow unknown
bitmasks to be processed.

commit | commitdiff | tree

Andrew MacLeod [Wed, 14 May 2025 15:12:22 +0000 (11:12 -0400)]

Improve constant bitmasks.

Bitmasks for constants are created only for trailing zeros. It is no
additional work to also include the leading ones in the value, which are
also known.
before : [5, 7] mask 0x7 value 0x0
after : [5, 7] mask 0x3 value 0x4

PR tree-optimization/116546
* value-range.cc (irange_bitmask::irange_bitmask): Include
leading ones in the bitmask.
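
The before/after numbers above can be reproduced with a small
stand-alone sketch. This is not the irange_bitmask constructor itself:
KnownBits and known_bits_of_range are hypothetical names, and the bounds
are assumed to be unsigned with lo <= hi. Every value in [lo, hi] shares
the bits above the highest position where lo and hi differ, so only the
bits at or below that position need to be marked unknown.

```cpp
#include <cstdint>

// Hypothetical known-bits summary: a set bit in `mask` means "unknown",
// and `value` holds the bits that are known.
struct KnownBits
{
  std::uint64_t mask;
  std::uint64_t value;
};

// Sketch: mark as unknown every bit at or below the highest bit in
// which lo and hi differ; the common leading bits are known.
constexpr KnownBits
known_bits_of_range (std::uint64_t lo, std::uint64_t hi)
{
  std::uint64_t diff = lo ^ hi;
  std::uint64_t mask = 0;
  while (diff != 0)
    {
      mask = (mask << 1) | 1;
      diff >>= 1;
    }
  return { mask, lo & ~mask };
}

// [5, 7]: 0b101 and 0b111 differ only in bit 1, so bit 2 is known.
static_assert (known_bits_of_range (5, 7).mask == 0x3);
static_assert (known_bits_of_range (5, 7).value == 0x4);
```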

commit | commitdiff | tree

Andrew MacLeod [Tue, 13 May 2025 17:23:16 +0000 (13:23 -0400)]

Turn get_bitmask_from_range into an irange_bitmask constructor.

There are other places where this is interesting, so move the static
function into a constructor for class irange_bitmask.

* value-range.cc (irange_bitmask::irange_bitmask): Rename from
get_bitmask_from_range and tweak.
(prange::set): Use new constructor.
(prange::intersect): Use new constructor.
(irange::get_bitmask): Likewise.
* value-range.h (irange_bitmask): New constructor prototype.

commit | commitdiff | tree

Andrew MacLeod [Thu, 15 May 2025 15:06:05 +0000 (11:06 -0400)]

Check for casts becoming UNDEFINED.

In various situations a cast that is ultimately unreachable may produce
an UNDEFINED result, and we can't check the bounds in this case.

PR tree-optimization/120277
gcc/
* range-op-ptr.cc (operator_cast::fold_range): Check if the cast
is UNDEFINED before setting bounds.

gcc/testsuite/
* gcc.dg/pr120277.c: New.

commit | commitdiff | tree

Robert Dubner [Thu, 15 May 2025 16:01:12 +0000 (12:01 -0400)]

cobol: Don't display 0xFF HIGH-VALUE characters in testcases. [PR120251]

The tests were displaying 0xFF characters, and the resulting generated
output changed with the system locale. The check_88 test was modified
so that the regex comparisons ignore those character positions. Two
of the other tests were changed to output hexadecimal rather than
character strings.

There is one new test, and the other inspect testcases were edited to
remove an unimportant back-apostrophe that had found its way into the
source code sequence number area.

gcc/testsuite/ChangeLog:

PR cobol/120251
* cobol.dg/group1/check_88.cob: Ignore characters above 0x80.
* cobol.dg/group2/ALLOCATE_Rule_8_OPTION_INITIALIZE_with_figconst.cob:
Output HIGH-VALUE as hex, rather than as characters.
* cobol.dg/group2/ALLOCATE_Rule_8_OPTION_INITIALIZE_with_figconst.out:
Likewise.
* cobol.dg/group2/INSPECT_CONVERTING_TO_figurative_constants.cob: Typo.
* cobol.dg/group2/INSPECT_CONVERTING_TO_figurative_constants.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_1.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_2.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_3.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_4.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_5-f.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_6.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_7.cob: Likewise.
* cobol.dg/group2/Multiple_INDEXED_BY_variables_with_the_same_name.cob: New test.
* cobol.dg/group2/Multiple_INDEXED_BY_variables_with_the_same_name.out: New test.

commit | commitdiff | tree

Luc Grosheintz [Wed, 14 May 2025 19:13:52 +0000 (21:13 +0200)]

libstdc++: Fix class mandate for extents.

The standard states that the IndexType must be a signed or unsigned
integer. This mandate was implemented using `std::is_integral_v`, which
also includes (among others) char and bool, which are neither signed nor
unsigned integers.

libstdc++-v3/ChangeLog:

* include/std/mdspan: Implement the mandate for extents as
signed or unsigned integer and not any integral type. Remove
leading underscores from names in static_assert message.
* testsuite/23_containers/mdspan/extents/class_mandates_neg.cc:
Check that extents<char,...> and extents<bool,...> are invalid.
Adjust dg-prune-output pattern.
* testsuite/23_containers/mdspan/extents/misc.cc: Update
tests to avoid `char` and `bool` as IndexType.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
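
The gap between "integral type" and "signed or unsigned integer type"
can be shown with a short sketch. The trait names below are
hypothetical and are not the ones used inside libstdc++; the block only
illustrates that std::is_integral_v accepts bool and the character
types, and one possible way to exclude them.

```cpp
#include <type_traits>

// Hypothetical helper: true if T is the same as any of Us.
template<typename T, typename... Us>
  inline constexpr bool is_one_of_v = (std::is_same_v<T, Us> || ...);

// Hypothetical trait: integral types minus bool and the character
// types, which are not signed or unsigned integer types.
template<typename T>
  inline constexpr bool is_signed_or_unsigned_integer_v
    = std::is_integral_v<T>
      && !is_one_of_v<std::remove_cv_t<T>, bool, char, wchar_t,
		      char16_t, char32_t
#ifdef __cpp_char8_t
		      , char8_t
#endif
		      >;

// is_integral_v alone is too permissive for the mandate.
static_assert (std::is_integral_v<char>);
static_assert (std::is_integral_v<bool>);

static_assert (!is_signed_or_unsigned_integer_v<char>);
static_assert (!is_signed_or_unsigned_integer_v<bool>);
static_assert (is_signed_or_unsigned_integer_v<int>);
static_assert (is_signed_or_unsigned_integer_v<unsigned long>);
static_assert (is_signed_or_unsigned_integer_v<signed char>);
```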

commit | commitdiff | tree

Jonathan Wakely [Thu, 15 May 2025 10:01:05 +0000 (11:01 +0100)]

libstdc++: Fix std::format_kind primary template for Clang [PR120190]

Although Clang trunk has been adjusted to handle our std::format_kind
definition (because they need to be able to compile the GCC 15.1.0
release), it's probably better to not rely on something that they might
start diagnosing again in future.

Define the primary template in terms of an immediately invoked function
expression, so that we can put a static_assert(false) in the body.

libstdc++-v3/ChangeLog:

PR libstdc++/120190
* include/std/format (format_kind): Adjust primary template to
not depend on itself.
* testsuite/std/format/ranges/format_kind_neg.cc: Adjust
expected errors. Check more invalid specializations.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Daniel Krügler <daniel.kruegler@gmail.com>

commit | commitdiff | tree

Jonathan Wakely [Mon, 12 May 2025 11:56:17 +0000 (12:56 +0100)]

libstdc++: Micro-optimization in std::arg overload for scalars

Use __builtin_signbit directly instead of std::signbit.

libstdc++-v3/ChangeLog:

* include/std/complex (arg(T)): Use __builtin_signbit instead of
std::signbit.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

commit | commitdiff | tree

Jonathan Wakely [Mon, 12 May 2025 10:34:01 +0000 (11:34 +0100)]

libstdc++: Deprecate non-standard std::fabs(const complex<T>&) [PR120235]

There was an overload of fabs for std::complex in TR1 and in some C++0x
drafts, but it was removed from the working draft by LWG 595.

Since we've been providing it for decades we should deprecate it before
removing it.

libstdc++-v3/ChangeLog:

PR libstdc++/120235
* doc/html/*: Regenerate.
* doc/xml/manual/evolution.xml: Document deprecation.
* include/std/complex: Replace references to TR1 subclauses with
corresponding C++11 subclauses.
(fabs): Add deprecated attribute.
* testsuite/26_numerics/complex/fabs_neg.cc: New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

commit | commitdiff | tree

Jeff Law [Thu, 15 May 2025 15:03:13 +0000 (09:03 -0600)]

[RISC-V][PR target/120223] Don't use bset/binv for XTHEADBS

Thead has the XTHEADBB extension which has a lot of overlap with Zbb.  I made
the incorrect assumption that XTHEADBS would largely be like Zbs when
generalizing Shreya's work.

As a result we can't use the operation synthesis code for IOR/XOR because we
don't have binv/bset like capabilities.  I should have double checked on
XTHEADBS, my bad.

Anyway, the fix is trivial.  Don't allow bset/binv based on XTHEADBS.

Already spun in my tester.  Spinning in the pre-commit CI system now.

PR target/120223
gcc/
* config/riscv/riscv.cc (synthesize_ior_xor): XTHEADBS does not have
single bit manipulations.

gcc/testsuite/

* gcc.target/riscv/pr120223.c: New test.

commit | commitdiff | tree

Patrick Palka [Thu, 15 May 2025 15:07:53 +0000 (11:07 -0400)]

c++: unifying specializations of non-primary tmpls [PR120161]

Here unification of P=Wrap<int>::type, A=Wrap<long>::type wrongly
succeeds ever since r14-4112 which made the RECORD_TYPE case of unify
no longer recurse into template arguments for non-primary templates
(since they're a non-deduced context) and so the int/long mismatch that
makes the two types distinct goes unnoticed.

In the case of (comparing specializations of) a non-primary template,
unify should still go on to compare the types directly before returning
success.

PR c++/120161

gcc/cp/ChangeLog:

* pt.cc (unify) <case RECORD_TYPE>: When comparing specializations
of a non-primary template, still perform a type comparison.

gcc/testsuite/ChangeLog:

* g++.dg/template/unify13.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

commit | commitdiff | tree

Jason Merrill [Mon, 18 Nov 2024 18:48:41 +0000 (19:48 +0100)]

c++: use normal std list for module tests

The modules tests have used their own version of the code to run tests under
multiple standard versions; they should use the same one as other tests.

I'm not sure about continuing to run modules tests in C++17 mode, but I
guess we might as well for now.

gcc/testsuite/ChangeLog:

* lib/g++-dg.exp (g++-std-flags): Factor out of g++-dg-runtest.
* g++.dg/modules/modules.exp: Use it instead of a copy.

commit | commitdiff | tree

Jason Merrill [Fri, 9 May 2025 23:13:49 +0000 (19:13 -0400)]

c++: -fimplicit-constexpr and modules

Import didn't like differences in DECL_DECLARED_CONSTEXPR_P due to implicit
constexpr, breaking several g++.dg/modules tests; we should handle that
along with DECL_MAYBE_DELETED, for which we need to stream the bit.

gcc/cp/ChangeLog:

* module.cc (trees_out::lang_decl_bools): Stream implicit_constexpr.
(trees_in::lang_decl_bools): Likewise.
(trees_in::is_matching_decl): Check it.

commit | commitdiff | tree

Jason Merrill [Wed, 14 May 2025 14:23:32 +0000 (10:23 -0400)]

c++: one more PR99599 tweak

Patrick pointed out that if the parm/arg types aren't complete yet at this
point, it would affect the type_has_converting_constructor and
TYPE_HAS_CONVERSION tests. I don't have a testcase, but it makes sense for
safety.

PR c++/99599

gcc/cp/ChangeLog:

* pt.cc (conversion_may_instantiate_p): Make sure
classes are complete.

commit | commitdiff | tree

Jason Merrill [Thu, 1 May 2025 14:20:25 +0000 (10:20 -0400)]

libstdc++: build testsuite with -Wabi

I added this locally to check whether the PR120012 fix affects libstdc++ (it
doesn't) but it seems more generally useful to catch whether compiler
ABI changes have library impact.

libstdc++-v3/ChangeLog:

* testsuite/lib/libstdc++.exp: Add -Wabi.

commit | commitdiff | tree

Alexander Monakov [Mon, 12 May 2025 20:23:31 +0000 (23:23 +0300)]

tighten type verification for CONJ_EXPR

As a followup to PAREN_EXPR verification, let's ensure that CONJ_EXPR is
only used with COMPLEX_TYPE. While at it, move the whole block towards
the end of the switch, because unlike the other entries it needs to
break out of the switch, not immediately return from the function,
as after the switch we check that types of LHS and RHS match.

Refactor a bit to avoid repeated blocks with debug_generic_expr.

gcc/ChangeLog:

* tree-cfg.cc (verify_gimple_assign_unary): Accept only
COMPLEX_TYPE for CONJ_EXPR.

commit | commitdiff | tree

Tobias Burnus [Thu, 15 May 2025 07:15:21 +0000 (09:15 +0200)]

OpenMP/Fortran: Fix allocatable-component mapping of derived-type array comps

The check whether the location expression in a map clause has allocatable
components was failing for some derived-type array expressions such as
  map(var%tiles(1))
as the compiler produced
  _4 = var.tiles;
  MEMREF(_4, _5);
This commit now also handles this case.

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_omp_deep_mapping_do): Handle SSA_NAME if
a def_stmt is available.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/alloc-comp-4.f90: New test.

commit | commitdiff | tree

Tomasz Kamiński [Thu, 15 May 2025 06:58:09 +0000 (08:58 +0200)]

libstdc++: Fix preprocessor check for __float128 formatter [PR119246]

The previous check `_GLIBCXX_FORMAT_F128 != 1` was passing if
_GLIBCXX_FORMAT_F128 was not defined, i.e. evaluated to zero.

This broke sparc-sun-solaris2.11 and x86_64-darwin.

PR libstdc++/119246

libstdc++-v3/ChangeLog:

* include/std/format: Updated check for _GLIBCXX_FORMAT_F128.
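
The preprocessor behaviour behind this bug can be reproduced in
isolation. In the sketch below HYPOTHETICAL_F128 is a stand-in macro
that is deliberately left undefined; it is not the real libstdc++ macro,
and the guarded form is only one possible way to write such a check, not
necessarily the one used in the fix. An undefined identifier in an #if
expression evaluates to 0, so a bare `!= 1` comparison is true.

```cpp
// HYPOTHETICAL_F128 is intentionally never defined in this sketch.

// Unguarded check: an undefined macro evaluates to 0, and 0 != 1.
#if HYPOTHETICAL_F128 != 1
constexpr bool unguarded_check_taken = true;
#else
constexpr bool unguarded_check_taken = false;
#endif

// Guarded check: only taken when the macro is defined and not 1.
#if defined(HYPOTHETICAL_F128) && HYPOTHETICAL_F128 != 1
constexpr bool guarded_check_taken = true;
#else
constexpr bool guarded_check_taken = false;
#endif

static_assert (unguarded_check_taken);
static_assert (!guarded_check_taken);
```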

commit | commitdiff | tree

Andrew Pinski [Wed, 14 May 2025 16:01:07 +0000 (09:01 -0700)]

tree: Canonical order for ADDR

This is the followup based on the review at
https://inbox.sourceware.org/gcc-patches/CAFiYyc3xeG75dsWaF63Zbu5uELPEAEoHwGfoGaVyDWouUJ70Mg@mail.gmail.com/
.
We should put ADDR_EXPR last instead of just is_gimple_invariant_address ones.

Note a few match patterns needed to be updated for this change but we get a decent improvement
as forwprop-38.c is now able to optimize during CCP rather than taking all the way to forwprop.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* fold-const.cc (tree_swap_operands_p): Put ADDR_EXPR last
instead of just is_gimple_invariant_address ones.
* match.pd (`a ptr+ b !=/== ADDR`, `ADDR !=/== ssa_name`):
Move the ADDR to the last operand. Update comment.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Richard Biener [Wed, 14 May 2025 14:45:08 +0000 (16:45 +0200)]

Enhance -fopt-info-vec vectorized loop diagnostic

The following includes whether we vectorize an epilogue, whether
we use loop masking and what vectorization factor (unroll factor)
we use. So it's now

t.c:4:21: optimized: loop vectorized using 64 byte vectors and unroll factor 32
t.c:4:21: optimized: epilogue loop vectorized using masked 64 byte vectors and unroll factor 32

for a masked epilogue with AVX512 and HImode data for example. Rather
than

t.c:4:21: optimized: loop vectorized using 64 byte vectors
t.c:4:21: optimized: loop vectorized using 64 byte vectors

I verified we don't translate opt-info messages and thus excessive
use of %s to compose the strings should be OK.

* tree-vectorizer.cc (vect_transform_loops): When diagnosing
a vectorized loop indicate whether we vectorized an epilogue,
whether we used masked vectors and what unroll factor was
used.

* gcc.target/i386/pr110310.c: Adjust.

commit | commitdiff | tree

Richard Biener [Wed, 14 May 2025 14:36:29 +0000 (16:36 +0200)]

Fix regression from x86 multi-epilogue tuning

With the avx512_two_epilogues tuning enabled for zen4 and zen5
the gcc.target/i386/vect-epilogues-5.c testcase below regresses
and ends up using AVX2 sized vectors for the masked epilogue
rather than AVX512 sized vectors. The following patch rectifies
this and adds coverage for the intended behavior.

* config/i386/i386.cc (ix86_vector_costs::finish_cost):
Do not suggest a first epilogue mode for AVX512 sized
main loops with X86_TUNE_AVX512_TWO_EPILOGUES as that
interferes with using a masked epilogue.

* gcc.target/i386/vect-epilogues-1.c: New testcase.
* gcc.target/i386/vect-epilogues-2.c: Likewise.
* gcc.target/i386/vect-epilogues-3.c: Likewise.
* gcc.target/i386/vect-epilogues-4.c: Likewise.
* gcc.target/i386/vect-epilogues-5.c: Likewise.

commit | commitdiff | tree

liuhongt [Tue, 13 May 2025 01:26:13 +0000 (18:26 -0700)]

Update libbid according to the latest Intel Decimal Floating-Point Math Library.

The Intel Decimal Floating-Point Math Library is available as open-source on Netlib[1].

[1] https://www.netlib.org/misc/intel/

libgcc/config/libbid/ChangeLog:

* bid128_string.c (MIN_DIGITS): New macro.
(bid128_from_string): Bug fix. Conversion from very long input
string to decimal.

commit | commitdiff | tree

GCC Administrator [Thu, 15 May 2025 00:19:47 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

Joseph Myers [Wed, 14 May 2025 21:11:50 +0000 (21:11 +0000)]

Update gcc sv.po

* sv.po: Update.

commit | commitdiff | tree

Joseph Myers [Wed, 14 May 2025 20:25:27 +0000 (20:25 +0000)]

Update cpplib es.po

* es.po: Update.

commit | commitdiff | tree

Simon Martin [Wed, 14 May 2025 18:29:57 +0000 (20:29 +0200)]

c++: Add testcase for issue fixed in GCC 15 [PR120126]

Patrick noticed that this PR's testcase has been fixed by the patch for
PR c++/114292 (r15-7238-gceabea405ffdc8), more specifically the part
that walks the type of DECL_EXPR DECLs.

This simply adds the case to the testsuite.

PR c++/120126

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-ice33.C: New test.

commit | commitdiff | tree

Thomas Koenig [Wed, 14 May 2025 18:11:48 +0000 (20:11 +0200)]

Fix explicit arrays with non-constant size for -fc-prototypes.

gcc/fortran/ChangeLog:

PR fortran/120139
* dump-parse-tree.cc (get_c_type_name): If no constant
size of an array exists, output an asterisk.

commit | commitdiff | tree

Thomas Koenig [Tue, 13 May 2025 17:02:06 +0000 (19:02 +0200)]

Do not dump non-interoperable types with -fc-prototypes.

gcc/fortran/ChangeLog:

PR fortran/120107
* dump-parse-tree.cc (write_type): Do not dump non-interoperable
types.

commit | commitdiff | tree

Tobias Burnus [Wed, 14 May 2025 18:06:49 +0000 (20:06 +0200)]

OpenMP: Fix mapping of zero-sized arrays with non-literal size: map(var[:n]), n = 0

For map(ptr[:0]), the used map kind is GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION
and it is permitted that 'ptr' does not exist. 'ptr' is set to the device
pointee if it exists or to the host value otherwise.

For map(ptr[:3]), the variable is first mapped and then ptr is updated to point
to the just-mapped device data; the attachment uses GOMP_MAP_ATTACH.

For map(ptr[:n]), GCC always generates a GOMP_MAP_ATTACH, but when n == 0,
it was failing with:
"pointer target not mapped for attach"

The solution is not to fail but first to check whether it was mapped before.
It turned out that for the mapping part, GCC adds a run-time check whether
n == 0 - and uses GOMP_MAP_ZERO_LEN_ARRAY_SECTION for the mapping.
Thus, we just have to check whether there is such a mapping for the address
for which the GOMP_MAP_ATTACH was requested. And, if there was, the
error diagnostic can be skipped.

Unsurprisingly, this issue occurs in real-world code; it was detected in
a code that distributes work via MPI, where for some processes some bounds
ended up being zero.

libgomp/ChangeLog:

* target.c (gomp_attach_pointer): Return bool; accept additional
bool to optionally silence the fatal pointee-not-found error.
(gomp_map_vars_internal): If the pointee could not be found,
check whether it was mapped as GOMP_MAP_ZERO_LEN_ARRAY_SECTION.
* libgomp.h (gomp_attach_pointer): Update prototype.
* oacc-mem.c (acc_attach_async, goacc_enter_data_internal): Update
calls.
* testsuite/libgomp.c/target-map-zero-sized.c: New test.
* testsuite/libgomp.c/target-map-zero-sized-2.c: New test.
* testsuite/libgomp.c/target-map-zero-sized-3.c: New test.
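
As an illustration of how a section length that is only known at run
time can become zero, here is a minimal sketch. It is not one of the
new testcases and is not claimed to reproduce the attach failure; the
function names are hypothetical. It only shows a map(p[:n]) clause whose
n is computed from bounds, as in the work-distribution scenario
described above. Without -fopenmp the pragma is ignored and the loop
runs on the host.

```cpp
#include <cassert>

// Sums the first n elements of p inside a target region.  The section
// length n is a run-time value and may legitimately be zero.
long
sum_section (const int *p, int n)
{
  long sum = 0;
  #pragma omp target map(to: p[:n]) map(tofrom: sum)
  for (int i = 0; i < n; i++)
    sum += p[i];
  return sum;
}

// Hypothetical caller: each worker handles the half-open index range
// [begin, end), which is empty for workers that received no work.
long
sum_share (const int *data, int begin, int end)
{
  assert (begin <= end);
  return sum_section (data + begin, end - begin);
}
```

A worker with begin == end ends up mapping a zero-length section, which
is the n == 0 situation the commit message describes.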

commit | commitdiff | tree

Richard Biener [Tue, 13 May 2025 08:08:36 +0000 (10:08 +0200)]

Remove the mixed stmt_vec_info/SLP node record_stmt_cost overload

The following changes the record_stmt_cost calls in
vectorizable_load/store to only pass the SLP node when costing
vector stmts. For now we'll still pass the stmt_vec_info,
determined from SLP_TREE_REPRESENTATIVE, so this merely cleans up
the API.

* tree-vectorizer.h (record_stmt_cost): Remove mixed
stmt_vec_info/SLP node inline overload.
* tree-vect-stmts.cc (vectorizable_store): For costing
vector stmts only pass SLP node to record_stmt_cost.
(vectorizable_load): Likewise.

commit | commitdiff | tree

Richard Biener [Tue, 13 May 2025 07:50:36 +0000 (09:50 +0200)]

Use vectype from SLP node for vect_get_{load,store}_cost if possible

The vect_get_{load,store}_cost API is used from both vectorizable_*
where we've done SLP analysis and from alignment peeling analysis
which is done before this and thus only stmt_vec_infos are available.
The following patch makes sure we pick the vector type relevant
for costing from the SLP node when available.

* tree-vect-stmts.cc (vect_get_store_cost): Compute vectype based
on whether we got SLP node or stmt_vec_info and use the full
record_stmt_cost API.
(vect_get_load_cost): Likewise.

commit | commitdiff | tree

Kito Cheng [Tue, 13 May 2025 02:34:34 +0000 (10:34 +0800)]

RISC-V: Fix uninit riscv_subset_list::m_allow_adding_dup issue

We forgot to initialize m_allow_adding_dup in the constructor of
riscv_subset_list, so it held an indeterminate value, which meant that
-march could randomly accept duplicate extensions.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_subset_list::riscv_subset_list): Init m_allow_adding_dup.
Reviewed-by: Christoph Müllner <christoph.muellner@vrull.eu>
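
The class of bug fixed here can be sketched with a stand-alone class.
The type below is a hypothetical stand-in, not riscv_subset_list, and
the initial value false is an assumption made for the illustration.
Listing the member in the constructor's initializer list gives it a
defined value instead of leaving it indeterminate.

```cpp
// Hypothetical stand-in for a class with a boolean option member.
class subset_list_sketch
{
public:
  // Initializing the flag here means every object starts from a
  // defined state; omitting it would leave the value indeterminate.
  constexpr subset_list_sketch ()
    : m_allow_adding_dup (false) // assumed default for this sketch
  {
  }

  constexpr bool allow_adding_dup () const { return m_allow_adding_dup; }

private:
  bool m_allow_adding_dup;
};

static_assert (!subset_list_sketch {}.allow_adding_dup ());
```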

commit | commitdiff | tree

Jiawei [Tue, 13 May 2025 07:23:39 +0000 (15:23 +0800)]

RISC-V: Add augmented hypervisor series extensions.

The augmented hypervisor series extensions 'sha'[1] is a new profile-defined
extension series that captures the full set of features that are mandated to
be supported along with the 'H' extension.

[1] https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc#rva23s64-profile

Version log: Update implementation, fix testcase format.

gcc/ChangeLog:

* config/riscv/riscv-ext.def: New extension defs.
* config/riscv/riscv-ext.opt: Ditto.
* doc/riscv-ext.texi: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-55.c: New test.

commit | commitdiff | tree

Kito Cheng [Wed, 14 May 2025 15:19:38 +0000 (23:19 +0800)]

RISC-V: Drop duplicate build rule for riscv-ext.opt [NFC]

gcc/ChangeLog:

* config/riscv/t-riscv: Drop duplicate build rule for
riscv-ext.opt.

commit | commitdiff | tree

Kito Cheng [Wed, 14 May 2025 15:19:17 +0000 (23:19 +0800)]

RISC-V: Regen riscv-ext.opt.urls

gcc/ChangeLog:

* config/riscv/riscv-ext.opt.urls: Regenerate.

commit | commitdiff | tree

Andrew Pinski [Tue, 13 May 2025 21:27:12 +0000 (14:27 -0700)]

gimple: Move canonicalization of bool==0 and bool!=1 to cleanupcfg

This moves the canonicalization of `bool==0` and `bool!=1` from
forwprop to cleanupcfg. We will still need to call it from forwprop
so we don't need to call forwprop a few times to forward-propagate
comparisons in some cases (forwprop-16.c was originally added for this
code).

This is the first step in removing forward_propagate_into_gimple_cond
and forward_propagate_into_comparison.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* tree-cfgcleanup.cc (canonicalize_bool_cond): New function.
(cleanup_control_expr_graph): Call canonicalize_bool_cond for GIMPLE_COND.
* tree-cfgcleanup.h (canonicalize_bool_cond): New declaration.
* tree-ssa-forwprop.cc (forward_propagate_into_gimple_cond):
Call canonicalize_bool_cond.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Tue, 13 May 2025 20:50:24 +0000 (13:50 -0700)]

gimple: Add assert for code being a comparison in gimple_cond_set_code

We have code later on that verifies the code is a comparison. So let's
try to catch it earlier. So it is easier to debug where the incorrect code
gets set.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* gimple.h (gimple_cond_set_code): Add assert of the code
being a comparison.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Tue, 13 May 2025 20:04:32 +0000 (13:04 -0700)]

forwprop: Change an if into an assert

Since the merge of the tuples branch (r0-88576-g726a989a8b74bf), the
if:
```
if (TREE_CODE_CLASS (gimple_cond_code (stmt)) != tcc_comparison)
```
will always be false, so let's change it into an assert.

gcc/ChangeLog:

* tree-ssa-forwprop.cc (forward_propagate_into_gimple_cond): Assert
that gimple_cond_code is always a comparison.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Tue, 13 May 2025 16:56:13 +0000 (09:56 -0700)]

gimple: allow fold_stmt without setting cfun in case of GIMPLE_COND folding

This is the followup mentioned in https://gcc.gnu.org/pipermail/gcc-patches/2025-May/683444.html .
It adds the check for cfun before accessing function specific flags.
We handle the !cfun case conservatively, by assuming that the function might throw.

gcc/ChangeLog:

* gimple-fold.cc (replace_stmt_with_simplification): Check cfun before
accessing cfun.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Mon, 21 Apr 2025 23:33:04 +0000 (16:33 -0700)]

forwprop: Move around the marking bb for eh to after the local non-fold_stmt optimizations

When moving the optimize_memcpy_to_memset optimization to forwprop from fold_stmt, the marking
of the bb to purge for eh cleanup was not happening for the local optimizations but only after
the fold_stmt; this causes g++.dg/torture/except-2.C to fail.
So this patch moves the marking of the bbs for cleanups after the local forwprop optimizations
instead of before.

There was already code to add to to_purge after forward_propagate_into_comparison; this patch
removes it as it is now redundant.

gcc/ChangeLog:

* tree-ssa-forwprop.cc (pass_forwprop::execute): Move marking of to_purge bb
and marking of fixup statements to after the local optimizations.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Tue, 22 Apr 2025 03:15:42 +0000 (20:15 -0700)]

forwprop: Fix looping after fold_stmt and some forwprop local folds happen

r10-2587-gcc19f80ceb27cc added a loop over the current statement if there was
a change. Except in some cases it turns out changed will turn from true to false
because instead of doing `|=` after the fold_stmt, there was just an `=`.
This fixes that and now we loop even if fold_stmt changed the statement and
there was a local fold that happened.
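
The difference between the two operators can be sketched in isolation. This is not the forwprop code: `fold_stmt_sketch` and `local_fold_sketch` are hypothetical stand-ins that simply report whether they changed a statement.

```cpp
#include <cassert>

// Hypothetical stand-ins: each "fold" reports whether it changed the statement.
bool fold_stmt_sketch() { return true; }    // the generic fold changed it
bool local_fold_sketch() { return false; }  // the local fold found nothing

// Buggy shape: plain `=` discards the earlier `true`.
bool changed_with_assign() {
  bool changed = fold_stmt_sketch();
  changed = local_fold_sketch();
  return changed;
}

// Fixed shape: `|=` keeps the flag set once any fold changed the statement.
bool changed_with_or_assign() {
  bool changed = fold_stmt_sketch();
  changed |= local_fold_sketch();
  return changed;
}
```

With plain assignment the caller sees no change and does not revisit the statement; with `|=` the earlier change is still reported.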

gcc/ChangeLog:

* tree-ssa-forwprop.cc (pass_forwprop::execute): Use `|=` for
changed on the local folding.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andreas Schwab [Wed, 14 May 2025 13:12:16 +0000 (15:12 +0200)]

libiberty: remove duplicated declaration of mkstemps

* libiberty.h (mkstemps): Remove duplicate.

commit | commitdiff | tree

Richard Biener [Mon, 12 May 2025 13:02:42 +0000 (15:02 +0200)]

This transitions vect_model_simple_cost to SLP only

As part of the vector cost API cleanup this transitions
vect_model_simple_cost to only record costs with SLP node.
For this to work the patch adds an overload to record_stmt_cost
only passing in the SLP node.

The vect_prologue_cost_for_slp adjustment is one spot that
needs an eye with regard to re-doing the whole thing.

* tree-vectorizer.h (record_stmt_cost): Add overload with
only SLP node and no vector type.
* tree-vect-stmts.cc (record_stmt_cost): Use
SLP_TREE_REPRESENTATIVE for stmt_vec_info.
(vect_model_simple_cost): Do not get stmt_vec_info argument
and adjust.
(vectorizable_call): Adjust.
(vectorizable_simd_clone_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_assignment): Likewise.
(vectorizable_shift): Likewise.
(vectorizable_operation): Likewise.
(vectorizable_condition): Likewise.
(vectorizable_comparison_1): Likewise.
* tree-vect-slp.cc (vect_prologue_cost_for_slp): Use
full-blown record_stmt_cost.

commit | commitdiff | tree

Ville Voutilainen [Wed, 14 May 2025 13:39:09 +0000 (16:39 +0300)]

Remove a sanity check comment now that the sanity check has been removed

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold): Remove a remnant comment.

commit | commitdiff | tree

Tomasz Kamiński [Mon, 12 May 2025 09:06:34 +0000 (11:06 +0200)]

libstdc++: Renamed bits/move_only_function.h to bits/funcwrap.h [PR119125]

The file now includes copyable_function in addition to
move_only_function.

PR libstdc++/119125

libstdc++-v3/ChangeLog:
* include/bits/move_only_function.h: Move to...
* include/bits/funcwrap.h: ...here.
* doc/doxygen/stdheader.cc (init_map): Replaced move_only_function.h
with funcwrap.h, and changed include guard to use feature test macro.
Move bits/version.h include before others.
* include/Makefile.am: Likewise.
* include/Makefile.in: Likewise.
* include/std/functional: Likewise.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

commit | commitdiff | tree

liuhongt [Wed, 18 Dec 2024 06:32:31 +0000 (22:32 -0800)]

Consider frequency in cost estimation when converting scalar to vector.

In some benchmarks, I noticed that STV failed because the cost model judged it
unprofitable: the igain is inside the loop while the sse<->integer conversion is
outside the loop, and the current cost model doesn't consider the frequency of
those gains/costs.
The patch weights those costs by frequency.
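
A minimal sketch of why the weighting matters, using made-up numbers. `WeightedItem`, `plain_gain` and `weighted_gain` are hypothetical names for illustration and do not correspond to the i386 backend's data structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record: a gain (positive) or cost (negative) in some basic
// block, together with how often that block is expected to execute.
struct WeightedItem {
  int gain;
  int bb_frequency;
};

// Unweighted model: every item counts once, wherever it sits.
int plain_gain(const std::vector<WeightedItem> &items) {
  int total = 0;
  for (const WeightedItem &it : items)
    total += it.gain;
  return total;
}

// Weighted model: each item is scaled by its block's frequency.
int weighted_gain(const std::vector<WeightedItem> &items) {
  int total = 0;
  for (const WeightedItem &it : items)
    total += it.gain * it.bb_frequency;
  return total;
}
```

For a gain of 2 in a block executed 100 times and a conversion cost of 5 in a block executed once, the unweighted sum is negative while the weighted sum is positive, which is the situation the message describes.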

gcc/ChangeLog:

PR target/120215
* config/i386/i386-features.cc
(scalar_chain::mark_dual_mode_def): Weight
cost of integer<->sse move with bb frequency when it's
optimized_for_speed_p.
(general_scalar_chain::compute_convert_gain): Ditto, and
adjust function prototype to return true/false when cost model
is profitable or not.
(timode_scalar_chain::compute_convert_gain): Ditto.
(convert_scalars_to_vector): Adjust after the prototypes of the two
functions above are changed.
* config/i386/i386-features.h (class scalar_chain): Change
n_integer_to_sse/n_sse_to_integer to cost_sse_integer, and add
weighted_cost_sse_integer.
(class general_scalar_chain): Adjust prototype to return bool
instead of int.
(class timode_scalar_chain): Ditto.

commit | commitdiff | tree

Tomasz Kamiński [Mon, 12 May 2025 08:01:22 +0000 (10:01 +0200)]

libstdc++: Implement C++26 copyable_function [PR119125]

This patch implements C++26 copyable_function as specified in P2548R6.
It also implements LWG 4255, which adjusts move_only_function so that constructing
from an empty copyable_function produces an empty functor. This falls out of
existing checks, after specializing __is_polymorphic_function_v for
copyable_function specializations.

For compatible invoker signatures, the move_only_function may be constructed
from copyable_function without double indirection. To achieve that we derive
_Cpy_base from _Mo_base, and specialize __is_polymorphic_function_v for
copyable_function. Similarly, copyable_functions with compatible signatures
can be converted without double indirection.

As we are starting to use the _Op::_Copy operation from the _M_manage function,
invocations of that function may now throw exceptions, so noexcept needs
to be removed from the signature of stored _M_manage pointers. This also
affects operations in _Mo_base, however we already wrap _M_manage invocations
in noexcept member functions (_M_move, _M_destroy, swap).

PR libstdc++/119125

libstdc++-v3/ChangeLog:

* doc/doxygen/stdheader.cc: Added cpyfunc_impl.h header.
* include/Makefile.am: Add bits cpyfunc_impl.h.
* include/Makefile.in: Add bits cpyfunc_impl.h.
* include/bits/cpyfunc_impl.h: New file.
* include/bits/mofunc_impl.h: Mention LWG 4255.
* include/bits/move_only_function.h: Update header description
and change guard to also check __glibcxx_copyable_function.
(_Manager::_Func): Remove noexcept.
(std::__is_polymorphic_function_v<move_only_function<_Tp>>)
(__variant::_Never_valueless_alt<std::move_only_function<_Signature...>>)
(move_only_function) [__glibcxx_move_only_function]: Adjust guard.
(std::__is_polymorphic_function_v<copyable_function<_Tp>>)
(__variant::_Never_valueless_alt<std::copyable_function<_Signature...>>)
(__polyfunc::_Cpy_base, std::copyable_function)
[__glibcxx_copyable_function]: Define.
* include/bits/version.def: Define copyable_function.
* include/bits/version.h: Regenerate.
* include/std/functional: Define __cpp_lib_copyable_function.
* src/c++23/std.cc.in (copyable_function)
[__cpp_lib_copyable_function]: Export.
* testsuite/20_util/copyable_function/call.cc: New test based on
move_only_function tests.
* testsuite/20_util/copyable_function/cons.cc: New test based on
move_only_function tests.
* testsuite/20_util/copyable_function/conv.cc: New test based on
move_only_function tests.
* testsuite/20_util/copyable_function/copy.cc: New test.
* testsuite/20_util/copyable_function/move.cc: New test based on
move_only_function tests.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

commit | commitdiff | tree

Tomasz Kamiński [Thu, 8 May 2025 06:08:43 +0000 (08:08 +0200)]

libstdc++: Avoid double indirection in move_only_function when possible [PR119125]

Based on the provision in C++26 [func.wrap.general] p2 this patch adjusts the generic
move_only_function(_Fn&&) constructor, such that when _Fn refers to selected
move_only_function instantiations, the ownership of the target object is directly
transferred to the constructed object. This avoids the cost of double indirection in this situation.
We apply this also in C++23 mode.

We also fix handling of self assignments, to match the behavior required by the standard,
due to the use of the copy and swap idiom.
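
A sketch of the move-and-swap shape of assignment and what it implies for self assignment. `Holder` is a hypothetical move-only class holding a `std::unique_ptr`; it is not the libstdc++ implementation and only illustrates the idiom.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical move-only holder, standing in for a type-erased wrapper.
class Holder {
  std::unique_ptr<int> target_;

public:
  Holder() = default;
  explicit Holder(int v) : target_(std::make_unique<int>(v)) {}
  Holder(Holder &&) noexcept = default;

  // Move-and-swap: build a temporary from `other`, then swap it into *this.
  // For self assignment the temporary briefly owns the target and the swap
  // hands it straight back, so the value survives.
  Holder &operator=(Holder &&other) noexcept {
    Holder(std::move(other)).swap(*this);
    return *this;
  }

  void swap(Holder &other) noexcept { target_.swap(other.target_); }
  bool has_value() const { return target_ != nullptr; }
  int value() const {
    assert(target_ && "value() requires a target");
    return *target_;
  }
};
```

Under this shape, assigning an object to itself leaves it holding its original target, while assigning from a different object empties the source.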

An instantiation MF1 of move_only_function can take over the target of another
instantiation MF2, if it can be constructed via the usual rules (__is_callable_from<_MF2>),
and their invokers are convertible (__is_invoker_convertible<MF2, MF1>()), i.e.:
* MF1 is less noexcept than MF2,
* return types are the same after stripping cv-quals,
* adjusted parameter types are the same (__poly::_param_t), i.e. params of types T and T&&
are compatible for non-trivially copyable objects.
Compatibility of cv ref qualification is checked via __is_callable_from<_MF2>.

To achieve the above, the generation of _M_invoke functions is moved to _Invoker class
templates, which depend only on noexcept, the return type and the adjusted parameters of the
signature. To make the invoker signature compatible between const and mutable
qualified signatures, we always accept _Storage as const& and perform a const_cast
for locally stored objects. This approach guarantees that we never strip const from
a const object.

Another benefit of this approach is that move_only_function<void(std::string)>
and move_only_function<void(std::string&&)> use the same function pointer, which should
reduce binary size.
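
The sharing can be illustrated with a simplified parameter adjustment. Everything here is an assumption for illustration: `param_t_sketch`, `invoker_sketch` and the size threshold are hypothetical and are not the definitions used in libstdc++.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical parameter adjustment: small trivially copyable types are
// passed by value, everything else is forwarded as T&&. The size limit is
// an assumption made for this sketch.
template <typename T>
using param_t_sketch =
    std::conditional_t<std::is_trivially_copyable_v<T> &&
                           (sizeof(T) <= 2 * sizeof(void *)),
                       T, T &&>;

// Hypothetical invoker signature: depends only on the return type and the
// adjusted parameter types, not on how the caller spelled the signature.
template <typename R, typename... Args>
using invoker_sketch = R (*)(const void *, param_t_sketch<Args>...);

// std::string is not trivially copyable, so T and T&& adjust to the same type.
static_assert(std::is_same_v<param_t_sketch<std::string>, std::string &&>);
static_assert(std::is_same_v<param_t_sketch<std::string &&>, std::string &&>);
// int is small and trivially copyable, so it stays a by-value parameter.
static_assert(std::is_same_v<param_t_sketch<int>, int>);
```

Because both spellings of the `std::string` parameter collapse to `std::string&&`, the two signatures map to one invoker type in this sketch, so a single function could serve both.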

The _Storage and _Manager functionality was also extracted and adjusted from
_Mofunc_base, in preparation for the implementation of copyable_function and
function_ref. The _Storage was adjusted to store function pointers as void(*)().
The manage function now accepts an _Op enum parameter, and supports additional
operations:
* _Op::_Address stores address of target object in destination
* _Op::_Copy, when enabled, copies from source to destination
Furthermore, we provide type-independent manage functions for handling all:
* function pointer types
* trivially copyable object stored locally.
Similarly to the invoker case, we always pass the source as const (for copy),
and cast away constness in case of move operations, where we know that the source
is mutable.

Finally, the new helpers are defined in __polyfunc internal namespace.

PR libstdc++/119125

libstdc++-v3/ChangeLog:

* include/bits/mofunc_impl.h: (std::move_only_function): Adjusted for
changes in bits/move_only_function.h
(move_only_function::move_only_function(_Fn&&)): Special case
move_only_functions with same invoker.
(move_only_function::operator=(move_only_function&&)): Handle self
assignment.
* include/bits/move_only_function.h (__polyfunc::_Ptrs)
(__polyfunc::_Storage): Refactored from _Mo_func::_Storage.
(__polyfunc::__param_t): Moved from move_only_function::__param_t.
(__polyfunc::_Base_invoker, __polyfunc::_Invoke): Refactored from
move_only_function::_S_invoke.
(__polyfunc::_Manager): Refactored from _Mo_func::_S_manager.
(std::_Mofunc_base): Moved into __polyfunc::_Mo_base with parts
extracted to __polyfunc::_Storage and __polyfunc::_Manager.
(__polyfunc::__deref_as, __polyfunc::__invoker_of)
(__polyfunc::__base_of, __polyfunc::__is_invoker_convertible): Define.
(std::__is_move_only_function_v): Renamed to
__is_polymorphic_function_v.
(std::__is_polymorphic_function_v): Renamed from
__is_move_only_function_v.
* testsuite/20_util/move_only_function/call.cc: Test for
function pointers.
* testsuite/20_util/move_only_function/conv.cc: New test.
* testsuite/20_util/move_only_function/move.cc: Tests for
self assignment.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

commit | commitdiff | tree

Tomasz Kamiński [Wed, 30 Apr 2025 08:37:48 +0000 (10:37 +0200)]

libstdc++: Preserve the argument type in basic_format_args [PR119246]

This commit adjusts how the arguments are stored in the _Arg_value
(and thus basic_format_args), by preserving the types of fixed width
floating-point types, which were previously converted to float, double,
long double.

The _Arg_value union now contains alternatives with std::bfloat16_t,
std::float16_t, std::float32_t, std::float64_t that use pre-existing
_Arg_bf16, _Arg_f16, _Arg_f32, _Arg_f64 argument types.

This does not affect formatting, as specialization of formatters for fixed
width floating-point types formats them by casting to the corresponding
standard floating point type.

For the 128bit floating-point types we need to handle the ppc64 architecture
(_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT), for which long double may (on a per-TU
basis) designate either the __ibm128 or the __ieee128 type, so we need to store both
types in the _Arg_value and have two _Arg_types (_Arg_ibm128, _Arg_ieee128).
On other architectures we use an extra enumerator value to store __float128,
which is different from long double and _Float128. This is consistent with ppc64,
for which __float128, if present, is the same type as __ieee128. We use the _Arg_float128
and _M_float128 names, which deviate from the _Arg_fN naming scheme, to emphasize that
this flag is not used for the std::float128_t (_Float128) type, which is consistently
formatted via handle.

The __format::__float128_t type is renamed to __format::__flt128_t, to mitigate
visual confusion between this type and __float128. We also introduce the __bflt16_t
typedef instead of using decltype.

We add new alternatives to _Arg_value and allow them to be accessed via _S_get
when the types are available. However, we produce and handle the corresponding _Arg_type
only when we can format them. See also r14-3329-g27d0cfcb2b33de.

The formatter<_Float128, _CharT> that formats via __format::__flt128_t is always
provided, when type is available. It is still correct when __format::__flt128_t
is _Float128.

We also provide formatter<__float128, _CharT> that formats via __flt128_t.
As this type may be disabled (-mno-float128), extra care needs to be taken
for the situation when __float128 is the same as long double. If the formatter were
defined in such a case, formatter<long double, _CharT> would be generated
from different specializations, and have different mangling:
* formatter<__float128, _CharT> if __float128 is present,
* formatter<__format::__formattable_float, _CharT> otherwise.
To the best of my knowledge this happens only on ppc64 for __ieee128 and __float128,
so the formatter is not defined in this case. A static_assert is added to detect
other configurations like that. In such a case we should replace it with a constraint.

PR libstdc++/119246

libstdc++-v3/ChangeLog:

* include/std/format (__format::__bflt16_t): Define.
(_GLIBCXX_FORMAT_F128): Separate value for cases where _Float128
is used.
(__format::__float128_t): Renamed to __format::__flt128_t.
(std::formatter<_Float128, _CharT>): Define always if there is
formattable 128bit float.
(std::formatter<__float128, _CharT>): Define.
(_Arg_type::_Arg_f128): Rename to _Arg_float128 and adjust value.
(_Arg_type::_Arg_ibm128): Change value to _Arg_ldbl.
(_Arg_type::_Arg_ieee128): Define as alias to _Arg_float128.
(_Arg_value::_M_f128): Replaced with _M_ieee128 and _M_float128.
(_Arg_value::_M_ieee128, _Arg_value::_M_float128)
(_Arg_value::_M_bf16, _Arg_value::_M_f16, _Arg_value::_M_f32)
(_Arg_value::_M_f64): Define.
(_Arg_value::_S_get, basic_format_arg::_S_to_enum): Handle __bflt16,
_Float16, _Float32, _Float64, and __float128 types.
(basic_format_arg::_S_to_arg_type): Preserve _bflt16, _Float16,
_Float32, _Float64 and __float128 types.
(basic_format_arg::_M_visit): Handle _Arg_float128, _Arg_ieee128,
_Arg_b16, _Arg_f16, _Arg_f32, _Arg_f64.
* testsuite/std/format/arguments/args.cc: Updated to illustrate
that extended floating point types use handles now. Added test
for __float128.
* testsuite/std/format/parse_ctx.cc: Extended test to cover calls
to check_dynamic_spec with floating point types and handles.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

commit | commitdiff | tree

Richard Earnshaw [Wed, 14 May 2025 10:28:42 +0000 (11:28 +0100)]

Remove Marcus Shawcroft

Marcus has stood down as a maintainer and we have no new email address.

ChangeLog:

* MAINTAINERS: Marcus Shawcroft has resigned from the project.

commit | commitdiff | tree

Martin Jambor [Wed, 14 May 2025 10:08:24 +0000 (12:08 +0200)]

tree-sra: Do not create stores into const aggregates (PR111873)

This patch fixes (hopefully the) one remaining place where gimple SRA
was still creating a store into const aggregates.  It occurs when there
is a replacement for a load but that replacement is not type
compatible - typically because it is a single field structure.

I have used testcases from duplicates because the original test-case
no longer reproduces for me.

gcc/ChangeLog:

2025-05-13  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/111873
* tree-sra.cc (sra_modify_expr): When processing a load which has
a type-incompatible replacement, do not store the contents of the
replacement into the original aggregate when that aggregate is
const.

gcc/testsuite/ChangeLog:

2025-05-13  Martin Jambor  <mjambor@suse.cz>

* gcc.dg/ipa/pr120044-1.c: New test.
* gcc.dg/ipa/pr120044-2.c: Likewise.
* gcc.dg/tree-ssa/pr114864.c: Likewise.

commit | commitdiff | tree

Nathaniel Shead [Thu, 8 May 2025 13:06:13 +0000 (23:06 +1000)]

c++/modules: Fix handling of -fdeclone-ctor-dtor with explicit instantiations [PR120125]

The attached testcase ICEs in maybe_thunk_body because we haven't
created a node in the cgraph for an imported explicit instantiation yet.

We in fact really shouldn't be emitting calls at all, since an imported
explicit instantiation always exists in the TU we imported it from. So
this patch adjusts DECL_NOT_REALLY_EXTERN handling to account for this.

PR c++/120125

gcc/cp/ChangeLog:

* module.cc (trees_out::write_function_def): Only set
DECL_NOT_REALLY_EXTERN if the importer might need to emit it.
* optimize.cc (maybe_thunk_body): Don't assume 'fn' has a cgraph
node created.

gcc/testsuite/ChangeLog:

* g++.dg/modules/clone-4_a.C: New test.
* g++.dg/modules/clone-4_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

commit | commitdiff | tree

Nathaniel Shead [Mon, 21 Apr 2025 10:40:29 +0000 (20:40 +1000)]

c++: Fix OpenMP support with C++20 modules [PR119864]

In r15-2799-gf1bfba3a9b3f31, a new kind of global constructor was added.
Unfortunately this broke C++20 modules, as both the host and target
constructors were given the same mangled name. This patch ensures that
only the host constructor gets the module name mangling for now, and
stops forcing the creation of the target constructor even when no such
initialization is required.

PR c++/119864

gcc/cp/ChangeLog:

* decl2.cc (start_objects): Only use module initialized for
host.
(c_parse_final_cleanups): Don't always create an OMP offload
init function in modules.

gcc/testsuite/ChangeLog:

* g++.dg/modules/openmp-1.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

commit | commitdiff | tree

Nathaniel Shead [Fri, 9 May 2025 15:12:20 +0000 (01:12 +1000)]

c++/modules: Revert "Remove unnecessary lazy_load_pendings"

This reverts commit r16-63-g241157eb0858b3. It turns out that the
'lazy_load_pendings' is necessary if we haven't seen a binding for the
given template name at all in the current TU, as it is also used to find
template instantiations with the given name.

gcc/cp/ChangeLog:

* name-lookup.cc (lookup_imported_hidden_friend): Add back
lazy_load_pendings with comment.

gcc/testsuite/ChangeLog:

* g++.dg/modules/tpl-friend-19_a.C: New test.
* g++.dg/modules/tpl-friend-19_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

commit | commitdiff | tree

Ville Voutilainen [Mon, 12 May 2025 20:16:46 +0000 (23:16 +0300)]

Add std::to_underlying to the set of stdlib functions that are always folded

gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_fold): Add to_underlying.

gcc/testsuite/ChangeLog:
* g++.dg/opt/pr96780_cpp23.C: New.

libstdc++-v3/ChangeLog:
* include/std/utility (to_underlying): Add the __always_inline__ attribute.

Signed-off-by: Ville Voutilainen <ville.voutilainen@gmail.com>

commit | commitdiff | tree

Stefan Schulze Frielinghaus [Wed, 14 May 2025 07:22:00 +0000 (09:22 +0200)]

s390: Fix tf_to_fprx2

Insn tf_to_fprx2 moves a TF value into a floating-point register pair.
For alternative 0, the input is a vector register, however, in the else
case instruction ldr is emitted which expects floating-point register
operands only.  Thus, this works only for vector registers which overlap
with floating-point registers.  Replace ldr with vlr so that the
remaining vector registers are dealt with, too.  Emitting a vlr instead
of a ldr is fine since the destination register %v0 is part of a
floating-point register pair which means that the low half of %v0 is
ignored in the end anyway and therefore may be clobbered.

gcc/ChangeLog:

* config/s390/vector.md: Fix tf_to_fprx2 by using vlr instead of
ldr.

commit | commitdiff | tree

Tobias Burnus [Wed, 14 May 2025 07:18:09 +0000 (09:18 +0200)]

Fortran: Fix mpfr_tanu use in gfc_simplify_cotand with mpfr 4.2.0+ [PR120225]

Fix commit r16-607-gc91c226762b422.

gcc/fortran/ChangeLog:

PR fortran/120225
* simplify.cc (gfc_simplify_cotand): Fix used argument in
mpfr_tanu call.

commit | commitdiff | tree

Tobias Burnus [Wed, 14 May 2025 07:12:13 +0000 (09:12 +0200)]

Fortran: Use mpfr_sinu etc. with mpfr 4.2.0+ for degree trigonometric functions [PR120225]

As MPFR 4.2.0 added support for degree trigonometric functions via the
mpfr_...u functions (for u = 360), it makes sense to use them if available.
If MPFR is older, the current implementation is used as fallback.

PR fortran/120225

gcc/fortran/ChangeLog:

* simplify.cc: Include "trigd_fe.inc" only with MPFR < 4.2.0.
(rad2deg, rad2deg): Only define if MPFR < 4.2.0.
(gfc_simplify_acosd, gfc_simplify_asind, gfc_simplify_atand,
gfc_simplify_atan2d, gfc_simplify_cosd, gfc_simplify_tand,
gfc_simplify_cotand): Use mpfr_...u functions with MPFR >= 4.2.0.

commit | commitdiff | tree

Owen Avery [Tue, 13 May 2025 21:18:28 +0000 (17:18 -0400)]

c++: Allow -Wvirtual-move-assign to be more easily ignored

This patch makes it easier to selectively disable
-Wvirtual-move-assign by allowing diagnostic pragmas on
base class move assignment operators to suppress such
warnings.
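
A sketch of the kind of code this is aimed at, assuming the pragma is placed around the base class's move assignment operator as the description suggests. The class names and the counter are hypothetical, and whether the warning is actually silenced depends on using a GCC that contains this patch.

```cpp
#include <cassert>
#include <utility>

// Counts calls to Base's move assignment operator (for demonstration only).
inline int base_move_assignments = 0;

struct Base {
  int payload = 0;
  Base() = default;
  Base(const Base &) = default;
// Suppress the warning at the base class's operator, which is where the
// patch makes GCC check whether the warning is enabled.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wvirtual-move-assign"
  Base &operator=(Base &&other) {
    payload = other.payload;
    ++base_move_assignments;
    return *this;
  }
#pragma GCC diagnostic pop
};

struct Left : virtual Base {};
struct Right : virtual Base {};

// The defaulted move assignment may assign the virtual base more than once.
struct Derived : Left, Right {
  Derived &operator=(Derived &&) = default;
};
```

The counter makes the underlying hazard visible: moving a `Derived` may run `Base`'s non-trivial move assignment more than once, which is what the warning is about.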

gcc/cp/ChangeLog:

* method.cc (synthesized_method_walk): Check whether
-Wvirtual-move-assign is enabled at the location of a base
class's move assignment operator.

gcc/testsuite/ChangeLog:

* g++.dg/warn/ignore-virtual-move-assign.C: New test.

Co-authored-by: Jason Merrill <jason@redhat.com>
Signed-off-by: Owen Avery <powerboat9.gamer@gmail.com>

commit | commitdiff | tree

liuhongt [Tue, 8 Apr 2025 03:12:00 +0000 (20:12 -0700)]

Extend vect_recog_cond_expr_convert_pattern to handle floating point type.

For floating point, !flag_trapping_math is needed for the pattern,
which transforms 2 conversions into 1 conversion and may therefore lose 1
potential trap. There shouldn't be any accuracy issue.
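
A scalar illustration of the "2 conversions to 1 conversion" shape, under the assumption that the pattern turns a select between two converted values into one conversion of the selected value. The function names are hypothetical, and the vectorizer works on vector statements rather than on code like this.

```cpp
#include <cassert>

// Two conversions: each arm is converted before the selection.
float select_with_two_converts(bool cond, double x, double y) {
  return cond ? static_cast<float>(x) : static_cast<float>(y);
}

// One conversion: select in the wider type, then convert the result once.
float select_with_one_convert(bool cond, double x, double y) {
  return static_cast<float>(cond ? x : y);
}
```

Both forms yield the same value for the selected arm. In a vectorized form where both arms are converted unconditionally, dropping one conversion can also drop a trap the unselected arm would have raised, which is why the message ties the transform to !flag_trapping_math.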

gcc/ChangeLog:

PR tree-optimization/103771
* match.pd (cond_expr_convert_p): Extend the match to handle
scalar floating point type.
* tree-vect-patterns.cc
(vect_recog_cond_expr_convert_pattern): Handle floating point
type.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103771-4.c: New test.

commit | commitdiff | tree

GCC Administrator [Wed, 14 May 2025 00:18:21 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Tobias Burnus [Tue, 13 May 2025 22:53:50 +0000 (00:53 +0200)]

gfortran.dg/dec_math.f90: Add comment regarding F2023 [PR113413]

gcc/testsuite/ChangeLog:

PR fortran/113413
* gfortran.dg/dec_math.f90: Add comment that degree
functions are part of F2023.

commit | commitdiff | tree

Yuao Ma [Mon, 12 May 2025 15:07:37 +0000 (23:07 +0800)]

fortran: map atand(y, x) to atan2d(y, x) [PR113413]

According to the Fortran standard, atand(y, x) is equivalent to atan2d(y, x).
However, the current atand(y, x) function produces an error. This patch
includes the necessary intrinsic mapping, related test, and intrinsic
documentation.
The minor comment change in intrinsic.cc is cherry-picked from Steve's previous
work.

PR fortran/113413 - ATAND(Y,X) is unsupported

PR fortran/113413

gcc/fortran/ChangeLog:

* intrinsic.cc (do_check): Minor doc polish.
(add_functions): Add atand(y, x) mapping.
* intrinsic.texi: Update atand example.

gcc/testsuite/ChangeLog:

* gfortran.dg/dec_math.f90: Add atand(y, x) testcase.

Signed-off-by: Yuao Ma <c8ef@outlook.com>
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>

commit | commitdiff | tree

Gaius Mulley [Tue, 13 May 2025 21:54:33 +0000 (22:54 +0100)]

PR modula2/120253: Error message column numbers should start at 1 not 0

This patch ensures that column numbers start at 1 rather than 0.

gcc/m2/ChangeLog:

PR modula2/120253
* m2.flex (FIRST_COLUMN): New define.
(updatepos): Remove commented code.
(consumeLine): Assign column to FIRST_COLUMN.
(initLine): Ditto.
(m2flex_GetColumnNo): Return FIRST_COLUMN if currentLine is NULL.
(m2flex_GetLineNo): Rewrite for positive logic.
(m2flex_GetLocation): Ditto.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

commit | commitdiff | tree

Andrew Pinski [Tue, 22 Apr 2025 16:40:28 +0000 (09:40 -0700)]

gimple-fold: Don't replace `tmp = FP0 CMP FP1; if (tmp != 0)` over and over again when comparison can throw

With -ftrapping-math -fnon-call-exceptions and:
```
tmp = FP0 CMP FP1;

if (tmp != 0) ...
```
a call to fold_stmt on the GIMPLE_COND will replace the above with
a new tmp each time, and we even lose the eh information on the
previous comparison too.

Changes since v1:
* v2: Use INTEGRAL_TYPE_P instead of a check against BOOLEAN_TYPE.
Add testcase which shows where losing of landing pad happened.

PR tree-optimization/119903
gcc/ChangeLog:

* gimple-fold.cc (replace_stmt_with_simplification): Reject for
noncall exceptions replacing comparison with itself.
gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/pr119903-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Tue, 13 May 2025 01:58:32 +0000 (18:58 -0700)]

verifier: Fix up PAREN_EXPR verification [PR118868]

The verification added in r12-1608-g2f1686ff70b25f was incorrect
for PAREN_EXPR: pointer types should be valid for PAREN_EXPR.
Also for PAREN_EXPR, aggregate types don't make sense (currently
they ICE much earlier in the gimplifier rather than producing an error
message) so we should disallow them here too.

Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/118868

gcc/ChangeLog:

* tree-cfg.cc (verify_gimple_assign_unary): Allow pointers
but disallow aggregate types for PAREN_EXPR.

gcc/testsuite/ChangeLog:

* c-c++-common/pr118868-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Wed, 4 Dec 2024 02:57:45 +0000 (18:57 -0800)]

cfgexpand: Update cache during the original DFS walk

This is a small optimization which can reduce how many times we need to go through the update loop.
It can reduce the number of iterations of the update loop by maybe one.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* cfgexpand.cc (vars_ssa_cache::operator()): Update the cache if the use
already has a cache.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Andrew Pinski [Tue, 3 Dec 2024 23:57:42 +0000 (15:57 -0800)]

cfgexpand: Reverse the order of going through the update_cache_list queue.

This is a small optimization: reverse the order of the walk of the update_cache_list queue.
The queue is pushed in pre-order (NLR); reversing the order will reduce how many times we
need to go through the loop, as we first update the nodes which might have a link back to
another one.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* cfgexpand.cc (vars_ssa_cache::operator()): Reverse the order of going
through the update list.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Richard Biener [Tue, 13 May 2025 12:16:57 +0000 (14:16 +0200)]

Remove non-SLP path from vectorizable_induction

This removes the non-SLP path from vectorizable_induction.

* tree-vect-loop.cc (vectorizable_nonlinear_induction):
Remove non-SLP path, use SLP_TREE_VECTYPE.
(vectorizable_induction): Likewise. Drop ncopies variable
which is always 1.

commit | commitdiff | tree

Gaius Mulley [Tue, 13 May 2025 12:35:00 +0000 (13:35 +0100)]

PR modula2/120188: Use existing test for plugin

This is a cleanup patch which uses the existing plugin test
rather than checking the configure build options.

gcc/testsuite/ChangeLog:

PR modula2/120188
* gm2.dg/doc/examples/plugin/fail/doc-examples-plugin-fail.exp:
Remove call to gm2-dg-frontend-configure-check and replace with
tests for whether plugin variables exist.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

commit | commitdiff | tree

Jakub Jelinek [Tue, 13 May 2025 12:20:22 +0000 (14:20 +0200)]

libfortran: Fix up _gfortran_{,m,s}findloc2_s{1,4} [PR120196]

As mentioned in the PR, _gfortran_{,m,s}findloc2_s{1,4} iterate too many
times in the back case if nothing is found.
For !back, the loops are for (i = 1; i <= extent; i++) so i is in the
body [1, extent] if nothing is found, but for back it is
for (i = extent; i >= 0; i--) so i is in the body [0, extent] and compares
one element before the start of the array.
Note, findloc1_s{1,4} uses
          for (n = len; n > 0; n--, src -= delta * len_array)
for the back loop and
          for (n = 1; n <= len; n++, src += delta * len_array)
for !back.  This patch fixes that.
The testcase fails under valgrind without the libgfortran changes and
succeeds with those.
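
The off-by-one can be seen by listing the 1-based indices each loop condition visits when nothing is found. The helper names are hypothetical and the sketch only models the loop bounds, not the libgfortran comparison code.

```cpp
#include <cassert>
#include <vector>

// Indices visited by the old back loop: for (i = extent; i >= 0; i--)
std::vector<int> old_back_indices(int extent) {
  std::vector<int> visited;
  for (int i = extent; i >= 0; i--)
    visited.push_back(i);
  return visited;
}

// Indices visited by the fixed back loop: for (i = extent; i > 0; i--)
std::vector<int> new_back_indices(int extent) {
  std::vector<int> visited;
  for (int i = extent; i > 0; i--)
    visited.push_back(i);
  return visited;
}
```

For an extent of 3 the old condition visits 3, 2, 1, 0, where index 0 is one element before the start of a 1-based array; the fixed condition stops at 1.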

2025-05-13  Jakub Jelinek  <jakub@redhat.com>

PR libfortran/120196
* m4/ifindloc2.m4 (header1, header2): For back use i > 0 rather than
i >= 0 as for condition.
* generated/findloc2_s1.c: Regenerate.
* generated/findloc2_s4.c: Regenerate.

* gfortran.dg/pr120196.f90: New test.

Mirror of https://gcc.gnu.org/git/gcc.git
