testsuite: arm: adjust inline assembler for arm-none-eabi
The fix applied to toplevel-extended-asm-1_0.c in r16-7892-gb02f9495dcf635,
that defines the symbol as a function, also needs to be applied to
toplevel-simple-asm-1_0.c.
gcc/testsuite/ChangeLog:
* gcc.dg/lto/toplevel-simple-asm-1_0.c: Adjust inline assembler
for arm-none-eabi.
gccrs: Support labeled block value breaks in HIR lowering
This change implements backend lowering support for Rust labeled blocks.
Previously, labeled blocks were rejected in "CompileExpr::visit(BlockExpr)"
as unsupported. With this patch, labeled blocks are lowered by introducing
the following:
1. A backend "LABEL_DECL" used as the jump target for "break 'label".
2. A temporary "Bvariable" used to hold the block’s resulting value.
gcc/rust/ChangeLog:
* backend/rust-compile-expr.cc (CompileExpr::visit): Lower labeled block.
(CompileExpr::construct_block_label): Utility function to construct block label.
(CompileExpr::lookup_label): Utility function to lookup label.
(CompileExpr::lookup_temp_var): Utility function to lookup block temp variables.
(CompileExpr::resolve_util): Utility to resolve NodeId to HirId.
* backend/rust-compile-expr.h: Header functions.
* resolve/rust-late-name-resolver-2.0.cc (Late::visit): Fix label resolution.
gccrs: Add feature gate for rustc_const_stable attribute
rustc_const_stable attributes are used within the core library but were
not properly feature gated. The compiler now rejects their usage when
the feature has not been explicitly enabled.
gcc/rust/ChangeLog:
* checks/errors/feature/rust-feature-gate.cc (FeatureGate::visit): Add
a feature gate around rustc_const_stable attributes.
gccrs: Defer literal suffix validation to parser and preserve source fidelity
Number literal evaluation and suffix validation should be done after macro expansion,
so we defer these to the parser phase. This preserves source fidelity for macro token
trees.
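
As a rough illustration of deferred evaluation (a sketch only, not the gccrs
code), the C function below takes the raw literal text that a lexer would keep
intact, strips underscores, and converts the digits. The name
evaluate_raw_integer and its parameters are hypothetical, and strtoull stands
in for the GMP conversion mentioned in the ChangeLog.

```c
#include <stdlib.h>

/* Hypothetical helper, for illustration only.  RAW is the literal text
   after any base prefix (e.g. "ff_ffu16"), SUFFIX_START is the index at
   which the type suffix begins, and BASE is 2, 8, 10 or 16.
   Returns 0 on success and stores the result in *VALUE.  */
int
evaluate_raw_integer (const char *raw, size_t suffix_start, int base,
                      unsigned long long *value)
{
  char digits[64];
  size_t n = 0;

  for (size_t i = 0; i < suffix_start && raw[i] != '\0'; i++)
    {
      if (raw[i] == '_')
        continue;               /* Underscores are only separators.  */
      if (n + 1 >= sizeof digits)
        return -1;              /* Too long for this sketch.  */
      digits[n++] = raw[i];
    }
  digits[n] = '\0';
  if (n == 0)
    return -1;

  char *end;
  *value = strtoull (digits, &end, base);
  /* Reject the literal if any character was not a digit in BASE.  */
  return *end == '\0' ? 0 : -1;
}
```

Because the raw text is untouched until this step, a macro that receives the
token still sees the underscores and suffix exactly as written.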
gcc/rust/ChangeLog:
* ast/rust-ast-collector.cc (TokenCollector::visit): Update Token::make_int and
Token::make_float calls to include suffix_start and IntegerLiteralBase::Decimal.
* expand/rust-macro-builtins-location.cc (MacroBuiltin::column_handler): Pass string
length and base to Token::make_int.
(MacroBuiltin::line_handler): Likewise.
* lex/rust-lex.cc (Lexer::parse_in_type_suffix): Rename to parse_in_suffix and return
string instead of PrimitiveCoreType.
(Lexer::parse_in_suffix): Remove underscore stripping to preserve source fidelity for
macros.
(Lexer::parse_in_exponent_part): Preserve '+' and '-' characters in the raw string.
(Lexer::parse_in_decimal): Remove underscore stripping.
(Lexer::parse_non_decimal_int_literal): Track suffix start index and pass literal base.
(Lexer::parse_non_decimal_int_literals): Use IntegerLiteralBase enum values instead of
raw integers.
(Lexer::parse_decimal_int_or_float): Track suffix string length and pass base parameters
to token creation.
* lex/rust-lex.h: Update method signatures for suffix parsing.
* lex/rust-token.h (enum class IntegerLiteralBase): New enum to represent numeric bases.
* parse/rust-parse-impl-expr.hxx: Use LiteralResolve functions to evaluate raw token
strings.
* parse/rust-parse-impl-pattern.hxx: Use evaluated literal strings for INT and FLOAT
tokens.
* parse/rust-parse.cc (resolve_literal_suffix): Move suffix validation logic from lexer
to parser.
(evaluate_integer_literal): New function to strip underscores and convert to decimal via
GMP.
(evaluate_float_literal): New function to strip underscores from floats.
* parse/rust-parse.h (evaluate_integer_literal): Declare in LiteralResolve namespace.
(evaluate_float_literal): Likewise.
(resolve_literal_suffix): Likewise.
* util/rust-token-converter.cc (from_literal): Safely reconstruct raw text and suffix to
dynamically determine base and suffix_start for ProcMacros.
gcc/testsuite/ChangeLog:
* rust/compile/deferred-suffix-validation.rs: New test.
* rust/compile/evaluate-integer-or-float.rs: New test.
* rust/compile/tuple-index.rs: New test.
Arthur Cohen [Mon, 23 Mar 2026 11:52:00 +0000 (12:52 +0100)]
gccrs: nr: Do first part of path resolution in types NS
gcc/rust/ChangeLog:
* resolve/rust-name-resolution-context.hxx: Do segment resolution in types NS for more
correctness and correct behavior when later resolving paths that use imports and/or
modules.
Arthur Cohen [Mon, 23 Mar 2026 04:36:16 +0000 (05:36 +0100)]
gccrs: nr: Move path resolution from ForeverStack to NRCtx
gcc/rust/ChangeLog:
* resolve/rust-forever-stack.h: Move declarations from ForeverStack to NRCtx, make most
of the ForeverStack members public as it helps the Ctx a lot.
* resolve/rust-forever-stack.hxx: Move implementation of resolve_path methods to NRCtx.
* resolve/rust-name-resolution-context.h: Declare resolve_path methods.
* resolve/rust-name-resolution-context.hxx: New file with resolve_path impls.
gccrs: Fix ICE cloning trait functions without return types
Fixes Rust-GCC/gccrs#3972.
Trait functions without an explicit return type can have a null
`return_type` in `TraitFunctionDecl`. When such declarations are copied,
the copy constructor and assignment operator currently try to clone the
return type unconditionally, and this can lead to an ICE.
Handle this case by keeping `nullptr` when there is no return type to
clone. Also add a regression test for the example from Rust-GCC/gccrs#3972.
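
The null-preserving copy can be sketched as follows. This is a C analogue of
the pattern, not the C++ code in gccrs: the struct and function names are
hypothetical, and a plain pointer stands in for the owning pointer held by
`TraitFunctionDecl`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the AST types.  */
struct type_node
{
  char name[16];
};

struct trait_function_decl
{
  struct type_node *return_type;   /* NULL when no return type is written.  */
};

/* Deep-copy a type node; the caller owns the result.  */
static struct type_node *
clone_type (const struct type_node *t)
{
  struct type_node *copy = malloc (sizeof *copy);
  if (copy)
    memcpy (copy, t, sizeof *copy);
  return copy;
}

/* Copy a declaration, cloning the return type only when there is one.  */
struct trait_function_decl
copy_decl (const struct trait_function_decl *other)
{
  struct trait_function_decl result;
  result.return_type
    = other->return_type ? clone_type (other->return_type) : NULL;
  return result;
}
```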
gccrs: Fix ICE in get_function_expr when cfg'd return type inside macro
The problem is that cfg-strip emits an error for unstrippable expressions but
doesn't mark the parent for strip, leaving a broken subtree for later
passes to ICE on.
gcc/rust/ChangeLog:
* expand/rust-cfg-strip.cc (CfgStrip::visit): Mark CallExpr for
strip when function expression fails stripping.
(CfgStrip::visit): Mark ArrayIndexExpr for strip when array or
index expression fails stripping.
libgomp/oacc-mem: add missing assert to goacc_enter_datum
A bug I accidentally introduced made it so that new variables are
allocated with some room to spare before them, and ergo, that tgt_offset
!= 0, leading to tests failing in what looked like a strange way. Turns
out, goacc_enter_datum was failing to validate its assumption that
tgt_offset == 0. This patch adds that assert.
libgomp/ChangeLog:
* oacc-mem.c (goacc_enter_datum): Assert that tgt_offset of the
newly-mapped variable is zero.
libgomp/plugin-gcn: remove unneeded heap allocation in run_kernel
So far, the GCN plugin has used a kernel_dispatch struct instance it
calls "shadow" to keep effectively a copy of part of the HSA dispatch
packet before populating said packet. It also allocated it on the heap.
This, at first glance, seems useless: why double up the data in a shadow
when it's already in packet?
But, it serves a purpose. The packet is owned by the HSA runtime.
After dispatch, its contents are to be considered no longer accessible
by the dispatcher (i.e. run_kernel). So, we can't read back from it the
addresses or handles of resources we allocated, and so, we can't clean
them up.
However, this allocation doesn't need to happen on the heap. It's of a
known fixed size, and its lifetime is the same as the lifetime of an
automatic variable.
This patch demotes the heap allocation into an automatic variable, and
adds commentary to make it clear what the purpose of this "shadow" is.
In the end, the result of this patch is that the run_kernel hot path has
one fewer allocation.
I've also taken the opportunity to do some very minor code cleanup.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (struct kernel_dispatch): Store
hsa_signal_t, rather than a uint64_t, so that we don't rely on
knowledge of the contents of hsa_signal_t.
(create_kernel_dispatch): Rename...
(prepare_kernel_dispatch): ... to this, as it no longer creates
a kernel dispatch. The allocation that would've created it is
hoisted...
(run_kernel): ... here, as an automatic variable. Move logic
that copies the fields of kernel_dispatch...
(populate_packet_from_dispatch): ... into this standalone
function, to make it clearer.
(release_kernel_dispatch): Rename...
(cleanup_kernel_dispatch): ... to this, don't free 'shadow'.
libgomp: let plugins handle allocating the target variable table
In my examination of BabelStream results on AMD GCN, I've found that,
for each BabelStream kernel execution, we spend significant time in
allocating and initializing memory in gomp_map_vars (~55µs, whereas the
actual BabelStream code executes in ~746µs, meaning we increase the time
BabelStream measures by 7% just on that).
Upon further examination, I've found that the only reason gomp_map_vars
decides to allocate and map any memory in the first place is because it
is constructing the table of pointers to variables on the target, which
I've taken to calling the "target variable table". Given that the GCN
plugin already must perform some memory allocation before starting up a
kernel, namely to allocate kernel arguments, it would be beneficial if
we could merge this allocation with the kernel arguments allocation.
In addition, since the kernel arguments live in host memory, populating
them can be performed using string functions, without any need to call
for expensive host2dev copies.
This patch introduces an opaque type for "offload sessions". This type
is defined by each plugin and allows it to store data related to a
single offload job. The sessions are allocated and managed by libgomp,
and initialized and utilized by the plugin. Their lifetime starts with
a call to GOMP_OFFLOAD_session_start, and ends with
GOMP_OFFLOAD_{openacc_{async_,}exec,{async_,}run}.
The patch then uses this framework to make management of the target
variable table more flexible: the plugin may elect to implement
GOMP_OFFLOAD_session_allocate_target_var_table, which allows the plugin
to attempt to allocate the target variable table in host memory.
If it fails, or if the plugin does not provide this function, libgomp
will perform this allocation as it does today - in target memory - and
tell the session about it using
GOMP_OFFLOAD_session_set_target_var_table.
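
The fallback described above can be sketched in C roughly as follows. Every
name here is a hypothetical stand-in rather than the libgomp API: the optional
hook plays the role of GOMP_OFFLOAD_session_allocate_target_var_table, the
mandatory one the role of GOMP_OFFLOAD_session_set_target_var_table, and
malloc stands in for both host and device allocation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-offload state owned by the plugin.  */
struct session
{
  void *tvt;            /* Target variable table.  */
  int tvt_on_host;      /* Nonzero if the plugin allocated it itself.  */
};

struct plugin
{
  /* Optional hook: may be NULL, and may return NULL on failure.  */
  void *(*alloc_tvt_on_host) (struct session *, size_t);
  /* Mandatory hook: records a table that the caller allocated.  */
  void (*set_tvt) (struct session *, void *);
  /* Stand-in for an allocation in target memory.  */
  void *(*alloc_on_device) (size_t);
};

/* Prefer the plugin's host-side table; otherwise allocate in target
   memory and tell the session about it.  */
void *
get_target_var_table (const struct plugin *p, struct session *s, size_t size)
{
  if (p->alloc_tvt_on_host)
    {
      void *host = p->alloc_tvt_on_host (s, size);
      if (host)
        return host;
    }
  void *dev = p->alloc_on_device (size);
  p->set_tvt (s, dev);
  return dev;
}

/* Demo hooks used to exercise both paths.  */
void *
demo_host_alloc (struct session *s, size_t size)
{
  s->tvt = malloc (size);
  s->tvt_on_host = s->tvt != NULL;
  return s->tvt;
}

void
demo_set_tvt (struct session *s, void *table)
{
  s->tvt = table;
  s->tvt_on_host = 0;
}

void *
demo_device_alloc (size_t size)
{
  return malloc (size);
}
```

Keeping the host-side hook optional means a plugin that does not implement it
needs no change beyond recording the table it is handed.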
In the case of AMD GCN, upon a call to
GOMP_OFFLOAD_session_allocate_target_var_table, the plugin will
immediately allocate kernel arguments with enough space for the target
variable table, no matter what size libgomp asks for[1], and return
that pointer to libgomp.
This results in the runtime of gomp_map_vars effectively disappearing
from traces.
[1] It may be beneficial to limit this, to some fixed amount, to make it
so that the future allocation cache has a higher cache hit rate. It
may also depend on whether hsa_memory_allocate for kernel arguments
takes runtime proportional to the number of bytes it needs to
allocate.
include/ChangeLog:
* gomp-constants.h (GOMP_VERSION): Bump. Signature of
GOMP_OFFLOAD_run et al changed.
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_run, GOMP_OFFLOAD_exec)
(GOMP_OFFLOAD_async_run, GOMP_OFFLOAD_openacc_async_exec): Pass
session in place of target variable table and devices.
(struct gomp_offload_session): New.
(GOMP_OFFLOAD_session_size): New.
(GOMP_OFFLOAD_check_session_struct): New.
(GOMP_OFFLOAD_session_boilerplate): New.
(GOMP_OFFLOAD_session_start): New.
(GOMP_OFFLOAD_session_allocate_target_var_table): New.
(GOMP_OFFLOAD_session_set_target_var_table): New.
* libgomp.h (struct gomp_target_task): Add offload_session
field.
(struct gomp_device_descr): Add offload session management
functions.
(gomp_offload_session_new): New.
(goacc_map_vars): Add SESSION to signature.
* oacc-host.c (struct gomp_offload_session): Define, for host
offload fallback case.
(host_session_size): New. Implements GOMP_OFFLOAD_session_size.
(host_session_start): New. Implements
GOMP_OFFLOAD_session_start.
(host_session_set_target_var_table): New. Implements
GOMP_OFFLOAD_session_set_target_var_table.
(host_run): Adjust to match GOMP_OFFLOAD_run.
(host_openacc_exec): Adjust to match GOMP_OFFLOAD_openacc_exec.
(host_openacc_async_exec): Adjust to match
GOMP_OFFLOAD_openacc_async_exec.
* oacc-mem.c (acc_map_data): Adjust call to goacc_map_vars.
(goacc_enter_datum): Ditto.
(goacc_enter_data_internal): Ditto.
* oacc-parallel.c (GOACC_parallel_keyed): Allocate and pass
offload session.
(GOACC_data_start): Adjust call to goacc_map_vars.
* plugin/plugin-gcn.c (struct kernel_dispatch): Remove
kernarg_cache_node.
(struct kernargs): Add a flexible array member for the target
variable table.
(struct kernel_launch): Store an offload session rather than
target var. table pointer.
(print_kernel_dispatch): Receive kernargs as parameter.
(struct gomp_offload_session): Define.
(init_session): New.
(GOMP_OFFLOAD_session_start): Implement, using init_session.
(release_session): New.
(alloc_kernargs_on_agent): Rename to...
(allocate_session_kernargs): ... this, store result in
passed-in SESSION, and allocate extra room for target variable
table (rounding it up to nearest multiple of 64 pointers).
(GOMP_OFFLOAD_session_allocate_target_var_table): Implement
using the previous function.
(GOMP_OFFLOAD_session_set_target_var_table): Ditto.
(create_kernel_dispatch): Remove kernarg allocation, instead
receiving it as an argument.
(release_kernel_dispatch): Receive kernargs as an argument,
don't release them.
(run_kernel): Adjust to use sessions.
(destroy_module): Ditto.
(GOMP_OFFLOAD_load_image): Ditto.
(execute_queue_entry): Adjust to match changed struct
kernel_launch.
(queue_push_launch): Ditto.
(gcn_exec): Receive and pass along session.
(GOMP_OFFLOAD_run): Ditto.
(GOMP_OFFLOAD_async_run): Ditto.
(GOMP_OFFLOAD_openacc_exec): Ditto.
(GOMP_OFFLOAD_openacc_async_exec): Ditto.
* plugin/plugin-nvptx.c (struct gomp_offload_session): Define.
(GOMP_OFFLOAD_session_start): Implement.
(GOMP_OFFLOAD_session_set_target_var_table): Implement.
(GOMP_OFFLOAD_openacc_exec): Adjust to receive session.
(GOMP_OFFLOAD_openacc_async_exec): Ditto.
(GOMP_OFFLOAD_run): Ditto.
* target.c (gomp_get_tvt_size): Extract helper from...
(gomp_map_vars_internal): ... here. Receive SESSION, iff doing
target offload. Use a target variable table on the host
allocated by GOMP_OFFLOAD_session_allocate_target_var_table if
possible, or call GOMP_OFFLOAD_session_set_target_var_table with
an allocated device pointer otherwise.
(gomp_map_vars): Update to pass along session.
(goacc_map_vars): Ditto.
(GOMP_target): Allocate and pass along session.
(GOMP_target_ext): Ditto.
(gomp_target_data_fallback): Adjust call to gomp_map_vars.
(GOMP_target_data): Ditto.
(GOMP_target_data_ext): Ditto.
(GOMP_target_enter_exit_data): Ditto.
(gomp_target_task_fn): Start and pass along session, the storage
for which is allocated by gomp_create_target_task.
(DLSYM2): Rename from DLSYM, adding a new parameter for the
variable to populate, akin to DLSYM_OPT.
(DLSYM): Delegate to DLSYM2.
(gomp_load_plugin_for_device): Populate session-related fields.
* task.c (gomp_create_target_task): Allocate enough storage for
an offload session.
* testsuite/libgomp.c-c++-common/gcn-kernel-launch-no-tvt-alloc.c: New test.
* testsuite/libgomp.c-c++-common/gcn-kernel-launch-tvt-alloc.c: New test.
Arsen Arsenović [Thu, 12 Feb 2026 15:42:02 +0000 (15:42 +0000)]
libgomp/gcn: parallelize initializing threads of a team
Currently, libgomp performs initialization of all threads in a team
in its lead thread, and then releases all threads to do work. This
means that, before reaching the release, each thread is doing nothing,
waiting for the lead thread to do lots of thread initialization
operations.
This initialization is identical for each thread.
We can parallelize it by performing this initialization in each thread,
after releasing each. This allows the threads of a team to be released
near-immediately, which should cut team startup time by a factor of just
under the number of threads.
In order to achieve this, the lead thread prepares the parameters each
thread needs for initialization by copying them into an object each will
be able to read from, and only initializes each remaining thread in the
team with a few pointers.
No functional changes intended in this commit. It may seem like there
is a functional change, as gomp_prep_our_thread no longer sets
icv.nthreads_var, whereas the old code did, but the value that was being
set by old code was always equal to the value already present in the
ICV, because both are initialized from the parent task's ICV (or the global
ICV if that's missing) and, hence, the write was always redundant.
libgomp/ChangeLog:
* libgomp.h (struct gomp_thread_start_data): New struct. Holds
thread-independent parameters needed to initialize current
thread.
(struct gomp_team): On GCN, add thr_start_data field, that holds
a gomp_thread_start_data to be used in each thread.
(struct gomp_thread): Add start_data field, that points to
thread initialization parameters.
* config/gcn/team.c (gomp_team_start): Move thread
initialization steps into ...
(gomp_prep_our_thread): ... this new function, such that it reads
from a gomp_thread_start_data object.
(gomp_thread_start): Call the above to initialize our thread.
Hongyu Wang [Fri, 29 May 2026 07:18:08 +0000 (15:18 +0800)]
i386: Add tuning to disable memory-form NDD
Benchmark shows memory form of NDD instructions is not beneficial
on NovaLake. Add X86_TUNE_ENABLE_NDD_MEM tuning (default off) to
deprioritize NDD alternatives with memory source operands via the
preferred_for_speed attribute. For pure NDD patterns that have a
single alternative with rm constraint, split into r,m alternatives
and apply preferred_for_speed on the memory alternative. For legacy
patterns with NDD alternatives, also split the NDD rm constraint
into separate r and m alternatives so the deprioritization targets
only the memory form.
Hongyu Wang [Tue, 26 May 2026 02:29:04 +0000 (07:59 +0530)]
i386: Disable SETcc.ZU generation on DMR/NVL via tune flag
Microbenchmark performance on NovaLake/DiamondRapids shows no benefit
from SETcc.ZU encoding on these cores. Add X86_TUNE_DISABLE_SETZUCC
to suppress setzucc generation for DMR/NVL while keeping it enabled
for other APX-capable targets.
gcc/ChangeLog:
* config/i386/x86-tune.def (X86_TUNE_DISABLE_SETZUCC): New.
Enable for m_DIAMONDRAPIDS | m_NOVALAKE.
* config/i386/i386.h (TARGET_DISABLE_SETZUCC): New define.
* config/i386/i386.md (*setcc_<mode>_zu): Guard with
TARGET_APX_ZU && !TARGET_DISABLE_SETZUCC.
(*setcc_di_1, *setcc_<mode>_1_movzbl): Guard with
(!TARGET_APX_ZU || TARGET_DISABLE_SETZUCC).
(*setcc_qi, *setcc_qi_slp): Emit setzucc only when
TARGET_APX_ZU && !TARGET_DISABLE_SETZUCC.
Andrew Pinski [Sat, 30 May 2026 22:54:59 +0000 (15:54 -0700)]
range-op: Add relation effect for integer mult [PR23471]
tree_expr_nonnegative_p has code to say `a*a` is nonnegative
(for signed integers). But when I removed the call to
gimple_stmt_nonnegative_p, ranger could not figure out
that `a*a` is nonnegative. This shows up as a regression
on riscv with `gcc.target/riscv/rvv/vsetvl/avl_single-65.c`.
But you can also reproduce the same issue if we disable forwprop
and look at the result of EVRP and look at the exported range
for that statement.
The fix is to teach the range multiply operator that when
dealing with `a*a`, the lhs will be non-negative.
That is, add an op1_op2_relation_effect method to operator_mult
that handles the case where the relationship is equal.
vrp-mult-nonneg-1.c is a new testcase that GCC can now optimize
but could not handle before.
vrp-mult-nonneg-2.c is the reduced testcase for PR125513 and the regression.
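
For illustration, the kind of code this affects looks like the function below.
It is a hypothetical example, not the contents of either testcase. Since signed
overflow is undefined, a range-aware optimizer that knows both operands are the
same value may assume the product is non-negative and fold the comparison to 0.

```c
/* Hypothetical example: both multiply operands are the same SSA value,
   so the relation op1 == op2 holds and the result cannot be negative
   for any input that does not overflow.  */
int
square_is_negative (int a)
{
  int sq = a * a;
  return sq < 0;
}
```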
Jeff Law [Sun, 31 May 2026 22:34:42 +0000 (16:34 -0600)]
[RISC-V][PR rtl-optimization/123313] Improve select between reg,-1
So this improves our ability to select across reg,-1. The early versions of
this patch allowed const,-1, but those sequences weren't any better and
occasionally ever-so-slightly worse, so those are rejected. I've spot checked
spec2017 where it does show up, but it's not clear that it's showing up in any
hot code.
The basic idea is to use an sCC to generate 1,0, subtract 1 giving 0, -1. Then
we can IOR that with the other input. Concretely:
> int f(int a, int b, int c)
> {
> a = -1;
> if (c < 10) a = b;
> return a;
> }
Currently generates:
> li a5,9
> addi a1,a1,1
> sgt a2,a2,a5
> czero.nez a2,a1,a2
> addi a0,a2,-1
> ret
After this patch:
> slti a0,a2,10
> addi a0,a0,-1
> or a0,a0,a1
> ret
Probably the same performance on 4+ wide designs (and perhaps often on 2 wide
designs). But it encodes a lot more efficiently, 18 bytes for the first
sequence, just 10 bytes for the second. That can be important on some designs,
particularly since if-converted blocks are more likely to be large and/or cross
cache line boundaries.
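
The sCC / subtract / IOR idea can be written out in C as a sketch. The function
names are made up for illustration, and the second function mirrors the three
instructions of the new sequence rather than anything in ifcvt.cc. It assumes
two's complement, so that -1 has all bits set.

```c
/* Reference form: select between B and -1.  */
int
select_branchy (int b, int c)
{
  return c < 10 ? b : -1;
}

/* Branchless form, one statement per instruction.  */
int
select_branchless (int b, int c)
{
  int flag = c < 10;    /* slti: 1 when the condition holds, else 0.  */
  int mask = flag - 1;  /* addi: 0 when it holds, else -1 (all ones).  */
  return mask | b;      /* or:   B when it holds, else -1.  */
}
```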
This has been bootstrapped and regression tested on x86, and riscv64. The
riscv64 bootstraps were on the Pioneer, K1 (early version of the patch) and K3
(most recent versions). It's also been tested on all the *-elf platforms in my
tester as well as additional bootstraps on platforms like alpha, sh4, etc.
I'll wait for a final confirmation from the pre-commit tester before moving
forward.
PR rtl-optimization/123313
gcc/
* ifcvt.cc (noce_try_store_flag_logical): New function.
(noce_process_if_block): Call it.
gcc/testsuite/
* gcc.target/riscv/pr123313.c: New test.
* gcc.target/riscv/pr124009.c: Adjust expected output.
Daniel Barboza [Tue, 26 May 2026 18:39:30 +0000 (15:39 -0300)]
gcc: finalize deprecation of -Wstrict-overflow
The flag has been documented as deprecated since GCC 8:
https://gcc.gnu.org/gcc-8/changes.html
"-fno-strict-overflow is now mapped to -fwrapv -fwrapv-pointer (...)
-Wstrict-overflow is deprecated."
But we kept it maintained and functional all along these years until GCC
17, where we're now "finalizing its deprecation" so to speak.
Remove the remaining enum and update all relevant docs and tests to make
it official.
gcc/c-family/ChangeLog:
* c.opt: Removed Wstrict-overflow entry.
gcc/ChangeLog:
* common.opt: Updated Wstrict-overflow entries to indicate that
they do nothing now.
* coretypes.h (enum warn_strict_overflow_code): Enum removed.
* doc/invoke.texi: Updated Wstrict-overflow entries to indicate
that they do nothing now. Also removed the option from the
"Warning Options" list.
* opts.cc (common_handle_option): Removed OPT_Wstrict_overflow
logic.
Daniel Barboza [Tue, 26 May 2026 16:56:05 +0000 (13:56 -0300)]
testsuite: remove Wstrict-overflow related tests
The following testsuite PRs are related to the now obsolete
Wstrict-overflow option:
- Bug 36227 - [4.3 Regression] POINTER_PLUS folding introduces undefined
overflow
- Bug 48022 - [4.6 Regression] -Wstrict-overflow warning on code that
doesn't have overflows
- Bug 49705 - -Wstrict-overflow should not diagnose unevaluated
expressions
- Bug 52904 - -Wstrict-overflow false alarm with bounded loop
They are exercising code that no longer exists, so remove all of them.
Two other tests (pr81592.c and pragma-diag-3.c) have Wstrict-overflow
checks that got removed.
gcc/testsuite/ChangeLog:
* gcc.dg/pr81592.c: Removed strict-overflow options.
* gcc.dg/pragma-diag-3.c: Removed Wstrict-overflow option, along
with a reference and tests for PR66098 ("[5 regression] #pragma
diagnostic 'ignored' not fully undone by pop for
strict-overflow").
* gcc.dg/pr36227.c: Removed.
* gcc.dg/pr48022-1.c: Removed.
* gcc.dg/pr48022-2.c: Removed.
* gcc.dg/pr49705.c: Removed.
* gcc.dg/pr52904.c: Removed.
Daniel Barboza [Fri, 16 Jan 2026 17:02:07 +0000 (09:02 -0800)]
testsuite: remove all Wstrict-overflow tests
We're no longer issuing Wstrict-overflow warnings, even with the
-Wstrict-overflow flag being used.
Remove the tests that are still testing for it. They're either compile
tests that are testing cases where a warning shouldn't be issued or
XFAIL tests.
Daniel Barboza [Fri, 16 Jan 2026 16:11:05 +0000 (08:11 -0800)]
gcc/fold-const: remove fold_*_overflow_warnings
fold_undefer_overflow_warnings () is one of the last few places where
Wstrict_overflow warnings are issued. It uses a
fold_deferring_overflow_warnings int to determine whether the warning should
be shown, a fold_defer_overflow_warnings helper that increments it,
a fold_deferred_overflow_warning string that stores the
deferred warning, and an enum for the warning level of the deferred warning.
Daniel Barboza [Mon, 25 May 2026 17:52:56 +0000 (14:52 -0300)]
tree-vrp: remove compare_values_warnv ()
Remove compare_values_warnv and all the Wstrict-overflow related logic
from the previous callers.
gcc/ChangeLog:
* tree-vrp.cc (compare_values_warnv): Removed.
(compare_values): Changed to do the same thing as the former
compare_values_warnv but without the strict_overflow_p logic.
* tree-vrp.h (compare_values_warnv): Removed.
* vr-values.cc (test_for_singularity): Removed the
suppress_warning logic that was used by compare_values_warnv.
Daniel Barboza [Fri, 16 Jan 2026 11:05:18 +0000 (03:05 -0800)]
tree-ssa-reassoc: remove strict_overflow_p from range_entry
Continuing the deprecation of -Wstrict-overflow, remove the
strict_overflow_p flag from range_entry and the two associated
warning_at calls.
gcc/ChangeLog:
* tree-ssa-reassoc.cc (init_range_entry): Removed
strict_overflow_p use.
(force_into_ssa_name): Likewise.
(update_range_test): Removed strict_overflow_p from function
signature, along with the flag associated logic and the
warning_at call.
(optimize_range_tests_xor): Removed strict_overflow_p use.
(optimize_range_tests_diff): Likewise.
(optimize_range_tests_to_bit_test): Likewise.
(optimize_range_tests_cmp_bitwise): Likewise.
(optimize_range_tests_var_bound): Removed strict_overflow_p from
function signature, along with the flag associated logic and the
warning_at call.
(optimize_range_tests): Removed strict_overflow_p use.
* tree-ssa-reassoc.h (struct range_entry): Removed
strict_overflow_p flag.
Georg-Johann Lay [Sun, 31 May 2026 08:11:35 +0000 (10:11 +0200)]
AVR: Support [[len=nl]] in asm templates.
Most inline asm consists of 1-word instructions, so their exact
size can be conveniently specified with [[len=nl]], which is
half the default size.
gcc/
* config/avr/avr.cc (avr_length_of_asm): Support "nl" as
the number of lines in the code template.
* doc/extend.texi (Size of an asm): Document [[len=nl]].
Simplify the description of the default length.
hppa: Fix clear_cache pattern and use it in pa_trampoline_init
The clear_cache pattern was broken and only flushed the instruction
cache. On PA-RISC, both the data and instruction caches need to be
flushed and these flushes need to be separated by a sync instruction.
The code is reworked and simplified.
2026-05-30 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.cc (pa_trampoline_init): Rework to use
clear_cache pattern.
* config/pa/pa.md (dcacheflush): Use "<<" condition instead
of "<<=".
(icacheflush): Remove.
(icacheflush1, icacheflush2, icacheflush3): New flush patterns
for PA 1.x targets, PA 2.0 targets, and PA 1.x no space
register targets.
(clear_cache): Rework to flush data and instruction caches.
Skip flush if the start address is greater than or equal to
the end address. Don't align the end address to a cacheline
boundary. Handle instruction flushes for PA 1.x targets,
PA 2.0 targets, and PA 1.x no space register targets.
Artemiy Volkov [Sat, 30 May 2026 14:59:52 +0000 (14:59 +0000)]
testsuite: adjust tests for FIRSTP/LASTP SVE2 instructions
Looks like I didn't have Wilco's recent r17-843-ge6c1179fd40d1c
when re-testing some parts of the SVE2.2 series, which led to a few
check-function-bodies mismatches discovered by the Linaro CI[0]. The fix
here is to adapt the tests by changing x0 to w0 in the expected
codegen whenever the constant value is unsigned in
aarch64/sve2/acle/general/{first,last}p.c.
Jeff Law [Sat, 30 May 2026 17:13:18 +0000 (11:13 -0600)]
Fix m68k bootstrap due to diagnostic
The m68k port stopped bootstrapping recently because of this error:
> In file included from ./tm.h:31,
> from ../../../gcc/gcc/target.h:52,
> from ../../../gcc/gcc/dwarf2cfi.cc:23:
> ../../../gcc/gcc/dwarf2cfi.cc: In function 'bool dwarf2out_do_cfi_asm()':
> ../../../gcc/gcc/config/m68k/m68k.h:780:22: error: use of logical '||' with constant operand '2' [-Werror=constant-logical-operand]
> 780 | && ((GLOBAL) || (CODE))) \
> | ~~~~~~~~~^~~~~~~~~
> ../../../gcc/gcc/dwarf2cfi.cc:3755:9: note: in expansion of macro 'ASM_PREFERRED_EH_DATA_FORMAT'
> 3755 | enc = ASM_PREFERRED_EH_DATA_FORMAT (/*code=*/2,/*global=*/1);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ../../../gcc/gcc/config/m68k/m68k.h:780:22: note: use '|' for bitwise operation
> ../../../gcc/gcc/dwarf2cfi.cc:3755:9: note: in expansion of macro 'ASM_PREFERRED_EH_DATA_FORMAT'
> 3755 | enc = ASM_PREFERRED_EH_DATA_FORMAT (/*code=*/2,/*global=*/1);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
Note the "2" being passed down. That's what triggers the warning. The
semantics-preserving change is to use !!CODE. Given the port won't currently
bootstrap, I'm committing as-is.
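
A reduced sketch of the idea, using made-up macros rather than the real
ASM_PREFERRED_EH_DATA_FORMAT definition: normalizing CODE to 0 or 1 with !!
leaves the truth value of the || expression unchanged, while no longer handing
a non-boolean constant such as 2 to the logical operator.

```c
/* Hypothetical, simplified macros; not the m68k.h definition.  */
#define ANY_SET_OLD(CODE, GLOBAL) ((GLOBAL) || (CODE))
#define ANY_SET_NEW(CODE, GLOBAL) ((GLOBAL) || !!(CODE))

/* Same constant arguments as the call quoted in the diagnostic.  */
int
uses_new_form (void)
{
  return ANY_SET_NEW (/*code=*/2, /*global=*/1);
}
```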
Jeff Law [Sat, 30 May 2026 17:01:14 +0000 (11:01 -0600)]
Fix scan-asm failure for loongarch64 after recent ext-dce changes
The recent changes to ext-dce to promote narrow operations into wider modes
when the extended bits are dead slightly twiddled the output for a loongarch64
test. This just adjusts the expected output.
xtensa: Optimize 'insvsi' insn pattern if TARGET_DEPBITS is not configured
By default, the RTX generation pass conservatively expands bit-field
insertion to an insn sequence consisting of the bit mask and shift of
the inserted value, the bit-inversion mask of the destination, and finally
a logical-OR.
However, if the logical-AND operation on an inverted bit-field mask is
relatively expensive, it is more advantageous to shift the inserted value
without masking it and then follow the idiom '(A & M) | (B & ~M)' ->
'((A ^ B) & M) ^ B'.
/* example */
struct foo {
unsigned int x:10;
unsigned int y:11;
unsigned int z:11;
};
struct foo test0(struct foo a, unsigned int b) {
a.x = b;
return a;
}
struct foo test1(struct foo a, unsigned int b) {
a.y = b;
return a;
}
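
The equivalence of the two forms can be checked with a small C sketch. The
function names are illustrative, and the mask used for 'y' in the usage note
assumes that field occupies bits 10..20, which depends on the ABI's bit-field
layout.

```c
#include <stdint.h>

/* Conventional merge: take A where M is set, B elsewhere.  */
uint32_t
merge_and_or (uint32_t a, uint32_t b, uint32_t m)
{
  return (a & m) | (b & ~m);
}

/* XOR form: same result without needing the inverted mask.  */
uint32_t
merge_xor (uint32_t a, uint32_t b, uint32_t m)
{
  return ((a ^ b) & m) ^ b;
}

/* Insert VALUE into DEST at SHIFT under MASK.  VALUE is shifted but
   not masked first; the final AND with MASK discards the stray bits.  */
uint32_t
insert_field (uint32_t dest, uint32_t value, unsigned shift, uint32_t mask)
{
  return merge_xor (value << shift, dest, mask);
}
```

For example, insert_field (a, b, 10, 0x001ffc00) would model 'a.y = b' under
the layout assumption above.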
* config/xtensa/xtensa.md (insvsi_internal):
Rename from 'insvsi'.
(insvsi): New expansion pattern that also addresses situations
where the DEPBITS machine instruction is unavailable.
In recent versions of gcc, expressions like '(A & M) | (B & ~M)' are
transformed into '((A ^ B) & M) ^ B' by GIMPLE simplification, so the
existence of that MD pattern is no longer relevant.
Jerry DeLisle [Wed, 27 May 2026 02:21:52 +0000 (19:21 -0700)]
fortran: fix ICE with procedure pointer declared in BLOCK
Procedure pointer declared inside a BLOCK construct in a program that has
contained procedures caused an ICE in convert_nonlocal_reference_op
(tree-nested.cc) because get_proc_pointer_decl set the proc pointer's
DECL_CONTEXT to NULL instead of the enclosing program function decl.
The root cause: the condition to call gfc_add_decl_to_function vs
gfc_add_decl_to_parent_function checked whether proc_name->backend_decl
matched current_function_decl. For a BLOCK namespace the proc_name has
FL_LABEL flavor and its backend_decl is never set, so the condition failed
and gfc_add_decl_to_parent_function was called. That function sets
DECL_CONTEXT to DECL_CONTEXT(current_function_decl), which is NULL for a
top-level program. The tree-nested pass then found no nesting level
matching target_context = NULL and crashed in the internal_error call
dereferencing the NULL target_context.
Fix: add the missing BLOCK namespace check (FL_LABEL flavor) so that
procedure pointers in BLOCK constructs are treated like regular variables
and added to the enclosing function via gfc_add_decl_to_function.
Assisted by: Claude Sonnet 4.6
PR fortran/105582
gcc/fortran/ChangeLog:
* trans-decl.cc (get_proc_pointer_decl): Add FL_LABEL check to
route BLOCK-construct procedure pointers to gfc_add_decl_to_function
rather than gfc_add_decl_to_parent_function.
Jakub Jelinek [Sat, 30 May 2026 15:49:18 +0000 (17:49 +0200)]
c++: Don't ICE on the static constexpr expansion-stmt vars during mangling [PR125123]
The following testcase ICEs, because we decide to mangle the static
constexpr variables created for expansion statements (static for the time
being, as a workaround). And if there is more than one in the same function
and we mangle both, we ICE because they mangle the same.
The problem is that cp_finish_decl does not determine_local_discriminator
for DECL_ARTIFICIAL vars.
The following patch fixes that by calling it even for DECL_ARTIFICIAL vars.
The patch also sets DECL_IGNORED_P on those vars. I think there is no
value in exposing those in the debug information: the iterating is done
at compile time, and all the user IMHO cares about are the individual user
variables initialized to whatever was derived from the temporaries.
2026-05-29 Jakub Jelinek <jakub@redhat.com>
PR c++/125123
* parser.cc (cp_build_range_for_decls): If range_temp or begin
are static, set DECL_IGNORED_P on it.
* pt.cc (finish_expansion_stmt): Similarly for iter.
* decl.cc (cp_finish_decl): Call determine_local_discriminator
etc. also for DECL_ARTIFICIAL TREE_STATIC vars.