[IRA]: Fix soft conflict and hard reg cost calculation
When finding soft conflict in IRA, we wrongly use conflict allocno mode.
This can result in more shuffling on the region borders and worse code
generation. The patch fixes this.
gcc/ChangeLog:
* ira-color.cc (assign_hard_reg): Use the right allocno mode to
call note_conflict.
- ICE verify_vssa exceeds stack space for big functions [PR124805]
The source from PR124561 led to an ICE with --enable-checking, caused by a stack overflow.
The recursive verification code verify_vssa in tree-ssa.cc could not handle the extreme
number of basic blocks within the typical limits of stack space.
As for PR124561 the recursive code was transformed into an iterative version, which
avoided the recursive calls.
A worklist is used, which has as entries a pair of a basic_block and a tree (vdef).
The logic of verification steps for each basic_block is unchanged, although the order
of basic_blocks is changed.
This fixes PR124805.
Reg tested OK.
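The recursion-to-worklist change described above can be sketched as follows. This is a minimal illustration, not the code in tree-ssa.cc: `Block`, `Vdef` and `walk_iteratively` are hypothetical names, blocks are plain indices, and the per-block verification is replaced by a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a block is an index into `succs`, and the
// "vdef" carried along the walk is just an int.
using Block = int;
using Vdef = int;

// Visit every block reachable from `entry` without recursion.  Each
// worklist entry pairs a block with the vdef that is live on entry to it.
// Returns the (block, vdef) pairs in the order they were processed.
std::vector<std::pair<Block, Vdef>>
walk_iteratively(const std::vector<std::vector<Block>> &succs,
                 Block entry, Vdef initial)
{
  assert(entry >= 0 && static_cast<std::size_t>(entry) < succs.size());
  std::vector<bool> visited(succs.size(), false);
  std::vector<std::pair<Block, Vdef>> processed;
  std::vector<std::pair<Block, Vdef>> worklist;
  worklist.emplace_back(entry, initial);

  while (!worklist.empty())
    {
      std::pair<Block, Vdef> item = worklist.back();
      worklist.pop_back();
      if (visited[item.first])
        continue;
      visited[item.first] = true;
      processed.push_back(item);
      // Placeholder for the per-block checks: pretend each block
      // produces a new vdef that its successors see on entry.
      Vdef out = item.second + 1;
      for (Block s : succs[item.first])
        worklist.emplace_back(s, out);
    }
  return processed;
}
```

Stack usage here is constant no matter how long the chain of blocks is; the memory moves to the heap-allocated worklist instead. The visiting order differs from the recursive one, which matches the note above that the order of basic blocks changes.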
2026-04-07 Heiko Eißfeldt <heiko@hexco.de>
PR middle-end/124805
* tree-ssa.cc (verify_vssa):
replace recursive calls with iteration for lower stack usage
Andrew Pinski [Wed, 29 Apr 2026 21:34:32 +0000 (14:34 -0700)]
match: Simplify patterns for `a != b` implies a or b is non-zero
This simplifies the patterns by using a for loop. Also, the `:c` on the
inner ne/eq is not needed, as it matches the same canonicalization as
the inner bit_ior, so remove it too.
This removes a little more than 300 lines from the generated gimple-match*.cc files.
Bootstrapped and tested on x86_64-linux-gnu.
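The boolean identities that follow from "a != b implies (a | b) != 0" can be checked exhaustively for a small type. The sketch below is only a sanity check of the logic; which of these forms match.pd actually contains is not stated here, and `identities_hold` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively check, for all 8-bit a and b, the identities that follow
// from "a != b implies (a | b) != 0" and its contrapositive
// "(a | b) == 0 implies a == b".  Returns true if every identity holds.
bool identities_hold()
{
  for (unsigned i = 0; i < 256; ++i)
    for (unsigned j = 0; j < 256; ++j)
      {
        std::uint8_t a = static_cast<std::uint8_t>(i);
        std::uint8_t b = static_cast<std::uint8_t>(j);
        bool ne = a != b, eq = a == b;
        bool nz = (a | b) != 0, z = (a | b) == 0;
        if ((ne && nz) != ne) return false;  // (a != b) & ((a|b) != 0) -> a != b
        if ((ne || nz) != nz) return false;  // (a != b) | ((a|b) != 0) -> (a|b) != 0
        if ((eq || z) != eq) return false;   // (a == b) | ((a|b) == 0) -> a == b
        if ((eq && z) != z) return false;    // (a == b) & ((a|b) == 0) -> (a|b) == 0
      }
  return true;
}
```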
gcc/ChangeLog:
* match.pd (`(a !=/== b) &\| ((a|b) ==/!= 0)`):
Simplify patterns using for loop and remove the `:c`
on the inner ne/eq.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
aarch64: Handle opts_set parameter properly in aarch64_option_restore
Previously, the AArch64 implementation of TARGET_OPTION_RESTORE ignored
the opts_set parameter and its callee, aarch64_override_options_internal,
invoked SET_OPTION_IF_UNSET with &global_options_set instead of with
opts_set.
That was bad for maintainability, because it was based on an assumption
that cl_target_option_restore would only be called with &global_options_set.
Otherwise, if an option were set in *opts_set but not in global_options_set,
the corresponding value would have been wrongly overridden; conversely, if
an option were set in global_options_set but not in *opts_set then its
value would not have been overridden as expected.
It looks as though cl_target_option_restore is not currently called with
an argument expression other than &global_options_set except by the arm,
i386 and s390 backends. However, ascertaining that and ensuring it will
always be true wastes more time than simply doing the right thing.
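The difference between consulting the caller's opts_set and consulting the global record can be shown with a reduced model. Everything below is hypothetical (the `Options` and `OptionsSet` structs, the `some_param` field, `override_if_unset`); it only mimics the "override unless explicitly set" behaviour of SET_OPTION_IF_UNSET.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for GCC's option structures.
struct Options { int some_param; };
struct OptionsSet { bool some_param; };

// Plays the role of global_options_set: nothing was set explicitly.
OptionsSet global_set = { false };

// Override the value only when the supplied "set" record says the
// option was not set explicitly.
void override_if_unset(Options &opts, const OptionsSet &opts_set, int value)
{
  if (!opts_set.some_param)
    opts.some_param = value;
}
```

With an option marked as set in the caller's record but not in `global_set`, passing the caller's record preserves the value, while passing `global_set` clobbers it. That is the wrong override described above.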
gcc/ChangeLog:
* config/aarch64/aarch64-c.cc (aarch64_pragma_target_parse):
Pass &global_options_set as an argument to
aarch64_override_options_internal.
* config/aarch64/aarch64-protos.h (aarch64_override_options_internal):
Add a parameter declaration for opts_set.
* config/aarch64/aarch64.cc (aarch64_override_options_internal):
Add a parameter declaration for opts_set and use the argument
when invoking SET_OPTION_IF_UNSET.
(aarch64_override_options): Pass &global_options_set as an argument to
aarch64_override_options_internal.
(aarch64_option_restore): As above.
(aarch64_set_current_function): As above.
(aarch64_option_valid_attribute_p): As above.
(aarch64_option_valid_version_attribute_p): As above.
Richard Biener [Thu, 30 Apr 2026 06:39:52 +0000 (08:39 +0200)]
tree-optimization/125088 - some TLC to the new vect_bb_slp_scalar_cost
This realizes that orig_stmt_info == stmt and refactors control flow
around cost recording to avoid the do { } while (false); loop which
had continue stmts confusing coverity.
PR tree-optimization/125088
* tree-vect-slp.cc (vect_bb_slp_scalar_cost): Refactor and
simplify.
* tree-vect-stmts.cc (vect_nop_conversion_p): Exclude
copies with memory accesses.
Richard Biener [Fri, 24 Apr 2026 12:35:49 +0000 (14:35 +0200)]
flip --param ix86-vect-compare-costs default
The following flips the default of ix86-vect-compare-costs as discussed
during stage3/4. It adds the testcase from PR120398 and ensures the
existing one works without specifying the --param.
Testcases have been adjusted with simple dump scan adjustments.
gcc.target/i386/vect-epilogues-10.c shows that we compute the
masked epilog to be more expensive than the not masked one. That's
probably correct as we're facing an in-order reduction. I have
added -fno-vect-cost-model given this is a testcase for a missing
feature.
PR tree-optimization/120398
PR tree-optimization/123603
* config/i386/i386.opt (ix86-vect-compare-costs): Default to 1.
Richard Biener [Wed, 29 Apr 2026 13:26:35 +0000 (15:26 +0200)]
[x86] Avoid gcc.target/i386/shift-gf2p8affine-?.c fails with compare costs
The following disables epilogue vectorization for the
gcc.target/i386/shift-gf2p8affine-?.c tests so they pass with both
--param ix86-vect-compare-costs=1 and =0.
Richard Biener [Wed, 29 Apr 2026 09:07:06 +0000 (11:07 +0200)]
[x86] Adjust gcc.target/i386/vect-epilogues-2.c and vect-pr113078.c
The following adjusts two very similar testcases that, when
vector cost comparison is enabled and with generic tuning,
choose to use SSE vector size for the vector epilogue as that
reduces the possible iterations through the scalar epilogue
following that and thus speeds up the overall epilogue processing
for a majority of cases. I have chosen to duplicate the
testcases for --param ix86-vect-compare-costs=0 and =1.
Richard Biener [Tue, 28 Apr 2026 13:44:41 +0000 (15:44 +0200)]
[x86] Adjust gcc.target/i386/vect-strided-?.c for cost compare
With cost comparison and MMX-with-SSE vector width available we
prefer to use V2SImode over V4SImode with shuffles, rightfully
so I think. The following adds variants with explicit cost
compare enabled and disabled and adjusts the cost comparison
variant accordingly.
The following resolves the gcc.target/i386/vect-epilogues-3.c failure
when --param ix86-vect-compare-costs=1 is specified. When the target
requests multiple epilogues to be used and the new candidate is the
epilogue of choice of the currently prevailing epilogue keep that.
But avoid doing so if the new candidate uses a vectorization factor
of one which should be an optimal vector epilog. This avoids
regressing gcc.dg/vect/costmodel/x86_64/costmodel-pr122573.c
* config/i386/i386.cc (ix86_vector_costs::better_epilogue_loop_than_p):
New. If the other loop suggests this as epilog prefer other.
This overrides vector_costs::better_main_loop_than_p to avoid
regressing gcc.target/i386/vect-partial-vectors-2.c with
--param ix86-vect-compare-costs=1. As the user (or a tuning model)
asks for masked epilogs, the vectorizer considers masking the
main loop in case it effectively works as a standalone vector epilog
due to known small number of iterations of the loop. While the
generic cost compare rightfully figures masking of AVX is more expensive
than not masking with SSE it does not consider the cost of the epilog.
This compensates with an x86-specific heuristic that prefers the
masked loop if the loop cannot be vectorized with a non-masked
main loop and at most a single vector epilog plus a single scalar
epilog iteration. This is a reasonable heuristic for x86 and
a small number of iterations as icache footprint matters here,
so considering the possibility of 3 vector epilogs and 1 scalar
iteration does not look profitable. Unless testcases will prove
to us otherwise.
I'm not sure if it makes sense to preserve --param ix86-vect-compare-costs=0
in the end, if people think so I'll duplicate the testcase with
both modes explicitly specified.
* tree-vectorizer.h (vector_costs::vinfo): New accessor.
* config/i386/i386.cc (ix86_vector_costs::better_main_loop_than_p):
Prefer a masked main loop if we can elide enough of (vector)
epilog loop iterations.
This expands on the changes from test fix r16-6710-gda5a5c55284969:
* test names now reflect the size of the generator range,
* code repeated between tests was extracted to run_generator,
* expanded non-power-of-two range types to cover all IEC559 floating-point types,
* select values to test based on the size of the mantissa instead of the type,
handling different long double representations.
The tests now cover the cases where multiple values greater than one are
produced (and skipped) in a row. To keep the test from looping forever,
the number of skips per element is limited by the max_skips_per_elem template
parameter of run_generator.
The values checked in test_2p31m1<double> differ from their old test03<double>
counterpart, as we now request mantissa - 5 bits for each type (48 bits for
ieee64) instead of the previously hardcoded 30 bits.
Andrew Pinski [Wed, 29 Apr 2026 19:49:49 +0000 (12:49 -0700)]
testsuite: Fix cond-add-vec-2.C and make cond-add-vec-1.C test some more
With -march=cascadelake/-mavx512f, the VEC_COND_EXPR is turned into a COND_ADD.
This breaks cond-add-vec-2.C check to make sure the conditional add is still there.
So we need to check for COND_ADD or VEC_COND_EXPR in forwprop1.
Even though cond-add-vec-1.C works right now, it is best to make sure COND_ADD is
not there.
Pushed as obvious after testing with and without -march=cascadelake on x86_64.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/cond-add-vec-1.C: Add a check to make sure COND_ADD
is not there either.
* g++.dg/tree-ssa/cond-add-vec-2.C: Change the check for VEC_COND_EXPR
to allow for COND_ADD.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Wilco Dijkstra [Tue, 3 Feb 2026 18:31:51 +0000 (18:31 +0000)]
AArch64: Deprecate -mpc-relative-literal-loads
Deprecate -mpc-relative-literal-loads. Emitting special symbols in
the text section causes issues (see PR123791). Since the option is
relatively obscure and GCC now uses anchors for literals, there is
no need to keep it.
Cleanup code models - remove the confusing AARCH64_CMODEL_TINY_PIC,
AARCH64_CMODEL_SMALL_PIC and AARCH64_CMODEL_SMALL_SPIC. This simplifies
a lot of code. No change to generated code.
[LRA]: Fix a bug in updating live info in rematerialization
LRA rematerialization ignores that a pseudo can require more than one hard reg
when updating live hard reg info. This can result in wrong
rematerialization. The patch fixes this.
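A minimal sketch of the idea, assuming a fixed-size set of hard registers: when a pseudo occupies several consecutive hard registers, all of them must be marked live, not only the first. The names `HardRegSet`, `kNumHardRegs` and `mark_live` are hypothetical and do not come from lra-remat.cc.

```cpp
#include <bitset>
#include <cassert>

constexpr int kNumHardRegs = 32;  // assumption for this sketch
using HardRegSet = std::bitset<kNumHardRegs>;

// Mark every hard register occupied by a pseudo that was assigned
// `first_hard_reg` and needs `nregs` consecutive registers in its mode.
void mark_live(HardRegSet &live, int first_hard_reg, int nregs)
{
  assert(first_hard_reg >= 0 && nregs >= 1
         && first_hard_reg + nregs <= kNumHardRegs);
  for (int i = 0; i < nregs; ++i)
    live.set(first_hard_reg + i);
}
```

Calling `mark_live(live, 4, 1)` for a pseudo that really needs two registers would leave register 5 looking free, which is the kind of stale liveness information that can lead to a wrong rematerialization.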
gcc/ChangeLog:
* lra-remat.cc (do_remat): Use the right nregs for pseudo hard reg
when updating live hard regs.
[IRA]: Use correct allocno when building conflicts
When conflicts are built in IRA, the wrong conflict allocno is taken. The
allocno is used only in an assertion, which is then always true and checks
nothing. The patch fixes this.
gcc/ChangeLog:
* ira-conflicts.cc (build_object_conflicts): Use the right
conflicting allocno.
Richard Biener [Wed, 29 Apr 2026 12:28:49 +0000 (14:28 +0200)]
tree-optimization/125080 - fix SLP scalar stmt coverage for instance roots
Even instance roots can be mentioned in externs of other instances
and thus have to be kept scalar. Consider that.
PR tree-optimization/125080
* tree-vect-slp.cc (vect_bb_slp_mark_stmts_vectorized): Only
add instance root stmts to scalar coverage if they do not
appear in externs.
Patrick Palka [Wed, 29 Apr 2026 12:48:50 +0000 (08:48 -0400)]
c++/modules: memfn merging wrt to obj-ness [PR125035]
Here we ICE during declaration merging for the streamed-in static A::f
because we incorrectly match with the in-TU iobj A::f instead of the
in-TU static A::f.
The problem is the merge key doesn't have enough information to discern
between two overloads that essentially only differ by whether they have
an object parameter (and whether it's implicit or explicit). To that end
this patch adds iobj_p and xobj_p bits to merge_key.
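To illustrate why the extra bits matter, here is a reduced model of a merge key. It is a sketch only: `MergeKey`, its string fields and `same_entity` are hypothetical, and the real merge_key in module.cc carries different data.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced model of a merge key for member functions.
struct MergeKey
{
  std::string name;
  std::string parm_types;  // non-object parameter types only
  bool iobj_p;             // has an implicit object parameter
  bool xobj_p;             // has an explicit object parameter
};

// Two declarations may be merged only if every part of the key agrees.
bool same_entity(const MergeKey &a, const MergeKey &b)
{
  return a.name == b.name && a.parm_types == b.parm_types
         && a.iobj_p == b.iobj_p && a.xobj_p == b.xobj_p;
}
```

Without the two flags, a static `f(int)` and a non-static `f(int)` would produce identical keys in this model, which mirrors the mismatch described above.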
PR c++/125035
gcc/cp/ChangeLog:
* module.cc (merge_key): Add iobj_p and xobj_p bits.
(trees_out::key_mergeable) <case MK_named>: Set and stream
merge_key's iobj_p and xobj_p bits.
(check_mergeable_decl) <case FUNCTION_DECL>: Compare merge_key's
iobj_p and xobj_p bits with that of the given function.
(trees_in::key_mergeable): Stream merge_key's iobj_p and xobj_p
bits.
gcc/testsuite/ChangeLog:
* g++.dg/modules/merge-22.h: New test.
* g++.dg/modules/merge-22_a.H: New test.
* g++.dg/modules/merge-22_b.C: New test.
Patrick Palka [Wed, 29 Apr 2026 12:48:31 +0000 (08:48 -0400)]
c++/modules+reflection: fix merging typedef struct { } A [PR124582]
r16-7903 changed the representation of typedefs to an unnamed type, such
as typedef struct { } A, so that we preserve both the unnamed and typedef
TYPE_DECL rather than replacing the unnamed decl. This patch teaches
modules declaration merging to handle the new representation when streaming
in the unnamed decl, working around the fact that the unnamed decl isn't
visible to name lookup but still has the same DECL_NAME as the typedef decl.
PR c++/124582
PR c++/123810
gcc/cp/ChangeLog:
* module.cc (check_mergeable_decl) <case TYPE_DECL>: Handle
merging a typedef to an unnamed type with the -freflection
representation.
gcc/testsuite/ChangeLog:
* g++.dg/modules/anon-4.h: New test.
* g++.dg/modules/anon-4_a.H: New test.
* g++.dg/modules/anon-4_b.C: New test.
Xin Liu [Wed, 29 Apr 2026 10:56:50 +0000 (10:56 +0000)]
i386: Support HYGON c86-4g series processors
This patch enables detection of the new x86 CPU vendor ID HYGON
and adds support for the c86-4g series processors c86-4g-m{4,6,7}.
Without such support, users passing -march=native on HYGON
machines can get an old arch such as core2, which is
suboptimal. The patch also enables -m{arch,tune}=c86-4g-m{4,6,7}.
Based on the hardware characteristics,
appropriate cost models and tuning parameters are provided.
New machine description files are introduced: c86-4g.md is
used to describe the pipeline for c86-4g-m4 and c86-4g-m6,
while c86-4g-m7.md describes the pipeline for c86-4g-m7.
To better model some pipeline information, it introduces
new attrs c86_attr and c86_decode by following existing
practice.
Bootstrapped and regtested on hygon c86-4g-m4 and c86-4g-m7
machines, as well as a cfarm x86-64 machine.
Co-authored-by: Zhaoling Bao <baozhaoling@hygon.cn>
Signed-off-by: Xin Liu <liulxx@hygon.cn>
Signed-off-by: Zhaoling Bao <baozhaoling@hygon.cn>
gcc/ChangeLog:
* gcc.target/i386/builtin_target.c: Add handling for HYGON CPUs by
validating the vendor and invoking HYGON-specific CPU detection.
* gcc.target/i386/funcspec-56.inc: Test function target attribute on
{arch,tune}=c86-4g-m{4,6,7}.
* g++.target/i386/mv33.C: New test.
Julian Brown [Wed, 29 Apr 2026 10:12:13 +0000 (12:12 +0200)]
OpenMP: Expand "declare mapper" mappers for target {enter,exit,} data directives
This patch allows 'declare mapper' mappers to be used on 'omp target
data', 'omp target enter data' and 'omp target exit data' directives.
For each of these, only explicit mappings are supported, unlike for
'omp target' directives where implicit uses of variables inside an
offload region might trigger mappers also.
Add support for C and C++.
The patch also adjusts 'map kind decay' to match OpenMP 5.2 semantics,
which is particularly important with regard to 'exit data' operations.
gcc/c-family/
* c-common.h (c_omp_region_type): Add C_ORT_EXIT_DATA,
C_ORT_OMP_EXIT_DATA.
(c_omp_instantiate_mappers): Add region type parameter.
* c-omp.cc (omp_split_map_kind, omp_join_map_kind,
omp_map_decayed_kind): New functions.
(omp_instantiate_mapper): Add ORT parameter. Implement map kind decay
for instantiated mapper clauses.
(c_omp_instantiate_mappers): Add ORT parameter, pass to
omp_instantiate_mapper.
gcc/c/
* c-parser.cc (c_parser_omp_target_data): Instantiate mappers for
'omp target data'.
(c_parser_omp_target_enter_data): Instantiate mappers for 'omp target
enter data'.
(c_parser_omp_target_exit_data): Instantiate mappers for 'omp target
exit data'.
(c_parser_omp_target): Add c_omp_region_type argument to
c_omp_instantiate_mappers call.
* c-tree.h (c_omp_instantiate_mappers): Remove spurious prototype.
gcc/cp/
* parser.cc (cp_parser_omp_target_data): Instantiate mappers for 'omp
target data'.
(cp_parser_omp_target_enter_data): Instantiate mappers for 'omp target
enter data'.
(cp_parser_omp_target_exit_data): Instantiate mappers for 'omp target
exit data'.
(cp_parser_omp_target): Add c_omp_region_type argument to
c_omp_instantiate_mappers call.
* pt.cc (tsubst_omp_clauses): Instantiate mappers for OMP regions other
than just C_ORT_OMP_TARGET.
(tsubst_expr): Update call to tsubst_omp_clauses for OMP_TARGET_UPDATE,
OMP_TARGET_ENTER_DATA, OMP_TARGET_EXIT_DATA stanza.
* semantics.cc (cxx_omp_map_array_section): Avoid calling
build_array_ref for non-array/non-pointer bases (error reported
already).
gcc/testsuite/
* c-c++-common/gomp/declare-mapper-15.c: New test.
* c-c++-common/gomp/declare-mapper-16.c: New test.
* g++.dg/gomp/declare-mapper-1.C: Adjust expected scan output.
Pan Li [Mon, 27 Apr 2026 02:01:52 +0000 (10:01 +0800)]
RISC-V: Combine vec_duplicate + vmsgtu.vv to vmsgtu.vx on GR2VR cost
This patch combines vec_duplicate + vmsgtu.vv into vmsgtu.vx, as in
the example code below. The related pattern depends on the cost of
vec_duplicate from GR2VR: late-combine performs the combination
if the GR2VR cost is zero and rejects it if the GR2VR cost is
greater than zero.
Assume we have asm code like below, GR2VR cost is 0.
Jakub Jelinek [Wed, 29 Apr 2026 09:12:47 +0000 (11:12 +0200)]
bitintlower: Padding bit fixes, part 6 [PR123635]
I've missed torture/bitint-{93,94}.c FAILs on s390x-linux (i.e. big endian).
For __builtin_mul_overflow, the code to extend the partial most significant
limb is done before memmoving it down, so on big endian that limb isn't
at offset 0 but at nelts - obj_nelts. The following patch computes
obj_nelts first and uses it on big-endian; so that the offset checking
asserts don't trigger, on big-endian it also passes NULL_TREE as the first
argument to limb_access.
2026-04-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/123635
* gimple-lower-bitint.cc (bitint_large_huge::finish_arith_overflow):
Move obj_nelts/atype computation before bitint_extended handling. For
bitint_big_endian in the bitint_extended handling use size_zero_node
only for limb_access_type calls, otherwise use
size_int (nelts - obj_nelts) and pass NULL_TREE as first argument to
limb_access calls.
BB SLP: Enabling reduction root finding for sum-of-diff kind of patterns
Add an optional parameter allow_alt_code to vect_slp_linearize_chain
(default true). When false, do not follow into MINUS_EXPR when
building a PLUS reduction chain; treat MINUS results as leaves.
This will allow "sum of diffs" (d_i = a[i]-b[i], sum = d0+...+dN)
kind of pattern to be recognized and vectorized. Pure PLUS chains
will still work; other callers of vect_slp_linearize_chain keep the
default. Once support for MINUS_EXPR in the chain is added, this
call site can be switched to allow_alt_code true.
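The effect of the new flag can be sketched on a toy expression tree. This is an illustration under assumptions, not the vectorizer code: `Op`, `Expr` and `linearize` are hypothetical, and the sketch ignores the per-operand operation codes a real linearization would also have to record when it follows MINUS.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature expression tree; not GCC's gimple representation.
enum class Op { Leaf, Plus, Minus };

struct Expr
{
  Op op;
  int id;           // identifies the node in the output
  const Expr *lhs;  // null for leaves
  const Expr *rhs;  // null for leaves
};

// Collect the leaves of a PLUS chain rooted at `e`.  When allow_alt_code
// is false, a MINUS node is not followed and is recorded as a leaf itself.
void linearize(const Expr *e, bool allow_alt_code, std::vector<int> &leaves)
{
  assert(e != nullptr);
  bool follow = e->op == Op::Plus
                || (e->op == Op::Minus && allow_alt_code);
  if (!follow)
    {
      leaves.push_back(e->id);
      return;
    }
  linearize(e->lhs, allow_alt_code, leaves);
  linearize(e->rhs, allow_alt_code, leaves);
}
```

For `sum = (a0 - b0) + (a1 - b1)`, passing false yields the two difference nodes as leaves, so the sum is a pure PLUS reduction over values that can be computed separately; passing true yields the four original operands.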
gcc/ChangeLog:
* tree-vect-slp.cc (vect_slp_linearize_chain): Optional parameter
allow_alt_code added (default true), check added not to follow
MINUS_EXPR, when false.
(vect_slp_check_for_roots): Calls vect_slp_linearize_chain with
parameter allow_alt_code set to false.
Jakub Jelinek [Wed, 29 Apr 2026 06:01:02 +0000 (08:01 +0200)]
c++: Fix up REFLECT_BASE comparison
While writing testcase for PR125007 I found an ICE in cp_tree_equal.
The r16-7260 change to compare_reflections broke REFLECT_BASE comparisons.
It now calls cp_tree_equal on their REFLECT_EXPR_HANDLE which is TREE_BINFO.
It works if lhs == rhs, returns true, or if TREE_CODE is different (returns
false), but otherwise the function isn't prepared to handle TREE_BINFO
and because TREE_BINFO is tcc_exceptional, ends with
default:
gcc_unreachable ();
(for --disable-checking it actually works by doing return false; after
this). This patch fixes that in the third hunk by doing lhs == rhs
comparison only.
2026-04-29 Jakub Jelinek <jakub@redhat.com>
* reflect.cc (compare_reflection): For REFLECT_BASE use lhs == rhs rather
than cp_tree_equal.
Jakub Jelinek [Wed, 29 Apr 2026 05:58:46 +0000 (07:58 +0200)]
testsuite: Diagnose non-uglified names even in requires exprs
I was worried we don't handle lambda parameters/captures,
but apparently we do (so I have just added tests to verify that),
and noticed we don't handle params of requires expressions, so
added test coverage for that and handled those in the plugins.
2026-04-29 Jakub Jelinek <jakub@redhat.com>
* g++.dg/plugin/uglification_plugin.cc (plugin_check_tree): Walk
REQUIRES_EXPR_PARMS of REQUIRES_EXPR.
(plugin_walk_decl): Walk TEMPLATE_PARMS_CONSTRAINTS using
plugin_check_tree. Walk DECL_INITIAL of CONCEPT_DECL as well.
* g++.dg/plugin/uglification.C: Add tests for non-uglified names
in lambda parameters, lambda captures and requires expressions.
Reviewed-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jakub Jelinek [Wed, 29 Apr 2026 05:55:02 +0000 (07:55 +0200)]
testsuite: Add plugin to verify bits/std.cc exports
The following patch adds another g++.dg/plugin/ testsuite plugin,
this time to verify whether some std.cc exports aren't mistakenly
omitted.
The patch is a reworked version of the
https://gcc.gnu.org/pipermail/libstdc++/2025-August/thread.html#62859
proof of concept. That version just dumped out everything it saw
in the std namespace and its child namespaces (excluding non-inline
subnamespaces with identifiers starting with underscore) and then I've
used sed&grep to form a list of omissions.
This patch keeps the previous walk of std namespace and namespaces children
of it, but it only reports (in this version using error_at instead of inform
previously) what it finds if it isn't exported from the module and is not
deprecated (deprecated attribute is used usually for zombie.names in the
standard).
I've been struggling with the detection of what is and what isn't exported,
and had to try several different methods.
What is DECL_MODULE_EXPORT_P is ignored, but that is not set on everything
actually exported. In other cases there is OVL_EXPORT_P flag on OVERLOAD
(but OVL_HIDDEN_P at the start doesn't have it). Another case are inline
namespaces, e.g. for std::filesystem::__cxx11::begin or
std::filesystem::__cxx11::directory_iterator. In the latter case, there
is no sign of the above flags in __cxx11 binding entry, but there is a
USING_DECL with the same name directly in std::filesystem. And for begin
there is OVERLOAD with OVL_EXPORT_P in std::filesystem but not in
std::filesystem::__cxx11.
2026-04-29 Jakub Jelinek <jakub@redhat.com>
* g++.dg/plugin/plugin.exp: Set PLUGIN_DEFAULT_REPO. Add
set*module*exports* to plugin_test_list. Remove *.gcm files
at the start and end.
* g++.dg/plugin/std_module_exports_plugin.cc: New file.
* g++.dg/plugin/std-module-exports-c++20.C: New test.
* g++.dg/plugin/std-module-exports-c++23.C: New test.
* g++.dg/plugin/std-module-exports-c++26.C: New test.
Reviewed-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jakub Jelinek [Wed, 29 Apr 2026 05:49:32 +0000 (07:49 +0200)]
testsuite: Add C++ plugin to check for libstdc++ header uglification
The following patch adds a plugin (sorry, to the check-g++ testsuite rather
than the libstdc++ testsuite, because in plugin.exp we have all the needed
infrastructure).
The plugin diagnoses non-obfuscated function parameter names, automatic
variable names, template arguments, requires arguments etc., but as an
exception allows non-obfuscated names which appear as function/template
etc. names in std namespace. The uglification.C test verifies the plugin
diagnoses what it should.
2026-04-29 Jakub Jelinek <jakub@redhat.com>
* g++.dg/plugin/plugin.exp (plugin_test_list): Add uglification tests.
* g++.dg/plugin/uglification_plugin.cc: New file.
* g++.dg/plugin/uglification.C: New test.
* g++.dg/plugin/uglification-c++98.C: New test.
* g++.dg/plugin/uglification-c++11.C: New test.
* g++.dg/plugin/uglification-c++14.C: New test.
* g++.dg/plugin/uglification-c++17.C: New test.
* g++.dg/plugin/uglification-c++20.C: New test.
* g++.dg/plugin/uglification-c++23.C: New test.
* g++.dg/plugin/uglification-c++26.C: New test.
Reviewed-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
libstdc++: simd: x86: accept 64-bit long double as double [PR124657]
Various simd_x86 functions that handle double need to be adjusted to
match 64-bit long double as well.
Introduce __is_x86_ps<_Tp>() and __is_x86_pd<_Tp>() and use them
instead of is_same_v<_Tp, float> and is_same_v<_Tp, double>,
respectively.
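One way such predicates could look is a size-based check instead of an exact type match. This is an assumption made for illustration: the actual definitions in simd_x86.h are not shown here, and `is_x86_ps` / `is_x86_pd` below are hypothetical user-level names rather than the library's reserved-name helpers.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical predicate: T is a floating-point type that is laid out
// like a 32-bit float ("packed single").
template <typename T>
constexpr bool is_x86_ps()
{
  return std::is_floating_point<T>::value && sizeof(T) == 4;
}

// Hypothetical predicate: T is a floating-point type that is laid out
// like a 64-bit double ("packed double").  On targets where long double
// is 64 bits wide this also accepts long double.
template <typename T>
constexpr bool is_x86_pd()
{
  return std::is_floating_point<T>::value && sizeof(T) == 8;
}
```

A plain `is_same_v<_Tp, double>` test rejects long double even when the two types share a representation, which is the case the commit addresses.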
for libstdc++-v3/ChangeLog
PR libstdc++/124657
* include/experimental/bits/simd_x86.h
(__is_x86_ps<_Tp>): New. Replace is_same_v<_Tp, float> with it.
(__is_x86_pd<_Tp>): New. Replace is_same_v<_Tp, double> with it.
libstdc++: follow std in numeric_limits<bool>::traps and integral traps
There's a comment from 2002 suggesting that
numeric_limits<bool>::traps was in a DR, but C++ standards including
11, 17 and 23 explicitly set it to false, presumably in response to
issue 184.
Issue 554 clarifies that traps is about values that may trap, rather
than operations that may trap, so we were wrong in the interpretation
about divide-by-zero operations' trapping on integral types that led
to __glibcxx_integral_traps's defaulting to true, and some of its
overrides.
Align numeric_limits<bool>::traps with the standard, default
__glibcxx_integral_traps to false, drop the overriders based on the
incorrect interpretation, but keep __glibcxx_integral_traps to allow
command-line restoring of this ABI fix, and for the admittedly
unlikely case of trapping integral values' coming to exist on some
architecture.
for libstdc++-v3/ChangeLog
* include/std/limits (__glibcxx_integral_traps): Set to
false. Update comments.
(numeric_limits<bool>::traps): Drop comments.
* config/cpu/arm/cpu_defines.h: Remove.
* config/cpu/powerpc/cpu_defines.h: Likewise.
* configure.host (cpu_defines_dir): Adjust.
David Malcolm [Tue, 24 Feb 2026 23:47:35 +0000 (18:47 -0500)]
analyzer: new warning: -Wanalyzer-div-by-zero (PR analyzer/124217)
gcc/analyzer/ChangeLog:
PR analyzer/124217
* analyzer.opt (Wanalyzer-div-by-zero): New.
* analyzer.opt.urls: Regenerate.
* region-model.cc (class div_by_zero_diagnostic): New.
(region_model::get_gassign_result): Add warning for division by
zero if ctxt is non-null. Bail out on such cases even if ctxt
is null.
* svalue.cc (type_can_have_value_range_p): Also handle frange.
David Malcolm [Mon, 23 Mar 2026 18:29:27 +0000 (14:29 -0400)]
analyzer: split out various pending_diagnostic subclasses from region-model.cc
Split up region-model.cc somewhat. No functional change intended.
gcc/ChangeLog:
* Makefile.in (ANALYZER_OBJS): Add
analyzer/poisoned-value-diagnostic.o,
analyzer/shift-diagnostics.o, and
analyzer/write-to-const-diagnostics.o.
gcc/analyzer/ChangeLog:
* poisoned-value-diagnostic.cc: New file, taken from material in
region-model.cc.
* region-model.cc (class poisoned_value_diagnostic): Move to
poisoned-value-diagnostic.cc.
(class shift_count_negative_diagnostic): Move to
shift-diagnostics.cc.
(class shift_count_overflow_diagnostic): Likewise.
(region_model::get_gassign_result): Use factory functions when
creating diagnostics so that the subclasses can be moved to their
own source files.
(region_model::check_for_poison): Likewise.
(region_model::deref_rvalue): Likewise.
(class write_to_const_diagnostic): Move to
write-to-const-diagnostics.cc.
(class write_to_string_literal_diagnostic): Likewise.
(region_model::check_for_writable_region): Use factory functions
when creating diagnostics so that the subclasses can be moved to
their own source files.
* region-model.h (make_poisoned_value_diagnostic): New decl.
(make_shift_count_negative_diagnostic): New decl.
(make_shift_count_overflow_diagnostic): New decl.
(make_write_to_const_diagnostic): New decl.
(make_write_to_string_literal_diagnostic): New decl.
* shift-diagnostics.cc: New file, taken from material in
region-model.cc.
* write-to-const-diagnostics.cc: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 15 Jan 2026 19:17:08 +0000 (14:17 -0500)]
analyzer: use concrete_binding_map for compound_svalue (PR analyzer/123145)
A compound_svalue can only have concrete bindings. Capture this in the
type system by splitting out the concrete parts of class binding_map
into a new class concrete_binding_map, and use the latter for
compound_svalue. This also allows some simplifications and
optimizations, where we can use bit_range rather than binding keys.
No functional change intended.
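As a rough picture of a map keyed directly by bit ranges, consider the sketch below. It is not the analyzer's concrete_binding_map: `BitRange` and `ConcreteBindings` are hypothetical, values are plain ints, and overlapped bindings are simply dropped whole, where a real implementation may need to preserve the non-overlapping parts.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical reduced model: a half-open range of bits [start, start + size).
struct BitRange
{
  unsigned long start;
  unsigned long size;
};

// Maps non-overlapping concrete bit ranges to values (plain ints here).
class ConcreteBindings
{
public:
  // Bind `value` to `r`, first dropping any binding that overlaps `r`.
  void put(BitRange r, int value)
  {
    assert(r.size > 0);
    for (auto it = m_map.begin(); it != m_map.end();)
      {
        unsigned long s = it->first;
        unsigned long e = s + it->second.size;
        if (s < r.start + r.size && r.start < e)
          it = m_map.erase(it);
        else
          ++it;
      }
    m_map[r.start] = Entry{r.size, value};
  }

  // Return the value bound to exactly `r`, or nullptr if there is none.
  const int *get_exact(BitRange r) const
  {
    auto it = m_map.find(r.start);
    if (it == m_map.end() || it->second.size != r.size)
      return nullptr;
    return &it->second.value;
  }

  std::size_t size() const { return m_map.size(); }

private:
  struct Entry { unsigned long size; int value; };
  std::map<unsigned long, Entry> m_map;
};
```

Because every key is a concrete range, lookups and overlap checks work on plain offsets and sizes, with no need to distinguish symbolic keys.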
gcc/analyzer/ChangeLog:
PR analyzer/123145
* access-diagram.cc
(compound_svalue_spatial_item::compound_svalue_spatial_item):
Update for compound_svalue using concrete_binding_map rather than
binding_map.
* bounds-checking.cc (strip_types): Likewise.
* call-summary.cc
(call_summary_replay::convert_svalue_from_summary_1): Update for
reimplementation of class binding_map.
(call_summary_replay::convert_svalue_from_summary_1): Likewise.
* infinite-recursion.cc (contains_unknown_p): Update for
compound_svalue using concrete_binding_map rather than
binding_map.
* program-state.cc (sm_state_map::impl_set_state): Likewise.
* region-model-manager.cc (maybe_undo_optimize_bit_field_compare):
Likewise.
(maybe_undo_optimize_bit_field_compare): Avoid building a
concrete_binding key by using get_any_exact_binding.
(region_model_manager::get_or_create_compound_svalue): New
overload, consuming a concrete_binding_map &&.
* region-model-manager.h
(region_model_manager::get_or_create_compound_svalue): New decl
for the above.
* region-model-reachability.cc (reachable_regions::handle_sval):
Update for compound_svalue using concrete_binding_map rather than
binding_map.
(reachable_regions::handle_parm): Likewise.
* region-model.cc (region_model::scan_for_null_terminator_1): Port
from binding_map to concrete_binding_map.
(exposure_through_uninit_copy::calc_num_uninit_bits): Update for
compound_svalue using concrete_binding_map rather than
binding_map.
(contains_uninit_p): Likewise.
* region.cc (decl_region::calc_svalue_for_constructor): Port from
binding_map to concrete_binding_map.
(decl_region::get_svalue_for_initializer): Update call to
get_or_create_compound_svalue.
* store.cc (concrete_binding_map::dump_to_pp): New.
(concrete_binding_map::dump): New.
(concrete_binding_map::add_to_tree_widget): New.
(concrete_binding_map::validate): New.
(binding_map::cmp): Convert to...
(concrete_binding_map::cmp): ...this.
(concrete_binding_map::get_any_exact_binding): New.
(concrete_binding_map::calc_complexity): New.
(concrete_binding_map::remove_overlapping_binding): New.
(concrete_binding_map::remove_overlapping_bindings): New.
(concrete_binding_map::get_overlapping_bindings): New.
(binding_map::put): Update for change to m_concrete.
(binding_map::validate): Likewise.
(binding_map::apply_ctor_to_region): Convert to...
(concrete_binding_map::apply_ctor_to_region): ...this.
(binding_map::apply_ctor_val_to_range): Convert to...
(concrete_binding_map::apply_ctor_val_to_range): ...this.
(binding_map::apply_ctor_pair_to_child_region): Convert to...
(concrete_binding_map::apply_ctor_pair_to_child_region): ...this.
(binding_map::remove_overlapping_bindings): Move part of
implementation to
concrete_binding_map::remove_overlapping_binding.
(binding_cluster::bind_compound_sval): Simplify using
concrete_binding_map.
(binding_cluster::maybe_get_compound_binding): Likewise.
(store::replay_call_summary_cluster): Update for change
to compound_svalue.
* store.h: Include "analyzer/complexity.h".
(class concrete_binding_map): New, based on
binding_map::concrete_bindings_t.
(binding_map::concrete_bindings_t): Use concrete_binding_map.
(binding_map::empty_p): Update for above.
(binding_map::apply_ctor_to_region): Drop decl.
(binding_map::cmp): Likewise.
(binding_map::apply_ctor_val_to_range): Likewise.
(binding_map::apply_ctor_pair_to_child_region): Likewise.
* svalue.cc (svalue::cmp_ptr): Update for change to
compound_svalue.
(compound_svalue::compound_svalue): Port from binding_map to
concrete_binding_map.
(compound_svalue::accept): Likewise.
(compound_svalue::calc_complexity): Drop.
(compound_svalue::maybe_fold_bits_within): Port from binding_map
to concrete_binding_map.
* svalue.h (class compound_svalue): Update leading comment. Port
from binding_map to concrete_binding_map.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Virginia Kodsy [Wed, 25 Mar 2026 15:39:39 +0000 (17:39 +0200)]
analyzer: add known_function handler for strcasecmp
This patch adds a known_function handler for strcasecmp to the
static analyzer. It ensures the analyzer checks for null-terminated
string arguments and, when a return value is expected (LHS),
it conjures a symbolic value for the result.
analyzer: model mktemp-family success/failure outcomes [PR105890]
The known_function handlers for the mktemp family all use
set_any_lhs_with_defaults, leaving the return value unconstrained.
This means the analyzer cannot distinguish the success path from the
failure path and cannot, for example, detect use of an invalid file
descriptor returned by mkstemp.
This patch makes the analyzer aware of each function's return
convention. A nested enum kf_mktemp_family::outcome describes the
three conventions used by the family:
fd -- returns a non-negative fd on success, -1 on failure
(mkstemp, mkostemp, mkstemps, mkostemps).
null_ptr -- returns a pointer on success, NULL on failure
(mkdtemp).
modif_tmpl -- returns the template pointer; sets template[0] to
'\0' on failure (mktemp).
Each call is bifurcated into success and failure paths, modeling
the return value and errno according to the outcome. This enables
fd leak and double-close detection for the fd-returning variants.
A new helper region_model::update_for_null_return is added for the
null_ptr failure path.
The template placeholder check is now in impl_call_post and influences
bifurcation: when the placeholder is definitely invalid, only the
failure path is explored.
The patch "analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix
and -Wanalyzer-mkstemp-of-string-literal [PR105890]" added those two
warnings for mkstemp only. This patch generalizes them to the whole
mktemp family (including GNU extensions): mktemp, mkstemp, mkostemp,
mkstemps, mkostemps, and mkdtemp.
The two warnings are renamed to reflect their broader scope:
-Wanalyzer-mkstemp-missing-suffix becomes
-Wanalyzer-mktemp-missing-placeholder. For the suffixed variants
(mkstemps, mkostemps), the diagnostic accounts for the suffix length
when locating the "XXXXXX" placeholder.
-Wanalyzer-mkostemp-redundant-flags warns when mkostemp or
mkostemps is called with flags that include O_RDWR, O_CREAT, or
O_EXCL, which are already implied by these functions and produce
errors on some systems.
All three warnings are enabled by default under -fanalyzer.
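The placeholder and flag checks described above can be sketched in ordinary C++. This is only an illustration of the idea, not the code in kf.cc: the helper names has_placeholder and has_redundant_flags are hypothetical, and the sketch works on plain strings and integers rather than on the analyzer's symbolic values.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <string_view>

// Hypothetical helper: true if TMPL has the "XXXXXX" placeholder
// immediately before a trailing suffix of SUFFIXLEN characters.
// SUFFIXLEN is 0 for mktemp, mkstemp, mkostemp, and mkdtemp.
bool has_placeholder(std::string_view tmpl, std::size_t suffixlen = 0)
{
  constexpr std::string_view placeholder = "XXXXXX";
  if (tmpl.size() < placeholder.size() + suffixlen)
    return false;
  const std::size_t start = tmpl.size() - suffixlen - placeholder.size();
  return tmpl.substr(start, placeholder.size()) == placeholder;
}

// Hypothetical helper: true if FLAGS contains any of the flags that the
// commit message says mkostemp and mkostemps already imply.
bool has_redundant_flags(int flags)
{
  return (flags & (O_RDWR | O_CREAT | O_EXCL)) != 0;
}
```

For example, has_placeholder("/tmp/fooXXXXXX.txt", 4) holds because the four-character suffix ".txt" is skipped before looking for the placeholder, whereas the same string with a suffix length of 0 fails the check.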
Bootstrapped and tested on x86_64-pc-linux-gnu.
gcc/analyzer/ChangeLog:
PR analyzer/105890
* analyzer-language.cc (stash_named_constants): Stash O_CREAT,
O_EXCL, and O_RDWR for use by kf.cc.
* analyzer.opt: Rename -Wanalyzer-mkstemp-missing-suffix to
-Wanalyzer-mktemp-missing-placeholder and
-Wanalyzer-mkstemp-of-string-literal to
-Wanalyzer-mktemp-of-string-literal. Add
-Wanalyzer-mkostemp-redundant-flags. Fix alphabetical ordering.
* analyzer.opt.urls: Regenerate.
* kf.cc (class mkstemp_of_string_literal): Rename to...
(class mktemp_of_string_literal): ...this.
(class mkstemp_missing_suffix): Rename to...
(class mktemp_missing_placeholder): ...this. Add trailing_len
parameter for suffixed variants.
(class mkostemp_redundant_flags): New diagnostic class.
(class kf_mktemp_family): New base class with shared template
and flags checking logic.
(kf_mktemp_family::check_template_with_suffixlen_arg): New.
(kf_mktemp_family::check_template): New.
(kf_mktemp_family::check_flags): New.
(kf_mktemp_family::check_placeholder): New.
(class kf_mkstemp): Rename to...
(class kf_mktemp_simple): ...this. Generalize to handle mktemp,
mkstemp, and mkdtemp.
(class kf_mkostemp): New known_function handler.
(class kf_mkostemps): New known_function handler.
(class kf_mkstemps): New known_function handler.
(register_known_functions): Register all mktemp family handlers.
gcc/ChangeLog:
PR analyzer/105890
* doc/invoke.texi: Rename -Wanalyzer-mkstemp-missing-suffix to
-Wanalyzer-mktemp-missing-placeholder and
-Wanalyzer-mkstemp-of-string-literal to
-Wanalyzer-mktemp-of-string-literal. Add
-Wanalyzer-mkostemp-redundant-flags. Fix alphabetical ordering
of detailed descriptions.
gcc/testsuite/ChangeLog:
PR analyzer/105890
* gcc.dg/analyzer/mkstemp-1.c: Update terminology from "suffix"
to "placeholder".
* gcc.dg/analyzer/mkdtemp-1.c: New test.
* gcc.dg/analyzer/mkostemp-1.c: New test.
* gcc.dg/analyzer/mkostemps-1.c: New test.
* gcc.dg/analyzer/mkstemps-1.c: New test.
* gcc.dg/analyzer/mktemp-1.c: New test.
Signed-off-by: Tomas Ortin Fernandez (quanrong) <quanrong@mailbox.org>
analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and -Wanalyzer-mkstemp-of-string-literal [PR105890]
This patch adds two new analyzer warnings for misuse of mkstemp(3):
-Wanalyzer-mkstemp-of-string-literal warns when a string literal is
passed to mkstemp. Since mkstemp modifies its argument in place,
passing a string literal is undefined behavior (SEI CERT C rule
STR30-C). The diagnostic suggests using a writable character array
instead.
-Wanalyzer-mkstemp-missing-suffix warns when the template argument
does not end with the required "XXXXXX" suffix. This addresses PR
analyzer/105890.
Both warnings are enabled by default under -fanalyzer.
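As a rough illustration of the usage these warnings steer towards, the sketch below passes mkstemp a writable array that ends in "XXXXXX" and checks the returned descriptor. The function name create_scratch_file and the path are made up for the example; only mkstemp, unlink, and close are real POSIX calls.

```cpp
#include <cstdio>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical example: returns 0 on success, -1 if mkstemp failed.
int create_scratch_file()
{
  // A writable array, not a string literal: mkstemp overwrites the
  // trailing "XXXXXX" in place.
  char tmpl[] = "/tmp/exampleXXXXXX";

  int fd = mkstemp(tmpl);
  if (fd == -1)
    {
      std::perror("mkstemp");
      return -1;
    }

  // Remove the name and close the descriptor so nothing is leaked.
  unlink(tmpl);
  close(fd);
  return 0;
}
```

Writing mkstemp("/tmp/exampleXXXXXX") instead would hand the function a string literal, and a template such as "/tmp/example" would lack the required suffix; those are the two cases the new warnings are described as catching.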
The checks are in the analyzer rather than -Wformat because mkstemp
does not use a format attribute. Placing the checks in the analyzer
could also allow interprocedural analysis in the future, once the
analyzer can fully track string contents across function calls.
Bootstrapped and tested on x86_64-pc-linux-gnu.
gcc/analyzer/ChangeLog:
PR analyzer/105890
* analyzer.opt: Add -Wanalyzer-mkstemp-missing-suffix and
-Wanalyzer-mkstemp-of-string-literal.
* analyzer.opt.urls: Add URL entries for the new warnings.
* kf.cc (class mkstemp_of_string_literal): New diagnostic class
for mkstemp called on a string literal.
(class mkstemp_missing_suffix): New diagnostic class for mkstemp
called with a template missing the "XXXXXX" suffix.
(class kf_mkstemp): New known_function handler for mkstemp.
(register_known_functions): Register kf_mkstemp.
gcc/ChangeLog:
PR analyzer/105890
* doc/invoke.texi: Add -Wanalyzer-mkstemp-missing-suffix and
-Wanalyzer-mkstemp-of-string-literal.
gcc/testsuite/ChangeLog:
PR analyzer/105890
* gcc.dg/analyzer/mkstemp-1.c: New test.
Signed-off-by: Tomas Ortin Fernandez (quanrong) <quanrong@mailbox.org>
Saksham Gupta [Mon, 9 Mar 2026 06:20:36 +0000 (11:50 +0530)]
analyzer: add known function handling for atoi, atol, and atoll
This patch adds kf_atoi_family to handle atoi, atol, and atoll functions in the
analyzer, ensuring that the argument is checked for a valid,
null-terminated string.
gcc/analyzer/ChangeLog:
* kf.cc (class kf_atoi_family): New class.
(register_known_functions): Register atoi, atol, and atoll.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/atoi-1.c: Update test coverage.
David Malcolm [Tue, 24 Feb 2026 22:54:39 +0000 (17:54 -0500)]
sarif-replay: decode event IDs [PR123056]
Attempt to round-trip event IDs in execution paths
through SARIF.
gcc/ChangeLog:
PR sarif-replay/123056
* libsarifreplay.cc: Include "json-pointer-parsing.h".
(sarif_replayer::sarif_replayer): Initialize m_root_val.
(sarif_replayer::m_root_val): New field.
(sarif_replayer::replay_file): Store m_root_val.
(sarif_replayer::append_embeddded_link): Add message_obj param.
Attempt to decode intra-sarif links, turning them into event IDs.
(sarif_replayer::decode_link_within_sarif): New.
(sarif_replayer::make_plain_text_within_result_message): Pass
message_obj to append_embeddded_link.
gcc/testsuite/ChangeLog:
PR sarif-replay/123056
* sarif-replay.dg/2.1.0-invalid/3.10.3-bad-json-pointer.sarif: New
test.
* sarif-replay.dg/2.1.0-valid/embedded-links-pr123056-check-sarif-roundtrip.py
(test_roundtrip_of_url_in_generated_sarif): Update expected
result, to expect the URL for the event.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 24 Feb 2026 22:44:31 +0000 (17:44 -0500)]
Introduce pretty-print-token-buffer.{cc,h}
Move the implementation of diagnostic_message_buffer from libdiagnostics
to a new pretty-print-token-buffer.{cc,h}, for capturing the tokens from
a pretty-print.
Implement a new class pp_token_buffer_element for replaying the tokens
in a pretty_print_token_buffer into another pretty-print, using "%e".
Add selftests.
gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add pretty-print-token-buffer.o.
* libgdiagnostics.cc: Drop include of "auto-obstack.h".
Include "pretty-print-token-buffer.h".
(class copying_token_printer): Move to
pretty-print-token-buffer.cc.
(struct diagnostic_message_buffer): Reimplement as a subclass of
pretty_print_token_buffer.
(diagnostic_message_buffer::to_string): Rename to
pretty_print_token_buffer::to_string and move to
pretty-print-token-buffer.cc.
* pretty-print-token-buffer.cc: New file, based on material from
libgdiagnostics.cc.
* pretty-print-token-buffer.h: New file, based on material from
libgdiagnostics.h.
* selftest-run-tests.cc (selftest::run_tests): Call
selftest::pretty_print_token_buffer_cc_tests.
* selftest.h (selftest::pretty_print_token_buffer_cc_tests): New
decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Mon, 19 Jan 2026 17:25:55 +0000 (12:25 -0500)]
analyzer: avoid naked "new"
Modernization; no functional change intended.
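A generic sketch of this kind of modernization, using toy classes rather than the analyzer's own (toy_event and toy_path are hypothetical stand-ins): ownership is handed over as std::unique_ptr, and objects are created with std::make_unique instead of a naked "new".

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an event; not a GCC class.
struct toy_event
{
  explicit toy_event(std::string d) : desc(std::move(d)) {}
  std::string desc;
};

// Toy stand-in for a path that owns its events.
class toy_path
{
public:
  void add_event(std::unique_ptr<toy_event> e)
  {
    m_events.push_back(std::move(e));
  }

  // Taking a unique_ptr makes the ownership transfer explicit; the
  // previous event at IDX is destroyed automatically.
  void replace_event(std::size_t idx, std::unique_ptr<toy_event> e)
  {
    m_events.at(idx) = std::move(e);
  }

  const toy_event &get(std::size_t idx) const { return *m_events.at(idx); }
  std::size_t size() const { return m_events.size(); }

private:
  std::vector<std::unique_ptr<toy_event>> m_events;
};

toy_path make_example_path()
{
  toy_path p;
  p.add_event(std::make_unique<toy_event>("first"));
  p.replace_event(0, std::make_unique<toy_event>("replacement"));
  return p;
}
```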
gcc/analyzer/ChangeLog:
* access-diagram.cc
(access_diagram_impl::add_aligned_child_table): Use
std::make_unique rather than "new".
(access_diagram_impl::add_valid_vs_invalid_ruler): Likewise.
* checker-path.h (checker_path::replace_event): Use
std::unique_ptr.
* diagnostic-manager.cc
(diagnostic_manager::consolidate_conditions): Use std::make_unique
rather than "new".
* feasible-graph.cc (feasible_graph::make_epath): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jeff Law [Tue, 28 Apr 2026 22:55:13 +0000 (16:55 -0600)]
[V3][RISC-V][PR target/124760] Promote SI to DI in some cases to encourage shNadd insns
So for this testcase:
int foo (int t)
{
return 3 * t - 1;
}
We currently generate:
slliw a5,a0,1
addw a0,a5,a0
addiw a0,a0,-1
ret
Intuitively we can see we're doing a 32->64 sign extension at each step and we
could drop the intermediate sign extensions. In fact, not only can we drop the
intermediate sign extensions, we can safely "promote" the intermediate
operations from SI to DI with a final sign extending add. Conceptually that
unlocks combining the first shift+add into a shNadd insn resulting in this
code:
sh1add a0, a0, a0
addiw a0, a0, -1
ret
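The equivalence argued above can be modeled in portable C++ as a sanity check. This is a sketch, not part of the patch: the function names are hypothetical, unsigned arithmetic stands in for the wrapping behavior of the instructions, and the final narrowing conversion assumes the two's-complement wraparound that GCC provides (and C++20 requires).

```cpp
#include <cstdint>

// Models the current sequence: every step wraps to 32 bits.
std::int32_t si_steps(std::int32_t t)
{
  std::uint32_t u = static_cast<std::uint32_t>(t);
  std::uint32_t shifted = u << 1;             // slliw
  std::uint32_t sum = shifted + u;            // addw
  return static_cast<std::int32_t>(sum - 1u); // addiw
}

// Models the promoted sequence: 64-bit intermediate, 32-bit final add.
std::int32_t di_promoted(std::int32_t t)
{
  // a0 holds the sign-extended 32-bit argument.
  std::uint64_t x = static_cast<std::uint64_t>(static_cast<std::int64_t>(t));
  std::uint64_t sum = (x << 1) + x;           // sh1add
  // addiw keeps only the low 32 bits of the result and sign-extends them.
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(sum - 1u));
}
```

Both functions agree for every 32-bit input because the low 32 bits of a sum or shift depend only on the low 32 bits of the operands, so the upper bits of the 64-bit intermediate never reach the final result.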
The patch, but not the testcase, has been in my tree for a while, so it's been
through bootstrap & regression testing on the BPI and Pioneer as well as
testing on riscv32-elf and riscv64-elf. Obviously I'll wait for pre-commit CI
to do its thing before pushing.
PR target/124760
gcc/
* config/riscv/bitmanip.md (SI->DI promoting shadd pattern): Promote
intermediate SI ops to DI ops when there's a final extending op.
Andrew Pinski [Tue, 28 Apr 2026 19:46:31 +0000 (12:46 -0700)]
phiprop: Fix typo [PR125067]
When I factored out the code in can_handle_load, I had a small typo
which seemed to work for most cases but I had noticed later on was
broken. Basically, the bb containing the vop definition has to be dominated
by the current bb (and can't be the current bb).
Pushed as obvious after a quick bootstrap.
PR tree-optimization/125067
gcc/ChangeLog:
* tree-ssa-phiprop.cc (can_handle_load): Fix copy and pasto
on dominated_by_p.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
The proposed resolution of CWG 3065 suggests that reflection on a block-scope
extern declaration be ill-formed. This patch makes it so, and it also
happens to fix a crash.
PR c++/124756
gcc/cp/ChangeLog:
* reflect.cc (get_reflection): Give an error when taking the
reflection of a block-scope extern.
Marek Polacek [Mon, 20 Apr 2026 17:00:52 +0000 (13:00 -0400)]
c++/reflection: improve diagnostic for dependent splices
In the parser we've changed the "not usable in a splice" error messages
to the more helpful "expected a reflection of ...", but tsubst_splice_scope
still uses the former. This patch updates the diagnostic there as well.
Let's also teach inform_tree_category about concepts and alias templates
now that a testcase exercises them.
gcc/cp/ChangeLog:
* error.cc (inform_tree_category): Also print concept and alias
template.
* pt.cc (tsubst_splice_scope): Reword the diagnostic messages.
Call inform_tree_category.
[LRA]: Fix elimination recognition for INC/DEC RTL
There is a typo in the processing of {PRE,POST}_{INC,DEC} and
{PRE,POST}_MODIFY that is meant to prevent elimination of a hard reg
operand. The condition actually considers pseudos instead of hard regs.
The patch fixes this.
gcc/ChangeLog:
* lra-eliminations.cc (mark_not_eliminable): Fix condition to
consider hard regs instead of pseudos for INC/DEC/MODIFY operands.
When searching for preferred hard regs from too strict constraints we can ignore
some alternatives for subsequent operands. This can result in worse code
generation. The patch fixes this.
gcc/ChangeLog:
* ira-lives.cc (ira_implicitly_set_insn_hard_regs): Use the same
start prefered for all operand.
Andrew Pinski [Fri, 27 Mar 2026 22:42:16 +0000 (15:42 -0700)]
phiprop: Move the check on vuse before the dominator tests
This again is some small optimization of the order of checks here.
The dom tests don't say if the prop can happen any more so putting
them after tests that will cause the prop not to happen is a good thing.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-ssa-phiprop.cc (propagate_with_phi): Move vuse checks
before the dominator tests.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Fri, 27 Mar 2026 22:25:13 +0000 (15:25 -0700)]
phiprop: Factor out the vdef check into new function
This is just a small cleanup and should make the code easier
to understand. It should also make it easier to allow
skipping over some store statements that don't affect the
variable being loaded.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-ssa-phiprop.cc (propagate_with_phi): Factor out
checking the load for vdef to ....
(can_move_into_conditional): Here.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Thu, 9 Apr 2026 19:40:22 +0000 (12:40 -0700)]
testsuite: Add phiprop testcase that is already fixed [PR116823]
This testcase was extracted from fold-const.cc but was fixed
by r16-4212-gf256a13f8aed83 which removed the clobber.
Since this is fixed separately from the other improvements,
it is in a separate patch.
Philipp Tomsich [Tue, 28 Apr 2026 17:22:35 +0000 (11:22 -0600)]
[4/6] fold-mem-offsets: Move RISC-V size-optimization workaround to the backend
The fold-mem-offsets pass contained a target-specific workaround that
skipped basic blocks optimized for size, to avoid conflicting with
RISC-V's shorten-memrefs pass. This penalized all targets.
Move the workaround to the RISC-V backend by disabling fold-mem-offsets
via SET_OPTION_IF_UNSET in riscv_option_override when optimizing for
size with compressed instructions enabled (the same condition that gates
the shorten-memrefs pass). This preserves the RISC-V behavior while
allowing other targets to fold offsets in size-optimized blocks.
gcc/ChangeLog:
* fold-mem-offsets.cc (pass_fold_mem_offsets::execute): Remove
optimize_bb_for_size_p check.
* config/riscv/riscv.cc (riscv_option_override): Disable
flag_fold_mem_offsets when optimizing for size with compressed
instructions.
This patch supports the RISC-V Zalasr[1] (load-acquire/store-release) extension. Based on Edwin Lu's old patch:
https://patchwork.sourceware.org/project/gcc/patch/20250410214940.2712673-1-ewlu@rivosinc.com/
Implements TARGET_MEMTAG_CAN_TAG_ADDRESSES and TARGET_MEMTAG_TAG_BITSIZE
for the RISC-V back end, allowing -fsanitize=hwaddress if the target
machine supports the pointer masking extension.
------
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_can_tag_addresses): New function.
(RISCV_HWASAN_TAG_SIZE): New definition.
(riscv_memtag_tag_bitsize): New function.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New definition.
(TARGET_MEMTAG_TAG_BITSIZE): Likewise.
[HWASAN] [RISC-V] Update EnableTaggingAbi for RISC-V linux. (#176616)
Cherry-picked from LLVM commit: 32d21326f3b60874fd72bbe509c06dbe5b729a32
Enabling pointer tagging in the userspace ABI for RISC-V kernels differs
from that of AArch64. It requires requesting a particular number of masked
pointer bits; an error is returned if the platform cannot accommodate
the request:
https://docs.kernel.org/arch/riscv/uabi.html#pointer-masking
While experimenting with enabling RISC-V HWASAN on GCC I was hitting the
error
> HWAddressSanitizer failed to enable tagged address syscall ABI
when attempting to run instrumented programs in the spike simulator
running kernel release 6.18. This patch successfully allows the tagged
address syscall ABI to be enabled by the support runtime.
Jeff Law [Tue, 28 Apr 2026 16:07:07 +0000 (10:07 -0600)]
[RISC-V][PR tree-optimization/94892] Improve equality test of sign bit splat against zero
One of the tests in pr94892 showed a case where we failed to convert a
sign bit splat + equality test against zero into a simple lt/ge test which
doesn't require the sign bit splat.
This is only failing on rv64, probably because the case in question has
a DI sign bit splat, then we take a lowpart SI subreg. The lowpart
dance isn't needed for rv32, though I've structured the test to verify
that we get sensible code on rv32 as well as rv64.
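The identity behind this transformation can be sketched in C++. The function names are hypothetical, and the sketch assumes that right-shifting a negative signed value is an arithmetic shift, which GCC provides and C++20 requires.

```cpp
#include <cstdint>

// Sign bit splat: 0 for non-negative x, -1 (all ones) for negative x.
// Comparing the splat against zero is therefore the same as x >= 0.
bool splat_equals_zero(std::int32_t x)
{
  return (x >> 31) == 0;
}

// Rough model of the rv64 shape described above: a 64-bit splat followed
// by taking the low 32 bits before the comparison.
bool splat_lowpart_equals_zero(std::int64_t x)
{
  std::int32_t lowpart = static_cast<std::int32_t>(x >> 63);
  return lowpart == 0;
}

// The simple test that needs no splat at all.
bool is_non_negative(std::int64_t x)
{
  return x >= 0;
}
```

Since the splat is either all zeros or all ones, taking its low part loses no information, which is why the comparison can be rewritten as a plain lt/ge test on the original value.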
Like many other patches I'm submitting now, this has been in my tester
for a while, but the test has not. I'll be waiting on the pre-commit
tester to verify sanity before moving forward. I'm particularly
interested to see how it behaves with no -march flags. It should be
taking the defaults from when the toolchain was built, which should do
what we want.
Jeff
PR tree-optimization/94892
gcc/
* config/riscv/riscv.md (sign_bit_splat_equality_test): New pattern.
Tomasz Kamiński [Tue, 28 Apr 2026 14:01:47 +0000 (16:01 +0200)]
libstdc++: Make pointer_traits::pointer_to constexpr for main template.
This resolves LWG3454, "pointer_traits::pointer_to should be constexpr",
accepted in Kona 2025.
The change is applied since C++20, i.e. the standard in which pointer_to
was made constexpr for the T* specialization.
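A minimal sketch of the kind of custom pointer-like type this affects. The type fancy_ptr and the function read_through_pointer_to are hypothetical and are not taken from the new test. The main template of std::pointer_traits forwards pointer_to to the static member of the pointer-like type; the sketch calls it at run time only, so it does not depend on the library change, but with this resolution the same call is meant to be usable in a constant expression when the type's own pointer_to is constexpr.

```cpp
#include <memory>

// Hypothetical pointer-like type; element_type is deduced by
// std::pointer_traits from the first template argument.
template <typename T>
struct fancy_ptr
{
  T *raw = nullptr;

  // The main template of std::pointer_traits calls this static member.
  static constexpr fancy_ptr pointer_to(T &r) noexcept
  {
    return fancy_ptr{&r};
  }

  constexpr T &operator*() const { return *raw; }
};

// Obtains a fancy_ptr through pointer_traits and reads the value back.
int read_through_pointer_to(int &value)
{
  auto p = std::pointer_traits<fancy_ptr<int>>::pointer_to(value);
  return *p;
}
```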
libstdc++-v3/ChangeLog:
* include/bits/ptr_traits.h (__ptr_traits_ptr_to::pointer_to):
Define as constexpr since C++20.
* testsuite/20_util/pointer_traits/pointer_to_constexpr.cc:
New test for custom pointer-like type.
libgomp.fortran/map-subarray-6.f90: Fix and robustify
Changes:
* Actually initialize the proper variable.
* Handle the three cases explicitly: self mapping/host fallback, mapping
but host accessible and mapping and (potentially) not host accessible.
Hence, remove 'dg-should-fail' - as the code should now always run.
* Add more checks for not pointer attaching, using values outside mapped
range.
* Add several comments and handle the case that 'tgt' is actually removed
during gimplification as unused. (Two cases: once the result with 'tgt'
removed - and once using 'tgt'/'tgt2' in the target region - and checking
then for the result).
libgomp/ChangeLog:
* testsuite/libgomp.fortran/map-subarray-6.f90: Fix, extend, and
robustify.
Richard Biener [Thu, 5 Mar 2026 10:20:44 +0000 (11:20 +0100)]
Avoid live code-generation for stmts kept as scalars
The following avoids trying to code-generate live lane extracts for
scalar defs that we have to keep anyway because they are used in
SLP graph leafs as extern inputs.
This resolves the known cases of one of the workarounds in live
code-generation.
* tree-vect-slp.cc (vect_bb_slp_mark_live_stmts): Do not
attempt to live code-generate defs that are kept in scalar
form anyway.
* tree-vect-loop.cc (vectorizable_live_operation): Update
comment.
Richard Biener [Tue, 3 Mar 2026 14:09:22 +0000 (15:09 +0100)]
Cost each BB vect live lane only once
The following makes sure to cost live scalar stmts appearing in multiple
SLP nodes only once and code-generate them from the SLP node we verified
we can replace all scalar uses from.
* tree-vectorizer.h (_slp_tree::live_lanes): New vector.
(SLP_TREE_LIVE_LANES): New.
* tree-vect-loop.cc (vectorizable_live_operation): Append
to SLP_TREE_LIVE_LANES.
* tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize
SLP_TREE_LIVE_LANES.
(_slp_tree::~_slp_tree): Release SLP_TREE_LIVE_LANES.
(vect_print_slp_tree): Adjust live lane dumping, indicating
the SLP node a lane is code generated from.
(vect_bb_slp_mark_live_stmts): No longer verify we can
code-generate from all SLP nodes but at least one, picking
the first.
* tree-vect-stmts.cc (vect_transform_stmt): Iterate over
SLP_TREE_LIVE_LANES.
(vect_analyze_stmt): Also analyze reductions for live
lanes.
The following uses the vector coverage indicated by SLP_TREE_TYPE to
improve and simplify BB vector scalar costing, finally handling SLP
patterns properly.
PR tree-optimization/124222
* tree-vect-slp.cc (vect_slp_gather_vectorized_scalar_stmts): Remove.
(vect_bb_slp_scalar_cost): Simplify by using SLP_TREE_TYPE and
a use-def walk of the scalar stmts SSA uses.
(vect_bb_vectorization_profitable_p): Simplify.
* gcc.dg/vect/costmodel/x86_64/costmodel-pr124222.c: New testcase.
Richard Biener [Mon, 2 Mar 2026 13:53:04 +0000 (14:53 +0100)]
Simplify vect_bb_slp_mark_live_stmts
The following uses the full scalar stmt coverage now denoted by
SLP_TREE_TYPE to simplify computing STMT_VINFO_LIVE_P for code
generation of live lanes.
Richard Biener [Tue, 3 Mar 2026 12:48:17 +0000 (13:48 +0100)]
Re-do vect_mark_slp_stmts to compute full scalar stmt coverage
The following re-purposes STMT_SLP_TYPE for BB vectorization to indicate
the scalar (non-pattern) stmt coverage of the vectorized SLP graph.
This will allow for simpler and more precise determining of live lanes
and scalar costing.
* tree-vect-slp.cc (vect_slp_analyze_bb_1): Split out pure_slp
marking into ...
(vect_bb_slp_mark_stmts_vectorized): ... new function. Compute
full scalar stmt coverage of the SLP graph.
(vect_slp_gather_extern_scalar_stmts): New helper.
(vect_bb_slp_mark_live_stmts): Adjust.
* tree-vect-loop.cc (vectorizable_live_operation): Likewise.
Richard Biener [Mon, 2 Mar 2026 14:10:14 +0000 (15:10 +0100)]
Move BB analysis code to make flow more obvious
The following moves BB vect live stmt marking out of
vect_slp_analyze_operations to vect_slp_analyze_bb_1 and SLP stmt marking,
marking some vectorized stmts as PURE_SLP, right before it which
is the only remaining consumer.
* tree-vect-slp.cc (vect_slp_analyze_operations): Move
vect_bb_slp_mark_live_stmts call ...
(vect_slp_analyze_bb_1): ... here. Move SLP stmt marking
right before it.
(vect_mark_slp_stmts): Remove unused overload.