Arthur Cohen [Tue, 12 Aug 2025 15:04:03 +0000 (17:04 +0200)]
gccrs: ast: Cleanup SingleASTNode::NodeType
gcc/rust/ChangeLog:
* ast/rust-ast.h: Change NodeType to enum class Kind.
* ast/rust-ast-fragment.cc: Use new names.
* ast/rust-ast-fragment.h: Likewise.
* ast/rust-ast.cc (SingleASTNode::SingleASTNode): Likewise.
Owen Avery [Tue, 12 Aug 2025 02:06:02 +0000 (22:06 -0400)]
gccrs: CfgStrip AST nodes marked with #[test]
gcc/rust/ChangeLog:
* expand/rust-cfg-strip.cc: Include "rust-macro-expand.h".
(fails_cfg): Rename to...
(CfgStrip::fails_cfg): ...here and handle test attributes.
(fails_cfg_with_expand): Rename to...
(CfgStrip::fails_cfg_with_expand): ...here and handle test
attributes.
* expand/rust-cfg-strip.h (struct ExpansionCfg): Forward
declare.
(CfgStrip::fails_cfg): New member function.
(CfgStrip::fails_cfg_with_expand): Likewise.
(CfgStrip::CfgStrip): Accept reference to ExpansionCfg.
(CfgStrip::expansion_cfg): New member variable.
* rust-session-manager.cc (Session::expansion): Pass
ExpansionCfg instance to CfgStrip constructor.
gccrs: Emit an error message on unsupported llvm_asm
llvm_asm was never meant to be completely supported since it has been
replaced with the asm macro but we still need it to compile some parts
of libcore, previously the compiler was aborting when an unsupported
llvm_asm construct was found.
gcc/rust/ChangeLog:
* ast/rust-expr.h: Add const getters to llvm members.
* hir/rust-ast-lower-expr.cc (check_llvm_asm_support): Check llvm_asm
usage validity.
(ASTLoweringExpr::visit): Emit an error message instead of aborting.
Owen Avery [Sat, 9 Aug 2025 22:33:58 +0000 (18:33 -0400)]
gccrs: Add checks to ExpandVisitor
This should help detect issues like
https://github.com/Rust-GCC/gccrs/issues/3444.
gcc/rust/ChangeLog:
* ast/rust-ast.h (Stmt::get_node_id): Make virtual.
(Type::get_node_id): Likewise.
(AssociatedItem::get_node_id): New virtual member function.
* ast/rust-expr.h (TypeCastExpr::get_casted_expr_ptr): New
member function.
(TypeCastExpr::get_type_to_cast_to_ptr): Likewise.
(ClosureExprInner::get_definition_expr_ptr): Likewise.
* ast/rust-item.h (TypeAlias::get_node_id): New member function
to override AssociatedItem::get_node_id.
(ConstantItem::get_node_id): Likewise.
* expand/rust-expand-visitor.cc
(ExpandVisitor::maybe_expand_expr): Adjust
macro_invoc_expect_id.
(ExpandVisitor::maybe_expand_type): Likewise and add an overload
for std::unique_ptr<TypeNoBounds>.
(ExpandVisitor::visit): Check macro_invoc_expect_id and
generally improve visitors so that the testsuite will still
pass.
* expand/rust-expand-visitor.h (ExpandVisitor::ExpandVisitor):
Initialize member variable macro_invoc_expect_id.
(ExpandVisitor::maybe_expand_type): Add an overload for
std::unique_ptr<TypeNoBounds>.
(ExpandVisitor::expand_macro_children): Adjust
macro_invoc_expect_id.
(ExpandVisitor::visit): Add an overload for TypeCastExpr.
(ExpandVisitor::macro_invoc_expect_id): New member variable.
gcc/testsuite/ChangeLog:
* rust/compile/macros/mbe/macro49.rs: Add missing lang items.
lishin [Fri, 8 Aug 2025 21:35:20 +0000 (22:35 +0100)]
gccrs: Fix ICE on exclusive_range_pattern lowering
gcc/rust/ChangeLog:
* backend/rust-compile-pattern.cc (CompilePatternCheckExpr::visit):
Check upper compare operator.
* hir/rust-ast-lower-pattern.cc (ASTLoweringPattern::visit):
Handle lowering of exclusive range pattern.
* hir/tree/rust-hir-pattern.h (class RangePattern):
Add support for exclusive ranges in HIR representation.
* expand/rust-macro-expand.cc (transcribe_expression): Parse any
outer attributes before parsing an expression.
* parse/rust-parse.h (Parser::parse_outer_attributes): Make
public.
Several places built an object before copying it into a vector. This commit
favors in-place construction and uses the readily available vector size to
reserve memory beforehand.
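As a rough illustration of the pattern described above (not the actual gccrs code), the sketch below reserves the final size once and constructs each element in place. The Segment type and make_segments function are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type, standing in for whatever object the original
// code built before copying it into a vector.
struct Segment {
  std::string name;
  int id;
  Segment(std::string n, int i) : name(std::move(n)), id(i) {}
};

// Build one Segment per input name.
std::vector<Segment> make_segments(const std::vector<std::string> &names) {
  std::vector<Segment> out;
  // The final size is already known, so reserve once up front.
  out.reserve(names.size());
  for (std::size_t i = 0; i < names.size(); ++i) {
    // emplace_back forwards the arguments to the Segment constructor,
    // so no temporary Segment is built and then copied.
    out.emplace_back(names[i], static_cast<int>(i));
  }
  assert(out.size() == names.size());
  return out;
}
```

Calling make_segments on a vector of three names yields three Segment objects with ids 0, 1 and 2.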
Owen Avery [Tue, 5 Aug 2025 20:44:02 +0000 (16:44 -0400)]
gccrs: Improve handling of AttrInputLiteral
Also adjusts a few error messages to be more in line with rustc (and
more amenable to extract_string_literal usage).
gcc/rust/ChangeLog:
* util/rust-attributes.cc (Attributes::extract_string_literal):
New function definition.
* util/rust-attributes.h (Attributes::extract_string_literal):
New function declaration.
* ast/rust-collect-lang-items.cc (get_lang_item_attr): Use
extract_string_literal.
* backend/rust-compile-base.cc: Include "rust-attributes.h".
(HIRCompileBase::handle_link_section_attribute_on_fndecl):
Use extract_string_literal.
(HIRCompileBase::handle_must_use_attribute_on_fndecl): Likewise.
* hir/rust-ast-lower-base.cc
(ASTLoweringBase::handle_lang_item_attribute): Likewise.
* rust-session-manager.cc (Session::handle_crate_name):
Likewise.
gccrs: Refactor HIR::PatternItem class and its derivatives
Rename HIR::TupleItems to PatternItems for better generalization, because it
will be reused for SlicePattern in the future. Enum values MULTIPLE/RANGED are
renamed to NO_REST/HAS_REST to avoid confusion, since they are completely
separate from the ranged values `..` representation. This results in renaming
all related classes and updating all code that references them.
gcc/rust/ChangeLog:
* hir/tree/rust-hir-pattern.h:
- Rename TupleItems to PatternItems.
- Rename TuplePatternItemsMultiple/Ranged & TupleStructItemsRange/NoRange to
TuplePatternItemsNoRest/HasRest and TupleStructItemsNoRest/HasRest.
- Update enum values to NO_REST/HAS_REST.
- Rename clone_tuple_items_impl to clone_pattern_items_impl.
* hir/tree/rust-hir-full-decls.h: Renamed the classes accordingly.
* hir/tree/rust-hir-visitor.h: Renamed the classes accordingly.
* hir/tree/rust-hir-visitor.cc: Renamed the classes accordingly.
* hir/rust-hir-dump.h: Renamed the classes accordingly.
* hir/rust-hir-dump.cc: Renamed the classes accordingly.
* hir/tree/rust-hir.cc: Renamed the classes accordingly.
* hir/rust-ast-lower-base.cc: Renamed the classes accordingly.
* hir/rust-ast-lower-pattern.cc: Renamed the classes accordingly.
* backend/rust-compile-pattern.cc: Renamed the classes accordingly.
* backend/rust-compile-var-decl.h: Renamed the classes accordingly.
* checks/errors/borrowck/rust-bir-builder-pattern.cc: Renamed the classes accordingly.
* checks/errors/borrowck/rust-bir-builder-struct.h: Renamed the classes accordingly.
* checks/errors/borrowck/rust-function-collector.h: Renamed the classes accordingly.
* checks/errors/rust-const-checker.cc: Renamed the classes accordingly.
* checks/errors/rust-const-checker.h: Renamed the classes accordingly.
* checks/errors/rust-hir-pattern-analysis.cc: Renamed the classes accordingly.
* checks/errors/rust-hir-pattern-analysis.h: Renamed the classes accordingly.
* checks/errors/rust-unsafe-checker.cc: Renamed the classes accordingly.
* checks/errors/rust-unsafe-checker.h: Renamed the classes accordingly.
* checks/errors/rust-readonly-check2.cc: Renamed the classes accordingly.
* typecheck/rust-hir-type-check-pattern.cc: Update references to renamed classes and enum
values.
David Faust [Tue, 28 Oct 2025 18:13:25 +0000 (11:13 -0700)]
dwarf: handle repeated decl with different btf_decl_tags [PR122248]
The check in gen_btf_tag_dies which asserted that if the target DIE
already had an annotation then it must be the same as the one we are
attempting to add was too strict. It is valid for multiple declarations
of the same object to appear with different decl_tags, in which case the
tags from each are accumulated in DECL_ATTRIBUTES. The existing
annotation may not be the same as the one being added, since new tags
will be added to the head of the chain.
The proper behavior is to always replace any existing AT_GNU_annotation
to refer to the chain of annotations we have just constructed, whether
the head of that chain is the same or not.
PR debug/122248
gcc/
* dwarf2out.cc (gen_btf_tag_dies): Always replace an existing
AT_GNU_annotation on the target die.
David Faust [Wed, 29 Oct 2025 22:21:16 +0000 (15:21 -0700)]
btf: do not prune at typedefs
The existing BTF pruning logic meant that an anonymous struct or
union type hidden behind a typedef, such as in the common construct:
typedef struct { ... } my_struct_type;
could be pruned if 'my_struct_type' was only ever referenced via pointer
members in other structs/unions types used in the program.
The result of pruning is to skip emitting full type information for
a struct or union type by replacing it with a BTF_KIND_FWD, indicating
that it exists but its definition is omitted. Any types used only by
pruned types are fully omitted from the generated BTF.
In cases like this where the struct/union type is anonymous, the result
is an anonymous BTF_KIND_FWD, which is useless. The presence of such a
type record rightly causes complaints from BTF loaders. Worse, since
the TYPEDEF for 'my_struct_type' itself may _not_ be pruned, its type
information will be incomplete.
Change the BTF pruner so that we never consider pruning at a typedef,
and always either keep or discard both the type and the typedef.
gcc/
* btfout.cc (btf_add_used_type_1): Do not consider creating
fixups at typedefs.
Michal Jires [Mon, 25 Aug 2025 16:23:24 +0000 (18:23 +0200)]
lto: Partition toplevel assembly in 1to1
1to1 partitioning now also partitions toplevel assembly.
Other partitionings keep the old behavior of putting all
toplevel assembly into single partition.
Michal Jires [Mon, 25 Aug 2025 16:07:29 +0000 (18:07 +0200)]
lto: Use toplevel_node in lto_symtab_encoder
This patch replaces symtab_node with toplevel_node in lto_symtab_encoder
and modifies all places where lto_symtab_encoder is used to handle
(ignore) asm_node.
Michal Jires [Mon, 25 Aug 2025 15:37:19 +0000 (17:37 +0200)]
cgraph: Add toplevel_node
asm_node and symbol_node will now inherit from toplevel_node.
This is now useful for lto partitioning; in the future it should also be
useful for toplevel extended assembly.
Michal Jires [Thu, 15 May 2025 14:37:12 +0000 (16:37 +0200)]
lto: Fix reversed sorting of node order.
Sorting by node order in lto partitioning is incorrectly reversed.
For default balanced partitioning this caused all noreorder symbols
to be partitioned into a single partition where they were sorted again,
but correctly.
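To make the nature of the bug concrete, here is a minimal sketch of an ascending comparator next to its reversed form. The Node struct and function names are hypothetical and only model the idea; they are not the LTO partitioning code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a symbol table node carrying a definition order.
struct Node {
  int order;
};

// Intended comparator: nodes with a smaller order come first.
inline bool by_order(const Node &a, const Node &b) { return a.order < b.order; }

// Reversed comparator, modelling the kind of mistake described above.
inline bool by_order_reversed(const Node &a, const Node &b) {
  return a.order > b.order;
}

// Sort a copy of the nodes with the given comparator and return the orders.
std::vector<int> sorted_orders(std::vector<Node> nodes,
                               bool (*cmp)(const Node &, const Node &)) {
  std::sort(nodes.begin(), nodes.end(), cmp);
  std::vector<int> result;
  result.reserve(nodes.size());
  for (const Node &n : nodes)
    result.push_back(n.order);
  assert(result.size() == nodes.size());
  return result;
}
```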
Qing Zhao [Fri, 24 Oct 2025 18:02:06 +0000 (18:02 +0000)]
Extend the attribute "counted_by" to support VOID pointer under GNU extension.
This extension is requested by the Linux kernel to ease the adoption of the
counted_by attribute into the Linux kernel.
Please refer to
https://lore.kernel.org/lkml/20251021095447.GL3245006@noisy.programming.kicks-ass.net/
for the initial request for this feature.
The attribute is allowed for a pointer to void. However,
warnings will be issued for such cases when -Wpointer-arith is
specified. When this attribute is applied to a pointer to void, the
size of each element of this pointer array is treated as 1.
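The size rule above can be modelled without the attribute itself. The sketch below is only a model of the arithmetic: element_size and counted_bytes are hypothetical helpers, not part of GCC, and the only thing taken from the text is that the element size is 1 when the pointee type is void.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Element size used for the byte count: sizeof(T) in general...
template <typename T>
constexpr std::size_t element_size() {
  return sizeof(T);
}

// ...and 1 for void, mirroring the rule described for counted_by on void *.
template <>
constexpr std::size_t element_size<void>() {
  return 1;
}

// Number of bytes covered by `count` elements of type T.
template <typename T>
std::size_t counted_bytes(std::size_t count) {
  // Guard against overflow in the multiplication below.
  assert(count <= std::numeric_limits<std::size_t>::max() / element_size<T>());
  return count * element_size<T>();
}
```

Under this model a void pointer counted by 16 covers 16 bytes, while an int pointer counted by 4 covers 4 * sizeof(int) bytes.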
gcc/c-family/ChangeLog:
* c-attribs.cc (handle_counted_by_attribute): Allow counted_by for
void pointer. Issue warnings when -Wpointer-arith is present.
gcc/c/ChangeLog:
* c-typeck.cc (build_access_with_size_for_counted_by): When the element
type is void, assign size one as the element_size.
gcc/ChangeLog:
* doc/extend.texi: Clarification when the counted_by attribute is applied
on a void pointer.
gcc/testsuite/ChangeLog:
* gcc.dg/pointer-counted-by.c: Update for void pointers.
* gcc.dg/pointer-counted-by-10.c: New test.
* gcc.dg/pointer-counted-by-4-void.c: New test.
Andrew Pinski [Wed, 29 Oct 2025 18:58:31 +0000 (11:58 -0700)]
MATCH: Optimize `VEC_SHL_INSERT (dup (A), A)` to just `dup (A)` [PR116075]
It was noticed that if we have `.VEC_SHL_INSERT ({ 0, ... }, 0)` it was not being
simplified to just `{ 0, ... }`. This was generated from the autovectorizer
(maybe even by accident, see PR tree-optimization/116081).
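The identity behind this simplification can be checked on a scalar model. The sketch below assumes the usual meaning of the operation (each element moves one lane away from lane 0 and the scalar fills lane 0). The function names only mirror the notation used here and are not GCC internals.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of a shift-left-insert: every element moves one lane up
// (away from lane 0), the last element is dropped, and x fills lane 0.
template <typename T, std::size_t N>
std::array<T, N> vec_shl_insert(const std::array<T, N> &v, T x) {
  static_assert(N > 0, "need at least one lane");
  std::array<T, N> r{};
  r[0] = x;
  for (std::size_t i = 1; i < N; ++i)
    r[i] = v[i - 1];
  return r;
}

// Scalar model of a vector duplicate: every lane holds the same value.
template <typename T, std::size_t N>
std::array<T, N> dup(T a) {
  std::array<T, N> r{};
  r.fill(a);
  return r;
}

// Inserting A into a vector that already holds A in every lane changes nothing.
inline void check_splat_identity(int a) {
  const std::array<int, 8> s = dup<int, 8>(a);
  assert(vec_shl_insert(s, a) == s);
}
```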
This adds a few SVE testcases to see if this is optimized since the
auto-vectorizer or intrinsics are the only two ways of getting this
produced.
Changes since:
* v1: Move the constant case over to fold-const-call.cc.
Simplify match pattern to handle vec_duplicate.
Built and tested for aarch64-linux-gnu with no regressions.
PR target/116075
gcc/ChangeLog:
* fold-const-call.cc (fold_const_vec_shl_insert): New function.
(fold_const_call): Call fold_const_vec_shl_insert for CFN_VEC_SHL_INSERT.
* match.pd (`VEC_SHL_INSERT (dup (A), A)`): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/dup-insr-1.c: New test.
* gcc.target/aarch64/sve/dup-insr-2.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
The two clobbers can be considered the same.
So starting at `bb 4`'s. Before, we walked back to the call of g statement
and would notice the use in the phi node of `bb5`, and that would cause
the walk to stop. But in this case, since the phi node has a single use of the
clobber and the clobber matches the original clobber, it can be considered the
same "one". So with the patch, we now walk back one more statement and allow it.
Similarly at the call to p statement.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122247
gcc/ChangeLog:
* tree-ssa-forwprop.cc (do_simple_agr_dse): Allow phi node for the usage
if the usage of the phi result is just the "same" as the original clobber.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/copy-prop-aggregate-sra-2.C: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
So when the simple DSE looks at the clobber from `bb3`, we find the use of
MEM_6 is in a non-dominating BB of BB3, so it gets rejected. But since this usage
is also a clobber which is the same as the original clobber, it can be safely assumed
to do the same thing as the first clobber. So it can be safely ignored.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122247
gcc/ChangeLog:
* tree-ssa-forwprop.cc (do_simple_agr_dse): Allow
use to be a clobber of the same kind to the same lhs.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/copy-prop-aggregate-sra-1.C: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Robin Dapp [Fri, 17 Oct 2025 09:07:17 +0000 (11:07 +0200)]
niter: Use ranger to query ctz range.
When niter runs after the copy-header pass it sometimes fails to
simplify assumptions in a ctz loop.
As the assumption is a simple nonzero test here we can have
ranger get us the range of the shifted expression, then verify that
this range is nonzero.
This helps recognize a ctz loop in 502.gcc's compute_transp.
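For readers unfamiliar with the term, a ctz loop has roughly the shape sketched below. This is a generic illustration with a hypothetical function name, not the code from 502.gcc. The precondition that the input is nonzero corresponds to the kind of nonzero assumption mentioned above.

```cpp
#include <cassert>

// Count trailing zero bits by shifting until the lowest bit is set.
// The loop only terminates when x is nonzero, which is the kind of
// assumption a niter analysis has to establish before it can treat
// the loop as a ctz computation.
unsigned count_trailing_zeros(unsigned x) {
  assert(x != 0 && "the loop would not terminate for zero");
  unsigned n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}
```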
PR tree-optimization/122207
gcc/ChangeLog:
* tree-ssa-loop-niter.cc (shifted_range_nonzero_p): New
function.
(number_of_iterations_cltz): Call new function.
* tree-ssa-loop.cc (pass_scev_cprop::execute): Enable ranger.
Eric Botcazou [Thu, 30 Oct 2025 14:41:09 +0000 (15:41 +0100)]
[Ada] Fix formal parameter incorrectly visible from outside of instance
The problem had been partially fixed two decades ago and the original
testcase correctly rejected, but almost 4 years later the submitter
made a small tweak to it which exposed the issue again...
The original fix was a change to Find_Expanded_Name, this additional fix
is to make exactly the same change to the processing of Collect_Interps
for expanded names.
gcc/ada/
PR ada/15610
* sem_type.adb (Collect_Interps): Apply the same visibility
criterion to expanded names as Find_Expanded_Name.
gcc/testsuite/
* gnat.dg/specs/generic_inst7.ads: New test.
* gnat.dg/specs/generic_inst8.ads: New test.
Robin Dapp [Thu, 30 Oct 2025 13:48:07 +0000 (07:48 -0600)]
[PATCH v2] RISC-V: avlprop: Scale AVL by subreg ratio [PR122445].
Hi,
Since r16-4391-g85ab3a22ed11c9 we can use a punned type/mode for grouped
loads and stores. Vineet reported an x264 wrong-code bug since that
commit. The crux of the issue is that in avlprop we back-propagate
the AVL from consumers (like stores) to producers.
When e.g. a V4QI vector is type-punned by a V1SI vector
(subreg:V1SI (reg:V4QI ...)
the AVL of that instruction refers to the outer subreg mode, i.e. for an
AVL of 1 in a store we store one SImode element. The producer of the
store data is not type punned and still uses V4QI and we produce 4
QImode elements. Due to this mismatch we back-propagate the consumer
AVL of 1 to the producers, causing wrong code.
This patch checks whether the use is inside a subreg and, if so, scales the
immediate AVL by the ratio of inner and outer mode.
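The scaling can be illustrated with plain arithmetic. In the sketch below, scale_avl is a hypothetical helper (not the function in riscv-avlprop.cc) that converts an AVL counted in outer-mode elements into the equivalent number of inner-mode elements, using element sizes in bytes.

```cpp
#include <cassert>

// Convert an AVL expressed in outer-mode elements (e.g. SImode, 4 bytes)
// into the equivalent count of inner-mode elements (e.g. QImode, 1 byte).
unsigned scale_avl(unsigned avl, unsigned outer_elem_bytes,
                   unsigned inner_elem_bytes) {
  assert(inner_elem_bytes != 0);
  // The total byte count must be a whole number of inner elements.
  assert((avl * outer_elem_bytes) % inner_elem_bytes == 0);
  return avl * outer_elem_bytes / inner_elem_bytes;
}
```

For the V1SI-over-V4QI case above, scale_avl (1, 4, 1) gives 4, matching the four QImode elements the producer must write.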
Changes from v1:
- Move NULL check into loop.
- Add REG_P check.
Regtested on rv64gcv_zvl512b.
Regards
Robin
PR target/122445
gcc/ChangeLog:
* config/riscv/riscv-avlprop.cc (pass_avlprop::get_vlmax_ta_preferred_avl):
Scale AVL of subreg uses.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr122445.c: New test.
Artemiy Volkov [Thu, 30 Oct 2025 13:42:03 +0000 (07:42 -0600)]
[PATCH][PR tree-optimization/122478] match.pd: fix simplify pattern for view_convert (BIT_FIELD_REF)
The pattern introduced in r16-4682-g5eafa8d16be873 couldn't handle
conversion from <unnamed-unsigned:1> to unsigned char, which ended up
causing a gimple checking failure reported in PR122478. This patch fixes
the pattern by prohibiting widening integral conversions in addition to
the narrowing ones, or equivalently, requiring that the converted-to and
converted-from types of the VCE both have precision equal to their size.
Since type_has_mode_precision_p () does not apply to vector types, filter
them out by adding a !INTEGRAL_TYPE_P () check on TREE_TYPE (@0).
Bootstrapped and regtested on aarch64 and x86_64, regtested on i386 and
riscv64, one GIMPLE test added.
PR tree-optimization/122478
gcc/ChangeLog:
* match.pd: Fix the view_convert (BIT_FIELD_REF) pattern.
Richard Biener [Thu, 30 Oct 2025 13:24:46 +0000 (14:24 +0100)]
Adjust gcc.dg/tree-ssa/pr92834.c
Scanning the optimized dump is fragile due to vectorization. The
following instead scans after early phiopt1, adjusting for not
yet eliminated static functions.
* gcc.dg/tree-ssa/pr92834.c: Scan phiopt1 instead of optimized.
Richard Biener [Thu, 30 Oct 2025 12:30:21 +0000 (13:30 +0100)]
[i386] Fix typo in ix86_move_max setup
There's a typo in the way we compute opts->x_ix86_move_max:
if (opts_set->x_ix86_move_max == PVW_NONE)
{
/* Set the maximum number of bits can be moved from memory to
memory efficiently. */
if (opts_set->x_prefer_vector_width_type != PVW_NONE)
opts->x_ix86_move_max = opts->x_prefer_vector_width_type;
else if (ix86_tune_features[X86_TUNE_AVX512_MOVE_BY_PIECES])
opts->x_ix86_move_max = PVW_AVX512;
else if (ix86_tune_features[X86_TUNE_AVX256_MOVE_BY_PIECES])
opts->x_ix86_move_max = PVW_AVX256;
else
{
opts->x_ix86_move_max = opts->x_prefer_vector_width_type;
/* */ if (opts_set->x_ix86_move_max == PVW_NONE)
{
if (TARGET_AVX512F_P (opts->x_ix86_isa_flags))
opts->x_ix86_move_max = PVW_AVX512;
/* Align with vectorizer to avoid potential STLF issue. */
else if (TARGET_AVX_P (opts->x_ix86_isa_flags))
opts->x_ix86_move_max = PVW_AVX256;
else
opts->x_ix86_move_max = PVW_AVX128;
}
}
}
As written, the /* */ condition is redundant with the outermost one.
The intent (IMO) is that the opts->x_prefer_vector_width_type set earlier
via X86_TUNE_{AVX128,AVX256}_OPTIMAL takes precedence over the ISA
based setup that follows. So instead of checking opts_set we want
to check whether the previous assignment still left us with PVW_NONE.
The issue makes us ignore X86_TUNE_AVX128_OPTIMAL/X86_TUNE_AVX256_OPTIMAL
when determining opts->x_ix86_move_max.
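A simplified model of the intended logic for the inner else branch might look as follows. This is a sketch only: the enum, the pick_move_max function and its boolean parameters are stand-ins invented here for the option fields and the TARGET_AVX512F_P / TARGET_AVX_P checks, and the actual change in i386-options.cc may differ in detail.

```cpp
#include <cassert>

// Simplified stand-in for the vector width enumeration used above.
enum PreferVectorWidth { PVW_NONE, PVW_AVX128, PVW_AVX256, PVW_AVX512 };

// Model of the inner else branch: start from the preferred vector width
// (possibly set earlier from tuning) and only fall back to the ISA based
// choice if that assignment left the value at PVW_NONE.
PreferVectorWidth pick_move_max(PreferVectorWidth preferred_width,
                                bool has_avx512f, bool has_avx) {
  PreferVectorWidth move_max = preferred_width;
  // Test the value just assigned, not whether the option was set by the user.
  if (move_max == PVW_NONE) {
    if (has_avx512f)
      move_max = PVW_AVX512;
    else if (has_avx)
      move_max = PVW_AVX256;
    else
      move_max = PVW_AVX128;
  }
  assert(move_max != PVW_NONE);
  return move_max;
}
```

With this shape, a tuning preference of PVW_AVX128 is kept even when AVX512F is available, which is the precedence described above.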
* config/i386/i386-options.cc (ix86_option_override_internal):
Fix check during opts->x_ix86_move_max initialization.
lra: Fix computing reg class for hard register constraints [PR121198]
Currently the register class derived from a hard register constraint is
solely determined from a single register. This even works for register
pairs if all the required registers are contained in this very register
class and falls apart if not. For example:
long
test (void)
{
long x;
__asm__ ("..." : "={r22}" (x));
return x;
}
For AVR -mmcu=atmega8, variable `x` requires a register quadruple and
the minimal class for single register r22 is SIMPLE_LD_REGS which itself
entails registers r16 up to r23. However, variable `x` is bound to
registers r22 up to r25. Thus, the minimal class containing those is
LD_REGS. Therefore, compute the least upper bound of all register
classes over all required registers.
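The AVR example can be modelled with register classes as bitmasks. The sketch below is an illustration under stated assumptions, not the LRA implementation: it picks the smallest listed class that contains all required registers, which stands in for the least upper bound computation, and the extent of LD_REGS (r16 up to r31) and of GENERAL_REGS is assumed here rather than taken from the text. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model: a register class is a bitmask over hard registers 0..31.
struct RegClass {
  const char *name;
  std::uint32_t regs;
};

// Mask with bits first..last (inclusive) set.
constexpr std::uint32_t reg_range(unsigned first, unsigned last) {
  return (last >= 31 ? ~std::uint32_t{0}
                     : ((std::uint32_t{1} << (last + 1)) - 1)) &
         ~((std::uint32_t{1} << first) - 1);
}

inline unsigned popcount32(std::uint32_t v) {
  unsigned n = 0;
  for (; v != 0; v &= v - 1)
    ++n;
  return n;
}

// Smallest class containing every register in [first_reg, first_reg + nregs),
// or nullptr if no listed class contains them all.
const RegClass *smallest_containing_class(const std::vector<RegClass> &classes,
                                          unsigned first_reg, unsigned nregs) {
  assert(nregs > 0 && first_reg + nregs <= 32);
  const std::uint32_t needed = reg_range(first_reg, first_reg + nregs - 1);
  const RegClass *best = nullptr;
  for (const RegClass &c : classes) {
    if ((c.regs & needed) != needed)
      continue;  // class misses at least one required register
    if (!best || popcount32(c.regs) < popcount32(best->regs))
      best = &c;
  }
  return best;
}

// Example class list; only SIMPLE_LD_REGS = r16..r23 is stated in the text.
inline std::vector<RegClass> example_classes() {
  return {{"SIMPLE_LD_REGS", reg_range(16, 23)},
          {"LD_REGS", reg_range(16, 31)},        // assumed extent
          {"GENERAL_REGS", reg_range(0, 31)}};   // assumed extent
}
```

Looking only at r22 selects SIMPLE_LD_REGS, while the quadruple r22 up to r25 selects LD_REGS, matching the example above.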
PR rtl-optimization/121198
gcc/ChangeLog:
* lra-constraints.cc (process_alt_operands): Compute least upper
bound of all register classes over all required registers in
order to determine register class for a hard register constraint.
Gaius Mulley [Thu, 30 Oct 2025 11:19:08 +0000 (11:19 +0000)]
PR modula2/122485: add spell checking to module names
This patch introduces spell checking during module imports.
If the correct module name has been seen prior to the incorrect import
then it will attempt to provide a hint during the error message.
gcc/m2/ChangeLog:
PR modula2/122485
* gm2-compiler/M2Comp.mod (Pass0CheckDef): Add spell check
format specifier filtering on module names.
* gm2-compiler/M2MetaError.mod (errorBlock): New field
filterDef.
(initErrorBlock): Initialize filterDef.
(continuation): Add 'D' filter on definition module specifier.
(SpellHint): Rewrite to check for filterDef and defimp symbols.
(FilterOnDefinitionModule): New procedure.
* gm2-compiler/M2Quads.mod (BuildSizeFunction): Rewrite to
ensure variables are initialized.
* gm2-compiler/M2StackSpell.def (GetDefModuleSpellHint): New
procedure function.
* gm2-compiler/M2StackSpell.mod (GetDefModuleSpellHint): New
procedure function.
(CandidatePushName): New procedure.
(BuildHintStr): New procedure.
(CheckForHintStr): Rewrite.
gcc/testsuite/ChangeLog:
PR modula2/122485
* gm2.dg/spell/iso/fail/badimport.mod: New test.
Richard Biener [Thu, 30 Oct 2025 09:38:13 +0000 (10:38 +0100)]
Swap operands during SLP discovery for mismatching STMT_VINFO_REDUC_IDX
When we are unlucky operand canonicalization can end up presenting
us with a different order, making a possible SLP reduction group
not match up. The following allows swapping operands in this case.
* tree-vect-slp.cc (vect_get_operand_map): Handle commutative
operands when swapping is requested.
(vect_build_slp_tree_1): Allow STMT_VINFO_REDUC_IDX differences
when operand swapping makes them match and request swapping.
(vect_build_slp_instance): Indicate we have successfully
discovered a SLP reduction group.
* gcc.dg/vect/slp-reduc-13.c: New testcase.
Co-authored-by: Eric Botcazou <ebotcazou@adacore.com>
* config/i386/i386.md (ovf_add_cmp): New code attribute.
(udf_sub_cmp): Ditto.
(ovf_comm): New int iterator.
(*plus_within_<code><mode>3_<ovf_comm>): New insn and split pattern.
(*minus_within_<code><mode>3): Ditto.
gcc/testsuite/ChangeLog:
* gcc.dg/pr116815.c: New test.
* gcc.target/i386/pr116815.c: New test.
Andrew Pinski [Wed, 29 Oct 2025 23:30:50 +0000 (16:30 -0700)]
gimple-fold: Remove assume_aligned folding
So in the end I agree with Richi's comment at
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698856.html:
> I see. I wonder whether it would be better to leave __builtin_assume_aligned
> around then, because that inherently introduces the copy and it would show why.
> TER / SSA coalescing might make a mess out of the copies you leave in place
> anyway, no?
This leaves __builtin_assume_aligned around.
Will also push the revert of r16-4637-g8590b32deac05e alongside this.
* c-c++-common/ubsan/align-5.c: Xfail.
* gcc.dg/pr107389.c: Move to...
* gcc.dg/torture/pr107389.c: ...here. Skip for lto.
* gcc.dg/builtin-assume-aligned-1.c: Instead of
testing for deleting of assume-align, test for
the alignment/misalignment. Also disable the
vectorizer.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Richard Biener [Fri, 24 Oct 2025 10:31:54 +0000 (12:31 +0200)]
tree-optimization/120687 - legitimise some permutes before optimizing
The following does a simple legitimising attempt on the SLP graph
permutations before trying to optimize them. For the case we have
a single non-zero layout we can force that to all partitions if
it is compatible. This way we end up with the most canonical
(and possibly no-op) load permutations and permutes.
I have refrained from trying to use internal_node_cost to actually
check if the result is legitimate (it would need at least the
change to anticipate redundant load permute eliding). This relies
on start_choosing_layouts choosing layout zero for everything we
cannot handle (like non-bijective permutes).
What's still missing is to try to process disconnected parts of the
SLP graph separately. We should possibly try to handle those
separately through all of the SLP optimize process for simplicity.
This also includes a fix for a dumping ICE when we permute the
root node of a reduction chain that was associated.
PR tree-optimization/120687
* tree-vect-slp.cc (vect_optimize_slp_pass::is_compatible_layout):
New overload for checking a whole partition.
(vect_optimize_slp_pass::legitimize): New function trying
a single layout for all partitions for now.
(vect_optimize_slp_pass::run): Try legitimizing to a single
layout before propagating.
(vect_slp_analyze_operations): For dumping deal with
SLP_TREE_SCALAR_STMTS being empty or element zero being NULL.
Jakub Jelinek [Thu, 30 Oct 2025 07:43:18 +0000 (08:43 +0100)]
libstdc++: Implement C++23 P2674R1 - A trait for implicit lifetime types
The following patch attempts to implement the library side of the
C++23 P2674R1 paper. As mentioned in the paper, since CWG2605
the trait isn't really implementable purely on the library side.
The compiler side has been committed earlier, so this just uses
the new builtin trait on the library side.
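A hedged usage sketch of the new trait follows. It is guarded by the __cpp_lib_is_implicit_lifetime feature-test macro so that it still compiles where the trait is unavailable. The Pod and NonTrivial types and the check_implicit_lifetime function are hypothetical names for this example.

```cpp
#include <cassert>
#include <type_traits>

// An aggregate: expected to be an implicit-lifetime type.
struct Pod {
  int x;
  double y;
};

// User-provided constructors and destructor: not an implicit-lifetime type.
struct NonTrivial {
  NonTrivial() {}
  NonTrivial(const NonTrivial &) {}
  ~NonTrivial() {}
};

// Returns 1 if the trait is available and gives the expected answers,
// 0 if it is available but disagrees, and -1 if it is not available.
int check_implicit_lifetime() {
#ifdef __cpp_lib_is_implicit_lifetime
  return (std::is_implicit_lifetime_v<Pod> &&
          std::is_implicit_lifetime_v<int> &&
          !std::is_implicit_lifetime_v<NonTrivial>)
             ? 1
             : 0;
#else
  return -1;  // trait not provided by this library or language mode
#endif
}
```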
2025-10-30 Jakub Jelinek <jakub@redhat.com>
* include/bits/version.def (is_implicit_lifetime): New.
* include/bits/version.h: Regenerate.
* include/std/type_traits (std::is_implicit_lifetime,
std::is_implicit_lifetime_v): New trait.
* src/c++23/std.cc.in (std::is_implicit_lifetime,
std::is_implicit_lifetime_v): Export.
* testsuite/20_util/is_implicit_lifetime/version.cc: New test.
* testsuite/20_util/is_implicit_lifetime/value.cc: New test.
Guo Jie [Wed, 29 Oct 2025 08:38:54 +0000 (16:38 +0800)]
LoongArch: Standard instruction template fnmam4 correction
The current implementation of the fnmam4 instruction template requires
the third source operand to be assigned the same hard register as the
target operand, but the constraint is not documented in the instruction
manual or standard template definitions. The current constraint will
generate additional data dependencies and extra instructions.
Jinyang He [Wed, 29 Oct 2025 08:07:35 +0000 (16:07 +0800)]
LoongArch: Only allow valid binary op when optimize conditional move
It is wrong to optimize `if (cond) dest op= 1 << shift` into
`dest op= (cond ? 1 : 0) << shift` when `dest op 0 != dest`,
as is the case for `and`, `mul` or `div`.
Note that in this optimization `mul` and `div` are optimized to a shift.
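The condition can be checked on a small model. The sketch below uses a hypothetical Op enumeration and helper functions (it is not the LoongArch backend code) to compare the branching form with the rewritten form. The two agree whenever `dest op 0 == dest` and can differ otherwise.

```cpp
#include <cassert>

enum class Op { Add, Or, Xor, And, Mul };

// Apply a binary operation to two unsigned values.
inline unsigned apply(Op op, unsigned a, unsigned b) {
  switch (op) {
    case Op::Add: return a + b;
    case Op::Or:  return a | b;
    case Op::Xor: return a ^ b;
    case Op::And: return a & b;
    case Op::Mul: return a * b;
  }
  return a;  // not reached for valid Op values
}

// Original form: if (cond) dest op= 1 << shift;
inline unsigned with_branch(Op op, unsigned dest, bool cond, unsigned shift) {
  assert(shift < sizeof(unsigned) * 8);
  if (cond)
    dest = apply(op, dest, 1u << shift);
  return dest;
}

// Rewritten form: dest op= (cond ? 1 : 0) << shift;
inline unsigned without_branch(Op op, unsigned dest, bool cond, unsigned shift) {
  assert(shift < sizeof(unsigned) * 8);
  return apply(op, dest, (cond ? 1u : 0u) << shift);
}

// The rewrite is only sound when zero is a right identity of the operation.
inline bool zero_is_identity(Op op, unsigned dest) {
  return apply(op, dest, 0u) == dest;
}
```

For example, with dest = 5 and a false condition, the `and` case keeps 5 in the branching form but produces 0 in the rewritten form.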
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_expand_conditional_move): Only allow valid binary
op when optimize conditional move.
Eric Botcazou [Wed, 29 Oct 2025 23:33:36 +0000 (00:33 +0100)]
Ada: Fix spurious visibility issue with qualified aggregate in instantiation
Aggregates used as actuals of formal object parameters are handled specially
by Instantiate_Object in Sem_Ch12 and qualifying them is sufficient to block
this special processing.
gcc/ada/
PR ada/54178
* sem_ch12.adb (Instantiate_Object): Strip qualification to detect
aggregates used as actuals.
gcc/testsuite/
* gnat.dg/aggr32.adb: New test.
* gnat.dg/aggr32_pkg.ads: New helper.
* gnat.dg/aggr32_pkg-child.ads: Likewise.
Eric Botcazou [Wed, 29 Oct 2025 23:06:00 +0000 (00:06 +0100)]
Ada: Fix instantiation failure with qualified name of child generic unit
This is again an issue with multiple levels of nested instances, and it
arises because the qualified name of the problematic child generic unit
is used (this works fine with the direct name), exposing the rather
questionable processing implemented for instances in Find_Expanded_Name.
The patch replaces this processing with the straightforward decoding of
the renaming scheme used in Sem_Ch12.
gcc/ada/
PR ada/16214
* sem_ch8.adb (Find_Expanded_Name): Consolidate and streamline the
processing required for references to instances within themselves.
Jeff Law [Wed, 29 Oct 2025 20:52:03 +0000 (14:52 -0600)]
[PR target/116662][RISC-V] Adjust destructive interference size for RISC-V
So per the discussion in PR 116662, this adjusts the destructive interference
size for RISC-V to be more in line with current designs (64 bytes).
Getting this wrong is "just" a performance issue, so there's no correctness
concerns to be worried about. The only real worry is that the value can have
ABI implications. The position that Jason and others have taken is that while
it can be mis-used in a way that gets exposed as ABI, that's inherently unsafe
and we issue warning diagnostics for those cases.
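For context, a typical use of the value looks like the sketch below, which pads two counters so they do not share a cache line. The type and function names are hypothetical, and the fallback of 64 is an assumption for library implementations that do not define the constant. Note that a layout like this in a public header is exactly the kind of use that bakes the value into an ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kDestructiveSize =
    std::hardware_destructive_interference_size;
#else
constexpr std::size_t kDestructiveSize = 64;  // assumed fallback value
#endif

// Each counter is aligned (and therefore padded) to the interference size,
// so two counters never land on the same cache line.
struct alignas(kDestructiveSize) PaddedCounter {
  long value = 0;
};

struct Counters {
  PaddedCounter produced;
  PaddedCounter consumed;
};

// Distance in bytes between the two counters inside one Counters object.
std::size_t counter_spacing(const Counters &c) {
  const char *a = reinterpret_cast<const char *>(&c.produced);
  const char *b = reinterpret_cast<const char *>(&c.consumed);
  assert(b > a);
  return static_cast<std::size_t>(b - a);
}
```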
So here's the change to bump it to 64 bytes. Tested on rv32 and rv64 embedded
targets. Bootstrap on the Pioneer & BPI is in flight and not due to land for
several hours. Will push once pre-commit CI has done its thing (and the
Pioneer might have finished its cycle by then, which I'll check, obviously).
PR target/116662
gcc/
* config/riscv/riscv.cc (riscv_option_override): Override
default value for destructive interference size.
Jonathan Wakely [Wed, 29 Oct 2025 21:37:18 +0000 (21:37 +0000)]
libstdc++: Fix -Wunused-variable from <regex>
In r16-4709-gc55c1de3a9adb2 I meant to use the result of the
static_cast<char> for the rest of the function following it, but I
accidentally used the original variable __ch. This causes
-Wunused-variable warnings for the __c initialized from the cast.
This fixes the rest of the function to use __c instead of __ch.
libstdc++-v3/ChangeLog:
* include/bits/regex.tcc (regex_traits::value): Use __c instead
of __ch.
Jonathan Wakely [Wed, 29 Oct 2025 15:28:52 +0000 (15:28 +0000)]
libstdc++: Do not include internal headers in tests
For 42319.cc the PR says that <ios> reproduced the problem, so let's
include that instead. We should also use the no_pch option because
otherwise the test is including everything anyway, and so fails to check
that the char_traits.h header can be included in isolation. There's also
no reason to use an explicit -std=gnu++11 option, we can test it for all
modes instead.
For the thread test there's no reason to use <bits/move.h> instead of
the correct header for std::move.
libstdc++-v3/ChangeLog:
* testsuite/17_intro/headers/c++2011/42319.cc: Include <ios>
instead of <bits/char_traits.h>. Add no_pch option. Remove
explicit -std=gnu++11 option.
* testsuite/30_threads/thread/swap/1.cc: Include <utility>
instead of <bits/move.h>.
Tomasz Kamiński [Mon, 27 Oct 2025 13:19:47 +0000 (14:19 +0100)]
libstdc++: Implement const copy-assignment for tuple<> [PR119721]
This patch completes the implementation of P2321R2, giving tuple proper proxy
reference semantics.
The assignment operator is implemented as a template constrained to accept only
tuple<>. Consequently, the language does not consider it a copy assignment
operator, which prevents tuple<> from losing its trivially copyable status.
The _Tuple template parameter is defaulted, ensuring the operator remains
a viable candidate for assignment with an empty brace-init list.
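The technique can be sketched on a standalone empty class. EmptyTuple and EmptyTupleNonTemplate below are hypothetical stand-ins, not the libstdc++ definition of tuple<>. The constrained template is not a copy assignment operator, so the implicit trivial one remains, while the defaulted template parameter keeps assignment from an empty brace-init list viable.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an empty tuple type.
struct EmptyTuple {
  // A template is never a copy assignment operator, so the implicitly
  // declared (trivial) copy assignment operator is left untouched.
  // Defaulting T lets `t = {}` pick this overload, since T cannot be
  // deduced from an empty brace-init list.
  template <typename T = EmptyTuple,
            typename = typename std::enable_if<
                std::is_same<T, EmptyTuple>::value>::type>
  const EmptyTuple &operator=(const T &) const noexcept {
    return *this;
  }
};

// For contrast: written without a template, the same operator is a
// user-provided copy assignment operator.
struct EmptyTupleNonTemplate {
  const EmptyTupleNonTemplate &operator=(
      const EmptyTupleNonTemplate &) const noexcept {
    return *this;
  }
};

static_assert(std::is_trivially_copyable<EmptyTuple>::value,
              "template assignment keeps the type trivially copyable");
static_assert(!std::is_trivially_copyable<EmptyTupleNonTemplate>::value,
              "a user-provided copy assignment does not");

// Assign to a const object from an empty brace-init list.
inline bool const_assign_from_braces() {
  const EmptyTuple t{};
  const EmptyTuple &r = (t = {});
  return &r == &t;
}
```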
Osama Abdelkader [Sat, 25 Oct 2025 17:25:42 +0000 (20:25 +0300)]
libstdc++: Add constructors and assignments for tuple<> with tuple-like types [PR119721]
This patch adds support for constructing and assigning tuple<> from
other empty tuple-like types (e.g., array<T, 0>), completing the C++23
tuple-like interface for the zero-element tuple specialization.
The implementation includes:
- Constructor from forwarding reference to tuple-like types
- Allocator-aware constructor from tuple-like types
- Assignment operator from tuple-like types
- Const assignment operator from tuple-like types
PR libstdc++/119721
libstdc++-v3/ChangeLog:
* include/std/tuple (tuple<>::tuple(const tuple&))
(tuple<>::operator=(const tuple&)): Define as defaulted.
(tuple<>::swap): Moved the definition after assignments.
(tuple<>::tuple(_UTuple&&))
(tuple<>::tuple(allocator_arg_t, const _Alloc&, _UTuple&&))
(tuple<>::operator=(_UTuple&&)) [__cpp_lib_tuple_like]: Define.
(tuple<>::operator==, tuple<>::operator<=>): Parenthesize
constraints individually.
* testsuite/23_containers/tuple/cons/119721.cc: New test for
constructors and assignments with empty tuple-like types.
* testsuite/20_util/tuple/requirements/empty_trivial.cc:
New test verifying tuple<> remains trivially copyable.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Tue, 28 Oct 2025 12:15:52 +0000 (12:15 +0000)]
libstdc++: Simplify std::regex_traits::value
We don't need to use an istringstream to convert a hex digit to its
numerical value. And if we don't use istringstream there, we don't need
to include <sstream> in <regex>.
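As an illustration of the idea, a digit can be converted with plain character arithmetic. This is a sketch only: the function name digit_value is hypothetical and this is not the actual libstdc++ code. It assumes the contract of regex_traits::value, which returns -1 when the character is not a valid digit in the given radix.

```cpp
#include <cassert>

// Hypothetical helper: numeric value of ch in the given radix (8, 10 or 16),
// or -1 if ch is not a valid digit in that radix.
inline int digit_value(char ch, int radix) {
  int v;
  if (ch >= '0' && ch <= '9')
    v = ch - '0';
  else if (ch >= 'a' && ch <= 'f')
    v = ch - 'a' + 10;
  else if (ch >= 'A' && ch <= 'F')
    v = ch - 'A' + 10;
  else
    return -1;
  return v < radix ? v : -1;
}
```

No stream object is constructed, so nothing from <sstream> is needed.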
libstdc++-v3/ChangeLog:
* include/bits/regex.tcc (regex_traits::value): Implement
without using istringstream.
* include/std/regex: Do not include <sstream>.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Biener [Wed, 29 Oct 2025 08:03:57 +0000 (09:03 +0100)]
Fix possible double-free and leak in BB SLP discovery
vect_build_slp_instance always releases the scalar stmts vector, so make sure
to mark it as released and actually release it.
* tree-vect-slp.cc (vect_analyze_slp): Mark stmts in BB roots
as released after vect_build_slp_instance.
(vect_build_slp_instance): Release scalar_stmts when exiting
early.
Paul Thomas [Wed, 29 Oct 2025 11:06:19 +0000 (11:06 +0000)]
Fortran: PDT - gfortran does not catch F2023:R916 [PR122165]
2025-10-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/122165
* primary.cc (gfc_match_varspec): If the previous component ref
was a type specification parameter, a type inquiry ref cannot
follow.
gcc/testsuite
PR fortran/122165
* gfortran.dg/pdt_64.f03: New test.
Paul Thomas [Wed, 29 Oct 2025 09:20:24 +0000 (09:20 +0000)]
Fortran: Fix recursive PDT function invocation [PR122433, PR122434]
2025-10-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/122433
* decl.cc (gfc_get_pdt_instance): Prevent a PDT component of
the same type as the template from being converted into an
instance.
PR fortran/122434
* resolve.cc (gfc_impure_variable): The result of a pure
function is a valid allocate object since it is pure.
gcc/testsuite/
PR fortran/122433
* gfortran.dg/pdt_62.f03: New test.
PR fortran/122434
* gfortran.dg/pdt_63.f03: New test.
When implementing the vector template for copysign, we used vector
floating-point AND and IOR operations. This allows AND and IOR operands
to be vector floating-point types. However, the constraint YC does not
handle vector floating-point constants, resulting in ICE.
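For context, the AND/IOR formulation of copysign can be sketched on a single scalar. This is an illustrative C++ version, not the LoongArch machine description; it assumes float is IEEE 754 binary32, and the name copysign_bits is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical scalar sketch: take the magnitude bits of `magnitude` and the
// sign bit of `sign`, using only AND and IOR on the bit patterns.
inline float copysign_bits(float magnitude, float sign) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes 32-bit float");
  const std::uint32_t sign_mask = 0x80000000u;  // assumes IEEE 754 binary32
  std::uint32_t m, s;
  std::memcpy(&m, &magnitude, sizeof m);
  std::memcpy(&s, &sign, sizeof s);
  const std::uint32_t r = (m & ~sign_mask) | (s & sign_mask);
  float out;
  std::memcpy(&out, &r, sizeof out);
  return out;
}
```

In the vector form, the mask is a constant vector applied to floating-point operands, which is why the constant predicates need to accept vector float modes.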
PR target/122097
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_const_vector_bitimm_set_p): Add support for vector float.
(loongarch_const_vector_bitimm_clr_p): Likewise.
(loongarch_print_operand): Likewise.
* config/loongarch/simd.md (and<mode>3): Likewise.
Xi Ruoyao [Sun, 26 Oct 2025 05:20:20 +0000 (13:20 +0800)]
LoongArch: Make the code generation of the trap pattern configurable
In some applications (notably the Linux kernel), "break 0" is used as a
trap that a handler may be able to recover. But in GCC the "trap"
pattern is meant to make the program rightfully die instead.
As [1] describes, sometimes it's vital to distinguish between the two
cases. The kernel developers prefer "break 1" here, but in the
user-space it's better to trigger a SIGILL instead of SIGTRAP as the
latter is more likely used as an application-defined trap.
To support both cases, make the code generation configurable with a new
option.
* config/loongarch/genopts/loongarch.opt.in (-mbreak-code=):
New.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.md (trap): Separate to a
define_insn and a define_expand which takes la_break_code.
* doc/invoke.texi (-mbreak-code=): Document.
* config/loongarch/loongarch.opt.urls: Regenerate.
gcc/testsuite
* gcc.target/loongarch/trap-default.c: New test.
* gcc.target/loongarch/trap-1.c: New test.
Eric Botcazou [Tue, 28 Oct 2025 22:04:49 +0000 (23:04 +0100)]
Ada: Fix visibility issue for child unit declared as instance on homonym
The reproducer is made up of 9 units containing multiple levels of nested
instances, but in the end the problem is that the final child unit has
the same name as the parameter in its instantiation, exposing the wrong
manipulation of the homonym chain done in Analyze_Subprogram_Instantiation.
The fix is to replace this manipulation with a call to Remove_Homonym.
gcc/ada/
PR ada/48039
* sem_ch12.adb (Analyze_Subprogram_Instantiation): Call
Remove_Homonym to remove the enclosing package from visibility.
Marek Polacek [Tue, 28 Oct 2025 00:08:00 +0000 (20:08 -0400)]
c++: share more trees representing enumerators
This came up in Reflection where an assert fails because we have two
different trees for the same enumerators. The reason is that in
finish_enum_value_list we copy_node when converting the enumerators.
It should be more efficient to share trees for identical enumerators.
This fix was proposed by Jakub.
gcc/cp/ChangeLog:
* decl.cc (finish_enum_value_list): Use fold_convert instead of
copy_node.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Jonathan Wakely [Fri, 24 Oct 2025 10:38:22 +0000 (11:38 +0100)]
libstdc++: Fix deadlock in shared_timed_mutex test [PR122401]
The test_shared_relative function deadlocks on older Glibc versions that
don't have pthread_rwlock_clockrdlock, because (as already mentioned
earlier in the test file) pthread_rwlock_timedrdlock returns EDEADLK if
the thread that already holds a write lock attempts to acquire a read
lock, causing std::shared_timed_mutex to loop forever.
The fix is to do the invalid try_lock_shared_for call on a different
thread. To avoid undefined behaviour, we need to make the same changes
to all calls that try to acquire a lock that is already held.
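The pattern can be sketched as follows. This is a minimal illustration, not the contents of the test file; the function name shared_lock_acquired_while_write_held is hypothetical. The thread that owns the exclusive lock never attempts a second acquisition itself; a helper thread makes the timed attempt instead.

```cpp
#include <cassert>
#include <chrono>
#include <shared_mutex>
#include <thread>

// Hypothetical sketch: returns whether a shared lock could be obtained while
// another thread holds the exclusive lock.
inline bool shared_lock_acquired_while_write_held() {
  std::shared_timed_mutex m;
  bool acquired = true;
  m.lock();  // exclusive lock held by the calling thread
  std::thread t([&] {
    // The timed attempt runs on a thread that holds no lock on m.
    acquired = m.try_lock_shared_for(std::chrono::milliseconds(10));
    if (acquired)
      m.unlock_shared();
  });
  t.join();
  m.unlock();
  return acquired;
}
```

Because the attempt comes from a thread that does not own the mutex, the call is well defined and simply times out.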
Also add missing -pthread for PR122401.
libstdc++-v3/ChangeLog:
PR libstdc++/122401
* testsuite/30_threads/shared_timed_mutex/try_lock_until/116586.cc:
Do not try to acquire locks on the thread that already holds a
lock. Add -pthread for et pthread.
Reviewed-by: Mike Crowe <mac@mcrowe.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Earnshaw [Tue, 28 Oct 2025 15:06:51 +0000 (15:06 +0000)]
editorconfig: Add explicit tab_width when indent_size != 8
The documentation for editorconfig files says that setting indent_size
changes the default value of tab_width; but the documentation is a
little ambiguous as to what happens if the two values are set via
different match rules. I'd generally expect in this case that the
defaulting behavior would only kick in if there were no setting of
tab_width at all, but it seems that the go implementation (or at least
the way forgejo uses the go implementation) does not do this.
However, it is fairly easy to make this all explicit by explicitly
setting tab_width whenever we have an indent_size that is not 8. I've
deliberately omitted overriding this when the indent style is set to
space, since this should make the presence of a hard tab show up in
the forge UI more clearly as incorrect indentation.
/ChangeLog:
* .editorconfig: Explicitly set tab_width whenever a
config rule has indent_style = tab and indent_size != 8.
Eric Botcazou [Wed, 1 Oct 2025 10:28:59 +0000 (12:28 +0200)]
ada: Fix miscompilation at -O2 due to aliasing issue caused by -gnatVa
The problem is that the expanded code generated by -gnatVa (-gnatVc to be
precise) violates strict aliasing rules, because it contains a 'Reference
to an elementary component that is nonaliased ('Reference is equivalent to
a pointer for code generation purposes and the "aliased" keyword is trusted
for components whose type is elementary by code generators).
Remove_Side_Effects already knows that it must make a copy for elementary
types instead of taking 'Reference, but it is fooled by the private type
of the expression. The fix is to still use the Etype to build new nodes,
but to use its Underlying_Type to select the strategy to do so.
gcc/ada/ChangeLog:
* exp_util.adb (Remove_Side_Effects): Use separately the Etype of
the expression to build new nodes and its Underlying_Type to drive
part of the processing.
ada: Rework disabling signals when calling pthread_create on QNX
Use the correct pthread_sigmask instead of sigprocmask when disabling
signals on QNX. Furthermore make use of the already existing bindings to
implement that functionality in Ada instead of C. Enable signals in both
Create_Task and Enter_Task as they need to be enabled in both the parent
and the child.
gcc/ada/ChangeLog:
* adaint.c: Remove __gnat_enable_signals, __gnat_disable_signals
and related code for QNX.
* libgnarl/s-taprop__qnx.adb: Disable and enable
signals in Ada.
Exception names don't get the Is_Imported flag set even when they're
imported from CPP. With the flag set, we end up referencing an
external variable instead of defining the exception data structure as
expected, and aspect Import behaves differently from pragma Import.
Refrain from calling Set_Is_Imported when analyzing an exception's
Import aspect.
gcc/ada/ChangeLog:
* sem_ch13.adb (Analyze_Aspect_Export_Import): Skip
Set_Is_Imported on E_Exception.
* sem_prag.adb (Process_Import_Or_Interface): Explain
why not Set_Is_Imported.
This patch avoids marking subprograms not declared immediately within package
specifications as primitive, unless they're either inherited or overriding.
gcc/ada/ChangeLog:
* sem_util.adb (Collect_Primitive_Operations): Avoid setting
Is_Primitive for noninherited and nonoverriding subprograms not
declared immediately within a package specification.
* sem_ch13.adb (Check_Nonoverridable_Aspect_Subprograms): Better
error posting to allow multiple errors on same type but different
aggregate subprogram.
This patch adds two new subprograms to Table.Table: Clear and Is_Empty.
Their selling point is that they don't require being aware of the bounds
of the instance of Table.Table, avoiding the off-by-one errors that can
happen when using Set_Last or Last directly.
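The off-by-one hazard can be illustrated with a rough C++ analogue. This is a sketch only: BoundedTable and its members are hypothetical names, not the Ada Table.Table interface. Emptying the table by hand requires knowing that the "last" index must become one less than the low bound, which differs between 0-based and 1-based instances.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical analogue of a table whose first index is chosen per instance.
template <typename T>
class BoundedTable {
public:
  explicit BoundedTable(int low_bound)
      : low_(low_bound), last_(low_bound - 1) {}

  int first() const { return low_; }
  int last() const { return last_; }

  void append(const T& value) {
    items_.push_back(value);
    ++last_;
  }

  // Caller must pass an index >= first() - 1.
  void set_last(int new_last) {
    items_.resize(static_cast<std::size_t>(new_last - low_ + 1));
    last_ = new_last;
  }

  // These two hide the low bound from the caller.
  void clear() { set_last(low_ - 1); }
  bool is_empty() const { return last_ < low_; }

private:
  int low_;
  int last_;
  std::vector<T> items_;
};
```

With clear and is_empty, the same call is correct for any low bound, whereas a hand-written set_last(0) would only empty a 1-based table.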
This patch also replaces existing code by calls to these new subprograms
in a few places where it makes sense. It also adds a call to
Table.Table.First in the same spirit on the side.
gcc/ada/ChangeLog:
* table.ads (Clear, Is_Empty): New subprograms.
* table.adb (Clear, Is_Empty): Likewise.
(Init): Use new subprogram.
* atree.adb (Traverse_Func_With_Parent): Use new subprograms.
* fmap.adb (Empty_Tables): Use new subprogram.
* par_sco.adb (Process_Pending_Decisions): Likewise.
* sem_elab.adb (Check_Elab_Call): Likewise.
* sem_ch12.adb (Build_Local_Package, Analyze_Package_Instantiation,
Analyze_Subprogram_Instantiation): Likewise.
(Save_And_Reset): Use Table.Table.First.
Eric Botcazou [Fri, 26 Sep 2025 17:45:10 +0000 (19:45 +0200)]
ada: Fix unexpected overflow check before fixed-point multiplication
The problem is that the code generating the fixed-point multiply uses the
subtypes of the operands to size the operation, while operations are to be
performed in base types, which are signed per the RM 3.5.9(12) subclause.
As a consequence, when the subtypes are fully asymmetric unsigned, the size
is too small and an incorrect overflow check is generated.
The code generating the divide was fixed a long time ago; this aligns the
code generating the multiply and the code generating the remainder, which
in turn requires a couple of adjustments to related routines.
gcc/ada/ChangeLog:
PR ada/122063
* exp_fixd.adb (Build_Double_Divide_Code): Convert the result of the
multiply.
(Build_Multiply): Use base types of operands to size the operation.
(Build_Rem): Likewise.
(Build_Scaled_Divide_Code): Convert the result of the multiply.
Javier Miranda [Mon, 15 Sep 2025 16:34:47 +0000 (16:34 +0000)]
ada: Unsigned_Base_Range aspect (part 5)
Enable this language extension using -gnatd.u, and extend the
current support to handle derivations of types that have
Unsigned_Base_Range aspect.
gcc/ada/ChangeLog:
* aspects.adb (Get_Aspect_Id): Enable aspect Unsigned_Base_Range
using -gnatd.u
* debug.adb (Debug_Flag_Dot_U): Document this switch.
* einfo-utils.adb (Is_Modular_Integer_Type): Return True if
the entity is a modular integer type and its base type does
not have the attribute has_unsigned_base_range_aspect.
(Is_Signed_Integer_Type): Return True if the entity is a signed
integer type, or it is a modular integer type and its base type
has the attribute has_unsigned_base_range_aspect.
* einfo.ads (E_Modular_Integer_Type): Add documentation of
Has_Unsigned_Base_Range_Aspect.
* par-ch4.adb (Scan_Apostrophe): Enable attribute Unsigned_Base_Range
using -gnatd.u
* sem_ch13.adb (Analyze_One_Aspect): Check general language
restrictions on aspect Unsigned_Base_Range. For Unsigned_Base_Range
aspect, do not delay the generation of the pragma because we need
to process it before any type or subtype derivation is analyzed.
* sem_ch3.adb (Build_Scalar_Bound): Disable code analyzing the
bound with the base type of the parent type because, for unsigned
base range types, their base type is a modular type but their
type is a signed integer type.
* sem_prag.adb (Analyze_Pragma): Enable pragma Unsigned_Base_Range
using -gnatd.u. Check more errors on Unsigned_Base_Range pragma,
and create the new base type only when required.
Before this patch, Sem_Ch12 jumped through questionable hoops in the way
it used its Generic_Renamings table that involved defensive calls to the
'Valid attribute. No known bug has been caused by this, but valgrind
reported incorrect memory operations because of it.
After analysis, the problem seems to be a mix of 0-based and 1-based
indexing in the uses of Generic_Renamings and a convoluted interface for
the Set_Instance_Of procedure, leading to an unclear status for
Generic_Renamings.Table (0).
This patch fixes those problems and removes the accompanying defensive
code.
gcc/ada/ChangeLog:
* sem_ch12.adb (Build_Local_Package)
(Analyze_Package_Instantiation, Analyze_Subprogram_Instantiation):
Fix Set_Last calls.
(Set_Instance_Of): Use Table.Table.Append.
(Save_And_Reset): Remove useless call. Remove defensive code.
(Restore): Remove incorrect Set_Last call and adapt to
Set_Instance_Of change.