Georg-Johann Lay [Thu, 14 May 2026 08:21:28 +0000 (10:21 +0200)]
AVR: target/125194 - Make -mno-call-main work with -flto.
Instead of emitting .global __call_main + __call_main=0 in some
module, it uses a %{mno-call-main: --defsym __call_main=0} spec.
The problem with the old implementation is that avr_no_call_main_p
was set by cc1[plus] (in avr_insert_attributes) but used by lto1
(in avr_file_end). The new approach uses --defsym __call_main=0
in order to avoid link failures due to multiple definitions of
__call_main in *.o and lib<mcu>.a.
PR target/125194
gcc/
* config/avr/avr.cc (avr_no_call_main_p): Remove variable...
(avr_file_end): ...and code that uses it.
(avr_insert_attributes): Same. Add "used" to main attributes
when -mno-call-main.
* config/avr/gen-avr-mmcu-specs.cc (print_mcu): Emit code
for link_no_call_main specs.
* config/avr/specs.h (LINK_SPEC): Add %(link_no_call_main).
Alexandre Oliva [Wed, 13 May 2026 15:50:34 +0000 (12:50 -0300)]
libstdc++: rebuild configure
An earlier configury patch made originally in gcc-15, where no line
number changes came up during configure rebuild, needed an explicit
rebuild after porting to trunk.
This patch removes the "hi/lo" optimization path in interleaved
stepped constant vector synthesis. This optimization was found
to be not really better than the fallback merge version. Once the
overflow issue is fixed, the extra masking makes the code generation
"even worse". So remove it instead of fixing.
Obsolete tests slp-interleave-[1-5].c are also removed as they were
specifically designed to verify this now-removed path.
My local test shows no regression.
PR target/125215
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector_interleaved_stepped_npatterns):
Remove hi/lo optimization and always use the merge fallback.
Tomasz Kamiński [Wed, 13 May 2026 09:51:25 +0000 (11:51 +0200)]
libstdc++: Use type_identity_t for operator<=> parameter [PR114400]
This matches the change to operator== from r14-9642-gf4605c53ea2eeaf,
and implements the exact resolution of LWG3950, "std::basic_string_view
comparison operators are overspecified". The difference between
__type_identity and type_identity is observable, as illustrated by the PR.
libstdc++-v3/ChangeLog:
PR libstdc++/114400
* include/std/string_view (operator<=>): Use type_identity_t
instead of __type_identity_t.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Marek Polacek [Tue, 12 May 2026 21:01:29 +0000 (17:01 -0400)]
c++/reflection: overzealous complete_type in consteval_only_p [PR125280]
This bug report shows that we generate a bogus "has incomplete type"
error for Element::mData in type12.C with -freflection: consteval_only_p
attempts to complete Array when finishing mData, but mData uses Element
which is currently being defined, so we can't complete it yet. We could
fix this bogus error by checking can_complete_type_without_circularity
before calling complete_type in consteval_only_p, but I'm no longer so
sure that we must call complete_type at all. This patch removes that
call. One consequence of that is that we don't produce the "outside
a constant-evaluated context" error for C::mData (only when mData is
defined outside the class as with A::mData above). This behavior
matches clang++ though so I'm not too worried about it.
type13.C shows that even without the complete_type we can still get
to consteval_only_p_walker::walk with a member with erroneous type
which currently crashes.
PR c++/125280
gcc/cp/ChangeLog:
* reflect.cc (consteval_only_p): Don't complete_type.
(consteval_only_p_walker::walk): Return false for
error_mark_node.
gcc/testsuite/ChangeLog:
* g++.dg/reflect/init19.C: New test.
* g++.dg/reflect/type12.C: New test.
* g++.dg/reflect/type13.C: New test.
Tomasz Kamiński [Wed, 13 May 2026 12:51:02 +0000 (14:51 +0200)]
libstdc++: Test for unsupported engine range for 128bits floating points [PR119739]
This patch adds a test illustrating that, after implementing P0952 "A new
specification for std::generate_canonical", generators that emit a range of
non-power-of-two size spanning B bits are not supported in combination with
128-bit floating-point types for B in the ranges [22, 23), [26, 29), [33, 38),
[43, 57). This is because the lowest multiple of B that is greater than or
equal to 113 (the size of the float128 mantissa) is greater than 128, so these
generators would require 256-bit integer support.
This does not impact any generator defined in the standard (see gencanon_eng.cc
tests), nor generators emitting power-of-2 sized ranges.
PR libstdc++/119739
libstdc++-v3/ChangeLog:
* testsuite/26_numerics/random/uniform_real_distribution/operators/gencanon_eng_neg.cc:
New test.
Reviewed-by: Nathan Myers <nmyers@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Tamar Christina [Wed, 13 May 2026 11:36:07 +0000 (12:36 +0100)]
scev: maintain affine CHRECs in the presence of type conversions
The example
float *e;
void f (float *f, float *g, char *h, int n,
        int b, int c, int d)
{
  float a = 0;
  for (int i = 0; i < n; ++i) {
    int j = b + i, k = c + i * d;
    float l = g[j], m = h[i] ? g[k] : l;
    a += f[i] * m;
  }
  *e = a;
}
gets vectorized using gathers for the accesses to g. However, the first
access is g[b+i] and the second is g[c + i*d]. Since b is loop invariant,
the access to g[b+i] is actually linear, and since c is loop invariant,
the base of the second access g[c + i*d] can be simplified by recognizing
the base as g + c.
Today however SCEV fails to analyze these accesses as affine and as a
consequence we end up with gathers:
: missed: failed: evolution of base is not affine.
base_address:
offset from base address:
constant offset from base address:
step:
base alignment: 0
base misalignment: 0
offset alignment: 0
step alignment: 0
base_object: *_63
Looking at SCEV this is because of an outer cast around the CHREC:
Roger Sayle [Wed, 13 May 2026 11:31:12 +0000 (12:31 +0100)]
x86: Shorter load immediate constants with -Oz
This patch adds two peephole2 patterns to i386.md to decrease the size
of some integer loads. These replace "movl $const, %eax" (5 bytes)
with "xorl %eax, %eax" followed by either "movb $const,%al" or
"movb $const,%ah" (together 4 bytes), for suitable constants and
suitable general registers, when the flags register is dead.
Ideally modern Intel and AMD processors can recognize these sequences
during instruction decode (avoiding any partial register stall in
the same way they avoid the false dependence for the xorl), and
internally generate a single uop, treating these bytes like an
alternate instruction encoding.
2026-05-13 Roger Sayle <roger@nextmovesoftware.com>
Uros Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/32803
* config/i386/i386.md (peephole2): Don't transform xorl;movb into
movzb with -Oz.
(peephole2): Convert movl into xorl;movb (strict_low_part) with -Oz.
(peephole2): Likewise, convert movl into xorl;movb [abcd]h with -Oz.
gcc/testsuite/ChangeLog
PR target/32803
* gcc.target/i386/pr32803-2.c: New test case.
* gcc.target/i386/pr32803-3.c: Likewise.
Roger Sayle [Wed, 13 May 2026 11:23:41 +0000 (12:23 +0100)]
testsuite: Skip new test case gcc.target/arm/muldi-1.c with -mthumb
The recent test confirming that PR middle-end/122871 is resolved on ARM
wasn't expecting -mthumb. This adds a requires-effective-target.
Committed as obvious.
2026-05-13 Roger Sayle <roger@nextmovesoftware.com>
Richard Earnshaw <rearnsha@arm.com>
gcc/testsuite/ChangeLog
PR middle-end/122871
* gcc.target/arm/muldi-1.c: Skip test if compiled with -mthumb.
Richard Biener [Wed, 13 May 2026 06:47:43 +0000 (08:47 +0200)]
Record (de-)composition type in ls_type for VMAT_STRIDED_SLP accesses
The following arranges the vector (de-)composition type for VMAT_STRIDED_SLP
loads and stores to be available in the ls_type field of the load/store data
for target costing.
* tree-vect-stmts.cc (vectorizable_store): Record ls_type
for VMAT_STRIDED_SLP.
(vectorizable_load): Likewise.
Richard Biener [Tue, 12 May 2026 12:44:30 +0000 (14:44 +0200)]
Delay setting of slp_node->data in vectorizable_{load,store}
We move 'ls' to slp_node->data early, inhibiting late adjustments
like setting of ls_type. We also have failure cases after, which
we should not. The following moves one such failure check earlier
and moves setting slp_node->data and SLP_TREE_TYPE
down, duplicating it to various "success" returns.
* tree-vect-stmts.cc (vectorizable_store): Set slp_node->data
and SLP_TREE_TYPE only on success.
(vectorizable_load): Likewise. Move one validity check early.
Jeevitha [Wed, 13 May 2026 08:36:10 +0000 (03:36 -0500)]
rs6000: Fix [su]mul<mode>3_highpart patterns to use RTL codes [PR122665]
The existing smul<mode>3_highpart and umul<mode>3_highpart patterns
incorrectly defined the high-part multiply by shifting both operands
right by 32 before multiplication. This does not match the semantics
of the instructions vmulhs<wd> and vmulhu<wd>, which perform a widened
multiplication and return the high part of the result.
This patch replaces the incorrect shift-based patterns with the proper
smul_highpart and umul_highpart RTL codes, and updates the operand
predicate from vsx_register_operand to altivec_register_operand, since
these instructions only accept Altivec registers.
gcc/
PR target/122665
* config/rs6000/vsx.md (smul<mode>3_highpart, umul<mode>3_highpart):
Replace shift-based patterns with smul_highpart and umul_highpart RTL
codes and use altivec_register_operand.
Pattern 2:
int pop(unsigned x) {
  x = x - ((x >> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
  x = (x + (x >> 4)) & 0x0F0F0F0F;
  x = x + (x >> 8);
  x = x + (x >> 16);
  return x & 0x0000003F;
}
Pattern 3:
int pop(unsigned x) {
  x = x - ((x >> 1) & 0x55555555);
  x = x - 3*((x >> 2) & 0x33333333);
  x = (x + (x >> 4)) & 0x0F0F0F0F;
  x = x + (x >> 8);
  x = x + (x >> 16);
  return x & 0x0000003F;
}
gcc/ChangeLog:
* match.pd: Add new popcount pattern variants from Hacker's Delight.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/popcount7.c: New test.
* gcc.dg/tree-ssa/popcount7_2.c: New test.
* gcc.dg/tree-ssa/popcount8.c: New test.
* gcc.dg/tree-ssa/popcount9.c: New test.
-@item New @code{omp_pause_stop_tool} constant for omp_pause_resource @tab N @tab
+@item New @code{omp_pause_stop_tool} constant for @code{omp_pause_resource
+ @tab Y @tab
H.J. Lu [Sun, 10 May 2026 10:36:26 +0000 (18:36 +0800)]
x86-64: Use R11 for DRAP register in preserve_none functions
In 64-bit mode, for preserve_none functions, DRAP may use any register
except AX, R12–R15, DI, SI (argument registers), SP, and BP. Use R11.
In non-callee-saved functions, R10 and R13 are also available since they
are not used for parameter passing. In 32-bit mode, preserve_none does
not affect parameter passing, so the current approach remains valid.
DRAP register is used to restore stack pointer in epilogue for stack
realignment. Always save and restore DRAP register between prologue
and epilogue so that stack pointer can be restored.
Tested with CPython 3.14.4 on Linux/x86-64.
gcc/
PR target/120870
* config/i386/i386.cc (ix86_save_reg): Return true for DRAP
register early at entry.
(find_drap_reg): Use R11_REG in preserve_none functions in
64-bit mode.
gcc/testsuite/
PR target/120870
* gcc.target/i386/pr120870-1.c: New test.
* gcc.target/i386/pr120870-2.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Co-Authored-By: Uros Bizjak <ubizjak@gmail.com>
Álvaro Begué [Tue, 12 May 2026 12:55:01 +0000 (14:55 +0200)]
libstdc++: Support ON-format DAY in Zone UNTIL field [PR124852]
The Zone-line UNTIL parser only accepted a plain day-of-month integer
for the DAY field, while the tzdata.zi grammar accepts the same ON-style
forms as Rule lines: lastSun, Sun>=8, Sat<=20, etc. Real zones use these
forms in their UNTIL DAY: Europe/Simferopol's `3 - MSK 1997 Mar lastSu
1u`, for instance, became `Mar 1` (silently misparsed) instead of `Mar
30`, leaving Simferopol an extra 29 days in MSK.
The previous parser's `int d = 1; in >> m >> d >> t;` chain silently
left d == 1 when the day token wasn't a digit, then went on to parse the
remainder as the TIME field.
Renames on_day to on_month_day, factors out the day-component parser
from operator>>(istream&, on_day&) as operator>> for the newly added
on_day_t tag type, and reuses it for the UNTIL DAY field. The operator
handles all three on_day forms (DayOfMonth, LastWeekday, LessEq /
GreaterEq). The MONTH-only and YEAR-only short forms are still accepted
because the DAY/TIME fields are optional and default to day 1, time 00:00.
The on_day struct's pin() method handles the year/month-relative
resolution.
The DAY field is unambiguously distinguishable from a TIME field that
could otherwise follow the MONTH directly: per zic's grammar, MONTH
must be followed by DAY before any TIME is allowed. So we always
attempt to parse a DAY if any non-whitespace remains after the MONTH.
libstdc++-v3/ChangeLog:
PR libstdc++/124852
* src/c++20/tzdb.cc (on_day): Rename to...
(on_day_month): Rename from on_day.
(on_day_month::on_day_t, on_day_month::on_day): Define.
(operator>>(istream&, on_day_t&&)): Factored out of
operator>>(istream&, on_day&).
(operator>>(istream&, on_day&)): Use on_day_t parser.
(operator>>(istream&, ZoneInfo&)): Replace the integer DAY
parser with on_day_t for the UNTIL field.
* testsuite/std/time/time_zone/until_day_on.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Álvaro Begué <alvaro.begue@gmail.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Tobias Burnus [Wed, 13 May 2026 06:21:53 +0000 (08:21 +0200)]
libgomp: Add stub omp_control_tool for OMPT
Add the omp_control_tool routine (always returning omp_control_tool_notool);
add it and omp_control_tool/omp_control_tool_result enums/parameters to omp.h
and omp_lib.{h,mod}. Plus add OpenMP 6.0's omp_pause_stop_tool named constant.
Note that omp_control_tool uses the OpenMP 6.0 C/C++ prototype that uses an
enum omp_control_tool_t as first argument instead of an 'int'.
gcc/fortran/ChangeLog:
* intrinsic.texi (OpenMP Modules): Add named parameters for
omp_control_tool and omp_control_tool_result.
libstdc++: Fix numeric save offset on Zone lines [PR124851]
When a Zone line specifies a numeric value as its RULES field (the
constant DST save value for that zone line, e.g. Africa/Gaborone's
"2 1 CAST" line), the parser stored the standard offset alone in
ZoneInfo::m_offset. ZoneInfo::to() then returned that as
sys_info::offset, dropping the numeric save and reporting a total
offset that was wrong by the save amount.
This was inconsistent with the two ZoneInfo constructors that take a
sys_info, which previously stored the *total* offset (stdoff + save) in
m_offset. As a result m_offset's semantics depended on which code path
created the ZoneInfo, and only the parser path's lines with non-zero
numeric save were observably broken.
Fix by giving m_offset a single semantics: always the standard offset
only. The two sys_info-taking constructors now subtract the save before
storing, and to() adds it back when reconstructing the sys_info.
The remaining .offset() callers inside _M_get_sys_info already expect
the standard offset (they are computing rule firing times, where the
save component is added separately from the active rule's save value),
so no other call sites need adjustment.
libstdc++-v3/ChangeLog:
PR libstdc++/124851
* src/c++20/tzdb.cc (ZoneInfo::ZoneInfo(sys_info&&)): Store
stdoff only in m_offset (subtract info.save).
(ZoneInfo::ZoneInfo(const pair<sys_info, string_view>&)):
Likewise.
(ZoneInfo::offset()): Document new semantics.
(ZoneInfo::to(sys_info&)): Add m_save back to offset() when
populating sys_info::offset.
* testsuite/std/time/time_zone/numeric_save.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Álvaro Begué <alvaro.begue@gmail.com>
Martin Uecker [Tue, 12 May 2026 05:11:38 +0000 (07:11 +0200)]
c: avoid false positive for useless casts and generic [PR125261]
To reduce the number of false positives, we guard -Wuseless-cast by
c_inhibit_evaluation_warnings and also increment it for a generic
association if we have seen a prior match for a (non-default)
association. This covers the common case where the default association
comes last. If there is another association selected after we have
seen a default, we still have false positives.
PR c/125261
gcc/c/ChangeLog:
* c-parser.cc (c_parser_generic_selection): Modify logic for
c_inhibit_evaluation_warnings.
* c-typeck.cc (build_c_cast): Use c_inhibit_evaluation_warnings.
gcc/testsuite/ChangeLog:
* gcc.dg/pr125261.c: New test.
Naveen [Wed, 13 May 2026 03:30:01 +0000 (20:30 -0700)]
testsuite: Move pr123286.c to gcc.target/aarch64/
The test for PR middle-end/123286 unconditionally includes <arm_neon.h>
which is only available on AArch64 targets. Hence, move the testcase to
gcc.target/aarch64/ where the target context is implicit.
Robert Dubner [Tue, 12 May 2026 17:52:28 +0000 (13:52 -0400)]
cobol: Improved GENERIC for conditionals and comparisons.
After several years, I am finally developing some understanding of
GENERIC and how the middle-end processes it. These wide-ranging changes
improve the execution speed of conditional logic and numeric-numeric,
numeric-alpha, and alpha-alpha comparisons.
I started with refining the way GENERIC for "IF <conditional statement>"
is created, and then I moved on to numerous individual cases. Some
all-purpose routines in libgcobol.so have been broken out into special-
purpose routines implemented in GENERIC.
gcc/cobol/ChangeLog:
* Make-lang.in: Incorporate new gcc/cobol/compare.cc file.
* cobol1.cc (ATTR_CONST_NOTHROW_LEAF): Incorporate __builtin_swap16,
__builtin_swap32, __builtin_swap64, and __builtin_swap128.
(cobol_langhook_init): Likewise.
* genapi.cc (treeplet_fill_source): Improve speed.
(get_binary_value_from_float): Spelling.
(normal_normal_compare): Eliminate.
(compare_binary_binary): Eliminate.
(DEBUG_COMPARE): Eliminate.
(cobol_compare): Eliminate.
(parser_enter_file): Eliminate obsolete variables.
(data_decl_type_for): New function.
(parser_alphabet_use): Flag altered alphabets for speed.
(parser_display): Environment switch for putting comments into
the assembly language.
(program_end_stuff): Change "hijack" to "hijack_h".
(parser_division): Repair RETURN-CODE logic.
(parser_logop): Improve GENERIC for logical operations.
(parser_relop): Use new cobol_compare_relop() routine.
(parser_relop_long): Eliminate unnecessary static variable.
(inspect_tally): Improve parameter passing to library routine.
(inspect_replacing): Likewise.
(parser_intrinsic_subst): Likewise.
(parser_intrinsic_callv): Likewise.
(parser_intrinsic_call_1): Likewise.
(parser_bsearch_start): Likewise.
(parser_bsearch_when): Use new comparison routine; simplify logic.
(parser_unstring): Improve parameter passing to library routine.
(parser_string): Likewise.
(create_and_call): Repair RETURN-CODE logic.
(parser_call): Adjust exception processing when the target cannot be
found.
(build_temporaryN): Constructor for cblc_field_t::data.
(hijack_for_development): Change hijacking name to "dubner_h".
(hijacker): Change hijacking name to "hijack_h".
(get_reference_to_data): New function.
(mh_identical): Improve speed when sender and receiver have the same
structure.
(mh_source_is_literalN): Eliminate leading plus/minus when moving a
numeric to an alphanumeric.
(move_helper): Adjust logic for mh_identical and mh_source_is_group.
(actually_create_the_static_field): Use constructor for data member.
(psa_new_var_decl): Typo in comment.
(parser_symbol_add): Make the generated data type more consistent
with the COBOL variable type.
* genapi.h (parser_bsearch_when): Change declaration.
(parser_bsearch_start): Formatting.
(parser_sort): Formatting.
* gengen.cc (gg_show_type): Expand for ARRAY_TYPE and ARRAY_REF.
(gg_define_from_declaration): Use void type for DECL_EXPR.
(gg_define_volatile_variable): New function.
(gg_get_address): New function.
(gg_array_value): Use fold_convert().
(gg_bswap): New function.
(gg_memcmp): New function.
* gengen.h (SCHAR_P): New and changed declarations.
(struct gg_function_t): Add alphabet_in_use flag.
(gg_define_volatile_variable): New declaration.
(gg_get_address_of): Comment.
(gg_pointer_to_array): Comment.
(gg_get_address): New declaration.
(gg_bswap): New declaration.
(gg_memcmp): New declaration.
(gg_insert_into_assemblerf): Formatting.
* genmath.cc (arithmetic_operation): Improved handling of
parameters.
(fast_add): Improved handling of locations.
(parser_add): Formatting.
* genutil.cc (tree_type_from_digits): Correct parameter.
(get_data_offset): Correct exception handling.
(get_binary_value_tree): Improve location handling.
(tree_type_from_field): Correct logic.
(tree_type_from_size): Correct signs for returned type.
(build_array_of_treeplets): Eliminated.
(build_array_of_referlets): New function.
(build_array_of_fourplets): Eliminated.
(build_array_of_refers): New function.
(refer_is_clean): Improved logic.
(refer_is_super_clean): New function.
(refer_is_working_storage): New function.
(refer_offset): Formatting.
(binary_from_FldNumericBin5): New function.
(binary_from_FldNumericBinary): New function.
(d_and_q_num_disp): New function.
(binary_from_FldNumericDisplay): New function.
(make_dp2bin_decl): New function.
(d_and_q_packed): New function.
(binary_from_comp_3): New function.
(binary_from_comp_6): New function.
(binary_from_FldPacked): New function.
(binary_from_FldFloat): New function.
(get_binary_value): New function.
(get_location): New function.
(get_length): New function.
* genutil.h (tree_type_from_digits): New declaration.
(tree_type_from_size): Changed declaration.
(refer_is_super_clean): New declaration.
(refer_is_working_storage): New declaration.
(refer_offset): Changed declaration.
(build_array_of_treeplets): Remove declaration.
(build_array_of_referlets): New declaration.
(build_array_of_fourplets): Remove declaration.
(build_array_of_refers): New declaration.
(tree_type_from_field): New declaration.
(get_binary_value): New declaration.
(get_location): New declaration.
(get_length): New declaration.
* parse.y: Mysterious changes. All changes to YACC rules are
mysterious.
* parse_ante.h (class log_expr_t): Changes to logop() invocation.
* scan_post.h (yylex): Remove unnecessary trailing semicolon.
* structs.cc (create_cblc_file_t): Change cblc_file_t declaration.
(create_referlet_t): New function.
(create_refer_t): New function.
(create_our_type_nodes): Add cblc_referlet_type_node and
cblc_refer_type_node.
* structs.h (member2): New declaration.
(GTY): Type for cblc_referlet_type_node and cblc_refer_type_node.
* symbols.cc (temporaries_t::add): Remove unnecessary trailing
semicolon.
* symbols.h (struct cbl_bsearch_t): Remove obsolete member.
(ENABLE_HIJACKING): Compilation switch for enabling dubner_h and
hijack_h code-generation hijacking.
* util.cc (symbol_field_type_update): Comment.
* compare.cc: New file.
* compare.h: New file.
libgcobol/ChangeLog:
* charmaps.h (class charmap_t): Remove an abort().
* common-defs.h (SUPERTYPE): Pairs integers for complex switch().
(cbl_file_mode_str): Remove unnecessary trailing semicolon.
* gcobolio.h: New cblc_referlet_t and cblc_refer_t structures;
eliminate obsolete structures.
* gmath.cc (__gg__pow): Improved parameter handling.
(__gg__add_fixed_phase1): Likewise.
(__gg__addf1_fixed_phase2): Likewise.
(__gg__fixed_phase2_assign_to_c): Likewise.
(__gg__add_float_phase1): Likewise.
(__gg__addf1_float_phase2): Likewise.
(__gg__float_phase2_assign_to_c): Likewise.
(__gg__addf3): Likewise.
(__gg__subtractf1_fixed_phase2): Likewise.
(__gg__subtractf2_fixed_phase1): Likewise.
(__gg__subtractf1_float_phase2): Likewise.
(__gg__subtractf2_float_phase1): Likewise.
(__gg__subtractf3): Likewise.
(__gg__multiplyf1_phase1): Likewise.
(__gg__multiplyf1_phase2): Likewise.
(__gg__multiplyf2): Likewise.
(__gg__dividef1_phase2): Likewise.
(__gg__dividef23): Likewise.
(__gg__dividef45): Likewise.
* inspect.cc (inspect_backward_format_1): Likewise.
(__gg__inspect_format_1): Likewise.
(inspect_backward_format_2): Likewise.
(__gg__inspect_format_2): Likewise.
(__gg__inspect_format_1_sbc): Likewise.
* intrinsic.cc (kahan_summation): Likewise.
(variance): Likewise.
(__gg__concat): Likewise.
(__gg__max): Likewise.
(__gg__mean): Likewise.
(__gg__median): Likewise.
(__gg__midrange): Likewise.
(__gg__min): Likewise.
(__gg__ord_min): Likewise.
(__gg__ord_max): Likewise.
(__gg__present_value): Likewise.
(__gg__range): Likewise.
(__gg__standard_deviation): Likewise.
(__gg__sum): Likewise.
(__gg__variance): Likewise.
(__gg__substitute): Likewise.
* libgcobol.cc (__gg__resize_int_p): Eliminate.
(__gg__resize_treeplet): Eliminate.
(initialize_program_state): Eliminate the use of obsolete variables.
(format_for_display_internal): Handle FldLiteralN; display up to
38 digits for __int128.
(compare_field_class): Rename to __gg__compare_field_class.
(__gg__compare_field_class): Likewise.
(interconvert): Correct codeset correction logic.
(__gg__compare_2): Use __gg__compare_field_class.
(__gg__move): Handle FldNumericBin5 correction.
(__gg__string): Improved parameter handling.
(display_both): Cope with missing codeset parameter.
(__gg__literaln_alpha_compare): Eliminate.
(__gg__unstring): Improved parameter handling.
(__gg__just_mangle_name): Improved codeset handling.
(__gg__convert): Formatting.
(__gg__set_data_member): Eliminate.
(__gg__show_int128): New function.
(__gg__compare_string_all): New function.
(__gg__compare_string_1): New function.
(ASCII_16): Abuse of the preprocessor to create a 1024-byte string
of ASCII spaces.
(ASCII_64): Likewise.
(ASCII_256): Likewise.
(ASCII_1024): Likewise.
(EBCDIC_16): Abuse of the preprocessor to create a 1024-byte string
of EBCDIC spaces.
(EBCDIC_64): Likewise.
(EBCDIC_256): Likewise.
(EBCDIC_1024): Likewise.
(__gg__compare_string_1a): New function.
(__gg__compare_string_1e): New function.
(__gg__compare_string_2): New function.
(__gg__compare_string_2a): New function.
(__gg__compare_string_4): New function.
(__gg__compare_string_4a): New function.
(__gg_compare_string_different): New function.
(__gg__compare_numeric_all): New function.
(__gg__compare_binary_to_string): New function.
* stringbin.cc (__gg__binary_to_string_ascii): Improved algorithm.
gcc/testsuite/ChangeLog:
* cobol.dg/group2/Check_for_equality_of_COMP-1___COMP-2.cob:
Corrected logic.
* cobol.dg/group2/ENTRY_statement.cob: Expanded test.
* cobol.dg/group2/ENTRY_statement.out: Likewise.
* cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob: Automated
generation of run-time environment variable.
* cobol.dg/group2/Intrinsic_Function_ABS.cob: Corrected.
* cobol.dg/group2/RETURN-CODE_moving.cob: Requires "dialect ibm".
* cobol.dg/group2/FUNCTION_TRIM_with_NATIONAL_characters.cob: New test.
* cobol.dg/group2/FUNCTION_TRIM_with_NATIONAL_characters.out: New test.
* cobol.dg/group2/Large_PIC_10000000_.cob: New test.
* cobol.dg/group2/Large_PIC_10000000_.out: New test.
* cobol.dg/group2/Nested_PERFORM.cob: New test.
* cobol.dg/group2/Nested_PERFORM.out: New test.
* cobol.dg/group2/Overlapping_MOVE.cob: New test.
* cobol.dg/group2/Overlapping_MOVE.out: New test.
* cobol.dg/group2/PERFORM_TIMES_subscripted.cob: New test.
* cobol.dg/group2/PERFORM_TIMES_subscripted.out: New test.
* cobol.dg/group2/PERFORM_VARYING_BY_-0.2.cob: New test.
* cobol.dg/group2/PERFORM_VARYING_BY_-0.2.out: New test.
* cobol.dg/group2/REDEFINES__chained.cob: New test.
* cobol.dg/group2/REDEFINES__chained.out: New test.
* cobol.dg/group2/RETURN-CODE_with_INITIAL_and_RECURSIVE.cob: New test.
* cobol.dg/group2/RETURN-CODE_with_INITIAL_and_RECURSIVE.out: New test.
* cobol.dg/group2/Sanity_check_for_ENTRY.cob: New test.
* cobol.dg/group2/Sanity_check_for_ENTRY.out: New test.
* cobol.dg/group2/Simple_COMP-X.cob: New test.
* cobol.dg/group2/Simple_COMP-X.out: New test.
* cobol.dg/group2/compare_alpha_to_all__literal_.cob: New test.
* cobol.dg/group2/compare_alpha_to_all__literal_.out: New test.
* cobol.dg/group2/compare_national_to_display.cob: New test.
* cobol.dg/group2/compare_national_to_display.out: New test.
* cobol.dg/group2/comprensive_compare_comp-1_comp-5.cob: New test.
* cobol.dg/group2/comprensive_compare_comp-1_comp-5.out: New test.
* cobol.dg/group2/refmod_with_nested_parentheses.cob: New test.
* cobol.dg/group2/refmod_with_nested_parentheses.out: New test.
* cobol.dg/group2/signed_unsigned_compare.cob: New test.
* cobol.dg/group2/signed_unsigned_compare.out: New test.
ICE with -Winfinite-recursion due to recursive rather than work queue/list [PR124651]
As suggested the control flow in
pass_warn_recursion::find_function_exit() was changed
from a recursive to an iterative form. The logic for detecting infinite
recursion is left unchanged.
This avoids stack overflows while handling very large functions
as could be seen with the generated code attached to the PR.
Reg tested OK.
2026-04-07 Heiko Eißfeldt <heiko@hexco.de>
PR middle-end/124651
* gimple-warn-recursion.cc (find_function_exit):
Replace recursive calls with iteration for lower stack usage.
Léo Hardt [Tue, 12 May 2026 22:09:28 +0000 (19:09 -0300)]
doc: Remove unused reference to @gol macro.
The @gol texinfo macro appears to have been used as a workaround for
achieving line breaks in gccoptlists, and was removed in commit 43b72ed.
The texi2pod rule to ignore it was not cleaned up at the time.
I did not find any references to @gol in the documentation, and building
man pages with and without the deleted rule produced identical files.
contrib/ChangeLog:
* texi2pod.pl: Remove rule to parse the defunct @gol macro.
Add warnings of potentially-uninitialized padding bits
Commit 0547dbb725b reduced the number of cases in which
union padding bits are zeroed when the relevant language
standard does not strictly require it, unless gcc was
invoked with -fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to explicitly
request zeroing of padding bits.
This commit adds a closely related warning,
-Wzero-init-padding-bits=, which is intended to help
programmers to find code that might now need to be
rewritten or recompiled with
-fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to replicate
the behaviour that it had when compiled by older
versions of GCC. It can also be used to find struct
padding that was never previously guaranteed to be
zero initialized and still isn't unless GCC is
invoked with -fzero-init-padding-bits=all.
The new warning can be set to the same three states
as -fzero-init-padding-bits ('standard', 'unions'
or 'all') and has the same default value ('standard').
The two options interact as follows:
              f: standard  f: unions  f: all
w: standard        X           X         X
w: unions          U           X         X
w: all             A           S         X
X = No warnings about padding
U = Warnings about padding of unions.
S = Warnings about padding of structs.
A = Warnings about padding of structs and unions.
The level of optimisation and whether or not the
entire initializer is dropped to memory can both
affect whether warnings are produced when compiling
a given program. This is intentional, since tying
the warnings more closely to the relevant language
standard would require a very different approach
that would still be target-dependent, might impose
an unacceptable burden on programmers, and would
risk not satisfying the intended use-case (which
is closely tied to a specific optimisation).
gcc/ChangeLog:
* common.opt: Add Wzero-init-padding-bits=.
* common.opt.urls: Regenerated.
* doc/invoke.texi: Document Wzero-init-padding-bits=.
* expr.cc (categorize_ctor_elements_1): Update new struct type
ctor_completeness instead of an integer to indicate presence of
padding or missing fields in a constructor. Instead of setting -1
upon discovery of padding bits in both structs and unions,
set separate flags to indicate the type of padding bits.
(categorize_ctor_elements): Update the type and documentation of
the p_complete parameter.
(mostly_zeros_p): Use new struct type ctor_completeness when
calling categorize_ctor_elements.
(all_zeros_p): Use new struct type ctor_completeness when
calling categorize_ctor_elements.
* expr.h (struct ctor_completeness): New struct type to replace
an integer that could take the value -1 ('all fields are
initialized, but there's padding'), 0 ('fields are missing') or
1 ('all fields are initialized, and there's no padding'). Named
bool members make the code easier to understand and make room to
disambiguate struct padding bits from union padding bits.
(categorize_ctor_elements): Update the function declaration to use
the new struct type in the last parameter declaration.
* gimplify.cc (gimplify_init_constructor): Replace use of
complete_p != 0 ('all fields are initialized') with !sparse,
replace use of complete == 0 ('fields are missing') with sparse, and
replace use of complete <= 0 ('fields are missing' or 'all fields are
initialized, but there's padding') with sparse || padded_union or
padded_non_union. Trigger new warnings if storage for the object
is not zeroed but padded_union or padded_non_union is set
(because this combination implies possible non-zero padding bits).
gcc/testsuite/ChangeLog:
* gcc.dg/c23-empty-init-warn-1.c: New test.
* gcc.dg/c23-empty-init-warn-10.c: New test.
* gcc.dg/c23-empty-init-warn-11.c: New test.
* gcc.dg/c23-empty-init-warn-12.c: New test.
* gcc.dg/c23-empty-init-warn-13.c: New test.
* gcc.dg/c23-empty-init-warn-14.c: New test.
* gcc.dg/c23-empty-init-warn-15.c: New test.
* gcc.dg/c23-empty-init-warn-16.c: New test.
* gcc.dg/c23-empty-init-warn-17.c: New test.
* gcc.dg/c23-empty-init-warn-2.c: New test.
* gcc.dg/c23-empty-init-warn-3.c: New test.
* gcc.dg/c23-empty-init-warn-4.c: New test.
* gcc.dg/c23-empty-init-warn-5.c: New test.
* gcc.dg/c23-empty-init-warn-6.c: New test.
* gcc.dg/c23-empty-init-warn-7.c: New test.
* gcc.dg/c23-empty-init-warn-8.c: New test.
* gcc.dg/c23-empty-init-warn-9.c: New test.
* gcc.dg/gnu11-empty-init-warn-1.c: New test.
* gcc.dg/gnu11-empty-init-warn-10.c: New test.
* gcc.dg/gnu11-empty-init-warn-11.c: New test.
* gcc.dg/gnu11-empty-init-warn-12.c: New test.
* gcc.dg/gnu11-empty-init-warn-13.c: New test.
* gcc.dg/gnu11-empty-init-warn-14.c: New test.
* gcc.dg/gnu11-empty-init-warn-15.c: New test.
* gcc.dg/gnu11-empty-init-warn-16.c: New test.
* gcc.dg/gnu11-empty-init-warn-17.c: New test.
* gcc.dg/gnu11-empty-init-warn-2.c: New test.
* gcc.dg/gnu11-empty-init-warn-3.c: New test.
* gcc.dg/gnu11-empty-init-warn-4.c: New test.
* gcc.dg/gnu11-empty-init-warn-5.c: New test.
* gcc.dg/gnu11-empty-init-warn-6.c: New test.
* gcc.dg/gnu11-empty-init-warn-7.c: New test.
* gcc.dg/gnu11-empty-init-warn-8.c: New test.
* gcc.dg/gnu11-empty-init-warn-9.c: New test.
Marek Polacek [Mon, 11 May 2026 21:19:42 +0000 (17:19 -0400)]
c++: deferred parsing of default arguments [PR50479]
In
void fn (int i = sizeof (i)) {}
the i in sizeof should refer to the parameter ([basic.scope.pdecl]) and
since the second i is not evaluated, the code is valid per
[dcl.fct.default]/9: A parameter shall not appear as a potentially
evaluated expression in a default argument.
This patch fixes this by moving the grokdeclarator call from
_parameter_declaration_list to _parameter_declaration and maybe calling
pushdecl before parsing the default argument.
PR c++/50479
PR c++/62244
gcc/cp/ChangeLog:
* parser.cc (cp_parser_parameter_declaration_list): Move the
grokdeclarator call and setting DECL_SOURCE_LOCATION to...
(cp_parser_parameter_declaration): ...here. New tree parameter.
Set it. Call pushdecl for a named decl with a default argument.
gcc/testsuite/ChangeLog:
* g++.dg/reflect/parm1.C: Uncomment code.
* g++.dg/parse/defarg22.C: New test.
* g++.dg/parse/defarg23.C: New test.
* g++.dg/parse/defarg24.C: New test.
* g++.dg/parse/defarg25.C: New test.
* g++.dg/parse/defarg26.C: New test.
Jonathan Wakely [Tue, 28 Apr 2026 12:15:22 +0000 (13:15 +0100)]
libstdc++: Improve handling of leap second expiry time [PR123165]
This change allows the hardcoded list of leap seconds in <chrono> to be
used even when the program is executing after the hardcoded expiry date
in that header.
For times after the hardcoded expiry, the inline __get_leap_second_info
function calls a new library function which compares the number of
hardcoded leap seconds in the header with the number of leap seconds
defined in the tzdata leapseconds file (usually provided by the OS).
There are three leap second lists that are relevant here. The first is
the hardcoded list (and its expiry time) in the <chrono> header. That is
fixed when the user code is compiled, and might be out of date by the
time the application runs. The second list (and its expiry time) is
hardcoded in tzdb.cc in the libstdc++.so library. If the application
uses a newer libstdc++.so at runtime than the <chrono> header used at
compile time, we can still avoid going to the filesystem for dates
within the expiry time of the list in libstdc++.so. The third list is
the most up-to-date one which is read from the leapseconds file.
The code added by this commit tries to avoid reading the file when
possible (because that's slower than using the in-memory lists), and if
it does have to go to the file system, it tries to avoid doing so again
next time the leap seconds are needed.
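The order in which the three lists are consulted can be sketched as follows. This is a simplified Python model, not the libstdc++ code: the class, its attributes and the integer timestamps are hypothetical, and it leaves out the comparison of leap second counts that the real function performs.

```python
# Sketch only: models the order in which the three lists are consulted.
# The class and attribute names are hypothetical, not libstdc++ names.
class LeapSecondLookup:
    def __init__(self, header_expiry, library_expiry, read_file):
        self.header_expiry = header_expiry    # fixed when user code is compiled
        self.library_expiry = library_expiry  # fixed when libstdc++.so is built
        self.read_file = read_file            # slow path: parses the leapseconds file
        self.file_data = None                 # cached result of the slow path
        self.file_reads = 0                   # how often the slow path was taken

    def source_for(self, t):
        """Return which list answers a query for time t."""
        if t < self.header_expiry:
            return "header"
        if t < self.library_expiry:
            return "library"
        if self.file_data is None:
            # Only go to the file system once; later queries reuse the cache.
            self.file_data = self.read_file()
            self.file_reads += 1
        return "file"


lookup = LeapSecondLookup(header_expiry=100, library_expiry=200,
                          read_file=lambda: ["leap seconds from file"])
print([lookup.source_for(t) for t in (50, 150, 250, 260)], lookup.file_reads)
```

The two queries past both expiry times share a single file read, which is the behaviour the last paragraph above aims for.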
libstdc++-v3/ChangeLog:
PR libstdc++/123165
* acinclude.m4 (libtool_VERSION): Bump version.
* config/abi/pre/gnu.ver (GLIBCXX_3.4.36): Add new symbol
version and export new symbol.
* configure: Regenerate.
* include/std/chrono (__detail::__recent_leap_second_info):
Declare new function and make it a friend of various classes.
(leap_second): Make private constructor constexpr. Remove friend
declaration for get_leap_second_info.
(__detail::__get_leap_second_info): Use new function for times
past the hardcoded expiry.
* src/c++20/tzdb.cc (tzdb_list::_Node::fixed_leaps): Move array
of leap seconds here from _S_read_leap_seconds.
(fixed_expiry, num_leap_seconds): New globals.
(__detail::__recent_leap_second_info): Define new function.
(tzdb_list::_Node::_S_read_leap_seconds): Populate vector from
_Node::fixed_leaps. Rename bool variable to clarify meaning.
(tzdb_list::_Node::_S_replace_head): Update num_leap_seconds
when updating the tzdb_list.
* testsuite/util/testsuite_abi.cc: Update known_versions and
latestp.
* testsuite/std/time/clock/utc/leap_second_info-2.cc: New test.
Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Martin Jambor [Tue, 12 May 2026 12:59:13 +0000 (14:59 +0200)]
sra: Fix build_user_friendly_ref_for_offset for bit-fields (PR124151)
When SRA propagates bit-field accesses across assignments, it
first attempts to use build_user_friendly_ref_for_offset to represent
the expression of the new accesses and a possible scalar replacement
so that if there are any warnings generated for it, they are as nice
as we can make them.
However, this can lead to situations where, although the new
access has exactly the same type as the old one, it accesses a
(record or union) field which is just big enough for its precision,
whereas the one we want to match has size rounded up to bytes. This
causes discrepancy between the recorded size of the new access and the
size get_ref_base_and_extent reports for its expr, which trips the
verifier.
Unlike the previous approach which avoided propagation in the case of
bit fields, this patch fixes build_user_friendly_ref_for_offset by
making it also track the size it is looking at and the size it is
looking for so that it can declare success only if these two also
match. Additionally, it reverts the simple bail-out fix for PR 117217
because it is no longer necessary. (I have verified the bug is still
fixed though by applying the new fix on top of the last problematic
commit.)
gcc/ChangeLog:
2026-04-29 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/124151
* tree-sra.cc (build_user_friendly_ref_for_offset): Added parameters
CUR_SIZE and EXP_SIZE. Added code passing the correct CUR_SIZE and
checking it against EXP_SIZE. Removed unused code for the case when
EXP_TYPE was NULL_TREE.
(create_artificial_child_access): Adjusted the call to
build_user_friendly_ref_for_offset.
(propagate_subaccesses_from_rhs): Likewise.
(propagate_subaccesses_from_rhs): Removed a check that the size of
lchild is a multiple of BITS_PER_UNIT.
(propagate_subaccesses_from_lhs): Likewise.
gcc/testsuite/ChangeLog:
2026-04-29 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/124151
* gcc.dg/tree-ssa/pr124151.c: New test.
Yoshinori Sato [Tue, 12 May 2026 11:39:07 +0000 (12:39 +0100)]
RX: Fix infinite-loop on LRA [PR113948]
The LRA was confused and looping due to the definition of the register class.
The LRA yielded incorrect results because the manipulation of stack frames
and the movement of double words relied heavily on existing reloads.
These changes ensure that the correct code is generated even when the "-mlra"
option is specified.
gcc/ChangeLog:
PR target/113948
* config/rx/rx-protos.h (rx_split_double_move): New helper prototype.
(rx_relax_double_operands): Likewise.
* config/rx/rx.cc (rx_legitimize_address): Add expand complex case.
(rx_is_legitimate_address): Add double word case.
(rx_gen_move_template): Fix operation size in unsigned extend.
(rx_gen_move_template): Remove DImode and DFmode.
(rx_get_stack_layout): Fix for frame size calculation.
(rx_initial_elimination_offset): The calculation method has been
changed to one that supports LRA.
(rx_hard_regno_nregs): Use CEIL.
(rx_hard_regno_mode_ok): Add ATTRIBUTE_UNUSED.
(rx_get_subword): New. Double word move helper.
(rx_split_double_move): Likewise.
(rx_relax_double_operands): Likewise.
* config/rx/rx.h (reg_class): Add CC for all registers.
(CLASS_MAX_NREGS): Remove.
* config/rx/rx.md (mov<register_modes:mode>):
Replace copy_to_mode_reg with force_reg.
(movdi): Limit the arguments to make register allocation easier.
(movdf): Likewise.
(movdi_internal): New.
(movdf_internal): New.
(addsi3_pid): New. Handling UNSPEC_PID_ADDR.
(addsi3_lra): New. alternative addptrsi3.
(ashlsi3_lra): Likewise.
Signed-off-by: Yoshinori Sato <yoshinori.sato@nifty.com>
Lili Cui [Tue, 12 May 2026 17:01:00 +0000 (10:01 -0700)]
[PATCH 2/2] tree-optimization/vect: Allow single-lane SLP fallback when limit is exhausted
In vect_analyze_slp_reduction, the early bail "if (*limit == 0) return
false" blocked all SLP discovery including the single-lane fallback path.
However, single-lane SLP trees (group_size == 1) do not consume the
discovery limit as they cannot cause exponential tree growth.
This causes vectorization failures in loops with many independent
conditional reductions: multi-lane grouping attempts exhaust the limit,
then the single-lane fallback that would have succeeded is incorrectly
rejected.
The fix moves the limit check to only guard chain analysis (which builds
multi-lane trees and does consume limit), allowing the single-lane
fallback to always proceed.
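The change in control flow can be illustrated with a toy model. The Python below is only a sketch under simplifying assumptions: the function names and boolean parameters are hypothetical, and the real vect_analyze_slp_reduction does far more than this.

```python
# Toy model of the control flow before and after the fix; not vectorizer code.
# chain_succeeds / single_lane_succeeds stand in for the real analyses.
def analyze_reduction_old(limit, chain_succeeds, single_lane_succeeds):
    if limit == 0:
        return None                      # early bail: no fallback is tried
    if chain_succeeds:
        return "chain"
    return "single-lane" if single_lane_succeeds else None


def analyze_reduction_new(limit, chain_succeeds, single_lane_succeeds):
    if limit != 0 and chain_succeeds:    # the limit guards chain analysis only
        return "chain"
    return "single-lane" if single_lane_succeeds else None


# With the limit exhausted, only the new version reaches the fallback.
print(analyze_reduction_old(0, False, True), analyze_reduction_new(0, False, True))
```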
This improves 731.astcenc_r (-Ofast) by 3.8% on EMR and 1.4% on Znver5 with single-copy.
gcc/ChangeLog:
* tree-vect-slp.cc (vect_analyze_slp_reduction): Don't bail out
early when SLP discovery limit is exhausted; only guard the chain
analysis which may build multi-lane trees. Single-lane fallback
does not consume limit and should always be attempted.
Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
Lili Cui [Tue, 12 May 2026 17:00:00 +0000 (10:00 -0700)]
[PATCH 1/2] tree-optimization/vect: Allow commutative operand swap for IFN in SLP reduction
In vect_build_slp_tree_1, when checking whether reduction operands at
different positions can be swapped, only tree_code operations (e.g.
PLUS_EXPR) were recognized as commutative. Internal functions produced
by if-conversion (e.g. .COND_ADD, .COND_MUL) were not handled, causing
"different reduc_idx" failures when the reduction operand appeared at
different commutative positions across SLP lanes.
This patch extends the commutative swap recognition to internal
functions using the unified first_commutative_argument(code_helper, tree)
interface to identify the swappable operand pair for both tree codes and
internal functions.
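As a rough illustration of the idea, the check can be modelled as a lookup of the first commutative operand followed by a test that both reduction indices fall in that operand pair. The Python below is a sketch, not the GCC code: the table contents, the operand positions assumed for .COND_ADD and .COND_MUL (mask first, then the two commutative operands) and the helper name are assumptions.

```python
# Sketch: index of the first operand of the commutative pair, or absent.
# The entries and operand positions are assumptions for illustration.
FIRST_COMMUTATIVE_ARG = {
    "PLUS_EXPR": 0,
    "MULT_EXPR": 0,
    ".COND_ADD": 1,   # assumed layout: (mask, op0, op1, else)
    ".COND_MUL": 1,
}


def reduc_idx_compatible(code, idx_a, idx_b):
    """True if two lanes whose reduction operand sits at idx_a and idx_b
    can be matched, possibly after swapping commutative operands."""
    if idx_a == idx_b:
        return True
    first = FIRST_COMMUTATIVE_ARG.get(code, -1)
    if first < 0:
        return False                      # not commutative: no swap possible
    pair = {first, first + 1}
    return idx_a in pair and idx_b in pair


print(reduc_idx_compatible(".COND_ADD", 1, 2))
```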
This improves 731.astcenc_r (-Ofast) by 7.1% on EMR and 2.48% on Znver5.
gcc/ChangeLog:
* tree-vect-slp.cc (vect_build_slp_tree_1): Use unified
first_commutative_argument interface to allow commutative
operand swap for both tree codes and internal functions
(e.g. .COND_ADD) in SLP reduction matching.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-reduc-15.c: New test.
Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
The precondition for (view_convert (BIT_FIELD_REF)) simplification at
match.pd:5881 last fixed in r16-4735-g44c27171c36a91 is still wrong,
as it allows a vector type to be converted to/from _BitInt types (for
which precision can be smaller than size). Address this by always
checking type_has_mode_precision_p () for all integer types. (This fix
was posted by Richard B. at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125259#c5.)
Add a testcase as reduced in the PR as tree-ssa/pr125259.c.
Bootstrapped and regtested on aarch64, arm, and x86_64.
OK for trunk and 16?
PR middle-end/125259
gcc/ChangeLog:
* match.pd: Fix the view_convert (BIT_FIELD_REF) pattern.
Alexandre Oliva [Mon, 11 May 2026 06:47:03 +0000 (03:47 -0300)]
vxworks: support aarch64 errata
I noticed that aarch64 errata required specs tweaks that were missing
on vxworks. To my surprise, adding those specs enabled even the
linker tests to pass, despite using vxlink and not always performing
final links. Ok, then, reverting the skip directives added not too
long ago.
Alexandre Oliva [Mon, 11 May 2026 06:46:46 +0000 (03:46 -0300)]
libstdc++: vxworks: enable clock_gettime
We've been using gettimeofday() on VxWorks, because configury didn't
state clock_gettime was available. It has been since at least
vxworks6.9, possibly earlier. Indeed, it's been available for longer
than gettimeofday(), so this enables libstdc++'s chrono.cc to work
with earlier 6.9 releases.
We've used clock_gettime unconditionally in __gthread_cond_timedwait
for a very long time, so it's not like this brings in a new
dependency, but it allows clocks and deadlines to work with the same
precision. Before this change, we'd use gettimeofday's coarser
clocks, and finer timed waits, which makes room for imprecisions.
Alexandre Oliva [Tue, 12 May 2026 06:29:10 +0000 (03:29 -0300)]
testsuite: riscv: reset -march for tests with -mcpu
The tests fail when --target_board sets -march to a 32-bit
architecture. Override that -march by resetting it, so that the arch
implied by -mcpu prevails.
Naveen [Tue, 12 May 2026 04:08:11 +0000 (21:08 -0700)]
tree-optimization: Fold SAT_ADD at gimple level
Extend scalar SAT_ADD constant folding to recognize cases where one operand is
zero. It allows SAT_ADD expressions with constant operands to fold away early.
The change improves optimization opportunities and avoids emitting unnecessary
SAT_ADD operations.
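The identity behind the fold can be checked with a small model. The Python below assumes an unsigned saturating add on a fixed bit width; the helper name and the 8-bit default are illustrative choices, not taken from the patch.

```python
# Model of an unsigned saturating add; illustrative, not GCC code.
def sat_add_u(x, y, bits=8):
    """Add two unsigned values, clamping at the maximum for the width."""
    max_value = (1 << bits) - 1
    total = x + y
    return max_value if total > max_value else total


# x SAT_ADD 0 == x holds for every 8-bit x, so the operation can fold away.
assert all(sat_add_u(x, 0) == x for x in range(256))
print(sat_add_u(250, 10), sat_add_u(1, 2))
```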
Bootstrapped and tested on aarch64-linux-gnu.
PR middle-end/123286
gcc/ChangeLog:
* fold-const-call.cc (fold_internal_fn_sat_add): New function.
(fold_const_call): Handle CFN_SAT_ADD.
* match.pd: Add simplifications for x SAT_ADD 0 == x.
* genmatch.cc (commutative_op): Add CFN_SAT_ADD.
gcc/testsuite/ChangeLog:
* gcc.dg/pr123286.c: New test.
Naveen [Tue, 12 May 2026 04:04:38 +0000 (21:04 -0700)]
[Aarch64]: Make aarch64_output_simd_mov_imm_low return const char *
aarch64_output_simd_mov_imm_low emits the instruction directly using
output_asm_insn and returns an empty string to indicate that no further
template needs to be emitted. Since the returned empty string is a string
literal, the function should return const char * rather than char *.
This fixes a bootstrap failure with -Werror=write-strings.
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (aarch64_output_simd_mov_imm_low):
Change return type to const char *.
* config/aarch64/aarch64.cc (aarch64_output_simd_mov_imm_low): Likewise.
Marek Polacek [Thu, 7 May 2026 20:38:34 +0000 (16:38 -0400)]
c++/reflection: fixes for comparing reflections [PR125208]
This fixes two bugs:
1) crash in cp_tree_equal when comparing reflections with binfos;
cp_tree_equal doesn't handle those. We're coming from
lookup_template_class -> spec_hasher::equal -> comp_template_args
-> cp_tree_equal. We should use compare_reflections in cp_tree_equal.
2) the fix for 1) revealed that compare_reflections is buggy when
comparing two aliases: we shouldn't fall back to same_type_p
because given
using A = int;
using B = int;
^^A != ^^B should hold.
PR c++/125208
gcc/cp/ChangeLog:
* reflect.cc (compare_reflections): Use == when comparing two
aliases.
* tree.cc (cp_tree_equal) <case REFLECT_EXPR>: Use
compare_reflections.
gcc/testsuite/ChangeLog:
* g++.dg/reflect/alias3.C: New test.
* g++.dg/reflect/bases_of5.C: New test.
Andrew Pinski [Mon, 11 May 2026 21:34:58 +0000 (14:34 -0700)]
contrib: Fix check_GNU_style.py for some .opt issues [PR125275]
I noticed while reviewing a patch check_GNU_style.py would fail
for the .opt files in a few ways. First greater than 80 columns
is expected. Second is the space after the "function".
Neither of these is useful for .opt files, so let's ignore them here.
PR other/125275
contrib/ChangeLog:
* check_GNU_style_lib.py (LineLengthCheck.check): Ignore
filenames that end with .opt.
(FunctionParenthesisCheck.check): Likewise.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jason Merrill [Mon, 11 May 2026 15:55:06 +0000 (11:55 -0400)]
c-family: look through non-user-facing typedef [PR124621]
'aka' printing shouldn't differ between in-tree and installed compilers due
to difference in whether libstdc++ headers are considered to be system
headers. The problem was that if we encounter a non-user-facing typedef
like __iter_type<T>, we stopped there instead of continuing to look through
typedefs to find the underlying user type.
Jerry DeLisle [Sat, 9 May 2026 18:49:21 +0000 (11:49 -0700)]
fortran: Add -fcoarray=shared option to auto-link -lcaf_shmem
The new -fcoarray=shared option provides a convenient shorthand for
the common invocation -fcoarray=lib -lcaf_shmem. The driver transforms
-fcoarray=shared into -fcoarray=lib for the frontend and automatically
appends -lcaf_shmem to the link command. Existing uses of -fcoarray=lib
are unaffected.
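The driver transformation described above can be sketched as a rewrite of the argument list. The Python below is a simplified model, not gfortranspec.cc: the function name is hypothetical, and treating -c, -S and -E as the only signs that no link step happens is an assumption.

```python
# Sketch of the two-pass rewrite; not the actual driver code.
def rewrite_args(argv):
    # First pass: detect the new option.
    need_caf_shmem = "-fcoarray=shared" in argv
    # Assumption: linking is active unless one of these options is given.
    linking = not any(arg in ("-c", "-S", "-E") for arg in argv)
    # Second pass: hand -fcoarray=lib to the front end instead.
    out = ["-fcoarray=lib" if arg == "-fcoarray=shared" else arg for arg in argv]
    if need_caf_shmem and linking:
        out.append("-lcaf_shmem")
    return out


print(rewrite_args(["gfortran", "-fcoarray=shared", "a.f90"]))
```

An invocation that already uses -fcoarray=lib passes through unchanged, matching the statement that existing uses are unaffected.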
* lang.opt (fcoarray=): Add shared enum value; update help text.
* gfortranspec.cc (CAF_SHMEM_LIBRARY): New macro.
(lang_specific_driver): Detect -fcoarray=shared in first pass and
set need_caf_shmem flag. In second pass, transform -fcoarray=shared
to -fcoarray=lib for cc1. Append -lcaf_shmem when need_caf_shmem
is set and linking is active.
xtensa: Assert the results of several validate_change() calls
I regret that I've been using validate_change() thoughtlessly until now,
and should at least verify whether the RTX changes were successful.
I hope this patch doesn't cause any problems, but if it does, I will take
further action.
gcc/ChangeLog:
* config/xtensa/xtensa.cc
(FPreg_neg_scaled_simm12b, convert_SF_const, constantsynth_pass1,
litpool_set_src_1, litpool_set_src):
Change each call to validate_change() and apply_change_group() to
trigger an ICE if the result is not true.
* config/xtensa/xtensa.cc
(xtensa_can_eliminate_callee_saved_reg_p):
Change the arguments that return values from being passed by pointer
to C++ reference.
(xtensa_expand_prologue):
Adjust the call to xtensa_can_eliminate_callee_saved_reg_p() to
match the above changes.
Since the commit of the preceding patch for IRA, "ira: Scale save/restore
costs of callee save registers with block frequency" (3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b),
the Xtensa ISA has shown a tendency to allocate slightly more function stack
frames, especially when using the CALL0 ABI.
/* example */
extern void foo(void);
int test(int a) {
int array[252];
foo();
asm volatile (""::"m"(array));
return a;
}
Pan Li [Fri, 8 May 2026 01:41:01 +0000 (09:41 +0800)]
Match: Move saturation alu patterns into match-sat-alu.pd [NFC]
Given there are lots of sat alu patterns in match.pd, and
there will be more in the near future, move all of them into
a separate file for better organization.
The test suites below passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.
gcc/ChangeLog:
* Makefile.in: Add match-sat-alu.pd as dependency.
* match.pd: Remove saturation alu related patterns.
* match-sat-alu.pd: Add new file for saturation alu patterns.
Oleg Tolmatcev [Wed, 6 May 2026 17:20:31 +0000 (19:20 +0200)]
i386: Keep SEH enabled for Win64 sysv_abi functions
TARGET_SEH currently depends on TARGET_64BIT_MS_ABI. For Win64
functions compiled with SysV ABI, that makes the SEH hooks drop out
even though exception handling still emits .seh_handlerdata. The
result is assembly with .seh_handlerdata outside any .seh_proc block.
Keep TARGET_SEH enabled for all 64-bit Windows functions when unwind
tables are requested so SEH prologue and handler state remains
consistent across ABI variants.
Add a regression test covering a SysV ABI wrapper trampoline that
references a throwing callee on x86_64 mingw.
gcc/ChangeLog:
* config/i386/cygming.h (TARGET_SEH): Keep SEH enabled for all
64-bit Windows functions when unwind tables are requested.
gcc/testsuite/ChangeLog:
* g++.target/i386/seh-sysvabi-wrap1.C: New test.
Signed-off-by: oltolm <oleg.tolmatcev@gmail.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
Richard Biener [Mon, 11 May 2026 08:25:57 +0000 (10:25 +0200)]
tree-optimization/125250 - LIM speculating not noop load/store
The following avoids speculating a load/store pair for modes
that cannot transfer bits or, as for the testcase, bitfield
loads that are either value changing or invoke UB when out-of-bound
(and that we'd rewrite to be defined with explicit truncation).
PR tree-optimization/125250
* tree-ssa-loop-im.cc (execute_sm): For modes that cannot
transfer bits, _Bool and bitfield accesses force the
multi-threaded model.
LIU Hao [Fri, 8 May 2026 13:02:23 +0000 (21:02 +0800)]
mingw: Ensure symbols are quoted in Intel syntax
Previously, this code
extern int shl;
int get_shl(void) { return shl; }
gave errors like
$ x86_64-w64-mingw32-gcc -masm=intel test.c
ccUSyr0f.s: Assembler messages:
ccUSyr0f.s:24: Error: invalid use of operator "shl"
because it contained
.refptr.shl:
.quad shl
This `shl` should have referenced the symbol, but it appeared in an expression
context, where, in Intel syntax, it got interpreted as the shift-left operator.
This commit fixes the issue by emitting the target symbol with
`ASM_OUTPUT_LABELREF`, which will quote it properly with regard to the output
assembler syntax.
PR target/53929
gcc/ChangeLog:
* config/mingw/winnt.cc (mingw_pe_file_end): Use `ASM_OUTPUT_LABELREF`
to emit `name`.
Signed-off-by: LIU Hao <lh_mouse@126.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
Tomasz Kamiński [Wed, 6 May 2026 13:22:06 +0000 (15:22 +0200)]
libstdc++: Reorder compile-time checks for __formatter_str::_M_format_range.
If _M_format_range was called with a prvalue of span S (or any contiguous_range),
the previous chain of if-constexpr would call _M_format_range<S&>, then
_M_format_range<const S&> and then format(string_view). By checking for
contiguous_range first, it calls format(string_view) directly, removing
unnecessary instantiations and symbols.
Similarly, for all prvalues of type R that meet __simply_formattable_range R,
we were instantiating _M_format_range<R&> and then _M_format_range<const R&>.
By moving the if for __simply_formattable_range before is_lvalue_reference_v,
we call _M_format_range<const R&> directly.
libstdc++-v3/ChangeLog:
* include/std/format (__formatter_str::_M_format_range):
Reorder constexpr checks, to reduce number of instantiations.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Biener [Wed, 6 May 2026 11:43:27 +0000 (13:43 +0200)]
adjust OMP SIMD call cost
The following adds special handling to OMP SIMD vector call costs
which were not costed at all and for which a single simple vector
stmt isn't appropriate. PR125174 shows that even when AVX imposes
more overhead (from also slightly bogus costing) than SSE, when
there's two OMP SIMD calls involved doing less of those should trump
that.
PR target/125174
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
Cost calls as 10 times FMA.
combine: Reject any resulting insns using hard reg constraints [PR121426]
This fixes
t.c:6:1: error: unable to find a register to spill
6 | }
| ^
for target avr. In the PR we are given a patch which makes use of hard
register constraints in the machine description for divmodhi4. Prior
to combine we have for the test from the PR
The patched instruction divmodhi4 constrains operand 2 (here pseudo
48) to hard register 22. Combine merges insn 7 into 9 by crossing a
hard register assignment of register 22.
This leaves us with a conflict for pseudo 48 in the updated insn 9 since
register 22 is live here.
Fixed by pulling the sledge hammer and rejecting any resulting insn
which makes use of hard register constraints. Ideally we would skip
based on the fact whether a combination crosses a hard register
assignment and the corresponding hard register is also referred by a
single register constraint of the resulting insn.
PR rtl-optimization/121426
gcc/ChangeLog:
* combine.cc (recog_for_combine_1): Reject insns which make use
of hard register constraints.
Naveen [Mon, 11 May 2026 05:45:41 +0000 (22:45 -0700)]
[PATCH] [Aarch64]: Use fmov for some low-lane FP SIMD constant vectors
Extend AdvSIMD constant materialization to recognize vectors where only
the low element is a representable floating-point constant and all other
elements are zero.
Bootstrapped and tested on aarch64-linux-gnu.
PR target/113856
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h
(aarch64_output_simd_mov_imm_low): New.
(aarch64_const_vec_fmov_p): New.
* config/aarch64/aarch64-simd.md (mov<mode>): Do not expand constant
vectors handled by aarch64_const_vec_fmov_p into VDUP.
(*aarch64_simd_mov<VDMOV:mode>): Add Dc alternatives for FMOV based
SIMD constant moves.
(*aarch64_simd_mov<VQMOV:mode>): Likewise.
* config/aarch64/aarch64.cc (aarch64_const_vec_fmov_p): New function.
(aarch64_output_simd_mov_imm_low): New function.
* config/aarch64/constraints.md (Dc): New constraint.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr113856.c: New test.
[PATCH] combine: Check against CLOBBER in make_compound_operation_int [PR125209]
In combine_simplify_rtx, CLOBBER can be returned, which is propagated to
make_compound_operation. When make_compound_operation_int then calls
simplify_subreg, it triggers a gcc_assert, because the mode is neither inner,
nor void. This results in an ICE.
We fix this by checking if we got CLOBBER before calling simplify_subreg from
make_compound_operation and bail out; we return NULL_RTX.
Testcase from the bug report by Zhendong Su.
Bootstrapped and tested on x86_64-linux-gnu.
PR rtl-optimization/125209
gcc/
* combine.cc (make_compound_operation_int): Return NULL_RTX if we got CLOBBER.
gcc/testsuite/
* gcc.dg/pr125209.c: New test.
Signed-off-by: Boudewijn van der Heide <boudewijn@delta-utec.com>
The first operand of the IOR clears the low bit of the source register leaving
everything else unchanged. The second operand of the IOR clears everything but
the low bit and flips the low bit. When we IOR those together we get the
original value with the lowest bit flipped. The key is to realize we have the
same pseudo in both arms and there are no bits in common for the constants. So
this works for an arbitrary bit(s) as long as the constants have the right
form.
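The identity can be verified exhaustively for a small width. The Python check below is a sketch: it models both IOR operands on 8-bit values with a mask constant c, and the function name is hypothetical.

```python
# Exhaustive 8-bit check of the identity; illustrative, not GCC code.
def ior_form(x, c):
    """(x with the bits of c cleared) IOR (the bits of c taken from x, flipped)."""
    keep_other_bits = x & ~c          # first IOR operand: clears the bits in c
    flipped_bits = (x ^ c) & c        # second IOR operand: only c's bits, flipped
    return keep_other_bits | flipped_bits


# The two constants share no bits, so the IOR is the same as x XOR c.
assert all(ior_form(x, c) == x ^ c for x in range(256) for c in range(256))
print(ior_form(0b1010, 1), 0b1010 ^ 1)
```

The check runs over every mask c, not just the low bit, which matches the remark that the transformation works for arbitrary bits as long as the constants have the right form.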
That gets us good code on riscv and almost certainly helps other targets.
There is another form which shows up on the H8 and possibly other targets
for sub-word arithmetic. op0 and op1 are respectively:
Note we're in QImode. op1 just flips the highest QImode bit. If there are
carry-outs, we don't really care about them. The net is we can capture that
case on the H8 by verifying this form flips the highest bit for the given mode.
Otherwise the carry-outs are relevant and our transformation is incorrect.
Plan is to commit Friday. While it has been tested with the usual bootstraps
as well as testing on various cross platforms, I'm more comfortable giving
folks time to take a looksie to see if Shreya or I missed anything critical.
For the testcase in question before/afters look like this:
PR rtl-optimization/80770
gcc/
* rtl.h (simplify_context::simplify_ior_with_common_term): Add
new method.
(simplify_context::simplify_binary_operation_1): Use new method.
* simplify-rtx.cc (simplify_context::simplify_ior_with_common_term):
New method.
gcc/testsuite/
* gcc.target/riscv/pr80770.c: New test.
* gcc.target/riscv/pr80770-2.c: New test.
* gcc.target/h8300/pr80770.c: New test.
* gcc.target/h8300/pr80770-2.c: New test.
Co-authored-by: Jeff Law <jeffrey.law@oss.qualcomm.com>
Eric Botcazou [Sun, 10 May 2026 10:51:44 +0000 (12:51 +0200)]
Ada: Fix Image for derived enumeration type with representation clause
The problem is that Expand_Image_Attribute incorrectly fetches the root type
for enumeration types, thus bypassing a clause present on the derived type.
The fix is to change the two fields Lit_Indexes and Lit_Strings defined for
enumeration types and subtypes to be formally present on root types only, as
well as to make Expand_Image_Attribute stick to base types.
gcc/ada/
PR ada/125240
* gen_il-gen-gen_entities.adb (Enumeration_Kind): Make
Lit_Indexes and Lit_Strings be defined for root types only.
* einfo.ads (Lit_Hash): Adjust description.
(Lit_Indexes): Likewise.
(Lit_Strings): Likewise.
(E_Enumeration_Type): Likewise.
* exp_imgv.adb (Expand_Image_Attribute): Do not fetch the root type
for enumeration types, except for character types, and adjust.
PR target/125238
* config/i386/i386-features.cc (ix86_broadcast_inner): Set kind
to X86_CSE_CONST_VECTOR if the vector load can be converted to
constant integer load.
gcc/testsuite/
PR target/125238
* gcc.target/i386/pr125238.c: New test.
Paul Thomas [Fri, 8 May 2026 05:34:21 +0000 (06:34 +0100)]
Fortran: Allow access to coarray elements within modules. [PR125051]
The parts of this patch that fix the problem are chunks 2 and 3. Chunk 3 prevents
gfc_conv_intrinsic_caf_get from working in the module namespace, when the array
symbol is in a module. Equally, though, gfc_current_ns is not necessarily in
the referencing procedure namespace. The second chunk makes sure that this is
the case. As an aside, it seems to us that it makes considerably more sense that
gfc_current_ns be that of the current procedure. The first chunk makes sure that
result symbol initialization does not occur outside the function.
Passes regtesting with FC44/x86_64.
2026-05-10 Andre Vehreschild <vehre@gcc.gnu.org>
Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/125051
* trans-decl.cc (gfc_get_symbol_decl): gfc_defer_symbol_init
must not be called for PDT types, classes or types with PDT
(gfc_generate_function_code): If gfc_current_ns is not the same
as the function namespace, stash it, change it to the function
namespace and restore after translation of the code.
* trans-intrinsic.cc (gfc_conv_intrinsic_caf_get): If the array
is in a module, use the symbol namespace.
* trans-openmp.cc (gfc_trans_omp_array_reduction_or_udr): If the
current namespace is not that of the procedure, change to the
procedure namespace and revert on leaving this function.
gcc/testsuite/
PR fortran/125051
* gfortran.dg/coarray/pr125051.f90: New test.
riscv_expand_epilogue will skip emitting sspopchk when cm.popret is
emitted. After this patch we will no longer emit cm.popret and instead
use cm.pop + sspopchk + a regular return:
sspush ra
cm.push {ra, s0-s1}, -32
..
cm.pop {ra, s0-s1}, 32
sspopchk ra
jr ra
Andrew Pinski [Sat, 9 May 2026 01:32:50 +0000 (18:32 -0700)]
cfghooks: constify cfg_hooks [PR117871]
This constifies the hooks so the structs can't be changed at runtime.
The only place that needs special handling is sel-sched, because it
makes a copy of the current cfghooks and then installs its own hooks.
That changes slightly: a copy is still used, but instead of copying
back into the current cfghooks, we just change the pointer back to
the original one. This code is only enabled by default by the ia64
backend, so I doubt it will change.
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end/117871
gcc/ChangeLog:
* cfghooks.cc (cfg_hooks): Change the type
to be a pointer to a const struct cfg_hooks.
(get_cfg_hooks): Return the current pointer
rather than the struct.
(set_cfg_hooks): Change the argument type and
set the cfg_hooks directly to it.
* cfghooks.h (gimple_cfg_hooks): Constify.
(rtl_cfg_hooks): Likewise.
(cfg_layout_rtl_cfg_hooks): Likewise.
(get_cfg_hooks): Update declaration.
(set_cfg_hooks): Likewise.
* cfgrtl.cc (rtl_cfg_hooks): Constify.
(cfg_layout_rtl_cfg_hooks): Likewise.
* sel-sched-ir.cc (orig_cfg_hooks): Change to a pointer.
(sel_create_basic_block): Update
for orig_cfg_hooks being a pointer.
(sel_register_cfg_hooks): Update for the constification
of cfg_hooks.
* tree-cfg.cc (gimple_cfg_hooks): Constify.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sat, 9 May 2026 00:51:54 +0000 (17:51 -0700)]
cfghooks: Remove name field
Now that we have an IR field, we can remove the name field. The
name was only used for internal errors, so having it outside
of the hooks is better anyway.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* cfghooks.cc (current_ir_name): New function.
(dump_bb_for_graph): Use current_ir_name
instead of accessing the name field.
(dump_bb_as_sarif_properties): Likewise.
(redirect_edge_and_branch): Likewise.
(can_remove_branch_p): Likewise.
(redirect_edge_and_branch_force): Likewise.
(split_block_1): Likewise.
(move_block_after): Likewise.
(delete_basic_block): Likewise.
(split_edge): Likewise.
(create_basic_block_1): Likewise.
(can_merge_blocks_p): Likewise.
(predict_edge): Likewise.
(predicted_by_p): Likewise.
(merge_blocks): Likewise.
(make_forwarder_block): Likewise.
(force_nonfallthru): Likewise.
(can_duplicate_block_p): Likewise.
(duplicate_block): Likewise.
(block_ends_with_call_p): Likewise.
(block_ends_with_condjump_p): Likewise.
(flow_call_edges_add): Likewise.
* cfghooks.h (struct cfg_hooks): Remove the name
field.
* cfgrtl.cc (rtl_cfg_hooks): Update for the removal
of the name field.
(cfg_layout_rtl_cfg_hooks): Likewise.
* tree-cfg.cc (struct cfg_hooks): Likewise.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sat, 9 May 2026 00:16:10 +0000 (17:16 -0700)]
cfghooks: Move ir_type inside cfghooks
This is the first step in constification (and/or C++ification)
of the cfghooks. Currently we compare variables to figure out
what the current IR type is. Rather let's move the ir_type
into the cfghooks. This will help with constification due
to sel-sched overloading one of the hooks.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* cfghooks.cc (current_ir_type): Return cfghooks' ir field.
* cfghooks.h (struct cfg_hooks): Add ir field.
* cfgrtl.cc (rtl_cfg_hooks): Update for new ir field.
(cfg_layout_rtl_cfg_hooks): Likewise.
* tree-cfg.cc (gimple_cfg_hooks): Likewise.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
PR target/125239
* config/i386/i386-features.cc (ix86_place_single_vector_set):
For CONST_VECTOR source, check CONST0_RTX with
X86_CSE_CONST0_VECTOR and CONSTM1_RTX with X86_CSE_CONSTM1_VECTOR.
(ix86_broadcast_inner): Set x86_cse kind to X86_CSE_CONST0_VECTOR
for CONST0_RTX and X86_CSE_CONSTM1_VECTOR for CONSTM1_RTX.
gcc/testsuite/
PR target/125239
* gcc.target/i386/pr124407-1.c: Adjusted.
* gcc.target/i386/pr125239.c: New test.
Andrew Pinski [Fri, 8 May 2026 19:46:32 +0000 (12:46 -0700)]
match: Fix merged patterns for a!=b implies a and b are not zero [PR125234]
In r17-231-gc65691bc5a2873, I messed up the resulting constant for
`(a != b) & ((a | b) == 0)` and `(a == b) | ((a | b) != 0)`. I had
swapped which one was resulting in true/false. This fixes the issue
and adds a testcase to make sure it does not regress again.
Pushed as obvious after a bootstrap/test on x86_64-linux-gnu.
Marek Polacek [Thu, 7 May 2026 22:09:57 +0000 (18:09 -0400)]
c++: fix ICE with invalid targ [PR125043]
The patch that allowed DECL_NTTP_OBJECT_P in invalid_tparm_referent_p
also added the assert checking for tinfos/__func__ (r14-8189). But in
these tests we got to the assert with a temporary object coming from
create_temporary_var: either a reference temporary or compound literal
temporary. The former could be checked by seeing if the name starts
with _ZGR but the latter doesn't have it. So perhaps we can just check
DECL_IGNORED_P, always set for create_temporary_var objects.
PR c++/115181
PR c++/125043
PR c++/124979
gcc/cp/ChangeLog:
* pt.cc (invalid_tparm_referent_p): Allow DECL_IGNORED_P in an
assert.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/nontype-auto27.C: New test.
* g++.dg/cpp1z/nontype-auto28.C: New test.
* g++.dg/cpp2a/nontype-class75.C: New test.
Roger Sayle [Fri, 8 May 2026 20:06:54 +0000 (21:06 +0100)]
PR middle-end/124637: Fix passing padded constant structs in registers on big-endian targets
This patch resolves PR middle-end/124637, a wrong code regression when
passing a struct in a register on big-endian targets. On big-endian
targets, store_constructor fills fields from the most significant bits,
so for structs narrower than word size, any padding is incorrectly
placed in the least significant bytes. This issue is fixed (on
affected targets) by using a (unsigned) right shift on the value
determined by store_constructor to correctly align the structure in
the least significant bytes, and place the padding in the high bits.
Many thanks to Manjunath Matti for testing this patch on real hardware,
and Drea Pinski for reviewing/approving it. The new test case may be
a little fragile, but currently "works for me". Please feel free to
tweak it for powerpc variants/environments I've not considered/encountered.
2026-05-08 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR middle-end/124637
* calls.cc (load_register_parameters): If using store_constructor
to place a constant structure in a register, use a right shift to
align the structure/padding if required on big-endian targets.
gcc/testsuite/ChangeLog
PR middle-end/124637
* gcc.target/powerpc/pr124637.c: New test case.
Harald Anlauf [Fri, 8 May 2026 19:45:31 +0000 (21:45 +0200)]
libfortran: fix static analyser cppcheck warning in free_format_data [PR125087]
The static analyser cppcheck reported a pointless assignment in function
free_format_data. The intent of the assignment was to finally nullify the
pointer to allocated format data after memory has been freed. Since C does
not support references, add a level of indirection to the function argument
so that the dereferenced argument can be nullified.
PR libfortran/125087
libgfortran/ChangeLog:
* io/format.c (free_format_data): Change argument from pointer to
format_data to pointer to pointer of object.
(free_format_hash_table): Adjust argument passed to
free_format_data.
(save_parsed_format): Likewise.
* io/format.h (free_format_data): Adjust prototype.
* io/transfer.c (st_read_done_worker): Adjust argument passed to
free_format_data.
(st_write_done_worker): Likewise.
Jeff Law [Fri, 8 May 2026 17:40:29 +0000 (11:40 -0600)]
[V2][RISC-V][PR target/124955] Utilize slliw for some left shifted signed bitfield extractions
Same functional change as was already posted, this time with a testcase. Given
it's been in my tester and through the pre-commit CI system, I'm going forward
now.
--
So as the PR notes, this is an attempt to squeeze out some instructions from a
hot part of leela, the random number generator in particular.
The key is realizing that the first two statements are just a sign extended
bitfield of length 20. That ultimately gets shifted left 12 bits. 20+12 = 32,
so we can at least conceptually use slliw (shift left sign extending result
from SI to DI). The andi just turns off the low bit.
A sign extracted bitfield starting at bit 0, of size N, that is then left
shifted by M where N+M == 32 is a natural slliw instruction. However, when I
tried to recognize that and generate the slliw form I saw code quality
regressions that didn't look particularly reasonable to try and fix. So we
want to be more selective about recognizing that idiom. So we recognize it
when we subsequently mask off some bits and the mask can be encoded via andi.
This likely could be extended to other logical operations that don't ultimately
affect the SI sign bit.
PR target/124955
gcc/
* config/riscv/riscv.md (masked shifted bitfield extraction): New
splitter to utilize slliw to eliminate the need for sign extension.
Jeff Law [Fri, 8 May 2026 17:19:02 +0000 (11:19 -0600)]
[RISC-V][PR tree-optimization/93504] Handle (X & C) | ((X^Y) & ~C) -> X ^ ( Y & ~C) in simplify-rtx
This is a trivial generalization of existing simplify-rtx code. Essentially
the code in question was handling IOR, but not XOR. I'm keeping the bz open as
this probably should have been cleaned up before getting into RTL.
The net is something like this:
> #define N 0x202
> #define OP ^
>
> unsigned f(unsigned a, unsigned b)
> {
> unsigned t = a OP b;
> unsigned t1 = t&N;
> unsigned t2 = a&~N;
> return t1 | t2;
> }
>
Originally compiled into:
xor a1,a0,a1
andi a1,a1,514
andi a0,a0,-515
or a0,a1,a0
ret
After it compiles into:
andi a1,a1,514
xor a0,a1,a0
ret
Bootstrapped and regression tested on x86, aarch64 and various targets in qemu.
Also tested on the usual embedded targets.
PR tree-optimization/93504
gcc/
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
Generalize existing code for (X & C) | ((X|Y) & ~C) to handle
(X & C) | ((X^Y) & ~C) as well.
gcc/testsuite
* gcc.target/riscv/pr93504.c: New test.
Harald Anlauf [Thu, 7 May 2026 20:34:52 +0000 (22:34 +0200)]
Fortran: fix automatic deallocation with derived type IO [PR111952,PR125059]
The implementation of derived type IO wrongly forced allocatable instances
of the DT as static, which prevented automatic deallocation of local
variables. The motivation was an attempt to prevent optimizations leading
to certain testcase failures. However, the underlying reason of the problem
was a wrong fnspec of _gfortran_transfer_derived that declared the IO
variable as being only read ('r'). Declare the corresponding parameter as
being written ('w').
PR fortran/111952
PR fortran/125059
gcc/fortran/ChangeLog:
* trans-decl.cc (gfc_finish_var_decl): Remove bogus code forcing
a DT variable with DTIO as static.
* trans-io.cc (gfc_build_io_library_fndecls): Fix fnspec attribute.