Patrick Palka [Fri, 13 Jun 2025 15:03:19 +0000 (11:03 -0400)]
libstdc++: Optimize __make_comp/pred_proj for empty/scalar types
When creating a composite comparator/predicate that invokes a given
projection function, we don't need to capture a scalar (such as a
function pointer or member pointer) or empty object by reference;
instead, capture it by value and use [[no_unique_address]] to elide
its storage (in the empty case). This makes using __make_comp_proj
zero-cost in the common case where both functions are empty/scalars.
libstdc++-v3/ChangeLog:
* include/bits/ranges_algo.h (__detail::__by_ref_or_value_fn): New.
(__detail::_Comp_proj): New.
(__detail::__make_comp_proj): Use it instead.
(__detail::_Pred_proj): New.
(__detail::__make_pred_proj): Use it instead.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com> Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
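As a rough illustration of the by-value capture described in the commit above, the following sketch stores a comparator and a projection as [[no_unique_address]] members. The names CompProjSketch, make_comp_proj_sketch, Person and younger are hypothetical and are not the identifiers used in ranges_algo.h; the by-reference fallback for non-empty, non-scalar callables is deliberately left out.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a composite comparator: both callables are held
// by value, and [[no_unique_address]] lets empty ones occupy no storage.
template<typename Comp, typename Proj>
struct CompProjSketch
{
  [[no_unique_address]] Comp comp;
  [[no_unique_address]] Proj proj;

  // Project both arguments, then compare the projected values.
  template<typename T, typename U>
  bool operator()(T&& lhs, U&& rhs) const
  {
    return std::invoke(comp,
                       std::invoke(proj, std::forward<T>(lhs)),
                       std::invoke(proj, std::forward<U>(rhs)));
  }
};

// Assumes Comp and Proj are cheap to copy (empty objects or scalars).
template<typename Comp, typename Proj>
CompProjSketch<Comp, Proj> make_comp_proj_sketch(Comp comp, Proj proj)
{ return {comp, proj}; }

struct Person { std::string name; int age; };

// Usage: an empty comparator (std::less<>) plus a member pointer projection.
inline bool younger(const Person& a, const Person& b)
{
  auto cmp = make_comp_proj_sketch(std::less<>{}, &Person::age);
  return cmp(a, b);
}
```

With an empty comparator and a member-pointer projection, the composite object needs storage only for the member pointer, which is the zero-cost case the commit message refers to.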
libstdc++: add a workaround for format_kind<optional<T>> [PR120644]
The specialization of format_kind for optional is causing a problem when
optional is imported and included. The comments on the PR strongly
suggest that this is a frontend bug; this commit just works around the
issue by specifying the type of format_kind<optional<T>> to be
`range_format`, rather than letting the compiler deduce it via `auto`.
PR c++/120644
libstdc++-v3/ChangeLog:
* include/std/optional (format_kind): Do not use `auto`.
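The shape of the workaround can be sketched with stand-in types. RangeFormatSketch, OptionalSketch and format_kind_sketch below are hypothetical names, not the real std::range_format, std::optional or std::format_kind; the only point is that the partial specialization spells out its type instead of using `auto`.

```cpp
#include <cassert>

// Hypothetical stand-ins for std::range_format and std::optional.
enum class RangeFormatSketch { disabled, sequence };

template<typename T>
struct OptionalSketch { };

// Primary variable template: pretend every type formats as a sequence.
template<typename R>
inline constexpr RangeFormatSketch format_kind_sketch
  = RangeFormatSketch::sequence;

// Partial specialization with the type written explicitly rather than `auto`.
template<typename T>
inline constexpr RangeFormatSketch format_kind_sketch<OptionalSketch<T>>
  = RangeFormatSketch::disabled;
```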
Jakub Jelinek [Fri, 13 Jun 2025 12:01:18 +0000 (14:01 +0200)]
expand: Fix up edge splitting for ENTRY block during expansion if there are any PHIs [PR120629]
Andrew ran some extra ranger checking during bootstrap and found one more
case (though much rarer than the GIMPLE_COND case).
It seems that in fold-const.cc (native_encode_expr) we end up with bb 2, the ENTRY
bb successor, having PHI nodes (usually there is some bb in between, even if
empty; in native_encode_expr it is tail recursion, but I haven't managed
to construct a test with such a case by hand).
So, we have in optimized dump
<bb 2> [local count: 1089340384]:
# expr_12 = PHI <expr_199(D)(0), part_93(51)>
# ptr_13 = PHI <ptr_86(D)(0), ptr_13(51)>
# len_14 = PHI <len_103(D)(0), _198(51)>
# off_10 = PHI <off_102(D)(0), _207(51)>
# add_acc_99 = PHI <0(0), add_acc_101(51)>
where there are mostly default defs from the 0->2 edge (and one zero)
and some other values from the other edge.
construct_init_block inserts a BB_RTL basic block with the function start
instructions and similarly to the GIMPLE_COND case it wants to insert that
bb on the edge from ENTRY to its single successor.
Now, without this patch redirect_edge_succ redirects the 0->2 edge to 0->52,
so the 51->2 edge gets moved first by unordered_remove, and
make_single_succ_edge adds a new 52->2 edge. So we end up with
# expr_12 = PHI <expr_199(D)(51), part_93(52)>
# ptr_13 = PHI <ptr_86(D)(51), ptr_13(52)>
# len_14 = PHI <len_103(D)(51), _198(52)>
# off_10 = PHI <off_102(D)(51), _207(52)>
# add_acc_99 = PHI <0(51), add_acc_101(52)>
which is not correct: the default definitions and zero are now from the edge
from the end of the function and the other values from the edge from the new BB_RTL
successor of ENTRY. With this patch we get
# expr_12 = PHI <expr_199(D)(52), part_93(51)>
# ptr_13 = PHI <ptr_86(D)(52), ptr_13(51)>
# len_14 = PHI <len_103(D)(52), _198(51)>
# off_10 = PHI <off_102(D)(52), _207(51)>
# add_acc_99 = PHI <0(52), add_acc_101(51)>
instead.
2025-06-13 Jakub Jelinek <jakub@redhat.com>
PR middle-end/120629
* cfgexpand.cc (construct_init_block): If first_block isn't BB_RTL,
has any PHI nodes and false_edge->dest_idx before redirection is
different from make_single_succ_edge result's dest_idx, swap the
latter with the former last pred edge and their dest_idx members.
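The index bookkeeping in the commit above can be modelled with a toy predecessor list in which PHI argument i belongs to preds[i]. This is only a sketch under that assumption: PredsSketch, unordered_remove_sketch and split_edge_with_swap_sketch are made-up names, not functions in cfgexpand.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model: entries are source block numbers of the predecessor edges.
using PredsSketch = std::vector<int>;

// Remove the entry at idx by moving the last entry into its place.
inline void unordered_remove_sketch(PredsSketch& preds, std::size_t idx)
{
  preds[idx] = preds.back();
  preds.pop_back();
}

// Redirect the edge at old_idx to a new block, append the new block's edge,
// then swap so the new edge ends up in the slot the old edge occupied.
inline PredsSketch split_edge_with_swap_sketch(PredsSketch preds,
                                               std::size_t old_idx,
                                               int new_bb)
{
  unordered_remove_sketch(preds, old_idx);        // {0, 51}  -> {51}
  preds.push_back(new_bb);                        // {51}     -> {51, 52}
  const std::size_t new_idx = preds.size() - 1;
  if (new_idx != old_idx)
    std::swap(preds[old_idx], preds[new_idx]);    // {51, 52} -> {52, 51}
  return preds;
}
```

With bb 2's predecessors {0, 51} and new block 52, the result is {52, 51}, which matches the PHI argument order shown for the patched compiler.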
Tomasz Kamiński [Fri, 6 Jun 2025 10:24:11 +0000 (12:24 +0200)]
libstdc++: Rework formatting of empty chrono-spec for duration.
In contrast to other calendar types, if an empty chrono-spec is used for duration,
we are required to format it (and its representation type) via ostream.
Handling of this case has now been moved into the format function
for duration. To facilitate that, the __formatter_chrono::_M_format_to_ostream
function was made public.
However, for standard integral types, we know the result of inserting
them into ostream, and in consequence we can format them directly. This
is handled by configuring the default format spec to "%Q%q" for such types.
As we no longer use __formatter_chrono::_M_format with empty chrono-spec,
this function now requires that _M_chrono_specs are not empty,
and conditional call to _M_format_to_ostream is removed. This allows
_M_format_to_ostream to be reduced to accept only duration.
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__formatter_chrono::_M_format):
Remove handling of empty _M_chrono_specs.
(__formatter_chrono::_M_format_to_ostream): Changed to accept
only chrono::duration and made public.
(std::formatter<chrono::duration<_Rep, _Period>, _CharT>):
Configure __defSpec and handle empty chrono-spec locally.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
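A minimal sketch of the idea in the commit above, written without std::format so that it stays self-contained: integral representations are printed directly as count plus unit suffix (the effect of "%Q%q"), while other representations go through an ostream. The helper names are hypothetical, only a few unit suffixes are handled, and treating every integral Rep alike glosses over character types.

```cpp
#include <cassert>
#include <chrono>
#include <ratio>
#include <sstream>
#include <string>
#include <type_traits>

// What %q would print for a handful of periods. Other SI prefixes are
// omitted here and fall through to the generic bracketed form.
template<typename Period>
std::string unit_suffix_sketch()
{
  if constexpr (std::is_same_v<Period, std::ratio<1>>)
    return "s";
  else if constexpr (std::is_same_v<Period, std::milli>)
    return "ms";
  else if constexpr (std::is_same_v<Period, std::ratio<60>>)
    return "min";
  else if constexpr (std::is_same_v<Period, std::ratio<3600>>)
    return "h";
  else if constexpr (Period::den == 1)
    return "[" + std::to_string(Period::num) + "]s";
  else
    return "[" + std::to_string(Period::num) + "/"
           + std::to_string(Period::den) + "]s";
}

template<typename Rep, typename Period>
std::string format_duration_sketch(const std::chrono::duration<Rep, Period>& d)
{
  const std::string suffix = unit_suffix_sketch<typename Period::type>();
  if constexpr (std::is_integral_v<Rep>)
    // Result of stream insertion is known: emit count and suffix directly.
    return std::to_string(d.count()) + suffix;
  else
    {
      // Other representation types are inserted into an ostream.
      std::ostringstream os;
      os << d.count() << suffix;
      return os.str();
    }
}
```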
Tomasz Kamiński [Fri, 6 Jun 2025 09:56:08 +0000 (11:56 +0200)]
libstdc++: Format empty chrono-spec for the sys_info and local_info directly.
This patch changes the implementation of the formatters for sys_info and local_info,
so they no longer delegate to operator<< for ostream in case of an empty spec.
As these types may only be formatted with a chrono-spec containing only the %%, %t, %n
specifiers and fill characters, we use a separate __formatter_chrono_info formatter.
For an empty chrono-spec, __formatter_chrono_info formats sys_info using a format_to call
with the format specifier extracted from the corresponding operator<<, which now delegates
to format with an empty spec. For local_info we replicate the functionality of the operator<<.
The alignment and padding are handled using a _Padding_sink.
For a non-empty spec, we delegate to __formatter_chrono::_M_format. As none of the
format specifiers depend on the formatted object, we pass chrono::day to avoid
triggering additional specializations.
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__format::__formatter_chrono_info)
[_GLIBCXX_USE_CXX11_ABI || ! _GLIBCXX_USE_DUAL_ABI]: Define.
(std::formatter<chrono::sys_info, _CharT>)
(std::formatter<chrono::local_info, _CharT>): Delegate to
__format::__formatter_chrono_info.
(std::operator<<(basic_ostream<_CharT, _Traits>&, const sys_info&)):
Use format on sys_info with empty format spec.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Kito Cheng [Tue, 10 Jun 2025 03:17:45 +0000 (11:17 +0800)]
driver: Try to read spec from gcc_exec_prefix if possible
GCC will try to read the spec file from the directory where it is
installed, but it should try to read from gcc_exec_prefix rather than
standard_exec_prefix, because the latter is not the right one if the
compiler has been relocated to a location other than the path
specified at configuration time.
gcc/ChangeLog:
* gcc.cc (driver::set_up_specs): Use gcc_exec_prefix to
read the spec file rather than standard_exec_prefix.
gcc.c-torture/compile/dll.c -O0 (internal compiler error: in assemble_variable, at varasm.cc:2544)
gcc.dg/visibility-12.c (internal compiler error: in expand_call, at calls.cc:3744)
for mcore-elf.
PR target/120589
* config/mcore/mcore.cc (mcore_mark_dllimport): Don't use
gen_rtx_MEM.
Jakub Jelinek [Thu, 12 Jun 2025 18:22:39 +0000 (20:22 +0200)]
recip: Reset range info when replacing sqrt with rsqrt [PR120638]
This pass reuses a SSA_NAME on the lhs of sqrt etc. call as lhs
of .RSQRT etc. call. The following testcase is miscompiled since my recent
ranger cast changes, because we compute (correct) range for sqrtf argument
as well as result but then recip pass keeps using that range for the .RSQRT
call which returns 1. / sqrt, so the function then returns 0.5f
unconditionally.
Note, on foo this is a regression from GCC 15, but on bar it regressed
already with the r14-536 change.
2025-06-12 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/120638
* tree-ssa-math-opts.cc (pass_cse_reciprocals::execute): Call
reset_flow_sensitive_info on arg1.
libstdc++: do not use an unreserved name in _Temporary_buffer [PR119496]
As the PR observes, _Temporary_buffer was using an unreserved name for a
member function that can therefore clash with macros defined by the
user. Avoid that by renaming the member function.
PR libstdc++/119496
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h: Adjust calls to requested_size.
* include/bits/stl_tempbuf.h (requested_size): Rename with
an _M_ prefix.
* testsuite/17_intro/names.cc: Add a #define for
requested_size.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
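The clash described in the commit above can be reproduced in miniature. TemporaryBufferSketch is a hypothetical class, not the real _Temporary_buffer, and the reserved _M_ spelling is used here only to mimic library code; ordinary user code should not declare such names.

```cpp
#include <cassert>
#include <cstddef>

// A user program may define a macro with any unreserved name.
#define requested_size 42

// Library-style class: the accessor uses the reserved _M_ prefix, so the
// macro above cannot rewrite it.
class TemporaryBufferSketch
{
  std::ptrdiff_t _M_original_len;

public:
  explicit TemporaryBufferSketch(std::ptrdiff_t len)
  : _M_original_len(len) { }

  // Had this been spelled with the unreserved name, the macro would have
  // replaced the identifier with 42 and the declaration would not compile.
  std::ptrdiff_t _M_requested_size() const { return _M_original_len; }
};
```

This is also why the names.cc test defines a macro for the old spelling: any remaining use of that name in a library header would then fail to compile.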
libstdc++: add range support to std::optional (P3168)
This commit implements P3168 ("Give std::optional Range Support"), added
for C++26. Both begin() and end() are straightforward, implemented using
normal_iterator over a raw pointer.
std::optional is also a view, so specialize enable_view for it.
We also need to disable automatic formatting of a std::optional as a range
by specializing format_kind. In order to avoid dragging in <format> when
including <optional>, I've isolated format_kind and some supporting code
into <bits/formatfwd.h> so that I can use that (comparatively) lighter
header.
libstdc++-v3/ChangeLog:
* include/bits/formatfwd.h (format_kind): Move the definition
(and some supporting code) from <format>.
* include/std/format (format_kind): Likewise.
* include/bits/version.def (optional_range_support): Add
the feature-testing macro.
* include/bits/version.h: Regenerate.
* include/std/optional (iterator, const_iterator, begin, end):
Add range support.
(enable_view): Specialize for std::optional.
(format_kind): Specialize for std::optional.
* testsuite/20_util/optional/range.cc: New test.
* testsuite/20_util/optional/version.cc: Test the new
feature-testing macro.
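To show what range support on an optional means in practice, here is a small wrapper that exposes begin() and end() over zero or one element. OptionalRangeSketch and count_elements_sketch are hypothetical; the sketch uses plain pointers as iterators, whereas the commit describes using normal_iterator over a raw pointer, and it does not depend on the C++26 library feature itself.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <utility>

template<typename T>
class OptionalRangeSketch
{
  std::optional<T> value_;

public:
  OptionalRangeSketch() = default;
  OptionalRangeSketch(T v) : value_(std::move(v)) { }

  // A disengaged optional is an empty range; an engaged one has one element.
  const T* begin() const
  { return value_ ? std::addressof(*value_) : nullptr; }

  const T* end() const
  { return begin() + (value_ ? 1 : 0); }
};

// Usage: a range-based for loop runs zero or one time.
template<typename T>
int count_elements_sketch(const OptionalRangeSketch<T>& opt)
{
  int n = 0;
  for (const T& item : opt)
    {
      (void) item;
      ++n;
    }
  return n;
}
```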
Stafford Horne [Sat, 7 Jun 2025 13:46:30 +0000 (14:46 +0100)]
or1k: Fix ICE in libgcc caused by recent validate_subreg changes
After commit eb2ea476db2 ("emit-rtl: Allow extra checks for
paradoxical subregs [PR119966]") paradoxical subregs of the OpenRISC
condition flag register (reg:BI sr_f) are no longer allowed.
This causes an ICE in the ce1 pass which tries to get the or1k flag
register into an SI register, which is no longer possible.
Adjust or1k_can_change_mode_class to allow changing the or1k flag reg to
SI mode which in turn allows paradoxical subregs to be generated again.
gcc/ChangeLog:
PR target/120587
* config/or1k/or1k.cc (or1k_can_change_mode_class): Allow
changing flags mode from BI to SI to allow for paradoxical
subregs.
Tomasz Kamiński [Fri, 6 Jun 2025 09:32:27 +0000 (11:32 +0200)]
libstdc++: Format empty chrono-spec for the time points and hh_mm_ss directly.
This patch changes the implementation of the formatters for time points and hh_mm_ss,
so they no longer delegate to operator<< for ostream in case of an empty chrono-spec.
As in the case of calendar types, the formatters for each specific type now provide
__formatter_chrono with a default _ChronoSpec that is used in case of an empty
chrono-spec.
The configuration of __defSpec is straightforward, except for sys_time
and local_time, which omit the time of day if the duration is convertible to days,
which is equivalent to setting _M_chrono_specs to "%F" instead of "%F %T".
Furthermore, certain sys_time<Dur> do not support the ostream operator, and
should not be formattable with an empty spec; in such a case the default
_M_chrono_specs is left empty, allowing the issue to still be detected in _M_parse.
Finally, _ChronoFormats are extended to cover required format strings.
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (_ChronoFormats::_S_ftz)
(_ChronoFormats::_S_ft, _ChronoFormats::_S_t): Define.
(__formatter_chrono::_M_format_to_ostream): Remove handling for
time_points.
(std::formatter<chrono::hh_mm_ss<_Dur>, _CharT>)
(std::formatter<chrono::sys_time<_Dur>, _CharT>)
(std::formatter<chrono::utc_time<_Dur>, _CharT>)
(std::formatter<chrono::tai_time<_Dur>, _CharT>)
(std::formatter<chrono::gps_time<_Dur>, _CharT>)
(std::formatter<chrono::file_time<_Dur>, _CharT>)
(std::formatter<chrono::local_time<_Dur>, _CharT>)
(std::formatter<chrono::__detail::__local_time_fmt<_Dur>, _CharT>)
(std::formatter<chrono::zoned_time<_Dur>, _CharT>):
Define __defSpec, and pass it as argument to _M_parse and
constructor of __formatter_chrono.
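The choice between "%F" and "%F %T" for sys_time and local_time can be sketched as a compile-time selection on the duration type. DaysSketch and default_time_point_spec_sketch are hypothetical names, and the convertibility test below is an assumption about how such a check could be written, not the condition used in chrono_io.h.

```cpp
#include <cassert>
#include <chrono>
#include <ratio>
#include <string_view>
#include <type_traits>

// Stand-in for std::chrono::days, which is only available from C++20.
using DaysSketch = std::chrono::duration<int, std::ratio<86400>>;

// Pick the default chrono-spec for a time point with duration Dur:
// date only when Dur converts to whole days without truncation,
// date and time of day otherwise.
template<typename Dur>
constexpr std::string_view default_time_point_spec_sketch()
{
  if constexpr (std::is_convertible_v<Dur, DaysSketch>)
    return "%F";
  else
    return "%F %T";
}
```

Durations finer than a day, such as seconds, are not implicitly convertible to a day-based duration because the conversion would truncate, so they select "%F %T".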
Tomasz Kamiński [Fri, 6 Jun 2025 08:45:21 +0000 (10:45 +0200)]
libstdc++: Format empty chrono-spec for the calendar types directly.
This patch changes the implementation of the formatters for the calendar types,
so they no longer delegate to operator<< for ostream in case of an empty chrono-spec.
Instead of that, we define the behavior in terms of format specifiers
supplied by each formatter as an argument to _M_parse. Similarly, each formatter
constructs its __formatter_chrono from a relevant default spec, preserving the
functionality of calling format on default constructed formatters.
Expressing the existing functionality of the ostream operators requires
providing two additional features:
* printing "is not a valid sth" for !ok objects,
* printing a weekday index in the month.
The formatter functionality is enabled by setting the _M_debug spec (corresponding
to '?'), which is currently unused. This is currently supported only for the
subset of format specifiers used by the ostream operators. In the future, we could
make this user configurable (by adding '?' after 'L') and cover all flags.
For the handling of the weekday index (for weekday_indexed, month_weekday,
year_month_weekday), we need to introduce a new format specifier. To avoid
conflicts with future extensions, we use '%\0' (embedded null), as this character
cannot be placed in a valid format spec.
Finally, the format strings for calendar types are subsets of each other, e.g.
year_month_weekday_last ("%Y/%b/%a[last]") contains month_weekday_last,
weekday_last, weekday, etc. We introduce a _ChronoFormats class that provides
consteval accessors to format specs, internally sharing their representations.
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__format::_ChronoFormats): Define.
(__formatter_chrono::__formatter_chrono())
(__formatter_chrono::__formatter_chrono(_ChronoSpec<_CharT>)): Define.
(__formatter_chrono::_M_parse): Add parameter with default spec,
and merge it with new values. Handle '%\0' as weekday index
specifier.
(__formatter_chrono::_M_a_A, __formatter_chrono::_M_b_B)
(__formatter_chrono::_M_C_y_Y, __formatter_chrono::_M_d_e)
(__formatter_chrono::_M_F): Support _M_debug flag.
(__formatter_chrono::_M_wi, __formatter_chrono::_S_weekday_index):
Define.
(std::formatter<chrono::day, _CharT>)
(std::formatter<chrono::month, _CharT>)
(std::formatter<chrono::year, _CharT>)
(std::formatter<chrono::weekday, _CharT>)
(std::formatter<chrono::weekday_indexed, _CharT>)
(std::formatter<chrono::weekday_last, _CharT>)
(std::formatter<chrono::month_day, _CharT>)
(std::formatter<chrono::month_day_last, _CharT>)
(std::formatter<chrono::month_weekday, _CharT>)
(std::formatter<chrono::month_weekday_last, _CharT>)
(std::formatter<chrono::year_month, _CharT>)
(std::formatter<chrono::year_month_day, _CharT>)
(std::formatter<chrono::year_month_day_last, _CharT>)
(std::formatter<chrono::year_month_weekday, _CharT>)
(std::formatter<chrono::year_month_weekday_last, _CharT>):
Define __defSpec, and pass it as argument to _M_parse and
constructor of __formatter_chrono.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
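The sharing of format strings mentioned in the commit above can be illustrated with views into a single literal. ChronoFormatsSketch is a hypothetical class; it uses constexpr accessors rather than consteval so that it also compiles as C++17, and it covers only the one example string quoted in the commit message.

```cpp
#include <cassert>
#include <string_view>

struct ChronoFormatsSketch
{
  // One literal holds the longest spec; shorter specs are views into it.
  static constexpr std::string_view full = "%Y/%b/%a[last]";

  static constexpr std::string_view year_month_weekday_last()
  { return full; }

  static constexpr std::string_view month_weekday_last()
  { return full.substr(3); }      // "%b/%a[last]"

  static constexpr std::string_view weekday_last()
  { return full.substr(6); }      // "%a[last]"

  static constexpr std::string_view weekday()
  { return full.substr(6, 2); }   // "%a"
};

// The accessors are usable in constant expressions.
static_assert(ChronoFormatsSketch::weekday() == "%a");
```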
Uros Bizjak [Thu, 12 Jun 2025 12:03:50 +0000 (14:03 +0200)]
i386: Fix signed integer overflow in ix86_expand_int_movcc, part 2 [PR120604]
Make sure we can represent the difference between two 64-bit DImode immediate
values in 64-bit HOST_WIDE_INT and return false if this is not the case.
ix86_expand_int_movcc is used in the mov<mode>cc expander. The expander will FAIL
when the function returns false and the middle-end will retry expansion with
values forced to registers.
PR target/120604
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_int_movcc): Make sure
we can represent the difference between two 64-bit DImode
immediate values in 64-bit HOST_WIDE_INT.
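The kind of check described in the commit above, "can the difference of two 64-bit values be represented in 64 bits?", can be written without triggering signed overflow. checked_diff_sketch is a hypothetical helper and not the code in i386-expand.cc; returning an empty optional plays the role of the function returning false.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Returns a - b if it is representable in int64_t, otherwise std::nullopt.
inline std::optional<std::int64_t>
checked_diff_sketch(std::int64_t a, std::int64_t b)
{
  constexpr std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
  constexpr std::int64_t kMin = std::numeric_limits<std::int64_t>::min();

  // For b < 0 the result grows, so it overflows when a > kMax + b.
  // For b > 0 the result shrinks, so it overflows when a < kMin + b.
  // Neither bound computation can itself overflow.
  if ((b < 0 && a > kMax + b) || (b > 0 && a < kMin + b))
    return std::nullopt;
  return a - b;
}
```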
Jakub Jelinek [Thu, 12 Jun 2025 13:51:51 +0000 (15:51 +0200)]
expand: Fix up edge splitting for GIMPLE_COND expansion if there are any PHIs [PR120629]
My r16-1398 PR120434 ranger during expansion change broke profiled lto
bootstrap on x86_64-linux, the following testcase is reduced from that.
The problem is during expand_gimple_cond, if we are unlucky that neither
of edge_true and edge_false point to the next basic block, the code
effectively attempts to split the false_edge and make the new bb BB_RTL
with some extra instructions which just arranges to jump.
It does it by creating a new bb, redirecting the false_edge and then
creating a new edge from the new bb to the dest.
Note, we don't have GIMPLE cfg hooks installed anymore and even if we
would, the 3 calls aren't the same as one split_edge with transformation
of the new bb into BB_RTL and adding it some BB_HEAD/BB_END. If
false_edge->dest is BB_RTL or doesn't have PHI nodes (which before my
patch was always the case), then this works fine, but with
PHI nodes on false_edge->dest redirect_edge_succ will remove the false_edge
from dest->preds (unordered remove which moves into its place the last edge
in the vector) and the new make_edge will then add the new edge as last
in the vector. So, unless false_edge is the last edge in the dest->preds
vector this effectively swaps the last edge in the vector with
false_edge/its new replacement.
gimple_split_edge solves this by temporarily clearing phi_nodes on dest
(not needed when we don't have GIMPLE hooks), then making the new edge
first and redirecting the old edge (plus restoring phi_nodes on dest).
That way the redirection replaces the old edge with the new one and
PHI arguments don't need adjustment. This comes at the cost of temporarily
needing one more edge in the vector and so, if unlucky, a reallocation.
Doing it like that is one of the options (i.e. just move the
make_single_succ_edge call). This patch instead keeps doing what it did
and just swaps two edges again if needed to restore the PHI behavior
- remember edge_false->dest_idx first if there are PHI nodes in
edge_false->dest and afterwards if new edge's dest_idx is different from
the remembered one, swap the new edge with EDGE_PRED (dest, old_dest_idx).
That way PHI arguments are maintained properly as well. Without this
we sometimes just swap PHI arguments.
In particular we had
# ivtmp.24_52 = PHI <ivtmp.24_49(10), 1(6)>
on bb 8 (dest) and edge_false is the 10->8 edge. We create a new
BB_RTL bb 15 on this edge, redirect the 10->8 edge to 10->15 which
does unordered_remove and so the bb8->preds edge vec is just 6->8,
PHIs not touched as in IR_RTL_CFGRTL mode. Then a new 15->8 edge is
created. Without the patch we get
# ivtmp.24_52 = PHI <ivtmp.24_49(6), 1(15)>
which is wrong, while with this patch we get
# ivtmp.24_52 = PHI <ivtmp.24_49(15), 1(6)>
which matches just the addition of (for ranger uninteresting) BB_RTL
on the 10->15->8 edge.
2025-06-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/120629
* cfgexpand.cc (expand_gimple_cond): If dest bb isn't BB_RTL,
has any PHI nodes and false_edge->dest_idx before redirection is
different from make_single_succ_edge result's dest_idx, swap the
latter with the former last pred edge and their dest_idx members.
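The two orderings discussed in this commit message can be compared on a toy predecessor list, where PHI argument i belongs to preds[i]. The function names below are invented for illustration and do not correspond to functions in cfgexpand.cc or to gimple_split_edge itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: entries are source block numbers of the predecessor edges.
using PredVecSketch = std::vector<int>;

// Redirect the old edge first (unordered remove), then append the new edge.
// The new edge lands last, so PHI arguments no longer line up.
inline PredVecSketch redirect_then_add_sketch(PredVecSketch preds,
                                              std::size_t old_idx,
                                              int new_bb)
{
  preds[old_idx] = preds.back();   // {10, 6} -> {6, 6}
  preds.pop_back();                //         -> {6}
  preds.push_back(new_bb);         //         -> {6, 15}
  return preds;
}

// Append the new edge first, then unordered-remove the old edge: the last
// entry (the new edge) moves into the old edge's slot.
inline PredVecSketch add_then_redirect_sketch(PredVecSketch preds,
                                              std::size_t old_idx,
                                              int new_bb)
{
  preds.push_back(new_bb);         // {10, 6} -> {10, 6, 15}
  preds[old_idx] = preds.back();   //         -> {15, 6, 15}
  preds.pop_back();                //         -> {15, 6}
  return preds;
}
```

For bb 8 with predecessors {10, 6} and new block 15, the first ordering gives {6, 15} and the second gives {15, 6}; the patch reaches the second result by keeping the first ordering and swapping afterwards.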
Pan Li [Wed, 11 Jun 2025 13:49:21 +0000 (21:49 +0800)]
RISC-V: Combine vec_duplicate + vmax.vv to vmax.vx on GR2VR cost
This patch combines vec_duplicate + vmax.vv into vmax.vx, as in the
example code below. The related pattern depends on the cost of
vec_duplicate from GR2VR: late-combine takes action if the cost of
GR2VR is zero, and rejects the combination if the GR2VR cost is
greater than zero.
Assume we have example code like the following, with a GR2VR cost of 0.
#define DEF_VX_BINARY(T, OP) \
void \
test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
{ \
for (unsigned i = 0; i < n; i++) \
out[i] = in[i] OP x; \
}
* config/riscv/riscv-v.cc (expand_vx_binary_vec_dup_vec): Add new
case SMAX.
(expand_vx_binary_vec_vec_dup): Ditto.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op smax.
aarch64: Incorrect removal of ZA restore [PR120624]
The PCS defines a lazy save scheme for managing ZA across normal
"private-ZA" functions. GCC currently uses this scheme for calls
to all private-ZA functions (rather than using caller-save).
Therefore, before a sequence of calls to private-ZA functions, GCC emits
code to set up a lazy save. After the sequence of calls, GCC emits code
to check whether lazy save was committed and restore the ZA contents
if so.
These sequences are emitted by the mode-switching pass, in an attempt
to reduce the number of redundant saves and restores.
The lazy save scheme also means that, before a function can use ZA,
it must first conditionally store the old contents of ZA to the caller's
lazy save buffer, if any.
This all creates some relatively complex dependencies between
setup code, save/restore code, and normal reads from and writes to ZA.
These dependencies are modelled using special fake hard registers:
;; Sometimes we use placeholder instructions to mark where later
;; ABI-related lowering is needed. These placeholders read and
;; write this register. Instructions that depend on the lowering
;; read the register.
(LOWERING_REGNUM 87)
;; Represents the contents of the current function's TPIDR2 block,
;; in abstract form.
(TPIDR2_BLOCK_REGNUM 88)
;; Holds the value that the current function wants PSTATE.ZA to be.
;; The actual value can sometimes vary, because it does not track
;; changes to PSTATE.ZA that happen during a lazy save and restore.
;; Those effects are instead tracked by ZA_SAVED_REGNUM.
(SME_STATE_REGNUM 89)
;; Instructions write to this register if they set TPIDR2_EL0 to a
;; well-defined value. Instructions read from the register if they
;; depend on the result of such writes.
;;
;; The register does not model the architected TPIDR2_EL0, just the
;; current function's management of it.
(TPIDR2_SETUP_REGNUM 90)
;; Represents the property "has an incoming lazy save been committed?".
(ZA_FREE_REGNUM 91)
;; Represents the property "are the current function's ZA contents
;; stored in the lazy save buffer, rather than in ZA itself?".
(ZA_SAVED_REGNUM 92)
;; Represents the contents of the current function's ZA state in
;; abstract form. At various times in the function, these contents
;; might be stored in ZA itself, or in the function's lazy save buffer.
;;
;; The contents persist even when the architected ZA is off. Private-ZA
;; functions have no effect on its contents.
(ZA_REGNUM 93)
Every normal read from ZA and write to ZA depends on SME_STATE_REGNUM,
in order to sequence the code with the initial setup of ZA and
with the lazy save scheme.
The code to restore ZA after a call involves several instructions,
including conditional control flow. It is initially represented as
a single define_insn and is split late, after shrink-wrapping and
prologue/epilogue insertion.
The split form of the restore instruction includes a conditional call
to __arm_tpidr2_restore:
The write to SME_STATE_REGNUM indicates the end of the region where
ZA_REGNUM might differ from the real contents of ZA. In other words,
it is the point at which normal reads from ZA and writes to ZA
can safely take place.
To finally get to the point, the problem in this PR was that the
unsplit aarch64_restore_za pattern was missing this change to
SME_STATE_REGNUM. It could therefore be deleted as dead before
it had a chance to be split. The split form had the correct dataflow,
but the unsplit form didn't.
Unfortunately, the tests for this code tended to use calls and asms
to model regions of ZA usage, and those don't seem to be affected
in the same way.
gcc/
PR target/120624
* config/aarch64/aarch64.md (SME_STATE_REGNUM): Expand on comments.
* config/aarch64/aarch64-sme.md (aarch64_restore_za): Also set
SME_STATE_REGNUM.
gcc/testsuite/
PR target/120624
* gcc.target/aarch64/sme/za_state_7.c: New test.
Tomasz Kamiński [Thu, 5 Jun 2025 08:40:10 +0000 (10:40 +0200)]
libstdc++: Uglify __mapping_alike template parameter and fix test and typo in comment.
When the static assert was generated from instantiations of default member
initializer of class B, the error was not generated for B<1, std::layout_left,
std::layout_left> case, only when -D_GLIBCXX_DEBUG was set. Changing B calls to
functions fixes that.
We also replace class with typename in template head of layout_right::mapping
constructors.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__mdspan::__mapping_alike): Rename template
parameter from M to _M_p.
(layout_right::mapping): Replace class with typename in template
head.
(layout_stride::mapping): Fix typo in comment.
* testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc:
Changed B to function.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Luc Grosheintz [Wed, 4 Jun 2025 14:58:53 +0000 (16:58 +0200)]
libstdc++: Make layout_left(layout_stride) noexcept.
[mdspan.layout.left.cons] of N4950 states that this ctor is not
noexcept. Since all other ctors of layout_left, layout_right or
layout_stride are noexcept, the choice was made, based on
[res.on.exception.handling], to make this ctor noexcept.
Two other major standard library implementations make the same choice.
libstdc++-v3/ChangeLog:
* include/std/mdspan (layout_left): Strengthen the exception
guarantees of layout_left::mapping(layout_stride::mapping).
* testsuite/23_containers/mdspan/layouts/ctors.cc:
Simplify tests to reflect the change.
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com> Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Luc Grosheintz [Wed, 4 Jun 2025 14:58:52 +0000 (16:58 +0200)]
libstdc++: Add tests for layout_stride.
Implements the tests for layout_stride and for the features of the other
two layouts that depend on layout_stride.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc: Add
tests for layout_stride.
* testsuite/23_containers/mdspan/layouts/ctors.cc: Add test for
layout_stride and the interaction with other layouts.
* testsuite/23_containers/mdspan/layouts/empty.cc: Ditto.
* testsuite/23_containers/mdspan/layouts/mapping.cc: Ditto.
* testsuite/23_containers/mdspan/layouts/stride.cc: New test.
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com> Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Luc Grosheintz [Wed, 4 Jun 2025 14:58:50 +0000 (16:58 +0200)]
libstdc++: Add tests for layout_right.
Adds tests for layout_right and for the parts of layout_left that depend
on layout_right.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc: Add
tests for layout_right.
* testsuite/23_containers/mdspan/layouts/ctors.cc: Add tests for
layout_right and the interaction with layout_left.
* testsuite/23_containers/mdspan/layouts/empty.cc: ditto.
* testsuite/23_containers/mdspan/layouts/mapping.cc: ditto.
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com> Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Luc Grosheintz [Wed, 4 Jun 2025 14:58:48 +0000 (16:58 +0200)]
libstdc++: Add tests for layout_left.
Implements a suite of tests for the currently implemented parts of
layout_left. The individual tests are templated over the layout type, to
allow reuse as more layouts are added.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc: New test.
* testsuite/23_containers/mdspan/layouts/ctors.cc: New test.
* testsuite/23_containers/mdspan/layouts/empty.cc: New test.
* testsuite/23_containers/mdspan/layouts/mapping.cc: New test.
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com> Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Luc Grosheintz [Wed, 4 Jun 2025 14:58:46 +0000 (16:58 +0200)]
libstdc++: Improve naming, whitespace and silence warnings for extents.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__mdspan::_ExtentsStorage): Change name
of private member _M_dynamic_extens to _M_dyn_exts.
(extents): Change name of private member from _M_dynamic_extents
to _M_exts.
Fix two instances of whitespace errors.
* testsuite/23_containers/mdspan/extents/ctor_default.cc: Fix
integer comparison with cmp_equal.
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com> Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
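For readers unfamiliar with cmp_equal, the following sketch shows why a sign-aware comparison differs from a plain == between signed and unsigned integers. cmp_equal_sketch is a hypothetical C++17 reimplementation for illustration only; the test in the commit uses the standard std::cmp_equal from C++20.

```cpp
#include <cassert>
#include <type_traits>

// Compare two integers by mathematical value, regardless of signedness.
template<typename T, typename U>
constexpr bool cmp_equal_sketch(T t, U u) noexcept
{
  if constexpr (std::is_signed_v<T> == std::is_signed_v<U>)
    return t == u;
  else if constexpr (std::is_signed_v<T>)
    // A negative signed value can never equal an unsigned one.
    return t >= 0 && std::make_unsigned_t<T>(t) == u;
  else
    return u >= 0 && std::make_unsigned_t<U>(u) == t;
}

// With the usual arithmetic conversions, -1 converted to unsigned compares
// equal to the largest unsigned value; the sign-aware version says no.
static_assert(!cmp_equal_sketch(-1, static_cast<unsigned>(-1)));
```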
Jonathan Wakely [Tue, 10 Jun 2025 08:56:05 +0000 (09:56 +0100)]
libstdc++: Improve documentation on copyright notices in new tests
Clarify that FSF copyright notices in tests are incorrect for
contributions under DCO terms. Reword the sentence about copying
existing tests to make clear that it refers only to copying the code
in the test file, rather than just copying an existing file as a
template for a new test.
libstdc++-v3/ChangeLog:
* doc/xml/manual/test.xml: Improve discussion of copyright
notices in new test cases.
* doc/html/manual/test.html: Regenerate.
Jonathan Wakely [Fri, 6 Jun 2025 12:43:22 +0000 (13:43 +0100)]
libstdc++: Remove outdated comment about wchar_t in create_testsuite_files
This script claims that wchar_t tests are filtered out if the toolchain
being tested doesn't support it. That doesn't seem to have been true
since r0-68039-ga72c74a1dee345 in 2005.
libstdc++-v3/ChangeLog:
* scripts/create_testsuite_files: Remove incorrect comment about
filtering out wchar_t tests.
Jonathan Wakely [Wed, 11 Jun 2025 10:11:52 +0000 (11:11 +0100)]
libstdc++: Do not specialize std::formatter for incomplete type [PR120625]
Using an incomplete type as the template argument for std::formatter
specializations causes problems for program-defined specializations of
std::formatter which have constraints. When the compiler has to find
which specialization of std::formatter to use for the incomplete type it
considers the program-defined specializations and checks to see if their
constraints are satisfied, which can give errors if the constraints
cannot be checked for incomplete types.
This replaces the base class of the disabled specializations with a
concrete class __formatter_disabled, so there is no need to match a
specialization and no more incomplete type.
libstdc++-v3/ChangeLog:
PR libstdc++/120625
* include/std/format (__format::__disabled): Remove.
(__formatter_disabled): New type.
(formatter<char*, wchar_t>, formatter<const char*, wchar_t>)
(formatter<char[N], wchar_t>, formatter<string, wchar_t>)
(formatter<string_view, wchar_t>): Use __formatter_disabled as
base class instead of formatter<__disabled, wchar_t>.
* testsuite/std/format/formatter/120625.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
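The approach of the commit above, a concrete non-constructible base class instead of a formatter specialization for an incomplete type, can be sketched as follows. FormatterDisabledSketch, FormatterSketch and NarrowToWideSketch are hypothetical names; the exact members of the real __formatter_disabled are not shown in the commit message and are assumed here.

```cpp
#include <cassert>
#include <type_traits>

// Concrete base class that makes any derived formatter unusable.
struct FormatterDisabledSketch
{
  FormatterDisabledSketch() = delete;
  FormatterDisabledSketch(const FormatterDisabledSketch&) = delete;
  FormatterDisabledSketch& operator=(const FormatterDisabledSketch&) = delete;
};

// Primary template left undefined in this sketch.
template<typename T, typename CharT = char>
struct FormatterSketch;

// Disabled specialization: no lookup of another specialization is needed,
// so no constraints are ever checked against an incomplete type.
template<>
struct FormatterSketch<const char*, wchar_t>
: private FormatterDisabledSketch
{ };

using NarrowToWideSketch = FormatterSketch<const char*, wchar_t>;
```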
Eric Botcazou [Fri, 7 Feb 2025 08:26:51 +0000 (09:26 +0100)]
ada: Fix code size increase with pragma Aggregate_Individually_Assign
The problem is that individual assignments have to preserve the other bits
of the target, in other words the original semantics of aggregates, where
the anonymous object is entirely constructed, is partially lost.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Variable>: Generate
a zero-initialization for the anonymous object of a small aggregate
allocated on the stack.
(inline_status_for_subprog): Minor tweak.
Eric Botcazou [Sat, 5 Apr 2025 14:21:45 +0000 (16:21 +0200)]
ada: Fix documentation of Generalized Finalization extension
The current documentation does not reflect the implementation present in
the compiler and contains various other inaccuracies.
gcc/ada/ChangeLog:
* doc/gnat_rm/gnat_language_extensions.rst
(Generalized Finalization): Document the actual implementation.
(No_Raise): Move to separate section.
* gnat_rm.texi: Regenerate.
Eric Botcazou [Fri, 4 Apr 2025 18:17:28 +0000 (20:17 +0200)]
ada: Assorted minor cleanups and tweaks
There should be no functional changes.
gcc/ada/ChangeLog:
* einfo.ads (Has_Homonym): Fix inaccuracy in description.
* sem_ch8.ads (Find_Direct_Name): Remove obsolete description.
* sem_ch12.adb (Analyze_Associations): Rename I_Node parameter
into N and adjust description.
(Analyze_Subprogram_Instantiation): Add missing description.
(Contains_Instance_Of): Fix description.
(Associations): Rename Generic_Actual_Rec into Actual_Rec and
Gen_Assocs_Rec into Match_Rec.
(Analyze_One_Association): Rename I_Node parameter into N.
(Check_Fixed_Point_Warning): Rename Gen_Assocs parameter into
Match.
(Body of Associations): Minor cleanups and tweaks.
(Analyze_Associations): Rename I_Node parameter into N and
adjust implementation.
(Analyze_One_Association): Likewise.
(Analyze_Package_Instantiation): Remove obsolete code and clean up.
(Check_Fixed_Point_Warning): Rename Gen_Assocs parameter into
Match and adjust implementation.
(Freeze_Package_Instance): Simplify condition.
(Get_Unit_Instantiation_Node): Add support for instantiations of
subprograms and stop the loop properly in case of errors.
* sem_util.ads (Add_Global_Declaration): Rename N parameter into
Decl and fix description.
* sem_util.adb (Add_Global_Declaration): Rename N parameter into
Decl and adjust implementation.
Bob Duff [Tue, 1 Apr 2025 23:03:31 +0000 (19:03 -0400)]
ada: VAST: Check basic tree properties
Check that the tree is really a tree, that parent pointers
make sense, that every node has been analyzed, and so on.
Most of these checks are disabled, because they fail in
many cases, including the compiler and run-time library.
Improve the debugging support in VAST. Walk subtrees "by hand",
rather than calling Atree.Traverse routines, because that
makes debugging printouts more convenient, and because we
want to keep a node stack for checking parents.
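As a rough illustration of walking subtrees "by hand" with a node stack, here is a small C++ sketch. It is not the VAST implementation (which is written in Ada); `Node` and `count_bad_parents` are hypothetical names, and the only property checked is that each parent pointer matches the node currently on top of the stack.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tree node: a parent pointer plus owned-elsewhere children.
struct Node
{
  const Node* parent = nullptr;
  std::vector<Node*> children;
};

// Walk the subtree rooted at NODE, keeping the chain of ancestors in STACK.
// Returns the number of nodes whose parent pointer disagrees with the
// ancestor that the traversal actually came from.
std::size_t count_bad_parents(const Node& node, std::vector<const Node*>& stack)
{
  std::size_t bad = 0;
  const Node* expected = stack.empty() ? nullptr : stack.back();
  if (node.parent != expected)
    ++bad;

  stack.push_back(&node);
  for (const Node* child : node.children)
    bad += count_bad_parents(*child, stack);
  stack.pop_back();
  return bad;
}
```

Having the explicit stack available at every step is also what makes it easy to print the path to an offending node while debugging.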
gcc/ada/ChangeLog:
* vast.adb: Check basic tree properties.
* atree.adb (Traverse_Field): Minor.
* treepr.adb (Destroy): Minor comment.
Eric Botcazou [Wed, 2 Apr 2025 08:02:18 +0000 (10:02 +0200)]
ada: Fix various issues in System.Value_F.Integer_To_Fixed function
The first issue is that the function would wrongly raise Constraint_Error
on the edge case where Val = 2**(Int'Size - 1) and Minus is not set.
The second issue is that some runtimes are compiled with -gnatp and would
fail to raise Constraint_Error when the sum of the terms overflows an Int.
The third issue is that the function takes a long time to deal with huge
negative exponents.
gcc/ada/ChangeLog:
* libgnat/s-valuef.adb (Integer_To_Fixed): Enable overflow checks.
Deal specifically with Val = 2**(Int'Size - 1) if Minus is not set.
Exit the loop when V saturates to 0 in the case of (huge) negative
exponents.
ada: Tweak special handling of synchronized type scopes
Exp_Util.Insert_Actions handles scopes of synchronized types specially,
but the condition it tested before this patch was not quite correct in
some cases, for example during some expansion operations made under
Expand_N_Task_Type_Declaration. This patch refines the test.
Eric Botcazou [Mon, 31 Mar 2025 19:29:27 +0000 (21:29 +0200)]
ada: Implement -gnatRh switch to display holes in record layout
This implements the new (sub)switch -gnatRh to display holes in the layout
of record types, which are mostly present to fulfill alignment requirements.
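The hole computation can be sketched roughly as follows. This is an illustrative C++ model, not the repinfo.adb code: `Component`, `Hole` and `find_holes` are hypothetical names, and positions are expressed in bits from the start of the record as an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one record component.
struct Component
{
  std::string name;
  std::size_t first_bit;
  std::size_t size_in_bits;
};

// A run of unused bits between two consecutive components.
struct Hole
{
  std::size_t first_bit;
  std::size_t size_in_bits;
};

// Order the components by increasing offset, then report every gap
// between the end of the layout seen so far and the next component.
std::vector<Hole> find_holes(std::vector<Component> comps)
{
  std::vector<Hole> holes;
  if (comps.empty())
    return holes;

  std::sort(comps.begin(), comps.end(),
            [](const Component& a, const Component& b)
            { return a.first_bit < b.first_bit; });

  std::size_t end = comps[0].first_bit + comps[0].size_in_bits;
  for (std::size_t i = 1; i < comps.size(); ++i)
    {
      if (comps[i].first_bit > end)
        holes.push_back(Hole{end, comps[i].first_bit - end});
      end = std::max(end, comps[i].first_bit + comps[i].size_in_bits);
    }
  return holes;
}
```

For an 8-bit component at bit 0 followed by a 32-bit component aligned at bit 32, this reports a single hole of 24 unused bits starting at bit 8.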
gcc/ada/ChangeLog:
* doc/gnat_ugn/building_executable_programs_with_gnat.rst (List of
all switches): Add -gnatRh subswitch.
(Debugging Control): Document -gnatRh subswitch.
* opt.ads (List_Representation_Info_Holes): New boolean variable.
* repinfo.adb: Add with clause for GNAT.Heap_Sort_G.
(List_Common_Type_Info): Relax assertion.
(List_Object_Info): Replace assertion with additional test.
(List_Record_Layout): If -gnatRh is specified, make sure that the
components are ordered by increasing offsets. Output a comment
line giving the number of unused bits if there is a hole between
consecutive components. Streamline the control flow of the loop.
(List_Record_Info): Use the original record type giving the layout
of components, if any, to display the layout of the record.
* switch-c.adb (Scan_Front_End_Switches) <-gnatR>: Add support for
-gnatRh subswitch.
* usage.adb (Usage): Document -gnatRh subswitch.
* gnat_ugn.texi: Regenerate.
ada: Adjust alignment calculation for secondary stack
The secondary stack allocator needs to take alignment constraints into
account when doing allocations. In the full runtime the secondary stack
is allocated in chunks on the heap and can grow dynamically. As it does
not grow contiguously the "top" of the stack depends on the size of
the allocation. Therefore the alignment of the stack top is not known at
allocation time and the padding needed for a particular alignment needs
to be calculated conservatively to ensure the allocation fits the
requested size after the base address has been aligned.
On more restricted platforms the secondary stack is a contiguous block
of statically allocated memory. Here the conservative mechanism is not
required since the allocation's base address is known and the required
padding can be calculated right away. The conservative approach also
sometimes causes an allocation to be slightly larger than it needs to
be. This can be a problem on platforms with limited RAM availability. To
avoid this problem modify the calculation of the required padding on these
platforms to always exactly fit the required size.
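The difference between the two padding strategies can be sketched in a few lines. This is a hedged illustration in C++, not the s-secsta.adb code: the function names are hypothetical, and the worst-case formula (alignment minus one) is an assumption about what "conservative" means here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Conservative padding: the base address is not known yet, so reserve
// enough extra room for the worst possible misalignment (assumed formula).
constexpr std::size_t conservative_padding(std::size_t alignment)
{
  return alignment - 1;
}

// Exact padding: the base address is known, so compute precisely how many
// bytes are needed to reach the next multiple of ALIGNMENT.
inline std::size_t exact_padding(std::uintptr_t base, std::size_t alignment)
{
  // ALIGNMENT must be a non-zero power of two.
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (alignment - base % alignment) % alignment;
}
```

With a 16-byte alignment, the conservative version always reserves 15 extra bytes, while the exact version reserves 12 bytes for a base address of 0x1004 and none for 0x1000.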
gcc/ada/ChangeLog:
* libgnat/s-secsta.adb (SS_Allocate): Add comment about
conservative alignment padding calculation.
* libgnat/s-secsta__cheri.adb (SS_Allocate): Add comment about
conservative alignment padding calculation.
This patch removes various calls on entities that have their entity
kinds set to E_Void, by giving those entities their proper kinds earlier.
This is to prepare for potential new future invariant checks on entities.
gcc/ada/ChangeLog:
* sem_ch3.adb (Constrain_Index, Make_Index, Array_Type_Declaration,
Analyze_Number_Declaration): Remove uses of E_Void.
Eric Botcazou [Fri, 28 Mar 2025 17:51:23 +0000 (18:51 +0100)]
ada: Document supported GCC optimization switches
In particular the most recently added ones, namely -Og and -Oz. But -Ofast
is not documented because it disregards strict compliance with standards.
gcc/ada/ChangeLog:
* usage.adb (Usage): Justify the documentation of common switches
like that of other switches. Rework that of the -O switch.
* doc/gnat_ugn/building_executable_programs_with_gnat.rst (Compiler
switches) <-O>: Rework and document 'z' and 'g' operands.
* doc/gnat_ugn/gnat_and_program_execution.rst (Optimization Levels):
Rework and document -Oz and -Og switches.
* gnat_ugn.texi: Regenerate.
Before this patch, Constrain_Index always started by creating an itype
but then sometimes did not use it for anything. This patch makes it so an
itype is only created when needed.
Alfie Richards [Thu, 27 Mar 2025 14:12:06 +0000 (14:12 +0000)]
Refactor record_function_versions.
Renames record_function_versions to add_function_version, and makes it
explicit that it is adding a single version to the function structure.
Additionally, change the insertion point to always maintain priority ordering
of the versions.
This allows for removing logic for moving the default to the first
position which was duplicated across target specific code and enables
easier reasoning about function sets.
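An ordered insertion of this kind can be sketched as below. This is only an illustration, not the cgraph code: `Version` and `add_version` are hypothetical, and the convention that the default version has the lowest priority value (and so naturally stays first) is an assumption.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical function version; priority 0 is assumed to mean "default".
struct Version
{
  std::string name;
  int priority;
};

// Insert a single version while keeping VERSIONS sorted by priority, so
// callers never need a separate pass to move the default to the front.
void add_version(std::vector<Version>& versions, Version v)
{
  auto pos = std::upper_bound(versions.begin(), versions.end(), v,
                              [](const Version& a, const Version& b)
                              { return a.priority < b.priority; });
  versions.insert(pos, std::move(v));
}
```

Since the container is sorted after every insertion, target-specific code can rely on the ordering instead of re-establishing it.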
gcc/ChangeLog:
* cgraph.cc (cgraph_node::record_function_versions): Refactor and
rename to...
(cgraph_node::add_function_version): new function.
* cgraph.h (cgraph_node::record_function_versions): Refactor and
rename to...
(cgraph_node::add_function_version): new function.
* config/aarch64/aarch64.cc (aarch64_get_function_versions_dispatcher):
Remove reordering.
* config/i386/i386-features.cc (ix86_get_function_versions_dispatcher):
Remove reordering.
* config/riscv/riscv.cc (riscv_get_function_versions_dispatcher):
Remove reordering.
* config/rs6000/rs6000.cc (rs6000_get_function_versions_dispatcher):
Remove reordering.
gcc/cp/ChangeLog:
* decl.cc (maybe_version_functions): Change record_function_versions
call to add_function_version.
Jakub Jelinek [Thu, 12 Jun 2025 06:33:38 +0000 (08:33 +0200)]
c, c++: Save 8 bytes of memory in lang_type for non-ObjC*
For C++26 P2786R13 I'm afraid I'll need 4-6 new flags on class types
in struct lang_type (1 bit for trivially_relocatable_if_eligible,
1 for replaceable_if_eligible, 1 for not_trivially_relocatable and
1 for not_replaceable and perhaps 2 bits whether the last 2 have been
computed already) and there are just 2 bits left.
The following patch is an attempt to save 8 bytes of memory
in those structures when not compiling ObjC or ObjC++ (I think those
are used fairly rarely and the patch keeps the sizes unmodified for
those 2). The old allocations were 32 bytes for C and 120 bytes
for C++. The patch moves the objc_info member last in the C++ case
(it was already last in the C case), arranges for GC to skip it
for C and C++ but walk for ObjC and ObjC++ and allocates or
copies over just offsetof bytes instead of sizeof.
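The size arithmetic behind this can be sketched as follows. The struct is a hypothetical stand-in, not the real struct lang_type, and the sketch only computes sizes; it does not perform the trimmed allocation itself.

```cpp
#include <cstddef>

// Hypothetical stand-in for a structure whose last member is only needed
// by some front ends.
struct type_info_like
{
  void* fields[3];
  unsigned flags;
  union maybe_extra { void* extra; } info;  // kept last on purpose
};

// Size to allocate (or copy): the whole structure when the trailing member
// is needed, otherwise only the bytes that precede it.
constexpr std::size_t allocation_size(bool needs_extra)
{
  return needs_extra ? sizeof(type_info_like)
                     : offsetof(type_info_like, info);
}

static_assert(allocation_size(false) < allocation_size(true),
              "dropping the trailing member saves memory");
```

On a typical LP64 target the trailing pointer-sized member accounts for 8 bytes, which matches the saving named in the subject line.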
2025-06-12 Jakub Jelinek <jakub@redhat.com>
gcc/c/
* c-lang.h (union lang_type::maybe_objc_info): New type.
(struct lang_type): Use union maybe_objc_info info member
instead of tree objc_info.
* c-decl.cc (finish_struct): Allocate struct lang_type using
ggc_internal_cleared_alloc instead of ggc_cleared_alloc,
and use sizeof (struct lang_type) for ObjC and otherwise
offsetof (struct lang_type, info) as size.
(finish_enum): Likewise.
gcc/cp/
* cp-tree.h (union lang_type::maybe_objc_info): New type.
(struct lang_type): Use union maybe_objc_info info member
instead of tree objc_info.
* lex.cc (copy_lang_type): Use sizeof (struct lang_type)
just for ObjC++ and otherwise offsetof (struct lang_type, info).
(maybe_add_lang_type_raw): Likewise.
(cxx_make_type): Formatting fix.
gcc/objc/
* objc-act.h (TYPE_OBJC_INFO): Define to info.objc_info
instead of objc_info.
gcc/objcp/
* objcp-decl.h (TYPE_OBJC_INFO): Define to info.objc_info
instead of objc_info.
Yuao Ma [Wed, 11 Jun 2025 15:33:35 +0000 (23:33 +0800)]
fortran: add intrinsic doc for trig functions with half revolutions
This patch is a follow-up to commit r16-938-ge8fdd55ec90749. In this patch, we
add intrinsic documentation for the newly added trig functions with half
revolutions. We also reorder the documentation for `atand` to place it in
correct alphabetical order.
PR fortran/113152
gcc/fortran/ChangeLog:
* intrinsic.texi: Document new half-revolution trigonometric
functions. Reorder doc for atand.
Signed-off-by: Yuao Ma <c8ef@outlook.com> Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
c/c++: Handle '#pragma GCC target optimize' early [PR48026]
Handle '#pragma GCC optimize' earlier as the __OPTIMIZE__ macro may need
to be defined as well for certain usages. Add additional tests for the
'#pragma GCC target' case with auto-vectorization enabled and multiple
combinations of namespaces and/or class member functions.
This is similar to what was done for `#pragma GCC target` in r14-4967-g8697d3a1dcf327,
to fix the similar issue there.
* c-pragma.cc (init_pragma): Use c_register_pragma_with_early_handler
instead of c_register_pragma for `#pragma GCC optimize`.
gcc/testsuite/ChangeLog:
* c-c++-common/pragma-optimize-1.c: New test.
* g++.target/i386/vect-pragma-target-1.C: New test.
* g++.target/i386/vect-pragma-target-2.C: New test.
* gcc.target/i386/vect-pragma-target-1.c: New test.
* gcc.target/i386/vect-pragma-target-2.c: New test.
Signed-off-by: Gwenole Beauchesne <gb.devel@gmail.com> Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
Jonathan Wakely [Thu, 29 May 2025 12:49:04 +0000 (13:49 +0100)]
libstdc++: Add _GLIBCXX_USE_BUILTIN_TRAIT to Doxygen config
This ensures that Doxygen sees the simpler definitions of type traits,
which are implemented using the built-ins.
Also add _GLIBCXX_HAVE_ICONV (which is less important) and fix some
typos for _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE and
_GLIBCXX_END_INLINE_ABI_NAMESPACE.
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (PREDEFINED): Remove -D prefixes from
some macros. Define _GLIBCXX_USE_BUILTIN_TRAIT and
_GLIBCXX_HAVE_ICONV macros.
Martin Uecker [Mon, 9 Jun 2025 16:48:43 +0000 (18:48 +0200)]
c: remaining fix for the composite type inconsistency [PR120510]
There is an old GNU extension which allows overriding the
promoted old-style arguments when there is an earlier prototype.
An example (from a test added for PR16666) is the following.
  float dremf (float, float);

  float
  dremf (x, y)
       float x, y;
  {
    return x + y;
  }
The types of the two declarations are not compatible, because
the arguments are not self-promoting. Add a special case
to function_types_compatible_p that can be toggled via a flag
for comptypes_internal and add a helper function to be able to
add the checking assertions to composite_type.
PR c/120510
gcc/c/ChangeLog:
* c-typeck.cc (composite_type_internal): Activate checking
assertions for all types and also inputs.
(comptypes_for_composite_check): New helper function.
(function_types_compatible_p): Add exception.
gcc/testsuite/ChangeLog:
* gcc.dg/old-style-prom-4.c: New test.
These changes help make it possible to compile on MacOS. In
addition to guarding clock_settime() calls, it removes the use
of structures and constants needed for clock_settime().
David Malcolm [Wed, 11 Jun 2025 18:21:41 +0000 (14:21 -0400)]
diagnostics: add selftests for html_token_printer [PR116792]
No functional change intended.
gcc/ChangeLog:
PR other/116792
* diagnostic-format-html.cc: Include "selftest-xml.h".
(html_builder::make_element_for_diagnostic): Move...
(class html_token_printer): ...from local to the function
to the global namespace.
(struct selftest::token_printer_test): New.
(selftest::test_token_printer): New.
(selftest::test_simple_log): Simplify using ASSERT_XML_PRINT_EQ.
(selftest::test_metadata): Likewise.
(selftest::diagnostic_format_html_cc_tests): Run the new test.
* selftest-xml.h: New file.
* xml.cc: Include "selftest-xml.h".
(selftest::assert_xml_print_eq): New.
(selftest::test_no_dtd): Simplify using ASSERT_XML_PRINT_EQ.
(selftest::test_printer): Likewise.
(selftest::test_attribute_ordering): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The instruction scheduler appears to be speculatively hoisting vsetvl
insns outside of their basic block without checking for data
dependencies. This resulted in a situation where the following occurs:
vsetvli a5,a1,e32,m1,tu,ma
vle32.v v2,0(a0)
sub a1,a1,a5 <-- a1 potentially set to 0
sh2add a0,a5,a0
vfmacc.vv v1,v2,v2
vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0
beq a1,zero,.L12 <-- check if avl is 0
This patch would essentially delay the vsetvl update to after the branch
to prevent unnecessarily updating the vinfo at the end of a basic block.
Uros Bizjak [Wed, 11 Jun 2025 12:12:33 +0000 (14:12 +0200)]
i386: Fix signed integer overflow in ix86_expand_int_movcc [PR120604]
Patch for PR120553 enabled full 64-bit DImode immediates in
ix86_expand_int_movcc. However, the function calculates the difference
between two immediate arguments using signed 64-bit HOST_WIDE_INT
subtractions that can cause signed integer overflow.
Avoid the overflow by casting operands of subtractions to
(unsigned HOST_WIDE_INT).
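The technique is general and can be shown in isolation. The sketch below uses std::int64_t and std::uint64_t in place of HOST_WIDE_INT and unsigned HOST_WIDE_INT; `wrapping_diff` is a hypothetical helper, not a function from i386-expand.cc.

```cpp
#include <cstdint>

// Subtract two signed 64-bit values without risking signed overflow:
// unsigned arithmetic is defined to wrap modulo 2^64, whereas overflow in
// signed subtraction is undefined behavior.
inline std::uint64_t wrapping_diff(std::int64_t a, std::int64_t b)
{
  return static_cast<std::uint64_t>(a) - static_cast<std::uint64_t>(b);
}
```

For example, INT64_MAX - INT64_MIN does not fit in a signed 64-bit integer, but `wrapping_diff(INT64_MAX, INT64_MIN)` is well defined and yields UINT64_MAX.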
PR target/120604
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_int_movcc): Cast operands of
signed 64-bit HOST_WIDE_INT subtractions to (unsigned HOST_WIDE_INT).
Simplify the structure of the Makefile, because now we don't need so
many different variables. Also, list the m4 directory entirely in
EXTRA_DIST instead of individual files, to reduce information
duplication.
I also introduce some level of checking for the regenerate.sh script.
It should catch cases where the list of generated files gets out
of sync between Makefile.am and regenerate.sh.
RISC-V: Add patterns for vector-scalar negate-(multiply-add/sub) [PR119100]
This pattern enables the combine pass (or late-combine, depending on the case)
to merge a vec_duplicate into a (possibly negated) minus-mult RTL instruction.
Before this patch, we have two instructions, e.g.:
vfmv.v.f v6,fa0
vfnmadd.vv v2,v6,v4
After, we get only one:
vfnmadd.vf v2,fa0,v4
This also fixes a sign mistake in the handling of vfmsub.
PR target/119100
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*<optab>_vf_<mode>): Only handle vfmadd
and vfmsub.
(*vfnmsub_<mode>): New pattern.
(*vfnmadd_<mode>): New pattern.
* config/riscv/riscv.cc (get_vector_binary_rtx_cost): Add cost model for
NEG and VEC_DUPLICATE.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f16.c: Add vfnmadd and
vfnmsub.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop.h: Add support for neg
variants. Fix sign for sub.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_data.h: Add data for neg
variants. Fix data for sub.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_run.h: Rename x to f.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f16.c: Add neg
argument.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f64.c: New test.
Jonathan Wakely [Thu, 22 May 2025 13:49:25 +0000 (14:49 +0100)]
libstdc++: Replace some uses of std::__addressof with std::addressof
Since r16-154-gc91eb5a5c13f14 std::addressof is no less efficient than
std::__addressof, so change some uses of the latter to the former. We
can't change them all, because some uses need to compile as C++98 which
only has std::__addressof.
Similarly, since r16-848-gb2aeeb2803f97b std::is_constant_evaluated is
no less efficient than std::__is_constant_evaluated.
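For readers unfamiliar with why library code uses std::addressof at all, here is a small self-contained example. `Odd` and `real_address` are hypothetical names used only for illustration.

```cpp
#include <memory>

// A type with an overloaded unary operator& that hides the real address.
struct Odd
{
  int value;
  Odd* operator&() { return nullptr; }
};

// Generic code cannot trust the built-in syntax &obj, because a user type
// may overload it; std::addressof always yields the actual address.
template<typename T>
T* real_address(T& obj)
{
  return std::addressof(obj);
}
```

Given `Odd o{1};`, the expression `&o` calls the overload and returns a null pointer, while `real_address(o)` points at the object itself.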
libstdc++-v3/ChangeLog:
* include/bits/stl_construct.h: Replace std::__addressof with
std::addressof in code that doesn't need to compile as C++98.
Replace std::__is_constant_evaluated with
std::is_constant_evaluated in code that doesn't need to compile
as C++17 or earlier.
* include/bits/stl_uninitialized.h: Likewise for __addressof.
Jonathan Wakely [Tue, 10 Jun 2025 08:34:50 +0000 (09:34 +0100)]
libstdc++: Remove unused 'test' variables in test cases
These variables could be used by custom definitions of the VERIFY macro
prior to GCC 7.1 but serve no purpose now. They can be removed, along
with the documentation with the historical note.
Jonathan Wakely [Wed, 21 May 2025 22:48:34 +0000 (23:48 +0100)]
libstdc++: Improve diagnostics for ill-formed std::_Destroy and std::_Destroy_n [PR120390]
By using std::is_trivially_destructible instead of the old
__has_trivial_destructor built-in we no longer need the static_assert to
deal with types with deleted destructors. All non-destructible types,
including those with deleted destructors, will now give user-friendly
diagnostics that clearly explain the problem.
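The behavior of the standard trait for such types can be checked directly. The sketch below is illustrative only; `can_skip_destructor_calls` is a hypothetical helper and not part of stl_construct.h.

```cpp
#include <type_traits>

struct Plain { int x; };                  // trivially destructible
struct NonTrivial { ~NonTrivial() {} };   // user-provided destructor
struct Deleted { ~Deleted() = delete; };  // not destructible at all

// Hypothetical dispatch helper: true only when destroying a range of T
// can be skipped entirely.
template<typename T>
constexpr bool can_skip_destructor_calls()
{
  return std::is_trivially_destructible<T>::value;
}

static_assert(can_skip_destructor_calls<Plain>(), "nothing to run");
static_assert(!can_skip_destructor_calls<NonTrivial>(), "destructor must run");
// The trait requires the type to be destructible, so a deleted destructor
// yields false and the type takes the path that calls the destructor,
// where the compiler reports the error.
static_assert(!can_skip_destructor_calls<Deleted>(), "not destructible");
```

Since the trait is already false for non-destructible types, no separate static_assert is needed to catch them.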
Also combine the _Destroy_aux and _Destroy_n_aux class templates used
for C++98 into one, so that we perform fewer expensive class template
instantiations.
libstdc++-v3/ChangeLog:
PR libstdc++/120390
* include/bits/stl_construct.h (_Destroy_aux::__destroy_n): New
static member function.
(_Destroy_aux<true>::__destroy_n): Likewise.
(_Destroy_n_aux): Remove.
(_Destroy(ForwardIterator, ForwardIterator)): Remove
static_assert. Use is_trivially_destructible instead of
__has_trivial_destructor.
(_Destroy_n): Likewise. Use _Destroy_aux::__destroy_n instead of
_Destroy_n_aux::__destroy_n.
* testsuite/20_util/specialized_algorithms/memory_management_tools/destroy_neg.cc:
Adjust dg-error strings. Move destroy_n tests to ...
* testsuite/20_util/specialized_algorithms/memory_management_tools/destroy_n_neg.cc:
New test.
* testsuite/23_containers/vector/cons/destructible_debug_neg.cc:
Adjust dg-error strings.
* testsuite/23_containers/vector/cons/destructible_neg.cc:
Likewise.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Tue, 10 Jun 2025 09:52:13 +0000 (10:52 +0100)]
libstdc++: Fix new <sstream> tests for COW std::string ABI
The std::basic_stringbuf::get_allocator() member is only available for
the SSO std::string ABI.
libstdc++-v3/ChangeLog:
* testsuite/27_io/basic_istringstream/cons/char/string_view.cc:
Only check get_allocator() for new string ABI.
* testsuite/27_io/basic_ostringstream/cons/char/string_view.cc:
Likewise.
* testsuite/27_io/basic_stringbuf/cons/char/string_view.cc:
Likewise.
* testsuite/27_io/basic_stringstream/cons/char/string_view.cc:
Likewise.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jakub Jelinek [Wed, 11 Jun 2025 05:16:06 +0000 (07:16 +0200)]
testsuite: Add -mpopcnt and -mabm variants of PR90693 tests
My r16-1398 patch broke bootstrap on aarch64-linux and powerpc64le-linux
at least. Fixed with r16-1408.
The following patch just adds testcases with which the bug can be reproduced
also on x86_64-linux where it hasn't been caught by the testsuite (while
there are 2 tests with it, both were compiled with -mno-abm -mno-popcnt
and so didn't trigger the right path). This patch just includes those
tests in 4 further ones, two with -mpopcnt and two with -mabm flags.
2025-06-11 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/90693
* gcc.target/i386/pr90693-3.c: New test.
* gcc.target/i386/pr90693-4.c: New test.
* gcc.target/i386/pr90693-5.c: New test.
* gcc.target/i386/pr90693-6.c: New test.
Jakub Jelinek [Wed, 11 Jun 2025 05:03:04 +0000 (07:03 +0200)]
ranger: Handle the theoretical case of GIMPLE_COND with one succ edge during expansion [PR120434]
On Tue, Jun 10, 2025 at 10:51:25AM -0400, Andrew MacLeod wrote:
> Edge range should be fine, and really that assert doesn't really need to be
> there.
>
> Where the issue could arise is in gimple-range-fold.cc in
> fold_using_range::range_of_range_op() where we see something like:
>
> else if (is_a<gcond *> (s) && gimple_bb (s))
> {
> basic_block bb = gimple_bb (s);
> edge e0 = EDGE_SUCC (bb, 0);
> edge e1 = EDGE_SUCC (bb, 1);
>
> if (!single_pred_p (e0->dest))
> e0 = NULL;
> if (!single_pred_p (e1->dest))
> e1 = NULL;
> src.register_outgoing_edges (as_a<gcond *> (s),
> as_a <irange> (r), e0, e1);
>
> Although, now that I look at it, it doesn't need much adjustment, just the
> expectation that there are 2 edges. I suppose EDGE_SUCC (bb, 1) could
> potentially trap if there is only one edge. We'd just have to guard it and
> allow for that case.
This patch implements that.
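The shape of such a guard can be sketched outside of GCC as follows. This is not the gimple-range-fold.cc change: `Edge`, `Block`, `OutgoingEdges` and `usable_edges` are hypothetical, and a plain predecessor count stands in for single_pred_p.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical successor edge: records how many predecessors its
// destination block has.
struct Edge
{
  int dest_preds;
};

// Hypothetical basic block with zero or more successor edges.
struct Block
{
  std::vector<Edge> succs;
};

struct OutgoingEdges
{
  const Edge* e0;
  const Edge* e1;
};

// Fetch up to two successor edges, tolerating a block that has only one,
// and drop any edge whose destination has more than one predecessor.
OutgoingEdges usable_edges(const Block& bb)
{
  const Edge* e0 = bb.succs.size() > 0 ? &bb.succs[0] : nullptr;
  const Edge* e1 = bb.succs.size() > 1 ? &bb.succs[1] : nullptr;

  if (e0 && e0->dest_preds != 1)
    e0 = nullptr;
  if (e1 && e1->dest_preds != 1)
    e1 = nullptr;
  return OutgoingEdges{e0, e1};
}
```

The second edge is only looked up when it exists, so a conditional with a single successor no longer indexes past the end of the edge list.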
2025-06-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/120434
* gimple-range-fold.cc: Include rtl.h.
(fold_using_range::range_of_range_op): Handle bb ending with
GIMPLE_COND during RTL expansion where there is only one succ
edge instead of two.
Jakub Jelinek [Wed, 11 Jun 2025 05:00:27 +0000 (07:00 +0200)]
internal-fn: Fix up .POPCOUNT expansion
Apparently my ranger during expansion patch broke bootstrap on
aarch64-linux, while building libsupc++, there is endless recursion
on __builtin_popcountl (x) == 1 expansion.
The hack to temporarily replace SSA_NAME_VAR of the lhs which replaced
the earlier hack to temporarily change the gimple_call_lhs relies on
the lhs being expanded with EXPAND_WRITE when expanding that ifn call.
Unfortunately, in two spots I was using expand_normal (lhs) instead
of expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE) which was used
everywhere else in internal-fn.cc. This happened to work fine in the
past, but doesn't anymore. git blame shows it was my patch using
these incorrect calls.
2025-06-11 Jakub Jelinek <jakub@redhat.com>
* internal-fn.cc (expand_POPCOUNT): Use
expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE) instead of
expand_normal (lhs).
David Malcolm [Wed, 11 Jun 2025 00:06:38 +0000 (20:06 -0400)]
diagnostics: make experimental-html sink prettier [PR116792]
This patch to the "experimental-html" diagnostic sink:
* adds use of the PatternFly 3 CSS library (via an optional link
in the generated html to a copy in a CDN)
* uses PatternFly's "alert" pattern to show severities for diagnostics,
properly nesting "note" diagnostics for diagnostic groups.
Example:
before: https://dmalcolm.fedorapeople.org/gcc/2025-06-10/before/diagnostic-ranges.c.html
after: https://dmalcolm.fedorapeople.org/gcc/2025-06-10/after/diagnostic-ranges.c.html
* adds initial support for logical locations and physical locations
* adds initial support for multi-level nested diagnostics such as those
for C++ concepts diagnostics. Ideally this would show a clickable
disclosure widget to expand/collapse a level, but for now it uses
nested <ul> elements with <li> for the child diagnostics.
Example:
before: https://dmalcolm.fedorapeople.org/gcc/2025-06-10/before/nested-diagnostics-1.C.html
after: https://dmalcolm.fedorapeople.org/gcc/2025-06-10/after/nested-diagnostics-1.C.html
gcc/ChangeLog:
PR other/116792
* diagnostic-format-html.cc: Include "diagnostic-path.h" and
"diagnostic-client-data-hooks.h".
(html_builder::m_logical_loc_mgr): New field.
(html_builder::m_cur_nesting_levels): New field.
(html_builder::m_last_logical_location): New field.
(html_builder::m_last_location): New field.
(html_builder::m_last_expanded_location): New field.
(HTML_STYLE): Add "white-space: pre;" to .source and .annotation.
Add "gcc-quoted-text" CSS class.
(html_builder::html_builder): Initialize the new fields. If CSS
is enabled, add CDN links to PatternFly 3 stylesheets.
(html_builder::add_stylesheet): New.
(html_builder::on_report_diagnostic): Add "alert" param to
make_element_for_diagnostic, setting it by default, but unsetting
it for nested diagnostics below the top level. Use
add_at_nesting_level for nested diagnostics.
(add_nesting_level_attr): New.
(html_builder::add_at_nesting_level): New.
(get_pf_class_for_alert_div): New.
(get_pf_class_for_alert_icon): New.
(get_label_for_logical_location_kind): New.
(add_labelled_value): New.
(html_builder::make_element_for_diagnostic): Add leading comment.
Add "alert" param. Drop class="gcc-diagnostic" from <div> tag,
instead adding the class for a PatternFly 3 alert if "alert" is
true, and adding a <span> with an alert icon, both according to
the diagnostic severity. Add a severity prefix to the message for
alerts. Add any metadata/option text as suffixes to the message.
Show any logical location. Show any physical location. Don't
show the locus if the last location is unchanged within the
diagnostic_group. Wrap any execution path element in a
<div id="execution-path"> and add a label to it. Wrap any
generated patch in a <div id="suggested-fix"> and add a label
to it.
(selftest::test_simple_log): Update expected HTML.
gcc/testsuite/ChangeLog:
PR other/116792
* gcc.dg/html-output/missing-semicolon.py: Update for changes
to diagnostic elements.
* gcc.dg/format/diagnostic-ranges-html.py: Likewise.
* gcc.dg/plugin/diagnostic-test-metadata-html.py: Likewise. Drop
out-of-date comment.
* gcc.dg/plugin/diagnostic-test-paths-2.py: Likewise.
* gcc.dg/plugin/diagnostic-test-paths-4.py: Likewise. Drop
out-of-date comment.
* gcc.dg/plugin/diagnostic-test-show-locus.py: Likewise.
* lib/htmltest.py (get_diag_by_index): Update to use search by id.
(get_message_within_diag): Update to use search by class.
libcpp/ChangeLog:
PR other/116792
* include/line-map.h (typedef expanded_location): Convert to...
(struct expanded_location): ...this.
(operator==): New decl, for expanded_location.
(operator!=): Likewise.
* line-map.cc (operator==): New decl, for expanded_location.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 11 Jun 2025 00:06:37 +0000 (20:06 -0400)]
diagnostics: fix tag nesting issues in experimental-html sink [PR120610]
I've been seeing issues in the experimental-html sink where the nesting
of tags goes wrong.
The two issues I've seen are:
* the pp_token_list from the diagnostic message that reaches the
html_token_printer doesn't always have matching pairs of begin/end
tokens (PR other/120610)
* a bug in diagnostic-show-locus where there was a stray xp.pop_tag,
in print_trailing_fixits.
This patch:
* changes the xml::printer::pop_tag API so that it now takes the
expected name of the element being popped (rather than expressing this
in comments), and that, by default, the xml::printer asserts that this
matches.
* gives the html_token_printer its own xml::printer instance to restrict
the affected area of the DOM tree; this xml::printer doesn't enforce
nesting (PR other/120610)
* adds RAII sentinel classes that automatically check for pushes/pops
being balanced within a scope, using them in various places
* fixes the bug in print_trailing_fixits for html output
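The first and third points above can be sketched together in a few lines. This is a simplified model, not the xml::printer code: `tag_printer` and `nesting_sentinel` are hypothetical classes, and the real API may differ in its signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical printer that tracks the stack of open tags.
class tag_printer
{
public:
  void push_tag(const std::string& name)
  {
    m_open.push_back(name);
    m_out += "<" + name + ">";
  }

  // The caller states which element it believes it is closing, and the
  // printer asserts that this matches the innermost open tag.
  void pop_tag(const std::string& expected_name)
  {
    assert(!m_open.empty());
    assert(m_open.back() == expected_name);
    m_out += "</" + m_open.back() + ">";
    m_open.pop_back();
  }

  std::size_t num_open_tags() const { return m_open.size(); }
  const std::string& str() const { return m_out; }

private:
  std::vector<std::string> m_open;
  std::string m_out;
};

// RAII sentinel: records the nesting depth on entry to a scope and asserts
// on exit that pushes and pops were balanced within that scope.
class nesting_sentinel
{
public:
  explicit nesting_sentinel(const tag_printer& p)
    : m_printer(p), m_initial_depth(p.num_open_tags())
  { }

  ~nesting_sentinel()
  {
    assert(m_printer.num_open_tags() == m_initial_depth);
  }

  nesting_sentinel(const nesting_sentinel&) = delete;
  nesting_sentinel& operator=(const nesting_sentinel&) = delete;

private:
  const tag_printer& m_printer;
  std::size_t m_initial_depth;
};
```

A stray pop of the kind described for print_trailing_fixits would trip either the name check in `pop_tag` or the depth check in the sentinel's destructor, close to where the mistake was made.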
gcc/ChangeLog:
PR other/120610
* diagnostic-format-html.cc (html_builder::html_builder): Update
for new param of xml::printer::pop_tag.
(html_path_label_writer::end_label): Likewise.
(html_builder::make_element_for_diagnostic::html_token_printer):
Give the instance its own xml::printer. Update for new param of
xml::printer::pop_tag.
(html_builder::make_element_for_diagnostic): Give the instance its
own xml::printer.
(html_builder::make_metadata_element): Update for new param of
xml::printer::pop_tag.
(html_builder::flush_to_file): Likewise.
* diagnostic-path-output.cc (begin_html_stack_frame): Likewise.
(begin_html_stack_frame): Likewise.
(end_html_stack_frame): Likewise.
(print_path_summary_as_html): Likewise.
* diagnostic-show-locus.cc
(struct to_text::auto_check_tag_nesting): New.
(struct to_html::auto_check_tag_nesting): New.
(to_text::pop_html_tag): Change param to const char *.
(to_html::pop_html_tag): Likewise; rename param to
"expected_name".
(default_diagnostic_start_span_fn<to_html>): Update for new param
of xml::printer::pop_tag.
(layout_printer<to_html>::end_label): Likewise.
(layout_printer<Sink>::print_trailing_fixits): Add RAII sentinel
to check tag nesting for the HTML case. Delete stray popping
of "td" in the presence of fix-it hints.
(layout_printer<Sink>::print_line): Add RAII sentinel
to check tag nesting for the HTML case.
(diagnostic_source_print_policy::print_as_html): Likewise.
(layout_printer<Sink>::print): Likewise.
* xml-printer.h (xml::printer::printer): Add optional
"check_popped_tags" param.
(xml::printer::pop_tag): Add "expected_name" param.
(xml::printer::get_num_open_tags): New accessor.
(xml::printer::dump): New decl.
(xml::printer::m_check_popped_tags): New field.
(class xml::auto_check_tag_nesting): New.
(class xml::auto_print_element): Update for new param of pop_tag.
* xml.cc: Move pragma pop so that the pragma also covers
xml::printer's member functions, "dump" in particular.
(xml::printer::printer): Add param "check_popped_tags".
(xml::printer::pop_tag): Add param "expected_name" and use it to assert
that the popped tag is as expected. Assert that we have a tag to
pop.
(xml::printer::dump): New.
(selftest::test_printer): Update for new param of pop_tag.
(selftest::test_attribute_ordering): Likewise.
Spotted whilst implementing nesting support in the
experimental-html diagnostic sink.
gcc/ChangeLog:
* gimple-ssa-warn-access.cc
(pass_waccess::maybe_check_dealloc_call): Add missing
auto_diagnostic_group to nest the "returned from %qD"
note within the warning.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>