Added by P2985R0 for C++26. This simply exposes the compiler
builtin, and adds the feature-testing macro.
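As a rough illustration of how the new trait might be used from user code (a sketch only: the class names and the helper check_virtual_base_trait are made up for this example, and the trait is only exercised when the library defines __cpp_lib_is_virtual_base_of):

```cpp
#include <type_traits>

struct Base {};
struct VirtualDerived : virtual Base {};  // Base is a virtual base here
struct PlainDerived : Base {};            // Base is a non-virtual base here

// Hypothetical helper: returns 1 when the trait is available and gives the
// expected answers, 0 when the library in use does not provide the trait.
constexpr int check_virtual_base_trait()
{
#if defined(__cpp_lib_is_virtual_base_of)
  static_assert(std::is_virtual_base_of_v<Base, VirtualDerived>, "");
  static_assert(!std::is_virtual_base_of_v<Base, PlainDerived>, "");
  return 1;
#else
  return 0;
#endif
}
```

Guarding on the feature-testing macro keeps the example compilable with libraries that predate the trait.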
libstdc++-v3/ChangeLog:
* include/bits/version.def: Added the feature-testing macro.
* include/bits/version.h: Regenerated.
* include/std/type_traits: Add support for
std::is_virtual_base_of and std::is_virtual_base_of_v,
implemented in terms of the compiler builtin.
* testsuite/20_util/is_virtual_base_of/value.cc: New test.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Eric Botcazou [Sat, 5 Oct 2024 12:39:14 +0000 (14:39 +0200)]
Fix various issues of -ftrivial-auto-var-init=zero with Ada
This polishes a few rough edges that prevent -ftrivial-auto-var-init=zero
from working in Ada:
- build_common_builtin_nodes declares BUILT_IN_CLEAR_PADDING with 3
instead of 2 parameters, now gimple_fold_builtin_clear_padding contains
the assertion:
gcc_assert (gimple_call_num_args (stmt) == 2)
This causes gimple_builtin_call_types_compatible_p to always return false
in Ada (this works in C/C++ because another declaration is used).
- gimple_add_init_for_auto_var uses EXPR_LOCATION to fetch the location
of a DECL node, which always returns UNKNOWN_LOCATION.
- the machinery attempts to initialize Out parameters.
gcc/
PR middle-end/116933
* gimplify.cc (gimple_add_init_for_auto_var): Use the correct macro
to fetch the source location of the variable.
* tree.cc (common_builtin_nodes): Remove the 3rd parameter in the
type of BUILT_IN_CLEAR_PADDING.
gcc/ada/
PR middle-end/116933
* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Out_Parameter>: Add
the "uninitialized" attribute on Out parameters.
* gcc-interface/utils.cc (gnat_internal_attributes): Add entry for
the "uninitialized" attribute.
(handle_uninitialized_attribute): New function.
gcc/testsuite/
* gnat.dg/auto_var_init.adb: New test.
Richard Biener [Fri, 4 Oct 2024 09:13:58 +0000 (11:13 +0200)]
Improve load permutation lowering
The following makes sure the emitted even/odd extraction scheme
follows one that ends up with actual trivial even/odd extract permutes.
When we choose a level 2 extract we generate { 0, 1, 4, 5, ... }
which for example the x86 backend doesn't recognize with just SSE
and QImode elements. So this now follows what the non-SLP interleaving
code would do which is element granular even/odd extracts.
This resolves gcc.dg/vect/vect-strided[-a]-u8-i8-gap*.c FAILs with
--param vect-force-slp=1 on x86_64.
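To make the difference between the two permute shapes concrete, here is a small standalone sketch (not vectorizer code; the function name and the "granule" parameter are invented for illustration) that builds the selector of an "even" extract for a given granularity:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: indices selected by an "even" extract over n elements
// that keeps `granule` consecutive elements and then skips `granule` elements.
// granule == 1 is the element-granular extract; granule == 2 corresponds to
// the level 2 shape { 0, 1, 4, 5, ... } mentioned above.
std::vector<std::size_t> even_extract_indices(std::size_t n, std::size_t granule)
{
  std::vector<std::size_t> idx;
  if (granule == 0)
    return idx;  // nothing sensible to select
  for (std::size_t start = 0; start < n; start += 2 * granule)
    for (std::size_t k = 0; k < granule && start + k < n; ++k)
      idx.push_back(start + k);
  return idx;
}
```

For 8 elements, a granule of 1 yields 0, 2, 4, 6 and a granule of 2 yields 0, 1, 4, 5.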
David Malcolm [Fri, 4 Oct 2024 22:31:17 +0000 (18:31 -0400)]
diagnostics: bulletproof opening of SARIF output [PR116978]
Introduce a new RAII class diagnostic_output_file to track ownership
of the FILE * for SARIF output.
In particular, the .sarif file is now opened immediately, rather
than at the end of the compile, and so will fail earlier if the
file can't be opened.
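The general idea of an RAII owner for a FILE * that is opened up front can be sketched as follows. This is not the actual diagnostic_output_file class; the name owned_output_file and its members are assumptions made for the example:

```cpp
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical RAII owner of a FILE *, opened in the constructor so that a
// failure to open is visible immediately rather than at the end of the run.
class owned_output_file
{
public:
  owned_output_file() = default;

  explicit owned_output_file(const std::string &path)
    : m_file(std::fopen(path.c_str(), "w")), m_path(path) {}

  owned_output_file(owned_output_file &&other) noexcept
    : m_file(std::exchange(other.m_file, nullptr)),
      m_path(std::move(other.m_path)) {}

  owned_output_file &operator=(owned_output_file &&other) noexcept
  {
    if (this != &other)
      {
        close();
        m_file = std::exchange(other.m_file, nullptr);
        m_path = std::move(other.m_path);
      }
    return *this;
  }

  owned_output_file(const owned_output_file &) = delete;
  owned_output_file &operator=(const owned_output_file &) = delete;

  ~owned_output_file() { close(); }

  bool is_open() const { return m_file != nullptr; }
  std::FILE *get() const { return m_file; }
  const std::string &path() const { return m_path; }

private:
  // Close the stream if we still own one.
  void close()
  {
    if (m_file)
      {
        std::fclose(m_file);
        m_file = nullptr;
      }
  }

  std::FILE *m_file = nullptr;
  std::string m_path;
};
```

A caller would construct the object, check is_open () right away, and report the error at that point instead of discovering it when the output is finally written.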
Doing so fixes a couple of ICEs in -fdiagnostics-format=sarif-file when
invoking, say, cc1 directly, rather than from the driver.
gcc/ChangeLog:
PR other/116978
* diagnostic-format-sarif.cc (sarif_builder::sarif_builder):
Gracefully handle "main_input_filename_" being NULL.
(sarif_output_format::sarif_output_format): Replace param
"base_file_name" with "output_file" and assert that the file
was opened successfully and has a non-NULL filename.
(sarif_output_format::~sarif_file_output_format): Move
responsibility for building the filename and opening the file from
here to the creator of the instance.
(sarif_output_format::m_base_file_name): Replace with...
(sarif_output_format::m_output_file): ...this.
(diagnostic_output_format_init_sarif_file): Make "line_maps" param
non-const. Gracefully handle "base_file_name" being NULL.
Construct the filename and open the file here, rather than in
~sarif_file_output_format, and handle failures immediately here,
rather than at the end of the compile.
* diagnostic-format-sarif.h: Include "diagnostic-output-file.h".
(diagnostic_output_format_init_sarif_file): Make "line_maps" param
non-const.
* diagnostic-output-file.h: New file.
* diagnostic.cc (diagnostic_context::emit_diagnostic): New.
(diagnostic_context::emit_diagnostic_va): New.
* diagnostic.h (diagnostic_context::emit_diagnostic): New decl.
(diagnostic_context::emit_diagnostic_va): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
I should have put the new unspecs in SVE_COND_FP_MAXMIN but I put them in
SVE_COND_FP_BINARY_REG instead. That was incorrect because the
SVE_COND_FP_MAXMIN iterator is being used for predicated floating-point
maximum/minimum, not SVE_COND_FP_BINARY_REG.
Also added a testcase to validate the new change.
Regression tested on aarch64-unknown-linux-gnu and found no regressions.
There are some test cases with "libitm" in their directory names which
appear in compare_tests output as changed tests but it looks like they
are in the output just because of changed build directories, like from
build-patched/aarch64-unknown-linux-gnu/./libitm/* to
build-pristine/aarch64-unknown-linux-gnu/./libitm/*. I didn't think it
was a cause of concern and have pushed this for review.
gcc/ChangeLog:
PR target/116934
* config/aarch64/iterators.md: Move UNSPEC_COND_SMAX and
UNSPEC_COND_SMIN to correct iterators.
gcc/testsuite/ChangeLog:
PR target/116934
* gcc.target/aarch64/sve2/pr116934.c: New test.
AVR: target/116953 - ICE due to operands clobber in avr_out_sbxx_branch.
PR target/116953
gcc/
* config/avr/avr.cc (avr_out_sbxx_branch): Work on a copy of
the operands rather than on operands itself, which is just
recog_data.operand and may be clobbered by jump_over_one_insn_p.
gcc/testsuite/
* gcc.target/avr/torture/pr116953.c: New test.
testsuite - Some float64 and float32x tests require double64plus.
Some of the float64 and float32x test cases use double built-ins
and hence require double64plus or, respectively, that double is at least
as good as float32x (double_float32xplus).
testsuite: Fix fallout of turning warnings into errors on 32-bit Arm
Since commits 2c3db94d9fd ("c: Turn int-conversion warnings into
permerrors") and 55e94561e97e ("c: Turn -Wimplicit-function-declaration
into a permerror") these tests fail with errors such as:
FAIL: gcc.target/arm/pr59858.c (test for excess errors)
FAIL: gcc.target/arm/pr65647.c (test for excess errors)
FAIL: gcc.target/arm/pr65710.c (test for excess errors)
FAIL: gcc.target/arm/pr97969.c (test for excess errors)
Here's one example of the excess errors:
FAIL: gcc.target/arm/pr65647.c (test for excess errors)
Excess errors:
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:17: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:51: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:62: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:7:48: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:8:9: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:24:5: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:25:5: error: initialization of 'int' from 'struct S1 *' makes integer from pointer without a cast [-Wint-conversion]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:41:3: error: implicit declaration of function 'fn3'; did you mean 'fn2'? [-Wimplicit-function-declaration]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:46:3: error: implicit declaration of function 'fn5'; did you mean 'fn4'? [-Wimplicit-function-declaration]
/path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:57:16: error: implicit declaration of function 'fn6'; did you mean 'fn4'? [-Wimplicit-function-declaration]
PR rtl-optimization/59858 and PR target/65710 test the fix of an ICE.
PR target/65647 and PR target/97969 test for a compilation infinite loop.
Therefore, add -fpermissive so that the tests behave as they did previously.
Tested on armv8l-linux-gnueabihf.
Tsung Chun Lin [Fri, 4 Oct 2024 14:02:07 +0000 (08:02 -0600)]
[PATCH] RISC-V/libgcc: Fix incorrect .cfi_offset for saving ra in __riscv_save_[0-3] on ilp32e.
From 8b3c5ebe8aacbcc4ddf1be8dea9a555e7e1bcc39 Mon Sep 17 00:00:00 2001
From: Jim Lin <jim@andestech.com>
Date: Fri, 4 Oct 2024 14:48:12 +0800
Subject: [PATCH] RISC-V/libgcc: Fix incorrect .cfi_offset for saving ra in
__riscv_save_[0-3] on ilp32e.
libgcc/ChangeLog:
* config/riscv/save-restore.S: Fix .cfi_offset for saving ra in
__riscv_save_[0-3] on ilp32e.
Patrick Palka [Fri, 4 Oct 2024 14:01:39 +0000 (10:01 -0400)]
libstdc++/ranges: Implement various small LWG issues
This implements the following small LWG issues:
3848. adjacent_view, adjacent_transform_view and slide_view missing base accessor
3851. chunk_view::inner-iterator missing custom iter_move and iter_swap
3947. Unexpected constraints on adjacent_transform_view::base()
4001. iota_view should provide empty
4012. common_view::begin/end are missing the simple-view check
4013. lazy_split_view::outer-iterator::value_type should not provide default constructor
4035. single_view should provide empty
4053. Unary call to std::views::repeat does not decay the argument
4054. Repeating a repeat_view should repeat the view
libstdc++-v3/ChangeLog:
* include/std/ranges (single_view::empty): Define as per LWG 4035.
(iota_view::empty): Define as per LWG 4001.
(lazy_split_view::_OuterIter::value_type): Remove default
constructor and make other constructor private as per LWG 4013.
(common_view::begin): Disable non-const overload for simple
views as per LWG 4012.
(common_view::end): Likewise.
(adjacent_view::base): Define as per LWG 3848.
(adjacent_transform_view::base): Likewise.
(chunk_view::_InnerIter::iter_move): Define as per LWG 3851.
(chunk_view::_InnerIter::iter_swap): Likewise.
(slide_view::base): Define as per LWG 3848.
(repeat_view): Adjust deduction guide as per LWG 4053.
(_Repeat::operator()): Adjust single-parameter overload as per
LWG 4054.
* testsuite/std/ranges/adaptors/adjacent/1.cc: Verify existence
of base member function.
* testsuite/std/ranges/adaptors/adjacent_transform/1.cc: Likewise.
* testsuite/std/ranges/adaptors/chunk/1.cc: Test LWG 3851 example.
* testsuite/std/ranges/adaptors/slide/1.cc: Verify existence of
base member function.
* testsuite/std/ranges/iota/iota_view.cc: Test LWG 4001 example.
* testsuite/std/ranges/repeat/1.cc: Test LWG 4053/4054 examples.
Jakub Jelinek [Fri, 4 Oct 2024 13:24:24 +0000 (15:24 +0200)]
testsuite: Fix up unevalstr2.C test
The CWG2521 changes adjusted the unevalstr1.C test but didn't adjust
unevalstr2.C test, which now FAILs in C++23 mode.
The intent in both of those tests was to test the separate (now deprecated)
syntax, so instead of removing the space between closing " and _ I've
adjusted the testcase to expect those 17 extra warnings. And I've also
adjusted the unevalstr1.C testcase to do the same; when the syntax is removed
in C++29 or whatever, that can be just guarded by #if.
But it is actually useful to also test the UDL variant without space between
closing " and _, so I've added new test coverage for that too to both tests.
2024-10-04 Jakub Jelinek <jakub@redhat.com>
* g++.dg/cpp26/unevalstr1.C: Revert the 2024-10-03 changes, instead
expect extra warnings. Add another set of tests without space
between " and _.
* g++.dg/cpp26/unevalstr2.C: Expect extra warnings for C++23. Add
another set of tests without space between " and _.
aarch64: Set Armv9-A generic L1 cache line size to 64 bytes
I'd like to use a value of 64 bytes for the L1 cache line size for Armv9-A
generic tuning.
As described in g:9a99559a478111f7fbeec29bd78344df7651c707 this value is used
to set the std::hardware_destructive_interference_size value which we want to
be not overly large when running concurrent applications on large core-count
systems.
The generic value for Armv8-A systems and the port baseline is 256 bytes
because that's what the A64FX CPU has, as set de-facto in
aarch64_override_options_internal.
But for Armv9-A CPUs as far as I know there isn't anything larger
than 64 bytes, so we should be able to use the smaller value here and reduce
the size of concurrent structs that use
std::hardware_destructive_interference_size to pad their fields.
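For context, the kind of user code affected looks roughly like the sketch below. The struct and the fallback value of 64 are assumptions for the example; the fallback is only used when the library does not provide the constant:

```cpp
#include <atomic>
#include <cstddef>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t line_size
  = std::hardware_destructive_interference_size;
#else
// Assumption for this sketch only, used when the library lacks the constant.
inline constexpr std::size_t line_size = 64;
#endif

// Two counters updated by different threads; each gets its own cache line
// so that writes to one do not invalidate the line holding the other.
struct padded_counters
{
  alignas(line_size) std::atomic<long> produced{0};
  alignas(line_size) std::atomic<long> consumed{0};
};
```

With a 256-byte value such a struct occupies 512 bytes; with 64 bytes it shrinks to 128, which is the saving the tuning change is after.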
Bootstrapped and tested on aarch64-none-linux-gnu.
Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
* config/aarch64/tuning_models/generic_armv9_a.h
(generic_armv9a_prefetch_tune): Define.
(generic_armv9_a_tunings): Use the above.
Andre Vieira [Fri, 4 Oct 2024 12:43:46 +0000 (13:43 +0100)]
arm: Fix missed CE optimization for armv8.1-m.main [PR 116444]
This patch restores optimizations for armv8.1-m.main targets that were
lost when the generation of csinc, csinv and csneg was enabled for those
targets by the patch series containing:
[PATCH 2/5][Arm] New pattern for CSINV instructions
The original patch series makes use of the "noce" machinery to transform RTL
into patterns that later match the Armv8.1-M Mainline, by getting the target
hook TARGET_HAVE_CONDITIONAL_EXECUTION, to return FALSE for such targets prior
to reload_completed. The same machinery however was transforming other RTL
patterns which were later on causing the "ce" pass post reload_completed to no
longer optimize conditional execution opportunities, which was causing the
regression observed in PR target/116444, a regression of 'testsuite/gcc.target/arm/thumb-ifcvt-2.c'
when run for an Armv8.1-M Mainline target.
This patch implements the target hook TARGET_NOCE_CONVERSION_PROFITABLE_P to
only allow "noce" to generate patterns that match CSINV, CSINC and CSNEG. Thus
ensuring that the early "ce" passes do not ruin things for later ones.
gcc/ChangeLog:
PR target/116444
* config/arm/arm-protos.h (arm_noce_conversion_profitable_p): New
declaration.
* config/arm/arm.cc (arm_is_v81m_cond_insn): New helper function used
in ...
(arm_noce_conversion_profitable_p): ... here. New function to implement
...
(TARGET_NOCE_CONVERSION_PROFITABLE_P): ... this target hook. New define.
Jakub Jelinek [Fri, 4 Oct 2024 12:02:13 +0000 (14:02 +0200)]
diagnostic, pch: Fix up the new diagnostic PCH methods for ubsan checking [PR116936]
The PR notes that the new pch_save/pch_restore methods I've added
recently invoke UB if either m_classification_history.address ()
or m_push_list.address () is NULL (which can happen if those vectors
are empty (and in the pch_save case nothing has been pushed into them
before either)). While the corresponding length is necessarily 0,
fwrite (NULL, something, 0, f) or
fread (NULL, something, 0, f) still invoke UB.
The following patch fixes that by not calling fwrite/fread if the
corresponding length is 0.
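The pattern is roughly the following, shown here on a plain std::vector rather than on the diagnostic classes (save_ints and restore_ints are names made up for this sketch):

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Write the element count followed by the elements; true on success.
bool save_ints(std::FILE *f, const std::vector<int> &v)
{
  const std::size_t n = v.size();
  if (std::fwrite(&n, sizeof n, 1, f) != 1)
    return false;
  // v.data() may be a null pointer when v is empty, so skip the call
  // entirely instead of passing it to fwrite with a zero count.
  if (n != 0 && std::fwrite(v.data(), sizeof(int), n, f) != n)
    return false;
  return true;
}

// Read back what save_ints wrote; true on success.
bool restore_ints(std::FILE *f, std::vector<int> &v)
{
  std::size_t n = 0;
  if (std::fread(&n, sizeof n, 1, f) != 1)
    return false;
  v.resize(n);
  // Same guard on the read side.
  if (n != 0 && std::fread(v.data(), sizeof(int), n, f) != n)
    return false;
  return true;
}
```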
2024-10-04 Jakub Jelinek <jakub@redhat.com>
PR pch/116936
* diagnostic.cc (diagnostic_option_classifier::pch_save): Only call
fwrite if corresponding length is non-zero.
(diagnostic_option_classifier::pch_restore): Only call fread if
corresponding length is non-zero.
Jakub Jelinek [Fri, 4 Oct 2024 11:12:45 +0000 (13:12 +0200)]
i386: Fix up ix86_expand_int_compare with TImode comparisons of SUBREGs from V8{H,B}Fmode against zero [PR116921]
The following testcase ICEs, because the ix86_expand_int_compare
optimization to use {,v}ptest assumes there are instructions for all
16-byte vector modes. That isn't the case, we only have one for
V16QI, V8HI, V4SI, V2DI, V1TI, V4SF and V2DF, not for
V8HF nor V8BF.
The following patch fixes that by using the V8HI instruction instead
for those 2 modes. tmp can't be a SUBREG, because it is SUBREG_REG
of another SUBREG, so we don't need to worry about gen_lowpart
failing.
2024-10-04 Jakub Jelinek <jakub@redhat.com>
PR target/116921
* config/i386/i386-expand.cc (ix86_expand_int_compare): Add a SUBREG
to V8HImode from V8HFmode or V8BFmode before generating a ptest.
Jakub Jelinek [Fri, 4 Oct 2024 10:36:52 +0000 (12:36 +0200)]
i386: Fix up *minmax<mode>3_2 splitter [PR116925]
While *minmax<mode>3_1 correctly uses
if (MEM_P (operands[1]))
operands[1] = force_reg (<MODE>mode, operands[1]);
to ensure operands[1] is not a MEM, *minmax<mode>3_2 does it wrongly
by calling force_reg but ignoring its return value.
The following borderline obvious patch fixes that.
I didn't find other similar errors in the backend with force_reg calls.
2024-10-04 Jakub Jelinek <jakub@redhat.com>
PR target/116925
* config/i386/sse.md (*minmax<mode>3_2): Assign force_reg result
back to operands[2] instead of throwing it away.
Nathaniel Shead [Fri, 4 Oct 2024 02:01:38 +0000 (12:01 +1000)]
c++: Allow references to internal-linkage vars in C++11 [PR113266]
[temp.arg.nontype] changed in C++11 to allow naming internal-linkage
variables and functions. We currently already handle internal-linkage
functions, but variables were missed; this patch updates this.
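A minimal sketch of the kind of code this is about, as I understand the rule (the names are made up, and this is not the testcase from either PR):

```cpp
// Internal linkage: the variable is only visible in this translation unit.
static int counter = 0;

// Non-type template parameter of reference type.
template <int &Ref>
int bump()
{
  return ++Ref;
}

// Non-type template parameter of pointer type.
template <int *Ptr>
int peek()
{
  return *Ptr;
}

// Uses the internal-linkage variable as a template argument in both forms.
int bump_then_peek()
{
  bump<counter>();
  return peek<&counter>();
}
```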
PR c++/113266
PR c++/116911
gcc/cp/ChangeLog:
* parser.cc (cp_parser_template_argument): Allow
internal-linkage variables since C++11.
Nathaniel Shead [Fri, 4 Oct 2024 00:46:57 +0000 (10:46 +1000)]
c++: Return the underlying decl rather than the USING_DECL from update_binding [PR116913]
Users of pushdecl assume that the returned decl will be a possibly
updated decl matching the one that was passed in. My r15-3910 change
broke this since in some cases we would now return USING_DECLs; this
patch fixes the situation.
PR c++/116913
gcc/cp/ChangeLog:
* name-lookup.cc (update_binding): Return the strip_using'd old
decl rather than the binding.
Jonathan Wakely [Fri, 4 Oct 2024 09:44:46 +0000 (10:44 +0100)]
libstdc++: Replace implicit lambda capture of 'this' [PR116964]
C++20 deprecates implicit capture of 'this', so change [=] to [this] for
all lambda expressions in <shared_mutex>. This only shows up on targets
where _GLIBCXX_USE_PTHREAD_RWLOCK_T is not defined, as we have an
alternative implementation of shared mutexes in that case.
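The change has the following shape, shown on a made-up class rather than on the <shared_mutex> internals:

```cpp
#include <functional>

// Hypothetical class for illustration; not taken from <shared_mutex>.
class reader_state
{
public:
  void add_reader() { ++m_readers; }
  void remove_reader() { --m_readers; }

  // Before: [=] { return m_readers == 0; } captured 'this' implicitly,
  // which C++20 deprecates.  Naming the capture explicitly avoids that.
  std::function<bool()> idle_predicate()
  {
    return [this] { return m_readers == 0; };
  }

private:
  int m_readers = 0;
};
```

The predicate still observes later changes to the object, since only the pointer is captured.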
libstdc++-v3/ChangeLog:
PR libstdc++/116964
* include/std/shared_mutex (__shared_mutex_cv): Use [this] for
lambda captures.
(shared_timed_mutex) [!_GLIBCXX_USE_PTHREAD_RWLOCK_T]: Likewise.
Jason Merrill [Thu, 3 Oct 2024 20:29:20 +0000 (16:29 -0400)]
c++: record template specialization hash
A lot of compile time of template-heavy code is spent in re-hashing
hashtable elements upon expansion. The following records the hash in the
hash element. This speeds up C++20 compilation of stdc++.h by about 25% for
about a 0.1% increase in memory usage.
With the hash value in the entry, we don't need to pass it separately to the
find functions.
Adding default arguments to the spec and hash fields simplifies spec_entry
initialization and avoids problems from hash starting with an indeterminate
value.
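The idea can be sketched outside the compiler like this. The struct below is not spec_entry; the field names, the use of 0 as a "not computed" marker and the helper functions are assumptions made for the example:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical table entry that remembers its own hash value.
struct cached_entry
{
  std::string key;
  int value = 0;
  std::size_t hash = 0;  // 0 means "not computed yet" in this sketch
};

// Compute the hash on first use, store it in the entry and return it.
// Later calls (for instance while the table is being expanded) reuse it.
inline std::size_t entry_hash(cached_entry &e)
{
  if (e.hash == 0)
    {
      e.hash = std::hash<std::string>{}(e.key);
      if (e.hash == 0)
        e.hash = 1;  // keep 0 reserved as the marker
    }
  return e.hash;
}

// Changing the key invalidates the cached hash, so clear it.
inline void rekey(cached_entry &e, std::string new_key)
{
  e.key = std::move(new_key);
  e.hash = 0;
}
```

The default member initializers mean a freshly created entry never carries an indeterminate hash.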
gcc/cp/ChangeLog:
* cp-tree.h (spec_entry::hash): New member.
* pt.cc (spec_hasher::hash): Set it and return it.
(maybe_process_partial_specialization): Clear it when
changing tmpl/args.
(lookup_template_class): Likewise, don't pass hash to find.
(retrieve_specialization): Set it, don't pass hash to find.
(register_specialization): Don't pass hash to find.
(reregister_specialization): Likewise.
(match_mergeable_specialization): Likewise.
(add_mergeable_specialization): Likewise.
Co-authored-by: Richard Biener <rguenther@suse.de>
Jason Merrill [Thu, 3 Oct 2024 20:31:00 +0000 (16:31 -0400)]
c++: free garbage vec in coerce_template_parms
coerce_template_parms can create two different vecs for the inner template
arguments, new_inner_args and (potentially) the result of
expand_template_argument_pack. One or the other, or possibly both, end up
being garbage: in the typical case, the expanded vec is garbage because it's
only used as the source for convert_template_argument. In some dependent
cases, the new vec is garbage because we decide to return the original args
instead. In these cases, ggc_free the garbage vec to reduce the memory
overhead of overload resolution.
Eric Botcazou [Thu, 3 Oct 2024 17:46:59 +0000 (19:46 +0200)]
Aarch64: Define WIDEST_HARDWARE_FP_SIZE
The macro is documented like this in the internal manual:
-- Macro: WIDEST_HARDWARE_FP_SIZE
A C expression for the size in bits of the widest floating-point
format supported by the hardware. If you define this macro, you
must specify a value less than or equal to mode precision of the
mode used for C type 'long double' (from hook
'targetm.c.mode_for_floating_type' with argument
'TI_LONG_DOUBLE_TYPE'). If you do not define this macro, mode
precision of the mode used for C type 'long double' is the default.
AArch64 uses 128-bit TFmode for long double but, as far as I know, no FPU
implemented in hardware supports it.
gcc/
* config/aarch64/aarch64.h (WIDEST_HARDWARE_FP_SIZE): Define to 64.
gcc/testsuite/
* gnat.dg/specs/size_clause6.ads: New test.
Jason Merrill [Wed, 2 Oct 2024 17:23:53 +0000 (13:23 -0400)]
c++: -Wdeprecated enables later standard deprecations
By default -Wdeprecated warns about deprecations in the active standard.
When specified explicitly, let's also warn about deprecations in later
standards.
gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Explicit -Wdeprecated enables
deprecations from later standards.
gcc/ChangeLog:
* doc/invoke.texi: Explicit -Wdeprecated enables more warnings.
Jason Merrill [Wed, 2 Oct 2024 12:05:28 +0000 (08:05 -0400)]
c++: free garbage vec in coerce_template_parms
coerce_template_parms can create two different vecs for the inner template
arguments, new_inner_args and (potentially) the result of
expand_template_argument_pack. One or the other, or possibly both, end up
being garbage: in the typical case, the expanded vec is garbage because it's
only used as the source for convert_template_argument. In some dependent
cases, the new vec is garbage because we decide to return the original args
instead. In these cases, ggc_free the garbage vec to reduce the memory
overhead of overload resolution.
gcc/cp/ChangeLog:
* pt.cc (struct free_if_changed_proxy): New.
(coerce_template_parms): Use it.
Co-authored-by: Richard Biener <rguenther@suse.de>
Andrew Pinski [Wed, 2 Oct 2024 21:21:24 +0000 (14:21 -0700)]
aarch64: Fix early ra for -fno-delete-dead-exceptions [PR116927]
Early-RA was considering throwing instructions as being dead and removing
them even if -fno-delete-dead-exceptions was in use. This fixes that oversight.
Built and tested for aarch64-linux-gnu.
PR target/116927
gcc/ChangeLog:
* config/aarch64/aarch64-early-ra.cc (early_ra::is_dead_insn): Insns
that throw are not dead with -fno-delete-dead-exceptions.
gcc/testsuite/ChangeLog:
* g++.dg/torture/pr116927-1.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
François Dumont [Thu, 9 Nov 2023 18:06:52 +0000 (19:06 +0100)]
libstdc++: [_Hashtable] Fix some implementation inconsistencies
Get rid of the different usages of the mutable keyword except in
_Prime_rehash_policy where it is preserved for ABI compatibility reasons.
Fix comment to explain that we need the computation of the bucket index to be
noexcept to be able to rehash the container when needed.
For Standard instantiations through std::unordered_xxx containers we already
force caching of the hash code when the hash functor is not noexcept, so it is
guaranteed.
The static_assert purpose in _Hashtable on _M_bucket_index is thus limited
to usages of _Hashtable with exotic _Hashtable_traits.
libstdc++-v3/ChangeLog:
* include/bits/hashtable_policy.h (_NodeBuilder<>::_S_build): Remove
const qualification on _NodeGenerator instance.
(_ReuseOrAllocNode<>::operator()(_Args&&...)): Remove const qualification.
(_ReuseOrAllocNode<>::_M_nodes): Remove mutable.
(_Insert_base<>::_M_insert_range): Remove _NodeGetter const qualification.
(_Hash_code_base<>::_M_bucket_index(const _Hash_node_value<>&, size_t)):
Simplify noexcept declaration, we already static_assert that _RangeHash functor
is noexcept.
* include/bits/hashtable.h: Rework comments. Remove const qualifier on
_NodeGenerator& arguments.
David Malcolm [Thu, 3 Oct 2024 02:05:03 +0000 (22:05 -0400)]
diagnostics: support SARIF 2.2 output, undocumented for now [PR116301]
GCC currently supports outputting SARIF v2.1.0.
Version 2.2 of the SARIF spec is not yet official, but the draft has
already gained features we might want to use.
This patch extends the SARIF output code to accept an enum sarif_version
parameter internally, representing 2.1.0 or a prerelease of 2.2.
The patch updates the SARIF output selftests so that they are run for
all such versions.
I hope to expose this "properly" via the mechanism described
in comment #13 of PR116613. In the meantime, the patch adds a new
-fdiagnostics-format=sarif-file-2.2-prerelease
for use by the DejaGnu testsuite, deliberately left undocumented for
now.
The copy of the 2.2 draft schema in the testsuite was downloaded from
https://raw.githubusercontent.com/oasis-tcs/sarif-spec/refs/tags/2.2-prerelease-2024-08-08/sarif-2.2/schema/sarif-2-2.schema.json
The patch adds support for capturing related locations within an ICE
notification for SARIF 2.2 onwards, thus capturing "include chain"
information for SARIF-based reports of ICEs that occur within a
header; see https://github.com/oasis-tcs/sarif-spec/issues/540
The patch does *not* add support for the "scannedFile" role, leaving it
to followup work; see https://github.com/oasis-tcs/sarif-spec/issues/459
gcc/ChangeLog:
PR other/116301
* common.opt (sarif-file-2.2-prerelease): New value for
-fdiagnostics-format=.
* diagnostic-format-sarif.cc
(sarif_location_manager::sarif_location_manager): Move
initialization of m_related_locations_arr here from sarif_result's
ctor.
(sarif_location_manager::add_related_location): Implement for
base class, taking sarif_result's implementation. Add "builder"
param.
(sarif_location_manager::m_related_locations_arr): Move here from
class sarif_result.
(class sarif_result): Move m_related_locations_arr field and
add_related_location vfunc to class sarif_location_manager.
(sarif_builder::get_version): New accessor.
(sarif_builder::m_version): New field.
(sarif_invocation::add_notification_for_ice): Call
process_worklist on the notification for SARIF 2.2 and later.
(sarif_location_manager::process_worklist_item): Pass builder to
calls to add_related_location.
(sarif_result::on_nested_diagnostic): Likewise.
(sarif_result::on_diagram): Likewise.
(sarif_ice_notification::add_related_location): Add builder param.
For SARIF 2.2 and later chain up to base class impl so that
notifications get related locations.
(sarif_builder::sarif_builder): Add "version" param.
(SARIF_SCHEMA): Delete in favor of...
(sarif_version_to_url): New function.
(SARIF_VERSION): Delete in favor of...
(sarif_version_to_property): New function.
(make_top_level_object): Update to use m_version for "$schema" and
"version".
(sarif_output_format::sarif_output_format): Add "version" param.
(sarif_stream_output_format::sarif_stream_output_format):
Likewise.
(sarif_file_output_format::sarif_file_output_format): Likewise.
(diagnostic_output_format_init_sarif_stderr): Likewise.
(diagnostic_output_format_init_sarif_file): Likewise.
(diagnostic_output_format_init_sarif_stream): Likewise.
(selftest::test_sarif_diagnostic_context): Likewise.
(selftest::test_make_location_object): Likewise.
(selftest::test_simple_log): Likewise. Update schema and version
tests accordingly.
(selftest::test_simple_log_2): Add "version" param.
(selftest::test_message_with_embedded_link): Likewise.
(selftest::run_tests_per_version): New, based on the
for_each_line_table_case calls in...
(selftest::diagnostic_format_sarif_cc_tests): Add loop over sarif
versions. Replace for_each_line_table_case calls with one
call to run_tests_per_version.
* diagnostic-format-sarif.h: Include "diagnostic-format.h".
(enum class sarif_version): New.
(diagnostic_output_format_init_sarif_stderr): Move to here from
diagnostic-format.h. Add "version" param.
(diagnostic_output_format_init_sarif_file): Likewise.
(diagnostic_output_format_init_sarif_stream): Likewise.
* diagnostic-format.h: Include "diagnostic.h".
(diagnostic_output_format_init_sarif_stderr): Move from here to
diagnostic-format-sarif.h.
* diagnostic.cc: Define INCLUDE_MEMORY.
Include "diagnostic-format-sarif.h".
(diagnostic_output_format_init): Pass sarif_version::v2_1_0 to
existing SARIF options.
Add case DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE_2_2_PRERELEASE.
* diagnostic.h (enum diagnostics_output_format): Add
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE_2_2_PRERELEASE.
gcc/testsuite/ChangeLog:
PR other/116301
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2.1.c: New test.
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2.2.c: New test.
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2_1.py: Support
script for new test.
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2_2.py: Likewise.
* gcc.dg/plugin/crash-test-ice-in-header.h: New header.
* gcc.dg/plugin/plugin.exp: Add the new tests.
* lib/sarif-schema-2.2-prerelease-2024-08-08.json: New schema
file.
* lib/scansarif.exp (verify-sarif-file): Add optional argument for
specifying which version of the schema to validate against,
supporting "2.1" and "2.2", defaulting to the former.
Update the test name to capture the version of the schema tested
against.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Andrew Pinski [Tue, 1 Oct 2024 18:34:00 +0000 (18:34 +0000)]
phiopt: Fix VCE moving by rewriting it into cast [PR116098]
Phiopt match_and_simplify might move a well defined VCE assign statement
from being conditional to being unconditional; that VCE might then no longer
be defined. It will need a rewrite into a cast instead.
This adds the rewriting code to move_stmt for the VCE case.
This is enough to fix the issue at hand. It should also be using rewrite_to_defined_overflow
but first I need to move the check to see whether a rewrite is needed into its own function
and that is causing issues (see https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663938.html).
Plus this version is easiest to backport.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/116098
gcc/ChangeLog:
* tree-ssa-phiopt.cc (move_stmt): Rewrite VCEs from integer to integer
types to casts.
gcc/testsuite/ChangeLog:
* c-c++-common/torture/pr116098-2.c: New test.
* g++.dg/torture/pr116098-1.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
testsuite/52641 - Make gcc.dg/strict-flex-array-3.c work on int != 32 bits.
PR testsuite/52641
gcc/testsuite/
* gcc.dg/strict-flex-array-3.c (expect) [AVR]: Use custom
version due to AVR-LibC limitations.
(stuff): Use __SIZEOF_INT__ instead of hard-coded values.
middle-end: Fix ifcvt predicate generation for masked function calls
Up until now, due to a latent bug in the code for the ifcvt pass,
irrespective of the branch taken in a conditional statement, the
original condition for the if statement was used in masking the
function call.
This patch ensures that the correct predicate mask generation is
carried out such that, upon autovectorization, the correct vector
lanes are selected in the vectorized function call.
gcc/ChangeLog:
* tree-if-conv.cc (predicate_statements): Fix handling of
predicated function calls.
Andre Vieira [Wed, 2 Oct 2024 14:14:40 +0000 (15:14 +0100)]
arm: Prevent ICE when doloop dec_set is not PLUS expr
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter
was making an assumption about the form of what a candidate for a dec_insn
should be, which caused an ICE.
This dec_insn is the instruction that decreases the loop counter inside a
decrementing loop and we expect it to have the following form:
(set (reg CONDCOUNT)
(plus (reg CONDCOUNT)
(const_int)))
Where CONDCOUNT is the loop counter, and const_int is the negative constant
used to decrement it.
This patch also improves our search for a valid dec_insn. Before this patch
we'd only look for a dec_insn inside the loop header if the loop latch was
empty. We now also search the loop header if the loop latch is not empty but
the last instruction is not a valid dec_insn. This could potentially be improved
to search all instructions inside the loop latch.
gcc/ChangeLog:
* config/arm/arm.cc (check_dec_insn): New helper function containing
code hoisted from...
(arm_mve_dlstp_check_dec_counter): ... here. Use check_dec_insn to
check the validity of the candidate dec_insn.
Simon Martin [Wed, 2 Oct 2024 13:32:37 +0000 (15:32 +0200)]
c++: Fix regression introduced by r15-3796 [PR116722]
Jason pointed out that the fix I made for PR116722 via r15-3796
introduces a regression when running constexpr-dynamic10.C with
-fimplicit-constexpr.
The problem is that my change makes us leave cxx_eval_call_expression
early, and bypass the call to cxx_eval_thunk_call (through a recursive
call to cxx_eval_call_expression) that used to emit an error for that
testcase with -fimplicit-constexpr.
This patch emits the error if !ctx->quiet before bailing out because the
{con,de}structor belongs to a class with virtual bases.
PR c++/116722
gcc/cp/ChangeLog:
* constexpr.cc (cxx_bind_parameters_in_call): When !ctx->quiet,
emit error before bailing out due to a call to {con,de}structor
for a class with virtual bases.
when inserting code to determine if var is power of two. If the target
doesn't support expanding the builtin as special instructions, switch
conversion relies on this whole pattern being expanded as bitmagic.
However, it is possible that other GIMPLE optimizations move the two
statements of the pattern apart. In that case the builtin becomes a
libgcc call in the final binary. The call is slow and, in freestanding
programs, can result in a link error (this bug was originally found
while compiling the Linux kernel).
This patch modifies switch conversion to insert the bitmagic
(var ^ (var - 1)) > (var - 1) instead of the builtin.
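As a minimal sketch of the expression above (not the GIMPLE that switch conversion emits), the check can be written for an unsigned 32-bit value and compared against the usual power-of-two test. The function names are hypothetical and the 32-bit width is an assumption made for the illustration.

```cpp
#include <cstdint>

// True iff exactly one bit of var is set, using the expression quoted in
// the commit message.  With unsigned arithmetic, var - 1 wraps around for
// var == 0, which makes the comparison false in that case.
constexpr bool is_pow2_bitmagic(std::uint32_t var)
{
  return (var ^ (var - 1)) > (var - 1);
}

// Conventional formulation, used here only for comparison.
constexpr bool is_pow2_reference(std::uint32_t var)
{
  return var != 0 && (var & (var - 1)) == 0;
}

static_assert(is_pow2_bitmagic(1), "");
static_assert(is_pow2_bitmagic(8), "");
static_assert(!is_pow2_bitmagic(0), "");
static_assert(!is_pow2_bitmagic(6), "");
static_assert(is_pow2_bitmagic(8) == is_pow2_reference(8), "");
static_assert(is_pow2_bitmagic(6) == is_pow2_reference(6), "");
```

Because the whole test is a single expression built from xor, subtraction and a comparison, it needs no library call even when the two halves are optimized separately.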
gcc/ChangeLog:
PR tree-optimization/116616
* tree-switch-conversion.cc (can_pow2p): Remove this function.
(gen_pow2p): Generate bitmagic instead of a builtin. Remove the
TYPE parameter.
(switch_conversion::is_exp_index_transform_viable): Don't call
can_pow2p.
(switch_conversion::exp_index_transform): Call gen_pow2p without
the TYPE parameter.
* tree-switch-conversion.h: Remove
m_exp_index_transform_pow2p_type.
gcc/testsuite/ChangeLog:
PR tree-optimization/116616
* gcc.target/i386/switch-exp-transform-1.c: Don't test for
presence of the POPCOUNT internal fn call.
Richard Biener [Wed, 2 Oct 2024 07:39:50 +0000 (09:39 +0200)]
Speedup iterative_hash_template_arg
Using iterative_hash_object is expensive compared to using
iterative_hash_hashval_t which is fit for integer sized values.
The following reduces the number of perf cycles spent in
iterative_hash_template_arg and iterative_hash combined by 20%.
gcc/cp/
* pt.cc (iterative_hash_template_arg): Avoid using
iterative_hash_object.
Richard Biener [Wed, 2 Oct 2024 11:40:59 +0000 (13:40 +0200)]
Adjust gcc.dg/vect/slp-12a.c
We can now SLP the loop. There's PR116583 tracking that this still
fails for VLA vectors when load-lanes doesn't support a group of
size 8. We can't express this right now so the testcase keeps
FAILing for aarch64 with SVE (but passes now for riscv).
Richard Biener [Wed, 2 Oct 2024 11:39:14 +0000 (13:39 +0200)]
Adjust expectation for gcc.dg/vect/slp-19c.c
We can now vectorize the first loop with SLP when using V2SImode
vectors since then we can handle the non-power-of-two interleaving.
We can also SLP the second loop reliably now after adding induction
support for VLA vectors.
The testcase in PR114855 shows profile prediction to evaluate
the same SSA def via expr_expected_value for each condition or
switch in a function. The following patch caches the expected
value (and probability/predictor) for each visited SSA def,
also protecting against recursion and thus obsoleting the visited
bitmap. This reduces the time spent in branch prediction from
1.2s to 0.3s, though callgrind, which was how I noticed this,
seems to be comparatively much happier about the change than
this number suggests.
PR tree-optimization/114855
* predict.cc (ssa_expected_value): New global.
(expr_expected_value): Do not take bitmap.
(expr_expected_value_1): Likewise. Use ssa_expected_value
to cache results for a SSA def.
(tree_predict_by_opcode): Adjust.
(tree_estimate_probability): Manage ssa_expected_value.
(tree_guess_outgoing_edge_probabilities): Likewise.
Jonathan Wakely [Tue, 24 Sep 2024 22:20:56 +0000 (23:20 +0100)]
libstdc++: Populate std::time_get::get's %c format for C locale
We were using the empty string "" for D_T_FMT and ERA_D_T_FMT in the C
locale, instead of "%a %b %e %T %Y" as the C standard requires. Set it
correctly for each locale implementation that defines time_members.cc.
We can also explicitly set the _M_era_xxx pointers to the same values as
the corresponding _M_xxx ones, rather than setting them to point to
identical string literals. This doesn't rely on the compiler merging
string literals, and makes it more explicit that they're the same in the
C locale.
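The format string itself can be checked with plain std::strftime, which is a formatting function rather than the std::time_get::get parser this commit changes. The sketch below only illustrates that "%c" and "%a %b %e %T %Y" are meant to agree in the "C" locale; it assumes the program has not called setlocale, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Hypothetical helper: format one fixed broken-down time with strftime.
inline std::string format_fixed_time(const char* fmt)
{
  std::tm t{};
  t.tm_year = 2024 - 1900;  // years since 1900
  t.tm_mon = 8;             // September (months are 0-based)
  t.tm_mday = 22;
  t.tm_hour = 18;
  t.tm_min = 34;
  t.tm_sec = 56;
  t.tm_wday = 0;            // Sunday
  char buf[64];
  const std::size_t n = std::strftime(buf, sizeof buf, fmt, &t);
  return std::string(buf, n);
}

// In the default "C" locale, %c should expand like the explicit format.
inline void check_c_locale_date_time_format()
{
  const std::string expected = "Sun Sep 22 18:34:56 2024";
  assert(format_fixed_time("%a %b %e %T %Y") == expected);
  assert(format_fixed_time("%c") == expected);
}
```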
libstdc++-v3/ChangeLog:
* config/locale/dragonfly/time_members.cc
(__timepunct<char>::_M_initialize_timepunct)
(__timepunct<wchar_t>::_M_initialize_timepunct): Set
_M_date_time_format for C locale. Set %Ex formats to the same
values as the %x formats.
* config/locale/generic/time_members.cc: Likewise.
* config/locale/gnu/time_members.cc: Likewise.
* testsuite/22_locale/time_get/get/char/5.cc: New test.
* testsuite/22_locale/time_get/get/wchar_t/5.cc: New test.
Jonathan Wakely [Fri, 6 Sep 2024 20:41:47 +0000 (21:41 +0100)]
libstdc++: Fix rounding in chrono::parse
I noticed that chrono::parse was using duration_cast and time_point_cast
to convert the parsed value to the result. Those functions truncate
towards zero, which is not generally what you want. Especially for
negative times before the epoch, where truncating towards zero rounds
"up" towards the next duration/time_point. Using chrono::round is
typically better, as that rounds to nearest.
However, while testing the fix I realised that rounding to the nearest
can give surprising results in some cases. For example if we parse a
chrono::sys_days using chrono::parse("%F %T", "2024-09-22 18:34:56", tp)
then we will round up to the next day, i.e. sys_days(2024y/09/23). That
seems surprising, and I think 2024-09-22 is what most users would
expect.
This change attempts to provide a hybrid rounding heuristic where we use
chrono::round for the general case, but when the result has a period
that is one of minutes, hours, days, weeks, or years then we truncate
towards negative infinity using chrono::floor. This means that we
truncate "2024-09-22 18:34:56" to the start of the current
minute/hour/day/week/year, instead of rounding up to 2024-09-23, or to
18:35, or 19:00. For a period of months chrono::round is used, because
the months duration is defined as a twelfth of a year, which is not
actually the length of any calendar month. We don't want to truncate to
a whole number of "months" if that can actually go from e.g. 2023-03-01
to 2023-01-31, because February is shorter than chrono::months(1).
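The difference between the three conversions can be checked at compile time with the standard C++17 functions. This is a small sketch of the arithmetic only, not of the __detail::__round helper added by the patch; the variable names are made up for the illustration.

```cpp
#include <chrono>

namespace chr = std::chrono;

// 18:34:56 expressed as a time of day in seconds.
constexpr chr::seconds tod = chr::hours(18) + chr::minutes(34) + chr::seconds(56);

// Rounding to nearest moves forward to the next hour or minute...
static_assert(chr::round<chr::hours>(tod) == chr::hours(19), "");
static_assert(chr::round<chr::minutes>(tod) == chr::hours(18) + chr::minutes(35), "");

// ...while floor keeps the start of the current hour or minute.
static_assert(chr::floor<chr::hours>(tod) == chr::hours(18), "");
static_assert(chr::floor<chr::minutes>(tod) == chr::hours(18) + chr::minutes(34), "");

// A duration 1.2 s before the epoch: duration_cast truncates towards zero,
// which is "up" for negative values, whereas floor goes towards negative
// infinity and round goes to the nearest second.
constexpr chr::milliseconds before_epoch(-1200);
static_assert(chr::duration_cast<chr::seconds>(before_epoch) == chr::seconds(-1), "");
static_assert(chr::floor<chr::seconds>(before_epoch) == chr::seconds(-2), "");
static_assert(chr::round<chr::seconds>(before_epoch) == chr::seconds(-1), "");
```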
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__detail::__use_floor): New
function.
(__detail::__round): New function.
(from_stream): Use __detail::__round.
* testsuite/std/time/clock/file/io.cc: Check for expected
rounding in parse.
* testsuite/std/time/clock/gps/io.cc: Likewise.
Jonathan Wakely [Tue, 1 Oct 2024 09:43:43 +0000 (10:43 +0100)]
libstdc++: Fix -Wlong-long warning in <bits/postypes.h>
For 32-bit targets __INT64_TYPE__ expands to long long, which gives a
pedwarn for C++98 mode, causing:
FAIL: 17_intro/headers/c++1998/all_pedantic_errors.cc -std=gnu++98 (test for excess errors)
Excess errors:
.../bits/postypes.h:64: error: ISO C++ 1998 does not support 'long long' [-Wlong-long]
The following patch implements the clang -Wheader-guard warning, which warns
if a valid multiple inclusion header guard's #ifndef/#if !defined directive
is immediately (no other non-line directives nor other (non-comment)
tokens in between) followed by #define directive for some different macro,
which in get_suggestion rules is close enough to the actual header guard
macro (i.e. likely misspelling), the #define is object-like with empty
definition (I've followed what clang implements) and the macro isn't defined
later on (at least not on the final #endif at the end of a header).
In this case it emits a warning, so that
#ifndef STDIO_H
#define STDOI_H
...
#endif
or similar misspellings can be caught.
clang enables this warning by default, but I've put it into -Wall instead
as it still seems to be a style warning, nothing more severe; if a header
doesn't survive multiple inclusion because of the misspelling, users will
get different diagnostics.
2024-10-02 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/96842
libcpp/
* include/cpplib.h (struct cpp_options): Add warn_header_guard member.
(enum cpp_warning_reason): Add CPP_W_HEADER_GUARD enumerator.
* internal.h (struct cpp_reader): Add mi_def_cmacro, mi_loc and
mi_def_loc members.
(_cpp_defined_macro_p): Constify type pointed by argument type.
Formatting fix.
* init.cc (cpp_create_reader): Clear
CPP_OPTION (pfile, warn_header_guard).
* directives.cc (struct if_stack): Add def_loc and mi_def_cmacro
members.
(DIRECTIVE_TABLE): Add IF_COND flag to define.
(do_define): Set ifs->mi_def_cmacro on a define immediately following
#ifndef directive for the guard. Clear pfile->mi_valid. Formatting
fix.
(do_endif): Copy over pfile->mi_def_cmacro and pfile->mi_def_loc
if ifs->mi_def_cmacro is set and pfile->mi_cmacro isn't a defined
macro.
(push_conditional): Clear mi_def_cmacro and mi_def_loc members.
* files.cc (_cpp_pop_file_buffer): Emit -Wheader-guard diagnostics.
gcc/
* doc/invoke.texi (Wheader-guard): Document.
gcc/c-family/
* c.opt (Wheader-guard): New option.
* c.opt.urls: Regenerated.
* c-ppoutput.cc (init_pp_output): Initialize also cb->get_suggestion.
gcc/testsuite/
* c-c++-common/cpp/Wheader-guard-1.c: New test.
* c-c++-common/cpp/Wheader-guard-1-1.h: New test.
* c-c++-common/cpp/Wheader-guard-1-2.h: New test.
* c-c++-common/cpp/Wheader-guard-1-3.h: New test.
* c-c++-common/cpp/Wheader-guard-1-4.h: New test.
* c-c++-common/cpp/Wheader-guard-1-5.h: New test.
* c-c++-common/cpp/Wheader-guard-1-6.h: New test.
* c-c++-common/cpp/Wheader-guard-1-7.h: New test.
* c-c++-common/cpp/Wheader-guard-1-8.h: New test.
* c-c++-common/cpp/Wheader-guard-1-9.h: New test.
* c-c++-common/cpp/Wheader-guard-1-10.h: New test.
* c-c++-common/cpp/Wheader-guard-1-11.h: New test.
* c-c++-common/cpp/Wheader-guard-1-12.h: New test.
* c-c++-common/cpp/Wheader-guard-2.c: New test.
* c-c++-common/cpp/Wheader-guard-2.h: New test.
* c-c++-common/cpp/Wheader-guard-3.c: New test.
* c-c++-common/cpp/Wheader-guard-3.h: New test.
Jakub Jelinek [Wed, 2 Oct 2024 08:14:50 +0000 (10:14 +0200)]
opts: Fix up regenerate-opt-urls dependencies
It seems that we currently require
1) enabling at least c,c++,fortran,d in --enable-languages
2) first doing make html
before one can successfully regenerate-opt-urls, otherwise without 2)
one gets
make regenerate-opt-urls
make: *** No rule to make target '/home/jakub/src/gcc/obj12x/gcc/HTML/gcc-15.0.0/gcc/Option-Index.html', needed by 'regenerate-opt-urls'. Stop.
or say if not configuring d after make html one still gets
make regenerate-opt-urls
make: *** No rule to make target '/home/jakub/src/gcc/obj12x/gcc/HTML/gcc-15.0.0/gdc/Option-Index.html', needed by 'regenerate-opt-urls'. Stop.
Now, I believe neither 1) nor 2) is really necessary.
The regenerate-opt-urls goal has dependency on 3 Option-Index.html files,
but there are no rules saying how to generate those files.
make html has dependency on $(HTMLS_BUILD) which adds
$(build_htmldir)/gcc/index.html and lang.html among other things, where
the former actually builds not just index.html but also Option-Index.html
and tons of other files, and lang.html is filled in by configure depending
on configured languages, so sometimes will include gfortran.html and
sometimes d.html.
The following patch adds dependencies of the Option-Index.html on their
corresponding index.html files and that is all that seems to be needed,
make regenerate-opt-urls then works even without prior make html and
even if just a subset of c/c++, fortran and d is enabled.
2024-10-02 Jakub Jelinek <jakub@redhat.com>
* Makefile.in ($(OPT_URLS_HTML_DEPS)): Add dependencies of the
Option-Index.html files on the corresponding index.html files.
Don't mention the requirement that all languages that have their own
HTML manuals be enabled.
Andrew Pinski [Tue, 1 Oct 2024 21:48:19 +0000 (14:48 -0700)]
backprop: Fix deleting of a phi node [PR116922]
The problem here is remove_unused_var is called on a name that is
defined by a phi node but it deletes it like removing a normal statement.
remove_phi_node should be called rather than gsi_remove for phi nodes.
Note there is a possibility of using simple_dce_from_worklist instead
but that is for another day.
gcc.target/powerpc/p9-vec-length-full-8.c was expecting all loops to
use -with-len fully masked vectorization to avoid epilogues because
the loops needed peeling for gaps. With SLP we have improved things
here and the loops using V2D[IF]mode no longer need peeling for gaps
since the target can compose those vectors from two scalars and
in turn we generate better code and not need an epilogue either
(the iteration count divides by the VF).
Richard Biener [Tue, 1 Oct 2024 13:17:18 +0000 (15:17 +0200)]
tree-optimization/116654 - missed dr_explicit_realign[_optimized] with SLP
With single-lane SLP we fail to use the power realign loads, causing
some testsuite FAILs. r14-2430-g4736ddd11874fe exempted SLP of
non-grouped accesses because that could have been only splats
where the scheme isn't used anyway, but now with single-lane SLP
it can be contiguous accesses.
PR tree-optimization/116654
* tree-vect-data-refs.cc (vect_supportable_dr_alignment):
Treat non-grouped accesses like non-SLP.
The tests below passed for this patch.
* The rv64gcv full regression test.
This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_sub-2-i16.c: New test.
* gcc.target/riscv/sat_s_sub-2-i32.c: New test.
* gcc.target/riscv/sat_s_sub-2-i64.c: New test.
* gcc.target/riscv/sat_s_sub-2-i8.c: New test.
* gcc.target/riscv/sat_s_sub-run-2-i16.c: New test.
* gcc.target/riscv/sat_s_sub-run-2-i32.c: New test.
* gcc.target/riscv/sat_s_sub-run-2-i64.c: New test.
* gcc.target/riscv/sat_s_sub-run-2-i8.c: New test.
Introduce two new unspecs, UNSPEC_COND_SMAX and UNSPEC_COND_SMIN,
corresponding to the rtl operators smax and smin. UNSPEC_COND_SMAX is used
to generate the fmaxnm instruction and UNSPEC_COND_SMIN is used to generate
the fminnm instruction.
With these new unspecs, we can generate SVE2 max/min instructions using
the existing generic unpredicated and predicated instruction patterns that
use the optab attribute. Thus, we have removed the specialised instruction
patterns for max/min instructions that were using the
SVE_COND_FP_MAXMIN_PUBLIC iterator.
No new test cases as the existing test cases should be enough to test
this refactoring.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md
(<fmaxmin><mode>3): Remove this instruction pattern.
(cond_<fmaxmin><mode>): Remove this instruction pattern.
* config/aarch64/iterators.md: New unspecs and changes to
iterators and attrs to use the new unspecs
Thomas Koenig [Sun, 29 Sep 2024 14:52:51 +0000 (16:52 +0200)]
Implement MAXVAL and MINVAL for UNSIGNED.
gcc/fortran/ChangeLog:
* check.cc (int_or_real_or_char_or_unsigned_check_f2003): New function.
(gfc_check_minval_maxval): Use it.
* trans-intrinsic.cc (gfc_conv_intrinsic_minmaxval): Handle
initial values for UNSIGNED.
* gfortran.texi: Document MINVAL and MAXVAL for unsigned.
libgfortran/ChangeLog:
* Makefile.am: Add minval and maxval files.
* Makefile.in: Regenerated.
* gfortran.map: Add new functions.
* generated/maxval_m1.c: New file.
* generated/maxval_m16.c: New file.
* generated/maxval_m2.c: New file.
* generated/maxval_m4.c: New file.
* generated/maxval_m8.c: New file.
* generated/minval_m1.c: New file.
* generated/minval_m16.c: New file.
* generated/minval_m2.c: New file.
* generated/minval_m4.c: New file.
* generated/minval_m8.c: New file.
Eric Botcazou [Tue, 1 Oct 2024 15:54:00 +0000 (17:54 +0200)]
Fix wrong code out of NRV + RSO + inlining
The testcase is miscompiled with -O -flto because the three optimizations
NRV + RSO + inlining are applied to the same call: when the LHS of the call
is marked write-only before inlining, it will keep the mark after inlining
although it may be read in GIMPLE from that point on.
The fix is to apply the removal of the store, that would have been applied
later if the call was not inlined, right before inlining, which will prevent
the problematic references to the LHS from being generated during inlining.
gcc/
* tree-inline.cc (expand_call_inline): Remove the store to the
return slot if it is a global variable that is only written to.
gcc/testsuite/
* gnat.dg/lto28.adb: New test.
* gnat.dg/lto28_pkg1.ads: New helper.
* gnat.dg/lto28_pkg2.ads: Likewise.
* gnat.dg/lto28_pkg2.adb: Likewise.
* gnat.dg/lto28_pkg3.ads: Likewise.
From 06a370a0a2329dd4da0ffcab7c35ea7df2353baf Mon Sep 17 00:00:00 2001
From: Jim Lin <jim@andestech.com>
Date: Tue, 1 Oct 2024 14:42:56 +0800
Subject: [PATCH] RISC-V/libgcc: Fix incorrect and missing .cfi_offset for
__riscv_save_[0-3] on RV32.
libgcc/ChangeLog:
* config/riscv/save-restore.S: Fix .cfi_offset for
__riscv_save_[0-3] on RV32.
P2985R0 (C++26) introduces std::is_virtual_base_of; this is the compiler
builtin that will back the library trait (which strictly requires
compiler support).
The name has been chosen to match LLVM/MSVC's, as per the discussion
here:
https://github.com/llvm/llvm-project/issues/98310
The actual user-facing type trait in libstdc++ will be added later.
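A minimal sketch of how the builtin might be exercised, assuming the spelling __builtin_is_virtual_base_of(Base, Derived) documented by this patch. The class names and the SKETCH_HAVE_VBASE_BUILTIN macro are hypothetical, and the checks are compiled only when the compiler reports the builtin through __has_builtin, so older compilers simply skip them.

```cpp
struct Base {};
struct Direct : Base {};           // ordinary, non-virtual inheritance
struct Virtual : virtual Base {};  // Base is a virtual base
struct Indirect : Virtual {};      // inherits the virtual base indirectly
struct Left : virtual Base {};
struct Right : virtual Base {};
struct Diamond : Left, Right {};   // classic diamond with one shared Base

#if defined(__has_builtin)
#  if __has_builtin(__builtin_is_virtual_base_of)
#    define SKETCH_HAVE_VBASE_BUILTIN 1
#  endif
#endif

#ifdef SKETCH_HAVE_VBASE_BUILTIN
// Hypothetical wrapper, loosely modelled on a _v-style variable template.
template <class B, class D>
constexpr bool is_vbase_v = __builtin_is_virtual_base_of(B, D);

static_assert(!is_vbase_v<Base, Direct>, "non-virtual base");
static_assert(is_vbase_v<Base, Virtual>, "direct virtual base");
static_assert(is_vbase_v<Base, Indirect>, "indirect virtual base");
static_assert(is_vbase_v<Base, Diamond>, "virtual base shared in a diamond");
static_assert(!is_vbase_v<Base, Base>, "a class is not its own virtual base");
#endif
```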
gcc/cp/ChangeLog:
* constraint.cc (diagnose_trait_expr): New diagnostic.
* cp-trait.def (IS_VIRTUAL_BASE_OF): New builtin.
* cp-tree.h (enum base_access_flags): Add a new flag to be
able to request a search for a virtual base class.
* cxx-pretty-print.cc (pp_cxx_userdef_literal): Update the
list of GNU extensions to the grammar.
* search.cc (struct lookup_base_data_s): Add a field to
request searching for a virtual base class.
(dfs_lookup_base): Add the ability to look for a virtual
base class.
(lookup_base): Forward the flag to dfs_lookup_base.
* semantics.cc (trait_expr_value): Implement the builtin
by calling lookup_base with the new flag.
(finish_trait_expr): Handle the new builtin.
gcc/ChangeLog:
* doc/extend.texi: Document the new
__builtin_is_virtual_base_of builtin; amend the docs for
__is_base_of.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_virtual_base_of.C: New test.
* g++.dg/ext/is_virtual_base_of_diagnostic.C: New test.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com> Reviewed-by: Jason Merrill <jason@redhat.com>
We should factor out the conversion here as that will allow a simplification to
`(t_3 != 0) & (c_4 != 0)`. Unlike most other types, `a ? b : CST` will simplify
for boolean result type to either `a | b` or `a & b`, so allowing this conversion
for all operations will always be profitable.
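The boolean identities behind this can be verified exhaustively. The sketch below is only a truth-table check with hypothetical function names, not the phiopt code: `a ? b : false` is `a & b`, and `a ? b : true` is an OR in which the condition appears negated, `!a | b`.

```cpp
// a ? b : false, which is equivalent to a & b.
constexpr bool select_or_false(bool a, bool b) { return a ? b : false; }

// a ? b : true, which is equivalent to !a | b.
constexpr bool select_or_true(bool a, bool b) { return a ? b : true; }

// Walk all four combinations of a and b and compare with the bitwise forms.
constexpr bool matches_bitwise_forms()
{
  for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b)
      {
        const bool x = a != 0;
        const bool y = b != 0;
        if (select_or_false(x, y) != (x & y))
          return false;
        if (select_or_true(x, y) != (!x | y))
          return false;
      }
  return true;
}

static_assert(matches_bitwise_forms(), "");
```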
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Note on the phi-opt-7.c testcase change, we are now able to optimize this
and remove the if due to the factoring out now so this is an improvement.
PR tree-optimization/116890
gcc/ChangeLog:
* tree-ssa-phiopt.cc (factor_out_conditional_operation): Conversions
from bool should also be considered as wanting to happen.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phi-opt-7.c: Update testcase for no ifs left.
* gcc.dg/tree-ssa/phi-opt-42.c: New test.
* gcc.dg/tree-ssa/phi-opt-43.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Gaius Mulley [Tue, 1 Oct 2024 13:26:31 +0000 (14:26 +0100)]
PR modula2/116918 -fswig correct syntax
This patch fixes the syntax for the generated swig interface file.
The % characters in fprintf require escaping.
gcc/m2/ChangeLog:
PR modula2/116918
* gm2-compiler/M2Swig.mod (AnnotateProcedure): Capitalize
the generated comment, split comment into multiple lines and
terminate the comment with ". */".
(DoCheckUnbounded): Escape the % character with %%.
(DoWriteFile): Ditto.