forwprop: Fix lane handling for VEC_PERM sequence blending
In PR117830 a miscompilation of 464.h264ref was reported.
An analysis showed that wrong code was generated because of
unsatisfied assumptions. This patch addresses these issues.
The first assumption was that we could independently analyze the two
vec-perms at the start of a vec-perm-simplify sequence and use the
information later for calculating a final vec-perm selector that
utilizes fewer lanes. However, this information does not help much,
because for changing the selector entry, we need to ensure that both
elements of the operand vectors v_1 and v_2 remain equal.
This is addressed by removing the function get_vect_selector_index_map
and checking for this equality in the loop where we create the new
selector.
The calculation of the selector vector for the blended sequence
assumed that the indices of the selector vector of the narrowed
sequences are increasing. This assumption does not hold in general.
This was fixed by allowing a wrap-around when searching for an empty
lane.
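The wrap-around search can be pictured with the following minimal
sketch. It is not the code in tree-ssa-forwprop.cc; the helper name
find_free_lane and the std::vector<bool> occupancy map are assumptions
made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from GCC): return the first unused lane at or
// after `start`, wrapping around to lane 0 when the end is reached.
// Returns used.size() if every lane is already taken.
std::size_t find_free_lane(const std::vector<bool>& used, std::size_t start)
{
  const std::size_t nelts = used.size();
  assert(nelts == 0 || start < nelts);
  for (std::size_t k = 0; k < nelts; ++k)
    {
      // The modulo makes the search continue at lane 0 after the last lane.
      const std::size_t lane = (start + k) % nelts;
      if (!used[lane])
        return lane;
    }
  return nelts;
}
```

A search that only moves forward from `start` would miss a free lane
that sits below `start`, which is the situation that arises when the
selector indices are not increasing.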
Further, there was an issue in the calculation of the selector vector
entries for the second sequence. The code did not consider that the
lanes of the second sequence could have been moved.
A relevant property of this patch is that it introduces a
couple of nested loops, where the outer loop iterates from
i=0..nelts and the inner loop iterates from j=0..i.
To avoid performance concerns, a check is introduced that
ensures nelts won't exceed 4 lanes.
The added test case is derived from h264ref (the other cases from the
benchmark have the same structure and don't provide additional coverage).
Bootstrapped and regression-tested on x86-64 and aarch64.
Further, tested on CPU 2006 h264ref and CPU 2017 x264.
PR117830
gcc/ChangeLog:
* tree-ssa-forwprop.cc (get_vect_selector_index_map): Removed.
(recognise_vec_perm_simplify_seq): Fix calculation of vec-perm
selectors of narrowed sequence.
(calc_perm_vec_perm_simplify_seqs): Fix calculation of
vec-perm selectors of the blended sequence.
(process_vec_perm_simplify_seq_list): Add whitespace to dump
string to avoid badly formatted dump output.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/vector-11.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
This patch ensures that the list of valid -mtune options
does not contain entries more than once.
The -mtune option accepts CPU identifiers as well as
tuning identifiers and there are cases where a CPU and
its tuning have the same identifier.
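As a rough illustration of the idea (not the code in riscv-common.cc),
the list can be built by skipping names that are already present. The
helper names add_unique and valid_mtune_values, and the use of
std::vector<std::string>, are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: append `name` only if it is not already listed.
void add_unique(std::vector<std::string>& values, const std::string& name)
{
  assert(!name.empty());
  if (std::find(values.begin(), values.end(), name) == values.end())
    values.push_back(name);
}

// Collect CPU identifiers first, then tuning identifiers, so that a name
// shared by a CPU and its tuning appears only once in the result.
std::vector<std::string>
valid_mtune_values(const std::vector<std::string>& cpus,
                   const std::vector<std::string>& tunes)
{
  std::vector<std::string> values;
  for (const std::string& name : cpus)
    add_unique(values, name);
  for (const std::string& name : tunes)
    add_unique(values, name);
  return values;
}
```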
PR116347
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_get_valid_option_values):
Skip adding mtune entries that are already in the list.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Jakub Jelinek [Fri, 20 Dec 2024 09:17:56 +0000 (10:17 +0100)]
c++: Fix up maybe_unused attribute handling [PR110345]
When adding test coverage for maybe_unused attribute, I've run into
several things:
1) similarly to deprecated attribute, the attribute shouldn't pedantically
appertain to types other than class/enumeration definitions
2) similarly to deprecated attribute, the attribute shouldn't pedantically
appertain to unnamed bit-fields
3) the standard says that it can appertain to identifier labels, but
we handled it silently also on case and default labels
4) I've run into a weird spurious error on
int f [[maybe_unused]];
int & [[maybe_unused]] i = f;
int && [[maybe_unused]] j = 0;
The problem was that we create an attribute variant for the int &
type, then create an attribute variant for the int && type, and
the type_canon_hash hashing just thought those 2 are the same,
so used int & [[maybe_unused]] type for j rather than
int && [[maybe_unused]]. As TYPE_REF_IS_RVALUE is a flag in the
generic code, it was easily possible to hash that flag and compare
it.
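The effect of item 4 can be modelled outside of GCC with a small type
cache. This is only a sketch under assumed names (RefType, RefTypeHash,
RefTypeEq, TypeCache); GCC's real hashing lives in tree.cc and works on
tree nodes, not on this struct.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Toy model of a reference type: the referred-to type, whether it is an
// rvalue reference, and whether it carries an attribute.
struct RefType
{
  const void* referent;
  bool is_rvalue;
  bool has_attr;
};

// The hash mixes in is_rvalue, so "T &" and "T &&" variants differ.
struct RefTypeHash
{
  std::size_t operator()(const RefType& t) const
  {
    assert(t.referent != nullptr);
    std::size_t h = std::hash<const void*>{}(t.referent);
    h = h * 31 + (t.is_rvalue ? 1 : 0);
    h = h * 31 + (t.has_attr ? 1 : 0);
    return h;
  }
};

// Equality must compare the flag too; otherwise a lookup for the "T &&"
// variant could return the cached "T &" variant.
struct RefTypeEq
{
  bool operator()(const RefType& a, const RefType& b) const
  {
    return a.referent == b.referent
           && a.is_rvalue == b.is_rvalue
           && a.has_attr == b.has_attr;
  }
};

using TypeCache = std::unordered_set<RefType, RefTypeHash, RefTypeEq>;
```

If the is_rvalue comparison were dropped from RefTypeEq, the second
insertion for the same referent would be treated as a duplicate, which
mirrors the spurious error described above.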
2024-12-19 Jakub Jelinek <jakub@redhat.com>
PR c++/110345
gcc/
* tree.cc (type_hash_canon_hash): Hash TYPE_REF_IS_RVALUE for
REFERENCE_TYPE.
(type_cache_hasher::equal): Compare TYPE_REF_IS_RVALUE for
REFERENCE_TYPE.
gcc/cp/
* tree.cc (handle_maybe_unused_attribute): New function.
(std_attributes): Use handle_maybe_unused_attribute instead
of handle_unused_attribute for maybe_unused attribute.
gcc/testsuite/
* g++.dg/cpp0x/attr-maybe_unused1.C: New test.
* g++.dg/cpp0x/alignas21.C: Add test for int && alignas (int).
Jakub Jelinek [Fri, 20 Dec 2024 09:12:08 +0000 (10:12 +0100)]
c++: Disallow [[deprecated]] on types other than class/enum definitions [PR110345]
For C++ 26 P2552R3 I went through all the spots (except modules) where
attribute-specifier-seq appears in the grammar and tried to construct
a testcase in all those spots, for now for [[deprecated]] attribute.
The patch below contains that testcase. One needed change for this
particular attribute was that currently we handle [[deprecated]]
exactly the same as [[gnu::deprecated]], but for the latter unlike C++14
or later we allow it also on almost all types, while the standard
is strict and allows it only on
https://eel.is/c++draft/dcl.attr#deprecated-2
The attribute may be applied to the declaration of a class, a typedef-name,
a variable, a non-static data member, a function, a namespace,
an enumeration, an enumerator, a concept, or a template specialization.
The following patch just adds a pedwarn for the cases that gnu::deprecated
allows but C++14 disallows, so integral/floating/boolean types,
pointers/references, array types, function types etc.
Basically, for TYPE_P, if the attribute is applied in place (which means
the struct/union/class/enum definition), it is allowed, otherwise pedwarned.
I've tried to compile it also with latest clang and there is agreement in
most of the diagnostics, just at block scope (inside of foo) it doesn't
diagnose
auto e = new int [n] [[deprecated]];
auto e2 = new int [n] [[deprecated]] [42];
[[deprecated]] lab:;
and at namespace scope
[[deprecated]];
I think that all feels like a clang++ bug.
Also this pedwarns on
[[deprecated]] int : 0;
at class scope, that isn't a non-static data member...
I guess to mark the paper as implemented (or what has been already voted
into C++23 earlier) we'll need to add similar testcase for all the other
standard attributes and make sure we check what the attributes can appertain
to and what they can't.
2024-12-19 Jakub Jelinek <jakub@redhat.com>
PR c++/110345
* parser.cc (cp_parser_std_attribute): Don't transform
[[deprecated]] into [[gnu::deprecated]].
* tree.cc (handle_std_deprecated_attribute): New function.
(std_attributes): Add deprecated entry.
Fortran: Fix caf_stop_numeric and reporting exceptions from caf [PR57598]
Caf_stop_numeric always exited with code 0, which is wrong. It shall
behave like regular stop. Add reporting exceptions to caf's stop
handlers. For this the existing library routine had to be exported.
libgfortran/ChangeLog:
PR fortran/57598
* caf/single.c (_gfortran_caf_stop_numeric): Report exceptions
on stop. And fix send_by_ref.
(_gfortran_caf_stop_str): Same.
(_gfortran_caf_error_stop_str): Same.
(_gfortran_caf_error_stop): Same.
* gfortran.map: Add report_exception for export.
* libgfortran.h (report_exception): Add to internal export.
* runtime/stop.c (report_exception): Same.
c++/modules: Validate external linkage definitions in header units [PR116401]
[module.import] p6 says "A header unit shall not contain a definition of
a non-inline function or variable whose name has external linkage."
This patch implements this requirement, and cleans up some issues in the
testsuite where this was already violated. To handle deduction guides
we mark them as inline, since although we give them a definition for
implementation reasons, by the standard they have no definition, and so
we should not error in this case.
PR c++/116401
gcc/cp/ChangeLog:
* decl.cc (grokfndecl): Mark deduction guides as 'inline'.
* module.cc (check_module_decl_linkage): Implement checks for
non-inline external linkage definitions in headers.
c++/modules: Check linkage for exported declarations
By [module.interface] p3, if an exported declaration is not within a
header unit, it shall not declare a name with internal linkage.
Unfortunately we cannot just do this within set_originating_module,
since at the locations where it's called the linkage of declarations is not
always fully determined yet. We could move the calls but this causes
the checking assertion to fail as the originating module declaration may
have moved, and in general for some kinds of declarations it's not
always obvious where it should be moved to.
This patch instead introduces a new function to check that the linkage
of a declaration within a module is correct, to be called for all
declarations once their linkage is fully determined.
As a drive-by fix this patch also improves the source location of
namespace aliases to point at the identifier rather than the terminating
semicolon.
c++/modules: Support unnamed namespaces in header units
A header unit may contain unnamed namespaces, and those declarations
are exported (as with any declaration in a header unit). This patch
ensures that such declarations are correctly handled.
The change to 'make_namespace_finish' is required so that if an unnamed
namespace is first seen by an import it is correctly handled within
'add_imported_namespace'. I don't see any particular reason why
handling of unnamed namespaces here had to be handled separately outside
that function since these are the only two callers.
gcc/cp/ChangeLog:
* module.cc (depset::hash::add_binding_entity): Also walk
unnamed namespaces.
(module_state::write_namespaces): Adjust assertion.
* name-lookup.cc (push_namespace): Move anon using-directive
handling to...
(make_namespace_finish): ...here.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-9_a.H: New test.
* g++.dg/modules/internal-9_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Fri, 11 Oct 2024 11:16:02 +0000 (22:16 +1100)]
c++/modules: Ignore TU-local entities where necessary
[basic.link] p14 lists a number of circumstances where a declaration
naming a TU-local entity is not an exposure, notably the bodies of
non-inline templates and friend declarations in classes. This patch
ensures that these references do not error when exporting the module.
We do need to still error on instantiation from a different module,
however, in case this refers to a TU-local entity. As such this patch
adds a new tree TU_LOCAL_ENTITY which is used purely as a placeholder to
poison any attempted template instantiations that refer to it.
This is also streamed for friend decls so that merging (based on the
index of an entity into the friend decl list) doesn't break and to
prevent complicating the logic; I imagine this shouldn't ever come up
though.
We also add a new warning, '-Wtemplate-names-tu-local', to handle the
case where someone accidentally refers to a TU-local value from within a
non-inline function template. This will compile without errors as-is,
but any attempt to instantiate the decl will fail; this warning can be
used to ensure that this doesn't happen. The warning is silenced for
any declarations with explicit instantiations, since uses of those
instantiations would not be exposures.
The main piece that this patch doesn't yet attempt to solve is ADL: as
specified, if ADL adds an overload set that includes a translation-unit
local entity when instantiating a template, that overload set is now
poisoned and counts as an exposure. Unfortunately, we don't currently
differentiate between decls that are hidden due to not being exported,
or decls that are hidden due to being hidden friends, so this patch
instead just keeps the current (wrong) behaviour of non-exported
entities not being visible to ADL at all.
Additionally, this patch doesn't attempt to ignore non-ODR uses of
constants in constexpr functions or templates. The obvious approach of
folding them early in 'mark_use' doesn't seem to work (for a variety of
reasons), so this leaves this to a later patch to implement, as it's at
least no worse than the current behaviour and easy enough to work around.
For completeness this patch adds a new xtreme-header testcase to ensure
that we have no regressions with regards to exposures of TU-local
declarations in the standard library header files. A more restrictive
test would be to do 'export extern "C++"' here, but unfortunately the
system headers on some targets declare TU-local entities, so we'll make
do with checking that at least the C++ standard library headers don't
refer to such entities.
gcc/c-family/ChangeLog:
* c.opt: New warning '-Wtemplate-names-tu-local'.
gcc/cp/ChangeLog:
* cp-objcp-common.cc (cp_tree_size): Add TU_LOCAL_ENTITY.
* cp-tree.def (TU_LOCAL_ENTITY): New tree code.
* cp-tree.h (DECL_TEMPLATE_INSTANTIATIONS): Update comment.
(struct tree_tu_local_entity): New type.
(TU_LOCAL_ENTITY_NAME): New accessor.
(TU_LOCAL_ENTITY_LOCATION): New accessor.
(enum cp_tree_node_structure_enum): Add TS_CP_TU_LOCAL_ENTITY.
(union GTY): Add tu_local_entity field.
* module.cc (enum tree_tag): New flag DB_REFS_TU_LOCAL_BIT.
(depset::has_defn): Override for TU-local entities.
(depset::refs_tu_local): New accessor.
(depset::hash::ignore_tu_local): New field.
(depset::hash::hash): Initialize it.
(trees_out::tree_tag::tt_tu_local): New flag.
(trees_out::writing_local_entities): New field.
(trees_out::is_initial_scan): New function.
(trees_out::tu_local_count): New counter.
(trees_out::trees_out): Initialize writing_local_entities.
(dumper::impl::nested_name): Handle TU_LOCAL_ENTITY.
(trees_out::instrument): Report TU-local entity counts.
(trees_out::decl_value): Early exit for TU-local entities.
(trees_in::decl_value): Handle typedefs of TU-local entities.
(trees_out::decl_node): Adjust assertion to cope with early exit
of TU-local deps. Always write TU-local entities by value.
(trees_out::type_node): Handle TU-local types.
(trees_out::has_tu_local_dep): New function.
(trees_out::find_tu_local_decl): New function.
(trees_out::tree_node): Intercept TU-local entities and write
placeholder values for them instead of normal streaming.
(trees_in::tree_node): Handle TU-local entities and TU-local
template results.
(trees_out::write_function_def): Ignore exposures in non-inline
function bodies.
(trees_out::write_var_def): Ignore exposures in initializers.
(trees_out::write_class_def): Ignore exposures in friend decls.
(trees_in::read_class_def): Skip TU-local friends.
(trees_out::write_definition): Record whether we're writing a
decl which refers to TU-local entities.
(depset::hash::add_dependency): Only mark as exposure if we're not
ignoring TU-local entities.
(depset::hash::find_dependencies): Use depset's own is_key_order
function rather than delegating via walker. Pass whether the
decl has ignored TU-local entities in its definition.
(template_has_explicit_inst): New function.
(depset::hash::finalize_dependencies): Implement new warning
Wtemplate-names-tu-local.
(module_state::intercluster_seed): Don't seed TU-local deps.
(module_state::write_cluster): Pass whether the decl has ignored
TU-local entities in its definition.
* pt.cc (register_specialization): Always register in a module.
(complain_about_tu_local_entity): New function.
(expr_contains_tu_local_entity): New function.
(function_contains_tu_local_entity): New function.
(instantiate_class_template): Skip TU-local friends.
(tsubst_decl): Handle typedefs of TU-local entities.
(tsubst): Complain about TU-local entities.
(dependent_operand_p): Early exit for TU-local entities so we
don't attempt to constant-evaluate them.
(tsubst_expr): Detect and complain about TU-local entities.
* g++.dg/modules/internal-5_a.C: New test.
* g++.dg/modules/internal-5_b.C: New test.
* g++.dg/modules/internal-6.C: New test.
* g++.dg/modules/internal-7_a.C: New test.
* g++.dg/modules/internal-7_b.C: New test.
* g++.dg/modules/internal-8_a.C: New test.
* g++.dg/modules/xtreme-header-8.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Tue, 8 Oct 2024 09:50:38 +0000 (20:50 +1100)]
c++/modules: Detect exposures of TU-local entities
Currently, the modules streaming code implements some checks for
declarations in the CMI that reference (some kinds of) internal-linkage
entities, and errors if so. This patch expands on that support to
implement the logic for exposures of TU-local entities as defined in
[basic.link] since P1815.
This will cause some code that previously errored in modules to start
compiling; for instance, template specialisations of internal linkage
functions.
However, some code that previously appeared valid will with this patch
no longer compile, notably some kinds of usages of internal linkage
functions included from the GMF. This appears to be related to P2808
and FR-025, however as yet there doesn't appear to be consensus for
changing these rules so I've implemented them as-is.
This patch leaves a couple of things out. In particular, a couple of
the rules for what is a TU-local entity currently seem to me to be
redundant; I've left them as FIXMEs to be handled once I can find
testcases that aren't adequately supported by the other logic here.
Additionally, there are some exceptions for when naming a TU-local
entity is not always an exposure; I've left support for this to a
follow-up patch for easier review, as it has broader implications for
streaming.
TU-local lambdas are also not yet properly implemented, due to other
bugs with regards to LAMBDA_TYPE_EXTRA_SCOPE not being set in all cases
that it probably should be (see also PR c++/116568). We can revisit
this once that issue has been fixed.
Finally, this patch makes a couple of small adjustments to the modules
streaming logic to prune any leftover TU-local deps (that aren't
erroneous exposures). This is required for this patch to ensure that
later stages don't get confused by any leftover TU-local entities
floating around.
gcc/cp/ChangeLog:
* tree.cc (decl_linkage): Treat DECL_SELF_REFERENCE_P like
DECL_IMPLICIT_TYPEDEF_P.
* name-lookup.cc (do_namespace_alias): Fix linkage.
* module.cc (DB_IS_INTERNAL_BIT): Rename to...
(DB_TU_LOCAL_BIT): ...this.
(DB_REFS_INTERNAL_BIT): Rename to...
(DB_EXPOSURE_BIT): ...this.
(depset::hash::is_internal): Rename to...
(depset::hash::is_tu_local): ...this.
(depset::hash::refs_internal): Rename to...
(depset::hash::is_exposure): ...this.
(depset::hash::is_tu_local_entity): New function.
(depset::hash::has_tu_local_tmpl_arg): New function.
(depset::hash::is_tu_local_value): New function.
(depset::hash::make_dependency): Check for TU-local entities.
(depset::hash::add_dependency): Make current an exposure
whenever it references a TU-local entity.
(depset::hash::add_binding_entity): Don't create bindings for
any TU-local entity.
(depset::hash::finalize_dependencies): Rename flags and adjust
diagnostic messages to report exposures of TU-local entities.
(depset::tarjan::connect): Don't include any TU-local depsets.
(depset::hash::connect): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/block-decl-2.C: Adjust messages.
* g++.dg/modules/internal-1.C: Adjust messages, remove XFAILs.
* g++.dg/modules/linkage-2.C: Adjust messages, remove XFAILs.
* g++.dg/modules/internal-3.C: New test.
* g++.dg/modules/internal-4_a.H: New test.
* g++.dg/modules/internal-4_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
François Dumont [Mon, 22 Jul 2024 19:54:36 +0000 (21:54 +0200)]
libstdc++: Add fancy pointer support to std::map and std::set [PR57272]
The fancy allocator pointer type support is added to std::map,
std::multimap, std::multiset and std::set through the underlying
std::_Rb_tree class.
To respect ABI a new parallel hierarchy of node types has been added.
This change introduces a new class template parameterized on the
allocator's void_pointer type, __rb_tree::_Node_base, and new class
templates parameterized on the allocator's pointer type, __rb_tree::_Node,
__rb_tree::_Iterator. The iterator class template is used for both
iterator and const_iterator. Whether std::_Rb_tree<K, V, KoV, C, A>
should use the old _Rb_tree_node<V> or new __rb_tree::_Node<A::pointer>
type family internally is controlled by a new __rb_tree::_Node_traits
traits template.
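The selection between the two node families can be sketched as a traits
template with a partial specialization for raw pointers. All names here
(OldNode, NewNode, FancyPtr, NodeTraits) are stand-ins invented for this
illustration; the real __rb_tree::_Node_traits in stl_tree.h may be
structured differently and also honours the macro described below.

```cpp
#include <type_traits>

// Stand-in for the existing raw-pointer node family.
template<typename Val>
struct OldNode
{
  OldNode* left;
  OldNode* right;
  Val value;
};

// Stand-in for the new node family, parameterized on the allocator's
// pointer type.
template<typename Ptr>
struct NewNode
{
  Ptr left;
  Ptr right;
};

// A toy fancy pointer: a class type that merely wraps a raw pointer.
template<typename T>
struct FancyPtr
{
  T* raw;
};

// Primary template: any non-raw pointer type selects the new nodes.
template<typename Val, typename Ptr>
struct NodeTraits
{
  using Node = NewNode<Ptr>;
};

// Raw pointers keep the old node type, so existing code keeps its ABI.
template<typename Val>
struct NodeTraits<Val, Val*>
{
  using Node = OldNode<Val>;
};

static_assert(std::is_same<NodeTraits<int, int*>::Node, OldNode<int>>::value,
              "raw pointers keep the old node type");
```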
Because std::pointer_traits and std::__to_address are not defined for
C++98, there is no way to support fancy pointers in C++98. For C++98 the
_Node_traits traits always choose the old _Rb_tree_node family.
In case anybody is currently using std::_Rb_tree with an allocator that
has a fancy pointer, this change would be an ABI break, because their
std::_Rb_tree instantiations would start to (correctly) use the fancy
pointer type. If the fancy pointer just contains a single pointer and so
has the same size, layout, and object representation as a raw pointer,
the code might still work (despite being an ODR violation). But if their
fancy pointer has a different representation, they would need to
recompile all their code using that allocator with std::_Rb_tree. Because
std::_Rb_tree will never use fancy pointers in C++98 mode, recompiling
everything to use fancy pointers isn't even possible if mixing C++98 and
C++11 code that uses std::_Rb_tree. To alleviate this problem, compiling
with -D_GLIBCXX_USE_ALLOC_PTR_FOR_RB_TREE=0 will force std::_Rb_tree to
have the old, non-conforming behaviour and use raw pointers internally.
For testing purposes, compiling with -D_GLIBCXX_USE_ALLOC_PTR_FOR_RB_TREE=9001
will force std::_Rb_tree to always use the new node types. This macro is
currently undocumented, which needs to be fixed.
As _Rb_tree is using _Base_ptr to represent the tree this change also
simplifies the implementation by removing all the const pointer types
and associated methods.
libstdc++-v3/ChangeLog:
PR libstdc++/57272
* include/bits/stl_tree.h
[_GLIBCXX_USE_ALLOC_PTR_FOR_RB_TREE]: New macro to control usage of the
code required to support fancy allocator pointer type.
(_Rb_tree_node_base::_Const_Base_ptr): Remove.
(_Rb_tree_node_base::_S_minimum, _Rb_tree_node_base::_S_maximum): Remove
overloads for _Const_Base_ptr.
(_Rb_tree_node_base::_M_base_ptr()): New.
(_Rb_tree_node::_Link_type): Remove.
(_Rb_tree_node::_M_node_ptr()): New.
(__rb_tree::_Node_base<>): New.
(__rb_tree::_Header<>): New.
(__rb_tree::_Node<>): New.
(_Rb_tree_increment(const _Rb_tree_node_base*)): Remove declaration.
(_Rb_tree_decrement(const _Rb_tree_node_base*)): Remove declaration.
(_Rb_tree_iterator<>::_Self): Remove.
(_Rb_tree_iterator<>::_Link_type): Rename into...
(_Rb_tree_iterator<>::_Node_ptr): ...this.
(_Rb_tree_const_iterator<>::_Link_type): Rename into...
(_Rb_tree_const_iterator<>::_Node_ptr): ...this.
(_Rb_tree_const_iterator<>::_M_const_cast): Remove.
(_Rb_tree_const_iterator<>::_M_node): Change type into _Base_ptr.
(__rb_tree::_Iterator<>): New.
(__rb_tree::_Node_traits<>): New.
(_Rb_tree<>::_Node_base, _Rb_tree::_Node): New.
(_Rb_tree<>::_Link_type): Rename into...
(_Rb_tree<>::_Node_ptr): ...this.
(_Rb_tree<>::_Const_Base_ptr, _Rb_tree<>::_Const_Node_ptr): Remove.
(_Rb_tree<>::_M_mbegin): Remove.
(_Rb_tree<>::_M_begin_node()): New.
(_S_key(const _Node&)): New.
(_S_key(_Base_ptr)): New, call latter.
(_S_key(_Node_ptr)): Likewise.
(_Rb_tree<>::_S_left(_Const_Base_ptr)): Remove.
(_Rb_tree<>::_S_right(_Const_Base_ptr)): Remove.
(_Rb_tree<>::_S_maximum(_Const_Base_ptr)): Remove.
(_Rb_tree<>::_S_minimum(_Const_Base_ptr)): Remove.
* testsuite/23_containers/map/allocator/ext_ptr.cc: New test case.
* testsuite/23_containers/multimap/allocator/ext_ptr.cc: New test case.
* testsuite/23_containers/multiset/allocator/ext_ptr.cc: New test case.
* testsuite/23_containers/set/allocator/ext_ptr.cc: New test case.
* testsuite/23_containers/set/requirements/explicit_instantiation/alloc_ptr.cc:
New test case.
* testsuite/23_containers/set/requirements/explicit_instantiation/alloc_ptr_ignored.cc:
New test case.
Patrick Palka [Thu, 19 Dec 2024 17:00:31 +0000 (12:00 -0500)]
c++: optimize constraint subsumption [PR118069]
Since atomic constraints are interned the subsumption machinery can
safely use pointer instead of structural hashing for them. This speeds
up compilation of the testcase in the PR from ~3s to ~2s.
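Why interning makes pointer hashing safe can be shown with a small
stand-alone model. The Interner class and the use of std::string as the
"atom" are assumptions of this sketch and have nothing to do with GCC's
actual atom_hasher.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Toy interner: structurally equal values map to one canonical object.
class Interner
{
  std::unordered_set<std::string> pool_;

public:
  // Returns a stable pointer; unordered_set never relocates its elements.
  const std::string* intern(const std::string& s)
  {
    auto it = pool_.insert(s).first;
    assert(it != pool_.end());
    return &*it;
  }
};

// Once values are interned, a set keyed on the pointer is enough:
// pointer equality implies structural equality and vice versa, and
// hashing a pointer does not need to walk the structure.
using AtomSet = std::unordered_set<const std::string*>;
```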
PR c++/118069
gcc/cp/ChangeLog:
* constraint.cc (atom_hasher): Define here, instead of ...
* cp-tree.h (atom_hasher): ... here.
* logic.cc (clause::m_set): Use pointer instead of structural
hashing.
Patrick Palka [Thu, 19 Dec 2024 17:00:29 +0000 (12:00 -0500)]
c++: integer overflow during constraint subsumption [PR118069]
For the testcase in the PR we hang during constraint subsumption
ultimately because one of the constraints is complex enough that its
conjunctive normal form is calculated to have more than 2^31 clauses,
which causes the size calculation (through an int) to overflow and so
the optimization in subsumes_constraints_nonnull
if (dnf_size (lhs) <= cnf_size (rhs))
// iterate over DNF of LHS
else
// iterate over CNF of RHS
incorrectly decides to loop over the CNF (>> billions of clauses)
instead of the DNF (thousands of clauses).
I haven't verified that the result of cnf_size is correct for the
problematic constraint but integer overflow is definitely plausible
given that CNF/DNF can be exponentially larger than the original
constraint in the worst case.
This patch fixes this by using 64-bit saturating arithmetic during
these size calculations (via new add/mul_sat_hwi functions) so that
overflow is less likely and if it does occur we handle it gracefully.
It should be highly unlikely that both the DNF and CNF sizes overflow,
and if they do then it doesn't matter which form we select, subsumption
will take forever either way. The testcase now compiles in ~3 seconds
on my machine after this change.
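For illustration, saturating 64-bit addition and multiplication for
non-negative sizes could look like the sketch below. The names add_sat
and mul_sat and the restriction to non-negative operands are assumptions
of this example; they are not the add_sat_hwi/mul_sat_hwi functions
added to hwint.h.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Sketch only: clamp to INT64_MAX instead of overflowing.
// Both operands are assumed to be non-negative, as clause counts are.
std::int64_t add_sat(std::int64_t a, std::int64_t b)
{
  assert(a >= 0 && b >= 0);
  const std::int64_t max = std::numeric_limits<std::int64_t>::max();
  return (a > max - b) ? max : a + b;
}

std::int64_t mul_sat(std::int64_t a, std::int64_t b)
{
  assert(a >= 0 && b >= 0);
  const std::int64_t max = std::numeric_limits<std::int64_t>::max();
  if (a != 0 && b > max / a)
    return max;  // The exact product would not fit.
  return a * b;
}
```

With saturation, a form whose size overflows compares as "at least as
large as anything representable", so the smaller normal form is still
chosen whenever only one of the two sizes overflows.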
PR c++/118069
gcc/ChangeLog:
* hwint.h (add_sat_hwi): New function.
(mul_sat_hwi): Likewise.
gcc/cp/ChangeLog:
* logic.cc (dnf_size_r): Use HOST_WIDE_INT instead of int, and
handle overflow gracefully via add_sat_hwi and mul_sat_hwi.
(cnf_size_r): Likewise.
(dnf_size): Use HOST_WIDE_INT instead of int.
(cnf_size): Likewise.
Tobias Burnus [Thu, 19 Dec 2024 16:27:41 +0000 (17:27 +0100)]
OpenMP: Add 'nec' to the 'vendor' context-selector list
For unknown vendors used in a context selector such as
match(implementation={vendor(...)})
GCC prints a warning like:
warning: unknown property 'nec' of 'vendor' selector
While all known vendors (including the vendor 'unknown') are silently
accepted, only "gnu" counts as matched by GCC.
The list of known vendors is published in OpenMP's additional
definition document (or, previously, the context definitions document).
While the initial list did not contain 'nec', it was added quite early
but GCC missed this addition, which this commit rectifies.
Some history:
* GCC added the list in r10-3744-g94e7f906ca5c73 (Oct 2019)
* At spec level, 'pgi' was replaced by 'nvidia' in Nov 2019, but
GCC (since r10-4639-gd0ec7c935f0c96, Nov 2019) and LLVM recognize
both vendor names.
* 'nec' was then added in Dec 2019 and is present in
"Context Definitions for the OpenMP API Specification Version 5.0
– Version 1.0", but only this commit adds it.
* 'hpe' (as alias for 'cray') was added to the spec in Nov 2020 but
to GCC only in r14-6720-gd0603dfe9d3bc7 (Dec 2023).
Patrick Palka [Thu, 19 Dec 2024 16:31:19 +0000 (11:31 -0500)]
libstdc++: Implement C++23 <flat_set> (P1222R4)
This implements the C++23 container adaptors std::flat_set and
std::flat_multiset from P1222R4. The implementation is essentially
a simpler and pared down version of std::flat_map.
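The core idea of a flat set, a sorted contiguous container searched by
binary search, can be sketched as follows. FlatSetSketch is a
hypothetical toy written for this note; it is not the libstdc++
implementation and omits comparators, allocators, iterators and the
rest of the std::flat_set interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

template<typename Key>
class FlatSetSketch
{
  std::vector<Key> keys_;  // Invariant: sorted, no duplicates.

public:
  // Inserts k at its sorted position; returns false if already present.
  bool insert(const Key& k)
  {
    auto it = std::lower_bound(keys_.begin(), keys_.end(), k);
    if (it != keys_.end() && !(k < *it))
      return false;
    keys_.insert(it, k);
    assert(std::is_sorted(keys_.begin(), keys_.end()));
    return true;
  }

  bool contains(const Key& k) const
  {
    return std::binary_search(keys_.begin(), keys_.end(), k);
  }

  std::size_t size() const { return keys_.size(); }
  const std::vector<Key>& keys() const { return keys_; }
};
```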
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new header <flat_set>.
* include/Makefile.in: Regenerate.
* include/bits/version.def (__cpp_flat_set): Define.
* include/bits/version.h: Regenerate.
* include/precompiled/stdc++.h: Include <flat_set>.
* include/std/flat_set: New file.
* src/c++23/std.cc.in: Export <flat_set>.
* testsuite/23_containers/flat_multiset/1.cc: New test.
* testsuite/23_containers/flat_set/1.cc: New test.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Patrick Palka [Thu, 19 Dec 2024 16:31:09 +0000 (11:31 -0500)]
libstdc++: Implement C++23 <flat_map> (P0429R9)
This implements the C++23 container adaptors std::flat_map and
std::flat_multimap from P0429R9. The implementation is shared
as much as possible between the two adaptors via a common base
class that's parameterized according to key uniqueness.
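Sharing one base between the unique and the multi adaptor can be
sketched with a bool template parameter. The names FlatMapBase,
FlatMapSketch and FlatMultimapSketch are invented for this example, and
the sketch covers only insertion and counting; the real header is far
more complete.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Keys and values live in two parallel sorted vectors. `Unique` decides
// whether inserting an already-present key is rejected.
template<typename Key, typename T, bool Unique>
class FlatMapBase
{
  std::vector<Key> keys_;
  std::vector<T> values_;

public:
  bool insert(const Key& k, const T& v)
  {
    auto it = std::upper_bound(keys_.begin(), keys_.end(), k);
    // The element before `it` is <= k; if it is not < k, it equals k.
    if (Unique && it != keys_.begin() && !(*(it - 1) < k))
      return false;
    const auto pos = it - keys_.begin();
    keys_.insert(it, k);
    values_.insert(values_.begin() + pos, v);
    assert(std::is_sorted(keys_.begin(), keys_.end()));
    return true;
  }

  std::size_t count(const Key& k) const
  {
    auto range = std::equal_range(keys_.begin(), keys_.end(), k);
    return static_cast<std::size_t>(range.second - range.first);
  }

  std::size_t size() const { return keys_.size(); }
};

template<typename Key, typename T>
using FlatMapSketch = FlatMapBase<Key, T, true>;

template<typename Key, typename T>
using FlatMultimapSketch = FlatMapBase<Key, T, false>;
```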
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new header <flat_map>.
* include/Makefile.in: Regenerate.
* include/bits/alloc_traits.h (__not_allocator_like): New concept.
* include/bits/stl_function.h (__transparent_comparator): Likewise.
* include/bits/stl_iterator_base_types.h (__has_input_iter_cat):
Likewise.
* include/bits/uses_allocator.h (__allocator_for): Likewise.
* include/bits/utility.h (sorted_unique_t): Define for C++23.
(sorted_unique): Likewise.
(sorted_equivalent_t): Likewise.
(sorted_equivalent): Likewise.
* include/bits/version.def (flat_map): Define.
* include/bits/version.h: Regenerate.
* include/precompiled/stdc++.h: Include <flat_map>.
* include/std/flat_map: New file.
* src/c++23/std.cc.in: Export <flat_map>.
* testsuite/23_containers/flat_map/1.cc: New test.
* testsuite/23_containers/flat_multimap/1.cc: New test.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
and then we pass the EXPR_STMT to maybe_constant_init, with D.2707 as
the object. But their types don't match anymore, so we crash. We'd
have to pass D.2707.it as the object for it to work.
This patch adjusts cxx_eval_outermost_constant_expr to take the object's
type if available.
constexpr-prvalue3.C is reduced from a large std::ranges libstdc++ test.
PR c++/117980
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_outermost_constant_expr): If there's
an object to initialize, take its type. Don't set the type
in the constexpr dtor case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-prvalue2.C: New test.
* g++.dg/cpp0x/constexpr-prvalue3.C: New test.
testsuite: arm: Use effective-target for memset-inline* tests
Split tests into 2 parts:
- The first part checks the assembler generated.
- The second part does the run test and this part now requires
effective-target arm_neon_hw.
Jakub Jelinek [Thu, 19 Dec 2024 10:36:29 +0000 (11:36 +0100)]
testsuite: Fix toplevel-asm-1.c failure for riscv
On Wed, Dec 18, 2024 at 01:19:43PM +0100, Andreas Schwab wrote:
> On Dez 12 2024, Jakub Jelinek wrote:
>
> > The intent was to test %cN because %N doesn't DTRT on various targets.
> > I have a patch to add %ccN support which should then work even on riscv
> > hopefully, but unfortunately it hasn't been fully reviewed yet.
>
> That didn't change toplevel-asm-1, so the failure remains.
Yes, I've only committed what was approved.
The following patch ought to fix this (and if there are other targets which
don't really support %cN for SYMBOL_REFs even with -fno-pic, they can be
added there too; I think it is useful to test %cN on the targets where it
works though).
2024-12-19 Jakub Jelinek <jakub@redhat.com>
* c-c++-common/toplevel-asm-1.c: Use %cc3 %cc4 instead of %c3 %c4
on riscv.
Pan Li [Thu, 19 Dec 2024 01:03:59 +0000 (09:03 +0800)]
RISC-V: Adjust the strided store testcases check times on options
The vsse* dump check counts vary with the options (O2, O3) after we add
(mem:BLK (scratch)) to the define_insn of strided load.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c: Adjust
the vsse check times based on optimization option.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c: Ditto.
Pan Li [Thu, 19 Dec 2024 00:58:20 +0000 (08:58 +0800)]
RISC-V: Make vector strided store alias all other memories
Almost the same as the RVV strided load, the vector strided store
doesn't involve the (mem:BLK (scratch)) to alias all other memories.
It will make the alias analysis only consider the base address of
strided store.
PR target/118075
gcc/ChangeLog:
* config/riscv/vector.md: Add the (mem:BLK (scratch)) as the
lhs of strided store define insn.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr118075-run-1.c: New test.
Alexandre Oliva [Thu, 19 Dec 2024 01:17:31 +0000 (22:17 -0300)]
ifcombine field merge: handle masks with sign extensions
When a loaded field is sign extended, masked and compared, we used to
drop from the mask the bits past the original field width, which is
not correct.
Take note of the fact that the mask covered copies of the sign bit,
before clipping it, and arrange to test the sign bit if we're
comparing with zero. Punt in other cases.
If bits_test fails recoverably, try other ifcombine strategies.
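A minimal sketch of the sign-extension case, not taken from the patch. It
assumes an 8-bit signed field widened to int and a mask bit (0x100, picked
only for illustration) that lies past the field width. That bit only ever
holds a copy of the sign bit, so the masked compare with zero is equivalent
to testing the field's sign bit, whereas clipping the mask to the field
width first would test nothing. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Compare as written in the source: sign-extend the field, mask, test.
bool masked_wide (int8_t field)
{
  return (static_cast<int> (field) & 0x100) != 0;
}

// Equivalent narrow test: look at the sign bit of the original field.
bool sign_bit_test (int8_t field)
{
  return (static_cast<uint8_t> (field) & 0x80) != 0;
}

// Incorrect narrowing: clip the mask to the 8-bit field width first.
// 0x100 & 0xff is 0, so this can never be true.
bool clipped_mask_test (int8_t field)
{
  return (static_cast<uint8_t> (field) & (0x100 & 0xff)) != 0;
}

// Exhaustively check that the wide form and the sign bit test agree.
void check_sign_forms ()
{
  for (int v = -128; v <= 127; v++)
    {
      int8_t f = static_cast<int8_t> (v);
      assert (masked_wide (f) == sign_bit_test (f));
    }
}
```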
for gcc/ChangeLog
* gimple-fold.cc (decode_field_reference): Add psignbit
parameter. Set it if the mask references sign-extending
bits.
(fold_truth_andor_for_ifcombine): Adjust calls with new
variables. Swap them along with other r?_* variables. Handle
extended sign bit compares with zero.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): If bits_test
fails in a way that doesn't prevent other ifcombine strategies
from passing, give them a try.
Alexandre Oliva [Thu, 19 Dec 2024 01:17:18 +0000 (22:17 -0300)]
ifcombine field merge: handle bitfield zero tests in range tests
Some bitfield compares with zero are optimized to range tests, so
instead of X & ~(Bit - 1) != 0 what reaches ifcombine is X > (Bit - 1),
where Bit is a power of two and X is unsigned.
This patch recognizes this optimized form of masked compares, and
attempts to merge them like masked compares, which enables some more
field merging that a folder version of fold_truth_andor used to handle
without additional effort.
I haven't seen X & ~(Bit - 1) == 0 become X <= (Bit - 1), or X < Bit
for that matter, but it was easy enough to handle the former
symmetrically to the above.
The latter was also easy enough, and so was its symmetric, X >= Bit,
that is handled like X & ~(Bit - 1) != 0.
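The equivalences described above can be checked exhaustively on a small
type. This is only an illustrative sketch with hypothetical function names,
assuming a 16-bit unsigned X and Bit a power of two; it is not code from
gimple-fold.cc.

```cpp
#include <cassert>
#include <cstdint>

// Masked form: is any bit at or above Bit set?  Bit must be a power of two.
bool masked_nonzero (uint16_t x, uint16_t bit)
{
  return (x & static_cast<uint16_t> (~(bit - 1))) != 0;
}

// Range form that earlier optimizations may produce instead.
bool range_gt (uint16_t x, uint16_t bit)
{
  return x > bit - 1;
}

// Check every power of two against every 16-bit value.
void check_range_forms ()
{
  for (unsigned shift = 0; shift < 16; shift++)
    {
      uint16_t bit = static_cast<uint16_t> (1u << shift);
      for (unsigned x = 0; x <= 0xffff; x++)
        {
          uint16_t v = static_cast<uint16_t> (x);
          assert (masked_nonzero (v, bit) == range_gt (v, bit));
          // X >= Bit is the same predicate; X < Bit is the == 0 form.
          assert ((v >= bit) == range_gt (v, bit));
          assert ((v < bit) == !masked_nonzero (v, bit));
        }
    }
}
```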
for gcc/ChangeLog
* gimple-fold.cc (decode_field_reference): Accept incoming
mask.
(fold_truth_andor_for_ifcombine): Handle some compares with
powers of two, minus 1 or 0, like masked compares with zero.
Alexandre Oliva [Thu, 19 Dec 2024 01:17:13 +0000 (22:17 -0300)]
noncontiguous ifcombine: skip marking of non-SSA_NAMEs [PR117915]
When ifcombine_mark_ssa_name is called directly, rather than by
ifcombine_mark_ssa_name_walk, we need to check that name is an
SSA_NAME at the caller or in the function itself. For convenience and
safety, I'm moving the checks from _walk to the implementation proper.
Alexandre Oliva [Thu, 19 Dec 2024 01:17:02 +0000 (22:17 -0300)]
ifcombine field merge: do not follow a second conversion [PR118046]
The testcase shows that conversions that would impact negatively the
ifcombine field merging implementation won't always have been
optimized out by the time we reach ifcombine.
There's probably room to support multiple conversions with extra
logic, but this workaround should avoid codegen errors until that
logic is figured out.
for gcc/ChangeLog
PR tree-optimization/118046
* gimple-fold.cc (decode_field_reference): Don't follow more
than one conversion.
Alexandre Oliva [Thu, 19 Dec 2024 01:16:58 +0000 (22:16 -0300)]
ifcombine field merge: stricten loads tests, swap compare to match
ACATS-4 ca11d02 exposed an error in the logic for recognizing and
identifying the inner object in decode_field_ref: a view-converting
load, inserted in a previous successful field merging operation, was
recognized by gimple_convert_def_p within decode_field_reference, and
as a result we took its operand as the expression, and failed to take
note of the load location.
Without that load, we couldn't compare vuses, and then we ended up
inserting a wider load before relevant parts of the object were
initialized.
This patch makes gimple_convert_def_p recognize loads only when
requested, and requires that either both or neither parts of a
potentially merged operand have associated loads.
As a bonus, it enables additional optimizations by swapping the
operands of the second compare when that makes left-hand operands
of both compares match.
for gcc/ChangeLog
* gimple-fold.cc (gimple_convert_def_p): Reject load stmts
unless requested.
(decode_field_reference): Accept a converting load at the last
conversion matcher, subsuming the load identification.
(fold_truth_andor_for_ifcombine): Refuse to merge operands
when only one of them has an associated load stmt. Swap
operands of one of the compares if that helps them match.
Eric Botcazou [Wed, 18 Dec 2024 20:48:36 +0000 (21:48 +0100)]
Fix bootstrap failure on SPARC with -O3 -mvis3
This replaces the use of FAIL in the new vec_cmp[u] expanders by that of a
predicate for the operator, which is (apparently) required for the optabs
machinery to properly compute the set of supported vector comparisons.
gcc/
PR target/118096
* config/sparc/predicates.md (vec_cmp_operator): New predicate.
(vec_cmpu_operator): Likewise.
* config/sparc/sparc.md (vec_cmp<FPCMP:mode><P:mode>): Use the
vec_cmp_operator predicate instead of FAILing the expansion.
(vec_cmpu<FPCMP:mode><P:mode>): Likewise for vec_cmpu_operator.
Michal Jires [Wed, 18 Dec 2024 17:28:46 +0000 (18:28 +0100)]
ipcp don't propagate where not needed - fix uninit constructor
Removed the uninitialized empty constructor, as it was objected to.
gcc/ChangeLog:
* lto-cgraph.cc (lto_symtab_encoder_delete_node):
Declare var later when initialized.
* lto-streamer.h (struct lto_encoder_entry):
Remove empty constructor.
Tamar Christina [Wed, 18 Dec 2024 16:39:25 +0000 (16:39 +0000)]
libstdc++: Adjust probabilities of hashmap loop conditions
We are currently generating a loop which has more comparisons than you'd
typically need as the probabilities on the small size loop are such that it
assumes the likely case is that an element is not found.
This again generates a pattern that's harder for branch predictors to follow,
but also just generates more instructions for what one could say is the
typical case: That your hashtable contains the entry you are looking for.
This patch adds a __builtin_expect in _M_find_before_node where at the moment
the loop is optimized for the case where we don't do any iterations.
A simple testcase is (compiled with -fno-split-path to simulate the loop
in libstdc++):
#include <stdbool.h>
bool foo (int **a, int n, int val, int *tkn)
{
for (int i = 0; i < n; i++)
{
if (!a[i] || a[i]==tkn)
return false;
i.e. BB rotation makes it generate an unconditional branch to a conditional
branch. However this method is only called when the size is above a certain
threshold, and so it's likely that we have to do that first iteration.
Adding:
#include <stdbool.h>
bool foo (int **a, int n, int val, int *tkn)
{
for (int i = 0; i < n; i++)
{
if (__builtin_expect(!a[i] || a[i]==tkn, 0))
return false;
if (*a[i] == val)
return true;
}
}
to indicate that we will likely do an iteration more generates:
Jonathan Wakely [Sat, 14 Dec 2024 01:17:27 +0000 (01:17 +0000)]
libstdc++: Clear std::priority_queue after moving from it [PR118088]
We don't know what state an arbitrary sequence container will be in
after moving from it, so a moved-from std::priority_queue needs to clear
the moved-from container to ensure it doesn't contain elements that are
in an invalid order for the queue. An alternative would be to call
std::make_heap again to re-establish the rvalue queue's invariant, but
that could potentially cause an exception to be thrown. Just clearing it
so the sequence is empty seems safer and more likely to match user
expectations.
libstdc++-v3/ChangeLog:
PR libstdc++/118088
* include/bits/stl_queue.h (priority_queue(priority_queue&&)):
Clear the source object after moving from it.
(priority_queue(priority_queue&&, const Alloc&)): Likewise.
(operator=(priority_queue&&)): Likewise.
* testsuite/23_containers/priority_queue/118088.cc: New test.
Michal Jires [Thu, 24 Oct 2024 01:02:55 +0000 (03:02 +0200)]
lto: Remap node order for stability.
This patch adds remapping of node order for each lto partition.
Resulting order conserves relative order inside partition, but
is independent of outside symbols. So if lto partition contains
identical set of symbols, their remapped order will be stable
between compilations.
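One way to picture the remapping, as a rough sketch only: replace each
global order value by its rank among the orders present in the partition.
The helper name and the use of plain ints are assumptions for illustration,
and mapping equal orders to the same rank is a choice made here, not
something the patch description states.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: map each global order value to its rank within
// the partition.  Relative order is kept; absolute values are dropped.
std::vector<int> remap_orders (const std::vector<int> &orders)
{
  std::vector<int> sorted (orders);
  std::sort (sorted.begin (), sorted.end ());
  sorted.erase (std::unique (sorted.begin (), sorted.end ()), sorted.end ());

  std::vector<int> remapped;
  remapped.reserve (orders.size ());
  for (int order : orders)
    {
      auto it = std::lower_bound (sorted.begin (), sorted.end (), order);
      remapped.push_back (static_cast<int> (it - sorted.begin ()));
    }
  return remapped;
}

// Two partitions with the same symbols but different outside symbols
// (hence different global orders) remap to the same sequence.
void check_remap_is_stable ()
{
  std::vector<int> first = remap_orders ({10, 42, 17});
  std::vector<int> second = remap_orders ({110, 942, 517});
  assert (first == second);
}
```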
Michal Jires [Thu, 24 Oct 2024 00:21:00 +0000 (02:21 +0200)]
Node clones share order.
Symbol order corresponds to the order in source code.
For clones, the order is currently chosen arbitrarily as max order++.
It would be more consistent with the original purpose for a clone to
share the order of its original node.
This stabilizes clone order for Incremental LTO.
Order is thus no longer unique, but this property is not used outside
of previous patch, where we can use uid.
If total order would be needed, sorting by order and then uid suffices.
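The fallback mentioned above, sorting by order and then by uid, could look
like the following sketch. The node_key struct and function names are
hypothetical stand-ins, not the cgraph data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the two fields of interest on a symbol.
struct node_key
{
  int order;  // may be shared between a node and its clones
  int uid;    // unique per node
};

// Lexicographic comparison: order first, uid breaks ties.
bool node_key_less (const node_key &a, const node_key &b)
{
  return std::tie (a.order, a.uid) < std::tie (b.order, b.uid);
}

// Returns the nodes in a total order even when orders repeat.
std::vector<node_key> total_order (std::vector<node_key> nodes)
{
  std::sort (nodes.begin (), nodes.end (), node_key_less);
  return nodes;
}

void check_total_order ()
{
  std::vector<node_key> sorted = total_order ({{5, 9}, {3, 4}, {5, 2}});
  assert (sorted[0].uid == 4);
  assert (sorted[1].uid == 2);
  assert (sorted[2].uid == 9);
}
```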
gcc/ChangeLog:
* cgraph.h (symbol_table::register_symbol):
Order can be already set.
* cgraphclones.cc (cgraph_node::create_clone):
Reuse order for clones.
Michal Jires [Thu, 24 Oct 2024 00:04:12 +0000 (02:04 +0200)]
ipa-strub: Replace cgraph_node order with uid.
ipa_strub_set_mode_for_new_functions uses node order as unique ever
increasing identifier. This is better satisfied with uid.
Order loses uniqueness with following patches.
gcc/ChangeLog:
* ipa-strub.cc (ipa_strub_set_mode_for_new_functions): Replace
order with uid.
(pass_ipa_strub_mode::execute): Likewise.
Jakub Jelinek [Wed, 18 Dec 2024 14:53:24 +0000 (15:53 +0100)]
c++: Speed up compilation of large char array initializers when not using #embed
The following patch (again, on top of the #embed patchset)
attempts to optimize compilation of large {{{,un}signed ,}char,std::byte}
array initializers when not using #embed in the source.
Unlike the C patch which is done during the parsing of initializers this
is done when lexing tokens into an array, because C++ lexes all tokens
upfront and so by the time we parse the initializers we already have 16
bytes per token allocated (i.e. 32 extra compile time memory bytes per
one byte in the array).
The drawback is again that it can result in worse locations for diagnostics
(-Wnarrowing, -Wconversion) when initializing signed char arrays with values
128..255. Not really sure what to do about this though unlike the C case,
the locations would need to be preserved through reshape_init* and perhaps
till template instantiation.
For #embed, there is just a single location_t (could be range of the
directive), for diagnostics perhaps we could extend it to say byte xyz of
the file embedded here or something like that, but the optimization done by
this patch, either we'd need to bump the minimum limit at which to try it,
or say temporarily allocate a location_t array for each byte and then clear
it when we no longer need it or something.
I've been using the same testcases as for C, with #embed of 100'000'000
bytes:
time ./cc1plus -quiet -O2 -o test4a.s2 test4a.c
real 0m0.972s
user 0m0.578s
sys 0m0.195s
with xxd -i alternative of the same data without this patch it consumed
around 13.2GB of RAM and
time ./cc1plus -quiet -O2 -o test4b.s4 test4b.c
real 3m47.968s
user 3m41.907s
sys 0m5.015s
and the same with this patch it consumed around 3.7GB of RAM and
time ./cc1plus -quiet -O2 -o test4b.s3 test4b.c
real 0m24.772s
user 0m23.118s
sys 0m1.495s
2024-12-18 Jakub Jelinek <jakub@redhat.com>
* parser.cc (cp_lexer_new_main): Attempt to optimize large sequences
of CPP_NUMBER with int type and values 0-255 separated by CPP_COMMA
into CPP_EMBED with RAW_DATA_CST u.value.
Jakub Jelinek [Wed, 18 Dec 2024 14:21:40 +0000 (15:21 +0100)]
gimple-fold: Fix up decode_field_reference xor handling [PR118081]
The function comment says:
*XOR_P is to be FALSE if EXP might be a XOR used in a compare, in which
case, if XOR_CMP_OP is a zero constant, it will be overridden with *PEXP,
*XOR_P will be set to TRUE, and the left-hand operand of the XOR will be
decoded. If *XOR_P is TRUE, XOR_CMP_OP is supposed to be NULL, and then the
right-hand operand of the XOR will be decoded.
and the comment right above the xor_p handling says
/* Turn (a ^ b) [!]= 0 into a [!]= b. */
but I don't see anything that would actually check that the other operand is
0, in the testcase below it happily optimizes (a ^ 1) == 8 into a == 1.
The following patch adds that check.
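A small sketch of why the zero check matters, using hypothetical helper
names rather than anything from gimple-fold.cc: the rewrite of (a ^ b) == 0
into a == b is valid, but with a nonzero right-hand side the same rewrite
gives the wrong answer.

```cpp
#include <cassert>

// Valid fold: (a ^ b) == 0 holds exactly when a == b.
bool xor_eq_zero (unsigned a, unsigned b)
{
  return (a ^ b) == 0;
}

// With a nonzero constant c the compare cannot be folded to a == b.
bool xor_eq_const (unsigned a, unsigned b, unsigned c)
{
  return (a ^ b) == c;
}

void check_xor_fold ()
{
  for (unsigned a = 0; a < 64; a++)
    for (unsigned b = 0; b < 64; b++)
      assert (xor_eq_zero (a, b) == (a == b));

  // (a ^ 1) == 8 holds for a == 9, not for a == 1.
  assert (xor_eq_const (9, 1, 8));
  assert (!xor_eq_const (1, 1, 8));
}
```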
Note, there are various other parts of the function I'm worried about, but
haven't had time to construct counterexamples yet.
One worrying thing is the
/* Drop casts, only save the outermost type. We need not worry about
narrowing then widening casts, or vice-versa, for those that are not
essential for the compare have already been optimized out at this
point. */
comment, while obviously there are various optimizations which do optimize
nested casts and the like, I'm not really sure it is safe to rely on them
happening always before this optimization, there are various options to
disable certain optimizations and some IL could appear right before
ifcombine without being optimized yet the way this routine expects.
Plus, the 3 casts are looked through in between various optimizations which
might make those narrowing/widening or vice versa cases necessary.
Also, e.g. for the xor optimization, I think there is a difference between
int a and
(a ^ 0x23) == 0
and
((int) (((unsigned char) a) ^ (unsigned char) 0x23)) == 0
etc.
Another thing I'm worrying about are mixing up the different patterns
together, there is the BIT_AND_EXPR handling, BIT_XOR_EXPR handling,
RSHIFT_EXPR handling and then load handling.
What if all 4 appear together, or 3 of them, 2 of them?
Is the xor optimization still valid if there is BIT_AND_EXPR in between?
I.e. instead of
(a ^ 123) == 0
there is
((a ^ 123) & 234) == 0
?
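To make the two concerns above concrete, here is a sketch with hand-picked
values (the values and function names are illustrative choices, not from
the PR). With a = 0x123 the narrowed xor compare is true while the plain one
is false, and with a = 122 the masked xor compare is true although a != 123.

```cpp
#include <cassert>

// (a ^ 0x23) == 0 on the full int.
bool plain_xor (int a)
{
  return (a ^ 0x23) == 0;
}

// Same compare after narrowing both operands to unsigned char.
bool narrowed_xor (int a)
{
  return static_cast<int> (static_cast<unsigned char> (a)
                           ^ static_cast<unsigned char> (0x23)) == 0;
}

// (a ^ 123) == 0, equivalent to a == 123.
bool unmasked_xor (int a)
{
  return (a ^ 123) == 0;
}

// The same xor with a BIT_AND_EXPR in between.
bool masked_xor (int a)
{
  return ((a ^ 123) & 234) == 0;
}

void check_xor_variants ()
{
  // 0x123 and 0x23 differ only above the low 8 bits.
  assert (!plain_xor (0x123));
  assert (narrowed_xor (0x123));

  // 122 differs from 123 only in bit 0, which the mask 234 ignores.
  assert (!unmasked_xor (122));
  assert (masked_xor (122));
}
```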
2024-12-18 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/118081
* gimple-fold.cc (decode_field_reference): Only set *xor_p to true
if *xor_cmp_op is integer_zerop.
libatomic/ChangeLog:
PR driver/81358
* Makefile.am: Pass -fno-link-libatomic.
New rule all.
* configure.ac: Assert that CFLAGS is set and pass -fno-link-libatomic.
* Makefile.in: Regenerate.
* configure: Regenerate.
Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com> Co-authored-by: Matthew Malcolmson <mmalcolmson@nvidia.com>
Tobias Burnus [Wed, 18 Dec 2024 08:25:50 +0000 (09:25 +0100)]
OpenMP: Add declare variant's 'append_args' clause in C/C++
Add the append_args clause of 'declare variant' to C and C++,
fix/improve diagnostic for 'interop' clause and 'declare_variant'
clauses on the way.
Cleanup dispatch handling in gimplify_call_expr a bit and
partially handle 'append_args'. (Namely, those parts that
do not require library calls, i.e. a dispatch construct
where the 'device' and 'interop' clauses have been specified.)
The sorry can be removed once an enum value like
omp_ipr_(ompx_gnu_)omp_device_num (cf. OpenMP Spec Issue 4451)
has to be added to the runtime side such that omp_get_interop_int
returns the device number of an interop object (as passed to
dispatch via the interop clause); and a call to GOMP_interop
has to be added to create interop objects. Once available, only
a very localized change in gimplify_call_expr is required to
claim for full support. - And Fortran parsing support.
* c-parser.cc (c_parser_omp_clause_init_modifiers): New;
split off from ...
(c_parser_omp_clause_init): ... here; call it.
(c_finish_omp_declare_variant): Parse 'append_args' clause.
(c_parser_omp_clause_interop): Set tree used/read.
gcc/cp/ChangeLog:
* decl.cc (omp_declare_variant_finalize_one): Handle
append_args.
* parser.cc (cp_parser_omp_clause_init_modifiers): New;
split off from ...
(cp_parser_omp_clause_init): ... here; call it.
(cp_parser_omp_all_clauses): Replace interop parsing by
a call to ...
(cp_parser_omp_clause_interop): ... this new function;
set tree used/read.
(cp_finish_omp_declare_variant): Parse 'append_args' clause.
(cp_parser_omp_declare): Update comment.
* pt.cc (tsubst_attribute, tsubst_omp_clauses): Handle template
substitution also for declare variant's append_args clause,
using for 'init' the same code as for interop's init clause.
gcc/ChangeLog:
* gimplify.cc (gimplify_call_expr): Update for OpenMP's
append_args; cleanup of OpenMP's dispatch clause handling.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/declare-variant-2.c: Update dg-error msg.
* c-c++-common/gomp/dispatch-12.c: Likewise.
* c-c++-common/gomp/dispatch-11.c: Likewise and extend a bit.
* c-c++-common/gomp/append-args-1.c: New test.
* c-c++-common/gomp/append-args-2.c: New test.
* c-c++-common/gomp/append-args-3.c: New test.
* g++.dg/gomp/append-args-1.C: New test.
* g++.dg/gomp/append-args-2.C: New test.
* g++.dg/gomp/append-args-3.C: New test.
Jakub Jelinek [Wed, 18 Dec 2024 11:00:13 +0000 (12:00 +0100)]
c++: Fix up pedantic handling of alignas [PR110345]
The following patch on top of the PR110345 P2552R3 series
emits pedantic pedwarns for alignas appertaining to incorrect entities.
As the middle-end and attribute exclusions look for "aligned" attribute,
the patch transforms alignas into "internal "::aligned attribute (didn't
use [[aligned (x)]] so that people can't type it that way).
2024-12-18 Jakub Jelinek <jakub@redhat.com>
PR c++/110345
gcc/c-family/
* c-common.h (attr_aligned_exclusions): Declare.
(handle_aligned_attribute): Likewise.
* c-attribs.cc (handle_aligned_attribute): No longer
static.
(attr_aligned_exclusions): Use extern instead of static.
gcc/cp/
* cp-tree.h (enum cp_tree_index): Add CPTI_INTERNAL_IDENTIFIER.
(internal_identifier): Define.
(internal_attribute_table): Declare.
* parser.cc (cp_parser_exception_declaration): Error on alignas
on exception declaration.
(cp_parser_std_attribute_spec): Turn alignas into internal
ns aligned attribute rather than gnu.
* decl.cc (initialize_predefined_identifiers): Initialize
internal_identifier.
* tree.cc (handle_alignas_attribute): New function.
(internal_attributes): New variable.
(internal_attribute_table): Likewise.
* cp-objcp-common.h (cp_objcp_attribute_table): Add
internal_attribute_table entry.
gcc/testsuite/
* g++.dg/cpp0x/alignas1.C: Add dg-options "".
* g++.dg/cpp0x/alignas2.C: Likewise.
* g++.dg/cpp0x/alignas7.C: Likewise.
* g++.dg/cpp0x/alignas21.C: New test.
* g++.dg/ext/bitfield9.C: Expect a warning.
* g++.dg/cpp2a/is-layout-compatible3.C: Add dg-options -pedantic.
Expect a warning.
Jakub Jelinek [Wed, 18 Dec 2024 10:58:39 +0000 (11:58 +0100)]
c++: Add fallthrough attribute further test coverage [PR110345]
Similarly for fallthrough attribute. Had to add a second testcase because
the diagnostics for fallthrough not used within switch at all is done during
expansion and expansion won't happen if there are other errors in the
testcase.
2024-12-18 Jakub Jelinek <jakub@redhat.com>
PR c++/110345
* g++.dg/cpp0x/attr-fallthrough1.C: New test.
* g++.dg/cpp0x/attr-fallthrough2.C: New test.
Jakub Jelinek [Wed, 18 Dec 2024 10:57:50 +0000 (11:57 +0100)]
c++: Add carries_dependency further test coverage [PR110345]
This patch adds additional test coverage for the carries_dependency
attribute (unlike other attributes, the attribute actually isn't implemented
for real, so we warn even in the cases of valid uses because we ignore those
as well).
2024-12-18 Jakub Jelinek <jakub@redhat.com>
PR c++/110345
* g++.dg/cpp0x/attr-carries_dependency2.C: New test.
Jakub Jelinek [Wed, 18 Dec 2024 10:55:59 +0000 (11:55 +0100)]
c++: Handle attributes on exception declarations [PR110345]
This is a continuation of the series for the ignorability of standard
attributes.
I've added a test for assume attribute diagnostics appertaining to various
entities (mostly invalid) and while doing that, I've discovered that
attributes on exception declarations were mostly ignored, this patch
adds the missing cp_decl_attributes call and also in the
cp_parser_type_specifier_seq case differentiates between attributes and
std_attributes to be able to differentiate between attributes which apply
to the declaration using type-specifier-seq and attributes after the type
specifiers.
2024-12-18 Jakub Jelinek <jakub@redhat.com>
PR c++/110345
* parser.cc (cp_parser_type_specifier_seq): Chain cxx11_attribute_p
attributes after any type specifier in the is_declaration case
to std_attributes rather than attributes. Set also ds_attribute
or ds_std_attribute locations if not yet set.
(cp_parser_exception_declaration): Pass &type_specifiers.attributes
instead of NULL as last argument, call cp_decl_attributes.
Jakub Jelinek [Wed, 18 Dec 2024 10:54:57 +0000 (11:54 +0100)]
c++: Diagnose attributes on class/enum declarations [PR110345]
The following testcase shows another issue where we just ignored
attributes without telling user we did that.
If there are any declarators, the ignoring of the attribute
is diagnosed in grokdeclarator etc., but if there is none
(and we don't error such as on
int;
), the following patch emits diagnostics.
Jakub Jelinek [Wed, 18 Dec 2024 10:52:31 +0000 (11:52 +0100)]
c++: Handle enum attributes like class attributes [PR110345]
As the following testcase shows, cp_parser_decl_specifier_seq
was calling warn_misplaced_attr_for_class_type only for class types
and not for enum types, while check_tag_decl calls them for both
class and enum types.
Enum types are really the same case here, the attribute needs to go
before the type name to apply to all instances of the type.
Additionally, when warn_misplaced_attr_for_class_type is called, it
diagnoses something and so it is fine to drop the attributes then
on the floor, but in case it wasn't a type decision, it silently
discarded the attributes, which is invalid for the ignorability of
standard attributes paper. This patch in that case adds them to
decl_specs->std_attributes and lets them be diagnosed later (e.g.
in grokdeclarator).
2024-12-18 Jakub Jelinek <jakub@redhat.com>
PR c++/110345
* parser.cc (cp_parser_decl_specifier_seq): Call
warn_misplaced_attr_for_class_type for all OVERLOAD_TYPE_P
types, not just CLASS_TYPE_P. When not calling
warn_misplaced_attr_for_class_type, don't clear attrs and
add it to decl_specs->std_attributes instead.
Jakub Jelinek [Wed, 18 Dec 2024 10:50:26 +0000 (11:50 +0100)]
inline-asm: Add - constraint modifier support for toplevel extended asm [PR41045]
The following patch adds - constraint modifier support (only in toplevel asms),
which one can use to allow i, s and n constraint to accept SYMBOL_REFs
even with -fpic.
So, the recommended way to mark toplevel asm as defining some symbol
would be ":" constraint (usually with cc modifier in the pattern), while
to mark toplevel asm as using some symbol (again, either function or
variable), one would use "-s" constraint again with address of that function
or variable.
2024-12-18 Jakub Jelinek <jakub@redhat.com>
PR c/41045
gcc/
* stmt.cc (parse_output_constraint, parse_input_constraint): Handle
- modifier.
* recog.h (raw_constraint_p): Declare.
* recog.cc (raw_constraint_p): New variable.
(asm_operand_ok, constrain_operands): Handle - modifier.
* common.md (i, s, n): For raw_constraint_p don't require
LEGITIMATE_PIC_OPERAND_P.
* doc/md.texi: Document - constraint modifier.
gcc/c/
* c-typeck.cc (build_asm_expr): Reject - constraint modifier inside
of a function.
gcc/cp/
* semantics.cc (finish_asm_stmt): Reject - constraint modifier inside
of a function.
gcc/testsuite/
* c-c++-common/toplevel-asm-4.c: Add missing %cc2 use in template, add
bar, x, &y operands with "-i" and "-s" constraints.
(x, y): New variables.
(bar): Declare.
* c-c++-common/toplevel-asm-7.c: New test.
* c-c++-common/toplevel-asm-8.c: New test.
Jakub Jelinek [Wed, 18 Dec 2024 10:49:11 +0000 (11:49 +0100)]
inline-asm: Add support for cc operand modifier
As mentioned in the "inline asm: Add new constraint for symbol definitions"
patch description, while the c operand modifier is documented to:
Require a constant operand and print the constant expression with no punctuation.
it actually doesn't do that with -fpic at least on some targets and
has been behaving that way for at least 3 decades.
It prints the operand using output_addr_const if CONSTANT_ADDRESS_P is true,
but CONSTANT_ADDRESS_P can do all sorts of target specific checks.
And if it is false, it falls back to output_operand (operands[opnum], 'c');
which will on various targets just result in an error that it is invalid
modifier letter (weird because it is documented), on others like x86 or
alpha will handle the operand in some weird way if it is a comparison
and otherwise complain the argument isn't a comparison, on others like
arm perhaps do what the user wanted.
As I wrote, we are pretty much out of modifier letters because some targets
use a lot of them, and almost out of % punctuation chars (I think ` is left)
but right now punctuation chars aren't normally followed by operand number
anyway.
So, the following patch takes one of the generic letters (c) and adds an
extra modifier char after it, I chose cc, which behaves like c but just
always uses output_addr_const instead of falling back to the machine
dependent code.
2024-12-18 Jakub Jelinek <jakub@redhat.com>
* final.cc (output_asm_insn): Add support for cc operand modifier.
* doc/extend.texi (Generic Operand Modifiers): Document cc operand
modifier.
* doc/md.texi (@samp{:} in constraint): Mention the cc operand
modifier and add small example.
* c-c++-common/toplevel-asm-4.c: Don't use -fno-pie option.
Use cc modifier instead of c.
(v, w): Add extern keyword.
* c-c++-common/toplevel-asm-6.c: New test.
Jakub Jelinek [Wed, 18 Dec 2024 10:44:36 +0000 (11:44 +0100)]
inline asm: Add new constraint for symbol definitions
The following patch on top of the PR41045 toplevel extended asm patch
allows marking inline asms (both toplevel and function-local, admittedly
it is less useful for the latter, so if you want, I can add restrictions)
as defining symbols, either functions or variables.
As most remaining constraint letters are used at least on some targets,
I'm using : as the new constraint. It is similar to "s" in that it
wants CONSTANT_P && !CONST_SCALAR_INT_P, but
1) it specially requires an address of a function or variable declaration,
so for functions the expected use is
void foo (void);
...
":" (foo)
or
":" (&foo)
and for variables (unless they are arrays)
extern int var;
...
":" (&var)
2) it makes no sense to say that either something is defined or it is
used in a register or something similar, so the patch diagnoses if
one attempts to mix it with other constraints; ":,:,:" is allowed
just because one could be using 3 alternatives in some other operand
3) unlike "s", the constraint doesn't check LEGITIMATE_PIC_OPERAND_P for
-fpic, even in -fpic one should be able to use it the same way
4) the cgraph portion needs to be really added later
5) and last but not least, I'm afraid %c0 print modifier isn't very
good for printing it; it works fine without -fpic/-fpie, but 'c'
modifier is handled as
if (CONSTANT_ADDRESS_P (operands[opnum]))
output_addr_const (asm_out_file, operands[opnum]);
else
output_operand (operands[opnum], 'c');
and because at least on some arches like x86 CONSTANT_ADDRESS_P
is redefined to do backend specific PIC mess, it will just
output_operand and likely just be rejected (on x86 with an error
that the argument is not a comparison)
Guess for x86 one can use %p0 instead.
But I'm afraid we are mostly out of generic modifiers,
and targetm.asm_out.print_operand_punct_valid_p seems to use most
of the punctuation characters as well.
I think ` is unused, but wonder if we want to use up the last
remaining letter that way, perhaps make %`<letter>0?
Or extend the existing generic modifiers, keep %c0 behave as it
does right now and make %cc0 be a 2 letter modifier which is
PIC friendly and prints using output_addr_const anything that can
be printed that way? A follow-up patch implements the %cc0 version.
Tamar Christina [Wed, 18 Dec 2024 09:02:46 +0000 (09:02 +0000)]
libstdc++: Add inline keyword to _M_locate
In GCC 12 there was a ~40% regression in the performance of hashmap->find.
This regression came about accidentally:
Before GCC 12 the find function was small enough that IPA would inline it even
though it wasn't marked inline. In GCC-12 an optimization was added to perform
a linear search when the entries in the hashmap are small.
This increased the size of the function enough that IPA would no longer inline.
Inlining had two benefits:
1. The return value is a reference, so it has to be returned and dereferenced
even though the search loop may have already dereferenced it.
2. The pattern is a hard pattern to track for branch predictors. This causes
a large number of branch misses if the value is immediately checked and
branched on. i.e. if (a != m.end()) which is a common pattern.
The patch fixes both these issues by adding the inline keyword to _M_locate
to allow the inliner to consider inlining again.
This and the other patches have been run through several benchmarks where
the size, number of elements searched for and type (reference vs value) etc
were tested.
The change shows no statistical regression, but an average find improvement of
~27% and a range between ~10-60% improvements. A selection of the results:
Xi Ruoyao [Mon, 16 Dec 2024 12:43:03 +0000 (20:43 +0800)]
LoongArch: Add CRC expander to generate faster CRC
64-bit LoongArch has native CRC instructions for two specific
polynomials. For other polynomials or 32-bit, use the generic
table-based approach but optimize bit reversing.
gcc/ChangeLog:
* config/loongarch/loongarch.md (crc_rev<mode:SUBDI>si4): New
define_expand.
Xi Ruoyao [Mon, 2 Dec 2024 02:53:27 +0000 (10:53 +0800)]
LoongArch: Add bit reverse operations
LoongArch supports native bit reverse operation for QI, SI, DI, and for
HI we can expand it into a shift and a bit reverse in word_mode.
I was reluctant to add them because until PR50481 is fixed these
operations will be just useless. But now it turns out we can use them
to optimize the bit reversing CRC calculation if recognized by the
generic CRC pass. So add them in preparation for the next patch adding CRC
expanders.
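The HI case can be pictured as follows. This is a portable sketch that
assumes a 64-bit word_mode and models the native instruction with a loop;
the function names are hypothetical, and whether the real expander shifts
before or after the reverse is not stated above.

```cpp
#include <cassert>
#include <cstdint>

// Portable model of a word-sized bit reverse: reverse all 64 bits.
uint64_t bitrev64 (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 64; i++)
    {
      r = (r << 1) | (x & 1);
      x >>= 1;
    }
  return r;
}

// 16-bit reverse built from the word-sized reverse plus a shift: the
// reversed halfword ends up in the top 16 bits, so shift it back down.
uint16_t bitrev16 (uint16_t x)
{
  return static_cast<uint16_t> (bitrev64 (x) >> 48);
}

// Reversing twice must give back the input for every 16-bit value.
void check_bitrev16 ()
{
  for (unsigned v = 0; v <= 0xffff; v++)
    {
      uint16_t x = static_cast<uint16_t> (v);
      assert (bitrev16 (bitrev16 (x)) == x);
    }
}
```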
gcc/ChangeLog:
* config/loongarch/loongarch.md (@rbit<mode:GPR>): New
define_insn template.
(rbitsi_extended): New define_insn.
(rbitqi): New define_insn.
(rbithi): New define_expand.
Lewis Hyatt [Wed, 18 Dec 2024 02:26:18 +0000 (21:26 -0500)]
c++: modules: Fix 32-bit overflow with 64-bit location_t [PR117970]
With the move to 64-bit location_t in r15-6016, I missed a spot in module.cc
where a location_t was still being stored in a 32-bit int. Fixed.
The xtreme-header* tests in modules.exp were still passing fine on lots of
architectures that were tested (x86-64, i686, aarch64, sparc, riscv64), but
the PR shows that they were failing in some particular risc-v multilib
configurations. They pass now.
gcc/cp/ChangeLog:
PR c++/117970
* module.cc (module_state::read_ordinary_maps): Change argument to
line_map_uint_t instead of unsigned int.
Inserting an empty range into a std::deque results in undefined calls to
either std::copy, std::copy_backward, std::move, or std::move_backward.
We call those algos with invalid arguments where the output range is the
same as the input range, e.g. std::copy(first, last, first) which
violates the preconditions for the algorithms.
This fix simply returns early if there's nothing to insert. Most callers
already ensure that we don't even call _M_range_insert_aux with an empty
range, but some callers don't. Rather than checking for n == 0 in each
of the callers, this just does the check once and uses __builtin_expect
to treat empty insertions as unlikely.
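The shape of the fix, sketched as a user-level wrapper rather than the
actual _M_range_insert_aux code: measure the range once, return early when
it is empty, and mark that path as unlikely. The wrapper name and its
index-based position parameter are assumptions for illustration;
__builtin_expect is the GCC builtin named above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <vector>

// Hypothetical wrapper, not the libstdc++ internals.
template <typename T, typename FwdIt>
void insert_range_checked (std::deque<T> &d, std::size_t pos,
                           FwdIt first, FwdIt last)
{
  const auto n = std::distance (first, last);
  // Nothing to insert: skip the element-moving code entirely.
  if (__builtin_expect (n == 0, 0))
    return;
  d.insert (d.begin () + static_cast<std::ptrdiff_t> (pos), first, last);
}

void check_insert_range ()
{
  std::deque<int> d = {1, 2, 3};
  std::vector<int> none;
  std::vector<int> some = {7, 8};

  // Empty range: the deque is left untouched.
  insert_range_checked (d, 1, none.begin (), none.end ());
  assert ((d == std::deque<int>{1, 2, 3}));

  // Non-empty range: elements land at the requested position.
  insert_range_checked (d, 1, some.begin (), some.end ());
  assert ((d == std::deque<int>{1, 7, 8, 2, 3}));
}
```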
libstdc++-v3/ChangeLog:
PR libstdc++/118035
* include/bits/deque.tcc (_M_range_insert_aux): Return
immediately if inserting an empty range.
* testsuite/23_containers/deque/modifiers/insert/118035.cc: New
test.
Sandra Loosemore [Tue, 17 Dec 2024 15:19:36 +0000 (15:19 +0000)]
Documentation: Make OpenMP/OpenACC docs easier to find [PR26154]
PR c/26154 is one of our oldest documentation issues. The only
discussion of OpenMP support in the GCC manual is buried in the "C
Dialect Options" section, with nothing at all under "Extensions". The
Fortran manual does have separate sections for OpenMP and OpenACC
extensions so I have copy-edited/adapted that text for similar sections
in the GCC manual, as well as breaking out the OpenMP and OpenACC options
into their own section (they apply to all of C, C++, and Fortran).
I also updated the information about what versions of OpenMP and
OpenACC are supported and removed some redundant text from the Fortran
manual to prevent it from getting out of sync on future updates, and
inserted some cross-references to the new sections elsewhere.
gcc/ChangeLog:
PR c/26154
* common.opt.urls: Regenerated.
* doc/extend.texi (C Extensions): Adjust menu for new sections.
(Attribute Syntax): Mention OpenMP directives.
(Pragmas): Mention OpenMP and OpenACC directives.
(OpenMP): New section.
(OpenACC): New section.
* doc/invoke.texi (Invoking GCC): Adjust menu for new section.
(Option Summary): Move OpenMP and OpenACC options to their own
category.
(C Dialect Options): Move documentation for -foffload, -fopenacc,
-fopenacc-dim, -fopenmp, -fopenmp-simd, and
-fopenmp-target-simd-clone to...
(OpenMP and OpenACC Options): ...this new section. Light
copy-editing of the option descriptions.
gcc/fortran/ChangeLog:
PR c/26154
* gfortran.texi (Standards): Remove redundant info about
OpenMP/OpenACC standard support.
(OpenMP): Copy-editing and update version info.
(OpenACC): Likewise.
* lang.opt.urls: Regenerated.
Richard Biener [Tue, 17 Dec 2024 10:23:02 +0000 (11:23 +0100)]
middle-end/118062 - bogus lowering of vector compares
The generic expand_vector_piecewise routine supports lowering of
a vector operation to vector operations of smaller size. When
computing the extract position from the larger vector it uses the
element size in bits of the original result vector to determine
the number of elements in the smaller vector. That is wrong when
lowering a compare as the vector element size of a bool vector
does not have to agree with that of the compare operand. The
following simplifies this, fixing the error.
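A small arithmetic sketch of the mismatch, under assumed sizes. The element
widths, the piece size and the elements_per_piece helper are illustrative
and not taken from the patch; the point is only that the element count of a
lowered piece depends on which element size is used.

```cpp
#include <cassert>

// Number of elements in a sub-vector of piece_bits bits whose elements
// are element_bits wide.
constexpr unsigned elements_per_piece(unsigned piece_bits, unsigned element_bits)
{
  return piece_bits / element_bits;
}

// Assumed sizes, for illustration only.
constexpr unsigned operand_element_bits = 64;  // elements being compared
constexpr unsigned result_element_bits = 8;    // elements of the boolean result
constexpr unsigned piece_bits = 128;           // one lowered operand piece
```

With these numbers a 128-bit operand piece holds 2 elements, while dividing
by the boolean result's element size would give 16.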
Marek Polacek [Thu, 12 Dec 2024 19:56:07 +0000 (14:56 -0500)]
c++: ICE initializing array of aggrs [PR117985]
This crash started with my r12-7803 but I believe the problem lies
elsewhere.
build_vec_init has cleanup_flags whose purpose is -- if I grok this
correctly -- to avoid destructing an object multiple times. Let's
say we are initializing an array of A. Then we might end up in
a scenario similar to initlist-eh1.C:
  try
    {
      call A::A in a loop
      // #0
      try
        {
          call a fn using the array
        }
      finally
        {
          // #1
          call A::~A in a loop
        }
    }
  catch
    {
      // #2
      call A::~A in a loop
    }
cleanup_flags makes us emit a statement like
D.3048 = 2;
at #0 to disable performing the cleanup at #2, since #1 will take
care of the destruction of the array.
But if we are not emitting the loop because we can use a constant
initializer (and use a single { a, b, ...}), we shouldn't generate
the statement resetting the iterator to its initial value. Otherwise
we crash in gimplify_var_or_parm_decl because it gets the stray decl
D.3048.
PR c++/117985
gcc/cp/ChangeLog:
* init.cc (build_vec_init): Pop CLEANUP_FLAGS if we're not
generating the loop.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-array23.C: New test.
* g++.dg/cpp0x/initlist-array24.C: New test.
Oliver Kozul [Tue, 17 Dec 2024 14:44:33 +0000 (07:44 -0700)]
[PATCH] RISC-V: optimization on checking certain bits set ((x & mask) == val)
The patch optimizes code generation for comparisons of the form
(X & C1) == C2 by converting them to (X | ~C1) == (C2 | ~C1).
C1 is a constant that requires li and addi to be loaded,
while ~C1 requires a single lui instruction.
As the values of C1 and C2 are not visible within
the equality expression, a plus pattern is matched instead.
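The rewrite can be checked with a small brute-force sketch. The function
names are hypothetical and this is not the RISC-V pattern itself. The sketch
assumes C2 has no bits set outside C1; otherwise the original comparison is
always false and the two forms can differ.

```cpp
#include <cassert>
#include <cstdint>

// Original form of the comparison.
constexpr bool and_form(std::uint32_t x, std::uint32_t c1, std::uint32_t c2)
{
  return (x & c1) == c2;
}

// Rewritten form: force every bit outside the mask to 1 on both sides.
constexpr bool or_form(std::uint32_t x, std::uint32_t c1, std::uint32_t c2)
{
  return (x | ~c1) == (c2 | ~c1);
}

// Compare both forms for every 16-bit x (a subset, to keep the check fast).
inline bool forms_agree(std::uint32_t c1, std::uint32_t c2)
{
  for (std::uint32_t x = 0; x <= 0xffffu; ++x)
    if (and_form(x, c1, c2) != or_form(x, c1, c2))
      return false;
  return true;
}
```

For example, forms_agree(0x0fff, 0x0123) holds, while
forms_agree(0x00ff, 0x0100) does not, because 0x0100 lies outside the mask.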
PR target/114087
gcc/ChangeLog:
* config/riscv/riscv.md (*lui_constraint<ANYI:mode>_and_to_or): New pattern.
Yangyu Chen [Tue, 17 Dec 2024 14:41:05 +0000 (07:41 -0700)]
RISC-V: Remove svvptc from riscv-ext-bitmask.def
There should be no svvptc in the riscv-ext-bitmask.def file since it has
not yet been added to the RISC-V C API Specification or the Linux
hwprobe. And there is no need for userspace software to know that this
extension exists. So remove it from the riscv-ext-bitmask.def file.
Fixes: e4f4b2dc08 ("RISC-V: Minimal support for svvptc extension.")
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
Kito Cheng [Mon, 9 Dec 2024 06:55:20 +0000 (14:55 +0800)]
RISC-V: Add new constraint R for register even-odd pairs
Although this constraint is not currently used for any instructions, it is very
useful for custom instructions. Additionally, some new standard extensions
(not yet upstream), such as `Zilsd` and `Zclsd`, are potential users of this
constraint. Therefore, I believe there is sufficient justification to add it
now.
gcc/ChangeLog:
* config/riscv/constraints.md (R): New constraint.
* doc/md.texi: Document new constraint `R`.
Kito Cheng [Thu, 14 Nov 2024 09:24:45 +0000 (17:24 +0800)]
RISC-V: Implement N modifier for printing the register number rather than the register name
The modifier `N` prints the raw encoding of a register. This is used
when using `.insn <length>, <encoding>`, where the user wants to pass
a value to the instruction in a known register, but where the
instruction doesn't follow the existing instruction formats, so the
assembly parser is not expecting a register name, just a raw integer.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_print_operand): Add N.
* doc/extend.texi: Document for N.
Martin Jambor [Tue, 17 Dec 2024 10:17:14 +0000 (11:17 +0100)]
ipa: Improve how we derive value ranges from IPA invariants
I believe that the current function ipa_range_set_and_normalize lacks
a check whether the base of an ADDR_EXPR really cannot be NULL, so
this patch adds it. Moreover, I never liked the name, as I do not
think it makes the value ranges any more normal but rather just
special-cases non-zero ip_invariant pointers. Therefore, I have given
it a different name and moved it to a .cc file; our LTO bootstrap
should inline (and/or split) it if necessary anyway.
Because, as Honza correctly pointed out, deriving non-NULLness from a
pointer depends on flag_delete_null_pointer_checks which is an
optimization flag and thus depends on a given function, in this
version of the patch ipa_get_range_from_ip_invariant gets a
context_node parameter for that purpose. This then needs to be used
within symtab_node::nonzero_address which gets a special overload in
which the value of the flag can be provided as a parameter.
gcc/ChangeLog:
2024-12-11 Martin Jambor <mjambor@suse.cz>
* cgraph.h (symtab_node): Add a new overload of nonzero_address.
* symtab.cc (symtab_node::nonzero_address): Add a new overload with a
parameter for delete_null_pointer_checks. Make the original overload
call the new one which retains the actual implementation.
* ipa-prop.h (ipa_get_range_from_ip_invariant): Declare.
(ipa_range_set_and_normalize): Remove.
* ipa-prop.cc (ipa_get_range_from_ip_invariant): New function.
(ipa_range_set_and_normalize): Remove.
* ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Add a new parameter
context_node. Use ipa_get_range_from_ip_invariant instead of
ipa_range_set_and_normalize and pass to it the new parameter.
(ipa_value_range_from_jfunc): Pass cs->caller as the context_node to
ipa_vr_intersect_with_arith_jfunc.
(propagate_vr_across_jump_function): Likewise.
(ipa_get_range_from_ip_invariant): New function.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Use
ipa_get_range_from_ip_invariant instead of ipa_range_set_and_normalize.
Martin Jambor [Tue, 17 Dec 2024 10:17:14 +0000 (11:17 +0100)]
ipa: Better value ranges for pointer integer constants
When looking into cases where we know an actual argument of a call is
a constant but we don't generate a singleton value-range for the jump
function, I found out that the special handling of pointer constants
does not work well for constant zero pointer values. In fact the code
only attempts to see if it can figure out that an argument is not zero
and if it can figure out any alignment information.
With this patch, we try to use the value_range that ranger can give us
in the jump function if we can and we query ranger for all kinds of
arguments, not just SSA_NAMES (and so also pointer integer constants).
If we cannot figure out a useful range we fall back again on figuring
out non-NULLness with tree_single_nonzero_warnv_p.
With this patch, we generate
[prange] struct S * [0, 0] MASK 0x0 VALUE 0x0
instead of for example:
[prange] struct S * [0, +INF] MASK 0xfffffffffffffff0 VALUE 0x0
for a zero constant passed in a call.
If you are wondering why we check whether the value range obtained
from range_of_expr can be undefined, even when the function returns
true, that is because that can apparently happen for default-definition
SSA_NAMEs.
gcc/ChangeLog:
2024-11-15 Martin Jambor <mjambor@suse.cz>
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Try harder to
use the value range obtained from ranger for pointer values.
Martin Jambor [Tue, 17 Dec 2024 10:17:14 +0000 (11:17 +0100)]
ipa: Skip widening type conversions in jump function constructions
Originally, we did not stream any formal parameter types into WPA and
were generally very conservative when it came to type mismatches in
IPA-CP. Over time, mismatches that happened in code and blew up in
WPA made us much more resilient and also led us to stream the types of
the parameters, which we now use commonly.
With that information, we can safely skip conversions when looking at
the IL from which we build jump functions and then simply fold convert
the constants and ranges to the resulting type, as long as we are
careful that performing the corresponding folding of constants gives
the corresponding results. In order to do that, we must ensure that
the old value can be represented in the new one without any loss.
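The "without any loss" condition can be sketched for plain integer types.
This is an illustration under assumptions, not the logic of
skip_a_safe_conversion_op: int_type and conversion_is_lossless are
hypothetical names, and a type is described only by its precision and
signedness.

```cpp
#include <cassert>

// Hypothetical description of an integer type.
struct int_type
{
  unsigned precision;  // number of value-carrying bits, including sign
  bool is_unsigned;
};

// True if every value of `from` is representable in `to`.
inline bool conversion_is_lossless(int_type from, int_type to)
{
  if (from.is_unsigned == to.is_unsigned)
    return to.precision >= from.precision;
  if (from.is_unsigned)
    // unsigned -> signed needs one extra bit for the sign
    return to.precision > from.precision;
  // signed -> unsigned cannot represent negative values
  return false;
}
```

Under this sketch an unsigned 32-bit value converted to a signed 64-bit type
is safe to look through, while a signed 32-bit value converted to an
unsigned 64-bit type is not.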
With this change, we can nicely propagate non-NULLness in IPA-VR as
demonstrated with the new test case.
I have gone through all other uses of (all components of) jump
functions which could be affected by this and verified they do indeed
check types and can handle mismatches.
gcc/ChangeLog:
2024-12-11 Martin Jambor <mjambor@suse.cz>
* ipa-prop.cc: Include vr-values.h.
(skip_a_safe_conversion_op): New function.
(ipa_compute_jump_functions_for_edge): Use it.
Jakub Jelinek [Tue, 17 Dec 2024 09:13:24 +0000 (10:13 +0100)]
c++: Diagnose earlier non-static data members with cv containing class type [PR116108]
In r10-6457 (the fix for PR92593), a check was added to reject
earlier non-static data members with current_class_type in templates,
as the deduction then can result in endless recursion in reshape_init.
It fixed the
template <class T>
struct S { S s = 1; };
S t{2};
crashes, but as the following testcase shows, didn't catch when there
are cv qualifiers on the non-static data member.
Fixed by using TYPE_MAIN_VARIANT.
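Why the qualifiers defeat a plain type comparison can be shown with a
library-level analogy. The sketch below uses std::remove_cv in place of
TYPE_MAIN_VARIANT; same_unqualified_type is a hypothetical helper and none
of this is the compiler's internal code.

```cpp
#include <cassert>
#include <type_traits>

struct S {};

// A cv-qualified type is distinct from its unqualified version...
static_assert(!std::is_same<const S, S>::value, "const S is not S");
// ...but stripping the qualifiers first makes the comparison succeed.
static_assert(std::is_same<std::remove_cv<const volatile S>::type, S>::value,
              "unqualified type matches");

// Hypothetical helper mirroring the idea of comparing main variants.
template <typename Member, typename Class>
constexpr bool same_unqualified_type()
{
  return std::is_same<typename std::remove_cv<Member>::type, Class>::value;
}
```

A check written like same_unqualified_type<const S, S>() matches a member
declared as const S, which a direct comparison against S would miss.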
2024-12-17 Jakub Jelinek <jakub@redhat.com>
PR c++/116108
gcc/cp/
* decl.cc (grokdeclarator): Pass TYPE_MAIN_VARIANT (type)
rather than type to same_type_p when checking if the non-static
data member doesn't have current class type.
gcc/testsuite/
* g++.dg/cpp1z/class-deduction117.C: New test.