arm: Add integer vector overloading of vsubq_x intrinsic
In the past we had only defined the vsubq_x generic overload of the
vsubq_x_* intrinsics for float vector types. This would cause them
to fall back to the `__ARM_undef` failure state if they were called
through the generic version.
This patch simply adds these overloads.
gcc/ChangeLog:
* config/arm/arm_mve.h (__arm_vsubq_x FP): New overloads.
(__arm_vsubq_x Integer): New.
arm: Explicitly specify other float types for _Generic overloading [PR107515]
This patch adds explicit references to other float types
to __ARM_mve_typeid in arm_mve.h. Resolves PR 107515:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515
arm: propagate fixed overloading of MVE intrinsic scalar parameters
This is a mechanical patch that propagates the change proposed in
my previous patch for vaddq[_m]_n
across all other polymorphic MVE intrinsic overloads of scalar types.
The above commits changed the definitions of the intrinsics from using
`[u]int[8/16/32]_t` types for the scalar argument to using `int`. This
allowed `int` to be supported in user code through the overloaded
`#defines`, but seems to have broken the `[u]int[8/16/32]_t` types.
The solution implemented by this patch is to explicitly use a new
_Generic mapping from all the `[u]int[8/16/32]_t` types for int. With this
change, both `int` and `[u]int[8/16/32]_t` parameters are supported from
user code and are handled by the overloading mechanism correctly.
Note that in these scalar cases it is safe to pass the raw p<n>, rather
than the typeof-ed __p<n>, because we are not at risk of the _Generics
being exponentially expanded on the `n` scalar argument to an `_n`
intrinsic. Using p<n> instead will give a more accurate error message
to the user, should something be wrong with that argument.
Jakub Jelinek [Wed, 17 May 2023 08:15:50 +0000 (10:15 +0200)]
c++: Don't try to initialize zero width bitfields in zero initialization [PR109868]
My GCC 12 change to avoid removing zero-sized bitfields as they are
important for ABI and are needed for layout compatibility traits
apparently causes zero sized bitfields to be initialized in the IL,
which at least in 13+ results in ICEs in the ranger which is upset
about zero precision types.
I think we could even avoid initializing other unnamed bitfields, but
unfortunately !CONSTRUCTOR_NO_CLEARING doesn't mean in the middle-end
clearing of padding bits and until we have some new flag that represents
the request to clear padding bits, I think it is better to keep zeroing
non-zero sized unnamed bitfields.
In addition to skipping those fields, I have changed the logic how
UNION_TYPEs are handled, the current code was a little bit weird in that
e.g. if first non-static data member had error_mark_node type, we'd happily
zero initialize the second non-static data member, etc.
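As a rough illustration of why zero-width bit-fields matter for layout yet have no bits of their own to initialize, the sketch below compares two hypothetical structs. `Packed`, `Split` and the helper functions are invented names, not taken from the PR testcase, and the exact sizes are ABI-dependent, so the code only checks what should hold on common ABIs.

```cpp
// Hypothetical structs for illustration; not the testcase from PR109868.
struct Packed {
    unsigned char a : 4;
    unsigned char b : 4;
};

struct Split {
    unsigned char a : 4;
    unsigned char : 0;   // zero-width: the next bit-field starts a new allocation unit
    unsigned char b : 4;
};

// Value-initialization zero-initializes the named members; the
// zero-width bit-field contributes layout only, no value.
inline bool split_is_zeroed() {
    Split s = Split();
    return s.a == 0 && s.b == 0;
}

// The separator can only keep or grow the size relative to Packed;
// the exact numbers are ABI-specific.
inline bool split_not_smaller() {
    return sizeof(Split) >= sizeof(Packed);
}
```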
2023-05-17 Jakub Jelinek <jakub@redhat.com>
PR c++/109868
* init.cc (build_zero_init_1): Don't initialize zero-width bitfields.
For unions only initialize the first FIELD_DECL.
Jonathan Wakely [Mon, 28 Nov 2022 13:28:53 +0000 (13:28 +0000)]
libstdc++: Fix src/c++17/memory_resource for H8 targets [PR107801]
This fixes compilation failures for H8 multilibs. For the normal
multilib (ILP16L32?), the chunk struct does not have the expected size,
because uint32_t is type long and has alignment 4 (by default). This
forces sizeof(chunk) to be 12 instead of the expected 10. We can fix
that by using bitset::size_type instead of uint32_t, so that we only use
a 16-bit size when size_t and pointers are 16-bit types.
For the IL32P16 multilibs that use -mint32, int is wider than size_t
and so arithmetic expressions involving size_t promote to int. This
means we need some explicit casts back to size_t.
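The promotion issue can be sketched on an ordinary host by using a 16-bit unsigned type as a stand-in for a narrow size_t. This is only an assumption-laden model: `small_size` and `at_least` are hypothetical names, and the real code in memory_resource.cc is not reproduced here.

```cpp
#include <algorithm>
#include <cstdint>
#include <type_traits>

// Stand-in for a 16-bit size_t (an assumption for illustration only).
using small_size = std::uint16_t;

// Where int is wider than 16 bits, arithmetic on small_size yields int.
static_assert(sizeof(int) <= sizeof(small_size)
              || std::is_same<decltype(small_size{} + small_size{}), int>::value,
              "narrow unsigned operands promote to int");

// std::max deduces a single type for both arguments, so the promoted
// sum has to be cast back to small_size before the call.
inline small_size at_least(small_size used, small_size extra, small_size floor) {
    return std::max(static_cast<small_size>(used + extra), floor);
}
```

Without the cast, `std::max(used + extra, floor)` would mix `int` and `small_size` arguments and fail template argument deduction.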
libstdc++-v3/ChangeLog:
PR libstdc++/107801
* src/c++17/memory_resource.cc (chunk::_M_bytes): Change type
from uint32_t to bitset::size_type. Adjust static assertion.
(__pool_resource::_Pool::replenish): Cast to size_t after
multiplication instead of before.
(__pool_resource::_M_alloc_pools): Ensure both arguments to
std::max have type size_t.
Jason Merrill [Wed, 22 Mar 2023 20:11:47 +0000 (16:11 -0400)]
c++: local class in nested generic lambda [PR109241]
In this testcase, the tree walk to look for bare parameter packs was
confused by finding a type with no TREE_BINFO. But it should be fine that
it's unset; we already checked for unexpanded packs at parse time.
I also tried doing the partial instantiation of the local class, which is
probably the long-term direction we want to go, but for stage 4 let's go
with this safer change.
On the branch ranger isn't powerful enough to handle some cases
appearing with logical-op-non-short-circuit evaluating to false
causing FAILs of the testcase for ppc64le and s390x. The following
forces logical-op-non-short-circuit to true for this testcase
on the branch.
liuhongt [Wed, 10 May 2023 07:16:58 +0000 (15:16 +0800)]
x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.
if (mdaz-ftz)
  link crtfastmath.o
else if ((Ofast || ffast-math || funsafe-math-optimizations)
         && !mno-daz-ftz)
  link crtfastmath.o
else
  Don't link crtfastmath.o
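The decision above can be modeled as a small predicate. This is a sketch only: `LinkFlags` and `links_crtfastmath` are hypothetical names, and the real logic lives in the ENDFILE_SPEC strings rather than in C++ code.

```cpp
// Hypothetical model of the command-line state; field names mirror the
// option names used in the pseudocode above.
struct LinkFlags {
    bool mdaz_ftz;
    bool mno_daz_ftz;
    bool ofast;
    bool ffast_math;
    bool funsafe_math_optimizations;
};

// Returns true when crtfastmath.o would be linked under the rules above.
inline bool links_crtfastmath(const LinkFlags& f) {
    if (f.mdaz_ftz)
        return true;  // explicit request always links it
    const bool fast_math = f.ofast || f.ffast_math || f.funsafe_math_optimizations;
    return fast_math && !f.mno_daz_ftz;  // fast-math links it unless opted out
}
```

Under this model, -Ofast alone links crtfastmath.o, while -Ofast together with -mno-daz-ftz does not.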
gcc/ChangeLog:
* config/i386/cygwin.h (ENDFILE_SPEC): Link crtfastmath.o
whenever -mdaz-ftz is specified. Don't link crtfastmath.o
when -mno-daz-ftz is specified.
* config/i386/darwin.h (ENDFILE_SPEC): Ditto.
* config/i386/gnu-user-common.h
(GNU_USER_TARGET_MATHFILE_SPEC): Ditto.
* config/i386/mingw32.h (ENDFILE_SPEC): Ditto.
* config/i386/i386.opt (mdaz-ftz): New option.
* doc/invoke.texi (x86 options): Document mdaz-ftz.
Jonathan Wakely [Wed, 16 Nov 2022 12:22:04 +0000 (12:22 +0000)]
libstdc++: Fix std::any pretty printer
The recent changes to FilteringTypePrinter affect the result of
gdb.lookup_type('std::string') in StdExpAnyPrinter, causing it to always
return the std::__cxx11::basic_string specialization. This then causes a
gdb.error exception when trying to lookup the std::any manager type for
a specialization using that string, but that manager was never
instantiated in the program. This causes FAILs when running the tests
with -D_GLIBCXX_USE_CXX11_ABI=0:
FAIL: libstdc++-prettyprinters/libfundts.cc print as
FAIL: libstdc++-prettyprinters/libfundts.cc print as
The ugly solution used in this patch is to repeat the lookup for every
type that std::string could be a typedef for, and hope it only works for
one of them.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdExpAnyPrinter): Make
expansion of std::string in manager name more robust.
Dan Horák [Wed, 3 May 2023 19:29:09 +0000 (14:29 -0500)]
libffi: fix handling of homogeneous float128 structs (#689)
If there is a homogeneous struct with float128 members, they should be
copied to vector register save area. The current code incorrectly copies
only the value of the first member, not increasing the pointer with each
iteration. Fix this.
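The kind of loop described can be sketched as follows. This is not the libffi source: `Quad`, `copy_members` and `second_member_copied` are hypothetical names, and `Quad` is just a 16-byte stand-in for a float128 member.

```cpp
#include <cstddef>

// 16-byte stand-in for one float128 member (illustrative assumption).
struct Quad {
    unsigned char bytes[16];
};

// Copy `count` members into the save area, advancing the source
// pointer on every iteration so each member is copied, not just the first.
inline void copy_members(const Quad* src, Quad* save_area, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i)
        save_area[i] = *src++;
}

// Small self-check: the second member must arrive with its own value.
inline bool second_member_copied() {
    Quad in[2] = {};
    in[0].bytes[0] = 1;
    in[1].bytes[0] = 2;
    Quad out[2] = {};
    copy_members(in, out, 2);
    return out[0].bytes[0] == 1 && out[1].bytes[0] == 2;
}
```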
Patrick Palka [Sat, 1 Apr 2023 16:01:30 +0000 (12:01 -0400)]
c++: NTTP constraint depending on outer parms [PR109160]
Here we're crashing during satisfaction for the NTTP 'C<B> auto V'
ultimately because convert_template_argument / unify don't pass all
outer template arguments to do_auto_deduction, and during satisfaction
we need to know all arguments. While these callers do pass some outer
arguments, they are only sufficient to properly substitute the
(level-lowered) 'auto' and are not necessarily the entire set.
Fortunately it seems these callers have access to the full set of outer
arguments via convert_template_argument's 'in_decl' parameter and
unify's 'tparms' parameter. So this patch adds a new parameter to
do_auto_deduction, used only during adc_unify deduction, through which
these callers can pass the enclosing (partially instantiated) template
and from which do_auto_deduction can obtain _all_ outer template
arguments for sake of satisfaction.
This patch also ensures that the 'in_decl' argument passed to
coerce_template_parms is always a TEMPLATE_DECL, which in turn allows us
to pass it as-is to do_auto_deduction; the only coerce_template_parms
caller that needed adjustment was tsubst_decl it seems.
PR c++/109160
gcc/cp/ChangeLog:
* cp-tree.h (do_auto_deduction): Add defaulted tmpl parameter.
* pt.cc (convert_template_argument): Pass 'in_decl' as 'tmpl' to
do_auto_deduction.
(tsubst_decl) <case VAR_/TYPE_DECL>: Pass 'tmpl' instead of 't' as
'in_decl' to coerce_template_parms.
(unify) <case TEMPLATE_PARM_INDEX>: Pass TPARMS_PRIMARY_TEMPLATE
as 'tmpl' to do_auto_deduction.
(do_auto_deduction): Document default arguments. Rename local
variable 'tmpl' to 'ctmpl'. Use 'tmpl' to obtain a full set of
template arguments for satisfaction in the adc_unify case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-placeholder12.C: New test.
Jakub Jelinek [Tue, 9 May 2023 10:14:18 +0000 (12:14 +0200)]
testsuite: Add further testcase for already fixed PR [PR109778]
I came up with a testcase which reproduces all the way to r10-7469.
LTO is used to avoid early inlining, so that ccp handles rotates and not
shifts before they are turned into rotates.
2023-05-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109778
* gcc.dg/lto/pr109778_0.c: New test.
* gcc.dg/lto/pr109778_1.c: New file.
Jakub Jelinek [Tue, 9 May 2023 10:10:07 +0000 (12:10 +0200)]
tree-ssa-ccp, wide-int: Fix up handling of [LR]ROTATE_EXPR in bitwise ccp [PR109778]
The following testcase is miscompiled, because bitwise ccp2 handles
a rotate with a signed type incorrectly.
Seems tree-ssa-ccp.cc has the only callers of wi::[lr]rotate with 3
arguments, all other callers just rotate in the right precision and
I think work correctly. ccp works with widest_ints and so rotations
by the excessive precision certainly don't match what it wants
when it sees a rotate in some specific bitsize. Still, if it is
unsigned rotate and the widest_int is zero extended from width,
the functions perform left shift and logical right shift on the value
and then at the end zero extend the result of left shift and uselessly
also the result of logical right shift and return | of that.
On the testcase the signed char rrotate by 4 argument is
CONSTANT -75 i.e. 0xffffffff....fffffb5 with mask 2.
The mask is correctly rotated to 0x20, but because the 8-bit constant
is sign extended to 192-bit one, the logical right shift by 4 doesn't
yield expected 0xb, but gives 0xfffffffffff....ffffb, and then
return wi::zext (left, width) | wi::zext (right, width); where left is
0xfffffff....fb50, so we return 0xfb instead of the expected
0x5b.
The following patch fixes that by doing the zero extension in case of
the right variable before doing wi::lrshift rather than after it.
Also, wi::[lr]rotate with width < precision always zero extends
the result. I'm afraid it can't do better because it doesn't know
if it is done for an unsigned or signed type, but the caller in this
case knows that very well, so I've done the extension based on sgn
in the caller. E.g. 0x5b rotated right (or left) by 4 with width 8
previously gave 0xb5, but with sgn == SIGNED in widest_int it should be
0xffffffff....fffb5 instead.
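The arithmetic above can be replayed with 64-bit integers standing in for widest_int. The helpers below are illustrative assumptions, not GCC's wi:: API, and they assume 0 < shift < width < 64.

```cpp
#include <cstdint>

// Keep only the low `width` bits (a stand-in for zero extension).
inline std::uint64_t zext(std::uint64_t x, unsigned width) {
    return x & ((std::uint64_t{1} << width) - 1);
}

// Replicate bit `width - 1` into all higher bits (sign extension).
inline std::uint64_t sext(std::uint64_t x, unsigned width) {
    const std::uint64_t sign = std::uint64_t{1} << (width - 1);
    return (zext(x, width) ^ sign) - sign;
}

// Shift first, zero-extend afterwards: sign bits of a sign-extended
// input leak into the right-shifted half.
inline std::uint64_t rrotate_extend_after(std::uint64_t x, unsigned shift,
                                          unsigned width) {
    const std::uint64_t left = x << (width - shift);
    const std::uint64_t right = x >> shift;
    return zext(left, width) | zext(right, width);
}

// Zero-extend the input before the logical right shift.
inline std::uint64_t rrotate_extend_before(std::uint64_t x, unsigned shift,
                                           unsigned width) {
    const std::uint64_t left = x << (width - shift);
    const std::uint64_t right = zext(x, width) >> shift;
    return zext(left, width) | right;
}
```

With the sign-extended input `sext(0xb5, 8)` (that is, -75), a right rotate by 4 at width 8 gives 0xfb from `rrotate_extend_after` and 0x5b from `rrotate_extend_before`. Applying `sext(..., 8)` to a rotate result of 0xb5 yields 0xffffffffffffffb5, which corresponds to the extension done in the caller for a signed type.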
2023-05-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109778
* wide-int.h (wi::lrotate, wi::rrotate): Call wi::lrshift on
wi::zext (x, width) rather than x if width != precision, rather
than using wi::zext (right, width) after the shift.
* tree-ssa-ccp.cc (bit_value_binop): Call wi::ext on the results
of wi::lrotate or wi::rrotate.
Martin Uecker [Wed, 8 Feb 2023 14:02:43 +0000 (15:02 +0100)]
Fix ICE related to implicit access attributes for VLA arguments [PR105660]
When constructing the specifier string when merging an access attribute
that encodes information about VLA arguments, the string was constructed
in random order by iterating through a hash table. Fix this by iterating
through the list of arguments.
gcc/c-family/ChangeLog:
PR c/105660
* c-attribs.cc (append_access_attr): Use order of arguments when
constructing the string.
(append_access_attr_idxs): Rename and make static.
* c-warn.cc (warn_parm_array_mismatch): Add assertion.
gcc/testsuite/ChangeLog:
PR c/105660
* gcc.dg/pr105660-1.c: New test.
* gcc.dg/pr105660-2.c: New test.
Kewen Lin [Wed, 26 Apr 2023 05:21:14 +0000 (00:21 -0500)]
rs6000: Guard power9-vector for vsx_scalar_cmp_exp_qp_* [PR108758]
__builtin_vsx_scalar_cmp_exp_qp_{eq,gt,lt,unordered} used
to be guarded with condition TARGET_P9_VECTOR before new
bif framework was introduced (r12-5752-gd08236359eb229),
since r12-5752 they are placed under stanza ieee128-hw,
that is to check condition TARGET_FLOAT128_HW, it caused
test case float128-cmp2-runnable.c to fail at -m32 as the
condition TARGET_FLOAT128_HW isn't satisfied with -m32.
By checking the commit history, I didn't see any notes on
why this condition change on them was made, so this patch
is to move these bifs from stanza ieee128-hw to stanza
power9-vector as before.
PR target/108758
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def
(__builtin_vsx_scalar_cmp_exp_qp_eq, __builtin_vsx_scalar_cmp_exp_qp_gt,
__builtin_vsx_scalar_cmp_exp_qp_lt,
__builtin_vsx_scalar_cmp_exp_qp_unordered): Move from stanza ieee128-hw
to power9-vector.
Kewen Lin [Wed, 26 Apr 2023 05:21:05 +0000 (00:21 -0500)]
rs6000: Fix predicate for const vector in sldoi_to_mov [PR109069]
As PR109069 shows, commit r12-6537-g080a06fcb076b3 which
introduces define_insn_and_split sldoi_to_mov adopts
easy_vector_constant for const vector of interest, but it's
wrong since predicate easy_vector_constant doesn't guarantee
each byte in the const vector is the same. One counter
example is the const vector in pr109069-1.c. This patch is
to introduce new predicate const_vector_each_byte_same to
ensure all bytes in the given const vector are the same by
considering both int and float, meanwhile for the constants
which don't meet easy_vector_constant we need to gen a move
instead of just a set, and uses VECTOR_MEM_ALTIVEC_OR_VSX_P
rather than VECTOR_UNIT_ALTIVEC_OR_VSX_P for V2DImode support
under VSX since vector long long type of vec_sld is guarded
under stanza vsx.
PR target/109069
gcc/ChangeLog:
* config/rs6000/altivec.md (sldoi_to_mov<mode>): Replace predicate
easy_vector_constant with const_vector_each_byte_same, add
handlings in preparation for !easy_vector_constant, and update
VECTOR_UNIT_ALTIVEC_OR_VSX_P with VECTOR_MEM_ALTIVEC_OR_VSX_P.
* config/rs6000/predicates.md (const_vector_each_byte_same): New
predicate.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr109069-1.c: New test.
* gcc.target/powerpc/pr109069-2-run.c: New test.
* gcc.target/powerpc/pr109069-2.c: New test.
* gcc.target/powerpc/pr109069-2.h: New test.
Kefu Chai [Mon, 1 May 2023 20:24:26 +0000 (21:24 +0100)]
libstdc++: Set _M_string_length before calling _M_dispose() [PR109703]
This always sets _M_string_length in the constructor for ranges of input
iterators, such as stream iterators.
We copy from the source range to the local buffer, and then repeatedly
reallocate a larger one if necessary. When disposing the old buffer,
_M_is_local() is used to tell if the buffer is the local one or not (and
so must be deallocated). In addition to comparing the buffer address
with the local buffer, _M_is_local() has an optimization hint so that
the compiler knows that for a string using the local buffer, there is an
invariant that _M_string_length <= _S_local_capacity (added for PR109299
via r13-6915-gbf78b43873b0b7). But we failed to set _M_string_length in
the constructor taking a pair of iterators, so the invariant might not
hold, and __builtin_unreachable() is reached. This causes UBsan errors,
and potentially misoptimization.
To ensure the invariant holds, _M_string_length is initialized to zero
before doing anything else, so that _M_is_local() doesn't see an
uninitialized value.
This issue only surfaces when constructing a string with a range of
input iterator, and the uninitialized _M_string_length happens to be
greater than _S_local_capacity, i.e., 15 for the std::string
specialization.
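For reference, the affected constructor is the one reached by code like the following sketch, which builds a string longer than the 15-character local buffer from a pair of input iterators. `from_stream` is a hypothetical helper name used only for this example.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Construct a std::string from a range of input iterators (stream
// buffer iterators), the case discussed above.
inline std::string from_stream(const std::string& text) {
    std::istringstream in(text);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Any input longer than 15 characters makes the constructor outgrow the local buffer and reallocate at least once.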
Jakub Jelinek [Wed, 3 May 2023 08:38:04 +0000 (10:38 +0200)]
c++: Fix up VEC_INIT_EXPR gimplification after r12-7069
During patch backporting, I've noticed that while most cp_walk_tree calls
with cp_fold_r callback callers were changed from &pset to cp_fold_data
&data, the VEC_INIT_EXPR gimplification has not, so it still passes just
address of a hash_set<tree> and so if during the folding we ever touch
data->genericize, we use uninitialized data there.
The following patch changes it to do the same thing as cp_fold_function
because the VEC_INIT_EXPR gimplifications will happen on function bodies
only.
2023-05-03 Jakub Jelinek <jakub@redhat.com>
* cp-gimplify.cc (cp_fold_data): Move definition earlier.
(cp_gimplify_expr): Pass address of genericize = true
constructed data rather than &pset to cp_walk_tree with cp_fold_r.
c++, coroutines: Fix block nests when the function has no top-level bind.
When the function contains no local vars and also no nested scopes, there
is no top-level bind expression. Because the rewritten coroutine body will
require both local vars and contain nested scopes, we add a bind expression
to such functions. When this was done the necessary scope blocks were
omitted which leads to disconnected function content.
Fixed by adding a new block to the added bind expression.
Iain Sandoe [Thu, 30 Mar 2023 07:44:23 +0000 (13:14 +0530)]
c++,coroutines: Stabilize names of promoted slot vars [PR101118].
When we need to 'promote' a value (i.e. store it in the coroutine frame) it
is given a frame entry name. This was based on the DECL_UID for slot vars.
However, when LTO is used, the names from multiple TUs become visible at the
same time, and the DECL_UIDs usually differ between units. This leads to a
"ODR mismatch" warning for the frame type.
The fix here is to use the current promoted temporaries count to produce
the name, this is stable between TUs and computed per coroutine.
Andrew Pinski [Thu, 8 Dec 2022 22:34:16 +0000 (22:34 +0000)]
coroutines: Build pointer initializers with nullptr_node [PR107768]
The PR reports that using integer_zero_node triggers a warning for
-Wzero-as-null-pointer-constant which comes from compiler-generated code so
makes no sense to the end user.
Patrick Palka [Wed, 12 Apr 2023 17:08:21 +0000 (13:08 -0400)]
libstdc++: Implement LWG 3904 change to lazy_split_view's iterator
libstdc++-v3/ChangeLog:
* include/std/ranges (lazy_split_view::_OuterIter::_OuterIter):
Propagate _M_trailing_empty in the const-converting constructor
as per LWG 3904.
* testsuite/std/ranges/adaptors/lazy_split.cc (test12): New test.
Patrick Palka [Tue, 14 Mar 2023 20:44:32 +0000 (16:44 -0400)]
libstdc++: Implement P2520R0 changes to move_iterator's iterator_concept
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (move_iterator::_S_iter_concept):
Define.
(__cpp_lib_move_iterator_concept): Define for C++20.
(move_iterator::iterator_concept): Strengthen as per P2520R0.
* include/std/version (__cpp_lib_move_iterator_concept): Define
for C++20.
* testsuite/24_iterators/move_iterator/p2520r0.cc: New test.
Patrick Palka [Thu, 9 Mar 2023 18:35:04 +0000 (13:35 -0500)]
libstdc++: Make views::single/iota/istream SFINAE-friendly [PR108362]
PR libstdc++/108362
libstdc++-v3/ChangeLog:
* include/std/ranges (__detail::__can_single_view): New concept.
(_Single::operator()): Constrain it. Move [[nodiscard]] to the
end of the function declarator.
(__detail::__can_iota_view): New concept.
(_Iota::operator()): Constrain it. Move [[nodiscard]] to the
end of the function declarator.
(__detail::__can_istream_view): New concept.
(_Istream::operator()): Constrain it. Move [[nodiscard]] to the
end of the function declarator.
* testsuite/std/ranges/iota/lwg3292_neg.cc: Prune "in
requirements" diagnostic.
* testsuite/std/ranges/iota/iota_view.cc (test07): New test.
* testsuite/std/ranges/istream_view.cc (test08): New test.
* testsuite/std/ranges/single_view.cc (test07): New test.
Patrick Palka [Mon, 24 Apr 2023 17:39:54 +0000 (13:39 -0400)]
libstdc++: Fix __max_diff_type::operator>>= for negative values
This patch fixes sign bit propagation when right-shifting a negative
__max_diff_type value by more than one, a bug that our existing test
coverage didn't expose until r14-159-g03cebd304955a6 fixed the front
end's 'signed typedef-name' handling that the test relies on (which is
a non-standard extension to the language grammar).
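What sign bit propagation means for shifts by more than one can be sketched with a plain 64-bit unsigned value holding a two's-complement number. This is not the libstdc++ implementation of __max_diff_type; both function names are hypothetical, and both assume 0 < n < 64.

```cpp
#include <cstdint>

// Arithmetic right shift emulated on an unsigned representation:
// all n vacated high bits are filled with copies of the sign bit.
inline std::uint64_t shift_right_signed(std::uint64_t bits, unsigned n) {
    const bool negative = (bits >> 63) != 0;
    std::uint64_t result = bits >> n;
    if (negative)
        result |= ~std::uint64_t{0} << (64 - n);
    return result;
}

// A hypothetical mistake for comparison: restoring only the top bit
// is right for n == 1 but wrong for larger shifts.
inline std::uint64_t shift_right_one_sign_bit(std::uint64_t bits, unsigned n) {
    const bool negative = (bits >> 63) != 0;
    std::uint64_t result = bits >> n;
    if (negative)
        result |= std::uint64_t{1} << 63;
    return result;
}
```

Shifting -8 right by 2 should give -2. The first function does, while the second agrees only when the shift amount is 1.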
libstdc++-v3/ChangeLog:
* include/bits/max_size_type.h (__max_diff_type::operator>>=):
Fix propagation of sign bit.
* testsuite/std/ranges/iota/max_size_type.cc: Avoid using the
non-standard 'signed typedef-name'. Add some compile-time tests
for right-shifting a negative __max_diff_type value by more than
one.
Patrick Palka [Fri, 24 Mar 2023 18:51:24 +0000 (14:51 -0400)]
c++: outer 'this' leaking into local class [PR106969]
Here when resolving the implicit object for '&wrapped' within the
local class Foo, we expect to obtain a dummy object of type Foo& since
there's no 'this' available in this context. And yet at this point
current_class_ref still corresponds to the outer class Context (and is
const), which confuses maybe_dummy_object into propagating the cv-quals
of current_class_ref and returning an object of type const Foo&. Thus
decltype(&wrapped) wrongly yields const int* instead of int*.
The problem ultimately seems to be that the 'this' from the enclosing
class appears available for use when parsing the local class, but 'this'
shouldn't persist across classes like that. This patch fixes this by
clearing current_class_ptr/ref before parsing a class definition.
After this change, for the test name-clash11.C in C++98 mode we would
now complain about an invalid use of 'this' in e.g.
ASSERT (sizeof (this->A) == 16);
due to the way the test defines the ASSERT macro via a local class.
This patch redefines the macro using a local typedef instead.
PR c++/106969
gcc/cp/ChangeLog:
* parser.cc (cp_parser_class_specifier): Clear current_class_ptr
and current_class_ref sooner, before parsing a class definition.
gcc/testsuite/ChangeLog:
* g++.dg/lookup/name-clash11.C: Fix ASSERT macro definition in
C++98 mode.
* g++.dg/lookup/this2.C: New test.
Here we're mishandling the unevaluated array new-expressions due to a
supposed non-constant array size ever since r12-5253-g4df7f8c79835d569
made us no longer perform constant evaluation of non-manifestly-constant
expressions within unevaluated contexts. This shouldn't make a difference
here since the array sizes are constant literals, except they're expressed
as NON_LVALUE_EXPR location wrappers around INTEGER_CST, wrappers which
used to get stripped as part of constant evaluation and now no longer do.
Moreover it means build_vec_init can't constant fold the MINUS_EXPR
'maxindex' passed from build_new_1 when in an unevaluated context (since
it tries reducing it via maybe_constant_value called with mce_unknown).
This patch fixes these issues by making maybe_constant_value (and
fold_non_dependent_expr) try folding an unevaluated non-manifestly-constant
operand via fold(), as long as it simplifies to a simple constant, rather
than doing no simplification at all. This covers e.g. simple arithmetic
and casts including stripping of location wrappers around INTEGER_CST.
In passing, this patch also fixes maybe_constant_value to avoid constant
evaluating an unevaluated operand when called with mce_false, by adjusting
the early exit test appropriately.
Co-authored-by: Jason Merrill <jason@redhat.com>
PR c++/108219
PR c++/108218
gcc/cp/ChangeLog:
* constexpr.cc (fold_to_constant): Define.
(maybe_constant_value): Move up early exit test for unevaluated
operands. Try reducing an unevaluated operand to a constant via
fold_to_constant.
(fold_non_dependent_expr_template): Add early exit test for
CONSTANT_CLASS_P nodes. Try reducing an unevaluated operand
to a constant via fold_to_constant.
* cp-tree.h (fold_to_constant): Declare.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/new6.C: New test.
* g++.dg/cpp2a/concepts-new1.C: New test.
Jonathan Wakely [Thu, 24 Nov 2022 21:09:03 +0000 (21:09 +0000)]
libstdc++: Call predicate with non-const values in std::erase_if [PR107850]
As specified in the standard, the predicate for std::erase_if has to be
invocable as non-const with a non-const lvalue argument. Restore
support for predicates that only accept non-const arguments.
It's not strictly necessary to change it for the set and unordered_set
overloads, because they only give const access to the elements anyway.
I've done it for them too just to keep them all consistent.
Jonathan Wakely [Fri, 31 Mar 2023 12:38:14 +0000 (13:38 +0100)]
libstdc++: Avoid -Wmaybe-uninitialized warning in std::stop_source [PR109339]
We pass a const-reference to *this before it's constructed, and GCC
assumes that all const-references are accessed. Add the access attribute
to say it's not accessed.
libstdc++-v3/ChangeLog:
PR libstdc++/109339
* include/std/stop_token (_Stop_state_ptr(const stop_source&)):
Add attribute access with access-mode 'none'.
* testsuite/30_threads/stop_token/stop_source/109339.cc: New test.
Jonathan Wakely [Mon, 28 Nov 2022 09:44:52 +0000 (09:44 +0000)]
libstdc++: Make 16-bit std::subtract_with_carry_engine work [PR107466]
This implements the proposed resolution of LWG 3809, so that
std::subtract_with_carry_engine can be used with a 16-bit result_type.
Currently this produces a narrowing error when instantiating the
std::linear_congruential_engine to create the initial state. It also
truncates the default_seed constant when passing it as a result_type
argument.
Change the type of the constant to uint_least32_t and pass 0u when the
default_seed should be used.
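The truncation can be checked with plain integer arithmetic. The value 19780503u is the default_seed the C++ standard specifies for std::subtract_with_carry_engine; the helper names below are hypothetical and exist only for this sketch.

```cpp
#include <cstdint>

// Converting the default seed to a 16-bit result_type keeps only the
// low 16 bits: 19780503 mod 65536.
inline std::uint16_t truncated_seed() {
    return static_cast<std::uint16_t>(19780503u);
}

// The seed does not fit in 16 bits, which is why a type of at least
// 32 bits is needed to carry it.
inline bool seed_fits_in_16_bits() {
    return 19780503u <= 0xFFFFu;
}
```

Here `truncated_seed()` yields 54167 (0xD397), which is not the intended default seed.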
libstdc++-v3/ChangeLog:
PR libstdc++/107466
* include/bits/random.h (subtract_with_carry_engine): Use 32-bit
type for default seed. Use 0u as default argument for
subtract_with_carry_engine(result_type) constructor and
seed(result_type) member function.
* include/bits/random.tcc (subtract_with_carry_engine): Use
32-bit type for default seed and engine used for initial state.
* testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc:
New test.
Jonathan Wakely [Wed, 26 Apr 2023 11:27:59 +0000 (12:27 +0100)]
libstdc++: Reduce Doxygen output for PDF
Including the header source code in the doxygen-generated PDF file makes
it too large, and causes pdflatex to run out of memory. If we only set
SOURCE_BROWSER=YES for the HTML docs then we won't include the sources
in the PDF file.
There are several macros defined for std::valarray that are only used to
generate repetitive code and then #undef'd. Those aren't useful in the
doxygen docs, especially the ones that reuse the same name in different
files. Omitting them avoids warnings about duplicate labels in the
refman.tex file.
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (SOURCE_BROWSER): Only set to YES for
HTML docs.
* include/bits/gslice_array.h (_DEFINE_VALARRAY_OPERATOR): Omit
from doxygen docs.
* include/bits/indirect_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/bits/mask_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/bits/slice_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/std/valarray (_DEFINE_VALARRAY_UNARY_OPERATOR)
(_DEFINE_VALARRAY_AUGMENTED_ASSIGNMENT)
(_DEFINE_VALARRAY_EXPR_AUGMENTED_ASSIGNMENT)
(_DEFINE_BINARY_OPERATOR): Likewise.
Roger Sayle [Tue, 10 Jan 2023 14:05:46 +0000 (14:05 +0000)]
PR rtl-optimization/106421: ICE in bypass_block from non-local goto.
This patch fixes PR rtl-optimization/106421, an ICE-on-valid (but
undefined) regression. The fix, as proposed by Richard Biener, is to
defend against BLOCK_FOR_INSN returning NULL in cprop's bypass_block.
2023-01-10 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/106421
* cprop.cc (bypass_block): Check that DEST is local to this
function (non-NULL) before calling find_edge.
gcc/testsuite/ChangeLog
PR rtl-optimization/106421
* gcc.dg/pr106421.c: New test case.
H.J. Lu [Mon, 16 Jan 2023 18:45:41 +0000 (10:45 -0800)]
x86: Disable -mforce-indirect-call for PIC in 32-bit mode
-mforce-indirect-call generates an invalid instruction in a 32-bit MI thunk
since there are no available scratch registers in 32-bit PIC mode.
Disable -mforce-indirect-call for PIC in 32-bit mode when generating
MI thunk.
gcc/
PR target/105980
* config/i386/i386.cc (x86_output_mi_thunk): Disable
-mforce-indirect-call for PIC in 32-bit mode.
gcc/testsuite/
PR target/105980
* g++.target/i386/pr105980.C: New test.
Jan Hubicka [Fri, 12 Aug 2022 14:25:28 +0000 (16:25 +0200)]
Fix invalid devirtualization when combining final keyword and anonymous types
This patch fixes a wrong-code issue where we incorrectly devirtualize to
__builtin_unreachable. The problem occurs with the combination of anonymous
namespaces and the final keyword used on methods. We do two optimizations here:
1) when reaching a final method we cut the search for possible new targets
2) if the type is anonymous we detect whether it is ever instantiated by
looking if its vtable is referred to.
Now this goes wrong when there is an anonymous type with a final method that
is not instantiated while its derived type is. So if 1 triggers we need
to make 2 look for vtables of all derived types, as done by this patch.
Bootstrapped/regtested x86_64-linux
Honza
gcc/ChangeLog:
2022-08-10 Jan Hubicka <hubicka@ucw.cz>
PR middle-end/106057
* ipa-devirt.cc (type_or_derived_type_possibly_instantiated_p): New
function.
(possible_polymorphic_call_targets): Use it.
gcc/testsuite/ChangeLog:
2022-08-10 Jan Hubicka <hubicka@ucw.cz>
PR middle-end/106057
* g++.dg/tree-ssa/pr101839.C: New test.
Martin Jambor [Wed, 26 Apr 2023 16:38:39 +0000 (18:38 +0200)]
ipa: Fix double reference-count decrements for the same edge (PR 107769, PR 109318)
It turns out that since addition of the code that can identify globals
which are only read from, the code that keeps track of the references
can decrement their count for the same calls, once during IPA-CP and
then again during inlining. Fixed by adding a special flag to the
pass-through variant and simply wiping out the reference to the
refdesc structure from the constant ones.
Moreover, during debugging of the issue I have discovered that the
code removing references could remove a reference associated with the
same statement but of a wrong type. In all cases it wanted to remove
an IPA_REF_ADDR reference so removing a lesser one instead should do
no harm in practice, but we should try to be consistent and so this
patch extends symtab_node::find_reference so that it searches for a
reference of a given type only.
gcc/ChangeLog:
2023-04-14 Martin Jambor <mjambor@suse.cz>
PR ipa/107769
PR ipa/109318
* cgraph.h (symtab_node::find_reference): Add parameter use_type.
* ipa-prop.h (ipa_pass_through_data): New flag refdesc_decremented.
(ipa_zap_jf_refdesc): New function.
(ipa_get_jf_pass_through_refdesc_decremented): Likewise.
(ipa_set_jf_pass_through_refdesc_decremented): Likewise.
* ipa-cp.cc (ipcp_discover_new_direct_edges): Provide a value for
the new parameter of find_reference.
(adjust_references_in_caller): Likewise. Make sure the constant jump
function is not used to decrement a refdesc counter again. Only
decrement refdesc counters when the pass_through jump function allows
it. Added a detailed dump when decrementing refdesc counters.
* ipa-prop.cc (ipa_print_node_jump_functions_for_edge): Dump new flag.
(ipa_set_jf_simple_pass_through): Initialize the new flag.
(ipa_set_jf_unary_pass_through): Likewise.
(ipa_set_jf_arith_pass_through): Likewise.
(remove_described_reference): Provide a value for the new parameter of
find_reference.
(update_jump_functions_after_inlining): Zap refdesc of new jfunc if
the previous pass_through had a flag mandating that we do so.
(propagate_controlled_uses): Likewise. Only decrement refdesc
counters when the pass_through jump function allows it.
(ipa_edge_args_sum_t::duplicate): Provide a value for the new
parameter of find_reference.
(ipa_write_jump_function): Assert the new flag does not have to be
streamed.
* symtab.cc (symtab_node::find_reference): Add parameter use_type, use
it in searching.
Jakub Jelinek [Tue, 25 Apr 2023 12:20:51 +0000 (14:20 +0200)]
powerpc: Fix up *branch_anddi3_dot for -m32 -mpowerpc64 [PR109566]
The following testcase reduced from newlib ICEs on powerpc-linux,
with -O2 -m32 -mpowerpc64 since r12-6433 PR102239 optimization was
added and on the original testcase since some ranger improvements in
GCC 13 made it no longer latent on newlib.
The problem is that the *branch_anddi3_dot define_insn_and_split
relies on the *rotldi3_mask_dot define_insn_and_split being recognized
during splitting. The rs6000_is_valid_rotate_dot_mask function checks whether
the mask is a CONST_INT which is a valid mask, but *rotl<mode>3_mask_dot in
addition to checking that it is a valid mask also has
(<MODE>mode == Pmode || UINTVAL (operands[3]) <= 0x7fffffff)
test in the condition. For TARGET_64BIT that doesn't add any further
requirements, but for !TARGET_64BIT && TARGET_POWERPC64 if the AND
second operand is larger than INT_MAX it will not be recognized.
The rs6000_is_valid_rotate_dot_mask function is used solely in one spot,
condition of *branch_anddi3_dot, so the following patch adjusts it
to check for that as well.
2023-04-25 Jakub Jelinek <jakub@redhat.com>
PR target/109566
* config/rs6000/rs6000.cc (rs6000_is_valid_rotate_dot_mask): For
!TARGET_64BIT, don't return true if UINTVAL (mask) << (63 - nb)
is larger than signed int maximum.
Richard Biener [Tue, 25 Apr 2023 12:56:44 +0000 (14:56 +0200)]
tree-optimization/109609 - correctly interpret arg size in fnspec
By majority vote, and following a hint from the API name, which is
arg_max_access_size_given_by_arg_p, this interprets a memory access
size that is specified as given by another argument (such as for strncpy
in the testcase, which has "1cO313") as specifying the _maximum_
size read/written rather than the exact size. There are two
uses interpreting it that way already and one differing. The
following adjusts the differing one and clarifies the documentation.
PR tree-optimization/109609
* attr-fnspec.h (arg_max_access_size_given_by_arg_p):
Clarify semantics.
* tree-ssa-alias.cc (check_fnspec): Correctly interpret
the size given by arg_max_access_size_given_by_arg_p as
maximum, not exact, size.
Richard Biener [Mon, 24 Apr 2023 11:31:07 +0000 (13:31 +0200)]
rtl-optimization/109585 - alias analysis typo
When r10-514-gc6b84edb6110dd2b4fb improved access path analysis
it introduced a typo that triggers when there's an access to a
trailing array in the first access path leading to false
disambiguation.
Richard Biener [Fri, 21 Apr 2023 10:57:17 +0000 (12:57 +0200)]
tree-optimization/109573 - avoid ICEing on unexpected live def
The following relaxes the assert in vectorizable_live_operation,
where we catch currently unhandled cases, to also allow an
intermediate copy as it happens here, and also demotes the assert
to checking only.
PR tree-optimization/109573
* tree-vect-loop.cc (vectorizable_live_operation): Allow
unhandled SSA copy as well. Demote assert to checking only.
Eric Botcazou [Tue, 25 Apr 2023 08:46:16 +0000 (10:46 +0200)]
Remove obsolete configure code in gnattools
It was recently pointed out that we generate symbolic links to ghost files
when building the GNAT tools, as the mlib-tgt-specific-*.adb files are gone.
Harald Anlauf [Tue, 11 Apr 2023 14:44:32 +0000 (16:44 +0200)]
Fortran: resolve correct generic with TYPE(C_PTR) arguments [PR61615,PR99982]
gcc/fortran/ChangeLog:
PR fortran/61615
PR fortran/99982
* interface.cc (compare_parameter): Enable type and rank checks for
arguments of derived type from the intrinsic module ISO_C_BINDING.
gcc/testsuite/ChangeLog:
PR fortran/61615
PR fortran/99982
* gfortran.dg/interface_49.f90: New test.