liuhongt [Wed, 10 May 2023 07:16:58 +0000 (15:16 +0800)]
x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.
if (mdaz-ftz)
link crtfastmath.o
else if ((Ofast || ffast-math || funsafe-math-optimizations)
&& !mno-daz-ftz)
link crtfastmath.o
else
Don't link crtfastmath.o
gcc/ChangeLog:
* config/i386/cygwin.h (ENDFILE_SPEC): Link crtfastmath.o
whenever -mdaz-ftz is specified. Don't link crtfastmath.o
when -mno-daz-ftz is specified.
* config/i386/darwin.h (ENDFILE_SPEC): Ditto.
* config/i386/gnu-user-common.h
(GNU_USER_TARGET_MATHFILE_SPEC): Ditto.
* config/i386/mingw32.h (ENDFILE_SPEC): Ditto.
* config/i386/i386.opt (mdaz-ftz): New option.
* doc/invoke.texi (x86 options): Document mftz-daz.
Jonathan Wakely [Wed, 16 Nov 2022 12:22:04 +0000 (12:22 +0000)]
libstdc++: Fix std::any pretty printer
The recent changes to FilteringTypePrinter affect the result of
gdb.lookup_type('std::string') in StdExpAnyPrinter, causing it to always
return the std::__cxx11::basic_string specialization. This then causes a
gdb.error exception when trying to lookup the std::any manager type for
a specialization using that string, but that manager was never
instantiated in the program. This causes FAILs when running the tests
with -D_GLIBCXX_USE_CXX11_ABI=0:
FAIL: libstdc++-prettyprinters/libfundts.cc print as
FAIL: libstdc++-prettyprinters/libfundts.cc print as
The ugly solution used in this patch is to repeat the lookup for every
type that std::string could be a typedef for, and hope it only works for
one of them.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdExpAnyPrinter): Make
expansion of std::string in manager name more robust.
Dan Horák [Wed, 3 May 2023 19:29:09 +0000 (14:29 -0500)]
libffi: fix handling of homogeneous float128 structs (#689)
If there is a homogeneous struct with float128 members, they should be
copied to vector register save area. The current code incorrectly copies
only the value of the first member, not increasing the pointer with each
iteration. Fix this.
Patrick Palka [Sat, 1 Apr 2023 16:01:30 +0000 (12:01 -0400)]
c++: NTTP constraint depending on outer parms [PR109160]
Here we're crashing during satisfaction for the NTTP 'C<B> auto V'
ultimately because convert_template_argument / unify don't pass all
outer template arguments to do_auto_deduction, and during satisfaction
we need to know all arguments. While these callers do pass some outer
arguments, they are only sufficient to properly substitute the
(level-lowered) 'auto' and are not necessarily the entire set.
Fortunately it seems these callers have access to the full set of outer
arguments via convert_template_argument's 'in_decl' parameter and
unify's 'tparms' parameter. So this patch adds a new parameter to
do_auto_deduction, used only during adc_unify deduction, through which
these callers can pass the enclosing (partially instantiated) template
and from which do_auto_deduction can obtain _all_ outer template
arguments for sake of satisfaction.
This patch also ensures that the 'in_decl' argument passed to
coerce_template_parms is always a TEMPLATE_DECL, which in turn allows us
to pass it as-is to do_auto_deduction; the only coerce_template_parms
caller that needed adjustment was tsubst_decl it seems.
PR c++/109160
gcc/cp/ChangeLog:
* cp-tree.h (do_auto_deduction): Add defaulted tmpl parameter.
* pt.cc (convert_template_argument): Pass 'in_decl' as 'tmpl' to
do_auto_deduction.
(tsubst_decl) <case VAR_/TYPE_DECL>: Pass 'tmpl' instead of 't' as
'in_decl' to coerce_template_parms.
(unify) <case TEMPLATE_PARM_INDEX>: Pass TPARMS_PRIMARY_TEMPLATE
as 'tmpl' to do_auto_deduction.
(do_auto_deduction): Document default arguments. Rename local
variable 'tmpl' to 'ctmpl'. Use 'tmpl' to obtain a full set of
template arguments for satisfaction in the adc_unify case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-placeholder12.C: New test.
Jakub Jelinek [Tue, 9 May 2023 10:14:18 +0000 (12:14 +0200)]
testsuite: Add further testcase for already fixed PR [PR109778]
I came up with a testcase which reproduces all the way to r10-7469.
LTO to avoid early inlining it, so that ccp handles rotates and not
shifts before they are turned into rotates.
2023-05-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109778
* gcc.dg/lto/pr109778_0.c: New test.
* gcc.dg/lto/pr109778_1.c: New file.
Jakub Jelinek [Tue, 9 May 2023 10:10:07 +0000 (12:10 +0200)]
tree-ssa-ccp, wide-int: Fix up handling of [LR]ROTATE_EXPR in bitwise ccp [PR109778]
The following testcase is miscompiled, because bitwise ccp2 handles
a rotate with a signed type incorrectly.
Seems tree-ssa-ccp.cc has the only callers of wi::[lr]rotate with 3
arguments, all other callers just rotate in the right precision and
I think work correctly. ccp works with widest_ints and so rotations
by the excessive precision certainly don't match what it wants
when it sees a rotate in some specific bitsize. Still, if it is
unsigned rotate and the widest_int is zero extended from width,
the functions perform left shift and logical right shift on the value
and then at the end zero extend the result of left shift and uselessly
also the result of logical right shift and return | of that.
On the testcase we the signed char rrotate by 4 argument is
CONSTANT -75 i.e. 0xffffffff....fffffb5 with mask 2.
The mask is correctly rotated to 0x20, but because the 8-bit constant
is sign extended to 192-bit one, the logical right shift by 4 doesn't
yield expected 0xb, but gives 0xfffffffffff....ffffb, and then
return wi::zext (left, width) | wi::zext (right, width); where left is
0xfffffff....fb50, so we return 0xfb instead of the expected
0x5b.
The following patch fixes that by doing the zero extension in case of
the right variable before doing wi::lrshift rather than after it.
Also, wi::[lr]rotate widht width < precision always zero extends
the result. I'm afraid it can't do better because it doesn't know
if it is done for an unsigned or signed type, but the caller in this
case knows that very well, so I've done the extension based on sgn
in the caller. E.g. 0x5b rotated right (or left) by 4 with width 8
previously gave 0xb5, but sgn == SIGNED in widest_int it should be
0xffffffff....fffb5 instead.
2023-05-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109778
* wide-int.h (wi::lrotate, wi::rrotate): Call wi::lrshift on
wi::zext (x, width) rather than x if width != precision, rather
than using wi::zext (right, width) after the shift.
* tree-ssa-ccp.cc (bit_value_binop): Call wi::ext on the results
of wi::lrotate or wi::rrotate.
Martin Uecker [Wed, 8 Feb 2023 14:02:43 +0000 (15:02 +0100)]
Fix ICE related to implicit access attributes for VLA arguments [PR105660]
When constructing the specifier string when merging an access attribute
that encodes information about VLA arguments, the string was constructed
in random order by iterating through a hash table. Fix this by iterating
though the list of arguments.
gcc/c-family/Changelog:
PR c/105660
* c-attribs.cc (append_access_attr): Use order of arguments when
construction string.
(append_access_attr_idxs): Rename and make static.
* c-warn.cc (warn_parm_array_mismatch): Add assertion.
gcc/testsuite/ChangeLog:
PR c/105660
* gcc.dg/pr105660-1.c: New test.
* gcc.dg/pr105660-2.c: New test.
Kewen Lin [Wed, 26 Apr 2023 05:21:14 +0000 (00:21 -0500)]
rs6000: Guard power9-vector for vsx_scalar_cmp_exp_qp_* [PR108758]
__builtin_vsx_scalar_cmp_exp_qp_{eq,gt,lt,unordered} used
to be guarded with condition TARGET_P9_VECTOR before new
bif framework was introduced (r12-5752-gd08236359eb229),
since r12-5752 they are placed under stanza ieee128-hw,
that is to check condition TARGET_FLOAT128_HW, it caused
test case float128-cmp2-runnable.c to fail at -m32 as the
condition TARGET_FLOAT128_HW isn't satisified with -m32.
By checking the commit history, I didn't see any notes on
why this condition change on them was made, so this patch
is to move these bifs from stanza ieee128-hw to stanza
power9-vector as before.
PR target/108758
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def
(__builtin_vsx_scalar_cmp_exp_qp_eq, __builtin_vsx_scalar_cmp_exp_qp_gt
__builtin_vsx_scalar_cmp_exp_qp_lt,
__builtin_vsx_scalar_cmp_exp_qp_unordered): Move from stanza ieee128-hw
to power9-vector.
Kewen Lin [Wed, 26 Apr 2023 05:21:05 +0000 (00:21 -0500)]
rs6000: Fix predicate for const vector in sldoi_to_mov [PR109069]
As PR109069 shows, commit r12-6537-g080a06fcb076b3 which
introduces define_insn_and_split sldoi_to_mov adopts
easy_vector_constant for const vector of interest, but it's
wrong since predicate easy_vector_constant doesn't guarantee
each byte in the const vector is the same. One counter
example is the const vector in pr109069-1.c. This patch is
to introduce new predicate const_vector_each_byte_same to
ensure all bytes in the given const vector are the same by
considering both int and float, meanwhile for the constants
which don't meet easy_vector_constant we need to gen a move
instead of just a set, and uses VECTOR_MEM_ALTIVEC_OR_VSX_P
rather than VECTOR_UNIT_ALTIVEC_OR_VSX_P for V2DImode support
under VSX since vector long long type of vec_sld is guarded
under stanza vsx.
PR target/109069
gcc/ChangeLog:
* config/rs6000/altivec.md (sldoi_to_mov<mode>): Replace predicate
easy_vector_constant with const_vector_each_byte_same, add
handlings in preparation for !easy_vector_constant, and update
VECTOR_UNIT_ALTIVEC_OR_VSX_P with VECTOR_MEM_ALTIVEC_OR_VSX_P.
* config/rs6000/predicates.md (const_vector_each_byte_same): New
predicate.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr109069-1.c: New test.
* gcc.target/powerpc/pr109069-2-run.c: New test.
* gcc.target/powerpc/pr109069-2.c: New test.
* gcc.target/powerpc/pr109069-2.h: New test.
Kefu Chai [Mon, 1 May 2023 20:24:26 +0000 (21:24 +0100)]
libstdc++: Set _M_string_length before calling _M_dispose() [PR109703]
This always sets _M_string_length in the constructor for ranges of input
iterators, such as stream iterators.
We copy from the source range to the local buffer, and then repeatedly
reallocate a larger one if necessary. When disposing the old buffer,
_M_is_local() is used to tell if the buffer is the local one or not (and
so must be deallocated). In addition to comparing the buffer address
with the local buffer, _M_is_local() has an optimization hint so that
the compiler knows that for a string using the local buffer, there is an
invariant that _M_string_length <= _S_local_capacity (added for PR109299
via r13-6915-gbf78b43873b0b7). But we failed to set _M_string_length in
the constructor taking a pair of iterators, so the invariant might not
hold, and __builtin_unreachable() is reached. This causes UBsan errors,
and potentially misoptimization.
To ensure the invariant holds, _M_string_length is initialized to zero
before doing anything else, so that _M_is_local() doesn't see an
uninitialized value.
This issue only surfaces when constructing a string with a range of
input iterator, and the uninitialized _M_string_length happens to be
greater than _S_local_capacity, i.e., 15 for the std::string
specialization.
Jakub Jelinek [Wed, 3 May 2023 08:38:04 +0000 (10:38 +0200)]
c++: Fix up VEC_INIT_EXPR gimplification after r12-7069
During patch backporting, I've noticed that while most cp_walk_tree calls
with cp_fold_r callback callers were changed from &pset to cp_fold_data
&data, the VEC_INIT_EXPR gimplifications has not, so it still passes just
address of a hash_set<tree> and so if during the folding we ever touch
data->genericize, we use uninitialized data there.
The following patch changes it to do the same thing as cp_fold_function
because the VEC_INIT_EXPR gimplifications will happen on function bodies
only.
2023-05-03 Jakub Jelinek <jakub@redhat.com>
* cp-gimplify.cc (cp_fold_data): Move definition earlier.
(cp_gimplify_expr): Pass address of genericize = true
constructed data rather than &pset to cp_walk_tree with cp_fold_r.
c++, coroutines: Fix block nests when the function has no top-level bind.
When the function contains no local vars and also no nested scopes, there
is no top-level bind expression. Because the rewritten coroutine body will
require both local vars and contain nested scopes, we add a bind expression
to such functions. When this was done the necessary scope blocks were
omitted which leads to disconnected function content.
Fixed by adding a new block to the added bind expression.
Iain Sandoe [Thu, 30 Mar 2023 07:44:23 +0000 (13:14 +0530)]
c++,coroutines: Stabilize names of promoted slot vars [PR101118].
When we need to 'promote' a value (i.e. store it in the coroutine frame) it
is given a frame entry name. This was based on the DECL_UID for slot vars.
However, when LTO is used, the names from multiple TUs become visible at the
same time, and the DECL_UIDs usually differ between units. This leads to a
"ODR mismatch" warning for the frame type.
The fix here is to use the current promoted temporaries count to produce
the name, this is stable between TUs and computed per coroutine.
Andrew Pinski [Thu, 8 Dec 2022 22:34:16 +0000 (22:34 +0000)]
coroutines: Build pointer initializers with nullptr_node [PR107768]
The PR reports that using integer_zero_node triggers a warning for
-Wzero-as-null-pointer-constant which comes from compiler-generated code so
makes no sense to the end user.
Patrick Palka [Wed, 12 Apr 2023 17:08:21 +0000 (13:08 -0400)]
libstdc++: Implement LWG 3904 change to lazy_split_view's iterator
libstdc++-v3/ChangeLog:
* include/std/ranges (lazy_split_view::_OuterIter::_OuterIter):
Propagate _M_trailing_empty in the const-converting constructor
as per LWG 3904.
* testsuite/std/ranges/adaptors/lazy_split.cc (test12): New test.
Patrick Palka [Tue, 14 Mar 2023 20:44:32 +0000 (16:44 -0400)]
libstdc++: Implement P2520R0 changes to move_iterator's iterator_concept
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (move_iterator::_S_iter_concept):
Define.
(__cpp_lib_move_iterator_concept): Define for C++20.
(move_iterator::iterator_concept): Strengthen as per P2520R0.
* include/std/version (__cpp_lib_move_iterator_concept): Define
for C++20.
* testsuite/24_iterators/move_iterator/p2520r0.cc: New test.
Patrick Palka [Thu, 9 Mar 2023 18:35:04 +0000 (13:35 -0500)]
libstdc++: Make views::single/iota/istream SFINAE-friendly [PR108362]
PR libstdc++/108362
libstdc++-v3/ChangeLog:
* include/std/ranges (__detail::__can_single_view): New concept.
(_Single::operator()): Constrain it. Move [[nodiscard]] to the
end of the function declarator.
(__detail::__can_iota_view): New concept.
(_Iota::operator()): Constrain it. Move [[nodiscard]] to the
end of the function declarator.
(__detail::__can_istream_view): New concept.
(_Istream::operator()): Constrain it. Move [[nodiscard]] to the
end of the function declarator.
* testsuite/std/ranges/iota/lwg3292_neg.cc: Prune "in
requirements" diagnostic.
* testsuite/std/ranges/iota/iota_view.cc (test07): New test.
* testsuite/std/ranges/istream_view.cc (test08): New test.
* testsuite/std/ranges/single_view.cc (test07): New test.
Patrick Palka [Mon, 24 Apr 2023 17:39:54 +0000 (13:39 -0400)]
libstdc++: Fix __max_diff_type::operator>>= for negative values
This patch fixes sign bit propagation when right-shifting a negative
__max_diff_type value by more than one, a bug that our existing test
coverage didn't expose until r14-159-g03cebd304955a6 fixed the front
end's 'signed typedef-name' handling that the test relies on (which is
a non-standard extension to the language grammar).
libstdc++-v3/ChangeLog:
* include/bits/max_size_type.h (__max_diff_type::operator>>=):
Fix propagation of sign bit.
* testsuite/std/ranges/iota/max_size_type.cc: Avoid using the
non-standard 'signed typedef-name'. Add some compile-time tests
for right-shifting a negative __max_diff_type value by more than
one.
Patrick Palka [Fri, 24 Mar 2023 18:51:24 +0000 (14:51 -0400)]
c++: outer 'this' leaking into local class [PR106969]
Here when resolving the implicit object for '&wrapped' within the
local class Foo, we expect to obtain a dummy object of type Foo& since
there's no 'this' available in this context. And yet at this point
current_class_ref still corresponds to the outer class Context (and is
const), which confuses maybe_dummy_object into propagating the cv-quals
of current_class_ref and returning an object of type const Foo&. Thus
decltype(&wrapped) wrongly yields const int* instead of int*.
The problem ultimately seems to be that the 'this' from the enclosing
class appears available for use when parsing the local class, but 'this'
shouldn't persist across classes like that. This patch fixes this by
clearing current_class_ptr/ref before parsing a class definition.
After this change, for the test name-clash11.C in C++98 mode we would
now complain about an invalid use of 'this' in e.g.
ASSERT (sizeof (this->A) == 16);
due to the way the test defines the ASSERT macro via a local class.
This patch redefines the macro using a local typedef instead.
PR c++/106969
gcc/cp/ChangeLog:
* parser.cc (cp_parser_class_specifier): Clear current_class_ptr
and current_class_ref sooner, before parsing a class definition.
gcc/testsuite/ChangeLog:
* g++.dg/lookup/name-clash11.C: Fix ASSERT macro definition in
C++98 mode.
* g++.dg/lookup/this2.C: New test.
Here we're mishandling the unevaluated array new-expressions due to a
supposed non-constant array size ever since r12-5253-g4df7f8c79835d569
made us no longer perform constant evaluation of non-manifestly-constant
expressions within unevaluated contexts. This shouldn't make a difference
here since the array sizes are constant literals, except they're expressed
as NON_LVALUE_EXPR location wrappers around INTEGER_CST, wrappers which
used to get stripped as part of constant evaluation and now no longer do.
Moreover it means build_vec_init can't constant fold the MINUS_EXPR
'maxindex' passed from build_new_1 when in an unevaluated context (since
it tries reducing it via maybe_constant_value called with mce_unknown).
This patch fixes these issues by making maybe_constant_value (and
fold_non_dependent_expr) try folding an unevaluated non-manifestly-constant
operand via fold(), as long as it simplifies to a simple constant, rather
than doing no simplification at all. This covers e.g. simple arithmetic
and casts including stripping of location wrappers around INTEGER_CST.
In passing, this patch also fixes maybe_constant_value to avoid constant
evaluating an unevaluated operand when called with mce_false, by adjusting
the early exit test appropriately.
Co-authored-by: Jason Merrill <jason@redhat.com>
PR c++/108219
PR c++/108218
gcc/cp/ChangeLog:
* constexpr.cc (fold_to_constant): Define.
(maybe_constant_value): Move up early exit test for unevaluated
operands. Try reducing an unevaluated operand to a constant via
fold_to_constant.
(fold_non_dependent_expr_template): Add early exit test for
CONSTANT_CLASS_P nodes. Try reducing an unevaluated operand
to a constant via fold_to_constant.
* cp-tree.h (fold_to_constant): Declare.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/new6.C: New test.
* g++.dg/cpp2a/concepts-new1.C: New test.
Jonathan Wakely [Thu, 24 Nov 2022 21:09:03 +0000 (21:09 +0000)]
libstdc++: Call predicate with non-const values in std::erase_if [PR107850]
As specified in the standard, the predicate for std::erase_if has to be
invocable as non-const with a non-const lvalues argument. Restore
support for predicates that only accept non-const arguments.
It's not strictly nevessary to change it for the set and unordered_set
overloads, because they only give const access to the elements anyway.
I've done it for them too just to keep them all consistent.
Jonathan Wakely [Fri, 31 Mar 2023 12:38:14 +0000 (13:38 +0100)]
libstdc++: Avoid -Wmaybe-uninitialized warning in std::stop_source [PR109339]
We pass a const-reference to *this before it's constructed, and GCC
assumes that all const-references are accessed. Add the access attribute
to say it's not accessed.
libstdc++-v3/ChangeLog:
PR libstdc++/109339
* include/std/stop_token (_Stop_state_ptr(const stop_source&)):
Add attribute access with access-mode 'none'.
* testsuite/30_threads/stop_token/stop_source/109339.cc: New test.
Jonathan Wakely [Mon, 28 Nov 2022 09:44:52 +0000 (09:44 +0000)]
libstdc++: Make 16-bit std::subtract_with_carry_engine work [PR107466]
This implements the proposed resolution of LWG 3809, so that
std::subtract_with_carry_engine can be used with a 16-bit result_type.
Currently this produces a narrowing error when instantiating the
std::linear_congruential_engine to create the initial state. It also
truncates the default_seed constant when passing it as a result_type
argument.
Change the type of the constant to uint_least32_t and pass 0u when the
default_seed should be used.
libstdc++-v3/ChangeLog:
PR libstdc++/107466
* include/bits/random.h (subtract_with_carry_engine): Use 32-bit
type for default seed. Use 0u as default argument for
subtract_with_carry_engine(result_type) constructor and
seed(result_type) member function.
* include/bits/random.tcc (subtract_with_carry_engine): Use
32-bit type for default seed and engine used for initial state.
* testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc:
New test.
Jonathan Wakely [Wed, 26 Apr 2023 11:27:59 +0000 (12:27 +0100)]
libstdc++: Reduce Doxygen output for PDF
Including the header source code in the doxygen-generated PDF file makes
it too large, and causes pdflatex to run out of memory. If we only set
SOURCE_BROWSER=YES for the HTML docs then we won't include the sources
in the PDF file.
There are several macros defined for std::valarray that are only used to
generate repetitive code and then #undef'd. Those aren't useful in the
doxygen docs, especially the ones that reuse the same name in different
files. Omitting them avoids warnings about duplicate labels in the
refman.tex file.
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (SOURCE_BROWSER): Only set to YES for
HTML docs.
* include/bits/gslice_array.h (_DEFINE_VALARRAY_OPERATOR): Omit
from doxygen docs.
* include/bits/indirect_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/bits/mask_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/bits/slice_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/std/valarray (_DEFINE_VALARRAY_UNARY_OPERATOR)
(_DEFINE_VALARRAY_AUGMENTED_ASSIGNMENT)
(_DEFINE_VALARRAY_EXPR_AUGMENTED_ASSIGNMENT)
(_DEFINE_BINARY_OPERATOR): Likewise.
Roger Sayle [Tue, 10 Jan 2023 14:05:46 +0000 (14:05 +0000)]
PR rtl-optimization/106421: ICE in bypass_block from non-local goto.
This patch fixes PR rtl-optimization/106421, an ICE-on-valid (but
undefined) regression. The fix, as proposed by Richard Biener, is to
defend against BLOCK_FOR_INSN returning NULL in cprop's bypass_block.
2023-01-10 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/106421
* cprop.cc (bypass_block): Check that DEST is local to this
function (non-NULL) before calling find_edge.
gcc/testsuite/ChangeLog
PR rtl-optimization/106421
* gcc.dg/pr106421.c: New test case.
H.J. Lu [Mon, 16 Jan 2023 18:45:41 +0000 (10:45 -0800)]
x86: Disable -mforce-indirect-call for PIC in 32-bit mode
-mforce-indirect-call generates invalid instruction in 32-bit MI thunk
since there are no available scratch registers in 32-bit PIC mode.
Disable -mforce-indirect-call for PIC in 32-bit mode when generating
MI thunk.
gcc/
PR target/105980
* config/i386/i386.cc (x86_output_mi_thunk): Disable
-mforce-indirect-call for PIC in 32-bit mode.
gcc/testsuite/
PR target/105980
* g++.target/i386/pr105980.C: New test.
Jan Hubicka [Fri, 12 Aug 2022 14:25:28 +0000 (16:25 +0200)]
Fix invalid devirtualization when combining final keyword and anonymous types
this patch fixes a wrong code issue where we incorrectly devirtualize to
__builtin_unreachable. The problem occurs in combination of anonymous
namespaces and final keyword used on methods. We do two optimizations here
1) when reacing final method we cut the search for possible new targets
2) if the type is anonymous we detect whether it is ever instatiated by
looking if its vtable is referred to.
Now this goes wrong when thre is an anonymous type with final method that
is not instantiated while its derived type is. So if 1 triggers we need
to make 2 to look for vtables of all derived types as done by this patch.
Bootstrpaped/regtested x86_64-linux
Honza
gcc/ChangeLog:
2022-08-10 Jan Hubicka <hubicka@ucw.cz>
PR middle-end/106057
* ipa-devirt.cc (type_or_derived_type_possibly_instantiated_p): New
function.
(possible_polymorphic_call_targets): Use it.
gcc/testsuite/ChangeLog:
2022-08-10 Jan Hubicka <hubicka@ucw.cz>
PR middle-end/106057
* g++.dg/tree-ssa/pr101839.C: New test.
Martin Jambor [Wed, 26 Apr 2023 16:38:39 +0000 (18:38 +0200)]
ipa: Fix double reference-count decrements for the same edge (PR 107769, PR 109318)
It turns out that since addition of the code that can identify globals
which are only read from, the code that keeps track of the references
can decrement their count for the same calls, once during IPA-CP and
then again during inlining. Fixed by adding a special flag to the
pass-through variant and simply wiping out the reference to the
refdesc structure from the constant ones.
Moreover, during debugging of the issue I have discovered that the
code removing references could remove a reference associated with the
same statement but of a wrong type. In all cases it wanted to remove
an IPA_REF_ADDR reference so removing a lesser one instead should do
no harm in practice, but we should try to be consistent and so this
patch extends symtab_node::find_reference so that it searches for a
reference of a given type only.
gcc/ChangeLog:
2023-04-14 Martin Jambor <mjambor@suse.cz>
PR ipa/107769
PR ipa/109318
* cgraph.h (symtab_node::find_reference): Add parameter use_type.
* ipa-prop.h (ipa_pass_through_data): New flag refdesc_decremented.
(ipa_zap_jf_refdesc): New function.
(ipa_get_jf_pass_through_refdesc_decremented): Likewise.
(ipa_set_jf_pass_through_refdesc_decremented): Likewise.
* ipa-cp.cc (ipcp_discover_new_direct_edges): Provide a value for
the new parameter of find_reference.
(adjust_references_in_caller): Likewise. Make sure the constant jump
function is not used to decrement a refdec counter again. Only
decrement refdesc counters when the pass_through jump function allows
it. Added a detailed dump when decrementing refdesc counters.
* ipa-prop.cc (ipa_print_node_jump_functions_for_edge): Dump new flag.
(ipa_set_jf_simple_pass_through): Initialize the new flag.
(ipa_set_jf_unary_pass_through): Likewise.
(ipa_set_jf_arith_pass_through): Likewise.
(remove_described_reference): Provide a value for the new parameter of
find_reference.
(update_jump_functions_after_inlining): Zap refdesc of new jfunc if
the previous pass_through had a flag mandating that we do so.
(propagate_controlled_uses): Likewise. Only decrement refdesc
counters when the pass_through jump function allows it.
(ipa_edge_args_sum_t::duplicate): Provide a value for the new
parameter of find_reference.
(ipa_write_jump_function): Assert the new flag does not have to be
streamed.
* symtab.cc (symtab_node::find_reference): Add parameter use_type, use
it in searching.
Jakub Jelinek [Tue, 25 Apr 2023 12:20:51 +0000 (14:20 +0200)]
powerpc: Fix up *branch_anddi3_dot for -m32 -mpowerpc64 [PR109566]
The following testcase reduced from newlib ICEs on powerpc-linux,
with -O2 -m32 -mpowerpc64 since r12-6433 PR102239 optimization was
added and on the original testcase since some ranger improvements in
GCC 13 made it no longer latent on newlib.
The problem is that the *branch_anddi3_dot define_insn_and_split
relies on the *rotldi3_mask_dot define_insn_and_split being recognized
during splitting. The rs6000_is_valid_rotate_dot_mask function checks whether
the mask is a CONST_INT which is a valid mask, but *rotl<mode>3_mask_dot in
addition to checking that it is a valid mask also has
(<MODE>mode == Pmode || UINTVAL (operands[3]) <= 0x7fffffff)
test in the condition. For TARGET_64BIT that doesn't add any further
requirements, but for !TARGET_64BIT && TARGET_POWERPC64 if the AND
second operand is larger than INT_MAX it will not be recognized.
The rs6000_is_valid_rotate_dot_mask function is used solely in one spot,
condition of *branch_anddi3_dot, so the following patch adjusts it
to check for that as well.
2023-04-25 Jakub Jelinek <jakub@redhat.com>
PR target/109566
* config/rs6000/rs6000.cc (rs6000_is_valid_rotate_dot_mask): For
!TARGET_64BIT, don't return true if UINTVAL (mask) << (63 - nb)
is larger than signed int maximum.
Richard Biener [Tue, 25 Apr 2023 12:56:44 +0000 (14:56 +0200)]
tree-optimization/109609 - correctly interpret arg size in fnspec
By majority vote and a hint from the API name which is
arg_max_access_size_given_by_arg_p this interprets a memory access
size specified as given as other argument such as for strncpy
in the testcase which has "1cO313" as specifying the _maximum_
size read/written rather than the exact size. There are two
uses interpreting it that way already and one differing. The
following adjusts the differing and clarifies the documentation.
PR tree-optimization/109609
* attr-fnspec.h (arg_max_access_size_given_by_arg_p):
Clarify semantics.
* tree-ssa-alias.cc (check_fnspec): Correctly interpret
the size given by arg_max_access_size_given_by_arg_p as
maximum, not exact, size.
Richard Biener [Mon, 24 Apr 2023 11:31:07 +0000 (13:31 +0200)]
rtl-optimization/109585 - alias analysis typo
When r10-514-gc6b84edb6110dd2b4fb improved access path analysis
it introduced a typo that triggers when there's an access to a
trailing array in the first access path leading to false
disambiguation.
Richard Biener [Fri, 21 Apr 2023 10:57:17 +0000 (12:57 +0200)]
tree-optimization/109573 - avoid ICEing on unexpected live def
The following relaxes the assert in vectorizable_live_operation
where we catch currently unhandled cases to also allow an
intermediate copy as it happens here but also relax the assert
to checking only.
PR tree-optimization/109573
* tree-vect-loop.cc (vectorizable_live_operation): Allow
unhandled SSA copy as well. Demote assert to checking only.
Eric Botcazou [Tue, 25 Apr 2023 08:46:16 +0000 (10:46 +0200)]
Remove obsolete configure code in gnattools
It was recently pointed out that we generate symbolic links to ghost files
when building the GNAT tools, as the mlib-tgt-specific-*.adb files are gone.
Harald Anlauf [Tue, 11 Apr 2023 14:44:32 +0000 (16:44 +0200)]
Fortran: resolve correct generic with TYPE(C_PTR) arguments [PR61615,PR99982]
gcc/fortran/ChangeLog:
PR fortran/61615
PR fortran/99982
* interface.cc (compare_parameter): Enable type and rank checks for
arguments of derived type from the intrinsic module ISO_C_BINDING.
gcc/testsuite/ChangeLog:
PR fortran/61615
PR fortran/99982
* gfortran.dg/interface_49.f90: New test.
Jonathan Wakely [Thu, 20 Apr 2023 20:02:22 +0000 (21:02 +0100)]
libstdc++: Optimize std::try_facet and std::use_facet [PR103755]
The std::try_facet and std::use_facet functions were optimized in r13-3888-gb3ac43a3c05744 to avoid redundant checking for all facets that
are required to always be present in every locale.
This performs a simpler version of the optimization that only applies to
std::ctype<char>, std::num_get<char>, std::num_put<char>, and the
wchar_t specializations of those facets. Those are the facets that are
cached by std::basic_ios, which means they're used on construction for
every iostream object. This smaller change is suitable for the gcc-12
branch, and mitigates the performance loss for powerpc64le-linux on the
gcc-12 branch caused by r12-9454-g24cf9f4c6f45f7 for PR 103387. It also
greatly improves the performance of constructing iostreams objects, for
all targets.
libstdc++-v3/ChangeLog:
PR libstdc++/103755
* include/bits/locale_classes.tcc (try_facet, use_facet): Do not
check array index or dynamic type when accessing required
specializations of std::ctype, std::num_get, or std::num_put.
* testsuite/22_locale/ctype/is/string/89728_neg.cc: Adjust
expected errors.
rs6000: correct vector sign extend builtins on Big Endian
gcc/
PR target/108812
* config/rs6000/vsx.md (vsx_sign_extend_qi_<mode>): Rename to...
(vsx_sign_extend_v16qi_<mode>): ... this.
(vsx_sign_extend_hi_<mode>): Rename to...
(vsx_sign_extend_v8hi_<mode>): ... this.
(vsx_sign_extend_si_v2di): Rename to...
(vsx_sign_extend_v4si_v2di): ... this.
(vsignextend_qi_<mode>): Remove.
(vsignextend_hi_<mode>): Remove.
(vsignextend_si_v2di): Remove.
(vsignextend_v2di_v1ti): Remove.
(*xxspltib_<mode>_split): Replace gen_vsx_sign_extend_qi_v2di with
gen_vsx_sign_extend_v16qi_v2di and gen_vsx_sign_extend_qi_v4si
with gen_vsx_sign_extend_v16qi_v4si.
* config/rs6000/rs6000.md (split for DI constant generation):
Replace gen_vsx_sign_extend_qi_si with gen_vsx_sign_extend_v16qi_si.
(split for HSDI constant generation): Replace gen_vsx_sign_extend_qi_di
with gen_vsx_sign_extend_v16qi_di and gen_vsx_sign_extend_qi_si
with gen_vsx_sign_extend_v16qi_si.
* config/rs6000/rs6000-builtins.def (__builtin_altivec_vsignextsb2d):
Set bif-pattern to vsx_sign_extend_v16qi_v2di.
(__builtin_altivec_vsignextsb2w): Set bif-pattern to
vsx_sign_extend_v16qi_v4si.
(__builtin_altivec_visgnextsh2d): Set bif-pattern to
vsx_sign_extend_v8hi_v2di.
(__builtin_altivec_vsignextsh2w): Set bif-pattern to
vsx_sign_extend_v8hi_v4si.
(__builtin_altivec_vsignextsw2d): Set bif-pattern to
vsx_sign_extend_si_v2di.
(__builtin_altivec_vsignext): Set bif-pattern to
vsx_sign_extend_v2di_v1ti.
* config/rs6000/rs6000-builtin.cc (lxvrse_expand_builtin): Replace
gen_vsx_sign_extend_qi_v2di with gen_vsx_sign_extend_v16qi_v2di,
gen_vsx_sign_extend_hi_v2di with gen_vsx_sign_extend_v8hi_v2di and
gen_vsx_sign_extend_si_v2di with gen_vsx_sign_extend_v4si_v2di.
gcc/testsuite/
PR target/108812
* gcc.target/powerpc/p9-sign_extend-runnable.c: Set corresponding
expected vectors for Big Endian.
* gcc.target/powerpc/int_128bit-runnable.c: Likewise.
Jonathan Wakely [Tue, 29 Nov 2022 15:50:06 +0000 (15:50 +0000)]
libstdc++: Avoid bogus warning in std::vector::insert [PR107852]
GCC assumes that any global variable might be modified by operator new,
and so in the testcase for this PR all data members get reloaded after
allocating new storage. By making local copies of the _M_start and
_M_finish members we avoid that, and then the compiler has enough info
to remove the dead branches that trigger bogus -Warray-bounds warnings.
libstdc++-v3/ChangeLog:
PR libstdc++/107852
PR libstdc++/106199
PR libstdc++/100366
* include/bits/vector.tcc (vector::_M_fill_insert): Copy
_M_start and _M_finish members before allocating.
(vector::_M_default_append): Likewise.
(vector::_M_range_insert): Likewise.
Jonathan Wakely [Tue, 4 Apr 2023 11:04:14 +0000 (12:04 +0100)]
libstdc++: Fix outdated docs about demangling exception messages
The string returned by std::bad_exception::what() hasn't been a mangled
name since PR libstdc++/14493 was fixed for GCC 4.2.0, so remove the
docs showing how to demangle it.
libstdc++-v3/ChangeLog:
* doc/xml/manual/extensions.xml: Remove std::bad_exception from
example program.
* doc/html/manual/ext_demangling.html: Regenerate.
Jonathan Wakely [Tue, 28 Mar 2023 20:07:21 +0000 (21:07 +0100)]
libstdc++: Do not use facets cached in ios for ALT128 build [PR103387]
For the powerpc64le build with two different long double
representations, we cannot use the ios_base::_M_num_put and
ios_base::_M_num_get pointers, because they might have been initialized
in a translation unit using the other long double type. With the changes
to add __try_use_facet to GCC 13 the cache isn't really needed now, we
can just access the right facet in the locale directly, without needing
RTTI checks.
libstdc++-v3/ChangeLog:
PR libstdc++/103387
* include/bits/istream.tcc (istream::_M_extract(ValueT&)): Use
std::use_facet instead of cached _M_num_get facet.
(istream::operator>>(short&)): Likewise.
(istream::operator>>(int&)): Likewise.
* include/bits/ostream.tcc (ostream::_M_insert(ValueT)): Use
std::use_facet instead of cached _M_num_put facet.
Jonathan Wakely [Wed, 22 Mar 2023 21:54:24 +0000 (21:54 +0000)]
libstdc++: Fix assigning nullptr to std::atomic<shared_ptr<T>> (LWG 3893)
LWG voted this to Tentatively Ready recently.
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_atomic.h (atomic::operator=(nullptr_t)):
Add overload, as per LWG 3893.
* testsuite/20_util/shared_ptr/atomic/atomic_shared_ptr.cc:
Check assignment from nullptr.
Jonathan Wakely [Wed, 29 Mar 2023 21:43:16 +0000 (22:43 +0100)]
libstdc++: Apply small fix from LWG 3843 to std::expected
LWG 3843 adds some type requirements to std::expected::value to ensure
that it can correctly copy the error value if it needs to throw an
exception. We don't need to do anything to enforce that, because it will
already be ill-formed if the type can't be copied. The issue also makes
a small drive-by fix to ensure that a const E& is copied from the
non-const value()& overload, which this change implements.
libstdc++-v3/ChangeLog:
* include/std/expected (expected::value() &): Use const lvalue
for unex member passed to bad_expected_access constructor, as
per LWG 3843.
My earlier patch for 108099 made us accept this non-standard pattern but
messed up the semantics, so that e.g. unsigned __int128_t was not a 128-bit
type.
PR c++/108099
gcc/cp/ChangeLog:
* decl.cc (grokdeclarator): Keep typedef_decl for __int128_t.
Jason Merrill [Sat, 15 Apr 2023 02:40:43 +0000 (22:40 -0400)]
c++: constexpr aggregate destruction [PR109357]
We were assuming that the result of evaluation of TARGET_EXPR_INITIAL would
always be the new value of the temporary, but that's not necessarily true
when the initializer is complex (i.e. target_expr_needs_replace). In that
case evaluating the initializer initializes the temporary as a side-effect.
PR c++/109357
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_constant_expression) [TARGET_EXPR]:
Check for complex initializer.
Jason Merrill [Thu, 23 Mar 2023 19:57:39 +0000 (15:57 -0400)]
c-family: -Wsequence-point and COMPONENT_REF [PR107163]
The patch for PR91415 fixed -Wsequence-point to treat shifts and ARRAY_REF
as sequenced in C++17, and COMPONENT_REF as well. But this is unnecessary
for COMPONENT_REF, since the RHS is just a FIELD_DECL with no actual
evaluation, and in this testcase handling COMPONENT_REF as sequenced blows
up fast in a deep inheritance tree. Instead, look through it.
PR c++/107163
gcc/c-family/ChangeLog:
* c-common.cc (verify_tree): Don't use sequenced handling
for COMPONENT_REF.
The default argument code in type_unification_real was assuming that all
targs we've deduced by that point are non-dependent, but that's not the case
for partial ordering.
PR c++/105481
gcc/cp/ChangeLog:
* pt.cc (type_unification_real): Adjust for partial ordering.
Jason Merrill [Thu, 23 Mar 2023 20:50:09 +0000 (16:50 -0400)]
c++: constexpr PMF conversion [PR105996]
Here, we were calling build_reinterpret_cast regardless of whether there was
actually a cast, and that now sets REINTERPRET_CAST_P. But that
optimization seems dodgy anyway, as it involves NOP_EXPR from one
RECORD_TYPE to another and we try to reserve NOP_EXPR for fundamental types.
And the generated code seems the same, so let's drop it. And also strip
location wrappers.
PR c++/105996
gcc/cp/ChangeLog:
* typeck.cc (build_ptrmemfunc): Drop 0-offset optimization
and location wrappers.
Jason Merrill [Sat, 18 Mar 2023 12:27:26 +0000 (08:27 -0400)]
c++: DMI in template with virtual base [PR106890]
When parsing a default member init we just build a CONVERT_EXPR for
converting to a virtual base, and then expand that into the more complex
form when we actually use the DMI in a constructor. But that wasn't working
for the template case where we are considering the conversion at the point
that the constructor needs the DMI instantiation, so it seemed like we were
in a constructor already. And then when the other constructor tries to
reuse the instantiation, it sees uses of the first constructor's parameters,
and dies. So ensure that we get the CONVERT_EXPR in this case, too.
PR c++/106890
gcc/cp/ChangeLog:
* init.cc (maybe_instantiate_nsdmi_init): Don't leave
current_function_decl set to a constructor.
Jason Merrill [Fri, 17 Mar 2023 21:26:40 +0000 (17:26 -0400)]
c++: constant, array, lambda, template [PR108975]
When a lambda refers to a constant local variable in the enclosing scope, we
tentatively capture it, but if we end up pulling out its constant value, we
go back at the end of the lambda and prune any unneeded captures. Here
while parsing the template we decided that the dim capture was unneeded,
because we folded it away, but then we brought back the use in the template
trees that try to preserve the source representation with added type info.
So then when we tried to instantiate that use, we couldn't find what it was
trying to use, and crashed.
Fixed by not trying to prune when parsing a template; we'll prune at
instantiation time.
PR c++/108975
gcc/cp/ChangeLog:
* lambda.cc (prune_lambda_captures): Don't bother in a template.
Jason Merrill [Fri, 17 Mar 2023 13:43:48 +0000 (09:43 -0400)]
c++: namespace-scoped friend in local class [PR69410]
do_friend was only considering class-qualified identifiers for the
qualified-id case, but we also need to skip local scope when there's an
explicit namespace scope.
PR c++/69410
gcc/cp/ChangeLog:
* friend.cc (do_friend): Handle namespace as scope argument.
* decl.cc (grokdeclarator): Pass down in_namespace.
Jason Merrill [Thu, 16 Mar 2023 19:11:25 +0000 (15:11 -0400)]
c++: generic lambda, local class, __func__ [PR108242]
Here we are trying to do name lookup in a deferred instantiation of t() and
failing to find __func__. tsubst_expr already tries to instantiate members
of local classes, but was failing with the partial instantiation of generic
lambdas.