Jonathan Wakely [Tue, 4 May 2021 22:31:48 +0000 (23:31 +0100)]
libstdc++: Optimize std::visit for the common case [PR 78113]
GCC does not do a good job of optimizing the table of function pointers
used for variant visitation. This avoids using the table for the common
case of visiting a single variant with a small number of alternative
types. Instead we use:
switch(v.index())
{
case 0: return visitor(get<0>(v));
case 1: return visitor(get<1>(v));
...
}
It's not quite that simple, because get<1>(v) is ill-formed if the
variant only has one alternative, and similarly for each get<N>. We
need to ensure each case only applies the visitor if the index is in
range for the actual type we're dealing with, and tell the compiler that
the case is unreachable otherwise. We also need to invoke the visitor
via the __gen_vtable_impl::__visit_invoke function, to handle the raw
visitation cases used to implement std::variant assignments and
comparisons.
Because that gets quite verbose and repetitive, a macro is used to stamp
out the cases.
We also need to handle the valueless_by_exception case, but only for raw
visitation, because std::visit already checks for it before calling
__do_visit.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/78113
* include/std/variant (__do_visit): Use a switch when we have a
single variant with a small number of alternatives.
Implement the changes from P2162R2 (as a DR for C++17).
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/90943
* include/std/variant (__cpp_lib_variant): Update value.
(__detail::__variant::__as): New helpers implementing the
as-variant exposition-only function templates.
(visit, visit<R>): Use __as to upcast the variant parameters.
* include/std/version (__cpp_lib_variant): Update value.
* testsuite/20_util/variant/visit_inherited.cc: New test.
This uses C++11 features to simplify the definition of the
__normal_iterator constructor that allows converting from iterator to
const_iterator. The previous definition relied on _Container::pointer
which is present in std::vector and std::basic_string, but is not
actually part of the container requirements.
Removing the use of _Container::pointer and defining it in terms of
is_convertible allows __normal_iterator to be used with new container
types which do not define a pointer member. Specifically, this will
allow it to be used in std::basic_stacktrace.
In theory this will enable some conversions which were not previously
permitted, for example __normal_iterator<volatile T*, vector<T>> can
now be converted to __normal_iterator<const volatile T*, vector<T>>.
In practice this doesn't matter because the library never uses such
types. In any case, allowing those conversions is consistent with
the corresponding constructors of std::reverse_iterator and
std::move_iterator.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (__normal_iterator): Simplify
converting constructor and do not require _Container::pointer.
Jonathan Wakely [Fri, 30 Apr 2021 14:04:34 +0000 (15:04 +0100)]
libstdc++: Make move ctor noexcept for fully-dynamic string
The move constructor for the "fully-dynamic" COW string is not noexcept,
because it allocates a new empty string rep for the moved-from string.
However, there is no need to do that, because the moved-from string does
not have to be left empty. Instead, implement move construction for the
fully-dynamic case as a reference count increment, so the string is
shared.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/cow_string.h [_GLIBCXX_FULLY_DYNAMIC_STRING]
(basic_string(basic_string&&)): Add noexcept and avoid
allocation, by sharing rep with the rvalue string.
Jonathan Wakely [Wed, 28 Apr 2021 10:40:47 +0000 (11:40 +0100)]
libstdc++: Use conditional noexcept in std::reverse_iterator [PR 94418]
This adds a noexcept-specifier to each constructor and assignment
operator of std::reverse_iterator so that they are noexcept when the
corresponding operation on the underlying iterator is noexcept.
The std::reverse_iterator class template already requires that the
operations on the underlying type are valid, so we don't need to use the
std::is_nothrow_xxx traits to protect against errors when the expression
isn't even valid. We can just use a noexcept operator to test if the
expression can throw, without the overhead of redundantly checking if
the initialization/assignment would be valid.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/94418
* include/bits/stl_iterator.h (reverse_iterator): Use
conditional noexcept on constructors and assignment operators.
* testsuite/24_iterators/reverse_iterator/noexcept.cc: New test.
Jonathan Wakely [Tue, 20 Apr 2021 15:16:13 +0000 (16:16 +0100)]
libstdc++: Do not allocate a zero-size vector<bool> [PR 100153]
The vector<bool>::shrink_to_fit() implementation will allocate new
storage even if the vector is empty. That then leads to the
end-of-storage pointer being non-null and equal to the _M_start._M_p
pointer, which means that _M_end_addr() has undefined behaviour.
The fix is to stop doing a useless zero-sized allocation in
shrink_to_fit(), so that _M_start._M_p and _M_end_of_storage are both
null after an empty vector shrinks.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/100153
* include/bits/vector.tcc (vector<bool>::_M_shrink_to_fit()):
When size() is zero just deallocate and reset.
Jonathan Wakely [Sat, 17 Apr 2021 21:34:09 +0000 (22:34 +0100)]
libstdc++: Implement std::clamp with std::min and std::max [PR 96733]
The compiler doesn't know about the precondition of std::clamp that
(hi < lo) is false, and so can't optimize as well as we'd like. By using
std::min and std::max we help the compiler.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/96733
* include/bits/stl_algo.h (clamp): Use std::min and std::max.
Tobias Burnus [Fri, 1 Oct 2021 18:02:23 +0000 (20:02 +0200)]
Add/update libgomp.fortran/alloc-*.f90
libgomp/ChangeLog:
* testsuite/libgomp.fortran/alloc-10.f90: Fix alignment check.
* testsuite/libgomp.fortran/alloc-7.f90: Fix array access.
* testsuite/libgomp.fortran/alloc-8.f90: Likewise.
* testsuite/libgomp.fortran/alloc-11.f90: New test for omp_realloc,
based on libgomp.c-c++-common/alloc-9.c.
PR c/102103
* g++.dg/cpp0x/constexpr-array-ptr10.C: Suppress a valid warning.
* g++.dg/warn/Wreturn-local-addr-6.C: Correct a cast.
* gcc.dg/Waddress.c: Expect a warning.
* c-c++-common/Waddress-3.c: New test.
* c-c++-common/Waddress-4.c: New test.
* g++.dg/warn/Waddress-5.C: New test.
* g++.dg/warn/Waddress-6.C: New test.
* g++.dg/warn/pr101219.C: Expect a warning.
* gcc.dg/Waddress-3.c: New test.
tsan: don't print __tsan_atomic* functions in report stacks
Currently __tsan_atomic* functions do FuncEntry/Exit using caller PC
and then use current PC (pointing to __tsan_atomic* itself) during
memory access handling. As the result the top function in reports
involving atomics is __tsan_atomic* and the next frame points to user code.
Remove FuncEntry/Exit in atomic functions and use caller PC
during memory access handling. This removes __tsan_atomic*
from the top of report stacks, so that they point right to user code.
The motivation for this is performance.
Some atomic operations are very hot (mostly loads),
so removing FuncEntry/Exit is beneficial.
This also reduces thread trace consumption (1 event instead of 3).
__tsan_atomic* at the top of the stack is not necessary
and does not add any new information. We already say
"atomic write of size 4", "__tsan_atomic32_store" does not add
anything new.
It also makes reports consistent between atomic and non-atomic
accesses. For normal accesses we say "previous write" and point
to user code; for atomics we say "previous atomic write" and now
also point to user code.
tsan: avoid extra call indirection in unaligned access functions
Currently unaligned access functions are defined in tsan_interface.cpp
and do a real call to MemoryAccess. This means we have a real call
and no read/write constant propagation.
Unaligned memory access can be quite hot for some programs
(observed on some compression algorithms with ~90% of unaligned accesses).
Move them to tsan_interface_inl.h to avoid the additional call
and enable constant propagation.
Also reorder the actual store and memory access handling for
__sanitizer_unaligned_store callbacks to enable tail calling
in MemoryAccess.
The updated lots_of_threads.c test with 300 threads
started running for too long on machines with low
hardware parallelism (e.g. taskset -c 0-1).
On lots of CPUs it finishes in ~2 secs. But with
taskset -c 0-1 it runs for hundreds of seconds
effectively spinning in the barrier in the sleep loop.
We now have the handy futex API in sanitizer_common.
Use it instead of the passive spin loop.
It makes the test run only faster with taskset -c 0-1,
it runs for ~1.5 secs, while with full parallelism
it still runs for ~2 secs (but consumes less CPU time).
qingzhe huang [Fri, 1 Oct 2021 14:46:35 +0000 (10:46 -0400)]
c++: cv-qualified ref introduced by typedef [PR101783]
The root cause of this bug is that it considers reference with
cv-qualifiers as an error by generating value for variable "bad_quals".
However, this is not correct for case of typedef. Here I quote spec
[dcl.ref]/1 :
"Cv-qualified references are ill-formed except when the cv-qualifiers
are introduced through the use of a typedef-name ([dcl.typedef],
[temp.param]) or decltype-specifier ([dcl.type.decltype]),
in which case the cv-qualifiers are ignored."
Jonathan Wakely [Fri, 1 Oct 2021 13:06:42 +0000 (14:06 +0100)]
libstdc++: Define basic_regex::multiline for non-strict modes
The regex_constants::multiline constant is defined for non-strict C++11
and C++14 modes, on the basis that the feature is a DR (even though it
was really a new feature addition to C++17 and probably shouldn't have
gone through the issues list).
This makes the basic_regex::multiline constant defined consistently with
the regex_constants::multiline one.
For strict C++11 and C++14 mode we don't define them, because multiline
is not a reserved name in those standards.
libstdc++-v3/ChangeLog:
* include/bits/regex.h (basic_regex::multiline): Define for
non-strict C++11 and C++14 modes.
* include/bits/regex_constants.h (regex_constants::multiline):
Add _GLIBCXX_RESOLVE_LIB_DEFECTS comment.
Jonathan Wakely [Thu, 30 Sep 2021 13:39:36 +0000 (14:39 +0100)]
libstdc++: Add noexcept to istream_iterator and ostream_iterator
libstdc++-v3/ChangeLog:
* include/bits/stream_iterator.h (istream_iterator): Add
noexcept to constructors and non-throwing member functions and
friend functions.
(ostream_iterator): Likewise.
Jonathan Wakely [Thu, 30 Sep 2021 10:25:15 +0000 (11:25 +0100)]
libstdc++: Fix _ForwardIteratorConcept for __gnu_debug::vector<bool>
The recent changes to the _GLIBCXX_CONCEPT_CHECKS checks for forward
iterators don't work for vector<bool> iterators in debug mode, because
the _Safe_iterator specializations don't match the special cases I added
for _Bit_iterator and _Bit_const_iterator.
This refactors the _ForwardIteratorReferenceConcept class template to
identify vector<bool> iterators using a new trait, which also works for
debug iterators.
libstdc++-v3/ChangeLog:
* include/bits/boost_concept_check.h (_Is_vector_bool_iterator):
New trait to identify vector<bool> iterators, including debug
ones.
(_ForwardIteratorReferenceConcept): Add default template
argument using _Is_vector_bool_iterator and use it in partial
specialization for the vector<bool> cases.
(_Mutable_ForwardIteratorReferenceConcept): Likewise.
* testsuite/24_iterators/operations/prev_neg.cc: Adjust dg-error
line number.
Jonathan Wakely [Wed, 29 Sep 2021 19:46:55 +0000 (20:46 +0100)]
libstdc++: Replace try-catch in std::list::merge to avoid O(N) size
The current std::list::merge code calls size() before starting to merge
any elements, so that the _M_size members can be updated after the merge
finishes. The work is done in a try-block so that the sizes can still be
updated in an exception handler if any element comparison throws.
The _M_size members only exist for the cxx11 ABI, so the initial call to
size() and the try-catch are only needed for that ABI. For the old ABI
the size() call performs an O(N) list traversal to get a value that
isn't even used, and catching exceptions just to rethrow them isn't
needed either.
This refactors the merge functions to remove the try-catch block and use
an RAII type instead. For the cxx11 ABI that type's destructor updates
the list sizes, and for the old ABI it's a no-op.
libstdc++-v3/ChangeLog:
* include/bits/list.tcc (list::merge): Remove call to size() and
try-catch block. Use _Finalize_merge instead.
* include/bits/stl_list.h (list::_Finalize_merge): New
scope guard type to update _M_size members after a merge.
Aldy Hernandez [Fri, 1 Oct 2021 10:27:55 +0000 (12:27 +0200)]
Remove shadowed oracle field.
The m_oracle field in the path solver was shadowing the base class.
This was causing subtle problems while calculating outgoing edges
between blocks, because the query object being passed did not have an
oracle set.
Jakub Jelinek [Fri, 1 Oct 2021 12:27:32 +0000 (14:27 +0200)]
ubsan: Move INT_MIN / -1 instrumentation from -fsanitize=integer-divide-by-zero to -fsanitize=signed-integer-overflow [PR102515]
As noted by Richi, in clang INT_MIN / -1 is instrumented under
-fsanitize=signed-integer-overflow rather than
-fsanitize=integer-divide-by-zero as we did and doing it in the former
makes more sense, as it is overflow during division rather than division
by zero.
I've verified on godbolt that clang behaved that way since 3.2-ish times or
so when sanitizers were added.
Furthermore, we've been using
-f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float
-fsanitize=float-divide-by-zero instrumentation _abort suffix.
The case where INT_MIN / -1 is instrumented by one sanitizer and
x / 0 by another one when both are enabled is slightly harder if
the -f{,no-}sanitize-recover={integer-divide-by-zero,signed-integer-overflow}
flags differ, then we need to emit both __ubsan_handle_divrem_overflow
and __ubsan_handle_divrem_overflow_abort calls guarded by their respective
checks rather than one guarded by check1 || check2.
2021-10-01 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
PR sanitizer/102515
gcc/
* doc/invoke.texi (-fsanitize=integer-divide-by-zero): Remove
INT_MIN / -1 division detection from here ...
(-fsanitize=signed-integer-overflow): ... and add it here.
gcc/c-family/
* c-ubsan.c (ubsan_instrument_division): Check the right
flag_sanitize_recover bit, depending on which sanitization
is done. Sanitize INT_MIN / -1 under SANITIZE_SI_OVERFLOW
rather than SANITIZE_DIVIDE. If both SANITIZE_SI_OVERFLOW
and SANITIZE_DIVIDE is enabled, neither check is known
to be false and flag_sanitize_recover bits for those two
aren't the same, emit both __ubsan_handle_divrem_overflow
and __ubsan_handle_divrem_overflow_abort calls.
gcc/c/
* c-typeck.c (build_binary_op): Call ubsan_instrument_division
for division even for SANITIZE_SI_OVERFLOW.
gcc/cp/
* typeck.c (cp_build_binary_op): Call ubsan_instrument_division
for division even for SANITIZE_SI_OVERFLOW.
gcc/testsuite/
* c-c++-common/ubsan/div-by-zero-3.c: Use
-fsanitize=signed-integer-overflow instead of
-fsanitize=integer-divide-by-zero.
* c-c++-common/ubsan/div-by-zero-5.c: Likewise.
* c-c++-common/ubsan/div-by-zero-4.c: Likewise. Add
-fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/float-div-by-zero-2.c: New test.
* c-c++-common/ubsan/overflow-div-1.c: New test.
* c-c++-common/ubsan/overflow-div-2.c: New test.
* c-c++-common/ubsan/overflow-div-3.c: New test.
Andrew Pinski [Fri, 1 Oct 2021 09:23:47 +0000 (09:23 +0000)]
Fix bb-slp-pr97709.c after computed goto change
Looks like I tested the change for bb-slp-pr97709.c on an
older tree which did not have the error message so I had
missed one more place where the change was needed.
Anyways committed after testing to make sure the testcase passes
now.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/bb-slp-pr97709.c: Fix for computed goto
pointers.
Martin Liska [Wed, 2 Jun 2021 06:44:37 +0000 (08:44 +0200)]
Append target/optimize attr to the current cmdline.
gcc/c-family/ChangeLog:
* c-common.c (parse_optimize_options): Combine optimize
options with what was provided on the command line.
gcc/ChangeLog:
* toplev.c (toplev::main): Save decoded optimization options.
* toplev.h (save_opt_decoded_options): New.
* doc/extend.texi: Be more clear about optimize and target
attributes.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable fast math.
* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
* gcc.target/i386/attr-optimize.c: New test.
Eric Botcazou [Fri, 1 Oct 2021 08:56:45 +0000 (10:56 +0200)]
Fix ICE with stack checking emulation at -O2
On bare-metal platforms, the Ada compiler emulates stack checking (it is
required by the language and tested by ACATS) in the runtime via the
stack_check_libfunc hook of the RTL middle-end. Calls to the function
are generated as libcalls but they now require a proper function type
at -O2 or above.
gcc/
* explow.c: Include langhooks.h.
(set_stack_check_libfunc): Build a proper function type.
Eric Botcazou [Fri, 1 Oct 2021 08:49:34 +0000 (10:49 +0200)]
Fix PR c++/64697 at -O1 or above
The BFD fix eliminates the link failure and working code is generated at
-O0, but _not_ when optimization is enabled because the optimizer changes:
movq .refptr._ZTH1s(%rip), %rax
testq %rax, %rax
je .L2
call _ZTH1s
into:
leaq _ZTH1s(%rip), %rax
testq %rax, %rax
je .L2
call _ZTH1s
and the leaq now also gets the relocation overflow. So the fix is to
teach legitimate_pic_address_disp_p to reject the transformation when
the symbol is an external weak function, which yields:
cmpq $0, .refptr._ZTH1s(%rip)
je .L2
call _ZTH1s
and the cmpq keeps a relocation that does not overflow.
gcc/
PR c++/64697
* config/i386/i386.c (legitimate_pic_address_disp_p): For PE-COFF do
not return true for external weak function symbols in medium model.
Jakub Jelinek [Fri, 1 Oct 2021 08:45:48 +0000 (10:45 +0200)]
openmp: Differentiate between order(concurrent) and order(reproducible:concurrent)
While OpenMP 5.1 implies order(concurrent) is the same thing as
order(reproducible:concurrent), this is going to change in OpenMP 5.2, where
essentially order(concurrent) means nothing is stated on whether it is
reproducible or unconstrained (and is determined by other means, e.g. for/do
with schedule static or runtime with static being selected is implicitly
reproducible, distribute with dist_schedule static is implicitly reproducible,
loop is implicitly reproducible) and when the modifier is specified explicitly,
it overrides the implicit behavior either way.
And, when order(reproducible:concurrent) is used with e.g. schedule(dynamic)
or some other schedule that is by definition not reproducible, it is
implementation's duty to ensure it is reproducible, either by remembering how
it scheduled some loop and then replaying the same schedule when seeing loops
with the same directive/schedule/number of iterations, or by overriding the
schedule to some reproducible one.
This patch doesn't implement the 5.2 wording just yet, but in the FEs
differentiates between the 3 states - no explicit modifier, explicit reproducible
or explicit unconstrainted, so that the middle-end can easily switch any time.
Instead it follows the 5.1 wording where both order(concurrent) (implicit or
explicit) or order(reproducible:concurrent) imply reproducibility.
And, it implements the easier method, when for/do should be reproducible, it
just chooses static schedule. order(concurrent) implies no OpenMP APIs in the
loop body nor threadprivate vars, so the exact scheduling isn't (easily at least)
observable.
2021-10-01 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_ORDER_REPRODUCIBLE): Define.
* tree-pretty-print.c (dump_omp_clause) <case OMP_CLAUSE_ORDER>: Print
reproducible: for OMP_CLAUSE_ORDER_REPRODUCIBLE.
* omp-general.c (omp_extract_for_data): If OMP_CLAUSE_ORDER is seen
without OMP_CLAUSE_ORDER_UNCONSTRAINED, overwrite sched_kind to
OMP_CLAUSE_SCHEDULE_STATIC.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Also copy
OMP_CLAUSE_ORDER_REPRODUCIBLE.
gcc/c/
* c-parser.c (c_parser_omp_clause_order): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier.
gcc/cp/
* parser.c (cp_parser_omp_clause_order): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier.
gcc/fortran/
* gfortran.h (gfc_omp_clauses): Add order_reproducible bitfield.
* dump-parse-tree.c (show_omp_clauses): Print REPRODUCIBLE: for it.
* openmp.c (gfc_match_omp_clauses): Set order_reproducible for
explicit reproducible: modifier.
* trans-openmp.c (gfc_trans_omp_clauses): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for order_reproducible.
(gfc_split_omp_clauses): Also copy order_reproducible.
gcc/testsuite/
* gfortran.dg/gomp/order-5.f90: Adjust scan-tree-dump-times regexps.
libgomp/
* testsuite/libgomp.c-c++-common/order-reproducible-1.c: New test.
* testsuite/libgomp.c-c++-common/order-reproducible-2.c: New test.
Jakub Jelinek [Fri, 1 Oct 2021 08:30:16 +0000 (10:30 +0200)]
c++: Fix handling of __thread/thread_local extern vars declared at function scope [PR102496]
The introduction of push_local_extern_decl_alias in r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a
broke tls vars, while the decl they are created for has the tls model
set properly, nothing sets it for the alias that is actually used,
so accesses to it are done as if they were normal variables.
This is then diagnosed at link time if the definition of the extern
vars is __thread/thread_local.
2021-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/102496
* name-lookup.c (push_local_extern_decl_alias): Return early even for
tls vars with non-dependent type when processing_template_decl. For
CP_DECL_THREAD_LOCAL_P vars call set_decl_tls_model on alias.
* g++.dg/tls/pr102496-1.C: New test.
* g++.dg/tls/pr102496-2.C: New test.
Richard Biener [Thu, 30 Sep 2021 13:05:53 +0000 (15:05 +0200)]
middle-end/102518 - avoid invalid GIMPLE during inlining
When inlining we have to avoid mapping a non-lvalue parameter
value into a context that prevents the parameter to be a register.
Formerly the register were TREE_ADDRESSABLE but now it can be
just DECL_NOT_GIMPLE_REG_P.
2021-09-30 Richard Biener <rguenther@suse.de>
PR middle-end/102518
* tree-inline.c (setup_one_parameter): Avoid substituting
an invariant into contexts where a GIMPLE register is not valid.
Bob Duff [Tue, 17 Aug 2021 15:02:51 +0000 (11:02 -0400)]
[Ada] Subprogram_Variant in ignored ghost code
gcc/ada/
* exp_ch6.adb (Expand_Call_Helper): Do not call
Check_Subprogram_Variant if the subprogram is an ignored ghost
entity. Otherwise the compiler crashes (in debug builds) or
gives strange error messages (in production builds).
Steve Baird [Thu, 12 Aug 2021 23:55:36 +0000 (16:55 -0700)]
[Ada] Improved checking for invalid index values when accessing array elements
gcc/ada/
* checks.ads: Define a type Dimension_Set. Add an out-mode
parameter of this new type to Generate_Index_Checks so that
callers can know for which dimensions a check was generated. Add
an in-mode parameter of this new type to
Apply_Subscript_Validity_Checks so that callers can indicate
that no check is needed for certain dimensions.
* checks.adb (Generate_Index_Checks): Implement new
Checks_Generated parameter.
(Apply_Subscript_Validity_Checks): Implement new No_Check_Needed
parameter.
* exp_ch4.adb (Expand_N_Indexed_Component): Call
Apply_Subscript_Validity_Checks in more cases than before. This
includes declaring two new local functions,
(Is_Renamed_Variable_Name,
Type_Requires_Subscript_Validity_Checks_For_Reads): To help in
deciding whether to call Apply_Subscript_Validity_Checks.
Adjust to parameter profile changes in Generate_Index_Checks and
Apply_Subscript_Validity_Checks.
Eric Botcazou [Fri, 13 Aug 2021 16:32:53 +0000 (18:32 +0200)]
[Ada] Document rounding mode assumed for dynamic floating-point computations
gcc/ada/
* doc/gnat_rm/implementation_defined_characteristics.rst: Document
the rounding mode assumed for dynamic computations as per 3.5.7(16).
* gnat_rm.texi: Regenerate.
Bob Duff [Thu, 12 Aug 2021 20:49:16 +0000 (16:49 -0400)]
[Ada] More work on efficiency improvements
gcc/ada/
* table.ads (Table_Type): Remove "aliased"; no longer needed by
Atree. Besides it contradicted the comment a few lines above,
"-- Note: We do not make the table components aliased...".
* types.ads: Move type Slot to Atree.
* atree.ads: Move type Slot fromt Types to here. Move type
Node_Header from Seinfo to here.
* atree.adb: Avoid the need for aliased components of the Slots
table. Instead of 'Access, use a getter and setter. Misc
cleanups.
(Print_Statistics): Print statistics about node and entity kind
frequencies. Give 3 digit fractions instead of percentages.
* (Get_Original_Node_Count, Set_Original_Node_Count): Statistics
for calls to Original_Node and Set_Original_Node.
(Original_Node, Set_Original_Node): Gather statistics by calling
the above.
(Print_Field_Statistics): Print Original_Node statistics.
(Update_Kind_Statistics): Remove, and put all statistics
gathering under "if Atree_Statistics_Enabled", which is a flag
generated in Seinfo by Gen_IL.
* gen_il-gen.adb (Compute_Field_Offsets): Choose offsets of
Nkind, Ekind, and Homonym first. This causes a slight efficiency
improvement. Misc cleanups. Do not generate Node_Header; it is
now hand-written in Atree. When choosing the order in which to
assign offsets, weight by the frequency of the node type, so the
more common nodes get their field offsets assigned earlier. Add
more special cases.
(Compute_Type_Sizes): Remove this and related things.
There was a comment: "At some point we can instrument Atree to
print out accurate size statistics, and remove this code." We
have Atree statistics, so we now remove this code.
(Put_Seinfo): Generate Atree_Statistics_Enabled, which is equal
to Statistics_Enabled. This allows Atree to say "if
Atree_Statistics_Enabled then <gather statistics>" for
efficiency. When Atree_Statistics_Enabled is False, the "if ..."
will be optimized away.
* gen_il-internals.ads (Type_Frequency): New table of kind
frequencies.
* gen_il-internals.adb: Minor comment improvement.
* gen_il-fields.ads: Remove unused subtypes. Suppress style
checks in the Type_Frequency table. If we regenerate this
table (see -gnatd.A) we don't want to have to fiddle with
casing.
* impunit.adb: Minor.
* sinfo-utils.adb: Minor.
* debug.adb: Minor comment improvement.
Ed Schonberg [Thu, 12 Aug 2021 14:39:21 +0000 (10:39 -0400)]
[Ada] Crash on improper use of GNAT attribute Type_Key
gcc/ada/
* sem_attr.adb (Analyze_Attribute, case Type_Key): Attribute can
be applied to a formal type.
* sem_ch5.adb (Analyze_Case_Statement): If Extensions_Allowed is
not enabled, verify that the type of the expression is discrete.
* aspects.ads: Add CUDA_Device aspect.
* gnat_cuda.ads (Add_CUDA_Device_Entity): New subprogram.
* gnat_cuda.adb:
(Add_CUDA_Device_Entity): New subprogram.
(CUDA_Device_Entities_Table): New hashmap for CUDA_Device
entities.
(Get_CUDA_Device_Entities): New internal subprogram.
(Set_CUDA_Device_Entities): New internal subprogram.
* par-prag.adb (Prag): Handle pragma id Pragma_CUDA_Device.
* sem_prag.ads (Aspect_Specifying_Pragma): Mark CUDA_Device as
being both aspect and pragma.
* sem_prag.adb (Analyze_Pragma): Add CUDA_Device entities to
list of CUDA_Entities belonging to package N.
(Sig_Flags): Signal CUDA_Device entities as referenced.
* snames.ads-tmpl: Create CUDA_Device names and pragmas.
Ed Schonberg [Wed, 11 Aug 2021 16:52:29 +0000 (12:52 -0400)]
[Ada] Implementation of AI12-0212: iterator specs in array aggregates (II)
gcc/ada/
* exp_aggr.adb (Expand_Array_Aggregate,
Two_Pass_Aggregate_Expansion): Increment index for element
insertion within the loop, only if upper bound has not been
reached.
* contracts.ads (Make_Class_Precondition_Subps): New subprogram.
(Merge_Class_Conditions): New subprogram.
(Process_Class_Conditions_At_Freeze_Point): New subprogram.
* contracts.adb (Check_Class_Condition): New subprogram.
(Set_Class_Condition): New subprogram.
(Analyze_Contracts): Remove code analyzing class-wide-clone
subprogram since it is no longer built.
(Process_Spec_Postconditions): Avoid processing twice seen
subprograms.
(Process_Preconditions): Simplify its functionality to
non-class-wide preconditions.
(Process_Preconditions_For): No action needed for wrappers and
helpers.
(Make_Class_Precondition_Subps): New subprogram.
(Process_Class_Conditions_At_Freeze_Point): New subprogram.
(Merge_Class_Conditions): New subprogram.
* exp_ch6.ads (Install_Class_Preconditions_Check): New
subprogram.
* exp_ch6.adb (Expand_Call_Helper): Install class-wide
preconditions check on dispatching primitives that have or
inherit class-wide preconditions.
(Freeze_Subprogram): Remove code for null procedures with
preconditions.
(Install_Class_Preconditions_Check): New subprogram.
* exp_util.ads (Build_Class_Wide_Expression): Lower the
complexity of this subprogram; out-mode formal Needs_Wrapper
since this functionality is now provided by a new subprogram.
(Get_Mapped_Entity): New subprogram.
(Map_Formals): New subprogram.
* exp_util.adb (Build_Class_Wide_Expression): Lower the
complexity of this subprogram. Its previous functionality is now
provided by subprograms Needs_Wrapper and Check_Class_Condition.
(Add_Parent_DICs): Map the overridden primitive to the
overriding one.
(Get_Mapped_Entity): New subprogram.
(Map_Formals): New subprogram.
(Update_Primitives_Mapping): Adding assertion.
* freeze.ads (Check_Inherited_Conditions): Subprogram made
public with added formal to support late overriding.
* freeze.adb (Check_Inherited_Conditions): New implementation;
builds the dispatch table wrapper required for class-wide
pre/postconditions; added support for late overriding.
(Needs_Wrapper): New subprogram.
* sem.ads (Inside_Class_Condition_Preanalysis): New global
variable.
* sem_disp.ads (Covered_Interface_Primitives): New subprogram.
* sem_disp.adb (Covered_Interface_Primitives): New subprogram.
(Check_Dispatching_Context): Skip checking context of
dispatching calls during preanalysis of class-wide conditions
since at that stage the expression is not installed yet on its
definite context.
(Check_Dispatching_Call): Skip checking 6.1.1(18.2/5) by
AI12-0412 on helpers and wrappers internally built for
supporting class-wide conditions; for late-overriding
subprograms call Check_Inherited_Conditions to build the
dispatch-table wrapper (if required).
(Propagate_Tag): Adding call to
Install_Class_Preconditions_Check.
* sem_util.ads (Build_Class_Wide_Clone_Body): Removed.
(Build_Class_Wide_Clone_Call): Removed.
(Build_Class_Wide_Clone_Decl): Removed.
(Class_Condition): New subprogram.
(Nearest_Class_Condition_Subprogram): New subprogram.
* sem_util.adb (Build_Class_Wide_Clone_Body): Removed.
(Build_Class_Wide_Clone_Call): Removed.
(Build_Class_Wide_Clone_Decl): Removed.
(Class_Condition): New subprogram.
(Nearest_Class_Condition_Subprogram): New subprogram.
(Eligible_For_Conditional_Evaluation): No need to evaluate
class-wide conditions during preanalysis since the expression is
not installed on its definite context.
* einfo.ads (Class_Wide_Clone): Removed.
(Class_Postconditions): New attribute.
(Class_Preconditions): New attribute.
(Class_Preconditions_Subprogram): New attribute.
(Dynamic_Call_Helper): New attribute.
(Ignored_Class_Postconditions): New attribute.
(Ignored_Class_Preconditions): New attribute.
(Indirect_Call_Wrapper): New attribute.
(Is_Dispatch_Table_Wrapper): New attribute.
(Static_Call_Helper): New attribute.
* exp_attr.adb (Expand_N_Attribute_Reference): When the prefix
is of an access-to-subprogram type that has class-wide
preconditions and an indirect-call wrapper of such subprogram is
available, replace the prefix by the wrapper.
* exp_ch3.adb (Build_Class_Condition_Subprograms): New
subprogram.
(Register_Dispatch_Table_Wrappers): New subprogram.
* exp_disp.adb (Build_Class_Wide_Check): Removed; class-wide
precondition checks now rely on internally built helpers.
* sem_ch13.adb (Analyze_Aspect_Specifications): Set initial
value of attributes Class_Preconditions, Class_Postconditions,
Ignored_Class_Preconditions and Ignored_Class_Postconditions.
These values are later updated with the full pre/postcondition
by Merge_Class_Conditions.
(Freeze_Entity_Checks): Call
Process_Class_Conditions_At_Freeze_Point.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Remove code
building the body of the class-wide clone subprogram since it is
no longer required.
(Install_Entity): Adding assertion.
* sem_prag.adb (Analyze_Pre_Post_Condition_In_Decl_Part): Remove
code building and analyzing the class-wide clone subprogram; no
longer required.
(Build_Pragma_Check_Equivalent): Adjust call to
Build_Class_Wide_Expression since the formal named Needs_Wrapper
has been removed.
* sem_attr.adb (Analyze_Attribute_Old_Result): Skip processing
these attributes during preanalysis of class-wide conditions
since at that stage the expression is not installed yet on its
definite context.
* sem_res.adb (Resolve_Actuals): Skip applying RM 3.9.2(9/1) and
SPARK RM 6.1.7(3) on actuals of internal helpers and wrappers
built to support class-wide preconditions.
* sem_ch5.adb (Process_Bounds): Do not generate a constant
declaration for the bounds when we are preanalyzing a class-wide
condition.
(Analyze_Loop_Parameter_Specification): Handle preanalysis of
quantified expression placed in the outermost expression of a
class-wide condition.
* ghost.adb (Check_Ghost_Context): No check required during
preanalysis of class-wide conditions.
* gen_il-fields.ads (Opt_Field_Enum): Adding
Class_Postconditions, Class_Preconditions,
Class_Preconditions_Subprogram, Dynamic_Call_Helper,
Ignored_Class_Postconditions, Ignored_Class_Preconditions,
Indirect_Call_Wrapper, Is_Dispatch_Table_Wrapper,
Static_Call_Helper.
* gen_il-gen-gen_entities.adb (Is_Dispatch_Table_Wrapper):
Adding semantic flag Is_Dispatch_Table_Wrapper; removing
semantic field Class_Wide_Clone; adding semantic fields for
Class_Postconditions, Class_Preconditions,
Class_Preconditions_Subprogram, Dynamic_Call_Helper,
Ignored_Class_Postconditions, Indirect_Call_Wrapper,
Ignored_Class_Preconditions, and Static_Call_Helper.
Piotr Trojanek [Wed, 11 Aug 2021 15:57:55 +0000 (17:57 +0200)]
[Ada] Fix deleting CodePeer files for non-ordinary units
gcc/ada/
* comperr.adb (Delete_SCIL_Files): Handle generic subprogram
declarations and renaming just like generic package declarations
and renamings, respectively; handle
N_Subprogram_Renaming_Declaration.
Steve Baird [Tue, 10 Aug 2021 17:33:42 +0000 (10:33 -0700)]
[Ada] Improve error message for .ali file version mismatch
gcc/ada/
* bcheck.adb (Check_Versions): Add support for the case where
the .ali file contains both a primary and a secondary version
number, as in "GNAT Lib v22.20210809".
Steve Baird [Mon, 2 Aug 2021 23:18:08 +0000 (16:18 -0700)]
[Ada] Fix bug in inherited user-defined-literal aspects for tagged types
gcc/ada/
* sem_res.adb (Resolve): Two separate fixes. In the case where
Find_Aspect for a literal aspect returns the aspect for a
different (ancestor) type, call Corresponding_Primitive_Op to
get the right callee. In the case where a downward tagged type
conversion appears to be needed, generate a null extension
aggregate instead, as per Ada RM 3.4(27).
* sem_util.ads, sem_util.adb: Add new Corresponding_Primitive_Op
function. It maps a primitive op of a tagged type and a
descendant type of that tagged type to the corresponding
primitive op of the descendant type. The body of this function
was written by Javier Miranda.
Bob Duff [Mon, 9 Aug 2021 23:06:18 +0000 (19:06 -0400)]
[Ada] Info. gathering in preparation for more efficiency improvements
gcc/ada/
* atree.adb: Gather and print statistics about frequency of
getter and setter calls.
* atree.ads (Print_Statistics): New procedure for printing
statistics.
* debug.adb: Document -gnatd.A switch.
* gen_il-gen.adb: Generate code for statistics gathering.
Choose the offset of Homonym early. Misc cleanup. Put more
comments in the generated code.
* gen_il-internals.ads (Unknown_Offset): New value to indicate
that the offset has not yet been chosen.
* gnat1drv.adb: Call Print_Statistics.
* libgnat/s-imglli.ads: Minor comment fix.
* output.ads (Write_Int_64): New procedure to write a 64-bit
value. Needed for new statistics, and could come in handy
elsewhere.
* output.adb (Write_Int_64): Likewise.
* sinfo.ads: Remove obsolete comment. The xtreeprs program no
longer exists.
* types.ads: New 64-bit types needed for new statistics.
[Ada] Support gmem.out longer than 2G on 32 bit platforms
gcc/ada/
* libgnat/memtrack.adb (Putc): New routine wrapped around fputc
with error check.
(Write): New routine wrapped around fwrite with error check.
Remove bound functions fopen, fwrite, fputs, fclose, OS_Exit.
Use the similar routines from System.CRTL and System.OS_Lib.
Ed Schonberg [Sun, 8 Aug 2021 14:34:38 +0000 (10:34 -0400)]
[Ada] Spurious range checks on aggregate with non-static bounds
gcc/ada/
* exp_aggr.adb (Must_Slide): If the aggregate only contains an
others_clause no sliding id involved. Otherwise sliding is
required if any bound of the aggregate or the context subtype is
non-static.
Steve Baird [Thu, 5 Aug 2021 18:18:19 +0000 (11:18 -0700)]
[Ada] Improve error message for .ali file version mismatch
gcc/ada/
* bcheck.adb (Check_Versions): In the case of an ali file
version mismatch, if distinct integer values can be extracted
from the two version strings then include those values in the
generated error message.
Patrick Palka [Thu, 30 Sep 2021 21:34:23 +0000 (17:34 -0400)]
c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]
is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but
C++20 aggregate paren init means multi-arg ctors can now be trivial too.
This patch relaxes the relevant early exit check accordingly.
PR c++/102535
gcc/cp/ChangeLog:
* method.c (is_xible_helper): Don't exit early for multi-arg
ctors in C++20.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_trivially_constructible7.C: New test.
Patrick Palka [Thu, 30 Sep 2021 21:29:18 +0000 (17:29 -0400)]
c++: argument order in a variadic type trait intrinsic
When parsing a variadic type trait intrinsic, we build up the list of
trailing arguments in reverse, but we neglect to reverse the list to
the true order afterwards. This causes us to confuse the meaning of
e.g. __is_xible(x, y, z) vs __is_xible(x, z, y).
Note that this bug doesn't affect the library traits because they pass a
pack expansion as the single trailing argument to __is_xible, which gets
expanded in the correct order by tsubst_tree_list.
gcc/cp/ChangeLog:
* parser.c (cp_parser_trait_expr): Call nreverse on the reversed
list of trailing arguments.
Ian Lance Taylor [Thu, 30 Sep 2021 04:48:48 +0000 (21:48 -0700)]
compiler: avoid calling Expression::type before lowering
This is a minor cleanup to ensure that the various Expression::do_type
methods don't have to worry about the possibility that the Expression
has not been lowered.
A test for CLASS(*) + assumed rank was missing; adding a test to
unlimited_polymorphic_1.f03 showed an ICE as backend_decl wasn't
set. While gfc_get_symbol_decl would fix it, the code also assumed
that the class(*) was a variable and could not be a subobject of
a derived type.
PR fortran/71703
PR fortran/84007
gcc/fortran/ChangeLog:
* trans-intrinsic.c (gfc_conv_same_type_as): Fix handling
of UNLIMITED_POLY.
* trans.h (gfc_vtpr_hash_get): Renamed prototype to ...
(gfc_vptr_hash_get): ... this to match function name.
libphobos: Print stacktrace before terminating program due to uncaught exception.
By default, D run-time has a top level exception handler to catch
anything that was uncaught by user code. However when the
`rt_trapExceptions' flag is cleared, this handler would not be enabled,
and this termination would occur, aborting the program, but without any
information about the exception.
libphobos/ChangeLog:
* libdruntime/gcc/deh.d (_d_print_throwable): Declare.
(_d_throw): Print stacktrace before terminating program due to
uncaught exception.
libphobos: Remove unused variables in gcc.backtrace.
The core.runtime module always overrides the default parameter value for
constructor calls. MaxAlignment is not required because a class can be
created on the stack with the `scope' keyword.
libphobos/ChangeLog:
* libdruntime/core/runtime.d (runModuleUnitTests): Use scope to new
LibBacktrace on the stack.
* libdruntime/gcc/backtrace.d (FIRSTFRAME): Remove.
(LibBacktrace.MaxAlignment): Remove.
(LibBacktrace.this): Remove default initialization of firstFrame.
(UnwindBacktrace.this): Likewise.
libphobos: Define main function as extern(C) when compiling without D runtime (PR102476)
The default supplied main function as read when compiling with `-fmain'
has extern(D) linkage. However this does not work when mixing this
option together with `-fno-druntime'.
PR d/102476
gcc/testsuite/ChangeLog:
* gdc.dg/pr102476.d: New test.
libphobos/ChangeLog:
* libdruntime/__main.di: Define main function as extern(C) when
compiling without D runtime.
libgomp/
* testsuite/libgomp.fortran/alloc-7.f90: Add dg-prune-output
for -fintrinsic-modules-path= warning of the C compiler.
* testsuite/libgomp.fortran/alloc-9.f90: Likewise.
* testsuite/libgomp.fortran/alloc-10.f90: Likewise.
openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran
gcc/ChangeLog:
* omp-low.c (omp_runtime_api_call): Add omp_aligned_{,c}alloc and
omp_{c,re}alloc, fix omp_alloc/omp_free.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1): Set implementation status to Y for
omp_aligned_{,c}alloc and omp_{c,re}alloc routines.
* omp_lib.f90.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc,
omp_realloc): Add.
* omp_lib.h.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc,
omp_realloc): Add.
* testsuite/libgomp.fortran/alloc-10.f90: New test.
* testsuite/libgomp.fortran/alloc-6.f90: New test.
* testsuite/libgomp.fortran/alloc-7.c: New test.
* testsuite/libgomp.fortran/alloc-7.f90: New test.
* testsuite/libgomp.fortran/alloc-8.f90: New test.
* testsuite/libgomp.fortran/alloc-9.f90: New test.
Richard Biener [Thu, 30 Sep 2021 11:05:45 +0000 (13:05 +0200)]
Refine alingment peeling fix
This refines the previous fix further by reverting to the original
code since the API is a bit of a mess. It also fixes the vector type
used to query the misalignment - that was what triggered the original
bogus change.
2021-09-30 Richard Biener <rguenther@suse.de>
* tree-vect-data-refs.c (vect_update_misalignment_for_peel):
Restore and fix condition under which we apply npeel to
the DRs misalignment value.
My upcoming improvements to the DOM threader triggered a warning in
this code. It looks like the format string is ".ltrans%u.ltrans", but
we're only writing a max of ".ltrans" + whatever the MAX_INT is here.
Jakub Jelinek [Thu, 30 Sep 2021 07:30:18 +0000 (09:30 +0200)]
openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc
This patch adds new OpenMP 5.1 allocator entrypoints and in addition to that
fixes an omp_alloc bug which is hard to test for - if the first allocator
fails but has a larger alignment trait and has a fallback allocator, either
the default behavior or a user fallback, then the extra alignment will be used
even in the fallback allocation, rather than just starting with whatever
alignment has been requested (in GOMP_alloc or the minimum one in omp_alloc).
Jonathan's comment on IRC this morning made me realize that I should add
alloc_align attributes to 2 of the prototypes and I still need to add testsuite
coverage for omp_realloc, will do that in a follow-up.
2021-09-30 Jakub Jelinek <jakub@redhat.com>
* omp.h.in (omp_aligned_alloc, omp_calloc, omp_aligned_calloc,
omp_realloc): New prototypes.
(omp_alloc): Move after omp_free prototype, add __malloc__ (omp_free)
attribute.
* allocator.c: Include string.h.
(omp_aligned_alloc): No longer static, add ialias. Add new_alignment
variable and use it instead of alignment so that when retrying the old
alignment is used again. Don't retry if new alignment is the same
as old alignment, unless allocator had pool size.
(omp_alloc, GOMP_alloc, GOMP_free): Use ialias_call.
(omp_aligned_calloc, omp_calloc, omp_realloc): New functions.
* libgomp.map (OMP_5.0.2): Export omp_aligned_alloc, omp_calloc,
omp_aligned_calloc and omp_realloc.
* testsuite/libgomp.c-c++-common/alloc-4.c (main): Add
omp_aligned_alloc, omp_calloc and omp_aligned_calloc tests.
* testsuite/libgomp.c-c++-common/alloc-5.c: New test.
* testsuite/libgomp.c-c++-common/alloc-6.c: New test.
* testsuite/libgomp.c-c++-common/alloc-7.c: New test.
* testsuite/libgomp.c-c++-common/alloc-8.c: New test.
[PR102501] Adjust jump threading testcases for ppc64* and others.
I really don't know what to do here. This is a bit of whack-o-mole.
The IL is sufficiently different for various architectures that any
tweak can cause the number of jump threads to vary.
For the pr7745-2.c testcase, we have less threading candidates because 2
of them now cross loop boundaries. Interestingly, this test matches
"Jumps threaded", not threads registered, so the block copier can
drop threads at copying time adding further confusion.
For example, we can register N threads, but the old copier can cancel
N-M threads while updating the CFG for a variety of different reasons
(removed edges, threading through loop exits, etc). This makes the
"Registering jump threads" not to match the total number of threads this
test checks for with "Jumps threaded".
The pr66752-3.c test OTOH, is just a matter of thread4 eliminating the
"if". I had erroneously thought it would always be eliminated by
thread3, but we really don't care where it gets cleaned up. All we know
is that DCE can't depend on the early threaders doing this work, because
it may cross loop boundaries. I've chosen thread4 arbitrarily, but we
could just as easily pick the ".optimized" dump.
Sorry, I'm really at my wits end here. I don't see any clean path
forward, except rewrite these tests as gimple IL. They're close to useless
as they sit.