Patrick Palka [Thu, 17 Feb 2022 13:35:23 +0000 (08:35 -0500)]
c++: double non-dep folding from finish_compound_literal [PR104565]
In finish_compound_literal, we perform non-dependent expr folding before
the call to check_narrowing ever since r9-5973. But ever since r10-7096,
check_narrowing also performs non-dependent expr folding of its own.
This double folding means tsubst will see non-templated trees during the
second folding, which causes a spurious error in the below testcase.
This patch removes the former folding operation; it seems obviated by
the latter one.
Patrick Palka [Thu, 3 Feb 2022 23:54:23 +0000 (18:54 -0500)]
c++: dependence of member noexcept-spec [PR104079]
Here a stale TYPE_DEPENDENT_P/_P_VALID value for f's function type
after replacing the type's DEFERRED_NOEXCEPT with the parsed dependent
noexcept-spec causes us to try to instantiate g's noexcept-spec ahead
of time (since it in turn appears non-dependent), leading to an ICE.
This patch fixes this by clearing TYPE_DEPENDENT_P_VALID in
fixup_deferred_exception_variants appropriately (as in
build_cp_fntype_variant).
That turns out to fix the testcase for C++17 but not for C++11/14,
because it's not until C++17 that a noexcept-spec is part of (and
therefore affects dependence of) the function type. Since dependence of
NOEXCEPT_EXPR is defined in terms of instantiation dependence, the most
appropriate fix for earlier dialects seems to be to make instantiation
dependence consider dependence of a noexcept-spec.
PR c++/104079
gcc/cp/ChangeLog:
* pt.c (value_dependent_noexcept_spec_p): New predicate split
out from ...
(dependent_type_p_r): ... here.
(instantiation_dependent_r): Use value_dependent_noexcept_spec_p
to consider dependence of a noexcept-spec before C++17.
* tree.c (fixup_deferred_exception_variants): Clear
TYPE_DEPENDENT_P_VALID.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept74.C: New test.
* g++.dg/cpp0x/noexcept74a.C: New test.
Patrick Palka [Sat, 26 Mar 2022 14:20:16 +0000 (10:20 -0400)]
c++: ICE when building builtin operator->* set [PR103455]
Here when constructing the builtin operator->* candidate set according
to the available conversion functions for the operand types, we end up
considering a candidate with C1=T (through B's dependent conversion
function) and C2=F, during which we crash from DERIVED_FROM_P because
dependent_type_p sees a TEMPLATE_TYPE_PARM outside of a template
context.
Sidestepping the question of whether we should be considering such a
dependent conversion function here in the first place, it seems futile
to test DERIVED_FROM_P for anything other than an actual class type, so
this patch fixes this ICE by simply guarding the DERIVED_FROM_P test
with CLASS_TYPE_P instead of MAYBE_CLASS_TYPE_P.
PR c++/103455
gcc/cp/ChangeLog:
* call.c (add_builtin_candidate) <case MEMBER_REF>: Test
CLASS_TYPE_P instead of MAYBE_CLASS_TYPE_P.
Jan Hubicka [Fri, 26 Nov 2021 12:36:35 +0000 (13:36 +0100)]
Fix handling of in_flags in update_escape_summary_1
update_escape_summary_1 has thinko where it compues proper min_flags but then
stores original value (ignoring the fact whether there was a dereference
in the escape point).
PR ipa/103432
* ipa-modref.c (update_escape_summary_1): Fix handling of min_flags.
c++: Fix ICE due to shared BLOCK node in coroutine generation [PR103328]
When finishing a function that is a coroutine, the function is
transformed into a "ramp" function, and the original user-provided
function body gets moved into a newly created "actor" function.
In this case `current_function_decl` points to the ramp function,
but `current_binding_level->blocks` would still point to the
scope block of the user-provided function body in the actor function,
so when the ramp function was finished during `poplevel()` in decl.cc,
we could end up with that block being reused as the `DECL_INITIAL()` of
the ramp function:
This block would then be independently modified by subsequent passes
touching either the ramp or the actor function, potentially causing
an ICE depending on the order and function of these passes.
which is problematic since gimplifying the base object
? inv : src produces a register temporary but GIMPLE does not
really support a register as a base for an ARRAY_REF (even
though that's not strictly validated it seems as can be seen
at -O0). Interestingly the C++ frontend avoids this issue
by emitting the following GENERIC instead:
The proposed patch below fixes things up when using an rvalue
as the base is OK by emitting a copy from a register base to a
non-register one. The ?: as lvalue extension seems to be gone
for C, C++ again unwraps the COND_EXPR in that case.
2022-02-11 Richard Biener <rguenther@suse.de>
PR middle-end/104497
* gimplify.c (gimplify_compound_lval): Make sure the
base is a non-register if needed and possible.
When we optimize permutations in a reduction chain we have to
be careful to select the correct live-out stmt, otherwise the
reduction result will be unused and the retained scalar code will
execute only the number of vector iterations.
2022-03-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/105053
* tree-vect-loop.c (vect_create_epilog_for_reduction): Pick
the correct live-out stmt for a reduction chain.
Andrew Pinski [Wed, 9 Feb 2022 22:56:58 +0000 (14:56 -0800)]
[COMMITTED] Fix PR aarch64/104474: ICE with vector float initializers and non-consts.
The problem here is that the aarch64 back-end was placing const0_rtx
into the constant vector RTL even if the mode was a floating point mode.
The fix is instead to use CONST0_RTX and pass the mode to select the
correct zero (either const_int or const_double).
Committed as obvious after a bootstrap/test on aarch64-linux-gnu with
no regressions.
PR target/104474
gcc/ChangeLog:
* config/aarch64/aarch64.c
(aarch64_sve_expand_vector_init_handle_trailing_constants):
Use CONST0_RTX instead of const0_rtx for the non-constant elements.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/pr104474-1.c: New test.
* gcc.target/aarch64/sve/pr104474-2.c: New test.
* gcc.target/aarch64/sve/pr104474-3.c: New test.
Richard Biener [Mon, 28 Mar 2022 08:07:53 +0000 (10:07 +0200)]
tree-optimization/105070 - annotate bit cluster tests with locations
The following makes sure to annotate the tests generated by
switch lowering bit-clustering with locations which otherwise
can be completely lost even at -O0.
2022-03-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/105070
* tree-switch-conversion.h
(bit_test_cluster::hoist_edge_and_branch_if_true): Add location
argument.
* tree-switch-conversion.c
(bit_test_cluster::hoist_edge_and_branch_if_true): Annotate
cond with location.
(bit_test_cluster::emit): Annotate all generated expressions
with location.
Richard Biener [Wed, 23 Mar 2022 11:21:32 +0000 (12:21 +0100)]
rtl-optimization/105028 - fix compile-time hog in form_threads_from_copies
form_threads_from_copies processes a sorted array of copies, skipping
those with the same thread and conflicting threads and merging the
first non-conflicting ones. After that it terminates the loop and
gathers the remaining elements of the array, skipping same thread
copies, re-starting the process. For a large number of copies this
gathering of the rest takes considerable time and it also appears
pointless. The following simply continues processing the array
which should be equivalent as far as I can see.
This takes form_threads_from_copies off the profile radar from
previously taking ~50% of the compile-time.
2022-03-23 Richard Biener <rguenther@suse.de>
PR rtl-optimization/105028
* ira-color.c (form_threads_from_copies): Remove unnecessary
copying of the sorted_copies tail.
Richard Biener [Fri, 11 Mar 2022 13:09:33 +0000 (14:09 +0100)]
tree-optimization/104880 - update-address-taken and cmpxchg
The following addresses optimistic non-addressable marking of
an argument of __atomic_compare_exchange_n which broke when
I added DECL_NOT_GIMPLE_REG_P since we cannot guarantee we can
rewrite it when TREE_ADDRESSABLE is unset. Instead we have to
restore TREE_ADDRESSABLE in that case.
2022-03-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/104880
* tree-ssa.c (execute_update_address_taken): Remember if we
optimistically made something not addressable and
prepare to undo it.
Richard Biener [Wed, 6 Apr 2022 11:46:56 +0000 (13:46 +0200)]
middle-end/105165 - sorry instead of ICE for _Complex asm goto
Complex lowering cannot currently deal with asm gotos with _Complex
output operands. Emit a sorry instead of ICEing, those should not
appear in practice.
IPA_JF_ANCESTOR jump functions are constructed also when the formal
parameter of the caller is first checked whether it is NULL and left
as it is if it is NULL, to accommodate C++ casts to an ancestor class.
The jump function type was invented for devirtualization and IPA-CP
propagation of tree constants is also careful to apply it only to
existing DECLs(*) but as PR 103083 shows, the part propagating "known
bits" was not careful about this, which can lead to miscompilations.
This patch introduces a flag to the ancestor jump functions which
tells whether a NULL-check was elided when creating it and makes the
bits propagation behave accordingly, masking any bits otherwise would
be known to be one. This should safely preserve alignment info, which
is the primary ifnormation that we keep in bits for pointers.
(*) There still may remain problems when a DECL resides on address
zero (with -fno-delete-null-pointer-checks ...I hope it cannot happen
otherwise). I am looking into that now but I think it will be easier
for everyone if I do so in a follow-up patch.
gcc/ChangeLog:
2022-02-11 Martin Jambor <mjambor@suse.cz>
PR ipa/103083
* ipa-prop.h (ipa_ancestor_jf_data): New flag keep_null;
(ipa_get_jf_ancestor_keep_null): New function.
* ipa-prop.c (ipa_set_ancestor_jf): Initialize keep_null field of the
ancestor function.
(compute_complex_assign_jump_func): Pass false to keep_null
parameter of ipa_set_ancestor_jf.
(compute_complex_ancestor_jump_func): Pass true to keep_null
parameter of ipa_set_ancestor_jf.
(update_jump_functions_after_inlining): Carry over keep_null from the
original ancestor jump-function or merge them.
(ipa_write_jump_function): Stream keep_null flag.
(ipa_read_jump_function): Likewise.
(ipa_print_node_jump_functions_for_edge): Print the new flag.
* ipa-cp.c (class ipcp_bits_lattice): Make various getters const. New
member function known_nonzero_p.
(ipcp_bits_lattice::known_nonzero_p): New.
(ipcp_bits_lattice::meet_with_1): New parameter drop_all_ones,
observe it.
(ipcp_bits_lattice::meet_with): Likewise.
(propagate_bits_across_jump_function): Simplify. Pass true in
drop_all_ones when it is necessary.
(propagate_aggs_across_jump_function): Take care of keep_null
flag.
(ipa_get_jf_ancestor_result): Propagate NULL accross keep_null
jump functions.
gcc/testsuite/ChangeLog:
2021-11-25 Martin Jambor <mjambor@suse.cz>
* gcc.dg/ipa/pr103083-1.c: New test.
* gcc.dg/ipa/pr103083-2.c: Likewise.
Jonathan Wakely [Thu, 17 Feb 2022 17:23:36 +0000 (17:23 +0000)]
libstdc++: Make std::error_code printer more robust
This attempts to implement a partial workaround for the GDB bug
https://sourceware.org/bugzilla/show_bug.cgi?id=28856 which causes GDB
to crash when printing a frame with a std::error_code argument.
By recognising the known error categories defined in the library and
hardcoding their names we do not need to call cat->name() on the
category. This has the additional benefit of also working when
debugging a core file rather than a running process. For those known
categories we can also cast the int value to the corresponding error
code enum (e.g. future_errc) so that we show an enumerator instead of
just an integer.
For program-defined categories we just use the name of the dynamic type
to identify the category, and print the value as an integer. Once the
GDB bug is fixed and the virtual name() function can be called safely,
that would be preferable. For now it's better to have an imperfect
printer that doesn't crash GDB.
This rewritten StdErrorCodePrinter needs gdb.Value.dynamic_type, so is
only registered if that is supported, which means GDB 7.7 and later.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdErrorCodePrinter): Replace
code that call cat->name() on std::error_category objects.
Identify known categories by symbol name and use a hardcoded
name. Print error code values as enumerators where appopriate.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Adjust expected
name of custom category. Check io_errc and future_errc errors.
Jonathan Wakely [Wed, 23 Mar 2022 09:57:20 +0000 (09:57 +0000)]
libstdc++: Add missing constraints to std::bit_cast [PR105027]
Our std::bit_cast was relying on the compiler to check for errors inside
__builtin_bit_cast, instead of checking them as constraints. That means
std::bit_cast was not SFINAE-friendly.
This fix uses a requires-clause, so for old versions of Clang without
concepts support the function will still be unconstrained. At some point
in future we can remove the #ifdef __cpp_concepts check and rely on all
compilers having full concepts support in C++20 mode.
The path::begin() fix should have been part of r12-3930-gf2b7f56a15d9cb.
Thanks to Timm Bäder for reporting this one.
libstdc++-v3/ChangeLog:
* include/experimental/bits/fs_fwd.h (copy_file): Remove
incorrect noexcept from declaration.
* include/experimental/bits/fs_path.h (path::begin, path::end):
Add noexcept to declarations, to match definitions.
Jonathan Wakely [Tue, 1 Feb 2022 23:58:08 +0000 (23:58 +0000)]
libstdc++: Do not use dirent::d_type unconditionally
These new tests should not use the d_type member unless it's actually
present on the OS.
libstdc++-v3/ChangeLog:
* testsuite/27_io/filesystem/iterators/error_reporting.cc: Use
autoconf macro to check whether d_type is present.
* testsuite/experimental/filesystem/iterators/error_reporting.cc:
Likewise.
Jonathan Wakely [Mon, 31 Jan 2022 21:12:53 +0000 (21:12 +0000)]
libstdc++: Reset filesystem::recursive_directory_iterator on error
The standard requires directory iterators to become equal to the end
iterator value if they report an error. Some members functions of
filesystem::recursive_directory_iterator fail to do that.
libstdc++-v3/ChangeLog:
* src/c++17/fs_dir.cc (recursive_directory_iterator::increment):
Reset state to past-the-end iterator on error.
(fs::recursive_directory_iterator::pop(error_code&)): Likewise.
(fs::recursive_directory_iterator::pop()): Check _M_dirs before
it might get reset.
* src/filesystem/dir.cc (recursive_directory_iterator): Likewise,
for the TS implementation.
* testsuite/27_io/filesystem/iterators/error_reporting.cc: New test.
* testsuite/experimental/filesystem/iterators/error_reporting.cc: New test.
Jonathan Wakely [Fri, 4 Feb 2022 15:23:31 +0000 (15:23 +0000)]
libstdc++: Remove un-implementable noexcept from Filesystem TS operations
LWG 3014 removed these incorrect noexcept specifications from the C++17
std::filesystem operations. They are also incorrect on the experimental
TS versions and should be removed from them too.
Jonathan Wakely [Tue, 8 Mar 2022 09:14:33 +0000 (09:14 +0000)]
libstdc++: Remove incorrect copyright notice from header
This file has the SGI copyright notice, but contains no code from
the SGI STL. It was entirely written by me in 2019, originally as part
of the <memory> header. When I extracted it into a new header I
accidentally copied across the SGI copyright, but that only applies to
some much older parts of <memory>.
Jonathan Wakely [Mon, 31 Jan 2022 11:00:18 +0000 (11:00 +0000)]
libstdc++: Improve config output for --enable-cstdio [PR104301]
Currently we just print "checking for underlying I/O to use... stdio"
unconditionally, whether configured to use stdio_pure or stdio_posix. We
should make it clear that the user's configure option chose the right
thing.
libstdc++-v3/ChangeLog:
PR libstdc++/104301
* acinclude.m4 (GLIBCXX_ENABLE_CSTDIO): Print different messages
for stdio_pure and stdio_posix options.
* configure: Regenerate.
Jakub Jelinek [Sun, 3 Apr 2022 19:50:43 +0000 (21:50 +0200)]
i386: Fix up ix86_expand_vector_init_general [PR105123]
The following testcase is miscompiled on ia32.
The problem is that at -O0 we end up with:
vector(4) short unsigned int _1;
short unsigned int u.0_3;
...
_1 = {u.0_3, u.0_3, u.0_3, u.0_3};
statement (dead) which is wrongly expanded.
elt is (subreg:HI (reg:SI 83 [ u.0_3 ]) 0), tmp_mode SImode,
so after convert_mode we start with word (reg:SI 83 [ u.0_3 ]).
The intent is to manually broadcast that value to 2 SImode parts,
but because we pass word as target to expand_simple_binop, it will
overwrite (reg:SI 83 [ u.0_3 ]) and we end up with 0:
10: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
11: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
12: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
13: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
14: clobber r110:V4HI
15: r110:V4HI#0=r83:SI
16: r110:V4HI#4=r83:SI
as the two ors do nothing and two shifts each by 16 left shift it all
away.
The following patch fixes that by using NULL_RTX target, so we expand it as
10: {r110:SI=r83:SI<<0x10;clobber flags:CC;}
11: {r111:SI=r110:SI|r83:SI;clobber flags:CC;}
12: {r112:SI=r83:SI<<0x10;clobber flags:CC;}
13: {r113:SI=r112:SI|r83:SI;clobber flags:CC;}
14: clobber r114:V4HI
15: r114:V4HI#0=r111:SI
16: r114:V4HI#4=r113:SI
instead.
Another possibility would be to pass NULL_RTX only when word == elt
and word otherwise, where word would necessarily be a pseudo from the first
shift after passing NULL_RTX there once or pass NULL_RTX for the shift and
word for ior.
2022-04-03 Jakub Jelinek <jakub@redhat.com>
PR target/105123
* config/i386/i386-expand.c (ix86_expand_vector_init_general): Avoid
using word as target for expand_simple_binop when doing ASHIFT and
IOR.
[PR105032] LRA: modify loop condition to find reload insns for hard reg splitting
When trying to split hard reg live range to assign hard reg to a reload
pseudo, LRA searches for reload insns of the reload pseudo
assuming a specific order of the reload insns. This order is violated if
reload involved in inheritance transformation. In such case, the loop used
for reload insn searching can become infinite. The patch fixes this.
Marek Polacek [Tue, 29 Mar 2022 18:36:55 +0000 (14:36 -0400)]
c-family: ICE with -Wconversion and A ?: B [PR101030]
This patch fixes a crash in conversion_warning on a null expression.
It is null because the testcase uses the GNU A ?: B extension. We
could also use op0 instead of op1 in this case, but it doesn't seem
to be necessary.
PR c++/101030
gcc/c-family/ChangeLog:
* c-warn.c (conversion_warning) <case COND_EXPR>: Don't call
conversion_warning when OP1 is null.
Kito Cheng [Wed, 27 Oct 2021 15:41:17 +0000 (23:41 +0800)]
RISC-V: Handle zi* extension correctly for arch-canonicalize script
Canonical order for z-prefixed extension are rely on the canonical order of
single letter extension, however we didn't put i into the list before,
so when we put zicsr or zifencei it will got exception.
gcc/ChangeLog:
* config/riscv/arch-canonicalize (CANONICAL_ORDER): Add `i` to
CANONICAL_ORDER.
RISC-V: Fix register class subset checks for CLASS_MAX_NREGS
Fix the register class subset checks in the determination of the maximum
number of consecutive registers needed to hold a value of a given mode.
The number depends on whether a register is a general-purpose or a
floating-point register, so check whether the register class requested
is a subset (argument 1 to `reg_class_subset_p') rather than superset
(argument 2) of GR_REGS or FP_REGS class respectively.
gcc/
* config/riscv/riscv.c (riscv_class_max_nregs): Swap the
arguments to `reg_class_subset_p'.
Nelson Chu [Mon, 29 Nov 2021 12:48:20 +0000 (04:48 -0800)]
RISC-V: jal cannot refer to a default visibility symbol for shared object.
This is the original binutils bugzilla report,
https://sourceware.org/bugzilla/show_bug.cgi?id=28509
And this is the first version of the proposed binutils patch,
https://sourceware.org/pipermail/binutils/2021-November/118398.html
After applying the binutils patch, I get the the unexpected error when
building libgcc,
/scratch/nelsonc/riscv-gnu-toolchain/riscv-gcc/libgcc/config/riscv/div.S:42:
/scratch/nelsonc/build-upstream/rv64gc-linux/build-install/riscv64-unknown-linux-gnu/bin/ld: relocation R_RISCV_JAL against `__udivdi3' which may bind externally can not be used when making a shared object; recompile with -fPIC
Therefore, this patch add an extra hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target a non-preemptible symbol instead.
The solution is similar to glibc as follows,
https://sourceware.org/git/?p=glibc.git;a=commit;h=68389203832ab39dd0dbaabbc4059e7fff51c29b
libgcc/ChangeLog:
* config/riscv/div.S: Add the hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target it since it is non-preemptible.
* config/riscv/riscv-asm.h: Added new macros HIDDEN_JUMPTARGET and
HIDDEN_DEF.
RISC-V: Fix use-after-free error in `parse_multiletter_ext'
Avoid undefined arithmetic involving a pointer to a heap allocation that
has been freed and move a problematic calculation ahead of the following
call to `free' in `riscv_subset_list::parse_multiletter_ext', removing a
compilation error:
.../gcc/common/config/riscv/riscv-common.c: In member function 'const char* riscv_subset_list::parse_multiletter_ext(const char*, const char*, const char*)':
.../gcc/common/config/riscv/riscv-common.cc:905:27: error: pointer 'subset' used after 'void free(void*)' [-Werror=use-after-free]
905 | p += end_of_version - subset;
| ~~~~~~~~~~~~~~~^~~~~~~~
.../gcc/common/config/riscv/riscv-common.cc:904:12: note: call to 'void free(void*)' here
904 | free (subset);
| ~~~~~^~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [Makefile:2428: riscv-common.o] Error 1
and a build regression from commit 671a283636de ("Add -Wuse-after-free
[PR80532].").
gcc/
* common/config/riscv/riscv-common.c
(riscv_subset_list::parse_multiletter_ext): Move pointer
arithmetic ahead of `free'.
Jakub Jelinek [Wed, 30 Mar 2022 08:49:47 +0000 (10:49 +0200)]
ubsan: Fix ICE due to -fsanitize=object-size [PR105093]
The following testcase ICEs, because for a volatile X & RESULT_DECL
ubsan wants to take address of that reference. instrument_object_size
is called with x, so the base is equal to the access and the var
is automatic, so there is no risk of an out of bounds access for it.
Normally we wouldn't instrument those because we fold address of the
t - address of inner to 0, add constant size of the decl and it is
equal to what __builtin_object_size computes. But the volatile
results in the subtraction not being folded.
The first hunk fixes it by punting if we access the whole automatic
decl, so that even volatile won't cause a problem.
The second hunk (not strictly needed for this testcase) is similar
to what has been added to asan.cc recently, if we actually take
address of a decl and keep it in the IL, we better mark it addressable.
2022-03-30 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/105093
* ubsan.c (instrument_object_size): If t is equal to inner and
is a decl other than global var, punt. When emitting call to
UBSAN_OBJECT_SIZE ifn, make sure base is addressable.
Jakub Jelinek [Wed, 30 Mar 2022 08:21:16 +0000 (10:21 +0200)]
store-merging: Avoid ICEs on roughly ~0ULL/8 sized stores [PR105094]
On the following testcase on 64-bit targets, store-merging sees
a MEM_REF store from {} ctor with "negative" bitsize where bitoff + bitsize
wraps around to very small end offset. This later confuses the code
so that it allocates just a few bytes of memory but fills in huge amounts of
it. Later on there is a param_store_merging_max_size size check but due to
the wrap-around we pass that.
The following patch punts on such large bitsizes.
2022-03-30 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/105094
* gimple-ssa-store-merging.c (mem_valid_for_store_merging): Punt if
bitsize <= 0 rather than just == 0.
Jakub Jelinek [Wed, 30 Mar 2022 07:16:41 +0000 (09:16 +0200)]
c++: Fox template-introduction tentative parsing in class bodies clear colon_corrects_to_scope_p [PR105061]
The concepts support (in particular template introductions from concepts TS)
broke the following testcase, valid unnamed bitfields with dependent
types (or even just typedefs) were diagnosed as typos (: instead of correct
::) in template introduction during their tentative parsing.
The following patch fixes that by not doing this : to :: correction when
member_p is true.
2022-03-30 Jakub Jelinek <jakub@redhat.com>
PR c++/105061
* parser.c (cp_parser_template_introduction): If member_p, temporarily
clear parser->colon_corrects_to_scope_p around tentative parsing of
nested name specifier.
Jakub Jelinek [Sat, 26 Mar 2022 07:11:58 +0000 (08:11 +0100)]
c++: Fix up __builtin_{bit_cast,convertvector} parsing
Jonathan reported on IRC that we don't parse
__builtin_bit_cast (type, val).field
etc.
The problem is that for these 2 builtins we return from
cp_parser_postfix_expression instead of setting postfix_expression
to the cp_build_* value and falling through into the postfix regression
suffix handling loop.
2022-03-26 Jakub Jelinek <jakub@redhat.com>
* parser.c (cp_parser_postfix_expression)
<case RID_BILTIN_CONVERTVECTOR, case RID_BUILTIN_BIT_CAST>: Don't
return cp_build_{vec,convert,bit_cast} result right away, instead
set postfix_expression to it and break.
* c-c++-common/builtin-convertvector-3.c: New test.
* g++.dg/cpp2a/bit-cast15.C: New test.
Jakub Jelinek [Thu, 24 Mar 2022 11:23:51 +0000 (12:23 +0100)]
fold-const: Handle C++ dependent COMPONENT_REFs in operand_equal_p [PR105035]
As mentioned in the PR, operand_equal_p already contains some hacks so that
it can be called already on pre-instantiation C++ trees from templates,
but the recent change to compare DECL_FIELD_OFFSET in the COMPONENT_REF
case broke this. Many such COMPONENT_REFs are already punted on earlier
because they have NULL TREE_TYPE, but in this case the code knows what
type they have but still uses an IDENTIFIER_NODE as second operand
of COMPONENT_REF (I think SCOPE_REF is something that could be used too).
The following patch looks at those DECL_FIELD_*OFFSET fields only if
both field[01] args are FIELD_DECLs and otherwise keeps it to the
earlier OP_SAME (1) check that guards this whole block.
2022-03-24 Jakub Jelinek <jakub@redhat.com>
PR c++/105035
* fold-const.c (operand_equal_p) <case COMPONENT_REF>: If either
field0 or field1 is not a FIELD_DECL, return false.
Jakub Jelinek [Thu, 24 Mar 2022 09:12:25 +0000 (10:12 +0100)]
c++: extern thread_local declarations in constexpr [PR104994]
C++14 to C++20 apparently should allow extern thread_local declarations in
constexpr functions, however useless they are there (because accessing
such vars is not valid in a constant expression, perhaps sizeof/decltype).
P2242 changed that for C++23 to passing through declaration but
https://cplusplus.github.io/CWG/issues/2552.html
has been filed for it yesterday.
Jakub Jelinek [Sat, 19 Mar 2022 12:53:12 +0000 (13:53 +0100)]
i386: Don't emit pushf;pop for __builtin_ia32_readeflags_u* with unused lhs [PR104971]
__builtin_ia32_readeflags_u* aren't marked const or pure I think
intentionally, so that they aren't CSEd from different regions of a function
etc. because we don't and can't easily track all dependencies between
it and surrounding code (if somebody looks at the condition flags, it is
dependent on the vast majority of instructions).
But the builtin itself doesn't have any side-effects, so if we ignore the
result of the builtin, there is no point to emit anything.
There is a LRA bug that miscompiles the testcase which this patch makes
latent, which is certainly worth fixing too, but IMHO this change
(and maybe ix86_gimple_fold_builtin too which would fold it even earlier
when it looses lhs) is worth it as well.
2022-03-19 Jakub Jelinek <jakub@redhat.com>
PR middle-end/104971
* config/i386/i386-expand.c
(ix86_expand_builtin) <case IX86_BUILTIN_READ_FLAGS>: If ignore,
don't push/pop anything and just return const0_rtx.
Jakub Jelinek [Sat, 19 Mar 2022 07:40:47 +0000 (08:40 +0100)]
c-family: Fix up ICE during pretty-printing of PMF related expression [PR101515]
The intent of r11-6729 is that it prints something that helps user to figure
out what exactly is being accessed.
When we find a unique non-static data member that is being accessed, even
when we can't fold it nicely, IMNSHO it is better to print
((sometype *)&var)->field
or
(*(sometype *)&var).field
instead of
*(fieldtype *)((char *)&var + 56)
because the user doesn't know what is at offset 56, we shouldn't ask user
to decipher structure layout etc.
One question is if we could return something better for the TYPE_PTRMEMFUNC_FLAG
RECORD_TYPE members here (something that would print it more naturally/readably
in a C++ way), though the fact that the routine is in c-family makes it
harder.
Another one is whether we shouldn't punt for FIELD_DECLs that don't have
nicely printable name of its containing scope, something like:
if (tree scope = get_containing_scope (field))
if (TYPE_P (scope) && TYPE_NAME (scope) == NULL_TREE)
break;
return cop;
or so. This patch implements that.
Note the returned cop is a COMPONENT_REF where the first argument has a
nicely printable type name (x with type sp), but sp's TYPE_MAIN_VARIANT
is the unnamed TYPE_PTRMEMFUNC_FLAG. So another possibility would be if
we see such a problem for the FIELD_DECL's scope, check if TYPE_MAIN_VARIANT
of the first COMPONENT_REF's argument is equal to that scope and in that
case use TREE_TYPE of the first COMPONENT_REF's argument as the scope
instead.
2022-03-19 Jakub Jelinek <jakub@redhat.com>
PR c++/101515
* c-pretty-print.c (c_fold_indirect_ref_for_warn): For C++ don't
return COMPONENT_REFs with FIELD_DECLs whose containing scope can't
be printed.
Jakub Jelinek [Fri, 18 Mar 2022 17:58:06 +0000 (18:58 +0100)]
Allow (void *) 0xdeadbeef accesses without warnings [PR99578]
Starting with GCC11 we keep emitting false positive -Warray-bounds or
-Wstringop-overflow etc. warnings on widely used *(type *)0x12345000
style accesses (or memory/string routines to such pointers).
This is a standard programming style supported by all C/C++ compilers
I've ever tried, used mostly in kernel or DSP programming, but sometimes
also together with mmap MAP_FIXED when certain things, often I/O registers
but could be anything else too are known to be present at fixed
addresses.
Such INTEGER_CST addresses can appear in code either because a user
used it like that (in which case it is fine) or because somebody used
pointer arithmetics (including &((struct whatever *)NULL)->field) on
a NULL pointer. The middle-end warning code wrongly assumes that the
latter case is what is very likely, while the former is unlikely and
users should change their code.
The following patch adds a min-pagesize param defaulting to 4KB,
and treats INTEGER_CST addresses smaller than that as assumed results
of pointer arithmetics from NULL while addresses equal or larger than
that as expected user constant addresses. For GCC 13 we can
represent results from pointer arithmetics on NULL using
&MEM[(void*)0 + offset] instead of (void*)offset INTEGER_CSTs.
2022-03-18 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99578
PR middle-end/100680
PR tree-optimization/100834
* params.opt (--param=min-pagesize=): New parameter.
* builtins.c (compute_objsize_r) <case INTEGER_CST>: Use maximum
object size instead of zero for pointer constants equal or larger
than min-pagesize.
* gcc.dg/tree-ssa/pr99578-1.c: New test.
* gcc.dg/pr99578-1.c: New test.
* gcc.dg/pr99578-2.c: New test.
* gcc.dg/pr99578-3.c: New test.
* gcc.dg/pr100680.c: New test.
* gcc.dg/pr100834.c: New test.
Jakub Jelinek [Fri, 18 Mar 2022 17:49:23 +0000 (18:49 +0100)]
c++: Fix up constexpr evaluation of new with zero sized types [PR104568]
The new expression constant expression evaluation right now tries to
deduce how many elts the array it uses for the heap or heap [] vars
should have (or how many elts should its trailing array have if it has
cookie at the start). As new is lowered at that point to
(some_type *) ::operator new (size)
or so, it computes it by subtracting cookie size if any from size, then
divides the result by sizeof (some_type).
This works fine for most types, except when sizeof (some_type) is 0,
then we divide by zero; size is then equal to cookie_size (or if there
is no cookie, to 0).
The following patch special cases those cases so that we don't divide
by zero and also recover the original outer_nelts from the expression
by forcing the size not to be folded in that case but be explicit
0 * outer_nelts or cookie_size + 0 * outer_nelts.
Note, we have further issues, we accept-invalid various cases, for both
zero sized elt_type and even non-zero sized elts, we aren't able to
diagnose out of bounds POINTER_PLUS_EXPR like:
constexpr bool
foo ()
{
auto p = new int[2];
auto q1 = &p[0];
auto q2 = &p[1];
auto q3 = &p[2];
auto q4 = &p[3];
delete[] p;
return true;
}
constexpr bool a = foo ();
That doesn't look like a regression so I think we should resolve that for
GCC 13, but there are 2 problems. Figure out why
cxx_fold_pointer_plus_expression doesn't deal with the &heap []
etc. cases, and for the zero sized arrays, I think we really need to preserve
whether user wrote an array ref or pointer addition, because in the
&p[3] case if sizeof(p[0]) == 0 we know that if it has 2 elements it is
out of bounds, while if we see p p+ 0 the information if it was
p + 2 or p + 3 in the source is lost.
clang++ seems to handle it fine even in the zero sized cases or with
new expressions.
2022-03-18 Jakub Jelinek <jakub@redhat.com>
PR c++/104568
* init.c (build_new_constexpr_heap_type): Remove FULL_SIZE
argument and its handling, instead add ITYPE2 argument. Only
support COOKIE_SIZE != NULL.
(build_new_1): If size is 0, change it to 0 * outer_nelts if
outer_nelts is non-NULL. Pass type rather than elt_type to
maybe_wrap_new_for_constexpr.
* constexpr.c (build_new_constexpr_heap_type): New function.
(cxx_eval_constant_expression) <case CONVERT_EXPR>:
If elt_size is zero sized type, try to recover outer_nelts from
the size argument to operator new/new[] and pass that as
arg_size to build_new_constexpr_heap_type. Pass ctx,
non_constant_p and overflow_p to that call too.
Jakub Jelinek [Thu, 17 Mar 2022 17:49:00 +0000 (18:49 +0100)]
libatomic: Improve 16-byte atomics on Intel AVX [PR104688]
As mentioned in the PR, the latest Intel SDM has added:
"Processors that enumerate support for Intel® AVX (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
guarantee that the 16-byte memory operations performed by the following instructions will always be
carried out atomically:
• MOVAPD, MOVAPS, and MOVDQA.
• VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
• VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and k0 (masking disabled).
(Note that these instructions require the linear addresses of their memory operands to be 16-byte
aligned.)"
The following patch deals with it just on the libatomic library side so far,
currently (since ~ 2017) we emit all the __atomic_* 16-byte builtins as
library calls since and this is something that we can hopefully backport.
The patch simply introduces yet another ifunc variant that takes priority
over the pure CMPXCHG16B one, one that checks AVX and CMPXCHG16B bits and
on non-Intel clears the AVX bit during detection for now (if AMD comes
with the same guarantee, we could revert the config/x86/init.c hunk),
which implements 16-byte atomic load as vmovdqa and 16-byte atomic store
as vmovdqa followed by mfence.
2022-03-17 Jakub Jelinek <jakub@redhat.com>
PR target/104688
* Makefile.am (IFUNC_OPTIONS): Change on x86_64 to -mcx16 -mcx16.
(libatomic_la_LIBADD): Add $(addsuffix _16_2_.lo,$(SIZEOBJS)) for
x86_64.
* Makefile.in: Regenerated.
* config/x86/host-config.h (IFUNC_COND_1): For x86_64 define to
both AVX and CMPXCHG16B bits.
(IFUNC_COND_2): Define.
(IFUNC_NCOND): For x86_64 define to 2 * (N == 16).
(MAYBE_HAVE_ATOMIC_CAS_16, MAYBE_HAVE_ATOMIC_EXCHANGE_16,
MAYBE_HAVE_ATOMIC_LDST_16): Define to IFUNC_COND_2 rather than
IFUNC_COND_1.
(HAVE_ATOMIC_CAS_16): Redefine to 1 whenever IFUNC_ALT != 0.
(HAVE_ATOMIC_LDST_16): Redefine to 1 whenever IFUNC_ALT == 1.
(atomic_compare_exchange_n): Define whenever IFUNC_ALT != 0
on x86_64 for N == 16.
(__atomic_load_n, __atomic_store_n): Redefine whenever IFUNC_ALT == 1
on x86_64 for N == 16.
(atomic_load_n, atomic_store_n): New functions.
* config/x86/init.c (__libat_feat1_init): On x86_64 clear bit_AVX
if CPU vendor is not Intel.
Jakub Jelinek [Wed, 16 Mar 2022 10:04:16 +0000 (11:04 +0100)]
aarch64: Fix up RTL sharing bug in aarch64_load_symref_appropriately [PR104910]
We unshare all RTL created during expansion, but when
aarch64_load_symref_appropriately is called after expansion like in the
following testcases, we use imm in both HIGH and LO_SUM operands.
If imm is some RTL that shouldn't be shared like a non-sharable CONST,
we get at least with --enable-checking=rtl a checking ICE, otherwise might
just get silently wrong code.
The following patch fixes that by copying it if it can't be shared.
Jakub Jelinek [Tue, 15 Mar 2022 08:12:03 +0000 (09:12 +0100)]
ifcvt: Punt if not onlyjump_p for find_if_case_{1,2} [PR104814]
find_if_case_{1,2} implicitly assumes conditional jumps and rewrites them,
so if they have extra side-effects or are say asm goto, things don't work
well, either the side-effects are lost or we could ICE.
In particular, the testcase below on s390x has there a doloop instruction
that decrements a register in addition to testing it for non-zero and
conditionally jumping based on that.
The following patch fixes that by punting for !onlyjump_p case, i.e.
if there are side-effects in the jump instruction or it isn't a plain PC
setter.
Also, it assumes BB_END (test_bb) will be always non-NULL, because basic
blocks with 2 non-abnormal successor edges should always have some instruction
at the end that determines which edge to take.
2022-03-15 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/104814
* ifcvt.c (find_if_case_1, find_if_case_2): Punt if test_bb doesn't
end with onlyjump_p. Assume BB_END (test_bb) is always non-NULL.
Jakub Jelinek [Wed, 9 Mar 2022 08:15:28 +0000 (09:15 +0100)]
c, c++, c-family: -Wshift-negative-value and -Wshift-overflow* tweaks for -fwrapv and C++20+ [PR104711]
As mentioned in the PR, different standards have different definition
on what is an UB left shift. They all agree on out of bounds (including
negative) shift count.
The rules used by ubsan are:
C99-C2x ((unsigned) x >> (uprecm1 - y)) != 0 then UB
C++11-C++17 x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1 then UB
C++20 and later everything is well defined
Now, for C++20, I've in the P1236R1 implementation added an early
exit for -Wshift-overflow* warning so that it never warns, but apparently
-Wshift-negative-value remained as is. As it is well defined in C++20,
the following patch doesn't enable -Wshift-negative-value from -Wextra
anymore for C++20 and later, if users want for compatibility with C++17
and earlier get the warning, they still can by using -Wshift-negative-value
explicitly.
Another thing is -fwrapv, that is an extension to the standards, so it is up
to us how exactly we define that case. Our ubsan code treats
TYPE_OVERFLOW_WRAPS (type0) and cxx_dialect >= cxx20 the same as only
diagnosing out of bounds shift count and nothing else and IMHO it is most
sensical to treat -fwrapv signed left shifts the same as C++20 treats
them, https://eel.is/c++draft/expr.shift#2
"The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N,
where N is the width of the type of the result.
[Note 1: E1 is left-shifted E2 bit positions; vacated bits are zero-filled.
— end note]"
with no UB dependent on the E1 values. The UB is only
"The behavior is undefined if the right operand is negative, or greater
than or equal to the width of the promoted left operand."
Under the hood (except for FEs and ubsan from FEs) GCC middle-end doesn't
consider UB in left shifts dependent on the first operand's value, only
the out of bounds shifts.
While this change isn't a regression, I'd think it is useful for GCC 12,
it doesn't add new warnings, but just removes warnings that aren't
appropriate.
2022-03-09 Jakub Jelinek <jakub@redhat.com>
PR c/104711
gcc/
* doc/invoke.texi (-Wextra): Document that -Wshift-negative-value
is enabled by it only for C++11 to C++17 rather than for C++03 or
later.
(-Wshift-negative-value): Similarly (except here we stated
that it is enabled for C++11 or later).
gcc/c-family/
* c-opts.c (c_common_post_options): Don't enable
-Wshift-negative-value from -Wextra for C++20 or later.
* c-ubsan.c (ubsan_instrument_shift): Adjust comments.
* c-warn.c (maybe_warn_shift_overflow): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
gcc/c/
* c-fold.c (c_fully_fold_internal): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
* c-typeck.c (build_binary_op): Likewise.
gcc/cp/
* constexpr.c (cxx_eval_check_shift_p): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
* typeck.c (cp_build_binary_op): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
gcc/testsuite/
* c-c++-common/Wshift-negative-value-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-negative-value-2.c: Likewise.
* c-c++-common/Wshift-negative-value-3.c: Likewise.
* c-c++-common/Wshift-negative-value-4.c: Likewise.
* c-c++-common/Wshift-negative-value-7.c: New test.
* c-c++-common/Wshift-negative-value-8.c: New test.
* c-c++-common/Wshift-negative-value-9.c: New test.
* c-c++-common/Wshift-negative-value-10.c: New test.
* c-c++-common/Wshift-overflow-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-overflow-2.c: Likewise.
* c-c++-common/Wshift-overflow-5.c: Likewise.
* c-c++-common/Wshift-overflow-6.c: Likewise.
* c-c++-common/Wshift-overflow-7.c: Likewise.
* c-c++-common/Wshift-overflow-8.c: New test.
* c-c++-common/Wshift-overflow-9.c: New test.
* c-c++-common/Wshift-overflow-10.c: New test.
* c-c++-common/Wshift-overflow-11.c: New test.
* c-c++-common/Wshift-overflow-12.c: New test.
Jakub Jelinek [Tue, 8 Mar 2022 20:41:21 +0000 (21:41 +0100)]
c++: Don't suggest cdtor or conversion op identifiers in spelling hints [PR104806]
On the following testcase, we emit "did you mean '__dt '?" in the error
message. "__dt " shows there because it is dtor_identifier, but we
shouldn't suggest those to the user, they are purely internal and can't
be really typed by the user because of the final space in it.
2022-03-08 Jakub Jelinek <jakub@redhat.com>
PR c++/104806
* search.c (lookup_field_fuzzy_info::fuzzy_lookup_field): Ignore
identifiers with space at the end.
Jakub Jelinek [Mon, 7 Mar 2022 10:14:04 +0000 (11:14 +0100)]
s390: Fix up *cmp_and_trap_unsigned_int<mode> constraints [PR104775]
The following testcase fails to assemble due to clgte %r6,0(%r1,%r10)
insn not being accepted by assembler.
My rough understanding is that in the RSY-b insn format the spot
in other formats used for index registers is used instead for M3 what
kind of comparison it is, so this patch follows what other similar
instructions use for constraint (i.e. one without index register).
2022-03-07 Jakub Jelinek <jakub@redhat.com>
PR target/104775
* config/s390/s390.md (*cmp_and_trap_unsigned_int<mode>): Use
S constraint instead of T in the last alternative.
Jakub Jelinek [Wed, 2 Mar 2022 09:48:14 +0000 (10:48 +0100)]
cfgrtl: Fix up -g vs. -g0 code generation -flto differences in fixup_reorder_chain [PR104589]
This is similar to PR104237 and similarly to that, no testcase included
for the testsuite, as we don't have a framework to compile/link with
-g -flto and -g0 -flto and compare -fdump-final-insns= results from
the lto1 compilations.
With -flto, whether two location_t compare equal or not and just
express the same location is a lottery.
2022-03-02 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/104589
* cfgrtl.c (fixup_reorder_chain): Use loc_equal instead of direct
INSN_LOCATION comparison with goto_locus.
Jakub Jelinek [Fri, 25 Feb 2022 20:25:12 +0000 (21:25 +0100)]
match.pd: Further complex simplification fixes [PR104675]
Mark mentioned in the PR further 2 simplifications that also ICE
with complex types.
For these, eventually (but IMO GCC 13 materials) we could support it
for vector types if it would be uniform vector constants.
Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs
and we can't use bit_and etc. for complex type.
2022-02-25 Jakub Jelinek <jakub@redhat.com>
Marc Glisse <marc.glisse@inria.fr>
PR tree-optimization/104675
* match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1):
Restrict simplifications to INTEGRAL_TYPE_P.
Jakub Jelinek [Fri, 25 Feb 2022 17:58:48 +0000 (18:58 +0100)]
rs6000: Use rs6000_emit_move in movmisalign<mode> expander [PR104681]
The following testcase ICEs, because for some strange reason it decides to use
movmisaligntf during expansion where the destination is MEM and source is
CONST_DOUBLE. For normal mov<mode> expanders the rs6000 backend uses
rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG
and a few other things, but for movmisalign<mode> nothing enforced this.
The middle-end documents that movmisalign<mode> shouldn't fail, so we can't
force that through predicates or condition on the expander.
2022-02-25 Jakub Jelinek <jakub@redhat.com>
PR target/104681
* config/rs6000/vector.md (movmisalign<mode>): Use rs6000_emit_move.
Jakub Jelinek [Fri, 25 Feb 2022 11:06:52 +0000 (12:06 +0100)]
i386: Use a new temp slot kind for splitter to floatdi<mode>2_i387_with_xmm [PR104674]
As mentioned in the PR, the following testcase is miscompiled for similar
reasons as the already fixed PR78791 - we use SLOT_TEMP slots in various
places during expansion and during expansion we can guarantee that the
lifetime of those temporary slot doesn't overlap. But the following
splitter uses SLOT_TEMP too and in between expansion and split1 there is
a possibility that something extends the lifetime of SLOT_TEMP created
slots across an instruction that will be split by this splitter.
The following patch fixes it by using a new temp slot kind to make sure
it doesn't reuse a SLOT_TEMP that could be live across the instruction.
2022-02-25 Jakub Jelinek <jakub@redhat.com>
PR target/104674
* config/i386/i386.h (enum ix86_stack_slot): Add SLOT_FLOATxFDI_387.
* config/i386/i386.md (splitter to floatdi<mode>2_i387_with_xmm): Use
SLOT_FLOATxFDI_387 rather than SLOT_TEMP.
Jakub Jelinek [Fri, 25 Feb 2022 09:55:17 +0000 (10:55 +0100)]
match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]
We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types,
&/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR.
So, we should avoid simplifications which turn valid complex type
expressions into something that will ICE during expansion.
2022-02-25 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/104675
* match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for
COMPLEX_TYPE.
* gcc.dg/pr104675-1.c: New test.
* gcc.dg/pr104675-2.c: New test.
Jakub Jelinek [Thu, 24 Feb 2022 14:29:02 +0000 (15:29 +0100)]
sccvn: Fix visit_reference_op_call value numbering of vdefs [PR104601]
The following testcase is miscompiled, because -fipa-pure-const discovers
that bar is const, but when sccvn during fre3 sees
# .MEM_140 = VDEF <.MEM_96>
*__pred$__d_43 = _50 (_49);
where _50 value numbers to &bar, it value numbers .MEM_140 to
vuse_ssa_val (gimple_vuse (stmt)). For const/pure calls that return
a SSA_NAME (or don't have lhs) that is fine, those calls don't store
anything, but if the lhs is present and not an SSA_NAME, value numbering
the vdef to anything but itself means that e.g. walk_non_aliased_vuses
won't consider the call, but the call acts as a store to its lhs.
When it is ignored, sccvn will return whatever has been stored to the
lhs earlier.
I've bootstrapped/regtested an earlier version of this patch, which did the
if (!lhs && gimple_call_lhs (stmt))
changed |= set_ssa_val_to (vdef, vdef);
part before else if (vnresult->result_vdef), and that regressed
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo \\\\(" 1
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo2 \\\\(" 1
so this updated patch uses result_vdef there as before and only otherwise
(which I think must be the const/pure case) decides based on whether the
lhs is non-SSA_NAME.
2022-02-24 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/104601
* tree-ssa-sccvn.c (visit_reference_op_call): For calls with
non-SSA_NAME lhs value number vdef to itself instead of e.g. the
vuse value number.
Jakub Jelinek [Tue, 22 Feb 2022 10:32:08 +0000 (11:32 +0100)]
libiberty: Fix up debug.temp.o creation if *.o has 64K+ sections [PR104617]
On
#define A(n) int foo1##n(void) { return 1##n; }
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325)
B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642)
testcase with
./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections
./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o
/tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized
(testcase too slow to be included into testsuite).
The problem is clearly reported by readelf:
readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323
readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section.
because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and
sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE
inclusive. Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE
range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym
where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be
used instead and .symtab_shndx section should contain the real section
index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >=
SHN_LORESERVE value is needed it should put those into
Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can
contain any section index.
Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before
2011) used to mishandle the > 63.75K sections case and assumed there is a
hole in between the sections, but what
simple_object_elf_copy_lto_debug_sections does wouldn't help in that case
for the debug temp object creation, we'd need to detect the case also in
that routine and take it into account in the remapping etc. I think
it is not worth it given that it is over 10 years, if somebody needs
63.75K or more sections, better use more recent binutils.
2022-02-22 Jakub Jelinek <jakub@redhat.com>
PR lto/104617
* simple-object-elf.c (simple_object_elf_match): Fix up URL
in comment.
(simple_object_elf_copy_lto_debug_sections): Remap sh_info and
sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE
range (inclusive).
<bb 9> [local count: 1073741824]:
<retval>.x = 0;
The problem is during expansion, <retval> isn't marked TREE_ADDRESSABLE,
even when we take its address in (unsigned long) &<retval>.x.
Now, instrument_derefs has code to avoid the instrumentation altogether
if we can prove the access is within bounds of an automatic variable in the
current function and the var isn't TREE_ADDRESSABLE (or we don't instrument
use after scope), but we do it solely for VAR_DECLs.
I think we should treat RESULT_DECLs exactly like that too, which is what
the following patch does. I must say I'm unsure about PARM_DECLs, those can
have different cases, either they are fully or partially passed in
registers, then if we take parameter's address, they are in a local copy
inside of a function and so work like those automatic vars. But if they
are fully passed in memory, we typically just take address of the slot
and in that case they live in the caller's frame. It is true we don't
(can't) put any asan padding in between the arguments, so all asan could
detect in that case is if caller passes fewer on stack arguments or smaller
arguments than callee accepts. Anyway, as I'm unsure, I haven't added
PARM_DECLs to that case.
And another thing is, when we actually build_fold_addr_expr, we need to
mark_addressable the inner if it isn't addressable already.
2022-02-19 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/102656
* asan.c (instrument_derefs): If inner is a RESULT_DECL and access is
known to be within bounds, treat it like automatic variables.
If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from
current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark
it addressable.
Marek Polacek [Tue, 22 Mar 2022 18:37:02 +0000 (14:37 -0400)]
c: -Wmissing-field-initializers and designated inits [PR82283, PR84685]
This patch fixes two kinds of wrong -Wmissing-field-initializers
warnings. Our docs say that this warning "does not warn about designated
initializers", but we give a warning for
1) the array case:
struct S {
struct N {
int a;
int b;
} c[1];
} d = {
.c[0].a = 1,
.c[0].b = 1, // missing initializer for field 'b' of 'struct N'
};
we warn because push_init_level, when constructing an array, clears
constructor_designated (which the warning relies on), and we forget
that we were in a designated initializer context. Fixed by the
push_init_level hunk; and
2) the compound literal case:
struct T {
int a;
int *b;
int c;
};
struct T t = { .b = (int[]){1} }; // missing initializer for field 'c' of 'struct T'
where set_designator properly sets constructor_designated to 1, but the
compound literal causes us to create a whole new initializer_stack in
start_init, which clears constructor_designated. Then, after we've parsed
the compound literal, finish_init flushes the initializer_stack entry,
but doesn't restore constructor_designated, so we forget we were in
a designated initializer context, which causes the bogus warning. (The
designated flag is also tracked in constructor_stack, but in this case,
we didn't perform push_init_level between set_designator and start_init
so it wasn't saved anywhere.)
PR c/82283
PR c/84685
gcc/c/ChangeLog:
* c-typeck.c (struct initializer_stack): Add 'designated' member.
(start_init): Set it.
(finish_init): Restore constructor_designated.
(push_init_level): Set constructor_designated to the value of
constructor_designated in the upper constructor_stack.
gcc/testsuite/ChangeLog:
* gcc.dg/Wmissing-field-initializers-1.c: New test.
* gcc.dg/Wmissing-field-initializers-2.c: New test.
* gcc.dg/Wmissing-field-initializers-3.c: New test.
* gcc.dg/Wmissing-field-initializers-4.c: New test.
* gcc.dg/Wmissing-field-initializers-5.c: New test.
Marek Polacek [Wed, 23 Mar 2022 21:12:29 +0000 (17:12 -0400)]
c++: alignas and alignof void [PR104944]
I started looking into this PR because in GCC 4.9 we were able to
detect the invalid
struct alignas(void) S{};
but I broke it in r210262.
It's ill-formed code in C++:
[dcl.align]/3: "An alignment-specifier of the form alignas(type-id) has
the same effect as alignas(alignof(type-id))", and [expr.align]/1:
"The operand shall be a type-id representing a complete object type,
or an array thereof, or a reference to one of those types." and void
is not a complete type.
It's also invalid in C:
6.7.5: _Alignas(type-name) is equivalent to _Alignas(_Alignof(type-name))
6.5.3.4: "The _Alignof operator shall not be applied to a function type
or an incomplete type."
We have a GNU extension whereby we treat sizeof(void) as 1, but I assume
it doesn't apply to alignof, at least in C++. However, __alignof__(void)
is still accepted with a -Wpedantic warning.
We still say "invalid application of 'alignof'" rather than 'alignas' in the
void diagnostic, but I felt that fixing that may not be suitable as part of
this patch. The "incomplete type" diagnostic still always prints
'__alignof__'.
Marek Polacek [Thu, 17 Mar 2022 18:39:37 +0000 (14:39 -0400)]
c++: ICE with template code in constexpr [PR104284]
Since r9-6073 cxx_eval_store_expression preevaluates the value to
be stored, and that revealed a crash where a template code (here,
code=IMPLICIT_CONV_EXPR) leaks into cxx_eval*.
It happens because we're performing build_vec_init while processing
a template, which calls get_temp_regvar which creates an INIT_EXPR.
This INIT_EXPR's RHS contains an rvalue conversion so we create an
IMPLICIT_CONV_EXPR. Its operand is not type-dependent and the whole
INIT_EXPR is not type-dependent. So we call build_non_dependent_expr
which, with -fchecking=2, calls fold_non_dependent_expr. At this
point the expression still has an IMPLICIT_CONV_EXPR, which ought to
be handled in instantiate_non_dependent_expr_internal. However,
tsubst_copy_and_build doesn't handle INIT_EXPR; it will just call
tsubst_copy which does nothing when args is null. So we fail to
replace the IMPLICIT_CONV_EXPR and ICE.
The problem is that we call build_vec_init in a template in the
first place. We can avoid doing so by checking p_t_d before
calling build_aggr_init in check_initializer.
PR c++/104284
gcc/cp/ChangeLog:
* decl.c (check_initializer): Don't call build_aggr_init in
a template.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/constexpr-104284-1.C: New test.
* g++.dg/cpp1y/constexpr-104284-2.C: New test.
* g++.dg/cpp1y/constexpr-104284-3.C: New test.
* g++.dg/cpp1y/constexpr-104284-4.C: New test.
Marek Polacek [Tue, 8 Mar 2022 18:55:15 +0000 (13:55 -0500)]
c++: Wrong error with alias template in class tmpl [PR104108]
In r10-6329 I tried to optimize the number of calls to v_d_e_p in
convert_nontype_argument by remembering whether the expression was
value-dependent in a bool flag. I did that wrongly assuming that its
value-dependence will not be changed by build_converted_constant_expr.
This testcase shows that it can: b_c_c_e gets a VAR_DECL for m_parameter,
which is not value-dependent, but we're converting it to "const int &"
so it returns
(const int &)(const int *) &m_parameter
which suddenly becomes value-dependent because of the added ADDR_EXPR:
has_value_dependent_address is now true because m_parameter's context S<T>
is dependent. With this bug in place, we went to the second branch here:
if (TYPE_REF_OBJ_P (TREE_TYPE (expr)) && val_dep_p)
/* OK, dependent reference. We don't want to ask whether a DECL is
itself value-dependent, since what we want here is its address. */;
else
{
expr = build_address (expr);
if (invalid_tparm_referent_p (type, expr, complain))
return NULL_TREE;
}
wherein build_address created a bad tree and then i_t_r_p complained.
PR c++/104108
gcc/cp/ChangeLog:
* pt.c (convert_nontype_argument): Recompute
value_dependent_expression_p after build_converted_constant_expr.
Marek Polacek [Tue, 22 Mar 2022 22:42:27 +0000 (18:42 -0400)]
c++: FIX_TRUNC_EXPR in tsubst [PR102990]
This is a crash where a FIX_TRUNC_EXPR gets into tsubst_copy_and_build
where it hits gcc_unreachable ().
The history of tsubst_copy_and_build/FIX_TRUNC_EXPR is such that it
was introduced in r181478, but it did the wrong thing, whereupon it
was turned into gcc_unreachable () in r258821 (see this thread:
<https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495853.html>).
In a template, we should never create a FIX_TRUNC_EXPR (that's what
conv_unsafe_in_template_p is for). But in this test we are NOT in
a template when we call digest_nsdmi_init which ends up calling
convert_like, converting 1.0e+0 to int, so convert_to_integer_1
gives us a FIX_TRUNC_EXPR.
But then when we get to parsing f's parameters, we are in a template
when processing decltype(Helpers{}), and since r268321, when the
compound literal isn't instantiation-dependent and the type isn't
type-dependent, finish_compound_literal falls back to the normal
processing, so it calls digest_init, which does fold_non_dependent_init
and since the FIX_TRUNC_EXPR isn't dependent, we instantiate and
therefore crash in tsubst_copy_and_build.
The fateful call to fold_non_dependent_init comes from massage_init_elt,
We shouldn't be calling f_n_d_i on the result of get_nsdmi. This we can
avoid by eschewing calling f_n_d_i on CONSTRUCTORs; their elements have
already been folded.
Marek Polacek [Tue, 13 Jul 2021 21:16:54 +0000 (17:16 -0400)]
c++: constexpr array reference and value-initialization [PR101371]
This PR gave me a hard time: I saw multiple issues starting with
different revisions. But ultimately the root cause seems to be
the following, and the attached patch fixes all issues I've found
here.
In cxx_eval_array_reference we create a new constexpr context for the
CP_AGGREGATE_TYPE_P case, but we also have to create it for the
non-aggregate case. In this test, we are evaluating
((B *)this)->a = rhs->a
which means that we set ctx.object to ((B *)this)->a. Then we proceed
to evaluate the initializer, rhs->a. For *rhs, we eval rhs, a PARM_DECL,
for which we have (const B &) &c.arr[0] in the hash table. Then
cxx_fold_indirect_ref gives us c.arr[0]. c is evaluated to {.arr={}} so
c.arr is {}. Now we want c.arr[0], so we end up in cxx_eval_array_reference
and since we're initializing from {}, we call build_value_init which
gives us an AGGR_INIT_EXPR that calls 'constexpr B::B()'. Then we
evaluate this AGGR_INIT_EXPR and since its first argument is dummy,
we take ctx.object instead. But that is the wrong object, we're not
initializing ((B *)this)->a here. And so we wound up with an
initializer for A, and then crash in cxx_eval_component_reference:
Jason Merrill [Thu, 10 Feb 2022 22:57:38 +0000 (17:57 -0500)]
c++: TTP in member alias template [PR104107]
In the first testcase, coerce_template_template_parms was adding too much of
outer_args when coercing to match P's template parameters, so that when
substituting into the 'const T&' parameter we got an unrelated template
argument for T. We should only add outer_args when the argument template is
a nested template.
PR c++/104107
PR c++/95036
gcc/cp/ChangeLog:
* pt.c (coerce_template_template_parms): Take full parms.
Avoid adding too much of outer_args.
(coerce_template_template_parm): Adjust.
(template_template_parm_bindings_ok_p): Adjust.
(convert_template_argument): Adjust.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/alias-decl-ttp2.C: New test.
* g++.dg/cpp1z/ttp2.C: New test.
Jason Merrill [Fri, 25 Mar 2022 15:26:06 +0000 (11:26 -0400)]
c++: ICE with alias in pack expansion [PR103769]
This was breaking because when we stripped the 't' typedef in s<t<Args>...>
to be s<Args...>, the TYPE_MAIN_VARIANT of "Args..." was still
"t<Args>...", because type pack expansions are treated as types. Fixed by
using the right function to copy a "type".
PR c++/99445
PR c++/103769
gcc/cp/ChangeLog:
* tree.c (strip_typedefs): Use build_distinct_type_copy.
Jason Merrill [Sun, 27 Mar 2022 00:10:19 +0000 (20:10 -0400)]
c++: mangling union{1} in template [PR104847]
My implementation of union non-type template arguments in r11-2016 broke
braced casts of union type, because they are still in syntactic (undigested)
form.
PR c++/104847
gcc/cp/ChangeLog:
* mangle.c (write_expression): Don't write a union designator when
undigested.
Jason Merrill [Sun, 27 Mar 2022 00:38:54 +0000 (20:38 -0400)]
c++: missing aggregate base ctor [PR102045]
When make_base_init_ok changes a call to a complete constructor into a call
to a base constructor, we were never marking the base ctor as used, so it
didn't get emitted.
Jason Merrill [Sun, 27 Mar 2022 04:28:30 +0000 (00:28 -0400)]
c++: member alias declaration [PR103968]
Here, we were wrongly thinking that (const Options&)Widget<T>::c_options is
not value-dependent because neither the type nor the (value of) c_options
are dependent, but since we're binding it to a reference we also need to
consider that it has a value-dependent address.
PR c++/103968
gcc/cp/ChangeLog:
* pt.c (value_dependent_expression_p): Check
has_value_dependent_address for conversion to reference.
Jason Merrill [Sun, 27 Mar 2022 03:54:22 +0000 (23:54 -0400)]
c++: CTAD and member alias template [PR102123]
When building a deduction guide from the Test constructor, we need to
rewrite the use of _dummy into a dependent reference, i.e. Test<T>::template
_dummy. We were using SCOPE_REF for both type and non-type templates; we
need to use UNBOUND_CLASS_TEMPLATE for type templates.
PR c++/102123
gcc/cp/ChangeLog:
* pt.c (tsubst_copy): Use make_unbound_class_template for rewriting
a type template reference.
Kito Cheng [Mon, 8 Nov 2021 14:45:49 +0000 (22:45 +0800)]
[PR/target 102957] Allow Z*-ext extension with only 2 char.
We was assume the Z* extension should be more than 2 char, so we put an
assertion there, but it should just an error or warning rather than an
assertion, however RISC-V has add `Zk` extension, which just 2 char, so
actually, we should just allow that.
gcc/ChangeLog
PR target/102957
* common/config/riscv/riscv-common.c (multi_letter_subset_rank): Remove
assertion for Z*-ext.
Jakub Jelinek [Mon, 14 Mar 2022 09:44:38 +0000 (10:44 +0100)]
i386: Fix up _mm_loadu_si{16,32} [PR99754]
These intrinsics are supposed to do an unaligned may_alias load
of a 16-bit or 32-bit value and store it as the first element of
a 128-bit integer vector, with all other elements cleared.
The current _mm_storeu_* implementation implements that correctly, uses
__*_u types to do the store and extracts the first element of a vector into
it.
But _mm_loadu_si{16,32} gets it all wrong. It performs an aligned
non-may_alias load and because _mm_set_epi{16,32} has the args reversed,
it also inserts it into the last vector element instead of first.
The following patch fixes that.
Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2,
for _mm_loadu_si16 it says strangely SSE. But the intrinsics
returns __m128i, which is only defined in emmintrin.h, and
_mm_set_epi16 is also only SSE2 and later in emmintrin.h.
Even clang defines it in emmintrin.h and ends up with inlining
failure when calling _mm_loadu_si16 from sse,no-sse2 function.
So, isn't that a bug in the intrinsic guide instead?
2022-03-14 Jakub Jelinek <jakub@redhat.com>
PR target/99754
* config/i386/emmintrin.h (_mm_loadu_si32): Put loaded value into
first rather than last element of the vector, use __m32_u to do
a really unaligned load, use just 0 instead of (int)0.
(_mm_loadu_si16): Put loaded value into first rather than last
element of the vector, use __m16_u to do a really unaligned load,
use just 0 instead of (short)0.
* gcc.target/i386/pr99754-1.c: New test.
* gcc.target/i386/pr99754-2.c: New test.