git.ipfire.org Git - thirdparty/gcc.git/log

c++: double non-dep folding from finish_compound_literal [PR104565]

In finish_compound_literal, we perform non-dependent expr folding before
the call to check_narrowing ever since r9-5973. But ever since r10-7096,
check_narrowing also performs non-dependent expr folding of its own.
This double folding means tsubst will see non-templated trees during the
second folding, which causes a spurious error in the below testcase.

This patch removes the former folding operation; it seems obviated by
the latter one.

PR c++/104565

gcc/cp/ChangeLog:

* semantics.c (finish_compound_literal): Don't perform
non-dependent expr folding before calling check_narrowing.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent22.C: New test.

(cherry picked from commit 6bbd8afee0036c274f5ebb5b48d6fdc2091bd046)

c++: dependence of member noexcept-spec [PR104079]

Here a stale TYPE_DEPENDENT_P/_P_VALID value for f's function type
after replacing the type's DEFERRED_NOEXCEPT with the parsed dependent
noexcept-spec causes us to try to instantiate g's noexcept-spec ahead
of time (since it in turn appears non-dependent), leading to an ICE.

This patch fixes this by clearing TYPE_DEPENDENT_P_VALID in
fixup_deferred_exception_variants appropriately (as in
build_cp_fntype_variant).

That turns out to fix the testcase for C++17 but not for C++11/14,
because it's not until C++17 that a noexcept-spec is part of (and
therefore affects dependence of) the function type. Since dependence of
NOEXCEPT_EXPR is defined in terms of instantiation dependence, the most
appropriate fix for earlier dialects seems to be to make instantiation
dependence consider dependence of a noexcept-spec.

PR c++/104079

gcc/cp/ChangeLog:

* pt.c (value_dependent_noexcept_spec_p): New predicate split
out from ...
(dependent_type_p_r): ... here.
(instantiation_dependent_r): Use value_dependent_noexcept_spec_p
to consider dependence of a noexcept-spec before C++17.
* tree.c (fixup_deferred_exception_variants): Clear
TYPE_DEPENDENT_P_VALID.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept74.C: New test.
* g++.dg/cpp0x/noexcept74a.C: New test.

(cherry picked from commit 82e31c8973eb1a752c2ffd01005efe291d35cee3)

c++: ICE when building builtin operator->* set [PR103455]

Here when constructing the builtin operator->* candidate set according
to the available conversion functions for the operand types, we end up
considering a candidate with C1=T (through B's dependent conversion
function) and C2=F, during which we crash from DERIVED_FROM_P because
dependent_type_p sees a TEMPLATE_TYPE_PARM outside of a template
context.

Sidestepping the question of whether we should be considering such a
dependent conversion function here in the first place, it seems futile
to test DERIVED_FROM_P for anything other than an actual class type, so
this patch fixes this ICE by simply guarding the DERIVED_FROM_P test
with CLASS_TYPE_P instead of MAYBE_CLASS_TYPE_P.

PR c++/103455

gcc/cp/ChangeLog:

* call.c (add_builtin_candidate) <case MEMBER_REF>: Test
CLASS_TYPE_P instead of MAYBE_CLASS_TYPE_P.

gcc/testsuite/ChangeLog:

* g++.dg/overload/builtin6.C: New test.

(cherry picked from commit 04f19580e8dbdbc7366d0f5fd068aa0cecafdc9d)

Fortran: improve error recovery for invalid coarray function declarations

gcc/fortran/ChangeLog:

PR fortran/104210
* arith.c (eval_intrinsic): Avoid NULL pointer dereference.
(gfc_zero_size_array): Likewise.

gcc/testsuite/ChangeLog:

PR fortran/104210
* gfortran.dg/pr104210.f90: New test.

(cherry picked from commit 892c7f03ae63121766a8be48f7e3b672547fd627)

Fix handling of in_flags in update_escape_summary_1

update_escape_summary_1 has thinko where it compues proper min_flags but then
stores original value (ignoring the fact whether there was a dereference
in the escape point).

PR ipa/103432
* ipa-modref.c (update_escape_summary_1): Fix handling of min_flags.

(cherry picked from commit a70faf6e4df7481c2c9a08a06657c20beb3043de)

Fix min_flags handling in mod-ref

gcc/ChangeLog:

2021-08-11 Jan Hubicka <hubicka@ucw.cz>
Alexandre Oliva <oliva@adacore.com>

* ipa-modref.c (modref_lattice::dump): Fix escape_point's min_flags
dumping.
(modref_lattice::merge_deref): Fix handling of indirect scape points.
(update_escape_summary_1): Likewise.
(update_escape_summary): Likewise.
(ipa_merge_modref_summary_after_inlining): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/modref-dse.c: New test.

(cherry picked from commit 9851a1631f2915fafdc733539b6c8b5fb81e7ae5)

c++: Fix ICE due to shared BLOCK node in coroutine generation [PR103328]

When finishing a function that is a coroutine, the function is
transformed into a "ramp" function, and the original user-provided
function body gets moved into a newly created "actor" function.

In this case `current_function_decl` points to the ramp function,
but `current_binding_level->blocks` would still point to the
scope block of the user-provided function body in the actor function,
so when the ramp function was finished during `poplevel()` in decl.cc,
we could end up with that block being reused as the `DECL_INITIAL()` of
the ramp function:

   subblocks = functionbody >= 0 ? current_binding_level->blocks : 0;
   // [...]
   DECL_INITIAL (current_function_decl) = block ? block : subblocks;

This block would then be independently modified by subsequent passes
touching either the ramp or the actor function, potentially causing
an ICE depending on the order and function of these passes.

gcc/cp/ChangeLog:

PR c++/103328
* coroutines.cc (morph_fn_to_coro): Reset
current_binding_level->blocks.

gcc/testsuite/ChangeLog:

PR c++/103328
* g++.dg/coroutines/pr103328.C: New test.

Co-Authored-By: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 0847ad33b908af88bca1e6980d0b977316d05e18)

Use OEP_DECL_NAME when comparing VLA bounds [PR101585].

Resolves:
PR c/101585 - Bad interaction of -fsanitize=undefined and -Wvla-parameters

gcc/c-family:

PR c/101585
* c-warn.c (warn_parm_ptrarray_mismatch): Use OEP_DECL_NAME.

gcc/testsuite:
PR c/101585
* gcc.dg/Wvla-parameter-13.c: New test.

(cherry picked from commit a0f9a5dcc3bbe6c7de499e17d201d0f2cb512649)

tree-optimization/99121 - avoid ICEing for non-constant sizes

The following is a simple fix to avoid ICEing on non-constant
sizes of ARRAY_REFs instead of backporting too intrusive changes
done on trunk.

2022-04-07 Richard Biener <rguenther@suse.de>

PR tree-optimization/99121
* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
Bail out for non-constant type size.

Fix target/100106 ICE in gen_movdi

As the test case shows, the outer mode may have a higher alignment
requirement than the inner mode here.

2021-04-27 Bernd Edlinger <bernd.edlinger@hotmail.de>

PR target/100106
* simplify-rtx.c (simplify_context::simplify_subreg): Check the
memory alignment for the outer mode.

* gcc.c-torture/compile/pr100106.c: New testcase.

(cherry picked from commit c33db31d9ad96f6414460315c12b4b505fad5dd7)

middle-end/104497 - gimplification of vector indexing

The following attempts to address gimplification of

   ... = VIEW_CONVERT_EXPR<int[4]>((i & 1) != 0 ? inv : src)[i];

which is problematic since gimplifying the base object
? inv : src produces a register temporary but GIMPLE does not
really support a register as a base for an ARRAY_REF (even
though that's not strictly validated it seems as can be seen
at -O0).  Interestingly the C++ frontend avoids this issue
by emitting the following GENERIC instead:

   ... = (i & 1) != 0 ? VIEW_CONVERT_EXPR<int[4]>(inv)[i] : VIEW_CONVERT_EXPR<int[4]>(src)[i];

The proposed patch below fixes things up when using an rvalue
as the base is OK by emitting a copy from a register base to a
non-register one.  The ?: as lvalue extension seems to be gone
for C, C++ again unwraps the COND_EXPR in that case.

2022-02-11  Richard Biener  <rguenther@suse.de>

PR middle-end/104497
* gimplify.c (gimplify_compound_lval): Make sure the
base is a non-register if needed and possible.

* c-c++-common/torture/pr104497.c: New testcase.

tree-optimization/105053 - fix reduction chain epilogue generation

When we optimize permutations in a reduction chain we have to
be careful to select the correct live-out stmt, otherwise the
reduction result will be unused and the retained scalar code will
execute only the number of vector iterations.

2022-03-25 Richard Biener <rguenther@suse.de>

PR tree-optimization/105053
* tree-vect-loop.c (vect_create_epilog_for_reduction): Pick
the correct live-out stmt for a reduction chain.

* g++.dg/vect/pr105053.cc: New testcase.

[COMMITTED] Fix PR aarch64/104474: ICE with vector float initializers and non-consts.

The problem here is that the aarch64 back-end was placing const0_rtx
into the constant vector RTL even if the mode was a floating point mode.
The fix is instead to use CONST0_RTX and pass the mode to select the
correct zero (either const_int or const_double).

Committed as obvious after a bootstrap/test on aarch64-linux-gnu with
no regressions.

PR target/104474

gcc/ChangeLog:

* config/aarch64/aarch64.c
(aarch64_sve_expand_vector_init_handle_trailing_constants):
Use CONST0_RTX instead of const0_rtx for the non-constant elements.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pr104474-1.c: New test.
* gcc.target/aarch64/sve/pr104474-2.c: New test.
* gcc.target/aarch64/sve/pr104474-3.c: New test.

(cherry picked from commit 41582f88ec01c5ce2f85ebc4ac2743eb426d6e33)

tree-optimization/105070 - annotate bit cluster tests with locations

The following makes sure to annotate the tests generated by
switch lowering bit-clustering with locations which otherwise
can be completely lost even at -O0.

2022-03-28 Richard Biener <rguenther@suse.de>

PR tree-optimization/105070
* tree-switch-conversion.h
(bit_test_cluster::hoist_edge_and_branch_if_true): Add location
argument.
* tree-switch-conversion.c
(bit_test_cluster::hoist_edge_and_branch_if_true): Annotate
cond with location.
(bit_test_cluster::emit): Annotate all generated expressions
with location.

(cherry picked from commit bc86a86a4f2c057bc0e0be94dcbb8c128ae7f717)

rtl-optimization/105028 - fix compile-time hog in form_threads_from_copies

form_threads_from_copies processes a sorted array of copies, skipping
those with the same thread and conflicting threads and merging the
first non-conflicting ones.  After that it terminates the loop and
gathers the remaining elements of the array, skipping same thread
copies, re-starting the process.  For a large number of copies this
gathering of the rest takes considerable time and it also appears
pointless.  The following simply continues processing the array
which should be equivalent as far as I can see.

This takes form_threads_from_copies off the profile radar from
previously taking ~50% of the compile-time.

2022-03-23  Richard Biener  <rguenther@suse.de>

PR rtl-optimization/105028
* ira-color.c (form_threads_from_copies): Remove unnecessary
copying of the sorted_copies tail.

(cherry picked from commit 1daa198aafd72925ca8dd8616385f523ff180d4a)

tree-optimization/104880 - update-address-taken and cmpxchg

The following addresses optimistic non-addressable marking of
an argument of __atomic_compare_exchange_n which broke when
I added DECL_NOT_GIMPLE_REG_P since we cannot guarantee we can
rewrite it when TREE_ADDRESSABLE is unset. Instead we have to
restore TREE_ADDRESSABLE in that case.

2022-03-11 Richard Biener <rguenther@suse.de>

PR tree-optimization/104880
* tree-ssa.c (execute_update_address_taken): Remember if we
optimistically made something not addressable and
prepare to undo it.

* g++.dg/opt/pr104880.cc: New testcase.

(cherry picked from commit eb5edcf3f3ae008a1c55c88f08a886a5f350a759)

middle-end/105165 - sorry instead of ICE for _Complex asm goto

Complex lowering cannot currently deal with asm gotos with _Complex
output operands. Emit a sorry instead of ICEing, those should not
appear in practice.

2022-04-06 Richard Biener <rguenther@suse.de>

PR middle-end/105165
* tree-complex.c (expand_complex_asm): Sorry for asm goto
_Complex outputs.

* gcc.dg/pr105165.c: New testcase.

(cherry picked from commit 54ed6563d22694aa3e1935f89641a4f696a3a9f7)

Daily bump.

ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

IPA_JF_ANCESTOR jump functions are constructed also when the formal
parameter of the caller is first checked whether it is NULL and left
as it is if it is NULL, to accommodate C++ casts to an ancestor class.

The jump function type was invented for devirtualization and IPA-CP
propagation of tree constants is also careful to apply it only to
existing DECLs(*) but as PR 103083 shows, the part propagating "known
bits" was not careful about this, which can lead to miscompilations.

This patch introduces a flag to the ancestor jump functions which
tells whether a NULL-check was elided when creating it and makes the
bits propagation behave accordingly, masking any bits otherwise would
be known to be one.  This should safely preserve alignment info, which
is the primary ifnormation that we keep in bits for pointers.

(*) There still may remain problems when a DECL resides on address
zero (with -fno-delete-null-pointer-checks ...I hope it cannot happen
otherwise).  I am looking into that now but I think it will be easier
for everyone if I do so in a follow-up patch.

gcc/ChangeLog:

2022-02-11  Martin Jambor  <mjambor@suse.cz>

PR ipa/103083
* ipa-prop.h (ipa_ancestor_jf_data): New flag keep_null;
(ipa_get_jf_ancestor_keep_null): New function.
* ipa-prop.c (ipa_set_ancestor_jf): Initialize keep_null field of the
ancestor function.
(compute_complex_assign_jump_func): Pass false to keep_null
parameter of ipa_set_ancestor_jf.
(compute_complex_ancestor_jump_func): Pass true to keep_null
parameter of ipa_set_ancestor_jf.
(update_jump_functions_after_inlining): Carry over keep_null from the
original ancestor jump-function or merge them.
(ipa_write_jump_function): Stream keep_null flag.
(ipa_read_jump_function): Likewise.
(ipa_print_node_jump_functions_for_edge): Print the new flag.
* ipa-cp.c (class ipcp_bits_lattice): Make various getters const.  New
member function known_nonzero_p.
(ipcp_bits_lattice::known_nonzero_p): New.
(ipcp_bits_lattice::meet_with_1): New parameter drop_all_ones,
observe it.
(ipcp_bits_lattice::meet_with): Likewise.
(propagate_bits_across_jump_function): Simplify.  Pass true in
drop_all_ones when it is necessary.
(propagate_aggs_across_jump_function): Take care of keep_null
flag.
(ipa_get_jf_ancestor_result): Propagate NULL accross keep_null
jump functions.

gcc/testsuite/ChangeLog:

2021-11-25  Martin Jambor  <mjambor@suse.cz>

* gcc.dg/ipa/pr103083-1.c: New test.
* gcc.dg/ipa/pr103083-2.c: Likewise.

(cherry picked from commit 7ea3a73c195a79e6740ae594ee1a14c8bf7a938d)

libstdc++: Make std::error_code printer more robust

This attempts to implement a partial workaround for the GDB bug
https://sourceware.org/bugzilla/show_bug.cgi?id=28856 which causes GDB
to crash when printing a frame with a std::error_code argument.

By recognising the known error categories defined in the library and
hardcoding their names we do not need to call cat->name() on the
category. This has the additional benefit of also working when
debugging a core file rather than a running process. For those known
categories we can also cast the int value to the corresponding error
code enum (e.g. future_errc) so that we show an enumerator instead of
just an integer.

For program-defined categories we just use the name of the dynamic type
to identify the category, and print the value as an integer. Once the
GDB bug is fixed and the virtual name() function can be called safely,
that would be preferable. For now it's better to have an imperfect
printer that doesn't crash GDB.

This rewritten StdErrorCodePrinter needs gdb.Value.dynamic_type, so is
only registered if that is supported, which means GDB 7.7 and later.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdErrorCodePrinter): Replace
code that call cat->name() on std::error_category objects.
Identify known categories by symbol name and use a hardcoded
name. Print error code values as enumerators where appopriate.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Adjust expected
name of custom category. Check io_errc and future_errc errors.

(cherry picked from commit 36100e0e952b92a6cd819620fcef851f0069ac8f)

libstdc++: Add missing constraints to std::bit_cast [PR105027]

Our std::bit_cast was relying on the compiler to check for errors inside
__builtin_bit_cast, instead of checking them as constraints. That means
std::bit_cast was not SFINAE-friendly.

This fix uses a requires-clause, so for old versions of Clang without
concepts support the function will still be unconstrained. At some point
in future we can remove the #ifdef __cpp_concepts check and rely on all
compilers having full concepts support in C++20 mode.

libstdc++-v3/ChangeLog:

PR libstdc++/105027
* include/std/bit (bit_cast): Add constraints.
* testsuite/26_numerics/bit/bit.cast/105027.cc: New test.

(cherry picked from commit 4894d69a1f37d54b6a612e58053db477ff5ba832)

libstdc++: Fix mismatched noexcept-specifiers in Filesystem TS

The copy_file fix should have been part of r12-7063-gda72e0fd20f87b.

The path::begin() fix should have been part of r12-3930-gf2b7f56a15d9cb.
Thanks to Timm Bäder for reporting this one.

libstdc++-v3/ChangeLog:

* include/experimental/bits/fs_fwd.h (copy_file): Remove
incorrect noexcept from declaration.
* include/experimental/bits/fs_path.h (path::begin, path::end):
Add noexcept to declarations, to match definitions.

(cherry picked from commit 944da70a5d1cdc5bd4327b2d32420f57b6883985)

libstdc++: Adjust Filesystem TS test for Windows

The Filesystem TS isn't really supported for Windows, but the FAIL for
this test is just because it doesn't match what happens on Windows.

libstdc++-v3/ChangeLog:

* testsuite/experimental/filesystem/operations/create_directories.cc:
Adjust expected results for Windows.

(cherry picked from commit 61b783995fac5355827ada1f8544052119a23606)

libstdc++: Do not use dirent::d_type unconditionally

These new tests should not use the d_type member unless it's actually
present on the OS.

libstdc++-v3/ChangeLog:

* testsuite/27_io/filesystem/iterators/error_reporting.cc: Use
autoconf macro to check whether d_type is present.
* testsuite/experimental/filesystem/iterators/error_reporting.cc:
Likewise.

(cherry picked from commit d98668eb06f532b2dbe0c721fa1b9ed6e643df27)

libstdc++: Reset filesystem::recursive_directory_iterator on error

The standard requires directory iterators to become equal to the end
iterator value if they report an error. Some members functions of
filesystem::recursive_directory_iterator fail to do that.

libstdc++-v3/ChangeLog:

* src/c++17/fs_dir.cc (recursive_directory_iterator::increment):
Reset state to past-the-end iterator on error.
(fs::recursive_directory_iterator::pop(error_code&)): Likewise.
(fs::recursive_directory_iterator::pop()): Check _M_dirs before
it might get reset.
* src/filesystem/dir.cc (recursive_directory_iterator): Likewise,
for the TS implementation.
* testsuite/27_io/filesystem/iterators/error_reporting.cc: New test.
* testsuite/experimental/filesystem/iterators/error_reporting.cc: New test.

(cherry picked from commit ec09a5335f0ade7071f6157dfd97dbb3de3e4f97)

libstdc++: Simplify std::allocator_traits<allocator<void>>::construct

We don't need a preprocessor condition to decide whether to use
placement new or std::construct_at, because std::_Construct already does
that.

libstdc++-v3/ChangeLog:

* include/bits/alloc_traits.h (allocator_traits<allocator<void>>):
Use std::_Construct for construct.

(cherry picked from commit 917c7b136e8b556b0027223058006a6caeb56871)

libstdc++: Remove un-implementable noexcept from Filesystem TS operations

LWG 3014 removed these incorrect noexcept specifications from the C++17
std::filesystem operations. They are also incorrect on the experimental
TS versions and should be removed from them too.

libstdc++-v3/ChangeLog:

* include/experimental/bits/fs_ops.h (fs::copy_file): Remove
noexcept.
(fs::create_directories): Likewise.
(fs::remove_all): Likewise.
* src/filesystem/ops.cc (fs::copy_file): Remove noexcept.
(fs::create_directories): Likewise.
(fs::remove_all): Likewise.

(cherry picked from commit da72e0fd20f87bb523a81a505c00546d3622e9dd)

libstdc++: Fix doxygen comment for filesystem::perms operators

libstdc++-v3/ChangeLog:

* include/bits/fs_fwd.h (filesystem::perms): Fix comment.

(cherry picked from commit 90263a48303a5ae552ea04c68ed7fa5da49b1876)

libstdc++: Rename non-reserved macros in config header [PR103650]

libstdc++-v3/ChangeLog:

PR libstdc++/103650
* include/Makefile.am: Rename LT_OBJDIR and STDC_HEADERS.
* include/Makefile.in: Regenerate.
* testsuite/17_intro/headers/c++1998/103650.cc: New test.

(cherry picked from commit fa092570fbaf3bb4202e518eb8beba146c464d9f)

libstdc++: Use __cpp_lib_concepts in std::reverse_iterator [PR104098]

We should not assume that std::iter_value_t etc. are defined
unconditionally for C++20 mode.

libstdc++-v3/ChangeLog:

PR libstdc++/104098
* include/bits/stl_iterator.h (reverse_iterator): Check
__cpp_lib_concepts instead of __cplusplus.

(cherry picked from commit e13e95bd274148a825bc9527efac49e99080dd64)

libstdc++: Remove -gdwarf-4 from flags for debug library

The default is -gdwarf-5 now, so this is hurting rather than improving
things.

libstdc++-v3/ChangeLog:

* configure.ac (GLIBCXX_ENABLE_DEBUG_FLAGS): Remove -gdwarf-4
from default flags.
* configure: Regenerate.

(cherry picked from commit fe3e978027724f28d3e15747c991844793d42922)

libstdc++: Document final option names for enabling C++20

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2020.xml: Use final C++20 option
names.
* doc/html/manual/status.html: Regenerate.

(cherry picked from commit 5a3dc58a1d7a792e776a59389e8901b614ce6d0d)

libstdc++: Add suggestion to std::uncaught_exception() warning

We should use the SUGGEST macro for std::uncaught_exception()
deprecation warnings.

libstdc++-v3/ChangeLog:

* include/bits/allocator.h: Qualify std::allocator_traits in
deprecated warnings.
* libsupc++/exception (uncaught_exception): Add suggestion to
deprecated warning.

(cherry picked from commit 27ba40559ccb887458009a34f710d4a22af85156)

libstdc++: Add missing constexpr to uses-allocator construction utilities [PR104542]

libstdc++-v3/ChangeLog:

PR libstdc++/104542
* include/bits/uses_allocator_args.h (make_obj_using_allocator)
(uninitialized_construct_using_allocator): Add constexpr.
* testsuite/20_util/uses_allocator/make_obj.cc: Check constexpr.
* testsuite/20_util/uses_allocator/uninitialized_construct.cc: New test.

(cherry picked from commit 6cfb7ffb659fd6b87a21312021ab023a06e8f6be)

libstdc++: Fix filenames in Doxygen @file comments

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/bits/fs_ops.h: Fix filename in Doxygen comment.
* include/experimental/bits/fs_ops.h: Likewise.

(cherry picked from commit 1e9c026848dd871266305d7e52292e0e10897f31)

libstdc++: Remove incorrect copyright notice from header

This file has the SGI copyright notice, but contains no code from
the SGI STL. It was entirely written by me in 2019, originally as part
of the <memory> header. When I extracted it into a new header I
accidentally copied across the SGI copyright, but that only applies to
some much older parts of <memory>.

libstdc++-v3/ChangeLog:

* include/bits/uses_allocator_args.h: Remove incorrect copyright
notice.

(cherry picked from commit 7cce7b1c3d829172eb7f232e71ad194a0ad51931)

libstdc++: Improve config output for --enable-cstdio [PR104301]

Currently we just print "checking for underlying I/O to use... stdio"
unconditionally, whether configured to use stdio_pure or stdio_posix. We
should make it clear that the user's configure option chose the right
thing.

libstdc++-v3/ChangeLog:

PR libstdc++/104301
* acinclude.m4 (GLIBCXX_ENABLE_CSTDIO): Print different messages
for stdio_pure and stdio_posix options.
* configure: Regenerate.

(cherry picked from commit 19b8946dbda5fda4389ef8e3ea162c3df2b1998d)

Daily bump.

i386: Fix up ix86_expand_vector_init_general [PR105123]

The following testcase is miscompiled on ia32.
The problem is that at -O0 we end up with:
  vector(4) short unsigned int _1;
  short unsigned int u.0_3;
...
  _1 = {u.0_3, u.0_3, u.0_3, u.0_3};
statement (dead) which is wrongly expanded.
elt is (subreg:HI (reg:SI 83 [ u.0_3 ]) 0), tmp_mode SImode,
so after convert_mode we start with word (reg:SI 83 [ u.0_3 ]).
The intent is to manually broadcast that value to 2 SImode parts,
but because we pass word as target to expand_simple_binop, it will
overwrite (reg:SI 83 [ u.0_3 ]) and we end up with 0:
   10: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   12: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   14: clobber r110:V4HI
   15: r110:V4HI#0=r83:SI
   16: r110:V4HI#4=r83:SI
as the two ors do nothing and two shifts each by 16 left shift it all
away.
The following patch fixes that by using NULL_RTX target, so we expand it as
   10: {r110:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r111:SI=r110:SI|r83:SI;clobber flags:CC;}
   12: {r112:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r113:SI=r112:SI|r83:SI;clobber flags:CC;}
   14: clobber r114:V4HI
   15: r114:V4HI#0=r111:SI
   16: r114:V4HI#4=r113:SI
instead.

Another possibility would be to pass NULL_RTX only when word == elt
and word otherwise, where word would necessarily be a pseudo from the first
shift after passing NULL_RTX there once or pass NULL_RTX for the shift and
word for ior.

2022-04-03  Jakub Jelinek  <jakub@redhat.com>

PR target/105123
* config/i386/i386-expand.c (ix86_expand_vector_init_general): Avoid
using word as target for expand_simple_binop when doing ASHIFT and
IOR.

* gcc.target/i386/pr105123.c: New test.

(cherry picked from commit e1a74058b784c845e84a0cf1997b54b984df483d)

[PR105032] LRA: modify loop condition to find reload insns for hard reg splitting

When trying to split hard reg live range to assign hard reg to a reload
pseudo, LRA searches for reload insns of the reload pseudo
assuming a specific order of the reload insns. This order is violated if
reload involved in inheritance transformation. In such case, the loop used
for reload insn searching can become infinite. The patch fixes this.

gcc/ChangeLog:

PR middle-end/105032
* lra-assigns.c (find_reload_regno_insns): Modify loop condition.

gcc/testsuite/ChangeLog:

PR middle-end/105032
* gcc.target/i386/pr105032.c: New.

Daily bump.

c-family: ICE with -Wconversion and A ?: B [PR101030]

This patch fixes a crash in conversion_warning on a null expression.
It is null because the testcase uses the GNU A ?: B extension. We
could also use op0 instead of op1 in this case, but it doesn't seem
to be necessary.

PR c++/101030

gcc/c-family/ChangeLog:

* c-warn.c (conversion_warning) <case COND_EXPR>: Don't call
conversion_warning when OP1 is null.

gcc/testsuite/ChangeLog:

* g++.dg/ext/cond5.C: New test.

(cherry picked from commit 5db9ce171019f8915885cebd5cc5f4101bb926e6)

x86: Also use Yw in *ssse3_pshufbv8qi3 clobber

PR target/105068
* config/i386/sse.md (*ssse3_pshufbv8qi3): Also replace "Yv" with
"Yw" in clobber.

(cherry picked from commit cccbb776589c1825de1bd2eefabb11d72ef28de8)

RISC-V: Fixing -misa-spec [PR/target 104853]

gcc/ChangeLog:

* config.gcc (riscv*-*-*): Set right default isa spec.

RISC-V: Handle zi* extension correctly for arch-canonicalize script

Canonical order for z-prefixed extension are rely on the canonical order of
single letter extension, however we didn't put i into the list before,
so when we put zicsr or zifencei it will got exception.

gcc/ChangeLog:

* config/riscv/arch-canonicalize (CANONICAL_ORDER): Add `i` to
CANONICAL_ORDER.

(cherry picked from commit e399cde6f9c89cafbbf6c3274c0af3c369d4f872)

RISC-V: Fix register class subset checks for CLASS_MAX_NREGS

Fix the register class subset checks in the determination of the maximum
number of consecutive registers needed to hold a value of a given mode.

The number depends on whether a register is a general-purpose or a
floating-point register, so check whether the register class requested
is a subset (argument 1 to `reg_class_subset_p') rather than superset
(argument 2) of GR_REGS or FP_REGS class respectively.

gcc/
* config/riscv/riscv.c (riscv_class_max_nregs): Swap the
arguments to `reg_class_subset_p'.

(cherry picked from commit a31056e9196daf0a5b0e92d171b5227cc994103b)

RISC-V: Fix wrong zifencei handling in riscv_subset_list::to_string

This issue cause zifencei never correctly appended on the ISA string.

gcc/ChangeLog

* common/config/riscv/riscv-common.c (riscv_subset_list::to_string): Fix
wrong marco checking.

(cherry picked from commit 402d28998fa35d9ffc47aa084f66f9381491eeca)

RISC-V: jal cannot refer to a default visibility symbol for shared object.

This is the original binutils bugzilla report,
https://sourceware.org/bugzilla/show_bug.cgi?id=28509

And this is the first version of the proposed binutils patch,
https://sourceware.org/pipermail/binutils/2021-November/118398.html

After applying the binutils patch, I get the the unexpected error when
building libgcc,

/scratch/nelsonc/riscv-gnu-toolchain/riscv-gcc/libgcc/config/riscv/div.S:42:
/scratch/nelsonc/build-upstream/rv64gc-linux/build-install/riscv64-unknown-linux-gnu/bin/ld: relocation R_RISCV_JAL against `__udivdi3' which may bind externally can not be used when making a shared object; recompile with -fPIC

Therefore, this patch add an extra hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target a non-preemptible symbol instead.
The solution is similar to glibc as follows,
https://sourceware.org/git/?p=glibc.git;a=commit;h=68389203832ab39dd0dbaabbc4059e7fff51c29b

libgcc/ChangeLog:

* config/riscv/div.S: Add the hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target it since it is non-preemptible.
* config/riscv/riscv-asm.h: Added new macros HIDDEN_JUMPTARGET and
HIDDEN_DEF.

(cherry picked from commit 45116f342057b7facecd3d05c2091ce3a77eda59)

RISC-V: Fix use-after-free error in `parse_multiletter_ext'

Avoid undefined arithmetic involving a pointer to a heap allocation that
has been freed and move a problematic calculation ahead of the following
call to `free' in `riscv_subset_list::parse_multiletter_ext', removing a
compilation error:

.../gcc/common/config/riscv/riscv-common.c: In member function 'const char* riscv_subset_list::parse_multiletter_ext(const char*, const char*, const char*)':
.../gcc/common/config/riscv/riscv-common.cc:905:27: error: pointer 'subset' used after 'void free(void*)' [-Werror=use-after-free]
  905 |       p += end_of_version - subset;
      |            ~~~~~~~~~~~~~~~^~~~~~~~
.../gcc/common/config/riscv/riscv-common.cc:904:12: note: call to 'void free(void*)' here
  904 |       free (subset);
      |       ~~~~~^~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [Makefile:2428: riscv-common.o] Error 1

and a build regression from commit 671a283636de ("Add -Wuse-after-free
[PR80532].").

gcc/
* common/config/riscv/riscv-common.c
(riscv_subset_list::parse_multiletter_ext): Move pointer
arithmetic ahead of `free'.

(cherry picked from commit dad495e30135904b0d0305eab8c0ce5f838440d4)

RISC-V: Do not emit zcisr and zifencei if i-ext is 2.0

I-ext 2.0 already included zicsr and zifencei, skip that prevent
confusing binutils.

gcc/ChangeLog

* common/config/riscv/riscv-common.c (riscv_subset_list::to_string):
Skip zicsr and zifencei if I-ext is 2.0.

(cherry picked from commit ca2bbb88f999f4d3cc40e89bc1aba712505dd598)

RISC-V: Fix detection of zifencei support for binutils

- binutils will complain version info is not found if default ISA spec
is 2.2 for binutils.

Error: cannot find default versions of the ISA extension `zifencei'

gcc/ChangeLog:

* configure.ac: Fix detection for zifencei support.
* configure: Regenerate.

(cherry picked from commit affdeda16ef7fbd34f850443fe63bb407714297e)

ubsan: Fix ICE due to -fsanitize=object-size [PR105093]

The following testcase ICEs, because for a volatile X & RESULT_DECL
ubsan wants to take address of that reference.  instrument_object_size
is called with x, so the base is equal to the access and the var
is automatic, so there is no risk of an out of bounds access for it.
Normally we wouldn't instrument those because we fold address of the
t - address of inner to 0, add constant size of the decl and it is
equal to what __builtin_object_size computes.  But the volatile
results in the subtraction not being folded.

The first hunk fixes it by punting if we access the whole automatic
decl, so that even volatile won't cause a problem.
The second hunk (not strictly needed for this testcase) is similar
to what has been added to asan.cc recently, if we actually take
address of a decl and keep it in the IL, we better mark it addressable.

2022-03-30  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/105093
* ubsan.c (instrument_object_size): If t is equal to inner and
is a decl other than global var, punt.  When emitting call to
UBSAN_OBJECT_SIZE ifn, make sure base is addressable.

* g++.dg/ubsan/pr105093.C: New test.

(cherry picked from commit e3e68fa59ead502c24950298b53c637bbe535a74)

store-merging: Avoid ICEs on roughly ~0ULL/8 sized stores [PR105094]

On the following testcase on 64-bit targets, store-merging sees
a MEM_REF store from {} ctor with "negative" bitsize where bitoff + bitsize
wraps around to very small end offset.  This later confuses the code
so that it allocates just a few bytes of memory but fills in huge amounts of
it.  Later on there is a param_store_merging_max_size size check but due to
the wrap-around we pass that.

The following patch punts on such large bitsizes.

2022-03-30  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/105094
* gimple-ssa-store-merging.c (mem_valid_for_store_merging): Punt if
bitsize <= 0 rather than just == 0.

* gcc.dg/pr105094.c: New test.

(cherry picked from commit 387e818cda0ffde86f624228c3da1ab28f453685)

LTO: bump bytecode version

The following revision 91f7d7e1bb6827bf8e0b7ba7eb949953a5b1bd18
breaks bytecode as it introduces a new param.

gcc/ChangeLog:

* lto-streamer.h (LTO_minor_version): Bump it.

c++: Fox template-introduction tentative parsing in class bodies clear colon_corrects_to_scope_p [PR105061]

The concepts support (in particular template introductions from concepts TS)
broke the following testcase, valid unnamed bitfields with dependent
types (or even just typedefs) were diagnosed as typos (: instead of correct
::) in template introduction during their tentative parsing.
The following patch fixes that by not doing this : to :: correction when
member_p is true.

2022-03-30 Jakub Jelinek <jakub@redhat.com>

PR c++/105061
* parser.c (cp_parser_template_introduction): If member_p, temporarily
clear parser->colon_corrects_to_scope_p around tentative parsing of
nested name specifier.

* g++.dg/concepts/pr105061.C: New test.

(cherry picked from commit 4f2795218a6ba6a7b7b9b18ca7a6e390661e1608)

Daily bump.

c++: Fix up __builtin_{bit_cast,convertvector} parsing

Jonathan reported on IRC that we don't parse
__builtin_bit_cast (type, val).field
etc.
The problem is that for these 2 builtins we return from
cp_parser_postfix_expression instead of setting postfix_expression
to the cp_build_* value and falling through into the postfix regression
suffix handling loop.

2022-03-26 Jakub Jelinek <jakub@redhat.com>

* parser.c (cp_parser_postfix_expression)
<case RID_BILTIN_CONVERTVECTOR, case RID_BUILTIN_BIT_CAST>: Don't
return cp_build_{vec,convert,bit_cast} result right away, instead
set postfix_expression to it and break.

* c-c++-common/builtin-convertvector-3.c: New test.
* g++.dg/cpp2a/bit-cast15.C: New test.

(cherry picked from commit 1806829e08f14e4cacacec43d7845cc2dad2ddc8)

fold-const: Handle C++ dependent COMPONENT_REFs in operand_equal_p [PR105035]

As mentioned in the PR, operand_equal_p already contains some hacks so that
it can be called already on pre-instantiation C++ trees from templates,
but the recent change to compare DECL_FIELD_OFFSET in the COMPONENT_REF
case broke this. Many such COMPONENT_REFs are already punted on earlier
because they have NULL TREE_TYPE, but in this case the code knows what
type they have but still uses an IDENTIFIER_NODE as second operand
of COMPONENT_REF (I think SCOPE_REF is something that could be used too).

The following patch looks at those DECL_FIELD_*OFFSET fields only if
both field[01] args are FIELD_DECLs and otherwise keeps it to the
earlier OP_SAME (1) check that guards this whole block.

2022-03-24 Jakub Jelinek <jakub@redhat.com>

PR c++/105035
* fold-const.c (operand_equal_p) <case COMPONENT_REF>: If either
field0 or field1 is not a FIELD_DECL, return false.

* g++.dg/warn/Wduplicated-cond2.C: New test.

(cherry picked from commit 8698ff67cdff4364c8adad2921ed532359a155ec)

c++: extern thread_local declarations in constexpr [PR104994]

C++14 to C++20 apparently should allow extern thread_local declarations in
constexpr functions, however useless they are there (because accessing
such vars is not valid in a constant expression, perhaps sizeof/decltype).
P2242 changed that for C++23 to passing through declaration but
https://cplusplus.github.io/CWG/issues/2552.html
has been filed for it yesterday.

2022-03-24 Jakub Jelinek <jakub@redhat.com>

PR c++/104994
* constexpr.c (potential_constant_expression_1): Don't diagnose extern
thread_local declarations.
* decl.c (start_decl): Likewise.

* g++.dg/cpp23/constexpr-nonlit7.C: New test.

(cherry picked from commit 72124f487ccb5c8065dd5f7b8fba254600b7e611)

i386: Don't emit pushf;pop for __builtin_ia32_readeflags_u* with unused lhs [PR104971]

__builtin_ia32_readeflags_u* aren't marked const or pure I think
intentionally, so that they aren't CSEd from different regions of a function
etc. because we don't and can't easily track all dependencies between
it and surrounding code (if somebody looks at the condition flags, it is
dependent on the vast majority of instructions).
But the builtin itself doesn't have any side-effects, so if we ignore the
result of the builtin, there is no point to emit anything.

There is a LRA bug that miscompiles the testcase which this patch makes
latent, which is certainly worth fixing too, but IMHO this change
(and maybe ix86_gimple_fold_builtin too which would fold it even earlier
when it looses lhs) is worth it as well.

2022-03-19 Jakub Jelinek <jakub@redhat.com>

PR middle-end/104971
* config/i386/i386-expand.c
(ix86_expand_builtin) <case IX86_BUILTIN_READ_FLAGS>: If ignore,
don't push/pop anything and just return const0_rtx.

* gcc.target/i386/pr104971.c: New test.

(cherry picked from commit b60bc913cca7439d29a7ec9e9a7f448d8841b43c)

c-family: Fix up ICE during pretty-printing of PMF related expression [PR101515]

The intent of r11-6729 is that it prints something that helps user to figure
out what exactly is being accessed.
When we find a unique non-static data member that is being accessed, even
when we can't fold it nicely, IMNSHO it is better to print
  ((sometype *)&var)->field
or
  (*(sometype *)&var).field
instead of
  *(fieldtype *)((char *)&var + 56)
because the user doesn't know what is at offset 56, we shouldn't ask user
to decipher structure layout etc.

One question is if we could return something better for the TYPE_PTRMEMFUNC_FLAG
RECORD_TYPE members here (something that would print it more naturally/readably
in a C++ way), though the fact that the routine is in c-family makes it
harder.

Another one is whether we shouldn't punt for FIELD_DECLs that don't have
nicely printable name of its containing scope, something like:
                if (tree scope = get_containing_scope (field))
                  if (TYPE_P (scope) && TYPE_NAME (scope) == NULL_TREE)
                    break;
                return cop;
or so.  This patch implements that.

Note the returned cop is a COMPONENT_REF where the first argument has a
nicely printable type name (x with type sp), but sp's TYPE_MAIN_VARIANT
is the unnamed TYPE_PTRMEMFUNC_FLAG.  So another possibility would be if
we see such a problem for the FIELD_DECL's scope, check if TYPE_MAIN_VARIANT
of the first COMPONENT_REF's argument is equal to that scope and in that
case use TREE_TYPE of the first COMPONENT_REF's argument as the scope
instead.

2022-03-19  Jakub Jelinek  <jakub@redhat.com>

PR c++/101515
* c-pretty-print.c (c_fold_indirect_ref_for_warn): For C++ don't
return COMPONENT_REFs with FIELD_DECLs whose containing scope can't
be printed.

* g++.dg/warn/pr101515.C: New test.

(cherry picked from commit 2663d18356b0a62f5a800c7e5596d814cd3c2c41)

Allow (void *) 0xdeadbeef accesses without warnings [PR99578]

Starting with GCC11 we keep emitting false positive -Warray-bounds or
-Wstringop-overflow etc. warnings on widely used *(type *)0x12345000
style accesses (or memory/string routines to such pointers).
This is a standard programming style supported by all C/C++ compilers
I've ever tried, used mostly in kernel or DSP programming, but sometimes
also together with mmap MAP_FIXED when certain things, often I/O registers
but could be anything else too are known to be present at fixed
addresses.

Such INTEGER_CST addresses can appear in code either because a user
used it like that (in which case it is fine) or because somebody used
pointer arithmetics (including &((struct whatever *)NULL)->field) on
a NULL pointer.  The middle-end warning code wrongly assumes that the
latter case is what is very likely, while the former is unlikely and
users should change their code.

The following patch adds a min-pagesize param defaulting to 4KB,
and treats INTEGER_CST addresses smaller than that as assumed results
of pointer arithmetics from NULL while addresses equal or larger than
that as expected user constant addresses.  For GCC 13 we can
represent results from pointer arithmetics on NULL using
&MEM[(void*)0 + offset] instead of (void*)offset INTEGER_CSTs.

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/99578
PR middle-end/100680
PR tree-optimization/100834
* params.opt (--param=min-pagesize=): New parameter.
* builtins.c (compute_objsize_r) <case INTEGER_CST>: Use maximum
object size instead of zero for pointer constants equal or larger
than min-pagesize.

* gcc.dg/tree-ssa/pr99578-1.c: New test.
* gcc.dg/pr99578-1.c: New test.
* gcc.dg/pr99578-2.c: New test.
* gcc.dg/pr99578-3.c: New test.
* gcc.dg/pr100680.c: New test.
* gcc.dg/pr100834.c: New test.

(cherry picked from commit 32ca611c42658948f1b8883994796f35e8b4e74d)

c++: Fix up constexpr evaluation of new with zero sized types [PR104568]

The new expression constant expression evaluation right now tries to
deduce how many elts the array it uses for the heap or heap [] vars
should have (or how many elts should its trailing array have if it has
cookie at the start).  As new is lowered at that point to
(some_type *) ::operator new (size)
or so, it computes it by subtracting cookie size if any from size, then
divides the result by sizeof (some_type).
This works fine for most types, except when sizeof (some_type) is 0,
then we divide by zero; size is then equal to cookie_size (or if there
is no cookie, to 0).
The following patch special cases those cases so that we don't divide
by zero and also recover the original outer_nelts from the expression
by forcing the size not to be folded in that case but be explicit
0 * outer_nelts or cookie_size + 0 * outer_nelts.

Note, we have further issues, we accept-invalid various cases, for both
zero sized elt_type and even non-zero sized elts, we aren't able to
diagnose out of bounds POINTER_PLUS_EXPR like:
constexpr bool
foo ()
{
  auto p = new int[2];
  auto q1 = &p[0];
  auto q2 = &p[1];
  auto q3 = &p[2];
  auto q4 = &p[3];
  delete[] p;
  return true;
}
constexpr bool a = foo ();
That doesn't look like a regression so I think we should resolve that for
GCC 13, but there are 2 problems.  Figure out why
cxx_fold_pointer_plus_expression doesn't deal with the &heap []
etc. cases, and for the zero sized arrays, I think we really need to preserve
whether user wrote an array ref or pointer addition, because in the
&p[3] case if sizeof(p[0]) == 0 we know that if it has 2 elements it is
out of bounds, while if we see p p+ 0 the information if it was
p + 2 or p + 3 in the source is lost.
clang++ seems to handle it fine even in the zero sized cases or with
new expressions.

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/104568
* init.c (build_new_constexpr_heap_type): Remove FULL_SIZE
argument and its handling, instead add ITYPE2 argument.  Only
support COOKIE_SIZE != NULL.
(build_new_1): If size is 0, change it to 0 * outer_nelts if
outer_nelts is non-NULL.  Pass type rather than elt_type to
maybe_wrap_new_for_constexpr.
* constexpr.c (build_new_constexpr_heap_type): New function.
(cxx_eval_constant_expression) <case CONVERT_EXPR>:
If elt_size is zero sized type, try to recover outer_nelts from
the size argument to operator new/new[] and pass that as
arg_size to build_new_constexpr_heap_type.  Pass ctx,
non_constant_p and overflow_p to that call too.

* g++.dg/cpp2a/constexpr-new22.C: New test.

(cherry picked from commit 0a0c2c3f06227d46b5e9542dfdd4e0fd2d67d894)

libatomic: Improve 16-byte atomics on Intel AVX [PR104688]

As mentioned in the PR, the latest Intel SDM has added:
"Processors that enumerate support for Intel® AVX (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
guarantee that the 16-byte memory operations performed by the following instructions will always be
carried out atomically:
• MOVAPD, MOVAPS, and MOVDQA.
• VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
• VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and k0 (masking disabled).
(Note that these instructions require the linear addresses of their memory operands to be 16-byte
aligned.)"

The following patch deals with it just on the libatomic library side so far,
currently (since ~ 2017) we emit all the __atomic_* 16-byte builtins as
library calls since and this is something that we can hopefully backport.

The patch simply introduces yet another ifunc variant that takes priority
over the pure CMPXCHG16B one, one that checks AVX and CMPXCHG16B bits and
on non-Intel clears the AVX bit during detection for now (if AMD comes
with the same guarantee, we could revert the config/x86/init.c hunk),
which implements 16-byte atomic load as vmovdqa and 16-byte atomic store
as vmovdqa followed by mfence.

2022-03-17 Jakub Jelinek <jakub@redhat.com>

PR target/104688
* Makefile.am (IFUNC_OPTIONS): Change on x86_64 to -mcx16 -mcx16.
(libatomic_la_LIBADD): Add $(addsuffix _16_2_.lo,$(SIZEOBJS)) for
x86_64.
* Makefile.in: Regenerated.
* config/x86/host-config.h (IFUNC_COND_1): For x86_64 define to
both AVX and CMPXCHG16B bits.
(IFUNC_COND_2): Define.
(IFUNC_NCOND): For x86_64 define to 2 * (N == 16).
(MAYBE_HAVE_ATOMIC_CAS_16, MAYBE_HAVE_ATOMIC_EXCHANGE_16,
MAYBE_HAVE_ATOMIC_LDST_16): Define to IFUNC_COND_2 rather than
IFUNC_COND_1.
(HAVE_ATOMIC_CAS_16): Redefine to 1 whenever IFUNC_ALT != 0.
(HAVE_ATOMIC_LDST_16): Redefine to 1 whenever IFUNC_ALT == 1.
(atomic_compare_exchange_n): Define whenever IFUNC_ALT != 0
on x86_64 for N == 16.
(__atomic_load_n, __atomic_store_n): Redefine whenever IFUNC_ALT == 1
on x86_64 for N == 16.
(atomic_load_n, atomic_store_n): New functions.
* config/x86/init.c (__libat_feat1_init): On x86_64 clear bit_AVX
if CPU vendor is not Intel.

(cherry picked from commit 1d47c0512a265d4bb3ab9e56259fd1e4f4d42c75)

aarch64: Fix up RTL sharing bug in aarch64_load_symref_appropriately [PR104910]

We unshare all RTL created during expansion, but when
aarch64_load_symref_appropriately is called after expansion like in the
following testcases, we use imm in both HIGH and LO_SUM operands.
If imm is some RTL that shouldn't be shared like a non-sharable CONST,
we get at least with --enable-checking=rtl a checking ICE, otherwise might
just get silently wrong code.

The following patch fixes that by copying it if it can't be shared.

2022-03-16 Jakub Jelinek <jakub@redhat.com>

PR target/104910
* config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Copy
imm rtx.

* gcc.dg/pr104910.c: New test.

(cherry picked from commit 952155629ca1a4dfe7c7b26e53d118a9b853ed4a)

ifcvt: Punt if not onlyjump_p for find_if_case_{1,2} [PR104814]

find_if_case_{1,2} implicitly assumes conditional jumps and rewrites them,
so if they have extra side-effects or are say asm goto, things don't work
well, either the side-effects are lost or we could ICE.
In particular, the testcase below on s390x has there a doloop instruction
that decrements a register in addition to testing it for non-zero and
conditionally jumping based on that.

The following patch fixes that by punting for !onlyjump_p case, i.e.
if there are side-effects in the jump instruction or it isn't a plain PC
setter.

Also, it assumes BB_END (test_bb) will be always non-NULL, because basic
blocks with 2 non-abnormal successor edges should always have some instruction
at the end that determines which edge to take.

2022-03-15 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/104814
* ifcvt.c (find_if_case_1, find_if_case_2): Punt if test_bb doesn't
end with onlyjump_p. Assume BB_END (test_bb) is always non-NULL.

* gcc.c-torture/execute/pr104814.c: New test.

(cherry picked from commit a2645cd8fb33b36d737b310e26f4c47401305c7b)

c, c++, c-family: -Wshift-negative-value and -Wshift-overflow* tweaks for -fwrapv and C++20+ [PR104711]

As mentioned in the PR, different standards have different definition
on what is an UB left shift.  They all agree on out of bounds (including
negative) shift count.
The rules used by ubsan are:
C99-C2x ((unsigned) x >> (uprecm1 - y)) != 0 then UB
C++11-C++17 x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1 then UB
C++20 and later everything is well defined
Now, for C++20, I've in the P1236R1 implementation added an early
exit for -Wshift-overflow* warning so that it never warns, but apparently
-Wshift-negative-value remained as is.  As it is well defined in C++20,
the following patch doesn't enable -Wshift-negative-value from -Wextra
anymore for C++20 and later, if users want for compatibility with C++17
and earlier get the warning, they still can by using -Wshift-negative-value
explicitly.
Another thing is -fwrapv, that is an extension to the standards, so it is up
to us how exactly we define that case.  Our ubsan code treats
TYPE_OVERFLOW_WRAPS (type0) and cxx_dialect >= cxx20 the same as only
diagnosing out of bounds shift count and nothing else and IMHO it is most
sensical to treat -fwrapv signed left shifts the same as C++20 treats
them, https://eel.is/c++draft/expr.shift#2
"The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N,
where N is the width of the type of the result.
[Note 1: E1 is left-shifted E2 bit positions; vacated bits are zero-filled.
— end note]"
with no UB dependent on the E1 values.  The UB is only
"The behavior is undefined if the right operand is negative, or greater
than or equal to the width of the promoted left operand."
Under the hood (except for FEs and ubsan from FEs) GCC middle-end doesn't
consider UB in left shifts dependent on the first operand's value, only
the out of bounds shifts.

While this change isn't a regression, I'd think it is useful for GCC 12,
it doesn't add new warnings, but just removes warnings that aren't
appropriate.

2022-03-09  Jakub Jelinek  <jakub@redhat.com>

PR c/104711
gcc/
* doc/invoke.texi (-Wextra): Document that -Wshift-negative-value
is enabled by it only for C++11 to C++17 rather than for C++03 or
later.
(-Wshift-negative-value): Similarly (except here we stated
that it is enabled for C++11 or later).
gcc/c-family/
* c-opts.c (c_common_post_options): Don't enable
-Wshift-negative-value from -Wextra for C++20 or later.
* c-ubsan.c (ubsan_instrument_shift): Adjust comments.
* c-warn.c (maybe_warn_shift_overflow): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
gcc/c/
* c-fold.c (c_fully_fold_internal): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
* c-typeck.c (build_binary_op): Likewise.
gcc/cp/
* constexpr.c (cxx_eval_check_shift_p): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
* typeck.c (cp_build_binary_op): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
gcc/testsuite/
* c-c++-common/Wshift-negative-value-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-negative-value-2.c: Likewise.
* c-c++-common/Wshift-negative-value-3.c: Likewise.
* c-c++-common/Wshift-negative-value-4.c: Likewise.
* c-c++-common/Wshift-negative-value-7.c: New test.
* c-c++-common/Wshift-negative-value-8.c: New test.
* c-c++-common/Wshift-negative-value-9.c: New test.
* c-c++-common/Wshift-negative-value-10.c: New test.
* c-c++-common/Wshift-overflow-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-overflow-2.c: Likewise.
* c-c++-common/Wshift-overflow-5.c: Likewise.
* c-c++-common/Wshift-overflow-6.c: Likewise.
* c-c++-common/Wshift-overflow-7.c: Likewise.
* c-c++-common/Wshift-overflow-8.c: New test.
* c-c++-common/Wshift-overflow-9.c: New test.
* c-c++-common/Wshift-overflow-10.c: New test.
* c-c++-common/Wshift-overflow-11.c: New test.
* c-c++-common/Wshift-overflow-12.c: New test.

(cherry picked from commit d76511138dc816ef66fd16f71531f48c37dac3b4)

c++: Don't suggest cdtor or conversion op identifiers in spelling hints [PR104806]

On the following testcase, we emit "did you mean '__dt '?" in the error
message. "__dt " shows there because it is dtor_identifier, but we
shouldn't suggest those to the user, they are purely internal and can't
be really typed by the user because of the final space in it.

2022-03-08 Jakub Jelinek <jakub@redhat.com>

PR c++/104806
* search.c (lookup_field_fuzzy_info::fuzzy_lookup_field): Ignore
identifiers with space at the end.

* g++.dg/spellcheck-pr104806.C: New test.

(cherry picked from commit e480c3c06d20874fd7504bfdcca0b829f8000389)

s390: Fix up *cmp_and_trap_unsigned_int<mode> constraints [PR104775]

The following testcase fails to assemble due to clgte %r6,0(%r1,%r10)
insn not being accepted by assembler.
My rough understanding is that in the RSY-b insn format the spot
in other formats used for index registers is used instead for M3 what
kind of comparison it is, so this patch follows what other similar
instructions use for constraint (i.e. one without index register).

2022-03-07 Jakub Jelinek <jakub@redhat.com>

PR target/104775
* config/s390/s390.md (*cmp_and_trap_unsigned_int<mode>): Use
S constraint instead of T in the last alternative.

* gcc.target/s390/pr104775.c: New test.

(cherry picked from commit 2472dcaa8cb9e02e902f83d419c3ee7e0f3d9041)

cfgrtl: Fix up -g vs. -g0 code generation -flto differences in fixup_reorder_chain [PR104589]

This is similar to PR104237 and similarly to that, no testcase included
for the testsuite, as we don't have a framework to compile/link with
-g -flto and -g0 -flto and compare -fdump-final-insns= results from
the lto1 compilations.

With -flto, whether two location_t compare equal or not and just
express the same location is a lottery.

2022-03-02 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/104589
* cfgrtl.c (fixup_reorder_chain): Use loc_equal instead of direct
INSN_LOCATION comparison with goto_locus.

(cherry picked from commit 2e1b00367abaf8b6dbb47fd8518d9ac69c326a06)

match.pd: Further complex simplification fixes [PR104675]

Mark mentioned in the PR further 2 simplifications that also ICE
with complex types.
For these, eventually (but IMO GCC 13 materials) we could support it
for vector types if it would be uniform vector constants.
Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs
and we can't use bit_and etc. for complex type.

2022-02-25 Jakub Jelinek <jakub@redhat.com>
Marc Glisse <marc.glisse@inria.fr>

PR tree-optimization/104675
* match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1):
Restrict simplifications to INTEGRAL_TYPE_P.

* gcc.dg/pr104675-3.c : New test.

(cherry picked from commit f62115c9b770a66c5378f78a2d5866243d560573)

rs6000: Use rs6000_emit_move in movmisalign<mode> expander [PR104681]

The following testcase ICEs, because for some strange reason it decides to use
movmisaligntf during expansion where the destination is MEM and source is
CONST_DOUBLE. For normal mov<mode> expanders the rs6000 backend uses
rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG
and a few other things, but for movmisalign<mode> nothing enforced this.
The middle-end documents that movmisalign<mode> shouldn't fail, so we can't
force that through predicates or condition on the expander.

2022-02-25 Jakub Jelinek <jakub@redhat.com>

PR target/104681
* config/rs6000/vector.md (movmisalign<mode>): Use rs6000_emit_move.

* g++.dg/opt/pr104681.C: New test.

(cherry picked from commit 3885a122f817a1b6dca4a84ba9e020d5ab2060af)

i386: Use a new temp slot kind for splitter to floatdi<mode>2_i387_with_xmm [PR104674]

As mentioned in the PR, the following testcase is miscompiled for similar
reasons as the already fixed PR78791 - we use SLOT_TEMP slots in various
places during expansion and during expansion we can guarantee that the
lifetime of those temporary slot doesn't overlap. But the following
splitter uses SLOT_TEMP too and in between expansion and split1 there is
a possibility that something extends the lifetime of SLOT_TEMP created
slots across an instruction that will be split by this splitter.

The following patch fixes it by using a new temp slot kind to make sure
it doesn't reuse a SLOT_TEMP that could be live across the instruction.

2022-02-25 Jakub Jelinek <jakub@redhat.com>

PR target/104674
* config/i386/i386.h (enum ix86_stack_slot): Add SLOT_FLOATxFDI_387.
* config/i386/i386.md (splitter to floatdi<mode>2_i387_with_xmm): Use
SLOT_FLOATxFDI_387 rather than SLOT_TEMP.

* gcc.target/i386/pr104674.c: New test.

(cherry picked from commit eabf7bbe601f2c0d87bd0a1012d7a602df2037da)

match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]

We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types,
&/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR.
So, we should avoid simplifications which turn valid complex type
expressions into something that will ICE during expansion.

2022-02-25 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/104675
* match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for
COMPLEX_TYPE.

* gcc.dg/pr104675-1.c: New test.
* gcc.dg/pr104675-2.c: New test.

(cherry picked from commit 758671b88b78d7629376b118ec6ca6bcfbabbd36)

sccvn: Fix visit_reference_op_call value numbering of vdefs [PR104601]

The following testcase is miscompiled, because -fipa-pure-const discovers
that bar is const, but when sccvn during fre3 sees
  # .MEM_140 = VDEF <.MEM_96>
  *__pred$__d_43 = _50 (_49);
where _50 value numbers to &bar, it value numbers .MEM_140 to
vuse_ssa_val (gimple_vuse (stmt)).  For const/pure calls that return
a SSA_NAME (or don't have lhs) that is fine, those calls don't store
anything, but if the lhs is present and not an SSA_NAME, value numbering
the vdef to anything but itself means that e.g. walk_non_aliased_vuses
won't consider the call, but the call acts as a store to its lhs.
When it is ignored, sccvn will return whatever has been stored to the
lhs earlier.

I've bootstrapped/regtested an earlier version of this patch, which did the
if (!lhs && gimple_call_lhs (stmt))
  changed |= set_ssa_val_to (vdef, vdef);
part before else if (vnresult->result_vdef), and that regressed
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo \\\\(" 1
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo2 \\\\(" 1
so this updated patch uses result_vdef there as before and only otherwise
(which I think must be the const/pure case) decides based on whether the
lhs is non-SSA_NAME.

2022-02-24  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/104601
* tree-ssa-sccvn.c (visit_reference_op_call): For calls with
non-SSA_NAME lhs value number vdef to itself instead of e.g. the
vuse value number.

* g++.dg/torture/pr104601.C: New test.

(cherry picked from commit 9251b457eb8df912f2e8203d0ee8ab300c041520)

libiberty: Fix up debug.temp.o creation if *.o has 64K+ sections [PR104617]

On
#define A(n) int foo1##n(void) { return 1##n; }
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325)
B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642)
testcase with
./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto  -O0 -o foo1.o foo1.c -ffunction-sections
./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o
/tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized
(testcase too slow to be included into testsuite).
The problem is clearly reported by readelf:
readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323
readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section.
because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and
sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE
inclusive.  Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE
range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym
where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be
used instead and .symtab_shndx section should contain the real section
index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >=
SHN_LORESERVE value is needed it should put those into
Shdr[0].sh_{size,link}.  But, sh_{link,info} are 32-bit fields which can
contain any section index.

Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before
2011) used to mishandle the > 63.75K sections case and assumed there is a
hole in between the sections, but what
simple_object_elf_copy_lto_debug_sections does wouldn't help in that case
for the debug temp object creation, we'd need to detect the case also in
that routine and take it into account in the remapping etc.  I think
it is not worth it given that it is over 10 years, if somebody needs
63.75K or more sections, better use more recent binutils.

2022-02-22  Jakub Jelinek  <jakub@redhat.com>

PR lto/104617
* simple-object-elf.c (simple_object_elf_match): Fix up URL
in comment.
(simple_object_elf_copy_lto_debug_sections): Remap sh_info and
sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE
range (inclusive).

(cherry picked from commit 2f59f067610f22c3f2ec9b1516e24b85836676ed)

asan: Mark instrumented vars addressable [PR102656]

We ICE on the following testcase, because the asan1 pass decides to
instrument
  <retval>.x = 0;
and does that by
  _13 = &<retval>.x;
  .ASAN_CHECK (7, _13, 4, 4);
  <retval>.x = 0;
and later sanopt pass turns that into:
  _39 = (unsigned long) &<retval>.x;
  _40 = _39 >> 3;
  _41 = _40 + 2147450880;
  _42 = (signed char *) _41;
  _43 = *_42;
  _44 = _43 != 0;
  _45 = _39 & 7;
  _46 = (signed char) _45;
  _47 = _46 + 3;
  _48 = _47 >= _43;
  _49 = _44 & _48;
  if (_49 != 0)
    goto <bb 10>; [0.05%]
  else
    goto <bb 9>; [99.95%]

  <bb 10> [local count: 536864]:
  __builtin___asan_report_store4 (_39);

  <bb 9> [local count: 1073741824]:
  <retval>.x = 0;
The problem is during expansion, <retval> isn't marked TREE_ADDRESSABLE,
even when we take its address in (unsigned long) &<retval>.x.

Now, instrument_derefs has code to avoid the instrumentation altogether
if we can prove the access is within bounds of an automatic variable in the
current function and the var isn't TREE_ADDRESSABLE (or we don't instrument
use after scope), but we do it solely for VAR_DECLs.

I think we should treat RESULT_DECLs exactly like that too, which is what
the following patch does.  I must say I'm unsure about PARM_DECLs, those can
have different cases, either they are fully or partially passed in
registers, then if we take parameter's address, they are in a local copy
inside of a function and so work like those automatic vars.  But if they
are fully passed in memory, we typically just take address of the slot
and in that case they live in the caller's frame.  It is true we don't
(can't) put any asan padding in between the arguments, so all asan could
detect in that case is if caller passes fewer on stack arguments or smaller
arguments than callee accepts.  Anyway, as I'm unsure, I haven't added
PARM_DECLs to that case.

And another thing is, when we actually build_fold_addr_expr, we need to
mark_addressable the inner if it isn't addressable already.

2022-02-19  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/102656
* asan.c (instrument_derefs): If inner is a RESULT_DECL and access is
known to be within bounds, treat it like automatic variables.
If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from
current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark
it addressable.

* g++.dg/asan/pr102656.C: New test.

(cherry picked from commit 9e3bbb4a8024121eb0fa675cb1f074218c1345a6)

c: -Wmissing-field-initializers and designated inits [PR82283, PR84685]

This patch fixes two kinds of wrong -Wmissing-field-initializers
warnings.  Our docs say that this warning "does not warn about designated
initializers", but we give a warning for

1) the array case:

  struct S {
    struct N {
      int a;
      int b;
    } c[1];
  } d = {
    .c[0].a = 1,
    .c[0].b = 1, // missing initializer for field 'b' of 'struct N'
  };

we warn because push_init_level, when constructing an array, clears
constructor_designated (which the warning relies on), and we forget
that we were in a designated initializer context.  Fixed by the
push_init_level hunk; and

2) the compound literal case:

  struct T {
    int a;
    int *b;
    int c;
  };

  struct T t = { .b = (int[]){1} }; // missing initializer for field 'c' of 'struct T'

where set_designator properly sets constructor_designated to 1, but the
compound literal causes us to create a whole new initializer_stack in
start_init, which clears constructor_designated.  Then, after we've parsed
the compound literal, finish_init flushes the initializer_stack entry,
but doesn't restore constructor_designated, so we forget we were in
a designated initializer context, which causes the bogus warning.  (The
designated flag is also tracked in constructor_stack, but in this case,
we didn't perform push_init_level between set_designator and start_init
so it wasn't saved anywhere.)

PR c/82283
PR c/84685

gcc/c/ChangeLog:

* c-typeck.c (struct initializer_stack): Add 'designated' member.
(start_init): Set it.
(finish_init): Restore constructor_designated.
(push_init_level): Set constructor_designated to the value of
constructor_designated in the upper constructor_stack.

gcc/testsuite/ChangeLog:

* gcc.dg/Wmissing-field-initializers-1.c: New test.
* gcc.dg/Wmissing-field-initializers-2.c: New test.
* gcc.dg/Wmissing-field-initializers-3.c: New test.
* gcc.dg/Wmissing-field-initializers-4.c: New test.
* gcc.dg/Wmissing-field-initializers-5.c: New test.

(cherry picked from commit 4b7d9f8f51bd96d290aac230c71e501fcb6b21a6)

c++: alignas and alignof void [PR104944]

I started looking into this PR because in GCC 4.9 we were able to
detect the invalid

  struct alignas(void) S{};

but I broke it in r210262.

It's ill-formed code in C++:
[dcl.align]/3: "An alignment-specifier of the form alignas(type-id) has
the same effect as alignas(alignof(type-id))", and [expr.align]/1:
"The operand shall be a type-id representing a complete object type,
or an array thereof, or a reference to one of those types." and void
is not a complete type.

It's also invalid in C:
6.7.5: _Alignas(type-name) is equivalent to _Alignas(_Alignof(type-name))
6.5.3.4: "The _Alignof operator shall not be applied to a function type
or an incomplete type."

We have a GNU extension whereby we treat sizeof(void) as 1, but I assume
it doesn't apply to alignof, at least in C++.  However, __alignof__(void)
is still accepted with a -Wpedantic warning.

We still say "invalid application of 'alignof'" rather than 'alignas' in the
void diagnostic, but I felt that fixing that may not be suitable as part of
this patch.  The "incomplete type" diagnostic still always prints
'__alignof__'.

PR c++/104944

gcc/cp/ChangeLog:

* typeck.c (cxx_sizeof_or_alignof_type): Diagnose alignof(void).
(cxx_alignas_expr): Call cxx_sizeof_or_alignof_type with
complain == true.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alignas20.C: New test.

(cherry picked from commit d0b938a7612fb7acf1f181da9577235c83ede59e)

c++: ICE with template code in constexpr [PR104284]

Since r9-6073 cxx_eval_store_expression preevaluates the value to
be stored, and that revealed a crash where a template code (here,
code=IMPLICIT_CONV_EXPR) leaks into cxx_eval*.

It happens because we're performing build_vec_init while processing
a template, which calls get_temp_regvar which creates an INIT_EXPR.
This INIT_EXPR's RHS contains an rvalue conversion so we create an
IMPLICIT_CONV_EXPR.  Its operand is not type-dependent and the whole
INIT_EXPR is not type-dependent.  So we call build_non_dependent_expr
which, with -fchecking=2, calls fold_non_dependent_expr.  At this
point the expression still has an IMPLICIT_CONV_EXPR, which ought to
be handled in instantiate_non_dependent_expr_internal.  However,
tsubst_copy_and_build doesn't handle INIT_EXPR; it will just call
tsubst_copy which does nothing when args is null.  So we fail to
replace the IMPLICIT_CONV_EXPR and ICE.

The problem is that we call build_vec_init in a template in the
first place.  We can avoid doing so by checking p_t_d before
calling build_aggr_init in check_initializer.

PR c++/104284

gcc/cp/ChangeLog:

* decl.c (check_initializer): Don't call build_aggr_init in
a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-104284-1.C: New test.
* g++.dg/cpp1y/constexpr-104284-2.C: New test.
* g++.dg/cpp1y/constexpr-104284-3.C: New test.
* g++.dg/cpp1y/constexpr-104284-4.C: New test.

(cherry picked from commit 9fdac7e16c940fb6264e6ddaf99c761f1a64a054)

c++: Wrong error with alias template in class tmpl [PR104108]

In r10-6329 I tried to optimize the number of calls to v_d_e_p in
convert_nontype_argument by remembering whether the expression was
value-dependent in a bool flag.  I did that wrongly assuming that its
value-dependence will not be changed by build_converted_constant_expr.
This testcase shows that it can: b_c_c_e gets a VAR_DECL for m_parameter,
which is not value-dependent, but we're converting it to "const int &"
so it returns

  (const int &)(const int *) &m_parameter

which suddenly becomes value-dependent because of the added ADDR_EXPR:
has_value_dependent_address is now true because m_parameter's context S<T>
is dependent.  With this bug in place, we went to the second branch here:

      if (TYPE_REF_OBJ_P (TREE_TYPE (expr)) && val_dep_p)
        /* OK, dependent reference.  We don't want to ask whether a DECL is
           itself value-dependent, since what we want here is its address.  */;
      else
        {
          expr = build_address (expr);

          if (invalid_tparm_referent_p (type, expr, complain))
            return NULL_TREE;
        }

wherein build_address created a bad tree and then i_t_r_p complained.

PR c++/104108

gcc/cp/ChangeLog:

* pt.c (convert_nontype_argument): Recompute
value_dependent_expression_p after build_converted_constant_expr.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-74.C: New test.

(cherry picked from commit d54ce4641ed106666208be36fd514cae8ff1153c)

c++: FIX_TRUNC_EXPR in tsubst [PR102990]

This is a crash where a FIX_TRUNC_EXPR gets into tsubst_copy_and_build
where it hits gcc_unreachable ().

The history of tsubst_copy_and_build/FIX_TRUNC_EXPR is such that it
was introduced in r181478, but it did the wrong thing, whereupon it
was turned into gcc_unreachable () in r258821 (see this thread:
<https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495853.html>).

In a template, we should never create a FIX_TRUNC_EXPR (that's what
conv_unsafe_in_template_p is for). But in this test we are NOT in
a template when we call digest_nsdmi_init which ends up calling
convert_like, converting 1.0e+0 to int, so convert_to_integer_1
gives us a FIX_TRUNC_EXPR.

But then when we get to parsing f's parameters, we are in a template
when processing decltype(Helpers{}), and since r268321, when the
compound literal isn't instantiation-dependent and the type isn't
type-dependent, finish_compound_literal falls back to the normal
processing, so it calls digest_init, which does fold_non_dependent_init
and since the FIX_TRUNC_EXPR isn't dependent, we instantiate and
therefore crash in tsubst_copy_and_build.

The fateful call to fold_non_dependent_init comes from massage_init_elt,
We shouldn't be calling f_n_d_i on the result of get_nsdmi. This we can
avoid by eschewing calling f_n_d_i on CONSTRUCTORs; their elements have
already been folded.

PR c++/102990

gcc/cp/ChangeLog:

* typeck2.c (massage_init_elt): Avoid folding CONSTRUCTORs.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-template22.C: New test.
* g++.dg/cpp0x/nsdmi-template23.C: New test.

(cherry picked from commit f0530882d99abc410bb080051aa04e5cea848f18)

c++: constexpr array reference and value-initialization [PR101371]

This PR gave me a hard time: I saw multiple issues starting with
different revisions.  But ultimately the root cause seems to be
the following, and the attached patch fixes all issues I've found
here.

In cxx_eval_array_reference we create a new constexpr context for the
CP_AGGREGATE_TYPE_P case, but we also have to create it for the
non-aggregate case.  In this test, we are evaluating

  ((B *)this)->a = rhs->a

which means that we set ctx.object to ((B *)this)->a.  Then we proceed
to evaluate the initializer, rhs->a.  For *rhs, we eval rhs, a PARM_DECL,
for which we have (const B &) &c.arr[0] in the hash table.  Then
cxx_fold_indirect_ref gives us c.arr[0].  c is evaluated to {.arr={}} so
c.arr is {}.  Now we want c.arr[0], so we end up in cxx_eval_array_reference
and since we're initializing from {}, we call build_value_init which
gives us an AGGR_INIT_EXPR that calls 'constexpr B::B()'.  Then we
evaluate this AGGR_INIT_EXPR and since its first argument is dummy,
we take ctx.object instead.  But that is the wrong object, we're not
initializing ((B *)this)->a here.  And so we wound up with an
initializer for A, and then crash in cxx_eval_component_reference:

  gcc_assert (DECL_CONTEXT (part) == TYPE_MAIN_VARIANT (TREE_TYPE (whole)));

where DECL_CONTEXT (part) is B (as it should be) but the type of whole
was A.

So create a new object, if there already was one, and the element type
is not a scalar.

PR c++/101371

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_array_reference): Create a new .object
and .ctor for the non-aggregate non-scalar case too when
value-initializing.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-101371-2.C: New test.
* g++.dg/cpp1y/constexpr-101371.C: New test.

(cherry picked from commit a42f8120442cf3ba25d621bed857b5be19019d0c)

Daily bump.

c++: TTP in member alias template [PR104107]

In the first testcase, coerce_template_template_parms was adding too much of
outer_args when coercing to match P's template parameters, so that when
substituting into the 'const T&' parameter we got an unrelated template
argument for T. We should only add outer_args when the argument template is
a nested template.

PR c++/104107
PR c++/95036

gcc/cp/ChangeLog:

* pt.c (coerce_template_template_parms): Take full parms.
Avoid adding too much of outer_args.
(coerce_template_template_parm): Adjust.
(template_template_parm_bindings_ok_p): Adjust.
(convert_template_argument): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-ttp2.C: New test.
* g++.dg/cpp1z/ttp2.C: New test.

c++: ICE with alias in pack expansion [PR103769]

This was breaking because when we stripped the 't' typedef in s<t<Args>...>
to be s<Args...>, the TYPE_MAIN_VARIANT of "Args..." was still
"t<Args>...", because type pack expansions are treated as types. Fixed by
using the right function to copy a "type".

PR c++/99445
PR c++/103769

gcc/cp/ChangeLog:

* tree.c (strip_typedefs): Use build_distinct_type_copy.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-alias5.C: New test.

c++: mangling union{1} in template [PR104847]

My implementation of union non-type template arguments in r11-2016 broke
braced casts of union type, because they are still in syntactic (undigested)
form.

PR c++/104847

gcc/cp/ChangeLog:

* mangle.c (write_expression): Don't write a union designator when
undigested.

gcc/testsuite/ChangeLog:

* g++.dg/abi/mangle-union1.C: New test.

c++: missing aggregate base ctor [PR102045]

When make_base_init_ok changes a call to a complete constructor into a call
to a base constructor, we were never marking the base ctor as used, so it
didn't get emitted.

PR c++/102045

gcc/cp/ChangeLog:

* call.c (make_base_init_ok): Call make_used.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/aggr-base12.C: New test.

c++: member alias declaration [PR103968]

Here, we were wrongly thinking that (const Options&)Widget<T>::c_options is
not value-dependent because neither the type nor the (value of) c_options
are dependent, but since we're binding it to a reference we also need to
consider that it has a value-dependent address.

PR c++/103968

gcc/cp/ChangeLog:

* pt.c (value_dependent_expression_p): Check
has_value_dependent_address for conversion to reference.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-mem1.C: New test.

c++: CTAD and member alias template [PR102123]

When building a deduction guide from the Test constructor, we need to
rewrite the use of _dummy into a dependent reference, i.e. Test<T>::template
_dummy. We were using SCOPE_REF for both type and non-type templates; we
need to use UNBOUND_CLASS_TEMPLATE for type templates.

PR c++/102123

gcc/cp/ChangeLog:

* pt.c (tsubst_copy): Use make_unbound_class_template for rewriting
a type template reference.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction110.C: New test.

c++: visibility of local extern [PR103291]

When setting up the hidden namespace-scope decl for a local extern, we also
need to set its visibility.

PR c++/103291

gcc/cp/ChangeLog:

* name-lookup.c (push_local_extern_decl_alias): Call
determine_visibility.

gcc/testsuite/ChangeLog:

* g++.dg/ext/visibility/visibility-local-extern1.C: New test.

x86: Use Yw constraint on *ssse3_pshufbv8qi3

Since AVX512VL and AVX512BW are required for AVX512 VPSHUFB, replace the
"Yv" register constraint with the "Yw" register constraint.

gcc/

PR target/105068
* config/i386/sse.md (*ssse3_pshufbv8qi3): Replace "Yv" with
"Yw".

(cherry picked from commit 08e69332881f8d28ce8b559ffba1900ae5c0d5ee)

[PR/target 102957] Allow Z*-ext extension with only 2 char.

We was assume the Z* extension should be more than 2 char, so we put an
assertion there, but it should just an error or warning rather than an
assertion, however RISC-V has add `Zk` extension, which just 2 char, so
actually, we should just allow that.

gcc/ChangeLog

PR target/102957
* common/config/riscv/riscv-common.c (multi_letter_subset_rank): Remove
assertion for Z*-ext.

gcc/testsuite/ChangeLog

* gcc.target/riscv/pr102957.c: New.

(cherry picked from commit abe562bb01479ea2c8952ad98714f3225527aa7e)

i386: Fix up _mm_loadu_si{16,32} [PR99754]

These intrinsics are supposed to do an unaligned may_alias load
of a 16-bit or 32-bit value and store it as the first element of
a 128-bit integer vector, with all other elements cleared.

The current _mm_storeu_* implementation implements that correctly, uses
__*_u types to do the store and extracts the first element of a vector into
it.
But _mm_loadu_si{16,32} gets it all wrong.  It performs an aligned
non-may_alias load and because _mm_set_epi{16,32} has the args reversed,
it also inserts it into the last vector element instead of first.

The following patch fixes that.

Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2,
for _mm_loadu_si16 it says strangely SSE.  But the intrinsics
returns __m128i, which is only defined in emmintrin.h, and
_mm_set_epi16 is also only SSE2 and later in emmintrin.h.
Even clang defines it in emmintrin.h and ends up with inlining
failure when calling _mm_loadu_si16 from sse,no-sse2 function.
So, isn't that a bug in the intrinsic guide instead?

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

PR target/99754
* config/i386/emmintrin.h (_mm_loadu_si32): Put loaded value into
first rather than last element of the vector, use __m32_u to do
a really unaligned load, use just 0 instead of (int)0.
(_mm_loadu_si16): Put loaded value into first rather than last
element of the vector, use __m16_u to do a really unaligned load,
use just 0 instead of (short)0.

* gcc.target/i386/pr99754-1.c: New test.
* gcc.target/i386/pr99754-2.c: New test.

Daily bump.

x86: Use -msse2 on gcc.target/i386/pr95483-1.c

Replace -msse with -msse2 since <emmintrin.h> requires SSE2.

PR testsuite/105055
* gcc.target/i386/pr95483-1.c: Replace -msse with -msse2.

(cherry picked from commit bdd7b679e8497c07e25726f6ab6429e4c4d429c7)