git.ipfire.org Git - thirdparty/gcc.git/log

libstdc++: Remove incorrect copyright notice from header

This file has the SGI copyright notice, but contains no code from
the SGI STL. It was entirely written by me in 2019, originally as part
of the <memory> header. When I extracted it into a new header I
accidentally copied across the SGI copyright, but that only applies to
some much older parts of <memory>.

libstdc++-v3/ChangeLog:

* include/bits/uses_allocator_args.h: Remove incorrect copyright
notice.

(cherry picked from commit 7cce7b1c3d829172eb7f232e71ad194a0ad51931)

libstdc++: Improve config output for --enable-cstdio [PR104301]

Currently we just print "checking for underlying I/O to use... stdio"
unconditionally, whether configured to use stdio_pure or stdio_posix. We
should make it clear that the user's configure option chose the right
thing.

libstdc++-v3/ChangeLog:

PR libstdc++/104301
* acinclude.m4 (GLIBCXX_ENABLE_CSTDIO): Print different messages
for stdio_pure and stdio_posix options.
* configure: Regenerate.

(cherry picked from commit 19b8946dbda5fda4389ef8e3ea162c3df2b1998d)

Daily bump.

i386: Fix up ix86_expand_vector_init_general [PR105123]

The following testcase is miscompiled on ia32.
The problem is that at -O0 we end up with:
  vector(4) short unsigned int _1;
  short unsigned int u.0_3;
...
  _1 = {u.0_3, u.0_3, u.0_3, u.0_3};
statement (dead) which is wrongly expanded.
elt is (subreg:HI (reg:SI 83 [ u.0_3 ]) 0), tmp_mode SImode,
so after convert_mode we start with word (reg:SI 83 [ u.0_3 ]).
The intent is to manually broadcast that value to 2 SImode parts,
but because we pass word as target to expand_simple_binop, it will
overwrite (reg:SI 83 [ u.0_3 ]) and we end up with 0:
   10: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   12: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   14: clobber r110:V4HI
   15: r110:V4HI#0=r83:SI
   16: r110:V4HI#4=r83:SI
as the two ors do nothing and two shifts each by 16 left shift it all
away.
The following patch fixes that by using NULL_RTX target, so we expand it as
   10: {r110:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r111:SI=r110:SI|r83:SI;clobber flags:CC;}
   12: {r112:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r113:SI=r112:SI|r83:SI;clobber flags:CC;}
   14: clobber r114:V4HI
   15: r114:V4HI#0=r111:SI
   16: r114:V4HI#4=r113:SI
instead.

Another possibility would be to pass NULL_RTX only when word == elt
and word otherwise, where word would necessarily be a pseudo from the first
shift after passing NULL_RTX there once or pass NULL_RTX for the shift and
word for ior.

2022-04-03  Jakub Jelinek  <jakub@redhat.com>

PR target/105123
* config/i386/i386-expand.c (ix86_expand_vector_init_general): Avoid
using word as target for expand_simple_binop when doing ASHIFT and
IOR.

* gcc.target/i386/pr105123.c: New test.

(cherry picked from commit e1a74058b784c845e84a0cf1997b54b984df483d)

[PR105032] LRA: modify loop condition to find reload insns for hard reg splitting

When trying to split hard reg live range to assign hard reg to a reload
pseudo, LRA searches for reload insns of the reload pseudo
assuming a specific order of the reload insns. This order is violated if
reload involved in inheritance transformation. In such case, the loop used
for reload insn searching can become infinite. The patch fixes this.

gcc/ChangeLog:

PR middle-end/105032
* lra-assigns.c (find_reload_regno_insns): Modify loop condition.

gcc/testsuite/ChangeLog:

PR middle-end/105032
* gcc.target/i386/pr105032.c: New.

Daily bump.

c-family: ICE with -Wconversion and A ?: B [PR101030]

This patch fixes a crash in conversion_warning on a null expression.
It is null because the testcase uses the GNU A ?: B extension. We
could also use op0 instead of op1 in this case, but it doesn't seem
to be necessary.

PR c++/101030

gcc/c-family/ChangeLog:

* c-warn.c (conversion_warning) <case COND_EXPR>: Don't call
conversion_warning when OP1 is null.

gcc/testsuite/ChangeLog:

* g++.dg/ext/cond5.C: New test.

(cherry picked from commit 5db9ce171019f8915885cebd5cc5f4101bb926e6)

x86: Also use Yw in *ssse3_pshufbv8qi3 clobber

PR target/105068
* config/i386/sse.md (*ssse3_pshufbv8qi3): Also replace "Yv" with
"Yw" in clobber.

(cherry picked from commit cccbb776589c1825de1bd2eefabb11d72ef28de8)

RISC-V: Fixing -misa-spec [PR/target 104853]

gcc/ChangeLog:

* config.gcc (riscv*-*-*): Set right default isa spec.

RISC-V: Handle zi* extension correctly for arch-canonicalize script

Canonical order for z-prefixed extension are rely on the canonical order of
single letter extension, however we didn't put i into the list before,
so when we put zicsr or zifencei it will got exception.

gcc/ChangeLog:

* config/riscv/arch-canonicalize (CANONICAL_ORDER): Add `i` to
CANONICAL_ORDER.

(cherry picked from commit e399cde6f9c89cafbbf6c3274c0af3c369d4f872)

RISC-V: Fix register class subset checks for CLASS_MAX_NREGS

Fix the register class subset checks in the determination of the maximum
number of consecutive registers needed to hold a value of a given mode.

The number depends on whether a register is a general-purpose or a
floating-point register, so check whether the register class requested
is a subset (argument 1 to `reg_class_subset_p') rather than superset
(argument 2) of GR_REGS or FP_REGS class respectively.

gcc/
* config/riscv/riscv.c (riscv_class_max_nregs): Swap the
arguments to `reg_class_subset_p'.

(cherry picked from commit a31056e9196daf0a5b0e92d171b5227cc994103b)

RISC-V: Fix wrong zifencei handling in riscv_subset_list::to_string

This issue cause zifencei never correctly appended on the ISA string.

gcc/ChangeLog

* common/config/riscv/riscv-common.c (riscv_subset_list::to_string): Fix
wrong marco checking.

(cherry picked from commit 402d28998fa35d9ffc47aa084f66f9381491eeca)

RISC-V: jal cannot refer to a default visibility symbol for shared object.

This is the original binutils bugzilla report,
https://sourceware.org/bugzilla/show_bug.cgi?id=28509

And this is the first version of the proposed binutils patch,
https://sourceware.org/pipermail/binutils/2021-November/118398.html

After applying the binutils patch, I get the the unexpected error when
building libgcc,

/scratch/nelsonc/riscv-gnu-toolchain/riscv-gcc/libgcc/config/riscv/div.S:42:
/scratch/nelsonc/build-upstream/rv64gc-linux/build-install/riscv64-unknown-linux-gnu/bin/ld: relocation R_RISCV_JAL against `__udivdi3' which may bind externally can not be used when making a shared object; recompile with -fPIC

Therefore, this patch add an extra hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target a non-preemptible symbol instead.
The solution is similar to glibc as follows,
https://sourceware.org/git/?p=glibc.git;a=commit;h=68389203832ab39dd0dbaabbc4059e7fff51c29b

libgcc/ChangeLog:

* config/riscv/div.S: Add the hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target it since it is non-preemptible.
* config/riscv/riscv-asm.h: Added new macros HIDDEN_JUMPTARGET and
HIDDEN_DEF.

(cherry picked from commit 45116f342057b7facecd3d05c2091ce3a77eda59)

RISC-V: Fix use-after-free error in `parse_multiletter_ext'

Avoid undefined arithmetic involving a pointer to a heap allocation that
has been freed and move a problematic calculation ahead of the following
call to `free' in `riscv_subset_list::parse_multiletter_ext', removing a
compilation error:

.../gcc/common/config/riscv/riscv-common.c: In member function 'const char* riscv_subset_list::parse_multiletter_ext(const char*, const char*, const char*)':
.../gcc/common/config/riscv/riscv-common.cc:905:27: error: pointer 'subset' used after 'void free(void*)' [-Werror=use-after-free]
  905 |       p += end_of_version - subset;
      |            ~~~~~~~~~~~~~~~^~~~~~~~
.../gcc/common/config/riscv/riscv-common.cc:904:12: note: call to 'void free(void*)' here
  904 |       free (subset);
      |       ~~~~~^~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [Makefile:2428: riscv-common.o] Error 1

and a build regression from commit 671a283636de ("Add -Wuse-after-free
[PR80532].").

gcc/
* common/config/riscv/riscv-common.c
(riscv_subset_list::parse_multiletter_ext): Move pointer
arithmetic ahead of `free'.

(cherry picked from commit dad495e30135904b0d0305eab8c0ce5f838440d4)

RISC-V: Do not emit zcisr and zifencei if i-ext is 2.0

I-ext 2.0 already included zicsr and zifencei, skip that prevent
confusing binutils.

gcc/ChangeLog

* common/config/riscv/riscv-common.c (riscv_subset_list::to_string):
Skip zicsr and zifencei if I-ext is 2.0.

(cherry picked from commit ca2bbb88f999f4d3cc40e89bc1aba712505dd598)

RISC-V: Fix detection of zifencei support for binutils

- binutils will complain version info is not found if default ISA spec
is 2.2 for binutils.

Error: cannot find default versions of the ISA extension `zifencei'

gcc/ChangeLog:

* configure.ac: Fix detection for zifencei support.
* configure: Regenerate.

(cherry picked from commit affdeda16ef7fbd34f850443fe63bb407714297e)

ubsan: Fix ICE due to -fsanitize=object-size [PR105093]

The following testcase ICEs, because for a volatile X & RESULT_DECL
ubsan wants to take address of that reference.  instrument_object_size
is called with x, so the base is equal to the access and the var
is automatic, so there is no risk of an out of bounds access for it.
Normally we wouldn't instrument those because we fold address of the
t - address of inner to 0, add constant size of the decl and it is
equal to what __builtin_object_size computes.  But the volatile
results in the subtraction not being folded.

The first hunk fixes it by punting if we access the whole automatic
decl, so that even volatile won't cause a problem.
The second hunk (not strictly needed for this testcase) is similar
to what has been added to asan.cc recently, if we actually take
address of a decl and keep it in the IL, we better mark it addressable.

2022-03-30  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/105093
* ubsan.c (instrument_object_size): If t is equal to inner and
is a decl other than global var, punt.  When emitting call to
UBSAN_OBJECT_SIZE ifn, make sure base is addressable.

* g++.dg/ubsan/pr105093.C: New test.

(cherry picked from commit e3e68fa59ead502c24950298b53c637bbe535a74)

store-merging: Avoid ICEs on roughly ~0ULL/8 sized stores [PR105094]

On the following testcase on 64-bit targets, store-merging sees
a MEM_REF store from {} ctor with "negative" bitsize where bitoff + bitsize
wraps around to very small end offset.  This later confuses the code
so that it allocates just a few bytes of memory but fills in huge amounts of
it.  Later on there is a param_store_merging_max_size size check but due to
the wrap-around we pass that.

The following patch punts on such large bitsizes.

2022-03-30  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/105094
* gimple-ssa-store-merging.c (mem_valid_for_store_merging): Punt if
bitsize <= 0 rather than just == 0.

* gcc.dg/pr105094.c: New test.

(cherry picked from commit 387e818cda0ffde86f624228c3da1ab28f453685)

LTO: bump bytecode version

The following revision 91f7d7e1bb6827bf8e0b7ba7eb949953a5b1bd18
breaks bytecode as it introduces a new param.

gcc/ChangeLog:

* lto-streamer.h (LTO_minor_version): Bump it.

c++: Fox template-introduction tentative parsing in class bodies clear colon_corrects_to_scope_p [PR105061]

The concepts support (in particular template introductions from concepts TS)
broke the following testcase, valid unnamed bitfields with dependent
types (or even just typedefs) were diagnosed as typos (: instead of correct
::) in template introduction during their tentative parsing.
The following patch fixes that by not doing this : to :: correction when
member_p is true.

2022-03-30 Jakub Jelinek <jakub@redhat.com>

PR c++/105061
* parser.c (cp_parser_template_introduction): If member_p, temporarily
clear parser->colon_corrects_to_scope_p around tentative parsing of
nested name specifier.

* g++.dg/concepts/pr105061.C: New test.

(cherry picked from commit 4f2795218a6ba6a7b7b9b18ca7a6e390661e1608)

Daily bump.

c++: Fix up __builtin_{bit_cast,convertvector} parsing

Jonathan reported on IRC that we don't parse
__builtin_bit_cast (type, val).field
etc.
The problem is that for these 2 builtins we return from
cp_parser_postfix_expression instead of setting postfix_expression
to the cp_build_* value and falling through into the postfix regression
suffix handling loop.

2022-03-26 Jakub Jelinek <jakub@redhat.com>

* parser.c (cp_parser_postfix_expression)
<case RID_BILTIN_CONVERTVECTOR, case RID_BUILTIN_BIT_CAST>: Don't
return cp_build_{vec,convert,bit_cast} result right away, instead
set postfix_expression to it and break.

* c-c++-common/builtin-convertvector-3.c: New test.
* g++.dg/cpp2a/bit-cast15.C: New test.

(cherry picked from commit 1806829e08f14e4cacacec43d7845cc2dad2ddc8)

fold-const: Handle C++ dependent COMPONENT_REFs in operand_equal_p [PR105035]

As mentioned in the PR, operand_equal_p already contains some hacks so that
it can be called already on pre-instantiation C++ trees from templates,
but the recent change to compare DECL_FIELD_OFFSET in the COMPONENT_REF
case broke this. Many such COMPONENT_REFs are already punted on earlier
because they have NULL TREE_TYPE, but in this case the code knows what
type they have but still uses an IDENTIFIER_NODE as second operand
of COMPONENT_REF (I think SCOPE_REF is something that could be used too).

The following patch looks at those DECL_FIELD_*OFFSET fields only if
both field[01] args are FIELD_DECLs and otherwise keeps it to the
earlier OP_SAME (1) check that guards this whole block.

2022-03-24 Jakub Jelinek <jakub@redhat.com>

PR c++/105035
* fold-const.c (operand_equal_p) <case COMPONENT_REF>: If either
field0 or field1 is not a FIELD_DECL, return false.

* g++.dg/warn/Wduplicated-cond2.C: New test.

(cherry picked from commit 8698ff67cdff4364c8adad2921ed532359a155ec)

c++: extern thread_local declarations in constexpr [PR104994]

C++14 to C++20 apparently should allow extern thread_local declarations in
constexpr functions, however useless they are there (because accessing
such vars is not valid in a constant expression, perhaps sizeof/decltype).
P2242 changed that for C++23 to passing through declaration but
https://cplusplus.github.io/CWG/issues/2552.html
has been filed for it yesterday.

2022-03-24 Jakub Jelinek <jakub@redhat.com>

PR c++/104994
* constexpr.c (potential_constant_expression_1): Don't diagnose extern
thread_local declarations.
* decl.c (start_decl): Likewise.

* g++.dg/cpp23/constexpr-nonlit7.C: New test.

(cherry picked from commit 72124f487ccb5c8065dd5f7b8fba254600b7e611)

i386: Don't emit pushf;pop for __builtin_ia32_readeflags_u* with unused lhs [PR104971]

__builtin_ia32_readeflags_u* aren't marked const or pure I think
intentionally, so that they aren't CSEd from different regions of a function
etc. because we don't and can't easily track all dependencies between
it and surrounding code (if somebody looks at the condition flags, it is
dependent on the vast majority of instructions).
But the builtin itself doesn't have any side-effects, so if we ignore the
result of the builtin, there is no point to emit anything.

There is a LRA bug that miscompiles the testcase which this patch makes
latent, which is certainly worth fixing too, but IMHO this change
(and maybe ix86_gimple_fold_builtin too which would fold it even earlier
when it looses lhs) is worth it as well.

2022-03-19 Jakub Jelinek <jakub@redhat.com>

PR middle-end/104971
* config/i386/i386-expand.c
(ix86_expand_builtin) <case IX86_BUILTIN_READ_FLAGS>: If ignore,
don't push/pop anything and just return const0_rtx.

* gcc.target/i386/pr104971.c: New test.

(cherry picked from commit b60bc913cca7439d29a7ec9e9a7f448d8841b43c)

c-family: Fix up ICE during pretty-printing of PMF related expression [PR101515]

The intent of r11-6729 is that it prints something that helps user to figure
out what exactly is being accessed.
When we find a unique non-static data member that is being accessed, even
when we can't fold it nicely, IMNSHO it is better to print
  ((sometype *)&var)->field
or
  (*(sometype *)&var).field
instead of
  *(fieldtype *)((char *)&var + 56)
because the user doesn't know what is at offset 56, we shouldn't ask user
to decipher structure layout etc.

One question is if we could return something better for the TYPE_PTRMEMFUNC_FLAG
RECORD_TYPE members here (something that would print it more naturally/readably
in a C++ way), though the fact that the routine is in c-family makes it
harder.

Another one is whether we shouldn't punt for FIELD_DECLs that don't have
nicely printable name of its containing scope, something like:
                if (tree scope = get_containing_scope (field))
                  if (TYPE_P (scope) && TYPE_NAME (scope) == NULL_TREE)
                    break;
                return cop;
or so.  This patch implements that.

Note the returned cop is a COMPONENT_REF where the first argument has a
nicely printable type name (x with type sp), but sp's TYPE_MAIN_VARIANT
is the unnamed TYPE_PTRMEMFUNC_FLAG.  So another possibility would be if
we see such a problem for the FIELD_DECL's scope, check if TYPE_MAIN_VARIANT
of the first COMPONENT_REF's argument is equal to that scope and in that
case use TREE_TYPE of the first COMPONENT_REF's argument as the scope
instead.

2022-03-19  Jakub Jelinek  <jakub@redhat.com>

PR c++/101515
* c-pretty-print.c (c_fold_indirect_ref_for_warn): For C++ don't
return COMPONENT_REFs with FIELD_DECLs whose containing scope can't
be printed.

* g++.dg/warn/pr101515.C: New test.

(cherry picked from commit 2663d18356b0a62f5a800c7e5596d814cd3c2c41)

Allow (void *) 0xdeadbeef accesses without warnings [PR99578]

Starting with GCC11 we keep emitting false positive -Warray-bounds or
-Wstringop-overflow etc. warnings on widely used *(type *)0x12345000
style accesses (or memory/string routines to such pointers).
This is a standard programming style supported by all C/C++ compilers
I've ever tried, used mostly in kernel or DSP programming, but sometimes
also together with mmap MAP_FIXED when certain things, often I/O registers
but could be anything else too are known to be present at fixed
addresses.

Such INTEGER_CST addresses can appear in code either because a user
used it like that (in which case it is fine) or because somebody used
pointer arithmetics (including &((struct whatever *)NULL)->field) on
a NULL pointer.  The middle-end warning code wrongly assumes that the
latter case is what is very likely, while the former is unlikely and
users should change their code.

The following patch adds a min-pagesize param defaulting to 4KB,
and treats INTEGER_CST addresses smaller than that as assumed results
of pointer arithmetics from NULL while addresses equal or larger than
that as expected user constant addresses.  For GCC 13 we can
represent results from pointer arithmetics on NULL using
&MEM[(void*)0 + offset] instead of (void*)offset INTEGER_CSTs.

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/99578
PR middle-end/100680
PR tree-optimization/100834
* params.opt (--param=min-pagesize=): New parameter.
* builtins.c (compute_objsize_r) <case INTEGER_CST>: Use maximum
object size instead of zero for pointer constants equal or larger
than min-pagesize.

* gcc.dg/tree-ssa/pr99578-1.c: New test.
* gcc.dg/pr99578-1.c: New test.
* gcc.dg/pr99578-2.c: New test.
* gcc.dg/pr99578-3.c: New test.
* gcc.dg/pr100680.c: New test.
* gcc.dg/pr100834.c: New test.

(cherry picked from commit 32ca611c42658948f1b8883994796f35e8b4e74d)

c++: Fix up constexpr evaluation of new with zero sized types [PR104568]

The new expression constant expression evaluation right now tries to
deduce how many elts the array it uses for the heap or heap [] vars
should have (or how many elts should its trailing array have if it has
cookie at the start).  As new is lowered at that point to
(some_type *) ::operator new (size)
or so, it computes it by subtracting cookie size if any from size, then
divides the result by sizeof (some_type).
This works fine for most types, except when sizeof (some_type) is 0,
then we divide by zero; size is then equal to cookie_size (or if there
is no cookie, to 0).
The following patch special cases those cases so that we don't divide
by zero and also recover the original outer_nelts from the expression
by forcing the size not to be folded in that case but be explicit
0 * outer_nelts or cookie_size + 0 * outer_nelts.

Note, we have further issues, we accept-invalid various cases, for both
zero sized elt_type and even non-zero sized elts, we aren't able to
diagnose out of bounds POINTER_PLUS_EXPR like:
constexpr bool
foo ()
{
  auto p = new int[2];
  auto q1 = &p[0];
  auto q2 = &p[1];
  auto q3 = &p[2];
  auto q4 = &p[3];
  delete[] p;
  return true;
}
constexpr bool a = foo ();
That doesn't look like a regression so I think we should resolve that for
GCC 13, but there are 2 problems.  Figure out why
cxx_fold_pointer_plus_expression doesn't deal with the &heap []
etc. cases, and for the zero sized arrays, I think we really need to preserve
whether user wrote an array ref or pointer addition, because in the
&p[3] case if sizeof(p[0]) == 0 we know that if it has 2 elements it is
out of bounds, while if we see p p+ 0 the information if it was
p + 2 or p + 3 in the source is lost.
clang++ seems to handle it fine even in the zero sized cases or with
new expressions.

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/104568
* init.c (build_new_constexpr_heap_type): Remove FULL_SIZE
argument and its handling, instead add ITYPE2 argument.  Only
support COOKIE_SIZE != NULL.
(build_new_1): If size is 0, change it to 0 * outer_nelts if
outer_nelts is non-NULL.  Pass type rather than elt_type to
maybe_wrap_new_for_constexpr.
* constexpr.c (build_new_constexpr_heap_type): New function.
(cxx_eval_constant_expression) <case CONVERT_EXPR>:
If elt_size is zero sized type, try to recover outer_nelts from
the size argument to operator new/new[] and pass that as
arg_size to build_new_constexpr_heap_type.  Pass ctx,
non_constant_p and overflow_p to that call too.

* g++.dg/cpp2a/constexpr-new22.C: New test.

(cherry picked from commit 0a0c2c3f06227d46b5e9542dfdd4e0fd2d67d894)

libatomic: Improve 16-byte atomics on Intel AVX [PR104688]

As mentioned in the PR, the latest Intel SDM has added:
"Processors that enumerate support for Intel® AVX (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
guarantee that the 16-byte memory operations performed by the following instructions will always be
carried out atomically:
• MOVAPD, MOVAPS, and MOVDQA.
• VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
• VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and k0 (masking disabled).
(Note that these instructions require the linear addresses of their memory operands to be 16-byte
aligned.)"

The following patch deals with it just on the libatomic library side so far,
currently (since ~ 2017) we emit all the __atomic_* 16-byte builtins as
library calls since and this is something that we can hopefully backport.

The patch simply introduces yet another ifunc variant that takes priority
over the pure CMPXCHG16B one, one that checks AVX and CMPXCHG16B bits and
on non-Intel clears the AVX bit during detection for now (if AMD comes
with the same guarantee, we could revert the config/x86/init.c hunk),
which implements 16-byte atomic load as vmovdqa and 16-byte atomic store
as vmovdqa followed by mfence.

2022-03-17 Jakub Jelinek <jakub@redhat.com>

PR target/104688
* Makefile.am (IFUNC_OPTIONS): Change on x86_64 to -mcx16 -mcx16.
(libatomic_la_LIBADD): Add $(addsuffix _16_2_.lo,$(SIZEOBJS)) for
x86_64.
* Makefile.in: Regenerated.
* config/x86/host-config.h (IFUNC_COND_1): For x86_64 define to
both AVX and CMPXCHG16B bits.
(IFUNC_COND_2): Define.
(IFUNC_NCOND): For x86_64 define to 2 * (N == 16).
(MAYBE_HAVE_ATOMIC_CAS_16, MAYBE_HAVE_ATOMIC_EXCHANGE_16,
MAYBE_HAVE_ATOMIC_LDST_16): Define to IFUNC_COND_2 rather than
IFUNC_COND_1.
(HAVE_ATOMIC_CAS_16): Redefine to 1 whenever IFUNC_ALT != 0.
(HAVE_ATOMIC_LDST_16): Redefine to 1 whenever IFUNC_ALT == 1.
(atomic_compare_exchange_n): Define whenever IFUNC_ALT != 0
on x86_64 for N == 16.
(__atomic_load_n, __atomic_store_n): Redefine whenever IFUNC_ALT == 1
on x86_64 for N == 16.
(atomic_load_n, atomic_store_n): New functions.
* config/x86/init.c (__libat_feat1_init): On x86_64 clear bit_AVX
if CPU vendor is not Intel.

(cherry picked from commit 1d47c0512a265d4bb3ab9e56259fd1e4f4d42c75)

aarch64: Fix up RTL sharing bug in aarch64_load_symref_appropriately [PR104910]

We unshare all RTL created during expansion, but when
aarch64_load_symref_appropriately is called after expansion like in the
following testcases, we use imm in both HIGH and LO_SUM operands.
If imm is some RTL that shouldn't be shared like a non-sharable CONST,
we get at least with --enable-checking=rtl a checking ICE, otherwise might
just get silently wrong code.

The following patch fixes that by copying it if it can't be shared.

2022-03-16 Jakub Jelinek <jakub@redhat.com>

PR target/104910
* config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Copy
imm rtx.

* gcc.dg/pr104910.c: New test.

(cherry picked from commit 952155629ca1a4dfe7c7b26e53d118a9b853ed4a)

ifcvt: Punt if not onlyjump_p for find_if_case_{1,2} [PR104814]

find_if_case_{1,2} implicitly assumes conditional jumps and rewrites them,
so if they have extra side-effects or are say asm goto, things don't work
well, either the side-effects are lost or we could ICE.
In particular, the testcase below on s390x has there a doloop instruction
that decrements a register in addition to testing it for non-zero and
conditionally jumping based on that.

The following patch fixes that by punting for !onlyjump_p case, i.e.
if there are side-effects in the jump instruction or it isn't a plain PC
setter.

Also, it assumes BB_END (test_bb) will be always non-NULL, because basic
blocks with 2 non-abnormal successor edges should always have some instruction
at the end that determines which edge to take.

2022-03-15 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/104814
* ifcvt.c (find_if_case_1, find_if_case_2): Punt if test_bb doesn't
end with onlyjump_p. Assume BB_END (test_bb) is always non-NULL.

* gcc.c-torture/execute/pr104814.c: New test.

(cherry picked from commit a2645cd8fb33b36d737b310e26f4c47401305c7b)

c, c++, c-family: -Wshift-negative-value and -Wshift-overflow* tweaks for -fwrapv and C++20+ [PR104711]

As mentioned in the PR, different standards have different definition
on what is an UB left shift.  They all agree on out of bounds (including
negative) shift count.
The rules used by ubsan are:
C99-C2x ((unsigned) x >> (uprecm1 - y)) != 0 then UB
C++11-C++17 x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1 then UB
C++20 and later everything is well defined
Now, for C++20, I've in the P1236R1 implementation added an early
exit for -Wshift-overflow* warning so that it never warns, but apparently
-Wshift-negative-value remained as is.  As it is well defined in C++20,
the following patch doesn't enable -Wshift-negative-value from -Wextra
anymore for C++20 and later, if users want for compatibility with C++17
and earlier get the warning, they still can by using -Wshift-negative-value
explicitly.
Another thing is -fwrapv, that is an extension to the standards, so it is up
to us how exactly we define that case.  Our ubsan code treats
TYPE_OVERFLOW_WRAPS (type0) and cxx_dialect >= cxx20 the same as only
diagnosing out of bounds shift count and nothing else and IMHO it is most
sensical to treat -fwrapv signed left shifts the same as C++20 treats
them, https://eel.is/c++draft/expr.shift#2
"The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N,
where N is the width of the type of the result.
[Note 1: E1 is left-shifted E2 bit positions; vacated bits are zero-filled.
— end note]"
with no UB dependent on the E1 values.  The UB is only
"The behavior is undefined if the right operand is negative, or greater
than or equal to the width of the promoted left operand."
Under the hood (except for FEs and ubsan from FEs) GCC middle-end doesn't
consider UB in left shifts dependent on the first operand's value, only
the out of bounds shifts.

While this change isn't a regression, I'd think it is useful for GCC 12,
it doesn't add new warnings, but just removes warnings that aren't
appropriate.

2022-03-09  Jakub Jelinek  <jakub@redhat.com>

PR c/104711
gcc/
* doc/invoke.texi (-Wextra): Document that -Wshift-negative-value
is enabled by it only for C++11 to C++17 rather than for C++03 or
later.
(-Wshift-negative-value): Similarly (except here we stated
that it is enabled for C++11 or later).
gcc/c-family/
* c-opts.c (c_common_post_options): Don't enable
-Wshift-negative-value from -Wextra for C++20 or later.
* c-ubsan.c (ubsan_instrument_shift): Adjust comments.
* c-warn.c (maybe_warn_shift_overflow): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
gcc/c/
* c-fold.c (c_fully_fold_internal): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
* c-typeck.c (build_binary_op): Likewise.
gcc/cp/
* constexpr.c (cxx_eval_check_shift_p): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
* typeck.c (cp_build_binary_op): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
gcc/testsuite/
* c-c++-common/Wshift-negative-value-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-negative-value-2.c: Likewise.
* c-c++-common/Wshift-negative-value-3.c: Likewise.
* c-c++-common/Wshift-negative-value-4.c: Likewise.
* c-c++-common/Wshift-negative-value-7.c: New test.
* c-c++-common/Wshift-negative-value-8.c: New test.
* c-c++-common/Wshift-negative-value-9.c: New test.
* c-c++-common/Wshift-negative-value-10.c: New test.
* c-c++-common/Wshift-overflow-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-overflow-2.c: Likewise.
* c-c++-common/Wshift-overflow-5.c: Likewise.
* c-c++-common/Wshift-overflow-6.c: Likewise.
* c-c++-common/Wshift-overflow-7.c: Likewise.
* c-c++-common/Wshift-overflow-8.c: New test.
* c-c++-common/Wshift-overflow-9.c: New test.
* c-c++-common/Wshift-overflow-10.c: New test.
* c-c++-common/Wshift-overflow-11.c: New test.
* c-c++-common/Wshift-overflow-12.c: New test.

(cherry picked from commit d76511138dc816ef66fd16f71531f48c37dac3b4)

c++: Don't suggest cdtor or conversion op identifiers in spelling hints [PR104806]

On the following testcase, we emit "did you mean '__dt '?" in the error
message. "__dt " shows there because it is dtor_identifier, but we
shouldn't suggest those to the user, they are purely internal and can't
be really typed by the user because of the final space in it.

2022-03-08 Jakub Jelinek <jakub@redhat.com>

PR c++/104806
* search.c (lookup_field_fuzzy_info::fuzzy_lookup_field): Ignore
identifiers with space at the end.

* g++.dg/spellcheck-pr104806.C: New test.

(cherry picked from commit e480c3c06d20874fd7504bfdcca0b829f8000389)

s390: Fix up *cmp_and_trap_unsigned_int<mode> constraints [PR104775]

The following testcase fails to assemble due to clgte %r6,0(%r1,%r10)
insn not being accepted by assembler.
My rough understanding is that in the RSY-b insn format the spot
in other formats used for index registers is used instead for M3 what
kind of comparison it is, so this patch follows what other similar
instructions use for constraint (i.e. one without index register).

2022-03-07 Jakub Jelinek <jakub@redhat.com>

PR target/104775
* config/s390/s390.md (*cmp_and_trap_unsigned_int<mode>): Use
S constraint instead of T in the last alternative.

* gcc.target/s390/pr104775.c: New test.

(cherry picked from commit 2472dcaa8cb9e02e902f83d419c3ee7e0f3d9041)

cfgrtl: Fix up -g vs. -g0 code generation -flto differences in fixup_reorder_chain [PR104589]

This is similar to PR104237 and similarly to that, no testcase included
for the testsuite, as we don't have a framework to compile/link with
-g -flto and -g0 -flto and compare -fdump-final-insns= results from
the lto1 compilations.

With -flto, whether two location_t compare equal or not and just
express the same location is a lottery.

2022-03-02 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/104589
* cfgrtl.c (fixup_reorder_chain): Use loc_equal instead of direct
INSN_LOCATION comparison with goto_locus.

(cherry picked from commit 2e1b00367abaf8b6dbb47fd8518d9ac69c326a06)

match.pd: Further complex simplification fixes [PR104675]

Mark mentioned in the PR further 2 simplifications that also ICE
with complex types.
For these, eventually (but IMO GCC 13 materials) we could support it
for vector types if it would be uniform vector constants.
Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs
and we can't use bit_and etc. for complex type.

2022-02-25 Jakub Jelinek <jakub@redhat.com>
Marc Glisse <marc.glisse@inria.fr>

PR tree-optimization/104675
* match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1):
Restrict simplifications to INTEGRAL_TYPE_P.

* gcc.dg/pr104675-3.c : New test.

(cherry picked from commit f62115c9b770a66c5378f78a2d5866243d560573)

rs6000: Use rs6000_emit_move in movmisalign<mode> expander [PR104681]

The following testcase ICEs, because for some strange reason it decides to use
movmisaligntf during expansion where the destination is MEM and source is
CONST_DOUBLE. For normal mov<mode> expanders the rs6000 backend uses
rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG
and a few other things, but for movmisalign<mode> nothing enforced this.
The middle-end documents that movmisalign<mode> shouldn't fail, so we can't
force that through predicates or condition on the expander.

2022-02-25 Jakub Jelinek <jakub@redhat.com>

PR target/104681
* config/rs6000/vector.md (movmisalign<mode>): Use rs6000_emit_move.

* g++.dg/opt/pr104681.C: New test.

(cherry picked from commit 3885a122f817a1b6dca4a84ba9e020d5ab2060af)

i386: Use a new temp slot kind for splitter to floatdi<mode>2_i387_with_xmm [PR104674]

As mentioned in the PR, the following testcase is miscompiled for similar
reasons as the already fixed PR78791 - we use SLOT_TEMP slots in various
places during expansion and during expansion we can guarantee that the
lifetime of those temporary slot doesn't overlap. But the following
splitter uses SLOT_TEMP too and in between expansion and split1 there is
a possibility that something extends the lifetime of SLOT_TEMP created
slots across an instruction that will be split by this splitter.

The following patch fixes it by using a new temp slot kind to make sure
it doesn't reuse a SLOT_TEMP that could be live across the instruction.

2022-02-25 Jakub Jelinek <jakub@redhat.com>

PR target/104674
* config/i386/i386.h (enum ix86_stack_slot): Add SLOT_FLOATxFDI_387.
* config/i386/i386.md (splitter to floatdi<mode>2_i387_with_xmm): Use
SLOT_FLOATxFDI_387 rather than SLOT_TEMP.

* gcc.target/i386/pr104674.c: New test.

(cherry picked from commit eabf7bbe601f2c0d87bd0a1012d7a602df2037da)

match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]

We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types,
&/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR.
So, we should avoid simplifications which turn valid complex type
expressions into something that will ICE during expansion.

2022-02-25 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/104675
* match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for
COMPLEX_TYPE.

* gcc.dg/pr104675-1.c: New test.
* gcc.dg/pr104675-2.c: New test.

(cherry picked from commit 758671b88b78d7629376b118ec6ca6bcfbabbd36)

sccvn: Fix visit_reference_op_call value numbering of vdefs [PR104601]

The following testcase is miscompiled, because -fipa-pure-const discovers
that bar is const, but when sccvn during fre3 sees
  # .MEM_140 = VDEF <.MEM_96>
  *__pred$__d_43 = _50 (_49);
where _50 value numbers to &bar, it value numbers .MEM_140 to
vuse_ssa_val (gimple_vuse (stmt)).  For const/pure calls that return
a SSA_NAME (or don't have lhs) that is fine, those calls don't store
anything, but if the lhs is present and not an SSA_NAME, value numbering
the vdef to anything but itself means that e.g. walk_non_aliased_vuses
won't consider the call, but the call acts as a store to its lhs.
When it is ignored, sccvn will return whatever has been stored to the
lhs earlier.

I've bootstrapped/regtested an earlier version of this patch, which did the
if (!lhs && gimple_call_lhs (stmt))
  changed |= set_ssa_val_to (vdef, vdef);
part before else if (vnresult->result_vdef), and that regressed
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo \\\\(" 1
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo2 \\\\(" 1
so this updated patch uses result_vdef there as before and only otherwise
(which I think must be the const/pure case) decides based on whether the
lhs is non-SSA_NAME.

2022-02-24  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/104601
* tree-ssa-sccvn.c (visit_reference_op_call): For calls with
non-SSA_NAME lhs value number vdef to itself instead of e.g. the
vuse value number.

* g++.dg/torture/pr104601.C: New test.

(cherry picked from commit 9251b457eb8df912f2e8203d0ee8ab300c041520)

libiberty: Fix up debug.temp.o creation if *.o has 64K+ sections [PR104617]

On
#define A(n) int foo1##n(void) { return 1##n; }
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325)
B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642)
testcase with
./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto  -O0 -o foo1.o foo1.c -ffunction-sections
./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o
/tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized
(testcase too slow to be included into testsuite).
The problem is clearly reported by readelf:
readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323
readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section.
because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and
sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE
inclusive.  Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE
range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym
where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be
used instead and .symtab_shndx section should contain the real section
index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >=
SHN_LORESERVE value is needed it should put those into
Shdr[0].sh_{size,link}.  But, sh_{link,info} are 32-bit fields which can
contain any section index.

Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before
2011) used to mishandle the > 63.75K sections case and assumed there is a
hole in between the sections, but what
simple_object_elf_copy_lto_debug_sections does wouldn't help in that case
for the debug temp object creation, we'd need to detect the case also in
that routine and take it into account in the remapping etc.  I think
it is not worth it given that it is over 10 years, if somebody needs
63.75K or more sections, better use more recent binutils.

2022-02-22  Jakub Jelinek  <jakub@redhat.com>

PR lto/104617
* simple-object-elf.c (simple_object_elf_match): Fix up URL
in comment.
(simple_object_elf_copy_lto_debug_sections): Remap sh_info and
sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE
range (inclusive).

(cherry picked from commit 2f59f067610f22c3f2ec9b1516e24b85836676ed)

asan: Mark instrumented vars addressable [PR102656]

We ICE on the following testcase, because the asan1 pass decides to
instrument
  <retval>.x = 0;
and does that by
  _13 = &<retval>.x;
  .ASAN_CHECK (7, _13, 4, 4);
  <retval>.x = 0;
and later sanopt pass turns that into:
  _39 = (unsigned long) &<retval>.x;
  _40 = _39 >> 3;
  _41 = _40 + 2147450880;
  _42 = (signed char *) _41;
  _43 = *_42;
  _44 = _43 != 0;
  _45 = _39 & 7;
  _46 = (signed char) _45;
  _47 = _46 + 3;
  _48 = _47 >= _43;
  _49 = _44 & _48;
  if (_49 != 0)
    goto <bb 10>; [0.05%]
  else
    goto <bb 9>; [99.95%]

  <bb 10> [local count: 536864]:
  __builtin___asan_report_store4 (_39);

  <bb 9> [local count: 1073741824]:
  <retval>.x = 0;
The problem is during expansion, <retval> isn't marked TREE_ADDRESSABLE,
even when we take its address in (unsigned long) &<retval>.x.

Now, instrument_derefs has code to avoid the instrumentation altogether
if we can prove the access is within bounds of an automatic variable in the
current function and the var isn't TREE_ADDRESSABLE (or we don't instrument
use after scope), but we do it solely for VAR_DECLs.

I think we should treat RESULT_DECLs exactly like that too, which is what
the following patch does.  I must say I'm unsure about PARM_DECLs, those can
have different cases, either they are fully or partially passed in
registers, then if we take parameter's address, they are in a local copy
inside of a function and so work like those automatic vars.  But if they
are fully passed in memory, we typically just take address of the slot
and in that case they live in the caller's frame.  It is true we don't
(can't) put any asan padding in between the arguments, so all asan could
detect in that case is if caller passes fewer on stack arguments or smaller
arguments than callee accepts.  Anyway, as I'm unsure, I haven't added
PARM_DECLs to that case.

And another thing is, when we actually build_fold_addr_expr, we need to
mark_addressable the inner if it isn't addressable already.

2022-02-19  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/102656
* asan.c (instrument_derefs): If inner is a RESULT_DECL and access is
known to be within bounds, treat it like automatic variables.
If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from
current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark
it addressable.

* g++.dg/asan/pr102656.C: New test.

(cherry picked from commit 9e3bbb4a8024121eb0fa675cb1f074218c1345a6)

c: -Wmissing-field-initializers and designated inits [PR82283, PR84685]

This patch fixes two kinds of wrong -Wmissing-field-initializers
warnings.  Our docs say that this warning "does not warn about designated
initializers", but we give a warning for

1) the array case:

  struct S {
    struct N {
      int a;
      int b;
    } c[1];
  } d = {
    .c[0].a = 1,
    .c[0].b = 1, // missing initializer for field 'b' of 'struct N'
  };

we warn because push_init_level, when constructing an array, clears
constructor_designated (which the warning relies on), and we forget
that we were in a designated initializer context.  Fixed by the
push_init_level hunk; and

2) the compound literal case:

  struct T {
    int a;
    int *b;
    int c;
  };

  struct T t = { .b = (int[]){1} }; // missing initializer for field 'c' of 'struct T'

where set_designator properly sets constructor_designated to 1, but the
compound literal causes us to create a whole new initializer_stack in
start_init, which clears constructor_designated.  Then, after we've parsed
the compound literal, finish_init flushes the initializer_stack entry,
but doesn't restore constructor_designated, so we forget we were in
a designated initializer context, which causes the bogus warning.  (The
designated flag is also tracked in constructor_stack, but in this case,
we didn't perform push_init_level between set_designator and start_init
so it wasn't saved anywhere.)

PR c/82283
PR c/84685

gcc/c/ChangeLog:

* c-typeck.c (struct initializer_stack): Add 'designated' member.
(start_init): Set it.
(finish_init): Restore constructor_designated.
(push_init_level): Set constructor_designated to the value of
constructor_designated in the upper constructor_stack.

gcc/testsuite/ChangeLog:

* gcc.dg/Wmissing-field-initializers-1.c: New test.
* gcc.dg/Wmissing-field-initializers-2.c: New test.
* gcc.dg/Wmissing-field-initializers-3.c: New test.
* gcc.dg/Wmissing-field-initializers-4.c: New test.
* gcc.dg/Wmissing-field-initializers-5.c: New test.

(cherry picked from commit 4b7d9f8f51bd96d290aac230c71e501fcb6b21a6)

c++: alignas and alignof void [PR104944]

I started looking into this PR because in GCC 4.9 we were able to
detect the invalid

  struct alignas(void) S{};

but I broke it in r210262.

It's ill-formed code in C++:
[dcl.align]/3: "An alignment-specifier of the form alignas(type-id) has
the same effect as alignas(alignof(type-id))", and [expr.align]/1:
"The operand shall be a type-id representing a complete object type,
or an array thereof, or a reference to one of those types." and void
is not a complete type.

It's also invalid in C:
6.7.5: _Alignas(type-name) is equivalent to _Alignas(_Alignof(type-name))
6.5.3.4: "The _Alignof operator shall not be applied to a function type
or an incomplete type."

We have a GNU extension whereby we treat sizeof(void) as 1, but I assume
it doesn't apply to alignof, at least in C++.  However, __alignof__(void)
is still accepted with a -Wpedantic warning.

We still say "invalid application of 'alignof'" rather than 'alignas' in the
void diagnostic, but I felt that fixing that may not be suitable as part of
this patch.  The "incomplete type" diagnostic still always prints
'__alignof__'.

PR c++/104944

gcc/cp/ChangeLog:

* typeck.c (cxx_sizeof_or_alignof_type): Diagnose alignof(void).
(cxx_alignas_expr): Call cxx_sizeof_or_alignof_type with
complain == true.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alignas20.C: New test.

(cherry picked from commit d0b938a7612fb7acf1f181da9577235c83ede59e)

c++: ICE with template code in constexpr [PR104284]

Since r9-6073 cxx_eval_store_expression preevaluates the value to
be stored, and that revealed a crash where a template code (here,
code=IMPLICIT_CONV_EXPR) leaks into cxx_eval*.

It happens because we're performing build_vec_init while processing
a template, which calls get_temp_regvar which creates an INIT_EXPR.
This INIT_EXPR's RHS contains an rvalue conversion so we create an
IMPLICIT_CONV_EXPR.  Its operand is not type-dependent and the whole
INIT_EXPR is not type-dependent.  So we call build_non_dependent_expr
which, with -fchecking=2, calls fold_non_dependent_expr.  At this
point the expression still has an IMPLICIT_CONV_EXPR, which ought to
be handled in instantiate_non_dependent_expr_internal.  However,
tsubst_copy_and_build doesn't handle INIT_EXPR; it will just call
tsubst_copy which does nothing when args is null.  So we fail to
replace the IMPLICIT_CONV_EXPR and ICE.

The problem is that we call build_vec_init in a template in the
first place.  We can avoid doing so by checking p_t_d before
calling build_aggr_init in check_initializer.

PR c++/104284

gcc/cp/ChangeLog:

* decl.c (check_initializer): Don't call build_aggr_init in
a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-104284-1.C: New test.
* g++.dg/cpp1y/constexpr-104284-2.C: New test.
* g++.dg/cpp1y/constexpr-104284-3.C: New test.
* g++.dg/cpp1y/constexpr-104284-4.C: New test.

(cherry picked from commit 9fdac7e16c940fb6264e6ddaf99c761f1a64a054)

c++: Wrong error with alias template in class tmpl [PR104108]

In r10-6329 I tried to optimize the number of calls to v_d_e_p in
convert_nontype_argument by remembering whether the expression was
value-dependent in a bool flag.  I did that wrongly assuming that its
value-dependence will not be changed by build_converted_constant_expr.
This testcase shows that it can: b_c_c_e gets a VAR_DECL for m_parameter,
which is not value-dependent, but we're converting it to "const int &"
so it returns

  (const int &)(const int *) &m_parameter

which suddenly becomes value-dependent because of the added ADDR_EXPR:
has_value_dependent_address is now true because m_parameter's context S<T>
is dependent.  With this bug in place, we went to the second branch here:

      if (TYPE_REF_OBJ_P (TREE_TYPE (expr)) && val_dep_p)
        /* OK, dependent reference.  We don't want to ask whether a DECL is
           itself value-dependent, since what we want here is its address.  */;
      else
        {
          expr = build_address (expr);

          if (invalid_tparm_referent_p (type, expr, complain))
            return NULL_TREE;
        }

wherein build_address created a bad tree and then i_t_r_p complained.

PR c++/104108

gcc/cp/ChangeLog:

* pt.c (convert_nontype_argument): Recompute
value_dependent_expression_p after build_converted_constant_expr.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-74.C: New test.

(cherry picked from commit d54ce4641ed106666208be36fd514cae8ff1153c)

c++: FIX_TRUNC_EXPR in tsubst [PR102990]

This is a crash where a FIX_TRUNC_EXPR gets into tsubst_copy_and_build
where it hits gcc_unreachable ().

The history of tsubst_copy_and_build/FIX_TRUNC_EXPR is such that it
was introduced in r181478, but it did the wrong thing, whereupon it
was turned into gcc_unreachable () in r258821 (see this thread:
<https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495853.html>).

In a template, we should never create a FIX_TRUNC_EXPR (that's what
conv_unsafe_in_template_p is for). But in this test we are NOT in
a template when we call digest_nsdmi_init which ends up calling
convert_like, converting 1.0e+0 to int, so convert_to_integer_1
gives us a FIX_TRUNC_EXPR.

But then when we get to parsing f's parameters, we are in a template
when processing decltype(Helpers{}), and since r268321, when the
compound literal isn't instantiation-dependent and the type isn't
type-dependent, finish_compound_literal falls back to the normal
processing, so it calls digest_init, which does fold_non_dependent_init
and since the FIX_TRUNC_EXPR isn't dependent, we instantiate and
therefore crash in tsubst_copy_and_build.

The fateful call to fold_non_dependent_init comes from massage_init_elt,
We shouldn't be calling f_n_d_i on the result of get_nsdmi. This we can
avoid by eschewing calling f_n_d_i on CONSTRUCTORs; their elements have
already been folded.

PR c++/102990

gcc/cp/ChangeLog:

* typeck2.c (massage_init_elt): Avoid folding CONSTRUCTORs.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-template22.C: New test.
* g++.dg/cpp0x/nsdmi-template23.C: New test.

(cherry picked from commit f0530882d99abc410bb080051aa04e5cea848f18)

c++: constexpr array reference and value-initialization [PR101371]

This PR gave me a hard time: I saw multiple issues starting with
different revisions.  But ultimately the root cause seems to be
the following, and the attached patch fixes all issues I've found
here.

In cxx_eval_array_reference we create a new constexpr context for the
CP_AGGREGATE_TYPE_P case, but we also have to create it for the
non-aggregate case.  In this test, we are evaluating

  ((B *)this)->a = rhs->a

which means that we set ctx.object to ((B *)this)->a.  Then we proceed
to evaluate the initializer, rhs->a.  For *rhs, we eval rhs, a PARM_DECL,
for which we have (const B &) &c.arr[0] in the hash table.  Then
cxx_fold_indirect_ref gives us c.arr[0].  c is evaluated to {.arr={}} so
c.arr is {}.  Now we want c.arr[0], so we end up in cxx_eval_array_reference
and since we're initializing from {}, we call build_value_init which
gives us an AGGR_INIT_EXPR that calls 'constexpr B::B()'.  Then we
evaluate this AGGR_INIT_EXPR and since its first argument is dummy,
we take ctx.object instead.  But that is the wrong object, we're not
initializing ((B *)this)->a here.  And so we wound up with an
initializer for A, and then crash in cxx_eval_component_reference:

  gcc_assert (DECL_CONTEXT (part) == TYPE_MAIN_VARIANT (TREE_TYPE (whole)));

where DECL_CONTEXT (part) is B (as it should be) but the type of whole
was A.

So create a new object, if there already was one, and the element type
is not a scalar.

PR c++/101371

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_array_reference): Create a new .object
and .ctor for the non-aggregate non-scalar case too when
value-initializing.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-101371-2.C: New test.
* g++.dg/cpp1y/constexpr-101371.C: New test.

(cherry picked from commit a42f8120442cf3ba25d621bed857b5be19019d0c)

Daily bump.

c++: TTP in member alias template [PR104107]

In the first testcase, coerce_template_template_parms was adding too much of
outer_args when coercing to match P's template parameters, so that when
substituting into the 'const T&' parameter we got an unrelated template
argument for T. We should only add outer_args when the argument template is
a nested template.

PR c++/104107
PR c++/95036

gcc/cp/ChangeLog:

* pt.c (coerce_template_template_parms): Take full parms.
Avoid adding too much of outer_args.
(coerce_template_template_parm): Adjust.
(template_template_parm_bindings_ok_p): Adjust.
(convert_template_argument): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-ttp2.C: New test.
* g++.dg/cpp1z/ttp2.C: New test.

c++: ICE with alias in pack expansion [PR103769]

This was breaking because when we stripped the 't' typedef in s<t<Args>...>
to be s<Args...>, the TYPE_MAIN_VARIANT of "Args..." was still
"t<Args>...", because type pack expansions are treated as types. Fixed by
using the right function to copy a "type".

PR c++/99445
PR c++/103769

gcc/cp/ChangeLog:

* tree.c (strip_typedefs): Use build_distinct_type_copy.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-alias5.C: New test.

c++: mangling union{1} in template [PR104847]

My implementation of union non-type template arguments in r11-2016 broke
braced casts of union type, because they are still in syntactic (undigested)
form.

PR c++/104847

gcc/cp/ChangeLog:

* mangle.c (write_expression): Don't write a union designator when
undigested.

gcc/testsuite/ChangeLog:

* g++.dg/abi/mangle-union1.C: New test.

c++: missing aggregate base ctor [PR102045]

When make_base_init_ok changes a call to a complete constructor into a call
to a base constructor, we were never marking the base ctor as used, so it
didn't get emitted.

PR c++/102045

gcc/cp/ChangeLog:

* call.c (make_base_init_ok): Call make_used.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/aggr-base12.C: New test.

c++: member alias declaration [PR103968]

Here, we were wrongly thinking that (const Options&)Widget<T>::c_options is
not value-dependent because neither the type nor the (value of) c_options
are dependent, but since we're binding it to a reference we also need to
consider that it has a value-dependent address.

PR c++/103968

gcc/cp/ChangeLog:

* pt.c (value_dependent_expression_p): Check
has_value_dependent_address for conversion to reference.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-mem1.C: New test.

c++: CTAD and member alias template [PR102123]

When building a deduction guide from the Test constructor, we need to
rewrite the use of _dummy into a dependent reference, i.e. Test<T>::template
_dummy. We were using SCOPE_REF for both type and non-type templates; we
need to use UNBOUND_CLASS_TEMPLATE for type templates.

PR c++/102123

gcc/cp/ChangeLog:

* pt.c (tsubst_copy): Use make_unbound_class_template for rewriting
a type template reference.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction110.C: New test.

c++: visibility of local extern [PR103291]

When setting up the hidden namespace-scope decl for a local extern, we also
need to set its visibility.

PR c++/103291

gcc/cp/ChangeLog:

* name-lookup.c (push_local_extern_decl_alias): Call
determine_visibility.

gcc/testsuite/ChangeLog:

* g++.dg/ext/visibility/visibility-local-extern1.C: New test.

x86: Use Yw constraint on *ssse3_pshufbv8qi3

Since AVX512VL and AVX512BW are required for AVX512 VPSHUFB, replace the
"Yv" register constraint with the "Yw" register constraint.

gcc/

PR target/105068
* config/i386/sse.md (*ssse3_pshufbv8qi3): Replace "Yv" with
"Yw".

(cherry picked from commit 08e69332881f8d28ce8b559ffba1900ae5c0d5ee)

[PR/target 102957] Allow Z*-ext extension with only 2 char.

We was assume the Z* extension should be more than 2 char, so we put an
assertion there, but it should just an error or warning rather than an
assertion, however RISC-V has add `Zk` extension, which just 2 char, so
actually, we should just allow that.

gcc/ChangeLog

PR target/102957
* common/config/riscv/riscv-common.c (multi_letter_subset_rank): Remove
assertion for Z*-ext.

gcc/testsuite/ChangeLog

* gcc.target/riscv/pr102957.c: New.

(cherry picked from commit abe562bb01479ea2c8952ad98714f3225527aa7e)

i386: Fix up _mm_loadu_si{16,32} [PR99754]

These intrinsics are supposed to do an unaligned may_alias load
of a 16-bit or 32-bit value and store it as the first element of
a 128-bit integer vector, with all other elements cleared.

The current _mm_storeu_* implementation implements that correctly, uses
__*_u types to do the store and extracts the first element of a vector into
it.
But _mm_loadu_si{16,32} gets it all wrong.  It performs an aligned
non-may_alias load and because _mm_set_epi{16,32} has the args reversed,
it also inserts it into the last vector element instead of first.

The following patch fixes that.

Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2,
for _mm_loadu_si16 it says strangely SSE.  But the intrinsics
returns __m128i, which is only defined in emmintrin.h, and
_mm_set_epi16 is also only SSE2 and later in emmintrin.h.
Even clang defines it in emmintrin.h and ends up with inlining
failure when calling _mm_loadu_si16 from sse,no-sse2 function.
So, isn't that a bug in the intrinsic guide instead?

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

PR target/99754
* config/i386/emmintrin.h (_mm_loadu_si32): Put loaded value into
first rather than last element of the vector, use __m32_u to do
a really unaligned load, use just 0 instead of (int)0.
(_mm_loadu_si16): Put loaded value into first rather than last
element of the vector, use __m16_u to do a really unaligned load,
use just 0 instead of (short)0.

* gcc.target/i386/pr99754-1.c: New test.
* gcc.target/i386/pr99754-2.c: New test.

Daily bump.

x86: Use -msse2 on gcc.target/i386/pr95483-1.c

Replace -msse with -msse2 since <emmintrin.h> requires SSE2.

PR testsuite/105055
* gcc.target/i386/pr95483-1.c: Replace -msse with -msse2.

(cherry picked from commit bdd7b679e8497c07e25726f6ab6429e4c4d429c7)

x86: Use x constraint on KL patterns

Since KL instructions have no AVX512 version, replace the "v" register
constraint with the "x" register constraint.

PR target/105058
* config/i386/sse.md (loadiwkey): Replace "v" with "x".
(aes<aesklvariant>u8): Likewise.

(cherry picked from commit ede5f5224d55b84b9f186b288164df9c06fd85e7)

x86: Use x constraint on SSSE3 patterns with MMX operands

Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND
have no AVX512 version, replace the "Yv" register constraint with the
"x" register constraint.

PR target/105052
* config/i386/sse.md (ssse3_ph<plusminus_mnemonic>wv4hi3):
Replace "Yv" with "x".
(ssse3_ph<plusminus_mnemonic>dv2si3): Likewise.
(ssse3_psign<mode>3): Likewise.

(cherry picked from commit 99591cf43fc1da0fb72b3da02ba937ba30bd2bf2)

Daily bump.

Properly reset the port handle when closing

When the serial port is closed, we need to ensure that the port handle is
properly reset for it to be detected as closed.

gcc/ada/
PR ada/104767
* libgnat/g-sercom__mingw.adb (Close): Reset port handle to -1.
* libgnat/g-sercom__linux.adb (Close): Likewise.

Daily bump.

tree-optimization/101636 - CTOR vectorization ICE

The following fixes an ICE when vectorizing the defs of a CTOR
results in a different vector type than expected. That can happen
with AARCH64 SVE and a fixed vector length as noted in r10-5979
and on x86 with AVX512 mask CTORs and trying to re-vectorize
using SSE as shown in this bug.

The fix is simply to reject the vectorization when it didn't
produce the desired type.

2022-02-23 Richard Biener <rguenther@suse.de>

PR tree-optimization/101636
PR tree-optimization/104782
* tree-vect-slp.c (vect_slp_analyze_operations): Make sure
the CTOR is vectorized with an expected type.

* c-c++-common/torture/pr101636.c: Likewise.
* gcc.dg/vect/pr104782.c: New testcase.

tree-optimization/104931 - mitigate niter analysis issue

The following backports a pointer associating pattern from trunk
that mitigates an issue with number_of_iterations_lt_to_ne in
which context we fail to fold a comparison but succeed in folding
a related subtraction.  In the failure mode this results in
a loop wrongly assumed looping with a bogus number of iterations,
resulting in crashing of the premake application on start.

With the backported simplification we are able to fold the
comparison and correctly compute the loop as not iterating.

I have failed to create a standalone testcase.  I belive part
of the issue is still latent but I have failed to nail down
the issue exactly.  Still I believe the backporting of the
mitigation patch is the most sensible and safe thing at this
point.

2022-03-16  Richard Biener  <rguenther@suse.de>

PR tree-optimization/104931
* match.pd ((ptr) (x p+ y) p+ z -> (ptr) (x p+ (y + z))): New GENERIC
simplification.

Daily bump.

x86: Disable SSE in ISA2 for -mgeneral-regs-only

Replace OPTION_MASK_ISA2_AVX512F_UNSET with OPTION_MASK_ISA2_SSE_UNSET
in OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET to disable SSE, AVX and
AVX512 ISAs.

gcc/

PR target/105000
* common/config/i386/i386-common.c
(OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET): Replace
OPTION_MASK_ISA2_AVX512F_UNSET with OPTION_MASK_ISA2_SSE_UNSET.

gcc/testsuite/

PR target/105000
* gcc.target/i386/pr105000-1.c: New test.
* gcc.target/i386/pr105000-2.c: Likewise.
* gcc.target/i386/pr105000-3.c: Likewise.

(cherry picked from commit e8b6afa98f0a390c955a089a3d61fdd24f4e1d3a)

x86: Also check _SOFT_FLOAT in <x86gprintrin.h>

Push target("general-regs-only") in <x86gprintrin.h> if x87 is enabled.

gcc/

PR target/104890
* config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before
pushing target("general-regs-only").

gcc/testsuite/

PR target/104890
* gcc.target/i386/pr104890.c: New test.

(cherry picked from commit 3117ffce4c1598a35e724138735b099262353358)

c++: lambda in template default argument [PR103186]

The problem with this testcase was that since my patch for PR97900 we
weren't preserving DECL_UID identity for parameters of instantiations of
templated functions, so using those parameters as the keys for the
defarg_inst map broke. I think this was always fragile given the
possibility of redeclarations, so instead of reverting that change let's
switch to keying off the function.

Memory use compiling stdc++.h is not noticeably different.

PR c++/103186

gcc/cp/ChangeLog:

* pt.c (defarg_inst): Use tree_vec_map_cache_hasher.
(defarg_insts_for): New.
(tsubst_default_argument): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-defarg10.C: New test.

tree: move tree_vec_map_cache_hasher into header

gcc/ChangeLog:

* tree.h (struct tree_vec_map_cache_hasher): Move from...
* tree.c (struct tree_vec_map_cache_hasher): ...here.

c++: alias template and typename [PR103057]

Usually we handle DR1558 substitution near the top of tsubst, but in this
case while substituting TYPENAME_TYPE we were passing an alias
specialization to tsubst_aggr_type, which ignored its aliasness.

PR c++/103057

gcc/cp/ChangeLog:

* pt.c (tsubst_aggr_type): Call tsubst for alias template
specialization.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-void1.C: New test.

c++: assignment to temporary [PR59950]

Given build_this of a TARGET_EXPR, cp_build_fold_indirect_ref returns the
TARGET_EXPR. But that's the wrong value category for the result of the
defaulted class assignment operator, which returns an lvalue, so we need to
actually build the INDIRECT_REF.

PR c++/59950

gcc/cp/ChangeLog:

* call.c (build_over_call): Use cp_build_indirect_ref.

gcc/testsuite/ChangeLog:

* g++.dg/init/assign2.C: New test.

c++: fix tree_contains_struct for C++ types [PR101095]

Many of the types from cp-tree.def were only marked as having tree_common,
when actually most of them have type_non_common.  This broke
g++.dg/modules/xtreme-header-2, as the modules code relies on
tree_contains_struct to know what bits it needs to stream.

We don't seem to use type_non_common for TYPE_ARGUMENT_PACK, so I bumped it
down to TS_TYPE_COMMON.  I tried doing the same in cp_tree_size, but that
breaks without more extensive changes to tree_node_structure.

Why do we need the init_ts function anyway?  It seems redundant with
tree_node_structure.

PR c++/101095

gcc/cp/ChangeLog:

* cp-objcp-common.c (cp_common_init_ts): Mark types as types.
(cp_tree_size): Remove redundant entries.

c++: initialized array of vla [PR58646]

We went into build_vec_init because we're dealing with a VLA, but then
build_vec_init thought it was safe to just build an INIT_EXPR because the
outer dimension is constant. Nope.

PR c++/58646

gcc/cp/ChangeLog:

* init.c (build_vec_init): Check for vla element type.

gcc/testsuite/ChangeLog:

* g++.dg/ext/vla24.C: New test.

c++: designated init and aggregate members [PR103337]

Our C++20 designated initializer handling was broken with members of class
type; we would find the relevant member and then try to find a member of
the member with the same name. Or we would sometimes ignore the designator
entirely. The former problem is fixed by the change to reshape_init_class,
the latter by the change to reshape_init_r.

PR c++/103337
PR c++/102740
PR c++/103299
PR c++/102538

gcc/cp/ChangeLog:

* decl.c (reshape_init_class): Avoid looking for designator
after we found it.
(reshape_init_r): Keep looking for designator.

gcc/testsuite/ChangeLog:

* g++.dg/ext/flexary3.C: Remove one error.
* g++.dg/parse/pr43765.C: Likewise.
* g++.dg/cpp2a/desig22.C: New test.
* g++.dg/cpp2a/desig23.C: New test.
* g++.dg/cpp2a/desig24.C: New test.
* g++.dg/cpp2a/desig25.C: New test.

c++: designator and anon struct [PR101767]

We found .x in the anonymous struct, but then didn't find .y there; we
should decide that means we're done with the struct rather than that the
code is wrong.

PR c++/101767

gcc/cp/ChangeLog:

* decl.c (reshape_init_class): Back out of anon struct
if a designator doesn't match.

gcc/testsuite/ChangeLog:

* g++.dg/ext/anon-struct10.C: New test.

Daily bump.

x86: Properly check FEATURE_AESKLE

1. Pass 0x19 to __cpuid for bit_AESKLE.
2. Enable FEATURE_AESKLE only if bit_AESKLE is set.

PR target/104998
* common/config/i386/cpuinfo.h (get_available_features): Pass
0x19 to __cpuid for bit_AESKLE. Enable FEATURE_AESKLE only if
bit_AESKLE is set.

(cherry picked from commit d0363a80690a299179d010c1d8a6d6430e7eb73f)

d: Fix internal compiler error: in build_complex, at tree.c:2358

The conversion from the special _Complex enum to native complex used
build_complex, however the input value isn't necessarily a literal.

PR d/105004

gcc/d/ChangeLog:

* d-codegen.cc (build_struct_literal): Use complex_expr to build
complex expressions from __c_complex types.

gcc/testsuite/ChangeLog:

* gdc.dg/pr105004.d: New test.

(cherry picked from commit 1dd51373a82408361068e130a84caa888ef0d2b3)

Daily bump.

Fortran: Fix gfc_maybe_dereference_var [PR104430][PR99585]

PR fortran/99585
PR fortran/104430

gcc/fortran/ChangeLog:

* trans-expr.c (conv_parent_component_references): Fix comment;
simplify comparison.
(gfc_maybe_dereference_var): Avoid d referencing a nonpointer.

gcc/testsuite/ChangeLog:

* gfortran.dg/class_result_10.f90: New test.

(cherry picked from commit c0134b7383992aab5c1a91440dbdd8fbb747169c)

Daily bump.

rs6000: Fix invalid address passed to __builtin_mma_disassemble_acc [PR104923]

The mma_disassemble_output_operand predicate is too lenient on the types
of addresses it will accept, leading to combine creating invalid address
that eventually lead to ICEs in LRA. The solution is to restrict the
addresses to indirect, indexed or those valid for quad memory accesses.

2022-03-15 Peter Bergner <bergner@linux.ibm.com>

gcc/
PR target/104923
* config/rs6000/predicates.md (mma_disassemble_output_operand): Restrict
acceptable MEM addresses.

gcc/testsuite/
PR target/104923
* gcc.target/powerpc/pr104923.c: New test.

(cherry picked from commit b5baf569f77e1f172061642d4d8593e1ea737add)

rs6000: Allow -mlong-double-64 after -mabi={ibm,ieee}longdouble [PR104208, PR87496]

The glibc build is showing a build error due to extra "error" checking from my
PR87496 fix.  That checking was overeager, disallowing setting the long double
size to 64-bits if the 128-bit long double ABI had already been specified.
Now we only emit an error if we specify a 128-bit long double ABI if our
long double size is not 128 bits.  This also fixes an erroneous error when
-mabi=ieeelongdouble is used and ISA 2.06 is not enabled, but the long double
size has been changed to 64 bits.

2022-03-04  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/87496
PR target/104208
* config/rs6000/rs6000.c (rs6000_option_override_internal): Make the
ISA 2.06 requirement for -mabi=ieeelongdouble conditional on
-mlong-double-128.
Move the -mabi=ieeelongdouble and -mabi=ibmlongdouble error checking
from here...
* common/config/rs6000/rs6000-common.c (rs6000_handle_option):
... to here.

gcc/testsuite/
PR target/87496
PR target/104208
* gcc.target/powerpc/pr104208-1.c: New test.
* gcc.target/powerpc/pr104208-2.c: Likewise.
* gcc.target/powerpc/pr87496-2.c: Swap long double options to trigger
the expected error.
* gcc.target/powerpc/pr87496-3.c: Likewise.

(cherry picked from commit cb16bc3b5f34733ef9bbf8d2e3acacdecb099a62)

x86: Correct march=sapphirerapids to base on icelake server

march=sapphirerapids should be based on icelake server not cooperlake.

gcc/ChangeLog:

PR target/104963
* config/i386/i386.h (PTA_SAPPHIRERAPIDS): change it to base on ICX.
* doc/invoke.texi: Update documents for Intel sapphirerapids.

gcc/testsuite/ChangeLog

PR target/104963
* gcc.target/i386/pr104963.c: New test case.

Daily bump.

Backport PR fortran/96983 patch to GCC 11.

I applied a patch on the trunk in April 22nd, 2021 that fixes an issue (PR
fortran/66983) where we could fail for 128-bit floating point types
because we don't have a built-in function that is equivalent to llround
for 128-bit integer types.  Instead, the patch does a round operation
followed by conversion to the appropriate integer size if we don't have
the specialized round to specific integer type built-in functions.

When I checked in the change, I was told to wait until after GCC 11.1
shipped before doing the backport.  I forgot about the change until
recently.  Before checking it in, I did bootstraps and ran regression
tests on:

    1) little endian power9 using 128-bit IBM long double
    2) little endian power9 using 128-bit IEEE long double
    3) big endian power8 (both 64-bit and 32-bit) (and)
    4) x86_64 native build

There were no new regressions.  The test gfortran.dg/pr96711.f90 now
passes on PowerPC.

In the 10 months or so since the change was made on the trunk, the code in
build_round_expr did not change.

2022-03-16   Michael Meissner  <meissner@linux.ibm.com>

gcc/fortran/
PR fortran/96983
* trans-intrinsic.c (build_round_expr): If int type is larger than
long long, do the round and convert to the integer type.  Do not
try to find a floating point type the exact size of the integer
type.  Backport patch from 2021-04-22 on the trunk.

Daily bump.

middle-end/100775 - updating the reg use in exit block for -fzero-call-used-regs

GCC11 backport.

In the pass_zero_call_used_regs, when updating dataflow info after adding
the register zeroing sequence in the epilogue of the function, we should
call "df_update_exit_block_uses" to update the register use information in
the exit block to include all the registers that have been zeroed.

gcc/ChangeLog:

PR middle-end/100775
* function.c (gen_call_used_regs_seq): Call
df_update_exit_block_uses when updating df.

gcc/testsuite/ChangeLog:

PR middle-end/100775
* gcc.target/arm/pr100775.c: New test.

ada/104861 - use target_noncanonial for Target_Name

The following arranges for s-oscons.ads to record target_noncanonical
for Target_Name, matching the install directory layout and what
gcc -dumpmachine says. This fixes build issues with gprbuild.

2022-03-10 Richard Biener <rguenther@suse.de>

PR ada/104861
gcc/ada/
* gcc-interface/Makefile.in (target_noncanonical): Substitute.
(OSCONS_CPP): Pass target_noncanonical as TARGET.

(cherry picked from commit 9467e7331188705ec16c086b77e1809c5b0aab7d)

middle-end/104786 - ICE with asm and VLA

The following fixes an ICE observed with a MEM_REF allows_mem asm
operand referencing a VLA. The following makes sure to not attempt
to go the temporary creation way when we cannot.

2022-03-09 Richard Biener <rguenther@suse.de>

PR middle-end/104786
* cfgexpand.c (expand_asm_stmt): Do not generate a copy
for VLAs without an upper size bound.

* gcc.dg/pr104786.c: New testcase.

(cherry picked from commit ba3ff5e35144e2afff4ccef4ccbbbbaba9870afb)

tree-optimization/104511 - avoid FP to DFP conversion for VEC_PACK_TRUNC

This avoids forwprop from matching DFP <-> FP vector conversions
using VEC_[UN]PACK{_TRUNC,_LO,_HI}.  Maybe DFP vectors shouldn't be
a thing, but they appearantly are.  Re-using CONVERT/NOP_EXPR for
DFP <-> FP conversions was probably a mistake.

2022-02-14  Richard Biener  <rguenther@suse.de>

PR tree-optimization/104511
* tree-ssa-forwprop.c (simplify_vector_constructor): Avoid
touching DFP <-> FP conversions.

* gcc.dg/pr104511.c: New testcase.

(cherry picked from commit f320197c8b495324dc6997a99d53e7f45ecf5840)

target/104453 - guard call folding with NULL LHS

This guards shift builtin folding to do nothing when there is
no LHS, similar to what other foldings do.

2022-02-09 Richard Biener <rguenther@suse.de>

PR target/104453
* config/i386/i386.c (ix86_gimple_fold_builtin): Guard shift
folding for NULL LHS.

* gcc.target/i386/pr104453.c: New testcase.

(cherry picked from commit 1c827873ed283df282f2df11dfe0ff607e07dab3)