Add '-Wopenacc-parallelism'

... to diagnose potentially suboptimal choices regarding OpenACC parallelism.

Not enabled by default: too noisy ("*potentially* suboptimal choices"); see
XFAILed 'dg-bogus'es.
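
As a rough illustration of the kind of code the option is aimed at, the
sketch below asks for several workers on a region that contains no
worker-partitioned loop. The function name and the num_workers value are
made up for this example, and whether a given GCC version diagnoses this
exact form is an assumption; diag-parallelism-1.c holds the authoritative
cases. Without -fopenacc the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>

// Hypothetical example: sums 1..n inside an OpenACC parallel region.
// The region requests 4 workers, but the loop carries no 'loop worker'
// directive, so the extra workers have nothing to partition. This is
// the sort of mismatch the new option is assumed to point out.
int sum_first(int n)
{
  int sum = 0;
#pragma acc parallel num_workers(4) copy(sum)
  {
    for (int i = 1; i <= n; ++i)
      sum += i;
  }
  return sum;
}
```

Dropping the num_workers clause, or adding a matching loop directive,
would be the usual ways to resolve such a mismatch.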

gcc/c-family/
* c.opt (Wopenacc-parallelism): New.
gcc/fortran/
* lang.opt (Wopenacc-parallelism): New.
gcc/
* omp-offload.c (oacc_validate_dims): Implement
'-Wopenacc-parallelism'.
* doc/invoke.texi (-Wopenacc-parallelism): Document.
gcc/testsuite/
* c-c++-common/goacc/diag-parallelism-1.c: New.
* c-c++-common/goacc/acc-icf.c: Specify '-Wopenacc-parallelism',
and match diagnostics, as appropriate.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* c-c++-common/goacc/classify-serial.c: Likewise.
* c-c++-common/goacc/kernels-decompose-1.c: Likewise.
* c-c++-common/goacc/kernels-decompose-2.c: Likewise.
* c-c++-common/goacc/parallel-dims-1.c: Likewise.
* c-c++-common/goacc/parallel-reduction.c: Likewise.
* c-c++-common/goacc/pr70688.c: Likewise.
* c-c++-common/goacc/routine-1.c: Likewise.
* c-c++-common/goacc/routine-level-of-parallelism-2.c: Likewise.
* c-c++-common/goacc/uninit-dim-clause.c: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-routine.f95: Likewise.
* gfortran.dg/goacc/classify-serial.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-1.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-2.f95: Likewise.
* gfortran.dg/goacc/parallel-tree.f95: Likewise.
* gfortran.dg/goacc/routine-4.f90: Likewise.
* gfortran.dg/goacc/routine-level-of-parallelism-1.f90: Likewise.
* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
* gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise.
* gfortran.dg/goacc/uninit-dim-clause.f95: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Specify
'-Wopenacc-parallelism', and match diagnostics, as appropriate.
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/mode-transitions.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/par-reduction-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/par-reduction-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/private-variables.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/static-variable-1.c:
Likewise.
* testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise.
* testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr84028.f90: Likewise.
* testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise.
* testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.

Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com>
Co-Authored-By: Tom de Vries <vries@codesourcery.com>
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>

[OpenACC] Don't compile libgomp testcases with '-w'

We'd like to actually catch compiler diagnostics (and currently there aren't
any).

libgomp/
* testsuite/libgomp.oacc-c-c++-common/par-reduction-1.c: Don't
compile with '-w'.
* testsuite/libgomp.oacc-c-c++-common/par-reduction-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-6.c: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise.

Move gimplify_buildN API local to only remaining user

This moves the legacy gimplify_buildN API to tree-vect-generic.c,
its only user, and elides the gimplification step, making it a wrapper
around gimple_build, adjusting tree_vec_extract for this.

I've noticed that vector CTOR expansion doesn't deal with unfolded
{} and thus this makes it more resilient. I've also adjusted the
match.pd vector CTOR extraction code to make sure it doesn't
produce a CTOR when folding would make it a vector constant.

2021-04-15 Richard Biener <rguenther@suse.de>

* tree-cfg.h (gimplify_build1): Remove.
(gimplify_build2): Likewise.
(gimplify_build3): Likewise.
* tree-cfg.c (gimplify_build1): Move to tree-vect-generic.c.
(gimplify_build2): Likewise.
(gimplify_build3): Likewise.
* tree-vect-generic.c (gimplify_build1): Move from tree-cfg.c.
Modernize.
(gimplify_build2): Likewise.
(gimplify_build3): Likewise.
(tree_vec_extract): Use resimplify with following SSA edges.
(expand_vector_parallel): Avoid passing NULL size/bitpos
to tree_vec_extract.
* expr.c (store_constructor): Deal with zero-element CTORs.
* match.pd (bit_field_ref <vector CTOR>): Make sure to
produce vector constants when possible.

Remove gimplify_buildN API use from complex lowering

This removes the legacy gimplify_buildN API use from complex lowering.

2021-04-15 Richard Biener <rguenther@suse.de>

* tree-complex.c: Include gimple-fold.h.
(expand_complex_addition): Use gimple_build.
(expand_complex_multiplication_components): Likewise.
(expand_complex_multiplication): Likewise.
(expand_complex_div_straight): Likewise.
(expand_complex_div_wide): Likewise.
(expand_complex_division): Likewise.
(expand_complex_conjugate): Likewise.
(expand_complex_comparison): Likewise.

Remove gimplify_buildN API use from phiopt

This removes use of the legacy gimplify_buildN API from phiopt.

2021-04-15 Richard Biener <rguenther@suse.de>

* tree-ssa-phiopt.c (two_value_replacement): Remove use
of legacy gimplify_buildN API.

tree-optimization/99473 - more cselim

This fixes the pre-condition on cselim to include all references
and decls when they end up as auto-var.

Bootstrapped/tested on x86_64-linux

2021-03-09 Richard Biener <rguenther@suse.de>

PR tree-optimization/99473
* tree-ssa-phiopt.c (cond_store_replacement): Handle all
stores.

* gcc.dg/tree-ssa/pr99473-1.c: New testcase.

Simplify {gimplify_and_,}update_call_from_tree API

This removes update_call_from_tree in favor of
gimplify_and_update_call_from_tree, removing some code duplication
and simplifying the API use. Some users of update_call_from_tree
have been transitioned to replace_call_with_value and the API
and its dependences have been moved to gimple-fold.h.

This shaves off another user of valid_gimple_rhs_p which is now
only used from within gimple-fold.c and thus moved and made private.

2021-04-14 Richard Biener <rguenther@suse.de>

* tree-ssa-propagate.h (valid_gimple_rhs_p): Remove.
(update_gimple_call): Likewise.
(update_call_from_tree): Likewise.
* tree-ssa-propagate.c (valid_gimple_rhs_p): Remove.
(valid_gimple_call_p): Likewise.
(move_ssa_defining_stmt_for_defs): Likewise.
(finish_update_gimple_call): Likewise.
(update_gimple_call): Likewise.
(update_call_from_tree): Likewise.
(propagate_tree_value_into_stmt): Use replace_call_with_value.
* gimple-fold.h (update_gimple_call): Declare.
* gimple-fold.c (valid_gimple_rhs_p): Move here from
tree-ssa-propagate.c.
(update_gimple_call): Likewise.
(valid_gimple_call_p): Likewise.
(finish_update_gimple_call): Likewise, and simplify.
(gimplify_and_update_call_from_tree): Implement
update_call_from_tree functionality, avoid excessive
push/pop_gimplify_context.
(gimple_fold_builtin): Use only gimplify_and_update_call_from_tree.
(gimple_fold_call): Likewise.
* gimple-ssa-sprintf.c (try_substitute_return_value): Likewise.
* tree-ssa-ccp.c (ccp_folder::fold_stmt): Likewise.
(pass_fold_builtins::execute): Likewise.
(optimize_stack_restore): Use replace_call_with_value.
* tree-cfg.c (fold_loop_internal_call): Likewise.
* tree-ssa-dce.c (maybe_optimize_arith_overflow): Use
only gimplify_and_update_call_from_tree.
* tree-ssa-strlen.c (handle_builtin_strlen): Likewise.
(handle_builtin_strchr): Likewise.
* tsan.c: Include gimple-fold.h instead of tree-ssa-propagate.h.

* config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin):
Use replace_call_with_value.

vmsdbgout: Remove useless register keywords

register keyword was removed in C++17, and in vmsdbgout.c it served no
useful purpose.
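
For reference, a minimal sketch of the kind of change involved; the
function below is made up for illustration and is not taken from
vmsdbgout.c.

```cpp
#include <cassert>

// 'register int total = 0;' would be ill-formed in C++17 and later.
// A plain local variable leaves register allocation to the compiler,
// which is what happens in practice anyway.
int sum_to(int n)
{
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += i;
  return total;
}
```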

2021-04-26 Jakub Jelinek <jakub@redhat.com>

PR debug/100255
* vmsdbgout.c (ASM_OUTPUT_DEBUG_STRING, vmsdbgout_begin_block,
vmsdbgout_end_block, lookup_filename, vmsdbgout_source_line): Remove
register keywords.

Daily bump.

Add folding and remove expanders for x86 *pcmp{et,gt}* builtins [PR target/98911]

gcc/ChangeLog:

PR target/98911
* config/i386/i386-builtin.def (BDESC): Change the icode of
the following builtins to CODE_FOR_nothing.
* config/i386/i386.c (ix86_gimple_fold_builtin): Fold
IX86_BUILTIN_PCMPEQB128, IX86_BUILTIN_PCMPEQW128,
IX86_BUILTIN_PCMPEQD128, IX86_BUILTIN_PCMPEQQ,
IX86_BUILTIN_PCMPEQB256, IX86_BUILTIN_PCMPEQW256,
IX86_BUILTIN_PCMPEQD256, IX86_BUILTIN_PCMPEQQ256,
IX86_BUILTIN_PCMPGTB128, IX86_BUILTIN_PCMPGTW128,
IX86_BUILTIN_PCMPGTD128, IX86_BUILTIN_PCMPGTQ,
IX86_BUILTIN_PCMPGTB256, IX86_BUILTIN_PCMPGTW256,
IX86_BUILTIN_PCMPGTD256, IX86_BUILTIN_PCMPGTQ256.
* config/i386/sse.md (avx2_eq<mode>3): Deleted.
(sse2_eq<mode>3): Ditto.
(sse4_1_eqv2di3): Ditto.
(sse2_gt<mode>3): Rename to ..
(*sse2_gt<mode>3): .. this.

gcc/testsuite/ChangeLog:

PR target/98911
* gcc.target/i386/pr98911.c: New test.
* gcc.target/i386/funcspec-8.c: Replace __builtin_ia32_pcmpgtq
with __builtin_ia32_pcmpistrm128 since it has been folded.

Daily bump.

analyzer: fix ICE on NULL change.m_expr [PR100244]

PR analyzer/100244 reports an ICE on a -Wanalyzer-free-of-non-heap
due to a case where free_of_non_heap::describe_state_change can be
passed a NULL change.m_expr for a suitably complicated symbolic value.

Bulletproof it by checking for change.m_expr being NULL before
dereferencing it.
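
The shape of the fix can be sketched as follows. The types here are
simplified, hypothetical stand-ins for the analyzer's classes, and the
message wording is illustrative only; just the null check mirrors the
change described above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the analyzer's expression and change types.
struct expr { std::string name; };
struct state_change { const expr *m_expr; };

std::string describe_state_change(const state_change &change)
{
  // Guard first: m_expr may be null for complicated symbolic values.
  if (!change.m_expr)
    return "pointer is not on the heap";
  return "'" + change.m_expr->name + "' is not on the heap";
}
```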

gcc/analyzer/ChangeLog:
PR analyzer/100244
* sm-malloc.cc (free_of_non_heap::describe_state_change):
Bulletproof against change.m_expr being NULL.

gcc/testsuite/ChangeLog:
PR analyzer/100244
* g++.dg/analyzer/pr100244.C: New test.

PR fortran/100154 - ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131

Add appropriate static checks for the character and status arguments to
the GNU Fortran intrinsic extensions fget[c], fput[c]. Extend variable
check to allow a function reference having a data pointer result.

gcc/fortran/ChangeLog:

PR fortran/100154
* check.c (variable_check): Allow function reference having a data
pointer result.
(arg_strlen_is_zero): New function.
(gfc_check_fgetputc_sub): Add static check of character and status
arguments.
(gfc_check_fgetput_sub): Likewise.
* intrinsic.c (add_subroutines): Fix argument name for the
character argument to intrinsic subroutines fget[c], fput[c].

gcc/testsuite/ChangeLog:

PR fortran/100154
* gfortran.dg/pr100154.f90: New test.

Fortran - allow target of pointer from evaluation of function-reference

Fortran allows the target of a pointer from the evaluation of a
function-reference in a variable definition context (e.g. F2018:R902).

gcc/fortran/ChangeLog:

PR fortran/100218
* expr.c (gfc_check_vardef_context): Extend check to allow pointer
from a function reference.

gcc/testsuite/ChangeLog:

PR fortran/100218
* gfortran.dg/ptr-func-4.f90: New test.

Revert "Darwin : Adjust darwin_binds_local_p for PIC code [PR100152]."

Unfortunately, although this is required to fix the PR, and is
notionally correct, it regresses some of the sanitizer and IPA
tests. Reverting until this can be analysed.

This reverts commit b6600392bf71c4a9785f8f49948b611425896830.

testsuite: fix libstdc++ libatomic flags

Some ports require libatomic for atomic operations, at least for some
data types and widths. The libstdc++ testsuite previously was updated
to link against libatomic, but the search path was hard-coded to
something that is not always correct, and the shared library search
path was not set.

The search path was hard-coded to the expected location of the
libatomic build directory relative to the libstdc++ testsuite
directory, but if one uses parallelism when invoking the libstdc++
testsuite, the tests are run in the "normalXX" sub-directories, for
which the hard-coded search path is incorrect. The path also is
incorrect for alternative multilib and tool options.

This patch adopts the logic from gcc/testsuite/lib/atomic-dg.exp to
search for the library and adds the logic to the libstdc++ testsuite
libatomic search path code. Previously the libstdc++ testsuite atomic
tests failed depending on the build configuration and if a build of
libatomic was installed in the default search path.

Bootstrapped on powerpc-ibm-aix7.2.3.0.

libstdc++-v3/ChangeLog:

* testsuite/lib/dg-options.exp (atomic_link_flags): New.
(add_options_for_libatomic): Use atomic_link_flags.

Darwin : Adjust darwin_binds_local_p for PIC code [PR100152].

Darwin's dynamic linker supports interposition and lazy symbol binding.
If we are generating PIC code and a symbol is public, then it could
potentially be indirected via a lazy-resolver stub; we cannot tell at
compile-time if this will be done (since the indirection can be the
result of adding a -flat-namespace option at link-time). Here we are
conservative and assume that any such symbol cannot bind locally.
The default implementation for binds_local_p handles undefined, weak and
common symbols which are always indirected (for mdynamic-no-pic also).

gcc/ChangeLog:

PR target/100152
* config/darwin.c (darwin_binds_local_p): Assume that any
public symbol might be interposed for PIC code. Update function
header comment to reflect current Darwin capability.

Adjust guality xfails for aarch64*-*-*

This patch gives clean guality.exp test results for aarch64-linux-gnu
with modern (top-of-tree) gdb.

For people using older gdbs, it will trade one set of noisy results for
another set.  I still think it's better to have the xfails based on
one “clean” and “modern” run rather than have FAILs and XPASSes for
all runs.

It's hard to tell which of these results are aarch64-specific and
which aren't.  If other target maintainers want to do something similar,
and are prepared to assume the same gdb version, then it should become
clearer over time which ones are target-specific and which aren't.

There are no new skips here, so changes in test results will still
show up as XPASSes.

I've not analysed the failures or filed PRs for them.  In some
ways the guality directory itself seems like the best place to
start looking for xfails, if someone's interested in working
in this area.

gcc/testsuite/
* gcc.dg/guality/example.c: Update aarch64*-*-* xfails.
* gcc.dg/guality/guality.c: Likewise.
* gcc.dg/guality/inline-params.c: Likewise.
* gcc.dg/guality/loop-1.c: Likewise.
* gcc.dg/guality/pr36728-1.c: Likewise.
* gcc.dg/guality/pr36728-2.c: Likewise.
* gcc.dg/guality/pr36728-3.c: Likewise.
* gcc.dg/guality/pr41447-1.c: Likewise.
* gcc.dg/guality/pr54200.c: Likewise.
* gcc.dg/guality/pr54519-1.c: Likewise.
* gcc.dg/guality/pr54519-2.c: Likewise.
* gcc.dg/guality/pr54519-3.c: Likewise.
* gcc.dg/guality/pr54519-4.c: Likewise.
* gcc.dg/guality/pr54519-5.c: Likewise.
* gcc.dg/guality/pr54519-6.c: Likewise.
* gcc.dg/guality/pr54693-2.c: Likewise.
* gcc.dg/guality/pr56154-1.c: Likewise.
* gcc.dg/guality/pr59776.c: Likewise.
* gcc.dg/guality/pr68860-1.c: Likewise.
* gcc.dg/guality/pr68860-2.c: Likewise.
* gcc.dg/guality/pr90074.c: Likewise.
* gcc.dg/guality/pr90716.c: Likewise.
* gcc.dg/guality/sra-1.c: Likewise.

Add dg-final option-based target selectors

This patch adds target selectors of the form:

  { any-opts "opt1" ... "optn" }
  { no-opts "opt1" ... "optn" }

for skipping or xfailing tests based on compiler options.  It only
works for dg-final selectors.

The patch then uses no-opts to exclude -O0 and (sometimes) -Og from
some guality.exp xfails.  AFAICT (based on gcc-testresults) these
tests pass for those options for all targets.

gcc/
* doc/sourcebuild.texi: Document no-opts and any-opts target
selectors.

gcc/testsuite/
* lib/target-supports-dg.exp (selector_expression): Handle any-opts
and no-opts.
* gcc.dg/guality/pr41353-1.c: Exclude -O0 from xfail.
* gcc.dg/guality/pr59776.c: Likewise.
* gcc.dg/guality/pr54970.c: Likewise -O0 and -Og.

c++: do_class_deduction and dependent init [PR93383]

Here we're crashing during CTAD with a dependent initializer (performed
from convert_template_argument) because one of the initializer's
elements has an empty TREE_TYPE, which ends up making resolve_args
unhappy.

Besides the case where we're initializing one template placeholder
from another, which is already specifically handled earlier in
do_class_deduction, it seems we can't in general correctly resolve a
template placeholder using a dependent initializer, so this patch makes
the function just punt until instantiation time instead.

gcc/cp/ChangeLog:

PR c++/89565
PR c++/93383
PR c++/95291
PR c++/99200
PR c++/99683
* pt.c (do_class_deduction): Punt if the initializer is
type-dependent.

gcc/testsuite/ChangeLog:

PR c++/89565
PR c++/93383
PR c++/95291
PR c++/99200
PR c++/99683
* g++.dg/cpp2a/nontype-class39.C: Remove dg-ice directive.
* g++.dg/cpp2a/nontype-class45.C: New test.
* g++.dg/cpp2a/nontype-class46.C: New test.
* g++.dg/cpp2a/nontype-class47.C: New test.
* g++.dg/cpp2a/nontype-class48.C: New test.

c++: Hard error with tentative parse and CTAD [PR87709]

When parsing e.g. the operand of sizeof, where both types and
expressions are accepted, if during the tentative type parse we
encounter an unexpected template placeholder, we must simulate
an error rather than issue a real error because the expression
parse can still succeed.

gcc/cp/ChangeLog:

PR c++/87709
* parser.c (cp_parser_type_id_1): If we see a template
placeholder, first try simulating an error before issuing
a real error.

gcc/testsuite/ChangeLog:

PR c++/87709
* g++.dg/cpp1z/class-deduction86.C: New test.

Daily bump.

Fix logic error in 32-bit trampolines.

The test in the PowerPC 32-bit trampoline support is backwards.  It aborts
if the trampoline size is greater than the expected size.  It should abort
when the trampoline size is less than the expected size.  I fixed the test
so the operands are reversed.  I then folded the load immediate into the
compare instruction.
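
The intended comparison can be sketched in C++ as below. The helper
name is hypothetical, the real check lives in assembly in tramp.S, and
the constant 40 is taken from the experiment described in this message.

```cpp
#include <cassert>
#include <cstddef>

// Size the trampoline code needs (illustrative constant).
constexpr std::size_t required_size = 40;

// Returns true when the caller's buffer can hold the trampoline.
// Before the fix the test was reversed, so a larger buffer (which is
// perfectly usable) caused an abort.
bool trampoline_size_ok(std::size_t provided)
{
  return provided >= required_size;
}
```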

I verified this by creating a 32-bit trampoline program and manually
changing the size of the trampoline to be 48 instead of 40.  The program
aborted with the larger size.  I updated this code and ran the test again
and it passed.

I added a test case that runs on PowerPC 32-bit Linux systems and it calls
the __trampoline_setup function with a larger buffer size than the
compiler uses.  The test is not run on 64-bit systems, since the function
__trampoline_setup is not called.  I also limited the test to just Linux
systems, in case trampolines are handled differently in other systems.

libgcc/
2021-04-23  Michael Meissner  <meissner@linux.ibm.com>

PR target/98952
* config/rs6000/tramp.S (__trampoline_setup, elfv1 #ifdef): Fix
trampoline size comparison in 32-bit by reversing test and
combining load immediate with compare.
(__trampoline_setup, elfv2 #ifdef): Fix trampoline size comparison
in 32-bit by reversing test and combining load immediate with
compare.

gcc/testsuite/
2021-04-23  Michael Meissner  <meissner@linux.ibm.com>

PR target/98952
* gcc.target/powerpc/pr98952.c: New test.

bpf: allow BSS symbols to be global symbols

Prior to this, a BSS declaration such as:

  int foo;
  static int bar;

Generates:

  .global foo
  .local  foo
  .comm   foo,4,4
  .local  bar
  .comm   bar,4,4

Creating symbols:

  0000000000000000 b foo
  0000000000000004 b bar

Both symbols are local. However, libbpf bpf_object__variable_offset
requires symbols to be STB_GLOBAL & STT_OBJECT for data section lookup.
This patch makes the same declaration generate:

  .global foo
  .type   foo, @object
  .lcomm  foo,4,4
  .local  bar
  .comm   bar,4,4

Creating symbols:

  0000000000000000 B foo
  0000000000000004 b bar

And libbpf will be okay with looking up the global symbol "foo".

2021-04-22  YiFei Zhu  <zhuyifei1999@gmail.com>

gcc/

* config/bpf/bpf.h (ASM_OUTPUT_ALIGNED_BSS): Use .type and .lcomm.

bpf: align function entry point to 64 bits

Libbpf does not handle padding after functions well. If function
symbols do not cover a whole text section, it will emit an error
similar to:

libbpf: sec '.text': failed to find program symbol at offset 56

Each instruction in BPF is a multiple of 8 bytes, so align the
functions to 8 bytes, similar to how clang does it.
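
To see why an 8-byte boundary leaves no gaps, here is a small sketch of
the arithmetic. The helper names are made up for this example; the only
facts assumed are that FUNCTION_BOUNDARY is expressed in bits (64 bits
is 8 bytes) and that function sizes are multiples of 8 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Rounds 'offset' up to the next multiple of 'alignment' (a power of two).
constexpr std::uint64_t align_up(std::uint64_t offset, std::uint64_t alignment)
{
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Padding inserted between a function ending at 'end' and the next one.
constexpr std::uint64_t padding_after(std::uint64_t end, std::uint64_t alignment)
{
  return align_up(end, alignment) - end;
}

// A function ending at a multiple of 8 needs no padding at 8-byte alignment.
static_assert(padding_after(56, 8) == 0, "no gap at 8-byte alignment");
```

With a coarser, hypothetical 16-byte boundary the same function end
would be followed by 8 bytes of padding not covered by any symbol.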

2021-04-22 YiFei Zhu <zhuyifei1999@gmail.com>

gcc/

* config/bpf/bpf.h (FUNCTION_BOUNDARY): Set to 64.

i386: Reject -m96bit-long-double for 64bit targets [PR100041]

64bit targets default to 128bit long double, so -m96bit-long-double should
not be used. Together with -m128bit-long-double, this option was intended
to be an optimization for 32bit targets only.

Error out when -m96bit-long-double is used with 64bit targets.

2021-04-23 Uroš Bizjak <ubizjak@gmail.com>

gcc/
PR target/100041
* config/i386/i386-options.c (ix86_option_override_internal):
Error out when -m96bit-long-double is used with 64bit targets.
* config/i386/i386.md (*pushxf_rounded): Remove pattern.

gcc/testsuite/

PR target/100041
* gcc.target/i386/pr79514.c (dg-error):
Expect error for 64bit targets.

Remove not feasible FIXME

gcc/ChangeLog:

* lto-wrapper.c: Remove FIXME about usage of
hardware_concurrency. The function is not on par with
what we have now.

MAINTAINERS: Add myself for write after approval

ChangeLog:

2021-04-23 David Faust <david.faust@oracle.com>

* MAINTAINERS (Write After Approval): Add myself.

i386: Fix atomic FP peepholes [PR100182]

64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit
targets, so there is no need for additional atomic moves to a temporary
register.

Introduced load peephole2 patterns assume that there won't be any additional
loads from the load location outside the peepholed sequence and wrongly
removed the source location initialization.

OTOH, introduced store peephole2 patterns assume there won't be any additional
loads from the stored location outside the peepholed sequence and wrongly
removed the destination location initialization.  Note that we can't use plain
x87 FST instruction to initialize destination location because FST converts
the value to the double-precision format, changing bits during move.

The patch restores removed initializations in load and store patterns.
Additionally, plain x87 FST in store peephole2 patterns is prevented by
limiting the store operand source to SSE registers.

2021-04-23  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
PR target/100182
* config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
Copy operand 3 to operand 4.  Use sse_reg_operand
as operand 3 predicate.
(FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto.
(LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
(LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
(FILD_ATOMIC/FIST_ATOMIC FP store peephole2):
Copy operand 1 to operand 0.
(FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto.
(LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
(LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.

gcc/testsuite/

PR target/100182
* gcc.target/i386/pr100182.c: New test.
* gcc.target/i386/pr71245-1.c (dg-final): Xfail scan-assembler-not.
* gcc.target/i386/pr71245-2.c (dg-final): Ditto.

early-remat.c: Fix new/delete mismatch [PR100230]

This simple patch fixes a mismatched operator new/delete in
early-remat.c which triggers ASan errors on (at least) AArch64 when
compiling SVE code.
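
A minimal sketch of the pairing rule, using a made-up function rather
than the code in early-remat.c:

```cpp
#include <cassert>
#include <cstddef>

// Allocates a scratch array with new[] and releases it with delete[].
// Releasing it with plain 'delete' would be undefined behaviour, and is
// the kind of mismatch ASan reports.
int sum_of_squares(std::size_t n)
{
  int *scratch = new int[n];
  for (std::size_t i = 0; i < n; ++i)
    scratch[i] = static_cast<int>(i * i);

  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += scratch[i];

  delete[] scratch;  // array form matches the new[] above
  return total;
}
```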

gcc/ChangeLog:

PR rtl-optimization/100230
* early-remat.c (early_remat::sort_candidates): Use delete[]
instead of delete for array allocated with new[].

libstdc++: Allow net::io_context to compile without <poll.h> [PR 100180]

This adds dummy placeholders to net::io_context so that it can still be
compiled on targets without <poll.h>.
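
The general technique can be sketched like this. The macro name, the
alias and the helper function are hypothetical and are not the names
used in the header; only the idea of a stand-in type with the same
member names comes from this change.

```cpp
#include <cassert>

// Detect <poll.h> where the compiler supports __has_include.
#if defined(__has_include)
# if __has_include(<poll.h>)
#  include <poll.h>
#  define SKETCH_HAVE_POLL_H 1  // hypothetical macro for this sketch
# endif
#endif

#ifdef SKETCH_HAVE_POLL_H
using pollfd_type = ::pollfd;
#else
// Placeholder with the same member names, so code that only stores
// descriptors and event masks still compiles.
struct dummy_pollfd { int fd; short events; short revents; };
using pollfd_type = dummy_pollfd;
#endif

// Builds an entry for 'fd' with no events requested.
pollfd_type make_entry(int fd)
{
  pollfd_type p{};
  p.fd = fd;
  return p;
}
```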

libstdc++-v3/ChangeLog:

PR libstdc++/100180
* include/experimental/io_context (io_context): Define
dummy_pollfd type so that most member functions still compile
without <poll.h> and struct pollfd.

libstdc++: Clarify argument to net::io_context::async_wait

Add a comment documenting the __w parameter of the private
io_context::async_wait function. Add casts to callers, making the
conversions explicit.

libstdc++-v3/ChangeLog:

* include/experimental/io_context (io_context::async_wait): Add
comment.
* include/experimental/socket (basic_socket::async_connect):
Cast wait_type constant to int.
(basic_datagram_socket::async_receive): Likewise.
(basic_datagram_socket::async_receive_from): Likewise.
(basic_datagram_socket::async_send): Likewise.
(basic_datagram_socket::async_send_to): Likewise.
(basic_stream_socket::async_receive): Likewise.
(basic_stream_socket::async_send): Likewise. Use io_context
parameter directly, instead of via an executor.
(basic_socket_acceptor::async_accept): Likewise.

libstdc++: Simplify definition of net::socket_base constants

libstdc++-v3/ChangeLog:

* include/experimental/socket (socket_base::shutdown_type):
(socket_base::wait_type, socket_base::message_flags):
Remove enumerators. Initialize constants directly with desired
values.
(socket_base::message_flags): Make all operators constexpr and
noexcept.
* testsuite/util/testsuite_common_types.h (test_bitmask_values):
New test utility.
* testsuite/experimental/net/socket/socket_base.cc: New test.

c++: Fix pretty printing pointer to function type [PR98767]

When pretty printing a pointer to function type,
pp_cxx_parameter_declaration_clause ends up always outputting an empty
function parameter list because the loop that outputs the list iterates
over 'args' instead of 'types', and 'args' is empty when a FUNCTION_TYPE
is passed to this routine (as opposed to a FUNCTION_DECL).

This patch fixes this by making the loop iterate over 'types' instead.
This patch also moves the retrofitted chain-of-PARM_DECLs printing from
here to pp_cxx_requires_expr, the only caller that uses it. Doing so
lets us easily output the trailing '...' in the parameter list of a
variadic function, which this patch also implements.

gcc/cp/ChangeLog:

PR c++/98767
* cxx-pretty-print.c (pp_cxx_parameter_declaration_clause):
Adjust parameter list loop to iterate over 'types' instead of
'args'. Output the trailing '...' for a variadic function.
Remove PARM_DECL support.
(pp_cxx_requires_expr): Pretty print the parameter list directly
instead of going through pp_cxx_parameter_declaration_clause.

gcc/testsuite/ChangeLog:

PR c++/98767
* g++.dg/concepts/diagnostic17.C: New test.

c++: Refine enum direct-list-initialization [CWG2374]

This implements the wording changes of CWG2374, which clarifies the
wording of P0138 to forbid e.g. direct-list-initialization of a scoped
enumeration from a different scoped enumeration.
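
A short sketch of the rule, with enumeration names invented for this
example (the committed test is direct-enum-init2.C):

```cpp
#include <cassert>

enum class Apple : int {};
enum class Orange : int {};

// OK since P0138: 5 is an int, converts implicitly and without narrowing.
constexpr Orange o{5};

// Ill-formed under CWG2374: Apple does not convert implicitly to int.
// Orange bad{Apple{}};

constexpr int value_of(Orange x) { return static_cast<int>(x); }
```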

gcc/cp/ChangeLog:

DR 2374
* decl.c (is_direct_enum_init): Check the implicit
convertibility requirement added by CWG 2374.

gcc/testsuite/ChangeLog:

DR 2374
* g++.dg/cpp1z/direct-enum-init2.C: New test.

VEC_COND_EXPR code cleanup

This removes now unnecessary special-casings of VEC_COND_EXPRs after
making its first operand a gimple value.

2021-04-14 Richard Biener <rguenther@suse.de>

* genmatch.c (lower_cond): Remove VEC_COND_EXPR special-casing.
(capture_info::capture_info): Likewise.
(capture_info::walk_match): Likewise.
(expr::gen_transform): Likewise.
(dt_simplify::gen_1): Likewise.
* gimple-match-head.c (maybe_resimplify_conditional_op):
Remove VEC_COND_EXPR special-casing.
(gimple_simplify): Likewise.
* gimple.c (gimple_could_trap_p_1): Adjust.
* tree-ssa-pre.c (compute_avail): Allow VEC_COND_EXPR
to participate in PRE.

First do add_noreturn_fake_exit_edges in connect_infinite_loops_to_exit

Most callers of connect_infinite_loops_to_exit already do this but
the few that do not end up with extra exit edges. The following
makes that consistent, also matching the post-dominance DFS walk code.

2021-02-25 Richard Biener <rguenther@suse.de>

* cfganal.c (connect_infinite_loops_to_exit): First call
add_noreturn_fake_exit_edges.
* ipa-sra.c (process_scan_results): Do not call the now redundant
add_noreturn_fake_exit_edges.
* predict.c (tree_estimate_probability): Likewise.
(rebuild_frequencies): Likewise.
* store-motion.c (one_store_motion_pass): Likewise.

tree-optimization/100222 - remove redundant mark_irreducible_loops calls

loop_optimizer_init (LOOPS_NORMAL) already performs this (quite
expensive) marking.

2021-04-23 Richard Biener <rguenther@suse.de>

PR tree-optimization/100222
* predict.c (pass_profile::execute): Remove redundant call to
mark_irreducible_loops.
(report_predictor_hitrates): Likewise.

Avoid more temporaries in IVOPTs

This avoids use of valid_gimple_rhs_p and instead gimplifies to
such a RHS, avoiding more SSA copies being generated by IVOPTs.

2021-04-14 Richard Biener <rguenther@suse.de>

* tree-ssa-loop-ivopts.c (rewrite_use_nonlinear_expr): Avoid
valid_gimple_rhs_p by instead gimplifying to one.

c++: Use STATIC_ASSERT for OVL_OP_MAX.

gcc/cp/ChangeLog:

* cp-tree.h (STATIC_ASSERT): Prefer static assert.
* lex.c (init_operators): Remove run-time check.
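
As a generic illustration of trading a run-time check for a compile-time
one, consider the sketch below. The enumeration, the bit width and the
struct are hypothetical and do not reproduce the actual OVL_OP_MAX
check.

```cpp
#include <cassert>

// Hypothetical operator-code enumeration with a trailing count.
enum ovl_op_code { OP_NONE, OP_PLUS, OP_MINUS, OP_MAX };

// Hypothetical width of the bit-field that stores a code.
constexpr unsigned code_bits = 6;

// Verified while compiling, so no start-up assertion is needed.
static_assert(OP_MAX <= (1 << code_bits),
              "operator codes must fit in the bit-field");

struct op_info { unsigned code : code_bits; };
```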

tree-optimization/99971 - improve BB vect dependence analysis

We can use TBAA even when we have a DR, do so. For the testcase
that means fully vectorizing it instead of only vectorizing
the first store group resulting in suboptimal code.

2021-04-09 Richard Biener <rguenther@suse.de>

PR tree-optimization/99971
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences):
Always use TBAA for loads.

* g++.dg/vect/slp-pr99971.cc: New testcase.

MASK_AVX256_SPLIT_UNALIGNED_STORE/LOAD should be cleared in opts->x_target_flags when X86_TUNE_AVX256_UNALIGNED_LOAD/STORE_OPTIMAL is enabled by target attribute.

gcc/ChangeLog:

PR target/100093
* config/i386/i386-options.c (ix86_option_override_internal):
Clear MASK_AVX256_SPLIT_UNALIGNED_LOAD/STORE in x_target_flags
when X86_TUNE_AVX256_UNALIGNED_LOAD/STORE_OPTIMAL is enabled
by target attribute.

gcc/testsuite/ChangeLog:

PR target/100093
* gcc.target/i386/pr100093.c: New test.

Daily bump.

aix: Switch AIX configuration to DWARF2 debugging

This patch is in preparation for removing stabs debugging support from GCC.

The rs6000 configuration files remain somewhat intertwined with the
stabs debugging support, but the configuration no longer generates
stabs debugging information.

This patch means that earlier releases (Technology Levels) of AIX 7.1
and 7.2, prior to DWARF support and fixes, cannot build GCC or support
GCC.

gcc/ChangeLog:

* config/rs6000/aix71.h (PREFERRED_DEBUGGING_TYPE): Change to
DWARF2_DEBUG.
* config/rs6000/aix72.h (PREFERRED_DEBUGGING_TYPE): Same.

aix: Remove AIX 6.1 support.

AIX 6.1 is past end of life and extended support. This patch removes
the configuration option and references to AIX 6.1.

contrib/ChangeLog:

* config-list.mk: Remove rs6000-ibm-aix6.1.
Rename rs6000-ibm-aix7.1 to powerpc-ibm-aix7.1.
Add powerpc-ibm-aix7.2.

gcc/ChangeLog:

* config.gcc (powerpc-ibm-aix6.*): Remove.
* config/rs6000/aix61.h: Delete.

aix: delete AIX pre-PowerPC version of atomicity.h

The AIX-specific version of atomicity.h that provides compatibility
for the original POWER architecture without atomic instructions is no
longer referenced. This patch deletes the file.

libstdc++-v3/ChangeLog:

* config/os/aix/atomicity.h: Delete.

c++: Add testcase for already fixed PR [PR94508]

We correctly accept this testcase since r11-8144.

gcc/testsuite/ChangeLog:

PR c++/94508
* g++.dg/cpp2a/concepts-uneval3.C: New test.

c++: Add testcase for already fixed PR [PR77435]

We correctly accept this testcase since r8-1437.

gcc/testsuite/ChangeLog:

PR c++/77435
* g++.dg/template/partial-specialization9.C: New test.

c++: Prevent bogus -Wtype-limits warning with NTTP [PR100161]

Recently, we made sure that we never call value_dependent_expression_p
on an expression that isn't potential_constant_expression.  That caused
this bogus warning with a non-type template parameter, something that
users don't want to see.

The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
type of 'n' isn't dependent, so we think the whole 't' expression is
not dependent.  It seems we need to test both op0 and op1 separately
to suppress this warning.
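
A minimal sketch of the kind of code affected, assuming a loop bounded by an
unsigned non-type template parameter.  The name count_below is hypothetical
and this is not the testcase added by the commit.  With N == 0 the comparison
'i < N' is always false, but that is a property of one instantiation rather
than something the user wrote, so a -Wtype-limits warning there is unhelpful.

```cpp
// Hypothetical illustration, not the actual g++.dg/warn/Wtype-limits6.C test.
template <unsigned N>
unsigned count_below()
{
  unsigned count = 0;
  // 'i' is not a constant expression; 'N' is only known at instantiation.
  for (unsigned i = 0; i < N; ++i)
    ++count;
  return count;
}
```

Instantiating count_below<0>() is the case where the bogus warning would be
expected to show up; count_below<3>() simply counts to three.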

gcc/cp/ChangeLog:

PR c++/100161
* pt.c (tsubst_copy_and_build) <case PLUS_EXPR>: Test op0 and
op1 separately for value- or type-dependence.

gcc/testsuite/ChangeLog:

PR c++/100161
* g++.dg/warn/Wtype-limits6.C: New test.

c++: Add testcase for already fixed PR [PR84689]

We correctly accept this testcase since r11-1638.

gcc/testsuite/ChangeLog:

PR c++/84689
* g++.dg/cpp0x/sfinae67.C: New test.

c++: Add testcase for already fixed PR [PR16617]

We correctly diagnose the invalid access since r11-1350.

gcc/testsuite/ChangeLog:

PR c++/16617
* g++.dg/template/access36.C: New test.

testsuite/substr_{9,10}.f90: Move to gfortran.dg/

gcc/testsuite/
* substr_9.f90: Move to ...
* gfortran.dg/substr_9.f90: ... here.
* substr_10.f90: Move to ...
* gfortran.dg/substr_10.f90: ... here.

libstdc++: Fix semaphore to work with system_clock timeouts

The __cond_wait_until_impl function takes a steady_clock timeout, but
then sometimes tries to compare it to a time from the system_clock,
which won't compile. Additionally, that function gets called with
system_clock timeouts, which also won't compile. This makes the function
accept timeouts for either clock, and compare to the time from the right
clock.

This fixes the compilation error that was causing two tests to fail on
non-futex targets, so we can revert the r12-11 change to disable them.
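
The idea of comparing a timeout against the time from its own clock can be
sketched as follows.  deadline_reached is a hypothetical helper, not the
libstdc++ implementation of __cond_wait_until_impl; it only illustrates why
the clock has to be a template parameter.

```cpp
#include <chrono>
#include <type_traits>

// Hypothetical helper: report whether a deadline has passed, reading "now"
// from the same clock that the deadline belongs to.
template <typename Clock, typename Duration>
bool deadline_reached(const std::chrono::time_point<Clock, Duration>& deadline)
{
  static_assert(std::is_same<Clock, std::chrono::steady_clock>::value
                || std::is_same<Clock, std::chrono::system_clock>::value,
                "this sketch only handles steady_clock and system_clock");
  // Comparing time_points from different clocks does not compile, so the
  // current time must come from Clock itself.
  return Clock::now() >= deadline;
}
```

A call such as deadline_reached(std::chrono::system_clock::now()
+ std::chrono::hours(1)) compiles and returns false, while mixing a
steady_clock deadline with system_clock::now() would be rejected.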

libstdc++-v3/ChangeLog:

* include/bits/atomic_timed_wait.h (__cond_wait_until_impl):
Handle system_clock as well as steady_clock.
* testsuite/30_threads/semaphore/try_acquire_for.cc: Re-enable.
* testsuite/30_threads/semaphore/try_acquire_until.cc:
Re-enable.

libstdc++: Add options for libatomic to test

This fixes a linker error on AIX:

FAIL: 30_threads/semaphore/try_acquire_posix.cc (test for excess errors)
Excess errors:
ld: 0711-317 ERROR: Undefined symbol: .__atomic_fetch_add_8
ld: 0711-317 ERROR: Undefined symbol: .__atomic_load_8
ld: 0711-317 ERROR: Undefined symbol: .__atomic_fetch_sub_8
ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.
collect2: error: ld returned 8 exit status

libstdc++-v3/ChangeLog:

* testsuite/30_threads/semaphore/try_acquire_posix.cc: Add
options for libatomic.

libstdc++: Fix typo in comment

libstdc++-v3/ChangeLog:

* config/os/gnu-linux/os_defines.h: Fix typo in comment.

libstdc++: Reject std::make_shared<T[]> [PR 99006]

Prior to C++20 it should be ill-formed to use std::make_shared with an
array type (and we don't support the C++20 feature to make it valid yet
anyway).
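
A sketch of the kind of check involved, using a hypothetical wrapper named
make_shared_checked.  The real assertion is added inside libstdc++'s
allocate_shared and __allocate_shared; the wrapper below only shows the
technique (a static_assert on std::is_array).

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical wrapper, for illustration only.
template <typename T, typename... Args>
std::shared_ptr<T> make_shared_checked(Args&&... args)
{
  static_assert(!std::is_array<T>::value,
                "make_shared<T[]> is not supported in this sketch");
  return std::make_shared<T>(std::forward<Args>(args)...);
}
```

make_shared_checked<int>(42) works as usual, whereas
make_shared_checked<int[]>(3) is rejected at compile time with the message
above.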

libstdc++-v3/ChangeLog:

PR libstdc++/99006
* include/bits/shared_ptr.h (allocate_shared): Assert that _Tp
is not an array type.
* include/bits/shared_ptr_base.h (__allocate_shared): Likewise.
* testsuite/20_util/shared_ptr/creation/99006.cc: New test.

Fix various typos.

PR testsuite/100159
PR testsuite/100192

gcc/ChangeLog:

* builtins.c (expand_builtin): Fix typos and missing comments.
* dwarf2out.c (gen_subprogram_die): Likewise.
(gen_struct_or_union_type_die): Likewise.

gcc/fortran/ChangeLog:

* frontend-passes.c (optimize_expr): Fix typos and missing comments.

gcc/testsuite/ChangeLog:

* g++.dg/template/nontype29.C: Fix typos and missing comments.
* gcc.dg/Warray-bounds-64.c: Likewise.
* gcc.dg/Warray-parameter.c: Likewise.
* gcc.dg/Wstring-compare.c: Likewise.
* gcc.dg/format/gcc_diag-11.c: Likewise.
* gfortran.dg/array_constructor_3.f90: Likewise.
* gfortran.dg/matmul_bounds_9.f90: Likewise.
* gfortran.dg/pr78033.f90: Likewise.
* gfortran.dg/pr96325.f90: Likewise.

libstdc++: Fix "bare" notifications dropped by waiters check

For types that track whether or not there are extant waiters (e.g.
semaphore) internally, the __atomic_notify_address_bare() call was
introduced to avoid the overhead of loading the atomic count of
waiters. For platforms that don't have Futex, however, there was
still a check for waiters, and seeing that there are none (because
in the bare case, the count is not incremented), the notification
is dropped. This commit addresses that case.
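
The dropped notification can be modelled with a small sketch.  The type
waiter_pool_sketch and its members are hypothetical and much simpler than
the code in include/bits/atomic_wait.h; the point is only that a "bare"
notification must not consult the waiter count.

```cpp
#include <atomic>

// Hypothetical model of a waiter pool; names are illustrative only.
struct waiter_pool_sketch
{
  std::atomic<int> waiter_count{0};  // incremented only by non-bare waits
  int wakeups_issued = 0;            // stands in for the actual wake-up call

  // Returns true if a wake-up was issued.
  bool notify(bool bare)
  {
    // Bare waiters never increment waiter_count, so a zero count says
    // nothing about them; skipping the wake-up would lose the notification.
    if (!bare && waiter_count.load(std::memory_order_acquire) == 0)
      return false;
    ++wakeups_issued;
    return true;
  }
};
```

With no registered waiters, notify(false) skips the wake-up while
notify(true) always issues it.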

libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h: Always notify waiters in the
case of 'bare' address notification.

i386: Fix unsigned int -> double conversion on i386 w/ -mfpmath=sse [PR100119]

2021-04-22 Uroš Bizjak <ubizjak@gmail.com>

gcc/
PR target/100119
* config/i386/i386-expand.c (ix86_expand_convert_uns_sidf_sse):
Remove the sign with FE_DOWNWARD, where x - x = -0.0.

gcc/testsuite/

PR target/100119
* gcc.target/i386/pr100119.c: New test.
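
The ChangeLog remark that x - x = -0.0 under FE_DOWNWARD can be checked with
a short sketch, assuming an IEEE 754 target.  The helper names are
hypothetical and this is not the expander code from i386-expand.c.  Strictly
conforming code would also need FENV_ACCESS (or -frounding-math); the
volatile objects are used here to keep the subtraction at run time.

```cpp
#include <cfenv>
#include <cmath>

// Returns true if value - value yields negative zero when rounding toward
// negative infinity.
bool difference_is_negative_zero(double value)
{
  volatile double lhs = value;
  volatile double rhs = value;
  const int saved_mode = std::fegetround();
  std::fesetround(FE_DOWNWARD);
  volatile double difference = lhs - rhs;  // evaluated under FE_DOWNWARD
  std::fesetround(saved_mode);             // restore the caller's mode
  const double result = difference;
  return result == 0.0 && std::signbit(result);
}

// Dropping the sign afterwards gives +0.0 regardless of the rounding mode.
double strip_sign(double value)
{
  return std::fabs(value);
}
```

Since the result of an unsigned conversion is never negative, removing the
sign does not change any other value.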

libstdc++: Add workaround for ia32 floating atomics miscompilations [PR100184]

GCC on ia32 miscompiles various atomics involving floating point.
Unfortunately, I'm afraid it is too late to fix that for 11.1, and
as I'm quite lost on it, it might take a while for 12 too
(disabling all 8 peephole2s would be easiest, but then we'd
run into optimization regressions).

While 1.cc just FAILs, with dejagnu 1.6.1 wait_notify.cc hangs the
make check even after the timeout fires.  The following patch therefore
xfails the former and skips the latter.

Tested on x86_64-linux where
make check RUNTESTFLAGS='conformance.exp=atomic_float/*.cc'
is still
                === libstdc++ Summary ===

# of expected passes            8
and on i686-linux, where it is now
                === libstdc++ Summary ===

# of expected passes            5
# of expected failures          1
# of unsupported tests          1

2021-04-22  Jakub Jelinek  <jakub@redhat.com>

PR target/100182
* testsuite/29_atomics/atomic_float/1.cc: Add dg-xfail-run-if for
ia32.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Add dg-skip-if for
ia32.

libstdc++: Remove #error from <semaphore> implementation [PR 100179]

This removes the #error from <bits/semaphore_base.h> for the case where
neither __atomic_semaphore nor __platform_semaphore is defined.

Also rename the _GLIBCXX_REQUIRE_POSIX_SEMAPHORE macro to
_GLIBCXX_USE_POSIX_SEMAPHORE for consistency with the similar
_GLIBCXX_USE_CXX11_ABI macro that can be used to request an alternative
(ABI-changing) implementation.

libstdc++-v3/ChangeLog:

PR libstdc++/100179
* include/bits/semaphore_base.h: Remove #error.
* include/std/semaphore: Do not define anything unless one of
the implementations is available.

testsuite/aarch64: Run pr99988.c test under lp64 only

The new test fails with -mabi=ilp32:
sorry, unimplemented: return address signing is only supported for '-mabi=lp64'

2021-04-22 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
PR target/99988
* gcc.target/aarch64/pr99988.c: Skip if not lp64 target.

gfortran.dg/pr68078.f90: Avoid increasing RLIMIT_AS

pr68078.f90 tests out-of-memory handling and calls set_vm_limit to set the
soft limit. However, setrlimit was then called with hard limit RLIM_INFINITY,
which failed when the current hard limit was lower.

gcc/testsuite/
* gfortran.dg/set_vm_limit.c (set_vm_limit): Call getrlimit, use
obtained hard limit, and only call setrlimit if new softlimit is lower.
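
The logic described in the ChangeLog entry can be sketched as follows.
lower_soft_limit is a hypothetical helper modelled on that description; it
is not the code from gfortran.dg/set_vm_limit.c.

```cpp
#include <sys/resource.h>

// Hypothetical helper.  Returns 0 on success and -1 on error.
int lower_soft_limit(int resource, rlim_t new_soft)
{
  struct rlimit limits;
  if (getrlimit(resource, &limits) != 0)
    return -1;

  // Nothing to do if the current soft limit is already at least as strict.
  if (limits.rlim_cur != RLIM_INFINITY && limits.rlim_cur <= new_soft)
    return 0;

  // Keep the hard limit that getrlimit reported instead of forcing
  // RLIM_INFINITY, which fails when the existing hard limit is lower.
  limits.rlim_cur = new_soft;
  return setrlimit(resource, &limits);
}
```

For the test in question the resource would be RLIMIT_AS; the helper takes
the resource as a parameter so it can be tried on a harmless one first.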

testsuite/100176 - fix struct-layout-1_generate.c compile

With -Werror=return-type we run into compile fails complaining about
missing return stmts.

2021-04-21 Richard Biener <rguenther@suse.de>

PR testsuite/100176
* objc.dg/gnu-encoding/struct-layout-encoding-1_generate.c: Add
missing return.

Avoid -latomic for amdgcn offloading

libatomic isn't built for amdgcn but reduction-16.c adds it
via -foffload=-latomic when offloading for nvptx is enabled.
The following avoids linker errors when offloading to amdgcn is enabled
as well.

2021-04-21 Richard Biener <rguenther@suse.de>

libgomp/
* testsuite/libgomp.c-c++-common/reduction-16.c: Use -latomic
only on nvptx-none.

Fix Fortran rounding issues, PR fortran/96983.

I was looking at Fortran PR 96983, which fails on the PowerPC when trying to
run the test PR96711.F90.  The compiler ICEs because the PowerPC does not have
a floating point type with a type precision of 128.  The reason is that the
PowerPC has 3 different 128 bit floating point types (__float128/_Float128,
__ibm128, and long double).  Currently long double uses the IBM extended double
type, but we would like to switch to using IEEE 128-bit long doubles in the
future.

In order to prevent the compiler from converting explicit __ibm128 types to
long double when long double uses the IEEE 128-bit representation, we have set
up the precision for __ibm128 to be 128, long double to be 127, and
__float128/_Float128 to be 126.

Originally, I was trying to see if for Fortran, I could change the precision of
long double to be 128 (Fortran doesn't access __ibm128), but it quickly became
hard to get the changes to work.

I looked at the Fortran code in build_round_expr, and I came to the conclusion
that there is no reason to promote the floating point type.  It is enough to do a
normal round of the value using the current floating point format and then
convert it to the integer type.  We don't have an appropriate built-in function
that provides the equivalent of llround for 128-bit integer types.
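
As a C++ analogue of that approach (the actual change builds trees in the
Fortran front end), the sketch below rounds in the value's own floating
point format and then converts to a 128-bit integer.  It assumes the
__int128 extension of GCC and Clang on 64-bit targets; the names
wide_int_sketch and round_to_wide are hypothetical.

```cpp
#include <cmath>

// __int128 is a compiler extension; its availability is an assumption.
__extension__ typedef __int128 wide_int_sketch;

// Round in the value's own floating point format, then convert.  No
// floating point type as wide as the integer type is needed.
wide_int_sketch round_to_wide(long double value)
{
  return static_cast<wide_int_sketch>(std::round(value));
}
```

For example, round_to_wide(2.5L) yields 3 and round_to_wide(-2.5L) yields
-3, matching the round-half-away-from-zero behaviour of std::round.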

This patch fixes the compiler crash.

However, while with this patch, the PowerPC compiler will not crash when
building the test case, it will not run on the current default installation.
The failure is because the test is explicitly expecting 128-bit floating point
to handle 10384593717069655257060992658440192_16 (i.e. 2**113).

By default, the PowerPC uses IBM extended double for 128-bit floating
point.  The IBM extended double format is a pair of doubles that provides more
mantissa bits but does not grow the exponent range.  The value in the test is
fine for IEEE 128-bit floating point, but it is too large for the PowerPC
extended double setup.

I have built the following tests with this patch:

   * I have built a bootstrap compiler on a little endian power9 Linux system
     with the default long double format (IBM extended double).  The
     pr96711.f90 test builds, but it does not run due to the range of the
     real*16 exponent.  There were no other regressions in the C/C++/Fortran
     tests.

   * I have built a bootstrap compiler on a little endian power9 Linux system
     with the default long double format set to IEEE 128-bit. I used the
     Advance Toolchain 14.0-2 to provide the IEEE 128-bits.  The compiler was
     configured to build power9 code by default, so the test generated native
     power9 IEEE 128-bit instructions.  The pr96711.f90 test builds and runs
     correctly in this setup.

   * I have built a bootstrap compiler on a big endian power8 Linux system with
     the default long double format (IBM extended double).  Like the first
     case, the pr96711.f90 test does not crash the compiler, but the test fails
     due to the range of the real*16 exponent.  There were no other
     regressions in the C/C++/Fortran tests.

   * I built a bootstrap compiler on my x86_64 laptop.  There were no
     regressions in the tests.

gcc/fortran/
2021-04-21  Michael Meissner  <meissner@linux.ibm.com>

PR fortran/96983
* trans-intrinsic.c (build_round_expr): If int type is larger than
long long, do the round and convert to the integer type.  Do not
try to find a floating point type the exact size of the integer
type.

Daily bump.

libgomp.fortran/depobj-1.f90: Fix omp_depend_kind

libgomp/
* testsuite/libgomp.fortran/depobj-1.f90: Use omp_lib's
omp_depend_kind instead of defining it as 16.

[libstdc++] Fix test timeout in stop_callback/destroy.cc

A change was made to __atomic_semaphore::_S_do_try_acquire() to
(ideally) let the compare_exchange reload the value of __old rather than
always reloading it twice. This causes _M_acquire to spin indefinitely
if the value of __old is already 0.
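
A sketch of a single acquire attempt that always reloads the counter.
try_acquire_once is a hypothetical stand-in and not the libstdc++ code; the
memory orderings are an assumption chosen for illustration.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for one acquire attempt on a semaphore counter.
bool try_acquire_once(std::atomic<std::ptrdiff_t>& counter)
{
  // Reload the counter on every attempt.  A caller that retried with a
  // stale zero would keep failing even after another thread released.
  std::ptrdiff_t old_value = counter.load(std::memory_order_acquire);
  if (old_value == 0)
    return false;
  return counter.compare_exchange_strong(old_value, old_value - 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed);
}
```

A blocking acquire can call this in a loop and wait between attempts; each
iteration then observes a fresh value instead of a cached zero.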

libstdc++-v3/ChangeLog:
* include/bits/semaphore_base.h: Always reload __old in
__atomic_semaphore::_S_do_try_acquire().
* testsuite/30_threads/stop_token/stop_callback/destroy.cc:
re-enable testcase.

Darwin, X86 : Fix bootstrap break from flags changes.

The changes from r12-36-g1751bec027f030515889fcf4baa9c91501aafc85
did not remove the uses of TARGET_ISA_* from i386/darwin.h.

Fixed thus.

gcc/ChangeLog:

* config/i386/darwin.h (TARGET_64BIT): Remove definition
based on TARGET_ISA_64BIT.
(TARGET_64BIT_P): Remove definition based on
TARGET_ISA_64BIT_P().

Revert "Use std::thread::hardware_concurrency in lto-wrapper.c."

This reverts commit 0a18305ee11e139838771f96c5a037a29606236e.

Call toplev::finalize in CHECKING_P mode.

gcc/ChangeLog:

PR jit/98615
* main.c (main): Call toplev::finalize in CHECKING_P mode.
* ipa-modref.c (ipa_modref_c_finalize): summaries are NULL
when incremental LTO linking happens.

libgomp/testsuite: Fix checks for dg-excess-errors

For the tests modified below, the dg- lines have to take effect when
compiling for an offload target, except that variable-not-offloaded.c
would compile with unified shared memory, and pr86416-*.c would compile if
long double/float128 is supported.
The previous check used a run-time device ability check. This new variant
now enables those dg- lines when _compiling_ for nvptx or gcn.

libgomp/ChangeLog:

* testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type):
New, based on check_effective_target_offload_target_nvptx.
(check_effective_target_offload_target_nvptx): Call it.
(check_effective_target_offload_target_amdgcn): New.
* testsuite/libgomp.c-c++-common/function-not-offloaded.c:
Require target offload_target_nvptx || offload_target_amdgcn.
* testsuite/libgomp.c-c++-common/variable-not-offloaded.c: Likewise.
* testsuite/libgomp.c/pr86416-1.c: Likewise.
* testsuite/libgomp.c/pr86416-2.c: Likewise.

libstdc++: Install libstdc++*-gdb.py more robustly [PR 99453]

In order for GDB to auto-load the pretty printers, they must be installed
as "libstdc++.$ext-gdb.py", where 'libstdc++.$ext' is the name of the
object file that is loaded by GDB [1], i.e. the libstdc++ shared library.

The approach taken in libstdc++-v3/python/Makefile.am is to loop over
files matching 'libstdc++*' in $(DESTDIR)$(toolexeclibdir) and choose
the last file matching that glob that is not a symlink, the Libtool
'*.la' file or a Python file.

That works fine for ELF targets where the matching names are:

  libstdc++.a
  libstdc++.so
  libstdc++.so.6
  libstdc++.so.6.0.29

But not for macOS with:

  libstdc++.6.dylib
  libstdc++.a

Or MinGW with:

  libstdc++-6.dll
  libstdc++.dll.a

Try to do a better job of installing the pretty printers with the
correct name by copying the approach taken by isl [2], that is, using
a sed invocation on the Libtool-generated 'libstdc++.la' to read the
correct name for the current platform.

[1] https://sourceware.org/gdb/onlinedocs/gdb/objfile_002dgdbdotext-file.html
[2] https://repo.or.cz/isl.git/blob/HEAD:/Makefile.am#l611

libstdc++-v3/ChangeLog:

PR libstdc++/99453
* python/Makefile.am: Install libstdc++*-gdb.py more robustly.
* python/Makefile.in: Regenerate.

Co-authored-by: Jonathan Wakely <jwakely@redhat.com>

testsuite: Fix bind_c_array_params_2.f90 on AIX

gcc/testsuite/ChangeLog:

* gfortran.dg/bind_c_array_params_2.f90: Look for AIX-specific call
pattern.

[libstdc++] Add missing _M_try_acquire() to __platform_semaphore

libstdc++-v3/ChangeLog:
* include/bits/semaphore_base.h: Add missing _M_try_acquire()
member to __platform_semaphore.

LTO: fallback to -flto=N if -flto=jobserver does not work.

gcc/ChangeLog:

* lto-wrapper.c (run_gcc): When -flto=jobserver is used, but the
jobserver cannot be detected, then use -flto=N fallback.

c++: Don't allow defining types in enum-base [PR96380]

In r11-2064 I made cp_parser_enum_specifier commit to tentative parse
when seeing a '{'.  That still looks like the correct thing to do, but
it caused an ICE-on-invalid as well as accepts-invalid.

When we have something sneaky like this, which is broken in multiple
ways:

  template <class>
  enum struct c : union enum struct c { e = b, f = a };

we parse the "enum struct c" part (that's OK) and then we see that
we have an enum-base, so we consume ':' and then parse the type-specifier
that follows the :.  "union enum" is clearly invalid, but we're still
parsing tentatively and we parse everything up to the ;, and then
throw away the underlying type.  We parsed everything because we were
tricked into parsing an enum-specifier in an enum-base of another
enum-specifier!  Not good.

Since the grammar for enum-base doesn't allow a defining-type-specifier,
only a type-specifier, we should set type_definition_forbidden_message
which fixes all the problems in this PR.
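
For contrast with the invalid code above, a small sketch of a well-formed
enum-base.  The enum channel_sketch is hypothetical; the invalid form is
kept as a comment so that the block compiles.

```cpp
#include <type_traits>

// Valid: the enum-base names an existing type.
enum struct channel_sketch : unsigned char { red = 1, green = 2 };

// Invalid (comment only): a type may not be defined inside an enum-base.
//   enum struct bad : struct tag { } { x };

static_assert(std::is_same<std::underlying_type<channel_sketch>::type,
                           unsigned char>::value,
              "the enum-base fixes the underlying type");
```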

gcc/cp/ChangeLog:

PR c++/96380
* parser.c (cp_parser_enum_specifier): Don't allow defining
types in enum-base.

gcc/testsuite/ChangeLog:

PR c++/96380
* g++.dg/cpp0x/enum_base4.C: New test.
* g++.dg/cpp0x/enum_base5.C: New test.

Fix clang warning (-Wstring-plus-int)

This fixes:

lto-plugin.c:642:7: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]

lto-plugin/ChangeLog:

* lto-plugin.c (exec_lto_wrapper): Make a temp variable.

aarch64: Always use .init/.fini_array for GNU/Linux

I was wondering why the (now fixed) c-c++-common/attr-retain-[78].c
failures were showing up in the native results for aarch64-linux-gnu
but not in the posted cross results. It turns out that .init/
.fini_array support is disabled by default for cross builds,
which in turn stops those tests from running.

The test for .init/fini_array support has two parts: one that builds
something with the assembler and linker, and another that compiles
C code and uses preprocessor macros to test the glibc version.
The first test would work with build=host but the second is only
safe for build=target.

However, AArch64 postdates glibc and binutils support for
.init/fini_array by some distance, so it's safe to hard-code the
result to "yes" for cross compilers.

This fixes the only material difference in auto-host.h between
a native and a cross build.

gcc/
* acinclude.m4 (gcc_AC_INITFINI_ARRAY): When cross-compiling,
default to yes for aarch64-linux-gnu.
* configure: Regenerate.

Use std::thread::hardware_concurrency in lto-wrapper.c.

gcc/ChangeLog:

* lto-wrapper.c (cpuset_popcount): Remove.
(init_num_threads): Remove and use hardware_concurrency.

Fix clang warnings.

gcc/ChangeLog:

* config/i386/i386.c: Remove superfluous '|| TARGET_MACHO',
which ends up as '(... || 0)' and makes clang complain.
* dwarf2out.c (AT_vms_delta): Declare conditionally.
(add_AT_vms_delta): Likewise.
* tree.c (fld_simplified_type): Use rather more common pattern
for disabling of something (#if 0).
(get_tree_code_name): Likewise.
(verify_type_variant): Likewise.

Remove TARGET_foo (ix86_tune == PROCESSOR_foo) macros.

gcc/ChangeLog:

* config/i386/i386-expand.c (decide_alignment): Use newly named
macro TARGET_CPU_P.
* config/i386/i386.c (ix86_decompose_address): Likewise.
(ix86_address_cost): Likewise.
(ix86_lea_outperforms): Likewise.
(ix86_avoid_lea_for_addr): Likewise.
(ix86_add_stmt_cost): Likewise.
* config/i386/i386.h (TARGET_*): Remove.
(TARGET_CPU_P): New macro.
* config/i386/i386.md: Use newly named macro TARGET_CPU_P.
* config/i386/x86-tune-sched-atom.c (do_reorder_for_imul): Likewise.
(swap_top_of_ready_list): Likewise.
(ix86_atom_sched_reorder): Likewise.
* config/i386/x86-tune-sched-bd.c (ix86_bd_has_dispatch): Likewise.
* config/i386/x86-tune-sched.c (ix86_adjust_cost): Likewise.

Overhaul in isa_flags and handling it.

gcc/ChangeLog:

* config/i386/i386-options.c (TARGET_EXPLICIT_NO_SAHF_P):
Define.
(SET_TARGET_NO_SAHF): Likewise.
(TARGET_EXPLICIT_PREFETCH_SSE_P): Likewise.
(SET_TARGET_PREFETCH_SSE): Likewise.
(TARGET_EXPLICIT_NO_TUNE_P): Likewise.
(SET_TARGET_NO_TUNE): Likewise.
(TARGET_EXPLICIT_NO_80387_P): Likewise.
(SET_TARGET_NO_80387): Likewise.
(DEF_PTA): New.
* config/i386/i386.h (TARGET_*): Remove.
* opth-gen.awk: Generate new used macros.

Generate PTA features from a def file.

gcc/ChangeLog:

* config/i386/i386.h (PTA_*): Remove.
(enum pta_flag): New.
(DEF_PTA): Generate PTA_* values from i386-isa.def.
* config/i386/i386-isa.def: New file.

aarch64: Avoid duplicating bti j insns for jump tables [PR99988]

This patch fixes PR99988 which shows us generating large (> 250)
sequences of back-to-back bti j instructions.

The fix is simply to avoid inserting bti j instructions at the target of
a jump table if we've already inserted one for a given label.
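
The de-duplication can be modelled with a small sketch.  Labels are plain
integers here and "inserting a bti j" is recorded in a set; the function
insert_bti_j_sketch is hypothetical and unrelated to the RTL code in
aarch64-bti-insert.c.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical model.  Returns how many instructions were inserted for the
// given jump-table targets.
std::size_t insert_bti_j_sketch(const std::vector<int>& jump_table_targets,
                                std::set<int>& labels_with_bti_j)
{
  std::size_t inserted = 0;
  for (int label : jump_table_targets)
    {
      // std::set::insert reports whether the label was new; only then is
      // an instruction emitted.
      if (labels_with_bti_j.insert(label).second)
        ++inserted;
    }
  return inserted;
}
```

A jump table whose entries mostly point at the same few labels therefore
gets one instruction per distinct label rather than one per entry.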

gcc/ChangeLog:

PR target/99988
* config/aarch64/aarch64-bti-insert.c (aarch64_bti_j_insn_p): New.
(rest_of_insert_bti): Avoid inserting duplicate bti j insns for
jump table targets.

gcc/testsuite/ChangeLog:

PR target/99988
* gcc.target/aarch64/pr99988.c: New test.

testsuite: Add -fchecking to dg-ice tests

In --enable-checking=release builds (which is the default on release
branches), I'm getting various extra FAILs that don't appear in
--enable-checking=yes builds.

XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp2a/nontype-class39.C  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++14 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++17 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++98 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++14 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++17 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++98 (test for excess errors)

These are tests that have dg-ice and most of those ICEs are checking ICEs
which go away in release checking when -fno-checking is the default.

The following patch adds -fchecking option to those.

2021-04-21  Jakub Jelinek  <jakub@redhat.com>

* g++.dg/cpp1z/constexpr-lambda26.C: Add dg-additional-options
-fchecking.
* g++.dg/cpp1y/auto-fn61.C: Likewise.
* g++.dg/cpp2a/nontype-class39.C: Likewise.
* g++.dg/cpp0x/constexpr-52830.C: Likewise.
* g++.dg/cpp0x/vt-88982.C: Likewise.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Add -fchecking to
dg-additional-options.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise.

x86: Add -mmwait for -mgeneral-regs-only

Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
-mgeneral-regs-only, and make -msse3 imply -mmwait.

gcc/

* config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
x86_64-*-* targets.
* common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
New.
(OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
(ix86_handle_option): Handle -mmwait.
* config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
__builtin_ia32_monitor and __builtin_ia32_mwait.
* config/i386/i386-options.c (isa2_opts): Add -mmwait.
(ix86_valid_target_attribute_inner_p): Likewise.
(ix86_option_override_internal): Enable mwait/monitor
instructions for -msse3.
* config/i386/i386.h (TARGET_MWAIT): New.
(TARGET_MWAIT_P): Likewise.
* config/i386/i386.opt: Add -mmwait.
* config/i386/mwaitintrin.h: New file.
* config/i386/pmmintrin.h: Include <mwaitintrin.h>.
* config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
TARGET_MWAIT.
(@sse3_monitor_<mode>): Likewise.
* config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
* doc/extend.texi: Document mwait target attribute.
* doc/invoke.texi: Document -mmwait.

gcc/testsuite/

* gcc.target/i386/monitor-2.c: New test.

libstdc++: Fix whitespace in license boilerplate

libstdc++-v3/ChangeLog:

* include/std/latch: Replace tab characters in license text.
* include/std/semaphore: Likewise.

Remove DEF_ENUM from stringop.def.

gcc/ChangeLog:

* config/i386/i386-options.c (DEF_ENUM): Remove it.
* config/i386/i386-opts.h (DEF_ENUM): Likewise.
* config/i386/stringop.def (DEF_ENUM): Likewise.

Revert "Use flags in dump_decl."

This reverts commit 9b6360b83cf5c684422c42301faee3a79ac42dc1.

Fix endian bug in rust demangler

libiberty/
PR demangler/100177
* rust-demangle.c (demangle_const_char): Properly print the
character value.

Use flags in dump_decl.

gcc/cp/ChangeLog:

* error.c (dump_decl): Use flags in dump_generic_node call.

Support LABEL_DECL in %qD directive.

gcc/cp/ChangeLog:

* error.c (dump_decl): Support anonymous labels.

gcc/ChangeLog:

* tree-cfg.c (gimple_verify_flow_info): Use %qD instead
of print_generic_expr.

testsuite/100176 - fix struct-layout-1_generate.c compile

With -Werror=return-type we run into compile fails complaining about
missing return stmts.

2021-04-21 Richard Biener <rguenther@suse.de>

PR testsuite/100176
* g++.dg/compat/struct-layout-1_generate.c: Add missing return.
* gcc.dg/compat/struct-layout-1_generate.c: Likewise.

cprop: Fix -fcompare-debug bug in constprop_register [PR100148]

The following testcase shows different behavior between -g and -g0
in constprop_register, if a flags register setter is separated
from a conditional jump using those flags with -g by a DEBUG_INSN.
As it uses just NEXT_INSN, for -g it will look at the DEBUG_INSN which is
not a conditional jump, while otherwise it would look at the conditional
jump and call cprop_jump.
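
The difference between the two lookups can be sketched on a simplified
instruction list.  The enum and both functions are hypothetical; they mirror
the idea of next_nondebug_insn, not its implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction kinds; real RTL insns are richer.
enum class insn_kind_sketch { set_flags, debug, cond_jump, other };

// Index of the first non-debug instruction after 'index', or insns.size()
// if there is none.
std::size_t next_nondebug_sketch(const std::vector<insn_kind_sketch>& insns,
                                 std::size_t index)
{
  std::size_t next = index + 1;
  while (next < insns.size() && insns[next] == insn_kind_sketch::debug)
    ++next;
  return next < insns.size() ? next : insns.size();
}

// True if the flags setter at 'index' is followed by a conditional jump
// once debug instructions are ignored, so -g and -g0 agree.
bool followed_by_cond_jump(const std::vector<insn_kind_sketch>& insns,
                           std::size_t index)
{
  const std::size_t next = next_nondebug_sketch(insns, index);
  return next < insns.size() && insns[next] == insn_kind_sketch::cond_jump;
}
```

With a plain "index + 1" lookup, the list containing a debug instruction
would give a different answer from the one without it.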

2021-04-21 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/100148
* cprop.c (constprop_register): Use next_nondebug_insn instead of
NEXT_INSN.

* g++.dg/opt/pr100148.C: New test.

Test simplified call in cgraph_node::analyze().

gcc/ChangeLog:

PR ipa/98815
* cgraphunit.c (cgraph_node::analyze): Remove duplicate
free_dominance_info calls.

Fix AIX libstdc++ semaphore support [PR100164]

> > The #error would not be hit if _GLIBCXX_HAVE_POSIX_SEMAPHORE were defined,
> > but it shows up in your error report.

> You now have pinpointed the problem.

> It's not that AIX doesn't have semaphore, but that the code previously
> had a fallback that hid a bug in the macros:

  // Use futex if available and didn't force use of POSIX
  using __fast_semaphore = __atomic_semaphore<__detail::__platform_wait_t>;
  using __fast_semaphore = __platform_semaphore;
  using __fast_semaphore = __atomic_semaphore<ptrdiff_t>;

> The problem is that libstdc++ configure defines
> _GLIBCXX_HAVE_POSIX_SEMAPHORE in config.h.  libstdc++ uses sed to
> rewrite config.h to c++config.h and prepends _GLIBCXX_, so c++config.h
> contains

> And bits/semaphore_base.h is not testing that corrupted macro.  Either
> semaphore_base.h needs to test for the corrupted macro, or libstdc++
> configure needs to define HAVE_POSIX_SEMAPHORE without itself
> prepending _GLIBCXX_  so that the c++config.h rewriting works
> correctly and defines the correct macro for semaphore_base.h.

The include/Makefile.am sed is:
        sed -e 's/HAVE_/_GLIBCXX_HAVE_/g' \
            -e 's/PACKAGE/_GLIBCXX_PACKAGE/g' \
            -e 's/VERSION/_GLIBCXX_VERSION/g' \
            -e 's/WORDS_/_GLIBCXX_WORDS_/g' \
            -e 's/_DARWIN_USE_64_BIT_INODE/_GLIBCXX_DARWIN_USE_64_BIT_INODE/g' \
            -e 's/_FILE_OFFSET_BITS/_GLIBCXX_FILE_OFFSET_BITS/g' \
            -e 's/_LARGE_FILES/_GLIBCXX_LARGE_FILES/g' \
            -e 's/ICONV_CONST/_GLIBCXX_ICONV_CONST/g' \
            -e '/[       ]_GLIBCXX_LONG_DOUBLE_COMPAT[   ]/d' \
            -e '/[       ]_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT[    ]/d' \
            < ${CONFIG_HEADER} >> $@ ;\
so for many macros one needs _GLIBCXX_ prefixes already in configure,
as can be seen in grep AC_DEFINE.*_GLIBCXX configure.ac acinclude.m4
But _GLIBCXX_HAVE_POSIX_SEMAPHORE is the only one that shouldn't have
that prefix because the sed is adding that.
E.g. on i686-linux, I see
grep _GLIBCXX__GLIBCXX c++config.h
that proves it is the only broken one.
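
The effect of the first sed expression can be reproduced with a short
sketch.  replace_all and prefix_have_macros are hypothetical helpers that
model only the 's/HAVE_/_GLIBCXX_HAVE_/g' substitution, not the whole
Makefile rule.

```cpp
#include <cstddef>
#include <string>

// Replace every occurrence of 'from' (assumed non-empty) with 'to', like
// sed's s/from/to/g for a literal pattern.
std::string replace_all(std::string text, const std::string& from,
                        const std::string& to)
{
  std::size_t pos = 0;
  while ((pos = text.find(from, pos)) != std::string::npos)
    {
      text.replace(pos, from.size(), to);
      pos += to.size();  // continue after the inserted text
    }
  return text;
}

// Models only the first sed expression from include/Makefile.am.
std::string prefix_have_macros(const std::string& config_line)
{
  return replace_all(config_line, "HAVE_", "_GLIBCXX_HAVE_");
}
```

A macro that configure defines as HAVE_POSIX_SEMAPHORE comes out with a
single prefix, whereas one that already carries _GLIBCXX_ ends up with the
doubled _GLIBCXX__GLIBCXX_ prefix that the grep above looks for.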

So this change fixes the acinclude.m4 side.

2021-04-21  Jakub Jelinek  <jakub@redhat.com>

PR libstdc++/100164
* acinclude.m4: For POSIX semaphores AC_DEFINE HAVE_POSIX_SEMAPHORE
rather than _GLIBCXX_HAVE_POSIX_SEMAPHORE.
* configure: Regenerated.
* config.h.in: Regenerated.

Simplify maybe_fold_reference API

This simplifies the maybe_fold_reference API reflecting that it
no longer canonicalizes refs (that's done with another function)
but only performs constant folding and thus does nothing for is_lhs.

This in turn allows ripping out quite a bit of dead code and one user
of valid_gimple_rhs_p.

2021-04-16 Richard Biener <rguenther@suse.de>

* gimple-fold.c (maybe_fold_reference): Remove is_lhs
parameter (and assume it to be false).
(fold_gimple_assign): Adjust, remove all callers of
maybe_fold_reference calling it with is_lhs true.
(gimple_fold_call): Likewise.
(fold_stmt_1): Likewise.

Fortran/OpenMP: Add 'omp depobj' and 'depend(mutexinoutset:'

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_namelist): Handle depobj + mutexinoutset
in the depend clause.
(show_omp_clauses, show_omp_node, show_code_node): Handle depobj.
* gfortran.h (enum gfc_statement): Add ST_OMP_DEPOBJ.
(enum gfc_omp_depend_op): Add OMP_DEPEND_UNSET,
OMP_DEPEND_MUTEXINOUTSET and OMP_DEPEND_DEPOBJ.
(gfc_omp_clauses): Add destroy, depobj_update and depobj.
(enum gfc_exec_op): Add EXEC_OMP_DEPOBJ
* match.h (gfc_match_omp_depobj): Match 'omp depobj'.
* openmp.c (gfc_match_omp_clauses): Add depobj + mutexinoutset
to depend clause.
(gfc_match_omp_depobj, resolve_omp_clauses, gfc_resolve_omp_directive):
Handle 'omp depobj'.
* parse.c (decode_omp_directive, next_statement, gfc_ascii_statement):
Likewise.
* resolve.c (gfc_resolve_code): Likewise.
* st.c (gfc_free_statement): Likewise.
* trans-openmp.c (gfc_trans_omp_clauses): Handle depobj + mutexinoutset
in the depend clause.
(gfc_trans_omp_depobj, gfc_trans_omp_directive): Handle EXEC_OMP_DEPOBJ.
* trans.c (trans_code): Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/depobj-1.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/depobj-1.f90: New test.
* gfortran.dg/gomp/depobj-2.f90: New test.