git.ipfire.org Git - thirdparty/gcc.git/log

]> git.ipfire.org Git - thirdparty/gcc.git/log

projects / thirdparty / gcc.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Andrew Stubbs [Fri, 3 Dec 2021 17:46:41 +0000 (17:46 +0000)]

libgomp, nvptx: low-latency memory allocator

This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU device, via the omp_low_lat_mem_space and omp_alloc. The memory
can be allocated, reallocated, and freed using a basic but fast algorithm,
is thread safe and the size of the low-latency heap can be configured using
the GOMP_NVPTX_LOWLAT_POOL environment variable.

The use of the PTX dynamic_smem_size feature means that the minimum version
requirement is now bumped to 4.1 (still old at this point).

libgomp/ChangeLog:

* allocator.c (MEMSPACE_ALLOC): New macro.
(MEMSPACE_CALLOC): New macro.
(MEMSPACE_REALLOC): New macro.
(MEMSPACE_FREE): New macro.
(dynamic_smem_size): New constants.
(omp_alloc): Use MEMSPACE_ALLOC.
Implement fall-backs for predefined allocators.
(omp_free): Use MEMSPACE_FREE.
(omp_calloc): Use MEMSPACE_CALLOC.
Implement fall-backs for predefined allocators.
(omp_realloc): Use MEMSPACE_REALLOC.
Implement fall-backs for predefined allocators.
* config/nvptx/team.c (__nvptx_lowlat_heap_root): New variable.
(__nvptx_lowlat_pool): New asm varaible.
(gomp_nvptx_main): Initialize the low-latency heap.
* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
* config/nvptx/allocator.c: New file.
* testsuite/libgomp.c/allocators-1.c: New test.
* testsuite/libgomp.c/allocators-2.c: New test.
* testsuite/libgomp.c/allocators-3.c: New test.
* testsuite/libgomp.c/allocators-4.c: New test.
* testsuite/libgomp.c/allocators-5.c: New test.
* testsuite/libgomp.c/allocators-6.c: New test.

commit | commitdiff | tree

Andrew Stubbs [Tue, 21 Dec 2021 10:09:08 +0000 (10:09 +0000)]

nvptx: bump default to PTX 4.1

gcc/ChangeLog:

* config/nvptx/nvptx-opts.h (ptx_version): Change PTX_VERSION_3_1 to
PTX_VERSION_4_1.
* config/nvptx/nvptx.c (nvptx_file_start): Bump minimum PTX version
to 4.1.
* config/nvptx/nvptx.opt (ptx_version): Add 4.1. Change default.
doc/invoke.texi: -mptx default is now 4.1.

commit | commitdiff | tree

Andrew Stubbs [Thu, 16 Dec 2021 15:30:05 +0000 (15:30 +0000)]

OpenMP: allow requires dynamic_allocators

There's no need to reject the dynamic_allocators requires directive because
we actually do support the feature, and it doesn't have to actually "do"
anything.

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_requires): Don't "sorry" dynamic_allocators.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_requires): Don't "sorry" dynamic_allocators.

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_requires): Don't "sorry" dynamic_allocators.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/requires-8.f90: Reinstate dynamic allocators
requirement.

commit | commitdiff | tree

Chung-Lin Tang [Thu, 2 Dec 2021 10:24:03 +0000 (18:24 +0800)]

fortran: OpenMP/OpenACC array mapping alignment fix (PR90030)

Fix issue with the Fortran front-end when mapping arrays: when creating the
data MEM_REF for the map clause, there was a convention of casting the
referencing pointer to 'c_char *' by
fold_convert (build_pointer_type (char_type_node), ptr).

This causes the alignment passed to the libgomp runtime for array data
hardwared to '1', and causes alignment errors on the offload target.

This patch fixes this by removing the char_type_node pointer converts, and
adding gcc_asserts to ensure POINTER_TYPE_P (TREE_TYPE (ptr)).

PR fortran/90030

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer
to char_type_node, add gcc_assert of POINTER_TYPE_P.
(gfc_trans_omp_array_section): Likewise.
(gfc_trans_omp_clauses): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/goacc/finalize-1.f: Adjust scan test.
* gfortran.dg/gomp/affinity-clause-1.f90: Likewise.
* gfortran.dg/gomp/affinity-clause-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-4.f90: Likewise.
* gfortran.dg/gomp/defaultmap-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-6.f90: Likewise.
* gfortran.dg/gomp/map-3.f90: Likewise.
* gfortran.dg/gomp/pr78260-2.f90: Likewise.
* gfortran.dg/gomp/pr78260-3.f90: Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-fortran/pr90030.f90: New test.
* testsuite/libgomp.fortran/pr90030.f90: New test.

(cherry picked from commit 1ac7a8c9e4798d352eb8c64905dd38086af4e1cd)

commit | commitdiff | tree

Chung-Lin Tang [Fri, 3 Dec 2021 09:27:17 +0000 (17:27 +0800)]

fortran: Fix setting of array lower bound for named arrays

This patch fixes a case of setting array low-bounds, found for particular uses
of SOURCE=/MOLD=. This adjusts the relevant part in gfc_trans_allocate() to
set e3_has_nodescriptor only for non-named arrays.

2021-12-03 Tobias Burnus <tobias@codesourcery.com>

gcc/fortran/ChangeLog:

* trans-stmt.c (gfc_trans_allocate): Set e3_has_nodescriptor to true
only for non-named arrays.

gcc/testsuite/ChangeLog:

* gfortran.dg/allocate_with_source_26.f90: Adjust testcase.
* gfortran.dg/allocate_with_mold_4.f90: New testcase.

(cherry picked from commit 6262e3a22b3d86afc116480bc59a7bb30b0cfd40)

commit | commitdiff | tree

Tobias Burnus [Wed, 24 Nov 2021 10:08:40 +0000 (11:08 +0100)]

Update GMP/MPFR/MPC/ISL version in contrib/download_prerequisites

contrib/
* download_prerequisites: Update to gmp-6.2.1, mpfr-4.1.0,
mpc-1.2.1 and isl-0.24.
* prerequisites.md5: Update hash.
* prerequisites.sha512: Likewise.

(cherry picked from commit be60f80247feb72b47af62cda66c82a0fa6c1cdc)

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:22:48 +0000 (16:22 +0100)]

openacc: Adjust test expectations to new "kernels" handling

Adjust tests to changed expectations with the new Graphite-based
"kernels" handling.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr84955-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Adjust.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Adjust.
* testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90: Adjust.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/parallel-loop-auto-reduction-2.f90: Removed.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/acc-icf.c: Adjust.
* c-c++-common/goacc/cache-3-1.c: Adjust.
* c-c++-common/goacc/classify-kernels-unparallelized-graphite.c: Adjust.
* c-c++-common/goacc/classify-kernels.c: Adjust.
* c-c++-common/goacc/classify-serial.c: Adjust.
* c-c++-common/goacc/if-clause-2.c: Adjust.
* c-c++-common/goacc/kernels-decompose-1.c: Adjust.
* c-c++-common/goacc/kernels-decompose-2.c: Adjust.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Adjust.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Adjust.
* c-c++-common/goacc/kernels-loop-3-acc-loop.c: Adjust.
* c-c++-common/goacc/kernels-loop-3.c: Adjust.
* c-c++-common/goacc/loop-2-kernels.c: Adjust.
* c-c++-common/goacc/nested-reductions-2-parallel.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c: Adjust.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-conditional-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Adjust.
* c-c++-common/goacc/routine-1.c: Adjust.
* c-c++-common/goacc/routine-level-of-parallelism-2.c: Adjust.
* c-c++-common/goacc/routine-nohost-1.c: Adjust.
* c-c++-common/goacc/uninit-copy-clause.c: Adjust.
* gcc.dg/goacc/loop-processing-1.c: Adjust.
* gcc.dg/goacc/nested-function-1.c: Adjust.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust.
* gfortran.dg/goacc/classify-kernels.f95: Adjust.
* gfortran.dg/goacc/classify-parallel.f95: Adjust.
* gfortran.dg/goacc/classify-routine.f95: Adjust.
* gfortran.dg/goacc/classify-serial.f95: Adjust.
* gfortran.dg/goacc/common-block-3.f90: Adjust.
* gfortran.dg/goacc/gang-static.f95: Adjust.
* gfortran.dg/goacc/kernels-decompose-1.f95: Adjust.
* gfortran.dg/goacc/kernels-decompose-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-data-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-inner.f95: Adjust.
* gfortran.dg/goacc/kernels-loop.f95: Adjust.
* gfortran.dg/goacc/kernels-tree.f95: Adjust.
* gfortran.dg/goacc/loop-2-kernels.f95: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-2.f90: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-3.f90: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-4.f90: Adjust.
* gfortran.dg/goacc/nested-function-1.f90: Adjust.
* gfortran.dg/goacc/nested-reductions-2-parallel.f90: Adjust.
* gfortran.dg/goacc/pr72741.f90: Adjust.
* gfortran.dg/goacc/private-explicit-kernels-1.f95: Adjust.
* gfortran.dg/goacc/private-predetermined-kernels-1.f95: Adjust.
* gfortran.dg/goacc/routine-module-mod-1.f90: Adjust.
* gfortran.dg/goacc/uninit-copy-clause.f95: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c: Removed.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:22:29 +0000 (16:22 +0100)]

graphite: Accept loops without data references

It seems that the check that rejects loops without data references is
only included to avoid handling non-profitable loops. Including those
loops in Graphite's analysis enables more consistent diagnostic
messages in OpenACC "kernels" code and does not introduce any
testsuite regressions. If executing Graphite on loops without
data references leads to noticeable compile time slow-downs for
non-OpenACC users of Graphite, the check can be re-introduced but
restricted to non-OpenACC functions.

gcc/ChangeLog:

* graphite-scop-detection.c (scop_detection::harmful_loop_in_region):
Remove check for loops without data references.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:21:57 +0000 (16:21 +0100)]

graphite: Adjust scop loop-nest choice

The find_common_loop function is used in Graphite to obtain a common
super-loop of all loops inside a SCoP.  The function is applied to the
loop of the destination block of the edge that leads into the SESE
region and the loop of the source block of the edge that exits the
region.  The exit block is usually introduced by the canonicalization
of the loop structure that Graphite does to support its code
generation. If it is empty, it may happen that it belongs to the outer
fake loop.  This way, build_alias_set may end up analysing
data-references with respect to this loop although there may exist a
proper super-loop of the SCoP loops.  This does not seem to be correct
in general and it leads to problems with runtime alias check creation
which fails if executed on a loop without niter information.

gcc/ChangeLog:

        * graphite-scop-detection.c (scop_context_loop): New function.
        (build_alias_set): Use scop_context_loop instead of find_common_loop.
        * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
        * graphite.h (scop_context_loop): New declaration.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:21:42 +0000 (16:21 +0100)]

graphite: Tune parameters for OpenACC use

The default values of some parameters that restrict Graphite's
resource usage are too low for many OpenACC codes.  Furthermore,
exceeding the limits does not alwas lead to user-visible diagnostic
messages.

This commit increases the parameter values on OpenACC functions.  The
values were chosen to allow for the analysis of all "kernels" regions
in the SPEC ACCEL v1.3 benchmark suite.  Warnings about exceeded
Graphite-related limits are added to the -fopt-info-missed
output. Those warnings are phrased in a uniform way that intentionally
refers to the "data-dependence analysis" of "OpenACC loops" instead of
"a failure in Graphite" to make them easier to understand for users.

gcc/ChangeLog:

        * graphite-optimize-isl.c (optimize_isl): Adjust
param_max_isl_operations value for OpenACC functions and add
special warnings if value gets exceeded.

* graphite-scop-detection.c (build_scops): Likewise for
param_graphite_max_arrays_per_scop.

gcc/testsuite/ChangeLog:

        * gcc.dg/goacc/graphite-parameter-1.c: New test.
        * gcc.dg/goacc/graphite-parameter-2.c: New test.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:20:56 +0000 (16:20 +0100)]

openacc: Disable pass_pre on outlined functions analyzed by Graphite

The additional dependences introduced by partial redundancy
elimination proper and by the code hoisting step of the pass very
often cause Graphite to fail on OpenACC functions. On the other hand,
the pass can also enable the analysis of OpenACC loops (cf. e.g. the
loop-auto-transfer-4.f90 testcase), for instance, because full
redundancy elimination removes definitions that would otherwise
prevent the creation of runtime alias checks outside of the SCoP.

This commit disables the actual partial redundancy elimination step as
well as the code hoisting step of pass_pre on OpenACC functions that
might be handled by Graphite.

gcc/ChangeLog:

* tree-ssa-pre.c (insert): Skip any insertions in OpenACC
functions that might be processed by Graphite.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:20:41 +0000 (16:20 +0100)]

openacc: Handle internal function calls in pass_lim

The loop invariant motion pass correctly refuses to move statements
out of a loop if any other statement in the loop is unanalyzable.  The
pass does not know how to handle the OpenACC internal function calls
which was not necessary until recently when the OpenACC device
lowering pass was moved to a later position in the pass pipeline.

This commit changes pass_lim to ignore the OpenACC internal function
calls which do not contain any memory references. The hoisting enabled
by this change can be useful for the data-dependence analysis in
Graphite; for instance, in the outlined functions for OpenACC regions,
all invariant accesses to the ".omp_data_i" struct should be hoisted
out of the OpenACC loop.  This is particularly important for variables
that were scalars in the original loop and which have been turned into
accesses to the struct by the outlining process.  Not hoisting those
can prevent scalar evolution analysis which is crucial for Graphite.
Since any hoisting that introduces intermediate names - and hence,
"fake" dependences - inside the analyzed nest can be harmful to
data-dependence analysis, a flag to restrict the hoisting in OpenACC
functions is added to the pass. The pass instance that executes before
Graphite now runs with this flag set to true and the pass instance
after Graphite runs unrestricted.

A more precise way of selecting the statements for which hoisting
should be enabled is left for a future improvement.

gcc/ChangeLog:
        * passes.def: Set restrict_oacc_hoisting to true for the early
pass_lim instance.
* tree-ssa-loop-im.c (movement_possibility): Add
restrict_oacc_hoisting flag to function; restrict movement if set.
(compute_invariantness): Add restrict_oacc_hoisting flag and pass it on.
        (gather_mem_refs_stmt): Skip IFN_GOACC_LOOP and IFN_UNIQUE
calls.
        (loop_invariant_motion_in_fun): Add restrict_oacc_hoisting flag and
pass it on.
        (pass_lim::execute): Pass on new flags.
* tree-ssa-loop-manip.h (loop_invariant_motion_in_fun): Adjust declaration.
* gimple-loop-interchange.cc (pass_linterchange::execute): Adjust call to
loop_invariant_motion_in_fun.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:20:15 +0000 (16:20 +0100)]

openacc: Warn about "independent" "kernels" loops with data-dependences

This commit concerns loops in OpenACC "kernels" region that have been marked
up with an explicit "independent" clause by the user, but for which Graphite
found data dependences.  A discussion on the private internal OpenACC mailing
list suggested that warning the user about the dependences woud be a more
acceptable solution than reverting the user's decision. This behavior is
implemented by the present commit.

gcc/ChangeLog:

        * common.opt: Add flag Wopenacc-false-independent.
        * omp-offload.c (oacc_loop_warn_if_false_independent): New function.
        (oacc_loop_fixed_partitions): Call from here.

commit | commitdiff | tree

Andrew Stubbs [Tue, 16 Nov 2021 15:19:53 +0000 (16:19 +0100)]

openacc: Add runtime alias checking for OpenACC kernels

This commit adds the code generation for the runtime alias checks for
OpenACC loops that have been analyzed by Graphite. The runtime alias
check condition gets generated in Graphite. It is evaluated by the
code generated for the IFN_GOACC_LOOP internal function calls. If
aliasing is detected at runtime, the execution dimensions get adjusted
to execute the affected loops sequentially.

gcc/ChangeLog:

* graphite-isl-ast-to-gimple.c: Include internal-fn.h.
(graphite_oacc_analyze_scop): Implement runtime alias checks.
* omp-expand.c (expand_oacc_for): Add an additional "noalias" parameter
to GOACC_LOOP internal calls, and initialise it to integer_one_node.
* omp-offload.c (oacc_xform_loop): Integrate the runtime alias check
into the GOACC_LOOP expansion.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-2.c: New test.

commit | commitdiff | tree

Andrew Stubbs [Tue, 16 Nov 2021 15:19:23 +0000 (16:19 +0100)]

openacc: Add data optimization pass

Address PR90591 "Avoid unnecessary data transfer out of OMP
construct", for simple (but common) cases.

This commit adds a pass that optimizes data mapping clauses.
Currently, it can optimize copy/map(tofrom) clauses involving scalars
to copyin/map(to) and further to "private".  The pass is restricted
"kernels" regions but could be extended to other types of regions.

gcc/ChangeLog:

        * Makefile.in: Add pass.
        * doc/gimple.texi: TODO.
        * gimple-walk.c (walk_gimple_seq_mod): Adjust for backward walking.
        * gimple-walk.h (struct walk_stmt_info): Add field.
        * passes.def: Add new pass.
        * tree-pass.h (make_pass_omp_data_optimize): New declaration.
        * omp-data-optimize.cc: New file.

libgomp/ChangeLog:

        * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Expect optimization messages.
        * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.

gcc/testsuite/ChangeLog:

        * c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.
        * c-c++-common/goacc/uninit-copy-clause.c: Likewise.
        * gfortran.dg/goacc/uninit-copy-clause.f95: Likewise.
        * c-c++-common/goacc/omp_data_optimize-1.c: New test.
        * g++.dg/goacc/omp_data_optimize-1.C: New test.
        * gfortran.dg/goacc/omp_data_optimize-1.f90: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:18:02 +0000 (16:18 +0100)]

Add function for printing a single OMP_CLAUSE

Commit 89f4f339130c ("For 'OMP_CLAUSE' in 'dump_generic_node', dump
the whole OMP clause chain") changed the dumping behavior for
OMP_CLAUSEs.  The old behavior is required for a follow-up
commit ("openacc: Add data optimization pass") that optimizes single
OMP_CLAUSEs.

gcc/ChangeLog:

        * tree-pretty-print.c (print_omp_clause_to_str): Add new function.
        * tree-pretty-print.h (print_omp_clause_to_str): Add declaration.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:17:48 +0000 (16:17 +0100)]

openacc: Remove unused partitioning in "kernels" regions

With the old "kernels" handling, unparallelized regions would
get executed with 1x1x1 partitioning even if the user provided
explicit num_gangs, num_workers clauses etc.

This commit restores this behavior by removing unused partitioning
after assigning the parallelism dimensions to loops.

gcc/ChangeLog:

        * omp-offload.c (oacc_remove_unused_partitioning): New function
        for removing partitioning that is not used by any loop.
        (oacc_validate_dims): Call oacc_remove_unused_partitioning and
        enable warnings about unused partitioning.

libgomp/ChangeLog:

        * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Adjust
        expectations.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:17:15 +0000 (16:17 +0100)]

openacc: Add further kernels tests

Add some copies of tests to continue covering the old "parloops"-based
"kernels" implementation - until it gets removed from GCC - and
add further tests for the new Graphite-based implementation.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-fortran/parallel-loop-auto-reduction-2.f90:
New test.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/classify-kernels-unparallelized-graphite.c:
New test.
* c-c++-common/goacc/classify-kernels-unparallelized-parloops.c:
New test.
* c-c++-common/goacc/kernels-decompose-1-parloops.c: New test.
* c-c++-common/goacc/kernels-reduction-parloops.c: New test.
* c-c++-common/goacc/loop-auto-reductions.c: New test.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto-parloops.c:
New test.
* c-c++-common/goacc/note-parallelism-kernels-loops-1.c: New test.
* c-c++-common/goacc/note-parallelism-kernels-loops-parloops.c:
New test.
* gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95:
New test.
* gfortran.dg/goacc/kernels-conversion.f95: New test.
* gfortran.dg/goacc/kernels-decompose-1-parloops.f95: New test.
* gfortran.dg/goacc/kernels-decompose-parloops-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-parloops-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-parloops-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-parloops.f95: New test.
* gfortran.dg/goacc/kernels-reductions.f90: New test.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:16:47 +0000 (16:16 +0100)]

openacc: Add "can_be_parallel" flag info to "graph" dumps

gcc/ChangeLog:

* graph.c (oacc_get_fn_attrib): New declaration.
(find_loop_location): New declaration.
(draw_cfg_nodes_for_loop): Print value of the
can_be_parallel flag at the top of loops in OpenACC
functions.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:16:22 +0000 (16:16 +0100)]

openacc: Use Graphite for dependence analysis in "kernels" regions

This commit changes the handling of OpenACC "kernels" to use Graphite
for dependence analysis. To this end, it first introduces a new
internal representation for "kernels" regions which should be analyzed
by Graphite in pass_omp_oacc_kernels_decompose.  This is now the
default for all "kernels" regions, but the old handling is still
available through the command line parameter
"--param=openacc_kernels=decompose-parloops".  The handling of this
new region type in the omp lowering and omp offloading passes follows
the existing handling for "parallel" regions.  This replaces the
specialized handling for "kernels" regions that was previously used
and which was in limited in many ways.

Graphite is adjusted to be able to analyze the OpenACC functions that
get outlined from the "kernels" regions. It is enabled to handle the
internal function calls that contain information about OpenACC
constructs. In some places where function calls would be rejected by
Graphite, those calls need to be ignored. In other places, information
about the loop step, bounds etc. needs to be extracted from the
calls. The goal is to enable an analysis of the original loop
parameters although the omp lowering and expansion steps have already
modified the loop structure.  Some parallelization-enabling constructs
such as OpenACC "reduction" and "private"/"firstprivate" clauses must
be recognized and the data-dependences must be adjusted to reflect the
semantics of those constructs.  The data-dependence analysis step in
Graphite has so far been tied to the code generation step.  This
commit introduces a separate data-dependence analysis step that avoids
the code generation.  This is necessary because adjusting the code
generation to create a correct OpenACC loop structure would require
very considerable effort and the goal of this commit is to implement
the dependence analysis only. The ability to use Graphite for
dependence analysis without its code generation might be of
independent interest, but it is so far used for OpenACC purposes
only. In general, all changes to Graphite try to avoid affecting other
uses of Graphite as much as possible.

gcc/ChangeLog:

* Makefile.in: Add graphite-oacc.o
* cfgloop.c (alloc_loop): Set can_be_parallel_valid_p to false.
* cfgloop.h: Add can_be_parallel_valid_p field.
* cfgloopmanip.c (copy_loop_info): Add assert.
* config/nvptx/nvptx.c (nvptx_goacc_reduction_setup):
* doc/invoke.texi: Adjust param openacc-kernels description.
* doc/passes.texi: Adjust pass_ipa_oacc_kernels description.
* flag-types.h (enum openacc_kernels):Add
OPENACC_KERNELS_DECOMPOSE_PARLOOPS.
* gimple-pretty-print.c (dump_gimple_omp_target): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
* gimple.h (enum gf_mask): Add
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE and
widen GF_OMP_TARGET_KIND_MASK.
(is_gimple_omp_oacc): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
(is_gimple_omp_offloaded): Likewise.
* gimplify.c (gimplify_omp_for): Enable reduction localization
for "kernels" regions.
(gimplify_omp_workshare): Likewise.
* graphite-dependences.c (scop_get_reads_and_writes): Handle
"kills" and "reduction" PDRs.
(apply_schedule_on_deps): Add dump output for intermediate
steps of the dependence computation to enable understanding
of unexpected dependences.
(carries_deps): Likewise.
(scop_get_dependences): Handle "kill" operations and add dump
output.
* graphite-isl-ast-to-gimple.c (visit_schedule_loop_node): New function.
(graphite_oacc_analyze_scop): New function.
* graphite-optimize-isl.c (optimize_isl): Remove "static" and
add argument to identify OpenACC use; don't fail on unchanged
schedule in this case.
* graphite-poly.c (new_poly_dr): Handle "kills".
(print_pdr): Likewise.
(new_gimple_poly_bb): Likewise.
(free_gimple_poly_bb): Likewise.
(new_scop): Handle "reduction", "private", and "firstprivate"
hash sets.
(free_scop): Likewise.
(print_isl_space): New function.
(debug_isl_space): New function.
* graphite-scop-detection.c (scop_detection::can_represent_loop):
Don't fail if niter is 0 in OpenACC functions.
(scop_detection::add_scop): Don't reject regions with only one
loop in OpenACC functions.
(ignored_oacc_internal_call_p): New function.
(scan_tree_for_params): Handle VIEW_CONVERT_EXPR.
(stmt_has_side_effects): Ignore internal OpenACC function calls.
(add_write): Likewise.
(add_read): Likewise.
(add_kill): New function.
(add_kills): New function.
(add_oacc_kills): New function.
(try_generate_gimple_bb): Kill false dependences for OpenACC
"private"/"firstprivate" vars.
(gather_bbs::gather_bbs): Determin OpenACC
"private"/"firstprivate" vars in region.
(gather_bbs::before_dom_children): Add assert.
(determine_openacc_reductions): New function.
(build_scops): Determine OpenACC "reduction" vars in SCoP.
* graphite-sese-to-poly.c (oacc_ifn_call_extract): New declaration.
(oacc_internal_call_p): New function.
(build_poly_dr): Ignore internal OpenACC function calls,
* handle "reduction" refs.
(build_poly_sr): Likewise; handle "kill" operations.
* graphite.c (graphite_transform_loops): Accept functions with
only a single loop.
(oacc_enable_graphite_p): New function.
(gate_graphite_transforms): Enable pass on OpenACC functions.
* graphite.h (enum poly_dr_type): Add PDR_KILL.
(struct poly_dr): Add "is_reduction" field.
(new_poly_dr): Add argument to declaration.
(pdr_kill_p): New function.
(print_isl_space): New declaration.
(debug_isl_space): New declaration.
(struct scop): Add fields "reductions_vars",
"oacc_firstprivate_vars", and "oacc_private_scalars".
(optimize_isl): New declaration.
(graphite_oacc_analyze_scop): New declaration.
* internal-fn.c (expand_UNIQUE): Handle
IFN_UNIQUE_OACC_PRIVATE_SCALAR and IFN_UNIQUE_OACC_FIRSTPRIVATE
* internal-fn.h: Add OACC_PRIVATE_SCALAR and OACC_FIRSTPRIVATE
* omp-expand.c (struct omp_region): Adjust comment.
(expand_omp_taskloop_for_inner):
(expand_omp_for): Add asserts about expected "kernels" region types.
(mark_loops_in_oacc_kernels_region): Likewise.
(expand_omp_target): Likewise; handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
(build_omp_regions_1): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
Likewise.
(omp_make_gimple_edges): Likewise.
* omp-general.c (oacc_get_kernels_attrib): New function.
(oacc_get_fn_dim_size): Allow argument to be NULL.
* omp-general.h (oacc_get_kernels_attrib): New declaration.
* omp-low.c (struct omp_context): Add fields
"oacc_firstprivate_vars" and "oacc_private_scalars".
(was_originally_oacc_kernels): New function.
(is_oacc_kernels):
(is_oacc_kernels_decomposed_graphite_part): New function.
(new_omp_context): Allocate "oacc_first_private_vars" and
"oacc_private_scalars" ...
(delete_omp_context): ... and free from here.
(oacc_record_firstprivate_var_clauses): New function.
(oacc_record_private_scalars): New function.
(scan_sharing_clauses): Call functions to record "private"
scalars and "firstprivate" variables.
(check_oacc_kernel_gwv): Add assert.
(ctx_in_oacc_kernels_region): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
(scan_omp_for): Likewise.
(check_omp_nesting_restrictions): Likewise.
(lower_oacc_head_mark): Likewise.
(lower_omp_for): Likewise.
(lower_omp_target): Create "private" and "firstprivate" marker
call statements.
(lower_oacc_head_tail): Adjust "private" and "firstprivate"
marker calls.
(lower_oacc_reductions): Emit "private" and "firstprivate"
marker call statements.
(make_oacc_firstprivate_vars_marker): New function.
(make_oacc_private_scalars_marker): New function.
* omp-oacc-kernels-decompose.cc (adjust_region_code_walk_stmt_fn):
Assign GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE to
region using the new "kernels" handling.
(make_region_seq): Adjust default region type for new
"kernels" handling; no more exceptions, let Graphite handle everything.
(make_region_loop_nest): Likewise; add dump output and assert.
(adjust_nested_loop_clauses): Stop creating "auto" clauses if
loop has "independent", "gang" etc.
(transform_kernels_loop_clauses): Likewise.
* omp-offload.c (oacc_extract_loop_call): New function.
(oacc_loop_get_cfg_loop): New function.
(can_be_parallel_str): New function.
(oacc_loop_can_be_parallel_p): New function.
(oacc_parallel_kernels_graphite_fun_p): New function.
(oacc_parallel_fun_p): New function.
(oacc_loop_transform_auto_into_independent): New function, ...
(oacc_loop_fixed_partitions): ... called from here to transfer
the result of Graphite's analysis to the loop.
(execute_oacc_loop_designation): Handle "oacc
functions with "parallel_kernels_graphite" attribute.
(execute_oacc_device_lower): Handle
IFN_UNIQUE_OACC_PRIVATE_SCALAR and IFN_UNIQUE_OACC_FIRSTPRIVATE.
* omp-offload.h (oacc_extract_loop_call): Add declaration.
* params.opt: Add "param=openacc-kernels" value "decompose-parloops".
* sese.c (scalar_evolution_in_region): "Redirect" SCEV
analysis to outer loop for IFN_GOACC_LOOP calls.
* sese.h: Add field "kill_scalar_refs".
* tree-chrec.c (chrec_fold_plus_1): Handle VIEW_CONVERT_EXPR
like CASE_CONVERT.
* tree-data-ref.c (dump_data_reference): Include
* DR_BASE_ADDRESS and DR_OFFSET in dump output.
(get_references_in_stmt): Don't reject OpenACC internal function
calls.
(graphite_find_data_references_in_stmt): Remove unused variable.
* tree-parloops.c (pass_parallelize_loops::execute): Disable
pass with the new kernels handling, enable if requested explicitly.
* tree-scalar-evolution.c (set_scev_analyze_openacc_calls):
Set flag to enable the analysis of internal OpenACC function
calls (use for Graphite only).
(oacc_call_analyzable_p): New function.
(oacc_ifn_call_extract): New function.
(oacc_simplify): New function.
(add_to_evolution): Simplify OpenACC internal function calls
if applicable.
(follow_ssa_edge_binary): Likewise.
(follow_ssa_edge_expr): Likewise.
(follow_copies_to_constant): Likewise.
(analyze_initial_condition): Likewise.
(interpret_loop_phi): Likewise.
(interpret_gimple_call): New function.
(interpret_rhs_expr): Likewise.
(instantiate_scev_name): Likewise.
(analyze_scalar_evolution_1): Handle GIMPLE_CALL, handle default definitions.
(expression_expensive_p): Consider internal OpenACC calls to
be cheap.
* tree-scalar-evolution.h (set_scev_analyze_openacc_calls):
New declaration.
(oacc_call_analyzable_p): New declaration.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
lhs of internal OpenACC function calls necessary.
* tree-ssa-ifcombine.c (recognize_if_then_else):
* tree-ssa-loop-niter.c (oacc_call_analyzable_p):
(oacc_ifn_call_extract): New declaration.
(interpret_gimple_call): New delcaration.
(expand_simple_operations): Handle internal OpenACC function calls.
* tree-ssa-loop.c (gate_oacc_kernels): Disable for new
"kernels" handling.
* graphite-oacc.c: New file.
* graphite-oacc.h: New file.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Adjust.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/kernels-independent.f90: Adjust.
* testsuite/libgomp.oacc-fortran/kernels-loop-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Adjust.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/classify-kernels.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Adjust.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Removed.
* c-c++-common/goacc/kernels-reduction.c: Removed.
* gfortran.dg/goacc/loop-auto-transfer-2.f90: New test.
* gfortran.dg/goacc/loop-auto-transfer-3.f90: New test.
* gfortran.dg/goacc/loop-auto-transfer-4.f90: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:15:08 +0000 (16:15 +0100)]

graphite: Add runtime alias checking

Graphite rejects a SCoP if it contains a pair of data references for
which it cannot determine statically if they may alias. This happens
very often, for instance in C code which does not use explicit
"restrict".  This commit adds the possibility to analyze a SCoP
nevertheless and perform an alias check at runtime.  Then, if aliasing
is detected, the execution will fall back to the unoptimized SCoP.

TODO This needs more testing on non-OpenACC code.

gcc/ChangeLog:

        * common.opt: Add fgraphite-runtime-alias-checks.
        * graphite-isl-ast-to-gimple.c
        (generate_alias_cond): New function.
        (graphite_regenerate_ast_isl): Use from here.
        * graphite-poly.c (new_scop): Create unhandled_alias_ddrs vec ...
(free_scop): and release here.
        * graphite-scop-detection.c (dr_defs_outside_region): New function.
        (dr_well_analyzed_for_runtime_alias_check_p): New function.
        (graphite_runtime_alias_check_p): New function.
        (build_alias_set): Record unhandled alias ddrs for later alias check
        creation if flag_graphite_runtime_alias_checks is true instead
        of failing.
        * graphite.h (struct scop): Add field unhandled_alias_ddrs.
        * sese.h (has_operands_from_region_p): New function.
gcc/testsuite/ChangeLog:

        * gcc.dg/graphite/alias-1.c: New test.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:14:41 +0000 (16:14 +0100)]

Move compute_alias_check_pairs to tree-data-ref.c

Move this function from tree-loop-distribution.c to tree-data-ref.c
and make it non-static to enable its use from other parts of GCC.

gcc/ChangeLog:
* tree-loop-distribution.c (data_ref_segment_size): Remove function.
(latch_dominated_by_data_ref): Likewise.
(compute_alias_check_pairs): Likewise.

* tree-data-ref.c (data_ref_segment_size): New function,
copied from tree-loop-distribution.c
(compute_alias_check_pairs): Likewise.
(latch_dominated_by_data_ref): Likewise.

* tree-data-ref.h (compute_alias_check_pairs): New declaration.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:13:51 +0000 (16:13 +0100)]

Fix branch prediction dump message

Instead of, for instance, "Loop got predicted 1 to iterate 10 times"
the message should be "Loop 1 got predicted to iterate 10 times".

gcc/ChangeLog:

* predict.c (pass_profile::execute): Fix dump message.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:13:03 +0000 (16:13 +0100)]

graphite: Fix minor mistakes in comments

gcc/ChangeLog:

* graphite-sese-to-poly.c (build_poly_sr_1): Fix a typo and
a reference to a variable which does not exist.
* graphite-isl-ast-to-gimple.c (gsi_insert_earliest): Fix typo
in comment.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:12:23 +0000 (16:12 +0100)]

graphite: Rename isl_id_for_ssa_name

The SSA names for which this function gets used are always SCoP
parameters and hence "isl_id_for_parameter" is a better name.  It also
explains the prefix "P_" for those names in the ISL representation.

gcc/ChangeLog:

* graphite-sese-to-poly.c (isl_id_for_ssa_name): Rename to ...
  (isl_id_for_parameter): ... this new function name.
  (build_scop_context): Adjust function use.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:11:21 +0000 (16:11 +0100)]

graphite: Extend SCoP detection dump output

Extend dump output to make understanding why Graphite rejects to
include a loop in a SCoP easier (for GCC developers).

ChangeLog:

        * graphite-scop-detection.c (scop_detection::can_represent_loop):
Output reason for failure to dump file.
        (scop_detection::harmful_loop_in_region): Likewise.
        (scop_detection::graphite_can_represent_expr): Likewise.
        (scop_detection::stmt_has_simple_data_refs_p): Likewise.
        (scop_detection::stmt_simple_for_scop_p): Likewise.
(print_sese_loop_numbers): New function.
        (scop_detection::add_scop): Use from here to print loops in
rejected SCoP.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:07:34 +0000 (16:07 +0100)]

openacc: Move pass_oacc_device_lower after pass_graphite

The OpenACC device lowering pass must run after the Graphite pass to
allow for the use of Graphite for automatic parallelization of kernels
regions in the future. Experimentation has shown that it is best,
performancewise, to run pass_oacc_device_lower together with the
related passes pass_oacc_loop_designation and pass_oacc_gimple_workers
early after pass_graphite in pass_tree_loop, at least if the other
tree loop passes are not adjusted. In particular, to enable
vectorization which is crucial for GCN offloading, device lowering
should happen before pass_vectorize. To bring the loops contained in
the offloading functions into the shape expected by the loop
vectorizer, we have to make sure that some passes that previously were
executed only once before pass_tree_loop are also executed on the
offloading functions. To ensure the execution of
pass_oacc_device_lower if pass_tree_loop does not execute (no loops,
no optimizations), we introduce two further copies of the pass to the
pipeline that run if there are no loops or if no optimization is
performed.

gcc/ChangeLog:

* omp-general.c (oacc_get_fn_dim_size): Return 0 on
missing "dims".
* omp-offload.c (pass_oacc_loop_designation::clone): New
member function.
(pass_oacc_gimple_workers::clone): Likewise.
(pass_oacc_gimple_device_lower::clone): Likewise.
* passes.c (pass_data_no_loop_optimizations): New pass_data.
(class pass_no_loop_optimizations): New pass.
(make_pass_no_loop_optimizations): New function.
* passes.def: Move pass_oacc_{loop_designation,
gimple_workers, device_lower} into tree_loop, and add
copies to pass_tree_no_loop and to new
pass_no_loop_optimizations. Add copies of passes pass_ccp,
pass_ipa_warn, pass_complete_unrolli, pass_backprop,
pass_phiprop, pass_fix_loops after the OpenACC passes
in pass_tree_loop.
* tree-ssa-loop-ivcanon.c (pass_complete_unroll::clone):
New member function.
(pass_complete_unrolli::clone): Likewise.
* tree-ssa-loop.c (pass_fix_loops::clone): Likewise.
(pass_tree_loop_init::clone): Likewise.
(pass_tree_loop_done::clone): Likewise.
* tree-ssa-phiprop.c (pass_phiprop::clone): Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Adjust
expected output to pass name changes due to the pass
reordering and cloning.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Likewise
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/goacc/loop-processing-1.c: Adjust expected output
* to pass name changes due to the pass reordering and cloning.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* c-c++-common/goacc/routine-nohost-1.c: Likewise.
* c-c++-common/unroll-1.c: Likewise.
* c-c++-common/unroll-4.c: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* gcc.dg/tree-ssa/backprop-1.c: Likewise.
* gcc.dg/tree-ssa/backprop-2.c: Likewise.
* gcc.dg/tree-ssa/backprop-3.c: Likewise.
* gcc.dg/tree-ssa/backprop-4.c: Likewise.
* gcc.dg/tree-ssa/backprop-5.c: Likewise.
* gcc.dg/tree-ssa/backprop-6.c: Likewise.
* gcc.dg/tree-ssa/cunroll-1.c: Likewise.
* gcc.dg/tree-ssa/cunroll-3.c: Likewise.
* gcc.dg/tree-ssa/cunroll-9.c: Likewise.
* gcc.dg/tree-ssa/ldist-17.c: Likewise.
* gcc.dg/tree-ssa/loop-38.c: Likewise.
* gcc.dg/tree-ssa/pr21463.c: Likewise.
* gcc.dg/tree-ssa/pr45427.c: Likewise.
* gcc.dg/tree-ssa/pr61743-1.c: Likewise.
* gcc.dg/unroll-2.c: Likewise.
* gcc.dg/unroll-3.c: Likewise.
* gcc.dg/unroll-4.c: Likewise.
* gcc.dg/unroll-5.c: Likewise.
* gcc.dg/vect/vect-profile-1.c: Likewise.
* c-c++-common/goacc/device-lowering-debug-optimization.c: New test.
* c-c++-common/goacc/device-lowering-no-loops.c: New test.
* c-c++-common/goacc/device-lowering-no-optimization.c: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>

commit | commitdiff | tree

Sandra Loosemore [Tue, 16 Nov 2021 15:09:51 +0000 (16:09 +0100)]

Fortran: delinearize multi-dimensional array accesses

The Fortran front end presently linearizes accesses to
multi-dimensional arrays by combining the indices for the various
dimensions into a series of explicit multiplies and adds with
refactoring to allow CSE of invariant parts of the computation.
Unfortunately this representation interferes with Graphite-based loop
optimizations.  It is difficult to recover the original
multi-dimensional form of the access by the time loop optimizations
run because parts of it have already been optimized away or into a
form that is not easily recognizable, so it seems better to have the
Fortran front end produce delinearized accesses to begin with, a set
of nested ARRAY_REFs similar to the existing behavior of the C and C++
front ends.  This is a long-standing problem that has previously been
discussed e.g. in PR 14741 and PR61000.

This patch is an initial implementation for explicit array accesses
only; it doesn't handle the accesses generated during scalarization of
whole-array or array-section operations, which follow a different code
path.

        gcc/
        * expr.c (get_inner_reference): Handle NOP_EXPR like
        VIEW_CONVERT_EXPR.

        gcc/fortran/
        * lang.opt (-param=delinearize=): New.
        * trans-array.c (get_class_array_vptr): New, split from...
        (build_array_ref): ...here.
        (get_array_lbound, get_array_ubound): New, split from...
        (gfc_conv_array_ref): ...here.  Additional code refactoring
        plus support for delinearization of the array access.

        gcc/testsuite/
        * gfortran.dg/assumed_type_2.f90: Adjust patterns.
        * gfortran.dg/goacc/kernels-loop-inner.f95: Likewise.
        * gfortran.dg/graphite/block-3.f90: Remove xfails.
        * gfortran.dg/graphite/block-4.f90: Likewise.
        * gfortran.dg/inline_matmul_24.f90: Adjust patterns.
        * gfortran.dg/no_arg_check_2.f90: Likewise.
        * gfortran.dg/pr32921.f: Likewise.
        * gfortran.dg/reassoc_4.f: Disable delinearization for this test.

Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>

commit | commitdiff | tree

Richard Sandiford [Sat, 24 Apr 2021 08:35:16 +0000 (09:35 +0100)]

Add dg-final option-based target selectors

This patch adds target selectors of the form:

  { any-opts "opt1" ... "optn" }
  { no-opts "opt1" ... "optn" }

for skipping or xfailing tests based on compiler options.  It only
works for dg-final selectors.

The patch then uses no-opts to exclude -O0 and (sometimes) -Og from
some guality.exp xfails.  AFAICT (based on gcc-testresults) these
tests pass for those options for all targets.

gcc/
* doc/sourcebuild.texi: Document no-opts and any-opts target
selectors.

gcc/testsuite/
* lib/target-supports-dg.exp (selector_expression): Handle any-opts
and no-opts.
* gcc.dg/guality/pr41353-1.c: Exclude -O0 from xfail.
* gcc.dg/guality/pr59776.c: Likewise.
* gcc.dg/guality/pr54970.c: Likewise -O0 and -Og.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:08:40 +0000 (16:08 +0100)]

Fix gimple_debug_cfg declaration

Silence a warning. The argument type did not match the definition.

gcc/ChangeLog:

* tree-cfg.h (gimple_debug_cfg): Change argument type from int
to dump_flags_t.

commit | commitdiff | tree

Tobias Burnus [Wed, 10 Nov 2021 13:48:34 +0000 (14:48 +0100)]

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9233-g3dea90505df136a4b361665772ef8e62306cfcdb (Nov 10, 2021)

commit | commitdiff | tree

Richard Biener [Wed, 10 Nov 2021 10:08:03 +0000 (11:08 +0100)]

testsuite/102690 - XFAIL g++.dg/warn/Warray-bounds-16.C

This XFAILs the bogus diagnostic test and rectifies the expectation
on the optimization.

2021-11-10 Richard Biener <rguenther@suse.de>

PR testsuite/102690
* g++.dg/warn/Warray-bounds-16.C: XFAIL diagnostic part
and optimization.

(cherry picked from commit b2cd32b743ba440e75505ce30c6b5c592ed144ea)

commit | commitdiff | tree

Jakub Jelinek [Wed, 10 Nov 2021 07:51:07 +0000 (08:51 +0100)]

openmp: For default(none) ignore variables created by ubsan_create_data [PR64888]

We weren't ignoring the ubsan variables created by c-ubsan.c before gimplification
(others are added later). One way to fix this would be to introduce further
UBSAN_ internal functions and lower it later (sanopt pass) like other ifns,
this patch instead recognizes those magic vars by name/name of type and DECL_ARTIFICIAL
and TYPE_ARTIFICIAL.

2021-10-21 Jakub Jelinek <jakub@redhat.com>

PR middle-end/64888
gcc/c-family/
* c-omp.c (c_omp_predefined_variable): Return true also for
ubsan_create_data created artificial variables.
gcc/testsuite/
* c-c++-common/ubsan/pr64888.c: New test.

(cherry picked from commit 40dd9d839e52f679d8eabc1c5ca0ca17a5ccfd14)

commit | commitdiff | tree

Thomas Schwinge [Wed, 10 Nov 2021 07:49:34 +0000 (08:49 +0100)]

Restore 'GOMP_OPENACC_DIM' environment variable parsing

... that got broken by recent commit c057ed9c52c6a63a1a692268f916b1a9131cd4b7
"openmp: Fix up strtoul and strtoull uses in libgomp", resulting in spurious
FAILs for tests specifying 'dg-set-target-env-var "GOMP_OPENACC_DIM" "[...]"'.

libgomp/
* env.c (parse_gomp_openacc_dim): Restore parsing.

(cherry picked from commit 00c9ce13a64e324dabd8dfd236882919a3119479)

commit | commitdiff | tree

GCC Administrator [Wed, 10 Nov 2021 00:18:14 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Xionghu Luo [Thu, 4 Nov 2021 01:23:03 +0000 (20:23 -0500)]

rs6000: Fix incorrect fusion constraint [PR102991]

gcc/ChangeLog:

2021-11-05 Xionghu Luo <luoxhu@linux.ibm.com>

PR target/102991
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl: Fix incorrect clobber constraint.

(cherry picked from commit 614b39757b8b61f70ac1c666edb7a01a5fc19cd4)

commit | commitdiff | tree

GCC Administrator [Tue, 9 Nov 2021 00:18:14 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Richard Biener [Mon, 18 Oct 2021 07:10:43 +0000 (09:10 +0200)]

tree-optimization/102798 - avoid copying PTA info to old SSA names

The vectorizer duplicates pointer-info to created pointer bases
but it has to avoid changing points-to info on existing SSA names
because there's now flow-sensitive info in there (pt->pt_null as
set from VRP).

2021-10-18 Richard Biener <rguenther@suse.de>

PR tree-optimization/102798
* tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref):
Only copy points-to info to newly generated SSA names.

* gcc.dg/pr102798.c: New testcase.

commit | commitdiff | tree

Richard Biener [Thu, 30 Sep 2021 13:05:53 +0000 (15:05 +0200)]

middle-end/102518 - avoid invalid GIMPLE during inlining

When inlining we have to avoid mapping a non-lvalue parameter
value into a context that prevents the parameter to be a register.
Formerly the register were TREE_ADDRESSABLE but now it can be
just DECL_NOT_GIMPLE_REG_P.

2021-09-30 Richard Biener <rguenther@suse.de>

PR middle-end/102518
* tree-inline.c (setup_one_parameter): Avoid substituting
an invariant into contexts where a GIMPLE register is not valid.

* gcc.dg/torture/pr102518.c: New testcase.

commit | commitdiff | tree

Richard Biener [Mon, 18 Oct 2021 08:31:19 +0000 (10:31 +0200)]

tree-optimization/102788 - avoid spurious bool pattern fails

Bool pattern recog is required for correctness since vectorized
compares otherwise produce -1 for true so any context where bool
is used as value and not as condition or mask needs to be replaced
with CMP ? 1 : 0.  When we fail to find a vector type for the
result of such use we may not simply elide such transform since
a new bool result can emerge when for example the cast_forwprop
pattern is applied.  So the following avoids failing of the
bool pattern recog process and instead not assign a vector type
for the stmt.

2021-10-18  Richard Biener  <rguenther@suse.de>

PR tree-optimization/102788
* tree-vect-patterns.c (vect_init_pattern_stmt): Allow
a NULL vectype.
(vect_pattern_recog_1): Likewise.
(vect_recog_bool_pattern): Continue matching the pattern
even if we do not have a vector type for a conversion
result.

* g++.dg/vect/pr102788.cc: New testcase.

(cherry picked from commit eb032893675afea4b01cc6ad06a3e0dcfe9b51cd)

commit | commitdiff | tree

Richard Biener [Fri, 15 Oct 2021 06:41:57 +0000 (08:41 +0200)]

ipa/102762 - fix ICE with invalid __builtin_va_arg_pack () use

We have to be careful to not break the argument space calculation.
If there's not enough arguments just do not append any.

2021-10-15 Richard Biener <rguenther@suse.de>

PR ipa/102762
* tree-inline.c (copy_bb): Avoid underflowing nargs.

* gcc.dg/torture/pr102762.c: New testcase.

(cherry picked from commit 11a4714860d2df6ba496d55379e7dc702d5fc425)

commit | commitdiff | tree

Richard Biener [Tue, 12 Oct 2021 11:42:08 +0000 (13:42 +0200)]

tree-optimization/102572 - fix gathers with invariant mask

This fixes the vector def gathering for invariant masks which
failed to pass in the desired vector type resulting in a non-mask
type to be generate.

2021-10-12 Richard Biener <rguenther@suse.de>

PR tree-optimization/102572
* tree-vect-stmts.c (vect_build_gather_load_calls): When
gathering the vectorized defs for the mask pass in the
desired mask vector type so invariants will be handled
correctly.

* g++.dg/vect/pr102572.cc: New testcase.

(cherry picked from commit 9f12a45ef147e563f099c24c293830727e8204cc)

commit | commitdiff | tree

Richard Biener [Tue, 31 Aug 2021 08:28:40 +0000 (10:28 +0200)]

tree-optimization/102139 - fix SLP DR base alignment

When doing whole-function SLP we have to make sure the recorded
base alignments we compute as the maximum alignment seen for a
base anywhere in the function is actually valid at the point
we want to make use of it.

To make this work we now record the stmt the alignment was derived
from in addition to the DRs innermost behavior and we use a
dominance check to verify the recorded info is valid when doing
BB vectorization. For this to work for groups inside a BB that are
separate by a call that might not return we now store the DR
analysis group-id permanently and use that for an additional check
when the DRs are in the same BB.

2021-08-31 Richard Biener <rguenther@suse.de>

PR tree-optimization/102139
* tree-vectorizer.h (vec_base_alignments): Adjust hash-map
type to record a std::pair of the stmt-info and the innermost
loop behavior.
(dr_vec_info::group): New member.
* tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
(vect_compute_data_ref_alignment): Verify the recorded
base alignment can be used.
(data_ref_pair): Remove.
(dr_group_sort_cmp): Adjust.
(vect_analyze_data_ref_accesses): Store the group-ID in the
dr_vec_info and operate on a vector of dr_vec_infos.

* gcc.dg/torture/pr102139.c: New testcase.

(cherry picked from commit 153766ec8351d55cfe8bd6d69bdfc0c2cef71e56)

commit | commitdiff | tree

Richard Biener [Fri, 20 Aug 2021 09:32:00 +0000 (11:32 +0200)]

Refactor BB splitting of DRs for SLP group analysis

This uses the group_id computed to ensure DRs in different BBs do
not get merged into a DR group.  To achieve this we seed the
group from the BB index when group_ids are not computed and we
make sure to bump the group_id when advancing to the next BB for
BB SLP analysis.

This paves the way for relaxing the grouping for BB vectorization
by adjusting its group_id computation.

2021-08-20  Richard Biener  <rguenther@suse.de>

* tree-vect-data-refs.c (dr_group_sort_cmp): Do not compare
BBs.
(vect_analyze_data_ref_accesses): Likewise.  Assign the BB
index as group_id when dataref_groups were not computed.
* tree-vect-slp.c (vect_slp_bbs): Bump current_group when
we advace to the next BB.

(cherry picked from commit 37744f8260857005c8409c9e2e633a05c768a7dd)

commit | commitdiff | tree

Richard Biener [Mon, 11 Oct 2021 14:06:03 +0000 (16:06 +0200)]

middle-end/101480 - overloaded global new/delete

The following fixes the issue of ignoring side-effects on memory
from overloaded global new/delete operators by not marking them
as effectively 'const' apart from other explicitely specified
side-effects.

This will cause

FAIL: g++.dg/warn/Warray-bounds-16.C -std=gnu++1? (test for excess errors)

because we now no longer statically see the initialization loop
never executes because the call to operator new can now clobber 'a.m'.
This seems to be an issue with the warning code and/or ranger so
I'm leaving this FAIL to be addressed as followup.

2021-10-11 Richard Biener <rguenther@suse.de>

PR middle-end/101480
* gimple.c (gimple_call_fnspec): Do not mark operator new/delete
as const.

* g++.dg/torture/pr10148.C: New testcase.

(cherry picked from commit 09a0affdb0598a54835ac4bb0dd6b54122c12916)

commit | commitdiff | tree

Martin Liska [Fri, 5 Nov 2021 15:50:06 +0000 (16:50 +0100)]

gcov-profile: Fix -fcompare-debug with -fprofile-generate [PR100520]

PR gcov-profile/100520

gcc/ChangeLog:

* coverage.c (coverage_compute_profile_id): Strip .gk when
compare debug is used.
* system.h (endswith): New function.

gcc/testsuite/ChangeLog:

* gcc.dg/pr100520.c: New test.

(cherry picked from commit 7553bd35c876efaf8ab0b6661a6102822b99e6e3)

commit | commitdiff | tree

Martin Liska [Mon, 8 Nov 2021 11:58:28 +0000 (12:58 +0100)]

gcc-changelog: sync from master

contrib/ChangeLog:

* gcc-changelog/git_check_commit.py: Sync from master.
* gcc-changelog/git_commit.py: Likewise.
* gcc-changelog/git_email.py: Likewise.
* gcc-changelog/git_update_version.py: Likewise.
* gcc-changelog/test_email.py: Likewise.
* gcc-changelog/test_patches.txt: Likewise.

commit | commitdiff | tree

Kewen Lin [Tue, 26 Oct 2021 02:05:02 +0000 (21:05 -0500)]

vect: Don't update inits for simd_lane_access DRs [PR102789]

As PR102789 shows, when vectorizer does some peelings for alignment
in prologues, function vect_update_inits_of_drs would update the
inits of some drs. But as the failed case, we shouldn't update the
dr for simd_lane_access, it has the fixed-length storage mainly for
the main loop, the update can make the access out of bound and access
the unexpected element.

gcc/ChangeLog:

PR tree-optimization/102789
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
update inits of simd_lane_access.

(cherry picked from commit f3dbd3f36d55178d0a9e4431043cbc950524969a)

commit | commitdiff | tree

GCC Administrator [Mon, 8 Nov 2021 00:17:58 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

Harald Anlauf [Tue, 26 Oct 2021 18:51:46 +0000 (20:51 +0200)]

Fortran: error recovery on initializing invalid derived type array component

gcc/fortran/ChangeLog:

PR fortran/102816
* resolve.c (resolve_structure_cons): Reject invalid array spec of
a DT component referenced in a structure constructor.

gcc/testsuite/ChangeLog:

PR fortran/102816
* gfortran.dg/pr102816.f90: New test.

(cherry picked from commit 99af0b2f0fe1c0dc8c6d558157e700326d52816a)

commit | commitdiff | tree

Harald Anlauf [Fri, 15 Oct 2021 19:23:17 +0000 (21:23 +0200)]

Fortran: validate shape of arrays in constructors against declarations

gcc/fortran/ChangeLog:

PR fortran/102685
* decl.c (match_clist_expr): Set rank/shape of clist initializer
to match LHS.
* resolve.c (resolve_structure_cons): In a structure constructor,
compare shapes of array components against declared shape.

gcc/testsuite/ChangeLog:

PR fortran/102685
* gfortran.dg/derived_constructor_char_1.f90: Fix invalid code.
* gfortran.dg/pr70931.f90: Likewise.
* gfortran.dg/transfer_simplify_2.f90: Likewise.
* gfortran.dg/pr102685.f90: New test.

Co-authored-by: Tobias Burnus <tobias@codesourcery.com>
(cherry picked from commit 1e819bd95ebeefc1dc469daa1855ce005cb77822)

commit | commitdiff | tree

Harald Anlauf [Sat, 6 Nov 2021 18:42:01 +0000 (19:42 +0100)]

Fortran: error recovery on rank mismatch of array and its initializer

gcc/fortran/ChangeLog:

PR fortran/102715
* decl.c (add_init_expr_to_sym): Reject rank mismatch between
array and its initializer.

gcc/testsuite/ChangeLog:

PR fortran/102715
* gfortran.dg/pr68019.f90: Adjust error message.
* gfortran.dg/pr102715.f90: New test.

(cherry picked from commit df2135e88a8f78c853b35246ad426b01b6d08378)

commit | commitdiff | tree

Harald Anlauf [Fri, 5 Nov 2021 22:48:20 +0000 (23:48 +0100)]

Fortran: fix simplification of array-valued parameter expressions

gcc/fortran/ChangeLog:

PR fortran/102817
* expr.c (simplify_parameter_variable): Copy shape of referenced
subobject when simplifying.

gcc/testsuite/ChangeLog:

PR fortran/102817
* gfortran.dg/pr102817.f90: New test.

(cherry picked from commit bcf3728abe8488882922005166d3065fc5fdfea1)

commit | commitdiff | tree

Harald Anlauf [Sun, 10 Oct 2021 18:11:43 +0000 (20:11 +0200)]

Fortran: handle initialization of derived type parameter arrays from scalar

gcc/fortran/ChangeLog:

PR fortran/99348
PR fortran/102521
* decl.c (add_init_expr_to_sym): Extend initialization of
parameter arrays from scalars to handle derived types.

gcc/testsuite/ChangeLog:

PR fortran/99348
PR fortran/102521
* gfortran.dg/parameter_array_init_8.f90: New test.

(cherry picked from commit 74ccca380cde5e79e082d39214b306a90ded0344)

commit | commitdiff | tree

GCC Administrator [Sun, 7 Nov 2021 00:18:02 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 6 Nov 2021 00:18:03 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

John David Anglin [Fri, 5 Nov 2021 16:09:08 +0000 (16:09 +0000)]

Support TI mode and soft float on PA64

This change implements TI mode on PA64.  Various new patterns are
added to pa.md.  The libgcc build needed modification to build both
DI and TI routines.  We also need various softfp routines to
convert to and from TImode.

I added full softfp for the -msoft-float option.  At the moment,
this doesn't completely eliminate all use of the floating-point
co-processor.  For this, libgcc needs to be built with -msoft-mult.
The floating-point exception support also needs a soft option.

2021-11-05  John David Anglin  <danglin@gcc.gnu.org>

PR libgomp/96661

gcc/ChangeLog:

* config/pa/pa-modes.def: Add OImode integer type.
* config/pa/pa.c (pa_scalar_mode_supported_p): Allow TImode
for TARGET_64BIT.
* config/pa/pa.h (MIN_UNITS_PER_WORD) Define to MIN_UNITS_PER_WORD
to UNITS_PER_WORD if IN_LIBGCC2.
* config/pa/pa.md (addti3, addvti3, subti3, subvti3, negti2,
negvti2, ashlti3, shrpd_internal): New patterns.
Change some multi instruction types to multi.

libgcc/ChangeLog:

* config.host (hppa*64*-*-linux*): Revise tmake_file.
(hppa*64*-*-hpux11*): Likewise.
* config/pa/sfp-exceptions.c: New.
* config/pa/sfp-machine.h: New.
* config/pa/t-dimode: New.
* config/pa/t-softfp-sfdftf: New.

commit | commitdiff | tree

Martin Liska [Fri, 13 Aug 2021 15:22:35 +0000 (17:22 +0200)]

Speed up jump table switch detection.

PR tree-optimization/100393

gcc/ChangeLog:

* tree-switch-conversion.c (group_cluster::dump): Use
get_comparison_count.
(jump_table_cluster::find_jump_tables): Pre-compute number of
comparisons and then decrement it. Cache also max_ratio.
(jump_table_cluster::can_be_handled): Change signature.
* tree-switch-conversion.h (get_comparison_count): New.

(cherry picked from commit c517cf2e685e2903b591d63c1034ff9726cb3822)

commit | commitdiff | tree

Rasmus Villemoes [Tue, 2 Nov 2021 10:43:57 +0000 (11:43 +0100)]

gcc: vx-common.h: fix test for VxWorks7

The macro TARGET_VXWORKS7 is always defined (see vxworks-dummy.h).
Thus we need to test its value, not its definedness.

Fixes aca124df (define NO_DOT_IN_LABEL only in vxworks6).

gcc/ChangeLog:

* config/vx-common.h: Test value of TARGET_VXWORKS7 rather
than definedness.

(cherry picked from commit 44d0243a247dd1280265c649dab26e9486ffa015)

commit | commitdiff | tree

GCC Administrator [Fri, 5 Nov 2021 00:18:08 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

H.J. Lu [Thu, 4 Nov 2021 14:37:18 +0000 (07:37 -0700)]

x86: Check leal/addl gcc.target/i386/amxtile-3.c for x32

Check leal and addl for x32 to fix:

FAIL: gcc.target/i386/amxtile-3.c scan-assembler addq[ \\t]+\\$12
FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+4
FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+8

* gcc.target/i386/amxtile-3.c: Check leal/addl for x32.

(cherry picked from commit fbe58ba97aff3270877d7fd5600c17687b85964c)

commit | commitdiff | tree

Hongyu Wang [Wed, 3 Nov 2021 05:58:52 +0000 (13:58 +0800)]

i386: Fix wrong result for AMX-TILE intrinsic when parsing expression.

_tile_loadd, _tile_stored, _tile_streamloadd intrinsics are defined by
macro, so the parameters should be wrapped by parentheses to accept
expressions.

gcc/ChangeLog:

* config/i386/amxtileintrin.h (_tile_loadd_internal): Add
parentheses to base and stride.
(_tile_stream_loadd_internal): Likewise.
(_tile_stored_internal): Likewise.

gcc/testsuite/ChangeLog:
* gcc.target/i386/amxtile-3.c: New test.

commit | commitdiff | tree

GCC Administrator [Thu, 4 Nov 2021 00:18:09 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Maciej W. Rozycki [Wed, 3 Nov 2021 22:07:04 +0000 (22:07 +0000)]

ranger: Fix `-Werror' build error with `ranger_cache::push_poor_value'

Remove a commit 86534c07a390 ("Disable poor value processing in ranger
cache.") regression that caused GCC not to build anymore if `-Werror'
has been enabled:

.../gcc/gimple-range-cache.cc: In member function 'bool ranger_cache::push_poor_value(basic_block, tree)':
.../gcc/gimple-range-cache.cc:850:44: error: unused parameter 'bb' [-Werror=unused-parameter]
  850 | ranger_cache::push_poor_value (basic_block bb, tree name)
      |                                ~~~~~~~~~~~~^~
.../gcc/gimple-range-cache.cc:850:53: error: unused parameter 'name' [-Werror=unused-parameter]
  850 | ranger_cache::push_poor_value (basic_block bb, tree name)
      |                                                ~~~~~^~~~

To keep the change to the minimum mark the parameters reported unused.

gcc/
* gimple-range-cache.cc (ranger_cache::push_poor_value): Mark
parameters unused.

commit | commitdiff | tree

Vladimir N. Makarov [Tue, 26 Oct 2021 18:03:42 +0000 (14:03 -0400)]

[PR102842] Consider all outputs in generation of matching reloads

Without considering all output insn operands (not only processed
before), in rare cases LRA can use the same hard register for
different outputs of the insn on different assignment subpasses. The
patch fixes the problem.

gcc/ChangeLog:

PR rtl-optimization/102842
* lra-constraints.c (match_reload): Ignore out in checking values
of outs.
(curr_insn_transform): Collect outputs before doing reloads of operands.

gcc/testsuite/ChangeLog:

PR rtl-optimization/102842
* g++.target/arm/pr102842.C: New test.

commit | commitdiff | tree

Richard Biener [Wed, 13 Oct 2021 07:13:36 +0000 (09:13 +0200)]

ipa/102714 - IPA SRA eliding volatile

The following fixes the volatileness check of IPA SRA which was
looking at the innermost reference when checking TREE_THIS_VOLATILE
but the reference to check is the outermost one.

2021-10-13 Richard Biener <rguenther@suse.de>

PR ipa/102714
* ipa-sra.c (ptr_parm_has_nonarg_uses): Fix volatileness
check.

* gcc.dg/ipa/pr102714.c: New testcase.

(cherry picked from commit 23cd18c60c8188e3d68eda721cdb739199e85e5b)

commit | commitdiff | tree

GCC Administrator [Wed, 3 Nov 2021 00:18:12 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Tobias Burnus [Tue, 2 Nov 2021 14:52:52 +0000 (15:52 +0100)]

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merged up to r11-9199-gfdc2700d09530227a520e8d33623b7956582cffb (Nov 2, 2021)

commit | commitdiff | tree

Tobias Burnus [Tue, 2 Nov 2021 14:51:22 +0000 (15:51 +0100)]

openmp: Add testcase for threadprivate random access class iterators

This adds a testcase for random access class iterators. The diagnostics
can be different between templates and non-templates, as for some
threadprivate vars finish_id_expression replaces them with call to their
corresponding wrapper, but I think it is not that big deal, we reject
it in either case.

2021-11-02 Jakub Jelinek <jakub@redhat.com>

* g++.dg/gomp/loop-8.C: New test.

(cherry picked from commit fb7fee84813b23487baf0c1094860251229ab5dd)

commit | commitdiff | tree

Tobias Burnus [Tue, 2 Nov 2021 14:49:45 +0000 (15:49 +0100)]

openmp: Diagnose threadprivate OpenMP loop iterators

We weren't diagnosing the
The loop iteration variable may not appear in a threadprivate directive.
restriction which used to be in 5.0 just among the Worksharing-Loop
restrictions but in 5.1 it is among Canonical Loop Nest Form restrictions.

This patch diagnoses those.

2021-10-30 Jakub Jelinek <jakub@redhat.com>

* gimplify.c (gimplify_omp_for): Diagnose threadprivate iterators.

* c-c++-common/gomp/loop-10.c: New test.

(cherry picked from commit 6f449bb93b33d63fa8a1b8d021d8d36f27ffe054)

commit | commitdiff | tree

GCC Administrator [Tue, 2 Nov 2021 00:17:59 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

Jonathan Wakely [Mon, 1 Nov 2021 11:06:51 +0000 (11:06 +0000)]

libstdc++: Fix range access for empty std::valarray [PR103022]

The std::begin and std::end overloads for std::valarray are defined in
terms of std::addressof(v[0]) which is undefined for an empty valarray.

libstdc++-v3/ChangeLog:

PR libstdc++/103022
* include/std/valarray (begin, end): Do not dereference an empty
valarray. Add noexcept and [[nodiscard]].
* testsuite/26_numerics/valarray/range_access.cc: Check empty
valarray. Check iterator properties. Run as well as compiling.
* testsuite/26_numerics/valarray/range_access2.cc: Likewise.
* testsuite/26_numerics/valarray/103022.cc: New test.

(cherry picked from commit 91bac9fed5d082f0b180834110ebc0f46f97599a)

commit | commitdiff | tree

GCC Administrator [Mon, 1 Nov 2021 00:17:51 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 31 Oct 2021 00:18:04 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 30 Oct 2021 00:18:11 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Fri, 29 Oct 2021 00:18:11 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Eric Botcazou [Thu, 28 Oct 2021 13:51:14 +0000 (15:51 +0200)]

Update documentation of %X spec

%X
Output the accumulated linker options specified by -Wl or a ‘%x’ spec string

The part about -Wl has been obsolete for 27 years, since this change:

Author: Torbjorn Granlund <tege@gnu.org>
Date:   Thu Oct 27 18:04:25 1994 +0000

    (process_command): Handle -Wl, and -Xlinker similar to -l,

    i.e., preserve their order with respect to linker input files.

Technically speaking, the arguments of -l, -Wl and -Xlinker are input files.

gcc/
* doc/invoke.texi (%X): Remove obsolete reference to -Wl.

commit | commitdiff | tree

GCC Administrator [Thu, 28 Oct 2021 00:18:23 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Harald Anlauf [Tue, 26 Oct 2021 18:54:41 +0000 (20:54 +0200)]

Fortran: do not restrict PDT KIND and LEN type parameters to default integer

gcc/fortran/ChangeLog:

PR fortran/102917
* decl.c (match_attr_spec): Remove invalid integer kind checks on
KIND and LEN attributes of PDTs.

gcc/testsuite/ChangeLog:

PR fortran/102917
* gfortran.dg/pdt_4.f03: Adjust testcase.

(cherry picked from commit cfcb27cfcb1d32b8cf7bc463cc1fc5cacae8d199)

commit | commitdiff | tree

John David Anglin [Wed, 27 Oct 2021 18:01:40 +0000 (18:01 +0000)]

Fix warnings building linux-atomic.c and fptr.c on hppa64-linux

The file fptr.c is specific to 32-bit hppa-linux and should not be
included in LIB2ADD on hppa64-linux.

There is a builtin type mismatch in linux-atomic.c using the type
long long unsigned int for 64-bit atomic operations on hppa64-linux.

2021-10-27 John David Anglin <danglin@gcc.gnu.org>

libgcc/ChangeLog:

* config.host (hppa*64*-*-linux*): Don't add pa/t-linux to
tmake_file.
* config/pa/linux-atomic.c: Define u8, u16 and u64 types.
Use them in FETCH_AND_OP_2, OP_AND_FETCH_2, COMPARE_AND_SWAP_2,
SYNC_LOCK_TEST_AND_SET_2 and SYNC_LOCK_RELEASE_1 macros.
* config/pa/t-linux64 (LIB1ASMSRC): New define.
(LIB1ASMFUNCS): Revise.
(HOST_LIBGCC2_CFLAGS): Add "-DLINUX=1".

commit | commitdiff | tree

Martin Jambor [Wed, 27 Oct 2021 17:15:33 +0000 (19:15 +0200)]

sra: Fix corner case of total scalarization with virtual inheritance (PR 102505)

PR 102505 is a situation where of SRA takes its initial top-level
access size from a get_ref_base_and_extent called on a COMPONENT_REF,
and thus derived frm the FIELD_DECL, which however does not include a
virtual base.  Total scalarization then goes on traversing the type,
which however has virtual base past the non-virtual bits, tricking SRA
to create sub-accesses outside of the supposedly encompassing
accesses, which in turn triggers the verifier within the pass.

The patch below fixes that by failing total scalarization when this
situation is detected.

This backport also has commit f217e87972a2a207e793101fc05cfc9dd095c678
squashed into it in order to avoid PR 102886 that the fix introduced
on trunk.

gcc/ChangeLog:

2021-10-20  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/102505
* tree-sra.c (totally_scalarize_subtree): Check that the
encountered field fits within the acces we would like to put it
in.

gcc/testsuite/ChangeLog:

2021-10-20  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/102505
* g++.dg/torture/pr102505.C: New test.

(cherry picked from commit 701ee067807b80957c65bd7ff94b6099a27181de)

commit | commitdiff | tree

Tobias Burnus [Wed, 27 Oct 2021 09:02:54 +0000 (11:02 +0200)]

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9188-g2563fba71d0b67cb69b2cd2da649a50a70f2a02f (Oct 27, 2021)

commit | commitdiff | tree

Tobias Burnus [Wed, 27 Oct 2021 09:01:56 +0000 (11:01 +0200)]

Fortran: Fix 'select rank' for allocatables/pointers

gcc/fortran/ChangeLog:

* trans-stmt.c (gfc_trans_select_rank_cases): Fix condition
for allocatables/pointers.

gcc/testsuite/ChangeLog:

* gfortran.dg/PR93963.f90: Extend testcase by scan-tree-dump test.

(cherry picked from commit 7f899b23f36f94f907a025d3eeaf3e4640544927)

commit | commitdiff | tree

Jakub Jelinek [Wed, 27 Oct 2021 08:41:59 +0000 (10:41 +0200)]

openmp: Document that non-rect loops are not supported in Fortran yet

I've found we claim to support non-rectangular loops, but don't actually
support those in Fortran, as can be seen on:
  integer i, j
  !$omp parallel do collapse(2)
  do i = 0, 10
    do j = 0, i
    end do
  end do
end
To support this, the Fortran FE needs to allow the valid forms of
non-rectangular loops and disallow others, so mainly it needs its
updated version of c-omp.c c_omp_check_loop_iv etc., plus for non-rectangular
lb or ub expressions emit a TREE_VEC instead of normal expression as the C/C++ FE
do, plus testsuite coverage.

2021-10-27  Jakub Jelinek  <jakub@redhat.com>

* libgomp.texi (OpenMP 5.0): Mention that Non-rectangular loop nests
aren't implemented for Fortran yet.

(cherry picked from commit eef811490646a68c9893968a421b351e7bf704e1)

commit | commitdiff | tree

Jakub Jelinek [Wed, 27 Oct 2021 08:41:01 +0000 (10:41 +0200)]

openmp: Allow non-rectangular loops with pointer iterators

This patch handles pointer iterators for non-rectangular loops. They are
more limited than integral iterators of non-rectangular loops, in particular
only var-outer, var-outer + a2, a2 + var-outer or var-outer - a2 can appear
in lb or ub where a2 is some integral loop invariant expression, so no e.g.
multiplication etc.

2021-10-27 Jakub Jelinek <jakub@redhat.com>

gcc/
* omp-expand.c (expand_omp_for_init_counts): Handle non-rectangular
iterators with pointer types.
(expand_omp_for_init_vars, extract_omp_for_update_vars): Likewise.
gcc/c-family/
* c-omp.c (c_omp_check_loop_iv_r): Don't clear 3rd bit for
POINTER_PLUS_EXPR.
(c_omp_check_nonrect_loop_iv): Handle POINTER_PLUS_EXPR.
(c_omp_check_loop_iv): Set kind even if the iterator is non-integral.
gcc/testsuite/
* c-c++-common/gomp/loop-8.c: New test.
* c-c++-common/gomp/loop-9.c: New test.
libgomp/
* testsuite/libgomp.c/loop-26.c: New test.
* testsuite/libgomp.c/loop-27.c: New test.

(cherry picked from commit 2084b5f42a4432da8b0625f9c669bf690ec46468)

commit | commitdiff | tree

Jakub Jelinek [Wed, 27 Oct 2021 08:37:58 +0000 (10:37 +0200)]

openmp: Don't reject some valid initializers or conditions of non-rectangular loops [PR102854]

In C++, if an iterator has or might have (e.g. dependent type) class type we
remember the original init expressions and check those separately for presence
of iterators, because for class iterators we turn those into expressions that
always do contain reference to the current iterator.  But this resulted in
rejecting valid non-rectangular loop where the dependent type is later instantiated
to an integral type.

Non-rectangular loops with class random access iterators remain broken, that is something
to be fixed incrementally.

2021-10-27  Jakub Jelinek  <jakub@redhat.com>

PR c++/102854
gcc/c-family/
* c-common.h (c_omp_check_loop_iv_exprs): Add enum tree_code argument.
* c-omp.c (c_omp_check_loop_iv_r): For trees other than decls,
TREE_VEC, PLUS_EXPR, MINUS_EXPR, MULT_EXPR, POINTER_PLUS_EXPR or
conversions temporarily clear the 3rd bit from d->kind while walking
subtrees.
(c_omp_check_loop_iv_exprs): Add CODE argument.  Or in 4 into data.kind
if possibly non-rectangular.
gcc/cp/
* semantics.c (handle_omp_for_class_iterator,
finish_omp_for): Adjust c_omp_check_loop_iv_exprs caller.
gcc/testsuite/
* g++.dg/gomp/loop-3.C: Don't expect some errors.
* g++.dg/gomp/loop-7.C: New test.

(cherry picked from commit 6b0f35299bd1468ebc13b900a73b7cac6181a2aa)

commit | commitdiff | tree

GCC Administrator [Wed, 27 Oct 2021 00:18:10 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Piotr Kubaj [Sat, 16 Oct 2021 02:09:05 +0000 (04:09 +0200)]

gcc/configure: Check for powerpc64le*-*-freebsd*

Only powerpc64-unknown-freebsd was checked for.

Signed-off-by: Piotr Kubaj <pkubaj@FreeBSD.org>
gcc/
* configure.ac: Treat powerpc64*-*-freebsd* the same as
powerpc64-*-freebsd*.
* configure: Regenerate.

(cherry picked from commit a9ef07fe5899fc5998395cdbf96e00af372cfb0b)

commit | commitdiff | tree

Tobias Burnus [Tue, 26 Oct 2021 14:04:52 +0000 (16:04 +0200)]

Fortran: Fix character(len=cst) dummies with bind(C) [PR102885]

PR fortran/102885

gcc/fortran/ChangeLog:

* trans-decl.c (gfc_conv_cfi_to_gfc): Properly handle nonconstant
character lenghts.

gcc/testsuite/ChangeLog:

* gfortran.dg/lto/bind-c-char_0.f90: New test.

(cherry picked from commit a31a3d0421f0cf1f7eefacfec8cbf37e7f91600d)

commit | commitdiff | tree

GCC Administrator [Tue, 26 Oct 2021 00:18:18 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Mon, 25 Oct 2021 00:18:08 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

John David Anglin [Sun, 24 Oct 2021 17:55:25 +0000 (17:55 +0000)]

Revise -mdisable-fpregs option and add new -msoft-mult option

The behavior of the -mdisable-fpregs is confusing in that it doesn't
disable the use of the floating-point registers in all situations.
The -msoft-float disables the use of the floating-point registers in
all situations.  The Linux kernel only needs to disable use of the
xmpyu instruction to avoid using the floating-point registers.

This change revises the -mdisable-fpregs option to disable the use of
the floating-point registers in all situations.  It is now equivalent
to the -msoft-float option.  A new -msoft-mult option is added to
disable use of the xmpyu instruction.  The libgcc library can be
compiled with the -msoft-mult option to avoid using hardware integer
multiplication.

2021-10-24  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa-d.c (pa_d_handle_target_float_abi): Don't check
TARGET_DISABLE_FPREGS.
* config/pa/pa.c (fix_range): Use MASK_SOFT_FLOAT instead of
MASK_DISABLE_FPREGS.
(hppa_rtx_costs): Don't check TARGET_DISABLE_FPREGS.  Adjust
cost of hardware integer multiplication.
(pa_conditional_register_usage): Don't check TARGET_DISABLE_FPREGS.
* config/pa/pa.h (INT14_OK_STRICT): Likewise.
* config/pa/pa.md: Don't check TARGET_DISABLE_FPREGS. Check
TARGET_SOFT_FLOAT in patterns that use xmpyu instruction.
* config/pa/pa.opt (mdisable-fpregs): Change target mask to
SOFT_FLOAT.  Revise comment.
(msoft-float): New option.

commit | commitdiff | tree

John David Anglin [Sun, 24 Oct 2021 16:40:12 +0000 (16:40 +0000)]

Don't use 'G' constraint in integer move patterns

The 'G' constraint only matches a float zero.

2021-10-24 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.md: Don't use 'G' constraint in integer move patterns.

commit | commitdiff | tree

GCC Administrator [Sun, 24 Oct 2021 00:18:04 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 23 Oct 2021 00:18:10 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Fri, 22 Oct 2021 00:18:04 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

H.J. Lu [Thu, 21 Oct 2021 16:45:14 +0000 (09:45 -0700)]

x86: Document -fcf-protection requires i686 or newer

PR target/98667
* doc/invoke.texi: Document -fcf-protection requires i686 or
new.

(cherry picked from commit 1373066a46d8d47abd97e46a005aef3b3dbfe94a)

commit | commitdiff | tree

Jakub Jelinek [Thu, 21 Oct 2021 09:12:55 +0000 (11:12 +0200)]

testsuite: Fix up gfortran.dg/gomp/strictly*.f90 testcases

While these testcases are dg-do compile only, I think it is better not to
give users bad examples and avoid unnecessary data races in testcases (unless
it is exactly what we want to test). Perhaps one day we'll do some analysis
and warn about data races...

2021-10-21 Jakub Jelinek <jakub@redhat.com>

* gfortran.dg/gomp/strictly-structured-block-1.f90: Use call do_work
instead of x = x + 1 in places where the latter could be a data race.
* gfortran.dg/gomp/strictly-structured-block-2.f90: Likewise.
* gfortran.dg/gomp/strictly-structured-block-3.f90: Likewise.

(cherry picked from commit e633f82fb71817f3232688869c1eb59f60eb78ca)

commit | commitdiff | tree

Chung-Lin Tang [Thu, 21 Oct 2021 06:56:20 +0000 (14:56 +0800)]

openmp: Fortran strictly-structured blocks support

This implements strictly-structured blocks support for Fortran, as specified in
OpenMP 5.2. This now allows using a Fortran BLOCK construct as the body of most
OpenMP constructs, with a "!$omp end ..." ending directive optional for that
form.

gcc/fortran/ChangeLog:

* decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case
together with COMP_BLOCK.
* parse.c (parse_omp_structured_block): Change return type to
'gfc_statement', add handling for strictly-structured block case, adjust
recursive calls to parse_omp_structured_block.
(parse_executable): Adjust calls to parse_omp_structured_block.
* parse.h (enum gfc_compile_state): Add
COMP_OMP_STRICTLY_STRUCTURED_BLOCK.
* trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case
handling.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/cancel-1.f90: Adjust testcase.
* gfortran.dg/gomp/nesting-3.f90: Adjust testcase.
* gfortran.dg/gomp/strictly-structured-block-1.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-2.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-3.f90: New test.

libgomp/ChangeLog:

* libgomp.texi (Support of strictly structured blocks in Fortran):
Adjust to 'Y'.
* testsuite/libgomp.fortran/task-reduction-16.f90: Adjust testcase.

(cherry picked from commit 2e4659199e814b7ee0f6bd925fd2c0a7610da856)

commit | commitdiff | tree

Tobias Burnus [Thu, 21 Oct 2021 07:28:57 +0000 (09:28 +0200)]

testsuite/libgomp.oacc-fortran/: Add -Wopenacc-parallelism

The following testcases expect the -Wopenacc-parallelism warning output
but did fail as not compiled with that warning; solution: add it.

* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: Compile
with -Wopenacc-parallelism.
* testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Likewise.

Mirror of https://gcc.gnu.org/git/gcc.git