]> git.ipfire.org Git - thirdparty/gcc.git/log
thirdparty/gcc.git
11 months agoUse OpenACC code to process OpenMP target regions devel/omp/gcc-12
Chung-Lin Tang [Fri, 19 May 2023 19:14:04 +0000 (12:14 -0700)] 
Use OpenACC code to process OpenMP target regions

This is a backport of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619003.html

This patch implements '-fopenmp-target=acc', which enables internally handling
a subset of OpenMP target regions as OpenACC parallel regions. This basically
includes target, teams, parallel, distribute, for/do constructs, and atomics.

Essentially, we adjust the internal kinds to OpenACC type, and let OpenACC code
paths handle them, with various needed adjustments throughout middle-end and
nvptx backend. When using this "OMPACC" mode, if there are cases the patch
doesn't handle, it issues a warning, and reverts to normal processing for that
target region.

gcc/ChangeLog:

* builtins.cc (expand_builtin_omp_builtins): New function.
(expand_builtin): Add expand cases for BUILT_IN_GOMP_BARRIER,
BUILT_IN_OMP_GET_THREAD_NUM, BUILT_IN_OMP_GET_NUM_THREADS,
BUILT_IN_OMP_GET_TEAM_NUM, and BUILT_IN_OMP_GET_NUM_TEAMS using
expand_builtin_omp_builtins, enabled under -fopenmp-target=acc.
* cgraphunit.cc (analyze_functions): Add call to
omp_ompacc_attribute_tagging, enabled under -fopenmp-target=acc.
* common.opt (fopenmp-target=): Add new option and enums.
* config/nvptx/mkoffload.cc (main): Handle -fopenmp-target=.
* config/nvptx/nvptx-protos.h (nvptx_expand_omp_get_num_threads): New
prototype.
(nvptx_mem_shared_p): Likewise.
* config/nvptx/nvptx.cc (omp_num_threads_sym): New global static RTX
symbol for number of threads in team.
(omp_num_threads_align): New var for alignment of omp_num_threads_sym.
(need_omp_num_threads): New bool for if any function references
omp_num_threads_sym.
(nvptx_option_override): Initialize omp_num_threads_sym/align.
(write_as_kernel): Disable normal OpenMP kernel entry under OMPACC mode.
(nvptx_declare_function_name): Disable shim function under OMPACC mode.
Disable soft-stack under OMPACC mode. Add generation of neutering init
code under OMPACC mode.
(nvptx_output_set_softstack): Return "" under OMPACC mode.
(nvptx_expand_call): Set parallelism to vector for function calls with
"ompacc for" attached.
(nvptx_expand_oacc_fork): Set mode to GOMP_DIM_VECTOR under OMPACC mode.
(nvptx_expand_oacc_join): Likewise.
(nvptx_expand_omp_get_num_threads): New function.
(nvptx_mem_shared_p): New function.
(nvptx_mach_max_workers): Return 1 under OMPACC mode.
(nvptx_mach_vector_length): Return 32 under OMPACC mode.
(nvptx_single): Add adjustments for OMPACC mode, which have
parallel-construct fork/joins, and regions of code where neutering is
dynamically determined.
(nvptx_reorg): Enable neutering under OMPACC mode when "ompacc for"
attribute is attached to function. Disable uniform-simt when under
OMPACC mode.
(nvptx_file_end): Write __nvptx_omp_num_threads out when needed.
(nvptx_goacc_fork_join): Return true under OMPACC mode.
* config/nvptx/nvptx.h (struct GTY(()) machine_function): Add
omp_parallel_predicate and omp_fn_entry_num_threads_reg fields.
* config/nvptx/nvptx.md (unspecv): Add UNSPECV_GET_TID,
UNSPECV_GET_NTID, UNSPECV_GET_CTAID, UNSPECV_GET_NCTAID,
UNSPECV_OMP_PARALLEL_FORK, UNSPECV_OMP_PARALLEL_JOIN entries.
(nvptx_shared_mem_operand): New predicate.
(gomp_barrier): New expand pattern.
(omp_get_num_threads): New expand pattern.
(omp_get_num_teams): New insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_team_num): Likewise.
(get_ntid): Likewise.
(nvptx_omp_parallel_fork): Likewise.
(nvptx_omp_parallel_join): Likewise.

* flag-types.h (omp_target_mode_kind): New flag value enum.
* gimplify.cc (struct gimplify_omp_ctx): Add 'bool ompacc' field.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE__OMPACC_.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_ctx_ompacc_p): New function.
(gimplify_omp_for): Handle combined loops under OMPACC.

* lto-wrapper.cc (append_compiler_options): Add OPT_fopenmp_target_.
* omp-builtins.def (BUILT_IN_OMP_GET_THREAD_NUM): Remove CONST.
(BUILT_IN_OMP_GET_NUM_THREADS): Likewise.
* omp-expand.cc (remove_exit_barrier): Disable addressable-var
processing for parallel construct child functions under OMPACC mode.
(expand_oacc_for): Add OMPACC mode handling.
(get_target_arguments): Force thread_limit clause value to 1 under
OMPACC mode.
(expand_omp): Under OMPACC mode, avoid child function expanding of
GIMPLE_OMP_PARALLEL.
* omp-general.cc (omp_extract_for_data): Adjustments for OMPACC mode.
* omp-low.cc (struct omp_context): Add 'bool ompacc_p' field.
(scan_sharing_clauses): Handle OMP_CLAUSE__OMPACC_.
(ompacc_ctx_p): New function.
(scan_omp_parallel): Handle OMPACC mode, avoid creating child function.
(scan_omp_target): Tag "ompacc"/"ompacc for" attributes for target
construct child function, remove OMP_CLAUSE__OMPACC_ clauses.
(lower_oacc_head_mark): Handle OMPACC mode cases.
(lower_omp_for): Adjust OMP_FOR kind from OpenMP to OpenACC kinds, add
vector/gang clauses as needed. Add other OMPACC handling.
(lower_omp_taskreg): Add call to lower_oacc_head_tail for OMPACC case.
(lower_omp_target): Do OpenACC gang privatization under OMPACC case.
(lower_omp_teams): Forward OpenACC privatization variables to outer
target region under OMPACC mode.
(lower_omp_1): Do OpenACC gang privatization under OMPACC case for
GIMPLE_BIND.
* omp-offload.cc (ompacc_supported_clauses_p): New function.
(struct target_region_data): New struct type for tree walk.
(scan_fndecl_for_ompacc): New function.
(scan_omp_target_region_r): New function.
(scan_omp_target_construct_r): New function.
(omp_ompacc_attribute_tagging): New function.
(oacc_dim_call): Add OMPACC case handling.
(execute_oacc_device_lower): Make parts explicitly only OpenACC enabled.
(pass_oacc_device_lower::gate): Enable pass under OMPACC mode.
* omp-offload.h (omp_ompacc_attribute_tagging): New prototype.
* opts.cc (finish_options): Only allow -fopenmp-target= when -fopenmp
and no -fopenacc.
* target-insns.def (gomp_barrier): New defined insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_num_threads): Likewise.
(omp_get_team_num): Likewise.
(omp_get_num_teams): Likewise.
* tree-core.h (enum omp_clause_code): Add new OMP_CLAUSE__OMPACC_ entry
for internal clause.
* tree-nested.cc (convert_nonlocal_omp_clauses): Handle
OMP_CLAUSE__OMPACC_.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE__OMPACC_.
* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE__OMPACC_ entry.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE__OMPACC__FOR): New macro for OMP_CLAUSE__OMPACC_.

* tree-ssa-loop.cc (pass_oacc_only::gate): Enable pass under OMPACC
mode cases.

libgomp/ChangeLog:

* config/nvptx/team.c (__nvptx_omp_num_threads): New global variable in
shared memory.

11 months agoLTO: Fix writing of toplevel asm with offloading [PR109816]
Tobias Burnus [Fri, 12 May 2023 14:27:40 +0000 (16:27 +0200)] 
LTO: Fix writing of toplevel asm with offloading [PR109816]

When offloading was enabled, top-level 'asm' were added to the offloading
section, confusing assemblers which did not support the syntax. Additionally,
with offloading and -flto, the top-level assembler code did not end up
in the host files.

As r14-321-g9a41d2cdbcd added top-level 'asm' to one libstdc++ header file,
the issue became more apparent, causing fails with nvptx for some
C++ testcases.

PR libstdc++/109816

gcc/ChangeLog:

* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by
'!lto_stream_offload_p'.

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-map-class-1.C: New test.
* testsuite/libgomp.c++/target-map-class-2.C: New test.

(cherry picked from commit a835f046cdf017b9e8ad5576df4f10daaf8420d0)

11 months agoImplement LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [PR109128]
Tobias Burnus [Thu, 11 May 2023 10:29:05 +0000 (12:29 +0200)] 
Implement LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [PR109128]

This is one part of the fix for PR109128, along with a corresponding
binutils's linker change.  Without this patch, what happens in the
linker, when an unused object in a .a file has offload data, is that
elf_link_is_defined_archive_symbol calls bfd_link_plugin_object_p,
which ends up calling the plugin's claim_file_handler, which then
records the object as one with offload data. That is, the linker never
decides to use the object in the first place, but use of this _p
interface (called as part of trying to decide whether to use the
object) results in the plugin deciding to use its offload data (and a
consequent mismatch in the offload data present at runtime).

The new hook allows the linker plugin to distinguish calls to
claim_file_handler that know the object is being used by the linker
(from ldmain.c:add_archive_element), from calls that don't know it's
being used by the linker (from elf_link_is_defined_archive_symbol); in
the latter case, the plugin should avoid recording the object as one
with offload data.

PR middle-end/109128

include/
* plugin-api.h (ld_plugin_claim_file_handler_v2)
(ld_plugin_register_claim_file_v2)
(LDPT_REGISTER_CLAIM_FILE_HOOK_V2): New.
(struct ld_plugin_tv): Add tv_register_claim_file_v2.

lto-plugin/
* lto-plugin.c (register_claim_file_v2): New.
(claim_file_handler_v2): New.
(claim_file_handler): Wrap claim_file_handler_v2.
(onload): Handle LDPT_REGISTER_CLAIM_FILE_HOOK_V2.

(cherry picked from commit c49d51fa8134f6c7e6c7cf6e4e3007c4fea617c5)

11 months agoopenmp: Fix initialization for 'unroll full'
Frederik Harwath [Thu, 4 May 2023 16:09:23 +0000 (18:09 +0200)] 
openmp: Fix initialization for 'unroll full'

The index variable initialization for the 'omp unroll'
directive with 'full' clause got lost and the testsuite
did not catch it.

Add the initialization and add -Wall to some tests
to detect uninitialized variable uses and other
potential problems in the code generation.

gcc/ChangeLog:

* omp-transform-loops.cc (full_unroll): Add initialization of index variable.

libgomp/ChangeLog:

* testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c:
Use -Wall and add -Wno-unknown-pragmas to disable warnings about empty pragmas.
Use -O2.
* testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C:
Copy of testsuite/libgomp.c-c++-common/matrix-no-directive-unroll-full-1.c,
but using -O0 which works only for C++.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c: Use -Wall
and use -Wno-unknown-pragmas to disable warnings about empty pragmas.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c:
Likewise and fix broken function calls found by -Wall.

12 months agoamdgcn: Fix addsub bug
Andrew Stubbs [Wed, 26 Apr 2023 14:23:48 +0000 (15:23 +0100)] 
amdgcn: Fix addsub bug

The vec_fmsubadd instuction actually had add twice, by mistake.

Also improve code-gen for all the complex patterns by using properly
undefined values.  Mostly this just prevents the compiler reserving space
in the stack frame.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (cmul<conj_op><mode>3): Use gcn_gen_undef.
(cml<addsub_as><mode>4): Likewise.
(vec_addsub<mode>3): Likewise.
(cadd<rot><mode>3): Likewise.
(vec_fmaddsub<mode>4): Likewise.
(vec_fmsubadd<mode>4): Likewise, and use sub for the odd lanes.

12 months agoopenmp: Fix loop transformation tests
Frederik Harwath [Tue, 25 Apr 2023 13:32:56 +0000 (15:32 +0200)] 
openmp: Fix loop transformation tests

libgomp/ChangeLog:

* testsuite/libgomp.fortran/loop-transforms/tile-2.f90: Add reduction clause.
* testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: Initialize var.
* testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: Add reduction
and initialization.

12 months agoamdgcn: bug fix ldexp insn
Andrew Stubbs [Thu, 20 Apr 2023 10:11:13 +0000 (11:11 +0100)] 
amdgcn: bug fix ldexp insn

The vop3 instructions don't support B constraint immediates.
Also, take the use the SV_FP iterator to delete a redundant pattern.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (vnsi, VnSI): Add scalar modes.
(ldexp<mode>3): Delete.
(ldexp<mode>3<exec>): Change "B" to "A".

12 months agoamdgcn: update target-supports.exp
Andrew Stubbs [Tue, 18 Apr 2023 11:03:43 +0000 (12:03 +0100)] 
amdgcn: update target-supports.exp

The backend can now vectorize more things.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_vect_call_copysignf): Add amdgcn.
(check_effective_target_vect_call_sqrtf): Add amdgcn.
(check_effective_target_vect_call_ceilf): Add amdgcn.
(check_effective_target_vect_call_floor): Add amdgcn.
(check_effective_target_vect_logical_reduc): Add amdgcn.

12 months agoamdgcn, openmp: Fix concurrency in low-latency allocator
Andrew Stubbs [Wed, 19 Apr 2023 16:33:41 +0000 (17:33 +0100)] 
amdgcn, openmp: Fix concurrency in low-latency allocator

The previous code works fine on Fiji and Vega 10 devices, but bogs down in The
spin locks on Vega 20 or newer.  Adding the sleep instructions fixes the
problem.

libgomp/ChangeLog:

* basic-allocator.c (basic_alloc_free): Use BASIC_ALLOC_YIELD.
(basic_alloc_realloc): Use BASIC_ALLOC_YIELD.

12 months agoAdd missing changelog entries
Andrew Stubbs [Thu, 20 Apr 2023 11:20:47 +0000 (12:20 +0100)] 
Add missing changelog entries

12 months agoamdgcn: HardFP divide
Andrew Stubbs [Fri, 14 Apr 2023 16:05:15 +0000 (17:05 +0100)] 
amdgcn: HardFP divide

Implement FP division using hardware instructions. This replaces both the
softfp library calls, and the --fast-math inaccurate divsion we had previously.

The GCN architecture does not have a single divide instruction, but it does
have a number of support instructions designed to make multiply-by-reciprocal
sufficiently accurate for non-fast-math usage.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (SV_SFDF): New iterator.
(SV_FP): New iterator.
(scalar_mode, SCALAR_MODE): Add identity mappings for scalar modes.
(recip<mode>2): Unify the two patterns using SV_FP.
(div_scale<mode><exec_vcc>): New insn.
(div_fmas<mode><exec>): New insn.
(div_fixup<mode><exec>): New insn.
(div<mode>3): Unify the two expanders and rewrite using hardfp.
* config/gcn/gcn.cc (gcn_md_reorg): Support "vccwait" attribute.
* config/gcn/gcn.md (unspec): Add UNSPEC_DIV_SCALE, UNSPEC_DIV_FMAS,
and UNSPEC_DIV_FIXUP.
(vccwait): New attribute.

gcc/testsuite/ChangeLog:

* gcc.target/gcn/fpdiv.c: Remove the -ffast-math requirement.

(cherry picked from commit cfdc45f73c56ad051a53576a4e88675ced2660d4)

12 months agoif-conv: Restore MASK_CALL conversion [PR108888]
Andre Vieira [Tue, 11 Apr 2023 09:07:43 +0000 (10:07 +0100)] 
if-conv: Restore MASK_CALL conversion [PR108888]

The original patch to fix this PR broke the if-conversion of calls into
IFN_MASK_CALL.  This patch restores that original behaviour and makes sure the
tests added earlier specifically test inbranch SIMD clones.

gcc/ChangeLog:

PR tree-optimization/108888
* tree-if-conv.cc (predicate_statements): Fix gimple call check.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-simd-clone-16.c: Make simd clone inbranch only.
* gcc.dg/vect/vect-simd-clone-17.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18.c: Likewise.

(cherry picked from commit 58c8c1b383bc3c286d6527fc6e8fb62463f9a877)

12 months ago[og12] OpenMP: Fix checking ICE in "declare target" ctor/dtor support
Julian Brown [Tue, 4 Apr 2023 18:32:16 +0000 (18:32 +0000)] 
[og12] OpenMP: Fix checking ICE in "declare target" ctor/dtor support

This patch fixes an ICE with checking enabled with the patch:

  abcb5dbac666513c798e574808f849f76a1c0799

There were two problems: first, OMP_CLAUSE_CHAIN was erroneously
used as the chain pointer instead of TREE_CHAIN for a non-OMP clause
list. Secondly, "copy_node" by itself is not sufficient to clone the
initialization statement for use in the on-target constructor/destructor
function. Instead we now use walk_tree with "copy_tree_body_r" and
appropriate configuration parameters.

2023-04-05  Julian Brown  <julian@codesourcery.com>

gcc/cp/
* decl2.cc (tree-inline.h): Include.
(do_static_initialization_or_destruction): Change OMP_TARGET parameter
to pass the host version of the SSDF function decl.  Use
copy_tree_body_r to clone init stmt.  Update forward declaration.
(c_parse_final_cleanups): Update calls to
do_static_initialization_or_destruction.  Use TREE_CHAIN instead of
OMP_CLAUSE_CHAIN.

12 months agoamdgcn: Add 64-bit vector not
Andrew Stubbs [Mon, 3 Apr 2023 11:16:11 +0000 (12:16 +0100)] 
amdgcn: Add 64-bit vector not

gcc/ChangeLog:

* config/gcn/gcn-valu.md (one_cmpl<mode>2<exec>): New.

12 months ago'-foffload-memory=pinned' using offloading device interfaces for non-contiguous array...
Thomas Schwinge [Thu, 30 Mar 2023 08:08:12 +0000 (10:08 +0200)] 
'-foffload-memory=pinned' using offloading device interfaces for non-contiguous array support

Changes related to og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f
"Merge non-contiguous array support patches".

libgomp/
* target.c (gomp_map_vars_internal)
<non-contiguous array support>: Handle 'always_pinned_mode'.

12 months ago'-foffload-memory=pinned' using offloading device interfaces
Thomas Schwinge [Thu, 30 Mar 2023 08:08:12 +0000 (10:08 +0200)] 
'-foffload-memory=pinned' using offloading device interfaces

Implemented for nvptx offloading via 'cuMemHostAlloc', 'cuMemHostRegister'.

gcc/
* doc/invoke.texi (-foffload-memory=pinned): Document.
include/
* cuda/cuda.h (CUresult): Add
'CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED'.
(CUdevice_attribute): Add
'CU_DEVICE_ATTRIBUTE_READ_ONLY_HOST_REGISTER_SUPPORTED'.
(CU_MEMHOSTREGISTER_READ_ONLY): Add.
(cuMemHostGetFlags, cuMemHostRegister, cuMemHostUnregister): Add.
libgomp/
* libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_free): Add
'struct goacc_asyncqueue *' formal parameter.
(GOMP_OFFLOAD_page_locked_host_register)
(GOMP_OFFLOAD_page_locked_host_unregister)
(GOMP_OFFLOAD_page_locked_host_p): Add.
* libgomp.h (always_pinned_mode)
(gomp_page_locked_host_register_dev)
(gomp_page_locked_host_unregister_dev): Add.
(struct splay_tree_key_s): Add 'page_locked_host_p'.
(struct gomp_device_descr): Add
'GOMP_OFFLOAD_page_locked_host_register',
'GOMP_OFFLOAD_page_locked_host_unregister',
'GOMP_OFFLOAD_page_locked_host_p'.
* libgomp.texi (-foffload-memory=pinned): Document.
* plugin/cuda-lib.def (cuMemHostGetFlags, cuMemHostRegister_v2)
(cuMemHostRegister, cuMemHostUnregister): Add.
* plugin/plugin-nvptx.c (struct ptx_device): Add
'read_only_host_register_supported'.
(nvptx_open_device): Initialize it.
(free_host_blocks, free_host_blocks_lock)
(nvptx_run_deferred_page_locked_host_free)
(nvptx_page_locked_host_free_callback, nvptx_page_locked_host_p)
(GOMP_OFFLOAD_page_locked_host_register)
(nvptx_page_locked_host_unregister_callback)
(GOMP_OFFLOAD_page_locked_host_unregister)
(GOMP_OFFLOAD_page_locked_host_p)
(nvptx_run_deferred_page_locked_host_unregister)
(nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback):
Add.
(GOMP_OFFLOAD_fini_device, GOMP_OFFLOAD_page_locked_host_alloc)
(GOMP_OFFLOAD_run): Call
'nvptx_run_deferred_page_locked_host_free'.
(struct goacc_asyncqueue): Add
'page_locked_host_unregister_blocks_lock',
'page_locked_host_unregister_blocks'.
(nvptx_goacc_asyncqueue_construct)
(nvptx_goacc_asyncqueue_destruct): Handle those.
(GOMP_OFFLOAD_page_locked_host_free): Handle
'struct goacc_asyncqueue *' formal parameter.
(GOMP_OFFLOAD_openacc_async_test)
(nvptx_goacc_asyncqueue_synchronize): Call
'nvptx_run_deferred_page_locked_host_unregister'.
(GOMP_OFFLOAD_openacc_async_serialize): Call
'nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback'.
* config/linux/allocator.c (linux_memspace_alloc)
(linux_memspace_calloc, linux_memspace_free)
(linux_memspace_realloc): Remove 'always_pinned_mode' handling.
(GOMP_enable_pinned_mode): Move...
* target.c: ... here.
(always_pinned_mode, verify_always_pinned_mode)
(gomp_verify_always_pinned_mode, gomp_page_locked_host_alloc_dev)
(gomp_page_locked_host_free_dev)
(gomp_page_locked_host_aligned_alloc_dev)
(gomp_page_locked_host_aligned_free_dev)
(gomp_page_locked_host_register_dev)
(gomp_page_locked_host_unregister_dev): Add.
(gomp_copy_host2dev, gomp_map_vars_internal)
(gomp_remove_var_internal, gomp_unmap_vars_internal)
(get_gomp_offload_icvs, gomp_load_image_to_device)
(gomp_target_rev, omp_target_memcpy_copy)
(omp_target_memcpy_rect_worker): Handle 'always_pinned_mode'.
(gomp_copy_host2dev, gomp_copy_dev2host): Handle
'verify_always_pinned_mode'.
(GOMP_target_ext): Add 'assert'.
(gomp_page_locked_host_alloc): Use
'gomp_page_locked_host_alloc_dev'.
(gomp_page_locked_host_free): Use
'gomp_page_locked_host_free_dev'.
(omp_target_associate_ptr): Adjust.
(gomp_load_plugin_for_device): Handle 'page_locked_host_register',
'page_locked_host_unregister', 'page_locked_host_p'.
* oacc-mem.c (memcpy_tofrom_device): Handle 'always_pinned_mode'.
* libgomp_g.h (GOMP_enable_pinned_mode): Adjust.
* testsuite/libgomp.c/alloc-pinned-7.c: Remove.

12 months agoOpenACC: Pass pre-allocated 'ptrblock' to 'goacc_noncontig_array_create_ptrblock...
Thomas Schwinge [Wed, 15 Mar 2023 13:32:12 +0000 (14:32 +0100)] 
OpenACC: Pass pre-allocated 'ptrblock' to 'goacc_noncontig_array_create_ptrblock' [PR76739]

... to simplify later changes.  No functional change.

Follow-up for og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f
"Merge non-contiguous array support patches".

PR other/76739
libgomp/
* target.c (gomp_map_vars_internal): Pass pre-allocated 'ptrblock'
to 'goacc_noncontig_array_create_ptrblock'.
* oacc-parallel.c (goacc_noncontig_array_create_ptrblock): Adjust.
* oacc-int.h (goacc_noncontig_array_create_ptrblock): Adjust.

12 months agolibgomp: Document OpenMP 'pinned' memory
Thomas Schwinge [Fri, 24 Mar 2023 14:14:57 +0000 (15:14 +0100)] 
libgomp: Document OpenMP 'pinned' memory

libgomp/
* libgomp.texi (AMD Radeon, nvptx): Document OpenMP 'pinned'
memory.

12 months agoResolve 'error: unused parameter' in 'gcc/cp/decl2.cc:one_static_initialization_or_de...
Thomas Schwinge [Sun, 2 Apr 2023 09:13:33 +0000 (11:13 +0200)] 
Resolve 'error: unused parameter' in 'gcc/cp/decl2.cc:one_static_initialization_or_destruction'

    [...]/gcc/cp/decl2.cc: In function ‘void one_static_initialization_or_destruction(tree, tree, bool, bool)’:
    [...]/gcc/cp/decl2.cc:4171:48: error: unused parameter ‘omp_target’ [-Werror=unused-parameter]
     4171 |                                           bool omp_target)
          |                                           ~~~~~^~~~~~~~~~
    cc1plus: all warnings being treated as errors
    make[3]: *** [cp/decl2.o] Error 1

Fix-up for og12 commit abcb5dbac666513c798e574808f849f76a1c0799
"[og12] OpenMP: Constructors and destructors for "declare target" static aggregates".

gcc/cp/
* decl2.cc (one_static_initialization_or_destruction): Remove
'omp_target' formal parameter.  Adjust all users.

13 months agoopenmp: Handle GIMPLE_OMP_METADIRECTIVE in walk_omp_for_loops
Frederik Harwath [Thu, 30 Mar 2023 14:22:07 +0000 (16:22 +0200)] 
openmp: Handle GIMPLE_OMP_METADIRECTIVE in walk_omp_for_loops

gcc/ChangeLog:

* omp-transform-loops.cc (walk_omp_for_loops): Handle
 GIMPLE_OMP_METADIRECTIVE.

13 months agoAdd missing ChangeLog.omp entries
Frederik Harwath [Tue, 28 Mar 2023 14:42:14 +0000 (16:42 +0200)] 
Add missing ChangeLog.omp entries

... for commits b35b06e9d39..09f39bb1141

13 months agoAdd missing ChangeLog.omp entries
Julian Brown [Tue, 28 Mar 2023 13:19:27 +0000 (13:19 +0000)] 
Add missing ChangeLog.omp entries

13 months ago[og12] OpenMP: Constructors and destructors for "declare target" static aggregates
Julian Brown [Wed, 22 Mar 2023 22:57:33 +0000 (22:57 +0000)] 
[og12] OpenMP: Constructors and destructors for "declare target" static aggregates

This patch adds support for running constructors and destructors for
static (file-scope) aggregates for C++ objects which are marked with
"declare target" directives on OpenMP offload targets.

At present, space is allocated on the target for such aggregates, but
nothing ever constructs them properly, so they end up zero-initialised.

Tested with offloading to AMD GCN. I will apply to the og12 branch
shortly.

ChangeLog

2023-03-27  Julian Brown  <julian@codesourcery.com>

gcc/cp/
* decl2.cc (priority_info): Add omp_tgt_initializations_p and
omp_tgt_destructions_p.
(start_objects, start_static_storage_duration_function,
do_static_initialization_or_destruction,
one_static_initialization_or_destruction,
generate_ctor_or_dtor_function): Add 'omp_target' parameter.  Support
"declare target" decls. Update forward declarations.
(OMP_SSDF_IDENTIFIER): New macro.
(omp_tgt_ssdf_decls): New vec.
(get_priority_info): Initialize omp_tgt_initializations_p and
omp_tgt_destructions_p fields.
(handle_tls_init): Update call to
omp_static_initialization_or_destruction.
(c_parse_final_cleanups): Support constructors/destructors on OpenMP
offload targets.

gcc/
* omp-builtins.def (BUILT_IN_OMP_IS_INITIAL_DEVICE): New builtin.
* tree.cc (get_file_function_name): Support names for on-target
constructor/destructor functions.

libgomp/
* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: New
test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C: New
test.

13 months agoopenmp: Add C/C++ support for loop transformations on inner loops
Frederik Harwath [Fri, 24 Mar 2023 17:20:08 +0000 (18:20 +0100)] 
openmp: Add C/C++ support for loop transformations on inner loops

Add the parsing of loop transformations on inner loops of a loop-nest.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_nested_loop_transform_clauses):
Add argument for the level of loop-nest at which the clauses
appear, ...
(c_parser_omp_tile): ... adjust use here,
(c_parser_omp_unroll): ... and here,
(c_parser_omp_for_loop): ... and here.  Stop treating loop
transformations like intervening code, parse them, and adjust
the loop-nest depth if necessary for tiling.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_is_pragma): New function.
(cp_parser_omp_nested_loop_transform_clauses):
Add argument for the level of loop-nest at which the clauses
appear, ...
(cp_parser_omp_tile): ... adjust use here,
(cp_parser_omp_unroll): ... and here,
(cp_parser_omp_for_loop): ... and here.  Stop treating loop

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/loop-transforms/unroll-inner-1.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-inner-2.c: New test.

libgomp/ChangeLog
* testsuite/libgomp.c++/loop-transforms/tile-1.C: Deleted, replaced by
matrix-* tests.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h:
New header file for new tests.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c:
New test.

13 months agoopenmp: Add Fortran support for loop transformations on inner loops
Frederik Harwath [Fri, 24 Mar 2023 17:29:51 +0000 (18:29 +0100)] 
openmp: Add Fortran support for loop transformations on inner loops

So far the implementation of the "omp tile" and "omp unroll"
directives restricted their use to the outermost loop of a loop-nest.
This commit changes the Fortran front end to parse and verify the
directives on inner loops.  The transformation clauses are extended to
carry the information about the level of the loop-nest at which a
transformation should be applied.  The middle end transformation pass
is adjusted to apply the transformations at the right level of a loop
nest and to take their effect on the loop nest depth into account.

gcc/fortran/ChangeLog:

* openmp.cc (omp_unroll_removes_loop_nest): Move down in file.
(resolve_loop_transform_generic): Remove, and ...
(resolve_omp_unroll): ... inline and adapt here. Move function.
Move functin.
(find_nested_loop_in_block): New function.
(find_nested_loop_in_chain): New function, used ...
(is_outer_iteration_variable): ... here, and ...
(expr_is_invariant): ... here.
(resolve_omp_do): Adjust code for resolving loop transformations.
(resolve_omp_tile): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_TRANSFROM_LEVEL
on new clause.
(compute_transformed_depth): New function to compute the depth
("collapse") of a transformed loop nest, used
(gfc_trans_omp_do): ... here.

gcc/ChangeLog:

* omp-transform-loops.cc (gimple_assign_rhs_to_tree): Fix type
in comment.
(gomp_for_uncollapse): Adjust "collapse" value after uncollapse.
(partial_unroll): Add argument for the loop nest level to be transformed.
(tile): Likewise.
(transform_gomp_for): Pass level to transformatoin functions.
(optimize_transformation_clauses): Handle transformation clauses for all
levels recursively.
* tree-pretty-print.cc (dump_omp_clause): Print
OMP_CLAUSE_TRANSFORM_LEVEL for OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE.
* tree.cc: Increase number of operands of OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TRANSFORM_LEVEL): New macro to access
clause operand 0.
(OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Use operand 1 instead of 0.
(OMP_CLAUSE_TILE_SIZES): Likewise.

gcc/cp/ChangeLog

* parser.cc (cp_parser_omp_clause_unroll_full): Set new
OMP_CLAUSE_TRANSFORM_LEVEL operand to default value.
(cp_parser_omp_clause_unroll_partial): Likewise.
(cp_parser_omp_tile_sizes): Likewise.
(cp_parser_omp_loop_transform_clause): Likewise.
(cp_parser_omp_nested_loop_transform_clauses): Likewise.
(cp_parser_omp_unroll): Likewise.
* pt.cc (tsubst_omp_clauses): Adjust OMP_CLAUSE_UNROLL_PARTIAL
and OMP_CLAUSE_TILE handling to changed number of operands.

gcc/c/ChangeLog

* c-parser.cc (c_parser_omp_clause_unroll_full): Set new
OMP_CLAUSE_TRANSFORM_LEVEL operand to default value.
(c_parser_omp_clause_unroll_partial): Likewise.
(c_parser_omp_tile_sizes): Likewise.
(c_parser_omp_loop_transform_clause): Likewise.
(c_parser_omp_nested_loop_transform_clauses): Likewise.
(c_parser_omp_unroll): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/loop-transforms/unroll-8.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-9.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/inner-loops.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-3.f90: Adapt to
changed diagnostic messages.

libgomp/ChangeLog:
* testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New test.

13 months agoopenmp: Add C/C++ support for "omp tile"
Frederik Harwath [Fri, 24 Mar 2023 17:18:03 +0000 (18:18 +0100)] 
openmp: Add C/C++ support for "omp tile"

This commit adds the C and C++ front end support for the "omp tile"
directive.  The middle end support for the transformation is
implemented in a previous commit.

gcc/c-family/ChangeLog:

* c-omp.cc (c_omp_directives): Add PRAGMA_OMP_TILE.
* c-pragma.cc (omp_pragmas_simd): Likewise.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_TILE

gcc/c/ChangeLog:

* c-parser.cc (c_parser_nested_omp_unroll_clauses): Rename and
generalize ...
(c_parser_omp_nested_loop_transform_clauses): ... to this.
(c_parser_omp_for_loop): Handle "omp tile" parsing in loop nests.
(c_parser_omp_tile_sizes): Parse single "sizes" clause.
(c_parser_omp_loop_transform_clause): New function.
(c_parser_omp_tile): New function for parsing "omp tile"
(c_parser_omp_unroll): Adjust to renaming.
(c_parser_omp_construct): Handle PRAGMA_OMP_TILE.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_clause_unroll_partial): Adjust.
(cp_parser_nested_omp_unroll_clauses): Rename ...
(cp_parser_omp_nested_loop_transform_clauses): ... to this.
(cp_parser_omp_for_loop): Handle "omp tile" parsing in loop nests.
(cp_parser_omp_tile_sizes): New function, parses single "sizes" clause
(cp_parser_omp_tile): New function for parsing "omp tile".
(cp_parser_omp_loop_transform_clause): New  function.
(cp_parser_omp_unroll): Adjust to renaming.
(cp_parser_omp_construct): Handle PRAGMA_OMP_TILE.
(cp_parser_pragma): Likewise.
* pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_TILE.
* semantics.cc (finish_omp_clauses): Likewise.

gcc/ChangeLog:

* gimplify.cc (omp_for_drop_tile_clauses): New function, ...
(gimplify_omp_for): ... used here.

libgomp/ChangeLog:

* testsuite/libgomp.c++/loop-transforms/tile-1.C: New test.
* testsuite/libgomp.c++/loop-transforms/tile-2.C: New test.
* testsuite/libgomp.c++/loop-transforms/tile-3.C: New test.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/loop-transforms/tile-1.c: New test.
* c-c++-common/gomp/loop-transforms/tile-2.c: New test.
* c-c++-common/gomp/loop-transforms/tile-3.c: New test.
* c-c++-common/gomp/loop-transforms/tile-4.c: New test.
* c-c++-common/gomp/loop-transforms/tile-5.c: New test.
* c-c++-common/gomp/loop-transforms/tile-6.c: New test.
* c-c++-common/gomp/loop-transforms/tile-7.c: New test.
* c-c++-common/gomp/loop-transforms/tile-8.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-2.c: Adapt
to changed diagnostic messages.
* g++.dg/gomp/loop-transforms/tile-1.h: New test.
* g++.dg/gomp/loop-transforms/tile-1a.C: New test.
* g++.dg/gomp/loop-transforms/tile-1b.C: New test.

13 months agoopenmp: Add Fortran support for "omp tile"
Frederik Harwath [Fri, 24 Mar 2023 17:30:08 +0000 (18:30 +0100)] 
openmp: Add Fortran support for "omp tile"

This commit implements the Fortran front end support for the "omp
tile" directive and the corresponding middle end transformation.

gcc/fortran/ChangeLog:

* gfortran.h (enum gfc_statement): Add ST_OMP_TILE, ST_OMP_END_TILE.
(enum gfc_exec_op): Add EXEC_OMP_TILE.
(loop_transform_p): New declaration.
(struct gfc_omp_clauses): Add "tile_sizes" field.
* dump-parse-tree.cc (show_omp_clauses): Handle "tile_sizes" dumping.
(show_omp_node): Handle EXEC_OMP_TILE.
(show_code_node): Likewise.
* match.h (gfc_match_omp_tile): New declaration.
* openmp.cc (gfc_free_omp_clauses): Free "tile_sizes" field.
(match_tile_sizes): New function.
(OMP_TILE_CLAUSES): New macro.
(gfc_match_omp_tile): New function.
(resolve_omp_do): Handle EXEC_OMP_TILE.
(resolve_omp_tile): New function.
(omp_code_to_statement): Handle EXEC_OMP_TILE.
(gfc_resolve_omp_directive): Likewise.
* parse.cc (decode_omp_directive): Handle ST_OMP_END_TILE
and ST_OMP_TILE.
(next_statement): Handle ST_OMP_TILE.
(gfc_ascii_statement): Likewise.
(parse_omp_do): Likewise.
(parse_executable): Likewise.
* resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_TILE.
(gfc_resolve_code): Likewise.
* st.cc (gfc_free_statement): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle "tile_sizes" field.
(loop_transform_p): New function.
(gfc_expr_list_len): New function.
(gfc_trans_omp_do): Handle EXEC_OMP_TILE.
(gfc_trans_omp_directive): Likewise.
* trans.cc (trans_code): Likewise.

gcc/ChangeLog:

* gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_TILE.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_loop): Likewise.
* omp-transform-loops.cc (walk_omp_for_loops): New declaration.
(subst_var_in_op): New function.
(subst_var): New function.
(gomp_for_number_of_iterations): Adjust.
(gomp_for_iter_count_type): New function.
(gimple_assign_rhs_to_tree): New function.
(subst_defs): New function.
(gomp_for_uncollapse): Adjust.
(transformation_clause_p): Add OMP_CLAUSE_TILE.
(tile): New function.
(transform_gomp_for): Handle OMP_CLAUSE_TILE.
(optimize_transformation_clauses): Handle OMP_CLAUSE_TILE.
* omp-general.cc (omp_loop_transform_clause_p): Add
OMP_CLAUSE_TILE.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_TILE.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_TILE.
* tree.cc: Add OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TILE_SIZES): New macro.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/loop-transforms/tile-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-2.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/loop-transforms/tile-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-1a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-4.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: New test.

13 months agoopenacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILE
Frederik Harwath [Fri, 24 Mar 2023 17:30:21 +0000 (18:30 +0100)] 
openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILE

OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation
construct "omp tile".

gcc/ChangeLog:

* tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TILE_LIST): Rename to ...
(OMP_CLAUSE_OACC_TILE_LIST): ... this.
(OMP_CLAUSE_TILE_ITERVAR): Rename to ...
(OMP_CLAUSE_OACC_TILE_ITERVAR): ... this.
(OMP_CLAUSE_TILE_COUNT): Rename to ...
(OMP_CLAUSE_OACC_TILE_COUNT): this.
* gimplify.cc (gimplify_scan_omp_clauses): Adjust to renamings.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_for): Likewise.
* omp-general.cc (omp_extract_for_data): Likewise.
* omp-low.cc (scan_sharing_clauses): Likewise.
(lower_oacc_head_mark): Likewise.
* tree-nested.cc (convert_nonlocal_omp_clauses): Likewise.
(convert_local_omp_clauses): Likewise.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
* tree.cc: Likewise.

gcc/c-family/ChangeLog:

* c-omp.cc (c_oacc_split_loop_clauses): Adjust to renamings.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_clause_collapse): Adjust to renamings.
(c_parser_oacc_clause_tile): Likewise.
(c_parser_omp_for_loop): Likewise.
* c-typeck.cc (c_finish_omp_clauses): Likewise.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_oacc_clause_tile): Adjust to renamings.
(cp_parser_omp_clause_collapse): Likewise.
(cp_parser_omp_for_loop): Likewise.
* pt.cc (tsubst_omp_clauses): Likewise.
* semantics.cc (finish_omp_clauses): Likewise.
(finish_omp_for): Likewise.

gcc/fortran/ChangeLog:

* openmp.cc (enum omp_mask2): Adjust to renamings.
(gfc_match_omp_clauses): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Likewise.

13 months agoopenmp: Add C/C++ support for "omp unroll" directive
Frederik Harwath [Fri, 24 Mar 2023 17:14:23 +0000 (18:14 +0100)] 
openmp: Add C/C++ support for "omp unroll" directive

This commit implements the C and the C++ front end changes to support
the "omp unroll" directive.  The execution of the loop transformation
relies on the pass that has been added as a part of the earlier
Fortran patch.

gcc/c-family/ChangeLog:

* c-gimplify.cc (c_genericize_control_stmt): Handle OMP_UNROLL.
* c-omp.cc: Add "unroll" to omp_directives[].
* c-pragma.cc: Add "unroll" to omp_pragmas_simd[].
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_UNROLL to
pragma_kind and adjust PRAGMA_OMP__LAST_.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FULL and
PRAGMA_OMP_CLAUSE_PARTIAL.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_clause_name): Handle "full" and
"partial" clauses.
(check_no_duplicate_clause): Change return type to bool and
return check result.
(c_parser_omp_clause_unroll_full): New function for parsing
the "unroll clause".
(c_parser_omp_clause_unroll_partial): New function for
parsing the "partial" clause.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FULL
and PRAGMA_OMP_CLAUSE_PARTIAL.
(c_parser_nested_omp_unroll_clauses): New function for parsing
"omp unroll" directives following another directive.
(OMP_UNROLL_CLAUSE_MASK): New definition.
(c_parser_omp_unroll): New function for parsing "omp unroll"
loops that are not associated with another directive.
(c_parser_omp_construct): Handle PRAGMA_OMP_UNROLL.
* c-typeck.cc (c_finish_omp_clauses): Handle
OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL,
and OMP_CLAUSE_UNROLL_NONE.

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_gimplify_expr): Handle OMP_UNROLL.
(cp_fold_r): Likewise.
(cp_genericize_r): Likewise.
* parser.cc (cp_parser_omp_clause_name): Handle "full" clause.
(check_no_duplicate_clause): Change return type to bool and
return check result.
(cp_parser_omp_clause_unroll_full): New function for parsing
the "unroll clause".
(cp_parser_omp_clause_unroll_partial): New function for
parsing the "partial" clause.
(cp_parser_omp_all_clauses): Handle OMP_CLAUSE_UNROLL and
OMP_CLAUSE_FULL.
(cp_parser_nested_omp_unroll_clauses): New function for parsing
"omp unroll" directives following another directive.
(cp_parser_omp_for_loop): Handle "omp unroll" directives
between directive and loop.
(OMP_UNROLL_CLAUSE_MASK): New definition.
(cp_parser_omp_unroll): New function for parsing "omp unroll"
loops that are not associated with another directive.

(cp_parser_omp_construct): Handle PRAGMA_OMP_UNROLL.
(cp_parser_pragma): Handle PRAGMA_OMP_UNROLL.
* pt.cc (tsubst_omp_clauses): Handle
OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_FULL, and
OMP_CLAUSE_UNROLL_NONE.
(tsubst_expr): Handle OMP_UNROLL.
* semantics.cc (finish_omp_clauses): Handle
OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL,
and OMP_CLAUSE_UNROLL_NONE.

libgomp/ChangeLog:

* testsuite/libgomp.c++/loop-transforms/unroll-1.C: New test.
* testsuite/libgomp.c++/loop-transforms/unroll-2.C: New test.
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c: New test.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/loop-transforms/unroll-1.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-2.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-3.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-4.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-5.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-6.c: New test.
* g++.dg/gomp/loop-transforms/unroll-1.C: New test.
* g++.dg/gomp/loop-transforms/unroll-2.C: New test.
* g++.dg/gomp/loop-transforms/unroll-3.C: New test.

13 months agoopenmp: Add Fortran support for "omp unroll" directive
Frederik Harwath [Fri, 24 Mar 2023 17:11:57 +0000 (18:11 +0100)] 
openmp: Add Fortran support for "omp unroll" directive

This commit implements the OpenMP 5.1 "omp unroll" directive for
Fortran. The Fortran front end changes encompass the parsing and the
verification of nesting restrictions etc. The actual loop
transformation is implemented in a new language-independent
"omp_transform_loops" pass which runs before omp lowering.  No attempt
is made to re-use existing unrolling optimizations because a separate
implementation allows for better control of the unrolling. The new
pass will also serve as a foundation for the implementation of further
OpenMP loop transformations. This commit only implements the support
for "omp unroll" on the outermost loop of a loop nest.  The support
for inner loops will be added later.

gcc/ChangeLog:

* Makefile.in: Add omp_transform_loops.o.
* gimple-pretty-print.cc (dump_gimple_omp_for): Handle "full"
and "partial" clauses.
* gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_TRANSFORM_LOOP.
* gimplify.cc (is_gimple_stmt): Handle OMP_UNROLL.
(gimplify_scan_omp_clauses): Handle OMP_UNROLL_FULL,
OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL.
(gimplify_adjust_omp_clauses): Handle OMP_UNROLL_FULL,
OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL.
(gimplify_omp_for): Handle OMP_UNROLL.
(gimplify_expr): Likewise.
* params.opt: Add omp-unroll-full-max-iteration and
omp-unroll-default-factor.
* passes.def: Add pass_omp_transform_loop before
pass_lower_omp.
* tree-core.h (enum omp_clause_code): Add
OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and
OMP_CLAUSE_UNROLL_PARTIAL.
* tree-pass.h (make_pass_omp_transform_loops): Declare
pmake_pass_omp_transform_loops.
* tree-pretty-print.cc (dump_omp_clause): Handle
OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and
OMP_CLAUSE_UNROLL_PARTIAL.
(dump_generic_node): Handle OMP_UNROLL.
* tree.cc (omp_clause_num_ops): Add number of operators
for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and
OMP_CLAUSE_UNROLL_PARTIAl.
(omp_clause_code_names): Add name strings for
OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and
OMP_CLAUSE_UNROLL_PARTIAL.
* tree.def (OMP_UNROLL): Define.
* tree.h (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Define.
* omp-transform-loops.cc: New file.
* omp-general.cc (omp_loop_transform_clause_p): New function.
* omp-general.h (omp_loop_transform_clause_p): New declaration.

gcc/fortran/ChangeLog:

* dump-parse-tree.cc (show_omp_clauses): Handle "unroll full"
and "unroll partial".
(show_omp_node): Handle OMP_UNROLL.
(show_code_node): Handle EXEC_OMP_UNROLL.
* gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL, ST_OMP_END_UNROLL.
(enum gfc_exec_op): Add EXEC_OMP_UNROLL.
* match.h (gfc_match_omp_unroll): Declare.
* openmp.cc (enum omp_mask2): Add OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL.
(gfc_match_omp_clauses): Handle "omp unroll partial".
(OMP_UNROLL_CLAUSES): New macro definition.
(gfc_match_omp_unroll): Match "full" clause.
(omp_unroll_removes_loop_nest): New function.
(resolve_omp_unroll): New function.
(resolve_omp_do): Accept and verify "omp unroll"
directives between directive and loop.
(omp_code_to_statement): Handle EXEC_OMP_UNROLL.
(gfc_resolve_omp_directive): Likewise.
* parse.cc (decode_omp_directive): Handle "undroll" and "end unroll".
(next_statement): Handle ST_OMP_UNROLL.
(gfc_ascii_statement): Handle ST_OMP_UNROLL and ST_OMP_END_UNROLL.
(parse_omp_do): Accept ST_OMP_UNROLL and ST_OMP_END_UNROLL
before/after loop.
(parse_executable): Handle ST_OMP_UNROLL.
* resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_UNROLL.
(gfc_resolve_code): Likewise.
* st.cc (gfc_free_statement): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle unroll clauses.
(gfc_trans_omp_do): Handle OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_NONE creation.
(gfc_trans_omp_directive): Handle EXEC_OMP_UNROLL.
* trans.cc (trans_code): Likewise.

libgomp/ChangeLog:
* testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-2.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-3.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-4.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-5.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-7.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-8.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/loop-transforms/unroll-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-4.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-5.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-6.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-7.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-9.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90: New test.

13 months agoAdd 'libgomp.c/alloc-ompx_host_mem_alloc-1.c'
Thomas Schwinge [Wed, 15 Feb 2023 10:27:55 +0000 (11:27 +0100)] 
Add 'libgomp.c/alloc-ompx_host_mem_alloc-1.c'

OpenMP 'ompx_host_mem_alloc' is available for host and nvptx offloading as of
og12 commit 84914e197d91a67b3d27db0e4c69a433462983a5
"openmp, nvptx: ompx_unified_shared_mem_alloc", and for GCN offloading as of
og12 commit c77c45a641fedc3fe770e909cc010fb1735bdbbd
"amdgcn, libgomp: low-latency allocator".

libgomp/
* testsuite/libgomp.c/alloc-ompx_host_mem_alloc-1.c: New.

13 months agoMiscellaneous clean-up re OpenMP 'ompx_host_mem_space'
Thomas Schwinge [Fri, 17 Feb 2023 13:13:15 +0000 (14:13 +0100)] 
Miscellaneous clean-up re OpenMP 'ompx_host_mem_space'

Like done for nvptx in og12 commit 23f52e49368d7b26a1b1a72d6bb903d31666e961
"Miscellaneous clean-up re OpenMP 'ompx_unified_shared_mem_space', 'ompx_host_mem_space'".

Clean-up for og12 commit c77c45a641fedc3fe770e909cc010fb1735bdbbd
"amdgcn, libgomp: low-latency allocator".  No functional change.

libgomp/
* config/gcn/allocator.c (gcn_memspace_free): Explicitly handle
'memspace == ompx_host_mem_space'.

13 months agoAdd caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949]
Thomas Schwinge [Thu, 23 Mar 2023 11:32:35 +0000 (12:32 +0100)] 
Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949]

Follow-up to commit 49d1a2f91325fa8cc011149e27e5093a988b3a49
"OpenMP: Handle descriptors in target's firstprivate [PR104949]".

PR fortran/104949
libgomp/
* target.c (gomp_map_vars_internal) <GOMP_MAP_FIRSTPRIVATE>: Add
caveat/safeguard.

(cherry picked from commit e8fec6998b656dac02d4bc6c69b35a0fb5611e87)

13 months agolibgomp: Simplify OpenMP reverse offload host <-> device memory copy implementation
Thomas Schwinge [Tue, 21 Mar 2023 15:14:16 +0000 (16:14 +0100)] 
libgomp: Simplify OpenMP reverse offload host <-> device memory copy implementation

... by using the existing 'goacc_asyncqueue' instead of re-coding parts of it.

Follow-up to commit 131d18e928a3ea1ab2d3bf61aa92d68a8a254609
"libgomp/nvptx: Prepare for reverse-offload callback handling",
and commit ea4b23d9c82d9be3b982c3519fe5e8e9d833a6a8
"libgomp: Handle OpenMP's reverse offloads".

libgomp/
* target.c (gomp_target_rev): Instead of 'dev_to_host_cpy',
'host_to_dev_cpy', 'token', take a single 'goacc_asyncqueue'.
* libgomp.h (gomp_target_rev): Adjust.
* libgomp-plugin.c (GOMP_PLUGIN_target_rev): Adjust.
* libgomp-plugin.h (GOMP_PLUGIN_target_rev): Adjust.
* plugin/plugin-gcn.c (process_reverse_offload): Adjust.
* plugin/plugin-nvptx.c (rev_off_dev_to_host_cpy)
(rev_off_host_to_dev_cpy): Remove.
(GOMP_OFFLOAD_run): Adjust.

13 months agoIn 'libgomp/target.c:gomp_unmap_vars_internal', defer 'gomp_remove_var'
Thomas Schwinge [Tue, 14 Mar 2023 18:42:12 +0000 (19:42 +0100)] 
In 'libgomp/target.c:gomp_unmap_vars_internal', defer 'gomp_remove_var'

An upcoming change requires that 'gomp_remove_var' be deferred until after all
'gomp_copy_dev2host' calls have been handled.

Do this likewise to how commit 275c736e732d29934e4d22e8f030d5aae8c12a52
"libgomp: Structure element mapping for OpenMP 5.0" changed 'gomp_exit_data'.

libgomp/
* target.c (gomp_unmap_vars_internal): Queue splay-tree keys for
removal after main loop.

13 months agoGiven OpenACC 'async', defer 'free' of non-contiguous array support data structures...
Thomas Schwinge [Wed, 15 Mar 2023 12:34:02 +0000 (13:34 +0100)] 
Given OpenACC 'async', defer 'free' of non-contiguous array support data structures [PR76739]

Fix-up for og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f
"Merge non-contiguous array support patches".

PR other/76739
libgomp/
* oacc-parallel.c (GOACC_parallel_keyed): Given OpenACC 'async',
defer 'free' of non-contiguous array support data structures.
* target.c (gomp_map_vars_internal): Likewise.

13 months agoFortran/OpenMP: Fix 'alloc' and 'from' mapping for allocatable components
Tobias Burnus [Thu, 23 Mar 2023 17:04:17 +0000 (18:04 +0100)] 
Fortran/OpenMP: Fix 'alloc' and 'from' mapping for allocatable components

Even with 'alloc' and map-entering 'from' mapping, the following should hold.
For explicit mapping, that's already the case, this handles the automatical
deep mapping of allocatable components. Namely:
* On the device, the array bounds (of allocated allocatables) must match the
  host, implying 'to' (or 'tofrom') mapping.
* On map exiting, the copying out shall not destroy the unallocated allocation
  status (nor the pointer address of allocated allocatables).

The latter was not a problem for allocated allocatables as for those a pointer
was GOMP_MAP_ATTACHed; however, for unallocated allocatables, before it copied
back device-allocated memory which might not be nullified.

While 'alloc' was not deep-mapped at all, for map-entering 'from', the array
bounds were not set, making allocated derived-type components inaccessible on
the device (and wrong on the host on copy back).

The solution is, first, to deep-map 'alloc' as well and to copy to the device
even with 'alloc' and (map-entering) 'from'. This copying is only done if there
is a scalar (for the unallocated case) or array allocatable directly in the
derived type and then it is shallowly copied; the data pointed to is then again
only alloc'ed, unless it contains in turn allocatables.

gcc/fortran/

* trans-openmp.cc (gfc_has_alloc_comps): Add 'bool
shallow_alloc_only=false' arg.
(gfc_omp_replace_alloc_by_to_mapping): New, call it.
(gfc_omp_deep_map_kind_p): Return 'true' also for '(present,)alloc'.
(gfc_omp_deep_mapping_item, gfc_omp_deep_mapping_do): On map entering,
replace shallowly 'alloc'/'from' by '(from)to' mapping if there are
allocatable components.

libgomp/

* testsuite/libgomp.fortran/map-alloc-comp-8.f90: New test.

13 months agoFortran: Add attr.class_ok check for generate_callback_wrapper
Tobias Burnus [Thu, 23 Mar 2023 13:29:00 +0000 (14:29 +0100)] 
Fortran: Add attr.class_ok check for generate_callback_wrapper

Proper variables/components of type BT_CLASS have 'class_ok' set; check
for that to avoid an ICE on invalid code for gfortran.dg/pr108434.f90.

gcc/fortran/
* class.cc (generate_callback_wrapper): Add attr.class_ok check.
* resolve.cc (resolve_fl_derived): Likewise.

13 months agoOpenMP/Fortran: Fix unmapping of GOMP_MAP_POINTER for scalar allocatables/pointers
Tobias Burnus [Thu, 23 Mar 2023 07:57:45 +0000 (08:57 +0100)] 
OpenMP/Fortran: Fix unmapping of GOMP_MAP_POINTER for scalar allocatables/pointers

target exit data: Do unmap GOMP_MAP_POINTER for scalar allocatables/pointers
to prevent stale mappings.

While for allocatable/pointer arrays, there is a PSET followed by POINTER,
for allocatable/pointer scalars there is only a POINTER. Before the below
mentioned OG12 patch: For exit data, PSET was converted to RELEASE/DELETE
in gimplify.cc while all POINTER were removed; correct for arrays but leaving
POINTER behind for scalars. Since that commit, all in trans-openmp.cc but
the scalar case was still mishandled before this follow-up commit.

This is a follow up to OG12's 55a18d4744258e3909568e425f9f473c49f9d13f
While the problem is independent, it will be merged into v4 of the
mainline patch
'Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings'

gcc/fortran/

* trans-openmp.cc (gfc_trans_omp_clauses): Fix unmapping of
GOMP_MAP_POINTER for scalar allocatables/pointers.

gcc/testsuite/

* gfortran.dg/gomp/map-10.f90: New test.

13 months agoamdgcn: Add instruction patterns for complex number operations.
Andrew Jenner [Wed, 22 Mar 2023 11:20:40 +0000 (11:20 +0000)] 
amdgcn: Add instruction patterns for complex number operations.

gcc/ChangeLog:

* config/gcn/gcn-protos.h (gcn_expand_dpp_swap_pairs_insn)
(gcn_expand_dpp_distribute_even_insn)
(gcn_expand_dpp_distribute_odd_insn): Declare.
* config/gcn/gcn-valu.md (@dpp_swap_pairs<mode>)
(@dpp_distribute_even<mode>, @dpp_distribute_odd<mode>)
(cmul<conj_op><mode>3, cml<addsub_as><mode>4, vec_addsub<mode>3)
(cadd<rot><mode>3, vec_fmaddsub<mode>4, vec_fmsubadd<mode>4)
(fms<mode>4<exec>, fms<mode>4_negop2<exec>, fms<mode>4)
(fms<mode>4_negop2): New patterns.
* config/gcn/gcn.cc (gcn_expand_dpp_swap_pairs_insn)
(gcn_expand_dpp_distribute_even_insn)
(gcn_expand_dpp_distribute_odd_insn): New functions.
* config/gcn/gcn.md: Add entries to unspec enum.

gcc/testsuite/ChangeLog:

* gcc.target/gcn/complex.c: New test.

13 months agoamdgcn: Fix register size bug
Andrew Stubbs [Fri, 17 Mar 2023 11:04:12 +0000 (11:04 +0000)] 
amdgcn: Fix register size bug

Fix an issue in which "vectors" of duplicate entries placed in scalar
registers caused the following 63 registers to be marked live, for the
purpose of prologue generation, which resulted in stack corruption.

gcc/ChangeLog:

* config/gcn/gcn.cc (gcn_class_max_nregs): Handle vectors in SGPRs.
(move_callee_saved_registers): Detect the bug condition early.

13 months agovect: Fix missed gather load opportunity
Richard Sandiford [Tue, 20 Sep 2022 14:27:46 +0000 (15:27 +0100)] 
vect: Fix missed gather load opportunity

While writing a testcase for PR106794, I noticed that we failed
to vectorise the testcase in the patch for SVE.  The code that
recognises gather loads tries to optimise the point at which
the offset is calculated, to avoid unnecessary extensions or
truncations:

  /* Don't include the conversion if the target is happy with
     the current offset type.  */

But breaking only makes sense if we're at an SSA_NAME (which could
then be vectorised).  We shouldn't break on a conversion embedded
in a generic expression.

gcc/
* tree-vect-data-refs.cc (vect_check_gather_scatter): Restrict
early-out optimisation to SSA_NAMEs.

gcc/testsuite/
* gcc.dg/vect/vect-gather-5.c: New test.

(cherry picked from commit 4a773bf2f08656a39ac75cf6b4871c8cec8b5007)

13 months agoamdgcn: gather/scatter with DImode offsets
Andrew Stubbs [Mon, 6 Mar 2023 12:42:44 +0000 (12:42 +0000)] 
amdgcn: gather/scatter with DImode offsets

The GPU architecture requires SImode offsets on gather/scatter instructions,
but they can also take a vector of absolute addresses, so this allows
gather/scatter in more situations.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (gather_load<mode><vndi>): New.
(scatter_store<mode><vndi>): New.
(mask_gather_load<mode><vndi>): New.
(mask_scatter_store<mode><vndi>): New.

13 months agoamdgcn: vec_extract no-op insns
Andrew Stubbs [Wed, 1 Mar 2023 15:32:50 +0000 (15:32 +0000)] 
amdgcn: vec_extract no-op insns

Just using move insn for no-op conversions triggers special move handling in
IRA which declares that subreg of vectors aren't valid and routes everything
through memory.  These patterns make the vec_select explicit and all is well.

gcc/ChangeLog:

* config/gcn/gcn-protos.h (gcn_stepped_zero_int_parallel_p): New.
* config/gcn/gcn-valu.md (V_1REG_ALT): New.
(V_2REG_ALT): New.
(vec_extract<V_1REG:mode><V_1REG_ALT:mode>_nop): New.
(vec_extract<V_2REG:mode><V_2REG_ALT:mode>_nop): New.
(vec_extract<V_ALL:mode><V_ALL_ALT:mode>): Use new patterns.
* config/gcn/gcn.cc (gcn_stepped_zero_int_parallel_p): New.
* config/gcn/predicates.md (ascending_zero_int_parallel): New.

13 months agoUse 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]
Thomas Schwinge [Fri, 10 Mar 2023 17:14:44 +0000 (18:14 +0100)] 
Use 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]

Thereby considerably simplify the device plugins' 'GOMP_OFFLOAD_openacc_exec',
'GOMP_OFFLOAD_openacc_async_exec' functions: in terms of lines of code, but in
particular conceptually: no more device memory allocation, host to device data
copying, device memory deallocation -- 'GOMP_MAP_VARS_TARGET' does all that for
us.

This depends on commit 2b2340e236c0bba8aaca358ea25a5accd8249fbd
"Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data",
where I said that "a use will emerge later", which is this one here.

PR libgomp/90596
libgomp/
* target.c (gomp_map_vars_internal): Allow for
'param_kind == GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_TARGET'.
* oacc-parallel.c (GOACC_parallel_keyed): Pass
'GOMP_MAP_VARS_TARGET' to 'goacc_map_vars'.
* plugin/plugin-gcn.c (alloc_by_agent, gcn_exec)
(GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec):
Adjust, simplify.
(gomp_offload_free): Remove.
* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
(GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify.
(cuda_free_argmem): Remove.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
Adjust.

(cherry picked from commit f8332e52a498df480f72303de32ad0751ad899fe)

13 months agoAllow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data
Thomas Schwinge [Mon, 27 Feb 2023 15:41:17 +0000 (16:41 +0100)] 
Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data

This does *allow*, but under no circumstances is this currently going to be
used: all potentially applicable data is non-'ephemeral', and thus not
considered for 'gomp_coalesce_buf_add' for OpenACC 'async'.  (But a use will
emerge later.)

Follow-up to commit r12-2530-gd88a6951586c7229b25708f4486eaaf4bf4b5bbe
"Don't use libgomp 'cbuf' buffering with OpenACC 'async'", addressing this
TODO comment:

    TODO ... but we could allow CBUF usage for EPHEMERAL data?  (Open question:
    is it more performant to use libgomp CBUF buffering or individual device
    asyncronous copying?)

Ephemeral data is small, and therefore individual device asyncronous copying
does seem dubious -- in particular given that for all those, we'd individually
have to allocate and queue for deallocation a temporary buffer to capture the
ephemeral data.  Instead, just let the 'cbuf' *be* the temporary buffer.

libgomp/
* target.c (gomp_copy_host2dev, gomp_map_vars_internal): Allow
libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral'
data.

(cherry picked from commit 2b2340e236c0bba8aaca358ea25a5accd8249fbd)

13 months agoSimplify OpenACC 'no_create' clause implementation
Thomas Schwinge [Mon, 27 Feb 2023 11:02:02 +0000 (12:02 +0100)] 
Simplify OpenACC 'no_create' clause implementation

For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then
simplify the device plugins accordingly.

This is a follow-up to
Subversion r279551 (Git commit a6163563f2ce502bd4ef444bd5de33570bb8eeb1)
"Add OpenACC 2.6's no_create",
Subversion r279622 (Git commit 5bcd470bf0749e1f56d05dd43aa9584ff2e3a090)
"Use gomp_map_val for OpenACC host-to-device address translation".

libgomp/
* target.c (gomp_map_vars_internal): Use 'OFFSET_INLINED' for
'GOMP_MAP_IF_PRESENT'.
* plugin/plugin-gcn.c (gcn_exec, GOMP_OFFLOAD_openacc_exec)
(GOMP_OFFLOAD_openacc_async_exec): Adjust.
* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
(GOMP_OFFLOAD_openacc_async_exec): Likewise.
* testsuite/libgomp.oacc-c-c++-common/no_create-1.c: Add 'async'
testing.
* testsuite/libgomp.oacc-c-c++-common/no_create-2.c: Likewise.

(cherry picked from commit 199867d07be65cb0227a318ebf42b8376ca09313)

13 months agoOpenACC: Remove 'acc_async_test' -> skip shortcut in 'libgomp/oacc-async.c:goacc_wait'
Thomas Schwinge [Fri, 24 Feb 2023 15:17:57 +0000 (16:17 +0100)] 
OpenACC: Remove 'acc_async_test' -> skip shortcut in 'libgomp/oacc-async.c:goacc_wait'

We're not taking such a shortcut anywhere else, and (with future changes) it
has potential to confuse things if synchronization in a libgomp plugin happens
to have side effects even if an async queue currently is empty.

libgomp/
* oacc-async.c (goacc_wait): Remove 'acc_async_test' -> skip
shortcut.

(cherry picked from commit b5037d4a073f2e4625afab5ec1f35624d9f9eba1)

13 months agoDocument/verify another aspect of OpenACC 'async' semantics in 'libgomp.oacc-c-c...
Thomas Schwinge [Fri, 24 Feb 2023 15:21:31 +0000 (16:21 +0100)] 
Document/verify another aspect of OpenACC 'async' semantics in 'libgomp.oacc-c-c++-common/data-3.c'

... that I almost broke with later implementation changes.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/data-3.c: Document/verify
another aspect of OpenACC 'async' semantics.

(cherry picked from commit 442d51a20ef13a8e6c080ca30bc37fc93b6bfac4)

13 months agoFix OpenACC/GCN 'acc_ev_enqueue_launch_end' position
Thomas Schwinge [Thu, 2 Mar 2023 09:39:09 +0000 (10:39 +0100)] 
Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position

For an OpenACC compute construct, we've currently got:

  - [...]
  - acc_ev_enqueue_launch_start
  - launch kernel
  - free memory
  - acc_ev_free
  - acc_ev_enqueue_launch_end

This confused another thing that I'm working on, so I adjusted that to:

  - [...]
  - acc_ev_enqueue_launch_start
  - launch kernel
  - acc_ev_enqueue_launch_end
  - free memory
  - acc_ev_free

Correspondingly, verify 'acc_ev_alloc', 'acc_ev_free' in
'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.

libgomp/
* plugin/plugin-gcn.c (gcn_exec): Fix 'acc_ev_enqueue_launch_end'
position.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
Verify 'acc_ev_alloc', 'acc_ev_free'.

(cherry picked from commit 649f1939baf11f45fd3579b8b9601c7840a097b3)

13 months agolibgomp: Merge 'gomp_map_vars_openacc' into 'goacc_map_vars' [PR76739]
Thomas Schwinge [Thu, 2 Mar 2023 17:36:47 +0000 (18:36 +0100)] 
libgomp: Merge 'gomp_map_vars_openacc' into 'goacc_map_vars' [PR76739]

Upstream has 'goacc_map_vars'; merge the new 'gomp_map_vars_openacc' into it.
(Maybe the latter didn't exist yet when the former was originally added?)
No functional change.

Clean-up for og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f
"Merge non-contiguous array support patches".

PR other/76739
libgomp/
* libgomp.h (goacc_map_vars): Add 'struct goacc_ncarray_info *'
formal parameter.
(gomp_map_vars_openacc): Remove.
* target.c (goacc_map_vars): Adjust.
(gomp_map_vars_openacc): Remove.
* oacc-mem.c (acc_map_data, goacc_enter_datum)
(goacc_enter_data_internal): Adjust.
* oacc-parallel.c (GOACC_parallel_keyed, GOACC_data_start):
Adjust.

13 months agoRevert "OpenACC profiling-interface fixes for asynchronous operations"
Thomas Schwinge [Thu, 2 Mar 2023 10:28:24 +0000 (11:28 +0100)] 
Revert "OpenACC profiling-interface fixes for asynchronous operations"

There is occasional execution failure; these changes need to be reviewed.

This reverts og12 commit 719f93c8618a134f90b5b661ab70c918d659ad05.

libgomp/
* oacc-host.c: Revert
"OpenACC profiling-interface fixes for asynchronous operations"
changes.
* oacc-mem.c: Likewise.
* oacc-parallel.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
Likewise.

13 months agoRevert "Revert changes to acc_prof-init-1.c and acc_prof-parallel-1.c"
Thomas Schwinge [Thu, 2 Mar 2023 10:24:28 +0000 (11:24 +0100)] 
Revert "Revert changes to acc_prof-init-1.c and acc_prof-parallel-1.c"

... as a prerequisite for reverting
"OpenACC profiling-interface fixes for asynchronous operations".

This reverts og12 commit b845d2f62e7da1c4cfdfee99690de94b648d076d.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Revert
"Revert changes to acc_prof-init-1.c and acc_prof-parallel-1.c"
changes.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
Likewise.

13 months agoamdgcn: Add instruction patterns for conditional min/max operations
Paul-Antoine Arras [Wed, 1 Mar 2023 16:20:21 +0000 (17:20 +0100)] 
amdgcn: Add instruction patterns for conditional min/max operations

gcc/ChangeLog:

* config/gcn/gcn-valu.md (<expander><mode>3_exec): Add patterns for
{s|u}{max|min} in QI, HI and DI modes.
(<expander><mode>3): Add pattern for {s|u}{max|min} in DI mode.
(cond_<fexpander><mode>): Add pattern for cond_f{max|min}.
(cond_<expander><mode>): Add pattern for cond_{s|u}{max|min}.
* config/gcn/gcn.cc (gcn_spill_class): Allow the exec register to be
saved in SGPRs.

gcc/testsuite/ChangeLog:

* gcc.target/gcn/cond_fmaxnm_1.c: New test.
* gcc.target/gcn/cond_fmaxnm_1_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_2.c: New test.
* gcc.target/gcn/cond_fmaxnm_2_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_3.c: New test.
* gcc.target/gcn/cond_fmaxnm_3_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_4.c: New test.
* gcc.target/gcn/cond_fmaxnm_4_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_5.c: New test.
* gcc.target/gcn/cond_fmaxnm_5_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_6.c: New test.
* gcc.target/gcn/cond_fmaxnm_6_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_7.c: New test.
* gcc.target/gcn/cond_fmaxnm_7_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_8.c: New test.
* gcc.target/gcn/cond_fmaxnm_8_run.c: New test.
* gcc.target/gcn/cond_fminnm_1.c: New test.
* gcc.target/gcn/cond_fminnm_1_run.c: New test.
* gcc.target/gcn/cond_fminnm_2.c: New test.
* gcc.target/gcn/cond_fminnm_2_run.c: New test.
* gcc.target/gcn/cond_fminnm_3.c: New test.
* gcc.target/gcn/cond_fminnm_3_run.c: New test.
* gcc.target/gcn/cond_fminnm_4.c: New test.
* gcc.target/gcn/cond_fminnm_4_run.c: New test.
* gcc.target/gcn/cond_fminnm_5.c: New test.
* gcc.target/gcn/cond_fminnm_5_run.c: New test.
* gcc.target/gcn/cond_fminnm_6.c: New test.
* gcc.target/gcn/cond_fminnm_6_run.c: New test.
* gcc.target/gcn/cond_fminnm_7.c: New test.
* gcc.target/gcn/cond_fminnm_7_run.c: New test.
* gcc.target/gcn/cond_fminnm_8.c: New test.
* gcc.target/gcn/cond_fminnm_8_run.c: New test.
* gcc.target/gcn/cond_smax_1.c: New test.
* gcc.target/gcn/cond_smax_1_run.c: New test.
* gcc.target/gcn/cond_smin_1.c: New test.
* gcc.target/gcn/cond_smin_1_run.c: New test.
* gcc.target/gcn/cond_umax_1.c: New test.
* gcc.target/gcn/cond_umax_1_run.c: New test.
* gcc.target/gcn/cond_umin_1.c: New test.
* gcc.target/gcn/cond_umin_1_run.c: New test.
* gcc.target/gcn/smax_1.c: New test.
* gcc.target/gcn/smax_1_run.c: New test.
* gcc.target/gcn/smin_1.c: New test.
* gcc.target/gcn/smin_1_run.c: New test.
* gcc.target/gcn/umax_1.c: New test.
* gcc.target/gcn/umax_1_run.c: New test.
* gcc.target/gcn/umin_1.c: New test.
* gcc.target/gcn/umin_1_run.c: New test.

(cherry picked from commit 553ff2524f412be4e02e2ffb1a0a3dc3e2280742)

13 months agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Thu, 2 Mar 2023 15:32:12 +0000 (16:32 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9210-gb3f9d2cf7dd5488800f867a6aae076465ecb391b (2nd Mar 2023)

13 months agoDaily bump.
GCC Administrator [Thu, 2 Mar 2023 00:20:59 +0000 (00:20 +0000)] 
Daily bump.

14 months agoOpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546]
Tobias Burnus [Wed, 1 Mar 2023 14:18:40 +0000 (15:18 +0100)] 
OpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546]

For is_device_ptr, optional checks should only be done before calling
libgomp, afterwards they are NULL either because of absent or, by
chance, because it is unallocated or unassociated (for pointers/allocatables).

Additionally, it fixes an issue with explicit mapping for 'type(c_ptr)'.

PR middle-end/108546

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_trans_omp_clauses): Fix mapping of
type(C_ptr) variables.

gcc/ChangeLog:

* omp-low.cc (lower_omp_target): Remove optional handling
on the receiver side, i.e. inside target (data), for
use_device_ptr.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/is_device_ptr-3.f90: New test.
* testsuite/libgomp.fortran/use_device_ptr-optional-4.f90: New test.

(cherry picked from commit 96ff97ff6574666a5509ae9fa596e7f2b6ad4f88)

14 months agoDaily bump.
GCC Administrator [Wed, 1 Mar 2023 00:21:39 +0000 (00:21 +0000)] 
Daily bump.

14 months agoDaily bump.
GCC Administrator [Tue, 28 Feb 2023 00:22:29 +0000 (00:22 +0000)] 
Daily bump.

14 months agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Mon, 27 Feb 2023 16:29:19 +0000 (17:29 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9207-gb8e496d132ec087c9db5951fea23551dcc831d8c (27th Feb 2023)

14 months agoUpdate dg-dump-scan for "Fortran/OpenMP: Fix mapping of array descriptors and deferre...
Tobias Burnus [Mon, 27 Feb 2023 11:47:54 +0000 (12:47 +0100)] 
Update dg-dump-scan for "Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings"

Follow-up to commit 55a18d4744258e3909568e425f9f473c49f9d13f
"Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings"
updating the dumps.
* For the goacc testcase, 'to' changed to 'release' and due to 'finally' then
  to 'delete', which can be regarded as bugfix.
* For pr78260-2.f90, the calculation moved inside the 'if(...->data == NULL)'
  block to handle deferred-string length vars better, esp. when 'optional'.

gcc/testsuite/:
        * gfortran.dg/goacc/finalize-1.f: Update scan-tree-dump-times for
        mapping changes.
        * gfortran.dg/gomp/pr78260-2.f90: Likewise.

14 months agoasan: adjust module name for global variables
Martin Liska [Fri, 17 Feb 2023 14:11:02 +0000 (15:11 +0100)] 
asan: adjust module name for global variables

As mentioned in the PR, when we use LTO, we wrongly use ltrans output
file name as a module name of a global variable. That leads to a
non-reproducible output.

After the suggested change, we emit context name of normal global
variables. And for artificial variables (like .Lubsan_data3), we use
aux_base_name (e.g. "./a.ltrans0.ltrans").

PR sanitizer/108834

gcc/ChangeLog:

* asan.cc (asan_add_global): Use proper TU name for normal
global variables (and aux_base_name for the artificial one).

gcc/testsuite/ChangeLog:

* c-c++-common/asan/global-overflow-1.c: Test line and column
info for a global variable.

(cherry picked from commit 94c9b1bb79f63d000ebb05efc155c149325e332d)

14 months agors6000/test: Adjust some test cases on partial vector [PR96373]
Kewen Lin [Tue, 14 Feb 2023 02:03:26 +0000 (20:03 -0600)] 
rs6000/test: Adjust some test cases on partial vector [PR96373]

As Richard pointed out in [1] and the testing on Power10, the
proposed fix for PR96373 requires some updates on a few rs6000
test cases which adopt partial vector.  This patch is to fix
all of them with one extra option "-fno-trapping-math" as
Richard suggested.

Besides, the original test case also failed on Power10 without
Richard's proposed fix, this patch adds it together for a bit
better testing coverage.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-January/610728.html

PR target/96373

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/p9-vec-length-epil-1.c: Add -fno-trapping-math.
* gcc.target/powerpc/p9-vec-length-epil-2.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-3.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-4.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-5.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-6.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-8.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-1.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-2.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-3.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-4.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-5.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-6.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-8.c: Likewise.
* gcc.target/powerpc/pr96373.c: New test.

(cherry picked from commit 4f5a1198065dc078f8099db628da7b06a2666f34)

14 months agoDaily bump.
GCC Administrator [Mon, 27 Feb 2023 00:21:10 +0000 (00:21 +0000)] 
Daily bump.

14 months agoDaily bump.
GCC Administrator [Sun, 26 Feb 2023 00:21:36 +0000 (00:21 +0000)] 
Daily bump.

14 months agoDaily bump.
GCC Administrator [Sat, 25 Feb 2023 00:21:46 +0000 (00:21 +0000)] 
Daily bump.

14 months agoRTEMS: Tune multilib selection
Sebastian Huber [Wed, 22 Feb 2023 20:38:18 +0000 (21:38 +0100)] 
RTEMS: Tune multilib selection

gcc/ChangeLog:

* config/riscv/t-rtems: Keep only -mcmodel=medany 64-bit multilibs.
Add non-compact 32-bit multilibs.

(cherry picked from commit 35a067020e41d97bc3be15b518b3dc2a64b4aae2)

14 months agoDaily bump.
GCC Administrator [Fri, 24 Feb 2023 00:22:05 +0000 (00:22 +0000)] 
Daily bump.

14 months agotree-optimization/108888 - call if-conversion
Richard Biener [Thu, 23 Feb 2023 10:03:03 +0000 (11:03 +0100)] 
tree-optimization/108888 - call if-conversion

The following makes sure to only predicate calls necessary.

PR tree-optimization/108888
* tree-if-conv.cc (if_convertible_stmt_p): Set PLF_2 on
calls to predicate.
(predicate_statements): Only predicate calls with PLF_2.

* g++.dg/torture/pr108888.C: New testcase.

(cherry picked from commit 31cc5821223a096ef61743bff520f4a0dbba5872)

14 months agovect: inbranch SIMD clones
Andrew Stubbs [Thu, 28 Jul 2022 15:07:22 +0000 (16:07 +0100)] 
vect: inbranch SIMD clones

There has been support for generating "inbranch" SIMD clones for a long time,
but nothing actually uses them (as far as I can see).

This patch add supports for a sub-set of possible cases (those using
mask_mode == VOIDmode).  The other cases fail to vectorize, just as before,
so there should be no regressions.

The sub-set of support should cover all cases needed by amdgcn, at present.

gcc/ChangeLog:

* internal-fn.cc (expand_MASK_CALL): New.
* internal-fn.def (MASK_CALL): New.
* internal-fn.h (expand_MASK_CALL): New prototype.
* omp-simd-clone.cc (simd_clone_adjust_argument_types): Set vector_type
for mask arguments also.
* tree-if-conv.cc: Include cgraph.h.
(if_convertible_stmt_p): Do if conversions for calls to SIMD calls.
(predicate_statements): Convert functions to IFN_MASK_CALL.
* tree-vect-loop.cc (vect_get_datarefs_in_loop): Recognise
IFN_MASK_CALL as a SIMD function call.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Handle
IFN_MASK_CALL as an inbranch SIMD function call.
Generate the mask vector arguments.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-simd-clone-16.c: New test.
* gcc.dg/vect/vect-simd-clone-16b.c: New test.
* gcc.dg/vect/vect-simd-clone-16c.c: New test.
* gcc.dg/vect/vect-simd-clone-16d.c: New test.
* gcc.dg/vect/vect-simd-clone-16e.c: New test.
* gcc.dg/vect/vect-simd-clone-16f.c: New test.
* gcc.dg/vect/vect-simd-clone-17.c: New test.
* gcc.dg/vect/vect-simd-clone-17b.c: New test.
* gcc.dg/vect/vect-simd-clone-17c.c: New test.
* gcc.dg/vect/vect-simd-clone-17d.c: New test.
* gcc.dg/vect/vect-simd-clone-17e.c: New test.
* gcc.dg/vect/vect-simd-clone-17f.c: New test.
* gcc.dg/vect/vect-simd-clone-18.c: New test.
* gcc.dg/vect/vect-simd-clone-18b.c: New test.
* gcc.dg/vect/vect-simd-clone-18c.c: New test.
* gcc.dg/vect/vect-simd-clone-18d.c: New test.
* gcc.dg/vect/vect-simd-clone-18e.c: New test.
* gcc.dg/vect/vect-simd-clone-18f.c: New test.

(cherry picked from commit 3da77f217c8b2089ecba3eb201e727c3fcdcd19d)

14 months agolibstdc++: Simplify three helper functions into one
Matthias Kretz [Sat, 14 Jan 2023 22:32:38 +0000 (23:32 +0100)] 
libstdc++: Simplify three helper functions into one

Broadcast is a very common function. This should reduce compile-time
effort.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108030
* include/experimental/bits/simd.h (__vector_broadcast):
Implement via __vector_broadcast_impl instead of
__call_with_n_evaluations + 2 lambdas.
(__vector_broadcast_impl): New.

(cherry picked from commit 2e29e2fbeb8936e5c85cefaf547cba42e17e137b)

14 months agolibstdc++: Fix -Wsign-compare issue
Matthias Kretz [Tue, 21 Feb 2023 09:31:55 +0000 (10:31 +0100)] 
libstdc++: Fix -Wsign-compare issue

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_builtin.h (_S_set): Compare as
int. The actual range of these indexes is very small.

(cherry picked from commit ffa39f7120f6e83a567d7a83ff4437f6b41036ea)

14 months agolibstdc++: Add missing constexpr on simd shift implementation
Matthias Kretz [Mon, 20 Feb 2023 16:35:59 +0000 (17:35 +0100)] 
libstdc++: Add missing constexpr on simd shift implementation

Resolves -Wtautological-compare warnings about `if
(__builtin_is_constant_evaluated())` in the implementations of these
functions.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_x86.h (_S_bit_shift_left)
(_S_bit_shift_right): Declare constexpr. The implementation was
already expecting constexpr evaluation.

(cherry picked from commit fa37ac2b59ed1c379b35dbf9bd58f7849f9fd5b5)

14 months agolibgomp: no need to attach USM pointers
Andrew Stubbs [Mon, 20 Feb 2023 12:23:14 +0000 (12:23 +0000)] 
libgomp: no need to attach USM pointers

Fix a bug in which Fortran pointers inside derived types caused a runtime
error when Unified Shared Memory was active.

libgomp/ChangeLog:

* target.c (gomp_attach_pointer): Check for USM.
* testsuite/libgomp.fortran/usm-3.f90: New test.

14 months agolibstdc++: Fix uses of non-reserved names in simd header
Matthias Kretz [Thu, 16 Feb 2023 15:29:54 +0000 (16:29 +0100)] 
libstdc++: Fix uses of non-reserved names in simd header

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (__extract_part, split):
Use reserved name for template parameter.

(cherry picked from commit bb920f561e983c64d146f173dc4ebc098441a962)

14 months agoDaily bump.
GCC Administrator [Thu, 23 Feb 2023 00:22:26 +0000 (00:22 +0000)] 
Daily bump.

14 months agoFortran/OpenMP: Fix mapping of array descriptors and deferred-length strings
Tobias Burnus [Wed, 22 Feb 2023 20:18:33 +0000 (21:18 +0100)] 
Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings

Previously, array descriptors might have been mapped as 'alloc'
instead of 'to' for 'alloc', not updating the array bounds. The
'alloc' could also appear for 'data exit', failing with a libgomp
assert. In some cases, either array descriptors or deferred-length
string's length variable was not mapped. And, finally, some offset
calculations with array-sections mappings went wrong.

The testcases contain some comment-out tests which require follow-up
work and for which PR exist. Those mostly relate to deferred-length
strings which have several issues beyong OpenMP support.

This is the OG12 variant of the submitted but unreviewed GCC 13/mainline
patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612387.html

gcc/fortran/ChangeLog:

* trans-decl.cc (gfc_get_symbol_decl): Add attributes
such as 'declare target' also to hidden artificial
variable for deferred-length character variables.
* trans-openmp.cc (gfc_trans_omp_array_section,
gfc_trans_omp_clauses, gfc_trans_omp_target_exit_data):
Improve mapping of array descriptors and deferred-length
string variables.

gcc/ChangeLog:

* gimplify.cc (gimplify_scan_omp_clauses): Remove Fortran
special case.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/target-enter-data-3.f90: Uncomment
'target exit data'.
* testsuite/libgomp.fortran/target-enter-data-4.f90: New test.
* testsuite/libgomp.fortran/target-enter-data-5.f90: New test.
* testsuite/libgomp.fortran/target-enter-data-6.f90: New test.
* testsuite/libgomp.fortran/target-enter-data-7.f90: New test.

14 months agoFix: Fortran/OpenMP: align/allocator modifiers to the allocate clause
Tobias Burnus [Wed, 22 Feb 2023 11:35:29 +0000 (12:35 +0100)] 
Fix: Fortran/OpenMP: align/allocator modifiers to the allocate clause

When merging r13-4584-gb2e1c49b4a4 to OG12 as commit 58e0579ed87,
the 'align' handling seemingly ended up in the wrong clause.
(Result: libgomp.fortran/allocate-2a.f90 FAILED; now fixed.)

gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Move align modifier
handling from OMP_LIST_ALLOCATOR to OMP_LIST_ALLOCATE.

14 months agoDaily bump.
GCC Administrator [Wed, 22 Feb 2023 00:22:57 +0000 (00:22 +0000)] 
Daily bump.

14 months agolibstdc++: Add noexcept-specifier to std::reference_wrapper::operator()
Jonathan Wakely [Wed, 31 Aug 2022 12:57:34 +0000 (13:57 +0100)] 
libstdc++: Add noexcept-specifier to std::reference_wrapper::operator()

This isn't required by the standard, but there's an LWG issue suggesting
to add it.

Also use __invoke_result instead of result_of, to match the spec in
recent standards.

libstdc++-v3/ChangeLog:

* include/bits/refwrap.h (reference_wrapper::operator()): Add
noexcept-specifier and use __invoke_result instead of result_of.
* testsuite/20_util/reference_wrapper/invoke-noexcept.cc: New test.

(cherry picked from commit e47df5eb56c4e7aca0d3e50826e5aaa1887fa446)

14 months agolibstdc++: Fix std::filesystem errors with -fkeep-inline-functions [PR108636]
Jonathan Wakely [Thu, 2 Feb 2023 14:06:40 +0000 (14:06 +0000)] 
libstdc++: Fix std::filesystem errors with -fkeep-inline-functions [PR108636]

With -fkeep-inline-functions there are linker errors when including
<filesystem>. This happens because there are some filesystem::path
constructors defined inline which call non-exported functions defined in
the library. That's usually not a problem, because those constructors
are only called by code that's also inside the library. But when the
header is compiled with -fkeep-inline-functions those inline functions
are emitted even though they aren't called. That then creates an
undefined reference to the other library internsl. The fix is to just
move the private constructors into the library where they are called.
That way they are never even seen by users, and so not compiled even if
-fkeep-inline-functions is used.

libstdc++-v3/ChangeLog:

PR libstdc++/108636
* include/bits/fs_path.h (path::path(string_view, _Type))
(path::_Cmpt::_Cmpt(string_view, _Type, size_t)): Move inline
definitions to ...
* src/c++17/fs_path.cc: ... here.
* testsuite/27_io/filesystem/path/108636.cc: New test.

(cherry picked from commit db8d6fc572ec316ccfcf70b1dffe3be0b1b37212)

14 months agoDaily bump.
GCC Administrator [Tue, 21 Feb 2023 00:22:04 +0000 (00:22 +0000)] 
Daily bump.

14 months agoc++: ICE with redundant capture [PR108829]
Marek Polacek [Thu, 16 Feb 2023 22:41:24 +0000 (17:41 -0500)] 
c++: ICE with redundant capture [PR108829]

Here we crash in is_capture_proxy:

  /* Location wrappers should be stripped or otherwise handled by the
     caller before using this predicate.  */
  gcc_checking_assert (!location_wrapper_p (decl));

We only crash with the redundant capture:

  int abyPage = [=, abyPage] { ... }

because prune_lambda_captures is only called when there was a default
capture, and with [=] only abyPage won't be in LAMBDA_EXPR_CAPTURE_LIST.

The problem is that LAMBDA_CAPTURE_EXPLICIT_P wasn't propagated
correctly and so var_to_maybe_prune proceeded where it shouldn't.

Co-Authored by: Patrick Palka <ppalka@redhat.com>

PR c++/108829

gcc/cp/ChangeLog:

* pt.cc (prepend_one_capture): Set LAMBDA_CAPTURE_EXPLICIT_P.
(tsubst_lambda_expr): Pass LAMBDA_CAPTURE_EXPLICIT_P to
prepend_one_capture.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-108829-2.C: New test.
* g++.dg/cpp0x/lambda/lambda-108829.C: New test.

(cherry picked from commit 02d8ab3e4e2f3d9dc12157a98c976d6698e71e29)

14 months agoPrototype 'GOMP_enable_pinned_mode'
Thomas Schwinge [Mon, 20 Feb 2023 14:29:44 +0000 (15:29 +0100)] 
Prototype 'GOMP_enable_pinned_mode'

Fix-up for og12 commit 842df187487f5b16ae29bbe7e9acd79661a9df48
"openmp: -foffload-memory=pinned".  No functional change.

libgomp/
* libgomp_g.h (GOMP_enable_pinned_mode): New.

14 months agoAttempt to not just register but allocate OpenMP pinned memory using a device: ChangeLog
Thomas Schwinge [Mon, 20 Feb 2023 14:00:48 +0000 (15:00 +0100)] 
Attempt to not just register but allocate OpenMP pinned memory using a device: ChangeLog

... forgotten in og12 commit 4bd844f3e0202b3d083f0784f4343570c88bb86c
"Attempt to not just register but allocate OpenMP pinned memory using a device".

14 months agoAttempt to not just register but allocate OpenMP pinned memory using a device
Thomas Schwinge [Mon, 20 Feb 2023 13:44:43 +0000 (14:44 +0100)] 
Attempt to not just register but allocate OpenMP pinned memory using a device

... instead of 'mmap' plus attempting to register using a device.

Implemented for nvptx offloading via 'cuMemHostAlloc'.

This re-works og12 commit a5a4800e92773da7126c00a9c79b172494d58ab5
"Attempt to register OpenMP pinned memory using a device instead of 'mlock'".

include/
* cuda/cuda.h (cuMemHostRegister, cuMemHostUnregister): Remove.
libgomp/
* config/linux/allocator.c (linux_memspace_alloc): Add 'init0'
formal parameter.  Adjust all users.
(linux_memspace_alloc, linux_memspace_free): Attempt to allocate
OpenMP pinned memory using a device instead of 'mmap' plus
attempting to register using a device.
* libgomp-plugin.h (GOMP_OFFLOAD_register_page_locked)
(GOMP_OFFLOAD_unregister_page_locked): Remove.
(GOMP_OFFLOAD_page_locked_host_alloc)
(GOMP_OFFLOAD_page_locked_host_free): New.
* libgomp.h (gomp_register_page_locked)
(gomp_unregister_page_locked): Remove.
(gomp_page_locked_host_alloc, gomp_page_locked_host_free): New.
(struct gomp_device_descr): Remove 'register_page_locked_func',
'unregister_page_locked_func'.  Add 'page_locked_host_alloc_func',
'page_locked_host_free_func'.
* plugin/cuda-lib.def (cuMemHostRegister_v2, cuMemHostRegister)
(cuMemHostUnregister): Remove.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_register_page_locked)
(GOMP_OFFLOAD_unregister_page_locked): Remove.
(GOMP_OFFLOAD_page_locked_host_alloc)
(GOMP_OFFLOAD_page_locked_host_free): New.
* target.c (gomp_register_page_locked)
(gomp_unregister_page_locked): Remove.
(gomp_page_locked_host_alloc, gomp_page_locked_host_free): Add.
(gomp_load_plugin_for_device): Don't handle
'register_page_locked', 'unregister_page_locked'.  Handle
'page_locked_host_alloc', 'page_locked_host_free'.

Suggested-by: Andrew Stubbs <ams@codesourcery.com>
14 months agoaarch64: Fix up bfmlal lane pattern [PR104921]
Alex Coplan [Mon, 6 Feb 2023 14:32:21 +0000 (14:32 +0000)] 
aarch64: Fix up bfmlal lane pattern [PR104921]

As the testcase shows, this pattern had an incorrect constraint leading
to GCC's output getting rejected by the assembler.

This patch fixes the constraint accordingly.

The test is split into two: one that can run without bf16 support from
the assembler and another that checks that the output actually assembles
when such support is available.

gcc/ChangeLog:

PR target/104921
* config/aarch64/aarch64-simd.md (aarch64_bfmlal<bt>_lane<q>v4sf):
Use correct constraint for operand 3.

gcc/testsuite/ChangeLog:

PR target/104921
* gcc.target/aarch64/pr104921-1.c: New test.
* gcc.target/aarch64/pr104921-2.c: New test.
* gcc.target/aarch64/pr104921.x: Include file for new tests.

(cherry picked from commit 277e1f30a5e4e634304a7b8a532825119f0ea47f)

14 months agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Mon, 20 Feb 2023 06:52:01 +0000 (07:52 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9189-gc6e3ecca0e3dcf567d0c843a4987e52591041372 (20th Feb 2023)

14 months agoDaily bump.
GCC Administrator [Mon, 20 Feb 2023 00:20:31 +0000 (00:20 +0000)] 
Daily bump.

14 months agoDaily bump.
GCC Administrator [Sun, 19 Feb 2023 00:20:53 +0000 (00:20 +0000)] 
Daily bump.

14 months agoLoongArch: Fix multiarch tuple canonization
Xi Ruoyao [Mon, 13 Feb 2023 10:38:53 +0000 (18:38 +0800)] 
LoongArch: Fix multiarch tuple canonization

Multiarch tuple will be coded in file or directory names in
multiarch-aware distros, so one ABI should have only one multiarch
tuple.  For example, "--target=loongarch64-linux-gnu --with-abi=lp64s"
and "--target=loongarch64-linux-gnusf" should both set multiarch tuple
to "loongarch64-linux-gnusf".  Before this commit,
"--target=loongarch64-linux-gnu --with-abi=lp64s --disable-multilib"
will produce wrong result (loongarch64-linux-gnu).

A recent LoongArch psABI revision mandates "loongarch64-linux-gnu" to be
used for -mabi=lp64d (instead of "loongarch64-linux-gnuf64") for some
non-technical reason [1].  Note that we cannot make
"loongarch64-linux-gnuf64" an alias for "loongarch64-linux-gnu" because
to implement such an alias, we must create thousands of symlinks in the
distro and doing so would be completely unpractical.  This commit also
aligns GCC with the revision.

Tested by building cross compilers with --enable-multiarch and multiple
combinations of --target=loongarch64-linux-gnu*, --with-abi=lp64{s,f,d},
and --{enable,disable}-multilib; and run "xgcc --print-multiarch" then
manually verify the result with eyesight.

[1]: https://github.com/loongson/LoongArch-Documentation/pull/80

gcc/ChangeLog:

* config.gcc (triplet_abi): Set its value based on $with_abi,
instead of $target.
(la_canonical_triplet): Set it after $triplet_abi is set
correctly.
* config/loongarch/t-linux (MULTILIB_OSDIRNAMES): Make the
multiarch tuple for lp64d "loongarch64-linux-gnu" (without
"f64" suffix).

(cherry picked from commit 017849d9d88f021770a90f12fffec9aa2425ed27)

14 months agoDaily bump.
GCC Administrator [Sat, 18 Feb 2023 00:21:23 +0000 (00:21 +0000)] 
Daily bump.

14 months agoDaily bump.
GCC Administrator [Fri, 17 Feb 2023 00:22:02 +0000 (00:22 +0000)] 
Daily bump.

14 months agoAttempt to register OpenMP pinned memory using a device instead of 'mlock'
Thomas Schwinge [Thu, 16 Feb 2023 14:57:37 +0000 (15:57 +0100)] 
Attempt to register OpenMP pinned memory using a device instead of 'mlock'

Implemented for nvptx offloading via 'cuMemHostRegister'.  This means: (a) not
running into 'mlock' limitations, and (b) the device is aware of this and may
optimize host <-> device memory transfers.

This re-works og12 commit ab7520b3b4cd9fdabfd63652badde478955bd3b5
"libgomp: pinned memory".

include/
* cuda/cuda.h (cuMemHostRegister, cuMemHostUnregister): New.
libgomp/
* config/linux/allocator.c (linux_memspace_alloc)
(linux_memspace_free, linux_memspace_realloc): Attempt to register
OpenMP pinned memory using a device instead of 'mlock'.
* libgomp-plugin.h (GOMP_OFFLOAD_register_page_locked)
(GOMP_OFFLOAD_unregister_page_locked): New.
* libgomp.h (gomp_register_page_locked)
(gomp_unregister_page_locked): New
(struct gomp_device_descr): Add 'register_page_locked_func',
'unregister_page_locked_func'.
* plugin/cuda-lib.def (cuMemHostRegister_v2, cuMemHostRegister)
(cuMemHostUnregister): New.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_register_page_locked)
(GOMP_OFFLOAD_unregister_page_locked): New.
* target.c (gomp_register_page_locked)
(gomp_unregister_page_locked): New.
(gomp_load_plugin_for_device): Handle 'register_page_locked',
'unregister_page_locked'.
* testsuite/libgomp.c/alloc-pinned-1.c: Adjust.
* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

14 months agoIn 'libgomp/allocator.c:omp_realloc', route 'free' through 'MEMSPACE_FREE'
Thomas Schwinge [Tue, 14 Feb 2023 12:35:03 +0000 (13:35 +0100)] 
In 'libgomp/allocator.c:omp_realloc', route 'free' through 'MEMSPACE_FREE'

... to not run into a SIGSEGV if a non-'malloc'-based allocation is 'free'd
here.

Fix-up for og12 commit c5d1d7651297a273321154a5fe1b01eba9dcf604
"libgomp, nvptx: low-latency memory allocator".

libgomp/
* allocator.c (omp_realloc): Route 'free' through 'MEMSPACE_FREE'.

14 months agoClarify/verify OpenMP 'omp_calloc' zero-initialization for pinned memory
Thomas Schwinge [Wed, 15 Feb 2023 09:23:03 +0000 (10:23 +0100)] 
Clarify/verify OpenMP 'omp_calloc' zero-initialization for pinned memory

Clarification for og12 commit ab7520b3b4cd9fdabfd63652badde478955bd3b5
"libgomp: pinned memory".  No functional change.

libgomp/
* config/linux/allocator.c (linux_memspace_alloc)
(linux_memspace_calloc): Clarify zero-initialization for pinned
memory.
* testsuite/libgomp.c/alloc-pinned-1.c: Verify zero-initialization
for pinned memory.
* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.

14 months agoMiscellaneous clean-up re OpenMP 'ompx_unified_shared_mem_space', 'ompx_host_mem_space'
Thomas Schwinge [Tue, 14 Feb 2023 16:10:57 +0000 (17:10 +0100)] 
Miscellaneous clean-up re OpenMP 'ompx_unified_shared_mem_space', 'ompx_host_mem_space'

Clean-up for og12 commit 84914e197d91a67b3d27db0e4c69a433462983a5
"openmp, nvptx: ompx_unified_shared_mem_alloc".  No functional change.

libgomp/
* config/linux/allocator.c (linux_memspace_calloc): Elide
(innocuous) duplicate 'if' condition.
* config/nvptx/allocator.c (nvptx_memspace_free): Explicitly
handle 'memspace == ompx_host_mem_space'.
* libgomp.h (gomp_is_usm_ptr): Remove.

14 months agoUn-break nvptx libgomp build
Thomas Schwinge [Thu, 16 Feb 2023 20:59:55 +0000 (21:59 +0100)] 
Un-break nvptx libgomp build

    In file included from [...]/libgomp/config/nvptx/allocator.c:49:
    [...]/libgomp/config/nvptx/../../basic-allocator.c:52:2: error: invalid preprocessing directive #deine; did you mean #define?
       52 | #deine BASIC_ALLOC_YIELD
          |  ^~~~~
          |  define

Yes, indeed.

Fix-up for og12 commit 9583738a62a33a276b2aad980a27e77097f95924
"nvptx, libgomp: Move the low-latency allocator code".

libgomp/
* basic-allocator.c (BASIC_ALLOC_YIELD): instead of '#deine',
'#define' it.

14 months agolibstdc++: Fix incorrect function call in -ffast-math optimization
Matthias Kretz [Fri, 3 Feb 2023 16:47:05 +0000 (17:47 +0100)] 
libstdc++: Fix incorrect function call in -ffast-math optimization

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_math.h (__hypot): Bitcasting
between scalars requires the __bit_cast helper function instead
of simd_bit_cast.

(cherry picked from commit a5de17d9120dde7e6598a05ea4d1556c2783c69b)

14 months agolibstdc++: Fix incorrect __builtin_is_constant_evaluated calls
Matthias Kretz [Thu, 2 Feb 2023 11:29:33 +0000 (12:29 +0100)] 
libstdc++: Fix incorrect __builtin_is_constant_evaluated calls

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_x86.h
(_SimdImplX86::_S_not_equal_to, _SimdImplX86::_S_less)
(_SimdImplX86::_S_less_equal): Do not call
__builtin_is_constant_evaluated in constexpr-if.

(cherry picked from commit 1fd3836463c65f695831ef04c7dbda1e7a1794ba)