\input texinfo @c -*-texinfo-*-

@c %**start of header
@setfilename libgomp.info
@settitle GNU libgomp
@c %**end of header


@copying
Copyright @copyright{} 2006-2023 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Funding Free Software'', the Front-Cover
texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below).  A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software.  Copies published by the Free Software Foundation raise
     funds for GNU development.
@end copying

@ifinfo
@dircategory GNU Libraries
@direntry
* libgomp: (libgomp).          GNU Offloading and Multi Processing Runtime Library.
@end direntry

This manual documents libgomp, the GNU Offloading and Multi Processing
Runtime library.  This is the GNU implementation of the OpenMP and
OpenACC APIs for parallel and accelerator programming in C/C++ and
Fortran.

Published by the Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110-1301 USA

@insertcopying
@end ifinfo


@setchapternewpage odd

@titlepage
@title GNU Offloading and Multi Processing Runtime Library
@subtitle The GNU OpenMP and OpenACC Implementation
@page
@vskip 0pt plus 1filll
@comment For the @value{version-GCC} Version*
@sp 1
Published by the Free Software Foundation @*
51 Franklin Street, Fifth Floor@*
Boston, MA 02110-1301, USA@*
@sp 1
@insertcopying
@end titlepage

@summarycontents
@contents
@page


@node Top, Enabling OpenMP
@top Introduction
@cindex Introduction

This manual documents the usage of libgomp, the GNU Offloading and
Multi Processing Runtime Library.  This includes the GNU
implementation of the @uref{https://www.openmp.org, OpenMP} Application
Programming Interface (API) for multi-platform shared-memory parallel
programming in C/C++ and Fortran, and the GNU implementation of the
@uref{https://www.openacc.org, OpenACC} Application Programming
Interface (API) for offloading of code to accelerator devices in C/C++
and Fortran.

Originally, libgomp implemented the GNU OpenMP Runtime Library.  Based
on this, support for OpenACC and offloading (both OpenACC and OpenMP
4's target construct) has been added later on, and the library's name
changed to GNU Offloading and Multi Processing Runtime Library.



@comment
@comment  When you add a new menu item, please keep the right hand
@comment  aligned to the same column.  Do not use tabs.  This provides
@comment  better formatting.
@comment
@menu
* Enabling OpenMP::              How to enable OpenMP for your applications.
* OpenMP Implementation Status:: List of implemented features by OpenMP version
* OpenMP Runtime Library Routines: Runtime Library Routines.
                                 The OpenMP runtime application programming
                                 interface.
* OpenMP Environment Variables: Environment Variables.
                                 Influencing OpenMP runtime behavior with
                                 environment variables.
* Enabling OpenACC::             How to enable OpenACC for your
                                 applications.
* OpenACC Runtime Library Routines:: The OpenACC runtime application
                                 programming interface.
* OpenACC Environment Variables:: Influencing OpenACC runtime behavior with
                                 environment variables.
* CUDA Streams Usage::           Notes on the implementation of
                                 asynchronous operations.
* OpenACC Library Interoperability:: OpenACC library interoperability with the
                                 NVIDIA CUBLAS library.
* OpenACC Profiling Interface::
* OpenMP-Implementation Specifics:: Notes on specifics of this OpenMP
                                 implementation
* Offload-Target Specifics::     Notes on offload-target specific internals
* The libgomp ABI::              Notes on the external ABI presented by libgomp.
* Reporting Bugs::               How to report bugs in the GNU Offloading and
                                 Multi Processing Runtime Library.
* Copying::                      GNU general public license says
                                 how you can copy and share libgomp.
* GNU Free Documentation License::
                                 How you can copy and share this manual.
* Funding::                      How to help assure continued work for free
                                 software.
* Library Index::                Index of this documentation.
@end menu


@c ---------------------------------------------------------------------
@c Enabling OpenMP
@c ---------------------------------------------------------------------

@node Enabling OpenMP
@chapter Enabling OpenMP

To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
flag @option{-fopenmp} must be specified.  For C and C++, this enables
the handling of the OpenMP directives using @code{#pragma omp} and the
@code{[[omp::directive(...)]]}, @code{[[omp::sequence(...)]]} and
@code{[[omp::decl(...)]]} attributes.  For Fortran, it enables for
free source form the @code{!$omp} sentinel for directives and the
@code{!$} conditional compilation sentinel and for fixed source form the
@code{c$omp}, @code{*$omp} and @code{!$omp} sentinels for directives and
the @code{c$}, @code{*$} and @code{!$} conditional compilation sentinels.
The flag also arranges for automatic linking of the OpenMP runtime library
(@ref{Runtime Library Routines}).

The @option{-fopenmp-simd} flag can be used to enable a subset of
OpenMP directives that do not require the linking of either the
OpenMP runtime library or the POSIX threads library.

A complete description of all OpenMP directives may be found in the
@uref{https://www.openmp.org, OpenMP Application Program Interface} manuals.
See also @ref{OpenMP Implementation Status}.


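As an illustrative, non-normative sketch (the file name @file{hello.c} is
only an example), a minimal C program using a parallel region might look
as follows:

@smallexample
#include <stdio.h>
#include <omp.h>

int
main (void)
@{
  /* Each thread of the team executes the block once.  */
  #pragma omp parallel
  printf ("Hello from thread %d of %d\n",
          omp_get_thread_num (), omp_get_num_threads ());
  return 0;
@}
@end smallexample

Compiled with @command{gcc -fopenmp hello.c}, the OpenMP runtime library
is linked in automatically and each thread of the team prints its ID.
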
@c ---------------------------------------------------------------------
@c OpenMP Implementation Status
@c ---------------------------------------------------------------------

@node OpenMP Implementation Status
@chapter OpenMP Implementation Status

@menu
* OpenMP 4.5::                  Feature completion status to 4.5 specification
* OpenMP 5.0::                  Feature completion status to 5.0 specification
* OpenMP 5.1::                  Feature completion status to 5.1 specification
* OpenMP 5.2::                  Feature completion status to 5.2 specification
* OpenMP Technical Report 11::  Feature completion status to first 6.0 preview
@end menu

The @code{_OPENMP} preprocessor macro and Fortran's @code{openmp_version}
parameter, provided by @code{omp_lib.h} and the @code{omp_lib} module, have
the value @code{201511} (i.e. OpenMP 4.5).

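For illustration only (not part of the normative text), a C program can
inspect this macro at compile time; the messages below are merely examples:

@smallexample
#include <stdio.h>

int
main (void)
@{
#ifdef _OPENMP
  /* With GCC this currently prints 201511, i.e. OpenMP 4.5.  */
  printf ("Compiled with OpenMP, version macro = %d\n", _OPENMP);
#else
  printf ("Compiled without -fopenmp\n");
#endif
  return 0;
@}
@end smallexample
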
@node OpenMP 4.5
@section OpenMP 4.5

The OpenMP 4.5 specification is fully supported.

@node OpenMP 5.0
@section OpenMP 5.0

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
@c This list is sorted as in OpenMP 5.1's B.3 not as in OpenMP 5.0's B.2

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Array shaping @tab N @tab
@item Array sections with non-unit strides in C and C++ @tab N @tab
@item Iterators @tab Y @tab
@item @code{metadirective} directive @tab N @tab
@item @code{declare variant} directive
      @tab P @tab @emph{simd} traits not handled correctly
@item @var{target-offload-var} ICV and @code{OMP_TARGET_OFFLOAD}
      env variable @tab Y @tab
@item Nested-parallel changes to @var{max-active-levels-var} ICV @tab Y @tab
@item @code{requires} directive @tab P
      @tab complete but no non-host device provides @code{unified_shared_memory}
@item @code{teams} construct outside an enclosing target region @tab Y @tab
@item Non-rectangular loop nests @tab P
      @tab Full support for C/C++, partial for Fortran
           (@uref{https://gcc.gnu.org/PR110735,PR110735})
@item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
@item @code{nonmonotonic} as default loop schedule modifier for worksharing-loop
      constructs @tab Y @tab
@item Collapse of associated loops that are imperfectly nested loops @tab Y @tab
@item Clauses @code{if}, @code{nontemporal} and @code{order(concurrent)} in
      @code{simd} construct @tab Y @tab
@item @code{atomic} constructs in @code{simd} @tab Y @tab
@item @code{loop} construct @tab Y @tab
@item @code{order(concurrent)} clause @tab Y @tab
@item @code{scan} directive and @code{in_scan} modifier for the
      @code{reduction} clause @tab Y @tab
@item @code{in_reduction} clause on @code{task} constructs @tab Y @tab
@item @code{in_reduction} clause on @code{target} constructs @tab P
      @tab @code{nowait} only stub
@item @code{task_reduction} clause with @code{taskgroup} @tab Y @tab
@item @code{task} modifier to @code{reduction} clause @tab Y @tab
@item @code{affinity} clause to @code{task} construct @tab Y @tab Stub only
@item @code{detach} clause to @code{task} construct @tab Y @tab
@item @code{omp_fulfill_event} runtime routine @tab Y @tab
@item @code{reduction} and @code{in_reduction} clauses on @code{taskloop}
      and @code{taskloop simd} constructs @tab Y @tab
@item @code{taskloop} construct cancelable by @code{cancel} construct
      @tab Y @tab
@item @code{mutexinoutset} @emph{dependence-type} for @code{depend} clause
      @tab Y @tab
@item Predefined memory spaces, memory allocators, allocator traits
      @tab Y @tab See also @ref{Memory allocation}
@item Memory management routines @tab Y @tab
@item @code{allocate} directive @tab P @tab Only C and Fortran, only stack variables
@item @code{allocate} clause @tab P @tab Initial support
@item @code{use_device_addr} clause on @code{target data} @tab Y @tab
@item @code{ancestor} modifier on @code{device} clause @tab Y @tab
@item Implicit declare target directive @tab Y @tab
@item Discontiguous array section with @code{target update} construct
      @tab N @tab
@item C/C++'s lvalue expressions in @code{to}, @code{from}
      and @code{map} clauses @tab N @tab
@item C/C++'s lvalue expressions in @code{depend} clauses @tab Y @tab
@item Nested @code{declare target} directive @tab Y @tab
@item Combined @code{master} constructs @tab Y @tab
@item @code{depend} clause on @code{taskwait} @tab Y @tab
@item Weak memory ordering clauses on @code{atomic} and @code{flush} construct
      @tab Y @tab
@item @code{hint} clause on the @code{atomic} construct @tab Y @tab Stub only
@item @code{depobj} construct and depend objects @tab Y @tab
@item Lock hints were renamed to synchronization hints @tab Y @tab
@item @code{conditional} modifier to @code{lastprivate} clause @tab Y @tab
@item Map-order clarifications @tab P @tab
@item @code{close} @emph{map-type-modifier} @tab Y @tab
@item Mapping C/C++ pointer variables and to assign the address of
      device memory mapped by an array section @tab P @tab
@item Mapping of Fortran pointer and allocatable variables, including pointer
      and allocatable components of variables
      @tab P @tab Mapping of vars with allocatable components unsupported
@item @code{defaultmap} extensions @tab Y @tab
@item @code{declare mapper} directive @tab N @tab
@item @code{omp_get_supported_active_levels} routine @tab Y @tab
@item Runtime routines and environment variables to display runtime thread
      affinity information @tab Y @tab
@item @code{omp_pause_resource} and @code{omp_pause_resource_all} runtime
      routines @tab Y @tab
@item @code{omp_get_device_num} runtime routine @tab Y @tab
@item OMPT interface @tab N @tab
@item OMPD interface @tab N @tab
@end multitable

@unnumberedsubsec Other new OpenMP 5.0 features

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Supporting C++'s range-based for loop @tab Y @tab
@end multitable


@node OpenMP 5.1
@section OpenMP 5.1

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item OpenMP directive as C++ attribute specifiers @tab Y @tab
@item @code{omp_all_memory} reserved locator @tab Y @tab
@item @emph{target_device trait} in OpenMP Context @tab N @tab
@item @code{target_device} selector set in context selectors @tab N @tab
@item C/C++'s @code{declare variant} directive: elision support of
      preprocessed code @tab N @tab
@item @code{declare variant}: new clauses @code{adjust_args} and
      @code{append_args} @tab N @tab
@item @code{dispatch} construct @tab N @tab
@item device-specific ICV settings with environment variables @tab Y @tab
@item @code{assume} and @code{assumes} directives @tab Y @tab
@item @code{nothing} directive @tab Y @tab
@item @code{error} directive @tab Y @tab
@item @code{masked} construct @tab Y @tab
@item @code{scope} directive @tab Y @tab
@item Loop transformation constructs @tab N @tab
@item @code{strict} modifier in the @code{grainsize} and @code{num_tasks}
      clauses of the @code{taskloop} construct @tab Y @tab
@item @code{align} clause in @code{allocate} directive @tab P
      @tab Only C and Fortran (and only stack variables)
@item @code{align} modifier in @code{allocate} clause @tab Y @tab
@item @code{thread_limit} clause to @code{target} construct @tab Y @tab
@item @code{has_device_addr} clause to @code{target} construct @tab Y @tab
@item Iterators in @code{target update} motion clauses and @code{map}
      clauses @tab N @tab
@item Indirect calls to the device version of a procedure or function in
      @code{target} regions @tab P @tab Only C and C++
@item @code{interop} directive @tab N @tab
@item @code{omp_interop_t} object support in runtime routines @tab N @tab
@item @code{nowait} clause in @code{taskwait} directive @tab Y @tab
@item Extensions to the @code{atomic} directive @tab Y @tab
@item @code{seq_cst} clause on a @code{flush} construct @tab Y @tab
@item @code{inoutset} argument to the @code{depend} clause @tab Y @tab
@item @code{private} and @code{firstprivate} argument to @code{default}
      clause in C and C++ @tab Y @tab
@item @code{present} argument to @code{defaultmap} clause @tab Y @tab
@item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
      @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
      routines @tab Y @tab
@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
@item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
      runtime routines @tab Y @tab
@item @code{omp_get_mapped_ptr} runtime routine @tab Y @tab
@item @code{omp_calloc}, @code{omp_realloc}, @code{omp_aligned_alloc} and
      @code{omp_aligned_calloc} runtime routines @tab Y @tab
@item @code{omp_alloctrait_key_t} enum: @code{omp_atv_serialized} added,
      @code{omp_atv_default} changed @tab Y @tab
@item @code{omp_display_env} runtime routine @tab Y @tab
@item @code{ompt_scope_endpoint_t} enum: @code{ompt_scope_beginend} @tab N @tab
@item @code{ompt_sync_region_t} enum additions @tab N @tab
@item @code{ompt_state_t} enum: @code{ompt_state_wait_barrier_implementation}
      and @code{ompt_state_wait_barrier_teams} @tab N @tab
@item @code{ompt_callback_target_data_op_emi_t},
      @code{ompt_callback_target_emi_t}, @code{ompt_callback_target_map_emi_t}
      and @code{ompt_callback_target_submit_emi_t} @tab N @tab
@item @code{ompt_callback_error_t} type @tab N @tab
@item @code{OMP_PLACES} syntax extensions @tab Y @tab
@item @code{OMP_NUM_TEAMS} and @code{OMP_TEAMS_THREAD_LIMIT} environment
      variables @tab Y @tab
@end multitable

@unnumberedsubsec Other new OpenMP 5.1 features

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Support of strictly structured blocks in Fortran @tab Y @tab
@item Support of structured block sequences in C/C++ @tab Y @tab
@item @code{unconstrained} and @code{reproducible} modifiers on @code{order}
      clause @tab Y @tab
@item Support @code{begin/end declare target} syntax in C/C++ @tab Y @tab
@item Pointer predetermined firstprivate getting initialized
to address of matching mapped list item per 5.1, Sect. 2.21.7.2 @tab N @tab
@item For Fortran, diagnose placing declarative before/between @code{USE},
      @code{IMPORT}, and @code{IMPLICIT} as invalid @tab N @tab
@item Optional comma between directive and clause in the @code{#pragma} form @tab Y @tab
@item @code{indirect} clause in @code{declare target} @tab P @tab Only C and C++
@item @code{device_type(nohost)}/@code{device_type(host)} for variables @tab N @tab
@item @code{present} modifier to the @code{map}, @code{to} and @code{from}
      clauses @tab Y @tab
@end multitable


@node OpenMP 5.2
@section OpenMP 5.2

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item @code{omp_in_explicit_task} routine and @var{explicit-task-var} ICV
      @tab Y @tab
@item @code{omp}/@code{ompx}/@code{omx} sentinels and @code{omp_}/@code{ompx_}
      namespaces @tab N/A
      @tab warning for @code{ompx/omx} sentinels@footnote{The @code{ompx}
      sentinel as C/C++ pragma and C++ attributes are warned for with
      @code{-Wunknown-pragmas} (implied by @code{-Wall}) and @code{-Wattributes}
      (enabled by default), respectively; for Fortran free-source code, there is
      a warning enabled by default and, for fixed-source code, the @code{omx}
      sentinel is warned for with @code{-Wsurprising} (enabled by
      @code{-Wall}).  Unknown clauses are always rejected with an error.}
@item Clauses on @code{end} directive can be on directive @tab Y @tab
@item @code{destroy} clause with destroy-var argument on @code{depobj}
      @tab N @tab
@item Deprecation of no-argument @code{destroy} clause on @code{depobj}
      @tab N @tab
@item @code{linear} clause syntax changes and @code{step} modifier @tab Y @tab
@item Deprecation of minus operator for reductions @tab N @tab
@item Deprecation of separating @code{map} modifiers without comma @tab N @tab
@item @code{declare mapper} with iterator and @code{present} modifiers
      @tab N @tab
@item If a matching mapped list item is not found in the data environment, the
      pointer retains its original value @tab Y @tab
@item New @code{enter} clause as alias for @code{to} on declare target directive
      @tab Y @tab
@item Deprecation of @code{to} clause on declare target directive @tab N @tab
@item Extended list of directives permitted in Fortran pure procedures
      @tab Y @tab
@item New @code{allocators} directive for Fortran @tab N @tab
@item Deprecation of @code{allocate} directive for Fortran
      allocatables/pointers @tab N @tab
@item Optional paired @code{end} directive with @code{dispatch} @tab N @tab
@item New @code{memspace} and @code{traits} modifiers for @code{uses_allocators}
      @tab N @tab
@item Deprecation of traits array following the allocator_handle expression in
      @code{uses_allocators} @tab N @tab
@item New @code{otherwise} clause as alias for @code{default} on metadirectives
      @tab N @tab
@item Deprecation of @code{default} clause on metadirectives @tab N @tab
@item Deprecation of delimited form of @code{declare target} @tab N @tab
@item Reproducible semantics changed for @code{order(concurrent)} @tab N @tab
@item @code{allocate} and @code{firstprivate} clauses on @code{scope}
      @tab Y @tab
@item @code{ompt_callback_work} @tab N @tab
@item Default map-type for the @code{map} clause in @code{target enter/exit data}
      @tab Y @tab
@item New @code{doacross} clause as alias for @code{depend} with
      @code{source}/@code{sink} modifier @tab Y @tab
@item Deprecation of @code{depend} with @code{source}/@code{sink} modifier
      @tab N @tab
@item @code{omp_cur_iteration} keyword @tab Y @tab
@end multitable

@unnumberedsubsec Other new OpenMP 5.2 features

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item For Fortran, optional comma between directive and clause @tab N @tab
@item Conforming device numbers and @code{omp_initial_device} and
      @code{omp_invalid_device} enum/PARAMETER @tab Y @tab
@item Initial value of @var{default-device-var} ICV with
      @code{OMP_TARGET_OFFLOAD=mandatory} @tab Y @tab
@item @code{all} as @emph{implicit-behavior} for @code{defaultmap} @tab Y @tab
@item @emph{interop_types} in any position of the modifier list for the @code{init} clause
      of the @code{interop} construct @tab N @tab
@item Invoke virtual member functions of C++ objects created on the host device
      on other devices @tab N @tab
@end multitable


@node OpenMP Technical Report 11
@section OpenMP Technical Report 11

Technical Report (TR) 11 is the first preview for OpenMP 6.0.

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
@multitable @columnfractions .60 .10 .25
@item Features deprecated in versions 5.2, 5.1 and 5.0 were removed
      @tab N/A @tab Backward compatibility
@item The @code{decl} attribute was added to the C++ attribute syntax
      @tab Y @tab
@item @code{_ALL} suffix to the device-scope environment variables
      @tab P @tab Host device number wrongly accepted
@item For Fortran, @emph{locator list} can be also function reference with
      data pointer result @tab N @tab
@item Ref-count change for @code{use_device_ptr}/@code{use_device_addr}
      @tab N @tab
@item Implicit reduction identifiers of C++ classes
      @tab N @tab
@item Change of the @emph{map-type} property from @emph{ultimate} to
      @emph{default} @tab N @tab
@item Concept of @emph{assumed-size arrays} in C and C++
      @tab N @tab
@item Mapping of @emph{assumed-size arrays} in C, C++ and Fortran
      @tab N @tab
@item @code{groupprivate} directive @tab N @tab
@item @code{local} clause to declare target directive @tab N @tab
@item @code{part_size} allocator trait @tab N @tab
@item @code{pin_device}, @code{preferred_device} and @code{target_access}
      allocator traits
      @tab N @tab
@item @code{access} allocator trait changes @tab N @tab
@item Extension of @code{interop} operation of @code{append_args}, allowing all
      modifiers of the @code{init} clause
      @tab N @tab
@item @code{interop} clause to @code{dispatch} @tab N @tab
@item @code{apply} code to loop-transforming constructs @tab N @tab
@item @code{omp_curr_progress_width} identifier @tab N @tab
@item @code{safesync} clause to the @code{parallel} construct @tab N @tab
@item @code{omp_get_max_progress_width} runtime routine @tab N @tab
@item @code{strict} modifier keyword to @code{num_threads} @tab N @tab
@item @code{memscope} clause to @code{atomic} and @code{flush} @tab N @tab
@item Routines for obtaining memory spaces/allocators for shared/device memory
      @tab N @tab
@item @code{omp_get_memspace_num_resources} routine @tab N @tab
@item @code{omp_get_submemspace} routine @tab N @tab
@item @code{ompt_get_buffer_limits} OMPT routine @tab N @tab
@item Extension of @code{OMP_DEFAULT_DEVICE} and new
      @code{OMP_AVAILABLE_DEVICES} environment vars @tab N @tab
@item Supporting increments with abstract names in @code{OMP_PLACES} @tab N @tab
@end multitable

@unnumberedsubsec Other new TR 11 features
@multitable @columnfractions .60 .10 .25
@item Relaxed Fortran restrictions to the @code{aligned} clause @tab N @tab
@item Mapping lambda captures @tab N @tab
@item For Fortran, atomic compare with storing the comparison result
      @tab N @tab
@end multitable



@c ---------------------------------------------------------------------
@c OpenMP Runtime Library Routines
@c ---------------------------------------------------------------------

@node Runtime Library Routines
@chapter OpenMP Runtime Library Routines

The runtime routines described here are defined by Section 18 of the OpenMP
specification in version 5.2.

@menu
* Thread Team Routines::
* Thread Affinity Routines::
* Teams Region Routines::
* Tasking Routines::
@c * Resource Relinquishing Routines::
* Device Information Routines::
* Device Memory Routines::
* Lock Routines::
* Timing Routines::
* Event Routine::
@c * Interoperability Routines::
* Memory Management Routines::
@c * Tool Control Routine::
@c * Environment Display Routine::
@end menu



@node Thread Team Routines
@section Thread Team Routines

Routines controlling threads in the current contention group.
They have C linkage and do not throw exceptions.

@menu
* omp_set_num_threads::         Set upper team size limit
* omp_get_num_threads::         Size of the active team
* omp_get_max_threads::         Maximum number of threads of parallel region
* omp_get_thread_num::          Current thread ID
* omp_in_parallel::             Whether a parallel region is active
* omp_set_dynamic::             Enable/disable dynamic teams
* omp_get_dynamic::             Dynamic teams setting
* omp_get_cancellation::        Whether cancellation support is enabled
* omp_set_nested::              Enable/disable nested parallel regions
* omp_get_nested::              Nested parallel regions
* omp_set_schedule::            Set the runtime scheduling method
* omp_get_schedule::            Obtain the runtime scheduling method
* omp_get_teams_thread_limit::  Maximum number of threads imposed by teams
* omp_get_supported_active_levels:: Maximum number of active regions supported
* omp_set_max_active_levels::   Limits the number of active parallel regions
* omp_get_max_active_levels::   Current maximum number of active regions
* omp_get_level::               Number of parallel regions
* omp_get_ancestor_thread_num:: Ancestor thread ID
* omp_get_team_size::           Number of threads in a team
* omp_get_active_level::        Number of active parallel regions
@end menu



@node omp_set_num_threads
@subsection @code{omp_set_num_threads} -- Set upper team size limit
@table @asis
@item @emph{Description}:
Specifies the number of threads used by default in subsequent parallel
sections, if those do not specify a @code{num_threads} clause.  The
argument of @code{omp_set_num_threads} shall be a positive integer.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_set_num_threads(int num_threads);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(num_threads)}
@item                   @tab @code{integer, intent(in) :: num_threads}
@end multitable

@item @emph{See also}:
@ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.1.
@end table


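As an illustrative, non-normative sketch of how this routine is typically
combined with @code{omp_get_num_threads}:

@smallexample
#include <stdio.h>
#include <omp.h>

int
main (void)
@{
  omp_set_num_threads (4);   /* Request at most 4 threads for the next region.  */
  #pragma omp parallel
  @{
    #pragma omp single
    printf ("Team size: %d\n", omp_get_num_threads ());
  @}
  return 0;
@}
@end smallexample
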

@node omp_get_num_threads
@subsection @code{omp_get_num_threads} -- Size of the active team
@table @asis
@item @emph{Description}:
Returns the number of threads in the current team.  In a sequential section of
the program @code{omp_get_num_threads} returns 1.

The default team size may be initialized at startup by the
@env{OMP_NUM_THREADS} environment variable.  At runtime, the size
of the current team may be set either by the @code{NUM_THREADS}
clause or by @code{omp_set_num_threads}.  If none of the above were
used to define a specific value and @env{OMP_DYNAMIC} is disabled,
one thread per CPU online is used.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
@end multitable

@item @emph{See also}:
@ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.2.
@end table



@node omp_get_max_threads
@subsection @code{omp_get_max_threads} -- Maximum number of threads of parallel region
@table @asis
@item @emph{Description}:
Return the maximum number of threads used for the current parallel region
that does not use the clause @code{num_threads}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
@end multitable

@item @emph{See also}:
@ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.3.
@end table



@node omp_get_thread_num
@subsection @code{omp_get_thread_num} -- Current thread ID
@table @asis
@item @emph{Description}:
Returns a unique thread identification number within the current team.
In sequential parts of the program, @code{omp_get_thread_num}
always returns 0.  In parallel regions the return value varies
from 0 to @code{omp_get_num_threads}-1 inclusive.  The return
value of the primary thread of a team is always 0.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
@end multitable

@item @emph{See also}:
@ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.4.
@end table


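A brief, non-normative illustration of the numbering described above:

@smallexample
#include <stdio.h>
#include <omp.h>

int
main (void)
@{
  /* Outside any parallel region the thread number is always 0.  */
  printf ("Outside a parallel region: thread %d\n", omp_get_thread_num ());
  #pragma omp parallel num_threads (3)
  printf ("Inside: thread %d of %d\n",
          omp_get_thread_num (), omp_get_num_threads ());
  return 0;
@}
@end smallexample
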

@node omp_in_parallel
@subsection @code{omp_in_parallel} -- Whether a parallel region is active
@table @asis
@item @emph{Description}:
This function returns @code{true} if currently running in parallel,
@code{false} otherwise.  Here, @code{true} and @code{false} represent
their language-specific counterparts.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_in_parallel(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
@end multitable

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.6.
@end table


@node omp_set_dynamic
@subsection @code{omp_set_dynamic} -- Enable/disable dynamic teams
@table @asis
@item @emph{Description}:
Enable or disable the dynamic adjustment of the number of threads
within a team.  The function takes the language-specific equivalent
of @code{true} and @code{false}, where @code{true} enables dynamic
adjustment of team sizes and @code{false} disables it.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_set_dynamic(int dynamic_threads);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(dynamic_threads)}
@item                   @tab @code{logical, intent(in) :: dynamic_threads}
@end multitable

@item @emph{See also}:
@ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.7.
@end table



@node omp_get_dynamic
@subsection @code{omp_get_dynamic} -- Dynamic teams setting
@table @asis
@item @emph{Description}:
This function returns @code{true} if the dynamic adjustment of the number
of threads is enabled, @code{false} otherwise.  Here, @code{true} and
@code{false} represent their language-specific counterparts.

The dynamic team setting may be initialized at startup by the
@env{OMP_DYNAMIC} environment variable or at runtime using
@code{omp_set_dynamic}.  If undefined, dynamic adjustment is
disabled by default.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
@end multitable

@item @emph{See also}:
@ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.8.
@end table



@node omp_get_cancellation
@subsection @code{omp_get_cancellation} -- Whether cancellation support is enabled
@table @asis
@item @emph{Description}:
This function returns @code{true} if cancellation is activated, @code{false}
otherwise.  Here, @code{true} and @code{false} represent their language-specific
counterparts.  Unless @env{OMP_CANCELLATION} is set true, cancellations are
deactivated.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_cancellation(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{logical function omp_get_cancellation()}
@end multitable

@item @emph{See also}:
@ref{OMP_CANCELLATION}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.9.
@end table



@node omp_set_nested
@subsection @code{omp_set_nested} -- Enable/disable nested parallel regions
@table @asis
@item @emph{Description}:
Enable or disable nested parallel regions, i.e., whether team members
are allowed to create new teams.  The function takes the language-specific
equivalent of @code{true} and @code{false}, where @code{true} enables
nested parallel regions and @code{false} disables them.

Enabling nested parallel regions also sets the maximum number of
active nested regions to the maximum supported.  Disabling nested parallel
regions sets the maximum number of active nested regions to one.

Note that the @code{omp_set_nested} API routine was deprecated
in the OpenMP specification 5.2 in favor of @code{omp_set_max_active_levels}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_set_nested(int nested);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine omp_set_nested(nested)}
@item                   @tab @code{logical, intent(in) :: nested}
@end multitable

@item @emph{See also}:
@ref{omp_get_nested}, @ref{omp_set_max_active_levels},
@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.10.
@end table



@node omp_get_nested
@subsection @code{omp_get_nested} -- Nested parallel regions
@table @asis
@item @emph{Description}:
This function returns @code{true} if nested parallel regions are
enabled, @code{false} otherwise.  Here, @code{true} and @code{false}
represent their language-specific counterparts.

The state of nested parallel regions at startup depends on several
environment variables.  If @env{OMP_MAX_ACTIVE_LEVELS} is defined
and is set to greater than one, then nested parallel regions will be
enabled.  If not defined, then the value of the @env{OMP_NESTED}
environment variable will be followed if defined.  If neither are
defined, then if either @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND}
are defined with a list of more than one value, then nested parallel
regions are enabled.  If none of these are defined, then nested parallel
regions are disabled by default.

Nested parallel regions can be enabled or disabled at runtime using
@code{omp_set_nested}, or by setting the maximum number of nested
regions with @code{omp_set_max_active_levels} to one to disable, or
above one to enable.

Note that the @code{omp_get_nested} API routine was deprecated
in the OpenMP specification 5.2 in favor of @code{omp_get_max_active_levels}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_nested(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{logical function omp_get_nested()}
@end multitable

@item @emph{See also}:
@ref{omp_get_max_active_levels}, @ref{omp_set_nested},
@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.11.
@end table


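A non-normative sketch of enabling nested parallelism via the
non-deprecated routine @code{omp_set_max_active_levels}:

@smallexample
#include <stdio.h>
#include <omp.h>

int
main (void)
@{
  omp_set_max_active_levels (2);   /* Allow two levels of active parallelism.  */
  #pragma omp parallel num_threads (2)
  @{
    #pragma omp parallel num_threads (2)
    @{
      #pragma omp single
      printf ("Inner team at level %d, active level %d\n",
              omp_get_level (), omp_get_active_level ());
    @}
  @}
  return 0;
@}
@end smallexample
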

@node omp_set_schedule
@subsection @code{omp_set_schedule} -- Set the runtime scheduling method
@table @asis
@item @emph{Description}:
Sets the runtime scheduling method.  The @var{kind} argument can have the
value @code{omp_sched_static}, @code{omp_sched_dynamic},
@code{omp_sched_guided} or @code{omp_sched_auto}.  Except for
@code{omp_sched_auto}, the chunk size is set to the value of
@var{chunk_size} if positive, or to the default value if zero or negative.
For @code{omp_sched_auto} the @var{chunk_size} argument is ignored.

@item @emph{C/C++}
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int chunk_size);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, chunk_size)}
@item                   @tab @code{integer(kind=omp_sched_kind) kind}
@item                   @tab @code{integer chunk_size}
@end multitable

@item @emph{See also}:
@ref{omp_get_schedule}
@ref{OMP_SCHEDULE}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.12.
@end table


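An illustrative sketch (not part of the original text) of selecting the
runtime schedule and then using it via @code{schedule(runtime)}:

@smallexample
#include <omp.h>

void
scale (double *a, int n)
@{
  /* Pick a dynamic schedule with chunks of 8 iterations.  */
  omp_set_schedule (omp_sched_dynamic, 8);

  #pragma omp parallel for schedule(runtime)
  for (int i = 0; i < n; i++)
    a[i] *= 2.0;
@}
@end smallexample
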

@node omp_get_schedule
@subsection @code{omp_get_schedule} -- Obtain the runtime scheduling method
@table @asis
@item @emph{Description}:
Obtain the runtime scheduling method.  The @var{kind} argument is set to
@code{omp_sched_static}, @code{omp_sched_dynamic},
@code{omp_sched_guided} or @code{omp_sched_auto}.  The second argument,
@var{chunk_size}, is set to the chunk size.

@item @emph{C/C++}
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *chunk_size);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, chunk_size)}
@item                   @tab @code{integer(kind=omp_sched_kind) kind}
@item                   @tab @code{integer chunk_size}
@end multitable

@item @emph{See also}:
@ref{omp_set_schedule}, @ref{OMP_SCHEDULE}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.13.
@end table


@node omp_get_teams_thread_limit
@subsection @code{omp_get_teams_thread_limit} -- Maximum number of threads imposed by teams
@table @asis
@item @emph{Description}:
Return the maximum number of threads that are able to participate in
each team created by a teams construct.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_teams_thread_limit(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_teams_thread_limit()}
@end multitable

@item @emph{See also}:
@ref{omp_set_teams_thread_limit}, @ref{OMP_TEAMS_THREAD_LIMIT}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.6.
@end table



@node omp_get_supported_active_levels
@subsection @code{omp_get_supported_active_levels} -- Maximum number of active regions supported
@table @asis
@item @emph{Description}:
This function returns the maximum number of nested, active parallel regions
supported by this implementation.

@item @emph{C/C++}
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_supported_active_levels(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_supported_active_levels()}
@end multitable

@item @emph{See also}:
@ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.15.
@end table



@node omp_set_max_active_levels
@subsection @code{omp_set_max_active_levels} -- Limits the number of active parallel regions
@table @asis
@item @emph{Description}:
This function limits the maximum allowed number of nested, active
parallel regions.  @var{max_levels} must be less than or equal to
the value returned by @code{omp_get_supported_active_levels}.

@item @emph{C/C++}
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)}
@item                   @tab @code{integer max_levels}
@end multitable

@item @emph{See also}:
@ref{omp_get_max_active_levels}, @ref{omp_get_active_level},
@ref{omp_get_supported_active_levels}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.15.
@end table



@node omp_get_max_active_levels
@subsection @code{omp_get_max_active_levels} -- Current maximum number of active regions
@table @asis
@item @emph{Description}:
This function obtains the maximum allowed number of nested, active parallel regions.

@item @emph{C/C++}
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()}
@end multitable

@item @emph{See also}:
@ref{omp_set_max_active_levels}, @ref{omp_get_active_level}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.16.
@end table


@node omp_get_level
@subsection @code{omp_get_level} -- Obtain the current nesting level
@table @asis
@item @emph{Description}:
This function returns the nesting level for the parallel blocks
that enclose the point of the call.

@item @emph{C/C++}
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_level(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_level()}
@end multitable

@item @emph{See also}:
@ref{omp_get_active_level}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.17.
@end table



@node omp_get_ancestor_thread_num
@subsection @code{omp_get_ancestor_thread_num} -- Ancestor thread ID
@table @asis
@item @emph{Description}:
This function returns the thread identification number for the given
nesting level of the current thread.  For values of @var{level} outside
zero to @code{omp_get_level}, -1 is returned; if @var{level} is
@code{omp_get_level} the result is identical to @code{omp_get_thread_num}.

@item @emph{C/C++}
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)}
@item                   @tab @code{integer level}
@end multitable

@item @emph{See also}:
@ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.18.
@end table



@node omp_get_team_size
@subsection @code{omp_get_team_size} -- Number of threads in a team
@table @asis
@item @emph{Description}:
This function returns the number of threads in a thread team to which
either the current thread or its ancestor belongs.  For values of @var{level}
outside zero to @code{omp_get_level}, -1 is returned; if @var{level} is zero,
1 is returned, and for @code{omp_get_level}, the result is identical
to @code{omp_get_num_threads}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)}
@item                   @tab @code{integer level}
@end multitable

@item @emph{See also}:
@ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.19.
@end table



@node omp_get_active_level
@subsection @code{omp_get_active_level} -- Number of parallel regions
@table @asis
@item @emph{Description}:
This function returns the nesting level for the active parallel blocks
that enclose the point of the call.

@item @emph{C/C++}
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_active_level(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_active_level()}
@end multitable

@item @emph{See also}:
@ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.20.
@end table



@node Thread Affinity Routines
@section Thread Affinity Routines

Routines controlling and accessing thread-affinity policies.
They have C linkage and do not throw exceptions.

@menu
* omp_get_proc_bind::           Whether threads may be moved between CPUs
@c * omp_get_num_places:: <fixme>
@c * omp_get_place_num_procs:: <fixme>
@c * omp_get_place_proc_ids:: <fixme>
@c * omp_get_place_num:: <fixme>
@c * omp_get_partition_num_places:: <fixme>
@c * omp_get_partition_place_nums:: <fixme>
@c * omp_set_affinity_format:: <fixme>
@c * omp_get_affinity_format:: <fixme>
@c * omp_display_affinity:: <fixme>
@c * omp_capture_affinity:: <fixme>
@end menu



@node omp_get_proc_bind
@subsection @code{omp_get_proc_bind} -- Whether threads may be moved between CPUs
@table @asis
@item @emph{Description}:
This function returns the currently active thread affinity policy, which is
set via @env{OMP_PROC_BIND}.  Possible values are @code{omp_proc_bind_false},
@code{omp_proc_bind_true}, @code{omp_proc_bind_primary},
@code{omp_proc_bind_master}, @code{omp_proc_bind_close} and @code{omp_proc_bind_spread},
where @code{omp_proc_bind_master} is an alias for @code{omp_proc_bind_primary}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{omp_proc_bind_t omp_get_proc_bind(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer(kind=omp_proc_bind_kind) function omp_get_proc_bind()}
@end multitable

@item @emph{See also}:
@ref{OMP_PROC_BIND}, @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.22.
@end table


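A non-normative sketch of querying the active policy; the value mentioned
in the comment is only an example and depends on the environment:

@smallexample
#include <stdio.h>
#include <omp.h>

int
main (void)
@{
  omp_proc_bind_t bind = omp_get_proc_bind ();
  /* E.g. with OMP_PROC_BIND=spread this prints the numeric value of
     omp_proc_bind_spread.  */
  printf ("Active proc-bind policy: %d\n", (int) bind);
  return 0;
@}
@end smallexample
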

@node Teams Region Routines
@section Teams Region Routines

Routines controlling the league of teams that are executed in a @code{teams}
region.  They have C linkage and do not throw exceptions.

@menu
* omp_get_num_teams::           Number of teams
* omp_get_team_num::            Get team number
* omp_set_num_teams::           Set upper teams limit for teams region
* omp_get_max_teams::           Maximum number of teams for teams region
* omp_set_teams_thread_limit::  Set upper thread limit for teams construct
* omp_get_thread_limit::        Maximum number of threads
@end menu



@node omp_get_num_teams
@subsection @code{omp_get_num_teams} -- Number of teams
@table @asis
@item @emph{Description}:
Returns the number of teams in the current team region.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_num_teams(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_num_teams()}
@end multitable

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.32.
@end table



@node omp_get_team_num
@subsection @code{omp_get_team_num} -- Get team number
@table @asis
@item @emph{Description}:
Returns the team number of the calling thread.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_team_num(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_team_num()}
@end multitable

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.33.
@end table


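A brief, non-normative illustration of the two routines above, using a
host @code{teams} region (supported since OpenMP 5.0):

@smallexample
#include <stdio.h>
#include <omp.h>

int
main (void)
@{
  /* The initial thread of each team executes the region once.  */
  #pragma omp teams num_teams (2)
  printf ("Team %d of %d\n", omp_get_team_num (), omp_get_num_teams ());
  return 0;
@}
@end smallexample
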

@node omp_set_num_teams
@subsection @code{omp_set_num_teams} -- Set upper teams limit for teams construct
@table @asis
@item @emph{Description}:
Specifies the upper bound for the number of teams created by a teams construct
that does not specify a @code{num_teams} clause.  The
argument of @code{omp_set_num_teams} shall be a positive integer.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_set_num_teams(int num_teams);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine omp_set_num_teams(num_teams)}
@item                   @tab @code{integer, intent(in) :: num_teams}
@end multitable

@item @emph{See also}:
@ref{OMP_NUM_TEAMS}, @ref{omp_get_num_teams}, @ref{omp_get_max_teams}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.3.
@end table



@node omp_get_max_teams
@subsection @code{omp_get_max_teams} -- Maximum number of teams of teams region
@table @asis
@item @emph{Description}:
Return the maximum number of teams used for the teams region
that does not use the clause @code{num_teams}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_max_teams(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_max_teams()}
@end multitable

@item @emph{See also}:
@ref{omp_set_num_teams}, @ref{omp_get_num_teams}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.4.
@end table



506f068e
TB
1324@node omp_set_teams_thread_limit
1325@subsection @code{omp_set_teams_thread_limit} -- Set upper thread limit for teams construct
d77de738
ML
1326@table @asis
1327@item @emph{Description}:
Specifies the upper bound for the number of threads that are available
for each team created by a teams construct that does not specify a
@code{thread_limit} clause. The argument of
@code{omp_set_teams_thread_limit} shall be a positive integer.
d77de738
ML
1332
1333@item @emph{C/C++}:
1334@multitable @columnfractions .20 .80
506f068e 1335@item @emph{Prototype}: @tab @code{void omp_set_teams_thread_limit(int thread_limit);}
d77de738
ML
1336@end multitable
1337
1338@item @emph{Fortran}:
1339@multitable @columnfractions .20 .80
506f068e
TB
1340@item @emph{Interface}: @tab @code{subroutine omp_set_teams_thread_limit(thread_limit)}
1341@item @tab @code{integer, intent(in) :: thread_limit}
d77de738
ML
1342@end multitable
1343
1344@item @emph{See also}:
506f068e 1345@ref{OMP_TEAMS_THREAD_LIMIT}, @ref{omp_get_teams_thread_limit}, @ref{omp_get_thread_limit}
d77de738
ML
1346
1347@item @emph{Reference}:
506f068e 1348@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.5.
d77de738
ML
1349@end table
1350
1351
1352
506f068e
TB
1353@node omp_get_thread_limit
1354@subsection @code{omp_get_thread_limit} -- Maximum number of threads
d77de738
ML
1355@table @asis
1356@item @emph{Description}:
Returns the maximum number of threads available to the program.
d77de738
ML
1358
1359@item @emph{C/C++}:
1360@multitable @columnfractions .20 .80
506f068e 1361@item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);}
d77de738
ML
1362@end multitable
1363
1364@item @emph{Fortran}:
1365@multitable @columnfractions .20 .80
506f068e 1366@item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()}
d77de738
ML
1367@end multitable
1368
1369@item @emph{See also}:
506f068e 1370@ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT}
d77de738
ML
1371
1372@item @emph{Reference}:
506f068e 1373@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.14.
d77de738
ML
1374@end table
1375
1376
1377
506f068e
TB
1378@node Tasking Routines
1379@section Tasking Routines
1380
1381Routines relating to explicit tasks.
1382They have C linkage and do not throw exceptions.
1383
1384@menu
1385* omp_get_max_task_priority:: Maximum task priority value that can be set
819f3d36 1386* omp_in_explicit_task:: Whether a given task is an explicit task
506f068e
TB
1387* omp_in_final:: Whether in final or included task region
1388@end menu
1389
1390
1391
1392@node omp_get_max_task_priority
@subsection @code{omp_get_max_task_priority} -- Maximum priority value that can be set for tasks
d77de738
ML
1395@table @asis
1396@item @emph{Description}:
506f068e 1397This function obtains the maximum allowed priority number for tasks.
d77de738 1398
506f068e 1399@item @emph{C/C++}
d77de738 1400@multitable @columnfractions .20 .80
506f068e 1401@item @emph{Prototype}: @tab @code{int omp_get_max_task_priority(void);}
d77de738
ML
1402@end multitable
1403
1404@item @emph{Fortran}:
1405@multitable @columnfractions .20 .80
506f068e 1406@item @emph{Interface}: @tab @code{integer function omp_get_max_task_priority()}
d77de738
ML
1407@end multitable
1408
1409@item @emph{Reference}:
506f068e 1410@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
d77de738
ML
1411@end table
1412
1413
506f068e 1414
819f3d36
TB
1415@node omp_in_explicit_task
1416@subsection @code{omp_in_explicit_task} -- Whether a given task is an explicit task
1417@table @asis
1418@item @emph{Description}:
The function returns the @var{explicit-task-var} ICV; it returns true when the
encountering task was generated by a task-generating construct such as
@code{target}, @code{task} or @code{taskloop}. Otherwise, the encountering task
is in an implicit task region, such as one generated by an implicit or explicit
@code{parallel} region, and @code{omp_in_explicit_task} returns false.
1424
1425@item @emph{C/C++}
1426@multitable @columnfractions .20 .80
1427@item @emph{Prototype}: @tab @code{int omp_in_explicit_task(void);}
1428@end multitable
1429
1430@item @emph{Fortran}:
1431@multitable @columnfractions .20 .80
1432@item @emph{Interface}: @tab @code{logical function omp_in_explicit_task()}
1433@end multitable
1434
1435@item @emph{Reference}:
1436@uref{https://www.openmp.org, OpenMP specification v5.2}, Section 18.5.2.
1437@end table
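
The distinction can be illustrated by the following C sketch (not part of the
specification text); it assumes a compiler implementing this OpenMP 5.2
routine.  The commented values show the expected results.

@smallexample
#include <omp.h>
#include <stdio.h>

int main (void)
@{
  #pragma omp parallel num_threads (2)
  @{
    if (omp_get_thread_num () == 0)
      /* Implicit task of the parallel region.  */
      printf ("implicit task: %d\n", omp_in_explicit_task ());  /* 0  */

    #pragma omp task
    printf ("explicit task: %d\n", omp_in_explicit_task ());    /* 1  */
  @}
  return 0;
@}
@end smallexample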
1438
1439
1440
d77de738 1441@node omp_in_final
506f068e 1442@subsection @code{omp_in_final} -- Whether in final or included task region
d77de738
ML
1443@table @asis
1444@item @emph{Description}:
1445This function returns @code{true} if currently running in a final
1446or included task region, @code{false} otherwise. Here, @code{true}
1447and @code{false} represent their language-specific counterparts.
1448
1449@item @emph{C/C++}:
1450@multitable @columnfractions .20 .80
1451@item @emph{Prototype}: @tab @code{int omp_in_final(void);}
1452@end multitable
1453
1454@item @emph{Fortran}:
1455@multitable @columnfractions .20 .80
1456@item @emph{Interface}: @tab @code{logical function omp_in_final()}
1457@end multitable
1458
1459@item @emph{Reference}:
1460@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.21.
1461@end table
1462
1463
1464
506f068e
TB
1465@c @node Resource Relinquishing Routines
1466@c @section Resource Relinquishing Routines
1467@c
1468@c Routines releasing resources used by the OpenMP runtime.
1469@c They have C linkage and do not throw exceptions.
1470@c
1471@c @menu
1472@c * omp_pause_resource:: <fixme>
1473@c * omp_pause_resource_all:: <fixme>
1474@c @end menu
1475
1476@node Device Information Routines
1477@section Device Information Routines
1478
1479Routines related to devices available to an OpenMP program.
1480They have C linkage and do not throw exceptions.
1481
1482@menu
1483* omp_get_num_procs:: Number of processors online
1484@c * omp_get_max_progress_width:: <fixme>/TR11
1485* omp_set_default_device:: Set the default device for target regions
1486* omp_get_default_device:: Get the default device for target regions
1487* omp_get_num_devices:: Number of target devices
1488* omp_get_device_num:: Get device that current thread is running on
1489* omp_is_initial_device:: Whether executing on the host device
1490* omp_get_initial_device:: Device number of host device
1491@end menu
1492
1493
1494
1495@node omp_get_num_procs
1496@subsection @code{omp_get_num_procs} -- Number of processors online
d77de738
ML
1497@table @asis
1498@item @emph{Description}:
Returns the number of processors online on the device on which the routine is called.
d77de738
ML
1500
1501@item @emph{C/C++}:
1502@multitable @columnfractions .20 .80
506f068e 1503@item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);}
d77de738
ML
1504@end multitable
1505
1506@item @emph{Fortran}:
1507@multitable @columnfractions .20 .80
506f068e 1508@item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
d77de738
ML
1509@end multitable
1510
1511@item @emph{Reference}:
506f068e 1512@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.5.
d77de738
ML
1513@end table
1514
1515
1516
1517@node omp_set_default_device
506f068e 1518@subsection @code{omp_set_default_device} -- Set the default device for target regions
d77de738
ML
1519@table @asis
1520@item @emph{Description}:
Set the default device for target regions that do not specify a @code{device}
clause. The argument shall be a nonnegative device number.
1523
1524@item @emph{C/C++}:
1525@multitable @columnfractions .20 .80
1526@item @emph{Prototype}: @tab @code{void omp_set_default_device(int device_num);}
1527@end multitable
1528
1529@item @emph{Fortran}:
1530@multitable @columnfractions .20 .80
1531@item @emph{Interface}: @tab @code{subroutine omp_set_default_device(device_num)}
1532@item @tab @code{integer device_num}
1533@end multitable
1534
1535@item @emph{See also}:
1536@ref{OMP_DEFAULT_DEVICE}, @ref{omp_get_default_device}
1537
1538@item @emph{Reference}:
1539@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
1540@end table
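
For illustration only (not part of the specification text), the following
C sketch selects device 0 as the default device, if any non-host device is
available, and then runs a @code{target} region without a @code{device}
clause on it:

@smallexample
#include <omp.h>
#include <stdio.h>

int main (void)
@{
  if (omp_get_num_devices () > 0)
    omp_set_default_device (0);  /* Used by target regions without device clause.  */

  int on_host;
  #pragma omp target map(from: on_host)
  on_host = omp_is_initial_device ();

  printf ("target region ran on %s\n", on_host ? "the host" : "a device");
  return 0;
@}
@end smallexample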
1541
1542
1543
506f068e
TB
1544@node omp_get_default_device
1545@subsection @code{omp_get_default_device} -- Get the default device for target regions
d77de738
ML
1546@table @asis
1547@item @emph{Description}:
Get the default device for target regions that do not specify a @code{device} clause.
2cd0689a 1549
d77de738
ML
1550@item @emph{C/C++}:
1551@multitable @columnfractions .20 .80
506f068e 1552@item @emph{Prototype}: @tab @code{int omp_get_default_device(void);}
d77de738
ML
1553@end multitable
1554
1555@item @emph{Fortran}:
1556@multitable @columnfractions .20 .80
506f068e 1557@item @emph{Interface}: @tab @code{integer function omp_get_default_device()}
d77de738
ML
1558@end multitable
1559
1560@item @emph{See also}:
506f068e 1561@ref{OMP_DEFAULT_DEVICE}, @ref{omp_set_default_device}
d77de738
ML
1562
1563@item @emph{Reference}:
506f068e 1564@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.30.
d77de738
ML
1565@end table
1566
1567
1568
506f068e
TB
1569@node omp_get_num_devices
1570@subsection @code{omp_get_num_devices} -- Number of target devices
d77de738
ML
1571@table @asis
1572@item @emph{Description}:
506f068e 1573Returns the number of target devices.
d77de738
ML
1574
1575@item @emph{C/C++}:
1576@multitable @columnfractions .20 .80
506f068e 1577@item @emph{Prototype}: @tab @code{int omp_get_num_devices(void);}
d77de738
ML
1578@end multitable
1579
1580@item @emph{Fortran}:
1581@multitable @columnfractions .20 .80
506f068e 1582@item @emph{Interface}: @tab @code{integer function omp_get_num_devices()}
d77de738
ML
1583@end multitable
1584
d77de738 1585@item @emph{Reference}:
506f068e 1586@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.31.
d77de738
ML
1587@end table
1588
1589
1590
506f068e
TB
1591@node omp_get_device_num
1592@subsection @code{omp_get_device_num} -- Return device number of current device
d77de738
ML
1593@table @asis
1594@item @emph{Description}:
506f068e
TB
1595This function returns a device number that represents the device that the
1596current thread is executing on. For OpenMP 5.0, this must be equal to the
1597value returned by the @code{omp_get_initial_device} function when called
1598from the host.
d77de738 1599
506f068e 1600@item @emph{C/C++}
d77de738 1601@multitable @columnfractions .20 .80
506f068e 1602@item @emph{Prototype}: @tab @code{int omp_get_device_num(void);}
d77de738
ML
1603@end multitable
1604
1605@item @emph{Fortran}:
506f068e
TB
1606@multitable @columnfractions .20 .80
1607@item @emph{Interface}: @tab @code{integer function omp_get_device_num()}
d77de738
ML
1608@end multitable
1609
1610@item @emph{See also}:
506f068e 1611@ref{omp_get_initial_device}
d77de738
ML
1612
1613@item @emph{Reference}:
506f068e 1614@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.37.
d77de738
ML
1615@end table
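
For illustration (not part of the specification text), the following C sketch
prints the device number on the host and inside a @code{target} region; with
offloading configured, the two values differ, otherwise both equal the value
returned by @code{omp_get_initial_device}:

@smallexample
#include <omp.h>
#include <stdio.h>

int main (void)
@{
  printf ("host runs as device %d\n", omp_get_device_num ());

  int dev;
  #pragma omp target map(from: dev)
  dev = omp_get_device_num ();  /* Device executing the target region.  */

  printf ("target region executed on device %d\n", dev);
  return 0;
@}
@end smallexample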
1616
1617
1618
506f068e
TB
1619@node omp_is_initial_device
1620@subsection @code{omp_is_initial_device} -- Whether executing on the host device
d77de738
ML
1621@table @asis
1622@item @emph{Description}:
506f068e
TB
1623This function returns @code{true} if currently running on the host device,
1624@code{false} otherwise. Here, @code{true} and @code{false} represent
1625their language-specific counterparts.
d77de738 1626
506f068e 1627@item @emph{C/C++}:
d77de738 1628@multitable @columnfractions .20 .80
506f068e 1629@item @emph{Prototype}: @tab @code{int omp_is_initial_device(void);}
d77de738
ML
1630@end multitable
1631
1632@item @emph{Fortran}:
1633@multitable @columnfractions .20 .80
506f068e 1634@item @emph{Interface}: @tab @code{logical function omp_is_initial_device()}
d77de738
ML
1635@end multitable
1636
d77de738 1637@item @emph{Reference}:
506f068e 1638@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.34.
d77de738
ML
1639@end table
1640
1641
1642
506f068e
TB
1643@node omp_get_initial_device
1644@subsection @code{omp_get_initial_device} -- Return device number of initial device
d77de738
ML
1645@table @asis
1646@item @emph{Description}:
506f068e
TB
1647This function returns a device number that represents the host device.
1648For OpenMP 5.1, this must be equal to the value returned by the
1649@code{omp_get_num_devices} function.
d77de738 1650
506f068e 1651@item @emph{C/C++}
d77de738 1652@multitable @columnfractions .20 .80
506f068e 1653@item @emph{Prototype}: @tab @code{int omp_get_initial_device(void);}
d77de738
ML
1654@end multitable
1655
1656@item @emph{Fortran}:
1657@multitable @columnfractions .20 .80
506f068e 1658@item @emph{Interface}: @tab @code{integer function omp_get_initial_device()}
d77de738
ML
1659@end multitable
1660
1661@item @emph{See also}:
506f068e 1662@ref{omp_get_num_devices}
d77de738
ML
1663
1664@item @emph{Reference}:
506f068e 1665@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.35.
d77de738
ML
1666@end table
1667
1668
1669
e0786ba6
TB
1670@node Device Memory Routines
1671@section Device Memory Routines
1672
1673Routines related to memory allocation and managing corresponding
1674pointers on devices. They have C linkage and do not throw exceptions.
1675
1676@menu
1677* omp_target_alloc:: Allocate device memory
1678* omp_target_free:: Free device memory
1679* omp_target_is_present:: Check whether storage is mapped
506f068e
TB
1680@c * omp_target_is_accessible:: <fixme>
1681@c * omp_target_memcpy:: <fixme>
1682@c * omp_target_memcpy_rect:: <fixme>
1683@c * omp_target_memcpy_async:: <fixme>
1684@c * omp_target_memcpy_rect_async:: <fixme>
e0786ba6
TB
1685@c * omp_target_memset:: <fixme>/TR12
1686@c * omp_target_memset_async:: <fixme>/TR12
1687* omp_target_associate_ptr:: Associate a device pointer with a host pointer
1688* omp_target_disassociate_ptr:: Remove device--host pointer association
1689* omp_get_mapped_ptr:: Return device pointer to a host pointer
1690@end menu
1691
1692
1693
1694@node omp_target_alloc
1695@subsection @code{omp_target_alloc} -- Allocate device memory
1696@table @asis
1697@item @emph{Description}:
1698This routine allocates @var{size} bytes of memory in the device environment
1699associated with the device number @var{device_num}. If successful, a device
1700pointer is returned, otherwise a null pointer.
1701
1702In GCC, when the device is the host or the device shares memory with the host,
1703the memory is allocated on the host; in that case, when @var{size} is zero,
1704either NULL or a unique pointer value that can later be successfully passed to
1705@code{omp_target_free} is returned. When the allocation is not performed on
1706the host, a null pointer is returned when @var{size} is zero; in that case,
1707additionally a diagnostic might be printed to standard error (stderr).
1708
1709Running this routine in a @code{target} region except on the initial device
1710is not supported.
1711
1712@item @emph{C/C++}
1713@multitable @columnfractions .20 .80
1714@item @emph{Prototype}: @tab @code{void *omp_target_alloc(size_t size, int device_num)}
1715@end multitable
1716
1717@item @emph{Fortran}:
1718@multitable @columnfractions .20 .80
1719@item @emph{Interface}: @tab @code{type(c_ptr) function omp_target_alloc(size, device_num) bind(C)}
1720@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int, c_size_t}
1721@item @tab @code{integer(c_size_t), value :: size}
1722@item @tab @code{integer(c_int), value :: device_num}
1723@end multitable
1724
1725@item @emph{See also}:
1726@ref{omp_target_free}, @ref{omp_target_associate_ptr}
1727
1728@item @emph{Reference}:
1729@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.1
1730@end table
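
For illustration (not part of the specification text), the following C sketch
allocates device memory from the host, uses it in a @code{target} region via
@code{is_device_ptr}, and frees it again; it assumes compilation with
@option{-fopenmp} and works with or without an offload device:

@smallexample
#include <omp.h>
#include <stddef.h>

int main (void)
@{
  const int dev = omp_get_default_device ();
  const size_t n = 1024;
  int *p = (int *) omp_target_alloc (n * sizeof (int), dev);
  if (p == NULL)
    return 1;

  #pragma omp target device(dev) is_device_ptr(p)
  for (size_t i = 0; i < n; i++)
    p[i] = i;                 /* Runs on the device owning 'p'.  */

  omp_target_free (p, dev);
  return 0;
@}
@end smallexample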
1731
1732
1733
1734@node omp_target_free
1735@subsection @code{omp_target_free} -- Free device memory
1736@table @asis
1737@item @emph{Description}:
1738This routine frees memory allocated by the @code{omp_target_alloc} routine.
1739The @var{device_ptr} argument must be either a null pointer or a device pointer
1740returned by @code{omp_target_alloc} for the specified @code{device_num}. The
1741device number @var{device_num} must be a conforming device number.
1742
1743Running this routine in a @code{target} region except on the initial device
1744is not supported.
1745
1746@item @emph{C/C++}
1747@multitable @columnfractions .20 .80
1748@item @emph{Prototype}: @tab @code{void omp_target_free(void *device_ptr, int device_num)}
1749@end multitable
1750
1751@item @emph{Fortran}:
1752@multitable @columnfractions .20 .80
1753@item @emph{Interface}: @tab @code{subroutine omp_target_free(device_ptr, device_num) bind(C)}
1754@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int}
1755@item @tab @code{type(c_ptr), value :: device_ptr}
1756@item @tab @code{integer(c_int), value :: device_num}
1757@end multitable
1758
1759@item @emph{See also}:
1760@ref{omp_target_alloc}, @ref{omp_target_disassociate_ptr}
1761
1762@item @emph{Reference}:
1763@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.2
1764@end table
1765
1766
1767
1768@node omp_target_is_present
1769@subsection @code{omp_target_is_present} -- Check whether storage is mapped
1770@table @asis
1771@item @emph{Description}:
1772This routine tests whether storage, identified by the host pointer @var{ptr}
1773is mapped to the device specified by @var{device_num}. If so, it returns
1774@emph{true} and otherwise @emph{false}.
1775
In GCC, this includes self mapping such that @code{omp_target_is_present}
returns @emph{true} when @var{device_num} specifies the host or when the host
and the device share memory. If @var{ptr} is a null pointer, @emph{true} is
returned and if @var{device_num} is an invalid device number, @emph{false} is
returned.
1781
1782If those conditions do not apply, @emph{true} is returned if the association has
1783been established by an explicit or implicit @code{map} clause, the
1784@code{declare target} directive or a call to the @code{omp_target_associate_ptr}
1785routine.
1786
1787Running this routine in a @code{target} region except on the initial device
1788is not supported.
1789
1790@item @emph{C/C++}
1791@multitable @columnfractions .20 .80
1792@item @emph{Prototype}: @tab @code{int omp_target_is_present(const void *ptr,}
1793@item @tab @code{ int device_num)}
1794@end multitable
1795
1796@item @emph{Fortran}:
1797@multitable @columnfractions .20 .80
1798@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_is_present(ptr, &}
1799@item @tab @code{ device_num) bind(C)}
1800@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int}
1801@item @tab @code{type(c_ptr), value :: ptr}
1802@item @tab @code{integer(c_int), value :: device_num}
1803@end multitable
1804
1805@item @emph{See also}:
1806@ref{omp_target_associate_ptr}
1807
1808@item @emph{Reference}:
1809@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.3
1810@end table
1811
1812
1813
1814@node omp_target_associate_ptr
1815@subsection @code{omp_target_associate_ptr} -- Associate a device pointer with a host pointer
1816@table @asis
1817@item @emph{Description}:
1818This routine associates storage on the host with storage on a device identified
1819by @var{device_num}. The device pointer is usually obtained by calling
1820@code{omp_target_alloc} or by other means (but not by using the @code{map}
1821clauses or the @code{declare target} directive). The host pointer should point
1822to memory that has a storage size of at least @var{size}.
1823
1824The @var{device_offset} parameter specifies the offset into @var{device_ptr}
1825that is used as the base address for the device side of the mapping; the
1826storage size should be at least @var{device_offset} plus @var{size}.
1827
1828After the association, the host pointer can be used in a @code{map} clause and
1829in the @code{to} and @code{from} clauses of the @code{target update} directive
1830to transfer data between the associated pointers. The reference count of such
1831associated storage is infinite. The association can be removed by calling
@code{omp_target_disassociate_ptr}, which should be done before the lifetime
of either storage ends.
1834
The routine returns nonzero (@code{EINVAL}) when @var{device_num} is invalid or
when it denotes the initial device or a device that shares memory with the
host. @code{omp_target_associate_ptr} returns zero if @var{host_ptr} points
into storage that is already associated and lies fully inside a previously
associated memory region. Otherwise, zero is returned if the association was
successful; if none of the cases above apply, nonzero (@code{EINVAL}) is returned.
1841
1842The @code{omp_target_is_present} routine can be used to test whether
1843associated storage for a device pointer exists.
1844
1845Running this routine in a @code{target} region except on the initial device
1846is not supported.
1847
1848@item @emph{C/C++}
1849@multitable @columnfractions .20 .80
1850@item @emph{Prototype}: @tab @code{int omp_target_associate_ptr(const void *host_ptr,}
1851@item @tab @code{ const void *device_ptr,}
1852@item @tab @code{ size_t size,}
1853@item @tab @code{ size_t device_offset,}
1854@item @tab @code{ int device_num)}
1855@end multitable
1856
1857@item @emph{Fortran}:
1858@multitable @columnfractions .20 .80
1859@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_associate_ptr(host_ptr, &}
1860@item @tab @code{ device_ptr, size, device_offset, device_num) bind(C)}
1861@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int, c_size_t}
1862@item @tab @code{type(c_ptr), value :: host_ptr, device_ptr}
1863@item @tab @code{integer(c_size_t), value :: size, device_offset}
1864@item @tab @code{integer(c_int), value :: device_num}
1865@end multitable
1866
1867@item @emph{See also}:
1868@ref{omp_target_disassociate_ptr}, @ref{omp_target_is_present},
1869@ref{omp_target_alloc}
1870
1871@item @emph{Reference}:
1872@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.9
1873@end table
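
For illustration (not part of the specification text), the following C sketch
associates a host buffer with separately allocated device memory; it assumes
an offload device that does not share memory with the host (otherwise the
association request returns @code{EINVAL}, as described above):

@smallexample
#include <omp.h>
#include <stdlib.h>

int main (void)
@{
  const int dev = omp_get_default_device ();
  const size_t n = 256;
  double *host_p = (double *) malloc (n * sizeof (double));
  void *dev_p = omp_target_alloc (n * sizeof (double), dev);
  if (host_p == NULL || dev_p == NULL
      || omp_target_associate_ptr (host_p, dev_p, n * sizeof (double), 0, dev))
    return 1;

  for (size_t i = 0; i < n; i++)
    host_p[i] = i;

  /* The host pointer can now be used in map clauses and target update
     directives; the data is copied into the associated device storage.  */
  #pragma omp target update to(host_p[:n]) device(dev)

  omp_target_disassociate_ptr (host_p, dev);
  omp_target_free (dev_p, dev);
  free (host_p);
  return 0;
@}
@end smallexample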
1874
1875
1876
1877@node omp_target_disassociate_ptr
1878@subsection @code{omp_target_disassociate_ptr} -- Remove device--host pointer association
1879@table @asis
1880@item @emph{Description}:
1881This routine removes the storage association established by calling
1882@code{omp_target_associate_ptr} and sets the reference count to zero,
even if @code{omp_target_associate_ptr} was invoked multiple times for
the host pointer @var{ptr}. If applicable, the device memory needs
1885to be freed by the user.
1886
1887If an associated device storage location for the @var{device_num} was
1888found and has infinite reference count, the association is removed and
1889zero is returned. In all other cases, nonzero (@code{EINVAL}) is returned
1890and no other action is taken.
1891
1892Note that passing a host pointer where the association to the device pointer
1893was established with the @code{declare target} directive yields undefined
1894behavior.
1895
1896Running this routine in a @code{target} region except on the initial device
1897is not supported.
1898
1899@item @emph{C/C++}
1900@multitable @columnfractions .20 .80
1901@item @emph{Prototype}: @tab @code{int omp_target_disassociate_ptr(const void *ptr,}
1902@item @tab @code{ int device_num)}
1903@end multitable
1904
1905@item @emph{Fortran}:
1906@multitable @columnfractions .20 .80
1907@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_disassociate_ptr(ptr, &}
1908@item @tab @code{ device_num) bind(C)}
1909@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int}
1910@item @tab @code{type(c_ptr), value :: ptr}
1911@item @tab @code{integer(c_int), value :: device_num}
1912@end multitable
1913
1914@item @emph{See also}:
1915@ref{omp_target_associate_ptr}
1916
1917@item @emph{Reference}:
1918@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.10
1919@end table
1920
1921
1922
1923@node omp_get_mapped_ptr
1924@subsection @code{omp_get_mapped_ptr} -- Return device pointer to a host pointer
1925@table @asis
1926@item @emph{Description}:
If the device number refers to the initial device or to a device with
memory accessible from the host (shared memory), the @code{omp_get_mapped_ptr}
routine returns the value of the passed @var{ptr}. Otherwise, if associated
storage for the passed host pointer @var{ptr} exists on the device associated
with @var{device_num}, it returns that pointer. In all other cases and in cases of
1932an error, a null pointer is returned.
1933
1934The association of storage location is established either via an explicit or
1935implicit @code{map} clause, the @code{declare target} directive or the
1936@code{omp_target_associate_ptr} routine.
1937
1938Running this routine in a @code{target} region except on the initial device
1939is not supported.
1940
1941@item @emph{C/C++}
1942@multitable @columnfractions .20 .80
1943@item @emph{Prototype}: @tab @code{void *omp_get_mapped_ptr(const void *ptr, int device_num);}
1944@end multitable
1945
1946@item @emph{Fortran}:
1947@multitable @columnfractions .20 .80
1948@item @emph{Interface}: @tab @code{type(c_ptr) function omp_get_mapped_ptr(ptr, device_num) bind(C)}
1949@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int}
1950@item @tab @code{type(c_ptr), value :: ptr}
1951@item @tab @code{integer(c_int), value :: device_num}
1952@end multitable
1953
1954@item @emph{See also}:
1955@ref{omp_target_associate_ptr}
1956
1957@item @emph{Reference}:
1958@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.11
1959@end table
1960
1961
506f068e
TB
1962
1963@node Lock Routines
1964@section Lock Routines
1965
1966Initialize, set, test, unset and destroy simple and nested locks.
1967The routines have C linkage and do not throw exceptions.
1968
1969@menu
1970* omp_init_lock:: Initialize simple lock
1971* omp_init_nest_lock:: Initialize nested lock
1972@c * omp_init_lock_with_hint:: <fixme>
1973@c * omp_init_nest_lock_with_hint:: <fixme>
1974* omp_destroy_lock:: Destroy simple lock
1975* omp_destroy_nest_lock:: Destroy nested lock
1976* omp_set_lock:: Wait for and set simple lock
* omp_set_nest_lock:: Wait for and set nested lock
1978* omp_unset_lock:: Unset simple lock
1979* omp_unset_nest_lock:: Unset nested lock
1980* omp_test_lock:: Test and set simple lock if available
1981* omp_test_nest_lock:: Test and set nested lock if available
1982@end menu
1983
1984
1985
d77de738 1986@node omp_init_lock
506f068e 1987@subsection @code{omp_init_lock} -- Initialize simple lock
d77de738
ML
1988@table @asis
1989@item @emph{Description}:
1990Initialize a simple lock. After initialization, the lock is in
1991an unlocked state.
1992
1993@item @emph{C/C++}:
1994@multitable @columnfractions .20 .80
1995@item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
1996@end multitable
1997
1998@item @emph{Fortran}:
1999@multitable @columnfractions .20 .80
2000@item @emph{Interface}: @tab @code{subroutine omp_init_lock(svar)}
2001@item @tab @code{integer(omp_lock_kind), intent(out) :: svar}
2002@end multitable
2003
2004@item @emph{See also}:
2005@ref{omp_destroy_lock}
2006
2007@item @emph{Reference}:
2008@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
2009@end table
2010
2011
2012
506f068e
TB
2013@node omp_init_nest_lock
2014@subsection @code{omp_init_nest_lock} -- Initialize nested lock
d77de738
ML
2015@table @asis
2016@item @emph{Description}:
506f068e
TB
2017Initialize a nested lock. After initialization, the lock is in
2018an unlocked state and the nesting count is set to zero.
d77de738
ML
2019
2020@item @emph{C/C++}:
2021@multitable @columnfractions .20 .80
506f068e 2022@item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
d77de738
ML
2023@end multitable
2024
2025@item @emph{Fortran}:
2026@multitable @columnfractions .20 .80
506f068e
TB
2027@item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(nvar)}
2028@item @tab @code{integer(omp_nest_lock_kind), intent(out) :: nvar}
d77de738
ML
2029@end multitable
2030
2031@item @emph{See also}:
506f068e 2032@ref{omp_destroy_nest_lock}
d77de738 2033
506f068e
TB
2034@item @emph{Reference}:
2035@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
d77de738
ML
2036@end table
2037
2038
2039
506f068e
TB
2040@node omp_destroy_lock
2041@subsection @code{omp_destroy_lock} -- Destroy simple lock
d77de738
ML
2042@table @asis
2043@item @emph{Description}:
506f068e
TB
2044Destroy a simple lock. In order to be destroyed, a simple lock must be
2045in the unlocked state.
d77de738
ML
2046
2047@item @emph{C/C++}:
2048@multitable @columnfractions .20 .80
506f068e 2049@item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);}
d77de738
ML
2050@end multitable
2051
2052@item @emph{Fortran}:
2053@multitable @columnfractions .20 .80
506f068e 2054@item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(svar)}
d77de738
ML
2055@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
2056@end multitable
2057
2058@item @emph{See also}:
506f068e 2059@ref{omp_init_lock}
d77de738
ML
2060
2061@item @emph{Reference}:
506f068e 2062@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
d77de738
ML
2063@end table
2064
2065
2066
506f068e
TB
2067@node omp_destroy_nest_lock
2068@subsection @code{omp_destroy_nest_lock} -- Destroy nested lock
d77de738
ML
2069@table @asis
2070@item @emph{Description}:
506f068e
TB
2071Destroy a nested lock. In order to be destroyed, a nested lock must be
2072in the unlocked state and its nesting count must equal zero.
d77de738
ML
2073
2074@item @emph{C/C++}:
2075@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *lock);}
d77de738
ML
2077@end multitable
2078
2079@item @emph{Fortran}:
2080@multitable @columnfractions .20 .80
506f068e
TB
2081@item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(nvar)}
2082@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
d77de738
ML
2083@end multitable
2084
2085@item @emph{See also}:
@ref{omp_init_nest_lock}
d77de738
ML
2087
2088@item @emph{Reference}:
506f068e 2089@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
d77de738
ML
2090@end table
2091
2092
2093
506f068e
TB
2094@node omp_set_lock
2095@subsection @code{omp_set_lock} -- Wait for and set simple lock
d77de738
ML
2096@table @asis
2097@item @emph{Description}:
506f068e
TB
2098Before setting a simple lock, the lock variable must be initialized by
2099@code{omp_init_lock}. The calling thread is blocked until the lock
2100is available. If the lock is already held by the current thread,
2101a deadlock occurs.
d77de738
ML
2102
2103@item @emph{C/C++}:
2104@multitable @columnfractions .20 .80
506f068e 2105@item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
d77de738
ML
2106@end multitable
2107
2108@item @emph{Fortran}:
2109@multitable @columnfractions .20 .80
506f068e 2110@item @emph{Interface}: @tab @code{subroutine omp_set_lock(svar)}
d77de738
ML
2111@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
2112@end multitable
2113
2114@item @emph{See also}:
506f068e 2115@ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
d77de738
ML
2116
2117@item @emph{Reference}:
506f068e 2118@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
d77de738
ML
2119@end table
2120
2121
2122
d77de738 2123@node omp_set_nest_lock
506f068e 2124@subsection @code{omp_set_nest_lock} -- Wait for and set nested lock
d77de738
ML
2125@table @asis
2126@item @emph{Description}:
2127Before setting a nested lock, the lock variable must be initialized by
2128@code{omp_init_nest_lock}. The calling thread is blocked until the lock
2129is available. If the lock is already held by the current thread, the
2130nesting count for the lock is incremented.
2131
2132@item @emph{C/C++}:
2133@multitable @columnfractions .20 .80
2134@item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
2135@end multitable
2136
2137@item @emph{Fortran}:
2138@multitable @columnfractions .20 .80
2139@item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(nvar)}
2140@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
2141@end multitable
2142
2143@item @emph{See also}:
2144@ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
2145
2146@item @emph{Reference}:
2147@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
2148@end table
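
For illustration (not part of the specification text), the following C sketch
shows why a nestable lock is needed when a routine that takes the lock is also
called while the same thread already holds the lock:

@smallexample
#include <omp.h>

static omp_nest_lock_t lock;

static void helper (int *counter)
@{
  omp_set_nest_lock (&lock);    /* Nesting count becomes 2 when called below.  */
  ++*counter;
  omp_unset_nest_lock (&lock);
@}

int main (void)
@{
  int counter = 0;
  omp_init_nest_lock (&lock);

  #pragma omp parallel num_threads (4)
  @{
    omp_set_nest_lock (&lock);   /* Nesting count 1 for this thread.  */
    helper (&counter);           /* Re-acquires the lock it already holds.  */
    omp_unset_nest_lock (&lock); /* Count drops back to 0; lock is released.  */
  @}

  omp_destroy_nest_lock (&lock);
  return counter == 4 ? 0 : 1;
@}
@end smallexample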
2149
2150
2151
506f068e
TB
2152@node omp_unset_lock
2153@subsection @code{omp_unset_lock} -- Unset simple lock
d77de738
ML
2154@table @asis
2155@item @emph{Description}:
506f068e
TB
A simple lock about to be unset must have been locked by @code{omp_set_lock}
or @code{omp_test_lock} before. In addition, the lock must be held by the
thread calling @code{omp_unset_lock}. Then, the lock becomes unlocked. If one
or more threads attempted to set the lock before, one of them is chosen to
acquire the lock.
d77de738
ML
2161
2162@item @emph{C/C++}:
2163@multitable @columnfractions .20 .80
506f068e 2164@item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
d77de738
ML
2165@end multitable
2166
2167@item @emph{Fortran}:
2168@multitable @columnfractions .20 .80
506f068e
TB
2169@item @emph{Interface}: @tab @code{subroutine omp_unset_lock(svar)}
2170@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
d77de738
ML
2171@end multitable
2172
d77de738 2173@item @emph{See also}:
506f068e 2174@ref{omp_set_lock}, @ref{omp_test_lock}
d77de738
ML
2175
2176@item @emph{Reference}:
506f068e 2177@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
d77de738
ML
2178@end table
2179
2180
2181
2182@node omp_unset_nest_lock
506f068e 2183@subsection @code{omp_unset_nest_lock} -- Unset nested lock
d77de738
ML
2184@table @asis
2185@item @emph{Description}:
A nested lock about to be unset must have been locked by @code{omp_set_nest_lock}
or @code{omp_test_nest_lock} before. In addition, the lock must be held by the
thread calling @code{omp_unset_nest_lock}. If the nesting count drops to zero, the
lock becomes unlocked. If one or more threads attempted to set the lock before,
one of them is chosen to acquire the lock.
2191
2192@item @emph{C/C++}:
2193@multitable @columnfractions .20 .80
2194@item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
2195@end multitable
2196
2197@item @emph{Fortran}:
2198@multitable @columnfractions .20 .80
2199@item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(nvar)}
2200@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
2201@end multitable
2202
2203@item @emph{See also}:
2204@ref{omp_set_nest_lock}
2205
2206@item @emph{Reference}:
2207@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
2208@end table
2209
2210
2211
506f068e
TB
2212@node omp_test_lock
2213@subsection @code{omp_test_lock} -- Test and set simple lock if available
d77de738
ML
2214@table @asis
2215@item @emph{Description}:
506f068e
TB
2216Before setting a simple lock, the lock variable must be initialized by
2217@code{omp_init_lock}. Contrary to @code{omp_set_lock}, @code{omp_test_lock}
2218does not block if the lock is not available. This function returns
2219@code{true} upon success, @code{false} otherwise. Here, @code{true} and
2220@code{false} represent their language-specific counterparts.
d77de738
ML
2221
2222@item @emph{C/C++}:
2223@multitable @columnfractions .20 .80
506f068e 2224@item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
d77de738
ML
2225@end multitable
2226
2227@item @emph{Fortran}:
2228@multitable @columnfractions .20 .80
506f068e
TB
2229@item @emph{Interface}: @tab @code{logical function omp_test_lock(svar)}
2230@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
2231@end multitable
2232
2233@item @emph{See also}:
@ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock}
2235
2236@item @emph{Reference}:
2237@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
2238@end table
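
For illustration (not part of the specification text), the following C sketch
contrasts the non-blocking @code{omp_test_lock} with the blocking
@code{omp_set_lock}; every thread ends up performing its update exactly once:

@smallexample
#include <omp.h>
#include <stdio.h>

int main (void)
@{
  omp_lock_t lock;
  int updates = 0;

  omp_init_lock (&lock);
  #pragma omp parallel num_threads (4)
  @{
    if (omp_test_lock (&lock))   /* Acquire the lock only if it is free.  */
      @{
        updates++;               /* Protected update.  */
        omp_unset_lock (&lock);
      @}
    else
      @{
        omp_set_lock (&lock);    /* Otherwise block until the lock is free.  */
        updates++;
        omp_unset_lock (&lock);
      @}
  @}
  omp_destroy_lock (&lock);
  printf ("%d updates\n", updates);  /* Prints: 4 updates  */
  return 0;
@}
@end smallexample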
2239
2240
2241
2242@node omp_test_nest_lock
2243@subsection @code{omp_test_nest_lock} -- Test and set nested lock if available
2244@table @asis
2245@item @emph{Description}:
2246Before setting a nested lock, the lock variable must be initialized by
2247@code{omp_init_nest_lock}. Contrary to @code{omp_set_nest_lock},
2248@code{omp_test_nest_lock} does not block if the lock is not available.
If the lock can be acquired, that is, if it is available or already held by the
calling thread, the lock is set, its nesting count is incremented, and the new
nesting count is returned. Otherwise, the return value equals zero.
2251
2252@item @emph{C/C++}:
2253@multitable @columnfractions .20 .80
2254@item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
2255@end multitable
2256
2257@item @emph{Fortran}:
2258@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_test_nest_lock(nvar)}
d77de738
ML
2260@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
2261@end multitable
2262
506f068e 2263
d77de738 2264@item @emph{See also}:
@ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}, @ref{omp_unset_nest_lock}
d77de738
ML
2266
2267@item @emph{Reference}:
506f068e 2268@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
d77de738
ML
2269@end table
2270
2271
2272
506f068e
TB
2273@node Timing Routines
2274@section Timing Routines
2275
2276Portable, thread-based, wall clock timer.
2277The routines have C linkage and do not throw exceptions.
2278
2279@menu
2280* omp_get_wtick:: Get timer precision.
2281* omp_get_wtime:: Elapsed wall clock time.
2282@end menu
2283
2284
2285
d77de738 2286@node omp_get_wtick
506f068e 2287@subsection @code{omp_get_wtick} -- Get timer precision
d77de738
ML
2288@table @asis
2289@item @emph{Description}:
2290Gets the timer precision, i.e., the number of seconds between two
2291successive clock ticks.
2292
2293@item @emph{C/C++}:
2294@multitable @columnfractions .20 .80
2295@item @emph{Prototype}: @tab @code{double omp_get_wtick(void);}
2296@end multitable
2297
2298@item @emph{Fortran}:
2299@multitable @columnfractions .20 .80
2300@item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
2301@end multitable
2302
2303@item @emph{See also}:
2304@ref{omp_get_wtime}
2305
2306@item @emph{Reference}:
2307@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.2.
2308@end table
2309
2310
2311
2312@node omp_get_wtime
506f068e 2313@subsection @code{omp_get_wtime} -- Elapsed wall clock time
d77de738
ML
2314@table @asis
2315@item @emph{Description}:
Elapsed wall clock time in seconds. The time is measured per thread; no
guarantee can be made that two distinct threads measure the same time.
Time is measured from some ``time in the past'', which is an arbitrary time
guaranteed not to change during the execution of the program.
2320
2321@item @emph{C/C++}:
2322@multitable @columnfractions .20 .80
2323@item @emph{Prototype}: @tab @code{double omp_get_wtime(void);}
2324@end multitable
2325
2326@item @emph{Fortran}:
2327@multitable @columnfractions .20 .80
2328@item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
2329@end multitable
2330
2331@item @emph{See also}:
2332@ref{omp_get_wtick}
2333
2334@item @emph{Reference}:
2335@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.1.
2336@end table
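
For illustration (not part of the specification text), the following C sketch
times a parallel loop; both @code{omp_get_wtime} calls are made by the same
(initial) thread, as values obtained by different threads are not guaranteed
to be comparable:

@smallexample
#include <omp.h>
#include <stdio.h>

int main (void)
@{
  double start = omp_get_wtime ();

  double sum = 0.0;
  #pragma omp parallel for reduction(+:sum)
  for (int i = 0; i < 100000000; i++)
    sum += 1.0 / (i + 1.0);

  double elapsed = omp_get_wtime () - start;
  printf ("sum = %f, elapsed = %f s (tick = %g s)\n",
          sum, elapsed, omp_get_wtick ());
  return 0;
@}
@end smallexample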
2337
2338
2339
506f068e
TB
2340@node Event Routine
2341@section Event Routine
2342
2343Support for event objects.
The routine has C linkage and does not throw exceptions.
2345
2346@menu
2347* omp_fulfill_event:: Fulfill and destroy an OpenMP event.
2348@end menu
2349
2350
2351
d77de738 2352@node omp_fulfill_event
506f068e 2353@subsection @code{omp_fulfill_event} -- Fulfill and destroy an OpenMP event
d77de738
ML
2354@table @asis
2355@item @emph{Description}:
Fulfill the event associated with the event handle argument. Currently, it
is only used to fulfill events generated by @code{detach} clauses on task
constructs; the effect of fulfilling the event is to allow the task to
complete.
2360
2361The result of calling @code{omp_fulfill_event} with an event handle other
2362than that generated by a detach clause is undefined. Calling it with an
2363event handle that has already been fulfilled is also undefined.
2364
2365@item @emph{C/C++}:
2366@multitable @columnfractions .20 .80
2367@item @emph{Prototype}: @tab @code{void omp_fulfill_event(omp_event_handle_t event);}
2368@end multitable
2369
2370@item @emph{Fortran}:
2371@multitable @columnfractions .20 .80
2372@item @emph{Interface}: @tab @code{subroutine omp_fulfill_event(event)}
2373@item @tab @code{integer (kind=omp_event_handle_kind) :: event}
2374@end multitable
2375
2376@item @emph{Reference}:
2377@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.5.1.
2378@end table
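
For illustration (not part of the specification text), the following C sketch
shows the basic pattern; in real code the event handle is typically handed to
an asynchronous operation whose completion callback calls
@code{omp_fulfill_event}:

@smallexample
#include <omp.h>
#include <stdio.h>

int main (void)
@{
  #pragma omp parallel
  #pragma omp single
  @{
    omp_event_handle_t ev;

    /* The detach clause creates an allow-completion event and stores its
       handle in 'ev' when the task is generated.  */
    #pragma omp task detach (ev)
    printf ("task body has finished\n");

    /* The task only completes once its event has been fulfilled.  */
    omp_fulfill_event (ev);

    #pragma omp taskwait
  @}
  return 0;
@}
@end smallexample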
2379
2380
2381
506f068e
TB
2382@c @node Interoperability Routines
2383@c @section Interoperability Routines
2384@c
2385@c Routines to obtain properties from an @code{omp_interop_t} object.
2386@c They have C linkage and do not throw exceptions.
2387@c
2388@c @menu
2389@c * omp_get_num_interop_properties:: <fixme>
2390@c * omp_get_interop_int:: <fixme>
2391@c * omp_get_interop_ptr:: <fixme>
2392@c * omp_get_interop_str:: <fixme>
2393@c * omp_get_interop_name:: <fixme>
2394@c * omp_get_interop_type_desc:: <fixme>
2395@c * omp_get_interop_rc_desc:: <fixme>
2396@c @end menu
2397
971f119f
TB
2398@node Memory Management Routines
2399@section Memory Management Routines
2400
2401Routines to manage and allocate memory on the current device.
2402They have C linkage and do not throw exceptions.
2403
2404@menu
2405* omp_init_allocator:: Create an allocator
2406* omp_destroy_allocator:: Destroy an allocator
2407* omp_set_default_allocator:: Set the default allocator
2408* omp_get_default_allocator:: Get the default allocator
bc238c40
TB
2409* omp_alloc:: Memory allocation with an allocator
2410* omp_aligned_alloc:: Memory allocation with an allocator and alignment
2411* omp_free:: Freeing memory allocated with OpenMP routines
2412* omp_calloc:: Allocate nullified memory with an allocator
2413* omp_aligned_calloc:: Allocate nullified aligned memory with an allocator
2414* omp_realloc:: Reallocate memory allocated with OpenMP routines
506f068e
TB
2415@c * omp_get_memspace_num_resources:: <fixme>/TR11
2416@c * omp_get_submemspace:: <fixme>/TR11
971f119f
TB
2417@end menu
2418
2419
2420
2421@node omp_init_allocator
2422@subsection @code{omp_init_allocator} -- Create an allocator
2423@table @asis
2424@item @emph{Description}:
2425Create an allocator that uses the specified memory space and has the specified
2426traits; if an allocator that fulfills the requirements cannot be created,
2427@code{omp_null_allocator} is returned.
2428
2429The predefined memory spaces and available traits can be found at
@ref{OMP_ALLOCATOR}, where the trait names have to be prefixed by
2431@code{omp_atk_} (e.g. @code{omp_atk_pinned}) and the named trait values by
2432@code{omp_atv_} (e.g. @code{omp_atv_true}); additionally, @code{omp_atv_default}
2433may be used as trait value to specify that the default value should be used.
2434
2435@item @emph{C/C++}:
2436@multitable @columnfractions .20 .80
2437@item @emph{Prototype}: @tab @code{omp_allocator_handle_t omp_init_allocator(}
2438@item @tab @code{ omp_memspace_handle_t memspace,}
2439@item @tab @code{ int ntraits,}
2440@item @tab @code{ const omp_alloctrait_t traits[]);}
2441@end multitable
2442
2443@item @emph{Fortran}:
2444@multitable @columnfractions .20 .80
2445@item @emph{Interface}: @tab @code{function omp_init_allocator(memspace, ntraits, traits)}
bc238c40
TB
2446@item @tab @code{integer (omp_allocator_handle_kind) :: omp_init_allocator}
2447@item @tab @code{integer (omp_memspace_handle_kind), intent(in) :: memspace}
971f119f
TB
2448@item @tab @code{integer, intent(in) :: ntraits}
2449@item @tab @code{type (omp_alloctrait), intent(in) :: traits(*)}
2450@end multitable
2451
2452@item @emph{See also}:
2453@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_destroy_allocator}
2454
2455@item @emph{Reference}:
2456@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.2
2457@end table
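
For illustration (not part of the specification text), the following C sketch
creates an allocator that hands out 64-byte aligned memory from the default
memory space and falls back to the default allocator if a request cannot be
fulfilled:

@smallexample
#include <omp.h>

int main (void)
@{
  omp_alloctrait_t traits[] = @{
    @{ omp_atk_alignment, 64 @},
    @{ omp_atk_fallback,  omp_atv_default_mem_fb @}
  @};
  omp_allocator_handle_t al
    = omp_init_allocator (omp_default_mem_space, 2, traits);
  if (al == omp_null_allocator)
    return 1;

  double *v = (double *) omp_alloc (1024 * sizeof (double), al);
  /* ... use the 64-byte aligned buffer ...  */
  omp_free (v, al);

  omp_destroy_allocator (al);
  return 0;
@}
@end smallexample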
2458
2459
2460
2461@node omp_destroy_allocator
2462@subsection @code{omp_destroy_allocator} -- Destroy an allocator
2463@table @asis
2464@item @emph{Description}:
2465Releases all resources used by a memory allocator, which must not represent
2466a predefined memory allocator. Accessing memory after its allocator has been
2467destroyed has unspecified behavior. Passing @code{omp_null_allocator} to the
15886c03 2468routine is permitted but has no effect.
971f119f
TB
2469
2470
2471@item @emph{C/C++}:
2472@multitable @columnfractions .20 .80
2473@item @emph{Prototype}: @tab @code{void omp_destroy_allocator (omp_allocator_handle_t allocator);}
2474@end multitable
2475
2476@item @emph{Fortran}:
2477@multitable @columnfractions .20 .80
2478@item @emph{Interface}: @tab @code{subroutine omp_destroy_allocator(allocator)}
bc238c40 2479@item @tab @code{integer (omp_allocator_handle_kind), intent(in) :: allocator}
971f119f
TB
2480@end multitable
2481
2482@item @emph{See also}:
2483@ref{omp_init_allocator}
2484
2485@item @emph{Reference}:
2486@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.3
2487@end table
2488
2489
2490
2491@node omp_set_default_allocator
2492@subsection @code{omp_set_default_allocator} -- Set the default allocator
2493@table @asis
2494@item @emph{Description}:
2495Sets the default allocator that is used when no allocator has been specified
2496in the @code{allocate} or @code{allocator} clause or if an OpenMP memory
2497routine is invoked with the @code{omp_null_allocator} allocator.
2498
2499@item @emph{C/C++}:
2500@multitable @columnfractions .20 .80
2501@item @emph{Prototype}: @tab @code{void omp_set_default_allocator(omp_allocator_handle_t allocator);}
2502@end multitable
2503
2504@item @emph{Fortran}:
2505@multitable @columnfractions .20 .80
2506@item @emph{Interface}: @tab @code{subroutine omp_set_default_allocator(allocator)}
bc238c40 2507@item @tab @code{integer (omp_allocator_handle_kind), intent(in) :: allocator}
971f119f
TB
2508@end multitable
2509
2510@item @emph{See also}:
2511@ref{omp_get_default_allocator}, @ref{omp_init_allocator}, @ref{OMP_ALLOCATOR},
2512@ref{Memory allocation}
2513
2514@item @emph{Reference}:
2515@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.4
2516@end table
2517
2518
2519
2520@node omp_get_default_allocator
2521@subsection @code{omp_get_default_allocator} -- Get the default allocator
2522@table @asis
2523@item @emph{Description}:
2524The routine returns the default allocator that is used when no allocator has
2525been specified in the @code{allocate} or @code{allocator} clause or if an
2526OpenMP memory routine is invoked with the @code{omp_null_allocator} allocator.
2527
2528@item @emph{C/C++}:
2529@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{omp_allocator_handle_t omp_get_default_allocator(void);}
2531@end multitable
2532
2533@item @emph{Fortran}:
2534@multitable @columnfractions .20 .80
2535@item @emph{Interface}: @tab @code{function omp_get_default_allocator()}
bc238c40 2536@item @tab @code{integer (omp_allocator_handle_kind) :: omp_get_default_allocator}
971f119f
TB
2537@end multitable
2538
2539@item @emph{See also}:
2540@ref{omp_set_default_allocator}, @ref{OMP_ALLOCATOR}
2541
2542@item @emph{Reference}:
2543@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.5
2544@end table
2545
2546
506f068e 2547
bc238c40
TB
2548@node omp_alloc
2549@subsection @code{omp_alloc} -- Memory allocation with an allocator
2550@table @asis
2551@item @emph{Description}:
2552Allocate memory with the specified allocator, which can either be a predefined
allocator, an allocator handle or @code{omp_null_allocator}. If the allocator
is @code{omp_null_allocator}, the allocator specified by the
2555@var{def-allocator-var} ICV is used. @var{size} must be a nonnegative number
2556denoting the number of bytes to be allocated; if @var{size} is zero,
2557@code{omp_alloc} will return a null pointer. If successful, a pointer to the
2558allocated memory is returned, otherwise the @code{fallback} trait of the
2559allocator determines the behavior. The content of the allocated memory is
2560unspecified.
2561
2562In @code{target} regions, either the @code{dynamic_allocators} clause must
2563appear on a @code{requires} directive in the same compilation unit -- or the
2564@var{allocator} argument may only be a constant expression with the value of
2565one of the predefined allocators and may not be @code{omp_null_allocator}.
2566
2567Memory allocated by @code{omp_alloc} must be freed using @code{omp_free}.
2568
2569@item @emph{C}:
2570@multitable @columnfractions .20 .80
2571@item @emph{Prototype}: @tab @code{void* omp_alloc(size_t size,}
2572@item @tab @code{ omp_allocator_handle_t allocator)}
2573@end multitable
2574
2575@item @emph{C++}:
2576@multitable @columnfractions .20 .80
2577@item @emph{Prototype}: @tab @code{void* omp_alloc(size_t size,}
2578@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator)}
2579@end multitable
2580
2581@item @emph{Fortran}:
2582@multitable @columnfractions .20 .80
2583@item @emph{Interface}: @tab @code{type(c_ptr) function omp_alloc(size, allocator) bind(C)}
2584@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
2585@item @tab @code{integer (c_size_t), value :: size}
2586@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2587@end multitable
2588
2589@item @emph{See also}:
2590@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2591@ref{omp_free}, @ref{omp_init_allocator}
2592
2593@item @emph{Reference}:
2594@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.6
2595@end table
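
For illustration (not part of the specification text), the following C sketch
allocates memory via a predefined allocator as well as via the default
allocator selected through @code{omp_null_allocator}:

@smallexample
#include <omp.h>
#include <string.h>

int main (void)
@{
  /* Allocation from a predefined allocator ...  */
  int *a = (int *) omp_alloc (100 * sizeof (int), omp_default_mem_alloc);
  /* ... and via the allocator stored in the def-allocator-var ICV.  */
  int *b = (int *) omp_alloc (100 * sizeof (int), omp_null_allocator);
  if (a == NULL || b == NULL)
    return 1;

  memset (a, 0, 100 * sizeof (int));
  memcpy (b, a, 100 * sizeof (int));

  omp_free (a, omp_default_mem_alloc);
  omp_free (b, omp_null_allocator);  /* Allocator is determined automatically.  */
  return 0;
@}
@end smallexample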
2596
2597
2598
2599@node omp_aligned_alloc
2600@subsection @code{omp_aligned_alloc} -- Memory allocation with an allocator and alignment
2601@table @asis
2602@item @emph{Description}:
2603Allocate memory with the specified allocator, which can either be a predefined
allocator, an allocator handle or @code{omp_null_allocator}. If the allocator
is @code{omp_null_allocator}, the allocator specified by the
2606@var{def-allocator-var} ICV is used. @var{alignment} must be a positive power
2607of two and @var{size} must be a nonnegative number that is a multiple of the
2608alignment and denotes the number of bytes to be allocated; if @var{size} is
2609zero, @code{omp_aligned_alloc} will return a null pointer. The alignment will
be at least the maximal value required by the @code{alignment} trait of the
2611allocator and the value of the passed @var{alignment} argument. If successful,
2612a pointer to the allocated memory is returned, otherwise the @code{fallback}
2613trait of the allocator determines the behavior. The content of the allocated
2614memory is unspecified.
2615
2616In @code{target} regions, either the @code{dynamic_allocators} clause must
2617appear on a @code{requires} directive in the same compilation unit -- or the
2618@var{allocator} argument may only be a constant expression with the value of
2619one of the predefined allocators and may not be @code{omp_null_allocator}.
2620
2621Memory allocated by @code{omp_aligned_alloc} must be freed using
2622@code{omp_free}.
2623
2624@item @emph{C}:
2625@multitable @columnfractions .20 .80
2626@item @emph{Prototype}: @tab @code{void* omp_aligned_alloc(size_t alignment,}
2627@item @tab @code{ size_t size,}
2628@item @tab @code{ omp_allocator_handle_t allocator)}
2629@end multitable
2630
2631@item @emph{C++}:
2632@multitable @columnfractions .20 .80
2633@item @emph{Prototype}: @tab @code{void* omp_aligned_alloc(size_t alignment,}
2634@item @tab @code{ size_t size,}
2635@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator)}
2636@end multitable
2637
2638@item @emph{Fortran}:
2639@multitable @columnfractions .20 .80
2640@item @emph{Interface}: @tab @code{type(c_ptr) function omp_aligned_alloc(alignment, size, allocator) bind(C)}
2641@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
2642@item @tab @code{integer (c_size_t), value :: alignment, size}
2643@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2644@end multitable
2645
2646@item @emph{See also}:
2647@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2648@ref{omp_free}, @ref{omp_init_allocator}
2649
2650@item @emph{Reference}:
2651@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.13.6
2652@end table
2653
2654
2655
2656@node omp_free
2657@subsection @code{omp_free} -- Freeing memory allocated with OpenMP routines
2658@table @asis
2659@item @emph{Description}:
2660The @code{omp_free} routine deallocates memory previously allocated by an
2661OpenMP memory-management routine. The @var{ptr} argument must point to such
2662memory or be a null pointer; if it is a null pointer, no operation is
2663performed. If specified, the @var{allocator} argument must be either the
2664memory allocator that was used for the allocation or @code{omp_null_allocator};
2665if it is @code{omp_null_allocator}, the implementation will determine the value
2666automatically.
2667
2668Calling @code{omp_free} invokes undefined behavior if the memory
2669was already deallocated or when the used allocator has already been destroyed.
2670
2671@item @emph{C}:
2672@multitable @columnfractions .20 .80
2673@item @emph{Prototype}: @tab @code{void omp_free(void *ptr,}
2674@item @tab @code{ omp_allocator_handle_t allocator)}
2675@end multitable
2676
2677@item @emph{C++}:
2678@multitable @columnfractions .20 .80
2679@item @emph{Prototype}: @tab @code{void omp_free(void *ptr,}
2680@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator)}
2681@end multitable
2682
2683@item @emph{Fortran}:
2684@multitable @columnfractions .20 .80
2685@item @emph{Interface}: @tab @code{subroutine omp_free(ptr, allocator) bind(C)}
2686@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr}
2687@item @tab @code{type (c_ptr), value :: ptr}
2688@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2689@end multitable
2690
2691@item @emph{See also}:
2692@ref{omp_alloc}, @ref{omp_aligned_alloc}, @ref{omp_calloc},
2693@ref{omp_aligned_calloc}, @ref{omp_realloc}
2694
2695@item @emph{Reference}:
2696@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.7
2697@end table
2698
2699
2700
2701@node omp_calloc
2702@subsection @code{omp_calloc} -- Allocate nullified memory with an allocator
2703@table @asis
2704@item @emph{Description}:
2705Allocate zero-initialized memory with the specified allocator, which can either
2706be a predefined allocator, an allocator handle or @code{omp_null_allocator}. If
the allocator is @code{omp_null_allocator}, the allocator specified by the
@var{def-allocator-var} ICV is used. The memory to be allocated is for an
2709array with @var{nmemb} elements, each having a size of @var{size} bytes. Both
2710@var{nmemb} and @var{size} must be nonnegative numbers; if either of them is
2711zero, @code{omp_calloc} will return a null pointer. If successful, a pointer to
2712the zero-initialized allocated memory is returned, otherwise the @code{fallback}
2713trait of the allocator determines the behavior.
2714
2715In @code{target} regions, either the @code{dynamic_allocators} clause must
2716appear on a @code{requires} directive in the same compilation unit -- or the
2717@var{allocator} argument may only be a constant expression with the value of
2718one of the predefined allocators and may not be @code{omp_null_allocator}.
2719
2720Memory allocated by @code{omp_calloc} must be freed using @code{omp_free}.
2721
2722@item @emph{C}:
2723@multitable @columnfractions .20 .80
2724@item @emph{Prototype}: @tab @code{void* omp_calloc(size_t nmemb, size_t size,}
2725@item @tab @code{ omp_allocator_handle_t allocator)}
2726@end multitable
2727
2728@item @emph{C++}:
2729@multitable @columnfractions .20 .80
2730@item @emph{Prototype}: @tab @code{void* omp_calloc(size_t nmemb, size_t size,}
2731@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator)}
2732@end multitable
2733
2734@item @emph{Fortran}:
2735@multitable @columnfractions .20 .80
2736@item @emph{Interface}: @tab @code{type(c_ptr) function omp_calloc(nmemb, size, allocator) bind(C)}
2737@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
2738@item @tab @code{integer (c_size_t), value :: nmemb, size}
2739@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2740@end multitable
2741
2742@item @emph{See also}:
2743@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2744@ref{omp_free}, @ref{omp_init_allocator}
2745
2746@item @emph{Reference}:
2747@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.13.8
2748@end table
2749
2750
2751
2752@node omp_aligned_calloc
2753@subsection @code{omp_aligned_calloc} -- Allocate aligned nullified memory with an allocator
2754@table @asis
2755@item @emph{Description}:
2756Allocate zero-initialized memory with the specified allocator, which can either
be a predefined allocator, an allocator handle or @code{omp_null_allocator}. If
the allocator is @code{omp_null_allocator}, the allocator specified by the
@var{def-allocator-var} ICV is used. The memory to be allocated is for an
2760array with @var{nmemb} elements, each having a size of @var{size} bytes. Both
2761@var{nmemb} and @var{size} must be nonnegative numbers; if either of them is
2762zero, @code{omp_aligned_calloc} will return a null pointer. @var{alignment}
2763must be a positive power of two and @var{size} must be a multiple of the
alignment; the alignment will be at least the maximum of the value required by
the @code{alignment} trait of the allocator and the value of the passed
@var{alignment} argument. If successful, a pointer to the zero-initialized
2767allocated memory is returned, otherwise the @code{fallback} trait of the
2768allocator determines the behavior.
2769
2770In @code{target} regions, either the @code{dynamic_allocators} clause must
2771appear on a @code{requires} directive in the same compilation unit -- or the
2772@var{allocator} argument may only be a constant expression with the value of
2773one of the predefined allocators and may not be @code{omp_null_allocator}.
2774
2775Memory allocated by @code{omp_aligned_calloc} must be freed using
2776@code{omp_free}.
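For illustration, the following requests 64-byte aligned, zero-initialized
memory (the alignment, element count and element size are arbitrary choices;
the element size is a multiple of the requested alignment):
@smallexample
#include <omp.h>

int
main ()
@{
  /* 16 zero-initialized elements of 64 bytes each, aligned to 64 bytes.  */
  void *v = omp_aligned_calloc (64, 16, 64, omp_null_allocator);
  omp_free (v, omp_null_allocator);
  return 0;
@}
@end smallexample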
2777
2778@item @emph{C}:
2779@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void* omp_aligned_calloc(size_t alignment, size_t nmemb,}
@item @tab @code{  size_t size, omp_allocator_handle_t allocator)}
2782@end multitable
2783
2784@item @emph{C++}:
2785@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void* omp_aligned_calloc(size_t alignment, size_t nmemb,}
@item @tab @code{  size_t size, omp_allocator_handle_t allocator=omp_null_allocator)}
2788@end multitable
2789
2790@item @emph{Fortran}:
2791@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{type(c_ptr) function omp_aligned_calloc(alignment, nmemb, size, allocator) bind(C)}
@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
@item @tab @code{integer (c_size_t), value :: alignment, nmemb, size}
2795@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2796@end multitable
2797
2798@item @emph{See also}:
2799@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2800@ref{omp_free}, @ref{omp_init_allocator}
2801
2802@item @emph{Reference}:
2803@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.13.8
2804@end table
2805
2806
2807
2808@node omp_realloc
2809@subsection @code{omp_realloc} -- Reallocate memory allocated with OpenMP routines
2810@table @asis
2811@item @emph{Description}:
The @code{omp_realloc} routine deallocates the memory to which @var{ptr} points
2813and allocates new memory with the specified @var{allocator} argument; the
2814new memory will have the content of the old memory up to the minimum of the
2815old size and the new @var{size}, otherwise the content of the returned memory
2816is unspecified. If the new allocator is the same as the old one, the routine
2817tries to resize the existing memory allocation, returning the same address as
2818@var{ptr} if successful. @var{ptr} must point to memory allocated by an OpenMP
2819memory-management routine.
2820
2821The @var{allocator} and @var{free_allocator} arguments must be a predefined
2822allocator, an allocator handle or @code{omp_null_allocator}. If
2823@var{free_allocator} is @code{omp_null_allocator}, the implementation
2824automatically determines the allocator used for the allocation of @var{ptr}.
If @var{allocator} is @code{omp_null_allocator} and @var{ptr} is not a
null pointer, the same allocator as @var{free_allocator} is used; if
@var{ptr} is a null pointer, the allocator specified by the
2828@var{def-allocator-var} ICV is used.
2829
2830The @var{size} must be a nonnegative number denoting the number of bytes to be
allocated; if @var{size} is zero, @code{omp_realloc} frees the
memory and returns a null pointer. When @var{size} is nonzero: if successful,
2833a pointer to the allocated memory is returned, otherwise the @code{fallback}
2834trait of the allocator determines the behavior.
2835
2836In @code{target} regions, either the @code{dynamic_allocators} clause must
2837appear on a @code{requires} directive in the same compilation unit -- or the
2838@var{free_allocator} and @var{allocator} arguments may only be a constant
2839expression with the value of one of the predefined allocators and may not be
2840@code{omp_null_allocator}.
2841
2842Memory allocated by @code{omp_realloc} must be freed using @code{omp_free}.
Calling @code{omp_free} invokes undefined behavior if the memory
was already deallocated or if the allocator used for the allocation has
already been destroyed.
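A minimal sketch of growing an allocation (the element counts are arbitrary;
the default allocator is used throughout):
@smallexample
#include <omp.h>

int
main ()
@{
  double *p = omp_alloc (100 * sizeof (double), omp_null_allocator);
  /* Grow to 200 elements; the first 100 elements keep their content.  */
  p = omp_realloc (p, 200 * sizeof (double), omp_null_allocator,
                   omp_null_allocator);
  omp_free (p, omp_null_allocator);
  return 0;
@}
@end smallexample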
2845
2846@item @emph{C}:
2847@multitable @columnfractions .20 .80
2848@item @emph{Prototype}: @tab @code{void* omp_realloc(void *ptr, size_t size,}
2849@item @tab @code{ omp_allocator_handle_t allocator,}
2850@item @tab @code{ omp_allocator_handle_t free_allocator)}
2851@end multitable
2852
2853@item @emph{C++}:
2854@multitable @columnfractions .20 .80
2855@item @emph{Prototype}: @tab @code{void* omp_realloc(void *ptr, size_t size,}
2856@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator,}
2857@item @tab @code{ omp_allocator_handle_t free_allocator=omp_null_allocator)}
2858@end multitable
2859
2860@item @emph{Fortran}:
2861@multitable @columnfractions .20 .80
2862@item @emph{Interface}: @tab @code{type(c_ptr) function omp_realloc(ptr, size, allocator, free_allocator) bind(C)}
2863@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
@item @tab @code{type(c_ptr), value :: ptr}
2865@item @tab @code{integer (c_size_t), value :: size}
2866@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator, free_allocator}
2867@end multitable
2868
2869@item @emph{See also}:
2870@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2871@ref{omp_free}, @ref{omp_init_allocator}
2872
2873@item @emph{Reference}:
2874@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.9
2875@end table
2876
2877
2878
2879@c @node Tool Control Routine
2880@c
2881@c FIXME
2882
2883@c @node Environment Display Routine
2884@c @section Environment Display Routine
2885@c
@c Routine to display the OpenMP version number and the initial values of ICVs.
@c It has C linkage and does not throw exceptions.
2888@c
2889@c menu
2890@c * omp_display_env:: <fixme>
2891@c end menu
2892
2893@c ---------------------------------------------------------------------
2894@c OpenMP Environment Variables
2895@c ---------------------------------------------------------------------
2896
2897@node Environment Variables
2898@chapter OpenMP Environment Variables
2899
The environment variables beginning with @env{OMP_} are defined by
section 4 of the OpenMP specification in version 4.5 or in a later version
2902of the specification, while those beginning with @env{GOMP_} are GNU extensions.
2903Most @env{OMP_} environment variables have an associated internal control
2904variable (ICV).
2905
2906For any OpenMP environment variable that sets an ICV and is neither
2907@code{OMP_DEFAULT_DEVICE} nor has global ICV scope, associated
2908device-specific environment variables exist. For them, the environment
2909variable without suffix affects the host. The suffix @code{_DEV_} followed
by a non-negative device number less than the number of available devices sets
2911the ICV for the corresponding device. The suffix @code{_DEV} sets the ICV
2912of all non-host devices for which a device-specific corresponding environment
2913variable has not been set while the @code{_ALL} suffix sets the ICV of all
2914host and non-host devices for which a more specific corresponding environment
2915variable is not set.
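For instance, a hypothetical setting for @env{OMP_NUM_TEAMS}, whose
@var{nteams-var} ICV has device scope, could look as follows (the numbers are
arbitrary and assume at least one non-host device):
@smallexample
OMP_NUM_TEAMS=4         # host
OMP_NUM_TEAMS_DEV_0=16  # device 0 only
OMP_NUM_TEAMS_DEV=8     # remaining non-host devices
OMP_NUM_TEAMS_ALL=2     # any device without a more specific setting
@end smallexample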
2916
2917@menu
* OMP_ALLOCATOR::            Set the default allocator
* OMP_AFFINITY_FORMAT::      Set the format string used for affinity display
* OMP_CANCELLATION::         Set whether cancellation is activated
* OMP_DISPLAY_AFFINITY::     Display thread affinity information
* OMP_DISPLAY_ENV::          Show OpenMP version and environment variables
* OMP_DEFAULT_DEVICE::       Set the device used in target regions
* OMP_DYNAMIC::              Dynamic adjustment of threads
* OMP_MAX_ACTIVE_LEVELS::    Set the maximum number of nested parallel regions
* OMP_MAX_TASK_PRIORITY::    Set the maximum task priority value
* OMP_NESTED::               Nested parallel regions
* OMP_NUM_TEAMS::            Specifies the number of teams to use by teams region
* OMP_NUM_THREADS::          Specifies the number of threads to use
* OMP_PROC_BIND::            Whether threads may be moved between CPUs
* OMP_PLACES::               Specifies on which CPUs the threads should be placed
* OMP_STACKSIZE::            Set default thread stack size
* OMP_SCHEDULE::             How threads are scheduled
* OMP_TARGET_OFFLOAD::       Controls offloading behavior
* OMP_TEAMS_THREAD_LIMIT::   Set the maximum number of threads imposed by teams
* OMP_THREAD_LIMIT::         Set the maximum number of threads
* OMP_WAIT_POLICY::          How waiting threads are handled
* GOMP_CPU_AFFINITY::        Bind threads to specific CPUs
* GOMP_DEBUG::               Enable debugging output
* GOMP_STACKSIZE::           Set default thread stack size
* GOMP_SPINCOUNT::           Set the busy-wait spin count
* GOMP_RTEMS_THREAD_POOLS::  Set the RTEMS specific thread pools
2943@end menu
2944
2945
2946@node OMP_ALLOCATOR
2947@section @env{OMP_ALLOCATOR} -- Set the default allocator
2948@cindex Environment Variable
2949@table @asis
@item @emph{ICV:} @var{def-allocator-var}
@item @emph{Scope:} data environment
2952@item @emph{Description}:
2953Sets the default allocator that is used when no allocator has been specified
2954in the @code{allocate} or @code{allocator} clause or if an OpenMP memory
2955routine is invoked with the @code{omp_null_allocator} allocator.
2956If unset, @code{omp_default_mem_alloc} is used.
2957
2958The value can either be a predefined allocator or a predefined memory space
2959or a predefined memory space followed by a colon and a comma-separated list
2960of memory trait and value pairs, separated by @code{=}.
2961
2962Note: The corresponding device environment variables are currently not
2963supported. Therefore, the non-host @var{def-allocator-var} ICVs are always
2964initialized to @code{omp_default_mem_alloc}. However, on all devices,
the @code{omp_set_default_allocator} API routine can be used to change
the value.
2967
@multitable @columnfractions .45 .45
@headitem Predefined allocators @tab Associated predefined memory spaces
2970@item omp_default_mem_alloc @tab omp_default_mem_space
2971@item omp_large_cap_mem_alloc @tab omp_large_cap_mem_space
2972@item omp_const_mem_alloc @tab omp_const_mem_space
2973@item omp_high_bw_mem_alloc @tab omp_high_bw_mem_space
2974@item omp_low_lat_mem_alloc @tab omp_low_lat_mem_space
2975@item omp_cgroup_mem_alloc @tab --
2976@item omp_pteam_mem_alloc @tab --
2977@item omp_thread_mem_alloc @tab --
2978@end multitable
2979
The predefined allocators use the default values for the traits
as listed below, except that the last three allocators have the
@code{access} trait set to @code{cgroup}, @code{pteam}, and
@code{thread}, respectively.
2984
@multitable @columnfractions .25 .40 .25
@headitem Trait @tab Allowed values @tab Default value
@item @code{sync_hint} @tab @code{contended}, @code{uncontended},
                            @code{serialized}, @code{private}
                       @tab @code{contended}
@item @code{alignment} @tab Positive integer being a power of two
                       @tab 1 byte
@item @code{access} @tab @code{all}, @code{cgroup},
                         @code{pteam}, @code{thread}
                    @tab @code{all}
@item @code{pool_size} @tab Positive integer
                       @tab See @ref{Memory allocation}
@item @code{fallback} @tab @code{default_mem_fb}, @code{null_fb},
                           @code{abort_fb}, @code{allocator_fb}
                      @tab See below
@item @code{fb_data} @tab @emph{unsupported as it needs an allocator handle}
                     @tab (none)
@item @code{pinned} @tab @code{true}, @code{false}
                    @tab @code{false}
@item @code{partition} @tab @code{environment}, @code{nearest},
                            @code{blocked}, @code{interleaved}
                       @tab @code{environment}
@end multitable
3008
3009For the @code{fallback} trait, the default value is @code{null_fb} for the
3010@code{omp_default_mem_alloc} allocator and any allocator that is associated
with device memory; for all other allocators, it is @code{default_mem_fb}
3012by default.
3013
3014Examples:
3015@smallexample
3016OMP_ALLOCATOR=omp_high_bw_mem_alloc
3017OMP_ALLOCATOR=omp_large_cap_mem_space
OMP_ALLOCATOR=omp_low_lat_mem_space:pinned=true,partition=nearest
3019@end smallexample
3020
@item @emph{See also}:
3022@ref{Memory allocation}, @ref{omp_get_default_allocator},
3023@ref{omp_set_default_allocator}
3024
3025@item @emph{Reference}:
3026@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.21
3027@end table
3028
3029
3030
3031@node OMP_AFFINITY_FORMAT
3032@section @env{OMP_AFFINITY_FORMAT} -- Set the format string used for affinity display
3033@cindex Environment Variable
3034@table @asis
3035@item @emph{ICV:} @var{affinity-format-var}
3036@item @emph{Scope:} device
3037@item @emph{Description}:
3038Sets the format string used when displaying OpenMP thread affinity information.
3039Special values are output using @code{%} followed by an optional size
3040specification and then either the single-character field type or its long
name enclosed in curly braces; using @code{%%} displays a literal percent.
The size specification consists of an optional @code{0.} or @code{.} followed
by a positive integer, specifying the minimal width of the output. With
3044@code{0.} and numerical values, the output is padded with zeros on the left;
3045with @code{.}, the output is padded by spaces on the left; otherwise, the
3046output is padded by spaces on the right. If unset, the value is
3047``@code{level %L thread %i affinity %A}''.
3048
3049Supported field types are:
3050
3051@multitable @columnfractions .10 .25 .60
3052@item t @tab team_num @tab value returned by @code{omp_get_team_num}
3053@item T @tab num_teams @tab value returned by @code{omp_get_num_teams}
3054@item L @tab nesting_level @tab value returned by @code{omp_get_level}
3055@item n @tab thread_num @tab value returned by @code{omp_get_thread_num}
3056@item N @tab num_threads @tab value returned by @code{omp_get_num_threads}
3057@item a @tab ancestor_tnum
3058 @tab value returned by
3059 @code{omp_get_ancestor_thread_num(omp_get_level()-1)}
3060@item H @tab host @tab name of the host that executes the thread
3061@item P @tab process_id @tab process identifier
3062@item i @tab native_thread_id @tab native thread identifier
3063@item A @tab thread_affinity
3064 @tab comma separated list of integer values or ranges, representing the
3065 processors on which a process might execute, subject to affinity
3066 mechanisms
3067@end multitable
3068
3069For instance, after setting
3070
3071@smallexample
3072OMP_AFFINITY_FORMAT="%0.2a!%n!%.4L!%N;%.2t;%0.2T;%@{team_num@};%@{num_teams@};%A"
3073@end smallexample
3074
3075with either @code{OMP_DISPLAY_AFFINITY} being set or when calling
3076@code{omp_display_affinity} with @code{NULL} or an empty string, the program
3077might display the following:
3078
3079@smallexample
00!0! 1!4; 0;01;0;1;0-11
00!3! 1!4; 0;01;0;1;0-11
00!2! 1!4; 0;01;0;1;0-11
00!1! 1!4; 0;01;0;1;0-11
3084@end smallexample
3085
3086@item @emph{See also}:
3087@ref{OMP_DISPLAY_AFFINITY}
3088
3089@item @emph{Reference}:
3090@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.14
3091@end table
3092
3093
3094
3095@node OMP_CANCELLATION
3096@section @env{OMP_CANCELLATION} -- Set whether cancellation is activated
3097@cindex Environment Variable
3098@table @asis
3099@item @emph{ICV:} @var{cancel-var}
3100@item @emph{Scope:} global
3101@item @emph{Description}:
If set to @code{TRUE}, cancellation is activated. If set to @code{FALSE} or
3103if unset, cancellation is disabled and the @code{cancel} construct is ignored.
3104
3105@item @emph{See also}:
3106@ref{omp_get_cancellation}
3107
3108@item @emph{Reference}:
3109@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.11
3110@end table
3111
3112
3113
3114@node OMP_DISPLAY_AFFINITY
3115@section @env{OMP_DISPLAY_AFFINITY} -- Display thread affinity information
3116@cindex Environment Variable
3117@table @asis
3118@item @emph{ICV:} @var{display-affinity-var}
3119@item @emph{Scope:} global
3120@item @emph{Description}:
3121If set to @code{FALSE} or if unset, affinity displaying is disabled.
If set to @code{TRUE}, the runtime displays affinity information about
3123OpenMP threads in a parallel region upon entering the region and every time
3124any change occurs.
3125
3126@item @emph{See also}:
3127@ref{OMP_AFFINITY_FORMAT}
3128
3129@item @emph{Reference}:
3130@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.13
3131@end table
3132
3133
3134
3135
3136@node OMP_DISPLAY_ENV
3137@section @env{OMP_DISPLAY_ENV} -- Show OpenMP version and environment variables
3138@cindex Environment Variable
3139@table @asis
3140@item @emph{ICV:} none
3141@item @emph{Scope:} not applicable
3142@item @emph{Description}:
3143If set to @code{TRUE}, the OpenMP version number and the values
3144associated with the OpenMP environment variables are printed to @code{stderr}.
3145If set to @code{VERBOSE}, it additionally shows the value of the environment
3146variables which are GNU extensions. If undefined or set to @code{FALSE},
this information is not shown.
3148
3149
3150@item @emph{Reference}:
3151@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.12
3152@end table
3153
3154
3155
3156@node OMP_DEFAULT_DEVICE
3157@section @env{OMP_DEFAULT_DEVICE} -- Set the device used in target regions
3158@cindex Environment Variable
3159@table @asis
3160@item @emph{ICV:} @var{default-device-var}
3161@item @emph{Scope:} data environment
3162@item @emph{Description}:
3163Set to choose the device which is used in a @code{target} region, unless the
3164value is overridden by @code{omp_set_default_device} or by a @code{device}
3165clause. The value shall be the nonnegative device number. If no device with
the given device number exists, the code is executed on the host. If unset
and @env{OMP_TARGET_OFFLOAD} is @code{mandatory} and no non-host devices are
available, the value is set to @code{omp_invalid_device}. Otherwise, if unset,
device number 0 is used.
3170
3171
3172@item @emph{See also}:
3173@ref{omp_get_default_device}, @ref{omp_set_default_device},
@ref{OMP_TARGET_OFFLOAD}
3175
3176@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.2}, Section 21.2.7
3178@end table
3179
3180
3181
3182@node OMP_DYNAMIC
3183@section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
3184@cindex Environment Variable
3185@table @asis
3186@item @emph{ICV:} @var{dyn-var}
3187@item @emph{Scope:} global
3188@item @emph{Description}:
3189Enable or disable the dynamic adjustment of the number of threads
3190within a team. The value of this environment variable shall be
3191@code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is
3192disabled by default.
3193
3194@item @emph{See also}:
3195@ref{omp_set_dynamic}
3196
3197@item @emph{Reference}:
3198@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.3
3199@end table
3200
3201
3202
3203@node OMP_MAX_ACTIVE_LEVELS
3204@section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions
3205@cindex Environment Variable
3206@table @asis
3207@item @emph{ICV:} @var{max-active-levels-var}
3208@item @emph{Scope:} data environment
3209@item @emph{Description}:
3210Specifies the initial value for the maximum number of nested parallel
3211regions. The value of this variable shall be a positive integer.
3212If undefined, then if @env{OMP_NESTED} is defined and set to true, or
3213if @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined and set to
3214a list with more than one item, the maximum number of nested parallel
3215regions is initialized to the largest number supported, otherwise
3216it is set to one.
3217
3218@item @emph{See also}:
3219@ref{omp_set_max_active_levels}, @ref{OMP_NESTED}, @ref{OMP_PROC_BIND},
3220@ref{OMP_NUM_THREADS}
3221
3222
3223@item @emph{Reference}:
3224@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.9
3225@end table
3226
3227
3228
3229@node OMP_MAX_TASK_PRIORITY
3230@section @env{OMP_MAX_TASK_PRIORITY} -- Set the maximum priority
3231number that can be set for a task.
3232@cindex Environment Variable
3233@table @asis
3234@item @emph{ICV:} @var{max-task-priority-var}
3235@item @emph{Scope:} global
3236@item @emph{Description}:
3237Specifies the initial value for the maximum priority value that can be
3238set for a task. The value of this variable shall be a non-negative
integer, and zero is allowed. If undefined, the default priority is
0.
3241
3242@item @emph{See also}:
3243@ref{omp_get_max_task_priority}
3244
3245@item @emph{Reference}:
3246@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.14
3247@end table
3248
3249
3250
3251@node OMP_NESTED
3252@section @env{OMP_NESTED} -- Nested parallel regions
3253@cindex Environment Variable
3254@cindex Implementation specific setting
3255@table @asis
3256@item @emph{ICV:} @var{max-active-levels-var}
3257@item @emph{Scope:} data environment
3258@item @emph{Description}:
3259Enable or disable nested parallel regions, i.e., whether team members
3260are allowed to create new teams. The value of this environment variable
shall be @code{TRUE} or @code{FALSE}. If set to @code{TRUE}, the maximum
number of active nested regions is by default set to the largest number
supported; otherwise, it is set to one. If
@env{OMP_MAX_ACTIVE_LEVELS} is defined, its setting overrides this
3265setting. If both are undefined, nested parallel regions are enabled if
@env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined to a list with
3267more than one item, otherwise they are disabled by default.
3268
3269Note that the @code{OMP_NESTED} environment variable was deprecated in
3270the OpenMP specification 5.2 in favor of @code{OMP_MAX_ACTIVE_LEVELS}.
3271
@item @emph{See also}:
3273@ref{omp_set_max_active_levels}, @ref{omp_set_nested},
3274@ref{OMP_MAX_ACTIVE_LEVELS}
3275
3276@item @emph{Reference}:
3277@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.6
3278@end table
3279
3280
3281
3282@node OMP_NUM_TEAMS
3283@section @env{OMP_NUM_TEAMS} -- Specifies the number of teams to use by teams region
3284@cindex Environment Variable
3285@table @asis
3286@item @emph{ICV:} @var{nteams-var}
3287@item @emph{Scope:} device
3288@item @emph{Description}:
Specifies the upper bound for the number of teams to use in teams regions
without an explicit @code{num_teams} clause. The value of this variable shall
be a positive integer. If undefined, it defaults to 0, which means an
implementation-defined upper bound.
3293
3294@item @emph{See also}:
3295@ref{omp_set_num_teams}
3296
3297@item @emph{Reference}:
3298@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 6.23
3299@end table
3300
3301
3302
3303@node OMP_NUM_THREADS
3304@section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
3305@cindex Environment Variable
3306@cindex Implementation specific setting
3307@table @asis
3308@item @emph{ICV:} @var{nthreads-var}
3309@item @emph{Scope:} data environment
3310@item @emph{Description}:
3311Specifies the default number of threads to use in parallel regions. The
3312value of this variable shall be a comma-separated list of positive integers;
3313the value specifies the number of threads to use for the corresponding nested
level. Specifying more than one item in the list automatically enables
nesting by default. If undefined, one thread per CPU is used.
3316
When a list with more than one value is specified, it also affects the
3318@var{max-active-levels-var} ICV as described in @ref{OMP_MAX_ACTIVE_LEVELS}.
3319
@item @emph{See also}:
@ref{omp_set_num_threads}, @ref{OMP_MAX_ACTIVE_LEVELS}
3322
3323@item @emph{Reference}:
3324@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.2
3325@end table
3326
3327
3328
3329@node OMP_PROC_BIND
@section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs
3331@cindex Environment Variable
3332@table @asis
3333@item @emph{ICV:} @var{bind-var}
3334@item @emph{Scope:} data environment
3335@item @emph{Description}:
3336Specifies whether threads may be moved between processors. If set to
@code{TRUE}, OpenMP threads should not be moved; if set to @code{FALSE}
3338they may be moved. Alternatively, a comma separated list with the
3339values @code{PRIMARY}, @code{MASTER}, @code{CLOSE} and @code{SPREAD} can
3340be used to specify the thread affinity policy for the corresponding nesting
3341level. With @code{PRIMARY} and @code{MASTER} the worker threads are in the
3342same place partition as the primary thread. With @code{CLOSE} those are
kept close to the primary thread in contiguous place partitions, and
with @code{SPREAD} a sparse distribution
3345across the place partitions is used. Specifying more than one item in the
list automatically enables nesting by default.

When a list is specified, it also affects the @var{max-active-levels-var} ICV
3349as described in @ref{OMP_MAX_ACTIVE_LEVELS}.
3350
3351When undefined, @env{OMP_PROC_BIND} defaults to @code{TRUE} when
3352@env{OMP_PLACES} or @env{GOMP_CPU_AFFINITY} is set and @code{FALSE} otherwise.
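For example, a sparse distribution at the outermost level and close placement
at the next nesting level (an illustrative choice, not a recommendation) can
be requested with:
@smallexample
OMP_PROC_BIND="SPREAD,CLOSE"
@end smallexample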
3353
3354@item @emph{See also}:
3355@ref{omp_get_proc_bind}, @ref{GOMP_CPU_AFFINITY}, @ref{OMP_PLACES},
3356@ref{OMP_MAX_ACTIVE_LEVELS}
3357
3358@item @emph{Reference}:
3359@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.4
3360@end table
3361
3362
3363
3364@node OMP_PLACES
@section @env{OMP_PLACES} -- Specifies on which CPUs the threads should be placed
3366@cindex Environment Variable
3367@table @asis
3368@item @emph{ICV:} @var{place-partition-var}
3369@item @emph{Scope:} implicit tasks
3370@item @emph{Description}:
3371The thread placement can be either specified using an abstract name or by an
3372explicit list of the places. The abstract names @code{threads}, @code{cores},
3373@code{sockets}, @code{ll_caches} and @code{numa_domains} can be optionally
followed by a positive number in parentheses, which denotes how many places
3375shall be created. With @code{threads} each place corresponds to a single
3376hardware thread; @code{cores} to a single core with the corresponding number of
3377hardware threads; with @code{sockets} the place corresponds to a single
3378socket; with @code{ll_caches} to a set of cores that shares the last level
3379cache on the device; and @code{numa_domains} to a set of cores for which their
3380closest memory on the device is the same memory and at a similar distance from
3381the cores. The resulting placement can be shown by setting the
3382@env{OMP_DISPLAY_ENV} environment variable.
3383
Alternatively, the placement can be specified explicitly as a comma-separated
list of places. A place is specified by a set of nonnegative numbers in curly
3386braces, denoting the hardware threads. The curly braces can be omitted
3387when only a single number has been specified. The hardware threads
belonging to a place can either be specified as a comma-separated list of
3389nonnegative thread numbers or using an interval. Multiple places can also be
3390either specified by a comma-separated list of places or by an interval. To
3391specify an interval, a colon followed by the count is placed after
3392the hardware thread number or the place. Optionally, the length can be
3393followed by a colon and the stride number -- otherwise a unit stride is
3394assumed. Placing an exclamation mark (@code{!}) directly before a curly
3395brace or numbers inside the curly braces (excluding intervals)
3396excludes those hardware threads.
3397
3398For instance, the following specifies the same places list:
3399@code{"@{0,1,2@}, @{3,4,6@}, @{7,8,9@}, @{10,11,12@}"};
3400@code{"@{0:3@}, @{3:3@}, @{7:3@}, @{10:3@}"}; and @code{"@{0:2@}:4:3"}.
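Abstract names can be set in the same way; for example (the values are purely
illustrative):
@smallexample
OMP_PLACES=threads
OMP_PLACES="cores(4)"
OMP_PLACES="@{0:4@},@{4:4@},@{8:4@}"
@end smallexample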
3401
3402If @env{OMP_PLACES} and @env{GOMP_CPU_AFFINITY} are unset and
3403@env{OMP_PROC_BIND} is either unset or @code{false}, threads may be moved
3404between CPUs following no placement policy.
3405
3406@item @emph{See also}:
3407@ref{OMP_PROC_BIND}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind},
3408@ref{OMP_DISPLAY_ENV}
3409
3410@item @emph{Reference}:
3411@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.5
3412@end table
3413
3414
3415
3416@node OMP_STACKSIZE
3417@section @env{OMP_STACKSIZE} -- Set default thread stack size
3418@cindex Environment Variable
3419@table @asis
3420@item @emph{ICV:} @var{stacksize-var}
3421@item @emph{Scope:} device
3422@item @emph{Description}:
3423Set the default thread stack size in kilobytes, unless the number
3424is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which
3425case the size is, respectively, in bytes, kilobytes, megabytes
3426or gigabytes. This is different from @code{pthread_attr_setstacksize}
3427which gets the number of bytes as an argument. If the stack size cannot
3428be set due to system constraints, an error is reported and the initial
3429stack size is left unchanged. If undefined, the stack size is system
3430dependent.
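For example, a 2-megabyte default stack per thread (an arbitrary value) can
be requested with:
@smallexample
OMP_STACKSIZE=2M
@end smallexample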
3431
3432@item @emph{See also}:
3433@ref{GOMP_STACKSIZE}
3434
3435@item @emph{Reference}:
3436@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.7
3437@end table
3438
3439
3440
3441@node OMP_SCHEDULE
3442@section @env{OMP_SCHEDULE} -- How threads are scheduled
3443@cindex Environment Variable
3444@cindex Implementation specific setting
3445@table @asis
3446@item @emph{ICV:} @var{run-sched-var}
3447@item @emph{Scope:} data environment
3448@item @emph{Description}:
Allows specifying the @code{schedule type} and @code{chunk size}.
The value of the variable shall have the form @code{type[,chunk]}, where
@code{type} is one of @code{static}, @code{dynamic}, @code{guided} or @code{auto}.
3452The optional @code{chunk} size shall be a positive integer. If undefined,
3453dynamic scheduling and a chunk size of 1 is used.
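For example, guided scheduling with an (arbitrarily chosen) chunk size of 16
can be requested with:
@smallexample
OMP_SCHEDULE="guided,16"
@end smallexample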
3454
3455@item @emph{See also}:
3456@ref{omp_set_schedule}
3457
3458@item @emph{Reference}:
3459@uref{https://www.openmp.org, OpenMP specification v4.5}, Sections 2.7.1.1 and 4.1
3460@end table
3461
3462
3463
3464@node OMP_TARGET_OFFLOAD
@section @env{OMP_TARGET_OFFLOAD} -- Controls offloading behavior
3466@cindex Environment Variable
3467@cindex Implementation specific setting
3468@table @asis
3469@item @emph{ICV:} @var{target-offload-var}
3470@item @emph{Scope:} global
@item @emph{Description}:
Specifies the behavior with regard to offloading code to a device. This
variable can be set to one of three values: @code{MANDATORY}, @code{DISABLED}
3474or @code{DEFAULT}.
3475
If set to @code{MANDATORY}, the program terminates with an error if
3477any device construct or device memory routine uses a device that is unavailable
3478or not supported by the implementation, or uses a non-conforming device number.
3479If set to @code{DISABLED}, then offloading is disabled and all code runs on
3480the host. If set to @code{DEFAULT}, the program tries offloading to the
device first, then falls back to running code on the host if it cannot.

If undefined, the program behaves as if @code{DEFAULT} was set.

Note: Even with @code{MANDATORY}, no run-time termination is performed when
3486the device number in a @code{device} clause or argument to a device memory
routine is for the host, which includes using the device number in the
3488@var{default-device-var} ICV. However, the initial value of
3489the @var{default-device-var} ICV is affected by @code{MANDATORY}.
3490
3491@item @emph{See also}:
3492@ref{OMP_DEFAULT_DEVICE}
3493
@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.2}, Section 21.2.8
3496@end table
3497
3498
3499
3500@node OMP_TEAMS_THREAD_LIMIT
3501@section @env{OMP_TEAMS_THREAD_LIMIT} -- Set the maximum number of threads imposed by teams
3502@cindex Environment Variable
3503@table @asis
3504@item @emph{ICV:} @var{teams-thread-limit-var}
3505@item @emph{Scope:} device
3506@item @emph{Description}:
3507Specifies an upper bound for the number of threads to use by each contention
group created by a teams construct without an explicit @code{thread_limit}
clause. The value of this variable shall be a positive integer. If undefined,
the value of 0 is used, which stands for an implementation-defined upper
limit.
3512
3513@item @emph{See also}:
3514@ref{OMP_THREAD_LIMIT}, @ref{omp_set_teams_thread_limit}
3515
3516@item @emph{Reference}:
3517@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 6.24
3518@end table
3519
3520
3521
3522@node OMP_THREAD_LIMIT
3523@section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads
3524@cindex Environment Variable
3525@table @asis
3526@item @emph{ICV:} @var{thread-limit-var}
3527@item @emph{Scope:} data environment
3528@item @emph{Description}:
3529Specifies the number of threads to use for the whole program. The
3530value of this variable shall be a positive integer. If undefined,
3531the number of threads is not limited.
3532
3533@item @emph{See also}:
3534@ref{OMP_NUM_THREADS}, @ref{omp_get_thread_limit}
3535
3536@item @emph{Reference}:
3537@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.10
3538@end table
3539
3540
3541
3542@node OMP_WAIT_POLICY
3543@section @env{OMP_WAIT_POLICY} -- How waiting threads are handled
3544@cindex Environment Variable
3545@table @asis
3546@item @emph{Description}:
3547Specifies whether waiting threads should be active or passive. If
the value is @code{PASSIVE}, waiting threads should not consume CPU
power while waiting; the value @code{ACTIVE} specifies that
they should. If undefined, threads wait actively for a short time
3551before waiting passively.
3552
3553@item @emph{See also}:
3554@ref{GOMP_SPINCOUNT}
3555
3556@item @emph{Reference}:
3557@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.8
3558@end table
3559
3560
3561
3562@node GOMP_CPU_AFFINITY
3563@section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
3564@cindex Environment Variable
3565@table @asis
3566@item @emph{Description}:
3567Binds threads to specific CPUs. The variable should contain a space-separated
3568or comma-separated list of CPUs. This list may contain different kinds of
3569entries: either single CPU numbers in any order, a range of CPUs (M-N)
3570or a range with some stride (M-N:S). CPU numbers are zero based. For example,
@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} binds the initial thread
3572to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
3573CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
and 14 respectively and then starts assigning back from the beginning of
3575the list. @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0.
3576
3577There is no libgomp library routine to determine whether a CPU affinity
3578specification is in effect. As a workaround, language-specific library
3579functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in
3580Fortran, may be used to query the setting of the @code{GOMP_CPU_AFFINITY}
3581environment variable. A defined CPU affinity on startup cannot be changed
3582or disabled during the runtime of the application.
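A minimal sketch of such a query in C, using only the standard @code{getenv}
routine:
@smallexample
#include <stdio.h>
#include <stdlib.h>

int
main ()
@{
  const char *aff = getenv ("GOMP_CPU_AFFINITY");
  if (aff)
    printf ("GOMP_CPU_AFFINITY=%s\n", aff);
  else
    printf ("GOMP_CPU_AFFINITY is not set\n");
  return 0;
@}
@end smallexample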
3583
3584If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
@env{OMP_PROC_BIND} has a higher precedence. If neither is set, or when
@env{OMP_PROC_BIND} is set to
@code{FALSE}, the host system handles the assignment of threads to CPUs.
3588
3589@item @emph{See also}:
3590@ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
3591@end table
3592
3593
3594
3595@node GOMP_DEBUG
3596@section @env{GOMP_DEBUG} -- Enable debugging output
3597@cindex Environment Variable
3598@table @asis
3599@item @emph{Description}:
3600Enable debugging output. The variable should be set to @code{0}
3601(disabled, also the default if not set), or @code{1} (enabled).
3602
If enabled, some debugging output is printed during execution.
3604This is currently not specified in more detail, and subject to change.
3605@end table
3606
3607
3608
3609@node GOMP_STACKSIZE
3610@section @env{GOMP_STACKSIZE} -- Set default thread stack size
3611@cindex Environment Variable
3612@cindex Implementation specific setting
3613@table @asis
3614@item @emph{Description}:
3615Set the default thread stack size in kilobytes. This is different from
3616@code{pthread_attr_setstacksize} which gets the number of bytes as an
3617argument. If the stack size cannot be set due to system constraints, an
3618error is reported and the initial stack size is left unchanged. If undefined,
3619the stack size is system dependent.
3620
3621@item @emph{See also}:
3622@ref{OMP_STACKSIZE}
3623
3624@item @emph{Reference}:
3625@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
3626GCC Patches Mailinglist},
3627@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
3628GCC Patches Mailinglist}
3629@end table
3630
3631
3632
3633@node GOMP_SPINCOUNT
3634@section @env{GOMP_SPINCOUNT} -- Set the busy-wait spin count
3635@cindex Environment Variable
3636@cindex Implementation specific setting
3637@table @asis
3638@item @emph{Description}:
Determines how long a thread waits actively while consuming CPU power
3640before waiting passively without consuming CPU power. The value may be
3641either @code{INFINITE}, @code{INFINITY} to always wait actively or an
3642integer which gives the number of spins of the busy-wait loop. The
3643integer may optionally be followed by the following suffixes acting
3644as multiplication factors: @code{k} (kilo, thousand), @code{M} (mega,
3645million), @code{G} (giga, billion), or @code{T} (tera, trillion).
3646If undefined, 0 is used when @env{OMP_WAIT_POLICY} is @code{PASSIVE},
3647300,000 is used when @env{OMP_WAIT_POLICY} is undefined and
364830 billion is used when @env{OMP_WAIT_POLICY} is @code{ACTIVE}.
3649If there are more OpenMP threads than available CPUs, 1000 and 100
3650spins are used for @env{OMP_WAIT_POLICY} being @code{ACTIVE} or
3651undefined, respectively; unless the @env{GOMP_SPINCOUNT} is lower
3652or @env{OMP_WAIT_POLICY} is @code{PASSIVE}.
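For example, a busy-wait limit of 500,000 spins (an arbitrary value) can be
requested with:
@smallexample
GOMP_SPINCOUNT=500k
@end smallexample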
3653
3654@item @emph{See also}:
3655@ref{OMP_WAIT_POLICY}
3656@end table
3657
3658
3659
3660@node GOMP_RTEMS_THREAD_POOLS
3661@section @env{GOMP_RTEMS_THREAD_POOLS} -- Set the RTEMS specific thread pools
3662@cindex Environment Variable
3663@cindex Implementation specific setting
3664@table @asis
3665@item @emph{Description}:
3666This environment variable is only used on the RTEMS real-time operating system.
3667It determines the scheduler instance specific thread pools. The format for
3668@env{GOMP_RTEMS_THREAD_POOLS} is a list of optional
3669@code{<thread-pool-count>[$<priority>]@@<scheduler-name>} configurations
3670separated by @code{:} where:
3671@itemize @bullet
3672@item @code{<thread-pool-count>} is the thread pool count for this scheduler
3673instance.
3674@item @code{$<priority>} is an optional priority for the worker threads of a
3675thread pool according to @code{pthread_setschedparam}. In case a priority
value is omitted, then a worker thread inherits the priority of the OpenMP
3677primary thread that created it. The priority of the worker thread is not
3678changed after creation, even if a new OpenMP primary thread using the worker has
3679a different priority.
3680@item @code{@@<scheduler-name>} is the scheduler instance name according to the
3681RTEMS application configuration.
3682@end itemize
3683In case no thread pool configuration is specified for a scheduler instance,
then each OpenMP primary thread of this scheduler instance uses its own
3685dynamically allocated thread pool. To limit the worker thread count of the
3686thread pools, each OpenMP primary thread must call @code{omp_set_num_threads}.
3687@item @emph{Example}:
Let us suppose we have three scheduler instances @code{IO}, @code{WRK0}, and
3689@code{WRK1} with @env{GOMP_RTEMS_THREAD_POOLS} set to
3690@code{"1@@WRK0:3$4@@WRK1"}. Then there are no thread pool restrictions for
3691scheduler instance @code{IO}. In the scheduler instance @code{WRK0} there is
3692one thread pool available. Since no priority is specified for this scheduler
3693instance, the worker thread inherits the priority of the OpenMP primary thread
3694that created it. In the scheduler instance @code{WRK1} there are three thread
3695pools available and their worker threads run at priority four.
3696@end table
3697
3698
3699
3700@c ---------------------------------------------------------------------
3701@c Enabling OpenACC
3702@c ---------------------------------------------------------------------
3703
3704@node Enabling OpenACC
3705@chapter Enabling OpenACC
3706
3707To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
3708flag @option{-fopenacc} must be specified. This enables the OpenACC directive
3709@samp{#pragma acc} in C/C++ and, in Fortran, the @samp{!$acc} sentinel in free
3710source form and the @samp{c$acc}, @samp{*$acc} and @samp{!$acc} sentinels in
3711fixed source form. The flag also arranges for automatic linking of the OpenACC
3712runtime library (@ref{OpenACC Runtime Library Routines}).
3713
3714See @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
3715
3716A complete description of all OpenACC directives accepted may be found in
3717the @uref{https://www.openacc.org, OpenACC} Application Programming
3718Interface manual, version 2.6.
3719
3720
3721
3722@c ---------------------------------------------------------------------
3723@c OpenACC Runtime Library Routines
3724@c ---------------------------------------------------------------------
3725
3726@node OpenACC Runtime Library Routines
3727@chapter OpenACC Runtime Library Routines
3728
3729The runtime routines described here are defined by section 3 of the OpenACC
3730specifications in version 2.6.
3731They have C linkage, and do not throw exceptions.
3732Generally, they are available only for the host, with the exception of
3733@code{acc_on_device}, which is available for both the host and the
3734acceleration device.
3735
3736@menu
3737* acc_get_num_devices:: Get number of devices for the given device
3738 type.
3739* acc_set_device_type:: Set type of device accelerator to use.
3740* acc_get_device_type:: Get type of device accelerator to be used.
3741* acc_set_device_num:: Set device number to use.
3742* acc_get_device_num:: Get device number to be used.
3743* acc_get_property:: Get device property.
3744* acc_async_test:: Tests for completion of a specific asynchronous
3745 operation.
3746* acc_async_test_all:: Tests for completion of all asynchronous
3747 operations.
3748* acc_wait:: Wait for completion of a specific asynchronous
3749 operation.
3750* acc_wait_all:: Waits for completion of all asynchronous
3751 operations.
3752* acc_wait_all_async:: Wait for completion of all asynchronous
3753 operations.
3754* acc_wait_async:: Wait for completion of asynchronous operations.
3755* acc_init:: Initialize runtime for a specific device type.
3756* acc_shutdown:: Shuts down the runtime for a specific device
3757 type.
3758* acc_on_device:: Whether executing on a particular device
3759* acc_malloc:: Allocate device memory.
3760* acc_free:: Free device memory.
3761* acc_copyin:: Allocate device memory and copy host memory to
3762 it.
3763* acc_present_or_copyin:: If the data is not present on the device,
3764 allocate device memory and copy from host
3765 memory.
3766* acc_create:: Allocate device memory and map it to host
3767 memory.
3768* acc_present_or_create:: If the data is not present on the device,
3769 allocate device memory and map it to host
3770 memory.
3771* acc_copyout:: Copy device memory to host memory.
3772* acc_delete:: Free device memory.
3773* acc_update_device:: Update device memory from mapped host memory.
3774* acc_update_self:: Update host memory from mapped device memory.
3775* acc_map_data:: Map previously allocated device memory to host
3776 memory.
3777* acc_unmap_data:: Unmap device memory from host memory.
3778* acc_deviceptr:: Get device pointer associated with specific
3779 host address.
3780* acc_hostptr:: Get host pointer associated with specific
3781 device address.
3782* acc_is_present:: Indicate whether host variable / array is
3783 present on device.
3784* acc_memcpy_to_device:: Copy host memory to device memory.
3785* acc_memcpy_from_device:: Copy device memory to host memory.
3786* acc_attach:: Let device pointer point to device-pointer target.
3787* acc_detach:: Let device pointer point to host-pointer target.
3788
3789API routines for target platforms.
3790
3791* acc_get_current_cuda_device:: Get CUDA device handle.
3792* acc_get_current_cuda_context::Get CUDA context handle.
3793* acc_get_cuda_stream:: Get CUDA stream handle.
3794* acc_set_cuda_stream:: Set CUDA stream handle.
3795
3796API routines for the OpenACC Profiling Interface.
3797
3798* acc_prof_register:: Register callbacks.
3799* acc_prof_unregister:: Unregister callbacks.
3800* acc_prof_lookup:: Obtain inquiry functions.
3801* acc_register_library:: Library registration.
3802@end menu
3803
3804
3805
3806@node acc_get_num_devices
3807@section @code{acc_get_num_devices} -- Get number of devices for given device type
3808@table @asis
3809@item @emph{Description}
3810This function returns a value indicating the number of devices available
3811for the device type specified in @var{devicetype}.
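A small host-side query might look as follows; @code{acc_device_nvidia} is
chosen for illustration and assumes an NVIDIA offloading configuration:
@smallexample
#include <openacc.h>
#include <stdio.h>

int
main ()
@{
  int n = acc_get_num_devices (acc_device_nvidia);
  printf ("Number of NVIDIA devices: %d\n", n);
  return 0;
@}
@end smallexample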
3812
3813@item @emph{C/C++}:
3814@multitable @columnfractions .20 .80
3815@item @emph{Prototype}: @tab @code{int acc_get_num_devices(acc_device_t devicetype);}
3816@end multitable
3817
3818@item @emph{Fortran}:
3819@multitable @columnfractions .20 .80
3820@item @emph{Interface}: @tab @code{integer function acc_get_num_devices(devicetype)}
3821@item @tab @code{integer(kind=acc_device_kind) devicetype}
3822@end multitable
3823
3824@item @emph{Reference}:
3825@uref{https://www.openacc.org, OpenACC specification v2.6}, section
38263.2.1.
3827@end table
3828
3829
3830
3831@node acc_set_device_type
3832@section @code{acc_set_device_type} -- Set type of device accelerator to use.
3833@table @asis
3834@item @emph{Description}
3835This function indicates to the runtime library which device type, specified
3836in @var{devicetype}, to use when executing a parallel or kernels region.
3837
3838@item @emph{C/C++}:
3839@multitable @columnfractions .20 .80
3840@item @emph{Prototype}: @tab @code{acc_set_device_type(acc_device_t devicetype);}
3841@end multitable
3842
3843@item @emph{Fortran}:
3844@multitable @columnfractions .20 .80
3845@item @emph{Interface}: @tab @code{subroutine acc_set_device_type(devicetype)}
3846@item @tab @code{integer(kind=acc_device_kind) devicetype}
3847@end multitable
3848
3849@item @emph{Reference}:
3850@uref{https://www.openacc.org, OpenACC specification v2.6}, section
38513.2.2.
3852@end table
3853
3854
3855
3856@node acc_get_device_type
3857@section @code{acc_get_device_type} -- Get type of device accelerator to be used.
3858@table @asis
3859@item @emph{Description}
This function returns the device type that will be used when executing a
3861parallel or kernels region.
3862
3863This function returns @code{acc_device_none} if
3864@code{acc_get_device_type} is called from
3865@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
3866callbacks of the OpenACC Profiling Interface (@ref{OpenACC Profiling
3867Interface}), that is, if the device is currently being initialized.
3868
3869@item @emph{C/C++}:
3870@multitable @columnfractions .20 .80
3871@item @emph{Prototype}: @tab @code{acc_device_t acc_get_device_type(void);}
3872@end multitable
3873
3874@item @emph{Fortran}:
3875@multitable @columnfractions .20 .80
3876@item @emph{Interface}: @tab @code{function acc_get_device_type(void)}
3877@item @tab @code{integer(kind=acc_device_kind) acc_get_device_type}
3878@end multitable
3879
3880@item @emph{Reference}:
3881@uref{https://www.openacc.org, OpenACC specification v2.6}, section
38823.2.3.
3883@end table
3884
3885
3886
3887@node acc_set_device_num
3888@section @code{acc_set_device_num} -- Set device number to use.
3889@table @asis
3890@item @emph{Description}
This function indicates to the runtime which device number, specified
by @var{devicenum} and associated with the specified device type
@var{devicetype}, is to be used.
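For instance (assuming a device with number 1 of the current device type
exists):
@smallexample
#include <openacc.h>

int
main ()
@{
  acc_set_device_num (1, acc_get_device_type ());
  return 0;
@}
@end smallexample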
3894
3895@item @emph{C/C++}:
3896@multitable @columnfractions .20 .80
3897@item @emph{Prototype}: @tab @code{acc_set_device_num(int devicenum, acc_device_t devicetype);}
3898@end multitable
3899
3900@item @emph{Fortran}:
3901@multitable @columnfractions .20 .80
3902@item @emph{Interface}: @tab @code{subroutine acc_set_device_num(devicenum, devicetype)}
3903@item @tab @code{integer devicenum}
3904@item @tab @code{integer(kind=acc_device_kind) devicetype}
3905@end multitable
3906
3907@item @emph{Reference}:
3908@uref{https://www.openacc.org, OpenACC specification v2.6}, section
39093.2.4.
3910@end table
3911
3912
3913
3914@node acc_get_device_num
3915@section @code{acc_get_device_num} -- Get device number to be used.
3916@table @asis
3917@item @emph{Description}
This function returns the device number, associated with the specified device
type @var{devicetype}, that will be used when executing a parallel or kernels
region.
3921
3922@item @emph{C/C++}:
3923@multitable @columnfractions .20 .80
3924@item @emph{Prototype}: @tab @code{int acc_get_device_num(acc_device_t devicetype);}
3925@end multitable
3926
3927@item @emph{Fortran}:
3928@multitable @columnfractions .20 .80
3929@item @emph{Interface}: @tab @code{function acc_get_device_num(devicetype)}
3930@item @tab @code{integer(kind=acc_device_kind) devicetype}
3931@item @tab @code{integer acc_get_device_num}
3932@end multitable
3933
3934@item @emph{Reference}:
3935@uref{https://www.openacc.org, OpenACC specification v2.6}, section
39363.2.5.
3937@end table
3938
3939
3940
3941@node acc_get_property
3942@section @code{acc_get_property} -- Get device property.
3943@cindex acc_get_property
3944@cindex acc_get_property_string
3945@table @asis
3946@item @emph{Description}
3947These routines return the value of the specified @var{property} for the
3948device being queried according to @var{devicenum} and @var{devicetype}.
3949Integer-valued and string-valued properties are returned by
3950@code{acc_get_property} and @code{acc_get_property_string} respectively.
3951The Fortran @code{acc_get_property_string} subroutine returns the string
3952retrieved in its fourth argument while the remaining entry points are
3953functions, which pass the return value as their result.
3954
Note for Fortran only: the OpenACC technical committee corrected and, hence,
modified the interface introduced in OpenACC 2.6. The kind-value parameter
@code{acc_device_property} has been renamed to @code{acc_device_property_kind}
for consistency, and the return type of the @code{acc_get_property} function is
now a @code{c_size_t} integer instead of an @code{acc_device_property} integer.
The parameter @code{acc_device_property} is still provided,
3961but might be removed in a future version of GCC.
3962
3963@item @emph{C/C++}:
3964@multitable @columnfractions .20 .80
3965@item @emph{Prototype}: @tab @code{size_t acc_get_property(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
3966@item @emph{Prototype}: @tab @code{const char *acc_get_property_string(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
3967@end multitable
3968
3969@item @emph{Fortran}:
3970@multitable @columnfractions .20 .80
3971@item @emph{Interface}: @tab @code{function acc_get_property(devicenum, devicetype, property)}
3972@item @emph{Interface}: @tab @code{subroutine acc_get_property_string(devicenum, devicetype, property, string)}
3973@item @tab @code{use ISO_C_Binding, only: c_size_t}
3974@item @tab @code{integer devicenum}
3975@item @tab @code{integer(kind=acc_device_kind) devicetype}
3976@item @tab @code{integer(kind=acc_device_property_kind) property}
3977@item @tab @code{integer(kind=c_size_t) acc_get_property}
3978@item @tab @code{character(*) string}
3979@end multitable
3980
3981@item @emph{Reference}:
3982@uref{https://www.openacc.org, OpenACC specification v2.6}, section
39833.2.6.
3984@end table
3985
3986
3987
3988@node acc_async_test
3989@section @code{acc_async_test} -- Test for completion of a specific asynchronous operation.
3990@table @asis
3991@item @emph{Description}
3992This function tests for completion of the asynchronous operation specified
3993in @var{arg}. In C/C++, a non-zero value is returned to indicate
3994the specified asynchronous operation has completed while Fortran returns
3995@code{true}. If the asynchronous operation has not completed, C/C++ returns
3996zero and Fortran returns @code{false}.
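A sketch of polling an asynchronous queue from C (the queue number and the
offloaded loop are purely illustrative):
@smallexample
#include <openacc.h>
#include <stdio.h>

int
main ()
@{
  float a[1024];
#pragma acc parallel loop async(1) copyout(a)
  for (int i = 0; i < 1024; i++)
    a[i] = i;

  while (!acc_async_test (1))
    ;  /* The host could do unrelated work here.  */
  printf ("a[0] = %g\n", a[0]);
  return 0;
@}
@end smallexample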
3997
3998@item @emph{C/C++}:
3999@multitable @columnfractions .20 .80
4000@item @emph{Prototype}: @tab @code{int acc_async_test(int arg);}
4001@end multitable
4002
4003@item @emph{Fortran}:
4004@multitable @columnfractions .20 .80
4005@item @emph{Interface}: @tab @code{function acc_async_test(arg)}
4006@item @tab @code{integer(kind=acc_handle_kind) arg}
4007@item @tab @code{logical acc_async_test}
4008@end multitable
4009
4010@item @emph{Reference}:
4011@uref{https://www.openacc.org, OpenACC specification v2.6}, section
40123.2.9.
4013@end table
4014
4015
4016
4017@node acc_async_test_all
4018@section @code{acc_async_test_all} -- Tests for completion of all asynchronous operations.
4019@table @asis
4020@item @emph{Description}
4021This function tests for completion of all asynchronous operations.
4022In C/C++, a non-zero value is returned to indicate all asynchronous
4023operations have completed while Fortran returns @code{true}. If
4024any asynchronous operation has not completed, C/C++ returns zero and
4025Fortran returns @code{false}.
4026
4027@item @emph{C/C++}:
4028@multitable @columnfractions .20 .80
4029@item @emph{Prototype}: @tab @code{int acc_async_test_all(void);}
4030@end multitable
4031
4032@item @emph{Fortran}:
4033@multitable @columnfractions .20 .80
4034@item @emph{Interface}: @tab @code{function acc_async_test_all()}
4035@item @tab @code{logical acc_async_test_all}
4036@end multitable
4037
4038@item @emph{Reference}:
4039@uref{https://www.openacc.org, OpenACC specification v2.6}, section
40403.2.10.
4041@end table
4042
4043
4044
4045@node acc_wait
4046@section @code{acc_wait} -- Wait for completion of a specific asynchronous operation.
4047@table @asis
4048@item @emph{Description}
4049This function waits for completion of the asynchronous operation
4050specified in @var{arg}.
4051
4052@item @emph{C/C++}:
4053@multitable @columnfractions .20 .80
4054@item @emph{Prototype}: @tab @code{void acc_wait(int arg);}
4055@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{void acc_async_wait(int arg);}
4056@end multitable
4057
4058@item @emph{Fortran}:
4059@multitable @columnfractions .20 .80
4060@item @emph{Interface}: @tab @code{subroutine acc_wait(arg)}
4061@item @tab @code{integer(acc_handle_kind) arg}
4062@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait(arg)}
4063@item @tab @code{integer(acc_handle_kind) arg}
4064@end multitable
4065
4066@item @emph{Reference}:
4067@uref{https://www.openacc.org, OpenACC specification v2.6}, section
40683.2.11.
4069@end table
4070
4071
4072
4073@node acc_wait_all
4074@section @code{acc_wait_all} -- Waits for completion of all asynchronous operations.
4075@table @asis
4076@item @emph{Description}
4077This function waits for the completion of all asynchronous operations.
4078
4079@item @emph{C/C++}:
4080@multitable @columnfractions .20 .80
4081@item @emph{Prototype}: @tab @code{void acc_wait_all(void);}
4082@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{void acc_async_wait_all(void);}
4083@end multitable
4084
4085@item @emph{Fortran}:
4086@multitable @columnfractions .20 .80
4087@item @emph{Interface}: @tab @code{subroutine acc_wait_all()}
4088@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait_all()}
4089@end multitable
4090
4091@item @emph{Reference}:
4092@uref{https://www.openacc.org, OpenACC specification v2.6}, section
40933.2.13.
4094@end table
4095
4096
4097
4098@node acc_wait_all_async
4099@section @code{acc_wait_all_async} -- Wait for completion of all asynchronous operations.
4100@table @asis
4101@item @emph{Description}
4102This function enqueues a wait operation on the queue @var{async} for any
4103and all asynchronous operations that have been previously enqueued on
4104any queue.
4105
4106@item @emph{C/C++}:
4107@multitable @columnfractions .20 .80
4108@item @emph{Prototype}: @tab @code{void acc_wait_all_async(int async);}
4109@end multitable
4110
4111@item @emph{Fortran}:
4112@multitable @columnfractions .20 .80
4113@item @emph{Interface}: @tab @code{subroutine acc_wait_all_async(async)}
4114@item @tab @code{integer(acc_handle_kind) async}
4115@end multitable
4116
4117@item @emph{Reference}:
4118@uref{https://www.openacc.org, OpenACC specification v2.6}, section
41193.2.14.
4120@end table
4121
4122
4123
4124@node acc_wait_async
4125@section @code{acc_wait_async} -- Wait for completion of asynchronous operations.
4126@table @asis
4127@item @emph{Description}
4128This function enqueues a wait operation on queue @var{async} for any and all
4129asynchronous operations enqueued on queue @var{arg}.
4130
4131@item @emph{C/C++}:
4132@multitable @columnfractions .20 .80
4133@item @emph{Prototype}: @tab @code{void acc_wait_async(int arg, int async);}
4134@end multitable
4135
4136@item @emph{Fortran}:
4137@multitable @columnfractions .20 .80
4138@item @emph{Interface}: @tab @code{subroutine acc_wait_async(arg, async)}
4139@item @tab @code{integer(acc_handle_kind) arg, async}
4140@end multitable
4141
4142@item @emph{Reference}:
4143@uref{https://www.openacc.org, OpenACC specification v2.6}, section
41443.2.12.
4145@end table
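
For illustration only, a C sketch in which queue 2 is made to wait for
work enqueued on queue 1 without blocking the host; the function
@code{chain_queues} and its arguments are made up for this example.

@smallexample
#include <openacc.h>

void
chain_queues (float *a, float *b, int n)
@{
#pragma acc enter data copyin(a[0:n], b[0:n])
#pragma acc parallel loop present(a[0:n]) async(1)
  for (int i = 0; i < n; i++)
    a[i] *= 2.0f;
  acc_wait_async (1, 2);   /* queue 2 now waits for queue 1 */
#pragma acc parallel loop present(a[0:n], b[0:n]) async(2)
  for (int i = 0; i < n; i++)
    b[i] += a[i];
  acc_wait (2);            /* the host waits for both operations */
#pragma acc exit data copyout(b[0:n]) delete(a[0:n])
@}
@end smallexample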
4146
4147
4148
4149@node acc_init
4150@section @code{acc_init} -- Initialize runtime for a specific device type.
4151@table @asis
4152@item @emph{Description}
4153This function initializes the runtime for the device type specified in
4154@var{devicetype}.
4155
4156@item @emph{C/C++}:
4157@multitable @columnfractions .20 .80
4158@item @emph{Prototype}: @tab @code{void acc_init(acc_device_t devicetype);}
4159@end multitable
4160
4161@item @emph{Fortran}:
4162@multitable @columnfractions .20 .80
4163@item @emph{Interface}: @tab @code{subroutine acc_init(devicetype)}
4164@item @tab @code{integer(acc_device_kind) devicetype}
4165@end multitable
4166
4167@item @emph{Reference}:
4168@uref{https://www.openacc.org, OpenACC specification v2.6}, section
41693.2.7.
4170@end table
4171
4172
4173
4174@node acc_shutdown
4175@section @code{acc_shutdown} -- Shuts down the runtime for a specific device type.
4176@table @asis
4177@item @emph{Description}
4178This function shuts down the runtime for the device type specified in
4179@var{devicetype}.
4180
4181@item @emph{C/C++}:
4182@multitable @columnfractions .20 .80
4183@item @emph{Prototype}: @tab @code{void acc_shutdown(acc_device_t devicetype);}
4184@end multitable
4185
4186@item @emph{Fortran}:
4187@multitable @columnfractions .20 .80
4188@item @emph{Interface}: @tab @code{subroutine acc_shutdown(devicetype)}
4189@item @tab @code{integer(acc_device_kind) devicetype}
4190@end multitable
4191
4192@item @emph{Reference}:
4193@uref{https://www.openacc.org, OpenACC specification v2.6}, section
41943.2.8.
4195@end table
4196
4197
4198
4199@node acc_on_device
4200@section @code{acc_on_device} -- Whether executing on a particular device
4201@table @asis
4202@item @emph{Description}:
4203This function returns whether the program is executing on a particular
4204device specified in @var{devicetype}. In C/C++, a non-zero value is
4205returned to indicate the program is executing on the specified device type.
4206In Fortran, @code{true} is returned. If the program is not executing
4207on the specified device type, C/C++ returns zero, while Fortran
4208returns @code{false}.
4209
4210@item @emph{C/C++}:
4211@multitable @columnfractions .20 .80
4212@item @emph{Prototype}: @tab @code{int acc_on_device(acc_device_t devicetype);}
4213@end multitable
4214
4215@item @emph{Fortran}:
4216@multitable @columnfractions .20 .80
4217@item @emph{Interface}: @tab @code{function acc_on_device(devicetype)}
4218@item @tab @code{integer(acc_device_kind) devicetype}
4219@item @tab @code{logical acc_on_device}
4220@end multitable
4221
4222
4223@item @emph{Reference}:
4224@uref{https://www.openacc.org, OpenACC specification v2.6}, section
42253.2.17.
4226@end table
4227
4228
4229
4230@node acc_malloc
4231@section @code{acc_malloc} -- Allocate device memory.
4232@table @asis
4233@item @emph{Description}
4234This function allocates @var{len} bytes of device memory. It returns
4235the device address of the allocated memory.
4236
4237@item @emph{C/C++}:
4238@multitable @columnfractions .20 .80
4239@item @emph{Prototype}: @tab @code{d_void* acc_malloc(size_t len);}
4240@end multitable
4241
4242@item @emph{Reference}:
4243@uref{https://www.openacc.org, OpenACC specification v2.6}, section
42443.2.18.
4245@end table
4246
4247
4248
4249@node acc_free
4250@section @code{acc_free} -- Free device memory.
4251@table @asis
4252@item @emph{Description}
4253Free previously allocated device memory at the device address @code{a}.
4254
4255@item @emph{C/C++}:
4256@multitable @columnfractions .20 .80
4257@item @emph{Prototype}: @tab @code{void acc_free(d_void *a);}
4258@end multitable
4259
4260@item @emph{Reference}:
4261@uref{https://www.openacc.org, OpenACC specification v2.6}, section
42623.2.19.
4263@end table
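
For illustration only, a C sketch combining @code{acc_malloc} and
@code{acc_free} with the @code{deviceptr} clause; the function
@code{device_only_buffer} is made up for this example.

@smallexample
#include <openacc.h>

void
device_only_buffer (int n)
@{
  /* Raw device memory is never dereferenced on the host; it is only
     used inside compute regions via the deviceptr clause.  */
  float *d_buf = (float *) acc_malloc (n * sizeof (float));
#pragma acc parallel loop deviceptr(d_buf)
  for (int i = 0; i < n; i++)
    d_buf[i] = (float) i;
  acc_free (d_buf);
@}
@end smallexample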
4264
4265
4266
4267@node acc_copyin
4268@section @code{acc_copyin} -- Allocate device memory and copy host memory to it.
4269@table @asis
4270@item @emph{Description}
4271In C/C++, this function allocates @var{len} bytes of device memory
4272and maps it to the specified host address in @var{a}. The device
4273address of the newly allocated device memory is returned.
4274
4275In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4276a contiguous array section. In the second form, @var{a} specifies a
4277variable or array element and @var{len} specifies the length in bytes.
4278
4279@item @emph{C/C++}:
4280@multitable @columnfractions .20 .80
4281@item @emph{Prototype}: @tab @code{void *acc_copyin(h_void *a, size_t len);}
4282@item @emph{Prototype}: @tab @code{void *acc_copyin_async(h_void *a, size_t len, int async);}
4283@end multitable
4284
4285@item @emph{Fortran}:
4286@multitable @columnfractions .20 .80
4287@item @emph{Interface}: @tab @code{subroutine acc_copyin(a)}
4288@item @tab @code{type, dimension(:[,:]...) :: a}
4289@item @emph{Interface}: @tab @code{subroutine acc_copyin(a, len)}
4290@item @tab @code{type, dimension(:[,:]...) :: a}
4291@item @tab @code{integer len}
4292@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, async)}
4293@item @tab @code{type, dimension(:[,:]...) :: a}
4294@item @tab @code{integer(acc_handle_kind) :: async}
4295@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, len, async)}
4296@item @tab @code{type, dimension(:[,:]...) :: a}
4297@item @tab @code{integer len}
4298@item @tab @code{integer(acc_handle_kind) :: async}
4299@end multitable
4300
4301@item @emph{Reference}:
4302@uref{https://www.openacc.org, OpenACC specification v2.6}, section
43033.2.20.
4304@end table
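
For illustration only, a C sketch of the typical pairing of
@code{acc_copyin} with a later @code{acc_copyout} (documented below);
the function @code{double_on_device} is made up for this example.

@smallexample
#include <openacc.h>

void
double_on_device (float *h, int n)
@{
  acc_copyin (h, n * sizeof (float));   /* map h and copy it in */
#pragma acc parallel loop present(h[0:n])
  for (int i = 0; i < n; i++)
    h[i] *= 2.0f;
  acc_copyout (h, n * sizeof (float));  /* copy back and unmap */
@}
@end smallexample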
4305
4306
4307
4308@node acc_present_or_copyin
4309@section @code{acc_present_or_copyin} -- If the data is not present on the device, allocate device memory and copy from host memory.
4310@table @asis
4311@item @emph{Description}
4312This function tests if the host data specified by @var{a} and of length
4313@var{len} is present or not. If it is not present, device memory
4314is allocated and the host memory copied. The device address of
4315the newly allocated device memory is returned.
4316
4317In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4318a contiguous array section. In the second form, @var{a} specifies a variable or
4319array element and @var{len} specifies the length in bytes.
4320
4321Note that @code{acc_present_or_copyin} and @code{acc_pcopyin} exist for
4322backward compatibility with OpenACC 2.0; use @ref{acc_copyin} instead.
4323
4324@item @emph{C/C++}:
4325@multitable @columnfractions .20 .80
4326@item @emph{Prototype}: @tab @code{void *acc_present_or_copyin(h_void *a, size_t len);}
4327@item @emph{Prototype}: @tab @code{void *acc_pcopyin(h_void *a, size_t len);}
4328@end multitable
4329
4330@item @emph{Fortran}:
4331@multitable @columnfractions .20 .80
4332@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a)}
4333@item @tab @code{type, dimension(:[,:]...) :: a}
4334@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a, len)}
4335@item @tab @code{type, dimension(:[,:]...) :: a}
4336@item @tab @code{integer len}
4337@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a)}
4338@item @tab @code{type, dimension(:[,:]...) :: a}
4339@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a, len)}
4340@item @tab @code{type, dimension(:[,:]...) :: a}
4341@item @tab @code{integer len}
4342@end multitable
4343
4344@item @emph{Reference}:
4345@uref{https://www.openacc.org, OpenACC specification v2.6}, section
43463.2.20.
4347@end table
4348
4349
4350
4351@node acc_create
4352@section @code{acc_create} -- Allocate device memory and map it to host memory.
4353@table @asis
4354@item @emph{Description}
4355This function allocates device memory and maps it to host memory specified
4356by the host address @var{a} with a length of @var{len} bytes. In C/C++,
4357the function returns the device address of the allocated device memory.
4358
4359In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4360a contiguous array section. In the second form, @var{a} specifies a variable or
4361array element and @var{len} specifies the length in bytes.
4362
4363@item @emph{C/C++}:
4364@multitable @columnfractions .20 .80
4365@item @emph{Prototype}: @tab @code{void *acc_create(h_void *a, size_t len);}
4366@item @emph{Prototype}: @tab @code{void *acc_create_async(h_void *a, size_t len, int async);}
4367@end multitable
4368
4369@item @emph{Fortran}:
4370@multitable @columnfractions .20 .80
4371@item @emph{Interface}: @tab @code{subroutine acc_create(a)}
4372@item @tab @code{type, dimension(:[,:]...) :: a}
4373@item @emph{Interface}: @tab @code{subroutine acc_create(a, len)}
4374@item @tab @code{type, dimension(:[,:]...) :: a}
4375@item @tab @code{integer len}
4376@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, async)}
4377@item @tab @code{type, dimension(:[,:]...) :: a}
4378@item @tab @code{integer(acc_handle_kind) :: async}
4379@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, len, async)}
4380@item @tab @code{type, dimension(:[,:]...) :: a}
4381@item @tab @code{integer len}
4382@item @tab @code{integer(acc_handle_kind) :: async}
4383@end multitable
4384
4385@item @emph{Reference}:
4386@uref{https://www.openacc.org, OpenACC specification v2.6}, section
43873.2.21.
4388@end table
4389
4390
4391
4392@node acc_present_or_create
4393@section @code{acc_present_or_create} -- If the data is not present on the device, allocate device memory and map it to host memory.
4394@table @asis
4395@item @emph{Description}
4396This function tests if the host data specified by @var{a} and of length
4397@var{len} is present or not. If it is not present, device memory
4398is allocated and mapped to host memory. In C/C++, the device address
4399of the newly allocated device memory is returned.
4400
4401In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4402a contiguous array section. In the second form, @var{a} specifies a variable or
4403array element and @var{len} specifies the length in bytes.
4404
4405Note that @code{acc_present_or_create} and @code{acc_pcreate} exist for
4406backward compatibility with OpenACC 2.0; use @ref{acc_create} instead.
4407
4408@item @emph{C/C++}:
4409@multitable @columnfractions .20 .80
4410@item @emph{Prototype}: @tab @code{void *acc_present_or_create(h_void *a, size_t len)}
4411@item @emph{Prototype}: @tab @code{void *acc_pcreate(h_void *a, size_t len)}
4412@end multitable
4413
4414@item @emph{Fortran}:
4415@multitable @columnfractions .20 .80
4416@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a)}
4417@item @tab @code{type, dimension(:[,:]...) :: a}
4418@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a, len)}
4419@item @tab @code{type, dimension(:[,:]...) :: a}
4420@item @tab @code{integer len}
4421@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a)}
4422@item @tab @code{type, dimension(:[,:]...) :: a}
4423@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a, len)}
4424@item @tab @code{type, dimension(:[,:]...) :: a}
4425@item @tab @code{integer len}
4426@end multitable
4427
4428@item @emph{Reference}:
4429@uref{https://www.openacc.org, OpenACC specification v2.6}, section
44303.2.21.
4431@end table
4432
4433
4434
4435@node acc_copyout
4436@section @code{acc_copyout} -- Copy device memory to host memory.
4437@table @asis
4438@item @emph{Description}
4439In C/C++, this function copies mapped device memory to the host memory
4440specified by the host address @var{a} for a length of @var{len} bytes.
4441
4442In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4443a contiguous array section. In the second form, @var{a} specifies a variable or
4444array element and @var{len} specifies the length in bytes.
4445
4446@item @emph{C/C++}:
4447@multitable @columnfractions .20 .80
4448@item @emph{Prototype}: @tab @code{void acc_copyout(h_void *a, size_t len);}
4449@item @emph{Prototype}: @tab @code{void acc_copyout_async(h_void *a, size_t len, int async);}
4450@item @emph{Prototype}: @tab @code{void acc_copyout_finalize(h_void *a, size_t len);}
4451@item @emph{Prototype}: @tab @code{void acc_copyout_finalize_async(h_void *a, size_t len, int async);}
4452@end multitable
4453
4454@item @emph{Fortran}:
4455@multitable @columnfractions .20 .80
4456@item @emph{Interface}: @tab @code{subroutine acc_copyout(a)}
4457@item @tab @code{type, dimension(:[,:]...) :: a}
4458@item @emph{Interface}: @tab @code{subroutine acc_copyout(a, len)}
4459@item @tab @code{type, dimension(:[,:]...) :: a}
4460@item @tab @code{integer len}
4461@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, async)}
4462@item @tab @code{type, dimension(:[,:]...) :: a}
4463@item @tab @code{integer(acc_handle_kind) :: async}
4464@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, len, async)}
4465@item @tab @code{type, dimension(:[,:]...) :: a}
4466@item @tab @code{integer len}
4467@item @tab @code{integer(acc_handle_kind) :: async}
4468@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a)}
4469@item @tab @code{type, dimension(:[,:]...) :: a}
4470@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a, len)}
4471@item @tab @code{type, dimension(:[,:]...) :: a}
4472@item @tab @code{integer len}
4473@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, async)}
4474@item @tab @code{type, dimension(:[,:]...) :: a}
4475@item @tab @code{integer(acc_handle_kind) :: async}
4476@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, len, async)}
4477@item @tab @code{type, dimension(:[,:]...) :: a}
4478@item @tab @code{integer len}
4479@item @tab @code{integer(acc_handle_kind) :: async}
4480@end multitable
4481
4482@item @emph{Reference}:
4483@uref{https://www.openacc.org, OpenACC specification v2.6}, section
44843.2.22.
4485@end table
4486
4487
4488
4489@node acc_delete
4490@section @code{acc_delete} -- Free device memory.
4491@table @asis
4492@item @emph{Description}
4493This function frees the device memory associated with the host memory
4494specified by the host address @var{a} and a length of @var{len} bytes.
4495
4496In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4497a contiguous array section. In the second form, @var{a} specifies a variable or
4498array element and @var{len} specifies the length in bytes.
4499
4500@item @emph{C/C++}:
4501@multitable @columnfractions .20 .80
4502@item @emph{Prototype}: @tab @code{void acc_delete(h_void *a, size_t len);}
4503@item @emph{Prototype}: @tab @code{void acc_delete_async(h_void *a, size_t len, int async);}
4504@item @emph{Prototype}: @tab @code{void acc_delete_finalize(h_void *a, size_t len);}
4505@item @emph{Prototype}: @tab @code{void acc_delete_finalize_async(h_void *a, size_t len, int async);}
4506@end multitable
4507
4508@item @emph{Fortran}:
4509@multitable @columnfractions .20 .80
4510@item @emph{Interface}: @tab @code{subroutine acc_delete(a)}
4511@item @tab @code{type, dimension(:[,:]...) :: a}
4512@item @emph{Interface}: @tab @code{subroutine acc_delete(a, len)}
4513@item @tab @code{type, dimension(:[,:]...) :: a}
4514@item @tab @code{integer len}
4515@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, async)}
4516@item @tab @code{type, dimension(:[,:]...) :: a}
4517@item @tab @code{integer(acc_handle_kind) :: async}
4518@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, len, async)}
4519@item @tab @code{type, dimension(:[,:]...) :: a}
4520@item @tab @code{integer len}
4521@item @tab @code{integer(acc_handle_kind) :: async}
4522@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a)}
4523@item @tab @code{type, dimension(:[,:]...) :: a}
4524@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a, len)}
4525@item @tab @code{type, dimension(:[,:]...) :: a}
4526@item @tab @code{integer len}
4527@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, async)}
4528@item @tab @code{type, dimension(:[,:]...) :: a}
4529@item @tab @code{integer(acc_handle_kind) :: async}
4530@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, len, async)}
4531@item @tab @code{type, dimension(:[,:]...) :: a}
4532@item @tab @code{integer len}
4533@item @tab @code{integer(acc_handle_kind) :: async}
4534@end multitable
4535
4536@item @emph{Reference}:
4537@uref{https://www.openacc.org, OpenACC specification v2.6}, section
45383.2.23.
4539@end table
4540
4541
4542
4543@node acc_update_device
4544@section @code{acc_update_device} -- Update device memory from mapped host memory.
4545@table @asis
4546@item @emph{Description}
4547This function updates the device copy from the previously mapped host memory.
4548The host memory is specified with the host address @var{a} and a length of
4549@var{len} bytes.
4550
4551In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4552a contiguous array section. In the second form, @var{a} specifies a variable or
4553array element and @var{len} specifies the length in bytes.
4554
4555@item @emph{C/C++}:
4556@multitable @columnfractions .20 .80
4557@item @emph{Prototype}: @tab @code{void acc_update_device(h_void *a, size_t len);}
4558@item @emph{Prototype}: @tab @code{void acc_update_device_async(h_void *a, size_t len, int async);}
4559@end multitable
4560
4561@item @emph{Fortran}:
4562@multitable @columnfractions .20 .80
4563@item @emph{Interface}: @tab @code{subroutine acc_update_device(a)}
4564@item @tab @code{type, dimension(:[,:]...) :: a}
4565@item @emph{Interface}: @tab @code{subroutine acc_update_device(a, len)}
4566@item @tab @code{type, dimension(:[,:]...) :: a}
4567@item @tab @code{integer len}
4568@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, async)}
4569@item @tab @code{type, dimension(:[,:]...) :: a}
4570@item @tab @code{integer(acc_handle_kind) :: async}
4571@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, len, async)}
4572@item @tab @code{type, dimension(:[,:]...) :: a}
4573@item @tab @code{integer len}
4574@item @tab @code{integer(acc_handle_kind) :: async}
4575@end multitable
4576
4577@item @emph{Reference}:
4578@uref{https://www.openacc.org, OpenACC specification v2.6}, section
45793.2.24.
4580@end table
4581
4582
4583
4584@node acc_update_self
4585@section @code{acc_update_self} -- Update host memory from mapped device memory.
4586@table @asis
4587@item @emph{Description}
4588This function updates the host copy from the previously mapped device memory.
4589The host memory is specified with the host address @var{a} and a length of
4590@var{len} bytes.
4591
4592In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4593a contiguous array section. In the second form, @var{a} specifies a variable or
4594array element and @var{len} specifies the length in bytes.
4595
4596@item @emph{C/C++}:
4597@multitable @columnfractions .20 .80
4598@item @emph{Prototype}: @tab @code{void acc_update_self(h_void *a, size_t len);}
4599@item @emph{Prototype}: @tab @code{void acc_update_self_async(h_void *a, size_t len, int async);}
4600@end multitable
4601
4602@item @emph{Fortran}:
4603@multitable @columnfractions .20 .80
4604@item @emph{Interface}: @tab @code{subroutine acc_update_self(a)}
4605@item @tab @code{type, dimension(:[,:]...) :: a}
4606@item @emph{Interface}: @tab @code{subroutine acc_update_self(a, len)}
4607@item @tab @code{type, dimension(:[,:]...) :: a}
4608@item @tab @code{integer len}
4609@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, async)}
4610@item @tab @code{type, dimension(:[,:]...) :: a}
4611@item @tab @code{integer(acc_handle_kind) :: async}
4612@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, len, async)}
4613@item @tab @code{type, dimension(:[,:]...) :: a}
4614@item @tab @code{integer len}
4615@item @tab @code{integer(acc_handle_kind) :: async}
4616@end multitable
4617
4618@item @emph{Reference}:
4619@uref{https://www.openacc.org, OpenACC specification v2.6}, section
46203.2.25.
4621@end table
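
For illustration only, a C sketch showing how @code{acc_create}, the two
update routines and @code{acc_delete} fit together; the function
@code{update_roundtrip} is made up for this example.

@smallexample
#include <openacc.h>

void
update_roundtrip (float *h, int n)
@{
  acc_create (h, n * sizeof (float));        /* map h; device data undefined */
  for (int i = 0; i < n; i++)
    h[i] = (float) i;                        /* initialize on the host */
  acc_update_device (h, n * sizeof (float)); /* host -> device */
#pragma acc parallel loop present(h[0:n])
  for (int i = 0; i < n; i++)
    h[i] += 1.0f;
  acc_update_self (h, n * sizeof (float));   /* device -> host */
  acc_delete (h, n * sizeof (float));
@}
@end smallexample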
4622
4623
4624
4625@node acc_map_data
4626@section @code{acc_map_data} -- Map previously allocated device memory to host memory.
4627@table @asis
4628@item @emph{Description}
4629This function maps previously allocated device and host memory. The device
4630memory is specified with the device address @var{d}. The host memory is
4631specified with the host address @var{h} and a length of @var{len} bytes.
4632
4633@item @emph{C/C++}:
4634@multitable @columnfractions .20 .80
4635@item @emph{Prototype}: @tab @code{void acc_map_data(h_void *h, d_void *d, size_t len);}
4636@end multitable
4637
4638@item @emph{Reference}:
4639@uref{https://www.openacc.org, OpenACC specification v2.6}, section
46403.2.26.
4641@end table
4642
4643
4644
4645@node acc_unmap_data
4646@section @code{acc_unmap_data} -- Unmap device memory from host memory.
4647@table @asis
4648@item @emph{Description}
4649This function unmaps previously mapped device and host memory. The latter
4650is specified by the host address @var{h}.
4651
4652@item @emph{C/C++}:
4653@multitable @columnfractions .20 .80
4654@item @emph{Prototype}: @tab @code{void acc_unmap_data(h_void *h);}
4655@end multitable
4656
4657@item @emph{Reference}:
4658@uref{https://www.openacc.org, OpenACC specification v2.6}, section
46593.2.27.
4660@end table
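
For illustration only, a C sketch that pairs separately allocated host
and device memory with @code{acc_map_data} and dissolves the mapping
again with @code{acc_unmap_data}; the function @code{map_unmap} is made
up for this example.

@smallexample
#include <openacc.h>
#include <stdlib.h>

void
map_unmap (int n)
@{
  float *host = (float *) malloc (n * sizeof (float));
  void *dev = acc_malloc (n * sizeof (float));
  acc_map_data (host, dev, n * sizeof (float));
#pragma acc parallel loop present(host[0:n])
  for (int i = 0; i < n; i++)
    host[i] = (float) i;
  acc_unmap_data (host);
  acc_free (dev);
  free (host);
@}
@end smallexample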
4661
4662
4663
4664@node acc_deviceptr
4665@section @code{acc_deviceptr} -- Get device pointer associated with specific host address.
4666@table @asis
4667@item @emph{Description}
4668This function returns the device address that has been mapped to the
4669host address specified by @var{h}.
4670
4671@item @emph{C/C++}:
4672@multitable @columnfractions .20 .80
4673@item @emph{Prototype}: @tab @code{void *acc_deviceptr(h_void *h);}
4674@end multitable
4675
4676@item @emph{Reference}:
4677@uref{https://www.openacc.org, OpenACC specification v2.6}, section
46783.2.28.
4679@end table
4680
4681
4682
4683@node acc_hostptr
4684@section @code{acc_hostptr} -- Get host pointer associated with specific device address.
4685@table @asis
4686@item @emph{Description}
4687This function returns the host address that has been mapped to the
4688device address specified by @var{d}.
4689
4690@item @emph{C/C++}:
4691@multitable @columnfractions .20 .80
4692@item @emph{Prototype}: @tab @code{void *acc_hostptr(d_void *d);}
4693@end multitable
4694
4695@item @emph{Reference}:
4696@uref{https://www.openacc.org, OpenACC specification v2.6}, section
46973.2.29.
4698@end table
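
For illustration only, a C sketch that translates between the host and
device addresses of a mapping; the function @code{translate_addresses}
is made up for this example.

@smallexample
#include <openacc.h>
#include <assert.h>

void
translate_addresses (float *h, int n)
@{
  acc_copyin (h, n * sizeof (float));
  void *d = acc_deviceptr (h);             /* device address of h */
  assert (acc_hostptr (d) == (void *) h);  /* back to the host address */
  acc_delete (h, n * sizeof (float));
@}
@end smallexample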
4699
4700
4701
4702@node acc_is_present
4703@section @code{acc_is_present} -- Indicate whether host variable / array is present on device.
4704@table @asis
4705@item @emph{Description}
4706This function indicates whether the specified host address in @var{a} and a
4707length of @var{len} bytes is present on the device. In C/C++, a non-zero
4708value is returned to indicate the presence of the mapped memory on the
4709device. A zero is returned to indicate the memory is not mapped on the
4710device.
4711
4712In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
4713a contiguous array section. In the second form, @var{a} specifies a variable or
4714array element and @var{len} specifies the length in bytes. If the host
4715memory is mapped to device memory, then @code{true} is returned. Otherwise,
4716@code{false} is returned to indicate the memory is not present on the device.
4717
4718@item @emph{C/C++}:
4719@multitable @columnfractions .20 .80
4720@item @emph{Prototype}: @tab @code{int acc_is_present(h_void *a, size_t len);}
4721@end multitable
4722
4723@item @emph{Fortran}:
4724@multitable @columnfractions .20 .80
4725@item @emph{Interface}: @tab @code{function acc_is_present(a)}
4726@item @tab @code{type, dimension(:[,:]...) :: a}
4727@item @tab @code{logical acc_is_present}
4728@item @emph{Interface}: @tab @code{function acc_is_present(a, len)}
4729@item @tab @code{type, dimension(:[,:]...) :: a}
4730@item @tab @code{integer len}
4731@item @tab @code{logical acc_is_present}
4732@end multitable
4733
4734@item @emph{Reference}:
4735@uref{https://www.openacc.org, OpenACC specification v2.6}, section
47363.2.30.
4737@end table
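
For illustration only, a C sketch that copies data in only if no mapping
exists yet; the function @code{ensure_present} is made up for this
example.

@smallexample
#include <openacc.h>

void
ensure_present (float *h, size_t len)
@{
  if (!acc_is_present (h, len))
    acc_copyin (h, len);
@}
@end smallexample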
4738
4739
4740
4741@node acc_memcpy_to_device
4742@section @code{acc_memcpy_to_device} -- Copy host memory to device memory.
4743@table @asis
4744@item @emph{Description}
4745This function copies host memory specified by host address of @var{src} to
4746device memory specified by the device address @var{dest} for a length of
4747@var{bytes} bytes.
4748
4749@item @emph{C/C++}:
4750@multitable @columnfractions .20 .80
4751@item @emph{Prototype}: @tab @code{void acc_memcpy_to_device(d_void *dest, h_void *src, size_t bytes);}
4752@end multitable
4753
4754@item @emph{Reference}:
4755@uref{https://www.openacc.org, OpenACC specification v2.6}, section
47563.2.31.
4757@end table
4758
4759
4760
4761@node acc_memcpy_from_device
4762@section @code{acc_memcpy_from_device} -- Copy device memory to host memory.
4763@table @asis
4764@item @emph{Description}
4765This function copies device memory specified by the device address @var{src}
4766to host memory specified by the host address @var{dest} for a length of
4767@var{bytes} bytes.
4768
4769@item @emph{C/C++}:
4770@multitable @columnfractions .20 .80
4771@item @emph{Prototype}: @tab @code{void acc_memcpy_from_device(h_void *dest, d_void *src, size_t bytes);}
4772@end multitable
4773
4774@item @emph{Reference}:
4775@uref{https://www.openacc.org, OpenACC specification v2.6}, section
47763.2.32.
4777@end table
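
For illustration only, a C sketch that moves data to and from raw device
memory obtained with @code{acc_malloc}, without creating a host/device
mapping; the function @code{roundtrip_copy} is made up for this example.

@smallexample
#include <openacc.h>

void
roundtrip_copy (float *h, int n)
@{
  float *d_buf = (float *) acc_malloc (n * sizeof (float));
  acc_memcpy_to_device (d_buf, h, n * sizeof (float));
#pragma acc parallel loop deviceptr(d_buf)
  for (int i = 0; i < n; i++)
    d_buf[i] *= 2.0f;
  acc_memcpy_from_device (h, d_buf, n * sizeof (float));
  acc_free (d_buf);
@}
@end smallexample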
4778
4779
4780
4781@node acc_attach
4782@section @code{acc_attach} -- Let device pointer point to device-pointer target.
4783@table @asis
4784@item @emph{Description}
4785This function updates a pointer on the device from pointing to a host-pointer
4786address to pointing to the corresponding device data.
4787
4788@item @emph{C/C++}:
4789@multitable @columnfractions .20 .80
4790@item @emph{Prototype}: @tab @code{void acc_attach(h_void **ptr);}
4791@item @emph{Prototype}: @tab @code{void acc_attach_async(h_void **ptr, int async);}
4792@end multitable
4793
4794@item @emph{Reference}:
4795@uref{https://www.openacc.org, OpenACC specification v2.6}, section
47963.2.34.
4797@end table
4798
4799
4800
4801@node acc_detach
4802@section @code{acc_detach} -- Let device pointer point to host-pointer target.
4803@table @asis
4804@item @emph{Description}
4805This function updates a pointer on the device from pointing to a device-pointer
4806address to pointing to the corresponding host data.
4807
4808@item @emph{C/C++}:
4809@multitable @columnfractions .20 .80
4810@item @emph{Prototype}: @tab @code{void acc_detach(h_void **ptr);}
4811@item @emph{Prototype}: @tab @code{void acc_detach_async(h_void **ptr, int async);}
4812@item @emph{Prototype}: @tab @code{void acc_detach_finalize(h_void **ptr);}
4813@item @emph{Prototype}: @tab @code{void acc_detach_finalize_async(h_void **ptr, int async);}
4814@end multitable
4815
4816@item @emph{Reference}:
4817@uref{https://www.openacc.org, OpenACC specification v2.6}, section
48183.2.35.
4819@end table
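
For illustration only, a C sketch in which a structure and the array it
points to are mapped, and @code{acc_attach} / @code{acc_detach} adjust
the device copy of the pointer member; the type @code{struct vec} and
the function @code{map_vec} are made up for this example.

@smallexample
#include <openacc.h>

struct vec @{ float *data; int n; @};

void
map_vec (struct vec *v)
@{
  acc_copyin (v, sizeof (*v));
  acc_copyin (v->data, v->n * sizeof (float));
  acc_attach ((void **) &v->data);   /* device copy of v->data now points
                                        to the device copy of the array */
  /* ... compute regions using the device copy of *v ... */
  acc_detach ((void **) &v->data);
  acc_delete (v->data, v->n * sizeof (float));
  acc_delete (v, sizeof (*v));
@}
@end smallexample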
4820
4821
4822
4823@node acc_get_current_cuda_device
4824@section @code{acc_get_current_cuda_device} -- Get CUDA device handle.
4825@table @asis
4826@item @emph{Description}
4827This function returns the CUDA device handle. This handle is the same
4828as used by the CUDA Runtime or Driver APIs.
4829
4830@item @emph{C/C++}:
4831@multitable @columnfractions .20 .80
4832@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_device(void);}
4833@end multitable
4834
4835@item @emph{Reference}:
4836@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4837A.2.1.1.
4838@end table
4839
4840
4841
4842@node acc_get_current_cuda_context
4843@section @code{acc_get_current_cuda_context} -- Get CUDA context handle.
4844@table @asis
4845@item @emph{Description}
4846This function returns the CUDA context handle. This handle is the same
4847as used by the CUDA Runtime or Driver APIs.
4848
4849@item @emph{C/C++}:
4850@multitable @columnfractions .20 .80
4851@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_context(void);}
4852@end multitable
4853
4854@item @emph{Reference}:
4855@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4856A.2.1.2.
4857@end table
4858
4859
4860
4861@node acc_get_cuda_stream
4862@section @code{acc_get_cuda_stream} -- Get CUDA stream handle.
4863@table @asis
4864@item @emph{Description}
4865This function returns the CUDA stream handle for the queue @var{async}.
4866This handle is the same as used by the CUDA Runtime or Driver APIs.
4867
4868@item @emph{C/C++}:
4869@multitable @columnfractions .20 .80
4870@item @emph{Prototype}: @tab @code{void *acc_get_cuda_stream(int async);}
4871@end multitable
4872
4873@item @emph{Reference}:
4874@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4875A.2.1.3.
4876@end table
4877
4878
4879
4880@node acc_set_cuda_stream
4881@section @code{acc_set_cuda_stream} -- Set CUDA stream handle.
4882@table @asis
4883@item @emph{Description}
4884This function associates the stream handle specified by @var{stream} with
4885the queue @var{async}.
4886
4887This cannot be used to change the stream handle associated with
4888@code{acc_async_sync}.
4889
4890The return value is not specified.
4891
4892@item @emph{C/C++}:
4893@multitable @columnfractions .20 .80
4894@item @emph{Prototype}: @tab @code{int acc_set_cuda_stream(int async, void *stream);}
4895@end multitable
4896
4897@item @emph{Reference}:
4898@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4899A.2.1.4.
4900@end table
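
For illustration only, a C sketch (nvptx devices only, and requiring the
CUDA Runtime headers) that makes OpenACC queue 1 use a CUDA stream
created by the application; the function @code{share_stream} is made up
for this example.

@smallexample
#include <openacc.h>
#include <cuda_runtime_api.h>

void
share_stream (void)
@{
  cudaStream_t s;
  cudaStreamCreate (&s);
  acc_set_cuda_stream (1, (void *) s);
  /* From now on, acc_get_cuda_stream (1) returns s, and operations
     enqueued with async(1) use that stream.  */
@}
@end smallexample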
4901
4902
4903
4904@node acc_prof_register
4905@section @code{acc_prof_register} -- Register callbacks.
4906@table @asis
4907@item @emph{Description}:
4908This function registers callbacks.
4909
4910@item @emph{C/C++}:
4911@multitable @columnfractions .20 .80
4912@item @emph{Prototype}: @tab @code{void acc_prof_register (acc_event_t, acc_prof_callback, acc_register_t);}
4913@end multitable
4914
4915@item @emph{See also}:
4916@ref{OpenACC Profiling Interface}
4917
4918@item @emph{Reference}:
4919@uref{https://www.openacc.org, OpenACC specification v2.6}, section
49205.3.
4921@end table
4922
4923
4924
4925@node acc_prof_unregister
4926@section @code{acc_prof_unregister} -- Unregister callbacks.
4927@table @asis
4928@item @emph{Description}:
4929This function unregisters callbacks.
4930
4931@item @emph{C/C++}:
4932@multitable @columnfractions .20 .80
4933@item @emph{Prototype}: @tab @code{void acc_prof_unregister (acc_event_t, acc_prof_callback, acc_register_t);}
4934@end multitable
4935
4936@item @emph{See also}:
4937@ref{OpenACC Profiling Interface}
4938
4939@item @emph{Reference}:
4940@uref{https://www.openacc.org, OpenACC specification v2.6}, section
49415.3.
4942@end table
4943
4944
4945
4946@node acc_prof_lookup
4947@section @code{acc_prof_lookup} -- Obtain inquiry functions.
4948@table @asis
4949@item @emph{Description}:
4950Function to obtain inquiry functions.
4951
4952@item @emph{C/C++}:
4953@multitable @columnfractions .20 .80
4954@item @emph{Prototype}: @tab @code{acc_query_fn acc_prof_lookup (const char *);}
4955@end multitable
4956
4957@item @emph{See also}:
4958@ref{OpenACC Profiling Interface}
4959
4960@item @emph{Reference}:
4961@uref{https://www.openacc.org, OpenACC specification v2.6}, section
49625.3.
4963@end table
4964
4965
4966
4967@node acc_register_library
4968@section @code{acc_register_library} -- Library registration.
4969@table @asis
4970@item @emph{Description}:
4971Function for library registration.
4972
4973@item @emph{C/C++}:
4974@multitable @columnfractions .20 .80
4975@item @emph{Prototype}: @tab @code{void acc_register_library (acc_prof_reg, acc_prof_reg, acc_prof_lookup_func);}
4976@end multitable
4977
4978@item @emph{See also}:
4979@ref{OpenACC Profiling Interface}, @ref{ACC_PROFLIB}
4980
4981@item @emph{Reference}:
4982@uref{https://www.openacc.org, OpenACC specification v2.6}, section
49835.3.
4984@end table
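
For illustration only, a sketch of a minimal profiling library written
in C; it registers a single callback for kernel-launch events when
libgomp calls @code{acc_register_library} (for example, after the
library was named in @env{ACC_PROFLIB}).  The callback name
@code{launch_cb} is made up for this example.

@smallexample
#include <acc_prof.h>
#include <stdio.h>

static void
launch_cb (acc_prof_info *pi, acc_event_info *ei, acc_api_info *ai)
@{
  fprintf (stderr, "launch event %d on device type %d\n",
           (int) pi->event_type, (int) pi->device_type);
@}

void
acc_register_library (acc_prof_reg reg, acc_prof_reg unreg,
                      acc_prof_lookup_func lookup)
@{
  (void) unreg;
  (void) lookup;
  reg (acc_ev_enqueue_launch_start, launch_cb, acc_reg);
@}
@end smallexample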
4985
4986
4987
4988@c ---------------------------------------------------------------------
4989@c OpenACC Environment Variables
4990@c ---------------------------------------------------------------------
4991
4992@node OpenACC Environment Variables
4993@chapter OpenACC Environment Variables
4994
4995The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
4996are defined by section 4 of the OpenACC specification in version 2.0.
4997The variable @env{ACC_PROFLIB}
4998is defined by section 4 of the OpenACC specification in version 2.6.
4999
5000@menu
5001* ACC_DEVICE_TYPE::
5002* ACC_DEVICE_NUM::
5003* ACC_PROFLIB::
5004@end menu
5005
5006
5007
5008@node ACC_DEVICE_TYPE
5009@section @code{ACC_DEVICE_TYPE}
5010@table @asis
5011@item @emph{Description}:
5012Control the default device type to use when executing compute regions.
5013If unset, the code can be run on any device type, favoring a non-host
5014device type.
5015
5016Supported values in GCC (if compiled in) are
5017@itemize
5018@item @code{host}
5019@item @code{nvidia}
5020@item @code{radeon}
5021@end itemize
5022@item @emph{Reference}:
5023@uref{https://www.openacc.org, OpenACC specification v2.6}, section
50244.1.
5025@end table
5026
5027
5028
5029@node ACC_DEVICE_NUM
5030@section @code{ACC_DEVICE_NUM}
5031@table @asis
5032@item @emph{Description}:
5033Control which device, identified by device number, is the default device.
5034The value must be a nonnegative integer less than the number of devices.
5035If unset, device number zero is used.
5036@item @emph{Reference}:
5037@uref{https://www.openacc.org, OpenACC specification v2.6}, section
50384.2.
5039@end table
5040
5041
5042
5043@node ACC_PROFLIB
5044@section @code{ACC_PROFLIB}
5045@table @asis
5046@item @emph{Description}:
5047Semicolon-separated list of dynamic libraries that are loaded as profiling
5048libraries. Each library must provide at least the @code{acc_register_library}
5049routine. Each library file is located as described in the documentation of
5050@code{dlopen} for your operating system.
5051@item @emph{See also}:
5052@ref{acc_register_library}, @ref{OpenACC Profiling Interface}
5053
5054@item @emph{Reference}:
5055@uref{https://www.openacc.org, OpenACC specification v2.6}, section
50564.3.
5057@end table
5058
5059
5060
5061@c ---------------------------------------------------------------------
5062@c CUDA Streams Usage
5063@c ---------------------------------------------------------------------
5064
5065@node CUDA Streams Usage
5066@chapter CUDA Streams Usage
5067
5068This applies to the @code{nvptx} plugin only.
5069
5070The library provides elements that perform asynchronous movement of
5071data and asynchronous operation of computing constructs. This
5072asynchronous functionality is implemented by making use of CUDA
5073streams@footnote{See "Stream Management" in "CUDA Driver API",
5074TRM-06703-001, Version 5.5, for additional information}.
5075
5076The primary means by which the asynchronous functionality is accessed
5077is through those OpenACC directives that make use of the
5078@code{async} and @code{wait} clauses. When the @code{async} clause is
5079first used with a directive, it creates a CUDA stream. If an
5080@code{async-argument} is used with the @code{async} clause, then the
5081stream is associated with the specified @code{async-argument}.
5082
5083Following the creation of an association between a CUDA stream and the
5084@code{async-argument} of an @code{async} clause, both the @code{wait}
5085clause and the @code{wait} directive can be used. When either the
5086clause or directive is used after stream creation, it creates a
5087rendezvous point whereby execution waits until all operations
5088associated with the @code{async-argument}, that is, stream, have
5089completed.
5090
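For illustration only, a C sketch in which two @code{async} queues, and
hence two CUDA streams, are used and then joined again with the
@code{wait} directive; the function @code{two_streams} and its arguments
are made up for this example.

@smallexample
void
two_streams (float *a, float *b, int n)
@{
#pragma acc parallel loop copy(a[0:n]) async(1)
  for (int i = 0; i < n; i++)
    a[i] += 1.0f;
#pragma acc parallel loop copy(b[0:n]) async(2)
  for (int i = 0; i < n; i++)
    b[i] += 1.0f;
#pragma acc wait(1, 2)   /* rendezvous: both streams must be idle */
@}
@end smallexample
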
5091Normally, the management of the streams that are created as a result of
5092using the @code{async} clause is done without any intervention by the
5093caller. This implies that the association between the @code{async-argument}
5094and the CUDA stream is maintained for the lifetime of the program.
5095However, this association can be changed through the use of the library
5096function @code{acc_set_cuda_stream}. When the function
5097@code{acc_set_cuda_stream} is called, the CUDA stream that was
5098originally associated with the @code{async} clause is destroyed.
5099Caution should be taken when changing the association as subsequent
5100references to the @code{async-argument} refer to a different
5101CUDA stream.
5102
5103
5104
5105@c ---------------------------------------------------------------------
5106@c OpenACC Library Interoperability
5107@c ---------------------------------------------------------------------
5108
5109@node OpenACC Library Interoperability
5110@chapter OpenACC Library Interoperability
5111
5112@section Introduction
5113
5114The OpenACC library uses the CUDA Driver API, and may interact with
5115programs that use the Runtime library directly, or another library
5116based on the Runtime library, e.g., CUBLAS@footnote{See section 2.26,
5117"Interactions with the CUDA Driver API" in
5118"CUDA Runtime API", Version 5.5, and section 2.27, "VDPAU
5119Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
5120for additional information on library interoperability.}.
5121This chapter describes the use cases and what changes are
5122required in order to use both the OpenACC library and the CUBLAS and Runtime
5123libraries within a program.
5124
5125@section First invocation: NVIDIA CUBLAS library API
5126
5127In this first use case (see below), a function in the CUBLAS library is called
5128prior to any of the functions in the OpenACC library. More specifically, the
5129function @code{cublasCreate()}.
5130
5131When invoked, the function initializes the library and allocates the
5132hardware resources on the host and the device on behalf of the caller. Once
5133the initialization and allocation has completed, a handle is returned to the
5134caller. The OpenACC library also requires initialization and allocation of
5135hardware resources. Since the CUBLAS library has already allocated the
5136hardware resources for the device, all that is left to do is to initialize
5137the OpenACC library and acquire the hardware resources on the host.
5138
5139Prior to calling the OpenACC function that initializes the library and
5140allocates the host hardware resources, you need to acquire the device number
5141that was allocated during the call to @code{cublasCreate()}. Invoking the
5142runtime library function @code{cudaGetDevice()} accomplishes this. Once
5143acquired, the device number is passed along with the device type as
5144parameters to the OpenACC library function @code{acc_set_device_num()}.
5145
5146Once the call to @code{acc_set_device_num()} has completed, the OpenACC
5147library uses the context that was created during the call to
5148@code{cublasCreate()}. In other words, both libraries share the
5149same context.
5150
5151@smallexample
5152 /* Create the handle */
5153 s = cublasCreate(&h);
5154 if (s != CUBLAS_STATUS_SUCCESS)
5155 @{
5156 fprintf(stderr, "cublasCreate failed %d\n", s);
5157 exit(EXIT_FAILURE);
5158 @}
5159
5160 /* Get the device number */
5161 e = cudaGetDevice(&dev);
5162 if (e != cudaSuccess)
5163 @{
5164 fprintf(stderr, "cudaGetDevice failed %d\n", e);
5165 exit(EXIT_FAILURE);
5166 @}
5167
5168 /* Initialize OpenACC library and use device 'dev' */
5169 acc_set_device_num(dev, acc_device_nvidia);
5170
5171@end smallexample
5172@center Use Case 1
5173
5174@section First invocation: OpenACC library API
5175
5176In this second use case (see below), a function in the OpenACC library is
5177called prior to any of the functions in the CUBLAS library. More specifically,
5178the function @code{acc_set_device_num()}.
5179
5180In the use case presented here, the function @code{acc_set_device_num()}
5181is used to both initialize the OpenACC library and allocate the hardware
5182resources on the host and the device. In the call to the function, the
5183call parameters specify which device to use and what device
5184type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
5185is but one method to initialize the OpenACC library and allocate the
5186appropriate hardware resources. Other methods are available through the
5187use of environment variables and these are discussed in the next section.
5188
5189Once the call to @code{acc_set_device_num()} has completed, other OpenACC
5190functions can be called as seen with multiple calls being made to
5191@code{acc_copyin()}. In addition, calls can be made to functions in the
5192CUBLAS library. In the use case a call to @code{cublasCreate()} is made
5193subsequent to the calls to @code{acc_copyin()}.
5194As seen in the previous use case, a call to @code{cublasCreate()}
5195initializes the CUBLAS library and allocates the hardware resources on the
5196host and the device. However, since the device has already been allocated,
5197@code{cublasCreate()} only initializes the CUBLAS library and allocates
5198the appropriate hardware resources on the host. The context that was created
5199as part of the OpenACC initialization is shared with the CUBLAS library,
5200similarly to the first use case.
5201
5202@smallexample
5203 dev = 0;
5204
5205 acc_set_device_num(dev, acc_device_nvidia);
5206
5207 /* Copy the first set to the device */
5208 d_X = acc_copyin(&h_X[0], N * sizeof (float));
5209 if (d_X == NULL)
5210 @{
5211 fprintf(stderr, "copyin error h_X\n");
5212 exit(EXIT_FAILURE);
5213 @}
5214
5215 /* Copy the second set to the device */
5216 d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
5217 if (d_Y == NULL)
5218 @{
5219 fprintf(stderr, "copyin error h_Y1\n");
5220 exit(EXIT_FAILURE);
5221 @}
5222
5223 /* Create the handle */
5224 s = cublasCreate(&h);
5225 if (s != CUBLAS_STATUS_SUCCESS)
5226 @{
5227 fprintf(stderr, "cublasCreate failed %d\n", s);
5228 exit(EXIT_FAILURE);
5229 @}
5230
5231 /* Perform saxpy using CUBLAS library function */
5232 s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
5233 if (s != CUBLAS_STATUS_SUCCESS)
5234 @{
5235 fprintf(stderr, "cublasSaxpy failed %d\n", s);
5236 exit(EXIT_FAILURE);
5237 @}
5238
5239 /* Copy the results from the device */
5240 acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));
5241
5242@end smallexample
5243@center Use Case 2
5244
5245@section OpenACC library and environment variables
5246
5247There are two environment variables associated with the OpenACC library
5248that may be used to control the device type and device number:
5249@env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}, respectively. These two
5250environment variables can be used as an alternative to calling
5251@code{acc_set_device_num()}. As seen in the second use case, the device
5252type and device number were specified using @code{acc_set_device_num()}.
5253If however, the aforementioned environment variables were set, then the
5254call to @code{acc_set_device_num()} would not be required.
5255
5256
5257The use of the environment variables is only relevant when an OpenACC function
5258is called prior to a call to @code{cublasCreate()}. If @code{cublasCreate()}
5259is called prior to a call to an OpenACC function, then you must call
5260@code{acc_set_device_num()}@footnote{More complete information
5261about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
5262sections 4.1 and 4.2 of the @uref{https://www.openacc.org, OpenACC}
5263Application Programming Interface, Version 2.6.}.
5264
5265
5266
5267@c ---------------------------------------------------------------------
5268@c OpenACC Profiling Interface
5269@c ---------------------------------------------------------------------
5270
5271@node OpenACC Profiling Interface
5272@chapter OpenACC Profiling Interface
5273
5274@section Implementation Status and Implementation-Defined Behavior
5275
5276We're implementing the OpenACC Profiling Interface as defined by the
5277OpenACC 2.6 specification. We're clarifying some aspects here as
5278@emph{implementation-defined behavior}, while they're still under
5279discussion within the OpenACC Technical Committee.
5280
5281This implementation is tuned to keep the performance impact as low as
5282possible for the (very common) case that the Profiling Interface is
5283not enabled. This is relevant, as the Profiling Interface affects all
5284the @emph{hot} code paths (in the target code, not in the offloaded
5285code). Users of the OpenACC Profiling Interface can be expected to
5286understand that performance is impacted to some degree once the
5287Profiling Interface is enabled: for example, because of the
5288@emph{runtime} (libgomp) calling into a third-party @emph{library} for
5289every event that has been registered.
5290
5291We're not yet accounting for the fact that @cite{OpenACC events may
5292occur during event processing}.
5293We just handle one case specially, as required by CUDA 9.0
5294@command{nvprof}, namely that @code{acc_get_device_type}
5295(@ref{acc_get_device_type}) may be called from
5296@code{acc_ev_device_init_start} and @code{acc_ev_device_init_end}
5297callbacks.
5298
5299We're not yet implementing initialization via an
5300@code{acc_register_library} function that is either statically linked
5301in or loaded dynamically via @env{LD_PRELOAD}.
5302Initialization via @code{acc_register_library} functions dynamically
5303loaded via the @env{ACC_PROFLIB} environment variable does work, as
5304does directly calling @code{acc_prof_register},
5305@code{acc_prof_unregister}, @code{acc_prof_lookup}.
5306
5307As currently there are no inquiry functions defined, calls to
5308@code{acc_prof_lookup} always return @code{NULL}.
5309
5310There aren't separate @emph{start}, @emph{stop} events defined for the
5311event types @code{acc_ev_create}, @code{acc_ev_delete},
5312@code{acc_ev_alloc}, @code{acc_ev_free}. It's not clear if these
5313should be triggered before or after the actual device-specific call is
5314made. We trigger them after.
5315
5316Remarks about data provided to callbacks:
5317
5318@table @asis
5319
5320@item @code{acc_prof_info.event_type}
5321It's not clear if for @emph{nested} event callbacks (for example,
5322@code{acc_ev_enqueue_launch_start} as part of a parent compute
5323construct), this should be set for the nested event
5324(@code{acc_ev_enqueue_launch_start}), or if the value of the parent
5325construct should remain (@code{acc_ev_compute_construct_start}). In
5326this implementation, the value generally corresponds to the
5327innermost nested event type.
5328
5329@item @code{acc_prof_info.device_type}
5330@itemize
5331
5332@item
5333For @code{acc_ev_compute_construct_start}, and in presence of an
5334@code{if} clause with @emph{false} argument, this still refers to
5335the offloading device type.
5336It's not clear if that's the expected behavior.
5337
5338@item
5339Complementary to the item before, for
5340@code{acc_ev_compute_construct_end}, this is set to
5341@code{acc_device_host} in presence of an @code{if} clause with
5342@emph{false} argument.
5343It's not clear if that's the expected behavior.
5344
5345@end itemize
5346
5347@item @code{acc_prof_info.thread_id}
5348Always @code{-1}; not yet implemented.
5349
5350@item @code{acc_prof_info.async}
5351@itemize
5352
5353@item
5354Not yet implemented correctly for
5355@code{acc_ev_compute_construct_start}.
5356
5357@item
5358In a compute construct, for host-fallback
5359execution/@code{acc_device_host} it always is
5360@code{acc_async_sync}.
5361It is unclear if that is the expected behavior.
5362
5363@item
5364For @code{acc_ev_device_init_start} and @code{acc_ev_device_init_end},
5365it will always be @code{acc_async_sync}.
5366It is unclear if that is the expected behavior.
5367
5368@end itemize
5369
5370@item @code{acc_prof_info.async_queue}
5371There is no @cite{limited number of asynchronous queues} in libgomp.
5372This always has the same value as @code{acc_prof_info.async}.
5373
5374@item @code{acc_prof_info.src_file}
5375Always @code{NULL}; not yet implemented.
5376
5377@item @code{acc_prof_info.func_name}
5378Always @code{NULL}; not yet implemented.
5379
5380@item @code{acc_prof_info.line_no}
5381Always @code{-1}; not yet implemented.
5382
5383@item @code{acc_prof_info.end_line_no}
5384Always @code{-1}; not yet implemented.
5385
5386@item @code{acc_prof_info.func_line_no}
5387Always @code{-1}; not yet implemented.
5388
5389@item @code{acc_prof_info.func_end_line_no}
5390Always @code{-1}; not yet implemented.
5391
5392@item @code{acc_event_info.event_type}, @code{acc_event_info.*.event_type}
5393Relating to @code{acc_prof_info.event_type} discussed above, in this
5394implementation, this will always be the same value as
5395@code{acc_prof_info.event_type}.
5396
5397@item @code{acc_event_info.*.parent_construct}
5398@itemize
5399
5400@item
5401Will be @code{acc_construct_parallel} for all OpenACC compute
5402constructs as well as many OpenACC Runtime API calls; should be the
5403one matching the actual construct, or
5404@code{acc_construct_runtime_api}, respectively.
5405
5406@item
5407Will be @code{acc_construct_enter_data} or
5408@code{acc_construct_exit_data} when processing variable mappings
5409specified in OpenACC @emph{declare} directives; should be
5410@code{acc_construct_declare}.
5411
5412@item
5413For implicit @code{acc_ev_device_init_start},
5414@code{acc_ev_device_init_end}, and explicit as well as implicit
5415@code{acc_ev_alloc}, @code{acc_ev_free},
5416@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
5417@code{acc_ev_enqueue_download_start}, and
5418@code{acc_ev_enqueue_download_end}, will be
5419@code{acc_construct_parallel}; should reflect the real parent
5420construct.
5421
5422@end itemize
5423
5424@item @code{acc_event_info.*.implicit}
5425For @code{acc_ev_alloc}, @code{acc_ev_free},
5426@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
5427@code{acc_ev_enqueue_download_start}, and
5428@code{acc_ev_enqueue_download_end}, this currently will be @code{1}
5429also for explicit usage.
5430
5431@item @code{acc_event_info.data_event.var_name}
5432Always @code{NULL}; not yet implemented.
5433
5434@item @code{acc_event_info.data_event.host_ptr}
5435For @code{acc_ev_alloc}, and @code{acc_ev_free}, this is always
5436@code{NULL}.
5437
5438@item @code{typedef union acc_api_info}
5439@dots{} as printed in @cite{5.2.3. Third Argument: API-Specific
5440Information}. This should obviously be @code{typedef @emph{struct}
5441acc_api_info}.
5442
5443@item @code{acc_api_info.device_api}
5444Possibly not yet implemented correctly for
5445@code{acc_ev_compute_construct_start},
5446@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}:
5447will always be @code{acc_device_api_none} for these event types.
5448For @code{acc_ev_enter_data_start}, it will be
5449@code{acc_device_api_none} in some cases.
5450
5451@item @code{acc_api_info.device_type}
5452Always the same as @code{acc_prof_info.device_type}.
5453
5454@item @code{acc_api_info.vendor}
5455Always @code{-1}; not yet implemented.
5456
5457@item @code{acc_api_info.device_handle}
5458Always @code{NULL}; not yet implemented.
5459
5460@item @code{acc_api_info.context_handle}
5461Always @code{NULL}; not yet implemented.
5462
5463@item @code{acc_api_info.async_handle}
5464Always @code{NULL}; not yet implemented.
5465
5466@end table
5467
5468Remarks about certain event types:
5469
5470@table @asis
5471
5472@item @code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
5473@itemize
5474
5475@item
5476@c See 'DEVICE_INIT_INSIDE_COMPUTE_CONSTRUCT' in
5477@c 'libgomp.oacc-c-c++-common/acc_prof-kernels-1.c',
5478@c 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.
5479When a compute construct triggers implicit
5480@code{acc_ev_device_init_start} and @code{acc_ev_device_init_end}
5481events, they currently aren't @emph{nested within} the corresponding
5482@code{acc_ev_compute_construct_start} and
5483@code{acc_ev_compute_construct_end}, but they're currently observed
5484@emph{before} @code{acc_ev_compute_construct_start}.
It's not clear what to do here: the standard expects a lot of details
to be provided to the @code{acc_ev_compute_construct_start} callback,
but how can that be done without (implicitly) initializing a device first?
5488
5489@item
5490Callbacks for these event types will not be invoked for calls to the
5491@code{acc_set_device_type} and @code{acc_set_device_num} functions.
5492It's not clear if they should be.
5493
5494@end itemize
5495
5496@item @code{acc_ev_enter_data_start}, @code{acc_ev_enter_data_end}, @code{acc_ev_exit_data_start}, @code{acc_ev_exit_data_end}
5497@itemize
5498
5499@item
5500Callbacks for these event types will also be invoked for OpenACC
5501@emph{host_data} constructs.
5502It's not clear if they should be.
5503
5504@item
5505Callbacks for these event types will also be invoked when processing
5506variable mappings specified in OpenACC @emph{declare} directives.
5507It's not clear if they should be.
5508
5509@end itemize
5510
5511@end table
5512
Callbacks for the following event types will be invoked, but the dispatch
and the information provided therein have not yet been thoroughly reviewed:
5515
5516@itemize
5517@item @code{acc_ev_alloc}
5518@item @code{acc_ev_free}
5519@item @code{acc_ev_update_start}, @code{acc_ev_update_end}
5520@item @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end}
5521@item @code{acc_ev_enqueue_download_start}, @code{acc_ev_enqueue_download_end}
5522@end itemize
5523
During device initialization and finalization, respectively,
callbacks for the following event types will not yet be invoked:
5526
5527@itemize
5528@item @code{acc_ev_alloc}
5529@item @code{acc_ev_free}
5530@end itemize
5531
5532Callbacks for the following event types have not yet been implemented,
5533so currently won't be invoked:
5534
5535@itemize
5536@item @code{acc_ev_device_shutdown_start}, @code{acc_ev_device_shutdown_end}
5537@item @code{acc_ev_runtime_shutdown}
5538@item @code{acc_ev_create}, @code{acc_ev_delete}
5539@item @code{acc_ev_wait_start}, @code{acc_ev_wait_end}
5540@end itemize
5541
5542For the following runtime library functions, not all expected
5543callbacks will be invoked (mostly concerning implicit device
5544initialization):
5545
5546@itemize
5547@item @code{acc_get_num_devices}
5548@item @code{acc_set_device_type}
5549@item @code{acc_get_device_type}
5550@item @code{acc_set_device_num}
5551@item @code{acc_get_device_num}
5552@item @code{acc_init}
5553@item @code{acc_shutdown}
5554@end itemize
5555
5556Aside from implicit device initialization, for the following runtime
5557library functions, no callbacks will be invoked for shared-memory
5558offloading devices (it's not clear if they should be):
5559
5560@itemize
5561@item @code{acc_malloc}
5562@item @code{acc_free}
5563@item @code{acc_copyin}, @code{acc_present_or_copyin}, @code{acc_copyin_async}
5564@item @code{acc_create}, @code{acc_present_or_create}, @code{acc_create_async}
5565@item @code{acc_copyout}, @code{acc_copyout_async}, @code{acc_copyout_finalize}, @code{acc_copyout_finalize_async}
5566@item @code{acc_delete}, @code{acc_delete_async}, @code{acc_delete_finalize}, @code{acc_delete_finalize_async}
5567@item @code{acc_update_device}, @code{acc_update_device_async}
5568@item @code{acc_update_self}, @code{acc_update_self_async}
5569@item @code{acc_map_data}, @code{acc_unmap_data}
5570@item @code{acc_memcpy_to_device}, @code{acc_memcpy_to_device_async}
5571@item @code{acc_memcpy_from_device}, @code{acc_memcpy_from_device_async}
5572@end itemize
5573
5574@c ---------------------------------------------------------------------
5575@c OpenMP-Implementation Specifics
5576@c ---------------------------------------------------------------------
5577
5578@node OpenMP-Implementation Specifics
5579@chapter OpenMP-Implementation Specifics
5580
5581@menu
* Implementation-defined ICV Initialization::
* OpenMP Context Selectors::
* Memory allocation::
5585@end menu
5586
5587@node Implementation-defined ICV Initialization
5588@section Implementation-defined ICV Initialization
5589@cindex Implementation specific setting
5590
5591@multitable @columnfractions .30 .70
5592@item @var{affinity-format-var} @tab See @ref{OMP_AFFINITY_FORMAT}.
5593@item @var{def-allocator-var} @tab See @ref{OMP_ALLOCATOR}.
5594@item @var{max-active-levels-var} @tab See @ref{OMP_MAX_ACTIVE_LEVELS}.
5595@item @var{dyn-var} @tab See @ref{OMP_DYNAMIC}.
@item @var{nthreads-var} @tab See @ref{OMP_NUM_THREADS}.
5597@item @var{num-devices-var} @tab Number of non-host devices found
5598by GCC's run-time library
5599@item @var{num-procs-var} @tab The number of CPU cores on the
5600initial device, except that affinity settings might lead to a
5601smaller number. On non-host devices, the value of the
5602@var{nthreads-var} ICV.
5603@item @var{place-partition-var} @tab See @ref{OMP_PLACES}.
5604@item @var{run-sched-var} @tab See @ref{OMP_SCHEDULE}.
5605@item @var{stacksize-var} @tab See @ref{OMP_STACKSIZE}.
5606@item @var{thread-limit-var} @tab See @ref{OMP_TEAMS_THREAD_LIMIT}
5607@item @var{wait-policy-var} @tab See @ref{OMP_WAIT_POLICY} and
5608@ref{GOMP_SPINCOUNT}
5609@end multitable
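
The effect of these initial values can be observed with the usual API
routines; the following sketch merely prints a few of the corresponding
values (the output depends on the system and on the environment variables
listed above).

@smallexample
#include <stdio.h>
#include <omp.h>

int
main (void)
@{
  printf ("num-procs-var:         %d\n", omp_get_num_procs ());
  printf ("nthreads-var:          %d\n", omp_get_max_threads ());
  printf ("num-devices-var:       %d\n", omp_get_num_devices ());
  printf ("max-active-levels-var: %d\n", omp_get_max_active_levels ());
  printf ("dyn-var:               %d\n", omp_get_dynamic ());
  return 0;
@}
@end smallexample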
5610
5611@node OpenMP Context Selectors
5612@section OpenMP Context Selectors
5613
5614@code{vendor} is always @code{gnu}. References are to the GCC manual.
5615
5616@c NOTE: Only the following selectors have been implemented. To add
5617@c additional traits for target architecture, TARGET_OMP_DEVICE_KIND_ARCH_ISA
5618@c has to be implemented; cf. also PR target/105640.
5619@c For offload devices, add *additionally* gcc/config/*/t-omp-device.
5620
For the host compiler, @code{kind} always matches @code{host}; for the
offloading architectures AMD GCN and Nvidia PTX, @code{kind} always matches
@code{gpu}.  For the x86 family of computers, AMD GCN and Nvidia PTX, the
following @code{arch} and @code{isa} traits are supported in addition;
while OpenMP is supported on more architectures, GCC currently does not
match any @code{arch} or @code{isa} traits for those.  An example follows
the table below.
5627
5628@multitable @columnfractions .65 .30
5629@headitem @code{arch} @tab @code{isa}
5630@item @code{x86}, @code{x86_64}, @code{i386}, @code{i486},
5631 @code{i586}, @code{i686}, @code{ia32}
5632 @tab See @code{-m...} flags in ``x86 Options'' (without @code{-m})
5633@item @code{amdgcn}, @code{gcn}
5634 @tab See @code{-march=} in ``AMD GCN Options''@footnote{Additionally,
5635 @code{gfx803} is supported as an alias for @code{fiji}.}
@item @code{nvptx}
5637 @tab See @code{-march=} in ``Nvidia PTX Options''
5638@end multitable
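
A sketch of how these traits can be used with @code{declare variant}; the
function names are made up for illustration and the selector values are
taken from the table above.

@smallexample
/* Base function and two hypothetical specialized variants.  */
void saxpy_nvptx (int n, float a, const float *x, float *y);
void saxpy_avx512 (int n, float a, const float *x, float *y);

#pragma omp declare variant (saxpy_nvptx) \
            match (construct=@{target@}, device=@{arch(nvptx)@})
#pragma omp declare variant (saxpy_avx512) \
            match (device=@{arch(x86_64), isa(avx512f)@})
void saxpy (int n, float a, const float *x, float *y);
@end smallexample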
5639
5640@node Memory allocation
5641@section Memory allocation
5643The description below applies to:
5644
5645@itemize
5646@item Explicit use of the OpenMP API routines, see
5647 @ref{Memory Management Routines}.
5648@item The @code{allocate} clause, except when the @code{allocator} modifier is a
5649 constant expression with value @code{omp_default_mem_alloc} and no
5650 @code{align} modifier has been specified. (In that case, the normal
5651 @code{malloc} allocation is used.)
5652@item Using the @code{allocate} directive for automatic/stack variables, except
5653 when the @code{allocator} clause is a constant expression with value
5654 @code{omp_default_mem_alloc} and no @code{align} clause has been
5655 specified. (In that case, the normal allocation is used: stack allocation
5656 and, sometimes for Fortran, also @code{malloc} [depending on flags such as
5657 @option{-fstack-arrays}].)
@item Using the @code{allocate} directive for variables in static memory is
      currently not supported (compile-time error).
@item Using the @code{allocators} directive for Fortran pointers and
      allocatables is currently not supported (compile-time error).
5662@end itemize
5663
5664For the available predefined allocators and, as applicable, their associated
5665predefined memory spaces and for the available traits and their default values,
5666see @ref{OMP_ALLOCATOR}. Predefined allocators without an associated memory
5667space use the @code{omp_default_mem_space} memory space.
5668
5669For the memory spaces, the following applies:
5670@itemize
5671@item @code{omp_default_mem_space} is supported
5672@item @code{omp_const_mem_space} maps to @code{omp_default_mem_space}
5673@item @code{omp_low_lat_mem_space} maps to @code{omp_default_mem_space}
5674@item @code{omp_large_cap_mem_space} maps to @code{omp_default_mem_space},
5675 unless the memkind library is available
5676@item @code{omp_high_bw_mem_space} maps to @code{omp_default_mem_space},
5677 unless the memkind library is available
5678@end itemize
5679
5680On Linux systems, where the @uref{https://github.com/memkind/memkind, memkind
5681library} (@code{libmemkind.so.0}) is available at runtime, it is used when
5682creating memory allocators requesting
5683
5684@itemize
5685@item the memory space @code{omp_high_bw_mem_space}
5686@item the memory space @code{omp_large_cap_mem_space}
@item the @code{partition} trait @code{interleaved}; note that for
      @code{omp_large_cap_mem_space} the allocation will not be interleaved
5689@end itemize
5690
On Linux systems, where the @uref{https://github.com/numactl/numactl, numa
library} (@code{libnuma.so.1}) is available at runtime, it is used when
creating memory allocators requesting
5694
5695@itemize
@item the @code{partition} trait @code{nearest}, except when both the
memkind library is available and the memory space is either
@code{omp_large_cap_mem_space} or @code{omp_high_bw_mem_space}
5699@end itemize
5700
5701Note that the numa library will round up the allocation size to a multiple of
5702the system page size; therefore, consider using it only with large data or
5703by sharing allocations via the @code{pool_size} trait. Furthermore, the Linux
5704kernel does not guarantee that an allocation will always be on the nearest NUMA
5705node nor that after reallocation the same node will be used. Note additionally
5706that, on Linux, the default setting of the memory placement policy is to use the
5707current node; therefore, unless the memory placement policy has been overridden,
5708the @code{partition} trait @code{environment} (the default) will be effectively
5709a @code{nearest} allocation.
5710
a85a106c 5711Additional notes regarding the traits:
5712@itemize
5713@item The @code{pinned} trait is unsupported.
5714@item The default for the @code{pool_size} trait is no pool and for every
5715 (re)allocation the associated library routine is called, which might
5716 internally use a memory pool.
5717@item For the @code{partition} trait, the partition part size will be the same
5718 as the requested size (i.e. @code{interleaved} or @code{blocked} has no
5719 effect), except for @code{interleaved} when the memkind library is
      available.  Furthermore, for @code{nearest} and unless the numa library
      is available, the memory might not be on the same NUMA node as the
      thread that allocated the memory; on Linux, this is in particular the
      case when the memory placement policy is set to preferred.
@item The @code{access} trait has no effect; memory is always
      accessible by all threads.
5726@item The @code{sync_hint} trait has no effect.
5727@end itemize
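
The following sketch shows the general use of the memory-allocation
routines described above; whether the high-bandwidth and @code{nearest}
requests are honored depends on the availability of the memkind and numa
libraries, as just described, and the names @code{example} and @code{buf}
are merely illustrative.

@smallexample
#include <omp.h>

void
example (int n)
@{
  omp_alloctrait_t traits[]
    = @{ @{ omp_atk_partition, omp_atv_nearest @} @};
  omp_allocator_handle_t al
    = omp_init_allocator (omp_high_bw_mem_space, 1, traits);

  double *buf = (double *) omp_alloc (n * sizeof (double), al);
  /* ... use buf ... */
  omp_free (buf, al);
  omp_destroy_allocator (al);
@}
@end smallexample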
5728
5729@c ---------------------------------------------------------------------
5730@c Offload-Target Specifics
5731@c ---------------------------------------------------------------------
5732
5733@node Offload-Target Specifics
5734@chapter Offload-Target Specifics
5735
The following sections present notes on the offload-target specifics.
5737
5738@menu
5739* AMD Radeon::
5740* nvptx::
5741@end menu
5742
5743@node AMD Radeon
5744@section AMD Radeon (GCN)
5745
5746On the hardware side, there is the hierarchy (fine to coarse):
5747@itemize
5748@item work item (thread)
5749@item wavefront
5750@item work group
@item compute unit (CU)
5752@end itemize
5753
5754All OpenMP and OpenACC levels are used, i.e.
5755@itemize
@item OpenMP's simd and OpenACC's vector map to work items (threads)
5757@item OpenMP's threads (``parallel'') and OpenACC's workers map
5758 to wavefronts
5759@item OpenMP's teams and OpenACC's gang use a threadpool with the
5760 size of the number of teams or gangs, respectively.
5761@end itemize
5762
The sizes used are as follows; a usage sketch follows the list.
@itemize
@item Number of teams is the specified @code{num_teams} (OpenMP) or
      @code{num_gangs} (OpenACC) or otherwise the number of CUs.  It is
      limited to twice the number of CUs.
5768@item Number of wavefronts is 4 for gfx900 and 16 otherwise;
5769 @code{num_threads} (OpenMP) and @code{num_workers} (OpenACC)
5770 overrides this if smaller.
5771@item The wavefront has 102 scalars and 64 vectors
5772@item Number of workitems is always 64
@item The hardware permits at most 40 workgroups per CU and
      16 wavefronts per workgroup, up to a limit of 40 wavefronts in total per CU.
@item 80 scalar registers and 24 vector registers in non-kernel functions
5776 (the chosen procedure-calling API).
5777@item For the kernel itself: as many as register pressure demands (number of
5778 teams and number of threads, scaled down if registers are exhausted)
5779@end itemize
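
The following sketch shows how the @code{num_teams} and @code{num_threads}
clauses mentioned above can be used to set these sizes explicitly; the
clause values are arbitrary examples, and without them the defaults
described above apply.

@smallexample
void
scale (int n, double *in, double *out)
@{
  /* 32 teams (work groups), 4 threads (wavefronts) per team;
     the simd level maps onto the 64 work items of a wavefront.  */
  #pragma omp target teams distribute parallel for simd \
              num_teams(32) num_threads(4) \
              map(to: in[0:n]) map(from: out[0:n])
  for (int i = 0; i < n; i++)
    out[i] = 2.0 * in[i];
@}
@end smallexample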
5780
Implementation remarks:
5782@itemize
5783@item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
5784 using the C library @code{printf} functions and the Fortran
5785 @code{print}/@code{write} statements.
@item Reverse offload regions (i.e. @code{target} regions with
5787 @code{device(ancestor:1)}) are processed serially per @code{target} region
5788 such that the next reverse offload region is only executed after the previous
5789 one returned.
@item OpenMP code that has a @code{requires} directive with
5791 @code{unified_shared_memory} will remove any GCN device from the list of
5792 available devices (``host fallback'').
5793@item The available stack size can be changed using the @code{GCN_STACK_SIZE}
5794 environment variable; the default is 32 kiB per thread.
5795@end itemize
5796
5797
5798
5799@node nvptx
5800@section nvptx
5801
5802On the hardware side, there is the hierarchy (fine to coarse):
5803@itemize
5804@item thread
5805@item warp
5806@item thread block
5807@item streaming multiprocessor
5808@end itemize
5809
5810All OpenMP and OpenACC levels are used, i.e.
5811@itemize
5812@item OpenMP's simd and OpenACC's vector map to threads
5813@item OpenMP's threads (``parallel'') and OpenACC's workers map to warps
5814@item OpenMP's teams and OpenACC's gang use a threadpool with the
5815 size of the number of teams or gangs, respectively.
5816@end itemize
5817
The sizes used are:
5819@itemize
5820@item The @code{warp_size} is always 32
5821@item CUDA kernel launched: @code{dim=@{#teams,1,1@}, blocks=@{#threads,warp_size,1@}}.
5822@item The number of teams is limited by the number of blocks the device can
5823 host simultaneously.
5824@end itemize
5825
Additional information can be obtained by setting the environment variable
@code{GOMP_DEBUG=1} (very verbose; grep for @code{kernel.*launch} for launch
parameters).
5829
GCC generates generic PTX ISA code, which is just-in-time compiled by CUDA;
CUDA caches the JIT-compiled code in the user's directory (see the CUDA
documentation; this can be tuned by the environment variables
@code{CUDA_CACHE_@{DISABLE,MAXSIZE,PATH@}}).

Note: While PTX ISA is generic, the @code{-mptx=} and @code{-march=} command-line
options still affect the used PTX ISA code and, thus, the requirements on
CUDA version and hardware.
5837
Implementation remarks:
5839@itemize
5840@item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
5841 using the C library @code{printf} functions. Note that the Fortran
      @code{print}/@code{write} statements are not yet supported.
@item Compiling OpenMP code that contains @code{requires reverse_offload}
      requires at least @code{-march=sm_35}; compiling for @code{-march=sm_30}
      is not supported.
@item For code containing reverse offload (i.e. @code{target} regions with
      @code{device(ancestor:1)}; see the sketch after this list), there is a
      slight performance penalty for @emph{all} target regions, consisting
      mostly of shutdown delay.  Per device, reverse offload regions are
      processed serially such that the next reverse offload region is only
      executed after the previous one returned.
5852@item OpenMP code that has a @code{requires} directive with
5853 @code{unified_shared_memory} will remove any nvptx device from the
      list of available devices (``host fallback'').
5855@item The default per-warp stack size is 128 kiB; see also @code{-msoft-stack}
5856 in the GCC manual.
5857@item The OpenMP routines @code{omp_target_memcpy_rect} and
5858 @code{omp_target_memcpy_rect_async} and the @code{target update}
5859 directive for non-contiguous list items will use the 2D and 3D
5860 memory-copy functions of the CUDA library. Higher dimensions will
5861 call those functions in a loop and are therefore supported.
5862@end itemize
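
A minimal sketch of the reverse-offload pattern mentioned above; as noted,
for nvptx this requires compiling with at least @code{-march=sm_35}, and
per device the nested regions are executed serially.

@smallexample
#include <stdio.h>

#pragma omp requires reverse_offload

int
main (void)
@{
  #pragma omp target
  @{
    /* This block runs on the offload device...  */
    #pragma omp target device (ancestor: 1)
    @{
      /* ...while this nested region runs back on the host.  */
      printf ("hello from the host\n");
    @}
  @}
  return 0;
@}
@end smallexample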
5863
5864
5865@c ---------------------------------------------------------------------
5866@c The libgomp ABI
5867@c ---------------------------------------------------------------------
5868
5869@node The libgomp ABI
5870@chapter The libgomp ABI
5871
5872The following sections present notes on the external ABI as
5873presented by libgomp. Only maintainers should need them.
5874
5875@menu
5876* Implementing MASTER construct::
5877* Implementing CRITICAL construct::
5878* Implementing ATOMIC construct::
5879* Implementing FLUSH construct::
5880* Implementing BARRIER construct::
5881* Implementing THREADPRIVATE construct::
5882* Implementing PRIVATE clause::
5883* Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
5884* Implementing REDUCTION clause::
5885* Implementing PARALLEL construct::
5886* Implementing FOR construct::
5887* Implementing ORDERED construct::
5888* Implementing SECTIONS construct::
5889* Implementing SINGLE construct::
5890* Implementing OpenACC's PARALLEL construct::
5891@end menu
5892
5893
5894@node Implementing MASTER construct
5895@section Implementing MASTER construct
5896
5897@smallexample
5898if (omp_get_thread_num () == 0)
5899 block
5900@end smallexample
5901
Alternatively, we could generate two copies of the parallel subfunction
and only include this test in the version run by the primary thread.
5904Surely this is not worthwhile though...
5905
5906
5907
5908@node Implementing CRITICAL construct
5909@section Implementing CRITICAL construct
5910
5911Without a specified name,
5912
5913@smallexample
5914 void GOMP_critical_start (void);
5915 void GOMP_critical_end (void);
5916@end smallexample
5917
5918so that we don't get COPY relocations from libgomp to the main
5919application.
5920
5921With a specified name, use omp_set_lock and omp_unset_lock with
5922name being transformed into a variable declared like
5923
5924@smallexample
5925 omp_lock_t gomp_critical_user_<name> __attribute__((common))
5926@end smallexample
5927
5928Ideally the ABI would specify that all zero is a valid unlocked
5929state, and so we wouldn't need to initialize this at
5930startup.
5931
5932
5933
5934@node Implementing ATOMIC construct
5935@section Implementing ATOMIC construct
5936
5937The target should implement the @code{__sync} builtins.
5938
5939Failing that we could add
5940
5941@smallexample
5942 void GOMP_atomic_enter (void)
5943 void GOMP_atomic_exit (void)
5944@end smallexample
5945
5946which reuses the regular lock code, but with yet another lock
5947object private to the library.
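
For targets that do provide the @code{__sync} builtins, a simple atomic
update is, roughly, expanded inline:

@smallexample
  #pragma omp atomic
  x += 1;
@end smallexample

becomes something like

@smallexample
  __sync_fetch_and_add (&x, 1);
@end smallexample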
5948
5949
5950
5951@node Implementing FLUSH construct
5952@section Implementing FLUSH construct
5953
5954Expands to the @code{__sync_synchronize} builtin.
5955
5956
5957
5958@node Implementing BARRIER construct
5959@section Implementing BARRIER construct
5960
5961@smallexample
5962 void GOMP_barrier (void)
5963@end smallexample
5964
5965
5966@node Implementing THREADPRIVATE construct
5967@section Implementing THREADPRIVATE construct
5968
In @emph{most} cases we can map this directly to @code{__thread}.  Except
5970that OMP allows constructors for C++ objects. We can either
5971refuse to support this (how often is it used?) or we can
5972implement something akin to .ctors.
5973
5974Even more ideally, this ctor feature is handled by extensions
5975to the main pthreads library. Failing that, we can have a set
5976of entry points to register ctor functions to be called.
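
A sketch of the simple case without constructors:

@smallexample
  int tp;
  #pragma omp threadprivate (tp)
@end smallexample

maps to

@smallexample
  __thread int tp;
@end smallexample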
5977
5978
5979
5980@node Implementing PRIVATE clause
5981@section Implementing PRIVATE clause
5982
5983In association with a PARALLEL, or within the lexical extent
5984of a PARALLEL block, the variable becomes a local variable in
5985the parallel subfunction.
5986
5987In association with FOR or SECTIONS blocks, create a new
5988automatic variable within the current function. This preserves
5989the semantic of new variable creation.
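
A sketch of the PARALLEL case:

@smallexample
  int x;
  #pragma omp parallel private (x)
    body;
@end smallexample

becomes, inside the parallel subfunction,

@smallexample
  void subfunction (void *data)
  @{
    int x;
    body;
  @}
@end smallexample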
5990
5991
5992
5993@node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
5994@section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
5995
5996This seems simple enough for PARALLEL blocks. Create a private
5997struct for communicating between the parent and subfunction.
In the parent, copy in values for scalars and ``small'' structs;
copy in addresses for other TREE_ADDRESSABLE types.  In the
6000subfunction, copy the value into the local variable.
6001
6002It is not clear what to do with bare FOR or SECTION blocks.
6003The only thing I can figure is that we do something like:
6004
6005@smallexample
6006#pragma omp for firstprivate(x) lastprivate(y)
6007for (int i = 0; i < n; ++i)
6008 body;
6009@end smallexample
6010
6011which becomes
6012
6013@smallexample
6014@{
6015 int x = x, y;
6016
6017 // for stuff
6018
6019 if (i == n)
6020 y = y;
6021@}
6022@end smallexample
6023
6024where the "x=x" and "y=y" assignments actually have different
6025uids for the two variables, i.e. not something you could write
6026directly in C. Presumably this only makes sense if the "outer"
6027x and y are global variables.
6028
6029COPYPRIVATE would work the same way, except the structure
6030broadcast would have to happen via SINGLE machinery instead.
6031
6032
6033
6034@node Implementing REDUCTION clause
6035@section Implementing REDUCTION clause
6036
6037The private struct mentioned in the previous section should have
6038a pointer to an array of the type of the variable, indexed by the
6039thread's @var{team_id}. The thread stores its final value into the
6040array, and after the barrier, the primary thread iterates over the
6041array to collect the values.
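
A rough sketch for @code{reduction(+:sum)}, in the same pseudo-code style
as the previous sections:

@smallexample
  /* Communicated via the private struct:  */
  long *sum_array;  /* One element per thread, indexed by team_id.  */

  /* In each thread:  */
  long local_sum = 0;
  /* ... accumulate into local_sum ...  */
  sum_array[omp_get_thread_num ()] = local_sum;
  GOMP_barrier ();

  /* Primary thread only:  */
  for (i = 0; i < nthreads; i++)
    sum += sum_array[i];
@end smallexample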
6042
6043
6044@node Implementing PARALLEL construct
6045@section Implementing PARALLEL construct
6046
6047@smallexample
6048 #pragma omp parallel
6049 @{
6050 body;
6051 @}
6052@end smallexample
6053
6054becomes
6055
6056@smallexample
6057 void subfunction (void *data)
6058 @{
6059 use data;
6060 body;
6061 @}
6062
6063 setup data;
6064 GOMP_parallel_start (subfunction, &data, num_threads);
6065 subfunction (&data);
6066 GOMP_parallel_end ();
6067@end smallexample
6068
6069@smallexample
6070 void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
6071@end smallexample
6072
6073The @var{FN} argument is the subfunction to be run in parallel.
6074
6075The @var{DATA} argument is a pointer to a structure used to
6076communicate data in and out of the subfunction, as discussed
6077above with respect to FIRSTPRIVATE et al.
6078
The @var{NUM_THREADS} argument is 1 if an IF clause is present
and evaluates to false, or the value of the NUM_THREADS clause, if
6081present, or 0.
6082
6083The function needs to create the appropriate number of
6084threads and/or launch them from the dock. It needs to
6085create the team structure and assign team ids.
6086
6087@smallexample
6088 void GOMP_parallel_end (void)
6089@end smallexample
6090
6091Tears down the team and returns us to the previous @code{omp_in_parallel()} state.
6092
6093
6094
6095@node Implementing FOR construct
6096@section Implementing FOR construct
6097
6098@smallexample
6099 #pragma omp parallel for
6100 for (i = lb; i <= ub; i++)
6101 body;
6102@end smallexample
6103
6104becomes
6105
6106@smallexample
6107 void subfunction (void *data)
6108 @{
6109 long _s0, _e0;
6110 while (GOMP_loop_static_next (&_s0, &_e0))
6111 @{
6112 long _e1 = _e0, i;
6113 for (i = _s0; i < _e1; i++)
6114 body;
6115 @}
6116 GOMP_loop_end_nowait ();
6117 @}
6118
6119 GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
6120 subfunction (NULL);
6121 GOMP_parallel_end ();
6122@end smallexample
6123
6124@smallexample
6125 #pragma omp for schedule(runtime)
6126 for (i = 0; i < n; i++)
6127 body;
6128@end smallexample
6129
6130becomes
6131
6132@smallexample
6133 @{
6134 long i, _s0, _e0;
6135 if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
6136 do @{
6137 long _e1 = _e0;
        for (i = _s0; i < _e1; i++)
6139 body;
      @} while (GOMP_loop_runtime_next (&_s0, &_e0));
6141 GOMP_loop_end ();
6142 @}
6143@end smallexample
6144
6145Note that while it looks like there is trickiness to propagating
6146a non-constant STEP, there isn't really. We're explicitly allowed
6147to evaluate it as many times as we want, and any variables involved
6148should automatically be handled as PRIVATE or SHARED like any other
6149variables. So the expression should remain evaluable in the
subfunction.  We can also pull it into a local variable if we like,
but since it is supposed to remain unchanged, we do not have to.
6152
6153If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
6154able to get away with no work-sharing context at all, since we can
6155simply perform the arithmetic directly in each thread to divide up
6156the iterations. Which would mean that we wouldn't need to call any
6157of these routines.
6158
6159There are separate routines for handling loops with an ORDERED
6160clause. Bookkeeping for that is non-trivial...
6161
6162
6163
6164@node Implementing ORDERED construct
6165@section Implementing ORDERED construct
6166
6167@smallexample
6168 void GOMP_ordered_start (void)
6169 void GOMP_ordered_end (void)
6170@end smallexample
6171
6172
6173
6174@node Implementing SECTIONS construct
6175@section Implementing SECTIONS construct
6176
A block such as
6178
6179@smallexample
6180 #pragma omp sections
6181 @{
6182 #pragma omp section
6183 stmt1;
6184 #pragma omp section
6185 stmt2;
6186 #pragma omp section
6187 stmt3;
6188 @}
6189@end smallexample
6190
6191becomes
6192
6193@smallexample
6194 for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
6195 switch (i)
6196 @{
6197 case 1:
6198 stmt1;
6199 break;
6200 case 2:
6201 stmt2;
6202 break;
6203 case 3:
6204 stmt3;
6205 break;
6206 @}
6207 GOMP_barrier ();
6208@end smallexample
6209
6210
6211@node Implementing SINGLE construct
6212@section Implementing SINGLE construct
6213
6214A block like
6215
6216@smallexample
6217 #pragma omp single
6218 @{
6219 body;
6220 @}
6221@end smallexample
6222
6223becomes
6224
6225@smallexample
6226 if (GOMP_single_start ())
6227 body;
6228 GOMP_barrier ();
6229@end smallexample
6230
6231while
6232
6233@smallexample
6234 #pragma omp single copyprivate(x)
6235 body;
6236@end smallexample
6237
6238becomes
6239
6240@smallexample
6241 datap = GOMP_single_copy_start ();
6242 if (datap == NULL)
6243 @{
6244 body;
6245 data.x = x;
6246 GOMP_single_copy_end (&data);
6247 @}
6248 else
6249 x = datap->x;
6250 GOMP_barrier ();
6251@end smallexample
6252
6253
6254
6255@node Implementing OpenACC's PARALLEL construct
6256@section Implementing OpenACC's PARALLEL construct
6257
6258@smallexample
6259 void GOACC_parallel ()
6260@end smallexample
6261
6262
6263
6264@c ---------------------------------------------------------------------
6265@c Reporting Bugs
6266@c ---------------------------------------------------------------------
6267
6268@node Reporting Bugs
6269@chapter Reporting Bugs
6270
6271Bugs in the GNU Offloading and Multi Processing Runtime Library should
6272be reported via @uref{https://gcc.gnu.org/bugzilla/, Bugzilla}. Please add
6273"openacc", or "openmp", or both to the keywords field in the bug
6274report, as appropriate.
6275
6276
6277
6278@c ---------------------------------------------------------------------
6279@c GNU General Public License
6280@c ---------------------------------------------------------------------
6281
6282@include gpl_v3.texi
6283
6284
6285
6286@c ---------------------------------------------------------------------
6287@c GNU Free Documentation License
6288@c ---------------------------------------------------------------------
6289
6290@include fdl.texi
6291
6292
6293
6294@c ---------------------------------------------------------------------
6295@c Funding Free Software
6296@c ---------------------------------------------------------------------
6297
6298@include funding.texi
6299
6300@c ---------------------------------------------------------------------
6301@c Index
6302@c ---------------------------------------------------------------------
6303
6304@node Library Index
6305@unnumbered Library Index
6306
6307@printindex cp
6308
6309@bye