\input texinfo @c -*-texinfo-*-

@c %**start of header
@setfilename libgomp.info
@settitle GNU libgomp
@c %**end of header


@copying
Copyright @copyright{} 2006-2023 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Funding Free Software'', the Front-Cover
texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below).  A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software.  Copies published by the Free Software Foundation raise
     funds for GNU development.
@end copying

@ifinfo
@dircategory GNU Libraries
@direntry
* libgomp: (libgomp).          GNU Offloading and Multi Processing Runtime Library.
@end direntry

This manual documents libgomp, the GNU Offloading and Multi Processing
Runtime library.  This is the GNU implementation of the OpenMP and
OpenACC APIs for parallel and accelerator programming in C/C++ and
Fortran.

Published by the Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110-1301 USA

@insertcopying
@end ifinfo


@setchapternewpage odd

@titlepage
@title GNU Offloading and Multi Processing Runtime Library
@subtitle The GNU OpenMP and OpenACC Implementation
@page
@vskip 0pt plus 1filll
@comment For the @value{version-GCC} Version*
@sp 1
Published by the Free Software Foundation @*
51 Franklin Street, Fifth Floor@*
Boston, MA 02110-1301, USA@*
@sp 1
@insertcopying
@end titlepage

@summarycontents
@contents
@page


@node Top, Enabling OpenMP
@top Introduction
@cindex Introduction

This manual documents the usage of libgomp, the GNU Offloading and
Multi Processing Runtime Library.  This includes the GNU
implementation of the @uref{https://www.openmp.org, OpenMP} Application
Programming Interface (API) for multi-platform shared-memory parallel
programming in C/C++ and Fortran, and the GNU implementation of the
@uref{https://www.openacc.org, OpenACC} Application Programming
Interface (API) for offloading of code to accelerator devices in C/C++
and Fortran.

Originally, libgomp implemented the GNU OpenMP Runtime Library.  Based
on this, support for OpenACC and offloading (both OpenACC and OpenMP
4's target construct) was added later, and the library's name was
changed to GNU Offloading and Multi Processing Runtime Library.



@comment
@comment  When you add a new menu item, please keep the right hand
@comment  aligned to the same column.  Do not use tabs.  This provides
@comment  better formatting.
@comment
@menu
* Enabling OpenMP::                How to enable OpenMP for your applications.
* OpenMP Implementation Status::   List of implemented features by OpenMP version
* OpenMP Runtime Library Routines: Runtime Library Routines.
                                   The OpenMP runtime application programming
                                   interface.
* OpenMP Environment Variables: Environment Variables.
                                   Influencing OpenMP runtime behavior with
                                   environment variables.
* Enabling OpenACC::               How to enable OpenACC for your
                                   applications.
* OpenACC Runtime Library Routines:: The OpenACC runtime application
                                   programming interface.
* OpenACC Environment Variables::  Influencing OpenACC runtime behavior with
                                   environment variables.
* CUDA Streams Usage::             Notes on the implementation of
                                   asynchronous operations.
* OpenACC Library Interoperability:: OpenACC library interoperability with the
                                   NVIDIA CUBLAS library.
* OpenACC Profiling Interface::
* OpenMP-Implementation Specifics:: Notes on specifics of this OpenMP
                                   implementation
* Offload-Target Specifics::       Notes on offload-target specific internals
* The libgomp ABI::                Notes on the external ABI presented by libgomp.
* Reporting Bugs::                 How to report bugs in the GNU Offloading and
                                   Multi Processing Runtime Library.
* Copying::                        GNU General Public License says
                                   how you can copy and share libgomp.
* GNU Free Documentation License::
                                   How you can copy and share this manual.
* Funding::                        How to help assure continued work for free
                                   software.
* Library Index::                  Index of this documentation.
@end menu


@c ---------------------------------------------------------------------
@c Enabling OpenMP
@c ---------------------------------------------------------------------

@node Enabling OpenMP
@chapter Enabling OpenMP

To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
flag @option{-fopenmp} must be specified.  For C and C++, this enables
the handling of the OpenMP directives using @code{#pragma omp} and the
@code{[[omp::directive(...)]]}, @code{[[omp::sequence(...)]]} and
@code{[[omp::decl(...)]]} attributes.  For Fortran, it enables, for
free source form, the @code{!$omp} sentinel for directives and the
@code{!$} conditional compilation sentinel and, for fixed source form, the
@code{c$omp}, @code{*$omp} and @code{!$omp} sentinels for directives and
the @code{c$}, @code{*$} and @code{!$} conditional compilation sentinels.
The flag also arranges for automatic linking of the OpenMP runtime library
(@ref{Runtime Library Routines}).

The @option{-fopenmp-simd} flag can be used to enable a subset of
OpenMP directives that do not require the linking of either the
OpenMP runtime library or the POSIX threads library.

A complete description of all OpenMP directives may be found in the
@uref{https://www.openmp.org, OpenMP Application Program Interface} manuals.
See also @ref{OpenMP Implementation Status}.

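As a minimal sketch of what @option{-fopenmp} changes, the following C
function contains a single @code{#pragma omp} directive.  The name
@code{sum_array} is hypothetical and chosen only for illustration.  When
the file is compiled with @option{-fopenmp}, the loop iterations are
divided among the threads of a team; without the flag the pragma is
ignored and the loop runs sequentially, so the result should be the same
either way.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the first N elements of A.  The reduction clause gives every
   thread a private copy of SUM and combines the copies at the end of
   the loop.  Without -fopenmp the pragma has no effect.  */
long
sum_array (const int *a, size_t n)
{
  long sum = 0;

  assert (a != NULL || n == 0);
#pragma omp parallel for reduction(+ : sum)
  for (size_t i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
```

Because the directive is a pragma, the same source file remains valid C
for compilers or builds that do not enable OpenMP.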

@c ---------------------------------------------------------------------
@c OpenMP Implementation Status
@c ---------------------------------------------------------------------

@node OpenMP Implementation Status
@chapter OpenMP Implementation Status

@menu
* OpenMP 4.5:: Feature completion status to 4.5 specification
* OpenMP 5.0:: Feature completion status to 5.0 specification
* OpenMP 5.1:: Feature completion status to 5.1 specification
* OpenMP 5.2:: Feature completion status to 5.2 specification
* OpenMP Technical Report 12:: Feature completion status to second 6.0 preview
@end menu

The @code{_OPENMP} preprocessor macro and Fortran's @code{openmp_version}
parameter, provided by @code{omp_lib.h} and the @code{omp_lib} module, have
the value @code{201511} (i.e. OpenMP 4.5).

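The following sketch shows one way a C program might inspect the
@code{_OPENMP} macro.  The function names @code{openmp_macro_value} and
@code{openmp_version_name} are hypothetical, and the lookup table is
deliberately short: it only covers the specification versions discussed
in this chapter, using the @code{yyyymm} dates that those versions
define for the macro.

```c
#include <assert.h>
#include <stddef.h>

/* Value of the _OPENMP macro, or 0 when the file is compiled without
   OpenMP support and the macro is therefore not defined.  */
long
openmp_macro_value (void)
{
#ifdef _OPENMP
  return _OPENMP;
#else
  return 0;
#endif
}

/* Map a yyyymm value to the specification version it denotes.
   Returns NULL for values that are not in this short table.  */
const char *
openmp_version_name (long yyyymm)
{
  assert (yyyymm >= 0);
  switch (yyyymm)
    {
    case 201511: return "4.5";
    case 201811: return "5.0";
    case 202011: return "5.1";
    case 202111: return "5.2";
    default:     return NULL;
    }
}
```

With the value documented above, @code{openmp_version_name
(openmp_macro_value ())} would be expected to yield @code{"4.5"} in a
build that uses @option{-fopenmp}, and @code{NULL} in a build that does
not.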
@node OpenMP 4.5
@section OpenMP 4.5

The OpenMP 4.5 specification is fully supported.

@node OpenMP 5.0
@section OpenMP 5.0

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
@c This list is sorted as in OpenMP 5.1's B.3 not as in OpenMP 5.0's B.2

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Array shaping @tab N @tab
@item Array sections with non-unit strides in C and C++ @tab N @tab
@item Iterators @tab Y @tab
@item @code{metadirective} directive @tab N @tab
@item @code{declare variant} directive
      @tab P @tab @emph{simd} traits not handled correctly
@item @var{target-offload-var} ICV and @code{OMP_TARGET_OFFLOAD}
      env variable @tab Y @tab
@item Nested-parallel changes to @var{max-active-levels-var} ICV @tab Y @tab
@item @code{requires} directive @tab P
      @tab complete but no non-host device provides @code{unified_shared_memory}
@item @code{teams} construct outside an enclosing target region @tab Y @tab
@item Non-rectangular loop nests @tab P
      @tab Full support for C/C++, partial for Fortran
           (@uref{https://gcc.gnu.org/PR110735,PR110735})
@item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
@item @code{nonmonotonic} as default loop schedule modifier for worksharing-loop
      constructs @tab Y @tab
@item Collapse of associated loops that are imperfectly nested loops @tab Y @tab
@item Clauses @code{if}, @code{nontemporal} and @code{order(concurrent)} in
      @code{simd} construct @tab Y @tab
@item @code{atomic} constructs in @code{simd} @tab Y @tab
@item @code{loop} construct @tab Y @tab
@item @code{order(concurrent)} clause @tab Y @tab
@item @code{scan} directive and @code{in_scan} modifier for the
      @code{reduction} clause @tab Y @tab
@item @code{in_reduction} clause on @code{task} constructs @tab Y @tab
@item @code{in_reduction} clause on @code{target} constructs @tab P
      @tab @code{nowait} only stub
@item @code{task_reduction} clause with @code{taskgroup} @tab Y @tab
@item @code{task} modifier to @code{reduction} clause @tab Y @tab
@item @code{affinity} clause to @code{task} construct @tab Y @tab Stub only
@item @code{detach} clause to @code{task} construct @tab Y @tab
@item @code{omp_fulfill_event} runtime routine @tab Y @tab
@item @code{reduction} and @code{in_reduction} clauses on @code{taskloop}
      and @code{taskloop simd} constructs @tab Y @tab
@item @code{taskloop} construct cancelable by @code{cancel} construct
      @tab Y @tab
@item @code{mutexinoutset} @emph{dependence-type} for @code{depend} clause
      @tab Y @tab
@item Predefined memory spaces, memory allocators, allocator traits
      @tab Y @tab See also @ref{Memory allocation}
@item Memory management routines @tab Y @tab
@item @code{allocate} directive @tab P @tab Only C and Fortran, only stack variables
@item @code{allocate} clause @tab P @tab Initial support
@item @code{use_device_addr} clause on @code{target data} @tab Y @tab
@item @code{ancestor} modifier on @code{device} clause @tab Y @tab
@item Implicit declare target directive @tab Y @tab
@item Discontiguous array section with @code{target update} construct
      @tab N @tab
@item C/C++'s lvalue expressions in @code{to}, @code{from}
      and @code{map} clauses @tab N @tab
@item C/C++'s lvalue expressions in @code{depend} clauses @tab Y @tab
@item Nested @code{declare target} directive @tab Y @tab
@item Combined @code{master} constructs @tab Y @tab
@item @code{depend} clause on @code{taskwait} @tab Y @tab
@item Weak memory ordering clauses on @code{atomic} and @code{flush} construct
      @tab Y @tab
@item @code{hint} clause on the @code{atomic} construct @tab Y @tab Stub only
@item @code{depobj} construct and depend objects @tab Y @tab
@item Lock hints were renamed to synchronization hints @tab Y @tab
@item @code{conditional} modifier to @code{lastprivate} clause @tab Y @tab
@item Map-order clarifications @tab P @tab
@item @code{close} @emph{map-type-modifier} @tab Y @tab
@item Mapping C/C++ pointer variables and to assign the address of
      device memory mapped by an array section @tab P @tab
@item Mapping of Fortran pointer and allocatable variables, including pointer
      and allocatable components of variables
      @tab P @tab Mapping of vars with allocatable components unsupported
@item @code{defaultmap} extensions @tab Y @tab
@item @code{declare mapper} directive @tab N @tab
@item @code{omp_get_supported_active_levels} routine @tab Y @tab
@item Runtime routines and environment variables to display runtime thread
      affinity information @tab Y @tab
@item @code{omp_pause_resource} and @code{omp_pause_resource_all} runtime
      routines @tab Y @tab
@item @code{omp_get_device_num} runtime routine @tab Y @tab
@item OMPT interface @tab N @tab
@item OMPD interface @tab N @tab
@end multitable

@unnumberedsubsec Other new OpenMP 5.0 features

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Supporting C++'s range-based for loop @tab Y @tab
@end multitable


@node OpenMP 5.1
@section OpenMP 5.1

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item OpenMP directive as C++ attribute specifiers @tab Y @tab
@item @code{omp_all_memory} reserved locator @tab Y @tab
@item @emph{target_device trait} in OpenMP Context @tab N @tab
@item @code{target_device} selector set in context selectors @tab N @tab
@item C/C++'s @code{declare variant} directive: elision support of
      preprocessed code @tab N @tab
@item @code{declare variant}: new clauses @code{adjust_args} and
      @code{append_args} @tab N @tab
@item @code{dispatch} construct @tab N @tab
@item device-specific ICV settings with environment variables @tab Y @tab
@item @code{assume} and @code{assumes} directives @tab Y @tab
@item @code{nothing} directive @tab Y @tab
@item @code{error} directive @tab Y @tab
@item @code{masked} construct @tab Y @tab
@item @code{scope} directive @tab Y @tab
@item Loop transformation constructs @tab N @tab
@item @code{strict} modifier in the @code{grainsize} and @code{num_tasks}
      clauses of the @code{taskloop} construct @tab Y @tab
@item @code{align} clause in @code{allocate} directive @tab P
      @tab Only C and Fortran (and only stack variables)
@item @code{align} modifier in @code{allocate} clause @tab Y @tab
@item @code{thread_limit} clause to @code{target} construct @tab Y @tab
@item @code{has_device_addr} clause to @code{target} construct @tab Y @tab
@item Iterators in @code{target update} motion clauses and @code{map}
      clauses @tab N @tab
@item Indirect calls to the device version of a procedure or function in
      @code{target} regions @tab P @tab Only C and C++
@item @code{interop} directive @tab N @tab
@item @code{omp_interop_t} object support in runtime routines @tab N @tab
@item @code{nowait} clause in @code{taskwait} directive @tab Y @tab
@item Extensions to the @code{atomic} directive @tab Y @tab
@item @code{seq_cst} clause on a @code{flush} construct @tab Y @tab
@item @code{inoutset} argument to the @code{depend} clause @tab Y @tab
@item @code{private} and @code{firstprivate} argument to @code{default}
      clause in C and C++ @tab Y @tab
@item @code{present} argument to @code{defaultmap} clause @tab Y @tab
@item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
      @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
      routines @tab Y @tab
@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
@item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
      runtime routines @tab Y @tab
@item @code{omp_get_mapped_ptr} runtime routine @tab Y @tab
@item @code{omp_calloc}, @code{omp_realloc}, @code{omp_aligned_alloc} and
      @code{omp_aligned_calloc} runtime routines @tab Y @tab
@item @code{omp_alloctrait_key_t} enum: @code{omp_atv_serialized} added,
      @code{omp_atv_default} changed @tab Y @tab
@item @code{omp_display_env} runtime routine @tab Y @tab
@item @code{ompt_scope_endpoint_t} enum: @code{ompt_scope_beginend} @tab N @tab
@item @code{ompt_sync_region_t} enum additions @tab N @tab
@item @code{ompt_state_t} enum: @code{ompt_state_wait_barrier_implementation}
      and @code{ompt_state_wait_barrier_teams} @tab N @tab
@item @code{ompt_callback_target_data_op_emi_t},
      @code{ompt_callback_target_emi_t}, @code{ompt_callback_target_map_emi_t}
      and @code{ompt_callback_target_submit_emi_t} @tab N @tab
@item @code{ompt_callback_error_t} type @tab N @tab
@item @code{OMP_PLACES} syntax extensions @tab Y @tab
@item @code{OMP_NUM_TEAMS} and @code{OMP_TEAMS_THREAD_LIMIT} environment
      variables @tab Y @tab
@end multitable

@unnumberedsubsec Other new OpenMP 5.1 features

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Support of strictly structured blocks in Fortran @tab Y @tab
@item Support of structured block sequences in C/C++ @tab Y @tab
@item @code{unconstrained} and @code{reproducible} modifiers on @code{order}
      clause @tab Y @tab
@item Support @code{begin/end declare target} syntax in C/C++ @tab Y @tab
@item Pointer predetermined firstprivate getting initialized
      to address of matching mapped list item per 5.1, Sect. 2.21.7.2 @tab N @tab
@item For Fortran, diagnose placing declarative before/between @code{USE},
      @code{IMPORT}, and @code{IMPLICIT} as invalid @tab N @tab
@item Optional comma between directive and clause in the @code{#pragma} form @tab Y @tab
@item @code{indirect} clause in @code{declare target} @tab P @tab Only C and C++
@item @code{device_type(nohost)}/@code{device_type(host)} for variables @tab N @tab
@item @code{present} modifier to the @code{map}, @code{to} and @code{from}
      clauses @tab Y @tab
@end multitable


@node OpenMP 5.2
@section OpenMP 5.2

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item @code{omp_in_explicit_task} routine and @var{explicit-task-var} ICV
      @tab Y @tab
@item @code{omp}/@code{ompx}/@code{omx} sentinels and @code{omp_}/@code{ompx_}
      namespaces @tab N/A
      @tab warning for @code{ompx/omx} sentinels@footnote{The @code{ompx}
      sentinel as C/C++ pragma and C++ attributes are warned for with
      @code{-Wunknown-pragmas} (implied by @code{-Wall}) and @code{-Wattributes}
      (enabled by default), respectively; for Fortran free-source code, there is
      a warning enabled by default and, for fixed-source code, the @code{omx}
      sentinel is warned for with @code{-Wsurprising} (enabled by
      @code{-Wall}).  Unknown clauses are always rejected with an error.}
@item Clauses on @code{end} directive can be on directive @tab Y @tab
@item @code{destroy} clause with destroy-var argument on @code{depobj}
      @tab N @tab
@item Deprecation of no-argument @code{destroy} clause on @code{depobj}
      @tab N @tab
@item @code{linear} clause syntax changes and @code{step} modifier @tab Y @tab
@item Deprecation of minus operator for reductions @tab N @tab
@item Deprecation of separating @code{map} modifiers without comma @tab N @tab
@item @code{declare mapper} with iterator and @code{present} modifiers
      @tab N @tab
@item If a matching mapped list item is not found in the data environment, the
      pointer retains its original value @tab Y @tab
@item New @code{enter} clause as alias for @code{to} on declare target directive
      @tab Y @tab
@item Deprecation of @code{to} clause on declare target directive @tab N @tab
@item Extended list of directives permitted in Fortran pure procedures
      @tab Y @tab
@item New @code{allocators} directive for Fortran @tab N @tab
@item Deprecation of @code{allocate} directive for Fortran
      allocatables/pointers @tab N @tab
@item Optional paired @code{end} directive with @code{dispatch} @tab N @tab
@item New @code{memspace} and @code{traits} modifiers for @code{uses_allocators}
      @tab N @tab
@item Deprecation of traits array following the allocator_handle expression in
      @code{uses_allocators} @tab N @tab
@item New @code{otherwise} clause as alias for @code{default} on metadirectives
      @tab N @tab
@item Deprecation of @code{default} clause on metadirectives @tab N @tab
@item Deprecation of delimited form of @code{declare target} @tab N @tab
@item Reproducible semantics changed for @code{order(concurrent)} @tab N @tab
@item @code{allocate} and @code{firstprivate} clauses on @code{scope}
      @tab Y @tab
@item @code{ompt_callback_work} @tab N @tab
@item Default map-type for the @code{map} clause in @code{target enter/exit data}
      @tab Y @tab
@item New @code{doacross} clause as alias for @code{depend} with
      @code{source}/@code{sink} modifier @tab Y @tab
@item Deprecation of @code{depend} with @code{source}/@code{sink} modifier
      @tab N @tab
@item @code{omp_cur_iteration} keyword @tab Y @tab
@end multitable

@unnumberedsubsec Other new OpenMP 5.2 features

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item For Fortran, optional comma between directive and clause @tab N @tab
@item Conforming device numbers and @code{omp_initial_device} and
      @code{omp_invalid_device} enum/PARAMETER @tab Y @tab
@item Initial value of @var{default-device-var} ICV with
      @code{OMP_TARGET_OFFLOAD=mandatory} @tab Y @tab
@item @code{all} as @emph{implicit-behavior} for @code{defaultmap} @tab Y @tab
@item @emph{interop_types} in any position of the modifier list for the @code{init} clause
      of the @code{interop} construct @tab N @tab
@item Invoke virtual member functions of C++ objects created on the host device
      on other devices @tab N @tab
@end multitable


@node OpenMP Technical Report 12
@section OpenMP Technical Report 12

Technical Report (TR) 12 is the second preview for OpenMP 6.0.

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Features deprecated in versions 5.2, 5.1 and 5.0 were removed
      @tab N/A @tab Backward compatibility
@item Full support for C23 was added @tab P @tab
@item Full support for C++23 was added @tab P @tab
@item @code{_ALL} suffix to the device-scope environment variables
      @tab P @tab Host device number wrongly accepted
@item @code{num_threads} now accepts a list @tab N @tab
@item Supporting increments with abstract names in @code{OMP_PLACES} @tab N @tab
@item Extension of @code{OMP_DEFAULT_DEVICE} and new
      @code{OMP_AVAILABLE_DEVICES} environment vars @tab N @tab
@item New @code{OMP_THREADS_RESERVE} environment variable @tab N @tab
@item The @code{decl} attribute was added to the C++ attribute syntax
      @tab Y @tab
@item The OpenMP directive syntax was extended to include C23 attribute
      specifiers @tab Y @tab
@item All inarguable clauses now take an optional Boolean argument @tab N @tab
@item For Fortran, @emph{locator list} can also be a function reference with
      data pointer result @tab N @tab
@item Concept of @emph{assumed-size arrays} in C and C++
      @tab N @tab
@item @emph{directive-name-modifier} accepted in all clauses @tab N @tab
@item For Fortran, atomic with BLOCK construct and, for C/C++, with
      unlimited curly braces supported @tab N @tab
@item For Fortran, atomic compare with storing the comparison result
      @tab N @tab
@item New @code{looprange} clause @tab N @tab
@item Ref-count change for @code{use_device_ptr}/@code{use_device_addr}
      @tab N @tab
@item Support for inductions @tab N @tab
@item Implicit reduction identifiers of C++ classes
      @tab N @tab
@item Change of the @emph{map-type} property from @emph{ultimate} to
      @emph{default} @tab N @tab
@item @code{self} modifier to @code{map} and @code{self} as
      @code{defaultmap} argument @tab N @tab
@item Mapping of @emph{assumed-size arrays} in C, C++ and Fortran
      @tab N @tab
@item @code{groupprivate} directive @tab N @tab
@item @code{local} clause to @code{declare target} directive @tab N @tab
@item @code{part_size} allocator trait @tab N @tab
@item @code{pin_device}, @code{preferred_device} and @code{target_access}
      allocator traits
      @tab N @tab
@item @code{access} allocator trait changes @tab N @tab
@item Extension of @code{interop} operation of @code{append_args}, allowing all
      modifiers of the @code{init} clause
      @tab N @tab
@item @code{interop} clause to @code{dispatch} @tab N @tab
@item @code{message} and @code{severity} clauses to @code{parallel} directive
      @tab N @tab
@item @code{self} clause to @code{requires} directive @tab N @tab
@item @code{no_openmp_constructs} assumptions clause @tab N @tab
@item @code{reverse} loop-transformation construct @tab N @tab
@item @code{interchange} loop-transformation construct @tab N @tab
@item @code{fuse} loop-transformation construct @tab N @tab
@item @code{apply} code to loop-transforming constructs @tab N @tab
@item @code{omp_curr_progress_width} identifier @tab N @tab
@item @code{safesync} clause to the @code{parallel} construct @tab N @tab
@item @code{omp_get_max_progress_width} runtime routine @tab N @tab
@item @code{strict} modifier keyword to @code{num_threads} @tab N @tab
@item @code{atomic} permitted in a construct with @code{order(concurrent)}
      @tab N @tab
@item @code{coexecute} directive for Fortran @tab N @tab
@item Fortran DO CONCURRENT as associated loop in a @code{loop} construct
      @tab N @tab
@item @code{threadset} clause in task-generating constructs @tab N @tab
@item @code{nowait} clause with reverse-offload @code{target} directives
      @tab N @tab
@item Boolean argument to @code{nowait} and @code{nogroup} may be non-constant
      @tab N @tab
@item @code{memscope} clause to @code{atomic} and @code{flush} @tab N @tab
@item @code{omp_is_free_agent} and @code{omp_ancestor_is_free_agent} routines
      @tab N @tab
@item @code{omp_target_memset} and @code{omp_target_memset_rect_async} routines
      @tab N @tab
@item Routines for obtaining memory spaces/allocators for shared/device memory
      @tab N @tab
@item @code{omp_get_memspace_num_resources} routine @tab N @tab
@item @code{omp_get_submemspace} routine @tab N @tab
@item @code{ompt_target_data_transfer} and @code{ompt_target_data_transfer_async}
      values in @code{ompt_target_data_op_t} enum @tab N @tab
@item @code{ompt_get_buffer_limits} OMPT routine @tab N @tab
@end multitable

@unnumberedsubsec Other new TR 12 features
@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Relaxed Fortran restrictions to the @code{aligned} clause @tab N @tab
@item Mapping lambda captures @tab N @tab
@item New @code{omp_pause_stop_tool} constant for @code{omp_pause_resource} @tab N @tab
@end multitable



@c ---------------------------------------------------------------------
@c OpenMP Runtime Library Routines
@c ---------------------------------------------------------------------

@node Runtime Library Routines
@chapter OpenMP Runtime Library Routines

The runtime routines described here are defined by Section 18 of the OpenMP
specification in version 5.2.

@menu
* Thread Team Routines::
* Thread Affinity Routines::
* Teams Region Routines::
* Tasking Routines::
@c * Resource Relinquishing Routines::
* Device Information Routines::
* Device Memory Routines::
* Lock Routines::
* Timing Routines::
* Event Routine::
@c * Interoperability Routines::
* Memory Management Routines::
@c * Tool Control Routine::
@c * Environment Display Routine::
@end menu



@node Thread Team Routines
@section Thread Team Routines

Routines controlling threads in the current contention group.
They have C linkage and do not throw exceptions.

@menu
* omp_set_num_threads::         Set upper team size limit
* omp_get_num_threads::         Size of the active team
* omp_get_max_threads::         Maximum number of threads of parallel region
* omp_get_thread_num::          Current thread ID
* omp_in_parallel::             Whether a parallel region is active
* omp_set_dynamic::             Enable/disable dynamic teams
* omp_get_dynamic::             Dynamic teams setting
* omp_get_cancellation::        Whether cancellation support is enabled
* omp_set_nested::              Enable/disable nested parallel regions
* omp_get_nested::              Nested parallel regions
* omp_set_schedule::            Set the runtime scheduling method
* omp_get_schedule::            Obtain the runtime scheduling method
* omp_get_teams_thread_limit::  Maximum number of threads imposed by teams
* omp_get_supported_active_levels:: Maximum number of active regions supported
* omp_set_max_active_levels::   Limits the number of active parallel regions
* omp_get_max_active_levels::   Current maximum number of active regions
* omp_get_level::               Number of parallel regions
* omp_get_ancestor_thread_num:: Ancestor thread ID
* omp_get_team_size::           Number of threads in a team
* omp_get_active_level::        Number of active parallel regions
@end menu



@node omp_set_num_threads
@subsection @code{omp_set_num_threads} -- Set upper team size limit
@table @asis
@item @emph{Description}:
Specifies the number of threads used by default in subsequent parallel
regions, if those do not specify a @code{num_threads} clause.  The
argument of @code{omp_set_num_threads} shall be a positive integer.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_set_num_threads(int num_threads);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(num_threads)}
@item                   @tab @code{integer, intent(in) :: num_threads}
@end multitable

@item @emph{See also}:
@ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.1.
@end table

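A minimal usage sketch for @code{omp_set_num_threads} follows.  The
helper name @code{request_team_size} is hypothetical, and the
@code{#else} branch provides sequential stand-ins for the two runtime
routines; those stand-ins are an assumption of this sketch, not part of
libgomp, and exist only so that the file also builds without
@option{-fopenmp}.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Sequential stand-ins, used only when OpenMP is not enabled.  */
static void omp_set_num_threads (int num_threads) { (void) num_threads; }
static int omp_get_max_threads (void) { return 1; }
#endif

/* Request WANTED threads for subsequent parallel regions that have no
   num_threads clause, then report the upper bound that the runtime
   now advertises for such regions.  */
int
request_team_size (int wanted)
{
  assert (wanted > 0);   /* The argument shall be a positive integer.  */
  omp_set_num_threads (wanted);
  return omp_get_max_threads ();
}
```

In an OpenMP build the returned value is typically the value just
requested; the number of threads actually used by a later parallel
region may still be smaller, for instance when dynamic adjustment is
enabled.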
633
506f068e
TB
634
635@node omp_get_num_threads
636@subsection @code{omp_get_num_threads} -- Size of the active team
d77de738
ML
637@table @asis
638@item @emph{Description}:
506f068e
TB
639Returns the number of threads in the current team. In a sequential section of
640the program @code{omp_get_num_threads} returns 1.
d77de738 641
506f068e
TB
642The default team size may be initialized at startup by the
643@env{OMP_NUM_THREADS} environment variable. At runtime, the size
644of the current team may be set either by the @code{NUM_THREADS}
645clause or by @code{omp_set_num_threads}. If none of the above were
646used to define a specific value and @env{OMP_DYNAMIC} is disabled,
647one thread per CPU online is used.
648
649@item @emph{C/C++}:
d77de738 650@multitable @columnfractions .20 .80
506f068e 651@item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);}
d77de738
ML
652@end multitable
653
654@item @emph{Fortran}:
655@multitable @columnfractions .20 .80
506f068e 656@item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
d77de738
ML
657@end multitable
658
659@item @emph{See also}:
506f068e 660@ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}
d77de738
ML
661
662@item @emph{Reference}:
506f068e 663@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.2.
d77de738
ML
664@end table
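
The interplay between @code{omp_set_num_threads} and
@code{omp_get_num_threads} can be sketched as follows. This is a minimal
illustration, not part of libgomp: the helper name
@code{observed_team_size} is hypothetical, and the @code{#else} branch
holds serial stand-ins that exist only so the sketch also builds without
@option{-fopenmp}.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption, not libgomp code): one thread only. */
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Request `requested` threads for subsequent parallel regions and
   report the size of the team that was actually formed. */
int observed_team_size(int requested)
{
    int team = 0;

    assert(requested > 0);          /* the argument shall be positive */
    omp_set_num_threads(requested);

    #pragma omp parallel
    {
        /* One thread records the team size; all threads see the same value. */
        #pragma omp single
        team = omp_get_num_threads();
    }
    return team;
}
```

The runtime may form a smaller team than requested, so callers should
treat the requested value as an upper bound rather than a guarantee.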
665
666
667
506f068e
TB
668@node omp_get_max_threads
669@subsection @code{omp_get_max_threads} -- Maximum number of threads of parallel region
d77de738
ML
670@table @asis
671@item @emph{Description}:
506f068e
TB
672Return an upper bound on the number of threads that would be used to form
673a new team for a parallel region that does not use the clause @code{num_threads}.
d77de738 674
506f068e 675@item @emph{C/C++}:
d77de738 676@multitable @columnfractions .20 .80
506f068e 677@item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);}
d77de738
ML
678@end multitable
679
680@item @emph{Fortran}:
681@multitable @columnfractions .20 .80
506f068e 682@item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
d77de738
ML
683@end multitable
684
685@item @emph{See also}:
506f068e 686@ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit}
d77de738
ML
687
688@item @emph{Reference}:
506f068e 689@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.3.
d77de738
ML
690@end table
691
692
693
506f068e
TB
694@node omp_get_thread_num
695@subsection @code{omp_get_thread_num} -- Current thread ID
d77de738
ML
696@table @asis
697@item @emph{Description}:
506f068e
TB
698Returns a unique thread identification number within the current team.
699In sequential parts of the program, @code{omp_get_thread_num}
700always returns 0. In parallel regions the return value varies
701from 0 to @code{omp_get_num_threads}-1 inclusive. The return
702value of the primary thread of a team is always 0.
d77de738
ML
703
704@item @emph{C/C++}:
705@multitable @columnfractions .20 .80
506f068e 706@item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);}
d77de738
ML
707@end multitable
708
709@item @emph{Fortran}:
710@multitable @columnfractions .20 .80
506f068e 711@item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
d77de738
ML
712@end multitable
713
714@item @emph{See also}:
506f068e 715@ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num}
d77de738
ML
716
717@item @emph{Reference}:
506f068e 718@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.4.
d77de738
ML
719@end table
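
The numbering rule described above (IDs run from 0 to
@code{omp_get_num_threads}-1 and are unique within a team) can be checked
with a short sketch. The names @code{MAX_TEAM} and
@code{thread_ids_are_unique} are hypothetical, and the @code{#else} branch
contains serial stand-ins so that the code also builds without
@option{-fopenmp}.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption, not libgomp code). */
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

#define MAX_TEAM 4   /* illustrative upper bound on the team size */

/* Returns 1 if every ID in 0..team-1 was reported exactly once. */
int thread_ids_are_unique(void)
{
    int seen[MAX_TEAM] = {0};
    int team = 1;

    #pragma omp parallel num_threads(MAX_TEAM)
    {
        int id = omp_get_thread_num();

        assert(id >= 0 && id < MAX_TEAM);
        seen[id]++;                          /* each thread touches its own slot */
        if (id == 0)                         /* the primary thread has ID 0 */
            team = omp_get_num_threads();
    }

    for (int i = 0; i < MAX_TEAM; i++)
        if (seen[i] != (i < team ? 1 : 0))
            return 0;
    return 1;
}
```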
720
721
722
506f068e
TB
723@node omp_in_parallel
724@subsection @code{omp_in_parallel} -- Whether a parallel region is active
d77de738
ML
725@table @asis
726@item @emph{Description}:
506f068e
TB
727This function returns @code{true} if currently running in parallel,
728@code{false} otherwise. Here, @code{true} and @code{false} represent
729their language-specific counterparts.
d77de738
ML
730
731@item @emph{C/C++}:
732@multitable @columnfractions .20 .80
506f068e 733@item @emph{Prototype}: @tab @code{int omp_in_parallel(void);}
d77de738
ML
734@end multitable
735
736@item @emph{Fortran}:
737@multitable @columnfractions .20 .80
506f068e 738@item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
d77de738
ML
739@end multitable
740
d77de738 741@item @emph{Reference}:
506f068e 742@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.6.
d77de738
ML
743@end table
744
745
506f068e
TB
746@node omp_set_dynamic
747@subsection @code{omp_set_dynamic} -- Enable/disable dynamic teams
d77de738
ML
748@table @asis
749@item @emph{Description}:
506f068e
TB
750Enable or disable the dynamic adjustment of the number of threads
751within a team. The function takes the language-specific equivalent
752of @code{true} and @code{false}, where @code{true} enables dynamic
753adjustment of team sizes and @code{false} disables it.
d77de738 754
506f068e 755@item @emph{C/C++}:
d77de738 756@multitable @columnfractions .20 .80
506f068e 757@item @emph{Prototype}: @tab @code{void omp_set_dynamic(int dynamic_threads);}
d77de738
ML
758@end multitable
759
760@item @emph{Fortran}:
761@multitable @columnfractions .20 .80
506f068e
TB
762@item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(dynamic_threads)}
763@item @tab @code{logical, intent(in) :: dynamic_threads}
d77de738
ML
764@end multitable
765
766@item @emph{See also}:
506f068e 767@ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}
d77de738
ML
768
769@item @emph{Reference}:
506f068e 770@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.7.
d77de738
ML
771@end table
772
773
774
775@node omp_get_dynamic
506f068e 776@subsection @code{omp_get_dynamic} -- Dynamic teams setting
d77de738
ML
777@table @asis
778@item @emph{Description}:
779This function returns @code{true} if dynamic adjustment of team sizes is enabled, @code{false} otherwise.
780Here, @code{true} and @code{false} represent their language-specific
781counterparts.
782
783The dynamic team setting may be initialized at startup by the
784@env{OMP_DYNAMIC} environment variable or at runtime using
785@code{omp_set_dynamic}. If undefined, dynamic adjustment is
786disabled by default.
787
788@item @emph{C/C++}:
789@multitable @columnfractions .20 .80
790@item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);}
791@end multitable
792
793@item @emph{Fortran}:
794@multitable @columnfractions .20 .80
795@item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
796@end multitable
797
798@item @emph{See also}:
799@ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}
800
801@item @emph{Reference}:
802@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.8.
803@end table
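
A common use of @code{omp_get_dynamic} together with
@code{omp_set_dynamic} is to change the setting temporarily and restore
it afterwards. The sketch below assumes a hypothetical helper named
@code{set_dynamic_and_restore}; the @code{#else} branch is a serial
stand-in so that the code also builds without @option{-fopenmp}.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption, not libgomp code): remember the flag. */
static int serial_dynamic = 0;
static void omp_set_dynamic(int on) { serial_dynamic = (on != 0); }
static int omp_get_dynamic(void) { return serial_dynamic; }
#endif

/* Switch dynamic adjustment on or off, read the setting back,
   then restore whatever was in effect before the call. */
int set_dynamic_and_restore(int enable)
{
    int saved = omp_get_dynamic();
    int observed;

    omp_set_dynamic(enable);
    observed = (omp_get_dynamic() != 0);

    omp_set_dynamic(saved);
    assert((omp_get_dynamic() != 0) == (saved != 0));
    return observed;
}
```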
804
805
806
506f068e
TB
807@node omp_get_cancellation
808@subsection @code{omp_get_cancellation} -- Whether cancellation support is enabled
d77de738
ML
809@table @asis
810@item @emph{Description}:
506f068e
TB
811This function returns @code{true} if cancellation is activated, @code{false}
812otherwise. Here, @code{true} and @code{false} represent their language-specific
813counterparts. Unless @env{OMP_CANCELLATION} is set to true, cancellation is
814deactivated.
d77de738 815
506f068e 816@item @emph{C/C++}:
d77de738 817@multitable @columnfractions .20 .80
506f068e 818@item @emph{Prototype}: @tab @code{int omp_get_cancellation(void);}
d77de738
ML
819@end multitable
820
821@item @emph{Fortran}:
822@multitable @columnfractions .20 .80
506f068e 823@item @emph{Interface}: @tab @code{logical function omp_get_cancellation()}
d77de738
ML
824@end multitable
825
826@item @emph{See also}:
506f068e 827@ref{OMP_CANCELLATION}
d77de738
ML
828
829@item @emph{Reference}:
506f068e 830@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.9.
d77de738
ML
831@end table
832
833
834
506f068e
TB
835@node omp_set_nested
836@subsection @code{omp_set_nested} -- Enable/disable nested parallel regions
d77de738
ML
837@table @asis
838@item @emph{Description}:
506f068e
TB
839Enable or disable nested parallel regions, i.e., whether team members
840are allowed to create new teams. The function takes the language-specific
841equivalent of @code{true} and @code{false}, where @code{true} enables
842nested parallel regions and @code{false} disables them.
d77de738 843
15886c03 844Enabling nested parallel regions also sets the maximum number of
506f068e 845active nested regions to the maximum supported. Disabling nested parallel
15886c03 846regions sets the maximum number of active nested regions to one.
506f068e
TB
847
848Note that the @code{omp_set_nested} API routine was deprecated
849in the OpenMP specification 5.0 in favor of @code{omp_set_max_active_levels}.
850
851@item @emph{C/C++}:
d77de738 852@multitable @columnfractions .20 .80
506f068e 853@item @emph{Prototype}: @tab @code{void omp_set_nested(int nested);}
d77de738
ML
854@end multitable
855
856@item @emph{Fortran}:
857@multitable @columnfractions .20 .80
506f068e
TB
858@item @emph{Interface}: @tab @code{subroutine omp_set_nested(nested)}
859@item @tab @code{logical, intent(in) :: nested}
d77de738
ML
860@end multitable
861
862@item @emph{See also}:
506f068e
TB
863@ref{omp_get_nested}, @ref{omp_set_max_active_levels},
864@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}
d77de738
ML
865
866@item @emph{Reference}:
506f068e 867@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.10.
d77de738
ML
868@end table
869
870
871
506f068e
TB
872@node omp_get_nested
873@subsection @code{omp_get_nested} -- Nested parallel regions
d77de738
ML
874@table @asis
875@item @emph{Description}:
506f068e
TB
876This function returns @code{true} if nested parallel regions are
877enabled, @code{false} otherwise. Here, @code{true} and @code{false}
878represent their language-specific counterparts.
879
880The state of nested parallel regions at startup depends on several
881environment variables. If @env{OMP_MAX_ACTIVE_LEVELS} is defined
882and is set to greater than one, then nested parallel regions will be
883enabled. If not defined, then the value of the @env{OMP_NESTED}
884environment variable will be followed if defined. If neither is
885defined and either @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND}
886is defined with a list of more than one value, then nested parallel
887regions are enabled. If none of these are defined, then nested parallel
888regions are disabled by default.
889
890Nested parallel regions can be enabled or disabled at runtime using
891@code{omp_set_nested}, or by setting the maximum number of nested
892regions with @code{omp_set_max_active_levels} to one to disable, or
893above one to enable.
894
895Note that the @code{omp_get_nested} API routine was deprecated
896in the OpenMP specification 5.0 in favor of @code{omp_get_max_active_levels}.
897
898@item @emph{C/C++}:
899@multitable @columnfractions .20 .80
900@item @emph{Prototype}: @tab @code{int omp_get_nested(void);}
901@end multitable
902
903@item @emph{Fortran}:
904@multitable @columnfractions .20 .80
905@item @emph{Interface}: @tab @code{logical function omp_get_nested()}
906@end multitable
907
908@item @emph{See also}:
909@ref{omp_get_max_active_levels}, @ref{omp_set_nested},
910@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}
911
912@item @emph{Reference}:
913@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.11.
914@end table
915
916
917
918@node omp_set_schedule
919@subsection @code{omp_set_schedule} -- Set the runtime scheduling method
920@table @asis
921@item @emph{Description}:
922Sets the runtime scheduling method. The @var{kind} argument can have the
923value @code{omp_sched_static}, @code{omp_sched_dynamic},
924@code{omp_sched_guided} or @code{omp_sched_auto}. Except for
925@code{omp_sched_auto}, the chunk size is set to the value of
926@var{chunk_size} if positive, or to the default value if zero or negative.
927For @code{omp_sched_auto} the @var{chunk_size} argument is ignored.
d77de738
ML
928
929@item @emph{C/C++}:
930@multitable @columnfractions .20 .80
506f068e 931@item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int chunk_size);}
d77de738
ML
932@end multitable
933
934@item @emph{Fortran}:
935@multitable @columnfractions .20 .80
506f068e
TB
936@item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, chunk_size)}
937@item @tab @code{integer(kind=omp_sched_kind) kind}
938@item @tab @code{integer chunk_size}
d77de738
ML
939@end multitable
940
941@item @emph{See also}:
506f068e
TB
942@ref{omp_get_schedule},
943@ref{OMP_SCHEDULE}
d77de738
ML
944
945@item @emph{Reference}:
506f068e 946@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.12.
d77de738
ML
947@end table
948
949
506f068e
TB
950
951@node omp_get_schedule
952@subsection @code{omp_get_schedule} -- Obtain the runtime scheduling method
d77de738
ML
953@table @asis
954@item @emph{Description}:
15886c03
TB
955Obtain the runtime scheduling method. The @var{kind} argument is set to
956@code{omp_sched_static}, @code{omp_sched_dynamic},
506f068e
TB
957@code{omp_sched_guided} or @code{omp_sched_auto}. The second argument,
958@var{chunk_size}, is set to the chunk size.
d77de738
ML
959
960@item @emph{C/C++}:
961@multitable @columnfractions .20 .80
506f068e 962@item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *chunk_size);}
d77de738
ML
963@end multitable
964
965@item @emph{Fortran}:
966@multitable @columnfractions .20 .80
506f068e
TB
967@item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, chunk_size)}
968@item @tab @code{integer(kind=omp_sched_kind) kind}
969@item @tab @code{integer chunk_size}
d77de738
ML
970@end multitable
971
506f068e
TB
972@item @emph{See also}:
973@ref{omp_set_schedule}, @ref{OMP_SCHEDULE}
974
d77de738 975@item @emph{Reference}:
506f068e 976@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.13.
d77de738
ML
977@end table
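
The pair @code{omp_set_schedule} and @code{omp_get_schedule} affects
loops that use @code{schedule(runtime)}. The following sketch is an
illustration only: the helpers @code{use_dynamic_schedule} and
@code{sum_to} are hypothetical names, and the @code{#else} branch
provides serial stand-ins (including a simplified @code{omp_sched_t})
so that the code also builds without @option{-fopenmp}.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption, not libgomp code). */
typedef enum { omp_sched_static = 1, omp_sched_dynamic = 2,
               omp_sched_guided = 3, omp_sched_auto = 4 } omp_sched_t;
static omp_sched_t serial_kind = omp_sched_dynamic;
static int serial_chunk = 1;
static void omp_set_schedule(omp_sched_t k, int c)
{ serial_kind = k; serial_chunk = c; }
static void omp_get_schedule(omp_sched_t *k, int *c)
{ *k = serial_kind; *c = serial_chunk; }
#endif

/* Select dynamic scheduling with the given chunk size and return
   the chunk size that the runtime reports back. */
int use_dynamic_schedule(int chunk)
{
    omp_sched_t kind;
    int reported;

    assert(chunk > 0);
    omp_set_schedule(omp_sched_dynamic, chunk);
    omp_get_schedule(&kind, &reported);
    (void)kind;                      /* kind is read back but not inspected here */
    return reported;
}

/* Sum 1..n with a loop whose schedule is taken from the runtime setting. */
long sum_to(int n)
{
    long total = 0;

    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```

The result of @code{sum_to} does not depend on the schedule; only the
distribution of iterations among threads changes.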
978
979
506f068e
TB
980@node omp_get_teams_thread_limit
981@subsection @code{omp_get_teams_thread_limit} -- Maximum number of threads imposed by teams
d77de738
ML
982@table @asis
983@item @emph{Description}:
15886c03 984Return the maximum number of threads that are able to participate in
506f068e 985each team created by a teams construct.
d77de738
ML
986
987@item @emph{C/C++}:
988@multitable @columnfractions .20 .80
506f068e 989@item @emph{Prototype}: @tab @code{int omp_get_teams_thread_limit(void);}
d77de738
ML
990@end multitable
991
992@item @emph{Fortran}:
993@multitable @columnfractions .20 .80
506f068e 994@item @emph{Interface}: @tab @code{integer function omp_get_teams_thread_limit()}
d77de738
ML
995@end multitable
996
997@item @emph{See also}:
506f068e 998@ref{omp_set_teams_thread_limit}, @ref{OMP_TEAMS_THREAD_LIMIT}
d77de738
ML
999
1000@item @emph{Reference}:
506f068e 1001@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.6.
d77de738
ML
1002@end table
1003
1004
1005
506f068e
TB
1006@node omp_get_supported_active_levels
1007@subsection @code{omp_get_supported_active_levels} -- Maximum number of active regions supported
d77de738
ML
1008@table @asis
1009@item @emph{Description}:
506f068e
TB
1010This function returns the maximum number of nested, active parallel regions
1011supported by this implementation.
d77de738 1012
506f068e 1013@item @emph{C/C++}:
d77de738 1014@multitable @columnfractions .20 .80
506f068e 1015@item @emph{Prototype}: @tab @code{int omp_get_supported_active_levels(void);}
d77de738
ML
1016@end multitable
1017
1018@item @emph{Fortran}:
1019@multitable @columnfractions .20 .80
506f068e 1020@item @emph{Interface}: @tab @code{integer function omp_get_supported_active_levels()}
d77de738
ML
1021@end multitable
1022
1023@item @emph{See also}:
506f068e 1024@ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
d77de738
ML
1025
1026@item @emph{Reference}:
506f068e 1027@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.15.
d77de738
ML
1028@end table
1029
1030
1031
506f068e
TB
1032@node omp_set_max_active_levels
1033@subsection @code{omp_set_max_active_levels} -- Limits the number of active parallel regions
d77de738
ML
1034@table @asis
1035@item @emph{Description}:
506f068e
TB
1036This function limits the maximum allowed number of nested, active
1037parallel regions. @var{max_levels} must be less than or equal to
1038the value returned by @code{omp_get_supported_active_levels}.
d77de738 1039
506f068e
TB
1040@item @emph{C/C++}:
1041@multitable @columnfractions .20 .80
1042@item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);}
1043@end multitable
d77de738 1044
506f068e
TB
1045@item @emph{Fortran}:
1046@multitable @columnfractions .20 .80
1047@item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)}
1048@item @tab @code{integer max_levels}
1049@end multitable
d77de738 1050
506f068e
TB
1051@item @emph{See also}:
1052@ref{omp_get_max_active_levels}, @ref{omp_get_active_level},
1053@ref{omp_get_supported_active_levels}
2cd0689a 1054
506f068e
TB
1055@item @emph{Reference}:
1056@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.15.
1057@end table
1058
1059
1060
1061@node omp_get_max_active_levels
1062@subsection @code{omp_get_max_active_levels} -- Current maximum number of active regions
1063@table @asis
1064@item @emph{Description}:
1065This function obtains the maximum allowed number of nested, active parallel regions.
1066
1067@item @emph{C/C++}:
d77de738 1068@multitable @columnfractions .20 .80
506f068e 1069@item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);}
d77de738
ML
1070@end multitable
1071
1072@item @emph{Fortran}:
1073@multitable @columnfractions .20 .80
506f068e 1074@item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()}
d77de738
ML
1075@end multitable
1076
1077@item @emph{See also}:
506f068e 1078@ref{omp_set_max_active_levels}, @ref{omp_get_active_level}
d77de738
ML
1079
1080@item @emph{Reference}:
506f068e 1081@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.16.
d77de738
ML
1082@end table
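
Because @code{omp_set_max_active_levels} expects a value no larger than
@code{omp_get_supported_active_levels}, a caller can clamp its request
first. The sketch below uses a hypothetical helper
@code{limit_active_levels}; the @code{#else} branch is a serial stand-in
(assuming a single supported level) so that the code also builds without
@option{-fopenmp}.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption, not libgomp code): one level supported. */
static int serial_levels = 1;
static int omp_get_supported_active_levels(void) { return 1; }
static void omp_set_max_active_levels(int n)
{ if (n >= 0) serial_levels = (n > 1) ? 1 : n; }
static int omp_get_max_active_levels(void) { return serial_levels; }
#endif

/* Clamp `wanted` to what the implementation supports, apply it,
   and return the limit that is now in effect. */
int limit_active_levels(int wanted)
{
    int supported = omp_get_supported_active_levels();
    int levels = (wanted < supported) ? wanted : supported;

    assert(levels >= 0);
    omp_set_max_active_levels(levels);
    return omp_get_max_active_levels();
}
```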
1083
1084
506f068e
TB
1085@node omp_get_level
1086@subsection @code{omp_get_level} -- Obtain the current nesting level
d77de738
ML
1087@table @asis
1088@item @emph{Description}:
506f068e
TB
1089This function returns the number of nested parallel regions
1090that enclose the call.
d77de738 1091
506f068e 1092@item @emph{C/C++}:
d77de738 1093@multitable @columnfractions .20 .80
506f068e 1094@item @emph{Prototype}: @tab @code{int omp_get_level(void);}
d77de738
ML
1095@end multitable
1096
1097@item @emph{Fortran}:
1098@multitable @columnfractions .20 .80
506f068e 1099@item @emph{Interface}: @tab @code{integer function omp_get_level()}
d77de738
ML
1100@end multitable
1101
506f068e
TB
1102@item @emph{See also}:
1103@ref{omp_get_active_level}
1104
d77de738 1105@item @emph{Reference}:
506f068e 1106@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.17.
d77de738
ML
1107@end table
1108
1109
1110
506f068e
TB
1111@node omp_get_ancestor_thread_num
1112@subsection @code{omp_get_ancestor_thread_num} -- Ancestor thread ID
d77de738
ML
1113@table @asis
1114@item @emph{Description}:
506f068e
TB
1115This function returns the thread identification number for the given
1116nesting level of the current thread. For values of @var{level} outside
1117the range zero to @code{omp_get_level}, -1 is returned; if @var{level} is
1118@code{omp_get_level} the result is identical to @code{omp_get_thread_num}.
d77de738 1119
506f068e 1120@item @emph{C/C++}:
d77de738 1121@multitable @columnfractions .20 .80
506f068e 1122@item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);}
d77de738
ML
1123@end multitable
1124
1125@item @emph{Fortran}:
1126@multitable @columnfractions .20 .80
506f068e
TB
1127@item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)}
1128@item @tab @code{integer level}
d77de738
ML
1129@end multitable
1130
506f068e
TB
1131@item @emph{See also}:
1132@ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size}
1133
d77de738 1134@item @emph{Reference}:
506f068e 1135@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.18.
d77de738
ML
1136@end table
1137
1138
1139
506f068e
TB
1140@node omp_get_team_size
1141@subsection @code{omp_get_team_size} -- Number of threads in a team
d77de738
ML
1142@table @asis
1143@item @emph{Description}:
506f068e
TB
1144This function returns the number of threads in a thread team to which
1145either the current thread or its ancestor belongs. For values of @var{level}
1146outside zero to @code{omp_get_level}, -1 is returned; if @var{level} is zero,
11471 is returned, and for @code{omp_get_level}, the result is identical
1148to @code{omp_get_num_threads}.
d77de738
ML
1149
1150@item @emph{C/C++}:
1151@multitable @columnfractions .20 .80
506f068e 1152@item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);}
d77de738
ML
1153@end multitable
1154
1155@item @emph{Fortran}:
1156@multitable @columnfractions .20 .80
506f068e
TB
1157@item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)}
1158@item @tab @code{integer level}
d77de738
ML
1159@end multitable
1160
506f068e
TB
1161@item @emph{See also}:
1162@ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num}
1163
d77de738 1164@item @emph{Reference}:
506f068e 1165@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.19.
d77de738
ML
1166@end table
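
The boundary cases documented for @code{omp_get_ancestor_thread_num} and
@code{omp_get_team_size} can be gathered in one consistency check. This
is a sketch under stated assumptions: the helper names
@code{level_queries_consistent} and @code{check_inside_parallel} are
hypothetical, and the @code{#else} branch holds serial stand-ins so that
the code also builds without @option{-fopenmp}.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption, not libgomp code): nesting level 0 only. */
static int omp_get_level(void) { return 0; }
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_ancestor_thread_num(int level) { return level == 0 ? 0 : -1; }
static int omp_get_team_size(int level) { return level == 0 ? 1 : -1; }
#endif

/* Returns 1 if the level-based queries agree with the documented rules
   at the nesting level of the caller. */
int level_queries_consistent(void)
{
    int level = omp_get_level();

    assert(level >= 0);
    return omp_get_ancestor_thread_num(0) == 0          /* initial thread */
        && omp_get_team_size(0) == 1                    /* level 0 has one thread */
        && omp_get_ancestor_thread_num(level + 1) == -1 /* out of range */
        && omp_get_team_size(-1) == -1                  /* out of range */
        && omp_get_ancestor_thread_num(level) == omp_get_thread_num()
        && omp_get_team_size(level) == omp_get_num_threads();
}

/* Runs the same check from every thread of a parallel region. */
int check_inside_parallel(void)
{
    int ok = 1;

    #pragma omp parallel num_threads(2) reduction(&&:ok)
    ok = level_queries_consistent();
    return ok;
}
```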
1167
1168
1169
506f068e
TB
1170@node omp_get_active_level
1171@subsection @code{omp_get_active_level} -- Number of parallel regions
d77de738
ML
1172@table @asis
1173@item @emph{Description}:
506f068e
TB
1174This function returns the number of nested, active parallel regions
1175that enclose the call.
d77de738 1176
506f068e 1177@item @emph{C/C++}:
d77de738 1178@multitable @columnfractions .20 .80
506f068e 1179@item @emph{Prototype}: @tab @code{int omp_get_active_level(void);}
d77de738
ML
1180@end multitable
1181
1182@item @emph{Fortran}:
1183@multitable @columnfractions .20 .80
506f068e 1184@item @emph{Interface}: @tab @code{integer function omp_get_active_level()}
d77de738
ML
1185@end multitable
1186
1187@item @emph{See also}:
506f068e 1188@ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
d77de738
ML
1189
1190@item @emph{Reference}:
506f068e 1191@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.20.
d77de738
ML
1192@end table
1193
1194
1195
506f068e
TB
1196@node Thread Affinity Routines
1197@section Thread Affinity Routines
1198
1199Routines controlling and accessing thread-affinity policies.
1200They have C linkage and do not throw exceptions.
1201
1202@menu
1203* omp_get_proc_bind:: Whether threads may be moved between CPUs
1204@c * omp_get_num_places:: <fixme>
1205@c * omp_get_place_num_procs:: <fixme>
1206@c * omp_get_place_proc_ids:: <fixme>
1207@c * omp_get_place_num:: <fixme>
1208@c * omp_get_partition_num_places:: <fixme>
1209@c * omp_get_partition_place_nums:: <fixme>
1210@c * omp_set_affinity_format:: <fixme>
1211@c * omp_get_affinity_format:: <fixme>
1212@c * omp_display_affinity:: <fixme>
1213@c * omp_capture_affinity:: <fixme>
1214@end menu
1215
1216
1217
d77de738 1218@node omp_get_proc_bind
506f068e 1219@subsection @code{omp_get_proc_bind} -- Whether threads may be moved between CPUs
d77de738
ML
1220@table @asis
1221@item @emph{Description}:
1222This function returns the currently active thread affinity policy, which is
1223set via @env{OMP_PROC_BIND}. Possible values are @code{omp_proc_bind_false},
1224@code{omp_proc_bind_true}, @code{omp_proc_bind_primary},
1225@code{omp_proc_bind_master}, @code{omp_proc_bind_close} and @code{omp_proc_bind_spread},
1226where @code{omp_proc_bind_master} is an alias for @code{omp_proc_bind_primary}.
1227
1228@item @emph{C/C++}:
1229@multitable @columnfractions .20 .80
1230@item @emph{Prototype}: @tab @code{omp_proc_bind_t omp_get_proc_bind(void);}
1231@end multitable
1232
1233@item @emph{Fortran}:
1234@multitable @columnfractions .20 .80
1235@item @emph{Interface}: @tab @code{integer(kind=omp_proc_bind_kind) function omp_get_proc_bind()}
1236@end multitable
1237
1238@item @emph{See also}:
1239@ref{OMP_PROC_BIND}, @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY}
1240
1241@item @emph{Reference}:
1242@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.22.
1243@end table
1244
1245
1246
506f068e
TB
1247@node Teams Region Routines
1248@section Teams Region Routines
d77de738 1249
506f068e
TB
1250Routines controlling the league of teams that are executed in a @code{teams}
1251region. They have C linkage and do not throw exceptions.
d77de738 1252
506f068e
TB
1253@menu
1254* omp_get_num_teams:: Number of teams
1255* omp_get_team_num:: Get team number
1256* omp_set_num_teams:: Set upper teams limit for teams region
1257* omp_get_max_teams:: Maximum number of teams for teams region
1258* omp_set_teams_thread_limit:: Set upper thread limit for teams construct
1259* omp_get_thread_limit:: Maximum number of threads
1260@end menu
d77de738 1261
d77de738
ML
1262
1263
506f068e
TB
1264@node omp_get_num_teams
1265@subsection @code{omp_get_num_teams} -- Number of teams
d77de738
ML
1266@table @asis
1267@item @emph{Description}:
506f068e 1268Returns the number of teams in the current teams region.
d77de738 1269
506f068e 1270@item @emph{C/C++}:
d77de738 1271@multitable @columnfractions .20 .80
506f068e 1272@item @emph{Prototype}: @tab @code{int omp_get_num_teams(void);}
d77de738
ML
1273@end multitable
1274
1275@item @emph{Fortran}:
1276@multitable @columnfractions .20 .80
506f068e 1277@item @emph{Interface}: @tab @code{integer function omp_get_num_teams()}
d77de738
ML
1278@end multitable
1279
d77de738 1280@item @emph{Reference}:
506f068e 1281@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.32.
d77de738
ML
1282@end table
1283
1284
1285
1286@node omp_get_team_num
506f068e 1287@subsection @code{omp_get_team_num} -- Get team number
d77de738
ML
1288@table @asis
1289@item @emph{Description}:
1290Returns the team number of the calling thread.
1291
1292@item @emph{C/C++}:
1293@multitable @columnfractions .20 .80
1294@item @emph{Prototype}: @tab @code{int omp_get_team_num(void);}
1295@end multitable
1296
1297@item @emph{Fortran}:
1298@multitable @columnfractions .20 .80
1299@item @emph{Interface}: @tab @code{integer function omp_get_team_num()}
1300@end multitable
1301
1302@item @emph{Reference}:
1303@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.33.
1304@end table
1305
1306
1307
506f068e
TB
1308@node omp_set_num_teams
1309@subsection @code{omp_set_num_teams} -- Set upper teams limit for teams construct
d77de738
ML
1310@table @asis
1311@item @emph{Description}:
506f068e
TB
1312Specifies the upper bound for the number of teams created by a teams construct
1313that does not specify a @code{num_teams} clause. The
1314argument of @code{omp_set_num_teams} shall be a positive integer.
d77de738
ML
1315
1316@item @emph{C/C++}:
1317@multitable @columnfractions .20 .80
506f068e 1318@item @emph{Prototype}: @tab @code{void omp_set_num_teams(int num_teams);}
d77de738
ML
1319@end multitable
1320
1321@item @emph{Fortran}:
1322@multitable @columnfractions .20 .80
506f068e
TB
1323@item @emph{Interface}: @tab @code{subroutine omp_set_num_teams(num_teams)}
1324@item @tab @code{integer, intent(in) :: num_teams}
d77de738
ML
1325@end multitable
1326
1327@item @emph{See also}:
506f068e 1328@ref{OMP_NUM_TEAMS}, @ref{omp_get_num_teams}, @ref{omp_get_max_teams}
d77de738
ML
1329
1330@item @emph{Reference}:
506f068e 1331@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.3.
d77de738
ML
1332@end table
1333
1334
1335
506f068e
TB
1336@node omp_get_max_teams
1337@subsection @code{omp_get_max_teams} -- Maximum number of teams of teams region
d77de738
ML
1338@table @asis
1339@item @emph{Description}:
506f068e
TB
1340Return the maximum number of teams used for the teams region
1341that does not use the clause @code{num_teams}.
d77de738
ML
1342
1343@item @emph{C/C++}:
1344@multitable @columnfractions .20 .80
506f068e 1345@item @emph{Prototype}: @tab @code{int omp_get_max_teams(void);}
d77de738
ML
1346@end multitable
1347
1348@item @emph{Fortran}:
1349@multitable @columnfractions .20 .80
506f068e 1350@item @emph{Interface}: @tab @code{integer function omp_get_max_teams()}
d77de738
ML
1351@end multitable
1352
1353@item @emph{See also}:
506f068e 1354@ref{omp_set_num_teams}, @ref{omp_get_num_teams}
d77de738
ML
1355
1356@item @emph{Reference}:
506f068e 1357@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.4.
d77de738
ML
1358@end table
1359
1360
1361
506f068e
TB
1362@node omp_set_teams_thread_limit
1363@subsection @code{omp_set_teams_thread_limit} -- Set upper thread limit for teams construct
d77de738
ML
1364@table @asis
1365@item @emph{Description}:
15886c03 1366Specifies the upper bound for the number of threads that are available
506f068e
TB
1367for each team created by a teams construct that does not specify a
1368@code{thread_limit} clause. The argument of
1369@code{omp_set_teams_thread_limit} shall be a positive integer.
d77de738
ML
1370
1371@item @emph{C/C++}:
1372@multitable @columnfractions .20 .80
506f068e 1373@item @emph{Prototype}: @tab @code{void omp_set_teams_thread_limit(int thread_limit);}
d77de738
ML
1374@end multitable
1375
1376@item @emph{Fortran}:
1377@multitable @columnfractions .20 .80
506f068e
TB
1378@item @emph{Interface}: @tab @code{subroutine omp_set_teams_thread_limit(thread_limit)}
1379@item @tab @code{integer, intent(in) :: thread_limit}
d77de738
ML
1380@end multitable
1381
1382@item @emph{See also}:
506f068e 1383@ref{OMP_TEAMS_THREAD_LIMIT}, @ref{omp_get_teams_thread_limit}, @ref{omp_get_thread_limit}
d77de738
ML
1384
1385@item @emph{Reference}:
506f068e 1386@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.5.
d77de738
ML
1387@end table
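
The two setters for teams regions, @code{omp_set_num_teams} and
@code{omp_set_teams_thread_limit}, are typically called together before
the first @code{teams} construct, and their effect can be read back with
@code{omp_get_max_teams} and @code{omp_get_teams_thread_limit}. The
sketch below assumes a hypothetical helper @code{configure_teams}; the
@code{#else} branch is a serial stand-in so that the code also builds
without @option{-fopenmp}.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption, not libgomp code): remember both limits. */
static int serial_max_teams = 0;
static int serial_thread_limit = 0;
static void omp_set_num_teams(int n) { if (n >= 0) serial_max_teams = n; }
static int omp_get_max_teams(void) { return serial_max_teams; }
static void omp_set_teams_thread_limit(int n) { if (n >= 0) serial_thread_limit = n; }
static int omp_get_teams_thread_limit(void) { return serial_thread_limit; }
#endif

/* Set the upper bounds used by teams constructs that carry neither
   a num_teams nor a thread_limit clause. */
void configure_teams(int max_teams, int threads_per_team)
{
    assert(max_teams > 0 && threads_per_team > 0);  /* both shall be positive */
    omp_set_num_teams(max_teams);
    omp_set_teams_thread_limit(threads_per_team);
}
```

Both values are upper bounds; a @code{teams} construct may still create
fewer teams or use fewer threads per team.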
1388
1389
1390
506f068e
TB
1391@node omp_get_thread_limit
1392@subsection @code{omp_get_thread_limit} -- Maximum number of threads
d77de738
ML
1393@table @asis
1394@item @emph{Description}:
506f068e 1395Return the maximum number of threads of the program.
d77de738
ML
1396
1397@item @emph{C/C++}:
1398@multitable @columnfractions .20 .80
506f068e 1399@item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);}
d77de738
ML
1400@end multitable
1401
1402@item @emph{Fortran}:
1403@multitable @columnfractions .20 .80
506f068e 1404@item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()}
d77de738
ML
1405@end multitable
1406
1407@item @emph{See also}:
506f068e 1408@ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT}
d77de738
ML
1409
1410@item @emph{Reference}:
506f068e 1411@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.14.
d77de738
ML
1412@end table
1413
1414
1415
506f068e
TB
1416@node Tasking Routines
1417@section Tasking Routines
1418
1419Routines relating to explicit tasks.
1420They have C linkage and do not throw exceptions.
1421
1422@menu
1423* omp_get_max_task_priority:: Maximum task priority value that can be set
819f3d36 1424* omp_in_explicit_task:: Whether a given task is an explicit task
506f068e 1425* omp_in_final:: Whether in final or included task region
fcddf7ce
TB
1426@c * omp_is_free_agent:: <fixme>/TR12
1427@c * omp_ancestor_is_free_agent:: <fixme>/TR12
506f068e
TB
1428@end menu
1429
1430
1431
1432@node omp_get_max_task_priority
1433@subsection @code{omp_get_max_task_priority} -- Maximum priority value that can be set for tasks
d77de738
ML
1435@table @asis
1436@item @emph{Description}:
506f068e 1437This function obtains the maximum allowed priority number for tasks.
d77de738 1438
506f068e 1439@item @emph{C/C++}:
d77de738 1440@multitable @columnfractions .20 .80
506f068e 1441@item @emph{Prototype}: @tab @code{int omp_get_max_task_priority(void);}
d77de738
ML
1442@end multitable
1443
1444@item @emph{Fortran}:
1445@multitable @columnfractions .20 .80
506f068e 1446@item @emph{Interface}: @tab @code{integer function omp_get_max_task_priority()}
d77de738
ML
1447@end multitable
1448
1449@item @emph{Reference}:
506f068e 1450@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
d77de738
ML
1451@end table
1452
1453
506f068e 1454
819f3d36
TB
1455@node omp_in_explicit_task
1456@subsection @code{omp_in_explicit_task} -- Whether a given task is an explicit task
1457@table @asis
1458@item @emph{Description}:
1459The function returns the @var{explicit-task-var} ICV; it returns true when the
1460encountering task was generated by a task-generating construct such as
1461@code{target}, @code{task} or @code{taskloop}. Otherwise, the encountering task
1462is in an implicit task region, such as one generated by an implicit or explicit
1463@code{parallel} region, and @code{omp_in_explicit_task} returns false.
1464
1465@item @emph{C/C++}:
1466@multitable @columnfractions .20 .80
1467@item @emph{Prototype}: @tab @code{int omp_in_explicit_task(void);}
1468@end multitable
1469
1470@item @emph{Fortran}:
1471@multitable @columnfractions .20 .80
1472@item @emph{Interface}: @tab @code{logical function omp_in_explicit_task()}
1473@end multitable
1474
1475@item @emph{Reference}:
1476@uref{https://www.openmp.org, OpenMP specification v5.2}, Section 18.5.2.
1477@end table
1478
1479
1480
d77de738 1481@node omp_in_final
506f068e 1482@subsection @code{omp_in_final} -- Whether in final or included task region
1483@table @asis
1484@item @emph{Description}:
1485This function returns @code{true} if currently running in a final
1486or included task region, @code{false} otherwise. Here, @code{true}
1487and @code{false} represent their language-specific counterparts.
1488
1489@item @emph{C/C++}:
1490@multitable @columnfractions .20 .80
1491@item @emph{Prototype}: @tab @code{int omp_in_final(void);}
1492@end multitable
1493
1494@item @emph{Fortran}:
1495@multitable @columnfractions .20 .80
1496@item @emph{Interface}: @tab @code{logical function omp_in_final()}
1497@end multitable
1498
1499@item @emph{Reference}:
1500@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.21.
1501@end table
1502
1503
1504
1505@c @node Resource Relinquishing Routines
1506@c @section Resource Relinquishing Routines
1507@c
1508@c Routines releasing resources used by the OpenMP runtime.
1509@c They have C linkage and do not throw exceptions.
1510@c
1511@c @menu
1512@c * omp_pause_resource:: <fixme>
1513@c * omp_pause_resource_all:: <fixme>
1514@c @end menu
1515
1516@node Device Information Routines
1517@section Device Information Routines
1518
1519Routines related to devices available to an OpenMP program.
1520They have C linkage and do not throw exceptions.
1521
1522@menu
1523* omp_get_num_procs:: Number of processors online
1524@c * omp_get_max_progress_width:: <fixme>/TR11
1525* omp_set_default_device:: Set the default device for target regions
1526* omp_get_default_device:: Get the default device for target regions
1527* omp_get_num_devices:: Number of target devices
1528* omp_get_device_num:: Get device that current thread is running on
1529* omp_is_initial_device:: Whether executing on the host device
1530* omp_get_initial_device:: Device number of host device
1531@end menu
1532
1533
1534
1535@node omp_get_num_procs
1536@subsection @code{omp_get_num_procs} -- Number of processors online
1537@table @asis
1538@item @emph{Description}:
Returns the number of processors online on the current device.
1540
1541@item @emph{C/C++}:
1542@multitable @columnfractions .20 .80
506f068e 1543@item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);}
1544@end multitable
1545
1546@item @emph{Fortran}:
1547@multitable @columnfractions .20 .80
506f068e 1548@item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
1549@end multitable
1550
1551@item @emph{Reference}:
506f068e 1552@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.5.
1553@end table
1554
1555
1556
1557@node omp_set_default_device
506f068e 1558@subsection @code{omp_set_default_device} -- Set the default device for target regions
1559@table @asis
1560@item @emph{Description}:
Set the default device for target regions without a @code{device} clause. The argument
1562shall be a nonnegative device number.
1563
1564@item @emph{C/C++}:
1565@multitable @columnfractions .20 .80
1566@item @emph{Prototype}: @tab @code{void omp_set_default_device(int device_num);}
1567@end multitable
1568
1569@item @emph{Fortran}:
1570@multitable @columnfractions .20 .80
1571@item @emph{Interface}: @tab @code{subroutine omp_set_default_device(device_num)}
1572@item @tab @code{integer device_num}
1573@end multitable
1574
1575@item @emph{See also}:
1576@ref{OMP_DEFAULT_DEVICE}, @ref{omp_get_default_device}
1577
1578@item @emph{Reference}:
1579@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
1580@end table
1581
1582
1583
1584@node omp_get_default_device
1585@subsection @code{omp_get_default_device} -- Get the default device for target regions
1586@table @asis
1587@item @emph{Description}:
Get the default device for target regions without a @code{device} clause.
2cd0689a 1589
1590@item @emph{C/C++}:
1591@multitable @columnfractions .20 .80
506f068e 1592@item @emph{Prototype}: @tab @code{int omp_get_default_device(void);}
1593@end multitable
1594
1595@item @emph{Fortran}:
1596@multitable @columnfractions .20 .80
506f068e 1597@item @emph{Interface}: @tab @code{integer function omp_get_default_device()}
1598@end multitable
1599
1600@item @emph{See also}:
506f068e 1601@ref{OMP_DEFAULT_DEVICE}, @ref{omp_set_default_device}
1602
1603@item @emph{Reference}:
506f068e 1604@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.30.
1605@end table
1606
1607
1608
1609@node omp_get_num_devices
1610@subsection @code{omp_get_num_devices} -- Number of target devices
1611@table @asis
1612@item @emph{Description}:
506f068e 1613Returns the number of target devices.
1614
1615@item @emph{C/C++}:
1616@multitable @columnfractions .20 .80
506f068e 1617@item @emph{Prototype}: @tab @code{int omp_get_num_devices(void);}
1618@end multitable
1619
1620@item @emph{Fortran}:
1621@multitable @columnfractions .20 .80
506f068e 1622@item @emph{Interface}: @tab @code{integer function omp_get_num_devices()}
1623@end multitable
1624
d77de738 1625@item @emph{Reference}:
506f068e 1626@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.31.
1627@end table
1628
1629
1630
1631@node omp_get_device_num
1632@subsection @code{omp_get_device_num} -- Return device number of current device
1633@table @asis
1634@item @emph{Description}:
1635This function returns a device number that represents the device that the
1636current thread is executing on. For OpenMP 5.0, this must be equal to the
1637value returned by the @code{omp_get_initial_device} function when called
1638from the host.
d77de738 1639
506f068e 1640@item @emph{C/C++}
d77de738 1641@multitable @columnfractions .20 .80
506f068e 1642@item @emph{Prototype}: @tab @code{int omp_get_device_num(void);}
1643@end multitable
1644
1645@item @emph{Fortran}:
1646@multitable @columnfractions .20 .80
1647@item @emph{Interface}: @tab @code{integer function omp_get_device_num()}
1648@end multitable
1649
1650@item @emph{See also}:
506f068e 1651@ref{omp_get_initial_device}
1652
1653@item @emph{Reference}:
506f068e 1654@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.37.
1655@end table
1656
1657
1658
1659@node omp_is_initial_device
1660@subsection @code{omp_is_initial_device} -- Whether executing on the host device
1661@table @asis
1662@item @emph{Description}:
1663This function returns @code{true} if currently running on the host device,
1664@code{false} otherwise. Here, @code{true} and @code{false} represent
1665their language-specific counterparts.
d77de738 1666
506f068e 1667@item @emph{C/C++}:
d77de738 1668@multitable @columnfractions .20 .80
506f068e 1669@item @emph{Prototype}: @tab @code{int omp_is_initial_device(void);}
1670@end multitable
1671
1672@item @emph{Fortran}:
1673@multitable @columnfractions .20 .80
506f068e 1674@item @emph{Interface}: @tab @code{logical function omp_is_initial_device()}
1675@end multitable
1676
d77de738 1677@item @emph{Reference}:
506f068e 1678@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.34.
1679@end table
1680
1681
1682
1683@node omp_get_initial_device
1684@subsection @code{omp_get_initial_device} -- Return device number of initial device
1685@table @asis
1686@item @emph{Description}:
1687This function returns a device number that represents the host device.
1688For OpenMP 5.1, this must be equal to the value returned by the
1689@code{omp_get_num_devices} function.
d77de738 1690
506f068e 1691@item @emph{C/C++}
d77de738 1692@multitable @columnfractions .20 .80
506f068e 1693@item @emph{Prototype}: @tab @code{int omp_get_initial_device(void);}
1694@end multitable
1695
1696@item @emph{Fortran}:
1697@multitable @columnfractions .20 .80
506f068e 1698@item @emph{Interface}: @tab @code{integer function omp_get_initial_device()}
1699@end multitable
1700
1701@item @emph{See also}:
506f068e 1702@ref{omp_get_num_devices}
1703
1704@item @emph{Reference}:
506f068e 1705@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.35.
1706@end table
1707
1708
1709
1710@node Device Memory Routines
1711@section Device Memory Routines
1712
1713Routines related to memory allocation and managing corresponding
1714pointers on devices. They have C linkage and do not throw exceptions.
1715
1716@menu
1717* omp_target_alloc:: Allocate device memory
1718* omp_target_free:: Free device memory
1719* omp_target_is_present:: Check whether storage is mapped
1720@c * omp_target_is_accessible:: <fixme>
1721@c * omp_target_memcpy:: <fixme>
1722@c * omp_target_memcpy_rect:: <fixme>
1723@c * omp_target_memcpy_async:: <fixme>
1724@c * omp_target_memcpy_rect_async:: <fixme>
1725@c * omp_target_memset:: <fixme>/TR12
1726@c * omp_target_memset_async:: <fixme>/TR12
1727* omp_target_associate_ptr:: Associate a device pointer with a host pointer
1728* omp_target_disassociate_ptr:: Remove device--host pointer association
1729* omp_get_mapped_ptr:: Return device pointer to a host pointer
1730@end menu
1731
1732
1733
1734@node omp_target_alloc
1735@subsection @code{omp_target_alloc} -- Allocate device memory
1736@table @asis
1737@item @emph{Description}:
1738This routine allocates @var{size} bytes of memory in the device environment
1739associated with the device number @var{device_num}. If successful, a device
1740pointer is returned, otherwise a null pointer.
1741
1742In GCC, when the device is the host or the device shares memory with the host,
1743the memory is allocated on the host; in that case, when @var{size} is zero,
1744either NULL or a unique pointer value that can later be successfully passed to
1745@code{omp_target_free} is returned. When the allocation is not performed on
1746the host, a null pointer is returned when @var{size} is zero; in that case,
1747additionally a diagnostic might be printed to standard error (stderr).
1748
1749Running this routine in a @code{target} region except on the initial device
1750is not supported.
1751
1752@item @emph{C/C++}
1753@multitable @columnfractions .20 .80
1754@item @emph{Prototype}: @tab @code{void *omp_target_alloc(size_t size, int device_num)}
1755@end multitable
1756
1757@item @emph{Fortran}:
1758@multitable @columnfractions .20 .80
1759@item @emph{Interface}: @tab @code{type(c_ptr) function omp_target_alloc(size, device_num) bind(C)}
1760@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int, c_size_t}
1761@item @tab @code{integer(c_size_t), value :: size}
1762@item @tab @code{integer(c_int), value :: device_num}
1763@end multitable
1764
1765@item @emph{See also}:
1766@ref{omp_target_free}, @ref{omp_target_associate_ptr}
1767
1768@item @emph{Reference}:
1769@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.1
1770@end table
1771
1772
1773
1774@node omp_target_free
1775@subsection @code{omp_target_free} -- Free device memory
1776@table @asis
1777@item @emph{Description}:
1778This routine frees memory allocated by the @code{omp_target_alloc} routine.
1779The @var{device_ptr} argument must be either a null pointer or a device pointer
1780returned by @code{omp_target_alloc} for the specified @code{device_num}. The
1781device number @var{device_num} must be a conforming device number.
1782
1783Running this routine in a @code{target} region except on the initial device
1784is not supported.
1785
1786@item @emph{C/C++}
1787@multitable @columnfractions .20 .80
1788@item @emph{Prototype}: @tab @code{void omp_target_free(void *device_ptr, int device_num)}
1789@end multitable
1790
1791@item @emph{Fortran}:
1792@multitable @columnfractions .20 .80
1793@item @emph{Interface}: @tab @code{subroutine omp_target_free(device_ptr, device_num) bind(C)}
1794@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int}
1795@item @tab @code{type(c_ptr), value :: device_ptr}
1796@item @tab @code{integer(c_int), value :: device_num}
1797@end multitable
1798
1799@item @emph{See also}:
1800@ref{omp_target_alloc}, @ref{omp_target_disassociate_ptr}
1801
1802@item @emph{Reference}:
1803@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.2
1804@end table
1805
1806
1807
1808@node omp_target_is_present
1809@subsection @code{omp_target_is_present} -- Check whether storage is mapped
1810@table @asis
1811@item @emph{Description}:
This routine tests whether storage, identified by the host pointer @var{ptr},
is mapped to the device specified by @var{device_num}. If so, it returns
@emph{true} and otherwise @emph{false}.

In GCC, this includes self mapping such that @code{omp_target_is_present}
returns @emph{true} when @var{device_num} specifies the host or when the host
and the device share memory. If @var{ptr} is a null pointer, @emph{true} is
returned and if @var{device_num} is an invalid device number, @emph{false} is
returned.
1821
1822If those conditions do not apply, @emph{true} is returned if the association has
1823been established by an explicit or implicit @code{map} clause, the
1824@code{declare target} directive or a call to the @code{omp_target_associate_ptr}
1825routine.
1826
1827Running this routine in a @code{target} region except on the initial device
1828is not supported.
1829
1830@item @emph{C/C++}
1831@multitable @columnfractions .20 .80
1832@item @emph{Prototype}: @tab @code{int omp_target_is_present(const void *ptr,}
1833@item @tab @code{ int device_num)}
1834@end multitable
1835
1836@item @emph{Fortran}:
1837@multitable @columnfractions .20 .80
1838@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_is_present(ptr, &}
1839@item @tab @code{ device_num) bind(C)}
1840@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int}
1841@item @tab @code{type(c_ptr), value :: ptr}
1842@item @tab @code{integer(c_int), value :: device_num}
1843@end multitable
1844
1845@item @emph{See also}:
1846@ref{omp_target_associate_ptr}
1847
1848@item @emph{Reference}:
1849@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.3
1850@end table
1851
1852
1853
1854@node omp_target_associate_ptr
1855@subsection @code{omp_target_associate_ptr} -- Associate a device pointer with a host pointer
1856@table @asis
1857@item @emph{Description}:
1858This routine associates storage on the host with storage on a device identified
1859by @var{device_num}. The device pointer is usually obtained by calling
1860@code{omp_target_alloc} or by other means (but not by using the @code{map}
1861clauses or the @code{declare target} directive). The host pointer should point
1862to memory that has a storage size of at least @var{size}.
1863
1864The @var{device_offset} parameter specifies the offset into @var{device_ptr}
1865that is used as the base address for the device side of the mapping; the
1866storage size should be at least @var{device_offset} plus @var{size}.
1867
1868After the association, the host pointer can be used in a @code{map} clause and
1869in the @code{to} and @code{from} clauses of the @code{target update} directive
1870to transfer data between the associated pointers. The reference count of such
1871associated storage is infinite. The association can be removed by calling
@code{omp_target_disassociate_ptr}, which should be done before the lifetime
of either storage ends.

The routine returns nonzero (@code{EINVAL}) when @var{device_num} is invalid,
when it refers to the initial device, or when the associated device shares
memory with the host. @code{omp_target_associate_ptr} returns zero if
@var{host_ptr} points into storage that lies fully inside a previously
associated memory region. Otherwise, zero is returned if the association was
successful; if none of the cases above applies, nonzero (@code{EINVAL}) is
returned.
1881
1882The @code{omp_target_is_present} routine can be used to test whether
1883associated storage for a device pointer exists.
1884
1885Running this routine in a @code{target} region except on the initial device
1886is not supported.
1887
1888@item @emph{C/C++}
1889@multitable @columnfractions .20 .80
1890@item @emph{Prototype}: @tab @code{int omp_target_associate_ptr(const void *host_ptr,}
1891@item @tab @code{ const void *device_ptr,}
1892@item @tab @code{ size_t size,}
1893@item @tab @code{ size_t device_offset,}
1894@item @tab @code{ int device_num)}
1895@end multitable
1896
1897@item @emph{Fortran}:
1898@multitable @columnfractions .20 .80
1899@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_associate_ptr(host_ptr, &}
1900@item @tab @code{ device_ptr, size, device_offset, device_num) bind(C)}
1901@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int, c_size_t}
1902@item @tab @code{type(c_ptr), value :: host_ptr, device_ptr}
1903@item @tab @code{integer(c_size_t), value :: size, device_offset}
1904@item @tab @code{integer(c_int), value :: device_num}
1905@end multitable
1906
1907@item @emph{See also}:
1908@ref{omp_target_disassociate_ptr}, @ref{omp_target_is_present},
1909@ref{omp_target_alloc}
1910
1911@item @emph{Reference}:
1912@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.9
1913@end table
1914
1915
1916
1917@node omp_target_disassociate_ptr
1918@subsection @code{omp_target_disassociate_ptr} -- Remove device--host pointer association
1919@table @asis
1920@item @emph{Description}:
1921This routine removes the storage association established by calling
1922@code{omp_target_associate_ptr} and sets the reference count to zero,
even if @code{omp_target_associate_ptr} was invoked multiple times
for host pointer @var{ptr}. If applicable, the device memory needs
1925to be freed by the user.
1926
1927If an associated device storage location for the @var{device_num} was
1928found and has infinite reference count, the association is removed and
1929zero is returned. In all other cases, nonzero (@code{EINVAL}) is returned
1930and no other action is taken.
1931
1932Note that passing a host pointer where the association to the device pointer
1933was established with the @code{declare target} directive yields undefined
1934behavior.
1935
1936Running this routine in a @code{target} region except on the initial device
1937is not supported.
1938
1939@item @emph{C/C++}
1940@multitable @columnfractions .20 .80
1941@item @emph{Prototype}: @tab @code{int omp_target_disassociate_ptr(const void *ptr,}
1942@item @tab @code{ int device_num)}
1943@end multitable
1944
1945@item @emph{Fortran}:
1946@multitable @columnfractions .20 .80
1947@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_disassociate_ptr(ptr, &}
1948@item @tab @code{ device_num) bind(C)}
1949@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int}
1950@item @tab @code{type(c_ptr), value :: ptr}
1951@item @tab @code{integer(c_int), value :: device_num}
1952@end multitable
1953
1954@item @emph{See also}:
1955@ref{omp_target_associate_ptr}
1956
1957@item @emph{Reference}:
1958@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.10
1959@end table
1960
1961
1962
1963@node omp_get_mapped_ptr
1964@subsection @code{omp_get_mapped_ptr} -- Return device pointer to a host pointer
1965@table @asis
1966@item @emph{Description}:
If the device number refers to the initial device or to a device with
memory accessible from the host (shared memory), the @code{omp_get_mapped_ptr}
routine returns the value of the passed @var{ptr}. Otherwise, if associated
storage for the passed host pointer @var{ptr} exists on the device associated with
@var{device_num}, it returns that pointer. In all other cases and in cases of
1972an error, a null pointer is returned.
1973
1974The association of storage location is established either via an explicit or
1975implicit @code{map} clause, the @code{declare target} directive or the
1976@code{omp_target_associate_ptr} routine.
1977
1978Running this routine in a @code{target} region except on the initial device
1979is not supported.
1980
1981@item @emph{C/C++}
1982@multitable @columnfractions .20 .80
1983@item @emph{Prototype}: @tab @code{void *omp_get_mapped_ptr(const void *ptr, int device_num);}
1984@end multitable
1985
1986@item @emph{Fortran}:
1987@multitable @columnfractions .20 .80
1988@item @emph{Interface}: @tab @code{type(c_ptr) function omp_get_mapped_ptr(ptr, device_num) bind(C)}
1989@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_int}
1990@item @tab @code{type(c_ptr), value :: ptr}
1991@item @tab @code{integer(c_int), value :: device_num}
1992@end multitable
1993
1994@item @emph{See also}:
1995@ref{omp_target_associate_ptr}
1996
1997@item @emph{Reference}:
1998@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 18.8.11
1999@end table
2000
2001
2002
2003@node Lock Routines
2004@section Lock Routines
2005
2006Initialize, set, test, unset and destroy simple and nested locks.
2007The routines have C linkage and do not throw exceptions.
2008
2009@menu
2010* omp_init_lock:: Initialize simple lock
2011* omp_init_nest_lock:: Initialize nested lock
2012@c * omp_init_lock_with_hint:: <fixme>
2013@c * omp_init_nest_lock_with_hint:: <fixme>
2014* omp_destroy_lock:: Destroy simple lock
2015* omp_destroy_nest_lock:: Destroy nested lock
2016* omp_set_lock:: Wait for and set simple lock
* omp_set_nest_lock::           Wait for and set nested lock
2018* omp_unset_lock:: Unset simple lock
2019* omp_unset_nest_lock:: Unset nested lock
2020* omp_test_lock:: Test and set simple lock if available
2021* omp_test_nest_lock:: Test and set nested lock if available
2022@end menu
2023
2024
2025
d77de738 2026@node omp_init_lock
506f068e 2027@subsection @code{omp_init_lock} -- Initialize simple lock
2028@table @asis
2029@item @emph{Description}:
2030Initialize a simple lock. After initialization, the lock is in
2031an unlocked state.
2032
2033@item @emph{C/C++}:
2034@multitable @columnfractions .20 .80
2035@item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
2036@end multitable
2037
2038@item @emph{Fortran}:
2039@multitable @columnfractions .20 .80
2040@item @emph{Interface}: @tab @code{subroutine omp_init_lock(svar)}
2041@item @tab @code{integer(omp_lock_kind), intent(out) :: svar}
2042@end multitable
2043
2044@item @emph{See also}:
2045@ref{omp_destroy_lock}
2046
2047@item @emph{Reference}:
2048@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
2049@end table
2050
2051
2052
2053@node omp_init_nest_lock
2054@subsection @code{omp_init_nest_lock} -- Initialize nested lock
2055@table @asis
2056@item @emph{Description}:
2057Initialize a nested lock. After initialization, the lock is in
2058an unlocked state and the nesting count is set to zero.
2059
2060@item @emph{C/C++}:
2061@multitable @columnfractions .20 .80
506f068e 2062@item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
2063@end multitable
2064
2065@item @emph{Fortran}:
2066@multitable @columnfractions .20 .80
2067@item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(nvar)}
2068@item @tab @code{integer(omp_nest_lock_kind), intent(out) :: nvar}
2069@end multitable
2070
2071@item @emph{See also}:
506f068e 2072@ref{omp_destroy_nest_lock}
d77de738 2073
2074@item @emph{Reference}:
2075@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
2076@end table
2077
2078
2079
2080@node omp_destroy_lock
2081@subsection @code{omp_destroy_lock} -- Destroy simple lock
2082@table @asis
2083@item @emph{Description}:
2084Destroy a simple lock. In order to be destroyed, a simple lock must be
2085in the unlocked state.
2086
2087@item @emph{C/C++}:
2088@multitable @columnfractions .20 .80
506f068e 2089@item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);}
2090@end multitable
2091
2092@item @emph{Fortran}:
2093@multitable @columnfractions .20 .80
506f068e 2094@item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(svar)}
2095@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
2096@end multitable
2097
2098@item @emph{See also}:
506f068e 2099@ref{omp_init_lock}
2100
2101@item @emph{Reference}:
506f068e 2102@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
2103@end table
2104
2105
2106
2107@node omp_destroy_nest_lock
2108@subsection @code{omp_destroy_nest_lock} -- Destroy nested lock
2109@table @asis
2110@item @emph{Description}:
2111Destroy a nested lock. In order to be destroyed, a nested lock must be
2112in the unlocked state and its nesting count must equal zero.
2113
2114@item @emph{C/C++}:
2115@multitable @columnfractions .20 .80
506f068e 2116@item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *);}
2117@end multitable
2118
2119@item @emph{Fortran}:
2120@multitable @columnfractions .20 .80
2121@item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(nvar)}
2122@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
2123@end multitable
2124
2125@item @emph{See also}:
@ref{omp_init_nest_lock}
2127
2128@item @emph{Reference}:
506f068e 2129@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
2130@end table
2131
2132
2133
2134@node omp_set_lock
2135@subsection @code{omp_set_lock} -- Wait for and set simple lock
2136@table @asis
2137@item @emph{Description}:
2138Before setting a simple lock, the lock variable must be initialized by
2139@code{omp_init_lock}. The calling thread is blocked until the lock
2140is available. If the lock is already held by the current thread,
2141a deadlock occurs.
2142
2143@item @emph{C/C++}:
2144@multitable @columnfractions .20 .80
506f068e 2145@item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
2146@end multitable
2147
2148@item @emph{Fortran}:
2149@multitable @columnfractions .20 .80
506f068e 2150@item @emph{Interface}: @tab @code{subroutine omp_set_lock(svar)}
2151@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
2152@end multitable
2153
2154@item @emph{See also}:
506f068e 2155@ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
2156
2157@item @emph{Reference}:
506f068e 2158@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
2159@end table
2160
2161
2162
d77de738 2163@node omp_set_nest_lock
506f068e 2164@subsection @code{omp_set_nest_lock} -- Wait for and set nested lock
2165@table @asis
2166@item @emph{Description}:
2167Before setting a nested lock, the lock variable must be initialized by
2168@code{omp_init_nest_lock}. The calling thread is blocked until the lock
2169is available. If the lock is already held by the current thread, the
2170nesting count for the lock is incremented.
2171
2172@item @emph{C/C++}:
2173@multitable @columnfractions .20 .80
2174@item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
2175@end multitable
2176
2177@item @emph{Fortran}:
2178@multitable @columnfractions .20 .80
2179@item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(nvar)}
2180@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
2181@end multitable
2182
2183@item @emph{See also}:
2184@ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
2185
2186@item @emph{Reference}:
2187@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
2188@end table
2189
2190
2191
2192@node omp_unset_lock
2193@subsection @code{omp_unset_lock} -- Unset simple lock
2194@table @asis
2195@item @emph{Description}:
A simple lock about to be unset must have been locked by @code{omp_set_lock}
or @code{omp_test_lock} before. In addition, the lock must be held by the
thread calling @code{omp_unset_lock}. The lock then becomes unlocked. If one
or more threads are waiting to set the lock, one of them is chosen to
acquire it.
2201
2202@item @emph{C/C++}:
2203@multitable @columnfractions .20 .80
506f068e 2204@item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
2205@end multitable
2206
2207@item @emph{Fortran}:
2208@multitable @columnfractions .20 .80
2209@item @emph{Interface}: @tab @code{subroutine omp_unset_lock(svar)}
2210@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
2211@end multitable
2212
d77de738 2213@item @emph{See also}:
506f068e 2214@ref{omp_set_lock}, @ref{omp_test_lock}
2215
2216@item @emph{Reference}:
506f068e 2217@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
2218@end table
2219
2220
2221
2222@node omp_unset_nest_lock
506f068e 2223@subsection @code{omp_unset_nest_lock} -- Unset nested lock
2224@table @asis
2225@item @emph{Description}:
A nested lock about to be unset must have been locked by @code{omp_set_nest_lock}
or @code{omp_test_nest_lock} before. In addition, the lock must be held by the
thread calling @code{omp_unset_nest_lock}. The nesting count is decremented; if it
drops to zero, the lock becomes unlocked. If one or more threads are waiting to
set the lock, one of them is chosen to acquire it.
2231
2232@item @emph{C/C++}:
2233@multitable @columnfractions .20 .80
2234@item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
2235@end multitable
2236
2237@item @emph{Fortran}:
2238@multitable @columnfractions .20 .80
2239@item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(nvar)}
2240@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
2241@end multitable
2242
2243@item @emph{See also}:
2244@ref{omp_set_nest_lock}
2245
2246@item @emph{Reference}:
2247@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
2248@end table
2249
2250
2251
2252@node omp_test_lock
2253@subsection @code{omp_test_lock} -- Test and set simple lock if available
2254@table @asis
2255@item @emph{Description}:
2256Before setting a simple lock, the lock variable must be initialized by
2257@code{omp_init_lock}. Contrary to @code{omp_set_lock}, @code{omp_test_lock}
2258does not block if the lock is not available. This function returns
2259@code{true} upon success, @code{false} otherwise. Here, @code{true} and
2260@code{false} represent their language-specific counterparts.
2261
2262@item @emph{C/C++}:
2263@multitable @columnfractions .20 .80
506f068e 2264@item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
2265@end multitable
2266
2267@item @emph{Fortran}:
2268@multitable @columnfractions .20 .80
2269@item @emph{Interface}: @tab @code{logical function omp_test_lock(svar)}
2270@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
2271@end multitable
2272
2273@item @emph{See also}:
@ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock}
2275
2276@item @emph{Reference}:
2277@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
2278@end table
2279
2280
2281
2282@node omp_test_nest_lock
2283@subsection @code{omp_test_nest_lock} -- Test and set nested lock if available
2284@table @asis
2285@item @emph{Description}:
2286Before setting a nested lock, the lock variable must be initialized by
2287@code{omp_init_nest_lock}. Contrary to @code{omp_set_nest_lock},
2288@code{omp_test_nest_lock} does not block if the lock is not available.
2289If the lock is available or is already held by the calling thread, it is set
2290and the new nesting count is returned. Otherwise, the return value equals zero.
2291
2292@item @emph{C/C++}:
2293@multitable @columnfractions .20 .80
2294@item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
2295@end multitable
2296
2297@item @emph{Fortran}:
2298@multitable @columnfractions .20 .80
2299@item @emph{Interface}: @tab @code{integer function omp_test_nest_lock(nvar)}
d77de738
ML
2300@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
2301@end multitable
2302
506f068e 2303
d77de738 2304@item @emph{See also}:
506f068e 2305@ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}, @ref{omp_unset_nest_lock}
d77de738
ML
2306
2307@item @emph{Reference}:
506f068e 2308@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
d77de738
ML
2309@end table
2310
2311
2312
506f068e
TB
2313@node Timing Routines
2314@section Timing Routines
2315
2316Portable, thread-based, wall clock timer.
2317The routines have C linkage and do not throw exceptions.
2318
2319@menu
2320* omp_get_wtick:: Get timer precision.
2321* omp_get_wtime:: Elapsed wall clock time.
2322@end menu
2323
2324
2325
d77de738 2326@node omp_get_wtick
506f068e 2327@subsection @code{omp_get_wtick} -- Get timer precision
d77de738
ML
2328@table @asis
2329@item @emph{Description}:
2330Gets the timer precision, i.e., the number of seconds between two
2331successive clock ticks.
2332
2333@item @emph{C/C++}:
2334@multitable @columnfractions .20 .80
2335@item @emph{Prototype}: @tab @code{double omp_get_wtick(void);}
2336@end multitable
2337
2338@item @emph{Fortran}:
2339@multitable @columnfractions .20 .80
2340@item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
2341@end multitable
2342
2343@item @emph{See also}:
2344@ref{omp_get_wtime}
2345
2346@item @emph{Reference}:
2347@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.2.
2348@end table
2349
2350
2351
2352@node omp_get_wtime
506f068e 2353@subsection @code{omp_get_wtime} -- Elapsed wall clock time
d77de738
ML
2354@table @asis
2355@item @emph{Description}:
2356Elapsed wall clock time in seconds. The time is measured per thread; no
2357guarantee can be made that two distinct threads measure the same time.
2358Time is measured from some ``time in the past'', which is an arbitrary time
2359guaranteed not to change during the execution of the program.
2360
2361@item @emph{C/C++}:
2362@multitable @columnfractions .20 .80
2363@item @emph{Prototype}: @tab @code{double omp_get_wtime(void);}
2364@end multitable
2365
2366@item @emph{Fortran}:
2367@multitable @columnfractions .20 .80
2368@item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
2369@end multitable
2370
2371@item @emph{See also}:
2372@ref{omp_get_wtick}
2373
2374@item @emph{Reference}:
2375@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.1.
2376@end table
2377
2378
2379
506f068e
TB
2380@node Event Routine
2381@section Event Routine
2382
2383Support for event objects.
2384The routine has C linkage and does not throw exceptions.
2385
2386@menu
2387* omp_fulfill_event:: Fulfill and destroy an OpenMP event.
2388@end menu
2389
2390
2391
d77de738 2392@node omp_fulfill_event
506f068e 2393@subsection @code{omp_fulfill_event} -- Fulfill and destroy an OpenMP event
d77de738
ML
2394@table @asis
2395@item @emph{Description}:
2396Fulfill the event associated with the event handle argument. Currently, it
2397is only used to fulfill events generated by detach clauses on task
2398constructs; the effect of fulfilling the event is to allow the task to
2399complete.
2400
2401The result of calling @code{omp_fulfill_event} with an event handle other
2402than that generated by a detach clause is undefined. Calling it with an
2403event handle that has already been fulfilled is also undefined.
2404
2405@item @emph{C/C++}:
2406@multitable @columnfractions .20 .80
2407@item @emph{Prototype}: @tab @code{void omp_fulfill_event(omp_event_handle_t event);}
2408@end multitable
2409
2410@item @emph{Fortran}:
2411@multitable @columnfractions .20 .80
2412@item @emph{Interface}: @tab @code{subroutine omp_fulfill_event(event)}
2413@item @tab @code{integer (kind=omp_event_handle_kind) :: event}
2414@end multitable
2415
2416@item @emph{Reference}:
2417@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.5.1.
2418@end table
2419
2420
2421
506f068e
TB
2422@c @node Interoperability Routines
2423@c @section Interoperability Routines
2424@c
2425@c Routines to obtain properties from an @code{omp_interop_t} object.
2426@c They have C linkage and do not throw exceptions.
2427@c
2428@c @menu
2429@c * omp_get_num_interop_properties:: <fixme>
2430@c * omp_get_interop_int:: <fixme>
2431@c * omp_get_interop_ptr:: <fixme>
2432@c * omp_get_interop_str:: <fixme>
2433@c * omp_get_interop_name:: <fixme>
2434@c * omp_get_interop_type_desc:: <fixme>
2435@c * omp_get_interop_rc_desc:: <fixme>
2436@c @end menu
2437
971f119f
TB
2438@node Memory Management Routines
2439@section Memory Management Routines
2440
2441Routines to manage and allocate memory on the current device.
2442They have C linkage and do not throw exceptions.
2443
2444@menu
2445* omp_init_allocator:: Create an allocator
2446* omp_destroy_allocator:: Destroy an allocator
2447* omp_set_default_allocator:: Set the default allocator
2448* omp_get_default_allocator:: Get the default allocator
bc238c40
TB
2449* omp_alloc:: Memory allocation with an allocator
2450* omp_aligned_alloc:: Memory allocation with an allocator and alignment
2451* omp_free:: Freeing memory allocated with OpenMP routines
2452* omp_calloc:: Allocate nullified memory with an allocator
2453* omp_aligned_calloc:: Allocate nullified aligned memory with an allocator
2454* omp_realloc:: Reallocate memory allocated with OpenMP routines
506f068e
TB
2455@c * omp_get_memspace_num_resources:: <fixme>/TR11
2456@c * omp_get_submemspace:: <fixme>/TR11
971f119f
TB
2457@end menu
2458
2459
2460
2461@node omp_init_allocator
2462@subsection @code{omp_init_allocator} -- Create an allocator
2463@table @asis
2464@item @emph{Description}:
2465Create an allocator that uses the specified memory space and has the specified
2466traits; if an allocator that fulfills the requirements cannot be created,
2467@code{omp_null_allocator} is returned.
2468
2469The predefined memory spaces and available traits can be found at
2470@ref{OMP_ALLOCATOR}, where the trait names have to be prefixed by
2471@code{omp_atk_} (e.g. @code{omp_atk_pinned}) and the named trait values by
2472@code{omp_atv_} (e.g. @code{omp_atv_true}); additionally, @code{omp_atv_default}
2473may be used as trait value to specify that the default value should be used.
2474
2475@item @emph{C/C++}:
2476@multitable @columnfractions .20 .80
2477@item @emph{Prototype}: @tab @code{omp_allocator_handle_t omp_init_allocator(}
2478@item @tab @code{ omp_memspace_handle_t memspace,}
2479@item @tab @code{ int ntraits,}
2480@item @tab @code{ const omp_alloctrait_t traits[]);}
2481@end multitable
2482
2483@item @emph{Fortran}:
2484@multitable @columnfractions .20 .80
2485@item @emph{Interface}: @tab @code{function omp_init_allocator(memspace, ntraits, traits)}
bc238c40
TB
2486@item @tab @code{integer (omp_allocator_handle_kind) :: omp_init_allocator}
2487@item @tab @code{integer (omp_memspace_handle_kind), intent(in) :: memspace}
971f119f
TB
2488@item @tab @code{integer, intent(in) :: ntraits}
2489@item @tab @code{type (omp_alloctrait), intent(in) :: traits(*)}
2490@end multitable
2491
2492@item @emph{See also}:
2493@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_destroy_allocator}
2494
2495@item @emph{Reference}:
2496@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.2
2497@end table
2498
2499
2500
2501@node omp_destroy_allocator
2502@subsection @code{omp_destroy_allocator} -- Destroy an allocator
2503@table @asis
2504@item @emph{Description}:
2505Releases all resources used by a memory allocator, which must not represent
2506a predefined memory allocator. Accessing memory after its allocator has been
2507destroyed has unspecified behavior. Passing @code{omp_null_allocator} to the
15886c03 2508routine is permitted but has no effect.
971f119f
TB
2509
2510
2511@item @emph{C/C++}:
2512@multitable @columnfractions .20 .80
2513@item @emph{Prototype}: @tab @code{void omp_destroy_allocator (omp_allocator_handle_t allocator);}
2514@end multitable
2515
2516@item @emph{Fortran}:
2517@multitable @columnfractions .20 .80
2518@item @emph{Interface}: @tab @code{subroutine omp_destroy_allocator(allocator)}
bc238c40 2519@item @tab @code{integer (omp_allocator_handle_kind), intent(in) :: allocator}
971f119f
TB
2520@end multitable
2521
2522@item @emph{See also}:
2523@ref{omp_init_allocator}
2524
2525@item @emph{Reference}:
2526@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.3
2527@end table
2528
2529
2530
2531@node omp_set_default_allocator
2532@subsection @code{omp_set_default_allocator} -- Set the default allocator
2533@table @asis
2534@item @emph{Description}:
2535Sets the default allocator that is used when no allocator has been specified
2536in the @code{allocate} or @code{allocator} clause or if an OpenMP memory
2537routine is invoked with the @code{omp_null_allocator} allocator.
2538
2539@item @emph{C/C++}:
2540@multitable @columnfractions .20 .80
2541@item @emph{Prototype}: @tab @code{void omp_set_default_allocator(omp_allocator_handle_t allocator);}
2542@end multitable
2543
2544@item @emph{Fortran}:
2545@multitable @columnfractions .20 .80
2546@item @emph{Interface}: @tab @code{subroutine omp_set_default_allocator(allocator)}
bc238c40 2547@item @tab @code{integer (omp_allocator_handle_kind), intent(in) :: allocator}
971f119f
TB
2548@end multitable
2549
2550@item @emph{See also}:
2551@ref{omp_get_default_allocator}, @ref{omp_init_allocator}, @ref{OMP_ALLOCATOR},
2552@ref{Memory allocation}
2553
2554@item @emph{Reference}:
2555@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.4
2556@end table
2557
2558
2559
2560@node omp_get_default_allocator
2561@subsection @code{omp_get_default_allocator} -- Get the default allocator
2562@table @asis
2563@item @emph{Description}:
2564The routine returns the default allocator that is used when no allocator has
2565been specified in the @code{allocate} or @code{allocator} clause or if an
2566OpenMP memory routine is invoked with the @code{omp_null_allocator} allocator.
2567
2568@item @emph{C/C++}:
2569@multitable @columnfractions .20 .80
2570@item @emph{Prototype}: @tab @code{omp_allocator_handle_t omp_get_default_allocator(void);}
2571@end multitable
2572
2573@item @emph{Fortran}:
2574@multitable @columnfractions .20 .80
2575@item @emph{Interface}: @tab @code{function omp_get_default_allocator()}
bc238c40 2576@item @tab @code{integer (omp_allocator_handle_kind) :: omp_get_default_allocator}
971f119f
TB
2577@end multitable
2578
2579@item @emph{See also}:
2580@ref{omp_set_default_allocator}, @ref{OMP_ALLOCATOR}
2581
2582@item @emph{Reference}:
2583@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.5
2584@end table
2585
2586
506f068e 2587
bc238c40
TB
2588@node omp_alloc
2589@subsection @code{omp_alloc} -- Memory allocation with an allocator
2590@table @asis
2591@item @emph{Description}:
2592Allocate memory with the specified allocator, which can either be a predefined
2593allocator, an allocator handle or @code{omp_null_allocator}. If the allocator
2594is @code{omp_null_allocator}, the allocator specified by the
2595@var{def-allocator-var} ICV is used. @var{size} must be a nonnegative number
2596denoting the number of bytes to be allocated; if @var{size} is zero,
2597@code{omp_alloc} will return a null pointer. If successful, a pointer to the
2598allocated memory is returned, otherwise the @code{fallback} trait of the
2599allocator determines the behavior. The content of the allocated memory is
2600unspecified.
2601
2602In @code{target} regions, either the @code{dynamic_allocators} clause must
2603appear on a @code{requires} directive in the same compilation unit -- or the
2604@var{allocator} argument may only be a constant expression with the value of
2605one of the predefined allocators and may not be @code{omp_null_allocator}.
2606
2607Memory allocated by @code{omp_alloc} must be freed using @code{omp_free}.
2608
2609@item @emph{C}:
2610@multitable @columnfractions .20 .80
2611@item @emph{Prototype}: @tab @code{void* omp_alloc(size_t size,}
2612@item @tab @code{ omp_allocator_handle_t allocator)}
2613@end multitable
2614
2615@item @emph{C++}:
2616@multitable @columnfractions .20 .80
2617@item @emph{Prototype}: @tab @code{void* omp_alloc(size_t size,}
2618@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator)}
2619@end multitable
2620
2621@item @emph{Fortran}:
2622@multitable @columnfractions .20 .80
2623@item @emph{Interface}: @tab @code{type(c_ptr) function omp_alloc(size, allocator) bind(C)}
2624@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
2625@item @tab @code{integer (c_size_t), value :: size}
2626@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2627@end multitable
2628
2629@item @emph{See also}:
2630@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2631@ref{omp_free}, @ref{omp_init_allocator}
2632
2633@item @emph{Reference}:
2634@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.6
2635@end table
2636
2637
2638
2639@node omp_aligned_alloc
2640@subsection @code{omp_aligned_alloc} -- Memory allocation with an allocator and alignment
2641@table @asis
2642@item @emph{Description}:
2643Allocate memory with the specified allocator, which can either be a predefined
2644allocator, an allocator handle or @code{omp_null_allocator}. If the allocator
2645is @code{omp_null_allocator}, the allocator specified by the
2646@var{def-allocator-var} ICV is used. @var{alignment} must be a positive power
2647of two and @var{size} must be a nonnegative number that is a multiple of the
2648alignment and denotes the number of bytes to be allocated; if @var{size} is
2649zero, @code{omp_aligned_alloc} will return a null pointer. The alignment will
2650be at least the larger of the value required by the @code{alignment} trait of
2651the allocator and the value of the passed @var{alignment} argument. If successful,
2652a pointer to the allocated memory is returned, otherwise the @code{fallback}
2653trait of the allocator determines the behavior. The content of the allocated
2654memory is unspecified.
2655
2656In @code{target} regions, either the @code{dynamic_allocators} clause must
2657appear on a @code{requires} directive in the same compilation unit -- or the
2658@var{allocator} argument may only be a constant expression with the value of
2659one of the predefined allocators and may not be @code{omp_null_allocator}.
2660
2661Memory allocated by @code{omp_aligned_alloc} must be freed using
2662@code{omp_free}.
2663
2664@item @emph{C}:
2665@multitable @columnfractions .20 .80
2666@item @emph{Prototype}: @tab @code{void* omp_aligned_alloc(size_t alignment,}
2667@item @tab @code{ size_t size,}
2668@item @tab @code{ omp_allocator_handle_t allocator)}
2669@end multitable
2670
2671@item @emph{C++}:
2672@multitable @columnfractions .20 .80
2673@item @emph{Prototype}: @tab @code{void* omp_aligned_alloc(size_t alignment,}
2674@item @tab @code{ size_t size,}
2675@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator)}
2676@end multitable
2677
2678@item @emph{Fortran}:
2679@multitable @columnfractions .20 .80
2680@item @emph{Interface}: @tab @code{type(c_ptr) function omp_aligned_alloc(alignment, size, allocator) bind(C)}
2681@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
2682@item @tab @code{integer (c_size_t), value :: alignment, size}
2683@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2684@end multitable
2685
2686@item @emph{See also}:
2687@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2688@ref{omp_free}, @ref{omp_init_allocator}
2689
2690@item @emph{Reference}:
2691@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.13.6
2692@end table
2693
2694
2695
2696@node omp_free
2697@subsection @code{omp_free} -- Freeing memory allocated with OpenMP routines
2698@table @asis
2699@item @emph{Description}:
2700The @code{omp_free} routine deallocates memory previously allocated by an
2701OpenMP memory-management routine. The @var{ptr} argument must point to such
2702memory or be a null pointer; if it is a null pointer, no operation is
2703performed. If specified, the @var{allocator} argument must be either the
2704memory allocator that was used for the allocation or @code{omp_null_allocator};
2705if it is @code{omp_null_allocator}, the implementation will determine the value
2706automatically.
2707
2708Calling @code{omp_free} invokes undefined behavior if the memory
2709was already deallocated or when the used allocator has already been destroyed.
2710
2711@item @emph{C}:
2712@multitable @columnfractions .20 .80
2713@item @emph{Prototype}: @tab @code{void omp_free(void *ptr,}
2714@item @tab @code{ omp_allocator_handle_t allocator)}
2715@end multitable
2716
2717@item @emph{C++}:
2718@multitable @columnfractions .20 .80
2719@item @emph{Prototype}: @tab @code{void omp_free(void *ptr,}
2720@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator)}
2721@end multitable
2722
2723@item @emph{Fortran}:
2724@multitable @columnfractions .20 .80
2725@item @emph{Interface}: @tab @code{subroutine omp_free(ptr, allocator) bind(C)}
2726@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr}
2727@item @tab @code{type (c_ptr), value :: ptr}
2728@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2729@end multitable
2730
2731@item @emph{See also}:
2732@ref{omp_alloc}, @ref{omp_aligned_alloc}, @ref{omp_calloc},
2733@ref{omp_aligned_calloc}, @ref{omp_realloc}
2734
2735@item @emph{Reference}:
2736@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.7
2737@end table
2738
2739
2740
2741@node omp_calloc
2742@subsection @code{omp_calloc} -- Allocate nullified memory with an allocator
2743@table @asis
2744@item @emph{Description}:
2745Allocate zero-initialized memory with the specified allocator, which can either
2746be a predefined allocator, an allocator handle or @code{omp_null_allocator}. If
2747the allocator is @code{omp_null_allocator}, the allocator specified by the
2748@var{def-allocator-var} ICV is used. The to-be allocated memory is for an
2749array with @var{nmemb} elements, each having a size of @var{size} bytes. Both
2750@var{nmemb} and @var{size} must be nonnegative numbers; if either of them is
2751zero, @code{omp_calloc} will return a null pointer. If successful, a pointer to
2752the zero-initialized allocated memory is returned, otherwise the @code{fallback}
2753trait of the allocator determines the behavior.
2754
2755In @code{target} regions, either the @code{dynamic_allocators} clause must
2756appear on a @code{requires} directive in the same compilation unit -- or the
2757@var{allocator} argument may only be a constant expression with the value of
2758one of the predefined allocators and may not be @code{omp_null_allocator}.
2759
2760Memory allocated by @code{omp_calloc} must be freed using @code{omp_free}.
2761
2762@item @emph{C}:
2763@multitable @columnfractions .20 .80
2764@item @emph{Prototype}: @tab @code{void* omp_calloc(size_t nmemb, size_t size,}
2765@item @tab @code{ omp_allocator_handle_t allocator)}
2766@end multitable
2767
2768@item @emph{C++}:
2769@multitable @columnfractions .20 .80
2770@item @emph{Prototype}: @tab @code{void* omp_calloc(size_t nmemb, size_t size,}
2771@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator)}
2772@end multitable
2773
2774@item @emph{Fortran}:
2775@multitable @columnfractions .20 .80
2776@item @emph{Interface}: @tab @code{type(c_ptr) function omp_calloc(nmemb, size, allocator) bind(C)}
2777@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
2778@item @tab @code{integer (c_size_t), value :: nmemb, size}
2779@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2780@end multitable
2781
2782@item @emph{See also}:
2783@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2784@ref{omp_free}, @ref{omp_init_allocator}
2785
2786@item @emph{Reference}:
2787@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.13.8
2788@end table
2789
2790
2791
2792@node omp_aligned_calloc
2793@subsection @code{omp_aligned_calloc} -- Allocate aligned nullified memory with an allocator
2794@table @asis
2795@item @emph{Description}:
2796Allocate zero-initialized memory with the specified allocator, which can either
2797be a predefined allocator, an allocator handle or @code{omp_null_allocator}. If
2798the allocator is @code{omp_null_allocator}, the allocator specified by the
2799@var{def-allocator-var} ICV is used. The to-be allocated memory is for an
2800array with @var{nmemb} elements, each having a size of @var{size} bytes. Both
2801@var{nmemb} and @var{size} must be nonnegative numbers; if either of them is
2802zero, @code{omp_aligned_calloc} will return a null pointer. @var{alignment}
2803must be a positive power of two and @var{size} must be a multiple of the
2804alignment; the alignment will be at least the larger of the value required by
2805the @code{alignment} trait of the allocator and the value of the passed
2806@var{alignment} argument. If successful, a pointer to the zero-initialized
2807allocated memory is returned, otherwise the @code{fallback} trait of the
2808allocator determines the behavior.
2809
2810In @code{target} regions, either the @code{dynamic_allocators} clause must
2811appear on a @code{requires} directive in the same compilation unit -- or the
2812@var{allocator} argument may only be a constant expression with the value of
2813one of the predefined allocators and may not be @code{omp_null_allocator}.
2814
2815Memory allocated by @code{omp_aligned_calloc} must be freed using
2816@code{omp_free}.
2817
2818@item @emph{C}:
2819@multitable @columnfractions .20 .80
2820@item @emph{Prototype}: @tab @code{void* omp_aligned_calloc(size_t alignment, size_t nmemb,}
2821@item @tab @code{ size_t size, omp_allocator_handle_t allocator)}
2822@end multitable
2823
2824@item @emph{C++}:
2825@multitable @columnfractions .20 .80
2826@item @emph{Prototype}: @tab @code{void* omp_aligned_calloc(size_t alignment, size_t nmemb,}
2827@item @tab @code{ size_t size, omp_allocator_handle_t allocator=omp_null_allocator)}
2828@end multitable
2829
2830@item @emph{Fortran}:
2831@multitable @columnfractions .20 .80
2832@item @emph{Interface}: @tab @code{type(c_ptr) function omp_aligned_calloc(alignment, nmemb, size, allocator) bind(C)}
2833@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
2834@item @tab @code{integer (c_size_t), value :: alignment, nmemb, size}
2835@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator}
2836@end multitable
2837
2838@item @emph{See also}:
2839@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2840@ref{omp_free}, @ref{omp_init_allocator}
2841
2842@item @emph{Reference}:
2843@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.13.8
2844@end table
2845
2846
2847
2848@node omp_realloc
2849@subsection @code{omp_realloc} -- Reallocate memory allocated with OpenMP routines
2850@table @asis
2851@item @emph{Description}:
2852The @code{omp_realloc} routine deallocates the memory to which @var{ptr} points
2853and allocates new memory with the specified @var{allocator} argument; the
2854new memory will have the content of the old memory up to the minimum of the
2855old size and the new @var{size}; beyond that, the content of the returned memory
2856is unspecified. If the new allocator is the same as the old one, the routine
2857tries to resize the existing memory allocation, returning the same address as
2858@var{ptr} if successful. @var{ptr} must point to memory allocated by an OpenMP
2859memory-management routine.
2860
2861The @var{allocator} and @var{free_allocator} arguments must be a predefined
2862allocator, an allocator handle or @code{omp_null_allocator}. If
2863@var{free_allocator} is @code{omp_null_allocator}, the implementation
2864automatically determines the allocator used for the allocation of @var{ptr}.
2865If @var{allocator} is @code{omp_null_allocator} and @var{ptr} is not a
2866null pointer, the same allocator as @var{free_allocator} is used; when
2867@var{ptr} is a null pointer, the allocator specified by the
2868@var{def-allocator-var} ICV is used.
2869
2870The @var{size} must be a nonnegative number denoting the number of bytes to be
2871allocated; if @var{size} is zero, @code{omp_realloc} will free the
2872memory and return a null pointer. When @var{size} is nonzero: if successful,
2873a pointer to the allocated memory is returned, otherwise the @code{fallback}
2874trait of the allocator determines the behavior.
2875
2876In @code{target} regions, either the @code{dynamic_allocators} clause must
2877appear on a @code{requires} directive in the same compilation unit -- or the
2878@var{free_allocator} and @var{allocator} arguments may only be a constant
2879expression with the value of one of the predefined allocators and may not be
2880@code{omp_null_allocator}.
2881
2882Memory allocated by @code{omp_realloc} must be freed using @code{omp_free}.
2883Calling @code{omp_free} invokes undefined behavior if the memory
2884was already deallocated or when the used allocator has already been destroyed.
2885
2886@item @emph{C}:
2887@multitable @columnfractions .20 .80
2888@item @emph{Prototype}: @tab @code{void* omp_realloc(void *ptr, size_t size,}
2889@item @tab @code{ omp_allocator_handle_t allocator,}
2890@item @tab @code{ omp_allocator_handle_t free_allocator)}
2891@end multitable
2892
2893@item @emph{C++}:
2894@multitable @columnfractions .20 .80
2895@item @emph{Prototype}: @tab @code{void* omp_realloc(void *ptr, size_t size,}
2896@item @tab @code{ omp_allocator_handle_t allocator=omp_null_allocator,}
2897@item @tab @code{ omp_allocator_handle_t free_allocator=omp_null_allocator)}
2898@end multitable
2899
2900@item @emph{Fortran}:
2901@multitable @columnfractions .20 .80
2902@item @emph{Interface}: @tab @code{type(c_ptr) function omp_realloc(ptr, size, allocator, free_allocator) bind(C)}
2903@item @tab @code{use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t}
2904@item @tab @code{type(c_ptr), value :: ptr}
2905@item @tab @code{integer (c_size_t), value :: size}
2906@item @tab @code{integer (omp_allocator_handle_kind), value :: allocator, free_allocator}
2907@end multitable
2908
2909@item @emph{See also}:
2910@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_set_default_allocator},
2911@ref{omp_free}, @ref{omp_init_allocator}
2912
2913@item @emph{Reference}:
2914@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.13.9
2915@end table
2916
2917
2918
506f068e
TB
2919@c @node Tool Control Routine
2920@c
2921@c FIXME
2922
2923@c @node Environment Display Routine
2924@c @section Environment Display Routine
2925@c
2926@c Routine to display the OpenMP number and the initial value of ICVs.
2927@c It has C linkage and do not throw exceptions.
2928@c
2929@c menu
2930@c * omp_display_env:: <fixme>
2931@c end menu
2932
d77de738
ML
2933@c ---------------------------------------------------------------------
2934@c OpenMP Environment Variables
2935@c ---------------------------------------------------------------------
2936
2937@node Environment Variables
2938@chapter OpenMP Environment Variables
2939
2940The environment variables beginning with @env{OMP_} are defined by
2cd0689a
TB
2941section 4 of the OpenMP specification in version 4.5 or in a later version
2942of the specification, while those beginning with @env{GOMP_} are GNU extensions.
2943Most @env{OMP_} environment variables have an associated internal control
2944variable (ICV).
2945
2946For any OpenMP environment variable that sets an ICV and is neither
2947@code{OMP_DEFAULT_DEVICE} nor has global ICV scope, associated
2948device-specific environment variables exist. For them, the environment
2949variable without suffix affects the host. The suffix @code{_DEV_} followed
2950by a non-negative device number less than the number of available devices sets
2951the ICV for the corresponding device. The suffix @code{_DEV} sets the ICV
2952of all non-host devices for which a device-specific corresponding environment
2953variable has not been set while the @code{_ALL} suffix sets the ICV of all
2954host and non-host devices for which a more specific corresponding environment
2955variable is not set.
d77de738
ML
2956
2957@menu
73a0d3bf
TB
2958* OMP_ALLOCATOR:: Set the default allocator
2959* OMP_AFFINITY_FORMAT:: Set the format string used for affinity display
d77de738 2960* OMP_CANCELLATION:: Set whether cancellation is activated
73a0d3bf 2961* OMP_DISPLAY_AFFINITY:: Display thread affinity information
d77de738
ML
2962* OMP_DISPLAY_ENV:: Show OpenMP version and environment variables
2963* OMP_DEFAULT_DEVICE:: Set the device used in target regions
2964* OMP_DYNAMIC:: Dynamic adjustment of threads
2965* OMP_MAX_ACTIVE_LEVELS:: Set the maximum number of nested parallel regions
2966* OMP_MAX_TASK_PRIORITY:: Set the maximum task priority value
2967* OMP_NESTED:: Nested parallel regions
2968* OMP_NUM_TEAMS:: Specifies the number of teams to use by teams region
2969* OMP_NUM_THREADS:: Specifies the number of threads to use
0b9bd33d
JJ
2970* OMP_PROC_BIND:: Whether threads may be moved between CPUs
2971* OMP_PLACES:: Specifies on which CPUs the threads should be placed
d77de738
ML
2972* OMP_STACKSIZE:: Set default thread stack size
2973* OMP_SCHEDULE:: How threads are scheduled
bc238c40 2974* OMP_TARGET_OFFLOAD:: Controls offloading behavior
d77de738
ML
2975* OMP_TEAMS_THREAD_LIMIT:: Set the maximum number of threads imposed by teams
2976* OMP_THREAD_LIMIT:: Set the maximum number of threads
2977* OMP_WAIT_POLICY:: How waiting threads are handled
2978* GOMP_CPU_AFFINITY:: Bind threads to specific CPUs
2979* GOMP_DEBUG:: Enable debugging output
2980* GOMP_STACKSIZE:: Set default thread stack size
2981* GOMP_SPINCOUNT:: Set the busy-wait spin count
2982* GOMP_RTEMS_THREAD_POOLS:: Set the RTEMS specific thread pools
2983@end menu
2984
2985
73a0d3bf
TB
2986@node OMP_ALLOCATOR
2987@section @env{OMP_ALLOCATOR} -- Set the default allocator
2988@cindex Environment Variable
2989@table @asis
971f119f 2990@item @emph{ICV:} @var{def-allocator-var}
2cd0689a 2991@item @emph{Scope:} data environment
73a0d3bf
TB
2992@item @emph{Description}:
2993Sets the default allocator that is used when no allocator has been specified
2994in the @code{allocate} or @code{allocator} clause or if an OpenMP memory
2995routine is invoked with the @code{omp_null_allocator} allocator.
2996If unset, @code{omp_default_mem_alloc} is used.
2997
2998The value can either be a predefined allocator or a predefined memory space
2999or a predefined memory space followed by a colon and a comma-separated list
3000of memory trait and value pairs, separated by @code{=}.
3001
2cd0689a
TB
3002Note: The corresponding device environment variables are currently not
3003supported. Therefore, the non-host @var{def-allocator-var} ICVs are always
3004initialized to @code{omp_default_mem_alloc}. However, on all devices,
3005the @code{omp_set_default_allocator} API routine can be used to change
3006the value.
3007
73a0d3bf 3008@multitable @columnfractions .45 .45
a85a106c 3009@headitem Predefined allocators @tab Associated predefined memory spaces
73a0d3bf
TB
3010@item omp_default_mem_alloc @tab omp_default_mem_space
3011@item omp_large_cap_mem_alloc @tab omp_large_cap_mem_space
3012@item omp_const_mem_alloc @tab omp_const_mem_space
3013@item omp_high_bw_mem_alloc @tab omp_high_bw_mem_space
3014@item omp_low_lat_mem_alloc @tab omp_low_lat_mem_space
3015@item omp_cgroup_mem_alloc @tab --
3016@item omp_pteam_mem_alloc @tab --
3017@item omp_thread_mem_alloc @tab --
3018@end multitable
3019
a85a106c
TB
3020The predefined allocators use the default values for the traits,
3021as listed below, except that the last three allocators have the
3022@code{access} trait set to @code{cgroup}, @code{pteam}, and
3023@code{thread}, respectively.
3024
3025@multitable @columnfractions .25 .40 .25
3026@headitem Trait @tab Allowed values @tab Default value
73a0d3bf
TB
3027@item @code{sync_hint} @tab @code{contended}, @code{uncontended},
3028 @code{serialized}, @code{private}
a85a106c 3029 @tab @code{contended}
73a0d3bf 3030@item @code{alignment} @tab Positive integer being a power of two
a85a106c 3031 @tab 1 byte
73a0d3bf
TB
3032@item @code{access} @tab @code{all}, @code{cgroup},
3033 @code{pteam}, @code{thread}
a85a106c 3034 @tab @code{all}
73a0d3bf 3035@item @code{pool_size} @tab Positive integer
a85a106c 3036 @tab See @ref{Memory allocation}
73a0d3bf
TB
3037@item @code{fallback} @tab @code{default_mem_fb}, @code{null_fb},
3038 @code{abort_fb}, @code{allocator_fb}
a85a106c 3039 @tab See below
73a0d3bf 3040@item @code{fb_data} @tab @emph{unsupported as it needs an allocator handle}
a85a106c 3041 @tab (none)
73a0d3bf 3042@item @code{pinned} @tab @code{true}, @code{false}
a85a106c 3043 @tab @code{false}
73a0d3bf
TB
3044@item @code{partition} @tab @code{environment}, @code{nearest},
3045 @code{blocked}, @code{interleaved}
a85a106c 3046 @tab @code{environment}
73a0d3bf
TB
3047@end multitable
3048
a85a106c
TB
3049For the @code{fallback} trait, the default value is @code{null_fb} for the
3050@code{omp_default_mem_alloc} allocator and any allocator that is associated
3051with device memory; for all other allocators, it is
3052@code{default_mem_fb}.
3053
73a0d3bf
TB
3054Examples:
3055@smallexample
3056OMP_ALLOCATOR=omp_high_bw_mem_alloc
3057OMP_ALLOCATOR=omp_large_cap_mem_space
506f068e 3058OMP_ALLOCATOR=omp_low_lat_mem_space:pinned=true,partition=nearest
73a0d3bf
TB
3059@end smallexample
3060
a85a106c 3061@item @emph{See also}:
971f119f
TB
3062@ref{Memory allocation}, @ref{omp_get_default_allocator},
3063@ref{omp_set_default_allocator}
73a0d3bf
TB
3064
3065@item @emph{Reference}:
3066@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.21
3067@end table
3068
3069
3070
3071@node OMP_AFFINITY_FORMAT
3072@section @env{OMP_AFFINITY_FORMAT} -- Set the format string used for affinity display
3073@cindex Environment Variable
3074@table @asis
2cd0689a
TB
3075@item @emph{ICV:} @var{affinity-format-var}
3076@item @emph{Scope:} device
73a0d3bf
TB
3077@item @emph{Description}:
3078Sets the format string used when displaying OpenMP thread affinity information.
3079Special values are output using @code{%} followed by an optional size
3080specification and then either the single-character field type or its long
15886c03 3081name enclosed in curly braces; using @code{%%} displays a literal percent.
73a0d3bf 3082The size specification consists of an optional @code{0.} or @code{.} followed
450b05ce 3083by a positive integer, specifying the minimal width of the output. With
73a0d3bf
TB
3084@code{0.} and numerical values, the output is padded with zeros on the left;
3085with @code{.}, the output is padded by spaces on the left; otherwise, the
3086output is padded by spaces on the right. If unset, the value is
3087``@code{level %L thread %i affinity %A}''.
3088
3089Supported field types are:
3090
3091@multitable @columnfractions .10 .25 .60
3092@item t @tab team_num @tab value returned by @code{omp_get_team_num}
3093@item T @tab num_teams @tab value returned by @code{omp_get_num_teams}
3094@item L @tab nesting_level @tab value returned by @code{omp_get_level}
3095@item n @tab thread_num @tab value returned by @code{omp_get_thread_num}
3096@item N @tab num_threads @tab value returned by @code{omp_get_num_threads}
3097@item a @tab ancestor_tnum
3098 @tab value returned by
3099 @code{omp_get_ancestor_thread_num(omp_get_level()-1)}
3100@item H @tab host @tab name of the host that executes the thread
450b05ce
TB
3101@item P @tab process_id @tab process identifier
3102@item i @tab native_thread_id @tab native thread identifier
73a0d3bf
TB
3103@item A @tab thread_affinity
3104 @tab comma separated list of integer values or ranges, representing the
3105 processors on which a process might execute, subject to affinity
3106 mechanisms
3107@end multitable
3108
3109For instance, after setting
3110
3111@smallexample
3112OMP_AFFINITY_FORMAT="%0.2a!%n!%.4L!%N;%.2t;%0.2T;%@{team_num@};%@{num_teams@};%A"
3113@end smallexample
3114
3115with either @code{OMP_DISPLAY_AFFINITY} being set or when calling
3116@code{omp_display_affinity} with @code{NULL} or an empty string, the program
3117might display the following:
3118
3119@smallexample
312000!0! 1!4; 0;01;0;1;0-11
312100!3! 1!4; 0;01;0;1;0-11
312200!2! 1!4; 0;01;0;1;0-11
312300!1! 1!4; 0;01;0;1;0-11
3124@end smallexample
3125
3126@item @emph{See also}:
3127@ref{OMP_DISPLAY_AFFINITY}
3128
3129@item @emph{Reference}:
3130@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.14
3131@end table
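
The three padding rules of the size specification map closely onto
@code{printf} width flags. The following is a minimal sketch under that
assumption, not the code libgomp uses; the helper
@code{format_numeric_field} and the @code{pad_mode} enumeration are
hypothetical names introduced only for illustration.

```c
#include <stdio.h>

/* Hypothetical names, not part of libgomp.  */
enum pad_mode
{
  PAD_ZERO_LEFT,   /* size specification "0.<width>" */
  PAD_SPACE_LEFT,  /* size specification ".<width>"  */
  PAD_SPACE_RIGHT  /* size specification "<width>"   */
};

/* Format VALUE into BUF using at least WIDTH characters.  WIDTH is
   assumed to be positive.  Returns what snprintf returns.  */
int
format_numeric_field (char *buf, size_t size, long value, int width,
                      enum pad_mode mode)
{
  switch (mode)
    {
    case PAD_ZERO_LEFT:
      return snprintf (buf, size, "%0*ld", width, value);
    case PAD_SPACE_LEFT:
      return snprintf (buf, size, "%*ld", width, value);
    default:
      return snprintf (buf, size, "%-*ld", width, value);
    }
}
```

With a width of 4 and the value 1, the three modes yield @code{0001},
the value preceded by three spaces, and the value followed by three
spaces, respectively.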
3132
3133
3134
d77de738
ML
3135@node OMP_CANCELLATION
3136@section @env{OMP_CANCELLATION} -- Set whether cancellation is activated
3137@cindex Environment Variable
3138@table @asis
2cd0689a
TB
3139@item @emph{ICV:} @var{cancel-var}
3140@item @emph{Scope:} global
d77de738
ML
3141@item @emph{Description}:
3142If set to @code{TRUE}, cancellation is activated. If set to @code{FALSE} or
3143if unset, cancellation is disabled and the @code{cancel} construct is ignored.
3144
3145@item @emph{See also}:
3146@ref{omp_get_cancellation}
3147
3148@item @emph{Reference}:
3149@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.11
3150@end table
3151
3152
3153
73a0d3bf
TB
3154@node OMP_DISPLAY_AFFINITY
3155@section @env{OMP_DISPLAY_AFFINITY} -- Display thread affinity information
3156@cindex Environment Variable
3157@table @asis
2cd0689a
TB
3158@item @emph{ICV:} @var{display-affinity-var}
3159@item @emph{Scope:} global
73a0d3bf
TB
3160@item @emph{Description}:
3161If set to @code{FALSE} or if unset, affinity displaying is disabled.
15886c03 3162If set to @code{TRUE}, the runtime displays affinity information about
73a0d3bf
TB
3163OpenMP threads in a parallel region upon entering the region and every time
3164any change occurs.
3165
3166@item @emph{See also}:
3167@ref{OMP_AFFINITY_FORMAT}
3168
3169@item @emph{Reference}:
3170@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.13
3171@end table
3172
3173
3174
3175
d77de738
ML
3176@node OMP_DISPLAY_ENV
3177@section @env{OMP_DISPLAY_ENV} -- Show OpenMP version and environment variables
3178@cindex Environment Variable
3179@table @asis
2cd0689a
TB
3180@item @emph{ICV:} none
3181@item @emph{Scope:} not applicable
d77de738
ML
3182@item @emph{Description}:
3183If set to @code{TRUE}, the OpenMP version number and the values
3184associated with the OpenMP environment variables are printed to @code{stderr}.
3185If set to @code{VERBOSE}, it additionally shows the value of the environment
3186variables which are GNU extensions. If undefined or set to @code{FALSE},
15886c03 3187this information is not shown.
d77de738
ML
3188
3189
3190@item @emph{Reference}:
3191@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.12
3192@end table
3193
3194
3195
3196@node OMP_DEFAULT_DEVICE
3197@section @env{OMP_DEFAULT_DEVICE} -- Set the device used in target regions
3198@cindex Environment Variable
3199@table @asis
2cd0689a
TB
3200@item @emph{ICV:} @var{default-device-var}
3201@item @emph{Scope:} data environment
d77de738
ML
3202@item @emph{Description}:
3203Set to choose the device which is used in a @code{target} region, unless the
3204value is overridden by @code{omp_set_default_device} or by a @code{device}
3205clause. The value shall be the nonnegative device number. If no device with
3206the given device number exists, the code is executed on the host. If unset,
18c8b56c
TB
3207@env{OMP_TARGET_OFFLOAD} is @code{mandatory} and no non-host devices are
3208available, it is set to @code{omp_invalid_device}. Otherwise, if unset,
15886c03 3209device number 0 is used.
d77de738
ML
3210
3211
3212@item @emph{See also}:
3213@ref{omp_get_default_device}, @ref{omp_set_default_device},
8bd11fa4 3214@ref{OMP_TARGET_OFFLOAD}
d77de738
ML
3215
3216@item @emph{Reference}:
8bd11fa4 3217@uref{https://www.openmp.org, OpenMP specification v5.2}, Section 21.2.7
d77de738
ML
3218@end table
3219
3220
3221
3222@node OMP_DYNAMIC
3223@section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
3224@cindex Environment Variable
3225@table @asis
2cd0689a
TB
3226@item @emph{ICV:} @var{dyn-var}
3227@item @emph{Scope:} global
d77de738
ML
3228@item @emph{Description}:
3229Enable or disable the dynamic adjustment of the number of threads
3230within a team. The value of this environment variable shall be
3231@code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is
3232disabled by default.
3233
3234@item @emph{See also}:
3235@ref{omp_set_dynamic}
3236
3237@item @emph{Reference}:
3238@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.3
3239@end table
3240
3241
3242
3243@node OMP_MAX_ACTIVE_LEVELS
3244@section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions
3245@cindex Environment Variable
3246@table @asis
2cd0689a
TB
3247@item @emph{ICV:} @var{max-active-levels-var}
3248@item @emph{Scope:} data environment
d77de738
ML
3249@item @emph{Description}:
3250Specifies the initial value for the maximum number of nested parallel
3251regions. The value of this variable shall be a positive integer.
3252If undefined, then if @env{OMP_NESTED} is defined and set to true, or
3253if @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined and set to
3254a list with more than one item, the maximum number of nested parallel
15886c03
TB
3255regions is initialized to the largest number supported, otherwise
3256it is set to one.
d77de738
ML
3257
3258@item @emph{See also}:
2cd0689a
TB
3259@ref{omp_set_max_active_levels}, @ref{OMP_NESTED}, @ref{OMP_PROC_BIND},
3260@ref{OMP_NUM_THREADS}
3261
d77de738
ML
3262
3263@item @emph{Reference}:
3264@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.9
3265@end table
3266
3267
3268
3269@node OMP_MAX_TASK_PRIORITY
3270@section @env{OMP_MAX_TASK_PRIORITY} -- Set the maximum priority number that can be set for a task
3272@cindex Environment Variable
3273@table @asis
2cd0689a
TB
3274@item @emph{ICV:} @var{max-task-priority-var}
3275@item @emph{Scope:} global
d77de738
ML
3276@item @emph{Description}:
3277Specifies the initial value for the maximum priority value that can be
3278set for a task. The value of this variable shall be a nonnegative
3279integer. If undefined, the maximum task priority is
32800.
3281
3282@item @emph{See also}:
3283@ref{omp_get_max_task_priority}
3284
3285@item @emph{Reference}:
3286@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.14
3287@end table
3288
3289
3290
3291@node OMP_NESTED
3292@section @env{OMP_NESTED} -- Nested parallel regions
3293@cindex Environment Variable
3294@cindex Implementation specific setting
3295@table @asis
2cd0689a
TB
3296@item @emph{ICV:} @var{max-active-levels-var}
3297@item @emph{Scope:} data environment
d77de738
ML
3298@item @emph{Description}:
3299Enable or disable nested parallel regions, i.e., whether team members
3300are allowed to create new teams. The value of this environment variable
3301shall be @code{TRUE} or @code{FALSE}. If set to @code{TRUE}, the number
15886c03
TB
3302of maximum active nested regions supported is by default set to the
3303maximum supported, otherwise it is set to one. If
3304@env{OMP_MAX_ACTIVE_LEVELS} is defined, its setting overrides this
d77de738
ML
3305setting. If both are undefined, nested parallel regions are enabled if
3306@env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined to a list with
3307more than one item, otherwise they are disabled by default.
3308
2cd0689a
TB
3309Note that the @code{OMP_NESTED} environment variable was deprecated in
3310the OpenMP specification 5.2 in favor of @code{OMP_MAX_ACTIVE_LEVELS}.
3311
d77de738 3312@item @emph{See also}:
2cd0689a
TB
3313@ref{omp_set_max_active_levels}, @ref{omp_set_nested},
3314@ref{OMP_MAX_ACTIVE_LEVELS}
d77de738
ML
3315
3316@item @emph{Reference}:
3317@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.6
3318@end table
3319
3320
3321
3322@node OMP_NUM_TEAMS
3323@section @env{OMP_NUM_TEAMS} -- Specifies the number of teams to use by teams region
3324@cindex Environment Variable
3325@table @asis
2cd0689a
TB
3326@item @emph{ICV:} @var{nteams-var}
3327@item @emph{Scope:} device
d77de738
ML
3328@item @emph{Description}:
3329Specifies the upper bound for the number of teams to use in teams regions
3330without an explicit @code{num_teams} clause. The value of this variable shall
3331be a positive integer. If undefined, it defaults to 0, which means an
3332implementation-defined upper bound.
3333
3334@item @emph{See also}:
3335@ref{omp_set_num_teams}
3336
3337@item @emph{Reference}:
3338@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 6.23
3339@end table
3340
3341
3342
3343@node OMP_NUM_THREADS
3344@section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
3345@cindex Environment Variable
3346@cindex Implementation specific setting
3347@table @asis
2cd0689a
TB
3348@item @emph{ICV:} @var{nthreads-var}
3349@item @emph{Scope:} data environment
d77de738
ML
3350@item @emph{Description}:
3351Specifies the default number of threads to use in parallel regions. The
3352value of this variable shall be a comma-separated list of positive integers;
3353the value specifies the number of threads to use for the corresponding nested
15886c03 3354level. Specifying more than one item in the list automatically enables
d77de738
ML
3355nesting by default. If undefined, one thread per CPU is used.
3356
2cd0689a
TB
3357When a list with more than one value is specified, it also affects the
3358@var{max-active-levels-var} ICV as described in @ref{OMP_MAX_ACTIVE_LEVELS}.
3359
d77de738 3360@item @emph{See also}:
2cd0689a 3361@ref{omp_set_num_threads}, @ref{OMP_MAX_ACTIVE_LEVELS}
d77de738
ML
3362
3363@item @emph{Reference}:
3364@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.2
3365@end table
3366
3367
3368
3369@node OMP_PROC_BIND
0b9bd33d 3370@section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs
d77de738
ML
3371@cindex Environment Variable
3372@table @asis
2cd0689a
TB
3373@item @emph{ICV:} @var{bind-var}
3374@item @emph{Scope:} data environment
d77de738
ML
3375@item @emph{Description}:
3376Specifies whether threads may be moved between processors. If set to
0b9bd33d 3377@code{TRUE}, OpenMP threads should not be moved; if set to @code{FALSE}
d77de738
ML
3378they may be moved. Alternatively, a comma separated list with the
3379values @code{PRIMARY}, @code{MASTER}, @code{CLOSE} and @code{SPREAD} can
3380be used to specify the thread affinity policy for the corresponding nesting
3381level. With @code{PRIMARY} and @code{MASTER} the worker threads are in the
3382same place partition as the primary thread. With @code{CLOSE} those are
3383kept close to the primary thread in contiguous place partitions. And
3384with @code{SPREAD} a sparse distribution
3385across the place partitions is used. Specifying more than one item in the
15886c03 3386list automatically enables nesting by default.
d77de738 3387
2cd0689a
TB
3388When a list is specified, it also affects the @var{max-active-levels-var} ICV
3389as described in @ref{OMP_MAX_ACTIVE_LEVELS}.
3390
d77de738
ML
3391When undefined, @env{OMP_PROC_BIND} defaults to @code{TRUE} when
3392@env{OMP_PLACES} or @env{GOMP_CPU_AFFINITY} is set and @code{FALSE} otherwise.
3393
3394@item @emph{See also}:
2cd0689a
TB
3395@ref{omp_get_proc_bind}, @ref{GOMP_CPU_AFFINITY}, @ref{OMP_PLACES},
3396@ref{OMP_MAX_ACTIVE_LEVELS}
d77de738
ML
3397
3398@item @emph{Reference}:
3399@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.4
3400@end table
3401
3402
3403
3404@node OMP_PLACES
0b9bd33d 3405@section @env{OMP_PLACES} -- Specifies on which CPUs the threads should be placed
d77de738
ML
3406@cindex Environment Variable
3407@table @asis
2cd0689a
TB
3408@item @emph{ICV:} @var{place-partition-var}
3409@item @emph{Scope:} implicit tasks
d77de738
ML
3410@item @emph{Description}:
3411The thread placement can be either specified using an abstract name or by an
3412explicit list of the places. The abstract names @code{threads}, @code{cores},
3413@code{sockets}, @code{ll_caches} and @code{numa_domains} can be optionally
3414followed by a positive number in parentheses, which denotes how many places
3415shall be created. With @code{threads} each place corresponds to a single
3416hardware thread; @code{cores} to a single core with the corresponding number of
3417hardware threads; with @code{sockets} the place corresponds to a single
3418socket; with @code{ll_caches} to a set of cores that shares the last level
3419cache on the device; and @code{numa_domains} to a set of cores for which their
3420closest memory on the device is the same memory and at a similar distance from
3421the cores. The resulting placement can be shown by setting the
3422@env{OMP_DISPLAY_ENV} environment variable.
3423
3424Alternatively, the placement can be specified explicitly as a comma-separated
3425list of places. A place is specified by a set of nonnegative numbers in curly
3426braces, denoting the hardware threads. The curly braces can be omitted
3427when only a single number has been specified. The hardware threads
3428belonging to a place can either be specified as a comma-separated list of
3429nonnegative thread numbers or using an interval. Multiple places can also be
3430either specified by a comma-separated list of places or by an interval. To
3431specify an interval, a colon followed by the count is placed after
3432the hardware thread number or the place. Optionally, the length can be
3433followed by a colon and the stride number -- otherwise a unit stride is
3434assumed. Placing an exclamation mark (@code{!}) directly before a curly
15886c03
TB
3435brace or numbers inside the curly braces (excluding intervals)
3436excludes those hardware threads.
d77de738
ML
3437
3438For instance, the following all specify the same places list:
3439@code{"@{0,1,2@}, @{3,4,5@}, @{6,7,8@}, @{9,10,11@}"};
3440@code{"@{0:3@}, @{3:3@}, @{6:3@}, @{9:3@}"}; and @code{"@{0:3@}:4:3"}.
3441
3442If @env{OMP_PLACES} and @env{GOMP_CPU_AFFINITY} are unset and
3443@env{OMP_PROC_BIND} is either unset or @code{FALSE}, threads may be moved
3444between CPUs following no placement policy.
3445
3446@item @emph{See also}:
3447@ref{OMP_PROC_BIND}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind},
3448@ref{OMP_DISPLAY_ENV}
3449
3450@item @emph{Reference}:
3451@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.5
3452@end table
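
The interval notation for places can be read as simple arithmetic: a
place of the form @code{@{first:len@}} followed by @code{:count:stride}
yields @code{count} places, each holding @code{len} consecutive hardware
threads, with the first thread of successive places @code{stride} apart.
The sketch below illustrates that reading only; it is not libgomp's
parser, and the helper name @code{expand_place_interval} is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper, not part of libgomp: write the explicit places
   list equivalent to "{first:len}:count:stride" into BUF.  Threads
   inside a place use unit stride.  Returns 0 on success and -1 if BUF
   is too small.  */
int
expand_place_interval (unsigned first, unsigned len, unsigned count,
                       unsigned stride, char *buf, size_t size)
{
  size_t used = 0;

  if (size == 0)
    return -1;
  buf[0] = '\0';

  for (unsigned p = 0; p < count; p++)
    for (unsigned t = 0; t < len; t++)
      {
        /* The first thread of a place opens a brace (after a comma if
           it is not the first place); later threads follow a comma.  */
        const char *sep = t > 0 ? "," : p > 0 ? ",{" : "{";
        int n = snprintf (buf + used, size - used, "%s%u%s", sep,
                          first + p * stride + t,
                          t + 1 == len ? "}" : "");
        if (n < 0 || (size_t) n >= size - used)
          return -1;
        used += (size_t) n;
      }
  return 0;
}
```

For example, first 0, length 3, count 4 and stride 3 produce four places
that together cover hardware threads 0 through 11.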
3453
3454
3455
3456@node OMP_STACKSIZE
3457@section @env{OMP_STACKSIZE} -- Set default thread stack size
3458@cindex Environment Variable
3459@table @asis
2cd0689a
TB
3460@item @emph{ICV:} @var{stacksize-var}
3461@item @emph{Scope:} device
d77de738
ML
3462@item @emph{Description}:
3463Set the default thread stack size in kilobytes, unless the number
3464is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which
3465case the size is, respectively, in bytes, kilobytes, megabytes
3466or gigabytes. This is different from @code{pthread_attr_setstacksize}
3467which gets the number of bytes as an argument. If the stack size cannot
3468be set due to system constraints, an error is reported and the initial
3469stack size is left unchanged. If undefined, the stack size is system
3470dependent.
3471
2cd0689a
TB
3472@item @emph{See also}:
3473@ref{GOMP_STACKSIZE}
3474
d77de738
ML
3475@item @emph{Reference}:
3476@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.7
3477@end table
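
To make the unit rules concrete, the following sketch converts a value
in the format described above into bytes. It is an illustration only,
not the parser libgomp uses; the helper name @code{stacksize_to_bytes}
is hypothetical, accepting lowercase suffixes is an assumption, and
overflow is not checked.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libgomp: convert an
   OMP_STACKSIZE-style string to bytes.  Returns 0 on success and -1
   on a malformed value.  Overflow is not checked.  */
int
stacksize_to_bytes (const char *s, unsigned long long *bytes)
{
  char *end;
  unsigned long long value;
  unsigned shift = 10;   /* No suffix means kilobytes.  */

  if (!isdigit ((unsigned char) *s))
    return -1;
  value = strtoull (s, &end, 10);

  switch (*end)
    {
    case '\0': break;
    case 'B': case 'b': shift = 0; end++; break;
    case 'K': case 'k': shift = 10; end++; break;
    case 'M': case 'm': shift = 20; end++; break;
    case 'G': case 'g': shift = 30; end++; break;
    default: return -1;
    }
  if (*end != '\0')   /* Trailing characters after the suffix.  */
    return -1;

  *bytes = value << shift;
  return 0;
}
```

Under these assumptions, @code{512} and @code{512K} both denote 524288
bytes, whereas @code{512B} denotes 512 bytes.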
3478
3479
3480
3481@node OMP_SCHEDULE
3482@section @env{OMP_SCHEDULE} -- How threads are scheduled
3483@cindex Environment Variable
3484@cindex Implementation specific setting
3485@table @asis
2cd0689a
TB
3486@item @emph{ICV:} @var{run-sched-var}
3487@item @emph{Scope:} data environment
d77de738
ML
3488@item @emph{Description}:
3489Specifies the schedule type and chunk size.
3490The value of the variable shall have the form @code{type[,chunk]}, where
3491@code{type} is one of @code{static}, @code{dynamic}, @code{guided} or @code{auto}.
3492The optional @code{chunk} size shall be a positive integer. If undefined,
3493dynamic scheduling and a chunk size of 1 are used.
3494
3495@item @emph{See also}:
3496@ref{omp_set_schedule}
3497
3498@item @emph{Reference}:
3499@uref{https://www.openmp.org, OpenMP specification v4.5}, Sections 2.7.1.1 and 4.1
3500@end table
3501
3502
3503
3504@node OMP_TARGET_OFFLOAD
bc238c40 3505@section @env{OMP_TARGET_OFFLOAD} -- Controls offloading behavior
d77de738
ML
3506@cindex Environment Variable
3507@cindex Implementation specific setting
3508@table @asis
2cd0689a
TB
3509@item @emph{ICV:} @var{target-offload-var}
3510@item @emph{Scope:} global
d77de738 3511@item @emph{Description}:
bc238c40 3512Specifies the behavior with regard to offloading code to a device. This
d77de738
ML
3513variable can be set to one of three values - @code{MANDATORY}, @code{DISABLED}
3514or @code{DEFAULT}.
3515
15886c03 3516If set to @code{MANDATORY}, the program terminates with an error if
8bd11fa4
TB
3517any device construct or device memory routine uses a device that is unavailable
3518or not supported by the implementation, or uses a non-conforming device number.
15886c03
TB
3519If set to @code{DISABLED}, then offloading is disabled and all code runs on
3520the host. If set to @code{DEFAULT}, the program tries offloading to the
3521device first, then falls back to running code on the host if it cannot.
d77de738 3522
15886c03 3523If undefined, then the program behaves as if @code{DEFAULT} was set.
d77de738 3524
15886c03 3525Note: Even with @code{MANDATORY}, no run-time termination is performed when
8bd11fa4
TB
3526the device number in a @code{device} clause or argument to a device memory
3527routine is for host, which includes using the device number in the
3528@var{default-device-var} ICV. However, the initial value of
3529the @var{default-device-var} ICV is affected by @code{MANDATORY}.
3530
3531@item @emph{See also}:
3532@ref{OMP_DEFAULT_DEVICE}
3533
d77de738 3534@item @emph{Reference}:
8bd11fa4 3535@uref{https://www.openmp.org, OpenMP specification v5.2}, Section 21.2.8
d77de738
ML
3536@end table
3537
3538
3539
3540@node OMP_TEAMS_THREAD_LIMIT
3541@section @env{OMP_TEAMS_THREAD_LIMIT} -- Set the maximum number of threads imposed by teams
3542@cindex Environment Variable
3543@table @asis
2cd0689a
TB
3544@item @emph{ICV:} @var{teams-thread-limit-var}
3545@item @emph{Scope:} device
d77de738
ML
3546@item @emph{Description}:
3547Specifies an upper bound for the number of threads used by each contention
3548group created by a teams construct without an explicit @code{thread_limit}
3549clause. The value of this variable shall be a positive integer. If undefined,
3550the value of 0 is used, which stands for an implementation-defined upper
3551limit.
3552
3553@item @emph{See also}:
3554@ref{OMP_THREAD_LIMIT}, @ref{omp_set_teams_thread_limit}
3555
3556@item @emph{Reference}:
3557@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 6.24
3558@end table
3559
3560
3561
3562@node OMP_THREAD_LIMIT
3563@section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads
3564@cindex Environment Variable
3565@table @asis
2cd0689a
TB
3566@item @emph{ICV:} @var{thread-limit-var}
3567@item @emph{Scope:} data environment
d77de738
ML
3568@item @emph{Description}:
3569Specifies the number of threads to use for the whole program. The
3570value of this variable shall be a positive integer. If undefined,
3571the number of threads is not limited.
3572
3573@item @emph{See also}:
3574@ref{OMP_NUM_THREADS}, @ref{omp_get_thread_limit}
3575
3576@item @emph{Reference}:
3577@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.10
3578@end table
3579
3580
3581
3582@node OMP_WAIT_POLICY
3583@section @env{OMP_WAIT_POLICY} -- How waiting threads are handled
3584@cindex Environment Variable
3585@table @asis
3586@item @emph{Description}:
3587Specifies whether waiting threads should be active or passive. If
3588the value is @code{PASSIVE}, waiting threads should not consume CPU
3589power while waiting; if the value is @code{ACTIVE}, they
3590should. If undefined, threads wait actively for a short time
3591before waiting passively.
3592
3593@item @emph{See also}:
3594@ref{GOMP_SPINCOUNT}
3595
3596@item @emph{Reference}:
3597@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.8
3598@end table
3599
3600
3601
3602@node GOMP_CPU_AFFINITY
3603@section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
3604@cindex Environment Variable
3605@table @asis
3606@item @emph{Description}:
3607Binds threads to specific CPUs. The variable should contain a space-separated
3608or comma-separated list of CPUs. This list may contain different kinds of
3609entries: either single CPU numbers in any order, a range of CPUs (M-N)
3610or a range with some stride (M-N:S). CPU numbers are zero based. For example,
15886c03 3611@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} binds the initial thread
d77de738
ML
3612to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
3613CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
15886c03 3614and 14 respectively and then starts assigning back from the beginning of
d77de738
ML
3615the list. @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0.
3616
3617There is no libgomp library routine to determine whether a CPU affinity
3618specification is in effect. As a workaround, language-specific library
3619functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in
3620Fortran, may be used to query the setting of the @code{GOMP_CPU_AFFINITY}
3621environment variable. A defined CPU affinity on startup cannot be changed
3622or disabled during the runtime of the application.
3623
3624If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
3625@env{OMP_PROC_BIND} has a higher precedence. If neither has been set and
3626@env{OMP_PROC_BIND} is unset, or when @env{OMP_PROC_BIND} is set to
15886c03 3627@code{FALSE}, the host system handles the assignment of threads to CPUs.
d77de738
ML
3628
3629@item @emph{See also}:
3630@ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
3631@end table
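
The list syntax can be illustrated by expanding it into the sequence of
CPU numbers to which successive threads are bound. The sketch below
follows the description above (single numbers, @code{M-N} ranges and
@code{M-N:S} strided ranges, separated by spaces or commas). It is not
libgomp's parser, and the helper name @code{expand_cpu_list} is
hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper, not part of libgomp: expand a
   GOMP_CPU_AFFINITY-style list into CPUS, which has room for MAX
   entries.  Returns the number of entries written, or -1 if the list
   is malformed or does not fit.  */
int
expand_cpu_list (const char *s, int *cpus, int max)
{
  int n = 0;

  for (;;)
    {
      char *end;
      long first, last, stride = 1;

      /* Skip the separators between entries.  */
      while (*s == ' ' || *s == ',')
        s++;
      if (*s == '\0')
        return n;

      first = last = strtol (s, &end, 10);
      if (end == s || first < 0)
        return -1;
      s = end;

      if (*s == '-')   /* Range M-N.  */
        {
          last = strtol (s + 1, &end, 10);
          if (end == s + 1 || last < first)
            return -1;
          s = end;
          if (*s == ':')   /* Strided range M-N:S.  */
            {
              stride = strtol (s + 1, &end, 10);
              if (end == s + 1 || stride <= 0)
                return -1;
              s = end;
            }
        }

      for (long cpu = first; cpu <= last; cpu += stride)
        {
          if (n == max)
            return -1;
          cpus[n++] = (int) cpu;
        }
    }
}
```

Applied to @code{"0 3 1-2 4-15:2"}, the sketch yields ten entries: 0, 3,
1, 2, 4, 6, 8, 10, 12 and 14, matching the binding order described
above.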
3632
3633
3634
3635@node GOMP_DEBUG
3636@section @env{GOMP_DEBUG} -- Enable debugging output
3637@cindex Environment Variable
3638@table @asis
3639@item @emph{Description}:
3640Enable debugging output. The variable should be set to @code{0}
3641(disabled, also the default if not set), or @code{1} (enabled).
3642
15886c03 3643If enabled, some debugging output is printed during execution.
d77de738
ML
3644This is currently not specified in more detail, and subject to change.
3645@end table
3646
3647
3648
3649@node GOMP_STACKSIZE
3650@section @env{GOMP_STACKSIZE} -- Set default thread stack size
3651@cindex Environment Variable
3652@cindex Implementation specific setting
3653@table @asis
3654@item @emph{Description}:
3655Set the default thread stack size in kilobytes. This is different from
3656@code{pthread_attr_setstacksize} which gets the number of bytes as an
3657argument. If the stack size cannot be set due to system constraints, an
3658error is reported and the initial stack size is left unchanged. If undefined,
3659the stack size is system dependent.
3660
3661@item @emph{See also}:
3662@ref{OMP_STACKSIZE}
3663
3664@item @emph{Reference}:
3665@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
3666GCC Patches mailing list},
3667@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
3668GCC Patches mailing list}
3669@end table
3670
3671
3672
3673@node GOMP_SPINCOUNT
3674@section @env{GOMP_SPINCOUNT} -- Set the busy-wait spin count
3675@cindex Environment Variable
3676@cindex Implementation specific setting
3677@table @asis
3678@item @emph{Description}:
3679Determines how long a thread waits actively, consuming CPU power,
3680before waiting passively without consuming CPU power. The value may be
3681either @code{INFINITE} or @code{INFINITY} to always wait actively, or an
3682integer which gives the number of spins of the busy-wait loop. The
3683integer may optionally be followed by one of the following suffixes acting
3684as a multiplication factor: @code{k} (kilo, thousand), @code{M} (mega,
3685million), @code{G} (giga, billion), or @code{T} (tera, trillion).
3686If undefined, 0 is used when @env{OMP_WAIT_POLICY} is @code{PASSIVE},
3687300,000 is used when @env{OMP_WAIT_POLICY} is undefined and
368830 billion is used when @env{OMP_WAIT_POLICY} is @code{ACTIVE}.
3689If there are more OpenMP threads than available CPUs, 1000 and 100
3690spins are used for @env{OMP_WAIT_POLICY} being @code{ACTIVE} or
3691undefined, respectively; unless the @env{GOMP_SPINCOUNT} is lower
3692or @env{OMP_WAIT_POLICY} is @code{PASSIVE}.
3693
3694@item @emph{See also}:
3695@ref{OMP_WAIT_POLICY}
3696@end table
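
As an illustration of the value syntax, the sketch below turns a spin
count string into a number. It is not libgomp's parser: the helper name
@code{parse_spincount} is hypothetical, representing an infinite count
as @code{ULLONG_MAX} is an assumption made here, the suffixes are
matched exactly as spelled above, and overflow is not checked.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libgomp: parse a
   GOMP_SPINCOUNT-style value.  INFINITE and INFINITY are stored as
   ULLONG_MAX (an assumption of this sketch).  Returns 0 on success
   and -1 on a malformed value.  Overflow is not checked.  */
int
parse_spincount (const char *s, unsigned long long *count)
{
  char *end;
  unsigned long long value, factor = 1;

  if (strcmp (s, "INFINITE") == 0 || strcmp (s, "INFINITY") == 0)
    {
      *count = ULLONG_MAX;
      return 0;
    }
  if (*s < '0' || *s > '9')
    return -1;
  value = strtoull (s, &end, 10);

  switch (*end)
    {
    case '\0': break;
    case 'k': factor = 1000ULL; end++; break;
    case 'M': factor = 1000000ULL; end++; break;
    case 'G': factor = 1000000000ULL; end++; break;
    case 'T': factor = 1000000000000ULL; end++; break;
    default: return -1;
    }
  if (*end != '\0')   /* Trailing characters after the suffix.  */
    return -1;

  *count = value * factor;
  return 0;
}
```

Under these assumptions, @code{300k} corresponds to the 300,000 spins
and @code{30G} to the 30 billion spins mentioned above.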
3697
3698
3699
3700@node GOMP_RTEMS_THREAD_POOLS
3701@section @env{GOMP_RTEMS_THREAD_POOLS} -- Set the RTEMS specific thread pools
3702@cindex Environment Variable
3703@cindex Implementation specific setting
3704@table @asis
3705@item @emph{Description}:
3706This environment variable is only used on the RTEMS real-time operating system.
3707It determines the scheduler instance specific thread pools. The format for
3708@env{GOMP_RTEMS_THREAD_POOLS} is a list of optional
3709@code{<thread-pool-count>[$<priority>]@@<scheduler-name>} configurations
3710separated by @code{:} where:
3711@itemize @bullet
3712@item @code{<thread-pool-count>} is the thread pool count for this scheduler
3713instance.
3714@item @code{$<priority>} is an optional priority for the worker threads of a
3715thread pool according to @code{pthread_setschedparam}. In case a priority
15886c03 3716value is omitted, then a worker thread inherits the priority of the OpenMP
d77de738
ML
3717primary thread that created it. The priority of the worker thread is not
3718changed after creation, even if a new OpenMP primary thread using the worker has
3719a different priority.
3720@item @code{@@<scheduler-name>} is the scheduler instance name according to the
3721RTEMS application configuration.
3722@end itemize
3723In case no thread pool configuration is specified for a scheduler instance,
15886c03 3724then each OpenMP primary thread of this scheduler instance uses its own
d77de738
ML
3725dynamically allocated thread pool. To limit the worker thread count of the
3726thread pools, each OpenMP primary thread must call @code{omp_set_num_threads}.
3727@item @emph{Example}:
3728Suppose we have three scheduler instances @code{IO}, @code{WRK0}, and
3729@code{WRK1} with @env{GOMP_RTEMS_THREAD_POOLS} set to
3730@code{"1@@WRK0:3$4@@WRK1"}. Then there are no thread pool restrictions for
3731scheduler instance @code{IO}. In the scheduler instance @code{WRK0} there is
3732one thread pool available. Since no priority is specified for this scheduler
3733instance, the worker thread inherits the priority of the OpenMP primary thread
3734that created it. In the scheduler instance @code{WRK1} there are three thread
3735pools available and their worker threads run at priority four.
3736@end table
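As a rough illustration of the format described above, the following C
sketch splits such a string into its configurations. The
@code{struct pool_config} type and the @code{parse_pool_config} function are
hypothetical names invented for this example; they are not part of libgomp,
whose own parser may behave differently. For simplicity, the sketch treats
an empty configuration as malformed.

```c
#include <stdlib.h>
#include <string.h>

/* One parsed <thread-pool-count>[$<priority>]@<scheduler-name> entry.
   Illustrative only; not a libgomp data structure.  */
struct pool_config
{
  unsigned long count;   /* thread pool count */
  int has_priority;      /* nonzero if $<priority> was given */
  int priority;          /* meaningful only if has_priority is nonzero */
  char scheduler[32];    /* scheduler instance name */
};

/* Parse one configuration starting at *s.  On success, fill in *cfg,
   advance *s past the entry and a following ':', and return 1.
   Return 0 at the end of the string or on a malformed entry.  */
int
parse_pool_config (const char **s, struct pool_config *cfg)
{
  char *end;
  size_t len;

  if (**s == '\0')
    return 0;

  cfg->count = strtoul (*s, &end, 10);
  if (end == *s)
    return 0;                     /* no thread pool count */

  cfg->has_priority = 0;
  cfg->priority = 0;
  if (*end == '$')
    {
      const char *p = end + 1;
      cfg->priority = (int) strtol (p, &end, 10);
      if (end == p)
        return 0;                 /* '$' without a number */
      cfg->has_priority = 1;
    }

  if (*end != '@')
    return 0;                     /* scheduler name is mandatory here */
  end++;

  len = strcspn (end, ":");
  if (len == 0 || len >= sizeof cfg->scheduler)
    return 0;
  memcpy (cfg->scheduler, end, len);
  cfg->scheduler[len] = '\0';

  end += len;
  if (*end == ':')
    end++;                        /* skip the separator */
  *s = end;
  return 1;
}
```

Calling @code{parse_pool_config} repeatedly on the example value
@code{1@@WRK0:3$4@@WRK1} yields one entry for @code{WRK0} without a priority
and one entry for @code{WRK1} with count three and priority four.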



@c ---------------------------------------------------------------------
@c Enabling OpenACC
@c ---------------------------------------------------------------------

@node Enabling OpenACC
@chapter Enabling OpenACC

To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
flag @option{-fopenacc} must be specified. This enables the OpenACC directive
@samp{#pragma acc} in C/C++ and, in Fortran, the @samp{!$acc} sentinel in free
source form and the @samp{c$acc}, @samp{*$acc} and @samp{!$acc} sentinels in
fixed source form. The flag also arranges for automatic linking of the OpenACC
runtime library (@ref{OpenACC Runtime Library Routines}).

See @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.

A complete description of all OpenACC directives accepted may be found in
the @uref{https://www.openacc.org, OpenACC} Application Programming
Interface manual, version 2.6.
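
The following minimal sketch shows the kind of code that
@option{-fopenacc} affects. The function name @code{saxpy} and the file name
in the comment are chosen for this example only. When the flag is absent,
the compiler ignores the directive and the loop runs serially on the host.

```c
/* Build with OpenACC enabled, for example:  gcc -fopenacc saxpy.c
   Without -fopenacc the pragma is ignored and the loop is serial.  */
#include <stddef.h>

/* Compute y[i] = a * x[i] + y[i] for 0 <= i < n.  */
void
saxpy (size_t n, float a, const float *x, float *y)
{
#pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```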



@c ---------------------------------------------------------------------
@c OpenACC Runtime Library Routines
@c ---------------------------------------------------------------------

@node OpenACC Runtime Library Routines
@chapter OpenACC Runtime Library Routines

The runtime routines described here are defined by section 3 of the OpenACC
specifications in version 2.6.
They have C linkage, and do not throw exceptions.
Generally, they are available only for the host, with the exception of
@code{acc_on_device}, which is available for both the host and the
acceleration device.

@menu
* acc_get_num_devices::         Get number of devices for the given device
                                type.
* acc_set_device_type::         Set type of device accelerator to use.
* acc_get_device_type::         Get type of device accelerator to be used.
* acc_set_device_num::          Set device number to use.
* acc_get_device_num::          Get device number to be used.
* acc_get_property::            Get device property.
* acc_async_test::              Tests for completion of a specific asynchronous
                                operation.
* acc_async_test_all::          Tests for completion of all asynchronous
                                operations.
* acc_wait::                    Wait for completion of a specific asynchronous
                                operation.
* acc_wait_all::                Waits for completion of all asynchronous
                                operations.
* acc_wait_all_async::          Wait for completion of all asynchronous
                                operations.
* acc_wait_async::              Wait for completion of asynchronous operations.
* acc_init::                    Initialize runtime for a specific device type.
* acc_shutdown::                Shuts down the runtime for a specific device
                                type.
* acc_on_device::               Whether executing on a particular device.
* acc_malloc::                  Allocate device memory.
* acc_free::                    Free device memory.
* acc_copyin::                  Allocate device memory and copy host memory to
                                it.
* acc_present_or_copyin::       If the data is not present on the device,
                                allocate device memory and copy from host
                                memory.
* acc_create::                  Allocate device memory and map it to host
                                memory.
* acc_present_or_create::       If the data is not present on the device,
                                allocate device memory and map it to host
                                memory.
* acc_copyout::                 Copy device memory to host memory.
* acc_delete::                  Free device memory.
* acc_update_device::           Update device memory from mapped host memory.
* acc_update_self::             Update host memory from mapped device memory.
* acc_map_data::                Map previously allocated device memory to host
                                memory.
* acc_unmap_data::              Unmap device memory from host memory.
* acc_deviceptr::               Get device pointer associated with specific
                                host address.
* acc_hostptr::                 Get host pointer associated with specific
                                device address.
* acc_is_present::              Indicate whether host variable / array is
                                present on device.
* acc_memcpy_to_device::        Copy host memory to device memory.
* acc_memcpy_from_device::      Copy device memory to host memory.
* acc_attach::                  Let device pointer point to device-pointer target.
* acc_detach::                  Let device pointer point to host-pointer target.

API routines for target platforms.

* acc_get_current_cuda_device:: Get CUDA device handle.
* acc_get_current_cuda_context:: Get CUDA context handle.
* acc_get_cuda_stream::         Get CUDA stream handle.
* acc_set_cuda_stream::         Set CUDA stream handle.

API routines for the OpenACC Profiling Interface.

* acc_prof_register::           Register callbacks.
* acc_prof_unregister::         Unregister callbacks.
* acc_prof_lookup::             Obtain inquiry functions.
* acc_register_library::        Library registration.
@end menu



@node acc_get_num_devices
@section @code{acc_get_num_devices} -- Get number of devices for given device type
@table @asis
@item @emph{Description}
This function returns a value indicating the number of devices available
for the device type specified in @var{devicetype}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int acc_get_num_devices(acc_device_t devicetype);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function acc_get_num_devices(devicetype)}
@item @tab @code{integer(kind=acc_device_kind) devicetype}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.1.
@end table
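
A minimal usage sketch follows. The helper name @code{count_accelerators} is
hypothetical, and the @code{_OPENACC} guard is an assumption of this example
so that the same source also builds without @option{-fopenacc}.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Return the number of non-host devices, or 0 when the program was
   built without OpenACC support.  */
int
count_accelerators (void)
{
#ifdef _OPENACC
  return acc_get_num_devices (acc_device_not_host);
#else
  return 0;
#endif
}
```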



@node acc_set_device_type
@section @code{acc_set_device_type} -- Set type of device accelerator to use.
@table @asis
@item @emph{Description}
This function indicates to the runtime library which device type, specified
in @var{devicetype}, to use when executing a parallel or kernels region.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_set_device_type(acc_device_t devicetype);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_set_device_type(devicetype)}
@item @tab @code{integer(kind=acc_device_kind) devicetype}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.2.
@end table



@node acc_get_device_type
@section @code{acc_get_device_type} -- Get type of device accelerator to be used.
@table @asis
@item @emph{Description}
This function returns what device type will be used when executing a
parallel or kernels region.

This function returns @code{acc_device_none} if
@code{acc_get_device_type} is called from
@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
callbacks of the OpenACC Profiling Interface (@ref{OpenACC Profiling
Interface}), that is, if the device is currently being initialized.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{acc_device_t acc_get_device_type(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{function acc_get_device_type()}
@item @tab @code{integer(kind=acc_device_kind) acc_get_device_type}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.3.
@end table



@node acc_set_device_num
@section @code{acc_set_device_num} -- Set device number to use.
@table @asis
@item @emph{Description}
This function indicates to the runtime which device number,
specified by @var{devicenum}, to use for the device type specified by
@var{devicetype}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_set_device_num(int devicenum, acc_device_t devicetype);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_set_device_num(devicenum, devicetype)}
@item @tab @code{integer devicenum}
@item @tab @code{integer(kind=acc_device_kind) devicetype}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.4.
@end table



@node acc_get_device_num
@section @code{acc_get_device_num} -- Get device number to be used.
@table @asis
@item @emph{Description}
This function returns the device number, associated with the specified device
type @var{devicetype}, that will be used when executing a parallel or kernels
region.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int acc_get_device_num(acc_device_t devicetype);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{function acc_get_device_num(devicetype)}
@item @tab @code{integer(kind=acc_device_kind) devicetype}
@item @tab @code{integer acc_get_device_num}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.5.
@end table
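
The device-selection routines are typically combined. The sketch below
picks a device of the current type by index, wrapping around when the index
exceeds the device count. The helper name @code{select_device}, its
wrap-around policy and the @code{_OPENACC} guard are assumptions of this
example.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Select device number (index modulo device count) of the current
   device type.  Return the selected device number, or -1 if no device
   is available, index is negative, or OpenACC is not enabled.  */
int
select_device (int index)
{
#ifdef _OPENACC
  acc_device_t type = acc_get_device_type ();
  int n = acc_get_num_devices (type);

  if (index < 0 || n <= 0)
    return -1;
  acc_set_device_num (index % n, type);
  return acc_get_device_num (type);
#else
  (void) index;
  return -1;
#endif
}
```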



@node acc_get_property
@section @code{acc_get_property} -- Get device property.
@cindex acc_get_property
@cindex acc_get_property_string
@table @asis
@item @emph{Description}
These routines return the value of the specified @var{property} for the
device being queried according to @var{devicenum} and @var{devicetype}.
Integer-valued and string-valued properties are returned by
@code{acc_get_property} and @code{acc_get_property_string} respectively.
The Fortran @code{acc_get_property_string} subroutine returns the string
retrieved in its fourth argument while the remaining entry points are
functions, which pass the return value as their result.

Note for Fortran only: the OpenACC technical committee corrected and, hence,
modified the interface introduced in OpenACC 2.6. The kind-value parameter
@code{acc_device_property} has been renamed to @code{acc_device_property_kind}
for consistency and the return type of the @code{acc_get_property} function is
now a @code{c_size_t} integer instead of a @code{acc_device_property} integer.
The parameter @code{acc_device_property} is still provided,
but might be removed in a future version of GCC.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{size_t acc_get_property(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
@item @emph{Prototype}: @tab @code{const char *acc_get_property_string(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{function acc_get_property(devicenum, devicetype, property)}
@item @emph{Interface}: @tab @code{subroutine acc_get_property_string(devicenum, devicetype, property, string)}
@item @tab @code{use ISO_C_Binding, only: c_size_t}
@item @tab @code{integer devicenum}
@item @tab @code{integer(kind=acc_device_kind) devicetype}
@item @tab @code{integer(kind=acc_device_property_kind) property}
@item @tab @code{integer(kind=c_size_t) acc_get_property}
@item @tab @code{character(*) string}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.6.
@end table
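
As a hedged illustration, the following sketch formats the name and memory
size of the current device into a caller-supplied buffer. The helper name
@code{describe_device} and the output format are invented for this example;
the sketch assumes the property constants @code{acc_property_name} and
@code{acc_property_memory} from @file{openacc.h} and allows for a null string
result, which a device may return for a property it does not support.

```c
#include <stdio.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Write a one-line description of the current device into buf.
   Return the value returned by snprintf.  */
int
describe_device (char *buf, size_t size)
{
#ifdef _OPENACC
  acc_device_t type = acc_get_device_type ();
  int num = acc_get_device_num (type);
  const char *name = acc_get_property_string (num, type, acc_property_name);
  size_t mem = acc_get_property (num, type, acc_property_memory);

  return snprintf (buf, size, "%s: %zu bytes",
                   name ? name : "(unnamed)", mem);
#else
  return snprintf (buf, size, "built without OpenACC");
#endif
}
```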



@node acc_async_test
@section @code{acc_async_test} -- Test for completion of a specific asynchronous operation.
@table @asis
@item @emph{Description}
This function tests for completion of the asynchronous operation specified
in @var{arg}. In C/C++, a non-zero value is returned to indicate
the specified asynchronous operation has completed while Fortran returns
@code{true}. If the asynchronous operation has not completed, C/C++ returns
zero and Fortran returns @code{false}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int acc_async_test(int arg);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{function acc_async_test(arg)}
@item @tab @code{integer(kind=acc_handle_kind) arg}
@item @tab @code{logical acc_async_test}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.9.
@end table



@node acc_async_test_all
@section @code{acc_async_test_all} -- Tests for completion of all asynchronous operations.
@table @asis
@item @emph{Description}
This function tests for completion of all asynchronous operations.
In C/C++, a non-zero value is returned to indicate all asynchronous
operations have completed while Fortran returns @code{true}. If
any asynchronous operation has not completed, C/C++ returns zero and
Fortran returns @code{false}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int acc_async_test_all(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{function acc_async_test_all()}
@item @tab @code{logical acc_async_test_all}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.10.
@end table



@node acc_wait
@section @code{acc_wait} -- Wait for completion of a specific asynchronous operation.
@table @asis
@item @emph{Description}
This function waits for completion of the asynchronous operation
specified in @var{arg}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_wait(int arg);}
@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{void acc_async_wait(int arg);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_wait(arg)}
@item @tab @code{integer(acc_handle_kind) arg}
@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait(arg)}
@item @tab @code{integer(acc_handle_kind) arg}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.11.
@end table
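
The sketch below combines @code{acc_async_test} and @code{acc_wait}: a loop
is launched on an asynchronous queue, polled once, and waited for only if it
has not finished. The helper name @code{scale_async} and the queue number 1
are arbitrary choices for this example, which assumes @var{n} is positive.
Without @option{-fopenacc} the loop simply runs serially.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Multiply each of the n elements of v by factor.  */
void
scale_async (float *v, int n, float factor)
{
  /* Launch the loop on async queue 1.  */
#pragma acc parallel loop copy(v[0:n]) async(1)
  for (int i = 0; i < n; i++)
    v[i] *= factor;

#ifdef _OPENACC
  /* Poll once; block only if queue 1 has not drained yet.  */
  if (!acc_async_test (1))
    acc_wait (1);
#endif
}
```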



@node acc_wait_all
@section @code{acc_wait_all} -- Waits for completion of all asynchronous operations.
@table @asis
@item @emph{Description}
This function waits for the completion of all asynchronous operations.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_wait_all(void);}
@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{void acc_async_wait_all(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_wait_all()}
@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait_all()}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.13.
@end table



@node acc_wait_all_async
@section @code{acc_wait_all_async} -- Wait for completion of all asynchronous operations.
@table @asis
@item @emph{Description}
This function enqueues a wait operation on the queue @var{async} for any
and all asynchronous operations that have been previously enqueued on
any queue.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_wait_all_async(int async);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_wait_all_async(async)}
@item @tab @code{integer(acc_handle_kind) async}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.14.
@end table



@node acc_wait_async
@section @code{acc_wait_async} -- Wait for completion of asynchronous operations.
@table @asis
@item @emph{Description}
This function enqueues a wait operation on queue @var{async} for any and all
asynchronous operations enqueued on queue @var{arg}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_wait_async(int arg, int async);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_wait_async(arg, async)}
@item @tab @code{integer(acc_handle_kind) arg, async}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.12.
@end table
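
One possible use of @code{acc_wait_async} is to order work across two
queues without blocking the host. In the sketch below, the second loop on
queue 2 must not start before the first loop on queue 1 has finished. The
helper name @code{add_then_double} and the queue numbers are assumptions of
this example, which expects @var{n} to be positive.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Compute v[i] = (v[i] + 1) * 2 in two dependent asynchronous steps.  */
void
add_then_double (float *v, int n)
{
#pragma acc data copy(v[0:n])
  {
#pragma acc parallel loop present(v[0:n]) async(1)
    for (int i = 0; i < n; i++)
      v[i] += 1.0f;

#ifdef _OPENACC
    /* Make queue 2 wait for everything enqueued so far on queue 1.  */
    acc_wait_async (1, 2);
#endif

#pragma acc parallel loop present(v[0:n]) async(2)
    for (int i = 0; i < n; i++)
      v[i] *= 2.0f;

    /* Drain all queues before the data region copies v back.  */
#pragma acc wait
  }
}
```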



@node acc_init
@section @code{acc_init} -- Initialize runtime for a specific device type.
@table @asis
@item @emph{Description}
This function initializes the runtime for the device type specified in
@var{devicetype}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_init(acc_device_t devicetype);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_init(devicetype)}
@item @tab @code{integer(acc_device_kind) devicetype}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.7.
@end table



@node acc_shutdown
@section @code{acc_shutdown} -- Shuts down the runtime for a specific device type.
@table @asis
@item @emph{Description}
This function shuts down the runtime for the device type specified in
@var{devicetype}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_shutdown(acc_device_t devicetype);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_shutdown(devicetype)}
@item @tab @code{integer(acc_device_kind) devicetype}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.8.
@end table



@node acc_on_device
@section @code{acc_on_device} -- Whether executing on a particular device
@table @asis
@item @emph{Description}:
This function returns whether the program is executing on a particular
device specified in @var{devicetype}. In C/C++ a non-zero value is
returned to indicate the program is executing on the specified device type.
In Fortran, @code{true} is returned. If the program is not executing
on the specified device type, C/C++ returns zero, while Fortran
returns @code{false}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int acc_on_device(acc_device_t devicetype);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{function acc_on_device(devicetype)}
@item @tab @code{integer(acc_device_kind) devicetype}
@item @tab @code{logical acc_on_device}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.17.
@end table
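
A minimal sketch of a typical use: query, from inside a compute region,
whether that region actually ran on the host. The helper name
@code{region_ran_on_host} is hypothetical, and the @code{_OPENACC} guard is
an assumption so that the example also builds without @option{-fopenacc}, in
which case it trivially reports the host.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Return 1 if a parallel region executes on the host, 0 otherwise.  */
int
region_ran_on_host (void)
{
  int on_host = 1;
#ifdef _OPENACC
#pragma acc parallel copy(on_host)
  {
    /* Evaluated wherever the region executes.  */
    on_host = acc_on_device (acc_device_host) != 0;
  }
#endif
  return on_host;
}
```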



@node acc_malloc
@section @code{acc_malloc} -- Allocate device memory.
@table @asis
@item @emph{Description}
This function allocates @var{len} bytes of device memory. It returns
the device address of the allocated memory.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{d_void* acc_malloc(size_t len);}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.18.
@end table



@node acc_free
@section @code{acc_free} -- Free device memory.
@table @asis
@item @emph{Description}
This function frees previously allocated device memory at the device
address @var{a}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_free(d_void *a);}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.19.
@end table
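
The following sketch pairs @code{acc_malloc} with @code{acc_free} and stages
a buffer through device memory using @code{acc_memcpy_to_device} and
@code{acc_memcpy_from_device}. The helper name @code{copy_via_device} is
invented for this example; without @option{-fopenacc} it falls back to a
plain @code{memcpy}, which is an assumption of the sketch rather than
libgomp behavior.

```c
#include <stddef.h>
#include <string.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Copy len bytes from src to dst, staging them in device memory when
   OpenACC is enabled.  */
void
copy_via_device (void *dst, void *src, size_t len)
{
  if (len == 0)
    return;
#ifdef _OPENACC
  void *d = acc_malloc (len);          /* device address */
  acc_memcpy_to_device (d, src, len);
  acc_memcpy_from_device (dst, d, len);
  acc_free (d);
#else
  memcpy (dst, src, len);
#endif
}
```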



@node acc_copyin
@section @code{acc_copyin} -- Allocate device memory and copy host memory to it.
@table @asis
@item @emph{Description}
In C/C++, this function allocates @var{len} bytes of device memory,
maps it to the host address specified in @var{a}, and copies the host
memory to it. The device address of the newly allocated device memory
is returned.

In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a
variable or array element and @var{len} specifies the length in bytes.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *acc_copyin(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{void *acc_copyin_async(h_void *a, size_t len, int async);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_copyin(a)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @emph{Interface}: @tab @code{subroutine acc_copyin(a, len)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer(acc_handle_kind) :: async}
@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, len, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @tab @code{integer(acc_handle_kind) :: async}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.20.
@end table



@node acc_present_or_copyin
@section @code{acc_present_or_copyin} -- If the data is not present on the device, allocate device memory and copy from host memory.
@table @asis
@item @emph{Description}
This function tests whether the host data specified by @var{a} and of length
@var{len} is present on the device. If it is not present, device memory
is allocated and the host memory is copied to it. The device address of
the newly allocated device memory is returned.

In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.

Note that @code{acc_present_or_copyin} and @code{acc_pcopyin} exist for
backward compatibility with OpenACC 2.0; use @ref{acc_copyin} instead.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *acc_present_or_copyin(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{void *acc_pcopyin(h_void *a, size_t len);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a, len)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a, len)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.20.
@end table



@node acc_create
@section @code{acc_create} -- Allocate device memory and map it to host memory.
@table @asis
@item @emph{Description}
This function allocates device memory and maps it to host memory specified
by the host address @var{a} with a length of @var{len} bytes. In C/C++,
the function returns the device address of the allocated device memory.

In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *acc_create(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{void *acc_create_async(h_void *a, size_t len, int async);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_create(a)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @emph{Interface}: @tab @code{subroutine acc_create(a, len)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer(acc_handle_kind) :: async}
@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, len, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @tab @code{integer(acc_handle_kind) :: async}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.21.
@end table
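
Because @code{acc_create} allocates device memory without transferring the
host contents, it suits buffers that are first written on the device. The
sketch below illustrates this; the helper name @code{fill_on_device} is
hypothetical, and @code{acc_copyout} is used here to bring the result back
to the host. Without @option{-fopenacc} the loop runs on the host.

```c
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Set all n elements of v to value, computing on the device when
   OpenACC is enabled.  */
void
fill_on_device (float *v, size_t n, float value)
{
  if (n == 0)
    return;
#ifdef _OPENACC
  /* Allocate and map device memory; host contents are not copied.  */
  acc_create (v, n * sizeof *v);
#endif

#pragma acc parallel loop present(v[0:n])
  for (size_t i = 0; i < n; i++)
    v[i] = value;

#ifdef _OPENACC
  /* Copy the result back to the host and release the mapping.  */
  acc_copyout (v, n * sizeof *v);
#endif
}
```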



@node acc_present_or_create
@section @code{acc_present_or_create} -- If the data is not present on the device, allocate device memory and map it to host memory.
@table @asis
@item @emph{Description}
This function tests whether the host data specified by @var{a} and of length
@var{len} is present on the device. If it is not present, device memory
is allocated and mapped to host memory. In C/C++, the device address
of the newly allocated device memory is returned.

In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.

Note that @code{acc_present_or_create} and @code{acc_pcreate} exist for
backward compatibility with OpenACC 2.0; use @ref{acc_create} instead.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *acc_present_or_create(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{void *acc_pcreate(h_void *a, size_t len);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a, len)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a, len)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.21.
@end table



@node acc_copyout
@section @code{acc_copyout} -- Copy device memory to host memory.
@table @asis
@item @emph{Description}
In C/C++, this function copies mapped device memory to the host memory
specified by the host address @var{a} with a length of @var{len} bytes.

In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_copyout(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{void acc_copyout_async(h_void *a, size_t len, int async);}
@item @emph{Prototype}: @tab @code{void acc_copyout_finalize(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{void acc_copyout_finalize_async(h_void *a, size_t len, int async);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{subroutine acc_copyout(a)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @emph{Interface}: @tab @code{subroutine acc_copyout(a, len)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer(acc_handle_kind) :: async}
@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, len, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @tab @code{integer(acc_handle_kind) :: async}
@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a, len)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer(acc_handle_kind) :: async}
@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, len, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @tab @code{integer(acc_handle_kind) :: async}
@end multitable

@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.22.
@end table
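
A common pattern pairs @code{acc_copyin} with @code{acc_copyout} around a
compute region that expects the data to be present. The following is only a
sketch of that pattern: the helper name @code{increment_on_device} is
invented, and the @code{_OPENACC} guard is an assumption that lets the same
source run serially without @option{-fopenacc}.

```c
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Add one to each of the n elements of v.  */
void
increment_on_device (int *v, size_t n)
{
  if (n == 0)
    return;
#ifdef _OPENACC
  /* Allocate device memory and copy the host data to it.  */
  acc_copyin (v, n * sizeof *v);
#endif

#pragma acc parallel loop present(v[0:n])
  for (size_t i = 0; i < n; i++)
    v[i] += 1;

#ifdef _OPENACC
  /* Copy the device data back to the host.  */
  acc_copyout (v, n * sizeof *v);
#endif
}
```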
4526
4527
4528
4529@node acc_delete
4530@section @code{acc_delete} -- Free device memory.
4531@table @asis
4532@item @emph{Description}
4533This function frees previously allocated device memory specified by
4534the device address @var{a} and the length of @var{len} bytes.
4535
In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.
4539
4540@item @emph{C/C++}:
4541@multitable @columnfractions .20 .80
4542@item @emph{Prototype}: @tab @code{acc_delete(h_void *a, size_t len);}
4543@item @emph{Prototype}: @tab @code{acc_delete_async(h_void *a, size_t len, int async);}
4544@item @emph{Prototype}: @tab @code{acc_delete_finalize(h_void *a, size_t len);}
4545@item @emph{Prototype}: @tab @code{acc_delete_finalize_async(h_void *a, size_t len, int async);}
4546@end multitable
4547
4548@item @emph{Fortran}:
4549@multitable @columnfractions .20 .80
4550@item @emph{Interface}: @tab @code{subroutine acc_delete(a)}
4551@item @tab @code{type, dimension(:[,:]...) :: a}
4552@item @emph{Interface}: @tab @code{subroutine acc_delete(a, len)}
4553@item @tab @code{type, dimension(:[,:]...) :: a}
4554@item @tab @code{integer len}
4555@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, async)}
4556@item @tab @code{type, dimension(:[,:]...) :: a}
4557@item @tab @code{integer(acc_handle_kind) :: async}
4558@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, len, async)}
4559@item @tab @code{type, dimension(:[,:]...) :: a}
4560@item @tab @code{integer len}
4561@item @tab @code{integer(acc_handle_kind) :: async}
4562@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a)}
4563@item @tab @code{type, dimension(:[,:]...) :: a}
4564@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a, len)}
4565@item @tab @code{type, dimension(:[,:]...) :: a}
4566@item @tab @code{integer len}
@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, async)}
4568@item @tab @code{type, dimension(:[,:]...) :: a}
4569@item @tab @code{integer(acc_handle_kind) :: async}
@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, len, async)}
4571@item @tab @code{type, dimension(:[,:]...) :: a}
4572@item @tab @code{integer len}
4573@item @tab @code{integer(acc_handle_kind) :: async}
4574@end multitable
4575
4576@item @emph{Reference}:
4577@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.23.
4579@end table
4580
4581
4582
4583@node acc_update_device
4584@section @code{acc_update_device} -- Update device memory from mapped host memory.
4585@table @asis
4586@item @emph{Description}
4587This function updates the device copy from the previously mapped host memory.
4588The host memory is specified with the host address @var{a} and a length of
4589@var{len} bytes.
4590
In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.
4594
4595@item @emph{C/C++}:
4596@multitable @columnfractions .20 .80
4597@item @emph{Prototype}: @tab @code{acc_update_device(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{acc_update_device_async(h_void *a, size_t len, int async);}
4599@end multitable
4600
4601@item @emph{Fortran}:
4602@multitable @columnfractions .20 .80
4603@item @emph{Interface}: @tab @code{subroutine acc_update_device(a)}
4604@item @tab @code{type, dimension(:[,:]...) :: a}
4605@item @emph{Interface}: @tab @code{subroutine acc_update_device(a, len)}
4606@item @tab @code{type, dimension(:[,:]...) :: a}
4607@item @tab @code{integer len}
4608@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, async)}
4609@item @tab @code{type, dimension(:[,:]...) :: a}
4610@item @tab @code{integer(acc_handle_kind) :: async}
4611@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, len, async)}
4612@item @tab @code{type, dimension(:[,:]...) :: a}
4613@item @tab @code{integer len}
4614@item @tab @code{integer(acc_handle_kind) :: async}
4615@end multitable
4616
4617@item @emph{Reference}:
4618@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.24.
4620@end table
4621
4622
4623
4624@node acc_update_self
4625@section @code{acc_update_self} -- Update host memory from mapped device memory.
4626@table @asis
4627@item @emph{Description}
4628This function updates the host copy from the previously mapped device memory.
4629The host memory is specified with the host address @var{a} and a length of
4630@var{len} bytes.
4631
In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.
4635
4636@item @emph{C/C++}:
4637@multitable @columnfractions .20 .80
4638@item @emph{Prototype}: @tab @code{acc_update_self(h_void *a, size_t len);}
4639@item @emph{Prototype}: @tab @code{acc_update_self_async(h_void *a, size_t len, int async);}
4640@end multitable
4641
4642@item @emph{Fortran}:
4643@multitable @columnfractions .20 .80
4644@item @emph{Interface}: @tab @code{subroutine acc_update_self(a)}
4645@item @tab @code{type, dimension(:[,:]...) :: a}
4646@item @emph{Interface}: @tab @code{subroutine acc_update_self(a, len)}
4647@item @tab @code{type, dimension(:[,:]...) :: a}
4648@item @tab @code{integer len}
4649@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, async)}
4650@item @tab @code{type, dimension(:[,:]...) :: a}
4651@item @tab @code{integer(acc_handle_kind) :: async}
4652@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, len, async)}
4653@item @tab @code{type, dimension(:[,:]...) :: a}
4654@item @tab @code{integer len}
4655@item @tab @code{integer(acc_handle_kind) :: async}
4656@end multitable
4657
4658@item @emph{Reference}:
4659@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.25.
4661@end table
4662
4663
4664
4665@node acc_map_data
4666@section @code{acc_map_data} -- Map previously allocated device memory to host memory.
4667@table @asis
4668@item @emph{Description}
4669This function maps previously allocated device and host memory. The device
4670memory is specified with the device address @var{d}. The host memory is
4671specified with the host address @var{h} and a length of @var{len}.
4672
4673@item @emph{C/C++}:
4674@multitable @columnfractions .20 .80
4675@item @emph{Prototype}: @tab @code{acc_map_data(h_void *h, d_void *d, size_t len);}
4676@end multitable
4677
4678@item @emph{Reference}:
4679@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.26.
4681@end table
4682
4683
4684
4685@node acc_unmap_data
4686@section @code{acc_unmap_data} -- Unmap device memory from host memory.
4687@table @asis
4688@item @emph{Description}
This function unmaps previously mapped device and host memory. The latter
is specified by @var{h}.
4691
4692@item @emph{C/C++}:
4693@multitable @columnfractions .20 .80
4694@item @emph{Prototype}: @tab @code{acc_unmap_data(h_void *h);}
4695@end multitable
4696
4697@item @emph{Reference}:
4698@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.27.
4700@end table
4701
4702
4703
4704@node acc_deviceptr
4705@section @code{acc_deviceptr} -- Get device pointer associated with specific host address.
4706@table @asis
4707@item @emph{Description}
4708This function returns the device address that has been mapped to the
4709host address specified by @var{h}.
4710
4711@item @emph{C/C++}:
4712@multitable @columnfractions .20 .80
4713@item @emph{Prototype}: @tab @code{void *acc_deviceptr(h_void *h);}
4714@end multitable
4715
4716@item @emph{Reference}:
4717@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.28.
4719@end table
4720
4721
4722
4723@node acc_hostptr
4724@section @code{acc_hostptr} -- Get host pointer associated with specific device address.
4725@table @asis
4726@item @emph{Description}
4727This function returns the host address that has been mapped to the
4728device address specified by @var{d}.
4729
4730@item @emph{C/C++}:
4731@multitable @columnfractions .20 .80
4732@item @emph{Prototype}: @tab @code{void *acc_hostptr(d_void *d);}
4733@end multitable
4734
4735@item @emph{Reference}:
4736@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.29.
4738@end table
4739
4740
4741
4742@node acc_is_present
4743@section @code{acc_is_present} -- Indicate whether host variable / array is present on device.
4744@table @asis
4745@item @emph{Description}
4746This function indicates whether the specified host address in @var{a} and a
4747length of @var{len} bytes is present on the device. In C/C++, a non-zero
4748value is returned to indicate the presence of the mapped memory on the
4749device. A zero is returned to indicate the memory is not mapped on the
4750device.
4751
In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes. If the host
memory is mapped to device memory, then @code{true} is returned. Otherwise,
@code{false} is returned to indicate the mapped memory is not present.
4757
4758@item @emph{C/C++}:
4759@multitable @columnfractions .20 .80
4760@item @emph{Prototype}: @tab @code{int acc_is_present(h_void *a, size_t len);}
4761@end multitable
4762
4763@item @emph{Fortran}:
4764@multitable @columnfractions .20 .80
4765@item @emph{Interface}: @tab @code{function acc_is_present(a)}
4766@item @tab @code{type, dimension(:[,:]...) :: a}
4767@item @tab @code{logical acc_is_present}
4768@item @emph{Interface}: @tab @code{function acc_is_present(a, len)}
4769@item @tab @code{type, dimension(:[,:]...) :: a}
4770@item @tab @code{integer len}
4771@item @tab @code{logical acc_is_present}
4772@end multitable
4773
4774@item @emph{Reference}:
4775@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.30.
4777@end table
4778
4779
4780
4781@node acc_memcpy_to_device
4782@section @code{acc_memcpy_to_device} -- Copy host memory to device memory.
4783@table @asis
4784@item @emph{Description}
This function copies host memory specified by the host address @var{src} to
device memory specified by the device address @var{dest} for a length of
@var{bytes} bytes.
4788
4789@item @emph{C/C++}:
4790@multitable @columnfractions .20 .80
4791@item @emph{Prototype}: @tab @code{acc_memcpy_to_device(d_void *dest, h_void *src, size_t bytes);}
4792@end multitable
4793
4794@item @emph{Reference}:
4795@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.31.
4797@end table
4798
4799
4800
4801@node acc_memcpy_from_device
4802@section @code{acc_memcpy_from_device} -- Copy device memory to host memory.
4803@table @asis
4804@item @emph{Description}
This function copies device memory specified by the device address @var{src}
to host memory specified by the host address @var{dest} for a length of
@var{bytes} bytes.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{acc_memcpy_from_device(h_void *dest, d_void *src, size_t bytes);}
4812@end multitable
4813
4814@item @emph{Reference}:
4815@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.32.
4817@end table
4818
4819
4820
4821@node acc_attach
4822@section @code{acc_attach} -- Let device pointer point to device-pointer target.
4823@table @asis
4824@item @emph{Description}
4825This function updates a pointer on the device from pointing to a host-pointer
4826address to pointing to the corresponding device data.
4827
4828@item @emph{C/C++}:
4829@multitable @columnfractions .20 .80
4830@item @emph{Prototype}: @tab @code{acc_attach(h_void **ptr);}
4831@item @emph{Prototype}: @tab @code{acc_attach_async(h_void **ptr, int async);}
4832@end multitable
4833
4834@item @emph{Reference}:
4835@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.34.
4837@end table
4838
4839
4840
4841@node acc_detach
4842@section @code{acc_detach} -- Let device pointer point to host-pointer target.
4843@table @asis
4844@item @emph{Description}
4845This function updates a pointer on the device from pointing to a device-pointer
4846address to pointing to the corresponding host data.
4847
4848@item @emph{C/C++}:
4849@multitable @columnfractions .20 .80
4850@item @emph{Prototype}: @tab @code{acc_detach(h_void **ptr);}
4851@item @emph{Prototype}: @tab @code{acc_detach_async(h_void **ptr, int async);}
4852@item @emph{Prototype}: @tab @code{acc_detach_finalize(h_void **ptr);}
4853@item @emph{Prototype}: @tab @code{acc_detach_finalize_async(h_void **ptr, int async);}
4854@end multitable
4855
4856@item @emph{Reference}:
4857@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.35.
4859@end table
4860
4861
4862
4863@node acc_get_current_cuda_device
4864@section @code{acc_get_current_cuda_device} -- Get CUDA device handle.
4865@table @asis
4866@item @emph{Description}
4867This function returns the CUDA device handle. This handle is the same
as used by the CUDA Runtime or Driver APIs.
4869
4870@item @emph{C/C++}:
4871@multitable @columnfractions .20 .80
4872@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_device(void);}
4873@end multitable
4874
4875@item @emph{Reference}:
4876@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.1.
4878@end table
4879
4880
4881
4882@node acc_get_current_cuda_context
4883@section @code{acc_get_current_cuda_context} -- Get CUDA context handle.
4884@table @asis
4885@item @emph{Description}
4886This function returns the CUDA context handle. This handle is the same
as used by the CUDA Runtime or Driver APIs.
4888
4889@item @emph{C/C++}:
4890@multitable @columnfractions .20 .80
4891@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_context(void);}
4892@end multitable
4893
4894@item @emph{Reference}:
4895@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.2.
4897@end table
4898
4899
4900
4901@node acc_get_cuda_stream
4902@section @code{acc_get_cuda_stream} -- Get CUDA stream handle.
4903@table @asis
4904@item @emph{Description}
4905This function returns the CUDA stream handle for the queue @var{async}.
This handle is the same as used by the CUDA Runtime or Driver APIs.
4907
4908@item @emph{C/C++}:
4909@multitable @columnfractions .20 .80
4910@item @emph{Prototype}: @tab @code{void *acc_get_cuda_stream(int async);}
4911@end multitable
4912
4913@item @emph{Reference}:
4914@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.3.
4916@end table
4917
4918
4919
4920@node acc_set_cuda_stream
4921@section @code{acc_set_cuda_stream} -- Set CUDA stream handle.
4922@table @asis
4923@item @emph{Description}
4924This function associates the stream handle specified by @var{stream} with
4925the queue @var{async}.
4926
4927This cannot be used to change the stream handle associated with
4928@code{acc_async_sync}.
4929
4930The return value is not specified.
4931
4932@item @emph{C/C++}:
4933@multitable @columnfractions .20 .80
4934@item @emph{Prototype}: @tab @code{int acc_set_cuda_stream(int async, void *stream);}
4935@end multitable
4936
4937@item @emph{Reference}:
4938@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.4.
4940@end table
4941
4942
4943
4944@node acc_prof_register
4945@section @code{acc_prof_register} -- Register callbacks.
4946@table @asis
4947@item @emph{Description}:
4948This function registers callbacks.
4949
4950@item @emph{C/C++}:
4951@multitable @columnfractions .20 .80
4952@item @emph{Prototype}: @tab @code{void acc_prof_register (acc_event_t, acc_prof_callback, acc_register_t);}
4953@end multitable
4954
4955@item @emph{See also}:
4956@ref{OpenACC Profiling Interface}
4957
4958@item @emph{Reference}:
4959@uref{https://www.openacc.org, OpenACC specification v2.6}, section
5.3.
4961@end table
4962
4963
4964
4965@node acc_prof_unregister
4966@section @code{acc_prof_unregister} -- Unregister callbacks.
4967@table @asis
4968@item @emph{Description}:
4969This function unregisters callbacks.
4970
4971@item @emph{C/C++}:
4972@multitable @columnfractions .20 .80
4973@item @emph{Prototype}: @tab @code{void acc_prof_unregister (acc_event_t, acc_prof_callback, acc_register_t);}
4974@end multitable
4975
4976@item @emph{See also}:
4977@ref{OpenACC Profiling Interface}
4978
4979@item @emph{Reference}:
4980@uref{https://www.openacc.org, OpenACC specification v2.6}, section
5.3.
4982@end table
4983
4984
4985
4986@node acc_prof_lookup
4987@section @code{acc_prof_lookup} -- Obtain inquiry functions.
4988@table @asis
4989@item @emph{Description}:
4990Function to obtain inquiry functions.
4991
4992@item @emph{C/C++}:
4993@multitable @columnfractions .20 .80
4994@item @emph{Prototype}: @tab @code{acc_query_fn acc_prof_lookup (const char *);}
4995@end multitable
4996
4997@item @emph{See also}:
4998@ref{OpenACC Profiling Interface}
4999
5000@item @emph{Reference}:
5001@uref{https://www.openacc.org, OpenACC specification v2.6}, section
5.3.
5003@end table
5004
5005
5006
5007@node acc_register_library
5008@section @code{acc_register_library} -- Library registration.
5009@table @asis
5010@item @emph{Description}:
5011Function for library registration.
5012
5013@item @emph{C/C++}:
5014@multitable @columnfractions .20 .80
5015@item @emph{Prototype}: @tab @code{void acc_register_library (acc_prof_reg, acc_prof_reg, acc_prof_lookup_func);}
5016@end multitable
5017
5018@item @emph{See also}:
5019@ref{OpenACC Profiling Interface}, @ref{ACC_PROFLIB}
5020
5021@item @emph{Reference}:
5022@uref{https://www.openacc.org, OpenACC specification v2.6}, section
5.3.
5024@end table
5025
5026
5027
5028@c ---------------------------------------------------------------------
5029@c OpenACC Environment Variables
5030@c ---------------------------------------------------------------------
5031
5032@node OpenACC Environment Variables
5033@chapter OpenACC Environment Variables
5034
5035The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
5036are defined by section 4 of the OpenACC specification in version 2.0.
5037The variable @env{ACC_PROFLIB}
5038is defined by section 4 of the OpenACC specification in version 2.6.
5039
5040@menu
5041* ACC_DEVICE_TYPE::
5042* ACC_DEVICE_NUM::
5043* ACC_PROFLIB::
5044@end menu
5045
5046
5047
5048@node ACC_DEVICE_TYPE
5049@section @code{ACC_DEVICE_TYPE}
5050@table @asis
5051@item @emph{Description}:
5052Control the default device type to use when executing compute regions.
5053If unset, the code can be run on any device type, favoring a non-host
5054device type.
5055
5056Supported values in GCC (if compiled in) are
5057@itemize
5058@item @code{host}
5059@item @code{nvidia}
5060@item @code{radeon}
5061@end itemize
5062@item @emph{Reference}:
5063@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4.1.
5065@end table
5066
5067
5068
5069@node ACC_DEVICE_NUM
5070@section @code{ACC_DEVICE_NUM}
5071@table @asis
5072@item @emph{Description}:
5073Control which device, identified by device number, is the default device.
5074The value must be a nonnegative integer less than the number of devices.
5075If unset, device number zero is used.
5076@item @emph{Reference}:
5077@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4.2.
5079@end table
5080
5081
5082
5083@node ACC_PROFLIB
5084@section @code{ACC_PROFLIB}
5085@table @asis
5086@item @emph{Description}:
5087Semicolon-separated list of dynamic libraries that are loaded as profiling
5088libraries. Each library must provide at least the @code{acc_register_library}
5089routine. Each library file is found as described by the documentation of
5090@code{dlopen} of your operating system.
5091@item @emph{See also}:
5092@ref{acc_register_library}, @ref{OpenACC Profiling Interface}
5093
5094@item @emph{Reference}:
5095@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4.3.
5097@end table
5098
5099
5100
5101@c ---------------------------------------------------------------------
5102@c CUDA Streams Usage
5103@c ---------------------------------------------------------------------
5104
5105@node CUDA Streams Usage
5106@chapter CUDA Streams Usage
5107
5108This applies to the @code{nvptx} plugin only.
5109
5110The library provides elements that perform asynchronous movement of
5111data and asynchronous operation of computing constructs. This
5112asynchronous functionality is implemented by making use of CUDA
5113streams@footnote{See "Stream Management" in "CUDA Driver API",
5114TRM-06703-001, Version 5.5, for additional information}.
5115
5116The primary means by that the asynchronous functionality is accessed
5117is through the use of those OpenACC directives which make use of the
5118@code{async} and @code{wait} clauses. When the @code{async} clause is
5119first used with a directive, it creates a CUDA stream. If an
5120@code{async-argument} is used with the @code{async} clause, then the
5121stream is associated with the specified @code{async-argument}.
5122
5123Following the creation of an association between a CUDA stream and the
5124@code{async-argument} of an @code{async} clause, both the @code{wait}
5125clause and the @code{wait} directive can be used. When either the
5126clause or directive is used after stream creation, it creates a
5127rendezvous point whereby execution waits until all operations
5128associated with the @code{async-argument}, that is, stream, have
5129completed.
5130
5131Normally, the management of the streams that are created as a result of
5132using the @code{async} clause, is done without any intervention by the
5133caller. This implies the association between the @code{async-argument}
15886c03 5134and the CUDA stream is maintained for the lifetime of the program.
d77de738
ML
5135However, this association can be changed through the use of the library
5136function @code{acc_set_cuda_stream}. When the function
5137@code{acc_set_cuda_stream} is called, the CUDA stream that was
originally associated with the @code{async} clause is destroyed.
5139Caution should be taken when changing the association as subsequent
5140references to the @code{async-argument} refer to a different
5141CUDA stream.
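
The association rules described in this chapter can be summarized with a toy
model in C. It does not use CUDA and is not how libgomp is implemented: the
type @code{struct toy_stream} and the functions @code{toy_get_stream} and
@code{toy_set_stream} are hypothetical stand-ins for a CUDA stream, the first
use of an @code{async} clause, and @code{acc_set_cuda_stream}, respectively.

```c
#include <stddef.h>

#define N_QUEUES 4

/* Toy stand-in for a CUDA stream; ALIVE is cleared on destruction.  */
struct toy_stream
{
  int alive;
};

static struct toy_stream storage[N_QUEUES];  /* Streams created on demand.  */
static struct toy_stream *assoc[N_QUEUES];   /* async-argument -> stream.  */

/* Return the stream associated with ASYNC (0 <= ASYNC < N_QUEUES),
   creating it on first use.  */
struct toy_stream *
toy_get_stream (int async)
{
  if (assoc[async] == NULL)
    {
      storage[async].alive = 1;
      assoc[async] = &storage[async];
    }
  return assoc[async];
}

/* Associate STREAM with ASYNC.  The stream that was associated
   before, if any, is marked as destroyed.  */
void
toy_set_stream (int async, struct toy_stream *stream)
{
  if (assoc[async] != NULL && assoc[async] != stream)
    assoc[async]->alive = 0;
  assoc[async] = stream;
}
```

In this model, a handle obtained from @code{toy_get_stream} before a call to
@code{toy_set_stream} must not be used afterwards, which mirrors the caution
given above.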
5142
5143
5144
5145@c ---------------------------------------------------------------------
5146@c OpenACC Library Interoperability
5147@c ---------------------------------------------------------------------
5148
5149@node OpenACC Library Interoperability
5150@chapter OpenACC Library Interoperability
5151
5152@section Introduction
5153
5154The OpenACC library uses the CUDA Driver API, and may interact with
5155programs that use the Runtime library directly, or another library
5156based on the Runtime library, e.g., CUBLAS@footnote{See section 2.26,
5157"Interactions with the CUDA Driver API" in
5158"CUDA Runtime API", Version 5.5, and section 2.27, "VDPAU
5159Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
5160for additional information on library interoperability.}.
5161This chapter describes the use cases and what changes are
5162required in order to use both the OpenACC library and the CUBLAS and Runtime
5163libraries within a program.
5164
5165@section First invocation: NVIDIA CUBLAS library API
5166
5167In this first use case (see below), a function in the CUBLAS library is called
5168prior to any of the functions in the OpenACC library. More specifically, the
5169function @code{cublasCreate()}.
5170
5171When invoked, the function initializes the library and allocates the
5172hardware resources on the host and the device on behalf of the caller. Once
5173the initialization and allocation has completed, a handle is returned to the
5174caller. The OpenACC library also requires initialization and allocation of
5175hardware resources. Since the CUBLAS library has already allocated the
5176hardware resources for the device, all that is left to do is to initialize
5177the OpenACC library and acquire the hardware resources on the host.
5178
5179Prior to calling the OpenACC function that initializes the library and
5180allocate the host hardware resources, you need to acquire the device number
5181that was allocated during the call to @code{cublasCreate()}. The invoking of the
5182runtime library function @code{cudaGetDevice()} accomplishes this. Once
5183acquired, the device number is passed along with the device type as
5184parameters to the OpenACC library function @code{acc_set_device_num()}.
5185
5186Once the call to @code{acc_set_device_num()} has completed, the OpenACC
5187library uses the context that was created during the call to
@code{cublasCreate()}. In other words, both libraries share the
5189same context.
5190
@smallexample
    /* Create the handle */
    s = cublasCreate(&h);
    if (s != CUBLAS_STATUS_SUCCESS)
    @{
        fprintf(stderr, "cublasCreate failed %d\n", s);
        exit(EXIT_FAILURE);
    @}

    /* Get the device number */
    e = cudaGetDevice(&dev);
    if (e != cudaSuccess)
    @{
        fprintf(stderr, "cudaGetDevice failed %d\n", e);
        exit(EXIT_FAILURE);
    @}

    /* Initialize OpenACC library and use device 'dev' */
    acc_set_device_num(dev, acc_device_nvidia);

@end smallexample
5212@center Use Case 1
5213
5214@section First invocation: OpenACC library API
5215
In this second use case (see below), a function in the OpenACC library,
namely @code{acc_set_device_num()}, is called prior to any of the functions
in the CUBLAS library.
5219
5220In the use case presented here, the function @code{acc_set_device_num()}
5221is used to both initialize the OpenACC library and allocate the hardware
5222resources on the host and the device. In the call to the function, the
5223call parameters specify which device to use and what device
5224type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
5225is but one method to initialize the OpenACC library and allocate the
5226appropriate hardware resources. Other methods are available through the
use of environment variables and these are discussed in the next section.
5228
5229Once the call to @code{acc_set_device_num()} has completed, other OpenACC
5230functions can be called as seen with multiple calls being made to
5231@code{acc_copyin()}. In addition, calls can be made to functions in the
5232CUBLAS library. In the use case a call to @code{cublasCreate()} is made
5233subsequent to the calls to @code{acc_copyin()}.
5234As seen in the previous use case, a call to @code{cublasCreate()}
5235initializes the CUBLAS library and allocates the hardware resources on the
5236host and the device. However, since the device has already been allocated,
@code{cublasCreate()} only initializes the CUBLAS library and allocates
5238the appropriate hardware resources on the host. The context that was created
5239as part of the OpenACC initialization is shared with the CUBLAS library,
5240similarly to the first use case.
5241
@smallexample
    dev = 0;

    acc_set_device_num(dev, acc_device_nvidia);

    /* Copy the first set to the device */
    d_X = acc_copyin(&h_X[0], N * sizeof (float));
    if (d_X == NULL)
    @{
        fprintf(stderr, "copyin error h_X\n");
        exit(EXIT_FAILURE);
    @}

    /* Copy the second set to the device */
    d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
    if (d_Y == NULL)
    @{
        fprintf(stderr, "copyin error h_Y1\n");
        exit(EXIT_FAILURE);
    @}

    /* Create the handle */
    s = cublasCreate(&h);
    if (s != CUBLAS_STATUS_SUCCESS)
    @{
        fprintf(stderr, "cublasCreate failed %d\n", s);
        exit(EXIT_FAILURE);
    @}

    /* Perform saxpy using CUBLAS library function */
    s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
    if (s != CUBLAS_STATUS_SUCCESS)
    @{
        fprintf(stderr, "cublasSaxpy failed %d\n", s);
        exit(EXIT_FAILURE);
    @}

    /* Copy the results from the device */
    acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));

@end smallexample
5283@center Use Case 2
5284
5285@section OpenACC library and environment variables
5286
5287There are two environment variables associated with the OpenACC library
5288that may be used to control the device type and device number:
5289@env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}, respectively. These two
5290environment variables can be used as an alternative to calling
5291@code{acc_set_device_num()}. As seen in the second use case, the device
5292type and device number were specified using @code{acc_set_device_num()}.
5293If however, the aforementioned environment variables were set, then the
5294call to @code{acc_set_device_num()} would not be required.
5295
5296
5297The use of the environment variables is only relevant when an OpenACC function
5298is called prior to a call to @code{cudaCreate()}. If @code{cudaCreate()}
5299is called prior to a call to an OpenACC function, then you must call
5300@code{acc_set_device_num()}@footnote{More complete information
5301about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
5302sections 4.1 and 4.2 of the @uref{https://www.openacc.org, OpenACC}
5303Application Programming Interface”, Version 2.6.}
5304
5305
5306
5307@c ---------------------------------------------------------------------
5308@c OpenACC Profiling Interface
5309@c ---------------------------------------------------------------------
5310
5311@node OpenACC Profiling Interface
5312@chapter OpenACC Profiling Interface
5313
5314@section Implementation Status and Implementation-Defined Behavior
5315
5316We're implementing the OpenACC Profiling Interface as defined by the
5317OpenACC 2.6 specification. We're clarifying some aspects here as
5318@emph{implementation-defined behavior}, while they're still under
5319discussion within the OpenACC Technical Committee.
5320
5321This implementation is tuned to keep the performance impact as low as
5322possible for the (very common) case that the Profiling Interface is
5323not enabled. This is relevant, as the Profiling Interface affects all
5324the @emph{hot} code paths (in the target code, not in the offloaded
5325code). Users of the OpenACC Profiling Interface can be expected to
5326understand that performance is impacted to some degree once the
5327Profiling Interface is enabled: for example, because of the
5328@emph{runtime} (libgomp) calling into a third-party @emph{library} for
5329every event that has been registered.
5330
5331We're not yet accounting for the fact that @cite{OpenACC events may
5332occur during event processing}.
5333We just handle one case specially, as required by CUDA 9.0
5334@command{nvprof}, that @code{acc_get_device_type}
(@ref{acc_get_device_type}) may be called from
5336@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
5337callbacks.
5338
5339We're not yet implementing initialization via a
5340@code{acc_register_library} function that is either statically linked
5341in, or dynamically via @env{LD_PRELOAD}.
5342Initialization via @code{acc_register_library} functions dynamically
5343loaded via the @env{ACC_PROFLIB} environment variable does work, as
5344does directly calling @code{acc_prof_register},
5345@code{acc_prof_unregister}, @code{acc_prof_lookup}.
5346
As there are currently no inquiry functions defined, calls to
@code{acc_prof_lookup} always return @code{NULL}.
5349
5350There aren't separate @emph{start}, @emph{stop} events defined for the
5351event types @code{acc_ev_create}, @code{acc_ev_delete},
5352@code{acc_ev_alloc}, @code{acc_ev_free}. It's not clear if these
5353should be triggered before or after the actual device-specific call is
5354made. We trigger them after.
5355
5356Remarks about data provided to callbacks:
5357
5358@table @asis
5359
5360@item @code{acc_prof_info.event_type}
5361It's not clear if for @emph{nested} event callbacks (for example,
5362@code{acc_ev_enqueue_launch_start} as part of a parent compute
5363construct), this should be set for the nested event
5364(@code{acc_ev_enqueue_launch_start}), or if the value of the parent
5365construct should remain (@code{acc_ev_compute_construct_start}). In
5366this implementation, the value generally corresponds to the
5367innermost nested event type.
5368
5369@item @code{acc_prof_info.device_type}
5370@itemize
5371
5372@item
5373For @code{acc_ev_compute_construct_start}, and in presence of an
5374@code{if} clause with @emph{false} argument, this still refers to
5375the offloading device type.
5376It's not clear if that's the expected behavior.
5377
5378@item
5379Complementary to the item before, for
5380@code{acc_ev_compute_construct_end}, this is set to
5381@code{acc_device_host} in presence of an @code{if} clause with
5382@emph{false} argument.
5383It's not clear if that's the expected behavior.
5384
5385@end itemize
5386
5387@item @code{acc_prof_info.thread_id}
5388Always @code{-1}; not yet implemented.
5389
5390@item @code{acc_prof_info.async}
5391@itemize
5392
5393@item
5394Not yet implemented correctly for
5395@code{acc_ev_compute_construct_start}.
5396
5397@item
5398In a compute construct, for host-fallback
5399execution/@code{acc_device_host} it is always
5400@code{acc_async_sync}.
5401It is unclear if that is the expected behavior.
5402
5403@item
5404For @code{acc_ev_device_init_start} and @code{acc_ev_device_init_end},
5405it will always be @code{acc_async_sync}.
5406It is unclear if that is the expected behavior.
5407
5408@end itemize
5409
5410@item @code{acc_prof_info.async_queue}
5411There is no @cite{limited number of asynchronous queues} in libgomp.
5412This always has the same value as @code{acc_prof_info.async}.
5413
5414@item @code{acc_prof_info.src_file}
5415Always @code{NULL}; not yet implemented.
5416
5417@item @code{acc_prof_info.func_name}
5418Always @code{NULL}; not yet implemented.
5419
5420@item @code{acc_prof_info.line_no}
5421Always @code{-1}; not yet implemented.
5422
5423@item @code{acc_prof_info.end_line_no}
5424Always @code{-1}; not yet implemented.
5425
5426@item @code{acc_prof_info.func_line_no}
5427Always @code{-1}; not yet implemented.
5428
5429@item @code{acc_prof_info.func_end_line_no}
5430Always @code{-1}; not yet implemented.
5431
5432@item @code{acc_event_info.event_type}, @code{acc_event_info.*.event_type}
5433Relating to @code{acc_prof_info.event_type} discussed above, in this
5434implementation, this will always be the same value as
5435@code{acc_prof_info.event_type}.
5436
5437@item @code{acc_event_info.*.parent_construct}
5438@itemize
5439
5440@item
5441Will be @code{acc_construct_parallel} for all OpenACC compute
5442constructs as well as many OpenACC Runtime API calls; should be the
5443one matching the actual construct, or
5444@code{acc_construct_runtime_api}, respectively.
5445
5446@item
5447Will be @code{acc_construct_enter_data} or
5448@code{acc_construct_exit_data} when processing variable mappings
5449specified in OpenACC @emph{declare} directives; should be
5450@code{acc_construct_declare}.
5451
5452@item
5453For implicit @code{acc_ev_device_init_start},
5454@code{acc_ev_device_init_end}, and explicit as well as implicit
5455@code{acc_ev_alloc}, @code{acc_ev_free},
5456@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
5457@code{acc_ev_enqueue_download_start}, and
5458@code{acc_ev_enqueue_download_end}, will be
5459@code{acc_construct_parallel}; should reflect the real parent
5460construct.
5461
5462@end itemize
5463
5464@item @code{acc_event_info.*.implicit}
5465For @code{acc_ev_alloc}, @code{acc_ev_free},
5466@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
5467@code{acc_ev_enqueue_download_start}, and
5468@code{acc_ev_enqueue_download_end}, this currently will be @code{1}
5469also for explicit usage.
5470
5471@item @code{acc_event_info.data_event.var_name}
5472Always @code{NULL}; not yet implemented.
5473
5474@item @code{acc_event_info.data_event.host_ptr}
5475For @code{acc_ev_alloc}, and @code{acc_ev_free}, this is always
5476@code{NULL}.
5477
5478@item @code{typedef union acc_api_info}
5479@dots{} as printed in @cite{5.2.3. Third Argument: API-Specific
5480Information}. This should obviously be @code{typedef @emph{struct}
5481acc_api_info}.
5482
5483@item @code{acc_api_info.device_api}
5484Possibly not yet implemented correctly for
5485@code{acc_ev_compute_construct_start},
5486@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}:
5487will always be @code{acc_device_api_none} for these event types.
5488For @code{acc_ev_enter_data_start}, it will be
5489@code{acc_device_api_none} in some cases.
5490
5491@item @code{acc_api_info.device_type}
5492Always the same as @code{acc_prof_info.device_type}.
5493
5494@item @code{acc_api_info.vendor}
5495Always @code{-1}; not yet implemented.
5496
5497@item @code{acc_api_info.device_handle}
5498Always @code{NULL}; not yet implemented.
5499
5500@item @code{acc_api_info.context_handle}
5501Always @code{NULL}; not yet implemented.
5502
5503@item @code{acc_api_info.async_handle}
5504Always @code{NULL}; not yet implemented.
5505
5506@end table
5507
5508Remarks about certain event types:
5509
5510@table @asis
5511
5512@item @code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
5513@itemize
5514
5515@item
5516@c See 'DEVICE_INIT_INSIDE_COMPUTE_CONSTRUCT' in
5517@c 'libgomp.oacc-c-c++-common/acc_prof-kernels-1.c',
5518@c 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.
5519When a compute construct triggers implicit
5520@code{acc_ev_device_init_start} and @code{acc_ev_device_init_end}
5521events, they currently aren't @emph{nested within} the corresponding
5522@code{acc_ev_compute_construct_start} and
5523@code{acc_ev_compute_construct_end}, but they're currently observed
5524@emph{before} @code{acc_ev_compute_construct_start}.
5525It's not clear what to do: the standard asks us to provide a lot of
5526details to the @code{acc_ev_compute_construct_start} callback, without
5527(implicitly) initializing a device before?
5528
5529@item
5530Callbacks for these event types will not be invoked for calls to the
5531@code{acc_set_device_type} and @code{acc_set_device_num} functions.
5532It's not clear if they should be.
5533
5534@end itemize
5535
5536@item @code{acc_ev_enter_data_start}, @code{acc_ev_enter_data_end}, @code{acc_ev_exit_data_start}, @code{acc_ev_exit_data_end}
5537@itemize
5538
5539@item
5540Callbacks for these event types will also be invoked for OpenACC
5541@emph{host_data} constructs.
5542It's not clear if they should be.
5543
5544@item
5545Callbacks for these event types will also be invoked when processing
5546variable mappings specified in OpenACC @emph{declare} directives.
5547It's not clear if they should be.
5548
5549@end itemize
5550
5551@end table
5552
5553Callbacks for the following event types will be invoked, but dispatch
5554and information provided therein has not yet been thoroughly reviewed:
5555
5556@itemize
5557@item @code{acc_ev_alloc}
5558@item @code{acc_ev_free}
5559@item @code{acc_ev_update_start}, @code{acc_ev_update_end}
5560@item @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end}
5561@item @code{acc_ev_enqueue_download_start}, @code{acc_ev_enqueue_download_end}
5562@end itemize
5563
5564During device initialization, and finalization, respectively,
5565callbacks for the following event types will not yet be invoked:
5566
5567@itemize
5568@item @code{acc_ev_alloc}
5569@item @code{acc_ev_free}
5570@end itemize
5571
5572Callbacks for the following event types have not yet been implemented,
5573so currently won't be invoked:
5574
5575@itemize
5576@item @code{acc_ev_device_shutdown_start}, @code{acc_ev_device_shutdown_end}
5577@item @code{acc_ev_runtime_shutdown}
5578@item @code{acc_ev_create}, @code{acc_ev_delete}
5579@item @code{acc_ev_wait_start}, @code{acc_ev_wait_end}
5580@end itemize
5581
5582For the following runtime library functions, not all expected
5583callbacks will be invoked (mostly concerning implicit device
5584initialization):
5585
5586@itemize
5587@item @code{acc_get_num_devices}
5588@item @code{acc_set_device_type}
5589@item @code{acc_get_device_type}
5590@item @code{acc_set_device_num}
5591@item @code{acc_get_device_num}
5592@item @code{acc_init}
5593@item @code{acc_shutdown}
5594@end itemize
5595
5596Aside from implicit device initialization, for the following runtime
5597library functions, no callbacks will be invoked for shared-memory
5598offloading devices (it's not clear if they should be):
5599
5600@itemize
5601@item @code{acc_malloc}
5602@item @code{acc_free}
5603@item @code{acc_copyin}, @code{acc_present_or_copyin}, @code{acc_copyin_async}
5604@item @code{acc_create}, @code{acc_present_or_create}, @code{acc_create_async}
5605@item @code{acc_copyout}, @code{acc_copyout_async}, @code{acc_copyout_finalize}, @code{acc_copyout_finalize_async}
5606@item @code{acc_delete}, @code{acc_delete_async}, @code{acc_delete_finalize}, @code{acc_delete_finalize_async}
5607@item @code{acc_update_device}, @code{acc_update_device_async}
5608@item @code{acc_update_self}, @code{acc_update_self_async}
5609@item @code{acc_map_data}, @code{acc_unmap_data}
5610@item @code{acc_memcpy_to_device}, @code{acc_memcpy_to_device_async}
5611@item @code{acc_memcpy_from_device}, @code{acc_memcpy_from_device_async}
5612@end itemize
5613
5614@c ---------------------------------------------------------------------
5615@c OpenMP-Implementation Specifics
5616@c ---------------------------------------------------------------------
5617
5618@node OpenMP-Implementation Specifics
5619@chapter OpenMP-Implementation Specifics
5620
5621@menu
5622* Implementation-defined ICV Initialization::
5623* OpenMP Context Selectors::
5624* Memory allocation::
5625@end menu
5626
5627@node Implementation-defined ICV Initialization
5628@section Implementation-defined ICV Initialization
5629@cindex Implementation specific setting
5630
5631@multitable @columnfractions .30 .70
5632@item @var{affinity-format-var} @tab See @ref{OMP_AFFINITY_FORMAT}.
5633@item @var{def-allocator-var} @tab See @ref{OMP_ALLOCATOR}.
5634@item @var{max-active-levels-var} @tab See @ref{OMP_MAX_ACTIVE_LEVELS}.
5635@item @var{dyn-var} @tab See @ref{OMP_DYNAMIC}.
5636@item @var{nthreads-var} @tab See @ref{OMP_NUM_THREADS}.
5637@item @var{num-devices-var} @tab Number of non-host devices found
5638by GCC's run-time library
5639@item @var{num-procs-var} @tab The number of CPU cores on the
5640initial device, except that affinity settings might lead to a
5641smaller number. On non-host devices, the value of the
5642@var{nthreads-var} ICV.
5643@item @var{place-partition-var} @tab See @ref{OMP_PLACES}.
5644@item @var{run-sched-var} @tab See @ref{OMP_SCHEDULE}.
5645@item @var{stacksize-var} @tab See @ref{OMP_STACKSIZE}.
5646@item @var{thread-limit-var} @tab See @ref{OMP_TEAMS_THREAD_LIMIT}
5647@item @var{wait-policy-var} @tab See @ref{OMP_WAIT_POLICY} and
5648@ref{GOMP_SPINCOUNT}
5649@end multitable
5650
5651@node OpenMP Context Selectors
5652@section OpenMP Context Selectors
5653
5654@code{vendor} is always @code{gnu}. References are to the GCC manual.
5655
5656@c NOTE: Only the following selectors have been implemented. To add
5657@c additional traits for target architecture, TARGET_OMP_DEVICE_KIND_ARCH_ISA
5658@c has to be implemented; cf. also PR target/105640.
5659@c For offload devices, add *additionally* gcc/config/*/t-omp-device.
5660
5661For the host compiler, @code{kind} always matches @code{host}; for the
5662offloading architectures AMD GCN and Nvidia PTX, @code{kind} always matches
5663@code{gpu}. For the x86 family of computers, AMD GCN and Nvidia PTX
5664the following traits are supported in addition; while OpenMP is supported
5665on more architectures, GCC currently does not match any @code{arch} or
5666@code{isa} traits for those.
5667
5668@multitable @columnfractions .65 .30
5669@headitem @code{arch} @tab @code{isa}
5670@item @code{x86}, @code{x86_64}, @code{i386}, @code{i486},
5671 @code{i586}, @code{i686}, @code{ia32}
5672 @tab See @code{-m...} flags in ``x86 Options'' (without @code{-m})
5673@item @code{amdgcn}, @code{gcn}
5674 @tab See @code{-march=} in ``AMD GCN Options''@footnote{Additionally,
5675 @code{gfx803} is supported as an alias for @code{fiji}.}
5676@item @code{nvptx}
5677 @tab See @code{-march=} in ``Nvidia PTX Options''
5678@end multitable
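
As an illustration of how such selectors are used, the following sketch attaches a GPU-only variant to a function with @code{declare variant}. The function names @code{scale} and @code{scale_gpu} are hypothetical; only the @code{kind(gpu)} selector is taken from the text above. On the host, @code{kind} matches @code{host}, so the base function is the one that runs there (this is also the case when the code is compiled without @option{-fopenmp}, because the pragma is then ignored).

```c
/* Hypothetical variant, selected only where the 'kind' trait of the
   device matches 'gpu' (AMD GCN and Nvidia PTX in GCC).  */
int scale_gpu (int x)
{
  return x * 4;
}

/* Base version.  The host compiler keeps using it, because on the host
   'kind' matches 'host' and the selector below does not apply.  */
#pragma omp declare variant (scale_gpu) match (device = {kind (gpu)})
int scale (int x)
{
  return x * 2;
}
```

A call such as @code{scale (3)} in host code therefore yields 6; inside an offloaded @code{target} region the variant would be eligible instead.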
5679
5680@node Memory allocation
5681@section Memory allocation
5682
5683The description below applies to:
5684
5685@itemize
5686@item Explicit use of the OpenMP API routines, see
5687 @ref{Memory Management Routines}.
5688@item The @code{allocate} clause, except when the @code{allocator} modifier is a
5689 constant expression with value @code{omp_default_mem_alloc} and no
5690 @code{align} modifier has been specified. (In that case, the normal
5691 @code{malloc} allocation is used.)
5692@item Using the @code{allocate} directive for automatic/stack variables, except
5693 when the @code{allocator} clause is a constant expression with value
5694 @code{omp_default_mem_alloc} and no @code{align} clause has been
5695 specified. (In that case, the normal allocation is used: stack allocation
5696 and, sometimes for Fortran, also @code{malloc} [depending on flags such as
5697 @option{-fstack-arrays}].)
5698@item Using the @code{allocate} directive for variables in static memory is
5699 currently not supported (compile time error).
5700@item Using the @code{allocators} directive for Fortran pointers and
5701 allocatables is currently not supported (compile time error).
5702@end itemize
5703
5704For the available predefined allocators and, as applicable, their associated
5705predefined memory spaces and for the available traits and their default values,
5706see @ref{OMP_ALLOCATOR}. Predefined allocators without an associated memory
5707space use the @code{omp_default_mem_space} memory space.
5708
5709For the memory spaces, the following applies:
5710@itemize
5711@item @code{omp_default_mem_space} is supported
5712@item @code{omp_const_mem_space} maps to @code{omp_default_mem_space}
5713@item @code{omp_low_lat_mem_space} maps to @code{omp_default_mem_space}
5714@item @code{omp_large_cap_mem_space} maps to @code{omp_default_mem_space},
5715 unless the memkind library is available
5716@item @code{omp_high_bw_mem_space} maps to @code{omp_default_mem_space},
5717 unless the memkind library is available
5718@end itemize
5719
5720On Linux systems, where the @uref{https://github.com/memkind/memkind, memkind
5721library} (@code{libmemkind.so.0}) is available at runtime, it is used when
5722creating memory allocators requesting
5723
5724@itemize
5725@item the memory space @code{omp_high_bw_mem_space}
5726@item the memory space @code{omp_large_cap_mem_space}
5727@item the @code{partition} trait @code{interleaved}; note that for
5728 @code{omp_large_cap_mem_space} the allocation will not be interleaved
5729@end itemize
5730
5731On Linux systems, where the @uref{https://github.com/numactl/numactl, numa
5732library} (@code{libnuma.so.1}) is available at runtime, it is used when creating
5733memory allocators requesting
5734
5735@itemize
5736@item the @code{partition} trait @code{nearest}, except when both the
5737libmemkind library is available and the memory space is either
5738@code{omp_large_cap_mem_space} or @code{omp_high_bw_mem_space}
5739@end itemize
5740
5741Note that the numa library will round up the allocation size to a multiple of
5742the system page size; therefore, consider using it only with large data or
5743by sharing allocations via the @code{pool_size} trait. Furthermore, the Linux
5744kernel does not guarantee that an allocation will always be on the nearest NUMA
5745node nor that after reallocation the same node will be used. Note additionally
5746that, on Linux, the default setting of the memory placement policy is to use the
5747current node; therefore, unless the memory placement policy has been overridden,
5748the @code{partition} trait @code{environment} (the default) will be effectively
5749a @code{nearest} allocation.
5750
5751Additional notes regarding the traits:
5752@itemize
5753@item The @code{pinned} trait is unsupported.
5754@item The default for the @code{pool_size} trait is no pool and for every
5755 (re)allocation the associated library routine is called, which might
5756 internally use a memory pool.
5757@item For the @code{partition} trait, the partition part size will be the same
5758 as the requested size (i.e. @code{interleaved} or @code{blocked} has no
5759 effect), except for @code{interleaved} when the memkind library is
5760 available. Furthermore, for @code{nearest} and unless the numa library
5761 is available, the memory might not be on the same NUMA node as the thread
5762 that allocated the memory; on Linux, this is in particular the case when
5763 the memory placement policy is set to preferred.
5764@item The @code{access} trait has no effect such that memory is always
5765 accessible by all threads.
5766@item The @code{sync_hint} trait has no effect.
5767@end itemize
5768
5769@c ---------------------------------------------------------------------
5770@c Offload-Target Specifics
5771@c ---------------------------------------------------------------------
5772
5773@node Offload-Target Specifics
5774@chapter Offload-Target Specifics
5775
5776The following sections present notes on the offload-target specifics.
5777
5778@menu
5779* AMD Radeon::
5780* nvptx::
5781@end menu
5782
5783@node AMD Radeon
5784@section AMD Radeon (GCN)
5785
5786On the hardware side, there is the hierarchy (fine to coarse):
5787@itemize
5788@item work item (thread)
5789@item wavefront
5790@item work group
5791@item compute unit (CU)
5792@end itemize
5793
5794All OpenMP and OpenACC levels are used, i.e.
5795@itemize
5796@item OpenMP's simd and OpenACC's vector map to work items (thread)
5797@item OpenMP's threads (``parallel'') and OpenACC's workers map
5798 to wavefronts
5799@item OpenMP's teams and OpenACC's gang use a threadpool with the
5800 size of the number of teams or gangs, respectively.
5801@end itemize
5802
5803The used sizes are
5804@itemize
5805@item Number of teams is the specified @code{num_teams} (OpenMP) or
5806 @code{num_gangs} (OpenACC) or otherwise the number of CU. It is limited
5807 by two times the number of CU.
5808@item Number of wavefronts is 4 for gfx900 and 16 otherwise;
5809 @code{num_threads} (OpenMP) and @code{num_workers} (OpenACC)
5810 override this if smaller.
5811@item The wavefront has 102 scalars and 64 vectors
5812@item Number of workitems is always 64
5813@item The hardware permits maximally 40 workgroups/CU and
5814 16 wavefronts/workgroup up to a limit of 40 wavefronts in total per CU.
5815@item 80 scalar registers and 24 vector registers in non-kernel functions
5816 (the chosen procedure-calling API).
5817@item For the kernel itself: as many as register pressure demands (number of
5818 teams and number of threads, scaled down if registers are exhausted)
5819@end itemize
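
The first two rules in the list above can be read as simple arithmetic. The following is a minimal sketch of that reading, not the code of the GCN plugin; the function names, and the convention that a request of 0 means ``not specified'', are assumptions made for illustration.

```c
#include <stdbool.h>

/* Teams/gangs used: the request if one was given (0 = not specified),
   otherwise one per compute unit; capped at twice the number of CUs.  */
static unsigned gcn_num_teams (unsigned requested, unsigned num_cu)
{
  unsigned teams = requested ? requested : num_cu;
  unsigned limit = 2 * num_cu;
  return teams > limit ? limit : teams;
}

/* Wavefronts used: 4 on gfx900 and 16 elsewhere; a smaller
   num_threads/num_workers request (0 = not specified) overrides this.  */
static unsigned gcn_num_wavefronts (unsigned requested, bool is_gfx900)
{
  unsigned dflt = is_gfx900 ? 4 : 16;
  return (requested != 0 && requested < dflt) ? requested : dflt;
}
```

For a hypothetical device with 60 CUs, a request for 200 teams would be reduced to 120 under this reading, and a request for 32 threads would leave the wavefront count at 16.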
5820
5821Implementation remarks:
5822@itemize
5823@item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
5824 using the C library @code{printf} functions and the Fortran
5825 @code{print}/@code{write} statements.
5826@item Reverse offload regions (i.e. @code{target} regions with
5827 @code{device(ancestor:1)}) are processed serially per @code{target} region
5828 such that the next reverse offload region is only executed after the previous
5829 one returned.
5830@item OpenMP code that has a @code{requires} directive with
5831 @code{unified_shared_memory} will remove any GCN device from the list of
5832 available devices (``host fallback'').
5833@item The available stack size can be changed using the @code{GCN_STACK_SIZE}
5834 environment variable; the default is 32 kiB per thread.
5835@end itemize
5836
5837
5838
5839@node nvptx
5840@section nvptx
5841
5842On the hardware side, there is the hierarchy (fine to coarse):
5843@itemize
5844@item thread
5845@item warp
5846@item thread block
5847@item streaming multiprocessor
5848@end itemize
5849
5850All OpenMP and OpenACC levels are used, i.e.
5851@itemize
5852@item OpenMP's simd and OpenACC's vector map to threads
5853@item OpenMP's threads (``parallel'') and OpenACC's workers map to warps
5854@item OpenMP's teams and OpenACC's gang use a threadpool with the
5855 size of the number of teams or gangs, respectively.
5856@end itemize
5857
5858The used sizes are
5859@itemize
5860@item The @code{warp_size} is always 32
5861@item CUDA kernel launched: @code{dim=@{#teams,1,1@}, blocks=@{#threads,warp_size,1@}}.
5862@item The number of teams is limited by the number of blocks the device can
5863 host simultaneously.
5864@end itemize
5865
5866Additional information can be obtained by setting the environment variable
5867@code{GOMP_DEBUG=1} (very verbose; grep for @code{kernel.*launch} for launch
5868parameters).
5869
5870GCC generates generic PTX ISA code, which is just-in-time compiled by CUDA,
5871which caches the JIT in the user's directory (see CUDA documentation; can be
5872tuned by the environment variables @code{CUDA_CACHE_@{DISABLE,MAXSIZE,PATH@}}).
5873
5874Note: While PTX ISA is generic, the @code{-mptx=} and @code{-march=} command-line
5875options still affect the used PTX ISA code and, thus, the requirements on
5876CUDA version and hardware.
5877
5878Implementation remarks:
5879@itemize
5880@item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
5881 using the C library @code{printf} functions. Note that the Fortran
5882 @code{print}/@code{write} statements are not supported, yet.
5883@item Compiling OpenMP code that contains @code{requires reverse_offload}
5884 requires at least @code{-march=sm_35}; compiling for @code{-march=sm_30}
5885 is not supported.
5886@item For code containing reverse offload (i.e. @code{target} regions with
5887 @code{device(ancestor:1)}), there is a slight performance penalty
5888 for @emph{all} target regions, consisting mostly of shutdown delay.
5889 Per device, reverse offload regions are processed serially such that
5890 the next reverse offload region is only executed after the previous
5891 one returned.
5892@item OpenMP code that has a @code{requires} directive with
5893 @code{unified_shared_memory} will remove any nvptx device from the
5894 list of available devices (``host fallback'').
5895@item The default per-warp stack size is 128 kiB; see also @code{-msoft-stack}
5896 in the GCC manual.
5897@item The OpenMP routines @code{omp_target_memcpy_rect} and
5898 @code{omp_target_memcpy_rect_async} and the @code{target update}
5899 directive for non-contiguous list items will use the 2D and 3D
5900 memory-copy functions of the CUDA library. Higher dimensions will
5901 call those functions in a loop and are therefore supported.
5902@end itemize
5903
5904
5905@c ---------------------------------------------------------------------
5906@c The libgomp ABI
5907@c ---------------------------------------------------------------------
5908
5909@node The libgomp ABI
5910@chapter The libgomp ABI
5911
5912The following sections present notes on the external ABI as
5913presented by libgomp. Only maintainers should need them.
5914
5915@menu
5916* Implementing MASTER construct::
5917* Implementing CRITICAL construct::
5918* Implementing ATOMIC construct::
5919* Implementing FLUSH construct::
5920* Implementing BARRIER construct::
5921* Implementing THREADPRIVATE construct::
5922* Implementing PRIVATE clause::
5923* Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
5924* Implementing REDUCTION clause::
5925* Implementing PARALLEL construct::
5926* Implementing FOR construct::
5927* Implementing ORDERED construct::
5928* Implementing SECTIONS construct::
5929* Implementing SINGLE construct::
5930* Implementing OpenACC's PARALLEL construct::
5931@end menu
5932
5933
5934@node Implementing MASTER construct
5935@section Implementing MASTER construct
5936
5937@smallexample
5938if (omp_get_thread_num () == 0)
5939 block
5940@end smallexample
5941
5942Alternately, we generate two copies of the parallel subfunction
5943and only include this in the version run by the primary thread.
5944Surely this is not worthwhile though...
5945
5946
5947
5948@node Implementing CRITICAL construct
5949@section Implementing CRITICAL construct
5950
5951Without a specified name,
5952
5953@smallexample
5954 void GOMP_critical_start (void);
5955 void GOMP_critical_end (void);
5956@end smallexample
5957
5958so that we don't get COPY relocations from libgomp to the main
5959application.
5960
5961With a specified name, use omp_set_lock and omp_unset_lock with
5962name being transformed into a variable declared like
5963
5964@smallexample
5965 omp_lock_t gomp_critical_user_<name> __attribute__((common))
5966@end smallexample
5967
5968Ideally the ABI would specify that all zero is a valid unlocked
5969state, and so we wouldn't need to initialize this at
5970startup.
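
To illustrate why an all-zero unlocked state is attractive, here is a minimal sketch using C11 atomics. The type @code{zero_lock_t} and its functions are hypothetical stand-ins, not the @code{omp_lock_t} implementation; the point is only that a lock object in static storage is zero-filled and can therefore be used without any startup code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 means unlocked, 1 means locked.  */
typedef struct { atomic_int state; } zero_lock_t;

/* Static storage is zero-initialized, so this lock starts out unlocked
   with no initialization call.  Stand-in for gomp_critical_user_<name>.  */
static zero_lock_t critical_user_demo;

/* Try to take the lock; returns true on success.  */
static bool zero_lock_try (zero_lock_t *l)
{
  int expected = 0;
  return atomic_compare_exchange_strong (&l->state, &expected, 1);
}

/* Release the lock by restoring the all-zero state.  */
static void zero_lock_release (zero_lock_t *l)
{
  atomic_store (&l->state, 0);
}
```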
5971
5972
5973
5974@node Implementing ATOMIC construct
5975@section Implementing ATOMIC construct
5976
5977The target should implement the @code{__sync} builtins.
5978
5979Failing that we could add
5980
5981@smallexample
5982 void GOMP_atomic_enter (void)
5983 void GOMP_atomic_exit (void)
5984@end smallexample
5985
5986which reuses the regular lock code, but with yet another lock
5987object private to the library.
5988
5989
5990
5991@node Implementing FLUSH construct
5992@section Implementing FLUSH construct
5993
5994Expands to the @code{__sync_synchronize} builtin.
5995
5996
5997
5998@node Implementing BARRIER construct
5999@section Implementing BARRIER construct
6000
6001@smallexample
6002 void GOMP_barrier (void)
6003@end smallexample
6004
6005
6006@node Implementing THREADPRIVATE construct
6007@section Implementing THREADPRIVATE construct
6008
6009In _most_ cases we can map this directly to @code{__thread}. Except
6010that OMP allows constructors for C++ objects. We can either
6011refuse to support this (how often is it used?) or we can
6012implement something akin to .ctors.
6013
6014Even more ideally, this ctor feature is handled by extensions
6015to the main pthreads library. Failing that, we can have a set
6016of entry points to register ctor functions to be called.
6017
6018
6019
6020@node Implementing PRIVATE clause
6021@section Implementing PRIVATE clause
6022
6023In association with a PARALLEL, or within the lexical extent
6024of a PARALLEL block, the variable becomes a local variable in
6025the parallel subfunction.
6026
6027In association with FOR or SECTIONS blocks, create a new
6028automatic variable within the current function. This preserves
6029the semantic of new variable creation.
6030
6031
6032
6033@node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
6034@section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
6035
6036This seems simple enough for PARALLEL blocks. Create a private
6037struct for communicating between the parent and subfunction.
6038In the parent, copy in values for scalars and "small" structs;
6039copy in addresses for other TREE_ADDRESSABLE types. In the
6040subfunction, copy the value into the local variable.
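
The scheme for PARALLEL blocks can be sketched in plain C as follows. All names (@code{struct omp_data}, @code{struct big}, @code{subfunction}) are hypothetical, and the subfunction returns a value only so that the effect can be observed; a real outlined function returns @code{void}.

```c
/* A "large" object that is passed by address rather than by value.  */
struct big { int payload[64]; };

/* Hypothetical communication record filled in by the parent.  */
struct omp_data
{
  int x;            /* small scalar: the value itself is copied in */
  struct big *b;    /* larger object: only its address is copied in */
};

/* Hypothetical outlined body of the parallel region.  */
static int subfunction (void *data)
{
  struct omp_data *d = data;
  int x = d->x;            /* private copy initialized from the record */
  struct big b = *d->b;    /* private copy made through the address */

  x += 1;                  /* modifications stay in the private copies */
  b.payload[0] += 1;
  return x + b.payload[0];
}
```

The parent's @code{x} and the original @code{struct big} object are left unchanged by the call, which is the FIRSTPRIVATE behavior the paragraph describes.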
6041
6042It is not clear what to do with bare FOR or SECTIONS blocks.
6043The only thing I can figure is that we do something like:
6044
6045@smallexample
6046#pragma omp for firstprivate(x) lastprivate(y)
6047for (int i = 0; i < n; ++i)
6048 body;
6049@end smallexample
6050
6051which becomes
6052
6053@smallexample
6054@{
6055 int x = x, y;
6056
6057 // for stuff
6058
6059 if (i == n)
6060 y = y;
6061@}
6062@end smallexample
6063
6064where the "x=x" and "y=y" assignments actually have different
6065uids for the two variables, i.e. not something you could write
6066directly in C. Presumably this only makes sense if the "outer"
6067x and y are global variables.
6068
6069COPYPRIVATE would work the same way, except the structure
6070broadcast would have to happen via SINGLE machinery instead.
6071
6072
6073
6074@node Implementing REDUCTION clause
6075@section Implementing REDUCTION clause
6076
6077The private struct mentioned in the previous section should have
6078a pointer to an array of the type of the variable, indexed by the
6079thread's @var{team_id}. The thread stores its final value into the
6080array, and after the barrier, the primary thread iterates over the
6081array to collect the values.
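
The data layout described here can be sketched as follows. This is a sequential simulation under stated assumptions: the ``threads'' are run one after another in a loop, the team size @code{NUM_THREADS} is fixed, and all names are hypothetical. It shows only the per-thread slot indexed by team id and the final combine step by the primary thread.

```c
#define NUM_THREADS 4

/* Hypothetical communication record: one slot per team member.  */
struct reduction_data
{
  long *partial;    /* array indexed by the thread's team id */
};

/* Work of the thread with the given team id: sum its share of 1..n
   (iterations are dealt out round-robin in this sketch).  */
static void worker (struct reduction_data *d, int team_id, int n)
{
  long local = 0;
  for (int i = team_id + 1; i <= n; i += NUM_THREADS)
    local += i;
  d->partial[team_id] = local;    /* store the final private value */
}

/* Sequential stand-in for the parallel region plus the combine step.  */
static long reduce_sum (int n)
{
  long partial[NUM_THREADS] = { 0 };
  struct reduction_data d = { partial };

  for (int t = 0; t < NUM_THREADS; t++)    /* "threads", one by one */
    worker (&d, t, n);

  /* A barrier would go here; afterwards the primary thread combines.  */
  long total = 0;
  for (int t = 0; t < NUM_THREADS; t++)
    total += d.partial[t];
  return total;
}
```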
6082
6083
6084@node Implementing PARALLEL construct
6085@section Implementing PARALLEL construct
6086
6087@smallexample
6088 #pragma omp parallel
6089 @{
6090 body;
6091 @}
6092@end smallexample
6093
6094becomes
6095
6096@smallexample
6097 void subfunction (void *data)
6098 @{
6099 use data;
6100 body;
6101 @}
6102
6103 setup data;
6104 GOMP_parallel_start (subfunction, &data, num_threads);
6105 subfunction (&data);
6106 GOMP_parallel_end ();
6107@end smallexample
6108
6109@smallexample
6110 void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
6111@end smallexample
6112
6113The @var{FN} argument is the subfunction to be run in parallel.
6114
6115The @var{DATA} argument is a pointer to a structure used to
6116communicate data in and out of the subfunction, as discussed
6117above with respect to FIRSTPRIVATE et al.
6118
6119The @var{NUM_THREADS} argument is 1 if an IF clause is present
6120and false, or the value of the NUM_THREADS clause, if
6121present, or 0.
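
The rule for the @var{NUM_THREADS} argument can be written down as a small helper. This is only a sketch of the rule as stated above; the function name and its boolean parameters are hypothetical and do not exist in libgomp.

```c
#include <stdbool.h>

/* Value to pass as the NUM_THREADS argument of GOMP_parallel_start.
   has_if/if_value describe an IF clause, has_num_threads/clause_value
   a NUM_THREADS clause.  */
static unsigned parallel_num_threads_arg (bool has_if, bool if_value,
                                          bool has_num_threads,
                                          unsigned clause_value)
{
  if (has_if && !if_value)
    return 1;               /* IF clause present and false */
  if (has_num_threads)
    return clause_value;    /* value of the NUM_THREADS clause */
  return 0;                 /* neither applies */
}
```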
6122
6123The function needs to create the appropriate number of
6124threads and/or launch them from the dock. It needs to
6125create the team structure and assign team ids.
6126
6127@smallexample
6128 void GOMP_parallel_end (void)
6129@end smallexample
6130
6131Tears down the team and returns us to the previous @code{omp_in_parallel()} state.
6132
6133
6134
6135@node Implementing FOR construct
6136@section Implementing FOR construct
6137
6138@smallexample
6139 #pragma omp parallel for
6140 for (i = lb; i <= ub; i++)
6141 body;
6142@end smallexample
6143
6144becomes
6145
6146@smallexample
6147 void subfunction (void *data)
6148 @{
6149 long _s0, _e0;
6150 while (GOMP_loop_static_next (&_s0, &_e0))
6151 @{
6152 long _e1 = _e0, i;
6153 for (i = _s0; i < _e1; i++)
6154 body;
6155 @}
6156 GOMP_loop_end_nowait ();
6157 @}
6158
6159 GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
6160 subfunction (NULL);
6161 GOMP_parallel_end ();
6162@end smallexample
6163
6164@smallexample
6165 #pragma omp for schedule(runtime)
6166 for (i = 0; i < n; i++)
6167 body;
6168@end smallexample
6169
6170becomes
6171
6172@smallexample
6173 @{
6174 long i, _s0, _e0;
6175 if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
6176 do @{
6177 long _e1 = _e0;
6178 for (i = _s0; i < _e1; i++)
6179 body;
6180 @} while (GOMP_loop_runtime_next (&_s0, &_e0));
6181 GOMP_loop_end ();
6182 @}
6183@end smallexample
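
The start/next protocol used by the expanded loop can be tried out with a single-threaded mock. The functions @code{mock_loop_start} and @code{mock_loop_next} below are hypothetical stand-ins written for this sketch (unit step and a positive chunk size are assumed); they are not the libgomp entry points and ignore everything related to threads and schedules.

```c
#include <stdbool.h>

/* Hypothetical single-threaded stand-in for the work-share state.  */
static long next_iter, end_iter, chunk_size;

/* Plays the role of GOMP_loop_runtime_next: hand out [*s, *e).  */
static bool mock_loop_next (long *s, long *e)
{
  if (next_iter >= end_iter)
    return false;
  *s = next_iter;
  *e = next_iter + chunk_size < end_iter ? next_iter + chunk_size : end_iter;
  next_iter = *e;
  return true;
}

/* Plays the role of GOMP_loop_runtime_start (unit step only).  */
static bool mock_loop_start (long lb, long ub, long chunk, long *s, long *e)
{
  next_iter = lb;
  end_iter = ub;
  chunk_size = chunk;
  return mock_loop_next (s, e);
}

/* The expanded loop, counting the iterations it executes.  */
static long run_loop (long n, long chunk)
{
  long i, _s0, _e0, count = 0;
  if (mock_loop_start (0, n, chunk, &_s0, &_e0))
    do
      {
        long _e1 = _e0;
        for (i = _s0; i < _e1; i++)
          count++;    /* body */
      }
    while (mock_loop_next (&_s0, &_e0));
  return count;
}
```

Every iteration in [0, n) is executed exactly once regardless of the chunk size, and an empty iteration space skips the loop entirely because the start call already returns false.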

Note that while it looks like there is trickiness to propagating
a non-constant STEP, there isn't really. We're explicitly allowed
to evaluate it as many times as we want, and any variables involved
should automatically be handled as PRIVATE or SHARED like any other
variables. So the expression should remain evaluable in the
subfunction. We can also pull it into a local variable if we like,
but since it is supposed to remain unchanged, we do not have to.

If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
able to get away with no work-sharing context at all, since we can
simply perform the arithmetic directly in each thread to divide up
the iterations. That would mean that we would not need to call any
of these routines.
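
The per-thread arithmetic mentioned above could look roughly like the
following. This is a sketch of one common even split, assuming a unit
step and a half-open iteration range; the function name
@code{static_chunk} and its parameters are hypothetical, and the code
GCC actually emits may differ.

```c
#include <assert.h>

/* Compute the half-open range [*s, *e) that thread `tid` out of
   `nthreads` executes for the iterations [lb, ub) with unit step.
   Hypothetical helper for illustration; not a libgomp entry point. */
static void static_chunk(long lb, long ub, int tid, int nthreads,
                         long *s, long *e)
{
  assert(nthreads > 0 && tid >= 0 && tid < nthreads);
  long n = ub > lb ? ub - lb : 0;   /* total iteration count */
  long q = n / nthreads;            /* base share per thread */
  long r = n % nthreads;            /* leftover iterations */
  /* The first r threads take q + 1 iterations, the others take q. */
  long begin = tid * q + (tid < r ? tid : r);
  long len = q + (tid < r ? 1 : 0);
  *s = lb + begin;
  *e = *s + len;
}
```

Each thread only needs its own id and the team size, so no shared
state is touched; for ten iterations and four threads this split gives
the ranges [0,3), [3,6), [6,8) and [8,10).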

There are separate routines for handling loops with an ORDERED
clause. Bookkeeping for that is non-trivial...



@node Implementing ORDERED construct
@section Implementing ORDERED construct

@smallexample
  void GOMP_ordered_start (void)
  void GOMP_ordered_end (void)
@end smallexample



@node Implementing SECTIONS construct
@section Implementing SECTIONS construct

A block like

@smallexample
  #pragma omp sections
  @{
    #pragma omp section
    stmt1;
    #pragma omp section
    stmt2;
    #pragma omp section
    stmt3;
  @}
@end smallexample

becomes

@smallexample
  for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
    switch (i)
      @{
      case 1:
        stmt1;
        break;
      case 2:
        stmt2;
        break;
      case 3:
        stmt3;
        break;
      @}
  GOMP_barrier ();
@end smallexample
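
The dispatch loop above can be exercised with a small stand-in for the
runtime. The sketch below is single-threaded and purely illustrative:
@code{mock_sections_start}, @code{mock_sections_next} and
@code{run_sections} are hypothetical names, not libgomp functions, and
the real runtime has to hand out section numbers safely across threads.

```c
#include <assert.h>

/* Hypothetical single-threaded stand-ins for the section dispenser. */
static unsigned mock_next_section;
static unsigned mock_section_count;

/* Return the next unclaimed section number (1-based), or 0 when done. */
static unsigned mock_sections_next(void)
{
  if (mock_next_section > mock_section_count)
    return 0;
  return mock_next_section++;
}

/* Record how many sections exist and claim the first one. */
static unsigned mock_sections_start(unsigned count)
{
  mock_section_count = count;
  mock_next_section = 1;
  return mock_sections_next();
}

/* Run three "sections"; each sets one bit so the caller can check
   that every section ran. */
static unsigned run_sections(void)
{
  unsigned done = 0, i;
  for (i = mock_sections_start(3); i != 0; i = mock_sections_next())
    switch (i)
      {
      case 1:
        done |= 1u;
        break;
      case 2:
        done |= 2u;
        break;
      case 3:
        done |= 4u;
        break;
      default:
        assert(!"unexpected section number");
      }
  return done;
}
```

Returning 0 as the end marker is what lets the expansion use a plain
@code{for} loop with section numbers starting at 1.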


@node Implementing SINGLE construct
@section Implementing SINGLE construct

A block like

@smallexample
  #pragma omp single
  @{
    body;
  @}
@end smallexample

becomes

@smallexample
  if (GOMP_single_start ())
    body;
  GOMP_barrier ();
@end smallexample

while

@smallexample
  #pragma omp single copyprivate(x)
    body;
@end smallexample

becomes

@smallexample
  datap = GOMP_single_copy_start ();
  if (datap == NULL)
    @{
      body;
      data.x = x;
      GOMP_single_copy_end (&data);
    @}
  else
    x = datap->x;
  GOMP_barrier ();
@end smallexample



@node Implementing OpenACC's PARALLEL construct
@section Implementing OpenACC's PARALLEL construct

@smallexample
  void GOACC_parallel ()
@end smallexample



@c ---------------------------------------------------------------------
@c Reporting Bugs
@c ---------------------------------------------------------------------

@node Reporting Bugs
@chapter Reporting Bugs

Bugs in the GNU Offloading and Multi Processing Runtime Library should
be reported via @uref{https://gcc.gnu.org/bugzilla/, Bugzilla}. Please add
"openacc", or "openmp", or both to the keywords field in the bug
report, as appropriate.



@c ---------------------------------------------------------------------
@c GNU General Public License
@c ---------------------------------------------------------------------

@include gpl_v3.texi



@c ---------------------------------------------------------------------
@c GNU Free Documentation License
@c ---------------------------------------------------------------------

@include fdl.texi



@c ---------------------------------------------------------------------
@c Funding Free Software
@c ---------------------------------------------------------------------

@include funding.texi

@c ---------------------------------------------------------------------
@c Index
@c ---------------------------------------------------------------------

@node Library Index
@unnumbered Library Index

@printindex cp

@bye