1\input texinfo @c -*-texinfo-*-
2
3@c %**start of header
4@setfilename libgomp.info
5@settitle GNU libgomp
6@c %**end of header
7
8
9@copying
4b24d500 10Copyright @copyright{} 2006-2021 Free Software Foundation, Inc.
11
12Permission is granted to copy, distribute and/or modify this document
07a67d6a 13under the terms of the GNU Free Documentation License, Version 1.3 or
3721b9e1 14any later version published by the Free Software Foundation; with the
70b1e376 15Invariant Sections being ``Funding Free Software'', the Front-Cover
16texts being (a) (see below), and with the Back-Cover Texts being (b)
17(see below). A copy of the license is included in the section entitled
18``GNU Free Documentation License''.
19
20(a) The FSF's Front-Cover Text is:
21
22 A GNU Manual
23
24(b) The FSF's Back-Cover Text is:
25
26 You have freedom to copy and modify this GNU Manual, like GNU
27 software. Copies published by the Free Software Foundation raise
28 funds for GNU development.
29@end copying
30
31@ifinfo
32@dircategory GNU Libraries
33@direntry
f1f3453e 34* libgomp: (libgomp). GNU Offloading and Multi Processing Runtime Library.
35@end direntry
36
f1f3453e 37This manual documents libgomp, the GNU Offloading and Multi Processing
38Runtime library. This is the GNU implementation of the OpenMP and
39OpenACC APIs for parallel and accelerator programming in C/C++ and
40Fortran.
41
Published by the Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110-1301 USA
45
46@insertcopying
47@end ifinfo
48
49
50@setchapternewpage odd
51
52@titlepage
f1f3453e 53@title GNU Offloading and Multi Processing Runtime Library
41dbbb37 54@subtitle The GNU OpenMP and OpenACC Implementation
55@page
56@vskip 0pt plus 1filll
57@comment For the @value{version-GCC} Version*
58@sp 1
Published by the Free Software Foundation @*
51 Franklin Street, Fifth Floor@*
Boston, MA 02110-1301, USA@*
62@sp 1
63@insertcopying
64@end titlepage
65
66@summarycontents
67@contents
68@page
69
70
c33fd160 71@node Top, Enabling OpenMP
72@top Introduction
73@cindex Introduction
74
f1f3453e 75This manual documents the usage of libgomp, the GNU Offloading and
41dbbb37 76Multi Processing Runtime Library. This includes the GNU
1a6d1d24 77implementation of the @uref{https://www.openmp.org, OpenMP} Application
78Programming Interface (API) for multi-platform shared-memory parallel
79programming in C/C++ and Fortran, and the GNU implementation of the
9651fbaf 80@uref{https://www.openacc.org, OpenACC} Application Programming
81Interface (API) for offloading of code to accelerator devices in C/C++
82and Fortran.
3721b9e1 83
Originally, libgomp implemented the GNU OpenMP Runtime Library.  Based
on this, support for OpenACC and offloading (both OpenACC and OpenMP
4's target construct) was added later, and the library's name
changed to GNU Offloading and Multi Processing Runtime Library.
89
90
91@comment
92@comment When you add a new menu item, please keep the right hand
93@comment aligned to the same column. Do not use tabs. This provides
94@comment better formatting.
95@comment
96@menu
97* Enabling OpenMP:: How to enable OpenMP for your applications.
cff72ef4 98* OpenMP Implementation Status:: List of implemented features by OpenMP version
99* OpenMP Runtime Library Routines: Runtime Library Routines.
100 The OpenMP runtime application programming
3721b9e1 101 interface.
102* OpenMP Environment Variables: Environment Variables.
103 Influencing OpenMP runtime behavior with
104 environment variables.
105* Enabling OpenACC:: How to enable OpenACC for your
106 applications.
107* OpenACC Runtime Library Routines:: The OpenACC runtime application
108 programming interface.
109* OpenACC Environment Variables:: Influencing OpenACC runtime behavior with
110 environment variables.
111* CUDA Streams Usage:: Notes on the implementation of
112 asynchronous operations.
113* OpenACC Library Interoperability:: OpenACC library interoperability with the
114 NVIDIA CUBLAS library.
5fae049d 115* OpenACC Profiling Interface::
3721b9e1 116* The libgomp ABI:: Notes on the external ABI presented by libgomp.
117* Reporting Bugs:: How to report bugs in the GNU Offloading and
118 Multi Processing Runtime Library.
119* Copying:: GNU general public license says
120 how you can copy and share libgomp.
121* GNU Free Documentation License::
122 How you can copy and share this manual.
123* Funding:: How to help assure continued work for free
124 software.
3d3949df 125* Library Index:: Index of this documentation.
126@end menu
127
128
129@c ---------------------------------------------------------------------
130@c Enabling OpenMP
131@c ---------------------------------------------------------------------
132
133@node Enabling OpenMP
134@chapter Enabling OpenMP
135
To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
flag @command{-fopenmp} must be specified.  This enables the OpenMP directive
@code{#pragma omp} in C/C++ and @code{!$omp} directives in free form,
@code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form,
@code{!$} conditional compilation sentinels in free form and @code{c$},
@code{*$} and @code{!$} sentinels in fixed form, for Fortran.  The flag also
arranges for automatic linking of the OpenMP runtime library
(@ref{Runtime Library Routines}).

A complete description of all OpenMP directives may be found in the
@uref{https://www.openmp.org, OpenMP Application Program Interface} manuals.
See also @ref{OpenMP Implementation Status}.
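
As a minimal sketch of what the flag enables in C, the function below uses
a worksharing loop with a reduction.  The name @code{sum_squares} is
hypothetical and not part of libgomp.  When the file is compiled with
@command{-fopenmp}, the iterations are divided among the threads of a team;
without the flag the pragma is ignored and the loop runs serially, which
should give the same result.

```c
/* Sum of squares 1^2 + ... + n^2, computed with a worksharing loop.
   sum_squares is an illustrative name, not a libgomp routine.  */
long sum_squares(int n)
{
    long total = 0;
    /* Each thread accumulates a private copy of total; the copies are
       combined with + when the loop ends.  */
    #pragma omp parallel for reduction(+:total)
    for (int i = 1; i <= n; i++)
        total += (long)i * i;
    return total;
}
```

A typical build command would be @command{gcc -fopenmp file.c}.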
148
149
150@c ---------------------------------------------------------------------
151@c OpenMP Implementation Status
152@c ---------------------------------------------------------------------
153
154@node OpenMP Implementation Status
155@chapter OpenMP Implementation Status
156
157@menu
158* OpenMP 4.5:: Feature completion status to 4.5 specification
159* OpenMP 5.0:: Feature completion status to 5.0 specification
160* OpenMP 5.1:: Feature completion status to 5.1 specification
161@end menu
162
The @code{_OPENMP} preprocessor macro and Fortran's @code{openmp_version}
parameter, provided by @code{omp_lib.h} and the @code{omp_lib} module, have
the value @code{201511} (i.e. OpenMP 4.5).
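
The macro value is a date code of the form @var{yyyymm}.  The following
sketch shows one way a program might inspect it.  Both function names are
hypothetical, and the table lists only the release dates of the OpenMP
3.0 to 5.1 specifications.

```c
/* Value of _OPENMP seen by this translation unit, or 0 when the
   file was compiled without OpenMP support.  */
long compiled_openmp_date(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

/* Map a yyyymm date code to 10 * major + minor of the specification
   (for example 45 for OpenMP 4.5).  Returns 0 for unknown codes.
   Illustrative helper, not part of libgomp.  */
int openmp_spec_version(long date)
{
    switch (date) {
    case 200805: return 30;
    case 201107: return 31;
    case 201307: return 40;
    case 201511: return 45;
    case 201811: return 50;
    case 202011: return 51;
    default:     return 0;
    }
}
```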
166
167@node OpenMP 4.5
168@section OpenMP 4.5
169
170The OpenMP 4.5 specification is fully supported.
171
172@node OpenMP 5.0
173@section OpenMP 5.0
174
175@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
176@c This list is sorted as in OpenMP 5.1's B.3 not as in OpenMP 5.0's B.2
177
178@multitable @columnfractions .60 .10 .25
179@headitem Description @tab Status @tab Comments
180@item Array shaping @tab N @tab
181@item Array sections with non-unit strides in C and C++ @tab N @tab
182@item Iterators @tab Y @tab
183@item @code{metadirective} directive @tab N @tab
184@item @code{declare variant} directive
185 @tab P @tab Only C and C++, simd traits not handled correctly
186@item @emph{target-offload-var} ICV and @code{OMP_TARGET_OFFLOAD}
187 env variable @tab Y @tab
188@item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab
189@item @code{requires} directive @tab P
190 @tab Only fulfillable requirement is @code{atomic_default_mem_order}
191@item @code{teams} construct outside an enclosing target region @tab Y @tab
192@item Non-rectangular loop nests @tab Y @tab
193@item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
194@item @code{nonmonotonic} as default loop schedule modifier for worksharing-loop
195 constructs @tab Y @tab
196@item Collapse of associated loops that are imperfectly nested loops @tab N @tab
197@item Clauses @code{if}, @code{nontemporal} and @code{order(concurrent)} in
198 @code{simd} construct @tab Y @tab
199@item @code{atomic} constructs in @code{simd} @tab Y @tab
200@item @code{loop} construct @tab Y @tab
201@item @code{order(concurrent)} clause @tab Y @tab
202@item @code{scan} directive and @code{in_scan} modifier for the
203 @code{reduction} clause @tab Y @tab
204@item @code{in_reduction} clause on @code{task} constructs @tab Y @tab
205@item @code{in_reduction} clause on @code{target} constructs @tab P
206 @tab Only C/C++, @code{nowait} only stub
207@item @code{task_reduction} clause with @code{taskgroup} @tab Y @tab
208@item @code{task} modifier to @code{reduction} clause @tab Y @tab
209@item @code{affinity} clause to @code{task} construct @tab Y @tab Stub only
210@item @code{detach} clause to @code{task} construct @tab Y @tab
211@item @code{omp_fulfill_event} runtime routine @tab Y @tab
212@item @code{reduction} and @code{in_reduction} clauses on @code{taskloop}
213 and @code{taskloop simd} constructs @tab Y @tab
214@item @code{taskloop} construct cancelable by @code{cancel} construct
215 @tab Y @tab
@item @code{mutexinoutset} @emph{dependence-type} for @code{depend} clause
      @tab Y @tab
218@item Predefined memory spaces, memory allocators, allocator traits
219 @tab Y @tab Some are only stubs
220@item Memory management routines @tab Y @tab
221@item @code{allocate} directive @tab N @tab
222@item @code{allocate} clause @tab P @tab initial support in C/C++ only
223@item @code{use_device_addr} clause on @code{target data} @tab Y @tab
224@item @code{ancestor} modifier on @code{device} clause
225 @tab P @tab Reverse offload unsupported
226@item Implicit declare target directive @tab Y @tab
227@item Discontiguous array section with @code{target update} construct
228 @tab N @tab
229@item C/C++'s lvalue expressions in @code{to}, @code{from}
230 and @code{map} clauses @tab N @tab
231@item C/C++'s lvalue expressions in @code{depend} clauses @tab Y @tab
232@item Nested @code{declare target} directive @tab Y @tab
233@item Combined @code{master} constructs @tab Y @tab
234@item @code{depend} clause on @code{taskwait} @tab Y @tab
235@item Weak memory ordering clauses on @code{atomic} and @code{flush} construct
236 @tab Y @tab
237@item @code{hint} clause on the @code{atomic} construct @tab Y @tab Stub only
238@item @code{depobj} construct and depend objects @tab Y @tab
239@item Lock hints were renamed to synchronization hints @tab Y @tab
240@item @code{conditional} modifier to @code{lastprivate} clause @tab Y @tab
241@item Map-order clarifications @tab P @tab
242@item @code{close} @emph{map-type-modifier} @tab Y @tab
243@item Mapping C/C++ pointer variables and to assign the address of
244 device memory mapped by an array section @tab P @tab
@item Mapping of Fortran pointer and allocatable variables, including pointer
      and allocatable components of variables
      @tab P @tab Mapping of vars with allocatable components unsupported
248@item @code{defaultmap} extensions @tab Y @tab
249@item @code{declare mapper} directive @tab N @tab
250@item @code{omp_get_supported_active_levels} routine @tab Y @tab
251@item Runtime routines and environment variables to display runtime thread
252 affinity information @tab Y @tab
253@item @code{omp_pause_resource} and @code{omp_pause_resource_all} runtime
254 routines @tab Y @tab
255@item @code{omp_get_device_num} runtime routine @tab Y @tab
256@item OMPT interface @tab N @tab
257@item OMPD interface @tab N @tab
258@end multitable
259
260@unnumberedsubsec Other new OpenMP 5.0 features
261
262@multitable @columnfractions .60 .10 .25
263@headitem Description @tab Status @tab Comments
264@item Supporting C++'s range-based for loop @tab Y @tab
265@end multitable
266
267
268@node OpenMP 5.1
269@section OpenMP 5.1
270
271@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
272
273@multitable @columnfractions .60 .10 .25
274@headitem Description @tab Status @tab Comments
275@item OpenMP directive as C++ attribute specifiers @tab Y @tab
276@item @code{omp_all_memory} reserved locator @tab N @tab
277@item @emph{target_device trait} in OpenMP Context @tab N @tab
278@item @code{target_device} selector set in context selectors @tab N @tab
4a7842bb 279@item C/C++'s @code{declare variant} directive: elision support of
cff72ef4 280 preprocessed code @tab N @tab
4a7842bb 281@item @code{declare variant}: new clauses @code{adjust_args} and
282 @code{append_args} @tab N @tab
283@item @code{dispatch} construct @tab N @tab
@item Device-specific ICV settings in environment variables @tab N @tab
285@item assume directive @tab N @tab
286@item @code{nothing} directive @tab Y @tab
287@item @code{error} directive @tab Y @tab
288@item @code{masked} construct @tab Y @tab
289@item @code{scope} directive @tab Y @tab
290@item Loop transformation constructs @tab N @tab
291@item @code{strict} modifier in the @code{grainsize} and @code{num_tasks}
292 clauses of the taskloop construct @tab Y @tab
293@item @code{align} clause/modifier in @code{allocate} directive/clause
294 and @code{allocator} directive @tab N @tab
295@item @code{thread_limit} clause to @code{target} construct @tab N @tab
296@item @code{has_device_addr} clause to @code{target} construct @tab N @tab
297@item iterators in @code{target update} motion clauses and @code{map}
298 clauses @tab N @tab
299@item indirect calls to the device version of a procedure or function in
300 @code{target} regions @tab N @tab
301@item @code{interop} directive @tab N @tab
302@item @code{omp_interop_t} object support in runtime routines @tab N @tab
303@item @code{nowait} clause in @code{taskwait} directive @tab N @tab
304@item Extensions to the @code{atomic} directive @tab N @tab
305@item @code{seq_cst} clause on a @code{flush} construct @tab Y @tab
306@item @code{inoutset} argument to the @code{depend} clause @tab N @tab
307@item @code{private} and @code{firstprivate} argument to @code{default}
308 clause in C and C++ @tab N @tab
309@item @code{present} argument to @code{defaultmap} clause @tab N @tab
310@item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
311 @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
312 routines @tab N @tab
313@item @code{omp_target_is_accessible} runtime routine @tab N @tab
314@item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
315 runtime routines @tab N @tab
316@item @code{omp_get_mapped_ptr} runtime routine @tab N @tab
317@item @code{omp_calloc}, @code{omp_realloc}, @code{omp_aligned_alloc} and
318 @code{omp_aligned_calloc} runtime routines @tab N @tab
319@item @code{omp_alloctrait_key_t} enum: @code{omp_atv_serialized} added,
320 @code{omp_atv_default} changed @tab Y @tab
321@item @code{omp_display_env} runtime routine @tab P
322 @tab Not inside @code{target} regions
323@item @code{ompt_scope_endpoint_t} enum: @code{ompt_scope_beginend} @tab N @tab
324@item @code{ompt_sync_region_t} enum additions @tab N @tab
325@item @code{ompt_state_t} enum: @code{ompt_state_wait_barrier_implementation}
326 and @code{ompt_state_wait_barrier_teams} @tab N @tab
327@item @code{ompt_callback_target_data_op_emi_t},
328 @code{ompt_callback_target_emi_t}, @code{ompt_callback_target_map_emi_t}
329 and @code{ompt_callback_target_submit_emi_t} @tab N @tab
330@item @code{ompt_callback_error_t} type @tab N @tab
@item @code{OMP_PLACES} syntax extensions @tab N @tab
332@item @code{OMP_NUM_TEAMS} and @code{OMP_TEAMS_THREAD_LIMIT} environment
333 variables @tab N @tab
334@end multitable
335
336@unnumberedsubsec Other new OpenMP 5.1 features
337
338@multitable @columnfractions .60 .10 .25
339@headitem Description @tab Status @tab Comments
@item Support of strictly structured blocks in Fortran @tab N @tab
341@end multitable
342
343
344@c ---------------------------------------------------------------------
4102bda6 345@c OpenMP Runtime Library Routines
346@c ---------------------------------------------------------------------
347
348@node Runtime Library Routines
4102bda6 349@chapter OpenMP Runtime Library Routines
3721b9e1 350
The runtime routines described here are defined by Section 3 of the OpenMP
specification in version 4.5.  The routines are structured in the following
four parts:
3721b9e1 354
72832460 355@menu
356Control threads, processors and the parallel environment. They have C
357linkage, and do not throw exceptions.
f5745bed 358
359* omp_get_active_level:: Number of active parallel regions
360* omp_get_ancestor_thread_num:: Ancestor thread ID
361* omp_get_cancellation:: Whether cancellation support is enabled
362* omp_get_default_device:: Get the default device for target regions
0bac793e 363* omp_get_device_num:: Get device that current thread is running on
5c6ed53a 364* omp_get_dynamic:: Dynamic teams setting
74c9882b 365* omp_get_initial_device:: Device number of host device
5c6ed53a 366* omp_get_level:: Number of parallel regions
445567b2 367* omp_get_max_active_levels:: Current maximum number of active regions
d9a6bd32 368* omp_get_max_task_priority:: Maximum task priority value that can be set
6a2ba183 369* omp_get_max_threads:: Maximum number of threads of parallel region
5c6ed53a 370* omp_get_nested:: Nested parallel regions
83fd6c5b 371* omp_get_num_devices:: Number of target devices
5c6ed53a 372* omp_get_num_procs:: Number of processors online
83fd6c5b 373* omp_get_num_teams:: Number of teams
5c6ed53a 374* omp_get_num_threads:: Size of the active team
* omp_get_proc_bind::            Whether threads may be moved between CPUs
5c6ed53a 376* omp_get_schedule:: Obtain the runtime scheduling method
445567b2 377* omp_get_supported_active_levels:: Maximum number of active regions supported
83fd6c5b 378* omp_get_team_num:: Get team number
5c6ed53a 379* omp_get_team_size:: Number of threads in a team
6a2ba183 380* omp_get_thread_limit:: Maximum number of threads
381* omp_get_thread_num:: Current thread ID
382* omp_in_parallel:: Whether a parallel region is active
20906c66 383* omp_in_final:: Whether in final or included task region
384* omp_is_initial_device:: Whether executing on the host device
385* omp_set_default_device:: Set the default device for target regions
386* omp_set_dynamic:: Enable/disable dynamic teams
387* omp_set_max_active_levels:: Limits the number of active parallel regions
388* omp_set_nested:: Enable/disable nested parallel regions
389* omp_set_num_threads:: Set upper team size limit
390* omp_set_schedule:: Set the runtime scheduling method
391
392Initialize, set, test, unset and destroy simple and nested locks.
393
394* omp_init_lock:: Initialize simple lock
395* omp_set_lock:: Wait for and set simple lock
396* omp_test_lock:: Test and set simple lock if available
397* omp_unset_lock:: Unset simple lock
398* omp_destroy_lock:: Destroy simple lock
399* omp_init_nest_lock:: Initialize nested lock
* omp_set_nest_lock::            Wait for and set nested lock
401* omp_test_nest_lock:: Test and set nested lock if available
402* omp_unset_nest_lock:: Unset nested lock
403* omp_destroy_nest_lock:: Destroy nested lock
404
405Portable, thread-based, wall clock timer.
406
407* omp_get_wtick:: Get timer precision.
408* omp_get_wtime:: Elapsed wall clock time.
409
410Support for event objects.
411
412* omp_fulfill_event:: Fulfill and destroy an OpenMP event.
413@end menu
414
415
416
417@node omp_get_active_level
@section @code{omp_get_active_level} -- Number of active parallel regions
@table @asis
@item @emph{Description}:
This function returns the nesting level of the active parallel regions
that enclose the call.
423
424@item @emph{C/C++}
425@multitable @columnfractions .20 .80
6a2ba183 426@item @emph{Prototype}: @tab @code{int omp_get_active_level(void);}
427@end multitable
428
429@item @emph{Fortran}:
430@multitable @columnfractions .20 .80
acb5c916 431@item @emph{Interface}: @tab @code{integer function omp_get_active_level()}
432@end multitable
433
434@item @emph{See also}:
435@ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
436
437@item @emph{Reference}:
1a6d1d24 438@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.20.
439@end table
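
The difference between the active level and the total nesting level
(@pxref{omp_get_level}) can be illustrated with a region that is forced to
run with a single thread.  The sketch below is only an illustration: the
function name is hypothetical, and the serial fallbacks are an assumption
added so that the file also builds without @command{-fopenmp}.  With a
conforming runtime the serialized region should count as one inactive
level; the fallback build reports zero.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without -fopenmp.  */
static int omp_get_level(void) { return 0; }
static int omp_get_active_level(void) { return 0; }
#endif

/* Return how many enclosing parallel regions are inactive, as seen
   from inside a region that is forced to run with one thread.  */
int inactive_levels_in_serialized_region(void)
{
    int inactive = 0;
    /* if(0) requests a team of one thread, so only one thread
       writes to the shared variable.  */
    #pragma omp parallel if(0) shared(inactive)
    {
        inactive = omp_get_level() - omp_get_active_level();
    }
    return inactive;
}
```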
440
441
442
443@node omp_get_ancestor_thread_num
444@section @code{omp_get_ancestor_thread_num} -- Ancestor thread ID
445@table @asis
446@item @emph{Description}:
This function returns the thread identification number for the given
nesting level of the current thread.  If @var{level} is outside the range
zero to @code{omp_get_level}, -1 is returned; if @var{level} is
@code{omp_get_level}, the result is identical to @code{omp_get_thread_num}.
451
452@item @emph{C/C++}
453@multitable @columnfractions .20 .80
454@item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);}
455@end multitable
456
457@item @emph{Fortran}:
458@multitable @columnfractions .20 .80
acb5c916 459@item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)}
460@item @tab @code{integer level}
461@end multitable
462
463@item @emph{See also}:
464@ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size}
465
466@item @emph{Reference}:
1a6d1d24 467@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.18.
468@end table
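
The rules in the description can be checked from inside a parallel region.
The following sketch uses a hypothetical function name, and its serial
fallbacks are an assumption that mirrors the documented behavior so that
the file also builds without @command{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks that mirror the documented behavior.  */
static int omp_get_level(void) { return 0; }
static int omp_get_thread_num(void) { return 0; }
static int omp_get_ancestor_thread_num(int level)
{
    return level == 0 ? 0 : -1;
}
#endif

/* Return 1 if every thread of a parallel region observes the rules
   described above, 0 otherwise.  */
int ancestor_rules_hold(void)
{
    int ok = 1;
    #pragma omp parallel reduction(&&:ok)
    {
        int level = omp_get_level();
        /* At the current level the ancestor is the thread itself.  */
        ok = ok && omp_get_ancestor_thread_num(level) == omp_get_thread_num();
        /* Levels outside 0..omp_get_level() yield -1.  */
        ok = ok && omp_get_ancestor_thread_num(level + 1) == -1;
        ok = ok && omp_get_ancestor_thread_num(-1) == -1;
    }
    return ok;
}
```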
469
470
471
472@node omp_get_cancellation
473@section @code{omp_get_cancellation} -- Whether cancellation support is enabled
474@table @asis
475@item @emph{Description}:
476This function returns @code{true} if cancellation is activated, @code{false}
477otherwise. Here, @code{true} and @code{false} represent their language-specific
478counterparts. Unless @env{OMP_CANCELLATION} is set true, cancellations are
479deactivated.
480
481@item @emph{C/C++}:
482@multitable @columnfractions .20 .80
483@item @emph{Prototype}: @tab @code{int omp_get_cancellation(void);}
484@end multitable
485
486@item @emph{Fortran}:
487@multitable @columnfractions .20 .80
488@item @emph{Interface}: @tab @code{logical function omp_get_cancellation()}
489@end multitable
490
491@item @emph{See also}:
492@ref{OMP_CANCELLATION}
493
494@item @emph{Reference}:
1a6d1d24 495@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.9.
496@end table
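
The following sketch shows where this query fits into a search loop that
tries to stop early.  The names @code{contains} and
@code{early_exit_possible} are hypothetical.  When cancellation is
deactivated, the @code{cancel} directive has no effect and the loop simply
runs to completion, so the result is the same either way; only the amount
of work differs.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without -fopenmp.  */
static int omp_get_cancellation(void) { return 0; }
#endif

/* Return 1 if key occurs in a[0..n-1], else 0.  */
int contains(const int *a, int n, int key)
{
    int found = 0;
    #pragma omp parallel shared(found)
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            if (a[i] == key) {
                #pragma omp atomic write
                found = 1;
                /* Takes effect only if cancellation is activated.  */
                #pragma omp cancel for
            }
            /* Other threads notice a pending cancellation here.  */
            #pragma omp cancellation point for
        }
    }
    return found;
}

/* Report whether the cancel directive above can take effect.  */
int early_exit_possible(void)
{
    return omp_get_cancellation() != 0;
}
```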
497
498
499
500@node omp_get_default_device
501@section @code{omp_get_default_device} -- Get the default device for target regions
502@table @asis
503@item @emph{Description}:
504Get the default device for target regions without device clause.
505
506@item @emph{C/C++}:
507@multitable @columnfractions .20 .80
508@item @emph{Prototype}: @tab @code{int omp_get_default_device(void);}
509@end multitable
510
511@item @emph{Fortran}:
512@multitable @columnfractions .20 .80
513@item @emph{Interface}: @tab @code{integer function omp_get_default_device()}
514@end multitable
515
516@item @emph{See also}:
517@ref{OMP_DEFAULT_DEVICE}, @ref{omp_set_default_device}
518
519@item @emph{Reference}:
1a6d1d24 520@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.30.
521@end table
522
523
524
525@node omp_get_dynamic
526@section @code{omp_get_dynamic} -- Dynamic teams setting
527@table @asis
528@item @emph{Description}:
529This function returns @code{true} if enabled, @code{false} otherwise.
530Here, @code{true} and @code{false} represent their language-specific
531counterparts.
532
14734fc7 533The dynamic team setting may be initialized at startup by the
534@env{OMP_DYNAMIC} environment variable or at runtime using
535@code{omp_set_dynamic}. If undefined, dynamic adjustment is
536disabled by default.
537
538@item @emph{C/C++}:
539@multitable @columnfractions .20 .80
6a2ba183 540@item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);}
541@end multitable
542
543@item @emph{Fortran}:
544@multitable @columnfractions .20 .80
545@item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
546@end multitable
547
548@item @emph{See also}:
14734fc7 549@ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}
550
551@item @emph{Reference}:
1a6d1d24 552@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.8.
553@end table
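
A common use of this query is to save the current setting before changing
it temporarily.  The sketch below assumes a hypothetical helper name; the
fallback definitions exist only so that the file also builds without
@command{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without -fopenmp.  */
static int dynamic_flag = 0;
static void omp_set_dynamic(int enable) { dynamic_flag = (enable != 0); }
static int omp_get_dynamic(void) { return dynamic_flag; }
#endif

/* Run work() with dynamic team-size adjustment disabled, then restore
   the previous setting.  Returns the setting seen while work() ran.
   work may be a null pointer, in which case nothing is called.  */
int call_with_dynamic_disabled(void (*work)(void))
{
    int saved = omp_get_dynamic();
    omp_set_dynamic(0);
    int during = omp_get_dynamic();
    if (work)
        work();
    omp_set_dynamic(saved);
    return during;
}
```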
554
555
556
557@node omp_get_initial_device
558@section @code{omp_get_initial_device} -- Return device number of initial device
559@table @asis
560@item @emph{Description}:
561This function returns a device number that represents the host device.
562For OpenMP 5.1, this must be equal to the value returned by the
563@code{omp_get_num_devices} function.
564
565@item @emph{C/C++}
566@multitable @columnfractions .20 .80
567@item @emph{Prototype}: @tab @code{int omp_get_initial_device(void);}
568@end multitable
569
570@item @emph{Fortran}:
571@multitable @columnfractions .20 .80
572@item @emph{Interface}: @tab @code{integer function omp_get_initial_device()}
573@end multitable
574
575@item @emph{See also}:
576@ref{omp_get_num_devices}
577
578@item @emph{Reference}:
579@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.35.
580@end table
581
582
583
584@node omp_get_device_num
585@section @code{omp_get_device_num} -- Return device number of current device
586@table @asis
587@item @emph{Description}:
588This function returns a device number that represents the device that the
589current thread is executing on. For OpenMP 5.0, this must be equal to the
590value returned by the @code{omp_get_initial_device} function when called
591from the host.
592
593@item @emph{C/C++}
594@multitable @columnfractions .20 .80
595@item @emph{Prototype}: @tab @code{int omp_get_device_num(void);}
596@end multitable
597
598@item @emph{Fortran}:
599@multitable @columnfractions .20 .80
600@item @emph{Interface}: @tab @code{integer function omp_get_device_num()}
601@end multitable
602
603@item @emph{See also}:
604@ref{omp_get_initial_device}
605
606@item @emph{Reference}:
607@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.37.
608@end table
609
610
611
612@node omp_get_level
613@section @code{omp_get_level} -- Obtain the current nesting level
614@table @asis
615@item @emph{Description}:
This function returns the nesting level of the parallel regions,
both active and inactive, that enclose the call.
618
619@item @emph{C/C++}
620@multitable @columnfractions .20 .80
6a2ba183 621@item @emph{Prototype}: @tab @code{int omp_get_level(void);}
622@end multitable
623
624@item @emph{Fortran}:
625@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_level()}
627@end multitable
628
629@item @emph{See also}:
630@ref{omp_get_active_level}
631
632@item @emph{Reference}:
1a6d1d24 633@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.17.
634@end table
635
636
637
638@node omp_get_max_active_levels
445567b2 639@section @code{omp_get_max_active_levels} -- Current maximum number of active regions
640@table @asis
641@item @emph{Description}:
6a2ba183 642This function obtains the maximum allowed number of nested, active parallel regions.
643
644@item @emph{C/C++}
645@multitable @columnfractions .20 .80
6a2ba183 646@item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);}
647@end multitable
648
649@item @emph{Fortran}:
650@multitable @columnfractions .20 .80
acb5c916 651@item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()}
652@end multitable
653
654@item @emph{See also}:
655@ref{omp_set_max_active_levels}, @ref{omp_get_active_level}
656
657@item @emph{Reference}:
1a6d1d24 658@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.16.
659@end table
660
661
662@node omp_get_max_task_priority
@section @code{omp_get_max_task_priority} -- Maximum priority value that can be set for tasks
665@table @asis
666@item @emph{Description}:
667This function obtains the maximum allowed priority number for tasks.
668
669@item @emph{C/C++}
670@multitable @columnfractions .20 .80
671@item @emph{Prototype}: @tab @code{int omp_get_max_task_priority(void);}
672@end multitable
673
674@item @emph{Fortran}:
675@multitable @columnfractions .20 .80
676@item @emph{Interface}: @tab @code{integer function omp_get_max_task_priority()}
677@end multitable
678
679@item @emph{Reference}:
1a6d1d24 680@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
681@end table
682
683
684@node omp_get_max_threads
6a2ba183 685@section @code{omp_get_max_threads} -- Maximum number of threads of parallel region
686@table @asis
687@item @emph{Description}:
Returns an upper bound on the number of threads that would be used to form
the team of a parallel region that does not use the @code{num_threads} clause.
690
691@item @emph{C/C++}:
692@multitable @columnfractions .20 .80
6a2ba183 693@item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);}
694@end multitable
695
696@item @emph{Fortran}:
697@multitable @columnfractions .20 .80
698@item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
699@end multitable
700
701@item @emph{See also}:
5c6ed53a 702@ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit}
703
704@item @emph{Reference}:
1a6d1d24 705@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.3.
706@end table
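
A frequent use of this routine is to size per-thread scratch storage
before entering a parallel region.  The sketch below uses a hypothetical
function name and assumes that no @code{num_threads} clause is present and
that the thread limit is not changed between the query and the region.
The serial fallbacks are only there so that the file also builds without
@command{-fopenmp}.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without -fopenmp.  */
static int omp_get_max_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Sum 1..n using one partial-sum slot per possible thread.
   Returns -1 if the scratch array cannot be allocated.  */
long sum_with_partials(int n)
{
    int slots = omp_get_max_threads();
    long *partial = calloc((size_t)slots, sizeof *partial);
    if (!partial)
        return -1;

    #pragma omp parallel
    {
        int me = omp_get_thread_num();   /* 0 <= me < slots */
        #pragma omp for
        for (int i = 1; i <= n; i++)
            partial[me] += i;
    }

    /* Combine the per-thread results serially.  */
    long total = 0;
    for (int t = 0; t < slots; t++)
        total += partial[t];
    free(partial);
    return total;
}
```

A @code{reduction} clause would usually be simpler for a plain sum; the
explicit array is shown only to motivate the query.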
707
708
709
710@node omp_get_nested
711@section @code{omp_get_nested} -- Nested parallel regions
712@table @asis
713@item @emph{Description}:
714This function returns @code{true} if nested parallel regions are
83fd6c5b 715enabled, @code{false} otherwise. Here, @code{true} and @code{false}
716represent their language-specific counterparts.
717
The state of nested parallel regions at startup depends on several
environment variables.  If @env{OMP_MAX_ACTIVE_LEVELS} is defined
and is set to greater than one, then nested parallel regions are
enabled.  If it is not defined, then the value of the @env{OMP_NESTED}
environment variable is followed if defined.  If neither is
defined, then if either @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND}
is defined with a list of more than one value, then nested parallel
regions are enabled.  If none of these are defined, then nested parallel
regions are disabled by default.

Nested parallel regions can be enabled or disabled at runtime using
@code{omp_set_nested}, or by setting the maximum number of nested
regions with @code{omp_set_max_active_levels} to one to disable, or
above one to enable.

733@item @emph{C/C++}:
734@multitable @columnfractions .20 .80
6a2ba183 735@item @emph{Prototype}: @tab @code{int omp_get_nested(void);}
736@end multitable
737
738@item @emph{Fortran}:
739@multitable @columnfractions .20 .80
87350d4a 740@item @emph{Interface}: @tab @code{logical function omp_get_nested()}
741@end multitable
742
743@item @emph{See also}:
744@ref{omp_set_max_active_levels}, @ref{omp_set_nested},
745@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}
746
747@item @emph{Reference}:
1a6d1d24 748@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.11.
749@end table
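
The startup rules in the description can be summarized as a small decision
function.  This is a sketch of the rules as worded above, not the code
libgomp uses.  The function name and the convention that a negative
argument means ``not defined'' are hypothetical.  It also assumes that a
defined @env{OMP_MAX_ACTIVE_LEVELS} of one or less disables nesting
regardless of the other variables, which the text implies but does not
state explicitly.

```c
/* Each argument describes one environment variable; a negative value
   means "not defined" (a convention of this sketch only).
   max_active_levels: numeric value of OMP_MAX_ACTIVE_LEVELS
   nested:            OMP_NESTED as 0 (false) or 1 (true)
   num_threads_len:   number of list entries in OMP_NUM_THREADS
   proc_bind_len:     number of list entries in OMP_PROC_BIND  */
int nested_enabled_at_startup(int max_active_levels, int nested,
                              int num_threads_len, int proc_bind_len)
{
    /* OMP_MAX_ACTIVE_LEVELS takes precedence when defined.  */
    if (max_active_levels >= 0)
        return max_active_levels > 1;
    /* Otherwise follow OMP_NESTED when defined.  */
    if (nested >= 0)
        return nested != 0;
    /* Otherwise a multi-entry list in either variable enables nesting.  */
    return num_threads_len > 1 || proc_bind_len > 1;
}
```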
750
751
752
753@node omp_get_num_devices
754@section @code{omp_get_num_devices} -- Number of target devices
755@table @asis
756@item @emph{Description}:
757Returns the number of target devices.
758
759@item @emph{C/C++}:
760@multitable @columnfractions .20 .80
761@item @emph{Prototype}: @tab @code{int omp_get_num_devices(void);}
762@end multitable
763
764@item @emph{Fortran}:
765@multitable @columnfractions .20 .80
766@item @emph{Interface}: @tab @code{integer function omp_get_num_devices()}
767@end multitable
768
769@item @emph{Reference}:
1a6d1d24 770@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.31.
3721b9e1
DF
771@end table
772
773
774
775@node omp_get_num_procs
776@section @code{omp_get_num_procs} -- Number of processors online
777@table @asis
778@item @emph{Description}:
83fd6c5b 779Returns the number of processors online on that device.
3721b9e1
DF
780
781@item @emph{C/C++}:
782@multitable @columnfractions .20 .80
6a2ba183 783@item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);}
3721b9e1
DF
784@end multitable
785
786@item @emph{Fortran}:
787@multitable @columnfractions .20 .80
788@item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
789@end multitable
790
791@item @emph{Reference}:
1a6d1d24 792@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.5.
83fd6c5b
TB
793@end table
794
795
796
797@node omp_get_num_teams
798@section @code{omp_get_num_teams} -- Number of teams
799@table @asis
800@item @emph{Description}:
801Returns the number of teams in the current team region.
802
803@item @emph{C/C++}:
804@multitable @columnfractions .20 .80
805@item @emph{Prototype}: @tab @code{int omp_get_num_teams(void);}
806@end multitable
807
808@item @emph{Fortran}:
809@multitable @columnfractions .20 .80
810@item @emph{Interface}: @tab @code{integer function omp_get_num_teams()}
811@end multitable
812
813@item @emph{Reference}:
1a6d1d24 814@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.32.
3721b9e1
DF
815@end table
816
817
818
819@node omp_get_num_threads
820@section @code{omp_get_num_threads} -- Size of the active team
821@table @asis
822@item @emph{Description}:
Returns the number of threads in the current team. In a sequential section of
the program @code{omp_get_num_threads} returns 1.

The default team size may be initialized at startup by the
@env{OMP_NUM_THREADS} environment variable. At runtime, the size
of the current team may be set either by the @code{NUM_THREADS}
clause or by @code{omp_set_num_threads}. If none of the above were
used to define a specific value and @env{OMP_DYNAMIC} is disabled,
one thread per CPU online is used.
832
3721b9e1
DF
833@item @emph{C/C++}:
834@multitable @columnfractions .20 .80
6a2ba183 835@item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);}
3721b9e1
DF
836@end multitable
837
838@item @emph{Fortran}:
839@multitable @columnfractions .20 .80
840@item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
841@end multitable
842
843@item @emph{See also}:
844@ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}
845
846@item @emph{Reference}:
1a6d1d24 847@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.2.
83fd6c5b
TB
848@end table
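
A minimal sketch contrasting the value of @code{omp_get_num_threads} inside and outside a parallel region. The two helper names are hypothetical. The @code{_OPENMP} guard is an assumption made so that the sketch also builds without @option{-fopenmp}, in which case the program is sequential.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: team size as seen outside any parallel region.  */
int team_size_sequential(void)
{
#ifdef _OPENMP
  return omp_get_num_threads();
#else
  return 1;  /* a build without OpenMP is always sequential */
#endif
}

/* Hypothetical helper: team size as seen inside a parallel region.  */
int team_size_in_parallel(void)
{
  int size = 1;
#ifdef _OPENMP
  #pragma omp parallel
  {
    /* Every thread would read the same value; one thread stores it.  */
    #pragma omp single
    size = omp_get_num_threads();
  }
#endif
  return size;
}
```

The first helper always yields 1. The second yields whatever team size the runtime chose, for example the value of @env{OMP_NUM_THREADS}.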
849
850
851
852@node omp_get_proc_bind
@section @code{omp_get_proc_bind} -- Whether threads may be moved between CPUs
854@table @asis
855@item @emph{Description}:
This function returns the currently active thread affinity policy, which is
set via @env{OMP_PROC_BIND}. Possible values are @code{omp_proc_bind_false},
@code{omp_proc_bind_true}, @code{omp_proc_bind_primary},
@code{omp_proc_bind_master}, @code{omp_proc_bind_close} and @code{omp_proc_bind_spread},
where @code{omp_proc_bind_master} is an alias for @code{omp_proc_bind_primary}.
861
862@item @emph{C/C++}:
863@multitable @columnfractions .20 .80
864@item @emph{Prototype}: @tab @code{omp_proc_bind_t omp_get_proc_bind(void);}
865@end multitable
866
867@item @emph{Fortran}:
868@multitable @columnfractions .20 .80
869@item @emph{Interface}: @tab @code{integer(kind=omp_proc_bind_kind) function omp_get_proc_bind()}
870@end multitable
871
872@item @emph{See also}:
@ref{OMP_PROC_BIND}, @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY}
874
875@item @emph{Reference}:
1a6d1d24 876@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.22.
5c6ed53a
TB
877@end table
878
879
880
881@node omp_get_schedule
882@section @code{omp_get_schedule} -- Obtain the runtime scheduling method
883@table @asis
884@item @emph{Description}:
Obtain the runtime scheduling method. The @var{kind} argument will be
set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
@code{omp_sched_guided} or @code{omp_sched_auto}. The second argument,
@var{chunk_size}, is set to the chunk size.
889
@item @emph{C/C++}:
891@multitable @columnfractions .20 .80
d9a6bd32 892@item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *chunk_size);}
5c6ed53a
TB
893@end multitable
894
895@item @emph{Fortran}:
896@multitable @columnfractions .20 .80
d9a6bd32 897@item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, chunk_size)}
5c6ed53a 898@item @tab @code{integer(kind=omp_sched_kind) kind}
d9a6bd32 899@item @tab @code{integer chunk_size}
5c6ed53a
TB
900@end multitable
901
902@item @emph{See also}:
903@ref{omp_set_schedule}, @ref{OMP_SCHEDULE}
904
905@item @emph{Reference}:
1a6d1d24 906@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.13.
83fd6c5b
TB
907@end table
908
909
8949b985
KCY
910@node omp_get_supported_active_levels
911@section @code{omp_get_supported_active_levels} -- Maximum number of active regions supported
912@table @asis
913@item @emph{Description}:
914This function returns the maximum number of nested, active parallel regions
915supported by this implementation.
916
@item @emph{C/C++}:
918@multitable @columnfractions .20 .80
919@item @emph{Prototype}: @tab @code{int omp_get_supported_active_levels(void);}
920@end multitable
921
922@item @emph{Fortran}:
923@multitable @columnfractions .20 .80
924@item @emph{Interface}: @tab @code{integer function omp_get_supported_active_levels()}
925@end multitable
926
927@item @emph{See also}:
928@ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
929
930@item @emph{Reference}:
931@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.15.
932@end table
933
934
83fd6c5b
TB
935
936@node omp_get_team_num
937@section @code{omp_get_team_num} -- Get team number
938@table @asis
939@item @emph{Description}:
940Returns the team number of the calling thread.
941
942@item @emph{C/C++}:
943@multitable @columnfractions .20 .80
944@item @emph{Prototype}: @tab @code{int omp_get_team_num(void);}
945@end multitable
946
947@item @emph{Fortran}:
948@multitable @columnfractions .20 .80
949@item @emph{Interface}: @tab @code{integer function omp_get_team_num()}
950@end multitable
951
952@item @emph{Reference}:
1a6d1d24 953@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.33.
5c6ed53a
TB
954@end table
955
956
957
958@node omp_get_team_size
959@section @code{omp_get_team_size} -- Number of threads in a team
960@table @asis
961@item @emph{Description}:
This function returns the number of threads in a thread team to which
either the current thread or its ancestor belongs. For values of @var{level}
outside the range zero to @code{omp_get_level}, -1 is returned; if @var{level}
is zero, 1 is returned; and if @var{level} equals @code{omp_get_level}, the
result is identical to that of @code{omp_get_num_threads}.
967
968@item @emph{C/C++}:
969@multitable @columnfractions .20 .80
6a2ba183 970@item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);}
5c6ed53a
TB
971@end multitable
972
973@item @emph{Fortran}:
974@multitable @columnfractions .20 .80
975@item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)}
976@item @tab @code{integer level}
977@end multitable
978
979@item @emph{See also}:
980@ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num}
981
982@item @emph{Reference}:
1a6d1d24 983@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.19.
5c6ed53a
TB
984@end table
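
A minimal sketch of how @code{omp_get_team_size} responds to different values of @var{level} when called from inside one parallel region, where @code{omp_get_level} is 1. The helper name is hypothetical. The sequential fallback is an assumption that models a call made outside any parallel region, where only level zero is valid.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: queries omp_get_team_size(level) from within a
   single, non-nested parallel region.  */
int team_size_at_level(int level)
{
  /* Sequential fallback: only level 0 exists, and its team has size 1.  */
  int result = (level == 0) ? 1 : -1;
#ifdef _OPENMP
  #pragma omp parallel num_threads(2)
  {
    #pragma omp single
    result = omp_get_team_size(level);
  }
#endif
  return result;
}
```

Level 0 names the initial team of one thread, so the result is 1. Negative levels and levels deeper than the current nesting yield -1.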
985
986
987
988@node omp_get_thread_limit
6a2ba183 989@section @code{omp_get_thread_limit} -- Maximum number of threads
5c6ed53a
TB
990@table @asis
991@item @emph{Description}:
6a2ba183 992Return the maximum number of threads of the program.
5c6ed53a
TB
993
994@item @emph{C/C++}:
995@multitable @columnfractions .20 .80
6a2ba183 996@item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);}
5c6ed53a
TB
997@end multitable
998
999@item @emph{Fortran}:
1000@multitable @columnfractions .20 .80
1001@item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()}
1002@end multitable
1003
1004@item @emph{See also}:
1005@ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT}
1006
1007@item @emph{Reference}:
1a6d1d24 1008@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.14.
3721b9e1
DF
1009@end table
1010
1011
1012
83fd6c5b 1013@node omp_get_thread_num
3721b9e1
DF
1014@section @code{omp_get_thread_num} -- Current thread ID
1015@table @asis
1016@item @emph{Description}:
Returns a unique thread identification number within the current team.
In sequential parts of the program, @code{omp_get_thread_num}
always returns 0. In parallel regions the return value varies
from 0 to @code{omp_get_num_threads}-1 inclusive. The return
value of the primary thread of a team is always 0.
1022
1023@item @emph{C/C++}:
1024@multitable @columnfractions .20 .80
6a2ba183 1025@item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);}
3721b9e1
DF
1026@end multitable
1027
1028@item @emph{Fortran}:
1029@multitable @columnfractions .20 .80
1030@item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
1031@end multitable
1032
1033@item @emph{See also}:
5c6ed53a 1034@ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num}
3721b9e1
DF
1035
1036@item @emph{Reference}:
1a6d1d24 1037@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.4.
3721b9e1
DF
1038@end table
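
A minimal sketch that checks the numbering property of @code{omp_get_thread_num}: within one team, the numbers 0 to @code{omp_get_num_threads}-1 each occur exactly once. The helper name and the requested team size of four are illustrative assumptions, and the @code{_OPENMP} guard lets the sketch build as a sequential program too.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

enum { TEAM_REQUEST = 4 };  /* illustrative upper bound on the team size */

/* Hypothetical helper: returns 1 if every thread number below the team
   size was seen exactly once and no other number was seen.  */
int thread_numbers_are_dense(void)
{
  int seen[TEAM_REQUEST] = {0};
  int team = 1;
#ifdef _OPENMP
  #pragma omp parallel num_threads(TEAM_REQUEST)
  {
    int id = omp_get_thread_num();
    seen[id]++;  /* distinct threads write distinct slots */
    #pragma omp single
    team = omp_get_num_threads();
  }
#else
  seen[0]++;     /* the only thread of a sequential program is thread 0 */
#endif
  for (int i = 0; i < TEAM_REQUEST; i++)
    if (seen[i] != (i < team ? 1 : 0))
      return 0;
  return 1;
}
```

The runtime may provide fewer threads than requested, which is why the check compares against the team size actually reported.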
1039
1040
1041
1042@node omp_in_parallel
1043@section @code{omp_in_parallel} -- Whether a parallel region is active
1044@table @asis
1045@item @emph{Description}:
This function returns @code{true} if the calling thread is enclosed by an
active parallel region, i.e., one executed by a team of more than one
thread, @code{false} otherwise. Here, @code{true} and @code{false} represent
their language-specific counterparts.
1049
1050@item @emph{C/C++}:
1051@multitable @columnfractions .20 .80
6a2ba183 1052@item @emph{Prototype}: @tab @code{int omp_in_parallel(void);}
3721b9e1
DF
1053@end multitable
1054
1055@item @emph{Fortran}:
1056@multitable @columnfractions .20 .80
1057@item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
1058@end multitable
1059
1060@item @emph{Reference}:
1a6d1d24 1061@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.6.
20906c66
JJ
1062@end table
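
A minimal sketch that calls @code{omp_in_parallel} from sequential code and from inside a parallel region. Both helper names are hypothetical. The second helper may still return zero if the runtime gives the region a team of only one thread, or if the sketch is built without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: result of omp_in_parallel in sequential code.  */
int in_parallel_outside(void)
{
#ifdef _OPENMP
  return omp_in_parallel();
#else
  return 0;
#endif
}

/* Hypothetical helper: result of omp_in_parallel inside a region that
   requests two threads.  */
int in_parallel_inside(void)
{
  int flag = 0;
#ifdef _OPENMP
  #pragma omp parallel num_threads(2)
  {
    #pragma omp single
    flag = omp_in_parallel();
  }
#endif
  return flag;
}
```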
1063
1064
1065@node omp_in_final
1066@section @code{omp_in_final} -- Whether in final or included task region
1067@table @asis
1068@item @emph{Description}:
This function returns @code{true} if currently running in a final
or included task region, @code{false} otherwise. Here, @code{true}
and @code{false} represent their language-specific counterparts.
1072
1073@item @emph{C/C++}:
1074@multitable @columnfractions .20 .80
1075@item @emph{Prototype}: @tab @code{int omp_in_final(void);}
1076@end multitable
1077
1078@item @emph{Fortran}:
1079@multitable @columnfractions .20 .80
1080@item @emph{Interface}: @tab @code{logical function omp_in_final()}
1081@end multitable
1082
1083@item @emph{Reference}:
1a6d1d24 1084@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.21.
3721b9e1
DF
1085@end table
1086
1087
83fd6c5b
TB
1088
1089@node omp_is_initial_device
1090@section @code{omp_is_initial_device} -- Whether executing on the host device
1091@table @asis
1092@item @emph{Description}:
1093This function returns @code{true} if currently running on the host device,
1094@code{false} otherwise. Here, @code{true} and @code{false} represent
1095their language-specific counterparts.
1096
1097@item @emph{C/C++}:
1098@multitable @columnfractions .20 .80
1099@item @emph{Prototype}: @tab @code{int omp_is_initial_device(void);}
1100@end multitable
1101
1102@item @emph{Fortran}:
1103@multitable @columnfractions .20 .80
1104@item @emph{Interface}: @tab @code{logical function omp_is_initial_device()}
1105@end multitable
1106
1107@item @emph{Reference}:
1a6d1d24 1108@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.34.
83fd6c5b
TB
1109@end table
1110
1111
1112
1113@node omp_set_default_device
1114@section @code{omp_set_default_device} -- Set the default device for target regions
1115@table @asis
1116@item @emph{Description}:
1117Set the default device for target regions without device clause. The argument
1118shall be a nonnegative device number.
1119
1120@item @emph{C/C++}:
1121@multitable @columnfractions .20 .80
1122@item @emph{Prototype}: @tab @code{void omp_set_default_device(int device_num);}
1123@end multitable
1124
1125@item @emph{Fortran}:
1126@multitable @columnfractions .20 .80
1127@item @emph{Interface}: @tab @code{subroutine omp_set_default_device(device_num)}
1128@item @tab @code{integer device_num}
1129@end multitable
1130
1131@item @emph{See also}:
1132@ref{OMP_DEFAULT_DEVICE}, @ref{omp_get_default_device}
1133
1134@item @emph{Reference}:
1a6d1d24 1135@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
83fd6c5b
TB
1136@end table
1137
1138
1139
3721b9e1
DF
1140@node omp_set_dynamic
1141@section @code{omp_set_dynamic} -- Enable/disable dynamic teams
1142@table @asis
1143@item @emph{Description}:
Enable or disable the dynamic adjustment of the number of threads
within a team. The function takes the language-specific equivalent
of @code{true} and @code{false}, where @code{true} enables dynamic
adjustment of team sizes and @code{false} disables it.
1148
1149@item @emph{C/C++}:
1150@multitable @columnfractions .20 .80
4fed6b25 1151@item @emph{Prototype}: @tab @code{void omp_set_dynamic(int dynamic_threads);}
3721b9e1
DF
1152@end multitable
1153
1154@item @emph{Fortran}:
1155@multitable @columnfractions .20 .80
4fed6b25
TB
1156@item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(dynamic_threads)}
1157@item @tab @code{logical, intent(in) :: dynamic_threads}
3721b9e1
DF
1158@end multitable
1159
1160@item @emph{See also}:
1161@ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}
1162
1163@item @emph{Reference}:
1a6d1d24 1164@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.7.
5c6ed53a
TB
1165@end table
1166
1167
1168
1169@node omp_set_max_active_levels
1170@section @code{omp_set_max_active_levels} -- Limits the number of active parallel regions
1171@table @asis
1172@item @emph{Description}:
This function limits the maximum allowed number of nested, active
parallel regions. @var{max_levels} must be less than or equal to
the value returned by @code{omp_get_supported_active_levels}.
1176
@item @emph{C/C++}:
1178@multitable @columnfractions .20 .80
6a2ba183 1179@item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);}
5c6ed53a
TB
1180@end multitable
1181
1182@item @emph{Fortran}:
1183@multitable @columnfractions .20 .80
6a2ba183 1184@item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)}
5c6ed53a
TB
1185@item @tab @code{integer max_levels}
1186@end multitable
1187
1188@item @emph{See also}:
8949b985
KCY
1189@ref{omp_get_max_active_levels}, @ref{omp_get_active_level},
1190@ref{omp_get_supported_active_levels}
5c6ed53a
TB
1191
1192@item @emph{Reference}:
1a6d1d24 1193@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.15.
3721b9e1
DF
1194@end table
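
A minimal sketch of the effect of @code{omp_set_max_active_levels} on two nested parallel regions. It records the deepest value of @code{omp_get_active_level} observed in the inner region. The helper name is hypothetical and the requested team sizes are illustrative. Built without OpenMP support, no region is ever active and the sketch returns 0.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: limits nesting to max_levels, opens two nested
   parallel regions and returns the deepest active level observed.  */
int deepest_active_level(int max_levels)
{
  int depth = 0;
#ifdef _OPENMP
  omp_set_max_active_levels(max_levels);
  #pragma omp parallel num_threads(2)
  {
    #pragma omp parallel num_threads(2)
    {
      int level = omp_get_active_level();
      /* Several inner teams may update depth, so serialize the update.  */
      #pragma omp critical
      if (level > depth)
        depth = level;
    }
  }
#endif
  return depth;
}
```

With a limit of one, the inner region runs with a single thread whenever the outer region is active, so the observed depth never exceeds one.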
1195
1196
1197
1198@node omp_set_nested
1199@section @code{omp_set_nested} -- Enable/disable nested parallel regions
1200@table @asis
1201@item @emph{Description}:
Enable or disable nested parallel regions, i.e., whether team members
are allowed to create new teams. The function takes the language-specific
equivalent of @code{true} and @code{false}, where @code{true} enables
nested parallel regions and @code{false} disables them.

Enabling nested parallel regions also sets the maximum number of
active nested regions to the maximum supported. Disabling nested parallel
regions sets the maximum number of active nested regions to one.
1210
3721b9e1
DF
1211@item @emph{C/C++}:
1212@multitable @columnfractions .20 .80
4fed6b25 1213@item @emph{Prototype}: @tab @code{void omp_set_nested(int nested);}
3721b9e1
DF
1214@end multitable
1215
1216@item @emph{Fortran}:
1217@multitable @columnfractions .20 .80
4fed6b25
TB
1218@item @emph{Interface}: @tab @code{subroutine omp_set_nested(nested)}
1219@item @tab @code{logical, intent(in) :: nested}
3721b9e1
DF
1220@end multitable
1221
1222@item @emph{See also}:
6fae7eda
KCY
1223@ref{omp_get_nested}, @ref{omp_set_max_active_levels},
1224@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}
3721b9e1
DF
1225
1226@item @emph{Reference}:
1a6d1d24 1227@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.10.
3721b9e1
DF
1228@end table
1229
1230
1231
1232@node omp_set_num_threads
1233@section @code{omp_set_num_threads} -- Set upper team size limit
1234@table @asis
1235@item @emph{Description}:
Specifies the number of threads used by default in subsequent parallel
regions, if those do not specify a @code{num_threads} clause. The
argument of @code{omp_set_num_threads} shall be a positive integer.
3721b9e1 1239
3721b9e1
DF
1240@item @emph{C/C++}:
1241@multitable @columnfractions .20 .80
4fed6b25 1242@item @emph{Prototype}: @tab @code{void omp_set_num_threads(int num_threads);}
3721b9e1
DF
1243@end multitable
1244
1245@item @emph{Fortran}:
1246@multitable @columnfractions .20 .80
4fed6b25
TB
1247@item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(num_threads)}
1248@item @tab @code{integer, intent(in) :: num_threads}
3721b9e1
DF
1249@end multitable
1250
1251@item @emph{See also}:
1252@ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}
1253
1254@item @emph{Reference}:
1a6d1d24 1255@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.1.
5c6ed53a
TB
1256@end table
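
A minimal sketch showing that the value passed to @code{omp_set_num_threads} acts as an upper limit for the next parallel region that has no @code{num_threads} clause. The helper name is hypothetical. The runtime may still provide fewer threads, and a build without OpenMP support always runs with one.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: requests n threads and returns the size of the
   team that the following parallel region actually received.  */
int request_threads(int n)
{
  int got = 1;
#ifdef _OPENMP
  omp_set_num_threads(n);
  #pragma omp parallel
  {
    #pragma omp single
    got = omp_get_num_threads();
  }
#endif
  return got;
}
```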
1257
1258
1259
1260@node omp_set_schedule
1261@section @code{omp_set_schedule} -- Set the runtime scheduling method
1262@table @asis
1263@item @emph{Description}:
Sets the runtime scheduling method. The @var{kind} argument can have the
value @code{omp_sched_static}, @code{omp_sched_dynamic},
@code{omp_sched_guided} or @code{omp_sched_auto}. Except for
@code{omp_sched_auto}, the chunk size is set to the value of
@var{chunk_size} if positive, or to the default value if zero or negative.
For @code{omp_sched_auto} the @var{chunk_size} argument is ignored.
1270
@item @emph{C/C++}:
1272@multitable @columnfractions .20 .80
d9a6bd32 1273@item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int chunk_size);}
5c6ed53a
TB
1274@end multitable
1275
1276@item @emph{Fortran}:
1277@multitable @columnfractions .20 .80
d9a6bd32 1278@item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, chunk_size)}
5c6ed53a 1279@item @tab @code{integer(kind=omp_sched_kind) kind}
d9a6bd32 1280@item @tab @code{integer chunk_size}
5c6ed53a
TB
1281@end multitable
1282
1283@item @emph{See also}:
@ref{omp_get_schedule}, @ref{OMP_SCHEDULE}
1286
1287@item @emph{Reference}:
1a6d1d24 1288@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.12.
3721b9e1
DF
1289@end table
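
A minimal sketch of a round trip through @code{omp_set_schedule} and @code{omp_get_schedule}. The setting applies to loops that use @code{schedule(runtime)}. The helper name is hypothetical, and the sequential fallback simply echoes its argument.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: selects dynamic scheduling with a positive chunk
   size, reads the setting back and returns the reported chunk size, or
   -1 if the scheduling kind did not round-trip.  */
int set_dynamic_schedule(int chunk_size)
{
#ifdef _OPENMP
  omp_sched_t kind;
  int chunk;

  omp_set_schedule(omp_sched_dynamic, chunk_size);
  omp_get_schedule(&kind, &chunk);
  return kind == omp_sched_dynamic ? chunk : -1;
#else
  return chunk_size;  /* nothing to configure in a sequential build */
#endif
}
```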
1290
1291
1292
1293@node omp_init_lock
1294@section @code{omp_init_lock} -- Initialize simple lock
1295@table @asis
1296@item @emph{Description}:
Initialize a simple lock. After initialization, the lock is in
an unlocked state.
1299
1300@item @emph{C/C++}:
1301@multitable @columnfractions .20 .80
1302@item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
1303@end multitable
1304
1305@item @emph{Fortran}:
1306@multitable @columnfractions .20 .80
4fed6b25
TB
1307@item @emph{Interface}: @tab @code{subroutine omp_init_lock(svar)}
1308@item @tab @code{integer(omp_lock_kind), intent(out) :: svar}
3721b9e1
DF
1309@end multitable
1310
1311@item @emph{See also}:
1312@ref{omp_destroy_lock}
1313
1314@item @emph{Reference}:
1a6d1d24 1315@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
3721b9e1
DF
1316@end table
1317
1318
1319
1320@node omp_set_lock
1321@section @code{omp_set_lock} -- Wait for and set simple lock
1322@table @asis
1323@item @emph{Description}:
Before setting a simple lock, the lock variable must be initialized by
@code{omp_init_lock}. The calling thread is blocked until the lock
is available. If the lock is already held by the current thread,
a deadlock occurs.
1328
1329@item @emph{C/C++}:
1330@multitable @columnfractions .20 .80
1331@item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
1332@end multitable
1333
1334@item @emph{Fortran}:
1335@multitable @columnfractions .20 .80
4fed6b25
TB
1336@item @emph{Interface}: @tab @code{subroutine omp_set_lock(svar)}
1337@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
3721b9e1
DF
1338@end multitable
1339
1340@item @emph{See also}:
1341@ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
1342
1343@item @emph{Reference}:
1a6d1d24 1344@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
3721b9e1
DF
1345@end table
1346
1347
1348
1349@node omp_test_lock
1350@section @code{omp_test_lock} -- Test and set simple lock if available
1351@table @asis
1352@item @emph{Description}:
Before setting a simple lock, the lock variable must be initialized by
@code{omp_init_lock}. In contrast to @code{omp_set_lock}, @code{omp_test_lock}
does not block if the lock is not available. This function returns
@code{true} upon success, @code{false} otherwise. Here, @code{true} and
@code{false} represent their language-specific counterparts.
1358
1359@item @emph{C/C++}:
1360@multitable @columnfractions .20 .80
1361@item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
1362@end multitable
1363
1364@item @emph{Fortran}:
1365@multitable @columnfractions .20 .80
4fed6b25
TB
1366@item @emph{Interface}: @tab @code{logical function omp_test_lock(svar)}
1367@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
3721b9e1
DF
1368@end multitable
1369
1370@item @emph{See also}:
@ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock}
1372
1373@item @emph{Reference}:
1a6d1d24 1374@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
3721b9e1
DF
1375@end table
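
A minimal sketch of polling with @code{omp_test_lock} instead of blocking in @code{omp_set_lock}. The helper name is hypothetical. A real program would do useful work between attempts; this sketch retries immediately. Built without OpenMP support, there is nothing to protect and the loop is replaced by its result.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: increments a shared counter n times, taking the
   lock with non-blocking attempts.  Returns the final counter value.  */
int count_with_test_lock(int n)
{
  int total = 0;
#ifdef _OPENMP
  omp_lock_t lock;

  omp_init_lock(&lock);
  #pragma omp parallel for
  for (int i = 0; i < n; i++)
    {
      /* omp_test_lock returns nonzero once the lock has been acquired.  */
      while (!omp_test_lock(&lock))
        ;  /* lock busy: other work could be done here */
      total++;
      omp_unset_lock(&lock);
    }
  omp_destroy_lock(&lock);
#else
  total = n;
#endif
  return total;
}
```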
1376
1377
1378
1379@node omp_unset_lock
1380@section @code{omp_unset_lock} -- Unset simple lock
1381@table @asis
1382@item @emph{Description}:
A simple lock about to be unset must have been locked by @code{omp_set_lock}
or @code{omp_test_lock} before. In addition, the lock must be held by the
thread calling @code{omp_unset_lock}. Then, the lock becomes unlocked. If one
or more threads were waiting to set the lock, one of them is chosen to
acquire it.
1388
1389@item @emph{C/C++}:
1390@multitable @columnfractions .20 .80
1391@item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
1392@end multitable
1393
1394@item @emph{Fortran}:
1395@multitable @columnfractions .20 .80
4fed6b25
TB
1396@item @emph{Interface}: @tab @code{subroutine omp_unset_lock(svar)}
1397@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
3721b9e1
DF
1398@end multitable
1399
1400@item @emph{See also}:
1401@ref{omp_set_lock}, @ref{omp_test_lock}
1402
1403@item @emph{Reference}:
1a6d1d24 1404@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
3721b9e1
DF
1405@end table
1406
1407
1408
1409@node omp_destroy_lock
1410@section @code{omp_destroy_lock} -- Destroy simple lock
1411@table @asis
1412@item @emph{Description}:
Destroy a simple lock. In order to be destroyed, a simple lock must be
in the unlocked state.
1415
1416@item @emph{C/C++}:
1417@multitable @columnfractions .20 .80
6a2ba183 1418@item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);}
3721b9e1
DF
1419@end multitable
1420
1421@item @emph{Fortran}:
1422@multitable @columnfractions .20 .80
4fed6b25
TB
1423@item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(svar)}
1424@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
3721b9e1
DF
1425@end multitable
1426
1427@item @emph{See also}:
1428@ref{omp_init_lock}
1429
1430@item @emph{Reference}:
1a6d1d24 1431@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
3721b9e1
DF
1432@end table
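
A minimal sketch of the full life cycle of a simple lock: @code{omp_init_lock}, repeated @code{omp_set_lock} and @code{omp_unset_lock} pairs around a shared update, and finally @code{omp_destroy_lock}. The helper name is hypothetical. Built without OpenMP support, the sketch degenerates to its sequential result.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: increments a shared counter n times under the
   protection of a simple lock and returns the final value.  */
int count_with_lock(int n)
{
  int total = 0;
#ifdef _OPENMP
  omp_lock_t lock;

  omp_init_lock(&lock);        /* the lock starts out unlocked */
  #pragma omp parallel for
  for (int i = 0; i < n; i++)
    {
      omp_set_lock(&lock);     /* blocks until the lock is available */
      total++;
      omp_unset_lock(&lock);   /* must be called by the holding thread */
    }
  omp_destroy_lock(&lock);     /* valid only while the lock is unlocked */
#else
  total = n;
#endif
  return total;
}
```

For an update this simple, a reduction or atomic construct would normally be preferable; the lock is used here only to illustrate the routines.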
1433
1434
1435
1436@node omp_init_nest_lock
1437@section @code{omp_init_nest_lock} -- Initialize nested lock
1438@table @asis
1439@item @emph{Description}:
Initialize a nested lock. After initialization, the lock is in
an unlocked state and the nesting count is set to zero.
1442
1443@item @emph{C/C++}:
1444@multitable @columnfractions .20 .80
1445@item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
1446@end multitable
1447
1448@item @emph{Fortran}:
1449@multitable @columnfractions .20 .80
4fed6b25
TB
1450@item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(nvar)}
1451@item @tab @code{integer(omp_nest_lock_kind), intent(out) :: nvar}
3721b9e1
DF
1452@end multitable
1453
1454@item @emph{See also}:
1455@ref{omp_destroy_nest_lock}
1456
1457@item @emph{Reference}:
1a6d1d24 1458@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
3721b9e1
DF
1459@end table
1460
1461
1462@node omp_set_nest_lock
6a2ba183 1463@section @code{omp_set_nest_lock} -- Wait for and set nested lock
3721b9e1
DF
1464@table @asis
1465@item @emph{Description}:
Before setting a nested lock, the lock variable must be initialized by
@code{omp_init_nest_lock}. The calling thread is blocked until the lock
is available. If the lock is already held by the current thread, the
nesting count for the lock is incremented.
1470
1471@item @emph{C/C++}:
1472@multitable @columnfractions .20 .80
1473@item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
1474@end multitable
1475
1476@item @emph{Fortran}:
1477@multitable @columnfractions .20 .80
4fed6b25
TB
1478@item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(nvar)}
1479@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
3721b9e1
DF
1480@end multitable
1481
1482@item @emph{See also}:
1483@ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
1484
1485@item @emph{Reference}:
1a6d1d24 1486@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
3721b9e1
DF
1487@end table
1488
1489
1490
1491@node omp_test_nest_lock
1492@section @code{omp_test_nest_lock} -- Test and set nested lock if available
1493@table @asis
1494@item @emph{Description}:
Before setting a nested lock, the lock variable must be initialized by
@code{omp_init_nest_lock}. In contrast to @code{omp_set_nest_lock},
@code{omp_test_nest_lock} does not block if the lock is not available.
If the lock is successfully set, either because it was unlocked or because
it is already held by the current thread, the new nesting count
is returned. Otherwise, the return value equals zero.
1500
1501@item @emph{C/C++}:
1502@multitable @columnfractions .20 .80
1503@item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
1504@end multitable
1505
1506@item @emph{Fortran}:
1507@multitable @columnfractions .20 .80
4fed6b25
TB
@item @emph{Interface}: @tab @code{integer function omp_test_nest_lock(nvar)}
1509@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
3721b9e1
DF
1510@end multitable
1511
1512
1513@item @emph{See also}:
@ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}, @ref{omp_unset_nest_lock}
1515
1516@item @emph{Reference}:
1a6d1d24 1517@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
3721b9e1
DF
1518@end table
1519
1520
1521
1522@node omp_unset_nest_lock
1523@section @code{omp_unset_nest_lock} -- Unset nested lock
1524@table @asis
1525@item @emph{Description}:
A nested lock about to be unset must have been locked by @code{omp_set_nest_lock}
or @code{omp_test_nest_lock} before. In addition, the lock must be held by the
thread calling @code{omp_unset_nest_lock}. If the nesting count drops to zero, the
lock becomes unlocked. If one or more threads were waiting to set the lock,
one of them is chosen to acquire it.
1531
1532@item @emph{C/C++}:
1533@multitable @columnfractions .20 .80
1534@item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
1535@end multitable
1536
1537@item @emph{Fortran}:
1538@multitable @columnfractions .20 .80
4fed6b25
TB
1539@item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(nvar)}
1540@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
3721b9e1
DF
1541@end multitable
1542
1543@item @emph{See also}:
1544@ref{omp_set_nest_lock}
1545
1546@item @emph{Reference}:
1a6d1d24 1547@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
3721b9e1
DF
1548@end table
1549
1550
1551
1552@node omp_destroy_nest_lock
1553@section @code{omp_destroy_nest_lock} -- Destroy nested lock
1554@table @asis
1555@item @emph{Description}:
Destroy a nested lock. In order to be destroyed, a nested lock must be
in the unlocked state and its nesting count must equal zero.
1558
1559@item @emph{C/C++}:
1560@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *lock);}
1562@end multitable
1563
1564@item @emph{Fortran}:
1565@multitable @columnfractions .20 .80
4fed6b25
TB
1566@item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(nvar)}
1567@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
3721b9e1
DF
1568@end multitable
1569
1570@item @emph{See also}:
@ref{omp_init_nest_lock}
1572
1573@item @emph{Reference}:
1a6d1d24 1574@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
3721b9e1
DF
1575@end table
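
A minimal sketch of the nesting count of a nested lock. One thread acquires the same lock several times with @code{omp_test_nest_lock}, which reports the new nesting count, and then releases it the same number of times so that the count returns to zero before @code{omp_destroy_nest_lock}. The helper name is hypothetical, and the sequential fallback returns the count that would be reported.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: acquires a nested lock `times` times from one
   thread and returns the nesting count reported by the last attempt.  */
int nest_count(int times)
{
#ifdef _OPENMP
  omp_nest_lock_t lock;
  int count = 0;

  omp_init_nest_lock(&lock);             /* unlocked, nesting count zero */
  for (int i = 0; i < times; i++)
    count = omp_test_nest_lock(&lock);   /* the owner may re-acquire */
  for (int i = 0; i < times; i++)
    omp_unset_nest_lock(&lock);          /* unlocked once the count is zero */
  omp_destroy_nest_lock(&lock);
  return count;
#else
  return times;
#endif
}
```

A simple lock used the same way would not be valid, since a thread must not set a simple lock it already holds.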
1576
1577
1578
1579@node omp_get_wtick
1580@section @code{omp_get_wtick} -- Get timer precision
1581@table @asis
1582@item @emph{Description}:
Gets the timer precision, i.e., the number of seconds between two
successive clock ticks.
1585
1586@item @emph{C/C++}:
1587@multitable @columnfractions .20 .80
6a2ba183 1588@item @emph{Prototype}: @tab @code{double omp_get_wtick(void);}
3721b9e1
DF
1589@end multitable
1590
1591@item @emph{Fortran}:
1592@multitable @columnfractions .20 .80
1593@item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
1594@end multitable
1595
1596@item @emph{See also}:
1597@ref{omp_get_wtime}
1598
1599@item @emph{Reference}:
1a6d1d24 1600@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.2.
3721b9e1
DF
1601@end table
1602
1603
1604
1605@node omp_get_wtime
1606@section @code{omp_get_wtime} -- Elapsed wall clock time
1607@table @asis
1608@item @emph{Description}:
Elapsed wall clock time in seconds. The time is measured per thread; no
guarantee can be made that two distinct threads measure the same time.
Time is measured from some ``time in the past'', which is an arbitrary time
guaranteed not to change during the execution of the program.
1613
1614@item @emph{C/C++}:
1615@multitable @columnfractions .20 .80
6a2ba183 1616@item @emph{Prototype}: @tab @code{double omp_get_wtime(void);}
3721b9e1
DF
1617@end multitable
1618
1619@item @emph{Fortran}:
1620@multitable @columnfractions .20 .80
1621@item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
1622@end multitable
1623
1624@item @emph{See also}:
1625@ref{omp_get_wtick}
1626
1627@item @emph{Reference}:
1a6d1d24 1628@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.1.
3721b9e1
DF
1629@end table
1630
1631
1632
0194e2f0
KCY
1633@node omp_fulfill_event
1634@section @code{omp_fulfill_event} -- Fulfill and destroy an OpenMP event
1635@table @asis
1636@item @emph{Description}:
Fulfill the event associated with the event handle argument.  Currently, it
is only used to fulfill events generated by detach clauses on task
constructs; the effect of fulfilling the event is to allow the task to
complete.
1641
1642The result of calling @code{omp_fulfill_event} with an event handle other
1643than that generated by a detach clause is undefined. Calling it with an
1644event handle that has already been fulfilled is also undefined.
1645
1646@item @emph{C/C++}:
1647@multitable @columnfractions .20 .80
1648@item @emph{Prototype}: @tab @code{void omp_fulfill_event(omp_event_handle_t event);}
1649@end multitable
1650
1651@item @emph{Fortran}:
1652@multitable @columnfractions .20 .80
1653@item @emph{Interface}: @tab @code{subroutine omp_fulfill_event(event)}
1654@item @tab @code{integer (kind=omp_event_handle_kind) :: event}
1655@end multitable
1656
1657@item @emph{Reference}:
1658@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.5.1.
1659@end table
1660
1661
1662
3721b9e1 1663@c ---------------------------------------------------------------------
4102bda6 1664@c OpenMP Environment Variables
3721b9e1
DF
1665@c ---------------------------------------------------------------------
1666
1667@node Environment Variables
4102bda6 1668@chapter OpenMP Environment Variables
3721b9e1 1669
The environment variables beginning with @env{OMP_} are defined by
section 4 of the OpenMP specification in version 4.5, while those
beginning with @env{GOMP_} are GNU extensions.
1673
@menu
* OMP_CANCELLATION::        Set whether cancellation is activated
* OMP_DISPLAY_ENV::         Show OpenMP version and environment variables
* OMP_DEFAULT_DEVICE::      Set the device used in target regions
* OMP_DYNAMIC::             Dynamic adjustment of threads
* OMP_MAX_ACTIVE_LEVELS::   Set the maximum number of nested parallel regions
* OMP_MAX_TASK_PRIORITY::   Set the maximum task priority value
* OMP_NESTED::              Nested parallel regions
* OMP_NUM_THREADS::         Specifies the number of threads to use
* OMP_PROC_BIND::           Whether threads may be moved between CPUs
* OMP_PLACES::              Specifies on which CPUs the threads should be placed
* OMP_STACKSIZE::           Set default thread stack size
* OMP_SCHEDULE::            How threads are scheduled
* OMP_TARGET_OFFLOAD::      Controls offloading behaviour
* OMP_THREAD_LIMIT::        Set the maximum number of threads
* OMP_WAIT_POLICY::         How waiting threads are handled
* GOMP_CPU_AFFINITY::       Bind threads to specific CPUs
* GOMP_DEBUG::              Enable debugging output
* GOMP_STACKSIZE::          Set default thread stack size
* GOMP_SPINCOUNT::          Set the busy-wait spin count
* GOMP_RTEMS_THREAD_POOLS:: Set the RTEMS specific thread pools
@end menu
1696
1697
83fd6c5b
TB
1698@node OMP_CANCELLATION
1699@section @env{OMP_CANCELLATION} -- Set whether cancellation is activated
1700@cindex Environment Variable
1701@table @asis
1702@item @emph{Description}:
If set to @code{TRUE}, cancellation is activated.  If set to @code{FALSE} or
if unset, cancellation is disabled and the @code{cancel} construct is ignored.
1705
1706@item @emph{See also}:
1707@ref{omp_get_cancellation}
1708
1709@item @emph{Reference}:
1a6d1d24 1710@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.11
83fd6c5b
TB
1711@end table
1712
1713
1714
1715@node OMP_DISPLAY_ENV
1716@section @env{OMP_DISPLAY_ENV} -- Show OpenMP version and environment variables
1717@cindex Environment Variable
1718@table @asis
1719@item @emph{Description}:
1720If set to @code{TRUE}, the OpenMP version number and the values
1721associated with the OpenMP environment variables are printed to @code{stderr}.
1722If set to @code{VERBOSE}, it additionally shows the value of the environment
1723variables which are GNU extensions. If undefined or set to @code{FALSE},
1724this information will not be shown.
1725
1726
1727@item @emph{Reference}:
1a6d1d24 1728@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.12
83fd6c5b
TB
1729@end table
1730
1731
1732
1733@node OMP_DEFAULT_DEVICE
1734@section @env{OMP_DEFAULT_DEVICE} -- Set the device used in target regions
1735@cindex Environment Variable
1736@table @asis
1737@item @emph{Description}:
1738Set to choose the device which is used in a @code{target} region, unless the
1739value is overridden by @code{omp_set_default_device} or by a @code{device}
1740clause. The value shall be the nonnegative device number. If no device with
1741the given device number exists, the code is executed on the host. If unset,
1742device number 0 will be used.
1743
1744
1745@item @emph{See also}:
@ref{omp_get_default_device}, @ref{omp_set_default_device}
1747
1748@item @emph{Reference}:
1a6d1d24 1749@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.13
83fd6c5b
TB
1750@end table
1751
1752
1753
3721b9e1
DF
1754@node OMP_DYNAMIC
1755@section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
1756@cindex Environment Variable
1757@table @asis
1758@item @emph{Description}:
1759Enable or disable the dynamic adjustment of the number of threads
83fd6c5b
TB
1760within a team. The value of this environment variable shall be
1761@code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is
7c2b7f45 1762disabled by default.
3721b9e1
DF
1763
1764@item @emph{See also}:
1765@ref{omp_set_dynamic}
1766
1767@item @emph{Reference}:
1a6d1d24 1768@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.3
5c6ed53a
TB
1769@end table
1770
1771
1772
1773@node OMP_MAX_ACTIVE_LEVELS
6a2ba183 1774@section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions
5c6ed53a
TB
1775@cindex Environment Variable
1776@table @asis
1777@item @emph{Description}:
Specifies the initial value for the maximum number of nested parallel
regions.  The value of this variable shall be a positive integer.
If undefined, then if @env{OMP_NESTED} is defined and set to true, or
if @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} is defined and set to
a list with more than one item, the maximum number of nested parallel
regions will be initialized to the largest number supported; otherwise
it will be set to one.
1785
1786@item @emph{See also}:
6fae7eda 1787@ref{omp_set_max_active_levels}, @ref{OMP_NESTED}
5c6ed53a
TB
1788
1789@item @emph{Reference}:
1a6d1d24 1790@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.9
3721b9e1
DF
1791@end table
1792
1793
1794
d9a6bd32
JJ
1795@node OMP_MAX_TASK_PRIORITY
@section @env{OMP_MAX_TASK_PRIORITY} -- Set the maximum task priority value
1798@cindex Environment Variable
1799@table @asis
1800@item @emph{Description}:
Specifies the initial value for the maximum priority value that can be
set for a task.  The value of this variable shall be a non-negative
integer.  If undefined, the default priority is 0.
1805
1806@item @emph{See also}:
1807@ref{omp_get_max_task_priority}
1808
1809@item @emph{Reference}:
1a6d1d24 1810@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.14
d9a6bd32
JJ
1811@end table
1812
1813
1814
3721b9e1
DF
1815@node OMP_NESTED
1816@section @env{OMP_NESTED} -- Nested parallel regions
1817@cindex Environment Variable
14734fc7 1818@cindex Implementation specific setting
3721b9e1
DF
1819@table @asis
1820@item @emph{Description}:
Enable or disable nested parallel regions, i.e., whether team members
are allowed to create new teams.  The value of this environment variable
shall be @code{TRUE} or @code{FALSE}.  If set to @code{TRUE}, the maximum
number of active nested regions will by default be set to the largest
number supported; otherwise it will be set to one.  If
@env{OMP_MAX_ACTIVE_LEVELS} is defined, its setting will override this
setting.  If both are undefined, nested parallel regions are enabled if
@env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} is defined to a list with
more than one item; otherwise they are disabled by default.
1830
1831@item @emph{See also}:
6fae7eda 1832@ref{omp_set_max_active_levels}, @ref{omp_set_nested}
3721b9e1
DF
1833
1834@item @emph{Reference}:
1a6d1d24 1835@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.6
3721b9e1
DF
1836@end table
1837
1838
1839
1840@node OMP_NUM_THREADS
1841@section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
1842@cindex Environment Variable
14734fc7 1843@cindex Implementation specific setting
3721b9e1
DF
1844@table @asis
1845@item @emph{Description}:
Specifies the default number of threads to use in parallel regions.  The
value of this variable shall be a comma-separated list of positive integers;
each value specifies the number of threads to use for the corresponding
nesting level.  Specifying more than one item in the list will automatically
enable nesting by default.  If undefined, one thread per CPU is used.
1851
1852@item @emph{See also}:
6fae7eda 1853@ref{omp_set_num_threads}, @ref{OMP_NESTED}
3721b9e1
DF
1854
1855@item @emph{Reference}:
1a6d1d24 1856@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.2
83fd6c5b
TB
1857@end table
1858
1859
1860
72832460
UB
1861@node OMP_PROC_BIND
@section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs
1863@cindex Environment Variable
1864@table @asis
1865@item @emph{Description}:
Specifies whether threads may be moved between processors.  If set to
@code{TRUE}, OpenMP threads should not be moved; if set to @code{FALSE}
they may be moved.  Alternatively, a comma-separated list with the
values @code{PRIMARY}, @code{MASTER}, @code{CLOSE} and @code{SPREAD} can
be used to specify the thread affinity policy for the corresponding nesting
level.  With @code{PRIMARY} and @code{MASTER} the worker threads are in the
same place partition as the primary thread.  With @code{CLOSE} those are
kept close to the primary thread in contiguous place partitions.  With
@code{SPREAD} a sparse distribution across the place partitions is used.
Specifying more than one item in the list will automatically enable nesting
by default.
1877
1878When undefined, @env{OMP_PROC_BIND} defaults to @code{TRUE} when
1879@env{OMP_PLACES} or @env{GOMP_CPU_AFFINITY} is set and @code{FALSE} otherwise.
1880
1881@item @emph{See also}:
6fae7eda
KCY
1882@ref{omp_get_proc_bind}, @ref{GOMP_CPU_AFFINITY},
1883@ref{OMP_NESTED}, @ref{OMP_PLACES}
72832460
UB
1884
1885@item @emph{Reference}:
1a6d1d24 1886@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.4
72832460
UB
1887@end table
1888
1889
1890
83fd6c5b
TB
1891@node OMP_PLACES
@section @env{OMP_PLACES} -- Specifies on which CPUs the threads should be placed
1893@cindex Environment Variable
1894@table @asis
1895@item @emph{Description}:
The thread placement can be either specified using an abstract name or by an
explicit list of the places.  The abstract names @code{threads}, @code{cores}
and @code{sockets} can be optionally followed by a positive number in
parentheses, which denotes how many places shall be created.  With
@code{threads} each place corresponds to a single hardware thread; with
@code{cores} to a single core with the corresponding number of hardware
threads; and with @code{sockets} to a single socket.  The resulting
placement can be shown by setting the @env{OMP_DISPLAY_ENV} environment
variable.
1905
Alternatively, the placement can be specified explicitly as a comma-separated
list of places.  A place is specified by a set of nonnegative numbers in curly
braces, denoting the hardware threads.  The hardware threads
belonging to a place can either be specified as a comma-separated list of
nonnegative thread numbers or using an interval.  Multiple places can also be
either specified by a comma-separated list of places or by an interval.  To
specify an interval, a colon followed by the count is placed after
the hardware thread number or the place.  Optionally, the count can be
followed by a colon and the stride number; otherwise a unit stride is
assumed.  For instance, the following all specify the same places list:
@code{"@{0,1,2@}, @{3,4,5@}, @{6,7,8@}, @{9,10,11@}"};
@code{"@{0:3@}, @{3:3@}, @{6:3@}, @{9:3@}"}; and @code{"@{0:3@}:4:3"}.
1918
If @env{OMP_PLACES} and @env{GOMP_CPU_AFFINITY} are unset and
@env{OMP_PROC_BIND} is either unset or @code{FALSE}, threads may be moved
between CPUs following no placement policy.
1922
1923@item @emph{See also}:
1924@ref{OMP_PROC_BIND}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind},
1925@ref{OMP_DISPLAY_ENV}
1926
1927@item @emph{Reference}:
1a6d1d24 1928@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.5
83fd6c5b
TB
1929@end table
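The interval notation of @env{OMP_PLACES} can be illustrated with a small C
sketch.  It is not the parser used by libgomp; the helper names
@code{expand_places} and @code{print_places} are hypothetical and exist only
for this example.  The sketch expands a place interval of the form
@code{@{start:len@}:count:stride} into explicit hardware thread numbers,
assuming a unit stride inside each place.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, for illustration only (not libgomp's parser).
   Expands the places interval "{start:len}:count:stride" and writes the
   hardware thread numbers of all places, place after place, into OUT.
   Returns the total number of thread numbers written.  */
static int
expand_places (int start, int len, int count, int stride, int *out)
{
  int n = 0;

  assert (start >= 0 && len > 0 && count > 0);
  for (int place = 0; place < count; place++)
    for (int i = 0; i < len; i++)
      /* Each place begins STRIDE threads after the previous one;
         inside a place a unit stride is assumed.  */
      out[n++] = start + place * stride + i;
  return n;
}

/* Prints the places in the explicit "{a,b,c},{d,e,f}" form.  */
static void
print_places (int start, int len, int count, int stride)
{
  int threads[256];
  int n;

  assert ((long) len * count <= 256);
  n = expand_places (start, len, count, stride, threads);
  for (int i = 0; i < n; i++)
    {
      if (i % len == 0)
        printf ("%s{", i ? "," : "");
      printf ("%d%s", threads[i], (i % len == len - 1) ? "}" : ",");
    }
  printf ("\n");
}
```

Calling @code{print_places (0, 3, 4, 3)} prints
@code{@{0,1,2@},@{3,4,5@},@{6,7,8@},@{9,10,11@}}, i.e., four places of three
consecutive hardware threads each.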
1930
1931
1932
72832460
UB
1933@node OMP_STACKSIZE
1934@section @env{OMP_STACKSIZE} -- Set default thread stack size
83fd6c5b
TB
1935@cindex Environment Variable
1936@table @asis
1937@item @emph{Description}:
72832460
UB
1938Set the default thread stack size in kilobytes, unless the number
1939is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which
1940case the size is, respectively, in bytes, kilobytes, megabytes
1941or gigabytes. This is different from @code{pthread_attr_setstacksize}
1942which gets the number of bytes as an argument. If the stack size cannot
1943be set due to system constraints, an error is reported and the initial
1944stack size is left unchanged. If undefined, the stack size is system
1945dependent.
83fd6c5b 1946
72832460 1947@item @emph{Reference}:
1a6d1d24 1948@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.7
3721b9e1
DF
1949@end table
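The unit rules for @env{OMP_STACKSIZE} can be sketched in C as follows.  This
is an illustration under stated assumptions, not the code libgomp uses: the
function name @code{stacksize_to_bytes} is hypothetical, and accepting
lower-case suffixes and blanks around the suffix is an assumption of this
sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only (not libgomp's parser).
   Converts a stack-size string such as "512", "4M" or "1 G" to bytes.
   A number without a suffix is taken as kilobytes.  Returns 0 on a
   malformed value; overflow is not checked in this sketch.  */
static unsigned long long
stacksize_to_bytes (const char *value)
{
  char *end;
  unsigned long long size, unit = 1024ULL;   /* default: kilobytes */

  assert (value != NULL);
  while (isspace ((unsigned char) *value))
    value++;
  if (!isdigit ((unsigned char) *value))
    return 0;                                 /* no digits found */
  size = strtoull (value, &end, 10);
  while (isspace ((unsigned char) *end))
    end++;
  switch (toupper ((unsigned char) *end))
    {
    case 'B': unit = 1ULL; end++; break;
    case 'K': unit = 1024ULL; end++; break;
    case 'M': unit = 1024ULL * 1024; end++; break;
    case 'G': unit = 1024ULL * 1024 * 1024; end++; break;
    default: break;
    }
  while (isspace ((unsigned char) *end))
    end++;
  if (*end != '\0')
    return 0;                                 /* trailing garbage */
  return size * unit;
}
```

With this sketch, @code{"512"} yields 524288 bytes (512 kilobytes), whereas
@code{"512B"} yields 512 bytes, which mirrors the difference from
@code{pthread_attr_setstacksize} noted above.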
1950
1951
1952
1953@node OMP_SCHEDULE
1954@section @env{OMP_SCHEDULE} -- How threads are scheduled
1955@cindex Environment Variable
14734fc7 1956@cindex Implementation specific setting
3721b9e1
DF
1957@table @asis
1958@item @emph{Description}:
Specifies the schedule type and the chunk size.
The value of the variable shall have the form @code{type[,chunk]}, where
@code{type} is one of @code{static}, @code{dynamic}, @code{guided} or @code{auto}.
The optional @code{chunk} size shall be a positive integer.  If undefined,
dynamic scheduling and a chunk size of 1 are used.
3721b9e1 1964
5c6ed53a
TB
1965@item @emph{See also}:
1966@ref{omp_set_schedule}
1967
1968@item @emph{Reference}:
1a6d1d24 1969@uref{https://www.openmp.org, OpenMP specification v4.5}, Sections 2.7.1.1 and 4.1
5c6ed53a
TB
1970@end table
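The @code{type[,chunk]} form of @env{OMP_SCHEDULE} can be illustrated with a
short C sketch.  It is not libgomp's implementation; the enumeration
@code{schedule_type} and the function @code{parse_schedule} are hypothetical
names, and the sketch only accepts the lower-case spellings listed above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for this sketch; the order matches NAMES below.  */
enum schedule_type
{
  TYPE_STATIC, TYPE_DYNAMIC, TYPE_GUIDED, TYPE_AUTO, TYPE_INVALID
};

/* Hypothetical helper, for illustration only (not libgomp's parser).
   Parses "type[,chunk]".  Stores the chunk size in *CHUNK, or 0 if no
   chunk size was given, and returns the schedule type.  */
static enum schedule_type
parse_schedule (const char *value, long *chunk)
{
  static const char *const names[] = { "static", "dynamic", "guided", "auto" };

  assert (value != NULL && chunk != NULL);
  *chunk = 0;
  for (int i = 0; i < 4; i++)
    {
      size_t len = strlen (names[i]);

      if (strncmp (value, names[i], len) != 0)
        continue;
      if (value[len] == '\0')
        return (enum schedule_type) i;        /* type without chunk */
      if (value[len] == ',')
        {
          char *end;
          long c = strtol (value + len + 1, &end, 10);

          /* The chunk size must be a positive integer.  */
          if (end != value + len + 1 && *end == '\0' && c > 0)
            {
              *chunk = c;
              return (enum schedule_type) i;
            }
        }
      return TYPE_INVALID;
    }
  return TYPE_INVALID;
}
```

For example, @code{"dynamic,4"} is parsed as @code{TYPE_DYNAMIC} with a chunk
size of 4, while @code{"guided,0"} is rejected because the chunk size is not
positive.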
1971
1972
1973
1bfc07d1
KCY
1974@node OMP_TARGET_OFFLOAD
1975@section @env{OMP_TARGET_OFFLOAD} -- Controls offloading behaviour
1976@cindex Environment Variable
1977@cindex Implementation specific setting
1978@table @asis
1979@item @emph{Description}:
Specifies the behaviour with regard to offloading code to a device.  This
variable can be set to one of three values: @code{MANDATORY}, @code{DISABLED}
or @code{DEFAULT}.
1983
1984If set to @code{MANDATORY}, the program will terminate with an error if
1985the offload device is not present or is not supported. If set to
1986@code{DISABLED}, then offloading is disabled and all code will run on the
1987host. If set to @code{DEFAULT}, the program will try offloading to the
1988device first, then fall back to running code on the host if it cannot.
1989
1990If undefined, then the program will behave as if @code{DEFAULT} was set.
1991
1992@item @emph{Reference}:
1993@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.17
1994@end table
1995
1996
1997
5c6ed53a 1998@node OMP_THREAD_LIMIT
6a2ba183 1999@section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads
5c6ed53a
TB
2000@cindex Environment Variable
2001@table @asis
2002@item @emph{Description}:
Specifies the maximum number of threads to use for the whole program.  The
value of this variable shall be a positive integer.  If undefined,
the number of threads is not limited.
2006
2007@item @emph{See also}:
83fd6c5b 2008@ref{OMP_NUM_THREADS}, @ref{omp_get_thread_limit}
5c6ed53a
TB
2009
2010@item @emph{Reference}:
1a6d1d24 2011@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.10
5c6ed53a
TB
2012@end table
2013
2014
2015
2016@node OMP_WAIT_POLICY
2017@section @env{OMP_WAIT_POLICY} -- How waiting threads are handled
2018@cindex Environment Variable
2019@table @asis
2020@item @emph{Description}:
Specifies whether waiting threads should be active or passive.  If
the value is @code{PASSIVE}, waiting threads should not consume CPU
power while waiting; if the value is @code{ACTIVE}, they should.
If undefined, threads wait actively for a short time
before waiting passively.
2026
2027@item @emph{See also}:
2028@ref{GOMP_SPINCOUNT}
5c6ed53a
TB
2029
2030@item @emph{Reference}:
1a6d1d24 2031@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.8
3721b9e1
DF
2032@end table
2033
2034
2035
2036@node GOMP_CPU_AFFINITY
2037@section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
2038@cindex Environment Variable
2039@table @asis
2040@item @emph{Description}:
Binds threads to specific CPUs.  The variable should contain a space-separated
or comma-separated list of CPUs.  This list may contain different kinds of
entries: either single CPU numbers in any order, a range of CPUs (M-N)
or a range with some stride (M-N:S).  CPU numbers are zero based.  For example,
@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread
to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
and 14 respectively and then start assigning back from the beginning of
the list.  @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0.
06785a48 2050
f1f3453e 2051There is no libgomp library routine to determine whether a CPU affinity
83fd6c5b 2052specification is in effect. As a workaround, language-specific library
06785a48
DF
2053functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in
2054Fortran, may be used to query the setting of the @code{GOMP_CPU_AFFINITY}
83fd6c5b 2055environment variable. A defined CPU affinity on startup cannot be changed
06785a48
DF
2056or disabled during the runtime of the application.
2057
If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
@env{OMP_PROC_BIND} has a higher precedence.  If neither has been set,
or when @env{OMP_PROC_BIND} is set to
@code{FALSE}, the host system will handle the assignment of threads to CPUs.
2062
2063@item @emph{See also}:
83fd6c5b 2064@ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
3721b9e1
DF
2065@end table
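The list syntax of @env{GOMP_CPU_AFFINITY} can be illustrated with the
following C sketch, which expands a list into the sequence of CPU numbers it
denotes.  It is not the parser used by libgomp; the function name
@code{expand_cpu_list} is hypothetical and the error handling is an
assumption of this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only (not libgomp's parser).
   Expands a CPU list such as "0 3 1-2 4-15:2" into OUT (at most MAX
   entries; further CPUs are dropped) and returns the number of CPUs
   stored, or -1 on a syntax error.  Entries are "N", "M-N" or
   "M-N:S", separated by blanks or commas.  */
static int
expand_cpu_list (const char *list, int *out, int max)
{
  int n = 0;
  const char *p = list;

  assert (list != NULL && out != NULL);
  for (;;)
    {
      char *end;
      long first, last, stride = 1;

      while (*p == ' ' || *p == ',')
        p++;
      if (*p == '\0')
        return n;
      if (!isdigit ((unsigned char) *p))
        return -1;
      first = last = strtol (p, &end, 10);
      p = end;
      if (*p == '-')
        {
          /* Range "M-N", optionally followed by ":S".  */
          if (!isdigit ((unsigned char) p[1]))
            return -1;
          last = strtol (p + 1, &end, 10);
          p = end;
          if (*p == ':')
            {
              if (!isdigit ((unsigned char) p[1]))
                return -1;
              stride = strtol (p + 1, &end, 10);
              p = end;
            }
        }
      if (last < first || stride <= 0)
        return -1;
      for (long cpu = first; cpu <= last && n < max; cpu += stride)
        out[n++] = (int) cpu;
    }
}
```

For the list @code{"0 3 1-2 4-15:2"} the sketch stores the ten CPU numbers
0, 3, 1, 2, 4, 6, 8, 10, 12 and 14.  Following the description above, thread
number @var{i} (counting from zero) would then be bound to entry
@code{i % n} of the expanded list, where @code{n} is the returned count.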
2066
2067
2068
41dbbb37
TS
2069@node GOMP_DEBUG
2070@section @env{GOMP_DEBUG} -- Enable debugging output
2071@cindex Environment Variable
2072@table @asis
2073@item @emph{Description}:
2074Enable debugging output. The variable should be set to @code{0}
2075(disabled, also the default if not set), or @code{1} (enabled).
2076
2077If enabled, some debugging output will be printed during execution.
2078This is currently not specified in more detail, and subject to change.
2079@end table
2080
2081
2082
3721b9e1
DF
2083@node GOMP_STACKSIZE
2084@section @env{GOMP_STACKSIZE} -- Set default thread stack size
2085@cindex Environment Variable
14734fc7 2086@cindex Implementation specific setting
3721b9e1
DF
2087@table @asis
2088@item @emph{Description}:
83fd6c5b 2089Set the default thread stack size in kilobytes. This is different from
5c6ed53a 2090@code{pthread_attr_setstacksize} which gets the number of bytes as an
83fd6c5b
TB
2091argument. If the stack size cannot be set due to system constraints, an
2092error is reported and the initial stack size is left unchanged. If undefined,
7c2b7f45 2093the stack size is system dependent.
3721b9e1 2094
5c6ed53a 2095@item @emph{See also}:
0024f1af 2096@ref{OMP_STACKSIZE}
5c6ed53a 2097
3721b9e1 2098@item @emph{Reference}:
@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
GCC Patches mailing list},
@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
GCC Patches mailing list}
2103@end table
2104
2105
2106
acf0174b
JJ
2107@node GOMP_SPINCOUNT
2108@section @env{GOMP_SPINCOUNT} -- Set the busy-wait spin count
2109@cindex Environment Variable
2110@cindex Implementation specific setting
2111@table @asis
2112@item @emph{Description}:
Determines how long a thread waits actively, consuming CPU power,
before waiting passively without consuming CPU power.  The value may be
either @code{INFINITE} or @code{INFINITY}, to always wait actively, or an
integer which gives the number of spins of the busy-wait loop.  The
integer may optionally be followed by one of the following suffixes acting
as multiplication factors: @code{k} (kilo, thousand), @code{M} (mega,
million), @code{G} (giga, billion), or @code{T} (tera, trillion).
If undefined, 0 is used when @env{OMP_WAIT_POLICY} is @code{PASSIVE},
300,000 is used when @env{OMP_WAIT_POLICY} is undefined and
30 billion is used when @env{OMP_WAIT_POLICY} is @code{ACTIVE}.
If there are more OpenMP threads than available CPUs, 1000 and 100
spins are used for @env{OMP_WAIT_POLICY} being @code{ACTIVE} or
undefined, respectively; unless the @env{GOMP_SPINCOUNT} is lower
or @env{OMP_WAIT_POLICY} is @code{PASSIVE}.
2127
2128@item @emph{See also}:
2129@ref{OMP_WAIT_POLICY}
2130@end table
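The suffix rules for @env{GOMP_SPINCOUNT} can be sketched in C as shown
below.  This is an illustration only and not libgomp's parser: the function
name @code{spincount_value} is hypothetical, and representing an infinite
spin count as @code{ULLONG_MAX} is an assumption made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not libgomp's parser).
   Converts a spin-count string to a number of spins.  "INFINITE" and
   "INFINITY" map to ULLONG_MAX (an assumption of this sketch); an
   integer may carry one of the suffixes k, M, G or T.  On a malformed
   value, *OK is set to 0 and 0 is returned.  Overflow is not checked.  */
static unsigned long long
spincount_value (const char *value, int *ok)
{
  char *end;
  unsigned long long count, factor = 1;

  assert (value != NULL && ok != NULL);
  *ok = 1;
  if (strcmp (value, "INFINITE") == 0 || strcmp (value, "INFINITY") == 0)
    return ULLONG_MAX;
  if (!isdigit ((unsigned char) *value))
    {
      *ok = 0;
      return 0;
    }
  count = strtoull (value, &end, 10);
  switch (*end)
    {
    case 'k': factor = 1000ULL; end++; break;
    case 'M': factor = 1000ULL * 1000; end++; break;
    case 'G': factor = 1000ULL * 1000 * 1000; end++; break;
    case 'T': factor = 1000ULL * 1000 * 1000 * 1000; end++; break;
    default: break;
    }
  if (*end != '\0')
    {
      *ok = 0;                                /* trailing garbage */
      return 0;
    }
  return count * factor;
}
```

In this notation the documented defaults of 300,000 and 30 billion spins
correspond to the strings @code{"300k"} and @code{"30G"}.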
2131
2132
2133
06441dd5
SH
2134@node GOMP_RTEMS_THREAD_POOLS
2135@section @env{GOMP_RTEMS_THREAD_POOLS} -- Set the RTEMS specific thread pools
2136@cindex Environment Variable
2137@cindex Implementation specific setting
2138@table @asis
2139@item @emph{Description}:
2140This environment variable is only used on the RTEMS real-time operating system.
2141It determines the scheduler instance specific thread pools. The format for
2142@env{GOMP_RTEMS_THREAD_POOLS} is a list of optional
2143@code{<thread-pool-count>[$<priority>]@@<scheduler-name>} configurations
2144separated by @code{:} where:
2145@itemize @bullet
2146@item @code{<thread-pool-count>} is the thread pool count for this scheduler
2147instance.
2148@item @code{$<priority>} is an optional priority for the worker threads of a
2149thread pool according to @code{pthread_setschedparam}. In case a priority
2150value is omitted, then a worker thread will inherit the priority of the OpenMP
432de084
TB
2151primary thread that created it. The priority of the worker thread is not
2152changed after creation, even if a new OpenMP primary thread using the worker has
06441dd5
SH
2153a different priority.
2154@item @code{@@<scheduler-name>} is the scheduler instance name according to the
2155RTEMS application configuration.
2156@end itemize
2157In case no thread pool configuration is specified for a scheduler instance,
432de084 2158then each OpenMP primary thread of this scheduler instance will use its own
06441dd5 2159dynamically allocated thread pool. To limit the worker thread count of the
432de084 2160thread pools, each OpenMP primary thread must call @code{omp_set_num_threads}.
@item @emph{Example}:
Suppose we have three scheduler instances @code{IO}, @code{WRK0}, and
@code{WRK1} with @env{GOMP_RTEMS_THREAD_POOLS} set to
@code{"1@@WRK0:3$4@@WRK1"}.  Then there are no thread pool restrictions for
scheduler instance @code{IO}.  In the scheduler instance @code{WRK0} there is
one thread pool available.  Since no priority is specified for this scheduler
instance, the worker thread inherits the priority of the OpenMP primary thread
that created it.  In the scheduler instance @code{WRK1} there are three thread
pools available and their worker threads run at priority four.
2170@end table
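The configuration format of @env{GOMP_RTEMS_THREAD_POOLS} can be illustrated
with a small C sketch that splits a value into its individual
configurations.  It is not the code used by libgomp on RTEMS; the type
@code{pool_config} and the functions @code{parse_pool_config} and
@code{parse_pool_list} are hypothetical, as is the use of @code{-1} to mark
an omitted priority.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical type for this sketch.  */
struct pool_config
{
  int count;            /* thread pool count */
  int priority;         /* worker priority, or -1 if omitted (inherit) */
  char scheduler[32];   /* scheduler instance name */
};

/* Hypothetical helper, for illustration only (not libgomp's parser).
   Parses one "<count>[$<priority>]@<scheduler>" configuration and
   returns the number of characters consumed, or -1 on a syntax error.  */
static int
parse_pool_config (const char *config, struct pool_config *pc)
{
  int used = 0;

  assert (config != NULL && pc != NULL);
  /* First try the form with a priority, then the form without.  */
  if (sscanf (config, "%d$%d@%31[^:]%n", &pc->count, &pc->priority,
              pc->scheduler, &used) == 3)
    return used;
  pc->priority = -1;
  if (sscanf (config, "%d@%31[^:]%n", &pc->count, pc->scheduler, &used) == 2)
    return used;
  return -1;
}

/* Parses a ':'-separated list of configurations into OUT (at most MAX
   entries).  Returns the number of entries, or -1 on a syntax error.  */
static int
parse_pool_list (const char *list, struct pool_config *out, int max)
{
  int n = 0;

  while (*list != '\0' && n < max)
    {
      int used = parse_pool_config (list, &out[n]);

      if (used < 0)
        return -1;
      n++;
      list += used;
      if (*list == ':')
        list++;
    }
  return n;
}
```

Applied to the value @code{"1@@WRK0:3$4@@WRK1"} from the example above, the
sketch yields two entries: one thread pool for @code{WRK0} without a
priority, and three thread pools for @code{WRK1} at priority four.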
2171
2172
2173
cdf6119d
JN
2174@c ---------------------------------------------------------------------
2175@c Enabling OpenACC
2176@c ---------------------------------------------------------------------
2177
2178@node Enabling OpenACC
2179@chapter Enabling OpenACC
2180
2181To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
2182flag @option{-fopenacc} must be specified. This enables the OpenACC directive
c1030b5c 2183@code{#pragma acc} in C/C++ and @code{!$acc} directives in free form,
cdf6119d
JN
2184@code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
2185@code{!$} conditional compilation sentinels in free form and @code{c$},
2186@code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
2187arranges for automatic linking of the OpenACC runtime library
2188(@ref{OpenACC Runtime Library Routines}).
2189
8d1a1cb1
TB
2190See @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
2191
cdf6119d 2192A complete description of all OpenACC directives accepted may be found in
9651fbaf 2193the @uref{https://www.openacc.org, OpenACC} Application Programming
e464fc90 2194Interface manual, version 2.6.
cdf6119d 2195
cdf6119d
JN
2196
2197
2198@c ---------------------------------------------------------------------
2199@c OpenACC Runtime Library Routines
2200@c ---------------------------------------------------------------------
2201
2202@node OpenACC Runtime Library Routines
2203@chapter OpenACC Runtime Library Routines
2204
2205The runtime routines described here are defined by section 3 of the OpenACC
e464fc90 2206specifications in version 2.6.
cdf6119d
JN
2207They have C linkage, and do not throw exceptions.
2208Generally, they are available only for the host, with the exception of
2209@code{acc_on_device}, which is available for both the host and the
2210acceleration device.
2211
2212@menu
2213* acc_get_num_devices:: Get number of devices for the given device
2214 type.
2215* acc_set_device_type:: Set type of device accelerator to use.
2216* acc_get_device_type:: Get type of device accelerator to be used.
2217* acc_set_device_num:: Set device number to use.
2218* acc_get_device_num:: Get device number to be used.
6c84c8bf 2219* acc_get_property:: Get device property.
cdf6119d
JN
2220* acc_async_test:: Tests for completion of a specific asynchronous
2221 operation.
c1030b5c 2222* acc_async_test_all:: Tests for completion of all asynchronous
cdf6119d
JN
2223 operations.
2224* acc_wait:: Wait for completion of a specific asynchronous
2225 operation.
c1030b5c 2226* acc_wait_all:: Waits for completion of all asynchronous
cdf6119d
JN
2227 operations.
2228* acc_wait_all_async:: Wait for completion of all asynchronous
2229 operations.
2230* acc_wait_async:: Wait for completion of asynchronous operations.
2231* acc_init:: Initialize runtime for a specific device type.
2232* acc_shutdown:: Shuts down the runtime for a specific device
2233 type.
2234* acc_on_device:: Whether executing on a particular device
2235* acc_malloc:: Allocate device memory.
2236* acc_free:: Free device memory.
2237* acc_copyin:: Allocate device memory and copy host memory to
2238 it.
2239* acc_present_or_copyin:: If the data is not present on the device,
2240 allocate device memory and copy from host
2241 memory.
2242* acc_create:: Allocate device memory and map it to host
2243 memory.
2244* acc_present_or_create:: If the data is not present on the device,
2245 allocate device memory and map it to host
2246 memory.
2247* acc_copyout:: Copy device memory to host memory.
2248* acc_delete:: Free device memory.
2249* acc_update_device:: Update device memory from mapped host memory.
2250* acc_update_self:: Update host memory from mapped device memory.
2251* acc_map_data:: Map previously allocated device memory to host
2252 memory.
2253* acc_unmap_data:: Unmap device memory from host memory.
2254* acc_deviceptr:: Get device pointer associated with specific
2255 host address.
2256* acc_hostptr:: Get host pointer associated with specific
2257 device address.
93d90219 2258* acc_is_present:: Indicate whether host variable / array is
cdf6119d
JN
2259 present on device.
2260* acc_memcpy_to_device:: Copy host memory to device memory.
2261* acc_memcpy_from_device:: Copy device memory to host memory.
e464fc90
TB
2262* acc_attach:: Let device pointer point to device-pointer target.
2263* acc_detach:: Let device pointer point to host-pointer target.
cdf6119d
JN
2264
2265API routines for target platforms.
2266
2267* acc_get_current_cuda_device:: Get CUDA device handle.
2268* acc_get_current_cuda_context::Get CUDA context handle.
2269* acc_get_cuda_stream:: Get CUDA stream handle.
2270* acc_set_cuda_stream:: Set CUDA stream handle.
5fae049d
TS
2271
2272API routines for the OpenACC Profiling Interface.
2273
2274* acc_prof_register:: Register callbacks.
2275* acc_prof_unregister:: Unregister callbacks.
2276* acc_prof_lookup:: Obtain inquiry functions.
2277* acc_register_library:: Library registration.
cdf6119d
JN
2278@end menu
2279
2280
2281
2282@node acc_get_num_devices
2283@section @code{acc_get_num_devices} -- Get number of devices for given device type
2284@table @asis
@item @emph{Description}:
2286This function returns a value indicating the number of devices available
2287for the device type specified in @var{devicetype}.
2288
2289@item @emph{C/C++}:
2290@multitable @columnfractions .20 .80
2291@item @emph{Prototype}: @tab @code{int acc_get_num_devices(acc_device_t devicetype);}
2292@end multitable
2293
2294@item @emph{Fortran}:
2295@multitable @columnfractions .20 .80
2296@item @emph{Interface}: @tab @code{integer function acc_get_num_devices(devicetype)}
2297@item @tab @code{integer(kind=acc_device_kind) devicetype}
2298@end multitable
2299
2300@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.1.
2303@end table
2304
2305
2306
2307@node acc_set_device_type
2308@section @code{acc_set_device_type} -- Set type of device accelerator to use.
2309@table @asis
@item @emph{Description}:
This function indicates to the runtime library which device type, specified
in @var{devicetype}, to use when executing a parallel or kernels region.
2313
2314@item @emph{C/C++}:
2315@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void acc_set_device_type(acc_device_t devicetype);}
2317@end multitable
2318
2319@item @emph{Fortran}:
2320@multitable @columnfractions .20 .80
2321@item @emph{Interface}: @tab @code{subroutine acc_set_device_type(devicetype)}
2322@item @tab @code{integer(kind=acc_device_kind) devicetype}
2323@end multitable
2324
2325@item @emph{Reference}:
e464fc90 2326@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
23273.2.2.
2328@end table
2329
2330
2331
2332@node acc_get_device_type
2333@section @code{acc_get_device_type} -- Get type of device accelerator to be used.
2334@table @asis
2335@item @emph{Description}
2336This function returns what device type will be used when executing a
2337parallel or kernels region.
2338
b52643ab
KCY
2339This function returns @code{acc_device_none} if
2340@code{acc_get_device_type} is called from
2341@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
2342callbacks of the OpenACC Profiling Interface (@ref{OpenACC Profiling
2343Interface}), that is, if the device is currently being initialized.
2344
cdf6119d
JN
2345@item @emph{C/C++}:
2346@multitable @columnfractions .20 .80
2347@item @emph{Prototype}: @tab @code{acc_device_t acc_get_device_type(void);}
2348@end multitable
2349
2350@item @emph{Fortran}:
2351@multitable @columnfractions .20 .80
2352@item @emph{Interface}: @tab @code{function acc_get_device_type()}
2353@item @tab @code{integer(kind=acc_device_kind) acc_get_device_type}
2354@end multitable
2355
2356@item @emph{Reference}:
e464fc90 2357@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
23583.2.3.
2359@end table
2360
2361
2362
2363@node acc_set_device_num
2364@section @code{acc_set_device_num} -- Set device number to use.
2365@table @asis
2366@item @emph{Description}
2367This function indicates to the runtime which device number,
8d1a1cb1 2368specified by @var{devicenum}, to use for the specified device
cdf6119d
JN
2369type @var{devicetype}.
2370
2371@item @emph{C/C++}:
2372@multitable @columnfractions .20 .80
8d1a1cb1 2373@item @emph{Prototype}: @tab @code{void acc_set_device_num(int devicenum, acc_device_t devicetype);}
cdf6119d
JN
2374@end multitable
2375
2376@item @emph{Fortran}:
2377@multitable @columnfractions .20 .80
2378@item @emph{Interface}: @tab @code{subroutine acc_set_device_num(devicenum, devicetype)}
2379@item @tab @code{integer devicenum}
2380@item @tab @code{integer(kind=acc_device_kind) devicetype}
2381@end multitable
2382
2383@item @emph{Reference}:
e464fc90 2384@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
23853.2.4.
2386@end table
2387
2388
2389
2390@node acc_get_device_num
2391@section @code{acc_get_device_num} -- Get device number to be used.
2392@table @asis
2393@item @emph{Description}
2394This function returns the device number, associated with the specified
2395device type @var{devicetype}, that will be used when executing a parallel
2396or kernels region.
2397
2398@item @emph{C/C++}:
2399@multitable @columnfractions .20 .80
2400@item @emph{Prototype}: @tab @code{int acc_get_device_num(acc_device_t devicetype);}
2401@end multitable
2402
2403@item @emph{Fortran}:
2404@multitable @columnfractions .20 .80
2405@item @emph{Interface}: @tab @code{function acc_get_device_num(devicetype)}
2406@item @tab @code{integer(kind=acc_device_kind) devicetype}
2407@item @tab @code{integer acc_get_device_num}
2408@end multitable
2409
2410@item @emph{Reference}:
e464fc90 2411@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
24123.2.5.
2413@end table
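
The set and get routines for device numbers can be combined as in the following sketch. It assumes that device numbers run from 0 to one less than the value returned by @code{acc_get_num_devices}; the helper name @code{select_device} and the host-only stand-ins under @code{#else} are hypothetical and not part of libgomp.

```c
#include <assert.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins so the sketch builds without -fopenacc.
   They are NOT part of libgomp.  */
typedef enum { acc_device_host, acc_device_not_host } acc_device_t;
static int shim_device_num;   /* a single slot is enough for this sketch */
static int acc_get_num_devices(acc_device_t t)
{
  return t == acc_device_host ? 1 : 0;
}
static void acc_set_device_num(int n, acc_device_t t)
{
  (void) t;
  shim_device_num = n;
}
static int acc_get_device_num(acc_device_t t)
{
  (void) t;
  return shim_device_num;
}
#endif

/* Select device DEVICENUM of type DEVICETYPE if such a device exists.
   Return the device number the runtime reports afterwards, or -1 if
   DEVICENUM is out of range (assumed range: 0 .. count-1).  */
int select_device(int devicenum, acc_device_t devicetype)
{
  int count = acc_get_num_devices(devicetype);

  assert(count >= 0);
  if (devicenum < 0 || devicenum >= count)
    return -1;
  acc_set_device_num(devicenum, devicetype);
  return acc_get_device_num(devicetype);
}
```

Checking the range first avoids passing a device number that the runtime cannot honour.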
2414
2415
2416
6c84c8bf
MR
2417@node acc_get_property
2418@section @code{acc_get_property} -- Get device property.
2419@cindex acc_get_property
2420@cindex acc_get_property_string
2421@table @asis
2422@item @emph{Description}
2423These routines return the value of the specified @var{property} for the
2424device being queried according to @var{devicenum} and @var{devicetype}.
2425Integer-valued and string-valued properties are returned by
2426@code{acc_get_property} and @code{acc_get_property_string} respectively.
2427The Fortran @code{acc_get_property_string} subroutine returns the string
2428retrieved in its fourth argument while the remaining entry points are
2429functions, which pass the return value as their result.
2430
8d1a1cb1
TB
2431Note for Fortran only: the OpenACC technical committee corrected and, hence,
2432modified the interface introduced in OpenACC 2.6. The kind-value parameter
2433@code{acc_device_property} has been renamed to @code{acc_device_property_kind}
2434for consistency and the return type of the @code{acc_get_property} function is
2435now a @code{c_size_t} integer instead of an @code{acc_device_property} integer.
2436The parameter @code{acc_device_property} will continue to be provided,
2437but might be removed in a future version of GCC.
2438
6c84c8bf
MR
2439@item @emph{C/C++}:
2440@multitable @columnfractions .20 .80
2441@item @emph{Prototype}: @tab @code{size_t acc_get_property(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
2442@item @emph{Prototype}: @tab @code{const char *acc_get_property_string(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
2443@end multitable
2444
2445@item @emph{Fortran}:
2446@multitable @columnfractions .20 .80
2447@item @emph{Interface}: @tab @code{function acc_get_property(devicenum, devicetype, property)}
2448@item @emph{Interface}: @tab @code{subroutine acc_get_property_string(devicenum, devicetype, property, string)}
8d1a1cb1 2449@item @tab @code{use ISO_C_Binding, only: c_size_t}
6c84c8bf
MR
2450@item @tab @code{integer devicenum}
2451@item @tab @code{integer(kind=acc_device_kind) devicetype}
8d1a1cb1
TB
2452@item @tab @code{integer(kind=acc_device_property_kind) property}
2453@item @tab @code{integer(kind=c_size_t) acc_get_property}
6c84c8bf
MR
2454@item @tab @code{character(*) string}
2455@end multitable
2456
2457@item @emph{Reference}:
2458@uref{https://www.openacc.org, OpenACC specification v2.6}, section
24593.2.6.
2460@end table
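
A string-valued property can be queried as in the sketch below. It assumes the property enumerator @code{acc_property_name} from the OpenACC 2.6 specification and treats a null result as "property not available"; the helper name @code{device_name} and the stand-ins under @code{#else} are hypothetical and not part of libgomp.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins so the sketch builds without -fopenacc.
   They are NOT part of libgomp.  */
typedef enum { acc_device_host, acc_device_not_host } acc_device_t;
typedef enum { acc_property_name } acc_device_property_t;
static const char *acc_get_property_string(int devicenum,
                                           acc_device_t devicetype,
                                           acc_device_property_t property)
{
  (void) devicenum;
  (void) property;
  return devicetype == acc_device_host ? "host (stand-in)" : NULL;
}
#endif

/* Return a printable name for the given device, falling back to a
   placeholder when the runtime does not provide one.  */
const char *device_name(int devicenum, acc_device_t devicetype)
{
  const char *name;

  assert(devicenum >= 0);
  name = acc_get_property_string(devicenum, devicetype, acc_property_name);
  return name != NULL ? name : "(unknown)";
}
```

The returned pointer is only read here; the sketch makes no assumption about who owns the string.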
2461
2462
2463
cdf6119d
JN
2464@node acc_async_test
2465@section @code{acc_async_test} -- Test for completion of a specific asynchronous operation.
2466@table @asis
2467@item @emph{Description}
93d90219 2468This function tests for completion of the asynchronous operation specified
cdf6119d
JN
2469in @var{arg}. If the specified asynchronous operation has completed,
2470C/C++ returns a non-zero value and Fortran returns @code{true}.
93d90219 2471If the asynchronous operation has not completed, C/C++ returns
cdf6119d
JN
2472zero and Fortran returns @code{false}.
2473
2474@item @emph{C/C++}:
2475@multitable @columnfractions .20 .80
2476@item @emph{Prototype}: @tab @code{int acc_async_test(int arg);}
2477@end multitable
2478
2479@item @emph{Fortran}:
2480@multitable @columnfractions .20 .80
2481@item @emph{Interface}: @tab @code{function acc_async_test(arg)}
2482@item @tab @code{integer(kind=acc_handle_kind) arg}
2483@item @tab @code{logical acc_async_test}
2484@end multitable
2485
2486@item @emph{Reference}:
e464fc90
TB
2487@uref{https://www.openacc.org, OpenACC specification v2.6}, section
24883.2.9.
cdf6119d
JN
2489@end table
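
One possible use of @code{acc_async_test} is to overlap host work with an asynchronous compute region, as sketched below. The queue number 1, the polling bound and the helper name @code{scale_async} are arbitrary choices for illustration; the stand-ins under @code{#else} are hypothetical and not part of libgomp. The sketch also calls @code{acc_wait} (@pxref{acc_wait}) to block until the queue has drained.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins so the sketch builds without -fopenacc.
   Without OpenACC the loop below runs synchronously on the host, so every
   queue is always "complete".  They are NOT part of libgomp.  */
static int acc_async_test(int arg) { (void) arg; return 1; }
static void acc_wait(int arg) { (void) arg; }
#endif

/* Scale X in place on asynchronous queue 1.  Poll the queue a bounded
   number of times, then block until it has drained.  Return how many
   polls found the queue still busy.  */
int scale_async(float *x, int n, float factor)
{
  int busy_polls = 0;

  assert(x != NULL && n >= 0);

#pragma acc parallel loop copy(x[0:n]) async(1)
  for (int i = 0; i < n; i++)
    x[i] *= factor;

  /* Non-zero from acc_async_test means queue 1 has completed.  */
  for (int tries = 0; tries < 1000 && !acc_async_test(1); tries++)
    busy_polls++;   /* useful host-side work would go here */

  acc_wait(1);      /* x holds the results only after this returns */
  return busy_polls;
}
```

Polling is bounded so that the function always reaches the blocking wait, even if the device is slow.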
2490
2491
2492
2493@node acc_async_test_all
2494@section @code{acc_async_test_all} -- Tests for completion of all asynchronous operations.
2495@table @asis
2496@item @emph{Description}
93d90219 2497This function tests for completion of all asynchronous operations.
cdf6119d
JN
2498If all asynchronous operations have completed, C/C++ returns a non-zero
2499value and Fortran returns @code{true}. If
2500any asynchronous operation has not completed, C/C++ returns zero and
2501Fortran returns @code{false}.
2502
2503@item @emph{C/C++}:
2504@multitable @columnfractions .20 .80
2505@item @emph{Prototype}: @tab @code{int acc_async_test_all(void);}
2506@end multitable
2507
2508@item @emph{Fortran}:
2509@multitable @columnfractions .20 .80
2510@item @emph{Interface}: @tab @code{function acc_async_test_all()}
2511@item @tab @code{logical acc_async_test_all}
2512@end multitable
2513
2514@item @emph{Reference}:
e464fc90
TB
2515@uref{https://www.openacc.org, OpenACC specification v2.6}, section
25163.2.10.
cdf6119d
JN
2517@end table
2518
2519
2520
2521@node acc_wait
2522@section @code{acc_wait} -- Wait for completion of a specific asynchronous operation.
2523@table @asis
2524@item @emph{Description}
2525This function waits for completion of the asynchronous operation
2526specified in @var{arg}.
2527
2528@item @emph{C/C++}:
2529@multitable @columnfractions .20 .80
2530@item @emph{Prototype}: @tab @code{void acc_wait(int arg);}
7ce64403 2531@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{void acc_async_wait(int arg);}
cdf6119d
JN
2532@end multitable
2533
2534@item @emph{Fortran}:
2535@multitable @columnfractions .20 .80
2536@item @emph{Interface}: @tab @code{subroutine acc_wait(arg)}
2537@item @tab @code{integer(acc_handle_kind) arg}
7ce64403
TS
2538@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait(arg)}
2539@item @tab @code{integer(acc_handle_kind) arg}
cdf6119d
JN
2540@end multitable
2541
2542@item @emph{Reference}:
e464fc90
TB
2543@uref{https://www.openacc.org, OpenACC specification v2.6}, section
25443.2.11.
cdf6119d
JN
2545@end table
2546
2547
2548
2549@node acc_wait_all
2550@section @code{acc_wait_all} -- Waits for completion of all asynchronous operations.
2551@table @asis
2552@item @emph{Description}
2553This function waits for the completion of all asynchronous operations.
2554
2555@item @emph{C/C++}:
2556@multitable @columnfractions .20 .80
2557@item @emph{Prototype}: @tab @code{void acc_wait_all(void);}
7ce64403 2558@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{void acc_async_wait_all(void);}
cdf6119d
JN
2559@end multitable
2560
2561@item @emph{Fortran}:
2562@multitable @columnfractions .20 .80
7ce64403
TS
2563@item @emph{Interface}: @tab @code{subroutine acc_wait_all()}
2564@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait_all()}
cdf6119d
JN
2565@end multitable
2566
2567@item @emph{Reference}:
e464fc90
TB
2568@uref{https://www.openacc.org, OpenACC specification v2.6}, section
25693.2.13.
cdf6119d
JN
2570@end table
2571
2572
2573
2574@node acc_wait_all_async
2575@section @code{acc_wait_all_async} -- Wait for completion of all asynchronous operations.
2576@table @asis
2577@item @emph{Description}
2578This function enqueues a wait operation on the queue @var{async} for any
2579and all asynchronous operations that have been previously enqueued on
2580any queue.
2581
2582@item @emph{C/C++}:
2583@multitable @columnfractions .20 .80
2584@item @emph{Prototype}: @tab @code{void acc_wait_all_async(int async);}
2585@end multitable
2586
2587@item @emph{Fortran}:
2588@multitable @columnfractions .20 .80
2589@item @emph{Interface}: @tab @code{subroutine acc_wait_all_async(async)}
2590@item @tab @code{integer(acc_handle_kind) async}
2591@end multitable
2592
2593@item @emph{Reference}:
e464fc90
TB
2594@uref{https://www.openacc.org, OpenACC specification v2.6}, section
25953.2.14.
cdf6119d
JN
2596@end table
2597
2598
2599
2600@node acc_wait_async
2601@section @code{acc_wait_async} -- Wait for completion of asynchronous operations.
2602@table @asis
2603@item @emph{Description}
2604This function enqueues a wait operation on queue @var{async} for any and all
2605asynchronous operations enqueued on queue @var{arg}.
2606
2607@item @emph{C/C++}:
2608@multitable @columnfractions .20 .80
2609@item @emph{Prototype}: @tab @code{void acc_wait_async(int arg, int async);}
2610@end multitable
2611
2612@item @emph{Fortran}:
2613@multitable @columnfractions .20 .80
2614@item @emph{Interface}: @tab @code{subroutine acc_wait_async(arg, async)}
2615@item @tab @code{integer(acc_handle_kind) arg, async}
2616@end multitable
2617
2618@item @emph{Reference}:
e464fc90
TB
2619@uref{https://www.openacc.org, OpenACC specification v2.6}, section
26203.2.12.
cdf6119d
JN
2621@end table
2622
2623
2624
2625@node acc_init
2626@section @code{acc_init} -- Initialize runtime for a specific device type.
2627@table @asis
2628@item @emph{Description}
2629This function initializes the runtime for the device type specified in
2630@var{devicetype}.
2631
2632@item @emph{C/C++}:
2633@multitable @columnfractions .20 .80
2634@item @emph{Prototype}: @tab @code{void acc_init(acc_device_t devicetype);}
2635@end multitable
2636
2637@item @emph{Fortran}:
2638@multitable @columnfractions .20 .80
2639@item @emph{Interface}: @tab @code{subroutine acc_init(devicetype)}
2640@item @tab @code{integer(acc_device_kind) devicetype}
2641@end multitable
2642
2643@item @emph{Reference}:
e464fc90
TB
2644@uref{https://www.openacc.org, OpenACC specification v2.6}, section
26453.2.7.
cdf6119d
JN
2646@end table
2647
2648
2649
2650@node acc_shutdown
2651@section @code{acc_shutdown} -- Shuts down the runtime for a specific device type.
2652@table @asis
2653@item @emph{Description}
2654This function shuts down the runtime for the device type specified in
2655@var{devicetype}.
2656
2657@item @emph{C/C++}:
2658@multitable @columnfractions .20 .80
2659@item @emph{Prototype}: @tab @code{void acc_shutdown(acc_device_t devicetype);}
2660@end multitable
2661
2662@item @emph{Fortran}:
2663@multitable @columnfractions .20 .80
2664@item @emph{Interface}: @tab @code{subroutine acc_shutdown(devicetype)}
2665@item @tab @code{integer(acc_device_kind) devicetype}
2666@end multitable
2667
2668@item @emph{Reference}:
e464fc90
TB
2669@uref{https://www.openacc.org, OpenACC specification v2.6}, section
26703.2.8.
cdf6119d
JN
2671@end table
2672
2673
2674
2675@node acc_on_device
2676@section @code{acc_on_device} -- Whether executing on a particular device
2677@table @asis
2678@item @emph{Description}
2679This function returns whether the program is executing on a particular
2680device specified in @var{devicetype}. In C/C++ a non-zero value is
93d90219 2681returned to indicate the program is executing on the specified device type.
cdf6119d
JN
2682In Fortran, @code{true} will be returned. If the program is not executing
2683on the specified device type C/C++ will return a zero, while Fortran will
2684return @code{false}.
2685
2686@item @emph{C/C++}:
2687@multitable @columnfractions .20 .80
2688@item @emph{Prototype}: @tab @code{int acc_on_device(acc_device_t devicetype);}
2689@end multitable
2690
2691@item @emph{Fortran}:
2692@multitable @columnfractions .20 .80
2693@item @emph{Interface}: @tab @code{function acc_on_device(devicetype)}
2694@item @tab @code{integer(acc_device_kind) devicetype}
2695@item @tab @code{logical acc_on_device}
2696@end multitable
2697
2698
2699@item @emph{Reference}:
e464fc90
TB
2700@uref{https://www.openacc.org, OpenACC specification v2.6}, section
27013.2.17.
cdf6119d
JN
2702@end table
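
The sketch below uses @code{acc_on_device} inside a compute region to find out whether that region was actually offloaded. The helper name @code{ran_on_accelerator} is hypothetical, and the stand-ins under @code{#else} (an assumption, not part of libgomp) make the code behave as a pure host program when built without @option{-fopenacc}.

```c
#include <assert.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins so the sketch builds without -fopenacc.
   They are NOT part of libgomp.  */
typedef enum { acc_device_host, acc_device_not_host } acc_device_t;
static int acc_on_device(acc_device_t devicetype)
{
  return devicetype == acc_device_host;
}
#endif

/* Return 1 if the compute region below executed on a non-host device,
   0 if it executed on the host.  */
int ran_on_accelerator(void)
{
  int offloaded = 0;

#pragma acc parallel num_gangs(1) copy(offloaded)
  {
    /* Normalize the non-zero "true" value to exactly 1.  */
    offloaded = acc_on_device(acc_device_not_host) != 0;
  }

  assert(offloaded == 0 || offloaded == 1);
  return offloaded;
}
```

Because the query runs inside the region, it reports where the region executed rather than where the calling code runs.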
2703
2704
2705
2706@node acc_malloc
2707@section @code{acc_malloc} -- Allocate device memory.
2708@table @asis
2709@item @emph{Description}
2710This function allocates @var{len} bytes of device memory. It returns
2711the device address of the allocated memory.
2712
2713@item @emph{C/C++}:
2714@multitable @columnfractions .20 .80
2715@item @emph{Prototype}: @tab @code{d_void* acc_malloc(size_t len);}
2716@end multitable
2717
2718@item @emph{Reference}:
e464fc90
TB
2719@uref{https://www.openacc.org, OpenACC specification v2.6}, section
27203.2.18.
cdf6119d
JN
2721@end table
2722
2723
2724
2725@node acc_free
2726@section @code{acc_free} -- Free device memory.
2727@table @asis
2728@item @emph{Description}
2729This function frees previously allocated device memory at the device address @var{a}.
2730
2731@item @emph{C/C++}:
2732@multitable @columnfractions .20 .80
2733@item @emph{Prototype}: @tab @code{void acc_free(d_void *a);}
2734@end multitable
2735
2736@item @emph{Reference}:
e464fc90
TB
2737@uref{https://www.openacc.org, OpenACC specification v2.6}, section
27383.2.19.
cdf6119d
JN
2739@end table
2740
2741
2742
2743@node acc_copyin
2744@section @code{acc_copyin} -- Allocate device memory and copy host memory to it.
2745@table @asis
2746@item @emph{Description}
2747In C/C++, this function allocates @var{len} bytes of device memory,
2748maps it to the specified host address in @var{a}, and copies the host
2749data to it. The device address of the newly allocated device memory is returned.
2750
2751In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2752a contiguous array section. In the second form, @var{a} specifies a
2753variable or array element and @var{len} specifies the length in bytes.
2754
2755@item @emph{C/C++}:
2756@multitable @columnfractions .20 .80
2757@item @emph{Prototype}: @tab @code{void *acc_copyin(h_void *a, size_t len);}
e464fc90 2758@item @emph{Prototype}: @tab @code{void *acc_copyin_async(h_void *a, size_t len, int async);}
cdf6119d
JN
2759@end multitable
2760
2761@item @emph{Fortran}:
2762@multitable @columnfractions .20 .80
2763@item @emph{Interface}: @tab @code{subroutine acc_copyin(a)}
2764@item @tab @code{type, dimension(:[,:]...) :: a}
2765@item @emph{Interface}: @tab @code{subroutine acc_copyin(a, len)}
2766@item @tab @code{type, dimension(:[,:]...) :: a}
2767@item @tab @code{integer len}
e464fc90
TB
2768@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, async)}
2769@item @tab @code{type, dimension(:[,:]...) :: a}
2770@item @tab @code{integer(acc_handle_kind) :: async}
2771@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, len, async)}
2772@item @tab @code{type, dimension(:[,:]...) :: a}
2773@item @tab @code{integer len}
2774@item @tab @code{integer(acc_handle_kind) :: async}
cdf6119d
JN
2775@end multitable
2776
2777@item @emph{Reference}:
e464fc90
TB
2778@uref{https://www.openacc.org, OpenACC specification v2.6}, section
27793.2.20.
cdf6119d
JN
2780@end table
2781
2782
2783
2784@node acc_present_or_copyin
2785@section @code{acc_present_or_copyin} -- If the data is not present on the device, allocate device memory and copy from host memory.
2786@table @asis
2787@item @emph{Description}
c1030b5c 2788This function tests if the host data specified by @var{a} and of length
cdf6119d
JN
2789@var{len} is present or not. If it is not present, then device memory
2790will be allocated and the host memory copied. The device address of
2791the newly allocated device memory is returned.
2792
2793In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2794a contiguous array section. In the second form, @var{a} specifies a variable or
2795array element and @var{len} specifies the length in bytes.
2796
e464fc90
TB
2797Note that @code{acc_present_or_copyin} and @code{acc_pcopyin} exist for
2798backward compatibility with OpenACC 2.0; use @ref{acc_copyin} instead.
2799
cdf6119d
JN
2800@item @emph{C/C++}:
2801@multitable @columnfractions .20 .80
2802@item @emph{Prototype}: @tab @code{void *acc_present_or_copyin(h_void *a, size_t len);}
2803@item @emph{Prototype}: @tab @code{void *acc_pcopyin(h_void *a, size_t len);}
2804@end multitable
2805
2806@item @emph{Fortran}:
2807@multitable @columnfractions .20 .80
2808@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a)}
2809@item @tab @code{type, dimension(:[,:]...) :: a}
2810@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a, len)}
2811@item @tab @code{type, dimension(:[,:]...) :: a}
2812@item @tab @code{integer len}
2813@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a)}
2814@item @tab @code{type, dimension(:[,:]...) :: a}
2815@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a, len)}
2816@item @tab @code{type, dimension(:[,:]...) :: a}
2817@item @tab @code{integer len}
2818@end multitable
2819
2820@item @emph{Reference}:
e464fc90
TB
2821@uref{https://www.openacc.org, OpenACC specification v2.6}, section
28223.2.20.
cdf6119d
JN
2823@end table
2824
2825
2826
2827@node acc_create
2828@section @code{acc_create} -- Allocate device memory and map it to host memory.
2829@table @asis
2830@item @emph{Description}
2831This function allocates device memory and maps it to host memory specified
2832by the host address @var{a} with a length of @var{len} bytes. In C/C++,
2833the function returns the device address of the allocated device memory.
2834
2835In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2836a contiguous array section. In the second form, @var{a} specifies a variable or
2837array element and @var{len} specifies the length in bytes.
2838
2839@item @emph{C/C++}:
2840@multitable @columnfractions .20 .80
2841@item @emph{Prototype}: @tab @code{void *acc_create(h_void *a, size_t len);}
e464fc90 2842@item @emph{Prototype}: @tab @code{void *acc_create_async(h_void *a, size_t len, int async);}
cdf6119d
JN
2843@end multitable
2844
2845@item @emph{Fortran}:
2846@multitable @columnfractions .20 .80
2847@item @emph{Interface}: @tab @code{subroutine acc_create(a)}
2848@item @tab @code{type, dimension(:[,:]...) :: a}
2849@item @emph{Interface}: @tab @code{subroutine acc_create(a, len)}
2850@item @tab @code{type, dimension(:[,:]...) :: a}
2851@item @tab @code{integer len}
e464fc90
TB
2852@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, async)}
2853@item @tab @code{type, dimension(:[,:]...) :: a}
2854@item @tab @code{integer(acc_handle_kind) :: async}
2855@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, len, async)}
2856@item @tab @code{type, dimension(:[,:]...) :: a}
2857@item @tab @code{integer len}
2858@item @tab @code{integer(acc_handle_kind) :: async}
cdf6119d
JN
2859@end multitable
2860
2861@item @emph{Reference}:
e464fc90
TB
2862@uref{https://www.openacc.org, OpenACC specification v2.6}, section
28633.2.21.
cdf6119d
JN
2864@end table
2865
2866
2867
2868@node acc_present_or_create
2869@section @code{acc_present_or_create} -- If the data is not present on the device, allocate device memory and map it to host memory.
2870@table @asis
2871@item @emph{Description}
c1030b5c 2872This function tests if the host data specified by @var{a} and of length
cdf6119d
JN
2873@var{len} is present or not. If it is not present, then device memory
2874will be allocated and mapped to host memory. In C/C++, the device address
2875of the newly allocated device memory is returned.
2876
2877In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2878a contiguous array section. In the second form, @var{a} specifies a variable or
2879array element and @var{len} specifies the length in bytes.
2880
e464fc90
TB
2881Note that @code{acc_present_or_create} and @code{acc_pcreate} exist for
2882backward compatibility with OpenACC 2.0; use @ref{acc_create} instead.
cdf6119d
JN
2883
2884@item @emph{C/C++}:
2885@multitable @columnfractions .20 .80
2886@item @emph{Prototype}: @tab @code{void *acc_present_or_create(h_void *a, size_t len);}
2887@item @emph{Prototype}: @tab @code{void *acc_pcreate(h_void *a, size_t len);}
2888@end multitable
2889
2890@item @emph{Fortran}:
2891@multitable @columnfractions .20 .80
2892@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a)}
2893@item @tab @code{type, dimension(:[,:]...) :: a}
2894@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a, len)}
2895@item @tab @code{type, dimension(:[,:]...) :: a}
2896@item @tab @code{integer len}
2897@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a)}
2898@item @tab @code{type, dimension(:[,:]...) :: a}
2899@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a, len)}
2900@item @tab @code{type, dimension(:[,:]...) :: a}
2901@item @tab @code{integer len}
2902@end multitable
2903
2904@item @emph{Reference}:
e464fc90
TB
2905@uref{https://www.openacc.org, OpenACC specification v2.6}, section
29063.2.21.
cdf6119d
JN
2907@end table
2908
2909
2910
2911@node acc_copyout
2912@section @code{acc_copyout} -- Copy device memory to host memory.
2913@table @asis
2914@item @emph{Description}
2915This function copies mapped device memory to the host memory specified
2916by the host address @var{a} with a length of @var{len} bytes in C/C++.
2917
2918In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2919a contiguous array section. In the second form, @var{a} specifies a variable or
2920array element and @var{len} specifies the length in bytes.
2921
2922@item @emph{C/C++}:
2923@multitable @columnfractions .20 .80
2924@item @emph{Prototype}: @tab @code{void acc_copyout(h_void *a, size_t len);}
e464fc90
TB
2925@item @emph{Prototype}: @tab @code{void acc_copyout_async(h_void *a, size_t len, int async);}
2926@item @emph{Prototype}: @tab @code{void acc_copyout_finalize(h_void *a, size_t len);}
2927@item @emph{Prototype}: @tab @code{void acc_copyout_finalize_async(h_void *a, size_t len, int async);}
cdf6119d
JN
2928@end multitable
2929
2930@item @emph{Fortran}:
2931@multitable @columnfractions .20 .80
2932@item @emph{Interface}: @tab @code{subroutine acc_copyout(a)}
2933@item @tab @code{type, dimension(:[,:]...) :: a}
2934@item @emph{Interface}: @tab @code{subroutine acc_copyout(a, len)}
2935@item @tab @code{type, dimension(:[,:]...) :: a}
2936@item @tab @code{integer len}
e464fc90
TB
2937@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, async)}
2938@item @tab @code{type, dimension(:[,:]...) :: a}
2939@item @tab @code{integer(acc_handle_kind) :: async}
2940@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, len, async)}
2941@item @tab @code{type, dimension(:[,:]...) :: a}
2942@item @tab @code{integer len}
2943@item @tab @code{integer(acc_handle_kind) :: async}
2944@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a)}
2945@item @tab @code{type, dimension(:[,:]...) :: a}
2946@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a, len)}
2947@item @tab @code{type, dimension(:[,:]...) :: a}
2948@item @tab @code{integer len}
2949@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, async)}
2950@item @tab @code{type, dimension(:[,:]...) :: a}
2951@item @tab @code{integer(acc_handle_kind) :: async}
2952@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, len, async)}
2953@item @tab @code{type, dimension(:[,:]...) :: a}
2954@item @tab @code{integer len}
2955@item @tab @code{integer(acc_handle_kind) :: async}
cdf6119d
JN
2956@end multitable
2957
2958@item @emph{Reference}:
e464fc90
TB
2959@uref{https://www.openacc.org, OpenACC specification v2.6}, section
29603.2.22.
cdf6119d
JN
2961@end table
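
A typical pairing of @code{acc_copyin} (@pxref{acc_copyin}) with @code{acc_copyout} brackets a compute region, as in the sketch below. The helper name @code{increment_on_device} is hypothetical; the stand-ins under @code{#else} model a shared-memory host where no transfer is needed and are not part of libgomp.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins so the sketch builds without -fopenacc.
   On a shared-memory host there is nothing to allocate or transfer.
   They are NOT part of libgomp.  */
static void *acc_copyin(void *a, size_t len) { (void) len; return a; }
static void acc_copyout(void *a, size_t len) { (void) a; (void) len; }
#endif

/* Add 1 to each of the N elements of A inside a compute region, using
   the runtime API instead of data clauses to manage the device copy.  */
void increment_on_device(int *a, int n)
{
  size_t bytes;

  assert(a != NULL && n >= 0);
  bytes = (size_t) n * sizeof *a;

  acc_copyin(a, bytes);    /* allocate a device copy and fill it from A */

#pragma acc parallel loop present(a[0:n])
  for (int i = 0; i < n; i++)
    a[i] += 1;

  acc_copyout(a, bytes);   /* copy the results back to A */
}
```

The @code{present} clause tells the compiler that the data is already on the device, so the compute region itself performs no transfers.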
2962
2963
2964
2965@node acc_delete
2966@section @code{acc_delete} -- Free device memory.
2967@table @asis
2968@item @emph{Description}
2969This function frees the previously allocated device memory that is mapped
2970to the host address @var{a} with a length of @var{len} bytes.
2971
2972In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2973a contiguous array section. In the second form, @var{a} specifies a variable or
2974array element and @var{len} specifies the length in bytes.
2975
2976@item @emph{C/C++}:
2977@multitable @columnfractions .20 .80
2978@item @emph{Prototype}: @tab @code{void acc_delete(h_void *a, size_t len);}
e464fc90
TB
2979@item @emph{Prototype}: @tab @code{void acc_delete_async(h_void *a, size_t len, int async);}
2980@item @emph{Prototype}: @tab @code{void acc_delete_finalize(h_void *a, size_t len);}
2981@item @emph{Prototype}: @tab @code{void acc_delete_finalize_async(h_void *a, size_t len, int async);}
cdf6119d
JN
2982@end multitable
2983
2984@item @emph{Fortran}:
2985@multitable @columnfractions .20 .80
2986@item @emph{Interface}: @tab @code{subroutine acc_delete(a)}
2987@item @tab @code{type, dimension(:[,:]...) :: a}
2988@item @emph{Interface}: @tab @code{subroutine acc_delete(a, len)}
2989@item @tab @code{type, dimension(:[,:]...) :: a}
2990@item @tab @code{integer len}
e464fc90
TB
2991@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, async)}
2992@item @tab @code{type, dimension(:[,:]...) :: a}
2993@item @tab @code{integer(acc_handle_kind) :: async}
2994@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, len, async)}
2995@item @tab @code{type, dimension(:[,:]...) :: a}
2996@item @tab @code{integer len}
2997@item @tab @code{integer(acc_handle_kind) :: async}
2998@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a)}
2999@item @tab @code{type, dimension(:[,:]...) :: a}
3000@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a, len)}
3001@item @tab @code{type, dimension(:[,:]...) :: a}
3002@item @tab @code{integer len}
3003@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, async)}
3004@item @tab @code{type, dimension(:[,:]...) :: a}
3005@item @tab @code{integer(acc_handle_kind) :: async}
3006@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, len, async)}
3007@item @tab @code{type, dimension(:[,:]...) :: a}
3008@item @tab @code{integer len}
3009@item @tab @code{integer(acc_handle_kind) :: async}
cdf6119d
JN
3010@end multitable
3011
3012@item @emph{Reference}:
e464fc90
TB
3013@uref{https://www.openacc.org, OpenACC specification v2.6}, section
30143.2.23.
cdf6119d
JN
3015@end table
3016
3017
3018
3019@node acc_update_device
3020@section @code{acc_update_device} -- Update device memory from mapped host memory.
3021@table @asis
3022@item @emph{Description}
3023This function updates the device copy from the previously mapped host memory.
3024The host memory is specified with the host address @var{a} and a length of
3025@var{len} bytes.
3026
3027In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
3028a contiguous array section. In the second form, @var{a} specifies a variable or
3029array element and @var{len} specifies the length in bytes.
3030
3031@item @emph{C/C++}:
3032@multitable @columnfractions .20 .80
3033@item @emph{Prototype}: @tab @code{void acc_update_device(h_void *a, size_t len);}
e464fc90 3034@item @emph{Prototype}: @tab @code{void acc_update_device_async(h_void *a, size_t len, int async);}
cdf6119d
JN
3035@end multitable
3036
3037@item @emph{Fortran}:
3038@multitable @columnfractions .20 .80
3039@item @emph{Interface}: @tab @code{subroutine acc_update_device(a)}
3040@item @tab @code{type, dimension(:[,:]...) :: a}
3041@item @emph{Interface}: @tab @code{subroutine acc_update_device(a, len)}
3042@item @tab @code{type, dimension(:[,:]...) :: a}
3043@item @tab @code{integer len}
e464fc90
TB
3044@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, async)}
3045@item @tab @code{type, dimension(:[,:]...) :: a}
3046@item @tab @code{integer(acc_handle_kind) :: async}
3047@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, len, async)}
3048@item @tab @code{type, dimension(:[,:]...) :: a}
3049@item @tab @code{integer len}
3050@item @tab @code{integer(acc_handle_kind) :: async}
cdf6119d
JN
3051@end multitable
3052
3053@item @emph{Reference}:
e464fc90
TB
3054@uref{https://www.openacc.org, OpenACC specification v2.6}, section
30553.2.24.
cdf6119d
JN
3056@end table
3057
3058
3059
3060@node acc_update_self
3061@section @code{acc_update_self} -- Update host memory from mapped device memory.
3062@table @asis
3063@item @emph{Description}
3064This function updates the host copy from the previously mapped device memory.
3065The host memory is specified with the host address @var{a} and a length of
3066@var{len} bytes.
3067
3068In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
3069a contiguous array section. In the second form, @var{a} specifies a variable or
3070array element and @var{len} specifies the length in bytes.
3071
3072@item @emph{C/C++}:
3073@multitable @columnfractions .20 .80
3074@item @emph{Prototype}: @tab @code{void acc_update_self(h_void *a, size_t len);}
e464fc90 3075@item @emph{Prototype}: @tab @code{void acc_update_self_async(h_void *a, size_t len, int async);}
cdf6119d
JN
3076@end multitable
3077
3078@item @emph{Fortran}:
3079@multitable @columnfractions .20 .80
3080@item @emph{Interface}: @tab @code{subroutine acc_update_self(a)}
3081@item @tab @code{type, dimension(:[,:]...) :: a}
3082@item @emph{Interface}: @tab @code{subroutine acc_update_self(a, len)}
3083@item @tab @code{type, dimension(:[,:]...) :: a}
3084@item @tab @code{integer len}
e464fc90
TB
3085@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, async)}
3086@item @tab @code{type, dimension(:[,:]...) :: a}
3087@item @tab @code{integer(acc_handle_kind) :: async}
3088@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, len, async)}
3089@item @tab @code{type, dimension(:[,:]...) :: a}
3090@item @tab @code{integer len}
3091@item @tab @code{integer(acc_handle_kind) :: async}
3092@end multitable
3093
3094@item @emph{Reference}:
3095@uref{https://www.openacc.org, OpenACC specification v2.6}, section
30963.2.25.
3097@end table
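
As an illustration, the following sketch modifies an array on the device and
then refreshes the host copy with @code{acc_update_self}. The helper name
@code{double_on_device} is hypothetical, and the sketch assumes compilation
with @option{-fopenacc}; with host fallback or a shared-memory device the
update has no visible effect because host and device share the data.

@smallexample
#include <openacc.h>
#include <stddef.h>

/* Double every element of A on the device, then refresh the host copy.
   'double_on_device' is a hypothetical helper, not part of libgomp.  */
void
double_on_device (float *a, size_t n)
@{
  size_t bytes = n * sizeof (float);

  /* Map A to the device and copy its contents there.  */
  acc_copyin (a, bytes);

#pragma acc parallel loop present(a[0:n])
  for (size_t i = 0; i < n; i++)
    a[i] *= 2.0f;

  /* On a non-shared-memory device the host still holds the old values
     until they are copied back here.  */
  acc_update_self (a, bytes);

  /* Drop the mapping without another copy.  */
  acc_delete (a, bytes);
@}
@end smallexample

Using @code{acc_delete} rather than @code{acc_copyout} at the end avoids a
second transfer, since the host copy is already up to date.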
3098
3099
3100
3101@node acc_map_data
3102@section @code{acc_map_data} -- Map previously allocated device memory to host memory.
3103@table @asis
3104@item @emph{Description}
This function maps previously allocated device memory to host memory. The
device memory is specified with the device address @var{d}. The host memory is
specified with the host address @var{h} and a length of @var{len} bytes.
3108
3109@item @emph{C/C++}:
3110@multitable @columnfractions .20 .80
3111@item @emph{Prototype}: @tab @code{acc_map_data(h_void *h, d_void *d, size_t len);}
3112@end multitable
3113
3114@item @emph{Reference}:
3115@uref{https://www.openacc.org, OpenACC specification v2.6}, section
31163.2.26.
3117@end table
3118
3119
3120
3121@node acc_unmap_data
3122@section @code{acc_unmap_data} -- Unmap device memory from host memory.
3123@table @asis
3124@item @emph{Description}
This function unmaps previously mapped device and host memory. The latter
is specified by @var{h}.
3127
3128@item @emph{C/C++}:
3129@multitable @columnfractions .20 .80
3130@item @emph{Prototype}: @tab @code{acc_unmap_data(h_void *h);}
3131@end multitable
3132
3133@item @emph{Reference}:
3134@uref{https://www.openacc.org, OpenACC specification v2.6}, section
31353.2.27.
3136@end table
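
A minimal sketch of pairing @code{acc_map_data} with @code{acc_unmap_data}
follows. The helper names @code{map_scratch} and @code{unmap_scratch} are
hypothetical. The sketch assumes that, with host fallback, host and device
share memory and there is no separate device buffer to map, so it skips the
mapping in that case.

@smallexample
#include <openacc.h>
#include <stddef.h>

/* Back HOST with freshly allocated device memory.  Returns the device
   address, or NULL if nothing was mapped.  Hypothetical helper.  */
void *
map_scratch (void *host, size_t bytes)
@{
  void *dev;

  /* Assumption: on the host device there is nothing to map.  */
  if (acc_get_device_type () == acc_device_host)
    return NULL;

  dev = acc_malloc (bytes);
  if (dev != NULL)
    acc_map_data (host, dev, bytes);
  return dev;
@}

/* Undo map_scratch: remove the mapping first, then free the memory.  */
void
unmap_scratch (void *host, void *dev)
@{
  if (dev == NULL)
    return;
  acc_unmap_data (host);
  acc_free (dev);
@}
@end smallexample

The mapping is removed before the device memory is freed, so that no mapping
ever refers to released memory.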
3137
3138
3139
3140@node acc_deviceptr
3141@section @code{acc_deviceptr} -- Get device pointer associated with specific host address.
3142@table @asis
3143@item @emph{Description}
3144This function returns the device address that has been mapped to the
3145host address specified by @var{h}.
3146
3147@item @emph{C/C++}:
3148@multitable @columnfractions .20 .80
3149@item @emph{Prototype}: @tab @code{void *acc_deviceptr(h_void *h);}
3150@end multitable
3151
3152@item @emph{Reference}:
3153@uref{https://www.openacc.org, OpenACC specification v2.6}, section
31543.2.28.
3155@end table
3156
3157
3158
3159@node acc_hostptr
3160@section @code{acc_hostptr} -- Get host pointer associated with specific device address.
3161@table @asis
3162@item @emph{Description}
3163This function returns the host address that has been mapped to the
3164device address specified by @var{d}.
3165
3166@item @emph{C/C++}:
3167@multitable @columnfractions .20 .80
3168@item @emph{Prototype}: @tab @code{void *acc_hostptr(d_void *d);}
3169@end multitable
3170
3171@item @emph{Reference}:
3172@uref{https://www.openacc.org, OpenACC specification v2.6}, section
31733.2.29.
3174@end table
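
The relation between @code{acc_deviceptr} and @code{acc_hostptr} can be
sketched as a round trip through an existing mapping. The helper name
@code{mapping_round_trips} is hypothetical; the mapping is created with
@code{acc_copyin}, which returns the device address.

@smallexample
#include <openacc.h>
#include <stddef.h>

/* Returns 1 if the host/device address mapping of BUF round-trips,
   0 otherwise.  Hypothetical helper, not part of libgomp.  */
int
mapping_round_trips (float *buf, size_t n)
@{
  size_t bytes = n * sizeof (float);
  int ok;

  /* Map BUF; the return value is the corresponding device address.  */
  void *d = acc_copyin (buf, bytes);

  /* Both lookups must agree with the mapping just created.  */
  ok = (acc_deviceptr (buf) == d) && (acc_hostptr (d) == (void *) buf);

  acc_delete (buf, bytes);
  return ok;
@}
@end smallexample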
3175
3176
3177
3178@node acc_is_present
3179@section @code{acc_is_present} -- Indicate whether host variable / array is present on device.
3180@table @asis
3181@item @emph{Description}
3182This function indicates whether the specified host address in @var{a} and a
3183length of @var{len} bytes is present on the device. In C/C++, a non-zero
3184value is returned to indicate the presence of the mapped memory on the
3185device. A zero is returned to indicate the memory is not mapped on the
3186device.
3187
In Fortran, two forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes. If the host
memory is mapped to device memory, then @code{true} is returned. Otherwise,
@code{false} is returned to indicate the mapped memory is not present.
3193
3194@item @emph{C/C++}:
3195@multitable @columnfractions .20 .80
3196@item @emph{Prototype}: @tab @code{int acc_is_present(h_void *a, size_t len);}
3197@end multitable
3198
3199@item @emph{Fortran}:
3200@multitable @columnfractions .20 .80
3201@item @emph{Interface}: @tab @code{function acc_is_present(a)}
3202@item @tab @code{type, dimension(:[,:]...) :: a}
3203@item @tab @code{logical acc_is_present}
3204@item @emph{Interface}: @tab @code{function acc_is_present(a, len)}
3205@item @tab @code{type, dimension(:[,:]...) :: a}
3206@item @tab @code{integer len}
3207@item @tab @code{logical acc_is_present}
3208@end multitable
3209
3210@item @emph{Reference}:
3211@uref{https://www.openacc.org, OpenACC specification v2.6}, section
32123.2.30.
3213@end table
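
One possible use of @code{acc_is_present} is to create a mapping only when
none exists yet, as in the following sketch. The helper name
@code{ensure_present} is hypothetical. Note the assumption that, with host
fallback or a shared-memory device, the function may report any non-null
address as present, in which case the helper never copies.

@smallexample
#include <openacc.h>
#include <stddef.h>

/* Copy BUF to the device only if it is not mapped yet.  Returns 1 if
   this call created the mapping, 0 if the data was already present.
   Hypothetical helper, not part of libgomp.  */
int
ensure_present (float *buf, size_t n)
@{
  size_t bytes = n * sizeof (float);

  if (acc_is_present (buf, bytes))
    return 0;

  acc_copyin (buf, bytes);
  return 1;
@}
@end smallexample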
3214
3215
3216
3217@node acc_memcpy_to_device
3218@section @code{acc_memcpy_to_device} -- Copy host memory to device memory.
3219@table @asis
3220@item @emph{Description}
3221This function copies host memory specified by host address of @var{src} to
3222device memory specified by the device address @var{dest} for a length of
3223@var{bytes} bytes.
3224
3225@item @emph{C/C++}:
3226@multitable @columnfractions .20 .80
3227@item @emph{Prototype}: @tab @code{acc_memcpy_to_device(d_void *dest, h_void *src, size_t bytes);}
3228@end multitable
3229
3230@item @emph{Reference}:
3231@uref{https://www.openacc.org, OpenACC specification v2.6}, section
32323.2.31.
3233@end table
3234
3235
3236
3237@node acc_memcpy_from_device
3238@section @code{acc_memcpy_from_device} -- Copy device memory to host memory.
3239@table @asis
3240@item @emph{Description}
This function copies device memory specified by the device address @var{src} to
host memory specified by the host address @var{dest} for a length of
@var{bytes} bytes.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{acc_memcpy_from_device(h_void *dest, d_void *src, size_t bytes);}
3248@end multitable
3249
3250@item @emph{Reference}:
3251@uref{https://www.openacc.org, OpenACC specification v2.6}, section
32523.2.32.
3253@end table
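
The two copy routines can be exercised together in a round trip through
device memory obtained from @code{acc_malloc}, as sketched below. The helper
name @code{device_round_trip} is hypothetical. In both calls the destination
is the first argument.

@smallexample
#include <openacc.h>
#include <stddef.h>
#include <string.h>

/* Copy SRC into freshly allocated device memory and back into DST.
   Returns 1 if DST matches SRC afterwards, 0 otherwise.
   Hypothetical helper, not part of libgomp.  */
int
device_round_trip (const float *src, float *dst, size_t n)
@{
  size_t bytes = n * sizeof (float);
  void *d_buf = acc_malloc (bytes);

  if (d_buf == NULL)
    return 0;

  /* Host to device: device address first, host address second.  */
  acc_memcpy_to_device (d_buf, (void *) src, bytes);

  /* Device to host: host address first, device address second.  */
  acc_memcpy_from_device (dst, d_buf, bytes);

  acc_free (d_buf);
  return memcmp (src, dst, bytes) == 0;
@}
@end smallexample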
3254
3255
3256
3257@node acc_attach
3258@section @code{acc_attach} -- Let device pointer point to device-pointer target.
3259@table @asis
3260@item @emph{Description}
3261This function updates a pointer on the device from pointing to a host-pointer
3262address to pointing to the corresponding device data.
3263
3264@item @emph{C/C++}:
3265@multitable @columnfractions .20 .80
3266@item @emph{Prototype}: @tab @code{acc_attach(h_void **ptr);}
3267@item @emph{Prototype}: @tab @code{acc_attach_async(h_void **ptr, int async);}
3268@end multitable
3269
3270@item @emph{Reference}:
3271@uref{https://www.openacc.org, OpenACC specification v2.6}, section
32723.2.34.
3273@end table
3274
3275
3276
3277@node acc_detach
3278@section @code{acc_detach} -- Let device pointer point to host-pointer target.
3279@table @asis
3280@item @emph{Description}
3281This function updates a pointer on the device from pointing to a device-pointer
3282address to pointing to the corresponding host data.
3283
3284@item @emph{C/C++}:
3285@multitable @columnfractions .20 .80
3286@item @emph{Prototype}: @tab @code{acc_detach(h_void **ptr);}
3287@item @emph{Prototype}: @tab @code{acc_detach_async(h_void **ptr, int async);}
3288@item @emph{Prototype}: @tab @code{acc_detach_finalize(h_void **ptr);}
3289@item @emph{Prototype}: @tab @code{acc_detach_finalize_async(h_void **ptr, int async);}
3290@end multitable
3291
3292@item @emph{Reference}:
3293@uref{https://www.openacc.org, OpenACC specification v2.6}, section
32943.2.35.
3295@end table
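
Attaching and detaching is typically needed for aggregates that contain
pointers. The following sketch maps a structure and its payload separately
and then attaches the pointer member. The type @code{struct vec} and the
helper names are hypothetical; with host fallback the attach and detach
calls are expected to have no effect.

@smallexample
#include <openacc.h>
#include <stddef.h>

/* Hypothetical aggregate with a pointer member.  */
struct vec
@{
  float *data;
  size_t n;
@};

/* Map V and its payload, then let the device copy of V->data point
   to the device copy of the payload.  */
void
vec_to_device (struct vec *v)
@{
  acc_copyin (v, sizeof *v);
  acc_copyin (v->data, v->n * sizeof (float));
  acc_attach ((void **) &v->data);
@}

/* Reverse the steps above and copy the payload back to the host.  */
void
vec_from_device (struct vec *v)
@{
  acc_detach ((void **) &v->data);
  acc_copyout (v->data, v->n * sizeof (float));
  acc_delete (v, sizeof *v);
@}
@end smallexample

The structure itself is released with @code{acc_delete} rather than copied
out, so the host pointer member is never overwritten.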
3296
3297
3298
3299@node acc_get_current_cuda_device
3300@section @code{acc_get_current_cuda_device} -- Get CUDA device handle.
3301@table @asis
3302@item @emph{Description}
This function returns the CUDA device handle. This handle is the same
as used by the CUDA Runtime or Driver APIs.
3305
3306@item @emph{C/C++}:
3307@multitable @columnfractions .20 .80
3308@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_device(void);}
3309@end multitable
3310
3311@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.1.
3314@end table
3315
3316
3317
3318@node acc_get_current_cuda_context
3319@section @code{acc_get_current_cuda_context} -- Get CUDA context handle.
3320@table @asis
3321@item @emph{Description}
This function returns the CUDA context handle. This handle is the same
as used by the CUDA Runtime or Driver APIs.
3324
3325@item @emph{C/C++}:
3326@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_context(void);}
3328@end multitable
3329
3330@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.2.
3333@end table
3334
3335
3336
3337@node acc_get_cuda_stream
3338@section @code{acc_get_cuda_stream} -- Get CUDA stream handle.
3339@table @asis
3340@item @emph{Description}
This function returns the CUDA stream handle for the queue @var{async}.
This handle is the same as used by the CUDA Runtime or Driver APIs.
3343
3344@item @emph{C/C++}:
3345@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *acc_get_cuda_stream(int async);}
3347@end multitable
3348
3349@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.3.
3352@end table
3353
3354
3355
3356@node acc_set_cuda_stream
3357@section @code{acc_set_cuda_stream} -- Set CUDA stream handle.
3358@table @asis
3359@item @emph{Description}
This function associates the stream handle specified by @var{stream} with
the queue @var{async}.
3362
3363This cannot be used to change the stream handle associated with
3364@code{acc_async_sync}.
3365
3366The return value is not specified.
3367
3368@item @emph{C/C++}:
3369@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int acc_set_cuda_stream(int async, void *stream);}
3371@end multitable
3372
3373@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.4.
3376@end table
3377
3378
3379
3380@node acc_prof_register
3381@section @code{acc_prof_register} -- Register callbacks.
3382@table @asis
3383@item @emph{Description}:
3384This function registers callbacks.
3385
3386@item @emph{C/C++}:
3387@multitable @columnfractions .20 .80
3388@item @emph{Prototype}: @tab @code{void acc_prof_register (acc_event_t, acc_prof_callback, acc_register_t);}
3389@end multitable
3390
3391@item @emph{See also}:
3392@ref{OpenACC Profiling Interface}
3393
3394@item @emph{Reference}:
3395@uref{https://www.openacc.org, OpenACC specification v2.6}, section
33965.3.
3397@end table
3398
3399
3400
3401@node acc_prof_unregister
3402@section @code{acc_prof_unregister} -- Unregister callbacks.
3403@table @asis
3404@item @emph{Description}:
3405This function unregisters callbacks.
3406
3407@item @emph{C/C++}:
3408@multitable @columnfractions .20 .80
3409@item @emph{Prototype}: @tab @code{void acc_prof_unregister (acc_event_t, acc_prof_callback, acc_register_t);}
3410@end multitable
3411
3412@item @emph{See also}:
3413@ref{OpenACC Profiling Interface}
3414
3415@item @emph{Reference}:
3416@uref{https://www.openacc.org, OpenACC specification v2.6}, section
34175.3.
3418@end table
3419
3420
3421
3422@node acc_prof_lookup
3423@section @code{acc_prof_lookup} -- Obtain inquiry functions.
3424@table @asis
3425@item @emph{Description}:
This function obtains inquiry functions.
3427
3428@item @emph{C/C++}:
3429@multitable @columnfractions .20 .80
3430@item @emph{Prototype}: @tab @code{acc_query_fn acc_prof_lookup (const char *);}
3431@end multitable
3432
3433@item @emph{See also}:
3434@ref{OpenACC Profiling Interface}
3435
3436@item @emph{Reference}:
3437@uref{https://www.openacc.org, OpenACC specification v2.6}, section
34385.3.
3439@end table
3440
3441
3442
3443@node acc_register_library
3444@section @code{acc_register_library} -- Library registration.
3445@table @asis
3446@item @emph{Description}:
This function performs library registration.
3448
3449@item @emph{C/C++}:
3450@multitable @columnfractions .20 .80
3451@item @emph{Prototype}: @tab @code{void acc_register_library (acc_prof_reg, acc_prof_reg, acc_prof_lookup_func);}
3452@end multitable
3453
3454@item @emph{See also}:
3455@ref{OpenACC Profiling Interface}, @ref{ACC_PROFLIB}
3456
3457@item @emph{Reference}:
3458@uref{https://www.openacc.org, OpenACC specification v2.6}, section
34595.3.
3460@end table
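
A profiling library can be sketched as a single source file that defines
@code{acc_register_library} and uses the registration function it receives.
The callback @code{count_launch}, the counter, and the accessor
@code{launch_count} are hypothetical names chosen for this illustration.
The sketch assumes the file is built as a shared library and named in the
@env{ACC_PROFLIB} environment variable.

@smallexample
#include <acc_prof.h>

/* Number of kernel launches observed so far.  */
static unsigned long launches;

/* Callback invoked before each kernel launch is enqueued.  */
static void
count_launch (acc_prof_info *prof_info, acc_event_info *event_info,
              acc_api_info *api_info)
@{
  (void) prof_info;
  (void) event_info;
  (void) api_info;
  ++launches;
@}

/* Hypothetical accessor for the counter.  */
unsigned long
launch_count (void)
@{
  return launches;
@}

/* Entry point looked up by the runtime in the profiling library.
   Only the registration function REG is used here.  */
void
acc_register_library (acc_prof_reg reg, acc_prof_reg unreg,
                      acc_prof_lookup_func lookup)
@{
  (void) unreg;
  (void) lookup;
  reg (acc_ev_enqueue_launch_start, count_launch, acc_reg);
@}
@end smallexample

Because the registration function is passed in as a pointer, the library does
not need to call @code{acc_prof_register} by name.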
3461
3462
3463
3464@c ---------------------------------------------------------------------
3465@c OpenACC Environment Variables
3466@c ---------------------------------------------------------------------
3467
3468@node OpenACC Environment Variables
3469@chapter OpenACC Environment Variables
3470
3471The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
3472are defined by section 4 of the OpenACC specification in version 2.0.
3473The variable @env{ACC_PROFLIB}
3474is defined by section 4 of the OpenACC specification in version 2.6.
3475The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
3476
3477@menu
3478* ACC_DEVICE_TYPE::
3479* ACC_DEVICE_NUM::
* ACC_PROFLIB::
3481* GCC_ACC_NOTIFY::
3482@end menu
3483
3484
3485
3486@node ACC_DEVICE_TYPE
3487@section @code{ACC_DEVICE_TYPE}
3488@table @asis
3489@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4.1.
3492@end table
3493
3494
3495
3496@node ACC_DEVICE_NUM
3497@section @code{ACC_DEVICE_NUM}
3498@table @asis
3499@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4.2.
3502@end table
3503
3504
3505
3506@node ACC_PROFLIB
3507@section @code{ACC_PROFLIB}
3508@table @asis
3509@item @emph{See also}:
3510@ref{acc_register_library}, @ref{OpenACC Profiling Interface}
3511
3512@item @emph{Reference}:
3513@uref{https://www.openacc.org, OpenACC specification v2.6}, section
35144.3.
3515@end table
3516
3517
3518
3519@node GCC_ACC_NOTIFY
3520@section @code{GCC_ACC_NOTIFY}
3521@table @asis
3522@item @emph{Description}:
3523Print debug information pertaining to the accelerator.
3524@end table
3525
3526
3527
3528@c ---------------------------------------------------------------------
3529@c CUDA Streams Usage
3530@c ---------------------------------------------------------------------
3531
3532@node CUDA Streams Usage
3533@chapter CUDA Streams Usage
3534
3535This applies to the @code{nvptx} plugin only.
3536
3537The library provides elements that perform asynchronous movement of
3538data and asynchronous operation of computing constructs. This
3539asynchronous functionality is implemented by making use of CUDA
3540streams@footnote{See "Stream Management" in "CUDA Driver API",
3541TRM-06703-001, Version 5.5, for additional information}.
3542
The primary means by which the asynchronous functionality is accessed
is through the use of those OpenACC directives which make use of the
3545@code{async} and @code{wait} clauses. When the @code{async} clause is
3546first used with a directive, it creates a CUDA stream. If an
3547@code{async-argument} is used with the @code{async} clause, then the
3548stream is associated with the specified @code{async-argument}.
3549
3550Following the creation of an association between a CUDA stream and the
3551@code{async-argument} of an @code{async} clause, both the @code{wait}
3552clause and the @code{wait} directive can be used. When either the
3553clause or directive is used after stream creation, it creates a
3554rendezvous point whereby execution waits until all operations
3555associated with the @code{async-argument}, that is, stream, have
3556completed.
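
The interplay of the @code{async} clause and the @code{wait} directive can be
sketched as follows. The function name @code{saxpy_two_queues} and the
@code{async-argument} values 1 and 2 are illustrative choices, and the sketch
assumes compilation with @option{-fopenacc} and @var{n} of at least 2.

@smallexample
#include <stddef.h>

/* Compute y = a*x + y, split into two halves that are enqueued on
   two different async queues.  Hypothetical example function.  */
void
saxpy_two_queues (size_t n, float a, const float *x, float *y)
@{
  size_t half = n / 2;

  /* With the nvptx plugin, the first use of each async-argument
     creates the CUDA stream associated with it.  */
#pragma acc parallel loop copyin(x[0:half]) copy(y[0:half]) async(1)
  for (size_t i = 0; i < half; i++)
    y[i] = a * x[i] + y[i];

#pragma acc parallel loop copyin(x[half:n-half]) copy(y[half:n-half]) async(2)
  for (size_t i = half; i < n; i++)
    y[i] = a * x[i] + y[i];

  /* Rendezvous point: execution waits here until all operations
     enqueued on queues 1 and 2 have completed.  */
#pragma acc wait(1, 2)
@}
@end smallexample

Without the final @code{wait} directive, the caller could read @code{y}
before the asynchronous copies back to the host have finished.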
3557
Normally, the management of the streams that are created as a result of
using the @code{async} clause is done without any intervention by the
caller. This implies the association between the @code{async-argument}
3561and the CUDA stream will be maintained for the lifetime of the program.
3562However, this association can be changed through the use of the library
3563function @code{acc_set_cuda_stream}. When the function
3564@code{acc_set_cuda_stream} is called, the CUDA stream that was
3565originally associated with the @code{async} clause will be destroyed.
3566Caution should be taken when changing the association as subsequent
3567references to the @code{async-argument} refer to a different
3568CUDA stream.
3569
3570
3571
3572@c ---------------------------------------------------------------------
3573@c OpenACC Library Interoperability
3574@c ---------------------------------------------------------------------
3575
3576@node OpenACC Library Interoperability
3577@chapter OpenACC Library Interoperability
3578
3579@section Introduction
3580
3581The OpenACC library uses the CUDA Driver API, and may interact with
3582programs that use the Runtime library directly, or another library
3583based on the Runtime library, e.g., CUBLAS@footnote{See section 2.26,
3584"Interactions with the CUDA Driver API" in
3585"CUDA Runtime API", Version 5.5, and section 2.27, "VDPAU
3586Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
3587for additional information on library interoperability.}.
3588This chapter describes the use cases and what changes are
3589required in order to use both the OpenACC library and the CUBLAS and Runtime
3590libraries within a program.
3591
3592@section First invocation: NVIDIA CUBLAS library API
3593
In this first use case (see below), a function in the CUBLAS library, namely
@code{cublasCreate()}, is called prior to any of the functions in the
OpenACC library.
3597
3598When invoked, the function initializes the library and allocates the
3599hardware resources on the host and the device on behalf of the caller. Once
the initialization and allocation have completed, a handle is returned to the
3601caller. The OpenACC library also requires initialization and allocation of
3602hardware resources. Since the CUBLAS library has already allocated the
3603hardware resources for the device, all that is left to do is to initialize
3604the OpenACC library and acquire the hardware resources on the host.
3605
Prior to calling the OpenACC function that initializes the library and
allocates the host hardware resources, you need to acquire the device number
that was allocated during the call to @code{cublasCreate()}. Calling the
runtime library function @code{cudaGetDevice()} accomplishes this. Once
acquired, the device number is passed along with the device type as
parameters to the OpenACC library function @code{acc_set_device_num()}.
3612
3613Once the call to @code{acc_set_device_num()} has completed, the OpenACC
3614library uses the context that was created during the call to
3615@code{cublasCreate()}. In other words, both libraries will be sharing the
3616same context.
3617
3618@smallexample
3619 /* Create the handle */
3620 s = cublasCreate(&h);
3621 if (s != CUBLAS_STATUS_SUCCESS)
3622 @{
3623 fprintf(stderr, "cublasCreate failed %d\n", s);
3624 exit(EXIT_FAILURE);
3625 @}
3626
3627 /* Get the device number */
3628 e = cudaGetDevice(&dev);
3629 if (e != cudaSuccess)
3630 @{
3631 fprintf(stderr, "cudaGetDevice failed %d\n", e);
3632 exit(EXIT_FAILURE);
3633 @}
3634
3635 /* Initialize OpenACC library and use device 'dev' */
3636 acc_set_device_num(dev, acc_device_nvidia);
3637
3638@end smallexample
3639@center Use Case 1
3640
3641@section First invocation: OpenACC library API
3642
In this second use case (see below), a function in the OpenACC library, namely
@code{acc_set_device_num()}, is called prior to any of the functions in the
CUBLAS library.
3646
3647In the use case presented here, the function @code{acc_set_device_num()}
3648is used to both initialize the OpenACC library and allocate the hardware
3649resources on the host and the device. In the call to the function, the
3650call parameters specify which device to use and what device
3651type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
3652is but one method to initialize the OpenACC library and allocate the
3653appropriate hardware resources. Other methods are available through the
3654use of environment variables and these will be discussed in the next section.
3655
3656Once the call to @code{acc_set_device_num()} has completed, other OpenACC
3657functions can be called as seen with multiple calls being made to
3658@code{acc_copyin()}. In addition, calls can be made to functions in the
3659CUBLAS library. In the use case a call to @code{cublasCreate()} is made
3660subsequent to the calls to @code{acc_copyin()}.
3661As seen in the previous use case, a call to @code{cublasCreate()}
3662initializes the CUBLAS library and allocates the hardware resources on the
3663host and the device. However, since the device has already been allocated,
3664@code{cublasCreate()} will only initialize the CUBLAS library and allocate
3665the appropriate hardware resources on the host. The context that was created
3666as part of the OpenACC initialization is shared with the CUBLAS library,
3667similarly to the first use case.
3668
3669@smallexample
3670 dev = 0;
3671
3672 acc_set_device_num(dev, acc_device_nvidia);
3673
3674 /* Copy the first set to the device */
3675 d_X = acc_copyin(&h_X[0], N * sizeof (float));
3676 if (d_X == NULL)
3677 @{
3678 fprintf(stderr, "copyin error h_X\n");
3679 exit(EXIT_FAILURE);
3680 @}
3681
3682 /* Copy the second set to the device */
3683 d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
3684 if (d_Y == NULL)
3685 @{
3686 fprintf(stderr, "copyin error h_Y1\n");
3687 exit(EXIT_FAILURE);
3688 @}
3689
3690 /* Create the handle */
3691 s = cublasCreate(&h);
3692 if (s != CUBLAS_STATUS_SUCCESS)
3693 @{
3694 fprintf(stderr, "cublasCreate failed %d\n", s);
3695 exit(EXIT_FAILURE);
3696 @}
3697
3698 /* Perform saxpy using CUBLAS library function */
3699 s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
3700 if (s != CUBLAS_STATUS_SUCCESS)
3701 @{
3702 fprintf(stderr, "cublasSaxpy failed %d\n", s);
3703 exit(EXIT_FAILURE);
3704 @}
3705
3706 /* Copy the results from the device */
3707 acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));
3708
3709@end smallexample
3710@center Use Case 2
3711
3712@section OpenACC library and environment variables
3713
3714There are two environment variables associated with the OpenACC library
3715that may be used to control the device type and device number:
3716@env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}, respectively. These two
3717environment variables can be used as an alternative to calling
3718@code{acc_set_device_num()}. As seen in the second use case, the device
3719type and device number were specified using @code{acc_set_device_num()}.
If, however, the aforementioned environment variables were set, then the
call to @code{acc_set_device_num()} would not be required.
3722
3723
The use of the environment variables is only relevant when an OpenACC function
is called prior to a call to @code{cublasCreate()}. If @code{cublasCreate()}
is called prior to a call to an OpenACC function, then you must call
@code{acc_set_device_num()}@footnote{More complete information
about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
sections 4.1 and 4.2 of the ``@uref{https://www.openacc.org, OpenACC}
Application Programming Interface'', Version 2.6.}.
3731
3732
3733
3734@c ---------------------------------------------------------------------
3735@c OpenACC Profiling Interface
3736@c ---------------------------------------------------------------------
3737
3738@node OpenACC Profiling Interface
3739@chapter OpenACC Profiling Interface
3740
3741@section Implementation Status and Implementation-Defined Behavior
3742
3743We're implementing the OpenACC Profiling Interface as defined by the
3744OpenACC 2.6 specification. We're clarifying some aspects here as
3745@emph{implementation-defined behavior}, while they're still under
3746discussion within the OpenACC Technical Committee.
3747
3748This implementation is tuned to keep the performance impact as low as
3749possible for the (very common) case that the Profiling Interface is
3750not enabled. This is relevant, as the Profiling Interface affects all
3751the @emph{hot} code paths (in the target code, not in the offloaded
3752code). Users of the OpenACC Profiling Interface can be expected to
3753understand that performance will be impacted to some degree once the
Profiling Interface has been enabled: for example, because of the
3755@emph{runtime} (libgomp) calling into a third-party @emph{library} for
3756every event that has been registered.
3757
3758We're not yet accounting for the fact that @cite{OpenACC events may
3759occur during event processing}.
We just handle one case specially, as required by CUDA 9.0
@command{nvprof}, that @code{acc_get_device_type}
(@ref{acc_get_device_type}) may be called from
@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
callbacks.
3765
3766We're not yet implementing initialization via a
3767@code{acc_register_library} function that is either statically linked
3768in, or dynamically via @env{LD_PRELOAD}.
3769Initialization via @code{acc_register_library} functions dynamically
3770loaded via the @env{ACC_PROFLIB} environment variable does work, as
3771does directly calling @code{acc_prof_register},
3772@code{acc_prof_unregister}, @code{acc_prof_lookup}.
3773
3774As currently there are no inquiry functions defined, calls to
3775@code{acc_prof_lookup} will always return @code{NULL}.
3776
3777There aren't separate @emph{start}, @emph{stop} events defined for the
3778event types @code{acc_ev_create}, @code{acc_ev_delete},
3779@code{acc_ev_alloc}, @code{acc_ev_free}. It's not clear if these
3780should be triggered before or after the actual device-specific call is
3781made. We trigger them after.
3782
3783Remarks about data provided to callbacks:
3784
3785@table @asis
3786
3787@item @code{acc_prof_info.event_type}
3788It's not clear if for @emph{nested} event callbacks (for example,
3789@code{acc_ev_enqueue_launch_start} as part of a parent compute
3790construct), this should be set for the nested event
3791(@code{acc_ev_enqueue_launch_start}), or if the value of the parent
3792construct should remain (@code{acc_ev_compute_construct_start}). In
3793this implementation, the value will generally correspond to the
3794innermost nested event type.
3795
3796@item @code{acc_prof_info.device_type}
3797@itemize
3798
3799@item
3800For @code{acc_ev_compute_construct_start}, and in presence of an
3801@code{if} clause with @emph{false} argument, this will still refer to
3802the offloading device type.
3803It's not clear if that's the expected behavior.
3804
3805@item
3806Complementary to the item before, for
3807@code{acc_ev_compute_construct_end}, this is set to
3808@code{acc_device_host} in presence of an @code{if} clause with
3809@emph{false} argument.
3810It's not clear if that's the expected behavior.
3811
3812@end itemize
3813
3814@item @code{acc_prof_info.thread_id}
3815Always @code{-1}; not yet implemented.
3816
3817@item @code{acc_prof_info.async}
3818@itemize
3819
3820@item
3821Not yet implemented correctly for
3822@code{acc_ev_compute_construct_start}.
3823
3824@item
3825In a compute construct, for host-fallback
3826execution/@code{acc_device_host} it will always be
3827@code{acc_async_sync}.
3828It's not clear if that's the expected behavior.
3829
3830@item
3831For @code{acc_ev_device_init_start} and @code{acc_ev_device_init_end},
3832it will always be @code{acc_async_sync}.
3833It's not clear if that's the expected behavior.
3834
3835@end itemize
3836
3837@item @code{acc_prof_info.async_queue}
3838There is no @cite{limited number of asynchronous queues} in libgomp.
3839This will always have the same value as @code{acc_prof_info.async}.
3840
3841@item @code{acc_prof_info.src_file}
3842Always @code{NULL}; not yet implemented.
3843
3844@item @code{acc_prof_info.func_name}
3845Always @code{NULL}; not yet implemented.
3846
3847@item @code{acc_prof_info.line_no}
3848Always @code{-1}; not yet implemented.
3849
3850@item @code{acc_prof_info.end_line_no}
3851Always @code{-1}; not yet implemented.
3852
3853@item @code{acc_prof_info.func_line_no}
3854Always @code{-1}; not yet implemented.
3855
3856@item @code{acc_prof_info.func_end_line_no}
3857Always @code{-1}; not yet implemented.
3858
3859@item @code{acc_event_info.event_type}, @code{acc_event_info.*.event_type}
3860Relating to @code{acc_prof_info.event_type} discussed above, in this
3861implementation, this will always be the same value as
3862@code{acc_prof_info.event_type}.
3863
3864@item @code{acc_event_info.*.parent_construct}
3865@itemize
3866
3867@item
3868Will be @code{acc_construct_parallel} for all OpenACC compute
3869constructs as well as many OpenACC Runtime API calls; should be the
3870one matching the actual construct, or
3871@code{acc_construct_runtime_api}, respectively.
3872
3873@item
3874Will be @code{acc_construct_enter_data} or
3875@code{acc_construct_exit_data} when processing variable mappings
3876specified in OpenACC @emph{declare} directives; should be
3877@code{acc_construct_declare}.
3878
3879@item
3880For implicit @code{acc_ev_device_init_start},
3881@code{acc_ev_device_init_end}, and explicit as well as implicit
3882@code{acc_ev_alloc}, @code{acc_ev_free},
3883@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
3884@code{acc_ev_enqueue_download_start}, and
3885@code{acc_ev_enqueue_download_end}, will be
3886@code{acc_construct_parallel}; should reflect the real parent
3887construct.
3888
3889@end itemize
3890
3891@item @code{acc_event_info.*.implicit}
3892For @code{acc_ev_alloc}, @code{acc_ev_free},
3893@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
3894@code{acc_ev_enqueue_download_start}, and
3895@code{acc_ev_enqueue_download_end}, this currently will be @code{1}
3896also for explicit usage.
3897
3898@item @code{acc_event_info.data_event.var_name}
3899Always @code{NULL}; not yet implemented.
3900
3901@item @code{acc_event_info.data_event.host_ptr}
3902For @code{acc_ev_alloc}, and @code{acc_ev_free}, this is always
3903@code{NULL}.
3904
3905@item @code{typedef union acc_api_info}
3906@dots{} as printed in @cite{5.2.3. Third Argument: API-Specific
3907Information}. This should obviously be @code{typedef @emph{struct}
3908acc_api_info}.
3909
3910@item @code{acc_api_info.device_api}
3911Possibly not yet implemented correctly for
3912@code{acc_ev_compute_construct_start},
3913@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}:
3914will always be @code{acc_device_api_none} for these event types.
3915For @code{acc_ev_enter_data_start}, it will be
3916@code{acc_device_api_none} in some cases.
3917
3918@item @code{acc_api_info.device_type}
3919Always the same as @code{acc_prof_info.device_type}.
3920
3921@item @code{acc_api_info.vendor}
3922Always @code{-1}; not yet implemented.
3923
3924@item @code{acc_api_info.device_handle}
3925Always @code{NULL}; not yet implemented.
3926
3927@item @code{acc_api_info.context_handle}
3928Always @code{NULL}; not yet implemented.
3929
3930@item @code{acc_api_info.async_handle}
3931Always @code{NULL}; not yet implemented.
3932
3933@end table
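
Since several fields are currently reported as @code{NULL} or @code{-1},
callbacks should not dereference or print them unconditionally. The following
sketch shows one defensive way to format the source location of an event.
The helper name @code{format_location} and the placeholder strings are
hypothetical.

@smallexample
#include <acc_prof.h>
#include <stddef.h>
#include <stdio.h>

/* Format the source location of an event into BUF, substituting
   placeholders for fields that are not filled in.  Returns the value
   of snprintf.  Hypothetical helper, not part of libgomp.  */
int
format_location (const acc_prof_info *pi, char *buf, size_t size)
@{
  const char *file = pi->src_file ? pi->src_file : "<unknown file>";
  const char *func = pi->func_name ? pi->func_name : "<unknown function>";

  /* A negative line number means the information is unavailable.  */
  if (pi->line_no < 0)
    return snprintf (buf, size, "%s: %s", file, func);

  return snprintf (buf, size, "%s:%d: %s", file, pi->line_no, func);
@}
@end smallexample

Written this way, a callback keeps working unchanged once the implementation
starts to provide the missing information.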
3934
3935Remarks about certain event types:
3936
3937@table @asis
3938
3939@item @code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
3940@itemize
3941
3942@item
3943@c See 'DEVICE_INIT_INSIDE_COMPUTE_CONSTRUCT' in
3944@c 'libgomp.oacc-c-c++-common/acc_prof-kernels-1.c',
3945@c 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.
When a compute construct triggers implicit
@code{acc_ev_device_init_start} and @code{acc_ev_device_init_end}
events, they currently aren't @emph{nested within} the corresponding
@code{acc_ev_compute_construct_start} and
@code{acc_ev_compute_construct_end}, but they're currently observed
@emph{before} @code{acc_ev_compute_construct_start}.
It's not clear what to do: the standard asks us to provide a lot of
details to the @code{acc_ev_compute_construct_start} callback, without
(implicitly) initializing a device before?
3955
3956@item
3957Callbacks for these event types will not be invoked for calls to the
3958@code{acc_set_device_type} and @code{acc_set_device_num} functions.
3959It's not clear if they should be.
3960
3961@end itemize
3962
3963@item @code{acc_ev_enter_data_start}, @code{acc_ev_enter_data_end}, @code{acc_ev_exit_data_start}, @code{acc_ev_exit_data_end}
3964@itemize
3965
3966@item
3967Callbacks for these event types will also be invoked for OpenACC
3968@emph{host_data} constructs.
3969It's not clear if they should be.
3970
3971@item
3972Callbacks for these event types will also be invoked when processing
3973variable mappings specified in OpenACC @emph{declare} directives.
3974It's not clear if they should be.
3975
3976@end itemize
3977
3978@end table
3979
Callbacks for the following event types will be invoked, but dispatch
and information provided therein have not yet been thoroughly reviewed:
3982
3983@itemize
3984@item @code{acc_ev_alloc}
3985@item @code{acc_ev_free}
3986@item @code{acc_ev_update_start}, @code{acc_ev_update_end}
3987@item @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end}
3988@item @code{acc_ev_enqueue_download_start}, @code{acc_ev_enqueue_download_end}
3989@end itemize
3990
3991During device initialization, and finalization, respectively,
3992callbacks for the following event types will not yet be invoked:
3993
3994@itemize
3995@item @code{acc_ev_alloc}
3996@item @code{acc_ev_free}
3997@end itemize
3998
3999Callbacks for the following event types have not yet been implemented,
4000so currently won't be invoked:
4001
4002@itemize
4003@item @code{acc_ev_device_shutdown_start}, @code{acc_ev_device_shutdown_end}
4004@item @code{acc_ev_runtime_shutdown}
4005@item @code{acc_ev_create}, @code{acc_ev_delete}
4006@item @code{acc_ev_wait_start}, @code{acc_ev_wait_end}
4007@end itemize
4008
4009For the following runtime library functions, not all expected
4010callbacks will be invoked (mostly concerning implicit device
4011initialization):
4012
4013@itemize
4014@item @code{acc_get_num_devices}
4015@item @code{acc_set_device_type}
4016@item @code{acc_get_device_type}
4017@item @code{acc_set_device_num}
4018@item @code{acc_get_device_num}
4019@item @code{acc_init}
4020@item @code{acc_shutdown}
4021@end itemize
4022
4023Aside from implicit device initialization, for the following runtime
4024library functions, no callbacks will be invoked for shared-memory
4025offloading devices (it's not clear if they should be):
4026
4027@itemize
4028@item @code{acc_malloc}
4029@item @code{acc_free}
4030@item @code{acc_copyin}, @code{acc_present_or_copyin}, @code{acc_copyin_async}
4031@item @code{acc_create}, @code{acc_present_or_create}, @code{acc_create_async}
4032@item @code{acc_copyout}, @code{acc_copyout_async}, @code{acc_copyout_finalize}, @code{acc_copyout_finalize_async}
4033@item @code{acc_delete}, @code{acc_delete_async}, @code{acc_delete_finalize}, @code{acc_delete_finalize_async}
4034@item @code{acc_update_device}, @code{acc_update_device_async}
4035@item @code{acc_update_self}, @code{acc_update_self_async}
4036@item @code{acc_map_data}, @code{acc_unmap_data}
4037@item @code{acc_memcpy_to_device}, @code{acc_memcpy_to_device_async}
4038@item @code{acc_memcpy_from_device}, @code{acc_memcpy_from_device_async}
4039@end itemize
4040
4041
4042
4043@c ---------------------------------------------------------------------
4044@c The libgomp ABI
4045@c ---------------------------------------------------------------------
4046
4047@node The libgomp ABI
4048@chapter The libgomp ABI
4049
The following sections present notes on the external ABI as
presented by libgomp.  Only maintainers should need them.
4052
@menu
* Implementing MASTER construct::
* Implementing CRITICAL construct::
* Implementing ATOMIC construct::
* Implementing FLUSH construct::
* Implementing BARRIER construct::
* Implementing THREADPRIVATE construct::
* Implementing PRIVATE clause::
* Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
* Implementing REDUCTION clause::
* Implementing PARALLEL construct::
* Implementing FOR construct::
* Implementing ORDERED construct::
* Implementing SECTIONS construct::
* Implementing SINGLE construct::
* Implementing OpenACC's PARALLEL construct::
@end menu
4070
4071
4072@node Implementing MASTER construct
4073@section Implementing MASTER construct
4074
@smallexample
if (omp_get_thread_num () == 0)
  block
@end smallexample

Alternatively, we could generate two copies of the parallel subfunction
and include this only in the version run by the primary thread.
This is unlikely to be worthwhile, though.
4083
4084
4085
4086@node Implementing CRITICAL construct
4087@section Implementing CRITICAL construct
4088
4089Without a specified name,
4090
4091@smallexample
4092 void GOMP_critical_start (void);
4093 void GOMP_critical_end (void);
4094@end smallexample
4095
4096so that we don't get COPY relocations from libgomp to the main
4097application.
4098
4099With a specified name, use omp_set_lock and omp_unset_lock with
4100name being transformed into a variable declared like
4101
4102@smallexample
4103 omp_lock_t gomp_critical_user_<name> __attribute__((common))
4104@end smallexample
4105
Ideally the ABI would specify that all zero is a valid unlocked
state, and so we wouldn't need to initialize this at
startup.
4109
4110
4111
4112@node Implementing ATOMIC construct
4113@section Implementing ATOMIC construct
4114
4115The target should implement the @code{__sync} builtins.
4116
4117Failing that we could add
4118
@smallexample
  void GOMP_atomic_enter (void);
  void GOMP_atomic_exit (void);
@end smallexample
4123
4124which reuses the regular lock code, but with yet another lock
4125object private to the library.
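
As a rough illustration of the first option, the sketch below shows how
an atomic update might map onto the @code{__sync} builtins.  The
function names @code{atomic_add} and @code{atomic_mul} are hypothetical
and not part of libgomp.  The compare-and-swap loop is one common way to
handle an operation that has no dedicated builtin; it is not necessarily
what GCC emits.

```c
#include <assert.h>

/* Hypothetical lowering of
     #pragma omp atomic
     *counter += value;
   to a single __sync builtin.  Returns the value held before the
   update.  */
long
atomic_add (long *counter, long value)
{
  return __sync_fetch_and_add (counter, value);
}

/* Hypothetical lowering of an atomic multiplication, for which no
   dedicated builtin exists: recompute and retry with compare-and-swap
   until no other thread has modified *target in between.  Returns the
   value that was stored.  */
long
atomic_mul (long *target, long factor)
{
  long old_value, new_value;

  do
    {
      old_value = *(volatile long *) target;
      new_value = old_value * factor;
    }
  while (!__sync_bool_compare_and_swap (target, old_value, new_value));

  return new_value;
}
```

The lock-based @code{GOMP_atomic_enter} and @code{GOMP_atomic_exit}
fallback described above would only be needed on a target where such
builtins are unavailable.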
4126
4127
4128
4129@node Implementing FLUSH construct
4130@section Implementing FLUSH construct
4131
4132Expands to the @code{__sync_synchronize} builtin.
4133
4134
4135
4136@node Implementing BARRIER construct
4137@section Implementing BARRIER construct
4138
4139@smallexample
4140 void GOMP_barrier (void)
4141@end smallexample
4142
4143
4144@node Implementing THREADPRIVATE construct
4145@section Implementing THREADPRIVATE construct
4146
In @emph{most} cases we can map this directly to @code{__thread}, except
that OpenMP allows constructors for C++ objects.  We can either
refuse to support this (how often is it used?) or we can
implement something akin to .ctors.

Ideally, this ctor feature would be handled by extensions
to the main pthreads library.  Failing that, we can have a set
of entry points to register ctor functions to be called.
4155
4156
4157
4158@node Implementing PRIVATE clause
4159@section Implementing PRIVATE clause
4160
4161In association with a PARALLEL, or within the lexical extent
4162of a PARALLEL block, the variable becomes a local variable in
4163the parallel subfunction.
4164
4165In association with FOR or SECTIONS blocks, create a new
4166automatic variable within the current function. This preserves
4167the semantic of new variable creation.
4168
4169
4170
4171@node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
4172@section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
4173
This seems simple enough for PARALLEL blocks.  Create a private
struct for communicating between the parent and subfunction.
In the parent, copy in values for scalars and "small" structs;
copy in addresses for other TREE_ADDRESSABLE types.  In the
subfunction, copy the value into the local variable.
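
The scheme for PARALLEL blocks can be sketched in plain C as follows.
All names here (@code{struct omp_data}, @code{struct big_buffer},
@code{run_parent}) are hypothetical and chosen for illustration only;
the struct that the compiler really generates is internal.  The
subfunction returns a value only so that the sketch can be checked,
whereas a real subfunction returns @code{void}.

```c
#include <assert.h>

/* Hypothetical aggregate that is too large to copy into the
   communication struct by value.  */
struct big_buffer
{
  int values[1024];
};

/* Hypothetical communication struct built by the parent: the scalar
   is copied in by value, the large object by address.  */
struct omp_data
{
  int x;
  struct big_buffer *buffer;
};

/* Subfunction: copy the incoming values into local (private)
   variables, then work only on those copies.  */
int
subfunction (void *data)
{
  struct omp_data *d = data;
  int x = d->x;                          /* private copy of the scalar */
  struct big_buffer local = *d->buffer;  /* private copy made via the address */

  x += 1;
  local.values[0] += x;
  return local.values[0];
}

/* Parent: fill in the struct and call the subfunction.  */
int
run_parent (int x_init)
{
  struct big_buffer shared = { { 10 } };
  struct omp_data data = { x_init, &shared };
  int result = subfunction (&data);

  /* The parent's originals are untouched by the private copies.  */
  assert (data.x == x_init && shared.values[0] == 10);
  return result;
}
```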
4179
It is not clear what to do with bare FOR or SECTIONS blocks.
The only thing I can figure is that we do something like:
4182
@smallexample
#pragma omp for firstprivate(x) lastprivate(y)
for (int i = 0; i < n; ++i)
  body;
@end smallexample

which becomes

@smallexample
@{
  int x = x, y;

  // for stuff

  if (i == n)
    y = y;
@}
@end smallexample
4201
4202where the "x=x" and "y=y" assignments actually have different
4203uids for the two variables, i.e. not something you could write
4204directly in C. Presumably this only makes sense if the "outer"
4205x and y are global variables.
4206
4207COPYPRIVATE would work the same way, except the structure
4208broadcast would have to happen via SINGLE machinery instead.
4209
4210
4211
4212@node Implementing REDUCTION clause
4213@section Implementing REDUCTION clause
4214
The private struct mentioned in the previous section should have
a pointer to an array of the type of the variable, indexed by the
thread's @var{team_id}.  The thread stores its final value into the
array, and after the barrier, the primary thread iterates over the
array to collect the values.
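
A minimal sketch of this layout for a sum reduction is shown below.
It assumes a fixed, hypothetical team size @code{NUM_THREADS} and runs
the per-thread bodies one after another as a sequential stand-in for a
real team; the names @code{struct reduction_data}, @code{thread_body},
@code{collect} and @code{reduce_sum} are illustrative and not part of
libgomp.

```c
#define NUM_THREADS 4   /* hypothetical team size */

/* Hypothetical communication struct: a pointer to an array with one
   slot per thread, indexed by the thread's team id.  */
struct reduction_data
{
  long *partial;
};

/* Work done by one thread: accumulate privately over its share of the
   input, then store the final value into its own slot.  */
void
thread_body (struct reduction_data *data, int team_id,
             const long *input, int n)
{
  long sum = 0;

  for (int i = team_id; i < n; i += NUM_THREADS)
    sum += input[i];
  data->partial[team_id] = sum;
}

/* What the primary thread does after the barrier: iterate over the
   array to collect the values.  */
long
collect (const struct reduction_data *data)
{
  long total = 0;

  for (int t = 0; t < NUM_THREADS; t++)
    total += data->partial[t];
  return total;
}

long
reduce_sum (const long *input, int n)
{
  long slots[NUM_THREADS] = { 0 };
  struct reduction_data data = { slots };

  /* Sequential stand-in for the team; the end of this loop plays the
     role of the barrier.  */
  for (int t = 0; t < NUM_THREADS; t++)
    thread_body (&data, t, input, n);

  return collect (&data);
}
```

Because each thread writes only to its own slot, no locking is needed
until the primary thread reads the array after the barrier.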
4220
4221
4222@node Implementing PARALLEL construct
4223@section Implementing PARALLEL construct
4224
4225@smallexample
4226 #pragma omp parallel
4227 @{
4228 body;
4229 @}
4230@end smallexample
4231
4232becomes
4233
4234@smallexample
4235 void subfunction (void *data)
4236 @{
4237 use data;
4238 body;
4239 @}
4240
4241 setup data;
4242 GOMP_parallel_start (subfunction, &data, num_threads);
4243 subfunction (&data);
4244 GOMP_parallel_end ();
4245@end smallexample
4246
4247@smallexample
4248 void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
4249@end smallexample
4250
4251The @var{FN} argument is the subfunction to be run in parallel.
4252
The @var{DATA} argument is a pointer to a structure used to
communicate data in and out of the subfunction, as discussed
above with respect to FIRSTPRIVATE et al.
4256
4257The @var{NUM_THREADS} argument is 1 if an IF clause is present
4258and false, or the value of the NUM_THREADS clause, if
4259present, or 0.
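
The rule for @var{NUM_THREADS} can be written out as a small helper.
This is only a sketch of the rule as stated above: the function
@code{num_threads_argument} and its parameters are hypothetical, since
in practice the compiler computes this value inline when it expands the
construct.

```c
#include <stdbool.h>

/* Compute the value passed as the NUM_THREADS argument:
   1 if an IF clause is present and false, else the value of the
   NUM_THREADS clause if present, else 0.  */
unsigned
num_threads_argument (bool has_if_clause, bool if_value,
                      bool has_num_threads_clause,
                      unsigned num_threads_value)
{
  if (has_if_clause && !if_value)
    return 1;
  if (has_num_threads_clause)
    return num_threads_value;
  return 0;
}
```

Note that a false IF clause takes precedence over a NUM_THREADS clause,
so the region then runs with a single thread.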
4260
4261The function needs to create the appropriate number of
4262threads and/or launch them from the dock. It needs to
4263create the team structure and assign team ids.
4264
4265@smallexample
4266 void GOMP_parallel_end (void)
4267@end smallexample
4268
4269Tears down the team and returns us to the previous @code{omp_in_parallel()} state.
4270
4271
4272
4273@node Implementing FOR construct
4274@section Implementing FOR construct
4275
4276@smallexample
4277 #pragma omp parallel for
4278 for (i = lb; i <= ub; i++)
4279 body;
4280@end smallexample
4281
4282becomes
4283
@smallexample
  void subfunction (void *data)
  @{
    long _s0, _e0;
    while (GOMP_loop_static_next (&_s0, &_e0))
    @{
      long _e1 = _e0, i;
      for (i = _s0; i < _e1; i++)
        body;
    @}
    GOMP_loop_end_nowait ();
  @}

  GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
  subfunction (NULL);
  GOMP_parallel_end ();
@end smallexample
4301
4302@smallexample
4303 #pragma omp for schedule(runtime)
4304 for (i = 0; i < n; i++)
4305 body;
4306@end smallexample
4307
4308becomes
4309
@smallexample
  @{
    long i, _s0, _e0;
    if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
      do @{
        long _e1 = _e0;
        for (i = _s0; i < _e1; i++)
          body;
      @} while (GOMP_loop_runtime_next (&_s0, &_e0));
    GOMP_loop_end ();
  @}
@end smallexample
4322
Note that while it looks like there is trickiness to propagating
a non-constant STEP, there isn't really.  We're explicitly allowed
to evaluate it as many times as we want, and any variables involved
should automatically be handled as PRIVATE or SHARED like any other
variables.  So the expression should remain evaluable in the
subfunction.  We can also pull it into a local variable if we like,
but since it's supposed to remain unchanged, we need not.
4330
4331If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
4332able to get away with no work-sharing context at all, since we can
4333simply perform the arithmetic directly in each thread to divide up
4334the iterations. Which would mean that we wouldn't need to call any
4335of these routines.
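
The arithmetic in question could look like the following sketch, which
splits @code{n} iterations into one contiguous block per thread.  The
helper @code{static_chunk} is hypothetical, and the particular split
(the first @code{n % nthreads} threads receive one extra iteration) is
an assumption made for illustration; any deterministic division would
serve the same purpose.

```c
/* Compute the half-open iteration range [*start, *end) that thread
   THREAD_ID of NTHREADS executes out of N iterations.  Assumes
   N >= 0, NTHREADS > 0 and 0 <= THREAD_ID < NTHREADS.  */
void
static_chunk (long n, int nthreads, int thread_id,
              long *start, long *end)
{
  long q = n / nthreads;   /* iterations every thread gets */
  long r = n % nthreads;   /* leftover, given to the first R threads */

  *start = thread_id * q + (thread_id < r ? thread_id : r);
  *end = *start + q + (thread_id < r ? 1 : 0);
}
```

With 10 iterations and 4 threads this yields the ranges [0, 3),
[3, 6), [6, 8) and [8, 10).  Each thread needs only @code{n}, the
team size and its own id, so no shared work-sharing state is required.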
4336
4337There are separate routines for handling loops with an ORDERED
4338clause. Bookkeeping for that is non-trivial...
4339
4340
4341
4342@node Implementing ORDERED construct
4343@section Implementing ORDERED construct
4344
4345@smallexample
4346 void GOMP_ordered_start (void)
4347 void GOMP_ordered_end (void)
4348@end smallexample
4349
4350
4351
4352@node Implementing SECTIONS construct
4353@section Implementing SECTIONS construct
4354
A block like
4356
4357@smallexample
4358 #pragma omp sections
4359 @{
4360 #pragma omp section
4361 stmt1;
4362 #pragma omp section
4363 stmt2;
4364 #pragma omp section
4365 stmt3;
4366 @}
4367@end smallexample
4368
4369becomes
4370
4371@smallexample
4372 for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
4373 switch (i)
4374 @{
4375 case 1:
4376 stmt1;
4377 break;
4378 case 2:
4379 stmt2;
4380 break;
4381 case 3:
4382 stmt3;
4383 break;
4384 @}
4385 GOMP_barrier ();
4386@end smallexample
4387
4388
4389@node Implementing SINGLE construct
4390@section Implementing SINGLE construct
4391
4392A block like
4393
4394@smallexample
4395 #pragma omp single
4396 @{
4397 body;
4398 @}
4399@end smallexample
4400
4401becomes
4402
4403@smallexample
4404 if (GOMP_single_start ())
4405 body;
4406 GOMP_barrier ();
4407@end smallexample
4408
4409while
4410
4411@smallexample
4412 #pragma omp single copyprivate(x)
4413 body;
4414@end smallexample
4415
4416becomes
4417
4418@smallexample
4419 datap = GOMP_single_copy_start ();
4420 if (datap == NULL)
4421 @{
4422 body;
4423 data.x = x;
4424 GOMP_single_copy_end (&data);
4425 @}
4426 else
4427 x = datap->x;
4428 GOMP_barrier ();
4429@end smallexample
4430
4431
4432
4433@node Implementing OpenACC's PARALLEL construct
4434@section Implementing OpenACC's PARALLEL construct
4435
4436@smallexample
4437 void GOACC_parallel ()
4438@end smallexample
4439
4440
4441
@c ---------------------------------------------------------------------
@c Reporting Bugs
@c ---------------------------------------------------------------------
4445
4446@node Reporting Bugs
4447@chapter Reporting Bugs
4448
Bugs in the GNU Offloading and Multi Processing Runtime Library should
be reported via @uref{https://gcc.gnu.org/bugzilla/, Bugzilla}.  Please add
"openacc", or "openmp", or both to the keywords field in the bug
report, as appropriate.
4453
4454
4455
4456@c ---------------------------------------------------------------------
4457@c GNU General Public License
4458@c ---------------------------------------------------------------------
4459
@include gpl_v3.texi
4461
4462
4463
4464@c ---------------------------------------------------------------------
4465@c GNU Free Documentation License
4466@c ---------------------------------------------------------------------
4467
4468@include fdl.texi
4469
4470
4471
4472@c ---------------------------------------------------------------------
4473@c Funding Free Software
4474@c ---------------------------------------------------------------------
4475
4476@include funding.texi
4477
4478@c ---------------------------------------------------------------------
4479@c Index
4480@c ---------------------------------------------------------------------
4481
@node Library Index
@unnumbered Library Index
4484
4485@printindex cp
4486
4487@bye