@code{omp_null_allocator} is returned.
The predefined memory spaces and available traits can be found at
-@ref{OMP_ALLOCATOR}, where the trait names have to be prefixed by
+@ref{Memory allocation}, where the trait names have to be prefixed by
@code{omp_atk_} (e.g. @code{omp_atk_pinned}) and the named trait values by
@code{omp_atv_} (e.g. @code{omp_atv_true}); additionally, @code{omp_atv_default}
may be used as trait value to specify that the default value should be used.
@end multitable
@item @emph{See also}:
-@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_destroy_allocator}
+@ref{Memory allocation}, @ref{OMP_ALLOCATOR}, @ref{omp_destroy_allocator}
@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.2
or a predefined memory space followed by a colon and a comma-separated list
of memory trait and value pairs, separated by @code{=}.
+See @ref{Memory allocation} for a list of supported predefined allocators,
+memory spaces, and traits.
+
Note: The corresponding device environment variables are currently not
supported. Therefore, the non-host @var{def-allocator-var} ICVs are always
initialized to @code{omp_default_mem_alloc}. However, on all devices,
the @code{omp_set_default_allocator} API routine can be used to change
value.
-@multitable @columnfractions .45 .45
-@headitem Predefined allocators @tab Associated predefined memory spaces
-@item omp_default_mem_alloc @tab omp_default_mem_space
-@item omp_large_cap_mem_alloc @tab omp_large_cap_mem_space
-@item omp_const_mem_alloc @tab omp_const_mem_space
-@item omp_high_bw_mem_alloc @tab omp_high_bw_mem_space
-@item omp_low_lat_mem_alloc @tab omp_low_lat_mem_space
-@item omp_cgroup_mem_alloc @tab omp_low_lat_mem_space (implementation defined)
-@item omp_pteam_mem_alloc @tab omp_low_lat_mem_space (implementation defined)
-@item omp_thread_mem_alloc @tab omp_low_lat_mem_space (implementation defined)
-@item ompx_gnu_pinned_mem_alloc @tab omp_default_mem_space (GNU extension)
-@end multitable
-
-The predefined allocators use the default values for the traits,
-as listed below. Except that the last three allocators have the
-@code{access} trait set to @code{cgroup}, @code{pteam}, and
-@code{thread}, respectively.
-
-@multitable @columnfractions .25 .40 .25
-@headitem Trait @tab Allowed values @tab Default value
-@item @code{sync_hint} @tab @code{contended}, @code{uncontended},
- @code{serialized}, @code{private}
- @tab @code{contended}
-@item @code{alignment} @tab Positive integer being a power of two
- @tab 1 byte
-@item @code{access} @tab @code{all}, @code{cgroup},
- @code{pteam}, @code{thread}
- @tab @code{all}
-@item @code{pool_size} @tab Positive integer
- @tab See @ref{Memory allocation}
-@item @code{fallback} @tab @code{default_mem_fb}, @code{null_fb},
- @code{abort_fb}, @code{allocator_fb}
- @tab See below
-@item @code{fb_data} @tab @emph{unsupported as it needs an allocator handle}
- @tab (none)
-@item @code{pinned} @tab @code{true}, @code{false}
- @tab See below
-@item @code{partition} @tab @code{environment}, @code{nearest},
- @code{blocked}, @code{interleaved}
- @tab @code{environment}
-@end multitable
-
-For the @code{fallback} trait, the default value is @code{null_fb} for the
-@code{omp_default_mem_alloc} allocator and any allocator that is associated
-with device memory; for all other allocators, it is @code{default_mem_fb}
-by default.
-
-For the @code{pinned} trait, the default value is @code{true} for
-predefined allocator @code{ompx_gnu_pinned_mem_alloc} (a GNU extension), and
-@code{false} for all others.
-
Examples:
@smallexample
OMP_ALLOCATOR=omp_high_bw_mem_alloc
on the current device. It copies @var{bytes} bytes of data from the device
address, specified by @var{data_dev_src}, to the device address
@var{data_dev_dest}. The @code{_async} version performs the transfer
-asnychronously using the queue associated with @var{async_arg}.
+asynchronously using the queue associated with @var{async_arg}.
@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@tab See @code{-march=} in ``Nvidia PTX Options''
@end multitable
+
@node Memory allocation
@section Memory allocation
@code{_Alignof} and C++'s @code{alignof}.
@end itemize
-For the available predefined allocators and, as applicable, their associated
-predefined memory spaces and for the available traits and their default values,
-see @ref{OMP_ALLOCATOR}. Predefined allocators without an associated memory
-space use the @code{omp_default_mem_space} memory space. See additionally
-@ref{Offload-Target Specifics}.
+GCC supports the following predefined allocators and predefined memory spaces:
+
+@multitable @columnfractions .45 .45
+@headitem Predefined allocators @tab Associated predefined memory spaces
+@item omp_default_mem_alloc @tab omp_default_mem_space
+@item omp_large_cap_mem_alloc @tab omp_large_cap_mem_space
+@item omp_const_mem_alloc @tab omp_const_mem_space
+@item omp_high_bw_mem_alloc @tab omp_high_bw_mem_space
+@item omp_low_lat_mem_alloc @tab omp_low_lat_mem_space
+@item omp_cgroup_mem_alloc @tab omp_low_lat_mem_space (implementation defined)
+@item omp_pteam_mem_alloc @tab omp_low_lat_mem_space (implementation defined)
+@item omp_thread_mem_alloc @tab omp_low_lat_mem_space (implementation defined)
+@item ompx_gnu_pinned_mem_alloc @tab omp_default_mem_space (GNU extension)
+@end multitable
+
+Each predefined allocator, including @code{omp_null_allocator}, has a corresponding
+allocator class template that meets the C++ allocator completeness requirements.
+These are located in the @code{omp::allocator} namespace, and in the
+@code{ompx::allocator} namespace for GNU extensions. This allows the
+allocator-aware C++ standard library containers to use OpenMP allocation routines;
+for instance:
+
+@smallexample
+std::vector<int, omp::allocator::cgroup_mem<int>> vec;
+@end smallexample
+
+The following allocator templates are supported:
+
+@multitable @columnfractions .45 .45
+@headitem Predefined allocators @tab Associated allocator template
+@item omp_null_allocator @tab omp::allocator::null_allocator
+@item omp_default_mem_alloc @tab omp::allocator::default_mem
+@item omp_large_cap_mem_alloc @tab omp::allocator::large_cap_mem
+@item omp_const_mem_alloc @tab omp::allocator::const_mem
+@item omp_high_bw_mem_alloc @tab omp::allocator::high_bw_mem
+@item omp_low_lat_mem_alloc @tab omp::allocator::low_lat_mem
+@item omp_cgroup_mem_alloc @tab omp::allocator::cgroup_mem
+@item omp_pteam_mem_alloc @tab omp::allocator::pteam_mem
+@item omp_thread_mem_alloc @tab omp::allocator::thread_mem
+@item ompx_gnu_pinned_mem_alloc @tab ompx::allocator::gnu_pinned_mem
+@end multitable
+
+The following traits are available when constructing a new allocator;
+if a trait is not specified or is specified with the value @code{default},
+the default value listed below is used for that trait. The predefined
+allocators use the default values of each trait, except that the
+@code{omp_cgroup_mem_alloc}, @code{omp_pteam_mem_alloc}, and
+@code{omp_thread_mem_alloc} allocators have the @code{access} trait
+set to @code{cgroup}, @code{pteam}, and @code{thread}, respectively.
+For each trait, a named constant prefixed by @code{omp_atk_} exists;
+for each non-numeric value, a named constant prefixed by @code{omp_atv_}
+exists.
+
+@multitable @columnfractions .25 .40 .25
+@headitem Trait @tab Allowed values @tab Default value
+@item @code{sync_hint} @tab @code{contended}, @code{uncontended},
+ @code{serialized}, @code{private}
+ @tab @code{contended}
+@item @code{alignment} @tab Positive integer being a power of two
+ @tab 1 byte
+@item @code{access} @tab @code{all}, @code{cgroup},
+ @code{pteam}, @code{thread}
+ @tab @code{all}
+@item @code{pool_size} @tab Positive integer (bytes)
+ @tab See below
+@item @code{fallback} @tab @code{default_mem_fb}, @code{null_fb},
+ @code{abort_fb}, @code{allocator_fb}
+ @tab See below
+@item @code{fb_data} @tab @emph{allocator handle}
+ @tab (none)
+@item @code{pinned} @tab @code{true}, @code{false}
+ @tab See below
+@item @code{partition} @tab @code{environment}, @code{nearest},
+ @code{blocked}, @code{interleaved}
+ @tab @code{environment}
+@end multitable
+
+For the @code{fallback} trait, the default value is @code{null_fb} for the
+@code{omp_default_mem_alloc} allocator and any allocator that is associated
+with device memory; for all other allocators, it is @code{default_mem_fb}
+by default.
+
+For the @code{pinned} trait, the default value is @code{true} for
+predefined allocator @code{ompx_gnu_pinned_mem_alloc} (a GNU extension), and
+@code{false} for all others.
+
+The following description applies to the initial device (the host) and largely
+also to non-host devices; for the latter, also see @ref{Offload-Target Specifics}.
For the memory spaces, the following applies:
@itemize
@end itemize
On Linux systems, where the @uref{https://github.com/memkind/memkind, memkind
-library} (@code{libmemkind.so.0}) is available at runtime, it is used when
-creating memory allocators requesting
+library} (@code{libmemkind.so.0}) is available at runtime and the respective
+memkind kind is supported, it is used when creating memory allocators requesting
@itemize
-@item the memory space @code{omp_high_bw_mem_space}
-@item the memory space @code{omp_large_cap_mem_space}
-@item the @code{partition} trait @code{interleaved}; note that for
- @code{omp_large_cap_mem_space} the allocation will not be interleaved
+@item the @code{partition} trait @code{interleaved} except when the memory space
+ is @code{omp_large_cap_mem_space} (uses @code{MEMKIND_HBW_INTERLEAVE})
+@item the memory space @code{omp_high_bw_mem_space} (uses
+ @code{MEMKIND_HBW_PREFERRED})
+@item the memory space @code{omp_large_cap_mem_space} (uses
+ @code{MEMKIND_DAX_KMEM_ALL} or, if not available, @code{MEMKIND_DAX_KMEM})
@end itemize
On Linux systems, where the @uref{https://github.com/numactl/numactl, numa
Additional notes regarding the traits:
@itemize
@item The @code{pinned} trait is supported on Linux hosts, but is subject to
- the OS @code{ulimit}/@code{rlimit} locked memory settings.
+ the OS @code{ulimit}/@code{rlimit} locked memory settings. It currently
+ uses @code{mmap} and is therefore optimized for few allocations, including
+ large data. If the conditions for numa or memkind allocations are
+ fulfilled, those allocators are used instead.
@item The default for the @code{pool_size} trait is no pool and for every
(re)allocation the associated library routine is called, which might
- internally use a memory pool.
+ internally use a memory pool. Currently, the same applies when a
+ @code{pool_size} has been specified, except that once allocations exceed
+ the pool size, the action of the @code{fallback} trait applies.
@item For the @code{partition} trait, the partition part size will be the same
as the requested size (i.e. @code{interleaved} or @code{blocked} has no
effect), except for @code{interleaved} when the memkind library is
that allocated the memory; on Linux, this is in particular the case when
the memory placement policy is set to preferred.
@item The @code{access} trait has no effect such that memory is always
- accessible by all threads.
+ accessible by all threads (except on supported non-host devices).
@item The @code{sync_hint} trait has no effect.
@end itemize
See also:
@ref{Offload-Target Specifics}
+
+
@c ---------------------------------------------------------------------
@c Offload-Target Specifics
@c ---------------------------------------------------------------------