Commit | Line | Data |
---|---|---|
1 | .. |
2 | Copyright 1988-2022 Free Software Foundation, Inc. | |
3 | This is part of the GCC manual. | |
4 | For copying conditions, see the copyright.rst file. | |
5 | ||
6 | .. _nvptx: | |
7 | ||
8 | nvptx | |
9 | ***** | |
10 | ||
11 | On the hardware side, there is the hierarchy (fine to coarse): | |
12 | ||
13 | * thread | |
14 | ||
15 | * warp | |
16 | ||
17 | * thread block | |
18 | ||
19 | * streaming multiprocessor | |
20 | ||
21 | All OpenMP and OpenACC levels are used, i.e. | |
22 | ||
23 | * OpenMP's simd and OpenACC's vector map to threads | |
24 | ||
25 | * OpenMP's threads ('parallel') and OpenACC's workers map to warps | |
26 | ||
27 | * OpenMP's teams and OpenACC's gangs use a threadpool with the | |
28 | size of the number of teams or gangs, respectively. | |
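The mapping above can be illustrated with a combined OpenMP construct that exercises all three levels at once. This is a minimal sketch, not code from libgomp: the function name ``saxpy_offload`` and its parameters are hypothetical, and how iterations are actually distributed is left to the compiler and runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of libgomp): y[i] = a * x[i] + y[i].  */
void
saxpy_offload (size_t n, float a, const float *x, float *y)
{
  assert (x != NULL && y != NULL);

  /* 'teams' sets the size of the team pool, 'parallel' creates the OpenMP
     threads (warps on nvptx, per the list above), and 'simd' exposes the
     innermost level (threads on nvptx).  Without -fopenmp, or without an
     offload device, the same loop simply runs on the host.  */
  #pragma omp target teams distribute parallel for simd \
      map(to: x[0:n]) map(tofrom: y[0:n])
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

Calling ``saxpy_offload (n, 2.0f, x, y)`` from host code gives the same result whether or not the region is offloaded; only the execution resources differ.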
29 | ||
30 | The sizes used are: | |
31 | ||
32 | * The ``warp_size`` is always 32 | |
33 | ||
34 | * CUDA kernel launched: ``dim={#teams,1,1}, blocks={#threads,warp_size,1}``. | |
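As a worked example of the launch parameters listed above, the following sketch fills in the two triples for a given number of teams and threads. The type ``struct launch_geometry`` and both function names are hypothetical and exist only for illustration; they do not correspond to any libgomp interface.

```c
#include <assert.h>

#define WARP_SIZE 32u   /* Always 32, as stated above.  */

/* Hypothetical container for the two triples shown in the launch line.  */
struct launch_geometry
{
  unsigned dim[3];      /* {#teams, 1, 1}  */
  unsigned blocks[3];   /* {#threads, warp_size, 1}  */
};

/* Hypothetical helper: build the triples for the given sizes.  */
struct launch_geometry
nvptx_launch_geometry (unsigned num_teams, unsigned num_threads)
{
  assert (num_teams > 0 && num_threads > 0);
  struct launch_geometry g
    = { { num_teams, 1, 1 }, { num_threads, WARP_SIZE, 1 } };
  return g;
}

/* Product of the 'blocks' triple, i.e. #threads * warp_size.  */
unsigned
launch_block_product (const struct launch_geometry *g)
{
  return g->blocks[0] * g->blocks[1] * g->blocks[2];
}
```

For instance, 4 teams with 8 OpenMP threads each would give ``dim={4,1,1}, blocks={8,32,1}``, so the ``blocks`` triple multiplies out to 256.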
35 | ||
36 | Additional information can be obtained by setting the environment variable | |
37 | ``GOMP_DEBUG=1`` (very verbose; grep for ``kernel.*launch`` to find the launch | |
38 | parameters). | |
39 | ||
40 | GCC generates generic PTX ISA code, which is just-in-time compiled by CUDA. | |
41 | CUDA caches the JIT result in the user's directory (see the CUDA documentation; | |
42 | the cache can be tuned with the environment variables ``CUDA_CACHE_{DISABLE,MAXSIZE,PATH}``). | |
43 | ||
44 | Note: While PTX ISA is generic, the ``-mptx=`` and ``-march=`` command-line | |
45 | options still affect the PTX ISA code that is used and, thus, the requirements on | |
46 | the CUDA version and hardware. | |
47 | ||
48 | Implementation remarks: | |
49 | ||
50 | * I/O within OpenMP target regions and OpenACC parallel/kernels regions is supported | |
51 | using the C library ``printf`` functions. Note that the Fortran | |
52 | ``print`` / ``write`` statements are not yet supported. | |
53 | ||
54 | * Compiling OpenMP code that contains ``requires reverse_offload`` | |
55 | requires at least ``-march=sm_35``; compiling for ``-march=sm_30`` | |
56 | is not supported. | |
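The first remark above, ``printf`` inside a target region, can be sketched as follows. The function name ``report_from_target`` is hypothetical; the order in which device output appears relative to host output is not specified here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print n lines from inside a target region and
   return how many iterations ran.  */
int
report_from_target (int n)
{
  assert (n >= 0);
  int visited = 0;

  /* 'visited' is mapped back to the host so the caller can check it.
     Without an offload device the region falls back to the host.  */
  #pragma omp target map(tofrom: visited)
  {
    for (int i = 0; i < n; i++)
      {
        printf ("target region: i = %d\n", i);
        visited++;
      }
  }
  return visited;
}
```

A Fortran equivalent using ``print`` or ``write`` inside the region would not work on this target, as noted above.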
57 | ||
58 | .. - | |
59 | The libgomp ABI | |
3ed1b4ce | 60 | - |