..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. program:: Nvidia PTX

.. index:: Nvidia PTX options, nvptx options

.. _nvidia-ptx-options:

Nvidia PTX Options
^^^^^^^^^^^^^^^^^^

These options are defined for Nvidia PTX:

.. option:: -m64

  Ignored, but preserved for backward compatibility.  Only the 64-bit ABI is
  supported.

.. option:: -march={architecture-string}

  Generate code for the specified PTX ISA target architecture
  (e.g. :samp:`sm_35`).  Valid architecture strings are :samp:`sm_30`,
  :samp:`sm_35`, :samp:`sm_53`, :samp:`sm_70`, :samp:`sm_75` and
  :samp:`sm_80`.
  The default depends on how the compiler has been configured; see
  :option:`--with-arch`.

  This option sets the value of the preprocessor macro
  ``__PTX_SM__``; for instance, for :samp:`sm_35`, it has the value
  :samp:`350`.
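
  As a minimal sketch of how source code might test this macro, the
  following guards on ``__PTX_SM__`` so that the same file still compiles
  where the macro is not defined.  The function names are hypothetical and
  chosen for illustration only.

  .. code-block:: c

    /* Return the value of __PTX_SM__, or 0 when the macro is not
       defined (for example, when built by a compiler for another
       target).  */
    static int
    ptx_sm_or_zero (void)
    {
    #ifdef __PTX_SM__
      return __PTX_SM__;
    #else
      return 0;
    #endif
    }

    /* Nonzero if the code was compiled for sm_35 or a later
       architecture, i.e. __PTX_SM__ is at least 350.  */
    static int
    have_sm_35 (void)
    {
      return ptx_sm_or_zero () >= 350;
    }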

.. option:: -misa={architecture-string}

  Alias of :option:`-march=`.

.. option:: -march-map={architecture-string}

  Select the closest available :option:`-march=` value that is not more
  capable.  For instance, for :option:`-march-map=sm_50` select
  :option:`-march=sm_35`, and for :option:`-march-map=sm_53` select
  :option:`-march=sm_53`.
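
  The selection rule can be modeled roughly as follows.  This is an
  illustrative sketch, not the compiler's implementation: it assumes that
  "not more capable" can be approximated by comparing the numeric part of
  the architecture string, and the names ``valid_sm`` and ``march_map``
  are hypothetical.

  .. code-block:: c

    #include <stddef.h>

    /* Numeric parts of the architecture strings accepted by -march=,
       in ascending order.  */
    static const int valid_sm[] = { 30, 35, 53, 70, 75, 80 };

    /* Return the largest valid architecture number that does not
       exceed REQUESTED, or -1 if REQUESTED is below every valid
       value.  */
    static int
    march_map (int requested)
    {
      int best = -1;
      for (size_t i = 0; i < sizeof valid_sm / sizeof valid_sm[0]; i++)
        if (valid_sm[i] <= requested)
          best = valid_sm[i];
      return best;
    }

  Under this model, ``march_map (50)`` yields 35 and ``march_map (53)``
  yields 53, matching the two examples above.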

.. option:: -mptx={version-string}

  Generate code for the specified PTX ISA version (e.g. :samp:`7.0`).
  Valid version strings include :samp:`3.1`, :samp:`6.0`, :samp:`6.3`, and
  :samp:`7.0`.  The default PTX ISA version is 6.0, unless a higher
  version is required for the PTX ISA target architecture specified via
  option :option:`-march=`.

  This option sets the values of the preprocessor macros
  ``__PTX_ISA_VERSION_MAJOR__`` and ``__PTX_ISA_VERSION_MINOR__``;
  for instance, for :samp:`3.1` the macros have the values :samp:`3` and
  :samp:`1`, respectively.
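
  One possible way to use the two macros together is a "version at
  least" check, sketched below.  The helper names are hypothetical; the
  fallback of returning 0 when the macros are not defined is an assumption
  made so that the sketch also compiles for other targets.

  .. code-block:: c

    /* Nonzero if version HAVE_MAJOR.HAVE_MINOR is at least
       MAJOR.MINOR.  */
    static int
    version_at_least (int have_major, int have_minor, int major, int minor)
    {
      return (have_major > major
              || (have_major == major && have_minor >= minor));
    }

    /* Nonzero if the PTX ISA version selected at compile time is at
       least MAJOR.MINOR; 0 when the macros are not defined.  */
    static int
    ptx_isa_at_least (int major, int minor)
    {
    #if defined (__PTX_ISA_VERSION_MAJOR__) \
        && defined (__PTX_ISA_VERSION_MINOR__)
      return version_at_least (__PTX_ISA_VERSION_MAJOR__,
                               __PTX_ISA_VERSION_MINOR__, major, minor);
    #else
      (void) major;
      (void) minor;
      return 0;
    #endif
    }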

.. option:: -mmainkernel

  Link in code for a ``__main`` kernel.  This is for stand-alone execution
  rather than offloading execution.

.. option:: -moptimize

  Apply partitioned execution optimizations.  This is the default when any
  level of optimization is selected.

.. option:: -msoft-stack

  Generate code that does not use ``.local`` memory
  directly for stack storage.  Instead, a per-warp stack pointer is
  maintained explicitly.  This enables variable-length stack allocation (with
  variable-length arrays or ``alloca``), and when global memory is used for
  the underlying storage, makes it possible to access automatic variables from
  other threads, or with atomic instructions.  This code generation variant is
  used for OpenMP offloading, but the option is exposed on its own for the
  purpose of testing the compiler; to generate code suitable for linking into
  programs using OpenMP offloading, use option :option:`-mgomp`.
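
  For illustration, the following function performs the kind of
  variable-length stack allocation mentioned above, using a
  variable-length array.  It is ordinary C and only a sketch; the function
  name is hypothetical, and nothing here is specific to the PTX back end.

  .. code-block:: c

    /* Sum the squares of 0 .. N-1, using a variable-length array as
       scratch storage.  The array size is only known at run time, so
       the stack allocation is variable-length.  */
    static long
    sum_squares (int n)
    {
      long scratch[n > 0 ? n : 1];
      long total = 0;

      for (int i = 0; i < n; i++)
        scratch[i] = (long) i * i;
      for (int i = 0; i < n; i++)
        total += scratch[i];
      return total;
    }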

.. option:: -muniform-simt

  Switch to a code generation variant that allows all threads in each warp to
  execute, while maintaining memory state and side effects as if only one
  thread in each warp were active outside of OpenMP SIMD regions.  All atomic
  operations and calls to the runtime (``malloc``, ``free``, ``vprintf``) are
  executed conditionally (if and only if the current lane index equals the
  master lane index), and the register being assigned is copied via a shuffle
  instruction from the master lane.  Outside of SIMD regions, lane 0 is the
  master; inside, each thread sees itself as the master.  The shared memory
  array ``int __nvptx_uni[]`` stores all-zeros or all-ones bitmasks for each
  warp, indicating the current mode (0 outside of SIMD regions).  Each thread
  can bitwise-and the bitmask at position ``tid.y`` with the current lane
  index to compute the master lane index.
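
  The master lane computation can be modeled on the host as shown below.
  This is a sketch of the arithmetic only, not the PTX that the compiler
  emits; the function names are hypothetical, and ``unsigned`` is used
  here in place of ``int`` purely to keep the bitwise operations simple.

  .. code-block:: c

    /* UNI_MASK models one warp's entry in __nvptx_uni: all zeros
       outside of SIMD regions, all ones inside.  The master lane is
       the bitwise AND of that mask with the current lane index.  */
    static unsigned
    master_lane (unsigned uni_mask, unsigned lane)
    {
      return uni_mask & lane;
    }

    /* Nonzero if LANE is the master lane, i.e. the lane that executes
       atomic operations and runtime calls.  */
    static int
    is_master (unsigned uni_mask, unsigned lane)
    {
      return master_lane (uni_mask, lane) == lane;
    }

  With an all-zeros mask every lane computes 0 as the master, so only
  lane 0 passes the check; with an all-ones mask each lane computes its
  own index and therefore sees itself as the master.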

.. option:: -mgomp

  Generate code for use in OpenMP offloading: enables :option:`-msoft-stack` and
  :option:`-muniform-simt` options, and selects the corresponding multilib
  variant.