1 ..
2 Copyright 1988-2022 Free Software Foundation, Inc.
3 This is part of the GCC manual.
4 For copying conditions, see the copyright.rst file.
5
6 .. index:: asm keyword, assembly language in C, inline assembly language, mixing assembly language and C
7
8 .. _using-assembly-language-with-c:
9
10 How to Use Inline Assembly Language in C Code
11 *********************************************
12
13 The ``asm`` keyword allows you to embed assembler instructions
14 within C code. GCC provides two forms of inline ``asm``
15 statements. A basic ``asm`` statement is one with no
16 operands (see :ref:`basic-asm`), while an extended ``asm``
17 statement (see :ref:`extended-asm`) includes one or more operands.
18 The extended form is preferred for mixing C and assembly language
19 within a function, but to include assembly language at
20 top level you must use basic ``asm``.
21
22 You can also use the ``asm`` keyword to override the assembler name
23 for a C symbol, or to place a C variable in a specific register.
24
25 .. toctree::
26 :maxdepth: 2
27
28
29 .. index:: basic asm, assembly language in C, basic
30
31 .. _basic-asm:
32
33 Basic Asm --- Assembler Instructions Without Operands
34 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
35
36 A basic ``asm`` statement has the following syntax:
37
38 .. code-block::
39
40 asm asm-qualifiers ( AssemblerInstructions )
41
42 For the C language, the ``asm`` keyword is a GNU extension.
43 When writing C code that can be compiled with :option:`-ansi` and the
44 :option:`-std` options that select C dialects without GNU extensions, use
45 ``__asm__`` instead of ``asm`` (see :ref:`alternate-keywords`). For
46 the C++ language, ``asm`` is a standard keyword, but ``__asm__``
47 can be used for code compiled with :option:`-fno-asm`.
48
49 Qualifiers
50 ^^^^^^^^^^
51
52 ``volatile``
53 The optional ``volatile`` qualifier has no effect.
54 All basic ``asm`` blocks are implicitly volatile.
55
56 ``inline``
57 If you use the ``inline`` qualifier, then for inlining purposes the size
58 of the ``asm`` statement is taken as the smallest size possible (see :ref:`size-of-an-asm`).
59
60 Parameters
61 ^^^^^^^^^^
62
63 :samp:`{AssemblerInstructions}`
64 This is a literal string that specifies the assembler code. The string can
65 contain any instructions recognized by the assembler, including directives.
66 GCC does not parse the assembler instructions themselves and
67 does not know what they mean or even whether they are valid assembler input.
68
69 You may place multiple assembler instructions together in a single ``asm``
70 string, separated by the characters normally used in assembly code for the
71 system. A combination that works in most places is a newline to break the
72 line, plus a tab character (written as :samp:`\\n\\t`).
73 Some assemblers allow semicolons as a line separator. However,
74 note that some assembler dialects use semicolons to start a comment.
75
76 Remarks
77 ^^^^^^^
78
79 Using extended ``asm`` (see :ref:`extended-asm`) typically produces
80 smaller, safer, and more efficient code, and in most cases it is a
81 better solution than basic ``asm``. However, there are two
82 situations where only basic ``asm`` can be used:
83
84 * Extended ``asm`` statements have to be inside a C
85 function, so to write inline assembly language at file scope ('top-level'),
86 outside of C functions, you must use basic ``asm``.
87 You can use this technique to emit assembler directives,
88 define assembly language macros that can be invoked elsewhere in the file,
89 or write entire functions in assembly language.
90 Basic ``asm`` statements outside of functions may not use any
91 qualifiers.
92
93 * Functions declared
94 with the :fn-attr:`naked` attribute also require basic ``asm``
95 (see :ref:`function-attributes`).
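
As a rough illustration of the first case, the sketch below defines an entire function at file scope with basic ``asm``. The function name ``asm_answer`` is hypothetical, and the assembly assumes an x86-64 ELF target with the default AT&T dialect; other targets fall back to plain C. The keyword is spelled ``__asm__`` so that the sketch also compiles with the strict :option:`-std` dialects.

```c
/* C declaration so that callers can use the symbol defined below.  */
int asm_answer (void);

#if defined(__x86_64__) && defined(__ELF__)
/* File-scope basic asm: no operands and no qualifiers are allowed here.
   .pushsection/.popsection restore whatever section the compiler was using.  */
__asm__ (".pushsection .text\n\t"
         ".globl asm_answer\n\t"
         ".type asm_answer, @function\n"
         "asm_answer:\n\t"
         "movl $42, %eax\n\t"   /* basic asm: a single % names a register */
         "ret\n\t"
         ".size asm_answer, .-asm_answer\n\t"
         ".popsection");
#else
/* Plain C stand-in for targets the assembly above does not cover.  */
int asm_answer (void) { return 42; }
#endif
```

Calling ``asm_answer ()`` from C should then return 42 on either path.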
96
97 Safely accessing C data and calling functions from basic ``asm`` is more
98 complex than it may appear. To access C data, it is better to use extended
99 ``asm``.
100
101 Do not expect a sequence of ``asm`` statements to remain perfectly
102 consecutive after compilation. If certain instructions need to remain
103 consecutive in the output, put them in a single multi-instruction ``asm``
104 statement. Note that GCC's optimizers can move ``asm`` statements
105 relative to other code, including across jumps.
106
107 ``asm`` statements may not perform jumps into other ``asm`` statements.
108 GCC does not know about these jumps, and therefore cannot take
109 account of them when deciding how to optimize. Jumps from ``asm`` to C
110 labels are only supported in extended ``asm``.
111
112 Under certain circumstances, GCC may duplicate (or remove duplicates of) your
113 assembly code when optimizing. This can lead to unexpected duplicate
114 symbol errors during compilation if your assembly code defines symbols or
115 labels.
116
117 .. warning::
118
119 The C standards do not specify semantics for ``asm``,
120 making it a potential source of incompatibilities between compilers. These
121 incompatibilities may not produce compiler warnings/errors.
122
123 GCC does not parse basic ``asm`` 's :samp:`{AssemblerInstructions}`, which
124 means there is no way to communicate to the compiler what is happening
125 inside them. GCC has no visibility of symbols in the ``asm`` and may
126 discard them as unreferenced. It also does not know about side effects of
127 the assembler code, such as modifications to memory or registers. Unlike
128 some compilers, GCC assumes that no changes to general purpose registers
129 occur. This assumption may change in a future release.
130
131 To avoid complications from future changes to the semantics and the
132 compatibility issues between compilers, consider replacing basic ``asm``
133 with extended ``asm``. See
134 `How to convert
135 from basic asm to extended asm <https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended>`_ for information about how to perform this
136 conversion.
137
138 The compiler copies the assembler instructions in a basic ``asm``
139 verbatim to the assembly language output file, without
140 processing dialects or any of the :samp:`%` operators that are available with
141 extended ``asm``. This results in minor differences between basic
142 ``asm`` strings and extended ``asm`` templates. For example, to refer to
143 registers you might use :samp:`%eax` in basic ``asm`` and
144 :samp:`%%eax` in extended ``asm``.
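
A minimal sketch of the extended ``asm`` side of this difference, assuming an x86 target and the default AT&T dialect. The helper name ``load_seven`` is hypothetical, and other targets take a plain C stand-in.

```c
/* Hypothetical helper: returns 7 by way of an explicitly named register.  */
static inline unsigned int load_seven (void)
{
  unsigned int value;
#if defined(__x86_64__) || defined(__i386__)
  /* In an extended asm template, %0 is an operand token, so a literal
     register name needs a doubled percent sign: %%eax.  In basic asm the
     same register would be written with a single one: %eax.  */
  __asm__ ("movl $7, %%eax\n\t"
           "movl %%eax, %0"
           : "=r" (value)
           :               /* no inputs */
           : "eax");       /* tell GCC that EAX is overwritten */
#else
  value = 7;               /* plain C stand-in on other targets */
#endif
  return value;
}
```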
145
146 On targets such as x86 that support multiple assembler dialects,
147 all basic ``asm`` blocks use the assembler dialect specified by the
148 :option:`-masm` command-line option (see :ref:`x86-options`).
149 Basic ``asm`` provides no
150 mechanism to provide different assembler strings for different dialects.
151
152 For basic ``asm`` with a non-empty assembler string, GCC assumes
153 the assembler block does not change any general purpose registers,
154 but it may read or write any globally accessible variable.
155
156 Here is an example of basic ``asm`` for i386:
157
158 .. code-block:: c++
159
160 /* Note that this code will not compile with -masm=intel */
161 #define DebugBreak() asm("int $3")
162
163 .. index:: extended asm, assembly language in C, extended
164
165 .. _extended-asm:
166
167 Extended Asm - Assembler Instructions with C Expression Operands
168 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
169
170 With extended ``asm`` you can read and write C variables from
171 assembler and perform jumps from assembler code to C labels.
172 Extended ``asm`` syntax uses colons (:samp:`:`) to delimit
173 the operand parameters after the assembler template:
174
175 .. code-block::
176
177 asm asm-qualifiers ( AssemblerTemplate
178 : OutputOperands
179 [ : InputOperands
180 [ : Clobbers ] ])
181
182 asm asm-qualifiers ( AssemblerTemplate
183 : OutputOperands
184 : InputOperands
185 : Clobbers
186 : GotoLabels)
187
188 In the last form, :samp:`{asm-qualifiers}` contains ``goto``; in the
189 first form, it does not.
190
191 The ``asm`` keyword is a GNU extension.
192 When writing code that can be compiled with :option:`-ansi` and the
193 various :option:`-std` options, use ``__asm__`` instead of
194 ``asm`` (see :ref:`alternate-keywords`).
195
196 Qualifiers
197 ^^^^^^^^^^
198
199 ``volatile``
200 The typical use of extended ``asm`` statements is to manipulate input
201 values to produce output values. However, your ``asm`` statements may
202 also produce side effects. If so, you may need to use the ``volatile``
203 qualifier to disable certain optimizations. See :ref:`volatile`.
204
205 ``inline``
206 If you use the ``inline`` qualifier, then for inlining purposes the size
207 of the ``asm`` statement is taken as the smallest size possible
208 (see :ref:`size-of-an-asm`).
209
210 ``goto``
211 This qualifier informs the compiler that the ``asm`` statement may
212 perform a jump to one of the labels listed in the :samp:`{GotoLabels}`.
213 See :ref:`gotolabels`.
214
215 Parameters
216 ^^^^^^^^^^
217
218 :samp:`{AssemblerTemplate}`
219 This is a literal string that is the template for the assembler code. It is a
220 combination of fixed text and tokens that refer to the input, output,
221 and goto parameters. See :ref:`assemblertemplate`.
222
223 :samp:`{OutputOperands}`
224 A comma-separated list of the C variables modified by the instructions in the
225 :samp:`{AssemblerTemplate}`. An empty list is permitted. See :ref:`outputoperands`.
226
227 :samp:`{InputOperands}`
228 A comma-separated list of C expressions read by the instructions in the
229 :samp:`{AssemblerTemplate}`. An empty list is permitted. See :ref:`inputoperands`.
230
231 :samp:`{Clobbers}`
232 A comma-separated list of registers or other values changed by the
233 :samp:`{AssemblerTemplate}`, beyond those listed as outputs.
234 An empty list is permitted. See :ref:`clobbers-and-scratch-registers`.
235
236 :samp:`{GotoLabels}`
237 When you are using the ``goto`` form of ``asm``, this section contains
238 the list of all C labels to which the code in the
239 :samp:`{AssemblerTemplate}` may jump.
240 See :ref:`gotolabels`.
241
242 ``asm`` statements may not perform jumps into other ``asm`` statements,
243 only to the listed :samp:`{GotoLabels}`.
244 GCC's optimizers do not know about other jumps; therefore they cannot take
245 account of them when deciding how to optimize.
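
The ``goto`` form can be sketched as follows, assuming an x86 target and the default AT&T dialect. The helper name ``is_zero`` and the label name ``zero`` are hypothetical; other targets use a plain C stand-in.

```c
/* Hypothetical helper: returns 1 if v is zero, otherwise 0.  */
static inline int is_zero (unsigned int v)
{
#if defined(__x86_64__) || defined(__i386__)
  __asm__ goto ("test %[v], %[v]\n\t"   /* sets ZF when v == 0 */
                "jz %l[zero]"           /* %l[name] expands to the C label */
                :                       /* no outputs */
                : [v] "r" (v)
                : "cc"
                : zero);                /* every label the asm may jump to */
  return 0;
zero:
  return 1;
#else
  return v == 0;                        /* plain C stand-in */
#endif
}
```

Listing ``zero`` in the final section is what lets the optimizers account for the jump.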
246
247 The total number of input + output + goto operands is limited to 30.
248
249 Remarks
250 ^^^^^^^
251
252 The ``asm`` statement allows you to include assembly instructions directly
253 within C code. This may help you to maximize performance in time-sensitive
254 code or to access assembly instructions that are not readily available to C
255 programs.
256
257 Note that extended ``asm`` statements must be inside a function. Only
258 basic ``asm`` may be outside functions (see :ref:`basic-asm`).
259 Functions declared with the :fn-attr:`naked` attribute also require basic
260 ``asm`` (see :ref:`function-attributes`).
261
262 While the uses of ``asm`` are many and varied, it may help to think of an
263 ``asm`` statement as a series of low-level instructions that convert input
264 parameters to output parameters. So a simple (if not particularly useful)
265 example for i386 using ``asm`` might look like this:
266
267 .. code-block:: c++
268
269 int src = 1;
270 int dst;
271
272 asm ("mov %1, %0\n\t"
273 "add $1, %0"
274 : "=r" (dst)
275 : "r" (src));
276
277 printf("%d\n", dst);
278
279 This code copies ``src`` to ``dst`` and adds 1 to ``dst``.
280
281 .. index:: volatile asm, asm volatile
282
283 .. _volatile:
284
285 Volatile
286 ~~~~~~~~
287
288 GCC's optimizers sometimes discard ``asm`` statements if they determine
289 there is no need for the output variables. Also, the optimizers may move
290 code out of loops if they believe that the code will always return the same
291 result (i.e. none of its input values change between calls). Using the
292 ``volatile`` qualifier disables these optimizations. ``asm`` statements
293 that have no output operands, and ``asm goto`` statements,
294 are implicitly volatile.
295
296 This i386 code demonstrates a case that does not use (or require) the
297 ``volatile`` qualifier. If it is performing assertion checking, this code
298 uses ``asm`` to perform the validation. Otherwise, ``dwRes`` is
299 unreferenced by any code. As a result, the optimizers can discard the
300 ``asm`` statement, which in turn removes the need for the entire
301 ``DoCheck`` routine. By omitting the ``volatile`` qualifier when it
302 isn't needed you allow the optimizers to produce the most efficient code
303 possible.
304
305 .. code-block:: c++
306
307 void DoCheck(uint32_t dwSomeValue)
308 {
309 uint32_t dwRes;
310
311 // Assumes dwSomeValue is not zero.
312 asm ("bsfl %1,%0"
313 : "=r" (dwRes)
314 : "r" (dwSomeValue)
315 : "cc");
316
317 assert(dwRes > 3);
318 }
319
320 The next example shows a case where the optimizers can recognize that the input
321 (``dwSomeValue``) never changes during the execution of the function and can
322 therefore move the ``asm`` outside the loop to produce more efficient code.
323 Again, using the ``volatile`` qualifier disables this type of optimization.
324
325 .. code-block:: c++
326
327 void do_print(uint32_t dwSomeValue)
328 {
329 uint32_t dwRes;
330
331 for (uint32_t x=0; x < 5; x++)
332 {
333 // Assumes dwSomeValue is not zero.
334 asm ("bsfl %1,%0"
335 : "=r" (dwRes)
336 : "r" (dwSomeValue)
337 : "cc");
338
339 printf("%u: %u %u\n", x, dwSomeValue, dwRes);
340 }
341 }
342
343 The following example demonstrates a case where you need to use the
344 ``volatile`` qualifier.
345 It uses the x86 ``rdtsc`` instruction, which reads
346 the computer's time-stamp counter. Without the ``volatile`` qualifier,
347 the optimizers might assume that the ``asm`` block will always return the
348 same value and therefore optimize away the second call.
349
350 .. code-block:: c++
351
352 uint64_t msr;
353
354 asm volatile ( "rdtsc\n\t" // Returns the time in EDX:EAX.
355 "shl $32, %%rdx\n\t" // Shift the upper bits left.
356 "or %%rdx, %0" // 'Or' in the lower bits.
357 : "=a" (msr)
358 :
359 : "rdx");
360
361 printf("msr: %llx\n", (unsigned long long) msr);
362
363 // Do other work...
364
365 // Reprint the timestamp
366 asm volatile ( "rdtsc\n\t" // Returns the time in EDX:EAX.
367 "shl $32, %%rdx\n\t" // Shift the upper bits left.
368 "or %%rdx, %0" // 'Or' in the lower bits.
369 : "=a" (msr)
370 :
371 : "rdx");
372
373 printf("msr: %llx\n", (unsigned long long) msr);
374
375 GCC's optimizers do not treat this code like the non-volatile code in the
376 earlier examples. They do not move it out of loops or omit it on the
377 assumption that the result from a previous call is still valid.
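
One way to package the ``rdtsc`` sequence is a small helper, sketched here under the assumption of an x86 target. The name ``read_tsc`` is hypothetical. This variant lets the compiler combine the two halves in C, so it also works in 32-bit mode; on other targets a simple counter stands in for the hardware register.

```c
#include <stdint.h>

/* Hypothetical helper: reads the time-stamp counter.  */
static inline uint64_t read_tsc (void)
{
#if defined(__x86_64__) || defined(__i386__)
  uint32_t lo, hi;
  /* volatile: every call must execute rdtsc again rather than reuse
     the result of an earlier call.  */
  __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
  return ((uint64_t) hi << 32) | lo;
#else
  /* Stand-in for other targets: a counter that changes on every call.  */
  static uint64_t fake_ticks;
  return ++fake_ticks;
#endif
}
```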
378
379 Note that the compiler can move even ``volatile asm`` instructions relative
380 to other code, including across jump instructions. For example, on many
381 targets there is a system register that controls the rounding mode of
382 floating-point operations. Setting it with a ``volatile asm`` statement,
383 as in the following PowerPC example, does not work reliably.
384
385 .. code-block:: c++
386
387 asm volatile("mtfsf 255, %0" : : "f" (fpenv));
388 sum = x + y;
389
390 The compiler may move the addition back before the ``volatile asm``
391 statement. To make it work as expected, add an artificial dependency to
392 the ``asm`` by referencing a variable in the subsequent code, for
393 example:
394
395 .. code-block:: c++
396
397 asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
398 sum = x + y;
399
400 Under certain circumstances, GCC may duplicate (or remove duplicates of) your
401 assembly code when optimizing. This can lead to unexpected duplicate symbol
402 errors during compilation if your ``asm`` code defines symbols or labels.
403 Using :samp:`%=`
404 (see :ref:`assemblertemplate`) may help resolve this problem.
405
406 .. index:: asm assembler template
407
408 .. _assemblertemplate:
409
410 Assembler Template
411 ~~~~~~~~~~~~~~~~~~
412
413 An assembler template is a literal string containing assembler instructions.
414 The compiler replaces tokens in the template that refer
415 to inputs, outputs, and goto labels,
416 and then outputs the resulting string to the assembler. The
417 string can contain any instructions recognized by the assembler, including
418 directives. GCC does not parse the assembler instructions
419 themselves and does not know what they mean or even whether they are valid
420 assembler input. However, it does count the statements
421 (see :ref:`size-of-an-asm`).
422
423 You may place multiple assembler instructions together in a single ``asm``
424 string, separated by the characters normally used in assembly code for the
425 system. A combination that works in most places is a newline to break the
426 line, plus a tab character to move to the instruction field (written as
427 :samp:`\\n\\t`).
428 Some assemblers allow semicolons as a line separator. However, note
429 that some assembler dialects use semicolons to start a comment.
430
431 Do not expect a sequence of ``asm`` statements to remain perfectly
432 consecutive after compilation, even when you are using the ``volatile``
433 qualifier. If certain instructions need to remain consecutive in the output,
434 put them in a single multi-instruction ``asm`` statement.
435
436 Accessing data from C programs without using input/output operands (such as
437 by using global symbols directly from the assembler template) may not work as
438 expected. Similarly, calling functions directly from an assembler template
439 requires a detailed understanding of the target assembler and ABI.
440
441 Since GCC does not parse the assembler template,
442 it has no visibility of any
443 symbols it references. This may result in GCC discarding those symbols as
444 unreferenced unless they are also listed as input, output, or goto operands.
445
446 Special format strings
447 ^^^^^^^^^^^^^^^^^^^^^^
448
449 In addition to the tokens described by the input, output, and goto operands,
450 these tokens have special meanings in the assembler template:
451
452 :samp:`%%`
453 Outputs a single :samp:`%` into the assembler code.
454
455 :samp:`%=`
456 Outputs a number that is unique to each instance of the ``asm``
457 statement in the entire compilation. This option is useful when creating local
458 labels and referring to them multiple times in a single template that
459 generates multiple assembler instructions.
460
461 :samp:`%{` :samp:`%|` :samp:`%}`
462 Outputs :samp:`{`, :samp:`|`, and :samp:`}` characters (respectively)
463 into the assembler code. When unescaped, these characters have special
464 meaning to indicate multiple assembler dialects, as described below.
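
A sketch of :samp:`%=` in use, assuming an x86 ELF target and the default AT&T dialect. The helper name ``sum_to_n`` and the label prefixes are hypothetical. Because each expansion of the template receives its own number, the function can be inlined several times without producing duplicate labels.

```c
/* Hypothetical helper: returns n + (n-1) + ... + 1 using an asm loop.  */
static inline unsigned int sum_to_n (unsigned int n)
{
  unsigned int total = 0;
#if defined(__x86_64__) || defined(__i386__)
  __asm__ ("test %[n], %[n]\n\t"        /* nothing to add when n == 0 */
           "jz .Ldone%=\n"
           ".Lloop%=:\n\t"              /* %= makes this label unique */
           "add %[n], %[total]\n\t"     /* total += n */
           "dec %[n]\n\t"               /* n -= 1, sets ZF at zero */
           "jnz .Lloop%=\n"
           ".Ldone%=:"
           : [total] "+r" (total), [n] "+r" (n)
           :
           : "cc");
#else
  for (; n != 0; n--)                   /* plain C stand-in */
    total += n;
#endif
  return total;
}
```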
465
466 Multiple assembler dialects in asm templates
467 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
468
469 On targets such as x86, GCC supports multiple assembler dialects.
470 The :option:`-masm` option controls which dialect GCC uses as its
471 default for inline assembler. The target-specific documentation for the
472 :option:`-masm` option contains the list of supported dialects, as well as the
473 default dialect if the option is not specified. This information may be
474 important to understand, since assembler code that works correctly when
475 compiled using one dialect will likely fail if compiled using another.
476 See :ref:`x86-options`.
477
478 If your code needs to support multiple assembler dialects (for example, if
479 you are writing public headers that need to support a variety of compilation
480 options), use constructs of this form:
481
482 .. code-block:: c++
483
484 { dialect0 | dialect1 | dialect2... }
485
486 This construct outputs ``dialect0``
487 when using dialect #0 to compile the code,
488 ``dialect1`` for dialect #1, etc. If there are fewer alternatives within the
489 braces than the number of dialects the compiler supports, the construct
490 outputs nothing.
491
492 For example, if an x86 compiler supports two dialects
493 (:samp:`att`, :samp:`intel`), an
494 assembler template such as this:
495
496 .. code-block:: c++
497
498 "bt{l %[Offset],%[Base] | %[Base],%[Offset]}; jc %l2"
499
500 is equivalent to one of
501
502 .. code-block:: c++
503
504 "btl %[Offset],%[Base] ; jc %l2" /* att dialect */
505 "bt %[Base],%[Offset]; jc %l2" /* intel dialect */
506
507 Using that same compiler, this code:
508
509 .. code-block:: c++
510
511 "xchg{l}\t{%%}ebx, %1"
512
513 corresponds to either
514
515 .. code-block:: c++
516
517 "xchgl\t%%ebx, %1" /* att dialect */
518 "xchg\tebx, %1" /* intel dialect */
519
520 There is no support for nesting dialect alternatives.
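
Put together, a complete statement using dialect alternatives might look like the following sketch. It assumes an x86 target; the helper name ``add_either`` is hypothetical, and other targets use a plain C stand-in. The intent is that the same source assembles under both :option:`-masm=att` and :option:`-masm=intel`.

```c
/* Hypothetical helper: returns a + b.  */
static inline unsigned int add_either (unsigned int a, unsigned int b)
{
#if defined(__x86_64__) || defined(__i386__)
  /* att dialect:   addl <b>, <a>   (source first, destination second)
     intel dialect: add  <a>, <b>   (destination first, source second)  */
  __asm__ ("add{l %[b], %[a] | %[a], %[b]}"
           : [a] "+r" (a)
           : [b] "r" (b)
           : "cc");
#else
  a += b;                               /* plain C stand-in */
#endif
  return a;
}
```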
521
522 .. index:: asm output operands
523
524 .. _outputoperands:
525
526 Output Operands
527 ~~~~~~~~~~~~~~~
528
529 An ``asm`` statement has zero or more output operands indicating the names
530 of C variables modified by the assembler code.
531
532 In this i386 example, ``old`` (referred to in the template string as
533 ``%0``) and ``*Base`` (as ``%1``) are outputs and ``Offset``
534 (``%2``) is an input:
535
536 .. code-block:: c++
537
538 bool old;
539
540 __asm__ ("btsl %2,%1\n\t" // Turn on zero-based bit #Offset in Base.
541 "sbb %0,%0" // Use the CF to calculate old.
542 : "=r" (old), "+rm" (*Base)
543 : "Ir" (Offset)
544 : "cc");
545
546 return old;
547
548 Operands are separated by commas. Each operand has this format:
549
550 .. code-block:: c++
551
552 [ [asmSymbolicName] ] constraint (cvariablename)
553
554 :samp:`{asmSymbolicName}`
555 Specifies a symbolic name for the operand.
556 Reference the name in the assembler template
557 by enclosing it in square brackets
558 (for example, :samp:`%[Value]`). The scope of the name is the ``asm`` statement
559 that contains the definition. Any valid C variable name is acceptable,
560 including names already defined in the surrounding code. No two operands
561 within the same ``asm`` statement can use the same symbolic name.
562
563 When not using an :samp:`{asmSymbolicName}`, use the (zero-based) position
564 of the operand
565 in the list of operands in the assembler template. For example if there are
566 three output operands, use :samp:`%0` in the template to refer to the first,
567 :samp:`%1` for the second, and :samp:`%2` for the third.
568
569 :samp:`{constraint}`
570 A string constant specifying constraints on the placement of the operand;
571 see :ref:`constraints` for details.
572
573 Output constraints must begin with either :samp:`=` (a variable overwriting an
574 existing value) or :samp:`+` (when reading and writing). When using
575 :samp:`=`, do not assume the location contains the existing value
576 on entry to the ``asm``, except
577 when the operand is tied to an input; see :ref:`inputoperands`.
578
579 After the prefix, there must be one or more additional constraints
580 (see :ref:`constraints`) that describe where the value resides. Common
581 constraints include :samp:`r` for register and :samp:`m` for memory.
582 When you list more than one possible location (for example, ``"=rm"``),
583 the compiler chooses the most efficient one based on the current context.
584 If you list as many alternates as the ``asm`` statement allows, you permit
585 the optimizers to produce the best possible code.
586 If you must use a specific register, but your Machine Constraints do not
587 provide sufficient control to select the specific register you want,
588 local register variables may provide a solution (see :ref:`local-register-variables`).
589
590 :samp:`{cvariablename}`
591 Specifies a C lvalue expression to hold the output, typically a variable name.
592 The enclosing parentheses are a required part of the syntax.
593
594 When the compiler selects the registers to use to
595 represent the output operands, it does not use any of the clobbered registers
596 (see :ref:`clobbers-and-scratch-registers`).
597
598 Output operand expressions must be lvalues. The compiler cannot check whether
599 the operands have data types that are reasonable for the instruction being
600 executed. For output expressions that are not directly addressable (for
601 example a bit-field), the constraint must allow a register. In that case, GCC
602 uses the register as the output of the ``asm``, and then stores that
603 register into the output.
604
605 Operands using the :samp:`+` constraint modifier count as two operands
606 (that is, both as input and output) towards the total maximum of 30 operands
607 per ``asm`` statement.
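
One way to picture why a :samp:`+` operand counts twice is to compare it with an output operand plus an input tied to it by a digit constraint (see :ref:`inputoperands`). The two hypothetical helpers below are intended to behave the same way; this is a sketch assuming an x86 target and the default AT&T dialect, not a statement about how GCC represents the operands internally.

```c
/* Read-write operand: one entry in the list, counted as two operands.  */
static inline unsigned int inc_rw (unsigned int x)
{
#if defined(__x86_64__) || defined(__i386__)
  __asm__ ("add $1, %0" : "+r" (x) : : "cc");
#else
  x += 1;                               /* plain C stand-in */
#endif
  return x;
}

/* Output plus tied input: "0" places x where operand 0 lives.  */
static inline unsigned int inc_tied (unsigned int x)
{
  unsigned int result;
#if defined(__x86_64__) || defined(__i386__)
  __asm__ ("add $1, %0" : "=r" (result) : "0" (x) : "cc");
#else
  result = x + 1;                       /* plain C stand-in */
#endif
  return result;
}
```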
608
609 Use the :samp:`&` constraint modifier (see :ref:`modifiers`) on all output
610 operands that must not overlap an input. Otherwise,
611 GCC may allocate the output operand in the same register as an unrelated
612 input operand, on the assumption that the assembler code consumes its
613 inputs before producing outputs. This assumption may be false if the assembler
614 code actually consists of more than one instruction.
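
A sketch of a template that needs :samp:`&`, assuming an x86 target and the default AT&T dialect (the helper name ``twice_plus`` is hypothetical). The first instruction writes the output while ``y`` is still waiting to be read, so the output must not share a register with ``y``.

```c
/* Hypothetical helper: returns 2 * x + y.  */
static inline unsigned int twice_plus (unsigned int x, unsigned int y)
{
  unsigned int out;
#if defined(__x86_64__) || defined(__i386__)
  __asm__ ("mov %[x], %[out]\n\t"       /* writes out before y is read */
           "add %[x], %[out]\n\t"       /* out = 2 * x */
           "add %[y], %[out]"           /* out = 2 * x + y */
           : [out] "=&r" (out)          /* & keeps out away from x and y */
           : [x] "r" (x), [y] "r" (y)
           : "cc");
#else
  out = 2 * x + y;                      /* plain C stand-in */
#endif
  return out;
}
```

Without the :samp:`&`, GCC would be free to place ``out`` and ``y`` in the same register, and the first ``mov`` would then destroy ``y``.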
615
616 The same problem can occur if one output parameter (:samp:`{a}`) allows a register
617 constraint and another output parameter (:samp:`{b}`) allows a memory constraint.
618 The code generated by GCC to access the memory address in :samp:`{b}` can contain
619 registers which *might* be shared by :samp:`{a}`, and GCC considers those
620 registers to be inputs to the asm. As above, GCC assumes that such input
621 registers are consumed before any outputs are written. This assumption may
622 result in incorrect behavior if the ``asm`` statement writes to :samp:`{a}`
623 before using
624 :samp:`{b}`. Combining the :samp:`&` modifier with the register constraint on :samp:`{a}`
625 ensures that modifying :samp:`{a}` does not affect the address referenced by
626 :samp:`{b}`. Otherwise, the location of :samp:`{b}`
627 is undefined if :samp:`{a}` is modified before using :samp:`{b}`.
628
629 ``asm`` supports operand modifiers on operands (for example :samp:`%k2`
630 instead of simply :samp:`%2`). Typically these modifiers are hardware
631 dependent. The list of supported modifiers for x86 is found at :ref:`x86operandmodifiers`.
632
633 If the C code that follows the ``asm`` makes no use of any of the output
634 operands, use ``volatile`` for the ``asm`` statement to prevent the
635 optimizers from discarding the ``asm`` statement as unneeded
636 (see :ref:`volatile`).
637
638 The following code makes no use of the optional :samp:`{asmSymbolicName}`. Therefore it
639 references the first output operand as ``%0`` (were there a second, it
640 would be ``%1``, etc). The number of the first input operand is one greater
641 than that of the last output operand. In this i386 example, that makes
642 ``Mask`` referenced as ``%1`` :
643
644 .. code-block:: c++
645
646 uint32_t Mask = 1234;
647 uint32_t Index;
648
649 asm ("bsfl %1, %0"
650 : "=r" (Index)
651 : "r" (Mask)
652 : "cc");
653
654 That code overwrites the variable ``Index`` (:samp:`=`),
655 placing the value in a register (:samp:`r`).
656 Using the generic :samp:`r` constraint instead of a constraint for a specific
657 register allows the compiler to pick the register to use, which can result
658 in more efficient code. This may not be possible if an assembler instruction
659 requires a specific register.
660
661 The following i386 example uses the :samp:`{asmSymbolicName}` syntax.
662 It produces the
663 same result as the code above, but some may consider it more readable or more
664 maintainable since reordering index numbers is not necessary when adding or
665 removing operands. The names ``aIndex`` and ``aMask``
666 are only used in this example to emphasize which
667 names get used where.
668 It is acceptable to reuse the names ``Index`` and ``Mask``.
669
670 .. code-block:: c++
671
672 uint32_t Mask = 1234;
673 uint32_t Index;
674
675 asm ("bsfl %[aMask], %[aIndex]"
676 : [aIndex] "=r" (Index)
677 : [aMask] "r" (Mask)
678 : "cc");
679
680 Here are some more examples of output operands.
681
682 .. code-block:: c++
683
684 uint32_t c = 1;
685 uint32_t d;
686 uint32_t *e = &c;
687
688 asm ("mov %[e], %[d]"
689 : [d] "=rm" (d)
690 : [e] "rm" (*e));
691
692 Here, ``d`` may either be in a register or in memory. Since the compiler
693 might already have the current value of the ``uint32_t`` location
694 pointed to by ``e``
695 in a register, you can enable it to choose the best location
696 for ``d`` by specifying both constraints.
697
698 .. index:: asm flag output operands
699
700 .. _flagoutputoperands:
701
702 Flag Output Operands
703 ~~~~~~~~~~~~~~~~~~~~
704
705 Some targets have a special register that holds the 'flags' for the
706 result of an operation or comparison. Normally, the contents of that
707 register are either unmodified by the asm, or the ``asm`` statement is
708 considered to clobber the contents.
709
710 On some targets, a special form of output operand exists by which
711 conditions in the flags register may be outputs of the asm. The set of
712 conditions supported is target specific, but the general rule is that
713 the output variable must be a scalar integer, and the value is boolean.
714 When supported, the target defines the preprocessor symbol
715 ``__GCC_ASM_FLAG_OUTPUTS__``.
716
717 Because of the special nature of the flag output operands, the constraint
718 may not include alternatives.
719
720 Most often, the target has only one flags register, which is thus an implied
721 operand of many instructions. In this case, the operand should not be
722 referenced within the assembler template via ``%0`` etc, as there's
723 no corresponding text in the assembly language.
724
725 ARM AArch64
726 The flag output constraints for the ARM family are of the form
727 :samp:`=@cc{cond}` where :samp:`{cond}` is one of the standard
728 conditions defined in the ARM ARM for ``ConditionHolds``.
729
730 ``eq``
731 Z flag set, or equal
732
733 ``ne``
734 Z flag clear or not equal
735
736 ``cs`` ``hs``
737 C flag set or unsigned greater than or equal
738
739 ``cc`` ``lo``
740 C flag clear or unsigned less than
741
742 ``mi``
743 N flag set or 'minus'
744
745 ``pl``
746 N flag clear or 'plus'
747
748 ``vs``
749 V flag set or signed overflow
750
751 ``vc``
752 V flag clear
753
754 ``hi``
755 unsigned greater than
756
757 ``ls``
758 unsigned less than or equal
759
760 ``ge``
761 signed greater than or equal
762
763 ``lt``
764 signed less than
765
766 ``gt``
767 signed greater than
768
769 ``le``
770 signed less than or equal
771
772 The flag output constraints are not supported in thumb1 mode.
773
774 x86 family
775 The flag output constraints for the x86 family are of the form
776 :samp:`=@cc{cond}` where :samp:`{cond}` is one of the standard
777 conditions defined in the ISA manual for ``jcc`` or
778 ``setcc``.
779
780 ``a``
781 'above' or unsigned greater than
782
783 ``ae``
784 'above or equal' or unsigned greater than or equal
785
786 ``b``
787 'below' or unsigned less than
788
789 ``be``
790 'below or equal' or unsigned less than or equal
791
792 ``c``
793 carry flag set
794
795 ``e`` ``z``
796 'equal' or zero flag set
797
798 ``g``
799 signed greater than
800
801 ``ge``
802 signed greater than or equal
803
804 ``l``
805 signed less than
806
807 ``le``
808 signed less than or equal
809
810 ``o``
811 overflow flag set
812
813 ``p``
814 parity flag set
815
816 ``s``
817 sign flag set
818
819 ``na`` ``nae`` ``nb`` ``nbe`` ``nc`` ``ne`` ``ng`` ``nge`` ``nl`` ``nle`` ``no`` ``np`` ``ns`` ``nz``
820 'not' :samp:`{flag}`, or inverted versions of those above
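
A sketch of a flag output operand on x86, using the ``c`` (carry) condition from the list above. The helper name ``add_carry`` is hypothetical. The code checks ``__GCC_ASM_FLAG_OUTPUTS__`` first and otherwise computes the carry in plain C. No ``"cc"`` clobber is listed here, on the assumption that the flag output already describes the use of the flags register.

```c
/* Hypothetical helper: stores a + b in *sum and returns the carry out.  */
static inline int add_carry (unsigned int a, unsigned int b,
                             unsigned int *sum)
{
  int carry;
#if defined(__GCC_ASM_FLAG_OUTPUTS__) \
    && (defined(__x86_64__) || defined(__i386__))
  /* The flag operand is not referenced in the template; the add
     instruction sets CF implicitly.  */
  __asm__ ("add %[b], %[a]"
           : "=@ccc" (carry), [a] "+r" (a)
           : [b] "r" (b));
#else
  unsigned int total = a + b;           /* wraps on overflow */
  carry = total < a;                    /* wrapped result means carry */
  a = total;
#endif
  *sum = a;
  return carry;
}
```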
821
822 .. index:: asm input operands, asm expressions
823
824 .. _inputoperands:
825
826 Input Operands
827 ~~~~~~~~~~~~~~
828
829 Input operands make values from C variables and expressions available to the
830 assembly code.
831
832 Operands are separated by commas. Each operand has this format:
833
834 .. code-block:: c++
835
836 [ [asmSymbolicName] ] constraint (cexpression)
837
838 :samp:`{asmSymbolicName}`
839 Specifies a symbolic name for the operand.
840 Reference the name in the assembler template
841 by enclosing it in square brackets
842 (for example, :samp:`%[Value]`). The scope of the name is the ``asm`` statement
843 that contains the definition. Any valid C variable name is acceptable,
844 including names already defined in the surrounding code. No two operands
845 within the same ``asm`` statement can use the same symbolic name.
846
847 When not using an :samp:`{asmSymbolicName}`, use the (zero-based) position
848 of the operand
849 in the list of operands in the assembler template. For example, if there are
850 two output operands and three inputs,
851 use :samp:`%2` in the template to refer to the first input operand,
852 :samp:`%3` for the second, and :samp:`%4` for the third.
853
854 :samp:`{constraint}`
855 A string constant specifying constraints on the placement of the operand;
856 see :ref:`constraints` for details.
857
858 Input constraint strings may not begin with either :samp:`=` or :samp:`+`.
859 When you list more than one possible location (for example, :samp:`"irm"`),
860 the compiler chooses the most efficient one based on the current context.
861 If you must use a specific register, but your Machine Constraints do not
862 provide sufficient control to select the specific register you want,
863 local register variables may provide a solution (see :ref:`local-register-variables`).
864
865 Input constraints can also be digits (for example, ``"0"``). This indicates
866 that the specified input must be in the same place as the output constraint
867 at the (zero-based) index in the output constraint list.
868 When using :samp:`{asmSymbolicName}` syntax for the output operands,
869 you may use these names (enclosed in brackets :samp:`[]`) instead of digits.
870
871 :samp:`{cexpression}`
872 This is the C variable or expression being passed to the ``asm`` statement
873 as input. The enclosing parentheses are a required part of the syntax.
874
875 When the compiler selects the registers to use to represent the input
876 operands, it does not use any of the clobbered registers
877 (see :ref:`clobbers-and-scratch-registers`).
878
879 If there are no output operands but there are input operands, place two
880 consecutive colons where the output operands would go:
881
882 .. code-block:: c++
883
884 __asm__ ("some instructions"
885 : /* No outputs. */
886 : "r" (Offset / 8));
887
888 .. warning::
889
890 Do *not* modify the contents of input-only operands
891 (except for inputs tied to outputs). The compiler assumes that on exit from
892 the ``asm`` statement these operands contain the same values as they
893 had before executing the statement.
894
895 It is *not* possible to use clobbers
896 to inform the compiler that the values in these inputs are changing. One
897 common work-around is to tie the changing input variable to an output variable
898 that never gets used. Note, however, that if the code that follows the
899 ``asm`` statement makes no use of any of the output operands, the GCC
900 optimizers may discard the ``asm`` statement as unneeded
901 (see :ref:`volatile`).
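
A related approach, sketched here for x86 under stated assumptions, is to
declare the changing value as a read-write operand. The assembly shifts ``v``
in place, so ``v`` uses ``+r`` rather than an input-only ``r`` constraint. The
function name ``count_set_bits`` and the portable fallback are hypothetical.

```c
/* Hypothetical helper: counts the bits set in v.  */
unsigned int
count_set_bits (unsigned int v)
{
  unsigned int count = 0;
  while (v != 0)
    {
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
      /* shr moves the low bit of v into the carry flag and changes v,
         so v must be "+r"; adc then adds the carry flag to count.  */
      __asm__ ("shrl $1, %[v]\n\t"
               "adcl $0, %[count]"
               : [count] "+r" (count), [v] "+r" (v)
               : /* No inputs. */
               : "cc");
#else
      /* Portable fallback with the same effect.  */
      count += v & 1u;
      v >>= 1;
#endif
    }
  return count;
}
```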
902
903 ``asm`` supports operand modifiers on operands (for example :samp:`%k2`
904 instead of simply :samp:`%2`). Typically these modifiers are hardware
905 dependent. The list of supported modifiers for x86 is found at
906 :ref:`x86operandmodifiers`.
907
908 In this example using the fictitious ``combine`` instruction, the
909 constraint ``"0"`` for input operand 1 says that it must occupy the same
910 location as output operand 0. Only input operands may use numbers in
911 constraints, and they must each refer to an output operand. Only a number (or
912 the symbolic assembler name) in the constraint can guarantee that one operand
913 is in the same place as another. The mere fact that ``foo`` is the value of
914 both operands is not enough to guarantee that they are in the same place in
915 the generated assembler code.
916
917 .. code-block:: c++
918
919 asm ("combine %2, %0"
920 : "=r" (foo)
921 : "0" (foo), "g" (bar));
922
923 Here is an example using symbolic names.
924
925 .. code-block:: c++
926
927 asm ("cmoveq %1, %2, %[result]"
928 : [result] "=r"(result)
929 : "r" (test), "r" (new), "[result]" (old));
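
For a version that can be compiled and run, here is a minimal x86 sketch that
combines a symbolic name with a matching constraint. The function name
``add_tied`` and the portable fallback are assumptions made for illustration.

```c
/* Hypothetical helper: returns a + b.  */
int
add_tied (int a, int b)
{
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  int sum;
  /* "[sum]" ties input a to the output operand named sum, so a is
     already in the destination register when add executes.  */
  __asm__ ("addl %[addend], %[sum]"
           : [sum] "=r" (sum)
           : "[sum]" (a), [addend] "g" (b)
           : "cc");
  return sum;
#else
  /* Portable fallback for other targets.  */
  return a + b;
#endif
}
```

The ``"g"`` constraint leaves the compiler free to pass ``b`` as a register, a
memory reference, or an immediate, all of which ``add`` accepts as its source.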
930
931 .. index:: asm clobbers, asm scratch registers
932
933 .. _clobbers-and-scratch-registers:
934
935 Clobbers and Scratch Registers
936 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
937
938 While the compiler is aware of changes to entries listed in the output
939 operands, the inline ``asm`` code may modify more than just the outputs. For
940 example, calculations may require additional registers, or the processor may
941 overwrite a register as a side effect of a particular assembler instruction.
942 In order to inform the compiler of these changes, list them in the clobber
943 list. Clobber list items are either register names or the special clobbers
944 (listed below). Each clobber list item is a string constant
945 enclosed in double quotes and separated by commas.
946
947 Clobber descriptions may not in any way overlap with an input or output
948 operand. For example, you may not have an operand describing a register class
949 with one member when listing that register in the clobber list. Variables
950 declared to live in specific registers (see :ref:`explicit-register-variables`) and used
951 as ``asm`` input or output operands must have no part mentioned in the
952 clobber description. In particular, there is no way to specify that input
953 operands get modified without also specifying them as output operands.
954
955 When the compiler selects which registers to use to represent input and output
956 operands, it does not use any of the clobbered registers. As a result,
957 clobbered registers are available for any use in the assembler code.
958
959 Another restriction is that the clobber list should not contain the
960 stack pointer register. This is because the compiler requires the
961 value of the stack pointer to be the same after an ``asm``
962 statement as it was on entry to the statement. However, previous
963 versions of GCC did not enforce this rule and allowed the stack
964 pointer to appear in the list, with unclear semantics. This behavior
965 is deprecated and listing the stack pointer may become an error in
966 future versions of GCC.
967
968 Here is a realistic example for the VAX showing the use of clobbered
969 registers:
970
971 .. code-block:: c++
972
973 asm volatile ("movc3 %0, %1, %2"
974 : /* No outputs. */
975 : "g" (from), "g" (to), "g" (count)
976 : "r0", "r1", "r2", "r3", "r4", "r5", "memory");
977
978 Also, there are two special clobber arguments:
979
980 ``"cc"``
981 The ``"cc"`` clobber indicates that the assembler code modifies the flags
982 register. On some machines, GCC represents the condition codes as a specific
983 hardware register; ``"cc"`` serves to name this register.
984 On other machines, condition code handling is different,
985 and specifying ``"cc"`` has no effect. But
986 it is valid no matter what the target.
987
988 ``"memory"``
989 The ``"memory"`` clobber tells the compiler that the assembly code
990 performs memory
991 reads or writes to items other than those listed in the input and output
992 operands (for example, accessing the memory pointed to by one of the input
993 parameters). To ensure memory contains correct values, GCC may need to flush
994 specific register values to memory before executing the ``asm``. Further,
995 the compiler does not assume that any values read from memory before an
996 ``asm`` remain unchanged after that ``asm``; it reloads them as
997 needed.
998 Using the ``"memory"`` clobber effectively forms a read/write
999 memory barrier for the compiler.
1000
1001 Note that this clobber does not prevent the *processor* from doing
1002 speculative reads past the ``asm`` statement. To prevent that, you need
1003 processor-specific fence instructions.
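
A common use of this clobber is an empty ``asm`` that acts purely as a
compiler-level barrier. The sketch below uses the hypothetical names
``shared_value``, ``compiler_barrier`` and ``store_then_reload``; the statement
emits no instructions and does not order accesses made by the processor.

```c
/* Hypothetical variable that other code is assumed to observe.  */
int shared_value;

/* Empty asm with a "memory" clobber: the compiler must complete earlier
   stores and may not reuse values it loaded before this point.  */
static inline void
compiler_barrier (void)
{
  __asm__ __volatile__ ("" : : : "memory");
}

/* Hypothetical helper: stores v, then reads it back after the barrier.  */
int
store_then_reload (int v)
{
  shared_value = v;
  compiler_barrier ();
  return shared_value;
}
```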
1004
1005 Flushing registers to memory has performance implications and may be
1006 an issue for time-sensitive code. You can provide better information
1007 to GCC to avoid this, as shown in the following examples. At a
1008 minimum, aliasing rules allow GCC to know what memory *doesn't*
1009 need to be flushed.
1010
1011 Here is a fictitious sum of squares instruction that takes two
1012 pointers to floating point values in memory and produces a floating
1013 point register output.
1014 Notice that ``x`` and ``y`` both appear twice in the ``asm``
1015 parameters, once to specify memory accessed, and once to specify a
1016 base register used by the ``asm``. You won't normally be wasting a
1017 register by doing this as GCC can use the same register for both
1018 purposes. However, it would be foolish to use both ``%1`` and
1019 ``%3`` for ``x`` in this ``asm`` and expect them to be the
1020 same. In fact, ``%3`` may well not be a register. It might be a
1021 symbolic memory reference to the object pointed to by ``x``.
1022
1023 .. code-block:: c++
1024
1025 asm ("sumsq %0, %1, %2"
1026 : "+f" (result)
1027 : "r" (x), "r" (y), "m" (*x), "m" (*y));
1028
1029 Here is a fictitious ``*z++ = *x++ * *y++`` instruction.
1030 Notice that the ``x``, ``y`` and ``z`` pointer registers
1031 must be specified as input/output because the ``asm`` modifies
1032 them.
1033
1034 .. code-block:: c++
1035
1036 asm ("vecmul %0, %1, %2"
1037 : "+r" (z), "+r" (x), "+r" (y), "=m" (*z)
1038 : "m" (*x), "m" (*y));
1039
1040 An x86 example where the string memory argument is of unknown length.
1041
1042 .. code-block:: c++
1043
1044 asm("repne scasb"
1045 : "=c" (count), "+D" (p)
1046 : "m" (*(const char (*)[]) p), "0" (-1), "a" (0));
1047
1048 If you know the above will only be reading a ten-byte array then you
1049 could instead use a memory input like:
1050 ``"m" (*(const char (*)[10]) p)``.
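
The ``repne scasb`` statement above can be wrapped into a string-length
routine. This is a sketch only: the function name ``scan_length``, the
arithmetic that converts the final count into a length, and the ``strlen``
fallback are assumptions, and the ``asm`` path is limited to GCC on x86.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the null-terminated string s.  */
size_t
scan_length (const char *s)
{
#if defined (__GNUC__) && !defined (__clang__) \
    && (defined (__x86_64__) || defined (__i386__))
  size_t count;
  const char *p = s;
  /* The "m" operand tells GCC that the asm reads the bytes at p, so a
     "memory" clobber is not needed.  */
  __asm__ ("repne scasb"
           : "=c" (count), "+D" (p)
           : "m" (*(const char (*)[]) p), "0" ((size_t) -1), "a" (0)
           : "cc");
  /* The count started at -1 and dropped by one per byte scanned,
     including the terminating null, so it now holds -(length + 2).  */
  return ~count - 1;
#else
  /* Portable fallback for other compilers and targets.  */
  return strlen (s);
#endif
}
```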
1051
1052 Here is an example of a PowerPC vector scale implemented in assembly,
1053 complete with vector and condition code clobbers, and some initialized
1054 offset registers that are unchanged by the ``asm``.
1055
1056 .. code-block:: c++
1057
1058 void
1059 dscal (size_t n, double *x, double alpha)
1060 {
1061 asm ("/* lots of asm here */"
1062 : "+m" (*(double (*)[n]) x), "+&r" (n), "+b" (x)
1063 : "d" (alpha), "b" (32), "b" (48), "b" (64),
1064 "b" (80), "b" (96), "b" (112)
1065 : "cr0",
1066 "vs32","vs33","vs34","vs35","vs36","vs37","vs38","vs39",
1067 "vs40","vs41","vs42","vs43","vs44","vs45","vs46","vs47");
1068 }
1069
1070 Rather than allocating fixed registers via clobbers to provide scratch
1071 registers for an ``asm`` statement, an alternative is to define a
1072 variable and make it an early-clobber output as with ``a2`` and
1073 ``a3`` in the example below. This gives the compiler register
1074 allocator more freedom. You can also define a variable and make it an
1075 output tied to an input as with ``a0`` and ``a1``, tied
1076 respectively to ``ap`` and ``lda``. Of course, with tied
1077 outputs your ``asm`` can't use the input value after modifying the
1078 output register since they are one and the same register. What's
1079 more, if you omit the early-clobber on the output, it is possible that
1080 GCC might allocate the same register to another of the inputs if GCC
1081 could prove they had the same value on entry to the ``asm``. This
1082 is why ``a1`` has an early-clobber. Its tied input, ``lda``,
1083 might conceivably be known to have the value 16 and without an
1084 early-clobber share the same register as ``%11``. On the other
1085 hand, ``ap`` can't be the same as any of the other inputs, so an
1086 early-clobber on ``a0`` is not needed. It is also not desirable in
1087 this case. An early-clobber on ``a0`` would cause GCC to allocate
1088 a separate register for the ``"m" (*(const double (*)[]) ap)``
1089 input. Note that tying an input to an output is the way to set up an
1090 initialized temporary register modified by an ``asm`` statement.
1091 An input not tied to an output is assumed by GCC to be unchanged, for
1092 example ``"b" (16)`` below sets up ``%11`` to 16, and GCC might
1093 use that register in following code if the value 16 happened to be
1094 needed. You can even use a normal ``asm`` output for a scratch if
1095 all inputs that might share the same register are consumed before the
1096 scratch is used. The VSX registers clobbered by the ``asm``
1097 statement could have used this technique except for GCC's limit on the
1098 number of ``asm`` parameters.
1099
1100 .. code-block:: c++
1101
1102 static void
1103 dgemv_kernel_4x4 (long n, const double *ap, long lda,
1104 const double *x, double *y, double alpha)
1105 {
1106 double *a0;
1107 double *a1;
1108 double *a2;
1109 double *a3;
1110
1111 __asm__
1112 (
1113 /* lots of asm here */
1114 "#n=%1 ap=%8=%12 lda=%13 x=%7=%10 y=%0=%2 alpha=%9 o16=%11\n"
1115 "#a0=%3 a1=%4 a2=%5 a3=%6"
1116 :
1117 "+m" (*(double (*)[n]) y),
1118 "+&r" (n), // 1
1119 "+b" (y), // 2
1120 "=b" (a0), // 3
1121 "=&b" (a1), // 4
1122 "=&b" (a2), // 5
1123 "=&b" (a3) // 6
1124 :
1125 "m" (*(const double (*)[n]) x),
1126 "m" (*(const double (*)[]) ap),
1127 "d" (alpha), // 9
1128 "r" (x), // 10
1129 "b" (16), // 11
1130 "3" (ap), // 12
1131 "4" (lda) // 13
1132 :
1133 "cr0",
1134 "vs32","vs33","vs34","vs35","vs36","vs37",
1135 "vs40","vs41","vs42","vs43","vs44","vs45","vs46","vs47"
1136 );
1137 }
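
The same scratch-register technique in a small x86 sketch: ``scratch`` is an
early-clobber output because it is written before input ``b`` is read, while
``result`` is a plain output because it is written only after every input has
been consumed. The function name ``twice_sum`` and the portable fallback are
assumptions made for illustration.

```c
/* Hypothetical helper: returns (a + b) * 2.  */
long
twice_sum (long a, long b)
{
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  long result;
  long scratch;
  __asm__ ("mov %[a], %[tmp]\n\t"            /* tmp = a; b not yet read */
           "add %[b], %[tmp]\n\t"            /* tmp = a + b */
           "lea (%[tmp],%[tmp]), %[res]"     /* res = tmp + tmp */
           : [res] "=r" (result), [tmp] "=&r" (scratch)
           : [a] "r" (a), [b] "r" (b)
           : "cc");
  return result;
#else
  /* Portable fallback for other targets.  */
  return (a + b) * 2;
#endif
}
```

Because no fixed register is named in the clobber list, the register allocator
picks whichever registers are free for ``scratch`` and ``result``.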
1138
1139 .. index:: asm goto labels
1140
1141 .. _gotolabels:
1142
1143 Goto Labels
1144 ~~~~~~~~~~~
1145
1146 ``asm goto`` allows assembly code to jump to one or more C labels. The
1147 :samp:`{GotoLabels}` section in an ``asm goto`` statement contains
1148 a comma-separated
1149 list of all C labels to which the assembler code may jump. GCC assumes that
1150 ``asm`` execution falls through to the next statement (if this is not the
1151 case, consider using the ``__builtin_unreachable`` intrinsic after the
1152 ``asm`` statement). Optimization of ``asm goto`` may be improved by
1153 using the :fn-attr:`hot` and :fn-attr:`cold` label attributes (see :ref:`label-attributes`).
1154
1155 If the assembler code does modify anything, use the ``"memory"`` clobber
1156 to force the
1157 optimizers to flush all register values to memory and reload them if
1158 necessary after the ``asm`` statement.
1159
1160 Also note that an ``asm goto`` statement is always implicitly
1161 considered volatile.
1162
1163 Be careful when you set output operands inside ``asm goto`` only on
1164 some possible control flow paths. If you do not set the output on a
1165 given path and never use it on that path, this is fine. Otherwise, you
1166 should use the :samp:`+` constraint modifier, which marks the operand as
1167 both an input and an output. With this modifier you get the correct
1168 values on all possible paths from the ``asm goto``.
1169
1170 To reference a label in the assembler template, prefix it with
1171 :samp:`%l` (lowercase :samp:`L`) followed by its (zero-based) position
1172 in :samp:`{GotoLabels}` plus the number of input and output operands.
1173 An output operand with constraint modifier :samp:`+` counts as two
1174 operands because it is treated as one output and one input operand.
1175 For example, if the ``asm`` has three inputs, one output operand
1176 with constraint modifier :samp:`+` and one output operand with
1177 constraint modifier :samp:`=` and references two labels, refer to the
1178 first label as :samp:`%l6` and the second as :samp:`%l7`.
1179
1180 Alternatively, you can reference labels using the actual C label name
1181 enclosed in brackets. For example, to reference a label named
1182 ``carry``, you can use :samp:`%l[carry]`. The label must still be
1183 listed in the :samp:`{GotoLabels}` section when using this approach. Named
1184 references are preferable because they avoid counting input and
1185 output operands and the special treatment of
1186 output operands with constraint modifier :samp:`+`.
1187
1188 Here is an example of ``asm goto`` for i386:
1189
1190 .. code-block:: c++
1191
1192 asm goto (
1193 "btl %1, %0\n\t"
1194 "jc %l2"
1195 : /* No outputs. */
1196 : "r" (p1), "r" (p2)
1197 : "cc"
1198 : carry);
1199
1200 return 0;
1201
1202 carry:
1203 return 1;
1204
1205 The following example shows an ``asm goto`` that uses a memory clobber.
1206
1207 .. code-block:: c++
1208
1209 int frob(int x)
1210 {
1211 int y;
1212 asm goto ("frob %%r5, %1; jc %l[error]; mov (%2), %%r5"
1213 : /* No outputs. */
1214 : "r"(x), "r"(&y)
1215 : "r5", "memory"
1216 : error);
1217 return y;
1218 error:
1219 return -1;
1220 }
1221
1222 The following example shows an ``asm goto`` that uses an output.
1223
1224 .. code-block:: c++
1225
1226 int foo(int count)
1227 {
1228 asm goto ("dec %0; jb %l[stop]"
1229 : "+r" (count)
1230 :
1231 :
1232 : stop);
1233 return count;
1234 stop:
1235 return 0;
1236 }
1237
1238 The following artificial example shows an ``asm goto`` that sets
1239 up an output only on one path inside the ``asm goto``. Usage of
1240 constraint modifier ``=`` instead of ``+`` would be wrong as
1241 ``factor`` is used on all paths from the ``asm goto``.
1242
1243 .. code-block:: c++
1244
1245 int foo(int inp)
1246 {
1247 int factor = 0;
1248 asm goto ("cmp %1, 10; jb %l[lab]; mov 2, %0"
1249 : "+r" (factor)
1250 : "r" (inp)
1251 :
1252 : lab);
1253 lab:
1254 return inp * factor; /* return 2 * inp or 0 if inp < 10 */
1255 }
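
For a complete function that can be compiled and run on x86, the sketch below
uses a named label reference to leave the fall-through path when an unsigned
comparison fails. The function name ``index_in_range``, the label name
``out_of_range`` and the portable fallback are assumptions made for
illustration.

```c
/* Hypothetical helper: returns 1 if index < limit (unsigned), else 0.  */
int
index_in_range (unsigned int index, unsigned int limit)
{
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  /* cmp computes index - limit; jae branches when index >= limit.
     The label is referenced by name and listed in GotoLabels.  */
  __asm__ goto ("cmpl %[limit], %[index]\n\t"
                "jae %l[out_of_range]"
                : /* No outputs. */
                : [index] "r" (index), [limit] "r" (limit)
                : "cc"
                : out_of_range);
  return 1;

out_of_range:
  return 0;
#else
  /* Portable fallback for other targets.  */
  return index < limit;
#endif
}
```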
1256
1257 .. _x86operandmodifiers:
1258
1259 x86 Operand Modifiers
1260 ~~~~~~~~~~~~~~~~~~~~~
1261
1262 References to input, output, and goto operands in the assembler template
1263 of extended ``asm`` statements can use
1264 modifiers to affect the way the operands are formatted in
1265 the code output to the assembler. For example, the
1266 following code uses the :samp:`h` and :samp:`b` modifiers for x86:
1267
1268 .. code-block:: c++
1269
1270 uint16_t num;
1271 asm volatile ("xchg %h0, %b0" : "+a" (num) );
1272
1273 These modifiers generate this assembler code:
1274
1275 .. code-block:: c++
1276
1277 xchg %ah, %al
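
Wrapped in a function, the same statement swaps the two bytes of a 16-bit
value. This is a minimal sketch: the function name ``swap_bytes16`` and the
portable fallback are assumptions, and ``volatile`` is omitted because the
output is used by the caller.

```c
#include <stdint.h>

/* Hypothetical helper: exchanges the high and low bytes of num.  */
uint16_t
swap_bytes16 (uint16_t num)
{
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  /* "+a" places num in ax; %h0 and %b0 print its high and low byte
     registers, ah and al.  */
  __asm__ ("xchg %h0, %b0" : "+a" (num));
  return num;
#else
  /* Portable fallback for other targets.  */
  return (uint16_t) ((num << 8) | (num >> 8));
#endif
}
```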
1278
1279 The rest of this discussion uses the following code for illustrative purposes.
1280
1281 .. code-block:: c++
1282
1283 int main()
1284 {
1285 int iInt = 1;
1286
1287 top:
1288
1289 asm volatile goto ("some assembler instructions here"
1290 : /* No outputs. */
1291 : "q" (iInt), "X" (sizeof(unsigned char) + 1), "i" (42)
1292 : /* No clobbers. */
1293 : top);
1294 }
1295
1296 With no modifiers, this is what the output from the operands would be
1297 for the :samp:`att` and :samp:`intel` dialects of assembler:
1298
1299 .. list-table::
1300 :header-rows: 1
1301
1302 * - Operand
1303 - :samp:`att`
1304 - :samp:`intel`
1305
1306 * - ``%0``
1307 - ``%eax``
1308 - ``eax``
1309 * - ``%1``
1310 - ``$2``
1311 - ``2``
1312 * - ``%3``
1313 - ``$.L3``
1314 - ``OFFSET FLAT:.L3``
1315 * - ``%4``
1316 - ``$8``
1317 - ``8``
1318 * - ``%5``
1319 - ``%xmm0``
1320 - ``xmm0``
1321 * - ``%7``
1322 - ``$0``
1323 - ``0``
1324
1325 The table below shows the list of supported modifiers and their effects.
1326
1327 .. list-table::
1328 :header-rows: 1
1329 :widths: 10 50 10 10 10
1330
1331 * - Modifier
1332 - Description
1333 - Operand
1334 - :samp:`att`
1335 - :samp:`intel`
1336
1337 * - ``A``
1338 - Print an absolute memory reference.
1339 - ``%A0``
1340 - ``*%rax``
1341 - ``rax``
1342 * - ``b``
1343 - Print the QImode name of the register.
1344 - ``%b0``
1345 - ``%al``
1346 - ``al``
1347 * - ``B``
1348 - Print the opcode suffix of b.
1349 - ``%B0``
1350 - ``b``
1351 -
1352 * - ``c``
1353 - Require a constant operand and print the constant expression with no punctuation.
1354 - ``%c1``
1355 - ``2``
1356 - ``2``
1357 * - ``d``
1358 - Print duplicated register operand for AVX instruction.
1359 - ``%d5``
1360 - ``%xmm0, %xmm0``
1361 - ``xmm0, xmm0``
1362 * - ``E``
1363 - Print the address in Double Integer (DImode) mode (8 bytes) when the target is 64-bit. Otherwise mode is unspecified (VOIDmode).
1364 - ``%E1``
1365 - ``(%rax)``
1366 - ``[rax]``
1367 * - ``g``
1368 - Print the V16SFmode name of the register.
1369 - ``%g0``
1370 - ``%zmm0``
1371 - ``zmm0``
1372 * - ``h``
1373 - Print the QImode name for a 'high' register.
1374 - ``%h0``
1375 - ``%ah``
1376 - ``ah``
1377 * - ``H``
1378 - Add 8 bytes to an offsettable memory reference. Useful when accessing the high 8 bytes of SSE values. For a memory reference in ``(%rax)``, it generates the operand shown in the next columns.
1379 - ``%H0``
1380 - ``8(%rax)``
1381 - ``8[rax]``
1382 * - ``k``
1383 - Print the SImode name of the register.
1384 - ``%k0``
1385 - ``%eax``
1386 - ``eax``
1387 * - ``l``
1388 - Print the label name with no punctuation.
1389 - ``%l3``
1390 - ``.L3``
1391 - ``.L3``
1392 * - ``L``
1393 - Print the opcode suffix of l.
1394 - ``%L0``
1395 - ``l``
1396 -
1397 * - ``N``
1398 - Print maskz.
1399 - ``%N7``
1400 - ``{z}``
1401 - ``{z}``
1402 * - ``p``
1403 - Print raw symbol name (without syntax-specific prefixes).
1404 - ``%p2``
1405 - ``42``
1406 - ``42``
1407 * - ``P``
1408 - If used for a function, print the PLT suffix and generate PIC code. For example, emit ``foo@PLT`` instead of 'foo' for the function foo(). If used for a constant, drop all syntax-specific prefixes and issue the bare constant. See ``p`` above.
1409 -
1410 -
1411 -
1412 * - ``q``
1413 - Print the DImode name of the register.
1414 - ``%q0``
1415 - ``%rax``
1416 - ``rax``
1417 * - ``Q``
1418 - Print the opcode suffix of q.
1419 - ``%Q0``
1420 - ``q``
1421 -
1422 * - ``R``
1423 - Print embedded rounding and sae.
1424 - ``%R4``
1425 - ``{rn-sae},``
1426 - ``, {rn-sae}``
1427 * - ``r``
1428 - Print only sae.
1429 - ``%r4``
1430 - ``{sae},``
1431 - ``, {sae}``
1432 * - ``s``
1433 - Print a shift double count, followed by the assembler's argument delimiter.
1434 - ``%s1``
1435 - ``$2,``
1436 - ``2,``
1437 * - ``S``
1438 - Print the opcode suffix of s.
1439 - ``%S0``
1440 - ``s``
1441 -
1442 * - ``t``
1443 - Print the V8SFmode name of the register.
1444 - ``%t5``
1445 - ``%ymm0``
1446 - ``ymm0``
1447 * - ``T``
1448 - Print the opcode suffix of t.
1449 - ``%T0``
1450 - ``t``
1451 -
1452 * - ``V``
1453 - Print naked full integer register name without ``%``.
1454 - ``%V0``
1455 - ``eax``
1456 - ``eax``
1457 * - ``w``
1458 - Print the HImode name of the register.
1459 - ``%w0``
1460 - ``%ax``
1461 - ``ax``
1462 * - ``W``
1463 - Print the opcode suffix of w.
1464 - ``%W0``
1465 - ``w``
1466 -
1467 * - ``x``
1468 - Print the V4SFmode name of the register.
1469 - ``%x5``
1470 - ``%xmm0``
1471 - ``xmm0``
1472 * - ``y``
1473 - Print "st(0)" instead of "st" as a register.
1474 - ``%y6``
1475 - ``%st(0)``
1476 - ``st(0)``
1477 * - ``z``
1478 - Print the opcode suffix for the size of the current integer operand (one of ``b`` / ``w`` / ``l`` / ``q``).
1479 - ``%z0``
1480 - ``l``
1481 -
1482 * - ``Z``
1483 - Like ``z``, with special suffixes for x87 instructions.
1484 -
1485 -
1486 -
1487
1488 .. _x86floatingpointasmoperands:
1489
1490 x86 Floating-Point asm Operands
1491 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1492
1493 On x86 targets, there are several rules on the usage of stack-like registers
1494 in the operands of an ``asm``. These rules apply only to the operands
1495 that are stack-like registers:
1496
1497 * Given a set of input registers that die in an ``asm``, it is
1498 necessary to know which are implicitly popped by the ``asm``, and
1499 which must be explicitly popped by GCC.
1500
1501 An input register that is implicitly popped by the ``asm`` must be
1502 explicitly clobbered, unless it is constrained to match an
1503 output operand.
1504
1505 * For any input register that is implicitly popped by an ``asm``, it is
1506 necessary to know how to adjust the stack to compensate for the pop.
1507 If any non-popped input is closer to the top of the reg-stack than
1508 the implicitly popped register, it would not be possible to know what the
1509 stack looked like---it's not clear how the rest of the stack 'slides
1510 up'.
1511
1512 All implicitly popped input registers must be closer to the top of
1513 the reg-stack than any input that is not implicitly popped.
1514
1515 It is possible that if an input dies in an ``asm``, the compiler might
1516 use the input register for an output reload. Consider this example:
1517
1518 .. code-block:: c++
1519
1520 asm ("foo" : "=t" (a) : "f" (b));
1521
1522 This code says that input ``b`` is not popped by the ``asm``, and that
1523 the ``asm`` pushes a result onto the reg-stack, i.e., the stack is one
1524 deeper after the ``asm`` than it was before. But, it is possible that
1525 reload may think that it can use the same register for both the input and
1526 the output.
1527
1528 To prevent this from happening,
1529 if any input operand uses the :samp:`f` constraint, all output register
1530 constraints must use the :samp:`&` early-clobber modifier.
1531
1532 The example above is correctly written as:
1533
1534 .. code-block:: c++
1535
1536 asm ("foo" : "=&t" (a) : "f" (b));
1537
1538 * Some operands need to be in particular places on the stack. All
1539 output operands fall in this category---GCC has no other way to
1540 know which registers the outputs appear in unless you indicate
1541 this in the constraints.
1542
1543 Output operands must specifically indicate which register an output
1544 appears in after an ``asm``. :samp:`=f` is not allowed: the operand
1545 constraints must select a class with a single register.
1546
1547 * Output operands may not be 'inserted' between existing stack registers.
1548 Since no 387 opcode uses a read/write operand, all output operands
1549 are dead before the ``asm``, and are pushed by the ``asm``.
1550 It makes no sense to push anywhere but the top of the reg-stack.
1551
1552 Output operands must start at the top of the reg-stack: output
1553 operands may not 'skip' a register.
1554
1555 * Some ``asm`` statements may need extra stack space for internal
1556 calculations. This can be guaranteed by clobbering stack registers
1557 unrelated to the inputs and outputs.
1558
1559 This ``asm``
1560 takes one input, which is internally popped, and produces two outputs.
1561
1562 .. code-block:: c++
1563
1564 asm ("fsincos" : "=t" (cos), "=u" (sin) : "0" (inp));
1565
1566 This ``asm`` takes two inputs, which are popped by the ``fyl2xp1`` opcode,
1567 and replaces them with one output. The ``st(1)`` clobber is necessary
1568 for the compiler to know that ``fyl2xp1`` pops both inputs.
1569
1570 .. code-block:: c++
1571
1572 asm ("fyl2xp1" : "=t" (result) : "0" (x), "u" (y) : "st(1)");
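
The same operand pattern applies to any x87 instruction that pops both inputs
and leaves one result on top of the stack. The sketch below uses ``faddp`` so
that the result is easy to check; the function name ``x87_add`` and the
portable fallback are assumptions made for illustration.

```c
/* Hypothetical helper: returns a + b, computed on the x87 stack
   when compiling for x86.  */
double
x87_add (double a, double b)
{
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  double result;
  /* a is tied to the output in st(0); b is in st(1) and is popped by
     faddp, so st(1) appears in the clobber list.  */
  __asm__ ("faddp" : "=t" (result) : "0" (a), "u" (b) : "st(1)");
  return result;
#else
  /* Portable fallback for other targets.  */
  return a + b;
#endif
}
```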
1573
1574 .. _msp430operandmodifiers:
1575
1576 MSP430 Operand Modifiers
1577 ~~~~~~~~~~~~~~~~~~~~~~~~
1578
1579 The table below describes the supported modifiers and their effects for MSP430.
1580
1581 .. list-table::
1582 :header-rows: 1
1583 :widths: 10 90
1584
1585 * - Modifier
1586 - Description
1587
1588 * - ``A``
1589 - Select low 16 bits of the constant/register/memory operand.
1590 * - ``B``
1591 - Select high 16 bits of the constant/register/memory operand.
1592 * - ``C``
1593 - Select bits 32-47 of the constant/register/memory operand.
1594 * - ``D``
1595 - Select bits 48-63 of the constant/register/memory operand.
1596 * - ``H``
1597 - Equivalent to ``B`` (for backwards compatibility).
1598 * - ``I``
1599 - Print the inverse (logical ``NOT``) of the constant value.
1600 * - ``J``
1601 - Print an integer without a ``#`` prefix.
1602 * - ``L``
1603 - Equivalent to ``A`` (for backwards compatibility).
1604 * - ``O``
1605 - Offset of the current frame from the top of the stack.
1606 * - ``Q``
1607 - Use the ``A`` instruction postfix.
1608 * - ``R``
1609 - Inverse of condition code, for unsigned comparisons.
1610 * - ``W``
1611 - Subtract 16 from the constant value.
1612 * - ``X``
1613 - Use the ``X`` instruction postfix.
1614 * - ``Y``
1615 - Subtract 4 from the constant value.
1616 * - ``Z``
1617 - Subtract 1 from the constant value.
1618 * - ``b``
1619 - Append ``.B``, ``.W`` or ``.A`` to the instruction, depending on the mode.
1620 * - ``d``
1621 - Offset 1 byte of a memory reference or constant value.
1622 * - ``e``
1623 - Offset 3 bytes of a memory reference or constant value.
1624 * - ``f``
1625 - Offset 5 bytes of a memory reference or constant value.
1626 * - ``g``
1627 - Offset 7 bytes of a memory reference or constant value.
1628 * - ``p``
1629 - Print the value of 2, raised to the power of the given constant. Used to select the specified bit position.
1630 * - ``r``
1631 - Inverse of condition code, for signed comparisons.
1632 * - ``x``
1633 - Equivalent to ``X``, but only for pointers.
1634
1635 .. Most of this node appears by itself (in a different place) even
1636 when the INTERNALS flag is clear. Passages that require the internals
1637 manual's context are conditionalized to appear only in the internals manual.
1638
1639 .. index:: operand constraints, asm, constraints, asm, asm constraints
1640
1641 .. _constraints:
1642
1643 Constraints for asm Operands
1644 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1645
1646 Here are specific details on what constraint letters you can use with
1647 ``asm`` operands.
1648 Constraints can say whether
1649 an operand may be in a register, and which kinds of register; whether the
1650 operand can be a memory reference, and which kinds of address; whether the
1651 operand may be an immediate constant, and which possible values it may
1652 have. Constraints can also require two operands to match.
1653 Side effects aren't allowed in operands of inline ``asm``, unless
1654 :samp:`<` or :samp:`>` constraints are used, because there is no guarantee
1655 that the side effects will happen exactly once in an instruction that can update
1656 the addressing register.
1657
1658 .. toctree::
1659 :maxdepth: 2
1660
1661
1662 .. include:: ../../../../doc/md.rst
1663
1664
1665 .. Each of the following nodes are wrapped in separate
1666 "@ifset INTERNALS" to work around memory limits for the default
1667 configuration in older tetex distributions. Known to not work:
1668 tetex-1.0.7, known to work: tetex-2.0.2.
1669
1670 .. index:: assembler names for identifiers, names used in assembler code, identifiers, names in assembler code
1671
1672 .. _asm-labels:
1673
1674 Controlling Names Used in Assembler Code
1675 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1676
1677 You can specify the name to be used in the assembler code for a C
1678 function or variable by writing the ``asm`` (or ``__asm__``)
1679 keyword after the declarator.
1680 It is up to you to make sure that the assembler names you choose do not
1681 conflict with any other assembler symbols, or reference registers.
1682
1683 Assembler names for data
1684 ^^^^^^^^^^^^^^^^^^^^^^^^
1685
1686 This sample shows how to specify the assembler name for data:
1687
1688 .. code-block:: c++
1689
1690 int foo asm ("myfoo") = 2;
1691
1692 This specifies that the name to be used for the variable ``foo`` in
1693 the assembler code should be :samp:`myfoo` rather than the usual
1694 :samp:`_foo`.
1695
1696 On systems where an underscore is normally prepended to the name of a C
1697 variable, this feature allows you to define names for the
1698 linker that do not start with an underscore.
1699
1700 GCC does not support using this feature with a non-static local variable
1701 since such variables do not have assembler names. If you are
1702 trying to put the variable in a particular register, see
1703 :ref:`explicit-register-variables`.
1704
1705 Assembler names for functions
1706 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1707
1708 To specify the assembler name for functions, write a declaration for the
1709 function before its definition and put ``asm`` there, like this:
1710
1711 .. code-block:: c++
1712
1713 int func (int x, int y) asm ("MYFUNC");
1714
1715 int func (int x, int y)
1716 {
1717 /* ... */
1718
1719 This specifies that the name to be used for the function ``func`` in
1720 the assembler code should be ``MYFUNC``.
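
Putting both forms together, a minimal compilable sketch might look like the
following. The identifiers ``scaled_add`` and ``answer`` and the assembler
names ``MY_SCALED_ADD`` and ``my_answer`` are hypothetical; C code keeps using
the C identifiers, and only the symbols seen by the assembler and linker
change.

```c
/* Declaration carrying the assembler name; it precedes the definition.  */
int scaled_add (int x, int y) __asm__ ("MY_SCALED_ADD");

/* Definition: emitted under the symbol MY_SCALED_ADD.  */
int
scaled_add (int x, int y)
{
  return x + 2 * y;
}

/* Data object emitted under the symbol my_answer.  */
int answer __asm__ ("my_answer") = 42;
```

The chosen names can be checked in the symbol table of the object file, for
example with ``nm``.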
1721
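A minimal sketch of the same idea in compilable form follows. The function
body, the declaration ``call_myfunc`` and the wrapper ``sum_via_symbol`` are
assumptions made for illustration; a declaration like ``call_myfunc`` would
normally be found in a different translation unit, or the symbol would be
referenced from assembler code.

```c
/* Declaration carrying the assembler name, placed before the definition.  */
int func (int x, int y) __asm__ ("MYFUNC");

/* The definition itself does not repeat the asm label.  */
int
func (int x, int y)
{
  return x + y;
}

/* Hypothetical declaration under a different C name that refers to the
   same symbol, as another translation unit might do.  */
extern int call_myfunc (int, int) __asm__ ("MYFUNC");

/* Calls the function through the symbol MYFUNC.  */
int
sum_via_symbol (int a, int b)
{
  return call_myfunc (a, b);
}
```

Within C code the function is still called as ``func``; only the symbol
seen by the assembler and linker changes.
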
1722 .. _explicit-register-variables:
1723
1724 Variables in Specified Registers
1725 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1726
1727 .. index:: explicit register variables, variables in specified registers, specified registers
1728
1729 .. _explicit-reg-vars:
1730
GNU C allows you to associate specific hardware registers with C
variables. In almost all cases, allowing the compiler to assign
registers produces the best code. However, under certain unusual
circumstances, more precise control over the variable storage is
required.
1736
1737 Both global and local variables can be associated with a register. The
1738 consequences of performing this association are very different between
1739 the two, as explained in the sections below.
1740
1741 .. toctree::
1742 :maxdepth: 2
1743
1744
1745 .. _global-register-variables:
1746
1747 Defining Global Register Variables
1748 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1749
1750 .. index:: global register variables, registers, global variables in, registers, global allocation
1751
1752 .. _global-reg-vars:
1753
1754 You can define a global register variable and associate it with a specified
1755 register like this:
1756
1757 .. code-block:: c++
1758
  register int *foo asm ("r12");
1760
1761 Here ``r12`` is the name of the register that should be used. Note that
1762 this is the same syntax used for defining local register variables, but for
1763 a global variable the declaration appears outside a function. The
1764 ``register`` keyword is required, and cannot be combined with
1765 ``static``. The register name must be a valid register name for the
1766 target platform.
1767
1768 Do not use type qualifiers such as ``const`` and ``volatile``, as
1769 the outcome may be contrary to expectations. In particular, using the
1770 ``volatile`` qualifier does not fully prevent the compiler from
1771 optimizing accesses to the register.
1772
Registers are a scarce resource on most systems and allowing the
compiler to manage their usage usually results in the best code. However,
under special circumstances it can make sense to reserve some globally.
For example, this may be useful in programs such as programming language
interpreters that have a couple of global variables that are accessed
very often.
1779
1780 After defining a global register variable, for the current compilation
1781 unit:
1782
* If the register is a call-saved register, the call ABI is affected:
  the register will not be restored in function epilogue sequences after
  the variable has been assigned. Therefore, functions cannot safely
  return to callers that assume the standard ABI.

* Conversely, if the register is a call-clobbered register, making
  calls to functions that use the standard ABI may lose the contents of
  the variable. Such calls may be created by the compiler even if none are
  evident in the original program, for example when libgcc functions are
  used to make up for unavailable instructions.

* Accesses to the variable may be optimized as usual and the register
  remains available for allocation and use in any computations, provided that
  observable values of the variable are not affected.

* If the variable is referenced in inline assembly, the type of access
  must be provided to the compiler via constraints (see :ref:`constraints`).
  Accesses from basic asms are not supported.
1801
1802 Note that these points *only* apply to code that is compiled with the
1803 definition. The behavior of code that is merely linked in (for example
1804 code from libraries) is not affected.
1805
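The points in the list above can be sketched as follows. This is only an
illustration under stated assumptions: it targets x86-64, where ``r12`` is a
call-saved register, and falls back to an ordinary variable on other
compilers or targets so that it still builds. The names ``event_count`` and
``count_events`` are hypothetical.

```c
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
/* Assumption: x86-64, where r12 is a call-saved register.  */
register long event_count __asm__ ("r12");
#else
/* Fallback: an ordinary variable, so the sketch builds elsewhere.  */
static long event_count;
#endif

/* Hypothetical helper: counts N events in the global variable.  */
long
count_events (long n)
{
  /* The prologue and epilogue do not save the register, so keep the
     value the caller may still rely on.  */
  long caller_value = event_count;
  long result;
  long i;

  event_count = 0;
  for (i = 0; i < n; i++)
    {
      /* Extended asm: the "+r" constraint tells the compiler that the
         statement both reads and writes the variable.  The template is
         empty, so no instruction is emitted.  */
      __asm__ ("" : "+r" (event_count));
      event_count++;
    }
  result = event_count;

  /* Restore by hand before returning to code that assumes the
     standard ABI.  */
  event_count = caller_value;
  return result;
}
```

Saving and restoring the value by hand is one possible way to remain
compatible with callers that were compiled without the declaration; it is
not something the compiler does for you.
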
1806 If you want to recompile source files that do not actually use your global
1807 register variable so they do not use the specified register for any other
1808 purpose, you need not actually add the global register declaration to
1809 their source code. It suffices to specify the compiler option
1810 :option:`-ffixed-reg` (see :ref:`code-gen-options`) to reserve the
1811 register.
1812
1813 Declaring the variable
1814 ^^^^^^^^^^^^^^^^^^^^^^
1815
1816 Global register variables cannot have initial values, because an
1817 executable file has no means to supply initial contents for a register.
1818
1819 When selecting a register, choose one that is normally saved and
1820 restored by function calls on your machine. This ensures that code
1821 which is unaware of this reservation (such as library routines) will
1822 restore it before returning.
1823
1824 On machines with register windows, be sure to choose a global
1825 register that is not affected magically by the function call mechanism.
1826
1827 .. index:: qsort, and global register variables
1828
1829 Using the variable
1830 ^^^^^^^^^^^^^^^^^^
1831
When calling routines that are not aware of the reservation, be
cautious if those routines call back into code that uses the global
register variables. As an example, if you call the system library version
of ``qsort``, it may clobber your registers during execution, but (if you
have selected appropriate registers) it will restore them before returning.
However, it will *not* restore them before calling ``qsort``'s comparison
function. As a result, global values will not reliably be available to
the comparison function unless the ``qsort`` function itself is rebuilt.
1840
1841 Similarly, it is not safe to access the global register variables from signal
1842 handlers or from more than one thread of control. Unless you recompile
1843 them specially for the task at hand, the system library routines may
1844 temporarily use the register for other things. Furthermore, since the register
1845 is not reserved exclusively for the variable, accessing it from handlers of
1846 asynchronous signals may observe unrelated temporary values residing in the
1847 register.
1848
1849 .. index:: register variable after longjmp, global register after longjmp, value after longjmp, longjmp, setjmp
1850
1851 On most machines, ``longjmp`` restores to each global register
1852 variable the value it had at the time of the ``setjmp``. On some
1853 machines, however, ``longjmp`` does not change the value of global
1854 register variables. To be portable, the function that called ``setjmp``
1855 should make other arrangements to save the values of the global register
1856 variables, and to restore them in a ``longjmp``. This way, the same
1857 thing happens regardless of what ``longjmp`` does.
1858
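One possible arrangement is sketched below. It keeps a copy of the variable
next to the ``jmp_buf`` and reinstates it once ``setjmp`` returns for the
second time. The sketch assumes x86-64 with ``r12`` and falls back to an
ordinary variable elsewhere; ``interp_state``, ``saved_interp_state`` and
``run_with_recovery`` are hypothetical names.

```c
#include <setjmp.h>

#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
/* Assumption: x86-64, where r12 is a call-saved register.  */
register long interp_state __asm__ ("r12");
#else
/* Fallback: an ordinary variable, so the sketch builds elsewhere.  */
static long interp_state;
#endif

static jmp_buf recovery_point;
static long saved_interp_state;   /* copy kept alongside the jmp_buf */

long
run_with_recovery (void)
{
  long entry_value = interp_state;  /* value the caller may rely on */
  long result;

  interp_state = 10;
  saved_interp_state = interp_state;  /* save before calling setjmp */

  if (setjmp (recovery_point) == 0)
    {
      interp_state = 99;              /* changed after the setjmp */
      longjmp (recovery_point, 1);
    }

  /* Reached only after the longjmp: reinstate the saved value, so the
     outcome does not depend on what longjmp did to the register.  */
  interp_state = saved_interp_state;
  result = interp_state;

  interp_state = entry_value;
  return result;
}
```

With this arrangement the function yields the value recorded at the time of
the ``setjmp``, whether or not ``longjmp`` itself restores the register.
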
1859 .. _local-register-variables:
1860
1861 Specifying Registers for Local Variables
1862 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1863
1864 .. index:: local variables, specifying registers, specifying registers for local variables, registers for local variables
1865
1866 .. _local-reg-vars:
1867
1868 You can define a local register variable and associate it with a specified
1869 register like this:
1870
1871 .. code-block:: c++
1872
  register int *foo asm ("r12");
1874
1875 Here ``r12`` is the name of the register that should be used. Note
1876 that this is the same syntax used for defining global register variables,
1877 but for a local variable the declaration appears within a function. The
1878 ``register`` keyword is required, and cannot be combined with
1879 ``static``. The register name must be a valid register name for the
1880 target platform.
1881
1882 Do not use type qualifiers such as ``const`` and ``volatile``, as
1883 the outcome may be contrary to expectations. In particular, when the
1884 ``const`` qualifier is used, the compiler may substitute the
1885 variable with its initializer in ``asm`` statements, which may cause
1886 the corresponding operand to appear in a different register.
1887
1888 As with global register variables, it is recommended that you choose
1889 a register that is normally saved and restored by function calls on your
1890 machine, so that calls to library routines will not clobber it.
1891
1892 The only supported use for this feature is to specify registers
1893 for input and output operands when calling Extended ``asm``
1894 (see :ref:`extended-asm`). This may be necessary if the constraints for a
1895 particular machine don't provide sufficient control to select the desired
1896 register. To force an operand into a register, create a local variable
1897 and specify the register name after the variable's declaration. Then use
1898 the local variable for the ``asm`` operand and specify any constraint
1899 letter that matches the register:
1900
1901 .. code-block:: c++
1902
  register int *p1 asm ("r0") = ...;
  register int *p2 asm ("r1") = ...;
  register int *result asm ("r0");
  asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
1907
.. warning::

  In the above example, be aware that a register (for example
  ``r0``) can be call-clobbered by subsequent code, including function
  calls and library calls for arithmetic operators on other variables (for
  example, the initialization of ``p2``). In this case, use temporary
  variables for expressions between the register assignments:

  .. code-block:: c++

    int *t1 = ...;
    register int *p1 asm ("r0") = ...;
    register int *p2 asm ("r1") = t1;
    register int *result asm ("r0");
    asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
1923
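For a version that can actually be compiled, the sketch below assumes an
x86-64 target and AT&T assembler syntax. It places an input operand in
``r10``, a register for which the x86 constraints offer no dedicated letter.
The function name ``add_through_r10`` is hypothetical, and on other targets
the sketch falls back to plain C.

```c
/* Hypothetical helper: adds B to A, passing B to the asm in r10.  */
long
add_through_r10 (long a, long b)
{
#if defined(__GNUC__) && defined(__x86_64__)
  /* Bind the input to r10; the generic "r" constraint below then
     matches that register.  */
  register long addend __asm__ ("r10") = b;
  long sum = a;

  /* AT&T syntax: add operand 1 (addend) into operand 0 (sum).  */
  __asm__ ("addq %1, %0" : "+r" (sum) : "r" (addend) : "cc");
  return sum;
#else
  /* Fallback for other targets: no inline assembly.  */
  return a + b;
#endif
}
```

The register binding is only relied upon at the ``asm`` statement itself,
which is the supported use described above.
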
Defining a register variable does not reserve the register. Other than
when invoking the Extended ``asm``, the contents of the specified
register are not guaranteed. For this reason, the following uses
are explicitly *not* supported. If they appear to work, it is only
happenstance, and may stop working as intended due to (seemingly)
unrelated changes in surrounding code, or even minor changes in the
optimization of a future version of GCC:
1931
1932 * Passing parameters to or from Basic ``asm``
1933
1934 * Passing parameters to or from Extended ``asm`` without using input
1935 or output operands.
1936
1937 * Passing parameters to or from routines written in assembler (or
1938 other languages) using non-standard calling conventions.
1939
Some developers use local register variables in an attempt to improve
GCC's allocation of registers, especially in large functions. In this
case the register name is essentially a hint to the register allocator.
While in some instances this can generate better code, improvements are
subject to the whims of the allocator/optimizers. Since there are no
guarantees that your improvements won't be lost, this usage of local
register variables is discouraged.
1947
On the MIPS platform, there is a related use for local register variables
with slightly different characteristics (see :ref:`gccint:mips-coprocessors`).
1950
1951 .. _size-of-an-asm:
1952
1953 Size of an asm
1954 ^^^^^^^^^^^^^^
1955
1956 Some targets require that GCC track the size of each instruction used
1957 in order to generate correct code. Because the final length of the
1958 code produced by an ``asm`` statement is only known by the
1959 assembler, GCC must make an estimate as to how big it will be. It
1960 does this by counting the number of instructions in the pattern of the
1961 ``asm`` and multiplying that by the length of the longest
1962 instruction supported by that processor. (When working out the number
1963 of instructions, it assumes that any occurrence of a newline or of
1964 whatever statement separator character is supported by the assembler ---
1965 typically :samp:`;` --- indicates the end of an instruction.)
1966
1967 Normally, GCC's estimate is adequate to ensure that correct
1968 code is generated, but it is possible to confuse the compiler if you use
1969 pseudo instructions or assembler macros that expand into multiple real
1970 instructions, or if you use assembler directives that expand to more
1971 space in the object file than is needed for a single instruction.
1972 If this happens then the assembler may produce a diagnostic saying that
1973 a label is unreachable.
1974
1975 .. index:: asm inline
1976
This size is also used for inlining decisions. If you use ``asm inline``
instead of just ``asm``, then for inlining purposes the size of the asm
is taken as the minimum size, regardless of how many instructions GCC
estimates it contains.
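
As a small, hedged illustration, the template below consists only of
newlines, so GCC's estimate sees several instructions although the assembler
emits nothing. The macro ``ASM_INLINE`` and the functions ``bump`` and
``bump_twice`` are hypothetical; the macro uses the alternate spelling
``__asm__ __inline__`` and assumes GCC 9 or later, falling back to a plain
``__asm__`` otherwise.

```c
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 9
/* Alternate-keyword spelling of "asm inline".  */
# define ASM_INLINE __asm__ __inline__
#else
/* Fallback without the size hint.  */
# define ASM_INLINE __asm__
#endif

/* Hypothetical helper.  The template holds only newlines, so the size
   estimate counts several instructions while no code is produced.  */
static inline int
bump (int x)
{
  ASM_INLINE ("\n\n\n" : "+r" (x));
  return x + 1;
}

/* Two uses of the helper; the inline qualifier keeps the asm from
   weighing against inlining bump here.  */
int
bump_twice (int x)
{
  return bump (bump (x));
}
```

Whether ``bump`` is actually inlined still depends on the optimization
level and the other inlining heuristics.
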