1 ..
2 Copyright 1988-2022 Free Software Foundation, Inc.
3 This is part of the GCC manual.
4 For copying conditions, see the copyright.rst file.
5
6 .. _x86-built-in-functions:
7
8 x86 Built-in Functions
9 ^^^^^^^^^^^^^^^^^^^^^^
10
11 These built-in functions are available for the x86-32 and x86-64 family
12 of computers, depending on the command-line switches used.
13
If you specify command-line switches such as :option:`-msse`,
the compiler may use the extended instruction sets even if the built-ins
are not used explicitly in the program. For this reason, applications
that perform run-time CPU detection must compile separate files for each
supported architecture, using the appropriate flags. In particular,
the file containing the CPU detection code should be compiled without
these options.
21
The following machine modes are available for use with MMX built-in functions
(see :ref:`vector-extensions`): ``V2SI`` for a vector of two 32-bit integers,
``V4HI`` for a vector of four 16-bit integers, and ``V8QI`` for a
vector of eight 8-bit integers. Some of the built-in functions operate on
MMX registers as a whole 64-bit entity; these use ``V1DI`` as their mode.
27
28 If 3DNow! extensions are enabled, ``V2SF`` is used as a mode for a vector
29 of two 32-bit floating-point values.
30
If SSE extensions are enabled, ``V4SF`` is used for a vector of four 32-bit
floating-point values. Some instructions use a vector of four 32-bit
integers; these use ``V4SI``. Finally, some instructions operate on an
entire vector register, interpreting it as a 128-bit integer; these use mode
``TI``.
36
The x86-32 and x86-64 family of processors use additional built-in
functions for efficient use of ``TF`` (``__float128``) 128-bit
floating-point and ``TC`` 128-bit complex floating-point values.
40
41 The following floating-point built-in functions are always available. All
42 of them implement the function that is part of the name.
43
44 .. code-block:: c++
45
46 __float128 __builtin_fabsq (__float128)
47 __float128 __builtin_copysignq (__float128, __float128)
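
As a minimal sketch of how these two built-ins might be used, the helpers
below compute an absolute difference and transfer a sign between two
``__float128`` values. The names ``abs_diff_q`` and ``apply_sign_q`` are
hypothetical and are not part of GCC.

```c
/* Sketch only: abs_diff_q and apply_sign_q are hypothetical helpers,
   not GCC APIs.  */

/* Return |a - b| computed in 128-bit floating point.  */
__float128
abs_diff_q (__float128 a, __float128 b)
{
  return __builtin_fabsq (a - b);
}

/* Return VALUE with its sign bit replaced by the sign bit of SIGN_SOURCE.  */
__float128
apply_sign_q (__float128 value, __float128 sign_source)
{
  return __builtin_copysignq (value, sign_source);
}
```

For instance, ``abs_diff_q (1.5, 4.0)`` yields ``2.5``, and
``apply_sign_q (2.0, -0.0)`` yields ``-2.0`` because only the sign bit of the
second operand is consulted.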
48
49 The following built-in functions are always available.
50
51 .. function:: __float128 __builtin_infq (void)
52
53 Similar to ``__builtin_inf``, except the return type is ``__float128``.
54
55 .. index:: __builtin_infq
56
57 .. function:: __float128 __builtin_huge_valq (void)
58
59 Similar to ``__builtin_huge_val``, except the return type is ``__float128``.
60
61 .. index:: __builtin_huge_valq
62
63 .. function:: __float128 __builtin_nanq (void)
64
65 Similar to ``__builtin_nan``, except the return type is ``__float128``.
66
67 .. index:: __builtin_nanq
68
69 .. function:: __float128 __builtin_nansq (void)
70
71 Similar to ``__builtin_nans``, except the return type is ``__float128``.
72
73 .. index:: __builtin_nansq
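
The following sketch shows one possible use of ``__builtin_infq`` together
with ``__builtin_fabsq`` to test a ``__float128`` value for infinity. The
helper name ``is_infinite_q`` is hypothetical.

```c
/* Sketch only: is_infinite_q is a hypothetical helper, not a GCC API.  */
int
is_infinite_q (__float128 x)
{
  /* |x| compares equal to +Inf only when x is +Inf or -Inf; a NaN
     compares unequal to everything, so it yields 0 here.  */
  return __builtin_fabsq (x) == __builtin_infq ();
}
```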
74
75 The following built-in function is always available.
76
77 .. function:: void __builtin_ia32_pause (void)
78
79 Generates the ``pause`` machine instruction with a compiler memory
80 barrier.
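
A typical place for ``pause`` is the body of a busy-wait loop. The sketch
below assumes a C11 ``atomic_int`` flag that another thread eventually sets;
the function name ``spin_until_set`` and the ``max_spins`` cap are
hypothetical and exist only to keep the example bounded.

```c
#include <stdatomic.h>

/* Sketch only: spin_until_set and its iteration cap are hypothetical.
   Returns the number of times the loop paused before it stopped.  */
unsigned long
spin_until_set (atomic_int *flag, unsigned long max_spins)
{
  unsigned long spins = 0;

  /* Poll the flag; between polls, tell the CPU this is a spin-wait.  */
  while (!atomic_load_explicit (flag, memory_order_acquire)
         && spins < max_spins)
    {
      __builtin_ia32_pause ();
      spins++;
    }
  return spins;
}
```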
81
82 The following built-in functions are always available and can be used to
83 check the target platform type.
84
85 .. function:: void __builtin_cpu_init (void)
86
This function runs the CPU detection code to check the type of CPU and the
features supported. The CPU detection code is executed automatically in a
very high priority constructor, so an explicit call is needed only when
``__builtin_cpu_is`` or ``__builtin_cpu_supports`` is used in a function
that is executed before any constructors are called.
93
94 For example, this function has to be used in ``ifunc`` resolvers that
95 check for CPU type using the built-in functions ``__builtin_cpu_is``
96 and ``__builtin_cpu_supports``, or in constructors on targets that
97 don't support constructor priority.
98
.. code-block:: c++

  static void (*resolve_memcpy (void)) (void)
  {
    // ifunc resolvers fire before constructors, explicitly call the init
    // function.
    __builtin_cpu_init ();
    if (__builtin_cpu_supports ("ssse3"))
      return ssse3_memcpy; // super fast memcpy with ssse3 instructions.
    else
      return default_memcpy;
  }

  void *memcpy (void *, const void *, size_t)
       __attribute__ ((ifunc ("resolve_memcpy")));
114
115 .. function:: int __builtin_cpu_is (const char *cpuname)
116
117 This function returns a positive integer if the run-time CPU
118 is of type :samp:`{cpuname}`
119 and returns ``0`` otherwise. The following CPU names can be detected:
120
121 :samp:`amd`
122 AMD CPU.
123
124 :samp:`intel`
125 Intel CPU.
126
127 :samp:`atom`
128 Intel Atom CPU.
129
130 :samp:`slm`
131 Intel Silvermont CPU.
132
133 :samp:`core2`
134 Intel Core 2 CPU.
135
136 :samp:`corei7`
137 Intel Core i7 CPU.
138
139 :samp:`nehalem`
140 Intel Core i7 Nehalem CPU.
141
142 :samp:`westmere`
143 Intel Core i7 Westmere CPU.
144
145 :samp:`sandybridge`
146 Intel Core i7 Sandy Bridge CPU.
147
148 :samp:`ivybridge`
149 Intel Core i7 Ivy Bridge CPU.
150
151 :samp:`haswell`
152 Intel Core i7 Haswell CPU.
153
154 :samp:`broadwell`
155 Intel Core i7 Broadwell CPU.
156
157 :samp:`skylake`
158 Intel Core i7 Skylake CPU.
159
160 :samp:`skylake-avx512`
161 Intel Core i7 Skylake AVX512 CPU.
162
163 :samp:`cannonlake`
164 Intel Core i7 Cannon Lake CPU.
165
166 :samp:`icelake-client`
167 Intel Core i7 Ice Lake Client CPU.
168
169 :samp:`icelake-server`
170 Intel Core i7 Ice Lake Server CPU.
171
:samp:`cascadelake`
Intel Core i7 Cascade Lake CPU.

:samp:`tigerlake`
Intel Core i7 Tiger Lake CPU.

:samp:`cooperlake`
Intel Core i7 Cooper Lake CPU.

:samp:`sapphirerapids`
Intel Core i7 Sapphire Rapids CPU.

:samp:`alderlake`
Intel Core i7 Alder Lake CPU.

:samp:`rocketlake`
Intel Core i7 Rocket Lake CPU.

:samp:`graniterapids`
Intel Core i7 Granite Rapids CPU.
192
193 :samp:`bonnell`
194 Intel Atom Bonnell CPU.
195
196 :samp:`silvermont`
197 Intel Atom Silvermont CPU.
198
199 :samp:`goldmont`
200 Intel Atom Goldmont CPU.
201
202 :samp:`goldmont-plus`
203 Intel Atom Goldmont Plus CPU.
204
205 :samp:`tremont`
206 Intel Atom Tremont CPU.
207
208 :samp:`sierraforest`
209 Intel Atom Sierra Forest CPU.
210
211 :samp:`grandridge`
212 Intel Atom Grand Ridge CPU.
213
214 :samp:`knl`
215 Intel Knights Landing CPU.
216
217 :samp:`knm`
218 Intel Knights Mill CPU.
219
220 :samp:`lujiazui`
221 ZHAOXIN lujiazui CPU.
222
223 :samp:`amdfam10h`
224 AMD Family 10h CPU.
225
226 :samp:`barcelona`
227 AMD Family 10h Barcelona CPU.
228
229 :samp:`shanghai`
230 AMD Family 10h Shanghai CPU.
231
232 :samp:`istanbul`
233 AMD Family 10h Istanbul CPU.
234
235 :samp:`btver1`
236 AMD Family 14h CPU.
237
238 :samp:`amdfam15h`
239 AMD Family 15h CPU.
240
241 :samp:`bdver1`
242 AMD Family 15h Bulldozer version 1.
243
244 :samp:`bdver2`
245 AMD Family 15h Bulldozer version 2.
246
247 :samp:`bdver3`
248 AMD Family 15h Bulldozer version 3.
249
250 :samp:`bdver4`
251 AMD Family 15h Bulldozer version 4.
252
253 :samp:`btver2`
254 AMD Family 16h CPU.
255
256 :samp:`amdfam17h`
257 AMD Family 17h CPU.
258
259 :samp:`znver1`
260 AMD Family 17h Zen version 1.
261
262 :samp:`znver2`
263 AMD Family 17h Zen version 2.
264
265 :samp:`amdfam19h`
266 AMD Family 19h CPU.
267
268 :samp:`znver3`
269 AMD Family 19h Zen version 3.
270
271 :samp:`znver4`
272 AMD Family 19h Zen version 4.
273
274 :samp:`x86-64`
275 Baseline x86-64 microarchitecture level (as defined in x86-64 psABI).
276
277 :samp:`x86-64-v2`
278 x86-64-v2 microarchitecture level.
279
280 :samp:`x86-64-v3`
281 x86-64-v3 microarchitecture level.
282
283 :samp:`x86-64-v4`
284 x86-64-v4 microarchitecture level.
285
286 Here is an example:
287
.. code-block:: c++

  if (__builtin_cpu_is ("corei7"))
    {
      do_corei7 (); // Core i7 specific implementation.
    }
  else
    {
      do_generic (); // Generic implementation.
    }
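
A slightly fuller sketch wraps the vendor checks in a function. The name
``cpu_vendor_name`` and the ``"other"`` fallback string are hypothetical; the
explicit ``__builtin_cpu_init`` call is included only so that the helper also
works if it happens to be called before constructors have run.

```c
/* Sketch only: cpu_vendor_name is a hypothetical helper, not a GCC API.  */
const char *
cpu_vendor_name (void)
{
  /* Needed only when running before constructors; harmless otherwise.  */
  __builtin_cpu_init ();

  if (__builtin_cpu_is ("intel"))
    return "intel";
  if (__builtin_cpu_is ("amd"))
    return "amd";
  return "other";
}
```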
298
299 .. function:: int __builtin_cpu_supports (const char *feature)
300
301 This function returns a positive integer if the run-time CPU
302 supports :samp:`{feature}`
303 and returns ``0`` otherwise. The following features can be detected:
304
305 :samp:`cmov`
306 CMOV instruction.
307
308 :samp:`mmx`
309 MMX instructions.
310
311 :samp:`popcnt`
312 POPCNT instruction.
313
314 :samp:`sse`
315 SSE instructions.
316
317 :samp:`sse2`
318 SSE2 instructions.
319
320 :samp:`sse3`
321 SSE3 instructions.
322
323 :samp:`ssse3`
324 SSSE3 instructions.
325
326 :samp:`sse4.1`
327 SSE4.1 instructions.
328
329 :samp:`sse4.2`
330 SSE4.2 instructions.
331
332 :samp:`avx`
333 AVX instructions.
334
335 :samp:`avx2`
336 AVX2 instructions.
337
338 :samp:`sse4a`
339 SSE4A instructions.
340
341 :samp:`fma4`
342 FMA4 instructions.
343
344 :samp:`xop`
345 XOP instructions.
346
347 :samp:`fma`
348 FMA instructions.
349
350 :samp:`avx512f`
351 AVX512F instructions.
352
353 :samp:`bmi`
354 BMI instructions.
355
356 :samp:`bmi2`
357 BMI2 instructions.
358
359 :samp:`aes`
360 AES instructions.
361
362 :samp:`pclmul`
363 PCLMUL instructions.
364
365 :samp:`avx512vl`
366 AVX512VL instructions.
367
368 :samp:`avx512bw`
369 AVX512BW instructions.
370
371 :samp:`avx512dq`
372 AVX512DQ instructions.
373
374 :samp:`avx512cd`
375 AVX512CD instructions.
376
377 :samp:`avx512er`
378 AVX512ER instructions.
379
380 :samp:`avx512pf`
381 AVX512PF instructions.
382
383 :samp:`avx512vbmi`
384 AVX512VBMI instructions.
385
386 :samp:`avx512ifma`
387 AVX512IFMA instructions.
388
389 :samp:`avx5124vnniw`
390 AVX5124VNNIW instructions.
391
392 :samp:`avx5124fmaps`
393 AVX5124FMAPS instructions.
394
395 :samp:`avx512vpopcntdq`
396 AVX512VPOPCNTDQ instructions.
397
398 :samp:`avx512vbmi2`
399 AVX512VBMI2 instructions.
400
401 :samp:`gfni`
402 GFNI instructions.
403
404 :samp:`vpclmulqdq`
405 VPCLMULQDQ instructions.
406
407 :samp:`avx512vnni`
408 AVX512VNNI instructions.
409
410 :samp:`avx512bitalg`
411 AVX512BITALG instructions.
412
413 Here is an example:
414
.. code-block:: c++

  if (__builtin_cpu_supports ("popcnt"))
    {
      asm("popcnt %1,%0" : "=r"(count) : "rm"(n) : "cc");
    }
  else
    {
      count = generic_countbits (n); // Generic implementation.
    }
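
The same dispatch can be sketched as a self-contained function without
inline assembly. This version assumes the ``target`` function attribute
(documented elsewhere in the manual, not in this section) to enable POPCNT for
one helper only; ``count_bits``, ``popcnt_countbits`` and this particular
``generic_countbits`` body are hypothetical.

```c
/* Portable fallback: clear the lowest set bit until none remain.  */
static int
generic_countbits (unsigned int n)
{
  int count = 0;
  while (n)
    {
      n &= n - 1;
      count++;
    }
  return count;
}

/* POPCNT is enabled for this function only, so the rest of the file
   stays safe to run on CPUs that lack the instruction.  */
static __attribute__ ((target ("popcnt"))) int
popcnt_countbits (unsigned int n)
{
  return __builtin_popcount (n);
}

/* Sketch only: count_bits and both helpers are hypothetical names.  */
int
count_bits (unsigned int n)
{
  if (__builtin_cpu_supports ("popcnt"))
    return popcnt_countbits (n);
  return generic_countbits (n);
}
```

Keeping the feature-specific code in its own function means the run-time
check, not the compiler flags of the whole file, decides whether the
instruction is ever executed.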
425
426 The following built-in functions are made available by :option:`-mmmx`.
427 All of them generate the machine instruction that is part of the name.
428
429 .. code-block:: c++
430
431 v8qi __builtin_ia32_paddb (v8qi, v8qi);
432 v4hi __builtin_ia32_paddw (v4hi, v4hi);
433 v2si __builtin_ia32_paddd (v2si, v2si);
434 v8qi __builtin_ia32_psubb (v8qi, v8qi);
435 v4hi __builtin_ia32_psubw (v4hi, v4hi);
436 v2si __builtin_ia32_psubd (v2si, v2si);
437 v8qi __builtin_ia32_paddsb (v8qi, v8qi);
438 v4hi __builtin_ia32_paddsw (v4hi, v4hi);
439 v8qi __builtin_ia32_psubsb (v8qi, v8qi);
440 v4hi __builtin_ia32_psubsw (v4hi, v4hi);
441 v8qi __builtin_ia32_paddusb (v8qi, v8qi);
442 v4hi __builtin_ia32_paddusw (v4hi, v4hi);
443 v8qi __builtin_ia32_psubusb (v8qi, v8qi);
444 v4hi __builtin_ia32_psubusw (v4hi, v4hi);
445 v4hi __builtin_ia32_pmullw (v4hi, v4hi);
446 v4hi __builtin_ia32_pmulhw (v4hi, v4hi);
447 di __builtin_ia32_pand (di, di);
448 di __builtin_ia32_pandn (di,di);
449 di __builtin_ia32_por (di, di);
450 di __builtin_ia32_pxor (di, di);
451 v8qi __builtin_ia32_pcmpeqb (v8qi, v8qi);
452 v4hi __builtin_ia32_pcmpeqw (v4hi, v4hi);
453 v2si __builtin_ia32_pcmpeqd (v2si, v2si);
454 v8qi __builtin_ia32_pcmpgtb (v8qi, v8qi);
455 v4hi __builtin_ia32_pcmpgtw (v4hi, v4hi);
456 v2si __builtin_ia32_pcmpgtd (v2si, v2si);
457 v8qi __builtin_ia32_punpckhbw (v8qi, v8qi);
458 v4hi __builtin_ia32_punpckhwd (v4hi, v4hi);
459 v2si __builtin_ia32_punpckhdq (v2si, v2si);
460 v8qi __builtin_ia32_punpcklbw (v8qi, v8qi);
461 v4hi __builtin_ia32_punpcklwd (v4hi, v4hi);
462 v2si __builtin_ia32_punpckldq (v2si, v2si);
463 v8qi __builtin_ia32_packsswb (v4hi, v4hi);
464 v4hi __builtin_ia32_packssdw (v2si, v2si);
465 v8qi __builtin_ia32_packuswb (v4hi, v4hi);
466
467 v4hi __builtin_ia32_psllw (v4hi, v4hi);
468 v2si __builtin_ia32_pslld (v2si, v2si);
469 v1di __builtin_ia32_psllq (v1di, v1di);
470 v4hi __builtin_ia32_psrlw (v4hi, v4hi);
471 v2si __builtin_ia32_psrld (v2si, v2si);
472 v1di __builtin_ia32_psrlq (v1di, v1di);
473 v4hi __builtin_ia32_psraw (v4hi, v4hi);
474 v2si __builtin_ia32_psrad (v2si, v2si);
475 v4hi __builtin_ia32_psllwi (v4hi, int);
476 v2si __builtin_ia32_pslldi (v2si, int);
477 v1di __builtin_ia32_psllqi (v1di, int);
478 v4hi __builtin_ia32_psrlwi (v4hi, int);
479 v2si __builtin_ia32_psrldi (v2si, int);
480 v1di __builtin_ia32_psrlqi (v1di, int);
481 v4hi __builtin_ia32_psrawi (v4hi, int);
482 v2si __builtin_ia32_psradi (v2si, int);
483
484 The following built-in functions are made available either with
485 :option:`-msse`, or with :option:`-m3dnowa`. All of them generate
486 the machine instruction that is part of the name.
487
488 .. code-block:: c++
489
490 v4hi __builtin_ia32_pmulhuw (v4hi, v4hi);
491 v8qi __builtin_ia32_pavgb (v8qi, v8qi);
492 v4hi __builtin_ia32_pavgw (v4hi, v4hi);
493 v1di __builtin_ia32_psadbw (v8qi, v8qi);
494 v8qi __builtin_ia32_pmaxub (v8qi, v8qi);
495 v4hi __builtin_ia32_pmaxsw (v4hi, v4hi);
496 v8qi __builtin_ia32_pminub (v8qi, v8qi);
497 v4hi __builtin_ia32_pminsw (v4hi, v4hi);
498 int __builtin_ia32_pmovmskb (v8qi);
499 void __builtin_ia32_maskmovq (v8qi, v8qi, char *);
500 void __builtin_ia32_movntq (di *, di);
501 void __builtin_ia32_sfence (void);
502
503 The following built-in functions are available when :option:`-msse` is used.
504 All of them generate the machine instruction that is part of the name.
505
506 .. code-block:: c++
507
508 int __builtin_ia32_comieq (v4sf, v4sf);
509 int __builtin_ia32_comineq (v4sf, v4sf);
510 int __builtin_ia32_comilt (v4sf, v4sf);
511 int __builtin_ia32_comile (v4sf, v4sf);
512 int __builtin_ia32_comigt (v4sf, v4sf);
513 int __builtin_ia32_comige (v4sf, v4sf);
514 int __builtin_ia32_ucomieq (v4sf, v4sf);
515 int __builtin_ia32_ucomineq (v4sf, v4sf);
516 int __builtin_ia32_ucomilt (v4sf, v4sf);
517 int __builtin_ia32_ucomile (v4sf, v4sf);
518 int __builtin_ia32_ucomigt (v4sf, v4sf);
519 int __builtin_ia32_ucomige (v4sf, v4sf);
520 v4sf __builtin_ia32_addps (v4sf, v4sf);
521 v4sf __builtin_ia32_subps (v4sf, v4sf);
522 v4sf __builtin_ia32_mulps (v4sf, v4sf);
523 v4sf __builtin_ia32_divps (v4sf, v4sf);
524 v4sf __builtin_ia32_addss (v4sf, v4sf);
525 v4sf __builtin_ia32_subss (v4sf, v4sf);
526 v4sf __builtin_ia32_mulss (v4sf, v4sf);
527 v4sf __builtin_ia32_divss (v4sf, v4sf);
528 v4sf __builtin_ia32_cmpeqps (v4sf, v4sf);
529 v4sf __builtin_ia32_cmpltps (v4sf, v4sf);
530 v4sf __builtin_ia32_cmpleps (v4sf, v4sf);
531 v4sf __builtin_ia32_cmpgtps (v4sf, v4sf);
532 v4sf __builtin_ia32_cmpgeps (v4sf, v4sf);
533 v4sf __builtin_ia32_cmpunordps (v4sf, v4sf);
534 v4sf __builtin_ia32_cmpneqps (v4sf, v4sf);
535 v4sf __builtin_ia32_cmpnltps (v4sf, v4sf);
536 v4sf __builtin_ia32_cmpnleps (v4sf, v4sf);
537 v4sf __builtin_ia32_cmpngtps (v4sf, v4sf);
538 v4sf __builtin_ia32_cmpngeps (v4sf, v4sf);
539 v4sf __builtin_ia32_cmpordps (v4sf, v4sf);
540 v4sf __builtin_ia32_cmpeqss (v4sf, v4sf);
541 v4sf __builtin_ia32_cmpltss (v4sf, v4sf);
542 v4sf __builtin_ia32_cmpless (v4sf, v4sf);
543 v4sf __builtin_ia32_cmpunordss (v4sf, v4sf);
544 v4sf __builtin_ia32_cmpneqss (v4sf, v4sf);
545 v4sf __builtin_ia32_cmpnltss (v4sf, v4sf);
546 v4sf __builtin_ia32_cmpnless (v4sf, v4sf);
547 v4sf __builtin_ia32_cmpordss (v4sf, v4sf);
548 v4sf __builtin_ia32_maxps (v4sf, v4sf);
549 v4sf __builtin_ia32_maxss (v4sf, v4sf);
550 v4sf __builtin_ia32_minps (v4sf, v4sf);
551 v4sf __builtin_ia32_minss (v4sf, v4sf);
552 v4sf __builtin_ia32_andps (v4sf, v4sf);
553 v4sf __builtin_ia32_andnps (v4sf, v4sf);
554 v4sf __builtin_ia32_orps (v4sf, v4sf);
555 v4sf __builtin_ia32_xorps (v4sf, v4sf);
556 v4sf __builtin_ia32_movss (v4sf, v4sf);
557 v4sf __builtin_ia32_movhlps (v4sf, v4sf);
558 v4sf __builtin_ia32_movlhps (v4sf, v4sf);
559 v4sf __builtin_ia32_unpckhps (v4sf, v4sf);
560 v4sf __builtin_ia32_unpcklps (v4sf, v4sf);
561 v4sf __builtin_ia32_cvtpi2ps (v4sf, v2si);
562 v4sf __builtin_ia32_cvtsi2ss (v4sf, int);
563 v2si __builtin_ia32_cvtps2pi (v4sf);
564 int __builtin_ia32_cvtss2si (v4sf);
565 v2si __builtin_ia32_cvttps2pi (v4sf);
566 int __builtin_ia32_cvttss2si (v4sf);
567 v4sf __builtin_ia32_rcpps (v4sf);
568 v4sf __builtin_ia32_rsqrtps (v4sf);
569 v4sf __builtin_ia32_sqrtps (v4sf);
570 v4sf __builtin_ia32_rcpss (v4sf);
571 v4sf __builtin_ia32_rsqrtss (v4sf);
572 v4sf __builtin_ia32_sqrtss (v4sf);
573 v4sf __builtin_ia32_shufps (v4sf, v4sf, int);
574 void __builtin_ia32_movntps (float *, v4sf);
575 int __builtin_ia32_movmskps (v4sf);
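
As an illustration of how these built-ins combine, the sketch below compares
two groups of four floats and returns a 4-bit mask of the lanes where the
first is smaller. It assumes SSE is enabled (the default on x86-64) and
declares ``v4sf`` with the ``vector_size`` attribute; the helper name
``less_than_mask4`` is hypothetical.

```c
/* Vector type matching the V4SF mode: four 32-bit floats in 16 bytes.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Sketch only: less_than_mask4 is a hypothetical helper.  */
int
less_than_mask4 (const float *a, const float *b)
{
  v4sf va = { a[0], a[1], a[2], a[3] };
  v4sf vb = { b[0], b[1], b[2], b[3] };

  /* Each lane becomes all-ones where a < b and all-zeros elsewhere.  */
  v4sf lt = __builtin_ia32_cmpltps (va, vb);

  /* Gather the four lane sign bits into bits 0..3 of an int.  */
  return __builtin_ia32_movmskps (lt);
}
```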
576
577 The following built-in functions are available when :option:`-msse` is used.
578
579 .. function:: v4sf __builtin_ia32_loadups (float *)
580
581 Generates the ``movups`` machine instruction as a load from memory.
582
583 .. function:: void __builtin_ia32_storeups (float *, v4sf)
584
585 Generates the ``movups`` machine instruction as a store to memory.
586
587 .. function:: v4sf __builtin_ia32_loadss (float *)
588
589 Generates the ``movss`` machine instruction as a load from memory.
590
591 .. function:: v4sf __builtin_ia32_loadhps (v4sf, const v2sf *)
592
593 Generates the ``movhps`` machine instruction as a load from memory.
594
595 .. function:: v4sf __builtin_ia32_loadlps (v4sf, const v2sf *)
596
Generates the ``movlps`` machine instruction as a load from memory.
598
599 .. function:: void __builtin_ia32_storehps (v2sf *, v4sf)
600
601 Generates the ``movhps`` machine instruction as a store to memory.
602
603 .. function:: void __builtin_ia32_storelps (v2sf *, v4sf)
604
605 Generates the ``movlps`` machine instruction as a store to memory.
606
The following built-in functions are available when :option:`-msse2` is used.
All of them generate the machine instruction that is part of the name.
610
611 .. code-block:: c++
612
613 int __builtin_ia32_comisdeq (v2df, v2df);
614 int __builtin_ia32_comisdlt (v2df, v2df);
615 int __builtin_ia32_comisdle (v2df, v2df);
616 int __builtin_ia32_comisdgt (v2df, v2df);
617 int __builtin_ia32_comisdge (v2df, v2df);
618 int __builtin_ia32_comisdneq (v2df, v2df);
619 int __builtin_ia32_ucomisdeq (v2df, v2df);
620 int __builtin_ia32_ucomisdlt (v2df, v2df);
621 int __builtin_ia32_ucomisdle (v2df, v2df);
622 int __builtin_ia32_ucomisdgt (v2df, v2df);
623 int __builtin_ia32_ucomisdge (v2df, v2df);
624 int __builtin_ia32_ucomisdneq (v2df, v2df);
625 v2df __builtin_ia32_cmpeqpd (v2df, v2df);
626 v2df __builtin_ia32_cmpltpd (v2df, v2df);
627 v2df __builtin_ia32_cmplepd (v2df, v2df);
628 v2df __builtin_ia32_cmpgtpd (v2df, v2df);
629 v2df __builtin_ia32_cmpgepd (v2df, v2df);
630 v2df __builtin_ia32_cmpunordpd (v2df, v2df);
631 v2df __builtin_ia32_cmpneqpd (v2df, v2df);
632 v2df __builtin_ia32_cmpnltpd (v2df, v2df);
633 v2df __builtin_ia32_cmpnlepd (v2df, v2df);
634 v2df __builtin_ia32_cmpngtpd (v2df, v2df);
635 v2df __builtin_ia32_cmpngepd (v2df, v2df);
636 v2df __builtin_ia32_cmpordpd (v2df, v2df);
637 v2df __builtin_ia32_cmpeqsd (v2df, v2df);
638 v2df __builtin_ia32_cmpltsd (v2df, v2df);
639 v2df __builtin_ia32_cmplesd (v2df, v2df);
640 v2df __builtin_ia32_cmpunordsd (v2df, v2df);
641 v2df __builtin_ia32_cmpneqsd (v2df, v2df);
642 v2df __builtin_ia32_cmpnltsd (v2df, v2df);
643 v2df __builtin_ia32_cmpnlesd (v2df, v2df);
644 v2df __builtin_ia32_cmpordsd (v2df, v2df);
645 v2di __builtin_ia32_paddq (v2di, v2di);
646 v2di __builtin_ia32_psubq (v2di, v2di);
647 v2df __builtin_ia32_addpd (v2df, v2df);
648 v2df __builtin_ia32_subpd (v2df, v2df);
649 v2df __builtin_ia32_mulpd (v2df, v2df);
650 v2df __builtin_ia32_divpd (v2df, v2df);
651 v2df __builtin_ia32_addsd (v2df, v2df);
652 v2df __builtin_ia32_subsd (v2df, v2df);
653 v2df __builtin_ia32_mulsd (v2df, v2df);
654 v2df __builtin_ia32_divsd (v2df, v2df);
655 v2df __builtin_ia32_minpd (v2df, v2df);
656 v2df __builtin_ia32_maxpd (v2df, v2df);
657 v2df __builtin_ia32_minsd (v2df, v2df);
658 v2df __builtin_ia32_maxsd (v2df, v2df);
659 v2df __builtin_ia32_andpd (v2df, v2df);
660 v2df __builtin_ia32_andnpd (v2df, v2df);
661 v2df __builtin_ia32_orpd (v2df, v2df);
662 v2df __builtin_ia32_xorpd (v2df, v2df);
663 v2df __builtin_ia32_movsd (v2df, v2df);
664 v2df __builtin_ia32_unpckhpd (v2df, v2df);
665 v2df __builtin_ia32_unpcklpd (v2df, v2df);
666 v16qi __builtin_ia32_paddb128 (v16qi, v16qi);
667 v8hi __builtin_ia32_paddw128 (v8hi, v8hi);
668 v4si __builtin_ia32_paddd128 (v4si, v4si);
669 v2di __builtin_ia32_paddq128 (v2di, v2di);
670 v16qi __builtin_ia32_psubb128 (v16qi, v16qi);
671 v8hi __builtin_ia32_psubw128 (v8hi, v8hi);
672 v4si __builtin_ia32_psubd128 (v4si, v4si);
673 v2di __builtin_ia32_psubq128 (v2di, v2di);
674 v8hi __builtin_ia32_pmullw128 (v8hi, v8hi);
675 v8hi __builtin_ia32_pmulhw128 (v8hi, v8hi);
676 v2di __builtin_ia32_pand128 (v2di, v2di);
677 v2di __builtin_ia32_pandn128 (v2di, v2di);
678 v2di __builtin_ia32_por128 (v2di, v2di);
679 v2di __builtin_ia32_pxor128 (v2di, v2di);
680 v16qi __builtin_ia32_pavgb128 (v16qi, v16qi);
681 v8hi __builtin_ia32_pavgw128 (v8hi, v8hi);
682 v16qi __builtin_ia32_pcmpeqb128 (v16qi, v16qi);
683 v8hi __builtin_ia32_pcmpeqw128 (v8hi, v8hi);
684 v4si __builtin_ia32_pcmpeqd128 (v4si, v4si);
685 v16qi __builtin_ia32_pcmpgtb128 (v16qi, v16qi);
686 v8hi __builtin_ia32_pcmpgtw128 (v8hi, v8hi);
687 v4si __builtin_ia32_pcmpgtd128 (v4si, v4si);
688 v16qi __builtin_ia32_pmaxub128 (v16qi, v16qi);
689 v8hi __builtin_ia32_pmaxsw128 (v8hi, v8hi);
690 v16qi __builtin_ia32_pminub128 (v16qi, v16qi);
691 v8hi __builtin_ia32_pminsw128 (v8hi, v8hi);
692 v16qi __builtin_ia32_punpckhbw128 (v16qi, v16qi);
693 v8hi __builtin_ia32_punpckhwd128 (v8hi, v8hi);
694 v4si __builtin_ia32_punpckhdq128 (v4si, v4si);
695 v2di __builtin_ia32_punpckhqdq128 (v2di, v2di);
696 v16qi __builtin_ia32_punpcklbw128 (v16qi, v16qi);
697 v8hi __builtin_ia32_punpcklwd128 (v8hi, v8hi);
698 v4si __builtin_ia32_punpckldq128 (v4si, v4si);
699 v2di __builtin_ia32_punpcklqdq128 (v2di, v2di);
700 v16qi __builtin_ia32_packsswb128 (v8hi, v8hi);
701 v8hi __builtin_ia32_packssdw128 (v4si, v4si);
702 v16qi __builtin_ia32_packuswb128 (v8hi, v8hi);
703 v8hi __builtin_ia32_pmulhuw128 (v8hi, v8hi);
void __builtin_ia32_maskmovdqu (v16qi, v16qi, char *);
705 v2df __builtin_ia32_loadupd (double *);
706 void __builtin_ia32_storeupd (double *, v2df);
707 v2df __builtin_ia32_loadhpd (v2df, double const *);
708 v2df __builtin_ia32_loadlpd (v2df, double const *);
709 int __builtin_ia32_movmskpd (v2df);
710 int __builtin_ia32_pmovmskb128 (v16qi);
711 void __builtin_ia32_movnti (int *, int);
712 void __builtin_ia32_movnti64 (long long int *, long long int);
713 void __builtin_ia32_movntpd (double *, v2df);
void __builtin_ia32_movntdq (v2di *, v2di);
715 v4si __builtin_ia32_pshufd (v4si, int);
716 v8hi __builtin_ia32_pshuflw (v8hi, int);
717 v8hi __builtin_ia32_pshufhw (v8hi, int);
718 v2di __builtin_ia32_psadbw128 (v16qi, v16qi);
719 v2df __builtin_ia32_sqrtpd (v2df);
720 v2df __builtin_ia32_sqrtsd (v2df);
721 v2df __builtin_ia32_shufpd (v2df, v2df, int);
722 v2df __builtin_ia32_cvtdq2pd (v4si);
723 v4sf __builtin_ia32_cvtdq2ps (v4si);
724 v4si __builtin_ia32_cvtpd2dq (v2df);
725 v2si __builtin_ia32_cvtpd2pi (v2df);
726 v4sf __builtin_ia32_cvtpd2ps (v2df);
727 v4si __builtin_ia32_cvttpd2dq (v2df);
728 v2si __builtin_ia32_cvttpd2pi (v2df);
729 v2df __builtin_ia32_cvtpi2pd (v2si);
730 int __builtin_ia32_cvtsd2si (v2df);
731 int __builtin_ia32_cvttsd2si (v2df);
732 long long __builtin_ia32_cvtsd2si64 (v2df);
733 long long __builtin_ia32_cvttsd2si64 (v2df);
734 v4si __builtin_ia32_cvtps2dq (v4sf);
735 v2df __builtin_ia32_cvtps2pd (v4sf);
736 v4si __builtin_ia32_cvttps2dq (v4sf);
737 v2df __builtin_ia32_cvtsi2sd (v2df, int);
738 v2df __builtin_ia32_cvtsi642sd (v2df, long long);
739 v4sf __builtin_ia32_cvtsd2ss (v4sf, v2df);
740 v2df __builtin_ia32_cvtss2sd (v2df, v4sf);
741 void __builtin_ia32_clflush (const void *);
742 void __builtin_ia32_lfence (void);
743 void __builtin_ia32_mfence (void);
744 v16qi __builtin_ia32_loaddqu (const char *);
745 void __builtin_ia32_storedqu (char *, v16qi);
746 v1di __builtin_ia32_pmuludq (v2si, v2si);
747 v2di __builtin_ia32_pmuludq128 (v4si, v4si);
748 v8hi __builtin_ia32_psllw128 (v8hi, v8hi);
749 v4si __builtin_ia32_pslld128 (v4si, v4si);
750 v2di __builtin_ia32_psllq128 (v2di, v2di);
751 v8hi __builtin_ia32_psrlw128 (v8hi, v8hi);
752 v4si __builtin_ia32_psrld128 (v4si, v4si);
753 v2di __builtin_ia32_psrlq128 (v2di, v2di);
754 v8hi __builtin_ia32_psraw128 (v8hi, v8hi);
755 v4si __builtin_ia32_psrad128 (v4si, v4si);
756 v2di __builtin_ia32_pslldqi128 (v2di, int);
757 v8hi __builtin_ia32_psllwi128 (v8hi, int);
758 v4si __builtin_ia32_pslldi128 (v4si, int);
759 v2di __builtin_ia32_psllqi128 (v2di, int);
760 v2di __builtin_ia32_psrldqi128 (v2di, int);
761 v8hi __builtin_ia32_psrlwi128 (v8hi, int);
762 v4si __builtin_ia32_psrldi128 (v4si, int);
763 v2di __builtin_ia32_psrlqi128 (v2di, int);
764 v8hi __builtin_ia32_psrawi128 (v8hi, int);
765 v4si __builtin_ia32_psradi128 (v4si, int);
766 v4si __builtin_ia32_pmaddwd128 (v8hi, v8hi);
767 v2di __builtin_ia32_movq128 (v2di);
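
For example, ``__builtin_ia32_psadbw128`` can be used to sum sixteen unsigned
bytes by taking absolute differences against a zero vector. This is a sketch
under the assumption that SSE2 is enabled (the default on x86-64); the vector
typedefs and the helper name ``sum_bytes16`` are hypothetical.

```c
#include <string.h>

/* Vector types matching the V16QI and V2DI modes.  */
typedef char v16qi __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Sketch only: sum_bytes16 is a hypothetical helper.  */
unsigned int
sum_bytes16 (const unsigned char *bytes)
{
  v16qi data;
  v16qi zero = { 0 };

  /* memcpy avoids alignment and aliasing assumptions about the input.  */
  memcpy (&data, bytes, sizeof data);

  /* psadbw against zero yields two partial sums, one per 8-byte half.  */
  v2di sums = __builtin_ia32_psadbw128 (data, zero);

  return (unsigned int) (sums[0] + sums[1]);
}
```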
768
769 The following built-in functions are available when :option:`-msse3` is used.
770 All of them generate the machine instruction that is part of the name.
771
772 .. code-block:: c++
773
774 v2df __builtin_ia32_addsubpd (v2df, v2df);
775 v4sf __builtin_ia32_addsubps (v4sf, v4sf);
776 v2df __builtin_ia32_haddpd (v2df, v2df);
777 v4sf __builtin_ia32_haddps (v4sf, v4sf);
778 v2df __builtin_ia32_hsubpd (v2df, v2df);
779 v4sf __builtin_ia32_hsubps (v4sf, v4sf);
780 v16qi __builtin_ia32_lddqu (char const *);
781 void __builtin_ia32_monitor (void *, unsigned int, unsigned int);
782 v4sf __builtin_ia32_movshdup (v4sf);
783 v4sf __builtin_ia32_movsldup (v4sf);
784 void __builtin_ia32_mwait (unsigned int, unsigned int);
785
786 The following built-in functions are available when :option:`-mssse3` is used.
787 All of them generate the machine instruction that is part of the name.
788
789 .. code-block:: c++
790
791 v2si __builtin_ia32_phaddd (v2si, v2si);
792 v4hi __builtin_ia32_phaddw (v4hi, v4hi);
793 v4hi __builtin_ia32_phaddsw (v4hi, v4hi);
794 v2si __builtin_ia32_phsubd (v2si, v2si);
795 v4hi __builtin_ia32_phsubw (v4hi, v4hi);
796 v4hi __builtin_ia32_phsubsw (v4hi, v4hi);
797 v4hi __builtin_ia32_pmaddubsw (v8qi, v8qi);
798 v4hi __builtin_ia32_pmulhrsw (v4hi, v4hi);
799 v8qi __builtin_ia32_pshufb (v8qi, v8qi);
800 v8qi __builtin_ia32_psignb (v8qi, v8qi);
801 v2si __builtin_ia32_psignd (v2si, v2si);
802 v4hi __builtin_ia32_psignw (v4hi, v4hi);
803 v1di __builtin_ia32_palignr (v1di, v1di, int);
804 v8qi __builtin_ia32_pabsb (v8qi);
805 v2si __builtin_ia32_pabsd (v2si);
806 v4hi __builtin_ia32_pabsw (v4hi);
807
The following built-in functions, which operate on 128-bit vectors, are also
available when :option:`-mssse3` is used.
All of them generate the machine instruction that is part of the name.
810
811 .. code-block:: c++
812
813 v4si __builtin_ia32_phaddd128 (v4si, v4si);
814 v8hi __builtin_ia32_phaddw128 (v8hi, v8hi);
815 v8hi __builtin_ia32_phaddsw128 (v8hi, v8hi);
816 v4si __builtin_ia32_phsubd128 (v4si, v4si);
817 v8hi __builtin_ia32_phsubw128 (v8hi, v8hi);
818 v8hi __builtin_ia32_phsubsw128 (v8hi, v8hi);
819 v8hi __builtin_ia32_pmaddubsw128 (v16qi, v16qi);
820 v8hi __builtin_ia32_pmulhrsw128 (v8hi, v8hi);
821 v16qi __builtin_ia32_pshufb128 (v16qi, v16qi);
822 v16qi __builtin_ia32_psignb128 (v16qi, v16qi);
823 v4si __builtin_ia32_psignd128 (v4si, v4si);
824 v8hi __builtin_ia32_psignw128 (v8hi, v8hi);
825 v2di __builtin_ia32_palignr128 (v2di, v2di, int);
826 v16qi __builtin_ia32_pabsb128 (v16qi);
827 v4si __builtin_ia32_pabsd128 (v4si);
828 v8hi __builtin_ia32_pabsw128 (v8hi);
829
830 The following built-in functions are available when :option:`-msse4.1` is
831 used. All of them generate the machine instruction that is part of the
832 name.
833
834 .. code-block:: c++
835
836 v2df __builtin_ia32_blendpd (v2df, v2df, const int);
837 v4sf __builtin_ia32_blendps (v4sf, v4sf, const int);
838 v2df __builtin_ia32_blendvpd (v2df, v2df, v2df);
839 v4sf __builtin_ia32_blendvps (v4sf, v4sf, v4sf);
840 v2df __builtin_ia32_dppd (v2df, v2df, const int);
841 v4sf __builtin_ia32_dpps (v4sf, v4sf, const int);
842 v4sf __builtin_ia32_insertps128 (v4sf, v4sf, const int);
843 v2di __builtin_ia32_movntdqa (v2di *);
844 v16qi __builtin_ia32_mpsadbw128 (v16qi, v16qi, const int);
845 v8hi __builtin_ia32_packusdw128 (v4si, v4si);
846 v16qi __builtin_ia32_pblendvb128 (v16qi, v16qi, v16qi);
847 v8hi __builtin_ia32_pblendw128 (v8hi, v8hi, const int);
848 v2di __builtin_ia32_pcmpeqq (v2di, v2di);
849 v8hi __builtin_ia32_phminposuw128 (v8hi);
850 v16qi __builtin_ia32_pmaxsb128 (v16qi, v16qi);
851 v4si __builtin_ia32_pmaxsd128 (v4si, v4si);
852 v4si __builtin_ia32_pmaxud128 (v4si, v4si);
853 v8hi __builtin_ia32_pmaxuw128 (v8hi, v8hi);
854 v16qi __builtin_ia32_pminsb128 (v16qi, v16qi);
855 v4si __builtin_ia32_pminsd128 (v4si, v4si);
856 v4si __builtin_ia32_pminud128 (v4si, v4si);
857 v8hi __builtin_ia32_pminuw128 (v8hi, v8hi);
858 v4si __builtin_ia32_pmovsxbd128 (v16qi);
859 v2di __builtin_ia32_pmovsxbq128 (v16qi);
860 v8hi __builtin_ia32_pmovsxbw128 (v16qi);
861 v2di __builtin_ia32_pmovsxdq128 (v4si);
862 v4si __builtin_ia32_pmovsxwd128 (v8hi);
863 v2di __builtin_ia32_pmovsxwq128 (v8hi);
864 v4si __builtin_ia32_pmovzxbd128 (v16qi);
865 v2di __builtin_ia32_pmovzxbq128 (v16qi);
866 v8hi __builtin_ia32_pmovzxbw128 (v16qi);
867 v2di __builtin_ia32_pmovzxdq128 (v4si);
868 v4si __builtin_ia32_pmovzxwd128 (v8hi);
869 v2di __builtin_ia32_pmovzxwq128 (v8hi);
870 v2di __builtin_ia32_pmuldq128 (v4si, v4si);
871 v4si __builtin_ia32_pmulld128 (v4si, v4si);
872 int __builtin_ia32_ptestc128 (v2di, v2di);
873 int __builtin_ia32_ptestnzc128 (v2di, v2di);
874 int __builtin_ia32_ptestz128 (v2di, v2di);
875 v2df __builtin_ia32_roundpd (v2df, const int);
876 v4sf __builtin_ia32_roundps (v4sf, const int);
877 v2df __builtin_ia32_roundsd (v2df, v2df, const int);
878 v4sf __builtin_ia32_roundss (v4sf, v4sf, const int);
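
As one possible use, ``__builtin_ia32_ptestz128`` can test whether a 16-byte
block is entirely zero. The sketch assumes the ``target`` function attribute
to enable SSE4.1 for one helper and falls back to plain comparisons on older
CPUs; ``is_all_zero16`` and ``all_zero16_sse41`` are hypothetical names.

```c
#include <string.h>

typedef long long v2di __attribute__ ((vector_size (16)));

/* Built with SSE4.1 enabled for this one function only.  */
static __attribute__ ((target ("sse4.1"))) int
all_zero16_sse41 (v2di v)
{
  /* ptest sets ZF when (v AND v) is zero; this built-in returns ZF.  */
  return __builtin_ia32_ptestz128 (v, v);
}

/* Sketch only: is_all_zero16 and all_zero16_sse41 are hypothetical.  */
int
is_all_zero16 (const void *block)
{
  v2di v;
  memcpy (&v, block, sizeof v);

  if (__builtin_cpu_supports ("sse4.1"))
    return all_zero16_sse41 (v);
  return v[0] == 0 && v[1] == 0;
}
```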
879
880 The following built-in functions are available when :option:`-msse4.1` is
881 used.
882
883 .. function:: v4sf __builtin_ia32_vec_set_v4sf (v4sf, float, const int)
884
885 Generates the ``insertps`` machine instruction.
886
887 .. function:: int __builtin_ia32_vec_ext_v16qi (v16qi, const int)
888
889 Generates the ``pextrb`` machine instruction.
890
891 .. function:: v16qi __builtin_ia32_vec_set_v16qi (v16qi, int, const int)
892
893 Generates the ``pinsrb`` machine instruction.
894
895 .. function:: v4si __builtin_ia32_vec_set_v4si (v4si, int, const int)
896
897 Generates the ``pinsrd`` machine instruction.
898
899 .. function:: v2di __builtin_ia32_vec_set_v2di (v2di, long long, const int)
900
Generates the ``pinsrq`` machine instruction in 64-bit mode.
902
903 The following built-in functions are changed to generate new SSE4.1
904 instructions when :option:`-msse4.1` is used.
905
906 .. function:: float __builtin_ia32_vec_ext_v4sf (v4sf, const int)
907
908 Generates the ``extractps`` machine instruction.
909
910 .. function:: int __builtin_ia32_vec_ext_v4si (v4si, const int)
911
912 Generates the ``pextrd`` machine instruction.
913
914 .. function:: long long __builtin_ia32_vec_ext_v2di (v2di, const int)
915
Generates the ``pextrq`` machine instruction in 64-bit mode.

The following built-in functions are available when :option:`-msse4.2` is
used. All of them generate the machine instruction that is part of the
name.

.. code-block:: c++

  v16qi __builtin_ia32_pcmpestrm128 (v16qi, int, v16qi, int, const int);
  int __builtin_ia32_pcmpestri128 (v16qi, int, v16qi, int, const int);
  int __builtin_ia32_pcmpestria128 (v16qi, int, v16qi, int, const int);
  int __builtin_ia32_pcmpestric128 (v16qi, int, v16qi, int, const int);
  int __builtin_ia32_pcmpestrio128 (v16qi, int, v16qi, int, const int);
  int __builtin_ia32_pcmpestris128 (v16qi, int, v16qi, int, const int);
  int __builtin_ia32_pcmpestriz128 (v16qi, int, v16qi, int, const int);
  v16qi __builtin_ia32_pcmpistrm128 (v16qi, v16qi, const int);
  int __builtin_ia32_pcmpistri128 (v16qi, v16qi, const int);
  int __builtin_ia32_pcmpistria128 (v16qi, v16qi, const int);
  int __builtin_ia32_pcmpistric128 (v16qi, v16qi, const int);
  int __builtin_ia32_pcmpistrio128 (v16qi, v16qi, const int);
  int __builtin_ia32_pcmpistris128 (v16qi, v16qi, const int);
  int __builtin_ia32_pcmpistriz128 (v16qi, v16qi, const int);
  v2di __builtin_ia32_pcmpgtq (v2di, v2di);

The following built-in functions are available when :option:`-msse4.2` is
used.

.. function:: unsigned int __builtin_ia32_crc32qi (unsigned int, unsigned char)

  Generates the ``crc32b`` machine instruction.

.. function:: unsigned int __builtin_ia32_crc32hi (unsigned int, unsigned short)

  Generates the ``crc32w`` machine instruction.

.. function:: unsigned int __builtin_ia32_crc32si (unsigned int, unsigned int)

  Generates the ``crc32l`` machine instruction.

.. function:: unsigned long long __builtin_ia32_crc32di (unsigned long long, unsigned long long)

  Generates the ``crc32q`` machine instruction.
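
The ``crc32`` instructions accumulate a CRC over the Castagnoli polynomial
(CRC-32C).  The sketch below feeds a buffer one byte at a time through
``__builtin_ia32_crc32qi``.  The function names, the all-ones initial value,
and the final bit inversion are assumptions that follow the common CRC-32C
convention; they are not mandated by the built-in.  A bitwise software loop
serves as a fallback on other targets or CPUs.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  __attribute__ ((target ("sse4.2")))
  static uint32_t
  crc32c_hw (uint32_t crc, const unsigned char *buf, size_t len)
  {
    for (size_t i = 0; i < len; i++)
      crc = __builtin_ia32_crc32qi (crc, buf[i]);   /* crc32b */
    return crc;
  }
  #endif

  /* Bitwise reference using the reflected CRC-32C polynomial.  */
  static uint32_t
  crc32c_sw (uint32_t crc, const unsigned char *buf, size_t len)
  {
    for (size_t i = 0; i < len; i++)
      {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
          crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
      }
    return crc;
  }

  /* Hypothetical helper: CRC-32C of LEN bytes at DATA.  */
  uint32_t
  crc32c (const void *data, size_t len)
  {
    const unsigned char *buf = data;
    uint32_t crc = 0xFFFFFFFFu;
  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
    if (__builtin_cpu_supports ("sse4.2"))
      return ~crc32c_hw (crc, buf, len);
  #endif
    return ~crc32c_sw (crc, buf, len);
  }

A production version would normally consume 4 or 8 bytes per step with
``__builtin_ia32_crc32si`` or ``__builtin_ia32_crc32di``; the byte loop is
kept here for brevity.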

The following built-in functions are changed to generate new SSE4.2
instructions when :option:`-msse4.2` is used.

.. function:: int __builtin_popcount (unsigned int)

  Generates the ``popcntl`` machine instruction.

.. function:: int __builtin_popcountl (unsigned long)

  Generates the ``popcntl`` or ``popcntq`` machine instruction,
  depending on the size of ``unsigned long``.

.. function:: int __builtin_popcountll (unsigned long long)

  Generates the ``popcntq`` machine instruction.
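
Because these are generic built-ins, source code that uses them does not
depend on the option; only the generated instructions change.  A minimal
sketch, with the hypothetical helper name ``count_set_bits``:

.. code-block:: c

  #include <stddef.h>

  /* Hypothetical helper: total number of 1 bits in WORDS[0..N-1].  */
  size_t
  count_set_bits (const unsigned long long *words, size_t n)
  {
    size_t bits = 0;
    for (size_t i = 0; i < n; i++)
      bits += (size_t) __builtin_popcountll (words[i]);
    return bits;
  }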

The following built-in functions are available when :option:`-mavx` is
used. All of them generate the machine instruction that is part of the
name.

.. code-block:: c++

  v4df __builtin_ia32_addpd256 (v4df,v4df);
  v8sf __builtin_ia32_addps256 (v8sf,v8sf);
  v4df __builtin_ia32_addsubpd256 (v4df,v4df);
  v8sf __builtin_ia32_addsubps256 (v8sf,v8sf);
  v4df __builtin_ia32_andnpd256 (v4df,v4df);
  v8sf __builtin_ia32_andnps256 (v8sf,v8sf);
  v4df __builtin_ia32_andpd256 (v4df,v4df);
  v8sf __builtin_ia32_andps256 (v8sf,v8sf);
  v4df __builtin_ia32_blendpd256 (v4df,v4df,int);
  v8sf __builtin_ia32_blendps256 (v8sf,v8sf,int);
  v4df __builtin_ia32_blendvpd256 (v4df,v4df,v4df);
  v8sf __builtin_ia32_blendvps256 (v8sf,v8sf,v8sf);
  v2df __builtin_ia32_cmppd (v2df,v2df,int);
  v4df __builtin_ia32_cmppd256 (v4df,v4df,int);
  v4sf __builtin_ia32_cmpps (v4sf,v4sf,int);
  v8sf __builtin_ia32_cmpps256 (v8sf,v8sf,int);
  v2df __builtin_ia32_cmpsd (v2df,v2df,int);
  v4sf __builtin_ia32_cmpss (v4sf,v4sf,int);
  v4df __builtin_ia32_cvtdq2pd256 (v4si);
  v8sf __builtin_ia32_cvtdq2ps256 (v8si);
  v4si __builtin_ia32_cvtpd2dq256 (v4df);
  v4sf __builtin_ia32_cvtpd2ps256 (v4df);
  v8si __builtin_ia32_cvtps2dq256 (v8sf);
  v4df __builtin_ia32_cvtps2pd256 (v4sf);
  v4si __builtin_ia32_cvttpd2dq256 (v4df);
  v8si __builtin_ia32_cvttps2dq256 (v8sf);
  v4df __builtin_ia32_divpd256 (v4df,v4df);
  v8sf __builtin_ia32_divps256 (v8sf,v8sf);
  v8sf __builtin_ia32_dpps256 (v8sf,v8sf,int);
  v4df __builtin_ia32_haddpd256 (v4df,v4df);
  v8sf __builtin_ia32_haddps256 (v8sf,v8sf);
  v4df __builtin_ia32_hsubpd256 (v4df,v4df);
  v8sf __builtin_ia32_hsubps256 (v8sf,v8sf);
  v32qi __builtin_ia32_lddqu256 (pcchar);
  v32qi __builtin_ia32_loaddqu256 (pcchar);
  v4df __builtin_ia32_loadupd256 (pcdouble);
  v8sf __builtin_ia32_loadups256 (pcfloat);
  v2df __builtin_ia32_maskloadpd (pcv2df,v2df);
  v4df __builtin_ia32_maskloadpd256 (pcv4df,v4df);
  v4sf __builtin_ia32_maskloadps (pcv4sf,v4sf);
  v8sf __builtin_ia32_maskloadps256 (pcv8sf,v8sf);
  void __builtin_ia32_maskstorepd (pv2df,v2df,v2df);
  void __builtin_ia32_maskstorepd256 (pv4df,v4df,v4df);
  void __builtin_ia32_maskstoreps (pv4sf,v4sf,v4sf);
  void __builtin_ia32_maskstoreps256 (pv8sf,v8sf,v8sf);
  v4df __builtin_ia32_maxpd256 (v4df,v4df);
  v8sf __builtin_ia32_maxps256 (v8sf,v8sf);
  v4df __builtin_ia32_minpd256 (v4df,v4df);
  v8sf __builtin_ia32_minps256 (v8sf,v8sf);
  v4df __builtin_ia32_movddup256 (v4df);
  int __builtin_ia32_movmskpd256 (v4df);
  int __builtin_ia32_movmskps256 (v8sf);
  v8sf __builtin_ia32_movshdup256 (v8sf);
  v8sf __builtin_ia32_movsldup256 (v8sf);
  v4df __builtin_ia32_mulpd256 (v4df,v4df);
  v8sf __builtin_ia32_mulps256 (v8sf,v8sf);
  v4df __builtin_ia32_orpd256 (v4df,v4df);
  v8sf __builtin_ia32_orps256 (v8sf,v8sf);
  v2df __builtin_ia32_pd_pd256 (v4df);
  v4df __builtin_ia32_pd256_pd (v2df);
  v4sf __builtin_ia32_ps_ps256 (v8sf);
  v8sf __builtin_ia32_ps256_ps (v4sf);
  int __builtin_ia32_ptestc256 (v4di,v4di);
  int __builtin_ia32_ptestnzc256 (v4di,v4di);
  int __builtin_ia32_ptestz256 (v4di,v4di);
  v8sf __builtin_ia32_rcpps256 (v8sf);
  v4df __builtin_ia32_roundpd256 (v4df,int);
  v8sf __builtin_ia32_roundps256 (v8sf,int);
  v8sf __builtin_ia32_rsqrtps_nr256 (v8sf);
  v8sf __builtin_ia32_rsqrtps256 (v8sf);
  v4df __builtin_ia32_shufpd256 (v4df,v4df,int);
  v8sf __builtin_ia32_shufps256 (v8sf,v8sf,int);
  v4si __builtin_ia32_si_si256 (v8si);
  v8si __builtin_ia32_si256_si (v4si);
  v4df __builtin_ia32_sqrtpd256 (v4df);
  v8sf __builtin_ia32_sqrtps_nr256 (v8sf);
  v8sf __builtin_ia32_sqrtps256 (v8sf);
  void __builtin_ia32_storedqu256 (pchar,v32qi);
  void __builtin_ia32_storeupd256 (pdouble,v4df);
  void __builtin_ia32_storeups256 (pfloat,v8sf);
  v4df __builtin_ia32_subpd256 (v4df,v4df);
  v8sf __builtin_ia32_subps256 (v8sf,v8sf);
  v4df __builtin_ia32_unpckhpd256 (v4df,v4df);
  v8sf __builtin_ia32_unpckhps256 (v8sf,v8sf);
  v4df __builtin_ia32_unpcklpd256 (v4df,v4df);
  v8sf __builtin_ia32_unpcklps256 (v8sf,v8sf);
  v4df __builtin_ia32_vbroadcastf128_pd256 (pcv2df);
  v8sf __builtin_ia32_vbroadcastf128_ps256 (pcv4sf);
  v4df __builtin_ia32_vbroadcastsd256 (pcdouble);
  v4sf __builtin_ia32_vbroadcastss (pcfloat);
  v8sf __builtin_ia32_vbroadcastss256 (pcfloat);
  v2df __builtin_ia32_vextractf128_pd256 (v4df,int);
  v4sf __builtin_ia32_vextractf128_ps256 (v8sf,int);
  v4si __builtin_ia32_vextractf128_si256 (v8si,int);
  v4df __builtin_ia32_vinsertf128_pd256 (v4df,v2df,int);
  v8sf __builtin_ia32_vinsertf128_ps256 (v8sf,v4sf,int);
  v8si __builtin_ia32_vinsertf128_si256 (v8si,v4si,int);
  v4df __builtin_ia32_vperm2f128_pd256 (v4df,v4df,int);
  v8sf __builtin_ia32_vperm2f128_ps256 (v8sf,v8sf,int);
  v8si __builtin_ia32_vperm2f128_si256 (v8si,v8si,int);
  v2df __builtin_ia32_vpermil2pd (v2df,v2df,v2di,int);
  v4df __builtin_ia32_vpermil2pd256 (v4df,v4df,v4di,int);
  v4sf __builtin_ia32_vpermil2ps (v4sf,v4sf,v4si,int);
  v8sf __builtin_ia32_vpermil2ps256 (v8sf,v8sf,v8si,int);
  v2df __builtin_ia32_vpermilpd (v2df,int);
  v4df __builtin_ia32_vpermilpd256 (v4df,int);
  v4sf __builtin_ia32_vpermilps (v4sf,int);
  v8sf __builtin_ia32_vpermilps256 (v8sf,int);
  v2df __builtin_ia32_vpermilvarpd (v2df,v2di);
  v4df __builtin_ia32_vpermilvarpd256 (v4df,v4di);
  v4sf __builtin_ia32_vpermilvarps (v4sf,v4si);
  v8sf __builtin_ia32_vpermilvarps256 (v8sf,v8si);
  int __builtin_ia32_vtestcpd (v2df,v2df);
  int __builtin_ia32_vtestcpd256 (v4df,v4df);
  int __builtin_ia32_vtestcps (v4sf,v4sf);
  int __builtin_ia32_vtestcps256 (v8sf,v8sf);
  int __builtin_ia32_vtestnzcpd (v2df,v2df);
  int __builtin_ia32_vtestnzcpd256 (v4df,v4df);
  int __builtin_ia32_vtestnzcps (v4sf,v4sf);
  int __builtin_ia32_vtestnzcps256 (v8sf,v8sf);
  int __builtin_ia32_vtestzpd (v2df,v2df);
  int __builtin_ia32_vtestzpd256 (v4df,v4df);
  int __builtin_ia32_vtestzps (v4sf,v4sf);
  int __builtin_ia32_vtestzps256 (v8sf,v8sf);
  void __builtin_ia32_vzeroall (void);
  void __builtin_ia32_vzeroupper (void);
  v4df __builtin_ia32_xorpd256 (v4df,v4df);
  v8sf __builtin_ia32_xorps256 (v8sf,v8sf);
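
The 256-bit built-ins operate on vector types declared with the
``vector_size`` attribute.  The sketch below computes an element-wise maximum
of two ``double`` arrays with ``__builtin_ia32_maxpd256``.  The names
``array_max`` and ``max4_avx`` are hypothetical, the vectors are loaded and
stored through ``memcpy`` so that no alignment is assumed, and a scalar loop
handles the tail as well as targets without AVX.

.. code-block:: c

  #include <stddef.h>
  #include <string.h>

  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  typedef double v4df __attribute__ ((vector_size (32)));

  /* Compiled with AVX enabled for this function only.  */
  __attribute__ ((target ("avx")))
  static void
  max4_avx (const double *a, const double *b, double *out)
  {
    v4df va, vb, vr;
    memcpy (&va, a, sizeof va);
    memcpy (&vb, b, sizeof vb);
    vr = __builtin_ia32_maxpd256 (va, vb);   /* vmaxpd */
    memcpy (out, &vr, sizeof vr);
  }
  #endif

  /* Hypothetical helper: OUT[i] = max (A[i], B[i]) for i < N.  */
  void
  array_max (const double *a, const double *b, double *out, size_t n)
  {
    size_t i = 0;
  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
    if (__builtin_cpu_supports ("avx"))
      for (; i + 4 <= n; i += 4)
        max4_avx (a + i, b + i, out + i);
  #endif
    /* Scalar tail, and the whole array when AVX is not available.  */
    for (; i < n; i++)
      out[i] = a[i] > b[i] ? a[i] : b[i];
  }

The vector values never cross a function boundary here, which avoids
passing 256-bit arguments between functions compiled for different
instruction sets.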

The following built-in functions are available when :option:`-mavx2` is
used. All of them generate the machine instruction that is part of the
name.

.. code-block:: c++

  v32qi __builtin_ia32_mpsadbw256 (v32qi,v32qi,int);
  v32qi __builtin_ia32_pabsb256 (v32qi);
  v16hi __builtin_ia32_pabsw256 (v16hi);
  v8si __builtin_ia32_pabsd256 (v8si);
  v16hi __builtin_ia32_packssdw256 (v8si,v8si);
  v32qi __builtin_ia32_packsswb256 (v16hi,v16hi);
  v16hi __builtin_ia32_packusdw256 (v8si,v8si);
  v32qi __builtin_ia32_packuswb256 (v16hi,v16hi);
  v32qi __builtin_ia32_paddb256 (v32qi,v32qi);
  v16hi __builtin_ia32_paddw256 (v16hi,v16hi);
  v8si __builtin_ia32_paddd256 (v8si,v8si);
  v4di __builtin_ia32_paddq256 (v4di,v4di);
  v32qi __builtin_ia32_paddsb256 (v32qi,v32qi);
  v16hi __builtin_ia32_paddsw256 (v16hi,v16hi);
  v32qi __builtin_ia32_paddusb256 (v32qi,v32qi);
  v16hi __builtin_ia32_paddusw256 (v16hi,v16hi);
  v4di __builtin_ia32_palignr256 (v4di,v4di,int);
  v4di __builtin_ia32_andsi256 (v4di,v4di);
  v4di __builtin_ia32_andnotsi256 (v4di,v4di);
  v32qi __builtin_ia32_pavgb256 (v32qi,v32qi);
  v16hi __builtin_ia32_pavgw256 (v16hi,v16hi);
  v32qi __builtin_ia32_pblendvb256 (v32qi,v32qi,v32qi);
  v16hi __builtin_ia32_pblendw256 (v16hi,v16hi,int);
  v32qi __builtin_ia32_pcmpeqb256 (v32qi,v32qi);
  v16hi __builtin_ia32_pcmpeqw256 (v16hi,v16hi);
  v8si __builtin_ia32_pcmpeqd256 (v8si,v8si);
  v4di __builtin_ia32_pcmpeqq256 (v4di,v4di);
  v32qi __builtin_ia32_pcmpgtb256 (v32qi,v32qi);
  v16hi __builtin_ia32_pcmpgtw256 (v16hi,v16hi);
  v8si __builtin_ia32_pcmpgtd256 (v8si,v8si);
  v4di __builtin_ia32_pcmpgtq256 (v4di,v4di);
  v16hi __builtin_ia32_phaddw256 (v16hi,v16hi);
  v8si __builtin_ia32_phaddd256 (v8si,v8si);
  v16hi __builtin_ia32_phaddsw256 (v16hi,v16hi);
  v16hi __builtin_ia32_phsubw256 (v16hi,v16hi);
  v8si __builtin_ia32_phsubd256 (v8si,v8si);
  v16hi __builtin_ia32_phsubsw256 (v16hi,v16hi);
  v32qi __builtin_ia32_pmaddubsw256 (v32qi,v32qi);
  v16hi __builtin_ia32_pmaddwd256 (v16hi,v16hi);
  v32qi __builtin_ia32_pmaxsb256 (v32qi,v32qi);
  v16hi __builtin_ia32_pmaxsw256 (v16hi,v16hi);
  v8si __builtin_ia32_pmaxsd256 (v8si,v8si);
  v32qi __builtin_ia32_pmaxub256 (v32qi,v32qi);
  v16hi __builtin_ia32_pmaxuw256 (v16hi,v16hi);
  v8si __builtin_ia32_pmaxud256 (v8si,v8si);
  v32qi __builtin_ia32_pminsb256 (v32qi,v32qi);
  v16hi __builtin_ia32_pminsw256 (v16hi,v16hi);
  v8si __builtin_ia32_pminsd256 (v8si,v8si);
  v32qi __builtin_ia32_pminub256 (v32qi,v32qi);
  v16hi __builtin_ia32_pminuw256 (v16hi,v16hi);
  v8si __builtin_ia32_pminud256 (v8si,v8si);
  int __builtin_ia32_pmovmskb256 (v32qi);
  v16hi __builtin_ia32_pmovsxbw256 (v16qi);
  v8si __builtin_ia32_pmovsxbd256 (v16qi);
  v4di __builtin_ia32_pmovsxbq256 (v16qi);
  v8si __builtin_ia32_pmovsxwd256 (v8hi);
  v4di __builtin_ia32_pmovsxwq256 (v8hi);
  v4di __builtin_ia32_pmovsxdq256 (v4si);
  v16hi __builtin_ia32_pmovzxbw256 (v16qi);
  v8si __builtin_ia32_pmovzxbd256 (v16qi);
  v4di __builtin_ia32_pmovzxbq256 (v16qi);
  v8si __builtin_ia32_pmovzxwd256 (v8hi);
  v4di __builtin_ia32_pmovzxwq256 (v8hi);
  v4di __builtin_ia32_pmovzxdq256 (v4si);
  v4di __builtin_ia32_pmuldq256 (v8si,v8si);
  v16hi __builtin_ia32_pmulhrsw256 (v16hi, v16hi);
  v16hi __builtin_ia32_pmulhuw256 (v16hi,v16hi);
  v16hi __builtin_ia32_pmulhw256 (v16hi,v16hi);
  v16hi __builtin_ia32_pmullw256 (v16hi,v16hi);
  v8si __builtin_ia32_pmulld256 (v8si,v8si);
  v4di __builtin_ia32_pmuludq256 (v8si,v8si);
  v4di __builtin_ia32_por256 (v4di,v4di);
  v16hi __builtin_ia32_psadbw256 (v32qi,v32qi);
  v32qi __builtin_ia32_pshufb256 (v32qi,v32qi);
  v8si __builtin_ia32_pshufd256 (v8si,int);
  v16hi __builtin_ia32_pshufhw256 (v16hi,int);
  v16hi __builtin_ia32_pshuflw256 (v16hi,int);
  v32qi __builtin_ia32_psignb256 (v32qi,v32qi);
  v16hi __builtin_ia32_psignw256 (v16hi,v16hi);
  v8si __builtin_ia32_psignd256 (v8si,v8si);
  v4di __builtin_ia32_pslldqi256 (v4di,int);
  v16hi __builtin_ia32_psllwi256 (v16hi,int);
  v16hi __builtin_ia32_psllw256 (v16hi,v8hi);
  v8si __builtin_ia32_pslldi256 (v8si,int);
  v8si __builtin_ia32_pslld256 (v8si,v4si);
  v4di __builtin_ia32_psllqi256 (v4di,int);
  v4di __builtin_ia32_psllq256 (v4di,v2di);
  v16hi __builtin_ia32_psrawi256 (v16hi,int);
  v16hi __builtin_ia32_psraw256 (v16hi,v8hi);
  v8si __builtin_ia32_psradi256 (v8si,int);
  v8si __builtin_ia32_psrad256 (v8si,v4si);
  v4di __builtin_ia32_psrldqi256 (v4di, int);
  v16hi __builtin_ia32_psrlwi256 (v16hi,int);
  v16hi __builtin_ia32_psrlw256 (v16hi,v8hi);
  v8si __builtin_ia32_psrldi256 (v8si,int);
  v8si __builtin_ia32_psrld256 (v8si,v4si);
  v4di __builtin_ia32_psrlqi256 (v4di,int);
  v4di __builtin_ia32_psrlq256 (v4di,v2di);
  v32qi __builtin_ia32_psubb256 (v32qi,v32qi);
  v16hi __builtin_ia32_psubw256 (v16hi,v16hi);
  v8si __builtin_ia32_psubd256 (v8si,v8si);
  v4di __builtin_ia32_psubq256 (v4di,v4di);
  v32qi __builtin_ia32_psubsb256 (v32qi,v32qi);
  v16hi __builtin_ia32_psubsw256 (v16hi,v16hi);
  v32qi __builtin_ia32_psubusb256 (v32qi,v32qi);
  v16hi __builtin_ia32_psubusw256 (v16hi,v16hi);
  v32qi __builtin_ia32_punpckhbw256 (v32qi,v32qi);
  v16hi __builtin_ia32_punpckhwd256 (v16hi,v16hi);
  v8si __builtin_ia32_punpckhdq256 (v8si,v8si);
  v4di __builtin_ia32_punpckhqdq256 (v4di,v4di);
  v32qi __builtin_ia32_punpcklbw256 (v32qi,v32qi);
  v16hi __builtin_ia32_punpcklwd256 (v16hi,v16hi);
  v8si __builtin_ia32_punpckldq256 (v8si,v8si);
  v4di __builtin_ia32_punpcklqdq256 (v4di,v4di);
  v4di __builtin_ia32_pxor256 (v4di,v4di);
  v4di __builtin_ia32_movntdqa256 (pv4di);
  v4sf __builtin_ia32_vbroadcastss_ps (v4sf);
  v8sf __builtin_ia32_vbroadcastss_ps256 (v4sf);
  v4df __builtin_ia32_vbroadcastsd_pd256 (v2df);
  v4di __builtin_ia32_vbroadcastsi256 (v2di);
  v4si __builtin_ia32_pblendd128 (v4si,v4si,int);
  v8si __builtin_ia32_pblendd256 (v8si,v8si,int);
  v32qi __builtin_ia32_pbroadcastb256 (v16qi);
  v16hi __builtin_ia32_pbroadcastw256 (v8hi);
  v8si __builtin_ia32_pbroadcastd256 (v4si);
  v4di __builtin_ia32_pbroadcastq256 (v2di);
  v16qi __builtin_ia32_pbroadcastb128 (v16qi);
  v8hi __builtin_ia32_pbroadcastw128 (v8hi);
  v4si __builtin_ia32_pbroadcastd128 (v4si);
  v2di __builtin_ia32_pbroadcastq128 (v2di);
  v8si __builtin_ia32_permvarsi256 (v8si,v8si);
  v4df __builtin_ia32_permdf256 (v4df,int);
  v8sf __builtin_ia32_permvarsf256 (v8sf,v8sf);
  v4di __builtin_ia32_permdi256 (v4di,int);
  v4di __builtin_ia32_permti256 (v4di,v4di,int);
  v2di __builtin_ia32_extract128i256 (v4di,int);
  v4di __builtin_ia32_insert128i256 (v4di,v2di,int);
  v8si __builtin_ia32_maskloadd256 (pcv8si,v8si);
  v4di __builtin_ia32_maskloadq256 (pcv4di,v4di);
  v4si __builtin_ia32_maskloadd (pcv4si,v4si);
  v2di __builtin_ia32_maskloadq (pcv2di,v2di);
  void __builtin_ia32_maskstored256 (pv8si,v8si,v8si);
  void __builtin_ia32_maskstoreq256 (pv4di,v4di,v4di);
  void __builtin_ia32_maskstored (pv4si,v4si,v4si);
  void __builtin_ia32_maskstoreq (pv2di,v2di,v2di);
  v8si __builtin_ia32_psllv8si (v8si,v8si);
  v4si __builtin_ia32_psllv4si (v4si,v4si);
  v4di __builtin_ia32_psllv4di (v4di,v4di);
  v2di __builtin_ia32_psllv2di (v2di,v2di);
  v8si __builtin_ia32_psrav8si (v8si,v8si);
  v4si __builtin_ia32_psrav4si (v4si,v4si);
  v8si __builtin_ia32_psrlv8si (v8si,v8si);
  v4si __builtin_ia32_psrlv4si (v4si,v4si);
  v4di __builtin_ia32_psrlv4di (v4di,v4di);
  v2di __builtin_ia32_psrlv2di (v2di,v2di);
  v2df __builtin_ia32_gathersiv2df (v2df, pcdouble,v4si,v2df,int);
  v4df __builtin_ia32_gathersiv4df (v4df, pcdouble,v4si,v4df,int);
  v2df __builtin_ia32_gatherdiv2df (v2df, pcdouble,v2di,v2df,int);
  v4df __builtin_ia32_gatherdiv4df (v4df, pcdouble,v4di,v4df,int);
  v4sf __builtin_ia32_gathersiv4sf (v4sf, pcfloat,v4si,v4sf,int);
  v8sf __builtin_ia32_gathersiv8sf (v8sf, pcfloat,v8si,v8sf,int);
  v4sf __builtin_ia32_gatherdiv4sf (v4sf, pcfloat,v2di,v4sf,int);
  v4sf __builtin_ia32_gatherdiv4sf256 (v4sf, pcfloat,v4di,v4sf,int);
  v2di __builtin_ia32_gathersiv2di (v2di, pcint64,v4si,v2di,int);
  v4di __builtin_ia32_gathersiv4di (v4di, pcint64,v4si,v4di,int);
  v2di __builtin_ia32_gatherdiv2di (v2di, pcint64,v2di,v2di,int);
  v4di __builtin_ia32_gatherdiv4di (v4di, pcint64,v4di,v4di,int);
  v4si __builtin_ia32_gathersiv4si (v4si, pcint,v4si,v4si,int);
  v8si __builtin_ia32_gathersiv8si (v8si, pcint,v8si,v8si,int);
  v4si __builtin_ia32_gatherdiv4si (v4si, pcint,v2di,v4si,int);
  v4si __builtin_ia32_gatherdiv4si256 (v4si, pcint,v4di,v4si,int);
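
A common use of the AVX2 integer built-ins is to compare 32 bytes at once
and turn the result into a bit mask.  The sketch below locates the first
occurrence of a byte in a 32-byte block with ``__builtin_ia32_pmovmskb256``.
The function names are hypothetical, and the byte comparison is written with
the generic vector ``==`` operator rather than a built-in; this is a choice
made for this sketch.  A scalar loop is used when AVX2 is unavailable.

.. code-block:: c

  #include <string.h>

  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  typedef char v32qi __attribute__ ((vector_size (32)));

  __attribute__ ((target ("avx2")))
  static int
  find_byte32_avx2 (const unsigned char *p, unsigned char c)
  {
    v32qi v, needle;
    memcpy (&v, p, sizeof v);
    memset (&needle, c, sizeof needle);   /* C replicated into all lanes */
    /* Lanes that match become all ones, all others become zero.  */
    v32qi eq = (v32qi) (v == needle);
    /* Gather the top bit of each byte into a 32-bit mask (vpmovmskb).  */
    unsigned int mask = (unsigned int) __builtin_ia32_pmovmskb256 (eq);
    return mask ? __builtin_ctz (mask) : -1;
  }
  #endif

  /* Hypothetical helper: index of the first byte equal to C among the
     32 bytes at P, or -1 if there is none.  */
  int
  find_byte32 (const unsigned char *p, unsigned char c)
  {
  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
    if (__builtin_cpu_supports ("avx2"))
      return find_byte32_avx2 (p, c);
  #endif
    for (int i = 0; i < 32; i++)
      if (p[i] == c)
        return i;
    return -1;
  }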

The following built-in functions are available when :option:`-maes` is
used. All of them generate the machine instruction that is part of the
name.

.. code-block:: c++

  v2di __builtin_ia32_aesenc128 (v2di, v2di);
  v2di __builtin_ia32_aesenclast128 (v2di, v2di);
  v2di __builtin_ia32_aesdec128 (v2di, v2di);
  v2di __builtin_ia32_aesdeclast128 (v2di, v2di);
  v2di __builtin_ia32_aeskeygenassist128 (v2di, const int);
  v2di __builtin_ia32_aesimc128 (v2di);

The following built-in function is available when :option:`-mpclmul` is
used.

.. function:: v2di __builtin_ia32_pclmulqdq128 (v2di, v2di, const int)

  Generates the ``pclmulqdq`` machine instruction.
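
The instruction multiplies two 64-bit polynomials over GF(2), that is,
without carries.  The sketch below returns the low 64 bits of such a product.
The function names are hypothetical; the immediate ``0`` selects the low
quadword of each operand, and a shift-and-xor loop serves as the portable
fallback.

.. code-block:: c

  #include <stdint.h>

  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  typedef long long v2di __attribute__ ((vector_size (16)));

  __attribute__ ((target ("pclmul")))
  static uint64_t
  clmul_lo_hw (uint64_t a, uint64_t b)
  {
    v2di va = { (long long) a, 0 };
    v2di vb = { (long long) b, 0 };
    /* Immediate 0 multiplies the low halves of VA and VB.  */
    v2di r = __builtin_ia32_pclmulqdq128 (va, vb, 0);
    return (uint64_t) r[0];
  }
  #endif

  /* Portable carry-less multiply, keeping only the low 64 bits.  */
  static uint64_t
  clmul_lo_sw (uint64_t a, uint64_t b)
  {
    uint64_t r = 0;
    for (int i = 0; i < 64; i++)
      if ((b >> i) & 1)
        r ^= a << i;
    return r;
  }

  /* Hypothetical helper: low 64 bits of the carry-less product A * B.  */
  uint64_t
  clmul_lo (uint64_t a, uint64_t b)
  {
  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
    if (__builtin_cpu_supports ("pclmul"))
      return clmul_lo_hw (a, b);
  #endif
    return clmul_lo_sw (a, b);
  }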

The following built-in functions are available when :option:`-mfsgsbase` is
used. All of them generate the machine instruction that is part of the
name.

.. code-block:: c++

  unsigned int __builtin_ia32_rdfsbase32 (void);
  unsigned long long __builtin_ia32_rdfsbase64 (void);
  unsigned int __builtin_ia32_rdgsbase32 (void);
  unsigned long long __builtin_ia32_rdgsbase64 (void);
  void _writefsbase_u32 (unsigned int);
  void _writefsbase_u64 (unsigned long long);
  void _writegsbase_u32 (unsigned int);
  void _writegsbase_u64 (unsigned long long);

The following built-in functions are available when :option:`-mrdrnd` is
used. All of them generate the machine instruction that is part of the
name.

.. code-block:: c++

  unsigned int __builtin_ia32_rdrand16_step (unsigned short *);
  unsigned int __builtin_ia32_rdrand32_step (unsigned int *);
  unsigned int __builtin_ia32_rdrand64_step (unsigned long long *);

The following built-in functions are available when :option:`-mptwrite` is
used. All of them generate the machine instruction that is part of the
name.

.. code-block:: c++

  void __builtin_ia32_ptwrite32 (unsigned);
  void __builtin_ia32_ptwrite64 (unsigned long long);

The following built-in functions are available when :option:`-msse4a` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  void __builtin_ia32_movntsd (double *, v2df);
  void __builtin_ia32_movntss (float *, v4sf);
  v2di __builtin_ia32_extrq (v2di, v16qi);
  v2di __builtin_ia32_extrqi (v2di, const unsigned int, const unsigned int);
  v2di __builtin_ia32_insertq (v2di, v2di);
  v2di __builtin_ia32_insertqi (v2di, v2di, const unsigned int, const unsigned int);

The following built-in functions are available when :option:`-mxop` is used.

.. code-block:: c++

  v2df __builtin_ia32_vfrczpd (v2df);
  v4sf __builtin_ia32_vfrczps (v4sf);
  v2df __builtin_ia32_vfrczsd (v2df);
  v4sf __builtin_ia32_vfrczss (v4sf);
  v4df __builtin_ia32_vfrczpd256 (v4df);
  v8sf __builtin_ia32_vfrczps256 (v8sf);
  v2di __builtin_ia32_vpcmov (v2di, v2di, v2di);
  v2di __builtin_ia32_vpcmov_v2di (v2di, v2di, v2di);
  v4si __builtin_ia32_vpcmov_v4si (v4si, v4si, v4si);
  v8hi __builtin_ia32_vpcmov_v8hi (v8hi, v8hi, v8hi);
  v16qi __builtin_ia32_vpcmov_v16qi (v16qi, v16qi, v16qi);
  v2df __builtin_ia32_vpcmov_v2df (v2df, v2df, v2df);
  v4sf __builtin_ia32_vpcmov_v4sf (v4sf, v4sf, v4sf);
  v4di __builtin_ia32_vpcmov_v4di256 (v4di, v4di, v4di);
  v8si __builtin_ia32_vpcmov_v8si256 (v8si, v8si, v8si);
  v16hi __builtin_ia32_vpcmov_v16hi256 (v16hi, v16hi, v16hi);
  v32qi __builtin_ia32_vpcmov_v32qi256 (v32qi, v32qi, v32qi);
  v4df __builtin_ia32_vpcmov_v4df256 (v4df, v4df, v4df);
  v8sf __builtin_ia32_vpcmov_v8sf256 (v8sf, v8sf, v8sf);
  v16qi __builtin_ia32_vpcomeqb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomeqd (v4si, v4si);
  v2di __builtin_ia32_vpcomeqq (v2di, v2di);
  v16qi __builtin_ia32_vpcomequb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomequd (v4si, v4si);
  v2di __builtin_ia32_vpcomequq (v2di, v2di);
  v8hi __builtin_ia32_vpcomequw (v8hi, v8hi);
  v8hi __builtin_ia32_vpcomeqw (v8hi, v8hi);
  v16qi __builtin_ia32_vpcomfalseb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomfalsed (v4si, v4si);
  v2di __builtin_ia32_vpcomfalseq (v2di, v2di);
  v16qi __builtin_ia32_vpcomfalseub (v16qi, v16qi);
  v4si __builtin_ia32_vpcomfalseud (v4si, v4si);
  v2di __builtin_ia32_vpcomfalseuq (v2di, v2di);
  v8hi __builtin_ia32_vpcomfalseuw (v8hi, v8hi);
  v8hi __builtin_ia32_vpcomfalsew (v8hi, v8hi);
  v16qi __builtin_ia32_vpcomgeb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomged (v4si, v4si);
  v2di __builtin_ia32_vpcomgeq (v2di, v2di);
  v16qi __builtin_ia32_vpcomgeub (v16qi, v16qi);
  v4si __builtin_ia32_vpcomgeud (v4si, v4si);
  v2di __builtin_ia32_vpcomgeuq (v2di, v2di);
  v8hi __builtin_ia32_vpcomgeuw (v8hi, v8hi);
  v8hi __builtin_ia32_vpcomgew (v8hi, v8hi);
  v16qi __builtin_ia32_vpcomgtb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomgtd (v4si, v4si);
  v2di __builtin_ia32_vpcomgtq (v2di, v2di);
  v16qi __builtin_ia32_vpcomgtub (v16qi, v16qi);
  v4si __builtin_ia32_vpcomgtud (v4si, v4si);
  v2di __builtin_ia32_vpcomgtuq (v2di, v2di);
  v8hi __builtin_ia32_vpcomgtuw (v8hi, v8hi);
  v8hi __builtin_ia32_vpcomgtw (v8hi, v8hi);
  v16qi __builtin_ia32_vpcomleb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomled (v4si, v4si);
  v2di __builtin_ia32_vpcomleq (v2di, v2di);
  v16qi __builtin_ia32_vpcomleub (v16qi, v16qi);
  v4si __builtin_ia32_vpcomleud (v4si, v4si);
  v2di __builtin_ia32_vpcomleuq (v2di, v2di);
  v8hi __builtin_ia32_vpcomleuw (v8hi, v8hi);
  v8hi __builtin_ia32_vpcomlew (v8hi, v8hi);
  v16qi __builtin_ia32_vpcomltb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomltd (v4si, v4si);
  v2di __builtin_ia32_vpcomltq (v2di, v2di);
  v16qi __builtin_ia32_vpcomltub (v16qi, v16qi);
  v4si __builtin_ia32_vpcomltud (v4si, v4si);
  v2di __builtin_ia32_vpcomltuq (v2di, v2di);
  v8hi __builtin_ia32_vpcomltuw (v8hi, v8hi);
  v8hi __builtin_ia32_vpcomltw (v8hi, v8hi);
  v16qi __builtin_ia32_vpcomneb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomned (v4si, v4si);
  v2di __builtin_ia32_vpcomneq (v2di, v2di);
  v16qi __builtin_ia32_vpcomneub (v16qi, v16qi);
  v4si __builtin_ia32_vpcomneud (v4si, v4si);
  v2di __builtin_ia32_vpcomneuq (v2di, v2di);
  v8hi __builtin_ia32_vpcomneuw (v8hi, v8hi);
  v8hi __builtin_ia32_vpcomnew (v8hi, v8hi);
  v16qi __builtin_ia32_vpcomtrueb (v16qi, v16qi);
  v4si __builtin_ia32_vpcomtrued (v4si, v4si);
  v2di __builtin_ia32_vpcomtrueq (v2di, v2di);
  v16qi __builtin_ia32_vpcomtrueub (v16qi, v16qi);
  v4si __builtin_ia32_vpcomtrueud (v4si, v4si);
  v2di __builtin_ia32_vpcomtrueuq (v2di, v2di);
  v8hi __builtin_ia32_vpcomtrueuw (v8hi, v8hi);
  v8hi __builtin_ia32_vpcomtruew (v8hi, v8hi);
  v4si __builtin_ia32_vphaddbd (v16qi);
  v2di __builtin_ia32_vphaddbq (v16qi);
  v8hi __builtin_ia32_vphaddbw (v16qi);
  v2di __builtin_ia32_vphadddq (v4si);
  v4si __builtin_ia32_vphaddubd (v16qi);
  v2di __builtin_ia32_vphaddubq (v16qi);
  v8hi __builtin_ia32_vphaddubw (v16qi);
  v2di __builtin_ia32_vphaddudq (v4si);
  v4si __builtin_ia32_vphadduwd (v8hi);
  v2di __builtin_ia32_vphadduwq (v8hi);
  v4si __builtin_ia32_vphaddwd (v8hi);
  v2di __builtin_ia32_vphaddwq (v8hi);
  v8hi __builtin_ia32_vphsubbw (v16qi);
  v2di __builtin_ia32_vphsubdq (v4si);
  v4si __builtin_ia32_vphsubwd (v8hi);
  v4si __builtin_ia32_vpmacsdd (v4si, v4si, v4si);
  v2di __builtin_ia32_vpmacsdqh (v4si, v4si, v2di);
  v2di __builtin_ia32_vpmacsdql (v4si, v4si, v2di);
  v4si __builtin_ia32_vpmacssdd (v4si, v4si, v4si);
  v2di __builtin_ia32_vpmacssdqh (v4si, v4si, v2di);
  v2di __builtin_ia32_vpmacssdql (v4si, v4si, v2di);
  v4si __builtin_ia32_vpmacsswd (v8hi, v8hi, v4si);
  v8hi __builtin_ia32_vpmacssww (v8hi, v8hi, v8hi);
  v4si __builtin_ia32_vpmacswd (v8hi, v8hi, v4si);
  v8hi __builtin_ia32_vpmacsww (v8hi, v8hi, v8hi);
  v4si __builtin_ia32_vpmadcsswd (v8hi, v8hi, v4si);
  v4si __builtin_ia32_vpmadcswd (v8hi, v8hi, v4si);
  v16qi __builtin_ia32_vpperm (v16qi, v16qi, v16qi);
  v16qi __builtin_ia32_vprotb (v16qi, v16qi);
  v4si __builtin_ia32_vprotd (v4si, v4si);
  v2di __builtin_ia32_vprotq (v2di, v2di);
  v8hi __builtin_ia32_vprotw (v8hi, v8hi);
  v16qi __builtin_ia32_vpshab (v16qi, v16qi);
  v4si __builtin_ia32_vpshad (v4si, v4si);
  v2di __builtin_ia32_vpshaq (v2di, v2di);
  v8hi __builtin_ia32_vpshaw (v8hi, v8hi);
  v16qi __builtin_ia32_vpshlb (v16qi, v16qi);
  v4si __builtin_ia32_vpshld (v4si, v4si);
  v2di __builtin_ia32_vpshlq (v2di, v2di);
  v8hi __builtin_ia32_vpshlw (v8hi, v8hi);

The following built-in functions are available when :option:`-mfma4` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  v2df __builtin_ia32_vfmaddpd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfmaddps (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfmaddsd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfmaddss (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfmsubpd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfmsubps (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfmsubsd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfmsubss (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfnmaddpd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfnmaddps (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfnmaddsd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfnmaddss (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfnmsubpd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfnmsubps (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfnmsubsd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfnmsubss (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfmaddsubpd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfmaddsubps (v4sf, v4sf, v4sf);
  v2df __builtin_ia32_vfmsubaddpd (v2df, v2df, v2df);
  v4sf __builtin_ia32_vfmsubaddps (v4sf, v4sf, v4sf);
  v4df __builtin_ia32_vfmaddpd256 (v4df, v4df, v4df);
  v8sf __builtin_ia32_vfmaddps256 (v8sf, v8sf, v8sf);
  v4df __builtin_ia32_vfmsubpd256 (v4df, v4df, v4df);
  v8sf __builtin_ia32_vfmsubps256 (v8sf, v8sf, v8sf);
  v4df __builtin_ia32_vfnmaddpd256 (v4df, v4df, v4df);
  v8sf __builtin_ia32_vfnmaddps256 (v8sf, v8sf, v8sf);
  v4df __builtin_ia32_vfnmsubpd256 (v4df, v4df, v4df);
  v8sf __builtin_ia32_vfnmsubps256 (v8sf, v8sf, v8sf);
  v4df __builtin_ia32_vfmaddsubpd256 (v4df, v4df, v4df);
  v8sf __builtin_ia32_vfmaddsubps256 (v8sf, v8sf, v8sf);
  v4df __builtin_ia32_vfmsubaddpd256 (v4df, v4df, v4df);
  v8sf __builtin_ia32_vfmsubaddps256 (v8sf, v8sf, v8sf);

The following built-in functions are available when :option:`-mlwp` is used.

.. code-block:: c++

  void __builtin_ia32_llwpcb16 (void *);
  void __builtin_ia32_llwpcb32 (void *);
  void __builtin_ia32_llwpcb64 (void *);
  void * __builtin_ia32_llwpcb16 (void);
  void * __builtin_ia32_llwpcb32 (void);
  void * __builtin_ia32_llwpcb64 (void);
  void __builtin_ia32_lwpval16 (unsigned short, unsigned int, unsigned short);
  void __builtin_ia32_lwpval32 (unsigned int, unsigned int, unsigned int);
  void __builtin_ia32_lwpval64 (unsigned __int64, unsigned int, unsigned int);
  unsigned char __builtin_ia32_lwpins16 (unsigned short, unsigned int, unsigned short);
  unsigned char __builtin_ia32_lwpins32 (unsigned int, unsigned int, unsigned int);
  unsigned char __builtin_ia32_lwpins64 (unsigned __int64, unsigned int, unsigned int);

The following built-in functions are available when :option:`-mbmi` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  unsigned int __builtin_ia32_bextr_u32 (unsigned int, unsigned int);
  unsigned long long __builtin_ia32_bextr_u64 (unsigned long long, unsigned long long);
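
The second operand of ``bextr`` is a control word: bits 7:0 give the
starting bit position and bits 15:8 give the field length.  The sketch below
wraps ``__builtin_ia32_bextr_u32`` behind a hypothetical ``bextr32`` helper
that takes the two values separately and provides a portable fallback with
the same semantics.

.. code-block:: c

  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  __attribute__ ((target ("bmi")))
  static unsigned int
  bextr32_hw (unsigned int src, unsigned int start, unsigned int len)
  {
    /* Pack START into bits 7:0 and LEN into bits 15:8.  */
    return __builtin_ia32_bextr_u32 (src, (start & 0xff) | ((len & 0xff) << 8));
  }
  #endif

  static unsigned int
  bextr32_sw (unsigned int src, unsigned int start, unsigned int len)
  {
    start &= 0xff;
    len &= 0xff;
    if (start >= 32 || len == 0)
      return 0;
    unsigned int field = src >> start;
    /* Avoid an out-of-range shift when the field reaches bit 31.  */
    return len >= 32 ? field : field & ((1u << len) - 1u);
  }

  /* Hypothetical helper: extract LEN bits of SRC starting at bit START.  */
  unsigned int
  bextr32 (unsigned int src, unsigned int start, unsigned int len)
  {
  #if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
    if (__builtin_cpu_supports ("bmi"))
      return bextr32_hw (src, start, len);
  #endif
    return bextr32_sw (src, start, len);
  }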

The following built-in functions are available when :option:`-mbmi2` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  unsigned int _bzhi_u32 (unsigned int, unsigned int);
  unsigned int _pdep_u32 (unsigned int, unsigned int);
  unsigned int _pext_u32 (unsigned int, unsigned int);
  unsigned long long _bzhi_u64 (unsigned long long, unsigned long long);
  unsigned long long _pdep_u64 (unsigned long long, unsigned long long);
  unsigned long long _pext_u64 (unsigned long long, unsigned long long);

The following built-in functions are available when :option:`-mlzcnt` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  unsigned short __builtin_ia32_lzcnt_u16 (unsigned short);
  unsigned int __builtin_ia32_lzcnt_u32 (unsigned int);
  unsigned long long __builtin_ia32_lzcnt_u64 (unsigned long long);

The following built-in functions are available when :option:`-mfxsr` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  void __builtin_ia32_fxsave (void *);
  void __builtin_ia32_fxrstor (void *);
  void __builtin_ia32_fxsave64 (void *);
  void __builtin_ia32_fxrstor64 (void *);

The following built-in functions are available when :option:`-mxsave` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  void __builtin_ia32_xsave (void *, long long);
  void __builtin_ia32_xrstor (void *, long long);
  void __builtin_ia32_xsave64 (void *, long long);
  void __builtin_ia32_xrstor64 (void *, long long);

The following built-in functions are available when :option:`-mxsaveopt` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  void __builtin_ia32_xsaveopt (void *, long long);
  void __builtin_ia32_xsaveopt64 (void *, long long);
1594
The following built-in functions are available when :option:`-mtbm` is used.
Both of them generate the immediate form of the bextr machine instruction.

.. code-block:: c++

  unsigned int __builtin_ia32_bextri_u32 (unsigned int,
                                          const unsigned int);
  unsigned long long __builtin_ia32_bextri_u64 (unsigned long long,
                                                const unsigned long long);

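The second argument is the ``bextr`` control word, which holds the starting
bit position in bits 7:0 and the field length in bits 15:8. Because TBM
hardware is uncommon, the sketch below is a portable reference model of the
32-bit operation rather than a call to the built-in function itself; the names
``BEXTR_CONTROL`` and ``bextr32_model`` are hypothetical.

.. code-block:: c++

  /* Build a bextr control word: start bit in bits 7:0, length in bits 15:8.  */
  #define BEXTR_CONTROL(start, len) (((len) & 0xffu) << 8 | ((start) & 0xffu))

  /* Portable reference model of the 32-bit bit-field extraction.  */
  unsigned int
  bextr32_model (unsigned int src, unsigned int control)
  {
    unsigned int start = control & 0xff;
    unsigned int len = (control >> 8) & 0xff;

    /* Nothing is extracted if the field is empty or starts past bit 31.  */
    if (start >= 32 || len == 0)
      return 0;
    src >>= start;
    /* A field that runs past bit 31 keeps all remaining bits.  */
    return len >= 32 ? src : src & ((1u << len) - 1);
  }

  /* With -mtbm, the corresponding hardware call would be, for example:
       __builtin_ia32_bextri_u32 (x, BEXTR_CONTROL (8, 12));
     The control argument has to be a compile-time constant.  */

For instance, ``bextr32_model (0x12345678, BEXTR_CONTROL (8, 12))`` extracts
the 12 bits starting at bit 8 and yields ``0x456``.
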
The following built-in functions are available when :option:`-m3dnow` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  void __builtin_ia32_femms (void);
  v8qi __builtin_ia32_pavgusb (v8qi, v8qi);
  v2si __builtin_ia32_pf2id (v2sf);
  v2sf __builtin_ia32_pfacc (v2sf, v2sf);
  v2sf __builtin_ia32_pfadd (v2sf, v2sf);
  v2si __builtin_ia32_pfcmpeq (v2sf, v2sf);
  v2si __builtin_ia32_pfcmpge (v2sf, v2sf);
  v2si __builtin_ia32_pfcmpgt (v2sf, v2sf);
  v2sf __builtin_ia32_pfmax (v2sf, v2sf);
  v2sf __builtin_ia32_pfmin (v2sf, v2sf);
  v2sf __builtin_ia32_pfmul (v2sf, v2sf);
  v2sf __builtin_ia32_pfrcp (v2sf);
  v2sf __builtin_ia32_pfrcpit1 (v2sf, v2sf);
  v2sf __builtin_ia32_pfrcpit2 (v2sf, v2sf);
  v2sf __builtin_ia32_pfrsqrt (v2sf);
  v2sf __builtin_ia32_pfsub (v2sf, v2sf);
  v2sf __builtin_ia32_pfsubr (v2sf, v2sf);
  v2sf __builtin_ia32_pi2fd (v2si);
  v4hi __builtin_ia32_pmulhrw (v4hi, v4hi);

The following built-in functions are available when :option:`-m3dnowa` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  v2si __builtin_ia32_pf2iw (v2sf);
  v2sf __builtin_ia32_pfnacc (v2sf, v2sf);
  v2sf __builtin_ia32_pfpnacc (v2sf, v2sf);
  v2sf __builtin_ia32_pi2fw (v2si);
  v2sf __builtin_ia32_pswapdsf (v2sf);
  v2si __builtin_ia32_pswapdsi (v2si);

The following built-in functions are available when :option:`-mrtm` is used.
They are used for restricted transactional memory. These are the internal
low-level functions. Normally the functions in
:ref:`x86-transactional-memory-intrinsics` should be used instead.

.. code-block:: c++

  int __builtin_ia32_xbegin ();
  void __builtin_ia32_xend ();
  void __builtin_ia32_xabort (status);
  int __builtin_ia32_xtest ();

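The following sketch shows one way the low-level functions might be used: a
single transactional attempt to increment a counter, with an atomic add as the
fallback path. The function names (``cpu_has_rtm``,
``try_transactional_increment``, ``increment``), the CPUID check (leaf 7,
bit 11 of ``EBX``) and the single-attempt policy are assumptions made for this
example; production code would normally use the documented intrinsics and a
retry strategy.

.. code-block:: c++

  #include <cpuid.h>

  /* Return nonzero if CPUID reports RTM (leaf 7, subleaf 0, EBX bit 11).  */
  static int
  cpu_has_rtm (void)
  {
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_max (0, 0) < 7)
      return 0;
    __cpuid_count (7, 0, eax, ebx, ecx, edx);
    return (ebx >> 11) & 1;
  }

  /* Try once to increment *COUNTER inside a transaction.
     Return 1 if the transaction committed, 0 if it aborted.  */
  __attribute__ ((target ("rtm")))
  static int
  try_transactional_increment (long *counter)
  {
    /* xbegin returns ~0u when the transaction starts; after an abort,
       execution resumes here with the abort status instead.  */
    if ((unsigned int) __builtin_ia32_xbegin () == ~0u)
      {
        ++*counter;
        __builtin_ia32_xend ();
        return 1;
      }
    return 0;
  }

  /* Increment *COUNTER, falling back to an atomic add if RTM is missing
     or the transaction aborts.  */
  void
  increment (long *counter)
  {
    if (cpu_has_rtm () && try_transactional_increment (counter))
      return;
    __atomic_fetch_add (counter, 1, __ATOMIC_SEQ_CST);
  }

A fallback path is always needed, because a transaction may abort at any time
and is never guaranteed to commit.
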
The following built-in functions are available when :option:`-mmwaitx` is used.
All of them generate the machine instruction that is part of the name.

.. code-block:: c++

  void __builtin_ia32_monitorx (void *, unsigned int, unsigned int);
  void __builtin_ia32_mwaitx (unsigned int, unsigned int, unsigned int);

The following built-in function is available when :option:`-mclzero` is used.
It generates the machine instruction that is part of the name.

.. code-block:: c++

  void __builtin_ia32_clzero (void *);

The following built-in functions are available when :option:`-mpku` is used.
They generate reads and writes to PKRU.

.. code-block:: c++

  void __builtin_ia32_wrpkru (unsigned int);
  unsigned int __builtin_ia32_rdpkru ();

The following built-in functions are available when
:option:`-mshstk` is used. They support shadow stack
machine instructions from Intel Control-flow Enforcement Technology (CET).
Each built-in function generates the machine instruction that is part
of the function's name. These are the internal low-level functions.
Normally the functions in :ref:`x86-control-flow-protection-intrinsics`
should be used instead.

.. code-block:: c++

  unsigned int __builtin_ia32_rdsspd (void);
  unsigned long long __builtin_ia32_rdsspq (void);
  void __builtin_ia32_incsspd (unsigned int);
  void __builtin_ia32_incsspq (unsigned long long);
  void __builtin_ia32_saveprevssp (void);
  void __builtin_ia32_rstorssp (void *);
  void __builtin_ia32_wrssd (unsigned int, void *);
  void __builtin_ia32_wrssq (unsigned long long, void *);
  void __builtin_ia32_wrussd (unsigned int, void *);
  void __builtin_ia32_wrussq (unsigned long long, void *);
  void __builtin_ia32_setssbsy (void);
  void __builtin_ia32_clrssbsy (void *);