..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. index:: optimize options, options, optimization

.. _optimize-options:

Options That Control Optimization
*********************************

These options control various sorts of optimizations.

Without any optimization option, the compiler's goal is to reduce the
cost of compilation and to make debugging produce the expected
results.  Statements are independent: if you stop the program with a
breakpoint between statements, you can then assign a new value to any
variable or change the program counter to any other statement in the
function and get exactly the results you expect from the source
code.

Turning on optimization flags makes the compiler attempt to improve
the performance and/or code size at the expense of compilation time
and possibly the ability to debug the program.

The compiler performs optimization based on the knowledge it has of the
program.  Compiling multiple files at once to a single output file allows
the compiler to use information gained from all of the files when compiling
each of them.

Not all optimizations are controlled directly by a flag.  Only
optimizations that have a flag are listed in this section.

Most optimizations are completely disabled at :option:`-O0` or if an
:option:`-O` level is not set on the command line, even if individual
optimization flags are specified.  Similarly, :option:`-Og` suppresses
many optimization passes.

Depending on the target and how GCC was configured, a slightly different
set of optimizations may be enabled at each :option:`-O` level than
those listed here.  You can invoke GCC with :option:`-Q --help=optimizers`
to find out the exact set of optimizations that are enabled at each level.
See :ref:`overall-options` for examples.

46.. option:: -O, -O1
47
48 Optimize. Optimizing compilation takes somewhat more time, and a lot
49 more memory for a large function.
50
51 With :option:`-O`, the compiler tries to reduce code size and execution
52 time, without performing any optimizations that take a great deal of
53 compilation time.
54
55 .. Note that in addition to the default_options_table list in opts.cc,
56 several optimization flags default to true but control optimization
57 passes that are explicitly disabled at -O0.
58
59 :option:`-O` turns on the following optimization flags:
60
61 .. Please keep the following list alphabetized.
62
63 :option:`-fauto-inc-dec` |gol|
64 :option:`-fbranch-count-reg` |gol|
65 :option:`-fcombine-stack-adjustments` |gol|
66 :option:`-fcompare-elim` |gol|
67 :option:`-fcprop-registers` |gol|
68 :option:`-fdce` |gol|
69 :option:`-fdefer-pop` |gol|
70 :option:`-fdelayed-branch` |gol|
71 :option:`-fdse` |gol|
72 :option:`-fforward-propagate` |gol|
73 :option:`-fguess-branch-probability` |gol|
74 :option:`-fif-conversion` |gol|
75 :option:`-fif-conversion2` |gol|
76 :option:`-finline-functions-called-once` |gol|
77 :option:`-fipa-modref` |gol|
78 :option:`-fipa-profile` |gol|
79 :option:`-fipa-pure-const` |gol|
80 :option:`-fipa-reference` |gol|
81 :option:`-fipa-reference-addressable` |gol|
82 :option:`-fmerge-constants` |gol|
83 :option:`-fmove-loop-invariants` |gol|
84 :option:`-fmove-loop-stores` |gol|
85 :option:`-fomit-frame-pointer` |gol|
86 :option:`-freorder-blocks` |gol|
87 :option:`-fshrink-wrap` |gol|
88 :option:`-fshrink-wrap-separate` |gol|
89 :option:`-fsplit-wide-types` |gol|
90 :option:`-fssa-backprop` |gol|
91 :option:`-fssa-phiopt` |gol|
92 :option:`-ftree-bit-ccp` |gol|
93 :option:`-ftree-ccp` |gol|
94 :option:`-ftree-ch` |gol|
95 :option:`-ftree-coalesce-vars` |gol|
96 :option:`-ftree-copy-prop` |gol|
97 :option:`-ftree-dce` |gol|
98 :option:`-ftree-dominator-opts` |gol|
99 :option:`-ftree-dse` |gol|
100 :option:`-ftree-forwprop` |gol|
101 :option:`-ftree-fre` |gol|
102 :option:`-ftree-phiprop` |gol|
103 :option:`-ftree-pta` |gol|
104 :option:`-ftree-scev-cprop` |gol|
105 :option:`-ftree-sink` |gol|
106 :option:`-ftree-slsr` |gol|
107 :option:`-ftree-sra` |gol|
108 :option:`-ftree-ter` |gol|
109 :option:`-funit-at-a-time`
110
111.. option:: -O2
112
113 Optimize even more. GCC performs nearly all supported optimizations
114 that do not involve a space-speed tradeoff.
115 As compared to :option:`-O`, this option increases both compilation time
116 and the performance of the generated code.
117
118 :option:`-O2` turns on all optimization flags specified by :option:`-O1`. It
119 also turns on the following optimization flags:
120
121 .. Please keep the following list alphabetized!
122
123 :option:`-falign-functions` :option:`-falign-jumps` |gol|
124 :option:`-falign-labels` :option:`-falign-loops` |gol|
125 :option:`-fcaller-saves` |gol|
126 :option:`-fcode-hoisting` |gol|
127 :option:`-fcrossjumping` |gol|
128 :option:`-fcse-follow-jumps` :option:`-fcse-skip-blocks` |gol|
129 :option:`-fdelete-null-pointer-checks` |gol|
130 :option:`-fdevirtualize` :option:`-fdevirtualize-speculatively` |gol|
131 :option:`-fexpensive-optimizations` |gol|
132 :option:`-ffinite-loops` |gol|
133 :option:`-fgcse` :option:`-fgcse-lm` |gol|
134 :option:`-fhoist-adjacent-loads` |gol|
135 :option:`-finline-functions` |gol|
136 :option:`-finline-small-functions` |gol|
137 :option:`-findirect-inlining` |gol|
138 :option:`-fipa-bit-cp` :option:`-fipa-cp` :option:`-fipa-icf` |gol|
139 :option:`-fipa-ra` :option:`-fipa-sra` :option:`-fipa-vrp` |gol|
140 :option:`-fisolate-erroneous-paths-dereference` |gol|
141 :option:`-flra-remat` |gol|
142 :option:`-foptimize-sibling-calls` |gol|
143 :option:`-foptimize-strlen` |gol|
144 :option:`-fpartial-inlining` |gol|
145 :option:`-fpeephole2` |gol|
146 :option:`-freorder-blocks-algorithm=stc` |gol|
147 :option:`-freorder-blocks-and-partition` :option:`-freorder-functions` |gol|
148 :option:`-frerun-cse-after-loop` |gol|
149 :option:`-fschedule-insns` :option:`-fschedule-insns2` |gol|
150 :option:`-fsched-interblock` :option:`-fsched-spec` |gol|
151 :option:`-fstore-merging` |gol|
152 :option:`-fstrict-aliasing` |gol|
153 :option:`-fthread-jumps` |gol|
154 :option:`-ftree-builtin-call-dce` |gol|
155 :option:`-ftree-loop-vectorize` |gol|
156 :option:`-ftree-pre` |gol|
157 :option:`-ftree-slp-vectorize` |gol|
158 :option:`-ftree-switch-conversion` :option:`-ftree-tail-merge` |gol|
159 :option:`-ftree-vrp` |gol|
160 :option:`-fvect-cost-model=very-cheap`
161
162 Please note the warning under :option:`-fgcse` about
163 invoking :option:`-O2` on programs that use computed gotos.
164
165.. option:: -O3
166
167 Optimize yet more. :option:`-O3` turns on all optimizations specified
168 by :option:`-O2` and also turns on the following optimization flags:
169
170 .. Please keep the following list alphabetized!
171
172 :option:`-fgcse-after-reload` |gol|
173 :option:`-fipa-cp-clone` |gol|
174 :option:`-floop-interchange` |gol|
175 :option:`-floop-unroll-and-jam` |gol|
176 :option:`-fpeel-loops` |gol|
177 :option:`-fpredictive-commoning` |gol|
178 :option:`-fsplit-loops` |gol|
179 :option:`-fsplit-paths` |gol|
180 :option:`-ftree-loop-distribution` |gol|
181 :option:`-ftree-partial-pre` |gol|
182 :option:`-funswitch-loops` |gol|
183 :option:`-fvect-cost-model=dynamic` |gol|
184 :option:`-fversion-loops-for-strides`
185
186.. option:: -O0
187
188 Reduce compilation time and make debugging produce the expected
189 results. This is the default.
190
.. option:: -Os

  Optimize for size.  :option:`-Os` enables all :option:`-O2` optimizations
  except those that often increase code size:

  :option:`-falign-functions`  :option:`-falign-jumps` |gol|
  :option:`-falign-labels`  :option:`-falign-loops` |gol|
  :option:`-fprefetch-loop-arrays`  :option:`-freorder-blocks-algorithm=stc`

  It also enables :option:`-finline-functions`, causes the compiler to tune for
  code size rather than execution speed, and performs further optimizations
  designed to reduce code size.
202
.. option:: -Ofast

  Disregard strict standards compliance.  :option:`-Ofast` enables all
  :option:`-O3` optimizations.  It also enables optimizations that are not
  valid for all standard-compliant programs.
  It turns on :option:`-ffast-math`, :option:`-fallow-store-data-races`
  and the Fortran-specific options :option:`-fno-protect-parens` and
  :option:`-fstack-arrays` (the latter unless :option:`-fmax-stack-var-size`
  is specified).
  It turns off :option:`-fsemantic-interposition`.
212
213.. option:: -Og
214
215 Optimize debugging experience. :option:`-Og` should be the optimization
216 level of choice for the standard edit-compile-debug cycle, offering
217 a reasonable level of optimization while maintaining fast compilation
218 and a good debugging experience. It is a better choice than :option:`-O0`
219 for producing debuggable code because some compiler passes
220 that collect debug information are disabled at :option:`-O0`.
221
222 Like :option:`-O0`, :option:`-Og` completely disables a number of
223 optimization passes so that individual options controlling them have
224 no effect. Otherwise :option:`-Og` enables all :option:`-O1`
225 optimization flags except for those that may interfere with debugging:
226
227 :option:`-fbranch-count-reg` :option:`-fdelayed-branch` |gol|
228 :option:`-fdse` :option:`-fif-conversion` :option:`-fif-conversion2` |gol|
229 :option:`-finline-functions-called-once` |gol|
230 :option:`-fmove-loop-invariants` :option:`-fmove-loop-stores` :option:`-fssa-phiopt` |gol|
231 :option:`-ftree-bit-ccp` :option:`-ftree-dse` :option:`-ftree-pta` :option:`-ftree-sra`
232
233.. option:: -Oz
234
235 Optimize aggressively for size rather than speed. This may increase
236 the number of instructions executed if those instructions require
237 fewer bytes to encode. :option:`-Oz` behaves similarly to :option:`-Os`
238 including enabling most :option:`-O2` optimizations.
239
240If you use multiple :option:`-O` options, with or without level numbers,
241the last such option is the one that is effective.
242
243Options of the form :samp:`-fflag` specify machine-independent
244flags. Most flags have both positive and negative forms; the negative
245form of :samp:`-ffoo` is :samp:`-fno-foo`. In the table
246below, only one of the forms is listed---the one you typically
247use. You can figure out the other form by either removing :samp:`no-`
248or adding it.
249
The following options control specific optimizations.  They are either
activated by :option:`-O` options or are related to ones that are.  You
can use the following flags in the rare cases when you need to fine-tune
which optimizations are performed.
254
255.. option:: -fno-defer-pop
256
257 For machines that must pop arguments after a function call, always pop
258 the arguments as soon as each function returns.
259 At levels :option:`-O1` and higher, :option:`-fdefer-pop` is the default;
260 this allows the compiler to let arguments accumulate on the stack for several
261 function calls and pop them all at once.
262
263.. option:: -fdefer-pop
264
265 Default setting; overrides :option:`-fno-defer-pop`.
266
267.. option:: -fforward-propagate
268
269 Perform a forward propagation pass on RTL. The pass tries to combine two
270 instructions and checks if the result can be simplified. If loop unrolling
271 is active, two passes are performed and the second is scheduled after
272 loop unrolling.
273
274 This option is enabled by default at optimization levels :option:`-O1`,
275 :option:`-O2`, :option:`-O3`, :option:`-Os`.
276
277.. option:: -ffp-contract={style}
278
279 :option:`-ffp-contract=off` disables floating-point expression contraction.
280 :option:`-ffp-contract=fast` enables floating-point expression contraction
281 such as forming of fused multiply-add operations if the target has
282 native support for them.
283 :option:`-ffp-contract=on` enables floating-point expression contraction
284 if allowed by the language standard. This is currently not implemented
285 and treated equal to :option:`-ffp-contract=off`.
286
287 The default is :option:`-ffp-contract=fast`.
288
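As a minimal sketch of what contraction acts on (the function names ``mul_add`` and ``dot3`` are illustrative and not part of GCC), both functions below contain the multiply-then-add shape. Under :option:`-ffp-contract=fast`, and only on targets with a native fused multiply-add instruction, GCC may evaluate each such step with one rounding instead of two; under :option:`-ffp-contract=off` the product and the sum are rounded separately.

```c
/* a * b + c is the classic candidate for contraction into a fused
   multiply-add. */
double mul_add(double a, double b, double c)
{
    return a * b + c;
}

/* Each iteration of this dot product has the same multiply-then-add
   shape, so each step is a contraction candidate as well. */
double dot3(const double x[3], const double y[3])
{
    double sum = 0.0;
    for (int i = 0; i < 3; i++)
        sum += x[i] * y[i];
    return sum;
}
```

When every intermediate product is exactly representable, as with small integer values, the contracted and uncontracted forms give identical results; differences can appear only when the product would otherwise have been rounded before the addition.
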
289.. option:: -fomit-frame-pointer
290
291 Omit the frame pointer in functions that don't need one. This avoids the
292 instructions to save, set up and restore the frame pointer; on many targets
293 it also makes an extra register available.
294
295 On some targets this flag has no effect because the standard calling sequence
296 always uses a frame pointer, so it cannot be omitted.
297
298 Note that :option:`-fno-omit-frame-pointer` doesn't guarantee the frame pointer
299 is used in all functions. Several targets always omit the frame pointer in
300 leaf functions.
301
302 Enabled by default at :option:`-O1` and higher.
303
304.. option:: -foptimize-sibling-calls
305
306 Optimize sibling and tail recursive calls.
307
308 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
309
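The following sketch shows the two call shapes :option:`-foptimize-sibling-calls` is aimed at; the names ``sum_to``, ``twice`` and ``twice_sum`` are made up for illustration. Whether a given call is actually turned into a jump depends on the target and calling convention.

```c
/* Tail-recursive call: the recursive call is the last action of the
   function, so it can be compiled as a jump that reuses the current
   stack frame. */
unsigned long sum_to(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to(n - 1, acc + n);
}

static unsigned long twice(unsigned long x)
{
    return x * 2;
}

/* Sibling call: the call to twice() is in tail position, so the caller's
   frame is no longer needed once the call is made. */
unsigned long twice_sum(unsigned long n)
{
    return twice(sum_to(n, 0));
}
```

Without the optimization, ``sum_to`` uses stack space proportional to ``n``; with it, the recursion can run in constant stack space.
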
310.. option:: -foptimize-strlen
311
312 Optimize various standard C string functions (e.g. ``strlen``,
313 ``strchr`` or ``strcpy``) and
314 their ``_FORTIFY_SOURCE`` counterparts into faster alternatives.
315
316 Enabled at levels :option:`-O2`, :option:`-O3`.
317
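A small sketch of the kind of code this pass can improve (``copy_and_measure`` is a hypothetical helper; the caller is assumed to supply a large enough ``dst``). After the ``strcpy``, the length of ``dst`` is known to equal the length of ``src``, so the compiler may avoid scanning the string a second time.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst and returns the length of the copy.
   Assumption: dst has room for strlen(src) + 1 bytes. */
size_t copy_and_measure(char *dst, const char *src)
{
    strcpy(dst, src);
    /* The length of dst is implied by the copy above, so this call is a
       candidate for replacement by an already-computed value. */
    return strlen(dst);
}
```
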
318.. option:: -fno-inline
319
320 Do not expand any functions inline apart from those marked with
321 the :fn-attr:`always_inline` attribute. This is the default when not
322 optimizing.
323
324 Single functions can be exempted from inlining by marking them
325 with the :fn-attr:`noinline` attribute.
326
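The two attributes mentioned above can be sketched as follows, using the GNU C ``__attribute__`` syntax; the function names are illustrative only. ``square`` is inlined even under :option:`-fno-inline`, while ``cube`` is never inlined regardless of the optimization level.

```c
/* Always expanded inline, even when not optimizing or when
   -fno-inline is given. */
static inline __attribute__((always_inline)) int square(int x)
{
    return x * x;
}

/* Never expanded inline, even when the inliner would otherwise
   choose to do so. */
__attribute__((noinline)) int cube(int x)
{
    return x * x * x;
}

int sum_of_powers(int x)
{
    return square(x) + cube(x);
}
```
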
327.. option:: -finline
328
329 Default setting; overrides :option:`-fno-inline`.
330
331.. option:: -finline-small-functions
332
333 Integrate functions into their callers when their body is smaller than expected
334 function call code (so overall size of program gets smaller). The compiler
335 heuristically decides which functions are simple enough to be worth integrating
336 in this way. This inlining applies to all functions, even those not declared
337 inline.
338
339 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
340
.. option:: -findirect-inlining

  Also inline indirect calls that are discovered to be known at compile
  time thanks to previous inlining.  This option has an effect only
  when inlining itself is turned on by the :option:`-finline-functions`
  or :option:`-finline-small-functions` options.

  Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
349
350.. option:: -finline-functions
351
352 Consider all functions for inlining, even if they are not declared inline.
353 The compiler heuristically decides which functions are worth integrating
354 in this way.
355
356 If all calls to a given function are integrated, and the function is
357 declared ``static``, then the function is normally not output as
358 assembler code in its own right.
359
360 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`. Also enabled
361 by :option:`-fprofile-use` and :option:`-fauto-profile`.
362
363.. option:: -finline-functions-called-once
364
365 Consider all ``static`` functions called once for inlining into their
366 caller even if they are not marked ``inline``. If a call to a given
367 function is integrated, then the function is not output as assembler code
368 in its own right.
369
370 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3` and :option:`-Os`,
371 but not :option:`-Og`.
372
.. option:: -fearly-inlining

  Inline functions marked by :fn-attr:`always_inline` and functions whose body seems
  smaller than the function call overhead early, before doing
  :option:`-fprofile-generate` instrumentation and the real inlining pass.  Doing so
  makes profiling significantly cheaper and usually makes inlining faster on programs
  having large chains of nested wrapper functions.

  Enabled by default.
382
383.. option:: -fipa-sra
384
385 Perform interprocedural scalar replacement of aggregates, removal of
386 unused parameters and replacement of parameters passed by reference
387 by parameters passed by value.
388
389 Enabled at levels :option:`-O2`, :option:`-O3` and :option:`-Os`.
390
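As an illustration of a candidate for :option:`-fipa-sra` (all names here are hypothetical), the static helper below takes a pointer to a structure but reads only one field. The pass may rewrite such a function so that the field is passed by value and the unused parts of the aggregate are dropped; whether it does so depends on GCC's own analysis.

```c
struct point {
    int x;
    int y;
    int z;
};

/* Only p->x is read, so the by-reference parameter could be replaced by
   a single int passed by value. */
static int get_x(const struct point *p)
{
    return p->x;
}

int first_coord(int x, int y, int z)
{
    struct point p = { x, y, z };
    return get_x(&p);
}
```
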
391.. option:: -finline-limit={n}
392
393 By default, GCC limits the size of functions that can be inlined. This flag
394 allows coarse control of this limit. :samp:`{n}` is the size of functions that
395 can be inlined in number of pseudo instructions.
396
397 Inlining is actually controlled by a number of parameters, which may be
398 specified individually by using :option:`--param name=value`.
399 The :option:`-finline-limit=n` option sets some of these parameters
400 as follows:
401
402 ``max-inline-insns-single``
403 is set to :samp:`{n}/2`.
404
405 ``max-inline-insns-auto``
406 is set to :samp:`{n}/2`.
407
  See below for documentation of the individual
  parameters controlling inlining and for the defaults of these parameters.

  .. note::
    There may be no value to :option:`-finline-limit` that results
    in default behavior.

  .. note::
    A pseudo instruction represents, in this particular context, an
    abstract measurement of a function's size.  In no way does it represent a count
    of assembly instructions, and as such its exact meaning might change from one
    release to another.
420
421.. option:: -fno-keep-inline-dllexport
422
423 This is a more fine-grained version of :option:`-fkeep-inline-functions`,
424 which applies only to functions that are declared using the :microsoft-windows-fn-attr:`dllexport`
425 attribute or declspec. See :ref:`function-attributes`.
426
427.. option:: -fkeep-inline-dllexport
428
429 Default setting; overrides :option:`-fno-keep-inline-dllexport`.
430
431.. option:: -fkeep-inline-functions
432
433 In C, emit ``static`` functions that are declared ``inline``
434 into the object file, even if the function has been inlined into all
435 of its callers. This switch does not affect functions using the
436 ``extern inline`` extension in GNU C90. In C++, emit any and all
437 inline functions into the object file.
438
439.. option:: -fkeep-static-functions
440
441 Emit ``static`` functions into the object file, even if the function
442 is never used.
443
444.. option:: -fkeep-static-consts
445
446 Emit variables declared ``static const`` when optimization isn't turned
447 on, even if the variables aren't referenced.
448
449 GCC enables this option by default. If you want to force the compiler to
450 check if a variable is referenced, regardless of whether or not
451 optimization is turned on, use the :option:`-fno-keep-static-consts` option.
452
453.. option:: -fmerge-constants
454
455 Attempt to merge identical constants (string constants and floating-point
456 constants) across compilation units.
457
458 This option is the default for optimized compilation if the assembler and
459 linker support it. Use :option:`-fno-merge-constants` to inhibit this
460 behavior.
461
462 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`, :option:`-Os`.
463
464.. option:: -fmerge-all-constants
465
466 Attempt to merge identical constants and identical variables.
467
468 This option implies :option:`-fmerge-constants`. In addition to
469 :option:`-fmerge-constants` this considers e.g. even constant initialized
470 arrays or initialized constant variables with integral or floating-point
471 types. Languages like C or C++ require each variable, including multiple
472 instances of the same variable in recursive calls, to have distinct locations,
473 so using this option results in non-conforming
474 behavior.
475
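The conformance issue can be sketched with two distinct objects that hold the same values (``table_a``, ``table_b`` and both functions are illustrative names). Standard C requires distinct objects to have distinct addresses, so ``tables_share_storage`` returns 0 in a conforming build; with :option:`-fmerge-all-constants` the two arrays may be given the same storage, and code that relies on the addresses being different can then misbehave.

```c
#include <string.h>

static const int table_a[3] = { 1, 2, 3 };
static const int table_b[3] = { 1, 2, 3 };

/* Returns 1 when both arrays hold the same values, which is always the
   case here; identical contents are what makes them merge candidates. */
int tables_equal(void)
{
    return memcmp(table_a, table_b, sizeof table_a) == 0;
}

/* Returns 1 when both arrays start at the same address.  Conforming C
   requires 0; merging identical variables may make the objects share
   storage. */
int tables_share_storage(void)
{
    return (const void *)table_a == (const void *)table_b;
}
```
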
476.. option:: -fmodulo-sched
477
478 Perform swing modulo scheduling immediately before the first scheduling
479 pass. This pass looks at innermost loops and reorders their
480 instructions by overlapping different iterations.
481
.. option:: -fmodulo-sched-allow-regmoves

  Perform more aggressive SMS-based modulo scheduling with register moves
  allowed.  By setting this flag certain anti-dependence edges are
  deleted, which triggers the generation of reg-moves based on the
  life-range analysis.  This option is effective only with
  :option:`-fmodulo-sched` enabled.
489
490.. option:: -fno-branch-count-reg
491
492 Disable the optimization pass that scans for opportunities to use
493 'decrement and branch' instructions on a count register instead of
494 instruction sequences that decrement a register, compare it against zero, and
495 then branch based upon the result. This option is only meaningful on
496 architectures that support such instructions, which include x86, PowerPC,
497 IA-64 and S/390. Note that the :option:`-fno-branch-count-reg` option
498 doesn't remove the decrement and branch instructions from the generated
499 instruction stream introduced by other optimization passes.
500
501 The default is :option:`-fbranch-count-reg` at :option:`-O1` and higher,
502 except for :option:`-Og`.
503
504.. option:: -fbranch-count-reg
505
506 Default setting; overrides :option:`-fno-branch-count-reg`.
507
.. option:: -fno-function-cse

  Do not put function addresses in registers; make each instruction that
  calls a constant function contain the function's address explicitly.

  This option results in less efficient code, but some strange hacks
  that alter the assembler output may be confused by the optimizations
  performed when this option is not used.

  The default is :option:`-ffunction-cse`.
518
519.. option:: -ffunction-cse
520
521 Default setting; overrides :option:`-fno-function-cse`.
522
523.. option:: -fno-zero-initialized-in-bss
524
525 If the target supports a BSS section, GCC by default puts variables that
526 are initialized to zero into BSS. This can save space in the resulting
527 code.
528
529 This option turns off this behavior because some programs explicitly
530 rely on variables going to the data section---e.g., so that the
531 resulting executable can find the beginning of that section and/or make
532 assumptions based on that.
533
534 The default is :option:`-fzero-initialized-in-bss`.
535
536.. option:: -fzero-initialized-in-bss
537
538 Default setting; overrides :option:`-fno-zero-initialized-in-bss`.
539
540.. option:: -fthread-jumps
541
542 Perform optimizations that check to see if a jump branches to a
543 location where another comparison subsumed by the first is found. If
544 so, the first branch is redirected to either the destination of the
545 second branch or a point immediately following it, depending on whether
546 the condition is known to be true or false.
547
548 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`, :option:`-Os`.
549
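A minimal example of a comparison subsumed by an earlier one (``classify`` is an illustrative name): when ``a > 10`` is true, ``a > 5`` is necessarily true as well, so the path leaving the first test can be redirected straight to the body of the second ``if`` without testing again.

```c
int classify(int a)
{
    int r = 0;

    if (a > 10)
        r += 1;
    /* On the path where a > 10 held, this condition is already known to
       be true, so the jump can be threaded past the comparison. */
    if (a > 5)
        r += 2;
    return r;
}
```
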
550.. option:: -fsplit-wide-types
551
552 When using a type that occupies multiple registers, such as ``long
553 long`` on a 32-bit system, split the registers apart and allocate them
554 independently. This normally generates better code for those types,
555 but may make debugging more difficult.
556
557 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`,
558 :option:`-Os`.
559
560.. option:: -fsplit-wide-types-early
561
562 Fully split wide types early, instead of very late.
563 This option has no effect unless :option:`-fsplit-wide-types` is turned on.
564
565 This is the default on some targets.
566
567.. option:: -fcse-follow-jumps
568
569 In common subexpression elimination (CSE), scan through jump instructions
570 when the target of the jump is not reached by any other path. For
571 example, when CSE encounters an ``if`` statement with an
572 ``else`` clause, CSE follows the jump when the condition
573 tested is false.
574
575 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
576
577.. option:: -fcse-skip-blocks
578
579 This is similar to :option:`-fcse-follow-jumps`, but causes CSE to
580 follow jumps that conditionally skip over blocks. When CSE
581 encounters a simple ``if`` statement with no else clause,
582 :option:`-fcse-skip-blocks` causes CSE to follow the jump around the
583 body of the ``if``.
584
585 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
586
587.. option:: -frerun-cse-after-loop
588
589 Re-run common subexpression elimination after loop optimizations are
590 performed.
591
592 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
593
594.. option:: -fgcse
595
596 Perform a global common subexpression elimination pass.
597 This pass also performs global constant and copy propagation.
598
599 .. note::
600
601 When compiling a program using computed gotos, a GCC
602 extension, you may get better run-time performance if you disable
603 the global common subexpression elimination pass by adding
604 :option:`-fno-gcse` to the command line.
605
606 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
607
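For readers unfamiliar with the extension named in the note, here is a small sketch of a computed-goto dispatch loop using GNU C labels as values. The opcode encoding and the name ``run`` are invented for this example, and the code assumes the program contains only valid opcodes and ends with the halt opcode. Interpreters written in this style are the typical case where trying :option:`-fno-gcse` may be worthwhile.

```c
/* Hypothetical opcodes: 0 = increment, 1 = double, 2 = halt. */
int run(const unsigned char *code)
{
    /* Table of label addresses (GNU C "labels as values"). */
    static void *const dispatch[] = { &&op_inc, &&op_dbl, &&op_halt };
    int acc = 0;

    goto *dispatch[*code++];

op_inc:
    acc += 1;
    goto *dispatch[*code++];

op_dbl:
    acc *= 2;
    goto *dispatch[*code++];

op_halt:
    return acc;
}
```
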
608.. option:: -fgcse-lm
609
610 When :option:`-fgcse-lm` is enabled, global common subexpression elimination
611 attempts to move loads that are only killed by stores into themselves. This
612 allows a loop containing a load/store sequence to be changed to a load outside
613 the loop, and a copy/store within the loop.
614
615 Enabled by default when :option:`-fgcse` is enabled.
616
617.. option:: -fgcse-sm
618
619 When :option:`-fgcse-sm` is enabled, a store motion pass is run after
620 global common subexpression elimination. This pass attempts to move
621 stores out of loops. When used in conjunction with :option:`-fgcse-lm`,
622 loops containing a load/store sequence can be changed to a load before
623 the loop and a store after the loop.
624
625 Not enabled at any optimization level.
626
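The combined effect described for :option:`-fgcse-lm` and :option:`-fgcse-sm` can be sketched by hand. In ``accumulate`` the global ``total`` is loaded and stored on every iteration; ``accumulate_hoisted`` is a manual rewrite with one load before the loop and one store after it. The names are illustrative, the rewrite assumes ``v`` does not point at ``total``, and the compiler performs the transformation only when it can prove it safe.

```c
int total;

/* As written: a load and a store of total in every iteration. */
void accumulate(const int *v, int n)
{
    for (int i = 0; i < n; i++)
        total += v[i];
}

/* Hand-written equivalent of the optimized form: one load before the
   loop, one store after it.  Assumes v does not alias total. */
void accumulate_hoisted(const int *v, int n)
{
    int t = total;

    for (int i = 0; i < n; i++)
        t += v[i];
    total = t;
}
```
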
627.. option:: -fgcse-las
628
629 When :option:`-fgcse-las` is enabled, the global common subexpression
630 elimination pass eliminates redundant loads that come after stores to the
631 same memory location (both partial and full redundancies).
632
633 Not enabled at any optimization level.
634
635.. option:: -fgcse-after-reload
636
637 When :option:`-fgcse-after-reload` is enabled, a redundant load elimination
638 pass is performed after reload. The purpose of this pass is to clean up
639 redundant spilling.
640
641 Enabled by :option:`-O3`, :option:`-fprofile-use` and :option:`-fauto-profile`.
642
.. option:: -faggressive-loop-optimizations

  This option tells the loop optimizer to use language constraints to
  derive bounds for the number of iterations of a loop.  This assumes that
  loop code does not invoke undefined behavior by, for example, causing signed
  integer overflows or out-of-bound array accesses.  The bounds for the
  number of iterations of a loop are used to guide loop unrolling and peeling
  and loop exit test optimizations.
  This option is enabled by default.
652
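To make the idea of a language-derived bound concrete, consider the sketch below (``data``, ``N`` and ``sum_data`` are illustrative names). Because ``data`` has exactly ``N`` elements, any loop that indexes it with ``i`` can execute at most ``N`` iterations without invoking undefined behavior, and the optimizer may rely on that bound. A loop mistakenly written with ``i <= N`` would read past the end of the array, and the compiler is allowed to assume that this never happens.

```c
enum { N = 4 };

int data[N];

/* The accesses data[i] are only defined for 0 <= i < N, which bounds
   the trip count of the loop at N regardless of the exit test. */
int sum_data(void)
{
    int sum = 0;

    for (int i = 0; i < N; i++)
        sum += data[i];
    return sum;
}
```
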
653.. option:: -funconstrained-commons
654
655 This option tells the compiler that variables declared in common blocks
656 (e.g. Fortran) may later be overridden with longer trailing arrays. This
657 prevents certain optimizations that depend on knowing the array bounds.
658
659.. option:: -fcrossjumping
660
661 Perform cross-jumping transformation.
662 This transformation unifies equivalent code and saves code size. The
663 resulting code may or may not perform better than without cross-jumping.
664
665 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
666
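A sketch of code with an obvious cross-jumping opportunity (``adjust`` is an illustrative name): both branches end with the same computation, so the compiler may emit that tail once and have both paths jump to it.

```c
int adjust(int flag, int x)
{
    int r;

    if (flag) {
        x += 1;
        r = x * 3 + 7;   /* identical tail in both branches */
    } else {
        x -= 1;
        r = x * 3 + 7;   /* identical tail in both branches */
    }
    return r;
}
```
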
667.. option:: -fauto-inc-dec
668
669 Combine increments or decrements of addresses with memory accesses.
670 This pass is always skipped on architectures that do not have
671 instructions to support this. Enabled by default at :option:`-O1` and
672 higher on architectures that support this.
673
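The classic source pattern that maps onto auto-increment addressing is a pointer that is dereferenced and then advanced. The sketch below uses an illustrative ``copy_words`` helper and assumes the buffers do not overlap; on targets with post-increment addressing modes, each access and its pointer update may be combined into one instruction.

```c
#include <stddef.h>

/* Copies n ints from src to dst.  Assumption: the buffers do not
   overlap and each holds at least n elements. */
void copy_words(int *dst, const int *src, size_t n)
{
    while (n--)
        *dst++ = *src++;   /* memory access followed by address increment */
}
```
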
674.. option:: -fdce
675
676 Perform dead code elimination (DCE) on RTL.
677 Enabled by default at :option:`-O1` and higher.
678
679.. option:: -fdse
680
681 Perform dead store elimination (DSE) on RTL.
682 Enabled by default at :option:`-O1` and higher.
683
.. option:: -fif-conversion

  Attempt to transform conditional jumps into branch-less equivalents.  This
  includes use of conditional moves, min, max, set flags and abs instructions, and
  some tricks doable by standard arithmetic.  The use of conditional execution
  on chips where it is available is controlled by :option:`-fif-conversion2`.

  Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`, :option:`-Os`, but
  not with :option:`-Og`.
693
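Two small functions of the kind if-conversion targets are sketched below (the names are illustrative). Each is written with a conditional jump in the source, but can be compiled to a conditional move or to a max or abs instruction where the target provides one. ``abs_of`` assumes its argument is not ``INT_MIN``, since negating that value overflows.

```c
/* Candidate for a conditional move or a max instruction. */
int max_of(int a, int b)
{
    if (a > b)
        return a;
    return b;
}

/* Candidate for an abs instruction.  Assumption: x != INT_MIN. */
int abs_of(int x)
{
    if (x < 0)
        x = -x;
    return x;
}
```
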
694.. option:: -fif-conversion2
695
696 Use conditional execution (where available) to transform conditional jumps into
697 branch-less equivalents.
698
699 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`, :option:`-Os`, but
700 not with :option:`-Og`.
701
702.. option:: -fdeclone-ctor-dtor
703
704 The C++ ABI requires multiple entry points for constructors and
705 destructors: one for a base subobject, one for a complete object, and
706 one for a virtual destructor that calls operator delete afterwards.
707 For a hierarchy with virtual bases, the base and complete variants are
708 clones, which means two copies of the function. With this option, the
709 base and complete variants are changed to be thunks that call a common
710 implementation.
711
712 Enabled by :option:`-Os`.
713
714.. option:: -fdelete-null-pointer-checks
715
716 Assume that programs cannot safely dereference null pointers, and that
717 no code or data element resides at address zero.
718 This option enables simple constant
719 folding optimizations at all optimization levels. In addition, other
720 optimization passes in GCC use this flag to control global dataflow
721 analyses that eliminate useless checks for null pointers; these assume
722 that a memory access to address zero always results in a trap, so
723 that if a pointer is checked after it has already been dereferenced,
724 it cannot be null.
725
726 Note however that in some environments this assumption is not true.
727 Use :option:`-fno-delete-null-pointer-checks` to disable this optimization
728 for programs that depend on that behavior.
729
730 This option is enabled by default on most targets. On Nios II ELF, it
731 defaults to off. On AVR and MSP430, this option is completely disabled.
732
733 Passes that use the dataflow information
734 are enabled independently at different optimization levels.
735
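The "checked after it has already been dereferenced" case can be sketched as follows, with illustrative function names. In ``read_then_check`` the dereference comes first, so the later null test may be treated as useless and removed; in ``check_then_read`` the test precedes the access and is kept.

```c
#include <stddef.h>

/* The dereference lets the optimizer conclude p is not null, so the
   test that follows may be deleted.  Callers must pass a valid pointer. */
int read_then_check(const int *p)
{
    int v = *p;

    if (p == NULL)
        return -1;
    return v;
}

/* The test comes before any access, so it is always preserved. */
int check_then_read(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}
```

Code meant to handle null pointers should follow the second pattern; the first is only well defined when ``p`` is never null.
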
736.. option:: -fdevirtualize
737
738 Attempt to convert calls to virtual functions to direct calls. This
739 is done both within a procedure and interprocedurally as part of
740 indirect inlining (:option:`-findirect-inlining`) and interprocedural constant
741 propagation (:option:`-fipa-cp`).
742 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
743
744.. option:: -fdevirtualize-speculatively
745
746 Attempt to convert calls to virtual functions to speculative direct calls.
747 Based on the analysis of the type inheritance graph, determine for a given call
748 the set of likely targets. If the set is small, preferably of size 1, change
749 the call into a conditional deciding between direct and indirect calls. The
750 speculative calls enable more optimizations, such as inlining. When they seem
751 useless after further optimization, they are converted back into original form.
752
753.. option:: -fdevirtualize-at-ltrans
754
755 Stream extra information needed for aggressive devirtualization when running
756 the link-time optimizer in local transformation mode.
757 This option enables more devirtualization but
758 significantly increases the size of streamed data. For this reason it is
759 disabled by default.
760
761.. option:: -fexpensive-optimizations
762
763 Perform a number of minor optimizations that are relatively expensive.
764
765 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
766
767.. option:: -free
768
769 Attempt to remove redundant extension instructions. This is especially
770 helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
771 registers after writing to their lower 32-bit half.
772
773 Enabled for Alpha, AArch64 and x86 at levels :option:`-O2`,
774 :option:`-O3`, :option:`-Os`.
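
  A small sketch of source code where such an extension can arise (the
  function name is hypothetical, and the instructions actually emitted
  depend on the target and compiler version): on x86-64 the 32-bit
  addition already clears the upper half of the 64-bit register, so a
  separate zero-extension for the widening return would be redundant.

  .. code-block:: c++

    #include <cstdint>

    std::uint64_t add_and_widen(std::uint32_t a, std::uint32_t b) {
      std::uint32_t sum = a + b;  // 32-bit add, wraps modulo 2^32
      return sum;                 // zero-extension to 64 bits
    }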
775
776.. option:: -fno-lifetime-dse
777
778 In C++ the value of an object is only affected by changes within its
779 lifetime: when the constructor begins, the object has an indeterminate
780 value, and any changes during the lifetime of the object are dead when
781 the object is destroyed. Normally dead store elimination will take
782 advantage of this; if your code relies on the value of the object
783 storage persisting beyond the lifetime of the object, you can use this
784 flag to disable this optimization. To preserve stores before the
785 constructor starts (e.g. because your operator new clears the object
786 storage) but still treat the object as dead after the destructor, you
787 can use :option:`-flifetime-dse=1`. The default behavior can be
788 explicitly selected with :option:`-flifetime-dse=2`.
789 :option:`-flifetime-dse=0` is equivalent to :option:`-fno-lifetime-dse`.
790
791.. option:: -flifetime-dse
792
793 Default setting; overrides :option:`-fno-lifetime-dse`.
794
795.. option:: -flive-range-shrinkage
796
797 Attempt to decrease register pressure through register live range
798 shrinkage. This is helpful for fast processors with small or moderate
799 size register sets.
800
801.. option:: -fira-algorithm={algorithm}
802
803 Use the specified coloring algorithm for the integrated register
804 allocator. The :samp:`{algorithm}` argument can be :samp:`priority`, which
805 specifies Chow's priority coloring, or :samp:`CB`, which specifies
806 Chaitin-Briggs coloring. Chaitin-Briggs coloring is not implemented
807 for all architectures, but for those targets that do support it, it is
808 the default because it generates better code.
809
810.. option:: -fira-region={region}
811
812 Use specified regions for the integrated register allocator. The
813 :samp:`{region}` argument should be one of the following:
814
815 :samp:`all`
816 Use all loops as register allocation regions.
817 This can give the best results for machines with a small and/or
818 irregular register set.
819
820 :samp:`mixed`
821 Use all loops except for loops with small register pressure
822 as the regions. This value gives
823 the best results in most cases and for most architectures,
824 and is enabled by default when compiling with optimization for speed
825 (:option:`-O`, :option:`-O2`, ...).
826
827 :samp:`one`
828 Use all functions as a single region.
829 This typically results in the smallest code size, and is enabled by default for
830 :option:`-Os` or :option:`-O0`.
831
832.. option:: -fira-hoist-pressure
833
834 Use IRA to evaluate register pressure in the code hoisting pass for
835 decisions to hoist expressions. This option usually results in smaller
836 code, but it can slow the compiler down.
837
838 This option is enabled at level :option:`-Os` for all targets.
839
840.. option:: -fira-loop-pressure
841
842 Use IRA to evaluate register pressure in loops for decisions to move
843 loop invariants. This option usually results in generation
844 of faster and smaller code on machines with large register files (>= 32
845 registers), but it can slow the compiler down.
846
847 This option is enabled at level :option:`-O3` for some targets.
848
849.. option:: -fno-ira-share-save-slots
850
851 Disable sharing of stack slots used for saving call-used hard
852 registers living through a call. Each hard register gets a
853 separate stack slot, and as a result function stack frames are
854 larger.
855
856.. option:: -fira-share-save-slots
857
858 Default setting; overrides :option:`-fno-ira-share-save-slots`.
859
860.. option:: -fno-ira-share-spill-slots
861
862 Disable sharing of stack slots allocated for pseudo-registers. Each
863 pseudo-register that does not get a hard register gets a separate
864 stack slot, and as a result function stack frames are larger.
865
866.. option:: -fira-share-spill-slots
867
868 Default setting; overrides :option:`-fno-ira-share-spill-slots`.
869
870.. option:: -flra-remat
871
872 Enable CFG-sensitive rematerialization in LRA. Instead of loading
873 values of spilled pseudos, LRA tries to rematerialize (recalculate)
874 values if it is profitable.
875
876 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
877
878.. option:: -fdelayed-branch
879
880 If supported for the target machine, attempt to reorder instructions
881 to exploit instruction slots available after delayed branch
882 instructions.
883
884 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`, :option:`-Os`,
885 but not at :option:`-Og`.
886
887.. option:: -fschedule-insns
888
889 If supported for the target machine, attempt to reorder instructions to
890 eliminate execution stalls due to required data being unavailable. This
891 helps machines that have slow floating point or memory load instructions
892 by allowing other instructions to be issued until the result of the load
893 or floating-point instruction is required.
894
895 Enabled at levels :option:`-O2`, :option:`-O3`.
896
897.. option:: -fschedule-insns2
898
899 Similar to :option:`-fschedule-insns`, but requests an additional pass of
900 instruction scheduling after register allocation has been done. This is
901 especially useful on machines with a relatively small number of
902 registers and where memory load instructions take more than one cycle.
903
904 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
905
906.. option:: -fno-sched-interblock
907
908 Disable instruction scheduling across basic blocks, which
909 is normally enabled when scheduling before register allocation, i.e.
910 with :option:`-fschedule-insns` or at :option:`-O2` or higher.
911
912.. option:: -fsched-interblock
913
914 Default setting; overrides :option:`-fno-sched-interblock`.
915
916.. option:: -fno-sched-spec
917
918 Disable speculative motion of non-load instructions, which
919 is normally enabled when scheduling before register allocation, i.e.
920 with :option:`-fschedule-insns` or at :option:`-O2` or higher.
921
922.. option:: -fsched-spec
923
924 Default setting; overrides :option:`-fno-sched-spec`.
925
926.. option:: -fsched-pressure
927
928 Enable register pressure sensitive insn scheduling before register
929 allocation. This only makes sense when scheduling before register
930 allocation is enabled, i.e. with :option:`-fschedule-insns` or at
931 :option:`-O2` or higher. Usage of this option can improve the
932 generated code and decrease its size by preventing register pressure
933 increase above the number of available hard registers and subsequent
934 spills in register allocation.
935
936.. option:: -fsched-spec-load
937
938 Allow speculative motion of some load instructions. This only makes
939 sense when scheduling before register allocation, i.e. with
940 :option:`-fschedule-insns` or at :option:`-O2` or higher.
941
942.. option:: -fsched-spec-load-dangerous
943
944 Allow speculative motion of more load instructions. This only makes
945 sense when scheduling before register allocation, i.e. with
946 :option:`-fschedule-insns` or at :option:`-O2` or higher.
947
948.. option:: -fsched-stalled-insns, -fsched-stalled-insns={n}
949
950 Define how many insns (if any) can be moved prematurely from the queue
951 of stalled insns into the ready list during the second scheduling pass.
952 :option:`-fno-sched-stalled-insns` means that no insns are moved
953 prematurely, :option:`-fsched-stalled-insns=0` means there is no limit
954 on how many queued insns can be moved prematurely.
955 :option:`-fsched-stalled-insns` without a value is equivalent to
956 :option:`-fsched-stalled-insns=1`.
957
958.. option:: -fsched-stalled-insns-dep, -fsched-stalled-insns-dep={n}
959
960 Define how many insn groups (cycles) are examined for a dependency
961 on a stalled insn that is a candidate for premature removal from the queue
962 of stalled insns. This has an effect only during the second scheduling pass,
963 and only if :option:`-fsched-stalled-insns` is used.
964 :option:`-fno-sched-stalled-insns-dep` is equivalent to
965 :option:`-fsched-stalled-insns-dep=0`.
966 :option:`-fsched-stalled-insns-dep` without a value is equivalent to
967 :option:`-fsched-stalled-insns-dep=1`.
968
969.. option:: -fsched2-use-superblocks
970
971 When scheduling after register allocation, use superblock scheduling.
972 This allows motion across basic block boundaries,
973 resulting in faster schedules. This option is experimental, as not all machine
974 descriptions used by GCC model the CPU closely enough to avoid unreliable
975 results from the algorithm.
976
977 This only makes sense when scheduling after register allocation, i.e. with
978 :option:`-fschedule-insns2` or at :option:`-O2` or higher.
979
980.. option:: -fsched-group-heuristic
981
982 Enable the group heuristic in the scheduler. This heuristic favors
983 the instruction that belongs to a schedule group. This is enabled
984 by default when scheduling is enabled, i.e. with :option:`-fschedule-insns`
985 or :option:`-fschedule-insns2` or at :option:`-O2` or higher.
986
987.. option:: -fsched-critical-path-heuristic
988
989 Enable the critical-path heuristic in the scheduler. This heuristic favors
990 instructions on the critical path. This is enabled by default when
991 scheduling is enabled, i.e. with :option:`-fschedule-insns`
992 or :option:`-fschedule-insns2` or at :option:`-O2` or higher.
993
994.. option:: -fsched-spec-insn-heuristic
995
996 Enable the speculative instruction heuristic in the scheduler. This
997 heuristic favors speculative instructions with greater dependency weakness.
998 This is enabled by default when scheduling is enabled, i.e.
999 with :option:`-fschedule-insns` or :option:`-fschedule-insns2`
1000 or at :option:`-O2` or higher.
1001
1002.. option:: -fsched-rank-heuristic
1003
1004 Enable the rank heuristic in the scheduler. This heuristic favors
1005 the instruction belonging to a basic block with greater size or frequency.
1006 This is enabled by default when scheduling is enabled, i.e.
1007 with :option:`-fschedule-insns` or :option:`-fschedule-insns2` or
1008 at :option:`-O2` or higher.
1009
1010.. option:: -fsched-last-insn-heuristic
1011
1012 Enable the last-instruction heuristic in the scheduler. This heuristic
1013 favors the instruction that is less dependent on the last instruction
1014 scheduled. This is enabled by default when scheduling is enabled,
1015 i.e. with :option:`-fschedule-insns` or :option:`-fschedule-insns2` or
1016 at :option:`-O2` or higher.
1017
1018.. option:: -fsched-dep-count-heuristic
1019
1020 Enable the dependent-count heuristic in the scheduler. This heuristic
1021 favors the instruction that has more instructions depending on it.
1022 This is enabled by default when scheduling is enabled, i.e.
1023 with :option:`-fschedule-insns` or :option:`-fschedule-insns2` or
1024 at :option:`-O2` or higher.
1025
1026.. option:: -freschedule-modulo-scheduled-loops
1027
1028 Modulo scheduling is performed before traditional scheduling. If a loop
1029 is modulo scheduled, later scheduling passes may change its schedule.
1030 Use this option to control that behavior.
1031
1032.. option:: -fselective-scheduling
1033
1034 Schedule instructions using the selective scheduling algorithm. Selective
1035 scheduling runs instead of the first scheduler pass.
1036
1037.. option:: -fselective-scheduling2
1038
1039 Schedule instructions using the selective scheduling algorithm. Selective
1040 scheduling runs instead of the second scheduler pass.
1041
1042.. option:: -fsel-sched-pipelining
1043
1044 Enable software pipelining of innermost loops during selective scheduling.
1045 This option has no effect unless one of :option:`-fselective-scheduling` or
1046 :option:`-fselective-scheduling2` is turned on.
1047
1048.. option:: -fsel-sched-pipelining-outer-loops
1049
1050 When pipelining loops during selective scheduling, also pipeline outer loops.
1051 This option has no effect unless :option:`-fsel-sched-pipelining` is turned on.
1052
1053.. option:: -fsemantic-interposition
1054
1055 Some object formats, like ELF, allow interposing of symbols by the
1056 dynamic linker.
1057 This means that for symbols exported from the DSO, the compiler cannot perform
1058 interprocedural propagation, inlining and other optimizations in anticipation
1059 that the function or variable in question may change. While this feature is
1060 useful, for example, to rewrite memory allocation functions by a debugging
1061 implementation, it is expensive in terms of code quality.
1062 With :option:`-fno-semantic-interposition` the compiler assumes that,
1063 if interposition happens for functions, the overwriting function has
1064 precisely the same semantics (and side effects).
1065 Similarly, if interposition happens
1066 for variables, the constructor of the variable is the same. The flag
1067 has no effect for functions explicitly declared inline
1068 (where interposition is never allowed to change semantics)
1069 and for symbols explicitly declared weak.
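
  As an illustrative sketch (both function names are hypothetical),
  consider two functions exported from the same shared library. When
  interposition has to be honored, the call to ``default_quota`` may be
  redirected by the dynamic linker, so the compiler cannot simply replace
  it with the constant it returns here.

  .. code-block:: c++

    // Both functions have default visibility, so in a shared library either
    // symbol could be interposed by another definition at load time.
    int default_quota() { return 10; }

    // If interposition must be respected, this call cannot be inlined or
    // folded to 'users * 10', because default_quota() might be replaced.
    int total_quota(int users) { return users * default_quota(); }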
1070
1071.. option:: -fshrink-wrap
1072
1073 Emit function prologues only before parts of the function that need it,
1074 rather than at the top of the function. This flag is enabled by default at
1075 :option:`-O` and higher.
1076
1077.. option:: -fshrink-wrap-separate
1078
1079 Shrink-wrap separate parts of the prologue and epilogue separately, so that
1080 those parts are only executed when needed.
1081 This option is on by default, but has no effect unless :option:`-fshrink-wrap`
1082 is also turned on and the target supports this.
1083
1084.. option:: -fcaller-saves
1085
1086 Enable allocation of values to registers that are clobbered by
1087 function calls, by emitting extra instructions to save and restore the
1088 registers around such calls. Such allocation is done only when it
1089 seems to result in better code.
1090
1091 This option is always enabled by default on certain machines, usually
1092 those which have no call-preserved registers to use instead.
1093
1094 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
1095
1096.. option:: -fcombine-stack-adjustments
1097
1098 Tracks stack adjustments (pushes and pops) and stack memory references
1099 and then tries to find ways to combine them.
1100
1101 Enabled by default at :option:`-O1` and higher.
1102
1103.. option:: -fipa-ra
1104
1105 Use caller save registers for allocation if those registers are not used by
1106 any called function. In that case it is not necessary to save and restore
1107 them around calls. This is only possible if the called functions are part of
1108 the same compilation unit as the current function and are compiled before it.
1109
1110 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`. However, the option
1111 is disabled if the generated code will be instrumented for profiling
1112 (:option:`-p`, or :option:`-pg`) or if the callee's register usage cannot be known
1113 exactly (this happens on targets that do not expose prologues
1114 and epilogues in RTL).
1115
1116.. option:: -fconserve-stack
1117
1118 Attempt to minimize stack usage. The compiler attempts to use less
1119 stack space, even if that makes the program slower. This option
1120 implies setting the large-stack-frame parameter to 100
1121 and the large-stack-frame-growth parameter to 400.
1122
1123.. option:: -ftree-reassoc
1124
1125 Perform reassociation on trees. This flag is enabled by default
1126 at :option:`-O1` and higher.
1127
1128.. option:: -fcode-hoisting
1129
1130 Perform code hoisting. Code hoisting tries to move the
1131 evaluation of expressions executed on all paths to the function exit
1132 as early as possible. This is especially useful as a code size
1133 optimization, but it often helps for code speed as well.
1134 This flag is enabled by default at :option:`-O2` and higher.
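
  A hand-written before/after sketch of the idea (function names are
  hypothetical; the compiler works on its internal representation, not on
  source text): ``a * b`` is evaluated on every path to the function exit,
  so its evaluation can be moved above the branch.

  .. code-block:: c++

    // Before: the multiplication appears on both branches.
    int branchy(int a, int b, bool c) {
      if (c)
        return a * b + 1;
      return a * b - 1;
    }

    // After hoisting by hand: the multiplication is evaluated once, early.
    int hoisted(int a, int b, bool c) {
      int t = a * b;
      return c ? t + 1 : t - 1;
    }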
1135
1136.. option:: -ftree-pre
1137
1138 Perform partial redundancy elimination (PRE) on trees. This flag is
1139 enabled by default at :option:`-O2` and :option:`-O3`.
1140
1141.. option:: -ftree-partial-pre
1142
1143 Make partial redundancy elimination (PRE) more aggressive. This flag is
1144 enabled by default at :option:`-O3`.
1145
1146.. option:: -ftree-forwprop
1147
1148 Perform forward propagation on trees. This flag is enabled by default
1149 at :option:`-O1` and higher.
1150
1151.. option:: -ftree-fre
1152
1153 Perform full redundancy elimination (FRE) on trees. The difference
1154 between FRE and PRE is that FRE only considers expressions
1155 that are computed on all paths leading to the redundant computation.
1156 This analysis is faster than PRE, though it exposes fewer redundancies.
1157 This flag is enabled by default at :option:`-O1` and higher.
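
  The difference can be sketched with two hypothetical functions. In the
  first, ``a + b`` has been computed on every path that reaches its second
  use, which is the case FRE handles. In the second, it has been computed
  on only one of the paths, which is the case left to PRE.

  .. code-block:: c++

    // Fully redundant: 'a + b' is available on all paths to the second use.
    int fully_redundant(int a, int b) {
      int x = a + b;
      int y = a + b;  // can reuse x
      return x * y;
    }

    // Partially redundant: 'a + b' is available only when c is true.
    int partially_redundant(int a, int b, bool c) {
      int x = 0;
      if (c)
        x = a + b;
      int y = a + b;  // redundant on the 'c' path only
      return x + y;
    }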
1158
1159.. option:: -ftree-phiprop
1160
1161 Perform hoisting of loads from conditional pointers on trees. This
1162 pass is enabled by default at :option:`-O1` and higher.
1163
1164.. option:: -fhoist-adjacent-loads
1165
1166 Speculatively hoist loads from both branches of an if-then-else if the
1167 loads are from adjacent locations in the same structure and the target
1168 architecture has a conditional move instruction. This flag is enabled
1169 by default at :option:`-O2` and higher.
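
  A minimal sketch of the pattern this option looks for (the ``Pair`` type
  and ``pick`` function are hypothetical): the two branches load from
  adjacent members of the same structure, so both loads can be issued
  before the test and the result chosen with a conditional move.

  .. code-block:: c++

    struct Pair {
      int first;
      int second;  // adjacent to 'first' in memory
    };

    int pick(const Pair &p, bool use_first) {
      if (use_first)
        return p.first;
      else
        return p.second;
    }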
1170
1171.. option:: -ftree-copy-prop
1172
1173 Perform copy propagation on trees. This pass eliminates unnecessary
1174 copy operations. This flag is enabled by default at :option:`-O1` and
1175 higher.
1176
1177.. option:: -fipa-pure-const
1178
1179 Discover which functions are pure or constant.
1180 Enabled by default at :option:`-O1` and higher.
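
  As a sketch of what the analysis can discover without any attributes in
  the source (all names below are hypothetical): a function that depends
  only on its arguments behaves like one declared ``const``, and a
  function that only reads global memory behaves like one declared
  ``pure``.

  .. code-block:: c++

    static int table[4] = {1, 2, 3, 4};

    // Depends only on its argument: a candidate for "const".
    int square(int x) { return x * x; }

    // Reads, but never writes, global memory: a candidate for "pure".
    int lookup(int i) { return table[i & 3]; }

    // Once square() is known to be const, the two calls can be merged.
    int twice_square(int x) { return square(x) + square(x); }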
1181
1182.. option:: -fipa-reference
1183
1184 Discover which static variables do not escape the
1185 compilation unit.
1186 Enabled by default at :option:`-O1` and higher.
1187
1188.. option:: -fipa-reference-addressable
1189
1190 Discover read-only, write-only and non-addressable static variables.
1191 Enabled by default at :option:`-O1` and higher.
1192
1193.. option:: -fipa-stack-alignment
1194
1195 Reduce stack alignment on call sites if possible.
1196 Enabled by default.
1197
1198.. option:: -fipa-pta
1199
1200 Perform interprocedural pointer analysis and interprocedural modification
1201 and reference analysis. This option can cause excessive memory and
1202 compile-time usage on large compilation units. It is not enabled by
1203 default at any optimization level.
1204
1205.. option:: -fipa-profile
1206
1207 Perform interprocedural profile propagation. The functions called only from
1208 cold functions are marked as cold. Also functions executed once (such as
1209 :fn-attr:`cold`, :fn-attr:`noreturn`, static constructors or destructors) are
1210 identified. Cold functions and loopless parts of functions executed once are
1211 then optimized for size.
1212 Enabled by default at :option:`-O1` and higher.
1213
1214.. option:: -fipa-modref
1215
1216 Perform interprocedural mod/ref analysis. This optimization analyzes the side
1217 effects of functions (memory locations that are modified or referenced) and
1218 enables better optimization across the function call boundary. This flag is
1219 enabled by default at :option:`-O1` and higher.
1220
1221.. option:: -fipa-cp
1222
1223 Perform interprocedural constant propagation.
1224 This optimization analyzes the program to determine when values passed
1225 to functions are constants and then optimizes accordingly.
1226 This optimization can substantially increase performance
1227 if the application has constants passed to functions.
1228 This flag is enabled by default at :option:`-O2`, :option:`-Os` and :option:`-O3`.
1229 It is also enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
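
  A small sketch with hypothetical names: every call to ``scale`` in this
  translation unit passes ``4`` as ``factor``, so the function can be
  optimized as if ``factor`` were that constant.

  .. code-block:: c++

    // Internal linkage: all callers are visible to the compiler.
    static int scale(int value, int factor) { return value * factor; }

    // Both callers pass the same constant for 'factor'.
    int to_quads(int v) { return scale(v, 4); }
    int area_in_quads(int w, int h) { return scale(w * h, 4); }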
1230
1231.. option:: -fipa-cp-clone
1232
1233 Perform function cloning to make interprocedural constant propagation stronger.
1234 When enabled, interprocedural constant propagation performs function cloning
1235 when an externally visible function can be called with constant arguments.
1236 Because this optimization can create multiple copies of functions,
1237 it may significantly increase code size
1238 (see :option:`--param ipa-cp-unit-growth=value`).
1239 This flag is enabled by default at :option:`-O3`.
1240 It is also enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
1241
1242.. option:: -fipa-bit-cp
1243
1244 When enabled, perform interprocedural bitwise constant
1245 propagation. This flag is enabled by default at :option:`-O2` and
1246 by :option:`-fprofile-use` and :option:`-fauto-profile`.
1247 It requires that :option:`-fipa-cp` is enabled.
1248
1249.. option:: -fipa-vrp
1250
1251 When enabled, perform interprocedural propagation of value
1252 ranges. This flag is enabled by default at :option:`-O2`. It requires
1253 that :option:`-fipa-cp` is enabled.
1254
1255.. option:: -fipa-icf
1256
1257 Perform Identical Code Folding for functions and read-only variables.
1258 The optimization reduces code size and may disturb unwind stacks by replacing
1259 a function by an equivalent one with a different name. The optimization works
1260 more effectively with link-time optimization enabled.
1261
1262 Although the behavior is similar to the Gold Linker's ICF optimization, GCC ICF
1263 works on different levels and thus the optimizations are not the same: there are
1264 equivalences that are found only by GCC and equivalences found only by Gold.
1265
1266 This flag is enabled by default at :option:`-O2` and :option:`-Os`.
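
  For illustration only (both function names are hypothetical), the two
  functions below have identical bodies and are therefore candidates for
  folding. If they are folded, a backtrace taken inside one of them may
  show the name of the other, which is the effect on unwind stacks
  mentioned above.

  .. code-block:: c++

    // Same parameter types, same return type, same body.
    int add_pixels(int a, int b) { return a + b; }
    int add_lengths(int x, int y) { return x + y; }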
1267
1268.. option:: -flive-patching={level}
1269
1270 Control GCC's optimizations to produce output suitable for live-patching.
1271
1272 If the compiler's optimization uses a function's body or information extracted
1273 from its body to optimize/change another function, the latter is called an
1274 impacted function of the former. If a function is patched, its impacted
1275 functions should be patched too.
1276
1277 The impacted functions are determined by the compiler's interprocedural
1278 optimizations. For example, a caller is impacted when inlining a function
1279 into its caller,
1280 cloning a function and changing its caller to call this new clone,
1281 or extracting a function's pureness/constness information to optimize
1282 its direct or indirect callers, etc.
1283
1284 Usually, the more IPA optimizations enabled, the larger the number of
1285 impacted functions for each function. In order to control the number of
1286 impacted functions and more easily compute the list of impacted functions,
1287 IPA optimizations can be partially enabled at two different levels.
1288
1289 The :samp:`{level}` argument should be one of the following:
1290
1291 :samp:`inline-clone`
1292 Only enable inlining and cloning optimizations, which includes inlining,
1293 cloning, interprocedural scalar replacement of aggregates and partial inlining.
1294 As a result, when patching a function, all its callers and its clones'
1295 callers are impacted and therefore need to be patched as well.
1296
1297 :option:`-flive-patching=inline-clone` disables the following optimization flags:
1298
1299 :option:`-fwhole-program` :option:`-fipa-pta` :option:`-fipa-reference` :option:`-fipa-ra` |gol|
1300 :option:`-fipa-icf` :option:`-fipa-icf-functions` :option:`-fipa-icf-variables` |gol|
1301 :option:`-fipa-bit-cp` :option:`-fipa-vrp` :option:`-fipa-pure-const` :option:`-fipa-reference-addressable` |gol|
1302 :option:`-fipa-stack-alignment` :option:`-fipa-modref`
1303
1304 :samp:`inline-only-static`
1305 Only enable inlining of static functions.
1306 As a result, when patching a static function, all its callers are impacted
1307 and so need to be patched as well.
1308
1309 In addition to all the flags that :option:`-flive-patching=inline-clone`
1310 disables,
1311 :option:`-flive-patching=inline-only-static` disables the following additional
1312 optimization flags:
1313
1314 :option:`-fipa-cp-clone` :option:`-fipa-sra` :option:`-fpartial-inlining` :option:`-fipa-cp`
1315
1316 When :option:`-flive-patching` is specified without any value, the default value
1317 is :samp:`inline-clone`.
1318
1319 This flag is disabled by default.
1320
1321 Note that :option:`-flive-patching` is not supported with link-time optimization
1322 (:option:`-flto`).
1323
1324.. option:: -fisolate-erroneous-paths-dereference
1325
1326 Detect paths that trigger erroneous or undefined behavior due to
1327 dereferencing a null pointer. Isolate those paths from the main control
1328 flow and turn the statement with erroneous or undefined behavior into a trap.
1329 This flag is enabled by default at :option:`-O2` and higher and depends on
1330 :option:`-fdelete-null-pointer-checks` also being enabled.
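
  A minimal sketch of a path this option can isolate (the function name is
  hypothetical): when ``have_value`` is false the pointer is null and the
  dereference is undefined, so that path can be split off and ended with a
  trap, while the other path is left as an ordinary load.

  .. code-block:: c++

    int read_value(bool have_value, int *storage) {
      int *p = have_value ? storage : nullptr;
      return *p;  // undefined when have_value is false
    }

  Only calls with ``have_value`` set to true and a valid ``storage``
  pointer have defined behavior.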
1331
1332.. option:: -fisolate-erroneous-paths-attribute
1333
1334 Detect paths that trigger erroneous or undefined behavior due to a null value
1335 being used in a way forbidden by a :fn-attr:`returns_nonnull` or :fn-attr:`nonnull`
1336 attribute. Isolate those paths from the main control flow and turn the
1337 statement with erroneous or undefined behavior into a trap. This is not
1338 currently enabled, but may be enabled by :option:`-O2` in the future.
1339
1340.. option:: -ftree-sink
1341
1342 Perform forward store motion on trees. This flag is
1343 enabled by default at :option:`-O1` and higher.
1344
1345.. option:: -ftree-bit-ccp
1346
1347 Perform sparse conditional bit constant propagation on trees and propagate
1348 pointer alignment information.
1349 This pass only operates on local scalar variables and is enabled by default
1350 at :option:`-O1` and higher, except for :option:`-Og`.
1351 It requires that :option:`-ftree-ccp` is enabled.
1352
1353.. option:: -ftree-ccp
1354
1355 Perform sparse conditional constant propagation (CCP) on trees. This
1356 pass only operates on local scalar variables and is enabled by default
1357 at :option:`-O1` and higher.
1358
1359.. option:: -fssa-backprop
1360
1361 Propagate information about uses of a value up the definition chain
1362 in order to simplify the definitions. For example, this pass strips
1363 sign operations if the sign of a value never matters. The flag is
1364 enabled by default at :option:`-O1` and higher.
1365
1366.. option:: -fssa-phiopt
1367
1368 Perform pattern matching on SSA PHI nodes to optimize conditional
1369 code. This pass is enabled by default at :option:`-O1` and higher,
1370 except for :option:`-Og`.
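
  Two hypothetical functions showing the kind of conditional code this
  pass can match: in each, the value selected at the join point
  corresponds to a simple operation (a maximum and an absolute value), so
  the branch can be replaced by that operation.

  .. code-block:: c++

    // The join of the two assignments is equivalent to max(a, b).
    int max_branch(int a, int b) {
      int r;
      if (a > b)
        r = a;
      else
        r = b;
      return r;
    }

    // The conditional negation is equivalent to an absolute value.
    int abs_branch(int x) {
      if (x < 0)
        x = -x;
      return x;
    }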
1371
1372.. option:: -ftree-switch-conversion
1373
1374 Perform conversion of simple initializations in a switch to
1375 initializations from a scalar array. This flag is enabled by default
1376 at :option:`-O2` and higher.
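
  A hand-written sketch of the transformation (function names and values
  are hypothetical): a switch whose cases only assign constants to one
  variable behaves like a bounds-checked load from a constant array.

  .. code-block:: c++

    // Before: each case performs a simple initialization.
    int days_switch(int m) {
      int d;
      switch (m) {
        case 0: d = 31; break;
        case 1: d = 28; break;
        case 2: d = 31; break;
        case 3: d = 30; break;
        default: d = 0; break;
      }
      return d;
    }

    // After, written by hand: a range check plus an array load.
    int days_table(int m) {
      static const int table[] = {31, 28, 31, 30};
      return (m >= 0 && m < 4) ? table[m] : 0;
    }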
1377
1378.. option:: -ftree-tail-merge
1379
1380 Look for identical code sequences. When found, replace one with a jump to the
1381 other. This optimization is known as tail merging or cross jumping. This flag
1382 is enabled by default at :option:`-O2` and higher. The compilation time
1383 spent in this pass can
1384 be limited using the max-tail-merge-comparisons and
1385 max-tail-merge-iterations parameters.
1386
1387.. option:: -ftree-dce
1388
1389 Perform dead code elimination (DCE) on trees. This flag is enabled by
1390 default at :option:`-O1` and higher.
1391
1392.. option:: -ftree-builtin-call-dce
1393
1394 Perform conditional dead code elimination (DCE) for calls to built-in functions
1395 that may set ``errno`` but are otherwise free of side effects. This flag is
1396 enabled by default at :option:`-O2` and higher if :option:`-Os` is not also
1397 specified.
1398
1399.. option:: -ffinite-loops
1400
1401 Assume that a loop with an exit will eventually take the exit and not loop
1402 indefinitely. This allows the compiler to remove loops that otherwise have
1403 no side effects, without treating potentially endless looping as a side effect.
1404
1405 This option is enabled by default at :option:`-O2` for C++ with :option:`-std=c++11`
1406 or higher.
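
  As a sketch (the function name is hypothetical), the loop below has an
  exit, computes nothing that is used afterwards and has no other side
  effects. Under the finite-loop assumption the compiler may delete it and
  return immediately.

  .. code-block:: c++

    unsigned discard_collatz(unsigned n) {
      // The loop exits when n reaches 1; n is not used after the loop.
      while (n != 1)
        n = (n % 2 != 0) ? 3 * n + 1 : n / 2;
      return 0;
    }

  As written, a call with ``n`` equal to 0 would never reach the exit, so
  the observable behavior of such a call depends on whether the loop is
  removed.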
1407
1408.. option:: -fno-finite-loops
1409
1410 Default setting; overrides :option:`-ffinite-loops`.
1411
1412.. option:: -ftree-dominator-opts
1413
1414 Perform a variety of simple scalar cleanups (constant/copy
1415 propagation, redundancy elimination, range propagation and expression
1416 simplification) based on a dominator tree traversal. This also
1417 performs jump threading (to reduce jumps to jumps). This flag is
1418 enabled by default at :option:`-O1` and higher.
1419
1420.. option:: -ftree-dse
1421
1422 Perform dead store elimination (DSE) on trees. A dead store is a store into
1423 a memory location that is later overwritten by another store without
1424 any intervening loads. In this case the earlier store can be deleted. This
1425 flag is enabled by default at :option:`-O1` and higher.
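
  A minimal sketch with a hypothetical function: the first store is
  overwritten by the second with no load in between, so it is dead and can
  be deleted.

  .. code-block:: c++

    void fill(int *out) {
      *out = 1;  // dead store: overwritten below before any load
      *out = 2;
    }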
1426
1427.. option:: -ftree-ch
1428
1429 Perform loop header copying on trees. This is beneficial since it increases
1430 effectiveness of code motion optimizations. It also saves one jump. This flag
1431 is enabled by default at :option:`-O1` and higher. It is not enabled
1432 for :option:`-Os`, since it usually increases code size.
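
  The effect can be sketched by hand (function names are hypothetical):
  the loop test is copied in front of the loop, which turns the ``while``
  loop into a guarded ``do``/``while`` loop whose body is known to run at
  least once.

  .. code-block:: c++

    // Before: the test is evaluated at the top of every iteration.
    int sum_while(const int *a, int n) {
      int s = 0;
      int i = 0;
      while (i < n) {
        s += a[i];
        ++i;
      }
      return s;
    }

    // After, written by hand: one copy of the test guards the loop, the
    // other sits at the bottom of the body.
    int sum_copied_header(const int *a, int n) {
      int s = 0;
      int i = 0;
      if (i < n) {
        do {
          s += a[i];
          ++i;
        } while (i < n);
      }
      return s;
    }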
1433
1434.. option:: -ftree-loop-optimize
1435
1436 Perform loop optimizations on trees. This flag is enabled by default
1437 at :option:`-O1` and higher.
1438
1439.. option:: -ftree-loop-linear, -floop-strip-mine, -floop-block
1440
1441 Perform loop nest optimizations. Same as
1442 :option:`-floop-nest-optimize`. To use this code transformation, GCC has
1443 to be configured with :option:`--with-isl` to enable the Graphite loop
1444 transformation infrastructure.
1445
1446.. option:: -fgraphite-identity
1447
1448 Enable the identity transformation for graphite. For every SCoP we generate
1449 the polyhedral representation and transform it back to gimple. Using
1450 :option:`-fgraphite-identity` we can check the costs or benefits of the
1451 GIMPLE -> GRAPHITE -> GIMPLE transformation. Some minimal optimizations
1452 are also performed by the code generator isl, like index splitting and
1453 dead code elimination in loops.
1454
1455.. option:: -floop-nest-optimize
1456
1457 Enable the isl based loop nest optimizer. This is a generic loop nest
1458 optimizer based on the Pluto optimization algorithms. It calculates a loop
1459 structure optimized for data-locality and parallelism. This option
1460 is experimental.
1461
1462.. option:: -floop-parallelize-all
1463
1464 Use the Graphite data dependence analysis to identify loops that can
1465 be parallelized. Parallelize all the loops that can be analyzed to
1466 not contain loop-carried dependences, without checking whether it is
1467 profitable to parallelize them.
1468
1469.. option:: -ftree-coalesce-vars
1470
1471 While transforming the program out of the SSA representation, attempt to
1472 reduce copying by coalescing versions of different user-defined
1473 variables, instead of just compiler temporaries. This may severely
1474 limit the ability to debug an optimized program compiled with
1475 :option:`-fno-var-tracking-assignments`. In the negated form, this flag
1476 prevents SSA coalescing of user variables. This option is enabled by
1477 default if optimization is enabled, and it does very little otherwise.
1478
1479.. option:: -ftree-loop-if-convert
1480
1481 Attempt to transform conditional jumps in the innermost loops to
1482 branch-less equivalents. The intent is to remove control-flow from
1483 the innermost loops in order to improve the ability of the
1484 vectorization pass to handle these loops. This is enabled by default
1485 if vectorization is enabled.
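
  A hand-written sketch of the idea (function names are hypothetical, and
  the compiler's own transformation handles conditional stores more
  carefully than this source-level rewrite): the conditional jump in the
  loop body is replaced by a select, leaving a loop body without control
  flow.

  .. code-block:: c++

    // Before: a conditional jump inside the innermost loop.
    void clamp_branchy(int *a, int n, int limit) {
      for (int i = 0; i < n; ++i)
        if (a[i] > limit)
          a[i] = limit;
    }

    // After, written by hand: the same result expressed as a select.
    void clamp_select(int *a, int n, int limit) {
      for (int i = 0; i < n; ++i)
        a[i] = (a[i] > limit) ? limit : a[i];
    }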
1486
1487.. option:: -ftree-loop-distribution
1488
1489 Perform loop distribution. This flag can improve cache performance on
1490 big loop bodies and allow further loop optimizations, like
1491 parallelization or vectorization, to take place. For example, the loop
1492
1493 .. code-block:: fortran
1494
1495 DO I = 1, N
1496 A(I) = B(I) + C
1497 D(I) = E(I) * F
1498 ENDDO
1499
1500 is transformed to
1501
1502 .. code-block:: fortran
1503
1504 DO I = 1, N
1505 A(I) = B(I) + C
1506 ENDDO
1507 DO I = 1, N
1508 D(I) = E(I) * F
1509 ENDDO
1510
1511 This flag is enabled by default at :option:`-O3`.
1512 It is also enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
1513
1514.. option:: -ftree-loop-distribute-patterns
1515
1516 Perform loop distribution of patterns that can be code generated with
1517 calls to a library. This flag is enabled by default at :option:`-O2` and
1518 higher, and by :option:`-fprofile-use` and :option:`-fauto-profile`.
1519
1520 This pass distributes the initialization loops and generates a call to
1521 memset zero. For example, the loop
1522
1523 .. code-block:: fortran
1524
1525 DO I = 1, N
1526 A(I) = 0
1527 B(I) = A(I) + I
1528 ENDDO
1529
1530 is transformed to
1531
1532 .. code-block:: fortran
1533
1534 DO I = 1, N
1535 A(I) = 0
1536 ENDDO
1537 DO I = 1, N
1538 B(I) = A(I) + I
1539 ENDDO
1540
1541 and the initialization loop is transformed into a call to memset zero.
1544
1545.. option:: -floop-interchange
1546
1547 Perform loop interchange outside of Graphite. This flag can improve cache
1548 performance on loop nests and allow further loop optimizations, like
1549 vectorization, to take place. For example, the loop
1550
1551 .. code-block:: c++
1552
1553 for (int i = 0; i < N; i++)
1554 for (int j = 0; j < N; j++)
1555 for (int k = 0; k < N; k++)
1556 c[i][j] = c[i][j] + a[i][k]*b[k][j];
1557
1558 is transformed to
1559
1560 .. code-block:: c++
1561
1562 for (int i = 0; i < N; i++)
1563 for (int k = 0; k < N; k++)
1564 for (int j = 0; j < N; j++)
1565 c[i][j] = c[i][j] + a[i][k]*b[k][j];
1566
1567 This flag is enabled by default at :option:`-O3`.
1568 It is also enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
1569
1570.. option:: -floop-unroll-and-jam
1571
1572 Apply unroll and jam transformations on feasible loops. In a loop
1573 nest this unrolls the outer loop by some factor and fuses the resulting
1574 multiple inner loops. This flag is enabled by default at :option:`-O3`.
1575 It is also enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
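
  A minimal hand-written sketch of the idea, assuming an unroll factor of 2
  and hypothetical array sizes (the factor GCC picks, and whether it applies
  the transformation at all, depends on its own analysis):

  .. code-block:: c

    #define ROWS 4   /* hypothetical sizes; ROWS is even so that */
    #define COLS 3   /* unrolling by 2 needs no remainder loop   */

    /* Original loop nest.  */
    void
    row_sums (int a[ROWS][COLS], int sum[ROWS])
    {
      for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
          sum[i] += a[i][j];
    }

    /* Outer loop unrolled by 2, and the two resulting inner loops
       fused ("jammed") into one.  */
    void
    row_sums_unroll_and_jam (int a[ROWS][COLS], int sum[ROWS])
    {
      for (int i = 0; i < ROWS; i += 2)
        for (int j = 0; j < COLS; j++)
          {
            sum[i] += a[i][j];
            sum[i + 1] += a[i + 1][j];
          }
    }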
1576
1577.. option:: -ftree-loop-im
1578
1579 Perform loop invariant motion on trees. This pass moves only invariants that
1580 are hard to handle at RTL level (function calls, operations that expand to
1581 nontrivial sequences of insns). With :option:`-funswitch-loops` it also moves
1582 operands of conditions that are invariant out of the loop, so that we can use
1583 just trivial invariantness analysis in loop unswitching. The pass also includes
1584 store motion.
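
  The following sketch shows, at source level, the kind of invariant this
  description refers to: a function call whose result does not change inside
  the loop. The function names are hypothetical, and the second function is
  a hand-written equivalent rather than compiler output.

  .. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /* strlen (ref) is recomputed on every iteration although ref
       never changes inside the loop.  */
    size_t
    count_longer (const char **words, size_t n, const char *ref)
    {
      size_t count = 0;
      for (size_t i = 0; i < n; i++)
        if (strlen (words[i]) > strlen (ref))
          count++;
      return count;
    }

    /* The invariant call is moved in front of the loop.  */
    size_t
    count_longer_hoisted (const char **words, size_t n, const char *ref)
    {
      size_t count = 0;
      size_t ref_len = strlen (ref);
      for (size_t i = 0; i < n; i++)
        if (strlen (words[i]) > ref_len)
          count++;
      return count;
    }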
1585
1586.. option:: -ftree-loop-ivcanon
1587
1588 Create a canonical counter for number of iterations in loops for which
1589 determining number of iterations requires complicated analysis. Later
1590 optimizations then may determine the number easily. Useful especially
1591 in connection with unrolling.
1592
1593.. option:: -ftree-scev-cprop
1594
1595 Perform final value replacement. If a variable is modified in a loop
1596 in such a way that its value when exiting the loop can be determined using
1597 only its initial value and the number of loop iterations, replace uses of
1598 the final value by such a computation, provided it is sufficiently cheap.
1599 This reduces data dependencies and may allow further simplifications.
1600 Enabled by default at :option:`-O1` and higher.
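
  For illustration, assuming a simple counting loop (the function names are
  hypothetical), the value of ``x`` after the loop depends only on its
  initial value and the iteration count, so uses of it can be replaced by a
  closed-form computation:

  .. code-block:: c

    /* x is advanced by 3 on each of the n iterations.  Unsigned
       arithmetic keeps wrap-around well defined.  */
    unsigned
    final_value_loop (unsigned start, unsigned n)
    {
      unsigned x = start;
      for (unsigned i = 0; i < n; i++)
        x += 3;
      return x;
    }

    /* The same final value computed directly.  */
    unsigned
    final_value_closed_form (unsigned start, unsigned n)
    {
      return start + 3u * n;
    }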
1601
1602.. option:: -fivopts
1603
1604 Perform induction variable optimizations (strength reduction, induction
1605 variable merging and induction variable elimination) on trees.
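
  As a source-level sketch of strength reduction (hypothetical function
  names; the real pass works on GCC's internal representation and weighs
  target costs), the multiplication ``i * stride`` can be replaced by an
  index that is advanced by addition:

  .. code-block:: c

    /* The index i * stride is recomputed with a multiplication on
       every iteration.  */
    void
    gather_indexed (int *dst, const int *src, int n, int stride)
    {
      for (int i = 0; i < n; i++)
        dst[i] = src[i * stride];
    }

    /* A second induction variable k tracks i * stride and is
       updated with an addition instead.  */
    void
    gather_strength_reduced (int *dst, const int *src, int n, int stride)
    {
      int k = 0;
      for (int i = 0; i < n; i++)
        {
          dst[i] = src[k];
          k += stride;
        }
    }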
1606
1607.. option:: -ftree-parallelize-loops=n
1608
1609 Parallelize loops, i.e., split their iteration space to run in :samp:`{n}` threads.
1610 This is only possible for loops whose iterations are independent
1611 and can be arbitrarily reordered. The optimization is only
1612 profitable on multiprocessor machines, for loops that are CPU-intensive,
1613 rather than constrained e.g. by memory bandwidth. This option
1614 implies :option:`-pthread`, and thus is only supported on targets
1615 that have support for :option:`-pthread`.
1616
1617.. option:: -ftree-pta
1618
1619 Perform function-local points-to analysis on trees. This flag is
1620 enabled by default at :option:`-O1` and higher, except for :option:`-Og`.
1621
1622.. option:: -ftree-sra
1623
1624 Perform scalar replacement of aggregates. This pass replaces structure
1625 references with scalars to prevent committing structures to memory too
1626 early. This flag is enabled by default at :option:`-O1` and higher,
1627 except for :option:`-Og`.
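
  A small sketch of the effect, using a hypothetical ``struct point``: the
  second function is what the first one amounts to once each member is
  replaced by an independent scalar.

  .. code-block:: c

    struct point
    {
      int x;
      int y;
    };

    /* The aggregate p is only ever accessed member by member.  */
    int
    squared_norm_aggregate (int x, int y)
    {
      struct point p;
      p.x = x;
      p.y = y;
      return p.x * p.x + p.y * p.y;
    }

    /* Hand-written equivalent with the members as plain scalars,
       which can live in registers.  */
    int
    squared_norm_scalars (int x, int y)
    {
      int px = x;
      int py = y;
      return px * px + py * py;
    }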
1628
1629.. option:: -fstore-merging
1630
1631 Perform merging of narrow stores to consecutive memory addresses. This pass
1632 merges contiguous stores of immediate values narrower than a word into fewer
1633 wider stores to reduce the number of instructions. This is enabled by default
1634 at :option:`-O2` and higher as well as :option:`-Os`.
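
  The sketch below illustrates the kind of code this targets. The second
  function only models the merged result with a single 4-byte ``memcpy``;
  it is an assumption for illustration, not GCC's output. Both names are
  hypothetical.

  .. code-block:: c

    #include <string.h>

    /* Four adjacent one-byte stores of immediate values.  */
    void
    store_narrow (unsigned char *p)
    {
      p[0] = 0x11;
      p[1] = 0x22;
      p[2] = 0x33;
      p[3] = 0x44;
    }

    /* The same bytes written as one 4-byte block, standing in for
       a single wider store.  */
    void
    store_wide (unsigned char *p)
    {
      static const unsigned char bytes[4] = { 0x11, 0x22, 0x33, 0x44 };
      memcpy (p, bytes, sizeof bytes);
    }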
1635
1636.. option:: -ftree-ter
1637
1638 Perform temporary expression replacement during the SSA->normal phase. Single
1639 use/single def temporaries are replaced at their use location with their
1640 defining expression. This results in non-GIMPLE code, but gives the expanders
1641 much more complex trees to work on resulting in better RTL generation. This is
1642 enabled by default at :option:`-O1` and higher.
1643
1644.. option:: -ftree-slsr
1645
1646 Perform straight-line strength reduction on trees. This recognizes related
1647 expressions involving multiplications and replaces them by less expensive
1648 calculations when possible. This is enabled by default at :option:`-O1` and
1649 higher.
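
  As an illustrative sketch with hypothetical function names, three related
  multiplications in straight-line code can be replaced by one
  multiplication followed by additions:

  .. code-block:: c

    /* Three related products, each computed with a multiplication.  */
    void
    products_original (int *out, int base, int stride)
    {
      out[0] = base * stride;
      out[1] = (base + 1) * stride;
      out[2] = (base + 2) * stride;
    }

    /* Each later product is derived from the previous one by
       adding stride.  */
    void
    products_reduced (int *out, int base, int stride)
    {
      int t = base * stride;
      out[0] = t;
      t += stride;
      out[1] = t;
      t += stride;
      out[2] = t;
    }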
1650
1651.. option:: -ftree-vectorize
1652
1653 Perform vectorization on trees. This flag enables :option:`-ftree-loop-vectorize`
1654 and :option:`-ftree-slp-vectorize` if not explicitly specified.
1655
1656.. option:: -ftree-loop-vectorize
1657
1658 Perform loop vectorization on trees. This flag is enabled by default at
1659 :option:`-O2` and by :option:`-ftree-vectorize`, :option:`-fprofile-use`,
1660 and :option:`-fauto-profile`.
1661
1662.. option:: -ftree-slp-vectorize
1663
1664 Perform basic block vectorization on trees. This flag is enabled by default at
1665 :option:`-O2` and by :option:`-ftree-vectorize`, :option:`-fprofile-use`,
1666 and :option:`-fauto-profile`.
1667
1668.. option:: -ftrivial-auto-var-init={choice}
1669
1670 Initialize automatic variables with either a pattern or with zeroes to increase
1671 the security and predictability of a program by preventing uninitialized memory
1672 disclosure and use.
1673 GCC still considers an automatic variable that doesn't have an explicit
1674 initializer as uninitialized; :option:`-Wuninitialized` and
1675 :option:`-Wanalyzer-use-of-uninitialized-value` will still report
1676 warning messages on such automatic variables.
1677 With this option, GCC will also initialize any padding of automatic variables
1678 that have structure or union types to zeroes.
1679 However, the current implementation cannot initialize automatic variables that
1680 are declared between the controlling expression and the first case of a
1681 ``switch`` statement. Use :option:`-Wtrivial-auto-var-init` to report all
1682 such cases.
1683
1684 The three values of :samp:`{choice}` are:
1685
1686 * :samp:`uninitialized` doesn't initialize any automatic variables.
1687 This is C and C++'s default.
1688
1689 * :samp:`pattern` Initialize automatic variables with values which will likely
1690 transform logic bugs into crashes down the line, are easily recognized in a
1691 crash dump and without being values that programmers can rely on for useful
1692 program semantics.
1693 The current value is a byte-repeatable pattern with byte "0xFE".
1694 The values used for pattern initialization might be changed in the future.
1695
1696 * :samp:`zero` Initialize automatic variables with zeroes.
1697
1698 The default is :samp:`uninitialized`.
1699
1700 You can control this behavior for a specific variable by using the variable
1701 attribute :var-attr:`uninitialized` (see :ref:`variable-attributes`).
1702
1703.. option:: -fvect-cost-model={model}
1704
1705 Alter the cost model used for vectorization. The :samp:`{model}` argument
1706 should be one of :samp:`unlimited`, :samp:`dynamic`, :samp:`cheap` or
1707 :samp:`very-cheap`.
1708 With the :samp:`unlimited` model the vectorized code-path is assumed
1709 to be profitable while with the :samp:`dynamic` model a runtime check
1710 guards the vectorized code-path to enable it only for iteration
1711 counts that will likely execute faster than when executing the original
1712 scalar loop. The :samp:`cheap` model disables vectorization of
1713 loops where doing so would be cost prohibitive, for example due to
1714 required runtime checks for data dependence or alignment, but otherwise
1715 is equal to the :samp:`dynamic` model. The :samp:`very-cheap` model only
1716 allows vectorization if the vector code would entirely replace the
1717 scalar code that is being vectorized. For example, if each iteration
1718 of a vectorized loop would only be able to handle exactly four iterations
1719 of the scalar loop, the :samp:`very-cheap` model would only allow
1720 vectorization if the scalar iteration count is known to be a multiple
1721 of four.
1722
1723 The default cost model depends on other optimization flags and is
1724 either :samp:`dynamic` or :samp:`cheap`.
1725
1726.. option:: -fsimd-cost-model={model}
1727
1728 Alter the cost model used for vectorization of loops marked with the OpenMP
1729 simd directive. The :samp:`{model}` argument should be one of
1730 :samp:`unlimited`, :samp:`dynamic`, or :samp:`cheap`. All values of :samp:`{model}`
1731 have the same meaning as described in :option:`-fvect-cost-model` and by
1732 default a cost model defined with :option:`-fvect-cost-model` is used.
1733
1734.. option:: -ftree-vrp
1735
1736 Perform Value Range Propagation on trees. This is similar to the
1737 constant propagation pass, but instead of values, ranges of values are
1738 propagated. This allows the optimizers to remove unnecessary range
1739 checks like array bound checks and null pointer checks. This is
1740 enabled by default at :option:`-O2` and higher. Null pointer check
1741 elimination is only done if :option:`-fdelete-null-pointer-checks` is
1742 enabled.
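
  A minimal sketch of a removable range check, with hypothetical function
  names: after masking, the index is known to lie in the range 0 to 15, so
  the bounds test can never succeed.

  .. code-block:: c

    /* i is known to be in [0, 15], so the check below is dead.  */
    int
    lookup_checked (const int *table, unsigned idx)
    {
      unsigned i = idx & 15u;
      if (i >= 16u)
        return -1;
      return table[i];
    }

    /* What remains once the unnecessary check is removed.  */
    int
    lookup_simplified (const int *table, unsigned idx)
    {
      return table[idx & 15u];
    }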
1743
1744.. option:: -fsplit-paths
1745
1746 Split paths leading to loop backedges. This can improve dead code
1747 elimination and common subexpression elimination. This is enabled by
1748 default at :option:`-O3` and above.
1749
1750.. option:: -fsplit-ivs-in-unroller
1751
1752 Enables expression of values of induction variables in later iterations
1753 of the unrolled loop using the value in the first iteration. This breaks
1754 long dependency chains, thus improving efficiency of the scheduling passes.
1755
1756 A combination of :option:`-fweb` and CSE is often sufficient to obtain the
1757 same effect. However, that is not reliable in cases where the loop body
1758 is more complicated than a single basic block. It also does not work at all
1759 on some architectures due to restrictions in the CSE pass.
1760
1761 This optimization is enabled by default.
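
  The difference can be sketched at source level for a loop unrolled by 4
  (hypothetical function names; ``n`` is assumed to be a multiple of 4 so
  that no remainder loop is needed):

  .. code-block:: c

    /* Each copy of the body needs the index produced by the
       previous copy, forming one long dependency chain.  */
    void
    fill_chained (int *a, int n)
    {
      for (int i = 0; i < n;)
        {
          a[i] = i; i = i + 1;
          a[i] = i; i = i + 1;
          a[i] = i; i = i + 1;
          a[i] = i; i = i + 1;
        }
    }

    /* Each copy expresses its index from the value in the first
       iteration, so the copies are independent of each other.  */
    void
    fill_split (int *a, int n)
    {
      for (int i = 0; i < n; i += 4)
        {
          a[i] = i;
          a[i + 1] = i + 1;
          a[i + 2] = i + 2;
          a[i + 3] = i + 3;
        }
    }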
1762
1763.. option:: -fvariable-expansion-in-unroller
1764
1765 With this option, the compiler creates multiple copies of some
1766 local variables when unrolling a loop, which can result in superior code.
1767
1768 This optimization is enabled by default for PowerPC targets, but disabled
1769 by default otherwise.
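
  A hand-written sketch of the idea, assuming a reduction unrolled by 2 and
  an even element count (both function names are hypothetical): giving each
  unrolled copy its own accumulator removes the dependency between
  consecutive additions.

  .. code-block:: c

    /* Unrolled by 2 with a single accumulator: the second addition
       must wait for the first.  */
    unsigned
    sum_single (const unsigned *a, int n)
    {
      unsigned acc = 0;
      for (int i = 0; i < n; i += 2)
        {
          acc += a[i];
          acc += a[i + 1];
        }
      return acc;
    }

    /* The accumulator is expanded into two copies that are combined
       after the loop.  */
    unsigned
    sum_expanded (const unsigned *a, int n)
    {
      unsigned acc0 = 0;
      unsigned acc1 = 0;
      for (int i = 0; i < n; i += 2)
        {
          acc0 += a[i];
          acc1 += a[i + 1];
        }
      return acc0 + acc1;
    }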
1770
1771.. option:: -fpartial-inlining
1772
1773 Inline parts of functions. This option has an effect only
1774 when inlining itself is turned on by the :option:`-finline-functions`
1775 or :option:`-finline-small-functions` options.
1776
1777 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
1778
1779.. option:: -fpredictive-commoning
1780
1781 Perform predictive commoning optimization, i.e., reusing computations
1782 (especially memory loads and stores) performed in previous
1783 iterations of loops.
1784
1785 This option is enabled at level :option:`-O3`.
1786 It is also enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
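
  As an illustration with hypothetical function names, the load of
  ``a[i + 1]`` in one iteration can be reused as ``a[i]`` in the next one.
  The array ``a`` is assumed to hold ``n + 1`` elements.

  .. code-block:: c

    /* Every iteration loads both a[i] and a[i + 1].  */
    void
    pair_sums (const int *a, int *b, int n)
    {
      for (int i = 0; i < n; i++)
        b[i] = a[i] + a[i + 1];
    }

    /* The value loaded as a[i + 1] is kept and reused as a[i] in
       the following iteration, halving the number of loads.  */
    void
    pair_sums_commoned (const int *a, int *b, int n)
    {
      if (n <= 0)
        return;
      int prev = a[0];
      for (int i = 0; i < n; i++)
        {
          int next = a[i + 1];
          b[i] = prev + next;
          prev = next;
        }
    }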
1787
1788.. option:: -fprefetch-loop-arrays
1789
1790 If supported by the target machine, generate instructions to prefetch
1791 memory to improve the performance of loops that access large arrays.
1792
1793 This option may generate better or worse code; results are highly
1794 dependent on the structure of loops within the source code.
1795
1796 Disabled at level :option:`-Os`.
1797
1798.. option:: -fno-printf-return-value
1799
1800 Do not substitute constants for known return value of formatted output
1801 functions such as ``sprintf``, ``snprintf``, ``vsprintf``, and
1802 ``vsnprintf`` (but not ``printf`` or ``fprintf``). This
1803 transformation allows GCC to optimize or even eliminate branches based
1804 on the known return value of these functions called with arguments that
1805 are either constant, or whose values are known to be in a range that
1806 makes determining the exact return value possible. For example, when
1807 :option:`-fprintf-return-value` is in effect, both the branch and the
1808 body of the ``if`` statement (but not the call to ``snprintf``)
1809 can be optimized away when ``i`` is a 32-bit or smaller integer
1810 because the return value is guaranteed to be at most 8.
1811
1812 .. code-block:: c++
1813
1814 char buf[9];
1815 if (snprintf (buf, sizeof buf, "%08x", i) >= sizeof buf)
1816 ...
1817
1818 The :option:`-fprintf-return-value` option relies on other optimizations
1819 and yields best results with :option:`-O2` and above. It works in tandem
1820 with the :option:`-Wformat-overflow` and :option:`-Wformat-truncation`
1821 options. The :option:`-fprintf-return-value` option is enabled by default.
1822
1823.. option:: -fprintf-return-value
1824
1825 Default setting; overrides :option:`-fno-printf-return-value`.
1826
1827.. option:: -fno-peephole, -fno-peephole2, -fpeephole, -fpeephole2
1828
1829 Disable any machine-specific peephole optimizations. The difference
1830 between :option:`-fno-peephole` and :option:`-fno-peephole2` is in how they
1831 are implemented in the compiler; some targets use one, some use the
1832 other, a few use both.
1833
1834 :option:`-fpeephole` is enabled by default.
1835 :option:`-fpeephole2` is enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
1836
1837.. option:: -fno-guess-branch-probability
1838
1839 Do not guess branch probabilities using heuristics.
1840
1841 GCC uses heuristics to guess branch probabilities if they are
1842 not provided by profiling feedback (:option:`-fprofile-arcs`). These
1843 heuristics are based on the control flow graph. If some branch probabilities
1844 are specified by ``__builtin_expect``, then the heuristics are
1845 used to guess branch probabilities for the rest of the control flow graph,
1846 taking the ``__builtin_expect`` info into account. The interactions
1847 between the heuristics and ``__builtin_expect`` can be complex, and in
1848 some cases, it may be useful to disable the heuristics so that the effects
1849 of ``__builtin_expect`` are easier to understand.
1850
1851 It is also possible to specify expected probability of the expression
1852 with ``__builtin_expect_with_probability`` built-in function.
1853
1854 The default is :option:`-fguess-branch-probability` at levels
1855 :option:`-O`, :option:`-O2`, :option:`-O3`, :option:`-Os`.
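
  A short sketch of a ``__builtin_expect`` annotation (the function name
  ``checked_divide`` is hypothetical). The builtin returns the value of its
  first argument; the second argument only tells the compiler which value
  is expected.

  .. code-block:: c

    /* Returns 0 on success and -1 for division by zero.  The error
       path is marked as unlikely.  */
    int
    checked_divide (int num, int den, int *out)
    {
      if (__builtin_expect (den == 0, 0))
        return -1;
      *out = num / den;
      return 0;
    }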
1856
1857.. option:: -fguess-branch-probability
1858
1859 Default setting; overrides :option:`-fno-guess-branch-probability`.
1860
1861.. option:: -freorder-blocks
1862
1863 Reorder basic blocks in the compiled function in order to reduce the number of
1864 taken branches and improve code locality.
1865
1866 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`, :option:`-Os`.
1867
1868.. option:: -freorder-blocks-algorithm={algorithm}
1869
1870 Use the specified algorithm for basic block reordering. The
1871 :samp:`{algorithm}` argument can be :samp:`simple`, which does not increase
1872 code size (except sometimes due to secondary effects like alignment),
1873 or :samp:`stc`, the 'software trace cache' algorithm, which tries to
1874 put all often executed code together, minimizing the number of branches
1875 executed by making extra copies of code.
1876
1877 The default is :samp:`simple` at levels :option:`-O1`, :option:`-Os`, and
1878 :samp:`stc` at levels :option:`-O2`, :option:`-O3`.
1879
1880.. option:: -freorder-blocks-and-partition
1881
1882 In addition to reordering basic blocks in the compiled function to reduce
1883 the number of taken branches, partition hot and cold basic blocks
1884 into separate sections of the assembly and :samp:`.o` files, to improve
1885 paging and cache locality performance.
1886
1887 This optimization is automatically turned off in the presence of
1888 exception handling or unwind tables (on targets using setjmp/longjmp or a target-specific scheme), for linkonce sections, for functions with a user-defined
1889 section attribute and on any architecture that does not support named
1890 sections. When :option:`-fsplit-stack` is used this option is not
1891 enabled by default (to avoid linker errors), but may be enabled
1892 explicitly (if using a working linker).
1893
1894 Enabled for x86 at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
1895
1896.. option:: -freorder-functions
1897
1898 Reorder functions in the object file in order to
1899 improve code locality. This is implemented by using special
1900 subsections ``.text.hot`` for most frequently executed functions and
1901 ``.text.unlikely`` for rarely executed functions. Reordering is done by
1902 the linker, so the object file format must support named sections and the linker must
1903 place them in a reasonable way.
1904
1905 This option isn't effective unless you either provide profile feedback
1906 (see :option:`-fprofile-arcs` for details) or manually annotate functions with
1907 :fn-attr:`hot` or :fn-attr:`cold` attributes (see :ref:`common-function-attributes`).
1908
1909 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
1910
1911.. _type-punning:
1912
1913.. option:: -fstrict-aliasing
1914
1915 Allow the compiler to assume the strictest aliasing rules applicable to
1916 the language being compiled. For C (and C++), this activates
1917 optimizations based on the type of expressions. In particular, an
1918 object of one type is assumed never to reside at the same address as an
1919 object of a different type, unless the types are almost the same. For
1920 example, an ``unsigned int`` can alias an ``int``, but not a
1921 ``void*`` or a ``double``. A character type may alias any other
1922 type.
1923
1924 Pay special attention to code like this:
1925
1926 .. code-block:: c++
1927
1928 union a_union {
1929 int i;
1930 double d;
1931 };
1932
1933 int f() {
1934 union a_union t;
1935 t.d = 3.0;
1936 return t.i;
1937 }
1938
1939 The practice of reading from a different union member than the one most
1940 recently written to (called 'type-punning') is common. Even with
1941 :option:`-fstrict-aliasing`, type-punning is allowed, provided the memory
1942 is accessed through the union type. So, the code above works as
1943 expected. See :ref:`structures-unions-enumerations-and-bit-fields-implementation`. However, this code might not:
1944
1945 .. code-block:: c++
1946
1947 int f() {
1948 union a_union t;
1949 int* ip;
1950 t.d = 3.0;
1951 ip = &t.i;
1952 return *ip;
1953 }
1954
1955 Similarly, access by taking the address, casting the resulting pointer
1956 and dereferencing the result has undefined behavior, even if the cast
1957 uses a union type, e.g.:
1958
1959 .. code-block:: c++
1960
1961 int f() {
1962 double d = 3.0;
1963 return ((union a_union *) &d)->i;
1964 }
1965
1966 The :option:`-fstrict-aliasing` option is enabled at levels
1967 :option:`-O2`, :option:`-O3`, :option:`-Os`.
1968
1969.. option:: -fipa-strict-aliasing
1970
1971 Controls whether rules of :option:`-fstrict-aliasing` are applied across
1972 function boundaries. Note that if multiple functions get inlined into a
1973 single function the memory accesses are no longer considered to be crossing a
1974 function boundary.
1975
1976 The :option:`-fipa-strict-aliasing` option is enabled by default and is
1977 effective only in combination with :option:`-fstrict-aliasing`.
1978
1979.. option:: -falign-functions[={n}[:{m}[:{n2}[:{m2}]]]]
1980
1981 Align the start of functions to the next power-of-two greater than or
1982 equal to :samp:`{n}`, skipping up to :samp:`{m}`-1 bytes. This ensures that at
1983 least the first :samp:`{m}` bytes of the function can be fetched by the CPU
1984 without crossing an :samp:`{n}`-byte alignment boundary.
1985
1986 If :samp:`{m}` is not specified, it defaults to :samp:`{n}`.
1987
1988 Examples: :option:`-falign-functions=32` aligns functions to the next
1989 32-byte boundary, :option:`-falign-functions=24` aligns to the next
1990 32-byte boundary only if this can be done by skipping 23 bytes or less,
1991 :option:`-falign-functions=32:7` aligns to the next
1992 32-byte boundary only if this can be done by skipping 6 bytes or less.
1993
1994 The second pair of :samp:`{n2}:{m2}` values allows you to specify
1995 a secondary alignment: :option:`-falign-functions=64:7:32:3` aligns to
1996 the next 64-byte boundary if this can be done by skipping 6 bytes or less,
1997 otherwise aligns to the next 32-byte boundary if this can be done
1998 by skipping 2 bytes or less.
1999 If :samp:`{m2}` is not specified, it defaults to :samp:`{n2}`.
2000
2001 Some assemblers only support this flag when :samp:`{n}` is a power of two;
2002 in that case, it is rounded up.
2003
2004 :option:`-fno-align-functions` and :option:`-falign-functions=1` are
2005 equivalent and mean that functions are not aligned.
2006
2007 If :samp:`{n}` is not specified or is zero, use a machine-dependent default.
2008 The maximum allowed :samp:`{n}` option value is 65536.
2009
2010 Enabled at levels :option:`-O2`, :option:`-O3`.
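
  The arithmetic behind the :samp:`{n}:{m}` rule can be modelled as below.
  This is only a sketch of the rule as described above, not of what the
  assembler does internally; ``model_align`` and ``round_up_pow2`` are
  hypothetical helpers, and ``m == 0`` stands for an omitted :samp:`{m}`.

  .. code-block:: c

    /* Smallest power of two that is >= n (n is assumed to be >= 1).  */
    static unsigned long
    round_up_pow2 (unsigned long n)
    {
      unsigned long p = 1;
      while (p < n)
        p <<= 1;
      return p;
    }

    /* Address at which a function would start if the current address
       is addr: the next boundary if it can be reached by skipping at
       most m - 1 bytes, otherwise addr itself.  */
    unsigned long
    model_align (unsigned long addr, unsigned long n, unsigned long m)
    {
      unsigned long boundary = round_up_pow2 (n);
      unsigned long padding = (boundary - addr % boundary) % boundary;
      if (m == 0)
        m = n;   /* m defaults to n */
      return padding <= m - 1 ? addr + padding : addr;
    }

  With these assumptions, ``model_align (125, 32, 7)`` yields 128 because
  only 3 bytes are skipped, while ``model_align (100, 32, 7)`` leaves the
  address at 100 because reaching 128 would skip 28 bytes.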
2011
2012.. option:: -flimit-function-alignment
2013
2014 If this option is enabled, the compiler tries to avoid unnecessarily
2015 overaligning functions. It attempts to instruct the assembler to align
2016 by the amount specified by :option:`-falign-functions`, but not to
2017 skip more bytes than the size of the function.
2018
2019.. option:: -falign-labels[={n}[:{m}[:{n2}[:{m2}]]]]
2020
2021 Align all branch targets to a power-of-two boundary.
2022
2023 Parameters of this option are analogous to the :option:`-falign-functions` option.
2024 :option:`-fno-align-labels` and :option:`-falign-labels=1` are
2025 equivalent and mean that labels are not aligned.
2026
2027 If :option:`-falign-loops` or :option:`-falign-jumps` are applicable and
2028 are greater than this value, then their values are used instead.
2029
2030 If :samp:`{n}` is not specified or is zero, use a machine-dependent default
2031 which is very likely to be :samp:`1`, meaning no alignment.
2032 The maximum allowed :samp:`{n}` option value is 65536.
2033
2034 Enabled at levels :option:`-O2`, :option:`-O3`.
2035
2036.. option:: -falign-loops[={n}[:{m}[:{n2}[:{m2}]]]]
2037
2038 Align loops to a power-of-two boundary. If the loops are executed
2039 many times, this makes up for any execution of the dummy padding
2040 instructions.
2041
2042 If :option:`-falign-labels` is greater than this value, then its value
2043 is used instead.
2044
2045 Parameters of this option are analogous to the :option:`-falign-functions` option.
2046 :option:`-fno-align-loops` and :option:`-falign-loops=1` are
2047 equivalent and mean that loops are not aligned.
2048 The maximum allowed :samp:`{n}` option value is 65536.
2049
2050 If :samp:`{n}` is not specified or is zero, use a machine-dependent default.
2051
2052 Enabled at levels :option:`-O2`, :option:`-O3`.
2053
2054.. option:: -falign-jumps[={n}[:{m}[:{n2}[:{m2}]]]]
2055
2056 Align branch targets to a power-of-two boundary, for branch targets
2057 where the targets can only be reached by jumping. In this case,
2058 no dummy operations need be executed.
2059
2060 If :option:`-falign-labels` is greater than this value, then its value
2061 is used instead.
2062
2063 Parameters of this option are analogous to the :option:`-falign-functions` option.
2064 :option:`-fno-align-jumps` and :option:`-falign-jumps=1` are
2065 equivalent and mean that jump targets are not aligned.
2066
2067 If :samp:`{n}` is not specified or is zero, use a machine-dependent default.
2068 The maximum allowed :samp:`{n}` option value is 65536.
2069
2070 Enabled at levels :option:`-O2`, :option:`-O3`.
2071
2072.. option:: -fno-allocation-dce
2073
2074 Do not remove unused C++ allocations in dead code elimination.
2075
2076.. option:: -fallow-store-data-races
2077
2078 Allow the compiler to perform optimizations that may introduce new data races
2079 on stores, without proving that the variable cannot be concurrently accessed
2080 by other threads. Does not affect optimization of local data. It is safe to
2081 use this option if it is known that global data will not be accessed by
2082 multiple threads.
2083
2084 Examples of optimizations enabled by :option:`-fallow-store-data-races` include
2085 hoisting or if-conversions that may cause a value that was already in memory
2086 to be re-written with that same value. Such re-writing is safe in a single
2087 threaded context but may be unsafe in a multi-threaded context. Note that on
2088 some processors, if-conversions may be required in order to enable
2089 vectorization.
2090
2091 Enabled at level :option:`-Ofast`.
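
  The sketch below shows, in source form, the kind of if-conversion the
  description refers to. The names are hypothetical, and the second
  function is a hand-written stand-in for the transformed code.

  .. code-block:: c

    int shared_flag;   /* stands in for data visible to other threads */

    /* The store happens only when cond is true.  */
    void
    set_flag_conditional (int cond)
    {
      if (cond)
        shared_flag = 1;
    }

    /* If-converted form: the store is unconditional and rewrites
       the old value when cond is false.  Another thread that updates
       shared_flag between the load and the store can have its update
       lost, which is the new data race.  */
    void
    set_flag_if_converted (int cond)
    {
      int old = shared_flag;
      shared_flag = cond ? 1 : old;
    }

  In a single-threaded program both functions leave ``shared_flag`` with
  the same value.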
2092
2093.. option:: -funit-at-a-time
2094
2095 This option is left for compatibility reasons. :option:`-funit-at-a-time`
2096 has no effect, while :option:`-fno-unit-at-a-time` implies
2097 :option:`-fno-toplevel-reorder` and :option:`-fno-section-anchors`.
2098
2099 Enabled by default.
2100
2101.. option:: -fno-toplevel-reorder
2102
2103 Do not reorder top-level functions, variables, and ``asm``
2104 statements. Output them in the same order that they appear in the
2105 input file. When this option is used, unreferenced static variables
2106 are not removed. This option is intended to support existing code
2107 that relies on a particular ordering. For new code, it is better to
2108 use attributes when possible.
2109
2110 :option:`-ftoplevel-reorder` is the default at :option:`-O1` and higher, and
2111 also at :option:`-O0` if :option:`-fsection-anchors` is explicitly requested.
2112 Additionally :option:`-fno-toplevel-reorder` implies
2113 :option:`-fno-section-anchors`.
2114
2115.. option:: -ftoplevel-reorder
2116
2117 Default setting; overrides :option:`-fno-toplevel-reorder`.
2118
2119.. option:: -funreachable-traps
2120
2121 With this option, the compiler turns calls to
2122 ``__builtin_unreachable`` into traps, instead of using them for
2123 optimization. This also affects any such calls implicitly generated
2124 by the compiler.
2125
2126 This option has the same effect as :option:`-fsanitize=unreachable
2127 -fsanitize-trap=unreachable`, but does not affect the values of those
2128 options. If :option:`-fsanitize=unreachable` is enabled, that option
2129 takes priority over this one.
2130
2131 This option is enabled by default at :option:`-O0` and :option:`-Og`.
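
  For illustration, consider a function whose caller promises an argument
  in the range 0 to 2 (``level_weight`` is a hypothetical name). Without
  this option the compiler may use the ``__builtin_unreachable`` call to
  assume no other value occurs; with it, reaching the call traps instead.

  .. code-block:: c

    /* Callers are expected to pass 0, 1 or 2 only.  */
    int
    level_weight (int level)
    {
      switch (level)
        {
        case 0:
          return 1;
        case 1:
          return 10;
        case 2:
          return 100;
        }
      /* Reached only if the caller breaks the promise above.  */
      __builtin_unreachable ();
    }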
2132
2133.. option:: -fweb
2134
2135 Construct webs as commonly used for register allocation purposes and assign
2136 each web an individual pseudo register. This allows the register allocation pass
2137 to operate on pseudos directly, but also strengthens several other optimization
2138 passes, such as CSE, loop optimizer and trivial dead code remover. It can,
2139 however, make debugging impossible, since variables no longer stay in a
2140 'home register'.
2141
2142 Enabled by default with :option:`-funroll-loops`.
2143
2144.. option:: -fwhole-program
2145
2146 Assume that the current compilation unit represents the whole program being
2147 compiled. All public functions and variables with the exception of ``main``
2148 and those merged by attribute :fn-attr:`externally_visible` become static functions
2149 and in effect are optimized more aggressively by interprocedural optimizers.
2150
2151 This option should not be used in combination with :option:`-flto`.
2152 Instead, relying on a linker plugin should provide safer and more precise
2153 information.
2154
2155.. option:: -flto[={n}]
2156
2157 This option runs the standard link-time optimizer. When invoked
2158 with source code, it generates GIMPLE (one of GCC's internal
2159 representations) and writes it to special ELF sections in the object
2160 file. When the object files are linked together, all the function
2161 bodies are read from these ELF sections and instantiated as if they
2162 had been part of the same translation unit.
2163
2164 To use the link-time optimizer, :option:`-flto` and optimization
2165 options should be specified at compile time and during the final link.
2166 It is recommended that you compile all the files participating in the
2167 same link with the same options and also specify those options at
2168 link time.
2169 For example:
2170
2171 .. code-block:: shell
2172
2173 gcc -c -O2 -flto foo.c
2174 gcc -c -O2 -flto bar.c
2175 gcc -o myprog -flto -O2 foo.o bar.o
2176
2177 The first two invocations to GCC save a bytecode representation
2178 of GIMPLE into special ELF sections inside :samp:`foo.o` and
2179 :samp:`bar.o`. The final invocation reads the GIMPLE bytecode from
2180 :samp:`foo.o` and :samp:`bar.o`, merges the two files into a single
2181 internal image, and compiles the result as usual. Since both
2182 :samp:`foo.o` and :samp:`bar.o` are merged into a single image, this
2183 causes all the interprocedural analyses and optimizations in GCC to
2184 work across the two files as if they were a single one. This means,
2185 for example, that the inliner is able to inline functions in
2186 :samp:`bar.o` into functions in :samp:`foo.o` and vice-versa.
2187
2188 Another (simpler) way to enable link-time optimization is:
2189
2190 .. code-block:: shell
2191
2192 gcc -o myprog -flto -O2 foo.c bar.c
2193
2194 The above generates bytecode for :samp:`foo.c` and :samp:`bar.c`,
2195 merges them together into a single GIMPLE representation and optimizes
2196 them as usual to produce :samp:`myprog`.
2197
2198 The important thing to keep in mind is that to enable link-time
2199 optimizations you need to use the GCC driver to perform the link step.
2200 GCC automatically performs link-time optimization if any of the
2201 objects involved were compiled with the :option:`-flto` command-line option.
2202 You can always override
2203 the automatic decision to do link-time optimization
2204 by passing :option:`-fno-lto` to the link command.
2205
2206 To make whole program optimization effective, it is necessary to make
2207 certain whole program assumptions. The compiler needs to know
2208 what functions and variables can be accessed by libraries and runtime
2209 outside of the link-time optimized unit. When supported by the linker,
2210 the linker plugin (see :option:`-fuse-linker-plugin`) passes information
2211 to the compiler about used and externally visible symbols. When
2212 the linker plugin is not available, :option:`-fwhole-program` should be
2213 used to allow the compiler to make these assumptions, which leads
2214 to more aggressive optimization decisions.
2215
2216 When a file is compiled with :option:`-flto` without
2217 :option:`-fuse-linker-plugin`, the generated object file is larger than
2218 a regular object file because it contains GIMPLE bytecodes and the usual
2219 final code (see :option:`-ffat-lto-objects`). This means that
2220 object files with LTO information can be linked as normal object
2221 files; if :option:`-fno-lto` is passed to the linker, no
 interprocedural optimizations are applied. Note that when
 :option:`-fno-fat-lto-objects` is enabled the compile stage is faster
 but you cannot perform a regular, non-LTO link on the resulting object files.
2225
2226 When producing the final binary, GCC only
2227 applies link-time optimizations to those files that contain bytecode.
2228 Therefore, you can mix and match object files and libraries with
2229 GIMPLE bytecodes and final object code. GCC automatically selects
2230 which files to optimize in LTO mode and which files to link without
2231 further processing.
2232
2233 Generally, options specified at link time override those
2234 specified at compile time, although in some cases GCC attempts to infer
2235 link-time options from the settings used to compile the input files.
2236
2237 If you do not specify an optimization level option :option:`-O` at
2238 link time, then GCC uses the highest optimization level
2239 used when compiling the object files. Note that it is generally
2240 ineffective to specify an optimization level option only at link time and
2241 not at compile time, for two reasons. First, compiling without
2242 optimization suppresses compiler passes that gather information
2243 needed for effective optimization at link time. Second, some early
2244 optimization passes can be performed only at compile time and
2245 not at link time.
2246
2247 There are some code generation flags preserved by GCC when
2248 generating bytecodes, as they need to be used during the final link.
2249 Currently, the following options and their settings are taken from
2250 the first object file that explicitly specifies them:
2251 :option:`-fcommon`, :option:`-fexceptions`, :option:`-fnon-call-exceptions`,
2252 :option:`-fgnu-tm` and all the :option:`-m` target flags.
2253
2254 The following options :option:`-fPIC`, :option:`-fpic`, :option:`-fpie` and
2255 :option:`-fPIE` are combined based on the following scheme:
2256
2257 .. list-table::
2258 :header-rows: 1
2259
2260 * - argument 1
2261 - argument 2
2262 - output
2263
2264 * - :option:`-fPIC`
2265 - :option:`-fpic`
2266 - :option:`-fpic`
2267 * - :option:`-fPIC`
2268 - :option:`-fno-pic`
2269 - :option:`-fno-pic`
2270 * - :option:`-fpic`/:option:`-fPIC`
2271 - no option
2272 - no option
2273 * - :option:`-fPIC`
2274 - :option:`-fPIE`
2275 - :option:`-fPIE`
2276 * - :option:`-fpic`
2277 - :option:`-fPIE`
2278 - :option:`-fpie`
2279 * - :option:`-fPIC`/:option:`-fpic`
2280 - :option:`-fpie`
2281 - :option:`-fpie`
2282
2283 Certain ABI-changing flags are required to match in all compilation units,
2284 and trying to override this at link time with a conflicting value
2285 is ignored. This includes options such as :option:`-freg-struct-return`
2286 and :option:`-fpcc-struct-return`.
2287
2288 Other options such as :option:`-ffp-contract`, :option:`-fno-strict-overflow`,
2289 :option:`-fwrapv`, :option:`-fno-trapv` or :option:`-fno-strict-aliasing`
2290 are passed through to the link stage and merged conservatively for
2291 conflicting translation units. Specifically
2292 :option:`-fno-strict-overflow`, :option:`-fwrapv` and :option:`-fno-trapv` take
2293 precedence; and for example :option:`-ffp-contract=off` takes precedence
2294 over :option:`-ffp-contract=fast`. You can override them at link time.
2295
 Diagnostic options such as :option:`-Wstringop-overflow` are passed
 through to the link stage and their setting matches that of the
 compile step at function granularity. This matters only
 for diagnostics emitted during optimization. Note that code
 transforms such as inlining can lead to warnings being enabled
 or disabled for regions of code in a way that is not consistent with
 the setting at compile time.
2303
2304 When you need to pass options to the assembler via :option:`-Wa` or
2305 :option:`-Xassembler` make sure to either compile such translation
2306 units with :option:`-fno-lto` or consistently use the same assembler
2307 options on all translation units. You can alternatively also
2308 specify assembler options at LTO link time.
2309
2310 To enable debug info generation you need to supply :option:`-g` at
2311 compile time. If any of the input files at link time were built
2312 with debug info generation enabled the link will enable debug info
 generation as well. Any elaborate debug info settings,
 such as the DWARF level :option:`-gdwarf-5`, need to be explicitly repeated
 on the link command line; mixing different settings in different
 translation units is discouraged.
2317
2318 If LTO encounters objects with C linkage declared with incompatible
2319 types in separate translation units to be linked together (undefined
2320 behavior according to ISO C99 6.2.7), a non-fatal diagnostic may be
2321 issued. The behavior is still undefined at run time. Similar
2322 diagnostics may be raised for other languages.
2323
2324 Another feature of LTO is that it is possible to apply interprocedural
2325 optimizations on files written in different languages:
2326
2327 .. code-block:: shell
2328
2329 gcc -c -flto foo.c
2330 g++ -c -flto bar.cc
2331 gfortran -c -flto baz.f90
2332 g++ -o myprog -flto -O3 foo.o bar.o baz.o -lgfortran
2333
2334 Notice that the final link is done with :command:`g++` to get the C++
2335 runtime libraries and :option:`-lgfortran` is added to get the Fortran
2336 runtime libraries. In general, when mixing languages in LTO mode, you
2337 should use the same link command options as when mixing languages in a
2338 regular (non-LTO) compilation.
2339
2340 If object files containing GIMPLE bytecode are stored in a library archive, say
2341 :samp:`libfoo.a`, it is possible to extract and use them in an LTO link if you
2342 are using a linker with plugin support. To create static libraries suitable
2343 for LTO, use :command:`gcc-ar` and :command:`gcc-ranlib` instead of :command:`ar`
2344 and :command:`ranlib`;
2345 to show the symbols of object files with GIMPLE bytecode, use
2346 :command:`gcc-nm`. Those commands require that :command:`ar`, :command:`ranlib`
2347 and :command:`nm` have been compiled with plugin support. At link time, use the
2348 flag :option:`-fuse-linker-plugin` to ensure that the library participates in
2349 the LTO optimization process:
2350
2351 .. code-block:: shell
2352
2353 gcc -o myprog -O2 -flto -fuse-linker-plugin a.o b.o -lfoo
2354
2355 With the linker plugin enabled, the linker extracts the needed
2356 GIMPLE files from :samp:`libfoo.a` and passes them on to the running GCC
2357 to make them part of the aggregated GIMPLE image to be optimized.
2358
2359 If you are not using a linker with plugin support and/or do not
2360 enable the linker plugin, then the objects inside :samp:`libfoo.a`
2361 are extracted and linked as usual, but they do not participate
2362 in the LTO optimization process. In order to make a static library suitable
2363 for both LTO optimization and usual linkage, compile its object files with
2364 :option:`-flto` :option:`-ffat-lto-objects`.
2365
2366 Link-time optimizations do not require the presence of the whole program to
2367 operate. If the program does not require any symbols to be exported, it is
2368 possible to combine :option:`-flto` and :option:`-fwhole-program` to allow
2369 the interprocedural optimizers to use more aggressive assumptions which may
2370 lead to improved optimization opportunities.
 Use of :option:`-fwhole-program` is not needed when the linker plugin is
 active (see :option:`-fuse-linker-plugin`).
2373
2374 The current implementation of LTO makes no
2375 attempt to generate bytecode that is portable between different
2376 types of hosts. The bytecode files are versioned and there is a
2377 strict version check, so bytecode files generated in one version of
2378 GCC do not work with an older or newer version of GCC.
2379
2380 Link-time optimization does not work well with generation of debugging
2381 information on systems other than those using a combination of ELF and
2382 DWARF.
2383
2384 If you specify the optional :samp:`{n}`, the optimization and code
2385 generation done at link time is executed in parallel using :samp:`{n}`
2386 parallel jobs by utilizing an installed :command:`make` program. The
2387 environment variable :envvar:`MAKE` may be used to override the program
2388 used.
2389
2390 You can also specify :option:`-flto=jobserver` to use GNU make's
2391 job server mode to determine the number of parallel jobs. This
2392 is useful when the Makefile calling GCC is already executing in parallel.
2393 You must prepend a :samp:`+` to the command recipe in the parent Makefile
2394 for this to work. This option likely only works if :envvar:`MAKE` is
2395 GNU make. Even without the option value, GCC tries to automatically
2396 detect a running GNU make's job server.
2397
2398 Use :option:`-flto=auto` to use GNU make's job server, if available,
2399 or otherwise fall back to autodetection of the number of CPU threads
2400 present in your system.
2401
2402.. option:: -flto-partition={alg}
2403
 Specify the partitioning algorithm used by the link-time optimizer.
 The value is one of the following: :samp:`1to1` specifies a partitioning
 mirroring the original source files; :samp:`balanced` specifies partitioning
 into equally sized chunks (whenever possible); :samp:`max` creates a
 new partition for every symbol where possible; :samp:`one` specifies that
 exactly one partition should be used; and :samp:`none` disables partitioning
 and streaming completely, executing the link-time optimization step
 directly from the WPA phase.
 The default value is :samp:`balanced`. While :samp:`1to1` can be used
 as a workaround for various code ordering issues, the :samp:`max`
 partitioning is intended for internal testing only.
2416
2417.. option:: -flto-compression-level={n}
2418
2419 This option specifies the level of compression used for intermediate
2420 language written to LTO object files, and is only meaningful in
2421 conjunction with LTO mode (:option:`-flto`). GCC currently supports two
2422 LTO compression algorithms. For zstd, valid values are 0 (no compression)
2423 to 19 (maximum compression), while zlib supports values from 0 to 9.
2424 Values outside this range are clamped to either minimum or maximum
2425 of the supported values. If the option is not given,
2426 a default balanced compression setting is used.
2427
2428.. option:: -fuse-linker-plugin
2429
2430 Enables the use of a linker plugin during link-time optimization. This
2431 option relies on plugin support in the linker, which is available in gold
2432 or in GNU ld 2.21 or newer.
2433
 This option enables the extraction of object files with GIMPLE bytecode out
 of library archives. This improves the quality of optimization by exposing
 more code to the link-time optimizer. The linker plugin also passes
 information about which symbols can be accessed externally (by non-LTO
 objects or during dynamic linking). Resulting code quality improvements on
 binaries (and shared libraries that use hidden visibility) are similar to
 :option:`-fwhole-program`. See :option:`-flto` for a description of the
 effect of this flag and how to use it.
2442
2443 This option is enabled by default when LTO support in GCC is enabled
2444 and GCC was configured for use with
2445 a linker supporting plugins (GNU ld 2.21 or newer or gold).
2446
2447.. option:: -ffat-lto-objects
2448
2449 Fat LTO objects are object files that contain both the intermediate language
2450 and the object code. This makes them usable for both LTO linking and normal
2451 linking. This option is effective only when compiling with :option:`-flto`
2452 and is ignored at link time.
2453
2454 :option:`-fno-fat-lto-objects` improves compilation time over plain LTO, but
2455 requires the complete toolchain to be aware of LTO. It requires a linker with
2456 linker plugin support for basic functionality. Additionally,
2457 :command:`nm`, :command:`ar` and :command:`ranlib`
2458 need to support linker plugins to allow a full-featured build environment
2459 (capable of building static libraries etc). GCC provides the :command:`gcc-ar`,
2460 :command:`gcc-nm`, :command:`gcc-ranlib` wrappers to pass the right options
 to these tools. With non-fat LTO, makefiles need to be modified to use them.

 Note that modern binutils provide a plugin auto-load mechanism.
2464 Installing the linker plugin into :samp:`$libdir/bfd-plugins` has the same
2465 effect as usage of the command wrappers (:command:`gcc-ar`, :command:`gcc-nm` and
2466 :command:`gcc-ranlib`).
2467
2468 The default is :option:`-fno-fat-lto-objects` on targets with linker plugin
2469 support.
2470
2471.. option:: -fcompare-elim
2472
2473 After register allocation and post-register allocation instruction splitting,
2474 identify arithmetic instructions that compute processor flags similar to a
2475 comparison operation based on that arithmetic. If possible, eliminate the
2476 explicit comparison operation.
2477
2478 This pass only applies to certain targets that cannot explicitly represent
2479 the comparison operation before register allocation is complete.
2480
2481 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`, :option:`-Os`.
2482
2483.. option:: -fcprop-registers
2484
2485 After register allocation and post-register allocation instruction splitting,
2486 perform a copy-propagation pass to try to reduce scheduling dependencies
2487 and occasionally eliminate the copy.
2488
2489 Enabled at levels :option:`-O1`, :option:`-O2`, :option:`-O3`, :option:`-Os`.
2490
2491.. option:: -fprofile-correction
2492
2493 Profiles collected using an instrumented binary for multi-threaded programs may
2494 be inconsistent due to missed counter updates. When this option is specified,
2495 GCC uses heuristics to correct or smooth out such inconsistencies. By
2496 default, GCC emits an error message when an inconsistent profile is detected.
2497
2498 This option is enabled by :option:`-fauto-profile`.
2499
2500.. option:: -fprofile-partial-training
2501
 With ``-fprofile-use``, all portions of a program not executed during the
 training run are optimized aggressively for size rather than speed. In some
 cases it is not practical to train all possible hot paths in the program. (For
 example, a program may contain functions specific to a given hardware and
 training may not cover all hardware configurations the program is run on.) With
 ``-fprofile-partial-training``, profile feedback is ignored for all
 functions not executed during the training run, leading them to be optimized
 as if they were compiled without profile feedback. This leads to better
 performance when the training run is not representative but also leads to
 significantly bigger code.
2512
2513.. option:: -fprofile-use, -fprofile-use={path}
2514
2515 Enable profile feedback-directed optimizations,
2516 and the following optimizations, many of which
2517 are generally profitable only with profile feedback available:
2518
2519 :option:`-fbranch-probabilities` :option:`-fprofile-values` |gol|
2520 :option:`-funroll-loops` :option:`-fpeel-loops` :option:`-ftracer` :option:`-fvpt` |gol|
2521 :option:`-finline-functions` :option:`-fipa-cp` :option:`-fipa-cp-clone` :option:`-fipa-bit-cp` |gol|
2522 :option:`-fpredictive-commoning` :option:`-fsplit-loops` :option:`-funswitch-loops` |gol|
2523 :option:`-fgcse-after-reload` :option:`-ftree-loop-vectorize` :option:`-ftree-slp-vectorize` |gol|
2524 :option:`-fvect-cost-model=dynamic` :option:`-ftree-loop-distribute-patterns` |gol|
2525 :option:`-fprofile-reorder-functions`
2526
2527 Before you can use this option, you must first generate profiling information.
2528 See :ref:`instrumentation-options`, for information about the
2529 :option:`-fprofile-generate` option.
2530
2531 By default, GCC emits an error message if the feedback profiles do not
2532 match the source code. This error can be turned into a warning by using
2533 :option:`-Wno-error=coverage-mismatch`. Note this may result in poorly
2534 optimized code. Additionally, by default, GCC also emits a warning message if
2535 the feedback profiles do not exist (see :option:`-Wmissing-profile`).
2536
2537 If :samp:`{path}` is specified, GCC looks at the :samp:`{path}` to find
2538 the profile feedback data files. See :option:`-fprofile-dir`.
2539
2540.. option:: -fauto-profile, -fauto-profile={path}
2541
2542 Enable sampling-based feedback-directed optimizations,
2543 and the following optimizations,
2544 many of which are generally profitable only with profile feedback available:
2545
2546 :option:`-fbranch-probabilities` :option:`-fprofile-values` |gol|
2547 :option:`-funroll-loops` :option:`-fpeel-loops` :option:`-ftracer` :option:`-fvpt` |gol|
2548 :option:`-finline-functions` :option:`-fipa-cp` :option:`-fipa-cp-clone` :option:`-fipa-bit-cp` |gol|
2549 :option:`-fpredictive-commoning` :option:`-fsplit-loops` :option:`-funswitch-loops` |gol|
2550 :option:`-fgcse-after-reload` :option:`-ftree-loop-vectorize` :option:`-ftree-slp-vectorize` |gol|
2551 :option:`-fvect-cost-model=dynamic` :option:`-ftree-loop-distribute-patterns` |gol|
2552 :option:`-fprofile-correction`
2553
2554 :samp:`{path}` is the name of a file containing AutoFDO profile information.
2555 If omitted, it defaults to :samp:`fbdata.afdo` in the current directory.
2556
2557 Producing an AutoFDO profile data file requires running your program
2558 with the :command:`perf` utility on a supported GNU/Linux target system.
2559 For more information, see https://perf.wiki.kernel.org/.
2560
 For example:

 .. code-block:: shell
2564
2565 perf record -e br_inst_retired:near_taken -b -o perf.data \
2566 -- your_program
2567
2568 Then use the :command:`create_gcov` tool to convert the raw profile data
2569 to a format that can be used by GCC. You must also supply the
2570 unstripped binary for your program to this tool.
2571 See https://github.com/google/autofdo.
2572
 For example:

 .. code-block:: shell
2576
2577 create_gcov --binary=your_program.unstripped --profile=perf.data \
2578 --gcov=profile.afdo
2579
2580The following options control compiler behavior regarding floating-point
2581arithmetic. These options trade off between speed and
2582correctness. All must be specifically enabled.
2583
2584.. option:: -ffloat-store
2585
2586 Do not store floating-point variables in registers, and inhibit other
2587 options that might change whether a floating-point value is taken from a
2588 register or memory.
2589
2590 .. index:: floating-point precision
2591
2592 This option prevents undesirable excess precision on machines such as
2593 the 68000 where the floating registers (of the 68881) keep more
2594 precision than a ``double`` is supposed to have. Similarly for the
2595 x86 architecture. For most programs, the excess precision does only
2596 good, but a few programs rely on the precise definition of IEEE floating
2597 point. Use :option:`-ffloat-store` for such programs, after modifying
2598 them to store all pertinent intermediate computations into variables.
2599
2600.. option:: -fexcess-precision={style}
2601
2602 This option allows further control over excess precision on machines
2603 where floating-point operations occur in a format with more precision or
2604 range than the IEEE standard and interchange floating-point types. By
2605 default, :option:`-fexcess-precision=fast` is in effect; this means that
2606 operations may be carried out in a wider precision than the types specified
2607 in the source if that would result in faster code, and it is unpredictable
2608 when rounding to the types specified in the source code takes place.
2609 When compiling C or C++, if :option:`-fexcess-precision=standard` is specified
2610 then excess precision follows the rules specified in ISO C99 or C++; in particular,
2611 both casts and assignments cause values to be rounded to their
2612 semantic types (whereas :option:`-ffloat-store` only affects
2613 assignments). This option is enabled by default for C or C++ if a strict
2614 conformance option such as :option:`-std=c99` or :option:`-std=c++17` is used.
2615 :option:`-ffast-math` enables :option:`-fexcess-precision=fast` by default
2616 regardless of whether a strict conformance option is used.
2617
2618 :option:`-fexcess-precision=standard` is not implemented for languages
2619 other than C or C++. On the x86, it has no effect if :option:`-mfpmath=sse`
2620 or :option:`-mfpmath=sse+387` is specified; in the former case, IEEE
2621 semantics apply without excess precision, and in the latter, rounding
2622 is unpredictable.
2623
2624.. option:: -ffast-math
2625
2626 Sets the options :option:`-fno-math-errno`, :option:`-funsafe-math-optimizations`,
2627 :option:`-ffinite-math-only`, :option:`-fno-rounding-math`,
2628 :option:`-fno-signaling-nans`, :option:`-fcx-limited-range` and
2629 :option:`-fexcess-precision=fast`.
2630
2631 This option causes the preprocessor macro ``__FAST_MATH__`` to be defined.
2632
2633 This option is not turned on by any :option:`-O` option besides
2634 :option:`-Ofast` since it can result in incorrect output for programs
2635 that depend on an exact implementation of IEEE or ISO rules/specifications
2636 for math functions. It may, however, yield faster code for programs
2637 that do not require the guarantees of these specifications.
2638
2639.. option:: -fno-math-errno
2640
2641 Do not set ``errno`` after calling math functions that are executed
2642 with a single instruction, e.g., ``sqrt``. A program that relies on
2643 IEEE exceptions for math error handling may want to use this flag
2644 for speed while maintaining IEEE arithmetic compatibility.
2645
2646 This option is not turned on by any :option:`-O` option since
2647 it can result in incorrect output for programs that depend on
2648 an exact implementation of IEEE or ISO rules/specifications for
2649 math functions. It may, however, yield faster code for programs
2650 that do not require the guarantees of these specifications.
2651
2652 The default is :option:`-fmath-errno`.
2653
2654 On Darwin systems, the math library never sets ``errno``. There is
2655 therefore no reason for the compiler to consider the possibility that
2656 it might, and :option:`-fno-math-errno` is the default.
2657
2658.. option:: -fmath-errno
2659
2660 Default setting; overrides :option:`-fno-math-errno`.
2661
2662.. option:: -funsafe-math-optimizations
2663
2664 Allow optimizations for floating-point arithmetic that (a) assume
2665 that arguments and results are valid and (b) may violate IEEE or
2666 ANSI standards. When used at link time, it may include libraries
2667 or startup files that change the default FPU control word or other
2668 similar optimizations.
2669
2670 This option is not turned on by any :option:`-O` option since
2671 it can result in incorrect output for programs that depend on
2672 an exact implementation of IEEE or ISO rules/specifications for
2673 math functions. It may, however, yield faster code for programs
2674 that do not require the guarantees of these specifications.
2675 Enables :option:`-fno-signed-zeros`, :option:`-fno-trapping-math`,
2676 :option:`-fassociative-math` and :option:`-freciprocal-math`.
2677
2678 The default is :option:`-fno-unsafe-math-optimizations`.
2679
2680.. option:: -fassociative-math
2681
 Allow re-association of operands in series of floating-point operations.
 This violates the ISO C and C++ language standards by possibly changing
 the computation result. NOTE: re-ordering may change the sign of zero as
 well as ignore NaNs and inhibit or create underflow or overflow (and
 thus cannot be used on code that relies on rounding behavior like
 ``(x + 2**52) - 2**52``). May also reorder floating-point comparisons
 and thus may not be used when ordered comparisons are required.
2689 This option requires that both :option:`-fno-signed-zeros` and
2690 :option:`-fno-trapping-math` be in effect. Moreover, it doesn't make
2691 much sense with :option:`-frounding-math`. For Fortran the option
2692 is automatically enabled when both :option:`-fno-signed-zeros` and
2693 :option:`-fno-trapping-math` are in effect.
2694
2695 The default is :option:`-fno-associative-math`.
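
As a rough illustration of why re-association is not value-preserving, the following C sketch evaluates the same sum in two orders and uses the ``(x + 2**52) - 2**52`` idiom mentioned above. The function names are hypothetical, and the ``volatile`` temporaries are only there to force each intermediate result to be rounded to ``double``; the sketch assumes IEEE double arithmetic with round-to-nearest.

```c
/* Sums three values left to right: (a + b) + c.
   The volatile temporary forces the partial sum to be rounded to double. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Sums the same values right to left: a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}

/* Rounds a small non-negative value to an integer using the
   (x + 2**52) - 2**52 idiom; re-associating this to plain x
   would remove the rounding entirely. */
double round_via_magic(double x)
{
    const double magic = 4503599627370496.0; /* 2**52 */
    volatile double t = x + magic;
    return t - magic;
}
```

With ``a = 1e16``, ``b = -1e16`` and ``c = 1.0``, ``sum_left`` yields ``1.0`` while ``sum_right`` yields ``0.0``, because ``1.0`` is absorbed when it is added to ``-1e16`` first. Code that depends on one particular order is therefore unsafe under :option:`-fassociative-math`.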
2696
2697.. option:: -freciprocal-math
2698
2699 Allow the reciprocal of a value to be used instead of dividing by
2700 the value if this enables optimizations. For example ``x / y``
2701 can be replaced with ``x * (1/y)``, which is useful if ``(1/y)``
2702 is subject to common subexpression elimination. Note that this loses
2703 precision and increases the number of flops operating on the value.
2704
2705 The default is :option:`-fno-reciprocal-math`.
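
The precision loss can be sketched in C by writing both forms out by hand. The function names are hypothetical; ``div_via_reciprocal`` only imitates the kind of rewrite the option permits, and the ``volatile`` temporary keeps the reciprocal as a separately rounded ``double``. IEEE double arithmetic is assumed.

```c
/* Direct division, rounded once. */
double div_direct(double x, double y)
{
    return x / y;
}

/* Multiplication by a reciprocal: 1/y is rounded first, then the
   product is rounded again, so two rounding errors can accumulate. */
double div_via_reciprocal(double x, double y)
{
    volatile double r = 1.0 / y;
    return x * r;
}
```

For ``x = y = 49.0``, direct division gives exactly ``1.0``, whereas the reciprocal form gives a value slightly different from ``1.0``.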
2706
2707.. option:: -ffinite-math-only
2708
2709 Allow optimizations for floating-point arithmetic that assume
2710 that arguments and results are not NaNs or +-Infs.
2711
2712 This option is not turned on by any :option:`-O` option since
2713 it can result in incorrect output for programs that depend on
2714 an exact implementation of IEEE or ISO rules/specifications for
2715 math functions. It may, however, yield faster code for programs
2716 that do not require the guarantees of these specifications.
2717
2718 The default is :option:`-fno-finite-math-only`.
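
The kind of code that stops being reliable under this option can be sketched as follows. The helpers are hypothetical and assume IEEE semantics, where a NaN compares unequal to itself; a compiler that is allowed to assume there are no NaNs or infinities may simplify such tests away.

```c
/* Produces a NaN at run time without needing the math library. */
double make_nan(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}

/* Produces +Inf at run time. */
double make_inf(void)
{
    volatile double zero = 0.0;
    return 1.0 / zero;
}

/* Returns 1 if x is a NaN: only a NaN compares unequal to itself. */
int is_nan_by_compare(double x)
{
    return x != x;
}

/* Returns 1 if x is +Inf or -Inf: x is equal to itself (so not a NaN),
   but x - x is a NaN rather than zero. */
int is_inf_by_arith(double x)
{
    double diff = x - x;
    return x == x && diff != diff;
}
```

Programs that rely on checks like these should not be compiled with :option:`-ffinite-math-only`.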
2719
2720.. option:: -fno-signed-zeros
2721
2722 Allow optimizations for floating-point arithmetic that ignore the
2723 signedness of zero. IEEE arithmetic specifies the behavior of
2724 distinct +0.0 and -0.0 values, which then prohibits simplification
 of expressions such as ``x+0.0`` or ``0.0*x`` (even with :option:`-ffinite-math-only`).
2726 This option implies that the sign of a zero result isn't significant.
2727
2728 The default is :option:`-fsigned-zeros`.
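
A small C sketch shows why these simplifications are not valid when the sign of zero matters. The helper names are hypothetical, and ``sign_bit`` assumes that ``double`` is a 64-bit IEEE value whose most significant bit is the sign.

```c
#include <stdint.h>
#include <string.h>

/* Returns the IEEE sign bit of x (1 for negative values, including -0.0). */
int sign_bit(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (int)(bits >> 63);
}

/* x + 0.0 is not the identity: -0.0 + 0.0 is +0.0 in round-to-nearest. */
double add_zero(double x)
{
    volatile double zero = 0.0;
    return x + zero;
}

/* 0.0 * x is not always +0.0: 0.0 * -1.0 is -0.0. */
double mul_zero(double x)
{
    volatile double zero = 0.0;
    return zero * x;
}
```

Here ``add_zero(-0.0)`` returns ``+0.0`` rather than its argument, and ``mul_zero(-1.0)`` returns ``-0.0``, so neither expression can be folded without ignoring the sign of zero.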
2729
2730.. option:: -fsigned-zeros
2731
2732 Default setting; overrides :option:`-fno-signed-zeros`.
2733
2734.. option:: -fno-trapping-math
2735
2736 Compile code assuming that floating-point operations cannot generate
2737 user-visible traps. These traps include division by zero, overflow,
2738 underflow, inexact result and invalid operation. This option requires
2739 that :option:`-fno-signaling-nans` be in effect. Setting this option may
2740 allow faster code if one relies on 'non-stop' IEEE arithmetic, for example.
2741
2742 This option should never be turned on by any :option:`-O` option since
2743 it can result in incorrect output for programs that depend on
2744 an exact implementation of IEEE or ISO rules/specifications for
2745 math functions.
2746
2747 The default is :option:`-ftrapping-math`.
2748
2749 Future versions of GCC may provide finer control of this setting
2750 using C99's ``FENV_ACCESS`` pragma. This command-line option
2751 will be used along with :option:`-frounding-math` to specify the
2752 default state for ``FENV_ACCESS``.
2753
2754.. option:: -ftrapping-math
2755
2756 Default setting; overrides :option:`-fno-trapping-math`.
2757
2758.. option:: -frounding-math
2759
2760 Disable transformations and optimizations that assume default floating-point
2761 rounding behavior. This is round-to-zero for all floating point
2762 to integer conversions, and round-to-nearest for all other arithmetic
2763 truncations. This option should be specified for programs that change
2764 the FP rounding mode dynamically, or that may be executed with a
2765 non-default rounding mode. This option disables constant folding of
2766 floating-point expressions at compile time (which may be affected by
2767 rounding mode) and arithmetic transformations that are unsafe in the
2768 presence of sign-dependent rounding modes.
2769
2770 The default is :option:`-fno-rounding-math`.
2771
2772 This option is experimental and does not currently guarantee to
2773 disable all GCC optimizations that are affected by rounding mode.
2774 Future versions of GCC may provide finer control of this setting
2775 using C99's ``FENV_ACCESS`` pragma. This command-line option
2776 will be used along with :option:`-ftrapping-math` to specify the
2777 default state for ``FENV_ACCESS``.
2778
2779.. option:: -fsignaling-nans
2780
2781 Compile code assuming that IEEE signaling NaNs may generate user-visible
2782 traps during floating-point operations. Setting this option disables
2783 optimizations that may change the number of exceptions visible with
2784 signaling NaNs. This option implies :option:`-ftrapping-math`.
2785
2786 This option causes the preprocessor macro ``__SUPPORT_SNAN__`` to
2787 be defined.
2788
2789 The default is :option:`-fno-signaling-nans`.
2790
2791 This option is experimental and does not currently guarantee to
2792 disable all GCC optimizations that affect signaling NaN behavior.
2793
2794.. option:: -fno-fp-int-builtin-inexact
2795
2796 Do not allow the built-in functions ``ceil``, ``floor``,
2797 ``round`` and ``trunc``, and their ``float`` and ``long
2798 double`` variants, to generate code that raises the 'inexact'
2799 floating-point exception for noninteger arguments. ISO C99 and C11
2800 allow these functions to raise the 'inexact' exception, but ISO/IEC
2801 TS 18661-1:2014, the C bindings to IEEE 754-2008, as integrated into
2802 ISO C2X, does not allow these functions to do so.
2803
2804 The default is :option:`-ffp-int-builtin-inexact`, allowing the
2805 exception to be raised, unless C2X or a later C standard is selected.
2806 This option does nothing unless :option:`-ftrapping-math` is in effect.
2807
2808 Even if :option:`-fno-fp-int-builtin-inexact` is used, if the functions
2809 generate a call to a library function then the 'inexact' exception
2810 may be raised if the library implementation does not follow TS 18661.
2811
2812.. option:: -ffp-int-builtin-inexact
2813
2814 Default setting; overrides :option:`-fno-fp-int-builtin-inexact`.
2815
2816.. option:: -fsingle-precision-constant
2817
2818 Treat floating-point constants as single precision instead of
2819 implicitly converting them to double-precision constants.
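
The effect on values can be approximated in ordinary C by writing the constant with an explicit ``f`` suffix. The function names below are hypothetical, and the second function only imitates how an unsuffixed constant might be treated under this option.

```c
/* The constant 0.1 with its normal C type, double. */
double tenth_as_double(void)
{
    return 0.1;
}

/* The same constant written as single precision and then widened;
   the widened value keeps only single-precision accuracy. */
double tenth_as_float(void)
{
    return (double)0.1f;
}
```

The two functions return different values, so code that compares a computed ``double`` against an unsuffixed constant may behave differently when the option is in effect.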
2820
2821.. option:: -fcx-limited-range
2822
 When enabled, this option states that a range reduction step is not
 needed when performing complex division. Also, there is no checking
 whether the result of a complex multiplication or division is
 ``NaN + I*NaN``, with an attempt to rescue the situation in that case. The
 default is :option:`-fno-cx-limited-range`, but the option is enabled by
 :option:`-ffast-math`.
2829
2830 This option controls the default setting of the ISO C99
2831 ``CX_LIMITED_RANGE`` pragma. Nevertheless, the option applies to
2832 all languages.
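
To make the role of range reduction concrete, the following C sketch contrasts the textbook division formula with a scaled variant (Smith's method). It is an illustration only: the ``struct cplx`` type and all function names are hypothetical, the code GCC actually generates may differ, and ``double`` arithmetic without extra exponent range is assumed.

```c
/* A complex value as a plain pair; illustrative only. */
struct cplx { double re, im; };

struct cplx make_cplx(double re, double im)
{
    struct cplx z;
    z.re = re;
    z.im = im;
    return z;
}

static double abs_val(double x)
{
    return x < 0 ? -x : x;
}

/* Textbook formula with no range reduction: the denominator
   c*c + d*d can overflow even when the true quotient is small. */
struct cplx div_limited_range(struct cplx n, struct cplx d)
{
    double denom = d.re * d.re + d.im * d.im;
    struct cplx q;
    q.re = (n.re * d.re + n.im * d.im) / denom;
    q.im = (n.im * d.re - n.re * d.im) / denom;
    return q;
}

/* Smith's method: scale by the ratio of the smaller to the larger
   component of the divisor so that intermediate values stay in range. */
struct cplx div_range_reduced(struct cplx n, struct cplx d)
{
    struct cplx q;
    if (abs_val(d.re) >= abs_val(d.im)) {
        double r = d.im / d.re;
        double denom = d.re + d.im * r;
        q.re = (n.re + n.im * r) / denom;
        q.im = (n.im - n.re * r) / denom;
    } else {
        double r = d.re / d.im;
        double denom = d.re * r + d.im;
        q.re = (n.re * r + n.im) / denom;
        q.im = (n.im * r - n.re) / denom;
    }
    return q;
}
```

Dividing ``1e200 + 1e200i`` by itself gives ``1 + 0i`` with the scaled variant, while the textbook formula overflows in the denominator and produces a NaN real part. That is the trade-off the option accepts in exchange for simpler code.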
2833
2834.. option:: -fcx-fortran-rules
2835
2836 Complex multiplication and division follow Fortran rules. Range
2837 reduction is done as part of complex division, but there is no checking
2838 whether the result of a complex multiplication or division is ``NaN
2839 + I*NaN``, with an attempt to rescue the situation in that case.
2840
2841 The default is :option:`-fno-cx-fortran-rules`.
2842
2843The following options control optimizations that may improve
2844performance, but are not enabled by any :option:`-O` options. This
2845section includes experimental options that may produce broken code.
2846
2847.. option:: -fbranch-probabilities
2848
2849 After running a program compiled with :option:`-fprofile-arcs`
2850 (see :ref:`instrumentation-options`),
2851 you can compile it a second time using
2852 :option:`-fbranch-probabilities`, to improve optimizations based on
2853 the number of times each branch was taken. When a program
2854 compiled with :option:`-fprofile-arcs` exits, it saves arc execution
2855 counts to a file called :samp:`{sourcename}.gcda` for each source
2856 file. The information in this data file is very dependent on the
2857 structure of the generated code, so you must use the same source code
2858 and the same optimization options for both compilations.
2859 See details about the file naming in :option:`-fprofile-arcs`.
2860
2861 With :option:`-fbranch-probabilities`, GCC puts a
2862 :samp:`REG_BR_PROB` note on each :samp:`JUMP_INSN` and :samp:`CALL_INSN`.
2863 These can be used to improve optimization. Currently, they are only
2864 used in one place: in :samp:`reorg.cc`, instead of guessing which path a
2865 branch is most likely to take, the :samp:`REG_BR_PROB` values are used to
2866 exactly determine which path is taken more often.
2867
2868 Enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
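The two-step workflow described above can be sketched with a small function whose branch is heavily biased. This is only an illustration: the file name ``demo.c`` and the function ``sum_clamped`` are hypothetical, and the commands in the comment use only the options named in this section.

```c
#include <stddef.h>

/* Hypothetical build sequence (same source and same optimization options
   in both compilations, as required above):
     gcc -O2 -fprofile-arcs demo.c -o demo      first build, instrumented
     ./demo                                     writes demo.gcda on exit
     gcc -O2 -fbranch-probabilities demo.c -o demo                        */

/* For typical input the negative branch is rarely taken, which is the
   kind of bias that the recorded arc counts capture.  */
long sum_clamped(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] < 0)
            total -= 1;      /* rare path */
        else
            total += v[i];   /* common path */
    }
    return total;
}
```

The source is identical in both compilations; only the second one can use the measured branch counts instead of static guesses.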
2869
2870.. option:: -fprofile-values
2871
2872 If combined with :option:`-fprofile-arcs`, it adds code so that some
2873 data about values of expressions in the program is gathered.
2874
2875 With :option:`-fbranch-probabilities`, it reads back the data gathered
2876 from profiling values of expressions for usage in optimizations.
2877
2878 Enabled by :option:`-fprofile-generate`, :option:`-fprofile-use`, and
2879 :option:`-fauto-profile`.
2880
2881.. option:: -fprofile-reorder-functions
2882
2883 Reorder functions based on profile instrumentation: the profile records
2884 when each function is first executed, and the functions are then placed
2885 in ascending order of that first-execution time.
2886
2887 Enabled with :option:`-fprofile-use`.
2888
2889.. option:: -fvpt
2890
2891 If combined with :option:`-fprofile-arcs`, this option instructs the compiler
2892 to add code to gather information about values of expressions.
2893
2894 With :option:`-fbranch-probabilities`, it reads back the data gathered
2895 and actually performs the optimizations based on them.
2896 Currently the optimizations include specialization of division operations
2897 using the knowledge about the value of the denominator.
2898
2899 Enabled with :option:`-fprofile-use` and :option:`-fauto-profile`.
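As a rough illustration of division specialization, the second function below is a hand-written equivalent of what the optimization might produce if the profile showed that the denominator is almost always 16. The value 16 and both function names are assumptions made for this sketch, not GCC output.

```c
/* The division as written in the source.  */
int scale_generic(int value, int divisor)
{
    return value / divisor;
}

/* Hand-written sketch of the specialized form, assuming the value
   profile reported that divisor is nearly always 16.  */
int scale_specialized(int value, int divisor)
{
    if (divisor == 16)
        return value / 16;      /* constant divisor: cheaper to compute */
    return value / divisor;     /* general fallback for other values */
}
```

Both functions return the same result for every input with a nonzero divisor; the specialized one only adds a fast path for the common value.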
2900
2901.. option:: -frename-registers
2902
2903 Attempt to avoid false dependencies in scheduled code by making use
2904 of registers left over after register allocation. This optimization
2905 most benefits processors with lots of registers. Depending on the
2906 debug information format adopted by the target, however, it can
2907 make debugging impossible, since variables no longer stay in
2908 a 'home register'.
2909
2910 Enabled by default with :option:`-funroll-loops`.
2911
2912.. option:: -fschedule-fusion
2913
2914 Performs a target-dependent pass over the instruction stream to schedule
2915 instructions of the same type together, because the target machine can execute
2916 them more efficiently if they are adjacent to each other in the instruction flow.
2917
2918 Enabled at levels :option:`-O2`, :option:`-O3`, :option:`-Os`.
2919
2920.. option:: -ftracer
2921
2922 Perform tail duplication to enlarge superblock size. This transformation
2923 simplifies the control flow of the function allowing other optimizations to do
2924 a better job.
2925
2926 Enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
2927
2928.. option:: -funroll-loops
2929
2930 Unroll loops whose number of iterations can be determined at compile time or
2931 upon entry to the loop. :option:`-funroll-loops` implies
2932 :option:`-frerun-cse-after-loop`, :option:`-fweb` and :option:`-frename-registers`.
2933 It also turns on complete loop peeling (i.e. complete removal of loops with
2934 a small constant number of iterations). This option makes code larger, and may
2935 or may not make it run faster.
2936
2937 Enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
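To make the transformation concrete, here is a hand-unrolled version of a simple summation loop. It is a sketch only: the unroll factor of 4 and the function names are chosen for illustration, and the compiler picks its own factor.

```c
#include <stddef.h>

/* Original loop: one addition and one loop test per element.  */
long sum_rolled(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        s += v[i];
    return s;
}

/* Hand-unrolled by 4: one loop test per four elements, followed by a
   remainder loop for the last n % 4 elements.  */
long sum_unrolled4(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += v[i];
        s += v[i + 1];
        s += v[i + 2];
        s += v[i + 3];
    }
    for (; i < n; ++i)
        s += v[i];
    return s;
}
```

The unrolled body is larger, which is the code-size cost mentioned above; whether it runs faster depends on the target.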
2938
2939.. option:: -funroll-all-loops
2940
2941 Unroll all loops, even if their number of iterations is uncertain when
2942 the loop is entered. This usually makes programs run more slowly.
2943 :option:`-funroll-all-loops` implies the same options as
2944 :option:`-funroll-loops`.
2945
2946.. option:: -fpeel-loops
2947
2948 Peels loops for which there is enough information that they do not
2949 roll much (from profile feedback or static analysis). It also turns on
2950 complete loop peeling (i.e. complete removal of loops with small constant
2951 number of iterations).
2952
2953 Enabled by :option:`-O3`, :option:`-fprofile-use`, and :option:`-fauto-profile`.
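Complete peeling can be pictured with a loop whose trip count is a small constant. The sketch below (hypothetical function names) shows the loop and a hand-written straight-line equivalent.

```c
/* Loop with a small, constant number of iterations.  */
int dot3_loop(const int a[3], const int b[3])
{
    int s = 0;
    for (int i = 0; i < 3; ++i)
        s += a[i] * b[i];
    return s;
}

/* Hand-written equivalent after complete peeling: the loop, its
   counter, and its exit test are gone.  */
int dot3_peeled(const int a[3], const int b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```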
2954
2955.. option:: -fmove-loop-invariants
2956
2957 Enables the loop invariant motion pass in the RTL loop optimizer. Enabled
2958 at level :option:`-O1` and higher, except for :option:`-Og`.
2959
2960.. option:: -fmove-loop-stores
2961
2962 Enables the loop store motion pass in the GIMPLE loop optimizer. This
2963 moves invariant stores to after the end of the loop in exchange for
2964 carrying the stored value in a register across the iteration.
2965 Note for this option to have an effect :option:`-ftree-loop-im` has to
2966 be enabled as well. Enabled at level :option:`-O1` and higher, except
2967 for :option:`-Og`.
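A minimal sketch of store motion, written by hand: the store to ``*out`` inside the loop is replaced by a local value that is written back once after the loop. The function names are hypothetical, and the sketch assumes ``out`` does not alias ``v``; the compiler has to prove such conditions itself before applying the transformation.

```c
/* Before: *out is loaded and stored inside the loop.  */
void count_positive_before(int *out, const int *v, int n)
{
    for (int i = 0; i < n; ++i)
        if (v[i] > 0)
            *out += 1;
}

/* After (hand-written): the value lives in a register-like local
   across iterations and is stored once after the loop ends.  */
void count_positive_after(int *out, const int *v, int n)
{
    int tmp = *out;
    for (int i = 0; i < n; ++i)
        if (v[i] > 0)
            tmp += 1;
    *out = tmp;
}
```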
2968
2969.. option:: -fsplit-loops
2970
2971 Split a loop into two if it contains a condition that's always true
2972 for one side of the iteration space and false for the other.
2973
2974 Enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
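For illustration, the following hand-written pair shows a loop whose condition ``i < m`` is true for the first part of the iteration space and false for the rest, and an equivalent split form. The names and the clamping of the split point are assumptions of this sketch, not a description of GCC's generated code.

```c
/* Before: the condition is re-tested in every iteration.  */
void fill_before(int *dst, int n, int m)
{
    for (int i = 0; i < n; ++i) {
        if (i < m)
            dst[i] = 1;
        else
            dst[i] = 2;
    }
}

/* After (hand-written): two loops, each without the inner test.  */
void fill_after(int *dst, int n, int m)
{
    /* Clamp the split point into [0, n] so both loops stay in range.  */
    int split = m < 0 ? 0 : (m < n ? m : n);
    int i = 0;
    for (; i < split; ++i)
        dst[i] = 1;
    for (; i < n; ++i)
        dst[i] = 2;
}
```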
2975
2976.. option:: -funswitch-loops
2977
2978 Move branches with loop invariant conditions out of the loop, with duplicates
2979 of the loop on both branches (modified according to result of the condition).
2980
2981 Enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
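Loop unswitching can be sketched by hand as follows. The flag ``negate`` does not change inside the loop, so the test can be hoisted and the loop duplicated on both sides of it. The function names are hypothetical.

```c
/* Before: the loop-invariant condition is tested in every iteration.  */
void scale_before(int *v, int n, int negate)
{
    for (int i = 0; i < n; ++i) {
        if (negate)
            v[i] = -v[i];
        else
            v[i] = 2 * v[i];
    }
}

/* After (hand-written): the condition is tested once, and each copy of
   the loop contains only the code for its own branch.  */
void scale_after(int *v, int n, int negate)
{
    if (negate) {
        for (int i = 0; i < n; ++i)
            v[i] = -v[i];
    } else {
        for (int i = 0; i < n; ++i)
            v[i] = 2 * v[i];
    }
}
```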
2982
2983.. option:: -fversion-loops-for-strides
2984
2985 If a loop iterates over an array with a variable stride, create another
2986 version of the loop that assumes the stride is always one. For example:
2987
2988 .. code-block:: c++
2989
2990 for (int i = 0; i < n; ++i)
2991 x[i * stride] = ...;
2992
2993 becomes:
2994
2995 .. code-block:: c++
2996
2997 if (stride == 1)
2998 for (int i = 0; i < n; ++i)
2999 x[i] = ...;
3000 else
3001 for (int i = 0; i < n; ++i)
3002 x[i * stride] = ...;
3003
3004 This is particularly useful for assumed-shape arrays in Fortran where
3005 (for example) it allows better vectorization assuming contiguous accesses.
3006 This flag is enabled by default at :option:`-O3`.
3007 It is also enabled by :option:`-fprofile-use` and :option:`-fauto-profile`.
3008
3009.. option:: -ffunction-sections, -fdata-sections
3010
3011 Place each function or data item into its own section in the output
3012 file if the target supports arbitrary sections. The name of the
3013 function or the name of the data item determines the section's name
3014 in the output file.
3015
3016 Use these options on systems where the linker can perform optimizations to
3017 improve locality of reference in the instruction space. Most systems using the
3018 ELF object format have linkers with such optimizations. On AIX, the linker
3019 rearranges sections (CSECTs) based on the call graph. The performance impact
3020 varies.
3021
3022 Together with a linker garbage collection (linker :option:`--gc-sections`
3023 option) these options may lead to smaller statically-linked executables (after
3024 stripping).
3025
3026 On ELF/DWARF systems these options do not degrade the quality of the debug
3027 information. There could be issues with other object files/debug info formats.
3028
3029 Only use these options when there are significant benefits from doing so. When
3030 you specify these options, the assembler and linker create larger object and
3031 executable files and are also slower. These options affect code generation.
3032 They prevent optimizations by the compiler and assembler using relative
3033 locations inside a translation unit since the locations are unknown until
3034 link time. An example of such an optimization is relaxing calls to short call
3035 instructions.
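As a small illustration of the combination with linker garbage collection, consider a translation unit with one function that is called and one that is not. The file name ``lib.c``, the function names, and the exact command lines in the comment are assumptions of this sketch; ``-Wl,`` is the usual way to pass ``--gc-sections`` to the linker through the GCC driver.

```c
/* Hypothetical build:
     gcc -O2 -ffunction-sections -fdata-sections -c lib.c
     gcc main.o lib.o -Wl,--gc-sections -o app
   With per-function sections, a linker that supports section garbage
   collection may discard the section holding unused_helper.  */

int used_helper(int x)
{
    return x + 1;
}

/* Never referenced by the rest of the (hypothetical) program.  */
int unused_helper(int x)
{
    return x * 2;
}
```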
3036
3037.. option:: -fstdarg-opt
3038
3039 Optimize the prologue of variadic argument functions with respect to usage of
3040 those arguments.
3041
3042.. option:: -fsection-anchors
3043
3044 Try to reduce the number of symbolic address calculations by using
3045 shared 'anchor' symbols to address nearby objects. This transformation
3046 can help to reduce the number of GOT entries and GOT accesses on some
3047 targets.
3048
3049 For example, the implementation of the following function ``foo``:
3050
3051 .. code-block:: c++
3052
3053 static int a, b, c;
3054 int foo (void) { return a + b + c; }
3055
3056 usually calculates the addresses of all three variables, but if you
3057 compile it with :option:`-fsection-anchors`, it accesses the variables
3058 from a common anchor point instead. The effect is similar to the
3059 following pseudocode (which isn't valid C):
3060
3061 .. code-block:: c++
3062
3063 int foo (void)
3064 {
3065 register int *xr = &x;
3066 return xr[&a - &x] + xr[&b - &x] + xr[&c - &x];
3067 }
3068
3069 Not all targets support this option.
3070
3071.. option:: -fzero-call-used-regs={choice}
3072
3073 Zero call-used registers at function return to increase program
3074 security by either mitigating Return-Oriented Programming (ROP)
3075 attacks or preventing information leakage through registers.
3076
3077 The possible values of :samp:`{choice}` are the same as for the
3078 ``zero_call_used_regs`` attribute (see :ref:`function-attributes`).
3079 The default is :samp:`skip`.
3080
3081 You can control this behavior for a specific function by using the function
3082 attribute ``zero_call_used_regs`` (see :ref:`function-attributes`).
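A per-function use of the attribute might look like the sketch below. The helper ``check_pin`` and the macro ``ZERO_USED_GPR`` are hypothetical, and ``used-gpr`` is just one of the choices documented for the attribute; the guard keeps the code compiling on compilers that do not know the attribute.

```c
#include <stddef.h>

/* Use the attribute only where the compiler reports support for it.  */
#if defined(__has_attribute)
#  if __has_attribute(zero_call_used_regs)
#    define ZERO_USED_GPR __attribute__((zero_call_used_regs("used-gpr")))
#  endif
#endif
#ifndef ZERO_USED_GPR
#  define ZERO_USED_GPR
#endif

/* Hypothetical security-sensitive helper: compares two strings without
   an early exit on the first mismatch, and asks for the used
   general-purpose registers to be zeroed on return.  */
ZERO_USED_GPR
int check_pin(const char *entered, const char *expected)
{
    unsigned diff = 0;
    size_t i = 0;
    for (; entered[i] != '\0' && expected[i] != '\0'; ++i)
        diff |= (unsigned char)(entered[i] ^ expected[i]);
    /* Also catches a length mismatch: one side is '\0', the other is not.  */
    diff |= (unsigned char)(entered[i] ^ expected[i]);
    return diff == 0;
}
```

The attribute affects only this function, whereas the command-line option sets the behavior for the whole translation unit.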
3083
3084.. option:: --param {name}={value}
3085
3086 In some places, GCC uses various constants to control the amount of
3087 optimization that is done. For example, GCC does not inline functions
3088 that contain more than a certain number of instructions. You can
3089 control some of these constants on the command line using the
3090 :option:`--param` option.
3091
3092 The names of specific parameters, and the meaning of the values, are
3093 tied to the internals of the compiler, and are subject to change
3094 without notice in future releases.
3095
3096 To get the minimum, maximum and default values of a parameter,
3097 use the :option:`--help=param -Q` options.
3098
3099 In each case, the :samp:`{value}` is an integer. The following choices
3100 of :samp:`{name}` are recognized for all targets:
3101
3102 .. gcc-param:: predictable-branch-outcome
3103
3104 When a branch is predicted to be taken with probability lower than this threshold
3105 (in percent), it is considered well predictable.
3106
3107 .. gcc-param:: max-rtl-if-conversion-insns
3108
3109 RTL if-conversion tries to remove conditional branches around a block and
3110 replace them with conditionally executed instructions. This parameter
3111 gives the maximum number of instructions in a block which should be
3112 considered for if-conversion. The compiler will
3113 also use other heuristics to decide whether if-conversion is likely to be
3114 profitable.
3115
3116 .. gcc-param:: max-rtl-if-conversion-predictable-cost
3117
3118 RTL if-conversion will try to remove conditional branches around a block
3119 and replace them with conditionally executed instructions. These parameters
3120 give the maximum permissible cost for the sequence that would be generated
3121 by if-conversion depending on whether the branch is statically determined
3122 to be predictable or not. The units for this parameter are the same as
3123 those for the GCC internal seq_cost metric. The compiler will try to
3124 provide a reasonable default for this parameter using the BRANCH_COST
3125 target macro.
3126
3127 .. gcc-param:: max-crossjump-edges
3128
3129 The maximum number of incoming edges to consider for cross-jumping.
3130 The algorithm used by :option:`-fcrossjumping` is O(N^2) in
3131 the number of edges incoming to each block. Increasing values mean
3132 more aggressive optimization, making the compilation time increase with
3133 probably small improvement in executable size.
3134
3135 .. gcc-param:: min-crossjump-insns
3136
3137 The minimum number of instructions that must be matched at the end
3138 of two blocks before cross-jumping is performed on them. This
3139 value is ignored in the case where all instructions in the block being
3140 cross-jumped from are matched.
3141
3142 .. gcc-param:: max-grow-copy-bb-insns
3143
3144 The maximum code size expansion factor when copying basic blocks
3145 instead of jumping. The expansion is relative to a jump instruction.
3146
3147 .. gcc-param:: max-goto-duplication-insns
3148
3149 The maximum number of instructions to duplicate to a block that jumps
3150 to a computed goto. To avoid O(N^2) behavior in a number of
3151 passes, GCC factors computed gotos early in the compilation process,
3152 and unfactors them as late as possible. Only computed jumps at the
3153 end of basic blocks with no more than max-goto-duplication-insns are
3154 unfactored.
3155
3156 .. gcc-param:: max-delay-slot-insn-search
3157
3158 The maximum number of instructions to consider when looking for an
3159 instruction to fill a delay slot. If more than this arbitrary number of
3160 instructions are searched, the time savings from filling the delay slot
3161 are minimal, so stop searching. Increasing values mean more
3162 aggressive optimization, making the compilation time increase with probably
3163 small improvement in execution time.
3164
3165 .. gcc-param:: max-delay-slot-live-search
3166
3167 When trying to fill delay slots, the maximum number of instructions to
3168 consider when searching for a block with valid live register
3169 information. Increasing this arbitrarily chosen value means more
3170 aggressive optimization, increasing the compilation time. This parameter
3171 should be removed when the delay slot code is rewritten to maintain the
3172 control-flow graph.
3173
3174 .. gcc-param:: max-gcse-memory
3175
3176 The approximate maximum amount of memory in ``kB`` that can be allocated in
3177 order to perform the global common subexpression elimination
3178 optimization. If more memory than specified is required, the
3179 optimization is not done.
3180
3181 .. gcc-param:: max-gcse-insertion-ratio
3182
3183 If the ratio of expression insertions to deletions is larger than this value
3184 for any expression, then RTL PRE neither inserts nor removes the expression and thus
3185 leaves partially redundant computations in the instruction stream.
3186
3187 .. gcc-param:: max-pending-list-length
3188
3189 The maximum number of pending dependencies scheduling allows
3190 before flushing the current state and starting over. Large functions
3191 with few branches or calls can create excessively large lists which
3192 needlessly consume memory and resources.
3193
3194 .. gcc-param:: max-modulo-backtrack-attempts
3195
3196 The maximum number of backtrack attempts the scheduler should make
3197 when modulo scheduling a loop. Larger values can exponentially increase
3198 compilation time.
3199
3200 .. gcc-param:: max-inline-functions-called-once-loop-depth
3201
3202 Maximal loop depth of a call considered by inline heuristics that tries to
3203 inline all functions called once.
3204
3205 .. gcc-param:: max-inline-functions-called-once-insns
3206
3207 Maximal estimated size of functions produced while inlining functions called
3208 once.
3209
3210 .. gcc-param:: max-inline-insns-single
3211
3212 Several parameters control the tree inliner used in GCC. This number sets the
3213 maximum number of instructions (counted in GCC's internal representation) in a
3214 single function that the tree inliner considers for inlining. This only
3215 affects functions declared inline and methods implemented in a class
3216 declaration (C++).
3217
3218 .. gcc-param:: max-inline-insns-auto
3219
3220 When you use :option:`-finline-functions` (included in :option:`-O3`),
3221 a lot of functions that would otherwise not be considered for inlining
3222 by the compiler are investigated. To those functions, a different
3223 (more restrictive) limit compared to functions declared inline can
3224 be applied (:option:`--param` :gcc-param:`max-inline-insns-auto`).
3225
3226 .. gcc-param:: max-inline-insns-small
3227
3228 This is the bound applied to calls that are considered relevant with
3229 :option:`-finline-small-functions`.
3230
3231 .. gcc-param:: max-inline-insns-size
3232
3233 This is the bound applied to calls that are optimized for size. Small growth
3234 may be desirable to anticipate optimization opportunities exposed by inlining.
3235
3236 .. gcc-param:: uninlined-function-insns
3237
3238 Number of instructions accounted by inliner for function overhead such as
3239 function prologue and epilogue.
3240
3241 .. gcc-param:: uninlined-function-time
3242
3243 Extra time accounted by inliner for function overhead such as time needed to
3244 execute function prologue and epilogue.
3245
3246 .. gcc-param:: inline-heuristics-hint-percent
3247
3248 The scale (in percents) applied to inline-insns-single,
3249 inline-insns-single-O2, inline-insns-auto
3250 when inline heuristics hints that inlining is
3251 very profitable (will enable later optimizations).
3252
3253 .. gcc-param:: uninlined-thunk-insns
3254 uninlined-thunk-time
3255
3256 Same as :option:`--param` :gcc-param:`uninlined-function-insns` and
3257 :option:`--param` :gcc-param:`uninlined-function-time` but applied to function thunks.
3258
3259 .. gcc-param:: inline-min-speedup
3260
3261 When estimated performance improvement of caller + callee runtime exceeds this
3262 threshold (in percent), the function can be inlined regardless of the limit on
3263 :option:`--param` :gcc-param:`max-inline-insns-single` and :option:`--param`
3264 :gcc-param:`max-inline-insns-auto`.
3265
3266 .. gcc-param:: large-function-insns
3267
3268 The limit specifying really large functions. For functions larger than this
3269 limit after inlining, inlining is constrained by
3270 :option:`--param` :gcc-param:`large-function-growth`. This parameter is useful primarily
3271 to avoid extreme compilation time caused by non-linear algorithms used by the
3272 back end.
3273
3274 .. gcc-param:: large-function-growth
3275
3276 Specifies maximal growth of large function caused by inlining in percents.
3277 For example, parameter value 100 limits large function growth to 2.0 times
3278 the original size.
3279
3280 .. gcc-param:: large-unit-insns
3281
3282 The limit specifying large translation unit. Growth caused by inlining of
3283 units larger than this limit is limited by :option:`--param` :gcc-param:`inline-unit-growth`.
3284 For small units this might be too tight.
3285 For example, consider a unit consisting of function A
3286 that is inline and B that just calls A three times. If B is small relative to
3287 A, the growth of unit is 300\% and yet such inlining is very sane. For very
3288 large units consisting of small inlineable functions, however, the overall unit
3289 growth limit is needed to avoid exponential explosion of code size. Thus for
3290 smaller units, the size is increased to :option:`--param` :gcc-param:`large-unit-insns`
3291 before applying :option:`--param` :gcc-param:`inline-unit-growth`.
3292
3293 .. gcc-param:: lazy-modules
3294
3295 Maximum number of concurrently open C++ module files when lazy loading.
3296
3297 .. gcc-param:: inline-unit-growth
3298
3299 Specifies maximal overall growth of the compilation unit caused by inlining.
3300 For example, parameter value 20 limits unit growth to 1.2 times the original
3301 size. Cold functions (either marked cold via an attribute or by profile
3302 feedback) are not accounted into the unit size.
3303
3304 .. gcc-param:: ipa-cp-unit-growth
3305
3306 Specifies maximal overall growth of the compilation unit caused by
3307 interprocedural constant propagation. For example, parameter value 10 limits
3308 unit growth to 1.1 times the original size.
3309
3310 .. gcc-param:: ipa-cp-large-unit-insns
3311
3312 The size of translation unit that IPA-CP pass considers large.
3313
3314 .. gcc-param:: large-stack-frame
3315
3316 The limit specifying large stack frames. While inlining, the algorithm tries
3317 not to grow past this limit too much.
3318
3319 .. gcc-param:: large-stack-frame-growth
3320
3321 Specifies maximal growth of large stack frames caused by inlining in percents.
3322 For example, parameter value 1000 limits large stack frame growth to 11 times
3323 the original size.
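The growth parameters above (``large-function-growth``, ``inline-unit-growth``, ``ipa-cp-unit-growth`` and ``large-stack-frame-growth``) all express a percentage added on top of the original size. The helper below is only a sketch of that arithmetic; ``growth_limit`` is a hypothetical name and not part of GCC.

```c
/* Largest size allowed for something of size original_size when the
   growth parameter is percent: original * (100 + percent) / 100.  */
long growth_limit(long original_size, long percent)
{
    return original_size * (100 + percent) / 100;
}
```

With an original size of 1000 units, a value of 100 gives 2000 (2.0 times), 20 gives 1200 (1.2 times), 10 gives 1100 (1.1 times), and 1000 gives 11000 (11 times), matching the examples in the text.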
3324
3325 .. gcc-param:: max-inline-insns-recursive
3326 max-inline-insns-recursive-auto
3327
3328 Specifies the maximum number of instructions an out-of-line copy of a
3329 self-recursive inline
3330 function can grow into by performing recursive inlining.
3331
3332 :option:`--param` :gcc-param:`max-inline-insns-recursive` applies to functions
3333 declared inline.
3334 For functions not declared inline, recursive inlining
3335 happens only when :option:`-finline-functions` (included in :option:`-O3`) is
3336 enabled; :option:`--param` :gcc-param:`max-inline-insns-recursive-auto` applies instead.
3337
3338 .. gcc-param:: max-inline-recursive-depth
3339 max-inline-recursive-depth-auto
3340
3341 Specifies the maximum recursion depth used for recursive inlining.
3342
3343 :option:`--param` :gcc-param:`max-inline-recursive-depth` applies to functions
3344 declared inline. For functions not declared inline, recursive inlining
3345 happens only when :option:`-finline-functions` (included in :option:`-O3`) is
3346 enabled; :option:`--param` :gcc-param:`max-inline-recursive-depth-auto` applies instead.
3347
3348 .. gcc-param:: min-inline-recursive-probability
3349
3350 Recursive inlining is profitable only for functions that recurse deeply
3351 on average; for functions with little recursion depth it can hurt by
3352 increasing the prologue size or the complexity of the function body seen by other
3353 optimizers.
3354
3355 When profile feedback is available (see :option:`-fprofile-generate`) the actual
3356 recursion depth can be guessed from the probability that function recurses
3357 via a given call expression. This parameter limits inlining only to call
3358 expressions whose probability exceeds the given threshold (in percents).
3359
3360 .. gcc-param:: early-inlining-insns
3361
3362 Specify growth that the early inliner can make. In effect it increases
3363 the amount of inlining for code having a large abstraction penalty.
3364
3365 .. gcc-param:: max-early-inliner-iterations
3366
3367 Limit of iterations of the early inliner. This basically bounds
3368 the number of nested indirect calls the early inliner can resolve.
3369 Deeper chains are still handled by late inlining.
3370
3371 .. gcc-param:: comdat-sharing-probability
3372
3373 Probability (in percent) that C++ inline functions with comdat visibility
3374 are shared across multiple compilation units.
3375
3376 .. gcc-param:: modref-max-bases
3377 modref-max-refs
3378 modref-max-accesses
3379
3380 Specifies the maximal number of base pointers, references and accesses stored
3381 for a single function by mod/ref analysis.
3382
3383 .. gcc-param:: modref-max-tests
3384
3385 Specifies the maximal number of tests the alias oracle can perform to disambiguate
3386 memory locations using the mod/ref information. This parameter ought to be
3387 bigger than :option:`--param` :gcc-param:`modref-max-bases` and :option:`--param`
3388 :gcc-param:`modref-max-refs`.
3389
3390 .. gcc-param:: modref-max-depth
3391
3392 Specifies the maximum depth of DFS walk used by modref escape analysis.
3393 Setting to 0 disables the analysis completely.
3394
3395 .. gcc-param:: modref-max-escape-points
3396
3397 Specifies the maximum number of escape points tracked by modref per SSA-name.
3398
3399 .. gcc-param:: modref-max-adjustments
3400
3401 Specifies the maximum number of times the access range is enlarged during modref dataflow
3402 analysis.
3403
3404 .. gcc-param:: profile-func-internal-id
3405
3406 A parameter to control whether to use function internal id in profile
3407 database lookup. If the value is 0, the compiler uses an id that
3408 is based on function assembler name and filename, which makes old profile
3409 data more tolerant to source changes such as function reordering etc.
3410
3411 .. gcc-param:: min-vect-loop-bound
3412
3413 The minimum number of iterations under which loops are not vectorized
3414 when :option:`-ftree-vectorize` is used. The number of iterations after
3415 vectorization needs to be greater than the value specified by this option
3416 to allow vectorization.
3417
3418 .. gcc-param:: gcse-cost-distance-ratio
3419
3420 Scaling factor in calculation of maximum distance an expression
3421 can be moved by GCSE optimizations. This is currently supported only in the
3422 code hoisting pass. The bigger the ratio, the more aggressive code hoisting
3423 is with simple expressions, i.e., the expressions that have cost
3424 less than gcse-unrestricted-cost. Specifying 0 disables
3425 hoisting of simple expressions.
3426
3427 .. gcc-param:: gcse-unrestricted-cost
3428
3429 Cost, roughly measured as the cost of a single typical machine
3430 instruction, at which GCSE optimizations do not constrain
3431 the distance an expression can travel. This is currently
3432 supported only in the code hoisting pass. The lesser the cost,
3433 the more aggressive code hoisting is. Specifying 0
3434 allows all expressions to travel unrestricted distances.
3435
3436 .. gcc-param:: max-hoist-depth
3437
3438 The depth of search in the dominator tree for expressions to hoist.
3439 This is used to avoid quadratic behavior in the hoisting algorithm.
3440 A value of 0 places no limit on the search, but may slow down compilation
3441 of huge functions.
3442
3443 .. gcc-param:: max-tail-merge-comparisons
3444
3445 The maximum number of similar basic blocks to compare a basic block with. This is used to
3446 avoid quadratic behavior in tree tail merging.
3447
3448 .. gcc-param:: max-tail-merge-iterations
3449
3450 The maximum number of iterations of the pass over the function. This is used to
3451 limit compilation time in tree tail merging.
3452
3453 .. gcc-param:: store-merging-allow-unaligned
3454
3455 Allow the store merging pass to introduce unaligned stores if it is legal to
3456 do so.
3457
3458 .. gcc-param:: max-stores-to-merge
3459
3460 The maximum number of stores to attempt to merge into wider stores in the store
3461 merging pass.
3462
3463 .. gcc-param:: max-store-chains-to-track
3464
3465 The maximum number of store chains to track at the same time in the attempt
3466 to merge them into wider stores in the store merging pass.
3467
3468 .. gcc-param:: max-stores-to-track
3469
3470 The maximum number of stores to track at the same time in the attempt
3471 to merge them into wider stores in the store merging pass.
3472
3473 .. gcc-param:: max-unrolled-insns
3474
3475 The maximum number of instructions that a loop may have to be unrolled.
3476 If a loop is unrolled, this parameter also determines how many times
3477 the loop code is unrolled.
3478
3479 .. gcc-param:: max-average-unrolled-insns
3480
3481 The maximum number of instructions biased by probabilities of their execution
3482 that a loop may have to be unrolled. If a loop is unrolled,
3483 this parameter also determines how many times the loop code is unrolled.
3484
3485 .. gcc-param:: max-unroll-times
3486
3487 The maximum number of unrollings of a single loop.
3488
3489 .. gcc-param:: max-peeled-insns
3490
3491 The maximum number of instructions that a loop may have to be peeled.
3492 If a loop is peeled, this parameter also determines how many times
3493 the loop code is peeled.
3494
3495 .. gcc-param:: max-peel-times
3496
3497 The maximum number of peelings of a single loop.
3498
3499 .. gcc-param:: max-peel-branches
3500
3501 The maximum number of branches on the hot path through the peeled sequence.
3502
3503 .. gcc-param:: max-completely-peeled-insns
3504
3505 The maximum number of insns of a completely peeled loop.
3506
3507 .. gcc-param:: max-completely-peel-times
3508
3509 The maximum number of iterations of a loop to be suitable for complete peeling.
3510
3511 .. gcc-param:: max-completely-peel-loop-nest-depth
3512
3513 The maximum depth of a loop nest suitable for complete peeling.
3514
3515 .. gcc-param:: max-unswitch-insns
3516
3517 The maximum number of insns of an unswitched loop.
3518
3519 .. gcc-param:: lim-expensive
3520
3521 The minimum cost of an expensive expression in the loop invariant motion.
3522
3523 .. gcc-param:: min-loop-cond-split-prob
3524
3525 When FDO profile information is available, min-loop-cond-split-prob
3526 specifies minimum threshold for probability of semi-invariant condition
3527 statement to trigger loop split.
3528
3529 .. gcc-param:: iv-consider-all-candidates-bound
3530
3531 Bound on number of candidates for induction variables, below which
3532 all candidates are considered for each use in induction variable
3533 optimizations. If there are more candidates than this,
3534 only the most relevant ones are considered to avoid quadratic time complexity.
3535
3536 .. gcc-param:: iv-max-considered-uses
3537
3538 The induction variable optimizations give up on loops that contain more
3539 induction variable uses than this value.
3540
3541 .. gcc-param:: iv-always-prune-cand-set-bound
3542
3543 If the number of candidates in the set is smaller than this value,
3544 always try to remove unnecessary ivs from the set
3545 when adding a new one.
3546
3547 .. gcc-param:: avg-loop-niter
3548
3549 Average number of iterations of a loop.
3550
3551 .. gcc-param:: dse-max-object-size
3552
3553 Maximum size (in bytes) of objects tracked bytewise by dead store elimination.
3554 Larger values may result in larger compilation times.
3555
3556 .. gcc-param:: dse-max-alias-queries-per-store
3557
3558 Maximum number of queries into the alias oracle per store.
3559 Larger values result in larger compilation times and may result in more
3560 removed dead stores.
3561
3562 .. gcc-param:: scev-max-expr-size
3563
3564 Bound on size of expressions used in the scalar evolutions analyzer.
3565 Large expressions slow the analyzer.
3566
3567 .. gcc-param:: scev-max-expr-complexity
3568
3569 Bound on the complexity of the expressions in the scalar evolutions analyzer.
3570 Complex expressions slow the analyzer.
3571
3572 .. gcc-param:: max-tree-if-conversion-phi-args
3573
3574 Maximum number of arguments in a PHI supported by TREE if conversion
3575 unless the loop is marked with simd pragma.
3576
3577 .. gcc-param:: vect-max-layout-candidates
3578
3579 The maximum number of possible vector layouts (such as permutations)
3580 to consider when optimizing to-be-vectorized code.
3581
3582 .. gcc-param:: vect-max-version-for-alignment-checks
3583
3584 The maximum number of run-time checks that can be performed when
3585 doing loop versioning for alignment in the vectorizer.
3586
3587 .. gcc-param:: vect-max-version-for-alias-checks
3588
3589 The maximum number of run-time checks that can be performed when
3590 doing loop versioning for alias in the vectorizer.
3591
3592 .. gcc-param:: vect-max-peeling-for-alignment
3593
3594 The maximum number of loop peels to enhance access alignment
3595 for the vectorizer. Value -1 means no limit.
3596
3597 .. gcc-param:: max-iterations-to-track
3598
3599 The maximum number of iterations of a loop the brute-force algorithm
3600 for analysis of the number of iterations of the loop tries to evaluate.
3601
3602 .. gcc-param:: hot-bb-count-fraction
3603
3604 The denominator n of fraction 1/n of the maximal execution count of a
3605 basic block in the entire program that a basic block needs to at least
3606 have in order to be considered hot. The default is 10000, which means
3607 that a basic block is considered hot if its execution count is greater
3608 than 1/10000 of the maximal execution count. 0 means that it is never
3609 considered hot. Used in non-LTO mode.
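The threshold described above can be written as a one-line comparison. This is a sketch of the rule as stated in the text, not GCC's internal code: the function name is hypothetical, and integer division is used for simplicity, so results near the boundary may differ slightly from an exact fractional comparison.

```c
/* Returns nonzero if a basic block with the given execution count is
   hot, i.e. its count is greater than max_count / fraction.
   A fraction of 0 means the block is never considered hot.  */
int is_hot_by_count(unsigned long long count,
                    unsigned long long max_count,
                    unsigned fraction)
{
    if (fraction == 0)
        return 0;
    return count > max_count / fraction;
}
```

For example, with the default of 10000 and a maximal execution count of 10000, a block executed twice is hot and a block executed once is not.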
3610
3611 .. gcc-param:: hot-bb-count-ws-permille
3612
3613 The number of most executed permilles, ranging from 0 to 1000, of the
3614 profiled execution of the entire program that the execution count
3615 of a basic block must be part of in order to be considered hot. The
3616 default is 990, which means that a basic block is considered hot if
3617 its execution count contributes to the upper 990 permilles, or 99.0%,
3618 of the profiled execution of the entire program. 0 means that it is
3619 never considered hot. Used in LTO mode.
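
One possible reading of this definition can be sketched in Python as follows. This is only an approximation under stated assumptions: the helper name is hypothetical, the input is a plain list of per-block execution counts, and GCC itself derives the cut-off from its profile data, not from a list like this.

```python
def hot_count_cutoff(bb_counts, ws_permille=990):
    """Smallest execution count still inside the top ws_permille of execution.

    Returns None when ws_permille is 0 (no block is ever considered hot).
    """
    if ws_permille == 0:
        return None
    total = sum(bb_counts)
    covered = 0
    # Walk the blocks from most to least executed until the requested
    # share of the total profiled execution is covered.
    for count in sorted(bb_counts, reverse=True):
        covered += count
        if covered * 1000 >= total * ws_permille:
            return count
    return 0


print(hot_count_cutoff([900, 90, 9, 1]))  # 90
```

In this sketch, blocks whose count is at least the returned cut-off would be treated as hot.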
3620
3621 .. gcc-param:: hot-bb-frequency-fraction
3622
3623 The denominator n of fraction 1/n of the execution frequency of the
3624 entry block of a function that a basic block of this function needs
3625 to at least have in order to be considered hot. The default is 1000,
3626 which means that a basic block is considered hot in a function if it
3627 is executed more frequently than 1/1000 of the frequency of the entry
3628 block of the function. 0 means that it is never considered hot.
3629
3630 .. gcc-param:: unlikely-bb-count-fraction
3631
3632 The denominator n of fraction 1/n of the number of profiled runs of
3633 the entire program below which the execution count of a basic block
3634 must be in order for the basic block to be considered unlikely executed.
3635 The default is 20, which means that a basic block is considered unlikely
3636 executed if it is executed in fewer than 1/20, or 5%, of the runs of
3637 the program. 0 means that it is always considered unlikely executed.
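
The test described above can be sketched in Python as follows. The names are hypothetical and the comparison simply mirrors the wording of this description; it is not taken from the GCC sources.

```python
def is_unlikely_executed(bb_count, profiled_runs, fraction=20):
    """Return True if a block counts as unlikely executed under the 1/fraction rule."""
    if fraction == 0:
        # A value of 0 means the block is always considered unlikely executed.
        return True
    # Equivalent to bb_count < profiled_runs / fraction, without division.
    return bb_count * fraction < profiled_runs


print(is_unlikely_executed(4, 100))  # True
print(is_unlikely_executed(5, 100))  # False
```

With the default of 20 and 100 profiled runs, a block executed 4 times falls below the 5% line, while a block executed 5 times does not.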
3638
3639 .. gcc-param:: max-predicted-iterations
3640
3641 The maximum number of loop iterations we predict statically. This is useful
3642 in cases where a function contains a single loop with known bound and
3643 another loop with unknown bound.
3644 The known number of iterations is predicted correctly, while
3645 the unknown number of iterations averages to roughly 10. This means that the
3646 loop without bounds appears artificially cold relative to the other one.
3647
3648 .. gcc-param:: builtin-expect-probability
3649
3650 Control the probability of the expression having the specified value. This
3651 parameter takes a percentage (i.e. 0 ... 100) as input.
3652
3653 .. gcc-param:: builtin-string-cmp-inline-length
3654
3655 The maximum length of a constant string for a builtin string cmp call
3656 eligible for inlining.
3657
3658 .. gcc-param:: align-threshold
3659
3660 Select fraction of the maximal frequency of executions of a basic block in
3661 a function to align the basic block.
3662
3663 .. gcc-param:: align-loop-iterations
3664
3665 A loop expected to iterate at least the selected number of iterations is
3666 aligned.
3667
3668 .. gcc-param:: tracer-dynamic-coverage
3669 tracer-dynamic-coverage-feedback
3670
3671 This value is used to limit superblock formation once the given percentage of
3672 executed instructions is covered. This limits unnecessary code size
3673 expansion.
3674
3675 The tracer-dynamic-coverage-feedback parameter
3676 is used only when profile
3677 feedback is available. The real profiles (as opposed to statically estimated
3678 ones) are much less balanced, allowing the threshold to be a larger value.
3679
3680 .. gcc-param:: tracer-max-code-growth
3681
3682 Stop tail duplication once code growth has reached a given percentage. This is
3683 a rather artificial limit, as most of the duplicates are eliminated later in
3684 cross jumping, so it may be set to much higher values than is the desired code
3685 growth.
3686
3687 .. gcc-param:: tracer-min-branch-ratio
3688
3689 Stop reverse growth when the reverse probability of best edge is less than this
3690 threshold (in percent).
3691
3692 .. gcc-param:: tracer-min-branch-probability
3693 tracer-min-branch-probability-feedback
3694
3695 Stop forward growth if the best edge has probability lower than this
3696 threshold.
3697
3698 Similarly to tracer-dynamic-coverage two parameters are
3699 provided. tracer-min-branch-probability-feedback is used for
3700 compilation with profile feedback and tracer-min-branch-probability for
3701 compilation without. The value for compilation with profile feedback
3702 needs to be more conservative (higher) in order to make tracer
3703 effective.
3704
3705 .. gcc-param:: stack-clash-protection-guard-size
3706
3707 Specify the size of the operating system provided stack guard as
3708 2 raised to :samp:`{num}` bytes. Higher values may reduce the
3709 number of explicit probes, but a value larger than the operating system
3710 provided guard will leave code vulnerable to stack clash style attacks.
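
To make the power-of-two encoding concrete, here is a small Python sketch. The helper names and the idea of passing the operating system's guard size as an argument are assumptions made for illustration only.

```python
def guard_size_bytes(num):
    """Guard size implied by the parameter value: 2 raised to num bytes."""
    return 1 << num


def leaves_code_vulnerable(num, os_guard_bytes):
    """True if the configured guard is larger than the guard the OS provides."""
    return guard_size_bytes(num) > os_guard_bytes


print(guard_size_bytes(12))  # 4096
print(guard_size_bytes(16))  # 65536
print(leaves_code_vulnerable(16, 4096))  # True
```

A value of 12 corresponds to a 4 KiB guard and 16 to a 64 KiB guard; the last call shows the unsafe case in which the parameter claims a larger guard than the system supplies.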
3711
3712 .. gcc-param:: stack-clash-protection-probe-interval
3713
3714 Stack clash protection involves probing stack space as it is allocated. This
3715 param controls the maximum distance between probes into the stack as 2 raised
3716 to :samp:`{num}` bytes. Higher values may reduce the number of explicit probes, but a value
3717 larger than the operating system provided guard will leave code vulnerable to
3718 stack clash style attacks.
3719
3720 .. gcc-param:: max-cse-path-length
3721
3722 The maximum number of basic blocks on a path that CSE considers.
3723
3724 .. gcc-param:: max-cse-insns
3725
3726 The maximum number of instructions CSE processes before flushing.
3727
3728 .. gcc-param:: ggc-min-expand
3729
3730 GCC uses a garbage collector to manage its own memory allocation. This
3731 parameter specifies the minimum percentage by which the garbage
3732 collector's heap should be allowed to expand between collections.
3733 Tuning this may improve compilation speed; it has no effect on code
3734 generation.
3735
3736 The default is 30% + 70% \* (RAM/1GB) with an upper bound of 100% when
3737 RAM >= 1GB. If ``getrlimit`` is available, the notion of 'RAM' is
3738 the smallest of actual RAM and ``RLIMIT_DATA`` or ``RLIMIT_AS``. If
3739 GCC is not able to calculate RAM on a particular platform, the lower
3740 bound of 30% is used. Setting this parameter and
3741 ggc-min-heapsize to zero causes a full collection to occur at
3742 every opportunity. This is extremely slow, but can be useful for
3743 debugging.
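
The default formula above can be worked through with a short Python sketch. The function name is hypothetical, '1GB' is assumed to mean 2**30 bytes, and the ``getrlimit`` adjustment of 'RAM' is left out.

```python
GIB = 1 << 30  # assumption: "1GB" in the formula means 2**30 bytes


def default_ggc_min_expand(ram_bytes=None):
    """Default expansion percentage: 30% + 70% * (RAM/1GB), capped at 100%."""
    if ram_bytes is None:
        # RAM could not be determined: the lower bound of 30% is used.
        return 30.0
    return min(100.0, 30.0 + 70.0 * ram_bytes / GIB)


print(default_ggc_min_expand(512 * 1024 * 1024))  # 65.0
print(default_ggc_min_expand(2 * GIB))            # 100.0
print(default_ggc_min_expand())                   # 30.0
```

A machine with 512 MiB of RAM would get a default of 65%, and any machine with 1 GiB or more reaches the 100% cap.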
3744
3745 .. gcc-param:: ggc-min-heapsize
3746
3747 Minimum size of the garbage collector's heap before it begins bothering
3748 to collect garbage. The first collection occurs after the heap expands
3749 by ggc-min-expand % beyond ggc-min-heapsize. Again,
3750 tuning this may improve compilation speed, and has no effect on code
3751 generation.
3752
3753 The default is the smaller of RAM/8, RLIMIT_RSS, or a limit that
3754 tries to ensure that RLIMIT_DATA or RLIMIT_AS are not exceeded, but
3755 with a lower bound of 4096 (four megabytes) and an upper bound of
3756 131072 (128 megabytes). If GCC is not able to calculate RAM on a
3757 particular platform, the lower bound is used. Setting this parameter
3758 very large effectively disables garbage collection. Setting this
3759 parameter and ggc-min-expand to zero causes a full collection
3760 to occur at every opportunity.
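
A simplified Python sketch of the default and of the first-collection point follows. It assumes all sizes are in kilobytes, as the bounds above suggest, uses hypothetical function names, and ignores the RLIMIT_RSS, RLIMIT_DATA and RLIMIT_AS terms.

```python
def default_ggc_min_heapsize_kb(ram_kb=None):
    """RAM/8 clamped to [4096, 131072] kilobytes; resource limits are ignored here."""
    lower, upper = 4096, 131072
    if ram_kb is None:
        # RAM could not be determined: the lower bound is used.
        return lower
    return max(lower, min(upper, ram_kb // 8))


def first_collection_heap_kb(min_heapsize_kb, min_expand_percent):
    """Heap size at which the first collection occurs."""
    return min_heapsize_kb * (100 + min_expand_percent) / 100


print(default_ggc_min_heapsize_kb(512 * 1024))        # 65536
print(default_ggc_min_heapsize_kb(16 * 1024 * 1024))  # 131072
print(first_collection_heap_kb(4096, 30))
```

With a 4096 kB minimum heap and ggc-min-expand at 30, the first collection in this model happens once the heap has grown 30% beyond 4096 kB.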
3761
3762 .. gcc-param:: max-reload-search-insns
3763
3764 The maximum number of instructions reload should look backward for an equivalent
3765 register. Increasing values mean more aggressive optimization, making the
3766 compilation time increase with probably slightly better performance.
3767
3768 .. gcc-param:: max-cselib-memory-locations
3769
3770 The maximum number of memory locations cselib should take into account.
3771 Increasing values mean more aggressive optimization, making the compilation time
3772 increase with probably slightly better performance.
3773
3774 .. gcc-param:: max-sched-ready-insns
3775
3776 The maximum number of instructions ready to be issued the scheduler should
3777 consider at any given time during the first scheduling pass. Increasing
3778 values mean more thorough searches, making the compilation time increase
3779 with probably little benefit.
3780
3781 .. gcc-param:: max-sched-region-blocks
3782
3783 The maximum number of blocks in a region to be considered for
3784 interblock scheduling.
3785
3786 .. gcc-param:: max-pipeline-region-blocks
3787
3788 The maximum number of blocks in a region to be considered for
3789 pipelining in the selective scheduler.
3790
3791 .. gcc-param:: max-sched-region-insns
3792
3793 The maximum number of insns in a region to be considered for
3794 interblock scheduling.
3795
3796 .. gcc-param:: max-pipeline-region-insns
3797
3798 The maximum number of insns in a region to be considered for
3799 pipelining in the selective scheduler.
3800
3801 .. gcc-param:: min-spec-prob
3802
3803 The minimum probability (in percent) of reaching a source block
3804 for interblock speculative scheduling.
3805
3806 .. gcc-param:: max-sched-extend-regions-iters
3807
3808 The maximum number of iterations through CFG to extend regions.
3809 A value of 0 disables region extensions.
3810
3811 .. gcc-param:: max-sched-insn-conflict-delay
3812
3813 The maximum conflict delay for an insn to be considered for speculative motion.
3814
3815 .. gcc-param:: sched-spec-prob-cutoff
3816
3817 The minimal probability of speculation success (in percent), so that
3818 speculative insns are scheduled.
3819
3820 .. gcc-param:: sched-state-edge-prob-cutoff
3821
3822 The minimum probability an edge must have for the scheduler to save its
3823 state across it.
3824
3825 .. gcc-param:: sched-mem-true-dep-cost
3826
3827 Minimal distance (in CPU cycles) between store and load targeting the same
3828 memory locations.
3829
3830 .. gcc-param:: selsched-max-lookahead
3831
3832 The maximum size of the lookahead window of selective scheduling. It is a
3833 depth of search for available instructions.
3834
3835 .. gcc-param:: selsched-max-sched-times
3836
3837 The maximum number of times that an instruction is scheduled during
3838 selective scheduling. This is the limit on the number of iterations
3839 through which the instruction may be pipelined.
3840
3841 .. gcc-param:: selsched-insns-to-rename
3842
3843 The maximum number of best instructions in the ready list that are considered
3844 for renaming in the selective scheduler.
3845
3846 .. gcc-param:: sms-min-sc
3847
3848 The minimum value of stage count that swing modulo scheduler
3849 generates.
3850
3851 .. gcc-param:: max-last-value-rtl
3852
3853 The maximum size measured as number of RTLs that can be recorded in an expression
3854 in combiner for a pseudo register as last known value of that register.
3855
3856 .. gcc-param:: max-combine-insns
3857
3858 The maximum number of instructions the RTL combiner tries to combine.
3859
3860 .. gcc-param:: integer-share-limit
3861
3862 Small integer constants can use a shared data structure, reducing the
3863 compiler's memory usage and increasing its speed. This sets the maximum
3864 value of a shared integer constant.
3865
3866 .. gcc-param:: ssp-buffer-size
3867
3868 The minimum size of buffers (i.e. arrays) that receive stack smashing
3869 protection when :option:`-fstack-protector` is used.
3870
3871 .. gcc-param:: min-size-for-stack-sharing
3872
3873 The minimum size of variables taking part in stack slot sharing when not
3874 optimizing.
3875
3876 .. gcc-param:: max-jump-thread-duplication-stmts
3877
3878 Maximum number of statements allowed in a block that needs to be
3879 duplicated when threading jumps.
3880
3881 .. gcc-param:: max-jump-thread-paths
3882
3883 The maximum number of paths to consider when searching for jump threading
3884 opportunities. When arriving at a block, incoming edges are only considered
3885 if the number of paths to be searched so far multiplied by the number of
3886 incoming edges does not exhaust the specified maximum number of paths to
3887 consider.
3888
3889 .. gcc-param:: max-fields-for-field-sensitive
3890
3891 Maximum number of fields in a structure treated in
3892 a field sensitive manner during pointer analysis.
3893
3894 .. gcc-param:: prefetch-latency
3895
3896 Estimate of the average number of instructions that are executed before
3897 prefetch finishes. The distance prefetched ahead is proportional
3898 to this constant. Increasing this number may also lead to fewer
3899 streams being prefetched (see :gcc-param:`simultaneous-prefetches`).
3900
3901 .. gcc-param:: simultaneous-prefetches
3902
3903 Maximum number of prefetches that can run at the same time.
3904
3905 .. gcc-param:: l1-cache-line-size
3906
3907 The size of a cache line in the L1 data cache, in bytes.
3908
3909 .. gcc-param:: l1-cache-size
3910
3911 The size of L1 data cache, in kilobytes.
3912
3913 .. gcc-param:: l2-cache-size
3914
3915 The size of L2 data cache, in kilobytes.
3916
3917 .. gcc-param:: prefetch-dynamic-strides
3918
3919 Whether the loop array prefetch pass should issue software prefetch hints
3920 for strides that are non-constant. In some cases this may be
3921 beneficial, though the fact the stride is non-constant may make it
3922 hard to predict when there is clear benefit to issuing these hints.
3923
3924 Set to 1 if the prefetch hints should be issued for non-constant
3925 strides. Set to 0 if prefetch hints should be issued only for strides that
3926 are known to be constant and not below prefetch-minimum-stride.
3927
3928 .. gcc-param:: prefetch-minimum-stride
3929
3930 Minimum constant stride, in bytes, to start using prefetch hints for. If
3931 the stride is less than this threshold, prefetch hints will not be issued.
3932
3933 This setting is useful for processors that have hardware prefetchers, in
3934 which case there may be conflicts between the hardware prefetchers and
3935 the software prefetchers. If the hardware prefetchers have a maximum
3936 stride they can handle, it should be used here to improve the use of
3937 software prefetchers.
3938
3939 A value of -1 means we don't have a threshold and therefore
3940 prefetch hints can be issued for any constant stride.
3941
3942 This setting is only useful for strides that are known and constant.
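
The interaction between this parameter and prefetch-dynamic-strides can be sketched in Python as below. The function is hypothetical; representing a non-constant stride as ``None`` and comparing the magnitude of the stride are assumptions of this sketch, not statements about the GCC implementation.

```python
def should_issue_prefetch(stride, dynamic_strides, minimum_stride):
    """Decide whether a software prefetch hint would be issued.

    stride: constant stride in bytes, or None if the stride is not constant.
    dynamic_strides: value of prefetch-dynamic-strides (0 or 1).
    minimum_stride: value of prefetch-minimum-stride (-1 means no threshold).
    """
    if stride is None:
        # Non-constant strides are governed by prefetch-dynamic-strides only.
        return dynamic_strides == 1
    if minimum_stride == -1:
        # No threshold: any constant stride qualifies.
        return True
    # Assumption: the magnitude of the stride is compared with the threshold.
    return abs(stride) >= minimum_stride


print(should_issue_prefetch(64, 0, 256))    # False
print(should_issue_prefetch(512, 0, 256))   # True
print(should_issue_prefetch(None, 1, 256))  # True
```

Here a 64-byte stride is left to the hardware prefetcher because it is below the 256-byte threshold, while a 512-byte stride receives a software hint.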
3943
3944 .. gcc-param:: destructive-interference-size
3945 constructive-interference-size
3946
3947 The values for the C++17 variables
3948 ``std::hardware_destructive_interference_size`` and
3949 ``std::hardware_constructive_interference_size``. The destructive
3950 interference size is the minimum recommended offset between two
3951 independent concurrently-accessed objects; the constructive
3952 interference size is the maximum recommended size of contiguous memory
3953 accessed together. Typically both will be the size of an L1 cache
3954 line for the target, in bytes. For a generic target covering a range of L1
3955 cache line sizes, typically the constructive interference size will be
3956 the small end of the range and the destructive size will be the large
3957 end.
3958
3959 The destructive interference size is intended to be used for layout,
3960 and thus has ABI impact. The default value is not expected to be
3961 stable, and on some targets varies with :option:`-mtune`, so use of
3962 this variable in a context where ABI stability is important, such as
3963 the public interface of a library, is strongly discouraged; if it is
3964 used in that context, users can stabilize the value using this
3965 option.
3966
3967 The constructive interference size is less sensitive, as it is
3968 typically only used in a :samp:`static_assert` to make sure that a type
3969 fits within a cache line.
3970
3971 See also :option:`-Winterference-size`.
3972
3973 .. gcc-param:: loop-interchange-max-num-stmts
3974
3975 The maximum number of stmts in a loop to be interchanged.
3976
3977 .. gcc-param:: loop-interchange-stride-ratio
3978
3979 The minimum ratio between stride of two loops for interchange to be profitable.
3980
3981 .. gcc-param:: min-insn-to-prefetch-ratio
3982
3983 The minimum ratio between the number of instructions and the
3984 number of prefetches to enable prefetching in a loop.
3985
3986 .. gcc-param:: prefetch-min-insn-to-mem-ratio
3987
3988 The minimum ratio between the number of instructions and the
3989 number of memory references to enable prefetching in a loop.
3990
3991 .. gcc-param:: use-canonical-types
3992
3993 Whether the compiler should use the 'canonical' type system.
3994 Should always be 1, which uses a more efficient internal
3995 mechanism for comparing types in C++ and Objective-C++. However, if
3996 bugs in the canonical type system are causing compilation failures,
3997 set this value to 0 to disable canonical types.
3998
3999 .. gcc-param:: switch-conversion-max-branch-ratio
4000
4001 Switch initialization conversion refuses to create arrays that are
4002 bigger than switch-conversion-max-branch-ratio times the number of
4003 branches in the switch.
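
The size test can be illustrated with a short Python sketch. It assumes that the array needs one entry per value in the range spanned by the case labels; the function and its arguments are hypothetical, and the ratio is passed explicitly so that no default value is implied.

```python
def array_too_big(case_min, case_max, num_branches, max_branch_ratio):
    """True if the lookup array would exceed ratio * number of branches."""
    # Assumption: one array entry per value between the smallest and
    # largest case label, inclusive.
    array_entries = case_max - case_min + 1
    return array_entries > max_branch_ratio * num_branches


print(array_too_big(0, 99, 4, 8))  # True
print(array_too_big(0, 9, 4, 8))   # False
```

With 4 branches and a ratio of 8, case labels spread over 100 values would need an array of 100 entries against a limit of 32, so the conversion would be refused in this model.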
4004
4005 .. gcc-param:: max-partial-antic-length
4006
4007 Maximum length of the partial antic set computed during the tree
4008 partial redundancy elimination optimization (:option:`-ftree-pre`) when
4009 optimizing at :option:`-O3` and above. For some sorts of source code
4010 the enhanced partial redundancy elimination optimization can run away,
4011 consuming all of the memory available on the host machine. This
4012 parameter sets a limit on the length of the sets that are computed,
4013 which prevents the runaway behavior. Setting a value of 0 for
4014 this parameter allows an unlimited set length.
4015
4016 .. gcc-param:: rpo-vn-max-loop-depth
4017
4018 Maximum loop depth that is value-numbered optimistically.
4019 When the limit is hit, the innermost
4020 :samp:`{rpo-vn-max-loop-depth}` loops and the outermost loop in the
4021 loop nest are value-numbered optimistically and the remaining ones are not.
4022
4023 .. gcc-param:: sccvn-max-alias-queries-per-access
4024
4025 Maximum number of alias-oracle queries we perform when looking for
4026 redundancies for loads and stores. If this limit is hit the search
4027 is aborted and the load or store is not considered redundant. The
4028 number of queries is algorithmically limited to the number of
4029 stores on all paths from the load to the function entry.
4030
4031 .. gcc-param:: ira-max-loops-num
4032
4033 IRA uses regional register allocation by default. If a function
4034 contains more loops than the number given by this parameter, only at most
4035 the given number of the most frequently-executed loops form regions
4036 for regional register allocation.
4037
4038 .. gcc-param:: ira-max-conflict-table-size
4039
4040 Although IRA uses a sophisticated algorithm to compress the conflict
4041 table, the table can still require excessive amounts of memory for
4042 huge functions. If the conflict table for a function could be more
4043 than the size in MB given by this parameter, the register allocator
4044 instead uses a faster, simpler, and lower-quality
4045 algorithm that does not require building a pseudo-register conflict table.
4046
4047 .. gcc-param:: ira-loop-reserved-regs
4048
4049 IRA can be used to evaluate more accurate register pressure in loops
4050 for decisions to move loop invariants (see :option:`-O3`). The number
4051 of available registers reserved for some other purposes is given
4052 by this parameter. The default value of the parameter
4053 is the best found from numerous experiments.
4054
4055 .. gcc-param:: ira-consider-dup-in-all-alts
4056
4057 Make IRA consider matching constraint (duplicated operand number)
4058 heavily in all available alternatives for preferred register class.
4059 If it is set as zero, it means IRA only respects the matching
4060 constraint when it's in the only available alternative with an
4061 appropriate register class. Otherwise, it means IRA will check all
4062 available alternatives for preferred register class even if it has
4063 found some choice with an appropriate register class and respect the
4064 found qualified matching constraint.
4065
4066 .. gcc-param:: lra-inheritance-ebb-probability-cutoff
4067
4068 LRA tries to reuse values reloaded in registers in subsequent insns.
4069 This optimization is called inheritance. EBB is used as a region to
4070 do this optimization. The parameter defines a minimal fall-through
4071 edge probability in percentage used to add BB to inheritance EBB in
4072 LRA. The default value was chosen
4073 from numerous runs of SPEC2000 on x86-64.
4074
4075 .. gcc-param:: loop-invariant-max-bbs-in-loop
4076
4077 Loop invariant motion can be very expensive, both in compilation time and
4078 in amount of needed compile-time memory, with very large loops. Loops
4079 with more basic blocks than this parameter won't have loop invariant
4080 motion optimization performed on them.
4081
4082 .. gcc-param:: loop-max-datarefs-for-datadeps
4083
4084 Building data dependencies is expensive for very large loops. This
4085 parameter limits the number of data references in loops that are
4086 considered for data dependence analysis. These large loops are not
4087 handled by the optimizations using loop data dependencies.
4088
4089 .. gcc-param:: max-vartrack-size
4090
4091 Sets a maximum number of hash table slots to use during variable
4092 tracking dataflow analysis of any function. If this limit is exceeded
4093 with variable tracking at assignments enabled, analysis for that
4094 function is retried without it, after removing all debug insns from
4095 the function. If the limit is exceeded even without debug insns, var
4096 tracking analysis is completely disabled for the function. Setting
4097 the parameter to zero makes it unlimited.
4098
4099 .. gcc-param:: max-vartrack-expr-depth
4100
4101 Sets a maximum number of recursion levels when attempting to map
4102 variable names or debug temporaries to value expressions. This trades
4103 compilation time for more complete debug information. If this is set too
4104 low, value expressions that are available and could be represented in
4105 debug information may end up not being used; setting this higher may
4106 enable the compiler to find more complex debug expressions, but compile
4107 time and memory use may grow.
4108
4109 .. gcc-param:: max-debug-marker-count
4110
4111 Sets a threshold on the number of debug markers (e.g. begin stmt
4112 markers) to avoid complexity explosion at inlining or expanding to RTL.
4113 If a function has more such gimple stmts than the set limit, such stmts
4114 will be dropped from the inlined copy of a function, and from its RTL
4115 expansion.
4116
4117 .. gcc-param:: min-nondebug-insn-uid
4118
4119 Use uids starting at this parameter for nondebug insns. The range below
4120 the parameter is reserved exclusively for debug insns created by
4121 :option:`-fvar-tracking-assignments`, but debug insns may get
4122 (non-overlapping) uids above it if the reserved range is exhausted.
4123
4124 .. gcc-param:: ipa-sra-ptr-growth-factor
4125
4126 IPA-SRA replaces a pointer to an aggregate with one or more new
4127 parameters only when their cumulative size is less than or equal to
4128 ipa-sra-ptr-growth-factor times the size of the original
4129 pointer parameter.
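
The size condition can be written out as a small Python sketch. The function name and the representation of the replacements as a list of sizes in bytes are assumptions made for illustration.

```python
def replacement_allowed(new_param_sizes, pointer_size, growth_factor):
    """True if the new parameters fit within growth_factor * pointer size."""
    return sum(new_param_sizes) <= growth_factor * pointer_size


print(replacement_allowed([8, 8], 8, 2))     # True
print(replacement_allowed([8, 8, 4], 8, 2))  # False
```

With an 8-byte pointer and a factor of 2, two 8-byte replacements are accepted (16 bytes in total), while adding a third 4-byte one pushes the total over the limit.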
4130
4131 .. gcc-param:: ipa-sra-max-replacements
4132
4133 Maximum pieces of an aggregate that IPA-SRA tracks. As a
4134 consequence, it is also the maximum number of replacements of a formal
4135 parameter.
4136
4137 .. gcc-param:: sra-max-scalarization-size-Ospeed
4138 sra-max-scalarization-size-Osize
4139
4140 The two Scalar Replacement of Aggregates passes (SRA and IPA-SRA) aim to
4141 replace scalar parts of aggregates with uses of independent scalar
4142 variables. These parameters control the maximum size, in storage units,
4143 of aggregate which is considered for replacement when compiling for
4144 speed
4145 (:gcc-param:`sra-max-scalarization-size-Ospeed`) or size
4146 (:gcc-param:`sra-max-scalarization-size-Osize`) respectively.
4147
4148 .. gcc-param:: sra-max-propagations
4149
4150 The maximum number of artificial accesses that Scalar Replacement of
4151 Aggregates (SRA) will track, per one local variable, in order to
4152 facilitate copy propagation.
4153
4154 .. gcc-param:: tm-max-aggregate-size
4155
4156 When making copies of thread-local variables in a transaction, this
4157 parameter specifies the size in bytes after which variables are
4158 saved with the logging functions as opposed to save/restore code
4159 sequence pairs. This option only applies when using
4160 :option:`-fgnu-tm`.
4161
4162 .. gcc-param:: graphite-max-nb-scop-params
4163
4164 To avoid exponential effects in the Graphite loop transforms, the
4165 number of parameters in a Static Control Part (SCoP) is bounded.
4166 A value of zero can be used to lift
4167 the bound. A variable whose value is unknown at compilation time and
4168 defined outside a SCoP is a parameter of the SCoP.
4169
4170 .. gcc-param:: loop-block-tile-size
4171
4172 Loop blocking or strip mining transforms, enabled with
4173 :option:`-floop-block` or :option:`-floop-strip-mine`, strip mine each
4174 loop in the loop nest by a given number of iterations. The strip
4175 length can be changed using the loop-block-tile-size
4176 parameter.
4177
4178 .. gcc-param:: ipa-jump-function-lookups
4179
4180 Specifies the number of statements visited during jump function offset discovery.
4181
4182 .. gcc-param:: ipa-cp-value-list-size
4183
4184 IPA-CP attempts to track all possible values and types passed to a function's
4185 parameter in order to propagate them and perform devirtualization.
4186 :gcc-param:`ipa-cp-value-list-size` is the maximum number of values and types it
4187 stores per one formal parameter of a function.
4188
4189 .. gcc-param:: ipa-cp-eval-threshold
4190
4191 IPA-CP calculates its own score of cloning profitability heuristics
4192 and performs those cloning opportunities with scores that exceed
4193 :gcc-param:`ipa-cp-eval-threshold`.
4194
4195 .. gcc-param:: ipa-cp-max-recursive-depth
4196
4197 Maximum depth of recursive cloning for a self-recursive function.
4198
4199 .. gcc-param:: ipa-cp-min-recursive-probability
4200
4201 Recursive cloning is performed only when the probability of the call being executed exceeds
4202 this parameter.
4203
4204 .. gcc-param:: ipa-cp-profile-count-base
4205
4206 When using :option:`-fprofile-use` option, IPA-CP will consider the measured
4207 execution count of a call graph edge at this percentage position in their
4208 histogram as the basis for its heuristics calculation.
4209
4210 .. gcc-param:: ipa-cp-recursive-freq-factor
4211
4212 The number of times interprocedural copy propagation expects recursive
4213 functions to call themselves.
4214
4215 .. gcc-param:: ipa-cp-recursion-penalty
4216
4217 Percentage penalty the recursive functions will receive when they
4218 are evaluated for cloning.
4219
4220 .. gcc-param:: ipa-cp-single-call-penalty
4221
4222 Percentage penalty functions containing a single call to another
4223 function will receive when they are evaluated for cloning.
4224
4225 .. gcc-param:: ipa-max-agg-items
4226
4227 IPA-CP is also capable of propagating a number of scalar values passed
4228 in an aggregate. :gcc-param:`ipa-max-agg-items` controls the maximum
4229 number of such values per one parameter.
4230
4231 .. gcc-param:: ipa-cp-loop-hint-bonus
4232
4233 When IPA-CP determines that a cloning candidate would make the number
4234 of iterations of a loop known, it adds a bonus of
4235 ipa-cp-loop-hint-bonus to the profitability score of
4236 the candidate.
4237
4238 .. gcc-param:: ipa-max-loop-predicates
4239
4240 The maximum number of different predicates IPA will use to describe when
4241 loops in a function have known properties.
4242
4243 .. gcc-param:: ipa-max-aa-steps
4244
4245 During its analysis of function bodies, IPA-CP employs alias analysis
4246 in order to track values pointed to by function parameters. In order
4247 not to spend too much time analyzing huge functions, it gives up and
4248 considers all memory clobbered after examining
4249 :gcc-param:`ipa-max-aa-steps` statements modifying memory.
4250
4251 .. gcc-param:: ipa-max-switch-predicate-bounds
4252
4253 Maximal number of boundary endpoints of case ranges of a switch statement.
4254 For a switch exceeding this limit, IPA-CP will not construct a cloning cost
4255 predicate, which is used to estimate cloning benefit, for the default case
4256 of the switch statement.
4257
4258 .. gcc-param:: ipa-max-param-expr-ops
4259
4260 IPA-CP will analyze a conditional statement that references some function
4261 parameter to estimate the benefit of cloning upon a certain constant value.
4262 But if the number of operations in a parameter expression exceeds
4263 :gcc-param:`ipa-max-param-expr-ops`, the expression is treated as a complicated
4264 one, and is not handled by IPA analysis.
4265
4266 .. gcc-param:: lto-partitions
4267
4268 Specify the desired number of partitions produced during WHOPR compilation.
4269 The number of partitions should exceed the number of CPUs used for compilation.
4270
4271 .. gcc-param:: lto-min-partition
4272
4273 Size of minimal partition for WHOPR (in estimated instructions).
4274 This prevents expenses of splitting very small programs into too many
4275 partitions.
4276
4277 .. gcc-param:: lto-max-partition
4278
4279 Size of the maximal partition for WHOPR (in estimated instructions),
4280 which provides an upper bound on the size of an individual partition.
4281 Meant to be used only with balanced partitioning.
4282
4283 .. gcc-param:: lto-max-streaming-parallelism
4284
4285 Maximal number of parallel processes used for LTO streaming.
4286
4287 .. gcc-param:: cxx-max-namespaces-for-diagnostic-help
4288
4289 The maximum number of namespaces to consult for suggestions when C++
4290 name lookup fails for an identifier.
4291
4292 .. gcc-param:: sink-frequency-threshold
4293
4294 The maximum relative execution frequency (in percent) of the target block
4295 relative to a statement's original block to allow statement sinking of a
4296 statement. Larger numbers result in more aggressive statement sinking.
4297 A small positive adjustment is applied for
4298 statements with memory operands as those are even more profitable to sink.
4299
4300 .. gcc-param:: max-stores-to-sink
4301
4302 The maximum number of conditional store pairs that can be sunk. Set to 0
4303 if either vectorization (:option:`-ftree-vectorize`) or if-conversion
4304 (:option:`-ftree-loop-if-convert`) is disabled.
4305
4306 .. gcc-param:: case-values-threshold
4307
4308 The smallest number of different values for which it is best to use a
4309 jump-table instead of a tree of conditional branches. If the value is
4310 0, use the default for the machine.
4311
4312 .. gcc-param:: jump-table-max-growth-ratio-for-size
4313
4314 The maximum code size growth ratio when expanding
4315 into a jump table (in percent). The parameter is used when
4316 optimizing for size.
4317
4318 .. gcc-param:: jump-table-max-growth-ratio-for-speed
4319
4320 The maximum code size growth ratio when expanding
4321 into a jump table (in percent). The parameter is used when
4322 optimizing for speed.
4323
4324 .. gcc-param:: tree-reassoc-width
4325
4326 Set the maximum number of instructions executed in parallel in
4327 a reassociated tree. This parameter overrides target-dependent
4328 heuristics used by default if it has a nonzero value.
4329
4330 .. gcc-param:: sched-pressure-algorithm
4331
4332 Choose between the two available implementations of
4333 :option:`-fsched-pressure`. Algorithm 1 is the original implementation
4334 and is the more likely to prevent instructions from being reordered.
4335 Algorithm 2 was designed to be a compromise between the relatively
4336 conservative approach taken by algorithm 1 and the rather aggressive
4337 approach taken by the default scheduler. It relies more heavily on
4338 having a regular register file and accurate register pressure classes.
4339 See :samp:`haifa-sched.cc` in the GCC sources for more details.
4340
4341 The default choice depends on the target.
4342
4343 .. gcc-param:: max-slsr-cand-scan
4344
4345 Set the maximum number of existing candidates that are considered when
4346 seeking a basis for a new straight-line strength reduction candidate.
4347
4348 .. gcc-param:: asan-globals
4349
4350 Enable buffer overflow detection for global objects. This kind
4351 of protection is enabled by default if you are using
4352 :option:`-fsanitize=address` option.
4353 To disable global objects protection use :option:`--param` :gcc-param:`asan-globals`:samp:`=0`.
4354
4355 .. gcc-param:: asan-stack
4356
4357 Enable buffer overflow detection for stack objects. This kind of
4358 protection is enabled by default when using :option:`-fsanitize=address`.
4359 To disable stack protection use :option:`--param` :gcc-param:`asan-stack`:samp:`=0`.
4360
4361 .. gcc-param:: asan-instrument-reads
4362
4363 Enable buffer overflow detection for memory reads. This kind of
4364 protection is enabled by default when using :option:`-fsanitize=address`.
4365 To disable memory reads protection use
4366 :option:`--param` :gcc-param:`asan-instrument-reads`:samp:`=0`.
4367
4368 .. gcc-param:: asan-instrument-writes
4369
4370 Enable buffer overflow detection for memory writes. This kind of
4371 protection is enabled by default when using :option:`-fsanitize=address`.
4372 To disable memory writes protection use
4373 :option:`--param` :gcc-param:`asan-instrument-writes`:samp:`=0`.
4374
4375 .. gcc-param:: asan-memintrin
4376
4377 Enable detection for built-in functions. This kind of protection
4378 is enabled by default when using :option:`-fsanitize=address`.
4379 To disable built-in functions protection use
4380 :option:`--param` :gcc-param:`asan-memintrin`:samp:`=0`.
4381
4382 .. gcc-param:: asan-use-after-return
4383
4384 Enable detection of use-after-return. This kind of protection
4385 is enabled by default when using the :option:`-fsanitize=address` option.
4386 To disable it use :option:`--param` :gcc-param:`asan-use-after-return`:samp:`=0`.
4387
4388 .. note::
4389
4390 By default the check is disabled at run time. To enable it,
4391 add ``detect_stack_use_after_return=1`` to the environment variable
4392 :envvar:`ASAN_OPTIONS`.
4393
4394 .. gcc-param:: asan-instrumentation-with-call-threshold
4395
4396 If the number of memory accesses in the function being instrumented
4397 is greater than or equal to this number, use callbacks instead of inline checks.
4398 For example, to disable inline code use
4399 :option:`--param` :gcc-param:`asan-instrumentation-with-call-threshold`:samp:`=0`.
4400
4401 .. gcc-param:: hwasan-instrument-stack
4402
4403 Enable hwasan instrumentation of statically sized stack-allocated variables.
4404 This kind of instrumentation is enabled by default when using
4405 :option:`-fsanitize=hwaddress` and disabled by default when using
4406 :option:`-fsanitize=kernel-hwaddress`.
4407 To disable stack instrumentation use
4408 :option:`--param` :gcc-param:`hwasan-instrument-stack`:samp:`=0`, and to enable it use
4409 :option:`--param` :gcc-param:`hwasan-instrument-stack`:samp:`=1`.
4410
4411 .. gcc-param:: hwasan-random-frame-tag
4412
4413 When using stack instrumentation, decide tags for stack variables using a
4414 deterministic sequence beginning at a random tag for each frame. With this
4415 parameter unset, tags are chosen using the same sequence but beginning from 1.
4416 This is enabled by default for :option:`-fsanitize=hwaddress` and unavailable
4417 for :option:`-fsanitize=kernel-hwaddress`.
4418 To disable it use :option:`--param` :gcc-param:`hwasan-random-frame-tag`:samp:`=0`.
4419
4420 .. gcc-param:: hwasan-instrument-allocas
4421
4422 Enable hwasan instrumentation of dynamically sized stack-allocated variables.
4423 This kind of instrumentation is enabled by default when using
4424 :option:`-fsanitize=hwaddress` and disabled by default when using
4425 :option:`-fsanitize=kernel-hwaddress`.
4426 To disable instrumentation of such variables use
4427 :option:`--param` :gcc-param:`hwasan-instrument-allocas`:samp:`=0`, and to enable it use
4428 :option:`--param` :gcc-param:`hwasan-instrument-allocas`:samp:`=1`.
4429
4430 .. gcc-param:: hwasan-instrument-reads
4431
4432 Enable hwasan checks on memory reads. Instrumentation of reads is enabled by
4433 default for both :option:`-fsanitize=hwaddress` and
4434 :option:`-fsanitize=kernel-hwaddress`.
4435 To disable checking memory reads use
4436 :option:`--param` :gcc-param:`hwasan-instrument-reads`:samp:`=0`.
4437
4438 .. gcc-param:: hwasan-instrument-writes
4439
4440 Enable hwasan checks on memory writes. Instrumentation of writes is enabled by
4441 default for both :option:`-fsanitize=hwaddress` and
4442 :option:`-fsanitize=kernel-hwaddress`.
4443 To disable checking memory writes use
4444 :option:`--param` :gcc-param:`hwasan-instrument-writes`:samp:`=0`.
4445
4446 .. gcc-param:: hwasan-instrument-mem-intrinsics
4447
4448 Enable hwasan instrumentation of builtin functions. Instrumentation of these
4449 builtin functions is enabled by default for both :option:`-fsanitize=hwaddress`
4450 and :option:`-fsanitize=kernel-hwaddress`.
4451 To disable instrumentation of builtin functions use
4452 :option:`--param` :gcc-param:`hwasan-instrument-mem-intrinsics`:samp:`=0`.
4453
4454 .. gcc-param:: use-after-scope-direct-emission-threshold
4455
4456 If the size of a local variable in bytes is smaller than or equal to this
4457 number, directly poison (or unpoison) shadow memory instead of using
4458 run-time callbacks.
4459
4460 .. gcc-param:: tsan-distinguish-volatile
4461
4462 Emit special instrumentation for accesses to volatiles.
4463
4464 .. gcc-param:: tsan-instrument-func-entry-exit
4465
4466 Emit instrumentation calls to :samp:`__tsan_func_entry()` and :samp:`__tsan_func_exit()`.
4467
4468 .. gcc-param:: max-fsm-thread-path-insns
4469
4470 Maximum number of instructions to copy when duplicating blocks on a
4471 finite state automaton jump thread path.
4472
4473 .. gcc-param:: threader-debug
4474
4475 threader-debug=[none|all]
4476 Enables verbose dumping of the threader solver.
4477
4478 .. gcc-param:: parloops-chunk-size
4479
4480 Chunk size of omp schedule for loops parallelized by parloops.
4481
4482 .. gcc-param:: parloops-schedule
4483
4484 Schedule type of omp schedule for loops parallelized by parloops (static,
4485 dynamic, guided, auto, runtime).
4486
4487 .. gcc-param:: parloops-min-per-thread
4488
4489 The minimum number of iterations per thread of an innermost parallelized
4490 loop for which the parallelized variant is preferred over the single threaded
4491 one. Note that for a parallelized loop nest the
4492 minimum number of iterations of the outermost loop per thread is two.
4493
4494 .. gcc-param:: max-ssa-name-query-depth
4495
4496 Maximum depth of recursion when querying properties of SSA names in things
4497 like fold routines. One level of recursion corresponds to following a
4498 use-def chain.
4499
4500 .. gcc-param:: max-speculative-devirt-maydefs
4501
4502 The maximum number of may-defs we analyze when looking for a must-def
4503 specifying the dynamic type of an object that invokes a virtual call
4504 we may be able to devirtualize speculatively.
4505
4506 .. gcc-param:: max-vrp-switch-assertions
4507
4508 The maximum number of assertions to add along the default edge of a switch
4509 statement during VRP.
4510
4511 .. gcc-param:: evrp-sparse-threshold
4512
4513 Maximum number of basic blocks before EVRP uses a sparse cache.
4514
4515 .. gcc-param:: vrp1-mode
4516
4517 Specifies the mode VRP pass 1 should operate in.
4518
4519 .. gcc-param:: vrp2-mode
4520
4521 Specifies the mode VRP pass 2 should operate in.
4522
4523 .. gcc-param:: ranger-debug
4524
4525 Specifies the type of debug output to be issued for ranges.
4526
4527 .. gcc-param:: evrp-switch-limit
4528
4529 Specifies the maximum number of switch cases before EVRP ignores a switch.
4530
4531 .. gcc-param:: unroll-jam-min-percent
4532
4533 The minimum percentage of memory references that must be optimized
4534 away for the unroll-and-jam transformation to be considered profitable.
4535
4536 .. gcc-param:: unroll-jam-max-unroll
4537
4538 The maximum number of times the outer loop should be unrolled by
4539 the unroll-and-jam transformation.
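The transformation these two parameters tune can be sketched by hand. Below, the second function is the first one with its outer loop unrolled by two and the copies fused ("jammed") into one inner loop, so that the load of ``w[j]`` is shared between the two rows. All names are hypothetical, the unroll factor of two is chosen only for illustration, and the code assumes ``N`` is even.

```c
#define N 8  /* assumed even so the outer loop can be unrolled by 2 */

/* Original loop nest: w[j] is reloaded for every row. */
void scale_plain(int out[N][N], int in[N][N], const int w[N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            out[i][j] = in[i][j] * w[j];
}

/* Hand-written unroll-and-jam by 2: two rows share one load of w[j]. */
void scale_jammed(int out[N][N], int in[N][N], const int w[N])
{
    for (int i = 0; i < N; i += 2)
        for (int j = 0; j < N; j++) {
            int wj = w[j];
            out[i][j] = in[i][j] * wj;
            out[i + 1][j] = in[i + 1][j] * wj;
        }
}

/* Returns 1 if both versions produce the same output. */
int nests_agree(void)
{
    int in[N][N], a[N][N], b[N][N], w[N];
    for (int i = 0; i < N; i++) {
        w[i] = i + 1;
        for (int j = 0; j < N; j++)
            in[i][j] = i * N + j;
    }
    scale_plain(a, in, w);
    scale_jammed(b, in, w);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (a[i][j] != b[i][j])
                return 0;
    return 1;
}
```

In this sketch the jammed form removes one of every two loads of ``w[j]``; the fraction of memory references saved in this way is the kind of quantity the minimum-percentage parameter is compared against.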
4540
4541 .. gcc-param:: max-rtl-if-conversion-unpredictable-cost
4542
4543 Maximum permissible cost for the sequence that would be generated
4544 by the RTL if-conversion pass for a branch that is considered unpredictable.
4545
4546 .. gcc-param:: max-variable-expansions-in-unroller
4547
4548 If :option:`-fvariable-expansion-in-unroller` is used, the maximum number
4549 of times that an individual variable will be expanded during loop unrolling.
4550
4551 .. gcc-param:: partial-inlining-entry-probability
4552
4553 Maximum probability of the entry BB of split region
4554 (in percent relative to entry BB of the function)
4555 to make partial inlining happen.
4556
4557 .. gcc-param:: max-tracked-strlens
4558
4559 Maximum number of strings for which strlen optimization pass will
4560 track string lengths.
4561
4562 .. gcc-param:: gcse-after-reload-partial-fraction
4563
4564 The threshold ratio for performing partial redundancy
4565 elimination after reload.
4566
4567 .. gcc-param:: gcse-after-reload-critical-fraction
4568
4569 The threshold ratio of critical edges execution count that
4570 permit performing redundancy elimination after reload.
4571
4572 .. gcc-param:: max-loop-header-insns
4573
4574 The maximum number of insns in loop header duplicated
4575 by the copy loop headers pass.
4576
4577 .. gcc-param:: vect-epilogues-nomask
4578
4579 Enable loop epilogue vectorization using smaller vector size.
4580
4581 .. gcc-param:: vect-partial-vector-usage
4582
4583 Controls when the loop vectorizer considers using partial vector loads
4584 and stores as an alternative to falling back to scalar code. 0 stops
4585 the vectorizer from ever using partial vector loads and stores. 1 allows
4586 partial vector loads and stores if vectorization removes the need for the
4587 code to iterate. 2 allows partial vector loads and stores in all loops.
4588 The parameter only has an effect on targets that support partial
4589 vector loads and stores.
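As a rough illustration, the loop below has a trip count that is generally not a multiple of the vector length, so a vectorizer must handle the leftover iterations either with scalar code or, on targets that support them, with partial vector loads and stores. The function name is hypothetical, and the effect of the parameter is target-dependent.

```c
/* Element-wise addition; n is arbitrary, so the final iterations may
   not fill a whole vector. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Compiling such a loop with, for example, ``gcc -O3 --param vect-partial-vector-usage=0`` and again with ``=2`` and comparing the assembly is one way to see whether the setting changes anything on a given target.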
4590
4591 .. gcc-param:: vect-inner-loop-cost-factor
4592
4593 The maximum factor which the loop vectorizer applies to the cost of statements
4594 in an inner loop relative to the loop being vectorized. The factor applied
4595 is the maximum of the estimated number of iterations of the inner loop and
4596 this parameter. The default value of this parameter is 50.
4597
4598 .. gcc-param:: vect-induction-float
4599
4600 Enable loop vectorization of floating point inductions.
4601
4602 .. gcc-param:: avoid-fma-max-bits
4603
4604 Maximum number of bits for which we avoid creating FMAs.
4605
4606 .. gcc-param:: sms-loop-average-count-threshold
4607
4608 A threshold on the average loop count considered by the swing modulo scheduler.
4609
4610 .. gcc-param:: sms-dfa-history
4611
4612 The number of cycles the swing modulo scheduler considers when checking
4613 conflicts using DFA.
4614
4615 .. gcc-param:: graphite-allow-codegen-errors
4616
4617 Whether codegen errors should be ICEs when :option:`-fchecking` is used.
4618
4619 .. gcc-param:: sms-max-ii-factor
4620
4621 A factor for tuning the upper bound that swing modulo scheduler
4622 uses for scheduling a loop.
4623
4624 .. gcc-param:: lra-max-considered-reload-pseudos
4625
4626 The max number of reload pseudos which are considered during
4627 spilling a non-reload pseudo.
4628
4629 .. gcc-param:: max-pow-sqrt-depth
4630
4631 Maximum depth of sqrt chains to use when synthesizing exponentiation
4632 by a real constant.
4633
4634 .. gcc-param:: max-dse-active-local-stores
4635
4636 Maximum number of active local stores in RTL dead store elimination.
4637
4638 .. gcc-param:: asan-instrument-allocas
4639
4640 Enable asan allocas/VLAs protection.
4641
4642 .. gcc-param:: max-iterations-computation-cost
4643
4644 Bound on the cost of an expression to compute the number of iterations.
4645
4646 .. gcc-param:: max-isl-operations
4647
4648 Maximum number of isl operations; 0 means unlimited.
4649
4650 .. gcc-param:: graphite-max-arrays-per-scop
4651
4652 Maximum number of arrays per scop.
4653
4654 .. gcc-param:: max-vartrack-reverse-op-size
4655
4656 Max. size of loc list for which reverse ops should be added.
4657
4658 .. gcc-param:: fsm-scale-path-stmts
4659
4660 Scale factor to apply to the number of statements in a threading path
4661 when comparing to the number of (scaled) blocks.
4662
4663 .. gcc-param:: uninit-control-dep-attempts
4664
4665 Maximum number of nested calls to search for control dependencies
4666 during uninitialized variable analysis.
4667
4668 .. gcc-param:: fsm-scale-path-blocks
4669
4670 Scale factor to apply to the number of blocks in a threading path
4671 when comparing to the number of (scaled) statements.
4672
4673 .. gcc-param:: sched-autopref-queue-depth
4674
4675 Hardware autoprefetcher scheduler model control flag.
4676 Number of lookahead cycles the model looks into; at :samp:`0`,
4677 only enable the instruction sorting heuristic.
4678
4679 .. gcc-param:: loop-versioning-max-inner-insns
4680
4681 The maximum number of instructions that an inner loop can have
4682 before the loop versioning pass considers it too big to copy.
4683
4684 .. gcc-param:: loop-versioning-max-outer-insns
4685
4686 The maximum number of instructions that an outer loop can have
4687 before the loop versioning pass considers it too big to copy,
4688 discounting any instructions in inner loops that directly benefit
4689 from versioning.
4690
4691 .. gcc-param:: ssa-name-def-chain-limit
4692
4693 The maximum number of SSA_NAME assignments to follow in determining
4694 a property of a variable such as its value. This limits the number
4695 of iterations or recursive calls GCC performs when optimizing certain
4696 statements or when determining their validity prior to issuing
4697 diagnostics.
4698
4699 .. gcc-param:: store-merging-max-size
4700
4701 Maximum size of a single store merging region in bytes.
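For context, store merging (see :option:`-fstore-merging`) combines narrow stores to adjacent memory into fewer, wider ones. The sketch below shows a typical candidate; the structure and function names are hypothetical, and whether the four byte stores actually become a single 32-bit store depends on the target and optimization level.

```c
/* Four adjacent one-byte fields. */
struct header {
    unsigned char tag;
    unsigned char flags;
    unsigned char len_lo;
    unsigned char len_hi;
};

/* Four narrow constant stores to consecutive bytes: a candidate for
   being merged into one wider store. */
void init_header(struct header *h)
{
    h->tag = 0x7f;
    h->flags = 1;
    h->len_lo = 0x10;
    h->len_hi = 0;
}
```

The region formed by these stores is four bytes, well under any plausible setting of the parameter; the limit matters for long runs of adjacent stores.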
4702
4703 .. gcc-param:: hash-table-verification-limit
4704
4705 The number of elements for which hash table verification is done
4706 for each searched element.
4707
4708 .. gcc-param:: max-find-base-term-values
4709
4710 Maximum number of VALUEs handled during a single find_base_term call.
4711
4712 .. gcc-param:: analyzer-max-enodes-per-program-point
4713
4714 The maximum number of exploded nodes per program point within
4715 the analyzer, before terminating analysis of that point.
4716
4717 .. gcc-param:: analyzer-max-constraints
4718
4719 The maximum number of constraints per state.
4720
4721 .. gcc-param:: analyzer-min-snodes-for-call-summary
4722
4723 The minimum number of supernodes within a function for the
4724 analyzer to consider summarizing its effects at call sites.
4725
4726 .. gcc-param:: analyzer-max-enodes-for-full-dump
4727
4728 The maximum depth of exploded nodes that should appear in a dot dump
4729 before switching to a less verbose format.
4730
4731 .. gcc-param:: analyzer-max-recursion-depth
4732
4733 The maximum number of times a callsite can appear in a call stack
4734 within the analyzer, before terminating analysis of a call that would
4735 recurse deeper.
4736
4737 .. gcc-param:: analyzer-max-svalue-depth
4738
4739 The maximum depth of a symbolic value, before approximating
4740 the value as unknown.
4741
4742 .. gcc-param:: analyzer-max-infeasible-edges
4743
4744 The maximum number of infeasible edges to reject before declaring
4745 a diagnostic as infeasible.
4746
4747 .. gcc-param:: gimple-fe-computed-hot-bb-threshold
4748
4749 The number of executions of a basic block which is considered hot.
4750 The parameter is used only in GIMPLE FE.
4751
4752 .. gcc-param:: analyzer-bb-explosion-factor
4753
4754 The maximum number of 'after supernode' exploded nodes within the analyzer
4755 per supernode, before terminating analysis.
4756
4757 .. gcc-param:: ranger-logical-depth
4758
4759 Maximum depth of logical expression evaluation ranger will look through
4760 when evaluating outgoing edge ranges.
4761
4762 .. gcc-param:: relation-block-limit
4763
4764 Maximum number of relations the oracle will register in a basic block.
4765
4766 .. gcc-param:: min-pagesize
4767
4768 Minimum page size for warning purposes.
4769
4770 .. gcc-param:: openacc-kernels
4771
4772 Specify mode of OpenACC 'kernels' constructs handling.
4773 With :option:`--param` :gcc-param:`openacc-kernels`:samp:`=decompose`, OpenACC 'kernels'
4774 constructs are decomposed into parts, a sequence of compute
4775 constructs, each then handled individually.
4776 This is work in progress.
4777 With :option:`--param` :gcc-param:`openacc-kernels`:samp:`=parloops`, OpenACC 'kernels'
4778 constructs are handled by the :samp:`parloops` pass, en bloc.
4779 This is the current default.
4780
4781 .. gcc-param:: openacc-privatization
4782
4783 Specify mode of OpenACC privatization diagnostics for
4784 :option:`-fopt-info-omp-note` and applicable
4785 :option:`-fdump-tree-*-details`.
4786 With :option:`--param` :gcc-param:`openacc-privatization`:samp:`=quiet`, don't diagnose.
4787 This is the current default.
4788 With :option:`--param` :gcc-param:`openacc-privatization`:samp:`=noisy`, do diagnose.
4789
4790 The following choices of :samp:`{name}` are available on AArch64 targets:
4791
4792 .. gcc-param:: aarch64-sve-compare-costs
4793
4794 When vectorizing for SVE, consider using 'unpacked' vectors for
4795 smaller elements and use the cost model to pick the cheapest approach.
4796 Also use the cost model to choose between SVE and Advanced SIMD vectorization.
4797
4798 Using unpacked vectors includes storing smaller elements in larger
4799 containers and accessing elements with extending loads and truncating
4800 stores.
4801
4802 .. gcc-param:: aarch64-float-recp-precision
4803
4804 The number of Newton iterations for calculating the reciprocal for float type.
4805 The precision of division is proportional to this param when division
4806 approximation is enabled. The default value is 1.
4807
4808 .. gcc-param:: aarch64-double-recp-precision
4809
4810 The number of Newton iterations for calculating the reciprocal for double type.
4811 The precision of division is proportional to this param when division
4812 approximation is enabled. The default value is 2.
4813
4814 .. gcc-param:: aarch64-autovec-preference
4815
4816 Force an ISA selection strategy for auto-vectorization. Accepts values from
4817 0 to 4, inclusive.
4818
4819 :samp:`0`
4820 Use the default heuristics.
4821
4822 :samp:`1`
4823 Use only Advanced SIMD for auto-vectorization.
4824
4825 :samp:`2`
4826 Use only SVE for auto-vectorization.
4827
4828 :samp:`3`
4829 Use both Advanced SIMD and SVE. Prefer Advanced SIMD when the costs are
4830 deemed equal.
4831
4832 :samp:`4`
4833 Use both Advanced SIMD and SVE. Prefer SVE when the costs are deemed equal.
4834
4835 The default value is 0.
4836
4837 .. gcc-param:: aarch64-loop-vect-issue-rate-niters
4838
4839 The tuning for some AArch64 CPUs tries to take both latencies and issue
4840 rates into account when deciding whether a loop should be vectorized
4841 using SVE, vectorized using Advanced SIMD, or not vectorized at all.
4842 If this parameter is set to :samp:`{n}`, GCC will not use this heuristic
4843 for loops that are known to execute in fewer than :samp:`{n}` Advanced
4844 SIMD iterations.
4845
4846 .. gcc-param:: aarch64-vect-unroll-limit
4847
4848 The vectorizer will use available tuning information to determine whether it
4849 would be beneficial to unroll the main vectorized loop and by how much. This
4850 parameter sets the upper bound of how much the vectorizer will unroll the main
4851 loop. The default value is four.
4852
4853 The following choices of :samp:`{name}` are available on i386 and x86_64 targets:
4854
4855 .. gcc-param:: x86-stlf-window-ninsns
4856
4857 Number of instructions above which the STLF stall penalty can be compensated.