]> git.ipfire.org Git - thirdparty/gcc.git/blob - gcc/ira.cc
IRA: Check that reg classes contain a hard reg of given mode in reg move cost calculation
[thirdparty/gcc.git] / gcc / ira.cc
1 /* Integrated Register Allocator (IRA) entry point.
2 Copyright (C) 2006-2022 Free Software Foundation, Inc.
3 Contributed by Vladimir Makarov <vmakarov@redhat.com>.
4
5 This file is part of GCC.
6
7 GCC is free software; you can redistribute it and/or modify it under
8 the terms of the GNU General Public License as published by the Free
9 Software Foundation; either version 3, or (at your option) any later
10 version.
11
12 GCC is distributed in the hope that it will be useful, but WITHOUT ANY
13 WARRANTY; without even the implied warranty of MERCHANTABILITY or
14 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
15 for more details.
16
17 You should have received a copy of the GNU General Public License
18 along with GCC; see the file COPYING3. If not see
19 <http://www.gnu.org/licenses/>. */
20
21 /* The integrated register allocator (IRA) is a
22 regional register allocator performing graph coloring on a top-down
23 traversal of nested regions. Graph coloring in a region is based
24 on Chaitin-Briggs algorithm. It is called integrated because
25 register coalescing, register live range splitting, and choosing a
26 better hard register are done on-the-fly during coloring. Register
27 coalescing and choosing a cheaper hard register is done by hard
28 register preferencing during hard register assigning. The live
29 range splitting is a byproduct of the regional register allocation.
30
31 Major IRA notions are:
32
33 o *Region* is a part of CFG where graph coloring based on
34 Chaitin-Briggs algorithm is done. IRA can work on any set of
35 nested CFG regions forming a tree. Currently the regions are
36 the entire function for the root region and natural loops for
37 the other regions. Therefore data structure representing a
38 region is called loop_tree_node.
39
40 o *Allocno class* is a register class used for allocation of
41 given allocno. It means that only hard register of given
42 register class can be assigned to given allocno. In reality,
43 even smaller subset of (*profitable*) hard registers can be
44 assigned. In rare cases, the subset can be even smaller
45 because our modification of Chaitin-Briggs algorithm requires
46 that sets of hard registers can be assigned to allocnos forms a
47 forest, i.e. the sets can be ordered in a way where any
48 previous set is not intersected with given set or is a superset
49 of given set.
50
51 o *Pressure class* is a register class belonging to a set of
52 register classes containing all of the hard-registers available
53 for register allocation. The set of all pressure classes for a
54 target is defined in the corresponding machine-description file
55 according some criteria. Register pressure is calculated only
56 for pressure classes and it affects some IRA decisions as
57 forming allocation regions.
58
59 o *Allocno* represents the live range of a pseudo-register in a
60 region. Besides the obvious attributes like the corresponding
61 pseudo-register number, allocno class, conflicting allocnos and
62 conflicting hard-registers, there are a few allocno attributes
63 which are important for understanding the allocation algorithm:
64
65 - *Live ranges*. This is a list of ranges of *program points*
66 where the allocno lives. Program points represent places
67 where a pseudo can be born or become dead (there are
68 approximately two times more program points than the insns)
69 and they are represented by integers starting with 0. The
70 live ranges are used to find conflicts between allocnos.
71 They also play very important role for the transformation of
72 the IRA internal representation of several regions into a one
73 region representation. The later is used during the reload
74 pass work because each allocno represents all of the
75 corresponding pseudo-registers.
76
77 - *Hard-register costs*. This is a vector of size equal to the
78 number of available hard-registers of the allocno class. The
79 cost of a callee-clobbered hard-register for an allocno is
80 increased by the cost of save/restore code around the calls
81 through the given allocno's life. If the allocno is a move
82 instruction operand and another operand is a hard-register of
83 the allocno class, the cost of the hard-register is decreased
84 by the move cost.
85
86 When an allocno is assigned, the hard-register with minimal
87 full cost is used. Initially, a hard-register's full cost is
88 the corresponding value from the hard-register's cost vector.
89 If the allocno is connected by a *copy* (see below) to
90 another allocno which has just received a hard-register, the
91 cost of the hard-register is decreased. Before choosing a
92 hard-register for an allocno, the allocno's current costs of
93 the hard-registers are modified by the conflict hard-register
94 costs of all of the conflicting allocnos which are not
95 assigned yet.
96
97 - *Conflict hard-register costs*. This is a vector of the same
98 size as the hard-register costs vector. To permit an
99 unassigned allocno to get a better hard-register, IRA uses
100 this vector to calculate the final full cost of the
101 available hard-registers. Conflict hard-register costs of an
102 unassigned allocno are also changed with a change of the
103 hard-register cost of the allocno when a copy involving the
104 allocno is processed as described above. This is done to
105 show other unassigned allocnos that a given allocno prefers
106 some hard-registers in order to remove the move instruction
107 corresponding to the copy.
108
109 o *Cap*. If a pseudo-register does not live in a region but
110 lives in a nested region, IRA creates a special allocno called
111 a cap in the outer region. A region cap is also created for a
112 subregion cap.
113
114 o *Copy*. Allocnos can be connected by copies. Copies are used
115 to modify hard-register costs for allocnos during coloring.
116 Such modifications reflects a preference to use the same
117 hard-register for the allocnos connected by copies. Usually
118 copies are created for move insns (in this case it results in
119 register coalescing). But IRA also creates copies for operands
120 of an insn which should be assigned to the same hard-register
121 due to constraints in the machine description (it usually
122 results in removing a move generated in reload to satisfy
123 the constraints) and copies referring to the allocno which is
124 the output operand of an instruction and the allocno which is
125 an input operand dying in the instruction (creation of such
126 copies results in less register shuffling). IRA *does not*
127 create copies between the same register allocnos from different
128 regions because we use another technique for propagating
129 hard-register preference on the borders of regions.
130
131 Allocnos (including caps) for the upper region in the region tree
132 *accumulate* information important for coloring from allocnos with
133 the same pseudo-register from nested regions. This includes
134 hard-register and memory costs, conflicts with hard-registers,
135 allocno conflicts, allocno copies and more. *Thus, attributes for
136 allocnos in a region have the same values as if the region had no
137 subregions*. It means that attributes for allocnos in the
138 outermost region corresponding to the function have the same values
139 as though the allocation used only one region which is the entire
140 function. It also means that we can look at IRA work as if the
141 first IRA did allocation for all function then it improved the
142 allocation for loops then their subloops and so on.
143
144 IRA major passes are:
145
146 o Building IRA internal representation which consists of the
147 following subpasses:
148
149 * First, IRA builds regions and creates allocnos (file
150 ira-build.cc) and initializes most of their attributes.
151
152 * Then IRA finds an allocno class for each allocno and
153 calculates its initial (non-accumulated) cost of memory and
154 each hard-register of its allocno class (file ira-cost.c).
155
156 * IRA creates live ranges of each allocno, calculates register
157 pressure for each pressure class in each region, sets up
158 conflict hard registers for each allocno and info about calls
159 the allocno lives through (file ira-lives.cc).
160
161 * IRA removes low register pressure loops from the regions
162 mostly to speed IRA up (file ira-build.cc).
163
164 * IRA propagates accumulated allocno info from lower region
165 allocnos to corresponding upper region allocnos (file
166 ira-build.cc).
167
168 * IRA creates all caps (file ira-build.cc).
169
170 * Having live-ranges of allocnos and their classes, IRA creates
171 conflicting allocnos for each allocno. Conflicting allocnos
172 are stored as a bit vector or array of pointers to the
173 conflicting allocnos whatever is more profitable (file
174 ira-conflicts.cc). At this point IRA creates allocno copies.
175
176 o Coloring. Now IRA has all necessary info to start graph coloring
177 process. It is done in each region on top-down traverse of the
178 region tree (file ira-color.cc). There are following subpasses:
179
180 * Finding profitable hard registers of corresponding allocno
181 class for each allocno. For example, only callee-saved hard
182 registers are frequently profitable for allocnos living
183 through colors. If the profitable hard register set of
184 allocno does not form a tree based on subset relation, we use
185 some approximation to form the tree. This approximation is
186 used to figure out trivial colorability of allocnos. The
187 approximation is a pretty rare case.
188
189 * Putting allocnos onto the coloring stack. IRA uses Briggs
190 optimistic coloring which is a major improvement over
191 Chaitin's coloring. Therefore IRA does not spill allocnos at
192 this point. There is some freedom in the order of putting
193 allocnos on the stack which can affect the final result of
194 the allocation. IRA uses some heuristics to improve the
195 order. The major one is to form *threads* from colorable
196 allocnos and push them on the stack by threads. Thread is a
197 set of non-conflicting colorable allocnos connected by
198 copies. The thread contains allocnos from the colorable
199 bucket or colorable allocnos already pushed onto the coloring
200 stack. Pushing thread allocnos one after another onto the
201 stack increases chances of removing copies when the allocnos
202 get the same hard reg.
203
204 We also use a modification of Chaitin-Briggs algorithm which
205 works for intersected register classes of allocnos. To
206 figure out trivial colorability of allocnos, the mentioned
207 above tree of hard register sets is used. To get an idea how
208 the algorithm works in i386 example, let us consider an
209 allocno to which any general hard register can be assigned.
210 If the allocno conflicts with eight allocnos to which only
211 EAX register can be assigned, given allocno is still
212 trivially colorable because all conflicting allocnos might be
213 assigned only to EAX and all other general hard registers are
214 still free.
215
216 To get an idea of the used trivial colorability criterion, it
217 is also useful to read article "Graph-Coloring Register
218 Allocation for Irregular Architectures" by Michael D. Smith
219 and Glen Holloway. Major difference between the article
220 approach and approach used in IRA is that Smith's approach
221 takes register classes only from machine description and IRA
222 calculate register classes from intermediate code too
223 (e.g. an explicit usage of hard registers in RTL code for
224 parameter passing can result in creation of additional
225 register classes which contain or exclude the hard
226 registers). That makes IRA approach useful for improving
227 coloring even for architectures with regular register files
228 and in fact some benchmarking shows the improvement for
229 regular class architectures is even bigger than for irregular
230 ones. Another difference is that Smith's approach chooses
231 intersection of classes of all insn operands in which a given
232 pseudo occurs. IRA can use bigger classes if it is still
233 more profitable than memory usage.
234
235 * Popping the allocnos from the stack and assigning them hard
236 registers. If IRA cannot assign a hard register to an
237 allocno and the allocno is coalesced, IRA undoes the
238 coalescing and puts the uncoalesced allocnos onto the stack in
239 the hope that some such allocnos will get a hard register
240 separately. If IRA fails to assign hard register or memory
241 is more profitable for it, IRA spills the allocno. IRA
242 assigns the allocno the hard-register with minimal full
243 allocation cost which reflects the cost of usage of the
244 hard-register for the allocno and cost of usage of the
245 hard-register for allocnos conflicting with given allocno.
246
247 * Chaitin-Briggs coloring assigns as many pseudos as possible
248 to hard registers. After coloring we try to improve
249 allocation with cost point of view. We improve the
250 allocation by spilling some allocnos and assigning the freed
251 hard registers to other allocnos if it decreases the overall
252 allocation cost.
253
254 * After allocno assigning in the region, IRA modifies the hard
255 register and memory costs for the corresponding allocnos in
256 the subregions to reflect the cost of possible loads, stores,
257 or moves on the border of the region and its subregions.
258 When default regional allocation algorithm is used
259 (-fira-algorithm=mixed), IRA just propagates the assignment
260 for allocnos if the register pressure in the region for the
261 corresponding pressure class is less than number of available
262 hard registers for given pressure class.
263
264 o Spill/restore code moving. When IRA performs an allocation
265 by traversing regions in top-down order, it does not know what
266 happens below in the region tree. Therefore, sometimes IRA
267 misses opportunities to perform a better allocation. A simple
268 optimization tries to improve allocation in a region having
269 subregions and containing in another region. If the
270 corresponding allocnos in the subregion are spilled, it spills
271 the region allocno if it is profitable. The optimization
272 implements a simple iterative algorithm performing profitable
273 transformations while they are still possible. It is fast in
274 practice, so there is no real need for a better time complexity
275 algorithm.
276
277 o Code change. After coloring, two allocnos representing the
278 same pseudo-register outside and inside a region respectively
279 may be assigned to different locations (hard-registers or
280 memory). In this case IRA creates and uses a new
281 pseudo-register inside the region and adds code to move allocno
282 values on the region's borders. This is done during top-down
283 traversal of the regions (file ira-emit.cc). In some
284 complicated cases IRA can create a new allocno to move allocno
285 values (e.g. when a swap of values stored in two hard-registers
286 is needed). At this stage, the new allocno is marked as
287 spilled. IRA still creates the pseudo-register and the moves
288 on the region borders even when both allocnos were assigned to
289 the same hard-register. If the reload pass spills a
290 pseudo-register for some reason, the effect will be smaller
291 because another allocno will still be in the hard-register. In
292 most cases, this is better then spilling both allocnos. If
293 reload does not change the allocation for the two
294 pseudo-registers, the trivial move will be removed by
295 post-reload optimizations. IRA does not generate moves for
296 allocnos assigned to the same hard register when the default
297 regional allocation algorithm is used and the register pressure
298 in the region for the corresponding pressure class is less than
299 number of available hard registers for given pressure class.
300 IRA also does some optimizations to remove redundant stores and
301 to reduce code duplication on the region borders.
302
303 o Flattening internal representation. After changing code, IRA
304 transforms its internal representation for several regions into
305 one region representation (file ira-build.cc). This process is
306 called IR flattening. Such process is more complicated than IR
307 rebuilding would be, but is much faster.
308
309 o After IR flattening, IRA tries to assign hard registers to all
310 spilled allocnos. This is implemented by a simple and fast
311 priority coloring algorithm (see function
312 ira_reassign_conflict_allocnos::ira-color.cc). Here new allocnos
313 created during the code change pass can be assigned to hard
314 registers.
315
316 o At the end IRA calls the reload pass. The reload pass
317 communicates with IRA through several functions in file
318 ira-color.cc to improve its decisions in
319
320 * sharing stack slots for the spilled pseudos based on IRA info
321 about pseudo-register conflicts.
322
323 * reassigning hard-registers to all spilled pseudos at the end
324 of each reload iteration.
325
326 * choosing a better hard-register to spill based on IRA info
327 about pseudo-register live ranges and the register pressure
328 in places where the pseudo-register lives.
329
330 IRA uses a lot of data representing the target processors. These
331 data are initialized in file ira.cc.
332
333 If function has no loops (or the loops are ignored when
334 -fira-algorithm=CB is used), we have classic Chaitin-Briggs
335 coloring (only instead of separate pass of coalescing, we use hard
336 register preferencing). In such case, IRA works much faster
337 because many things are not made (like IR flattening, the
338 spill/restore optimization, and the code change).
339
340 Literature is worth to read for better understanding the code:
341
342 o Preston Briggs, Keith D. Cooper, Linda Torczon. Improvements to
343 Graph Coloring Register Allocation.
344
345 o David Callahan, Brian Koblenz. Register allocation via
346 hierarchical graph coloring.
347
348 o Keith Cooper, Anshuman Dasgupta, Jason Eckhardt. Revisiting Graph
349 Coloring Register Allocation: A Study of the Chaitin-Briggs and
350 Callahan-Koblenz Algorithms.
351
352 o Guei-Yuan Lueh, Thomas Gross, and Ali-Reza Adl-Tabatabai. Global
353 Register Allocation Based on Graph Fusion.
354
355 o Michael D. Smith and Glenn Holloway. Graph-Coloring Register
356 Allocation for Irregular Architectures
357
358 o Vladimir Makarov. The Integrated Register Allocator for GCC.
359
360 o Vladimir Makarov. The top-down register allocator for irregular
361 register file architectures.
362
363 */
364
365
366 #include "config.h"
367 #include "system.h"
368 #include "coretypes.h"
369 #include "backend.h"
370 #include "target.h"
371 #include "rtl.h"
372 #include "tree.h"
373 #include "df.h"
374 #include "memmodel.h"
375 #include "tm_p.h"
376 #include "insn-config.h"
377 #include "regs.h"
378 #include "ira.h"
379 #include "ira-int.h"
380 #include "diagnostic-core.h"
381 #include "cfgrtl.h"
382 #include "cfgbuild.h"
383 #include "cfgcleanup.h"
384 #include "expr.h"
385 #include "tree-pass.h"
386 #include "output.h"
387 #include "reload.h"
388 #include "cfgloop.h"
389 #include "lra.h"
390 #include "dce.h"
391 #include "dbgcnt.h"
392 #include "rtl-iter.h"
393 #include "shrink-wrap.h"
394 #include "print-rtl.h"
395
396 struct target_ira default_target_ira;
397 class target_ira_int default_target_ira_int;
398 #if SWITCHABLE_TARGET
399 struct target_ira *this_target_ira = &default_target_ira;
400 class target_ira_int *this_target_ira_int = &default_target_ira_int;
401 #endif
402
403 /* A modified value of flag `-fira-verbose' used internally. */
404 int internal_flag_ira_verbose;
405
406 /* Dump file of the allocator if it is not NULL. */
407 FILE *ira_dump_file;
408
409 /* The number of elements in the following array. */
410 int ira_spilled_reg_stack_slots_num;
411
412 /* The following array contains info about spilled pseudo-registers
413 stack slots used in current function so far. */
414 class ira_spilled_reg_stack_slot *ira_spilled_reg_stack_slots;
415
416 /* Correspondingly overall cost of the allocation, overall cost before
417 reload, cost of the allocnos assigned to hard-registers, cost of
418 the allocnos assigned to memory, cost of loads, stores and register
419 move insns generated for pseudo-register live range splitting (see
420 ira-emit.cc). */
421 int64_t ira_overall_cost, overall_cost_before;
422 int64_t ira_reg_cost, ira_mem_cost;
423 int64_t ira_load_cost, ira_store_cost, ira_shuffle_cost;
424 int ira_move_loops_num, ira_additional_jumps_num;
425
426 /* All registers that can be eliminated. */
427
428 HARD_REG_SET eliminable_regset;
429
430 /* Value of max_reg_num () before IRA work start. This value helps
431 us to recognize a situation when new pseudos were created during
432 IRA work. */
433 static int max_regno_before_ira;
434
435 /* Temporary hard reg set used for a different calculation. */
436 static HARD_REG_SET temp_hard_regset;
437
438 #define last_mode_for_init_move_cost \
439 (this_target_ira_int->x_last_mode_for_init_move_cost)
440 \f
441
442 /* The function sets up the map IRA_REG_MODE_HARD_REGSET. */
443 static void
444 setup_reg_mode_hard_regset (void)
445 {
446 int i, m, hard_regno;
447
448 for (m = 0; m < NUM_MACHINE_MODES; m++)
449 for (hard_regno = 0; hard_regno < FIRST_PSEUDO_REGISTER; hard_regno++)
450 {
451 CLEAR_HARD_REG_SET (ira_reg_mode_hard_regset[hard_regno][m]);
452 for (i = hard_regno_nregs (hard_regno, (machine_mode) m) - 1;
453 i >= 0; i--)
454 if (hard_regno + i < FIRST_PSEUDO_REGISTER)
455 SET_HARD_REG_BIT (ira_reg_mode_hard_regset[hard_regno][m],
456 hard_regno + i);
457 }
458 }
459
460 \f
461 #define no_unit_alloc_regs \
462 (this_target_ira_int->x_no_unit_alloc_regs)
463
464 /* The function sets up the three arrays declared above. */
465 static void
466 setup_class_hard_regs (void)
467 {
468 int cl, i, hard_regno, n;
469 HARD_REG_SET processed_hard_reg_set;
470
471 ira_assert (SHRT_MAX >= FIRST_PSEUDO_REGISTER);
472 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
473 {
474 temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
475 CLEAR_HARD_REG_SET (processed_hard_reg_set);
476 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
477 {
478 ira_non_ordered_class_hard_regs[cl][i] = -1;
479 ira_class_hard_reg_index[cl][i] = -1;
480 }
481 for (n = 0, i = 0; i < FIRST_PSEUDO_REGISTER; i++)
482 {
483 #ifdef REG_ALLOC_ORDER
484 hard_regno = reg_alloc_order[i];
485 #else
486 hard_regno = i;
487 #endif
488 if (TEST_HARD_REG_BIT (processed_hard_reg_set, hard_regno))
489 continue;
490 SET_HARD_REG_BIT (processed_hard_reg_set, hard_regno);
491 if (! TEST_HARD_REG_BIT (temp_hard_regset, hard_regno))
492 ira_class_hard_reg_index[cl][hard_regno] = -1;
493 else
494 {
495 ira_class_hard_reg_index[cl][hard_regno] = n;
496 ira_class_hard_regs[cl][n++] = hard_regno;
497 }
498 }
499 ira_class_hard_regs_num[cl] = n;
500 for (n = 0, i = 0; i < FIRST_PSEUDO_REGISTER; i++)
501 if (TEST_HARD_REG_BIT (temp_hard_regset, i))
502 ira_non_ordered_class_hard_regs[cl][n++] = i;
503 ira_assert (ira_class_hard_regs_num[cl] == n);
504 }
505 }
506
507 /* Set up global variables defining info about hard registers for the
508 allocation. These depend on USE_HARD_FRAME_P whose TRUE value means
509 that we can use the hard frame pointer for the allocation. */
510 static void
511 setup_alloc_regs (bool use_hard_frame_p)
512 {
513 #ifdef ADJUST_REG_ALLOC_ORDER
514 ADJUST_REG_ALLOC_ORDER;
515 #endif
516 no_unit_alloc_regs = fixed_nonglobal_reg_set;
517 if (! use_hard_frame_p)
518 add_to_hard_reg_set (&no_unit_alloc_regs, Pmode,
519 HARD_FRAME_POINTER_REGNUM);
520 setup_class_hard_regs ();
521 }
522
523 \f
524
525 #define alloc_reg_class_subclasses \
526 (this_target_ira_int->x_alloc_reg_class_subclasses)
527
528 /* Initialize the table of subclasses of each reg class. */
529 static void
530 setup_reg_subclasses (void)
531 {
532 int i, j;
533 HARD_REG_SET temp_hard_regset2;
534
535 for (i = 0; i < N_REG_CLASSES; i++)
536 for (j = 0; j < N_REG_CLASSES; j++)
537 alloc_reg_class_subclasses[i][j] = LIM_REG_CLASSES;
538
539 for (i = 0; i < N_REG_CLASSES; i++)
540 {
541 if (i == (int) NO_REGS)
542 continue;
543
544 temp_hard_regset = reg_class_contents[i] & ~no_unit_alloc_regs;
545 if (hard_reg_set_empty_p (temp_hard_regset))
546 continue;
547 for (j = 0; j < N_REG_CLASSES; j++)
548 if (i != j)
549 {
550 enum reg_class *p;
551
552 temp_hard_regset2 = reg_class_contents[j] & ~no_unit_alloc_regs;
553 if (! hard_reg_set_subset_p (temp_hard_regset,
554 temp_hard_regset2))
555 continue;
556 p = &alloc_reg_class_subclasses[j][0];
557 while (*p != LIM_REG_CLASSES) p++;
558 *p = (enum reg_class) i;
559 }
560 }
561 }
562
563 \f
564
565 /* Set up IRA_MEMORY_MOVE_COST and IRA_MAX_MEMORY_MOVE_COST. */
566 static void
567 setup_class_subset_and_memory_move_costs (void)
568 {
569 int cl, cl2, mode, cost;
570 HARD_REG_SET temp_hard_regset2;
571
572 for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
573 ira_memory_move_cost[mode][NO_REGS][0]
574 = ira_memory_move_cost[mode][NO_REGS][1] = SHRT_MAX;
575 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
576 {
577 if (cl != (int) NO_REGS)
578 for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
579 {
580 ira_max_memory_move_cost[mode][cl][0]
581 = ira_memory_move_cost[mode][cl][0]
582 = memory_move_cost ((machine_mode) mode,
583 (reg_class_t) cl, false);
584 ira_max_memory_move_cost[mode][cl][1]
585 = ira_memory_move_cost[mode][cl][1]
586 = memory_move_cost ((machine_mode) mode,
587 (reg_class_t) cl, true);
588 /* Costs for NO_REGS are used in cost calculation on the
589 1st pass when the preferred register classes are not
590 known yet. In this case we take the best scenario. */
591 if (ira_memory_move_cost[mode][NO_REGS][0]
592 > ira_memory_move_cost[mode][cl][0])
593 ira_max_memory_move_cost[mode][NO_REGS][0]
594 = ira_memory_move_cost[mode][NO_REGS][0]
595 = ira_memory_move_cost[mode][cl][0];
596 if (ira_memory_move_cost[mode][NO_REGS][1]
597 > ira_memory_move_cost[mode][cl][1])
598 ira_max_memory_move_cost[mode][NO_REGS][1]
599 = ira_memory_move_cost[mode][NO_REGS][1]
600 = ira_memory_move_cost[mode][cl][1];
601 }
602 }
603 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
604 for (cl2 = (int) N_REG_CLASSES - 1; cl2 >= 0; cl2--)
605 {
606 temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
607 temp_hard_regset2 = reg_class_contents[cl2] & ~no_unit_alloc_regs;
608 ira_class_subset_p[cl][cl2]
609 = hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2);
610 if (! hard_reg_set_empty_p (temp_hard_regset2)
611 && hard_reg_set_subset_p (reg_class_contents[cl2],
612 reg_class_contents[cl]))
613 for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
614 {
615 cost = ira_memory_move_cost[mode][cl2][0];
616 if (cost > ira_max_memory_move_cost[mode][cl][0])
617 ira_max_memory_move_cost[mode][cl][0] = cost;
618 cost = ira_memory_move_cost[mode][cl2][1];
619 if (cost > ira_max_memory_move_cost[mode][cl][1])
620 ira_max_memory_move_cost[mode][cl][1] = cost;
621 }
622 }
623 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
624 for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
625 {
626 ira_memory_move_cost[mode][cl][0]
627 = ira_max_memory_move_cost[mode][cl][0];
628 ira_memory_move_cost[mode][cl][1]
629 = ira_max_memory_move_cost[mode][cl][1];
630 }
631 setup_reg_subclasses ();
632 }
633
634 \f
635
636 /* Define the following macro if allocation through malloc if
637 preferable. */
638 #define IRA_NO_OBSTACK
639
640 #ifndef IRA_NO_OBSTACK
641 /* Obstack used for storing all dynamic data (except bitmaps) of the
642 IRA. */
643 static struct obstack ira_obstack;
644 #endif
645
646 /* Obstack used for storing all bitmaps of the IRA. */
647 static struct bitmap_obstack ira_bitmap_obstack;
648
649 /* Allocate memory of size LEN for IRA data. */
650 void *
651 ira_allocate (size_t len)
652 {
653 void *res;
654
655 #ifndef IRA_NO_OBSTACK
656 res = obstack_alloc (&ira_obstack, len);
657 #else
658 res = xmalloc (len);
659 #endif
660 return res;
661 }
662
663 /* Free memory ADDR allocated for IRA data. */
664 void
665 ira_free (void *addr ATTRIBUTE_UNUSED)
666 {
667 #ifndef IRA_NO_OBSTACK
668 /* do nothing */
669 #else
670 free (addr);
671 #endif
672 }
673
674
675 /* Allocate and returns bitmap for IRA. */
676 bitmap
677 ira_allocate_bitmap (void)
678 {
679 return BITMAP_ALLOC (&ira_bitmap_obstack);
680 }
681
682 /* Free bitmap B allocated for IRA. */
683 void
684 ira_free_bitmap (bitmap b ATTRIBUTE_UNUSED)
685 {
686 /* do nothing */
687 }
688
689 \f
690
691 /* Output information about allocation of all allocnos (except for
692 caps) into file F. */
693 void
694 ira_print_disposition (FILE *f)
695 {
696 int i, n, max_regno;
697 ira_allocno_t a;
698 basic_block bb;
699
700 fprintf (f, "Disposition:");
701 max_regno = max_reg_num ();
702 for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
703 for (a = ira_regno_allocno_map[i];
704 a != NULL;
705 a = ALLOCNO_NEXT_REGNO_ALLOCNO (a))
706 {
707 if (n % 4 == 0)
708 fprintf (f, "\n");
709 n++;
710 fprintf (f, " %4d:r%-4d", ALLOCNO_NUM (a), ALLOCNO_REGNO (a));
711 if ((bb = ALLOCNO_LOOP_TREE_NODE (a)->bb) != NULL)
712 fprintf (f, "b%-3d", bb->index);
713 else
714 fprintf (f, "l%-3d", ALLOCNO_LOOP_TREE_NODE (a)->loop_num);
715 if (ALLOCNO_HARD_REGNO (a) >= 0)
716 fprintf (f, " %3d", ALLOCNO_HARD_REGNO (a));
717 else
718 fprintf (f, " mem");
719 }
720 fprintf (f, "\n");
721 }
722
723 /* Outputs information about allocation of all allocnos into
724 stderr. */
725 void
726 ira_debug_disposition (void)
727 {
728 ira_print_disposition (stderr);
729 }
730
731 \f
732
733 /* Set up ira_stack_reg_pressure_class which is the biggest pressure
734 register class containing stack registers or NO_REGS if there are
735 no stack registers. To find this class, we iterate through all
736 register pressure classes and choose the first register pressure
737 class containing all the stack registers and having the biggest
738 size. */
739 static void
740 setup_stack_reg_pressure_class (void)
741 {
742 ira_stack_reg_pressure_class = NO_REGS;
743 #ifdef STACK_REGS
744 {
745 int i, best, size;
746 enum reg_class cl;
747 HARD_REG_SET temp_hard_regset2;
748
749 CLEAR_HARD_REG_SET (temp_hard_regset);
750 for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
751 SET_HARD_REG_BIT (temp_hard_regset, i);
752 best = 0;
753 for (i = 0; i < ira_pressure_classes_num; i++)
754 {
755 cl = ira_pressure_classes[i];
756 temp_hard_regset2 = temp_hard_regset & reg_class_contents[cl];
757 size = hard_reg_set_size (temp_hard_regset2);
758 if (best < size)
759 {
760 best = size;
761 ira_stack_reg_pressure_class = cl;
762 }
763 }
764 }
765 #endif
766 }
767
768 /* Find pressure classes which are register classes for which we
769 calculate register pressure in IRA, register pressure sensitive
770 insn scheduling, and register pressure sensitive loop invariant
771 motion.
772
773 To make register pressure calculation easy, we always use
774 non-intersected register pressure classes. A move of hard
775 registers from one register pressure class is not more expensive
776 than load and store of the hard registers. Most likely an allocno
777 class will be a subset of a register pressure class and in many
778 cases a register pressure class. That makes usage of register
779 pressure classes a good approximation to find a high register
780 pressure. */
781 static void
782 setup_pressure_classes (void)
783 {
784 int cost, i, n, curr;
785 int cl, cl2;
786 enum reg_class pressure_classes[N_REG_CLASSES];
787 int m;
788 HARD_REG_SET temp_hard_regset2;
789 bool insert_p;
790
791 if (targetm.compute_pressure_classes)
792 n = targetm.compute_pressure_classes (pressure_classes);
793 else
794 {
795 n = 0;
796 for (cl = 0; cl < N_REG_CLASSES; cl++)
797 {
798 if (ira_class_hard_regs_num[cl] == 0)
799 continue;
800 if (ira_class_hard_regs_num[cl] != 1
801 /* A register class without subclasses may contain a few
802 hard registers and movement between them is costly
803 (e.g. SPARC FPCC registers). We still should consider it
804 as a candidate for a pressure class. */
805 && alloc_reg_class_subclasses[cl][0] < cl)
806 {
807 /* Check that the moves between any hard registers of the
808 current class are not more expensive for a legal mode
809 than load/store of the hard registers of the current
810 class. Such class is a potential candidate to be a
811 register pressure class. */
812 for (m = 0; m < NUM_MACHINE_MODES; m++)
813 {
814 temp_hard_regset
815 = (reg_class_contents[cl]
816 & ~(no_unit_alloc_regs
817 | ira_prohibited_class_mode_regs[cl][m]));
818 if (hard_reg_set_empty_p (temp_hard_regset))
819 continue;
820 ira_init_register_move_cost_if_necessary ((machine_mode) m);
821 cost = ira_register_move_cost[m][cl][cl];
822 if (cost <= ira_max_memory_move_cost[m][cl][1]
823 || cost <= ira_max_memory_move_cost[m][cl][0])
824 break;
825 }
826 if (m >= NUM_MACHINE_MODES)
827 continue;
828 }
829 curr = 0;
830 insert_p = true;
831 temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
832 /* Remove so far added pressure classes which are subset of the
833 current candidate class. Prefer GENERAL_REGS as a pressure
834 register class to another class containing the same
835 allocatable hard registers. We do this because machine
836 dependent cost hooks might give wrong costs for the latter
837 class but always give the right cost for the former class
838 (GENERAL_REGS). */
839 for (i = 0; i < n; i++)
840 {
841 cl2 = pressure_classes[i];
842 temp_hard_regset2 = (reg_class_contents[cl2]
843 & ~no_unit_alloc_regs);
844 if (hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2)
845 && (temp_hard_regset != temp_hard_regset2
846 || cl2 == (int) GENERAL_REGS))
847 {
848 pressure_classes[curr++] = (enum reg_class) cl2;
849 insert_p = false;
850 continue;
851 }
852 if (hard_reg_set_subset_p (temp_hard_regset2, temp_hard_regset)
853 && (temp_hard_regset2 != temp_hard_regset
854 || cl == (int) GENERAL_REGS))
855 continue;
856 if (temp_hard_regset2 == temp_hard_regset)
857 insert_p = false;
858 pressure_classes[curr++] = (enum reg_class) cl2;
859 }
860 /* If the current candidate is a subset of a so far added
861 pressure class, don't add it to the list of the pressure
862 classes. */
863 if (insert_p)
864 pressure_classes[curr++] = (enum reg_class) cl;
865 n = curr;
866 }
867 }
868 #ifdef ENABLE_IRA_CHECKING
869 {
870 HARD_REG_SET ignore_hard_regs;
871
872 /* Check pressure classes correctness: here we check that hard
873 registers from all register pressure classes contains all hard
874 registers available for the allocation. */
875 CLEAR_HARD_REG_SET (temp_hard_regset);
876 CLEAR_HARD_REG_SET (temp_hard_regset2);
877 ignore_hard_regs = no_unit_alloc_regs;
878 for (cl = 0; cl < LIM_REG_CLASSES; cl++)
879 {
880 /* For some targets (like MIPS with MD_REGS), there are some
881 classes with hard registers available for allocation but
882 not able to hold value of any mode. */
883 for (m = 0; m < NUM_MACHINE_MODES; m++)
884 if (contains_reg_of_mode[cl][m])
885 break;
886 if (m >= NUM_MACHINE_MODES)
887 {
888 ignore_hard_regs |= reg_class_contents[cl];
889 continue;
890 }
891 for (i = 0; i < n; i++)
892 if ((int) pressure_classes[i] == cl)
893 break;
894 temp_hard_regset2 |= reg_class_contents[cl];
895 if (i < n)
896 temp_hard_regset |= reg_class_contents[cl];
897 }
898 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
899 /* Some targets (like SPARC with ICC reg) have allocatable regs
900 for which no reg class is defined. */
901 if (REGNO_REG_CLASS (i) == NO_REGS)
902 SET_HARD_REG_BIT (ignore_hard_regs, i);
903 temp_hard_regset &= ~ignore_hard_regs;
904 temp_hard_regset2 &= ~ignore_hard_regs;
905 ira_assert (hard_reg_set_subset_p (temp_hard_regset2, temp_hard_regset));
906 }
907 #endif
908 ira_pressure_classes_num = 0;
909 for (i = 0; i < n; i++)
910 {
911 cl = (int) pressure_classes[i];
912 ira_reg_pressure_class_p[cl] = true;
913 ira_pressure_classes[ira_pressure_classes_num++] = (enum reg_class) cl;
914 }
915 setup_stack_reg_pressure_class ();
916 }
917
918 /* Set up IRA_UNIFORM_CLASS_P. Uniform class is a register class
919 whose register move cost between any registers of the class is the
920 same as for all its subclasses. We use the data to speed up the
921 2nd pass of calculations of allocno costs. */
922 static void
923 setup_uniform_class_p (void)
924 {
925 int i, cl, cl2, m;
926
927 for (cl = 0; cl < N_REG_CLASSES; cl++)
928 {
929 ira_uniform_class_p[cl] = false;
930 if (ira_class_hard_regs_num[cl] == 0)
931 continue;
932 /* We cannot use alloc_reg_class_subclasses here because move
933 cost hooks does not take into account that some registers are
934 unavailable for the subtarget. E.g. for i686, INT_SSE_REGS
935 is element of alloc_reg_class_subclasses for GENERAL_REGS
936 because SSE regs are unavailable. */
937 for (i = 0; (cl2 = reg_class_subclasses[cl][i]) != LIM_REG_CLASSES; i++)
938 {
939 if (ira_class_hard_regs_num[cl2] == 0)
940 continue;
941 for (m = 0; m < NUM_MACHINE_MODES; m++)
942 if (contains_reg_of_mode[cl][m] && contains_reg_of_mode[cl2][m])
943 {
944 ira_init_register_move_cost_if_necessary ((machine_mode) m);
945 if (ira_register_move_cost[m][cl][cl]
946 != ira_register_move_cost[m][cl2][cl2])
947 break;
948 }
949 if (m < NUM_MACHINE_MODES)
950 break;
951 }
952 if (cl2 == LIM_REG_CLASSES)
953 ira_uniform_class_p[cl] = true;
954 }
955 }
956
957 /* Set up IRA_ALLOCNO_CLASSES, IRA_ALLOCNO_CLASSES_NUM,
958 IRA_IMPORTANT_CLASSES, and IRA_IMPORTANT_CLASSES_NUM.
959
960 Target may have many subtargets and not all target hard registers can
961 be used for allocation, e.g. x86 port in 32-bit mode cannot use
962 hard registers introduced in x86-64 like r8-r15). Some classes
963 might have the same allocatable hard registers, e.g. INDEX_REGS
964 and GENERAL_REGS in x86 port in 32-bit mode. To decrease different
965 calculations efforts we introduce allocno classes which contain
966 unique non-empty sets of allocatable hard-registers.
967
968 Pseudo class cost calculation in ira-costs.cc is very expensive.
969 Therefore we are trying to decrease number of classes involved in
970 such calculation. Register classes used in the cost calculation
971 are called important classes. They are allocno classes and other
972 non-empty classes whose allocatable hard register sets are inside
973 of an allocno class hard register set. From the first sight, it
974 looks like that they are just allocno classes. It is not true. In
975 example of x86-port in 32-bit mode, allocno classes will contain
976 GENERAL_REGS but not LEGACY_REGS (because allocatable hard
977 registers are the same for the both classes). The important
978 classes will contain GENERAL_REGS and LEGACY_REGS. It is done
979 because a machine description insn constraint may refers for
980 LEGACY_REGS and code in ira-costs.cc is mostly base on investigation
981 of the insn constraints. */
982 static void
983 setup_allocno_and_important_classes (void)
984 {
985 int i, j, n, cl;
986 bool set_p;
987 HARD_REG_SET temp_hard_regset2;
988 static enum reg_class classes[LIM_REG_CLASSES + 1];
989
990 n = 0;
991 /* Collect classes which contain unique sets of allocatable hard
992 registers. Prefer GENERAL_REGS to other classes containing the
993 same set of hard registers. */
994 for (i = 0; i < LIM_REG_CLASSES; i++)
995 {
996 temp_hard_regset = reg_class_contents[i] & ~no_unit_alloc_regs;
997 for (j = 0; j < n; j++)
998 {
999 cl = classes[j];
1000 temp_hard_regset2 = reg_class_contents[cl] & ~no_unit_alloc_regs;
1001 if (temp_hard_regset == temp_hard_regset2)
1002 break;
1003 }
1004 if (j >= n || targetm.additional_allocno_class_p (i))
1005 classes[n++] = (enum reg_class) i;
1006 else if (i == GENERAL_REGS)
1007 /* Prefer general regs. For i386 example, it means that
1008 we prefer GENERAL_REGS over INDEX_REGS or LEGACY_REGS
1009 (all of them consists of the same available hard
1010 registers). */
1011 classes[j] = (enum reg_class) i;
1012 }
1013 classes[n] = LIM_REG_CLASSES;
1014
1015 /* Set up classes which can be used for allocnos as classes
1016 containing non-empty unique sets of allocatable hard
1017 registers. */
1018 ira_allocno_classes_num = 0;
1019 for (i = 0; (cl = classes[i]) != LIM_REG_CLASSES; i++)
1020 if (ira_class_hard_regs_num[cl] > 0)
1021 ira_allocno_classes[ira_allocno_classes_num++] = (enum reg_class) cl;
1022 ira_important_classes_num = 0;
1023 /* Add non-allocno classes containing to non-empty set of
1024 allocatable hard regs. */
1025 for (cl = 0; cl < N_REG_CLASSES; cl++)
1026 if (ira_class_hard_regs_num[cl] > 0)
1027 {
1028 temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
1029 set_p = false;
1030 for (j = 0; j < ira_allocno_classes_num; j++)
1031 {
1032 temp_hard_regset2 = (reg_class_contents[ira_allocno_classes[j]]
1033 & ~no_unit_alloc_regs);
1034 if ((enum reg_class) cl == ira_allocno_classes[j])
1035 break;
1036 else if (hard_reg_set_subset_p (temp_hard_regset,
1037 temp_hard_regset2))
1038 set_p = true;
1039 }
1040 if (set_p && j >= ira_allocno_classes_num)
1041 ira_important_classes[ira_important_classes_num++]
1042 = (enum reg_class) cl;
1043 }
1044 /* Now add allocno classes to the important classes. */
1045 for (j = 0; j < ira_allocno_classes_num; j++)
1046 ira_important_classes[ira_important_classes_num++]
1047 = ira_allocno_classes[j];
1048 for (cl = 0; cl < N_REG_CLASSES; cl++)
1049 {
1050 ira_reg_allocno_class_p[cl] = false;
1051 ira_reg_pressure_class_p[cl] = false;
1052 }
1053 for (j = 0; j < ira_allocno_classes_num; j++)
1054 ira_reg_allocno_class_p[ira_allocno_classes[j]] = true;
1055 setup_pressure_classes ();
1056 setup_uniform_class_p ();
1057 }
1058
1059 /* Setup translation in CLASS_TRANSLATE of all classes into a class
1060 given by array CLASSES of length CLASSES_NUM. The function is used
1061 make translation any reg class to an allocno class or to an
1062 pressure class. This translation is necessary for some
1063 calculations when we can use only allocno or pressure classes and
1064 such translation represents an approximate representation of all
1065 classes.
1066
1067 The translation in case when allocatable hard register set of a
1068 given class is subset of allocatable hard register set of a class
1069 in CLASSES is pretty simple. We use smallest classes from CLASSES
1070 containing a given class. If allocatable hard register set of a
1071 given class is not a subset of any corresponding set of a class
1072 from CLASSES, we use the cheapest (with load/store point of view)
1073 class from CLASSES whose set intersects with given class set. */
1074 static void
1075 setup_class_translate_array (enum reg_class *class_translate,
1076 int classes_num, enum reg_class *classes)
1077 {
1078 int cl, mode;
1079 enum reg_class aclass, best_class, *cl_ptr;
1080 int i, cost, min_cost, best_cost;
1081
1082 for (cl = 0; cl < N_REG_CLASSES; cl++)
1083 class_translate[cl] = NO_REGS;
1084
1085 for (i = 0; i < classes_num; i++)
1086 {
1087 aclass = classes[i];
1088 for (cl_ptr = &alloc_reg_class_subclasses[aclass][0];
1089 (cl = *cl_ptr) != LIM_REG_CLASSES;
1090 cl_ptr++)
1091 if (class_translate[cl] == NO_REGS)
1092 class_translate[cl] = aclass;
1093 class_translate[aclass] = aclass;
1094 }
1095 /* For classes which are not fully covered by one of given classes
1096 (in other words covered by more one given class), use the
1097 cheapest class. */
1098 for (cl = 0; cl < N_REG_CLASSES; cl++)
1099 {
1100 if (cl == NO_REGS || class_translate[cl] != NO_REGS)
1101 continue;
1102 best_class = NO_REGS;
1103 best_cost = INT_MAX;
1104 for (i = 0; i < classes_num; i++)
1105 {
1106 aclass = classes[i];
1107 temp_hard_regset = (reg_class_contents[aclass]
1108 & reg_class_contents[cl]
1109 & ~no_unit_alloc_regs);
1110 if (! hard_reg_set_empty_p (temp_hard_regset))
1111 {
1112 min_cost = INT_MAX;
1113 for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
1114 {
1115 cost = (ira_memory_move_cost[mode][aclass][0]
1116 + ira_memory_move_cost[mode][aclass][1]);
1117 if (min_cost > cost)
1118 min_cost = cost;
1119 }
1120 if (best_class == NO_REGS || best_cost > min_cost)
1121 {
1122 best_class = aclass;
1123 best_cost = min_cost;
1124 }
1125 }
1126 }
1127 class_translate[cl] = best_class;
1128 }
1129 }
1130
1131 /* Set up array IRA_ALLOCNO_CLASS_TRANSLATE and
1132 IRA_PRESSURE_CLASS_TRANSLATE. */
1133 static void
1134 setup_class_translate (void)
1135 {
1136 setup_class_translate_array (ira_allocno_class_translate,
1137 ira_allocno_classes_num, ira_allocno_classes);
1138 setup_class_translate_array (ira_pressure_class_translate,
1139 ira_pressure_classes_num, ira_pressure_classes);
1140 }
1141
1142 /* Order numbers of allocno classes in original target allocno class
1143 array, -1 for non-allocno classes. */
1144 static int allocno_class_order[N_REG_CLASSES];
1145
1146 /* The function used to sort the important classes. */
1147 static int
1148 comp_reg_classes_func (const void *v1p, const void *v2p)
1149 {
1150 enum reg_class cl1 = *(const enum reg_class *) v1p;
1151 enum reg_class cl2 = *(const enum reg_class *) v2p;
1152 enum reg_class tcl1, tcl2;
1153 int diff;
1154
1155 tcl1 = ira_allocno_class_translate[cl1];
1156 tcl2 = ira_allocno_class_translate[cl2];
1157 if (tcl1 != NO_REGS && tcl2 != NO_REGS
1158 && (diff = allocno_class_order[tcl1] - allocno_class_order[tcl2]) != 0)
1159 return diff;
1160 return (int) cl1 - (int) cl2;
1161 }
1162
1163 /* For correct work of function setup_reg_class_relation we need to
1164 reorder important classes according to the order of their allocno
1165 classes. It places important classes containing the same
1166 allocatable hard register set adjacent to each other and allocno
1167 class with the allocatable hard register set right after the other
1168 important classes with the same set.
1169
1170 In example from comments of function
1171 setup_allocno_and_important_classes, it places LEGACY_REGS and
1172 GENERAL_REGS close to each other and GENERAL_REGS is after
1173 LEGACY_REGS. */
1174 static void
1175 reorder_important_classes (void)
1176 {
1177 int i;
1178
1179 for (i = 0; i < N_REG_CLASSES; i++)
1180 allocno_class_order[i] = -1;
1181 for (i = 0; i < ira_allocno_classes_num; i++)
1182 allocno_class_order[ira_allocno_classes[i]] = i;
1183 qsort (ira_important_classes, ira_important_classes_num,
1184 sizeof (enum reg_class), comp_reg_classes_func);
1185 for (i = 0; i < ira_important_classes_num; i++)
1186 ira_important_class_nums[ira_important_classes[i]] = i;
1187 }
1188
1189 /* Set up IRA_REG_CLASS_SUBUNION, IRA_REG_CLASS_SUPERUNION,
1190 IRA_REG_CLASS_SUPER_CLASSES, IRA_REG_CLASSES_INTERSECT, and
1191 IRA_REG_CLASSES_INTERSECT_P. For the meaning of the relations,
1192 please see corresponding comments in ira-int.h. */
1193 static void
1194 setup_reg_class_relations (void)
1195 {
1196 int i, cl1, cl2, cl3;
1197 HARD_REG_SET intersection_set, union_set, temp_set2;
1198 bool important_class_p[N_REG_CLASSES];
1199
1200 memset (important_class_p, 0, sizeof (important_class_p));
1201 for (i = 0; i < ira_important_classes_num; i++)
1202 important_class_p[ira_important_classes[i]] = true;
1203 for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1204 {
1205 ira_reg_class_super_classes[cl1][0] = LIM_REG_CLASSES;
1206 for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1207 {
1208 ira_reg_classes_intersect_p[cl1][cl2] = false;
1209 ira_reg_class_intersect[cl1][cl2] = NO_REGS;
1210 ira_reg_class_subset[cl1][cl2] = NO_REGS;
1211 temp_hard_regset = reg_class_contents[cl1] & ~no_unit_alloc_regs;
1212 temp_set2 = reg_class_contents[cl2] & ~no_unit_alloc_regs;
1213 if (hard_reg_set_empty_p (temp_hard_regset)
1214 && hard_reg_set_empty_p (temp_set2))
1215 {
1216 /* The both classes have no allocatable hard registers
1217 -- take all class hard registers into account and use
1218 reg_class_subunion and reg_class_superunion. */
1219 for (i = 0;; i++)
1220 {
1221 cl3 = reg_class_subclasses[cl1][i];
1222 if (cl3 == LIM_REG_CLASSES)
1223 break;
1224 if (reg_class_subset_p (ira_reg_class_intersect[cl1][cl2],
1225 (enum reg_class) cl3))
1226 ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3;
1227 }
1228 ira_reg_class_subunion[cl1][cl2] = reg_class_subunion[cl1][cl2];
1229 ira_reg_class_superunion[cl1][cl2] = reg_class_superunion[cl1][cl2];
1230 continue;
1231 }
1232 ira_reg_classes_intersect_p[cl1][cl2]
1233 = hard_reg_set_intersect_p (temp_hard_regset, temp_set2);
1234 if (important_class_p[cl1] && important_class_p[cl2]
1235 && hard_reg_set_subset_p (temp_hard_regset, temp_set2))
1236 {
1237 /* CL1 and CL2 are important classes and CL1 allocatable
1238 hard register set is inside of CL2 allocatable hard
1239 registers -- make CL1 a superset of CL2. */
1240 enum reg_class *p;
1241
1242 p = &ira_reg_class_super_classes[cl1][0];
1243 while (*p != LIM_REG_CLASSES)
1244 p++;
1245 *p++ = (enum reg_class) cl2;
1246 *p = LIM_REG_CLASSES;
1247 }
1248 ira_reg_class_subunion[cl1][cl2] = NO_REGS;
1249 ira_reg_class_superunion[cl1][cl2] = NO_REGS;
1250 intersection_set = (reg_class_contents[cl1]
1251 & reg_class_contents[cl2]
1252 & ~no_unit_alloc_regs);
1253 union_set = ((reg_class_contents[cl1] | reg_class_contents[cl2])
1254 & ~no_unit_alloc_regs);
1255 for (cl3 = 0; cl3 < N_REG_CLASSES; cl3++)
1256 {
1257 temp_hard_regset = reg_class_contents[cl3] & ~no_unit_alloc_regs;
1258 if (hard_reg_set_subset_p (temp_hard_regset, intersection_set))
1259 {
1260 /* CL3 allocatable hard register set is inside of
1261 intersection of allocatable hard register sets
1262 of CL1 and CL2. */
1263 if (important_class_p[cl3])
1264 {
1265 temp_set2
1266 = (reg_class_contents
1267 [ira_reg_class_intersect[cl1][cl2]]);
1268 temp_set2 &= ~no_unit_alloc_regs;
1269 if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1270 /* If the allocatable hard register sets are
1271 the same, prefer GENERAL_REGS or the
1272 smallest class for debugging
1273 purposes. */
1274 || (temp_hard_regset == temp_set2
1275 && (cl3 == GENERAL_REGS
1276 || ((ira_reg_class_intersect[cl1][cl2]
1277 != GENERAL_REGS)
1278 && hard_reg_set_subset_p
1279 (reg_class_contents[cl3],
1280 reg_class_contents
1281 [(int)
1282 ira_reg_class_intersect[cl1][cl2]])))))
1283 ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3;
1284 }
1285 temp_set2
1286 = (reg_class_contents[ira_reg_class_subset[cl1][cl2]]
1287 & ~no_unit_alloc_regs);
1288 if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1289 /* Ignore unavailable hard registers and prefer
1290 smallest class for debugging purposes. */
1291 || (temp_hard_regset == temp_set2
1292 && hard_reg_set_subset_p
1293 (reg_class_contents[cl3],
1294 reg_class_contents
1295 [(int) ira_reg_class_subset[cl1][cl2]])))
1296 ira_reg_class_subset[cl1][cl2] = (enum reg_class) cl3;
1297 }
1298 if (important_class_p[cl3]
1299 && hard_reg_set_subset_p (temp_hard_regset, union_set))
1300 {
1301 /* CL3 allocatable hard register set is inside of
1302 union of allocatable hard register sets of CL1
1303 and CL2. */
1304 temp_set2
1305 = (reg_class_contents[ira_reg_class_subunion[cl1][cl2]]
1306 & ~no_unit_alloc_regs);
1307 if (ira_reg_class_subunion[cl1][cl2] == NO_REGS
1308 || (hard_reg_set_subset_p (temp_set2, temp_hard_regset)
1309
1310 && (temp_set2 != temp_hard_regset
1311 || cl3 == GENERAL_REGS
1312 /* If the allocatable hard register sets are the
1313 same, prefer GENERAL_REGS or the smallest
1314 class for debugging purposes. */
1315 || (ira_reg_class_subunion[cl1][cl2] != GENERAL_REGS
1316 && hard_reg_set_subset_p
1317 (reg_class_contents[cl3],
1318 reg_class_contents
1319 [(int) ira_reg_class_subunion[cl1][cl2]])))))
1320 ira_reg_class_subunion[cl1][cl2] = (enum reg_class) cl3;
1321 }
1322 if (hard_reg_set_subset_p (union_set, temp_hard_regset))
1323 {
1324 /* CL3 allocatable hard register set contains union
1325 of allocatable hard register sets of CL1 and
1326 CL2. */
1327 temp_set2
1328 = (reg_class_contents[ira_reg_class_superunion[cl1][cl2]]
1329 & ~no_unit_alloc_regs);
1330 if (ira_reg_class_superunion[cl1][cl2] == NO_REGS
1331 || (hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1332
1333 && (temp_set2 != temp_hard_regset
1334 || cl3 == GENERAL_REGS
1335 /* If the allocatable hard register sets are the
1336 same, prefer GENERAL_REGS or the smallest
1337 class for debugging purposes. */
1338 || (ira_reg_class_superunion[cl1][cl2] != GENERAL_REGS
1339 && hard_reg_set_subset_p
1340 (reg_class_contents[cl3],
1341 reg_class_contents
1342 [(int) ira_reg_class_superunion[cl1][cl2]])))))
1343 ira_reg_class_superunion[cl1][cl2] = (enum reg_class) cl3;
1344 }
1345 }
1346 }
1347 }
1348 }
1349
1350 /* Output all uniform and important classes into file F. */
1351 static void
1352 print_uniform_and_important_classes (FILE *f)
1353 {
1354 int i, cl;
1355
1356 fprintf (f, "Uniform classes:\n");
1357 for (cl = 0; cl < N_REG_CLASSES; cl++)
1358 if (ira_uniform_class_p[cl])
1359 fprintf (f, " %s", reg_class_names[cl]);
1360 fprintf (f, "\nImportant classes:\n");
1361 for (i = 0; i < ira_important_classes_num; i++)
1362 fprintf (f, " %s", reg_class_names[ira_important_classes[i]]);
1363 fprintf (f, "\n");
1364 }
1365
1366 /* Output all possible allocno or pressure classes and their
1367 translation map into file F. */
1368 static void
1369 print_translated_classes (FILE *f, bool pressure_p)
1370 {
1371 int classes_num = (pressure_p
1372 ? ira_pressure_classes_num : ira_allocno_classes_num);
1373 enum reg_class *classes = (pressure_p
1374 ? ira_pressure_classes : ira_allocno_classes);
1375 enum reg_class *class_translate = (pressure_p
1376 ? ira_pressure_class_translate
1377 : ira_allocno_class_translate);
1378 int i;
1379
1380 fprintf (f, "%s classes:\n", pressure_p ? "Pressure" : "Allocno");
1381 for (i = 0; i < classes_num; i++)
1382 fprintf (f, " %s", reg_class_names[classes[i]]);
1383 fprintf (f, "\nClass translation:\n");
1384 for (i = 0; i < N_REG_CLASSES; i++)
1385 fprintf (f, " %s -> %s\n", reg_class_names[i],
1386 reg_class_names[class_translate[i]]);
1387 }
1388
1389 /* Output all possible allocno and translation classes and the
1390 translation maps into stderr. */
1391 void
1392 ira_debug_allocno_classes (void)
1393 {
1394 print_uniform_and_important_classes (stderr);
1395 print_translated_classes (stderr, false);
1396 print_translated_classes (stderr, true);
1397 }
1398
1399 /* Set up different arrays concerning class subsets, allocno and
1400 important classes. */
1401 static void
1402 find_reg_classes (void)
1403 {
1404 setup_allocno_and_important_classes ();
1405 setup_class_translate ();
1406 reorder_important_classes ();
1407 setup_reg_class_relations ();
1408 }
1409
1410 \f
1411
1412 /* Set up the array above. */
1413 static void
1414 setup_hard_regno_aclass (void)
1415 {
1416 int i;
1417
1418 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
1419 {
1420 #if 1
1421 ira_hard_regno_allocno_class[i]
1422 = (TEST_HARD_REG_BIT (no_unit_alloc_regs, i)
1423 ? NO_REGS
1424 : ira_allocno_class_translate[REGNO_REG_CLASS (i)]);
1425 #else
1426 int j;
1427 enum reg_class cl;
1428 ira_hard_regno_allocno_class[i] = NO_REGS;
1429 for (j = 0; j < ira_allocno_classes_num; j++)
1430 {
1431 cl = ira_allocno_classes[j];
1432 if (ira_class_hard_reg_index[cl][i] >= 0)
1433 {
1434 ira_hard_regno_allocno_class[i] = cl;
1435 break;
1436 }
1437 }
1438 #endif
1439 }
1440 }
1441
1442 \f
1443
1444 /* Form IRA_REG_CLASS_MAX_NREGS and IRA_REG_CLASS_MIN_NREGS maps. */
1445 static void
1446 setup_reg_class_nregs (void)
1447 {
1448 int i, cl, cl2, m;
1449
1450 for (m = 0; m < MAX_MACHINE_MODE; m++)
1451 {
1452 for (cl = 0; cl < N_REG_CLASSES; cl++)
1453 ira_reg_class_max_nregs[cl][m]
1454 = ira_reg_class_min_nregs[cl][m]
1455 = targetm.class_max_nregs ((reg_class_t) cl, (machine_mode) m);
1456 for (cl = 0; cl < N_REG_CLASSES; cl++)
1457 for (i = 0;
1458 (cl2 = alloc_reg_class_subclasses[cl][i]) != LIM_REG_CLASSES;
1459 i++)
1460 if (ira_reg_class_min_nregs[cl2][m]
1461 < ira_reg_class_min_nregs[cl][m])
1462 ira_reg_class_min_nregs[cl][m] = ira_reg_class_min_nregs[cl2][m];
1463 }
1464 }
1465
1466 \f
1467
1468 /* Set up IRA_PROHIBITED_CLASS_MODE_REGS, IRA_EXCLUDE_CLASS_MODE_REGS, and
1469 IRA_CLASS_SINGLETON. This function is called once IRA_CLASS_HARD_REGS has
1470 been initialized. */
1471 static void
1472 setup_prohibited_and_exclude_class_mode_regs (void)
1473 {
1474 int j, k, hard_regno, cl, last_hard_regno, count;
1475
1476 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
1477 {
1478 temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
1479 for (j = 0; j < NUM_MACHINE_MODES; j++)
1480 {
1481 count = 0;
1482 last_hard_regno = -1;
1483 CLEAR_HARD_REG_SET (ira_prohibited_class_mode_regs[cl][j]);
1484 CLEAR_HARD_REG_SET (ira_exclude_class_mode_regs[cl][j]);
1485 for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--)
1486 {
1487 hard_regno = ira_class_hard_regs[cl][k];
1488 if (!targetm.hard_regno_mode_ok (hard_regno, (machine_mode) j))
1489 SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1490 hard_regno);
1491 else if (in_hard_reg_set_p (temp_hard_regset,
1492 (machine_mode) j, hard_regno))
1493 {
1494 last_hard_regno = hard_regno;
1495 count++;
1496 }
1497 else
1498 {
1499 SET_HARD_REG_BIT (ira_exclude_class_mode_regs[cl][j], hard_regno);
1500 }
1501 }
1502 ira_class_singleton[cl][j] = (count == 1 ? last_hard_regno : -1);
1503 }
1504 }
1505 }
1506
1507 /* Clarify IRA_PROHIBITED_CLASS_MODE_REGS by excluding hard registers
1508 spanning from one register pressure class to another one. It is
1509 called after defining the pressure classes. */
1510 static void
1511 clarify_prohibited_class_mode_regs (void)
1512 {
1513 int j, k, hard_regno, cl, pclass, nregs;
1514
1515 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
1516 for (j = 0; j < NUM_MACHINE_MODES; j++)
1517 {
1518 CLEAR_HARD_REG_SET (ira_useful_class_mode_regs[cl][j]);
1519 for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--)
1520 {
1521 hard_regno = ira_class_hard_regs[cl][k];
1522 if (TEST_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j], hard_regno))
1523 continue;
1524 nregs = hard_regno_nregs (hard_regno, (machine_mode) j);
1525 if (hard_regno + nregs > FIRST_PSEUDO_REGISTER)
1526 {
1527 SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1528 hard_regno);
1529 continue;
1530 }
1531 pclass = ira_pressure_class_translate[REGNO_REG_CLASS (hard_regno)];
1532 for (nregs-- ;nregs >= 0; nregs--)
1533 if (((enum reg_class) pclass
1534 != ira_pressure_class_translate[REGNO_REG_CLASS
1535 (hard_regno + nregs)]))
1536 {
1537 SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1538 hard_regno);
1539 break;
1540 }
1541 if (!TEST_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1542 hard_regno))
1543 add_to_hard_reg_set (&ira_useful_class_mode_regs[cl][j],
1544 (machine_mode) j, hard_regno);
1545 }
1546 }
1547 }
1548 \f
1549 /* Allocate and initialize IRA_REGISTER_MOVE_COST, IRA_MAY_MOVE_IN_COST
1550 and IRA_MAY_MOVE_OUT_COST for MODE. */
1551 void
1552 ira_init_register_move_cost (machine_mode mode)
1553 {
1554 static unsigned short last_move_cost[N_REG_CLASSES][N_REG_CLASSES];
1555 bool all_match = true;
1556 unsigned int i, cl1, cl2;
1557 HARD_REG_SET ok_regs;
1558
1559 ira_assert (ira_register_move_cost[mode] == NULL
1560 && ira_may_move_in_cost[mode] == NULL
1561 && ira_may_move_out_cost[mode] == NULL);
1562 CLEAR_HARD_REG_SET (ok_regs);
1563 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
1564 if (targetm.hard_regno_mode_ok (i, mode))
1565 SET_HARD_REG_BIT (ok_regs, i);
1566
1567 /* Note that we might be asked about the move costs of modes that
1568 cannot be stored in any hard register, for example if an inline
1569 asm tries to create a register operand with an impossible mode.
1570 We therefore can't assert have_regs_of_mode[mode] here. */
1571 for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1572 for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1573 {
1574 int cost;
1575 if (!hard_reg_set_intersect_p (ok_regs, reg_class_contents[cl1])
1576 || !hard_reg_set_intersect_p (ok_regs, reg_class_contents[cl2]))
1577 {
1578 if ((ira_reg_class_max_nregs[cl1][mode]
1579 > ira_class_hard_regs_num[cl1])
1580 || (ira_reg_class_max_nregs[cl2][mode]
1581 > ira_class_hard_regs_num[cl2]))
1582 cost = 65535;
1583 else
1584 cost = (ira_memory_move_cost[mode][cl1][0]
1585 + ira_memory_move_cost[mode][cl2][1]) * 2;
1586 }
1587 else
1588 {
1589 cost = register_move_cost (mode, (enum reg_class) cl1,
1590 (enum reg_class) cl2);
1591 ira_assert (cost < 65535);
1592 }
1593 all_match &= (last_move_cost[cl1][cl2] == cost);
1594 last_move_cost[cl1][cl2] = cost;
1595 }
1596 if (all_match && last_mode_for_init_move_cost != -1)
1597 {
1598 ira_register_move_cost[mode]
1599 = ira_register_move_cost[last_mode_for_init_move_cost];
1600 ira_may_move_in_cost[mode]
1601 = ira_may_move_in_cost[last_mode_for_init_move_cost];
1602 ira_may_move_out_cost[mode]
1603 = ira_may_move_out_cost[last_mode_for_init_move_cost];
1604 return;
1605 }
1606 last_mode_for_init_move_cost = mode;
1607 ira_register_move_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1608 ira_may_move_in_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1609 ira_may_move_out_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1610 for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1611 for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1612 {
1613 int cost;
1614 enum reg_class *p1, *p2;
1615
1616 if (last_move_cost[cl1][cl2] == 65535)
1617 {
1618 ira_register_move_cost[mode][cl1][cl2] = 65535;
1619 ira_may_move_in_cost[mode][cl1][cl2] = 65535;
1620 ira_may_move_out_cost[mode][cl1][cl2] = 65535;
1621 }
1622 else
1623 {
1624 cost = last_move_cost[cl1][cl2];
1625
1626 for (p2 = &reg_class_subclasses[cl2][0];
1627 *p2 != LIM_REG_CLASSES; p2++)
1628 if (ira_class_hard_regs_num[*p2] > 0
1629 && (ira_reg_class_max_nregs[*p2][mode]
1630 <= ira_class_hard_regs_num[*p2])
1631 && hard_reg_set_intersect_p (ok_regs,
1632 reg_class_contents[cl1])
1633 && hard_reg_set_intersect_p (ok_regs,
1634 reg_class_contents[*p2]))
1635 cost = MAX (cost, ira_register_move_cost[mode][cl1][*p2]);
1636
1637 for (p1 = &reg_class_subclasses[cl1][0];
1638 *p1 != LIM_REG_CLASSES; p1++)
1639 if (ira_class_hard_regs_num[*p1] > 0
1640 && (ira_reg_class_max_nregs[*p1][mode]
1641 <= ira_class_hard_regs_num[*p1])
1642 && hard_reg_set_intersect_p (ok_regs,
1643 reg_class_contents[cl2])
1644 && hard_reg_set_intersect_p (ok_regs,
1645 reg_class_contents[*p1]))
1646 cost = MAX (cost, ira_register_move_cost[mode][*p1][cl2]);
1647
1648 ira_assert (cost <= 65535);
1649 ira_register_move_cost[mode][cl1][cl2] = cost;
1650
1651 if (ira_class_subset_p[cl1][cl2])
1652 ira_may_move_in_cost[mode][cl1][cl2] = 0;
1653 else
1654 ira_may_move_in_cost[mode][cl1][cl2] = cost;
1655
1656 if (ira_class_subset_p[cl2][cl1])
1657 ira_may_move_out_cost[mode][cl1][cl2] = 0;
1658 else
1659 ira_may_move_out_cost[mode][cl1][cl2] = cost;
1660 }
1661 }
1662 }
1663
1664 \f
1665
1666 /* This is called once during compiler work. It sets up
1667 different arrays whose values don't depend on the compiled
1668 function. */
1669 void
1670 ira_init_once (void)
1671 {
1672 ira_init_costs_once ();
1673 lra_init_once ();
1674
1675 ira_use_lra_p = targetm.lra_p ();
1676 }
1677
1678 /* Free ira_max_register_move_cost, ira_may_move_in_cost and
1679 ira_may_move_out_cost for each mode. */
1680 void
1681 target_ira_int::free_register_move_costs (void)
1682 {
1683 int mode, i;
1684
1685 /* Reset move_cost and friends, making sure we only free shared
1686 table entries once. */
1687 for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
1688 if (x_ira_register_move_cost[mode])
1689 {
1690 for (i = 0;
1691 i < mode && (x_ira_register_move_cost[i]
1692 != x_ira_register_move_cost[mode]);
1693 i++)
1694 ;
1695 if (i == mode)
1696 {
1697 free (x_ira_register_move_cost[mode]);
1698 free (x_ira_may_move_in_cost[mode]);
1699 free (x_ira_may_move_out_cost[mode]);
1700 }
1701 }
1702 memset (x_ira_register_move_cost, 0, sizeof x_ira_register_move_cost);
1703 memset (x_ira_may_move_in_cost, 0, sizeof x_ira_may_move_in_cost);
1704 memset (x_ira_may_move_out_cost, 0, sizeof x_ira_may_move_out_cost);
1705 last_mode_for_init_move_cost = -1;
1706 }
1707
1708 target_ira_int::~target_ira_int ()
1709 {
1710 free_ira_costs ();
1711 free_register_move_costs ();
1712 }
1713
1714 /* This is called every time when register related information is
1715 changed. */
1716 void
1717 ira_init (void)
1718 {
1719 this_target_ira_int->free_register_move_costs ();
1720 setup_reg_mode_hard_regset ();
1721 setup_alloc_regs (flag_omit_frame_pointer != 0);
1722 setup_class_subset_and_memory_move_costs ();
1723 setup_reg_class_nregs ();
1724 setup_prohibited_and_exclude_class_mode_regs ();
1725 find_reg_classes ();
1726 clarify_prohibited_class_mode_regs ();
1727 setup_hard_regno_aclass ();
1728 ira_init_costs ();
1729 }
1730
1731 \f
1732 #define ira_prohibited_mode_move_regs_initialized_p \
1733 (this_target_ira_int->x_ira_prohibited_mode_move_regs_initialized_p)
1734
1735 /* Set up IRA_PROHIBITED_MODE_MOVE_REGS. */
1736 static void
1737 setup_prohibited_mode_move_regs (void)
1738 {
1739 int i, j;
1740 rtx test_reg1, test_reg2, move_pat;
1741 rtx_insn *move_insn;
1742
1743 if (ira_prohibited_mode_move_regs_initialized_p)
1744 return;
1745 ira_prohibited_mode_move_regs_initialized_p = true;
1746 test_reg1 = gen_rtx_REG (word_mode, LAST_VIRTUAL_REGISTER + 1);
1747 test_reg2 = gen_rtx_REG (word_mode, LAST_VIRTUAL_REGISTER + 2);
1748 move_pat = gen_rtx_SET (test_reg1, test_reg2);
1749 move_insn = gen_rtx_INSN (VOIDmode, 0, 0, 0, move_pat, 0, -1, 0);
1750 for (i = 0; i < NUM_MACHINE_MODES; i++)
1751 {
1752 SET_HARD_REG_SET (ira_prohibited_mode_move_regs[i]);
1753 for (j = 0; j < FIRST_PSEUDO_REGISTER; j++)
1754 {
1755 if (!targetm.hard_regno_mode_ok (j, (machine_mode) i))
1756 continue;
1757 set_mode_and_regno (test_reg1, (machine_mode) i, j);
1758 set_mode_and_regno (test_reg2, (machine_mode) i, j);
1759 INSN_CODE (move_insn) = -1;
1760 recog_memoized (move_insn);
1761 if (INSN_CODE (move_insn) < 0)
1762 continue;
1763 extract_insn (move_insn);
1764 /* We don't know whether the move will be in code that is optimized
1765 for size or speed, so consider all enabled alternatives. */
1766 if (! constrain_operands (1, get_enabled_alternatives (move_insn)))
1767 continue;
1768 CLEAR_HARD_REG_BIT (ira_prohibited_mode_move_regs[i], j);
1769 }
1770 }
1771 }
1772
1773 \f
1774
1775 /* Extract INSN and return the set of alternatives that we should consider.
1776 This excludes any alternatives whose constraints are obviously impossible
1777 to meet (e.g. because the constraint requires a constant and the operand
1778 is nonconstant). It also excludes alternatives that are bound to need
1779 a spill or reload, as long as we have other alternatives that match
1780 exactly. */
1781 alternative_mask
1782 ira_setup_alts (rtx_insn *insn)
1783 {
1784 int nop, nalt;
1785 bool curr_swapped;
1786 const char *p;
1787 int commutative = -1;
1788
1789 extract_insn (insn);
1790 preprocess_constraints (insn);
1791 alternative_mask preferred = get_preferred_alternatives (insn);
1792 alternative_mask alts = 0;
1793 alternative_mask exact_alts = 0;
1794 /* Check that the hard reg set is enough for holding all
1795 alternatives. It is hard to imagine the situation when the
1796 assertion is wrong. */
1797 ira_assert (recog_data.n_alternatives
1798 <= (int) MAX (sizeof (HARD_REG_ELT_TYPE) * CHAR_BIT,
1799 FIRST_PSEUDO_REGISTER));
1800 for (nop = 0; nop < recog_data.n_operands; nop++)
1801 if (recog_data.constraints[nop][0] == '%')
1802 {
1803 commutative = nop;
1804 break;
1805 }
1806 for (curr_swapped = false;; curr_swapped = true)
1807 {
1808 for (nalt = 0; nalt < recog_data.n_alternatives; nalt++)
1809 {
1810 if (!TEST_BIT (preferred, nalt) || TEST_BIT (exact_alts, nalt))
1811 continue;
1812
1813 const operand_alternative *op_alt
1814 = &recog_op_alt[nalt * recog_data.n_operands];
1815 int this_reject = 0;
1816 for (nop = 0; nop < recog_data.n_operands; nop++)
1817 {
1818 int c, len;
1819
1820 this_reject += op_alt[nop].reject;
1821
1822 rtx op = recog_data.operand[nop];
1823 p = op_alt[nop].constraint;
1824 if (*p == 0 || *p == ',')
1825 continue;
1826
1827 bool win_p = false;
1828 do
1829 switch (c = *p, len = CONSTRAINT_LEN (c, p), c)
1830 {
1831 case '#':
1832 case ',':
1833 c = '\0';
1834 /* FALLTHRU */
1835 case '\0':
1836 len = 0;
1837 break;
1838
1839 case '%':
1840 /* The commutative modifier is handled above. */
1841 break;
1842
1843 case '0': case '1': case '2': case '3': case '4':
1844 case '5': case '6': case '7': case '8': case '9':
1845 {
1846 char *end;
1847 unsigned long dup = strtoul (p, &end, 10);
1848 rtx other = recog_data.operand[dup];
1849 len = end - p;
1850 if (MEM_P (other)
1851 ? rtx_equal_p (other, op)
1852 : REG_P (op) || SUBREG_P (op))
1853 goto op_success;
1854 win_p = true;
1855 }
1856 break;
1857
1858 case 'g':
1859 goto op_success;
1860 break;
1861
1862 default:
1863 {
1864 enum constraint_num cn = lookup_constraint (p);
1865 rtx mem = NULL;
1866 switch (get_constraint_type (cn))
1867 {
1868 case CT_REGISTER:
1869 if (reg_class_for_constraint (cn) != NO_REGS)
1870 {
1871 if (REG_P (op) || SUBREG_P (op))
1872 goto op_success;
1873 win_p = true;
1874 }
1875 break;
1876
1877 case CT_CONST_INT:
1878 if (CONST_INT_P (op)
1879 && (insn_const_int_ok_for_constraint
1880 (INTVAL (op), cn)))
1881 goto op_success;
1882 break;
1883
1884 case CT_ADDRESS:
1885 goto op_success;
1886
1887 case CT_MEMORY:
1888 case CT_RELAXED_MEMORY:
1889 mem = op;
1890 /* Fall through. */
1891 case CT_SPECIAL_MEMORY:
1892 if (!mem)
1893 mem = extract_mem_from_operand (op);
1894 if (MEM_P (mem))
1895 goto op_success;
1896 win_p = true;
1897 break;
1898
1899 case CT_FIXED_FORM:
1900 if (constraint_satisfied_p (op, cn))
1901 goto op_success;
1902 break;
1903 }
1904 break;
1905 }
1906 }
1907 while (p += len, c);
1908 if (!win_p)
1909 break;
1910 /* We can make the alternative match by spilling a register
1911 to memory or loading something into a register. Count a
1912 cost of one reload (the equivalent of the '?' constraint). */
1913 this_reject += 6;
1914 op_success:
1915 ;
1916 }
1917
1918 if (nop >= recog_data.n_operands)
1919 {
1920 alts |= ALTERNATIVE_BIT (nalt);
1921 if (this_reject == 0)
1922 exact_alts |= ALTERNATIVE_BIT (nalt);
1923 }
1924 }
1925 if (commutative < 0)
1926 break;
1927 /* Swap forth and back to avoid changing recog_data. */
1928 std::swap (recog_data.operand[commutative],
1929 recog_data.operand[commutative + 1]);
1930 if (curr_swapped)
1931 break;
1932 }
1933 return exact_alts ? exact_alts : alts;
1934 }
1935
1936 /* Return the number of the output non-early clobber operand which
1937 should be the same in any case as operand with number OP_NUM (or
1938 negative value if there is no such operand). ALTS is the mask
1939 of alternatives that we should consider. SINGLE_INPUT_OP_HAS_CSTR_P
1940 should be set in this function, it indicates whether there is only
1941 a single input operand which has the matching constraint on the
1942 output operand at the position specified in return value. If the
1943 pattern allows any one of several input operands holds the matching
1944 constraint, it's set as false, one typical case is destructive FMA
1945 instruction on target rs6000. Note that for a non-NO_REG preferred
1946 register class with no free register move copy, if the parameter
1947 PARAM_IRA_CONSIDER_DUP_IN_ALL_ALTS is set to one, this function
1948 will check all available alternatives for matching constraints,
1949 even if it has found or will find one alternative with non-NO_REG
1950 regclass, it can respect more cases with matching constraints. If
1951 PARAM_IRA_CONSIDER_DUP_IN_ALL_ALTS is set to zero,
1952 SINGLE_INPUT_OP_HAS_CSTR_P is always true, it will stop to find
1953 matching constraint relationship once it hits some alternative with
1954 some non-NO_REG regclass. */
1955 int
1956 ira_get_dup_out_num (int op_num, alternative_mask alts,
1957 bool &single_input_op_has_cstr_p)
1958 {
1959 int curr_alt, c, original;
1960 bool ignore_p, use_commut_op_p;
1961 const char *str;
1962
1963 if (op_num < 0 || recog_data.n_alternatives == 0)
1964 return -1;
1965 /* We should find duplications only for input operands. */
1966 if (recog_data.operand_type[op_num] != OP_IN)
1967 return -1;
1968 str = recog_data.constraints[op_num];
1969 use_commut_op_p = false;
1970 single_input_op_has_cstr_p = true;
1971
1972 rtx op = recog_data.operand[op_num];
1973 int op_regno = reg_or_subregno (op);
1974 enum reg_class op_pref_cl = reg_preferred_class (op_regno);
1975 machine_mode op_mode = GET_MODE (op);
1976
1977 ira_init_register_move_cost_if_necessary (op_mode);
1978 /* If the preferred regclass isn't NO_REG, continue to find the matching
1979 constraint in all available alternatives with preferred regclass, even
1980 if we have found or will find one alternative whose constraint stands
1981 for a REG (non-NO_REG) regclass. Note that it would be fine not to
1982 respect matching constraint if the register copy is free, so exclude
1983 it. */
1984 bool respect_dup_despite_reg_cstr
1985 = param_ira_consider_dup_in_all_alts
1986 && op_pref_cl != NO_REGS
1987 && ira_register_move_cost[op_mode][op_pref_cl][op_pref_cl] > 0;
1988
1989 /* Record the alternative whose constraint uses the same regclass as the
1990 preferred regclass, later if we find one matching constraint for this
1991 operand with preferred reclass, we will visit these recorded
1992 alternatives to check whether if there is one alternative in which no
1993 any INPUT operands have one matching constraint same as our candidate.
1994 If yes, it means there is one alternative which is perfectly fine
1995 without satisfying this matching constraint. If no, it means in any
1996 alternatives there is one other INPUT operand holding this matching
1997 constraint, it's fine to respect this matching constraint and further
1998 create this constraint copy since it would become harmless once some
1999 other takes preference and it's interfered. */
2000 alternative_mask pref_cl_alts;
2001
2002 for (;;)
2003 {
2004 pref_cl_alts = 0;
2005
2006 for (curr_alt = 0, ignore_p = !TEST_BIT (alts, curr_alt),
2007 original = -1;;)
2008 {
2009 c = *str;
2010 if (c == '\0')
2011 break;
2012 if (c == '#')
2013 ignore_p = true;
2014 else if (c == ',')
2015 {
2016 curr_alt++;
2017 ignore_p = !TEST_BIT (alts, curr_alt);
2018 }
2019 else if (! ignore_p)
2020 switch (c)
2021 {
2022 case 'g':
2023 goto fail;
2024 default:
2025 {
2026 enum constraint_num cn = lookup_constraint (str);
2027 enum reg_class cl = reg_class_for_constraint (cn);
2028 if (cl != NO_REGS && !targetm.class_likely_spilled_p (cl))
2029 {
2030 if (respect_dup_despite_reg_cstr)
2031 {
2032 /* If it's free to move from one preferred class to
2033 the one without matching constraint, it doesn't
2034 have to respect this constraint with costs. */
2035 if (cl != op_pref_cl
2036 && (ira_reg_class_intersect[cl][op_pref_cl]
2037 != NO_REGS)
2038 && (ira_may_move_in_cost[op_mode][op_pref_cl][cl]
2039 == 0))
2040 goto fail;
2041 else if (cl == op_pref_cl)
2042 pref_cl_alts |= ALTERNATIVE_BIT (curr_alt);
2043 }
2044 else
2045 goto fail;
2046 }
2047 if (constraint_satisfied_p (op, cn))
2048 goto fail;
2049 break;
2050 }
2051
2052 case '0': case '1': case '2': case '3': case '4':
2053 case '5': case '6': case '7': case '8': case '9':
2054 {
2055 char *end;
2056 int n = (int) strtoul (str, &end, 10);
2057 str = end;
2058 if (original != -1 && original != n)
2059 goto fail;
2060 gcc_assert (n < recog_data.n_operands);
2061 if (respect_dup_despite_reg_cstr)
2062 {
2063 const operand_alternative *op_alt
2064 = &recog_op_alt[curr_alt * recog_data.n_operands];
2065 /* Only respect the one with preferred rclass, without
2066 respect_dup_despite_reg_cstr it's possible to get
2067 one whose regclass isn't preferred first before,
2068 but it would fail since there should be other
2069 alternatives with preferred regclass. */
2070 if (op_alt[n].cl == op_pref_cl)
2071 original = n;
2072 }
2073 else
2074 original = n;
2075 continue;
2076 }
2077 }
2078 str += CONSTRAINT_LEN (c, str);
2079 }
2080 if (original == -1)
2081 goto fail;
2082 if (recog_data.operand_type[original] == OP_OUT)
2083 {
2084 if (pref_cl_alts == 0)
2085 return original;
2086 /* Visit these recorded alternatives to check whether
2087 there is one alternative in which no any INPUT operands
2088 have one matching constraint same as our candidate.
2089 Give up this candidate if so. */
2090 int nop, nalt;
2091 for (nalt = 0; nalt < recog_data.n_alternatives; nalt++)
2092 {
2093 if (!TEST_BIT (pref_cl_alts, nalt))
2094 continue;
2095 const operand_alternative *op_alt
2096 = &recog_op_alt[nalt * recog_data.n_operands];
2097 bool dup_in_other = false;
2098 for (nop = 0; nop < recog_data.n_operands; nop++)
2099 {
2100 if (recog_data.operand_type[nop] != OP_IN)
2101 continue;
2102 if (nop == op_num)
2103 continue;
2104 if (op_alt[nop].matches == original)
2105 {
2106 dup_in_other = true;
2107 break;
2108 }
2109 }
2110 if (!dup_in_other)
2111 return -1;
2112 }
2113 single_input_op_has_cstr_p = false;
2114 return original;
2115 }
2116 fail:
2117 if (use_commut_op_p)
2118 break;
2119 use_commut_op_p = true;
2120 if (recog_data.constraints[op_num][0] == '%')
2121 str = recog_data.constraints[op_num + 1];
2122 else if (op_num > 0 && recog_data.constraints[op_num - 1][0] == '%')
2123 str = recog_data.constraints[op_num - 1];
2124 else
2125 break;
2126 }
2127 return -1;
2128 }
2129
2130 \f
2131
2132 /* Search forward to see if the source register of a copy insn dies
2133 before either it or the destination register is modified, but don't
2134 scan past the end of the basic block. If so, we can replace the
2135 source with the destination and let the source die in the copy
2136 insn.
2137
2138 This will reduce the number of registers live in that range and may
2139 enable the destination and the source coalescing, thus often saving
2140 one register in addition to a register-register copy. */
2141
2142 static void
2143 decrease_live_ranges_number (void)
2144 {
2145 basic_block bb;
2146 rtx_insn *insn;
2147 rtx set, src, dest, dest_death, note;
2148 rtx_insn *p, *q;
2149 int sregno, dregno;
2150
2151 if (! flag_expensive_optimizations)
2152 return;
2153
2154 if (ira_dump_file)
2155 fprintf (ira_dump_file, "Starting decreasing number of live ranges...\n");
2156
2157 FOR_EACH_BB_FN (bb, cfun)
2158 FOR_BB_INSNS (bb, insn)
2159 {
2160 set = single_set (insn);
2161 if (! set)
2162 continue;
2163 src = SET_SRC (set);
2164 dest = SET_DEST (set);
2165 if (! REG_P (src) || ! REG_P (dest)
2166 || find_reg_note (insn, REG_DEAD, src))
2167 continue;
2168 sregno = REGNO (src);
2169 dregno = REGNO (dest);
2170
2171 /* We don't want to mess with hard regs if register classes
2172 are small. */
2173 if (sregno == dregno
2174 || (targetm.small_register_classes_for_mode_p (GET_MODE (src))
2175 && (sregno < FIRST_PSEUDO_REGISTER
2176 || dregno < FIRST_PSEUDO_REGISTER))
2177 /* We don't see all updates to SP if they are in an
2178 auto-inc memory reference, so we must disallow this
2179 optimization on them. */
2180 || sregno == STACK_POINTER_REGNUM
2181 || dregno == STACK_POINTER_REGNUM)
2182 continue;
2183
2184 dest_death = NULL_RTX;
2185
2186 for (p = NEXT_INSN (insn); p; p = NEXT_INSN (p))
2187 {
2188 if (! INSN_P (p))
2189 continue;
2190 if (BLOCK_FOR_INSN (p) != bb)
2191 break;
2192
2193 if (reg_set_p (src, p) || reg_set_p (dest, p)
2194 /* If SRC is an asm-declared register, it must not be
2195 replaced in any asm. Unfortunately, the REG_EXPR
2196 tree for the asm variable may be absent in the SRC
2197 rtx, so we can't check the actual register
2198 declaration easily (the asm operand will have it,
2199 though). To avoid complicating the test for a rare
2200 case, we just don't perform register replacement
2201 for a hard reg mentioned in an asm. */
2202 || (sregno < FIRST_PSEUDO_REGISTER
2203 && asm_noperands (PATTERN (p)) >= 0
2204 && reg_overlap_mentioned_p (src, PATTERN (p)))
2205 /* Don't change hard registers used by a call. */
2206 || (CALL_P (p) && sregno < FIRST_PSEUDO_REGISTER
2207 && find_reg_fusage (p, USE, src))
2208 /* Don't change a USE of a register. */
2209 || (GET_CODE (PATTERN (p)) == USE
2210 && reg_overlap_mentioned_p (src, XEXP (PATTERN (p), 0))))
2211 break;
2212
2213 /* See if all of SRC dies in P. This test is slightly
2214 more conservative than it needs to be. */
2215 if ((note = find_regno_note (p, REG_DEAD, sregno))
2216 && GET_MODE (XEXP (note, 0)) == GET_MODE (src))
2217 {
2218 int failed = 0;
2219
2220 /* We can do the optimization. Scan forward from INSN
2221 again, replacing regs as we go. Set FAILED if a
2222 replacement can't be done. In that case, we can't
2223 move the death note for SRC. This should be
2224 rare. */
2225
2226 /* Set to stop at next insn. */
2227 for (q = next_real_insn (insn);
2228 q != next_real_insn (p);
2229 q = next_real_insn (q))
2230 {
2231 if (reg_overlap_mentioned_p (src, PATTERN (q)))
2232 {
2233 /* If SRC is a hard register, we might miss
2234 some overlapping registers with
2235 validate_replace_rtx, so we would have to
2236 undo it. We can't if DEST is present in
2237 the insn, so fail in that combination of
2238 cases. */
2239 if (sregno < FIRST_PSEUDO_REGISTER
2240 && reg_mentioned_p (dest, PATTERN (q)))
2241 failed = 1;
2242
2243 /* Attempt to replace all uses. */
2244 else if (!validate_replace_rtx (src, dest, q))
2245 failed = 1;
2246
2247 /* If this succeeded, but some part of the
2248 register is still present, undo the
2249 replacement. */
2250 else if (sregno < FIRST_PSEUDO_REGISTER
2251 && reg_overlap_mentioned_p (src, PATTERN (q)))
2252 {
2253 validate_replace_rtx (dest, src, q);
2254 failed = 1;
2255 }
2256 }
2257
2258 /* If DEST dies here, remove the death note and
2259 save it for later. Make sure ALL of DEST dies
2260 here; again, this is overly conservative. */
2261 if (! dest_death
2262 && (dest_death = find_regno_note (q, REG_DEAD, dregno)))
2263 {
2264 if (GET_MODE (XEXP (dest_death, 0)) == GET_MODE (dest))
2265 remove_note (q, dest_death);
2266 else
2267 {
2268 failed = 1;
2269 dest_death = 0;
2270 }
2271 }
2272 }
2273
2274 if (! failed)
2275 {
2276 /* Move death note of SRC from P to INSN. */
2277 remove_note (p, note);
2278 XEXP (note, 1) = REG_NOTES (insn);
2279 REG_NOTES (insn) = note;
2280 }
2281
2282 /* DEST is also dead if INSN has a REG_UNUSED note for
2283 DEST. */
2284 if (! dest_death
2285 && (dest_death
2286 = find_regno_note (insn, REG_UNUSED, dregno)))
2287 {
2288 PUT_REG_NOTE_KIND (dest_death, REG_DEAD);
2289 remove_note (insn, dest_death);
2290 }
2291
2292 /* Put death note of DEST on P if we saw it die. */
2293 if (dest_death)
2294 {
2295 XEXP (dest_death, 1) = REG_NOTES (p);
2296 REG_NOTES (p) = dest_death;
2297 }
2298 break;
2299 }
2300
2301 /* If SRC is a hard register which is set or killed in
2302 some other way, we can't do this optimization. */
2303 else if (sregno < FIRST_PSEUDO_REGISTER && dead_or_set_p (p, src))
2304 break;
2305 }
2306 }
2307 }
2308
2309 \f
2310
2311 /* Return nonzero if REGNO is a particularly bad choice for reloading X. */
2312 static bool
2313 ira_bad_reload_regno_1 (int regno, rtx x)
2314 {
2315 int x_regno, n, i;
2316 ira_allocno_t a;
2317 enum reg_class pref;
2318
2319 /* We only deal with pseudo regs. */
2320 if (! x || GET_CODE (x) != REG)
2321 return false;
2322
2323 x_regno = REGNO (x);
2324 if (x_regno < FIRST_PSEUDO_REGISTER)
2325 return false;
2326
2327 /* If the pseudo prefers REGNO explicitly, then do not consider
2328 REGNO a bad spill choice. */
2329 pref = reg_preferred_class (x_regno);
2330 if (reg_class_size[pref] == 1)
2331 return !TEST_HARD_REG_BIT (reg_class_contents[pref], regno);
2332
2333 /* If the pseudo conflicts with REGNO, then we consider REGNO a
2334 poor choice for a reload regno. */
2335 a = ira_regno_allocno_map[x_regno];
2336 n = ALLOCNO_NUM_OBJECTS (a);
2337 for (i = 0; i < n; i++)
2338 {
2339 ira_object_t obj = ALLOCNO_OBJECT (a, i);
2340 if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno))
2341 return true;
2342 }
2343 return false;
2344 }
2345
2346 /* Return nonzero if REGNO is a particularly bad choice for reloading
2347 IN or OUT. */
2348 bool
2349 ira_bad_reload_regno (int regno, rtx in, rtx out)
2350 {
2351 return (ira_bad_reload_regno_1 (regno, in)
2352 || ira_bad_reload_regno_1 (regno, out));
2353 }
2354
2355 /* Add register clobbers from asm statements. */
2356 static void
2357 compute_regs_asm_clobbered (void)
2358 {
2359 basic_block bb;
2360
2361 FOR_EACH_BB_FN (bb, cfun)
2362 {
2363 rtx_insn *insn;
2364 FOR_BB_INSNS_REVERSE (bb, insn)
2365 {
2366 df_ref def;
2367
2368 if (NONDEBUG_INSN_P (insn) && asm_noperands (PATTERN (insn)) >= 0)
2369 FOR_EACH_INSN_DEF (def, insn)
2370 {
2371 unsigned int dregno = DF_REF_REGNO (def);
2372 if (HARD_REGISTER_NUM_P (dregno))
2373 add_to_hard_reg_set (&crtl->asm_clobbers,
2374 GET_MODE (DF_REF_REAL_REG (def)),
2375 dregno);
2376 }
2377 }
2378 }
2379 }
2380
2381
2382 /* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and
2383 REGS_EVER_LIVE. */
2384 void
2385 ira_setup_eliminable_regset (void)
2386 {
2387 int i;
2388 static const struct {const int from, to; } eliminables[] = ELIMINABLE_REGS;
2389 int fp_reg_count = hard_regno_nregs (HARD_FRAME_POINTER_REGNUM, Pmode);
2390
2391 /* Setup is_leaf as frame_pointer_required may use it. This function
2392 is called by sched_init before ira if scheduling is enabled. */
2393 crtl->is_leaf = leaf_function_p ();
2394
2395 /* FIXME: If EXIT_IGNORE_STACK is set, we will not save and restore
2396 sp for alloca. So we can't eliminate the frame pointer in that
2397 case. At some point, we should improve this by emitting the
2398 sp-adjusting insns for this case. */
2399 frame_pointer_needed
2400 = (! flag_omit_frame_pointer
2401 || (cfun->calls_alloca && EXIT_IGNORE_STACK)
2402 /* We need the frame pointer to catch stack overflow exceptions if
2403 the stack pointer is moving (as for the alloca case just above). */
2404 || (STACK_CHECK_MOVING_SP
2405 && flag_stack_check
2406 && flag_exceptions
2407 && cfun->can_throw_non_call_exceptions)
2408 || crtl->accesses_prior_frames
2409 || (SUPPORTS_STACK_ALIGNMENT && crtl->stack_realign_needed)
2410 || targetm.frame_pointer_required ());
2411
2412 /* The chance that FRAME_POINTER_NEEDED is changed from inspecting
2413 RTL is very small. So if we use frame pointer for RA and RTL
2414 actually prevents this, we will spill pseudos assigned to the
2415 frame pointer in LRA. */
2416
2417 if (frame_pointer_needed)
2418 for (i = 0; i < fp_reg_count; i++)
2419 df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM + i, true);
2420
2421 ira_no_alloc_regs = no_unit_alloc_regs;
2422 CLEAR_HARD_REG_SET (eliminable_regset);
2423
2424 compute_regs_asm_clobbered ();
2425
2426 /* Build the regset of all eliminable registers and show we can't
2427 use those that we already know won't be eliminated. */
2428 for (i = 0; i < (int) ARRAY_SIZE (eliminables); i++)
2429 {
2430 bool cannot_elim
2431 = (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to)
2432 || (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed));
2433
2434 if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from))
2435 {
2436 SET_HARD_REG_BIT (eliminable_regset, eliminables[i].from);
2437
2438 if (cannot_elim)
2439 SET_HARD_REG_BIT (ira_no_alloc_regs, eliminables[i].from);
2440 }
2441 else if (cannot_elim)
2442 error ("%s cannot be used in %<asm%> here",
2443 reg_names[eliminables[i].from]);
2444 else
2445 df_set_regs_ever_live (eliminables[i].from, true);
2446 }
2447 if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
2448 {
2449 for (i = 0; i < fp_reg_count; i++)
2450 if (global_regs[HARD_FRAME_POINTER_REGNUM + i])
2451 /* Nothing to do: the register is already treated as live
2452 where appropriate, and cannot be eliminated. */
2453 ;
2454 else if (!TEST_HARD_REG_BIT (crtl->asm_clobbers,
2455 HARD_FRAME_POINTER_REGNUM + i))
2456 {
2457 SET_HARD_REG_BIT (eliminable_regset,
2458 HARD_FRAME_POINTER_REGNUM + i);
2459 if (frame_pointer_needed)
2460 SET_HARD_REG_BIT (ira_no_alloc_regs,
2461 HARD_FRAME_POINTER_REGNUM + i);
2462 }
2463 else if (frame_pointer_needed)
2464 error ("%s cannot be used in %<asm%> here",
2465 reg_names[HARD_FRAME_POINTER_REGNUM + i]);
2466 else
2467 df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM + i, true);
2468 }
2469 }
2470
2471 \f
2472
2473 /* Vector of substitutions of register numbers,
2474 used to map pseudo regs into hardware regs.
2475 This is set up as a result of register allocation.
2476 Element N is the hard reg assigned to pseudo reg N,
2477 or is -1 if no hard reg was assigned.
2478 If N is a hard reg number, element N is N. */
2479 short *reg_renumber;
2480
2481 /* Set up REG_RENUMBER and CALLER_SAVE_NEEDED (used by reload) from
2482 the allocation found by IRA. */
2483 static void
2484 setup_reg_renumber (void)
2485 {
2486 int regno, hard_regno;
2487 ira_allocno_t a;
2488 ira_allocno_iterator ai;
2489
2490 caller_save_needed = 0;
2491 FOR_EACH_ALLOCNO (a, ai)
2492 {
2493 if (ira_use_lra_p && ALLOCNO_CAP_MEMBER (a) != NULL)
2494 continue;
2495 /* There are no caps at this point. */
2496 ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL);
2497 if (! ALLOCNO_ASSIGNED_P (a))
2498 /* It can happen if A is not referenced but partially anticipated
2499 somewhere in a region. */
2500 ALLOCNO_ASSIGNED_P (a) = true;
2501 ira_free_allocno_updated_costs (a);
2502 hard_regno = ALLOCNO_HARD_REGNO (a);
2503 regno = ALLOCNO_REGNO (a);
2504 reg_renumber[regno] = (hard_regno < 0 ? -1 : hard_regno);
2505 if (hard_regno >= 0)
2506 {
2507 int i, nwords;
2508 enum reg_class pclass;
2509 ira_object_t obj;
2510
2511 pclass = ira_pressure_class_translate[REGNO_REG_CLASS (hard_regno)];
2512 nwords = ALLOCNO_NUM_OBJECTS (a);
2513 for (i = 0; i < nwords; i++)
2514 {
2515 obj = ALLOCNO_OBJECT (a, i);
2516 OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)
2517 |= ~reg_class_contents[pclass];
2518 }
2519 if (ira_need_caller_save_p (a, hard_regno))
2520 {
2521 ira_assert (!optimize || flag_caller_saves
2522 || (ALLOCNO_CALLS_CROSSED_NUM (a)
2523 == ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a))
2524 || regno >= ira_reg_equiv_len
2525 || ira_equiv_no_lvalue_p (regno));
2526 caller_save_needed = 1;
2527 }
2528 }
2529 }
2530 }
2531
2532 /* Set up allocno assignment flags for further allocation
2533 improvements. */
2534 static void
2535 setup_allocno_assignment_flags (void)
2536 {
2537 int hard_regno;
2538 ira_allocno_t a;
2539 ira_allocno_iterator ai;
2540
2541 FOR_EACH_ALLOCNO (a, ai)
2542 {
2543 if (! ALLOCNO_ASSIGNED_P (a))
2544 /* It can happen if A is not referenced but partially anticipated
2545 somewhere in a region. */
2546 ira_free_allocno_updated_costs (a);
2547 hard_regno = ALLOCNO_HARD_REGNO (a);
2548 /* Don't assign hard registers to allocnos which are destination
2549 of removed store at the end of loop. It has no sense to keep
2550 the same value in different hard registers. It is also
2551 impossible to assign hard registers correctly to such
2552 allocnos because the cost info and info about intersected
2553 calls are incorrect for them. */
2554 ALLOCNO_ASSIGNED_P (a) = (hard_regno >= 0
2555 || ALLOCNO_EMIT_DATA (a)->mem_optimized_dest_p
2556 || (ALLOCNO_MEMORY_COST (a)
2557 - ALLOCNO_CLASS_COST (a)) < 0);
2558 ira_assert
2559 (hard_regno < 0
2560 || ira_hard_reg_in_set_p (hard_regno, ALLOCNO_MODE (a),
2561 reg_class_contents[ALLOCNO_CLASS (a)]));
2562 }
2563 }
2564
2565 /* Evaluate overall allocation cost and the costs for using hard
2566 registers and memory for allocnos. */
2567 static void
2568 calculate_allocation_cost (void)
2569 {
2570 int hard_regno, cost;
2571 ira_allocno_t a;
2572 ira_allocno_iterator ai;
2573
2574 ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
2575 FOR_EACH_ALLOCNO (a, ai)
2576 {
2577 hard_regno = ALLOCNO_HARD_REGNO (a);
2578 ira_assert (hard_regno < 0
2579 || (ira_hard_reg_in_set_p
2580 (hard_regno, ALLOCNO_MODE (a),
2581 reg_class_contents[ALLOCNO_CLASS (a)])));
2582 if (hard_regno < 0)
2583 {
2584 cost = ALLOCNO_MEMORY_COST (a);
2585 ira_mem_cost += cost;
2586 }
2587 else if (ALLOCNO_HARD_REG_COSTS (a) != NULL)
2588 {
2589 cost = (ALLOCNO_HARD_REG_COSTS (a)
2590 [ira_class_hard_reg_index
2591 [ALLOCNO_CLASS (a)][hard_regno]]);
2592 ira_reg_cost += cost;
2593 }
2594 else
2595 {
2596 cost = ALLOCNO_CLASS_COST (a);
2597 ira_reg_cost += cost;
2598 }
2599 ira_overall_cost += cost;
2600 }
2601
2602 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
2603 {
2604 fprintf (ira_dump_file,
2605 "+++Costs: overall %" PRId64
2606 ", reg %" PRId64
2607 ", mem %" PRId64
2608 ", ld %" PRId64
2609 ", st %" PRId64
2610 ", move %" PRId64,
2611 ira_overall_cost, ira_reg_cost, ira_mem_cost,
2612 ira_load_cost, ira_store_cost, ira_shuffle_cost);
2613 fprintf (ira_dump_file, "\n+++ move loops %d, new jumps %d\n",
2614 ira_move_loops_num, ira_additional_jumps_num);
2615 }
2616
2617 }
2618
2619 #ifdef ENABLE_IRA_CHECKING
2620 /* Check the correctness of the allocation. We do need this because
2621 of complicated code to transform more one region internal
2622 representation into one region representation. */
2623 static void
2624 check_allocation (void)
2625 {
2626 ira_allocno_t a;
2627 int hard_regno, nregs, conflict_nregs;
2628 ira_allocno_iterator ai;
2629
2630 FOR_EACH_ALLOCNO (a, ai)
2631 {
2632 int n = ALLOCNO_NUM_OBJECTS (a);
2633 int i;
2634
2635 if (ALLOCNO_CAP_MEMBER (a) != NULL
2636 || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0)
2637 continue;
2638 nregs = hard_regno_nregs (hard_regno, ALLOCNO_MODE (a));
2639 if (nregs == 1)
2640 /* We allocated a single hard register. */
2641 n = 1;
2642 else if (n > 1)
2643 /* We allocated multiple hard registers, and we will test
2644 conflicts in a granularity of single hard regs. */
2645 nregs = 1;
2646
2647 for (i = 0; i < n; i++)
2648 {
2649 ira_object_t obj = ALLOCNO_OBJECT (a, i);
2650 ira_object_t conflict_obj;
2651 ira_object_conflict_iterator oci;
2652 int this_regno = hard_regno;
2653 if (n > 1)
2654 {
2655 if (REG_WORDS_BIG_ENDIAN)
2656 this_regno += n - i - 1;
2657 else
2658 this_regno += i;
2659 }
2660 FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
2661 {
2662 ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
2663 int conflict_hard_regno = ALLOCNO_HARD_REGNO (conflict_a);
2664 if (conflict_hard_regno < 0)
2665 continue;
2666 if (ira_soft_conflict (a, conflict_a))
2667 continue;
2668
2669 conflict_nregs = hard_regno_nregs (conflict_hard_regno,
2670 ALLOCNO_MODE (conflict_a));
2671
2672 if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1
2673 && conflict_nregs == ALLOCNO_NUM_OBJECTS (conflict_a))
2674 {
2675 if (REG_WORDS_BIG_ENDIAN)
2676 conflict_hard_regno += (ALLOCNO_NUM_OBJECTS (conflict_a)
2677 - OBJECT_SUBWORD (conflict_obj) - 1);
2678 else
2679 conflict_hard_regno += OBJECT_SUBWORD (conflict_obj);
2680 conflict_nregs = 1;
2681 }
2682
2683 if ((conflict_hard_regno <= this_regno
2684 && this_regno < conflict_hard_regno + conflict_nregs)
2685 || (this_regno <= conflict_hard_regno
2686 && conflict_hard_regno < this_regno + nregs))
2687 {
2688 fprintf (stderr, "bad allocation for %d and %d\n",
2689 ALLOCNO_REGNO (a), ALLOCNO_REGNO (conflict_a));
2690 gcc_unreachable ();
2691 }
2692 }
2693 }
2694 }
2695 }
2696 #endif
2697
2698 /* Allocate REG_EQUIV_INIT. Set up it from IRA_REG_EQUIV which should
2699 be already calculated. */
2700 static void
2701 setup_reg_equiv_init (void)
2702 {
2703 int i;
2704 int max_regno = max_reg_num ();
2705
2706 for (i = 0; i < max_regno; i++)
2707 reg_equiv_init (i) = ira_reg_equiv[i].init_insns;
2708 }
2709
2710 /* Update equiv regno from movement of FROM_REGNO to TO_REGNO. INSNS
2711 are insns which were generated for such movement. It is assumed
2712 that FROM_REGNO and TO_REGNO always have the same value at the
2713 point of any move containing such registers. This function is used
2714 to update equiv info for register shuffles on the region borders
2715 and for caller save/restore insns. */
2716 void
2717 ira_update_equiv_info_by_shuffle_insn (int to_regno, int from_regno, rtx_insn *insns)
2718 {
2719 rtx_insn *insn;
2720 rtx x, note;
2721
2722 if (! ira_reg_equiv[from_regno].defined_p
2723 && (! ira_reg_equiv[to_regno].defined_p
2724 || ((x = ira_reg_equiv[to_regno].memory) != NULL_RTX
2725 && ! MEM_READONLY_P (x))))
2726 return;
2727 insn = insns;
2728 if (NEXT_INSN (insn) != NULL_RTX)
2729 {
2730 if (! ira_reg_equiv[to_regno].defined_p)
2731 {
2732 ira_assert (ira_reg_equiv[to_regno].init_insns == NULL_RTX);
2733 return;
2734 }
2735 ira_reg_equiv[to_regno].defined_p = false;
2736 ira_reg_equiv[to_regno].memory
2737 = ira_reg_equiv[to_regno].constant
2738 = ira_reg_equiv[to_regno].invariant
2739 = ira_reg_equiv[to_regno].init_insns = NULL;
2740 if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2741 fprintf (ira_dump_file,
2742 " Invalidating equiv info for reg %d\n", to_regno);
2743 return;
2744 }
2745 /* It is possible that FROM_REGNO still has no equivalence because
2746 in shuffles to_regno<-from_regno and from_regno<-to_regno the 2nd
2747 insn was not processed yet. */
2748 if (ira_reg_equiv[from_regno].defined_p)
2749 {
2750 ira_reg_equiv[to_regno].defined_p = true;
2751 if ((x = ira_reg_equiv[from_regno].memory) != NULL_RTX)
2752 {
2753 ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX
2754 && ira_reg_equiv[from_regno].constant == NULL_RTX);
2755 ira_assert (ira_reg_equiv[to_regno].memory == NULL_RTX
2756 || rtx_equal_p (ira_reg_equiv[to_regno].memory, x));
2757 ira_reg_equiv[to_regno].memory = x;
2758 if (! MEM_READONLY_P (x))
2759 /* We don't add the insn to insn init list because memory
2760 equivalence is just to say what memory is better to use
2761 when the pseudo is spilled. */
2762 return;
2763 }
2764 else if ((x = ira_reg_equiv[from_regno].constant) != NULL_RTX)
2765 {
2766 ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX);
2767 ira_assert (ira_reg_equiv[to_regno].constant == NULL_RTX
2768 || rtx_equal_p (ira_reg_equiv[to_regno].constant, x));
2769 ira_reg_equiv[to_regno].constant = x;
2770 }
2771 else
2772 {
2773 x = ira_reg_equiv[from_regno].invariant;
2774 ira_assert (x != NULL_RTX);
2775 ira_assert (ira_reg_equiv[to_regno].invariant == NULL_RTX
2776 || rtx_equal_p (ira_reg_equiv[to_regno].invariant, x));
2777 ira_reg_equiv[to_regno].invariant = x;
2778 }
2779 if (find_reg_note (insn, REG_EQUIV, x) == NULL_RTX)
2780 {
2781 note = set_unique_reg_note (insn, REG_EQUIV, copy_rtx (x));
2782 gcc_assert (note != NULL_RTX);
2783 if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2784 {
2785 fprintf (ira_dump_file,
2786 " Adding equiv note to insn %u for reg %d ",
2787 INSN_UID (insn), to_regno);
2788 dump_value_slim (ira_dump_file, x, 1);
2789 fprintf (ira_dump_file, "\n");
2790 }
2791 }
2792 }
2793 ira_reg_equiv[to_regno].init_insns
2794 = gen_rtx_INSN_LIST (VOIDmode, insn,
2795 ira_reg_equiv[to_regno].init_insns);
2796 if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2797 fprintf (ira_dump_file,
2798 " Adding equiv init move insn %u to reg %d\n",
2799 INSN_UID (insn), to_regno);
2800 }
2801
2802 /* Fix values of array REG_EQUIV_INIT after live range splitting done
2803 by IRA. */
2804 static void
2805 fix_reg_equiv_init (void)
2806 {
2807 int max_regno = max_reg_num ();
2808 int i, new_regno, max;
2809 rtx set;
2810 rtx_insn_list *x, *next, *prev;
2811 rtx_insn *insn;
2812
2813 if (max_regno_before_ira < max_regno)
2814 {
2815 max = vec_safe_length (reg_equivs);
2816 grow_reg_equivs ();
2817 for (i = FIRST_PSEUDO_REGISTER; i < max; i++)
2818 for (prev = NULL, x = reg_equiv_init (i);
2819 x != NULL_RTX;
2820 x = next)
2821 {
2822 next = x->next ();
2823 insn = x->insn ();
2824 set = single_set (insn);
2825 ira_assert (set != NULL_RTX
2826 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))));
2827 if (REG_P (SET_DEST (set))
2828 && ((int) REGNO (SET_DEST (set)) == i
2829 || (int) ORIGINAL_REGNO (SET_DEST (set)) == i))
2830 new_regno = REGNO (SET_DEST (set));
2831 else if (REG_P (SET_SRC (set))
2832 && ((int) REGNO (SET_SRC (set)) == i
2833 || (int) ORIGINAL_REGNO (SET_SRC (set)) == i))
2834 new_regno = REGNO (SET_SRC (set));
2835 else
2836 gcc_unreachable ();
2837 if (new_regno == i)
2838 prev = x;
2839 else
2840 {
2841 /* Remove the wrong list element. */
2842 if (prev == NULL_RTX)
2843 reg_equiv_init (i) = next;
2844 else
2845 XEXP (prev, 1) = next;
2846 XEXP (x, 1) = reg_equiv_init (new_regno);
2847 reg_equiv_init (new_regno) = x;
2848 }
2849 }
2850 }
2851 }
2852
2853 #ifdef ENABLE_IRA_CHECKING
2854 /* Print redundant memory-memory copies. */
2855 static void
2856 print_redundant_copies (void)
2857 {
2858 int hard_regno;
2859 ira_allocno_t a;
2860 ira_copy_t cp, next_cp;
2861 ira_allocno_iterator ai;
2862
2863 FOR_EACH_ALLOCNO (a, ai)
2864 {
2865 if (ALLOCNO_CAP_MEMBER (a) != NULL)
2866 /* It is a cap. */
2867 continue;
2868 hard_regno = ALLOCNO_HARD_REGNO (a);
2869 if (hard_regno >= 0)
2870 continue;
2871 for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp)
2872 if (cp->first == a)
2873 next_cp = cp->next_first_allocno_copy;
2874 else
2875 {
2876 next_cp = cp->next_second_allocno_copy;
2877 if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL
2878 && cp->insn != NULL_RTX
2879 && ALLOCNO_HARD_REGNO (cp->first) == hard_regno)
2880 fprintf (ira_dump_file,
2881 " Redundant move from %d(freq %d):%d\n",
2882 INSN_UID (cp->insn), cp->freq, hard_regno);
2883 }
2884 }
2885 }
2886 #endif
2887
2888 /* Setup preferred and alternative classes for new pseudo-registers
2889 created by IRA starting with START. */
2890 static void
2891 setup_preferred_alternate_classes_for_new_pseudos (int start)
2892 {
2893 int i, old_regno;
2894 int max_regno = max_reg_num ();
2895
2896 for (i = start; i < max_regno; i++)
2897 {
2898 old_regno = ORIGINAL_REGNO (regno_reg_rtx[i]);
2899 ira_assert (i != old_regno);
2900 setup_reg_classes (i, reg_preferred_class (old_regno),
2901 reg_alternate_class (old_regno),
2902 reg_allocno_class (old_regno));
2903 if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
2904 fprintf (ira_dump_file,
2905 " New r%d: setting preferred %s, alternative %s\n",
2906 i, reg_class_names[reg_preferred_class (old_regno)],
2907 reg_class_names[reg_alternate_class (old_regno)]);
2908 }
2909 }
2910
2911 \f
2912 /* The number of entries allocated in reg_info. */
2913 static int allocated_reg_info_size;
2914
2915 /* Regional allocation can create new pseudo-registers. This function
2916 expands some arrays for pseudo-registers. */
2917 static void
2918 expand_reg_info (void)
2919 {
2920 int i;
2921 int size = max_reg_num ();
2922
2923 resize_reg_info ();
2924 for (i = allocated_reg_info_size; i < size; i++)
2925 setup_reg_classes (i, GENERAL_REGS, ALL_REGS, GENERAL_REGS);
2926 setup_preferred_alternate_classes_for_new_pseudos (allocated_reg_info_size);
2927 allocated_reg_info_size = size;
2928 }
2929
2930 /* Return TRUE if there is too high register pressure in the function.
2931 It is used to decide when stack slot sharing is worth to do. */
2932 static bool
2933 too_high_register_pressure_p (void)
2934 {
2935 int i;
2936 enum reg_class pclass;
2937
2938 for (i = 0; i < ira_pressure_classes_num; i++)
2939 {
2940 pclass = ira_pressure_classes[i];
2941 if (ira_loop_tree_root->reg_pressure[pclass] > 10000)
2942 return true;
2943 }
2944 return false;
2945 }
2946
2947 \f
2948
2949 /* Indicate that hard register number FROM was eliminated and replaced with
2950 an offset from hard register number TO. The status of hard registers live
2951 at the start of a basic block is updated by replacing a use of FROM with
2952 a use of TO. */
2953
2954 void
2955 mark_elimination (int from, int to)
2956 {
2957 basic_block bb;
2958 bitmap r;
2959
2960 FOR_EACH_BB_FN (bb, cfun)
2961 {
2962 r = DF_LR_IN (bb);
2963 if (bitmap_bit_p (r, from))
2964 {
2965 bitmap_clear_bit (r, from);
2966 bitmap_set_bit (r, to);
2967 }
2968 if (! df_live)
2969 continue;
2970 r = DF_LIVE_IN (bb);
2971 if (bitmap_bit_p (r, from))
2972 {
2973 bitmap_clear_bit (r, from);
2974 bitmap_set_bit (r, to);
2975 }
2976 }
2977 }
2978
2979 \f
2980
2981 /* The length of the following array. */
2982 int ira_reg_equiv_len;
2983
2984 /* Info about equiv. info for each register. */
2985 struct ira_reg_equiv_s *ira_reg_equiv;
2986
2987 /* Expand ira_reg_equiv if necessary. */
2988 void
2989 ira_expand_reg_equiv (void)
2990 {
2991 int old = ira_reg_equiv_len;
2992
2993 if (ira_reg_equiv_len > max_reg_num ())
2994 return;
2995 ira_reg_equiv_len = max_reg_num () * 3 / 2 + 1;
2996 ira_reg_equiv
2997 = (struct ira_reg_equiv_s *) xrealloc (ira_reg_equiv,
2998 ira_reg_equiv_len
2999 * sizeof (struct ira_reg_equiv_s));
3000 gcc_assert (old < ira_reg_equiv_len);
3001 memset (ira_reg_equiv + old, 0,
3002 sizeof (struct ira_reg_equiv_s) * (ira_reg_equiv_len - old));
3003 }
3004
3005 static void
3006 init_reg_equiv (void)
3007 {
3008 ira_reg_equiv_len = 0;
3009 ira_reg_equiv = NULL;
3010 ira_expand_reg_equiv ();
3011 }
3012
3013 static void
3014 finish_reg_equiv (void)
3015 {
3016 free (ira_reg_equiv);
3017 }
3018
3019 \f
3020
3021 struct equivalence
3022 {
3023 /* Set when a REG_EQUIV note is found or created. Use to
3024 keep track of what memory accesses might be created later,
3025 e.g. by reload. */
3026 rtx replacement;
3027 rtx *src_p;
3028
3029 /* The list of each instruction which initializes this register.
3030
3031 NULL indicates we know nothing about this register's equivalence
3032 properties.
3033
3034 An INSN_LIST with a NULL insn indicates this pseudo is already
3035 known to not have a valid equivalence. */
3036 rtx_insn_list *init_insns;
3037
3038 /* Loop depth is used to recognize equivalences which appear
3039 to be present within the same loop (or in an inner loop). */
3040 short loop_depth;
3041 /* Nonzero if this had a preexisting REG_EQUIV note. */
3042 unsigned char is_arg_equivalence : 1;
3043 /* Set when an attempt should be made to replace a register
3044 with the associated src_p entry. */
3045 unsigned char replace : 1;
3046 /* Set if this register has no known equivalence. */
3047 unsigned char no_equiv : 1;
3048 /* Set if this register is mentioned in a paradoxical subreg. */
3049 unsigned char pdx_subregs : 1;
3050 };
3051
3052 /* reg_equiv[N] (where N is a pseudo reg number) is the equivalence
3053 structure for that register. */
3054 static struct equivalence *reg_equiv;
3055
3056 /* Used for communication between the following two functions. */
3057 struct equiv_mem_data
3058 {
3059 /* A MEM that we wish to ensure remains unchanged. */
3060 rtx equiv_mem;
3061
3062 /* Set true if EQUIV_MEM is modified. */
3063 bool equiv_mem_modified;
3064 };
3065
3066 /* If EQUIV_MEM is modified by modifying DEST, indicate that it is modified.
3067 Called via note_stores. */
3068 static void
3069 validate_equiv_mem_from_store (rtx dest, const_rtx set ATTRIBUTE_UNUSED,
3070 void *data)
3071 {
3072 struct equiv_mem_data *info = (struct equiv_mem_data *) data;
3073
3074 if ((REG_P (dest)
3075 && reg_overlap_mentioned_p (dest, info->equiv_mem))
3076 || (MEM_P (dest)
3077 && anti_dependence (info->equiv_mem, dest)))
3078 info->equiv_mem_modified = true;
3079 }
3080
3081 enum valid_equiv { valid_none, valid_combine, valid_reload };
3082
3083 /* Verify that no store between START and the death of REG invalidates
3084 MEMREF. MEMREF is invalidated by modifying a register used in MEMREF,
3085 by storing into an overlapping memory location, or with a non-const
3086 CALL_INSN.
3087
3088 Return VALID_RELOAD if MEMREF remains valid for both reload and
3089 combine_and_move insns, VALID_COMBINE if only valid for
3090 combine_and_move_insns, and VALID_NONE otherwise. */
3091 static enum valid_equiv
3092 validate_equiv_mem (rtx_insn *start, rtx reg, rtx memref)
3093 {
3094 rtx_insn *insn;
3095 rtx note;
3096 struct equiv_mem_data info = { memref, false };
3097 enum valid_equiv ret = valid_reload;
3098
3099 /* If the memory reference has side effects or is volatile, it isn't a
3100 valid equivalence. */
3101 if (side_effects_p (memref))
3102 return valid_none;
3103
3104 for (insn = start; insn; insn = NEXT_INSN (insn))
3105 {
3106 if (!INSN_P (insn))
3107 continue;
3108
3109 if (find_reg_note (insn, REG_DEAD, reg))
3110 return ret;
3111
3112 if (CALL_P (insn))
3113 {
3114 /* We can combine a reg def from one insn into a reg use in
3115 another over a call if the memory is readonly or the call
3116 const/pure. However, we can't set reg_equiv notes up for
3117 reload over any call. The problem is the equivalent form
3118 may reference a pseudo which gets assigned a call
3119 clobbered hard reg. When we later replace REG with its
3120 equivalent form, the value in the call-clobbered reg has
3121 been changed and all hell breaks loose. */
3122 ret = valid_combine;
3123 if (!MEM_READONLY_P (memref)
3124 && !RTL_CONST_OR_PURE_CALL_P (insn))
3125 return valid_none;
3126 }
3127
3128 note_stores (insn, validate_equiv_mem_from_store, &info);
3129 if (info.equiv_mem_modified)
3130 return valid_none;
3131
3132 /* If a register mentioned in MEMREF is modified via an
3133 auto-increment, we lose the equivalence. Do the same if one
3134 dies; although we could extend the life, it doesn't seem worth
3135 the trouble. */
3136
3137 for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
3138 if ((REG_NOTE_KIND (note) == REG_INC
3139 || REG_NOTE_KIND (note) == REG_DEAD)
3140 && REG_P (XEXP (note, 0))
3141 && reg_overlap_mentioned_p (XEXP (note, 0), memref))
3142 return valid_none;
3143 }
3144
3145 return valid_none;
3146 }
3147
3148 /* Returns zero if X is known to be invariant. */
3149 static int
3150 equiv_init_varies_p (rtx x)
3151 {
3152 RTX_CODE code = GET_CODE (x);
3153 int i;
3154 const char *fmt;
3155
3156 switch (code)
3157 {
3158 case MEM:
3159 return !MEM_READONLY_P (x) || equiv_init_varies_p (XEXP (x, 0));
3160
3161 case CONST:
3162 CASE_CONST_ANY:
3163 case SYMBOL_REF:
3164 case LABEL_REF:
3165 return 0;
3166
3167 case REG:
3168 return reg_equiv[REGNO (x)].replace == 0 && rtx_varies_p (x, 0);
3169
3170 case ASM_OPERANDS:
3171 if (MEM_VOLATILE_P (x))
3172 return 1;
3173
3174 /* Fall through. */
3175
3176 default:
3177 break;
3178 }
3179
3180 fmt = GET_RTX_FORMAT (code);
3181 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3182 if (fmt[i] == 'e')
3183 {
3184 if (equiv_init_varies_p (XEXP (x, i)))
3185 return 1;
3186 }
3187 else if (fmt[i] == 'E')
3188 {
3189 int j;
3190 for (j = 0; j < XVECLEN (x, i); j++)
3191 if (equiv_init_varies_p (XVECEXP (x, i, j)))
3192 return 1;
3193 }
3194
3195 return 0;
3196 }
3197
3198 /* Returns nonzero if X (used to initialize register REGNO) is movable.
3199 X is only movable if the registers it uses have equivalent initializations
3200 which appear to be within the same loop (or in an inner loop) and movable
3201 or if they are not candidates for local_alloc and don't vary. */
3202 static int
3203 equiv_init_movable_p (rtx x, int regno)
3204 {
3205 int i, j;
3206 const char *fmt;
3207 enum rtx_code code = GET_CODE (x);
3208
3209 switch (code)
3210 {
3211 case SET:
3212 return equiv_init_movable_p (SET_SRC (x), regno);
3213
3214 case CLOBBER:
3215 return 0;
3216
3217 case PRE_INC:
3218 case PRE_DEC:
3219 case POST_INC:
3220 case POST_DEC:
3221 case PRE_MODIFY:
3222 case POST_MODIFY:
3223 return 0;
3224
3225 case REG:
3226 return ((reg_equiv[REGNO (x)].loop_depth >= reg_equiv[regno].loop_depth
3227 && reg_equiv[REGNO (x)].replace)
3228 || (REG_BASIC_BLOCK (REGNO (x)) < NUM_FIXED_BLOCKS
3229 && ! rtx_varies_p (x, 0)));
3230
3231 case UNSPEC_VOLATILE:
3232 return 0;
3233
3234 case ASM_OPERANDS:
3235 if (MEM_VOLATILE_P (x))
3236 return 0;
3237
3238 /* Fall through. */
3239
3240 default:
3241 break;
3242 }
3243
3244 fmt = GET_RTX_FORMAT (code);
3245 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3246 switch (fmt[i])
3247 {
3248 case 'e':
3249 if (! equiv_init_movable_p (XEXP (x, i), regno))
3250 return 0;
3251 break;
3252 case 'E':
3253 for (j = XVECLEN (x, i) - 1; j >= 0; j--)
3254 if (! equiv_init_movable_p (XVECEXP (x, i, j), regno))
3255 return 0;
3256 break;
3257 }
3258
3259 return 1;
3260 }
3261
3262 static bool memref_referenced_p (rtx memref, rtx x, bool read_p);
3263
3264 /* Auxiliary function for memref_referenced_p. Process setting X for
3265 MEMREF store. */
3266 static bool
3267 process_set_for_memref_referenced_p (rtx memref, rtx x)
3268 {
3269 /* If we are setting a MEM, it doesn't count (its address does), but any
3270 other SET_DEST that has a MEM in it is referencing the MEM. */
3271 if (MEM_P (x))
3272 {
3273 if (memref_referenced_p (memref, XEXP (x, 0), true))
3274 return true;
3275 }
3276 else if (memref_referenced_p (memref, x, false))
3277 return true;
3278
3279 return false;
3280 }
3281
3282 /* TRUE if X references a memory location (as a read if READ_P) that
3283 would be affected by a store to MEMREF. */
3284 static bool
3285 memref_referenced_p (rtx memref, rtx x, bool read_p)
3286 {
3287 int i, j;
3288 const char *fmt;
3289 enum rtx_code code = GET_CODE (x);
3290
3291 switch (code)
3292 {
3293 case CONST:
3294 case LABEL_REF:
3295 case SYMBOL_REF:
3296 CASE_CONST_ANY:
3297 case PC:
3298 case HIGH:
3299 case LO_SUM:
3300 return false;
3301
3302 case REG:
3303 return (reg_equiv[REGNO (x)].replacement
3304 && memref_referenced_p (memref,
3305 reg_equiv[REGNO (x)].replacement, read_p));
3306
3307 case MEM:
3308 /* Memory X might have another effective type than MEMREF. */
3309 if (read_p || true_dependence (memref, VOIDmode, x))
3310 return true;
3311 break;
3312
3313 case SET:
3314 if (process_set_for_memref_referenced_p (memref, SET_DEST (x)))
3315 return true;
3316
3317 return memref_referenced_p (memref, SET_SRC (x), true);
3318
3319 case CLOBBER:
3320 if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3321 return true;
3322
3323 return false;
3324
3325 case PRE_DEC:
3326 case POST_DEC:
3327 case PRE_INC:
3328 case POST_INC:
3329 if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3330 return true;
3331
3332 return memref_referenced_p (memref, XEXP (x, 0), true);
3333
3334 case POST_MODIFY:
3335 case PRE_MODIFY:
3336 /* op0 = op0 + op1 */
3337 if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3338 return true;
3339
3340 if (memref_referenced_p (memref, XEXP (x, 0), true))
3341 return true;
3342
3343 return memref_referenced_p (memref, XEXP (x, 1), true);
3344
3345 default:
3346 break;
3347 }
3348
3349 fmt = GET_RTX_FORMAT (code);
3350 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3351 switch (fmt[i])
3352 {
3353 case 'e':
3354 if (memref_referenced_p (memref, XEXP (x, i), read_p))
3355 return true;
3356 break;
3357 case 'E':
3358 for (j = XVECLEN (x, i) - 1; j >= 0; j--)
3359 if (memref_referenced_p (memref, XVECEXP (x, i, j), read_p))
3360 return true;
3361 break;
3362 }
3363
3364 return false;
3365 }
3366
3367 /* TRUE if some insn in the range (START, END] references a memory location
3368 that would be affected by a store to MEMREF.
3369
3370 Callers should not call this routine if START is after END in the
3371 RTL chain. */
3372
3373 static int
3374 memref_used_between_p (rtx memref, rtx_insn *start, rtx_insn *end)
3375 {
3376 rtx_insn *insn;
3377
3378 for (insn = NEXT_INSN (start);
3379 insn && insn != NEXT_INSN (end);
3380 insn = NEXT_INSN (insn))
3381 {
3382 if (!NONDEBUG_INSN_P (insn))
3383 continue;
3384
3385 if (memref_referenced_p (memref, PATTERN (insn), false))
3386 return 1;
3387
3388 /* Nonconst functions may access memory. */
3389 if (CALL_P (insn) && (! RTL_CONST_CALL_P (insn)))
3390 return 1;
3391 }
3392
3393 gcc_assert (insn == NEXT_INSN (end));
3394 return 0;
3395 }
3396
3397 /* Mark REG as having no known equivalence.
3398 Some instructions might have been processed before and furnished
3399 with REG_EQUIV notes for this register; these notes will have to be
3400 removed.
3401 STORE is the piece of RTL that does the non-constant / conflicting
3402 assignment - a SET, CLOBBER or REG_INC note. It is currently not used,
3403 but needs to be there because this function is called from note_stores. */
3404 static void
3405 no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED,
3406 void *data ATTRIBUTE_UNUSED)
3407 {
3408 int regno;
3409 rtx_insn_list *list;
3410
3411 if (!REG_P (reg))
3412 return;
3413 regno = REGNO (reg);
3414 reg_equiv[regno].no_equiv = 1;
3415 list = reg_equiv[regno].init_insns;
3416 if (list && list->insn () == NULL)
3417 return;
3418 reg_equiv[regno].init_insns = gen_rtx_INSN_LIST (VOIDmode, NULL_RTX, NULL);
3419 reg_equiv[regno].replacement = NULL_RTX;
3420 /* This doesn't matter for equivalences made for argument registers, we
3421 should keep their initialization insns. */
3422 if (reg_equiv[regno].is_arg_equivalence)
3423 return;
3424 ira_reg_equiv[regno].defined_p = false;
3425 ira_reg_equiv[regno].init_insns = NULL;
3426 for (; list; list = list->next ())
3427 {
3428 rtx_insn *insn = list->insn ();
3429 remove_note (insn, find_reg_note (insn, REG_EQUIV, NULL_RTX));
3430 }
3431 }
3432
3433 /* Check whether the SUBREG is a paradoxical subreg and set the result
3434 in PDX_SUBREGS. */
3435
3436 static void
3437 set_paradoxical_subreg (rtx_insn *insn)
3438 {
3439 subrtx_iterator::array_type array;
3440 FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
3441 {
3442 const_rtx subreg = *iter;
3443 if (GET_CODE (subreg) == SUBREG)
3444 {
3445 const_rtx reg = SUBREG_REG (subreg);
3446 if (REG_P (reg) && paradoxical_subreg_p (subreg))
3447 reg_equiv[REGNO (reg)].pdx_subregs = true;
3448 }
3449 }
3450 }
3451
3452 /* In DEBUG_INSN location adjust REGs from CLEARED_REGS bitmap to the
3453 equivalent replacement. */
3454
3455 static rtx
3456 adjust_cleared_regs (rtx loc, const_rtx old_rtx ATTRIBUTE_UNUSED, void *data)
3457 {
3458 if (REG_P (loc))
3459 {
3460 bitmap cleared_regs = (bitmap) data;
3461 if (bitmap_bit_p (cleared_regs, REGNO (loc)))
3462 return simplify_replace_fn_rtx (copy_rtx (*reg_equiv[REGNO (loc)].src_p),
3463 NULL_RTX, adjust_cleared_regs, data);
3464 }
3465 return NULL_RTX;
3466 }
3467
3468 /* Given register REGNO is set only once, return true if the defining
3469 insn dominates all uses. */
3470
3471 static bool
3472 def_dominates_uses (int regno)
3473 {
3474 df_ref def = DF_REG_DEF_CHAIN (regno);
3475
3476 struct df_insn_info *def_info = DF_REF_INSN_INFO (def);
3477 /* If this is an artificial def (eh handler regs, hard frame pointer
3478 for non-local goto, regs defined on function entry) then def_info
3479 is NULL and the reg is always live before any use. We might
3480 reasonably return true in that case, but since the only call
3481 of this function is currently here in ira.cc when we are looking
3482 at a defining insn we can't have an artificial def as that would
3483 bump DF_REG_DEF_COUNT. */
3484 gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && def_info != NULL);
3485
3486 rtx_insn *def_insn = DF_REF_INSN (def);
3487 basic_block def_bb = BLOCK_FOR_INSN (def_insn);
3488
3489 for (df_ref use = DF_REG_USE_CHAIN (regno);
3490 use;
3491 use = DF_REF_NEXT_REG (use))
3492 {
3493 struct df_insn_info *use_info = DF_REF_INSN_INFO (use);
3494 /* Only check real uses, not artificial ones. */
3495 if (use_info)
3496 {
3497 rtx_insn *use_insn = DF_REF_INSN (use);
3498 if (!DEBUG_INSN_P (use_insn))
3499 {
3500 basic_block use_bb = BLOCK_FOR_INSN (use_insn);
3501 if (use_bb != def_bb
3502 ? !dominated_by_p (CDI_DOMINATORS, use_bb, def_bb)
3503 : DF_INSN_INFO_LUID (use_info) < DF_INSN_INFO_LUID (def_info))
3504 return false;
3505 }
3506 }
3507 }
3508 return true;
3509 }
3510
3511 /* Scan the instructions before update_equiv_regs. Record which registers
3512 are referenced as paradoxical subregs. Also check for cases in which
3513 the current function needs to save a register that one of its call
3514 instructions clobbers.
3515
3516 These things are logically unrelated, but it's more efficient to do
3517 them together. */
3518
3519 static void
3520 update_equiv_regs_prescan (void)
3521 {
3522 basic_block bb;
3523 rtx_insn *insn;
3524 function_abi_aggregator callee_abis;
3525
3526 FOR_EACH_BB_FN (bb, cfun)
3527 FOR_BB_INSNS (bb, insn)
3528 if (NONDEBUG_INSN_P (insn))
3529 {
3530 set_paradoxical_subreg (insn);
3531 if (CALL_P (insn))
3532 callee_abis.note_callee_abi (insn_callee_abi (insn));
3533 }
3534
3535 HARD_REG_SET extra_caller_saves = callee_abis.caller_save_regs (*crtl->abi);
3536 if (!hard_reg_set_empty_p (extra_caller_saves))
3537 for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
3538 if (TEST_HARD_REG_BIT (extra_caller_saves, regno))
3539 df_set_regs_ever_live (regno, true);
3540 }
3541
3542 /* Find registers that are equivalent to a single value throughout the
3543 compilation (either because they can be referenced in memory or are
3544 set once from a single constant). Lower their priority for a
3545 register.
3546
3547 If such a register is only referenced once, try substituting its
3548 value into the using insn. If it succeeds, we can eliminate the
3549 register completely.
3550
3551 Initialize init_insns in ira_reg_equiv array. */
3552 static void
3553 update_equiv_regs (void)
3554 {
3555 rtx_insn *insn;
3556 basic_block bb;
3557
3558 /* Scan the insns and find which registers have equivalences. Do this
3559 in a separate scan of the insns because (due to -fcse-follow-jumps)
3560 a register can be set below its use. */
3561 bitmap setjmp_crosses = regstat_get_setjmp_crosses ();
3562 FOR_EACH_BB_FN (bb, cfun)
3563 {
3564 int loop_depth = bb_loop_depth (bb);
3565
3566 for (insn = BB_HEAD (bb);
3567 insn != NEXT_INSN (BB_END (bb));
3568 insn = NEXT_INSN (insn))
3569 {
3570 rtx note;
3571 rtx set;
3572 rtx dest, src;
3573 int regno;
3574
3575 if (! INSN_P (insn))
3576 continue;
3577
3578 for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
3579 if (REG_NOTE_KIND (note) == REG_INC)
3580 no_equiv (XEXP (note, 0), note, NULL);
3581
3582 set = single_set (insn);
3583
3584 /* If this insn contains more (or less) than a single SET,
3585 only mark all destinations as having no known equivalence. */
3586 if (set == NULL_RTX
3587 || side_effects_p (SET_SRC (set)))
3588 {
3589 note_pattern_stores (PATTERN (insn), no_equiv, NULL);
3590 continue;
3591 }
3592 else if (GET_CODE (PATTERN (insn)) == PARALLEL)
3593 {
3594 int i;
3595
3596 for (i = XVECLEN (PATTERN (insn), 0) - 1; i >= 0; i--)
3597 {
3598 rtx part = XVECEXP (PATTERN (insn), 0, i);
3599 if (part != set)
3600 note_pattern_stores (part, no_equiv, NULL);
3601 }
3602 }
3603
3604 dest = SET_DEST (set);
3605 src = SET_SRC (set);
3606
3607 /* See if this is setting up the equivalence between an argument
3608 register and its stack slot. */
3609 note = find_reg_note (insn, REG_EQUIV, NULL_RTX);
3610 if (note)
3611 {
3612 gcc_assert (REG_P (dest));
3613 regno = REGNO (dest);
3614
3615 /* Note that we don't want to clear init_insns in
3616 ira_reg_equiv even if there are multiple sets of this
3617 register. */
3618 reg_equiv[regno].is_arg_equivalence = 1;
3619
3620 /* The insn result can have equivalence memory although
3621 the equivalence is not set up by the insn. We add
3622 this insn to init insns as it is a flag for now that
3623 regno has an equivalence. We will remove the insn
3624 from init insn list later. */
3625 if (rtx_equal_p (src, XEXP (note, 0)) || MEM_P (XEXP (note, 0)))
3626 ira_reg_equiv[regno].init_insns
3627 = gen_rtx_INSN_LIST (VOIDmode, insn,
3628 ira_reg_equiv[regno].init_insns);
3629
3630 /* Continue normally in case this is a candidate for
3631 replacements. */
3632 }
3633
3634 if (!optimize)
3635 continue;
3636
3637 /* We only handle the case of a pseudo register being set
3638 once, or always to the same value. */
3639 /* ??? The mn10200 port breaks if we add equivalences for
3640 values that need an ADDRESS_REGS register and set them equivalent
3641 to a MEM of a pseudo. The actual problem is in the over-conservative
3642 handling of INPADDR_ADDRESS / INPUT_ADDRESS / INPUT triples in
3643 calculate_needs, but we traditionally work around this problem
3644 here by rejecting equivalences when the destination is in a register
3645 that's likely spilled. This is fragile, of course, since the
3646 preferred class of a pseudo depends on all instructions that set
3647 or use it. */
3648
3649 if (!REG_P (dest)
3650 || (regno = REGNO (dest)) < FIRST_PSEUDO_REGISTER
3651 || (reg_equiv[regno].init_insns
3652 && reg_equiv[regno].init_insns->insn () == NULL)
3653 || (targetm.class_likely_spilled_p (reg_preferred_class (regno))
3654 && MEM_P (src) && ! reg_equiv[regno].is_arg_equivalence))
3655 {
3656 /* This might be setting a SUBREG of a pseudo, a pseudo that is
3657 also set somewhere else to a constant. */
3658 note_pattern_stores (set, no_equiv, NULL);
3659 continue;
3660 }
3661
3662 /* Don't set reg mentioned in a paradoxical subreg
3663 equivalent to a mem. */
3664 if (MEM_P (src) && reg_equiv[regno].pdx_subregs)
3665 {
3666 note_pattern_stores (set, no_equiv, NULL);
3667 continue;
3668 }
3669
3670 note = find_reg_note (insn, REG_EQUAL, NULL_RTX);
3671
3672 /* cse sometimes generates function invariants, but doesn't put a
3673 REG_EQUAL note on the insn. Since this note would be redundant,
3674 there's no point creating it earlier than here. */
3675 if (! note && ! rtx_varies_p (src, 0))
3676 note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
3677
3678 /* Don't bother considering a REG_EQUAL note containing an EXPR_LIST
3679 since it represents a function call. */
3680 if (note && GET_CODE (XEXP (note, 0)) == EXPR_LIST)
3681 note = NULL_RTX;
3682
3683 if (DF_REG_DEF_COUNT (regno) != 1)
3684 {
3685 bool equal_p = true;
3686 rtx_insn_list *list;
3687
3688 /* If we have already processed this pseudo and determined it
3689 cannot have an equivalence, then honor that decision. */
3690 if (reg_equiv[regno].no_equiv)
3691 continue;
3692
3693 if (! note
3694 || rtx_varies_p (XEXP (note, 0), 0)
3695 || (reg_equiv[regno].replacement
3696 && ! rtx_equal_p (XEXP (note, 0),
3697 reg_equiv[regno].replacement)))
3698 {
3699 no_equiv (dest, set, NULL);
3700 continue;
3701 }
3702
3703 list = reg_equiv[regno].init_insns;
3704 for (; list; list = list->next ())
3705 {
3706 rtx note_tmp;
3707 rtx_insn *insn_tmp;
3708
3709 insn_tmp = list->insn ();
3710 note_tmp = find_reg_note (insn_tmp, REG_EQUAL, NULL_RTX);
3711 gcc_assert (note_tmp);
3712 if (! rtx_equal_p (XEXP (note, 0), XEXP (note_tmp, 0)))
3713 {
3714 equal_p = false;
3715 break;
3716 }
3717 }
3718
3719 if (! equal_p)
3720 {
3721 no_equiv (dest, set, NULL);
3722 continue;
3723 }
3724 }
3725
3726 /* Record this insn as initializing this register. */
3727 reg_equiv[regno].init_insns
3728 = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv[regno].init_insns);
3729
3730 /* If this register is known to be equal to a constant, record that
3731 it is always equivalent to the constant.
3732 Note that it is possible to have a register use before
3733 the def in loops (see gcc.c-torture/execute/pr79286.c)
3734 where the reg is undefined on first use. If the def insn
3735 won't trap we can use it as an equivalence, effectively
3736 choosing the "undefined" value for the reg to be the
3737 same as the value set by the def. */
3738 if (DF_REG_DEF_COUNT (regno) == 1
3739 && note
3740 && !rtx_varies_p (XEXP (note, 0), 0)
3741 && (!may_trap_or_fault_p (XEXP (note, 0))
3742 || def_dominates_uses (regno)))
3743 {
3744 rtx note_value = XEXP (note, 0);
3745 remove_note (insn, note);
3746 set_unique_reg_note (insn, REG_EQUIV, note_value);
3747 }
3748
3749 /* If this insn introduces a "constant" register, decrease the priority
3750 of that register. Record this insn if the register is only used once
3751 more and the equivalence value is the same as our source.
3752
3753 The latter condition is checked for two reasons: First, it is an
3754 indication that it may be more efficient to actually emit the insn
3755 as written (if no registers are available, reload will substitute
3756 the equivalence). Secondly, it avoids problems with any registers
3757 dying in this insn whose death notes would be missed.
3758
3759 If we don't have a REG_EQUIV note, see if this insn is loading
3760 a register used only in one basic block from a MEM. If so, and the
3761 MEM remains unchanged for the life of the register, add a REG_EQUIV
3762 note. */
3763 note = find_reg_note (insn, REG_EQUIV, NULL_RTX);
3764
3765 rtx replacement = NULL_RTX;
3766 if (note)
3767 replacement = XEXP (note, 0);
3768 else if (REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
3769 && MEM_P (SET_SRC (set)))
3770 {
3771 enum valid_equiv validity;
3772 validity = validate_equiv_mem (insn, dest, SET_SRC (set));
3773 if (validity != valid_none)
3774 {
3775 replacement = copy_rtx (SET_SRC (set));
3776 if (validity == valid_reload)
3777 note = set_unique_reg_note (insn, REG_EQUIV, replacement);
3778 }
3779 }
3780
3781 /* If we haven't done so, record for reload that this is an
3782 equivalencing insn. */
3783 if (note && !reg_equiv[regno].is_arg_equivalence)
3784 ira_reg_equiv[regno].init_insns
3785 = gen_rtx_INSN_LIST (VOIDmode, insn,
3786 ira_reg_equiv[regno].init_insns);
3787
3788 if (replacement)
3789 {
3790 reg_equiv[regno].replacement = replacement;
3791 reg_equiv[regno].src_p = &SET_SRC (set);
3792 reg_equiv[regno].loop_depth = (short) loop_depth;
3793
3794 /* Don't mess with things live during setjmp. */
3795 if (optimize && !bitmap_bit_p (setjmp_crosses, regno))
3796 {
3797 /* If the register is referenced exactly twice, meaning it is
3798 set once and used once, indicate that the reference may be
3799 replaced by the equivalence we computed above. Do this
3800 even if the register is only used in one block so that
3801 dependencies can be handled where the last register is
3802 used in a different block (i.e. HIGH / LO_SUM sequences)
3803 and to reduce the number of registers alive across
3804 calls. */
3805
3806 if (REG_N_REFS (regno) == 2
3807 && (rtx_equal_p (replacement, src)
3808 || ! equiv_init_varies_p (src))
3809 && NONJUMP_INSN_P (insn)
3810 && equiv_init_movable_p (PATTERN (insn), regno))
3811 reg_equiv[regno].replace = 1;
3812 }
3813 }
3814 }
3815 }
3816 }
3817
3818 /* For insns that set a MEM to the contents of a REG that is only used
3819 in a single basic block, see if the register is always equivalent
3820 to that memory location and if moving the store from INSN to the
3821 insn that sets REG is safe. If so, put a REG_EQUIV note on the
3822 initializing insn. */
3823 static void
3824 add_store_equivs (void)
3825 {
3826 auto_bitmap seen_insns;
3827
3828 for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
3829 {
3830 rtx set, src, dest;
3831 unsigned regno;
3832 rtx_insn *init_insn;
3833
3834 bitmap_set_bit (seen_insns, INSN_UID (insn));
3835
3836 if (! INSN_P (insn))
3837 continue;
3838
3839 set = single_set (insn);
3840 if (! set)
3841 continue;
3842
3843 dest = SET_DEST (set);
3844 src = SET_SRC (set);
3845
3846 /* Don't add a REG_EQUIV note if the insn already has one. The existing
3847 REG_EQUIV is likely more useful than the one we are adding. */
3848 if (MEM_P (dest) && REG_P (src)
3849 && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER
3850 && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
3851 && DF_REG_DEF_COUNT (regno) == 1
3852 && ! reg_equiv[regno].pdx_subregs
3853 && reg_equiv[regno].init_insns != NULL
3854 && (init_insn = reg_equiv[regno].init_insns->insn ()) != 0
3855 && bitmap_bit_p (seen_insns, INSN_UID (init_insn))
3856 && ! find_reg_note (init_insn, REG_EQUIV, NULL_RTX)
3857 && validate_equiv_mem (init_insn, src, dest) == valid_reload
3858 && ! memref_used_between_p (dest, init_insn, insn)
3859 /* Attaching a REG_EQUIV note will fail if INIT_INSN has
3860 multiple sets. */
3861 && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest)))
3862 {
3863 /* This insn makes the equivalence, not the one initializing
3864 the register. */
3865 ira_reg_equiv[regno].init_insns
3866 = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
3867 df_notes_rescan (init_insn);
3868 if (dump_file)
3869 fprintf (dump_file,
3870 "Adding REG_EQUIV to insn %d for source of insn %d\n",
3871 INSN_UID (init_insn),
3872 INSN_UID (insn));
3873 }
3874 }
3875 }
3876
3877 /* Scan all regs killed in an insn to see if any of them are registers
3878 only used that once. If so, see if we can replace the reference
3879 with the equivalent form. If we can, delete the initializing
3880 reference and this register will go away. If we can't replace the
3881 reference, and the initializing reference is within the same loop
3882 (or in an inner loop), then move the register initialization just
3883 before the use, so that they are in the same basic block. */
3884 static void
3885 combine_and_move_insns (void)
3886 {
3887 auto_bitmap cleared_regs;
3888 int max = max_reg_num ();
3889
3890 for (int regno = FIRST_PSEUDO_REGISTER; regno < max; regno++)
3891 {
3892 if (!reg_equiv[regno].replace)
3893 continue;
3894
3895 rtx_insn *use_insn = 0;
3896 for (df_ref use = DF_REG_USE_CHAIN (regno);
3897 use;
3898 use = DF_REF_NEXT_REG (use))
3899 if (DF_REF_INSN_INFO (use))
3900 {
3901 if (DEBUG_INSN_P (DF_REF_INSN (use)))
3902 continue;
3903 gcc_assert (!use_insn);
3904 use_insn = DF_REF_INSN (use);
3905 }
3906 gcc_assert (use_insn);
3907
3908 /* Don't substitute into jumps. indirect_jump_optimize does
3909 this for anything we are prepared to handle. */
3910 if (JUMP_P (use_insn))
3911 continue;
3912
3913 /* Also don't substitute into a conditional trap insn -- it can become
3914 an unconditional trap, and that is a flow control insn. */
3915 if (GET_CODE (PATTERN (use_insn)) == TRAP_IF)
3916 continue;
3917
3918 df_ref def = DF_REG_DEF_CHAIN (regno);
3919 gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && DF_REF_INSN_INFO (def));
3920 rtx_insn *def_insn = DF_REF_INSN (def);
3921
3922 /* We may not move instructions that can throw, since that
3923 changes basic block boundaries and we are not prepared to
3924 adjust the CFG to match. */
3925 if (can_throw_internal (def_insn))
3926 continue;
3927
3928 /* Instructions with multiple sets can only be moved if DF analysis is
3929 performed for all of the registers set. See PR91052. */
3930 if (multiple_sets (def_insn))
3931 continue;
3932
3933 basic_block use_bb = BLOCK_FOR_INSN (use_insn);
3934 basic_block def_bb = BLOCK_FOR_INSN (def_insn);
3935 if (bb_loop_depth (use_bb) > bb_loop_depth (def_bb))
3936 continue;
3937
3938 if (asm_noperands (PATTERN (def_insn)) < 0
3939 && validate_replace_rtx (regno_reg_rtx[regno],
3940 *reg_equiv[regno].src_p, use_insn))
3941 {
3942 rtx link;
3943 /* Append the REG_DEAD notes from def_insn. */
3944 for (rtx *p = &REG_NOTES (def_insn); (link = *p) != 0; )
3945 {
3946 if (REG_NOTE_KIND (XEXP (link, 0)) == REG_DEAD)
3947 {
3948 *p = XEXP (link, 1);
3949 XEXP (link, 1) = REG_NOTES (use_insn);
3950 REG_NOTES (use_insn) = link;
3951 }
3952 else
3953 p = &XEXP (link, 1);
3954 }
3955
3956 remove_death (regno, use_insn);
3957 SET_REG_N_REFS (regno, 0);
3958 REG_FREQ (regno) = 0;
3959 df_ref use;
3960 FOR_EACH_INSN_USE (use, def_insn)
3961 {
3962 unsigned int use_regno = DF_REF_REGNO (use);
3963 if (!HARD_REGISTER_NUM_P (use_regno))
3964 reg_equiv[use_regno].replace = 0;
3965 }
3966
3967 delete_insn (def_insn);
3968
3969 reg_equiv[regno].init_insns = NULL;
3970 ira_reg_equiv[regno].init_insns = NULL;
3971 bitmap_set_bit (cleared_regs, regno);
3972 }
3973
3974 /* Move the initialization of the register to just before
3975 USE_INSN. Update the flow information. */
3976 else if (prev_nondebug_insn (use_insn) != def_insn)
3977 {
3978 rtx_insn *new_insn;
3979
3980 new_insn = emit_insn_before (PATTERN (def_insn), use_insn);
3981 REG_NOTES (new_insn) = REG_NOTES (def_insn);
3982 REG_NOTES (def_insn) = 0;
3983 /* Rescan it to process the notes. */
3984 df_insn_rescan (new_insn);
3985
3986 /* Make sure this insn is recognized before reload begins,
3987 otherwise eliminate_regs_in_insn will die. */
3988 INSN_CODE (new_insn) = INSN_CODE (def_insn);
3989
3990 delete_insn (def_insn);
3991
3992 XEXP (reg_equiv[regno].init_insns, 0) = new_insn;
3993
3994 REG_BASIC_BLOCK (regno) = use_bb->index;
3995 REG_N_CALLS_CROSSED (regno) = 0;
3996
3997 if (use_insn == BB_HEAD (use_bb))
3998 BB_HEAD (use_bb) = new_insn;
3999
4000 /* We know regno dies in use_insn, but inside a loop
4001 REG_DEAD notes might be missing when def_insn was in
4002 another basic block. However, when we move def_insn into
4003 this bb we'll definitely get a REG_DEAD note and reload
4004 will see the death. It's possible that update_equiv_regs
4005 set up an equivalence referencing regno for a reg set by
4006 use_insn, when regno was seen as non-local. Now that
4007 regno is local to this block, and dies, such an
4008 equivalence is invalid. */
4009 if (find_reg_note (use_insn, REG_EQUIV, regno_reg_rtx[regno]))
4010 {
4011 rtx set = single_set (use_insn);
4012 if (set && REG_P (SET_DEST (set)))
4013 no_equiv (SET_DEST (set), set, NULL);
4014 }
4015
4016 ira_reg_equiv[regno].init_insns
4017 = gen_rtx_INSN_LIST (VOIDmode, new_insn, NULL_RTX);
4018 bitmap_set_bit (cleared_regs, regno);
4019 }
4020 }
4021
4022 if (!bitmap_empty_p (cleared_regs))
4023 {
4024 basic_block bb;
4025
4026 FOR_EACH_BB_FN (bb, cfun)
4027 {
4028 bitmap_and_compl_into (DF_LR_IN (bb), cleared_regs);
4029 bitmap_and_compl_into (DF_LR_OUT (bb), cleared_regs);
4030 if (!df_live)
4031 continue;
4032 bitmap_and_compl_into (DF_LIVE_IN (bb), cleared_regs);
4033 bitmap_and_compl_into (DF_LIVE_OUT (bb), cleared_regs);
4034 }
4035
4036 /* Last pass - adjust debug insns referencing cleared regs. */
4037 if (MAY_HAVE_DEBUG_BIND_INSNS)
4038 for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
4039 if (DEBUG_BIND_INSN_P (insn))
4040 {
4041 rtx old_loc = INSN_VAR_LOCATION_LOC (insn);
4042 INSN_VAR_LOCATION_LOC (insn)
4043 = simplify_replace_fn_rtx (old_loc, NULL_RTX,
4044 adjust_cleared_regs,
4045 (void *) cleared_regs);
4046 if (old_loc != INSN_VAR_LOCATION_LOC (insn))
4047 df_insn_rescan (insn);
4048 }
4049 }
4050 }
4051
4052 /* A pass over indirect jumps, converting simple cases to direct jumps.
4053 Combine does this optimization too, but only within a basic block. */
4054 static void
4055 indirect_jump_optimize (void)
4056 {
4057 basic_block bb;
4058 bool rebuild_p = false;
4059
4060 FOR_EACH_BB_REVERSE_FN (bb, cfun)
4061 {
4062 rtx_insn *insn = BB_END (bb);
4063 if (!JUMP_P (insn)
4064 || find_reg_note (insn, REG_NON_LOCAL_GOTO, NULL_RTX))
4065 continue;
4066
4067 rtx x = pc_set (insn);
4068 if (!x || !REG_P (SET_SRC (x)))
4069 continue;
4070
4071 int regno = REGNO (SET_SRC (x));
4072 if (DF_REG_DEF_COUNT (regno) == 1)
4073 {
4074 df_ref def = DF_REG_DEF_CHAIN (regno);
4075 if (!DF_REF_IS_ARTIFICIAL (def))
4076 {
4077 rtx_insn *def_insn = DF_REF_INSN (def);
4078 rtx lab = NULL_RTX;
4079 rtx set = single_set (def_insn);
4080 if (set && GET_CODE (SET_SRC (set)) == LABEL_REF)
4081 lab = SET_SRC (set);
4082 else
4083 {
4084 rtx eqnote = find_reg_note (def_insn, REG_EQUAL, NULL_RTX);
4085 if (eqnote && GET_CODE (XEXP (eqnote, 0)) == LABEL_REF)
4086 lab = XEXP (eqnote, 0);
4087 }
4088 if (lab && validate_replace_rtx (SET_SRC (x), lab, insn))
4089 rebuild_p = true;
4090 }
4091 }
4092 }
4093
4094 if (rebuild_p)
4095 {
4096 timevar_push (TV_JUMP);
4097 rebuild_jump_labels (get_insns ());
4098 if (purge_all_dead_edges ())
4099 delete_unreachable_blocks ();
4100 timevar_pop (TV_JUMP);
4101 }
4102 }
4103 \f
4104 /* Set up fields memory, constant, and invariant from init_insns in
4105 the structures of array ira_reg_equiv. */
4106 static void
4107 setup_reg_equiv (void)
4108 {
4109 int i;
4110 rtx_insn_list *elem, *prev_elem, *next_elem;
4111 rtx_insn *insn;
4112 rtx set, x;
4113
4114 for (i = FIRST_PSEUDO_REGISTER; i < ira_reg_equiv_len; i++)
4115 for (prev_elem = NULL, elem = ira_reg_equiv[i].init_insns;
4116 elem;
4117 prev_elem = elem, elem = next_elem)
4118 {
4119 next_elem = elem->next ();
4120 insn = elem->insn ();
4121 set = single_set (insn);
4122
4123 /* Init insns can set up equivalence when the reg is a destination or
4124 a source (in this case the destination is memory). */
4125 if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
4126 {
4127 if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL)
4128 {
4129 x = XEXP (x, 0);
4130 if (REG_P (SET_DEST (set))
4131 && REGNO (SET_DEST (set)) == (unsigned int) i
4132 && ! rtx_equal_p (SET_SRC (set), x) && MEM_P (x))
4133 {
4134 /* This insn reporting the equivalence but
4135 actually not setting it. Remove it from the
4136 list. */
4137 if (prev_elem == NULL)
4138 ira_reg_equiv[i].init_insns = next_elem;
4139 else
4140 XEXP (prev_elem, 1) = next_elem;
4141 elem = prev_elem;
4142 }
4143 }
4144 else if (REG_P (SET_DEST (set))
4145 && REGNO (SET_DEST (set)) == (unsigned int) i)
4146 x = SET_SRC (set);
4147 else
4148 {
4149 gcc_assert (REG_P (SET_SRC (set))
4150 && REGNO (SET_SRC (set)) == (unsigned int) i);
4151 x = SET_DEST (set);
4152 }
4153 if (! function_invariant_p (x)
4154 || ! flag_pic
4155 /* A function invariant is often CONSTANT_P but may
4156 include a register. We promise to only pass
4157 CONSTANT_P objects to LEGITIMATE_PIC_OPERAND_P. */
4158 || (CONSTANT_P (x) && LEGITIMATE_PIC_OPERAND_P (x)))
4159 {
4160 /* It can happen that a REG_EQUIV note contains a MEM
4161 that is not a legitimate memory operand. As later
4162 stages of reload assume that all addresses found in
4163 the lra_regno_equiv_* arrays were originally
4164 legitimate, we ignore such REG_EQUIV notes. */
4165 if (memory_operand (x, VOIDmode))
4166 {
4167 ira_reg_equiv[i].defined_p = true;
4168 ira_reg_equiv[i].memory = x;
4169 continue;
4170 }
4171 else if (function_invariant_p (x))
4172 {
4173 machine_mode mode;
4174
4175 mode = GET_MODE (SET_DEST (set));
4176 if (GET_CODE (x) == PLUS
4177 || x == frame_pointer_rtx || x == arg_pointer_rtx)
4178 /* This is PLUS of frame pointer and a constant,
4179 or fp, or argp. */
4180 ira_reg_equiv[i].invariant = x;
4181 else if (targetm.legitimate_constant_p (mode, x))
4182 ira_reg_equiv[i].constant = x;
4183 else
4184 {
4185 ira_reg_equiv[i].memory = force_const_mem (mode, x);
4186 if (ira_reg_equiv[i].memory == NULL_RTX)
4187 {
4188 ira_reg_equiv[i].defined_p = false;
4189 ira_reg_equiv[i].init_insns = NULL;
4190 break;
4191 }
4192 }
4193 ira_reg_equiv[i].defined_p = true;
4194 continue;
4195 }
4196 }
4197 }
4198 ira_reg_equiv[i].defined_p = false;
4199 ira_reg_equiv[i].init_insns = NULL;
4200 break;
4201 }
4202 }
4203
4204 \f
4205
4206 /* Print chain C to FILE. */
4207 static void
4208 print_insn_chain (FILE *file, class insn_chain *c)
4209 {
4210 fprintf (file, "insn=%d, ", INSN_UID (c->insn));
4211 bitmap_print (file, &c->live_throughout, "live_throughout: ", ", ");
4212 bitmap_print (file, &c->dead_or_set, "dead_or_set: ", "\n");
4213 }
4214
4215
4216 /* Print all reload_insn_chains to FILE. */
4217 static void
4218 print_insn_chains (FILE *file)
4219 {
4220 class insn_chain *c;
4221 for (c = reload_insn_chain; c ; c = c->next)
4222 print_insn_chain (file, c);
4223 }
4224
4225 /* Return true if pseudo REGNO should be added to set live_throughout
4226 or dead_or_set of the insn chains for reload consideration. */
4227 static bool
4228 pseudo_for_reload_consideration_p (int regno)
4229 {
4230 /* Consider spilled pseudos too for IRA because they still have a
4231 chance to get hard-registers in the reload when IRA is used. */
4232 return (reg_renumber[regno] >= 0 || ira_conflicts_p);
4233 }
4234
4235 /* Return true if we can track the individual bytes of subreg X.
4236 When returning true, set *OUTER_SIZE to the number of bytes in
4237 X itself, *INNER_SIZE to the number of bytes in the inner register
4238 and *START to the offset of the first byte. */
4239 static bool
4240 get_subreg_tracking_sizes (rtx x, HOST_WIDE_INT *outer_size,
4241 HOST_WIDE_INT *inner_size, HOST_WIDE_INT *start)
4242 {
4243 rtx reg = regno_reg_rtx[REGNO (SUBREG_REG (x))];
4244 return (GET_MODE_SIZE (GET_MODE (x)).is_constant (outer_size)
4245 && GET_MODE_SIZE (GET_MODE (reg)).is_constant (inner_size)
4246 && SUBREG_BYTE (x).is_constant (start));
4247 }
4248
4249 /* Init LIVE_SUBREGS[ALLOCNUM] and LIVE_SUBREGS_USED[ALLOCNUM] for
4250 a register with SIZE bytes, making the register live if INIT_VALUE. */
4251 static void
4252 init_live_subregs (bool init_value, sbitmap *live_subregs,
4253 bitmap live_subregs_used, int allocnum, int size)
4254 {
4255 gcc_assert (size > 0);
4256
4257 /* Been there, done that. */
4258 if (bitmap_bit_p (live_subregs_used, allocnum))
4259 return;
4260
4261 /* Create a new one. */
4262 if (live_subregs[allocnum] == NULL)
4263 live_subregs[allocnum] = sbitmap_alloc (size);
4264
4265 /* If the entire reg was live before blasting into subregs, we need
4266 to init all of the subregs to ones else init to 0. */
4267 if (init_value)
4268 bitmap_ones (live_subregs[allocnum]);
4269 else
4270 bitmap_clear (live_subregs[allocnum]);
4271
4272 bitmap_set_bit (live_subregs_used, allocnum);
4273 }
4274
4275 /* Walk the insns of the current function and build reload_insn_chain,
4276 and record register life information. */
4277 static void
4278 build_insn_chain (void)
4279 {
4280 unsigned int i;
4281 class insn_chain **p = &reload_insn_chain;
4282 basic_block bb;
4283 class insn_chain *c = NULL;
4284 class insn_chain *next = NULL;
4285 auto_bitmap live_relevant_regs;
4286 auto_bitmap elim_regset;
4287 /* live_subregs is a vector used to keep accurate information about
4288 which hardregs are live in multiword pseudos. live_subregs and
4289 live_subregs_used are indexed by pseudo number. The live_subreg
4290 entry for a particular pseudo is only used if the corresponding
4291 element is non zero in live_subregs_used. The sbitmap size of
4292 live_subreg[allocno] is number of bytes that the pseudo can
4293 occupy. */
4294 sbitmap *live_subregs = XCNEWVEC (sbitmap, max_regno);
4295 auto_bitmap live_subregs_used;
4296
4297 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
4298 if (TEST_HARD_REG_BIT (eliminable_regset, i))
4299 bitmap_set_bit (elim_regset, i);
4300 FOR_EACH_BB_REVERSE_FN (bb, cfun)
4301 {
4302 bitmap_iterator bi;
4303 rtx_insn *insn;
4304
4305 CLEAR_REG_SET (live_relevant_regs);
4306 bitmap_clear (live_subregs_used);
4307
4308 EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb), 0, i, bi)
4309 {
4310 if (i >= FIRST_PSEUDO_REGISTER)
4311 break;
4312 bitmap_set_bit (live_relevant_regs, i);
4313 }
4314
4315 EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb),
4316 FIRST_PSEUDO_REGISTER, i, bi)
4317 {
4318 if (pseudo_for_reload_consideration_p (i))
4319 bitmap_set_bit (live_relevant_regs, i);
4320 }
4321
4322 FOR_BB_INSNS_REVERSE (bb, insn)
4323 {
4324 if (!NOTE_P (insn) && !BARRIER_P (insn))
4325 {
4326 struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4327 df_ref def, use;
4328
4329 c = new_insn_chain ();
4330 c->next = next;
4331 next = c;
4332 *p = c;
4333 p = &c->prev;
4334
4335 c->insn = insn;
4336 c->block = bb->index;
4337
4338 if (NONDEBUG_INSN_P (insn))
4339 FOR_EACH_INSN_INFO_DEF (def, insn_info)
4340 {
4341 unsigned int regno = DF_REF_REGNO (def);
4342
4343 /* Ignore may clobbers because these are generated
4344 from calls. However, every other kind of def is
4345 added to dead_or_set. */
4346 if (!DF_REF_FLAGS_IS_SET (def, DF_REF_MAY_CLOBBER))
4347 {
4348 if (regno < FIRST_PSEUDO_REGISTER)
4349 {
4350 if (!fixed_regs[regno])
4351 bitmap_set_bit (&c->dead_or_set, regno);
4352 }
4353 else if (pseudo_for_reload_consideration_p (regno))
4354 bitmap_set_bit (&c->dead_or_set, regno);
4355 }
4356
4357 if ((regno < FIRST_PSEUDO_REGISTER
4358 || reg_renumber[regno] >= 0
4359 || ira_conflicts_p)
4360 && (!DF_REF_FLAGS_IS_SET (def, DF_REF_CONDITIONAL)))
4361 {
4362 rtx reg = DF_REF_REG (def);
4363 HOST_WIDE_INT outer_size, inner_size, start;
4364
4365 /* We can usually track the liveness of individual
4366 bytes within a subreg. The only exceptions are
4367 subregs wrapped in ZERO_EXTRACTs and subregs whose
4368 size is not known; in those cases we need to be
4369 conservative and treat the definition as a partial
4370 definition of the full register rather than a full
4371 definition of a specific part of the register. */
4372 if (GET_CODE (reg) == SUBREG
4373 && !DF_REF_FLAGS_IS_SET (def, DF_REF_ZERO_EXTRACT)
4374 && get_subreg_tracking_sizes (reg, &outer_size,
4375 &inner_size, &start))
4376 {
4377 HOST_WIDE_INT last = start + outer_size;
4378
4379 init_live_subregs
4380 (bitmap_bit_p (live_relevant_regs, regno),
4381 live_subregs, live_subregs_used, regno,
4382 inner_size);
4383
4384 if (!DF_REF_FLAGS_IS_SET
4385 (def, DF_REF_STRICT_LOW_PART))
4386 {
4387 /* Expand the range to cover entire words.
4388 Bytes added here are "don't care". */
4389 start
4390 = start / UNITS_PER_WORD * UNITS_PER_WORD;
4391 last = ((last + UNITS_PER_WORD - 1)
4392 / UNITS_PER_WORD * UNITS_PER_WORD);
4393 }
4394
4395 /* Ignore the paradoxical bits. */
4396 if (last > SBITMAP_SIZE (live_subregs[regno]))
4397 last = SBITMAP_SIZE (live_subregs[regno]);
4398
4399 while (start < last)
4400 {
4401 bitmap_clear_bit (live_subregs[regno], start);
4402 start++;
4403 }
4404
4405 if (bitmap_empty_p (live_subregs[regno]))
4406 {
4407 bitmap_clear_bit (live_subregs_used, regno);
4408 bitmap_clear_bit (live_relevant_regs, regno);
4409 }
4410 else
4411 /* Set live_relevant_regs here because
4412 that bit has to be true to get us to
4413 look at the live_subregs fields. */
4414 bitmap_set_bit (live_relevant_regs, regno);
4415 }
4416 else
4417 {
4418 /* DF_REF_PARTIAL is generated for
4419 subregs, STRICT_LOW_PART, and
4420 ZERO_EXTRACT. We handle the subreg
4421 case above so here we have to keep from
4422 modeling the def as a killing def. */
4423 if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL))
4424 {
4425 bitmap_clear_bit (live_subregs_used, regno);
4426 bitmap_clear_bit (live_relevant_regs, regno);
4427 }
4428 }
4429 }
4430 }
4431
4432 bitmap_and_compl_into (live_relevant_regs, elim_regset);
4433 bitmap_copy (&c->live_throughout, live_relevant_regs);
4434
4435 if (NONDEBUG_INSN_P (insn))
4436 FOR_EACH_INSN_INFO_USE (use, insn_info)
4437 {
4438 unsigned int regno = DF_REF_REGNO (use);
4439 rtx reg = DF_REF_REG (use);
4440
4441 /* DF_REF_READ_WRITE on a use means that this use
4442 is fabricated from a def that is a partial set
4443 to a multiword reg. Here, we only model the
4444 subreg case that is not wrapped in ZERO_EXTRACT
4445 precisely so we do not need to look at the
4446 fabricated use. */
4447 if (DF_REF_FLAGS_IS_SET (use, DF_REF_READ_WRITE)
4448 && !DF_REF_FLAGS_IS_SET (use, DF_REF_ZERO_EXTRACT)
4449 && DF_REF_FLAGS_IS_SET (use, DF_REF_SUBREG))
4450 continue;
4451
4452 /* Add the last use of each var to dead_or_set. */
4453 if (!bitmap_bit_p (live_relevant_regs, regno))
4454 {
4455 if (regno < FIRST_PSEUDO_REGISTER)
4456 {
4457 if (!fixed_regs[regno])
4458 bitmap_set_bit (&c->dead_or_set, regno);
4459 }
4460 else if (pseudo_for_reload_consideration_p (regno))
4461 bitmap_set_bit (&c->dead_or_set, regno);
4462 }
4463
4464 if (regno < FIRST_PSEUDO_REGISTER
4465 || pseudo_for_reload_consideration_p (regno))
4466 {
4467 HOST_WIDE_INT outer_size, inner_size, start;
4468 if (GET_CODE (reg) == SUBREG
4469 && !DF_REF_FLAGS_IS_SET (use,
4470 DF_REF_SIGN_EXTRACT
4471 | DF_REF_ZERO_EXTRACT)
4472 && get_subreg_tracking_sizes (reg, &outer_size,
4473 &inner_size, &start))
4474 {
4475 HOST_WIDE_INT last = start + outer_size;
4476
4477 init_live_subregs
4478 (bitmap_bit_p (live_relevant_regs, regno),
4479 live_subregs, live_subregs_used, regno,
4480 inner_size);
4481
4482 /* Ignore the paradoxical bits. */
4483 if (last > SBITMAP_SIZE (live_subregs[regno]))
4484 last = SBITMAP_SIZE (live_subregs[regno]);
4485
4486 while (start < last)
4487 {
4488 bitmap_set_bit (live_subregs[regno], start);
4489 start++;
4490 }
4491 }
4492 else
4493 /* Resetting the live_subregs_used is
4494 effectively saying do not use the subregs
4495 because we are reading the whole
4496 pseudo. */
4497 bitmap_clear_bit (live_subregs_used, regno);
4498 bitmap_set_bit (live_relevant_regs, regno);
4499 }
4500 }
4501 }
4502 }
4503
4504 /* FIXME!! The following code is a disaster. Reload needs to see the
4505 labels and jump tables that are just hanging out in between
4506 the basic blocks. See pr33676. */
4507 insn = BB_HEAD (bb);
4508
4509 /* Skip over the barriers and cruft. */
4510 while (insn && (BARRIER_P (insn) || NOTE_P (insn)
4511 || BLOCK_FOR_INSN (insn) == bb))
4512 insn = PREV_INSN (insn);
4513
4514 /* While we add anything except barriers and notes, the focus is
4515 to get the labels and jump tables into the
4516 reload_insn_chain. */
4517 while (insn)
4518 {
4519 if (!NOTE_P (insn) && !BARRIER_P (insn))
4520 {
4521 if (BLOCK_FOR_INSN (insn))
4522 break;
4523
4524 c = new_insn_chain ();
4525 c->next = next;
4526 next = c;
4527 *p = c;
4528 p = &c->prev;
4529
4530 /* The block makes no sense here, but it is what the old
4531 code did. */
4532 c->block = bb->index;
4533 c->insn = insn;
4534 bitmap_copy (&c->live_throughout, live_relevant_regs);
4535 }
4536 insn = PREV_INSN (insn);
4537 }
4538 }
4539
4540 reload_insn_chain = c;
4541 *p = NULL;
4542
4543 for (i = 0; i < (unsigned int) max_regno; i++)
4544 if (live_subregs[i] != NULL)
4545 sbitmap_free (live_subregs[i]);
4546 free (live_subregs);
4547
4548 if (dump_file)
4549 print_insn_chains (dump_file);
4550 }
4551 \f
4552 /* Examine the rtx found in *LOC, which is read or written to as determined
4553 by TYPE. Return false if we find a reason why an insn containing this
4554 rtx should not be moved (such as accesses to non-constant memory), true
4555 otherwise. */
4556 static bool
4557 rtx_moveable_p (rtx *loc, enum op_type type)
4558 {
4559 const char *fmt;
4560 rtx x = *loc;
4561 int i, j;
4562
4563 enum rtx_code code = GET_CODE (x);
4564 switch (code)
4565 {
4566 case CONST:
4567 CASE_CONST_ANY:
4568 case SYMBOL_REF:
4569 case LABEL_REF:
4570 return true;
4571
4572 case PC:
4573 return type == OP_IN;
4574
4575 case REG:
4576 if (x == frame_pointer_rtx)
4577 return true;
4578 if (HARD_REGISTER_P (x))
4579 return false;
4580
4581 return true;
4582
4583 case MEM:
4584 if (type == OP_IN && MEM_READONLY_P (x))
4585 return rtx_moveable_p (&XEXP (x, 0), OP_IN);
4586 return false;
4587
4588 case SET:
4589 return (rtx_moveable_p (&SET_SRC (x), OP_IN)
4590 && rtx_moveable_p (&SET_DEST (x), OP_OUT));
4591
4592 case STRICT_LOW_PART:
4593 return rtx_moveable_p (&XEXP (x, 0), OP_OUT);
4594
4595 case ZERO_EXTRACT:
4596 case SIGN_EXTRACT:
4597 return (rtx_moveable_p (&XEXP (x, 0), type)
4598 && rtx_moveable_p (&XEXP (x, 1), OP_IN)
4599 && rtx_moveable_p (&XEXP (x, 2), OP_IN));
4600
4601 case CLOBBER:
4602 return rtx_moveable_p (&SET_DEST (x), OP_OUT);
4603
4604 case UNSPEC_VOLATILE:
4605 /* It is a bad idea to consider insns with such rtl
4606 as moveable ones. The insn scheduler also considers them as barrier
4607 for a reason. */
4608 return false;
4609
4610 case ASM_OPERANDS:
4611 /* The same is true for volatile asm: it has unknown side effects, it
4612 cannot be moved at will. */
4613 if (MEM_VOLATILE_P (x))
4614 return false;
4615
4616 default:
4617 break;
4618 }
4619
4620 fmt = GET_RTX_FORMAT (code);
4621 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
4622 {
4623 if (fmt[i] == 'e')
4624 {
4625 if (!rtx_moveable_p (&XEXP (x, i), type))
4626 return false;
4627 }
4628 else if (fmt[i] == 'E')
4629 for (j = XVECLEN (x, i) - 1; j >= 0; j--)
4630 {
4631 if (!rtx_moveable_p (&XVECEXP (x, i, j), type))
4632 return false;
4633 }
4634 }
4635 return true;
4636 }
4637
4638 /* A wrapper around dominated_by_p, which uses the information in UID_LUID
4639 to give dominance relationships between two insns I1 and I2. */
4640 static bool
4641 insn_dominated_by_p (rtx i1, rtx i2, int *uid_luid)
4642 {
4643 basic_block bb1 = BLOCK_FOR_INSN (i1);
4644 basic_block bb2 = BLOCK_FOR_INSN (i2);
4645
4646 if (bb1 == bb2)
4647 return uid_luid[INSN_UID (i2)] < uid_luid[INSN_UID (i1)];
4648 return dominated_by_p (CDI_DOMINATORS, bb1, bb2);
4649 }
4650
4651 /* Record the range of register numbers added by find_moveable_pseudos. */
4652 int first_moveable_pseudo, last_moveable_pseudo;
4653
4654 /* These two vectors hold data for every register added by
4655 find_movable_pseudos, with index 0 holding data for the
4656 first_moveable_pseudo. */
4657 /* The original home register. */
4658 static vec<rtx> pseudo_replaced_reg;
4659
4660 /* Look for instances where we have an instruction that is known to increase
4661 register pressure, and whose result is not used immediately. If it is
4662 possible to move the instruction downwards to just before its first use,
4663 split its lifetime into two ranges. We create a new pseudo to compute the
4664 value, and emit a move instruction just before the first use. If, after
4665 register allocation, the new pseudo remains unallocated, the function
4666 move_unallocated_pseudos then deletes the move instruction and places
4667 the computation just before the first use.
4668
4669 Such a move is safe and profitable if all the input registers remain live
4670 and unchanged between the original computation and its first use. In such
4671 a situation, the computation is known to increase register pressure, and
4672 moving it is known to at least not worsen it.
4673
4674 We restrict moves to only those cases where a register remains unallocated,
4675 in order to avoid interfering too much with the instruction schedule. As
4676 an exception, we may move insns which only modify their input register
4677 (typically induction variables), as this increases the freedom for our
4678 intended transformation, and does not limit the second instruction
4679 scheduler pass. */
4680
4681 static void
4682 find_moveable_pseudos (void)
4683 {
4684 unsigned i;
4685 int max_regs = max_reg_num ();
4686 int max_uid = get_max_uid ();
4687 basic_block bb;
4688 int *uid_luid = XNEWVEC (int, max_uid);
4689 rtx_insn **closest_uses = XNEWVEC (rtx_insn *, max_regs);
4690 /* A set of registers which are live but not modified throughout a block. */
4691 bitmap_head *bb_transp_live = XNEWVEC (bitmap_head,
4692 last_basic_block_for_fn (cfun));
4693 /* A set of registers which only exist in a given basic block. */
4694 bitmap_head *bb_local = XNEWVEC (bitmap_head,
4695 last_basic_block_for_fn (cfun));
4696 /* A set of registers which are set once, in an instruction that can be
4697 moved freely downwards, but are otherwise transparent to a block. */
4698 bitmap_head *bb_moveable_reg_sets = XNEWVEC (bitmap_head,
4699 last_basic_block_for_fn (cfun));
4700 auto_bitmap live, used, set, interesting, unusable_as_input;
4701 bitmap_iterator bi;
4702
4703 first_moveable_pseudo = max_regs;
4704 pseudo_replaced_reg.release ();
4705 pseudo_replaced_reg.safe_grow_cleared (max_regs, true);
4706
4707 df_analyze ();
4708 calculate_dominance_info (CDI_DOMINATORS);
4709
4710 i = 0;
4711 FOR_EACH_BB_FN (bb, cfun)
4712 {
4713 rtx_insn *insn;
4714 bitmap transp = bb_transp_live + bb->index;
4715 bitmap moveable = bb_moveable_reg_sets + bb->index;
4716 bitmap local = bb_local + bb->index;
4717
4718 bitmap_initialize (local, 0);
4719 bitmap_initialize (transp, 0);
4720 bitmap_initialize (moveable, 0);
4721 bitmap_copy (live, df_get_live_out (bb));
4722 bitmap_and_into (live, df_get_live_in (bb));
4723 bitmap_copy (transp, live);
4724 bitmap_clear (moveable);
4725 bitmap_clear (live);
4726 bitmap_clear (used);
4727 bitmap_clear (set);
4728 FOR_BB_INSNS (bb, insn)
4729 if (NONDEBUG_INSN_P (insn))
4730 {
4731 df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4732 df_ref def, use;
4733
4734 uid_luid[INSN_UID (insn)] = i++;
4735
4736 def = df_single_def (insn_info);
4737 use = df_single_use (insn_info);
4738 if (use
4739 && def
4740 && DF_REF_REGNO (use) == DF_REF_REGNO (def)
4741 && !bitmap_bit_p (set, DF_REF_REGNO (use))
4742 && rtx_moveable_p (&PATTERN (insn), OP_IN))
4743 {
4744 unsigned regno = DF_REF_REGNO (use);
4745 bitmap_set_bit (moveable, regno);
4746 bitmap_set_bit (set, regno);
4747 bitmap_set_bit (used, regno);
4748 bitmap_clear_bit (transp, regno);
4749 continue;
4750 }
4751 FOR_EACH_INSN_INFO_USE (use, insn_info)
4752 {
4753 unsigned regno = DF_REF_REGNO (use);
4754 bitmap_set_bit (used, regno);
4755 if (bitmap_clear_bit (moveable, regno))
4756 bitmap_clear_bit (transp, regno);
4757 }
4758
4759 FOR_EACH_INSN_INFO_DEF (def, insn_info)
4760 {
4761 unsigned regno = DF_REF_REGNO (def);
4762 bitmap_set_bit (set, regno);
4763 bitmap_clear_bit (transp, regno);
4764 bitmap_clear_bit (moveable, regno);
4765 }
4766 }
4767 }
4768
4769 FOR_EACH_BB_FN (bb, cfun)
4770 {
4771 bitmap local = bb_local + bb->index;
4772 rtx_insn *insn;
4773
4774 FOR_BB_INSNS (bb, insn)
4775 if (NONDEBUG_INSN_P (insn))
4776 {
4777 df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4778 rtx_insn *def_insn;
4779 rtx closest_use, note;
4780 df_ref def, use;
4781 unsigned regno;
4782 bool all_dominated, all_local;
4783 machine_mode mode;
4784
4785 def = df_single_def (insn_info);
4786 /* There must be exactly one def in this insn. */
4787 if (!def || !single_set (insn))
4788 continue;
4789 /* This must be the only definition of the reg. We also limit
4790 which modes we deal with so that we can assume we can generate
4791 move instructions. */
4792 regno = DF_REF_REGNO (def);
4793 mode = GET_MODE (DF_REF_REG (def));
4794 if (DF_REG_DEF_COUNT (regno) != 1
4795 || !DF_REF_INSN_INFO (def)
4796 || HARD_REGISTER_NUM_P (regno)
4797 || DF_REG_EQ_USE_COUNT (regno) > 0
4798 || (!INTEGRAL_MODE_P (mode)
4799 && !FLOAT_MODE_P (mode)
4800 && !OPAQUE_MODE_P (mode)))
4801 continue;
4802 def_insn = DF_REF_INSN (def);
4803
4804 for (note = REG_NOTES (def_insn); note; note = XEXP (note, 1))
4805 if (REG_NOTE_KIND (note) == REG_EQUIV && MEM_P (XEXP (note, 0)))
4806 break;
4807
4808 if (note)
4809 {
4810 if (dump_file)
4811 fprintf (dump_file, "Ignoring reg %d, has equiv memory\n",
4812 regno);
4813 bitmap_set_bit (unusable_as_input, regno);
4814 continue;
4815 }
4816
4817 use = DF_REG_USE_CHAIN (regno);
4818 all_dominated = true;
4819 all_local = true;
4820 closest_use = NULL_RTX;
4821 for (; use; use = DF_REF_NEXT_REG (use))
4822 {
4823 rtx_insn *insn;
4824 if (!DF_REF_INSN_INFO (use))
4825 {
4826 all_dominated = false;
4827 all_local = false;
4828 break;
4829 }
4830 insn = DF_REF_INSN (use);
4831 if (DEBUG_INSN_P (insn))
4832 continue;
4833 if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (def_insn))
4834 all_local = false;
4835 if (!insn_dominated_by_p (insn, def_insn, uid_luid))
4836 all_dominated = false;
4837 if (closest_use != insn && closest_use != const0_rtx)
4838 {
4839 if (closest_use == NULL_RTX)
4840 closest_use = insn;
4841 else if (insn_dominated_by_p (closest_use, insn, uid_luid))
4842 closest_use = insn;
4843 else if (!insn_dominated_by_p (insn, closest_use, uid_luid))
4844 closest_use = const0_rtx;
4845 }
4846 }
4847 if (!all_dominated)
4848 {
4849 if (dump_file)
4850 fprintf (dump_file, "Reg %d not all uses dominated by set\n",
4851 regno);
4852 continue;
4853 }
4854 if (all_local)
4855 bitmap_set_bit (local, regno);
4856 if (closest_use == const0_rtx || closest_use == NULL
4857 || next_nonnote_nondebug_insn (def_insn) == closest_use)
4858 {
4859 if (dump_file)
4860 fprintf (dump_file, "Reg %d uninteresting%s\n", regno,
4861 closest_use == const0_rtx || closest_use == NULL
4862 ? " (no unique first use)" : "");
4863 continue;
4864 }
4865
4866 bitmap_set_bit (interesting, regno);
4867 /* If we get here, we know closest_use is a non-NULL insn
4868 (as opposed to const_0_rtx). */
4869 closest_uses[regno] = as_a <rtx_insn *> (closest_use);
4870
4871 if (dump_file && (all_local || all_dominated))
4872 {
4873 fprintf (dump_file, "Reg %u:", regno);
4874 if (all_local)
4875 fprintf (dump_file, " local to bb %d", bb->index);
4876 if (all_dominated)
4877 fprintf (dump_file, " def dominates all uses");
4878 if (closest_use != const0_rtx)
4879 fprintf (dump_file, " has unique first use");
4880 fputs ("\n", dump_file);
4881 }
4882 }
4883 }
4884
4885 EXECUTE_IF_SET_IN_BITMAP (interesting, 0, i, bi)
4886 {
4887 df_ref def = DF_REG_DEF_CHAIN (i);
4888 rtx_insn *def_insn = DF_REF_INSN (def);
4889 basic_block def_block = BLOCK_FOR_INSN (def_insn);
4890 bitmap def_bb_local = bb_local + def_block->index;
4891 bitmap def_bb_moveable = bb_moveable_reg_sets + def_block->index;
4892 bitmap def_bb_transp = bb_transp_live + def_block->index;
4893 bool local_to_bb_p = bitmap_bit_p (def_bb_local, i);
4894 rtx_insn *use_insn = closest_uses[i];
4895 df_ref use;
4896 bool all_ok = true;
4897 bool all_transp = true;
4898
4899 if (!REG_P (DF_REF_REG (def)))
4900 continue;
4901
4902 if (!local_to_bb_p)
4903 {
4904 if (dump_file)
4905 fprintf (dump_file, "Reg %u not local to one basic block\n",
4906 i);
4907 continue;
4908 }
4909 if (reg_equiv_init (i) != NULL_RTX)
4910 {
4911 if (dump_file)
4912 fprintf (dump_file, "Ignoring reg %u with equiv init insn\n",
4913 i);
4914 continue;
4915 }
4916 if (!rtx_moveable_p (&PATTERN (def_insn), OP_IN))
4917 {
4918 if (dump_file)
4919 fprintf (dump_file, "Found def insn %d for %d to be not moveable\n",
4920 INSN_UID (def_insn), i);
4921 continue;
4922 }
4923 if (dump_file)
4924 fprintf (dump_file, "Examining insn %d, def for %d\n",
4925 INSN_UID (def_insn), i);
4926 FOR_EACH_INSN_USE (use, def_insn)
4927 {
4928 unsigned regno = DF_REF_REGNO (use);
4929 if (bitmap_bit_p (unusable_as_input, regno))
4930 {
4931 all_ok = false;
4932 if (dump_file)
4933 fprintf (dump_file, " found unusable input reg %u.\n", regno);
4934 break;
4935 }
4936 if (!bitmap_bit_p (def_bb_transp, regno))
4937 {
4938 if (bitmap_bit_p (def_bb_moveable, regno)
4939 && !control_flow_insn_p (use_insn))
4940 {
4941 if (modified_between_p (DF_REF_REG (use), def_insn, use_insn))
4942 {
4943 rtx_insn *x = NEXT_INSN (def_insn);
4944 while (!modified_in_p (DF_REF_REG (use), x))
4945 {
4946 gcc_assert (x != use_insn);
4947 x = NEXT_INSN (x);
4948 }
4949 if (dump_file)
4950 fprintf (dump_file, " input reg %u modified but insn %d moveable\n",
4951 regno, INSN_UID (x));
4952 emit_insn_after (PATTERN (x), use_insn);
4953 set_insn_deleted (x);
4954 }
4955 else
4956 {
4957 if (dump_file)
4958 fprintf (dump_file, " input reg %u modified between def and use\n",
4959 regno);
4960 all_transp = false;
4961 }
4962 }
4963 else
4964 all_transp = false;
4965 }
4966 }
4967 if (!all_ok)
4968 continue;
4969 if (!dbg_cnt (ira_move))
4970 break;
4971 if (dump_file)
4972 fprintf (dump_file, " all ok%s\n", all_transp ? " and transp" : "");
4973
4974 if (all_transp)
4975 {
4976 rtx def_reg = DF_REF_REG (def);
4977 rtx newreg = ira_create_new_reg (def_reg);
4978 if (validate_change (def_insn, DF_REF_REAL_LOC (def), newreg, 0))
4979 {
4980 unsigned nregno = REGNO (newreg);
4981 emit_insn_before (gen_move_insn (def_reg, newreg), use_insn);
4982 nregno -= max_regs;
4983 pseudo_replaced_reg[nregno] = def_reg;
4984 }
4985 }
4986 }
4987
4988 FOR_EACH_BB_FN (bb, cfun)
4989 {
4990 bitmap_clear (bb_local + bb->index);
4991 bitmap_clear (bb_transp_live + bb->index);
4992 bitmap_clear (bb_moveable_reg_sets + bb->index);
4993 }
4994 free (uid_luid);
4995 free (closest_uses);
4996 free (bb_local);
4997 free (bb_transp_live);
4998 free (bb_moveable_reg_sets);
4999
5000 last_moveable_pseudo = max_reg_num ();
5001
5002 fix_reg_equiv_init ();
5003 expand_reg_info ();
5004 regstat_free_n_sets_and_refs ();
5005 regstat_free_ri ();
5006 regstat_init_n_sets_and_refs ();
5007 regstat_compute_ri ();
5008 free_dominance_info (CDI_DOMINATORS);
5009 }
5010
5011 /* If SET pattern SET is an assignment from a hard register to a pseudo which
5012 is live at CALL_DOM (if non-NULL, otherwise this check is omitted), return
5013 the destination. Otherwise return NULL. */
5014
5015 static rtx
5016 interesting_dest_for_shprep_1 (rtx set, basic_block call_dom)
5017 {
5018 rtx src = SET_SRC (set);
5019 rtx dest = SET_DEST (set);
5020 if (!REG_P (src) || !HARD_REGISTER_P (src)
5021 || !REG_P (dest) || HARD_REGISTER_P (dest)
5022 || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))
5023 return NULL;
5024 return dest;
5025 }
5026
5027 /* If insn is interesting for parameter range-splitting shrink-wrapping
5028 preparation, i.e. it is a single set from a hard register to a pseudo, which
5029 is live at CALL_DOM (if non-NULL, otherwise this check is omitted), or a
5030 parallel statement with only one such statement, return the destination.
5031 Otherwise return NULL. */
5032
5033 static rtx
5034 interesting_dest_for_shprep (rtx_insn *insn, basic_block call_dom)
5035 {
5036 if (!INSN_P (insn))
5037 return NULL;
5038 rtx pat = PATTERN (insn);
5039 if (GET_CODE (pat) == SET)
5040 return interesting_dest_for_shprep_1 (pat, call_dom);
5041
5042 if (GET_CODE (pat) != PARALLEL)
5043 return NULL;
5044 rtx ret = NULL;
5045 for (int i = 0; i < XVECLEN (pat, 0); i++)
5046 {
5047 rtx sub = XVECEXP (pat, 0, i);
5048 if (GET_CODE (sub) == USE || GET_CODE (sub) == CLOBBER)
5049 continue;
5050 if (GET_CODE (sub) != SET
5051 || side_effects_p (sub))
5052 return NULL;
5053 rtx dest = interesting_dest_for_shprep_1 (sub, call_dom);
5054 if (dest && ret)
5055 return NULL;
5056 if (dest)
5057 ret = dest;
5058 }
5059 return ret;
5060 }
5061
5062 /* Split live ranges of pseudos that are loaded from hard registers in the
5063 first BB in a BB that dominates all non-sibling call if such a BB can be
5064 found and is not in a loop. Return true if the function has made any
5065 changes. */
5066
5067 static bool
5068 split_live_ranges_for_shrink_wrap (void)
5069 {
5070 basic_block bb, call_dom = NULL;
5071 basic_block first = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
5072 rtx_insn *insn, *last_interesting_insn = NULL;
5073 auto_bitmap need_new, reachable;
5074 vec<basic_block> queue;
5075
5076 if (!SHRINK_WRAPPING_ENABLED)
5077 return false;
5078
5079 queue.create (n_basic_blocks_for_fn (cfun));
5080
5081 FOR_EACH_BB_FN (bb, cfun)
5082 FOR_BB_INSNS (bb, insn)
5083 if (CALL_P (insn) && !SIBLING_CALL_P (insn))
5084 {
5085 if (bb == first)
5086 {
5087 queue.release ();
5088 return false;
5089 }
5090
5091 bitmap_set_bit (need_new, bb->index);
5092 bitmap_set_bit (reachable, bb->index);
5093 queue.quick_push (bb);
5094 break;
5095 }
5096
5097 if (queue.is_empty ())
5098 {
5099 queue.release ();
5100 return false;
5101 }
5102
5103 while (!queue.is_empty ())
5104 {
5105 edge e;
5106 edge_iterator ei;
5107
5108 bb = queue.pop ();
5109 FOR_EACH_EDGE (e, ei, bb->succs)
5110 if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
5111 && bitmap_set_bit (reachable, e->dest->index))
5112 queue.quick_push (e->dest);
5113 }
5114 queue.release ();
5115
5116 FOR_BB_INSNS (first, insn)
5117 {
5118 rtx dest = interesting_dest_for_shprep (insn, NULL);
5119 if (!dest)
5120 continue;
5121
5122 if (DF_REG_DEF_COUNT (REGNO (dest)) > 1)
5123 return false;
5124
5125 for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
5126 use;
5127 use = DF_REF_NEXT_REG (use))
5128 {
5129 int ubbi = DF_REF_BB (use)->index;
5130 if (bitmap_bit_p (reachable, ubbi))
5131 bitmap_set_bit (need_new, ubbi);
5132 }
5133 last_interesting_insn = insn;
5134 }
5135
5136 if (!last_interesting_insn)
5137 return false;
5138
5139 call_dom = nearest_common_dominator_for_set (CDI_DOMINATORS, need_new);
5140 if (call_dom == first)
5141 return false;
5142
5143 loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
5144 while (bb_loop_depth (call_dom) > 0)
5145 call_dom = get_immediate_dominator (CDI_DOMINATORS, call_dom);
5146 loop_optimizer_finalize ();
5147
5148 if (call_dom == first)
5149 return false;
5150
5151 calculate_dominance_info (CDI_POST_DOMINATORS);
5152 if (dominated_by_p (CDI_POST_DOMINATORS, first, call_dom))
5153 {
5154 free_dominance_info (CDI_POST_DOMINATORS);
5155 return false;
5156 }
5157 free_dominance_info (CDI_POST_DOMINATORS);
5158
5159 if (dump_file)
5160 fprintf (dump_file, "Will split live ranges of parameters at BB %i\n",
5161 call_dom->index);
5162
5163 bool ret = false;
5164 FOR_BB_INSNS (first, insn)
5165 {
5166 rtx dest = interesting_dest_for_shprep (insn, call_dom);
5167 if (!dest || dest == pic_offset_table_rtx)
5168 continue;
5169
5170 bool need_newreg = false;
5171 df_ref use, next;
5172 for (use = DF_REG_USE_CHAIN (REGNO (dest)); use; use = next)
5173 {
5174 rtx_insn *uin = DF_REF_INSN (use);
5175 next = DF_REF_NEXT_REG (use);
5176
5177 if (DEBUG_INSN_P (uin))
5178 continue;
5179
5180 basic_block ubb = BLOCK_FOR_INSN (uin);
5181 if (ubb == call_dom
5182 || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
5183 {
5184 need_newreg = true;
5185 break;
5186 }
5187 }
5188
5189 if (need_newreg)
5190 {
5191 rtx newreg = ira_create_new_reg (dest);
5192
5193 for (use = DF_REG_USE_CHAIN (REGNO (dest)); use; use = next)
5194 {
5195 rtx_insn *uin = DF_REF_INSN (use);
5196 next = DF_REF_NEXT_REG (use);
5197
5198 basic_block ubb = BLOCK_FOR_INSN (uin);
5199 if (ubb == call_dom
5200 || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
5201 validate_change (uin, DF_REF_REAL_LOC (use), newreg, true);
5202 }
5203
5204 rtx_insn *new_move = gen_move_insn (newreg, dest);
5205 emit_insn_after (new_move, bb_note (call_dom));
5206 if (dump_file)
5207 {
5208 fprintf (dump_file, "Split live-range of register ");
5209 print_rtl_single (dump_file, dest);
5210 }
5211 ret = true;
5212 }
5213
5214 if (insn == last_interesting_insn)
5215 break;
5216 }
5217 apply_change_group ();
5218 return ret;
5219 }
5220
5221 /* Perform the second half of the transformation started in
5222 find_moveable_pseudos. We look for instances where the newly introduced
5223 pseudo remains unallocated, and remove it by moving the definition to
5224 just before its use, replacing the move instruction generated by
5225 find_moveable_pseudos. */
5226 static void
5227 move_unallocated_pseudos (void)
5228 {
5229 int i;
5230 for (i = first_moveable_pseudo; i < last_moveable_pseudo; i++)
5231 if (reg_renumber[i] < 0)
5232 {
5233 int idx = i - first_moveable_pseudo;
5234 rtx other_reg = pseudo_replaced_reg[idx];
5235 /* The iterating range [first_moveable_pseudo, last_moveable_pseudo)
5236 covers every new pseudo created in find_moveable_pseudos,
5237 regardless of the validation with it is successful or not.
5238 So we need to skip the pseudos which were used in those failed
5239 validations to avoid unexpected DF info and consequent ICE.
5240 We only set pseudo_replaced_reg[] when the validation is successful
5241 in find_moveable_pseudos, it's enough to check it here. */
5242 if (!other_reg)
5243 continue;
5244 rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (i));
5245 /* The use must follow all definitions of OTHER_REG, so we can
5246 insert the new definition immediately after any of them. */
5247 df_ref other_def = DF_REG_DEF_CHAIN (REGNO (other_reg));
5248 rtx_insn *move_insn = DF_REF_INSN (other_def);
5249 rtx_insn *newinsn = emit_insn_after (PATTERN (def_insn), move_insn);
5250 rtx set;
5251 int success;
5252
5253 if (dump_file)
5254 fprintf (dump_file, "moving def of %d (insn %d now) ",
5255 REGNO (other_reg), INSN_UID (def_insn));
5256
5257 delete_insn (move_insn);
5258 while ((other_def = DF_REG_DEF_CHAIN (REGNO (other_reg))))
5259 delete_insn (DF_REF_INSN (other_def));
5260 delete_insn (def_insn);
5261
5262 set = single_set (newinsn);
5263 success = validate_change (newinsn, &SET_DEST (set), other_reg, 0);
5264 gcc_assert (success);
5265 if (dump_file)
5266 fprintf (dump_file, " %d) rather than keep unallocated replacement %d\n",
5267 INSN_UID (newinsn), i);
5268 SET_REG_N_REFS (i, 0);
5269 }
5270
5271 first_moveable_pseudo = last_moveable_pseudo = 0;
5272 }
5273
5274 \f
5275
5276 /* Code dealing with scratches (changing them onto
5277 pseudos and restoring them from the pseudos).
5278
5279 We change scratches into pseudos at the beginning of IRA to
5280 simplify dealing with them (conflicts, hard register assignments).
5281
5282 If the pseudo denoting scratch was spilled it means that we do not
5283 need a hard register for it. Such pseudos are transformed back to
5284 scratches at the end of LRA. */
5285
5286 /* Description of location of a former scratch operand. */
5287 struct sloc
5288 {
5289 rtx_insn *insn; /* Insn where the scratch was. */
5290 int nop; /* Number of the operand which was a scratch. */
5291 unsigned regno; /* regno gnerated instead of scratch */
5292 int icode; /* Original icode from which scratch was removed. */
5293 };
5294
5295 typedef struct sloc *sloc_t;
5296
5297 /* Locations of the former scratches. */
5298 static vec<sloc_t> scratches;
5299
5300 /* Bitmap of scratch regnos. */
5301 static bitmap_head scratch_bitmap;
5302
5303 /* Bitmap of scratch operands. */
5304 static bitmap_head scratch_operand_bitmap;
5305
5306 /* Return true if pseudo REGNO is made of SCRATCH. */
5307 bool
5308 ira_former_scratch_p (int regno)
5309 {
5310 return bitmap_bit_p (&scratch_bitmap, regno);
5311 }
5312
5313 /* Return true if the operand NOP of INSN is a former scratch. */
5314 bool
5315 ira_former_scratch_operand_p (rtx_insn *insn, int nop)
5316 {
5317 return bitmap_bit_p (&scratch_operand_bitmap,
5318 INSN_UID (insn) * MAX_RECOG_OPERANDS + nop) != 0;
5319 }
5320
5321 /* Register operand NOP in INSN as a former scratch. It will be
5322 changed to scratch back, if it is necessary, at the LRA end. */
5323 void
5324 ira_register_new_scratch_op (rtx_insn *insn, int nop, int icode)
5325 {
5326 rtx op = *recog_data.operand_loc[nop];
5327 sloc_t loc = XNEW (struct sloc);
5328 ira_assert (REG_P (op));
5329 loc->insn = insn;
5330 loc->nop = nop;
5331 loc->regno = REGNO (op);
5332 loc->icode = icode;
5333 scratches.safe_push (loc);
5334 bitmap_set_bit (&scratch_bitmap, REGNO (op));
5335 bitmap_set_bit (&scratch_operand_bitmap,
5336 INSN_UID (insn) * MAX_RECOG_OPERANDS + nop);
5337 add_reg_note (insn, REG_UNUSED, op);
5338 }
5339
5340 /* Return true if string STR contains constraint 'X'. */
5341 static bool
5342 contains_X_constraint_p (const char *str)
5343 {
5344 int c;
5345
5346 while ((c = *str))
5347 {
5348 str += CONSTRAINT_LEN (c, str);
5349 if (c == 'X') return true;
5350 }
5351 return false;
5352 }
5353
5354 /* Change INSN's scratches into pseudos and save their location.
5355 Return true if we changed any scratch. */
5356 bool
5357 ira_remove_insn_scratches (rtx_insn *insn, bool all_p, FILE *dump_file,
5358 rtx (*get_reg) (rtx original))
5359 {
5360 int i;
5361 bool insn_changed_p;
5362 rtx reg, *loc;
5363
5364 extract_insn (insn);
5365 insn_changed_p = false;
5366 for (i = 0; i < recog_data.n_operands; i++)
5367 {
5368 loc = recog_data.operand_loc[i];
5369 if (GET_CODE (*loc) == SCRATCH && GET_MODE (*loc) != VOIDmode)
5370 {
5371 if (! all_p && contains_X_constraint_p (recog_data.constraints[i]))
5372 continue;
5373 insn_changed_p = true;
5374 *loc = reg = get_reg (*loc);
5375 ira_register_new_scratch_op (insn, i, INSN_CODE (insn));
5376 if (ira_dump_file != NULL)
5377 fprintf (dump_file,
5378 "Removing SCRATCH to p%u in insn #%u (nop %d)\n",
5379 REGNO (reg), INSN_UID (insn), i);
5380 }
5381 }
5382 return insn_changed_p;
5383 }
5384
5385 /* Return new register of the same mode as ORIGINAL. Used in
5386 remove_scratches. */
5387 static rtx
5388 get_scratch_reg (rtx original)
5389 {
5390 return gen_reg_rtx (GET_MODE (original));
5391 }
5392
5393 /* Change scratches into pseudos and save their location. Return true
5394 if we changed any scratch. */
5395 static bool
5396 remove_scratches (void)
5397 {
5398 bool change_p = false;
5399 basic_block bb;
5400 rtx_insn *insn;
5401
5402 scratches.create (get_max_uid ());
5403 bitmap_initialize (&scratch_bitmap, &reg_obstack);
5404 bitmap_initialize (&scratch_operand_bitmap, &reg_obstack);
5405 FOR_EACH_BB_FN (bb, cfun)
5406 FOR_BB_INSNS (bb, insn)
5407 if (INSN_P (insn)
5408 && ira_remove_insn_scratches (insn, false, ira_dump_file, get_scratch_reg))
5409 {
5410 /* Because we might use DF, we need to keep DF info up to date. */
5411 df_insn_rescan (insn);
5412 change_p = true;
5413 }
5414 return change_p;
5415 }
5416
5417 /* Changes pseudos created by function remove_scratches onto scratches. */
5418 void
5419 ira_restore_scratches (FILE *dump_file)
5420 {
5421 int regno, n;
5422 unsigned i;
5423 rtx *op_loc;
5424 sloc_t loc;
5425
5426 for (i = 0; scratches.iterate (i, &loc); i++)
5427 {
5428 /* Ignore already deleted insns. */
5429 if (NOTE_P (loc->insn)
5430 && NOTE_KIND (loc->insn) == NOTE_INSN_DELETED)
5431 continue;
5432 extract_insn (loc->insn);
5433 if (loc->icode != INSN_CODE (loc->insn))
5434 {
5435 /* The icode doesn't match, which means the insn has been modified
5436 (e.g. register elimination). The scratch cannot be restored. */
5437 continue;
5438 }
5439 op_loc = recog_data.operand_loc[loc->nop];
5440 if (REG_P (*op_loc)
5441 && ((regno = REGNO (*op_loc)) >= FIRST_PSEUDO_REGISTER)
5442 && reg_renumber[regno] < 0)
5443 {
5444 /* It should be only case when scratch register with chosen
5445 constraint 'X' did not get memory or hard register. */
5446 ira_assert (ira_former_scratch_p (regno));
5447 *op_loc = gen_rtx_SCRATCH (GET_MODE (*op_loc));
5448 for (n = 0; n < recog_data.n_dups; n++)
5449 *recog_data.dup_loc[n]
5450 = *recog_data.operand_loc[(int) recog_data.dup_num[n]];
5451 if (dump_file != NULL)
5452 fprintf (dump_file, "Restoring SCRATCH in insn #%u(nop %d)\n",
5453 INSN_UID (loc->insn), loc->nop);
5454 }
5455 }
5456 for (i = 0; scratches.iterate (i, &loc); i++)
5457 free (loc);
5458 scratches.release ();
5459 bitmap_clear (&scratch_bitmap);
5460 bitmap_clear (&scratch_operand_bitmap);
5461 }
5462
5463 \f
5464
5465 /* If the backend knows where to allocate pseudos for hard
5466 register initial values, register these allocations now. */
5467 static void
5468 allocate_initial_values (void)
5469 {
5470 if (targetm.allocate_initial_value)
5471 {
5472 rtx hreg, preg, x;
5473 int i, regno;
5474
5475 for (i = 0; HARD_REGISTER_NUM_P (i); i++)
5476 {
5477 if (! initial_value_entry (i, &hreg, &preg))
5478 break;
5479
5480 x = targetm.allocate_initial_value (hreg);
5481 regno = REGNO (preg);
5482 if (x && REG_N_SETS (regno) <= 1)
5483 {
5484 if (MEM_P (x))
5485 reg_equiv_memory_loc (regno) = x;
5486 else
5487 {
5488 basic_block bb;
5489 int new_regno;
5490
5491 gcc_assert (REG_P (x));
5492 new_regno = REGNO (x);
5493 reg_renumber[regno] = new_regno;
5494 /* Poke the regno right into regno_reg_rtx so that even
5495 fixed regs are accepted. */
5496 SET_REGNO (preg, new_regno);
5497 /* Update global register liveness information. */
5498 FOR_EACH_BB_FN (bb, cfun)
5499 {
5500 if (REGNO_REG_SET_P (df_get_live_in (bb), regno))
5501 SET_REGNO_REG_SET (df_get_live_in (bb), new_regno);
5502 if (REGNO_REG_SET_P (df_get_live_out (bb), regno))
5503 SET_REGNO_REG_SET (df_get_live_out (bb), new_regno);
5504 }
5505 }
5506 }
5507 }
5508
5509 gcc_checking_assert (! initial_value_entry (FIRST_PSEUDO_REGISTER,
5510 &hreg, &preg));
5511 }
5512 }
5513
5514 \f
5515
5516
5517 /* True when we use LRA instead of reload pass for the current
5518 function. */
5519 bool ira_use_lra_p;
5520
5521 /* True if we have allocno conflicts. It is false for non-optimized
5522 mode or when the conflict table is too big. */
5523 bool ira_conflicts_p;
5524
5525 /* Saved between IRA and reload. */
5526 static int saved_flag_ira_share_spill_slots;
5527
5528 /* This is the main entry of IRA. */
5529 static void
5530 ira (FILE *f)
5531 {
5532 bool loops_p;
5533 int ira_max_point_before_emit;
5534 bool saved_flag_caller_saves = flag_caller_saves;
5535 enum ira_region saved_flag_ira_region = flag_ira_region;
5536 basic_block bb;
5537 edge_iterator ei;
5538 edge e;
5539 bool output_jump_reload_p = false;
5540
5541 if (ira_use_lra_p)
5542 {
5543 /* First put potential jump output reloads on the output edges
5544 as USE which will be removed at the end of LRA. The major
5545 goal is actually to create BBs for critical edges for LRA and
5546 populate them later by live info. In LRA it will be
5547 difficult to do this. */
5548 FOR_EACH_BB_FN (bb, cfun)
5549 {
5550 rtx_insn *end = BB_END (bb);
5551 if (!JUMP_P (end))
5552 continue;
5553 extract_insn (end);
5554 for (int i = 0; i < recog_data.n_operands; i++)
5555 if (recog_data.operand_type[i] != OP_IN)
5556 {
5557 bool skip_p = false;
5558 FOR_EACH_EDGE (e, ei, bb->succs)
5559 if (EDGE_CRITICAL_P (e)
5560 && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
5561 && (e->flags & EDGE_ABNORMAL))
5562 {
5563 skip_p = true;
5564 break;
5565 }
5566 if (skip_p)
5567 break;
5568 output_jump_reload_p = true;
5569 FOR_EACH_EDGE (e, ei, bb->succs)
5570 if (EDGE_CRITICAL_P (e)
5571 && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
5572 {
5573 start_sequence ();
5574 /* We need to put some no-op insn here. We can
5575 not put a note as commit_edges insertion will
5576 fail. */
5577 emit_insn (gen_rtx_USE (VOIDmode, const1_rtx));
5578 rtx_insn *insns = get_insns ();
5579 end_sequence ();
5580 insert_insn_on_edge (insns, e);
5581 }
5582 break;
5583 }
5584 }
5585 if (output_jump_reload_p)
5586 commit_edge_insertions ();
5587 }
5588
5589 if (flag_ira_verbose < 10)
5590 {
5591 internal_flag_ira_verbose = flag_ira_verbose;
5592 ira_dump_file = f;
5593 }
5594 else
5595 {
5596 internal_flag_ira_verbose = flag_ira_verbose - 10;
5597 ira_dump_file = stderr;
5598 }
5599
5600 clear_bb_flags ();
5601
5602 /* Determine if the current function is a leaf before running IRA
5603 since this can impact optimizations done by the prologue and
5604 epilogue thus changing register elimination offsets.
5605 Other target callbacks may use crtl->is_leaf too, including
5606 SHRINK_WRAPPING_ENABLED, so initialize as early as possible. */
5607 crtl->is_leaf = leaf_function_p ();
5608
5609 /* Perform target specific PIC register initialization. */
5610 targetm.init_pic_reg ();
5611
5612 ira_conflicts_p = optimize > 0;
5613
5614 /* Determine the number of pseudos actually requiring coloring. */
5615 unsigned int num_used_regs = 0;
5616 for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
5617 if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
5618 num_used_regs++;
5619
5620 /* If there are too many pseudos and/or basic blocks (e.g. 10K
5621 pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
5622 use simplified and faster algorithms in LRA. */
5623 lra_simple_p
5624 = ira_use_lra_p
5625 && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
5626
5627 if (lra_simple_p)
5628 {
5629 /* It permits to skip live range splitting in LRA. */
5630 flag_caller_saves = false;
5631 /* There is no sense to do regional allocation when we use
5632 simplified LRA. */
5633 flag_ira_region = IRA_REGION_ONE;
5634 ira_conflicts_p = false;
5635 }
5636
5637 #ifndef IRA_NO_OBSTACK
5638 gcc_obstack_init (&ira_obstack);
5639 #endif
5640 bitmap_obstack_initialize (&ira_bitmap_obstack);
5641
5642 /* LRA uses its own infrastructure to handle caller save registers. */
5643 if (flag_caller_saves && !ira_use_lra_p)
5644 init_caller_save ();
5645
5646 setup_prohibited_mode_move_regs ();
5647 decrease_live_ranges_number ();
5648 df_note_add_problem ();
5649
5650 /* DF_LIVE can't be used in the register allocator, too many other
5651 parts of the compiler depend on using the "classic" liveness
5652 interpretation of the DF_LR problem. See PR38711.
5653 Remove the problem, so that we don't spend time updating it in
5654 any of the df_analyze() calls during IRA/LRA. */
5655 if (optimize > 1)
5656 df_remove_problem (df_live);
5657 gcc_checking_assert (df_live == NULL);
5658
5659 if (flag_checking)
5660 df->changeable_flags |= DF_VERIFY_SCHEDULED;
5661
5662 df_analyze ();
5663
5664 init_reg_equiv ();
5665 if (ira_conflicts_p)
5666 {
5667 calculate_dominance_info (CDI_DOMINATORS);
5668
5669 if (split_live_ranges_for_shrink_wrap ())
5670 df_analyze ();
5671
5672 free_dominance_info (CDI_DOMINATORS);
5673 }
5674
5675 df_clear_flags (DF_NO_INSN_RESCAN);
5676
5677 indirect_jump_optimize ();
5678 if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
5679 df_analyze ();
5680
5681 regstat_init_n_sets_and_refs ();
5682 regstat_compute_ri ();
5683
5684 /* If we are not optimizing, then this is the only place before
5685 register allocation where dataflow is done. And that is needed
5686 to generate these warnings. */
5687 if (warn_clobbered)
5688 generate_setjmp_warnings ();
5689
5690 /* update_equiv_regs can use reg classes of pseudos and they are set up in
5691 register pressure sensitive scheduling and loop invariant motion and in
5692 live range shrinking. This info can become obsolete if we add new pseudos
5693 since the last set up. Recalculate it again if the new pseudos were
5694 added. */
5695 if (resize_reg_info () && (flag_sched_pressure || flag_live_range_shrinkage
5696 || flag_ira_loop_pressure))
5697 ira_set_pseudo_classes (true, ira_dump_file);
5698
5699 init_alias_analysis ();
5700 loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
5701 reg_equiv = XCNEWVEC (struct equivalence, max_reg_num ());
5702 update_equiv_regs_prescan ();
5703 update_equiv_regs ();
5704
5705 /* Don't move insns if live range shrinkage or register
5706 pressure-sensitive scheduling were done because it will not
5707 improve allocation but likely worsen insn scheduling. */
5708 if (optimize
5709 && !flag_live_range_shrinkage
5710 && !(flag_sched_pressure && flag_schedule_insns))
5711 combine_and_move_insns ();
5712
5713 /* Gather additional equivalences with memory. */
5714 if (optimize)
5715 add_store_equivs ();
5716
5717 loop_optimizer_finalize ();
5718 free_dominance_info (CDI_DOMINATORS);
5719 end_alias_analysis ();
5720 free (reg_equiv);
5721
5722 /* Once max_regno changes, we need to free and re-init/re-compute
5723 some data structures like regstat_n_sets_and_refs and reg_info_p. */
5724 auto regstat_recompute_for_max_regno = []() {
5725 regstat_free_n_sets_and_refs ();
5726 regstat_free_ri ();
5727 regstat_init_n_sets_and_refs ();
5728 regstat_compute_ri ();
5729 resize_reg_info ();
5730 };
5731
5732 int max_regno_before_rm = max_reg_num ();
5733 if (ira_use_lra_p && remove_scratches ())
5734 {
5735 ira_expand_reg_equiv ();
5736 /* For now remove_scatches is supposed to create pseudos when it
5737 succeeds, assert this happens all the time. Once it doesn't
5738 hold, we should guard the regstat recompute for the case
5739 max_regno changes. */
5740 gcc_assert (max_regno_before_rm != max_reg_num ());
5741 regstat_recompute_for_max_regno ();
5742 }
5743
5744 setup_reg_equiv ();
5745 grow_reg_equivs ();
5746 setup_reg_equiv_init ();
5747
5748 allocated_reg_info_size = max_reg_num ();
5749
5750 /* It is not worth to do such improvement when we use a simple
5751 allocation because of -O0 usage or because the function is too
5752 big. */
5753 if (ira_conflicts_p)
5754 find_moveable_pseudos ();
5755
5756 max_regno_before_ira = max_reg_num ();
5757 ira_setup_eliminable_regset ();
5758
5759 ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
5760 ira_load_cost = ira_store_cost = ira_shuffle_cost = 0;
5761 ira_move_loops_num = ira_additional_jumps_num = 0;
5762
5763 ira_assert (current_loops == NULL);
5764 if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == IRA_REGION_MIXED)
5765 loop_optimizer_init (AVOID_CFG_MODIFICATIONS | LOOPS_HAVE_RECORDED_EXITS);
5766
5767 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
5768 fprintf (ira_dump_file, "Building IRA IR\n");
5769 loops_p = ira_build ();
5770
5771 ira_assert (ira_conflicts_p || !loops_p);
5772
5773 saved_flag_ira_share_spill_slots = flag_ira_share_spill_slots;
5774 if (too_high_register_pressure_p () || cfun->calls_setjmp)
5775 /* It is just wasting compiler's time to pack spilled pseudos into
5776 stack slots in this case -- prohibit it. We also do this if
5777 there is setjmp call because a variable not modified between
5778 setjmp and longjmp the compiler is required to preserve its
5779 value and sharing slots does not guarantee it. */
5780 flag_ira_share_spill_slots = FALSE;
5781
5782 ira_color ();
5783
5784 ira_max_point_before_emit = ira_max_point;
5785
5786 ira_initiate_emit_data ();
5787
5788 ira_emit (loops_p);
5789
5790 max_regno = max_reg_num ();
5791 if (ira_conflicts_p)
5792 {
5793 if (! loops_p)
5794 {
5795 if (! ira_use_lra_p)
5796 ira_initiate_assign ();
5797 }
5798 else
5799 {
5800 expand_reg_info ();
5801
5802 if (ira_use_lra_p)
5803 {
5804 ira_allocno_t a;
5805 ira_allocno_iterator ai;
5806
5807 FOR_EACH_ALLOCNO (a, ai)
5808 {
5809 int old_regno = ALLOCNO_REGNO (a);
5810 int new_regno = REGNO (ALLOCNO_EMIT_DATA (a)->reg);
5811
5812 ALLOCNO_REGNO (a) = new_regno;
5813
5814 if (old_regno != new_regno)
5815 setup_reg_classes (new_regno, reg_preferred_class (old_regno),
5816 reg_alternate_class (old_regno),
5817 reg_allocno_class (old_regno));
5818 }
5819 }
5820 else
5821 {
5822 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
5823 fprintf (ira_dump_file, "Flattening IR\n");
5824 ira_flattening (max_regno_before_ira, ira_max_point_before_emit);
5825 }
5826 /* New insns were generated: add notes and recalculate live
5827 info. */
5828 df_analyze ();
5829
5830 /* ??? Rebuild the loop tree, but why? Does the loop tree
5831 change if new insns were generated? Can that be handled
5832 by updating the loop tree incrementally? */
5833 loop_optimizer_finalize ();
5834 free_dominance_info (CDI_DOMINATORS);
5835 loop_optimizer_init (AVOID_CFG_MODIFICATIONS
5836 | LOOPS_HAVE_RECORDED_EXITS);
5837
5838 if (! ira_use_lra_p)
5839 {
5840 setup_allocno_assignment_flags ();
5841 ira_initiate_assign ();
5842 ira_reassign_conflict_allocnos (max_regno);
5843 }
5844 }
5845 }
5846
5847 ira_finish_emit_data ();
5848
5849 setup_reg_renumber ();
5850
5851 calculate_allocation_cost ();
5852
5853 #ifdef ENABLE_IRA_CHECKING
5854 if (ira_conflicts_p && ! ira_use_lra_p)
5855 /* Opposite to reload pass, LRA does not use any conflict info
5856 from IRA. We don't rebuild conflict info for LRA (through
5857 ira_flattening call) and cannot use the check here. We could
5858 rebuild this info for LRA in the check mode but there is a risk
5859 that code generated with the check and without it will be a bit
5860 different. Calling ira_flattening in any mode would be a
5861 wasting CPU time. So do not check the allocation for LRA. */
5862 check_allocation ();
5863 #endif
5864
5865 if (max_regno != max_regno_before_ira)
5866 regstat_recompute_for_max_regno ();
5867
5868 overall_cost_before = ira_overall_cost;
5869 if (! ira_conflicts_p)
5870 grow_reg_equivs ();
5871 else
5872 {
5873 fix_reg_equiv_init ();
5874
5875 #ifdef ENABLE_IRA_CHECKING
5876 print_redundant_copies ();
5877 #endif
5878 if (! ira_use_lra_p)
5879 {
5880 ira_spilled_reg_stack_slots_num = 0;
5881 ira_spilled_reg_stack_slots
5882 = ((class ira_spilled_reg_stack_slot *)
5883 ira_allocate (max_regno
5884 * sizeof (class ira_spilled_reg_stack_slot)));
5885 memset ((void *)ira_spilled_reg_stack_slots, 0,
5886 max_regno * sizeof (class ira_spilled_reg_stack_slot));
5887 }
5888 }
5889 allocate_initial_values ();
5890
5891 /* See comment for find_moveable_pseudos call. */
5892 if (ira_conflicts_p)
5893 move_unallocated_pseudos ();
5894
5895 /* Restore original values. */
5896 if (lra_simple_p)
5897 {
5898 flag_caller_saves = saved_flag_caller_saves;
5899 flag_ira_region = saved_flag_ira_region;
5900 }
5901 }
5902
5903 /* Modify asm goto to avoid further trouble with this insn. We can
5904 not replace the insn by USE as in other asm insns as we still
5905 need to keep CFG consistency. */
5906 void
5907 ira_nullify_asm_goto (rtx_insn *insn)
5908 {
5909 ira_assert (JUMP_P (insn) && INSN_CODE (insn) < 0);
5910 rtx tmp = extract_asm_operands (PATTERN (insn));
5911 PATTERN (insn) = gen_rtx_ASM_OPERANDS (VOIDmode, ggc_strdup (""), "", 0,
5912 rtvec_alloc (0),
5913 rtvec_alloc (0),
5914 ASM_OPERANDS_LABEL_VEC (tmp),
5915 ASM_OPERANDS_SOURCE_LOCATION(tmp));
5916 }
5917
5918 static void
5919 do_reload (void)
5920 {
5921 basic_block bb;
5922 bool need_dce;
5923 unsigned pic_offset_table_regno = INVALID_REGNUM;
5924
5925 if (flag_ira_verbose < 10)
5926 ira_dump_file = dump_file;
5927
5928 /* If pic_offset_table_rtx is a pseudo register, then keep it so
5929 after reload to avoid possible wrong usages of hard reg assigned
5930 to it. */
5931 if (pic_offset_table_rtx
5932 && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER)
5933 pic_offset_table_regno = REGNO (pic_offset_table_rtx);
5934
5935 timevar_push (TV_RELOAD);
5936 if (ira_use_lra_p)
5937 {
5938 if (current_loops != NULL)
5939 {
5940 loop_optimizer_finalize ();
5941 free_dominance_info (CDI_DOMINATORS);
5942 }
5943 FOR_ALL_BB_FN (bb, cfun)
5944 bb->loop_father = NULL;
5945 current_loops = NULL;
5946
5947 ira_destroy ();
5948
5949 lra (ira_dump_file);
5950 /* ???!!! Move it before lra () when we use ira_reg_equiv in
5951 LRA. */
5952 vec_free (reg_equivs);
5953 reg_equivs = NULL;
5954 need_dce = false;
5955 }
5956 else
5957 {
5958 df_set_flags (DF_NO_INSN_RESCAN);
5959 build_insn_chain ();
5960
5961 need_dce = reload (get_insns (), ira_conflicts_p);
5962 }
5963
5964 timevar_pop (TV_RELOAD);
5965
5966 timevar_push (TV_IRA);
5967
5968 if (ira_conflicts_p && ! ira_use_lra_p)
5969 {
5970 ira_free (ira_spilled_reg_stack_slots);
5971 ira_finish_assign ();
5972 }
5973
5974 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL
5975 && overall_cost_before != ira_overall_cost)
5976 fprintf (ira_dump_file, "+++Overall after reload %" PRId64 "\n",
5977 ira_overall_cost);
5978
5979 flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots;
5980
5981 if (! ira_use_lra_p)
5982 {
5983 ira_destroy ();
5984 if (current_loops != NULL)
5985 {
5986 loop_optimizer_finalize ();
5987 free_dominance_info (CDI_DOMINATORS);
5988 }
5989 FOR_ALL_BB_FN (bb, cfun)
5990 bb->loop_father = NULL;
5991 current_loops = NULL;
5992
5993 regstat_free_ri ();
5994 regstat_free_n_sets_and_refs ();
5995 }
5996
5997 if (optimize)
5998 cleanup_cfg (CLEANUP_EXPENSIVE);
5999
6000 finish_reg_equiv ();
6001
6002 bitmap_obstack_release (&ira_bitmap_obstack);
6003 #ifndef IRA_NO_OBSTACK
6004 obstack_free (&ira_obstack, NULL);
6005 #endif
6006
6007 /* The code after the reload has changed so much that at this point
6008 we might as well just rescan everything. Note that
6009 df_rescan_all_insns is not going to help here because it does not
6010 touch the artificial uses and defs. */
6011 df_finish_pass (true);
6012 df_scan_alloc (NULL);
6013 df_scan_blocks ();
6014
6015 if (optimize > 1)
6016 {
6017 df_live_add_problem ();
6018 df_live_set_all_dirty ();
6019 }
6020
6021 if (optimize)
6022 df_analyze ();
6023
6024 if (need_dce && optimize)
6025 run_fast_dce ();
6026
6027 /* Diagnose uses of the hard frame pointer when it is used as a global
6028 register. Often we can get away with letting the user appropriate
6029 the frame pointer, but we should let them know when code generation
6030 makes that impossible. */
6031 if (global_regs[HARD_FRAME_POINTER_REGNUM] && frame_pointer_needed)
6032 {
6033 tree decl = global_regs_decl[HARD_FRAME_POINTER_REGNUM];
6034 error_at (DECL_SOURCE_LOCATION (current_function_decl),
6035 "frame pointer required, but reserved");
6036 inform (DECL_SOURCE_LOCATION (decl), "for %qD", decl);
6037 }
6038
6039 /* If we are doing generic stack checking, give a warning if this
6040 function's frame size is larger than we expect. */
6041 if (flag_stack_check == GENERIC_STACK_CHECK)
6042 {
6043 poly_int64 size = get_frame_size () + STACK_CHECK_FIXED_FRAME_SIZE;
6044
6045 for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
6046 if (df_regs_ever_live_p (i)
6047 && !fixed_regs[i]
6048 && !crtl->abi->clobbers_full_reg_p (i))
6049 size += UNITS_PER_WORD;
6050
6051 if (constant_lower_bound (size) > STACK_CHECK_MAX_FRAME_SIZE)
6052 warning (0, "frame size too large for reliable stack checking");
6053 }
6054
6055 if (pic_offset_table_regno != INVALID_REGNUM)
6056 pic_offset_table_rtx = gen_rtx_REG (Pmode, pic_offset_table_regno);
6057
6058 timevar_pop (TV_IRA);
6059 }
6060 \f
6061 /* Run the integrated register allocator. */
6062
6063 namespace {
6064
6065 const pass_data pass_data_ira =
6066 {
6067 RTL_PASS, /* type */
6068 "ira", /* name */
6069 OPTGROUP_NONE, /* optinfo_flags */
6070 TV_IRA, /* tv_id */
6071 0, /* properties_required */
6072 0, /* properties_provided */
6073 0, /* properties_destroyed */
6074 0, /* todo_flags_start */
6075 TODO_do_not_ggc_collect, /* todo_flags_finish */
6076 };
6077
6078 class pass_ira : public rtl_opt_pass
6079 {
6080 public:
6081 pass_ira (gcc::context *ctxt)
6082 : rtl_opt_pass (pass_data_ira, ctxt)
6083 {}
6084
6085 /* opt_pass methods: */
6086 bool gate (function *) final override
6087 {
6088 return !targetm.no_register_allocation;
6089 }
6090 unsigned int execute (function *) final override
6091 {
6092 ira (dump_file);
6093 return 0;
6094 }
6095
6096 }; // class pass_ira
6097
6098 } // anon namespace
6099
6100 rtl_opt_pass *
6101 make_pass_ira (gcc::context *ctxt)
6102 {
6103 return new pass_ira (ctxt);
6104 }
6105
6106 namespace {
6107
6108 const pass_data pass_data_reload =
6109 {
6110 RTL_PASS, /* type */
6111 "reload", /* name */
6112 OPTGROUP_NONE, /* optinfo_flags */
6113 TV_RELOAD, /* tv_id */
6114 0, /* properties_required */
6115 0, /* properties_provided */
6116 0, /* properties_destroyed */
6117 0, /* todo_flags_start */
6118 0, /* todo_flags_finish */
6119 };
6120
6121 class pass_reload : public rtl_opt_pass
6122 {
6123 public:
6124 pass_reload (gcc::context *ctxt)
6125 : rtl_opt_pass (pass_data_reload, ctxt)
6126 {}
6127
6128 /* opt_pass methods: */
6129 bool gate (function *) final override
6130 {
6131 return !targetm.no_register_allocation;
6132 }
6133 unsigned int execute (function *) final override
6134 {
6135 do_reload ();
6136 return 0;
6137 }
6138
6139 }; // class pass_reload
6140
6141 } // anon namespace
6142
6143 rtl_opt_pass *
6144 make_pass_reload (gcc::context *ctxt)
6145 {
6146 return new pass_reload (ctxt);
6147 }