git.ipfire.org Git - thirdparty/gcc.git/commit

middle-end: use two's complement equality when comparing IVs during candidate selection  [PR114932]

IVOPTS normally uses affine trees to perform comparisons between different IVs,
but these seem to have been missing in two key spots and instead normal tree
equivalencies used.

In some cases where we have a two-complements equivalence but not a strict
signedness equivalencies we end up generating both a signed and unsigned IV for
the same candidate.

This patch implements a new OEP flag called OEP_ASSUME_WRAPV.  This flag will
check if the operands would produce the same bit values after the computations
even if the final sign is different.

This happens quite a lot with Fortran but can also happen in C because this came
code is unable to figure out when one expression is a multiple of another.

As an example in the attached testcase we get:

Initial set of candidates:
  cost: 24 (complexity 3)
  reg_cost: 9
  cand_cost: 15
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6, 8
   group:0 --> iv_cand:6, cost=(0,1)
   group:1 --> iv_cand:1, cost=(0,0)
   group:2 --> iv_cand:8, cost=(0,1)
   group:3 --> iv_cand:8, cost=(0,1)
  invariant variables: 6
  invariant expressions: 1, 2

<Invariant Expressions>:
inv_expr 1:     stride.3_27 * 4
inv_expr 2:     (unsigned long) stride.3_27 * 4

These end up being used in the same group:

Group 1:
cand  cost    compl.  inv.expr.       inv.vars
1     0       0       NIL;    6
2     0       0       NIL;    6
3     0       0       NIL;    6

which ends up with IV opts picking the signed and unsigned IVs:

Improved to:
  cost: 24 (complexity 3)
  reg_cost: 9
  cand_cost: 15
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6, 8
   group:0 --> iv_cand:6, cost=(0,1)
   group:1 --> iv_cand:1, cost=(0,0)
   group:2 --> iv_cand:8, cost=(0,1)
   group:3 --> iv_cand:8, cost=(0,1)
  invariant variables: 6
  invariant expressions: 1, 2

and so generates the same IV as both signed and unsigned:

;;   basic block 21, loop depth 3, count 214748368 (estimated locally, freq 58.2545), maybe hot
;;    prev block 28, next block 31, flags: (NEW, REACHABLE, VISITED)
;;    pred:       28 [always]  count:23622320 (estimated locally, freq 6.4080) (FALLTHRU,EXECUTABLE)
;;                25 [always]  count:191126046 (estimated locally, freq 51.8465) (FALLTHRU,DFS_BACK,EXECUTABLE)
  # .MEM_66 = PHI <.MEM_34(28), .MEM_22(25)>
  # ivtmp.22_41 = PHI <0(28), ivtmp.22_82(25)>
  # ivtmp.26_51 = PHI <ivtmp.26_55(28), ivtmp.26_72(25)>
  # ivtmp.28_90 = PHI <ivtmp.28_99(28), ivtmp.28_98(25)>

...

;;   basic block 24, loop depth 3, count 214748366 (estimated locally, freq 58.2545), maybe hot
;;    prev block 22, next block 25, flags: (NEW, REACHABLE, VISITED)'
;;    pred:       22 [always]  count:95443719 (estimated locally, freq 25.8909) (FALLTHRU)
;;                21 [33.3% (guessed)]  count:71582790 (estimated locally, freq 19.4182) (TRUE_VALUE,EXECUTABLE)
;;                31 [33.3% (guessed)]  count:47721860 (estimated locally, freq 12.9455) (TRUE_VALUE,EXECUTABLE)
# .MEM_22 = PHI <.MEM_44(22), .MEM_31(21), .MEM_79(31)>
ivtmp.22_82 = ivtmp.22_41 + 1;
ivtmp.26_72 = ivtmp.26_51 + _80;
ivtmp.28_98 = ivtmp.28_90 + _39;

These two IVs are always used as unsigned, so IV ops generates:

  _73 = stride.3_27 * 4;
  _80 = (unsigned long) _73;
  _54 = (unsigned long) stride.3_27;
  _39 = _54 * 4;

Which means that in e.g. exchange2 we generate a lot of duplicate code.

This is because candidate 6 and 8 are equivalent under two's complement but have
different signs.

This patch changes it so that if you have two IVs that are affine equivalent to
just pick one over the other.  IV already has code for this, so the patch just
uses affine trees instead of tree for the check.

With it we get:

<Invariant Expressions>:
inv_expr 1:     stride.3_27 * 4

<Group-candidate Costs>:
Group 0:
  cand  cost    compl.  inv.expr.       inv.vars
  5     0       2       NIL;    NIL;
  6     0       3       NIL;    NIL;

Group 1:
  cand  cost    compl.  inv.expr.       inv.vars
  1     0       0       NIL;    6
  2     0       0       NIL;    6
  3     0       0       NIL;    6
  4     0       0       NIL;    6

Initial set of candidates:
  cost: 16 (complexity 3)
  reg_cost: 6
  cand_cost: 10
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6
   group:0 --> iv_cand:6, cost=(0,3)
   group:1 --> iv_cand:1, cost=(0,0)
  invariant variables: 6
  invariant expressions: 1

gcc/ChangeLog:

PR tree-optimization/114932
* fold-const.cc (operand_compare::operand_equal_p): Use it.
(operand_compare::verify_hash_value): Likewise.
(operand_compare::hash_operand): Likewise.
(test_operand_equality::test): New.
(fold_const_cc_tests): Use it.
* tree-core.h (enum operand_equal_flag): Add OEP_ASSUME_WRAPV.
* tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.

gcc/testsuite/ChangeLog:

PR tree-optimization/114932
* gfortran.dg/addressing-modes_2.f90: New test.

author	Tamar Christina <tamar.christina@arm.com>
	Wed, 11 Dec 2024 11:47:49 +0000 (11:47 +0000)
committer	Tamar Christina <tamar.christina@arm.com>
	Wed, 11 Dec 2024 11:47:49 +0000 (11:47 +0000)
commit	9403b035befe3537c343f7430e321468c0f2c28b
tree	e3acdc3ec11b300622407b73fbffa8382a19924d	tree
parent	3c32575e5b6370270d38a80a7fa8eaa144e083d0	commit \| diff

gcc/fold-const.cc		diff \| blob \| blame \| history
gcc/testsuite/gfortran.dg/addressing-modes_2.f90	[new file with mode: 0644]	blob
gcc/tree-core.h		diff \| blob \| blame \| history
gcc/tree-ssa-loop-ivopts.cc		diff \| blob \| blame \| history