lower-bitint: Fix up handling of nested casts in mergeable stmt handling [PR112941]
The following patch fixes 2 issues in handling of casts for mergeable
stmts.
The first hunk fixes the case when we have two nested casts (typically
after optimization that is zero-extension of a sign-extension because
everything else should have been folded into a single cast). If
the lowering of the outer cast needs to make the code conditional
(e.g.
for (...)
{
if (idx <= 32)
{
if (idx < 32)
{ ... handle_operand (idx); ... }
else
{ ... handle_operand (32); ... }
}
...
}
) and the lowering of the inner one as well, right now it creates invalid
SSA form, because even for the inner cast we need a PHI on the loop
and the PHI argument from the latch edge is a SSA_NAME initialized in
the conditionally executed bb. The hunk fixes that by detecting such
a case and adding further PHI nodes at the end of the ifs such that
the right value propagates to the next loop iteration. We can use
0 arguments for the other edges because the inner operand handling
is only done for the first set of iterations and then the other ifs take
over.
The rest fixes a case of again invalid SSA form, when for a sign extension
we need to use the 0 or -1 value initialized by earlier iteration in
a constant idx case, the code was using the value of the loop PHI argument
from latch edge rather than result; that is correct for cases expanded
in straight line code after the loop, but not inside of the loop for the
cases of handle_cast conditionals, there we should use PHI result. This
is done in the second hunk and supported by the remaining hunks, where
it clears m_bb to tell the code we aren't in the loop anymore.
Note, this patch doesn't deal with similar problems during multiplication,
division, floating casts etc. where we just emit a library call. I'll
need to make sure in that case we don't merge more than one cast per
operand.
2023-12-20 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112941
* gimple-lower-bitint.cc (bitint_large_huge::handle_cast): If
save_cast_conditional, instead of adding assignment of t4 to
m_data[save_data_cnt + 1] before m_gsi, add phi nodes such that
t4 propagates to m_bb loop. For constant idx, use
m_data[save_data_cnt] rather than m_data[save_data_cnt + 1] if inside
of the m_bb loop.
(bitint_large_huge::lower_mergeable_stmt): Clear m_bb when no longer
expanding inside of that loop.
(bitint_large_huge::lower_comparison_stmt): Likewise.
(bitint_large_huge::lower_addsub_overflow): Likewise.
(bitint_large_huge::lower_mul_overflow): Likewise.
(bitint_large_huge::lower_bit_query): Likewise.