With some recent optimizations, -O1/-O2/-O3 can achieve almost the same
performance/size using plain stack loads/stores. As a result, lwm/swm
save/restore fewer callee-saved registers; in fact, only $16 is saved
with swm. To make sure the lwm/swm optimization is still exercised,
add two more function calls so that lwm/swm becomes much more
profitable. With only one more call, -O1 still uses stack loads/stores.
gcc/testsuite
* gcc.target/mips/umips-save-restore-1.c: Be sure lwm/swm
are used for more callee-saved registers by adding two
more function calls.
MICROMIPS int
foo (int n, int a, int b, int c, int d)
{
- int i, j;
+ int i, j, k, l;
i = bar (n, a, b, c, d);
j = bar (n, a, b, c, d);
- return i + j;
+ k = bar (n, a, b, c, d);
+ l = bar (n, a, b, c, d);
+ return i + j + k + l;
}
-/* { dg-final { scan-assembler "\tswm\t\\\$16-\\\$2(0|1),\\\$31" } } */
-/* { dg-final { scan-assembler "\tlwm\t\\\$16-\\\$2(0|1),\\\$31" } } */
+/* { dg-final { scan-assembler "\tswm\t\\\$16-\\\$2(2|3),\\\$31" } } */
+/* { dg-final { scan-assembler "\tlwm\t\\\$16-\\\$2(2|3),\\\$31" } } */