Ensure appropriate functions are inlined. This was seen to
be required with gcc 4.6.0 with -O2 on x86_64 at least.
It was reported that gcc 4.8.0 did inline these functions though.
Also reinstate the bit vector for the common case,
to further improve performance.
Benchmark results for both aspects of this change are:
$ yes abcdfeg | head -n1MB > big-file
$ for c in orig inline inline-array; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 big-file > /dev/null
done
== orig ==
real 0m0.088s
user 0m0.081s
sys 0m0.007s
== inline ==
real 0m0.070s
user 0m0.060s
sys 0m0.009s
== inline-array ==
real 0m0.049s
user 0m0.044s
sys 0m0.005s
* src/cut.c (set_fields): Set up the printable_field bit vector
for performance, but only when it's appropriate. I.E. not
when either --output-delimeter or huge ranges are specified.
(next_item): Ensure it's inlined and avoid unnecessary processing.
(print_kth): Ensure it's inlined and add a branch for the fast path.
Related to http://bugs.gnu.org/13127