wc: speedup counting of short lines
author Kristoffer Brånemyr <ztion1@yahoo.se>
Wed, 18 Mar 2015 15:32:19 +0000 (15:32 +0000)
committer Pádraig Brady <P@draigBrady.com>
Fri, 20 Mar 2015 00:48:52 +0000 (00:48 +0000)
commit 1025243b6a0c8b8830b2d3676a97dae83c74d284
tree 94325b00bc4610e4af13fb6a88af7496ad683a04
parent e2e11119e0ac653bd0bdab91c189b7803f8df1f0

Using a test file generated with:
  yes | head -n100M > 2x100M.txt

before> time wc -l 2x100M.txt
  real 0.842s
  user 0.810s
  sys  0.033s

after> time wc -l 2x100M.txt
  real 0.142s
  user 0.111s
  sys  0.031s

* src/wc.c (wc): Split the loop that deals with -l into 3.
The first is used at the start of the input to determine whether
the average line length is < 15, and if so the second loop is
used to look for '\n' inline within wc.  For longer lines,
memchr is used as before to take advantage of system-specific
optimizations, which may outweigh function call overhead.
Note the first 2 loops could be combined, though in testing
GCC 4.9.2, at least, wasn't sophisticated enough to separate
the loops based on the "check_len" invariant.
Note also that __builtin_memchr() isn't significant here, as
GCC currently only applies constant folding to it.
* NEWS: Mention the improvement.
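
The three-loop split described above can be sketched roughly as follows.
This is an illustration only, not the code from the patch: the function
name count_lines, the SAMPLE_SIZE constant, and the in-memory buffer
interface are all assumptions made for the example; only the "< 15"
threshold and the use of memchr for longer lines come from the
description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values: SAMPLE_SIZE is an assumption of this sketch;
   the threshold of 15 is the figure quoted in the commit message.  */
enum { SAMPLE_SIZE = 4096, SHORT_LINE_LIMIT = 15 };

/* Hypothetical helper: count the '\n' bytes in buf[0..len).  */
static uintmax_t
count_lines (const char *buf, size_t len)
{
  const char *p = buf;
  const char *end = buf + len;
  uintmax_t lines = 0;

  /* Loop 1: scan the start of the input byte by byte, so the
     average line length can be estimated from the sample.  */
  size_t sample = len < SAMPLE_SIZE ? len : SAMPLE_SIZE;
  const char *sample_end = buf + sample;
  while (p != sample_end)
    lines += *p++ == '\n';

  /* Average length = sample / lines; compare without dividing.
     With no newline in the sample, the lines are treated as long.  */
  bool short_lines = sample < lines * SHORT_LINE_LIMIT;

  if (short_lines)
    {
      /* Loop 2: short lines, so avoid a function call per line.  */
      while (p != end)
        lines += *p++ == '\n';
    }
  else
    {
      /* Loop 3: longer lines, so let memchr do the searching.  */
      while ((p = memchr (p, '\n', (size_t) (end - p))) != NULL)
        {
          ++p;
          ++lines;
        }
    }

  return lines;
}
```

Calling count_lines (buf, n) on a buffer of n bytes returns the number
of newline characters in it, whichever loop ends up doing the work.
A real implementation would read the input in blocks rather than
require it all in memory.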
NEWS
src/wc.c