]> git.ipfire.org Git - thirdparty/coreutils.git/commit
yes: output data more efficiently
authorPádraig Brady <P@draigBrady.com>
Mon, 9 Mar 2015 19:27:32 +0000 (19:27 +0000)
committerPádraig Brady <P@draigBrady.com>
Tue, 10 Mar 2015 00:12:30 +0000 (00:12 +0000)
commit35217221c211f3116f374f305654462195aa634a
tree233b01c55819c0d725c2f6ace6c736bd857a0b9f
parent3c3b760c34834638edf6e875fbb722f4db7cc938
yes: output data more efficiently

yes(1) may be used to generate repeating patterns of text
for test inputs etc., so adjust to be more efficient.

Profiling the case where yes(1) is outputting small items
through stdio (which was the default case), shows the overhead
of continuously processing small items in main() and in stdio:

    $ yes >/dev/null & perf top -p $!
    31.02%  yes           [.] main
    27.36%  libc-2.20.so  [.] _IO_file_xsputn@@GLIBC_2.2.5
    14.51%  libc-2.20.so  [.] fputs_unlocked
    13.50%  libc-2.20.so  [.] strlen
    10.66%  libc-2.20.so  [.] __GI___mempcpy
     1.98%  yes           [.] fputs_unlocked@plta

Sending more data per stdio call improves the situation,
but still, there is significant stdio overhead due to memory copies,
and the repeated string length checking:

    $ yes "`echo {1..1000}`" >/dev/null & perf top -p $!
    42.26%  libc-2.20.so  [.] __GI___mempcpy
    17.38%  libc-2.20.so  [.] strlen
     5.21%  [kernel]      [k] __srcu_read_lock
     4.58%  [kernel]      [k] __srcu_read_unlock
     4.27%  libc-2.20.so  [.] _IO_file_xsputn@@GLIBC_2.2.5
     2.50%  libc-2.20.so  [.] __GI___libc_write
     2.45%  [kernel]      [k] system_call
     2.40%  [kernel]      [k] system_call_after_swapgs
     2.27%  [kernel]      [k] vfs_write
     2.09%  libc-2.20.so  [.] _IO_do_write@@GLIBC_2.2.5
     2.01%  [kernel]      [k] fsnotify
     1.95%  libc-2.20.so  [.] _IO_file_write@@GLIBC_2.2.5
     1.44%  yes           [.] main

We can avoid all stdio overhead by building up the buffer
_once_ and outputting that, and the profile below shows
the bottleneck moved to the kernel:

    $ src/yes >/dev/null & perf top -p $!
    15.42%  [kernel]      [k] __srcu_read_lock
    12.98%  [kernel]      [k] __srcu_read_unlock
     9.41%  libc-2.20.so  [.] __GI___libc_write
     9.11%  [kernel]      [k] vfs_write
     8.35%  [kernel]      [k] fsnotify
     8.02%  [kernel]      [k] system_call
     5.84%  [kernel]      [k] system_call_after_swapgs
     4.54%  [kernel]      [k] __fget_light
     3.98%  [kernel]      [k] sys_write
     3.65%  [kernel]      [k] selinux_file_permission
     3.44%  [kernel]      [k] rw_verify_area
     2.94%  [kernel]      [k] __fsnotify_parent
     2.76%  [kernel]      [k] security_file_permission
     2.39%  yes           [.] main
     2.17%  [kernel]      [k] __fdget_pos
     2.13%  [kernel]      [k] sysret_check
     0.81%  [kernel]      [k] write_null
     0.36%  yes           [.] write@plt

Note this change also ensures that yes(1) will only write
complete lines for lines shorter than BUFSIZ.

* src/yes.c (main): Build up a BUFSIZ buffer of lines,
and output that, rather than having stdio process each item.
* tests/misc/yes.sh: Add a new test for various buffer sizes.
* tests/local.mk: Reference the new test.
Fixes http://bugs.gnu.org/20029
src/yes.c
tests/local.mk
tests/misc/yes.sh [new file with mode: 0755]