From: Pádraig Brady
Date: Wed, 28 Feb 2024 16:41:40 +0000 (+0000)
Subject: cat,cp,mv,dd,install,split: set the default IO size to 256KiB
X-Git-Tag: v9.5~44
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=fcfba90d0d27a1bacf2020bac4dbec74ed181028;p=thirdparty%2Fcoreutils.git

cat,cp,mv,dd,install,split: set the default IO size to 256KiB

* src/ioblksize.h: Add updated test results and increase value
from 128KiB to 256KiB, which was last updated 10 years ago.
* NEWS: Mention the improvement.
---

diff --git a/NEWS b/NEWS
index 69e282e378..7a5fbfd289 100644
--- a/NEWS
+++ b/NEWS
@@ -80,6 +80,10 @@ GNU coreutils NEWS -*- outline -*-
 
 ** Improvements
 
+  cp,mv,install,cat,split now read and write a minimum of 256KiB at a time.
+  This was previously 128KiB and increasing to 256KiB was seen to increase
+  throughput by 10-20% when reading cached files on modern systems.
+
   SELinux operations in file copy operations are now more efficient,
   avoiding unneeded MCS/MLS label translation.
 
diff --git a/src/ioblksize.h b/src/ioblksize.h
index 590b09f585..dcf94fc600 100644
--- a/src/ioblksize.h
+++ b/src/ioblksize.h
@@ -21,8 +21,8 @@
 
 #include "stat-size.h"
 
-/* As of May 2014, 128KiB is determined to be the minimum
-   blksize to best minimize system call overhead.
+/* As of Feb 2024, 256KiB is determined to be the best blksize
+   to minimize system call overhead across most systems.
    This can be tested with this script:
 
    for i in $(seq 0 10); do
@@ -41,21 +41,25 @@
    system #5: 2.30GHz i7-3615QM with 1600MHz DDR3, arch=x86_64
    system #6: 1.30GHz i5-4250U with 1-channel 1600MHz DDR3, arch=x86_64
    system #7: 3.55GHz IBM,8231-E2B with 1066MHz DDR3, POWER7 revision 2.1
+   system #8: 2.60GHz i7-5600U with 1600MHz DDR3, arch=x86_64
+   system #9: 3.80GHz IBM,02CY649 with 2666MHz DDR4, POWER9 revision 2.3
+   system 10: 2.95GHz IBM,9043-MRX, POWER10 revision 2.0
+   system 11: 3.23Ghz Apple M1 with 2666MHz DDR4, arch=arm64
 
    per-system transfer rate (GB/s)
-   blksize    #1    #2    #3    #4    #5    #6    #7
+   blksize    #1    #2    #3    #4    #5    #6    #7    #8    #9    10    11
    ------------------------------------------------------------------------
-      1024   .73   1.7   2.6   .64   1.0   2.5   1.3
-      2048   1.3   3.0   4.4   1.2   2.0   4.4   2.5
-      4096   2.4   5.1   6.5   2.3   3.7   7.4   4.8
-      8192   3.5   7.3   8.5   4.0   6.0  10.4   9.2
-     16384   3.9   9.4  10.1   6.3   8.3  13.3  16.8
-     32768   5.2   9.9  11.1   8.1  10.7  13.2  28.0
-     65536   5.3  11.2  12.0  10.6  12.8  16.1  41.4
-    131072   5.5  11.8  12.3  12.1  14.0  16.7  54.8
-    262144   5.7  11.6  12.5  12.3  14.7  16.4  40.0
-    524288   5.7  11.4  12.5  12.1  14.7  15.5  34.5
-   1048576   5.8  11.4  12.6  12.2  14.9  15.7  36.5
+      1024   .73   1.7   2.6   .64   1.0   2.5   1.3    .9   1.2   2.5   2.0
+      2048   1.3   3.0   4.4   1.2   2.0   4.4   2.5   1.7   2.3   4.9   3.8
+      4096   2.4   5.1   6.5   2.3   3.7   7.4   4.8   3.1   4.6   9.6   6.9
+      8192   3.5   7.3   8.5   4.0   6.0  10.4   9.2   5.6   9.1  18.4  12.3
+     16384   3.9   9.4  10.1   6.3   8.3  13.3  16.8   8.6  17.3  33.6  19.8
+     32768   5.2   9.9  11.1   8.1  10.7  13.2  28.0  11.4  32.2  59.2  27.0
+     65536   5.3  11.2  12.0  10.6  12.8  16.1  41.4  14.9  56.9  95.4  34.1
+    131072   5.5  11.8  12.3  12.1  14.0  16.7  54.8  17.1  86.5 125.0  38.2
+->  262144   5.7  11.6  12.5  12.3  14.7  16.4  40.0  18.0 113.0 148.0  41.3 <-
+    524288   5.7  11.4  12.5  12.1  14.7  15.5  34.5  18.0 104.0 153.0  43.1
+   1048576   5.8  11.4  12.6  12.2  14.9  15.7  36.5  18.2  87.9 114.0  44.8
 
 
    Note that this is to minimize system call overhead.
@@ -71,7 +75,7 @@
    In the future we could use the above method if available
    and default to io_blksize() if not.
 */
-enum { IO_BUFSIZE = 128 * 1024 };
+enum { IO_BUFSIZE = 256 * 1024 };
 static inline idx_t
 io_blksize (struct stat const *st)
 {
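
The comment in src/ioblksize.h refers to a test script, of which this diff shows only the opening loop line. As a rough stand-in, and explicitly not the script from the source, the sketch below shows one way to probe the same range of block sizes. The helper names `blksizes` and `probe` and the 64 MiB transfer size are assumptions made for illustration. The probe copies from /dev/zero to /dev/null, so like the table above it mainly reflects system call overhead rather than storage speed.

```shell
#!/bin/sh
# Hypothetical helper (not from coreutils): print the 11 block sizes
# covered by the table, 1024 * 2^i for i = 0..10 (1 KiB up to 1 MiB).
blksizes() {
  i=0
  while [ "$i" -le 10 ]; do
    echo $((1024 << i))
    i=$((i + 1))
  done
}

# Hypothetical helper: copy a fixed number of bytes (default 64 MiB,
# an arbitrary choice) from /dev/zero to /dev/null using the given
# block size, and print the last line of dd's status output.
# The wording of that line differs between dd implementations.
probe() {
  probe_bs=$1
  probe_total=${2:-$((64 * 1024 * 1024))}
  dd if=/dev/zero of=/dev/null bs="$probe_bs" \
     count=$((probe_total / probe_bs)) 2>&1 | tail -n 1
}

# Report one line per block size.
for bs in $(blksizes); do
  printf '%8s: %s\n' "$bs" "$(probe "$bs")"
done
```

Because each run moves the same number of bytes, the rates dd reports are directly comparable across block sizes. Small transfers are noisy, so repeating the loop or raising the byte count would likely give steadier numbers.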