Paul Eggert [Fri, 19 Sep 2025 15:55:12 +0000 (08:55 -0700)]
copy: pacify older clang
* src/copy-file-data.c (copy_file_data):
Redo conditionals for clarity.
This pacifies the clang in FreeBSD 11 and OpenBSD 7.6.
Problem reported by Bruno Haible in:
https://lists.gnu.org/r/coreutils/2025-09/msg00104.html
* src/dd.c (v): Add the O_EXCL flag to the set of bits that we do not
want to use for our definitions.
* cfg.mk (sc_dd_O_FLAGS): Adjust to pass syntax-check.
* NEWS: Mention the fix.
Reported by Bruno Haible.
tests: tail/overlay-headers.sh: protect against hang
* tests/tail/overlay-headers.sh: Protect tail invocation with timeout,
and extend the possible running period to 60 seconds like other tests.
Reported by Brudno Haible on T2SDE Linux/alpha
* tests/tail/wait.sh: This test was seen to hang occasionally
on an Alpine Linux 3.20 system, so protect the tail(1) call
with `timeout 60` as done in similar tests.
Reported by Bruno Haible.
* src/local.mk: Explicitly depend on @INTL_MACOS_LIBS@
which may be not implicitly referenced (in LIBINTL) without gettext.
This is a new transitive dependency through localename-unsafe.
We add this globally to ease future maintenance as currently
6 commands require the localename-unsafe dependency:
date, du through show-date() (fprintftime), and
ls, pr, stat, uptime through strftime().
tests: write-errors.sh: avoid portability issue with dash
* tests/misc/write-errors.sh: Use printf rather than echo
since the echo builtin in dash will interpret backslashes.
* tests/misc/read-errors.sh: Likewise for consistency.
cksum,wc: inspect GLIBC_TUNABLES on all architectures
* gnulib: Update gnulib to latest to pull in change
to inspect GLIBC_TUNABLES on non x86_64 or aarch64 platforms.
This was needed on 32 bit x86, which supports calling into
accelerated code, and so requires inspection of GLIBC_TUNABLES
to pass the disablement verification in tests/cksum/cksum.sh
Paul Eggert [Wed, 17 Sep 2025 16:12:23 +0000 (09:12 -0700)]
maint: STREQ → streq
Use new Gnulib streq function instead of rolling our own macro.
* bootstrap.conf (gnulib_modules): Add stringeq.
* src/rm.c (main): Don’t assume streq is a macro that expands to (...),
as it is now a function.
* src/system.h:
* tests/df/no-mtab-status.sh, tests/df/skip-duplicates.sh:
(STREQ): Remove. All uses replaced by streq.
* tests/fold/fold-characters.sh: Move memory limit test to ...
* tests/misc/write-errors.sh: ... which avoids "write error"
messages on stderr due to the ignored SIGPIPE. It also protects
the fold invocation with a timeout(1) so that fold implementations
that don't exit promptly upon write error don't hang the test suite
(Like we would have done before commit v9.7-311-gc95c7ee76).
maint: remove unnecessary uintmaxtostr usage in printf
* src/comm.c (compare_files): Use the "%ju" printf directive instead of
"%s" and printing the result of umaxtostr.
* src/df.c (get_header): Likewise.
* src/ls.c (print_long_format): Likewise.
* src/sort.c (check): Likewise.
* NEWS: Mention the new GLIBC_TUNABLES feature.
* doc/coreutils.texi (Hardware Acceleration): A new node
detailing the build time and run time configuration options.
fold: fix write error checks with invalid multi-byte input
* src/fold.c (write_out): A new helper to check all writes.
(fold-file): Use write_out() for all writes.
* tests/fold/fold-zero-width.sh: Adjust to writing more
data in various patterns, rather than two buffers of NULs.
This is a more robust memory bound check, and the '\303' case
tests this particular logic change.
* NEWS: fold now exits immediately, not just promptly.
* NEWS: Mention the improvement.
* src/fold.c (fold_file): Check for write errors
after each buffer read from stdin.
* tests/misc/write-errors.sh: Add test cases.
Paul Eggert [Wed, 6 Aug 2025 19:11:34 +0000 (12:11 -0700)]
cp: prefer signed types in copy-file-data.c
* src/copy-file-data.c (sparse_copy, lseek_copy)
(copy_file_data): Prefer idx_t to size_t.
(copy_file_data): Defend against st_size < 0, which can happen on
ancient buggy platforms. Check for overflow; the old code was
wrong on theoretical-but-valid hosts where SIZE_MAX <= INT_MAX.
Paul Eggert [Wed, 6 Aug 2025 18:48:21 +0000 (11:48 -0700)]
cp: don’t allocate a separate zero buffer
* src/copy-file-data.c (write_zeros): New args abuf, buf_size.
Use the lazily-allocated buffer, which most likely already exists,
rather than allocating a separate buffer just for zeros.
This makes the code more reentrant as there is no longer
a need for static storage here. Although there is some CPU
overhead due to the need to zero out the buffer for each file,
the overhead is relatively small as the buffer is smallish
and should be cached. All callers changed.
Paul Eggert [Wed, 6 Aug 2025 16:55:41 +0000 (09:55 -0700)]
cp: refactor copying to return bytes copied
This doesn’t change behavior; it simplifies future changes.
* src/copy-file-data.c (sparse_copy, lseek_copy, copy_file_data):
Return the number of bytes copied, or -1 on failure,
instead of merely returning a success indication.
All callers changed.
Paul Eggert [Wed, 6 Aug 2025 14:53:35 +0000 (07:53 -0700)]
cp: copy_file_data now supports ibytes
This does not affect current coreutils behavior;
it is merely to help make copy_file_data more useful in the future.
* src/copy-file-data.c (lseek_copy): New arg ibytes.
Caller changed.
(copy_file_data): Implement the ibytes arg; formerly
it was always treated as COUNT_MAX, though all callers
currently pass COUNT_MAX so there was no problem in practice.
Paul Eggert [Tue, 5 Aug 2025 22:45:35 +0000 (15:45 -0700)]
cp: port better to old limited hosts
Port better ancient platforms where OFF_T_MAX is only 2**31 - 1,
but some devices have more than that many bytes.
* src/copy-file-data.c (copy_file_data): Byte count is now
count_t, not off_t. All callers changed. Since we need to check
for overflow anyway, also check for too-small calls to fadvise.
Paul Eggert [Tue, 5 Aug 2025 21:45:18 +0000 (14:45 -0700)]
cp: refactor out data copying
* po/POTFILES.in, src/local.mk (copy_sources): Add the new file.
* src/copy.c: Move the #includes of alignalloc.h, buffer-lcm.h,
fadvise.h, full-write.h, ioblksize.h to copy-file-data.c.
(enum copy_debug_val, struct copy_debug): Move these decls to copy.h.
(punch_hole, create_hole, is_CLONENOTSUP, sparse_copy)
(write_zeros, lseek_copy, HAVE_STRUCT_STAT_ST_BLOCKS)
(enum scantype, struct scan_inference, infer_scantype):
Move to copy-file-data.c.
(copy_reg): Move the data-copying part of this function
to the new function copy_file_data in copy-file-data.c.
* src/copy-file-data.c: New file, taken from part of copy.c.
Paul Eggert [Tue, 5 Aug 2025 19:47:16 +0000 (12:47 -0700)]
cp: refactor src/copy.c
This is in preparation for splitting this large module.
* src/copy.c (sparse_copy, lseek_copy): New arg DEBUG, used to
identify copy debug info instead of using a static var. All
callers changed.
(lseek_copy, infer_scantype): New args SRC_POS and POS.
Callers changed.
(copy_file_data): New function, with contents taken from copy.
(copy_reg): Call it.
doc: NEWS: mention fold can operate on very long lines
* NEWS: Before commit fb9016d50 (fold: use fread instead of getline,
2025-08-24), fold required that the maximum line size in a file fit into
memory. Document that this is no longer the case.
fold: fix out of bounds write with zero width characters
* src/fold.c (fold_file): Prefer putchar ('\n') to copying characters.
If we do not have room in the output buffer print it since it is not a
full line of text.
* tests/fold/fold-zero-width.sh: New test case.
* tests/local.mk (all_tests): Add it.
cksum,wc: support disabling hardware acceleration at runtime
This is useful to give better test coverage at least,
and may be useful for users to tune their environment.
* bootstrap.conf: Reference the cpu-supports gnulib module.
* src/cksum.c: Use cpu_supports() rather than __builtin_cpu_supports().
* src/wc.c: Likewise.
* tests/cksum/cksum.sh: Adjust to testing all implementations.
* tests/wc/wc-cpu.sh: A new test to do likewise.
* tests/local.mk: Reference the new wc test.
maint: basenc: refactor all encodings to use finalize
Finalize was required for base58, but it's a more general mechanism
which simplifies the logic for all encodings
* src/basenc.c (do_decode): Always call base_decode_ctx_finalize(),
rather than the awkward double loop at end of buffer.
* tests/basenc/basenc.pl: Add basenc finalization tests.
maint: cleanup libraries unnecessarily added to fold
* src/local.mk (src_fold_LDADD): Remove $(MBRTOWC_LIB) since it is
already added to LDADD. Remove $(LIBC32CONV) and $(LIBUNISTRING) which
were for an uncommitted patch which used Gnulib's mbfile module.
nohup: avoid FORTIFY runtime failure on Bionic libc
The meaning of non-file permission umask bits is implementation defined.
On Bionic libc, attempting to set them triggers a FORTIFY runtime check.
$ nohup true
FORTIFY: umask: called with invalid mask -601
Aborted nohup true
* src/nohup.c: (main) Avoid setting non-permission bits in umask.
Just clear the umask to ensure we create nohup.out with u+rw,
as we restore the original umask before the exec().
* tests/misc/nohup.sh: Add a test case.
* NEWS: Mention the bug fix.
basenc: ensure partial padding with newlines induces an error
* src/basenc.c (has_padding): A more robust helper to
identify padding in the presence of trailing newlines.
(do_decode): Use has_padding() rather than just looking
at the last character.
* tests/basenc/base64.pl: Fully test commit v9.4-53-g378dc38f4
by ensuring partially padded data is diagnosed.
baddecode9 is the case fixed in this commit.
* NEWS: Mention the bug fix.
This implicitly tests the previous commit to
adjust how date(1) handles multiple named format options.
Currrently it tests the following are supported:
chown --quiet --silent
date --rfc-email --rfc-822 --rfc-2822
date --uct --utc --universal
dircolors --bourne-shell --sh
dircolors --csh --c-shell
head --quiet --silent
* tests/misc/option-aliases.sh: A new test to ensure all
option aliases supported by a command are supported.
* Reference the new test.
tests: fold: check if multi-byte spaces are treated as blank
This avoids a test failure on FreeBSD 14, MacOS 15, and musl.
Fix suggested by Pádraig Brady in:
<https://bugs.gnu.org/79301#32>.
* tests/fold/fold-spaces.sh (isblank): New function. Only run the tests
if the character is treated as blank.
Fixes https://bugs.gnu.org/79301
To make the interface more concise and consistent,
while being backwards compatible.
* src/digest.c (main): Continue to support -a "sha###" but
also support -a "sha2" and treat it like "sha3", except in...
(output_file): ... maintain the legacy tags for better compatability.
* doc/coreutils.texi (cksum invocation): Document the -a sha2 option.
* tests/cksum/cksum-base64.pl: Adjust as per modified --help.
* tests/cksum/cksum-c.sh: Add new supported SHA2-### tagged variant.
* NEWS: Mention the new feature.
Collin Funk [Sun, 31 Aug 2025 23:56:08 +0000 (16:56 -0700)]
cksum: add support for SHA-3
* src/digest.c: Include sha3.h.
(BLAKE2B_MAX_LEN): Rename to
DIGEST_MAX_LEN since it is also used for SHA-3.
(sha3_sum_stream): New function.
(enum Algorithm, algorithm_args, algorithm_args, algorithm_types)
algorithm_tags, algorithm_bits, cksumfns, cksum_output_fns): Add entries
for SHA-3.
(usage): Mention that SHA-3 is supported. Mention requirements for
--length with SHA-3.
(split_3): Use DIGEST_MAX_LEN instead of BLAKE2B_MAX_LEN. Determine the
length of the digest for SHA-3. Make sure it is 224, 256, 384, or 512.
(digest_file): Set the digest length in bytes. Use DIGEST_MAX_LEN
instead of BLAKE2B_MAX_LEN. Always append the digest length to SHA3 in
the output.
(main): Allow the use of --length with 'cksum -a sha3'. Use
DIGEST_MAX_LEN instead of BLAKE2B_MAX_LEN. Make sure it is 224, 256,
384, or 512.
* tests/cksum/cksum-base64.pl (@pairs): Add expected sha3 output.
(fmt): Modify the output to use SHA3-512 since that is the default.
(@Tests): Modify arguments for sha3 to use --length=512.
* tests/cksum/cksum-sha3.sh: New test, based on tests/cksum/b2sum.sh.
* tests/local.mk (all_tests): Add the test.
* bootstrap.conf: Add crypto/sha3.
* gnulib: Update to latest commit.
* NEWS: Mention the change.
* doc/coreutils.texi (cksum general options): Mention sha3 as a
supported argument to the -a option. Mention that 'cksum -a sha3'
supports the --length option. Mention that SHA-3 is considered secure.
fold: check that characters are not non-breaking spaces when -s is used
NetBSD 10 and Solaris 11.4 treat non-breaking spaces as blank
characters unlike glibc.
* src/system.h: Include uchar.h.
(c32isnbspace): New function based on iswnbspace from src/wc.c.
* src/fold.c (fold_file): Use it.
* src/wc.c (iswnbspace): Remove function.
(maybe_c32isnbspace): New function.
(wc, main): Use it.
Fixes https://bugs.gnu.org/79300
maint: prefer issymlink to readlink with a small buffer
* bootstrap.conf (gnulib_modules): Add issymlink and issymlinkat.
* src/copy.c: Include issymlink.h.
(copy_reg): Use issymlink instead of readlinkat.
* src/rmdir.c: Include issymlink.h.
(main): Use issymlink instead of readlink.
* src/tail.c: Include issymlink.h.
(recheck, any_symlinks): Use issymlink instead of readlink.
* src/test.c: Include issymlink.h.
(unary_operator): Use issymlink instead of readlink.
seq: be more accurate with large integer start values
* src/seq.c (main): Avoid possibly innacurate conversion
to long double, for all digit start values.
* tests/seq/seq-long-double.sh: Add a test case.
* NEWS: Mention the improvement.
Fixes https://bugs.gnu.org/79369
Pádraig Brady [Sun, 31 Aug 2025 13:29:56 +0000 (14:29 +0100)]
ls: fix alignment with locale formatted --size
Fix allocated size alignment in locales with multi-byte grouping chars.
Tested with: LC_ALL=sv_SE.utf8 ls --size --block-size=\'k
* src/ls.c (print_file_name_and_frills): Don't rely on
printf("%*s", width, string) to pad multi-byte strings appropriately.
Instead work out the padding required and use:
printf("%*s%s", padding, "", string) to pad multi-byte appropriately.
* tests/ls/block-size.sh: Add a test case.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/79347
Pádraig Brady [Sat, 30 Aug 2025 11:08:24 +0000 (12:08 +0100)]
b2sum: --length: fix upper bound check
* src/digest.c (main): Don't saturate -l to BLAKE2B_MAX_LEN,
so that the subsequent bounds check is performed.
* tests/cksum/b2sum.sh: Add a test case.
* NEWS: Mention the fix introduced in commit v9.5-71-gf2c84fe63
Collin Funk [Thu, 28 Aug 2025 01:33:37 +0000 (18:33 -0700)]
fold: fix handling of invalid multi-byte characters
* src/fold.c (fold_file): Continue the loop when we have buffered bytes
but nothing left to read from the file.
(adjust_column): Don't assume that the character is printable.
* tests/fold/fold-characters.sh: Add a new test case.
(bad_unicode): New function.
Collin Funk [Tue, 26 Aug 2025 06:15:21 +0000 (23:15 -0700)]
fold: don't truncate multibyte characters at the end of the buffer
* src/fold.c (fold_file): Replace invalid characters with the original
byte read. Copy multibyte sequences that may not yet be read to the
start of the buffer before reading more bytes.
* tests/fold/fold-characters.sh: Add a test case.
Collin Funk [Sun, 24 Aug 2025 19:05:41 +0000 (12:05 -0700)]
fold: use fread instead of getline
* src/fold.c: Include ioblksize.h.
(fold_file): Use two IO_BUFSIZE-sized buffers. Use fread instead of
getline. Check for if we reached the end of file.
Pádraig Brady [Sun, 24 Aug 2025 10:41:01 +0000 (11:41 +0100)]
tests: nproc: fix false failure on some systems
* tests/nproc/nproc-quota.sh: Also simulate sched_getscheduler()
as this will not be called on older or non linux, or
may return ENOSYS on Alpine.
Fixes https://bugs.gnu.org/79299
Collin Funk [Thu, 21 Aug 2025 04:13:52 +0000 (21:13 -0700)]
fold: add the --characters option
* src/fold.c: Include mcel.h.
(count_bytes): Remove variable.
(counting_mode, last_character_width): New variables.
(shortopts, long_options): Add the option.
(adjust_column): If --characters is in used account for number of
characters instead of their width.
(fold_file): Use getline and iterate over the result with mcel functions
to handle multibyte characters.
(main): Check for the option.
* src/local.mk (src_fold_LDADD): Add $(LIBC32CONV), $(LIBUNISTRING), and
$(MBRTOWC_LIB).
* tests/fold/fold-characters.sh: New file.
* tests/fold/fold-spaces.sh: New file.
* tests/fold/fold-nbsp.sh: New file.
* tests/local.mk (all_tests): Add the tests.
* NEWS: Mention the new option.
* doc/coreutils.texi (fold invocation): Likewise.
Paul Eggert [Sat, 23 Aug 2025 00:34:04 +0000 (17:34 -0700)]
cp: improve hole handling on squashfs
Better fix for problem reported by Jeremy Allison
<https://bugs.gnu.org/79267>.
* src/copy.c (struct scan_inference): New type, replacing
union scan_inference. All uses changed. This is so
infer_scantype can report the first hole's offset when known.
(lseek_copy): 5th arg is now struct scan_inference const *,
not just off_t. All uses changed.
(infer_scantype): If SEEK_SET+SEEK_HOLE do not find a hole,
fall back on ZERO_SCANTYPE.
Paul Eggert [Fri, 22 Aug 2025 17:37:50 +0000 (10:37 -0700)]
cp: go back to copy_file_range optimization
This reverts part of the previous change.
* src/copy.c (lseek_copy): When calling sparse_copy, do not
ask it to scan for zeros unless --sparse=always, so that it
can use copy_file_range which can be far more efficient.
Paul Eggert [Thu, 21 Aug 2025 22:02:10 +0000 (16:02 -0600)]
cp: always punch holes that we make
Problem reported by Jeremy Allison <https://bugs.gnu.org/79267>.
* src/copy.c (create_hole, sparse_copy): Omit arg PUNCH_HOLES,
as we always punch holes now. All uses changed.
(lseek_copy): When calling sparse_copy, scan for holes when
sparse_mode == SPARSE_AUTO, as that means we are making holes.
(copy_reg): Always punch any hole made at end.
Collin Funk [Tue, 19 Aug 2025 18:39:04 +0000 (11:39 -0700)]
maint: prefer https to http
* doc/sort-version.texi (Other version/natural sort implementations):
Use https in documentation link.
* tests/chmod/symlinks.sh: Use https in license text.
Pádraig Brady [Tue, 19 Aug 2025 15:49:22 +0000 (16:49 +0100)]
nproc: honor cgroup v2 CPU quotas
* NEWS: Mention the new feature.
* doc/coreutils.texi (nproc invocation): Mention that
cgroup CPU quotas can limit the reported number.
* gnulib: Update to new nproc gnulib implementation:
https://github.com/coreutils/gnulib/commit/9b07115f4a
Collin Funk [Fri, 15 Aug 2025 02:43:52 +0000 (19:43 -0700)]
maint: avoid syntax-check failure from previous commit
* src/tsort.c: Don't include long-options.h since the previous commit
removed the call to parse_gnu_standard_options_only. This avoids a
sc_prohibit_long_options_without_use syntax-check failure.
Paul Eggert [Thu, 14 Aug 2025 16:17:51 +0000 (09:17 -0700)]
tsort: add do-nothing -w option
This is for conformance to POSIX.1-2024
* src/tsort.c: Include getopt.h.
(main): Accept and ignore -w. Do not bother altering
the usage message, as the option is useless.
* tests/misc/tsort.pl (cycle-3): New test.
Collin Funk [Sun, 10 Aug 2025 00:53:29 +0000 (17:53 -0700)]
realpath: support the -E option required by POSIX
* src/realpath.c (longopts): Add the option.
(main): Likewise.
(usage): Add the option to the --help message.
* tests/misc/realpath.sh: Add a simple test.
* doc/coreutils.texi (realpath invocation): Mention the new option.
* NEWS: Likewise.
Pádraig Brady [Sun, 10 Aug 2025 15:15:37 +0000 (16:15 +0100)]
doc: --base58: add example usage to info
* doc/coreutils.texi (basenc invocation): Add an example
using --base58 to generate a unique ID. This also demonstrates
compound usage of the basenc command, to convert to/from binary.
A 58 character encoding that:
- avoids visually ambiguous 0OIl characters
- uses only alphanumeric characters
Described at:
- https://tools.ietf.org/html/draft-msporny-base58-03
This implementation uses GMP (or gnulib's gmp fallback).
Performance is good in comparison to other implementations.
For example when using libgmp on an i7-5600U system,
encoding is 530 times faster, and decoding 830 times faster
than the implementation using arbitrary precision ints in cpython 3.13.
Memory use is proportional to the size of input.
Encoding benchmarks:
$ time yes | head -c65535 | src/basenc --base58 -w0 >file.enc
real 0m0.018s
./configure --quiet --without-libgmp && make -j $(nproc)
$ time yes | head -c65535 | src/basenc --base58 -w0 >file.enc
real 0m3.431s
# dnf install python3-base58
$ time yes | head -c65535 | base58 >file.enc # cpython 3.13
real 0m9.700s
Decoding benchmarks:
$ time src/basenc --base58 -d <file.enc >/dev/null
real 0m0.010s
$ ./configure --without-libgmp && make # gnulib gmp
$ time src/basenc --base58 -d <file.enc >/dev/null
real 0m0.145s
$ time base58 -d <file.enc >/dev/null # cpython 3.13
real 0m8.302s
* src/basenc.c (base_decode_ctx_finalize, base_encode_ctx_init,
base_encode_ctx, base_encode_ctx_finalize): New functions to
provide more general processing functionality.
(base58_{de,en}code_ctx{_init,,_finalize}): New functions to
accumulate all input before calling ...
(base58_{de,en}code): ... the GMP based encoding/decoding routines.
(do_encode, do_decode): Call the ctx variants if enabled.
* doc/coreutils.texi (basenc invocation): Describe the new option,
and indicate the main use case being interactive user use.
* src/local.mk: Link basenc with GMP.
* tests/basenc/basenc.pl: Add test cases.
* NEWS: Mention the new feature.
Pádraig Brady [Thu, 7 Aug 2025 19:39:24 +0000 (20:39 +0100)]
basenc: fix stripping of '=' chars in some encodings
* src/basenc.c (do_decode): With -i ensure we strip '=' chars
if there is no padding for the chosen encoding.
* tests/basenc/basenc.pl: Add a test case.
* NEWS: Mention the bug fix.
Collin Funk [Sat, 9 Aug 2025 03:32:18 +0000 (20:32 -0700)]
maint: prefer attribute.h in .c files
* src/basenc.c (base16_encode, z85_encoding, do_decode): Use
ATTRIBUTE_NONSTRING instead of ATTRIBUTE_NONSTRING.
* src/basenc.c (sc_prohibit-_gl-attributes): New rule for
'make syntax-check'.
Paul Eggert [Thu, 7 Aug 2025 22:42:23 +0000 (15:42 -0700)]
cp: omit some needless lseek calls
The sparse code sometimes issued multiple lseeks against the
same file without doing anything in betwee. Optimize them away
by keeping track of the last hole output, in a way that
crosses the sparse_copy function call boundary.
* src/copy.c (sparse_copy): New arg hole_size, replacing old args
scan_holes and last_write_made_hole. All callers changed.
(sparse_copy, lseek_copy): Do not create hole at
end; let the caller deal with it. All callers changed.
(lseek_copy): New args hole_size and total_n_read. Caller changed.
(copy_reg): Create hole at end for both lseek_copy and sparse_copy.