Paul Eggert [Sat, 1 Nov 2025 23:42:59 +0000 (17:42 -0600)]
pr: improve nstrftime failure check
* src/pr.c (init_header): Do not report an nstrftime EOVERFLOW
error as memory exhaustion. Instead, output the time as an
integer. Also, work even if nstrftime (nullptr, SIZE_MAX, ...)
would return PTRDIFF_MAX which means adding 1 would overflow..
Pádraig Brady [Sat, 1 Nov 2025 14:30:12 +0000 (14:30 +0000)]
build: reduce explicit dependencies on macOS CoreFoundation
* src/local.mk: Revert v9.7-322-gc2e1816a5, instead relying
on the more focused v9.8-79-g532cd66af. When built with
--disable-nls on macOS this will result in only some commands
being linked with INTL_MACOSX_LIBS, thus resulting in env(1)
at least not setting a __CF_USER_TEXT_ENCODING envirnoment variable.
Pádraig Brady [Fri, 31 Oct 2025 15:37:55 +0000 (15:37 +0000)]
copy: be more defensive/restrictive with posix_fadvise
* src/copy-file-data.c (copy_file_data): Only give the
POSIX_FADV_SEQUENTIAL hint when we _know_ we'll definitely
use a read/write loop to copy the data. Also only apply
the hint to the whole file, as we've seen OpenZFS at least
special case that.
(sparse_copy): Update stale comment.
Pádraig Brady [Thu, 30 Oct 2025 13:02:48 +0000 (13:02 +0000)]
copy: don't avoid copy-offload upon SEEK_HOLE indicating non-sparse
* src/copy-file-data.c (infer_scantype): Fall back to a plain copy
if SEEK_HOLE indicates non-sparse, as zero copy avoids copy offload.
This was seen with transparently compressed files on OpenZFS.
* tests/cp/sparse-perf.sh: Add a test case even though it might
only trigger on compressed file systems that don't support reflink.
* NEWS: Mention the bug fix.
Addresses https://github.com/coreutils/coreutils/issues/122
* src/copy-file-data.c (): pass 0 to posix_fadvise to indicate to EOF.
coreutils 9.8 used OFF_T_MAX instead, which triggered OpenZFS 2.2.2
at least to synchronously (decompress and) populate the page cache.
Addresses https://github.com/coreutils/coreutils/issues/122
Collin Funk [Fri, 31 Oct 2025 04:10:52 +0000 (21:10 -0700)]
timeout: use fork and execvp instead of posix_spawn
* NEWS: Remove timeout from the list of programs that use posix_spawn.
* bootstrap.conf (gnulib_modules): Remove posix_spawnattr_setsigmask.
* src/timeout.c: Don't include spawn.h.
(main): Use fork and execvp instead of posix_spawn.
Pádraig Brady [Fri, 31 Oct 2025 14:17:31 +0000 (14:17 +0000)]
doc: NEWS: mention that sort --compress=script needs a shebang
* NEWS: Mention that we don't fall-back to executing /bin/sh <script>
for malformed scripts that don't start with #!..., or any executable
that returns ENOEXEC in general.
Bruno Haible [Thu, 30 Oct 2025 09:15:09 +0000 (10:15 +0100)]
sort: consistently diagnose access issues to --compress-program
* bootstrap.conf (gnulib_modules): Add findprog-in.
* src/sort.c: Include findprog.h.
(pipe_child): Look up the compress_program in $PATH and report errors
such as ENOENT or EACCES before invoking posix_spawnp.
This avoids inconsistency on systems that emulate posix_spawn through
fork/exec, as they would otherwise treat such a failure as a generic
failure and fail the sort, rather than continuing without compression.
Pádraig Brady [Tue, 28 Oct 2025 19:30:08 +0000 (19:30 +0000)]
sort: fix silent exit upon SIGPIPE from --compress-program
* src/sort.c (main): Ignore SIGPIPE so we've more control over
how we handle for stdout and compression programs.
(sort_die): Handle EPIPE from stdout and mimic a standard SIGPIPE,
otherwise reverting to a standard exit(SORT_FAILURE);
* tests/sort/sort-compress-proc.sh: Add a test case.
* NEWS: Mention the bug fix.
Pádraig Brady [Tue, 28 Oct 2025 12:52:55 +0000 (12:52 +0000)]
tests: fix new date/resolution.sh test on macOS
* tests/date/resolution.sh: Fix comparison on systems with less than
nano second reslution, where we use sed to discard the redundant
trailing zeros output by date --resolution.
Reported by Bruno Haible on macOS.
Pádraig Brady [Tue, 28 Oct 2025 12:18:47 +0000 (12:18 +0000)]
numfmt: ensure fields don't split on nbsp
* src/numfmt.c (newline_or_blank): Explicitly ensure
we don't match NBSP as on platforms like NetBSD 10 or Solaris 11,
NBSP is considered a blank character.
This should have been part of commit v9.8-39-g8bc11f80a
Solaris 11 test failure reported by Bruno Haible.
Collin Funk [Sun, 26 Oct 2025 22:33:24 +0000 (15:33 -0700)]
timeout: use the more efficient posix_spawn to invoke the command
* NEWS: Mention the improvement. Consolidate the posix_spawn
improvements into one item.
* bootstrap.conf (gnulib_modules): Add posix_spawnattr_setsigmask.
* src/timeout.c: Include spawn.h.
(main): Setup signals using a posix_spawnattr_t object. Use posix_spawn
instead of fork and execvp.
Pádraig Brady [Sun, 26 Oct 2025 12:33:42 +0000 (12:33 +0000)]
build: don't build chcon or runcon unless selinux is available
The build can be force enabled with --with-selinux and vice versa.
* build-aux/gen-lists-of-programs.sh: Move chcon and runcon
to the list of optional programs.
* configure.ac: Only enable chcon and runcon if selinux.h is available.
* NEWS: Mention the Build-related change.
Fixes https://github.com/coreutils/coreutils/issues/121
Collin Funk [Fri, 24 Oct 2025 04:27:53 +0000 (21:27 -0700)]
sort: use the more efficient posix_spawn to invoke --compress-program
* NEWS: Mention the improvement. Mention that 'sort' will continue
without compressing temporary files if the program specified by
--compress-program cannot be executed.
* doc/coreutils.texi (sort invocation): Document the behavior when the
program specified by --compress-program cannot be executed.
* src/sort.c: Include spawn.h.
(MAX_FORK_TRIES_COMPRESS, MAX_FORK_TRIES_DECOMPRESS): Remove definition.
(MAX_TRIES_COMPRESS, MAX_TRIES_DECOMPRESS): New definitions based on
MAX_FORK_TRIES_COMPRESS and MAX_FORK_TRIES_DECOMPRESS.
(async_safe_die): Remove function.
(posix_spawn_file_actions_move_fd): New function.
(pipe_fork): Remove function.
(pipe_child): New function based on pipe_fork. Return an error number
instead of a pid. Use posix_spawnp instead of calling fork and expecting
the caller to exec.
(maybe_create_temp): Call pipe_child instead of pipe_fork. Print a
warning to standard error if --compress-program cannot be executed and
the error is different than the previous call. Remove code for the child
process.
(open_temp): Remove code for the child process. Improve error message.
* tests/sort/sort-compress.sh: Add a test case for when the program
specified by --compress-program does not exist.
Collin Funk [Thu, 23 Oct 2025 21:40:39 +0000 (14:40 -0700)]
maint: remove unnecessary ignore_value usage
* src/install.c: Remove ignore-value.h include.
(strip): Don't use ignore_value on posix_spawnattr_destroy since it
isn't declared with the warn_unused_result attribute. Pass the checked
pointer to posix_spawnattr_destroy instead of the variable it points to.
Collin Funk [Thu, 23 Oct 2025 08:11:50 +0000 (01:11 -0700)]
split: prefer posix_spawn to fork and execl
* NEWS: Mention the change.
* bootstrap.conf (gnulib_modules): Add posix_spawn,
posix_spawnattr_setsigdefault, posix_spawn_file_actions_addclose,
posix_spawn_file_actions_adddup2, and posix_spawn_file_actions_init.
* src/split.c: Include spawn.h.
(create): Use posix_spawn instead of fork and execl.
Pádraig Brady [Wed, 22 Oct 2025 15:11:53 +0000 (16:11 +0100)]
fmt: promptly diagnose write errors
* NEWS: Mention the improvement.
* src/fmt.c (put_line): Exit if any error writing line.
(flush_paragraph): Exit if any error writing buffer.
* tests/misc/write-errors.sh: Enable the (flush_paragraph) test case,
and add another to check the put_line() case.
Pádraig Brady [Wed, 22 Oct 2025 12:53:46 +0000 (13:53 +0100)]
numfmt: promptly diagnose write errors
* src/numfmt.c (process line): Inspect the stdio error state when
outputting each line so that we don't have to check each output function
but do eventually exit upon write error, while also remaining buffered.
(main): Also check when outputting a header for the edge case
of very long headers.
* tests/misc/write-errors.sh: Enable the numfmt test case.
* NEWS: Mention the improvement, and reorganize all numfmt improvements.
Collin Funk [Sun, 19 Oct 2025 22:22:17 +0000 (15:22 -0700)]
install: prefer posix_spawnp to fork and execlp
* NEWS: Mention the change.
* bootstrap.conf (gnulib_modules): Add posix_spawnattr_destroy,
posix_spawnattr_init, posix_spawnattr_setflags, and posix_spawnp.
* src/install.c (strip): Use posix_spawnp instead of fork and execlp.
Pádraig Brady [Mon, 20 Oct 2025 11:53:13 +0000 (12:53 +0100)]
tests: numfmt: add non-utf8 multi-byte test
* tests/numfmt/mb-non-utf8.sh: Test GB18030 delimiter search.
* tests/local.mk: Reference the new test, and move
the existing numfmt.pl test from tests/misc to tests/numfmt.
Pádraig Brady [Sun, 19 Oct 2025 12:11:46 +0000 (13:11 +0100)]
numfmt: optimize multi-byte --delimiter search
* src/numfmt.c (is_utf8_charset): A new function to efficiently
determine if running with a UTF-8 charset.
(mbsmbchr): A new function to efficiently search for
a (multi-byte) character in a multi-byte string.
(next-field): Use mbsmbchr() rather than mbstr() directly.
Pádraig Brady [Sat, 18 Oct 2025 16:44:49 +0000 (17:44 +0100)]
numfmt: support multi-byte --delimiter
* bootstrap.conf: Depend on mbsstr() to robustly search for a
multi-byte delimiter character (string) within a multi-byte string.
* src/numfmt.c (main): Accept a valid multi-byte delimiter character.
(next_field): Adjust delimiter search from single byte
to multi-byte aware. Use mbsstr to find the first match.
* tests/misc/numfmt.pl: Add test case.
* NEWS: Mention the improvement.
Pádraig Brady [Sat, 18 Oct 2025 12:02:44 +0000 (13:02 +0100)]
numfmt: use multi-byte aware suffix matching
* src/numfmt.c (process_suffixed_number): Use gnulib's
mbs_endswith() helper, which is more robust in non UTF-8 locales.
Also always output a devmsg if a suffix is specified.
Pádraig Brady [Fri, 17 Oct 2025 18:14:21 +0000 (19:14 +0100)]
numfmt: fix issues with multi-byte blanks
* src/numfmt.c (process_line): Restore byte overwritten with NUL,
as it may be part of a multi-byte blank.
(process_suffixed_number): Skip multi-byte blanks,
and correctly determine width with mbswidth().
(parse_format_string): Use c_isblank() to explicitly
indicate that's all the format spec supports.
* tests/misc/numfmt.pl: Add test cases.
* NEWS: Mention the bug fix.
Pádraig Brady [Thu, 9 Oct 2025 13:24:12 +0000 (14:24 +0100)]
numfmt: add --unit-separator
Output, accept, or disallow a string between the number and unit
as recommended in <https://physics.nist.gov/cuu/Units/checklist.html>
I.e. support outputting numbers of the form: "1234 M"
* src/numfmt.c (simple_strtod_human): Skip unit separator if present,
or disallow a unit separator if empty.
(double_to_human): Output unit separator if specified.
(main): Accept --unit-separator.
* tests/misc/numfmt.pl: Add test cases.
* doc/coreutils.texi: Describe the new option,
giving examples of interaction with --delimiter.
* NEWS: Mention the new feature.
* THANKS.in: Add Johannes Schauer Marin Rodrigues,
who provided a preliminary patch.
Pádraig Brady [Tue, 14 Oct 2025 20:06:03 +0000 (21:06 +0100)]
numfmt: support reading numbers with grouping characters
This does not validate grouping character placement,
and currently just ignores grouping characters.
* src/numfmt.c (simple_strtod_int): Skip grouping chars
that are part of a number.
* tests/misc/numfmt.pl: Add test cases.
* NEWS: Mention the improvement.
Pádraig Brady [Tue, 14 Oct 2025 15:17:56 +0000 (16:17 +0100)]
numfmt: support reading numbers with NBSP before unit
* src/numfmt.c (simple_strtod_human): Accept (multi-byte)
non-breaking space character between number and unit.
Note we restrict this to a single character between number
and unit, to allow less ambiguous parsing if multiple blanks
are used to delimit fields.
* tests/misc/numfmt.pl: Add test cases.
* doc/coreutils.texi (numfmt invocation): Fix stale description
--delimiter skipping whitespace.
* NEWS: Mention the improvement.
Nicolas Boichat [Sat, 26 Jul 2025 06:58:33 +0000 (14:58 +0800)]
tests: du/bigtime: try harder to find a suitable filesystem
* tests/du/bigtime.sh: At least on Linux, the ext4 filesystem
doesn't support such large timestamp, while tmpfs does. Try a bit
harder to look for a filesystem with large timestamp support.
Pádraig Brady [Tue, 7 Oct 2025 13:38:49 +0000 (14:38 +0100)]
maint: cksum: document a base64/hex parsing ambiguity with untagged
* src/digest.c (split_3): Mention the ambiguity in misinterpreting
base64 characters as hex is not a practical consideration.
Also add an example of both tagged formats which makes it
easier to interpret the parsing logic.
Pádraig Brady [Mon, 6 Oct 2025 18:41:24 +0000 (19:41 +0100)]
cksum: fix --check with untagged base64 format with tag matches
* src/digest.c (split_3): Fallback to untagged matching in the
case where -a is specified and we have matched a TAG in
the possibly base64 data. This might happen in 1 in every 64K files.
Note we remove the modification of string S (and redundant streq) in
the tag matching, as that was not needed since v8.32-223-g217cd278e.
* tests/cksum/cksum-c.sh: Add a test case.
* NEWS: Mention the bug fix.
Pádraig Brady [Mon, 6 Oct 2025 15:32:26 +0000 (16:32 +0100)]
cksum: fix length validation with SHA2- tagged format
* src/digest.c (sha2_sum_stream): Change from unreachable()
to affirm() so that we have defined behavior unless
we configure with --disable-assert.
(sha3_sum_stream): Likewise.
(split_3): Validate SHA2-lengths before passing on.
* tests/cksum/cksum-c.sh: Add a test case.
* NEWS: Mention the bug fix.
Pádraig Brady [Mon, 6 Oct 2025 12:23:55 +0000 (13:23 +0100)]
cksum: fix --check with --algorithm=sha2
* src/digest.c (split_3): Look up the provided tag with -a sha2
because there is not a 1:1 mapping between them.
* tests/cksum/cksum-c.sh: Add a test case.
* NEWS: Mention the bug fix.
Paul Eggert [Mon, 6 Oct 2025 20:31:48 +0000 (13:31 -0700)]
rm: make ‘rm -d DIR’ more like ‘rmdir DIR’
* src/remove.c (rm_fts): When not recursive,
arrange for ‘rm -d DIR’ to behave more like ‘rmdir DIR’.
This works better for Ceph snapshot directories.
Problem reported by Yannick Le Pennec (bug#78245).
Collin Funk [Sun, 5 Oct 2025 00:18:01 +0000 (17:18 -0700)]
cksum: allow -a {blake2b,sha2,sha3} --check to work on base64
* NEWS: Mention the bug.
* src/digest.c (split_3): Check that the base64 digest matches the
length supported by the algorithm.
(digest_check): Check that the read digest matches the base64 length of
the algorithm's digest. The previous condition would not work for
'cksum -a blake2b -l 8 ...'.
* tests/cksum/cksum-base64-untagged.sh: New file.
* tests/local.mk (all_tests): Add the new test.
Paul Eggert [Sat, 4 Oct 2025 00:04:01 +0000 (17:04 -0700)]
maint: omit trailing white space in config.h
* configure.ac (FORTIFY_SOURCE): Don’t indent a line
where the indentation can cause trailing white space in config.h.
Problem reported by Grisha Levit (Bug#79567).
Pádraig Brady [Thu, 2 Oct 2025 16:20:39 +0000 (17:20 +0100)]
tests: factor: add suggested large prime tests
* tests/factor/create-test.sh: Add 2 new large primes from:
https://github.com/coreutils/coreutils/issues/65
* tests/local.mk: Reference the 2 new generated tests.
* man/help2man: Relax the option match so more --option
variations are supported, and passed through to convert_option().
Specifically more variations after '=' are now supported.
Also split and document the regular expression.
Reported at https://github.com/coreutils/coreutils/issues/84
fold: move multi-byte character reading to a module
* gl/modules/mbbuf: New file.
* gl/lib/mbbuf.c: Likewise.
* gl/lib/mbbuf.h: Likewise.
* gl/local.mk (EXTRA_DIST): Add the new files.
* bootstrap.conf (gnulib_modules): Add mbbuf.
* src/fold.c: Include mbbuf.h.
(fold_file): Use the mbbuf functions instead of calling fread and
handling the input buffer ourselves.
* cfg.mk (exclude_file_name_regexp--sc_preprocessor_indentation)
(exclude_file_name_regexp--sc_GPL_version): Match gl/lib/mbbuf.c and
gl/lib/mbbuf.h.
* configure.ac: Add detection of AVX512 intrinsics for wc.
* src/local.mk: Build AVX512 wc libraries.
* src/wc.c: Add runtime detection of AVX512 intrinsics and call
appropriate function when detected.
* src/wc.h (wc_lines_avx512): Declare function.
* tests/wc/wc-cpu.sh: Add a test that disables AVX512 intrinsics.
* src/wc_avx512.c: New file containing the wc -l implementation using
AVX512. The logic and code is reused from the AVX2 implementation with
slight adaptations. Replaced __builtin_popcount by __builtin_popcountll
and the combination of _mm256_cmpeq_epi8 and _mm256_movemask_epi8 by a
single call to _mm512_cmpeq_epi8_mask.
* NEWS: Mention the improvement.
Bernhard Voelker [Thu, 25 Sep 2025 20:50:01 +0000 (22:50 +0200)]
join: remove unused #include "argmatch.h"
Prompted by the syntax-check failure:
maint.mk: the above files include argmatch.h but don't use it
make: *** [maint.mk:741: sc_prohibit_argmatch_without_use] Error 1
* src/join.c: Remove include, as the previous commit changed from using
ARRAY_CARDINALITY to countof - for which system.h includes the header.
Hannes Braun [Wed, 24 Sep 2025 19:20:49 +0000 (21:20 +0200)]
tail: fix tailing larger number of lines in regular files
* src/tail.c (file_lines): Seek to the previous block instead of the
beginning (or a little before) of the block that was just scanned.
Otherwise, the same block is read and scanned (at least partially)
again. This bug was introduced by commit v9.7-219-g976f8abc1.
* tests/tail/basic-seek.sh: Add a new test.
* tests/local.mk: Reference the new test.
* NEWS: mention the bug fix.
* src/local.mk: Due to gnulib adjustments, this explicit
dependency is required with the mold linker at least.
Reported at https://github.com/coreutils/coreutils/issues/113
basenc: --base58: fix buffer overflow with input > 15MB
base58_length() operated naively on an int
which resulted in an overflow to a negative number
for any input > 2^31-1/138, i.e. 15,561,475 bytes.
* src/basenc.c (base_length): Change input and output
parameter types from int to idx_t since this needs to
cater for the full input size in the base58 case.
(base58_length): Likewise. Also reorder the calculation
to be less exact, but doing the division first
to minimize the chance of overflow (which now on 64 bit
would only happen for inputs > around 6 Exa bytes).
* tests/basenc/basenc-large.sh: Add a new test,
that triggers with valgrind or ASAN.
* tests/local.mk: Reference the new test.
* NEWS: Mention the bug fix.
* src/cksum.c (pclmul_supported): In the case where
gnulib has pclmul enabled but coreutils does not,
ensure we don't reference the coreutils pclmul function.
Addresses https://bugs.gnu.org/79491
tests: ls: avoid alignment check with non printable characters
* tests/ls/block-size.sh: Skip the case where there are
non-printable characters in ls' output, which is the case
with NBSP thousands separators on FreeBSD 11 and 12.
We may drop the MBSW_REJECT_UNPRINTABLE in future from
ls and numfmt, but for now avoid these characters in the test.
Reported by Bruno Haible.
* tests/du/move-dir-while-traversing.sh: Expand the work to avoid
a false failure where du completes before the directory is moved.
Also expand the timeout to our more standard 10s to avoid the
"directory mover" being killed before du processes the directory.
This doesn't perceptibly impact the run time of the test.
Reported by Bruno Haible on a CentOS 7 system.
* configure.ac: Check for statx using gl_CHECK_FUNCS_ANDROID since it is
hidden for Android API level <= 30.
* m4/jm-macros.m4 (coreutils_MACROS): Check for syncfs using
gl_CHECK_FUNCS_ANDROID since it is hidden for Android API level <= 28.
* tests/fold/fold-zero-width.sh: Check the shell was able to create
the redirection file, as intermittently on CentOS 5,6,7 this wasn't
the case, with the shell giving an xmalloc failure due to the ulimit.
Reported by William Bader and Bruno Haible.