git.ipfire.org Git - thirdparty/coreutils.git/log
5 days ago doc: cut: clarify description of -b and -c
Pádraig Brady [Wed, 11 Mar 2026 22:17:08 +0000 (22:17 +0000)] 
doc: cut: clarify description of -b and -c

* src/cut.c (usage): State the arguments are positions,
in case users may think they were values.

5 days ago build: update to latest gnulib
Pádraig Brady [Sun, 5 Apr 2026 12:04:03 +0000 (13:04 +0100)] 
build: update to latest gnulib

Pick up mbrto{c32,wc} optimizations on UTF-8 on GLIBC.
Note configure.ac defines the required GNULIB_WCHAR_SINGLE_LOCALE.
This speeds up wc -m by 2.6x when processing non-ASCII chars,
and will similarly speed up per character processing
in the impending cut multi-byte implementation.
* NEWS: Mention the wc -m speed improvement.

5 days ago basename: avoid duplicate strlen calls on the suffix
Collin Funk [Sat, 4 Apr 2026 19:44:14 +0000 (12:44 -0700)] 
basename: avoid duplicate strlen calls on the suffix

    $ ltrace -c ./src/basename-prev -s a $(seq 100000) > /dev/null
    % time     seconds  usecs/call     calls      function
    ------ ----------- ----------- --------- --------------------
     50.00   30.030316          75    400000 strlen
    [...]
    $ ltrace -c ./src/basename -s a $(seq 100000) > /dev/null
    % time     seconds  usecs/call     calls      function
    ------ ----------- ----------- --------- --------------------
     42.88   22.413953          74    300001 strlen
    [...]

* src/basename.c (remove_suffix, perform_basename): Add a length
argument for the suffix and use it instead of strlen.
(main): Calculate the suffix length. Refactor code to avoid calling
perform_basename in multiple places.

7 days ago date: simplify -u by not calling putenv
Paul Eggert [Fri, 3 Apr 2026 01:53:34 +0000 (18:53 -0700)] 
date: simplify -u by not calling putenv

* src/date.c (TZSET): Remove; no longer needed.
(main): Simplify -u’s implementation by passing "UTC0" to tzalloc,
rather than by setting TZ in the environment and then calling getenv.
The old way of doing things dates back to before we had tzalloc.
* configure.ac (LOCALTIME_CACHE): Remove; no longer needed.

8 days ago build: update gnulib submodule to latest
Paul Eggert [Wed, 1 Apr 2026 21:44:00 +0000 (14:44 -0700)] 
build: update gnulib submodule to latest

8 days ago maint: avoid sigaction lock overhead
Paul Eggert [Wed, 1 Apr 2026 21:43:40 +0000 (14:43 -0700)] 
maint: avoid sigaction lock overhead

* configure.ac (GNULIB_SIGACTION_SINGLE_THREAD):
Define to avoid unnecessary locking in Gnulib sigaction.  See:
https://lists.gnu.org/r/bug-gnulib/2026-04/msg00008.html

8 days ago maint: avoid Gnulib modules mbiter, mbiterf
Paul Eggert [Wed, 1 Apr 2026 18:56:18 +0000 (11:56 -0700)] 
maint: avoid Gnulib modules mbiter, mbiterf

* bootstrap.conf (avoided_gnulib_modules): Avoid mbiter and
mbiterf, for the same reason we avoid mbuiter and mbuiterf: these
modules are not needed because (due to mcel-prefer) we use mcel in
preference to mbiter/mbiterf/mbuiter/mbuiterf.

8 days ago build: update gnulib submodule to latest
Paul Eggert [Wed, 1 Apr 2026 16:54:50 +0000 (09:54 -0700)] 
build: update gnulib submodule to latest

9 days ago tests: dd: ensure memory exhaustion is handled gracefully
oech3 [Wed, 1 Apr 2026 11:37:09 +0000 (20:37 +0900)] 
tests: dd: ensure memory exhaustion is handled gracefully

* tests/dd/no-allocate.sh: Ensure we exit 1 upon mem allocation failure.
Also check other buffer size edge cases.
https://github.com/uutils/coreutils/issues/11436
https://github.com/uutils/coreutils/issues/11580
https://github.com/coreutils/coreutils/pull/235

9 days ago tests: dd: avoid false failure with no controlling terminal
Pádraig Brady [Wed, 1 Apr 2026 12:23:54 +0000 (13:23 +0100)] 
tests: dd: avoid false failure with no controlling terminal

* tests/dd/misc.sh: test -w /dev/tty is not a strong enough check,
we need to actually open /dev/tty to ensure it's available.
It's not available under setsid for example.
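
The difference between a permission test and a real open can be sketched in
shell. This is only an illustration, assuming a POSIX shell;
can_open_for_write is a hypothetical helper and is not taken from
tests/dd/misc.sh. It is meant for device nodes, since '>>' would create a
missing regular file.

```shell
# Hypothetical helper, not taken from tests/dd/misc.sh.
# Succeeds only if the path can really be opened for writing.
# The open happens in a subshell so a failed redirection cannot
# terminate the calling script, and '>>' avoids truncating anything.
can_open_for_write() {
  ( : >> "$1" ) 2>/dev/null
}

# 'test -w /dev/tty' may succeed on permissions alone, while the open
# fails when there is no controlling terminal (e.g. under setsid).
if can_open_for_write /dev/tty; then
  echo "/dev/tty is usable"
else
  echo "/dev/tty is not usable; skip the test case"
fi
```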

10 days ago tests: dd: check that erroneous seeks are not done in output
oech3 [Tue, 31 Mar 2026 06:57:58 +0000 (15:57 +0900)] 
tests: dd: check that erroneous seeks are not done in output

* tests/dd/misc.sh: Add test case for of=/dev/tty.
The same occurs for /dev/stdout, but that varies
in the test harness so is best avoided.
https://github.com/coreutils/coreutils/pull/234

11 days ago tests: coreutils: ensure empty arg is diagnosed
oech3 [Mon, 30 Mar 2026 09:07:42 +0000 (18:07 +0900)] 
tests: coreutils: ensure empty arg is diagnosed

* tests/misc/coreutils.sh: Add a test case.
https://github.com/coreutils/coreutils/pull/232

12 days ago date: avoid calling putenv multiple times unnecessarily
Collin Funk [Sun, 29 Mar 2026 01:57:49 +0000 (18:57 -0700)] 
date: avoid calling putenv multiple times unnecessarily

Adding environment variables can become quite expensive in some
admittedly unlikely situations.

    $ for i in $(seq 10000); do export A$i=A$i; done
    $ time ./src/date-prev -u $(yes -- -u | head -n 100000)
    Sun Mar 29 01:59:49 AM UTC 2026

    real 0m3.753s
    user 0m3.684s
    sys 0m0.050s
    $ time ./src/date -u $(yes -- -u | head -n 100000)
    Sun Mar 29 02:00:00 AM UTC 2026

    real 0m0.061s
    user 0m0.022s
    sys 0m0.045s

* src/date.c (main): Only add TZ=UTC0 to the environment once.

12 days ago maint: remove unnecessary return statements
Collin Funk [Sat, 28 Mar 2026 19:48:38 +0000 (12:48 -0700)] 
maint: remove unnecessary return statements

* src/env.c (initialize_signals): Remove return at the end of the
function.
* src/who.c (print_runlevel): Likewise.

12 days ago who: avoid locking standard output for each user with the -q option
Collin Funk [Sat, 28 Mar 2026 19:45:14 +0000 (12:45 -0700)] 
who: avoid locking standard output for each user with the -q option

* src/who.c (list_entries_who): Prefer putchar and fputs to printf.
Simplify separator tracking.

13 days ago doc: tty: mention the removal of the -s option from POSIX
Collin Funk [Sat, 28 Mar 2026 05:45:56 +0000 (22:45 -0700)] 
doc: tty: mention the removal of the -s option from POSIX

* doc/coreutils.texi (tty invocation): Mention that POSIX.1-2001 removed
the -s option and that portable scripts can redirect standard output to
/dev/null instead.
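
The portable replacement for 'tty -s' can be sketched as follows.
stdin_is_tty is a hypothetical helper name; the only assumption is the
documented behavior that tty exits with status 0 when standard input is a
terminal and with a non-zero status otherwise.

```shell
# Hypothetical helper: a portable stand-in for 'tty -s'.
# tty exits 0 when standard input is a terminal and non-zero otherwise;
# discarding standard output hides the terminal name or "not a tty".
stdin_is_tty() {
  tty > /dev/null
}

if stdin_is_tty; then
  echo "interactive"
else
  echo "not interactive"
fi
```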

13 days ago tests: env/env.sh: improve portability
oech3 [Thu, 12 Mar 2026 09:33:08 +0000 (18:33 +0900)] 
tests: env/env.sh: improve portability

* tests/env/env.sh: Make more portable by avoiding references to our
build dir, and avoiding names that may cause false matches in
multi-call binaries.
https://github.com/coreutils/coreutils/pull/216

2 weeks ago od: suppress address output on read error
Pádraig Brady [Wed, 25 Mar 2026 17:33:47 +0000 (17:33 +0000)] 
od: suppress address output on read error

We don't output an address for `od missing` or `od --strings .`,
so be consistent and suppress the address for `od .`.

* src/od.c (dump): Only output an address if there were no errors
or the offset is nonzero.

2 weeks ago tests: od: ensure -j1 /dev/null succeeds
oech3 [Wed, 25 Mar 2026 17:08:51 +0000 (02:08 +0900)] 
tests: od: ensure -j1 /dev/null succeeds

Users may be using this to convert bases.

* tests/od/od-j.sh: Add a test case.
https://github.com/coreutils/coreutils/pull/228

2 weeks ago tests: truncate: don't rely on errno being EISDIR
Collin Funk [Tue, 24 Mar 2026 04:44:05 +0000 (21:44 -0700)] 
tests: truncate: don't rely on errno being EISDIR

* tests/truncate/multiple-files.sh: Only check that an error is printed
instead of an exact message.
Reported by Bruno Haible.

2 weeks ago tests: yes: support more zero-copy related syscalls
oech3 [Mon, 23 Mar 2026 13:28:22 +0000 (22:28 +0900)] 
tests: yes: support more zero-copy related syscalls

* tests/misc/yes.sh: Disable other related zero-copy syscalls
to ensure better testing of future or other implementations.
https://github.com/coreutils/coreutils/pull/227

2 weeks ago maint: remove some unnecessary casts
Collin Funk [Tue, 24 Mar 2026 02:32:21 +0000 (19:32 -0700)] 
maint: remove some unnecessary casts

* src/sort.c (begfield, limfield): Remove size_t casts.

2 weeks ago tests: cut: add test for -z with NUL delimiter and -s flag
Sylvestre Ledru [Fri, 20 Mar 2026 14:17:40 +0000 (15:17 +0100)] 
tests: cut: add test for -z with NUL delimiter and -s flag

* tests/cut/cut.pl (zerot-7): New test.
Identified in https://github.com/uutils/coreutils/pull/11394
https://github.com/coreutils/coreutils/pull/226

2 weeks ago tests: tr: add test for invalid character class name
Sylvestre Ledru [Thu, 19 Mar 2026 21:25:14 +0000 (22:25 +0100)] 
tests: tr: add test for invalid character class name

* tests/tr/tr.pl (invalid-class): New test.
Identified in https://github.com/uutils/coreutils/pull/11398
https://github.com/coreutils/coreutils/pull/225

2 weeks ago sort: speed up keyed field sorting significantly using memchr
Chris Down [Mon, 23 Mar 2026 07:55:53 +0000 (15:55 +0800)] 
sort: speed up keyed field sorting significantly using memchr

When sort is invoked with an explicit field separator with `-t SEP`,
begfield() and limfield() scan for the separator to locate boundaries.
Right now the implementation there uses a loop that iterates over bytes
one by one, which is not ideal since we must scan past many bytes of
non-separator data one byte at a time.
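
As a small illustration of the kind of invocation affected (made-up input,
assuming GNU sort): '-t,' sets the separator and '-k3,3n' sorts numerically
on the third field, so sort has to scan past the first two fields of every
line to locate the key. The function names here are hypothetical.

```shell
# Made-up sample data; the third comma-separated field is the sort key.
sample() {
  printf '%s\n' 'aaaa,bbbb,30' 'cccccccc,dddddddd,4' 'ee,ff,200'
}

# Sort numerically on field 3 only, using ',' as the field separator.
sort_by_third_field() {
  LC_ALL=C sort -t, -k3,3n
}

sample | sort_by_third_field
```

The longer the leading fields, the more bytes are skipped per line before
the key is reached, which is where the scanning cost comes from.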

Let's replace each of these loops with memchr(). On glibc systems,
memchr() uses SIMD to scan 16 bytes per step (NEON on aarch64) or 32
bytes per step (AVX2 on x86_64), rather than 1 byte at a time, so any
field longer than a handful of bytes stands to benefit quite
significantly.

Using the following input data:

  awk 'BEGIN {
      srand(42)
      for (i = 1; i <= 500000; i++)
          printf "%*d,%*d,%d\n", 4+int(rand()*9), 0,
                                 4+int(rand()*9), 0, int(rand()*10000)
  }' > short_csv_500k

  awk 'BEGIN {
      for (i = 1; i <= 500000; i++)
          printf "%100d,%100d,%d\n", 0, 0, int(rand()*10000)
  }' > wide_csv_500k

One can benchmark with:

  hyperfine --warmup 10 --runs 50 \
    "LC_ALL=C sort_before -t, -k3,3n short_csv_500k > /dev/null" \
    "LC_ALL=C sort_after -t, -k3,3n short_csv_500k > /dev/null"

  hyperfine --warmup 10 --runs 50 \
    "LC_ALL=C sort_before -t, -k3,3n wide_csv_500k > /dev/null" \
    "LC_ALL=C sort_after -t, -k3,3n wide_csv_500k > /dev/null"

  hyperfine --warmup 10 --runs 50 \
    "LC_ALL=C sort_before wide_csv_500k > /dev/null" \
    "LC_ALL=C sort_after wide_csv_500k > /dev/null"

The results on i9-14900HX x86_64 with -O2:

  sort -t, -k3,3n (500K lines, 4-12 byte short fields):
    Before: 123.1 ms    After: 108.1 ms    (-12.2%)

  sort -t, -k3,3n (500K lines, 100 byte wide fields):
    Before: 243.5 ms    After: 165.9 ms    (-31.9%)

  sort (default, no -k, 500K lines):
    Before: 141.6 ms    After: 141.8 ms    (unchanged)

And on M1 Pro aarch64 with -O2:

  sort -t, -k3,3n (500K lines, 4-12 byte short fields):
    Before: 98.0 ms     After: 92.3 ms     (-5.8%)

  sort -t, -k3,3n (500K lines, 100 byte wide fields):
    Before: 240.8 ms    After: 183.0 ms    (-24.0%)

  sort (default, no -k, 500K lines):
    Before: 145.6 ms    After: 145.6 ms    (unchanged)

Looking at profiling, the improvement is larger on x86_64 in these runs
because glibc's memchr uses AVX2 to scan 32 bytes per step versus 16
bytes per step with NEON on aarch64.

2 weeks ago maint: fix an incomplete sentence
Collin Funk [Sun, 22 Mar 2026 20:32:52 +0000 (13:32 -0700)] 
maint: fix an incomplete sentence

* tests/pwd/argument.sh: Fix the test description.
Reported by G. Branden Robinson.

2 weeks ago tests: pwd: test the behavior when given an argument
Collin Funk [Sun, 22 Mar 2026 05:16:59 +0000 (22:16 -0700)] 
tests: pwd: test the behavior when given an argument

* tests/pwd/argument.sh: New file.
* tests/local.mk (all_tests): Add the new test.

2 weeks ago tac: avoid unnecessary standard output buffering
Collin Funk [Sat, 21 Mar 2026 22:36:46 +0000 (15:36 -0700)] 
tac: avoid unnecessary standard output buffering

This removes a tiny amount of overhead:

    $ seq 10000000 > input
    $ perf stat -e cpu-clock --repeat 1000 taskset 1 ./src/tac \
        input 2>&1 > /dev/null | grep -F 'seconds time'
              0.095707 +- 0.000223 seconds time elapsed  ( +-  0.23% )
    $ perf stat -e cpu-clock --repeat 1000 taskset 1 ./src/tac-prev \
        input 2>&1 > /dev/null | grep -F 'seconds time'
             0.1009378 +- 0.0000995 seconds time elapsed  ( +-  0.10% )

* src/tac.c (output): Use full_write instead of fwrite since we already
buffer the output ourselves.

2 weeks ago tests: rm: fix a test that would sometimes hang
Collin Funk [Sat, 21 Mar 2026 19:19:21 +0000 (12:19 -0700)] 
tests: rm: fix a test that would sometimes hang

* tests/rm/dash-hint.sh: Add the file name argument to grep, as I
intended when adding this test.

2 weeks ago tac: promptly diagnose write errors
Collin Funk [Sat, 21 Mar 2026 08:07:28 +0000 (01:07 -0700)] 
tac: promptly diagnose write errors

This patch also fixes a bug where 'tac' would print a vague error on
some inputs:

    $ seq 10000 | ./src/tac-prev > /dev/full
    tac-prev: write error
    $ seq 10000 | ./src/tac > /dev/full
    tac: write error: No space left on device

In this case ferror (stdout) is true, but errno has been set back to
zero by a successful fclose (stdout) call.
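
A rough shell-level check for this class of problem, assuming a Linux system
where /dev/full exists. reports_write_error is a hypothetical helper, loosely
in the spirit of tests/misc/io-errors.sh rather than a copy of it: it runs a
command with standard output on /dev/full and requires both a failing exit
status and the ENOSPC text in the diagnostic.

```shell
# Hypothetical helper; assumes /dev/full (Linux) and C-locale messages.
# Runs "$@" with stdout on /dev/full and checks that the command both
# fails and names the specific error on stderr.
reports_write_error() {
  err=$(LC_ALL=C "$@" 2>&1 > /dev/full) && return 1
  case $err in
    *'No space left on device'*) return 0 ;;
    *) return 1 ;;
  esac
}

if printf 'data\n' | reports_write_error cat; then
  echo "cat: specific write error reported"
fi
```

A command that only prints a bare "write error" would fail this check even
though its exit status is non-zero.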

* src/tac.c (output): Call write_error() if fwrite fails.
* tests/misc/io-errors.sh: Check that 'tac' prints a detailed write
error.
* NEWS: Mention the improvement.

2 weeks ago tests: support checking for specific write errors
Pádraig Brady [Sat, 21 Mar 2026 12:37:20 +0000 (12:37 +0000)] 
tests: support checking for specific write errors

* tests/misc/io-errors.sh: Support checking for a specific error
in commands that don't run indefinitely.  Currently all the explicitly
listed commands output a specific error and do not need to be tagged.

2 weeks ago tests: nl: check that all files are processed
Collin Funk [Sat, 21 Mar 2026 02:46:04 +0000 (19:46 -0700)] 
tests: nl: check that all files are processed

* tests/nl/multiple-files.sh: New file.
* tests/local.mk (all_tests): Add the new test.

3 weeks ago test: truncate: improve the test added in the previous commit
Collin Funk [Fri, 20 Mar 2026 06:14:52 +0000 (23:14 -0700)] 
test: truncate: improve the test added in the previous commit

* tests/truncate/multiple-files.sh: Check that nothing is printed to
standard output and that standard error has the correct error.

3 weeks ago tests: truncate: check that all files are processed
Collin Funk [Fri, 20 Mar 2026 05:54:42 +0000 (22:54 -0700)] 
tests: truncate: check that all files are processed

* tests/truncate/multiple-files.sh: New file.
* tests/local.mk (all_tests): Add the new test.

3 weeks ago sort,split,yes: ensure pipe and pipe2 don't open standard descriptors
Collin Funk [Wed, 18 Mar 2026 06:06:16 +0000 (23:06 -0700)] 
sort,split,yes: ensure pipe and pipe2 don't open standard descriptors

* bootstrap.conf (gnulib_modules): Add pipe2-safer.
* cfg.mk (sc_require_unistd_safer): New rule for 'make syntax-check'.
* gl/lib/fd-reopen.c: Include unistd--.h instead of unistd.h.
* src/sort.c: Include unistd--.h.
* src/split.c: Likewise.
* src/yes.c: Likewise.

3 weeks ago tests: dd: fix false failure on NetBSD 10
Pádraig Brady [Mon, 16 Mar 2026 22:34:58 +0000 (22:34 +0000)] 
tests: dd: fix false failure on NetBSD 10

* tests/dd/partial-write.sh: Skip the test if
nothing written at all, as was seen on NetBSD 10.
Reported by Bruno Haible.

3 weeks ago tests: ls: fix false failure on FreeBSD
Pádraig Brady [Mon, 16 Mar 2026 22:25:42 +0000 (22:25 +0000)] 
tests: ls: fix false failure on FreeBSD

* tests/ls/non-utf8-hidden.sh: Avoid sorting in ls, to avoid:
ls: cannot compare file names ...: Illegal byte sequence
seen on FreeBSD 14.
Reported by Bruno Haible.

3 weeks ago maint: tee: remove an affirm call to silence coverity
Collin Funk [Mon, 16 Mar 2026 22:04:24 +0000 (15:04 -0700)] 
maint: tee: remove an affirm call to silence coverity

* src/iopoll.c (write_wait): Don't check that an unsigned integer is
always greater than or equal to zero since that is always true.

3 weeks ago wc: make sure input buffer for neon 'wc -l' is aligned
Collin Funk [Sun, 15 Mar 2026 04:04:12 +0000 (21:04 -0700)] 
wc: make sure input buffer for neon 'wc -l' is aligned

* src/wc_neon.c (wc_lines_neon): Use alignas.

3 weeks ago tee: prefer file descriptors over streams
Collin Funk [Sun, 15 Mar 2026 03:21:53 +0000 (20:21 -0700)] 
tee: prefer file descriptors over streams

We disable buffering on the streams anyways, so we were effectively
calling the write system call previously despite using streams.

* src/iopoll.h (fclose_wait, fwrite_wait): Remove declarations.
(close_wait, write_wait): Add declarations.
* src/iopoll.c (fwait_for_nonblocking_write, fclose_wait, fwrite_wait):
Remove functions.
(wait_for_nonblocking_write): New function based on
fwait_for_nonblocking_write.
(close_wait): New function based on fclose_wait.
(write_wait): New function based on fwrite_wait.
* src/tee.c: Include fcntl--.h. Don't include stdio--.h.
(get_next_out): Operate on file descriptors instead of streams.
(fail_output): Likewise. Remove clearerr call since we no longer call
fwrite on stdout.
(tee_files): Operate on file descriptors instead of streams. Remove
calls to setvbuf.

3 weeks ago timeout: don't exit immediately if the parent is the init process
Collin Funk [Sat, 14 Mar 2026 03:37:10 +0000 (20:37 -0700)] 
timeout: don't exit immediately if the parent is the init process

* src/timeout.c (main): Save the process ID before creating a child
process. Check if the result of getppid is different than the saved
process ID instead of checking if it is 1.
* tests/timeout/init-parent.sh: New file.
* tests/local.mk (all_tests): Add the new test.
* NEWS: Mention the bug fix. Also mention that this change allows
'timeout' to work when reparented by a subreaper process instead of
init.

4 weeks ago doc: fix missing '=' in texi option descriptions
Pádraig Brady [Fri, 13 Mar 2026 10:27:40 +0000 (10:27 +0000)] 
doc: fix missing '=' in texi option descriptions

* doc/coreutils.texi (cut invocation, fold invocation):
Fix missing '=' before option parameters.

4 weeks ago dd: always diagnose partial writes on write failure
Pádraig Brady [Wed, 11 Mar 2026 15:39:20 +0000 (15:39 +0000)] 
dd: always diagnose partial writes on write failure

* src/dd.c (dd_copy): Increment the partial write count upon failure.
* tests/dd/partial-write.sh: Add a new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/80583

4 weeks ago doc: clarify a recent NEWS item
Pádraig Brady [Wed, 11 Mar 2026 15:57:22 +0000 (15:57 +0000)] 
doc: clarify a recent NEWS item

* NEWS: It was ambiguous as to whether we quoted a range of
observed throughputs.  Clarify this was the old and new
throughput on a single test system.

4 weeks ago doc: NEWS: adjust 'wc -l' aarch64 benchmark after recent commit
Collin Funk [Wed, 11 Mar 2026 06:08:57 +0000 (23:08 -0700)] 
doc: NEWS: adjust 'wc -l' aarch64 benchmark after recent commit

After commit e0190a9d1 (wc: improve aarch64 Neon optimization for
'wc -l', 2026-03-09), on an Ampere eMAG machine:

    $ yes | head -n 10000000000 > input
    $ (time ./src/wc -l input)
    10000000000 input

    real 0m3.447s
    user 0m1.533s
    sys 0m1.913s
    $ (export GLIBC_TUNABLES='glibc.cpu.hwcaps=-ASIMD,-AVX2,-AVX512F'; \
       time ./src/wc -l input)
    10000000000 input

    real 0m15.758s
    user 0m14.039s
    sys 0m1.720s

* NEWS: Mention the improved benchmark.

4 weeks ago tests: rm: check for hints when running 'rm -foo'
Collin Funk [Tue, 10 Mar 2026 07:08:12 +0000 (00:08 -0700)] 
tests: rm: check for hints when running 'rm -foo'

* tests/rm/dash-hint.sh: New file.
* tests/local.mk (all_tests): Add the new test.

4 weeks ago maint: adjust to placate coverity
Pádraig Brady [Tue, 10 Mar 2026 20:14:42 +0000 (20:14 +0000)] 
maint: adjust to placate coverity

* src/system.h (c32issep): Adjust to more standard layout.

4 weeks ago yes: use a zero-copy implementation via (vm)splice
Pádraig Brady [Sat, 7 Mar 2026 14:23:38 +0000 (14:23 +0000)] 
yes: use a zero-copy implementation via (vm)splice

A good reference for the concepts used here is:
https://mazzo.li/posts/fast-pipes.html
We don't consider huge pages or busy loops here,
but use vmsplice(), and splice() to get significant speedups:

  i7-5600U-laptop $ taskset 1 yes | taskset 2 pv > /dev/null
  ... [4.98GiB/s]
  i7-5600U-laptop $ taskset 1 src/yes | taskset 2 pv > /dev/null
  ... [34.1GiB/s]

  IBM,9043-MRX $ taskset 1 yes | taskset 2 pv > /dev/null
  ... [11.6GiB/s]
  IBM,9043-MRX $ taskset 1 src/yes | taskset 2 pv > /dev/null
  ... [175GiB/s]

Also throughput to file (on BTRFS) was seen to increase significantly,
with a Fedora 43 laptop improving from 690MiB/s to 1.1GiB/s.

* bootstrap.conf: Ensure sys/uio.h is present.
This was an existing transitive dependency.
* m4/jm-macros.m4: Define HAVE_SPLICE appropriately.
We assume vmsplice() is available if splice() is as they
were introduced at the same time to Linux and glibc.
* src/yes.c (repeat_pattern): A new function to efficiently
duplicate a pattern in a buffer with memcpy calls that double in size.
This also makes the setup for the existing write() path more efficient.
(pipe_splice_size): A new function to increase the kernel pipe buffer
if possible, and use an appropriately sized buffer based on that (25%).
(splice_write): A new function to call vmsplice() when outputting
to a pipe, and also splice() if outputting to a non-pipe.
(main): Adjust to always calling write on the minimal buffer first,
then trying vmsplice(), then falling back to write from a bigger buffer.
* tests/misc/yes.sh: Verify the non-pipe output case,
and the vmsplice() fallback to write() case.
* NEWS: Mention the improvement.

4 weeks ago all: use more consistent blank character determination
Pádraig Brady [Mon, 9 Mar 2026 22:23:12 +0000 (22:23 +0000)] 
all: use more consistent blank character determination

* src/system.h (c32issep): A new function that is essentially
iswblank() on GLIBC platforms, and iswspace() with exceptions elsewhere.
* src/expand.c: Use it instead of c32isblank().
* src/fold.c: Likewise.
* src/join.c: Likewise.
* src/numfmt.c: Likewise.
* src/unexpand.c: Likewise.
* src/uniq.c: Likewise.
* NEWS: Mention the improvement.

4 weeks ago cksum: fix tagged output on 32 bit platforms
Pádraig Brady [Tue, 10 Mar 2026 14:47:25 +0000 (14:47 +0000)] 
cksum: fix tagged output on 32 bit platforms

Fix an unreleased issue due to the recent change
to using idx_t in commit v9.10-91-g02983e493

* src/cksum.c (output_file): Cast the idx_t before passing to printf.

4 weeks ago wc: improve aarch64 Neon optimization for 'wc -l'
Collin Funk [Tue, 10 Mar 2026 02:32:27 +0000 (19:32 -0700)] 
wc: improve aarch64 Neon optimization for 'wc -l'

    $ yes abcdefghijklmnopqrstuvwxyz | head -n 200000000 > input
    $ time ./src/wc-prev -l input
    200000000 input

    real 0m1.240s
    user 0m0.456s
    sys 0m0.784s
    $ time ./src/wc -l input
    200000000 input

    real 0m0.936s
    user 0m0.141s
    sys 0m0.795s

* configure.ac: Use unsigned char for the buffer to avoid potential
compiler warnings. Check for the functions being used in src/wc_neon.c
after this patch.
* src/wc_neon.c (wc_lines_neon): Use vreinterpretq_s8_u8 to convert 0xff
into -1 instead of using bitwise AND instructions to convert it into 1.
Perform the pairwise addition and lane extraction once every 8192 bytes
instead of once every 64 bytes.
Thanks to Lasse Collin for spotting this and reviewing a draft of this
patch.

4 weeks ago tests: expand: fix false failure on various systems
Pádraig Brady [Mon, 9 Mar 2026 21:01:27 +0000 (21:01 +0000)] 
tests: expand: fix false failure on various systems

* tests/expand/mb.sh: Use $LOCALE_FR_UTF8 rather than
hardcoding "en_US.UTF-8".
* tests/unexpand/mb.sh: Likewise.
Reported by Bruno Haible.

4 weeks ago build: update to latest gnulib
Pádraig Brady [Mon, 9 Mar 2026 13:14:54 +0000 (13:14 +0000)] 
build: update to latest gnulib

* src/ls.c: Adjust for renamed acl permissions member.

4 weeks ago maint: remove duplicate names from THANKS
Collin Funk [Sun, 8 Mar 2026 01:08:13 +0000 (17:08 -0800)] 
maint: remove duplicate names from THANKS

* .mailmap: Prefer the most recently used email address from each commit
author.

4 weeks ago maint: prefer memset_explicit to explicit_bzero
Collin Funk [Sun, 8 Mar 2026 00:16:01 +0000 (16:16 -0800)] 
maint: prefer memset_explicit to explicit_bzero

The explicit_bzero function is a common extension, but memset_explicit
was standardized in C23. It will likely become more portable in the
future, and Gnulib provides an implementation if needed.

* bootstrap.conf (gnulib_modules): Add memset_explicit. Remove
explicit_bzero.
* gl/lib/randint.c (randint_free): Use memset_explicit instead of
explicit_bzero.
* gl/lib/randread.c (randread_free_body): Likewise.

4 weeks ago expand,unexpand: support multi-byte input
Lukáš Zaoral [Fri, 6 Mar 2026 14:13:17 +0000 (14:13 +0000)] 
expand,unexpand: support multi-byte input

* src/expand.c: Use mbbuf to support multi-byte input.
* src/unexpand.c: Likewise.
* tests/expand/mb.sh: New multi-byte test.
* tests/unexpand/mb.sh: Likewise.
* tests/local.mk: Reference new tests.
* NEWS: Mention the improvement.

4 weeks ago maint: shred: fix typo in comment
Weixie Cui [Sat, 7 Mar 2026 02:01:17 +0000 (10:01 +0800)] 
maint: shred: fix typo in comment

* src/shred.c: Fix "then" -> "than" in comment.

5 weeks ago maint: dd: fix typo in comment
Weixie Cui [Fri, 6 Mar 2026 13:05:55 +0000 (21:05 +0800)] 
maint: dd: fix typo in comment

* src/dd.c: Fix "that that" -> "that the" in comment.

5 weeks ago build: update gnulib submodule to latest
Collin Funk [Fri, 6 Mar 2026 09:09:45 +0000 (01:09 -0800)] 
build: update gnulib submodule to latest

5 weeks ago build: update gnulib submodule to latest
Collin Funk [Fri, 6 Mar 2026 06:24:38 +0000 (22:24 -0800)] 
build: update gnulib submodule to latest

5 weeks ago maint: touch: reduce variable scope
Collin Funk [Thu, 5 Mar 2026 07:40:03 +0000 (23:40 -0800)] 
maint: touch: reduce variable scope

* src/touch.c (main): Declare variables where they are used instead of
at the start of the function.

5 weeks ago maint: chown,chgrp: reduce variable scope
Collin Funk [Thu, 5 Mar 2026 07:34:45 +0000 (23:34 -0800)] 
maint: chown,chgrp: reduce variable scope

* src/chown-core.c (describe_change, restricted_chown)
(change_file_owner, chown_files): Declare variables where they are used
instead of at the start of the function.
* src/chown.c (main): Likewise.

5 weeks ago install: allow the combination of --compare and --preserve-timestamps
Collin Funk [Sun, 1 Mar 2026 23:31:28 +0000 (15:31 -0800)] 
install: allow the combination of --compare and --preserve-timestamps

* NEWS: Mention the improvement.
* src/install.c (enum copy_status): New type to let the caller know if
the copy was performed or skipped.
(copy_file): Return the new type instead of bool. Reduce variable scope.
(install_file_in_file): Only strip the file if the copy was
performed. Update the timestamps if the copy was skipped.
(main): Don't error when --compare and --preserve-timestamps are
combined.
* tests/install/install-C.sh: Add some test cases.

5 weeks ago cksum: use more defensive escaping for --check
Pádraig Brady [Sat, 28 Feb 2026 11:09:26 +0000 (11:09 +0000)] 
cksum: use more defensive escaping for --check

cksum --check is often the first interaction
users have with possibly untrusted downloads, so we should try
to be as defensive as possible when processing it.

Specifically we currently only escape \n characters in file names
presented in checksum files being parsed with cksum --check.
This gives some possibility of dumping arbitrary data to the terminal
when checking downloads from an untrusted source.
This change gives these advantages:

  1. Avoids dumping arbitrary data to vulnerable terminals
  2. Avoids visual deception with ANSI codes hiding checksum failures
  3. More secure if users copy and paste file names from --check output
  4. Simplifies programmatic parsing

Note this changes programmatic parsing, but given the original
format was so awkward to parse, I expect that's extremely rare.
I was not able to find an example in the wild at least.
To parse the new format from the shell, you can do something like:

  cksum -c checksums | while IFS= read -r line; do
    case $line in
      *': FAILED')
        filename=$(eval "printf '%s' ${line%: FAILED}")
        cp -v "$filename" /quarantine
        ;;
    esac
  done

This change also slightly reduces the size of the sum(1) utility.
This change also applies to md5sum, sha*sum, and b2sum.

* src/cksum.c (digest_check): Call quotef() instead of
cksum(1) specific quoting.
* tests/cksum/md5sum-bsd.sh: Adjust accordingly.
* doc/coreutils.texi (cksum general options): Describe the
shell quoting used for problematic file names.
* NEWS: Mention the change in behavior.
Reported by: Aaron Rainbolt

5 weeks ago maint: tests: refactor uses of bad_unicode()
Pádraig Brady [Wed, 4 Mar 2026 17:57:54 +0000 (17:57 +0000)] 
maint: tests: refactor uses of bad_unicode()

* init.cfg: Use 0xFF rather than 0xC3 everywhere.
* tests/fold/fold-characters.sh: Reuse bad_unicode().
* tests/tac/tac-locale.sh: Likewise.

5 weeks ago fold: fix output truncation with 0xFF bytes in input
Pádraig Brady [Wed, 4 Mar 2026 16:56:48 +0000 (16:56 +0000)] 
fold: fix output truncation with 0xFF bytes in input

On signed char platforms, 0xFF was converted to -1
which matches MBBUF_EOF, causing fold to stop processing.
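
The symptom can be checked from the shell by counting bytes in and out.
This is only a sketch, assuming a printf that supports octal escapes;
folded_byte_count is a hypothetical helper and is not part of
tests/fold/fold-characters.sh.

```shell
# Hypothetical helper: count the bytes that fold emits for its stdin.
folded_byte_count() {
  fold | wc -c | tr -d ' '
}

# 'a', a raw 0xFF byte, 'b', newline: four bytes in.
# With the fix, four bytes should come out; a build with the bug
# would stop at the 0xFF byte and report fewer.
printf 'a\377b\n' | folded_byte_count
```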

* NEWS: Mention the bug fix.
* gl/lib/mbbuf.h: Avoid sign extension on signed char platforms.
* tests/fold/fold-characters.sh: Adjust test case.
Reported at https://src.fedoraproject.org/rpms/coreutils/pull-request/20

5 weeks ago tests: date: add timezone conversion test
Sylvestre Ledru [Sat, 14 Feb 2026 19:08:12 +0000 (20:08 +0100)] 
tests: date: add timezone conversion test

* tests/date/date.pl: Add the test case.
Add test case for https://github.com/uutils/coreutils/issues/10800
to verify `date -u -d '10:30 UTC-05'` converts to 15:30 UTC.
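
The conversion being tested can be reproduced by hand. A minimal sketch,
assuming GNU date (for the -d option); utc_hhmm is a hypothetical wrapper
and the +%H:%M format is chosen here only to keep the output short.

```shell
# Hypothetical wrapper; assumes GNU date (for the -d option).
# Prints the UTC hour and minute for a time given with its own zone.
utc_hhmm() {
  date -u -d "$1" +%H:%M
}

# 10:30 in a zone five hours behind UTC is 15:30 UTC.
utc_hhmm '10:30 UTC-05'
```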

5 weeks ago tests: date: add edge cases for modifiers
Sylvestre Ledru [Fri, 27 Feb 2026 08:16:00 +0000 (09:16 +0100)] 
tests: date: add edge cases for modifiers

* tests/date/date.pl: Add the test case.
Add test cases for https://github.com/uutils/coreutils/issues/10957

5 weeks ago tests: cut: add test case for newline delimiter with -s flag
Sylvestre Ledru [Wed, 4 Mar 2026 10:57:10 +0000 (11:57 +0100)] 
tests: cut: add test case for newline delimiter with -s flag

* tests/cut/cut.pl: Add a new test case.
https://github.com/coreutils/coreutils/pull/211

5 weeks ago tests: mktemp: ensure mktemp does not depend on getrandom and ASLR
oech3 [Sat, 28 Feb 2026 01:42:01 +0000 (10:42 +0900)] 
tests: mktemp: ensure mktemp does not depend on getrandom and ASLR

* tests/mktemp/mktemp-misc.sh: Add new test.
* tests/local.mk: Reference new test.
https://github.com/coreutils/coreutils/pull/206

5 weeks ago maint: tests: decouple debug output determination
Pádraig Brady [Tue, 3 Mar 2026 11:33:27 +0000 (11:33 +0000)] 
maint: tests: decouple debug output determination

* tests/misc/warning-errors.sh: Simply check there is output
to stderr before checking that output induces an error.

5 weeks ago tests: avoid false test failure when using address sanitizer
Collin Funk [Tue, 3 Mar 2026 05:43:22 +0000 (21:43 -0800)] 
tests: avoid false test failure when using address sanitizer

* tests/misc/warning-errors.sh: Skip commands which have been built with
sanitizers, since standard error will not be closed and checked for
errors.
Reported by Bruno Haible.

5 weeks ago tests: avoid failure on systems without an optimized 'cksum' or 'wc -l'
Collin Funk [Tue, 3 Mar 2026 06:16:21 +0000 (22:16 -0800)] 
tests: avoid failure on systems without an optimized 'cksum' or 'wc -l'

* tests/misc/warning-errors.sh: Expect 'wc' and 'cksum' to exit
successfully if there is not an optimized 'wc -l' implementation or
CRC32 implementation.
Reported by Bruno Haible.

5 weeks agotests: shuf: ensure memory exhaustion is handled gracefully
oech3 [Mon, 2 Mar 2026 11:56:23 +0000 (11:56 +0000)] 
tests: shuf: ensure memory exhaustion is handled gracefully

* tests/shuf/shuf.sh: Ensure we exit 1 upon failure
to allocate memory.
https://github.com/uutils/coreutils/issues/11170
https://github.com/coreutils/coreutils/pull/209

5 weeks agotest: cp: add test for non-UTF8 directory names
Sylvestre Ledru [Sat, 28 Feb 2026 08:16:57 +0000 (09:16 +0100)] 
test: cp: add test for non-UTF8 directory names

Missing test identified here:
 https://github.com/uutils/coreutils/pull/11148

* tests/cp/non-utf8-name.sh: Add a new test to cover this case.
https://github.com/coreutils/coreutils/pull/207

5 weeks agodu: fflush after outputting a line
Paul Eggert [Sun, 1 Mar 2026 03:07:52 +0000 (19:07 -0800)] 
du: fflush after outputting a line

* src/du.c (print_size): Resurrect the fflush call, since there
can be significant delay between output lines.

5 weeks agotests: wc,du: add additional --files0-from test cases
Collin Funk [Sun, 1 Mar 2026 02:36:34 +0000 (18:36 -0800)] 
tests: wc,du: add additional --files0-from test cases

* tests/wc/wc-files0-from.pl ($limits): New variable.
(@Tests): Prefer the error strings from getlimits over writing them by
hand. Add test cases for --files0-from listing missing files and
duplicate files.
* tests/du/files0-from.pl ($limits): New variable.
(@Tests): Prefer the error strings from getlimits over writing them by
hand. Add test cases for --files0-from listing missing files. Add tests
for --files0-from listing duplicate files with and without the -l option
also in use.

5 weeks agobuild: update gnulib submodule to latest
Collin Funk [Sat, 28 Feb 2026 22:23:26 +0000 (14:23 -0800)] 
build: update gnulib submodule to latest

* po/POTFILES.in: Remove recently added lib/cygpath.c dependency after
gnulib commit 2a893de047 (filesystem-remote: New module., 2026-02-28).

5 weeks agotests: ls: treat invalid UTF-8 paths starting with a dot as hidden
Sylvestre Ledru [Fri, 27 Feb 2026 08:23:17 +0000 (09:23 +0100)] 
tests: ls: treat invalid UTF-8 paths starting with a dot as hidden

* tests/ls/non-utf8-hidden.sh: Add the test case.
https://github.com/uutils/coreutils/pull/11135
https://github.com/coreutils/coreutils/pull/202

5 weeks agotests: ln: verify that -f and -i override each other
Sylvestre Ledru [Thu, 26 Feb 2026 20:58:51 +0000 (21:58 +0100)] 
tests: ln: verify that -f and -i override each other

Identified here:
<https://github.com/uutils/coreutils/pull/11129>

* tests/ln/misc.sh: Add the check.
https://github.com/coreutils/coreutils/pull/199

5 weeks agotest: ln: verify backup suffix path traversal prevention
Sylvestre Ledru [Sat, 28 Feb 2026 08:31:48 +0000 (09:31 +0100)] 
test: ln: verify backup suffix path traversal prevention

Missing test detected thanks to:
https://github.com/uutils/coreutils/pull/11149

* tests/ln/backup-suffix-traversal.sh: Add a test.
https://github.com/coreutils/coreutils/pull/208

5 weeks agomaint: fix typo in previous test
Pádraig Brady [Sat, 28 Feb 2026 16:32:50 +0000 (16:32 +0000)] 
maint: fix typo in previous test

* tests/shuf/shuf.sh: Use the non-varying $ret rather than $?.

5 weeks agotests: shuf: ensure we handle unsupported getrandom syscall gracefully
oech3 [Fri, 27 Feb 2026 22:15:48 +0000 (07:15 +0900)] 
tests: shuf: ensure we handle unsupported getrandom syscall gracefully

* tests/shuf/shuf.sh: Check we fail normally or succeed where
the getrandom syscall is not available.
https://github.com/coreutils/coreutils/pull/205

5 weeks agobuild: update gnulib to latest
Pádraig Brady [Sat, 28 Feb 2026 11:10:57 +0000 (11:10 +0000)] 
build: update gnulib to latest

* NEWS: Mention the more encompassing remoteness check for df.
* po/POTFILES.in: Add new lib/cygpath.c dependency.

5 weeks agodu: avoid locking and flushing standard output
Collin Funk [Sat, 28 Feb 2026 06:18:48 +0000 (22:18 -0800)] 
du: avoid locking and flushing standard output

This results in a noticeable increase in performance:

    $ yes /dev/null | head -n 10000000 | tr '\n' '\0' \
        | time --format=%E ./src/du-prev -l --files0-from=- > /dev/null
    0:20.40
    $ yes /dev/null | head -n 10000000 | tr '\n' '\0' \
        | time --format=%E ./src/du -l --files0-from=- > /dev/null
    0:16.57

* src/du.c (print_size): Prefer putchar and fputs which may be unlocked
unlike printf. Prefer ferror to fflush.
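
The pattern described above can be sketched roughly as follows. This is not the code in src/du.c: the function name print_entry, its parameters, and the use of an explicit FILE * (rather than standard output via putchar) are assumptions made so the sketch is self-contained. It writes one line with fputs and putc, then tests the stream's error flag with ferror instead of forcing a flush. Note that a later commit in this log restores an fflush call in print_size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: write "SIZE<TAB>NAME<EOL>" to OUT using fputs
   and putc rather than printf, then report whether the stream's error
   indicator is set.  ferror only inspects the flag; unlike fflush it
   does not force buffered data to be written.  */
static bool
print_entry (FILE *out, char const *size, char const *name, char eol)
{
  assert (out && size && name);
  fputs (size, out);
  putc ('\t', out);
  fputs (name, out);
  putc (eol, out);
  return !ferror (out);
}
```

The trade-off in this sketch is that a write error is noticed only once the buffer is actually written out, so the check is cheap but not prompt.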

5 weeks agostat: handle %%%N too
Paul Eggert [Sat, 28 Feb 2026 00:49:08 +0000 (16:49 -0800)] 
stat: handle %%%N too

* src/stat.c (main): Fix incorrect counting of '%'s before 'N'.
* tests/stat/stat-fmt.sh: Test for the bug.
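
To illustrate why the number of '%' characters before 'N' matters, here is a minimal sketch, not the code in src/stat.c. The function name format_uses_N is hypothetical, and the sketch assumes a simplified format syntax with no flags or field widths between '%' and the conversion character. It consumes the format two characters at a time after each '%', so "%%" is a literal percent sign, "%%N" contains no %N directive, and "%%%N" does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if FMT contains a %N directive,
   treating "%%" as a literal percent sign.  Simplified: flags and
   widths between '%' and the conversion character are not handled.  */
static bool
format_uses_N (char const *fmt)
{
  assert (fmt != NULL);
  for (char const *p = fmt; *p; p++)
    {
      if (*p != '%')
        continue;
      p++;                      /* Character following the '%'.  */
      if (*p == '\0')
        break;                  /* Trailing lone '%'.  */
      if (*p == 'N')
        return true;
      /* Otherwise "%%" or some other directive: skip it.  */
    }
  return false;
}
```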

5 weeks agoid: avoid unnecessary buffer flushing
Paul Eggert [Sat, 28 Feb 2026 00:17:27 +0000 (16:17 -0800)] 
id: avoid unnecessary buffer flushing

* src/groups.c (main):
* src/id.c (main, print_stuff):
Don’t flush stdout before testing for write error.
Do the test only when in a loop, as a one-shot will
test for write error soon anyway.

5 weeks agocksum: prefer signed int
Paul Eggert [Sat, 28 Feb 2026 00:17:27 +0000 (16:17 -0800)] 
cksum: prefer signed int

* src/cksum.c (min_digest_line_length, digest_hex_bytes)
(digest_length, md5_sum_stream, sha1_sum_stream)
(sha224_sum_stream, sha256_sum_stream, sha384_sum_stream)
(sha512_sum_stream, sha2_sum_stream, sha3_sum_stream)
(blake2b_sum_stream, sm3_sum_stream, problematic_chars)
(filename_unescape, valid_digits, bsd_split_3)
(algorithm_from_tag, split_3, digest_file, output_file)
(b64_equal, hex_equal, digest_check, main):
* src/cksum_avx2.c (cksum_avx2):
* src/cksum_avx512.c (cksum_avx512):
* src/cksum_crc.c (cksum_fp_t, cksum_slice8, crc_sum_stream)
(crc32b_sum_stream, output_crc):
* src/cksum_pclmul.c (cksum_pclmul):
* src/cksum_vmull.c (cksum_vmull):
* src/sum.c (bsd_sum_stream, sysv_sum_stream, output_bsd, output_sysv):
Prefer signed to unsigned int where either will do.
This allows better checking with -fsanitize=undefined.
It should also help simplify future patches, so that they
needn’t worry whether comparisons like ‘i < len - 2’ will misbehave.
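
The comparison hazard mentioned above can be shown with a small sketch. The two functions below are hypothetical and do not appear in the cksum sources; they differ only in the signedness of their operands. With unsigned arithmetic, len - 2 wraps around to a very large value when len is less than 2, so 'i < len - 2' is unexpectedly true.

```c
#include <assert.h>

/* Hypothetical helper: count the indices I in [0, LIMIT) for which
   I < LEN - 2, using unsigned arithmetic.  When LEN < 2 the
   subtraction wraps around, so every index is counted.  */
static int
count_below_unsigned (unsigned int len, unsigned int limit)
{
  int n = 0;
  for (unsigned int i = 0; i < limit; i++)
    if (i < len - 2)
      n++;
  return n;
}

/* The same loop with signed operands.  LEN - 2 may be negative,
   in which case no index is counted.  */
static int
count_below_signed (int len, int limit)
{
  assert (0 <= len && 0 <= limit);
  int n = 0;
  for (int i = 0; i < limit; i++)
    if (i < len - 2)
      n++;
  return n;
}
```

For len = 1 and limit = 4, the unsigned version counts all 4 indices while the signed version counts none. Both count 3 for len = 5.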

6 weeks agostat: don't check QUOTING_STYLE when --printf %%N is used
Collin Funk [Fri, 27 Feb 2026 04:39:12 +0000 (20:39 -0800)] 
stat: don't check QUOTING_STYLE when --printf %%N is used

* NEWS: Mention the fix.
* src/stat.c (main): Only check QUOTING_STYLE if there is a %N that is
not preceded by a percentage sign.
* tests/stat/stat-fmt.sh: Add some test cases.

6 weeks agoid: promptly diagnose write errors
Collin Funk [Thu, 26 Feb 2026 04:59:35 +0000 (20:59 -0800)] 
id: promptly diagnose write errors

* NEWS: Mention the improvement.
* src/id.c (print_stuff): Call fflush for each listed user to check for
write errors.
* tests/misc/io-errors.sh: Add an invocation of 'id'.

6 weeks agogroups: promptly diagnose write errors
Collin Funk [Thu, 26 Feb 2026 04:56:12 +0000 (20:56 -0800)] 
groups: promptly diagnose write errors

* NEWS: Mention the improvement.
* src/groups.c (main): Call fflush for each listed user to check for
write errors.
* tests/misc/io-errors.sh: Add an invocation of 'groups'.

6 weeks agotests: ensure failure to write warnings is handled gracefully
Pádraig Brady [Thu, 26 Feb 2026 20:06:29 +0000 (20:06 +0000)] 
tests: ensure failure to write warnings is handled gracefully

* tests/misc/warning-errors.sh: Add a new test to ensure
failure to write warnings is diagnosed in the exit status.
* tests/local.mk: Reference the new test.

6 weeks agotests: shuf: ensure randomization doesn't depend solely on ASLR
oech3 [Thu, 26 Feb 2026 12:04:56 +0000 (21:04 +0900)] 
tests: shuf: ensure randomization doesn't depend solely on ASLR

* tests/shuf/shuf.sh: Use setarch --addr-no-randomize to disable
ASLR, and show the output is still random.
https://github.com/coreutils/coreutils/pull/198

6 weeks agomaint: fix description of tests/misc/io-errors.sh
Pádraig Brady [Thu, 26 Feb 2026 12:47:34 +0000 (12:47 +0000)] 
maint: fix description of tests/misc/io-errors.sh

* tests/misc/io-errors.sh: Promptness is checked in
write-errors.sh, not this test.

6 weeks agotests: nice: ensure a particular adjustment is disallowed
oech3 [Wed, 25 Feb 2026 10:57:26 +0000 (19:57 +0900)] 
tests: nice: ensure a particular adjustment is disallowed

* tests/nice/nice-fail.sh: Ensure "1+2-3" is disallowed.
https://github.com/coreutils/coreutils/pull/197

6 weeks agotests: factor,numfmt: verify embedded NUL handling
Pádraig Brady [Wed, 25 Feb 2026 14:49:12 +0000 (14:49 +0000)] 
tests: factor,numfmt: verify embedded NUL handling

* tests/factor/factor.pl: Verify that embedded NULs
on stdin terminate the _number_.
* tests/numfmt/numfmt.pl: Verify that embedded NULs
on stdin terminate the _line_.
https://github.com/coreutils/coreutils/pull/196

6 weeks agotests: fix "Hangup" termination of non-interactive runs
Pádraig Brady [Tue, 24 Feb 2026 15:44:41 +0000 (15:44 +0000)] 
tests: fix "Hangup" termination of non-interactive runs

This avoids the test harness being terminated like:
  make[1]: *** [Makefile:24419: check-recursive] Hangup
  make[3]: *** [Makefile:24668: check-TESTS] Hangup
  make: *** [Makefile:24922: check] Hangup
  make[2]: *** [Makefile:24920: check-am] Hangup
  make[4]: *** [Makefile:24685: tests/misc/usage_vs_refs.log] Error 129
  ...

This sometimes happened when the tests were run non-interactively,
for example when run like:

  setsid make TESTS="tests/timeout/timeout.sh \
   tests/tail/overlay-headers.sh" SUBDIRS=. -j2 check

Note that the race window can be made bigger by adding a sleep
after tail is stopped in overlay-headers.sh.

The race can trigger the kernel's job control mechanism
for preventing stuck processes, i.e., where it sends
SIGHUP + SIGCONT to a process group when it determines
that the group may become orphaned and there are
stopped processes in that group.

* tests/tail/overlay-headers.sh: Use setsid(1) to keep the stopped
tail process in a separate process group, thus avoiding any kernel
job control protection mechanism.
* tests/timeout/timeout.sh: Use setsid(1) to avoid the kernel
checking the main process group when sleep(1) is reparented.
Fixes https://bugs.gnu.org/80477

6 weeks agodoc: tee: avoid the use of gpg cleartext signatures in an example
Collin Funk [Sun, 22 Feb 2026 22:20:30 +0000 (14:20 -0800)] 
doc: tee: avoid the use of gpg cleartext signatures in an example

Cleartext signatures have many gotchas. Therefore, the use of detached
signatures is recommended where possible. See:
<https://gnupg.org/blog/20251226-cleartext-signatures.html>.

* doc/coreutils.texi (tee invocation): Adjust gpg invocation to produce
a detached signature.

6 weeks agotests: whoami, logname: verify error handling
oech3 [Mon, 23 Feb 2026 10:22:44 +0000 (19:22 +0900)] 
tests: whoami, logname: verify error handling

* tests/df/no-mtab-status-masked-proc.sh: Tweak unshare check.
* tests/local.mk: Reference new test.
* tests/misc/user.sh: Add new test using unshare -U, to verify
that whoami and logname diagnose failure correctly.
https://github.com/coreutils/coreutils/pull/195

6 weeks agodoc: stty: mention the -g does not save the terminal window size
Collin Funk [Sun, 22 Feb 2026 07:33:56 +0000 (23:33 -0800)] 
doc: stty: mention the -g does not save the terminal window size

* doc/coreutils.texi (stty invocation): Mention that 'stty -g' does not
save the terminal window size, as POSIX.1-2024 allows.