git.ipfire.org Git - thirdparty/coreutils.git/log
Paul Eggert [Wed, 15 Nov 2023 07:13:26 +0000 (23:13 -0800)] 
uniq: prefer signed integers

* src/uniq.c (skip_fields, skip_chars, check_chars, size_opt)
(find_field, different, writeline, check_file, main):
Prefer signed to unsigned integer types, since this allows
for better runtime checking with -fsanitize=undefined.

Paul Eggert [Wed, 15 Nov 2023 04:35:56 +0000 (20:35 -0800)] 
maint: DECIMAL_DIGIT_ACCUMULATE uses stdckdint.h

* src/system.h: Include <stdckdint.h>, since the new
DECIMAL_DIGIT_ACCUMULATE uses it.
Do not include stdckdint.h from files that also include system.h.
(DECIMAL_DIGIT_ACCUMULATE): Omit last arg, which is no longer needed.
Reimplement by using C23-style stdckdint.h’s ckd_mul and ckd_add,
as that’s more standard and is more likely to generate better code.
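
As a rough illustration of the checked-arithmetic pattern described above,
the sketch below accumulates one decimal digit and reports overflow instead
of wrapping.  It is not the macro from src/system.h: the function name
accumulate_digit is hypothetical, and the fallback to compiler builtins is
an assumption made so the sketch also builds where <stdckdint.h> is missing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Use the C23 header when available.  The fallback to the GCC/Clang
   overflow builtins is an assumption made only for this sketch.  */
#ifdef __has_include
# if __has_include(<stdckdint.h>)
#  include <stdckdint.h>
# endif
#endif
#ifndef ckd_add
# define ckd_add(r, a, b) __builtin_add_overflow (a, b, r)
# define ckd_mul(r, a, b) __builtin_mul_overflow (a, b, r)
#endif

/* Hypothetical helper: set *ACCUM to *ACCUM * 10 + DIGIT.
   Return true on success; on overflow return false and
   leave *ACCUM unchanged.  */
bool
accumulate_digit (intmax_t *accum, int digit)
{
  intmax_t tmp;
  if (ckd_mul (&tmp, *accum, 10) || ckd_add (&tmp, tmp, digit))
    return false;
  *accum = tmp;
  return true;
}
```

Both ckd_mul and ckd_add return true when the mathematical result does not
fit, so a single condition covers the multiply and the add.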

Paul Eggert [Sat, 11 Nov 2023 08:17:11 +0000 (00:17 -0800)] 
pinky: fix string size calculation

* src/pinky.c (count_ampersands): Simplify and return idx_t.
(create_fullname): Compute proper destination string size,
basically, by adding (ulen - 1) * ampersands rather than ulen *
(ampersands - 1).  Problem found on CHERI-64.
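
The size arithmetic above can be sketched as follows.  This is only an
illustration, not the code in src/pinky.c: fullname_size is a hypothetical
helper, and ptrdiff_t stands in for Gnulib's idx_t.  Each '&' in the GECOS
name is replaced by the user name, so every ampersand grows the string by
ulen - 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Count the '&' characters in S.  */
ptrdiff_t
count_ampersands (char const *s)
{
  ptrdiff_t n = 0;
  for (; *s; s++)
    n += *s == '&';
  return n;
}

/* Hypothetical helper: bytes needed, including the trailing NUL,
   for GECOS with each '&' replaced by USER.  */
ptrdiff_t
fullname_size (char const *gecos, char const *user)
{
  ptrdiff_t ulen = strlen (user);
  ptrdiff_t glen = strlen (gecos);
  return glen + (ulen - 1) * count_ampersands (gecos) + 1;
}
```

For "& Smith & Co" and the user "joe" this gives 12 + 2 * 2 + 1 = 17 bytes,
which matches "Joe Smith Joe Co" plus its NUL.  The swapped product
ulen * (ampersands - 1) would add only 3 instead of 4.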

Paul Eggert [Sat, 11 Nov 2023 08:14:48 +0000 (00:14 -0800)] 
maint: port randread to FreeBSD 14

* gl/lib/randread.c (POINTER_IS_ALIGNED): Rename from
ALIGNED_POINTER to avoid a collision with <machine/param.h>
on FreeBSD 14.

Paul Eggert [Sat, 11 Nov 2023 03:08:54 +0000 (19:08 -0800)] 
build: update gnulib submodule to latest

Pádraig Brady [Fri, 3 Nov 2023 16:22:22 +0000 (16:22 +0000)] 
ls: fix recent regression in size alignment

* src/ls.c (print_long_format): Use correct column width.
The wrong width was introduced by a copy/paste error in
commit v9.4-2-gcbb6dfec5.
* tests/ls/size-align.sh: Add a test.
* tests/local.mk: Reference the new test.
Fixes https://bugs.gnu.org/66919

Paul Eggert [Mon, 30 Oct 2023 17:47:34 +0000 (10:47 -0700)] 
join: fix recently introduced NUL bug

* src/join.c (xfields): Simplify and fix bug with fields
that start with a NUL byte when -t is not used.
* tests/misc/join-utf8.sh: Also test when -t is not used,
and when a field starts with NUL.

Paul Eggert [Mon, 30 Oct 2023 08:32:37 +0000 (01:32 -0700)] 
maint: pacify ‘make syntax-check’

* tests/misc/join-utf8.sh: Omit fail=0.
Fix framework_failure_ typo.
* tests/misc/join.pl: Change ` to '.

Paul Eggert [Mon, 30 Oct 2023 08:24:28 +0000 (01:24 -0700)] 
maint: copy join, uniq tests from Fedora

* tests/misc/join.pl, tests/uniq/uniq.pl:
Copy from Fedora 39.  This adds more multi-byte tests.

Paul Eggert [Mon, 30 Oct 2023 07:32:51 +0000 (00:32 -0700)] 
join,uniq: support multi-byte separators

* NEWS: Mention this.
* bootstrap.conf (gnulib_modules): Remove cu-ctype, as this module
is now more trouble than it’s worth.  All uses removed.
Add skipchars.
* gl/lib/cu-ctype.c, gl/lib/cu-ctype.h, gl/modules/cu-ctype:
Remove.
* gl/lib/skipchars.c, gl/lib/skipchars.h, gl/modules/skipchars:
* tests/misc/join-utf8.sh:
New files.
* src/join.c: Include skipchars.h and mcel.h instead of cu-ctype.h.
(tab): Now mcel_t, not int.  All uses changed.
(output_separator, output_seplen): New static vars.
(eq_tab, newline_or_blank, comma_or_blank): New functions.
(xfields, prfields, prjoin, add_field_list, main):
Support multi-byte characters.
* src/numfmt.c: Include ctype.h, skipchars.h.
Do not include cu-ctype.h.
(newline_or_blank): New function.
(next_field): Support multi-byte characters.
* src/sort.c: Include ctype.h instead of cu-ctype.h.
(inittables): Open-code field_sep since it no longer exists.
‘sort’ is not multi-byte safe yet, but when it is this code
will need revamping anyway.
* src/uniq.c: Include mcel.h and skipchars.h instead of cu-ctype.h.
(newline_or_blank): New function.
(find_field): Support multi-byte characters.
* tests/local.mk (all_tests): Add tests/misc/join-utf8.sh

Paul Eggert [Sat, 28 Oct 2023 23:15:49 +0000 (16:15 -0700)] 
test: allow non-blank white space in numbers

* src/test.c (find_int): Use isspace, not isblank,
for compatibility with how strtol works, which
is how most other shells do this.
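
The difference between the two classifications can be seen in a small
sketch.  The helper names skip_blank and skip_space are hypothetical and are
not taken from src/test.c; the point is only that isspace also accepts
characters such as newline and form feed, which is the set strtol skips.

```c
#include <ctype.h>

/* Skip leading blanks only (space and tab in the C locale).  */
char const *
skip_blank (char const *s)
{
  while (isblank ((unsigned char) *s))
    s++;
  return s;
}

/* Skip all leading white space, as strtol does before a number.  */
char const *
skip_space (char const *s)
{
  while (isspace ((unsigned char) *s))
    s++;
  return s;
}
```

Given the string "\n 7", skip_space stops at the digit while skip_blank
stops at the newline.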

Paul Eggert [Sat, 28 Oct 2023 16:30:49 +0000 (09:30 -0700)] 
stdbuf: port to oddball toupper

* src/stdbuf.c: Do not include ctype.h.
(set_libstdbuf_options): Use c_toupper, not toupper,
since the C locale is intended here.

Paul Eggert [Sat, 28 Oct 2023 16:22:09 +0000 (09:22 -0700)] 
dircolors: assume C-locale spaces

* src/dircolors.c: Include c-ctype.h, not ctype.h.
(parse_line): Use c_isspace, not isspace, as the .dircolors
file format (which does not seem to be documented!) appears
to be ASCII.

Paul Eggert [Sat, 28 Oct 2023 16:07:14 +0000 (09:07 -0700)] 
maint: port to oddball tolower

* src/digest.c (hex_equal): Work even in oddball locales
where tolower does not work as expected on ASCII letters.

Paul Eggert [Sat, 28 Oct 2023 00:31:49 +0000 (17:31 -0700)] 
maint: include ctype.h selectively

Include ctype.h only in files that need it.  Many of its uses
are incorrect, as they assume single-byte locales.  The idea is
to remove the incorrect uses later, when there is time.
* src/chroot.c, src/csplit.c, src/dd.c, src/digest.c, src/dircolors.c:
* src/expand-common.c, src/expand.c, src/fmt.c, src/fold.c, src/ls.c:
* src/od.c, src/pinky.c, src/pr.c, src/ptx.c, src/seq.c:
* src/set-fields.c, src/split.c, src/stdbuf.c, src/test.c:
* src/tr.c, src/truncate.c, src/unexpand.c, src/wc.c:
Include ctype.h.
* src/system.h: Do not include ctype.h.

Paul Eggert [Sat, 28 Oct 2023 00:15:08 +0000 (17:15 -0700)] 
maint: move field_sep into separate module

This is so that we don’t need to have every source file
include ctype.h.
* bootstrap.conf (gnulib_modules): Add cu-ctype.
* gl/lib/cu-ctype.c, gl/lib/cu-ctype.h, gl/modules/cu-ctype:
New files.
* src/join.c, src/numfmt.c, src/sort.c, src/uniq.c:
Include cu-ctype.h, for field_sep.
* src/system.h (field_sep): Remove; now supplied by cu-ctype.

Paul Eggert [Fri, 27 Oct 2023 15:56:39 +0000 (08:56 -0700)] 
digest: omit unnecessary b2sum includes

* src/blake2/b2sum.c: Do not include string.h, errno.h,
ctype.h, unistd.h, getopt.h.

Paul Eggert [Fri, 27 Oct 2023 15:45:50 +0000 (08:45 -0700)] 
maint: prefer c_isxdigit when that is the intent

* src/digest.c (valid_digits, split_3):
* src/echo.c (main):
* src/printf.c (print_esc):
* src/ptx.c (unescape_string):
* src/stat.c (print_it):
When the code is supposed to support only POSIX-locale hex digits,
use c_isxdigit rather than isxdigit.  Include c-ctype.h as needed.
This defends against oddball locales where isxdigit != c_isxdigit.
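
A locale-independent test of this kind can be sketched as below.  The name
ascii_isxdigit is a hypothetical stand-in used for illustration; the real
code uses c_isxdigit from Gnulib's c-ctype.h.  The range checks assume an
ASCII-compatible execution character set.

```c
#include <stdbool.h>

/* Hypothetical stand-in for c_isxdigit: true only for the
   hex digits 0-9, a-f, A-F, whatever the current locale says.  */
bool
ascii_isxdigit (int c)
{
  return (('0' <= c && c <= '9')
          || ('a' <= c && c <= 'f')
          || ('A' <= c && c <= 'F'));
}
```

Unlike isxdigit, this never consults the locale, so its answer cannot
change after setlocale.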

Pádraig Brady [Fri, 27 Oct 2023 13:19:01 +0000 (14:19 +0100)] 
maint: fix syntax check issue

* src/basenc.c: Fix preprocessor indentation.

Pádraig Brady [Fri, 27 Oct 2023 12:24:04 +0000 (13:24 +0100)] 
base32,base64: disallow non-canonical encodings

This will make decoding more resilient to corruption
whether due to transmission errors or nefarious adjustment.
See https://eprint.iacr.org/2022/361.pdf

* gnulib: Update to commit 3f463202bd enforcing canonical encoding.
* tests/basenc/base64.pl: Add test cases, and adjust existing cases.
* NEWS: Mention the change in behavior.

Paul Eggert [Wed, 25 Oct 2023 22:09:04 +0000 (15:09 -0700)] 
basenc: fix unlikely locale issue; tune

This sped up ‘basenc -d --base16’ by 60% on my old platform,
AMD Phenom II X4 910e, Fedora 38.
* src/basenc.c (struct base16_decode_context): Simplify by
omitting have_nibble.  ‘nibble’ is now negative if it’s missing.
All uses changed.
(B16): New macro, inspired by ../lib/base64.c.
(base16_to_int): New static var, likewise.
(isubase16): Reimplement using base16_to_int, since isxdigit is
not guaranteed to succeed on the chars we want when the locale is
oddball.
(base16_decode_ctx): Tune by using base16_to_int and by

Paul Eggert [Wed, 25 Oct 2023 21:43:32 +0000 (14:43 -0700)] 
basenc: tweak checks to use unsigned char

This tends to generate better code, at least on x86-64,
because callers are just as fast and callees can avoid a conversion.
* src/basenc.c: The following renamings also change the arg type
from char to unsigned char.  All uses changed.
(isubase): Rename from isbase.
(isubase64url): Rename from isbase64url.
(isubase32hex): Rename from isbase32hex.
(isubase16): Rename from isbase16.
(isuz85): Rename from isz85.
(isubase2): Rename from isbase2.

2023-10-24  Paul Eggert  <eggert@cs.ucla.edu>

* src/basenc.c (struct base16_decode_context):
Simplify by storing -1 for missing nibbles.  All uses changed.

Paul Eggert [Wed, 25 Oct 2023 15:45:15 +0000 (08:45 -0700)] 
build: update gnulib submodule to latest

Pádraig Brady [Wed, 25 Oct 2023 13:04:00 +0000 (14:04 +0100)] 
basenc: --base16: also allow lower case with --ignore-garbage

* src/basenc.c (isbase16): Also return true for lower case.
* tests/basenc/basenc.pl: Add a test case.
Reported by Paul Eggert.

Pádraig Brady [Mon, 23 Oct 2023 11:51:19 +0000 (12:51 +0100)] 
basenc: --base16: support lower case hex digits

* src/basenc.c (base16_decode_ctx): Convert to uppercase
before converting from hex.
* tests/basenc/basenc.pl: Add a test case.
* NEWS: Mention the change in behavior.
Addresses https://bugs.gnu.org/66698

Pádraig Brady [Mon, 23 Oct 2023 11:29:03 +0000 (12:29 +0100)] 
doc: fix RFC references

* doc/coreutils.texi: Adjust RFC URLs, as the originals
now give 404 errors.

Pádraig Brady [Fri, 6 Oct 2023 15:31:47 +0000 (16:31 +0100)] 
tests: move all basenc tests to their own directory

* tests/misc/base64.pl: Move to tests/basenc/base64.pl
* tests/misc/basenc.pl: Move to tests/basenc/basenc.pl
* tests/local.mk: Adjust accordingly

Pádraig Brady [Thu, 5 Oct 2023 16:00:51 +0000 (17:00 +0100)] 
basenc: auto pad base32 and base64 inputs when decoding

Padding of encoded data is useful in cases where
base64 encoded data is concatenated / streamed.
I.e. where there are padding chars _within_ the stream.
In other cases padding is optional and can be inferred.
Note we continue to treat partial padding as invalid,
as that would be indicative of truncation.

* src/basenc.c (do_decode): Auto pad the end of the input.
* NEWS: Mention the change in behavior.
* tests/misc/base64.pl: Adjust to not fail for missing padding.
Addresses https://bugs.gnu.org/66265
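
How much padding can be inferred at the end of the input is simple modular
arithmetic, sketched below.  This is an assumption about the general idea,
not the logic of do_decode: pad_needed is a hypothetical helper, and BLOCK
is 4 for base64 and 8 for base32.

```c
#include <stddef.h>

/* Hypothetical helper: number of '=' characters that would
   complete LEN encoded characters to a multiple of BLOCK.  */
size_t
pad_needed (size_t len, size_t block)
{
  return (block - len % block) % block;
}
```

For example, the unpadded base64 text "QQ" has length 2, so two '='
characters complete it to "QQ==".  The sketch does not judge whether the
resulting length is a valid encoding; that remains the decoder's job.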

Paul Eggert [Fri, 29 Sep 2023 01:02:25 +0000 (18:02 -0700)] 
sort: improve --help

Problem reported by Jorge Stolfi (bug#66253).
* src/sort.c (usage): Suggest looking at the manual for -n details.

Pádraig Brady [Mon, 25 Sep 2023 13:46:48 +0000 (14:46 +0100)] 
doc: rm --help: mention that '.' or '..' are rejected

* src/rm.c (usage): State that '.' or '..' are rejected.

Paul Eggert [Sun, 24 Sep 2023 00:19:35 +0000 (17:19 -0700)] 
wc: pacify ‘make syntax-check’

* src/wc_avx2.c (wc_lines_avx2): Explicitly make it ‘extern’.
Not sure why this is needed.

Paul Eggert [Sun, 24 Sep 2023 00:18:45 +0000 (17:18 -0700)] 
wc: distribute src/wc.h

* src/local.mk (noinst_HEADERS): Add src/wc.h.

Paul Eggert [Sun, 24 Sep 2023 00:07:33 +0000 (17:07 -0700)] 
wc: goto considered harmful

* src/wc.c: Do not include assure.h.  Replace the only
use of ‘assure’ with ‘unreachable’ which is good enough.
(wc, main): Remove labels and gotos.  This doesn’t affect
performance in any way I can measure, and makes the code
a bit easier to follow.

Paul Eggert [Sat, 23 Sep 2023 21:22:16 +0000 (14:22 -0700)] 
wc: prefer signed integers

Prefer signed to unsigned integers, to make it easier to catch
integer overflow errors.
* src/wc.c: Do not include safe-read.
(total_lines_overflow, total_words_overflow, total_chars_overflow)
(total_bytes_overflow): Now bool, not uintmax_t.  All uses changed.
(max_line_length): Now intmax_t, not uintmax_t.  All uses changed.
The total_... vars are still uintmax_t because overflow into them
is checked.
(page_size): Now idx_t, not size_t.
(wc_lines, wc, get_input_fstatus, compute_number_width, main):
Prefer signed to unsigned ints where either should do.
(wc_lines, wc): Use read rather than safe_read, since we don’t
need safe_read’s checks for huge buffers.
(wc): Redo call to mbrtoc32 to lessen the number of comparisons
against its returned value.  Do this partly by keeping a pointer
to the end of the buffer rather than a count.  Simplify
overflow-checking code.
(compute_number_width): Check for integer overflow.
Don’t assume size_t fits into unsigned long.
* src/wc.h (struct wc_lines): Prefer signed integers.
* src/wc_avx2.c: Do not include safe-read.h.
(wc_lines_avx2): Prefer signed integers.  Use read, not safe_read.

Paul Eggert [Sat, 23 Sep 2023 20:38:08 +0000 (13:38 -0700)] 
wc: improve avx2 API

* src/wc.c: Use "#include <...>" for files not in the current dir.
Include "wc.h" instead of declaring wc_lines_avx2 by hand.
(wc_lines): New API, with no file name (no longer needed) and
with a return struct rather than arg pointers.  All uses changed.
Use avx2_supported directly instead of using a function pointer.
Exploit C99-style declarations after statements.
Multiply by 15 rather than dividing; it’s faster and more accurate
and cannot overflow here.
(wc): Simplify based on wc_lines API change.
* src/wc.h: New file.
* src/wc_avx2.c: Include it, to check API better.
(wc_lines_avx2): Use new API.  All uses changed.  Exploit C99.
Make locals more local.

Paul Eggert [Sat, 23 Sep 2023 08:15:08 +0000 (01:15 -0700)] 
factor,tail: avoid quadratic reallocation

* src/factor.c (struct mp_factors): New member nalloc.
(mp_factor_init): Initialize it.
* src/factor.c (mp_factor_insert):
* src/tail.c (parse_options): Use xpalloc to avoid quadratic
worst-case behavior on reallocation.
* src/tail.c (pids_alloc): New static var.
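
The reason a tracked allocation count avoids quadratic behavior can be
sketched with plain realloc.  This is not xpalloc and not the code in
src/factor.c or src/tail.c: struct int_vec and int_vec_push are
hypothetical, the 1.5x growth factor is an assumption, and size overflow
checks are omitted for brevity.

```c
#include <stddef.h>
#include <stdlib.h>

struct int_vec
{
  int *data;
  size_t len;       /* elements in use */
  size_t nalloc;    /* elements allocated */
  size_t reallocs;  /* number of realloc calls, for illustration */
};

/* Append V to VEC, growing geometrically when full.
   Return 0 on success, -1 on allocation failure.  */
int
int_vec_push (struct int_vec *vec, int v)
{
  if (vec->len == vec->nalloc)
    {
      size_t n = vec->nalloc + vec->nalloc / 2 + 1;
      int *p = realloc (vec->data, n * sizeof *p);
      if (!p)
        return -1;
      vec->data = p;
      vec->nalloc = n;
      vec->reallocs++;
    }
  vec->data[vec->len++] = v;
  return 0;
}
```

Growing by one element per insertion would call realloc on every push and
may copy the whole array each time.  With geometric growth the number of
reallocations is logarithmic in the final length.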

Paul Eggert [Sat, 23 Sep 2023 07:23:26 +0000 (00:23 -0700)] 
doc: mention Unicode exceptions for wc

Paul Eggert [Sat, 23 Sep 2023 07:03:41 +0000 (00:03 -0700)] 
wc: simplify by removing SUPPORT_OLD_MBRTOWC

* src/wc.c (SUPPORT_OLD_MBRTOWC): Remove.  All uses removed.
(wc): Simplify by assuming C99-or-later behavior for mbrtoc32,
which after all is a C11 API.  Fix the !SUPPORT_OLD_MBRTOWC
code, which evidently was never tested seriously.

Paul Eggert [Sat, 23 Sep 2023 05:09:37 +0000 (22:09 -0700)] 
wc: 3× speedup in C locale

The 3× speedup was measured by invoking 'wc $(find * -type f)'
on the coreutils sources etc. on an Ubuntu 23.04 x86-64.
These changes also speed up wc 20% in UTF-8 locales.
* src/wc.c (wc_isprint, wc_isspace): New static vars.
(wc): Use them for speed.
(main): Initialize them if needed.
(isnbspace): Remove; no longer used.

Paul Eggert [Sat, 23 Sep 2023 03:53:57 +0000 (20:53 -0700)] 
wc: treat encoding errors as non white space

* src/wc.c (wc): Treat encoding errors like non white space
characters.

Paul Eggert [Fri, 22 Sep 2023 18:13:51 +0000 (11:13 -0700)] 
wc: fix word count bug

* bootstrap.conf (gnulib_modules): Remove c32isprint.
* src/wc.c (wc): Consider all non-white-space characters
to be word constituents, even if they are not printable.
POSIX requires this, and it is what BSD does.
Partly do this by simplifying the check for a word,
by counting word starts rather than word ends.
* tests/wc/wc.pl: Test for the bug.
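
Counting word starts can be sketched for the single-byte case as follows.
This is a simplified illustration rather than the multi-byte loop in
src/wc.c, and count_words is a hypothetical name.  Any byte that is not
white space counts as a word constituent, printable or not.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Count words in the N bytes at BUF.  A word starts at each
   non-white-space byte that follows white space or the start
   of the buffer.  */
size_t
count_words (char const *buf, size_t n)
{
  size_t words = 0;
  bool in_word = false;
  for (size_t i = 0; i < n; i++)
    {
      bool space = isspace ((unsigned char) buf[i]) != 0;
      if (!space && !in_word)
        words++;
      in_word = !space;
    }
  return words;
}
```

Because the counter is bumped when a word begins, no special handling is
needed for a final word that is not followed by white space.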

Paul Eggert [Fri, 22 Sep 2023 17:05:58 +0000 (10:05 -0700)] 
maint: omit some unused function tests

* m4/jm-macros.m4: Do not check for ftruncate, iswspace,
mkfifo, mbrlen, sysctl.  Coreutils no longer uses the
corresponding HAVE_* macros, typically because Gnulib
handles them now.
* src/wc.c (iswspace): Remove; unused.

Paul Eggert [Fri, 22 Sep 2023 16:49:41 +0000 (09:49 -0700)] 
sort: not a special case for mbrtowc

* configure.ac (GNULIB_MBRTOWC_SINGLE_THREAD): Define.

Paul Eggert [Fri, 22 Sep 2023 16:45:12 +0000 (09:45 -0700)] 
maint: prefer char32_t to wchar_t

This should work better on non-glibc platforms that don’t
use Unicode for wchar_t.  However, POSIX appears to prohibit
this for printf.c so leave that alone.
* bootstrap.conf (gnulib_modules): Add btoc32, c32iscntrl,
c32isprint, c32isspace, c32width, mbrtoc32.  Remove btoc, wcwidth.
* src/df.c, src/ls.c, src/wc.c:
Include uchar.h instead of wchar.h and wctype.h.
* src/df.c (replace_invalid_chars):
* src/ls.c (quote_name_buf):
* src/wc.c (isnbspace, wc):
Use char32_t instead of wchar_t.

Paul Eggert [Fri, 22 Sep 2023 15:17:15 +0000 (08:17 -0700)] 
wc: simplify #if MB_LEN_MAX

* src/wc.c: Don’t have special #ifs for platforms where
MB_LEN_MAX is 1.  On these platforms, MB_CUR_MAX is 1 as well,
so the compiler should optimize away all multi-byte code.

Paul Eggert [Fri, 22 Sep 2023 02:23:56 +0000 (19:23 -0700)] 
wc: avoid undefined conversion state

* src/wc.c (wc): When mbrtowc returns (size_t) -1, zero the
conversion state, since POSIX says it’s undefined.

Paul Eggert [Fri, 22 Sep 2023 02:09:15 +0000 (19:09 -0700)] 
maint: use mbszero

* bootstrap.conf (gnulib_modules): Add mbszero.
* src/df.c (replace_invalid_chars):
* src/ls.c (quote_name_buf):
* src/pathchk.c (portable_chars_only):
* src/printf.c (STRTOX):
* src/wc.c (wc):
Prefer mbszero to clearing an mbstate_t by hand.

Paul Eggert [Fri, 22 Sep 2023 01:45:47 +0000 (18:45 -0700)] 
maint: prefer mcel

This causes Gnulib code to also use mcel, which is more consistent.
* bootstrap.conf (avoided_gnulib_modules): Avoid mbuiter
and mbuiterf, since we can now just use mcel.  This avoids
the need to ship and compile mbchar and these modules.
(gnulib_modules): Change mcel to mcel-prefer.

Paul Eggert [Fri, 22 Sep 2023 01:45:08 +0000 (18:45 -0700)] 
wc: stop worrying about EBCDIC, shift-JIS, etc

* src/wc.c: Do not include mbchar.h.
(wc): Check for ASCII characters instead of using is_basic.
Other parts of Gnulib and coreutils already assume the encoding
is upward compatible with ASCII, and the old code wouldn’t
have worked anyway with shift-JIS.

Paul Eggert [Thu, 21 Sep 2023 23:59:48 +0000 (16:59 -0700)] 
expr: use mcel

The mcel API is simpler and corresponds more closely to how
Emacs etc. behave when the input has encoding errors,
since it treats each encoding-error byte separately.
* bootstrap.conf (gnulib_modules): Add mcel.
* src/expr.c: Include mcel.h instead of mbuiter.h.
(mbs_logical_cspn, mbs_logical_substr, mbs_offset_to_chars):
Use mcel API.
(mbs_logical_substr): Use ximemdup0 so as not to waste memory in
the result, fixing a FIXME.

Paul Eggert [Thu, 21 Sep 2023 17:44:25 +0000 (10:44 -0700)] 
build: update gnulib submodule to latest

Pádraig Brady [Thu, 21 Sep 2023 17:48:49 +0000 (18:48 +0100)] 
build: avoid build failures on gcc <= 10, or clang

On gcc 10 the following build failure occurs:
  "error: a label can only be part of a statement
   and a declaration is not a statement"
This is because the current code is non standards conforming,
but GCC >= 11 will compile it (even with the -Wpedantic option).
This issue is tracked for GCC at:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111526

* src/tail.c (parse_options): Avoid a declaration after label,
by using a surrounding block.
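
The workaround can be sketched in isolation.  The function below is a
hypothetical example, not the code in parse_options: it only shows that a
declaration placed directly after a case label is wrapped in a block so the
label is followed by a statement.

```c
/* Hypothetical example: return ARG when OPT is 'p', else 0.  */
int
handle_option (int opt, int arg)
{
  int total = 0;
  switch (opt)
    {
    case 'p':
      {
        /* Writing "int pid = arg;" directly after the label is
           what older compilers reject; the braces make the label
           precede a compound statement instead.  */
        int pid = arg;
        total += pid;
      }
      break;

    default:
      break;
    }
  return total;
}
```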

Stephen Kitt [Mon, 18 Sep 2023 16:09:29 +0000 (18:09 +0200)] 
tail: allow multiple PIDs

tail can watch multiple files, but currently only a single writer. It
can be useful to watch files from multiple writers, or even processes
not directly related to the files (e.g. watch log files written by a
server process, for the duration of a test driven by a separate
client).

* src/tail.c (writers_are_dead): New function.
(tail_forever): Use it to wait for writers.
(tail_forever_inotify): As above.
(parse_options): Manage --pid options in an array.
* doc/coreutils.texi: Update documentation.
* tests/tail/pid.sh: Add a variant with two PIDs.
* NEWS: Mention the new feature.

Sylvestre Ledru [Sun, 17 Sep 2023 13:55:57 +0000 (15:55 +0200)] 
ls: --dired now implies long format with hyperlinks disabled

Currently --dired is silently ignored
when combined with conflicting output formats.

* src/ls.c (decode_switches): Set default format and hyperlink mode
when the --dired option is specified.
* tests/ls/dired.sh: Check that formats are implied / overridden.
* NEWS: Mention the change in behavior.
* doc/coreutils.texi (ls invocation): Adjust --dired description.

Sylvestre Ledru [Thu, 14 Sep 2023 21:40:08 +0000 (23:40 +0200)] 
tests: improve ls --dired testing

* tests/ls/dired.sh: Verify ls --dired output against varying offsets.

Pádraig Brady [Wed, 13 Sep 2023 22:08:02 +0000 (23:08 +0100)] 
maint: use C99 int size specifiers rather than PRI.MAX defines

Following on from commit v9.3-128-gf31229ebd
replace all uses of the PRI.MAX portability defines
with C99 size specifiers %z, %j, and %t.
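
For reference, the three C99 length modifiers pair with types as shown in
this small sketch.  The function format_sizes is hypothetical and exists
only to put the three conversions side by side.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical example: format a size_t, an intmax_t and a
   ptrdiff_t without any PRI* macros.  Returns what snprintf
   returns.  */
int
format_sizes (char *buf, size_t buflen,
              size_t sz, intmax_t big, ptrdiff_t diff)
{
  /* %zu: size_t   %jd: intmax_t   %td: ptrdiff_t  */
  return snprintf (buf, buflen, "%zu %jd %td", sz, big, diff);
}
```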

Pádraig Brady [Mon, 11 Sep 2023 19:21:39 +0000 (20:21 +0100)] 
doc: add subsections for cksum nodes

* doc/coreutils.texi: Specify each of the cksum nodes as a subsection,
so that the docs are organised appropriately in the pdf and html manual.

Paul Eggert [Fri, 8 Sep 2023 23:25:00 +0000 (16:25 -0700)] 
cp,mv,install: add copy_internal comment

* src/copy.c (copy_internal): Add comment about
some particularly tricky logic.

Paul Eggert [Fri, 8 Sep 2023 16:14:06 +0000 (09:14 -0700)] 
cp: avoid needless unlinkat after fstatat ELOOP

* src/copy.c (copy_internal): When cp -f's fstatat fails on the
destination with ELOOP, report an error immediately when fstatat
used AT_SYMLINK_NOFOLLOW, as the later unlinkat would fail too.

Paul Eggert [Fri, 8 Sep 2023 16:10:21 +0000 (09:10 -0700)] 
cp,mv,install: minor copy_internal refactoring

* src/copy.c (copy_internal): Redo to avoid need for calculating
fstatat_flags when not needed.  This is for clarity, not speed.

Paul Eggert [Tue, 5 Sep 2023 17:10:12 +0000 (10:10 -0700)] 
cp,mv,install: fix comment punctuation

* src/copy.h: Fix punctuation in comment.

Paul Eggert [Tue, 5 Sep 2023 17:02:39 +0000 (10:02 -0700)] 
cp,mv,install: simplify copy_internal

* src/copy.c (copy_internal): Simplify.

Paul Eggert [Tue, 5 Sep 2023 06:04:47 +0000 (23:04 -0700)] 
maint: prefer psame_inode, PSAME_INODE, STP_*

Prefer psame_inode, PSAME_INODE, STP_NBLOCKS, and STP_BLKSIZE,
which take addresses of objects, to their counterparts that
take the whole objects.  In some cases the whole objects might
not be initialized, which would be undefined behavior strictly
speaking.
* gl/lib/root-dev-ino.h (ROOT_DEV_INO_CHECK):
* src/cp-hash.c (src_to_dest_compare):
* src/ls.c (dev_ino_compare):
* src/pwd.c (robust_getcwd):
Prefer PSAME_INODE to SAME_INODE.
* src/chown-core.c (restricted_chown):
* src/copy.c (copy_reg, same_file_ok, source_is_dst_backup)
(copy_internal):
* src/ln.c (do_link):
* src/pwd.c (logical_getcwd):
* src/sort.c (avoid_trashing_input):
* src/split.c (create):
* src/stat.c (find_bind_mount):
Prefer psame_inode to SAME_INODE.
* src/copy.c (infer_scantype):
* src/du.c (process_file):
* src/ls.c (gobble_file, print_long_format)
(print_file_name_and_frills, length_of_file_name_and_frills):
* src/stat.c (print_stat):
Prefer STP_NBLOCKS to ST_NBLOCKS.
* src/copy.c (copy_reg):
* src/head.c (elide_tail_bytes_file, elide_tail_lines_file):
* src/ioblksize.h (io_blksize):
* src/od.c (skip):
* src/shred.c (do_wipefd):
* src/stat.c (print_stat):
* src/tail.c (tail_bytes):
* src/truncate.c (do_ftruncate):
* src/wc.c (wc):
Prefer STP_BLKSIZE to ST_BLKSIZE.
* src/ioblksize.h (io_blksize):
Arg is now struct stat const *, not struct stat.  All callers changed.
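
The pointer-taking style can be sketched as below.  The function
same_inode_ptr is a hypothetical stand-in written for illustration; it is
not Gnulib's psame_inode, whose exact definition is not shown in this log.
Taking addresses means only the two members that matter are read.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical stand-in: true if *A and *B describe the same
   file, judged by device and inode number only.  No other
   member of either struct is accessed.  */
bool
same_inode_ptr (struct stat const *a, struct stat const *b)
{
  return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}
```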

Paul Eggert [Tue, 5 Sep 2023 06:03:52 +0000 (23:03 -0700)] 
build: update gnulib submodule to latest

Paul Eggert [Tue, 5 Sep 2023 06:03:52 +0000 (23:03 -0700)] 
cp,mv,install: a bit more up-to-date source stat

* src/copy.c (copy_reg): Replace caller’s source status
with the more recent version.

Paul Eggert [Sat, 2 Sep 2023 20:27:45 +0000 (13:27 -0700)] 
cp,mv,install: fix chmod on Linux CIFS

This bug occurs only when temporarily setting the mode to the
intersection of old and new modes when changing ownership.
* src/copy.c (owner_failure_ok): Treat EACCES like EPERM.

Paul Eggert [Fri, 1 Sep 2023 22:05:21 +0000 (15:05 -0700)] 
cp,mv,install: fix chown on Linux CIFS

* src/copy.c (chown_failure_ok): Also treat EACCES as OK.

Paul Eggert [Fri, 1 Sep 2023 16:14:02 +0000 (09:14 -0700)] 
maint: simplify set_owner

* src/copy.c (HAVE_FCHOWN, fchown): Remove.
(fchmod_or_lchmod): Move up.
(fchown_or_lchown): New function.
(set_owner): Use it to simplify.

Paul Eggert [Fri, 1 Sep 2023 16:11:32 +0000 (09:11 -0700)] 
chown: port to mingw and MSVC 14

* src/chown-core.c (restricted_chown): Don’t assume fchown exists.
The Gnulib doc says that nowadays this is needed only for
ports to mingw and MSVC 14, but it’s an easy port so let’s do it.

Pádraig Brady [Thu, 31 Aug 2023 20:59:02 +0000 (21:59 +0100)] 
maint: avoid syntax check failure

* tests/misc/numfmt.pl: Keep lines <= 80 chars.

Paul Eggert [Thu, 31 Aug 2023 00:51:42 +0000 (17:51 -0700)] 
maint: Gnulib module gc

Remove Gnulib modules that coreutils code no longer uses directly.
Some of these are used indirectly, but gnulib-tool should handle that.
* bootstrap.conf (gnulib_modules): Remove calloc-gnu, cloexec,
getgroups, getpass-gnu, getugroups, getusershell, gnu-make,
group-member, lchown, mgetgroups, netinet_in, readlink,
realloc-gnu, rename, rpmatch, stpncpy, tzset, wchar-single,
wcswidth.

Paul Eggert [Thu, 31 Aug 2023 00:41:42 +0000 (17:41 -0700)] 
maint: assume non-rare encodings

* configure.ac (GNULIB_WCHAR_SINGLE_LOCALE): Define.
This can improve performance, while dropping support for
rare encodings on non-GNU platforms.  Nowadays these encodings
are typically not worth the hassle.

Paul Eggert [Wed, 30 Aug 2023 23:59:39 +0000 (16:59 -0700)] 
maint: tune for single thread & locale

* configure.ac (GNULIB_EXCLUDE_SINGLE_THREAD)
(GNULIB_REGEX_SINGLE_THREAD, GNULIB_WCHAR_SINGLE_LOCALE):
Define.

Paul Eggert [Wed, 30 Aug 2023 14:39:34 +0000 (07:39 -0700)] 
maint: regularize struct initializers

* src/chmod.c (process_file):
* src/df.c (replace_invalid_chars):
* src/iopoll.c (iopoll_internal):
* src/ls.c (quote_name_buf):
* src/pathchk.c (portable_chars_only):
* src/printf.c (STRTOX):
* src/shred.c (main):
* src/stat.c (neg_to_zero, do_stat):
* src/timeout.c (settimeout):
* src/tr.c (card_of_complement):
* src/wc.c (wc):
Prefer ‘{0}’ to initialize everything to zero.
* src/stat.c (do_stat):
* src/timeout.c (settimeout):
Do not assume the usual order for struct members,
as POSIX does not guarantee this.
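
Both initializer styles can be sketched with struct timespec.  The helper
names are hypothetical and the choice of struct timespec is an assumption
made for illustration; the commit itself touches other structures too.

```c
#include <time.h>

/* '{0}' zeroes every member, whatever members exist.  */
struct timespec
zero_timespec (void)
{
  struct timespec ts = {0};
  return ts;
}

/* Designated initializers name the members, so the result
   does not depend on the order in which the platform
   declares them.  */
struct timespec
make_timespec (time_t sec, long nsec)
{
  struct timespec ts = { .tv_sec = sec, .tv_nsec = nsec };
  return ts;
}
```

A positional initializer such as { sec, nsec } would silently assign the
wrong members on a platform that declares them in a different order.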

Paul Eggert [Wed, 30 Aug 2023 14:16:49 +0000 (07:16 -0700)] 
maint: rely on Gnulib fdatasync

* m4/jm-macros.m4: Remove fdatasync-related code,
as Gnulib now does this.
* src/dd.c (fdatasync) [!HAVE_FDATASYNC]:
* src/shred.c (dosync) [!HAVE_FDATASYNC]: Rely on Gnulib fdatasync.

Paul Eggert [Wed, 30 Aug 2023 14:15:09 +0000 (07:15 -0700)] 
maint: use modern Gnulib LIB_ macros

* src/local.mk (src_timeout_LDADD, src_dd_LDADD)
(src_shred_LDADD, src_sync_LDADD): Use TIMER_TIME_LIB
and FDATASYNC_LIB instead of LIB_TIMER_TIME and
LIB_FDATASYNC.

Paul Eggert [Wed, 30 Aug 2023 13:51:18 +0000 (06:51 -0700)] 
kill: rely on Gnulib strsignal

Omit checks no longer needed now that we use strsignal.
* configure.ac: Do not check for strsignal-related decls.
* src/kill.c (sys_siglist, strsignal): Remove.

Paul Eggert [Wed, 30 Aug 2023 06:52:07 +0000 (23:52 -0700)] 
maint: remove rename macro

* src/copy.h (rename) [RENAME_TRAILING_SLASH_BUG]:
Remove: unused, now that Gnulib takes care of this.

Paul Eggert [Wed, 30 Aug 2023 06:52:07 +0000 (23:52 -0700)] 
maint: remove need for mbsalign

This simplifies memory allocation a bit, and removes an arbitrary
limitation from numfmt, which formerly limited cell output to 127
bytes.
* bootstrap.conf (gnulib_modules): Remove mbsalign, strncat.
Add strnlen (the code already used strnlen directly, and we were
saved only because Gnulib used the module indirectly)
* gl/lib/mbsalign.c, gl/lib/mbsalign.h, gl/modules/mbsalign:
* gl/modules/mbsalign-tests, gl/tests/test-mbsalign.c: Remove.
* src/df.c, src/ls.c: Do not include mbsalign.h.
(MBSWIDTH_FLAGS): New constant, now used for all
mbswidth calls.  All callers changed to check for -1 return.
* src/df.c (struct field_data_t): ‘width’ is now int not size_t,
since mbswidth can’t do widths greater than INT_MAX anyway.
Replace ‘align’ with ‘align_right’.  All uses changed.
(print_table): Redo to avoid the need for ambsalign.
(get_header, get_dev): mbswidth returns int, not size_t.
* src/ls.c (MAX_MON_WIDTH): Remove; no longer used.
(abmon_init): Use strnlen to cheaply discard too-long month names.
Align by hand instead of using mbsalign.
* src/numfmt.c: Include stdckdint.h, mbswidth.h.
Do not include mbsalign.h.
(padding_buffer_size): Now idx_t.  All uses changed.
(padding_width): Now intmax_t, since it’s no longer an object
size.  Its sign now records alignment.  All uses changed.
(zero_padding_width): Now int, since it’s given to sprintf.
All uses changed.
(padding_alignment): Remove; it’s now taken from padding_width’s sign.
(double_to_human): Return string length.  BUF_SIZE arg is now idx_t.
Include suffix in output.  All callers changed.  Simplify by not
calling strncat or stpcpy.  Calculate fmt size bound more carefully.
(setup_padding_buffer): Remove.  All uses removed.
(parse_format_string): Use intmax_t, not long, for pad.
On overflow, set widths to large values that cause later code
to do the right thing, rather than separately checking for
overflow here.
(prepare_padded_number): Return bool, not int 0/1.  New arg
PADDING.  All uses changed.  Do not limit padded output to 127
bytes; instead, use xpalloc to expand the output buffer.
(print_padded_number): New arg PADDING.  All uses changed.
(process_suffixed_number): Simplify.
(main): Take extremum if xstrtoimax overflows, as this does
the right thing.
* tests/misc/numfmt.pl: New test suf-20 to test for truncation bug.
Remove tests pad-3.2, fmt-err-7, as they’re no longer invalid but
are quite expensive.
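
The padding_width change above folds alignment into the sign of the width. A minimal sketch of that idea follows; it is not the numfmt code. The helper name pad_by_sign is hypothetical, and the sketch counts bytes rather than display columns (numfmt itself uses mbswidth). It relies on the standard printf rule that a negative '*' field width means left-justify.

```c
#include <inttypes.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper, not taken from src/numfmt.c.
   Write S into OUT, padded with spaces to |PADDING_WIDTH| bytes.
   A positive width right-aligns; a negative width left-aligns.
   Return the snprintf result, or -1 if the width cannot be
   passed to printf as an int.  */
int
pad_by_sign (char *out, size_t out_size, const char *s,
             intmax_t padding_width)
{
  if (padding_width < -INT_MAX || INT_MAX < padding_width)
    return -1;

  /* A negative '*' width is treated as the '-' flag plus a
     positive width, so one format string covers both cases.  */
  return snprintf (out, out_size, "%*s", (int) padding_width, s);
}
```

With this shape, a separate alignment variable such as the removed padding_alignment is not needed, since callers pass a single signed value.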

23 months agomaint: post-release administrivia
Pádraig Brady [Tue, 29 Aug 2023 14:23:14 +0000 (15:23 +0100)] 
maint: post-release administrivia

* NEWS: Add header line for next release.
* .prev-version: Record previous version.
* cfg.mk (old_NEWS_hash): Auto-update.

23 months agoversion 9.4 v9.4
Pádraig Brady [Tue, 29 Aug 2023 13:36:22 +0000 (14:36 +0100)] 
version 9.4

* NEWS: Record release date.

23 months agomaint: spelling fixes, including author names
Paul Eggert [Mon, 28 Aug 2023 19:42:23 +0000 (12:42 -0700)] 
maint: spelling fixes, including author names

Most of this just affects commentary and documentation.  The only
significant behavior change is translating author names via
proper_name_lite rather than proper_name_utf8, or not translating
them at all.  proper_name_lite is good enough for coreutils and
avoids the bloat that had coreutils not using Gnulib proper_name.
* bootstrap.conf (gnulib_modules): Use propername-lite instead
of propername.
(XGETTEXT_OPTIONS): Look for proper_name_lite instead of for
proper_name_utf8.
* cfg.mk (local-checks-to-skip): Remove
sc_proper_name_utf8_requires_ICONV, since we no longer use
proper_name_utf8.
(old_NEWS_hash): Update.
(sc_check-I18N-AUTHORS): Remove; no longer needed.

23 months agotest: omit unreachable code
Paul Eggert [Mon, 28 Aug 2023 19:40:57 +0000 (12:40 -0700)] 
test: omit unreachable code

* src/test.c (unary_operator): Omit unreachable ‘return false;’.
Oracle Solaris Studio 12.6 warns about it.

23 months agotests: avoid test failure on Android
Bruno Haible [Mon, 28 Aug 2023 10:07:18 +0000 (12:07 +0200)] 
tests: avoid test failure on Android

* gl/tests/test-mbsalign.c (main): Skip the unibyte truncation test
on Android, when the "C" locale in fact is multibyte.

23 months agosort: port sort-merge-fdlimit test to Solaris 10
Paul Eggert [Mon, 28 Aug 2023 03:49:33 +0000 (20:49 -0700)] 
sort: port sort-merge-fdlimit test to Solaris 10

* tests/sort/sort-merge-fdlimit.sh: Give 'sort' fd 6 too.
Needed for the same reason sort-continue.sh needed a ulimit -n boost.

23 months agosort: port sort-continue test back to Solaris 10
Paul Eggert [Mon, 28 Aug 2023 02:13:42 +0000 (19:13 -0700)] 
sort: port sort-continue test back to Solaris 10

* tests/sort/sort-continue.sh: Use ulimit -n 7 not -n 6.  On
Solaris 10 'sort' uses Gnulib mkostemp, which calls Gnulib
getrandom, which opens /dev/urandom to calculate the temp file's
name, which means 'sort' needs one more file descriptor to work.

23 months agotests: avoid false failure on cygwin
Pádraig Brady [Sun, 27 Aug 2023 19:22:32 +0000 (20:22 +0100)] 
tests: avoid false failure on cygwin

* tests/cksum/md5sum-bsd.sh: Avoid part of test dealing with backslashes
in file names, on systems where backslash is a directory separator.
Issue reported by Bruno Haible on cygwin.

23 months agocksum: adjust tests and docs to binary mode handling
Pádraig Brady [Sun, 27 Aug 2023 18:52:13 +0000 (19:52 +0100)] 
cksum: adjust tests and docs to binary mode handling

Following commit v9.3-80-g5e1e0993b which makes cksum
match the output of the standalone utilities...

* doc/coreutils.texi (cksum output modes): Remove the mention
that cksum never outputs a binary indicator, as that's no longer the
case.
* tests/cksum/b2sum.sh: Avoid outputting a binary indicator.
* tests/cksum/sm3sum.pl: Likewise.

23 months agoall: avoid duplicated write errors on FreeBSD
Pádraig Brady [Sun, 27 Aug 2023 15:01:27 +0000 (16:01 +0100)] 
all: avoid duplicated write errors on FreeBSD

* src/system.h (write_error): Also call fpurge(), which was seen to
be needed on FreeBSD 13.1 to avoid duplicated write errors.
* src/head.c (xwrite_stdout): Likewise.
* bootstrap.conf: Depend on fpurge.
Reported by Bruno Haible.

23 months agotests: avoid false failure where sleep is a shell builtin
Pádraig Brady [Sun, 27 Aug 2023 15:22:37 +0000 (16:22 +0100)] 
tests: avoid false failure where sleep is a shell builtin

* tests/misc/usage_vs_getopt.sh: Handle sleep as a shell builtin,
which was seen on Alpine Linux 3.18.

23 months agobuild: fix link errors of sort, split on CentOS 5 and Solaris 10
Bruno Haible [Sat, 26 Aug 2023 23:10:13 +0000 (01:10 +0200)] 
build: fix link errors of sort, split on CentOS 5 and Solaris 10

* src/local.mk (src_sort_LDADD, src_split_LDADD): Add $(CLOCK_TIME_LIB).

23 months agobuild: fix compilation error on AIX 7.1
Bruno Haible [Sat, 26 Aug 2023 19:01:48 +0000 (21:01 +0200)] 
build: fix compilation error on AIX 7.1

* src/copy.c (copy_internal): Don't test for ENOTEMPTY if it has the
same value as EEXIST.
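
The log entry does not show the construct that failed to compile. One way two errno macros with the same value can break a build is as duplicate case labels in a switch. The sketch below guards against that with the preprocessor; the function name is_dir_not_empty_error is hypothetical and this is not the code in copy_internal.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper, not taken from src/copy.c.
   Return true if ERR is an errno value that can mean
   "destination directory is not empty".  */
bool
is_dir_not_empty_error (int err)
{
  switch (err)
    {
    case EEXIST:
#if ENOTEMPTY != EEXIST
    /* Emit this label only when the two macros differ; otherwise
       the switch would contain a duplicate case value.  */
    case ENOTEMPTY:
#endif
      return true;
    default:
      return false;
    }
}
```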

23 months agobuild: update with gnulib fixes
Pádraig Brady [Sun, 27 Aug 2023 15:09:07 +0000 (16:09 +0100)] 
build: update with gnulib fixes

* gnulib: Update to incorporate gnulib fixes
from Bruno Haible from his coreutils 9.4 pre-release testing.

23 months agodoc: remove older ChangeLog items
Pádraig Brady [Thu, 24 Aug 2023 11:10:49 +0000 (12:10 +0100)] 
doc: remove older ChangeLog items

* Makefile.am: Update the oldest documented version
to 8.30 which is now about 5 years old.

23 months agoshred: fix operation on Solaris with 64 bit builds
Pádraig Brady [Wed, 23 Aug 2023 14:21:01 +0000 (15:21 +0100)] 
shred: fix operation on Solaris with 64 bit builds

* NEWS: Mention the bug fix.
* gl/lib/randread.c (get_nonce): Limit getrandom() <= 1024 bytes.
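
A rough sketch of capping each getrandom() request at 1024 bytes follows. It is not the get_nonce code: the name fill_random and the EINTR retry policy are assumptions, and the reason for the limit is not stated in the log entry.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Largest request passed to a single getrandom call.  */
enum { MAX_GETRANDOM_CHUNK = 1024 };

/* Hypothetical helper, not taken from gl/lib/randread.c.
   Fill BUF with LEN random bytes, never asking getrandom for
   more than MAX_GETRANDOM_CHUNK bytes at a time.
   Return true on success, false on an unrecoverable error.  */
bool
fill_random (void *buf, size_t len)
{
  unsigned char *p = buf;

  while (len > 0)
    {
      size_t want = len < MAX_GETRANDOM_CHUNK ? len : MAX_GETRANDOM_CHUNK;
      ssize_t got = getrandom (p, want, 0);

      if (got < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted by a signal; retry.  */
          return false;
        }

      /* Short reads are allowed, so advance by the bytes obtained.  */
      p += got;
      len -= (size_t) got;
    }

  return true;
}
```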

23 months agobuild: update gnulib submodule to latest
Pádraig Brady [Tue, 22 Aug 2023 10:56:45 +0000 (11:56 +0100)] 
build: update gnulib submodule to latest

* gnulib: Update to latest

23 months agodoc: reorg texinfo for the checksumming utilities
Pádraig Brady [Mon, 21 Aug 2023 13:51:49 +0000 (14:51 +0100)] 
doc: reorg texinfo for the checksumming utilities

* doc/coreutils.texi: Reorg so that 'cksum invocation' is the
main node listing all options and output formats, which is then
referenced by the descriptions of the standalone utilities.
Use macros in the description of the standalone utilities
rather than referencing 'md5sum invocation' to be more direct.

23 months agodoc: cksum: remove -b description from texinfo
Pádraig Brady [Mon, 21 Aug 2023 13:56:20 +0000 (14:56 +0100)] 
doc: cksum: remove -b description from texinfo

* doc/coreutils.texi (cksum invocation): Following commit 5e1e0993,
also remove the description of the -b option for the cksum command.

23 months agocp: with --sparse=never, avoid COW and copy offload
Pádraig Brady [Mon, 21 Aug 2023 12:39:14 +0000 (13:39 +0100)] 
cp: with --sparse=never, avoid COW and copy offload

* src/cp.c (main): Set default reflink mode appropriately
with --sparse=never.
* src/copy.c (infer_scantype): Add a comment to related code.
* tests/cp/sparse-2.sh: Add a test case.
* NEWS: Mention the bug.

23 months agomaint: comment spelling fix
Pádraig Brady [Sat, 19 Aug 2023 15:45:17 +0000 (16:45 +0100)] 
maint: comment spelling fix

* tests/split/l-chunk-root.sh: Fix recently introduced typo.