git.ipfire.org Git - thirdparty/coreutils.git/log
17 months ago tests: move join tests to their own directory
Pádraig Brady [Tue, 27 Feb 2024 14:12:27 +0000 (14:12 +0000)] 
tests: move join tests to their own directory

* tests/misc/join-utf8.sh: Move to tests/join
since there are now multiple join tests.
* tests/misc/join.pl: Likewise.

17 months ago tests: simplify treatment of the $LOCALE_FR_UTF8 variable
Pádraig Brady [Tue, 27 Feb 2024 14:05:49 +0000 (14:05 +0000)] 
tests: simplify treatment of the $LOCALE_FR_UTF8 variable

* tests/df/problematic-chars.sh: Rely on gnulib setting
this to "none" where not usable.
* tests/misc/sleep.sh: Likewise.
* tests/printf/printf-mb.sh: Likewise.
* tests/printf/printf-quote.sh: Likewise.
* tests/sort/sort-debug-keys.sh: Likewise.

17 months ago join: avoid test failure on systems with no French UTF-8 locale
Bruno Haible [Mon, 26 Feb 2024 21:33:18 +0000 (22:33 +0100)] 
join: avoid test failure on systems with no French UTF-8 locale

* tests/misc/join-utf8.sh: Test the value of LOCALE_FR_UTF8 against
'none', not against a missing value.
Fixes https://bugs.gnu.org/69418

17 months ago sort: make the startup time optimization effective on glibc < 2.34
Bruno Haible [Tue, 27 Feb 2024 11:12:59 +0000 (12:12 +0100)] 
sort: make the startup time optimization effective on glibc < 2.34

* configure.ac: Test where to find the dlopen function. Set LIB_DL.
Use it in the DLOPEN_LIBCRYPTO test.
* src/local.mk (src_sort_LDADD): Add $(LIB_DL).

17 months ago build: improve libcrypto library detection
Pádraig Brady [Mon, 26 Feb 2024 19:10:14 +0000 (19:10 +0000)] 
build: improve libcrypto library detection

* configure.ac: Match literal '.' in the file name
to avoid potential mismatches with similarly named libs.
Reported by Andreas Schwab

17 months ago wc: fix -w with breaking space over UCHAR_MAX
Aearil [Sat, 24 Feb 2024 20:44:24 +0000 (21:44 +0100)] 
wc: fix -w with breaking space over UCHAR_MAX

* src/wc.c (wc): Fix regression introduced in commit v9.4-48-gf40c6b5cf.
* tests/wc/wc-nbsh.sh: Add test cases for "standard" spaces.
Fixes https://bugs.gnu.org/69369

17 months ago cp,mv: add --update=none-fail to fail if existing files
Pádraig Brady [Mon, 5 Feb 2024 15:55:07 +0000 (15:55 +0000)] 
cp,mv: add --update=none-fail to fail if existing files

* src/cp.c (main): Add support for --update=none-fail to provide the
functionality of diagnosing files in the destination,
and exiting with failure status.
(usage): Mark -n as deprecated.
* src/mv.c: Likewise.
* src/copy.h: Add UPDATE_NONE_FAIL definition.
* src/system.h (emit_update_parameters_note): Add --update=none-fail
description.
* doc/coreutils.texi (cp invocation): Likewise.
Also mention why -n is deprecated.
* tests/mv/update.sh: Add a test case, including precedence
with -n and other --update options.
* tests/cp/cp-i.sh: Verify that --backup and --update=none{,-fail}
are mutually exclusive.
* tests/mv/mv-n.sh: Likewise.
* NEWS: Mention the new feature.
Addresses https://bugs.gnu.org/62572

17 months ago cp,mv: reinstate that -n exits with success if files skipped
Pádraig Brady [Sat, 24 Feb 2024 19:51:56 +0000 (19:51 +0000)] 
cp,mv: reinstate that -n exits with success if files skipped

* src/cp.c (main): Adjust so that -n will exit success if skipped files.
* src/mv.c (main): Likewise.
* doc/coreutils.texi (cp invocation): Adjust the description of -n.
* src/system.h (emit_update_parameters_note): Adjust --update=none
comparison.
* tests/cp/cp-i.sh: Adjust -n exit status checks.
* tests/mv/mv-n.sh: Likewise.
* NEWS: Mention the change in behavior.
Fixes https://bugs.gnu.org/62572

17 months ago build: fix libcrypto version linked by sort at runtime
Pádraig Brady [Mon, 26 Feb 2024 16:38:41 +0000 (16:38 +0000)] 
build: fix libcrypto version linked by sort at runtime

One should link the versioned lib at runtime,
and the unversioned lib at build time,
as the unversioned lib may not be installed at runtime,
and the versioned name better couples the binary with the required version.

* configure.ac: Define LIBCRYPTO_SONAME, determined from
the test binary linked with -lcrypto.  Also document
why we use SHA512() in the check, rather than MD5().
* src/sort.c (link_libcrypto): Use the versioned lib in dlopen().

17 months ago maint: avoid sc_tight_scope failure in sort.c
Pádraig Brady [Mon, 26 Feb 2024 14:42:40 +0000 (14:42 +0000)] 
maint: avoid sc_tight_scope failure in sort.c

* cfg.mk: Exclude the ptr_MD5_* symbols added in
commit v9.4-130-g7f57ac2d2, as there is no way
to declare these static given the way they're defined.

17 months ago doc: mention -lcrypto change in NEWS
Paul Eggert [Mon, 26 Feb 2024 05:24:04 +0000 (21:24 -0800)] 
doc: mention -lcrypto change in NEWS

17 months ago sort: dynamically link -lcrypto if -R
Paul Eggert [Mon, 26 Feb 2024 01:13:12 +0000 (17:13 -0800)] 
sort: dynamically link -lcrypto if -R

This saves time in the usual case, which does not need -lcrypto.
* configure.ac (DLOPEN_LIBCRYPTO): New macro.
* src/sort.c [DLOPEN_LIBCRYPTO && HAVE_OPENSSL_MD5]: New macros
MD5_Init, MD5_Update, MD5_Final.  Include "md5.h" after defining
them.  Include <dlfcn.h>, and define new functions link_failure
and symbol_address.
(link_libcrypto): New function.
(random_md5_state_init): Call it before using crypto functions.
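
The lazy-loading idea described above can be sketched as follows. This is a minimal illustration, not the code in src/sort.c: the helper name lazy_symbol is hypothetical, and the library and symbol names are whatever the caller passes in. It uses only the POSIX dlopen and dlsym calls; on glibc older than 2.34 the program would also need to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load LIBNAME on demand and return the address
   of SYMBOL, or NULL if the library or the symbol cannot be found.
   The handle is deliberately never closed, so the returned address
   stays valid for the lifetime of the process.  */
static void *
lazy_symbol (const char *libname, const char *symbol)
{
  void *handle = dlopen (libname, RTLD_LAZY | RTLD_GLOBAL);
  if (!handle)
    return NULL;
  return dlsym (handle, symbol);
}
```

A caller would invoke this once, just before the first use of the crypto functions, for example lazy_symbol ("libcrypto.so.3", "MD5_Init") where the soname is only an example, and report a link failure when the result is NULL. Programs that never reach that point pay no startup cost for the library.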

17 months ago doc: de-“note” the documentation
Paul Eggert [Sat, 24 Feb 2024 22:03:42 +0000 (14:03 -0800)] 
doc: de-“note” the documentation

* doc/coreutils.texi, man/readlink.x, man/runcon.x:
* src/comm.c (usage):
* src/digest.c (usage):
* src/echo.c (usage):
* src/join.c (usage):
* src/ln.c (usage):
* src/rm.c (usage):
* src/stat.c (usage):
* src/system.h (USAGE_BUILTIN_WARNING):
* src/test.c (usage):
* src/touch.c (usage):
* src/uniq.c (usage):
Rewrite to avoid most uses of “Note that” and similar wording.
These circumlocutions are rarely needed, and avoiding them
improves readability and lessens preaching.

17 months ago cp: add --keep-directory-symlink option
Daan De Meyer [Thu, 25 Jan 2024 13:02:32 +0000 (14:02 +0100)] 
cp: add --keep-directory-symlink option

When recursively copying files into OS trees, it often happens that
some subdirectory of the source directory is a symlink in the target
directory. Currently, cp will fail in that scenario with the error:

"cannot overwrite non-directory %s with directory %s"

However, we'd like cp in this scenario to follow the destination
directory symlink and copy the files into the symlinked directory
instead. Let's support this by adding a new option
--keep-directory-symlink that makes cp follow destination directory
symlinks.

We name the option --keep-directory-symlink to keep consistent with
tar which has the same option with the same effect.

* doc/coreutils.texi (cp invocation): Describe the new option.
* src/copy.h: Add the new setting.
* src/copy.h: Adjust to follow symlinks if setting enabled.
* src/cp.c (usage): Describe the new option.
(main): Accept the new option.
* tests/cp/keep-directory-symlink.sh: A new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the new feature.

17 months ago dircolors: update list of archive file extensions
Michel Lind [Fri, 16 Feb 2024 16:24:32 +0000 (10:24 -0600)] 
dircolors: update list of archive file extensions

* src/dircolors.hin: Sort archive section by extension.
Treat .crate (Rust archives) as archive files
(they're essentially tar.gz files).

17 months ago maint: prefer #include <...> for gnulib substitute headers
Collin Funk [Sun, 18 Feb 2024 20:23:07 +0000 (12:23 -0800)] 
maint: prefer #include <...> for gnulib substitute headers

* src/shuf.c: Change #include "getopt.h" to #include <getopt.h>.
* src/stat.c: Change #include "getopt.h" to #include <getopt.h>.
* src/system.h: Change #include "error.h" to #include <error.h>.

Copyright-paperwork-exempt: Yes

17 months ago doc: add '[' to the info index
Pádraig Brady [Mon, 19 Feb 2024 14:14:21 +0000 (14:14 +0000)] 
doc: add '[' to the info index

* doc/coreutils.texi (test invocation): Add '[' to the index.

17 months ago build: update gnulib submodule to latest
Paul Eggert [Sun, 18 Feb 2024 05:55:56 +0000 (21:55 -0800)] 
build: update gnulib submodule to latest

17 months ago nohup: document GCC bug number
Paul Eggert [Sun, 18 Feb 2024 05:51:46 +0000 (21:51 -0800)] 
nohup: document GCC bug number

* src/nohup.c: Add GCC bug number to comment.

17 months ago ls: remove unnecessary pragmas
Paul Eggert [Sun, 18 Feb 2024 05:51:03 +0000 (21:51 -0800)] 
ls: remove unnecessary pragmas

* src/ls.c (decode_switches): Remove pragmas.  They are no longer
needed to pacify GCC 13.2.1 with --enable-gcc-checking, and there’s
little point keeping them around for older GCC versions.

17 months ago maint: update GCC version comment
Paul Eggert [Sun, 18 Feb 2024 05:48:15 +0000 (21:48 -0800)] 
maint: update GCC version comment

* src/copy.c: Update comment.

17 months ago maint: document fix for GCC bug 109628
Paul Eggert [Sun, 18 Feb 2024 05:40:20 +0000 (21:40 -0800)] 
maint: document fix for GCC bug 109628

* src/fmt.c [14 <= __GNUC__]: Stop using pragma workaround,
as the GCC folks say the bug is no longer present in GCC 14.

17 months ago maint: remove unneeded suggest-attributes pragmas
Paul Eggert [Sun, 18 Feb 2024 05:36:37 +0000 (21:36 -0800)] 
maint: remove unneeded suggest-attributes pragmas

* gl/lib/fadvise.c: Remove pragma that works around GCC bug 83559
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83559>.
This bug was fixed in GCC 9, and we needn’t worry about
--enable-gcc-warnings for compilers that old.
* src/test.c: Likewise.

17 months ago doc: fix typo in shred example
Greg Wooledge [Sat, 17 Feb 2024 13:07:12 +0000 (13:07 +0000)] 
doc: fix typo in shred example

* doc/coreutils.texi (shred invocation): Fix the example
to correctly close file descriptor 3.
* THANKS.in: Remove old email since now recorded in repo history.
Reported at https://bugs.debian.org/1063837

17 months ago maint: avoid -Wshadow warning under clang
Collin Funk [Wed, 7 Feb 2024 11:58:02 +0000 (03:58 -0800)] 
maint: avoid -Wshadow warning under clang

* src/env.c (parse_signal_action_params, parse_signal_block_params):
Rename OPTARG to ARG so that it does not conflict with OPTARG used by
getopt.

Copyright-paperwork-exempt: Yes

17 months ago build: fix od build on clang < 17
Pádraig Brady [Wed, 7 Feb 2024 10:55:00 +0000 (10:55 +0000)] 
build: fix od build on clang < 17

* configure.ac: Ensure the compiler can promote 16 bit floating point
types to float, before enabling that code in od.  This was an issue
with clang 16 at least.
* src/od.c: Adjust for the new defines.
* tests/od/od-float.sh: Likewise.  Also port to the dash shell,
whose inbuilt printf doesn't support hex escapes.

18 months ago od: support half precision floating point
Pádraig Brady [Thu, 1 Feb 2024 17:59:51 +0000 (17:59 +0000)] 
od: support half precision floating point

Rely on compiler support for _Float16 and __bf16
to support -fH and -fB formats respectively.
I.e. IEEE 16 bit, and brain 16 bit floats respectively.
Modern GCC and LLVM compilers support both types.

clang-sect=half-precision-floating-point
https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html
https://clang.llvm.org/docs/LanguageExtensions.html#$clang-sect
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0192r4.html
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1467r9.html

This was tested on:
gcc 13, clang 17 x86 (Both types supported)
gcc 7 aarch64 (Only -fH supported)
gcc 13 ppc(be) (Neither supported. Both will be with GCC 14)

* src/od.c: Support -tf2 or -tfH to print IEEE 16 bit floating point,
or -tfB to print Brain 16 bit floating point.
* configure.ac: Check for _Float16 and __bf16 types.
* doc/coreutils.texi (od invocation): Mention the new -f types.
* tests/od/od-float.sh: Add test cases.
* NEWS: Mention the new feature.
Addresses https://bugs.gnu.org/68871
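
To make the two 16-bit formats concrete, here is a portable decoding sketch. It is not how od does it (per the message above, od relies on the compiler's _Float16 and __bf16 types); the function names are hypothetical, and the code assumes that float is IEEE 754 binary32.

```c
#include <stdint.h>
#include <string.h>

/* Decode a bfloat16 bit pattern: it is simply the upper 16 bits
   of a binary32 value.  */
static float
bf16_to_float (uint16_t b)
{
  uint32_t bits = (uint32_t) b << 16;
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Decode an IEEE 754 binary16 bit pattern
   (1 sign bit, 5 exponent bits, 10 fraction bits).  */
static float
half_to_float (uint16_t h)
{
  uint32_t sign = (uint32_t) (h & 0x8000) << 16;
  uint32_t exp = (h >> 10) & 0x1F;
  uint32_t frac = h & 0x3FF;
  uint32_t bits;
  float f;

  if (exp == 0)
    {
      /* Zero or subnormal: the value is frac * 2^-24,
         which is exactly representable in binary32.  */
      f = (float) frac * (1.0f / 16777216.0f);
      return sign ? -f : f;
    }

  /* Infinity and NaN keep an all-ones exponent; normal numbers
     are rebiased from 15 to 127 (that is, add 112).  */
  bits = sign | ((exp == 31 ? 255u : exp + 112) << 23) | (frac << 13);
  memcpy (&f, &bits, sizeof f);
  return f;
}
```

The bfloat16 format trades fraction bits for the full binary32 exponent range, which is why its decoding is a plain shift, while binary16 needs its exponent rebiased.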

18 months ago seq: say why not ‘x += step’
Paul Eggert [Mon, 29 Jan 2024 07:35:49 +0000 (23:35 -0800)] 
seq: say why not ‘x += step’

* src/seq.c (print_numbers): Add comment.
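
The commit only adds a comment, but the underlying point is presumably the usual one about accumulated rounding error. A minimal sketch, using double rather than whatever type seq uses internally, and with hypothetical function names:

```c
/* Return term number N (0-based) by repeated addition.
   Each addition can round, and the errors accumulate.  */
static double
term_by_accumulation (double first, double step, int n)
{
  double x = first;
  for (int i = 0; i < n; i++)
    x += step;
  return x;
}

/* Return term number N with one multiplication and one addition,
   so the rounding error does not grow with N.  */
static double
term_by_multiplication (double first, double step, int n)
{
  return first + n * step;
}
```

With first = 0, step = 0.1 and n = 10 on IEEE 754 doubles, the multiplied form yields exactly 1.0 while the accumulated form does not, which matters when the result is compared against an end value.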

18 months ago doc: split -C: test and document a heap overflow
Pádraig Brady [Thu, 18 Jan 2024 00:05:18 +0000 (00:05 +0000)] 
doc: split -C: test and document a heap overflow

This was introduced in coreutils 9.2 through commit v9.1-184-g40bf1591b,
and was fixed in coreutils 9.5 through commit v9.4-111-gc4c5ed8f4.
This issue has been assigned CVE-2024-0684.

* NEWS: Mention the bug fix.
* tests/split/line-bytes.sh: Add a test case.
Reported by Valentin Metz.

18 months ago tests: make ulimit -v interact better with ASAN
Pádraig Brady [Wed, 17 Jan 2024 23:49:52 +0000 (23:49 +0000)] 
tests: make ulimit -v interact better with ASAN

ulimit -v is generally not supported with ASAN, giving errors like:
  "ReserveShadowMemoryRange failed while trying to map 0x... bytes.
   Perhaps you're using ulimit -v"

* tests/cp/link-heap.sh: Mention ASAN as a possible reason for skipping.
* tests/csplit/csplit-heap.sh: Likewise.
* tests/cut/cut-huge-range.sh: Likewise.
* tests/dd/no-allocate.sh: Likewise.
* tests/printf/printf-surprise.sh: Likewise.
* tests/rm/many-dir-entries-vs-OOM.sh: Likewise.
* tests/head/head-c.sh: Only skip the part of the test needing ulimit.
* tests/split/line-bytes.sh: Likewise.

18 months ago split: do not shrink hold buffer
Paul Eggert [Tue, 16 Jan 2024 21:48:32 +0000 (13:48 -0800)] 
split: do not shrink hold buffer

* src/split.c (line_bytes_split): Do not shrink hold buffer.
If it’s large for this batch it’s likely to be large for the next
batch, and for ‘split’ it’s not worth the complexity/CPU hassle to
shrink it.  Do not assume hold_size can be bufsize.

18 months ago tests: ls: add a test to verify that '+' is added
Sylvestre Ledru [Wed, 10 Jan 2024 18:18:05 +0000 (19:18 +0100)] 
tests: ls: add a test to verify that '+' is added

* tests/ls/acl.sh: Add a new test.
* tests/local.mk: Reference the new test.

19 months ago maint: add attributes to two functions without side effects
Samuel Tardieu [Fri, 5 Jan 2024 15:51:34 +0000 (16:51 +0100)] 
maint: add attributes to two functions without side effects

* src/date.c (res_width): This function computes its result solely
from the value of its parameter and qualifies for the const attribute.
* src/tee.c (get_next_out): This function has no side effect and
qualifies for the pure attribute.
* THANKS.in: Remove duplicate now that author has a commit in the repo.

Those two functions were flagged by GCC 12.3.0,
though not by GCC 13.2.1.
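
The difference between the two attributes can be illustrated with a pair of small functions. These are hypothetical examples, not the res_width and get_next_out functions named above; the __attribute__ syntax is a GCC and Clang extension.

```c
#include <stddef.h>

/* 'const': the result depends only on the argument value,
   and no memory is read.  */
__attribute__ ((const))
static int
decimal_digits (unsigned int n)
{
  int digits = 1;
  while (n >= 10)
    {
      n /= 10;
      digits++;
    }
  return digits;
}

/* 'pure': no side effects, but the function reads memory through S,
   so it does not qualify for 'const'.  */
__attribute__ ((pure))
static size_t
count_char (const char *s, char c)
{
  size_t n = 0;
  for (; *s; s++)
    n += (*s == c);
  return n;
}
```

Both attributes let the compiler merge or drop repeated calls; 'const' is the stronger promise, since a 'pure' call must still be redone when the memory it reads may have changed.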

19 months ago maint: update all copyright year number ranges
Pádraig Brady [Mon, 1 Jan 2024 13:22:42 +0000 (13:22 +0000)] 
maint: update all copyright year number ranges

Update to latest gnulib with new copyright year.
Run "make update-copyright" and then...

* gnulib: Update included in this commit as copyright years
are the only change from the previous gnulib commit.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Manually update copyright year,
until we fully sync with gnulib at a later stage.
* tests/sample-test: Adjust to use the single most recent year.

19 months ago maint: pacify recent clang better
Paul Eggert [Mon, 1 Jan 2024 03:48:24 +0000 (19:48 -0800)] 
maint: pacify recent clang better

* configure.ac: Clang now seems to have -Wformat-extra-args,
-Wimplicit-const-int-float-conversion, and
-Wtautological-constant-out-of-range-compare on by default,
so disable them even if --enable-gcc-warnings is not used.
Rely on Gnulib’s check for clang rather than rolling our own.

19 months ago maint: pacify clang -Winclude-next-absolute-path
Paul Eggert [Mon, 1 Jan 2024 03:48:24 +0000 (19:48 -0800)] 
maint: pacify clang -Winclude-next-absolute-path

* gl/lib/xdectoint.c: Use #include <...> instead of #include "...".

19 months ago build: update gnulib submodule to latest
Paul Eggert [Mon, 1 Jan 2024 03:48:24 +0000 (19:48 -0800)] 
build: update gnulib submodule to latest

19 months ago ls: omit bad_cast
Paul Eggert [Fri, 29 Dec 2023 00:32:28 +0000 (16:32 -0800)] 
ls: omit bad_cast

* src/ls.c (decode_switches): Declare some local vars to be
char const *, not char *, and omit unnecessary bad_cast calls.

19 months ago split: omit bad_cast
Paul Eggert [Fri, 29 Dec 2023 00:32:28 +0000 (16:32 -0800)] 
split: omit bad_cast

* src/split.c (infile): Now char const *, not char *.
(main): Omit unnecessary bad_cast calls.

19 months ago sort: fix thousands grouping handling on single byte locales
Pádraig Brady [Thu, 28 Dec 2023 00:02:42 +0000 (00:02 +0000)] 
sort: fix thousands grouping handling on single byte locales

* gl/lib/strnumcmp-in.h (numcompare): After commit v9.0-8-g6cafb122f,
we need to treat characters as signed to avoid invalid comparisons
between negative integers and unsigned characters.
* NEWS: Mention the bug fix.
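
The class of bug described above can be reproduced in isolation. The sketch below is not the numcompare code; all names are hypothetical, and it assumes the separator is held in an int that was converted from a signed character, so that a byte such as 0xA0 becomes a negative value.

```c
#include <stdbool.h>

/* Hypothetical: a separator stored as an int converted from a
   signed char.  For byte 0xA0 this is -96 on usual platforms.  */
static int
separator_from_byte (unsigned char byte)
{
  return (signed char) byte;
}

/* Compare the first byte of S, read as unsigned, against SEP.
   A negative SEP can never match, since the byte is 0..255.  */
static bool
matches_unsigned (const char *s, int sep)
{
  unsigned char c = *s;
  return c == sep;
}

/* Compare the first byte of S, read as signed, against SEP.  */
static bool
matches_signed (const char *s, int sep)
{
  signed char c = *s;
  return c == sep;
}
```

ASCII separators such as ',' behave the same either way, which is presumably why such a mismatch only shows up with non-ASCII grouping characters in single-byte locales.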

19 months ago tests: numfmt: fix test related to lower case 'k' SI unit
Pádraig Brady [Wed, 27 Dec 2023 23:37:17 +0000 (23:37 +0000)] 
tests: numfmt: fix test related to lower case 'k' SI unit

* tests/misc/numfmt.pl: Following on from v9.4-86-g615167cc4,
adjust this test accordingly.  This test was being skipped
on some systems, and so only noticed now.
Reported by Jim Meyering.

19 months ago tests: run locale tests on more systems
Pádraig Brady [Wed, 27 Dec 2023 22:47:48 +0000 (22:47 +0000)] 
tests: run locale tests on more systems

* tests/misc/numfmt.pl: Determine the thousands grouping character
in use, rather than skipping locale tests when it's not a space.
For example fr_FR.UTF-8 uses "NARROW NO-BREAK SPACE" as the grouping
char on modern glibc systems at least.
* tests/sort/sort-h-thousands-sep.sh: Likewise.

19 months ago maint: distribute new header from previous commit
Pádraig Brady [Fri, 29 Dec 2023 17:51:19 +0000 (17:51 +0000)] 
maint: distribute new header from previous commit

* src/local.mk: Reference the new header, so it's distributed.

19 months ago maint: merge chgrp and chown sources
Pádraig Brady [Wed, 27 Dec 2023 13:28:02 +0000 (13:28 +0000)] 
maint: merge chgrp and chown sources

chown is a close superset of chgrp functionality,
so merge sources to avoid unwanted divergence in future.
This removes about 300 lines in chgrp.c

* build-aux/gen-single-binary.sh: Generate new rules for chgrp.
* cfg.mk: Exclude new wrappers.
* po/POTFILES.in: Remove chgrp.c
* src/chgrp.c: Remove.
* src/chown-chgrp.c: New wrapper.
* src/chown-chown.c: Likewise.
* src/chown.c (main): Prepend ':' for chgrp(1).
* src/chown.h: Define both operating modes.
(usage): Adjust depending on utility being called.
* src/coreutils-chgrp.c: Likewise.
* src/local.mk: Reference new wrappers.

19 months ago copy,install: avoid unnecessary security context translations
Christian Göttsche [Tue, 19 Dec 2023 14:55:28 +0000 (15:55 +0100)] 
copy,install: avoid unnecessary security context translations

Do not perform SELinux context translation for operations not involving
user input or output.  Context translation converts MCS/MLS labels into
human readable form, which is useful for user facing applications like
ls(1) or the --context=CTX argument of cp(1).

* src/copy.c (set_process_security_ctx): Use raw selinux variants.
* src/install.c (need_copy): Likewise.
(setdefaultfilecon): Likewise.
* src/selinux.c (computecon): Likewise.
(defaultcon): Likewise.
* tests/cp/no-ctx.sh: Add raw variants to preload lib.
* NEWS: Mention the improvement.

19 months ago build: update gnulib to latest
Pádraig Brady [Tue, 19 Dec 2023 17:18:46 +0000 (17:18 +0000)] 
build: update gnulib to latest

* gnulib: Primarily to get raw selinux wrappers

19 months ago maint: avoid false positive warning with newer gcc
Pádraig Brady [Sun, 17 Dec 2023 17:13:31 +0000 (17:13 +0000)] 
maint: avoid false positive warning with newer gcc

* src/pr.c (read_line): GCC 13.2.1 can't discern that CHARS
is not used with '\n', so avoid the -Werror=maybe-uninitialized
issue in dev builds.

19 months ago doc: cp --no-clobber: improve documentation
Pádraig Brady [Sun, 17 Dec 2023 14:35:36 +0000 (14:35 +0000)] 
doc: cp --no-clobber: improve documentation

* doc/coreutils.texi (cp invocation): Reference the related --update
option, like we had already done in mv invocation.
* src/cp.c (usage): State clearly what --no-clobber does,
indicating it's protection focused, rather than being update focused.

19 months ago chgrp: add --from parameter similar to chown
Pádraig Brady [Wed, 27 Sep 2023 19:32:06 +0000 (20:32 +0100)] 
chgrp: add --from parameter similar to chown

* doc/coreutils.texi (chown invocation): Convert --from option
description to a macro and call from ...
(chgrp description): ... here.
* src/chown-core.h (emit_from_option_description): A new function
refactored from ...
* src/chown.c (usage): ... here, and called from ...
* src/chgrp.c (usage): ... here.
(main): Accept the --from option as chown(1) does.
* po/POTFILES.in: Add chown-core.h as now translated.
* tests/chown/basic.sh: Decouple the root user from id 0.
* tests/chgrp/from.sh: A new test largely based on chown/basic.sh.
* tests/local.mk: Reference the new test.
* NEWS: Mention the new feature.
Suggested by Ed Neville.

19 months ago maint: remove obsolete AC_PROG_GCC_TRADITIONAL
Pádraig Brady [Mon, 11 Dec 2023 17:03:33 +0000 (17:03 +0000)] 
maint: remove obsolete AC_PROG_GCC_TRADITIONAL

* configure.ac: Remove obsolete macro call.
Recent autoconf warns that it is obsolete.
AC_PROG_CPP sets up the -traditional-cpp option if required.
GCC ignores -traditional since commit f458d1d5 (2002).
Fixes https://bugs.gnu.org/67756

19 months ago doc: ls: fix regression in -f description
Pádraig Brady [Mon, 11 Dec 2023 14:20:47 +0000 (14:20 +0000)] 
doc: ls: fix regression in -f description

The description of -f regressed in coreutils 9.0

* doc/coreutils.texi (ls invocation): Detail which options
are enabled/disabled with -f.
* src/ls.c (usage): Likewise.
(decode_switches): Update comments.
Fixes https://bugs.gnu.org/67765

19 months ago maint: add list/obstack.h to .gitignore
Pádraig Brady [Mon, 11 Dec 2023 14:33:14 +0000 (14:33 +0000)] 
maint: add list/obstack.h to .gitignore

Following recent gnulib update

19 months ago build: update gnulib submodule to latest
Pádraig Brady [Sun, 10 Dec 2023 19:04:59 +0000 (19:04 +0000)] 
build: update gnulib submodule to latest

* bootstrap: Copy from latest Gnulib,
to fix --bootstrap-sync with other options.

20 months ago doc: touch: clarify --time description in man page
Pádraig Brady [Wed, 6 Dec 2023 13:03:48 +0000 (13:03 +0000)] 
doc: touch: clarify --time description in man page

* src/touch.c (usage): Reorganise the description to be similar to
the format used for the ls --time description, which formats better
when converted to a man page.  Also separate the description
to allow for more granular translations.
Fixes https://bugs.gnu.org/67656

20 months ago tail: fix tailing sysfs files where PAGE_SIZE > BUFSIZ
dann frazier [Thu, 30 Nov 2023 01:32:34 +0000 (18:32 -0700)] 
tail: fix tailing sysfs files where PAGE_SIZE > BUFSIZ

* src/tail.c (file_lines): Ensure we use a buffer size >= PAGE_SIZE when
searching backwards to avoid seeking within a file,
which on sysfs files is accepted but also returns no data.
* tests/tail/tail-sysfs.sh: Add a new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/67490
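
The sizing rule described above (never read backwards with a buffer smaller than a page) might be sketched like this. The function name is hypothetical and the real change in src/tail.c may compute the size differently; only BUFSIZ and the POSIX sysconf call are relied on.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: return a read size that is at least BUFSIZ
   and at least one page, so that a file no larger than a page can
   be read in one go without seeking inside it.  */
static size_t
backward_read_size (void)
{
  size_t size = BUFSIZ;
  long page = sysconf (_SC_PAGESIZE);
  if (page > 0 && (size_t) page > size)
    size = (size_t) page;
  return size;
}
```

On systems with 64 KiB pages, for instance, BUFSIZ is typically much smaller than a page, which is the situation named in the commit title.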

20 months ago numfmt: support lowercase 'k' for Kilo and Kibi
Pádraig Brady [Sun, 26 Nov 2023 16:41:56 +0000 (16:41 +0000)] 
numfmt: support lowercase 'k' for Kilo and Kibi

For consistency with the "SI" standard, and with other coreutils
which output a lowercase 'k' in "SI" mode.

* src/numfmt.c (suffix_power): Treat 'k' like 'K' on input.
(double_to_human): Output lowercase 'k' in SI mode.
(usage): Adjust accordingly.
* doc/coreutils.texi: Mention 'k' accepted, and printed in SI mode.
* tests/misc/numfmt.pl: Adjust accordingly.
* NEWS: Mention the change in behavior.
Fixes https://bugs.gnu.org/47103

20 months ago uniq: fix bug with -w in multibyte locales
Paul Eggert [Thu, 16 Nov 2023 19:34:55 +0000 (11:34 -0800)] 
uniq: fix bug with -w in multibyte locales

-w counted bytes not characters, which is wrong in multibyte locales.
This bug exists even in Fedora, which is why the recently-added
test cases from Fedora didn’t catch it.
* src/uniq.c (find_field): New arg PLEN.  All callers changed.
Compute length of field correctly in multi-byte locales.
(different): Don’t worry about check_chars; find_field now does that.
* tests/uniq/uniq.pl: Test for this bug.
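
The byte versus character distinction can be shown with a small stand-in. It assumes valid UTF-8 input and counts characters by skipping continuation bytes; uniq itself uses gnulib's multi-byte machinery rather than anything like this, and the function name is hypothetical.

```c
#include <stddef.h>

/* Return the number of bytes occupied by the first NCHARS characters
   of the valid UTF-8 string S (or by all of S if it is shorter).  */
static size_t
utf8_prefix_bytes (const char *s, size_t nchars)
{
  size_t i = 0;
  while (s[i] && nchars)
    {
      i++;                      /* lead byte */
      while (((unsigned char) s[i] & 0xC0) == 0x80)
        i++;                    /* continuation bytes (10xxxxxx) */
      nchars--;
    }
  return i;
}
```

For the string "hé" followed by "llo", the first two characters occupy three bytes, so comparing only the first two bytes would cut the "é" in half.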

20 months ago tests: omit inapplicable test code
Paul Eggert [Thu, 16 Nov 2023 18:12:55 +0000 (10:12 -0800)] 
tests: omit inapplicable test code

* tests/misc/join.pl, tests/uniq/uniq.pl:
Remove test for "invalid byte, character or field list" message
that is not generated.

20 months ago uniq: change macro to function
Paul Eggert [Wed, 15 Nov 2023 23:08:34 +0000 (15:08 -0800)] 
uniq: change macro to function

* src/uniq.c (swap_lines): New static function, replacing
the old SWAP_LINES macro.  These days this is just as fast.
All uses changed.

20 months ago uniq: prefer static init
Paul Eggert [Wed, 15 Nov 2023 23:05:17 +0000 (15:05 -0800)] 
uniq: prefer static init

* src/uniq.c (skip_fields, skip_chars, check_chars, count_occurrences)
(output_unique, output_first_repeated, output_later_repeated)
(delimit_groups): Initialize statically, rather than in ‘main’.
This shrinks the executable a bit.

20 months ago uniq: simplify and fix unlikely bug by using bool
Paul Eggert [Wed, 15 Nov 2023 22:57:17 +0000 (14:57 -0800)] 
uniq: simplify and fix unlikely bug by using bool

* src/uniq.c (enum countmode): Remove this type.
(count_occurrences): New static var, replacing the old countmode,
and of type boolean instead of a two-value enum type that was
confusing (and which caused a hard-to-test bug when the count
exceeded INTMAX_MAX - 1).  All uses changed.

20 months ago uniq: prefer signed integers
Paul Eggert [Wed, 15 Nov 2023 07:13:26 +0000 (23:13 -0800)] 
uniq: prefer signed integers

* src/uniq.c (skip_fields, skip_chars, check_chars, size_opt)
(find_field, different, writeline, check_file, main):
Prefer signed to unsigned integer types, since this allows
for better runtime checking with -fsanitize=undefined.

20 months ago maint: DECIMAL_DIGIT_ACCUMULATE uses stdckdint.h
Paul Eggert [Wed, 15 Nov 2023 04:35:56 +0000 (20:35 -0800)] 
maint: DECIMAL_DIGIT_ACCUMULATE uses stdckdint.h

* src/system.h: Include <stdckdint.h>, since the new
DECIMAL_DIGIT_ACCUMULATE uses it.
Do not include stdckdint.h from files that also include system.h.
(DECIMAL_DIGIT_ACCUMULATE): Omit last arg, which is no longer needed.
Reimplement by using C23-style stdckdint.h’s ckd_mul and ckd_add,
as that’s more standard and is more likely to generate better code.
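
The checked-arithmetic pattern can be sketched as below. This is not the DECIMAL_DIGIT_ACCUMULATE macro itself: the helper name and its return convention are hypothetical. ckd_mul and ckd_add are the C23 <stdckdint.h> macros; the fallback to GCC/Clang builtins is an assumption added here so the sketch also compiles on toolchains that lack that header.

```c
#include <stdbool.h>

#ifdef __has_include
# if __has_include (<stdckdint.h>)
#  include <stdckdint.h>
#  define HAVE_STDCKDINT_H 1
# endif
#endif
#ifndef HAVE_STDCKDINT_H
/* Fallback for toolchains without C23 <stdckdint.h>.  */
# define ckd_add(r, a, b) __builtin_add_overflow ((a), (b), (r))
# define ckd_mul(r, a, b) __builtin_mul_overflow ((a), (b), (r))
#endif

/* Hypothetical helper: set *ACCUM to *ACCUM * 10 + DIGIT.
   Return true, leaving *ACCUM unchanged, if that would overflow.  */
static bool
accumulate_digit (long *accum, int digit)
{
  long tmp;
  if (ckd_mul (&tmp, *accum, 10) || ckd_add (&tmp, tmp, digit))
    return true;
  *accum = tmp;
  return false;
}
```

Because the ckd_ macros report overflow through their return value, the caller needs no separate bound argument or pre-check against the type's maximum.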

20 months ago pinky: fix string size calculation
Paul Eggert [Sat, 11 Nov 2023 08:17:11 +0000 (00:17 -0800)] 
pinky: fix string size calculation

* src/pinky.c (count_ampersands): Simplify and return idx_t.
(create_fullname): Compute proper destination string size,
basically, by adding (ulen - 1) * ampersands rather than ulen *
(ampersands - 1).  Problem found on CHERI-64.
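
The size arithmetic can be checked with a small stand-alone sketch. It assumes the conventional meaning of '&' in a GECOS name field (replace it with the capitalized login name); the function names are hypothetical and, unlike real code, the sketch does not guard the size computation against overflow.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Count the '&' characters in S.  */
static size_t
count_amps (const char *s)
{
  size_t n = 0;
  for (; *s; s++)
    n += (*s == '&');
  return n;
}

/* Return a malloc'd copy of GECOS with each '&' replaced by USER,
   first letter capitalized, or NULL on allocation failure.  */
static char *
expand_fullname (const char *gecos, const char *user)
{
  size_t ulen = strlen (user);
  size_t amps = count_amps (gecos);
  /* Each 1-byte '&' becomes ULEN bytes, a net growth of
     (ulen - 1) * amps.  Written this way it cannot underflow
     even when ULEN is 0.  */
  size_t size = strlen (gecos) - amps + ulen * amps + 1;
  char *result = malloc (size);
  if (!result)
    return NULL;

  char *r = result;
  for (; *gecos; gecos++)
    {
      if (*gecos == '&')
        {
          memcpy (r, user, ulen);
          if (ulen)
            *r = (char) toupper ((unsigned char) *r);
          r += ulen;
        }
      else
        *r++ = *gecos;
    }
  *r = '\0';
  return result;
}
```

For gecos "& Smith (&)" and user "john", the input is 11 bytes with 2 ampersands, so the buffer needs 11 + (4 - 1) * 2 + 1 = 18 bytes. The other formula quoted in the commit, ulen * (ampersands - 1), would add only 4 here and leave the buffer too small.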

20 months ago maint: port randread to FreeBSD 14
Paul Eggert [Sat, 11 Nov 2023 08:14:48 +0000 (00:14 -0800)] 
maint: port randread to FreeBSD 14

* gl/lib/randread.c (POINTER_IS_ALIGNED): Rename from
ALIGNED_POINTER to avoid a collision with <machine/param.h>
on FreeBSD 14.

20 months ago build: update gnulib submodule to latest
Paul Eggert [Sat, 11 Nov 2023 03:08:54 +0000 (19:08 -0800)] 
build: update gnulib submodule to latest

21 months ago ls: fix recent regression in size alignment
Pádraig Brady [Fri, 3 Nov 2023 16:22:22 +0000 (16:22 +0000)] 
ls: fix recent regression in size alignment

* src/ls.c (print_long_format): Use the correct column width.
The wrong width was introduced by a copy/paste error in commit v9.4-2-gcbb6dfec5.
* tests/ls/size-align.sh: Add a test.
* tests/local.mk: Reference the new test.
Fixes https://bugs.gnu.org/66919

21 months ago join: fix recently introduced NUL bug
Paul Eggert [Mon, 30 Oct 2023 17:47:34 +0000 (10:47 -0700)] 
join: fix recently introduced NUL bug

* src/join.c (xfields): Simplify and fix bug with fields
that start with a NUL byte when -t is not used.
* tests/misc/join-utf8.sh: Also test when -t is not used,
and when a field starts with NUL.

21 months ago maint: pacify ‘make syntax-check’
Paul Eggert [Mon, 30 Oct 2023 08:32:37 +0000 (01:32 -0700)] 
maint: pacify ‘make syntax-check’

* tests/misc/join-utf8.sh: Omit fail=0.
Fix framework_failure_ typo.
* tests/misc/join.pl: Change ` to '.

21 months ago maint: copy join, uniq tests from Fedora
Paul Eggert [Mon, 30 Oct 2023 08:24:28 +0000 (01:24 -0700)] 
maint: copy join, uniq tests from Fedora

* tests/misc/join.pl, tests/uniq/uniq.pl:
Copy from Fedora 39.  This adds more multi-byte tests.

21 months ago join,uniq: support multi-byte separators
Paul Eggert [Mon, 30 Oct 2023 07:32:51 +0000 (00:32 -0700)] 
join,uniq: support multi-byte separators

* NEWS: Mention this.
* bootstrap.conf (gnulib_modules): Remove cu-ctype, as this module
is now more trouble than it’s worth.  All uses removed.
Add skipchars.
* gl/lib/cu-ctype.c, gl/lib/cu-ctype.h, gl/modules/cu-ctype:
Remove.
* gl/lib/skipchars.c, gl/lib/skipchars.h, gl/modules/skipchars:
* tests/misc/join-utf8.sh:
New files.
* src/join.c: Include skipchars.h and mcel.h instead of cu-ctype.h.
(tab): Now mcel_t, not int.  All uses changed.
(output_separator, output_seplen): New static vars.
(eq_tab, newline_or_blank, comma_or_blank): New functions.
(xfields, prfields, prjoin, add_field_list, main):
Support multi-byte characters.
* src/numfmt.c: Include ctype.h, skipchars.h.
Do not include cu-ctype.h.
(newline_or_blank): New function.
(next_field): Support multi-byte characters.
* src/sort.c: Include ctype.h instead of cu-ctype.h.
(inittables): Open-code field_sep since it no longer exists.
‘sort’ is not multi-byte safe yet, but when it is this code
will need revamping anyway.
* src/uniq.c: Include mcel.h and skipchars.h instead of cu-ctype.h.
(newline_or_blank): New function.
(find_field): Support multi-byte characters.
* tests/local.mk (all_tests): Add tests/misc/join-utf8.sh

21 months ago test: allow non-blank white space in numbers
Paul Eggert [Sat, 28 Oct 2023 23:15:49 +0000 (16:15 -0700)] 
test: allow non-blank white space in numbers

* src/test.c (find_int): Use isspace, not isblank,
for compatibility with how strtol works, which
is how most other shells do this.
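
The difference between the two character classes is easy to see next to strtol, which skips leading white space as defined by isspace. The helper below is a hypothetical illustration, not the find_int code.

```c
#include <ctype.h>
#include <stdlib.h>

/* Skip leading bytes of S for which CLASSIFY (for example isspace
   or isblank) returns nonzero, and return the first other byte.  */
static const char *
skip_leading (const char *s, int (*classify) (int))
{
  while (*s && classify ((unsigned char) *s))
    s++;
  return s;
}
```

In the C locale isblank accepts only space and tab, so for the string "\n\t 42" skipping with isblank stops at the newline, while skipping with isspace reaches the digits, as strtol does.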

21 months ago stdbuf: port to oddball toupper
Paul Eggert [Sat, 28 Oct 2023 16:30:49 +0000 (09:30 -0700)] 
stdbuf: port to oddball toupper

* src/stdbuf.c: Do not include ctype.h.
(set_libstdbuf_options): Use c_toupper, not toupper,
since the C locale is intended here.

21 months ago dircolors: assume C-locale spaces
Paul Eggert [Sat, 28 Oct 2023 16:22:09 +0000 (09:22 -0700)] 
dircolors: assume C-locale spaces

* src/dircolors.c: Include c-ctype.h, not ctype.h.
(parse_line): Use c_isspace, not isspace, as the .dircolors
file format (which does not seem to be documented!) appears
to be ASCII.

21 months ago maint: port to oddball tolower
Paul Eggert [Sat, 28 Oct 2023 16:07:14 +0000 (09:07 -0700)] 
maint: port to oddball tolower

* src/digest.c (hex_equal): Work even in oddball locales
where tolower does not work as expected on ASCII letters.

21 months ago maint: include ctype.h selectively
Paul Eggert [Sat, 28 Oct 2023 00:31:49 +0000 (17:31 -0700)] 
maint: include ctype.h selectively

Include ctype.h only in files that need it.  Many of its uses
are incorrect, as they assume single-byte locales.  The idea is
to remove the incorrect uses later, when there is time.
* src/chroot.c, src/csplit.c, src/dd.c, src/digest.c, src/dircolors.c:
* src/expand-common.c, src/expand.c, src/fmt.c, src/fold.c, src/ls.c:
* src/od.c, src/pinky.c, src/pr.c, src/ptx.c, src/seq.c:
* src/set-fields.c, src/split.c, src/stdbuf.c, src/test.c:
* src/tr.c, src/truncate.c, src/unexpand.c, src/wc.c:
Include ctype.h.
* src/system.h: Do not include ctype.h.


21 months ago maint: move field_sep into separate module
Paul Eggert [Sat, 28 Oct 2023 00:15:08 +0000 (17:15 -0700)] 
maint: move field_sep into separate module

This is so that we don’t need to have every source file
include ctype.h.
* bootstrap.conf (gnulib_modules): Add cu-ctype.
* gl/lib/cu-ctype.c, gl/lib/cu-ctype.h, gl/modules/cu-ctype:
New files.
* src/join.c, src/numfmt.c, src/sort.c, src/uniq.c:
Include cu-ctype.h, for field_sep.
* src/system.h (field_sep): Remove; now supplied by cu-ctype.

21 months agodigest: omit unnecessary b2sum includes
Paul Eggert [Fri, 27 Oct 2023 15:56:39 +0000 (08:56 -0700)] 
digest: omit unnecessary b2sum includes

* src/blake2/b2sum.c: Do not include string.h, errno.h,
ctype.h, unistd.h, getopt.h.

21 months agomaint: prefer c_isxdigit when that is the intent
Paul Eggert [Fri, 27 Oct 2023 15:45:50 +0000 (08:45 -0700)] 
maint: prefer c_isxdigit when that is the intent

* src/digest.c (valid_digits, split_3):
* src/echo.c (main):
* src/printf.c (print_esc):
* src/ptx.c (unescape_string):
* src/stat.c (print_it):
When the code is supposed to support only POSIX-locale hex digits,
use c_isxdigit rather than isxdigit.  Include c-ctype.h as needed.
This defends against oddball locales where isxdigit != c_isxdigit.

21 months agomaint: fix syntax check issue
Pádraig Brady [Fri, 27 Oct 2023 13:19:01 +0000 (14:19 +0100)] 
maint: fix syntax check issue

* src/basenc.c: Fix preprocessor indentation.

21 months agobase32,base64: disallow non-canonical encodings
Pádraig Brady [Fri, 27 Oct 2023 12:24:04 +0000 (13:24 +0100)] 
base32,base64: disallow non-canonical encodings

This will make decoding more resilient to corruption
whether due to transmission errors or nefarious adjustment.
See https://eprint.iacr.org/2022/361.pdf

* gnulib: Update to commit 3f463202bd enforcing canonical encoding.
* tests/basenc/base64.pl: Add test cases, and adjust existing cases.
* NEWS: Mention the change in behavior.
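
To illustrate what "non-canonical" means here: when the final base64 group carries only one or two data bytes, some low-order bits of the last data character are unused, and a canonical encoder always sets them to zero. The following is a sketch of such a check for the standard alphabet; the function names are hypothetical and this is not the gnulib code referenced above:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static char const b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Return the 6-bit value of base64 character C, or -1 if C is not in
   the standard alphabet.  */
static int
b64_value (char c)
{
  char const *p = c ? strchr (b64_alphabet, c) : NULL;
  return p ? (int) (p - b64_alphabet) : -1;
}

/* Hypothetical check: QUAD is the final 4-character group of padded
   base64 input.  Return true only if it is well formed and its unused
   trailing bits are all zero.  */
static bool
b64_final_quad_is_canonical (char const *quad)
{
  assert (quad != NULL);
  int v0 = b64_value (quad[0]);
  int v1 = b64_value (quad[1]);
  if (v0 < 0 || v1 < 0)
    return false;
  if (quad[2] == '=')
    /* One data byte: the low 4 bits of the 2nd character are unused.  */
    return quad[3] == '=' && (v1 & 0x0F) == 0;
  int v2 = b64_value (quad[2]);
  if (v2 < 0)
    return false;
  if (quad[3] == '=')
    /* Two data bytes: the low 2 bits of the 3rd character are unused.  */
    return (v2 & 0x03) == 0;
  /* Three data bytes: all 24 bits are used.  */
  return b64_value (quad[3]) >= 0;
}
```

For example, "QQ==" is the canonical encoding of the single byte 'A', while "QR==" would decode to the same byte but has a stray bit set, so the check rejects it.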

21 months agobasenc: fix unlikely locale issue; tune
Paul Eggert [Wed, 25 Oct 2023 22:09:04 +0000 (15:09 -0700)] 
basenc: fix unlikely locale issue; tune

This sped up ‘basenc -d --base16’ by 60% on my old platform,
AMD Phenom II X4 910e, Fedora 38.
* src/basenc.c (struct base16_decode_context): Simplify by
omitting have_nibble.  ‘nibble’ is now negative if it’s missing.
All uses changed.
(B16): New macro, inspired by ../lib/base64.c.
(base16_to_int): New static var, likewise.
(isubase16): Reimplement using base16_to_int, since isxdigit is
not guaranteed to succeed on the chars we want when the locale is
oddball.
(base16_decode_ctx): Tune by using base16_to_int and by

21 months agobasenc: tweak checks to use unsigned char
Paul Eggert [Wed, 25 Oct 2023 21:43:32 +0000 (14:43 -0700)] 
basenc: tweak checks to use unsigned char

This tends to generate better code, at least on x86-64,
because callers are just as fast and callees can avoid a conversion.
* src/basenc.c: The following renamings also change the arg type
from char to unsigned char.  All uses changed.
(isubase): Rename from isbase.
(isubase64url): Rename from isbase64url.
(isubase32hex): Rename from isbase32hex.
(isubase16): Rename from isbase16.
(isuz85): Rename from isz85.
(isubase2): Rename from isbase2.

21 months agobuild: update gnulib submodule to latest
Paul Eggert [Wed, 25 Oct 2023 15:45:15 +0000 (08:45 -0700)] 
build: update gnulib submodule to latest

21 months agobasenc: --base16: also allow lower case with --ignore-garbage
Pádraig Brady [Wed, 25 Oct 2023 13:04:00 +0000 (14:04 +0100)] 
basenc: --base16: also allow lower case with --ignore-garbage

* src/basenc.c (isbase16): Also return true for lower case.
* tests/basenc/basenc.pl: Add a test case.
Reported by Paul Eggert.

21 months agobasenc: --base16: support lower case hex digits
Pádraig Brady [Mon, 23 Oct 2023 11:51:19 +0000 (12:51 +0100)] 
basenc: --base16: support lower case hex digits

* src/basenc.c (base16_decode_ctx): Convert to uppercase
before converting from hex.
* tests/basenc/basenc.pl: Add a test case.
* NEWS: Mention the change in behavior.
Addresses https://bugs.gnu.org/66698

21 months agodoc: fix RFC references
Pádraig Brady [Mon, 23 Oct 2023 11:29:03 +0000 (12:29 +0100)] 
doc: fix RFC references

* doc/coreutils.texi: Adjust RFC URLs as the originals
now give 404 errors.

22 months agotests: move all basenc tests to their own directory
Pádraig Brady [Fri, 6 Oct 2023 15:31:47 +0000 (16:31 +0100)] 
tests: move all basenc tests to their own directory

* tests/misc/base64.pl: Move to tests/basenc/base64.pl
* tests/misc/basenc.pl: Move to tests/basenc/basenc.pl
* tests/local.mk: Adjust accordingly

22 months agobasenc: auto pad base32 and base64 inputs when decoding
Pádraig Brady [Thu, 5 Oct 2023 16:00:51 +0000 (17:00 +0100)] 
basenc: auto pad base32 and base64 inputs when decoding

Padding of encoded data is useful in cases where
base64 encoded data is concatenated / streamed.
I.e. where there are padding chars _within_ the stream.
In other cases padding is optional and can be inferred.
Note we continue to treat partial padding as invalid,
as that would be indicative of truncation.

* src/basenc.c (do_decode): Auto pad the end of the input.
* NEWS: Mention the change in behavior.
* tests/misc/base64.pl: Adjust to not fail for missing padding.
Addresses https://bugs.gnu.org/66265
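
The amount of padding that can be inferred follows from the input length alone. A small sketch for base64, where characters come in groups of four; b64_missing_pad is a hypothetical helper and not the do_decode logic:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: UNPADDED is base64 text with no trailing '='.
   Return how many '=' characters would complete its final
   4-character group, or -1 if no valid encoding has this length.  */
static int
b64_missing_pad (char const *unpadded)
{
  assert (unpadded != NULL);
  switch (strlen (unpadded) % 4)
    {
    case 0:
      return 0;                 /* already a whole number of groups */
    case 2:
      return 2;                 /* 2 characters carry 1 byte */
    case 3:
      return 1;                 /* 3 characters carry 2 bytes */
    default:
      return -1;                /* 1 character carries under 8 bits */
    }
}
```

A remainder of one character can never be completed, which is one reason truncated input can still be detected even when padding is optional.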

22 months agosort: improve --help
Paul Eggert [Fri, 29 Sep 2023 01:02:25 +0000 (18:02 -0700)] 
sort: improve --help

Problem reported by Jorge Stolfi (bug#66253).
* src/sort.c (usage): Suggest looking at the manual for -n details.

22 months agodoc: rm --help: mention that '.' or '..' are rejected
Pádraig Brady [Mon, 25 Sep 2023 13:46:48 +0000 (14:46 +0100)] 
doc: rm --help: mention that '.' or '..' are rejected

* src/rm.c (usage): State that '.' or '..' are rejected.

22 months agowc: pacify ‘make syntax-check’
Paul Eggert [Sun, 24 Sep 2023 00:19:35 +0000 (17:19 -0700)] 
wc: pacify ‘make syntax-check’

* src/wc_avx2.c (wc_lines_avx2): Explicitly make it ‘extern’.
Not sure why this is needed.

22 months agowc: distribute src/wc.h
Paul Eggert [Sun, 24 Sep 2023 00:18:45 +0000 (17:18 -0700)] 
wc: distribute src/wc.h

* src/local.mk (noinst_HEADERS): Add src/wc.h.

22 months agowc: goto considered harmful
Paul Eggert [Sun, 24 Sep 2023 00:07:33 +0000 (17:07 -0700)] 
wc: goto considered harmful

* src/wc.c: Do not include assure.h.  Replace the only
use of ‘assure’ with ‘unreachable’ which is good enough.
(wc, main): Remove labels and gotos.  This doesn’t affect
performance in any way I can measure, and makes the code
a bit easier to follow.

22 months agowc: prefer signed integers
Paul Eggert [Sat, 23 Sep 2023 21:22:16 +0000 (14:22 -0700)] 
wc: prefer signed integers

Prefer signed to unsigned integers, to make it easier to catch
integer overflow errors.
* src/wc.c: Do not include safe-read.h.
(total_lines_overflow, total_words_overflow, total_chars_overflow)
(total_bytes_overflow): Now bool, not uintmax_t.  All uses changed.
(max_line_length): Now intmax_t, not uintmax_t.  All uses changed.
The total_... vars are still uintmax_t because overflow into them
is checked.
(page_size): Now idx_t, not size_t.
(wc_lines, wc, get_input_fstatus, compute_number_width, main):
Prefer signed to unsigned ints where either should do.
(wc_lines, wc): Use read rather than safe_read, since we don’t
need safe_read’s checks for huge buffers.
(wc): Redo call to mbrtoc32 to lessen the number of comparisons
against its returned value.  Do this partly by keeping a pointer
to the end of the buffer rather than a count.  Simplify
overflow-checking code.
(compute_number_width): Check for integer overflow.
Don’t assume size_t fits into unsigned long.
* src/wc.h (struct wc_lines): Prefer signed integers.
* src/wc_avx2.c: Do not include safe-read.h.
(wc_lines_avx2): Prefer signed integers.  Use read, not safe_read.

22 months agowc: improve avx2 API
Paul Eggert [Sat, 23 Sep 2023 20:38:08 +0000 (13:38 -0700)] 
wc: improve avx2 API

* src/wc.c: Use "#include <...>" for files not in the current dir.
Include "wc.h" instead of declaring wc_lines_avx2 by hand.
(wc_lines): New API, with no file name (no longer needed) and
with a return struct rather than arg pointers.  All uses changed.
Use avx2_supported directly instead of using a function pointer.
Exploit C99-style declarations after statements.
Multiply by 15 rather than dividing; it’s faster and more accurate
and cannot overflow here.
(wc): Simplify based on wc_lines API change.
* src/wc.h: New file.
* src/wc_avx2.c: Include it, to check API better.
(wc_lines_avx2): Use new API.  All uses changed.  Exploit C99.
Make locals more local.

22 months agofactor,tail: avoid quadratic reallocation
Paul Eggert [Sat, 23 Sep 2023 08:15:08 +0000 (01:15 -0700)] 
factor,tail: avoid quadratic reallocation

* src/factor.c (struct mp_factors): New member nalloc.
(mp_factor_init): Initialize it.
* src/factor.c (mp_factor_insert):
* src/tail.c (parse_options): Use xpalloc to avoid quadratic
worst-case behavior on reallocation.
* src/tail.c (pids_alloc): New static var.
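
The quadratic behavior comes from growing an array by a fixed amount on each append, so that every reallocation may copy everything stored so far. Growing the capacity geometrically keeps the number of reallocations logarithmic in the final size. The sketch below uses plain realloc with a growth factor of about 1.5 as a stand-in for the idea; it does not reproduce gnulib's xpalloc interface, and count_reallocs is a hypothetical demonstration function:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Append the values 0..N-1 to a growable array and return how many
   times the array had to be reallocated, or -1 on allocation
   failure or size overflow.  */
static long
count_reallocs (size_t n)
{
  size_t *data = NULL;
  size_t len = 0, cap = 0;
  long reallocs = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (len == cap)
        {
          /* Grow by roughly half the current capacity, so N appends
             need only O(log N) reallocations.  */
          size_t new_cap = cap ? cap + cap / 2 + 1 : 4;
          if (new_cap <= cap || SIZE_MAX / sizeof *data < new_cap)
            {
              free (data);
              return -1;
            }
          size_t *p = realloc (data, new_cap * sizeof *p);
          if (!p)
            {
              free (data);
              return -1;
            }
          data = p;
          cap = new_cap;
          reallocs++;
        }
      assert (len < cap);
      data[len++] = i;
    }

  free (data);
  return reallocs;
}
```

Appending a thousand elements this way triggers only a handful of reallocations, instead of one per element.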

22 months agodoc: mention Unicode exceptions for wc
Paul Eggert [Sat, 23 Sep 2023 07:23:26 +0000 (00:23 -0700)] 
doc: mention Unicode exceptions for wc

22 months agowc: simplify by removing SUPPORT_OLD_MBRTOWC
Paul Eggert [Sat, 23 Sep 2023 07:03:41 +0000 (00:03 -0700)] 
wc: simplify by removing SUPPORT_OLD_MBRTOWC

* src/wc.c (SUPPORT_OLD_MBRTOWC): Remove.  All uses removed.
(wc): Simplify by assuming C99-or-later behavior for mbrtoc32,
which after all is a C11 API.  Fix the !SUPPORT_OLD_MBRTOWC
code, which evidently was never tested seriously.

22 months agowc: 3× speedup in C locale
Paul Eggert [Sat, 23 Sep 2023 05:09:37 +0000 (22:09 -0700)] 
wc: 3× speedup in C locale

The 3× speedup was measured by invoking 'wc $(find * -type f)'
on the coreutils sources etc. on an Ubuntu 23.04 x86-64.
These changes also speed up wc 20% in UTF-8 locales.
* src/wc.c (wc_isprint, wc_isspace): New static vars.
(wc): Use them for speed.
(main): Initialize them if needed.
(isnbspace): Remove; no longer used.
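
A per-byte lookup table is a common way to get this kind of speedup in single-byte locales: the locale-dependent classification is computed once per byte value, and the inner loop then does only an array index. The sketch below shows the idea for word counting; space_table, init_space_table and count_words are hypothetical names, and the logic is simpler than what wc actually does (for instance, it ignores printability):

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* space_table[b] is true if byte value B is white space in the
   locale that was current when init_space_table was called.  */
static bool space_table[UCHAR_MAX + 1];

static void
init_space_table (void)
{
  for (int i = 0; i <= UCHAR_MAX; i++)
    space_table[i] = isspace (i) != 0;
}

/* Count maximal runs of non-space bytes in BUF[0..N-1].
   init_space_table must have been called first.  */
static size_t
count_words (char const *buf, size_t n)
{
  assert (buf != NULL || n == 0);
  size_t words = 0;
  bool in_word = false;

  for (size_t i = 0; i < n; i++)
    {
      bool sp = space_table[(unsigned char) buf[i]];
      if (!sp && !in_word)
        words++;                /* first byte of a new word */
      in_word = !sp;
    }
  return words;
}
```

The table must be rebuilt if the locale changes, since isspace results depend on LC_CTYPE.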