Pádraig Brady [Wed, 23 Feb 2022 17:27:15 +0000 (17:27 +0000)]
tests: improve compat with macOS
* tests/misc/wc-nbsp.sh: Only the en_US.iso8859-1 form
is accepted on macOS 10.15.7 at least. GNU/Linux also
accepts ISO-8859-1 (and canonicalizes the charmap to this).
Paul Eggert [Wed, 23 Feb 2022 02:27:09 +0000 (18:27 -0800)]
dd: counts ending in "B" now count bytes
This implements my suggestion in Bug#54112.
* src/dd.c (usage): Document the change.
(parse_integer, scanargs): Implement the change.
Omit some now-obsolete checks for invalid flags.
* tests/dd/bytes.sh: Test the new behavior, while retaining
checks for the now-obsolete usage.
* tests/dd/nocache_eof.sh: Avoid now-obsolete usage.
Paul Eggert [Mon, 21 Feb 2022 19:23:02 +0000 (11:23 -0800)]
dd: support iseek= and oseek=
Alias iseek=N to skip=N, oseek=N to seek=N (Bug#45648).
* src/dd.c (scanargs): Parse iseek= and oseek=.
* tests/dd/skip-seek.pl (sk-seek5): New test case.
Paul Eggert [Mon, 21 Feb 2022 08:18:19 +0000 (00:18 -0800)]
cp: avoid unnecessary buffer allocation
Do not allocate I/O buffer if copy_file_range suffices.
* src/copy.c (sparse_copy, lseek_copy): Buffer arg is now char **
instead of char *, and buffer is now allocated only if needed.
All uses changed.
Pádraig Brady [Mon, 14 Feb 2022 21:36:29 +0000 (21:36 +0000)]
doc: use bold style for man page references
It's more common to use bold style than not,
for references to other man pages.
Ideally each man page renderer would highlight references,
but currently some rely on styles in the page itself.
* man/help2man: Implement a --bold-refs option that
will mark up references like "name(1)" with bold
style around the "name" component.
* man/local.mk: Pass --bold-refs to our help2man unless disabled.
* configure.ac: Add a --disable-bold-man-page-references option.
Addresses https://bugs.gnu.org/53977
Pádraig Brady [Sat, 12 Feb 2022 21:54:07 +0000 (22:54 +0100)]
dircolors: consider COLORTERM as well as TERM env vars
COLORTERM is an environment used usually to expose truecolor support in
terminal emulators. Therefore support matches on that in addition
to TERM. Also set the default COLORTERM match pattern so that
we apply colors if COLORTERM is any value.
This implicitly supports a terminal like "foot"
without a need for an explicit TERM entry.
* NEWS: Mention the new feature.
* src/dircolors.c (main): Match COLORTERM like we do for TERM.
* src/dircolors.hin: Add default config to match any COLORTERM.
* tests/misc/dircolors.pl: Add test cases.
Paul Eggert [Mon, 14 Feb 2022 20:00:16 +0000 (12:00 -0800)]
tr: improve multibyte etc. doc
Problem reported by Dan Jacobson (Bug#48248).
* doc/coreutils.texi (tr invocation): Improve documentation for
tr's failure to support multibyte characters POSIX-style.
* doc/coreutils.texi (tr invocation), src/tr.c (usage):
Use terminology closer to POSIX's.
Pádraig Brady [Sun, 13 Feb 2022 18:19:04 +0000 (18:19 +0000)]
dircolors: add --print-ls-colors to display colored entries
* NEWS: Mention the new feature.
* doc/coreutils.texi (dircolors invocation): Describe the new
--print-ls-colors option.
* src/dircolors.c (print_ls_colors): A new global to select
between shell or terminal output.
(append_entry): A new function refactored from dc_parse_stream()
to append the entry in the appropriate format.
(dc_parse_stream): Adjust to call append_entry().
* tests/misc/dircolors.pl: Add test cases.
Pádraig Brady [Sat, 12 Feb 2022 21:20:21 +0000 (21:20 +0000)]
chown,chgrp: reinstate numeric id output in -v messages
since gnulib commit ff208d546a,
related to coreutils commit v9.0-143-gabde15969
we no longer maintain numeric IDs through chopt->{user,group}_name.
Therefore we need to adjust to ensure tests/chown/basic.sh passes.
* src/chown-core.c (uid_to_str, gid_to_str): New helper functions
to convert numeric id to string.
(change_file_owner): Use the above new functions to pass
numeric ids to describe_change().
Paul Eggert [Sun, 13 Feb 2022 04:14:27 +0000 (20:14 -0800)]
sort: fix several version-sort problems
This also affects ls -v in some corner cases.
Problems reported by Michael Debertol <https://bugs.gnu.org/49239>.
While looking into this, I spotted some more areas where the
code and documentation did not agree, or where the documentation
was unclear. In some cases I changed the code; in others
the documentation. I hope things are nailed down better now.
* doc/sort-version.texi: Distinguish more carefully between
characters and bytes. Say that non-identical strings can
compare equal, since they now can. Improve readability in
various ways. Make it clearer that a suffix can be the
entire string.
* src/ls.c (cmp_version): Fall back on strcmp if filevercmp
reports equality, since filevercmp is no longer a total order.
* src/sort.c (keycompare): Use filenvercmp, to treat NULs correctly.
* tests/misc/ls-misc.pl (v_files):
Adjust test to match new behavior.
* tests/misc/sort-version.sh: Add tests for stability,
and for sorting with NUL bytes.
Pádraig Brady [Sat, 12 Feb 2022 18:43:25 +0000 (18:43 +0000)]
doc: avoid using "[" is URLS in --help output
* src/system.h (emit_ancillary_info): While supported if entered
manually, the "[" character is not highlighted as part of a
URL by default in terminals, so avoid using it.
Addresses https://bugs.gnu.org/53946
Pádraig Brady [Sat, 12 Feb 2022 18:23:48 +0000 (18:23 +0000)]
doc: adust --help, --version alignment
* src/system.h: Adjust the alignment of the --help
and --version option descriptions, to start at column 21.
This better aligns with the descriptions of most commands,
and also aligns with the minimum column a description must
start at to ensure a blank line is not output when a description
follows an option on a line by itself.
Pádraig Brady [Sat, 12 Feb 2022 18:16:10 +0000 (18:16 +0000)]
doc: rmdir: improve --help formatting
* src/rmdir.c (usage): Move description to column 21,
so that a --long-option on its own line without a
trailing description, doesn't have an erroneous blank
line inserted between the option and description.
Also group descriptions with blank lines rather than indents,
so that man pages don't have erroneous blank lines
added within the description.
Addresses https://bugs.gnu.org/53946
Pádraig Brady [Sat, 12 Feb 2022 17:52:32 +0000 (17:52 +0000)]
doc: ls: improve --help formatting
* src/ls.c (usage): Use blank lines to group multi-line
option descriptions, rather than indenting.
This results in more consistent alignment of descriptions,
and also avoids erroneous new lines in generated in man pages.
Addresses https://bugs.gnu.org/53946
Paul Eggert [Tue, 8 Feb 2022 18:52:10 +0000 (10:52 -0800)]
doc: improve version-sort doc
* doc/coreutils.texi, doc/sort-version.texi:
Capitalize “Coreutils”.
* doc/sort-version.texi: Don’t emphasize natural sort so much,
since Coreutils has just version sort.
Use the term “lexicographic” instead of “alphabetic” or “standard”.
Suggest combining ‘V’ with ‘b’, and show why ‘b’ is needed.
Use shorter titles for sections, as GNU Emacs displays info poorly
when titles are too long to fit in a line.
Use @samp instead of @code for samples of data.
Do not use @samp{@code{...}}; @samp{...} should suffice and
double-nesting looks bad with Emacs.
Omit blank lines in examples that would not be present
in actual shell sessions.
Quote with `` and '', not with " or with '.
Mention dpkg --compare-versions more prominently.
Don’t rely on "\n" being equivalent to "\\n" in shell args.
Prefer Unicode name for hyphen-minus.
Pádraig Brady [Sat, 5 Feb 2022 12:47:33 +0000 (12:47 +0000)]
doc: fix somewhat ambiguous date format representation
* doc/coreutils.texi (date invocation): Remove @var{...} usage,
as that capitalizes in the representation and thus somewhat
ambiguates the format wrt Month and Minute. This also avoids
a syntax check failure about redundant capitalization in @var{}.
Paul Eggert [Sat, 5 Feb 2022 02:21:06 +0000 (18:21 -0800)]
date: improve doc
Problem reported by Dan Jacobson (Bug#51288).
* doc/coreutils.texi (date invocation, Setting the time)
(Options for date):
* src/date.c (usage): Improve doc.
Paul Eggert [Fri, 4 Feb 2022 22:43:31 +0000 (14:43 -0800)]
id: print groups of listed name
Problem reported by Vladimir D. Seleznev (Bug#53631).
* src/id.c (main): Do not canonicalize user name before
deciding what groups the user belongs to.
Paul Eggert [Tue, 1 Feb 2022 06:08:56 +0000 (22:08 -0800)]
maint: suppress bogus noreturn warnings
* configure.ac: Move the single-binary code before the
gcc-warnings code, so that the latter can depend on the former.
Suppress -Wsuggest-attribute=noreturn with single binaries,
to avoid diagnostics like the following:
src/expr.c: In function 'single_binary_main_expr':
error: function might be candidate for attribute 'noreturn'
Problem reported by Pádraig Brady in:
https://lists.gnu.org/r/coreutils/2022-01/msg00061.html
Paul Eggert [Tue, 1 Feb 2022 03:55:54 +0000 (19:55 -0800)]
maint: mark some _Noreturn functions
* src/basenc.c (finish_and_exit, do_encode, do_decode):
* src/comm.c (compare_files):
* src/tsort.c (tsort):
* src/uptime.c (uptime):
Mark with _Noreturn. Otherwise, unoptimized compilations may warn
that the calling renamed-main function doesn't return a value,
when !lint and when single-binary.
Paul Eggert [Mon, 31 Jan 2022 18:20:21 +0000 (10:20 -0800)]
dd: do not access uninitialized
* src/dd.c (parse_integer): Avoid undefined behavior
that accesses an uninitialized ‘n’ when e == LONGINT_INVALID.
Return more-accurate error code when INTMAX_MAX < n.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
hostname: simplify
* src/hostname.c (sethostname): Provide a substitute on all
platforms, to simplify the mainline code.
(main): Simplify. Remove an IF_LINT.
Use main_exit rather than return.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
factor: remove IF_LINT
* src/factor.c (factor_using_squfof) [USE_SQUFOF]:
Use plain assert (...), not IF_LINT (assert (...)).
This code is currently never compiled or executed,
so this is merely a symbolic cleanup.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
cut: simplify and remove an IF_LINT
* src/cut.c (enum operating_mode, operating_mode)
(output_delimiter_specified, cut_stream):
Remove; no longer needed.
(output_delimiter_default): New static var. Code can now
use ‘output_delimiter_string != output_delimiter_default’
instead of ‘output_delimiter_specified’.
(cut_file): New arg CUT_STREAM. Caller changed.
(main): Simplify. Coalesce duplicate code. Redo to avoid need
for IF_LINT, or for the static var. No need to xstrdup optarg.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
basenc: simplify -fsanitize=leak pacification
* src/basenc.c (finish_and_exit): New function.
(do_encode, do_decode): Use it. Accept new INFILE arg. Remove
no-longer-needed IF_LINT code. Exit when done. Caller changed.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
tail: simplify -fsanitize=leak pacification
Also, close a no-longer-needed file descriptor when falling
back from inotify.
* src/tail.c (tail_forever_inotify): Return void, not bool. Exit
on fatal error, or on successful completion. Accept an extra
argument pointing to a hash table that the caller should free on
non-fatal error; this simplifies cleanup. Don’t bother setting
errno when returning. Caller changed.
(main): Omit no-longer-needed IF_LINT code. Close inotify
descriptor if inotify fails; this fixes a file descriptor leak and
means we needn’t call inotify_rm_watch. Use main_exit, not return.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
cp: simplify cp/install/ln/mv pacification
* src/copy.c (dest_info_free, src_info_free) [lint]:
Remove. All uses removed.
(copy_internal): Pacify only Clang and Coverity; GCC doesn’t need it.
* src/cp-hash.c (forget_all) [lint]: Remove. All uses removed.
* src/cp.c, src/install.c, src/ln.c, src/mv.c (main):
Use main_exit, not return.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
sort: pacify -fsanitizer=leak
* src/sort.c (pipe_fork, keycompare, sort, main):
Remove lint code that no longer seems to be needed.
(sort): Unconditionally compile ifdef lint code that is needed
to free storage even when not linting.
(main): Use main_exit, not return.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
ptx: pacify -fsanitizer=leak
* src/ptx.c (unescape_string): Rename from copy_unescaped_string,
and unescape the string in place. Callers changed. This way,
we needn’t allocate storage and thus needn’t worry about
-fsanitizer=leak.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
tsort: pacify -fsanitizer=leak
* src/tsort.c (struct item.balance): Now signed char to save space.
(struct item.printed): New member.
(new_item): Initialize k->printed to false. Simplify via xzalloc.
(scan_zeros): Use k->printed rather than nulling out string.
(tsort): Move exiting code here ...
(main): ... from here.
(tsort) [lint]: Omit no-longer-needed code. Instead, set head->printed.
Paul Eggert [Mon, 31 Jan 2022 16:42:07 +0000 (08:42 -0800)]
expr: lint cleanup, and introducing main_exit
This introduces a new macro main_exit, which is useful
for pacifying gcc -fsanitizer=lint and in some cases
means we can remove some ‘IF_LINT’ and ‘ifdef lint’ code.
* src/expr.c (main): Use main_exit, not return.
(docolon): Omit an IF_LINT that GCC no longer needs.
* src/system.h (main_exit): New macro.
Pádraig Brady [Sun, 30 Jan 2022 20:19:48 +0000 (20:19 +0000)]
cksum: use more exact selection of digest algorithms
Use more constrained argument matching
to improve forward compatibility and robustness.
For example it's better that `cksum -a sha3` is _not_
equivalent to `cksum -a sha386`, so that a user
specifying `-a sha3` on an older cksum would not be surprised.
Also argmatch() is used when parsing tags from lines like:
SHA3 (filename) = abcedf....
so it's more robust that older cksum instances to fail
earlier in the parsing process, when parsing output from
possible future cksum implementations that might support SHA3.
* src/digest.c (algorithm_from_tag): Use argmatch_exact()
to ensure we don't match abbreviated algorithms.
(main): Likewise.
* tests/misc/cksum-a.sh: Add a test case.
Paul Eggert [Sat, 29 Jan 2022 19:40:17 +0000 (11:40 -0800)]
mv: when installing to dir use dir-relative names
When the destination for mv is a directory, use functions like openat
to access the destination files, when such functions are available.
This should be more efficient and should avoid some race conditions.
Likewise for 'install'.
* src/cp.c (must_be_working_directory, target_directory_operand)
(target_dirfd_valid): Move from here ...
* src/system.h: ... to here, so that install and mv can use them.
Make them inline so GCC doesn’t complain.
* src/install.c (lchown) [HAVE_LCHOWN]: Remove; no longer needed.
(need_copy, copy_file, change_attributes, change_timestamps)
(install_file_in_file, install_file_in_dir):
New args for directory-relative names. All uses changed.
Continue to pass full names as needed, for diagnostics and for
lower-level functions that do not support directory-relative names.
(install_file_in_dir): Update *TARGET_DIRFD as needed.
(main): Handle target-directory in the new, cp-like way.
* src/mv.c (remove_trailing_slashes): Remove static var; now local.
(do_move): New args for directory-relative names. All uses changed.
Continue to pass full names as needed, for diagnostics and for
lower-level functions that do not support directory-relative names.
(movefile): Remove; no longer needed.
(main): Handle target-directory in the new, cp-like way.
* tests/install/basic-1.sh:
* tests/mv/diag.sh: Adjust to match new diagnostic wording.
Paul Eggert [Fri, 28 Jan 2022 08:01:07 +0000 (00:01 -0800)]
dd: synchronize output after write errors
Problem reported by Sworddragon (Bug#51345).
* src/dd.c (cleanup): Synchronize output unless dd has been interrupted.
(synchronize_output): New function, split out from dd_copy.
Update conversions_mask so synchronization is done at most once.
(main): Do not die with the output file open, since we want to be
able to synchronize it before exiting. Synchronize output before
exiting.
Paul Eggert [Fri, 28 Jan 2022 02:34:09 +0000 (18:34 -0800)]
dd: output final progress before syncing
Problem reported by Sworddragon (Bug#51482).
* src/dd.c (reported_w_bytes): New var.
(print_xfer_stats): Set it.
(dd_copy): Print a final progress report if useful before
synchronizing output data.
Paul Eggert [Thu, 27 Jan 2022 20:06:21 +0000 (12:06 -0800)]
csplit: improve integer overflow checking
* src/csplit.c: Prefer signed integers to unsigned for sizes
when either will do. Check for some unlikely overflows.
(INCR_SIZE): Remove; no longer used.
(free_buffer): Also free the arg, simplifying callers.
(get_new_buffer): Use xpalloc instead of computing new
size by hand. Add ATTRIBUTE_DEALLOC.
(delete_all_files, close_output_file):
If unlink fails with ENOENT, treat it as success.
(close_output_file): If unlink fails, decrement count anyway.
(parse_repeat_count, parse_patterns): Check for int overflow.
(check_format_conv_type): Use signed format.
Paul Eggert [Thu, 27 Jan 2022 20:06:21 +0000 (12:06 -0800)]
maint: simplify memory alignment
Use the new Gnulib modules alignalloc and xalignalloc
to simplify some memory allocation.
Also, fix some unlikely integer overflow problems.
* bootstrap.conf (gnulib_modules): Add alignalloc, xalignalloc.
* src/cat.c, src/copy.c, src/dd.c, src/shred.c, src/split.c:
Include alignalloc.h.
* src/cat.c (main):
* src/copy.c (copy_reg):
* src/dd.c (alloc_ibuf, alloc_obuf):
* src/shred.c (dopass):
* src/split.c (main):
Use alignalloc/xalignalloc/alignfree instead of doing page
alignment by hand.
* src/cat.c (main):
Check for integer overflow in page size calculations.
* src/dd.c (INPUT_BLOCK_SLOP, OUTPUT_BLOCK_SLOP, MAX_BLOCKSIZE):
(real_ibuf, real_obuf) [lint]:
Remove; no longer needed.
(cleanup) [lint]:
(scanargs): Simplify.
* src/ioblksize.h (io_blksize): Do not allow blocksizes largest
than the largest power of two that fits in idx_t and size_t.
* src/shred.c (PAGE_ALIGN_SLOP, PATTERNBUF_SIZE): Remove.
Paul Eggert [Sun, 23 Jan 2022 19:24:35 +0000 (11:24 -0800)]
copy: remove unnecessary ‘free’
* src/copy.c (copy_reg): Remove a ‘free’ call that does nothing
because its argument is always a null pointer, starting with
2007-11-1608:31:15Z!jim@meyering.net.
Paul Eggert [Wed, 19 Jan 2022 18:51:25 +0000 (10:51 -0800)]
dd: simplify conv=swab code
Simplify byte-swapping, so that the code no longer needs to
allocate a page before the input buffer.
* src/dd.c (SWAB_ALIGN_OFFSET, char_is_saved, saved_char): Remove.
All uses removed.
(INPUT_BLOCK_SLOP): Simplify to just page_size.
(alloc_ibuf, dd_copy): Adjust to new swab_buffer API.
(swab_buffer): New arg SAVED_BYTE, taking the place of the old
global variables. Do not access BUF[-1].
Paul Eggert [Tue, 18 Jan 2022 21:22:02 +0000 (13:22 -0800)]
dd: improve integer overflow checking
* src/dd.c: Prefer signed to unsigned types where either will do,
as this helps improve checking with gcc -fsanitize=undefined.
Limit the signed types to their intended ranges.
(MAX_BLOCKSIZE): Don’t exceed IDX_MAX - slop either.
(input_offset_overflow): Remove; overflow now denoted by negative.
(parse_integer): Return INTMAX_MAX on overflow, instead of unspecified.
Do not falsely report overflow for ‘00x99999999999999999999999999999’.
* tests/dd/misc.sh: New test for 00xBIG.
* tests/dd/skip-seek-past-file.sh: Adjust to new diagnostic wording.
New test for BIGxBIG.
Paul Eggert [Mon, 17 Jan 2022 03:56:17 +0000 (19:56 -0800)]
shred: fix declaration typo
* gl/lib/randint.h (randint_all_new):
Do not declare with _GL_ATTRIBUTE_NONNULL (), as
the arg can be a null pointer. This fixes a typo added in
2021-11-01T05:30:28Z!eggert@cs.ucla.edu.