Paul Eggert [Wed, 28 Jul 2021 18:26:48 +0000 (11:26 -0700)]
doc: improve ls documentation
* doc/coreutils.texi (ls invocation): Document implementation more
closely. Be more consistent about style. Omit some needless words.
* src/ls.c (usage): Don’t overdocument -f, as the details were wrong.
Omit -1 advice as it’s a bit obsolete now that we have --zero and
is a bit much for --usage output anyway.
Paul Eggert [Wed, 28 Jul 2021 00:34:43 +0000 (17:34 -0700)]
ls: rename --null to --zero (Bug#49716)
* NEWS, doc/coreutils.texi (General output formatting):
* src/ls.c (usage):
Document this.
* src/ls.c (ZERO_OPTION): Rename from NULL_OPTION.
All uses changed.
(long_options): Rename --null to --zero.
(dired_dump_obstack, main, print_dir): Use '\n' instead of
eolbyte where eolbyte must equal '\n'.
(decode_switches): Decode --zero instead of --null.
--zero also implies -1, -N, --color=none, --show-control-chars.
Use easier-to-decipher code to set ‘format’ and ‘dired’.
Reject attempts to combine --dired and --zero.
* tests/local.mk: Adjust to test script renaming.
* tests/ls/zero-option.sh: Rename from tests/ls/null-option.sh,
and test --zero instead of --null.
Paul Eggert [Tue, 27 Jul 2021 21:27:00 +0000 (14:27 -0700)]
ls: compute defaults more lazily
* src/ls.c (enum time_type, enum sort_type, enum indicator_style)
(enum Dereference_symlink, ignore_mode):
Put ‘= 0’ after default values, since the code relies
on static storage defaulting to zero.
(enum sort_type): Reorder so that -1 can be used to represent unset.
(main): Test print_with_color after parse_ls_color may have reset it.
(decode_line_length): Return the line length instead of setting
static storage. All uses changed. Treat line lengths exceeding
PTRDIFF_MAX as infinite, to avoid pointer-subtraction glitches.
(stdout_isatty): New function, to avoid calling isatty twice.
(decode_switches): Calculate defaults more lazily, to avoid using
syscalls or getenv during startup unless the results are more
likely to be needed. Use -1 to indicate options that haven’t been
set on the command line yet. Move print_with_color test from
here to ‘main’. Suppress bogus GCC warning.
(getenv_quoting_style): Return the quoting style instead of
setting static storage.
(init_column_info): New arg MAX_COLS, to avoid recalculating it.
Caller changed.
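As a rough illustration of the lazy-default idea described above, the sketch below caches the result of isatty behind a -1 "not yet determined" sentinel, so the syscall runs at most once and only if some code path asks for it. The probe_calls counter is a hypothetical addition used only to make the laziness observable; this is not the exact code in src/ls.c.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical counter: how many times the syscall was really made.  */
static int probe_calls;

/* Return true if standard output is a terminal.  The answer is
   computed on first use and cached; -1 means "not determined yet".  */
static bool
stdout_isatty (void)
{
  static signed char out_tty = -1;
  if (out_tty < 0)
    {
      probe_calls++;
      out_tty = isatty (STDOUT_FILENO) != 0;
    }
  return out_tty;
}
```

Calling stdout_isatty twice leaves probe_calls at 1, which is the point of the change: callers no longer need to coordinate who pays for the syscall.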
Paul Eggert [Mon, 26 Jul 2021 07:26:32 +0000 (00:26 -0700)]
ls: port to wider off_t, uid_t, gid_t
* src/ls.c (dired_pos): Now off_t, not size_t, since it counts
output file offsets.
(dired_dump_obstack): This obstack's file offsets are now
off_t, not size_t.
(format_user_or_group, format_user_or_group_width):
ID arg is now uintmax_t, not unsigned long, since uid_t and
gid_t values might exceed ULONG_MAX.
(format_user_or_group_width): Use snprintf with NULL instead of
sprintf with a discarded buffer. This avoids a stack buffer,
and so should be safer.
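A minimal sketch of the snprintf-with-NULL technique mentioned above. The helper name id_width is hypothetical and stands in for the numeric branch of format_user_or_group_width; the real function also handles user and group names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: return the number of bytes needed to print ID
   in decimal, without using any scratch buffer.  */
static int
id_width (uintmax_t id)
{
  /* With a null buffer and size 0, snprintf writes nothing and
     returns the length the output would have had.  */
  return snprintf (NULL, 0, "%ju", id);
}
```

Because the argument is uintmax_t, an ID wider than unsigned long (for example 4294967296 on a platform with 32-bit long) is still measured correctly.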
Paul Eggert [Mon, 26 Jul 2021 04:01:31 +0000 (21:01 -0700)]
ls: demacroize
Prefer functions or constants to macros where either will do.
That’s cleaner, and nowadays there’s no performance reason to
prefer macros. All uses changed.
* src/ls.c (INITIAL_TABLE_SIZE, MIN_COLUMN_WIDTH):
Now constants instead of macros.
(file_or_link_mode): New function, replacing the old macro
FILE_OR_LINK_MODE.
(dired_outbyte): New function, replacing the old macro DIRED_PUTCHAR.
(dired_outbuf): New function, replacing the old macro DIRED_FPUTS.
(dired_outstring): New function, replacing the old macro
DIRED_FPUTS_LITERAL.
(dired_indent): New function, replacing the old macro DIRED_INDENT.
(push_current_dired_pos): New function, replacing the old macro
PUSH_CURRENT_DIRED_POS.
(assert_matching_dev_ino): New function, replacing the old macro
ASSERT_MATCHING_DEV_INO.
(do_stat, do_lstat, stat_for_mode, stat_for_ino, fstat_for_ino)
(signal_init, signal_restore, cmp_ctime, cmp_mtime, cmp_atime)
(cmp_btime, cmp_size, cmp_name, cmp_extension)
(fileinfo_name_width, cmp_width, cmp_version):
No longer inline; compilers can deduce this well enough nowadays.
(main): Protect unused assert with ‘if (false)’ rather than
commenting it out, so that the compiler checks the code.
(print_dir): Output the space and newline in the same buffer
as the human-readable number they surround.
(dirfirst_check): New function, replacing the old macro
DIRFIRST_CHECK. Simplify by using subtraction.
(off_cmp): New function, replacing the old macro longdiff.
(print_long_format): No need to null-terminate the string now.
(format_user_or_group): Let printf count the bytes.
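To make the macro-to-function change concrete, here is a hedged sketch of two of the helpers in the style the entry describes. struct entry and cmp_size_dirs_first are simplified, hypothetical stand-ins for ls's struct fileinfo and its comparison functions; only the name off_cmp comes from the log above.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Three-way comparison of file offsets; returns -1, 0, or 1.
   Subtracting the offsets themselves could overflow int.  */
static int
off_cmp (off_t a, off_t b)
{
  return (a > b) - (a < b);
}

/* Simplified, hypothetical stand-in for struct fileinfo.  */
struct entry
{
  bool is_dir;
  off_t size;
};

/* Sort directories first, then larger files first.  The
   directories-first test is a subtraction of two booleans.  */
static int
cmp_size_dirs_first (struct entry const *a, struct entry const *b)
{
  int diff = b->is_dir - a->is_dir;  /* negative if only A is a directory */
  return diff ? diff : off_cmp (b->size, a->size);
}
```

Unlike a macro, each function evaluates its arguments exactly once and is type-checked by the compiler.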
Paul Eggert [Mon, 26 Jul 2021 01:54:10 +0000 (18:54 -0700)]
ls: simplify sprintf usage
* src/ls.c (format_user_or_group_width, print_long_format):
Use return value from sprintf instead of calling strlen on
the resulting buffer, or inferring the length some other way.
Kamil Dudka [Wed, 30 Jun 2021 15:53:22 +0000 (17:53 +0200)]
df: fix duplicated remote entries due to bind mounts
As originally reported in <https://bugzilla.redhat.com/1962515>,
df invoked without -a printed duplicated entries for NFS mounts
of bind mounts. This is a regression from commit v8.25-54-g1c17f61ef99,
which introduced the use of a hash table.
The proposed patch makes sure that the devlist entry seen the last time
is used for comparison when eliminating duplicated mount entries. This
is how it worked before the hash table was introduced.
Patch co-authored by Roberto Bergantinos.
* src/df.c (struct devlist): Introduce the seen_last pointer.
(devlist_for_dev): Return the devlist entry seen the last time if found.
(filter_mount_list): Remember the devlist entry seen the last time for
each hashed item.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/49298
Paul Eggert [Sun, 27 Jun 2021 01:23:52 +0000 (18:23 -0700)]
tail: use poll, not select
This fixes an unlikely stack out-of-bounds write reported by
Stepan Broz via Kamil Dudka (Bug#49209).
* bootstrap.conf (gnulib_modules): Replace select with poll.
* src/tail.c: Do not include <sys/select.h>.
[!_AIX]: Include poll.h.
(check_output_alive) [!_AIX]: Use poll instead of select.
(tail_forever_inotify): Likewise. Simplify logic, as there is no
need for a ‘while (len <= evbuf_off)’ loop.
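A minimal sketch of checking whether an output descriptor has gone away using poll, in the spirit of check_output_alive. The function name output_is_dead is hypothetical. One likely motivation for the switch, stated here as an assumption, is that select's fd_set cannot safely hold descriptors numbered FD_SETSIZE or higher, whereas poll takes the descriptor as a plain integer.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical helper: return true if FD is reported as broken,
   e.g. the write end of a pipe whose reader has exited.  */
static bool
output_is_dead (int fd)
{
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = 0;   /* POLLERR and POLLHUP are reported regardless */
  pfd.revents = 0;
  /* A zero timeout makes this a non-blocking probe.  */
  return poll (&pfd, 1, 0) >= 0 && (pfd.revents & (POLLERR | POLLHUP)) != 0;
}
```

On GNU/Linux, the write end of a pipe is reported with POLLERR once the read end is closed; other systems may report this differently.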
Pádraig Brady [Sun, 20 Jun 2021 20:26:21 +0000 (21:26 +0100)]
stat: use decomposed decimal device numbers by default
* src/stat.c (default_format): Use decomposed decimal
representation (major,minor) in the default format.
This is least ambiguous for human interpretation,
and more consistent with ls for example.
Fixes https://bugs.gnu.org/48960
Pádraig Brady [Sun, 20 Jun 2021 14:16:49 +0000 (15:16 +0100)]
stat: support more device number representations
In preparation for changing the default device number
representation (to decomposed decimal), provide more
formatting options for device numbers.
These new (FreeBSD compat) formatting options are added:
%Hd major device number in decimal (st_dev)
%Ld minor device number in decimal (st_dev)
%Hr major device type in decimal (st_rdev)
%Lr minor device type in decimal (st_rdev)
%r (composed) device type in decimal (st_rdev)
%R (composed) device type in hex (st_rdev)
* doc/coreutils.texi (stat invocation): Document new formats.
* src/stat.c (print_it): Handle the new %H and %L modifiers.
(print_statfs): Adjust to passing the format as two chars
rather than an int. Using an int was introduced in commit db42ae78,
but using separate chars is cleaner and more extensible.
(print_stat): Likewise. Handle any modifiers and the new 'r' format.
(usage): Document the new formats.
* tests/misc/stat-fmt.sh: Add a test case for new modifiers.
Addresses https://bugs.gnu.org/48960
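The decomposed representation can be sketched as follows. format_dev is a hypothetical helper, and the major, minor, and makedev macros are assumed to come from glibc's <sys/sysmacros.h>; other systems declare them elsewhere.

```c
#include <stdio.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helper: write DEV as "major,minor" in decimal into
   BUF, returning the length snprintf reports.  */
static int
format_dev (char *buf, size_t size, dev_t dev)
{
  return snprintf (buf, size, "%u,%u",
                   (unsigned int) major (dev), (unsigned int) minor (dev));
}
```

For a device built with makedev (8, 1) this yields the string "8,1", which corresponds to combining the %Hd and %Ld formats listed above.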
Paul Eggert [Sat, 12 Jun 2021 00:41:37 +0000 (17:41 -0700)]
build: update gnulib submodule to latest
Coreutils mistakenly did not list xstrndup as a module
that it depends on directly. When the latest Gnulib removed
the dirname module's dependency on xstrndup, this mistake
caused coreutils to not build. Since all of Coreutils's
uses of xstrndup know the string length, xmemdup0 is a better
match for what's needed. Since the size args are typically
signed or derived from subtracting pointers, the new Gnulib
ximemdup0 function is a better match yet.
So, use ximemdup0 instead of xstrndup.
* src/cut.c, src/dircolors.c, src/expand-common.c, src/expand.c:
* src/numfmt.c, src/set-fields.c, src/unexpand.c:
Do not include xstrndup.h; no longer needed.
* src/dircolors.c (parse_line):
* src/expand-common.c (parse_tab_stops):
* src/numfmt.c (parse_format_string):
* src/set-fields.c (set_fields):
Use ximemdup0 instead of xstrndup.
Pádraig Brady [Sat, 15 May 2021 11:40:45 +0000 (12:40 +0100)]
copy: remove fiemap logic
This is now only used on 10 year old linux kernels,
and performs a sync before each copy.
* src/copy.c (extent_copy): Remove function and all callers.
* src/extent-scan.c: Remove.
* src/extent-scan.h: Remove.
* src/fiemap.h: Remove.
* src/local.mk: Adjust for removed files.
* NEWS: Adjust to say fiemap is removed.
Pádraig Brady [Wed, 12 May 2021 22:47:38 +0000 (23:47 +0100)]
copy: disallow copy_file_range() on Linux kernels before 5.3
copy_file_range() before Linux kernel release 5.3 had many issues,
as described at https://lwn.net/Articles/789527/, which was
referenced from https://lwn.net/Articles/846403/; a more general
article discussing the generality of copy_file_range().
Linux kernel 5.3 was released in September 2019, which is new enough
that we need to actively avoid older kernels.
* src/copy.c (functional_copy_file_range): A new function
that returns false for Linux kernels before version 5.3.
(sparse_copy): Call this new function to gate use of
copy_file_range().
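One way such a version gate could look is sketched below. Both function names are hypothetical, and the parsing is deliberately simple; it is not the code of functional_copy_file_range. As an additional simplification, this sketch applies the check to whatever uname reports, while the real function is described as only rejecting old Linux kernels.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Return true if RELEASE (e.g. "5.10.0-8-amd64") denotes version
   WANT_MAJOR.WANT_MINOR or later.  Unparsable strings yield false.  */
static bool
release_at_least (char const *release, int want_major, int want_minor)
{
  int major, minor;
  if (sscanf (release, "%d.%d", &major, &minor) != 2)
    return false;
  return major > want_major || (major == want_major && minor >= want_minor);
}

/* Cache the answer so uname is called at most once.  */
static bool
kernel_has_usable_copy_file_range (void)
{
  static signed char cached = -1;   /* -1 means "not checked yet" */
  if (cached < 0)
    {
      struct utsname u;
      cached = uname (&u) == 0 && release_at_least (u.release, 5, 3);
    }
  return cached;
}
```

A caller would consult kernel_has_usable_copy_file_range before attempting copy_file_range and otherwise go straight to a read/write loop.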
Pádraig Brady [Sun, 9 May 2021 22:41:00 +0000 (23:41 +0100)]
tests: fix tests/cp/sparse-2.sh false failure on some systems
* tests/cp/sparse-2.sh: Double check cp --sparse=always,
with dd conv=sparse, in the case where the former didn't
create a sparse file. Now that this test is being newly run
on macos, we're seeing a failure due to seek() not creating
holes on apfs unless the size is >= 16MiB.
Pádraig Brady [Sun, 9 May 2021 13:29:01 +0000 (14:29 +0100)]
tests: ensure we test SEEK_DATA where used
fiemap is no longer the default copy implementation,
so check for SEEK_DATA support instead as that's preferred.
This will ensure better test coverage on systems without fiemap.
* init.cfg: Replace fiemap_capable_ with seek_data_capable_.
This is best supported with python 3 so prefer that.
* tests/seek-data-capable: A new test script checking for
SEEK_DATA support on the passed file name,
called from seek_data_capable_.
* tests/fiemap-capable: Remove no longer used probing script.
* tests/cp/fiemap-perf.sh: Renamed to tests/cp/sparse-perf.sh
* tests/cp/fiemap-2.sh: Renamed to tests/cp/sparse-2.sh
* tests/cp/fiemap-extents.sh: Renamed to tests/cp/sparse-extents.sh
* tests/cp/sparse-fiemap.sh: Renamed to tests/cp/sparse-extents-2.sh
* tests/cp/fiemap-FMR.sh: Renamed to tests/cp/copy-FMR.sh
* tests/local.mk: Reference the renamed tests.
Pádraig Brady [Sat, 8 May 2021 18:23:20 +0000 (19:23 +0100)]
copy: handle system security config issues with copy_file_range()
* src/copy.c (sparse_copy): Upon EPERM from copy_file_range(),
fall back to a standard copy, which will give a more accurate
error as to whether the issue is with the source or destination.
Also this will avoid the issue where seccomp or apparmor are
not configured to handle copy_file_range(), in which case
the fall back standard copy would succeed without issue.
This specific issue with seccomp was noticed for example in:
https://github.com/golang/go/issues/40900
Pádraig Brady [Sun, 9 May 2021 20:55:22 +0000 (21:55 +0100)]
copy: handle EOPNOTSUPP from SEEK_DATA
* src/copy.c (infer_scantype): Ensure we don't error out
if SEEK_DATA returns EOPNOTSUPP, on systems where this value
is distinct from ENOTSUP. Generally both of these should be checked.
Pádraig Brady [Sat, 8 May 2021 16:18:54 +0000 (17:18 +0100)]
copy: handle ENOTSUP from copy_file_range()
* src/copy.c (sparse_copy): Ensure we fall back to
a standard copy if copy_file_range() returns ENOTSUP.
This generally is best checked when checking ENOSYS,
but it also seems to be a practical concern on Centos 7,
as a quick search gave https://bugzilla.redhat.com/1840284
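The errno handling discussed in the entries above can be summarized in a small predicate. The function name is hypothetical, and the set of codes is only what these log entries mention (ENOSYS, ENOTSUP, EOPNOTSUPP, EPERM); the actual logic in src/copy.c covers more cases.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical predicate: should a failed copy_file_range call be
   retried with a plain read/write copy?  An if-chain is used instead
   of a switch because ENOTSUP and EOPNOTSUPP have the same value on
   some systems, which would produce duplicate case labels.  */
static bool
should_fall_back_to_standard_copy (int err)
{
  return (err == ENOSYS          /* syscall not implemented */
          || err == ENOTSUP      /* not supported for these files */
          || err == EOPNOTSUPP   /* distinct from ENOTSUP on some systems */
          || err == EPERM);      /* e.g. blocked by seccomp or apparmor */
}
```

Errors outside this set, such as ENOSPC, would still be diagnosed as genuine failures.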
Carl Edquist [Mon, 10 May 2021 10:22:11 +0000 (05:22 -0500)]
build: fix __get_cpuid_count check to catch link failure
The test program will compile successfully even if __get_cpuid_count
is not declared. The error for the missing symbol will only show up
at link time. Thus, use AC_LINK_IFELSE instead of AC_COMPILE_IFELSE.
* configure.ac (__get_cpuid_count check): Use AC_LINK_IFELSE instead
of AC_COMPILE_IFELSE.
(__get_cpuid check): Likewise.
Pádraig Brady [Mon, 3 May 2021 17:11:04 +0000 (18:11 +0100)]
maint: consistently free hash structures in dev mode
Ensure we call hash_free() to avoid valgrind and leak_sanitizer
"definitely lost" warnings. These were not real leaks as
we terminate immediately after, but we should avoid these
"definitely lost" warnings where possible.
* src/copy.c: Add dest_info_free() and src_info_free().
* src/copy.h: Declare the above.
* src/cp-hash.c: Don't define unless "lint" is defined.
* src/install.c: Call dest_info_free() in dev mode.
* src/mv.c: Likewise.
* src/cp.c: Likewise. Also call src_info_free().
* src/ln.c: Call hash_free() in dev mode.
* src/tail.c: Call hash_free() even if about to exit, in dev mode.
Pádraig Brady [Sun, 2 May 2021 20:27:17 +0000 (21:27 +0100)]
copy: ensure we enforce --reflink=never
* src/copy.c (sparse_copy): Don't use copy_file_range()
with --reflink=never as copy_file_range() may implicitly
use acceleration techniques like reflinking.
(extent_copy): Pass through whether we allow reflinking.
(lseek_copy): Likewise.
Fixes https://bugs.gnu.org/48164
Pádraig Brady [Sat, 1 May 2021 19:02:02 +0000 (20:02 +0100)]
wc: add --debug to diagnose which implementation used
* src/wc.c: (main): Handle the new --debug option.
Only call avx2_supported if needed.
(avx2_supported): Diagnose various failures and attempts.
* NEWS: Mention the new wc improvement and --debug option.
wc: use avx2 optimization when counting only lines
Use cpuid to detect CPU support for avx2 instructions.
Performance was seen to improve by 5x for a file with only newlines,
while the performance for a file with no such characters is unchanged.
* configure.ac [USE_AVX2_WC_LINECOUNT]: A new conditional,
set when __get_cpuid_count() and avx2 compiler intrinsics are supported.
* src/wc.c (avx2_supported): A new function using __get_cpuid_count()
to determine if avx2 instructions are supported.
(wc_lines): A new function refactored from wc(),
which implements the standard line counting logic,
and provides the fallback implementation for when avx2 is not supported.
* src/wc_avx2.c: A new module to implement using avx2 intrinsics.
* src/local.mk: Reference the new module. Note we build as a separate
lib so that it can be portably built with separate -mavx2 etc. flags.
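For orientation, a portable line counter of the kind the fallback path provides might look like the sketch below. count_newlines is a hypothetical name and this is not the code of wc_lines; the avx2 variant is not shown because it depends on compiler intrinsics and CPU support.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the newline bytes in BUF[0..LEN-1].  */
static size_t
count_newlines (char const *buf, size_t len)
{
  size_t lines = 0;
  char const *p = buf;
  char const *end = buf + len;

  /* memchr jumps from one newline to the next; it returns NULL once
     the remaining range holds none (including when it is empty).  */
  while ((p = memchr (p, '\n', end - p)) != NULL)
    {
      lines++;
      p++;
    }
  return lines;
}
```

A buffer that consists only of newlines is the worst case for this approach, since every memchr call returns immediately; that is the case where the log reports the largest gain from avx2.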
Paul Eggert [Sat, 1 May 2021 22:19:16 +0000 (15:19 -0700)]
touch: fix wrong diagnostic (Bug#48106)
Problem reported by Roland (Bug#48106).
* src/touch.c (touch): Take more care when deciding whether
to use open_errno or utime_errno in the diagnostic.
Stop worrying about SunOS 4 (which was part of the problem),
as it’s long obsolete. For Solaris 10, verify that EINVAL
really means the file was a directory.
Paul Eggert [Tue, 27 Apr 2021 06:27:59 +0000 (23:27 -0700)]
maint: port to Autoconf 2.71
* configure.ac: Use AC_PROG_CC, not AC_PROG_CC_STDC.
* gl/modules/smack (configure.ac):
* m4/jm-macros.m4 (coreutils_MACROS):
* m4/xattr.m4 (gl_FUNC_XATTR):
Use AS_HELP_STRING, not AC_HELP_STRING.
* m4/check-decl.m4 (gl_CHECK_DECLS):
Do not require AC_HEADER_TIME; we no longer care about it directly.
* m4/jm-macros.m4 (coreutils_MACROS):
Do not require AC_ISC_POSIX, which became obsolete in 2006.
Use AC_LINK_IFELSE instead of AC_TRY_LINK.
Paul Eggert [Tue, 27 Apr 2021 01:02:16 +0000 (18:02 -0700)]
build: update gnulib submodule to latest
* src/csplit.c (load_buffer):
* src/pinky.c (create_fullname):
Use intprops-based checks rather than xalloc_oversized,
since Gnulib xalloc.h no longer includes xalloc-oversized.h.
Zorro Lang [Mon, 26 Apr 2021 15:25:18 +0000 (17:25 +0200)]
copy: do not refuse to copy a swap file
* src/copy.c (sparse_copy): Fall back to read() if copy_file_range()
fails with ETXTBSY. Otherwise it would be impossible to copy files
that are being used as swap. This used to work before introducing
the support for copy_file_range() in coreutils. (Bug#48036)
doc: clarify that ln --relative requires --symbolic to be specified
* doc/coreutils.texi (ln invocation): State --symbolic is required.
* src/ln.c (usage): Explicitly state -s is not implied.
Fixes https://bugs.gnu.org/47703
* src/wc.c (usage): State that only printable characters are considered
when counting words. This also disambiguates whether we're talking
about bytes or characters in this context.
* doc/coreutils.texi (wc invocation): Likewise. Also clarify
that --characters counts valid locale aware characters,
and that --lines does not count a trailing "line" unless
it ends with a newline character.
Fixes https://bugs.gnu.org/47702
This is especially important now for --sort=width,
as that can greatly increase how often this
expensive quote_name_width() function is called per file.
This also helps the default invocation of ls,
or specifically the --format={across,vertical} cases
(when --width is not set to 0),
to avoid two calls to this function per file.
Note the only case where we later compute the width,
is for --format=commas. That's only done once though,
so we leave the computation close to use to
maximize hardware caching.
* src/ls.c (struct fileinfo): Add a WIDTH member to cache
the screen width of the file name.
(update_current_files_info): Set the WIDTH members for cases
they're needed multiple times. Note we do this explicitly here,
rather than caching at use, so that the fileinfo
structures can remain const in the sorting and presentation functions.
(sort_files): Call the new update_current_files_info() in this
initialization function.
(fileinfo_name_width): Renamed from fileinfo_width,
and adjusted to return the cached value if available.
Carl Edquist [Fri, 26 Mar 2021 09:27:54 +0000 (04:27 -0500)]
ls: add --sort=width option to sort by file name width
This helps identify the outliers for long filenames, and also produces
a more compact display of columns when listing a directory with many
entries of various widths.
* src/ls.c (sort_type, sort_types, sort_width): New sort_width sort
type.
(sort_args): Add "width" sort arg.
(cmp_width, fileinfo_width): New sort function and helper for file name
width.
(quote_name_width): Add function prototype declaration.
(usage): Document --sort=width option.
* doc/coreutils.texi: Document --sort=width option.
* tests/ls/sort-width-option.sh: New test for --sort=width option.
* tests/local.mk: Reference new test.
* NEWS: Mention the new feature.
Paul Eggert [Tue, 30 Mar 2021 04:42:44 +0000 (21:42 -0700)]
env: simplify --split-string memory management
* bootstrap.conf (gnulib_modules): Add idx.
* src/env.c: Include idx.h, minmax.h.
Prefer idx_t to ptrdiff_t when values are nonnegative.
(valid_escape_sequence, escape_char, validate_split_str)
(CHECK_START_NEW_ARG):
Remove; no longer needed now that we validate as we go.
(struct splitbuf): New type.
(splitbuf_grow, splitbuf_append_byte, check_start_new_arg)
(splitbuf_finishup): New functions.
(build_argv): New arg ARGC. Validate and process in one go, using
the new functions; this is simpler and more reliable than the old
approach (as witness the recent bug). Avoid integer overflow in
the unlikely case where the string contains more than INT_MAX
arguments.
(parse_split_string): Simplify by exploiting the new build_argv.
Paul Eggert [Fri, 26 Mar 2021 21:06:26 +0000 (14:06 -0700)]
env: remove asserts
The assertions didn’t help catch the most recent bug which
was in their area, and kind of get in the way.
* src/env.c: Do not include <assert.h>, and remove all assertions.
These seem to have been put in to pacify gcov, but surely there’s
a better way.
(escape_char): Pacify GCC with 'assume' instead.
Paul Eggert [Fri, 26 Mar 2021 20:49:49 +0000 (13:49 -0700)]
env: fix address violation with '\v' in -S
Problem reported by Frank Busse (Bug#47412).
* src/env.c (C_ISSPACE_CHARS): New macro.
(shortopts, build_argv, main): Treat all C-locale space
characters like space and tab, for compatibility with FreeBSD.
(validate_split_str, build_argv, parse_split_string):
Use the C locale, not the current locale, to determine whether a
byte is a space character.
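A locale-independent space test along these lines can be sketched as follows. The macro name C_SPACE_CHARS and the function is_c_space are hypothetical; the character set is the one isspace accepts in the C locale.

```c
#include <stdbool.h>
#include <string.h>

/* The six bytes that isspace accepts in the C locale.  */
#define C_SPACE_CHARS " \t\n\v\f\r"

/* Hypothetical helper: true if C is a C-locale space character,
   whatever the current locale is.  */
static bool
is_c_space (char c)
{
  /* strchr also matches the terminating NUL, so exclude it first.  */
  return c != '\0' && strchr (C_SPACE_CHARS, c) != NULL;
}
```

With this test, '\v' is treated as a separator in the same way as space and tab, and the result does not change when the program runs under a different locale.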
Paul Eggert [Wed, 24 Mar 2021 23:43:05 +0000 (16:43 -0700)]
cksum: port recent changes to macOS
* src/cksum.c (cksum_slice8): Fix bug on little-endian
platforms lacking __bswap_32: the SWAP macro evaluates
its argument multiple times, but the macro has a side effect.
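The hazard described above is the usual one with function-like macros: an argument such as *p++ is expanded, and therefore evaluated, once per occurrence in the macro body. A function avoids it, as in this sketch; swap32 is a hypothetical name and not the identifier used in src/cksum.c.

```c
#include <stdint.h>

/* Byte-swap a 32-bit value.  Being a function, it evaluates its
   argument exactly once, so swap32 (*p++) advances P by one element.
   A macro with the same body would expand *p++ four times.  */
static inline uint32_t
swap32 (uint32_t n)
{
  return (n << 24) | ((n & 0xff00) << 8) | ((n >> 8) & 0xff00) | (n >> 24);
}
```

For example, swap32 (0x01020304) yields 0x04030201.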
Paul Eggert [Sun, 21 Mar 2021 21:00:26 +0000 (14:00 -0700)]
ptx: remove use of diacrit module
The diacrit module is obsolete, and ptx’s use of it is obsolete
too; it assumes an 8-bit locale (not that common these days) and
that TeX cannot process the 8-bit characters (nowadays, it can).
* NEWS, doc/coreutils.texi (Charset selection in ptx): Document this.
* bootstrap.conf (gnulib_modules): Remove diacrit.
* src/ptx.c: Do not include diacrit.h.
(print_field, fix_output_parameters): Remove obsolete support
for 8-bit diacritics.
Pádraig Brady [Sat, 13 Mar 2021 18:10:12 +0000 (18:10 +0000)]
cksum: add --debug to diagnose which implementation used
* src/cksum.c: (main): Use getopt_long to parse options,
and handle the new --debug option.
(pclmul_supported): Diagnose various failures and attempts.
* NEWS: Mention the new option.
cksum: use pclmul hardware instruction for CRC32 calculation
Use cpuid to detect CPU support for hardware instruction.
Fall back to slice by 8 algorithm if not supported.
A 500MiB file improves from 1.40s to 0.67s on an i3-2310M
* configure.ac [USE_PCLMUL_CRC32]: A new conditional,
set when __get_cpuid() and clmul compiler intrinsics are supported.
* src/cksum.c (pclmul_supported): A new function using __get_cpuid()
to determine if pclmul instructions are supported.
(cksum): A new function refactored from cksum_slice8(),
which calls pclmul_supported() and then cksum_slice8()
or cksum_pclmul() as appropriate.
* src/cksum.h: Export the crctab array for use in the new module.
* src/cksum_pclmul.c: A new module to implement using pclmul intrinsics.
* po/POTFILES.in: Reference the new cksum_pclmul module.
* src/local.mk: Likewise. Note we build it as a separate library
so that it can be portably built with separate -mavx etc. flags.
* tests/misc/cksum.sh: Add new test modes for pertinent buffer sizes.
Pádraig Brady [Sun, 14 Mar 2021 22:43:42 +0000 (22:43 +0000)]
maint: propagate DEPENDENCIES to libs in single binary mode
* build-aux/gen-single-binary.sh (override_single): A new function
to refactor the existing mappings for dir, vdir, and arch.
This function now also sets the DEPENDENCIES variable so that these
dependencies can be maintained later in the script, where
we now propagate the automake generated $(src_$cmd_DEPENDENCIES)
to our equivalent src_libsinglebin_$cmd_a_DEPENDENCIES.
This will ensure that any required libs are built,
which we require in a following change to cksum that
builds part of it as a separate library.
Pádraig Brady [Tue, 16 Feb 2021 05:07:40 +0000 (05:07 +0000)]
rmdir: diagnose non following of symlinks with trailing slash
GNU/Linux is unusual here in that rmdir("symlink/") returns ENOTDIR,
whereas Solaris and FreeBSD at least, will follow the symlink
and remove the target directory. We don't make the behavior
on Linux kernels consistent, but at least clarify
the confusing error message.
* src/rmdir.c (main): Output a specific error message for the above case.
(remove_parents): In the error message, don't assume intermediate paths
are directories, as they could be symlinks.
* tests/rmdir/symlink-errors.sh: Add a new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the improvement.
Pádraig Brady [Tue, 9 Feb 2021 23:01:34 +0000 (23:01 +0000)]
cat: extend --show-ends to show \r\n as ^M$
- \r\n is a common line end combination
- catting such a file without options causes it to display normally
- overwriting the first char with $, loses info
* src/cat.c (cat): Convert \r preceding a \n to ^M.
* tests/misc/cat-E.sh: New test.
* tests/local.mk: Reference new test.
* tests/misc/cat-proc.sh: Fix typo.
* doc/coreutils.texi (cat invocation): Mention the new behavior.
* NEWS: Mention the improvement.
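The transformation can be sketched on an in-memory buffer as follows. show_ends is a hypothetical helper; unlike cat itself, it does not deal with a \r that falls at the end of one input buffer while its \n arrives in the next.

```c
#include <stddef.h>

/* Hypothetical helper: copy IN[0..LEN-1] to OUT, writing "$" before
   each newline and "^M" in place of a \r that directly precedes a
   newline.  OUT must have room for 2 * LEN bytes.  Returns the number
   of bytes written.  */
static size_t
show_ends (char const *in, size_t len, char *out)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    {
      if (in[i] == '\r' && i + 1 < len && in[i + 1] == '\n')
        {
          out[n++] = '^';
          out[n++] = 'M';
        }
      else if (in[i] == '\n')
        {
          out[n++] = '$';
          out[n++] = '\n';
        }
      else
        out[n++] = in[i];   /* a lone \r is passed through unchanged */
    }
  return n;
}
```

Given the five input bytes "a\r\nb\n", the output is the eight bytes "a^M$\nb$\n".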
Pádraig Brady [Mon, 25 Jan 2021 14:12:48 +0000 (14:12 +0000)]
split: fix --number=K/N to output correct part of file
This functionality regressed with the adjustments
in commit v8.25-4-g62e7af032
* src/split.c (bytes_chunk_extract): Account for already read data
when seeking into the file.
* tests/split/b-chunk.sh: Use the hidden ---io-blksize option,
to test this functionality.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/46048
Paul Eggert [Fri, 15 Jan 2021 10:57:59 +0000 (02:57 -0800)]
mkdir: fix bug when -m's more generous than umask
Problem reported by David McCall (Bug#45886).
I introduced this problem when fixing Bug#14371.
* NEWS: Mention the fix.
* src/mkdir.c (struct mkdir_options): New members umask_ancestor,
umask_self, replacing umask_value.
(make_ancestor): Use them when temporarily adjusting umask.
(main): Set them, and set the umask to umask_self instead
of leaving it alone.
* tests/mkdir/perm.sh (tests): Add test case for bug.
Paul Eggert [Sat, 9 Jan 2021 21:04:40 +0000 (13:04 -0800)]
doc: modernize and fix regexp xref
* doc/coreutils.texi: Fix regexp cross-reference that had become
out-of-date (Bug#45749). Also, fix some obsolete references to
SunOS and to /usr/dict/words, and change “Linux” to “GNU/Linux”
where appropriate. Unfortunately the pipeline example gets more
complicated since /usr/share/dict/words is not sorted the way that
‘comm’ wants.
Pádraig Brady [Fri, 1 Jan 2021 16:36:09 +0000 (16:36 +0000)]
maint: update all copyright year number ranges
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* tests/sample-test: Adjust to use the single most recent year.
A 100MB file improves from 2.50s to 1.80s on a Sparc T5220
A 100MB file improves from 0.54s to 0.13s on an i3-2310M
* bootstrap.conf: Explicitly depend on byteswap,
since now used directly by coreutils.
* src/cksum.c (cksum): Process in multiples of 8 bytes.
(main): Adjust for generation of expanded crctab.
* src/cksum.h: Split now larger crctab to separate header.
* src/local.mk: Reference the new header.
* NEWS: Mention the improvement.
Pádraig Brady [Fri, 18 Dec 2020 14:49:27 +0000 (14:49 +0000)]
doc: add `seq inf` and `sleep inf` examples to texinfo
* doc/coreutils.texi (seq invocation): Mention "inf" is supported,
and describe that it's handled specially to generate infinite
whole integer sequences. Also mention that such infinite generation
is supported for integer steps up to 200.
(sleep invocation): Give `sleep inf` as an example to sleep forever.
* src/seq.c: Add a comment on SEQ_FAST_STEP_LIMIT, to say it's
reflected in the texinfo description.
Paul Eggert [Tue, 15 Dec 2020 19:52:19 +0000 (11:52 -0800)]
doc: document mkdir -m -p better
Chris Colohan wrote that the man page did not do enough to dispel
a common misunderstanding that “contributed to one of the scariest
outages Google has ever seen” (Bug#45258).
* doc/coreutils.texi (mkdir invocation):
* src/mkdir.c (usage): Document -m vs -p better.
nl: fix --section-delimiter handling of single characters
* src/nl.c (main): Enforce the POSIX specified
behavior of assuming ':' is specified after a single
character argument to -d.
* tests/misc/nl.sh: Add a test case.
* NEWS: Mention the bug fix.
Pádraig Brady [Tue, 15 Dec 2020 01:06:50 +0000 (01:06 +0000)]
doc: mention the GNU extensions to nl --section-delimiter
* doc/coreutils.texi (nl invocation): Mention the GNU extensions
of allowing arbitrary length and empty delimiter strings.
* src/nl.c (usage): Likewise.
* tests/misc/nl.sh: Add test cases for the GNU extensions.
Pádraig Brady [Tue, 15 Dec 2020 01:02:32 +0000 (01:02 +0000)]
maint: refactor nl section delimiter handling
* src/nl.c (main): Update the default delimiter characters
when passed two characters with --section-delimiter.
Avoid redundant copies for the body and footer delimiter strings,
and instead, just offset into the header string.
(check_section): Avoid redundant comparing of 2 bytes of memory
for an empty delimiter.
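The "offset into the header string" idea can be sketched as below. In nl, the header delimiter line is the delimiter string three times, the body twice, and the footer once, so the body and footer strings are suffixes of the header string. The struct and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

struct section_delims
{
  char *header;        /* owns the storage: DELIM repeated three times */
  char const *body;    /* points into HEADER: DELIM repeated twice */
  char const *footer;  /* points into HEADER: DELIM once */
};

/* Hypothetical helper: fill D from DELIM.  Returns 0 on success,
   -1 if memory cannot be allocated.  */
static int
init_section_delims (struct section_delims *d, char const *delim)
{
  size_t len = strlen (delim);
  d->header = malloc (3 * len + 1);
  if (!d->header)
    return -1;
  for (int i = 0; i < 3; i++)
    memcpy (d->header + i * len, delim, len);
  d->header[3 * len] = '\0';
  d->body = d->header + len;        /* skip one copy of DELIM */
  d->footer = d->header + 2 * len;  /* skip two copies of DELIM */
  return 0;
}
```

With the default delimiter "\:", header is "\:\:\:", body is "\:\:", and footer is "\:", using a single allocation that is released by freeing header.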
Pádraig Brady [Mon, 30 Nov 2020 19:06:59 +0000 (19:06 +0000)]
date: with --debug, show the output format
The format can be determined from --options or the locale,
so it's useful to output the format string being used.
* src/date.c (show_date): Show the output format
along with the date being shown.
* tests/misc/date-debug.sh: Adjust accordingly.
Addresses https://bugs.gnu.org/44960
Nishant Nayan [Thu, 26 Nov 2020 14:35:17 +0000 (14:35 +0000)]
rm: do not skip files upon failure to remove an empty dir
When removing a directory fails for some reason, and that directory
is empty, the rm_fts code gets the return value of the excise call
confused with the return value of its earlier call to prompt,
causing fts_skip_tree to be called again and the next file
that rm would otherwise have deleted to survive.
* src/remove.c (rm_fts): Ensure we only skip a single fts entry,
when processing empty dirs. I.e. only skip the entry
having successfully removed it.
* tests/rm/empty-immutable-skip.sh: New root-only test.
* tests/local.mk: Add it.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/44883
Paul Eggert [Mon, 23 Nov 2020 09:48:15 +0000 (01:48 -0800)]
install: suppress "Operation not supported" false alarms
At least, I *think* they are false alarms. An SELinux expert eye
would be welcome.
* src/install.c (setdefaultfilecon): If selabel_lookup fails
due to either ENOTSUP or ENODATA, don’t diagnose the issue.
Problem reported by Kamil Dudka in:
https://lists.gnu.org/r/coreutils/2020-11/msg00050.html