Paul Eggert [Fri, 2 Aug 2024 16:32:11 +0000 (09:32 -0700)]
Use xalignalloc
It ports around issues that our handwritten code does not.
* gnulib.modules: Add xalignalloc.
* src/misc.c (ptr_align, page_aligned_alloc): Remove.
All page_aligned_alloc callers changed to use xalignalloc.
Paul Eggert [Fri, 2 Aug 2024 16:07:06 +0000 (09:07 -0700)]
Make stripped_prefix_len signed
This is part of the general guideline that signed integer types
are safer.
* src/names.c (stripped_prefix_len): Return ptrdiff_t,
not size_t. All callers changed.
Paul Eggert [Fri, 2 Aug 2024 07:27:40 +0000 (00:27 -0700)]
Don’t assume mode_t fits in unsigned long
* src/system.c (oct_to_env): Don’t assume mode_t fits in unsigned
long. Do not output excess leading 1 bits. When the mode is
zero, generate "0" rather than "00". Use sprintf instead of
snprintf, since the output won’t be truncated; in general we don’t
use snprintf unless we want output to be truncated and truncation
is typically not GNU style.
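The oct_to_env points above can be illustrated with a minimal sketch. This is not the code in src/system.c: the helper name mode_to_octal and the buffer-size constant are hypothetical, and the mask assumes mode_t has no padding bits.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Room for a leading "0", every octal digit of a uintmax_t,
   and the terminating null byte.  (Hypothetical constant.)  */
enum { MODE_OCTAL_BUFSIZE = 1 + (sizeof (uintmax_t) * CHAR_BIT + 2) / 3 + 1 };

/* Hypothetical helper: store M in BUF as octal with a leading "0".  */
static void
mode_to_octal (char buf[MODE_OCTAL_BUFSIZE], mode_t m)
{
  /* Go through uintmax_t rather than unsigned long, so that a wide
     mode_t is not silently truncated.  */
  uintmax_t um = m;

  /* If mode_t is signed and narrower than uintmax_t, a negative value
     sign-extends; mask off the excess leading 1 bits.  This assumes
     mode_t has no padding bits.  */
  if ((mode_t) -1 < 0 && sizeof m < sizeof um)
    um &= ~(UINTMAX_MAX << (sizeof m * CHAR_BIT));

  /* "%#jo" prints zero as "0", not "00".  sprintf cannot overrun here
     because BUF is sized for the widest possible result.  */
  sprintf (buf, "%#jo", um);
}
```

Sizing the buffer from the type width is what makes plain sprintf acceptable in this sketch; no truncation can occur, so snprintf would add nothing.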
Paul Eggert [Fri, 2 Aug 2024 02:31:50 +0000 (19:31 -0700)]
Prefer C99 formats like %jd to doing it by hand
It’s now safe to assume support for C99 formats like %jd, so remove
some of the longwinded formatting code put in only to be portable to
pre-C99 platforms.
* gnulib.modules: Add intprops.
* src/buffer.c (format_total_stats, try_new_volume)
(write_volume_label):
* src/checkpoint.c (format_checkpoint_string):
* src/compare.c (verify_volume):
* src/create.c (to_chars_subst, dump_regular_file):
* src/incremen.c (read_num):
* src/list.c (read_and, from_header, simple_print_header)
(print_for_mkdir):
* src/sparse.c (sparse_dump_region):
* src/system.c (dec_to_env, sys_exec_info_script)
(sys_exec_checkpoint_script):
* src/xheader.c (out_of_range_header):
Prefer C99 formats like %jd and %ju to STRINGIFY_BIGINT.
* src/common.h: Sort includes.
Include intprops.h, verify.h. All other includes of verify.h
removed.
(intmax, uintmax): New functions and macros.
(STRINGIFY_BIGINT): Remove; no longer used.
(TIMESPEC_STRSIZE_BOUND): Make it 1 byte bigger, for negatives.
* src/create.c (MAX_VAL_WITH_DIGITS, to_base256):
Use *_WIDTH macros rather than assuming no padding bits.
Prefer UINTMAX_MAX to (uintmax_t) -1.
* src/list.c (tartime): Use strftime result rather
than running strlen later.
* src/misc.c (timetostr): New function. Prefer it when
printing time_t values.
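As a rough illustration of the %jd approach described above (not the code in any of the listed files): convert the value to intmax_t and let printf do the work. The helper name format_offset is hypothetical, and snprintf is used here only to keep the sketch self-contained.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: format OFFSET as decimal into BUF, which has
   room for BUFSIZE bytes.  Return what snprintf returns, i.e. the
   number of bytes the full result needs, not counting the null byte.  */
static int
format_offset (char *buf, size_t bufsize, off_t offset)
{
  /* intmax_t is at least as wide as any other signed integer type,
     so this conversion never loses information.  */
  intmax_t n = offset;
  return snprintf (buf, bufsize, "%jd", n);
}
```

The same pattern works for unsigned types with uintmax_t and %ju.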
Paul Eggert [Thu, 1 Aug 2024 17:02:06 +0000 (10:02 -0700)]
Fix unlikely problems with time overflow
Also, fix some rounding errors while we’re in the neighborhood.
* src/buffer.c (duration_ns, compute_duration_ns): Rename from
‘duration’ and ‘compute_duration’, and count ns rather than s, to
lessen rounding error. All uses changed.
(compute_duration_ns): Work even if the clock moves backward
and time_t is unsigned.
(print_stats): Don’t worry about null or empty TEXT, as that
cannot happen. Compare double to UINTMAX_MAX + 1.0, not
to UINTMAX_MAX, so that the comparison is exact.
Handle the unlikely case that numbytes >= UINTMAX_MAX.
* src/tar.c (parse_opt): Treat -L hugenumber as effectively
infinity rather than erroring out.
Prefer ckd_add to checking overflow by hand.
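The print_stats note above about comparing a double to UINTMAX_MAX + 1.0 can be sketched as follows. This is an illustration, not the code in src/buffer.c; the helper name is hypothetical, and the explanation assumes a 64-bit uintmax_t and IEEE binary64 double. On such hosts UINTMAX_MAX (2**64 - 1) is not representable as a double, whereas 2**64 is, so the bound used in the comparison is an exact value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return true if X lies in the half-open range
   [0, UINTMAX_MAX + 1), so that converting X to uintmax_t is safe.
   NaN fails the first comparison and so yields false.  */
static bool
fits_in_uintmax (double x)
{
  /* UINTMAX_MAX + 1.0 is a power of two and thus exact in binary
     floating point; UINTMAX_MAX alone would be rounded when it is
     converted to double for the comparison.  */
  return 0 <= x && x < UINTMAX_MAX + 1.0;
}
```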
Paul Eggert [Thu, 1 Aug 2024 14:15:01 +0000 (07:15 -0700)]
ptrdiff_t, not ssize_t
* src/buffer.c (bufmap_reset, _flush_write):
Use ptrdiff_t, not ssize_t, to record pointer differences.
POSIX allows systems where size_t is 64 bits but ssize_t is only 32;
Ultrix used to do that, though no current systems do.
Paul Eggert [Wed, 31 Jul 2024 00:59:04 +0000 (17:59 -0700)]
Prefer stdckdint.h to intprops.h
Problem reported by Collin Funk in:
https://lists.gnu.org/r/bug-tar/2024-07/msg00000.html
though this patch is more general than Collin’s suggestion.
* src/compare.c (diff_multivol):
* src/delete.c (move_archive):
* src/sparse.c (oldgnu_add_sparse, pax_decode_header):
* src/system.c (mtioseek):
Prefer ckd_add and ckd_mul to the intprops.h equivalents,
since stdckdint.h is now standard.
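A minimal sketch of the ckd_add and ckd_mul style follows. It is not taken from the files listed above: the helper archive_bytes and its parameters are hypothetical, and the fallback macros assume a GCC- or Clang-compatible compiler when stdckdint.h is unavailable (tar itself gets the header via Gnulib).

```c
#include <stdbool.h>
#include <stddef.h>

#if defined __has_include
# if __has_include(<stdckdint.h>)
#  include <stdckdint.h>
# endif
#endif
#ifndef ckd_add
/* Fallback for compilers lacking stdckdint.h; assumes GCC/Clang builtins.  */
# define ckd_add(r, a, b) __builtin_add_overflow (a, b, r)
# define ckd_mul(r, a, b) __builtin_mul_overflow (a, b, r)
#endif

/* Hypothetical helper: compute NRECORDS * RECORD_SIZE + HEADER_SIZE.
   Store the result in *TOTAL and return true, or return false if the
   mathematically correct result does not fit in size_t.  */
static bool
archive_bytes (size_t *total, size_t nrecords, size_t record_size,
               size_t header_size)
{
  size_t payload;
  /* Each ckd_* macro returns true if the operation overflowed.  */
  if (ckd_mul (&payload, nrecords, record_size))
    return false;
  if (ckd_add (total, payload, header_size))
    return false;
  return true;
}
```

Compared with hand-written checks such as "a <= SIZE_MAX - b", the macros work the same way for any mix of integer types and leave less room for off-by-one mistakes.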
Paul Eggert [Wed, 31 Jul 2024 00:47:10 +0000 (17:47 -0700)]
Simplify read_header overflow checking
* src/list.c (read_header): Use ckd_add instead of
doing overflow checking by hand. Although the old code
was correct on all practical hosts, the new code is simpler
and works even on weird hosts where SIZE_MAX <= INT_MAX.
Paul Eggert [Tue, 30 Jul 2024 23:19:35 +0000 (16:19 -0700)]
maint: use static_assert
* gnulib.modules: Add assert-h, for static_assert.
* src/common.h, src/list.c, src/misc.c:
Prefer static_assert to #if + #error. This doesn’t fix any bugs; it’s
just that in general it’s better to avoid the preprocessor.
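For illustration only, here is the kind of rewrite this entry describes. The union and constant below are hypothetical stand-ins, not the definitions in src/common.h.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Old style, evaluated by the preprocessor:
     #if UINTMAX_MAX < ULONG_MAX
     # error "uintmax_t is narrower than unsigned long"
     #endif
   New style, evaluated by the compiler proper:  */
static_assert (ULONG_MAX <= UINTMAX_MAX,
               "uintmax_t is narrower than unsigned long");

/* static_assert can also check things the preprocessor cannot see,
   such as the size of a type.  (Hypothetical demo type.)  */
enum { DEMO_BLOCKSIZE = 512 };
union demo_block
{
  char buffer[DEMO_BLOCKSIZE];
  struct { char name[100]; char mode[8]; } header;
};
static_assert (sizeof (union demo_block) == DEMO_BLOCKSIZE,
               "demo_block must be exactly one block");
```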
Paul Eggert [Tue, 30 Jul 2024 15:35:59 +0000 (08:35 -0700)]
Fix tests/ckmtime.c arithmetic
* tests/ckmtime.c (main): Don’t assume time_t is signed.
Avoid integer overflows (quite possible if time_t is 32 bit).
Do calculations precisely, without any rounding errors.
Paul Eggert [Tue, 30 Jul 2024 03:56:27 +0000 (20:56 -0700)]
xsparse cleanup, including integer overflow
* scripts/xsparse.c: Include inttypes.h, for strtoimax.
Don’t include stdint.h, since inttypes.h includes it.
Sort include directives.
Make all extern functions and vars static, except for ‘main’.
(string_to_off): Use strtoimax instead of doing overflow
checking by hand, incorrectly (it relied on undefined behavior).
(string_to_size): New arg MAXSIZE. All callers changed.
(get_var): Return bool not int. Fix unlikely integer overflow.
Use strncmp instead of memcmp, to avoid unlikely pointer overflow.
(read_xheader, read_map, main): Avoid unlikely integer overflow.
Check for I/O errors more consistently.
(main): Prefer bool to int, and put vars near use.
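The string_to_off change above can be sketched roughly like this. It is not the code in scripts/xsparse.c: the helper name parse_off and its interface are hypothetical, and the range bound assumes off_t has no padding bits.

```c
#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper: parse S as a nonnegative decimal off_t.
   Store the value in *RESULT and return true on success.  Return
   false if S has no digits, has trailing junk, is negative, or is
   out of range for off_t.  */
static bool
parse_off (char const *s, off_t *result)
{
  /* Largest off_t value, assuming off_t has no padding bits.  */
  intmax_t off_max
    = (intmax_t) ((UINTMAX_C (1) << (sizeof (off_t) * CHAR_BIT - 1)) - 1);

  char *end;
  errno = 0;
  /* strtoimax detects overflow itself and reports it via ERANGE, so
     no signed arithmetic that might overflow is done by hand.  */
  intmax_t n = strtoimax (s, &end, 10);
  if (end == s || *end != '\0' || errno == ERANGE)
    return false;
  if (n < 0 || off_max < n)
    return false;

  *result = n;
  return true;
}
```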
Paul Eggert [Mon, 29 Jul 2024 22:36:47 +0000 (15:36 -0700)]
maint: fix some unlikely wordsplit overflows
* gnulib.modules: Add reallocarray.
* lib/wordsplit.c: Include stdckdint.h.
(ISDELIM, expvar, isglob, scan_word):
Defend against strchr (s, 0) always succeeding.
(alloc_space, wsplit_assign_vars):
Fix some unlikely integer overflows, partly by using reallocarray.
(alloc_space): Avoid quadratic worst-case behavior.
(isglob): Return bool, not int. Accept size_t, not int.
(to_num): Remove; no longer used.
(xtonum): Clarify the code a bit. Rely on portable
conversion to unsigned char rather than problematic pointer cast.
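The strchr pitfall mentioned above: strchr (s, 0) returns a pointer to the terminating null byte of S, so a naive "is this byte in the set?" test reports that the null byte is always a member. A minimal sketch of the defense, using a hypothetical helper rather than the actual ISDELIM macro:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: return true if C is one of the bytes in the
   string DELIMS.  The explicit null check is needed because
   strchr (DELIMS, '\0') never returns NULL.  */
static bool
is_delim (char const *delims, int c)
{
  return c != '\0' && strchr (delims, c) != NULL;
}
```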
Paul Eggert [Sun, 28 Jul 2024 20:50:37 +0000 (13:50 -0700)]
Add some gnulib.modules
* gnulib.modules: Add errno, limits-h, safe-read, sys_stat.
Not sure about the relationship between gnulib.modules
and paxutils/gnulib.modules, but anyway tar itself uses
these so we should depend on them. (Perhaps it would be
better if there was just one Gnulib module list for tar;
that would be less confusing.)
Paul Eggert [Sat, 27 Jul 2024 07:05:49 +0000 (00:05 -0700)]
Pacify gcc 14 -Wanalyzer-infinite-loop
* gnulib.modules: Add stddef, for ‘unreachable’.
* src/compare.c (dumpdir_cmp): Tell GCC that the default case
is unreachable. Make just one pass through the string,
instead of two passes (one via strcmp, another via strlen).
Paul Eggert [Sat, 27 Jul 2024 06:26:24 +0000 (23:26 -0700)]
maint: remove GLOBAL as per GCC 14
* src/common.h (GLOBAL): Remove this macro, and all its uses.
It collides with GCC 14 and -Wmissing-variable-declarations.
Change all uses of GLOBAL to use extern instead,
and declare the variables in their respective .c files.
Move .c file’s extern declarations here, so that they
appear only once and are checked against definitions.
* src/names.c (unconsumed_option_tail): Now static.
Paul Eggert [Sat, 27 Jul 2024 04:55:31 +0000 (21:55 -0700)]
Modernize use of Gnulib, paxutils
* configure.ac: Omit stuff no longer needed now that Gnulib or
paxlib does it, or the code no longer needs the configure-time checks.
Do not use AC_SYS_LARGEFILE (Gnulib largefile does this) or check
for fcntl.h, memory.h, net/errno.h, sgtty.h, string.h,
sys/param.h, sys/device.h, sys/gentape.h, sys/inet.h,
sys/io/trioctl.h, sys/time.h, sys/tprintf.h, sys/tape.h, unistd.h,
locale.h, netdb.h; these are all now standard, or old ways of getting
at magtapes are no longer needed and we now have only sys/mtio.h.
Do not check for lstat, readlink, symlink, and check only for
waitpid’s existence rather than attempting to replace it.
Do not check for decls of getgrgid, getpwuid, or time.
Check just once for iconv.h.
* gnulib.modules: Add largefile.
* lib/.gitignore, lib/Makefile.am (noinst_HEADERS, libtar_a_SOURCES):
Remove system-ioctl.h, which is no longer in paxlib.
All includes now changed to just check HAVE_SYS_MTIO_H directly.
* lib/wordsplit.c (wordsplit_c_escape_tab, wordsplit_errstr)
(wordsplit_nerrs):
Now static or an enum, and without any leading "_" in the name.
* src/buffer.c (record_start, record_end, current_block, records_read):
* src/delete.c (records_skipped): Add extern decl to pacify GCC.
* src/compare.c, src/create.c, src/extract.c: Omit uses of
HAVE_READLINK and HAVE_SYMLINK since we now let Gnulib deal with
platforms lacking readlinkat and symlinkat.
* src/system.c: Use "#if !HAVE_WAITPID" instead of "#if MSDOS".
Paul Eggert [Wed, 24 Jul 2024 16:45:46 +0000 (09:45 -0700)]
maint: higher-precision checkpoint timestamps
* src/checkpoint.c (format_checkpoint_string):
Use current_timespec to get nanosecond resolution.
This also frees us from the necessity of including <sys/time.h>
to use gettimeofday, which is removed in POSIX.1-2024.
* src/extract.c (make_directories): Restore second argument. This
reverts the change made in 79d1ac38c1.
(maybe_recoverable, rename_directory): Update calls to make_directories.
* tests/extrac27.at: New file.
* tests/Makefile.am: Add new test.
* tests/testsuite.at: Likewise.
Paul Eggert [Sun, 3 Mar 2024 21:27:32 +0000 (13:27 -0800)]
tar: fix current_block confusion
Problem reported by Robert Morris in:
https://lists.gnu.org/r/bug-tar/2024-03/msg00001.html
* src/delete.c (flush_file): Simply return at EOF,
so that current_block continues to point to end of input.
Paul Eggert [Sun, 3 Mar 2024 21:17:32 +0000 (13:17 -0800)]
tar: improve diagnostic for truncated archive
* src/buffer.c (seek_archive): If EOF has been read, don’t attempt
to seek past it. This replaces a bogus "rmtlseek not stopped at a
record boundary" message with a better "Unexpected EOF in archive"
when I run ‘tar tvf gtar13c.tar’ using the gtar13.tar file here:
https://lists.gnu.org/r/bug-tar/2024-03/msg00001.html
When given -c -a, issue a warning if no compressor is associated with the suffix.
* src/suffix.c (find_compression_suffix): Always return stripped
archive name length in the last argument. Return 0 if there is no
suffix.
(find_compression_program): Remove.
(set_compression_program_by_suffix): Take third argument, controlling
whether to issue a warning if no suitable compression program is found
for the suffix.
* src/common.h (set_compression_program_by_suffix): Change prototype.
* src/buffer.c, src/tar.c: All uses of set_compression_program_by_suffix
changed.
Paul Eggert [Tue, 2 Jan 2024 03:09:59 +0000 (19:09 -0800)]
Skip test on macOS 12.6
* tests/xform04.at: Skip test on macOS 12.6, which is behind the times
and doesn’t think that ⱥ (U+2C65 LATIN SMALL LETTER A WITH STROKE) is
printable.
Paul Eggert [Wed, 13 Sep 2023 04:21:18 +0000 (23:21 -0500)]
Support multi-byte --transform='...\L...' etc
Support upcasing and downcasing in multi-byte locales.
* gnulib.modules: Add c32rtomb, c32tolower, c32toupper,
mbrtoc32-regular.
* src/transform.c: Do not include ctype.h. Include mcel.h.
(stk, stk_init): Move up.
(run_case_conv): Return void, not char *. Append result to
stk directly; this avoids the need for a separate allocation.
All callers changed. Do not assume a single-byte locale.
* tests/xform04.at: New test.
* tests/Makefile.am (TESTSUITE_AT):
* tests/testsuite.at: Add it.
Paul Eggert [Tue, 12 Sep 2023 05:15:52 +0000 (00:15 -0500)]
Parse in a more locale-independent way
update submodules to latest
* gnulib.modules: Add c-ctype.
* lib/wordsplit.c, src/buffer.c, src/exclist.c, src/incremen.c:
* src/list.c, src/misc.c, src/names.c, src/sparse.c, src/tar.c:
* src/xheader.c:
Include c-ctype.h, and use its API rather than ctype.h’s.
This is more likely to work when oddball locales are used.
* src/transform.c: Include ctype.h, since this module still uses
tolower and toupper (this is probably wrong - should be multi-byte).
Paul Eggert [Mon, 11 Sep 2023 06:17:02 +0000 (01:17 -0500)]
Fix pointer bug in drop_volume_label_suffix
Problem reported by Marc Espie in:
https://lists.gnu.org/r/bug-tar/2023-09/msg00003.html
* src/buffer.c (drop_volume_label_suffix):
Redo to not compute a pointer before the start of a buffer,
as this is not portable.
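To illustrate the portability issue: C permits a pointer to any element of an array or to one past its end, but merely computing a pointer that precedes the first element is undefined behavior. One way to scan backward safely is to work with a length and test it before indexing. The helper below is a hypothetical example, not the code in drop_volume_label_suffix.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the length of S once any trailing
   decimal digits are dropped.  */
static size_t
length_without_trailing_digits (char const *s)
{
  size_t len = strlen (s);
  /* Check 0 < len before reading s[len - 1]; the loop therefore never
     forms a pointer or index that precedes the start of S.  */
  while (0 < len && '0' <= s[len - 1] && s[len - 1] <= '9')
    len--;
  return len;
}
```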
Paul Eggert [Sun, 10 Sep 2023 17:10:52 +0000 (10:10 -0700)]
Prefer mcel to mbuiter
Prefer the lighter-weight mcel implementation to the heavier-weight
mbuiter that GNU tar does not need.
* bootstrap.conf (avoided_gnulib_modules): Avoid mbuiter, mbuiterf.
* gnulib.modules: Add mcel-prefer.
Paul Eggert [Mon, 21 Aug 2023 20:40:37 +0000 (13:40 -0700)]
Simplify recently-added hash code
* src/extract.c (delay_set_stat): Simplify hash lookup;
no need to initialize members other than file_name.
Avoid assignment in ‘if’ when it’s easy.
(extract_finish): Do not bother to free when we are about to exit.
delay_set_stat avoids inserting duplicate entries into
delayed_set_stat_head. It was doing this by scanning the entire
list.
Normally this list is small, but if --delay-directory-restore is
used (including automatically for incremental archives), this list
grows with the total number of directories in the archive.
The entire scan takes O(n) time. Extracting an archive with n
directories could therefore take O(n^2) time.
The included test uses AT_SKIP_LARGE_FILES, allowing it to optionally be
skipped. It may execute slowly on certain filesystems or disks, as it
creates thousands of directories.
There are still potentially problematic O(n) scans in
find_direct_ancestor and remove_delayed_set_stat, which this patch does
not attempt to fix.
* NEWS: Update.
* src/extract.c (delayed_set_stat_table): Create a table for O(1)
lookups of entries in the delayed_set_stat_head list. The list
remains, as tracking insertion order is important.
(dl_hash, dl_compare): New hash table helper functions.
(delay_set_stat): Create the hash table, replace the O(n) list scan
with a hash_lookup, insert new entries into the hash table.
(remove_delayed_set_stat): Also remove entry from hash table.
(apply_nonancestor_delayed_set_stat): Also remove entry from hash
table.
(extract_finish): Free the (empty) hash table.
* tests/extrac26.at: New file.
* tests/Makefile.am (TESTSUITE_AT): Include extrac26.at.
* tests/testsuite.at: Include extrac26.at.
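The design described above, a hash table for duplicate detection alongside a list that preserves order, can be sketched as follows. This is a simplified illustration, not the code in src/extract.c: every name here is hypothetical, the list is appended at the tail for simplicity, and the fixed-size chained table stands in for a real growable hash table such as Gnulib's.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum { NBUCKETS = 1024 };   /* hypothetical fixed table size */

struct entry
{
  struct entry *next;          /* next entry in insertion order */
  struct entry *bucket_next;   /* next entry in the same hash bucket */
  char *file_name;
};

static struct entry *list_head, *list_tail;
static struct entry *buckets[NBUCKETS];

/* Simple multiplicative string hash, reduced to a bucket index.  */
static size_t
hash_name (char const *name)
{
  size_t h = 5381;
  for (; *name; name++)
    h = h * 33 + (unsigned char) *name;
  return h % NBUCKETS;
}

/* Record NAME unless it is already present.  Return true if a new
   entry was added.  The duplicate check walks only one bucket rather
   than the whole list.  */
static bool
remember_name (char const *name)
{
  size_t h = hash_name (name);
  for (struct entry *e = buckets[h]; e; e = e->bucket_next)
    if (strcmp (e->file_name, name) == 0)
      return false;

  struct entry *e = malloc (sizeof *e);
  char *copy = malloc (strlen (name) + 1);
  if (!e || !copy)
    abort ();
  strcpy (copy, name);
  e->file_name = copy;

  /* Link into the hash bucket for fast lookup ...  */
  e->bucket_next = buckets[h];
  buckets[h] = e;

  /* ... and into the list, which keeps insertion order.  */
  e->next = NULL;
  if (list_tail)
    list_tail->next = e;
  else
    list_head = e;
  list_tail = e;
  return true;
}
```

With a table that grows as entries are added, each lookup takes expected constant time, so recording n directories costs roughly O(n) overall instead of O(n^2).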
Commit e89c7a45eb broke deletion from archives. The reported number
of bytes read is rounded to the nearest record anyway, so revert the
commit and document the fact.
Reported by Ed Santiago. See
https://bugzilla.redhat.com/show_bug.cgi?id=2230127
* doc/tar.texi: Document the fact that --totals rounds up the
number of bytes read to the nearest record.
* src/buffer.c: Revert changes.
* tests/delete06.at: Fix expected status code and stderr.
Paul Eggert [Wed, 2 Aug 2023 15:41:12 +0000 (08:41 -0700)]
Stop using alloca
* gnulib.modules: Remove alloca.
* src/create.c (dump_file0): Return address of any allocated
storage. Caller changed to free it. Use xmalloc instead
of alloca, to obtain this storage.
* src/list.c (from_header): Use quote_mem instead of quote,
removing the need to use alloca.
Paul Eggert [Tue, 25 Jul 2023 16:43:16 +0000 (09:43 -0700)]
Improve reproducibility recipe
* doc/tar.texi (Reproducibility): Improve index.
Improve and add comments to recipe. In the recipe,
don’t worry about file names beginning with ‘-’ for simplicity;
don’t use touch -c as it exits with status 0 even when it
does not work; and set directory timestamps too.
Paul Eggert [Wed, 19 Jul 2023 22:48:25 +0000 (15:48 -0700)]
tests: fix LDADD
Problem reported by Christian Weisgerber <naddy@mips.inka.de> in:
https://lists.gnu.org/r/bug-tar/2023-07/msg00015.html
* tests/Makefile.am (LDADD): Add $(LIBINTL), $(LIBICONV).
Paul Eggert [Tue, 18 Jul 2023 16:15:03 +0000 (09:15 -0700)]
tests: fix TESTSUITE_AT
Problem reported by Lukas Javorsky <ljavorsk@redhat.com> in:
https://lists.gnu.org/r/bug-tar/2023-07/msg00002.html
* tests/Makefile.am (TESTSUITE_AT): Add exclude17.at, exclude18.at.
Remove compress.m4; all uses changed. Add a comment saying how
to rederive this. Sort.
* src/common.h (name): New field: is_wildcard.
(name_scan): Change prototype.
* src/delete.c: Update calls to name_scan.
* src/names.c (addname, add_starting_file): Initialize is_wildcard.
(namelist_match): Take two arguments. If the second one is true,
return only exact matches.
(name_scan): Likewise. All callers updated.
(name_from_list): Skip patterns.
* src/update.c (remove_exact_name): New function.
(update_archive): Do not remove a matching name if it is a pattern.
Instead, add a new entry with the matching file name.
* tests/update04.at: New test.
* tests/Makefile.am: Add new test.
* tests/testsuite.at: Include new test.
* doc/tar.1: Add missing dots, use plural when necessary,
tweak some wording. Remove an incorrect observation, three times.
Add some missing articles, correct some formatting,
and expand the opaque descriptions of two options.
* doc/tar.texi: Drop a stray `cd` command from an example.
Correct two cross references, correct the paragraph
about the manpage, and unbreak a URL.
* src/names.c: Correct and shorten an error message: "non-optional"
means "mandatory", but "non-option" is what was meant. And the
phrase "in archive create or update mode" was both unneeded and
incomplete.
* tests/positional01.at: Change expected error text.
* tests/positional02.at: Likewise.
* tests/positional03.at: Likewise.
Paul Eggert [Sun, 25 Jun 2023 19:54:20 +0000 (12:54 -0700)]
tar: extract delayed links in order
Extract delayed links in tar file order, rather than
in hash table order with modifications.
This is simpler and more likely to use the kernel’s
cached filesystem data, assuming related delayed links
are nearby in the tar file.
* src/extract.c (struct delayed_link.has_predecessor):
Remove. All uses removed.
(delayed_link_head, delayed_link_tail): New static vars.
This resurrects delayed_link_head’s old function
except that the linked list is now in forward order, not reverse.
(find_delayed_link_source): Now simply returns bool,
since the callers no longer need the pointer.
(create_placeholder_file):
Put the delayed link at the end of the linked list.
Omit no-longer-needed last arg. All callers changed.
(apply_delayed_links): Simplify now that we can just iterate
through the delayed_link_head list.
Paul Eggert [Sun, 25 Jun 2023 20:54:14 +0000 (13:54 -0700)]
tar: make safe for -Wunused-parameter
This also ports to C23 [[maybe_unused]].
* configure.ac (WARN_CFLAGS): Do not add -Wno-unused-parameter.
Add MAYBE_UNUSED where needed in source code.
Also, put it at the front where C23 requires it.
* src/extract.c (create_placeholder_file): Use FLEXNSIZEOF (overlooked
by c542d3d0c8).
(apply_delayed_links): Don't follow the "next" chain after its entries
have been applied.