Paul Eggert [Fri, 1 Nov 2024 02:53:25 +0000 (19:53 -0700)]
Pacify GCC in info_attach_exclist
* src/exclist.c (info_attach_exclist): Remove unnecessary test
for whether dir and ex are null. GCC complains about the first
one in some cases. Use C99-style decls.
Paul Eggert [Fri, 1 Nov 2024 02:53:25 +0000 (19:53 -0700)]
Simplify checkpoint_action allocation
* src/checkpoint.c: Include <flexmember.h>.
(struct checkpoint_action): New member commandbuf.
(checkpoint_action_tail): Now pointer to pointer,
to simplify updating. All uses changed.
(alloc_action): New arg quoted_string, to lessen number of
separate allocations. All uses changed.
Paul Eggert [Fri, 1 Nov 2024 02:53:25 +0000 (19:53 -0700)]
Fewer uses of size_t in checkpoint.c
* src/checkpoint.c (copy_string_unquote, getarg)
(format_checkpoint_string): Prefer idx_t to size_t.
(copy_string_unquote): Simplify by using ximemdup0.
(getarg): Avoid quadratic reallocation behavior by
using xpalloc.
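
Illustration of the reallocation point above: growing a buffer by a fixed amount on every append copies O(N*N) bytes in total, while growing it geometrically keeps the total linear. The following is a minimal sketch in plain C, not Gnulib's xpalloc; the helper name grow_buffer and the 1.5x growth factor are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not Gnulib's xpalloc): grow BUF, which currently
   holds *CAPACITY bytes, so that it can hold at least NEEDED bytes,
   where NEEDED > 0.  Capacity grows by about 1.5x each time, so
   appending N bytes one at a time costs O(N) total copying rather
   than O(N*N).  Return the new buffer, or NULL (leaving BUF and
   *CAPACITY intact) on allocation failure.  */
static char *
grow_buffer (char *buf, size_t *capacity, size_t needed)
{
  size_t cap = *capacity;
  if (needed <= cap)
    return buf;                 /* Already big enough.  */

  /* Grow geometrically, guarding against size_t overflow.  */
  size_t grown = cap <= SIZE_MAX - cap / 2 ? cap + cap / 2 : SIZE_MAX;
  if (grown < needed)
    grown = needed;

  char *p = realloc (buf, grown);
  if (p)
    *capacity = grown;
  return p;
}
```

A caller would start with a null buffer and a capacity of 0, and call the helper before each append that might not fit.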
Paul Eggert [Fri, 1 Nov 2024 02:53:25 +0000 (19:53 -0700)]
Fewer uses of size_t in buffer.c
* src/buffer.c (flush_write_ptr, flush_bufmap, bufmap_locate)
(struct zip_magic, available_space_after, _flush_write)
(short_read, flush_archive, try_new_volume)
(gnu_add_multi_volume_header, simple_flush_read)
(simple_flush_write, _gnu_flush_read, _gnu_flush_write)
(gnu_flush_write): Prefer idx_t to size_t when either will do, as
signed types are typically safer. For a tiny value in memory,
just use ‘char’.
Don't assume archive read from stdin starts at offset 0
* src/buffer.c (start_offset): New variable.
(get_archive_status): If reading from seekable stdin, store the
position in the stream corresponding to record_start in start_offset.
(seek_archive): Compute current offset relative to start_offset.
Paul Eggert [Mon, 19 Aug 2024 16:42:59 +0000 (09:42 -0700)]
Fewer macros in extract.c
* src/extract.c (ALL_MODE_BITS, RECOVER_NO, RECOVER_OK)
(RECOVER_SKIP): Now constants or inline functions, not macros.
(maybe_recoverable): Return enum recover, not int.
Paul Eggert [Mon, 19 Aug 2024 16:41:37 +0000 (09:41 -0700)]
Fewer macros in create.c
* src/create.c (CACHEDIR_SIGNATURE, CACHEDIR_SIGNATURE_SIZE)
(MAX_VAL_WITH_DIGITS, MAX_OCTAL_VAL): Now constants or
inline functions or removed, instead of macros.
(max_octal_val): Accept size rather than type.
Paul Eggert [Mon, 19 Aug 2024 16:12:52 +0000 (09:12 -0700)]
Fewer macros in common.h
In common.h, replace macros with constants or functions when that
is easy. This makes code a bit more reliable (functions evaluate
their args exactly once) and easier to debug (many debugging
environments cannot access macros).
* src/common.h (CHKBLANKS): Remove. All uses removed.
(NAME_FIELD_SIZE, PREFIX_FIELD_SIZE, UNAME_FIELD_SIZE)
(GNAME_FIELD_SIZE, TAREXIT_SUCCESS, TAREXIT_DIFFERS)
(TAREXIT_FAILURE, LG_8, LG_256, DEFAULT_CHECKPOINT)
(MAX_OLD_FILES, TF_READ, TF_WRITE, TF_DELETED, XFORM_REGFILE)
(XFORM_LINK, XFORM_SYMLINK, XFORM_ALL, WARN_ALONE_ZERO_BLOCK)
(WARN_BAD_DUMPDIR, WARN_CACHEDIR, WARN_CONTIGUOUS_CAST)
(WARN_FILE_CHANGED, WARN_FILE_IGNORED, WARN_FILE_REMOVED)
(WARN_FILE_SHRANK, WARN_FILE_UNCHANGED, WARN_FILENAME_WITH_NULS)
(WARN_IGNORE_ARCHIVE, WARN_IGNORE_NEWER, WARN_NEW_DIRECTORY)
(WARN_RENAME_DIRECTORY, WARN_SYMLINK_CAST, WARN_TIMESTAMP)
(WARN_UNKNOWN_CAST, WARN_UNKNOWN_KEYWORD, WARN_XDEV)
(WARN_DECOMPRESS_PROGRAM, WARN_EXISTING_FILE, WARN_XATTR_WRITE)
(WARN_RECORD_SIZE, WARN_FAILED_READ, WARN_MISSING_ZERO_BLOCKS)
(WARN_VERBOSE_WARNINGS, WARN_ALL, EXCL_DEFAULT, EXCL_RECURSIVE)
(EXCL_NON_RECURSIVE): Now enum constants rather than macros.
(time_option_initialized, isfound, wasfound, warning_enabled):
Now functions rather than macros TIME_OPTION_INITIALIZED, ISFOUND,
WASFOUND, WARNING_ENABLED. All uses changed.
(OLDER_STAT_TIME, OLDER_TAR_STAT_TIME, EXTRACT_OVER_PIPE)
(TAR_ARGS_INITIALIZER): Remove. All uses replaced with their
definiens or equivalent.
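
Illustration of the "functions evaluate their args exactly once" remark above. This is a generic sketch, not code from common.h; SQUARE_MACRO, square and next_value are hypothetical names chosen for the example.

```c
/* Macro version: the argument text is pasted in twice, so any side
   effect in the argument happens twice.  */
#define SQUARE_MACRO(x) ((x) * (x))

/* Function version: the argument is evaluated once, before the call.  */
static inline int
square (int x)
{
  return x * x;
}

/* A function with a visible side effect, to make the difference
   observable: each call bumps CALLS and returns the new count.  */
static int calls;

static int
next_value (void)
{
  return ++calls;
}
```

With calls reset to 0, SQUARE_MACRO (next_value ()) calls next_value twice and yields 1 * 2, whereas square (next_value ()) calls it once and yields 1 * 1. A function is also a symbol that a debugger can call or break on, which a macro is not.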
Paul Eggert [Sun, 18 Aug 2024 18:02:51 +0000 (11:02 -0700)]
Prefer function to COPY_BUF macro
* src/sparse.c (struct ok_n_block_ptr): New type.
(decode_num): Revamp API so that it does the work of both
the old decode_num and the old COPY_BUF. Always read to the
next newline even if there is a lot of junk in between.
(pax_decode_header): Use the new API.
(COPY_BUF): Remove.
Paul Eggert [Sun, 18 Aug 2024 16:44:25 +0000 (09:44 -0700)]
Prefer function to COPY_STRING macro
* src/sparse.c (struct block_ptr):
New type, to allow a functional style.
(dump_str_nl, floorlog10): New static functions.
(COPY_STRING): Remove. All uses replaced by dump_str_nl.
(pax_dump_header_1): Use floorlog10 instead of creating a string.
Simplify size calculation.
Paul Eggert [Tue, 13 Aug 2024 15:35:24 +0000 (08:35 -0700)]
Avoid need for base64_init and extra table
Simplify the code by assuming C99 initializers.
* src/list.c (base_64_digits): Remove.
(base64_map): Now a constant. Each entry is now (old value + 1) % 65,
as that’s the only easy portable way to do it with a static
initializer (even on platforms where CHAR_BIT != 8); all uses changed.
(base64_init): Remove; only use removed.
(from_header): Adjust to new values in base64_map.
* src/list.c (base_64_digits): Remove; no longer needed.
(base64_map): Now const, initialized statically, and with
invalid entries being 0 not 64, and with valid entries
being 1 greater than before.
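
Illustration of the static-initializer approach described above. With C99 designated initializers, every entry that is not named is implicitly 0, so 0 can serve as the "invalid" marker if each valid digit with value V is stored as V + 1; this matches (old value + 1) % 65, since the old invalid marker 64 maps to 0. The sketch below is not tar's actual table: digit_map and base64_value are hypothetical names, and the standard base-64 alphabet is assumed.

```c
#include <limits.h>

/* Illustrative table.  Unnamed entries are implicitly 0 ("invalid");
   a valid digit with value V is stored as V + 1.  */
static char const digit_map[UCHAR_MAX + 1] = {
  ['A'] = 1,  ['B'] = 2,  ['C'] = 3,  ['D'] = 4,  ['E'] = 5,  ['F'] = 6,
  ['G'] = 7,  ['H'] = 8,  ['I'] = 9,  ['J'] = 10, ['K'] = 11, ['L'] = 12,
  ['M'] = 13, ['N'] = 14, ['O'] = 15, ['P'] = 16, ['Q'] = 17, ['R'] = 18,
  ['S'] = 19, ['T'] = 20, ['U'] = 21, ['V'] = 22, ['W'] = 23, ['X'] = 24,
  ['Y'] = 25, ['Z'] = 26,
  ['a'] = 27, ['b'] = 28, ['c'] = 29, ['d'] = 30, ['e'] = 31, ['f'] = 32,
  ['g'] = 33, ['h'] = 34, ['i'] = 35, ['j'] = 36, ['k'] = 37, ['l'] = 38,
  ['m'] = 39, ['n'] = 40, ['o'] = 41, ['p'] = 42, ['q'] = 43, ['r'] = 44,
  ['s'] = 45, ['t'] = 46, ['u'] = 47, ['v'] = 48, ['w'] = 49, ['x'] = 50,
  ['y'] = 51, ['z'] = 52,
  ['0'] = 53, ['1'] = 54, ['2'] = 55, ['3'] = 56, ['4'] = 57, ['5'] = 58,
  ['6'] = 59, ['7'] = 60, ['8'] = 61, ['9'] = 62,
  ['+'] = 63, ['/'] = 64
};

/* Return the base-64 value of C (0..63), or -1 if C is not a digit.  */
static int
base64_value (char c)
{
  return digit_map[(unsigned char) c] - 1;
}
```

Because the table is a compile-time constant, no initialization call is needed before the first lookup.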
Paul Eggert [Tue, 13 Aug 2024 01:17:58 +0000 (18:17 -0700)]
Prefer signed to unsigned when decoding options
* src/tar.c (assert_format, decode_options):
Prefer signed to unsigned integers.
(optloc_save): Prefer enum to unsigned integer.
Simplify allocation.
(decode_options): No need to call ngettext for a value known
to be plenty large.
Paul Eggert [Mon, 12 Aug 2024 23:18:16 +0000 (16:18 -0700)]
Use intmax_t, not size_t, for input line numbers
This works better on platforms where SIZE_MAX < OFF_MAX.
* src/common.h (struct common locus):
* src/names.c (struct name_elt):
Use intmax_t for line numbers. All uses changed.
Paul Eggert [Thu, 8 Aug 2024 23:32:49 +0000 (16:32 -0700)]
Prefer stoint to strtoul and variants
When parsing numbers prefer using strtosysint (renamed stoint)
to using strtoul and its variants.
This is simpler and faster and likely more reliable than
relying on quirks of the system strtoul etc,
and it standardizes how tar deals with parsing integers.
Among other things, the C standard and POSIX don’t specify
what strtol does to errno when conversions cannot be performed,
and they require strtoul to support "-" before unsigned numbers.
* gnulib.modules (strtoimax, strtol, strtoumax, xstrtoimax):
Remove.
* src/checkpoint.c (checkpoint_compile_action, getwidth)
(format_checkpoint_string):
* src/incremen.c (read_incr_db_01, read_num):
* src/map.c (parse_id):
* src/misc.c (decode_timespec):
* src/sparse.c (decode_num):
* src/tar.c (parse_owner_group, parse_opt):
* src/transform.c (parse_transform_expr):
* src/xheader.c (decode_record, decode_signed_num)
(sparse_map_decoder):
Prefer stoint to strtol etc.
Don’t rely on errno == EINVAL as the standards don’t guarantee it.
* src/checkpoint.c (getwidth, format_checkpoint_string):
Check for invalid string suffix.
* src/checkpoint.c (getwidth):
Return intmax_t, not long. All callers changed.
* src/incremen.c (read_directory_file):
It’s just a one-digit number, so just subtract '0'.
* src/map.c (parse_id): Return bool not int. All callers changed.
* src/misc.c (stoint): Rename from strtosysint, and add
a bool * argument for reporting overflow. All callers changed.
(decode_timespec): Simplify by using ckd_sub rather than
checking for overflow by hand.
* src/tar.c (incremental_level): Now signed char to
emphasize that it can be only -1, 0, 1. All uses changed.
* src/xheader.c (decode_record): Avoid giant diagnostics.
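
Illustration of the parsing style this entry argues for: report overflow through an explicit flag instead of errno, and reject what the caller does not want (strtoul, for instance, accepts "-1" and returns a wrapped value). The sketch below is not tar's stoint; parse_decimal and its exact interface are assumptions made for this example, and it handles only decimal input with an optional leading '-'.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch, not tar's stoint.  Parse a decimal integer at
   the start of ARG.  Store the address of the first unparsed byte in
   *ENDP, and set *OVERFLOW if the value does not fit in intmax_t, in
   which case the result is clamped to INTMAX_MIN or INTMAX_MAX.
   If no digits are found, set *ENDP to ARG and return 0.  */
static intmax_t
parse_decimal (char const *arg, char const **endp, bool *overflow)
{
  char const *p = arg;
  bool negative = *p == '-';
  p += negative;

  *overflow = false;
  if (! ('0' <= *p && *p <= '9'))
    {
      *endp = arg;
      return 0;
    }

  /* Accumulate as a non-positive number, so that INTMAX_MIN is
     representable during the loop.  */
  intmax_t n = 0;
  for (; '0' <= *p && *p <= '9'; p++)
    {
      int digit = *p - '0';
      if (n < (INTMAX_MIN + digit) / 10)
        *overflow = true;       /* n * 10 - digit would overflow.  */
      else if (!*overflow)
        n = n * 10 - digit;
    }
  *endp = p;

  if (*overflow)
    return negative ? INTMAX_MIN : INTMAX_MAX;
  if (!negative)
    {
      if (n < -INTMAX_MAX)
        {
          /* The magnitude is INTMAX_MAX + 1, which cannot be negated.  */
          *overflow = true;
          return INTMAX_MAX;
        }
      n = -n;
    }
  return n;
}
```

A caller checks three things after the call: that *endp moved past arg, that the byte at *endp is an acceptable suffix, and that the overflow flag is clear.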
Paul Eggert [Thu, 8 Aug 2024 17:51:39 +0000 (10:51 -0700)]
Handle enormous record sizes better
Formerly the code could misbehave when the user specified a record
size greater than min (INT_MAX * 512 + 511, PTRDIFF_MAX, SSIZE_MAX).
* src/delete.c (new_blocks, delete_archive_members):
* src/system.c (sys_exec_info_script):
* src/tar.c (blocking_factor, record_size):
Don’t limit blocking factor to INT_MAX.
Prefer signed type for record_size.
Do not exceed IDX_MAX or SSIZE_MAX for record_size;
the SSIZE_MAX limit is needed so that ‘read’ and ‘write’
calls behave sensibly.
Paul Eggert [Thu, 8 Aug 2024 00:03:22 +0000 (17:03 -0700)]
Avoid strtoul
This is part of the general trend to prefer signed integer types,
to allow better runtime checking with -fsanitize=undefined etc.
* gnulib.modules: Remove strtoul. Add xstrtoimax.
* src/checkpoint.c (checkpoint, format_checkpoint_string):
* src/system.c (sys_exec_checkpoint_script):
* src/tar.c (checkpoint_option):
Use intmax_t, not unsigned, for checkpoint numbers.
All uses changed.
* src/checkpoint.c (checkpoint_compile_action): Don’t assume
time_t == unsigned long. Treat overflows as TYPE_MAXIMUM (time_t),
essentially infinity.
* src/tar.c (tar_sparse_major, tar_sparse_minor):
* src/tar.h (struct tar_stat_info):
Use intmax_t, not unsigned, for sparse major and minor.
All uses changed.
* src/tar.c (parse_opt):
Don’t mishandle multiple specifications of sparse major and minor.
* src/transform.c (struct transform):
Use idx_t, not unsigned, for match_number. All uses changed.
(parse_transform_expr): Don’t mishandle large match numbers
by wrapping them around.
Paul Eggert [Sun, 4 Aug 2024 08:37:07 +0000 (01:37 -0700)]
Avoid snprintf
* gnulib.modules: Remove snprintf.
* lib/wordsplit.c (wordsplit_pathexpand):
Do not arbitrarily truncate diagnostic.
(wordsplit_c_quote_copy): Rewrite to avoid the need to
invoke snprintf on a temporary buffer.
Paul Eggert [Sun, 4 Aug 2024 07:24:15 +0000 (00:24 -0700)]
Prefer ialloc for wordsplit
* lib/wordsplit.c (alloc_space, wsplt_assign_var, expvar)
(wordsplit_tildexpand, wordsplit_pathexpand)
(wordsplit_get_words): Use ialloc API on idx_t args.
Paul Eggert [Sun, 4 Aug 2024 04:05:42 +0000 (21:05 -0700)]
Omit wordsplit API that tar doesn’t need
* lib/wordsplit.c: Include <attribute.h> here, not in wordsplit.h.
(WRDSO_ESC_SET, WRDSO_ESC_TEST): Move here from wordsplit.h.
(WORDSPLIT_EXTRAS_extern): New macro. Used by functions
that tar doesn’t need to be exposed.
(wordsplit_append, wordsplit_c_quoted_length, wsplt_quote_char)
(wordsplit_c_unquote_char, wordsplit_c_quote_char)
(wordsplit_c_quote_copy, wordsplit_get_words, wordsplit_perror):
Omit unless _WORDSPLIT_EXTRAS.
(WORDSPLIT_ENV_INIT): Move here from wordsplit.h, and
make it a constant rather than a macro.
(wordsplit_strerror): Arg is now pointer to const.
* lib/wordsplit.h: Do not include attribute.h, so that library
users need not worry about attribute.h.
(wordsplit_t): Declare only if _WORDSPLIT_EXTRAS. Similarly for
functions that are not exported to tar.
Paul Eggert [Sat, 3 Aug 2024 21:53:39 +0000 (14:53 -0700)]
More wordsplit int cleanup
* lib/wordsplit.c: Include limits.h.
(_wsplt_subsplit, wordsplit_add_segm, wsnode_quoteremoval)
(wsnode_coalesce, wsnode_tail_coalesce, find_closing_paren)
(expvar, begin_var_p, node_expand, begin_cmd_p, expcmd)
(scan_qstring, scan_word, wordsplit_c_quoted_length)
(wordsplit_string_unquote_copy, wordsplit_c_quote_copy)
(exptab_matches, wordsplit_process_list):
Prefer bool to int.
(wordsplit_init, alloc_space, coalesce_segment)
(wsnode_quoteremoval, wordsplit_finish, wordsplit_append):
Use WRDSE_OK instead of 0 when the context is that of WRDSE_*.
(wsnode_flagstr, coalesce_segment, wsnode_quoteremoval)
(wordsplit_finish, node_split_prefix, wsplt_assign_var, expvar)
(expcmd, wordsplit_tildexpand, wordsplit_pathexpand)
(wsplt_unquote_char, wsplt_quote_char)
(wordsplit_string_unquote_copy):
Prefer '\0' to 0 when it is a char.
(wsnode_insert): Omit last arg, which was always 0.
All callers changed.
(wordsplit_add_segm, node_split_prefix):
Use unsigned, not int, for flag, for consistency.
(wordsplit_finish, begin_var_p, begin_cmd_p, skip_sed_expr)
(xtonum, wsplt_unquote_char, wsplt_quote_char)
(wordsplit_c_unquote_char, wordsplit_c_quote_char)
(wordsplit_c_quote_copy):
Prefer char to int for chars.
(xtonum): Don’t treat "\400" as if it were "\000".
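
Illustration of the "\400" problem named above: three octal digits can encode values up to 0777, so storing the result directly in a byte wraps 0400 around to 0. The sketch below is not wordsplit's xtonum; octal3_to_byte is a hypothetical helper, and rejecting out-of-range input is one possible policy assumed for this example.

```c
#include <stdbool.h>

/* Hypothetical sketch, not wordsplit's xtonum.  Decode exactly three
   octal digits at SRC into *OUT.  Return false if a digit is missing
   or if the value exceeds 255, so that "400" is rejected rather than
   silently wrapping to 0 when stored in a byte.  */
static bool
octal3_to_byte (char const *src, unsigned char *out)
{
  int value = 0;
  for (int i = 0; i < 3; i++)
    {
      if (! ('0' <= src[i] && src[i] <= '7'))
        return false;
      value = value * 8 + (src[i] - '0');
    }
  if (value > 255)
    return false;
  *out = value;
  return true;
}
```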
Paul Eggert [Sat, 3 Aug 2024 18:55:39 +0000 (11:55 -0700)]
Diagnose argp overflow
* src/names.c (handle_option):
* src/tar.c (parse_default_options):
Report an error if wordsplitting yields more than INT_MAX words,
rather than misbehaving. argp_parse can’t handle more than
INT_MAX, unfortunately.
Paul Eggert [Sat, 3 Aug 2024 16:35:46 +0000 (09:35 -0700)]
Support >INT_MAX -C dirs
* src/extract.c (struct delayed_set_stat, struct delayed_link):
* src/misc.c (normalize_filename, wd_count, chdir_count)
(chdir_arg, tar_getcdpath):
* src/names.c (name_gather, addname, add_hierarchy_to_namelist):
* src/unlink.c (struct deferred_unlink, flush_deferred_unlinks):
Use idx_t, not int, for directory indexes, so as to not
limit their number to INT_MAX; this is theoretically possible
if -T is used.
* src/names.c (name_next_elt, name_next):
Use bool for boolean.
Paul Eggert [Fri, 2 Aug 2024 16:32:11 +0000 (09:32 -0700)]
Use xalignalloc
It ports around issues that our handwritten code does not.
* gnulib.modules: Add xalignalloc.
* src/misc.c (ptr_align, page_aligned_alloc): Remove.
All page_aligned_alloc callers changed to use xalignalloc.
Paul Eggert [Fri, 2 Aug 2024 16:07:06 +0000 (09:07 -0700)]
Make stripped_prefix_len signed
This is part of the general guideline that signed integer types
are safer.
* src/names.c (stripped_prefix_len): Return ptrdiff_t,
not size_t. All callers changed.
Paul Eggert [Fri, 2 Aug 2024 07:27:40 +0000 (00:27 -0700)]
Don’t assume mode_t fits in unsigned long
* src/system.c (oct_to_env): Don’t assume mode_t fits in unsigned
long. Do not output excess leading 1 bits. When the mode is
zero, generate "0" rather than "00". Use sprintf instead of
snprintf, since the output won’t be truncated; in general we don’t
use snprintf unless we want output to be truncated and truncation
is typically not GNU style.
Paul Eggert [Fri, 2 Aug 2024 02:31:50 +0000 (19:31 -0700)]
Prefer C99 formats like %jd to doing it by hand
It’s now safe to assume support for C99 formats like %jd, so remove
some of the longwinded formatting code put in only to be portable to
pre-C99 platforms.
* gnulib.modules: Add intprops.
* src/buffer.c (format_total_stats, try_new_volume)
(write_volume_label):
* src/checkpoint.c (format_checkpoint_string):
* src/compare.c (verify_volume):
* src/create.c (to_chars_subst, dump_regular_file):
* src/incremen.c (read_num):
* src/list.c (read_and, from_header, simple_print_header)
(print_for_mkdir):
* src/sparse.c (sparse_dump_region):
* src/system.c (dec_to_env, sys_exec_info_script)
(sys_exec_checkpoint_script):
* src/xheader.c (out_of_range_header):
Prefer C99 formats like %jd and %ju to STRINGIFY_BIGINT.
* src/common.h: Sort includes.
Include intprops.h, verify.h. All other includes of verify.h
removed.
(intmax, uintmax): New functions and macros.
(STRINGIFY_BIGINT): Remove; no longer used.
(TIMESPEC_STRSIZE_BOUND): Make it 1 byte bigger, for negatives.
* src/create.c (MAX_VAL_WITH_DIGITS, to_base256):
Use *_WIDTH macros rather than assuming no padding bits.
Prefer UINTMAX_MAX to (uintmax_t) -1.
* src/list.c (tartime): Use strftime result rather
than running strlen later.
* src/misc.c (timetostr): New function. Prefer it when
printing time_t values.
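
Illustration of the C99 format style this entry prefers: widen the value to intmax_t (or uintmax_t) and print it with %jd (or %ju), rather than converting it to a string by hand. The helper name format_offset is hypothetical, and snprintf is used here only to keep the sketch self-contained.

```c
#include <inttypes.h>
#include <stdio.h>
#include <sys/types.h>

/* Format OFFSET into BUF, which has room for SIZE bytes.  off_t has no
   printf conversion of its own, so widen it to intmax_t and use %jd.
   Return what snprintf returns.  */
static int
format_offset (char *buf, size_t size, off_t offset)
{
  return snprintf (buf, size, "%jd", (intmax_t) offset);
}
```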
Paul Eggert [Thu, 1 Aug 2024 17:02:06 +0000 (10:02 -0700)]
Fix unlikely problems with time overflow
Also, fix some rounding errors while we’re in the neighborhood.
* src/buffer.c (duration_ns, compute_duration_ns): Rename from
‘duration’ and ‘compute_duration’, and count ns rather than s, to
lessen rounding error. All uses changed.
(compute_duration_ns): Work even if the clock moves backward
and time_t is unsigned.
(print_stats): Don’t worry about null or empty TEXT, as that
cannot happen. Compare double to UINTMAX_MAX + 1.0, not
to UINTMAX_MAX, so that the comparison is exact.
Handle the unlikely case that numbytes >= UINTMAX_MAX.
* src/tar.c (parse_opt): Treat -L hugenumber as effectively
infinity rather than erroring out.
Prefer ckd_add to checking overflow by hand.
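
Illustration of the UINTMAX_MAX + 1.0 remark above. On hosts where uintmax_t has 64 value bits and double is IEEE binary64, UINTMAX_MAX is not representable as a double and is rounded when converted, whereas UINTMAX_MAX + 1 is a power of two and is representable exactly. The helper name fits_uintmax is hypothetical, and the sketch assumes the default rounding mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if D, truncated toward zero, fits in uintmax_t.
   The upper bound is written as UINTMAX_MAX + 1.0, a power of two,
   so the comparison is against an exactly representable value.
   A NaN fails both comparisons and so yields false.  */
static bool
fits_uintmax (double d)
{
  return -1.0 < d && d < UINTMAX_MAX + 1.0;
}
```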
Paul Eggert [Thu, 1 Aug 2024 14:15:01 +0000 (07:15 -0700)]
ptrdiff_t, not ssize_t
* src/buffer.c (bufmap_reset, _flush_write):
Use ptrdiff_t, not ssize_t, to record pointer differences.
POSIX allows systems where size_t is 64 bits but ssize_t is only 32;
Ultrix used to do that, though no current systems do.
Paul Eggert [Wed, 31 Jul 2024 00:59:04 +0000 (17:59 -0700)]
Prefer stdckdint.h to intprops.h
Problem reported by Collin Funk in:
https://lists.gnu.org/r/bug-tar/2024-07/msg00000.html
though this patch is more general than Collin’s suggestion.
* src/compare.c (diff_multivol):
* src/delete.c (move_archive):
* src/sparse.c (oldgnu_add_sparse, pax_decode_header):
* src/system.c (mtioseek):
Prefer ckd_add and ckd_mul to the intprops.h equivalents,
since stdckdint.h is now standard.
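
Illustration of ckd_add from C23's <stdckdint.h>: it stores the sum through its first argument and returns true if the mathematical result did not fit. The wrapper name add_checked is hypothetical. The fallback definition is an assumption for compilers that lack the header; it relies on the GCC/Clang builtin and is not how Gnulib provides the macro.

```c
#include <limits.h>
#include <stdbool.h>

#if defined __has_include
# if __has_include (<stdckdint.h>)
#  include <stdckdint.h>
# endif
#endif
#ifndef ckd_add
/* Fallback for compilers without <stdckdint.h>; assumes the
   GCC/Clang __builtin_add_overflow builtin.  */
# define ckd_add(r, a, b) __builtin_add_overflow (a, b, r)
#endif

/* Store A + B in *SUM and return true, or return false if the sum
   does not fit in int.  */
static bool
add_checked (int a, int b, int *sum)
{
  return !ckd_add (sum, a, b);
}
```

The same pattern applies to ckd_mul and ckd_sub, and the operands may have different integer types from the result.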
Paul Eggert [Wed, 31 Jul 2024 00:47:10 +0000 (17:47 -0700)]
Simplify read_header overflow checking
* src/list.c (read_header): Use ckd_add instead of
doing overflow checking by hand. Although the old code
was correct on all practical hosts, the new code is simpler
and works even on weird hosts where SIZE_MAX <= INT_MAX.
Paul Eggert [Tue, 30 Jul 2024 23:19:35 +0000 (16:19 -0700)]
maint: use static_assert
* gnulib.modules: Add assert-h, for static_assert.
* src/common.h, src/list.c, src/misc.c:
Prefer static_assert to #if + #error. This doesn’t fix any bugs; it’s
just that in general it’s better to avoid the preprocessor.
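
Illustration of static_assert as an alternative to #if + #error. Unlike the preprocessor, static_assert is checked by the compiler proper, so it can use sizeof and refer to types. The names below (SKETCH_BLOCKSIZE, struct sketch_block) are hypothetical and are not taken from tar's sources.

```c
#include <assert.h>
#include <limits.h>

enum { SKETCH_BLOCKSIZE = 512 };

struct sketch_block
{
  char bytes[SKETCH_BLOCKSIZE];
};

/* A condition that #if could also test.  */
static_assert (CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* A condition that #if cannot test, because it involves sizeof.  */
static_assert (sizeof (struct sketch_block) == SKETCH_BLOCKSIZE,
               "struct sketch_block must not contain padding");
```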
Paul Eggert [Tue, 30 Jul 2024 15:35:59 +0000 (08:35 -0700)]
Fix tests/ckmtime.c arithmetic
* tests/ckmtime.c (main): Don’t assume time_t is signed.
Avoid integer overflows (quite possible if time_t is 32 bit).
Do calculations precisely, without any rounding errors.
Paul Eggert [Tue, 30 Jul 2024 03:56:27 +0000 (20:56 -0700)]
xsparse cleanup, including integer overflow
* scripts/xsparse.c: Include inttypes.h, for strtoimax.
Don’t include stdint.h, since inttypes.h includes it.
Sort include directives.
Make all extern functions and vars static, except for ‘main’.
(string_to_off): Use strtoimax instead of doing overflow
checking by hand, incorrectly (it relied on undefined behavior).
(string_to_size): New arg MAXSIZE. All callers changed.
(get_var): Return bool not int. Fix unlikely integer overflow.
Use strncmp instead of memcmp, to avoid unlikely pointer overflow.
(read_xheader, read_map, main): Avoid unlikely integer overflow.
Check for I/O errors more consistently.
(main): Prefer bool to int, and put vars near use.
Paul Eggert [Mon, 29 Jul 2024 22:36:47 +0000 (15:36 -0700)]
maint: fix some unlikely wordsplit overflows
* gnulib.modules: Add reallocarray.
* lib/wordsplit.c: Include stdckdint.h.
(ISDELIM, expvar, isglob, scan_word):
Defend against strchr (s, 0) always succeeding.
(alloc_space, wsplit_assign_vars):
Fix some unlikely integer overflows, partly by using reallocarray.
(alloc_space): Avoid quadratic worst-case behavior.
(isglob): Return bool, not int. Accept size_t, not int.
(to_num): Remove; no longer used.
(xtonum): Clarify the code a bit. Rely on portable
conversion to unsigned char rather than problematic pointer cast.
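
Illustration of the strchr (s, 0) pitfall named above: the terminating null byte counts as part of the string, so searching for '\0' always returns a non-null pointer. A delimiter test therefore needs an explicit check for the null byte. The helper name is_delim is hypothetical and is not wordsplit's ISDELIM.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if C is one of the bytes in DELIMS.  The explicit
   C != '\0' test matters: strchr (DELIMS, '\0') returns a pointer to
   the string's terminating null byte, never NULL.  */
static bool
is_delim (char const *delims, char c)
{
  return c != '\0' && strchr (delims, c) != NULL;
}
```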