Emil Velikov [Sun, 21 Nov 2021 17:38:38 +0000 (17:38 +0000)]
autotools: enable -fdata/function-sections and --gc-sections
Analogous to the parent cmake commit, with linker flag detection.
The former two split the functions and data into separate sections
within the object file, which makes it easier for the latter to properly
garbage collect and discard unused sections.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sun, 21 Nov 2021 17:38:28 +0000 (17:38 +0000)]
cmake: enable -fdata/function-sections and --gc-sections
The former two split the functions and data into separate sections
within the object file, which makes it easier for the latter to properly
garbage collect and discard unused sections. For example:
   text    data     bss     dec     hex filename
 208268    2056    4424  214748   346dc bsdcat -- before
  93396    1304    4360   99060   182f4 bsdcat -- after
1059167   12112   24176 1095455  10b71f bsdcpio -- before
1002538    7320   23984 1033842   fc672 bsdcpio -- after
1093676   14248    6608 1114532  1101a4 bsdtar -- before
1062231   14176    6416 1082823  1085c7 bsdtar -- after
1097259   15032    6408 1118699  1111eb libarchive.so.18 -- before
1095675   14992    6216 1116883  110ad3 libarchive.so.18 -- after
Note:
This is enabled only with gcc/clang on non-Mac platforms. Ideally we
would have a compile-time check, but that seems impossible with our
ancient cmake requirement.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sun, 21 Nov 2021 14:50:25 +0000 (14:50 +0000)]
cmake: drop -rdynamic aka CMP0065 NEW
Prior to version 3.3 cmake would always use -rdynamic. That in itself
causes all the internal symbols to be exported, increasing the size of
the binaries by 5-10% and making it impossible for the compiler to
reason about, optimise and discard unused code.
The -rdynamic flag is useful in two cases:
- a third-party module (say /usr/lib/foo/foobar.so) is underlinked and
depends on symbols from the main binary - apps like irssi, bash and
zsh use that
- the binary uses the glibc backtrace, which relies on dlopen/dlsym to
fetch the symbol data. Unwind is a much better solution, since it
relies on the DWARF data
Our binaries do not use either of these - so drop the -rdynamic. The
autotools build doesn't use it either.
   text    data     bss     dec     hex filename
 229000    2120    4424  235544   39818 bsdcat -- before
 208324    2120    4424  214868   34754 bsdcat -- after
1093939   12128   24176 1130243  113f03 bsdcpio -- before
1059181   12128   24176 1095485  10b73d bsdcpio -- after
1130091   14264    6608 1150963  118ff3 bsdtar -- before
1093690   14264    6608 1114562  1101c2 bsdtar -- after
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sun, 21 Nov 2021 14:26:53 +0000 (14:26 +0000)]
cmake: fold gcc/clang sections
The flags used across the two are identical, apart from -g.
There is no compelling reason why we would omit -g for debug builds
with GCC, while using it with clang.
De-duplicate the sections.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Jonas Witschel [Sun, 21 Nov 2021 09:07:52 +0000 (10:07 +0100)]
test_sparse_basic: do not assume that holes can be read in one go
verify_sparse_file() assumes that every hole will be fully contained in only
one archive_read_data_block(). This is a reasonable assumption if the file is
indeed sparsely encoded in the archive because archive_read_data_block() will
just skip the hole and return the offset of the next data block.
However, if the file is not sparsely encoded in the archive, a hole consists of
a lot of zeroes that need to be read byte by byte. In this case, the archive
contains no information on where this block of zeroes ends and where actual
data begins. Therefore it can happen that a single archive_read_data_block()
contains both zeroes from a hole and actual data.
If this happens, assert(sparse->type == HOLE) fails. This assertion is
reasonable for sparsely encoded files because archive_read_data_block() will
never only read part of a hole (since it does not really "read" a hole at all,
it just returns a higher offset accounting for the size of the hole).
However, we want to start testing files with verify_sparse_file() that are
explicitly not sparsely encoded. In this case, the assertion does not
necessarily hold any more. Therefore we need to account for the case where the
overlapping block consists of data. To make sure the file contents are
correctly encoded in the archive, we need to test the contents of the data
block, like it is already done for blocks completely contained in the data read
by archive_read_data_block().
Note that this modification does not change the way sparsely encoded files are
verified, it just relaxes an edge case that cannot happen with sparsely encoded
files to make it possible to test any kind of file, whether sparsely encoded or
not.
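The relaxed check described above can be sketched roughly as follows. This is a minimal illustration and not the actual test code: the helper name overlap_matches(), its parameters and the idea of a single fill byte per data segment are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the actual test code: check the part of a
 * block returned by one read that overlaps an expected segment.
 * A hole must read back as zeroes; a data segment must read back as
 * the fill byte that was written to it.
 * Returns 1 if every overlapping byte matches, 0 otherwise.
 */
int
overlap_matches(const unsigned char *block, int64_t block_off,
    size_t block_len, int64_t seg_off, int64_t seg_len,
    int is_hole, unsigned char fill)
{
	int64_t block_end, seg_end, start, end, i;
	unsigned char expected = is_hole ? 0 : fill;

	assert(block != NULL);
	assert(block_off >= 0 && seg_off >= 0 && seg_len >= 0);

	block_end = block_off + (int64_t)block_len;
	seg_end = seg_off + seg_len;
	start = block_off > seg_off ? block_off : seg_off;
	end = block_end < seg_end ? block_end : seg_end;

	/* If start >= end there is no overlap and nothing to check. */
	for (i = start; i < end; i++) {
		if (block[i - block_off] != expected)
			return 0;
	}
	return 1;
}
```

A block that straddles the end of a hole is then checked twice, once against the hole and once against the following data segment, instead of asserting that it lies entirely within a hole.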
Theo Buehler [Fri, 19 Nov 2021 17:55:29 +0000 (18:55 +0100)]
Remove OpenSSL compat code that misuses the API
Immediately after EVP_CIPHER_CTX_new() neither EVP_CIPHER_CTX_init()
nor EVP_CIPHER_CTX_reset() should be called: the purpose of the init
function is to initialize a context on the stack while reset clears
a used context for reuse. Neither situation is the case here.
Removing the code also fixes a potential NULL dereference because an
error of reset is not signaled to the caller. Fortunately reset doesn't
currently fail in this situation in current OpenSSL and LibreSSL.
Martin Matuska [Wed, 17 Nov 2021 20:06:00 +0000 (21:06 +0100)]
archive_write_disk_posix: fix writing fflags broken in 8a1bd5c
The fixup list was erroneously assumed to contain directories only.
Other file types (e.g. regular files or symbolic links) may be added
to the fixup list, but only in the case of critical file flag
modification (e.g. SF_IMMUTABLE on BSD systems). We still need to verify
that we are writing to the correct file type, so compare the archive
entry file type with the file type of the file to be modified.
Jonas Witschel [Sun, 14 Nov 2021 17:56:49 +0000 (18:56 +0100)]
Add ARCHIVE_READDISK_NO_SPARSE to suppress reading sparse file info
Sparse file information depends on the file system and can therefore be a
source of unreproducibility for the generated archives, e.g. if the same
content is compressed on a file system with and without sparse file support.
Add an option to suppress reading this information from disk entirely.
Emil Velikov [Sat, 23 Oct 2021 14:32:12 +0000 (15:32 +0100)]
editorconfig: add simple top-level file
Add a simple top-level .editorconfig file to manage common attributes
such as indentation style, trailing whitespace and newline at end of
file. The format is widespread and is supported by nearly every editor
out there - see https://editorconfig.org/ for more.
The majority of the project is C, which uses tabs, although there are
some CMake files using a 2-space indent and shell scripts predominantly
using a 4-space indent.
This makes it harder for casual contributors to butcher things :-)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Remove the excessive fallthrough chain in parse_keyword(). Even though
it is in the else/error path, there is no point in comparing the key
another dozen (or more) times when we know it will fail.
Just use an early return (OK) or break respectively.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Samanta Navarro [Sat, 28 Aug 2021 11:58:00 +0000 (11:58 +0000)]
Fix size_t cast in read_mac_metadata_blob
The size_t data type on 32-bit systems is smaller than int64_t. Check
the int64_t value before casting to size_t. If the value is too large,
stop instead of continuing with a truncated value.
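A minimal sketch of such a range check, assuming a hypothetical helper int64_to_size() (the name and the error convention are not taken from the library):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert an int64_t length to size_t only when
 * the value is representable. Returns 0 on success, -1 when the value
 * is negative or would be truncated (possible where size_t is 32 bits).
 * On failure *out is left untouched.
 */
int
int64_to_size(int64_t value, size_t *out)
{
	assert(out != NULL);
	if (value < 0 || (uint64_t)value > (uint64_t)SIZE_MAX)
		return -1;
	*out = (size_t)value;
	return 0;
}
```

On a 64-bit system the upper bound can never be exceeded, so the check only has an effect where size_t is narrower than int64_t.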
Martin Matuska [Fri, 27 Aug 2021 08:56:28 +0000 (10:56 +0200)]
Fix following symlinks when processing the fixup list
The previous fix in b41daecb5 was incomplete. Fixup entries are
given the original path without calling cleanup_pathname().
To make sure we don't follow a symlink, we must strip trailing
slashes from the path.
The fixup entries are always directories. Make sure we try to modify
only directories by passing O_DIRECTORY to open() (if supported) and,
if that fails, by checking for a directory via lstat().
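Stripping trailing slashes matters because a path such as "link/" is resolved through the symlink even by calls that otherwise do not follow links. A rough sketch of the stripping step, with a hypothetical helper name and the assumption that a lone "/" should be preserved:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: remove trailing '/' characters in place so that
 * "dir/" becomes "dir". A lone "/" is left alone so the root directory
 * stays addressable.
 */
void
strip_trailing_slashes(char *path)
{
	size_t len;

	assert(path != NULL);
	len = strlen(path);
	while (len > 1 && path[len - 1] == '/')
		path[--len] = '\0';
}
```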
Martin Matuska [Sun, 22 Aug 2021 01:53:28 +0000 (03:53 +0200)]
Never follow symlinks when setting file flags on Linux
When opening a file descriptor to set file flags on Linux, ensure
no symbolic links are followed. This fixes the case when an archive
contains a directory entry followed by a symlink entry with the same
path. The fixup code would modify file flags of the symlink target.
Martin Matuska [Sat, 21 Aug 2021 07:07:54 +0000 (09:07 +0200)]
write_disk_posix: rename variable in check_symlinks_fsobj()
Rename the flag "extracting_hardlink" to "checking_linkname" to
be more accurate about its use. If the variable is non-zero it
means that check_symlinks_fsobj() is called on the linkname
when a hardlink is going to be created.
Samanta Navarro [Tue, 1 Jun 2021 11:26:30 +0000 (11:26 +0000)]
Fix mutual check in tar sparse handling
GNU.sparse.numbytes and GNU.sparse.offset both have to be set before
gnu_add_sparse_entry can be called.
The GNU.sparse.numbytes parser checks for tar->sparse_numbytes.
This has to be tar->sparse_offset instead to work just like the
GNU.sparse.offset parser.
Samanta Navarro [Tue, 1 Jun 2021 11:25:03 +0000 (11:25 +0000)]
Handle all negative int64_t values in mtree/tar
The variable last_digit_limit is negative since INT64_MIN itself is
negative as well. This means that the last digit after "limit" always
leads to maxval.
Turning last_digit_limit positive in itself is not sufficient because
it would lead to a signed integer overflow during shift operation.
If limit is reached and the last digit is last_digit_limit, the number
is at least maxval. The already existing if condition for even larger
(or smaller) values can be reused to prevent the last shift.
In my humble opinion it might make sense to reduce duplicated code and
keep it separated in a utility source file for shared use.
Owen W. Taylor [Wed, 12 May 2021 20:26:24 +0000 (16:26 -0400)]
On close, handle short writes from archive_write_callback
The archive_write_callback passed to archive_write_open() is documented as:
"each call to the write callback function should translate to a single write(2) system call.
On success, the write callback should return the number of bytes actually written"
And in most places, the code repeatedly calls the write callback, but when flushing
the buffer at close, the write callback was called once, assuming it would write everything.
This could result in a truncated archive.
A test is added to test short writes in different code paths.
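Handling short writes amounts to calling the callback in a loop until the buffer is drained. The sketch below uses a hypothetical callback type and helper names rather than the real archive_write_callback signature, and treats a return of zero as an error to avoid looping forever, which is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical callback type: returns bytes written, or < 0 on error. */
typedef ssize_t write_cb(void *ctx, const void *buf, size_t len);

/*
 * Keep calling the callback until the whole buffer is consumed.
 * Returns 0 on success, -1 if the callback reports an error or makes
 * no progress.
 */
int
write_all(write_cb *cb, void *ctx, const void *buf, size_t len)
{
	const char *p = buf;

	assert(cb != NULL);
	while (len > 0) {
		ssize_t n = cb(ctx, p, len);

		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Example sink whose callback accepts at most 3 bytes per call. */
struct sink {
	char data[64];
	size_t used;
};

ssize_t
short_writer(void *ctx, const void *buf, size_t len)
{
	struct sink *s = ctx;
	size_t n = len < 3 ? len : 3;

	if (n > sizeof(s->data) - s->used)
		return -1;
	memcpy(s->data + s->used, buf, n);
	s->used += n;
	return (ssize_t)n;
}
```

Calling write_all(short_writer, &sink, buf, len) delivers the full buffer even though each callback invocation only takes three bytes.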
Owen W. Taylor [Wed, 12 May 2021 20:37:16 +0000 (16:37 -0400)]
Avoid getcwd(0, PATH_MAX) for GNU libc
Recent versions of GNU libc and GCC produce a warning on getcwd(0, PATH_MAX):
test_main.c: In function ‘get_refdir’:
test_main.c:3684:8: error: argument 1 is null but the corresponding size argument 2 value is 4096 [-Werror=nonnull]
3684 | pwd = getcwd(NULL, PATH_MAX);/* Solaris getcwd needs the size. */
This is because getcwd() is marked with the 'write_only (1, 2)' attribute.
Using the alternate getcwd(NULL, 0) path which is supported by GNU libc avoids this.
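A minimal sketch of the alternate call, assuming a C library that supports the getcwd(NULL, 0) extension (GNU libc does; POSIX leaves it unspecified). The helper names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Return the current directory in a malloc()ed buffer, or NULL on
 * error. Passing (NULL, 0) asks the C library to size the buffer
 * itself; this is an extension assumed to be available here.
 * The caller must free() the result.
 */
char *
current_dir(void)
{
	return getcwd(NULL, 0);
}

/* Small self-check: the returned path should be absolute. */
int
current_dir_is_absolute(void)
{
	char *pwd = current_dir();
	int ok;

	if (pwd == NULL)
		return 0;
	ok = (pwd[0] == '/');
	free(pwd);
	return ok;
}
```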
Wei-Cheng Pan [Tue, 9 Mar 2021 16:34:55 +0000 (16:34 +0000)]
fix rar header skimming
The available size returned from `__archive_read_ahead` can be larger
than the required size. Subtracting the available size may then
underflow `skip`, which causes EOF to be reached too soon.
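The underflow can be avoided by clamping the consumed amount to what is still left to skip. The sketch below is an illustration only: the helper names are hypothetical, the counters are assumed to be unsigned, and the read-ahead is replaced by a fake source that offers a fixed chunk per call.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given how many bytes still have to be skipped
 * and how many bytes the read-ahead made available, return how many
 * bytes to consume now. Clamping to `skip` keeps the later
 * `skip -= consumed` from wrapping around when avail > skip.
 */
size_t
bytes_to_consume(size_t skip, size_t avail)
{
	return avail < skip ? avail : skip;
}

/*
 * Skip loop over a fake source that offers `chunk` bytes per call.
 * Returns the number of bytes actually consumed.
 */
size_t
skip_bytes(size_t skip, size_t chunk)
{
	size_t total = 0;

	assert(chunk > 0);
	while (skip > 0) {
		size_t n = bytes_to_consume(skip, chunk);

		total += n;
		skip -= n;	/* never underflows: n <= skip */
	}
	return total;
}
```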