Tim Kientzle [Sun, 7 Dec 2008 11:42:15 +0000 (06:42 -0500)]
Provide configure options to suppress use of zlib, bzlib, and lzmadec.
There are a few issues that should be addressed: The test suites
shouldn't rely so heavily on gzip support; zip decoder should
have a private CRC calculation so it can read (uncompressed) zip
archives without zlib.
Submitted by: Diego Pettenò
Tim Kientzle [Sun, 7 Dec 2008 11:40:32 +0000 (06:40 -0500)]
The new and improved cpio -tv output relies on local system
conventions, so a naive file comparison can't be used to test
it. Someday, I'll put together code to carefully parse and verify
the format. Until then, at least shut up the bogus warnings.
Submitted by: Diego Pettenò
Tim Kientzle [Thu, 4 Dec 2008 21:58:03 +0000 (16:58 -0500)]
Basic test for gzip compatibility.
I've commented out the test for handling concatenated gzip streams,
as it appears I need to do more work on refactoring the read
filter framework before this is reasonable to implement.
Fortunately, concatenated gzip streams are uncommon.
Tim Kientzle [Sat, 22 Nov 2008 22:02:45 +0000 (17:02 -0500)]
Add a fuzz tester to the libarchive test suite. This
takes known-good archives, changes random bytes, then feeds
them through libarchive trying to provoke a crash or hang.
This has exposed a couple of problems reading malformed
ISO9660 images. As a result, I now have a rewritten
Rockridge extension parser, better handling of malformed
PVDs, and some additional checks around end-of-archive conditions.
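For illustration only, the "change random bytes" step might look
roughly like the sketch below. The names fuzz_next() and fuzz_mutate()
are hypothetical and are not taken from the test suite; the seeded
generator is an assumption made here so that a crashing input can be
reproduced from its seed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a small LCG, so a failing seed can be replayed. */
uint32_t
fuzz_next(uint32_t *state)
{
	*state = *state * 1664525u + 1013904223u;
	return (*state >> 8);
}

/*
 * Overwrite up to `changes` pseudo-randomly chosen bytes of a
 * known-good archive image held in buf[0..len).
 */
void
fuzz_mutate(unsigned char *buf, size_t len, size_t changes, uint32_t seed)
{
	uint32_t state = seed;
	size_t i, pos;

	if (len == 0)
		return;
	for (i = 0; i < changes; i++) {
		pos = fuzz_next(&state) % len;
		buf[pos] = (unsigned char)(fuzz_next(&state) & 0xff);
	}
}
```

The damaged buffer would then be handed to the reader, with the
harness watching for a crash or hang rather than for any particular
return code.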
Tim Kientzle [Tue, 18 Nov 2008 16:14:08 +0000 (11:14 -0500)]
Rework Bzip2 stream management. Mostly, this makes the stream
initialization lazy so we can re-open the decompressor for
a new stream. This should allow us to read pbzip2 output,
which compresses large blocks separately and writes them
as independent streams, while still handling files such as
Gentoo binary packages, which store unrelated data after the
end of the bzip2 data.
Thanks to: Ivan Voras for pointing out the pbzip2 case
Thanks to: Diego "Flameeyes" Petteno for pointing out
the problem with Gentoo binary packages
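A minimal sketch of the decision this implies, assuming the reader
peeks at the bytes that follow the end of one bzip2 stream: re-open the
decompressor only if they look like the start of another stream, and
otherwise treat them as unrelated trailing data. The function name
bzip2_stream_follows() is hypothetical, and this is not necessarily how
the reader itself makes the decision; only the signature bytes come
from the bzip2 format.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if p[0..avail) looks like the start of a bzip2 stream:
 * "BZh", a level digit '1'..'9', then either a block header or
 * (for an empty stream) the end-of-stream marker.
 */
int
bzip2_stream_follows(const unsigned char *p, size_t avail)
{
	static const unsigned char block_magic[6] =
	    { 0x31, 0x41, 0x59, 0x26, 0x53, 0x59 };
	static const unsigned char end_magic[6] =
	    { 0x17, 0x72, 0x45, 0x38, 0x50, 0x90 };

	if (avail < 10)
		return 0;
	if (memcmp(p, "BZh", 3) != 0)
		return 0;
	if (p[3] < '1' || p[3] > '9')
		return 0;
	return (memcmp(p + 4, block_magic, 6) == 0 ||
	    memcmp(p + 4, end_magic, 6) == 0);
}
```

With a check of this kind, pbzip2 output keeps decoding across stream
boundaries, while trailing data that does not carry the signature is
left alone.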
Tim Kientzle [Sun, 9 Nov 2008 18:00:18 +0000 (13:00 -0500)]
Update NEWS with a few of the things that have happened
since 2.5.5 was released. I still need to dig through old
commit messages to fill this in further before releasing 2.6.0.
Bump the version numbers to 2.5.901 to reflect the pre-release
status. Bump bsdcpio version to 1.1.0.
Tim Kientzle [Sun, 9 Nov 2008 17:45:24 +0000 (12:45 -0500)]
Add LZMA test, more detailed comments about the sorry state
of LZMA magic number checks, update Makefiles to include
lzma reader and lzma test.
Because LZMA support is optional, the lzma test is pretty forgiving
about failures at open time. It does report the skip (including
the underlying libarchive failure) which should reduce the risk of
false passes somewhat.
Tim Kientzle [Wed, 5 Nov 2008 22:18:36 +0000 (17:18 -0500)]
Checkpoint the read filter rearchitecture.
The read filters now consume blocks from their upstream
providers and provide blocks to their downstream consumers.
All blocks are arbitrarily-sized; the reblocking code that
used to be in "compression_none" has been moved into the read
core and handles the output from the read filters.
The big goal here is to provide support for multiple stacked
read filters. While this is of little interest for
decompression (people rarely stack multiple compressors), it
does lay the groundwork for encryption, uudecode, and other
filters that are used in combination with each other and with
compression.
This also simplifies the internal API a little (although the
init() method signature is pretty hairy and going to get
worse before I'm done) and has saved a few dozen lines of code
here and there.
This certainly isn't finished: I still have to convert the new
LZMA decompressor, clean up some of the new code, and find
better terminology. In particular "reader" and "source" are
really awful names. I'll figure out something better soon; I
promise.
But this does pass all of the tests again (which probably
means I need more tests!) so it seems a good point to check in
what I have. Hopefully, over the next couple of days, I'll
work out better terminology and give all the new code here a
good scrubbing.
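To make the stacking idea concrete, here is a toy sketch in which each
filter pulls arbitrarily-sized blocks from its upstream provider and
hands blocks to whoever is downstream. Every name in it (struct filter,
mem_read, xor_read) is hypothetical; it does not reflect the actual
internal types or the init() signature mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical filter node; not libarchive's internal types. */
struct filter {
	struct filter *upstream;	/* NULL at the bottom of the stack */
	void *data;			/* per-filter private state */
	/* Return the next block and its size, or NULL at end of input. */
	const unsigned char *(*read)(struct filter *, size_t *);
};

/* Bottom of the stack: hands out a memory buffer in small blocks. */
struct mem_state {
	const unsigned char *buf;
	size_t len, pos, block;
};

const unsigned char *
mem_read(struct filter *self, size_t *avail)
{
	struct mem_state *m = self->data;
	const unsigned char *p = m->buf + m->pos;
	size_t n = m->len - m->pos;

	if (n > m->block)
		n = m->block;
	m->pos += n;
	*avail = n;
	return (n > 0 ? p : NULL);
}

/* A toy transform: XOR every byte from upstream with a fixed key. */
struct xor_state {
	unsigned char key;
	unsigned char out[64];
};

const unsigned char *
xor_read(struct filter *self, size_t *avail)
{
	struct xor_state *x = self->data;
	const unsigned char *in;
	size_t i, n;

	in = self->upstream->read(self->upstream, &n);
	if (in == NULL) {
		*avail = 0;
		return NULL;
	}
	assert(n <= sizeof(x->out));	/* sketch only: no buffer growth */
	for (i = 0; i < n; i++)
		x->out[i] = in[i] ^ x->key;
	*avail = n;
	return x->out;
}
```

Because every node has the same shape, stacking two transforms is just
a matter of pointing one filter's upstream at another; the consumer
only ever talks to the top of the stack.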
Tim Kientzle [Wed, 29 Oct 2008 22:07:37 +0000 (18:07 -0400)]
First step in transitioning the current decompression code to
a more generic system of stackable stream transforms.
In particular, I believe I've edited every place that called the
decompressor directly to go through new __archive_read_ahead(),
__archive_read_consume() and __archive_read_skip() functions, so
the rest of the work here will not require any changes to the
format handlers.
I've also laid out the new types that will be needed for this.
Next step is to rewrite the decompressors to the new interface,
and overhaul the decompression auction to juggle generic "sources."
Then I'll be able to consolidate reblocking into a single place;
the transforms can emit arbitrary-sized blocks and the current
decompress_none logic will be used to reblock as needed by the
consumer format.
The initial impetus for this was to simplify the decompressors by
consolidating the reblocking logic. I recently came up with
some other transforms I'd like to implement (including new
decompressors, an encryption filter for secure backup, and
uudecode handling to simplify test harnesses) that would also
benefit from this. Eventually, I think I might be able to
standardize the interface for new transforms enough to allow
libarchive clients to register their own transforms complete
with bidding logic (the old interface was too wired into libarchive
internals for the API to be exported). In the very long term,
this might divorce the transformation logic from the rest of
libarchive enough to allow it to be packaged as an independent
library.
Tim Kientzle [Wed, 29 Oct 2008 21:55:12 +0000 (17:55 -0400)]
Style cleanup and sketch out the switch from off_t to int64_t for
libarchive 3.0. (Linux off_t isn't always the same size thanks to
the Large File System support in the system headers; it's a huge
headache using it in public headers for shared libraries.)
This is still a bit tentative; "long long" may be a more appropriate
choice here.
Tim Kientzle [Tue, 21 Oct 2008 22:17:25 +0000 (18:17 -0400)]
Recent changes to the read-ahead semantics require
a slight change to the look-ahead strategy for SFX ZIP archives.
The new code repeatedly extends the look-ahead window in small
increments.
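A rough sketch of that strategy, simulated over an in-memory buffer:
the window grows by a fixed step, and only the newly exposed bytes are
scanned for the ZIP local file header signature "PK\003\004". The
function name find_local_header() and the step and limit parameters
are assumptions made for illustration, not the reader's actual code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the offset of the first "PK\003\004" signature found while
 * widening the window by `step` bytes at a time, or -1 if none is
 * found within `limit` bytes (or before the data runs out).
 */
long
find_local_header(const unsigned char *buf, size_t len, size_t step,
    size_t limit)
{
	size_t window = 0, scanned = 0;

	if (step == 0)
		return -1;
	while (window < len && window < limit) {
		/* Extend the window, clamped to the data and the limit. */
		window += step;
		if (window > len)
			window = len;
		if (window > limit)
			window = limit;
		/* Scan only positions not examined on earlier passes. */
		while (scanned + 4 <= window) {
			if (memcmp(buf + scanned, "PK\003\004", 4) == 0)
				return (long)scanned;
			scanned++;
		}
	}
	return -1;
}
```

Growing the window gradually means a short file simply ends the search
early instead of failing one large up-front request.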
Tim Kientzle [Tue, 21 Oct 2008 22:13:29 +0000 (18:13 -0400)]
Implement a custom command-line parser for cpio. In return for
these 80 extra lines of code, we get consistent argument
handling on all platforms which in turn will simplify the test
harness. I did try importing a "standard" getopt_long()
implementation but those tend to be 600+ lines of code and
provoke some awkward namespace conflicts with platform getopt().
Tim Kientzle [Sun, 19 Oct 2008 22:40:55 +0000 (18:40 -0400)]
Return ARCHIVE_FATAL instead of -1 for read failures here.
Make end-of-file persistent; don't call the client again.
Comment the upcoming change to the end-of-file semantics for now,
since I still need to do a little more groundwork.
Tim Kientzle [Sun, 19 Oct 2008 22:38:35 +0000 (18:38 -0400)]
Make the end-of-archive detection work properly with the new
strict read_ahead() semantics, which return a failure instead of
a short read at end-of-file. This seems to be one of the very
few cases where the short read is actually informative.
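As a sketch of the contract described here, using a hypothetical
memory-backed reader (mem_read_ahead() and mem_consume() are
illustrative names, not the library's functions): a request either
yields at least the requested number of bytes or fails outright, and
the count of bytes actually remaining is still reported so that an
end-of-archive check can look at it.

```c
#include <stddef.h>

/* Hypothetical memory-backed reader used only for this sketch. */
struct mem_reader {
	const unsigned char *buf;
	size_t len;
	size_t pos;
};

/*
 * Strict read-ahead: return a pointer to at least `min` bytes, or
 * NULL if fewer remain.  *avail (if non-NULL) always receives the
 * number of bytes left, which is what makes the failure informative.
 */
const unsigned char *
mem_read_ahead(struct mem_reader *r, size_t min, size_t *avail)
{
	size_t left = r->len - r->pos;

	if (avail != NULL)
		*avail = left;
	if (left < min)
		return NULL;
	return r->buf + r->pos;
}

/* Mark up to `n` bytes as used; never advances past the end. */
void
mem_consume(struct mem_reader *r, size_t n)
{
	size_t left = r->len - r->pos;

	r->pos += (n < left ? n : left);
}
```

A caller that gets a non-NULL pointer no longer needs to re-check the
size; one that gets NULL can still inspect the remaining count to tell
a clean end-of-archive from a truncated one.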
Tim Kientzle [Sun, 19 Oct 2008 18:57:16 +0000 (14:57 -0400)]
Tighten up the semantics of read_ahead(): It will now never
return a short read except at end-of-file (and I think that should
probably return an error as well). The old loose semantics
resulted in a lot of extra checks throughout the library to verify
the size of the returned data; this is a step towards removing
most such checks.
N.B.: I've only made this change to compression_none for
now since I'm planning to refactor a lot of the compression pipeline
very soon anyway.
Tim Kientzle [Sun, 19 Oct 2008 17:17:29 +0000 (13:17 -0400)]
__LA_DEAD is a private convention; don't publicize it.
Also, clean up leaks of a few other __LA_ symbols that
were introduced to make the public headers more portable.
Tim Kientzle [Mon, 6 Oct 2008 22:55:16 +0000 (18:55 -0400)]
Custom from-scratch command-line parser for bsdtar. This is
both more portable and more straightforward than the earlier
getopt()/getopt_long() wrapper approach, requires no fancy
configure/make glue to choose a platform implementation, and
gives me consistent command line parsing on every platform,
which should greatly simplify my attempts at building a robust
test suite.
Tim Kientzle [Wed, 24 Sep 2008 21:43:07 +0000 (17:43 -0400)]
On FreeBSD, we know how to write birthtime to disk.
Add a new test to exercise different time specifications and
make sure that omitted timestamps and high-res timestamps are
all handled correctly.
Tim Kientzle [Thu, 11 Sep 2008 22:46:08 +0000 (18:46 -0400)]
Style fixes: Use Unix-style line endings for consistency with
rest of source (fortunately, most Windows code editors are
agnostic about line endings). Remove trailing spaces, tab
after #define, remove a couple of definitions that are
not actually used in the source.