# Welcome to libarchive!

The libarchive project develops a portable, efficient C library that
can read and write streaming archives in a variety of formats. It
also includes implementations of the common `tar`, `cpio`, and `zcat`
command-line tools that use the libarchive library.

## Questions? Issues?

* http://www.libarchive.org is the home for ongoing
  libarchive development, including documentation,
  and links to the libarchive mailing lists.
* To report an issue, use the issue tracker at
  https://github.com/libarchive/libarchive/issues
* To submit an enhancement to libarchive, please
  submit a pull request via GitHub: https://github.com/libarchive/libarchive/pulls

## Contents of the Distribution

This distribution bundle includes the following major components:

* **libarchive**: a library for reading and writing streaming archives
* **tar**: the 'bsdtar' program is a full-featured 'tar' implementation built on libarchive
* **cpio**: the 'bsdcpio' program is a different interface to essentially the same functionality
* **cat**: the 'bsdcat' program is a simple replacement tool for zcat, bzcat, xzcat, and such
* **examples**: Some small example programs that you may find useful.
* **examples/minitar**: a compact sample demonstrating use of libarchive.
* **contrib**: Various items sent to me by third parties; please contact the authors with any questions.

The top-level directory contains the following information files:

* **NEWS** - highlights of recent changes
* **COPYING** - what you can do with this
* **INSTALL** - installation instructions
* **README** - this file
* **CMakeLists.txt** - input for "cmake" build tool, see INSTALL
* **configure** - configuration script, see INSTALL for details. If your copy of the source lacks a `configure` script, you can try to construct it by running the script in `build/autogen.sh` (or use `cmake`).

The following files in the top-level directory are used by the 'configure' script:

* `Makefile.am`, `aclocal.m4`, `configure.ac` - used to build this distribution, only needed by maintainers
* `Makefile.in`, `config.h.in` - templates used by configure script

## Documentation

In addition to the informational articles and documentation
in the online [libarchive Wiki](https://github.com/libarchive/libarchive/wiki),
the distribution also includes a number of manual pages:

 * bsdtar.1 explains the use of the bsdtar program
 * bsdcpio.1 explains the use of the bsdcpio program
 * bsdcat.1 explains the use of the bsdcat program
 * libarchive.3 gives an overview of the library as a whole
 * archive_read.3, archive_write.3, archive_write_disk.3, and
   archive_read_disk.3 provide detailed calling sequences for the read
   and write APIs
 * archive_entry.3 details the "struct archive_entry" utility class
 * archive_internals.3 provides some insight into libarchive's
   internal structure and operation.
 * libarchive-formats.5 documents the file formats supported by the library
 * cpio.5, mtree.5, and tar.5 provide detailed information about these
   popular archive formats, including hard-to-find details about
   modern cpio and tar variants.

The manual pages above are provided in the 'doc' directory in
a number of different formats.

You should also read the copious comments in `archive.h` and the
source code for the sample programs for more details. Please let us
know about any errors or omissions you find.

## Supported Formats

Currently, the library automatically detects and reads the following formats:
 * Old V7 tar archives
 * POSIX ustar
 * GNU tar format (including GNU long filenames, long link names, and sparse files)
 * Solaris 9 extended tar format (including ACLs)
 * POSIX pax interchange format
 * POSIX octet-oriented cpio
 * SVR4 ASCII cpio
 * Binary cpio (big-endian or little-endian)
 * PWB binary cpio
 * ISO9660 CD-ROM images (with optional Rock Ridge or Joliet extensions)
 * ZIP archives (with uncompressed or "deflate" compressed entries, including support for encrypted Zip archives)
 * ZIPX archives (with support for bzip2, ppmd8, lzma and xz compressed entries)
 * GNU and BSD 'ar' archives
 * 'mtree' format
 * 7-Zip archives
 * Microsoft CAB format
 * LHA and LZH archives
 * RAR and RAR 5.0 archives (with some limitations due to RAR's proprietary status)
 * XAR archives

The library also detects and handles any of the following before evaluating the archive:
 * uuencoded files
 * files with RPM wrapper
 * gzip compression
 * bzip2 compression
 * compress/LZW compression
 * lzma, lzip, and xz compression
 * lz4 compression
 * lzop compression
 * zstandard compression

The library can create archives in any of the following formats:
 * POSIX ustar
 * POSIX pax interchange format
 * "restricted" pax format, which will create ustar archives except for
   entries that require pax extensions (for long filenames, ACLs, etc).
 * Old GNU tar format
 * Old V7 tar format
 * POSIX octet-oriented cpio
 * SVR4 "newc" cpio
 * Binary cpio (little-endian)
 * PWB binary cpio
 * shar archives
 * ZIP archives (with uncompressed or "deflate" compressed entries)
 * GNU and BSD 'ar' archives
 * 'mtree' format
 * ISO9660 format
 * 7-Zip archives
 * XAR archives

When creating archives, the result can be filtered with any of the following:
 * uuencode
 * gzip compression
 * bzip2 compression
 * compress/LZW compression
 * lzma, lzip, and xz compression
 * lz4 compression
 * lzop compression
 * zstandard compression

## Notes about the Library Design

The following notes address many of the most common
questions we are asked about libarchive:

* This is a heavily stream-oriented system. That means that
  it is optimized to read or write the archive in a single
  pass from beginning to end. For example, this allows
  libarchive to process archives too large to store on disk
  by processing them on-the-fly as they are read from or
  written to a network or tape drive. This also makes
  libarchive useful for tools that need to produce
  archives on-the-fly (such as web servers that provide
  archived contents of a user's account).

* In-place modification and random access to the contents
  of an archive are not directly supported. For some formats,
  this is not an issue: for example, tar.gz archives are not
  designed for random access. In some other cases, libarchive
  can re-open an archive and scan it from the beginning quickly
  enough to provide the needed abilities even without true
  random access. Of course, some applications do require true
  random access; those applications should consider alternatives
  to libarchive.

* The library is designed to be extended with new compression and
  archive formats. The only requirement is that the format be
  readable or writable as a stream and that each archive entry be
  independent. There are articles on the libarchive Wiki explaining
  how to extend libarchive.

* On read, compression and format are always detected automatically.

* The same API is used for all formats; it should be very
  easy for software using libarchive to transparently handle
  any of libarchive's archiving formats.

* Libarchive's automatic support for decompression can be used
  without archiving by explicitly selecting the "raw" and "empty"
  formats.

* I've attempted to minimize static link pollution. If you don't
  explicitly invoke a particular feature (such as support for a
  particular compression or format), it won't get pulled into
  statically-linked programs. In particular, if you don't explicitly
  enable a particular compression or decompression support, you won't
  need to link against the corresponding compression or decompression
  libraries. This also reduces the size of statically-linked
  binaries in environments where that matters.

* The library is generally _thread safe_ depending on the platform:
  it does not define any global variables of its own. However, some
  platforms do not provide fully thread-safe versions of key C library
  functions. On those platforms, libarchive will use the non-thread-safe
  functions. Patches to improve this are of great interest to us.

* In particular, libarchive's modules to read or write a directory
  tree do use `chdir()` to optimize the directory traversals. This
  can cause problems for programs that expect to do disk access from
  multiple threads. Of course, those modules are completely
  optional and you can use the rest of libarchive without them.

* The library is _not_ thread aware, however. It does no locking
  or thread management of any kind. If you create a libarchive
  object and need to access it from multiple threads, you will
  need to provide your own locking.

* On read, the library accepts whatever blocks you hand it.
  Your read callback is free to pass the library a byte at a time
  or mmap the entire archive and give it to the library at once.
  On write, the library always produces correctly-blocked output.

* The object-style approach allows you to have multiple archive streams
  open at once. bsdtar uses this in its "@archive" extension.

* The archive itself is read/written using callback functions.
  You can read an archive directly from an in-memory buffer or
  write it to a socket, if you wish. There are some utility
  functions to provide easy-to-use "open file," etc, capabilities.

* The read/write APIs are designed to allow individual entries
  to be read or written to any data source: You can create
  a block of data in memory and add it to a tar archive without
  first writing a temporary file. You can also read an entry from
  an archive and write the data directly to a socket. If you want
  to read/write entries to disk, there are convenience functions to
  make this especially easy.

* Note: The "pax interchange format" is a POSIX standard extended tar
  format that should be used when the older _ustar_ format is not
  appropriate. It has many advantages over other tar formats
  (including the legacy GNU tar format) and is widely supported by
  current tar implementations.
