# Welcome to libarchive!

The libarchive project develops a portable, efficient C library that
can read and write streaming archives in a variety of formats. It
also includes implementations of the common `tar`, `cpio`, and `zcat`
command-line tools that use the libarchive library.

## Questions? Issues?

* http://www.libarchive.org is the home for ongoing
  libarchive development, including documentation,
  and links to the libarchive mailing lists.
* To report an issue, use the issue tracker at
  https://github.com/libarchive/libarchive/issues
* To submit an enhancement to libarchive, please
  submit a pull request via GitHub: https://github.com/libarchive/libarchive/pulls

## Contents of the Distribution

This distribution bundle includes the following major components:

* **libarchive**: a library for reading and writing streaming archives
* **tar**: the 'bsdtar' program is a full-featured 'tar' implementation built on libarchive
* **cpio**: the 'bsdcpio' program is a different interface to essentially the same functionality
* **cat**: the 'bsdcat' program is a simple replacement tool for zcat, bzcat, xzcat, and such
* **examples**: Some small example programs that you may find useful.
* **examples/minitar**: a compact sample demonstrating use of libarchive.
* **contrib**: Various items sent to me by third parties; please contact the authors with any questions.

The top-level directory contains the following information files:

* **NEWS** - highlights of recent changes
* **COPYING** - what you can do with this
* **INSTALL** - installation instructions
* **README** - this file
* **CMakeLists.txt** - input for "cmake" build tool, see INSTALL
* **configure** - configuration script, see INSTALL for details. If your copy of the source lacks a `configure` script, you can try to construct it by running the script in `build/autogen.sh` (or use `cmake`).

The following files in the top-level directory are used by the 'configure' script:
* `Makefile.am`, `aclocal.m4`, `configure.ac` - used to build this distribution, only needed by maintainers
* `Makefile.in`, `config.h.in` - templates used by configure script

## Documentation

In addition to the informational articles and documentation
in the online [libarchive Wiki](https://github.com/libarchive/libarchive/wiki),
the distribution also includes a number of manual pages:

* bsdtar.1 explains the use of the bsdtar program
* bsdcpio.1 explains the use of the bsdcpio program
* bsdcat.1 explains the use of the bsdcat program
* libarchive.3 gives an overview of the library as a whole
* archive_read.3, archive_write.3, archive_write_disk.3, and
  archive_read_disk.3 provide detailed calling sequences for the read
  and write APIs
* archive_entry.3 details the "struct archive_entry" utility class
* archive_internals.3 provides some insight into libarchive's
  internal structure and operation.
* libarchive-formats.5 documents the file formats supported by the library
* cpio.5, mtree.5, and tar.5 provide detailed information about these
  popular archive formats, including hard-to-find details about
  modern cpio and tar variants.

The manual pages above are provided in the 'doc' directory in
a number of different formats.

You should also read the copious comments in `archive.h` and the
source code for the sample programs for more details. Please let us
know about any errors or omissions you find.

## Supported Formats

Currently, the library automatically detects and reads the following formats:
* Old V7 tar archives
* POSIX ustar
* GNU tar format (including GNU long filenames, long link names, and sparse files)
* Solaris 9 extended tar format (including ACLs)
* POSIX pax interchange format
* POSIX octet-oriented cpio
* SVR4 ASCII cpio
* Binary cpio (big-endian or little-endian)
* PWB binary cpio
* ISO9660 CD-ROM images (with optional Rock Ridge or Joliet extensions)
* ZIP archives (with uncompressed or "deflate" compressed entries, including support for encrypted ZIP archives)
* ZIPX archives (with support for bzip2, ppmd8, lzma and xz compressed entries)
* GNU and BSD 'ar' archives
* 'mtree' format
* 7-Zip archives
* Microsoft CAB format
* LHA and LZH archives
* RAR and RAR 5.0 archives (with some limitations due to RAR's proprietary status)
* XAR archives

The library also detects and handles any of the following before evaluating the archive:
* uuencoded files
* files with RPM wrapper
* gzip compression
* bzip2 compression
* compress/LZW compression
* lzma, lzip, and xz compression
* lz4 compression
* lzop compression
* zstandard compression

The library can create archives in any of the following formats:
* POSIX ustar
* POSIX pax interchange format
* "restricted" pax format, which will create ustar archives except for
  entries that require pax extensions (for long filenames, ACLs, etc.).
* Old GNU tar format
* Old V7 tar format
* POSIX octet-oriented cpio
* SVR4 "newc" cpio
* Binary cpio (little-endian)
* PWB binary cpio
* shar archives
* ZIP archives (with uncompressed or "deflate" compressed entries)
* GNU and BSD 'ar' archives
* 'mtree' format
* ISO9660 format
* 7-Zip archives
* XAR archives

When creating archives, the result can be filtered with any of the following:
* uuencode
* gzip compression
* bzip2 compression
* compress/LZW compression
* lzma, lzip, and xz compression
* lz4 compression
* lzop compression
* zstandard compression

## Notes about the Library Design

The following notes address many of the most common
questions we are asked about libarchive:

* This is a heavily stream-oriented system. That means that
  it is optimized to read or write the archive in a single
  pass from beginning to end. For example, this allows
  libarchive to process archives too large to store on disk
  by processing them on-the-fly as they are read from or
  written to a network or tape drive. This also makes
  libarchive useful for tools that need to produce
  archives on-the-fly (such as webservers that provide
  archived contents of a user's account).

* In-place modification and random access to the contents
  of an archive are not directly supported. For some formats,
  this is not an issue: For example, tar.gz archives are not
  designed for random access. In some other cases, libarchive
  can re-open an archive and scan it from the beginning quickly
  enough to provide the needed abilities even without true
  random access. Of course, some applications do require true
  random access; those applications should consider alternatives
  to libarchive.

* The library is designed to be extended with new compression and
  archive formats. The only requirements are that the format be
  readable or writable as a stream and that each archive entry be
  independent. There are articles on the libarchive Wiki explaining
  how to extend libarchive.

* On read, compression and format are always detected automatically.

* The same API is used for all formats; it should be very
  easy for software using libarchive to transparently handle
  any of libarchive's archiving formats.

* Libarchive's automatic support for decompression can be used
  without archiving by explicitly selecting the "raw" and "empty"
  formats.

* I've attempted to minimize static link pollution. If you don't
  explicitly invoke a particular feature (such as support for a
  particular compression or format), it won't get pulled into
  statically-linked programs. In particular, if you don't explicitly
  enable a particular compression or decompression support, you won't
  need to link against the corresponding compression or decompression
  libraries. This also reduces the size of statically-linked
  binaries in environments where that matters.

* The library is generally _thread safe_ depending on the platform:
  it does not define any global variables of its own. However, some
  platforms do not provide fully thread-safe versions of key C library
  functions. On those platforms, libarchive will use the non-thread-safe
  functions. Patches to improve this are of great interest to us.

* In particular, libarchive's modules to read or write a directory
  tree do use `chdir()` to optimize the directory traversals. This
  can cause problems for programs that expect to do disk access from
  multiple threads. Of course, those modules are completely
  optional and you can use the rest of libarchive without them.

* The library is _not_ thread aware, however. It does no locking
  or thread management of any kind. If you create a libarchive
  object and need to access it from multiple threads, you will
  need to provide your own locking.

* On read, the library accepts whatever blocks you hand it.
  Your read callback is free to pass the library a byte at a time
  or mmap the entire archive and give it to the library at once.
  On write, the library always produces correctly-blocked output.

* The object-style approach allows you to have multiple archive streams
  open at once. bsdtar uses this in its "@archive" extension.

* The archive itself is read/written using callback functions.
  You can read an archive directly from an in-memory buffer or
  write it to a socket, if you wish. There are some utility
  functions to provide easy-to-use "open file," etc., capabilities.

* The read/write APIs are designed to allow individual entries
  to be read or written to any data source: You can create
  a block of data in memory and add it to a tar archive without
  first writing a temporary file. You can also read an entry from
  an archive and write the data directly to a socket. If you want
  to read/write entries to disk, there are convenience functions to
  make this especially easy.

* Note: The "pax interchange format" is a POSIX standard extended tar
  format that should be used when the older _ustar_ format is not
  appropriate. It has many advantages over other tar formats
  (including the legacy GNU tar format) and is widely supported by
  current tar implementations.
