Commit | Line | Data |
---|---|---|
e2d91bed | 1 | # Welcome to libarchive! |
7b7c3284 TK |
2 | |
3 | The libarchive project develops a portable, efficient C library that | |
4 | can read and write streaming archives in a variety of formats. It | |
5 | also includes implementations of the common `tar`, `cpio`, and `zcat` | |
6 | command-line tools that use the libarchive library. | |
7 | ||
e2d91bed TK |
8 | ## Questions? Issues? |
9 | ||
10 | * http://www.libarchive.org is the home for ongoing | |
11 | libarchive development, including documentation, | |
12 | and links to the libarchive mailing lists. | |
13 | * To report an issue, use the issue tracker at | |
14 | https://github.com/libarchive/libarchive/issues | |
15 | * To submit an enhancement to libarchive, please | |
16 | submit a pull request via GitHub: https://github.com/libarchive/libarchive/pulls | |
17 | ||
18 | ## Contents of the Distribution | |
19 | ||
20 | This distribution bundle includes the following major components: | |
7b7c3284 | 21 | |
7b7c3284 TK |
22 | * **libarchive**: a library for reading and writing streaming archives |
23 | * **tar**: the 'bsdtar' program is a full-featured 'tar' implementation built on libarchive | |
24 | * **cpio**: the 'bsdcpio' program is a different interface to essentially the same functionality | |
25 | * **cat**: the 'bsdcat' program is a simple replacement tool for zcat, bzcat, xzcat, and such | |
26 | * **examples**: Some small example programs that you may find useful. | |
27 | * **examples/minitar**: a compact sample demonstrating use of libarchive. | |
28 | * **contrib**: Various items sent to me by third parties; please contact the authors with any questions. | |
29 | ||
30 | The top-level directory contains the following information files: | |
e2d91bed | 31 | |
7b7c3284 TK |
32 | * **NEWS** - highlights of recent changes |
33 | * **COPYING** - what you can do with this | |
34 | * **INSTALL** - installation instructions | |
35 | * **README** - this file | |
7b7c3284 | 36 | * **CMakeLists.txt** - input for "cmake" build tool, see INSTALL |
e2d91bed | 37 | * **configure** - configuration script, see INSTALL for details. If your copy of the source lacks a `configure` script, you can try to construct it by running the script in `build/autogen.sh` (or use `cmake`). |
7b7c3284 | 38 | |
e2d91bed TK |
39 | The following files in the top-level directory are used by the 'configure' script: |
40 | * `Makefile.am`, `aclocal.m4`, `configure.ac` - used to build this distribution, only needed by maintainers | |
41 | * `Makefile.in`, `config.h.in` - templates used by configure script | |
42 | ||
43 | ## Documentation | |
44 | ||
45 | In addition to the informational articles and documentation | |
46 | in the online [libarchive Wiki](https://github.com/libarchive/libarchive/wiki), | |
47 | the distribution also includes a number of manual pages: | |
7b7c3284 | 48 | |
7b7c3284 TK |
49 | * bsdtar.1 explains the use of the bsdtar program |
50 | * bsdcpio.1 explains the use of the bsdcpio program | |
51 | * bsdcat.1 explains the use of the bsdcat program | |
52 | * libarchive.3 gives an overview of the library as a whole | |
53 | * archive_read.3, archive_write.3, archive_write_disk.3, and | |
54 | archive_read_disk.3 provide detailed calling sequences for the read | |
55 | and write APIs | |
56 | * archive_entry.3 details the "struct archive_entry" utility class | |
57 | * archive_internals.3 provides some insight into libarchive's | |
58 | internal structure and operation. | |
59 | * libarchive-formats.5 documents the file formats supported by the library | |
60 | * cpio.5, mtree.5, and tar.5 provide detailed information about these | |
61 | popular archive formats, including hard-to-find details about | |
62 | modern cpio and tar variants. | |
e2d91bed | 63 | |
7b7c3284 TK |
64 | The manual pages above are provided in the 'doc' directory in |
65 | a number of different formats. | |
66 | ||
e2d91bed | 67 | You should also read the copious comments in `archive.h` and the |
7b7c3284 TK |
68 | source code for the sample programs for more details. Please let us |
69 | know about any errors or omissions you find. | |
70 | ||
e2d91bed TK |
71 | ## Supported Formats |
72 | ||
7b7c3284 | 73 | Currently, the library automatically detects and reads the following formats: |
7b7c3284 TK |
74 | * Old V7 tar archives |
75 | * POSIX ustar | |
e2d91bed TK |
76 | * GNU tar format (including GNU long filenames, long link names, and sparse files) |
77 | * Solaris 9 extended tar format (including ACLs) | |
7b7c3284 TK |
78 | * POSIX pax interchange format |
79 | * POSIX octet-oriented cpio | |
80 | * SVR4 ASCII cpio | |
82 | * Binary cpio (big-endian or little-endian) | |
83 | * ISO9660 CD-ROM images (with optional Rock Ridge or Joliet extensions) | |
e2d91bed | 84 | * ZIP archives (with uncompressed or "deflate" compressed entries, including support for encrypted Zip archives) |
7b7c3284 TK |
85 | * GNU and BSD 'ar' archives |
86 | * 'mtree' format | |
87 | * 7-Zip archives | |
88 | * Microsoft CAB format | |
89 | * LHA and LZH archives | |
e2d91bed | 90 | * RAR archives (with some limitations due to RAR's proprietary status) |
7b7c3284 TK |
91 | * XAR archives |
92 | ||
93 | The library also detects and handles any of the following before evaluating the archive: | |
94 | * uuencoded files | |
95 | * files with RPM wrapper | |
96 | * gzip compression | |
97 | * bzip2 compression | |
98 | * compress/LZW compression | |
99 | * lzma, lzip, and xz compression | |
100 | * lz4 compression | |
101 | * lzop compression | |
102 | ||
103 | The library can create archives in any of the following formats: | |
104 | * POSIX ustar | |
105 | * POSIX pax interchange format | |
106 | * "restricted" pax format, which will create ustar archives except for | |
107 | entries that require pax extensions (for long filenames, ACLs, etc.). | |
108 | * Old GNU tar format | |
109 | * Old V7 tar format | |
110 | * POSIX octet-oriented cpio | |
111 | * SVR4 "newc" cpio | |
112 | * shar archives | |
113 | * ZIP archives (with uncompressed or "deflate" compressed entries) | |
114 | * GNU and BSD 'ar' archives | |
115 | * 'mtree' format | |
116 | * ISO9660 format | |
117 | * 7-Zip archives | |
118 | * XAR archives | |
119 | ||
120 | When creating archives, the result can be filtered with any of the following: | |
121 | * uuencode | |
122 | * gzip compression | |
123 | * bzip2 compression | |
124 | * compress/LZW compression | |
125 | * lzma, lzip, and xz compression | |
126 | * lz4 compression | |
127 | * lzop compression | |
128 | ||
e2d91bed TK |
129 | ## Notes about the Library Design |
130 | ||
24e2f6ba TK |
131 | The following notes address many of the most common |
132 | questions we are asked about libarchive: | |
133 | ||
e2d91bed TK |
134 | * This is a heavily stream-oriented system. That means that |
135 | it is optimized to read or write the archive in a single | |
136 | pass from beginning to end. For example, this allows | |
137 | libarchive to process archives too large to store on disk | |
138 | by processing them on-the-fly as they are read from or | |
9a790ea8 TK |
139 | written to a network or tape drive. This also makes |
140 | libarchive useful for tools that need to produce | |
141 | archives on-the-fly (such as webservers that provide | |
142 | archived contents of a user's account). | |
143 | ||
144 | * In-place modification and random access to the contents | |
145 | of an archive are not directly supported. For some formats, | |
146 | this is not an issue: For example, tar.gz archives are not | |
147 | designed for random access. In some other cases, libarchive | |
148 | can re-open an archive and scan it from the beginning quickly | |
149 | enough to provide the needed abilities even without true | |
150 | random access. Of course, some applications do require true | |
151 | random access; those applications should consider alternatives | |
152 | to libarchive. | |
e2d91bed TK |
153 | |
154 | * The library is designed to be extended with new compression and | |
155 | archive formats. The only requirement is that the format be | |
156 | readable or writable as a stream and that each archive entry be | |
157 | independent. There are articles on the libarchive Wiki explaining | |
158 | how to extend libarchive. | |
159 | ||
160 | * On read, compression and format are always detected automatically. | |
161 | ||
162 | * The same API is used for all formats; in particular, it's very | |
163 | easy for software using libarchive to transparently handle | |
164 | any of libarchive's archiving formats. | |
165 | ||
166 | * Libarchive's automatic support for decompression can be used | |
167 | without archiving by explicitly selecting the "raw" and "empty" | |
168 | formats. | |
169 | ||
170 | * I've attempted to minimize static link pollution. If you don't | |
171 | explicitly invoke a particular feature (such as support for a | |
172 | particular compression or format), it won't get pulled into | |
173 | statically-linked programs. In particular, if you don't explicitly | |
174 | enable a particular compression or decompression support, you won't | |
175 | need to link against the corresponding compression or decompression | |
176 | libraries. This also reduces the size of statically-linked | |
177 | binaries in environments where that matters. | |
178 | ||
24e2f6ba TK |
179 | * The library is generally _thread-safe_, depending on the platform: |
180 | it does not define any global variables of its own. However, some | |
181 | platforms do not provide fully thread-safe versions of key C library | |
182 | functions. On those platforms, libarchive will use the non-thread-safe | |
183 | functions. Patches to improve this are of great interest to us. | |
184 | ||
185 | * In particular, libarchive's modules to read or write a directory | |
186 | tree do use `chdir()` to optimize the directory traversals. This | |
187 | can cause problems for programs that expect to do disk access from | |
6fd58302 TK |
188 | multiple threads. Of course, those modules are completely |
189 | optional and you can use the rest of libarchive without them. | |
24e2f6ba TK |
190 | |
191 | * The library is _not_ thread aware, however. It does no locking | |
192 | or thread management of any kind. If you create a libarchive | |
193 | object and need to access it from multiple threads, you will | |
194 | need to provide your own locking. | |
195 | ||
e2d91bed TK |
196 | * On read, the library accepts whatever blocks you hand it. |
197 | Your read callback is free to pass the library a byte at a time | |
198 | or mmap the entire archive and give it to the library at once. | |
199 | On write, the library always produces correctly-blocked output. | |
200 | ||
201 | * The object-style approach allows you to have multiple archive streams | |
202 | open at once. bsdtar uses this in its "@archive" extension. | |
203 | ||
204 | * The archive itself is read/written using callback functions. | |
205 | You can read an archive directly from an in-memory buffer or | |
206 | write it to a socket, if you wish. There are some utility | |
207 | functions to provide easy-to-use "open file," etc, capabilities. | |
208 | ||
209 | * The read/write APIs are designed to allow individual entries | |
210 | to be read or written to any data source: You can create | |
211 | a block of data in memory and add it to a tar archive without | |
212 | first writing a temporary file. You can also read an entry from | |
213 | an archive and write the data directly to a socket. If you want | |
214 | to read/write entries to disk, there are convenience functions to | |
215 | make this especially easy. | |
216 | ||
6fd58302 TK |
217 | * Note: The "pax interchange format" is a POSIX standard extended tar |
218 | format that should be used when the older _ustar_ format is not | |
219 | appropriate. It has many advantages over other tar formats | |
220 | (including the legacy GNU tar format) and is widely supported by | |
221 | current tar implementations. | |
222 |