Commit | Line | Data |
---|---|---|
9760662f JH |
1 | GIT pack format |
2 | =============== | |
3 | ||
4 | = pack-*.pack file has the following format: | |
5 | ||
6 | - The header appears at the beginning and consists of the following: | |
7 | ||
8 | 4-byte signature: the bytes 'P', 'A', 'C', 'K' | |
9 | 4-byte version number (network byte order) | |
10 | 4-byte number of objects contained in the pack (network byte order) | |
11 | ||
12 | Observation: we cannot have more than 4G versions ;-) and | |
13 | more than 4G objects in a pack. | |
14 | ||
15 | - The header is followed by that number of object entries, each of | |
16 | which looks like this: | |
17 | ||
18 | (undeltified representation) | |
19 | n-byte type and length (3-bit type, (n-1)*7+4-bit length) | |
20 | compressed data | |
21 | ||
22 | (deltified representation) | |
23 | n-byte type and length (3-bit type, (n-1)*7+4-bit length) | |
24 | 20-byte base object name | |
25 | compressed delta data | |
26 | ||
27 | Observation: length of each object is encoded in a variable | |
28 | length format and is not constrained to 32-bit or anything. | |
29 | ||
30 | - The trailer records a 20-byte SHA1 checksum of all of the above. | |
31 | ||
32 | = pack-*.idx file has the following format: | |
33 | ||
34 | - The header consists of 256 4-byte network byte order | |
35 | integers. The N-th entry of this table records the number of | |
36 | objects in the corresponding pack whose object name has a | |
37 | first byte less than or equal to N. This is called the | |
38 | 'first-level fan-out' table. | |
39 | ||
40 | Observation: we would need to extend this to an array of | |
41 | 8-byte integers to go beyond 4G objects per pack, but it is | |
42 | not strictly necessary. | |
43 | ||
44 | - The header is followed by sorted 24-byte entries, one entry | |
45 | per object in the pack. Each entry is: | |
46 | ||
47 | 4-byte network byte order integer, recording where the | |
48 | object is stored in the packfile as the offset from the | |
49 | beginning. | |
50 | ||
51 | 20-byte object name. | |
52 | ||
53 | Observation: we would definitely need to extend this to an | |
54 | 8-byte integer plus 20-byte object name to handle a packfile | |
55 | that is larger than 4GB. | |
56 | ||
57 | - The file is concluded with a trailer: | |
58 | ||
59 | A copy of the 20-byte SHA1 checksum at the end of the | |
60 | corresponding packfile. | |
61 | ||
62 | 20-byte SHA1-checksum of all of the above. | |
63 | ||
64 | Pack Idx file: | |
65 | ||
66 | idx | |
67 | +--------------------------------+ | |
68 | | fanout[0] = 2 |-. | |
69 | +--------------------------------+ | | |
70 | | fanout[1] | | | |
71 | +--------------------------------+ | | |
72 | | fanout[2] | | | |
73 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | |
74 | | fanout[255] | | | |
75 | +--------------------------------+ | | |
76 | main | offset | | | |
77 | index | object name 00XXXXXXXXXXXXXXXX | | | |
78 | table +--------------------------------+ | | |
79 | | offset | | | |
80 | | object name 00XXXXXXXXXXXXXXXX | | | |
81 | +--------------------------------+ | | |
82 | .-| offset |<+ | |
83 | | | object name 01XXXXXXXXXXXXXXXX | | |
84 | | +--------------------------------+ | |
85 | | | offset | | |
86 | | | object name 01XXXXXXXXXXXXXXXX | | |
87 | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
88 | | | offset | | |
89 | | | object name FFXXXXXXXXXXXXXXXX | | |
90 | | +--------------------------------+ | |
91 | trailer | | packfile checksum | | |
92 | | +--------------------------------+ | |
93 | | | idxfile checksum | | |
94 | | +--------------------------------+ | |
95 | .-------. | |
96 | | | |
97 | Pack file entry: <+ | |
98 | ||
99 | packed object header: | |
100 | 1-byte type (3-bit, just below the MSB, which is the size extension bit) | |
101 | size0 (lower 4-bit) | |
102 | n-byte sizeN (as long as MSB is set, each 7-bit) | |
103 | size0..sizeN form 4+7+7+..+7 bit integer, size0 | |
104 | is the least significant part, sizeN the most significant. | |
105 | packed object data: | |
106 | If it is not DELTA, then deflated bytes (the size above | |
107 | is the size before compression). | |
108 | If it is DELTA, then | |
109 | 20-byte base object name SHA1 (the size above is the | |
110 | size of the delta data that follows). | |
111 | delta data, deflated. |
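
The layouts described above can be sketched in a few lines of Python. This is an illustrative sketch, not git's own code. The function names (`parse_pack_header`, `decode_entry_header`, `fanout_range`) are hypothetical. It assumes that the signature is the literal bytes `PACK`, that the first header byte carries the extension bit in its MSB with 3 type bits below it, that size0 holds the lowest 4 bits of the size, and that `fanout[n]` counts objects whose first name byte is at most `n` (as the `fanout[0] = 2` arrow in the diagram suggests).

```python
import struct

# Assumption: the 4-byte signature is the literal bytes b"PACK".
PACK_SIGNATURE = b"PACK"


def parse_pack_header(buf):
    """Return (version, object_count) from the 12-byte pack header."""
    # ">" selects network (big-endian) byte order; "I" is an unsigned 32-bit int.
    signature, version, count = struct.unpack(">4sII", buf[:12])
    if signature != PACK_SIGNATURE:
        raise ValueError("not a pack file")
    return version, count


def decode_entry_header(buf, pos=0):
    """Decode one packed object header starting at buf[pos].

    Returns (type, size, next_pos), where next_pos is the offset of the
    first byte after the header.
    """
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07  # 3 type bits just below the MSB
    size = byte & 0x0F             # size0: lowest 4 bits of the size
    shift = 4
    while byte & 0x80:             # MSB set: another size byte follows
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos


def fanout_range(fanout, first_byte):
    """Return the [lo, hi) slice of the main index table that holds
    entries whose object name starts with first_byte."""
    lo = fanout[first_byte - 1] if first_byte > 0 else 0
    hi = fanout[first_byte]
    return lo, hi
```

For example, the two header bytes `0x95 0x0A` decode to type 1 and size 165: the first byte contributes the low four bits (5) and signals continuation, and the second contributes `10 << 4`. With a fan-out table beginning `[2, 5, ...]`, `fanout_range(fanout, 1)` yields `(2, 5)`, so a lookup for a name starting with `01` only has to search entries 2 through 4 of the main index table.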