Commit | Line | Data |
---|---|---|
48a8c26c | 1 | Git index format |
8c7d0517 NTND |
2 | ================ |
3 | ||
2de9b711 | 4 | == The Git index file has the following format |
8c7d0517 NTND |
5 | |
6 | All binary numbers are in network byte order. Version 2 is described | |
7 | here unless stated otherwise. | |
8 | ||
9 | - A 12-byte header consisting of | |
10 | ||
11 | 4-byte signature: | |
23fcc98f | 12 | The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache") |
8c7d0517 NTND |
13 | |
14 | 4-byte version number: | |
300e39f6 | 15 | The currently supported versions are 2, 3 and 4. |
8c7d0517 NTND |
16 | |
17 | 32-bit number of index entries. | |
18 | ||
23fcc98f | 19 | - A number of sorted index entries (see below). |
8c7d0517 NTND |
20 | |
21 | - Extensions | |
22 | ||
23 | Extensions are identified by signature. Optional extensions can | |
48a8c26c | 24 | be ignored if Git does not understand them. |
8c7d0517 | 25 | |
48a8c26c | 26 | Git currently supports cached tree and resolve undo extensions. |
8c7d0517 NTND |
27 | |
28 | 4-byte extension signature. If the first byte is 'A'..'Z' the | |
29 | extension is optional and can be ignored. | |
30 | ||
31 | 32-bit size of the extension | |
32 | ||
33 | Extension data | |
34 | ||
35 | - 160-bit SHA-1 over the content of the index file before this | |
36 | checksum. | |
37 | ||
38 | == Index entry | |
39 | ||
40 | Index entries are sorted in ascending order on the name field, | |
23fcc98f JH |
41 | interpreted as a string of unsigned bytes (i.e. memcmp() order, no |
42 | localization, no special casing of directory separator '/'). Entries | |
43 | with the same name are sorted by their stage field. | |
8c7d0517 NTND |
44 | |
45 | 32-bit ctime seconds, the last time a file's metadata changed | |
46 | this is stat(2) data | |
47 | ||
48 | 32-bit ctime nanosecond fractions | |
49 | this is stat(2) data | |
50 | ||
51 | 32-bit mtime seconds, the last time a file's data changed | |
52 | this is stat(2) data | |
53 | ||
54 | 32-bit mtime nanosecond fractions | |
55 | this is stat(2) data | |
56 | ||
57 | 32-bit dev | |
58 | this is stat(2) data | |
59 | ||
60 | 32-bit ino | |
61 | this is stat(2) data | |
62 | ||
63 | 32-bit mode, split into (high to low bits) | |
64 | ||
65 | 4-bit object type | |
23fcc98f | 66 | valid values in binary are 1000 (regular file), 1010 (symbolic link) |
8c7d0517 NTND |
67 | and 1110 (gitlink) |
68 | ||
69 | 3-bit unused | |
70 | ||
23fcc98f JH |
71 | 9-bit unix permission. Only 0755 and 0644 are valid for regular files. |
72 | Symbolic links and gitlinks have value 0 in this field. | |
8c7d0517 NTND |
73 | |
74 | 32-bit uid | |
75 | this is stat(2) data | |
76 | ||
77 | 32-bit gid | |
78 | this is stat(2) data | |
79 | ||
80 | 32-bit file size | |
23fcc98f | 81 | This is the on-disk size from stat(2), truncated to 32-bit. |
8c7d0517 NTND |
82 | |
83 | 160-bit SHA-1 for the represented object | |
84 | ||
23fcc98f | 85 | A 16-bit 'flags' field split into (high to low bits) |
8c7d0517 NTND |
86 | |
87 | 1-bit assume-valid flag | |
88 | ||
89 | 1-bit extended flag (must be zero in version 2) | |
90 | ||
91 | 2-bit stage (during merge) | |
92 | ||
23fcc98f JH |
93 | 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF |
94 | is stored in this field. | |
8c7d0517 | 95 | |
300e39f6 NTND |
96 | (Version 3 or later) A 16-bit field, only applicable if the |
97 | "extended flag" above is 1, split into (high to low bits). | |
8c7d0517 NTND |
98 | |
99 | 1-bit reserved for future | |
100 | ||
101 | 1-bit skip-worktree flag (used by sparse checkout) | |
102 | ||
103 | 1-bit intent-to-add flag (used by "git add -N") | |
104 | ||
105 | 13-bit unused, must be zero | |
106 | ||
107 | Entry path name (variable length) relative to top level directory | |
108 | (without leading slash). '/' is used as path separator. The special | |
23fcc98f | 109 | path components ".", ".." and ".git" (without quotes) are disallowed. |
8c7d0517 NTND |
110 | Trailing slash is also disallowed. |
111 | ||
112 | The exact encoding is undefined, but the '.' and '/' characters | |
23fcc98f JH |
113 | are encoded in 7-bit ASCII and the encoding cannot contain a NUL |
114 | byte (iow, this is a UNIX pathname). | |
8c7d0517 | 115 | |
afd7bd22 JH |
116 | (Version 4) In version 4, the entry path name is prefix-compressed |
117 | relative to the path name for the previous entry (the very first | |
118 | entry is encoded as if the path name for the previous entry is an | |
119 | empty string). At the beginning of an entry, an integer N in the | |
120 | variable width encoding (the same encoding used for the offset in | |
121 | OFS_DELTA pack entries; see pack-format.txt) is stored, followed | |
122 | by a NUL-terminated string S. Removing N bytes from the end of the | |
123 | path name for the previous entry, and replacing it with the string S | |
124 | yields the path name for this entry. | |
125 | ||
8c7d0517 NTND |
126 | 1-8 NUL bytes as necessary to pad the entry to a multiple of eight bytes |
127 | while keeping the name NUL-terminated. | |
128 | ||
afd7bd22 JH |
129 | (Version 4) In version 4, the padding after the pathname does not |
130 | exist. | |
131 | ||
5fc2fc8f NTND |
132 | Interpretation of index entries in split index mode is completely |
133 | different. See below for details. | |
134 | ||
8c7d0517 NTND |
135 | == Extensions |
136 | ||
23fcc98f | 137 | === Cached tree |
8c7d0517 | 138 | |
23fcc98f | 139 | Cached tree extension contains pre-computed hashes for trees that can |
8c7d0517 NTND |
140 | be derived from the index. It helps speed up tree object generation |
141 | from index for a new commit. | |
142 | ||
143 | When a path is updated in index, the path must be invalidated and | |
144 | removed from tree cache. | |
145 | ||
23fcc98f | 146 | The signature for this extension is { 'T', 'R', 'E', 'E' }. |
8c7d0517 | 147 | |
23fcc98f JH |
148 | A series of entries fills the entire extension, each of which |
149 | consists of: | |
8c7d0517 | 150 | |
23fcc98f | 151 | - NUL-terminated path component (relative to its parent directory); |
8c7d0517 | 152 | |
23fcc98f JH |
153 | - ASCII decimal number of entries in the index that are covered by the |
154 | tree this entry represents (entry_count); | |
8c7d0517 | 155 | |
23fcc98f | 156 | - A space (ASCII 32); |
8c7d0517 | 157 | |
23fcc98f JH |
158 | - ASCII decimal number that represents the number of subtrees this |
159 | tree has; | |
8c7d0517 | 160 | |
23fcc98f JH |
161 | - A newline (ASCII 10); and |
162 | ||
163 | - 160-bit object name for the object that would result from writing | |
164 | this span of index as a tree. | |
165 | ||
e44b6df9 | 166 | An entry can be in an invalidated state and is represented by having |
4a6385fe NTND |
167 | a negative number in the entry_count field. In this case, there is no |
168 | object name and the next entry starts immediately after the newline. | |
169 | When writing an invalid entry, -1 should always be used as entry_count. | |
23fcc98f JH |
170 | |
171 | The entries are written out in the top-down, depth-first order. The | |
172 | first entry represents the root level of the repository, followed by the | |
173 | first subtree---let's call this A---of the root level (with its name | |
174 | relative to the root level), followed by the first subtree of A (with | |
175 | its name relative to A), ... | |
8c7d0517 NTND |
176 | |
177 | === Resolve undo | |
178 | ||
23fcc98f | 179 | A conflict is represented in the index as a set of higher stage entries. |
8c7d0517 | 180 | When a conflict is resolved (e.g. with "git add path"), these higher |
17b83d71 | 181 | stage entries will be removed and a stage-0 entry with proper resolution |
23fcc98f | 182 | is added. |
8c7d0517 | 183 | |
23fcc98f JH |
184 | When these higher stage entries are removed, they are saved in the |
185 | resolve undo extension, so that conflicts can be recreated (e.g. with | |
186 | "git checkout -m"), in case users want to redo a conflict resolution | |
187 | from scratch. | |
8c7d0517 | 188 | |
23fcc98f | 189 | The signature for this extension is { 'R', 'E', 'U', 'C' }. |
8c7d0517 | 190 | |
23fcc98f JH |
191 | A series of entries fills the entire extension, each of which |
192 | consists of: | |
8c7d0517 | 193 | |
23fcc98f JH |
194 | - NUL-terminated pathname the entry describes (relative to the root of |
195 | the repository, i.e. full pathname); | |
8c7d0517 | 196 | |
23fcc98f JH |
197 | - Three NUL-terminated ASCII octal numbers, entry mode of entries in |
198 | stage 1 to 3 (a missing stage is represented by "0" in this field); | |
199 | and | |
8c7d0517 | 200 | |
23fcc98f JH |
201 | - At most three 160-bit object names of the entry in stages from 1 to 3 |
202 | (nothing is written for a missing stage). | |
8c7d0517 | 203 | |
5fc2fc8f NTND |
204 | === Split index |
205 | ||
206 | In split index mode, the majority of index entries could be stored | |
207 | in a separate file. This extension records the changes to be made on | |
208 | top of that to produce the final index. | |
209 | ||
210 | The signature for this extension is { 'l', 'i', 'n', 'k' }. | |
211 | ||
212 | The extension consists of: | |
213 | ||
214 | - 160-bit SHA-1 of the shared index file. The shared index file path | |
215 | is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the | |
216 | index does not require a shared index file. | |
217 | ||
218 | - An ewah-encoded delete bitmap; each bit represents an entry in the | |
219 | shared index. If a bit is set, its corresponding entry in the | |
220 | shared index will be removed from the final index. Note that a | |
221 | delete operation changes index entry positions, while the replace | |
222 | phase needs the original positions, so it's best to just mark | |
223 | entries for removal, then do a mass deletion after replacement. | |
224 | ||
225 | - An ewah-encoded replace bitmap; each bit represents an entry in | |
226 | the shared index. If a bit is set, its corresponding entry in the | |
227 | shared index will be replaced with an entry in this index | |
228 | file. All replaced entries are stored in sorted order in this | |
229 | index. The first "1" bit in the replace bitmap corresponds to the | |
230 | first index entry, the second "1" bit to the second entry and so | |
231 | on. Replaced entries may have empty path names to save space. | |
232 | ||
233 | The remaining index entries after replaced ones will be added to the | |
f745acb0 | 234 | final index. These added entries are also sorted by entry name then |
5fc2fc8f | 235 | stage. |
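
The split index rules above (replace using original positions, delete afterwards, then add the remaining entries) can be illustrated with a minimal Python sketch. It is not Git's implementation. It assumes both bitmaps have already been ewah-decoded into sets of bit positions, and the names `Entry` and `merge_split_index` are hypothetical, introduced only for this illustration. For simplicity it re-sorts the combined list by name and stage rather than merging two sorted lists.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Set


@dataclass(frozen=True)
class Entry:
    """Hypothetical in-memory index entry holding only the fields used here."""
    name: bytes   # path name; may be empty for a replaced entry
    stage: int    # 0..3
    oid: str      # object name as a hex string, for readability


def merge_split_index(shared: Sequence[Entry],
                      split_entries: Sequence[Entry],
                      delete_bits: Set[int],
                      replace_bits: Set[int]) -> List[Entry]:
    """Build the final entry list from a shared index and a split index.

    delete_bits and replace_bits are assumed to be already decoded
    positions into `shared` (the on-disk form is ewah-encoded).
    """
    result = list(shared)
    positions = sorted(replace_bits)
    if len(split_entries) < len(positions):
        raise ValueError("fewer entries than set bits in the replace bitmap")

    # Replace phase: the k-th set bit corresponds to the k-th entry of
    # this index file. Original positions are still intact at this point.
    for k, pos in enumerate(positions):
        new = split_entries[k]
        if not new.name:
            # An empty path name means "same name as the shared entry".
            new = replace(new, name=result[pos].name)
        result[pos] = new

    # Mass deletion after replacement, so positions never shift early.
    kept = [e for i, e in enumerate(result) if i not in delete_bits]

    # Entries after the replaced ones are additions to the final index.
    added = list(split_entries[len(positions):])

    # Final order: entry name (byte-wise), then stage.
    return sorted(kept + added, key=lambda e: (e.name, e.stage))


if __name__ == "__main__":
    shared = [Entry(b"a.txt", 0, "1"), Entry(b"b.txt", 0, "2"),
              Entry(b"c.txt", 0, "3")]
    split = [Entry(b"", 0, "9"), Entry(b"bb.txt", 0, "4")]
    for e in merge_split_index(shared, split, {0}, {2}):
        print(e.name.decode(), e.stage, e.oid)
```

Doing the replacement before the deletion mirrors the note in the delete bitmap description: bit positions in both bitmaps refer to the shared index as stored, so nothing may be removed until every replacement has been applied.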