Commit | Line | Data |
---|---|---|
88e7fdf2 JH |
1 | gitattributes(5) |
2 | ================ | |
3 | ||
4 | NAME | |
5 | ---- | |
1b81d8cb | 6 | gitattributes - Defining attributes per path |
88e7fdf2 JH |
7 | |
8 | SYNOPSIS | |
9 | -------- | |
e5b5c1d2 | 10 | $GIT_DIR/info/attributes, .gitattributes |
88e7fdf2 JH |
11 | |
12 | ||
13 | DESCRIPTION | |
14 | ----------- | |
15 | ||
16 | A `gitattributes` file is a simple text file that gives | |
17 | `attributes` to pathnames. | |
18 | ||
19 | Each line in a `gitattributes` file is of the form: | |
20 | ||
8d75a1d1 | 21 | pattern attr1 attr2 ... |
88e7fdf2 | 22 | |
3f74c8e8 | 23 | That is, a pattern followed by an attributes list, |
860a74d9 NTND |
24 | separated by whitespace. Leading and trailing whitespace is |
25 | ignored. Lines that begin with '#' are ignored. Patterns | |
26 | that begin with a double quote are quoted in C style. | |
27 | When the pattern matches the path in question, the attributes | |
28 | listed on the line are given to the path. | |
88e7fdf2 JH |
29 | |
30 | Each attribute can be in one of these states for a given path: | |
31 | ||
32 | Set:: | |
33 | ||
34 | The path has the attribute with special value "true"; | |
35 | this is specified by listing only the name of the | |
36 | attribute in the attribute list. | |
37 | ||
38 | Unset:: | |
39 | ||
40 | The path has the attribute with special value "false"; | |
41 | this is specified by listing the name of the attribute | |
42 | prefixed with a dash `-` in the attribute list. | |
43 | ||
44 | Set to a value:: | |
45 | ||
46 | The path has the attribute with specified string value; | |
47 | this is specified by listing the name of the attribute | |
48 | followed by an equal sign `=` and its value in the | |
49 | attribute list. | |
50 | ||
51 | Unspecified:: | |
52 | ||
3f74c8e8 | 53 | If no pattern matches the path, and nothing says whether |
b9d14ffb JH |
54 | the path has or does not have the attribute, the |
55 | attribute for the path is said to be Unspecified. | |
88e7fdf2 | 56 | |
3f74c8e8 | 57 | When more than one pattern matches the path, a later line |
b9d14ffb | 58 | overrides an earlier line. This overriding is done per |
b635ed97 JK |
59 | attribute. |
60 | ||
61 | The rules by which the pattern matches paths are the same as in | |
62 | `.gitignore` files (see linkgit:gitignore[5]), with a few exceptions: | |
63 | ||
64 | - negative patterns are forbidden | |
65 | ||
66 | - patterns that match a directory do not recursively match paths | |
67 | inside that directory (so using the trailing-slash `path/` syntax is | |
68 | pointless in an attributes file; use `path/**` instead) | |
88e7fdf2 | 69 | |
2de9b711 | 70 | When deciding what attributes are assigned to a path, Git |
88e7fdf2 JH |
71 | consults the `$GIT_DIR/info/attributes` file (which has the highest |
72 | precedence), the `.gitattributes` file in the same directory as the | |
20ff3ec2 JM |
73 | path in question, and its parent directories up to the toplevel of the |
74 | work tree (the further the directory that contains `.gitattributes` | |
6df42ab9 PO |
75 | is from the path in question, the lower its precedence). Finally |
76 | global and system-wide files are considered (they have the lowest | |
77 | precedence). | |
88e7fdf2 | 78 | |
40701adb NTND |
79 | When the `.gitattributes` file is missing from the work tree, the |
80 | path in the index is used as a fall-back. During the checkout process, | |
81 | `.gitattributes` in the index is used and then the file in the | |
82 | working tree is used as a fall-back. | |
83 | ||
90b22907 | 84 | If you wish to affect only a single repository (i.e., to assign |
6df42ab9 PO |
85 | attributes to files that are particular to |
86 | one user's workflow for that repository), then | |
90b22907 JK |
87 | attributes should be placed in the `$GIT_DIR/info/attributes` file. |
88 | Attributes which should be version-controlled and distributed to other | |
89 | repositories (i.e., attributes of interest to all users) should go into | |
6df42ab9 PO |
90 | `.gitattributes` files. Attributes that should affect all repositories |
91 | for a single user should be placed in a file specified by the | |
da0005b8 | 92 | `core.attributesFile` configuration option (see linkgit:git-config[1]). |
684e40f6 HKNN |
93 | Its default value is $XDG_CONFIG_HOME/git/attributes. If $XDG_CONFIG_HOME |
94 | is either not set or empty, $HOME/.config/git/attributes is used instead. | |
6df42ab9 PO |
95 | Attributes for all users on a system should be placed in the |
96 | `$(prefix)/etc/gitattributes` file. | |
90b22907 | 97 | |
faa4e8ce | 98 | Sometimes you would need to override a setting of an attribute |
0922570c | 99 | for a path to the `Unspecified` state. This can be done by listing |
88e7fdf2 JH |
100 | the name of the attribute prefixed with an exclamation point `!`. |
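The four states and the per-attribute override rule described above can be sketched in a few lines of Python. This is only an illustration, not Git's implementation: the names `parse_attr`, `resolve` and `state_of` are hypothetical, pattern matching is left out entirely (the caller passes only the attribute lists of lines that are already known to match, in file order), and quoting is ignored.

```python
# Illustrative markers for three of the states; a string value is
# represented as the tuple ("value", <string>). These names are assumptions.
SET, UNSET, UNSPECIFIED = "set", "unset", "unspecified"


def parse_attr(token):
    """Turn one attribute token into a (name, state) pair."""
    if token.startswith("-"):          # -attr      => Unset
        return token[1:], UNSET
    if token.startswith("!"):          # !attr      => back to Unspecified
        return token[1:], UNSPECIFIED
    if "=" in token:                   # attr=value => Set to a value
        name, value = token.split("=", 1)
        return name, ("value", value)
    return token, SET                  # attr       => Set


def resolve(matching_lines):
    """Combine the attribute lists of all matching lines, in file order.

    A later line overrides an earlier one, one attribute at a time.
    """
    states = {}
    for attrs in matching_lines:
        for token in attrs.split():
            name, state = parse_attr(token)
            states[name] = state
    return states


def state_of(states, name):
    """An attribute that no matching line mentions is Unspecified."""
    return states.get(name, UNSPECIFIED)
```

For example, `resolve(["text eol=lf diff", "-diff !eol"])` leaves `text` Set, turns `diff` into Unset, and returns `eol` to Unspecified, because the second line overrides only the attributes it names.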
101 | ||
102 | ||
2232a88a JW |
103 | RESERVED BUILTIN_* ATTRIBUTES |
104 | ----------------------------- | |
105 | ||
106 | builtin_* is a reserved namespace for builtin attribute values. Any | |
107 | user defined attributes under this namespace will be ignored and | |
108 | trigger a warning. | |
109 | ||
110 | `builtin_objectmode` | |
111 | ~~~~~~~~~~~~~~~~~~~~ | |
112 | This attribute is for filtering files by their file bit modes (40000, | |
113 | 120000, 160000, 100755, 100644), e.g. ':(attr:builtin_objectmode=160000)'. | |
114 | You may also check these values with `git check-attr builtin_objectmode -- <file>`. | |
115 | If the object is not in the index, `git check-attr --cached` will return unspecified. | |
116 | ||
117 | ||
88e7fdf2 JH |
118 | EFFECTS |
119 | ------- | |
120 | ||
2de9b711 | 121 | Certain operations by Git can be influenced by assigning |
ae7aa499 JH |
122 | particular attributes to a path. Currently, the following |
123 | operations are attributes-aware. | |
88e7fdf2 JH |
124 | |
125 | Checking-out and checking-in | |
126 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
127 | ||
3fed15f5 | 128 | These attributes affect how the contents stored in the |
88e7fdf2 | 129 | repository are copied to the working tree files when commands |
d787d311 NTND |
130 | such as 'git switch', 'git checkout' and 'git merge' run. |
131 | They also affect how | |
2de9b711 | 132 | Git stores the contents you prepare in the working tree in the |
0b444cdb | 133 | repository upon 'git add' and 'git commit'. |
88e7fdf2 | 134 | |
5ec3e670 | 135 | `text` |
3fed15f5 JH |
136 | ^^^^^^ |
137 | ||
6696077a AH |
138 | This attribute marks the path as a text file, which enables end-of-line |
139 | conversion: When a matching file is added to the index, the file's line | |
140 | endings are normalized to LF in the index. Conversely, when the file is | |
141 | copied from the index to the working directory, its line endings may be | |
142 | converted from LF to CRLF depending on the `eol` attribute, the Git | |
143 | config, and the platform (see explanation of `eol` below). | |
3fed15f5 | 144 | |
88e7fdf2 JH |
145 | Set:: |
146 | ||
5ec3e670 | 147 | Setting the `text` attribute on a path enables end-of-line |
6696077a AH |
148 | conversion on checkin and checkout as described above. Line endings |
149 | are normalized to LF in the index every time the file is checked in, | |
150 | even if the file was previously added to Git with CRLF line endings. | |
88e7fdf2 JH |
151 | |
152 | Unset:: | |
153 | ||
2de9b711 | 154 | Unsetting the `text` attribute on a path tells Git not to |
bbb896d8 | 155 | attempt any end-of-line conversion upon checkin or checkout. |
88e7fdf2 | 156 | |
fd6cce9e | 157 | Set to string value "auto":: |
88e7fdf2 | 158 | |
6696077a AH |
159 | When `text` is set to "auto", Git decides by itself whether the file |
160 | is text or binary. If it is text and the file was not already in | |
161 | Git with CRLF endings, line endings are converted on checkin and | |
162 | checkout as described above. Otherwise, no conversion is done on | |
163 | checkin or checkout. | |
88e7fdf2 | 164 | |
88e7fdf2 JH |
165 | Unspecified:: |
166 | ||
2de9b711 | 167 | If the `text` attribute is unspecified, Git uses the |
942e7747 EB |
168 | `core.autocrlf` configuration variable to determine if the |
169 | file should be converted. | |
88e7fdf2 | 170 | |
2de9b711 | 171 | Any other value causes Git to act as if `text` has been left |
fd6cce9e | 172 | unspecified. |
88e7fdf2 | 173 | |
fd6cce9e EB |
174 | `eol` |
175 | ^^^^^ | |
88e7fdf2 | 176 | |
6696077a AH |
177 | This attribute marks a path to use a specific line-ending style in the |
178 | working tree when it is checked out. It has effect only if `text` or | |
179 | `text=auto` is set (see above), but specifying `eol` automatically sets | |
180 | `text` if `text` was left unspecified. | |
88e7fdf2 | 181 | |
fd6cce9e | 182 | Set to string value "crlf":: |
88e7fdf2 | 183 | |
6696077a AH |
184 | This setting converts the file's line endings in the working |
185 | directory to CRLF when the file is checked out. | |
fd6cce9e EB |
186 | |
187 | Set to string value "lf":: | |
188 | ||
6696077a AH |
189 | This setting uses the same line endings in the working directory as |
190 | in the index when the file is checked out. | |
191 | ||
192 | Unspecified:: | |
193 | ||
194 | If the `eol` attribute is unspecified for a file, its line endings | |
195 | in the working directory are determined by the `core.autocrlf` or | |
196 | `core.eol` configuration variable (see the definitions of those | |
197 | options in linkgit:git-config[1]). If `text` is set but neither of | |
198 | those variables is, the default is `eol=crlf` on Windows and | |
199 | `eol=lf` on all other platforms. | |
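The two conversion directions described for `text` and `eol` can be sketched as follows. This is a simplified illustration under stated assumptions, not Git's code: the function names are hypothetical, the file is assumed to be already classified as text, and the `text=auto` special case for files stored with CRLF is not modelled.

```python
def to_index(worktree_bytes: bytes) -> bytes:
    """Check-in direction: normalize CRLF line endings to LF."""
    return worktree_bytes.replace(b"\r\n", b"\n")


def to_worktree(index_bytes: bytes, eol: str) -> bytes:
    """Checkout direction for a text file with the given `eol` value."""
    if eol == "crlf":
        # Normalize first so that an existing CRLF is not turned into CR CR LF.
        return index_bytes.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    # eol=lf: the working tree gets the same line endings as the index.
    return index_bytes
```

With these helpers, `to_worktree(to_index(data), "crlf")` produces CRLF endings throughout, while `to_worktree(to_index(data), "lf")` leaves the normalized content untouched.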
5ec3e670 EB |
200 | |
201 | Backwards compatibility with `crlf` attribute | |
202 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
203 | ||
204 | For backwards compatibility, the `crlf` attribute is interpreted as | |
205 | follows: | |
206 | ||
207 | ------------------------ | |
208 | crlf text | |
209 | -crlf -text | |
210 | crlf=input eol=lf | |
211 | ------------------------ | |
fd6cce9e EB |
212 | |
213 | End-of-line conversion | |
214 | ^^^^^^^^^^^^^^^^^^^^^^ | |
215 | ||
2de9b711 | 216 | While Git normally leaves file contents alone, it can be configured to |
fd6cce9e EB |
217 | normalize line endings to LF in the repository and, optionally, to |
218 | convert them to CRLF when files are checked out. | |
219 | ||
fd6cce9e EB |
220 | If you simply want to have CRLF line endings in your working directory |
221 | regardless of the repository you are working with, you can set the | |
65237284 | 222 | config variable "core.autocrlf" without using any attributes. |
fd6cce9e EB |
223 | |
224 | ------------------------ | |
225 | [core] | |
226 | autocrlf = true | |
227 | ------------------------ | |
228 | ||
e28eae31 | 229 | This does not force normalization of text files, but does ensure |
fd6cce9e EB |
230 | that text files that you introduce to the repository have their line |
231 | endings normalized to LF when they are added, and that files that are | |
942e7747 | 232 | already normalized in the repository stay normalized. |
fd6cce9e | 233 | |
e28eae31 TB |
234 | If you want to ensure that text files that any contributor introduces to |
235 | the repository have their line endings normalized, you can set the | |
236 | `text` attribute to "auto" for _all_ files. | |
88e7fdf2 | 237 | |
fd6cce9e | 238 | ------------------------ |
5ec3e670 | 239 | * text=auto |
fd6cce9e EB |
240 | ------------------------ |
241 | ||
e28eae31 TB |
242 | Attributes allow fine-grained control over how line endings | |
243 | are converted. | |
244 | Here is an example that will make Git normalize .txt, .vcproj and .sh | |
245 | files, ensure that .vcproj files have CRLF and .sh files have LF in | |
246 | the working directory, and prevent .jpg files from being normalized | |
247 | regardless of their content. | |
248 | ||
249 | ------------------------ | |
250 | * text=auto | |
251 | *.txt text | |
252 | *.vcproj text eol=crlf | |
253 | *.sh text eol=lf | |
254 | *.jpg -text | |
255 | ------------------------ | |
256 | ||
257 | NOTE: When `text=auto` conversion is enabled in a cross-platform | |
258 | project using push and pull to a central repository, the text files | |
259 | containing CRLFs should be normalized. | |
fd6cce9e | 260 | |
e28eae31 | 261 | From a clean working directory: |
fd6cce9e EB |
262 | |
263 | ------------------------------------------------- | |
e28eae31 | 264 | $ echo "* text=auto" >.gitattributes |
9472935d | 265 | $ git add --renormalize . |
fd6cce9e | 266 | $ git status # Show files that will be normalized |
fd6cce9e EB |
267 | $ git commit -m "Introduce end-of-line normalization" |
268 | ------------------------------------------------- | |
269 | ||
270 | If any files that should not be normalized show up in 'git status', | |
5ec3e670 | 271 | unset their `text` attribute before running 'git add -u'. |
fd6cce9e EB |
272 | |
273 | ------------------------ | |
5ec3e670 | 274 | manual.pdf -text |
fd6cce9e | 275 | ------------------------ |
88e7fdf2 | 276 | |
2de9b711 | 277 | Conversely, text files that Git does not detect can have normalization |
fd6cce9e | 278 | enabled manually. |
88e7fdf2 | 279 | |
fd6cce9e | 280 | ------------------------ |
5ec3e670 | 281 | weirdchars.txt text |
fd6cce9e | 282 | ------------------------ |
88e7fdf2 | 283 | |
2de9b711 | 284 | If `core.safecrlf` is set to "true" or "warn", Git verifies if |
21e5ad50 | 285 | the conversion is reversible for the current setting of |
2de9b711 TA |
286 | `core.autocrlf`. For "true", Git rejects irreversible |
287 | conversions; for "warn", Git only prints a warning but accepts | |
21e5ad50 SP |
288 | an irreversible conversion. The safety triggers to prevent such |
289 | a conversion from being done to the files in the work tree, but there are a | |
290 | few exceptions. Even though... | |
291 | ||
0b444cdb | 292 | - 'git add' itself does not touch the files in the work tree, the |
21e5ad50 SP |
293 | next checkout would, so the safety triggers; |
294 | ||
0b444cdb | 295 | - 'git apply' to update a text file with a patch does touch the files |
21e5ad50 SP |
296 | in the work tree, but the operation is about text files and CRLF |
297 | conversion is about fixing the line ending inconsistencies, so the | |
298 | safety does not trigger; | |
299 | ||
0b444cdb TR |
300 | - 'git diff' itself does not touch the files in the work tree, it is |
301 | often run to inspect the changes you intend to next 'git add'. To | |
21e5ad50 SP |
302 | catch potential problems early, safety triggers. |
303 | ||
88e7fdf2 | 304 | |
107642fe LS |
305 | `working-tree-encoding` |
306 | ^^^^^^^^^^^^^^^^^^^^^^^ | |
307 | ||
308 | Git recognizes files encoded in ASCII or one of its supersets (e.g. | |
309 | UTF-8, ISO-8859-1, ...) as text files. Files encoded in certain other | |
310 | encodings (e.g. UTF-16) are interpreted as binary and consequently | |
311 | built-in Git text processing tools (e.g. 'git diff') as well as most Git | |
312 | web front ends do not visualize the contents of these files by default. | |
313 | ||
314 | In these cases you can tell Git the encoding of a file in the working | |
315 | directory with the `working-tree-encoding` attribute. If a file with this | |
031fd4b9 | 316 | attribute is added to Git, then Git re-encodes the content from the |
107642fe LS |
317 | specified encoding to UTF-8. Finally, Git stores the UTF-8 encoded |
318 | content in its internal data structure (called "the index"). On checkout | |
031fd4b9 | 319 | the content is re-encoded back to the specified encoding. |
107642fe LS |
320 | |
321 | Please note that using the `working-tree-encoding` attribute may have a | |
322 | number of pitfalls: | |
323 | ||
324 | - Alternative Git implementations (e.g. JGit or libgit2) and older Git | |
325 | versions (as of March 2018) do not support the `working-tree-encoding` | |
326 | attribute. If you decide to use the `working-tree-encoding` attribute | |
327 | in your repository, then it is strongly recommended to ensure that all | |
328 | clients working with the repository support it. | |
ad471949 AH |
329 | + |
330 | For example, Microsoft Visual Studio resources files (`*.rc`) or | |
331 | PowerShell script files (`*.ps1`) are sometimes encoded in UTF-16. | |
332 | If you declare `*.ps1` files as UTF-16 and you add `foo.ps1` with | |
333 | a `working-tree-encoding` enabled Git client, then `foo.ps1` will be | |
334 | stored as UTF-8 internally. A client without `working-tree-encoding` | |
335 | support will check out `foo.ps1` as a UTF-8 encoded file. This will | |
336 | typically cause trouble for the users of this file. | |
337 | + | |
ed31851f AB |
338 | If a Git client that does not support the `working-tree-encoding` |
339 | attribute adds a new file `bar.ps1`, then `bar.ps1` will be | |
ad471949 AH |
340 | stored "as-is" internally (in this example probably as UTF-16). |
341 | A client with `working-tree-encoding` support will interpret the | |
342 | internal contents as UTF-8 and try to convert it to UTF-16 on checkout. | |
343 | That operation will fail and cause an error. | |
107642fe | 344 | |
e92d6225 LS |
345 | - Reencoding content to non-UTF encodings can cause errors as the |
346 | conversion might not be UTF-8 round trip safe. If you suspect your | |
347 | encoding to not be round trip safe, then add it to | |
348 | `core.checkRoundtripEncoding` to make Git check the round trip | |
349 | encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character | |
350 | set) is known to have round trip issues with UTF-8 and is checked by | |
351 | default. | |
352 | ||
107642fe LS |
353 | - Reencoding content requires resources that might slow down certain |
354 | Git operations (e.g. 'git checkout' or 'git add'). | |
355 | ||
356 | Use the `working-tree-encoding` attribute only if you cannot store a file | |
357 | in UTF-8 encoding and if you want Git to be able to process the content | |
358 | as text. | |
359 | ||
360 | As an example, use the following attributes if your '*.ps1' files are | |
361 | UTF-16 encoded with byte order mark (BOM) and you want Git to perform | |
362 | automatic line ending conversion based on your platform. | |
363 | ||
364 | ------------------------ | |
365 | *.ps1 text working-tree-encoding=UTF-16 | |
366 | ------------------------ | |
367 | ||
368 | Use the following attributes if your '*.ps1' files are UTF-16 little | |
369 | endian encoded without BOM and you want Git to use Windows line endings | |
e6e15194 | 370 | in the working directory (use `UTF-16LE-BOM` instead of `UTF-16LE` if |
aab2a1ae TB |
371 | you want UTF-16 little endian with BOM). |
372 | Please note, it is highly recommended to | |
107642fe LS |
373 | explicitly define the line endings with `eol` if the `working-tree-encoding` |
374 | attribute is used to avoid ambiguity. | |
375 | ||
376 | ------------------------ | |
377 | *.ps1 text working-tree-encoding=UTF-16LE eol=crlf | |
378 | ------------------------ | |
379 | ||
380 | You can get a list of all available encodings on your platform with the | |
381 | following command: | |
382 | ||
383 | ------------------------ | |
384 | iconv --list | |
385 | ------------------------ | |
386 | ||
387 | If you do not know the encoding of a file, then you can use the `file` | |
388 | command to guess the encoding: | |
389 | ||
390 | ------------------------ | |
391 | file foo.ps1 | |
392 | ------------------------ | |
393 | ||
394 | ||
3fed15f5 JH |
395 | `ident` |
396 | ^^^^^^^ | |
397 | ||
2de9b711 | 398 | When the attribute `ident` is set for a path, Git replaces |
2c850f12 | 399 | `$Id$` in the blob object with `$Id:`, followed by the |
3fed15f5 JH |
400 | 40-character hexadecimal blob object name, followed by a dollar |
401 | sign `$` upon checkout. Any byte sequence that begins with | |
af9b54bb AP |
402 | `$Id:` and ends with `$` in the worktree file is replaced |
403 | with `$Id$` upon check-in. | |
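The `ident` substitution can be sketched in Python as below. It is a minimal sketch, assuming a SHA-1 repository; the function names are hypothetical, the exact spacing inside the expanded keyword (`$Id: <name> $`) is an assumption, and the pattern is deliberately simple (any run of bytes without `$` or a newline between `$Id:` and the closing `$`).

```python
import hashlib
import re

# Matches the collapsed keyword "$Id$" as well as an expanded "$Id: ... $".
_ID_RE = re.compile(rb"\$Id(?::[^$\n]*)?\$")


def collapse_ident(worktree_bytes: bytes) -> bytes:
    """Check-in direction: replace every `$Id: ...$` with `$Id$`."""
    return _ID_RE.sub(b"$Id$", worktree_bytes)


def blob_name(blob_bytes: bytes) -> str:
    """SHA-1 blob object name: hash of the header 'blob <size>' + NUL + content."""
    header = b"blob %d\x00" % len(blob_bytes)
    return hashlib.sha1(header + blob_bytes).hexdigest()


def expand_ident(blob_bytes: bytes) -> bytes:
    """Checkout direction: replace `$Id$` with `$Id: <blob name> $`."""
    name = blob_name(blob_bytes).encode("ascii")
    # The space before the closing "$" is an assumption about the output layout.
    return blob_bytes.replace(b"$Id$", b"$Id: " + name + b" $")
```

Because the object name is computed from the blob that still contains the collapsed `$Id$`, `collapse_ident(expand_ident(blob))` returns the original blob, so a checkout followed by a check-in does not change the stored content.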
3fed15f5 JH |
404 | |
405 | ||
aa4ed402 JH |
406 | `filter` |
407 | ^^^^^^^^ | |
408 | ||
c05ef938 | 409 | A `filter` attribute can be set to a string value that names a |
aa4ed402 JH |
410 | filter driver specified in the configuration. |
411 | ||
c05ef938 | 412 | A filter driver consists of a `clean` command and a `smudge` |
aa4ed402 | 413 | command, either of which can be left unspecified. Upon |
c05ef938 WC |
414 | checkout, when the `smudge` command is specified, the command is |
415 | fed the blob object from its standard input, and its standard | |
416 | output is used to update the worktree file. Similarly, the | |
417 | `clean` command is used to convert the contents of worktree file | |
edcc8581 LS |
418 | upon checkin. By default these commands process only a single |
419 | blob and terminate. If a long running `process` filter is used | |
420 | in place of `clean` and/or `smudge` filters, then Git can process | |
421 | all blobs with a single filter command invocation for the entire | |
422 | life of a single Git command, for example `git add --all`. If a | |
423 | long running `process` filter is configured then it always takes | |
424 | precedence over a configured single blob filter. See section | |
425 | below for the description of the protocol used to communicate with | |
426 | a `process` filter. | |
aa4ed402 | 427 | |
36daaaca JB |
428 | One use of the content filtering is to massage the content into a shape |
429 | that is more convenient for the platform, filesystem, and the user to use. | |
430 | For this mode of operation, the key phrase here is "more convenient" and | |
431 | not "turning something unusable into usable". In other words, the intent | |
432 | is that if someone unsets the filter driver definition, or does not have | |
433 | the appropriate filter program, the project should still be usable. | |
434 | ||
435 | Another use of the content filtering is to store the content that cannot | |
436 | be directly used in the repository (e.g. a UUID that refers to the true | |
2de9b711 | 437 | content stored outside Git, or encrypted content) and turn it into a |
36daaaca JB |
438 | usable form upon checkout (e.g. download the external content, or decrypt |
439 | the encrypted content). | |
440 | ||
441 | These two filters behave differently, and by default, a filter is taken as | |
442 | the former, massaging the contents into more convenient shape. A missing | |
443 | filter driver definition in the config, or a filter driver that exits with | |
444 | a non-zero status, is not an error but makes the filter a no-op passthru. | |
445 | ||
446 | You can declare that a filter turns content that by itself is unusable | |
447 | into usable content by setting the filter.<driver>.required configuration | |
448 | variable to `true`. | |
aa4ed402 | 449 | |
9472935d TB |
450 | Note: Whenever the clean filter is changed, the repo should be renormalized: |
451 | $ git add --renormalize . | |
452 | ||
d79f5d17 NS |
453 | For example, in .gitattributes, you would assign the `filter` |
454 | attribute for paths. | |
455 | ||
456 | ------------------------ | |
457 | *.c filter=indent | |
458 | ------------------------ | |
459 | ||
460 | Then you would define a "filter.indent.clean" and "filter.indent.smudge" | |
461 | configuration in your .git/config to specify a pair of commands to | |
462 | modify the contents of C programs when the source files are checked | |
463 | in ("clean" is run) and checked out (no change is made because the | |
464 | command is "cat"). | |
465 | ||
466 | ------------------------ | |
467 | [filter "indent"] | |
468 | clean = indent | |
469 | smudge = cat | |
470 | ------------------------ | |
471 | ||
f217f0e8 EB |
472 | For best results, `clean` should not alter its output further if it is |
473 | run twice ("clean->clean" should be equivalent to "clean"), and | |
474 | multiple `smudge` commands should not alter `clean`'s output | |
475 | ("smudge->smudge->clean" should be equivalent to "clean"). See the | |
476 | section on merging below. | |
477 | ||
478 | The "indent" filter is well-behaved in this regard: it will not modify | |
479 | input that is already correctly indented. In this case, the lack of a | |
480 | smudge filter means that the clean filter _must_ accept its own output | |
481 | without modifying it. | |
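The two properties stated above ("clean->clean" equals "clean", and "smudge->smudge->clean" equals "clean") can be checked mechanically for a candidate filter pair. The sketch below uses a toy `clean` that strips trailing whitespace and a pass-through `smudge`; both functions and the helper `is_well_behaved` are hypothetical stand-ins, not the `indent` program.

```python
def clean(data: bytes) -> bytes:
    """Toy clean filter: strip trailing whitespace from every line."""
    return b"".join(line.rstrip() + b"\n" for line in data.splitlines())


def smudge(data: bytes) -> bytes:
    """Toy smudge filter: pass the content through, like `cat`."""
    return data


def is_well_behaved(clean_fn, smudge_fn, sample: bytes) -> bool:
    """Check both properties on one sample input."""
    once = clean_fn(sample)
    idempotent = clean_fn(once) == once
    smudge_safe = clean_fn(smudge_fn(smudge_fn(sample))) == once
    return idempotent and smudge_safe
```

A check like this only covers the samples it is given, so it can reveal a badly behaved filter but cannot prove that a filter is well-behaved for every input.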
482 | ||
36daaaca JB |
483 | If a filter _must_ succeed in order to make the stored contents usable, |
484 | you can declare that the filter is `required`, in the configuration: | |
485 | ||
486 | ------------------------ | |
487 | [filter "crypt"] | |
488 | clean = openssl enc ... | |
489 | smudge = openssl enc -d ... | |
490 | required | |
491 | ------------------------ | |
492 | ||
a2b665de PW |
493 | The sequence "%f" on the filter command line is replaced with the name of |
494 | the file the filter is working on. A filter might use this in keyword | |
495 | substitution. For example: | |
496 | ||
497 | ------------------------ | |
498 | [filter "p4"] | |
499 | clean = git-p4-filter --clean %f | |
500 | smudge = git-p4-filter --smudge %f | |
501 | ------------------------ | |
502 | ||
52db4b04 JH |
503 | Note that "%f" is the name of the path that is being worked on. Depending |
504 | on the version that is being filtered, the corresponding file on disk may | |
505 | not exist, or may have different contents. So, smudge and clean commands | |
506 | should not try to access the file on disk, but only act as filters on the | |
507 | content provided to them on standard input. | |
aa4ed402 | 508 | |
edcc8581 LS |
509 | Long Running Filter Process |
510 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
511 | ||
512 | If the filter command (a string value) is defined via | |
513 | `filter.<driver>.process` then Git can process all blobs with a | |
514 | single filter invocation for the entire life of a single Git | |
addad105 JT |
515 | command. This is achieved by using the long-running process protocol |
516 | (described in technical/long-running-process-protocol.txt). | |
517 | ||
518 | When Git encounters the first file that needs to be cleaned or smudged, | |
519 | it starts the filter and performs the handshake. In the handshake, the | |
520 | welcome message sent by Git is "git-filter-client", only version 2 is | |
031fd4b9 | 521 | supported, and the supported capabilities are "clean", "smudge", and |
addad105 | 522 | "delay". |
edcc8581 LS |
523 | |
524 | Afterwards Git sends a list of "key=value" pairs terminated with | |
525 | a flush packet. The list will contain at least the filter command | |
526 | (based on the supported capabilities) and the pathname of the file | |
527 | to filter relative to the repository root. Right after the flush packet | |
528 | Git sends the content split in zero or more pkt-line packets and a | |
529 | flush packet to terminate content. Please note that the filter | |
530 | must not send any response before it has received the content and the | |
c6b0831c LS |
531 | final flush packet. Also note that the "value" of a "key=value" pair |
532 | can contain the "=" character whereas the key would never contain | |
533 | that character. | |
edcc8581 LS |
534 | ------------------------ |
535 | packet: git> command=smudge | |
536 | packet: git> pathname=path/testfile.dat | |
537 | packet: git> 0000 | |
538 | packet: git> CONTENT | |
539 | packet: git> 0000 | |
540 | ------------------------ | |
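The request shown above can be framed in pkt-line format roughly as follows. This is a sketch under stated assumptions rather than a reference implementation: each packet is assumed to start with four hexadecimal digits giving its total length (header included), a flush packet is the literal `0000`, "key=value" lines are assumed to end with a newline, and the 65516-byte payload limit is an assumption taken from the pkt-line format. The names `pkt_line` and `smudge_request` are hypothetical.

```python
MAX_PAYLOAD = 65516   # assumed maximum payload size of one pkt-line packet
FLUSH = b"0000"       # flush packet: terminates a list or the content


def pkt_line(payload: bytes) -> bytes:
    """Frame one packet: 4 hex digits of total length, then the payload."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for a single packet")
    return b"%04x" % (len(payload) + 4) + payload


def smudge_request(pathname: str, content: bytes) -> bytes:
    """Build the byte stream Git would send for one smudge request."""
    out = [
        pkt_line(b"command=smudge\n"),
        pkt_line(b"pathname=" + pathname.encode("utf-8") + b"\n"),
        FLUSH,                                   # end of the key=value list
    ]
    # Content is split into zero or more packets, then terminated by a flush.
    for start in range(0, len(content), MAX_PAYLOAD):
        out.append(pkt_line(content[start:start + MAX_PAYLOAD]))
    out.append(FLUSH)
    return b"".join(out)
```

Under these assumptions, `pkt_line(b"command=smudge\n")` yields `b"0013command=smudge\n"`: the payload is 15 bytes, and the 4-byte header brings the total to 19, which is `0013` in hexadecimal.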
541 | ||
542 | The filter is expected to respond with a list of "key=value" pairs | |
543 | terminated with a flush packet. If the filter does not experience | |
544 | problems then the list must contain a "success" status. Right after | |
545 | these packets the filter is expected to send the content in zero | |
546 | or more pkt-line packets and a flush packet at the end. Finally, a | |
547 | second list of "key=value" pairs terminated with a flush packet | |
548 | is expected. The filter can change the status in the second list | |
549 | or keep the status as is with an empty list. Please note that the | |
550 | empty list must be terminated with a flush packet regardless. | |
551 | ||
552 | ------------------------ | |
553 | packet: git< status=success | |
554 | packet: git< 0000 | |
555 | packet: git< SMUDGED_CONTENT | |
556 | packet: git< 0000 | |
557 | packet: git< 0000 # empty list, keep "status=success" unchanged! | |
558 | ------------------------ | |
559 | ||
560 | If the result content is empty then the filter is expected to respond | |
561 | with a "success" status and a flush packet to signal the empty content. | |
562 | ------------------------ | |
563 | packet: git< status=success | |
564 | packet: git< 0000 | |
565 | packet: git< 0000 # empty content! | |
566 | packet: git< 0000 # empty list, keep "status=success" unchanged! | |
567 | ------------------------ | |
568 | ||
569 | In case the filter cannot or does not want to process the content, | |
570 | it is expected to respond with an "error" status. | |
571 | ------------------------ | |
572 | packet: git< status=error | |
573 | packet: git< 0000 | |
574 | ------------------------ | |
575 | ||
576 | If the filter experiences an error during processing, then it can | |
577 | send the status "error" after the content was (partially or | |
578 | completely) sent. | |
579 | ------------------------ | |
580 | packet: git< status=success | |
581 | packet: git< 0000 | |
582 | packet: git< HALF_WRITTEN_ERRONEOUS_CONTENT | |
583 | packet: git< 0000 | |
584 | packet: git< status=error | |
585 | packet: git< 0000 | |
586 | ------------------------ | |
587 | ||
588 | In case the filter cannot or does not want to process the content | |
589 | as well as any future content for the lifetime of the Git process, | |
590 | then it is expected to respond with an "abort" status at any point | |
591 | in the protocol. | |
592 | ------------------------ | |
593 | packet: git< status=abort | |
594 | packet: git< 0000 | |
595 | ------------------------ | |
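Reading a filter response in the layouts shown above could look like the following sketch. It assumes the same pkt-line framing (four hexadecimal length digits, `0000` as flush packet, newline-terminated "key=value" lines); the function names are hypothetical, and special packet lengths other than the flush packet are simply rejected.

```python
import io


def read_packets(stream):
    """Yield packet payloads until a flush packet ("0000") arrives."""
    while True:
        header = stream.read(4)
        if len(header) != 4:
            raise EOFError("stream ended inside a packet header")
        length = int(header, 16)
        if length == 0:                 # flush packet ends the section
            return
        if length < 4:
            raise ValueError("unsupported packet length %d" % length)
        yield stream.read(length - 4)


def read_kv_list(stream):
    """Read "key=value" packets up to the next flush packet."""
    pairs = {}
    for payload in read_packets(stream):
        # Split on the first "=" only: a value may itself contain "=".
        key, _, value = payload.decode("utf-8").rstrip("\n").partition("=")
        pairs[key] = value
    return pairs


def read_response(stream):
    """Return (final_status, content) for one filter response."""
    status = read_kv_list(stream).get("status")
    if status != "success":
        return status, b""              # "error" or "abort": no content follows
    content = b"".join(read_packets(stream))
    trailer = read_kv_list(stream)      # an empty list keeps the earlier status
    return trailer.get("status", status), content


if __name__ == "__main__":
    wire = b"0013status=success\n0000" + b"0009hello0000" + b"0000"
    print(read_response(io.BytesIO(wire)))
```

A caller would treat any final status other than "success" as a failed conversion for that path, and would additionally stop sending further requests to the filter after an "abort".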
596 | ||
597 | Git neither stops nor restarts the filter process in case the | |
598 | "error"/"abort" status is set. However, Git sets its exit code | |
599 | according to the `filter.<driver>.required` flag, mimicking the | |
600 | behavior of the `filter.<driver>.clean` / `filter.<driver>.smudge` | |
601 | mechanism. | |
602 | ||
603 | If the filter dies during the communication or does not adhere to | |
604 | the protocol then Git will stop the filter process and restart it | |
605 | with the next file that needs to be processed. Depending on the | |
606 | `filter.<driver>.required` flag Git will interpret that as error. | |
607 | ||
2841e8f8 LS |
608 | Delay |
609 | ^^^^^ | |
610 | ||
611 | If the filter supports the "delay" capability, then Git can send the | |
612 | flag "can-delay" after the filter command and pathname. This flag | |
613 | denotes that the filter can delay filtering the current blob (e.g. to | |
614 | compensate network latencies) by responding with no content but with | |
615 | the status "delayed" and a flush packet. | |
616 | ------------------------ | |
617 | packet: git> command=smudge | |
618 | packet: git> pathname=path/testfile.dat | |
619 | packet: git> can-delay=1 | |
620 | packet: git> 0000 | |
621 | packet: git> CONTENT | |
622 | packet: git> 0000 | |
623 | packet: git< status=delayed | |
624 | packet: git< 0000 | |
625 | ------------------------ | |
626 | ||
627 | If the filter supports the "delay" capability then it must support the | |
628 | "list_available_blobs" command. If Git sends this command, then the | |
629 | filter is expected to return a list of pathnames representing blobs | |
630 | that have been delayed earlier and are now available. | |
631 | The list must be terminated with a flush packet followed | |
632 | by a "success" status that is also terminated with a flush packet. If | |
633 | no blobs for the delayed paths are available yet, then the filter is | |
634 | expected to block the response until at least one blob becomes | |
635 | available. The filter can tell Git that it has no more delayed blobs | |
636 | by sending an empty list. As soon as the filter responds with an empty | |
637 | list, Git stops asking. All blobs that Git has not received at this | |
638 | point are considered missing and will result in an error. | |
639 | ||
640 | ------------------------ | |
641 | packet: git> command=list_available_blobs | |
642 | packet: git> 0000 | |
643 | packet: git< pathname=path/testfile.dat | |
644 | packet: git< pathname=path/otherfile.dat | |
645 | packet: git< 0000 | |
646 | packet: git< status=success | |
647 | packet: git< 0000 | |
648 | ------------------------ | |
649 | ||
650 | After Git has received the pathnames, it will request the corresponding | |
651 | blobs again. These requests contain a pathname and an empty content | |
652 | section. The filter is expected to respond with the smudged content | |
653 | in the usual way as explained above. | |
654 | ------------------------ | |
655 | packet: git> command=smudge | |
656 | packet: git> pathname=path/testfile.dat | |
657 | packet: git> 0000 | |
658 | packet: git> 0000 # empty content! | |
659 | packet: git< status=success | |
660 | packet: git< 0000 | |
661 | packet: git< SMUDGED_CONTENT | |
662 | packet: git< 0000 | |
663 | packet: git< 0000 # empty list, keep "status=success" unchanged! | |
664 | ------------------------ | |
665 | ||
666 | Example | |
667 | ^^^^^^^ | |
668 | ||
0f71fa27 LS |
669 | A long running filter demo implementation can be found in |
670 | `contrib/long-running-filter/example.pl` located in the Git | |
671 | core repository. If you develop your own long running filter | |
edcc8581 LS |
672 | process then the `GIT_TRACE_PACKET` environment variable can be |
673 | very helpful for debugging (see linkgit:git[1]). | |
674 | ||
675 | Please note that you cannot use an existing `filter.<driver>.clean` | |
676 | or `filter.<driver>.smudge` command with `filter.<driver>.process` | |
677 | because the former two use a different inter process communication | |
678 | protocol than the latter one. | |
679 | ||
680 | ||
aa4ed402 JH |
681 | Interaction between checkin/checkout attributes |
682 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
683 | ||
684 | In the check-in codepath, the worktree file is first converted | |
685 | with `filter` driver (if specified and corresponding driver | |
686 | defined), then the result is processed with `ident` (if | |
5ec3e670 | 687 | specified), and then finally with `text` (again, if specified |
aa4ed402 JH |
688 | and applicable). |
689 | ||
690 | In the check-out codepath, the blob content is first converted | |
5ec3e670 | 691 | with `text`, and then `ident` and fed to `filter`. |
aa4ed402 JH |
692 | |
693 | ||
f217f0e8 EB |
694 | Merging branches with differing checkin/checkout attributes |
695 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
696 | ||
697 | If you have added attributes to a file that cause the canonical | |
698 | repository format for that file to change, such as adding a | |
699 | clean/smudge filter or text/eol/ident attributes, merging anything | |
700 | where the attribute is not in place would normally cause merge | |
701 | conflicts. | |
702 | ||
2de9b711 | 703 | To prevent these unnecessary merge conflicts, Git can be told to run a |
f217f0e8 EB |
704 | virtual check-out and check-in of all three stages of a file when |
705 | resolving a three-way merge by setting the `merge.renormalize` | |
706 | configuration variable. This prevents changes caused by check-in | |
707 | conversion from causing spurious merge conflicts when a converted file | |
708 | is merged with an unconverted file. | |
709 | ||
710 | As long as a "smudge->clean" results in the same output as a "clean" | |
711 | even on files that are already smudged, this strategy will | |
712 | automatically resolve all filter-related conflicts. Filters that do | |
713 | not act in this way may cause additional merge conflicts that must be | |
714 | resolved manually. | |
715 | ||
716 | ||
88e7fdf2 JH |
717 | Generating diff text |
718 | ~~~~~~~~~~~~~~~~~~~~ | |
719 | ||
4f73e240 JN |
720 | `diff` |
721 | ^^^^^^ | |
722 | ||
2de9b711 TA |
723 | The attribute `diff` affects how Git generates diffs for particular |
724 | files. It can tell Git whether to generate a textual patch for the path | |
678852d9 | 725 | or to treat the path as a binary file. It can also affect what line is |
2de9b711 TA |
726 | shown on the hunk header `@@ -k,l +n,m @@` line, tell Git to use an |
727 | external command to generate the diff, or ask Git to convert binary | |
678852d9 | 728 | files to a text format before generating the diff. |
88e7fdf2 JH |
729 | |
730 | Set:: | |
731 | ||
732 | A path to which the `diff` attribute is set is treated | |
733 | as text, even when it contains byte values that | |
734 | normally never appear in text files, such as NUL. | |
735 | ||
736 | Unset:: | |
737 | ||
738 | A path to which the `diff` attribute is unset will | |
678852d9 JK |
739 | generate `Binary files differ` (or a binary patch, if |
740 | binary patches are enabled). | |
88e7fdf2 JH |
741 | |
742 | Unspecified:: | |
743 | ||
744 | A path to which the `diff` attribute is unspecified | |
745 | first gets its contents inspected, and if it looks like | |
6bf3b813 NTND |
746 | text and is smaller than core.bigFileThreshold, it is treated |
747 | as text. Otherwise it would generate `Binary files differ`. | |
88e7fdf2 | 748 | |
2cc3167c JH |
749 | String:: |
750 | ||
678852d9 JK |
751 | Diff is shown using the specified diff driver. Each driver may |
752 | specify one or more options, as described in the following | |
753 | section. The options for the diff driver "foo" are defined | |
754 | by the configuration variables in the "diff.foo" section of the | |
2de9b711 | 755 | Git config file. |
2cc3167c JH |
756 | |
757 | ||
678852d9 JK |
758 | Defining an external diff driver |
759 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
2cc3167c JH |
760 | |
761 | The definition of a diff driver is done in `gitconfig`, not | |
762 | `gitattributes` file, so strictly speaking this manual page is a | |
763 | wrong place to talk about it. However... | |
764 | ||
678852d9 | 765 | To define an external diff driver `jcdiff`, add a section to your |
2cc3167c JH |
766 | `$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this: |
767 | ||
768 | ---------------------------------------------------------------- | |
769 | [diff "jcdiff"] | |
770 | command = j-c-diff | |
771 | ---------------------------------------------------------------- | |
772 | ||
2de9b711 | 773 | When Git needs to show you a diff for the path with `diff` |
2cc3167c JH |
774 | attribute set to `jcdiff`, it calls the command you specified |
775 | with the above configuration, i.e. `j-c-diff`, with 7 | |
776 | parameters, just like `GIT_EXTERNAL_DIFF` program is called. | |
9e1f0a85 | 777 | See linkgit:git[1] for details. |
88e7fdf2 | 778 | |
a4cf900e JC |
779 | Setting the internal diff algorithm |
780 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
781 | ||
782 | The diff algorithm can be set through the `diff.algorithm` config key, but | |
783 | sometimes it may be helpful to set the diff algorithm per path. For example, | |
784 | one may want to use the `minimal` diff algorithm for .json files, and the | |
785 | `histogram` for .c files, and so on without having to pass in the algorithm | |
786 | through the command line each time. | |
787 | ||
788 | First, in `.gitattributes`, assign the `diff` attribute for paths. | |
789 | ||
790 | ------------------------ | |
791 | *.json diff=<name> | |
792 | ------------------------ | |
793 | ||
794 | Then, define a "diff.<name>.algorithm" configuration to specify the diff | |
795 | algorithm, choosing from `myers`, `patience`, `minimal`, or `histogram`. | |
796 | ||
797 | ---------------------------------------------------------------- | |
798 | [diff "<name>"] | |
799 | algorithm = histogram | |
800 | ---------------------------------------------------------------- | |
801 | ||
802 | This diff algorithm applies to user facing diff output like git-diff(1), | |
803 | git-show(1) and is used for the `--stat` output as well. The merge machinery | |
804 | will not use the diff algorithm set through this method. | |
805 | ||
806 | NOTE: If `diff.<name>.command` is defined for a path with the | |
807 | `diff=<name>` attribute, it is executed as an external diff driver | |
808 | (see above), and adding `diff.<name>.algorithm` has no effect, as the | |
809 | algorithm is not passed to the external diff driver. | |
88e7fdf2 | 810 | |
ae7aa499 JH |
811 | Defining a custom hunk-header |
812 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
813 | ||
c882c01e | 814 | Each group of changes (called a "hunk") in the textual diff output |
ae7aa499 JH |
815 | is prefixed with a line of the form: |
816 | ||
817 | @@ -k,l +n,m @@ TEXT | |
818 | ||
c882c01e GD |
819 | This is called a 'hunk header'. The "TEXT" portion is by default a line |
820 | that begins with a letter, an underscore or a dollar sign; this | |
821 | matches what GNU 'diff -p' output uses. This default selection however | |
822 | is not suited for some contents, and you can use a customized pattern | |
823 | to make a selection. | |
ae7aa499 | 824 | |
c882c01e | 825 | First, in .gitattributes, you would assign the `diff` attribute |
ae7aa499 JH |
826 | for paths. |
827 | ||
828 | ------------------------ | |
829 | *.tex diff=tex | |
830 | ------------------------ | |
831 | ||
edb7e82f | 832 | Then, you would define a "diff.tex.xfuncname" configuration to |
ae7aa499 | 833 | specify a regular expression that matches a line that you would |
c4c86d23 JK |
834 | want to appear as the hunk header "TEXT". Add a section to your |
835 | `$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this: | |
ae7aa499 JH |
836 | |
837 | ------------------------ | |
838 | [diff "tex"] | |
45d9414f | 839 | xfuncname = "^(\\\\(sub)*section\\{.*)$" |
ae7aa499 JH |
840 | ------------------------ |
841 | ||
842 | Note. A single level of backslashes is eaten by the | |
843 | configuration file parser, so you would need to double the | |
844 | backslashes; the pattern above picks a line that begins with a | |
02783075 | 845 | backslash, and zero or more occurrences of `sub` followed by |
ae7aa499 JH |
846 | `section` followed by open brace, to the end of line. |
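
To illustrate the backslash doubling, here is a small sketch that strips one level of backslashes from the configured value and then tries the resulting pattern on a few lines. It uses Python's `re` module as a stand-in for the regular expression engine Git uses, which is an assumption that happens to hold for this particular pattern. The helper names are hypothetical, and the unescaping step is deliberately simplified to handle only doubled backslashes.

```python
import re

# The value exactly as written between the double quotes in the config file.
CONFIG_VALUE = r"^(\\\\(sub)*section\\{.*)$"


def unescape_config(value: str) -> str:
    """Simplified config unescaping: collapse each doubled backslash."""
    return value.replace("\\\\", "\\")


# What the regular expression engine sees after the config parser is done.
REGEX = unescape_config(CONFIG_VALUE)
PATTERN = re.compile(REGEX)


def hunk_header_text(line: str):
    """Return the text that would be used as "TEXT", or None if no match."""
    match = PATTERN.search(line)
    return match.group(1) if match else None
```

With this sketch, `hunk_header_text(r"\subsection{Results}")` returns the whole line, while a line that does not start with a backslash returns `None`.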
847 | ||
848 | There are a few built-in patterns to make this easier, and `tex` | |
849 | is one of them, so you do not have to write the above in your | |
850 | configuration file (you still need to enable this with the | |
d08ed6d6 GH |
851 | attribute mechanism, via `.gitattributes`). The following built in |
852 | patterns are available: | |
853 | ||
e90d065e AJ |
854 | - `ada` suitable for source code in the Ada language. |
855 | ||
2ff6c346 VE |
856 | - `bash` suitable for source code in the Bourne-Again SHell language. |
857 | Covers a superset of POSIX shell function definitions. | |
858 | ||
23b5beb2 GH |
859 | - `bibtex` suitable for files with BibTeX coded references. |
860 | ||
80c49c3d TR |
861 | - `cpp` suitable for source code in the C and C++ languages. |
862 | ||
b221207d PO |
863 | - `csharp` suitable for source code in the C# language. |
864 | ||
0719f3ee WD |
865 | - `css` suitable for cascading style sheets. |
866 | ||
3c81760b SB |
867 | - `dts` suitable for devicetree (DTS) files. |
868 | ||
a807200f ŁN |
869 | - `elixir` suitable for source code in the Elixir language. |
870 | ||
909a5494 BC |
871 | - `fortran` suitable for source code in the Fortran language. |
872 | ||
69f9c87d ZB |
873 | - `fountain` suitable for Fountain documents. |
874 | ||
1dbf0c0a AG |
875 | - `golang` suitable for source code in the Go language. |
876 | ||
af9ce1ff AE |
877 | - `html` suitable for HTML/XHTML documents. |
878 | ||
b66e00f1 | 879 | - `java` suitable for source code in the Java language. |
d08ed6d6 | 880 | |
09188ed9 JD |
881 | - `kotlin` suitable for source code in the Kotlin language. |
882 | ||
09dad925 AH |
883 | - `markdown` suitable for Markdown documents. |
884 | ||
2731a784 | 885 | - `matlab` suitable for source code in the MATLAB and Octave languages. |
53b10a14 | 886 | |
5d1e958e JS |
887 | - `objc` suitable for source code in the Objective-C language. |
888 | ||
d08ed6d6 GH |
889 | - `pascal` suitable for source code in the Pascal/Delphi language. |
890 | ||
71a5d4bc JN |
891 | - `perl` suitable for source code in the Perl language. |
892 | ||
af9ce1ff AE |
893 | - `php` suitable for source code in the PHP language. |
894 | ||
7c17205b KS |
895 | - `python` suitable for source code in the Python language. |
896 | ||
d08ed6d6 GH |
897 | - `ruby` suitable for source code in the Ruby language. |
898 | ||
d74e7860 MAL |
899 | - `rust` suitable for source code in the Rust language. |
900 | ||
a4373903 AR |
901 | - `scheme` suitable for source code in the Scheme language. |
902 | ||
d08ed6d6 | 903 | - `tex` suitable for source code for LaTeX documents. |
ae7aa499 JH |
904 | |
905 | ||
80c49c3d TR |
906 | Customizing word diff |
907 | ^^^^^^^^^^^^^^^^^^^^^ | |
908 | ||
882749a0 | 909 | You can customize the rules that `git diff --word-diff` uses to |
80c49c3d | 910 | split words in a line, by specifying an appropriate regular expression |
ae3b970a | 911 | in the "diff.*.wordRegex" configuration variable. For example, in TeX |
80c49c3d TR |
912 | a backslash followed by a sequence of letters forms a command, but |
913 | several such commands can be run together without intervening | |
c4c86d23 JK |
914 | whitespace. To separate them, use a regular expression in your |
915 | `$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this: | |
80c49c3d TR |
916 | |
917 | ------------------------ | |
918 | [diff "tex"] | |
ae3b970a | 919 | wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+" |
80c49c3d TR |
920 | ------------------------ |
921 | ||
922 | A built-in pattern is provided for all languages listed in the | |
923 | previous section. | |
924 | ||
925 | ||
678852d9 JK |
926 | Performing text diffs of binary files |
927 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
928 | ||
929 | Sometimes it is desirable to see the diff of a text-converted | |
930 | version of some binary files. For example, a word processor | |
931 | document can be converted to an ASCII text representation, and | |
932 | the diff of the text shown. Even though this conversion loses | |
933 | some information, the resulting diff is useful for human | |
934 | viewing (but cannot be applied directly). | |
935 | ||
936 | The `textconv` config option is used to define a program for | |
937 | performing such a conversion. The program should take a single | |
938 | argument, the name of a file to convert, and produce the | |
939 | resulting text on stdout. | |
940 | ||
941 | For example, to show the diff of the exif information of a | |
942 | file instead of the binary information (assuming you have the | |
c4c86d23 JK |
943 | exif tool installed), add the following section to your |
944 | `$GIT_DIR/config` file (or `$HOME/.gitconfig` file): | |
678852d9 JK |
945 | |
946 | ------------------------ | |
947 | [diff "jpg"] | |
948 | textconv = exif | |
949 | ------------------------ | |
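
Any program that follows the contract described above can serve as a textconv filter. As a minimal sketch, the hypothetical script below takes the file name as its single argument and writes a size line followed by a hex dump to stdout. The output format is purely illustrative and not something Git prescribes.

```python
import sys


def textconv(path: str, width: int = 16) -> str:
    """Convert a binary file to a line-oriented text representation."""
    with open(path, "rb") as f:
        data = f.read()
    lines = [f"size: {len(data)} bytes"]
    # One line per `width` bytes, so that small changes stay local in a diff.
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        lines.append(f"{offset:08x}  {chunk.hex(' ')}")
    return "\n".join(lines) + "\n"


# Git calls the program with exactly one argument: the file to convert.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.write(textconv(sys.argv[1]))
```

Such a script would be named in the `textconv` variable of a diff driver in the same way as `exif` above.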
950 | ||
951 | NOTE: The text conversion is generally a one-way conversion; | |
952 | in this example, we lose the actual image contents and focus | |
953 | just on the text data. This means that diffs generated by | |
954 | textconv are _not_ suitable for applying. For this reason, | |
955 | only `git diff` and the `git log` family of commands (i.e., | |
956 | log, whatchanged, show) will perform text conversion. `git | |
957 | format-patch` will never generate this output. If you want to | |
958 | send somebody a text-converted diff of a binary file (e.g., | |
959 | because it quickly conveys the changes you have made), you | |
960 | should generate it separately and send it as a comment _in | |
961 | addition to_ the usual binary diff that you might send. | |
962 | ||
d9bae1a1 | 963 | Because text conversion can be slow, especially when doing a |
2de9b711 | 964 | large number of them with `git log -p`, Git provides a mechanism |
d9bae1a1 JK |
965 | to cache the output and use it in future diffs. To enable |
966 | caching, set the "cachetextconv" variable in your diff driver's | |
967 | config. For example: | |
968 | ||
969 | ------------------------ | |
970 | [diff "jpg"] | |
971 | textconv = exif | |
972 | cachetextconv = true | |
973 | ------------------------ | |
974 | ||
975 | This will cache the result of running "exif" on each blob | |
976 | indefinitely. If you change the textconv config variable for a | |
2de9b711 | 977 | diff driver, Git will automatically invalidate the cache entries |
d9bae1a1 JK |
978 | and re-run the textconv filter. If you want to invalidate the |
979 | cache manually (e.g., because your version of "exif" was updated | |
980 | and now produces better output), you can remove the cache | |
981 | manually with `git update-ref -d refs/notes/textconv/jpg` (where | |
982 | "jpg" is the name of the diff driver, as in the example above). | |
678852d9 | 983 | |
55601c6a JK |
984 | Choosing textconv versus external diff |
985 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
986 | ||
987 | If you want to show differences between binary or specially-formatted | |
988 | blobs in your repository, you can choose to use either an external diff | |
989 | command, or to use textconv to convert them to a diff-able text format. | |
990 | Which method you choose depends on your exact situation. | |
991 | ||
992 | The advantage of using an external diff command is flexibility. You are | |
993 | not bound to find line-oriented changes, nor is it necessary for the | |
994 | output to resemble unified diff. You are free to locate and report | |
995 | changes in the most appropriate way for your data format. | |
996 | ||
997 | A textconv, by comparison, is much more limiting. You provide a | |
2de9b711 | 998 | transformation of the data into a line-oriented text format, and Git |
55601c6a JK |
999 | uses its regular diff tools to generate the output. There are several |
1000 | advantages to choosing this method: | |
1001 | ||
1002 | 1. Ease of use. It is often much simpler to write a binary to text | |
1003 | transformation than it is to perform your own diff. In many cases, | |
1004 | existing programs can be used as textconv filters (e.g., exif, | |
1005 | odt2txt). | |
1006 | ||
1007 | 2. Git diff features. By performing only the transformation step | |
2de9b711 | 1008 | yourself, you can still utilize many of Git's diff features, |
55601c6a JK |
1009 | including colorization, word-diff, and combined diffs for merges. |
1010 | ||
1011 | 3. Caching. Textconv caching can speed up repeated diffs, such as those | |
1012 | you might trigger by running `git log -p`. | |
1013 | ||
1014 | ||
ab435611 JK |
1015 | Marking files as binary |
1016 | ^^^^^^^^^^^^^^^^^^^^^^^ | |
1017 | ||
1018 | Git usually guesses correctly whether a blob contains text or binary | |
1019 | data by examining the beginning of the contents. However, sometimes you | |
1020 | may want to override its decision, either because a blob contains binary | |
1021 | data later in the file, or because the content, while technically | |
1022 | composed of text characters, is opaque to a human reader. For example, | |
f745acb0 | 1023 | many postscript files contain only ASCII characters, but produce noisy |
ab435611 JK |
1024 | and meaningless diffs. |
1025 | ||
1026 | The simplest way to mark a file as binary is to unset the diff | |
1027 | attribute in the `.gitattributes` file: | |
1028 | ||
1029 | ------------------------ | |
1030 | *.ps -diff | |
1031 | ------------------------ | |
1032 | ||
2de9b711 | 1033 | This will cause Git to generate `Binary files differ` (or a binary |
ab435611 JK |
1034 | patch, if binary patches are enabled) instead of a regular diff. |
1035 | ||
1036 | However, one may also want to specify other diff driver attributes. For | |
1037 | example, you might want to use `textconv` to convert postscript files to | |
f745acb0 | 1038 | an ASCII representation for human viewing, but otherwise treat them as |
ab435611 JK |
1039 | binary files. You cannot specify both `-diff` and `diff=ps` attributes. |
1040 | The solution is to use the `diff.*.binary` config option: | |
1041 | ||
1042 | ------------------------ | |
1043 | [diff "ps"] | |
1044 | textconv = ps2ascii | |
1045 | binary = true | |
1046 | ------------------------ | |
1047 | ||
88e7fdf2 JH |
1048 | Performing a three-way merge |
1049 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1050 | ||
4f73e240 JN |
1051 | `merge` |
1052 | ^^^^^^^ | |
1053 | ||
b547ce0b | 1054 | The attribute `merge` affects how three versions of a file are |
88e7fdf2 | 1055 | merged when a file-level merge is necessary during `git merge`, |
57f6ec02 | 1056 | and other commands such as `git revert` and `git cherry-pick`. |
88e7fdf2 JH |
1057 | |
1058 | Set:: | |
1059 | ||
1060 | Built-in 3-way merge driver is used to merge the | |
2fd02c92 | 1061 | contents in a way similar to 'merge' command of `RCS` |
88e7fdf2 JH |
1062 | suite. This is suitable for ordinary text files. |
1063 | ||
1064 | Unset:: | |
1065 | ||
1066 | Take the version from the current branch as the | |
1067 | tentative merge result, and declare that the merge has | |
b547ce0b | 1068 | conflicts. This is suitable for binary files that do |
88e7fdf2 JH |
1069 | not have well-defined merge semantics. |
1070 | ||
1071 | Unspecified:: | |
1072 | ||
1073 | By default, this uses the same built-in 3-way merge | |
b547ce0b AS |
1074 | driver as is the case when the `merge` attribute is set. |
1075 | However, the `merge.default` configuration variable can name | |
1076 | a different merge driver to be used with paths for which the | |
88e7fdf2 JH |
1077 | `merge` attribute is unspecified. |
1078 | ||
2cc3167c | 1079 | String:: |
88e7fdf2 JH |
1080 | |
1081 | 3-way merge is performed using the specified custom | |
1082 | merge driver. The built-in 3-way merge driver can be | |
1083 | explicitly specified by asking for "text" driver; the | |
1084 | built-in "take the current branch" driver can be | |
b9d14ffb | 1085 | requested with "binary". |
88e7fdf2 JH |
1086 | |
1087 | ||
0e545f75 JH |
1088 | Built-in merge drivers |
1089 | ^^^^^^^^^^^^^^^^^^^^^^ | |
1090 | ||
1091 | There are a few built-in low-level merge drivers defined that | |
1092 | can be asked for via the `merge` attribute. | |
1093 | ||
1094 | text:: | |
1095 | ||
1096 | Usual 3-way file level merge for text files. Conflicted | |
1097 | regions are marked with conflict markers `<<<<<<<`, | |
1098 | `=======` and `>>>>>>>`. The version from your branch | |
1099 | appears before the `=======` marker, and the version | |
1100 | from the merged branch appears after the `=======` | |
1101 | marker. | |
1102 | ||
1103 | binary:: | |
1104 | ||
1105 | Keep the version from your branch in the work tree, but | |
1106 | leave the path in the conflicted state for the user to | |
1107 | sort out. | |
1108 | ||
1109 | union:: | |
1110 | ||
1111 | Run 3-way file level merge for text files, but take | |
1112 | lines from both versions, instead of leaving conflict | |
1113 | markers. This tends to leave the added lines in the | |
1114 | resulting file in random order and the user should | |
1115 | verify the result. Do not use this if you do not | |
1116 | understand the implications. | |
1117 | ||
1118 | ||
88e7fdf2 JH |
1119 | Defining a custom merge driver |
1120 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
1121 | ||
0e545f75 JH |
1122 | The definition of a merge driver is done in the `.git/config` |
1123 | file, not in the `gitattributes` file, so strictly speaking this | |
1124 | manual page is a wrong place to talk about it. However... | |
88e7fdf2 JH |
1125 | |
1126 | To define a custom merge driver `filfre`, add a section to your | |
1127 | `$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this: | |
1128 | ||
1129 | ---------------------------------------------------------------- | |
1130 | [merge "filfre"] | |
1131 | name = feel-free merge driver | |
ef45bb1f | 1132 | driver = filfre %O %A %B %L %P |
88e7fdf2 JH |
1133 | recursive = binary |
1134 | ---------------------------------------------------------------- | |
1135 | ||
1136 | The `merge.*.name` variable gives the driver a human-readable | |
1137 | name. | |
1138 | ||
1139 | The `merge.*.driver` variable's value is used to construct a | |
81effe94 | 1140 | command to run to merge the common ancestor's version (`%O`), current |
88e7fdf2 JH |
1141 | version (`%A`) and the other branches' version (`%B`). These |
1142 | three tokens are replaced with the names of temporary files that | |
1143 | hold the contents of these versions when the command line is | |
81effe94 | 1144 | built. Additionally, `%L` will be replaced with the conflict marker |
16758621 | 1145 | size (see below). |
88e7fdf2 JH |
1146 | |
1147 | The merge driver is expected to leave the result of the merge in | |
1148 | the file named with `%A` by overwriting it, and exit with zero | |
1149 | status if it managed to merge them cleanly, or non-zero if there | |
2b7b788f JH |
1150 | were conflicts. When the driver crashes (e.g. killed by SEGV), |
1151 | it is expected to exit with a non-zero status higher than | |
1152 | 128, and in such a case, the merge results in a failure (which is | |
1153 | different from producing a conflict). | |
88e7fdf2 JH |
1154 | |
1155 | The `merge.*.recursive` variable specifies what other merge | |
1156 | driver to use when the merge driver is called for an internal | |
1157 | merge between common ancestors, when there is more than one. | |
1158 | When left unspecified, the driver itself is used for both | |
1159 | internal merge and the final merge. | |
1160 | ||
ef45bb1f | 1161 | The merge driver can learn the pathname in which the merged result |
81effe94 AD |
1162 | will be stored via placeholder `%P`. The conflict labels to be used |
1163 | for the common ancestor, local head and other head can be passed by | |
1164 | using `%S`, `%X` and `%Y` respectively. | |
88e7fdf2 | 1165 | |
4c734803 JH |
1166 | `conflict-marker-size` |
1167 | ^^^^^^^^^^^^^^^^^^^^^^ | |
1168 | ||
1169 | This attribute controls the length of conflict markers left in | |
97509a34 ŠN |
1170 | the work tree file during a conflicted merge. Only a positive |
1171 | integer has a meaningful effect. | |
4c734803 JH |
1172 | |
1173 | For example, this line in `.gitattributes` can be used to tell the merge | |
1174 | machinery to leave much longer (instead of the usual 7-character-long) | |
1175 | conflict markers when merging the file `Documentation/git-merge.txt` | |
1176 | results in a conflict. | |
1177 | ||
1178 | ------------------------ | |
1179 | Documentation/git-merge.txt conflict-marker-size=32 | |
1180 | ------------------------ | |
1181 | ||
1182 | ||
cf1b7869 JH |
1183 | Checking whitespace errors |
1184 | ~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1185 | ||
1186 | `whitespace` | |
1187 | ^^^^^^^^^^^^ | |
1188 | ||
1189 | The `core.whitespace` configuration variable allows you to define what | |
2fd02c92 | 1190 | 'diff' and 'apply' should consider whitespace errors for all paths in |
5162e697 | 1191 | the project (See linkgit:git-config[1]). This attribute gives you finer |
cf1b7869 JH |
1192 | control per path. |
1193 | ||
1194 | Set:: | |
1195 | ||
2de9b711 | 1196 | Notice all types of potential whitespace errors known to Git. |
f4b05a49 JS |
1197 | The tab width is taken from the value of the `core.whitespace` |
1198 | configuration variable. | |
cf1b7869 JH |
1199 | |
1200 | Unset:: | |
1201 | ||
1202 | Do not notice anything as error. | |
1203 | ||
1204 | Unspecified:: | |
1205 | ||
f4b05a49 | 1206 | Use the value of the `core.whitespace` configuration variable to |
cf1b7869 JH |
1207 | decide what to notice as error. |
1208 | ||
1209 | String:: | |
1210 | ||
f9552641 | 1211 | Specify a comma separated list of common whitespace problems to |
f4b05a49 | 1212 | notice in the same format as the `core.whitespace` configuration |
cf1b7869 JH |
1213 | variable. |
1214 | ||
1215 | ||
8a33dd8b JH |
1216 | Creating an archive |
1217 | ~~~~~~~~~~~~~~~~~~~ | |
1218 | ||
08b51f51 JH |
1219 | `export-ignore` |
1220 | ^^^^^^^^^^^^^^^ | |
1221 | ||
1222 | Files and directories with the attribute `export-ignore` won't be added to | |
1223 | archive files. | |
1224 | ||
8a33dd8b JH |
1225 | `export-subst` |
1226 | ^^^^^^^^^^^^^^ | |
1227 | ||
2de9b711 | 1228 | If the attribute `export-subst` is set for a file then Git will expand |
8a33dd8b | 1229 | several placeholders when adding this file to an archive. The |
08b51f51 | 1230 | expansion depends on the availability of a commit ID, i.e., if |
8a33dd8b JH |
1231 | linkgit:git-archive[1] has been given a tree instead of a commit or a |
1232 | tag then no replacement will be done. The placeholders are the same | |
1233 | as those for the option `--pretty=format:` of linkgit:git-log[1], | |
1234 | except that they need to be wrapped like this: `$Format:PLACEHOLDERS$` | |
1235 | in the file. E.g. the string `$Format:%H$` will be replaced by the | |
96099726 RS |
1236 | commit hash. However, only one `%(describe)` placeholder is expanded |
1237 | per archive to avoid denial-of-service attacks. | |
8a33dd8b JH |
1238 | |
1239 | ||
975457f1 NG |
1240 | Packing objects |
1241 | ~~~~~~~~~~~~~~~ | |
1242 | ||
1243 | `delta` | |
1244 | ^^^^^^^ | |
1245 | ||
1246 | Delta compression will not be attempted for blobs for paths with the | |
1247 | attribute `delta` set to false. | |
1248 | ||
1249 | ||
a2df1fb2 AG |
1250 | Viewing files in GUI tools |
1251 | ~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1252 | ||
1253 | `encoding` | |
1254 | ^^^^^^^^^^ | |
1255 | ||
1256 | The value of this attribute specifies the character encoding that should | |
1257 | be used by GUI tools (e.g. linkgit:gitk[1] and linkgit:git-gui[1]) to | |
1258 | display the contents of the relevant file. Note that due to performance | |
1259 | considerations linkgit:gitk[1] does not use this attribute unless you | |
1260 | manually enable per-file encodings in its options. | |
1261 | ||
1262 | If this attribute is not set or has an invalid value, the value of the | |
1263 | `gui.encoding` configuration variable is used instead | |
1264 | (See linkgit:git-config[1]). | |
1265 | ||
1266 | ||
0922570c | 1267 | USING MACRO ATTRIBUTES |
bbb896d8 JH |
1268 | ---------------------- |
1269 | ||
1270 | You do not want any end-of-line conversions applied to, nor textual diffs | |
1271 | produced for, any binary file you track. You would need to specify e.g. | |
1272 | ||
1273 | ------------ | |
5ec3e670 | 1274 | *.jpg -text -diff |
bbb896d8 JH |
1275 | ------------ |
1276 | ||
1277 | but that may become cumbersome, when you have many attributes. Using | |
0922570c | 1278 | macro attributes, you can define an attribute that, when set, also |
98e84066 | 1279 | sets or unsets a number of other attributes at the same time. The |
0922570c | 1280 | system knows a built-in macro attribute, `binary`: |
bbb896d8 JH |
1281 | |
1282 | ------------ | |
1283 | *.jpg binary | |
1284 | ------------ | |
1285 | ||
98e84066 | 1286 | Setting the "binary" attribute also unsets the "text" and "diff" |
0922570c | 1287 | attributes as above. Note that macro attributes can only be "Set", |
98e84066 MH |
1288 | though setting one might have the effect of setting or unsetting other |
1289 | attributes or even returning other attributes to the "Unspecified" | |
1290 | state. | |
bbb896d8 JH |
1291 | |
1292 | ||
0922570c | 1293 | DEFINING MACRO ATTRIBUTES |
bbb896d8 JH |
1294 | ------------------------- |
1295 | ||
e78e6967 MH |
1296 | Custom macro attributes can be defined only in top-level gitattributes |
1297 | files (`$GIT_DIR/info/attributes`, the `.gitattributes` file at the | |
1298 | top level of the working tree, or the global or system-wide | |
1299 | gitattributes files), not in `.gitattributes` files in working tree | |
1300 | subdirectories. The built-in macro attribute "binary" is equivalent | |
1301 | to: | |
bbb896d8 JH |
1302 | |
1303 | ------------ | |
155a4b71 | 1304 | [attr]binary -diff -merge -text |
bbb896d8 JH |
1305 | ------------ |
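
As a rough illustration of the macro definition above, the following Python sketch expands a macro attribute into the attributes it implies. It is a simplified model and not how Git implements macros: the `MACROS` table and the `expand` helper are hypothetical names, and only the built-in `binary` definition shown above is included.

```python
# Hypothetical table of macro definitions, mirroring
# "[attr]binary -diff -merge -text" from the text above.
MACROS = {"binary": ["-diff", "-merge", "-text"]}


def expand(tokens, macros=MACROS):
    """Return the attribute tokens with macro attributes expanded.

    A macro only expands when it is listed as Set (bare name); any
    other token is passed through unchanged in this sketch.
    """
    expanded = []
    for token in tokens:
        expanded.append(token)
        if token in macros:
            expanded.extend(macros[token])
    return expanded


print(expand(["binary"]))
```

Under these assumptions, a line such as `*.jpg binary` is treated as if it also listed `-diff -merge -text`.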
1306 | ||
8ff06de1 JK |
1307 | NOTES |
1308 | ----- | |
1309 | ||
1310 | Git does not follow symbolic links when accessing a `.gitattributes` | |
1311 | file in the working tree. This keeps behavior consistent when the file | |
1312 | is accessed from the index or a tree versus from the filesystem. | |
bbb896d8 | 1313 | |
76a8788c NTND |
1314 | EXAMPLES |
1315 | -------- | |
88e7fdf2 JH |
1316 | |
1317 | If you have these three `gitattributes` files: | |
1318 | ||
1319 | ---------------------------------------------------------------- | |
1320 | (in $GIT_DIR/info/attributes) | |
1321 | ||
1322 | a* foo !bar -baz | |
1323 | ||
1324 | (in .gitattributes) | |
1325 | abc foo bar baz | |
1326 | ||
1327 | (in t/.gitattributes) | |
1328 | ab* merge=filfre | |
1329 | abc -foo -bar | |
1330 | *.c frotz | |
1331 | ---------------------------------------------------------------- | |
1332 | ||
1333 | the attributes given to path `t/abc` are computed as follows: | |
1334 | ||
1335 | 1. By examining `t/.gitattributes` (which is in the same | |
2de9b711 | 1336 | directory as the path in question), Git finds that the first |
88e7fdf2 JH |
1337 | line matches. The `merge` attribute is set to `filfre`. It also finds that | |
1338 | the second line matches, and attributes `foo` and `bar` | |
1339 | are unset. | |
1340 | ||
1341 | 2. Then it examines `.gitattributes` (which is in the parent | |
1342 | directory), and finds that the first line matches, but | |
1343 | `t/.gitattributes` file already decided how `merge`, `foo` | |
1344 | and `bar` attributes should be given to this path, so it | |
1345 | leaves `foo` and `bar` unset. Attribute `baz` is set. | |
1346 | ||
5c759f96 | 1347 | 3. Finally it examines `$GIT_DIR/info/attributes`. This file |
88e7fdf2 JH |
1348 | is used to override the in-tree settings. The first line is |
1349 | a match, and `foo` is set, `bar` is reverted to unspecified | |
1350 | state, and `baz` is unset. | |
1351 | ||
02783075 | 1352 | As a result, the attribute assignment for `t/abc` becomes:
88e7fdf2 JH |
1353 | |
1354 | ---------------------------------------------------------------- | |
1355 | foo set to true | |
1356 | bar unspecified | |
1357 | baz set to false | |
1358 | merge set to string value "filfre" | |
1359 | frotz unspecified | |
1360 | ---------------------------------------------------------------- | |
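
The computation above can be reproduced with a small Python sketch. It is a simplified model, not Git's implementation: the `FILES` list and the `resolve` helper are hypothetical, only slash-free patterns are handled (matched against the basename of the path), and the files are applied from lowest to highest precedence so that later assignments override earlier ones. That ordering should give the same outcome as the highest-first walk described in the steps.

```python
from fnmatch import fnmatchcase
import posixpath

# (directory prefix the file applies to, lines of the file),
# ordered from lowest to highest precedence for the path t/abc.
FILES = [
    ("", ["abc foo bar baz"]),                                   # .gitattributes
    ("t/", ["ab* merge=filfre", "abc -foo -bar", "*.c frotz"]),  # t/.gitattributes
    ("", ["a* foo !bar -baz"]),                                  # $GIT_DIR/info/attributes
]


def resolve(path, files):
    """Return a dict of attributes for path; absent keys are Unspecified."""
    attrs = {}
    name = posixpath.basename(path)
    for prefix, lines in files:
        if not path.startswith(prefix):
            continue  # this file does not cover the path
        for line in lines:
            pattern, *tokens = line.split()
            if not fnmatchcase(name, pattern):
                continue
            for token in tokens:
                if token.startswith("!"):
                    attrs.pop(token[1:], None)   # back to Unspecified
                elif token.startswith("-"):
                    attrs[token[1:]] = False     # Unset
                elif "=" in token:
                    key, value = token.split("=", 1)
                    attrs[key] = value           # Set to a value
                else:
                    attrs[token] = True          # Set
    return attrs


print(resolve("t/abc", FILES))
```

In this model `bar` and `frotz` do not appear in the returned dictionary, which corresponds to the Unspecified state in the table above. In a real repository, linkgit:git-check-attr[1] reports the attributes Git actually assigns.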
1361 | ||
1362 | ||
cde15181 MH |
1363 | SEE ALSO |
1364 | -------- | |
1365 | linkgit:git-check-attr[1]. | |
8460b2fc | 1366 | |
88e7fdf2 JH |
1367 | GIT |
1368 | --- | |
9e1f0a85 | 1369 | Part of the linkgit:git[1] suite |