gitattributes(5)
================

NAME
----
gitattributes - defining attributes per path

SYNOPSIS
--------
$GIT_DIR/info/attributes, .gitattributes


DESCRIPTION
-----------

A `gitattributes` file is a simple text file that gives
`attributes` to pathnames.

Each line in a `gitattributes` file is of the form:

	pattern	attr1 attr2 ...

That is, a pattern followed by an attributes list,
separated by whitespace. Leading and trailing whitespace is
ignored. Lines that begin with '#' are ignored. Patterns
that begin with a double quote are quoted in C style.
When the pattern matches the path in question, the attributes
listed on the line are given to the path.

Each attribute can be in one of these states for a given path:

Set::

	The path has the attribute with special value "true";
	this is specified by listing only the name of the
	attribute in the attribute list.

Unset::

	The path has the attribute with special value "false";
	this is specified by listing the name of the attribute
	prefixed with a dash `-` in the attribute list.

Set to a value::

	The path has the attribute with specified string value;
	this is specified by listing the name of the attribute
	followed by an equal sign `=` and its value in the
	attribute list.

Unspecified::

	When no pattern matches the path, and nothing says whether
	the path has or does not have the attribute, the
	attribute for the path is said to be Unspecified.

When more than one pattern matches the path, a later line
overrides an earlier line. This overriding is done per
attribute. The rules by which the pattern matches paths are the
same as in `.gitignore` files; see linkgit:gitignore[5].
Unlike `.gitignore`, negative patterns are forbidden.

When deciding what attributes are assigned to a path, Git
consults the `$GIT_DIR/info/attributes` file (which has the highest
precedence), the `.gitattributes` file in the same directory as the
path in question, and its parent directories up to the toplevel of the
work tree (the further the directory that contains `.gitattributes`
is from the path in question, the lower its precedence). Finally,
global and system-wide files are considered (they have the lowest
precedence).

When the `.gitattributes` file is missing from the work tree, the
path in the index is used as a fall-back. During the checkout process,
`.gitattributes` in the index is used and then the file in the
working tree is used as a fall-back.

If you wish to affect only a single repository (i.e., to assign
attributes to files that are particular to
one user's workflow for that repository), then
attributes should be placed in the `$GIT_DIR/info/attributes` file.
Attributes which should be version-controlled and distributed to other
repositories (i.e., attributes of interest to all users) should go into
`.gitattributes` files. Attributes that should affect all repositories
for a single user should be placed in a file specified by the
`core.attributesFile` configuration option (see linkgit:git-config[1]).
Its default value is $XDG_CONFIG_HOME/git/attributes. If $XDG_CONFIG_HOME
is either not set or empty, $HOME/.config/git/attributes is used instead.
Attributes for all users on a system should be placed in the
`$(prefix)/etc/gitattributes` file.

Sometimes you need to override the setting of an attribute
for a path and return it to the `Unspecified` state. This can be done
by listing the name of the attribute prefixed with an exclamation point `!`.
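
As a rough illustration of the four states and of the per-attribute
override rule, the following Python sketch interprets attribute lists
only. It assumes pattern matching has already selected the lines that
apply to a path, and the names `State`, `parse_token` and `resolve` are
hypothetical; they are not part of Git.

```python
from enum import Enum


class State(Enum):
    SET = "set"
    UNSET = "unset"
    UNSPECIFIED = "unspecified"


def parse_token(token):
    """Return (name, state_or_value) for one entry of an attribute list."""
    if token.startswith("-"):
        return token[1:], State.UNSET          # -attr
    if token.startswith("!"):
        return token[1:], State.UNSPECIFIED    # !attr
    name, sep, value = token.partition("=")
    if sep:
        return name, value                     # attr=value
    return name, State.SET                     # attr


def resolve(attr_lists):
    """Apply the attribute lists of all matching lines, earliest first."""
    result = {}
    for attrs in attr_lists:
        for token in attrs.split():
            name, state = parse_token(token)
            result[name] = state  # a later line overrides, per attribute
    return result


# Two matching lines for the same path; the second one wins per attribute.
final = resolve(["text eol=lf diff", "-diff !eol"])
```

Here `text` stays Set, `diff` ends up Unset, and `eol` is returned to
Unspecified by the later `!eol` entry.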


EFFECTS
-------

Certain operations by Git can be influenced by assigning
particular attributes to a path. Currently, the following
operations are attributes-aware.

Checking-out and checking-in
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These attributes affect how the contents stored in the
repository are copied to the working tree files when commands
such as 'git checkout' and 'git merge' run. They also affect how
Git stores the contents you prepare in the working tree in the
repository upon 'git add' and 'git commit'.

`text`
^^^^^^

This attribute enables and controls end-of-line normalization. When a
text file is normalized, its line endings are converted to LF in the
repository. To control what line ending style is used in the working
directory, use the `eol` attribute for a single file and the
`core.eol` configuration variable for all text files.
Note that `core.autocrlf` overrides `core.eol`.

Set::

	Setting the `text` attribute on a path enables end-of-line
	normalization and marks the path as a text file. End-of-line
	conversion takes place without guessing the content type.

Unset::

	Unsetting the `text` attribute on a path tells Git not to
	attempt any end-of-line conversion upon checkin or checkout.

Set to string value "auto"::

	When `text` is set to "auto", the path is marked for automatic
	end-of-line conversion. If Git decides that the content is
	text, its line endings are converted to LF on checkin.
	When the file has been committed with CRLF, no conversion is done.

Unspecified::

	If the `text` attribute is unspecified, Git uses the
	`core.autocrlf` configuration variable to determine if the
	file should be converted.

Any other value causes Git to act as if `text` has been left
unspecified.

`eol`
^^^^^

This attribute sets a specific line-ending style to be used in the
working directory. It enables end-of-line conversion without any
content checks, effectively setting the `text` attribute. Note that
setting this attribute on paths which are in the index with CRLF line
endings may cause the paths to be considered dirty. Adding the path to
the index again will normalize the line endings in the index.

Set to string value "crlf"::

	This setting forces Git to normalize line endings for this
	file on checkin and convert them to CRLF when the file is
	checked out.

Set to string value "lf"::

	This setting forces Git to normalize line endings to LF on
	checkin and prevents conversion to CRLF when the file is
	checked out.
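
The effect of the two `eol` values can be pictured with a minimal
Python sketch. It is a simplification under stated assumptions, not
Git's actual conversion code: the helper names `to_repository` and
`to_worktree` are hypothetical, and the sketch ignores the content
checks and safety logic (such as `core.safecrlf`) that Git applies.

```python
def to_repository(data: bytes) -> bytes:
    """Check-in: normalize CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


def to_worktree(data: bytes, eol: str) -> bytes:
    """Check-out: apply the line-ending style named by the `eol` value."""
    normalized = to_repository(data)  # assumption: start from LF-only text
    if eol == "crlf":
        return normalized.replace(b"\n", b"\r\n")
    if eol == "lf":
        return normalized
    raise ValueError(f"unsupported eol value: {eol!r}")


blob = to_repository(b"first\r\nsecond\n")
windows_copy = to_worktree(blob, "crlf")
unix_copy = to_worktree(blob, "lf")
```

With either value the content in the repository is the same LF-only
blob; only the working tree copy differs.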

Backwards compatibility with `crlf` attribute
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For backwards compatibility, the `crlf` attribute is interpreted as
follows:

------------------------
crlf		text
-crlf		-text
crlf=input	eol=lf
------------------------

End-of-line conversion
^^^^^^^^^^^^^^^^^^^^^^

While Git normally leaves file contents alone, it can be configured to
normalize line endings to LF in the repository and, optionally, to
convert them to CRLF when files are checked out.

If you simply want to have CRLF line endings in your working directory
regardless of the repository you are working with, you can set the
config variable "core.autocrlf" without using any attributes.

------------------------
[core]
	autocrlf = true
------------------------

This does not force normalization of text files, but does ensure
that text files that you introduce to the repository have their line
endings normalized to LF when they are added, and that files that are
already normalized in the repository stay normalized.

If you want to ensure that text files that any contributor introduces to
the repository have their line endings normalized, you can set the
`text` attribute to "auto" for _all_ files.

------------------------
*	text=auto
------------------------

The attributes allow fine-grained control over how line endings
are converted.
Here is an example that will make Git normalize .txt, .vcproj and .sh
files, ensure that .vcproj files have CRLF and .sh files have LF in
the working directory, and prevent .jpg files from being normalized
regardless of their content.

------------------------
*		text=auto
*.txt		text
*.vcproj	text eol=crlf
*.sh		text eol=lf
*.jpg		-text
------------------------

NOTE: When `text=auto` conversion is enabled in a cross-platform
project using push and pull to a central repository, the text files
containing CRLFs should be normalized.

From a clean working directory:

-------------------------------------------------
$ echo "* text=auto" >.gitattributes
$ git read-tree --empty   # Clean index, force re-scan of working directory
$ git add .
$ git status        # Show files that will be normalized
$ git commit -m "Introduce end-of-line normalization"
-------------------------------------------------

If any files that should not be normalized show up in 'git status',
unset their `text` attribute before running 'git add -u'.

------------------------
manual.pdf	-text
------------------------

Conversely, text files that Git does not detect can have normalization
enabled manually.

------------------------
weirdchars.txt	text
------------------------

If `core.safecrlf` is set to "true" or "warn", Git verifies if
the conversion is reversible for the current setting of
`core.autocrlf`. For "true", Git rejects irreversible
conversions; for "warn", Git only prints a warning but accepts
an irreversible conversion. The safety triggers to prevent such
a conversion done to the files in the work tree, but there are a
few exceptions. Even though...

- 'git add' itself does not touch the files in the work tree, the
  next checkout would, so the safety triggers;

- 'git apply' to update a text file with a patch does touch the files
  in the work tree, but the operation is about text files and CRLF
  conversion is about fixing the line ending inconsistencies, so the
  safety does not trigger;

- 'git diff' itself does not touch the files in the work tree, it is
  often run to inspect the changes you intend to next 'git add'. To
  catch potential problems early, safety triggers.


`ident`
^^^^^^^

When the attribute `ident` is set for a path, Git replaces
`$Id$` in the blob object with `$Id:`, followed by the
40-character hexadecimal blob object name, followed by a dollar
sign `$` upon checkout. Any byte sequence that begins with
`$Id:` and ends with `$` in the worktree file is replaced
with `$Id$` upon check-in.
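
The expansion and its reversal can be sketched in Python as follows.
This is an illustration under assumptions rather than Git's code: the
function names are hypothetical, the repository is assumed to use
SHA-1 object names, the expanded keyword is assumed to carry a space
on each side of the object name, and the check-in pattern is limited
to a single line.

```python
import hashlib
import re

# Assumption: an expanded keyword never spans more than one line.
ID_EXPANDED = re.compile(rb"\$Id:[^$\n]*\$")


def blob_name(data: bytes) -> str:
    """Object name of a blob: SHA-1 over a "blob <size>\\0" header plus content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def ident_smudge(blob: bytes) -> bytes:
    """Checkout: expand $Id$ using the name of the blob being checked out."""
    expanded = b"$Id: " + blob_name(blob).encode("ascii") + b" $"
    return blob.replace(b"$Id$", expanded)


def ident_clean(worktree: bytes) -> bytes:
    """Check-in: collapse any expanded keyword back to $Id$."""
    return ID_EXPANDED.sub(b"$Id$", worktree)


stored = b"/* $Id$ */\nint main(void) { return 0; }\n"
checked_out = ident_smudge(stored)
```

Because check-in collapses the keyword again, `ident_clean(checked_out)`
yields the stored blob unchanged.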


`filter`
^^^^^^^^

A `filter` attribute can be set to a string value that names a
filter driver specified in the configuration.

A filter driver consists of a `clean` command and a `smudge`
command, either of which can be left unspecified. Upon
checkout, when the `smudge` command is specified, the command is
fed the blob object from its standard input, and its standard
output is used to update the worktree file. Similarly, the
`clean` command is used to convert the contents of the worktree file
upon checkin. By default these commands process only a single
blob and terminate. If a long running `process` filter is used
in place of `clean` and/or `smudge` filters, then Git can process
all blobs with a single filter command invocation for the entire
life of a single Git command, for example `git add --all`. If a
long running `process` filter is configured then it always takes
precedence over a configured single blob filter. See the section
below for the description of the protocol used to communicate with
a `process` filter.

One use of the content filtering is to massage the content into a shape
that is more convenient for the platform, filesystem, and the user to use.
For this mode of operation, the key phrase here is "more convenient" and
not "turning something unusable into usable". In other words, the intent
is that if someone unsets the filter driver definition, or does not have
the appropriate filter program, the project should still be usable.

Another use of the content filtering is to store the content that cannot
be directly used in the repository (e.g. a UUID that refers to the true
content stored outside Git, or an encrypted content) and turn it into a
usable form upon checkout (e.g. download the external content, or decrypt
the encrypted content).

These two uses behave differently, and by default, a filter is taken as
the former, massaging the contents into a more convenient shape. A missing
filter driver definition in the config, or a filter driver that exits with
a non-zero status, is not an error but makes the filter a no-op passthru.

You can declare that a filter turns a content that by itself is unusable
into a usable content by setting the filter.<driver>.required configuration
variable to `true`.

For example, in .gitattributes, you would assign the `filter`
attribute for paths.

------------------------
*.c	filter=indent
------------------------

Then you would define a "filter.indent.clean" and "filter.indent.smudge"
configuration in your .git/config to specify a pair of commands to
modify the contents of C programs when the source files are checked
in ("clean" is run) and checked out (no change is made because the
command is "cat").

------------------------
[filter "indent"]
	clean = indent
	smudge = cat
------------------------

For best results, `clean` should not alter its output further if it is
run twice ("clean->clean" should be equivalent to "clean"), and
multiple `smudge` commands should not alter `clean`'s output
("smudge->smudge->clean" should be equivalent to "clean"). See the
section on merging below.

The "indent" filter is well-behaved in this regard: it will not modify
input that is already correctly indented. In this case, the lack of a
smudge filter means that the clean filter _must_ accept its own output
without modifying it.
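
The two properties above can be checked mechanically. The Python
sketch below uses a hypothetical `clean` function that strips trailing
whitespace as a stand-in for a real filter command such as `indent`,
and a pass-through `smudge` that plays the role of `cat`; both are
assumptions made for illustration only.

```python
def clean(data: bytes) -> bytes:
    """Hypothetical clean filter: strip trailing blanks from every line."""
    return b"\n".join(line.rstrip(b" \t") for line in data.split(b"\n"))


def smudge(data: bytes) -> bytes:
    """Pass-through smudge, comparable to `smudge = cat`."""
    return data


def is_well_behaved(sample: bytes) -> bool:
    """Check "clean->clean" and "smudge->smudge->clean" against "clean"."""
    once = clean(sample)
    clean_twice_ok = clean(once) == once
    smudge_then_clean_ok = clean(smudge(smudge(sample))) == once
    return clean_twice_ok and smudge_then_clean_ok


sample = b"int x;  \n\treturn 0;\t\n"
result = is_well_behaved(sample)
```

A filter pair that fails such a check on representative inputs is
likely to produce spurious differences whenever files are re-added.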

If a filter _must_ succeed in order to make the stored contents usable,
you can declare that the filter is `required`, in the configuration:

------------------------
[filter "crypt"]
	clean = openssl enc ...
	smudge = openssl enc -d ...
	required
------------------------

Sequence "%f" on the filter command line is replaced with the name of
the file the filter is working on. A filter might use this in keyword
substitution. For example:

------------------------
[filter "p4"]
	clean = git-p4-filter --clean %f
	smudge = git-p4-filter --smudge %f
------------------------

Note that "%f" is the name of the path that is being worked on. Depending
on the version that is being filtered, the corresponding file on disk may
not exist, or may have different contents. So, smudge and clean commands
should not try to access the file on disk, but only act as filters on the
content provided to them on standard input.

Long Running Filter Process
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the filter command (a string value) is defined via
`filter.<driver>.process` then Git can process all blobs with a
single filter invocation for the entire life of a single Git
command. This is achieved by using a packet format (pkt-line,
see technical/protocol-common.txt) based protocol over standard
input and standard output as follows. All packets, except for the
"*CONTENT" packets and the "0000" flush packet, are considered
text and therefore are terminated by a LF.

Git starts the filter when it encounters the first file
that needs to be cleaned or smudged. After the filter has started,
Git sends a welcome message ("git-filter-client"), a list of supported
protocol version numbers, and a flush packet. Git expects to read a welcome
response message ("git-filter-server"), exactly one protocol version number
from the previously sent list, and a flush packet. All further
communication will be based on the selected version. The remaining
protocol description below documents "version=2". Please note that
"version=42" in the example below does not exist and is only there
to illustrate what the protocol would look like with more than one
version.

After the version negotiation Git sends a list of all capabilities that
it supports and a flush packet. Git expects to read a list of desired
capabilities, which must be a subset of the supported capabilities list,
and a flush packet as response:
------------------------
packet:          git> git-filter-client
packet:          git> version=2
packet:          git> version=42
packet:          git> 0000
packet:          git< git-filter-server
packet:          git< version=2
packet:          git< 0000
packet:          git> capability=clean
packet:          git> capability=smudge
packet:          git> capability=not-yet-invented
packet:          git> 0000
packet:          git< capability=clean
packet:          git< capability=smudge
packet:          git< 0000
------------------------
Supported filter capabilities in version 2 are "clean", "smudge",
and "delay".
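
To make the wire format of this handshake concrete, here is a minimal
Python sketch of the filter's side of it. It assumes the pkt-line
framing described in technical/protocol-common.txt (four hexadecimal
digits giving the packet length, including those four bytes); the
helper names are hypothetical and reading Git's packets is left out.

```python
FLUSH = b"0000"  # flush packet: a bare length of zero, no payload


def pkt_line(text: str) -> bytes:
    """Encode one text packet: 4 hex digits of total length, payload, LF."""
    payload = text.encode("utf-8") + b"\n"
    return b"%04x" % (len(payload) + 4) + payload


def server_welcome(version: int = 2) -> bytes:
    """Welcome response: server name, the one chosen version, flush."""
    return pkt_line("git-filter-server") + pkt_line(f"version={version}") + FLUSH


def server_capabilities(offered, supported) -> bytes:
    """Answer with the subset of Git's offered capabilities that we support."""
    chosen = [cap for cap in offered if cap in supported]
    return b"".join(pkt_line(f"capability={cap}") for cap in chosen) + FLUSH


welcome = server_welcome()
caps = server_capabilities(
    ["clean", "smudge", "not-yet-invented"], {"clean", "smudge"}
)
```

Filtering the offered list, rather than sending a fixed one, keeps the
response a subset of what Git announced, as the protocol requires.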

Afterwards Git sends a list of "key=value" pairs terminated with
a flush packet. The list will contain at least the filter command
(based on the supported capabilities) and the pathname of the file
to filter relative to the repository root. Right after the flush packet
Git sends the content split in zero or more pkt-line packets and a
flush packet to terminate content. Please note that the filter
must not send any response before it has received the content and the
final flush packet. Also note that the "value" of a "key=value" pair
can contain the "=" character whereas the key would never contain
that character.
------------------------
packet:          git> command=smudge
packet:          git> pathname=path/testfile.dat
packet:          git> 0000
packet:          git> CONTENT
packet:          git> 0000
------------------------

The filter is expected to respond with a list of "key=value" pairs
terminated with a flush packet. If the filter does not experience
problems then the list must contain a "success" status. Right after
these packets the filter is expected to send the content in zero
or more pkt-line packets and a flush packet at the end. Finally, a
second list of "key=value" pairs terminated with a flush packet
is expected. The filter can change the status in the second list
or keep the status as is with an empty list. Please note that the
empty list must be terminated with a flush packet regardless.

------------------------
packet:          git< status=success
packet:          git< 0000
packet:          git< SMUDGED_CONTENT
packet:          git< 0000
packet:          git< 0000  # empty list, keep "status=success" unchanged!
------------------------
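
A successful response could be assembled as in the following Python
sketch. The helper names are hypothetical, and the payload limit of
65516 bytes per content packet is an assumption taken from the
pkt-line format (a maximum packet of 65520 bytes minus the four length
digits), not something stated in this manual page.

```python
FLUSH = b"0000"
MAX_DATA = 65516  # assumed pkt-line payload limit per packet


def pkt_line(text: str) -> bytes:
    """Encode one LF-terminated text packet."""
    payload = text.encode("utf-8") + b"\n"
    return b"%04x" % (len(payload) + 4) + payload


def content_packets(data: bytes) -> bytes:
    """Split binary content into zero or more packets, then flush."""
    out = b""
    for start in range(0, len(data), MAX_DATA):
        chunk = data[start:start + MAX_DATA]
        out += b"%04x" % (len(chunk) + 4) + chunk  # no LF: content is binary
    return out + FLUSH


def success_response(data: bytes) -> bytes:
    """Status list, content, then an empty list that keeps the status."""
    return pkt_line("status=success") + FLUSH + content_packets(data) + FLUSH


small = success_response(b"hi")
empty = success_response(b"")
```

Note that empty content still produces the content flush packet, so
the response ends with three flush packets in a row.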

If the result content is empty then the filter is expected to respond
with a "success" status and a flush packet to signal the empty content.
------------------------
packet:          git< status=success
packet:          git< 0000
packet:          git< 0000  # empty content!
packet:          git< 0000  # empty list, keep "status=success" unchanged!
------------------------

In case the filter cannot or does not want to process the content,
it is expected to respond with an "error" status.
------------------------
packet:          git< status=error
packet:          git< 0000
------------------------

If the filter experiences an error during processing, then it can
send the status "error" after the content was (partially or
completely) sent.
------------------------
packet:          git< status=success
packet:          git< 0000
packet:          git< HALF_WRITTEN_ERRONEOUS_CONTENT
packet:          git< 0000
packet:          git< status=error
packet:          git< 0000
------------------------

In case the filter cannot or does not want to process the content
as well as any future content for the lifetime of the Git process,
then it is expected to respond with an "abort" status at any point
in the protocol.
------------------------
packet:          git< status=abort
packet:          git< 0000
------------------------

Git neither stops nor restarts the filter process in case the
"error"/"abort" status is set. However, Git sets its exit code
according to the `filter.<driver>.required` flag, mimicking the
behavior of the `filter.<driver>.clean` / `filter.<driver>.smudge`
mechanism.

If the filter dies during the communication or does not adhere to
the protocol then Git will stop the filter process and restart it
with the next file that needs to be processed. Depending on the
`filter.<driver>.required` flag Git will interpret that as an error.

After the filter has processed a command it is expected to wait for
a "key=value" list containing the next command. Git will close
the command pipe on exit. The filter is expected to detect EOF
and exit gracefully on its own. Git will wait until the filter
process has stopped.

Delay
^^^^^

If the filter supports the "delay" capability, then Git can send the
flag "can-delay" after the filter command and pathname. This flag
denotes that the filter can delay filtering the current blob (e.g. to
compensate for network latencies) by responding with no content but with
the status "delayed" and a flush packet.
------------------------
packet:          git> command=smudge
packet:          git> pathname=path/testfile.dat
packet:          git> can-delay=1
packet:          git> 0000
packet:          git> CONTENT
packet:          git> 0000
packet:          git< status=delayed
packet:          git< 0000
------------------------

If the filter supports the "delay" capability then it must support the
"list_available_blobs" command. If Git sends this command, then the
filter is expected to return a list of pathnames representing blobs
that have been delayed earlier and are now available.
The list must be terminated with a flush packet followed
by a "success" status that is also terminated with a flush packet. If
no blobs for the delayed paths are available yet, then the filter is
expected to block the response until at least one blob becomes
available. The filter can tell Git that it has no more delayed blobs
by sending an empty list. As soon as the filter responds with an empty
list, Git stops asking. All blobs that Git has not received at this
point are considered missing and will result in an error.

------------------------
packet:          git> command=list_available_blobs
packet:          git> 0000
packet:          git< pathname=path/testfile.dat
packet:          git< pathname=path/otherfile.dat
packet:          git< 0000
packet:          git< status=success
packet:          git< 0000
------------------------

After Git has received the pathnames, it will request the corresponding
blobs again. These requests contain a pathname and an empty content
section. The filter is expected to respond with the smudged content
in the usual way as explained above.
------------------------
packet:          git> command=smudge
packet:          git> pathname=path/testfile.dat
packet:          git> 0000
packet:          git> 0000  # empty content!
packet:          git< status=success
packet:          git< 0000
packet:          git< SMUDGED_CONTENT
packet:          git< 0000
packet:          git< 0000  # empty list, keep "status=success" unchanged!
------------------------

Example
^^^^^^^

A long running filter demo implementation can be found in
`contrib/long-running-filter/example.pl` located in the Git
core repository. If you develop your own long running filter
process then the `GIT_TRACE_PACKET` environment variable can be
very helpful for debugging (see linkgit:git[1]).

Please note that you cannot use an existing `filter.<driver>.clean`
or `filter.<driver>.smudge` command with `filter.<driver>.process`
because the former two use a different inter process communication
protocol than the latter one.


Interaction between checkin/checkout attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the check-in codepath, the worktree file is first converted
with the `filter` driver (if specified and the corresponding driver
is defined), then the result is processed with `ident` (if
specified), and then finally with `text` (again, if specified
and applicable).

In the check-out codepath, the blob content is first converted
with `text`, and then `ident` and fed to `filter`.
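
The ordering can be summarized with a small Python sketch in which
each conversion is an optional callable. The function names and the
callable-based interface are hypothetical and only illustrate the
sequence of steps; they do not mirror Git's internal API.

```python
def check_in(worktree, clean=None, ident=None, text=None):
    """Worktree -> repository: filter (clean), then ident, then text."""
    data = worktree
    for step in (clean, ident, text):
        if step is not None:  # a conversion that is not specified is skipped
            data = step(data)
    return data


def check_out(blob, text=None, ident=None, smudge=None):
    """Repository -> worktree: text, then ident, then filter (smudge)."""
    data = blob
    for step in (text, ident, smudge):
        if step is not None:
            data = step(data)
    return data


# Record the order in which the steps run by appending a tag per step.
order_in = check_in(
    [],
    clean=lambda d: d + ["filter"],
    ident=lambda d: d + ["ident"],
    text=lambda d: d + ["text"],
)
order_out = check_out(
    [],
    text=lambda d: d + ["text"],
    ident=lambda d: d + ["ident"],
    smudge=lambda d: d + ["filter"],
)
```

The check-out order is the exact reverse of the check-in order, so the
filter driver always sees the content closest to its worktree form.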


Merging branches with differing checkin/checkout attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you have added attributes to a file that cause the canonical
repository format for that file to change, such as adding a
clean/smudge filter or text/eol/ident attributes, merging anything
where the attribute is not in place would normally cause merge
conflicts.

To prevent these unnecessary merge conflicts, Git can be told to run a
virtual check-out and check-in of all three stages of a file when
resolving a three-way merge by setting the `merge.renormalize`
configuration variable. This prevents changes caused by check-in
conversion from causing spurious merge conflicts when a converted file
is merged with an unconverted file.

As long as a "smudge->clean" results in the same output as a "clean"
even on files that are already smudged, this strategy will
automatically resolve all filter-related conflicts. Filters that do
not act in this way may cause additional merge conflicts that must be
resolved manually.


Generating diff text
~~~~~~~~~~~~~~~~~~~~

`diff`
^^^^^^

The attribute `diff` affects how Git generates diffs for particular
files. It can tell Git whether to generate a textual patch for the path
or to treat the path as a binary file. It can also affect what line is
shown on the hunk header `@@ -k,l +n,m @@` line, tell Git to use an
external command to generate the diff, or ask Git to convert binary
files to a text format before generating the diff.

Set::

	A path to which the `diff` attribute is set is treated
	as text, even when it contains byte values that
	normally never appear in text files, such as NUL.

Unset::

	A path to which the `diff` attribute is unset will
	generate `Binary files differ` (or a binary patch, if
	binary patches are enabled).

Unspecified::

	A path to which the `diff` attribute is unspecified
	first gets its contents inspected, and if it looks like
	text and is smaller than core.bigFileThreshold, it is treated
	as text. Otherwise it would generate `Binary files differ`.

String::

	Diff is shown using the specified diff driver. Each driver may
	specify one or more options, as described in the following
	section. The options for the diff driver "foo" are defined
	by the configuration variables in the "diff.foo" section of the
	Git config file.


Defining an external diff driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The definition of a diff driver is done in `gitconfig`, not in the
`gitattributes` file, so strictly speaking this manual page is the
wrong place to talk about it. However...

To define an external diff driver `jcdiff`, add a section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this:

----------------------------------------------------------------
[diff "jcdiff"]
	command = j-c-diff
----------------------------------------------------------------

When Git needs to show you a diff for the path with `diff`
attribute set to `jcdiff`, it calls the command you specified
with the above configuration, i.e. `j-c-diff`, with 7
parameters, just like the `GIT_EXTERNAL_DIFF` program is called.
See linkgit:git[1] for details.


Defining a custom hunk-header
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Each group of changes (called a "hunk") in the textual diff output
is prefixed with a line of the form:

	@@ -k,l +n,m @@ TEXT

This is called a 'hunk header'. The "TEXT" portion is by default a line
that begins with an alphabetic character, an underscore or a dollar sign;
this matches what GNU 'diff -p' output uses. This default selection however
is not suited for some contents, and you can use a customized pattern
to make a selection.

First, in .gitattributes, you would assign the `diff` attribute
for paths.

------------------------
*.tex	diff=tex
------------------------

Then, you would define a "diff.tex.xfuncname" configuration to
specify a regular expression that matches a line that you would
want to appear as the hunk header "TEXT". Add a section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this:

------------------------
[diff "tex"]
	xfuncname = "^(\\\\(sub)*section\\{.*)$"
------------------------

Note: a single level of backslashes is eaten by the
configuration file parser, so you would need to double the
backslashes; the pattern above picks a line that begins with a
backslash, and zero or more occurrences of `sub` followed by
`section` followed by open brace, to the end of line.
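
The effect of the doubled backslashes can be tried out with a short
Python sketch. It is an approximation under assumptions: the helper
`unescape_config` is hypothetical and only collapses doubled
backslashes (the real configuration parser handles more escapes), and
Python's `re` module is used in place of the POSIX extended regular
expressions that Git applies, which behave the same for this pattern.

```python
import re

# The value exactly as written between the quotes in the config file.
config_value = r"^(\\\\(sub)*section\\{.*)$"


def unescape_config(value: str) -> str:
    """Collapse each doubled backslash into one, as the config parser would."""
    return value.replace("\\\\", "\\")


# After unescaping, the regular expression is: ^(\\(sub)*section\{.*)$
pattern = re.compile(unescape_config(config_value))

matches_subsection = pattern.match(r"\subsection{Introduction}") is not None
matches_plain_text = pattern.match("Some ordinary paragraph text") is not None
```

Only lines that start with `\section{`, `\subsection{` and so on are
picked as hunk header text; ordinary paragraph lines are skipped.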

There are a few built-in patterns to make this easier, and `tex`
is one of them, so you do not have to write the above in your
configuration file (you still need to enable this with the
attribute mechanism, via `.gitattributes`). The following built in
patterns are available:

- `ada` suitable for source code in the Ada language.

- `bibtex` suitable for files with BibTeX coded references.

- `cpp` suitable for source code in the C and C++ languages.

- `csharp` suitable for source code in the C# language.

- `css` suitable for cascading style sheets.

- `fortran` suitable for source code in the Fortran language.

- `fountain` suitable for Fountain documents.

- `html` suitable for HTML/XHTML documents.

- `java` suitable for source code in the Java language.

- `matlab` suitable for source code in the MATLAB language.

- `objc` suitable for source code in the Objective-C language.

- `pascal` suitable for source code in the Pascal/Delphi language.

- `perl` suitable for source code in the Perl language.

- `php` suitable for source code in the PHP language.

- `python` suitable for source code in the Python language.

- `ruby` suitable for source code in the Ruby language.

- `tex` suitable for source code for LaTeX documents.


80c49c3d
TR
774Customizing word diff
775^^^^^^^^^^^^^^^^^^^^^
776
882749a0 777You can customize the rules that `git diff --word-diff` uses to
80c49c3d 778split words in a line, by specifying an appropriate regular expression
ae3b970a 779in the "diff.*.wordRegex" configuration variable. For example, in TeX
80c49c3d
TR
780a backslash followed by a sequence of letters forms a command, but
781several such commands can be run together without intervening
c4c86d23
JK
782whitespace. To separate them, use a regular expression in your
783`$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this:
80c49c3d
TR
784
785------------------------
786[diff "tex"]
ae3b970a 787 wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
80c49c3d
TR
788------------------------
789
790A built-in pattern is provided for all languages listed in the
791previous section.
792
793
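To get a feel for how the TeX expression above splits a line, the
following sketch prints each match on its own line. It does not run
Git's word-diff machinery; it assumes a `grep` that supports `-o` and
handles the expression like the extended regular expressions Git
uses, and the input line is made up.

```shell
# The wordRegex from the example, with one level of backslashes
# already removed by the config parser.
word_regex='\\[a-zA-Z]+|[{}]|\\.|[^\{}[:space:]]+'

# Two commands run together, followed by a braced argument.
tokens=$(printf '%s\n' '\alpha\beta{x}' | grep -oE "$word_regex")

printf '%s\n' "$tokens"
```

Each command, each brace, and the argument come out as separate
words, so a change to one of them is reported on its own rather than
as a change to the whole run of characters.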
Performing text diffs of binary files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sometimes it is desirable to see the diff of a text-converted
version of some binary files. For example, a word processor
document can be converted to an ASCII text representation, and
the diff of the text shown. Even though this conversion loses
some information, the resulting diff is useful for human
viewing (but cannot be applied directly).

The `textconv` config option is used to define a program for
performing such a conversion. The program should take a single
argument, the name of a file to convert, and produce the
resulting text on stdout.

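A conversion program can be very small. The sketch below is a
hypothetical converter, not something Git ships: the name `bin2text`
and the choice of `od` are assumptions made for illustration. It
takes the file name as its only argument and writes a text rendering
to stdout, which is all that is asked of a `textconv` program.

```shell
# Hypothetical textconv helper: one argument in, text on stdout.
bin2text() {
    # od -An -c prints each byte as a character or escape sequence,
    # without the leading offset column.
    od -An -c "$1"
}
```

To use it as a `textconv` value, the body would be saved as an
executable script of its own so that it can be run as a command.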
For example, to show the diff of the exif information of a
file instead of the binary information (assuming you have the
exif tool installed), add the following section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file):

------------------------
[diff "jpg"]
	textconv = exif
------------------------

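As with other diff drivers, the `jpg` driver is only used for paths
that have been assigned to it with the `diff` attribute. A matching
`.gitattributes` line might look like this (the `*.jpg` pattern is an
assumption; use whatever pattern covers your files):

```
*.jpg	diff=jpg
```
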
NOTE: The text conversion is generally a one-way conversion;
in this example, we lose the actual image contents and focus
just on the text data. This means that diffs generated by
textconv are _not_ suitable for applying. For this reason,
only `git diff` and the `git log` family of commands (i.e.,
log, whatchanged, show) will perform text conversion. `git
format-patch` will never generate this output. If you want to
send somebody a text-converted diff of a binary file (e.g.,
because it quickly conveys the changes you have made), you
should generate it separately and send it as a comment _in
addition to_ the usual binary diff that you might send.

Because text conversion can be slow, especially when a large
number of conversions are run by `git log -p`, Git provides a mechanism
to cache the output and use it in future diffs. To enable
caching, set the "cachetextconv" variable in your diff driver's
config. For example:

------------------------
[diff "jpg"]
	textconv = exif
	cachetextconv = true
------------------------

This will cache the result of running "exif" on each blob
indefinitely. If you change the textconv config variable for a
diff driver, Git will automatically invalidate the cache entries
and re-run the textconv filter. If you want to invalidate the
cache manually (e.g., because your version of "exif" was updated
and now produces better output), you can remove it
with `git update-ref -d refs/notes/textconv/jpg` (where
"jpg" is the name of the diff driver, as in the example above).

Choosing textconv versus external diff
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you want to show differences between binary or specially-formatted
blobs in your repository, you can choose to use either an external diff
command, or to use textconv to convert them to a diff-able text format.
Which method you choose depends on your exact situation.

The advantage of using an external diff command is flexibility. You are
not bound to find line-oriented changes, nor is it necessary for the
output to resemble unified diff. You are free to locate and report
changes in the most appropriate way for your data format.

A textconv, by comparison, is much more limiting. You provide a
transformation of the data into a line-oriented text format, and Git
uses its regular diff tools to generate the output. There are several
advantages to choosing this method:

1. Ease of use. It is often much simpler to write a binary to text
   transformation than it is to perform your own diff. In many cases,
   existing programs can be used as textconv filters (e.g., exif,
   odt2txt).

2. Git diff features. By performing only the transformation step
   yourself, you can still utilize many of Git's diff features,
   including colorization, word-diff, and combined diffs for merges.

3. Caching. Textconv caching can speed up repeated diffs, such as those
   you might trigger by running `git log -p`.


Marking files as binary
^^^^^^^^^^^^^^^^^^^^^^^

Git usually guesses correctly whether a blob contains text or binary
data by examining the beginning of the contents. However, sometimes you
may want to override its decision, either because a blob contains binary
data later in the file, or because the content, while technically
composed of text characters, is opaque to a human reader. For example,
many PostScript files contain only ASCII characters, but produce noisy
and meaningless diffs.

The simplest way to mark a file as binary is to unset the diff
attribute in the `.gitattributes` file:

------------------------
*.ps -diff
------------------------

This will cause Git to generate `Binary files differ` (or a binary
patch, if binary patches are enabled) instead of a regular diff.

However, one may also want to specify other diff driver attributes. For
example, you might want to use `textconv` to convert PostScript files to
an ASCII representation for human viewing, but otherwise treat them as
binary files. You cannot specify both `-diff` and `diff=ps` attributes.
The solution is to use the `diff.*.binary` config option:

------------------------
[diff "ps"]
	textconv = ps2ascii
	binary = true
------------------------

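With this driver defined, the `.gitattributes` entry names the driver
instead of unsetting `diff`. A minimal sketch, reusing the `*.ps`
pattern from the earlier example:

```
*.ps	diff=ps
```
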
Performing a three-way merge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`merge`
^^^^^^^

The attribute `merge` affects how three versions of a file are
merged when a file-level merge is necessary during `git merge`,
and other commands such as `git revert` and `git cherry-pick`.

Set::

	The built-in 3-way merge driver is used to merge the
	contents in a way similar to the 'merge' command of the `RCS`
	suite. This is suitable for ordinary text files.

Unset::

	Take the version from the current branch as the
	tentative merge result, and declare that the merge has
	conflicts. This is suitable for binary files that do
	not have well-defined merge semantics.

Unspecified::

	By default, this uses the same built-in 3-way merge
	driver as is the case when the `merge` attribute is set.
	However, the `merge.default` configuration variable can name
	a different merge driver to be used with paths for which the
	`merge` attribute is unspecified.

String::

	3-way merge is performed using the specified custom
	merge driver. The built-in 3-way merge driver can be
	explicitly specified by asking for "text" driver; the
	built-in "take the current branch" driver can be
	requested with "binary".


Built-in merge drivers
^^^^^^^^^^^^^^^^^^^^^^

There are a few built-in low-level merge drivers defined that
can be asked for via the `merge` attribute.

text::

	Usual 3-way file level merge for text files. Conflicted
	regions are marked with conflict markers `<<<<<<<`,
	`=======` and `>>>>>>>`. The version from your branch
	appears before the `=======` marker, and the version
	from the merged branch appears after the `=======`
	marker.

binary::

	Keep the version from your branch in the work tree, but
	leave the path in the conflicted state for the user to
	sort out.

union::

	Run 3-way file level merge for text files, but take
	lines from both versions, instead of leaving conflict
	markers. This tends to leave the added lines in the
	resulting file in random order and the user should
	verify the result. Do not use this if you do not
	understand the implications.


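For illustration, a `.gitattributes` file could request each of the
built-in drivers by name as follows. The paths are hypothetical and
only show the syntax; whether `union` is appropriate for a given file
is a judgment call, as noted above.

```
README		merge=text
logo.png	merge=binary
NEWS		merge=union
```
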
Defining a custom merge driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The definition of a merge driver is done in the `.git/config`
file, not in the `gitattributes` file, so strictly speaking this
manual page is the wrong place to talk about it. However...

To define a custom merge driver `filfre`, add a section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this:

----------------------------------------------------------------
[merge "filfre"]
	name = feel-free merge driver
	driver = filfre %O %A %B %L %P
	recursive = binary
----------------------------------------------------------------

The `merge.*.name` variable gives the driver a human-readable
name.

The `merge.*.driver` variable's value is used to construct a
command to run to merge the ancestor's version (`%O`), the current
version (`%A`) and the other branch's version (`%B`). These
three tokens are replaced with the names of temporary files that
hold the contents of these versions when the command line is
built. Additionally, `%L` will be replaced with the conflict marker
size (see below).

The merge driver is expected to leave the result of the merge in
the file named with `%A` by overwriting it, and exit with zero
status if it managed to merge them cleanly, or non-zero if there
were conflicts.

The `merge.*.recursive` variable specifies what other merge
driver to use when the merge driver is called for an internal
merge between common ancestors, when there is more than one.
When left unspecified, the driver itself is used for both
internal merge and the final merge.

The merge driver can learn the pathname in which the merged result
will be stored via placeholder `%P`.


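What the `filfre` command does is entirely up to you. The following
is a minimal sketch of one possible driver, written as a shell
function so that it is easy to try out; its merge policy (accept a
change made on one side only, report a conflict otherwise) is an
assumption for illustration and not how any driver shipped with Git
works. It follows the contract described above: the result is left
in the file named by `%A`, and the exit status tells Git whether the
merge was clean.

```shell
# Hypothetical "filfre" driver, invoked as: filfre %O %A %B %L %P
filfre() {
    ancestor=$1     # %O: the common ancestor's version
    current=$2      # %A: the current version; the result goes here
    other=$3        # %B: the other branch's version
    marker_size=$4  # %L: conflict marker size (unused in this sketch)
    path=$5         # %P: pathname the merged result will be stored in

    # The other side brings nothing new: keep %A as it is.
    if cmp -s "$ancestor" "$other" || cmp -s "$current" "$other"; then
        return 0
    fi

    # Only the other side changed: take its contents as the result.
    if cmp -s "$ancestor" "$current"; then
        cat "$other" >"$current"
        return 0
    fi

    # Both sides changed differently: leave %A alone and report a
    # conflict through a non-zero exit status.
    echo "filfre: both sides changed $path" >&2
    return 1
}
```

Saved as an executable script named `filfre`, this would be enabled
for a path by giving it the `merge=filfre` attribute in
`.gitattributes`.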
`conflict-marker-size`
^^^^^^^^^^^^^^^^^^^^^^

This attribute controls the length of conflict markers left in
the work tree file during a conflicted merge. Only setting
the value to a positive integer has any meaningful effect.

For example, this line in `.gitattributes` can be used to tell the merge
machinery to leave much longer (instead of the usual 7-character-long)
conflict markers when merging the file `Documentation/git-merge.txt`
results in a conflict.

------------------------
Documentation/git-merge.txt	conflict-marker-size=32
------------------------


Checking whitespace errors
~~~~~~~~~~~~~~~~~~~~~~~~~~

`whitespace`
^^^^^^^^^^^^

The `core.whitespace` configuration variable allows you to define what
'diff' and 'apply' should consider whitespace errors for all paths in
the project (see linkgit:git-config[1]). This attribute gives you finer
control per path.

Set::

	Notice all types of potential whitespace errors known to Git.
	The tab width is taken from the value of the `core.whitespace`
	configuration variable.

Unset::

	Do not notice anything as error.

Unspecified::

	Use the value of the `core.whitespace` configuration variable to
	decide what to notice as error.

String::

	Specify a comma-separated list of common whitespace problems to
	notice in the same format as the `core.whitespace` configuration
	variable.


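As an illustration of the String and Unset forms, a `.gitattributes`
file might contain lines like the following. The patterns are
hypothetical; the problem names are the ones `core.whitespace`
accepts, as described in linkgit:git-config[1].

```
# Flag only selected problems in C sources.
*.c	whitespace=trailing-space,space-before-tab,indent-with-non-tab
# Do not flag anything in Markdown files.
*.md	-whitespace
```
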
Creating an archive
~~~~~~~~~~~~~~~~~~~

`export-ignore`
^^^^^^^^^^^^^^^

Files and directories with the attribute `export-ignore` won't be added to
archive files.

`export-subst`
^^^^^^^^^^^^^^

If the attribute `export-subst` is set for a file then Git will expand
several placeholders when adding this file to an archive. The
expansion depends on the availability of a commit ID, i.e., if
linkgit:git-archive[1] has been given a tree instead of a commit or a
tag then no replacement will be done. The placeholders are the same
as those for the option `--pretty=format:` of linkgit:git-log[1],
except that they need to be wrapped like this: `$Format:PLACEHOLDERS$`
in the file. E.g. the string `$Format:%H$` will be replaced by the
commit hash.


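Both attributes are typically set in the top-level `.gitattributes`
file. In the sketch below the file names are assumptions; `VERSION`
stands for any tracked file that contains a `$Format:...$` string
such as `$Format:%H$`.

```
# Leave repository housekeeping files out of archives.
.gitattributes	export-ignore
.gitignore	export-ignore
# Expand $Format:...$ placeholders in this file when archiving.
VERSION		export-subst
```
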
Packing objects
~~~~~~~~~~~~~~~

`delta`
^^^^^^^

Delta compression will not be attempted for blobs for paths with the
attribute `delta` set to false.


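In attribute terms, "set to false" is the Unset state, which is
written with a leading dash. A one-line sketch, with a hypothetical
pattern for files that are already compressed:

```
*.zip	-delta
```
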
Viewing files in GUI tools
~~~~~~~~~~~~~~~~~~~~~~~~~~

`encoding`
^^^^^^^^^^

The value of this attribute specifies the character encoding that should
be used by GUI tools (e.g. linkgit:gitk[1] and linkgit:git-gui[1]) to
display the contents of the relevant file. Note that due to performance
considerations linkgit:gitk[1] does not use this attribute unless you
manually enable per-file encodings in its options.

If this attribute is not set or has an invalid value, the value of the
`gui.encoding` configuration variable is used instead
(See linkgit:git-config[1]).


USING MACRO ATTRIBUTES
----------------------

You do not want any end-of-line conversions applied to, nor textual diffs
produced for, any binary file you track. You would need to specify e.g.

------------
*.jpg -text -diff
------------

but that may become cumbersome, when you have many attributes. Using
macro attributes, you can define an attribute that, when set, also
sets or unsets a number of other attributes at the same time. The
system knows a built-in macro attribute, `binary`:

------------
*.jpg binary
------------

Setting the "binary" attribute also unsets the "text" and "diff"
attributes as above. Note that macro attributes can only be "Set",
though setting one might have the effect of setting or unsetting other
attributes or even returning other attributes to the "Unspecified"
state.


DEFINING MACRO ATTRIBUTES
-------------------------

Custom macro attributes can be defined only in top-level gitattributes
files (`$GIT_DIR/info/attributes`, the `.gitattributes` file at the
top level of the working tree, or the global or system-wide
gitattributes files), not in `.gitattributes` files in working tree
subdirectories. The built-in macro attribute "binary" is equivalent
to:

------------
[attr]binary -diff -merge -text
------------


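A custom macro is written the same way. In the sketch below the macro
name `generated` and the pattern are hypothetical; the macro is
defined once in a top-level file and then used like any other
attribute.

```
# Define the macro (top-level gitattributes file only).
[attr]generated	-diff -merge
# Use it for paths that should get both settings.
*.min.js	generated
```
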
EXAMPLE
-------

If you have these three `gitattributes` files:

----------------------------------------------------------------
(in $GIT_DIR/info/attributes)

a*	foo !bar -baz

(in .gitattributes)
abc	foo bar baz

(in t/.gitattributes)
ab*	merge=filfre
abc	-foo -bar
*.c	frotz
----------------------------------------------------------------

the attributes given to path `t/abc` are computed as follows:

1. By examining `t/.gitattributes` (which is in the same
   directory as the path in question), Git finds that the first
   line matches. `merge` attribute is set. It also finds that
   the second line matches, and attributes `foo` and `bar`
   are unset.

2. Then it examines `.gitattributes` (which is in the parent
   directory), and finds that the first line matches, but
   `t/.gitattributes` file already decided how `merge`, `foo`
   and `bar` attributes should be given to this path, so it
   leaves `foo` and `bar` unset. Attribute `baz` is set.

3. Finally it examines `$GIT_DIR/info/attributes`. This file
   is used to override the in-tree settings. The first line is
   a match, and `foo` is set, `bar` is reverted to unspecified
   state, and `baz` is unset.

As a result, the attribute assignment to `t/abc` becomes:

----------------------------------------------------------------
foo	set to true
bar	unspecified
baz	set to false
merge	set to string value "filfre"
frotz	unspecified
----------------------------------------------------------------


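The computation above can be checked with linkgit:git-check-attr[1].
The sketch below recreates the three files in a scratch repository
(the use of `mktemp` and the variable names are incidental) and asks
for the five attributes of `t/abc`; the path itself does not need to
exist.

```shell
# Build a throwaway repository holding the three example files.
repo=$(mktemp -d)
git init -q "$repo"
mkdir -p "$repo/.git/info" "$repo/t"
printf '%s\n' 'a* foo !bar -baz' >"$repo/.git/info/attributes"
printf '%s\n' 'abc foo bar baz' >"$repo/.gitattributes"
printf '%s\n' 'ab* merge=filfre' 'abc -foo -bar' '*.c frotz' \
    >"$repo/t/.gitattributes"

# Ask Git which state each attribute ends up in for t/abc.
result=$(git -C "$repo" check-attr foo bar baz merge frotz -- t/abc)
printf '%s\n' "$result"
```

In the output, `set` and `unset` correspond to "set to true" and
"set to false" in the table above, and a string value such as
`filfre` is printed as is.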
SEE ALSO
--------
linkgit:git-check-attr[1].

GIT
---
Part of the linkgit:git[1] suite