gitdiffcore(7)
==============

NAME
----
gitdiffcore - Tweaking diff output (June 2005)

SYNOPSIS
--------
git diff *

DESCRIPTION
-----------

The diff commands git-diff-index, git-diff-files, and git-diff-tree
can be told to manipulate differences they find in
unconventional ways before showing diff(1) output. These manipulations
are collectively called "diffcore transformations". This short note
describes what they are and how to use them to produce diff outputs
that are easier to understand than the conventional kind.


The chain of operation
----------------------

The git-diff-* family works by first comparing two sets of
files:

 - git-diff-index compares contents of a "tree" object and the
   working directory (when '\--cached' flag is not used) or a
   "tree" object and the index file (when '\--cached' flag is
   used);

 - git-diff-files compares contents of the index file and the
   working directory;

 - git-diff-tree compares contents of two "tree" objects;

In all of these cases, the commands themselves compare
corresponding paths in the two sets of files. The result of
comparison is passed from these commands to what is internally
called "diffcore", in a format similar to what is output when
the -p option is not used. E.g.

------------------------------------------------
in-place edit  :100644 100644 bcd1234... 0123456... M file0
create         :000000 100644 0000000... 1234567... A file4
delete         :100644 000000 1234567... 0000000... D file5
unmerged       :000000 000000 0000000... 0000000... U file6
------------------------------------------------
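As an illustration of the record format shown above, the following minimal Python sketch splits one such line into its fields. The `FilePair` class and the `parse_raw_line` function are hypothetical names introduced here and are not part of Git. The sketch assumes whitespace-separated fields as printed in this note, paths without embedded spaces, and it discards the descriptive label in the left column.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilePair:
    """One comparison result (hypothetical model, not Git's own structure)."""
    old_mode: str
    new_mode: str
    old_id: str
    new_id: str
    status: str                      # e.g. "M", "A", "D", "U", "R100", "C100"
    src_path: str
    dst_path: Optional[str] = None   # only present for renames and copies


def parse_raw_line(line: str) -> FilePair:
    # Drop everything up to and including the leading colon, which also
    # removes labels such as "in-place edit" used in the example above.
    record = line[line.index(":") + 1:]
    fields = record.split()
    old_mode, new_mode, old_id, new_id, status = fields[:5]
    paths = fields[5:]
    dst_path = paths[1] if len(paths) > 1 else None
    return FilePair(old_mode, new_mode, old_id, new_id, status, paths[0], dst_path)


pair = parse_raw_line("in-place edit  :100644 100644 bcd1234... 0123456... M file0")
print(pair.status, pair.src_path)
```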

The diffcore mechanism is fed a list of such comparison results
(each of which is called a "filepair", although at this point each
of them talks about a single file), and transforms such a list
into another list. There are currently 6 such transformations:

- diffcore-pathspec
- diffcore-break
- diffcore-rename
- diffcore-merge-broken
- diffcore-pickaxe
- diffcore-order

These are applied in sequence. The set of filepairs git-diff-\*
commands find is used as the input to diffcore-pathspec, and
the output from diffcore-pathspec is used as the input to the
next transformation. The final result is then passed to the
output routine and generates either diff-raw format (see Output
format sections of the manual for git-diff-\* commands) or
diff-patch format.
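The "list in, list out" chaining described above can be sketched in a few lines of Python. This is only an illustration of the data flow: `run_chain`, `drop_deletions` and `sort_by_path` are hypothetical names, a filepair is reduced to a `(status, path)` tuple, and the two toy transformations merely stand in for the real ones.

```python
from typing import Callable, List, Tuple

FilePair = Tuple[str, str]            # (status, path): a simplification
Transform = Callable[[List[FilePair]], List[FilePair]]


def run_chain(filepairs: List[FilePair], transforms: List[Transform]) -> List[FilePair]:
    # The output of each transformation becomes the input of the next one.
    for transform in transforms:
        filepairs = transform(filepairs)
    return filepairs


def drop_deletions(pairs: List[FilePair]) -> List[FilePair]:
    # Toy stand-in for a filtering transformation.
    return [pair for pair in pairs if pair[0] != "D"]


def sort_by_path(pairs: List[FilePair]) -> List[FilePair]:
    # Toy stand-in for a reordering transformation.
    return sorted(pairs, key=lambda pair: pair[1])


pairs = [("M", "file0"), ("D", "file5"), ("A", "file4")]
print(run_chain(pairs, [drop_deletions, sort_by_path]))
```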


diffcore-pathspec: For Ignoring Files Outside Our Consideration
---------------------------------------------------------------

The first transformation in the chain is diffcore-pathspec, and
is controlled by giving the pathname parameters to the
git-diff-* commands on the command line. The pathspec is used
to limit the world diff operates in. It removes the filepairs
outside the specified set of pathnames. For example, if the input
set of filepairs included:

------------------------------------------------
:100644 100644 bcd1234... 0123456... M junkfile
------------------------------------------------

but the command invocation was "git diff-files myfile", then the
junkfile entry would be removed from the list because only "myfile"
is under consideration.
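A rough Python sketch of this filtering step follows. The function name `pathspec_filter` is hypothetical, and the matching rule used here (a path is kept if it equals a pathspec or lies under a directory named by one) is a simplifying assumption, not a description of Git's full pathspec matching.

```python
from typing import List


def pathspec_filter(paths: List[str], pathspecs: List[str]) -> List[str]:
    # With no pathspec given, nothing is filtered out.
    if not pathspecs:
        return list(paths)

    def selected(path: str) -> bool:
        # Assumed rule: exact match, or the pathspec names a leading directory.
        return any(
            path == spec or path.startswith(spec.rstrip("/") + "/")
            for spec in pathspecs
        )

    return [path for path in paths if selected(path)]


# Mirrors the example above: only "myfile" is under consideration.
print(pathspec_filter(["junkfile", "myfile"], ["myfile"]))
```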

Implementation note. For performance reasons, git-diff-tree
uses the pathname parameters on the command line to cull the set of
filepairs it feeds the diffcore mechanism itself, and does not
use diffcore-pathspec, but the end result is the same.


diffcore-break: For Splitting Up "Complete Rewrites"
----------------------------------------------------

The second transformation in the chain is diffcore-break, and is
controlled by the -B option to the git-diff-* commands. This is
used to detect a filepair that represents a "complete rewrite" and
break such a filepair into two filepairs that represent a delete and
a create. For example, if the input contained this filepair:

------------------------------------------------
:100644 100644 bcd1234... 0123456... M file0
------------------------------------------------

and if it detects that the file "file0" is completely rewritten,
it changes it to:

------------------------------------------------
:100644 000000 bcd1234... 0000000... D file0
:000000 100644 0000000... 0123456... A file0
------------------------------------------------

For the purpose of breaking a filepair, diffcore-break examines
the extent of changes between the contents of the files before
and after modification (i.e. the contents that have "bcd1234..."
and "0123456..." as their SHA1 content ID, in the above
example). The amount of deletion of original contents and
the amount of insertion of new material are added together, and
if the sum exceeds the "break score", the filepair is broken into
two. The break score defaults to 50% of the size of the smaller of
the original and the result (i.e. if the edit shrinks the file, the
size of the result is used; if the edit lengthens the file, the size
of the original is used), and can be customized by giving a number
after the "-B" option (e.g. "-B75" to tell it to use 75%).
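The break decision described above can be approximated as follows. This is a sketch under stated assumptions: `should_break` is a hypothetical name, sizes are measured in characters, and Python's `difflib` line-level diff stands in for whatever delta estimate Git actually computes.

```python
import difflib


def should_break(old: str, new: str, break_score_pct: int = 50) -> bool:
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)

    deleted = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted += sum(len(line) for line in old_lines[i1:i2])
        if tag in ("insert", "replace"):
            added += sum(len(line) for line in new_lines[j1:j2])

    # The score is taken relative to the smaller of the two versions.
    smaller = min(len(old), len(new))
    return (deleted + added) * 100 > break_score_pct * smaller


original = "".join(f"line{i}\n" for i in range(10))
small_edit = original.replace("line3", "lineX")
rewrite = "".join(f"other{i}\n" for i in range(10))
print(should_break(original, small_edit), should_break(original, rewrite))
```

With these inputs the one-line edit stays a single modification, while the version that shares no lines with the original is broken into a delete and a create.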


diffcore-rename: For Detecting Renames and Copies
-------------------------------------------------

This transformation is used to detect renames and copies, and is
controlled by the -M option (to detect renames) and the -C option
(to detect copies as well) to the git-diff-* commands. If the
input contained these filepairs:

------------------------------------------------
:100644 000000 0123456... 0000000... D fileX
:000000 100644 0000000... 0123456... A file0
------------------------------------------------

and the contents of the deleted file fileX are similar enough to
the contents of the created file file0, then rename detection
merges these filepairs and creates:

------------------------------------------------
:100644 100644 0123456... 0123456... R100 fileX file0
------------------------------------------------
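The pairing of a deleted file with a created file can be sketched as below. Everything here is illustrative: `similarity_pct` and `detect_renames` are hypothetical names, `difflib`'s ratio is used as a stand-in similarity measure rather than Git's own computation, and the greedy best-match strategy is an assumption of the sketch.

```python
import difflib
from typing import Dict, List, Optional, Tuple

Result = Tuple[str, Optional[str], Optional[str]]   # (status, source, destination)


def similarity_pct(a: str, b: str) -> int:
    # Stand-in similarity measure, as an integer percentage.
    return int(100 * difflib.SequenceMatcher(None, a, b, autojunk=False).ratio())


def detect_renames(deleted: Dict[str, str], added: Dict[str, str],
                   min_score_pct: int = 50) -> List[Result]:
    unmatched = dict(deleted)
    result: List[Result] = []
    for dst, new_text in added.items():
        scored = [(similarity_pct(old_text, new_text), src)
                  for src, old_text in unmatched.items()]
        best_score, best_src = max(scored, default=(0, None))
        if best_src is not None and best_score >= min_score_pct:
            # Merge the delete and the create into one rename filepair.
            result.append((f"R{best_score:03d}", best_src, dst))
            del unmatched[best_src]
        else:
            result.append(("A", None, dst))
    # Deleted files that found no partner remain plain deletions.
    result.extend(("D", src, None) for src in unmatched)
    return result


print(detect_renames({"fileX": "hello\nworld\n"}, {"file0": "hello\nworld\n"}))
```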

When the "-C" option is used, the original contents of modified files
and deleted files (and also unmodified files, if the
"\--find-copies-harder" option is used) are considered as candidate
source files in a rename/copy operation. If the input were like
these filepairs, which describe a modified file fileY and a newly
created file file0:

------------------------------------------------
:100644 100644 0123456... 1234567... M fileY
:000000 100644 0000000... bcd3456... A file0
------------------------------------------------

the original contents of fileY and the resulting contents of
file0 are compared, and if they are similar enough, they are
changed to:

------------------------------------------------
:100644 100644 0123456... 1234567... M fileY
:100644 100644 0123456... bcd3456... C100 fileY file0
------------------------------------------------

In both rename and copy detection, the same "extent of changes"
algorithm used in diffcore-break is used to determine if two
files are "similar enough". The threshold can be customized to use
a similarity score different from the default of 50% by giving a
number after the "-M" or "-C" option (e.g. "-M8" to tell it to use
8/10 = 80%).
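The two numeric examples in this note, "-B75" for 75% and "-M8" for 8/10 = 80%, are both consistent with reading the digits as a decimal fraction. The sketch below works under that assumption only; `score_percent` is a hypothetical helper that accepts bare digits and does not model any other form the real option parser may accept.

```python
def score_percent(digits: str) -> float:
    """Read the digits after -B, -M or -C as a decimal fraction, in percent."""
    if not digits.isdigit():
        raise ValueError(f"expected only digits, got {digits!r}")
    # "8" is read as 8/10 and "75" as 75/100.
    return 100 * int(digits) / 10 ** len(digits)


for text in ("8", "75", "50"):
    print(text, score_percent(text))
```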

Note. When the "-C" option is used with the `\--find-copies-harder`
option, git-diff-\* commands feed unmodified filepairs to
the diffcore mechanism as well as modified ones. This lets the copy
detector consider unmodified files as copy source candidates at
the expense of making it slower. Without `\--find-copies-harder`,
git-diff-\* commands can detect copies only if the file that was
copied happened to have been modified in the same changeset.


diffcore-merge-broken: For Putting "Complete Rewrites" Back Together
--------------------------------------------------------------------

This transformation is used to merge filepairs broken by
diffcore-break, and not transformed into rename/copy by
diffcore-rename, back into a single modification. This always
runs when diffcore-break is used.

For the purpose of merging broken filepairs back, it uses a
different "extent of changes" computation from the ones used by
diffcore-break and diffcore-rename. It counts only the deletion
from the original, and does not count insertion. If you removed
only 10 lines from a 100-line document, even if you added 910
new lines to make a new 1000-line document, you did not do a
complete rewrite. diffcore-break breaks such a case in order to
help diffcore-rename consider such filepairs as candidates for
rename/copy detection, but if filepairs broken that way were not
matched with other filepairs to create rename/copy, then this
transformation merges them back into the original
"modification".

The "extent of changes" parameter can be tweaked from the
default 80% (that is, unless more than 80% of the original
material is deleted, the broken pairs are merged back into a
single modification) by giving a second number to the -B option,
like this:

* -B50/60 (give 50% "break score" to diffcore-break, use 60%
  for diffcore-merge-broken).

* -B/60 (the same as above, since diffcore-break defaults to 50%).
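The merge-back test described above counts deletions only, which a short sketch makes concrete. The function name `merges_back` is hypothetical and the unit of measurement (lines here, as in the 100-line example) is a simplification.

```python
def merges_back(original_size: int, deleted: int, merge_score_pct: int = 80) -> bool:
    """Return True if a broken pair is merged back into one modification."""
    # Only material deleted from the original counts; insertions are ignored.
    return deleted * 100 <= merge_score_pct * original_size


# 10 lines removed from a 100-line document: not a complete rewrite,
# no matter how many lines were added.
print(merges_back(original_size=100, deleted=10))

# With -B50/60 the second number lowers the threshold to 60%.
print(merges_back(original_size=100, deleted=70, merge_score_pct=60))
```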

Note that earlier implementations left a broken pair as separate
creation and deletion patches. This was an unnecessary hack, and
the latest implementation always merges all the broken pairs
back into modifications. For such a complete rewrite, however, the
resulting patch output is formatted differently for easier review:
it shows the entire contents of the old version prefixed with '-',
followed by the entire contents of the new version prefixed
with '+'.


diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
---------------------------------------------------------------------

This transformation is used to find filepairs that represent
changes that touch a specified string, and is controlled by the
-S option and the `\--pickaxe-all` option to the git-diff-*
commands.

When diffcore-pickaxe is in use, it checks if there are
filepairs whose "original" side does not have the specified string
and whose "result" side does. Such a filepair represents "the
string appeared in this changeset". It also checks for the
opposite case that loses the specified string.

When `\--pickaxe-all` is not in effect, diffcore-pickaxe leaves
only the filepairs that touch the specified string in its
output. When `\--pickaxe-all` is used, diffcore-pickaxe leaves all
filepairs intact if there is such a filepair, or makes the
output empty otherwise. The latter behaviour is designed to
make reviewing of the changes in the context of the whole
changeset easier.
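The two output modes can be sketched as follows. The `pickaxe` function and its `(path, original, result)` tuples are hypothetical, and the check is the simple presence test described in this note: a filepair "touches" the string when exactly one of its two sides contains it.

```python
from typing import List, Tuple

FilePair = Tuple[str, str, str]     # (path, original contents, result contents)


def pickaxe(filepairs: List[FilePair], needle: str,
            pickaxe_all: bool = False) -> List[FilePair]:
    def touches(pair: FilePair) -> bool:
        _, original, result = pair
        # The string appeared or disappeared if only one side contains it.
        return (needle in original) != (needle in result)

    hits = [pair for pair in filepairs if touches(pair)]
    if pickaxe_all:
        # Keep the whole changeset if anything touched the string.
        return list(filepairs) if hits else []
    return hits


changeset = [
    ("a.c", "int x;\n", "int x;\nint frotz;\n"),
    ("b.c", "old\n", "new\n"),
]
print([path for path, _, _ in pickaxe(changeset, "frotz")])
print([path for path, _, _ in pickaxe(changeset, "frotz", pickaxe_all=True)])
```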


diffcore-order: For Sorting the Output Based on Filenames
---------------------------------------------------------

This is used to reorder the filepairs according to the user's
(or project's) taste, and is controlled by the -O option to the
git-diff-* commands.

This takes a text file in which each line is a shell glob
pattern. Filepairs that match a glob pattern on an earlier line
in the file are output before ones that match a later line, and
filepairs that do not match any glob pattern are output last.

As an example, a typical orderfile for the core git would
probably look like this:

------------------------------------------------
README
Makefile
Documentation
*.h
*.c
t
------------------------------------------------
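The ordering rule can be sketched with Python's `fnmatch` module. The names `order_key` and `apply_order` are hypothetical. The sketch also assumes that a pattern is tried against the full path and against each of its leading directories, so that an entry such as `Documentation` covers the files beneath it; that matching rule is an assumption made for the example, not something this note specifies.

```python
from fnmatch import fnmatchcase
from typing import List


def order_key(path: str, patterns: List[str]) -> int:
    parts = path.split("/")
    # Every leading directory plus the full path,
    # e.g. "t/lib/x.sh" gives "t", "t/lib", "t/lib/x.sh".
    candidates = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    for index, pattern in enumerate(patterns):
        if any(fnmatchcase(candidate, pattern) for candidate in candidates):
            return index
    return len(patterns)            # unmatched paths are output last


def apply_order(paths: List[str], patterns: List[str]) -> List[str]:
    # sorted() is stable, so paths with equal keys keep their input order.
    return sorted(paths, key=lambda path: order_key(path, patterns))


orderfile = ["README", "Makefile", "Documentation", "*.h", "*.c", "t"]
paths = ["t/t0000.sh", "diff.c", "cache.h", "Documentation/git.txt",
         "Makefile", "README", "zzz"]
print(apply_order(paths, orderfile))
```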

SEE ALSO
--------
linkgit:git-diff[1],
linkgit:git-diff-files[1],
linkgit:git-diff-index[1],
linkgit:git-diff-tree[1],
linkgit:git-format-patch[1],
linkgit:git-log[1],
linkgit:gitglossary[7],
link:user-manual.html[The Git User's Manual]

GIT
---
Part of the linkgit:git[1] suite.