Tweaking diff output
====================
June 2005


Introduction
------------

The diff commands git-diff-index, git-diff-files, and
git-diff-tree can be told to manipulate differences they find
in unconventional ways before showing diff(1) output. The
manipulation is collectively called "diffcore transformation".
This short note describes what they are and how to use them to
produce diff outputs that are easier to understand than the
conventional kind.


The chain of operation
----------------------

The git-diff-* family works by first comparing two sets of
files:

 - git-diff-index compares contents of a "tree" object and the
   working directory (when the '\--cached' flag is not used) or a
   "tree" object and the index file (when the '\--cached' flag is
   used);

 - git-diff-files compares contents of the index file and the
   working directory;

 - git-diff-tree compares contents of two "tree" objects.

In all of these cases, the commands themselves compare
corresponding paths in the two sets of files. The result of
the comparison is passed from these commands to what is internally
called "diffcore", in a format similar to what is output when
the -p option is not used. For example:

------------------------------------------------
in-place edit  :100644 100644 bcd1234... 0123456... M file0
create         :000000 100644 0000000... 1234567... A file4
delete         :100644 000000 1234567... 0000000... D file5
unmerged       :000000 000000 0000000... 0000000... U file6
------------------------------------------------

The diffcore mechanism is fed a list of such comparison results
(each of which is called a "filepair", although at this point each
of them talks about a single file), and transforms such a list
into another list. There are currently 6 such transformations:

- diffcore-pathspec
- diffcore-break
- diffcore-rename
- diffcore-merge-broken
- diffcore-pickaxe
- diffcore-order

These are applied in sequence. The set of filepairs that the
git-diff-\* commands find is used as the input to diffcore-pathspec,
and the output from diffcore-pathspec is used as the input to the
next transformation. The final result is then passed to the
output routine and generates either diff-raw format (see Output
format sections of the manual for git-diff-\* commands) or
diff-patch format.
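
The chaining described above can be pictured with a small Python
sketch. It is an illustration only, not how git implements it: the
`run_chain` helper, the `(status, path)` tuples standing in for
filepairs, and the two toy transformations are all made up for this
note.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a filepair: (status letter, path).
Filepair = Tuple[str, str]
Transform = Callable[[List[Filepair]], List[Filepair]]


def run_chain(filepairs: List[Filepair],
              transforms: List[Transform]) -> List[Filepair]:
    """Feed the output of each transformation to the next one."""
    for transform in transforms:
        filepairs = transform(filepairs)
    return filepairs


# Two toy transformations, present only to show the chaining.
def drop_unmerged(pairs: List[Filepair]) -> List[Filepair]:
    return [p for p in pairs if p[0] != "U"]


def sort_by_path(pairs: List[Filepair]) -> List[Filepair]:
    return sorted(pairs, key=lambda p: p[1])


raw = [("M", "file0"), ("U", "file6"), ("D", "file5"), ("A", "file4")]
result = run_chain(raw, [drop_unmerged, sort_by_path])
print(result)
```

Each transformation takes a list of filepairs and returns a new list,
which is what lets them be applied one after another.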


diffcore-pathspec
-----------------

The first transformation in the chain is diffcore-pathspec. It
is controlled by giving the pathname parameters to the
git-diff-* commands on the command line. The pathspec is used
to limit the world diff operates in. It removes the filepairs
outside the specified set of pathnames.

Implementation note. For performance reasons, git-diff-tree
uses the pathname parameters on the command line to cull the set of
filepairs it feeds the diffcore mechanism itself, and does not
use diffcore-pathspec, but the end result is the same.
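
As a rough sketch of what "removing the filepairs outside the
specified set of pathnames" means, the Python below keeps a path only
if it equals one of the given pathnames or lies under it as a
directory. That matching rule, the function name
`limit_to_pathspec`, and the sample paths are assumptions made for
illustration.

```python
from typing import List


def limit_to_pathspec(paths: List[str], pathspec: List[str]) -> List[str]:
    """Keep only the paths inside the given set of pathnames.

    An empty pathspec means "no limit", so everything is kept.
    """
    if not pathspec:
        return list(paths)

    def matches(path: str) -> bool:
        for spec in pathspec:
            spec = spec.rstrip("/")
            # Exact file name, or a leading directory of the path.
            if path == spec or path.startswith(spec + "/"):
                return True
        return False

    return [p for p in paths if matches(p)]


paths = [
    "Makefile",
    "Documentation/diffcore.txt",
    "Documentation-old/notes.txt",
    "diff.c",
]
kept = limit_to_pathspec(paths, ["Documentation", "diff.c"])
print(kept)
```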


diffcore-break
--------------

The second transformation in the chain is diffcore-break, and is
controlled by the -B option to the git-diff-* commands. This is
used to detect a filepair that represents a "complete rewrite" and
break such a filepair into two filepairs that represent a delete and
a create. For example, if the input contained this filepair:

------------------------------------------------
:100644 100644 bcd1234... 0123456... M file0
------------------------------------------------

and if it detects that the file "file0" is completely rewritten,
it changes it to:

------------------------------------------------
:100644 000000 bcd1234... 0000000... D file0
:000000 100644 0000000... 0123456... A file0
------------------------------------------------

For the purpose of breaking a filepair, diffcore-break examines
the extent of changes between the contents of the files before
and after modification (i.e. the contents that have "bcd1234..."
and "0123456..." as their SHA1 content ID, in the above
example). The amount of deletion of original contents and
the amount of insertion of new material are added together, and
if the sum exceeds the "break score", the filepair is broken into
two. The break score defaults to 50% of the size of the smaller
of the original and the result (i.e. if the edit shrinks the file,
the size of the result is used; if the edit lengthens the file,
the size of the original is used), and can be customized by giving
a number after the "-B" option (e.g. "-B75" to tell it to use 75%).
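
The break rule above amounts to a single comparison. The sketch below
restates it in Python; the function name `should_break` and the choice
to count everything in bytes are assumptions for illustration, and the
way the deleted and inserted amounts are measured is left out.

```python
def should_break(src_size: int, dst_size: int,
                 deleted: int, inserted: int,
                 break_score: int = 50) -> bool:
    """Return True if the filepair counts as a complete rewrite.

    break_score is a percentage of the smaller of the two sizes.
    """
    base = min(src_size, dst_size)
    # Integer form of: (deleted + inserted) > base * break_score / 100
    return (deleted + inserted) * 100 > break_score * base


# A 1000-byte file where 300 bytes were removed and 300 added.
print(should_break(1000, 1000, 300, 300))                  # default 50%
print(should_break(1000, 1000, 300, 300, break_score=75))  # like -B75
```

With the default score the 600 bytes of change exceed 50% of 1000, so
the pair is broken; with 75% they do not.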


diffcore-rename
---------------

This transformation is used to detect renames and copies, and is
controlled by the -M option (to detect renames) and the -C option
(to detect copies as well) to the git-diff-* commands. If the
input contained these filepairs:

------------------------------------------------
:100644 000000 0123456... 0000000... D fileX
:000000 100644 0000000... 0123456... A file0
------------------------------------------------

and the contents of the deleted file fileX are similar enough to
the contents of the created file file0, then rename detection
merges these filepairs and creates:

------------------------------------------------
:100644 100644 0123456... 0123456... R100 fileX file0
------------------------------------------------

When the "-C" option is used, the original contents of modified
files and the contents of unchanged files are considered as
candidate source files for the rename/copy operation, in
addition to the deleted files. If the input contained these
filepairs, which describe a modified file fileY and a newly
created file file0:

------------------------------------------------
:100644 100644 0123456... 1234567... M fileY
:000000 100644 0000000... 0123456... A file0
------------------------------------------------

the original contents of fileY and the resulting contents of
file0 are compared, and if they are similar enough, they are
changed to:

------------------------------------------------
:100644 100644 0123456... 1234567... M fileY
:100644 100644 0123456... 0123456... C100 fileY file0
------------------------------------------------

In both rename and copy detection, the same "extent of changes"
algorithm used in diffcore-break is used to determine if two
files are "similar enough". The similarity score can be changed
from the default 50% by giving a number after the "-M" or "-C"
option (e.g. "-M8" to tell it to use 8/10 = 80%).

Note. When the "-C" option is used with the `\--find-copies-harder`
option, git-diff-\* commands feed unmodified filepairs to the
diffcore mechanism as well as modified ones. This lets the copy
detector consider unmodified files as copy source candidates at
the expense of making it slower. Without `\--find-copies-harder`,
git-diff-\* commands can detect copies only if the file that was
copied happened to have been modified in the same changeset.
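
The simplest case of rename detection, where the deleted and the
created file have identical contents (the R100 example above), can be
sketched in Python as below. This covers only that exact-match case
and does not attempt the similarity scoring; the `Filepair` type, the
function name and the truncated IDs are illustrative assumptions.

```python
from typing import Dict, List, NamedTuple


class Filepair(NamedTuple):
    old_id: str         # content ID of the original side
    new_id: str         # content ID of the result side
    status: str         # "A", "D", "M", "R100", ...
    path: str           # source path
    new_path: str = ""  # destination path, for renames only


def detect_exact_renames(pairs: List[Filepair]) -> List[Filepair]:
    """Merge a delete and a create with identical contents into a rename."""
    # Deleted files, grouped by the content they used to have.
    sources: Dict[str, List[Filepair]] = {}
    for p in pairs:
        if p.status == "D":
            sources.setdefault(p.old_id, []).append(p)

    consumed = set()                    # paths of deletions that were used up
    renamed: Dict[str, Filepair] = {}   # created path -> rename filepair
    for p in pairs:
        if p.status == "A" and sources.get(p.new_id):
            src = sources[p.new_id].pop(0)
            consumed.add(src.path)
            renamed[p.path] = Filepair(src.old_id, p.new_id, "R100",
                                       src.path, p.path)

    result = []
    for p in pairs:
        if p.status == "D" and p.path in consumed:
            continue  # folded into a rename
        result.append(renamed.get(p.path, p) if p.status == "A" else p)
    return result


pairs = [
    Filepair("0123456", "0000000", "D", "fileX"),
    Filepair("0000000", "0123456", "A", "file0"),
    Filepair("bcd1234", "1234567", "M", "file1"),
]
result = detect_exact_renames(pairs)
for fp in result:
    print(fp)
```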


diffcore-merge-broken
---------------------

This transformation is used to merge filepairs that were broken by
diffcore-break, and were not transformed into rename/copy by
diffcore-rename, back into a single modification. This always
runs when diffcore-break is used.

For the purpose of merging broken filepairs back, it uses a
different "extent of changes" computation from the ones used by
diffcore-break and diffcore-rename. It counts only the deletion
from the original, and does not count insertion. If you removed
only 10 lines from a 100-line document, even if you added 910
new lines to make a new 1000-line document, you did not do a
complete rewrite. diffcore-break breaks such a case in order to
help diffcore-rename consider such filepairs as candidates for
rename/copy detection, but if filepairs broken that way were not
matched with other filepairs to create rename/copy, then this
transformation merges them back into the original
"modification".

The "extent of changes" parameter can be tweaked from the
default 80% (that is, unless more than 80% of the original
material is deleted, the broken pairs are merged back into a
single modification) by giving a second number to the -B option,
like these:

* -B50/60 (give 50% "break score" to diffcore-break, use 60%
  for diffcore-merge-broken).

* -B/60 (the same as above, since diffcore-break defaults to 50%).

Note that an earlier implementation left a broken pair as separate
creation and deletion patches. This was an unnecessary hack, and
the latest implementation always merges all the broken pairs
back into modifications, but the resulting patch output is
formatted differently to still make reviewing easier for such
a complete rewrite, by showing the entire contents of the old
version prefixed with '-', followed by the entire contents of the
new version prefixed with '+'.
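
The two-number form of -B and the merge-back rule described in this
section can be sketched in Python as follows. This is an illustration
under stated assumptions: both function names are made up for this
note, sizes are counted in lines as in the 100-line example, and the
numbers are read as plain two-digit percentages as in "-B50/60" (how
other forms are read is not covered here).

```python
from typing import Tuple


def parse_break_option(arg: str) -> Tuple[int, int]:
    """Split the text after -B into (break score, merge score).

    A missing number falls back to the defaults of 50% and 80%.
    """
    break_part, _, merge_part = arg.partition("/")
    break_score = int(break_part) if break_part else 50
    merge_score = int(merge_part) if merge_part else 80
    return break_score, merge_score


def stays_broken(original_size: int, deleted: int,
                 merge_score: int = 80) -> bool:
    """Return True only if more than merge_score% of the original is gone.

    Insertions are deliberately ignored, unlike in diffcore-break.
    """
    return deleted * 100 > merge_score * original_size


print(parse_break_option("50/60"))
print(parse_break_option("/60"))
# 10 lines removed from a 100-line document: merged back.
print(stays_broken(100, 10))
```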


diffcore-pickaxe
----------------

This transformation is used to find filepairs that represent
changes that touch a specified string, and is controlled by the
-S option and the `\--pickaxe-all` option to the git-diff-*
commands.

When diffcore-pickaxe is in use, it checks if there are
filepairs whose "original" side does not have the specified string
and whose "result" side does. Such a filepair represents "the
string appeared in this changeset". It also checks for the
opposite case that loses the specified string.

When `\--pickaxe-all` is not in effect, diffcore-pickaxe leaves
only those filepairs that touch the specified string in its
output. When `\--pickaxe-all` is used, diffcore-pickaxe leaves all
filepairs intact if there is such a filepair, or makes the
output empty otherwise. The latter behaviour is designed to
make reviewing of the changes in the context of the whole
changeset easier.
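
A minimal Python sketch of this filtering is shown below. It treats a
filepair as "touching" the string when exactly one of its two sides
contains it. The `(path, old_text, new_text)` tuples and the function
names are assumptions made for illustration, not git's internal
representation.

```python
from typing import List, Tuple

# Hypothetical filepair: (path, original contents, resulting contents).
Filepair = Tuple[str, str, str]


def touches(old_text: str, new_text: str, needle: str) -> bool:
    """True if the string appears on exactly one side of the filepair."""
    return (needle in old_text) != (needle in new_text)


def pickaxe(filepairs: List[Filepair], needle: str,
            pickaxe_all: bool = False) -> List[Filepair]:
    hits = [fp for fp in filepairs if touches(fp[1], fp[2], needle)]
    if pickaxe_all:
        # Keep the whole changeset if anything in it touches the string.
        return list(filepairs) if hits else []
    return hits


changeset = [
    ("a.c", "int x;\n", "int x;\nint frotz;\n"),    # string appears
    ("b.c", "void f(void);\n", "void g(void);\n"),  # unrelated change
]
print(pickaxe(changeset, "frotz"))
print(pickaxe(changeset, "frotz", pickaxe_all=True))
print(pickaxe(changeset, "nitfol", pickaxe_all=True))
```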


diffcore-order
--------------

This is used to reorder the filepairs according to the user's
(or project's) taste, and is controlled by the -O option to the
git-diff-* commands.

This takes a text file, each line of which is a shell glob
pattern. Filepairs that match a glob pattern on an earlier line
in the file are output before ones that match a later line, and
filepairs that do not match any glob pattern are output last.

As an example, a typical orderfile for the core GIT would
probably look like this:

------------------------------------------------
README
Makefile
Documentation
*.h
*.c
t
------------------------------------------------
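
The ordering rule can be sketched in Python with the standard
`fnmatch` module. One assumption is made so that directory entries
such as "Documentation" and "t" work: a pattern is taken to match a
path if it matches the full path or any of its leading directories.
The function names and the sample paths are illustrative.

```python
from fnmatch import fnmatchcase
from typing import List


def order_rank(path: str, patterns: List[str]) -> int:
    """Index of the first orderfile line matching path, or len(patterns)."""
    parts = path.split("/")
    # "a/b/c" -> ["a", "a/b", "a/b/c"]
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    for rank, pattern in enumerate(patterns):
        if any(fnmatchcase(prefix, pattern) for prefix in prefixes):
            return rank
    return len(patterns)  # no match: output last


def reorder(paths: List[str], patterns: List[str]) -> List[str]:
    # sorted() is stable, so paths with the same rank keep their order.
    return sorted(paths, key=lambda p: order_rank(p, patterns))


orderfile = ["README", "Makefile", "Documentation", "*.h", "*.c", "t"]
paths = [
    "t/t0000-basic.sh",
    "diff.c",
    "COPYING",
    "Documentation/diffcore.txt",
    "cache.h",
    "Makefile",
    "README",
]
ordered = reorder(paths, orderfile)
for p in ordered:
    print(p)
```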