Tweaking diff output
====================
June 2005


Introduction
------------

The diff commands git-diff-index, git-diff-files, git-diff-tree, and
git-diff-stages can be told to manipulate the differences they find in
unconventional ways before showing diff(1) output. These manipulations
are collectively called "diffcore transformations". This short note
describes what they are and how to use them to produce diff output
that is easier to understand than the conventional kind.


The chain of operation
----------------------

The git-diff-* family works by first comparing two sets of
files:

 - git-diff-index compares contents of a "tree" object and the
   working directory (when '\--cached' flag is not used) or a
   "tree" object and the index file (when '\--cached' flag is
   used);

 - git-diff-files compares contents of the index file and the
   working directory;

 - git-diff-tree compares contents of two "tree" objects;

 - git-diff-stages compares contents of blobs at two stages in an
   unmerged index file.

In all of these cases, the commands themselves compare
corresponding paths in the two sets of files. The result of
comparison is passed from these commands to what is internally
called "diffcore", in a format similar to what is output when
the -p option is not used. E.g.

------------------------------------------------
in-place edit  :100644 100644 bcd1234... 0123456... M file0
create         :000000 100644 0000000... 1234567... A file4
delete         :100644 000000 1234567... 0000000... D file5
unmerged       :000000 000000 0000000... 0000000... U file6
------------------------------------------------

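To make the shape of one comparison result concrete, here is a minimal Python sketch that parses a raw line like the ones above into a record. The `FilePair` type and `parse_raw_line` function are hypothetical names used only for illustration. The sketch assumes whitespace-separated fields as printed in this note, so paths containing spaces are not handled.

```python
from typing import NamedTuple, Tuple


class FilePair(NamedTuple):
    """Illustrative record for one raw diff line (not a git data structure)."""
    old_mode: str
    new_mode: str
    old_id: str
    new_id: str
    status: str              # e.g. "M", "A", "D", "U", or "R100"
    paths: Tuple[str, ...]   # one path, or (source, destination)


def parse_raw_line(line: str) -> FilePair:
    """Parse a line such as ':100644 100644 bcd1234... 0123456... M file0'."""
    if not line.startswith(":"):
        raise ValueError("not a raw diff line: %r" % line)
    # Drop the leading colon, then split on whitespace (assumption: no
    # spaces inside path names).
    old_mode, new_mode, old_id, new_id, status, *paths = line[1:].split()
    return FilePair(old_mode, new_mode, old_id, new_id, status, tuple(paths))
```
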
The diffcore mechanism is fed a list of such comparison results
(each of which is called a "filepair", although at this point each
of them talks about a single file), and transforms such a list
into another list. There are currently 6 such transformations:

- diffcore-pathspec
- diffcore-break
- diffcore-rename
- diffcore-merge-broken
- diffcore-pickaxe
- diffcore-order

These are applied in sequence. The set of filepairs that the
git-diff-\* commands find is used as the input to diffcore-pathspec, and
the output from diffcore-pathspec is used as the input to the
next transformation. The final result is then passed to the
output routine and generates either diff-raw format (see Output
format sections of the manual for git-diff-\* commands) or
diff-patch format.

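The "applied in sequence" idea can be sketched as a chain of functions, each taking a list of filepairs and returning a new list. This is only an illustration in Python. The `run_diffcore` helper and the two stand-in transformations are hypothetical, and plain path strings stand in for filepairs.

```python
from functools import reduce


def run_diffcore(filepairs, transformations):
    """Feed the output of each transformation to the next one, in order."""
    return reduce(lambda pairs, transform: transform(pairs),
                  transformations, list(filepairs))


# Hypothetical stand-ins for real transformations.
def keep_c_files(pairs):
    return [p for p in pairs if p.endswith(".c")]


def sort_by_name(pairs):
    return sorted(pairs)


result = run_diffcore(["b.c", "README", "a.c"], [keep_c_files, sort_by_name])
```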

diffcore-pathspec: For Ignoring Files Outside Our Consideration
---------------------------------------------------------------

The first transformation in the chain is diffcore-pathspec, and
is controlled by giving the pathname parameters to the
git-diff-* commands on the command line. The pathspec is used
to limit the world diff operates in. It removes the filepairs
outside the specified set of pathnames. E.g. if the input set
of filepairs included:

------------------------------------------------
:100644 100644 bcd1234... 0123456... M junkfile
------------------------------------------------

but the command invocation was "git-diff-files myfile", then the
junkfile entry would be removed from the list because only "myfile"
is under consideration.

Implementation note. For performance reasons, git-diff-tree
uses the pathname parameters on the command line to cull the set of
filepairs it feeds the diffcore mechanism itself, and does not
use diffcore-pathspec, but the end result is the same.

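The filtering described in this section could look roughly like the following Python sketch. Filepairs are modelled as dictionaries with a hypothetical "path" key. Treating a pathname parameter as a leading directory as well as an exact name is an assumption of the sketch, since the note itself only shows the exact-name case.

```python
def path_matches(path, specs):
    """True if path equals a spec, or lies under a spec read as a directory."""
    for spec in specs:
        prefix = spec.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            return True
    return False


def diffcore_pathspec(filepairs, specs):
    """Drop filepairs whose path is outside the given pathname parameters."""
    if not specs:
        # No pathname parameters: nothing is filtered out.
        return list(filepairs)
    return [fp for fp in filepairs if path_matches(fp["path"], specs)]
```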

diffcore-break: For Splitting Up "Complete Rewrites"
----------------------------------------------------

The second transformation in the chain is diffcore-break, and is
controlled by the -B option to the git-diff-* commands. This is
used to detect a filepair that represents a "complete rewrite" and
break such a filepair into two filepairs that represent delete and
create. E.g. if the input contained this filepair:

------------------------------------------------
:100644 100644 bcd1234... 0123456... M file0
------------------------------------------------

and if it detects that the file "file0" is completely rewritten,
it changes it to:

------------------------------------------------
:100644 000000 bcd1234... 0000000... D file0
:000000 100644 0000000... 0123456... A file0
------------------------------------------------

For the purpose of breaking a filepair, diffcore-break examines
the extent of changes between the contents of the files before
and after modification (i.e. the contents that have "bcd1234..."
and "0123456..." as their SHA1 content ID, in the above
example). The amounts of deletion of original contents and
insertion of new material are added together, and if the sum exceeds
the "break score", the filepair is broken into two. The break
score defaults to 50% of the size of the smaller of the original
and the result (i.e. if the edit shrinks the file, the size of
the result is used; if the edit lengthens the file, the size of
the original is used), and can be customized by giving a number
after the "-B" option (e.g. "-B75" to tell it to use 75%).

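The break decision above can be approximated with a short Python sketch. It counts deleted and inserted lines with the standard `difflib` module, which is a stand-in chosen for illustration and not the computation git itself performs. The function names and the line-based unit of size are assumptions.

```python
import difflib


def extent_of_changes(old_lines, new_lines):
    """Return (deleted, inserted) line counts between two versions."""
    deleted = inserted = 0
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines,
                                      autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted += i2 - i1
        if tag in ("insert", "replace"):
            inserted += j2 - j1
    return deleted, inserted


def should_break(old_lines, new_lines, break_percent=50):
    """Break when deletion plus insertion exceeds the break score.

    The score is break_percent of the smaller of the two versions.
    """
    deleted, inserted = extent_of_changes(old_lines, new_lines)
    threshold = min(len(old_lines), len(new_lines)) * break_percent / 100
    return deleted + inserted > threshold
```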

diffcore-rename: For Detecting Renames and Copies
-------------------------------------------------

This transformation is used to detect renames and copies, and is
controlled by the -M option (to detect renames) and the -C option
(to detect copies as well) to the git-diff-* commands. If the
input contained these filepairs:

------------------------------------------------
:100644 000000 0123456... 0000000... D fileX
:000000 100644 0000000... 0123456... A file0
------------------------------------------------

and the contents of the deleted file fileX are similar enough to
the contents of the created file file0, then rename detection
merges these filepairs and creates:

------------------------------------------------
:100644 100644 0123456... 0123456... R100 fileX file0
------------------------------------------------

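As a rough illustration of merging a delete and a create into a rename, the Python sketch below pairs each created file with the most similar deleted file. The similarity measure is `difflib.SequenceMatcher.ratio()`, a stand-in and not git's own scoring, and the greedy pairing order and the function names are assumptions made for the example.

```python
import difflib


def similarity(old_text, new_text):
    """Stand-in similarity between 0.0 and 1.0 (not git's algorithm)."""
    return difflib.SequenceMatcher(None, old_text, new_text,
                                   autojunk=False).ratio()


def detect_renames(deleted, created, minimum=0.5):
    """Pair deleted and created files whose contents are similar enough.

    deleted and created map path -> contents. Returns a list of
    (old_path, new_path, score_percent) plus the unmatched leftovers.
    """
    renames = []
    remaining_deleted = dict(deleted)
    remaining_created = dict(created)
    for new_path, new_text in created.items():
        best = None
        for old_path, old_text in remaining_deleted.items():
            score = similarity(old_text, new_text)
            if score >= minimum and (best is None or score > best[1]):
                best = (old_path, score)
        if best is not None:
            old_path, score = best
            renames.append((old_path, new_path, int(score * 100)))
            del remaining_deleted[old_path]
            del remaining_created[new_path]
    return renames, remaining_deleted, remaining_created
```
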
When the "-C" option is used, the original contents of modified files
and deleted files (and also unmodified files, if the
"\--find-copies-harder" option is used) are considered as candidates
for the source file in a rename/copy operation. If the input were like
these filepairs, which talk about a modified file fileY and a newly
created file file0:

------------------------------------------------
:100644 100644 0123456... 1234567... M fileY
:000000 100644 0000000... bcd3456... A file0
------------------------------------------------

the original contents of fileY and the resulting contents of
file0 are compared, and if they are similar enough, they are
changed to:

------------------------------------------------
:100644 100644 0123456... 1234567... M fileY
:100644 100644 0123456... bcd3456... C100 fileY file0
------------------------------------------------

In both rename and copy detection, the same "extent of changes"
algorithm used in diffcore-break is used to determine if two
files are "similar enough". The detection can be customized to use
a similarity score different from the default of 50% by giving a
number after the "-M" or "-C" option (e.g. "-M8" to tell it to use
8/10 = 80%).

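The two examples in this note ("-M8" for 80% and "-B75" for 75%) are both consistent with reading the digits as a decimal fraction. The Python sketch below implements that reading. It is an assumption drawn from those two examples only, the `parse_score` name is hypothetical, and the real option parser may accept other forms.

```python
def parse_score(digits, default=0.5):
    """Read the digits after -M, -C or -B as a decimal fraction.

    '8' becomes 0.8 and '75' becomes 0.75; an empty string selects the
    default.
    """
    if digits == "":
        return default
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected digits, got %r" % digits)
    return int(digits) / 10 ** len(digits)
```
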
Note. When the "-C" option is used with the `\--find-copies-harder`
option, the git-diff-\* commands feed unmodified filepairs to
the diffcore mechanism as well as modified ones. This lets the copy
detector consider unmodified files as copy source candidates at
the expense of making it slower. Without `\--find-copies-harder`,
the git-diff-\* commands can detect copies only if the file that was
copied happened to have been modified in the same changeset.


diffcore-merge-broken: For Putting "Complete Rewrites" Back Together
--------------------------------------------------------------------

This transformation is used to merge filepairs broken by
diffcore-break, and not transformed into rename/copy by
diffcore-rename, back into a single modification. This always
runs when diffcore-break is used.

For the purpose of merging broken filepairs back, it uses a
different "extent of changes" computation from the ones used by
diffcore-break and diffcore-rename. It counts only the deletion
from the original, and does not count insertion. If you removed
only 10 lines from a 100-line document, even if you added 910
new lines to make a new 1000-line document, you did not do a
complete rewrite. diffcore-break breaks such a case in order to
help diffcore-rename consider such filepairs as candidates for
rename/copy detection, but if filepairs broken that way were not
matched with other filepairs to create rename/copy, then this
transformation merges them back into the original
"modification".

The "extent of changes" parameter can be tweaked from the
default 80% (that is, unless more than 80% of the original
material is deleted, the broken pairs are merged back into a
single modification) by giving a second number to the -B option,
like these:

* -B50/60 (give 50% "break score" to diffcore-break, use 60%
  for diffcore-merge-broken).

* -B/60 (the same as above, since diffcore-break defaults to 50%).

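The two-number form of -B and the merge-back rule can be sketched in Python as follows. The function names are hypothetical, the digits are read as decimal fractions (an assumption that agrees with the examples in this note), and sizes are counted in lines purely for illustration.

```python
def _fraction(digits, default):
    """'50' becomes 0.5 and '60' becomes 0.6; empty selects the default."""
    return int(digits) / 10 ** len(digits) if digits else default


def parse_break_option(arg):
    """Split the text after -B, e.g. '50/60' or '/60', into two scores.

    Returns (break_score, merge_score), defaulting to 50% and 80%.
    """
    first, _, second = arg.partition("/")
    return _fraction(first, 0.5), _fraction(second, 0.8)


def should_merge_back(original_size, deleted, merge_score=0.8):
    """Merge unless more than merge_score of the original was deleted.

    Only deletion is counted; insertion is ignored.
    """
    return deleted <= original_size * merge_score
```

With the example from the text, removing 10 lines from a 100-line document gives `should_merge_back(100, 10)`, which is true no matter how many lines were added.
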
Note that earlier implementations left a broken pair as separate
creation and deletion patches. This was an unnecessary hack, and
the latest implementation always merges all the broken pairs
back into modifications, but the resulting patch output is
formatted differently for easier review in case of such
a complete rewrite by showing the entire contents of the old version
prefixed with '-', followed by the entire contents of the new
version prefixed with '+'.


diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
---------------------------------------------------------------------

This transformation is used to find filepairs that represent
changes that touch a specified string, and is controlled by the
-S option and the `\--pickaxe-all` option to the git-diff-*
commands.

When diffcore-pickaxe is in use, it checks if there are
filepairs whose "original" side does not have the specified string
and whose "result" side does. Such a filepair represents "the
string appeared in this changeset". It also checks for the
opposite case that loses the specified string.

When `\--pickaxe-all` is not in effect, diffcore-pickaxe leaves
only such filepairs that touch the specified string in its
output. When `\--pickaxe-all` is used, diffcore-pickaxe leaves all
filepairs intact if there is such a filepair, or makes the
output empty otherwise. The latter behaviour is designed to
make reviewing of the changes in the context of the whole
changeset easier.

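The selection rule described above can be sketched in Python as follows. Filepairs are modelled as hypothetical `(path, old_text, new_text)` tuples, and "touching" the string is taken to mean that it is present on exactly one side, which is how this note describes it.

```python
def touches(old_text, new_text, needle):
    """True if the string appears on exactly one side of the filepair."""
    return (needle in old_text) != (needle in new_text)


def diffcore_pickaxe(filepairs, needle, pickaxe_all=False):
    """Keep the filepairs that gain or lose the needle.

    With pickaxe_all, keep every filepair if at least one touches the
    needle, and return an empty list otherwise.
    """
    hits = [fp for fp in filepairs if touches(fp[1], fp[2], needle)]
    if pickaxe_all:
        return list(filepairs) if hits else []
    return hits
```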

diffcore-order: For Sorting the Output Based on Filenames
---------------------------------------------------------

This is used to reorder the filepairs according to the user's
(or project's) taste, and is controlled by the -O option to the
git-diff-* commands.

This takes a text file each of whose lines is a shell glob
pattern. Filepairs that match a glob pattern on an earlier line
in the file are output before ones that match a later line, and
filepairs that do not match any glob pattern are output last.

As an example, a typical orderfile for the core git probably
would look like this:

------------------------------------------------
README
Makefile
Documentation
*.h
*.c
t
------------------------------------------------

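The ordering rule can be sketched in Python with the standard `fnmatch` module. The function names are hypothetical. Matching a pattern against each leading directory of a path (so that "Documentation" or "t" catches files below those directories) is an assumption of the sketch, and `fnmatch` globbing may differ in detail from the matching git performs.

```python
from fnmatch import fnmatchcase


def matches(path, pattern):
    """Match the pattern against the path and each leading directory."""
    parts = path.split("/")
    return any(fnmatchcase("/".join(parts[:n]), pattern)
               for n in range(1, len(parts) + 1))


def order_key(path, patterns):
    """Index of the first matching pattern; unmatched paths sort last."""
    for index, pattern in enumerate(patterns):
        if matches(path, pattern):
            return index
    return len(patterns)


def diffcore_order(paths, patterns):
    """Stable sort, so paths under the same pattern keep their input order."""
    return sorted(paths, key=lambda p: order_key(p, patterns))
```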