Commit | Line | Data |
---|---|---|
30eba7bf CC |
1 | gitdiffcore(7) |
2 | ============== | |
4a1332d0 | 3 | |
30eba7bf CC |
4 | NAME |
5 | ---- | |
dddfb3f1 | 6 | gitdiffcore - Tweaking diff output |
4a1332d0 | 7 | |
30eba7bf CC |
8 | SYNOPSIS |
9 | -------- | |
7791a1d9 | 10 | [verse] |
0979c106 | 11 | 'git diff' * |
30eba7bf CC |
12 | |
13 | DESCRIPTION | |
14 | ----------- | |
4a1332d0 | 15 | |
0b444cdb | 16 | The diff commands 'git diff-index', 'git diff-files', and 'git diff-tree' |
4cc41a16 | 17 | can be told to manipulate differences they find in |
2fd02c92 | 18 | unconventional ways before showing 'diff' output. The manipulation |
59df2a11 | 19 | is collectively called "diffcore transformation". This short note |
0979c106 | 20 | describes what they are and how to use them to produce 'diff' output |
877276d4 | 21 | that is easier to understand than the conventional kind. |
4a1332d0 JH |
22 | |
23 | ||
24 | The chain of operation | |
25 | ---------------------- | |
26 | ||
0b444cdb | 27 | The 'git diff-{asterisk}' family works by first comparing two sets of |
4a1332d0 JH |
28 | files: |
29 | ||
0b444cdb | 30 | - 'git diff-index' compares contents of a "tree" object and the |
bcf9626a MM |
31 | working directory (when `--cached` flag is not used) or a |
32 | "tree" object and the index file (when `--cached` flag is | |
4a1332d0 JH |
33 | used); |
34 | ||
0b444cdb | 35 | - 'git diff-files' compares contents of the index file and the |
4a1332d0 JH |
36 | working directory; |
37 | ||
0b444cdb | 38 | - 'git diff-tree' compares contents of two "tree" objects; |
59df2a11 | 39 | |
264e0b9a YD |
40 | In all of these cases, the commands themselves first optionally limit |
41 | the two sets of files by any pathspecs given on their command-lines, | |
42 | and compare corresponding paths in the two resulting sets of files. | |
43 | ||
44 | The pathspecs are used to limit the world diff operates in. They remove | |
45 | the filepairs outside the specified sets of pathnames. E.g., if the |
46 | input set of filepairs included: | |
47 | ||
48 | ------------------------------------------------ | |
49 | :100644 100644 bcd1234... 0123456... M junkfile | |
50 | ------------------------------------------------ | |
51 | ||
52 | but the command invocation was `git diff-files myfile`, then the | |
53 | junkfile entry would be removed from the list because only "myfile" | |
54 | is under consideration. | |
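
The pathspec limiting described above can be pictured with a small Python model. This is only a sketch: the `FilePair` tuple and the `limit_by_pathspec` helper are hypothetical names, and the matching rule (exact path, or a leading directory) is a simplifying assumption that ignores the glob and "magic" forms real pathspecs support.

```python
from typing import NamedTuple


class FilePair(NamedTuple):
    status: str  # e.g. "M", "A", "D"
    path: str


def limit_by_pathspec(pairs, pathspecs):
    """Keep only the filepairs whose path falls under one of the pathspecs."""
    if not pathspecs:
        # No pathspec given: the whole set of filepairs stays in play.
        return list(pairs)

    def matches(path):
        return any(
            path == spec or path.startswith(spec.rstrip("/") + "/")
            for spec in pathspecs
        )

    return [pair for pair in pairs if matches(pair.path)]


pairs = [FilePair("M", "junkfile"), FilePair("M", "myfile")]
print(limit_by_pathspec(pairs, ["myfile"]))
```

With `myfile` as the only pathspec, the `junkfile` entry is dropped and only the `myfile` filepair remains.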
55 | ||
56 | The result of comparison is passed from these commands to what is | |
57 | internally called "diffcore", in a format similar to what is output | |
58 | when the -p option is not used. E.g. | |
4a1332d0 | 59 | |
8db9307c JH |
60 | ------------------------------------------------ |
61 | in-place edit :100644 100644 bcd1234... 0123456... M file0 | |
62 | create :000000 100644 0000000... 1234567... A file4 | |
63 | delete :100644 000000 1234567... 0000000... D file5 | |
64 | unmerged :000000 000000 0000000... 0000000... U file6 | |
65 | ------------------------------------------------ | |
4a1332d0 JH |
66 | |
67 | The diffcore mechanism is fed a list of such comparison results | |
68 | (each of which is called a "filepair", although at this point each |
69 | of them talks about a single file), and transforms such a list | |
264e0b9a | 70 | into another list. There are currently 6 such transformations: |
4a1332d0 | 71 | |
8db9307c JH |
72 | - diffcore-break |
73 | - diffcore-rename | |
74 | - diffcore-merge-broken | |
75 | - diffcore-pickaxe | |
76 | - diffcore-order | |
1eb4136a | 77 | - diffcore-rotate |
4a1332d0 | 78 | |
0b444cdb | 79 | These are applied in sequence. The set of filepairs 'git diff-{asterisk}' |
264e0b9a YD |
80 | commands find is used as the input to diffcore-break, and |
81 | the output from diffcore-break is used as the input to the | |
4a1332d0 JH |
82 | next transformation. The final result is then passed to the |
83 | output routine and generates either diff-raw format (see Output | |
0b444cdb | 84 | format sections of the manual for 'git diff-{asterisk}' commands) or |
4a1332d0 JH |
85 | diff-patch format. |
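
The "list in, list out" chaining can be sketched in Python as follows. The `run_diffcore` helper and the two stand-in stages are hypothetical and exist only to show the plumbing; they are not the real transformations.

```python
from functools import reduce


def run_diffcore(filepairs, stages):
    """Feed each stage's output list to the next stage, in order."""
    return reduce(lambda pairs, stage: stage(pairs), stages, list(filepairs))


# Hypothetical stand-in stages; each takes a list of (status, path)
# tuples and returns a new list.
def drop_unmerged(pairs):
    return [pair for pair in pairs if pair[0] != "U"]


def sort_by_path(pairs):
    return sorted(pairs, key=lambda pair: pair[1])


result = run_diffcore(
    [("M", "file0"), ("U", "file6"), ("A", "file4")],
    [drop_unmerged, sort_by_path],
)
print(result)
```

Because every stage has the same shape, a stage that is not requested on the command line can simply be left out of the list.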
86 | ||
87 | ||
b803ae44 PS |
88 | diffcore-break: For Splitting Up Complete Rewrites |
89 | -------------------------------------------------- | |
4a1332d0 JH |
90 | |
91 | The first transformation in the chain is diffcore-break, and is |
0b444cdb | 92 | controlled by the -B option to the 'git diff-{asterisk}' commands. This is |
4a1332d0 JH |
93 | used to detect a filepair that represents a "complete rewrite" and |
94 | break such a filepair into two filepairs that represent delete and |
95 | create. E.g., if the input contained this filepair: |
96 | ||
8db9307c JH |
97 | ------------------------------------------------ |
98 | :100644 100644 bcd1234... 0123456... M file0 | |
99 | ------------------------------------------------ | |
4a1332d0 JH |
100 | |
101 | and if it detects that the file "file0" is completely rewritten, | |
102 | it changes it to: | |
103 | ||
8db9307c JH |
104 | ------------------------------------------------ |
105 | :100644 000000 bcd1234... 0000000... D file0 | |
106 | :000000 100644 0000000... 0123456... A file0 | |
107 | ------------------------------------------------ | |
4a1332d0 JH |
108 | |
109 | For the purpose of breaking a filepair, diffcore-break examines | |
110 | the extent of changes between the contents of the files before | |
111 | and after modification (i.e. the contents that have "bcd1234..." | |
d5fa1f1a | 112 | and "0123456..." as their SHA-1 content ID, in the above |
4a1332d0 JH |
113 | example). The amount of deletion of original contents and |
114 | insertion of new material are added together, and if the sum exceeds |
115 | the "break score", the filepair is broken into two. The break | |
116 | score defaults to 50% of the size of the smaller of the original | |
117 | and the result (i.e. if the edit shrinks the file, the size of | |
118 | the result is used; if the edit lengthens the file, the size of | |
119 | the original is used), and can be customized by giving a number | |
120 | after "-B" option (e.g. "-B75" to tell it to use 75%). | |
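
The break decision described above can be written down as a short Python sketch. It follows the prose only: `should_break` is a hypothetical name, the sizes and change amounts are assumed to be measured in the same unit, and the way the real implementation measures deletions and insertions is not modelled.

```python
def should_break(src_size, dst_size, deleted, added, break_score=50):
    """Return True when the edit is large enough to count as a rewrite.

    break_score is a percentage of the smaller of the two sizes.
    """
    base = min(src_size, dst_size)
    if base == 0:
        # Assumption: an empty side gives nothing to compare against.
        return False
    return (deleted + added) * 100 > break_score * base


print(should_break(100, 120, deleted=90, added=110))                 # heavy rewrite
print(should_break(100, 105, deleted=5, added=10))                   # small edit
print(should_break(100, 100, deleted=30, added=30, break_score=75))  # as with -B75
```

In the last call the same edit would be broken at the default 50% but is left alone once the threshold is raised to 75%.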
121 | ||
122 | ||
1aa38199 | 123 | diffcore-rename: For Detecting Renames and Copies |
a67c1d08 | 124 | ------------------------------------------------- |
4a1332d0 JH |
125 | |
126 | This transformation is used to detect renames and copies, and is | |
127 | controlled by the -M option (to detect renames) and the -C option | |
0b444cdb | 128 | (to detect copies as well) to the 'git diff-{asterisk}' commands. If the |
4a1332d0 JH |
129 | input contained these filepairs: |
130 | ||
8db9307c JH |
131 | ------------------------------------------------ |
132 | :100644 000000 0123456... 0000000... D fileX | |
133 | :000000 100644 0000000... 0123456... A file0 | |
134 | ------------------------------------------------ | |
4a1332d0 JH |
135 | |
136 | and the contents of the deleted file fileX are similar enough to |
137 | the contents of the created file file0, then rename detection | |
138 | merges these filepairs and creates: | |
139 | ||
8db9307c JH |
140 | ------------------------------------------------ |
141 | :100644 100644 0123456... 0123456... R100 fileX file0 | |
142 | ------------------------------------------------ | |
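
The merging of a delete/create pair into a rename can be sketched in Python for the simplest case, where both sides carry the same content ID as in the example above. The `FilePair` tuple and `detect_exact_renames` are hypothetical names, and inexact ("similar enough") matching is deliberately left out of this model.

```python
from typing import NamedTuple, Optional


class FilePair(NamedTuple):
    old_id: str
    new_id: str
    status: str
    path: str
    new_path: Optional[str] = None


def detect_exact_renames(pairs):
    """Merge a deletion and a creation with identical content into R100."""
    # Remember the first deletion seen for each content ID.
    deletions = {}
    for pair in pairs:
        if pair.status == "D":
            deletions.setdefault(pair.old_id, pair)

    consumed = set()   # deletions that became the source of a rename
    replacement = {}   # creation filepair -> rename filepair
    for pair in pairs:
        source = deletions.get(pair.new_id) if pair.status == "A" else None
        if source is not None and source not in consumed:
            consumed.add(source)
            replacement[pair] = FilePair(
                source.old_id, pair.new_id, "R100", source.path, pair.path
            )

    return [replacement.get(p, p) for p in pairs if p not in consumed]


before = [
    FilePair("0123456", "0000000", "D", "fileX"),
    FilePair("0000000", "0123456", "A", "file0"),
]
print(detect_exact_renames(before))
```

The two input filepairs collapse into a single `R100` filepair that records both `fileX` and `file0`.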
4a1332d0 | 143 | |
59df2a11 CS |
144 | When the "-C" option is used, the original contents of modified files, |
145 | and deleted files (and also unmodified files, if the | |
1c262bb7 | 146 | "--find-copies-harder" option is used) are considered as candidates |
59df2a11 CS |
147 | of the source files in rename/copy operation. If the input were like |
148 | these filepairs, which describe a modified file fileY and a newly |
4a1332d0 JH |
149 | created file file0: |
150 | ||
8db9307c JH |
151 | ------------------------------------------------ |
152 | :100644 100644 0123456... 1234567... M fileY | |
59df2a11 | 153 | :000000 100644 0000000... bcd3456... A file0 |
8db9307c | 154 | ------------------------------------------------ |
4a1332d0 JH |
155 | |
156 | the original contents of fileY and the resulting contents of | |
157 | file0 are compared, and if they are similar enough, they are | |
158 | changed to: | |
159 | ||
8db9307c JH |
160 | ------------------------------------------------ |
161 | :100644 100644 0123456... 1234567... M fileY | |
59df2a11 | 162 | :100644 100644 0123456... bcd3456... C100 fileY file0 |
8db9307c | 163 | ------------------------------------------------ |
4a1332d0 JH |
164 | |
165 | In both rename and copy detection, the same "extent of changes" | |
166 | algorithm used in diffcore-break is used to determine if two | |
167 | files are "similar enough", and can be customized to use | |
59df2a11 CS |
168 | a similarity score different from the default of 50% by giving a |
169 | number after the "-M" or "-C" option (e.g. "-M8" to tell it to use | |
4a1332d0 JH |
170 | 8/10 = 80%). |
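
The `8/10` reading of "-M8" suggests that the digits are taken as a decimal fraction. A minimal Python sketch of that reading follows; `parse_score` is a hypothetical helper, and the handling of a trailing `%` is an assumption based on the diff options documentation rather than on this note.

```python
def parse_score(arg, default=0.5):
    """Turn the text after -M or -C into a similarity threshold in [0, 1]."""
    if arg == "":
        return default
    if arg.endswith("%"):
        # Assumption: an explicit percent sign means a plain percentage.
        return min(float(arg[:-1]) / 100, 1.0)
    if not arg.isdigit():
        raise ValueError(f"not a score: {arg!r}")
    # Digits are read as if a decimal point preceded them: "8" -> 0.8.
    return int(arg) / 10 ** len(arg)


for text in ["", "8", "75", "05", "50%"]:
    print(repr(text), parse_score(text))
```

Under this reading "-M8" and "-M80" name the same threshold, while "-M05" asks for 5%.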
171 | ||
6cf378f0 | 172 | Note. When the "-C" option is used with `--find-copies-harder` |
0b444cdb | 173 | option, 'git diff-{asterisk}' commands feed unmodified filepairs to |
232b75ab JH |
174 | diffcore mechanism as well as modified ones. This lets the copy |
175 | detector consider unmodified files as copy source candidates at | |
6cf378f0 | 176 | the expense of making it slower. Without `--find-copies-harder`, |
0b444cdb | 177 | 'git diff-{asterisk}' commands can detect copies only if the file that was |
232b75ab | 178 | copied happened to have been modified in the same changeset. |
4a1332d0 JH |
179 | |
180 | ||
b803ae44 PS |
181 | diffcore-merge-broken: For Putting Complete Rewrites Back Together |
182 | ------------------------------------------------------------------ | |
4a1332d0 JH |
183 | |
184 | This transformation is used to merge filepairs broken by | |
f73ae1fc | 185 | diffcore-break, and not transformed into rename/copy by |
4a1332d0 JH |
186 | diffcore-rename, back into a single modification. This always |
187 | runs when diffcore-break is used. | |
188 | ||
189 | For the purpose of merging broken filepairs back, it uses a | |
190 | different "extent of changes" computation from the ones used by | |
191 | diffcore-break and diffcore-rename. It counts only the deletion | |
192 | from the original, and does not count insertion. If you removed | |
193 | only 10 lines from a 100-line document, even if you added 910 | |
194 | new lines to make a new 1000-line document, you did not do a | |
195 | complete rewrite. diffcore-break breaks such a case in order to | |
196 | help diffcore-rename to consider such filepairs as candidates for |
197 | rename/copy detection, but if filepairs broken that way were not | |
198 | matched with other filepairs to create rename/copy, then this | |
199 | transformation merges them back into the original | |
200 | "modification". | |
201 | ||
202 | The "extent of changes" parameter can be tweaked from the | |
203 | default 80% (that is, unless more than 80% of the original | |
204 | material is deleted, the broken pairs are merged back into a | |
205 | single modification) by giving a second number to -B option, | |
206 | like these: | |
207 | ||
8db9307c JH |
208 | * -B50/60 (give 50% "break score" to diffcore-break, use 60% |
209 | for diffcore-merge-broken). | |
210 | ||
211 | * -B/60 (the same as above, since diffcore-break defaults to 50%). | |
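
The two numbers given to -B and the deletion-only test can be sketched in Python as below. Both helpers are hypothetical, the numbers are assumed to be whole percentages written as in the examples above, and line counts stand in for whatever unit the real computation uses.

```python
def parse_break_option(arg, default_break=50, default_merge=80):
    """Split the text after -B, such as '50/60' or '/60', into two scores."""
    break_part, _, merge_part = arg.partition("/")
    break_score = int(break_part) if break_part else default_break
    merge_score = int(merge_part) if merge_part else default_merge
    return break_score, merge_score


def should_merge_back(src_lines, deleted_lines, merge_score=80):
    """Merge a broken pair back unless more than merge_score% was deleted."""
    return deleted_lines * 100 <= merge_score * src_lines


print(parse_break_option("50/60"))
print(parse_break_option("/60"))
# Only 10 of 100 original lines were removed, so this is not a rewrite,
# no matter how many lines were added.
print(should_merge_back(src_lines=100, deleted_lines=10))
```

Insertions never enter `should_merge_back`, which is the difference from the computation used for breaking.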
4a1332d0 | 212 | |
366175ef | 213 | Note that an earlier implementation left a broken pair as separate |
f73ae1fc | 214 | creation and deletion patches. This was an unnecessary hack and |
366175ef JH |
215 | the latest implementation always merges all the broken pairs |
216 | back into modifications, but the resulting patch output is | |
f73ae1fc | 217 | formatted differently for easier review in case of such |
366175ef JH |
218 | a complete rewrite by showing the entire contents of old version |
219 | prefixed with '-', followed by the entire contents of new | |
220 | version prefixed with '+'. | |
221 | ||
4a1332d0 | 222 | |
59df2a11 | 223 | diffcore-pickaxe: For Detecting Addition/Deletion of Specified String |
a67c1d08 | 224 | --------------------------------------------------------------------- |
4a1332d0 | 225 | |
5bc3f0b5 RR |
226 | This transformation limits the set of filepairs to those that change |
227 | specified strings between the preimage and the postimage in a certain | |
228 | way. -S<block of text> and -G<regular expression> options are used to | |
229 | specify different ways these strings are sought. | |
230 | ||
231 | "-S<block of text>" detects filepairs whose preimage and postimage | |
232 | have different number of occurrences of the specified block of text. | |
233 | By definition, it will not detect in-file moves. Also, when a | |
234 | changeset moves a file wholesale without affecting the interesting | |
235 | string, diffcore-rename kicks in as usual, and `-S` omits the filepair | |
236 | (since the number of occurrences of that string didn't change in that | |
237 | rename-detected filepair). With `--pickaxe-regex`, `-S` treats |
238 | the <block of text> as an extended POSIX regular expression to match, | |
239 | instead of a literal string. | |
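
The occurrence-count rule for `-S` is easy to model in Python. `pickaxe_s` is a hypothetical name, and Python's `re` module is used as a stand-in for POSIX extended regular expressions, which it only approximates.

```python
import re


def pickaxe_s(preimage, postimage, needle, regex=False):
    """Return True when the number of occurrences of needle changes."""
    if regex:
        def count(text):
            return sum(1 for _ in re.finditer(needle, text))
    else:
        def count(text):
            return text.count(needle)
    return count(preimage) != count(postimage)


# An in-file move keeps the count at one, so the filepair is not selected.
print(pickaxe_s("foo()\nbar()\n", "bar()\nfoo()\n", "foo"))
# Adding a second call changes the count from one to two.
print(pickaxe_s("foo()\n", "foo()\nfoo()\n", "foo"))
```

The first call shows why in-file moves go undetected: the text moved, but the count did not change.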
240 | ||
241 | "-G<regular expression>" (mnemonic: grep) detects filepairs whose | |
242 | textual diff has an added or a deleted line that matches the given | |
243 | regular expression. This means that it will detect in-file (or what | |
244 | rename-detection considers the same file) moves, which is noise. The | |
245 | implementation runs diff twice and greps, and this can be quite | |
e0e7cb80 TB |
246 | expensive. To speed things up, binary files without textconv filters |
247 | will be ignored. | |
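
A rough Python model of the `-G` rule follows, using `difflib` as a stand-in for the real diff machinery. `pickaxe_g` is a hypothetical name, and both the diff algorithm and the regular expression dialect differ from what the actual implementation uses.

```python
import difflib
import re
from itertools import islice


def pickaxe_g(preimage, postimage, pattern):
    """Return True when an added or deleted line matches the pattern."""
    matcher = re.compile(pattern)
    diff = difflib.unified_diff(
        preimage.splitlines(), postimage.splitlines(), lineterm="", n=0
    )
    # The first two lines are the '---' and '+++' file headers.
    for line in islice(diff, 2, None):
        if line.startswith("@@"):
            continue  # hunk header
        if line.startswith(("+", "-")) and matcher.search(line[1:]):
            return True
    return False


# Moving a line within the file shows up as one deletion plus one
# addition, so the filepair is selected even though nothing was
# really added or removed.
print(pickaxe_g("foo\na\nb\n", "a\nb\nfoo\n", r"fo+"))
```

An in-file move like this one is reported, which is the noise mentioned above.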
5bc3f0b5 RR |
248 | |
249 | When `-S` or `-G` are used without `--pickaxe-all`, only filepairs | |
250 | that match their respective criterion are kept in the output. When | |
251 | `--pickaxe-all` is used, if even one filepair matches their respective | |
252 | criterion in a changeset, the entire changeset is kept. This behavior | |
253 | is designed to make reviewing changes in the context of the whole | |
4a1332d0 JH |
254 | changeset easier. |
255 | ||
59df2a11 | 256 | diffcore-order: For Sorting the Output Based on Filenames |
a67c1d08 | 257 | --------------------------------------------------------- |
4a1332d0 JH |
258 | |
259 | This is used to reorder the filepairs according to the user's | |
260 | (or project's) taste, and is controlled by the -O option to the | |
0b444cdb | 261 | 'git diff-{asterisk}' commands. |
4a1332d0 | 262 | |
59df2a11 | 263 | This takes a text file each of whose lines is a shell glob |
4a1332d0 JH |
264 | pattern. Filepairs that match a glob pattern on an earlier line |
265 | in the file are output before ones that match a later line, and | |
266 | filepairs that do not match any glob pattern are output last. | |
267 | ||
2de9b711 | 268 | As an example, a typical orderfile for the core Git probably |
8db9307c | 269 | would look like this: |
4a1332d0 | 270 | |
8db9307c | 271 | ------------------------------------------------ |
df8baa42 JF |
272 | README |
273 | Makefile | |
274 | Documentation | |
275 | *.h | |
276 | *.c | |
277 | t | |
8db9307c | 278 | ------------------------------------------------ |
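
The effect of such an orderfile can be sketched in Python. The helper names are hypothetical, `fnmatch` is used in place of the real glob matcher, and letting a pattern match any leading directory of a path (so that `Documentation` and `t` cover the files beneath them) is an assumption of this model.

```python
from fnmatch import fnmatchcase


def order_rank(path, patterns):
    """Index of the first pattern matching the path or a leading directory."""
    parts = path.split("/")
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    for rank, pattern in enumerate(patterns):
        if any(fnmatchcase(prefix, pattern) for prefix in prefixes):
            return rank
    return len(patterns)  # unmatched paths sort last


def apply_order(paths, patterns):
    # sorted() is stable, so paths with equal rank keep their input order.
    return sorted(paths, key=lambda path: order_rank(path, patterns))


orderfile = ["README", "Makefile", "Documentation", "*.h", "*.c", "t"]
paths = [
    "t/t0000-basic.sh",
    "diff.c",
    "Documentation/gitdiffcore.txt",
    "diff.h",
    "Makefile",
    "COPYING",
]
print(apply_order(paths, orderfile))
```

Here `COPYING` matches no pattern and therefore comes out last, after the test script under `t`.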
30eba7bf | 279 | |
1eb4136a JH |
280 | diffcore-rotate: For Changing At Which Path Output Starts |
281 | --------------------------------------------------------- | |
282 | ||
283 | This transformation takes one pathname, and rotates the set of | |
284 | filepairs so that the filepair for the given pathname comes first, | |
285 | optionally discarding the paths that come before it. This is used | |
286 | to implement the `--skip-to` and the `--rotate-to` options. It is | |
287 | an error when the specified pathname is not in the set of filepairs, | |
288 | but it is not useful to error out when used with "git log" family of | |
289 | commands, because it is unreasonable to expect that a given path | |
290 | would be modified by each and every commit shown by the "git log" | |
291 | command. For this reason, when used with "git log", the filepair | |
292 | that sorts the same as, or the first one that sorts after, the given | |
293 | pathname is where the output starts. | |
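
The rotation can be modelled in Python on a sorted list of paths. `rotate` and its `skip` and `lenient` parameters are hypothetical names for this sketch: `skip=True` stands for the discarding behaviour, and `lenient=True` for the fallback described for "git log". What happens when the pathname sorts after every filepair is not covered by the text, so the model simply leaves the list unrotated in that case.

```python
from bisect import bisect_left


def rotate(paths, target, skip=False, lenient=False):
    """Start the output at target; paths must already be sorted."""
    if target in paths:
        start = paths.index(target)
    elif lenient:
        # First path that sorts the same as, or after, the target.
        start = bisect_left(paths, target)
    else:
        raise ValueError(f"{target!r} is not in the set of filepairs")
    if skip:
        return paths[start:]
    return paths[start:] + paths[:start]


paths = ["Makefile", "diff.c", "diff.h", "t/test.sh"]
print(rotate(paths, "diff.h"))
print(rotate(paths, "diff.h", skip=True))
print(rotate(paths, "diff.d", lenient=True))
```

The lenient call starts at `diff.h` because it is the first path that sorts after the missing `diff.d`. The lookup relies on the input being sorted, which is why reordering the filepairs first gives surprising results.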
294 | ||
295 | Use of this transformation combined with diffcore-order will produce | |
296 | unexpected results, as the input to this transformation is likely | |
297 | not sorted when diffcore-order is in effect. | |
298 | ||
299 | ||
30eba7bf CC |
300 | SEE ALSO |
301 | -------- | |
302 | linkgit:git-diff[1], | |
303 | linkgit:git-diff-files[1], | |
304 | linkgit:git-diff-index[1], | |
305 | linkgit:git-diff-tree[1], | |
306 | linkgit:git-format-patch[1], | |
307 | linkgit:git-log[1], | |
308 | linkgit:gitglossary[7], | |
309 | link:user-manual.html[The Git User's Manual] | |
310 | ||
311 | GIT | |
312 | --- | |
941b9c52 | 313 | Part of the linkgit:git[1] suite |