]> git.ipfire.org Git - thirdparty/git.git/blame - Documentation/technical/rerere.txt
Merge branch 'jk/bundle-use-dash-for-stdfiles'
[thirdparty/git.git] / Documentation / technical / rerere.txt
CommitLineData
fb90dca3
TG
1Rerere
2======
3
4This document describes the rerere logic.
5
6Conflict normalization
7----------------------
8
9To ensure recorded conflict resolutions can be looked up in the rerere
10database, even when branches are merged in a different order,
11different branches are merged that result in the same conflict, or
12when different conflict style settings are used, rerere normalizes the
13conflicts before writing them to the rerere database.
14
15Different conflict styles and branch names are normalized by stripping
16the labels from the conflict markers, and removing the common ancestor
ddfc44a8
EN
17version from the `diff3` or `zdiff3` conflict styles. Branches that
18are merged in different order are normalized by sorting the conflict
19hunks. More on each of those steps in the following sections.
fb90dca3
TG
20
21Once these two normalization operations are applied, a conflict ID is
22calculated based on the normalized conflict, which is later used by
23rerere to look up the conflict in the rerere database.
24
25Removing the common ancestor version
26~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
27
28Say we have three branches AB, AC and AC2. The common ancestor of
29these branches has a file with a line containing the string "A" (for
30brevity this is called "line A" in the rest of the document). In
31branch AB this line is changed to "B", in AC, this line is changed to
32"C", and branch AC2 is forked off of AC, after the line was changed to
33"C".
34
35Forking a branch ABAC off of branch AB and then merging AC into it, we
36get a conflict like the following:
37
38 <<<<<<< HEAD
39 B
40 =======
41 C
42 >>>>>>> AC
43
44Doing the analogous with AC2 (forking a branch ABAC2 off of branch AB
ddfc44a8
EN
45and then merging branch AC2 into it), using the diff3 or zdiff3
46conflict style, we get a conflict like the following:
fb90dca3
TG
47
48 <<<<<<< HEAD
49 B
50 ||||||| merged common ancestors
51 A
52 =======
53 C
54 >>>>>>> AC2
55
56By resolving this conflict, to leave line D, the user declares:
57
58 After examining what branches AB and AC did, I believe that making
59 line A into line D is the best thing to do that is compatible with
60 what AB and AC wanted to do.
61
62As branch AC2 refers to the same commit as AC, the above implies that
63this is also compatible what AB and AC2 wanted to do.
64
65By extension, this means that rerere should recognize that the above
66conflicts are the same. To do this, the labels on the conflict
67markers are stripped, and the common ancestor version is removed. The above
68examples would both result in the following normalized conflict:
69
70 <<<<<<<
71 B
72 =======
73 C
74 >>>>>>>
75
76Sorting hunks
77~~~~~~~~~~~~~
78
79As before, lets imagine that a common ancestor had a file with line A
80its early part, and line X in its late part. And then four branches
81are forked that do these things:
82
83 - AB: changes A to B
84 - AC: changes A to C
85 - XY: changes X to Y
86 - XZ: changes X to Z
87
88Now, forking a branch ABAC off of branch AB and then merging AC into
89it, and forking a branch ACAB off of branch AC and then merging AB
90into it, would yield the conflict in a different order. The former
91would say "A became B or C, what now?" while the latter would say "A
92became C or B, what now?"
93
94As a reminder, the act of merging AC into ABAC and resolving the
95conflict to leave line D means that the user declares:
96
97 After examining what branches AB and AC did, I believe that
98 making line A into line D is the best thing to do that is
99 compatible with what AB and AC wanted to do.
100
101So the conflict we would see when merging AB into ACAB should be
70661d28 102resolved the same way--it is the resolution that is in line with that
fb90dca3
TG
103declaration.
104
105Imagine that similarly previously a branch XYXZ was forked from XY,
106and XZ was merged into it, and resolved "X became Y or Z" into "X
107became W".
108
109Now, if a branch ABXY was forked from AB and then merged XY, then ABXY
110would have line B in its early part and line Y in its later part.
111Such a merge would be quite clean. We can construct 4 combinations
112using these four branches ((AB, AC) x (XY, XZ)).
113
114Merging ABXY and ACXZ would make "an early A became B or C, a late X
115became Y or Z" conflict, while merging ACXY and ABXZ would make "an
116early A became C or B, a late X became Y or Z". We can see there are
1174 combinations of ("B or C", "C or B") x ("X or Y", "Y or X").
118
119By sorting, the conflict is given its canonical name, namely, "an
031fd4b9 120early part became B or C, a late part became X or Y", and whenever
fb90dca3
TG
121any of these four patterns appear, and we can get to the same conflict
122and resolution that we saw earlier.
123
124Without the sorting, we'd have to somehow find a previous resolution
125from combinatorial explosion.
126
127Conflict ID calculation
128~~~~~~~~~~~~~~~~~~~~~~~
129
130Once the conflict normalization is done, the conflict ID is calculated
131as the sha1 hash of the conflict hunks appended to each other,
132separated by <NUL> characters. The conflict markers are stripped out
133before the sha1 is calculated. So in the example above, where we
134merge branch AC which changes line A to line C, into branch AB, which
135changes line A to line C, the conflict ID would be
136SHA1('B<NUL>C<NUL>').
137
138If there are multiple conflicts in one file, the sha1 is calculated
139the same way with all hunks appended to each other, in the order in
140which they appear in the file, separated by a <NUL> character.
4af32207
TG
141
142Nested conflicts
143~~~~~~~~~~~~~~~~
144
145Nested conflicts are handled very similarly to "simple" conflicts.
146Similar to simple conflicts, the conflict is first normalized by
147stripping the labels from conflict markers, stripping the common ancestor
148version, and the sorting the conflict hunks, both for the outer and the
149inner conflict. This is done recursively, so any number of nested
150conflicts can be handled.
151
bc4caecf
TG
152Note that this only works for conflict markers that "cleanly nest". If
153there are any unmatched conflict markers, rerere will fail to handle
154the conflict and record a conflict resolution.
155
4af32207
TG
156The only difference is in how the conflict ID is calculated. For the
157inner conflict, the conflict markers themselves are not stripped out
158before calculating the sha1.
159
160Say we have the following conflict for example:
161
162 <<<<<<< HEAD
163 1
164 =======
165 <<<<<<< HEAD
166 3
167 =======
168 2
169 >>>>>>> branch-2
170 >>>>>>> branch-3~
171
172After stripping out the labels of the conflict markers, and sorting
173the hunks, the conflict would look as follows:
174
175 <<<<<<<
176 1
177 =======
178 <<<<<<<
179 2
180 =======
181 3
182 >>>>>>>
183 >>>>>>>
184
185and finally the conflict ID would be calculated as:
186`sha1('1<NUL><<<<<<<\n3\n=======\n2\n>>>>>>><NUL>')`