From: Kristofer Karlsson
Date: Wed, 24 Jun 2026 12:14:07 +0000 (+0000)
Subject: Documentation/technical: add paint-down-to-common doc
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=6aaf850263e72974a9cfc58985cfe6876613ef3c;p=thirdparty%2Fgit.git

Documentation/technical: add paint-down-to-common doc

Add a technical document describing the paint_down_to_common()
algorithm used for merge-base computation, covering the paint walk,
generation number regions, and termination conditions.

Signed-off-by: Kristofer Karlsson
Signed-off-by: Junio C Hamano
---

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 2699f0b24a..f8dea4b395 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -129,6 +129,7 @@ TECH_DOCS += technical/long-running-process-protocol
 TECH_DOCS += technical/multi-pack-index
 TECH_DOCS += technical/packfile-uri
 TECH_DOCS += technical/pack-heuristics
+TECH_DOCS += technical/paint-down-to-common
 TECH_DOCS += technical/parallel-checkout
 TECH_DOCS += technical/partial-clone
 TECH_DOCS += technical/platform-support
diff --git a/Documentation/technical/meson.build b/Documentation/technical/meson.build
index ec07088c57..9ce11d5e48 100644
--- a/Documentation/technical/meson.build
+++ b/Documentation/technical/meson.build
@@ -18,6 +18,7 @@ articles = [
   'multi-pack-index.adoc',
   'packfile-uri.adoc',
   'pack-heuristics.adoc',
+  'paint-down-to-common.adoc',
   'parallel-checkout.adoc',
   'partial-clone.adoc',
   'platform-support.adoc',
diff --git a/Documentation/technical/paint-down-to-common.adoc b/Documentation/technical/paint-down-to-common.adoc
new file mode 100644
index 0000000000..c10d5d2887
--- /dev/null
+++ b/Documentation/technical/paint-down-to-common.adoc
@@ -0,0 +1,114 @@
+Merge-Base Computation and paint_down_to_common()
+==================================================
+
+The function `paint_down_to_common()` in `commit-reach.c` computes merge
+bases by walking the commit graph backwards from two sets of tips and
+finding where their ancestry meets.
+
+Use cases
+---------
+
+Computing merge bases is used in two different ways:
+
+ 1. *Finding all merge bases* (`merge-base --all`, `merge-tree`,
+    `merge`, `rebase`). A merge base is a common ancestor that is
+    not itself an ancestor of another common ancestor.
+
+ 2. *Ancestry checks* (`in_merge_bases`, used by `merge-base
+    --is-ancestor`, `branch -d`, `fetch`). These ask: "is commit A
+    an ancestor of commit B?" If a common ancestor equals one of the
+    inputs, that input is necessarily the only merge base -- no other
+    common ancestor can be both as recent and not an ancestor of it.
+
+Both use cases share the same algorithm and implementation.
+
+Algorithm
+---------
+
+Given a commit `one` and a set of commits `twos[]`, the walk paints
+commits with two colors:
+
+ - PARENT1: reachable from `one`
+ - PARENT2: reachable from any commit in `twos[]`
+
+The walk uses a priority queue ordered by generation number (falling
+back to commit date when generation numbers are unavailable). Each
+step dequeues the highest-priority commit (this is when we say a
+commit is "visited") and propagates its paint flags to its parents,
+enqueuing them if they gained new flags. When a commit receives
+both PARENT1 and PARENT2, it is a merge-base candidate. A candidate
+gains the STALE flag so its ancestors propagate staleness -- any
+deeper common ancestor is necessarily redundant.
+
+INFINITY and finite generation regions
+--------------------------------------
+
+The commit-graph stores a generation number for each commit. Commits
+not in the commit-graph have generation `GENERATION_NUMBER_INFINITY`. The
+graph is closed under reachability: if a commit is in the graph, all
+its ancestors are too. This partitions the commit graph into two regions:
+
+....
+    +---------------------------------------+
+    |  INFINITY region                      |
+    |  generation = INFINITY                |
+    |  queue order: heuristic (commit date) |
+    +---------------------------------------+
+                        |
+                        v
+    +---------------------------------------+
+    |  Finite region                        |
+    |  generation = finite                  |
+    |  queue order: topological             |
+    +---------------------------------------+
+....
+
+When the commit-graph is enabled, the INFINITY region is typically
+very small -- it only contains commits added since the last
+commit-graph refresh.
+
+All reachable INFINITY-generation commits are visited before any
+finite-generation commit, because INFINITY is larger than any finite
+value. Once the walk crosses into the finite region, it stays there.
+
+In the finite region, generation ordering guarantees topological
+traversal: children are always visited before their parents. This
+means that paint on already-visited commits is final -- no future
+traversal step can add paint to them.
+
+In the INFINITY region, commit-date ordering can violate this: a
+parent with a later date can be visited before a child with an earlier
+date. Paint flags are therefore NOT final at visit time, and a
+commit visited with only one side's paint may later gain the other.
+
+Paint flags are only added, never removed. Since each flag can be set
+at most once per commit, the number of times a commit can be
+re-enqueued is bounded by the number of flag transitions.
+
+Termination
+-----------
+
+The walk uses a `nonstale_queue` wrapper around `prio_queue` that
+tracks `max_nonstale`: the lowest-priority non-stale commit enqueued
+so far. Once that commit is dequeued, every remaining entry is known
+to be STALE and the loop terminates. Specifically, the main loop
+ends when one of the following conditions holds:
+
+ 1. The queue is empty.
+ 2. `max_nonstale` has been dequeued, meaning the queue only contains
+    STALE entries.
+
+Stale entry condition
+~~~~~~~~~~~~~~~~~~~~~
+Once all queued entries are stale, no new merge-base candidates can
+be discovered -- that requires at least one non-stale commit from
+each side meeting. Continuing the walk could still invalidate
+existing candidates by proving one is an ancestor of another, but
+`remove_redundant()` handles that as a post-processing step, so it
+is safe to exit early.
+
+Related documentation
+---------------------
+
+ - `Documentation/technical/commit-graph.adoc` -- generation numbers
+   and the reachability closure property.
diff --git a/commit-reach.c b/commit-reach.c
index 5df471a313..a9483759e0 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -96,7 +96,11 @@ static struct commit *nonstale_queue_get_dedup(struct nonstale_queue *queue)
 	return commit;
 }
 
-/* all input commits in one and twos[] must have been parsed! */
+/*
+ * See Documentation/technical/paint-down-to-common.adoc
+ *
+ * All input commits in one and twos[] must have been parsed!
+ */
 static int paint_down_to_common(struct repository *r,
 				struct commit *one, int n,
 				struct commit **twos,