git.ipfire.org Git - thirdparty/git.git/commit

author	Kristofer Karlsson <krka@spotify.com>
	Thu, 14 May 2026 16:51:31 +0000 (16:51 +0000)
committer	Junio C Hamano <gitster@pobox.com>
	Thu, 14 May 2026 19:26:25 +0000 (04:26 +0900)
commit	ef8d51a8a3e1c57201aa2c116ad27b0db580123a
tree	0639354469c327e17b28f0a68314ef7f6f0e8539
parent	94f057755b7941b321fd11fec1b2e3ca5313a4e0

revision: use priority queue in limit_list()

limit_list() maintains a date-sorted work queue of commits using a
linked list with commit_list_insert_by_date() for insertion.  Each
insertion walks the list to find the right position, which costs O(n).
In repositories with merge-heavy histories, the symmetric difference
can contain thousands of commits, making this O(n) insertion the
dominant cost.

Replace the sorted linked list with a prio_queue (binary heap).  This
gives O(log n) insertion and O(log n) extraction instead of O(n)
insertion and O(1) extraction.  Over n queued commits the worst-case
total drops from O(n^2) to O(n log n), a net win when the queue is
large.
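
To illustrate the cost difference, here is a minimal, standalone sketch
of an array-backed binary max-heap keyed on commit date.  It is not
git's prio_queue implementation: struct toy_commit, struct toy_queue,
toy_queue_put() and toy_queue_get() are hypothetical names made up for
this example, and unlike a real date-ordered work queue it makes no
attempt to break ties between equal dates in a stable way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a commit: only the date matters here. */
struct toy_commit {
	unsigned long date;
};

/* Hypothetical array-backed max-heap ordered by date (newest first). */
struct toy_queue {
	struct toy_commit **array;
	size_t nr, alloc;
};

static void toy_swap(struct toy_queue *q, size_t a, size_t b)
{
	struct toy_commit *tmp = q->array[a];
	q->array[a] = q->array[b];
	q->array[b] = tmp;
}

/* O(log n): append at the end, then sift up toward the root. */
static void toy_queue_put(struct toy_queue *q, struct toy_commit *c)
{
	size_t i;

	if (q->nr == q->alloc) {
		q->alloc = q->alloc ? 2 * q->alloc : 16;
		q->array = realloc(q->array, q->alloc * sizeof(*q->array));
		assert(q->array);
	}
	q->array[q->nr++] = c;
	for (i = q->nr - 1; i > 0; ) {
		size_t parent = (i - 1) / 2;
		if (q->array[parent]->date >= q->array[i]->date)
			break;
		toy_swap(q, parent, i);
		i = parent;
	}
}

/* O(log n): take the root, move the last element up, then sift down. */
static struct toy_commit *toy_queue_get(struct toy_queue *q)
{
	struct toy_commit *result;
	size_t i = 0;

	if (!q->nr)
		return NULL;
	result = q->array[0];
	q->array[0] = q->array[--q->nr];
	for (;;) {
		size_t child = 2 * i + 1;
		if (child >= q->nr)
			break;
		/* Pick the newer of the two children, if there are two. */
		if (child + 1 < q->nr &&
		    q->array[child + 1]->date > q->array[child]->date)
			child++;
		if (q->array[i]->date >= q->array[child]->date)
			break;
		toy_swap(q, i, child);
		i = child;
	}
	return result;
}
```

Both operations touch at most one element per heap level, whereas a
date-sorted linked list may have to visit every element on insertion.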

The still_interesting() and everybody_uninteresting() helpers are
updated to scan the prio_queue's contiguous array instead of walking a
linked list.  process_parents() already accepts both a commit_list and
a prio_queue parameter, so the change in limit_list() simply switches
which one is passed.
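
The helper change can be pictured with a sketch like the one below.  It
is only an illustration under assumed names, not the patched code:
struct flagged_commit, struct flagged_list, TOY_UNINTERESTING and both
functions are hypothetical.  The point it shows is that a check of the
form "is every queued commit uninteresting?" does not depend on queue
order, so scanning the heap's backing array in storage order gives the
same answer as walking the sorted list.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_UNINTERESTING (1u << 1)	/* hypothetical flag bit */

/* Hypothetical stand-in for a commit: only the flags matter here. */
struct flagged_commit {
	unsigned int flags;
};

/* Hypothetical singly linked list, as in the old work queue. */
struct flagged_list {
	struct flagged_commit *item;
	struct flagged_list *next;
};

/* Before: follow next pointers from node to node. */
static int list_all_uninteresting(const struct flagged_list *list)
{
	for (; list; list = list->next)
		if (!(list->item->flags & TOY_UNINTERESTING))
			return 0;
	return 1;
}

/* After: scan the contiguous array that backs the heap. */
static int array_all_uninteresting(struct flagged_commit *const *array,
				   size_t nr)
{
	size_t i;

	assert(array || !nr);
	for (i = 0; i < nr; i++)
		if (!(array[i]->flags & TOY_UNINTERESTING))
			return 0;
	return 1;
}
```

Both versions are linear in the queue size; the array version simply
reads adjacent pointers instead of chasing list nodes.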

Benchmark: git rev-list --left-right --count HEAD~N...HEAD
Repository: 2.3M commits, merge-heavy DAG (monorepo)
Best of 5 runs, times in seconds:

  commits in
  symmetric diff   baseline   patched    speedup
  --------------   --------   -------    -------
            10       0.01      0.01       1.0x
            50       0.01      0.01       1.0x
          3751      21.23      8.49       2.5x
          4524      21.70      8.29       2.6x
         10130      20.10      6.65       3.0x

No change for small traversals; 2.5-3.0x faster when the queue grows
to thousands of commits.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>