git.ipfire.org Git - thirdparty/git.git/commit

author	Garima Singh <garima.singh@microsoft.com>
	Mon, 6 Apr 2020 16:59:50 +0000 (16:59 +0000)
committer	Junio C Hamano <gitster@pobox.com>
	Mon, 6 Apr 2020 18:08:37 +0000 (11:08 -0700)
commit	1217c03e7b87b15f2c78af5b1e1915a675050454
tree	e19c660138425048b891a39028c6b6ce567c62d2
parent	76ffbca71a9c89d1e530f734e16a70b3924f4bea

commit-graph: reuse existing Bloom filters during write

Add logic to:
a) parse Bloom filter information from the commit graph file, and
b) reuse existing Bloom filters.

See Documentation/technical/commit-graph-format for the format in which
the Bloom filter information is written to the commit graph file.

To read the Bloom filter for a commit at lexicographic position 'i',
we need to:
1. Read BIDX[i], which gives the index in BDAT just past the end of
   the filter of commit i. Equivalently, it is the starting index of
   the filter of commit i+1. It is called end_index in the code.

2. For i > 0, read BIDX[i-1], which gives the starting index in BDAT
   of the filter of commit i. It is called start_index in the code.
   For the first commit, where i = 0, the Bloom filter data starts at
   the beginning of the BDAT chunk, just past its header. Hence,
   start_index is 0.

3. Compute the length of the filter as end_index - start_index, because
   BIDX[i] gives the cumulative count of 8-byte words up to and
   including the ith commit's filter.
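
The three steps above can be sketched as follows. This is a minimal
illustration and not the code added by this commit: all sketch_* names
are hypothetical, and it assumes that each BIDX entry is a 4-byte
big-endian integer. Indices and lengths stay in whatever unit BIDX
stores, and the BDAT header offset is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one 4-byte big-endian integer. */
static uint32_t sketch_read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Half-open range [start_index, end_index) of one filter in BDAT. */
struct sketch_filter_range {
	uint32_t start_index;
	uint32_t end_index;
};

/*
 * Look up the range of the filter for the commit at lexicographic
 * position i. 'bidx' points at the first BIDX entry (assumed layout:
 * nr_commits consecutive 4-byte big-endian cumulative totals).
 */
static struct sketch_filter_range
sketch_filter_range_at(const unsigned char *bidx, size_t nr_commits, size_t i)
{
	struct sketch_filter_range r;

	assert(i < nr_commits);

	/* Step 1: BIDX[i] is the index just past the end of filter i. */
	r.end_index = sketch_read_be32(bidx + 4 * i);

	/* Step 2: BIDX[i-1] is where filter i starts; 0 for the first. */
	r.start_index = i > 0 ? sketch_read_be32(bidx + 4 * (i - 1)) : 0;

	return r;
}

/* Step 3: the length is the difference of two cumulative totals. */
static uint32_t sketch_filter_len(struct sketch_filter_range r)
{
	return r.end_index - r.start_index;
}
```

With cumulative entries 2, 2 and 7, the filter at position 0 covers
[0, 2), the filter at position 1 is empty, and the filter at position 2
covers [2, 7). Storing cumulative totals means one extra read recovers
both ends of a filter, so no separate length field is needed.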

The compute_if_not_present flag controls whether a Bloom filter that
is not already present is computed.
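
The reuse-or-compute decision might look roughly like the sketch
below. It is a simplified stand-in and not the function this commit
touches: the sketch_* types and names are hypothetical, the compute
callback is a dummy, and returning NULL when no filter is available is
a choice made for the sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified filter: a byte buffer and its length. */
struct sketch_filter {
	const unsigned char *data;
	size_t len;
};

/* Stand-in for the expensive computation of a filter. */
typedef void (*sketch_compute_fn)(struct sketch_filter *out);

/*
 * Return the filter to use for one commit. 'slot' is the in-memory
 * copy, 'stored' is the filter found in the file (NULL when there is
 * none). Returns NULL when nothing is stored and computing is not
 * allowed.
 */
static const struct sketch_filter *
sketch_get_filter(struct sketch_filter *slot,
		  const struct sketch_filter *stored,
		  int compute_if_not_present,
		  sketch_compute_fn compute)
{
	/* Prefer a filter that was already written out. */
	if (!slot->data && stored)
		*slot = *stored;
	if (slot->data)
		return slot;

	/* Nothing to reuse: compute only when the caller asked for it. */
	if (!compute_if_not_present)
		return NULL;

	assert(compute != NULL);
	compute(slot);
	return slot;
}

/* Dummy computation used only to exercise the sketch. */
static void sketch_compute_dummy(struct sketch_filter *out)
{
	static const unsigned char bits[] = { 0xff };

	out->data = bits;
	out->len = sizeof(bits);
}
```

A caller that only wants to read filters would pass 0 for the flag and
never pay for a computation, while a writer would pass 1 so that every
commit ends up with a filter.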

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bloom.c
bloom.h
commit-graph.c
t/helper/test-bloom.c