From: Jeff King Date: Tue, 18 Nov 2025 09:12:06 +0000 (-0500) Subject: pack-bitmap: handle name-hash lookups in incremental bitmaps X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=4deb882e54152c31bef23f8b33ad38b7bc26d398;p=thirdparty%2Fgit.git pack-bitmap: handle name-hash lookups in incremental bitmaps If a bitmap has a name-hash cache, it is an array of 32-bit integers, one per entry in the bitmap, which we've mmap'd from the .bitmap file. We access it directly like this: if (bitmap_git->hashes) hash = get_be32(bitmap_git->hashes + index_pos); That works for both regular pack bitmaps and for non-incremental midx bitmaps. There is one bitmap_index with one "hashes" array, and index_pos is within its bounds (we do the bounds-checking when we load the bitmap). But for an incremental midx bitmap, we have a linked list of bitmap_index structs, and each one has only its own small slice of the name-hash array. If index_pos refers to an object that is not in the first bitmap_git of the chain, then we'll access memory outside of the bounds of its "hashes" array, and often outside of the mmap. Instead, we should walk through the list until we find the bitmap_index which serves our index_pos, and use its hash (after adjusting index_pos to make it relative to the slice we found). This is exactly what we do elsewhere for incremental midx lookups (like the pack_pos_to_midx() call a few lines above). But we can't use existing helpers like midx_for_object() here, because we're walking through the chain of bitmap_index structs (each of which refers to a midx), not the chain of incremental multi_pack_index structs themselves. The problem is triggered in the test suite, but we don't get a segfault because the out-of-bounds index is too small. The OS typically rounds our mmap up to the nearest page size, so we just end up accessing some extra zero'd memory. Nor do we catch it with ASan, since it doesn't seem to instrument mmaps at all. But if we build with NO_MMAP, then our maps are replaced with heap allocations, which ASan does check. And so: make NO_MMAP=1 SANITIZE=address cd t ./t5334-incremental-multi-pack-index.sh does show the problem (and this patch makes it go away). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- diff --git a/pack-bitmap.c b/pack-bitmap.c index ac6d62b980..9c9aa7b67c 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -212,6 +212,28 @@ static uint32_t bitmap_num_objects(struct bitmap_index *index) return index->pack->num_objects; } +static uint32_t bitmap_name_hash(struct bitmap_index *index, uint32_t pos) +{ + if (bitmap_is_midx(index)) { + while (index && pos < index->midx->num_objects_in_base) { + ASSERT(bitmap_is_midx(index)); + index = index->base; + } + + if (!index) + BUG("NULL base bitmap for object position: %"PRIu32, pos); + + pos -= index->midx->num_objects_in_base; + if (pos >= index->midx->num_objects) + BUG("out-of-bounds midx bitmap object at %"PRIu32, pos); + } + + if (!index->hashes) + return 0; + + return get_be32(index->hashes + pos); +} + static struct repository *bitmap_repo(struct bitmap_index *bitmap_git) { if (bitmap_is_midx(bitmap_git)) @@ -1726,8 +1748,7 @@ static void show_objects_for_type( pack = bitmap_git->pack; } - if (bitmap_git->hashes) - hash = get_be32(bitmap_git->hashes + index_pos); + hash = bitmap_name_hash(bitmap_git, index_pos); show_reach(&oid, object_type, 0, hash, pack, ofs, payload); } @@ -3083,8 +3104,8 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, if (oe) { reposition[i] = oe_in_pack_pos(mapping, oe) + 1; - if (bitmap_git->hashes && !oe->hash) - oe->hash = get_be32(bitmap_git->hashes + index_pos); + if (!oe->hash) + oe->hash = bitmap_name_hash(bitmap_git, index_pos); } }