From: Patrick Steinhardt Date: Mon, 12 Jan 2026 09:00:42 +0000 (+0100) Subject: packfile: always declare object info to be OI_PACKED X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=7a4bd1b836c7437dfb8ff3650096750b98a6b3e4;p=thirdparty%2Fgit.git packfile: always declare object info to be OI_PACKED When reading object info via a packfile we yield one of two types: - The object can either be OI_PACKED, which is what a caller would typically expect. - Or it can be OI_DBCACHED if it is stored in the delta base cache. The latter really is an implementation detail though, and callers typically don't care at all about the difference. Furthermore, the information whether or not it is part of the delta base cache can already be derived via the `is_delta` field, so the fact that we discern between OI_PACKED and OI_DBCACHED only further complicates the interface. There aren't all that many callers that care about the `whence` field in the first place. In fact, there's only three: - `packfile_store_read_object_info()` checks for `whence == OI_PACKED` and then populates the packfile information of the object info structure. We now start to do this also for deltified objects, which gives its callers strictly more information. - `repack_local_links()` wants to determine whether the object is part of a promisor pack and checks for `whence == OI_PACKED`. If so, it verifies that the packfile is a promisor pack. It's arguably wrong to declare that an object is not part of a promisor pack only because it is stored in the delta base cache. - `is_not_in_promisor_pack_obj()` does the same, but checks that a specific object is _not_ part of a promisor pack. The same reasoning as above applies. Drop the OI_DBCACHED enum completely. None of the callers seem to care about the distinction. Note that this also fixes a segfault introduced in 8c1b84bc97 (streaming: move logic to read packed objects streams into backend, 2025-11-23), which refactors how we stream packed objects. The intent is to only read packed objects in case they are stored non-deltified as we'd otherwise have to deflate them first. But the check for whether or not the object is stored as a delta was unconditionally done via `oi.u.packed.is_delta`, which is only valid in case `oi.whence` is `OI_PACKED`. But under some circumstances we got `OI_DBCACHED` here, which means that none of the `oi.u.packed` fields were initialized at all. Consequently, we assumed the object was not stored as a delta, and then try to read the object from `oi.u.packed.pack`, which is a `NULL` pointer and thus causes a segfault. Add a test case for this issue so that this cannot regress in the future anymore. Reported-by: Matt Smiley Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- diff --git a/odb.h b/odb.h index 014cd9585a..73b0b87ad5 100644 --- a/odb.h +++ b/odb.h @@ -330,7 +330,6 @@ struct object_info { OI_CACHED, OI_LOOSE, OI_PACKED, - OI_DBCACHED } whence; union { /* diff --git a/packfile.c b/packfile.c index 08a0863fc3..b0c6665c87 100644 --- a/packfile.c +++ b/packfile.c @@ -1656,8 +1656,7 @@ int packed_object_info(struct repository *r, struct packed_git *p, oidclr(oi->delta_base_oid, p->repo->hash_algo); } - oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED : - OI_PACKED; + oi->whence = OI_PACKED; out: unuse_pack(&w_curs); diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh index 961c6aac25..c8c1c5c06b 100755 --- a/t/t5003-archive-zip.sh +++ b/t/t5003-archive-zip.sh @@ -239,6 +239,40 @@ check_zip with_untracked2 check_added with_untracked2 untracked one/untracked check_added with_untracked2 untracked two/untracked +test_expect_success 'git-archive --format=zip with bigFile delta chains' ' + test_when_finished rm -rf repo && + git init repo && + ( + cd repo && + test-tool genrandom foo 100000 >base && + { + cat base && + echo "trailing data" + } >delta-1 && + { + cat delta-1 && + echo "trailing data" + } >delta-2 && + git add . && + git commit -m "blobs" && + git repack -Ad && + git verify-pack -v .git/objects/pack/pack-*.idx >stats && + test_grep "chain length = 1: 1 object" stats && + test_grep "chain length = 2: 1 object" stats && + + git -c core.bigFileThreshold=1k archive --format=zip HEAD >archive.zip && + if test_have_prereq UNZIP + then + mkdir unpack && + cd unpack && + "$GIT_UNZIP" ../archive.zip && + test_cmp base ../base && + test_cmp delta-1 ../delta-1 && + test_cmp delta-2 ../delta-2 + fi + ) +' + # Test remote archive over HTTP protocol. # # Note: this should be the last part of this test suite, because