The "--stdin-packs" option can be used to merge objects from multiple
packfiles given via stdin into a new packfile. One big upside of this
option is that we don't have to perform a complete rev walk to enumerate
objects. Instead, we can simply enumerate all objects that are part of
the specified packfiles, which can be significantly faster in very large
repositories.

There is one downside though: when we don't perform a rev walk we also
don't have a good way to learn about the respective objects' names. As a
consequence, we cannot use the name hashes as a heuristic to get better
delta selection.
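
To illustrate why object names matter for delta selection, here is a minimal sketch of a path-based name hash. It is modeled on the pack_name_hash() helper in Git's pack-objects code as best I recall it; the function name name_hash_sketch() is hypothetical and the exact arithmetic should be treated as an assumption, not as the authoritative implementation.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of a name hash used as a delta heuristic.
 * Later characters end up in the higher bits, so paths with the same
 * ending (for example ".c") get nearby hash values and are sorted next
 * to each other when looking for delta candidates.
 */
static uint32_t name_hash_sketch(const char *name)
{
	uint32_t c, hash = 0;

	/* Without a name every object gets the same, useless hash. */
	if (!name)
		return 0;

	while ((c = (unsigned char)*name++) != 0) {
		if (isspace(c))
			continue; /* whitespace does not influence the hash */
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}
```

Objects enumerated straight from a packfile have no path, so they would all take the NULL branch above and the heuristic could not group them.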
We try to offset this downside by performing a localized rev
walk: we queue all objects that we're about to repack as interesting,
and all objects from excluded packfiles as uninteresting. We then
perform a best-effort rev walk that allows us to fill in object names.
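
The idea of such a localized walk can be sketched with a toy object graph. Everything below (struct toy_object, toy_walk(), the field names) is hypothetical and only illustrates the principle; it is not how the revision machinery is implemented.

```c
#include <stddef.h>

enum { TOY_MAX_CHILDREN = 4 };

/* Hypothetical stand-in for an object reachable during the walk. */
struct toy_object {
	const char *path;   /* name learned during the walk, NULL if unknown */
	int uninteresting;  /* part of an excluded packfile: stop here */
	int missing;        /* not available locally: silently skipped */
	int seen;
	struct toy_object *children[TOY_MAX_CHILDREN];
	const char *child_names[TOY_MAX_CHILDREN];
};

/*
 * Best-effort walk: record the name under which each object is reached,
 * never descend into uninteresting or missing objects. Returns the
 * number of names that were filled in.
 */
static int toy_walk(struct toy_object *obj, const char *name)
{
	int filled = 0;
	size_t i;

	if (!obj || obj->missing || obj->uninteresting || obj->seen)
		return 0;
	obj->seen = 1;

	if (name && !obj->path) {
		obj->path = name;
		filled = 1;
	}

	for (i = 0; i < TOY_MAX_CHILDREN; i++)
		filled += toy_walk(obj->children[i], obj->child_names[i]);
	return filled;
}
```

Objects that the walk never reaches simply keep an unknown name, which is why the walk can afford to be best effort.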
There is one gotcha here though: when "--exclude-promisor-objects" has
not been given we will perform backfill fetches for any promised objects
that are missing. This used not to be an issue, as this option was
mutually exclusive with "--stdin-packs". But that has changed recently:
starting with dcc9c7ef47 (builtin/repack: handle promisor packs with
geometric repacking, 2026-01-05), we now repack promisor packs
during geometric compaction. The consequence is that a geometric repack
may now perform a bunch of backfill fetches.

We of course cannot pass "--exclude-promisor-objects" to fix this
issue -- after all, the whole intent is to repack objects that are part
of a promisor pack. But arguably we don't have to: the rev walk is intended
as best effort, and we already configure it to ignore missing links to
other objects. So we can adapt the walk to unconditionally disable
fetching any missing objects.
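
The pattern of disabling lazy fetching only for the duration of the walk can be sketched in isolation as follows. The variable fetch_if_missing below is a local stand-in for Git's global of the same name; toy_lookup(), toy_fetch_count and toy_best_effort_walk() are hypothetical helpers, and the convention that negative ids denote promised-but-missing objects is an assumption made purely for the example.

```c
#include <stddef.h>

/* Local stand-in for the global flag; 1 means "fetch missing objects". */
static int fetch_if_missing = 1;

/* Hypothetical counter so the example can show that no fetch happened. */
static int toy_fetch_count;

/* Hypothetical lookup: negative ids are promised but not present locally. */
static int toy_lookup(int id)
{
	if (id >= 0)
		return 1;
	if (fetch_if_missing) {
		toy_fetch_count++; /* a backfill fetch would happen here */
		return 1;
	}
	return 0; /* treated as a missing link and ignored */
}

/* Save the flag, disable fetching for the walk, then restore it. */
static int toy_best_effort_walk(const int *ids, size_t nr)
{
	int prev_fetch_if_missing = fetch_if_missing;
	int found = 0;
	size_t i;

	fetch_if_missing = 0;
	for (i = 0; i < nr; i++)
		if (toy_lookup(ids[i]))
			found++;
	fetch_if_missing = prev_fetch_if_missing;

	return found;
}
```

Restoring the previous value rather than hard-coding 1 keeps the helper safe for callers that had already disabled fetching before the walk.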
Do so and add a test that verifies we don't backfill any objects.

Reported-by: Lukas Wanko <lwanko@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
static void read_stdin_packs(enum stdin_packs_mode mode, int rev_list_unpacked)
{
+ int prev_fetch_if_missing = fetch_if_missing;
struct rev_info revs;
+ /*
+ * The revision walk may hit objects that are only promised. As the
+ * walk is best-effort, we don't want to perform backfill fetches for
+ * them.
+ */
+ fetch_if_missing = 0;
+
repo_init_revisions(the_repository, &revs, NULL);
/*
* Use a revision walk to fill in the namehash of objects in the include
stdin_packs_found_nr);
trace2_data_intmax("pack-objects", the_repository, "stdin_packs_hints",
stdin_packs_hints_nr);
+
+ fetch_if_missing = prev_fetch_if_missing;
}
static void add_cruft_object_entry(const struct object_id *oid, enum object_type type,
)
'
+test_expect_success '--stdin-packs does not perform backfill fetch' '
+ test_when_finished "rm -rf remote client" &&
+
+ git init remote &&
+ test_commit_bulk -C remote 10 &&
+ git -C remote config set --local uploadpack.allowfilter 1 &&
+ git -C remote config set --local uploadpack.allowanysha1inwant 1 &&
+
+ git clone --filter=tree:0 "file://$(pwd)/remote" client &&
+ (
+ cd client &&
+ ls .git/objects/pack/*.promisor | sed "s|.*/||; s/\.promisor$/.pack/" >packs &&
+ test_line_count -gt 1 packs &&
+ GIT_TRACE2_EVENT="$(pwd)/event.log" git pack-objects --stdin-packs pack <packs &&
+ test_grep ! "\"event\":\"child_start\"" event.log
+ )
+'
+
stdin_packs__follow_with_only () {
rm -fr stdin_packs__follow_with_only &&
git init stdin_packs__follow_with_only &&