reftable/merged: transfer ownership of records when iterating
When iterating over records with the merged iterator we put the records
into a priority queue before yielding them to the caller. This means
that we need to allocate the contents of these records before we can
pass them over to the caller.
The handover to the caller is quite inefficient though because we first
deallocate the record passed in by the caller and then copy over the new
record, which requires us to reallocate memory.
Refactor the code to instead transfer ownership of the new record to the
caller. So instead of reallocating all contents, we now release the old
record and then copy contents of the new record into place.
The following benchmark of `git show-ref --quiet` in a repository with
around 350k refs shows a clear improvement. Before:
HEAP SUMMARY:
in use at exit: 21,163 bytes in 193 blocks
total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 21,163 bytes in 193 blocks
total heap usage: 357,007 allocs, 356,814 frees, 24,193,602 bytes allocated
This shows that we now have roundabout a single allocation per record
that we're yielding from the iterator. Ideally, we'd also get rid of
this allocation so that the number of allocations doesn't scale with the
number of refs anymore. This would require some larger surgery though
because the memory is owned by the priority queue before transferring it
over to the caller.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/merged: really reuse buffers to compute record keys
In 829231dc20 (reftable/merged: reuse buffer to compute record keys,
2023-12-11), we have refactored the merged iterator to reuse a pair of
long-living strbufs by relying on the fact that `reftable_record_key()`
tries to reuse already allocated strbufs by calling `strbuf_reset()`,
which should give us significantly fewer reallocations compared to the
old code that used on-stack strbufs that are allocated for each and
every iteration. Unfortunately, we called `strbuf_release()` on these
long-living strbufs that we meant to reuse on each iteration, defeating
the optimization.
Fix this performance issue by not releasing those buffers on iteration
anymore, where we instead rely on `merged_iter_close()` to release the
buffers for us.
Using `git show-ref --quiet` in a repository with ~350k refs this leads
to a significant drop in allocations. Before:
HEAP SUMMARY:
in use at exit: 21,163 bytes in 193 blocks
total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 21,163 bytes in 193 blocks
total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/record: store "val2" hashes as static arrays
Similar to the preceding commit, convert ref records of type "val2" to
store their object IDs in static arrays instead of allocating them for
every single record.
We're using the same benchmark as in the preceding commit, with `git
show-ref --quiet` in a repository with ~350k refs. This time around
though the effects aren't this huge. Before:
HEAP SUMMARY:
in use at exit: 21,163 bytes in 193 blocks
total heap usage: 1,419,040 allocs, 1,418,847 frees, 62,153,868 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 21,163 bytes in 193 blocks
total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated
This is because "val2"-type records are typically only stored for peeled
tags, and the number of annotated tags in the benchmark repository is
rather low. Still, it can be seen that this change leads to a reduction
of allocations overall, even if only a small one.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/record: store "val1" hashes as static arrays
When reading ref records of type "val1", we store its object ID in an
allocated array. This results in an additional allocation for every
single ref record we read, which is rather inefficient especially when
iterating over refs.
Refactor the code to instead use an embedded array of `GIT_MAX_RAWSZ`
bytes. While this means that `struct ref_record` is bigger now, we
typically do not store all refs in an array anyway and instead only
handle a limited number of records at the same point in time.
Using `git show-ref --quiet` in a repository with ~350k refs this leads
to a significant drop in allocations. Before:
HEAP SUMMARY:
in use at exit: 21,098 bytes in 192 blocks
total heap usage: 2,116,683 allocs, 2,116,491 frees, 76,098,060 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 21,098 bytes in 192 blocks
total heap usage: 1,419,031 allocs, 1,418,839 frees, 62,145,036 bytes allocated
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/record: constify some parts of the interface
We're about to convert reftable records to stop storing their object IDs
as allocated hashes. Prepare for this refactoring by constifying some
parts of the interface that will be impacted by this.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/writer: fix index corruption when writing multiple indices
Each reftable may contain multiple types of blocks for refs, objects and
reflog records, where each of these may have an index that makes it more
efficient to find the records. It was observed that the index for log
records can become corrupted under certain circumstances, where the
first entry of the index points into the object index instead of to the
log records.
As it turns out, this corruption can occur whenever we write a log index
as well as at least one additional index. Writing records and their index
is basically a two-step process:
1. We write all blocks for the corresponding record. Each block that
gets written is added to a list of blocks to index.
2. Once all blocks were written we finish the section. If at least two
blocks have been added to the list of blocks to index then we will
now write the index for those blocks and flush it, as well.
When we have a very large number of blocks then we may decide to write a
multi-level index, which is why we also keep track of the list of the
index blocks in the same way as we previously kept track of the blocks
to index.
Now when we have finished writing all index blocks we clear the index
and flush the last block to disk. This is done in the wrong order though
because flushing the block to disk will re-add it to the list of blocks
to be indexed. The result is that the next section we are about to write
will have an entry in the list of blocks to index that points to the
last block of the preceding section's index, which will corrupt the log
index.
Fix this corruption by clearing the index after having written the last
block.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: do not auto-compact twice in `reftable_stack_add()`
In 5c086453ff (reftable/stack: perform auto-compaction with
transactional interface, 2023-12-11), we fixed a bug where the
transactional interface to add changes to a reftable stack did not
perform auto-compaction by calling `reftable_stack_auto_compact()` in
`reftable_stack_addition_commit()`. While correct, this change may now
cause us to perform auto-compaction twice in the non-transactional
interface `reftable_stack_add()`:
- It performs auto-compaction by itself.
- It now transitively performs auto-compaction via the transactional
interface.
Remove the first instance so that we only end up doing auto-compaction
once.
Reported-by: Han-Wen Nienhuys <hanwenn@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: do not overwrite errors when compacting
In order to compact multiple stacks we iterate through the merged ref
and log records. When there is any error either when reading the records
from the old merged table or when writing the records to the new table
then we break out of the respective loops. When breaking out of the loop
for the ref records though the error code will be overwritten, which may
cause us to inadvertently skip over bad ref records. In the worst case,
this can lead to a compacted stack that is missing records.
Fix the code by using `goto done` instead so that any potential error
codes are properly returned to the caller.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 18 Dec 2023 22:10:13 +0000 (14:10 -0800)]
Merge branch 'jh/trace2-redact-auth'
trace2 streams used to record the URLs that potentially embed
authentication material, which has been corrected.
* jh/trace2-redact-auth:
t0212: test URL redacting in EVENT format
t0211: test URL redacting in PERF format
trace2: redact passwords from https:// URLs by default
trace2: fix signature of trace2_def_param() macro
Junio C Hamano [Mon, 18 Dec 2023 22:10:12 +0000 (14:10 -0800)]
Merge branch 'js/update-urls-in-doc-and-comment'
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.
* js/update-urls-in-doc-and-comment:
doc: refer to internet archive
doc: update links for andre-simon.de
doc: switch links to https
doc: update links to current pages
Junio C Hamano [Mon, 18 Dec 2023 22:10:11 +0000 (14:10 -0800)]
Merge branch 'ps/commit-graph-less-paranoid'
Earlier we stopped relying on commit-graph that (still) records
information about commits that are lost from the object store,
which has negative performance implications. The default has been
flipped to disable this pessimization.
* ps/commit-graph-less-paranoid:
commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
Junio C Hamano [Mon, 18 Dec 2023 22:10:11 +0000 (14:10 -0800)]
Merge branch 'cc/git-replay'
Introduce "git replay", a tool meant on the server side without
working tree to recreate a history.
* cc/git-replay:
replay: stop assuming replayed branches do not diverge
replay: add --contained to rebase contained branches
replay: add --advance or 'cherry-pick' mode
replay: use standard revision ranges
replay: make it a minimal server side command
replay: remove HEAD related sanity check
replay: remove progress and info output
replay: add an important FIXME comment about gpg signing
replay: change rev walking options
replay: introduce pick_regular_commit()
replay: die() instead of failing assert()
replay: start using parse_options API
replay: introduce new builtin
t6429: remove switching aspects of fast-rebase
Junio C Hamano [Mon, 18 Dec 2023 22:10:11 +0000 (14:10 -0800)]
Merge branch 'ps/ref-deletion-updates'
Simplify API implementation to delete references by eliminating
duplication.
* ps/ref-deletion-updates:
refs: remove `delete_refs` callback from backends
refs: deduplicate code to delete references
refs/files: use transactions to delete references
t5510: ensure that the packed-refs file needs locking
reftable/block: reuse buffer to compute record keys
When iterating over entries in the block iterator we compute the key of
each of the entries and write it into a buffer. We do not reuse the
buffer though and thus re-allocate it on every iteration, which is
wasteful.
Refactor the code to reuse the buffer.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: introduce macro to initialize `struct block_iter`
There are a bunch of locations where we initialize members of `struct
block_iter`, which makes it harder than necessary to expand this struct
to have additional members. Unify the logic via a new `BLOCK_ITER_INIT`
macro that initializes all members.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/merged: reuse buffer to compute record keys
When iterating over entries in the merged iterator's queue, we compute
the key of each of the entries and write it into a buffer. We do not
reuse the buffer though and thus re-allocate it on every iteration,
which is wasteful given that we never transfer ownership of the
allocated bytes outside of the loop.
Refactor the code to reuse the buffer. This also fixes a potential
memory leak when `merged_iter_advance_subiter()` returns an error.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a new reftable stack, Git will first create the stack with
a random suffix so that concurrent updates will not try to write to the
same file. This random suffix is computed via a call to rand(3P). But we
never seed the function via srand(3P), which means that the suffix is in
fact always the same.
Fix this bug by using `git_rand()` instead, which does not need to be
initialized. While this function is likely going to be slower depending
on the platform, this slowness should not matter in practice as we only
use it when writing a new reftable stack.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When starting a transaction via `reftable_stack_init_addition()`, we
create a lockfile for the reftable stack itself which we'll write the
new list of tables to. But if we terminate abnormally e.g. via a call to
`die()`, then we do not remove the lockfile. Subsequent executions of
Git which try to modify references will thus fail with an out-of-date
error.
Fix this bug by registering the lock as a `struct tempfile`, which
ensures automatic cleanup for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: reuse buffers when reloading stack
In `reftable_stack_reload_once()` we iterate over all the tables added
to the stack in order to figure out whether any of the tables needs to
be reloaded. We use a set of buffers in this context to compute the
paths of these tables, but discard those buffers on every iteration.
This is quite wasteful given that we do not need to transfer ownership
of the allocated buffer outside of the loop.
Refactor the code to instead reuse the buffers to reduce the number of
allocations we need to do. Note that we do not have to manually reset
the buffer because `stack_filename()` does this for us already.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: perform auto-compaction with transactional interface
Whenever updating references or reflog entries in the reftable stack, we
need to add a new table to the stack, thus growing the stack's length by
one. The stack can grow to become quite long rather quickly, leading to
performance issues when trying to read records. But besides performance
issues, this can also lead to exhaustion of file descriptors very
rapidly as every single table requires a separate descriptor when
opening the stack.
While git-pack-refs(1) fixes this issue for us by merging the tables, it
runs too irregularly to keep the length of the stack within reasonable
limits. This is why the reftable stack has an auto-compaction mechanism:
`reftable_stack_add()` will call `reftable_stack_auto_compact()` after
its added the new table, which will auto-compact the stack as required.
But while this logic works alright for `reftable_stack_add()`, we do not
do the same in `reftable_addition_commit()`, which is the transactional
equivalent to the former function that allows us to write multiple
updates to the stack atomically. Consequentially, we will easily run
into file descriptor exhaustion in code paths that use many separate
transactions like e.g. non-atomic fetches.
Fix this issue by calling `reftable_stack_auto_compact()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: verify that `reftable_stack_add()` uses auto-compaction
While we have several tests that check whether we correctly perform
auto-compaction when manually calling `reftable_stack_auto_compact()`,
we don't have any tests that verify whether `reftable_stack_add()` does
call it automatically. Add one.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are calls to pread(3P) and read(3P) where we don't properly handle
interrupts. Convert them to use `pread_in_full()` and `read_in_full()`,
respectively.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `EXPECT` macros used by the reftable test framework are all using a
single `if` statement with the actual condition. This results in weird
syntax when using them in if/else statements like the following:
Note that there need not be a trailing semicolon. Furthermore, it is not
immediately obvious whether the else now belongs to the `if (foo)` or
whether it belongs to the expanded `if (foo == 2)` from the macro.
Fix this by wrapping the macros in a do/while loop.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sun, 10 Dec 2023 00:37:51 +0000 (16:37 -0800)]
Merge branch 'tz/send-email-negatable-options'
Newer versions of Getopt::Long started giving warnings against our
(ab)use of it in "git send-email". Bump the minimum version
requirement for Perl to 5.8.1 (from September 2002) to allow
simplifying our implementation.
* tz/send-email-negatable-options:
send-email: avoid duplicate specification warnings
perl: bump the required Perl version to 5.8.1 from 5.8.0
"git for-each-ref --no-sort" still sorted the refs alphabetically
which paid non-trivial cost. It has been redefined to show output
in an unspecified order, to allow certain optimizations to take
advantage of.
* vd/for-each-ref-unsorted-optimization:
t/perf: add perf tests for for-each-ref
ref-filter.c: use peeled tag for '*' format fields
for-each-ref: clean up documentation of --format
ref-filter.c: filter & format refs in the same callback
ref-filter.c: refactor to create common helper functions
ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()'
ref-filter.h: add functions for filter/format & format-only
ref-filter.h: move contains caches into filter
ref-filter.h: add max_count and omit_empty to ref_format
ref-filter.c: really don't sort when using --no-sort
Junio C Hamano [Sun, 10 Dec 2023 00:37:50 +0000 (16:37 -0800)]
Merge branch 'ps/ban-a-or-o-operator-with-test'
Test and shell scripts clean-up.
* ps/ban-a-or-o-operator-with-test:
Makefile: stop using `test -o` when unlinking duplicate executables
contrib/subtree: convert subtree type check to use case statement
contrib/subtree: stop using `-o` to test for number of args
global: convert trivial usages of `test <expr> -a/-o <expr>`
Junio C Hamano [Sun, 10 Dec 2023 00:37:49 +0000 (16:37 -0800)]
Merge branch 'ps/ref-tests-update'
Update ref-related tests.
* ps/ref-tests-update:
t: mark several tests that assume the files backend with REFFILES
t7900: assert the absence of refs via git-for-each-ref(1)
t7300: assert exact states of repo
t4207: delete replace references via git-update-ref(1)
t1450: convert tests to remove worktrees via git-worktree(1)
t: convert tests to not access reflog via the filesystem
t: convert tests to not access symrefs via the filesystem
t: convert tests to not write references via the filesystem
t: allow skipping expected object ID in `ref-store update-ref`
Junio C Hamano [Sun, 10 Dec 2023 00:37:48 +0000 (16:37 -0800)]
Merge branch 'jk/chunk-bounds-more'
Code clean-up for jk/chunk-bounds topic.
* jk/chunk-bounds-more:
commit-graph: mark chunk error messages for translation
commit-graph: drop verify_commit_graph_lite()
commit-graph: check order while reading fanout chunk
commit-graph: use fanout value for graph size
commit-graph: abort as soon as we see a bogus chunk
commit-graph: clarify missing-chunk error messages
commit-graph: drop redundant call to "lite" verification
midx: check consistency of fanout table
commit-graph: handle overflow in chunk_size checks
Junio C Hamano [Sun, 10 Dec 2023 00:37:48 +0000 (16:37 -0800)]
Merge branch 'ps/ci-gitlab'
Add support for GitLab CI.
* ps/ci-gitlab:
ci: add support for GitLab CI
ci: install test dependencies for linux-musl
ci: squelch warnings when testing with unusable Git repo
ci: unify setup of some environment variables
ci: split out logic to set up failed test artifacts
ci: group installation of Docker dependencies
ci: make grouping setup more generic
ci: reorder definitions for grouping functions
Junio C Hamano [Sun, 10 Dec 2023 00:37:47 +0000 (16:37 -0800)]
Merge branch 'js/doc-unit-tests-with-cmake'
Update the base topic to work with CMake builds.
* js/doc-unit-tests-with-cmake:
cmake: handle also unit tests
cmake: use test names instead of full paths
cmake: fix typo in variable name
artifacts-tar: when including `.dll` files, don't forget the unit-tests
unit-tests: do show relative file paths
unit-tests: do not mistake `.pdb` files for being executable
cmake: also build unit tests
Junio C Hamano [Sun, 10 Dec 2023 00:37:46 +0000 (16:37 -0800)]
Merge branch 'ps/httpd-tests-on-nixos'
Portability tweak.
* ps/httpd-tests-on-nixos:
t9164: fix inability to find basename(1) in Subversion hooks
t/lib-httpd: stop using legacy crypt(3) for authentication
t/lib-httpd: dynamically detect httpd and modules path
René Scharfe [Sun, 26 Nov 2023 11:57:43 +0000 (12:57 +0100)]
i18n: factorize even more 'incompatible options' messages
Continue the work of 12909b6b8a (i18n: turn "options are incompatible"
into "cannot be used together", 2022-01-05) and a699367bb8 (i18n:
factorize more 'incompatible options' messages, 2022-01-31) to use the
same parameterized error message for reporting incompatible command line
options. This reduces the number of strings to translate and makes the
UI slightly more consistent.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 26 Nov 2023 11:57:36 +0000 (12:57 +0100)]
column: release strbuf and string_list after use
Releasing strbuf and string_list just before exiting is not strictly
necessary, but it gets rid of false positives reported by leak checkers,
which can then be more easily used to show that the column-printing
machinery behind print_columns() are free of leaks.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:43 +0000 (12:10 +0100)]
replay: stop assuming replayed branches do not diverge
The replay command is able to replay multiple branches but when some of
them are based on other replayed branches, their commit should be
replayed onto already replayed commits.
For this purpose, let's store the replayed commit and its original
commit in a key value store, so that we can easily find and reuse a
replayed commit instead of the original one.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:42 +0000 (12:10 +0100)]
replay: add --contained to rebase contained branches
Let's add a `--contained` option that can be used along with
`--onto` to rebase all the branches contained in the <revision-range>
argument.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:41 +0000 (12:10 +0100)]
replay: add --advance or 'cherry-pick' mode
There is already a 'rebase' mode with `--onto`. Let's add an 'advance' or
'cherry-pick' mode with `--advance`. This new mode will make the target
branch advance as we replay commits onto it.
The replayed commits should have a single tip, so that it's clear where
the target branch should be advanced. If they have more than one tip,
this new mode will error out.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:40 +0000 (12:10 +0100)]
replay: use standard revision ranges
Instead of the fixed "<oldbase> <branch>" arguments, the replay
command now accepts "<revision-range>..." arguments in a similar
way as many other Git commands. This makes its interface more
standard and more flexible.
This also enables many revision related options accepted and
eaten by setup_revisions(). If the replay command was a high level
one or had a high level mode, it would make sense to restrict some
of the possible options, like those generating non-contiguous
history, as they could be confusing for most users.
Also as the interface of the command is now mostly finalized,
we can add more documentation and more testcases to make sure
the command will continue to work as designed in the future.
We only document the rev-list related options among all the
revision related options that are now accepted, as the rev-list
related ones are probably the most useful for now.
Helped-by: Dragan Simic <dsimic@manjaro.org> Helped-by: Linus Arver <linusa@google.com> Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:39 +0000 (12:10 +0100)]
replay: make it a minimal server side command
We want this command to be a minimal command that just does server side
picking of commits, displaying the results on stdout for higher level
scripts to consume.
So let's simplify it:
* remove the worktree and index reading/writing,
* remove the ref (and reflog) updating,
* remove the assumptions tying us to HEAD, since (a) this is not a
rebase and (b) we want to be able to pick commits in a bare repo,
i.e. to/from branches that are not checked out and not the main
branch,
* remove unneeded includes,
* handle rebasing multiple branches by printing on stdout the update
ref commands that should be performed.
The output can be piped into `git update-ref --stdin` for the ref
updates to happen.
In the future to make it easier for users to use this command
directly maybe an option can be added to automatically pipe its output
into `git update-ref`.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:38 +0000 (12:10 +0100)]
replay: remove HEAD related sanity check
We want replay to be a command that can be used on the server side on
any branch, not just the current one, so we are going to stop updating
HEAD in a future commit.
A "sanity check" that makes sure we are replaying the current branch
doesn't make sense anymore. Let's remove it.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:37 +0000 (12:10 +0100)]
replay: remove progress and info output
The replay command will be changed in a follow up commit, so that it
will not update refs directly, but instead it will print on stdout a
list of commands that can be consumed by `git update-ref --stdin`.
We don't want this output to be polluted by its current low value
output, so let's just remove the latter.
In the future, when the command gets an option to update refs by
itself, it will make a lot of sense to display a progress meter, but
we are not there yet.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:36 +0000 (12:10 +0100)]
replay: add an important FIXME comment about gpg signing
We want to be able to handle signed commits in some way in the future,
but we are not ready to do it now. So for the time being let's just add
a FIXME comment to remind us about it.
These are different ways we could handle them:
- in case of a cli user and if there was an interactive mode, we could
perhaps ask if the user wants to sign again
- we could add an option to just fail if there are signed commits
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:35 +0000 (12:10 +0100)]
replay: change rev walking options
Let's force the rev walking options we need after calling
setup_revisions() instead of before.
This might override some user supplied rev walking command line options
though. So let's detect that and warn users by:
a) setting the desired values, before setup_revisions(),
b) checking after setup_revisions() whether these values differ from
the desired values,
c) if so throwing a warning and setting the desired values again.
We want the command to work from older commits to newer ones by default.
Also we don't want history simplification, as we want to deal with all
the commits in the affected range.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:34 +0000 (12:10 +0100)]
replay: introduce pick_regular_commit()
Let's refactor the code to handle a regular commit (a commit that is
neither a root commit nor a merge commit) into a single function instead
of keeping it inside cmd_replay().
This is good for separation of concerns, and this will help further work
in the future to replay merge commits.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:33 +0000 (12:10 +0100)]
replay: die() instead of failing assert()
It's not a good idea for regular Git commands to use an assert() to
check for things that could happen but are not supported.
Let's die() with an explanation of the issue instead.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:32 +0000 (12:10 +0100)]
replay: start using parse_options API
Instead of manually parsing arguments, let's start using the parse_options
API. This way this new builtin will look more standard, and in some
upcoming commits will more easily be able to handle more command line
options.
Note that we plan to later use standard revision ranges instead of
hardcoded "<oldbase> <branch>" arguments. When we will use standard
revision ranges, it will be easier to check if there are no spurious
arguments if we keep ARGV[0], so let's call parse_options() with
PARSE_OPT_KEEP_ARGV0 even if we don't need ARGV[0] right now to avoid
some useless code churn.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:31 +0000 (12:10 +0100)]
replay: introduce new builtin
For now, this is just a rename from `t/helper/test-fast-rebase.c` into
`builtin/replay.c` with minimal changes to make it build appropriately.
Let's add a stub documentation and a stub test script though.
Subsequent commits will flesh out the capabilities of the new command
and make it a more standard regular builtin.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elijah Newren [Fri, 24 Nov 2023 11:10:30 +0000 (12:10 +0100)]
t6429: remove switching aspects of fast-rebase
At the time t6429 was written, merge-ort was still under development,
did not have quite as many tests, and certainly was not widely deployed.
Since t6429 was exercising some codepaths just a little differently, we
thought having them also test the "merge_switch_to_result()" bits of
merge-ort was useful even though they weren't intrinsic to the real
point of these tests.
However, the value provided by doing extra testing of the
"merge_switch_to_result()" bits has decreased a bit over time, and it's
actively making it harder to refactor `test-tool fast-rebase` into `git
replay`, which we are going to do in following commits. Dispense with
these bits.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
In 7a5d604443 (commit: detect commits that exist in commit-graph but not
in the ODB, 2023-10-31), we have introduced a new object existence check
into `repo_parse_commit_internal()` so that we do not parse commits via
the commit-graph that don't have a corresponding object in the object
database. This new check of course comes with a performance penalty,
which the commit put at around 30% for `git rev-list --topo-order`. But
there are in fact scenarios where the performance regression is even
higher. The following benchmark against linux.git with a fully-build
commit-graph:
Benchmark 1: git.v2.42.1 rev-list --count HEAD
Time (mean ± σ): 658.0 ms ± 5.2 ms [User: 613.5 ms, System: 44.4 ms]
Range (min … max): 650.2 ms … 666.0 ms 10 runs
Benchmark 2: git.v2.43.0-rc1 rev-list --count HEAD
Time (mean ± σ): 1.333 s ± 0.019 s [User: 1.263 s, System: 0.069 s]
Range (min … max): 1.302 s … 1.361 s 10 runs
Summary
git.v2.42.1 rev-list --count HEAD ran
2.03 ± 0.03 times faster than git.v2.43.0-rc1 rev-list --count HEAD
While it's a noble goal to ensure that results are the same regardless
of whether or not we have a potentially stale commit-graph, taking twice
as much time is a tough sell. Furthermore, we can generally assume that
the commit-graph will be updated by git-gc(1) or git-maintenance(1) as
required so that the case where the commit-graph is stale should not at
all be common.
With that in mind, default-disable GIT_COMMIT_GRAPH_PARANOIA and restore
the behaviour and thus performance previous to the mentioned commit. In
order to not be inconsistent, also disable this behaviour by default in
`lookup_commit_in_graph()`, where the object existence check has been
introduced right at its inception via f559d6d45e (revision: avoid
hitting packfiles when commits are in commit-graph, 2021-08-09).
This results in another speedup in commands that end up calling this
function, even though it's less pronounced compared to the above
benchmark. The following has been executed in linux.git with ~1.2
million references:
Benchmark 1: GIT_COMMIT_GRAPH_PARANOIA=true git rev-list --all --no-walk=unsorted
Time (mean ± σ): 2.947 s ± 0.003 s [User: 2.412 s, System: 0.534 s]
Range (min … max): 2.943 s … 2.949 s 3 runs
Benchmark 2: GIT_COMMIT_GRAPH_PARANOIA=false git rev-list --all --no-walk=unsorted
Time (mean ± σ): 2.724 s ± 0.030 s [User: 2.207 s, System: 0.514 s]
Range (min … max): 2.704 s … 2.759 s 3 runs
Summary
GIT_COMMIT_GRAPH_PARANOIA=false git rev-list --all --no-walk=unsorted ran
1.08 ± 0.01 times faster than GIT_COMMIT_GRAPH_PARANOIA=true git rev-list --all --no-walk=unsorted
So whereas 7a5d604443 initially introduced the logic to start doing an
object existence check in `repo_parse_commit_internal()` by default, the
updated logic will now instead cause `lookup_commit_in_graph()` to stop
doing the check by default. This behaviour continues to be tweakable by
the user via the GIT_COMMIT_GRAPH_PARANOIA environment variable.
Note that this requires us to amend some tests to manually turn on the
paranoid checks again. This is because we cause repository corruption by
manually deleting objects which are part of the commit graph already.
These circumstances shouldn't usually happen in repositories.
Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Josh Soref [Fri, 24 Nov 2023 03:35:15 +0000 (03:35 +0000)]
doc: refer to internet archive
These pages are no longer reachable from their original locations,
which makes things difficult for readers. Instead, switch to linking to
the Internet Archive for the content.
Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Josh Soref [Fri, 24 Nov 2023 03:35:14 +0000 (03:35 +0000)]
doc: update links for andre-simon.de
Beyond the fact that it's somewhat traditional to respect sites'
self-identification, it's helpful for links to point to the things
that people expect them to reference. Here that means linking to
specific pages instead of a domain.
Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Josh Brobst [Sun, 26 Nov 2023 00:05:14 +0000 (19:05 -0500)]
builtin/reflog.c: fix dry-run option short name
The documentation for reflog states that the --dry-run option of the
expire and delete subcommands has a corresponding short name, -n.
However, 33d7bdd645 (builtin/reflog.c: use parse-options api for expire,
delete subcommands, 2022-01-06) did not include this short name in the
new options parsing.
Re-add the short name in the new dry-run option definitions.
Signed-off-by: Josh Brobst <josh@brob.st> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff Hostetler [Wed, 22 Nov 2023 19:18:37 +0000 (19:18 +0000)]
t0212: test URL redacting in EVENT format
In the added tests cases, skip testing the `GIT_TRACE2_REDACT=0` case
because we would need to exactly model the full JSON event stream like
we did in the preceding basic tests and I do not think it is worth it.
Furthermore, the Trace2 routines print the same content in normal, perf,
or event format, and in t0210 and t0211 we already tested the basic
functionality, so no need to repeat it here.
In this test, we use the test-helper to unit test each of the event
messages where URLs can appear and confirm that they are redacted in
each event.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
trace2: redact passwords from https:// URLs by default
It is an unsafe practice to call something like
git clone https://user:password@example.com/
This not only risks leaking the password "over the shoulder" or into the
readline history of the current Unix shell, it also gets logged via
Trace2 if enabled.
Let's at least avoid logging such secrets via Trace2, much like we avoid
logging secrets in `http.c`. Much like the code in `http.c` is guarded
via `GIT_TRACE_REDACT` (defaulting to `true`), we guard the new code via
`GIT_TRACE2_REDACT` (also defaulting to `true`).
The new tests added in this commit uncover leaks in `builtin/clone.c`
and `remote.c`. Therefore we need to turn off
`TEST_PASSES_SANITIZE_LEAK`. The reasons:
- We observed that `the_repository->remote_status` is not released
properly.
- We are using `url...insteadOf` and that runs into a code path where an
allocated URL is replaced with another URL, and the original URL is
never released.
- `remote_states` contains plenty of `struct remote`s whose refspecs
seem to be usually allocated by never released.
More investigation is needed here to identify the exact cause and
proper fixes for these leaks/bugs.
Co-authored-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff Hostetler [Wed, 22 Nov 2023 19:18:34 +0000 (19:18 +0000)]
trace2: fix signature of trace2_def_param() macro
Add `struct key_value_info` argument to `trace2_def_param()`.
In dc90208497 (trace2: plumb config kvi, 2023-06-28) a `kvi`
argument was added to `trace2_def_param_fl()` but the macro
was not up updated. Let's fix that.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Antonin Delpeuch [Mon, 20 Nov 2023 19:18:52 +0000 (19:18 +0000)]
merge-file: add --diff-algorithm option
Make it possible to use other diff algorithms than the 'myers'
default algorithm, when using the 'git merge-file' command, to help
avoid spurious conflicts by selecting a more recent algorithm such
as 'histogram', for instance when using 'git merge-file' as part of
a custom merge driver.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
packfile.c: fix a typo in `each_file_in_pack_dir_fn()`'s declaration
One parameter is called `file_pach`. On the face of it, this looks as if
it was supposed to talk about a `path` instead of a `pach`.
However, looking at the way this callback is called, it gets fed the
`d_name` from a directory entry, which provides just the file name, not
the full path. Therefore, let's fix this by calling the parameter
`file_name` instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that `refs_delete_refs` is implemented in a generic way via the ref
transaction interfaces there are no callers left that invoke the
`delete_refs` callback anymore. Remove it from all of our backends.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the files and the packed-refs reference backends now use the same
generic transactions-based code to delete references. Let's pull these
implementations up into `refs_delete_refs()` to deduplicate the code.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the `files_delete_refs()` callback function of the files backend we
implement deletion of references. This is done in two steps:
1. We lock the packed-refs file and delete all references from it in
a single transaction.
2. We delete all loose references via separate calls to
`refs_delete_ref()`.
These steps essentially duplicate the logic around locking and deletion
order that we already have in the transactional interfaces, where we do
know to lock and evict references from the packed-refs file. Despite the
fact that we duplicate the logic, it's also less efficient than if we
used a single generic transaction:
- The transactional interface knows to skip locking of the packed
refs in case they don't contain any of the refs which are about to
be deleted.
- We end up creating N+1 separate reference transactions, one for
the packed-refs file and N for the individual loose references.
Refactor the code to instead delete references via a single transaction.
As we don't assert the expected old object ID this is equivalent to the
previous behaviour, and we already do the same in the packed-refs
backend.
Despite the fact that the result is simpler to reason about, this change
also results in improved performance. The following benchmarks have been
executed in linux.git:
Benchmark 1: master packed=true refcount=1
Time (mean ± σ): 7.8 ms ± 1.6 ms [User: 3.4 ms, System: 4.4 ms]
Range (min … max): 5.5 ms … 11.0 ms 120 runs
Benchmark 2: master packed=false refcount=1
Time (mean ± σ): 7.0 ms ± 1.1 ms [User: 3.2 ms, System: 3.8 ms]
Range (min … max): 5.7 ms … 9.8 ms 180 runs
Benchmark 3: master packed=true refcount=1000
Time (mean ± σ): 283.8 ms ± 5.2 ms [User: 45.7 ms, System: 231.5 ms]
Range (min … max): 276.7 ms … 291.6 ms 10 runs
Benchmark 4: master packed=false refcount=1000
Time (mean ± σ): 284.4 ms ± 5.3 ms [User: 44.2 ms, System: 233.6 ms]
Range (min … max): 277.1 ms … 293.3 ms 10 runs
Benchmark 5: generic-delete-refs packed=true refcount=1
Time (mean ± σ): 6.2 ms ± 1.8 ms [User: 2.3 ms, System: 3.9 ms]
Range (min … max): 4.1 ms … 12.2 ms 142 runs
Benchmark 6: generic-delete-refs packed=false refcount=1
Time (mean ± σ): 7.1 ms ± 1.4 ms [User: 2.8 ms, System: 4.3 ms]
Range (min … max): 4.2 ms … 10.8 ms 157 runs
Benchmark 7: generic-delete-refs packed=true refcount=1000
Time (mean ± σ): 198.9 ms ± 1.7 ms [User: 29.5 ms, System: 165.7 ms]
Range (min … max): 196.1 ms … 201.4 ms 10 runs
Benchmark 8: generic-delete-refs packed=false refcount=1000
Time (mean ± σ): 199.7 ms ± 7.8 ms [User: 32.2 ms, System: 163.2 ms]
Range (min … max): 193.8 ms … 220.7 ms 10 runs
Summary
generic-delete-refs packed=true refcount=1 ran
1.14 ± 0.37 times faster than master packed=false refcount=1
1.15 ± 0.40 times faster than generic-delete-refs packed=false refcount=1
1.26 ± 0.44 times faster than master packed=true refcount=1
32.24 ± 9.17 times faster than generic-delete-refs packed=true refcount=1000
32.36 ± 9.29 times faster than generic-delete-refs packed=false refcount=1000
46.00 ± 13.10 times faster than master packed=true refcount=1000
46.10 ± 13.13 times faster than master packed=false refcount=1000
```
Especially in the case where we have many references we can see a clear
performance speedup of nearly 30%.
This is in contrast to the stated objecive in a27dcf89b68 (refs: make
delete_refs() virtual, 2016-09-04), where the virtual `delete_refs()`
function was introduced with the intent to speed things up rather than
making things slower. So it seems like we have outlived the need for a
virtual function.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t5510: ensure that the packed-refs file needs locking
One of the tests in t5510 asserts that a `git fetch --prune` detects
failures to prune branches. This is done by locking the packed-refs
file, which would then later lead to a locking issue when Git tries to
rewrite the file to prune the branches from it.
Interestingly though, we do not pack the about-to-be-pruned branch into
the packed-refs file, so it never even contained that branch in the
first place. While this is good enough right now because the pruning
will always lock the file regardless of whether it contains the branch
or not, this is a mere implementation detail. In fact, we're about to
rewrite branch deletions to make use of the ref transaction interface,
which knows to skip rewrites of the packed-refs file in the case where
it does not contain the branches in the first place, and this will break
the test.
Prepare the test for that change by packing the refs before trying to
prune them.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A warning is issued for options which are specified more than once
beginning with perl-Getopt-Long >= 2.55. In addition to causing users
to see warnings, this results in test failures which compare the output.
An example, from t9001-send-email.37:
| +++ diff -u expect actual
| --- expect 2023-11-14 10:38:23.854346488 +0000
| +++ actual 2023-11-14 10:38:23.848346466 +0000
| @@ -1,2 +1,7 @@
| +Duplicate specification "no-chain-reply-to" for option "no-chain-reply-to"
| +Duplicate specification "to-cover|to-cover!" for option "to-cover"
| +Duplicate specification "cc-cover|cc-cover!" for option "cc-cover"
| +Duplicate specification "no-thread" for option "no-thread"
| +Duplicate specification "no-to-cover" for option "no-to-cover"
| fatal: longline.patch:35 is longer than 998 characters
| warning: no patches were sent
| error: last command exited with $?=1
| not ok 37 - reject long lines
Remove the duplicate option specs. These are primarily the explicit
'--no-' prefix opts which were added in f471494303 (git-send-email.perl:
support no- prefix with older GetOptions, 2015-01-30). This was done
specifically to support perl-5.8.0 which includes Getopt::Long 2.32[1].
Getopt::Long 2.33 added support for the '--no-' prefix natively by
appending '!' to the option specification string, which was included in
perl-5.8.1 and is not present in perl-5.8.0. The previous commit bumped
the minimum supported Perl version to 5.8.1 so we no longer need to
provide the '--no-' variants for negatable options manually.
Teach `--git-completion-helper` to output the '--no-' options. They are
not included in the options hash and would otherwise be lost.
Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Todd Zullinger [Thu, 16 Nov 2023 19:30:10 +0000 (14:30 -0500)]
perl: bump the required Perl version to 5.8.1 from 5.8.0
The following commit will make use of a Getopt::Long feature which is
only present in Perl >= 5.8.1. Document that as the minimum version we
support.
Many of our Perl scripts will continue to run with 5.8.0 but this change
allows us to adjust them as needed without breaking any promises to our
users.
The Perl requirement was last changed in d48b284183 (perl: bump the
required Perl version to 5.8 from 5.6.[21], 2010-09-24). At that time,
5.8.0 was 8 years old. It is now over 21 years old.
Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Victoria Dye [Tue, 14 Nov 2023 19:53:58 +0000 (19:53 +0000)]
t/perf: add perf tests for for-each-ref
Add performance tests for 'for-each-ref'. The tests exercise different
combinations of filters/formats/options, as well as the overall performance
of 'git for-each-ref | git cat-file --batch-check' to demonstrate the
performance difference vs. 'git for-each-ref' with "%(*fieldname)" format
specifiers.
All tests are run against a repository with 40k loose refs - 10k commits,
each having a unique:
- branch
- custom ref (refs/custom/special_*)
- annotated tag pointing at the commit
- annotated tag pointing at the other annotated tag (i.e., a nested tag)
After those tests are finished, the refs are packed with 'pack-refs --all'
and the same tests are rerun.
Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Victoria Dye [Tue, 14 Nov 2023 19:53:57 +0000 (19:53 +0000)]
ref-filter.c: use peeled tag for '*' format fields
In most builtins ('rev-parse <revision>^{}', 'show-ref --dereference'),
"dereferencing" a tag refers to a recursive peel of the tag object. Unlike
these cases, the dereferencing prefix ('*') in 'for-each-ref' format
specifiers triggers only a single, non-recursive dereference of a given tag
object. For most annotated tags, a single dereference is all that is needed
to access the tag's associated commit or tree; "recursive" and
"non-recursive" dereferencing are functionally equivalent in these cases.
However, nested tags (annotated tags whose target is another annotated tag)
dereferenced once return another tag, where a recursive dereference would
return the commit or tree.
Currently, if a user wants to filter & format refs and include information
about a recursively-dereferenced tag, they can do so with something like
'cat-file --batch-check':
But the combination of commands is inefficient. So, to improve the
performance of this use case and align the defererencing behavior of
'for-each-ref' with that of other commands, update the ref formatting code
to use the peeled tag (from 'peel_iterated_oid()') to populate '*' fields
rather than the tag's immediate target object (from 'get_tagged_oid()').
Additionally, add a test to 't6300-for-each-ref' to verify new nested tag
behavior and update 't6302-for-each-ref-filter.sh' to print the correct
value for nested dereferenced fields.
Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Victoria Dye [Tue, 14 Nov 2023 19:53:56 +0000 (19:53 +0000)]
for-each-ref: clean up documentation of --format
Move the description of the `*` prefix from the --format option
documentation to the part of the command documentation that deals with other
object type-specific modifiers. Also reorganize and reword the remaining
--format documentation so that the explanation of the default format doesn't
interrupt the details on format string interpolation.
Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>