Junio C Hamano [Tue, 27 Feb 2024 02:10:24 +0000 (18:10 -0800)]
Merge branch 'ps/reftable-iteration-perf'
The code to iterate over refs with the reftable backend has seen
some optimization.
* ps/reftable-iteration-perf:
reftable/reader: add comments to `table_iter_next()`
reftable/record: don't try to reallocate ref record name
reftable/block: swap buffers instead of copying
reftable/pq: allocation-less comparison of entry keys
reftable/merged: skip comparison for records of the same subiter
reftable/merged: allocation-less dropping of shadowed records
reftable/record: introduce function to compare records by key
Junio C Hamano [Tue, 27 Feb 2024 02:10:24 +0000 (18:10 -0800)]
Merge branch 'cp/apply-core-filemode'
"git apply" on a filesystem without filemode support have learned
to take a hint from what is in the index for the path, even when
not working with the "--index" or "--cached" option, when checking
the executable bit match what is required by the preimage in the
patch.
* cp/apply-core-filemode:
apply: code simplification
apply: correctly reverse patch's pre- and post-image mode bits
apply: ignore working tree filemode when !core.filemode
Junio C Hamano [Tue, 27 Feb 2024 02:10:23 +0000 (18:10 -0800)]
Merge branch 'ps/reftable-backend'
Integrate the reftable code into the refs framework as a backend.
* ps/reftable-backend:
refs/reftable: fix leak when copying reflog fails
ci: add jobs to test with the reftable backend
refs: introduce reftable backend
Jeff King [Tue, 20 Feb 2024 01:09:36 +0000 (20:09 -0500)]
trailer: fix comment/cut-line regression with opts->no_divider
Commit 97e9d0b78a (trailer: find the end of the log message, 2023-10-20)
combined two code paths for finding the end of the log message. For the
"no_divider" case, we used to use find_trailer_end(), and that has now
been rolled into find_end_of_log_message(). But there's a regression;
that function returns early when no_divider is set, returning the whole
string.
That's not how find_trailer_end() behaved. Although it did skip the
"---" processing (which is what "no_divider" is meant to do), we should
still respect ignored_log_message_bytes(), which covers things like
comments, "commit -v" cut lines, and so on.
The bug is actually in the interpret-trailers command, but the obvious
way to experience it is by running "commit -v" with a "--trailer"
option. The new trailer will be added at the end of the verbose diff,
rather than before it (and consequently will be ignored entirely, since
everything after the diff's intro scissors line is thrown away).
I've added two tests here: one for interpret-trailers directly, which
shows the bug via the parsing routines, and one for "commit -v".
The fix itself is pretty simple: instead of returning early, no_divider
just skips the "---" handling but still calls ignored_log_message_bytes().
Reported-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teng Long [Mon, 12 Feb 2024 13:04:30 +0000 (21:04 +0800)]
l10n: zh_CN: for git 2.44 rounds
In addition to the localized translation in 2.44, for zh_CN, we have
uniformly modified the translation of the word "commit-graph" to make it
more consistent with language usage habits.
Jiang Xin [Thu, 15 Feb 2024 01:48:25 +0000 (09:48 +0800)]
Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (51 commits)
Hopefully the last batch of fixes before 2.44 final
Git 2.43.2
A few more fixes before -rc1
write-or-die: fix the polarity of GIT_FLUSH environment variable
A few more topics before -rc1
completion: add and use __git_compute_second_level_config_vars_for_section
completion: add and use __git_compute_first_level_config_vars_for_section
completion: complete 'submodule.*' config variables
completion: add space after config variable names also in Bash 3
receive-pack: use find_commit_header() in check_nonce()
ci(linux32): add a note about Actions that must not be updated
ci: bump remaining outdated Actions versions
unit-tests: do show relative file paths on non-Windows, too
receive-pack: use find_commit_header() in check_cert_push_options()
prune: mark rebase autostash and orig-head as reachable
sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands
ref-filter.c: sort formatted dates by byte value
ssh signing: signal an error with a negative return value
bisect: document command line arguments for "bisect start"
bisect: document "terms" subcommand more fully
...
Junio C Hamano [Wed, 14 Feb 2024 23:36:06 +0000 (15:36 -0800)]
Merge branch 'pb/complete-config'
The command line completion script (in contrib/) learned to
complete configuration variable names better.
* pb/complete-config:
completion: add and use __git_compute_second_level_config_vars_for_section
completion: add and use __git_compute_first_level_config_vars_for_section
completion: complete 'submodule.*' config variables
completion: add space after config variable names also in Bash 3
Junio C Hamano [Wed, 14 Feb 2024 23:36:05 +0000 (15:36 -0800)]
Merge branch 'rs/receive-pack-remove-find-header'
Code simplification.
* rs/receive-pack-remove-find-header:
receive-pack: use find_commit_header() in check_nonce()
receive-pack: use find_commit_header() in check_cert_push_options()
Junio C Hamano [Wed, 14 Feb 2024 23:36:05 +0000 (15:36 -0800)]
Merge branch 'pw/gc-during-rebase'
The sequencer machinery does not use the ref API and instead
records names of certain objects it needs for its correct operation
in temporary files, which makes these objects susceptible to loss
by garbage collection. These temporary files have been added as
starting points for reachability analysis to fix this.
* pw/gc-during-rebase:
prune: mark rebase autostash and orig-head as reachable
Jiang Xin [Wed, 14 Feb 2024 08:46:41 +0000 (16:46 +0800)]
diff: mark param1 and param2 as placeholders
Some l10n translators translated the parameters "files", "param1" and
"param2" in the following message:
"synonym for --dirstat=files,param1,param2..."
Translating "param1" and "param2" is OK, but changing the parameter
"files" is wrong. The parameters that are not meant to be used verbatim
should be marked as placeholders, but the verbatim parameter not marked
as a placeholder should be left as is.
This change is a complement for commit 51e846e673 (doc: enforce
placeholders in documentation, 2023-12-25).
With the help of Jean-Noël,some parameter combinations in one
placeholder (e.g. "<param1,param2>...") are splited into seperate
placeholders.
Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 13 Feb 2024 22:31:11 +0000 (14:31 -0800)]
Merge branch 'jc/unit-tests-make-relative-fix'
The mechanism to report the filename in the source code, used by
the unit-test machinery, assumed that the compiler expanded __FILE__
to the path to the source given to the $(CC), but some compilers
give full path, breaking the output. This has been corrected.
* jc/unit-tests-make-relative-fix:
unit-tests: do show relative file paths on non-Windows, too
The Perl version of the add -i/-p commands has been removed since 20b813d (add: remove "add.interactive.useBuiltin" & Perl "git
add--interactive", 2023-02-07)
Therefore, Perl prerequisite in the test scripts which use the patch
mode functionality is not neccessary.
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, (restore, checkout, reset) commands correctly take '@' as a
synonym for 'HEAD'. However, in patch mode different prompts/messages
are given on command line due to patch mode machinery not considering
'@' to be a synonym for 'HEAD' due to literal string comparison with
the word 'HEAD', and therefore assigning patch_mode_($command)_nothead
and triggering reverse mode (-R in diff-index). The NEEDSWORK comment
suggested comparing commit objects to get around this. However, doing
so would also take a non-checked out branch pointing to the same commit
as HEAD, as HEAD. This would cause confusion to the user.
Therefore, after parsing '@', replace it with 'HEAD' as reasonably
early as possible. This also solves another problem of disparity
between 'git checkout HEAD' and 'git checkout @' (latter detaches at
the HEAD commit and the former does not).
Trade-offs:
- Some of the errors would show the revision argument as 'HEAD' when
given '@'. This should be fine, as most users who probably use '@'
would be aware that it is a shortcut for 'HEAD' and most probably
used to use 'HEAD'. There is also relevant documentation in
'gitrevisions' manpage about '@' being the shortcut for 'HEAD'. Also,
the simplicity of the solution far outweighs this cost.
- Consider '@' as a shortcut for 'HEAD' even if 'refs/heads/@' exists
at a different commit. Naming a branch '@' is an obvious foot-gun and
many existing commands already take '@' for 'HEAD' even if
'refs/heads/@' exists at a different commit or does not exist at all
(e.g. 'git log @', 'git push origin @' etc.). Therefore this is an
existing assumption and should not be a problem.
Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 13 Feb 2024 19:48:15 +0000 (11:48 -0800)]
write-or-die: fix the polarity of GIT_FLUSH environment variable
When GIT_FLUSH is set to 1, true, on, yes, then we should disable
skip_stdout_flush, but the conversion somehow did the opposite.
With the understanding of the original motivation behind "skip" in 06f59e9f (Don't fflush(stdout) when it's not helpful, 2007-06-29),
we can sympathize with the current naming (we wanted to avoid
useless flushing of stdout by default, with an escape hatch to
always flush), but it is still not a good excuse.
Retire the "skip_stdout_flush" variable and replace it with "flush_stdout"
that tells if we do or do not want to run fflush().
Reported-by: Xiaoguang WANG <wxiaoguang@gmail.com> Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch" and friends learned to use the formatted text as
sorting key, not the underlying timestamp value, when the --sort
option is used with author or committer timestamp with a format
specifier (e.g., "--sort=creatordate:format:%H:%M:%S").
* vd/for-each-ref-sort-with-formatted-timestamp:
ref-filter.c: sort formatted dates by byte value
Junio C Hamano [Mon, 12 Feb 2024 21:16:10 +0000 (13:16 -0800)]
Merge branch 'bk/complete-bisect'
Command line completion support (in contrib/) has been
updated for "git bisect".
* bk/complete-bisect:
completion: bisect: recognize but do not complete view subcommand
completion: bisect: complete log opts for visualize subcommand
completion: new function __git_complete_log_opts
completion: bisect: complete missing --first-parent and - -no-checkout options
completion: bisect: complete custom terms and related options
completion: bisect: complete bad, new, old, and help subcommands
completion: tests: always use 'master' for default initial branch name
Junio C Hamano [Mon, 12 Feb 2024 21:16:10 +0000 (13:16 -0800)]
Merge branch 'ps/reftable-styles'
Code clean-up in various reftable code paths.
* ps/reftable-styles:
reftable/record: improve semantics when initializing records
reftable/merged: refactor initialization of iterators
reftable/merged: refactor seeking of records
reftable/stack: use `size_t` to track stack length
reftable/stack: use `size_t` to track stack slices during compaction
reftable/stack: index segments with `size_t`
reftable/stack: fix parameter validation when compacting range
reftable: introduce macros to allocate arrays
reftable: introduce macros to grow arrays
Write multi-level indices for reftable has been corrected.
* ps/reftable-multi-level-indices-fix:
reftable: document reading and writing indices
reftable/writer: fix writing multi-level indices
reftable/writer: simplify writing index records
reftable/writer: use correct type to iterate through index entries
reftable/reader: be more careful about errors in indexed seeks
Philippe Blain [Sat, 10 Feb 2024 18:32:23 +0000 (18:32 +0000)]
completion: add and use __git_compute_second_level_config_vars_for_section
In a previous commit we removed some hardcoded config variable names from
function __git_complete_config_variable_name in the completion script by
introducing a new function,
__git_compute_first_level_config_vars_for_section.
The remaining hardcoded config variables are "second level"
configuration variables, meaning 'branch.<name>.upstream',
'remote.<name>.url', etc. where <name> is a user-defined name.
Making use of the new existing --config flag to 'git help', add a new
function, __git_compute_second_level_config_vars_for_section. This
function takes as argument a config section name and computes the
corresponding second-level config variables, i.e. those that contain a
'<' which indicates the start of a placeholder. Note that as in
__git_compute_first_level_config_vars_for_section added previsouly, we
use indirect expansion instead of associative arrays to stay compatible
with Bash 3 on which macOS is stuck for licensing reasons.
As explained in the previous commit, we use the existing pattern in the
completion script of using global variables to cache the list of
variables for each section.
Use this new function and the variables it defines in
__git_complete_config_variable_name to remove hardcoded config
variables, and add a test to verify the new function. Use a single
'case' for all sections with second-level variables names, since the
code for each of them is now exactly the same.
Adjust the name of a test added in a previous commit to reflect that it
now tests the added function.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Sat, 10 Feb 2024 18:32:22 +0000 (18:32 +0000)]
completion: add and use __git_compute_first_level_config_vars_for_section
The function __git_complete_config_variable_name in the Bash completion
script hardcodes several config variable names. These variables are
those in config sections where user-defined names can appear, such as
"branch.<name>". These sections are treated first by the case statement,
and the two last "catch all" cases are used for other sections, making
use of the __git_compute_config_vars and __git_compute_config_sections
function, which omit listing any variables containing wildcards or
placeholders. Having hardcoded config variables introduces the risk of
the completion code becoming out of sync with the actual config
variables accepted by Git.
To avoid these hardcoded config variables, introduce a new function,
__git_compute_first_level_config_vars_for_section, making use of the
existing __git_config_vars variable. This function takes as argument a
config section name and computes the matching "first level" config
variables for that section, i.e. those _not_ containing any placeholder,
like 'branch.autoSetupMerge, 'remote.pushDefault', etc. Use this
function and the variables it defines in the 'branch.*', 'remote.*' and
'submodule.*' switches of the case statement instead of hardcoding the
corresponding config variables. Note that we use indirect expansion to
create a variable for each section, instead of using a single
associative array indexed by section names, because associative arrays
are not supported in Bash 3, on which macOS is stuck for licensing
reasons.
Use the existing pattern in the completion script of using global
variables to cache the list of config variables for each section. The
rationale for such caching is explained in eaa4e6ee2a (Speed up bash
completion loading, 2009-11-17), and the current approach to using and
defining them via 'test -n' is explained in cf0ff02a38 (completion: work
around zsh option propagation bug, 2012-02-02).
Adjust the name of one of the tests added in the previous commit,
reflecting that it now also tests the new function.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the Bash completion script, function
__git_complete_config_variable_name completes config variables and has
special logic to deal with config variables involving user-defined
names, like branch.<name>.* and remote.<name>.*.
This special logic is missing for submodule-related config variables.
Add the appropriate branches to the case statement, making use of the
in-tree '.gitmodules' to list relevant submodules.
Add corresponding tests in t9902-completion.sh, making sure we complete
both first level submodule config variables as well as second level
variables involving submodule names.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Philippe Blain [Sat, 10 Feb 2024 18:32:20 +0000 (18:32 +0000)]
completion: add space after config variable names also in Bash 3
In be6444d1ca (completion: bash: add correct suffix in variables,
2021-08-16), __git_complete_config_variable_name was changed to use
"${sfx- }" instead of "$sfx" as the fourth argument of _gitcomp_nl and
_gitcomp_nl_append, such that this argument evaluates to a space if sfx
is unset. This was to ensure that e.g.
git config branch.autoSetupMe[TAB]
correctly completes to 'branch.autoSetupMerge ' with the trailing space.
This commits notes that the fix only works in Bash 4 because in Bash 3
the 'local sfx' construct at the beginning of
__git_complete_config_variable_name creates an empty string.
Make the fix also work for Bash 3 by using the "unset or null' parameter
expansion syntax ("${sfx:- }"), such that the parameter is also expanded
to a space if it is set but null, as is the behaviour of 'local sfx' in
Bash 3.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 10 Feb 2024 07:43:01 +0000 (08:43 +0100)]
use xstrncmpz()
Add and apply a semantic patch for calling xstrncmpz() to compare a
NUL-terminated string with a buffer of a known length instead of using
strncmp() and checking the terminating NUL explicitly. This simplifies
callers by reducing code duplication.
I had to adjust remote.c manually because Coccinelle inexplicably
changed the indent of the else branches.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Fri, 9 Feb 2024 20:41:47 +0000 (21:41 +0100)]
receive-pack: use find_commit_header() in check_nonce()
Use the public function find_commit_header() and remove find_header(),
as it becomes unused. This is safe and appropriate because we pass the
NUL-terminated payload buffer to check_nonce() instead of its start and
length. The underlying strbuf push_cert cannot contain NULs, as it is
built using strbuf_addstr(), only.
We no longer need to call strlen(), as find_commit_header() returns the
length of nonce already.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/reader: add comments to `table_iter_next()`
While working on the optimizations in the preceding patches I stumbled
upon `table_iter_next()` multiple times. It is quite easy to miss the
fact that we don't call `table_iter_next_in_block()` twice, but that the
second call is in fact `table_iter_next_block()`.
Add comments to explain what exactly is going on here to make things
more obvious. While at it, touch up the code to conform to our code
style better.
Note that one of the refactorings merges two conditional blocks into
one. Before, we had the following code:
reftable/record: don't try to reallocate ref record name
When decoding reftable ref records we first release the pointer to the
record passed to us and then use realloc(3P) to allocate the refname
array. This is a bit misleading though as we know at that point that the
refname will always be `NULL`, so we would always end up allocating a
new char array anyway.
Refactor the code to use `REFTABLE_ALLOC_ARRAY()` instead. As the
following benchmark demonstrates this is a tiny bit more efficient. But
the bigger selling point really is the gained clarity.
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 150.1 ms ± 4.1 ms [User: 146.6 ms, System: 3.3 ms]
Range (min … max): 144.5 ms … 180.5 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 148.9 ms ± 4.5 ms [User: 145.2 ms, System: 3.4 ms]
Range (min … max): 143.0 ms … 185.4 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.01 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)
Ideally, we should try and reuse the memory of the old record instead of
first freeing and then immediately reallocating it. This requires some
more surgery though and is thus left for a future iteration.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When iterating towards the next record in a reftable block we need to
keep track of the key that the last record had. This is required because
reftable records use prefix compression, where subsequent records may
reuse parts of their preceding record's key.
This key is stored in the `block_iter::last_key`, which we update after
every call to `block_iter_next()`: we simply reset the buffer and then
add the current key to it.
This is a bit inefficient though because it requires us to copy over the
key on every iteration, which adds up when iterating over many records.
Instead, we can make use of the fact that the `block_iter::key` buffer
is basically only a scratch buffer. So instead of copying over contents,
we can just swap both buffers.
The following benchmark prints a single ref matching a specific pattern
out of 1 million refs via git-show-ref(1):
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 155.7 ms ± 5.0 ms [User: 152.1 ms, System: 3.4 ms]
Range (min … max): 150.8 ms … 185.7 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 150.8 ms ± 4.2 ms [User: 147.1 ms, System: 3.5 ms]
Range (min … max): 145.1 ms … 180.7 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.03 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>