doc: mention that proxies must be completely transparent
We already document in the FAQ that proxies must be completely
transparent and not modify the request or response in any way, but add
similar documentation to the http.proxy entry. We know that while the
FAQ is very useful, users sometimes are less likely to read in favor of
the documentation specific to an option or command, so adding it in both
places will help users be adequately informed.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users very commonly want to sync their working tree with uncommitted
changes across machines, often to carry across in-progress work or
stashes. Despite this not being a recommended approach, users want to
do it and are not dissuaded by suggestions not to, so let's recommend a
sensible technique.
The technique that many users are using is their preferred cloud syncing
service, which is a bad idea. Users have reported problems where they
end up with duplicate files that won't go away (with names like "file.c
2"), broken references, oddly named references that have date stamps
appended to them, missing objects, and general corruption and data loss.
That's because almost all of these tools sync file by file, which is a
great technique if your project is a single word processing document or
spreadsheet, but is utterly abysmal for Git repositories because they
don't necessarily snapshot the entire repository correctly. They also
tend to sync the files immediately instead of when the repository is
quiescent, so writing multiple files, as occurs during a commit or a gc,
can confuse the tools and lead to corruption.
We know that the old standby, rsync, is up to the task, provided that
the repository is quiescent, so let's suggest that and dissuade people
from using cloud syncing tools. Let's tell people about common things
they should be aware of before doing this and that this is still
potentially risky. Additionally, let's tell people that Git's security
model does not permit sharing working trees across users in case they
planned to do that. While we'd still prefer users didn't try to do
this, hopefully this will lead them in a safer direction.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitfaq: give advice on using eol attribute in gitattributes
In the FAQ, we tell people how to use the text attribute, but we fail to
explain what to do with the eol attribute. As we ourselves have
noticed, most shell implementations do not care for carriage returns,
and as such, people will practically always want them to use LF endings.
Similar things can be said for batch files on Windows, except with CRLF
endings.
Since these are common things to have in a repository, let's help users
make a good decision by recommending that they use the gitattributes
file to correctly check out the endings.
In addition, let's correct the cross-reference to this question, which
originally referred to "the following entry", even though a new entry
has been inserted in between. The cross-reference notation should
prevent this from occurring and provide a link in formats, such as HTML,
which support that.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many corporate environments and local systems have proxies in use. Note
the situations in which proxies can be used and how to configure them.
At the same time, note what standards a proxy must follow to work with
Git. Explicitly call out certain classes that are known to routinely
have problems reported various places online, including in the Git for
Windows issue tracker and on Stack Overflow, and recommend against the
use of such software, noting that they are associated with myriad
security problems (including, for example, breaking sandboxing and image
integrity[0], and, for TLS middleboxes, the use of insecure protocols
and ciphers and lack of certificate verification[1]). Don't mention the
specific nature of these security problems in the FAQ entry because they
are extremely numerous and varied and we wish to keep the FAQ entry
relatively brief.
Junio C Hamano [Mon, 8 Jul 2024 22:52:11 +0000 (15:52 -0700)]
ci: unify bash calling convention
Under ci/ hierarchy, we run scripts under either "sh" (any Bourne
compatible POSIX shell would work) or specifically "bash" (as they
require features from bash, e.g., ${parameter/pattern/string}
expansion). As we have the CI environment under our control, we can
expect that /bin/sh will always be fine to run the scripts that only
require a Bourne shell, but we may not know where "bash" is
installed depending on the distro used.
So let's make sure we start these scripts with either one of these:
#!/bin/sh
#!/usr/bin/env bash
Yes, the latter has to assume that everybody installs "env" at that
path and not as /bin/env or /usr/local/bin/env, but this currently
is the best we could do.
Junio C Hamano [Mon, 8 Jul 2024 21:53:09 +0000 (14:53 -0700)]
Merge branch 'tb/path-filter-fix'
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.
* tb/path-filter-fix:
bloom: introduce `deinit_bloom_filters()`
commit-graph: reuse existing Bloom filters where possible
object.h: fix mis-aligned flag bits table
commit-graph: new Bloom filter version that fixes murmur3
commit-graph: unconditionally load Bloom filters
bloom: prepare to discard incompatible Bloom filters
bloom: annotate filters with hash version
repo-settings: introduce commitgraph.changedPathsVersion
t4216: test changed path filters with high bit paths
t/helper/test-read-graph: implement `bloom-filters` mode
bloom.h: make `load_bloom_filter_from_graph()` public
t/helper/test-read-graph.c: extract `dump_graph_info()`
gitformat-commit-graph: describe version 2 of BDAT
commit-graph: ensure Bloom filters are read with consistent settings
revision.c: consult Bloom filters for root commits
t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
Junio C Hamano [Mon, 8 Jul 2024 21:53:08 +0000 (14:53 -0700)]
Merge branch 'rj/pager-die-upon-exec-failure'
When GIT_PAGER failed to spawn, depending on the code path taken,
we failed immediately (correct) or just spew the payload to the
standard output (incorrect). The code now always fail immediately
when GIT_PAGER fails.
* rj/pager-die-upon-exec-failure:
pager: die when paging to non-existing command
"git archive --add-virtual-file=<path>:<contents>" never paid
attention to the --prefix=<prefix> option but the documentation
said it would. The documentation has been corrected.
* jc/archive-prefix-with-add-virtual-file:
archive: document that --add-virtual-file takes full path
Typically, forcing a sparse index to expand to a full index means that
Git could not determine the status of a file outside of the
sparse-checkout and needed to expand sparse trees into the full list of
sparse blobs. This operation can be very slow when the sparse-checkout
is much smaller than the full tree at HEAD.
When users are in this state, there is usually a modified or untracked
file outside of the sparse-checkout mentioned by the output of 'git
status'. There are a number of reasons why this is insufficient:
1. Users may not have a full understanding of which files are inside or
outside of their sparse-checkout. This is more common in monorepos
that manage the sparse-checkout using custom tools that map build
dependencies into sparse-checkout definitions.
2. In some cases, an empty directory could exist outside the
sparse-checkout and these empty directories are not reported by 'git
status' and friends.
3. If the user has '.gitignore' or 'exclude' files, then 'git status'
will squelch the warnings and not demonstrate any problems.
In order to help users who are in this state, add a new advice message
to indicate that a sparse index is expanded to a full index. This
message should be written at most once per process, so add a static
global 'give_advice_on_expansion' to sparse-index.c. Further, there is a
case in 'git sparse-checkout set' that uses the sparse index as an
in-memory data structure (even when writing a full index) so we need to
disable the message in that kind of case.
The t1092-sparse-checkout-compatibility.sh test script compares the
behavior of several Git commands across full and sparse repositories,
including sparse repositories with and without a sparse index. We need
to disable the advice in the sparse-index repo to avoid differences in
stderr. By leaving the advice on in the sparse-checkout repo (without
the sparse index), we can test the behavior of disabling the advice in
convert_to_sparse(). (Indeed, these tests are how that necessity was
discovered.) Add a test that reenables the advice and demonstrates that
the message is output.
The advice message is defined outside of expand_index() to avoid super-
wide lines. It is also defined as a macro to avoid compile issues with
-Werror=format-security.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: rss/atom change published/updated date to committer date
The author date is used for published/updated date in the rss/atom
feed stream. Change it to the committer date that reflects the
"published/updated" definition better and makes rss/atom feeds more
linear. Gitlab/Github rss/atom feeds use the committer date.
Additionally, to be consistent, also use the committer date to
determine the date of the last commit to send in the feed
instead of the author date.
Signed-off-by: Jesús Ariel Cabello Mateos <080ariel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 8 Jul 2024 05:50:37 +0000 (22:50 -0700)]
Merge https://github.com/j6t/git-gui
* https://github.com/j6t/git-gui:
git-gui: fix inability to quit after closing another instance
git-gui: sv.po: Update Swedish translation (576t0f0u)
git-gui: note the new maintainer
Makefile(s): do not enforce "all indents must be done with tab"
Makefile(s): avoid recipe prefix in conditional statements
doc: switch links to https
doc: update links to current pages
git-gui: po: fix typo in French "aperçu"
Johannes Sixt [Sun, 7 Jul 2024 12:00:23 +0000 (14:00 +0200)]
Merge branch 'os/catch-rename'
The problem can be reproduced on Linux with this sequence:
1. Run git gui from a terminal.
2. Edit the commit message and wait for at least 2 seconds.
3. Terminate the instance from the terminal, for example with Ctrl-C,
to simulate crash. This leaves the file .git/GITGUI_BCK behind.
4. Start two instances of git gui &.
At this point the first instance can be closed (it renames
.git/GITGUI_BCK to .git/GITGUI_MSG), but the seconds brings an error
message about the absent file and cannot be closed thereafter and must
be killed from the command line.
The renaming that happens by the first instance is the correct action
and need not be repeated by the second instance. It is the correct
action to ignore the failed renaming.
On the other hand, the second instance could just edit the commit
message again, wait 2 seconds to write GITGUI_BCK, and then can be
closed without failing. At this point, since the user has edited the
message, it is again correct to preserve the edited version in
GITGUI_MSG.
* os/catch-rename:
git-gui: fix inability to quit after closing another instance
René Scharfe [Fri, 5 Jul 2024 19:25:43 +0000 (21:25 +0200)]
clang-format: include kh_foreach* macros in ForEachMacros
The command for generating the list of ForEachMacros searches for
macros whose name contains the string "for_each". Include those whose
name contains "foreach" as well. That brings in kh_foreach and
kh_foreach_value from khash.h.
Regenerating the list also brings in hashmap-based macros added by 87571c3f71 (hashmap: use *_entry APIs for iteration, 2019-10-06), f0e63c4113 (hashmap: use *_entry APIs to wrap container_of, 2019-10-06), 4fa1d501f7 (strmap: add functions facilitating use as a string->int map,
2020-11-05), b70c82e6ed (strmap: add more utility functions,
2020-11-05), and 1201eb628a (strmap: add a strset sub-type, 2020-11-06).
for_each_abbrev is no longer found because its definition was removed by d850b7a545 (cocci: apply the "cache.h" part of "the_repository.pending",
2023-03-28). Note that it had been a false positive, though, as it had
been a function wrapper, not a for-like macro.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Fri, 5 Jul 2024 18:51:09 +0000 (14:51 -0400)]
config.mak.dev: fix typo when enabling -Wpedantic
In ebd2e4a13a (Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format
better, 2021-09-28), we tightened our Makefile's behavior to only enable
-Wpedantic when compiling with either gcc5/clang4 or greater as older
compiler versions did not have support for -Wpedantic.
Commit ebd2e4a13a was looking for either "gcc5" or "clang4" to appear in
the COMPILER_FEATURES variable, combining the two "$(filter ...)"
searches with an "$(or ...)".
But ebd2e4a13a has a typo where instead of writing:
ifneq ($(or ($filter ...),$(filter ...)),)
we wrote:
ifneq (($or ($filter ...),$(filter ...)),)
Causing our Makefile (when invoked with DEVELOPER=1, and a sufficiently
recent compiler version) to barf:
$ make DEVELOPER=1
config.mak.dev:13: extraneous text after 'ifneq' directive
[...]
Correctly combine the results of the two "$(filter ...)" operations by
using "$(or ...)", not "$or".
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Fri, 5 Jul 2024 17:03:36 +0000 (19:03 +0200)]
t-strvec: use test_msg()
check_strvec_loc() checks each strvec item by looping through them and
comparing them with expected values. If a check fails then we'd like
to know which item is affected. It reports that information by building
a strbuf and delivering its contents using a failing assertion, e.g.
if there are fewer items in the strvec than expected:
# check "vec->nr > nr" failed at t/unit-tests/t-strvec.c:19
# left: 1
# right: 1
# check "strvec index 1" failed at t/unit-tests/t-strvec.c:71
Note that the index variable is "nr" and thus the interesting value is
reported twice in that example (in lines three and four).
Stop printing the index explicitly for checks that already report it.
The message for the same condition as above becomes:
For the string comparison, whose error message doesn't include the
index, report it using the simpler and more appropriate test_msg()
instead. Report the index using its actual variable name and format the
line like the preceding ones. The message for an unexpected string
value becomes:
t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
helper/test-oidmap.c along with t0016-oidmap.sh test the oidmap.h
library which is built on top of hashmap.h.
Migrate them to the unit testing framework for better performance,
concise code and better debugging. Along with the migration also plug
memory leaks and make the test logic independent for all the tests.
The migration removes 'put' tests from t0016, because it is used as
setup to all the other tests, so testing it separately does not yield
any benefit.
Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Reviewed-by: Josh Steadmon <steadmon@google.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 2 Jul 2024 19:57:47 +0000 (12:57 -0700)]
push: avoid showing false negotiation errors
When "git push" is configured to use the push negotiation, a push of
deletion of a branch (without pushing anything else) may end up not
having anything to negotiate for the common ancestor discovery.
In such a case, we end up making an internal invocation of "git
fetch --negotiate-only" without any "--negotiate-tip" parameters
that stops the negotiate-only fetch from being run, which by itself
is not a bad thing (one fewer round-trip), but the end-user sees a
"fatal: --negotiate-only needs one or more --negotiation-tip=*"
message that the user cannot act upon.
Teach "git push" to notice the situation and omit performing the
negotiate-only fetch to begin with. One fewer process spawned, one
fewer "alarming" message given the user.
Junio C Hamano [Tue, 2 Jul 2024 20:51:43 +0000 (13:51 -0700)]
checkout: special case error messages during noop switching
"git checkout" ran with no branch and no pathspec behaves like
switching the branch to the current branch (in other words, a
no-op, except that it gives a side-effect "here are the modified
paths" report). But unlike "git checkout HEAD" or "git checkout
main" (when you are on the 'main' branch), the user is much less
conscious that they are "switching" to the current branch.
This twists end-user expectation in a strange way. There are
options (like "--ours") that make sense only when we are checking
out paths out of either the tree-ish or out of the index. So the
error message the command below gives
$ git checkout --ours
fatal: '--ours/theirs' cannot be used with switching branches
is technically correct, but because the end-user may not even be
aware of the fact that the command they are issuing is about no-op
branch switching [*], they may find the error confusing.
Let's refactor the code to make it easier to special case the "no-op
branch switching" situation, and then customize the exact error
message for "--ours/--theirs". Since it is more likely that the
end-user forgot to give pathspec that is required by the option,
let's make it say
$ git checkout --ours
fatal: '--ours/theirs' needs the paths to check out
instead.
Among the other options that are incompatible with branch switching,
there may be some that benefit by having messages tweaked when a
no-op branch switching is done, but I'll leave them as #leftoverbits
material.
[Footnote]
* Yes, the end-users are irrational. When they did not give
"--ours", they take it granted that "git checkout" gives a short
status, e.g..
$ git checkout
M builtin/checkout.c
M t/t7201-co.sh
Junio C Hamano [Tue, 2 Jul 2024 16:59:01 +0000 (09:59 -0700)]
Merge branch 'jk/remote-wo-url'
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.
* jk/remote-wo-url:
remote: drop checks for zero-url case
remote: always require at least one url in a remote
t5801: test remote.*.vcs config
t5801: make remote-testgit GIT_DIR setup more robust
remote: allow resetting url list
config: document remote.*.url/pushurl interaction
remote: simplify url/pushurl selection
remote: use strvecs to store remote url/pushurl
remote: transfer ownership of memory in add_url(), etc
remote: refactor alias_url() memory ownership
archive: fix check for missing url
Junio C Hamano [Tue, 2 Jul 2024 16:59:01 +0000 (09:59 -0700)]
Merge branch 'jc/fuzz-sans-curl'
CI job to build minimum fuzzers learned to pass NO_CURL=NoThanks to
the build procedure, as its build environment does not offer, or
the rest of the build needs, anything cURL.
Junio C Hamano [Tue, 2 Jul 2024 16:59:00 +0000 (09:59 -0700)]
Merge branch 'rb/build-options-w-lib-versions'
"git version --build-options" reports the version information of
OpenSSL and other libraries (if used) in the build.
* rb/build-options-w-lib-versions:
version: teach --build-options to reports zlib version information
version: teach --build-options to reports libcurl version information
version: --build-options reports OpenSSL version information
Junio C Hamano [Tue, 2 Jul 2024 16:59:00 +0000 (09:59 -0700)]
Merge branch 'ps/use-the-repository'
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.
* ps/use-the-repository:
hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
t/helper: remove dependency on `the_repository` in "proc-receive"
t/helper: fix segfault in "oid-array" command without repository
t/helper: use correct object hash in partial-clone helper
compat/fsmonitor: fix socket path in networked SHA256 repos
replace-object: use hash algorithm from passed-in repository
protocol-caps: use hash algorithm from passed-in repository
oidset: pass hash algorithm when parsing file
http-fetch: don't crash when parsing packfile without a repo
hash-ll: merge with "hash.h"
refs: avoid include cycle with "repository.h"
global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
hash: require hash algorithm in `empty_tree_oid_hex()`
hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
hash: make `is_null_oid()` independent of `the_repository`
hash: convert `oidcmp()` and `oideq()` to compare whole hash
global: ensure that object IDs are always padded
hash: require hash algorithm in `oidread()` and `oidclr()`
hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
Junio C Hamano [Tue, 2 Jul 2024 16:27:57 +0000 (09:27 -0700)]
Merge branch 'jc/no-default-attr-tree-in-bare' into maint-2.45
Earlier we stopped using the tree of HEAD as the default source of
attributes in a bare repository, but failed to document it. This
has been corrected.
* jc/no-default-attr-tree-in-bare:
attr.tree: HEAD:.gitattributes is no longer the default in a bare repo
Junio C Hamano [Tue, 2 Jul 2024 16:27:56 +0000 (09:27 -0700)]
Merge branch 'tb/precompose-getcwd' into maint-2.45
We forgot to normalize the result of getcwd() to NFC on macOS where
all other paths are normalized, which has been corrected. This still
does not address the case where core.precomposeUnicode configuration
is not defined globally.
* tb/precompose-getcwd:
macOS: ls-files path fails if path of workdir is NFD
Junio C Hamano [Tue, 2 Jul 2024 16:27:56 +0000 (09:27 -0700)]
Merge branch 'pw/rebase-i-error-message' into maint-2.45
When the user adds to "git rebase -i" instruction to "pick" a merge
commit, the error experience is not pleasant. Such an error is now
caught earlier in the process that parses the todo list.
* pw/rebase-i-error-message:
rebase -i: improve error message when picking merge
rebase -i: pass struct replay_opts to parse_insn_line()
Junio C Hamano [Tue, 2 Jul 2024 16:27:55 +0000 (09:27 -0700)]
Merge branch 'ds/format-patch-rfc-and-k' into maint-2.45
The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".
* ds/format-patch-rfc-and-k:
format-patch: ensure that --rfc and -k are mutually exclusive
t-reftable-record: add tests for reftable_log_record_compare_key()
reftable_log_record_compare_key() is a function defined by
reftable/record.{c, h} and is used to compare the keys of two
log records when sorting multiple log records using 'qsort'.
In the current testing setup, this function is left unexercised.
Add a testing function for the same.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add tests for reftable_ref_record_compare_name()
reftable_ref_record_compare_name() is a function defined by
reftable/record.{c, h} and is used to compare the refname of two
ref records when sorting multiple ref records using 'qsort'.
In the current testing setup, this function is left unexercised.
Add a testing function for the same.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add index tests for reftable_record_is_deletion()
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for index records.
Add tests for this function in the case of index records.
Note that since index records cannot be of type deletion, this function
must always return '0' when called on an index record.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add obj tests for reftable_record_is_deletion()
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for two of the four record types (obj, index).
Add tests for this function in the case of obj records.
Note that since obj records cannot be of type deletion, this function
must always return '0' when called on an obj record.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add log tests for reftable_record_is_deletion()
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for three of the four record types (log, obj, index).
Add tests for this function in the case of log records.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add ref tests for reftable_record_is_deletion()
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for all the four record types (ref, log, obj, index).
Add tests for this function in the case of ref records.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add comparison tests for obj records
In the current testing setup for obj records, the comparison
functions for obj records, reftable_obj_record_cmp_void() and
reftable_obj_record_equal_void() are left untested.
Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_index_record_cmp_void() and reftable_index_record_equal_void()
respectively.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add comparison tests for index records
In the current testing setup for index records, the comparison
functions for index records, reftable_index_record_cmp() and
reftable_index_record_equal() are left untested.
Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_index_record_cmp() and reftable_index_record_equal()
respectively.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add comparison tests for ref records
In the current testing setup for ref records, the comparison
functions for ref records, reftable_ref_record_cmp_void() and
reftable_ref_record_equal() are left untested.
Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_ref_record_cmp_void() and reftable_ref_record_equal()
respectively.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t-reftable-record: add reftable_record_cmp() tests for log records
In the current testing setup for log records, only
reftable_log_record_equal() among log record's comparison functions
is tested.
Modify the existing tests to exercise reftable_log_record_cmp_void()
(using the wrapper function reftable_record_cmp()) alongside
reftable_log_record_equal().
Note that to achieve this, we'll need to replace instances of
reftable_log_record_equal() with the wrapper function
reftable_record_equal().
Rename the now modified test to reflect its nature of exercising
all comparison operations, not just equality.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t: move reftable/record_test.c to the unit testing framework
reftable/record_test.c exercises the functions defined in
reftable/record.{c, h}. Migrate reftable/record_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework, and renaming the tests to fit unit-tests' naming scheme.
While at it, change the type of index variable 'i' to 'size_t'
from 'int'. This is because 'i' is used in comparison against
'ARRAY_SIZE(x)' which is of type 'size_t'.
Also, use set_hash() which is defined locally in the test file
instead of set_test_hash() which is defined by
reftable/test_framework.{c, h}. This is fine to do as both these
functions are similarly implemented, and
reftable/test_framework.{c, h} is not #included in the ported test.
Get rid of reftable_record_print() from the tests as well, because
it clutters the test framework's output and we have no way of
verifying the output.
Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A quick test tells us that t0612 does not trigger any leak:
$ make SANITIZE=leak test GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true GIT_TEST_OPTS=-i T=t0612-reftable-jgit-compatibility.sh
[...]
*** t0612-reftable-jgit-compatibility.sh ***
in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
ok 1 - CGit repository can be read by JGit
ok 2 - JGit repository can be read by CGit
ok 3 - mixed writes from JGit and CGit
ok 4 - JGit can read multi-level index
# passed all 4 test(s)
1..4
# faking up non-zero exit with --invert-exit-code
make[2]: *** [Makefile:75: t0612-reftable-jgit-compatibility.sh] Error 1
Let's mark it as leak-free to silence the machinery activated by
`GIT_TEST_PASSING_SANITIZE_LEAK=check`.
Reported-by: Jeff King <peff@peff.net> Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rubén Justo [Sun, 30 Jun 2024 06:42:06 +0000 (08:42 +0200)]
test-lib: fix GIT_TEST_SANITIZE_LEAK_LOG
When a test that leaks runs with GIT_TEST_SANITIZE_LEAK_LOG=true,
the test returns zero, which is not what we want.
In the if-else's chain we have in "check_test_results_san_file_", we
consider three variables: $passes_sanitize_leak, $sanitize_leak_check
and, implicitly, GIT_TEST_SANITIZE_LEAK_LOG (always set to "true" at
that point).
For the first two variables we have different considerations depending
on the value of $test_failure, which makes sense. However, for the
third, GIT_TEST_SANITIZE_LEAK_LOG, we don't; regardless of
$test_failure, we use "invert_exit_code=t" to produce a non-zero
return value.
That assumes "$test_failure" is always zero at that point. But it
may not be:
$ git checkout v2.40.1
$ make test SANITIZE=leak T=t3200-branch.sh # this fails
$ make test SANITIZE=leak GIT_TEST_SANITIZE_LEAK_LOG=true T=t3200-branch.sh # this succeeds
[...]
With GIT_TEST_SANITIZE_LEAK_LOG=true, our logs revealed a memory leak, exiting with a non-zero status!
# faked up failures as TODO & now exiting with 0 due to --invert-exit-code
We need to use "invert_exit_code=t" only when "$test_failure" is zero.
Let's add the missing conditions in the if-else's chain to make it work
as expected.
Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Rubén Justo <rjusto@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rubén Justo [Sun, 30 Jun 2024 06:46:38 +0000 (08:46 +0200)]
t0613: mark as leak-free
We can mark t0613 as leak-free:
$ make test SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true T=t0613-reftable-write-options.sh
[...]
*** t0613-reftable-write-options.sh ***
in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
ok 1 - default write options
ok 2 - disabled reflog writes no log blocks
ok 3 - many refs results in multiple blocks
ok 4 - tiny block size leads to error
ok 5 - small block size leads to multiple ref blocks
ok 6 - small block size fails with large reflog message
ok 7 - block size exceeding maximum supported size
ok 8 - restart interval at every single record
ok 9 - restart interval exceeding maximum supported interval
ok 10 - object index gets written by default with ref index
ok 11 - object index can be disabled
# passed all 11 test(s)
1..11
# faking up non-zero exit with --invert-exit-code
make[2]: *** [Makefile:75: t0613-reftable-write-options.sh] Error 1
Do it.
Signed-off-by: Rubén Justo <rjusto@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sat, 29 Jun 2024 18:01:24 +0000 (20:01 +0200)]
submodule--helper: use strvec_pushf() for --super-prefix
Use the strvec_pushf() call that already appends a slash to also produce
the stuck form of the option --super-prefix instead of adding the option
name in a separate call of strvec_push() or strvec_pushl(). This way we
can more easily see that these parts make up a single option with its
argument and save a function call.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-send-email: use sanitized address when reading mbox body
Addresses that are mentioned on the trailers in the commit log
messages (e.g., "Reviewed-by") are added to the "Cc:" list by "git
send-email". These hand-written addresses, however, may be
malformed (e.g., having unquoted "." and other punctutation marks in
the display-name part) and can upset MTA.
The code does use the sanitize_address() helper on these
address-looking strings to turn them into valid addresses, but it is
used only to see if the address should be suppressed. The original
string taken from the message is added to the @cc list if the code
decides the address is not suppressed.
Because the addresses on trailer lines are hand-written and more
likely to contain malformed addresses, when adding to the @cc list,
use the result from sanitize_address, not the original. Note that
we do not modify the behaviour for addresses taken from the e-mail
headers, as they are more likely to be machine generated and
well-formed.
Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Orgad Shaneh [Tue, 7 Feb 2023 07:43:17 +0000 (09:43 +0200)]
git-gui: fix inability to quit after closing another instance
If you open 2 git gui instances in the same directory, then close one
of them and try to close the other, an error message pops up, saying:
'error renaming ".git/GITGUI_BCK": no such file or directory', and it
is no longer possible to close the window ever.
Fix by catching this error, and proceeding even if the file no longer
exists.
Junio C Hamano [Fri, 28 Jun 2024 22:53:18 +0000 (15:53 -0700)]
Merge branch 'jc/varargs-attributes' into maint-2.45
Varargs functions that are unannotated as printf-like or execl-like
have been annotated as such.
* jc/varargs-attributes:
__attribute__: add a few missing format attributes
__attribute__: mark some functions with LAST_ARG_MUST_BE_NULL
__attribute__: remove redundant attribute declaration for git_die_config()
__attribute__: trace2_region_enter_printf() is like "printf"
Junio C Hamano [Fri, 28 Jun 2024 22:53:15 +0000 (15:53 -0700)]
Merge branch 'es/chainlint-ncores-fix' into maint-2.45
The chainlint script (invoked during "make test") did nothing when
it failed to detect the number of available CPUs. It now falls
back to 1 CPU to avoid the problem.
* es/chainlint-ncores-fix:
chainlint.pl: latch CPU count directly reported by /proc/cpuinfo
chainlint.pl: fix incorrect CPU count on Linux SPARC
chainlint.pl: make CPU count computation more robust
Junio C Hamano [Fri, 28 Jun 2024 22:53:08 +0000 (15:53 -0700)]
Merge branch 'bc/zsh-compatibility' into maint-2.45
zsh can pretend to be a normal shell pretty well except for some
glitches that we tickle in some of our scripts. Work them around
so that "vimdiff" and our test suite works well enough with it.
* bc/zsh-compatibility:
vimdiff: make script and tests work with zsh
t4046: avoid continue in &&-chain for zsh
Junio C Hamano [Fri, 28 Jun 2024 22:53:08 +0000 (15:53 -0700)]
Merge branch 'js/for-each-repo-keep-going' into maint-2.45
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out. Now it keeps going.
* js/for-each-repo-keep-going:
maintenance: running maintenance should not stop on errors
for-each-repo: optionally keep going on an error
Junio C Hamano [Fri, 28 Jun 2024 22:53:06 +0000 (15:53 -0700)]
Merge branch 'pw/rebase-m-signoff-fix' into maint-2.45
"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.
* pw/rebase-m-signoff-fix:
rebase -m: fix --signoff with conflicts
sequencer: store commit message in private context
sequencer: move current fixups to private context
sequencer: start removing private fields from public API
sequencer: always free "struct replay_opts"
Derrick Stolee [Fri, 28 Jun 2024 12:43:25 +0000 (12:43 +0000)]
sparse-index: improve lstat caching of sparse paths
The clear_skip_worktree_from_present_files() method was first introduced
in af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files
present in worktree, 2022-01-14) to allow better interaction with the
working directory in the presence of paths outside of the
sparse-checkout. The initial implementation would lstat() every single
SKIP_WORKTREE path to see if it existed; if it ran across a sparse
directory that existed (when a sparse index was in use), then it would
expand the index and then check every SKIP_WORKTREE path.
Since these lstat() calls were very expensive, this was improved in d79d299352 (Accelerate clear_skip_worktree_from_present_files() by
caching, 2022-01-14) by caching directories that do not exist so it
could avoid lstat()ing any files under such directories. However, there
are some inefficiencies in that caching mechanism.
The caching mechanism stored only the parent directory as not existing,
even if a higher parent directory also does not exist. This means that
wasted lstat() calls would occur when the paths passed to path_found()
change immediate parent directories but within the same parent directory
that does not exist.
To create an example repository that demonstrates this problem, it helps
to have a directory outside of the sparse-checkout that contains many
deep paths. In particular, the first paths (in lexicographic order)
underneath the sparse directory should have deep directory structures,
maximizing the difference between the old caching algorithm that looks
to a single parent and the new caching algorithm that looks to the
top-most missing directory.
The performance test script p2000-sparse-operations.sh takes the sample
repository and copies its HEAD to several copies nested in directories
of the form f<i>/f<j>/f<k> where i, j, and k are numbers from 1 to 4.
The sparse-checkout cone is then selected as "f2/f4/". Creating "f1/f1/"
will trigger the behavior and also lead to some interesting cases for
the caching algorithm since "f1/f1/" exists but "f1/f2/" and "f3/" do
not.
This is difficult to notice when running performance tests using the Git
repository (or a blow-up of the Git repository, as in
p2000-sparse-operations.sh) because Git has a very shallow directory
structure.
This change reorganizes the caching algorithm to focus on storing the
highest level leading directory that does not exist; specifically this
means that that directory's parent _does_ exist. By doing a little extra
work on a path passed to path_found(), we can short-circuit all of the
paths passed to path_found() afterwards that match a prefix with that
non-existing directory. When in a repository where the first sparse file
is likely to have a much deeper path than the first non-existing
directory, this can realize significant gains.
The details of this algorithm require careful attention, so the new
implementation of path_found() has detailed comments, including the use
of a new max_common_dir_prefix() method that may be of independent
interest.
It's worth noting that this is not universally positive, since we are
doing extra lstat() calls to establish the exact path to cache. In the
blow-up of the Git repository, we can see that the lstat count
_increases_ from 28 to 31. However, these numbers were already
artificially low.
Contributor Elijah Newren created a publicly-available test repository
that demonstrates the difference in these caching algorithms in the most
extreme way. To test, follow these steps:
git clone --sparse https://github.com/newren/gvfs-like-git-bomb
cd gvfs-like-git-bomb
./runme.sh # NOTE: check scripts before running!
At this point, assuming you do not have index.sparse=true set globally,
the index has one million paths with the SKIP_WORKTREE bit and they will
all be sent to path_found() in the sparse loop. You can measure this by
running 'git status' with GIT_TRACE2_PERF=1:
Sparse files in the index: 1,000,000
sparse_lstat_count (before): 200,000
sparse_lstat_count (after): 2
And here are the performance numbers:
Benchmark 1: old
Time (mean ± σ): 397.5 ms ± 4.1 ms
Range (min … max): 391.2 ms … 404.8 ms 10 runs
Benchmark 2: new
Time (mean ± σ): 252.7 ms ± 3.1 ms
Range (min … max): 249.4 ms … 259.5 ms 11 runs
Summary
'new' ran
1.57 ± 0.02 times faster than 'old'
By modifying this example further, we can demonstrate a more realistic
example and include the sparse index expansion. Continue by creating
this directory, confusing both caching algorithms somewhat:
mkdir -p bomb/d/e/f/a/a
Then re-run the 'git status' tests to see these statistics:
Sparse files in the index: 1,000,000
sparse_lstat_count (before): 724,010
sparse_lstat_count (after): 106
Benchmark 1: old
Time (mean ± σ): 753.0 ms ± 3.5 ms
Range (min … max): 749.7 ms … 760.9 ms 10 runs
Benchmark 2: new
Time (mean ± σ): 201.4 ms ± 3.2 ms
Range (min … max): 196.0 ms … 207.9 ms 14 runs
Summary
'new' ran
3.74 ± 0.06 times faster than 'old'
Note that if this repository had a sparse index enabled, the additional
cost of expanding the sparse index affects the total time of these
commands by over four seconds, significantly diminishing the benefit of
the caching algorithm. Having existing paths outside of the
sparse-checkout is a known performance issue for the sparse index and is
a known trade-off for the performance benefits given when no such paths
exist.
Using an internal monorepo with over two million paths at HEAD and a
typical sparse-checkout cone such that the sparse index contains
~190,000 entries (including over two thousand sparse trees), I was able
to measure these lstat counts when one sparse directory actually exists
on disk:
Derrick Stolee [Fri, 28 Jun 2024 12:43:24 +0000 (12:43 +0000)]
sparse-index: count lstat() calls
The clear_skip_worktree.. methods already report some statistics about
how many cache entries are checked against path_found() due to having
the skip-worktree bit set. However, due to path_found() performing some
caching, this isn't the only information that would be helpful to
report.
Add a new lstat_count member to the path_found_data struct to count the
number of times path_found() calls lstat(). This will be helpful to help
explain performance problems in this method as well as to demonstrate
future changes to the caching algorithm in a more concrete way than
end-to-end timings.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Fri, 28 Jun 2024 12:43:23 +0000 (12:43 +0000)]
sparse-index: use strbuf in path_found()
The path_found() method previously reused strings from the cache entries
the calling methods were using. This prevents string manipulation in
place and causes some odd reallocation before the final lstat() call in
the method.
Refactor the method to use strbufs and copy the path into the strbuf,
but also only the parent directory and not the whole path. This looks
like extra copying when assigning the path to the strbuf, but we save an
allocation by dropping the 'tmp' string, and we are "reusing" the copy
from 'tmp' to put the data in the strbuf.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Fri, 28 Jun 2024 12:43:22 +0000 (12:43 +0000)]
sparse-index: refactor path_found()
In advance of changing the behavior of path_found(), take all of the
intermediate data values and group them into a single struct. This
simplifies the method prototype as well as the initialization. Future
changes can be made directly to the struct and method without changing
the callers with this approach.
Note that the clear_path_found_data() method is currently empty, as
there is nothing to free. This method is a placeholder for future
changes that require a non-trivial implementation. Its stub is created
now so consumers could call it now and not change in future changes.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The clear_skip_worktree_from_present_files() method was introduced in af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files present
in worktree, 2022-01-14) to help cases where sparse-checkout is enabled
but some paths outside of the sparse-checkout also exist on disk. This
operation can be slow as it needs to check path existence in a way not
stored in the index, so caching was introduced in d79d299352 (Accelerate
clear_skip_worktree_from_present_files() by caching, 2022-01-14).
This check is particularly confusing in the presence of a sparse index,
as a sparse tree entry corresponding to an existing directory must first
be expanded to a full index before examining the paths within. This is
currently implemented using a 'goto' and a boolean variable to ensure we
restart only once.
Even with that caching, it was noticed that this could take a long time
to execute. 89aaab11a3 (index: add trace2 region for clear skip
worktree, 2022-11-03) introduced trace2 regions to measure this time.
Further, the way the loop repeats itself was slightly confusing and
prone to breakage, so a BUG() statement was added in 8c7abdc596 (index:
raise a bug if the index is materialised more than once, 2022-11-03) to
be sure that the second run of the loop does not hit any sparse trees.
One thing that can be confusing about the current setup is that the
trace2 regions nest and it is not clear that a second loop is running
after a sparse index is expanded. Here is an example of what the regions
look like in a typical case:
One thing that is particularly difficult to understand about these
regions is that most of the time is spent between the close of the
ensure_full_index region and the reporting of the end data. This is
because of the restart of the loop being within the same region as the
first iteration of the loop.
This change refactors the method into two separate methods that are
traced separately. This will be more important later when we change
other features of the methods, but for now the only functional change is
the difference in the structure of the trace regions.
After this change, the same telemetry section is split into three
distinct chunks:
Here, we see the sparse loop terminating early with its first sparse
path being a sparse directory containing a file. Then, that loop's
region terminates before ensure_full_index begins (in this case, the
cache-tree must also be computed). Then, _after_ the index is expanded,
the full loop begins with its own region.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>