Junio C Hamano [Mon, 29 Sep 2025 18:40:34 +0000 (11:40 -0700)]
Merge branch 'jk/setup-revisions-freefix'
There are double frees and leaks around setup_revisions() API used
in "git stash show", which has been fixed, and setup_revisions()
API gained a wrapper to make it more ergonomic when using it with
strvec-manged argc/argv pairs.
* jk/setup-revisions-freefix:
revision: retain argv NULL invariant in setup_revisions()
treewide: pass strvecs around for setup_revisions_from_strvec()
treewide: use setup_revisions_from_strvec() when we have a strvec
revision: add wrapper to setup_revisions() from a strvec
revision: manage memory ownership of argv in setup_revisions()
stash: tell setup_revisions() to free our allocated strings
Junio C Hamano [Mon, 29 Sep 2025 18:40:34 +0000 (11:40 -0700)]
Merge branch 'pw/rebase-i-cleanup-fix'
"git rebase -i" failed to clean-up the commit log message when the
command commits the final one in a chain of "fixup" commands, which
has been corrected.
* pw/rebase-i-cleanup-fix:
sequencer: remove VERBATIM_MSG flag
rebase -i: respect commit.cleanup when picking fixups
Declare that "git init" that is not otherwise configured uses
'main' as the initial branch, not 'master', starting Git 3.0.
* pw/3.0-default-initial-branch-to-main:
t0613: stop setting default initial branch
t9902: switch default branch name to main
t4013: switch default branch name to main
breaking-changes: switch default branch to main
Junio C Hamano [Tue, 23 Sep 2025 18:53:40 +0000 (11:53 -0700)]
Merge branch 'jk/add-i-color'
Some among "git add -p" and friends ignored color.diff and/or
color.ui configuration variables, which is an old regression, which
has been corrected.
* jk/add-i-color:
contrib/diff-highlight: mention interactive.diffFilter
add-interactive: manually fall back color config to color.ui
add-interactive: respect color.diff for diff coloring
stash: pass --no-color to diff plumbing child processes
Junio C Hamano [Tue, 23 Sep 2025 18:53:39 +0000 (11:53 -0700)]
Merge branch 'cc/promisor-remote-capability'
The "promisor-remote" capability mechanism has been updated to
allow the "partialCloneFilter" settings and the "token" value to be
communicated from the server side.
* cc/promisor-remote-capability:
promisor-remote: use string_list_split() in mark_remotes_as_accepted()
promisor-remote: allow a client to check fields
promisor-remote: use string_list_split() in filter_promisor_remote()
promisor-remote: refactor how we parse advertised fields
promisor-remote: use string constants for 'name' and 'url' too
promisor-remote: allow a server to advertise more fields
promisor-remote: refactor to get rid of 'struct strvec'
Jeff King [Fri, 19 Sep 2025 22:51:46 +0000 (18:51 -0400)]
revision: retain argv NULL invariant in setup_revisions()
In an argc/argv pair, the entry for argv[argc] is generally NULL. You
can iterate by counting up to argc, or by looking for the NULL entry in
argv.
When we pass such a pair to setup_revisions(), it shrinks argc to
account for the options we consumed and returns the result to the
caller. But it doesn't touch the entries after the reduced argc. So
argv[argc] will be left pointing at some arbitrary entry rather than
NULL.
This isn't the source of any known bugs, since all callers are aware of
the limitation and act accordingly. But it's a possible gotcha that may
be easy to miss.
Let's set the new argv[argc] to NULL, taking care to free it if the
caller asked us to do so.
It is tempting to do likewise for all of the entries afterwards, too, as
some of them may also need to be freed (e.g., if coming from a strvec).
But doing so isn't entirely trivial, as we munge argc in the function
(e.g., when we find "--" and move all of the entries after it into the
prune_data list). It would be possible with some light refactoring, but
it's probably not worth it. Nobody should ever look at them (they are
beyond the revised argc and past the NULL argv entry) outside of strvec
cleanup, and setup_revisions_from_strvec() already handles this case.
There's one other interesting gotcha: many callers which do not want to
provide arguments just pass 0/NULL for argc/argv. We need to check for
this case before assigning the final NULL.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 19 Sep 2025 22:50:48 +0000 (18:50 -0400)]
treewide: pass strvecs around for setup_revisions_from_strvec()
The previous commit converted callers of setup_revisions() with a strvec
to use the safer and easier _from_strvec() variant.
Let's now convert spots that don't directly have a strvec, but receive
an argc/argv pair that eventually comes from one. We'll instead pass the
strvec down to the point where we call setup_revisions().
That makes these functions slightly less flexible if they were to grow
other callers that don't use strvecs, but this rigidity is buying us
some safety. It is only safe to pass the free_removed_argv_elements
option to setup_revisions() if we know the elements of argv/argc are
allocated on the heap. That isn't communicated in the type system when
we are passed the bare elements. But if we get a strvec, we know that
the elements are allocated strings.
And at any rate, each of these modified functions has only a single
caller (that has a strvec), so the loss of flexibility is unlikely to
ever matter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 19 Sep 2025 22:49:07 +0000 (18:49 -0400)]
treewide: use setup_revisions_from_strvec() when we have a strvec
The previous commit introduced a wrapper to make using setup_revisions()
with a strvec easier and safer. It converted spots that were already
doing most of what the wrapper did.
Let's now convert spots where we were not setting up the
free_removed_argv_elements flag. As discussed in the previous commit,
this probably isn't fixing any bugs or leaks (since these sites wouldn't
trigger the re-shuffling of argv that causes them). This is mostly
future-proofing us against setup_revisions() becoming more aggressive
about its re-shuffling.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 19 Sep 2025 22:48:47 +0000 (18:48 -0400)]
revision: add wrapper to setup_revisions() from a strvec
The setup_revisions() function was designed to take the argc/argv pair
from the operating system. But we sometimes construct our own argv using
a strvec and pass that in. There are a few gotchas that callers need to
deal with here:
1. You should always pass the free_removed_argv_elements option via
setup_revision_opt. Otherwise, entries may be leaked if
setup_revisions() re-shuffles options.
2. After setup_revisions() returns, the strvec state is odd. We get a
reduced argc from setup_revisions() telling us how many unknown
options were left in place. Entries after that in argv may be
retained, or may be NULL (depending on how the reshuffling
happened). But the strvec's "nr" field still represents the
original value, and some of the entries it thinks it is still
storing may be NULL. Callers must be careful with how they access
it.
Some callers deal with (1), but not all. In practice they are OK because
they do not pass any options that would cause setup_revisions() to
re-shuffle (namely unknown options which may be relayed from the user,
and the use of the "--" separator). But it's probably a good idea to
consistently pass this option anyway to future-proof ourselves against
the details of setup_revisions() changing.
No callers address (2), though I don't think there any visible bugs.
Most of them simply call strvec_clear() and never otherwise look at the
result. And in fact, if they naively set foo.nr to the argc returned by
setup_revisions(), that would cause leaks! Because setup_revisions()
does not free consumed options[1], we have to leave the "nr" field of
the strvec at its original value to find and free them during
strvec_clear().
So I don't think there are any bugs to fix here, but we can make things
safer and simpler for callers. Let's introduce a helper function that
sets the free_removed_argv_elements automatically and shrinks the strvec
to represent the retained options afterwards (taking care to free the
now-obsolete entries).
We'll start by converting all of the call-sites which use the
free_removed_argv_elements option. There should be no behavior change
for them, except that their "shrunken" entries are cleaned up
immediately, rather than waiting for a strvec_clear() call.
[1] Arguably setup_revisions() should be doing this step for us if we
told it to free removed options, but there are many existing callers
which will be broken if it did. Introducing this helper is a
possible first step towards that.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 19 Sep 2025 22:45:56 +0000 (18:45 -0400)]
revision: manage memory ownership of argv in setup_revisions()
The setup_revisions() function takes an argc/argv pair and consumes
arguments from it, returning a reduced argc count to the caller. But it
may also overwrite entries within the argv array, as it shifts unknown
options to the front of argv (so they can be found in the range of
0..argc-1 after we return).
For a normal argc/argv coming from the operating system, this is OK.
We don't need to worry about memory ownership of the strings in those
entries. But some callers pass in allocated strings from a strvec, and
we do need to care about those.
We faced a similar issue in f92dbdbc6a (revisions API: don't leak memory
on argv elements that need free()-ing, 2022-08-02), which added an
option for callers to tell us that elements need to be freed. But the
implementation within setup_revisions() was incomplete. It only covered
the case of dropping "--", but not the movement of unknown options.
When we shift argv entries around, we should free the elements we are
about to overwrite, so they are not leaked. For example, in:
overwriting the "-p" entry, which is leaked unless we free it at that
moment.
You can see in the output above another potential problem. We now have
two copies of the "--invalid" string. If the caller does not respect the
new argc when free-ing the strings via strvec_clear(), we'll get a
double-free. And git-stash suffers from this, and will crash with the
above command.
So it seems at first glance that the solution is to just assign the
reduced argc to the strvec.nr field in the caller. Then it would stop
after freeing only any copied entries. But that's not always right
either!
Remember that we are reducing "argc" to account for elements we've
consumed. So if there isn't an invalid option, we'd turn:
argc = 2, argv[] = { "show", "-p", NULL }
into:
argc = 1, argv[] = { "show", "-p", NULL }
In that case strvec_clear() must keep looking past the shortened argc we
return to find the original "-p" to free. It needs to use the original
argc to do that.
We can solve this by turning our argv writes into strict moves, not
copies. When we shuffle an unknown option to the front, we'll overwrite
its old position with NULL. That leaves an argv array that may have NULL
"holes" in it.
because we move "--invalid" to overwrite the first "-p", but the second
one is quietly consumed. But strvec_clear() can handle that fine (it
iterates over the "nr" field, and passing NULL to free() is OK).
To ease the implementation, I've introduced a helper function. It's a
little hacky because it must take a double-pointer to set the old
position to NULL. Which in turn means we cannot pass "&arg", our local
alias for the current entry we're parsing, but instead "&argv[i]", the
pointer in the original array. And to make it even more confusing, we
delegate some of this work to handle_revision_opt(), which is passed a
subset of the argv array, so is always working on "&argv[0]".
Likewise, because handle_revision_opt() only receives the part of argv
left to parse, it receives the array to accumulate unknown options as a
separate unkc/unkv pair. But we're always working on the same argv
array, so our strategy works fine. I suspect this would be a bit more
obvious (and avoid some pointer cleverness) if all functions saw the
full argv array and worked with positions within it (and our new helper
would take two positions, a src and dst). But that would involve
refactoring handle_revision_opt(). I punted on that, as what's here is
not too ugly and is all contained within revision.c itself.
The new test demonstrates that "git stash show -p --invalid" no longer
crashes with a double-free (because we move instead of copy). And it
passes with SANITIZE=leak because we free "-p" before overwriting.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Sep 2025 20:25:09 +0000 (16:25 -0400)]
stash: tell setup_revisions() to free our allocated strings
In "git stash show", we do a first pass of parsing our command line
options by splitting them into revision args and stash args. These are
stored in strvecs, and we pass the revision args to setup_revisions().
But setup_revisions() may modify the argv we pass it, causing us to leak
some of the entries. In particular, if it sees a "--" string, that will
be dropped from argv. This is the same as other cases addressed by f92dbdbc6a (revisions API: don't leak memory on argv elements that need
free()-ing, 2022-08-02), and we should fix it the same way: by passing
the free_removed_argv_elements option to setup_revisions().
The added test here is run only with SANITIZE=leak, without checking its
output, because the behavior of stash with "--" is a little odd:
1. Running "git stash show" will show --stat output. But running "git
stash show --" will show --patch.
2. I'd expect a non-option after "--" to be treated as a pathspec, so:
git stash show -p 1 -- foo
would look treat "1" as a stash (a synonym for stash@{1}) and
restrict the resulting diff to "foo". But it doesn't. We split the
revision/stash args without any regard to "--". So in the example
above both "1" and "foo" are stashes. Which is an error, but also:
git stash show -- foo
treats "foo" as a stash, not a pathspec.
These are both oddities that we may want to address (or may not, if we
want to retain historical quirks). But they are well outside the scope
of this patch. So for now we'll just let the tests confirm we aren't
leaking without otherwise expecting any behavior. If we later address
either of those points and end up with another test that covers "stash
show --", we can drop this leak-only test.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update to 10e96bc (Merge pull request #127 from
pks-gitlab/pks-ci-improvements, 2025-09-22). This commit includes a
couple of changes:
- The GitHub CI has been updated to include a 32 bit CI job.
Furthermore, the jobs now compile with "-Werror" and more warnings
enabled.
- An issue was addressed where `uintptr_t` is not available on
NonStop [1].
- The clar selftests have been restructured so that it is now possible
to add small test suites more readily. This was done to add tests
for the above addressed issue, where we now use "%p" to print
pointers in a platform dependent way.
- An issue was addressed where the test output had a trailing
whitespace with certain output formats, which caused whitespace
issues in the test expectation files.
Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
D. Ben Knoble [Mon, 22 Sep 2025 01:39:06 +0000 (21:39 -0400)]
stash: honor stash.index in apply, pop modes
With stash.index=true, git-stash(1) command now tries to reinstate the
index by default in the "apply" and "pop" modes. Not doing so creates a
common trap [1], [2]: "git stash apply" is not the reverse of "git stash
push" because carefully staged indices are lost and have to be manually
recreated. OTOH, this mode is not always desirable and may create more
conflicts when applying stashes. As usual, "--no-index" will disable
this behavior if you set "stash.index".
D. Ben Knoble [Mon, 22 Sep 2025 01:39:05 +0000 (21:39 -0400)]
stash: refactor private config globals
A subsequent commit will access a new config variable in the stash
subcommand implementations, which requires the variables to be declared
before the relevant functions. Prep with a pure refactoring change to
consolidate config-related globals with the rest of the globals.
Best-viewed-with: --color-moved Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
D. Ben Knoble [Mon, 22 Sep 2025 01:39:04 +0000 (21:39 -0400)]
t3905: remove unneeded blank line
This is leftover from 787513027a (stash: Add --include-untracked option
to stash and remove all untracked files, 2011-06-24) when it was
converted in bbaa45c3aa (t3905: move all commands into test cases,
2021-02-08).
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
D. Ben Knoble [Mon, 22 Sep 2025 01:39:03 +0000 (21:39 -0400)]
t3903: reduce dependencies on previous tests
Skipping previous tests to work through only failing tests with
arguments like --run=4,122- causes some tests to fail because subdir
doesn't exist yet (it is created by a previous test; typically
"unstashing in a subdirectory"). Create it on demand for tests that need
it, but don't fail (-p) if the directory already exists.
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 17 Sep 2025 16:18:28 +0000 (09:18 -0700)]
initial branch: give hints after switching the default name
It is likely that those who came to Git after 3.0 switched the
default initial branch name to 'main' would still try to follow
tutorials that were written before 3.0 happened and with the
assumption that the tool would call the initial branch 'master'.
To help these new users after 3.0 boundary, let's retain one part of
the hint we will be giving before the default changes, namely, how
to rename the branch an unconfigured Git has created just once.
We do this without telling them how to permanently configure the
default name of the initial branch, and that design choice is very
much deliberate. The whole point of switching the default name was
because we did not want to force individual users to configure their
default branch name but while the hard wired default was 'master',
they _had_ to configure it away from 'master' in order to conform to
the recent norm, and a hint that tells them how to do so is useful.
But once the default is renamed to 'main', that no longer is true.
A narrower audience who are new users that follow an instruction
that assumes the initial branch name is 'master' would only need to
learn "here is how to change the branch name to match the tutorial
you are following in the repository you created for practice", and
"here is how you keep creating repositories with the first branch
with a name everybody hates" is unnecessary.
It also needs to be noted that the advise token to squelch the
message is the same advice.defaultBranchName as before, which is
also very much deliberate. The users who do have that configured
are those who _have_ been using Git since before 3.0, and they are
not the target audience for the new advice message. Reusing the
same advise token ensures that they do not have to turn the message
off.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 18 Sep 2025 17:07:02 +0000 (10:07 -0700)]
Merge branch 'ne/alloc-free-and-null'
The clear_alloc_state() API function was not fully clearing the
structure for reuse, but since nobody reuses it, replace it with a
variant that frees the structure as well, making the callers simpler.
* ne/alloc-free-and-null:
alloc: fix dangling pointer in alloc_state cleanup
Junio C Hamano [Thu, 18 Sep 2025 17:07:01 +0000 (10:07 -0700)]
Merge branch 'jc/longer-disambiguation-fix'
"git rev-parse --short" and friends failed to disambiguate two
objects with object names that share common prefix longer than 32
characters, which has been fixed.
* jc/longer-disambiguation-fix:
abbrev: allow extending beyond 32 chars to disambiguate
Junio C Hamano [Thu, 18 Sep 2025 17:07:00 +0000 (10:07 -0700)]
Merge branch 'ag/send-email-imap-sent'
"git send-email" learned to drive "git imap-send" to store already
sent e-mails in an IMAP folder.
* ag/send-email-imap-sent:
send-email: enable copying emails to an IMAP folder without actually sending them
send-email: add ability to send a copy of sent emails to an IMAP folder
"core.commentChar=auto" that attempts to dynamically pick a
suitable comment character is non-workable, as it is too much
trouble to support for little benefit, and is marked as deprecated.
* pw/3.0-commentchar-auto-deprecation:
commit: print advice when core.commentString=auto
config: warn on core.commentString=auto
breaking-changes: deprecate support for core.commentString=auto
As the last commit deleted the only user of VERBATIM_MSG remove
it. This reverts remaining parts of commit f7d42ceec52 (rebase -i:
do leave commit message intact in fixup! chains, 2021-01-28) that
were not deleted by the last commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase -i: respect commit.cleanup when picking fixups
If the user uses a prepare-commit-msg hook to add comments to the
commit message template and sets commit.cleanup to remove them when the
commit is created then the comments will not be removed when rebase
commits the final command in a chain of "fixup" commands[1]. This
happens because f7d42ceec52 (rebase -i: do leave commit message intact
in fixup! chains, 2021-01-28) started passing the VERBATIM_MSG flag
when committing the final command in a chain of "fixup" commands. That
change was added in response to a bug report[2] where the commit
message was being cleaned up when it should not be. The cause of that
bug was that before f7d42ceec52 the sequencer passed CLEANUP_MSG
when committing the final fixup. That commit should have simply
removed the CLEANUP_MSG flag, not changed it to VERBATIM_MSG. Using
VERBATIM_MSG ignores the user's commit.cleanup config when committing
the final fixup which means it behaves differently to an ordinary
"pick" command which respects commit.cleanup.
Fix this by not setting an explicit cleanup flag when committing the
final fixup which matches the way "pick" commands behave. The test
added in f7d42ceec52 is replaced with one that checks that "fixup"
and "pick" commands do not clean up the message when commit.cleanup
is not set and do clean up the message when it is set.
Reported-by: Simon Cheng <cyqsimon@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
and edit the intro message, then it will get two copies of the Reply-To
field. gmail.com rejects such messages.
This happens because send-email reads the edited message examining the
headers. For recognised headers the content is extracted to use in
constructing the final message and for possible inclusion in the patch
emails. Unrecognised headers are gathered (in @xh) to be passed through
uninterpreted.
Unfortunately "Reply-To" is not recognised in this process so it is
added to @xh as an uninterpreted header, but also generated from the
$reply_to variable in gen_header(), resulting in two copies
Add parsing to the loop in pre_process_file() to recognise a Reply-to
header and to store the result in $reply_to. This means that the
intro message will not get a second header and also means that
any changes made to the Reply-To header during editing will be
incorporated in the $reply_to variable and so included in all the
generated email messages.
Signed-off-by: NeilBrown <neil@brown.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merges contributions made from three different addresses:
- win@wincent.com (old address, initial contributions in 2007–2009)
- greg@hurrell.net (personal address matching full name, so this one is
the "forever" address; contributions made starting in 2018)
- greg.hurrell@datadoghq.com (current work address, used for recent
contributions)
Signed-off-by: Greg Hurrell <greg.hurrell@datadoghq.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 15 Sep 2025 15:52:06 +0000 (08:52 -0700)]
Merge branch 'ps/upload-pack-oom-protection'
A broken or malicious "git fetch" can say that it has the same
object for many many times, and the upload-pack serving it can
exhaust memory storing them redundantly, which has been corrected.
Junio C Hamano [Mon, 15 Sep 2025 15:52:06 +0000 (08:52 -0700)]
Merge branch 'ds/midx-write-fixes'
Fixes multiple crashes around midx write-out codepaths.
* ds/midx-write-fixes:
midx-write: simplify error cases
midx-write: reenable signed comparison errors
midx-write: use uint32_t for preferred_pack_idx
midx-write: use cleanup when incremental midx fails
midx-write: put failing response value back
midx-write: only load initialized packs
Junio C Hamano [Mon, 15 Sep 2025 15:52:05 +0000 (08:52 -0700)]
Merge branch 'jt/de-global-bulk-checkin'
The bulk-checkin code used to depend on a file-scope static
singleton variable, which has been updated to pass an instance
throughout the callchain.
* jt/de-global-bulk-checkin:
bulk-checkin: use repository variable from transaction
bulk-checkin: require transaction for index_blob_bulk_checkin()
bulk-checkin: remove global transaction state
bulk-checkin: introduce object database transaction structure
Instead of scanning for the remaining items to see if there are
still commits to be explored in the queue, use khash to remember
which items are still on the queue (an unacceptable alternative is
to reserve one object flag bits).
* rs/describe-with-lazy-queue-and-oidset:
describe: use oidset in finish_depth_computation()
Windows "real-time monitoring" interferes with the execution of
tests and affects negatively in both correctness and performance,
which has been disabled in Gitlab CI.
* ps/gitlab-ci-disable-windows-monitoring:
gitlab-ci: disable realtime monitoring to unbreak Windows jobs
Junio C Hamano [Fri, 12 Sep 2025 17:41:19 +0000 (10:41 -0700)]
Merge branch 'ms/refs-exists'
"git refs exists" that works like "git show-ref --exists" has been
added.
* ms/refs-exists:
t: add test for git refs exists subcommand
t1422: refactor tests to be shareable
t1403: split 'show-ref --exists' tests into a separate file
builtin/refs: add 'exists' subcommand
Junio C Hamano [Fri, 12 Sep 2025 17:41:18 +0000 (10:41 -0700)]
Merge branch 'ps/object-store-midx-dedup-info'
Further code clean-up for multi-pack-index code paths.
* ps/object-store-midx-dedup-info:
midx: compute paths via their source
midx: stop duplicating info redundant with its owning source
midx: write multi-pack indices via their source
midx: load multi-pack indices via their source
midx: drop redundant `struct repository` parameter
odb: simplify calling `link_alt_odb_entry()`
odb: return newly created in-memory sources
odb: consistently use "dir" to refer to alternate's directory
odb: allow `odb_find_source()` to fail
odb: store locality in object database sources
Junio C Hamano [Fri, 12 Sep 2025 17:41:18 +0000 (10:41 -0700)]
Merge branch 'je/doc-add'
Documentation for "git add" has been updated.
* je/doc-add:
doc: rephrase the purpose of the staging area
doc: git-add: simplify discussion of ignored files
doc: git-add: clarify intro & add an example
Update clar to fcbed04 (Merge pull request #123 from
pks-gitlab/pks-sandbox-ubsan, 2025-09-10). The most significant changes
since the last version include:
- Fixed platform support for HP-UX.
- Fixes for how clar handles the `-q` flag.
- A couple of leak fixes for reported clar errors.
- A new `cl_invoke()` function that retains line information.
- New infrastructure to create temporary directories.
- Improved printing of error messages so that all lines are now
properly indented.
- Proper selftests for the clar.
Most of these changes are somewhat irrelevant to us, but neither do we
have to adjust to any of these changes, either. What _is_ interesting to
us though is especially the fixed support for HP-UX, and eventually we
may also want to use `cl_invoke()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Colin Stagner [Wed, 10 Sep 2025 03:11:24 +0000 (22:11 -0500)]
contrib/subtree: fix split with squashed subtrees
98ba49ccc2 (subtree: fix split processing with multiple subtrees
present, 2023-12-01) increases the performance of
git subtree split --prefix=subA
by ignoring subtree merges which are outside of `subA/`. It also
introduces a regression. Subtree merges that should be retained
are incorrectly ignored if they:
1. are nested under `subA/`; and
2. are merged with `--squash`.
is erroneously ignored during a split of `subA`. This causes
missing tree files and different commit hashes starting in
git v2.44.0-rc0.
The method:
should_ignore_subtree_split_commit REV
should test only a single commit REV, but the combination of
git log -1 --grep=...
actually searches all *parent* commits until a `--grep` match is
discovered.
Rewrite this method to test only one REV at a time. Extract commit
information with a single `git` call as opposed to three. The
`test` conditions for rejecting a commit remain unchanged.
Unit tests now cover nested subtrees.
Signed-off-by: Colin Stagner <ask+git@howdoi.land> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc: fast-import: replace literal block with paragraph
68061e34702 (fast-import: disallow "feature export-marks" by default,
2019-08-29) added the documentation for this option. The second
paragraph is a literal block but it looks like it should just be
a regular paragraph.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
From user feedback on this section: 3 users don't know what "tree-ish"
means and 3 users don't know what "pathspec" means. One user also says
that the section is very confusing and that they don't understand what
the "index" is.
From conversations on Mastodon, several users said that their impression
is that "the index" means the same thing as "HEAD". It would be good to
give those users (and other users who do not know what "index" means) a
hint as to its meaning.
Make this section more accessible to users who don't know what the terms
"pathspec", "tree-ish", and "index" mean by using more familiar language,
adding examples, and using simpler sentence structures.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julia Evans [Wed, 10 Sep 2025 19:14:28 +0000 (19:14 +0000)]
doc: git-checkout: split up restoring files section
From user feedback: one user mentioned that "When the <tree-ish> (most
often a commit) is not given" is confusing since it starts with a
negative.
Restructuring so that `git checkout main file.txt` and
`git checkout file.txt` are separate items will help us simplify the
sentence structure a lot.
As a bonus, it appears that `-f` actually only applies to one of those
forms, so we can include fewer options, and now the structure of the
DESCRIPTION matches the SYNOPSIS.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
From user feedback: several users say they don't understand the use case
for `--detach`. It's probably not realistic to explain the use case for
detached HEAD state here, but we can improve the situation.
Explain how `git checkout --detach` is different from
`git checkout <branch>` instead of copying over the description from
`git checkout <branch>`, since `git checkout <branch>` will be a
familiar command to many readers.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julia Evans [Wed, 10 Sep 2025 19:14:26 +0000 (19:14 +0000)]
doc: git-checkout: clarify `-b` and `-B`
From user feedback: several users reported having trouble understanding
the difference between `-b` and `-B` ("I think it's because my brain
expects it to contrast with `-b`, but instead it starts off explaining
how they're the same").
Also, in `-B`, 2 users can't tell what the branch is reset *to*.
Simplify the sentence structure in the explanations of `-b` and `-B` and
add a little extra information (what `<start-point>` is, what the branch
is reset to).
Splitting up `-b` and `-B` into separate items helps simplify the
sentence structure since there's less "In this case...".
Replace the long "the branch is not reset/created unless "git checkout"
is successful..." with just "will fail", since we should generally
assume that Git will fail operations in a clean way and not leave
operations half-finished, and that cases where it does not fail cleanly
are the exceptions that the documentation should flag.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
From user feedback: several users commented that "Local modifications
to the files in the working tree are kept, so that they can be committed
to the <branch>." didn't seem accurate to them, since
`git checkout <branch>` will often fail.
One user also thought that "... and by pointing HEAD at the branch"
was something that _they_ had to do somehow ("How do I point HEAD at
a branch?") rather than a description of what the `git checkout`
operation is doing for them.
Explain when `git checkout <branch>` will fail and clarify that
"pointing HEAD at the branch" is part of what the command does.
6 users commented that the "You could omit <branch>..." section is
extremely confusing. Explain this in a much more direct way.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need to use the terms "pathspec" or "tree-ish" in the
ARGUMENT DISAMBIGUATION section, which are terms that (from user
feedback on this page) many users do not understand.
"tree-ish" is actually not accurate here: `git checkout` in this case
takes a commit-ish, not a tree-ish. So we can say "branch or commit"
instead of "tree-ish" which is both more accurate and uses more familiar
terms.
And now that the intro to the man pages mentions that `git checkout` has
"two main modes", it makes sense to refer to this disambiguation section
to understand how Git decides which one to use when there's an overlap
in syntax.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julia Evans [Wed, 10 Sep 2025 19:14:23 +0000 (19:14 +0000)]
doc: git-checkout: clarify intro sentence
From user feedback: in the first paragraph, 5 users reported not
understanding the terms "pathspec" and 1 user reported not understanding
the term "HEAD". Of the users who said they didn't know what "pathspec"
means, 3 said they couldn't understand what the paragraph was trying to
communicate as a result.
One user also commented that "If no pathspec was given..." makes
`git checkout <branch>` sounds like a special edge case, instead of
being one of the most common ways to use this core Git command.
It looks like the goal of this paragraph is to communicate that `git
checkout` has two different modes: one where you switch branches and one
where you just update your working directory files/index. So say that
directly, and use more familiar language (including examples) to say it.
Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_oid_with_context() allows specifying flags and reports object
details via a passed-in struct object_context. Some callers just want
to specify flags, but don't need any details back. Convert them to
repo_get_oid_with_flags(), which provides just that and frees them from
dealing with the context structure.
Junio C Hamano [Wed, 10 Sep 2025 21:28:23 +0000 (14:28 -0700)]
Merge branch 'master' of https://github.com/j6t/git-gui
* 'master' of https://github.com/j6t/git-gui:
git-gui: sync Makefiles with git.git
git-gui: fix error handling of Revert Changes command
git-gui--askyesno (mingw): use Git for Windows' icon, if available
git-gui--askyesno: allow overriding the window title
git gui: set GIT_ASKPASS=git-gui--askpass if not set yet
git-gui: provide question helper for retry fallback on Windows
git-gui: simplify using nice(1)
git-gui: simplify PATH de-duplication
Junio C Hamano [Wed, 10 Sep 2025 21:27:52 +0000 (14:27 -0700)]
Merge branch 'master' of https://github.com/j6t/gitk
* 'master' of https://github.com/j6t/gitk:
gitk: add README with usage, build, and contribution details
gitk: fix trackpad scrolling for Tcl/Tk 8.7+
gitk: use <Button-3> for ctx menus on macOS with Tcl 8.7+
As the tests are all run in separate repositories, set the branch
name to "master" when creating the repository for the tests where
the result depends on the branch name. In order to make it easier to
change the branch name in the future a helper function is used. This
reduces the number of tests that depend on the default branch name
being "master" and removes the last instance of a test file using
"GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master".
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the penultimate use of "GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=
master" in our test suite. We have slowly been removing these ever
since we started to switch the default branch name used in tests to
"main".
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove one of the last remaining uses of
"TEST_GIT_DEFAULT_INITIAL_BRANCH= main" in the test suite. We have
been steadily be converting tests from using "master" as the default
branch name since the introduction of TEST_GIT_DEFAULT_INITIAL_BRANCH
in 704fed9ea22 (tests: start moving to a different default main branch
name, 2020-10-23) The changes here are purely mechanical replacing
"master" with "main"
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 1296cbe4b46 (init: document `init.defaultBranch` better,
2020-12-11) "git-init.adoc" has advertised that the default name
of the initial branch may change in the future. The name "main"
is chosen to match the default used by the big Git forge web sites.
The advice printed when init.defaultBranch is not set is updated
to say that the default will change to "main" in Git 3.0. Building
with WITH_BREAKING_CHANGES enabled removes the advice and changes
the default branch name to "main". The code in guess_remote_head()
that looks for "refs/heads/master" is left unchanged as that is only
called when the remote server does not support the symref capability
in the v0 protocol or the symref extension to the ls-refs list in the
v2 protocol. Such an old server is more likely to be using "master"
as the default branch name.
With the exception of the "git-init.adoc" the documentation is left
unchanged. I had hoped to parameterize the name of the default branch
by using an asciidoc attribute. Unfortunately attribute expansion
is inhibited by backticks and we use backticks to mark up ref names
so that idea does not work. As the changes to git-init.adoc show
inserting ifdef's around each instance of the branch name "master"
is cumbersome and makes the documentation sources harder to read.
Apart from "git-init.adoc" there are some other files where "master" is
used as the name of the initial branch rather than as an example of a
branch name such as "user-manual.adoc" and "gitcore-tutorial.adoc". The
name appears a lot in those so updating it with ifdef's is not really
practical. We can update that document in the 3.0 release cycle. The
other documentation where master is used as an example branch name
can be gradually converted over time.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 8 Sep 2025 21:54:35 +0000 (14:54 -0700)]
Merge branch 'tc/last-modified'
A new command "git last-modified" has been added to show the closest
ancestor commit that touched each path.
* tc/last-modified:
last-modified: use Bloom filters when available
t/perf: add last-modified perf script
last-modified: new subcommand to show when files were last modified
Junio C Hamano [Mon, 8 Sep 2025 21:54:34 +0000 (14:54 -0700)]
Merge branch 'ds/ls-files-lazy-unsparse'
"git ls-files <pathspec>..." should not necessarily have to expand
the index fully if a sparsified directory is excluded by the
pathspec; the code is taught to expand the index on demand to avoid
this.
* ds/ls-files-lazy-unsparse:
ls-files: conditionally leave index sparse
Junio C Hamano [Mon, 8 Sep 2025 21:54:34 +0000 (14:54 -0700)]
Merge branch 'am/xdiff-hash-tweak'
Inspired by Ezekiel's recent effort to showcase Rust interface, the
hash function implementation used to hash lines have been updated
to the one used for ELF symbol lookup by Glibc.
When the README for diff-highlight was written, there was no way to
trigger it for the `add -p` interactive patch mode. We've since grown a
feature to support that, but it was documented only on the Git side.
Let's also let people coming the other direction, from diff-highlight,
know that it's an option.
Suggested-by: Isaac Oscar Gariano <IsaacOscar@live.com.au> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 8 Sep 2025 16:42:39 +0000 (12:42 -0400)]
add-interactive: manually fall back color config to color.ui
Color options like color.interactive and color.diff should fall back to
the value of color.ui if they aren't set. In add-interactive, we check
the specific options (e.g., color.diff) via repo_config_get_value(),
which does not depend on the main command having loaded any color config
via the git_config() callback mechanism.
But then we call want_color() on the result; if our specific config is
unset then that function uses the value of git_use_color_default. That
variable is typically set from color.ui by the git_color_config()
callback, which is called by the main command in its own git_config()
callback function.
This works fine for "add -p", whose add_config() callback calls into
git_color_config(). But it doesn't work for other commands like
"checkout -p", which is otherwise unaware of color at all. People tend
not to notice because the default is "auto", and that's what they'd set
color.ui to as well. But something like:
git -c color.ui=false checkout -p
should disable color, and it doesn't.
This regression goes back to 0527ccb1b5 (add -i: default to the built-in
implementation, 2021-11-30). In the perl version we got the color config
from "git config --get-colorbool", which did the full lookup for us.
The obvious fix is for git-checkout to add a call to git_color_config()
to its own config callback. But we'd have to do so for every command
with this problem, which is error-prone. Let's see if we can fix it more
centrally.
It is tempting to teach want_color() to look up the value of
repo_config_get_value("color.ui") itself. But I think that would have
disastrous consequences. Plumbing commands, especially older ones, avoid
porcelain config like "color.*" by simply not parsing it in their config
callbacks. Looking up the value of color.ui under the hood would
undermine that.
Instead, let's do that lookup in the add-interactive setup code. We're
already demand-loading other color config there, which is probably fine
(even in a plumbing command like "git reset", the interactive mode is
inherently porcelain-ish). That catches all commands that use the
interactive code, whether they were calling git_color_config()
themselves or not.
Reported-by: Isaac Oscar Gariano <isaacoscar@live.com.au> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 8 Sep 2025 16:42:36 +0000 (12:42 -0400)]
add-interactive: respect color.diff for diff coloring
The old perl git-add--interactive.perl script used the color.diff config
option to decide whether to color diffs (and if not set, it fell back to
the value of color.ui via git-config's --get-colorbool option). When we
switched to the builtin version, this was lost: we respect only
color.ui. So for example:
git -c color.diff=false add -p
would color the diff, even when it should not.
The culprit is this line in add-interactive.c's parse_diff():
if (want_color_fd(1, -1))
That "-1" means "no config has been set", which causes it to fall back
to the color.ui setting. We should instead be passing the value of
color.diff. But the problem is that we never even parse that config
option!
Instead the builtin interactive code parses only the value of
color.interactive, which is used for prompts and other messages. One
could perhaps argue that this should cover interactive diff coloring,
too, but historically it did not. The perl script treated
color.interactive and color.diff separately. So we should grab the
values for both, keeping separate fields in our add_i_state variable,
rather than a single use_color field.
We also load individual color slots (e.g., color.interactive.prompt),
leaving them as the empty string when color is disabled. This happens
via the init_color() helper in add-interactive, which checks that
use_color field. Now that there are two such fields, we need to pass the
appropriate one for each color.
The colors are mostly easy to divide up; color.interactive.* follows
color.interactive, and color.diff.* follows color.diff. But the "reset"
color is tricky. It is used for both types of coloring, but the two can
be configured independently. So we introduce two separate reset colors,
and use each in the appropriate spot.
There are two new tests. The first enables interactive prompt colors but
disables color.diff. We should see a colored prompt but not a colored
diff, showing that we are now respecting color.diff (and not
color.interactive or color.ui).
The second does the opposite. We disable color.interactive but turn on
color.diff with a custom fragment color. When we split a hunk, the
interactive code has to re-color the hunk header, which lets us check
that we correctly loaded the color.diff.frag config based on color.diff,
not color.interactive.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 8 Sep 2025 16:42:32 +0000 (12:42 -0400)]
stash: pass --no-color to diff plumbing child processes
After a partial stash, we may clear out the working tree by capturing
the output of diff-tree and piping it into git-apply (and likewise we
may use diff-index to restore the index). So we most definitely do not
want color diff output from that diff-tree process. And it normally
would not produce any, since its stdout is not going to a tty, and the
default value of color.ui is "auto".
However, if GIT_PAGER_IN_USE is set in the environment, that overrides
the tty check, and we'll produce a colorized diff that chokes git-apply:
$ echo y | GIT_PAGER_IN_USE=1 git stash -p
[...]
Saved working directory and index state WIP on main: 4f2e2bb foo
error: No valid patches in input (allow with "--allow-empty")
Cannot remove worktree changes
Setting this variable is a relatively silly thing to do, and not
something most users would run into. But we sometimes do it in our tests
to stimulate color. And it is a user-visible bug, so let's fix it rather
than work around it in the tests.
The root issue here is that diff-tree (and other diff plumbing) should
probably not ever produce color by default. It does so not by parsing
color.ui, but because of the baked-in "auto" default from 4c7f1819b3
(make color.ui default to 'auto', 2013-06-10). But changing that is
risky; we've had discussions back and forth on the topic over the years.
E.g.:
So let's accept that as the status quo for now and protect ourselves by
passing --no-color to the child processes. This is the same thing we did
for add-interactive itself in 1c6ffb546b (add--interactive.perl: specify
--no-color explicitly, 2020-09-07).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: use string_list_split() in mark_remotes_as_accepted()
Previous commits replaced some strbuf_split*() calls with calls to
string_list_split*() in "promisor-remote.c".
For consistency, let's also replace the strbuf_split_str() call in
mark_remotes_as_accepted() with a call to string_list_split(), as we
don't need the splitted strings to be managed by a `struct strbuf`.
Using the lighter-weight `string_list` API is enough for our needs.
While at it let's remove a useless call to `strbuf_strip_suffix()`.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.
Let's make it possible for a client to check if these fields match
what it expects before accepting a promisor remote.
We allow this by introducing a new "promisor.checkFields"
configuration variable. It should contain a comma or space separated
list of fields that will be checked.
By limiting the protocol to specific well-defined fields, we ensure
both server and client have a shared understanding of field
semantics and usage.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: use string_list_split() in filter_promisor_remote()
A previous commit introduced a new parse_one_advertised_remote()
function that takes a `const char *` argument. This function is called
from filter_promisor_remote() and parses all the fields for one remote.
This means that in filter_promisor_remote() we no longer need to split
the remote information that will be passed to
parse_one_advertised_remote() into an array of relatively heavy and
complex `struct strbuf`.
To use something lighter, let's then replace strbuf_split_str() with
string_list_split() in filter_promisor_remote() to parse the remote
information that is passed to parse_one_advertised_remote().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: refactor how we parse advertised fields
In a follow up commit we are going to parse more fields, like a filter
and a token, coming from the server when it advertises promisor remotes
using the "promisor-remote" capability.
To prepare for this, let's refactor the code that parses the advertised
fields coming from the server into a new parse_one_advertised_remote()
function that will populate a `struct promisor_info` with the content
of the fields it parsed.
While at it, let's also pass this `struct promisor_info` to the
should_accept_remote() function, instead of passing it the parsed name
and url.
These changes will make it simpler to both parse more fields and access
the content of these parsed fields in follow up commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: use string constants for 'name' and 'url' too
A previous commit started to define `promisor_field_filter` and
`promisor_field_token`, and used them instead of the
"partialCloneFilter" and "token" string literals.
Let's do the same for "name" and "url" to avoid repeating them
several times and for consistency with the other fields.
For skipping "name=" or "url=" in advertisements, let's introduce
a skip_field_name_prefix() helper function to keep parsing clean
and easy to understand.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: allow a server to advertise more fields
For now the "promisor-remote" protocol capability can only pass "name"
and "url" information from a server to a client in the form
"name=<remote_name>,url=<remote_url>".
To allow clients to make more informed decisions about which promisor
remotes they accept, let's make it possible to pass more information
by introducing a new "promisor.sendFields" configuration variable.
On the server side, information about a remote `foo` is stored in
configuration variables named `remote.foo.<variable-name>`. To make
it clearer and simpler, we use `field` and `field name` like this:
* `field name` refers to the <variable-name> part of such a
configuration variable, and
* `field` refers to both the `field name` and the value of such a
configuration variable.
The "promisor.sendFields" configuration variable should contain a
comma or space separated list of field names that will be looked up
in the configuration of the remote on the server to find the values
that will be passed to the client.
Only a set of predefined field names are allowed. The only field
names in this set are "partialCloneFilter" and "token". The
"partialCloneFilter" field name specifies the filter definition used
by the promisor remote, and the "token" field name can provide an
authentication credential for accessing it.
For example, if "promisor.sendFields" is set to "partialCloneFilter",
and the server has the "remote.foo.partialCloneFilter" config
variable set to a value, then that value will be passed in the
"partialCloneFilter" field in the form "partialCloneFilter=<value>"
after the "name" and "url" fields.
A following commit will allow the client to use the information to
decide if it accepts the remote or not. For now the client doesn't do
anything with the additional information it receives.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
promisor-remote: refactor to get rid of 'struct strvec'
In a following commit, we will use the new 'promisor-remote' protocol
capability introduced by d460267613 (Add 'promisor-remote' capability
to protocol v2, 2025-02-18) to pass and process more information
about promisor remotes than just their name and url.
For that purpose, we will need to store information about other
fields, especially information that might or might not be available
for different promisor remotes. Unfortunately using 'struct strvec',
as we currently do, to store information about the promisor remotes
with one 'struct strvec' for each field like "name" or "url" does not
scale easily in that case. We would need one 'struct strvec' for each
new field, and then we would have to pass all these 'struct strvec'
around.
Let's refactor this and introduce a new 'struct promisor_info'.
It will only store promisor remote information in its members. For now
it has only a 'name' member for the promisor remote name and an 'url'
member for its URL. We will use a 'struct string_list' to store the
instances of 'struct promisor_info'. For each 'item' in the
string_list, 'item->string' will point to the promisor remote name and
'item->util' will point to the corresponding 'struct promisor_info'
instance.
Explicit members are used within 'struct promisor_info' for type
safety and clarity regarding the specific information being handled,
rather than a generic key-value store. We want to specify and document
each field and its content, so adding new members to the struct as
more fields are supported is fine.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adam Dinwoodie [Sat, 15 Feb 2025 21:19:03 +0000 (21:19 +0000)]
git-gui: sync Makefiles with git.git
In git.git, commit 5309c1e9fb39 (Makefile: set default goals in
makefiles, 2025-02-15) touched two Makefiles in the git-git/ directory.
Import these changes, so that the trees can converge again with the
next merge of this repository into git.git.
Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Johannes Sixt [Sat, 6 Sep 2025 09:59:09 +0000 (11:59 +0200)]
Merge branch 'js/ask-yesno'
* js/ask-yesno:
git-gui--askyesno (mingw): use Git for Windows' icon, if available
git-gui--askyesno: allow overriding the window title
git gui: set GIT_ASKPASS=git-gui--askpass if not set yet
git-gui: provide question helper for retry fallback on Windows
upload-pack: don't ACK non-commits repeatedly in protocol v2
When a client performs a fetch or clone they can optionally send "have"
lines to tell the server which objects they already have available
locally. These object IDs are stored by the server in an object array so
that it can remember any objects it doesn't have to include in the pack
sent to the client.
While there isn't any reason to do so, clients are free to send the same
"have" line repeatedly. git-upload-pack(1) already knows to handle this
well: every commit it has seen via a "have" line gets marked with the
`THEY_HAVE` flag, and if such a commit is seen repeatedly we know to not
process it another time. This also has the effect that we only store the
object ID once, only, in the `have_obj` array.
There is an edge case though: if the client sends an object ID that does
not refer to a commit we neither store nor check the `THEY_HAVE` flag.
This means that we repeatedly store the same object ID in our `have_obj`
array, with two consequences:
- In protocol v2 we deduplicate ACKs for commits, but not for any
other objects as we send ACKs for every object ID in the `have_obj`
array.
- The `have_obj` array can grow in size indefinitely with both
protocols.
The potentially-more-serious issue is the second one, as we basically
have a way for an adversary to allocate arbitrarily large buffers now.
Ultimately, this doesn't seem to be all that serious though: on my
machine, the growth of that array is at around 4MB/s, and after roughly
five minutes I was only at 1GB RSS. So this is concerning, but only
mildly so.
Fix this bug by storing the `THEY_HAVE` flag independent of the object
type so that we don't store duplicate object IDs in `have_obj` anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The write_midx_internal() method uses gotos to jump to a cleanup section to
clear memory before returning 'result'. Since these jumps are more common
for error conditions, initialize 'result' to -1 and then only set it to 0
before returning with success. There are a couple places where we return
with success via a jump.
This has the added benefit that the method now returns -1 on error instead
of an inconsistent 1 or -1.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the remaining signed comparison warnings in midx-write.c so that
they can be enforced as errors in the future. After the previous change,
the remaining errors are due to iterator variables named 'i'.
The strategy here involves defining the variable within the for loop
syntax to make sure we use the appropriate bitness for the loop
sentinel. This matters in at least one method where the variable was
compared to uint32_t in some loops and size_t in others.
While adjusting these loops, there were some where the loop boundary was
checking against a uint32_t value _plus one_. These were replaced with
non-strict comparisons, but also the value is checked to not be
UINT32_MAX. Since the value is the number of incremental multi-pack-
indexes, this is not a meaningful restriction. The new die() is about
defensive programming more than it being realistically possible.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
midx-write.c has the DISABLE_SIGN_COMPARE_WARNINGS macro defined for a
few reasons, but the biggest one is the use of a signed
preferred_pack_idx member inside the write_midx_context struct. The code
currently uses -1 to indicate an unset preferred pack but pack int ids
are normally handled as uint32_t. There are also a few loops that search
for the preferred pack by name and those iterators will need updates to
uint32_t in the next change.
For now, replace the use of -1 with a 'NO_PREFERRED_PACK' macro and an
equality check. The macro stores the max value of a uint32_t, so we
cannot store a preferred pack that appears last in a list of 2^32 total
packs, but that's expected to be unreasonable already. Furthermore, with
this change we end up extending the range from 2^31 possible packs to
2^32-1.
There are some careful things to worry about with initializing the
preferred pack in the struct and using that value when searching for a
preferred pack that was already incorrect but accidentally working when
the index was initialized to zero.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
midx-write: use cleanup when incremental midx fails
The incremental mode of writing a multi-pack-index has a few extra
conditions that could lead to failure, but these are currently
short-ciruiting with 'return -1' instead of setting the method's
'result' variable and going to the cleanup tag.
Replace these returns with gotos to avoid memory issues when exiting
early due to error conditions.
Unfortunately, these error conditions are difficult to reproduce with
test cases, which is perhaps one reason why the memory loss was not
caught by existing test cases in memory tracking modes.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>