git.ipfire.org Git - thirdparty/git.git/log

Merge branch 'ds/sparse-checkout-clean' into next

"git sparse-checkout" subcommand learned a new "clean" action to
prune otherwise unused working-tree files that are outside the
areas of interest.

* ds/sparse-checkout-clean:
  t: expand tests around sparse merges and clean
  sparse-index: point users to new 'clean' action
  sparse-checkout: add --verbose option to 'clean'
  dir: add generic "walk all files" helper
  sparse-checkout: match some 'clean' behavior
  sparse-checkout: add basics of 'clean' command
  sparse-checkout: remove use of the_repository

Merge branch 'jt/odb-transaction' into next

Continue the work to build on the bulk-checkin infrastructure to
create many objects at once in a transaction and abstract it into
the generic object layer.

* jt/odb-transaction:
  odb: add transaction interface
  object-file: update naming from bulk-checkin
  object-file: relocate ODB transaction code
  bulk-checkin: drop flush_odb_transaction()
  builtin/update-index: end ODB transaction when --verbose is specified
  bulk-checkin: remove ODB transaction nesting

Sync with 'master'

Merge branch 'tc/last-modified-recursive-fix' into next

"git last-modified" operating in non-recursive mode used to trigger
a BUG(), which has been corrected.

* tc/last-modified-recursive-fix:
last-modified: fix bug when some paths remain unhandled

Merge branch 'kn/refs-files-case-insensitive' into next

Deal more gracefully with directory / file conflicts when the files
backend is used for ref storage, by failing only the ones that are
involved in the conflict while allowing others.

* kn/refs-files-case-insensitive:
  refs/files: handle D/F conflicts during locking
  refs/files: handle F/D conflicts in case-insensitive FS
  refs/files: use correct error type when lock exists
  refs/files: catch conflicts on case-insensitive file-systems

Merge branch 'jk/color-variable-fixes' into next

Some places in the code confused a variable that is *not* a boolean
to enable color but is an enum that records what the user requested
to do about color.  A couple of bugs of this sort have been fixed,
while the code has been cleaned up to prevent similar bugs in the
future.

* jk/color-variable-fixes:
  config: store want_color() result in a separate bool
  add-interactive: retain colorbool values longer
  color: return bool from want_color()
  color: use git_colorbool enum type to store colorbools
  pretty: use format_commit_context.auto_color as colorbool
  diff: stop passing ecbdata->use_color as boolean
  diff: pass o->use_color directly to fill_metainfo()
  diff: don't use diff_options.use_color as a strict bool
  diff: simplify color_moved check when flushing
  grep: don't treat grep_opt.color as a strict bool
  color: return enum from git_config_colorbool()
  color: use GIT_COLOR_* instead of numeric constants

Merge branch 'dk/stash-apply-index' into next

The stash.index configuration variable can be set to make "git stash
pop/apply" pretend that it was invoked with "--index".

* dk/stash-apply-index:
  stash: honor stash.index in apply, pop modes
  stash: refactor private config globals
  t3905: remove unneeded blank line
  t3903: reduce dependencies on previous tests

Merge branch 'je/doc-checkout' into next

Doc updates.

* je/doc-checkout:
  doc: git-checkout: clarify restoring files section
  doc: git-checkout: split up restoring files section
  doc: git-checkout: deduplicate --detach explanation
  doc: git-checkout: clarify `-b` and `-B`
  doc: git-checkout: clarify `git checkout <branch>`
  doc: git-checkout: clarify ARGUMENT DISAMBIGUATION
  doc: git-checkout: clarify intro sentence

The twelfth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'cs/subtree-squash-split-fix'

"git subtree" (in contrib/) did not work correctly when splitting
squashed subtrees, which has been improved.

* cs/subtree-squash-split-fix:
contrib/subtree: fix split with squashed subtrees

Merge branch 'rs/get-oid-with-flags-cleanup'

Code clean-up.

* rs/get-oid-with-flags-cleanup:
use repo_get_oid_with_flags()

Merge branch 'jk/add-i-color'

Some among "git add -p" and friends ignored color.diff and/or
color.ui configuration variables, which is an old regression, which
has been corrected.

* jk/add-i-color:
  contrib/diff-highlight: mention interactive.diffFilter
  add-interactive: manually fall back color config to color.ui
  add-interactive: respect color.diff for diff coloring
  stash: pass --no-color to diff plumbing child processes

Merge branch 'cc/promisor-remote-capability'

The "promisor-remote" capability mechanism has been updated to
allow the "partialCloneFilter" settings and the "token" value to be
communicated from the server side.

* cc/promisor-remote-capability:
  promisor-remote: use string_list_split() in mark_remotes_as_accepted()
  promisor-remote: allow a client to check fields
  promisor-remote: use string_list_split() in filter_promisor_remote()
  promisor-remote: refactor how we parse advertised fields
  promisor-remote: use string constants for 'name' and 'url' too
  promisor-remote: allow a server to advertise more fields
  promisor-remote: refactor to get rid of 'struct strvec'

Merge branch 'jk/setup-revisions-freefix' into next

There are double frees and leaks around setup_revisions() API used
in "git stash show", which has been fixed, and setup_revisions()
API gained a wrapper to make it more ergonomic when using it with
strvec-manged argc/argv pairs.

* jk/setup-revisions-freefix:
  revision: retain argv NULL invariant in setup_revisions()
  treewide: pass strvecs around for setup_revisions_from_strvec()
  treewide: use setup_revisions_from_strvec() when we have a strvec
  revision: add wrapper to setup_revisions() from a strvec
  revision: manage memory ownership of argv in setup_revisions()
  stash: tell setup_revisions() to free our allocated strings

Revert "Merge branch 'jk/setup-revisions-freefix' into next"

This reverts commit b178f27e6ddfa8d515dcd445b6bf17119f962c44, reversing
changes made to 2d0aec78212bcdf7c7c067a74b471a9e0ce60adf.

revision: retain argv NULL invariant in setup_revisions()

In an argc/argv pair, the entry for argv[argc] is generally NULL. You
can iterate by counting up to argc, or by looking for the NULL entry in
argv.

When we pass such a pair to setup_revisions(), it shrinks argc to
account for the options we consumed and returns the result to the
caller. But it doesn't touch the entries after the reduced argc. So
argv[argc] will be left pointing at some arbitrary entry rather than
NULL.

This isn't the source of any known bugs, since all callers are aware of
the limitation and act accordingly. But it's a possible gotcha that may
be easy to miss.

Let's set the new argv[argc] to NULL, taking care to free it if the
caller asked us to do so.

It is tempting to do likewise for all of the entries afterwards, too, as
some of them may also need to be freed (e.g., if coming from a strvec).
But doing so isn't entirely trivial, as we munge argc in the function
(e.g., when we find "--" and move all of the entries after it into the
prune_data list). It would be possible with some light refactoring, but
it's probably not worth it. Nobody should ever look at them (they are
beyond the revised argc and past the NULL argv entry) outside of strvec
cleanup, and setup_revisions_from_strvec() already handles this case.

There's one other interesting gotcha: many callers which do not want to
provide arguments just pass 0/NULL for argc/argv. We need to check for
this case before assigning the final NULL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

treewide: pass strvecs around for setup_revisions_from_strvec()

The previous commit converted callers of setup_revisions() with a strvec
to use the safer and easier _from_strvec() variant.

Let's now convert spots that don't directly have a strvec, but receive
an argc/argv pair that eventually comes from one. We'll instead pass the
strvec down to the point where we call setup_revisions().

That makes these functions slightly less flexible if they were to grow
other callers that don't use strvecs, but this rigidity is buying us
some safety. It is only safe to pass the free_removed_argv_elements
option to setup_revisions() if we know the elements of argv/argc are
allocated on the heap. That isn't communicated in the type system when
we are passed the bare elements. But if we get a strvec, we know that
the elements are allocated strings.

And at any rate, each of these modified functions has only a single
caller (that has a strvec), so the loss of flexibility is unlikely to
ever matter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

treewide: use setup_revisions_from_strvec() when we have a strvec

The previous commit introduced a wrapper to make using setup_revisions()
with a strvec easier and safer. It converted spots that were already
doing most of what the wrapper did.

Let's now convert spots where we were not setting up the
free_removed_argv_elements flag. As discussed in the previous commit,
this probably isn't fixing any bugs or leaks (since these sites wouldn't
trigger the re-shuffling of argv that causes them). This is mostly
future-proofing us against setup_revisions() becoming more aggressive
about its re-shuffling.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revision: add wrapper to setup_revisions() from a strvec

The setup_revisions() function was designed to take the argc/argv pair
from the operating system. But we sometimes construct our own argv using
a strvec and pass that in. There are a few gotchas that callers need to
deal with here:

  1. You should always pass the free_removed_argv_elements option via
     setup_revision_opt. Otherwise, entries may be leaked if
     setup_revisions() re-shuffles options.

  2. After setup_revisions() returns, the strvec state is odd. We get a
     reduced argc from setup_revisions() telling us how many unknown
     options were left in place. Entries after that in argv may be
     retained, or may be NULL (depending on how the reshuffling
     happened). But the strvec's "nr" field still represents the
     original value, and some of the entries it thinks it is still
     storing may be NULL. Callers must be careful with how they access
     it.

Some callers deal with (1), but not all. In practice they are OK because
they do not pass any options that would cause setup_revisions() to
re-shuffle (namely unknown options which may be relayed from the user,
and the use of the "--" separator). But it's probably a good idea to
consistently pass this option anyway to future-proof ourselves against
the details of setup_revisions() changing.

No callers address (2), though I don't think there any visible bugs.
Most of them simply call strvec_clear() and never otherwise look at the
result. And in fact, if they naively set foo.nr to the argc returned by
setup_revisions(), that would cause leaks!  Because setup_revisions()
does not free consumed options[1], we have to leave the "nr" field of
the strvec at its original value to find and free them during
strvec_clear().

So I don't think there are any bugs to fix here, but we can make things
safer and simpler for callers. Let's introduce a helper function that
sets the free_removed_argv_elements automatically and shrinks the strvec
to represent the retained options afterwards (taking care to free the
now-obsolete entries).

We'll start by converting all of the call-sites which use the
free_removed_argv_elements option. There should be no behavior change
for them, except that their "shrunken" entries are cleaned up
immediately, rather than waiting for a strvec_clear() call.

[1] Arguably setup_revisions() should be doing this step for us if we
    told it to free removed options, but there are many existing callers
    which will be broken if it did. Introducing this helper is a
    possible first step towards that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revision: manage memory ownership of argv in setup_revisions()

The setup_revisions() function takes an argc/argv pair and consumes
arguments from it, returning a reduced argc count to the caller. But it
may also overwrite entries within the argv array, as it shifts unknown
options to the front of argv (so they can be found in the range of
0..argc-1 after we return).

For a normal argc/argv coming from the operating system, this is OK.
We don't need to worry about memory ownership of the strings in those
entries. But some callers pass in allocated strings from a strvec, and
we do need to care about those.

We faced a similar issue in f92dbdbc6a (revisions API: don't leak memory
on argv elements that need free()-ing, 2022-08-02), which added an
option for callers to tell us that elements need to be freed. But the
implementation within setup_revisions() was incomplete.  It only covered
the case of dropping "--", but not the movement of unknown options.

When we shift argv entries around, we should free the elements we are
about to overwrite, so they are not leaked. For example, in:

  git stash show -p --invalid

we will pass this to setup_revisions():

  argc = 3, argv[] = { "show", "-p", "--invalid", NULL }

which will then return:

   argc = 2, argv[] = { "show", "--invalid", "--invalid", NULL }

overwriting the "-p" entry, which is leaked unless we free it at that
moment.

You can see in the output above another potential problem. We now have
two copies of the "--invalid" string. If the caller does not respect the
new argc when free-ing the strings via strvec_clear(), we'll get a
double-free. And git-stash suffers from this, and will crash with the
above command.

So it seems at first glance that the solution is to just assign the
reduced argc to the strvec.nr field in the caller. Then it would stop
after freeing only any copied entries. But that's not always right
either!

Remember that we are reducing "argc" to account for elements we've
consumed. So if there isn't an invalid option, we'd turn:

  argc = 2, argv[] = { "show", "-p", NULL }

into:

  argc = 1, argv[] = { "show", "-p", NULL }

In that case strvec_clear() must keep looking past the shortened argc we
return to find the original "-p" to free. It needs to use the original
argc to do that.

We can solve this by turning our argv writes into strict moves, not
copies. When we shuffle an unknown option to the front, we'll overwrite
its old position with NULL. That leaves an argv array that may have NULL
"holes" in it.

So in the "--invalid" example above we get:

   argc = 2, argv[] = { "show", "--invalid", NULL, NULL }

but something like "git stash -p --invalid -p" would yield:

  argc = 3, argv[] = { "show", "--invalid", NULL, "-p", NULL }

because we move "--invalid" to overwrite the first "-p", but the second
one is quietly consumed. But strvec_clear() can handle that fine (it
iterates over the "nr" field, and passing NULL to free() is OK).

To ease the implementation, I've introduced a helper function. It's a
little hacky because it must take a double-pointer to set the old
position to NULL. Which in turn means we cannot pass "&arg", our local
alias for the current entry we're parsing, but instead "&argv[i]", the
pointer in the original array. And to make it even more confusing, we
delegate some of this work to handle_revision_opt(), which is passed a
subset of the argv array, so is always working on "&argv[0]".

Likewise, because handle_revision_opt() only receives the part of argv
left to parse, it receives the array to accumulate unknown options as a
separate unkc/unkv pair. But we're always working on the same argv
array, so our strategy works fine. I suspect this would be a bit more
obvious (and avoid some pointer cleverness) if all functions saw the
full argv array and worked with positions within it (and our new helper
would take two positions, a src and dst). But that would involve
refactoring handle_revision_opt().  I punted on that, as what's here is
not too ugly and is all contained within revision.c itself.

The new test demonstrates that "git stash show -p --invalid" no longer
crashes with a double-free (because we move instead of copy). And it
passes with SANITIZE=leak because we free "-p" before overwriting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

stash: tell setup_revisions() to free our allocated strings

In "git stash show", we do a first pass of parsing our command line
options by splitting them into revision args and stash args. These are
stored in strvecs, and we pass the revision args to setup_revisions().

But setup_revisions() may modify the argv we pass it, causing us to leak
some of the entries. In particular, if it sees a "--" string, that will
be dropped from argv. This is the same as other cases addressed by
f92dbdbc6a (revisions API: don't leak memory on argv elements that need
free()-ing, 2022-08-02), and we should fix it the same way: by passing
the free_removed_argv_elements option to setup_revisions().

The added test here is run only with SANITIZE=leak, without checking its
output, because the behavior of stash with "--" is a little odd:

  1. Running "git stash show" will show --stat output. But running "git
     stash show --" will show --patch.

  2. I'd expect a non-option after "--" to be treated as a pathspec, so:

       git stash show -p 1 -- foo

     would look treat "1" as a stash (a synonym for stash@{1}) and
     restrict the resulting diff to "foo". But it doesn't. We split the
     revision/stash args without any regard to "--". So in the example
     above both "1" and "foo" are stashes. Which is an error, but also:

       git stash show -- foo

     treats "foo" as a stash, not a pathspec.

These are both oddities that we may want to address (or may not, if we
want to retain historical quirks). But they are well outside the scope
of this patch. So for now we'll just let the tests confirm we aren't
leaking without otherwise expecting any behavior. If we later address
either of those points and end up with another test that covers "stash
show --", we can drop this leak-only test.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jk/setup-revisions-freefix' into next

There are double frees and leaks around setup_revisions() API used
in "git stash show", which has been fixed, and setup_revisions()
API gained a wrapper to make it more ergonomic when using it with
strvec-manged argc/argv pairs.

* jk/setup-revisions-freefix:
  revision: retain argv NULL invariant in setup_revisions()
  treewide: pass strvecs around for setup_revisions_from_strvec()
  treewide: use setup_revisions_from_strvec() when we have a strvec
  revision: add wrapper to setup_revisions() from a strvec
  revision: manage memory ownership of argv in setup_revisions()
  stash: tell setup_revisions() to free our allocated strings

Merge branch 'pw/rebase-i-cleanup-fix' into next

"git rebase -i" failed to clean-up the commit log message when the
command commits the final one in a chain of "fixup" commands, which
has been corrected.

* pw/rebase-i-cleanup-fix:
sequencer: remove VERBATIM_MSG flag
rebase -i: respect commit.cleanup when picking fixups

Merge branch 'jc/3.0-default-initial-branch-to-main-addendum' into next

Keep giving hint about the default initial branch name for users
who may be surprised after Git 3.0 switchover.

* jc/3.0-default-initial-branch-to-main-addendum:
initial branch: give hints after switching the default name

Merge branch 'pw/3.0-default-initial-branch-to-main' into next

Declare that "git init" that is not otherwise configured uses
'main' as the initial branch, not 'master', starting Git 3.0.

* pw/3.0-default-initial-branch-to-main:
  t0613: stop setting default initial branch
  t9902: switch default branch name to main
  t4013: switch default branch name to main
  breaking-changes: switch default branch to main

Merge branch 'ps/clar-updates' into next

Import a newer version of the clar unit testing framework.

* ps/clar-updates:
t/unit-tests: update to 10e96bc

t/unit-tests: update to 10e96bc

Update to 10e96bc (Merge pull request #127 from
pks-gitlab/pks-ci-improvements, 2025-09-22). This commit includes a
couple of changes:

  - The GitHub CI has been updated to include a 32 bit CI job.
    Furthermore, the jobs now compile with "-Werror" and more warnings
    enabled.

  - An issue was addressed where `uintptr_t` is not available on
    NonStop [1].

  - The clar selftests have been restructured so that it is now possible
    to add small test suites more readily. This was done to add tests
    for the above addressed issue, where we now use "%p" to print
    pointers in a platform dependent way.

  - An issue was addressed where the test output had a trailing
    whitespace with certain output formats, which caused whitespace
    issues in the test expectation files.

[1]: <01c101dc2842$38903640$a9b0a2c0$@nexbridge.com>

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

stash: honor stash.index in apply, pop modes

With stash.index=true, git-stash(1) command now tries to reinstate the
index by default in the "apply" and "pop" modes. Not doing so creates a
common trap [1], [2]: "git stash apply" is not the reverse of "git stash
push" because carefully staged indices are lost and have to be manually
recreated. OTOH, this mode is not always desirable and may create more
conflicts when applying stashes. As usual, "--no-index" will disable
this behavior if you set "stash.index".

[1]: https://lore.kernel.org/git/CAPx1GvcxyDDQmCssMjEnt6JoV6qPc5ZUpgPLX3mpUC_4PNYA1w@mail.gmail.com/
[2]: https://lore.kernel.org/git/c5a811ac-8cd3-c389-ac6d-29020a648c87@gmail.com/

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

stash: refactor private config globals

A subsequent commit will access a new config variable in the stash
subcommand implementations, which requires the variables to be declared
before the relevant functions. Prep with a pure refactoring change to
consolidate config-related globals with the rest of the globals.

Best-viewed-with: --color-moved
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3905: remove unneeded blank line

This is leftover from 787513027a (stash: Add --include-untracked option
to stash and remove all untracked files, 2011-06-24) when it was
converted in bbaa45c3aa (t3905: move all commands into test cases,
2021-02-08).

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3903: reduce dependencies on previous tests

Skipping previous tests to work through only failing tests with
arguments like --run=4,122- causes some tests to fail because subdir
doesn't exist yet (it is created by a previous test; typically
"unstashing in a subdirectory"). Create it on demand for tests that need
it, but don't fail (-p) if the directory already exists.

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revision: retain argv NULL invariant in setup_revisions()

In an argc/argv pair, the entry for argv[argc] is generally NULL. You
can iterate by counting up to argc, or by looking for the NULL entry in
argv.

When we pass such a pair to setup_revisions(), it shrinks argc to
account for the options we consumed and returns the result to the
caller. But it doesn't touch the entries after the reduced argc. So
argv[argc] will be left pointing at some arbitrary entry rather than
NULL.

This isn't the source of any known bugs, since all callers are aware of
the limitation and act accordingly. But it's a possible gotcha that may
be easy to miss.

Let's set the new argv[argc] to NULL, taking care to free it if the
caller asked us to do so.

It is tempting to do likewise for all of the entries afterwards, too, as
some of them may also need to be freed (e.g., if coming from a strvec).
But doing so isn't entirely trivial, as we munge argc in the function
(e.g., when we find "--" and move all of the entries after it into the
prune_data list). It would be possible with some light refactoring, but
it's probably not worth it. Nobody should ever look at them (they are
beyond the revised argc and past the NULL argv entry) outside of strvec
cleanup, and setup_revisions_from_strvec() already handles this case.

There's one other interesting gotcha: many callers which do not want to
provide arguments just pass 0/NULL for argc/argv. We need to check for
this case before assigning the final NULL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

treewide: pass strvecs around for setup_revisions_from_strvec()

The previous commit converted callers of setup_revisions() with a strvec
to use the safer and easier _from_strvec() variant.

Let's now convert spots that don't directly have a strvec, but receive
an argc/argv pair that eventually comes from one. We'll instead pass the
strvec down to the point where we call setup_revisions().

That makes these functions slightly less flexible if they were to grow
other callers that don't use strvecs, but this rigidity is buying us
some safety. It is only safe to pass the free_removed_argv_elements
option to setup_revisions() if we know the elements of argv/argc are
allocated on the heap. That isn't communicated in the type system when
we are passed the bare elements. But if we get a strvec, we know that
the elements are allocated strings.

And at any rate, each of these modified functions has only a single
caller (that has a strvec), so the loss of flexibility is unlikely to
ever matter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

treewide: use setup_revisions_from_strvec() when we have a strvec

The previous commit introduced a wrapper to make using setup_revisions()
with a strvec easier and safer. It converted spots that were already
doing most of what the wrapper did.

Let's now convert spots where we were not setting up the
free_removed_argv_elements flag. As discussed in the previous commit,
this probably isn't fixing any bugs or leaks (since these sites wouldn't
trigger the re-shuffling of argv that causes them). This is mostly
future-proofing us against setup_revisions() becoming more aggressive
about its re-shuffling.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revision: add wrapper to setup_revisions() from a strvec

The setup_revisions() function was designed to take the argc/argv pair
from the operating system. But we sometimes construct our own argv using
a strvec and pass that in. There are a few gotchas that callers need to
deal with here:

  1. You should always pass the free_removed_argv_elements option via
     setup_revision_opt. Otherwise, entries may be leaked if
     setup_revisions() re-shuffles options.

  2. After setup_revisions() returns, the strvec state is odd. We get a
     reduced argc from setup_revisions() telling us how many unknown
     options were left in place. Entries after that in argv may be
     retained, or may be NULL (depending on how the reshuffling
     happened). But the strvec's "nr" field still represents the
     original value, and some of the entries it thinks it is still
     storing may be NULL. Callers must be careful with how they access
     it.

Some callers deal with (1), but not all. In practice they are OK because
they do not pass any options that would cause setup_revisions() to
re-shuffle (namely unknown options which may be relayed from the user,
and the use of the "--" separator). But it's probably a good idea to
consistently pass this option anyway to future-proof ourselves against
the details of setup_revisions() changing.

No callers address (2), though I don't think there any visible bugs.
Most of them simply call strvec_clear() and never otherwise look at the
result. And in fact, if they naively set foo.nr to the argc returned by
setup_revisions(), that would cause leaks!  Because setup_revisions()
does not free consumed options[1], we have to leave the "nr" field of
the strvec at its original value to find and free them during
strvec_clear().

So I don't think there are any bugs to fix here, but we can make things
safer and simpler for callers. Let's introduce a helper function that
sets the free_removed_argv_elements automatically and shrinks the strvec
to represent the retained options afterwards (taking care to free the
now-obsolete entries).

We'll start by converting all of the call-sites which use the
free_removed_argv_elements option. There should be no behavior change
for them, except that their "shrunken" entries are cleaned up
immediately, rather than waiting for a strvec_clear() call.

[1] Arguably setup_revisions() should be doing this step for us if we
    told it to free removed options, but there are many existing callers
    which will be broken if it did. Introducing this helper is a
    possible first step towards that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revision: manage memory ownership of argv in setup_revisions()

The setup_revisions() function takes an argc/argv pair and consumes
arguments from it, returning a reduced argc count to the caller. But it
may also overwrite entries within the argv array, as it shifts unknown
options to the front of argv (so they can be found in the range of
0..argc-1 after we return).

For a normal argc/argv coming from the operating system, this is OK.
We don't need to worry about memory ownership of the strings in those
entries. But some callers pass in allocated strings from a strvec, and
we do need to care about those.

We faced a similar issue in f92dbdbc6a (revisions API: don't leak memory
on argv elements that need free()-ing, 2022-08-02), which added an
option for callers to tell us that elements need to be freed. But the
implementation within setup_revisions() was incomplete.  It only covered
the case of dropping "--", but not the movement of unknown options.

When we shift argv entries around, we should free the elements we are
about to overwrite, so they are not leaked. For example, in:

  git stash show -p --invalid

we will pass this to setup_revisions():

  argc = 3, argv[] = { "show", "-p", "--invalid", NULL }

which will then return:

   argc = 2, argv[] = { "show", "--invalid", "--invalid", NULL }

overwriting the "-p" entry, which is leaked unless we free it at that
moment.

You can see in the output above another potential problem. We now have
two copies of the "--invalid" string. If the caller does not respect the
new argc when free-ing the strings via strvec_clear(), we'll get a
double-free. And git-stash suffers from this, and will crash with the
above command.

So it seems at first glance that the solution is to just assign the
reduced argc to the strvec.nr field in the caller. Then it would stop
after freeing only any copied entries. But that's not always right
either!

Remember that we are reducing "argc" to account for elements we've
consumed. So if there isn't an invalid option, we'd turn:

  argc = 2, argv[] = { "show", "-p", NULL }

into:

  argc = 1, argv[] = { "show", "-p", NULL }

In that case strvec_clear() must keep looking past the shortened argc we
return to find the original "-p" to free. It needs to use the original
argc to do that.

We can solve this by turning our argv writes into strict moves, not
copies. When we shuffle an unknown option to the front, we'll overwrite
its old position with NULL. That leaves an argv array that may have NULL
"holes" in it.

So in the "--invalid" example above we get:

   argc = 2, argv[] = { "show", "--invalid", NULL, NULL }

but something like "git stash -p --invalid -p" would yield:

  argc = 3, argv[] = { "show", "--invalid", NULL, "-p", NULL }

because we move "--invalid" to overwrite the first "-p", but the second
one is quietly consumed. But strvec_clear() can handle that fine (it
iterates over the "nr" field, and passing NULL to free() is OK).

To ease the implementation, I've introduced a helper function. It's a
little hacky because it must take a double-pointer to set the old
position to NULL. Which in turn means we cannot pass "&arg", our local
alias for the current entry we're parsing, but instead "&argv[i]", the
pointer in the original array. And to make it even more confusing, we
delegate some of this work to handle_revision_opt(), which is passed a
subset of the argv array, so is always working on "&argv[0]".

Likewise, because handle_revision_opt() only receives the part of argv
left to parse, it receives the array to accumulate unknown options as a
separate unkc/unkv pair. But we're always working on the same argv
array, so our strategy works fine. I suspect this would be a bit more
obvious (and avoid some pointer cleverness) if all functions saw the
full argv array and worked with positions within it (and our new helper
would take two positions, a src and dst). But that would involve
refactoring handle_revision_opt().  I punted on that, as what's here is
not too ugly and is all contained within revision.c itself.

The new test demonstrates that "git stash show -p --invalid" no longer
crashes with a double-free (because we move instead of copy). And it
passes with SANITIZE=leak because we free "-p" before overwriting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

stash: tell setup_revisions() to free our allocated strings

In "git stash show", we do a first pass of parsing our command line
options by splitting them into revision args and stash args. These are
stored in strvecs, and we pass the revision args to setup_revisions().

But setup_revisions() may modify the argv we pass it, causing us to leak
some of the entries. In particular, if it sees a "--" string, that will
be dropped from argv. This is the same as other cases addressed by
f92dbdbc6a (revisions API: don't leak memory on argv elements that need
free()-ing, 2022-08-02), and we should fix it the same way: by passing
the free_removed_argv_elements option to setup_revisions().

I've added a test here which fails when built with SANITIZE=leak because
it calls "git stash show --". This by itself is not a very
interesting invocation, because there is nothing after the "--", and
thus the "--" is not really doing anything.

But I think the current parsing in show_stash() is a little
questionable. It splits the arguments into revision options and stash
options solely based on the presence of a leading dash, with no regard
to "--" at all. So:

  git stash show -- foo

will take "foo" as a stash option before we even pass anything to
setup_revisions(). And something like:

  git stash show -- 1

will show stash@{1}. But I would expect anything after the "--" to be a
pathspec. So in this example it would show only the part of the diff
that touched "foo". And something like:

  git stash show -p 1 -- foo

would treat "1" as a stash and "foo" as a pathspec.

That may be something we want to fix, but I want to focus here on the
leak-fixing without changing behavior. So this test is a little odd, but
does what we want without locking us in to any particular behavior (we
only care that "--" by itself does not change the output nor leak).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

initial branch: give hints after switching the default name

It is likely that those who came to Git after 3.0 switched the
default initial branch name to 'main' would still try to follow
tutorials that were written before 3.0 happened and with the
assumption that the tool would call the initial branch 'master'.

To help these new users after 3.0 boundary, let's retain one part of
the hint we will be giving before the default changes, namely, how
to rename the branch an unconfigured Git has created just once.

We do this without telling them how to permanently configure the
default name of the initial branch, and that design choice is very
much deliberate.  The whole point of switching the default name was
because we did not want to force individual users to configure their
default branch name but while the hard wired default was 'master',
they _had_ to configure it away from 'master' in order to conform to
the recent norm, and a hint that tells them how to do so is useful.

But once the default is renamed to 'main', that no longer is true.
A narrower audience who are new users that follow an instruction
that assumes the initial branch name is 'master' would only need to
learn "here is how to change the branch name to match the tutorial
you are following in the repository you created for practice", and
"here is how you keep creating repositories with the first branch
with a name everybody hates" is unnecessary.

It also needs to be noted that the advise token to squelch the
message is the same advice.defaultBranchName as before, which is
also very much deliberate.  The users who do have that configured
are those who _have_ been using Git since before 3.0, and they are
not the target audience for the new advice message.  Reusing the
same advise token ensures that they do not have to turn the message
off.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Sync with 'master'

Merge branch 'nb/send-email-no-dup-reply-to' into next

"git send-email --compose --reply-to=<address>" used to add
duplicated Reply-To: header, which made mailservers unhappy. This
has been corrected.

* nb/send-email-no-dup-reply-to:
send-email: don't duplicate Reply-to: in intro message

The tenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'pc/range-diff-memory-limit'

"git range-diff" learned a way to limit the memory consumed by
O(N*N) cost matrix.

* pc/range-diff-memory-limit:
range-diff: add configurable memory limit for cost matrix

Merge branch 'ne/alloc-free-and-null'

The clear_alloc_state() API function was not fully clearing the
structure for reuse, but since nobody reuses it, replace it with a
variant that frees the structure as well, making the callers simpler.

* ne/alloc-free-and-null:
alloc: fix dangling pointer in alloc_state cleanup

Merge branch 'jk/curl-global-trace-components'

Adjust to the way newer versions of cURL selectivel enables tracing
options, so that our tests can continue to work.

* jk/curl-global-trace-components:
curl: add support for curl_global_trace() components

Merge branch 'ag/doc-sendmail-gmail-example-update'

Doc update.

* ag/doc-sendmail-gmail-example-update:
docs: update sendmail docs to use more secure SMTP server for Gmail

Merge branch 'kn/clang-format-bitfields'

CodingGuidelines now spells out how bitfields are to be written.

* kn/clang-format-bitfields:
Documentation: note styling for bit fields

Merge branch 'jc/longer-disambiguation-fix'

"git rev-parse --short" and friends failed to disambiguate two
objects with object names that share common prefix longer than 32
characters, which has been fixed.

* jc/longer-disambiguation-fix:
abbrev: allow extending beyond 32 chars to disambiguate

Merge branch 'sg/line-log-boundary-fixes'

A corner case bug in "git log -L..." has been corrected.

* sg/line-log-boundary-fixes:
line-log: show all line ranges touched by the same diff range
line-log: fix assertion error

Merge branch 'jc/doc-includeif-hasconfig-remote-url-fix'

Doc mark-up fix.

* jc/doc-includeif-hasconfig-remote-url-fix:
config: document includeIf conditions consistently

Merge branch 'ag/send-email-imap-sent'

"git send-email" learned to drive "git imap-send" to store already
sent e-mails in an IMAP folder.

* ag/send-email-imap-sent:
send-email: enable copying emails to an IMAP folder without actually sending them
send-email: add ability to send a copy of sent emails to an IMAP folder

Merge branch 'pw/3.0-commentchar-auto-deprecation'

"core.commentChar=auto" that attempts to dynamically pick a
suitable comment character is non-workable, as it is too much
trouble to support for little benefit, and is marked as deprecated.

* pw/3.0-commentchar-auto-deprecation:
  commit: print advice when core.commentString=auto
  config: warn on core.commentString=auto
  breaking-changes: deprecate support for core.commentString=auto

Merge branch 'kh/doc-fast-import-markup-fix'

Doc mark-up fix.

* kh/doc-fast-import-markup-fix:
doc: fast-import: replace literal block with paragraph

sequencer: remove VERBATIM_MSG flag

As the last commit deleted the only user of VERBATIM_MSG remove
it. This reverts remaining parts of commit f7d42ceec52 (rebase -i:
do leave commit message intact in fixup! chains, 2021-01-28) that
were not deleted by the last commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

rebase -i: respect commit.cleanup when picking fixups

If the user uses a prepare-commit-msg hook to add comments to the
commit message template and sets commit.cleanup to remove them when the
commit is created then the comments will not be removed when rebase
commits the final command in a chain of "fixup" commands[1]. This
happens because f7d42ceec52 (rebase -i: do leave commit message intact
in fixup! chains, 2021-01-28) started passing the VERBATIM_MSG flag
when committing the final command in a chain of "fixup" commands. That
change was added in response to a bug report[2] where the commit
message was being cleaned up when it should not be. The cause of that
bug was that before f7d42ceec52 the sequencer passed CLEANUP_MSG
when committing the final fixup. That commit should have simply
removed the CLEANUP_MSG flag, not changed it to VERBATIM_MSG. Using
VERBATIM_MSG ignores the user's commit.cleanup config when committing
the final fixup which means it behaves differently to an ordinary
"pick" command which respects commit.cleanup.

Fix this by not setting an explicit cleanup flag when committing the
final fixup which matches the way "pick" commands behave. The test
added in f7d42ceec52 is replaced with one that checks that "fixup"
and "pick" commands do not clean up the message when commit.cleanup
is not set and do clean up the message when it is set.

[1] https://lore.kernel.org/git/CA+itcS3DxbgpFy2aPRvHQvTAYE=dU0kfeDdidVwWLU=rBAWR4w@mail.gmail.com
[2] https://lore.kernel.org/git/CANVGpwZGbzYLMeMze64e_OU9p3bjyEgzC5thmNBr6LttBt+YGw@mail.gmail.com

Reported-by: Simon Cheng <cyqsimon@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

last-modified: fix bug when some paths remain unhandled

The recently introduced new subcommand git-last-modified(1) runs into an
error in some scenarios. It then would exit with the message:

BUG: paths remaining beyond boundary in last-modified

This seems to happens for example when criss-cross merges are involved.
In that scenario, the function diff_tree_combined() gets called.

The function diff_tree_combined() copies the `struct diff_options` from
the input `struct rev_info` to override some flags. One flag is
`recursive`, which is always set to 1. This has been the case since the
inception of this function in af3feefa1d (diff-tree -c: show a merge
commit a bit more sensibly., 2006-01-24).

This behavior is incompatible with git-last-modified(1), when called
non-recursive (which is the default).

The last-modified machinery uses a hashmap for all the paths it wants to
get the last-modified commit for. Through log_tree_commit() the callback
mark_path() is called. The diff machinery uses diff_tree_combined()
internally, and due to it's recursive behavior the callback receives
entries inside subtrees, but not the subtree entries themselves. So a
directory is never expelled from the hashmap, and the BUG() statement
gets hit.

Because there are many callers calling into diff_tree_combined(), both
directly and indirectly, we cannot simply change it's behavior.

Instead, add a flag `no_recursive_diff_tree_combined` which supresses
the behavior of diff_tree_combined() to override `recursive` and set
this flag in builtin/last-modified.c.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Sync with 'master'

* master:
mailmap: consolidate multiple addresses into one

refs/files: handle D/F conflicts during locking

The previous commit added the necessary validation and checks for F/D
conflicts in the files backend when working on case insensitive systems.

There is still a possibility for D/F conflicts. This is a different from
the F/D since for F/D conflicts, there would not be a conflict during
the lock creation phase:

    refs/heads/foo.lock
    refs/heads/foo/bar.lock

However there would be a conflict when the locks are committed, since we
cannot have 'refs/heads/foo/bar' and 'refs/heads/foo'. These kinds of
conflicts are checked and resolved in
`refs_verify_refnames_available()`, so the previous commit ensured that
for case-insensitive filesystems, we would lowercase the inputs to that
function.

For D/F conflicts, there is a conflict during the lock creation phase
itself:

    refs/heads/foo/bar.lock
    refs/heads/foo.lock

As in `lock_raw_ref()` after creating the lock, we also check for D/F
conflicts. This can occur in case-insensitive filesystems when trying to
fetch case-conflicted references like:

    refs/heads/Foo/new
    refs/heads/foo

D/F conflicts can also occur in case-sensitive filesystems, when the
repository already contains a directory with a lock file
'refs/heads/foo/bar.lock' and trying to fetch 'refs/heads/foo'. This
doesn't concern directories containing garbage files as those are
handled on a higher level.

To fix this, simply categorize the error as a name conflict. Also remove
this reference from the list of valid refnames for availability checks.
By categorizing the error and removing it from the list of valid
references, batched updates now knows to reject such reference updates
and apply the other reference updates.

Fix a small typo in `ref_transaction_maybe_set_rejected()` while here.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/files: handle F/D conflicts in case-insensitive FS

When using the files-backend on case-insensitive filesystems, there is
possibility of hitting F/D conflicts when creating references within a
single transaction, such as:

- 'refs/heads/foo'
- 'refs/heads/Foo/bar'

Ideally such conflicts are caught in `refs_verify_refnames_available()`
which is responsible for checking F/D conflicts within a given
transaction. This utility function is shared across the reference
backends. As such, it doesn't consider the issues of using a
case-insensitive file system, which only affects the files-backend.

While one solution would be to make the function aware of such issues,
this feels like leaking implementation details of file-backend specific
issues into the utility function. So opt for the more simpler option, of
lowercasing all references sent to this function when on a
case-insensitive filesystem and operating on the files-backend.

To do this, simply use a `struct strbuf` to convert the refname to
lowercase and append it to the list of refnames to be checked. Since we
use a `struct strbuf` and the memory is cleared right after, make sure
that the string list duplicates all provided string.

Without this change, the user would simply be left with a repository
with '.lock' files which were created in the 'prepare' phase of the
transaction, as the 'commit' phase would simply abort and not do the
necessary cleanup.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/files: use correct error type when lock exists

When fetching references into a repository, if a lock for a particular
reference exists, then `lock_raw_ref()` throws:

    - REF_TRANSACTION_ERROR_CASE_CONFLICT: when there is a conflict
    because the transaction contains conflicting references while being
    on a case-insensitive filesystem.

    - REF_TRANSACTION_ERROR_GENERIC: for all other errors.

The latter causes the entire set of batched updates to fail, even in
case sensitive filessystems.

Instead, return a 'REF_TRANSACTION_ERROR_CREATE_EXISTS' error. This
allows batched updates to reject the individual update which conflicts
with the existing file, while updating the rest of the references.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/files: catch conflicts on case-insensitive file-systems

During the 'prepare' phase of a reference transaction in the files
backend, we create the lock files for references to be created. When
using batched updates on case-insensitive filesystems, the entire
batched updates would be aborted if there are conflicting names such as:

  refs/heads/Foo
  refs/heads/foo

This affects all commands which were migrated to use batched updates in
Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before
that, reference updates would be applied serially with one transaction
used per update. When users fetched multiple references on
case-insensitive systems, subsequent references would simply overwrite
any earlier references. So when fetching:

  refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6
  refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3
  refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56

The user would simply end up with:

  refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3
  refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56

This is buggy behavior since the user is never informed about the
overrides performed and missing references. Nevertheless, the user is
left with a working repository with a subset of the references. Since
Git 2.51, in such situations fetches would simply fail without updating
any references. Which is also buggy behavior and worse off since the
user is left without any references.

The error is triggered in `lock_raw_ref()` where the files backend
attempts to create a lock file. When a lock file already exists the
function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens,
the entire batched updates, not individual operation, is aborted as if
it were in a transaction.

Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to
aid the batched update mechanism to simply reject such errors. The
change only affects batched updates since batched updates will reject
individual updates with non-generic errors. So specifically this would
only affect:

    1. git fetch
    2. git receive-pack
    3. git update-ref --batch-updates

This bubbles the error type up to `files_transaction_prepare()` which
tries to lock each reference update. So if the locking fails, we check
if the rejection type can be ignored, which is done by calling
`ref_transaction_maybe_set_rejected()`.

As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT',
the specific reference update would simply be rejected, while other
updates in the transaction would continue to be applied. This allows
partial application of references in case-insensitive filesystems when
fetching colliding references.

While the earlier implementation allowed the last reference to be
applied overriding the initial references, this change would allow the
first reference to be applied while rejecting consequent collisions.
This should be an okay compromise since with the files backend, there is
no scenario possible where we would retain all colliding references.

Let's also be more proactive and notify users on case-insensitive
filesystems about such problems by providing a brief about the issue
while also recommending using the reftable backend, which doesn't have
the same issue.

Reported-by: Joe Drew <joe.drew@indexexchange.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

send-email: don't duplicate Reply-to: in intro message

If I run

  git send-email --compose --reply-to 'ME <my@address.net>' .....

and edit the intro message, then it will get two copies of the Reply-To
field.  gmail.com rejects such messages.

This happens because send-email reads the edited message examining the
headers.  For recognised headers the content is extracted to use in
constructing the final message and for possible inclusion in the patch
emails.  Unrecognised headers are gathered (in @xh) to be passed through
uninterpreted.

Unfortunately "Reply-To" is not recognised in this process so it is
added to @xh as an uninterpreted header, but also generated from the
$reply_to variable in gen_header(), resulting in two copies

Add parsing to the loop in pre_process_file() to recognise a Reply-to
header and to store the result in $reply_to.  This means that the
intro message will not get a second header and also means that
any changes made to the Reply-To header during editing will be
incorporated in the $reply_to variable and so included in all the
generated email messages.

Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

config: store want_color() result in a separate bool

The "git config --get-colorbool foo.bar" command not only digs in the
config to find the value of foo.bar, it evaluates the result using
want_color() to check the tty-ness of stdout.

But it stores the bool result of want_color() in the same git_colorbool
that we found in the config. This works in practice because the
git_colorbool enum is a superset of the bool values. But it is an oddity
from a type system perspective.

Let's instead store the result in a separate bool and use that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

add-interactive: retain colorbool values longer

Most of the diff code stores the decision about whether to show color as
a git_colorbool, and evaluates it at point-of-use with want_color().
This timing is important for reasons explained in daa0c3d971 (color:
delay auto-color decision until point of use, 2011-08-17).

The add-interactive code instead converts immediately to strict boolean
values using want_color(), and then evaluates those. This isn't wrong.
Even though we pass the bool values to diff_use_color(), which expects a
colorbool, the values are compatible. But it is unlike the rest of the
color code, and is questionable from a type-system perspective (but C's
typing between enums, ints, and bools is weak enough that the compiler
does not complain).

Let's switch it to the more usual way of calling want_color() at the
point of use.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

color: return bool from want_color()

The point of want_color() is to take in a git_colorbool enum value and
collapse it down to a single true/false boolean, letting UNKNOWN fall
back to the color.ui default and checking isatty() for AUTO.

Let's make that more clear in the type system by returning a bool rather
than an integer.

This sadly still does not help us much with compiler warnings for using
the two types interchangeably. But it helps make the intent more clear
to a human reader.

We still retain the idempotency of want_color(), because in C a bool
true/false converts to 1/0 when converted to an integer, which
corresponds to GIT_COLOR_ALWAYS and GIT_COLOR_NEVER. So you can store
the bool in a git_colorbool and get the right result (something a few
pieces of code still do, but which we'll clean up in further patches).

Note that we rely on this same bool/int conversion for
check_auto_color(). We cache its results in a tristate int with "-1" as
"not yet set", but we can assign to it (and return it) with implicit
conversions to/from bool.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

color: use git_colorbool enum type to store colorbools

We traditionally used "int" to store and pass around the values defined
by "enum git_colorbool" (which were originally just #define macros).
Using an int doesn't produce incorrect results, but using the actual
enum makes the intent of the code more clear.

It would be nice if the compiler could catch cases where we used the
enum and an int interchangeably, since it's very easy to accidentally
check the boolean true/false of a colorbool like:

if (branch_use_color)

This is wrong because GIT_COLOR_UNKNOWN and GIT_COLOR_AUTO evaluate to
true in C, even though we may ultimately decide not to use color. But C
is pretty happy to convert between ints and enums (even with various
-Wenum-* warnings). So this sadly doesn't protect us from such mistakes,
but it hopefully does make the code easier to read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

mailmap: consolidate multiple addresses into one

Merges contributions made from three different addresses:

- win@wincent.com (old address, initial contributions in 2007–2009)
- greg@hurrell.net (personal address matching full name, so this one is
the "forever" address; contributions made starting in 2018)
- greg.hurrell@datadoghq.com (current work address, used for recent
contributions)

Signed-off-by: Greg Hurrell <greg.hurrell@datadoghq.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pretty: use format_commit_context.auto_color as colorbool

When we see "%C(auto)" as a format placeholder, we evaluate the "color"
field of our pretty_print_context to decide whether we want color. The
auto_color field of format_commit_context then stores the boolean result
of want_color(), telling us the yes/no of whether we want color.

But the resulting field is passed to various functions which expect a
git_colorbool, like diff_get_color(), that will then pass it to
want_color() again. It's not wrong to do so, since want_color() is
idempotent. But it makes it harder to reason about the types, since we
sometimes confuse colorbools and strict booleans.

Let's instead store auto_color as the original colorbool itself. We'll
have to make sure it is passed through want_color() when it is
evaluated, but there is only one such spot (right next to where we
assign it!). Every other caller just ends up passing it to get
diff_get_color() either directly or through another helper.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: stop passing ecbdata->use_color as boolean

In emit_hunk_header(), we evaluate ecbdata->color_diff both as a
git_colorbool, passing it to diff_get_color():

const char *reset = diff_get_color(ecbdata->color_diff, DIFF_RESET);

and as a strict boolean:

const char *reverse = ecbdata->color_diff ? GIT_COLOR_REVERSE : "";

At first glance this seems wrong. Usually we store the color decision as
a git_colorbool, so the second line would get confused by GIT_COLOR_AUTO
(which is boolean true, but may still mean we do not produce color).

However, the second line is correct because our caller sets color_diff
using want_color(), which collapses the colorbool to a strict true/false
boolean. The first line is _also_ correct because of the idempotence of
want_color(). Even though diff_get_color() will pass our true/false
value through want_color() again, the result will be left untouched.

But let's pass through the colorbool itself, which makes it more
consistent with the rest of the diff code. We'll need to then call
want_color() whenever we treat it as a boolean, but there is only such
spot (the one quoted above).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: pass o->use_color directly to fill_metainfo()

We pass the use_color parameter of fill_metainfo() as a strict boolean,
using:

want_color(o->use_color) && !pgm

to derive its value. But then inside the function, we pass it to
diff_get_color(), which expects one of the git_colorbool enum values,
and so feeds it to want_color() again.

Even though want_color() produces a strict 0/1 boolean, this doesn't
produce wrong results because want_color() is idempotent. Since
GIT_COLOR_ALWAYS and NEVER are defined as 1 and 0, and because
want_color() passes through those values, evaluating "want_color(foo)"
and "want_color(want_color(foo))" will return the same result.

But as part of a longer strategy to align the types we use for storing
these values, let's pass through the colorbool directly. To handle the
"&&" case here, we'll convert the presence of "pgm" into "NEVER", which
arguably makes the intent of the code more clear anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: don't use diff_options.use_color as a strict bool

We disable --color-moved if color is not in use at all. This happens in
diff_setup_done(), where we set options->color_moved to 0 if
options->use_color is not true. But a strict boolean check here is not
correct; use_color could be GIT_COLOR_UNKNOWN or GIT_COLOR_AUTO, both of
which evaluate to true, even though we may later decide not to show
colors.

We should be using want_color() to convert that git_colorbool into a
true boolean. As it turns out, this does not produce wrong output. Even
though we go to the trouble to detect the moved lines, ultimately we get
the color values from diff_get_color(), which does check want_color().
And so it returns the empty string for each color, and we "color" the
result with nothing.

So the output is correct, but there is a small but measurable
performance cost to doing the line detection. E.g., in git.git before
and after this patch (there are no colors shown because hyperfine
redirects output to /dev/null):

  Benchmark 1: ./git.old log --no-merges -p --color-moved -1000
    Time (mean ± σ):      1.019 s ±  0.013 s    [User: 0.955 s, System: 0.064 s]
    Range (min … max):    1.005 s …  1.045 s    10 runs

  Benchmark 2: ./git.new log --no-merges -p --color-moved -1000
    Time (mean ± σ):     982.9 ms ±  14.5 ms    [User: 925.8 ms, System: 57.1 ms]
    Range (min … max):   965.1 ms … 1003.2 ms    10 runs

  Summary
    ./git.new log --no-merges -p --color-moved -1000 ran
      1.04 ± 0.02 times faster than ./git.old log --no-merges -p --color-moved -1000

Note that the fix is not quite as simple as just calling want_color()
from diff_setup_done(). There's a subtle timing issue that goes back to
daa0c3d971 (color: delay auto-color decision until point of use,
2011-08-17), the commit that adds want_color() in the first place.  As
discussed there, we must delay evaluating the colorbool value until all
pager setup is complete.

So instead, we'll leave the "color_moved" field intact in diff_setup_done(),
and modify the point where it is evaluated. Fortunately there is only
one such spot that controls whether we run any of the color-moved code
at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: simplify color_moved check when flushing

In diff_flush_patch_all_file_pairs(), we set o->emitted_symbols if and
only if o->color_moved is true. That causes the lower-level routines to
fill up o->emitted_symbols, which we then analyze in order to do the
actual colorizing.

But in that final step, we do:

  if (o->emitted_symbols) {
          if (o->color_moved) {
     ...actual coloring...
  }
  ...clean up of emitted_symbols...
  }

The inner "if" will always trigger, since we set emitted_symbols only
when doing color_moved (it is a little confusing that it is set inside
the diff_options struct, but that is for convenience of passing it to
the lower-level routines; we always clear it at the end of flushing,
since 48edf3a02a (diff: clear emitted_symbols flag after use,
2019-01-24)).

Let's simplify the code a bit by just dropping the inner "if" and
running its block unconditionally.

In theory the current code might be useful if another feature besides
color_moved setup and used emitted_symbols, but it would be easy to
refactor later to handle that. And in the meantime, this makes further
work in this area easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

grep: don't treat grep_opt.color as a strict bool

In show_line(), we check to see if colors are desired with just:

  if (opt->color)
     ...we want colors...

But this is incorrect. The color field here is really a git_colorbool,
so it may be "true" for GIT_COLOR_UNKNOWN or GIT_COLOR_AUTO. Either of
those _might_ end up true eventually (once we apply default fallbacks
and check stdout's tty), but they may not. E.g.:

  git grep foo | cat

will enter the conditional even though we're not going to show colors.
We should collapse it into a true boolean by calling want_color().

It turns out that this does not produce a user-visible bug. We do some
extra processing to isolate the matched portion of the line in order to
colorize it, but ultimately we pass it to our output_color() helper,
which does correctly check want_color(). So we end up with no colors.

But dropping the extra processing saves a measurable amount of time. For
example, running under hyperfine (which redirects to /dev/null, and thus
does not colorize):

  Benchmark 1: ./git.old grep a
    Time (mean ± σ):      58.7 ms ±   3.5 ms    [User: 580.6 ms, System: 74.3 ms]
    Range (min … max):    53.5 ms …  67.1 ms    48 runs

  Benchmark 2: ./git.new grep a
    Time (mean ± σ):      35.5 ms ±   0.9 ms    [User: 276.8 ms, System: 73.8 ms]
    Range (min … max):    34.3 ms …  39.3 ms    79 runs

  Summary
    ./git.new grep a ran
      1.65 ± 0.11 times faster than ./git.old grep a

That's a fairly extreme benchmark, just because it will come up with a
ton of small matches, but it shows that this really does matter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

color: return enum from git_config_colorbool()

The git_config_colorbool() function returns an integer which is always
one of the GIT_COLOR_* constants UNKNOWN, NEVER, ALWAYS, or AUTO. We
define these constants with macros, but let's switch to using an enum.
Even though the compiler does not strictly enforce enum/int conversions,
this should make the intent clearer to human readers. And as a bonus,
enum names are typically available to debuggers, making it more pleasant
to step through the code there.

This patch updates the return type of git_config_colorbool(), but holds
off on updating all of the callers. There's some trickiness to some of
them, and in the meantime it's perfectly fine to assign an enum into an
int.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

color: use GIT_COLOR_* instead of numeric constants

Long ago Git's decision to show color for a subsytem was stored in a
tri-state variable: it could be true (1), false (0), or unknown (-1).
But since daa0c3d971 (color: delay auto-color decision until point of
use, 2011-08-17) we want to carry around a new state, "auto", which
bases the decision on the tty-ness of stdout (rather than collapsing
that "auto" state to a true/false immediately).

That commit introduced a set of GIT_COLOR_* defines to represent each
state: UNKNOWN, ALWAYS, NEVER, and AUTO. But it only used the AUTO
value, and left alone code using bare 0/1/-1 values. And of course since
then we've grown many new spots that use those bare values.

Let's switch all of these to use the named constants. That should make
the code a bit easier to read, as it is more obvious that we're
representing a color decision.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jk/add-i-color' into jk/color-variable-fixes

* jk/add-i-color:
  contrib/diff-highlight: mention interactive.diffFilter
  add-interactive: manually fall back color config to color.ui
  add-interactive: respect color.diff for diff coloring
  stash: pass --no-color to diff plumbing child processes

odb: add transaction interface

Transactions are managed via the {begin,end}_odb_transaction() function
in the object-file subsystem and its implementation is specific to the
files object source. Introduce odb_transaction_{begin,commit}() in the
odb subsystem to provide an eventual object source agnostic means to
manage transactions.

Update call sites to instead manage transactions through the odb
subsystem. Also rename {begin,end}_odb_transaction() functions to
object_file_transaction_{begin,commit}() to clarify the object source it
supports.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: update naming from bulk-checkin

Update the names of several functions and types relocated from the
bulk-checkin subsystem for better clarity. Also drop
finish_tmp_packfile() as a standalone function in favor of embedding it
in flush_packfile_transaction() directly.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object-file: relocate ODB transaction code

The bulk-checkin subsystem provides various functions to manage ODB
transactions. Apart from {begin,end}_odb_transaction(), these functions
are only used by the object-file subsystem to manage aspects of a
transaction implementation specific to the files object source.

Relocate all the transaction code in bulk-checkin to object-file. This
simplifies the exposed transaction interface by reducing it to only
{begin,end}_odb_transaction(). Function and type names are adjusted in
the subsequent commit to better fit the new location.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bulk-checkin: drop flush_odb_transaction()

Object database transactions can be explicitly flushed via
flush_odb_transaction() without actually completing the transaction.
This makes the provided transactional interface a bit awkward. Now that
there are no longer any flush_odb_transaction() call sites, drop the
function to simplify the interface and further ensure that a transaction
is only finalized when end_odb_transaction() is invoked.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/update-index: end ODB transaction when --verbose is specified

With 23a3a303 (update-index: use the bulk-checkin infrastructure,
2022-04-04), object database transactions were added to
git-update-index(1) to facilitate writing objects in bulk. With
transactions, newly added objects are instead written to a temporary
object directory and migrated to the primary object database upon
transaction commit.

When the --verbose option is specified, the subsequent set of objects
written are explicitly flushed via flush_odb_transaction() prior to
reporting the update. Flushing the object database transaction migrates
pending objects to the primary object database without marking the
transaction as complete. This is done so objects are immediately visible
to git-update-index(1) callers using the --verbose option and that rely
on parsing verbose output to know when objects are written.

Due to how git-update-index(1) parses arguments, options that come after
a filename are not considered during the object update. Therefore, it
may not be known ahead of time whether the --verbose option is present
and thus object writes are considered transactional by default until a
--verbose option is parsed.

Flushing a transaction after individual object writes negates the
benefit of writing objects to a transaction in the first place.
Furthermore, the mechanism to flush a transaction without actually
committing is rather awkward. Drop the call to flush_odb_transaction()
in favor of ending the transaction altogether when the --verbose flag is
encountered. Subsequent object writes occur outside of a transaction and
are therefore immediately visible which matches the current behavior.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

bulk-checkin: remove ODB transaction nesting

ODB transactions support being nested. Only the outermost
{begin,end}_odb_transaction() start and finish a transaction. This
allows internal object write codepaths to be optimized with ODB
transactions without worrying about whether a transaction is already
active. When {begin,end}_odb_transaction() is invoked during an active
transaction, these operations are essentially treated as no-ops. This
can make the interface a bit awkward to use, as calling
end_odb_transaction() does not guarantee that a transaction is actually
ended. Thus, in situations where a transaction needs to be explicitly
flushed, flush_odb_transaction() must be used.

To remove the need for an explicit transaction flush operation via
flush_odb_transaction() and better clarify transaction semantics, drop
the transaction nesting mechanism in favor of begin_odb_transaction()
returning a NULL transaction value to signal it was a no-op, and
end_odb_transaction() behaving as a no-op when a NULL transaction value
is passed. This is safe for existing callers as the transaction value
wired to end_odb_transaction() already comes from
begin_odb_transaction() and thus continues the same no-op behavior when
a transaction is already pending. With this model, passing a pending
transaction to end_odb_transaction() ensures it is committed at that
point in time.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: expand tests around sparse merges and clean

With the current implementation of 'git sparse-checkout clean', we
notice that a file that was in a conflicted state does not get cleaned
up because of some internal details around the SKIP_WORKTREE bit.

This test is documenting the current behavior before we update it in the
following change.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-index: point users to new 'clean' action

In my experience, the most-common reason that the sparse index must
expand to a full one is because there is some leftover file in a tracked
directory that is now outside of the sparse-checkout. The new 'git
sparse-checkout clean' command will find and delete these directories,
so point users to it when they hit the sparse index expansion advice.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-checkout: add --verbose option to 'clean'

The 'git sparse-checkout clean' subcommand is focused on directories,
deleting any tracked sparse directories to clean up the worktree and
make the sparse index feature work optimally.

However, this directory-focused approach can leave users wondering why
those directories exist at all. In my experience, these files are left
over due to ignore or exclude patterns, Windows file handles, or
possibly merge conflict resolutions.

Add a new '--verbose' option for users to see all the files that are
being deleted (with '--force') or would be deleted (with '--dry-run').

Based on usage, users may request further context on this list of files for
states such as tracked/untracked, unstaged/staged/conflicted, etc.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Sync with 'master'

Merge branch 'cs/subtree-squash-split-fix' into next

"git subtree" (in contrib/) did not work correctly when splitting
squashed subtrees, which has been improved.

* cs/subtree-squash-split-fix:
contrib/subtree: fix split with squashed subtrees

Merge branch 'ps/clar-updates' into next

Import a newer version of the clar unit testing framework.

* ps/clar-updates:
t/unit-tests: update clar to fcbed04

Merge branch 'rs/get-oid-with-flags-cleanup' into next

Code clean-up.

* rs/get-oid-with-flags-cleanup:
use repo_get_oid_with_flags()

Merge branch 'jk/add-i-color' into next

Some among "git add -p" and friends ignored color.diff and/or
color.ui configuration variables, which is an old regression, which
has been corrected.

* jk/add-i-color:
  contrib/diff-highlight: mention interactive.diffFilter
  add-interactive: manually fall back color config to color.ui
  add-interactive: respect color.diff for diff coloring
  stash: pass --no-color to diff plumbing child processes

Merge branch 'cc/promisor-remote-capability' into next

The "promisor-remote" capability mechanism has been updated to
allow the "partialCloneFilter" settings and the "token" value to be
communicated from the server side.

* cc/promisor-remote-capability:
  promisor-remote: use string_list_split() in mark_remotes_as_accepted()
  promisor-remote: allow a client to check fields
  promisor-remote: use string_list_split() in filter_promisor_remote()
  promisor-remote: refactor how we parse advertised fields
  promisor-remote: use string constants for 'name' and 'url' too
  promisor-remote: allow a server to advertise more fields
  promisor-remote: refactor to get rid of 'struct strvec'

The ninth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'mm/worktree-doc-typofix'

Docfix.

* mm/worktree-doc-typofix:
docs: fix typo in worktree.adoc 'extension'

Merge branch 'rs/object-name-extend-abbrev-len-update'

Code clean-up.

* rs/object-name-extend-abbrev-len-update:
object-name: declare pointer type of extend_abbrev_len()'s 2nd parameter

Merge branch 'ps/upload-pack-oom-protection'

A broken or malicious "git fetch" can say that it has the same
object for many many times, and the upload-pack serving it can
exhaust memory storing them redundantly, which has been corrected.

* ps/upload-pack-oom-protection:
upload-pack: don't ACK non-commits repeatedly in protocol v2
t5530: modernize tests

Merge branch 'ds/midx-write-fixes'

Fixes multiple crashes around midx write-out codepaths.

* ds/midx-write-fixes:
  midx-write: simplify error cases
  midx-write: reenable signed comparison errors
  midx-write: use uint32_t for preferred_pack_idx
  midx-write: use cleanup when incremental midx fails
  midx-write: put failing response value back
  midx-write: only load initialized packs

Merge branch 'lo/repo-info-step-2'

"repo info" learns a short-hand option "-z" that is the same as
"--format=nul", and learns to report the objects format used in the
repository.

* lo/repo-info-step-2:
repo: add the field objects.format
repo: add the flag -z as an alias for --format=nul

Merge branch 'jt/de-global-bulk-checkin'

The bulk-checkin code used to depend on a file-scope static
singleton variable, which has been updated to pass an instance
throughout the callchain.

* jt/de-global-bulk-checkin:
  bulk-checkin: use repository variable from transaction
  bulk-checkin: require transaction for index_blob_bulk_checkin()
  bulk-checkin: remove global transaction state
  bulk-checkin: introduce object database transaction structure

Sync with 'master'

Merge branch 'pc/range-diff-memory-limit' into next

"git range-diff" learned a way to limit the memory consumed by
O(N*N) cost matrix.

* pc/range-diff-memory-limit:
range-diff: add configurable memory limit for cost matrix

Merge branch 'ne/alloc-free-and-null' into next

The clear_alloc_state() API function was not fully clearing the
structure for reuse, but since nobody reuses it, replace it with a
variant that frees the structure as well, making the callers simpler.

* ne/alloc-free-and-null:
alloc: fix dangling pointer in alloc_state cleanup