git.ipfire.org Git - thirdparty/git.git/log

apply: refactor `struct image` to use a `struct strbuf`

The `struct image` uses a character array to track the pre- or postimage
of a patch operation. This has multiple downsides:

  - It is somewhat hard to track memory ownership. In fact, we have
    several memory leaks in git-apply(1) because we do not (and cannot
    easily) free the buffer in all situations.

  - We have to reinvent the wheel and manually implement a lot of
    functionality that would already be provided by `struct strbuf`.

  - We have to carefully track whether `update_pre_post_images()` can do
    an in-place update of the postimage or whether it has to allocate a
    new buffer for it.

This is all rather cumbersome, and especially `update_pre_post_images()`
is really hard to understand as a consequence even though what it is
doing is rather trivial.

Refactor the code to use a `struct strbuf` instead, addressing all of
the above. Like this we can easily perform in-place updates in all
situations, the logic to perform those updates becomes way simpler and
the lifetime of the buffer becomes a ton easier to track.

This refactoring also plugs some leaking buffers as a side effect.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply: rename members that track line count and allocation length

The `struct image` has two members `nr` and `alloc` that track the
number of lines as well as how large its array is. It is somewhat easy
to confuse these members with `len` though, which tracks the length of
the `buf` member.

Rename these members to `line_nr` and `line_alloc` respectively to avoid
confusion. This is in line with how we typically name variables that
track an array in this way.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply: refactor code to drop `line_allocated`

The `struct image` has two members `line` and `line_allocated`. The
former member is the one that should be used throughout the code,
whereas the latter one is used to track whether the lines have been
allocated or not.

In practice, the array of lines is always allocated. The reason why we
have `line_allocated` is that `remove_first_line()` will advance the
array pointer to drop the first entry, and thus it points into the array
instead of to the array header.

Refactor the function to use memmove(3P) instead, which allows us to get
rid of this double bookkeeping. This is less efficient, but I doubt that
this matters much in practice. If this judgement call is found to be
wrong at a later point in time we can likely refactor the surrounding
loop such that we first calculate the number of leading context lines to
remove and then remove them in a single call to memmove(3P).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply: introduce macro and function to init images

We're about to convert the `struct image` to gain a `struct strbuf`
member, which requires more careful initialization than just memsetting
it to zeros. Introduce the `IMAGE_INIT` macro and `image_init()`
function to prepare for this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply: rename functions operating on `struct image`

Rename functions operating on `struct image` to have a `image_` prefix
to match our modern code style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

apply: reorder functions to move image-related things together

While most of the functions relating to `struct image` are relatively
close to one another, `fuzzy_matchlines()` sits in between those even
though it is rather unrelated.

Reorder functions such that `struct image`-related functions are next to
each other. While at it, move `clear_image()` to the top such that it is
close to the struct definition itself. This makes this lifecycle-related
thing easy to discover.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Sync with Git 2.46.1

The sixteenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'bl/trailers-and-incomplete-last-line-fix'

The interpret-trailers command failed to recognise the end of the
message when the commit log ends in an incomplete line.

* bl/trailers-and-incomplete-last-line-fix:
interpret-trailers: handle message without trailing newline

Merge branch 'rj/cygwin-has-dev-tty'

Cygwin does have /dev/tty support that is needed by things like
single-key input mode.

* rj/cygwin-has-dev-tty:
config.mak.uname: add HAVE_DEV_TTY to cygwin config section

Merge branch 'rs/diff-exit-code-fix'

In a few corner cases "git diff --exit-code" failed to report
"changes" (e.g., renamed without any content change), which has
been corrected.

* rs/diff-exit-code-fix:
diff: report dirty submodules as changes in builtin_diff()
diff: report copies and renames as changes in run_diff_cmd()

Merge branch 'jc/doc-skip-fetch-all-and-prefetch'

Doc updates.

* jc/doc-skip-fetch-all-and-prefetch:
doc: remote.*.skip{DefaultUpdate,FetchAll} stops prefetch

Merge branch 'ds/doc-wholesale-disabling-advice-messages'

The environment GIT_ADVICE has been intentionally kept undocumented
to discourage its use by interactive users. Add documentation to
help tool writers.

* ds/doc-wholesale-disabling-advice-messages:
advice: recommend GIT_ADVICE=0 for tools

Merge branch 'jk/sparse-fdleak-fix'

A file descriptor left open is now properly closed when "git
sparse-checkout" updates the sparse patterns.

* jk/sparse-fdleak-fix:
  sparse-checkout: use fdopen_lock_file() instead of xfdopen()
  sparse-checkout: check commit_lock_file when writing patterns
  sparse-checkout: consolidate cleanup when writing patterns

Merge branch 'ds/scalar-no-tags'

The "scalar clone" command learned the "--no-tags" option.

* ds/scalar-no-tags:
scalar: add --no-tags option to 'scalar clone'

Git 2.46.1

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'rj/compat-terminal-unused-fix' into maint-2.46

Build fix.

* rj/compat-terminal-unused-fix:
compat/terminal: mark parameter of git_terminal_prompt() UNUSED

Merge branch 'jc/config-doc-update' into maint-2.46

Docfix.

* jc/config-doc-update:
git-config.1: fix description of --regexp in synopsis
git-config.1: --get-all description update

Merge branch 'aa/cat-file-batch-output-doc' into maint-2.46

Docfix.

* aa/cat-file-batch-output-doc:
docs: explain the order of output in the batched mode of git-cat-file(1)

Merge branch 'cl/config-regexp-docfix' into maint-2.46

Docfix.

* cl/config-regexp-docfix:
doc: replace 3 dash with correct 2 dash in git-config(1)

Merge branch 'jc/coding-style-c-operator-with-spaces' into maint-2.46

Write down whitespacing rules around C opeators.

* jc/coding-style-c-operator-with-spaces:
CodingGuidelines: spaces around C operators

Merge branch 'ps/stash-keep-untrack-empty-fix' into maint-2.46

A corner case bug in "git stash" was fixed.

* ps/stash-keep-untrack-empty-fix:
builtin/stash: fix `--keep-index --include-untracked` with empty HEAD

Merge branch 'ps/index-pack-outside-repo-fix' into maint-2.46

"git verify-pack" and "git index-pack" started dying outside a
repository, which has been corrected.

* ps/index-pack-outside-repo-fix:
builtin/index-pack: fix segfaults when running outside of a repo

Merge branch 'jk/free-commit-buffer-of-skipped-commits' into maint-2.46

The code forgot to discard unnecessary in-core commit buffer data
for commits that "git log --skip=<number>" traversed but omitted
from the output, which has been corrected.

* jk/free-commit-buffer-of-skipped-commits:
revision: free commit buffers for skipped commits

Sync with 'maint'

The fifteenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'kl/cat-file-on-sparse-index'

"git cat-file" works well with the sparse-index, and gets marked as
such.

* kl/cat-file-on-sparse-index:
builtin/cat-file: mark 'git cat-file' sparse-index compatible
t1092: allow run_on_* functions to use standard input

Merge branch 'jk/messages-with-excess-lf-fix'

One-line messages to "die" and other helper functions will get LF
added by these helper functions, but many existing messages had an
unnecessary LF at the end, which have been corrected.

* jk/messages-with-excess-lf-fix:
drop trailing newline from warning/error/die messages

Merge branch 'ps/pack-refs-auto-heuristics'

"git pack-refs --auto" for the files backend was too aggressive,
which has been a bit tamed.

* ps/pack-refs-auto-heuristics:
  refs/files: use heuristic to decide whether to repack with `--auto`
  t0601: merge tests for auto-packing of refs
  wrapper: introduce `log2u()`

Merge branch 'tb/multi-pack-reuse-fix'

A data corruption bug when multi-pack-index is used and the same
objects are stored in multiple packfiles has been corrected.

* tb/multi-pack-reuse-fix:
  builtin/pack-objects.c: do not open-code `MAX_PACK_OBJECT_HEADER`
  pack-bitmap.c: avoid repeated `pack_pos_to_offset()` during reuse
  builtin/pack-objects.c: translate bit positions during pack-reuse
  pack-bitmap: tag bitmapped packs with their corresponding MIDX
  t/t5332-multi-pack-reuse.sh: verify pack generation with --strict

Merge branch 'gt/unit-test-oid-array'

Another unit-test.

* gt/unit-test-oid-array:
t: port helper/test-oid-array.c to unit-tests/t-oid-array.c

Merge branch 'ps/index-pack-outside-repo-fix'

"git verify-pack" and "git index-pack" started dying outside a
repository, which has been corrected.

* ps/index-pack-outside-repo-fix:
builtin/index-pack: fix segfaults when running outside of a repo

Merge branch 'jc/mailinfo-header-cleanup'

Code clean-up.

* jc/mailinfo-header-cleanup:
mailinfo: we parse fixed headers

Another batch of topics for 2.46.1

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jc/grammo-fixes' into maint-2.46

Doc updates.

* jc/grammo-fixes:
doc: grammofix in git-diff-tree
tutorial: grammofix

Merge branch 'jc/tests-no-useless-tee' into maint-2.46

Test fixes.

* jc/tests-no-useless-tee:
tests: drop use of 'tee' that hides exit status

Merge branch 'jc/how-to-maintain-updates' into maint-2.46

Doc updates.

* jc/how-to-maintain-updates:
howto-maintain: mention preformatted docs

Merge branch 'ps/bundle-outside-repo-fix' into maint-2.46

"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.

* ps/bundle-outside-repo-fix:
bundle: default to SHA1 when reading bundle headers
builtin/bundle: have unbundle check for repo before opening its bundle

Merge branch 'jc/patch-id' into maint-2.46

The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.
cf. <Zqh2T_2RLt0SeKF7@tanuki>

* jc/patch-id:
  patch-id: tighten code to detect the patch header
  patch-id: rewrite code that detects the beginning of a patch
  patch-id: make get_one_patchid() more extensible
  patch-id: call flush_current_id() only when needed
  t4204: patch-id supports various input format

Merge branch 'jk/apply-patch-mode-check-fix' into maint-2.46

Test fix.

* jk/apply-patch-mode-check-fix:
t4129: fix racy index when calling chmod after git-add
apply: canonicalize modes read from patches

The fourteenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'sp/mailmap'

Update to a mailmap entry.

* sp/mailmap:
.mailmap document current address.

Merge branch 'ps/declare-pack-redundamt-dead'

"git pack-redundant" has been marked for removal in Git 3.0.

* ps/declare-pack-redundamt-dead:
Documentation/BreakingChanges: announce removal of git-pack-redundant(1)

Merge branch 'ah/mergetols-vscode'

"git mergetool" learned to use VSCode as a merge backend.

* ah/mergetols-vscode:
mergetools: vscode: new tool

Merge branch 'rj/compat-terminal-unused-fix'

Build fix.

* rj/compat-terminal-unused-fix:
compat/terminal: mark parameter of git_terminal_prompt() UNUSED

Merge branch 'jk/free-commit-buffer-of-skipped-commits'

The code forgot to discard unnecessary in-core commit buffer data
for commits that "git log --skip=<number>" traversed but omitted
from the output, which has been corrected.

* jk/free-commit-buffer-of-skipped-commits:
revision: free commit buffers for skipped commits

doc: remote.*.skip{DefaultUpdate,FetchAll} stops prefetch

Back when 7cc91a2f (Add the configuration option skipFetchAll,
2009-11-09) added for the sole purpose of adding skipFetchAll as a
synonym to skipDefaultUpdate, there was no explanation about the
reason why it was needed., but these two configuration variables
mean exactly the same thing.

Also, when we taught the "prefetch" task to "git maintenance" later,
we did make it pay attention to the setting, but we forgot to
document it.

Document these variables as synonyms that collectively implements
the last-one-wins semantics, and also clarify that the prefetch task
is also controlled by this variable.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

config.mak.uname: add HAVE_DEV_TTY to cygwin config section

If neither HAVE_DEV_TTY nor GIT_WINDOWS_NATIVE is set, while compiling
the 'compat/terminal.c' code, then the fallback code calls the system
getpass() function. Unfortunately, this ignores the 'echo' parameter of
the git_terminal_prompt() function, since it has no way to implement that
functionality. This results in a less than optimal user experience on
cygwin, which does not define either of those build flags.

However, cygwin does have a functional '/dev/tty', so that it can build
with HAVE_DEV_TTY and benefit from the improved user experience.

The improved git_terminal_prompt() function that comes with HAVE_DEV_TTY
is used in the git_prompt() function, which in turn is used by the
'git credential', 'git bisect' and 'git help' commands. In addition to
git_terminal_prompt(), read_key_without_echo() is likewise improved and
used by the 'git add -p' command.

While using the 'git credential fill' command, for example:

  $ printf "%s\n" protocol=https host=example.com path=git | ./git credential fill
  Username for 'https://example.com': user
  Password for 'https://user@example.com':
  protocol=https
  host=example.com
  username=user
  password=pass
  $

The 'user' name is now echoed while typing (the password isn't), where this
wasn't the case before.

When using the auto-correct feature:

  $ ./git -c help.autocorrect=prompt fred
  WARNING: You called a Git command named 'fred', which does not exist.
  Run 'grep' instead [y/N]? n
  $ ./git -c help.autocorrect=prompt fred
  WARNING: You called a Git command named 'fred', which does not exist.
  Run 'grep' instead [y/N]? y
  fatal: no pattern given
  $

The user can actually see what they are typing at the prompt. Similar
comments apply to 'git bisect':

  $ ./git bisect bad master~1
  You need to start by "git bisect start"

  Do you want me to do it for you [Y/n]? y
  status: waiting for both good and bad commits
  status: waiting for good commit(s), bad commit known
  $ ./git bisect reset
  Already on 'master-tmp'
  $

  $ ./git bisect start
  status: waiting for both good and bad commits
  $ ./git bisect bad master~1
  status: waiting for good commit(s), bad commit known
  $ ./git bisect next
  warning: bisecting only with a bad commit
  Are you sure [Y/n]? n
  $ ./git bisect reset
  Already on 'master-tmp'
  $

The read_key_without_echo() function leads to a much improved 'git add -p'
command, when the 'interactive.singleKey' configuration is set:

  $ cd ..
  $ mkdir test-git
  $ cd test-git
  $ git init -q
  $ echo foo >file
  $ git add file
  $ echo bar >file
  $ ../git/git -c interactive.singleKey=true add -p
  diff --git a/file b/file
  index 257cc56..5716ca5 100644
  --- a/file
  +++ b/file
  @@ -1 +1 @@
  -foo
  +bar
  (1/1) Stage this hunk [y,n,q,a,d,e,p,?]? y

  $

Note that, not only is the user input echoed, but that it is immediately
accepted (without having to type <return>) and the program exits with the
hunk staged (in this case) or not.

In order to reap these benefits, set the HAVE_DEV_TTY build flag in the
cygwin configuration section of config.mak.uname.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: report dirty submodules as changes in builtin_diff()

The diff machinery has two ways to detect changes to set the exit code:
Just comparing hashes and comparing blob contents.  The latter is needed
if certain changes have to be ignored, e.g. with --ignore-space-change
or --ignore-matching-lines.  It's enabled by the diff_options flag
diff_from_contents.

The slower mode as never considered submodules (and subrepos) as changes
with --submodule=diff or --submodule=log, which is inconsistent with
--submodule=short (the default).  Fix it.

d7b97b7185 (diff: let external diffs report that changes are
uninteresting, 2024-06-09) set diff_from_contents if external diff
programs are allowed.  This is the default e.g. for git diff, and so
that change exposed the inconsistency much more widely.

Reported-by: David Hull <david.hull@friendbuy.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff: report copies and renames as changes in run_diff_cmd()

The diff machinery has two ways to detect changes to set the exit code:
Just comparing hashes and comparing blob contents.  The latter is needed
if certain changes have to be ignored, e.g. with --ignore-space-change
or --ignore-matching-lines.  It's enabled by the diff_options flag
diff_from_contents.

The slower mode has never considered copies and renames to be changes,
which is inconsistent with the quicker one.  Fix it.  Even if we ignore
the file contents (because it's empty or contains only ignored lines),
there's still the meta data change of adding or changing a filename, so
we need to report it in the exit code.

d7b97b7185 (diff: let external diffs report that changes are
uninteresting, 2024-06-09) set diff_from_contents if external diff
programs are allowed.  This is the default e.g. for git diff, and so
that change exposed the inconsistency much more widely.

Reported-by: Jorge Luis Martinez Gomez <jol@jol.dev>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

advice: recommend GIT_ADVICE=0 for tools

The GIT_ADVICE environment variable was added implicitly in b79deeb5544
(advice: add --no-advice global option, 2024-05-03) but was not
documented. Add documentation to show that it is an option for tools
that want to disable these messages. Make note that while the
--no-advice option exists, older Git versions will fail to parse that
option. The environment variable presents a way to change the behavior
of Git versions that understand it without disrupting older versions.

Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

scalar: add --no-tags option to 'scalar clone'

Some large repositories use tags to track a huge list of release
versions. While this choice is costly on the ref advertisement, it is
further wasteful for clients who do not need those tags. Allow clients
to optionally skip the tag advertisement.

This behavior is similar to that of 'git clone --no-tags' implemented in
0dab2468ee5 (clone: add a --no-tags option to clone without tags,
2017-04-26), including the modification of the remote.origin.tagOpt
config value to include "--no-tags".

One thing that is opposite of the 'git clone' implementation is that
this allows '--tags' as an assumed option, which can be naturally negated
with '--no-tags'. The clone command does not accept '--tags' but allows
"--no-no-tags" as the negation of its '--no-tags' option.

While testing this option, combine the test with the previously untested
'--no-src' option introduced in 4527db8ff8c (scalar: add --[no-]src
option, 2023-08-28).

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The thirteenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jk/maybe-unused-cleanup'

Code clean-up.

* jk/maybe-unused-cleanup:
grep: prefer UNUSED to MAYBE_UNUSED for pcre allocators
gc: drop MAYBE_UNUSED annotation from used parameter

Merge branch 'jc/unused-on-windows'

Fix more fallouts from -Werror=unused-parameter.

* jc/unused-on-windows:
refs/files-backend: work around -Wunused-parameter

Merge branch 'jc/maybe-unused'

Developer doc updates.

* jc/maybe-unused:
CodingGuidelines: also mention MAYBE_UNUSED

Merge branch 'jk/unused-parameters'

Make our codebase compilable with the -Werror=unused-parameter
option.

* jk/unused-parameters:
  CodingGuidelines: mention -Wunused-parameter and UNUSED
  config.mak.dev: enable -Wunused-parameter by default
  compat: mark unused parameters in win32/mingw functions
  compat: disable -Wunused-parameter in win32/headless.c
  compat: disable -Wunused-parameter in 3rd-party code
  t-reftable-readwrite: mark unused parameter in callback function
  gc: mark unused config parameter in virtual functions

Merge branch 'jk/send-email-mailmap'

"git send-email" learned "--mailmap" option to allow rewriting the
recipient addresses.

* jk/send-email-mailmap:
  send-email: add mailmap support via sendemail.mailmap and --mailmap
  check-mailmap: add options for additional mailmap sources
  check-mailmap: accept "user@host" contacts

.mailmap document current address.

Cox Communications no longer supports email and transfered accounts to
yahoo. I closed the account at yahoo since I use gmail.com.

Signed-off-by: Stephen P. Smith <ishchis2@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

interpret-trailers: handle message without trailing newline

When git-interpret-trailers is used to add a trailer to a message that
does not end in a trailing newline, the new trailer is added on the line
immediately following the message instead of as a trailer block
separated from the message by a blank line.

For example, if a message's text was exactly "The subject" with no
trailing newline present, `git interpret-trailers --trailer
my-trailer=true` will result in the following malformed commit message:

The subject
my-trailer: true

While it is generally expected that a commit message should end with a
newline character, git-interpret-trailers should not be returning an
invalid message in this case.

Use `strbuf_complete_line` to ensure that the message ends with a
newline character when reading the input.

Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-checkout: use fdopen_lock_file() instead of xfdopen()

When updating sparse patterns, we open a lock_file to write out the new
data. The lock_file struct holds the file descriptor, but we call
fdopen() to get a stdio handle to do the actual write.

After we finish writing, we fflush() so that all of the data is on disk,
and then call commit_lock_file() which closes the descriptor. But we
never fclose() the stdio handle, leaking it.

The obvious solution seems like it would be to just call fclose(). But
when? If we do it before commit_lock_file(), then the lock_file code is
left thinking it owns the now-closed file descriptor, and will do an
extra close() on the descriptor. But if we do it before, we have the
opposite problem: the lock_file code will close the descriptor, and
fclose() will do the extra close().

We can handle this correctly by using fdopen_lock_file(). That leaves
ownership of the stdio handle with the lock_file, which knows not to
double-close it.

We do have to adjust the code a bit:

  - we have to handle errors ourselves; we can just die(), since that's
    what xfdopen() would have done (and we can even provide a more
    specific error message).

  - we no longer need to call fflush(); committing the lock-file
    auto-closes it, which will now do the flush for us. As a bonus, this
    will actually check that the flush was successful before renaming
    the file into place.

  - we can get rid of the local "fd" variable, since we never look at it
    ourselves now

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-checkout: check commit_lock_file when writing patterns

When writing a new "sparse-checkout" file, we do the usual strategy of
writing to a lockfile and committing it into place. But we don't check
the outcome of commit_lock_file(). Failing there would prevent us from
writing a bogus file (good), but we would ignore the error and return a
successful exit code (bad).

Fix this by calling die(). Note that we need to keep the sparse_filename
variable valid for longer, since the filename stored in the lock_file
struct will be dropped when we run commit_lock_file().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sparse-checkout: consolidate cleanup when writing patterns

In write_patterns_and_update(), we always need to free the pattern list
before exiting the function. Rather than handling it manually when we
return early, we can jump to an "out" label where cleanup happens. This
let us drop one line, but also establishes a pattern we can use for
other cleanup.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

drop trailing newline from warning/error/die messages

Our error reporting routines append a trailing newline, and the strings
we pass to them should not include them (otherwise we get an extra blank
line after the message).

These cases were all found by looking at the results of:

git grep -P '[^_](error|error_errno|warning|die|die_errno)\(.*\\n"[,)]' '*.c'

Note that we _do_ sometimes include a newline in the middle of such
messages, to create multiline output (hence our grep matching "," or ")"
after we see the newline, so we know we're at the end of the string).

It's possible that one or more of these cases could intentionally be
including a blank line at the end, but having looked at them all
manually, I think these are all just mistakes.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/cat-file: mark 'git cat-file' sparse-index compatible

This change affects how 'git cat-file' works with the index when
specifying an object with the ":<path>" syntax (which will give file
contents from the index).

'git cat-file' expands a sparse index to a full index any time contents
are requested from the index by specifying an object with the ":<path>"
syntax. This is true even when the requested file is part of the sparse
index, and results in much slower 'git cat-file' operations when working
within the sparse index.

Mark 'git cat-file' as not needing a full index, so that you only pay
the cost of expanding the sparse index to a full index when you request
a file outside of the sparse index.

Add tests to ensure both that:
- 'git cat-file' returns the correct file contents whether or not the
file is in the sparse index
- 'git cat-file' expands to the full index any time you request
something outside of the sparse index

Signed-off-by: Kevin Lyles <klyles+github@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t1092: allow run_on_* functions to use standard input

The 'run_on_sparse' and 'run_on_all' functions do not work correctly for
commands accepting standard input, because they run the same command
multiple times and the first instance consumes it. This also indirectly
affects 'test_all_match' and 'test_sparse_match'.

To allow these functions to work with commands accepting standard input,
first slurp standard input to a temporary file, and then run the command
with its standard input redirected from the temporary file. This ensures
that each command sees the same contents from its standard input.

Note that this does not impact commands that do not read from standard
input; they continue to ignore it. Additionally, existing uses of the
run_on_* functions do not need to do anything differently, as the
standard input of the test environment is already connected to
/dev/null.

We do not explicitly clean up the input files because they are cleaned
up with the rest of the test repositories and their contents may be
useful for figuring out which command failed when a test case fails.

Signed-off-by: Kevin Lyles <klyles@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/files: use heuristic to decide whether to repack with `--auto`

The `--auto` flag for git-pack-refs(1) allows the ref backend to decide
whether or not a repack is in order. This switch has been introduced
mostly with the "reftable" backend in mind, which already knows to
auto-compact its tables during normal operations. When the flag is set,
then it will use the same auto-compaction mechanism and thus end up
doing nothing in most cases.

The "files" backend does not have any such heuristic yet and instead
packs any loose references unconditionally. So we rewrite the complete
"packed-refs" file even if there's only a single loose reference to be
packed.

Even worse, starting with 9f6714ab3e (builtin/gc: pack refs when using
`git maintenance run --auto`, 2024-03-25), `git pack-refs --auto` is
unconditionally executed via our auto maintenance, so we end up repacking
references every single time auto maintenance kicks in. And while that
commit already mentioned that the "files" backend unconditionally packs
refs now, the author obviously didn't quite think about the consequences
thereof. So while the idea was sound, we really should have added a
heuristic to the "files" backend before implementing it.

Introduce a heuristic that decides whether or not it is worth to pack
loose references. The important factors to decide here are the number of
loose references in comparison to the overall size of the "packed-refs"
file. The bigger the "packed-refs" file, the longer it takes to rewrite
it and thus we scale up the limit of allowed loose references before we
repack.

As is the nature of heuristics, this mechansim isn't obviously
"correct", but should rather be seen as a tradeoff between how much
resources we spend packing refs and how inefficient the ref store
becomes. For all I can say, we have successfully been using the exact
same heuristic in Gitaly for several years by now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0601: merge tests for auto-packing of refs

We have two tests in t0601 which exercise the same underlying logic,
once via `git pack-refs --auto` and once via `git maintenance run
--auto`. Merge these two tests into one such that it becomes easier to
extend test coverage for both commands at the same time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

wrapper: introduce `log2u()`

We have an implementation of a function that computes the log2 for an
integer. While we could instead use log2(3P), that involves floating
point numbers and is thus needlessly complex and inefficient.

We're about to add a second caller that wants to compute log2 for a
`size_t`. Let's thus move the function into "wrapper.h" such that it
becomes generally available.

While at it, tweak the implementation a bit:

  - The parameter is converted from `int` to `uintmax_t`. This
    conversion is safe to do in "bisect.c" because we already check that
    the argument is positive.

  - The return value is an `unsigned`. It cannot ever be negative, so it
    is pointless for it to be a signed integer.

  - Loop until `!n` instead of requiring that `n > 1` and then subtract
    1 from the result and add a special case for `!sz`. This helps
    compilers to generate more efficient code.

Compilers recognize the pattern of this function and optimize
accordingly. On GCC 14.2 x86_64:

    log2u(unsigned long):
            test    rdi, rdi
            je      .L3
            bsr     rax, rdi
            ret
    .L3:
            mov     eax, -1
            ret

Clang 18.1 does not yet recognize the pattern, but starts to do so on
Clang trunk x86_64. The code isn't quite as efficient as the one
generated by GCC, but still manages to optimize away the loop:

    log2u(unsigned long):
            test    rdi, rdi
            je      .LBB0_1
            shr     rdi
            bsr     rcx, rdi
            mov     eax, 127
            cmovne  rax, rcx
            xor     eax, -64
            add     eax, 65
            ret
    .LBB0_1:
            mov     eax, -1
            ret

The pattern is also recognized on other platforms like ARM64 GCC 14.2.0,
where we end up using `clz`:

    log2u(unsigned long):
            clz     x2, x0
            cmp     x0, 0
            mov     w1, 63
            sub     w0, w1, w2
            csinv   w0, w0, wzr, ne
            ret

Note that we have a similar function `fastlog2()` in the reftable code.
As that codebase is separate from the Git codebase we do not adapt it to
use the new function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/index-pack: fix segfaults when running outside of a repo

It was reported that git-verify-pack(1) has started to crash with Git
v2.46.0 when run outside of a repository. This is another fallout from
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), where we have stopped setting the default hash algorithm
for `the_repository`. Consequently, code that relies on `the_hash_algo`
will now crash when it hasn't explicitly been initialized, which may be
the case when running outside of a Git repository.

The crash is not in git-verify-pack(1) but instead in git-index-pack(1),
which gets called by the former. Ideally, both of these programs should
be able to identify the hash algorithm used by the packfile and index
without having to rely on external information. But unfortunately, the
format for neither of them is completely self-describing, so it is not
possible to derive that information. This is a design issue that we
should address by introducing a new packfile version that encodes its
object hash.

For now though the more important fix is to not make either of these
programs crash anymore, which we do by falling back to SHA1 when the
object hash is unconfigured. This pessimizes reading packfiles which
use a different hash than SHA1, but restores previous behaviour.

Reported-by: Ilya K <me@0upti.me>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/BreakingChanges: announce removal of git-pack-redundant(1)

The git-pack-redundant(1) command is already in the process of being
phased out and dies unless the user passes the `--i-still-use-this` flag
since 4406522b76 (pack-redundant: escalate deprecation warning to an
error, 2023-03-23). We haven't heard any complaints, so let's announce
the removal of this command in Git 3.0 in our breaking changes document.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The twelfth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jc/config-doc-update'

Docfix.

* jc/config-doc-update:
git-config.1: fix description of --regexp in synopsis
git-config.1: --get-all description update

Merge branch 'rs/remote-leakfix'

Leakfix.

* rs/remote-leakfix:
remote: plug memory leaks at early returns

Merge branch 'ps/reftable-concurrent-compaction'

The code path for compacting reftable files saw some bugfixes
against concurrent operation.

* ps/reftable-concurrent-compaction:
  reftable/stack: fix segfault when reload with reused readers fails
  reftable/stack: reorder swapping in the reloaded stack contents
  reftable/reader: keep readers alive during iteration
  reftable/reader: introduce refcounting
  reftable/stack: fix broken refnames in `write_n_ref_tables()`
  reftable/reader: inline `reader_close()`
  reftable/reader: inline `init_reader()`
  reftable/reader: rename `reftable_new_reader()`
  reftable/stack: inline `stack_compact_range_stats()`
  reftable/blocksource: drop malloc block source

Merge branch 'js/fetch-push-trace2-annotation'

More trace2 events at key points on push and fetch code paths have
been added.

* js/fetch-push-trace2-annotation:
  send-pack: add new tracing regions for push
  fetch: add top-level trace2 regions
  trace2: implement trace2_printf() for event target

Merge branch 'aa/cat-file-batch-output-doc'

Docfix.

* aa/cat-file-batch-output-doc:
docs: explain the order of output in the batched mode of git-cat-file(1)

Merge branch 'dh/runtime-prefix-on-zos'

Support for the RUNTIME_PREFIX feature has been added to z/OS port.

* dh/runtime-prefix-on-zos:
exec_cmd: RUNTIME_PREFIX on z/OS systems

Merge branch 'ps/leakfixes-part-5'

Even more leak fixes.

* ps/leakfixes-part-5:
  transport: fix leaking negotiation tips
  transport: fix leaking arguments when fetching from bundle
  builtin/fetch: fix leaking transaction with `--atomic`
  remote: fix leaking peer ref when expanding refmap
  remote: fix leaks when matching refspecs
  remote: fix leaking config strings
  builtin/fetch-pack: fix leaking refs
  sideband: fix leaks when configuring sideband colors
  builtin/send-pack: fix leaking refspecs
  transport: fix leaking OID arrays in git:// transport data
  t/helper: fix leaking multi-pack-indices in "read-midx"
  builtin/repack: fix leaks when computing packs to repack
  midx-write: fix leaking hashfile on error cases
  builtin/archive: fix leaking `OPT_FILENAME()` value
  builtin/upload-archive: fix leaking args passed to `write_archive()`
  builtin/merge-tree: fix leaking `-X` strategy options
  pretty: fix leaking key/value separator buffer
  pretty: fix memory leaks when parsing pretty formats
  convert: fix leaks when resetting attributes
  mailinfo: fix leaking header data

Merge branch 'cl/config-regexp-docfix'

Docfix.

* cl/config-regexp-docfix:
doc: replace 3 dash with correct 2 dash in git-config(1)

mergetools: vscode: new tool

VSCode has supported three-way merges since 2022, see
<https://github.com/microsoft/vscode/issues/5770#issuecomment-1188658476>.

Although the program binary is located at /usr/bin/code, name the
mergetool "vscode" because the word "code" is too generic and would lead
to confusion. The name "vscode" also matches Git's existing
contrib/vscode directory.

On Windows, VSCode adds the directory that contains code.cmd to %PATH%,
so there is no need to invoke mergetool_find_win32_cmd to search for the
program.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: port helper/test-oid-array.c to unit-tests/t-oid-array.c

helper/test-oid-array.c along with t0064-oid-array.sh test the
oid-array.h API, which provides storage and processing
efficiency over large lists of object identifiers.

Migrate them to the unit testing framework for better runtime
performance and efficiency. As we don't initialize a repository
in these tests, the hash algo that functions like oid_array_lookup()
use is not initialized, therefore call repo_set_hash_algo() to
initialize it. And init_hash_algo():lib-oid.c can aid in this
process, so make it public.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

compat/terminal: mark parameter of git_terminal_prompt() UNUSED

If neither HAVE_DEV_TTY nor GIT_WINDOWS_NATIVE is set, the fallback
code calls the system getpass(). This unfortunately ignores the "echo"
boolean parameter, as we have no way to implement that functionality.
But we still have to keep the unused parameter, since our interface
has to match the other implementations.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

revision: free commit buffers for skipped commits

In git-log we leave the save_commit_buffer flag set to "1", which tells
the commit parsing code to store the object content after it has parsed
it to find parents, tree, etc. That lets us reuse the contents for
pretty-printing the commit in the output. And then after printing each
commit, we call free_commit_buffer(), since we don't need it anymore.

But some options may cause us to traverse commits which are not part of
the output. And so git-log does not see them at all, and doesn't free
them. One such case is something like:

  git log -n 1000 --skip=1000000

which will churn through a million commits, before showing only a
thousand. We loop through these inside get_revision(), without freeing
the contents. As a result, we end up storing the object data for those
million commits simultaneously.

We should free the stored buffers (if any) for those commits as we skip
over them, which is what this patch does. Running the above command in
linux.git drops the peak heap usage from ~1.1GB to ~200MB, according to
valgrind/massif. (I thought we might get an even bigger improvement, but
the remaining memory is going to commit/tree structs, which we do hold
on to forever).

Note that this problem doesn't occur if:

  - you're running a git-rev-list without a --format parameter; it turns
    off save_commit_buffer by default, since it only output the object
    id

  - you've built a commit-graph file, since in that case we'd use the
    optimized graph data instead of the initial parse, and then do a
    lazy parse for commits we're actually going to output

There are probably some other option combinations that can likewise
end up with useless stored commit buffers. For example, if you ask for
"foo..bar", then we'll have to walk down to the merge base, and
everything on the "foo" side won't be shown. Tuning the "save" behavior
to handle that might be tricky (I guess maybe drop buffers for anything
we mark as UNINTERESTING?). And in the long run, the right solution here
is probably to make sure the commit-graph is built (since it fixes the
memory problem _and_ drastically reduces CPU usage).

But since this "--skip" case is an easy one-liner, it's worth fixing in
the meantime. It should be OK to make this call even if there is no
saved buffer (e.g., because save_commit_buffer=0, or because a
commit-graph was used), since it's O(1) to look up the buffer and is a
noop if it isn't present. I verified by running the above command after
"git commit-graph write --reachable", and it takes the same time with
and without this patch.

Reported-by: Yuri Karnilaev <karnilaev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/files-backend: work around -Wunused-parameter

This is needed to build things with -Werror=unused-parameter on a
platform without symbolic link support.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

grep: prefer UNUSED to MAYBE_UNUSED for pcre allocators

We provide custom malloc/free callbacks for the pcre library to use.
Those take an extra "data" parameter, but we don't use it. Back when
these were added in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16), we only had MAYBE_UNUSED.

But these days we have UNUSED, which we should prefer, as it will
let the compiler inform us if the code changes to actually use the
parameters.

I also moved the annotations to come after the variable name, which is
how we typically spell it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gc: drop MAYBE_UNUSED annotation from used parameter

The "opts" parameter is always used, so marking it with MAYBE_UNUSED is
just confusing.

This annotation goes back to 41abfe15d9 (maintenance: add pack-refs
task, 2021-02-09), when it really was unused. Back then we did not have
the UNUSED macro that would complain if the code changed to use the
parameter. So when we started using it in bfc2f9eb8e (builtin/gc:
forward git-gc(1)'s `--auto` flag when packing refs, 2024-03-25), nobody
noticed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

CodingGuidelines: also mention MAYBE_UNUSED

A function that uses a parameter in one build may lose all uses of
the parameter in another build, depending on the configuration. A
workaround for such a case, MAYBE_UNUSED, should also be mentioned
when we recommend the use of UNUSED to our developers.

Keep the addition to the guideline short and document the criteria
to choose between UNUSED and MAYBE_UNUSED near their definition.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jk/unused-parameters' into jc/maybe-unused

* jk/unused-parameters:
  CodingGuidelines: mention -Wunused-parameter and UNUSED
  config.mak.dev: enable -Wunused-parameter by default
  compat: mark unused parameters in win32/mingw functions
  compat: disable -Wunused-parameter in win32/headless.c
  compat: disable -Wunused-parameter in 3rd-party code
  t-reftable-readwrite: mark unused parameter in callback function
  gc: mark unused config parameter in virtual functions

The eleventh batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ds/sparse-diff-index'

The underlying machinery for "git diff-index" has long been made to
expand the sparse index as needed, but the command fully expanded
the sparse index upfront, which now has been taught not to do.

* ds/sparse-diff-index:
diff-index: integrate with the sparse index

Merge branch 'cp/unit-test-reftable-block'

Another test for reftable library ported to the unit test framework.

* cp/unit-test-reftable-block:
  t-reftable-block: mark unused argv/argc
  t-reftable-block: add tests for index blocks
  t-reftable-block: add tests for obj blocks
  t-reftable-block: add tests for log blocks
  t-reftable-block: remove unnecessary variable 'j'
  t-reftable-block: use xstrfmt() instead of xstrdup()
  t-reftable-block: use block_iter_reset() instead of block_iter_close()
  t-reftable-block: use reftable_record_key() instead of strbuf_addstr()
  t-reftable-block: use reftable_record_equal() instead of check_str()
  t-reftable-block: release used block reader
  t: harmonize t-reftable-block.c with coding guidelines
  t: move reftable/block_test.c to the unit testing framework

Merge branch 'ps/reftable-drop-generic'

The code in the reftable library has been cleaned up by discarding
unused "generic" interface.

* ps/reftable-drop-generic:
  reftable: mark unused parameters in empty iterator functions
  reftable/generic: drop interface
  t/helper: refactor to not use `struct reftable_table`
  t/helper: use `hash_to_hex_algop()` to print hashes
  t/helper: inline printing of reftable records
  t/helper: inline `reftable_table_print()`
  t/helper: inline `reftable_stack_print_directory()`
  t/helper: inline `reftable_reader_print_file()`
  t/helper: inline `reftable_dump_main()`
  reftable/dump: drop unused `compact_stack()`
  reftable/generic: move generic iterator code into iterator interface
  reftable/iter: drop double-checking logic
  reftable/stack: open-code reading refs
  reftable/merged: stop using generic tables in the merged table
  reftable/merged: rename `reftable_new_merged_table()`
  reftable/merged: expose functions to initialize iterators

The tenth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ah/git-prompt-portability'

The command line prompt support used to be littered with bash-isms,
which has been corrected to work with more shells.

* ah/git-prompt-portability:
  git-prompt: support custom 0-width PS1 markers
  git-prompt: ta-da! document usage in other shells
  git-prompt: don't use shell $'...'
  git-prompt: add some missing quotes
  git-prompt: replace [[...]] with standard code
  git-prompt: don't use shell arrays
  git-prompt: fix uninitialized variable
  git-prompt: use here-doc instead of here-string

Merge branch 'gt/unit-test-urlmatch-normalization'

Another rewrite of test.

* gt/unit-test-urlmatch-normalization:
t: migrate t0110-urlmatch-normalization to the new framework

Merge branch 'mt/rebase-x-quiet'

"git rebase -x --quiet" was not quiet, which was corrected.

* mt/rebase-x-quiet:
rebase --exec: respect --quiet

reftable: mark unused parameters in empty iterator functions

These unused parameters were marked in a68ec8683a (reftable: mark unused
parameters in virtual functions, 2024-08-17), but the functions were
moved to a new file in a parallel branch via f2406c81b9
(reftable/generic: move generic iterator code into iterator interface,
2024-08-22).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t-reftable-block: mark unused argv/argc

This is conceptually the same as the cases in df9d638c24 (unit-tests:
ignore unused argc/argv, 2024-08-17), but this unit test was migrated
from the reftable tests in a parallel branch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

CodingGuidelines: mention -Wunused-parameter and UNUSED

Now that -Wunused-parameter is on by default for DEVELOPER=1 builds,
people may trigger it, blocking their build. When it's a mistake for the
parameter to exist, the path forward is obvious: remove it. But
sometimes you need to suppress the warning, and the "UNUSED" mechanism
for that is specific to our project, so people may not know about it.

Let's put some advice in CodingGuidelines, including an example warning
message. That should help people who grep for the warning text after
seeing it from the compiler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>