git.ipfire.org Git - thirdparty/git.git/log

Merge branch 'ps/upload-pack-buffer-more-writes'

Reduce system overhead "git upload-pack" spends on relaying "git
pack-objects" output to the "git fetch" running on the other end of
the connection.

* ps/upload-pack-buffer-more-writes:
  builtin/pack-objects: reduce lock contention when writing packfile data
  csum-file: drop `hashfd_throughput()`
  csum-file: introduce `hashfd_ext()`
  sideband: use writev(3p) to send pktlines
  wrapper: introduce writev(3p) wrappers
  compat/posix: introduce writev(3p) wrapper
  upload-pack: reduce lock contention when writing packfile data
  upload-pack: prefer flushing data over sending keepalive
  upload-pack: adapt keepalives based on buffering
  upload-pack: fix debug statement when flushing packfile data

Merge branch 'yc/histogram-hunk-shift-fix'

The final clean-up phase of the diff output could make the result of
the histogram diff algorithm suboptimal, which has been corrected.

* yc/histogram-hunk-shift-fix:
  xdiff: re-diff shifted change groups when using histogram algorithm

Merge branch 'mf/t0008-cleanup'

Test clean-up.

* mf/t0008-cleanup:
  t0008: improve test cleanup to fix failing test

Merge branch 'pb/t4200-test-path-is-helpers'

Test clean-up.

* pb/t4200-test-path-is-helpers:
  t4200: convert test -[df] checks to test_path_* helpers

Merge branch 'jk/transport-color-leakfix'

Leakfix.

* jk/transport-color-leakfix:
  transport: plug leaks in transport_color_config()

Merge branch 'rj/pack-refs-tests-path-is-helpers'

Test updates.

* rj/pack-refs-tests-path-is-helpers:
  t/pack-refs-tests: use test_path_is_missing

Merge branch 'ps/clar-wo-path-max'

Clar (unit testing framework) update from the upstream.

* ps/clar-wo-path-max:
  clar: update to fix compilation on platforms without PATH_MAX

Merge branch 'gj/user-manual-fix-grep-example'

Fix an example in the user-manual.

* gj/user-manual-fix-grep-example:
  doc: fix git grep args order in Quick Reference

Merge branch 'ps/history-split'

"git history" learned the "split" subcommand.

* ps/history-split:
  builtin/history: implement "split" subcommand
  builtin/history: split out extended function to create commits
  cache-tree: allow writing in-memory index as tree
  add-patch: allow disabling editing of hunks
  add-patch: add support for in-memory index patching
  add-patch: remove dependency on "add-interactive" subsystem
  add-patch: split out `struct interactive_options`
  add-patch: split out header from "add-interactive.h"

Merge branch 'ss/t0410-delete-object-cleanup'

Test clean-up.

* ss/t0410-delete-object-cleanup:
  t0410: modernize delete_object helper

Merge branch 'jt/fast-import-sign-again'

"git fast-import" learned to optionally replace signature on
commits whose signatures get invalidated due to replaying by
signing afresh.

* jt/fast-import-sign-again:
  fast-import: add mode to sign commits with invalid signatures
  gpg-interface: allow sign_buffer() to use default signing key
  commit: remove unused forward declaration

The 19th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ty/mktree-wo-the-repository'

Code clean-up.

* ty/mktree-wo-the-repository:
  builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE

Merge branch 'bb/imap-send-openssl-4.0-prep'

"imap-send" used to use functions that are going to be removed in
OpenSSL 4.0; rewrite them using public API that has been available
since OpenSSL 1.1 (2016 or so).

* bb/imap-send-openssl-4.0-prep:
  imap-send: move common code into function host_matches()
  imap-send: use the OpenSSL API to access the subject common name
  imap-send: use the OpenSSL API to access the subject alternative names

Merge branch 'ac/help-sort-correctly'

The code in "git help" that shows configuration items in sorted
order was awkwardly organized and prone to bugs.

* ac/help-sort-correctly:
  help: cleanup the construction of keys_uniq

Merge branch 'jc/test-allow-sed-with-ere'

Adjust test-lint to allow "sed -E" to use ERE in the patterns.

* jc/test-allow-sed-with-ere:
  t: allow use of "sed -E"

Merge branch 'ng/submodule-default-remote'

Instead of hardcoded 'origin', use the configured default remote
when fetching from submodules.

* ng/submodule-default-remote:
  submodule: fetch missing objects from default remote

Merge branch 'ms/t7605-test-path-is-helpers'

Test updates.

* ms/t7605-test-path-is-helpers:
  t7605: use test_path_is_file instead of test -f

Merge branch 'cf/constness-fixes'

Small code clean-up around the constness area.

* cf/constness-fixes:
  dir: avoid -Wdiscarded-qualifiers in remove_path()
  bloom: remove a misleading const qualifier

Merge branch 'master' of https://github.com/j6t/git-gui

* 'master' of https://github.com/j6t/git-gui:
  git-gui: grey out comment lines in commit message
  git-gui: wire up "git-gui--askyesno" with Meson
  git-gui: massage "git-gui--askyesno" with "generate-script.sh"
  git-gui: prefer shell at "/bin/sh" with Meson
  git-gui: fix use of GIT_CEILING_DIRECTORIES
  git-gui: shift tabstops to account for the first column of patch text

Merge branch 'master' of https://github.com/j6t/gitk

* 'master' of https://github.com/j6t/gitk:
  gitk: l10n: make PO headers identify the Gitk project
  gitk: ignore generated POT file
  gitk: i18n: use "Gitk" as package name in POT file
  gitk: commit translation files without file information
  gitk: support link color in the Preferences dialog
  gitk: use config settings for head/tag colors

Merge branch 'jx/i18n-fix' of github.com:jiangxin/gitk

* 'jx/i18n-fix' of github.com:jiangxin/gitk:
  gitk: l10n: make PO headers identify the Gitk project
  gitk: ignore generated POT file
  gitk: i18n: use "Gitk" as package name in POT file

Signed-off-by: Johannes Sixt <j6t@kdbg.org>

Merge branch 'js/i18n-no-location'

* js/i18n-no-location:
  gitk: commit translation files without file information

Merge branch 'sb/heed-ref-decoration-settings'

* sb/heed-ref-decoration-settings:
  gitk: use config settings for head/tag colors

gitk: l10n: make PO headers identify the Gitk project

Commit f697d08 (gitk: i18n: use "Gitk" as package name in POT file,
2026-03-19) updated the generated POT template to use "Gitk" in its
Project-Id-Version header. Several existing PO files still carry older
header values such as "git" or "git-gui", so they do not consistently
identify themselves as Gitk translations.

Update the Project-Id-Version field in all Gitk PO files so that they
identify the Gitk project consistently.

The "Project-Id-Version" field in the PO header helps tools identify
which project a PO file belongs to. For example, Git's
"git-po-helper" uses it to choose project-specific checks and POT
handling rules. Without this change, some Gitk PO files are
misidentified because their headers still refer to other projects.
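
As a rough illustration of how a tool might read this field, the sketch
below pulls "Project-Id-Version" out of a PO header entry. The function
name and the sample header are hypothetical, and this is not a
description of how "git-po-helper" is implemented.

```python
import re

def project_id_version(po_text):
    """Return the Project-Id-Version value from a PO header, or None."""
    # Header fields are stored as quoted lines such as
    # "Project-Id-Version: Gitk\n" inside the first msgstr entry.
    match = re.search(r'^"Project-Id-Version:\s*(.*?)\\n"\s*$',
                      po_text, re.MULTILINE)
    return match.group(1) if match else None

# Hypothetical, shortened header of a PO file.
SAMPLE = '''msgid ""
msgstr ""
"Project-Id-Version: Gitk\\n"
"Content-Type: text/plain; charset=UTF-8\\n"
'''

print(project_id_version(SAMPLE))
```

A checker could compare the returned value against the expected project
name before deciding which rules to apply.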

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

gitk: ignore generated POT file

"po/gitk.pot" is generated from the source for translation maintenance.
Ignore it in the working tree so regenerating the template does not
introduce unnecessary noise in `git status`.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

gitk: i18n: use "Gitk" as package name in POT file

Use "Gitk" instead of the placeholder "PACKAGE" in the header of the
generated po/gitk.pot file. In particular, the "Project-Id-Version"
field in the header entry should be set to:

"Project-Id-Version: Gitk\n"

New PO files generated from this POT file will inherit that package
name.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

The 18th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ss/submodule--helper-use-xmalloc'

Code clean-up.

* ss/submodule--helper-use-xmalloc:
  submodule--helper: replace malloc with xmalloc

Merge branch 'ps/unit-test-c-escape-names.txt'

The unit test helper function was taught to use backslash +
mnemonic notation for certain control characters like "\t", instead
of octal notation like "\011".

* ps/unit-test-c-escape-names.txt:
  test-lib: print escape sequence names
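
The difference between the two notations can be sketched as follows.
This is only an illustration in Python, not the helper itself; the set
of characters that get a mnemonic is an assumption.

```python
# Mnemonic escapes for common control characters. Which characters the
# real helper covers is an assumption made for this sketch.
MNEMONICS = {0x07: "\\a", 0x08: "\\b", 0x09: "\\t", 0x0A: "\\n",
             0x0B: "\\v", 0x0C: "\\f", 0x0D: "\\r"}

def escape_bytes(data):
    """Render bytes with mnemonic escapes where known, octal otherwise."""
    out = []
    for byte in data:
        if byte in MNEMONICS:
            out.append(MNEMONICS[byte])
        elif byte < 0x20 or byte >= 0x7F:
            out.append("\\%03o" % byte)   # octal fallback, e.g. \001
        else:
            out.append(chr(byte))         # printable ASCII stays as-is
    return "".join(out)

print(escape_bytes(b"a\tb"))
```

With only the octal fallback, the tab in the example would be rendered
as "\011" instead of "\t".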

Merge branch 'jc/doc-wholesale-replace-before-next'

Doc update.

* jc/doc-wholesale-replace-before-next:
  SubmittingPatches: spell out "replace fully to pretend to be perfect"

Merge branch 'lc/rebase-trailer'

"git rebase" learns the "--trailer" option to drive the
interpret-trailers machinery.

* lc/rebase-trailer:
  rebase: support --trailer
  commit, tag: parse --trailer with OPT_STRVEC
  trailer: append trailers without fork/exec
  trailer: libify a couple of functions
  interpret-trailers: refactor create_in_place_tempfile()
  interpret-trailers: factor trailer rewriting

Merge branch 'bk/run-command-wo-the-repository'

The run_command() API lost its implicit dependency on the singleton
`the_repository` instance.

* bk/run-command-wo-the-repository:
  run-command: wean auto_maintenance() functions off the_repository
  run-command: wean start_command() off the_repository

Merge branch 'ps/editorconfig-unanchor'

Editorconfig filename patterns were specified incorrectly, making
many source files inside subdirectories unaffected, which has been
corrected.

* ps/editorconfig-unanchor:
  editorconfig: fix style not applying to subdirs anymore

Merge branch 'ss/t3200-test-zero-oid'

A test now uses the symbolic constant $ZERO_OID instead of 40 "0" to
work better with SHA-256 as well as SHA-1.

* ss/t3200-test-zero-oid:
  t3200: replace hardcoded null OID with $ZERO_OID
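
The reason a hardcoded string of 40 zeros is fragile is that the length
of an object ID depends on the hash function. A small sketch, with a
hypothetical helper name:

```python
import hashlib

def zero_oid(algo):
    """All-zero object ID with as many hex digits as the hash produces."""
    return "0" * (hashlib.new(algo).digest_size * 2)

print(len(zero_oid("sha1")), len(zero_oid("sha256")))
```

SHA-1 yields 40 hex digits and SHA-256 yields 64, so a test that spells
out 40 zeros only matches in SHA-1 repositories.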

Merge branch 'dd/list-objects-filter-options-wo-strbuf-split'

The way combined list-object filter options are parsed has been
revamped.

* dd/list-objects-filter-options-wo-strbuf-split:
  list-objects-filter-options: avoid strbuf_split_str()
  worktree: do not pass strbuf by value

Merge branch 'ps/t9200-test-path-is-helpers'

Test update.

* ps/t9200-test-path-is-helpers:
  t9200: replace test -f with modern path helper
  t9200: handle missing CVS with skip_all

transport: plug leaks in transport_color_config()

We retrieve config values with repo_config_get_string(), which will
allocate a new copy of the string for us. But we don't hold on to those
strings, since they are just fed to git_config_colorbool() and
color_parse(). But nor do we free them, which means they leak.

We can fix this by using the "_tmp" form of repo_config_get_string(),
which just hands us a pointer directly to the internal storage. This is
OK for our purposes, since we don't need it to last for longer than our
parsing calls.

Two interesting side notes here:

  1. Many types already have a repo_config_get_X() variant that handles
     this for us (e.g., repo_config_get_bool()). But neither colorbools
     nor colors themselves have such helpers. We might think about
     adding them, but converting all callers is a larger task, and out
     of scope for this fix.

  2. As far as I can tell, this leak has been there since 960786e761
     (push: colorize errors, 2018-04-21), but wasn't detected by LSan in
     our test suite. It started triggering when we applied dd3693eb08
     (transport-helper, connect: use clean_on_exit to reap children on
     abnormal exit, 2026-03-12) which is mostly unrelated.

     Even weirder, it seems to trigger only with clang (and not gcc),
     and only with GIT_TEST_DEFAULT_REF_FORMAT=reftable. So I think this
     is another odd case where the pointers happened to be hanging
     around in stack memory, but changing the pattern of function calls
     in nearby code was enough for them to be incidentally overwritten.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t4200: convert test -[df] checks to test_path_* helpers

Replace old-style path existence checks in t4200-rerere.sh with
the appropriate test_path_* helper functions. These helpers provide
clearer diagnostic messages on failure than the raw shell test
builtin.

Signed-off-by: Prashant S Bisht <prashantjee2025@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0008: improve test cleanup to fix failing test

The "large exclude file ignored in tree" test fails because the test
produces an unexpected extra warning: "warning: unable to access
'subdir/.gitignore': Too many levels of symbolic links". The warning is
caused by files left over from previous tests.

To fix this we improve cleanup on "symlinks not respected in-tree", and
because the tests in t0008 in general have poor cleanup, at the start of
"large exclude file ignored in tree" we search for any leftover
.gitignore and remove them before starting the test.

Improve post-test cleanup and add pre-test cleanup to make sure that we
have a workable environment for the test.

Signed-off-by: Mirko Faina <mroik@delayed.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The 17th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ty/patch-ids-document-lazy-eval'

In-code comment update to record a design decision to allow lazy
computation of patch IDs.

* ty/patch-ids-document-lazy-eval:
  patch-ids: document intentional const-casting in patch_id_neq()

Merge branch 'rs/history-ergonomics-updates-fix'

Fix use of uninitialized variable.

* rs/history-ergonomics-updates-fix:
  history: initialize rev_info in cmd_history_reword()

Merge branch 'jk/unleak-mmap'

Plug a few leaks where mmap'ed memory regions are not unmapped.

* jk/unleak-mmap:
  meson: turn on NO_MMAP when building with LSan
  Makefile: turn on NO_MMAP when building with LSan
  object-file: fix mmap() leak in odb_source_loose_read_object_stream()
  pack-revindex: avoid double-loading .rev files
  check_connected(): fix leak of pack-index mmap
  check_connected(): delay opening new_pack

Merge branch 'ty/setup-error-tightening'

While discovering a ".git" directory, the code treats any stat()
failure as a sign that a filesystem entity .git does not exist
there, and ignores ".git" that is not a "gitdir" file or a
directory. The code has been tightened to notice and report
filesystem corruption better.

* ty/setup-error-tightening:
  setup: improve error diagnosis for invalid .git files

Merge branch 'os/doc-git-custom-commands'

Doc update.

* os/doc-git-custom-commands:
  doc: make it easier to find custom command information

Merge branch 'fp/t3310-unhide-git-failures'

The construct 'test "$(command)" = expectation' loses the exit
status from the command, which has been fixed by breaking up the
statement into pieces.

* fp/t3310-unhide-git-failures:
  t3310: avoid hiding failures from rev-parse in command substitutions
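
The effect can be reproduced outside the test suite. The sketch below
drives "sh" from Python and uses a stand-in command that prints the
expected output but exits with status 1; the variable names are
illustrative only.

```python
import subprocess

def run_sh(script):
    """Run a POSIX shell snippet and return its exit status."""
    return subprocess.run(["sh", "-c", script]).returncode

# Stand-in for a command that prints the right thing but fails.
FAILING = "sh -c 'echo expected; exit 1'"

# The substitution's exit status is discarded; only the string is compared.
hidden_status = run_sh(f'test "$({FAILING})" = expected')

# Splitting the statement lets the assignment carry the exit status.
exposed_status = run_sh(f'actual=$({FAILING}) && test "$actual" = expected')

print(hidden_status, exposed_status)
```

The first form reports success even though the command failed, while
the second form stops at the assignment and reports the failure.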

Merge branch 'jt/repo-structure-extrema'

"git repo structure" command learns to report maximum values on
various aspects of objects it inspects.

* jt/repo-structure-extrema:
  builtin/repo: find tree with most entries
  builtin/repo: find commit with most parents
  builtin/repo: add OID annotations to table output
  builtin/repo: collect largest inflated objects
  builtin/repo: add helper for printing keyvalue output
  builtin/repo: update stats for each object

Merge branch 'sp/wt-status-wo-the-repository'

Reduce dependence on the global the_hash_algo and the_repository
variables of wt-status code path.

* sp/wt-status-wo-the-repository:
  wt-status: use hash_algo from local repository instead of global the_hash_algo
  wt-status: replace uses of the_repository with local repository instances
  wt-status: pass struct repository through function parameters

doc: fix git grep args order in Quick Reference

The example provided has its arguments in the wrong order. The revision
should follow the pattern, and not the other way around.

Signed-off-by: Guillaume Jacob <guillaume@absolut-sensing.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

clar: update to fix compilation on platforms without PATH_MAX

Update clar to e4172e3 (Merge pull request #134 from
clar-test/ethomson/const, 2026-01-10). Besides some changes to
"generate.py" which don't have any impact on us, this commit also fixes
compilation on platforms that don't have PATH_MAX, like for example
GNU/Hurd.

Reported-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t/pack-refs-tests: use test_path_is_missing

The pack-refs tests previously used raw 'test -f' and 'test -e' checks
with negation. Update them to use Git's standard helper function
test_path_is_missing for consistency and clearer failure reporting.

As suggested in review, replaced the negated 'test_path_exists' with
test_path_is_missing to better reflect the expected absence of paths.

Signed-off-by: Ritesh Singh Jadoun <riteshjd75@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/pack-objects: reduce lock contention when writing packfile data

When running `git pack-objects --stdout` we feed the data through
`hashfd_ext()` with a progress meter and a smaller-than-usual buffer
length of 8kB so that we can track throughput more granularly. But as
packfiles tend to be on the larger side, this small buffer size may
cause a ton of write(3p) syscalls.

Originally, the buffer we used in `hashfd()` was 8kB for all use cases.
This was changed though in 2ca245f8be (csum-file.h: increase hashfile
buffer size, 2021-05-18) because we noticed that the number of writes
can have an impact on performance. So the buffer size was increased to
128kB, which improved performance a bit for some use cases.

But the commit didn't touch the buffer size for `hashfd_throughput()`.
The reasoning here was that callers expect the progress indicator to
update frequently, and a larger buffer size would of course reduce the
update frequency especially on slow networks.

While that is of course true, there was (and still is, even though it's
now a call to `hashfd_ext()`) only a single caller of this function in
git-pack-objects(1). This command is responsible for writing packfiles,
and those packfiles are often on the bigger side. So arguably:

  - The user won't care about increments of 8kB when packfiles tend to
    be megabytes or even gigabytes in size.

  - Reducing the number of syscalls would be even more valuable here
    than it would be for multi-pack indices, which was the benchmark
    done in the mentioned commit, as MIDXs are typically significantly
    smaller than packfiles.

  - Nowadays, many internet connections should be able to transfer data
    at a rate significantly higher than 8kB per second.

Update the buffer to instead have a size of `LARGE_PACKET_DATA_MAX - 1`,
which translates to ~64kB. This limit was chosen because `git
pack-objects --stdout` is most often used when sending packfiles via
git-upload-pack(1), where packfile data is chunked into pktlines when
using the sideband. Furthermore, most internet connections should have a
bandwidth significantly higher than 64kB/s, so we'd still be able to
observe progress updates at a rate of at least once per second.

This change significantly reduces the number of write(3p) syscalls from
355,000 to 44,000 when packing the Linux repository. While this results
in a small performance improvement on an otherwise-unused system, this
improvement is mostly negligible. More importantly though, it will
reduce lock contention in the kernel on an extremely busy system where
we have many processes writing data at once.
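
The roughly eightfold reduction follows directly from the buffer sizes.
The sketch below redoes the arithmetic for a hypothetical 1 GiB of pack
data; the value of `LARGE_PACKET_MAX` is taken from memory of Git's
"pkt-line.h" and should be treated as an assumption, as should the
reading that the "- 1" leaves room for the sideband channel byte.

```python
LARGE_PACKET_MAX = 65520                      # assumed value from pkt-line.h
LARGE_PACKET_DATA_MAX = LARGE_PACKET_MAX - 4  # minus the 4-byte length prefix

OLD_BUFFER = 8 * 1024
NEW_BUFFER = LARGE_PACKET_DATA_MAX - 1        # presumably room for the band byte

def writes_needed(total_bytes, buffer_size):
    """Number of buffer flushes needed to emit total_bytes."""
    return -(-total_bytes // buffer_size)     # ceiling division

PACK_SIZE = 1024 ** 3                         # hypothetical 1 GiB packfile

print(writes_needed(PACK_SIZE, OLD_BUFFER), writes_needed(PACK_SIZE, NEW_BUFFER))
```

For this size the count drops from 131072 to 16390 flushes, which is in
line with the ratio between the 355,000 and 44,000 syscalls quoted
above.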

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

csum-file: drop `hashfd_throughput()`

The `hashfd_throughput()` function is used by a single callsite in
git-pack-objects(1). In contrast to `hashfd()`, this function uses a
progress meter to measure throughput and a smaller buffer length so that
the progress meter can provide more granular metrics.

We're going to change that caller in the next commit to be a bit more
specific to packing objects. As such, `hashfd_throughput()` will be a
somewhat unfitting mechanism for any potential new callers.

Drop the function and replace it with a call to `hashfd_ext()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

csum-file: introduce `hashfd_ext()`

Introduce a new `hashfd_ext()` function that takes an options structure.
This function will replace `hashfd_throughput()` in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

sideband: use writev(3p) to send pktlines

Every pktline that we send out via `send_sideband()` currently requires
two syscalls: one to write the pktline's length, and one to send its
data. This typically isn't all that much of a problem, but under extreme
load the syscalls may cause contention in the kernel.

Refactor the code to instead use the newly introduced writev(3p) infra
so that we can send out the data with a single syscall. This reduces the
number of syscalls from around 133,000 calls to write(3p) to around
67,000 calls to writev(3p).
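
The idea can be sketched in a few lines of Python, whose os.writev()
maps onto the same syscall. The function name is hypothetical and the
framing (four hex digits of length, one band byte, then the payload) is
a simplified reading of the sideband format; short writes are not
handled here.

```python
import os

def send_sideband_packet(fd, band, payload):
    """Send one sideband pktline with a single writev() call."""
    # The four hex digits count themselves, the band byte and the payload.
    header = b"%04x" % (4 + 1 + len(payload)) + bytes([band])
    # Header and payload stay in separate buffers; the kernel gathers them.
    return os.writev(fd, [header, payload])

read_end, write_end = os.pipe()
send_sideband_packet(write_end, 1, b"hello")
print(os.read(read_end, 64))
```

Without writev() the header and the payload would each need their own
write() call, or the payload would have to be copied into a combined
buffer first.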

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

wrapper: introduce writev(3p) wrappers

In the preceding commit we have added a compatibility wrapper for the
writev(3p) syscall. Introduce some generic wrappers for this function
that we nowadays take for granted in the Git codebase.
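
A wrapper of this kind usually has to cope with short writes. The
following is a minimal sketch of such a loop in Python, not the code
added by this commit; the name `writev_in_full` is chosen for
illustration.

```python
import os

def writev_in_full(fd, buffers):
    """Write every buffer to fd, retrying after short writes."""
    pending = [memoryview(b) for b in buffers if len(b)]
    total = 0
    while pending:
        written = os.writev(fd, pending)
        total += written
        # Drop buffers that went out completely.
        while pending and written >= len(pending[0]):
            written -= len(pending[0])
            pending.pop(0)
        # Trim a buffer that was only partially written.
        if written:
            pending[0] = pending[0][written:]
    return total

read_end, write_end = os.pipe()
print(writev_in_full(write_end, [b"0009", b"", b"hello"]))
```

Python already retries interrupted calls on its own, so the sketch only
deals with partial writes.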

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

compat/posix: introduce writev(3p) wrapper

In a subsequent commit we're going to add the first caller to
writev(3p). Introduce a compatibility wrapper for this syscall that we
can use on systems that don't have this syscall.

The syscall exists on modern Unixes like Linux and macOS, and seemingly
even for NonStop according to [1]. It doesn't seem to exist on Windows
though.

[1]: http://nonstoptools.com/manuals/OSS-SystemCalls.pdf
[2]: https://www.gnu.org/software/gnulib/manual/html_node/writev.html
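
On a platform without the syscall, a fallback can emulate it with one
write() per buffer. The sketch below shows that approach in Python; it
is an assumption about how such a wrapper may look, not the code added
here.

```python
import os

def writev_fallback(fd, buffers):
    """Emulate writev() with one write() call per buffer."""
    total = 0
    for buf in buffers:
        written = os.write(fd, buf)
        total += written
        if written < len(buf):
            break   # short write: report what has been written so far
    return total

read_end, write_end = os.pipe()
print(writev_fallback(write_end, [b"ab", b"cd"]))
```

Such an emulation keeps callers portable, but it gives up the
single-syscall behaviour that motivates using writev() in the first
place.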

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

upload-pack: reduce lock contention when writing packfile data

In our production systems we have recently observed write contention in
git-upload-pack(1). The system in question was consistently streaming
packfiles at a rate of dozens of gigabits per second, but curiously the
system was not bottlenecked on CPU, memory, or IOPS.

We eventually discovered that Git was spending 80% of its time in
`pipe_write()`, out of which almost all of the time was spent in the
`ep_poll_callback` function in the kernel. Quoting the reporter:

  This infrastructure is part of an event notification queue designed to
  allow for multiple producers to emit events, but that concurrency
  safety is guarded by 3 layers of locking. The layer we're hitting
  contention in uses a simple reader/writer lock mode (a.k.a. shared
  versus exclusive mode), where producers need shared-mode (read mode),
  and various other actions use exclusive (write) mode.

The system in question generates workloads where we have hundreds of
git-upload-pack(1) processes active at the same point in time. These
processes end up contending around those locks, and the consequence is
that the Git processes stall.

Now git-upload-pack(1) already has the infrastructure in place to buffer
some of the data it reads from git-pack-objects(1) before actually
sending it out. We only use this infrastructure in very limited ways
though, so we generally end up matching one read(3p) call with one
write(3p) call. Even worse, when the sideband is enabled we end up
matching one read with _two_ writes: one for the pkt-line length, and
one for the packfile data.

Extend our use of the buffering infrastructure so that we soak up bytes
until the buffer is filled to at least two thirds of its capacity. The
change is relatively simple to implement as we already know to flush the
buffer in `create_pack_file()` after git-pack-objects(1) has finished.

This significantly reduces the number of write(3p) syscalls we need to
do. Before this change, cloning the Linux repository resulted in around
400,000 write(3p) syscalls. With the buffering in place we only do
around 130,000 syscalls.
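
The effect of the threshold can be simulated. The sketch below counts
flushes for a stream of reads; the 64 KiB capacity and the 4 KiB read
size are hypothetical values chosen for the example, not the ones used
by git-upload-pack(1).

```python
def count_writes(chunk_sizes, capacity):
    """Count flushes when buffering reads until the buffer is 2/3 full."""
    writes = 0
    buffered = 0
    for size in chunk_sizes:
        if buffered and buffered + size > capacity:
            writes += 1                    # no room left: flush first
            buffered = 0
        buffered += size
        if buffered * 3 >= capacity * 2:   # at least two thirds full
            writes += 1
            buffered = 0
    if buffered:
        writes += 1                        # final flush once the producer is done
    return writes

reads = [4096] * 16                        # sixteen hypothetical 4 KiB reads
print(len(reads), count_writes(reads, 65536))
```

Sixteen reads that would otherwise cause sixteen writes are collapsed
into two flushes in this example.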

Now we could of course go even further and make sure that we always fill
up the whole buffer. But this might cause an increase in read(3p)
syscalls, and some tests show that this only reduces the number of
write(3p) syscalls from 130,000 to 100,000. So overall this doesn't seem
worth it.

Note that the issue could also be fixed by adapting the write buffer
that we use in the downstream git-pack-objects(1) command, and such a
change would have roughly the same result. But the command that
generates the packfile data may not always be git-pack-objects(1) as it
can be changed via "uploadpack.packObjectsHook", so such a fix would
only help in _some_ cases. Regardless of that, we'll also adapt the
write buffer size of git-pack-objects(1) in a subsequent commit.

Helped-by: Matt Smiley <msmiley@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

upload-pack: prefer flushing data over sending keepalive

When using the sideband in git-upload-pack(1) we know to send out
keepalive packets in case generating the pack takes too long. These
keepalives take the form of a simple empty pktline.

In the preceding commit we have adapted git-upload-pack(1) to buffer
data more aggressively before sending it to the client. This creates an
obvious optimization opportunity: when we hit the keepalive timeout
while we still hold on to some buffered data, then it makes more sense
to flush out the data instead of sending the empty keepalive packet.

This is overall not going to be a significant win. Most keepalives will
come before the pack data starts, and once pack-objects starts producing
data, it tends to do so pretty consistently. And of course we can't send
data before we see the PACK header, because the whole point is to buffer
the early bit waiting for packfile URIs. But the optimization is easy
enough to realize.

Do so and flush out data instead of sending an empty pktline.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

upload-pack: adapt keepalives based on buffering

The function `create_pack_file()` is responsible for sending the
packfile data to the client of git-upload-pack(1). As generating the
bytes may take significant computing resources we also have a mechanism
in place that optionally sends keepalive pktlines in case we haven't
sent out any data.

The keepalive logic is purely based on poll(3p): we pass a timeout to that
syscall, and if the call times out we send out the keepalive pktline.
While reasonable, this logic isn't entirely sufficient: even if the call
to poll(3p) ends because we have received data on any of the file
descriptors we may not necessarily send data to the client.

The most important edge case here happens in `relay_pack_data()`. When
we haven't seen the initial "PACK" signature from git-pack-objects(1)
yet we buffer incoming data. So in the worst case, if each of the bytes
of that signature arrive shortly before the configured keepalive
timeout, then we may not send out any data for a time period that is
(almost) four times as long as the configured timeout.

This edge case is rather unlikely to matter in practice. But in a
subsequent commit we're going to adapt our buffering mechanism to become
more aggressive, which makes it more likely that we don't send any data
for an extended amount of time.

Adapt the logic so that instead of using a fixed timeout on every call
to poll(3p), we instead figure out how much time has passed since the
last-sent data.
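
The difference between the two policies can be sketched with a small
simulation. All names are hypothetical, times are in milliseconds, and
every wake-up in the list is assumed to buffer data without sending
anything to the client.

```python
def poll_timeout_ms(keepalive_ms, last_sent_ms, now_ms):
    """Time left until a keepalive is due, measured from the last send."""
    return max(0, keepalive_ms - (now_ms - last_sent_ms))

def first_keepalive_fixed(wakeups_ms, keepalive_ms):
    """Fixed timeout: every wake-up restarts the full timeout."""
    start = 0
    for wake in wakeups_ms:
        if wake - start >= keepalive_ms:
            break
        start = wake
    return start + keepalive_ms

def first_keepalive_adaptive(wakeups_ms, keepalive_ms, last_sent_ms=0):
    """Adaptive timeout: wake-ups that send nothing do not reset the clock."""
    now = last_sent_ms
    for wake in wakeups_ms:
        if wake - now >= poll_timeout_ms(keepalive_ms, last_sent_ms, now):
            break
        now = wake
    return now + poll_timeout_ms(keepalive_ms, last_sent_ms, now)

# Four bytes, each arriving 100 ms before a 5 s timeout would fire.
wakeups = [4900, 9800, 14700, 19600]
print(first_keepalive_fixed(wakeups, 5000), first_keepalive_adaptive(wakeups, 5000))
```

With the fixed timeout the first keepalive would only go out after 24.6
seconds in this scenario, whereas the adaptive timeout sends it after
5 seconds.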

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

upload-pack: fix debug statement when flushing packfile data

When git-upload-pack(1) writes packfile data to the client we have some
logic in place that buffers some partial lines. When that buffer still
contains data after git-pack-objects(1) has finished we flush the buffer
so that all remaining bytes are sent out.

Curiously, when we do so we also print the string "flushed." to stderr.
This statement has been introduced in b1c71b7281 (upload-pack: avoid
sending an incomplete pack upon failure, 2006-06-20), so quite a while
ago. What's interesting though is that stderr is typically spliced
through to the client-side, and consequently the client would see this
message. Munging how we do the caching indeed confirms this:

  $ git clone file:///home/pks/Development/linux/
  Cloning into bare repository 'linux.git'...
  remote: Enumerating objects: 12980346, done.
  remote: Counting objects: 100% (131820/131820), done.
  remote: Compressing objects: 100% (50290/50290), done.
  remote: Total 12980346 (delta 96319), reused 104500 (delta 81217), pack-reused 12848526 (from 1)
  Receiving objects: 100% (12980346/12980346), 3.23 GiB | 57.44 MiB/s, done.
  flushed.
  Resolving deltas: 100% (10676718/10676718), done.

It's quite clear that this string shouldn't ever be visible to the
client, so it rather feels like this is a left-over debug statement. The
mentioned commit doesn't mention this line, either.

Remove the debug output to prepare for a change in how we do the
buffering in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0410: modernize delete_object helper

The delete_object helper currently relies on a manual sed command to
calculate object paths. This works, but it's a bit brittle and forces
us to maintain shell logic that Git's own test suite can already
handle more elegantly.

Switch to 'test_oid_to_path' to let Git handle the path logic. This
makes the helper hash independent, which is much cleaner than manual
string manipulation. While at it, use 'local' to declare helper-specific
variables and quote them to follow Git's coding style. This prevents
them from leaking into global shell scope and avoids potential naming
conflicts with other parts of the test suite.
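
The mapping that 'test_oid_to_path' performs is simple enough to show
in a few lines. The Python function below is only an illustration of
the path logic, with a hypothetical name:

```python
def oid_to_path(oid):
    """Split a hex object ID into the loose-object fan-out path."""
    # The first two hex digits name the directory, the rest the file.
    return oid[:2] + "/" + oid[2:]

# Works the same for a 40-digit (SHA-1) and a 64-digit (SHA-256) ID.
print(oid_to_path("e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"))
print(oid_to_path("0" * 64))
```

Because the split does not depend on the length of the ID, the same
logic works for both hash functions.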

Helped-by: Pushkar Singh <pushkarkumarsingh1970@gmail.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fast-import: add mode to sign commits with invalid signatures

With git-fast-import(1), handling of signed commits is controlled via
the `--signed-commits=<mode>` option. When an invalid signature is
encountered, a user may want the option to sign the commit again as
opposed to just stripping the signature. To facilitate this, introduce a
"sign-if-invalid" mode for the `--signed-commits` option. Optionally, a
key ID may be explicitly provided in the form
`sign-if-invalid[=<keyid>]` to specify which signing key should be used
when signing invalid commit signatures.

Note that to properly support interoperability mode when signing commit
signatures, the commit buffer must be created in both the repository and
compatibility object formats to generate the appropriate signatures
accordingly. As currently implemented, the commit buffer for the
compatibility object format is not reconstructed and thus signing
commits in interoperability mode is not yet supported. Support may be
added in the future.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gpg-interface: allow sign_buffer() to use default signing key

The `sign_commit_to_strbuf()` helper in "commit.c" provides fallback
logic to get the default configured signing key when a key is not
provided and handles generating the commit signature accordingly. This
signing operation is not really specific to commits as any arbitrary
buffer can be signed. Also, in a subsequent commit, this same logic is
reused by git-fast-import(1) when signing commits with invalid
signatures.

Remove the `sign_commit_to_strbuf()` helper from "commit.c" and extend
`sign_buffer()` in "gpg-interface.c" to support using the default key as
a fallback when the `SIGN_BUFFER_USE_DEFAULT_KEY` flag is provided. Call
sites are updated accordingly.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

commit: remove unused forward declaration

In 6206089cbd (commit: write commits for both hashes, 2023-10-01),
`sign_with_header()` was removed, but its forward declaration in
"commit.h" was left. Remove the unused declaration.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The 16th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ps/odb-sources'

The object source API is getting restructured to allow plugging new
backends.

* ps/odb-sources:
  odb/source: make `begin_transaction()` function pluggable
  odb/source: make `write_alternate()` function pluggable
  odb/source: make `read_alternates()` function pluggable
  odb/source: make `write_object_stream()` function pluggable
  odb/source: make `write_object()` function pluggable
  odb/source: make `freshen_object()` function pluggable
  odb/source: make `for_each_object()` function pluggable
  odb/source: make `read_object_stream()` function pluggable
  odb/source: make `read_object_info()` function pluggable
  odb/source: make `close()` function pluggable
  odb/source: make `reprepare()` function pluggable
  odb/source: make `free()` function pluggable
  odb/source: introduce source type for robustness
  odb: move reparenting logic into respective subsystems
  odb: embed base source in the "files" backend
  odb: introduce "files" source
  odb: split `struct odb_source` into separate header

Merge branch 'hn/status-compare-with-push'

"git status" learned to show comparison between the current branch
and various other branches listed on status.compareBranches
configuration.

* hn/status-compare-with-push:
  status: clarify how status.compareBranches deduplicates
  status: add status.compareBranches config for multiple branch comparisons
  refactor format_branch_comparison in preparation

Merge branch 'ds/for-each-repo-w-worktree'

"git for-each-repo" started from a secondary worktree did not work
as expected, which has been corrected.

* ds/for-each-repo-w-worktree:
  for-each-repo: simplify passing of parameters
  for-each-repo: work correctly in a worktree
  run-command: extract sanitize_repo_env helper
  for-each-repo: test outside of repo context

The 15th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'sp/send-email-validate-charset'

"git send-email" has learned to be a bit more careful when it
accepts charset to use from the end-user, to avoid 'y' (mistaken
'yes' when expecting a charset like 'UTF-8') and other nonsense.

* sp/send-email-validate-charset:
  send-email: validate charset name in 8bit encoding prompt

Merge branch 'dt/send-email-client-cert'

"git send-email" learns to support use of client-side certificates.

* dt/send-email-client-cert:
  send-email: add client certificate options

Merge branch 'ps/ci-gitlab-prepare-for-macos-14-deprecation'

Move gitlab CI from macOS 14 images that are being deprecated.

* ps/ci-gitlab-prepare-for-macos-14-deprecation:
  gitlab-ci: update to macOS 15 images
  meson: detect broken iconv that requires ICONV_RESTART_RESET
  meson: simplify iconv-emits-BOM check

Merge branch 'ag/send-email-sasl-with-host-port'

"git send-email" learns to pass hostname/port to Authen::SASL
module.

* ag/send-email-sasl-with-host-port:
  send-email: pass smtp hostname and port to Authen::SASL

Merge branch 'ss/t9123-setup-inside-test-expect-success'

Test clean-up.

* ss/t9123-setup-inside-test-expect-success:
  t9123: use test_when_finished for cleanup

Merge branch 'sk/oidmap-clear-with-custom-free-func'

A bit of OIDmap API enhancement and cleanup.

* sk/oidmap-clear-with-custom-free-func:
  builtin/rev-list: migrate missing_objects cleanup to oidmap_clear_with_free()
  oidmap: make entry cleanup explicit in oidmap_clear

Merge branch 'jt/doc-submitting-patches-study-before-sending'

Doc update for our contributors.

* jt/doc-submitting-patches-study-before-sending:
  Documentation: extend guidance for submitting patches

Merge branch 'os/doc-custom-subcommand-on-path'

The way end-users can add their own "git <cmd>" subcommand by
storing "git-<cmd>" in a directory on their $PATH has not been
documented clearly, which has been corrected.

* os/doc-custom-subcommand-on-path:
  doc: add information regarding external commands

Merge branch 'ss/t3700-modernize'

Test clean-up.

* ss/t3700-modernize:
  t3700: use test_grep helper for better diagnostics
  t3700: avoid suppressing git's exit code

Merge branch 'lp/doc-gitprotocol-pack-fixes'

Doc update.

* lp/doc-gitprotocol-pack-fixes:
  doc: gitprotocol-pack: normalize italic formatting
  doc: gitprotocol-pack: improve paragraphs structure
  doc: gitprotocol-pack: fix pronoun-antecedent agreement

Merge branch 'kj/path-micro-code-cleanup'

Code clean-up.

* kj/path-micro-code-cleanup:
  path: remove redundant function calls
  path: use size_t for dir_prefix length
  path: remove unused header

Merge branch 'bc/sha1-256-interop-02'

The code to maintain mapping between object names in multiple hash
functions is being added, written in Rust.

* bc/sha1-256-interop-02:
  object-file-convert: always make sure object ID algo is valid
  rust: add a small wrapper around the hashfile code
  rust: add a new binary object map format
  rust: add functionality to hash an object
  rust: add a build.rs script for tests
  rust: fix linking binaries with cargo
  hash: expose hash context functions to Rust
  write-or-die: add an fsync component for the object map
  csum-file: define hashwrite's count as a uint32_t
  rust: add additional helpers for ObjectID
  hash: add a function to look up hash algo structs
  rust: add a hash algorithm abstraction
  rust: add a ObjectID struct
  hash: use uint32_t for object_id algorithm
  conversion: don't crash when no destination algo
  repository: require Rust support for interoperability

t9200: replace test -f with modern path helper

Replace old style 'test -f' with helper
'test_path_is_file', which makes debugging
a failing test easier by loudly reporting
what expectation was not met.

Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE

The 'cmd_mktree()' function already receives a 'struct repository *repo'
pointer, but it was previously marked as UNUSED.

Pass the 'repo' pointer down to 'mktree_line()' and 'write_tree()'.
Consequently, remove the 'USE_THE_REPOSITORY_VARIABLE' macro, replace
usages of 'the_repository', and swap 'parse_oid_hex()' with its context-aware
version 'parse_oid_hex_algop()'.

This refactoring is safe because 'cmd_mktree()' is registered with the
'RUN_SETUP' flag in 'git.c', which guarantees that the command is
executed within an initialized repository, ensuring that the passed 'repo'
pointer is never 'NULL'.

Signed-off-by: Tian Yuchen <cat@malon.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

run-command: wean auto_maintenance() functions off the_repository

prepare_auto_maintenance() relies on the_repository to read
configuration. Since run_auto_maintenance() calls
prepare_auto_maintenance(), it also implicitly depends on the_repository.

Add 'struct repository *' as a parameter to both functions and update
all callers to pass the_repository.

With no global repository dependencies left in this file, remove the
USE_THE_REPOSITORY_VARIABLE macro.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Burak Kaan Karaçay <bkkaracay@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

run-command: wean start_command() off the_repository

The start_command() relies on the_repository due to the
close_object_store flag in 'struct child_process'. When this flag is
set, start_command() closes the object store associated with
the_repository before spawning a child process.

To eliminate this dependency, replace the 'close_object_store' flag with the
new 'struct object_database *odb_to_close' field. This allows callers to
specify the object store that needs to be closed.

Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Burak Kaan Karaçay <bkkaracay@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

imap-send: move common code into function host_matches()

Move the ASN1_STRING access, the associated cast and the check for
embedded NUL bytes into host_matches() to simplify both callers.

Reformulate the NUL check using memchr() and add a comment to make it
more obvious what it is about.
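
As a rough illustration of the memchr() formulation described above,
here is a minimal sketch. It is not the code from imap-send.c: the
names has_embedded_nul() and name_matches() are hypothetical, and the
comparison is a plain exact match, whereas the real host_matches() may
apply further rules. The point is only that a certificate name is a
length-delimited buffer, so an embedded NUL must be rejected before
any C-string comparison is trusted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: report whether the length-delimited buffer
 * data[0..len) contains a NUL byte anywhere inside it.
 */
static int has_embedded_nul(const char *data, size_t len)
{
	assert(data || !len);
	return len && memchr(data, '\0', len) != NULL;
}

/*
 * Hypothetical, simplified matcher: reject names with embedded NULs,
 * then require an exact, byte-for-byte match with the expected host.
 */
static int name_matches(const char *data, size_t len, const char *host)
{
	if (has_embedded_nul(data, len))
		return 0;
	return strlen(host) == len && !memcmp(data, host, len);
}
```

With this sketch, a 22-byte name "example.com\0.evil.test" is rejected
up front, even though a comparison that stops at the first NUL would
see only "example.com".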

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

imap-send: use the OpenSSL API to access the subject common name

The OpenSSL 4.0 master branch has deprecated the
X509_NAME_get_text_by_NID function. Use the recommended replacement APIs
instead. They have existed since OpenSSL v1.1.0.

Take care to get the constness right for pre-4.0 versions.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

imap-send: use the OpenSSL API to access the subject alternative names

The OpenSSL 4.0 master branch has made the ASN1_STRING structure opaque,
forbidding access to its internal fields. Use the official accessor
functions instead. They have existed since OpenSSL v1.1.0.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t: allow use of "sed -E"

Since early 2019 with e62e225f (test-lint: only use only sed [-n]
[-e command] [-f command_file], 2019-01-20), we have been trying to
limit the options of "sed" we use in our tests to "-e <pattern>",
"-n", and "-f <file>".

Before the commit, we were trying to reject only "-i" (which is one
of the really-not-portable options), but the commit explicitly
wanted to reject use of "-E" (use ERE instead of BRE).  The commit
cites the then-current POSIX.1 (Issue 7, 2018 edition) to show that
"even recent POSIX does not have it!", but the latest edition (Issue
8) documents "-E" as an option to use ERE.

But that was 7 years ago, and that is a long time for many things to
happen.

Besides, we have been using "sed -E" without the check in question
triggering in one of the scripts since 2022, with 461fec41 (bisect
run: keep some of the post-v2.30.0 output, 2022-11-10).  It was
hidden because the 'E' was squished with another single letter
option.

t/t6030-bisect-porcelain.sh: sed -En 's/.*(bisect...

This escaped the rather simple pattern used in the checker

    /\bsed\s+-[^efn]\s+/ and err 'sed option not portable...';

because -E did not appear as a singleton.

Let's change the rule to allow the "-E" option, which nobody has
complained about for the past 3 years.  We rewrite our first use
of the "-E" option so that it is caught by the old rule, primarily
because we do not want to teach our mischievous developers how to
smuggle in an unwanted option undetected by the test lint.  And at
the same time, loosen the pattern to allow "-E" the same way we
allow "-n" and friends.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

t9200: handle missing CVS with skip_all

CVS initialization runs outside a test_expect_success and when it
fails, the error report isn't good.

Wrap CVS initialization in a skip_all check so when CVS initialization
fails, the error report becomes clearer.

Move the Git repo initialization into its own test_expect_success instead
of being in the same CVS check.

Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

help: clean up the construction of keys_uniq

The construction of keys_uniq depends on a sort operation
executed on keys before processing, which does not
guarantee that keys_uniq will be sorted.

Refactor the code to move the sort operation after
the processing, removing the dependency on the sort order of keys
and strictly maintaining the sorted order of keys_uniq.

Move the strbuf init and release out of the loop to reuse the same buffer.

Dedent sort -u and sed in tests and replace grep with sed, to
avoid piping grep's output to sed.

Suggested-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
Signed-off-by: Amisha Chhajed <amishhhaaaa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

test-lib: print escape sequence names

When printing expected/actual characters in failed checks, use
their names (\a, \b, \n, ...) instead of their octal representation,
making it easier to read.
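
A minimal sketch of the idea, assuming the standard C escape set (the
exact characters handled by the real print_one_char() are not listed
in this message). The helpers escape_name() and format_char() are
hypothetical names used for illustration only: named escapes are
preferred, printable characters are shown as-is, and anything else
falls back to octal.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a control character to its C escape name. */
static const char *escape_name(int ch)
{
	switch (ch) {
	case '\a': return "\\a";
	case '\b': return "\\b";
	case '\t': return "\\t";
	case '\n': return "\\n";
	case '\v': return "\\v";
	case '\f': return "\\f";
	case '\r': return "\\r";
	default:   return NULL;
	}
}

/*
 * Hypothetical helper: write a readable form of "ch" into "out".
 * Falls back to a three-digit octal escape when no name exists and
 * the character is not printable.
 */
static void format_char(int ch, char *out, size_t size)
{
	const char *name = escape_name(ch);
	unsigned char uc = (unsigned char)ch;

	assert(out && size);
	if (name)
		snprintf(out, size, "%s", name);
	else if (isprint(uc))
		snprintf(out, size, "%c", uc);
	else
		snprintf(out, size, "\\%03o", (unsigned int)uc);
}
```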

Add tests to test-example-tap.c and update t0080-unit-test-output.sh
to match the desired output.

Teach 'print_one_char()' the equivalent names.

Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

submodule--helper: replace malloc with xmalloc

The submodule_summary_callback() function currently uses a raw malloc(),
which could lead to a NULL pointer dereference if the allocation fails.

Standardize this by replacing malloc() with xmalloc() for error handling.
To improve maintainability, use sizeof(*temp) instead of the struct name,
and drop the typecast of void pointer assignment.
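
The pattern can be sketched as follows. This is an illustration, not
Git's xmalloc(): xmalloc_sketch(), struct summary_entry and new_entry()
are hypothetical names, and treating a zero-size request as a caller
bug is an assumption of this sketch. It shows the two points made
above: the allocator never returns NULL, and sizeof(*temp) keeps the
size in step with the variable's type without a cast.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record type standing in for the real structure. */
struct summary_entry {
	int status;
	struct summary_entry *next;
};

/*
 * Hypothetical die-on-failure allocator: callers never see NULL, so
 * they cannot dereference a failed allocation by accident.
 */
static void *xmalloc_sketch(size_t size)
{
	void *ret;

	assert(size > 0); /* sketch only: zero-size requests are a bug here */
	ret = malloc(size);
	if (!ret) {
		fprintf(stderr, "fatal: out of memory\n");
		exit(128);
	}
	return ret;
}

static struct summary_entry *new_entry(int status)
{
	/* sizeof(*temp) follows the pointer's type; no cast is needed in C. */
	struct summary_entry *temp = xmalloc_sketch(sizeof(*temp));

	temp->status = status;
	temp->next = NULL;
	return temp;
}
```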

Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t3200: replace hardcoded null OID with $ZERO_OID

To support the SHA-256 transition, replace the hardcoded 40-zero string
in 'git branch --merged' with '$ZERO_OID'. The current 40-character
string causes the test to fail prematurely in SHA-256 environments
because Git identifies a "malformed object name" (due to the 40 vs 64
character mismatch) before it even validates the object type.

By using '$ZERO_OID', we ensure the hash length is always correct for
the active algorithm. Additionally, use 'test_grep' to verify the
"must point to a commit" error message, ensuring the test validates
the object type logic rather than just string syntax.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

list-objects-filter-options: avoid strbuf_split_str()

parse_combine_filter() splits a combine: filter spec at '+' using
strbuf_split_str(), which yields an array of strbufs with the
delimiter left at the end of each non-final piece.  The code then
mutates each non-final piece to strip the trailing '+' before parsing.

Allocating an array of strbufs is unnecessary.  The function processes
one sub-spec at a time and does not use strbuf editing on the pieces.
The two helpers it calls, has_reserved_character() and
parse_combine_subfilter(), only read the string content of the strbuf
they receive.

Walk the input string directly with strchrnul() to find each '+',
copying each sub-spec into a reusable temporary buffer.  The '+'
delimiter is naturally excluded.  Empty sub-specs (e.g. from a
trailing '+') are silently skipped for consistency.  Change the
helpers to take const char * instead of struct strbuf *.
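
The walk described above might look roughly like the sketch below. It
is not the code from list-objects-filter-options.c: for_each_subspec(),
find_or_end() and print_subspec() are hypothetical names, find_or_end()
is a portable stand-in for strchrnul() (a GNU extension), and plain
realloc() replaces whatever buffer type the real code reuses.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for strchrnul(): first "c" in "s", or the terminating NUL. */
static const char *find_or_end(const char *s, int c)
{
	while (*s && *s != c)
		s++;
	return s;
}

/* Example callback: print each sub-spec on its own line. */
static void print_subspec(const char *sub, void *data)
{
	(void)data;
	printf("sub-spec: %s\n", sub);
}

/*
 * Call fn() on every non-empty piece of "spec" between '+' delimiters,
 * copying each piece into one reusable buffer. Returns the number of
 * pieces visited, or -1 if memory allocation fails.
 */
static int for_each_subspec(const char *spec,
			    void (*fn)(const char *sub, void *data),
			    void *data)
{
	char *buf = NULL;
	size_t alloc = 0;
	int count = 0;
	const char *p = spec;

	assert(spec && fn);
	while (*p) {
		const char *end = find_or_end(p, '+');
		size_t len = (size_t)(end - p);

		if (len) { /* empty pieces are skipped silently */
			if (len + 1 > alloc) {
				char *tmp = realloc(buf, len + 1);
				if (!tmp) {
					free(buf);
					return -1;
				}
				buf = tmp;
				alloc = len + 1;
			}
			memcpy(buf, p, len); /* the '+' is never copied */
			buf[len] = '\0';
			fn(buf, data);
			count++;
		}
		p = *end ? end + 1 : end;
	}
	free(buf);
	return count;
}
```

In this sketch, "blob:none+tree:1" and "blob:none+tree:1+" both yield
two sub-specs, which mirrors the skipping of empty pieces.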

The test that expected an error on a trailing '+' is removed, since
that behavior was incorrect.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

worktree: do not pass strbuf by value

write_worktree_linking_files() takes two struct strbuf parameters by
value, even though it only reads path strings from them.

Passing a strbuf by value is misleading and dangerous. The structure
carries a pointer to its underlying character array; caller and callee
end up sharing that storage. If the callee ever causes the strbuf to
be reallocated, the caller's copy becomes a dangling pointer, which
results in a double-free when the caller does strbuf_release().

The function only needs the string values, not the strbuf machinery.
Switch it to take const char * and update all callers to pass .buf.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

editorconfig: fix style not applying to subdirs anymore

In 046e1117d5 (templates: add .gitattributes entry for sample hooks,
2026-02-13) we have added another pattern to our EditorConfig that sets
the style for our hook templates. As our templates are located in
"templates/hooks/", we explicitly specify that subdirectory as part of
the globbing pattern.

This change causes files in other subdirectories, such as
"builtin/add.c", to not be configured properly anymore. This seems to
stem from a subtlety in the EditorConfig specification [1]:

  If the glob contains a path separator (a / not inside square
  brackets), then the glob is relative to the directory level of the
  particular .editorconfig file itself. Otherwise the pattern may also
  match at any level below the .editorconfig level.

What's interesting is that the _whole_ expression is considered to be
the glob. So when the expression used is for example "{*.c,foo/*.h}",
then it will be considered a single glob, and because it contains a path
separator we will now anchor "*.c" matches to the same directory as the
".editorconfig" file.

Fix this issue by splitting out the configuration for hook templates
into a separate section. It leads to a tiny bit of duplication, but the
alternative would be something like the following (note the "{,**/}"):

  [{{,**/}*.{c,h,sh,bash,perl,pl,pm,txt,adoc},config.mak.*,{,**/}Makefile,templates/hooks/*.sample}]
  indent_style = tab
  tab_width = 8

This starts to become somewhat hard to read, so the duplication feels
like the better tradeoff.

[1]: https://spec.editorconfig.org/#glob-expressions

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t7605: use test_path_is_file instead of test -f

Replace old-style 'test -f' path checks with the modern
test_path_is_file helper in the merge_c1_to_c2_cmds block.

The helper provides clearer failure messages and is the
established convention in Git's test suite.

Signed-off-by: Mansi Singh <mansimaanu8627@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>