Jeff King [Fri, 16 May 2025 04:49:47 +0000 (00:49 -0400)]
cat-file: use type enum instead of buffer for -t option
Now that we no longer support OBJECT_INFO_ALLOW_UNKNOWN_TYPE, there is
no need to pass a strbuf into oid_object_info_extended() to record the
type. The regular object_type enum is sufficient to capture all of the
types we will allow.
This simplifies the code a bit, and will eventually let us drop
object_info's type_name strbuf support.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 16 May 2025 04:49:45 +0000 (00:49 -0400)]
object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag
Since cat-file dropped its "--allow-unknown-type" option in the previous
commit, there are no more uses of the internal flag that implemented it.
Let's drop it.
That in turn lets us drop the strbuf parameter of unpack_loose_header(),
which now is always NULL. And without that, we can drop all of the
additional code to inflate larger headers into the strbuf.
Arguably we could drop ULHR_TOO_LONG, as no callers really care about
the distinction from ULHR_BAD. But it's easy enough to retain, and it
does let us produce a slightly more specific message in one instance.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 16 May 2025 04:49:35 +0000 (00:49 -0400)]
cat-file: make --allow-unknown-type a noop
The cat-file command has some minor support for handling objects with
"unknown" types. I.e., strings that are not "blob", "commit", "tree", or
"tag".
In theory this could be used for debugging or experimenting with
extensions to Git. But in practice this support is not very useful:
1. You can get the type and size of such objects, but nothing else.
Not even the contents!
2. Only loose objects are supported, since packfiles use numeric ids
for the types, rather than strings.
3. Likewise you cannot ever transfer objects between repositories,
because they cannot be represented in the packfiles used for the
on-the-wire protocol.
The support for these unknown types complicates the object-parsing code,
and has led to bugs such as b748ddb7a4 (unpack_loose_header(): fix
infinite loop on broken zlib input, 2025-02-25). So let's drop it.
The first step is to remove the user-facing parts, which are accessible
only via cat-file. This is technically backwards-incompatible, but given
the limitations listed above, these objects couldn't possibly be useful
in any workflow.
However, we can't just rip out the option entirely. That would hurt a
caller who ran:
git cat-file -t --allow-unknown-object <oid>
and fed it normal, well-formed objects. There --allow-unknown-type was
doing nothing, but we wouldn't want to start bailing with an error. So
to protect any such callers, we'll retain --allow-unknown-type as a
noop.
The code change is fairly small (but we'll able to clean up more code in
follow-on patches). The test updates drop any use of the option. We
still retain tests that feed the broken objects to cat-file without
--allow-unknown-type, as we should continue to confirm that those
objects are rejected. Note that in one spot we can drop a layer of loop,
re-indenting the body; viewing the diff with "-w" helps there.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Provide an overview of the set of functions used for manipulating
`json_writer`s, by describing what functions should be used for
each JSON-related task.
Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a docstring for each function that manipulates json_writers.
Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 16 May 2025 00:24:55 +0000 (17:24 -0700)]
Merge branch 'ps/maintenance-missing-tasks'
Make repository clean-up tasks "gc" can do available to "git
maintenance" front-end.
* ps/maintenance-missing-tasks:
builtin/maintenance: introduce "rerere-gc" task
builtin/gc: move rerere garbage collection into separate function
builtin/maintenance: introduce "worktree-prune" task
builtin/gc: move pruning of worktrees into a separate function
builtin/gc: remove global variables where it is trivial to do
builtin/gc: fix indentation of `cmd_gc()` parameters
Junio C Hamano [Fri, 16 May 2025 00:24:55 +0000 (17:24 -0700)]
Merge branch 'cf/wrapper-bsd-eloop'
The fallback implementation of open_nofollow() depended on
open("symlink", O_NOFOLLOW) to set errno to ELOOP, but a few BSD
derived systems use different errno, which has been worked around.
* cf/wrapper-bsd-eloop:
wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
Lidong Yan [Fri, 9 May 2025 08:30:35 +0000 (08:30 +0000)]
commit-graph: fix memory leak when `fill_oids_from_packs()` fails
In commit-graph.c:fill_oids_from_packs, if open_pack_index failed,
memory allocated and returned by add_packed_git will leak. Simply
add close_pack and free(p) will solve this problem.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Wed, 14 May 2025 13:53:28 +0000 (13:53 +0000)]
sequencer: fix memory leak if `todo_list_rearrange_squash()` failed
In sequencer.c:todo_list_rearrange_squash, if it fails, memory
allocated in `next`, `tail`, `subjects` and `subject2item` will leak.
Jump to cleanup label before return could fix this leak problem.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Tue, 13 May 2025 02:49:10 +0000 (02:49 +0000)]
mailinfo: fix pointential memory leak if `decode_header` failed
In mailinfo.c:decode_header, if convert_to_utf8 failed, the strbuf stored
in dec will leak. Simply add strbuf_release and free(dec) will solve
this problem.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
sequencer: stop pretending that an assignment is a condition
In 3e81bccdf3 (sequencer: factor out todo command name parsing,
2019-06-27), a `return` statement was introduced that basically was a
long sequence of conditions, combined with `&&`, except for the last
condition which is not really a condition but an assignment.
The point of this construct was to return 1 (i.e. `true`) from the
function if all of those conditions held true, and also assign the `bol`
pointer to the end of the parsed command.
Some static analyzers are really unhappy about such constructs. And
human readers are at least puzzled, if not confused, by seeing a single
`=` inside a chain of conditions where they would have expected to see
`==` instead and, based on experience, immediately suspect a typo.
Let's help all of this by turning this into the more verbose, more
readable form of an `if` construct that both assigns the pointer as well
as returns 1 if all of the conditions hold true.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
bundle-uri: avoid using undefined output of `sscanf()`
In c429bed102 (bundle-uri: store fetch.bundleCreationToken, 2023-01-31)
code was introduced that assumes that an `sscanf()` call leaves its
output variables unchanged unless the return value indicates success.
However, the POSIX documentation makes no such guarantee:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sscanf.html
So let's make sure that the output variable `maxCreationToken` is
always well-defined.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code is a bit too hard to reason about to fully assess whether the
`fill_commit_graph_info()` function is called at all after
`write_commit_graph()` returns (and hence the stack variable
`topo_levels` goes out of context).
Let's simply make sure that the stack address is no longer used at that
stage, thereby making the code quite a bit easier to reason about.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
CodeQL reports empty `if` blocks that only contain a comment as "futile
conditional". The comment talks about potential plans to turn this into
a warning, but that seems not to have been necessary. Replace the entire
construct with a concise comment.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
While `if (i <= 0) ... else if (i > 0) ...` is technically equivalent to
`if (i <= 0) ... else ...`, the latter is vastly easier to read because
it avoids writing out a condition that is unnecessary. Let's drop such
unnecessary conditions.
Pointed out by CodeQL.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: avoid unnecessary work when there is no current branch
As pointed out by CodeQL, `branch_get()` may return `NULL`, in which
case `branch_has_merge_config()` would return early, but we can even
avoid enumerating the refs prefixes in that case, saving even more CPU
cycles.
Technically, we should enclose these two statements in an `if (branch)
{...}` block, but the indentation is already quite deep, therefore I
refrained from doing that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
One thing that might be non-obvious to readers (or to analyzers like
CodeQL) is that the function essentially does nothing when the Git index
is empty, and in particular that it does not look at the value of
`len_eq_last` (which would be uninitialized at that point).
Let's make this much easier to understand, by returning early if the Git
index is empty, and by avoiding empty `else` blocks.
This commit changes indentation and is hence best viewed using
`--ignore-space-change`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
upload-pack: rename `enum` to reflect the operation
While 3145ea957d (upload-pack: introduce fetch server command,
2018-03-15) added support for the `fetch` command, from the server's
point of view it is an upload, and hence the `enum` should really be
called `upload_state` instead of `fetch_state`. Likewise, rename its
values.
This also helps unconfuse CodeQL which would otherwise be at sixes or
sevens about having _two_ non-local definitions of the same `enum` with
the same values.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do need a context to write the commit graph, but that context is only
needed during the life time of `commit_graph_write()`, therefore it can
easily be a stack variable.
This also helps CodeQL recognize that it is safe to assign the address
of other local variables to the context's fields.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: carefully clear local variable's address after use
As pointed out by CodeQL, it is a potentially dangerous practice to
store local variables' addresses in non-local structs. Yet this is
exactly what happens with the `acked_commits` attribute that is used in
`cmd_fetch()`: The pointer to a local variable is assigned to it.
Now, it is Git's convention that `cmd_*()` functions are essentially
only returning just before exiting the process, therefore there is
little danger that this attribute is used after the code flow returns
from that function.
However, code in `cmd_*()` function is often so useful that it gets
lifted into a library function, at which point this issue could become a
real problem.
Let's make sure to clear the `acked_commits` attribute out after it was
used, and before the function returns (at which point the address would
go stale).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The difference of two unsigned integers is defined to be unsigned, and
therefore it is misleading to check whether it is greater than zero
(instead, the more natural way would be to check whether the difference
is zero or not).
Let's instead avoid the subtraction altogether, and compare the two
operands directly, which makes the code more obvious as a side effect.
Pointed out by CodeQL's rule with the ID
`cpp/unsigned-difference-expression-compared-zero`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Wed, 14 May 2025 20:41:30 +0000 (22:41 +0200)]
git-gui: do not end the commit message with an empty line
The commit message is processed to remove unnecessary empty lines.
In particular, it is ensured that the text ends with at most one LF
character. This one is always present, because the Tk text widget
ensures that is present.
However, did not consider that the processed text is written to the
commit message file using `puts`, which also appends a LF character,
so that the final commit message ends with two LF. Trim all trailing
LF characters, and while we are here, use `string trim`, which lets
us remove the leading LF in the same command.
Reported-by: Gareth Fenn <garethfenn@gmail.com> Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
gitk: do not hard-code color of search results in commit list
A global variable exists that holds the color name used to highlight
search results everywhere, except that in the commit list the color
is still hard-coded to "yellow". Use the global variable there as well.
Signed-off-by: Alexander Ogorodov <bnfour@bnfour.net>
Elijah Newren [Wed, 14 May 2025 20:33:25 +0000 (20:33 +0000)]
replay: replace the_repository with repo parameter passed to cmd_replay ()
Replace the_repository everywhere with repo, feed repo from cmd_replay()
to all the other functions in the file that need it, and remove the
UNUSED annotation on repo.
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Wed, 14 May 2025 15:50:42 +0000 (23:50 +0800)]
packed-backend: mmap large "packed-refs" file during fsck
During fsck, we use "strbuf_read" to read the content of "packed-refs"
without using mmap mechanism. This is a bad practice which would consume
more memory than using mmap mechanism. Besides, as all code paths in
"packed-backend.c" use this way, we should make "fsck" align with the
current codebase.
As we have introduced the helper function "allocate_snapshot_buffer", we
can simply use this function to use mmap mechanism.
Suggested-by: Jeff King <peff@peff.net> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Wed, 14 May 2025 15:50:35 +0000 (23:50 +0800)]
packed-backend: extract snapshot allocation in `load_contents`
"load_contents" would choose which way to load the content of the
"packed-refs". However, we cannot directly use this function when
checking the consistency due to we don't want to open the file. And we
also need to reuse the logic to avoid causing repetition.
Let's create a new helper function "allocate_snapshot_buffer" to extract
the snapshot allocation logic in "load_contents" and update the
"load_contents" to align with the behavior.
Suggested-by: Jeff King <peff@peff.net> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
shejialuo [Wed, 14 May 2025 15:50:26 +0000 (23:50 +0800)]
packed-backend: fsck should warn when "packed-refs" file is empty
We assume the "packed-refs" won't be empty and instead has at least one
line in it (even when there are no refs packed, there is the file header
line). Because there is no terminating LF in the empty file, we will
report "packedRefEntryNotTerminated(ERROR)" to the user.
However, the runtime code paths would accept an empty "packed-refs"
file, for example, "create_snapshot" would simply return the "snapshot"
without checking the content of "packed-refs". So, we should skip
checking the content of "packed-refs" when it is empty during fsck.
After 694b7a1999 (repack_without_ref(): write peeled refs in the
rewritten file, 2013-04-22), we would always write a header into the
"packed-refs" file. So, versions of Git that are not too ancient never
write such an empty "packed-refs" file.
As an empty file often indicates a sign of a filesystem-level issue, the
way we want to resolve this inconsistency is not make everybody totally
silent but notice and report the anomaly.
Let's create a "FSCK_INFO" message id "EMPTY_PACKED_REFS_FILE" to report
to the users that "packed-refs" is empty.
Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Wed, 14 May 2025 13:52:44 +0000 (09:52 -0400)]
scalar reconfigure: improve --maintenance docs
The --maintenance option for 'scalar reconfigure' has three possible
values. Improve the documentation by specifying the option in the -h
help menu and usage information.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aditya Garg [Mon, 12 May 2025 08:11:19 +0000 (08:11 +0000)]
send-email: try to get fqdn by running hostname -f on Linux and macOS
`hostname` is a popular command available on both Linux and macOS. As
per the man-page[1], `hostname -f` command returns the fully qualified
domain name (FQDN) of the system. The current Net::Domain perl module
being used in the script for the same has been quite unrealiable in many
cases. Thankfully, we now have a better check for valid_fqdn, which does
reject the invalid FQDNs given by this module properly, but at the same
time, it will result in a fallback to 'localhost.localdomain' being
used. `hostname -f` has been quite reliable (probably even more reliable
than the Net::Domain module) and before falling back to
'localhost.localdomain', we should try to use it. Interestingly, the
`hostname` command is actually used by perl modules like Net::Domain[2]
and Sys::Hostname[3] to get the hostname. So, lets give `hostname -f` a
chance as well!
Junio C Hamano [Tue, 13 May 2025 21:05:06 +0000 (14:05 -0700)]
Merge branch 'js/ci-buildsystems-cleanup'
Code clean-up around stale CI elements and building with Visual Studio.
* js/ci-buildsystems-cleanup:
config.mak.uname: drop the `vcxproj` target
contrib/buildsystems: drop support for building . vcproj/.vcxproj files
ci: stop linking the `prove` cache
With 7304bd2bc39 (ci: wire up Visual Studio build with Meson,
2025-01-22) we have introduced a CI job that builds and tests Git with
Microsoft Visual Studio via Meson. This job is only being executed by
default on GitHub Workflows though -- on GitLab CI it is marked as a
"manual" job, so the developer has to actively trigger these jobs.
The consequence of this split is that any breakage specific to this job
is only noticed by developers who mainly work with GitHub. Let's improve
this situation by also running the job by default on GitLab CI.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-gui: wire up support for the Meson build system
The Git project has started to wire up Meson as a build system in Git
v2.48.0. Wire up support for Meson in "git-gui" so that we can trivially
include it as a subproject in Git.
The "GITGUI_VERSION" variable is made available by generating and
including the "GIT-VERSION-FILE" file. Its value has been used in
various build steps, but in the preceding commits we have refactored
those to instead source the "GIT-VERSION-FILE" directly. As a result,
the variable is now only used in a single recipe, and this use can be
trivially replaced with sed(1).
Refactor the recipe to do so and stop including "GIT-VERSION-FILE" to
simplify the build process.
Extract script to generate the macOS app. This change allows us to reuse
the build logic with the Meson build system.
Note that as part of this change we also modify the TKEXECUTABLE
variable to track its full path. Like this we don't have to propagate
both the TKEXECUTABLE and TKFRAMEWORK variables into the script, and the
basename can be trivially computed from TKEXECUTABLE anyway.
The value of the GITGUI_SCRIPT variable is only used in a single place
as part of an sed(1) script that massages the "git-gui.sh" script.
Interestingly, this specific replacement does seem to be a no-op: we
replace "^ argv0=$$0" with " argv=$(GITGUI_SCRIPT)", which has a value
of "$$0". The result would thus be completely unchanged.
git-gui: make output of GIT-VERSION-GEN source'able
The output of GIT-VERSION-GEN can be sourced by our Makefile to make the
version available there. The output has a couple of spaces around the
equals sign, which is perfectly valid for parsing it in our Makefile.
But in subsequent steps we'll also want to source the file in a couple
of newly-introduced shell scripts, but having spaces around variable
assignments is invalid there.
Prepare for this step by dropping the spaces surrounding the equals
sign. Like this, we can easily use the same file both in our Makefile
and in shell scripts.
git-gui: prepare GIT-VERSION-GEN for out-of-tree builds
The GIT-VERSION-GEN unconditionally writes version information into the
source directory in the form of the "GIT-VERSION-FILE". We are about to
introduce the Meson build system though, which enforces out-of-tree
builds by default, and in that context we cannot continue to write
version information into the source tree.
Prepare the script for out-of-tree builds by treating the source
directory different from the output file.
git-gui: replace GIT-GUI-VARS with GIT-GUI-BUILD-OPTIONS
The GIT-GUI-VARS file is used to track whether any of our build options
has changed. Unfortunately, the format of that file does not allow us to
propagate those build options to other scripts. But as we are about to
introduce support for the Meson build system, we will extract a couple
of scripts to deduplicate core build logic across Makefiles and Meson.
With this refactoring, it will become necessary to make build options
more widely accessible.
Replace GIT-GUI-VARS with a new GIT-GUI-BUILD-OPTIONS file that is being
populated from a template. This file can easily be sourced from build
scripts in subsequent steps.
Junio C Hamano [Mon, 12 May 2025 19:03:11 +0000 (12:03 -0700)]
whatschanged: list it in BreakingChanges document
This can be squashed into the previous step. That is how our "git
pack-redundant" conversion did.
Theoretically, however, those who want to gauge the need to keep the
command by exposing their users to patches before this one may want
to wait until their experiment finishes before they formally say
"this will go away".
This change is made into a separate patch from the previous step
precisely to help those folks.
While at it, update the documentation page to use the new [synopsis]
facility to mark-up the SYNOPSIS part.
Helped-by: Elijah Newren <newren@gmail.com>
[en: typofix] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 12 May 2025 19:03:10 +0000 (12:03 -0700)]
whatchanged: remove when built with WITH_BREAKING_CHANGES
As we made "git whatchanged" require "--i-still-use-this" and asked
the users to report if they still want to use it, the logical next
step is to allow us build Git without "whatchanged" to prepare for
its eventual removal.
If we were to follow the pattern established in 8ccc75c2 (remote:
announce removal of "branches/" and "remotes/", 2025-01-22), we can
do this together with the documentation update to officially list
that the command will be removed in the BreakingChanges document,
but let's just keep the changes separate just in case we want to
proceed a bit slower.
Junio C Hamano [Mon, 12 May 2025 19:03:09 +0000 (12:03 -0700)]
whatchanged: require --i-still-use-this
The documentation of "git whatchanged" is pretty explicit that the
command was retained for historical reasons to help those whose fingers
cannot be retrained. Let's see if they still are finding it hard to
type "git log --raw" instead of "git whatchanged" by marking the
command as "nominated for removal", and require "--i-still-use-this"
on the command line. Adjust the tests so that the option is passed
when we invoke the command. In addition, we test that the command
fails when "--i-still-use-this" is not given.
Junio C Hamano [Mon, 12 May 2025 19:03:08 +0000 (12:03 -0700)]
tests: prepare for a world without whatchanged
Some tests on fast-import run "git whatchanged" without even
checking the output from the command. It is tempting to remove the
calls altogether since they are not doing anything useful, but they
presumably were added there while the tests were developed to manually
sanity check which paths were touched.
Replace these calls with "git log --raw", which is a rough
equivalent in the more modern Git.
This does not remove "git whatchanged", but we no longer have to
worry about adjusting these places when we eventually do.
Helped-by: Elijah Newren <newren@gmail.com>
[en: log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 12 May 2025 21:22:48 +0000 (14:22 -0700)]
Merge branch 'ps/object-store-cleanup'
Further code clean-up in the object-store layer.
* ps/object-store-cleanup:
object-store: drop `repo_has_object_file()`
treewide: convert users of `repo_has_object_file()` to `has_object()`
object-store: allow fetching objects via `has_object()`
object-store: move function declarations to their respective subsystems
object-store: move and rename `odb_pack_keep()`
object-store: drop `loose_object_path()`
object-store: move `struct packed_git` into "packfile.h"
Junio C Hamano [Mon, 12 May 2025 19:03:06 +0000 (12:03 -0700)]
you-still-use-that??: help deprecating commands for removal
Commands slated for removal like "git pack-redundant" now require
an explicit "--i-still-use-this" option to run. This is to
discourage casual use and surface their pending deprecation to
users.
The warning message is long, so factor it into a helper function
you_still_use_that() to simplify reuse by other commands.
Also add a missing test to ensure this enforcement works for
"pack-redundant".
Helped-by: Elijah Newren <newren@gmail.com>
[en: log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 12 May 2025 18:52:33 +0000 (14:52 -0400)]
raw_object_store: drop extra pointer to replace_map
We store the replacement data in an oidmap, which is itself a pointer in
the raw_object_store struct. But there's no need for an extra pointer
indirection here. It is always allocated and initialized along with the
containing struct, and we never check it for NULL-ness.
Let's embed the map directly in the struct, which is simpler and avoids
extra pointer chasing.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 12 May 2025 18:51:30 +0000 (14:51 -0400)]
oidmap: add size function
Callers which want to know how many items are in an oidmap have to look
at the underlying hashmap struct, leaking an implementation detail.
Let's provide a type-appropriate wrapper and use it.
Note in the call from lookup_replace_object(), the caller was actually
looking at the hashmap's tablesize parameter (the allocated size of the
table) rather than hashmap_get_size(), the number of items in the table.
This probably should have been checking the number of items all along,
but the two are functionally equivalent here since we only add to the
map and never remove anything. Thus if there was any allocation, it was
because there is at least one item.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 12 May 2025 18:50:28 +0000 (14:50 -0400)]
oidmap: rename oidmap_free() to oidmap_clear()
This function does not free the oidmap struct itself; it just drops all
items from the map (using hashmap_clear_() internally). It should be
called oidmap_clear(), per CodingGuidelines.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Mon, 12 May 2025 12:22:10 +0000 (12:22 +0000)]
pack-bitmap: fix memory leak if `load_bitmap_entries_v1` failed
In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1`
allocates a bitmap and reads index data into it. However, if any of
the validation checks following the allocation fail, the allocated bitmap
is not freed, resulting in a memory leak. To avoid this, the validation
checks should be performed before the bitmap is allocated.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "stats" directory contains a couple of scripts to do some statistics
on a repository:
- "git-common-hash" shows the longest common hash prefixes and can be
used to determine the minimum prefix length to use for object names
to be unique. The script has last been touched in 53474eb92ff
(contrib: update stats/mailmap script, 2012-12-12) and searching for
it on the internet doesn't really surface any potential use cases or
even mentions of it.
Modern Git also shouldn't really need this tool as it knows to
automatically scale printed prefixes via some heuristics.
- "mailmap.pl" performs some statistics on the number of mailmapped
commits in a repository. It has last been modified in 53474eb92ff
(contrib: update stats/mailmap script, 2012-12-12) and has since
been bitrotting. It doesn't even compile nowadays anymore:
$ perl contrib/stats/mailmap.pl
Experimental keys on scalar is now forbidden at contrib/stats/mailmap.pl line 57.
Type of arg 1 to keys must be hash or array (not hash element) at contrib/stats/mailmap.pl line 57, near "}) "
Experimental keys on scalar is now forbidden at contrib/stats/mailmap.pl line 57.
Type of arg 1 to keys must be hash or array (not private variable) at contrib/stats/mailmap.pl line 57, near "$h)"
Experimental keys on scalar is now forbidden at contrib/stats/mailmap.pl line 64.
Type of arg 1 to keys must be hash or array (not private variable) at contrib/stats/mailmap.pl line 64, near "$h)"
Execution of contrib/stats/mailmap.pl aborted due to compilation errors.
This should be good-enough signal to indicate that nobody is using
this script at all anymore.
- "packinfo.pl" takes the output from git-verify-pack(1) and performs
some pretty printing thereof. On the one hand it reformats the
output to be easier to read and provide some summaries. On the other
hand it may also print filenames of blobs.
We don't have any replacement for this tool. Ideally, we should move
its functionality into git-verify-pack(1) itself.
Remove the first two scripts, but retain "packinfo.pl".
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git-new-workdir" command has been introduced to make it possible to
have a separate working directory in a different place. The command thus
predates git-worktree(1), which is what people use nowadays to create
any such working directory. As such, the script doesn't really have much
of a reason to exist nowadays anymore.
It also doesn't seem like the script is still in use: the last time it
has received an update was in e32afab7b03 (git-new-workdir: don't fail
if the target directory is empty, 2014-11-26), more than a decade ago.
Remove it as well as the tests that depend on it.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the "emacs/" directory still exists, all of its code has been
replaced with stubs in 6d5ed4836db (git{,-blame}.el: remove old
bitrotting Emacs code, 2018-04-11). Instead, the recommendation is to
use Emacs' own vc-annotate mode.
Remove the code altogether.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git-resurrect.sh" script can be used to find traces of a branch tip
in the reflog and resurrect that branch. Despite a couple of global
cleanups, the script hasn't seen any activity since it was introduced in e1ff064e1bf (contrib git-resurrect: find traces of a branch name and
resurrect it, 2009-02-04).
Furthermore, the tool does not work with the "reftable" backend at all
as it directly reads ".git/logs/HEAD". As reflogs are stored as part of
the individual tables though that file wouldn't exist in a "reftable"-
enabled repository.
Last but not least, the tool doesn't even work unless it is explicitly
invoked via `git resurrect` as it sources "git-sh-setup". As none of our
build systems know to install this script, users thus have to go out of
their way to really make it work, which is highly unlikely.
Another source that indicates that this tool can be removed is a
question for how to restore deleted branches on StackOverflow [1]. The
top-voted answer uses git-reflog(1) directly and has received more than
3000 votes to date. While "git-resurrect.sh" is also mentioned, it only
got 16 upvotes, and comments mention the above caveat that users have to
do some manual setup to make it work.
It's thus rather clear that the tool doesn't have a lot or even any
users. Remove it.
The "persistent-https" remote helper supposedly speeds up SSL operations
by running a daemon that keeps a connection open to a remote server. It
is effectively unmaintained nowadays: the last time it received an
update was in accb613afd2 (contrib/persistent-https: use Git version for
build label, 2016-07-20) and its parent commits to make it compile with
Go 1.7+.
This Go toolchain is somewhat dated by now though and unsupported. The
oldest still-supported toolchain is Go 1.23, which was released in
August 2024. It is not possible to compile the remote helper with that
Go version anymore:
$ go version
go version go1.23.8 linux/amd64
$ make
case $(go version) in \
"go version go"1.[0-5].*) EQ=" " ;; *) EQ="=" ;; esac && \
go build -o git-remote-persistent-https \
-ldflags "-X main._BUILD_EMBED_LABEL${EQ}GIT_VERSION=2.49.0.943.g965a70ebf62"
go: cannot find main module, but found .git/config in /home/pks/Development/git
to create a module there, run:
cd ../.. && go mod init
make: *** [Makefile:31: git-remote-persistent-https] Error 1
The problem is that modern Go toolchains require a "go.mod" file, but we
don't have any such files. This requirement exists since quite a while
already, so it's clear that nobody has tried to use this remote helper
anytime recent.
Remove the remote helper.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "mw-to-git" directory contains tools for accessing MediaWiki via
Git. The scripts are essentially unmaintained in Git: despite a couple
of global cleanups, the last changes were a couple of security-related
issues part of 9a8606465e8 (remote-mediawiki: use "sh" to eliminate
unquoted commands, 2020-09-21) and its parents. We don't ever run any of
the tests so it is more likely than not that many of the tests have been
bitrotting, like e.g. documented in f8ab018dafc (remote-mediawiki tests:
annotate failing tests, 2020-09-21).
According to Matthieu Moy [1], one of the original developers of this
tool, it didn't receive any attention recently and there is no
motivation to keep maintaining it anymore in the community. The project
has been spun out of Git [2] and thus has a new official home, but did
not receive much attention over there, either.
As such, it seems like the MediaWiki transport helper is slowly fading
away. But given that there is a new home, it doesn't make sense to have
it as part of Git anymore only to let it rot. Remove the directory.
The "hooks" directory contains a handful of example hooks. Most of these
hooks are highly specific and haven't really received any updates over
the last couple of years, except for some global cleanups. The multimail
hook has also been removed in f74d11471fa (multimail: stop shipping a
copy, 2021-06-10) in favor of its upstream project [1].
Remove those hooks. If we want to provide examples for how to use Git
hooks we should do that as part of our documentation, for example in
githooks(5).
The "thunderbird-patch-inline" directory in "contrib/" contains a script
to send patch files via Thunderbird. This script depends on the
ExternalEditor extension [1], which seems to be effectively unmaintained
with the last update being in 2008. While the extension has eventually
been maintained in [2], that fork hasn't received any updates since
2020, either.
As such, the ExternalEditor extension does not work with modern versions
of Thunderbird anymore, and as the "thunderbird-patch-inline" script
depends on the ExternalEditor extension it likely doesn't work anymore,
either. The fact that this script hasn't been touched for the last 10
years outside of some global cleanup supports the idea that it is not
useful anymore.
The "remote-helpers" directory contains two remote helper scripts for
Mercurial and Bazaar. These scripts have since been converted into stubs
in b2c851a8e67 (Revert "Merge branch 'jc/graduate-remote-hg-bzr' (early
part)", 2014-05-20) as the helpers have been moved into their own
upstream projects [1][2].
Given that these stubs have been created more than a decade ago it is
very unlikely that anybody still tries to use them. Remove them.
The "examples" directory used to contain scripted versions of some of
our builtins. These have all been removed in 49eb8d39c78 (Remove
contrib/examples/*, 2018-03-25), but we left a note in the directory to
make it discoverable that there used to be examples.
It is unlikely that anybody still looks at these examples more than 7
years after they have been removed. Remove the note and its directory.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remotes can be configured either via a repository's config or by using
the ".git/branches/" or ".git/remotes/" directories. Back when the new
config-based mechanism has been introduced we also introduced a helper
script that migrates from the old-style remote configuration to the new
config-based mechanism.
With the recent removal announcement for the two directories we also
started to instruct users to migrate repositories that still use these
mechanism to use config-based remotes. Notably though, the migration
path doesn't even use the migration script. Instead, git-remote(1)
itself knows how to migrate any such remote via `git remote rename`.
In fact, a full migration _cannot_ use the script as it only knows to
migrate remotes from ".git/remotes/", but not ".git/branches/". As such,
the migration path via `git remote rename` is the only feasible way to
fully migrate repositories over to the new format.
Last but not least, the script doesn't even work as-is as it sources
"git-sh-setup". For this to work it would need to be invoked either via
Git so that this script is in our PATH, users would have to manually
call it with an adjusted PATH, or distributions need to install the
script into "$prefix/libexec/git-core" with a "git-" prefix. All of
these steps are unlikely enough to underpin the claim that this script
is not used at all.
So given that:
- The script cannot perform a full migration of all deprecated remote
types.
- We don't advertise it anywhere.
- It has been basically untouched since 2007.
- It doesn't even work unless users do manual steps.
It should be safe enough to just remove it. Do so.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable: fix perf regression when reading blocks of unwanted type
In fd888311fbc (reftable/table: move reading block into block reader,
2025-04-07), we have refactored how reftable blocks are read so that
most of the logic is contained in the "block.c" subsystem itself. Most
importantly, the whole logic to read the data itself is now contained in
that subsystem.
This change caused a significant performance regression though when
reading blocks that aren't of the specific type one is searching for:
Benchmark 1: update-ref: create 100k refs (revision = fd888311fbc~)
Time (mean ± σ): 2.171 s ± 0.028 s [User: 1.189 s, System: 0.977 s]
Range (min … max): 2.117 s … 2.206 s 10 runs
Benchmark 2: update-ref: create 100k refs (revision = fd888311fbc)
Time (mean ± σ): 3.418 s ± 0.030 s [User: 2.371 s, System: 1.037 s]
Range (min … max): 3.377 s … 3.473 s 10 runs
Summary
update-ref: create 100k refs (revision = fd888311fbc~) ran
1.57 ± 0.02 times faster than update-ref: create 100k refs (revision = fd888311fbc)
The root caute of the performance regression is that we changed when
exactly blocks of an uninteresting type are being discarded. Previous to
the refactoring in the mentioned commit we'd load the block data, read
its type, notice that it's not the wanted type and discard the block.
After the commit though we don't discard the block immediately, but we
fully decode it only to realize that it's not the desired type. We then
discard the block again, but have already performed a bunch of pointless
work.
Fix the regression by making `reftable_block_init()` return early in
case the block is not of the desired type. This fixes the performance
hit:
Benchmark 1: update-ref: create 100k refs (revision = HEAD~)
Time (mean ± σ): 2.712 s ± 0.018 s [User: 1.990 s, System: 0.716 s]
Range (min … max): 2.682 s … 2.741 s 10 runs
Benchmark 2: update-ref: create 100k refs (revision = HEAD)
Time (mean ± σ): 1.670 s ± 0.012 s [User: 0.991 s, System: 0.676 s]
Range (min … max): 1.652 s … 1.693 s 10 runs
Summary
update-ref: create 100k refs (revision = HEAD) ran
1.62 ± 0.02 times faster than update-ref: create 100k refs (revision = HEAD~)
Note that the baseline performance is lower than in the original due to
a couple of unrelated performance improvements that have landed since
the original commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Mon, 12 May 2025 02:07:27 +0000 (02:07 +0000)]
builtin/am: fix memory leak in `split_mail_stgit_series`
In builtin/am.c:split_mail_stgit_series, if `fopen` failed,
`series_dir_buf` allocated by `xstrdup` will leak. Add `free` in
`!fp` if branch will prevent the leak.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jean-Noël Avila [Sat, 10 May 2025 12:33:17 +0000 (14:33 +0200)]
git-var doc: fix usage of $ENV_VAR vs ENV_VAR
When refering to environment variables in the documentation, use the
ENV_VARIABLE format instead of $ENV_VARIABLE. The latter is used in the
documentation to refer to the actual value of the variable, not the name
of the variable.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sat, 10 May 2025 12:33:16 +0000 (14:33 +0200)]
git-verify-* doc: update mark-up of synopsis option descriptions
To unify mark-up used in our documentation to a newer convention,
started by 22293895 (doc: apply synopsis simplification on git-clone
and git-init, 2024-09-24), update the documentation pages for 'git
verify-commit', 'git verify-tag', and 'git verify-pack' to
* use [synopsis], not [verse] in the SYNOPSIS section
* enclose `--option=<value>` in backquotes
* do not describe non-option arguments in the OPTIONS section
Signed-off-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sat, 10 May 2025 12:33:15 +0000 (14:33 +0200)]
git-{var,write-tree} docs: update mark-up of synopsis option descriptions
To unify mark-up used in our documentation to a newer convention,
started by 22293895 (doc: apply synopsis simplification on git-clone
and git-init, 2024-09-24), update the documentation for 'git var' and
'git write-tree' to
* use [synopsis], not [verse] in the SYNOPSIS section
* enclose `--option=<value>` in backquotes
Signed-off-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Sat, 10 May 2025 12:33:14 +0000 (14:33 +0200)]
git-daemon doc: update mark-up of synopsis option descriptions
To unify mark-up used in our documentation to a newer convention,
started by 22293895 (doc: apply synopsis simplification on git-clone
and git-init, 2024-09-24), update the documentation of 'git daemon'
to
* use [synopsis], not [verse] in the SYNOPSIS section
* enclose `--option=<value>` in backquotes
Also, split '--[no-]option' into '--option' and '--no-option'
to make it easier to grep for them.
Signed-off-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Mon, 12 May 2025 12:49:04 +0000 (12:49 +0000)]
reftable/writer: fix memory leak when `writer_index_hash()` fails
In reftable/writer.c:writer_index_hash(), if `reftable_buf_add` failed,
key allocated by `reftable_malloc` will not be insert into `obj_index_tree`
thus leaks. Simple add reftable_free(key) will solve this problem.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan [Mon, 12 May 2025 12:49:03 +0000 (12:49 +0000)]
reftable/writer: fix memory leak when `padded_write()` fails
In reftable/writer.c:padded_write(), if w->writer failed, zeroed
allocated in `reftable_calloc` will leak. w->writer could be
`reftable_write_data` in reftable/stack.c, and could fail due to
some write error. Simply add reftable_free(zeroed) will solve this
problem.
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
YOKOTA Hiroshi [Sun, 4 May 2025 17:57:07 +0000 (02:57 +0900)]
gitk: Legacy widgets doesn't have combobox
Use "proc makedroplist" function to support combobox on legacy widgets
mode. "proc makedroplist" uses "ttk::combobox" for themed mode, and uses
"tk_optionMenu" for legacy mode to get rid of the problem.
Signed-off-by: YOKOTA Hiroshi <yokota.hgml@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Makefile: avoid constant rebuilds with compilation database
Many contributors to software use a Language Server Protocol
implementation to allow their editor to learn structural information
about the code they write and provide additional features, such as
jumping to the declaration or definition of a function or type. In C,
the usual implementation is clangd, which requires compiling with clang.
Because C and C++ projects lack a standard file system layout and build
system, unlike languages such as Rust and Go, clangd requires a
compilation database to be generated by the clang compiler in order to
pass the proper compilation flags and discover all of the files
necessary to make the LSP work. This is done by setting
GENERATE_COMPILATION_DATABASE to "yes".
However, when that's enabled and the user runs "make" a second time,
all of the files are re-compiled, which is inconvenient for contributors
to Git, since it makes small changes or rebases recompile the entirety
of the codebase. This happens because the directory holding the
compilation database is updated anytime an object is built, so its
modification date will always be newer than the first object built.
To solve this, use the same trick we do just above for the .depend
directory and filter the compilation database directory out if it
already exists, which avoids making it a target to be built.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Phillip Wood [Fri, 9 May 2025 16:22:27 +0000 (16:22 +0000)]
sequencer: rework reflog message handling
It has been reported that "git rebase --rebase-merges" can create
corrupted reflog entries like
e9c962f2ea0 HEAD@{8}: <binary>�: Merged in <branch> (pull request #4441)
This is due to a use-after-free bug that happens because
reflog_message() uses a static `struct strbuf` and is not called to
update the current reflog message stored in `ctx->reflog_message` when
creating the merge. This means `ctx->reflog_message` points to a stale
reflog message that has been freed by subsequent call to
reflog_message() by a command such as `reset` that used the return value
directly rather than storing the result in `ctx->reflog_message`.
Fix this by creating the reflog message nearer to where the commit is
created and storing it in a local variable which is passed as an
additional parameter to run_git_commit() rather than storing the message
in `struct replay_ctx`. This makes it harder to forget to call
`reflog_message()` before creating a commit and using a variable with a
narrower scope means that a stale value cannot carried across a from one
iteration of the loop to the next which should prevent any similar
use-after-free bugs in the future.
A existing test is modified to demonstrate that merges are now created
with the correct reflog message.
Reported-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 9 May 2025 20:16:51 +0000 (13:16 -0700)]
Merge branch 'master' of https://github.com/j6t/gitk
* 'master' of https://github.com/j6t/gitk:
gitk: add Tamil translation
gitk: limit PATH search to bare executable names
gitk: _search_exe is no longer needed
gitk: override $PATH search only on Windows
gitk: adjust indentation to match the style used in this script
Junio C Hamano [Fri, 9 May 2025 20:14:36 +0000 (13:14 -0700)]
Merge branch 'master' of https://github.com/j6t/git-gui
* 'master' of https://github.com/j6t/git-gui:
git-gui: treat the message template file as a built file
git-gui: heed core.commentChar/commentString
git-gui: po/README: update repository location and maintainer
Junio C Hamano [Thu, 8 May 2025 19:36:31 +0000 (12:36 -0700)]
Merge branch 'ps/mv-contradiction-fix'
"git mv a a/b dst" would ask to move the directory 'a' itself, as
well as its contents, in a single destination directory, which is
a contradicting request that is impossible to satisfy. This case is
now detected and the command errors out.
* ps/mv-contradiction-fix:
builtin/mv: convert assert(3p) into `BUG()`
builtin/mv: bail out when trying to move child and its parent