builtin/diff: explicitly set hash algo when there is no repo
The git-diff(1) command can be used outside repositories to diff two
files with each other. But even if there is no repository we will end up
hashing the files that we are diffing so that we can print the "index"
line:
```
diff --git a/a b/b
index 7898192..6178079 100644
--- a/a
+++ b/b
@@ -1 +1 @@
-a
+b
```
We implicitly use SHA1 to calculate the hash here, which is because
`the_repository` gets initialized with SHA1 during the startup routine.
We are about to stop doing this though such that `the_repository` only
ever has a hash function when it was properly initialized via a repo's
configuration.
To give full control to our users, we would ideally add a new switch to
git-diff(1) that allows them to specify the hash function when executed
outside of a repository. But for now, we only convert the code to make
this explicit such that we can stop setting the default hash algorithm
for `the_repository`.
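The idea of making the fallback explicit can be sketched as follows. This is only an illustration: `struct hash_algo_sketch`, `sketch_sha1`, `sketch_sha256` and `pick_hash_algo()` are hypothetical names, not Git's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash algorithm descriptor. */
struct hash_algo_sketch {
	const char *name;
	size_t hexsz; /* length of a hex-encoded object ID */
};

const struct hash_algo_sketch sketch_sha1 = { "sha1", 40 };
const struct hash_algo_sketch sketch_sha256 = { "sha256", 64 };

/*
 * Return the algorithm to hash with. `repo_algo` is NULL when no
 * repository was discovered; in that case the caller-supplied fallback
 * is used instead of relying on an implicit global default.
 */
const struct hash_algo_sketch *
pick_hash_algo(const struct hash_algo_sketch *repo_algo,
	       const struct hash_algo_sketch *fallback)
{
	assert(fallback);
	return repo_algo ? repo_algo : fallback;
}
```

A caller outside a repository would pass `NULL` and `&sketch_sha1`, which keeps the choice of SHA1 visible at the call site.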
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/bundle: abort "verify" early when there is no repository
Verifying a bundle requires us to have a repository. This is encoded in
`verify_bundle()`, which will return an error if there is no repository.
We call `open_bundle()` before we call `verify_bundle()` though, which
already performs some verifications even though we may ultimately abort
due to a missing repository.
This is problematic because `open_bundle()` already reads the bundle
header and verifies that it contains a properly formatted hash. When
there is no repository we have no clue what hash function to expect
though, so we always end up assuming SHA1 here, which may or may not be
correct. Furthermore, we are about to stop initializing `the_hash_algo`
when there is no repository, which will lead to segfaults.
Check early on whether we have a repository to fix this issue.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We access `the_hash_algo` in git-blame(1) before we have executed
`parse_options_start()`, at which point it may not be properly set up in
case we have no repository. This is fine for the most part because all the
call paths that lead to it (git-blame(1), git-annotate(1) as well as
git-pick-axe(1)) specify `RUN_SETUP` and thus require a repository.
There is one exception though, namely when passing `-h` to print the
help. Here we will access `the_hash_algo` even if there is no repo.
This works fine right now because `the_hash_algo` gets set up to point
to the SHA1 algorithm via `initialize_repository()`. But we're about to
stop doing this, and thus the code would lead to a `NULL` pointer
dereference.
Prepare the code for this and only access `the_hash_algo` after we are
sure that there is a proper repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/rev-parse: allow shortening to more than 40 hex characters
The `--short=` option for git-rev-parse(1) allows the user to specify
how many characters object IDs should be shortened to. The option is
broken though for SHA256 repositories because we set the maximum allowed
hash size to `the_hash_algo->hexsz` before we have even set up the repo.
Consequently, `the_hash_algo` will always be SHA1 and thus we truncate
every hash after at most 40 characters.
Fix this by accessing `the_hash_algo` only after we have set up the
repo.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dumb HTTP transport tries to read the remote HEAD reference by
downloading the "HEAD" file and then parsing it via `http_fetch_ref()`.
This function will either parse the file as an object ID in case it is
exactly `the_hash_algo->hexsz` long, or otherwise it will check whether
the reference starts with "ref: " and parse it as a symbolic ref.
This is broken when parsing detached HEADs of a remote SHA256 repository
because we never update `the_hash_algo` to the discovered remote object
hash. Consequently, `the_hash_algo` will always be the fallback SHA1
hash algorithm, which will cause us to fail parsing HEAD altogether when
it contains a SHA256 object ID.
Fix this issue by setting up `the_hash_algo` via `repo_set_hash_algo()`.
While at it, let's make the expected SHA1 fallback explicit in our code,
which also addresses an upcoming issue where we are going to remove the
SHA1 fallback for `the_hash_algo`.
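To illustrate why the expected hex length matters, here is a minimal sketch of the classification described above. `classify_head()` and `enum head_kind` are hypothetical names for this example and do not reflect the actual `http_fetch_ref()` implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum head_kind { HEAD_INVALID, HEAD_DETACHED, HEAD_SYMREF };

/*
 * Classify the contents of a downloaded "HEAD" file. `hexsz` is the hex
 * length of the remote's object hash (40 for SHA1, 64 for SHA256).
 */
enum head_kind classify_head(const char *buf, size_t hexsz)
{
	size_t len, i;

	assert(buf);
	len = strlen(buf);

	/* Ignore trailing line terminators. */
	while (len && (buf[len - 1] == '\n' || buf[len - 1] == '\r'))
		len--;

	if (len == hexsz) {
		for (i = 0; i < len; i++)
			if (!isxdigit((unsigned char)buf[i]))
				return HEAD_INVALID;
		return HEAD_DETACHED;
	}
	if (len > 5 && !strncmp(buf, "ref: ", 5))
		return HEAD_SYMREF;
	return HEAD_INVALID;
}
```

With `hexsz` stuck at 40, a 64-character SHA256 object ID matches neither branch and is rejected, which is the failure mode this commit addresses.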
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
attr: fix BUG() when parsing attrs outside of repo
If either the `--attr-source` option or the `GIT_ATTR_SOURCE` envvar is
set, then `compute_default_attr_source()` will try to look up the value
as a treeish. It is possible to hit that function while outside of a Git
repository though, for example when using `git grep --no-index`. In that
case, Git will hit a bug because we try to look up the main ref store
outside of a repository.
Handle the case gracefully and detect when we try to look up an attr
source without a repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `default_attr_source()` function lazily computes the attr source
supposedly once, only. This is done via a static variable `attr_source`
that contains the resolved object ID of the attr source's tree. If the
variable is the null object ID then we try to look up the attr source,
otherwise we skip over it.
This approach is flawed though: the variable will never be set to
anything else but the null object ID in case there is no attr source.
Consequently, we re-compute the information on every call. And in the
worst case, when we silently ignore bad trees, this will cause us to try
and look up the treeish every single time.
Improve this by introducing a separate variable `has_attr_source` to
track whether we already computed the attr source and, if so, whether we
have an attr source or not.
This also allows us to convert the `ignore_bad_attr_tree` to not be
static anymore as the code will only be executed once anyway.
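The caching scheme can be sketched like this. The helper names (`lookup_attr_source()`, `default_source()`) and the two globals are assumptions made for this example; the real code works on object IDs rather than strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inputs; the counter makes the caching observable. */
int lookup_calls;
const char *configured_source; /* NULL means "no attr source" */

static const char *lookup_attr_source(void)
{
	lookup_calls++;
	return configured_source;
}

/*
 * Return 1 if there is an attr source and 0 otherwise, storing it in
 * `out`. The lookup runs at most once, even when the answer is "none".
 */
int default_source(const char **out)
{
	/* -1: not computed yet, 0: no source, 1: source present */
	static int has_source = -1;
	static const char *source;

	assert(out);
	if (has_source < 0) {
		source = lookup_attr_source();
		has_source = source != NULL;
	}
	*out = source;
	return has_source;
}
```

Using the null value of `source` itself as the "not computed" marker would repeat the lookup on every call whenever no source exists, which is the flaw described above.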
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse-options-cb: only abbreviate hashes when hash algo is known
The `OPT__ABBREV()` option can be used to add an option that abbreviates
object IDs. When given a length longer than `the_hash_algo->hexsz`, then
it will instead set the length to that maximum length.
It may not always be guaranteed that we have `the_hash_algo` initialized
properly as the hash algorithm can only be set up after we have set up
`the_repository`. In that case, the hash would always be truncated to
the hex length of SHA1, which may not be what the user desires.
In practice it's not a problem as all commands that use `OPT__ABBREV()`
also have `RUN_SETUP` set and thus cannot work without a repository.
Consequently, both `the_repository` and `the_hash_algo` would be
properly set up.
Regardless of that, harden the code to not truncate the length when we
didn't set up a repository.
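A minimal sketch of the hardened clamping, under the assumption that a `hexsz` of 0 models "hash algorithm not known". `clamp_abbrev()` and the lower bound `MIN_ABBREV_SKETCH` are hypothetical and only serve this example.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_ABBREV_SKETCH 4 /* assumed lower bound for this sketch */

/*
 * Clamp a requested abbreviation length. The upper bound is applied
 * only when the hash algorithm, and thus `hexsz`, is known.
 */
int clamp_abbrev(int requested, size_t hexsz)
{
	assert(requested >= 0);
	if (requested < MIN_ABBREV_SKETCH)
		return MIN_ABBREV_SKETCH;
	if (hexsz && (size_t)requested > hexsz)
		return (int)hexsz;
	return requested;
}
```

With an unknown algorithm the requested length passes through unchanged, so a later consumer that does know the algorithm can still apply the proper limit.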
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While `validate_headref()` is only called from `is_git_directory()` in
"setup.c", it is currently implemented in "path.c". Move it over such
that it becomes clear that it is only really used during setup in order
to discover repositories.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
path: harden validation of HEAD with non-standard hashes
The `validate_headref()` function takes a path to a supposed "HEAD" file
and checks whether its format is something that we understand. It is
used as part of our repository discovery to check whether a specific
directory is a Git directory or not.
Part of the validation is a check for a detached HEAD that contains a
plain object ID. To do this validation we use `get_oid_hex()`, which
relies on `the_hash_algo`. At this point in time the hash algo cannot
yet be initialized though because we didn't yet read the Git config.
Consequently, it will always be the SHA1 hash algorithm.
In practice this works alright because `get_oid_hex()` only ends up
checking whether the prefix of the buffer is a valid object ID. And
because SHA1 is shorter than SHA256, the function will successfully
parse SHA256 object IDs, as well.
It is somewhat fragile though and not really the intent to only check
for SHA1. With this in mind, harden the code to use `get_oid_hex_any()`
to check whether the "HEAD" file parses as any known hash.
One might be tempted to tighten the check even further and fully
validate the file contents, not only the prefix. In practice though that
wouldn't make a lot of sense as it could be that the repository uses a
hash function that produces longer hashes than SHA256, but which the
current version of Git doesn't understand yet. We'd still want to detect
the repository as proper Git repository in that case, and we will fail
eventually with a proper error message that the hash isn't understood
when trying to set up the repository format.
It follows that we could just leave the current code intact, as in
practice the code change doesn't have any user visible impact. But it
also prepares us for `the_hash_algo` being unset when there is no
repository.
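The prefix check against any known hash can be sketched as below. `oid_prefix_hexsz()` is a hypothetical helper for illustration and is not how `get_oid_hex_any()` is implemented; it only assumes the two hex lengths mentioned above, 40 for SHA1 and 64 for SHA256.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Return the hex length of the longest known hash whose object ID forms
 * a prefix of `buf`, or 0 if `buf` does not start with an object ID.
 * Anything following the prefix is deliberately not inspected.
 */
size_t oid_prefix_hexsz(const char *buf)
{
	static const size_t known[] = { 64, 40 }; /* SHA256, SHA1 */
	size_t i, j;

	assert(buf);
	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
		/* isxdigit() rejects the NUL terminator, so we never overrun. */
		for (j = 0; j < known[i]; j++)
			if (!isxdigit((unsigned char)buf[j]))
				break;
		if (j == known[i])
			return known[i];
	}
	return 0;
}
```

Because only the prefix is checked, a HEAD written with a longer, not yet supported hash would still be recognized here and rejected later during repository format setup.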
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 7 May 2024 05:50:29 +0000 (22:50 -0700)]
Merge branch 'ps/the-index-is-no-more' into ps/undecided-is-not-necessarily-sha1
* ps/the-index-is-no-more:
repository: drop `initialize_the_repository()`
repository: drop `the_index` variable
builtin/clone: stop using `the_index`
repository: initialize index in `repo_init()`
builtin: stop using `the_index`
t/helper: stop using `the_index`
Junio C Hamano [Fri, 3 May 2024 15:34:27 +0000 (08:34 -0700)]
stop using HEAD for attributes in bare repository by default
With 23865355 (attr: read attributes from HEAD when bare repo,
2023-10-13), we started to use the HEAD tree as the default
attribute source in a bare repository. One argument for such a
behaviour is that it would make things like "git archive" run in
bare and non-bare repositories for the same commit consistent.
This change was merged to Git 2.43 but without an explicit mention
in its release notes.
It turns out that this change destroys performance of shallowly
cloning from a bare repository. As the "server" installations are
expected to be mostly bare, and "git pack-objects", which is the
core of driving the other side of "git clone" and "git fetch" wants
to see if a path is set not to delta with blobs from other paths via
the attribute system, the change forces the server side to traverse
the tree of the HEAD commit needlessly to find out whether each and
every path of the objects it sends out has the attribute that controls the
deltification. Given that (1) most projects do not configure such
an attribute, and (2) it is dubious for the server side to honor
such an end-user supplied attribute anyway, this was a poor choice
of the default.
To mitigate the current situation, let's revert the change that uses
the tree of HEAD in a bare repository by default as the attribute
source. This will help most people who have been happy with the
behaviour of Git 2.42 and before.
Two things to note:
* If you are stuck with a version of Git that is 2.43 or newer but
older than the release this fix appears in, you can explicitly
set the attr.tree configuration variable to point at an empty
tree object, i.e.
$ git config attr.tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
* If you like the behaviour we are reverting, you can explicitly
set the attr.tree configuration variable to HEAD, i.e.
$ git config attr.tree HEAD
The right fix for this is to optimize the code paths that allow
accesses to attributes in tree objects, but that is a much more
involved change and is left as a longer-term project, outside the
scope of this "first step" fix.
Junio C Hamano [Tue, 30 Apr 2024 21:49:45 +0000 (14:49 -0700)]
Merge branch 'js/for-each-repo-keep-going'
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out. Now it keeps going.
* js/for-each-repo-keep-going:
maintenance: running maintenance should not stop on errors
for-each-repo: optionally keep going on an error
Junio C Hamano [Tue, 30 Apr 2024 21:49:44 +0000 (14:49 -0700)]
Merge branch 'js/build-fuzz-more-often'
In addition to building the objects needed, try to link the objects
that are used in fuzzer tests, to make sure at least they build
without bitrot, in Linux CI runs.
* js/build-fuzz-more-often:
fuzz: link fuzz programs with `make all` on Linux
Junio C Hamano [Tue, 30 Apr 2024 21:49:42 +0000 (14:49 -0700)]
Merge branch 'jc/format-patch-rfc-more'
The "--rfc" option of "git format-patch" learned to take an
optional string value to be used in place of "RFC" to tweak the
"[PATCH]" on the subject header.
* jc/format-patch-rfc-more:
format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]
format-patch: allow --rfc to optionally take a value, like --rfc=WIP
Junio C Hamano [Tue, 30 Apr 2024 21:49:42 +0000 (14:49 -0700)]
Merge branch 'ds/format-patch-rfc-and-k'
The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".
* ds/format-patch-rfc-and-k:
format-patch: ensure that --rfc and -k are mutually exclusive
Junio C Hamano [Tue, 30 Apr 2024 21:49:41 +0000 (14:49 -0700)]
Merge branch 'pw/rebase-m-signoff-fix'
"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.
* pw/rebase-m-signoff-fix:
rebase -m: fix --signoff with conflicts
sequencer: store commit message in private context
sequencer: move current fixups to private context
sequencer: start removing private fields from public API
sequencer: always free "struct replay_opts"
Junio C Hamano [Thu, 25 Apr 2024 17:34:24 +0000 (10:34 -0700)]
Merge branch 'rj/add-i-leak-fix'
Leakfix.
* rj/add-i-leak-fix:
add: plug a leak on interactive_add
add-patch: plug a leak handling the '/' command
add-interactive: plug a leak in get_untracked_files
apply: plug a leak in apply_data
Since 5e47215080 (fuzz: add basic fuzz testing target., 2018-10-12), we
have compiled object files for the fuzz tests as part of the default
'make all' target. This helps prevent bit-rot in lesser-used parts of
the codebase, by making sure that incompatible changes are caught at
build time.
However, since we never linked the fuzzer executables, this did not
protect us from link-time errors. As of 8b9a42bf48 (fuzz: fix fuzz test
build rules, 2024-01-19), it's now possible to link the fuzzer
executables without using a fuzzing engine and a variety of
compiler-specific (and compiler-version-specific) flags, at least on
Linux. So let's add a platform-specific option in config.mak.uname to
link the executables as part of the default `make all` target.
Since linking the fuzzer executables without a fuzzing engine does not
require a C++ compiler, we can change the FUZZ_PROGRAMS build rule to
use $(CC) by default. This avoids compiler mis-match issues when
overriding $(CC) but not $(CXX). When we *do* want to actually link with
a fuzzing engine, we can set $(FUZZ_CXX). The build instructions in the
CI fuzz-smoke-test job and in the Makefile comment have been updated
accordingly.
While we're at it, we can consolidate some of the fuzzer build
instructions into one location in the Makefile.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
maintenance: running maintenance should not stop on errors
In https://github.com/microsoft/git/issues/623, it was reported that
maintenance stops on a missing repository, omitting the remaining
repositories that were scheduled for maintenance.
This is undesirable, as it should be a best effort type of operation.
It should still fail due to the missing repository, of course, but not
leave the non-missing repositories in unmaintained shapes.
Let's use `for-each-repo`'s shiny new `--keep-going` option that we just
introduced for that very purpose.
This change will be picked up when running `git maintenance start`,
which is run implicitly by `scalar reconfigure`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In https://github.com/microsoft/git/issues/623, it was reported that
the regularly scheduled maintenance stops if one repo in the middle of
the list was found to be missing.
This is undesirable, and points out a gap in the design of `git
for-each-repo`: We need a mode where that command does not stop on an
error, but continues to try running the specified command with the other
repositories.
Imitating the `--keep-going` option of GNU make, this commit teaches
`for-each-repo` the same trick: to continue with the operation on all
the remaining repositories in case there was a problem with one
repository, still setting the exit code to indicate an error occurred.
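The control flow can be sketched as follows. `for_each_repo_sketch()`, `visit_repo()` and `repos_visited` are hypothetical names for this example; the real command spawns a Git process per repository instead of calling a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*repo_fn)(const char *repo);

int repos_visited;

/* Sample callback: any path starting with "missing" counts as an error. */
int visit_repo(const char *repo)
{
	repos_visited++;
	return !strncmp(repo, "missing", 7) ? 1 : 0;
}

/*
 * Run `fn` for every repository. Without `keep_going` the first error
 * ends the loop; with it, all repositories are visited and the error
 * is still reflected in the return value.
 */
int for_each_repo_sketch(const char **repos, size_t nr, repo_fn fn,
			 int keep_going)
{
	int result = 0;
	size_t i;

	assert(fn);
	for (i = 0; i < nr; i++) {
		if (!fn(repos[i]))
			continue;
		if (!keep_going)
			return 1;
		result = 1;
	}
	return result;
}
```

For the list "a", "missing-b", "c" the default mode stops after two visits, whereas the keep-going mode visits all three and still reports failure.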
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "receive-pack" program (which responds to "git push") was not
converted to run "git maintenance --auto" when other codepaths that
used to run "git gc --auto" were updated, which has been corrected.
* ps/run-auto-maintenance-in-receive-pack:
builtin/receive-pack: convert to use git-maintenance(1)
run-command: introduce function to prepare auto-maintenance process
Junio C Hamano [Tue, 23 Apr 2024 22:05:56 +0000 (15:05 -0700)]
Merge branch 'pk/bisect-use-show'
When "git bisect" reports the commit it determined to be the
culprit, we used to show it in a format that does not honor common
UI tweaks, like log.date and log.decorate. The code has been
taught to use "git show" to follow more customizations.
* pk/bisect-use-show:
bisect: report the found commit with "show"
Junio C Hamano [Tue, 23 Apr 2024 18:52:41 +0000 (11:52 -0700)]
Merge branch 'mr/rerere-crash-fix'
When .git/rr-cache/ rerere database gets corrupted or rerere is fed to
work on a file with conflicted hunks resolved incompletely, the rerere
machinery got confused and segfaulted, which has been corrected.
* mr/rerere-crash-fix:
rerere: fix crashes due to unmatched opening conflict markers
Junio C Hamano [Tue, 23 Apr 2024 18:52:40 +0000 (11:52 -0700)]
Merge branch 'ps/missing-btmp-fix'
Git 2.44 introduced a regression that makes the updated code
barf in repositories with multi-pack index written by older
versions of Git, which has been corrected.
Junio C Hamano [Tue, 23 Apr 2024 18:52:39 +0000 (11:52 -0700)]
Merge branch 'dd/t9604-use-posix-timezones'
The cvsimport tests required that the platform understands
traditional timezone notations like CST6CDT, which has been
updated to work on those systems as long as they understand
POSIX notation with explicit tz transition dates.
* dd/t9604-use-posix-timezones:
t9604: Fix test for musl libc and new Debian
Junio C Hamano [Tue, 23 Apr 2024 18:52:39 +0000 (11:52 -0700)]
Merge branch 'rj/launch-editor-error-message'
Git writes a "waiting for your editor" message on an incomplete
line after launching an editor, and then appends another error
message on the same line if the editor errors out. It now clears
the "waiting for..." line before giving the error message.
* rj/launch-editor-error-message:
launch_editor: waiting message on error
Junio C Hamano [Tue, 23 Apr 2024 18:52:37 +0000 (11:52 -0700)]
Merge branch 'ps/reftable-block-iteration-optim'
The code to iterate over reftable blocks has seen some optimization
to reduce memory allocation and deallocation.
* ps/reftable-block-iteration-optim:
reftable/block: avoid copying block iterators on seek
reftable/block: reuse `zstream` state on inflation
reftable/block: open-code call to `uncompress2()`
reftable/block: reuse uncompressed blocks
reftable/reader: iterate to next block in place
reftable/block: move ownership of block reader into `struct table_iter`
reftable/block: introduce `block_reader_release()`
reftable/block: better grouping of functions
reftable/block: merge `block_iter_seek()` and `block_reader_seek()`
reftable/block: rename `block_reader_start()`
Junio C Hamano [Tue, 23 Apr 2024 17:52:34 +0000 (10:52 -0700)]
format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]
In the previous step, the "--rfc" option of "format-patch" learned
to take an optional string value to prepend to the subject prefix,
so that --rfc=WIP can give "[WIP PATCH]".
There may be cases in which the extra string wants to come after the
subject prefix. Extend the mechanism to allow "--rfc=-(WIP)" [*] to
signal that the extra string is to be appended instead of getting
prepended, resulting in "[PATCH (WIP)]".
In the documentation, discourage (ab)using "--rfc=-RFC" to say
"[PATCH RFC]" just to be different, when "[RFC PATCH]" is the norm.
[Footnote]
* The syntax takes inspiration from Perl's open syntax that opens
pipes "open fh, '|-', 'cmd'", where the dash signals "the other
stuff comes here".
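The prepend and append rules can be sketched as below. `format_subject_prefix()` is a hypothetical helper written for this illustration and makes no claim about how format-patch builds the string internally.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the bracketed subject prefix into `out`.
 *   rfc == NULL or ""  -> "[PATCH]"
 *   rfc == "WIP"       -> "[WIP PATCH]"   (prepended)
 *   rfc == "-(WIP)"    -> "[PATCH (WIP)]" (leading dash: appended)
 * Returns what snprintf() returns.
 */
int format_subject_prefix(char *out, size_t outsz, const char *prefix,
			  const char *rfc)
{
	assert(out && prefix);
	if (!rfc || !*rfc)
		return snprintf(out, outsz, "[%s]", prefix);
	if (*rfc == '-')
		return snprintf(out, outsz, "[%s %s]", prefix, rfc + 1);
	return snprintf(out, outsz, "[%s %s]", rfc, prefix);
}
```

Here `prefix` plays the role of the "--subject-prefix" value, so a custom prefix combines with either form in the same way.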
Junio C Hamano [Tue, 23 Apr 2024 17:52:33 +0000 (10:52 -0700)]
format-patch: allow --rfc to optionally take a value, like --rfc=WIP
With the "--rfc" option, we can tweak the "[PATCH]" (or whatever
string specified with the "--subject-prefix" option, instead of
"PATCH") that we prefix the title of the commit with into "[RFC
PATCH]", but some projects may want "[rfc PATCH]". Adding a new
option, e.g., "--rfc-lowercase", to support such need every time
somebody wants to use different strings would lead to insanity of
accumulating unbounded number of such options.
Allow an optional value specified for the option, so that users can
use "--rfc=rfc" (think of "--rfc" without value as a short-hand for
"--rfc=RFC") if they wanted to.
This can of course be (ab)used to make the prefix "[WIP PATCH]" by
passing "--rfc=WIP". Passing an empty string, i.e., "--rfc=", is
the same as "--no-rfc" to override an option given earlier on the
same command line.
Adam Johnson [Mon, 22 Apr 2024 10:28:14 +0000 (10:28 +0000)]
stash: fix "--staged" with binary files
"git stash --staged" errors out when given binary files, after saving the
stash.
This behaviour dates back to the addition of the feature in 41a28eb6c1
(stash: implement '--staged' option for 'push' and 'save', 2021-10-18).
Adding the "--binary" option of "diff-tree" fixes this. The "diff-tree" call
in stash_patch() also omits "--binary", but that is fine since binary files
cannot be selected interactively.
Helped-By: Jeff King <peff@peff.net>
Helped-By: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Adam Johnson <me@adamj.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
docs: improve changelog entry for `git pack-refs --auto`
The changelog entry for the new `git pack-refs --auto` mode only says
that the new flag is useful, but doesn't really say what it does. Add
some more information.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Sun, 21 Apr 2024 12:40:28 +0000 (14:40 +0200)]
don't report vsnprintf(3) error as bug
strbuf_addf() has been reporting a negative return value of vsnprintf(3)
as a bug since f141bd804d (Handle broken vsnprintf implementations in
strbuf, 2007-11-13). Other functions copied that behavior:
However, vsnprintf(3) can legitimately return a negative value if the
formatted output would be longer than INT_MAX. Stop accusing it of
being broken and just report the fact that formatting failed.
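As an illustration of treating a negative return value as an ordinary failure, consider this sketch. `format_alloc()` is a hypothetical helper, not one of the strbuf functions touched here, and it reports the failure by returning `NULL` rather than by dying.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly allocated string. Returns NULL if formatting
 * or allocation fails; the caller owns the result and must free() it.
 */
char *format_alloc(const char *fmt, ...)
{
	va_list ap, cp;
	char *buf = NULL;
	int len;

	assert(fmt);
	va_start(ap, fmt);
	va_copy(cp, ap);
	len = vsnprintf(NULL, 0, fmt, cp); /* measure only */
	va_end(cp);

	/*
	 * A negative length means formatting failed, e.g. because the
	 * output would exceed INT_MAX. That is an error to report, not
	 * evidence of a broken libc.
	 */
	if (len >= 0)
		buf = malloc((size_t)len + 1);
	if (buf && vsnprintf(buf, (size_t)len + 1, fmt, ap) != len) {
		free(buf);
		buf = NULL;
	}
	va_end(ap);
	return buf;
}
```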
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
format-patch: ensure that --rfc and -k are mutually exclusive
Fix a bug that allows the "--rfc" and "-k" options to be specified together
when "git format-patch" is executed, which was introduced in the commit
e0d7db7423a9 ("format-patch: --rfc honors what --subject-prefix sets").
Add a couple of additional tests to t4014, to cover additional cases of
the mutual exclusivity between different "git format-patch" options.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No matter how well someone configures their email tooling, understanding
who to send the patches to is something that must always be considered.
So discuss it first instead of at the end.
In the following commit we will clean up the (now redundant) discussion
about sending security patches to the Git Security mailing list.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use a dash ("git-contacts", not "git contacts") because the script is
not installed as part of "git" toolset. This also puts the script on
one line, which should make it easier to grep for with a loose search
query, such as
$ git grep git.contacts Documentation
Also add a footnote to describe where the script is located, to help
readers who may not be familiar with such "contrib" scripts (and how
they are not accessible with the usual "git <subcommand>" syntax).
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although we've had this script since 4d06402b1b (contrib: add
git-contacts helper, 2013-07-21), we don't mention it in our
introductory docs. Do so now.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebasing with "--signoff" the commit created by "rebase --continue"
after resolving conflicts or editing a commit fails to add the
"Signed-off-by:" trailer. This happens because the message from the
original commit is reused instead of the one that would have been used
if the sequencer had not stopped for the user interaction. The correct
message is stored in ctx->message and so with a couple of exceptions
this is written to rebase_path_message() when stopping for user
interaction instead. The exceptions are (i) "fixup" and "squash"
commands where the file is written by error_failed_squash() and (ii)
"edit" commands that are fast-forwarded where the original message is
still reused. The latter is safe because "--signoff" will never
fast-forward.
Note this introduces a change in behavior as the message file now
contains conflict comments. This is safe because commit_staged_changes()
passes an explicit cleanup flag when not editing the message and when
the message is being edited it will be cleaned up automatically. This
means the user now sees the same message comments in the editor with "rebase
--continue" as they would if they ran "git commit" themselves before
continuing the rebase. It also matches the behavior of "git
cherry-pick", "git merge" etc. which all list the files with merge
conflicts.
The tests are extended to check that all commits made after continuing a
rebase have a "Signed-off-by:" trailer. Sadly there are a couple of
leaks in apply.c which I've not been able to track down that mean this
test file is no longer leak free when testing "git rebase --apply
--signoff" with conflicts.
Reported-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sequencer: store commit message in private context
Add an strbuf to "struct replay_ctx" to hold the current commit
message. This does not change the behavior but it will allow us to fix a
bug with "git rebase --signoff" in the next commit. A future patch
series will use the changes here to avoid writing the commit message to
disc unless there are conflicts or the commit is being reworded.
The changes in do_pick_commit() are a mechanical replacement of "msgbuf"
with "ctx->message". In do_merge() the code to write commit message to
disc is factored out of the conditional now that both branches store the
message in the same buffer.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sequencer: start removing private fields from public API
"struct replay_opts" has a number of fields that are for internal
use. While they are marked as private having them in a public struct is
a distraction for callers and means that every time the internal details
are changed we have to recompile all the files that include sequencer.h
even though the public API is unchanged. This commit starts the process
of removing the private fields by adding an opaque pointer to a "struct
replay_ctx" to "struct replay_opts" and moving the "reflog_message"
member to the new private struct.
The sequencer currently updates the state files on disc each time it
processes a command in the todo list. This is an artifact of the
scripted implementation and makes the code hard to reason about as it is
not possible to get a complete view of the state in memory. In the
future we will add new members to "struct replay_ctx" to remedy this and
avoid writing state to disc unless the sequencer stops for user
interaction.
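The opaque pointer pattern can be sketched in a single file as below. The `_sketch` names are hypothetical and the structs are heavily reduced; in a real code base the first part would live in the public header and the second part in the implementation file.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What the public header would expose --- */
struct replay_ctx_sketch; /* opaque: layout hidden from callers */

struct replay_opts_sketch {
	int signoff;                   /* public option */
	struct replay_ctx_sketch *ctx; /* private state */
};

void opts_init(struct replay_opts_sketch *opts);
void opts_release(struct replay_opts_sketch *opts);

/* --- What only the implementation file would see --- */
struct replay_ctx_sketch {
	char *reflog_message;
};

void opts_init(struct replay_opts_sketch *opts)
{
	assert(opts);
	opts->signoff = 0;
	opts->ctx = calloc(1, sizeof(*opts->ctx));
}

void opts_release(struct replay_opts_sketch *opts)
{
	assert(opts);
	if (opts->ctx)
		free(opts->ctx->reflog_message);
	free(opts->ctx);
	opts->ctx = NULL;
}
```

Since callers only ever hold a pointer to the incomplete type, adding members to the private struct does not change the public struct's layout and needs no recompilation of its users.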
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sequencer_post_commit_cleanup() initializes an instance of "struct
replay_opts" but does not call replay_opts_release(). Currently this
does not leak memory because the code paths called don't allocate any of
the struct members. That will change in the next commit so add a call to
replay_opts_release() to prevent a memory leak in "git commit" that
breaks all of the leak free tests.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have dropped `the_index`, `initialize_the_repository()`
doesn't really do a lot anymore except for setting up the pointer for
`the_repository` and then calling `initialize_repository()`. The former
can be replaced by statically initializing the pointer though, which
basically makes this function moot.
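Static initialization of such a pointer can be sketched like this, with hypothetical `_sketch` names and a struct reduced to a single member:

```c
#include <assert.h>
#include <stddef.h>

struct repo_sketch {
	int initialized;
};

/* Backing storage is private to this file... */
static struct repo_sketch the_repo_storage;

/* ...and the global pointer is valid before any code runs. */
struct repo_sketch *the_repo_sketch = &the_repo_storage;

/* The remaining setup can then be done on any repository pointer. */
void initialize_repo_sketch(struct repo_sketch *repo)
{
	assert(repo);
	repo->initialized = 1;
}
```

Startup code then only needs `initialize_repo_sketch(the_repo_sketch)`, with no dedicated wrapper to assign the pointer first.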
Convert callers to instead call `initialize_repository(the_repository)`
and drop `initialize_the_repository()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All users of `the_index` have been converted to use either a custom
`struct index_state *` or the index provided by `the_repository`. We can
thus drop the globally-accessible declaration of this variable. In fact,
we can go further than that and drop `the_index` completely now and have
it be allocated dynamically in `initialize_repository()` as all the
other data structures in it are.
This concludes the quest to make Git `the_index` free, which started
with 4aab5b46f4 (Make read-cache.c "the_index" free., 2007-04-01).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git starts, one of the first things it will do is to call
`initialize_the_repository()`. This function sets up both the global
`the_repository` and `the_index` variables as required. Part of that
setup is also to set `the_repository.index = &the_index` so that the
index can be accessed via the repository.
When calling `repo_init()` on a repository though we set the complete
struct to all-zeroes, which will also cause us to unset the `index`
pointer. And as we don't re-initialize the index in that function, we
will end up with a `NULL` pointer here.
This has been fine until now because this function is only used to
create a new repository. git-init(1) does not access the index at all
after initializing the repository, whereas git-checkout(1) only uses
`the_index` directly. We are about to remove `the_index` though, which
will uncover this partially-initialized repository structure.
Refactor the code and create a common `initialize_repository()` function
that gets called from `repo_init()` and `initialize_the_repository()`.
This function sets up both the repository and the index as required.
Like this, we can easily special-case when `repo_init()` gets called
with `the_repository`.
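The problem and the fix can be sketched as follows. All names carry a `_sketch` suffix to mark them as hypothetical, and the structs are reduced to what the example needs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct index_sketch {
	int version;
};

struct repository_sketch {
	struct index_sketch *index;
};

/* Common setup shared by startup code and repo_init_sketch(). */
void initialize_repository_sketch(struct repository_sketch *repo)
{
	assert(repo);
	repo->index = calloc(1, sizeof(*repo->index));
}

int repo_init_sketch(struct repository_sketch *repo)
{
	assert(repo);
	/*
	 * Zeroing the struct also clears the `index` pointer, so it has
	 * to be set up again afterwards. This sketch does not free an
	 * index that was attached before the call.
	 */
	memset(repo, 0, sizeof(*repo));
	initialize_repository_sketch(repo);
	return repo->index ? 0 : -1;
}
```

Without the call after `memset()`, the caller would receive a repository whose `index` is `NULL`, which is the partially initialized state described above.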
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>