git.ipfire.org Git - thirdparty/git.git/log

builtin/config: move subcommand options into `cmd_config()`

Move the subcommand options as well as the `subcommand` variable into
`cmd_config()`. This reduces our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/config: move legacy mode into its own function

In `cmd_config()` we first try to parse the provided arguments as
subcommands and, if this is successful, call the respective functions
of that subcommand. Otherwise we continue with the "legacy" mode that
uses implicit actions and/or flags.

Disentangle this by moving the legacy mode into its own function. This
allows us to move the options into the respective functions and clearly
separates concerns.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/config: stop printing full usage on misuse

When invoking git-config(1) with a wrong set of arguments we end up
calling `usage_builtin_config()` after printing an error message that
says what was wrong. As that function ends up printing the full list of
options, which is quite long, the actual error message will be buried by
a wall of text. This makes it really hard to figure out what exactly
caused the error.

Furthermore, now that we have recently introduced subcommands, the usage
information may actually be misleading as we unconditionally print
options of the subcommand-less mode.

Fix both of these issues by just not printing the options at all
anymore. Instead, we call `usage()` that makes us report in a single
line what has gone wrong. This should be way more discoverable for our
users and addresses the inconsistency.

Furthermore, this change allow us to inline the options into the
respective functions that use them to parse the command line.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap: introduce `bitmap_writer_free()`

Now that there is clearer memory ownership around the bitmap_writer
structure, introduce a bitmap_writer_free() function that callers may
use to free any memory associated with their instance of the
bitmap_writer structure.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap-write.c: avoid uninitialized 'write_as' field

Prepare to free() memory associated with bitmapped_commit structs by
zero'ing the 'write_as' field.

In ideal cases, it is fine to do something like:

    for (i = 0; i < writer->selected_nr; i++) {
        struct bitmapped_commit *bc = &writer->selected[i];
        if (bc->write_as != bc->bitmap)
            ewah_free(bc->write_as);
        ewah_free(bc->bitmap);
    }

but if not all of the 'write_as' fields were populated (e.g., because
the packing_data given does not form a reachability closure), then we
may attempt to free uninitialized memory.

Guard against this by preemptively zero'ing this field just in case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap: drop unused `max_bitmaps` parameter

The `max_bitmaps` parameter in `bitmap_writer_select_commits()` was
introduced back in 7cc8f97108 (pack-objects: implement bitmap writing,
2013-12-21), making it original to the bitmap implementation in Git
itself.

When that patch was merged via 0f9e62e084 (Merge branch
'jk/pack-bitmap', 2014-02-27), its sole caller in builtin/pack-objects.c
passed a value of "-1" for `max_bitmaps`, indicating no limit.

Since then, the only other caller (in midx.c, added via c528e17966
(pack-bitmap: write multi-pack bitmaps, 2021-08-31)) also uses a value
of "-1" for `max_bitmaps`.

Since no callers have needed a finite limit for the `max_bitmaps`
parameter in the nearly decade that has passed since 0f9e62e084, let's
remove the parameter and any dead pieces of code connected to it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap: avoid use of static `bitmap_writer`

The pack-bitmap machinery uses a structure called 'bitmap_writer' to
collect the data necessary to write out .bitmap files. Since its
introduction in 7cc8f971085 (pack-objects: implement bitmap writing,
2013-12-21), there has been a single static bitmap_writer structure,
which is responsible for all bitmap writing-related operations.

In practice, this is OK, since we are only ever writing a single .bitmap
file in a single process (e.g., `git multi-pack-index write --bitmap`,
`git pack-objects --write-bitmap-index`, `git repack -b`, etc.).

However, having a single static variable makes issues like data
ownership unclear, when to free variables, what has/hasn't been
initialized unclear.

Refactor this code to be written in terms of a given bitmap_writer
structure instead of relying on a static global.

Note that this exposes the structure definition of the bitmap_writer at
the pack-bitmap.h level. We could work around this by, e.g., forcing
callers to declare their writers as:

    struct bitmap_writer *writer;
    bitmap_writer_init(&bitmap_writer);

and then declaring `bitmap_writer_init()` as taking in a double-pointer
like so:

    void bitmap_writer_init(struct bitmap_writer **writer);

which would avoid us having to expose the definition of the structure
itself. This patch takes a different approach, since future patches
(like for the ongoing pseudo-merge bitmaps work) will want to modify the
innards of this structure (in the previous example, via pseudo-merge.c).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

pack-bitmap-write.c: move commit_positions into commit_pos fields

In 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21), the
bitmapped_commit struct was introduced, including the 'commit_pos'
field, which has been unused ever since its introduction more than a
decade ago.

Instead, we have used the nearby `commit_positions` array leaving the
bitmapped_commit struct with an unused 4-byte field.

We could drop the `commit_pos` field as unused, and continue to store
the values in the auxiliary array. But we could also drop the array and
store the data for each bitmapped_commit struct inside of the structure
itself, which is what this patch does.

In any spot that we previously read `commit_positions[i]`, we can now
instead read `writer.selected[i].commit_pos`. There are a few spots that
need changing as a result:

  - write_selected_commits_v1() is a simple transformation, since we're
    just reading the field. As a result, the function no longer needs an
    explicit argument to pass the commit_positions array.

  - write_lookup_table() also no longer needs the explicit
    commit_positions array passed in as an argument. But it still needs
    to sort an array of indices into the writer.selected array to read
    them in commit_pos order, so table_cmp() is adjusted accordingly.

  - bitmap_writer_finish() no longer needs to allocate, populate, and
    free the commit_positions table. Instead, we can just write the data
    directly into each struct bitmapped_commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

object.h: add flags allocated by pack-bitmap.h

In commit 7cc8f971085 (pack-objects: implement bitmap writing,
2013-12-21) the NEEDS_BITMAP flag was introduced into pack-bitmap.h, but
no object flags allocation table existed at the time.

In 208acbfb82f (object.h: centralize object flag allocation, 2014-03-25)
when that table was first introduced, we never added the flags from
7cc8f971085, which has remained the case since.

Rectify this by including the flag bit used by pack-bitmap.h into the
centralized table in object.h.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Sync with Git 2.45.1

* tag 'v2.45.1': (42 commits)
  Git 2.45.1
  Git 2.44.1
  Git 2.43.4
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  ...

reftable/merged: adapt interface to allow reuse of iterators

Refactor the interfaces exposed by `struct reftable_merged_table` and
`struct merged_iter` such that they support iterator reuse. This is done
by separating initialization of the iterator and seeking on it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/stack: provide convenience functions to create iterators

There exist a bunch of call sites in the reftable backend that want to
create iterators for a reftable stack. This is rather convoluted right
now, where you always have to go via the merged table. And it is about
to become even more convoluted when we split up iterator initialization
and seeking in the next commit.

Introduce convenience functions that allow the caller to create an
iterator from a reftable stack directly without going through the merged
table. Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/reader: adapt interface to allow reuse of iterators

Refactor the interfaces exposed by `struct reftable_reader` and `struct
table_iterator` such that they support iterator reuse. This is done by
separating initialization of the iterator and seeking on it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/generic: adapt interface to allow reuse of iterators

Refactor the interfaces exposed by `struct reftable_table` and `struct
reftable_iterator` such that they support iterator reuse. This is done
by separating initialization of the iterator and seeking on it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/generic: move seeking of records into the iterator

Reftable iterators are created by seeking on the parent structure of a
corresponding record. For example, to create an iterator for the merged
table you would call `reftable_merged_table_seek_ref()`. Most notably,
it is not posible to create an iterator and then seek it afterwards.

While this may be a bit easier to reason about, it comes with two
significant downsides. The first downside is that the logic to find
records is split up between the parent data structure and the iterator
itself. Conceptually, it is more straight forward if all that logic was
contained in a single place, which should be the iterator.

The second and more significant downside is that it is impossible to
reuse iterators for multiple seeks. Whenever you want to look up a
record, you need to re-create the whole infrastructure again, which is
quite a waste of time. Furthermore, it is impossible to optimize seeks,
such as when seeking the same record multiple times.

To address this, we essentially split up the concerns properly such that
the parent data structure is responsible for setting up the iterator via
a new `init_iter()` callback, whereas the iterator handles seeks via a
new `seek()` callback. This will eventually allow us to call `seek()` on
the iterator multiple times, where every iterator can potentially
optimize for certain cases.

Note that at this point in time we are not yet ready to reuse the
iterators. This will be left for a future patch series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/merged: simplify indices for subiterators

When seeking on a merged table, we perform the seek for each of the
subiterators. If the subiterator has the desired record we add it to the
priority queue, otherwise we skip it and don't add it to the stack of
subiterators hosted by the merged table.

The consequence of this is that the index of the subiterator in the
merged table does not necessarily correspond to the index of it in the
merged iterator. Next to being potentially confusing, it also means that
we won't easily be able to re-seek the merged iterator because we have
no clear connection between both of the data structures.

Refactor the code so that the index stays the same in both structures.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/merged: split up initialization and seeking of records

To initialize a `struct merged_iter`, we need to seek all subiterators
to the wanted record and then add their results to the priority queue
used to sort the records. This logic is split up across two functions,
`merged_table_seek_record()` and `merged_iter_init()`. The scope of
these functions is somewhat weird though, where `merged_iter_init()` is
only responsible for adding the records of the subiterators to the
priority queue.

Clarify the scope of those functions such that `merged_iter_init()` is
only responsible for initializing the iterator's structure. Performing
the subiterator seeks are now part of `merged_table_seek_record()`.

This step is required to move seeking of records into the generic
`struct reftable_iterator` infrastructure.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/reader: set up the reader when initializing table iterator

All the seeking functions accept a `struct reftable_reader` as input
such that they can use the reader to look up the respective blocks.
Refactor the code to instead set up the reader as a member of `struct
table_iter` during initialization such that we don't have to pass the
reader on every single call.

This step is required to move seeking of records into the generic
`struct reftable_iterator` infrastructure.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/reader: inline `reader_seek_internal()`

We have both `reader_seek()` and `reader_seek_internal()`, where the
former function only exists so that we can exit early in case the given
table has no records of the sought-after type.

Merge these two functions into one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/reader: separate concerns of table iter and reftable reader

In "reftable/reader.c" we implement two different interfaces:

  - The reftable reader contains the logic to read reftables.

  - The table iterator is used to iterate through a single reftable read
    by the reader.

The way those two types are used in the code is somewhat confusing
though because seeking inside a table is implemented as if it was part
of the reftable reader, even though it is ultimately more of a detail
implemented by the table iterator.

Make the boundary between those two types clearer by renaming functions
that seek records in a table such that they clearly belong to the table
iterator's logic.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/reader: unify indexed and linear seeking

In `reader_seek_internal()` we either end up doing an indexed seek when
there is one or a linear seek otherwise. These two code paths are
disjunct without a good reason, where the indexed seek will cause us to
exit early.

Refactor the two code paths such that it becomes possible to share a bit
more code between them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/reader: avoid copying index iterator

When doing an indexed seek we need to walk down the multi-level index
until we finally hit a record of the desired indexed type. This loop
performs a copy of the index iterator on every iteration, which is both
hard to understand and completely unnecessary.

Refactor the code so that we use a single iterator to walk down the
indices, only.

Note that while this should improve performance, the improvement is
negligible in all but the most unreasonable repositories. This is
because the effect is only really noticeable when we have to walk down
many levels of indices, which is not something that a repository would
typically have. So the motivation for this change is really only about
readability.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/block: use `size_t` to track restart point index

The function `block_reader_restart_offset()` gets the offset of the
`i`th restart point. `i` is a signed integer though, which is certainly
not the correct type to track indices like this. Furthermore, both
callers end up passing a `size_t`.

Refactor the code to use a `size_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/reftable: allow configuring geometric factor

Allow configuring the geometric factor used by the auto-compaction
algorithm whenever a new table is appended to the stack of tables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable: make the compaction factor configurable

When auto-compacting, the reftable library packs references such that
the sizes of the tables form a geometric sequence. The factor for this
geometric sequence is hardcoded to 2 right now. We're about to expose
this as a config option though, so let's expose the factor via write
options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/reftable: allow disabling writing the object index

Besides the expected "ref" and "log" records, the reftable library also
writes "obj" records. These are basically a reverse mapping of object
IDs to their respective ref records so that it becomes efficient to
figure out which references point to a specific object. The motivation
for this data structure is the "uploadpack.allowTipSHA1InWant" config,
which allows a client to fetch any object by its hash that has a ref
pointing to it.

This reverse index is not used by Git at all though, and the expectation
is that most hosters nowadays use "uploadpack.allowAnySHA1InWant". It
may thus be preferable for many users to disable writing these optional
object indices altogether to safe some precious disk space.

Add a new config "reftable.indexObjects" that allows the user to disable
the object index altogether.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/reftable: allow configuring restart interval

Add a new option `reftable.restartInterval` that allows the user to
control the restart interval when writing reftable records used by the
reftable library.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable: use `uint16_t` to track restart interval

The restart interval can at most be `UINT16_MAX` as specified in the
technical documentation of the reftable format. Furthermore, it cannot
ever be negative. Regardless of that we use an `int` to track the
restart interval.

Change the type to use an `uint16_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs/reftable: allow configuring block size

Add a new option `reftable.blockSize` that allows the user to control
the block size used by the reftable library.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/dump: support dumping a table's block structure

We're about to introduce new configs that will allow users to have more
control over how exactly reftables are written. To verify that these
configs are effective we will need to take a peak into the actual blocks
written by the reftable backend.

Introduce a new mode to the dumping logic that prints out the block
structure. This logic can be invoked via `test-tool dump-reftables -b`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/writer: improve error when passed an invalid block size

The reftable format only supports block sizes up to 16MB. When the
writer is being passed a value bigger than that it simply calls
abort(3P), which isn't all that helpful due to the lack of a proper
error message.

Improve this by calling `BUG()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable/writer: drop static variable used to initialize strbuf

We have a static variable in the reftable writer code that is merely
used to initialize the `last_key` of the writer. Convert the code to
instead use `strbuf_init()` and drop the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable: pass opts as constant pointer

We sometimes pass the refatble write options as value and sometimes as a
pointer. This is quite confusing and makes the reader wonder whether the
options get modified sometimes.

In fact, `reftable_new_writer()` does cause the caller-provided options
to get updated when some values aren't set up. This is quite unexpected,
but didn't cause any harm until now.

Adapt the code so that we do not modify the caller-provided values
anymore. While at it, refactor the code to code to consistently pass the
options as a constant pointer to clarify that the caller-provided opts
will not ever get modified.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

reftable: consistently refer to `reftable_write_options` as `opts`

Throughout the reftable library the `reftable_write_options` are
sometimes referred to as `cfg` and sometimes as `opts`. Unify these to
consistently use `opts` to avoid confusion.

While at it, touch up the coding style a bit by removing unneeded braces
around one-line statements and newlines between variable declarations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

documentation: git-update-index: add --show-index-version to synopsis

In 606e088d5d (update-index: add --show-index-version, 2023-09-12), we
added the new '--show-index-version' option to 'git-update-index' and
documented it, but forgot to add it to the synopsis section.

Add '--show-index-version' to the synopsis of 'git-update-index'.

Signed-off-by: Dov Murik <dov.murik@linux.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

fetch-pack: remove unused 'struct loose_object_iter'

'struct loose_object_iter' in fetch-pack.c is unused since commit
97b2fa08 (fetch-pack: drop custom loose object cache, 2018-11-12).

Remove it.

Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ps/undecided-is-not-necessarily-sha1' into jc/undecided-is-not-necessarily-sha1-fix

* ps/undecided-is-not-necessarily-sha1:
  repository: stop setting SHA1 as the default object hash
  oss-fuzz/commit-graph: set up hash algorithm
  builtin/shortlog: don't set up revisions without repo
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/bundle: abort "verify" early when there is no repository
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/rev-parse: allow shortening to more than 40 hex characters
  remote-curl: fix parsing of detached SHA256 heads
  attr: fix BUG() when parsing attrs outside of repo
  attr: don't recompute default attribute source
  parse-options-cb: only abbreviate hashes when hash algo is known
  path: move `validate_headref()` to its only user
  path: harden validation of HEAD with non-standard hashes

The third batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'jc/git-gui-maintainer-update'

* jc/git-gui-maintainer-update:
SubmittingPatches: welcome the new maintainer of git-gui part

Merge branch 'fa/p4-error'

P4 update.

* fa/p4-error:
git-p4: show Perforce error to the user

Merge branch 'ps/ci-fuzzers-at-gitlab-fix'

CI fix.

* ps/ci-fuzzers-at-gitlab-fix:
gitlab-ci: fix installing dependencies for fuzz smoke tests
gitlab-ci: add smoke test for fuzzers

Merge branch 'jk/ci-test-with-jgit-fix'

CI fix.

* jk/ci-test-with-jgit-fix:
ci: update coverity runs_on_pool reference

Merge branch 'jk/ci-macos-gcc13-fix'

CI fix.

* jk/ci-macos-gcc13-fix:
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable

Merge branch 'jc/no-default-attr-tree-in-bare'

Git 2.43 started using the tree of HEAD as the source of attributes
in a bare repository, which has severe performance implications.
For now, revert the change, without ripping out a more explicit
support for the attr.tree configuration variable.

* jc/no-default-attr-tree-in-bare:
stop using HEAD for attributes in bare repository by default

Merge branch 'ps/ci-python-2-deprecation'

Unbreak CI jobs so that we do not attempt to use Python 2 that has
been removed from the platform.

* ps/ci-python-2-deprecation:
ci: fix Python dependency on Ubuntu 24.04

Merge branch 'tb/attr-limits'

The maximum size of attribute files is enforced more consistently.

* tb/attr-limits:
attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf()

Merge branch 'jc/test-workaround-broken-mv'

Tests that try to corrupt in-repository files in chunked format did
not work well on macOS due to its broken "mv", which has been
worked around.

* jc/test-workaround-broken-mv:
t/lib-chunk: work around broken "mv" on some vintage of macOS

Merge branch 'ma/win32-unix-domain-socket'

Build fix.

* ma/win32-unix-domain-socket:
win32: fix building with NO_UNIX_SOCKETS

compat/regex: fix argument order to calloc(3)

Windows compiler suddenly started complaining that calloc(3) takes
its arguments in <nmemb, size> order. Indeed, there are many calls
that has their arguments in a _wrong_ order.

Fix them all.

A sample breakage can be seen at

https://github.com/git/git/actions/runs/9046793153/job/24857988702#step:4:272

Signed-off-by: Junio C Hamano <gitster@pobox.com>

SubmittingPatches: welcome the new maintainer of git-gui part

Signed-off-by: Junio C Hamano <gitster@pobox.com>

git-gui: note the new maintainer

Pratyush Yadev has relinquished, and Johannes Sixt has taken over,
maintainership of Git-GUI.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>

Merge branch 'ps/config-subcommands' into ps/builtin-config-cleanup

* ps/config-subcommands:
  builtin/config: display subcommand help
  builtin/config: introduce "edit" subcommand
  builtin/config: introduce "remove-section" subcommand
  builtin/config: introduce "rename-section" subcommand
  builtin/config: introduce "unset" subcommand
  builtin/config: introduce "set" subcommand
  builtin/config: introduce "get" subcommand
  builtin/config: introduce "list" subcommand
  builtin/config: pull out function to handle `--null`
  builtin/config: pull out function to handle config location
  builtin/config: use `OPT_CMDMODE()` to specify modes
  builtin/config: move "fixed-value" option to correct group
  builtin/config: move option array around
  config: clarify memory ownership when preparing comment strings

SubmittingPatches: extend the "flow" section

Explain a full lifecycle of a patch series upfront, so that it is
clear when key decisions to "accept" a series is made and how a new
patch series becomes a part of a new release.

Fold the "you need to monitor the progress of your topic" section
into the primary "patch lifecycle" section, as that is one of the
things the patch submitter is responsible for.  It is not like "I
sent a patch and responded to review messages, and now it is their
problem".  They need to see their patch through the patch life
cycle.

Earlier versions of this document outlined a slightly different
patch flow in an idealized world, where the original submitter
gathered agreements from the participants of the discussion and sent
the final "we all agreed that this is the good version--please
apply" patches to the maintainer.  In practice, this almost never
happened.  Instead, describe what flow was used in practice for the
past decade that worked well for us.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

SubmittingPatches: move the patch-flow section earlier

Before discussing the small details of how the patch gets sent, we'd
want to give people a larger picture first to set the expectation
straight. The existing patch-flow section covers materials that are
suitable for that purpose, so move it to the beginning of the
document. We'll update the contents of the section to clarify what
goal the patch submitter is working towards in the next step, which
will make it easier to understand the reason behind the individual
rules presented in latter parts of the document.

This step only moves two sections (patch-flow and patch-status)
without changing their contents, except that their section levels
are demoted from Level 1 to Level 2 to fit better in the document
structure at their new place.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: stop installing "gcc-13" for osx-gcc

Our osx-gcc job explicitly asks to install gcc-13. But since the GitHub
runner image already comes with gcc-13 installed, this is mostly doing
nothing (or in some cases it may install an incremental update over the
runner image). But worse, it recently started causing errors like:

    ==> Fetching gcc@13
    ==> Downloading https://ghcr.io/v2/homebrew/core/gcc/13/blobs/sha256:fb2403d97e2ce67eb441b54557cfb61980830f3ba26d4c5a1fe5ecd0c9730d1a
    ==> Pouring gcc@13--13.2.0.ventura.bottle.tar.gz
    Error: The `brew link` step did not complete successfully
    The formula built, but is not symlinked into /usr/local
    Could not symlink bin/c++-13
    Target /usr/local/bin/c++-13
    is a symlink belonging to gcc. You can unlink it:
      brew unlink gcc

which cause the whole CI job to bail.

I didn't track down the root cause, but I suspect it may be related to
homebrew recently switching the "gcc" default to gcc-14. And it may even
be fixed when a new runner image is released. But if we don't need to
run brew at all, it's one less thing for us to worry about.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: avoid bare "gcc" for osx-gcc job

On macOS, a bare "gcc" (without a version) will invoke a wrapper for
clang, not actual gcc. Even when gcc is installed via homebrew, that
only provides version-specific links in /usr/local/bin (like "gcc-13"),
and never a version-agnostic "gcc" wrapper.

As far as I can tell, this has been the case for a long time, and this
osx-gcc job has largely been doing nothing. We can point it at "gcc-13",
which will pick up the homebrew-installed version.

The fix here is specific to the github workflow file, as the gitlab one
does not have a matching job.

It's a little unfortunate that we cannot just ask for the latest version
of gcc which homebrew provides, but as far as I can tell there is no
easy alias (you'd have to find the highest number gcc-* in
/usr/local/bin yourself).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: drop mention of BREW_INSTALL_PACKAGES variable

The last user of this variable went away in 4a6e4b9602 (CI: remove
Travis CI support, 2021-11-23), so it's doing nothing except making it
more confusing to find out which packages _are_ installed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

ci: update coverity runs_on_pool reference

Commit 2d65e5b6a6 (ci: rename "runs_on_pool" to "distro", 2024-04-12)
renamed this variable for the main CI workflow, as well as in the ci/
scripts. Because the coverity workflow also relies on those scripts to
install dependencies, it needs to be updated, too. Without this patch,
the coverity build fails because we lack libcurl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitlab-ci: fix installing dependencies for fuzz smoke tests

There was a semantic merge conflict between 9cdeb34b96 (ci: merge
scripts which install dependencies, 2024-04-12), which has merged
"ci/install-docker-dependencies.sh" into "ci/install-dependencies.sh"
and c7b228e000 (gitlab-ci: add smoke test for fuzzers, 2024-04-29),
which has added a new fuzz smoke test job that makes use of the
now-removed script.

Adapt the job to instead use the new script to install dependencies.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'ps/ci-python-2-deprecation' into ps/ci-fuzzers-at-gitlab-fix

* ps/ci-python-2-deprecation:
ci: fix Python dependency on Ubuntu 24.04

Merge branch 'ps/ci-enable-minimal-fuzzers-at-gitlab' into ps/ci-fuzzers-at-gitlab-fix

* ps/ci-enable-minimal-fuzzers-at-gitlab:
gitlab-ci: add smoke test for fuzzers

git-p4: show Perforce error to the user

During "git p4 clone" if p4 process returns an error from the server,
it will store the message in the 'err' variable. Then it will send a
text command "die-now" to git-fast-import. However, git-fast-import
raises an exception: "fatal: Unsupported command: die-now" and err is
never displayed. This patch ensures that err is shown to the end user.

Signed-off-by: Fahad Alrashed <fahad@keylock.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

The second batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Merge branch 'bb/rgb-12-bit-colors'

The color parsing code learned to handle 12-bit RGB colors, spelled
as "#RGB" (in addition to "#RRGGBB" that is already supported).

* bb/rgb-12-bit-colors:
  color: add support for 12-bit RGB colors
  t/t4026-color: add test coverage for invalid RGB colors
  t/t4026-color: remove an extra double quote character

Merge branch 'rs/diff-parseopts-cleanup'

Code clean-up to remove code that is now a noop.

* rs/diff-parseopts-cleanup:
diff-lib: stop calling diff_setup_done() in do_diff_cache()

Merge branch 'dk/zsh-git-repo-path-fix'

Command line completion support for zsh (in contrib/) has been
updated to stop exposing internal state to end-user shell
interaction.

* dk/zsh-git-repo-path-fix:
completion: zsh: stop leaking local cache variable

Merge branch 'bc/zsh-compatibility'

zsh can pretend to be a normal shell pretty well except for some
glitches that we tickle in some of our scripts. Work them around
so that "vimdiff" and our test suite works well enough with it.

* bc/zsh-compatibility:
vimdiff: make script and tests work with zsh
t4046: avoid continue in &&-chain for zsh

Merge branch 'rj/add-p-typo-reaction'

When the user responds to a prompt given by "git add -p" with an
unsupported command, list of available commands were given, which
was too much if the user knew what they wanted to type but merely
made a typo.  Now the user gets a much shorter error message.

* rj/add-p-typo-reaction:
  add-patch: response to unknown command
  add-patch: do not show UI messages on stderr

Merge branch 'jt/doc-submitting-rerolled-series'

Developer doc update.

* jt/doc-submitting-rerolled-series:
doc: clarify practices for submitting updated patch versions

Merge branch 'rh/complete-symbolic-ref'

Command line completion script (in contrib/) learned to complete
"git symbolic-ref" a bit better (you need to enable plumbing
commands to be completed with GIT_COMPLETION_SHOW_ALL_COMMANDS).

* rh/complete-symbolic-ref:
  completion: add docs on how to add subcommand completions
  completion: improve docs for using __git_complete
  completion: add 'symbolic-ref'

Merge branch 'ps/the-index-is-no-more'

The singleton index_state instance "the_index" has been eliminated
by always instantiating "the_repository" and replacing references
to "the_index"  with references to its .index member.

* ps/the-index-is-no-more:
  repository: drop `initialize_the_repository()`
  repository: drop `the_index` variable
  builtin/clone: stop using `the_index`
  repository: initialize index in `repo_init()`
  builtin: stop using `the_index`
  t/helper: stop using `the_index`

Merge branch 'bc/credential-scheme-enhancement'

The credential helper protocol, together with the HTTP layer, have
been enhanced to support authentication schemes different from
username & password pair, like Bearer and NTLM.

* bc/credential-scheme-enhancement:
  credential: add method for querying capabilities
  credential-cache: implement authtype capability
  t: add credential tests for authtype
  credential: add support for multistage credential rounds
  t5563: refactor for multi-stage authentication
  docs: set a limit on credential line length
  credential: enable state capability
  credential: add an argument to keep state
  http: add support for authtype and credential
  docs: indicate new credential protocol fields
  credential: add a field called "ephemeral"
  credential: gate new fields on capability
  credential: add a field for pre-encoded credentials
  http: use new headers for each object request
  remote-curl: reset headers on new request
  credential: add an authtype field

Merge branch 'ps/ci-test-with-jgit'

Tests to ensure interoperability between reftable written by jgit
and our code have been added and enabled in CI.

* ps/ci-test-with-jgit:
  t0612: add tests to exercise Git/JGit reftable compatibility
  t0610: fix non-portable variable assignment
  t06xx: always execute backend-specific tests
  ci: install JGit dependency
  ci: make Perforce binaries executable for all users
  ci: merge scripts which install dependencies
  ci: fix setup of custom path for GitLab CI
  ci: merge custom PATH directories
  ci: convert "install-dependencies.sh" to use "/bin/sh"
  ci: drop duplicate package installation for "linux-gcc-default"
  ci: skip sudo when we are already root
  ci: expose distro name in dockerized GitHub jobs
  ci: rename "runs_on_pool" to "distro"

Merge branch 'ps/reftable-write-optim'

Code to write out reftable has seen some optimization and
simplification.

* ps/reftable-write-optim:
  reftable/block: reuse compressed array
  reftable/block: reuse zstream when writing log blocks
  reftable/writer: reset `last_key` instead of releasing it
  reftable/writer: unify releasing memory
  reftable/writer: refactorings for `writer_flush_nonempty_block()`
  reftable/writer: refactorings for `writer_add_record()`
  refs/reftable: don't recompute committer ident
  reftable: remove name checks
  refs/reftable: skip duplicate name checks
  refs/reftable: perform explicit D/F check when writing symrefs
  refs/reftable: fix D/F conflict error message on ref copy

scalar: avoid segfault in reconfigure --all

During the latest v2.45.0 update, 'scalar reconfigure --all' started to
segfault on my machine. Breaking it down via the debugger, it was
faulting on a NULL reference to the_hash_algo, which is a macro pointing
to the_repository->hash_algo.

In my case, this is due to one of my repositories having a detached HEAD,
which requires get_oid_hex() to parse that the HEAD reference is valid.
Another way to cause a failure is to use the "includeIf.onbranch" config
key, which will lead to a BUG() statement.

My first inclination was to try to refactor cmd_reconfigure() to execute
'git for-each-repo' instead of this loop. In addition to the difficulty
of executing 'scalar reconfigure' within 'git for-each-repo', it would
be difficult to perform the clean-up logic for non-existent repos if we
relied on that child process.

Instead, I chose to move the temporary repo to be within the loop and
reinstate the_repository to its old value after we are done performing
logic on the current array item.

Add tests to t9210-scalar.sh to test 'scalar reconfigure --all' with
multiple registered repos. There are two different ways that the old
use of the_repository could trigger bugs. These issues are being solved
independently to be more careful about the_repository being
uninitialized, but the change in this patch around the use of
the_repository is still a good safety precaution.

Co-authored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

t0018: two small fixes

Even though the three tests that were recently added started their
here-doc with "<<-\EOF", it did not take advantage of that and
instead wrote the here-doc payload abut to the left edge.  Use a tabs
to indent these lines.

More importantly, because these all hardcode the expected output,
which contains the current branch name, they break the CI job that
uses 'main' as the default branch name.

Use

    GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=trunk
    export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

between the test_description line and ". ./test-lib.sh" line to
force the initial branch name to 'trunk' and expect it to show in
the output.

Signed-off-by: Junio C Hamano <gitster@pobox.com>

Documentation/git-merge-tree.txt: document -X

Add an entry in the 'merge-tree' builtin documentation for
-X/--strategy-option (added in 6a4c9e7b32 (merge-tree: add -X strategy
option, 2023-09-24)). The same option is documented for 'merge', 'rebase',
'revert', etc. in their respective Documentation/ files, so let's do the
same for 'merge-tree'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: remove functions without ref store

The preceding commit has rewritten all callers of ref-related functions
to use the equivalents that accept a `struct ref_store`. Consequently,
the respective variants without the ref store are now unused. Remove
them.

There are likely patch series in-flight that use the now-removed
functions. To help the authors, the old implementations have been added
to "refs.c" in an ifdef'd section as a reference for how to migrate each
of the respective callers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cocci: apply rules to rewrite callers of "refs" interfaces

Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

cocci: introduce rules to transform "refs" to pass ref store

Most of the functions in "refs.h" have two flavors: one that accepts a
`struct ref_store`, and one that figures it out via `the_repository`.
As part of the libification efforts we want to get rid of the latter
variant and stop relying on `the_repository` altogether.

Introduce a set of Coccinelle rules that transform callers of the "refs"
interfaces to pass a `struct ref_store`. These rules are not yet applied
by this patch so that it can be reviewed standalone more easily. This
will be done in the next patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: add `exclude_patterns` parameter to `for_each_fullref_in()`

The `for_each_fullref_in()` function is supposedly the ref-store-less
equivalent of `refs_for_each_fullref_in()`, but the latter has gained a
new parameter `exclude_patterns` over time. Bring these two functions
back in sync again by adding the parameter to the former function, as
well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: introduce missing functions that accept a `struct ref_store`

While most of the functions in "refs.h" have a variant that accepts a
`struct ref_store`, some don't. Callers of these functions are thus
forced to implicitly rely on `the_repository` to figure out the ref
store that is to be used.

Introduce those missing functions to address this shortcoming.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/tag: add --trailer option

git-tag supports interpreting trailers from an annotated tag message,
using --list --format="%(trailers)". However, the available methods to
add a trailer to a tag message (namely -F or --editor) are not as
ergonomic.

In a previous patch, we moved git-commit's implementation of its
--trailer option to the trailer.h API. Let's use that new function to
teach git-tag the same --trailer option, emulating as much of
git-commit's behavior as much as possible.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/commit: refactor --trailer logic

git-commit adds user trailers to the commit message by passing its
`--trailer` arguments to a child process running `git-interpret-trailers
--in-place`. This logic is broadly useful, not just for git-commit but
for other commands constructing message bodies (e.g. git-tag).

Let's move this logic from git-commit to a new function in the trailer
API, so that it can be re-used in other commands.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/commit: use ARGV macro to collect trailers

Replace git-commit's callback for --trailer with the standard
OPT_PASSTHRU_ARGV macro. The callback only adds its values to a strvec
and sanity-checks that `unset` is always false; both of these are
already implemented in the parse-option API.

Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: remove `create_symref` and associated dead code

In the previous commits, we converted `refs_create_symref()` to utilize
transactions to perform symref updates. Earlier `refs_create_symref()`
used `create_symref()` to do the same.

We can now remove `create_symref()` and any code associated with it
which is no longer used. We remove `create_symref()` code from all the
reference backends and also remove it entirely from the `ref_storage_be`
struct.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: rename `refs_create_symref()` to `refs_update_symref()`

The `refs_create_symref()` function is used to update/create a symref.
But it doesn't check the old target of the symref, if existing. It force
updates the symref. In this regard, the name `refs_create_symref()` is a
bit misleading. So let's rename it to `refs_update_symref()`. This is
akin to how 'git-update-ref(1)' also allows us to create apart from
update.

While we're here, rename the arguments in the function to clarify what
they actually signify and reduce confusion.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: use transaction in `refs_create_symref()`

The `refs_create_symref()` function updates a symref to a given new
target. To do this, it uses a ref-backend specific function
`create_symref()`.

In the previous commits, we introduced symref support in transactions.
This means we can now use transactions to perform symref updates and
don't have to resort to `create_symref()`. Doing this allows us to
remove and cleanup `create_symref()`, which we will do in the following
commit.

Modify the expected error message for a test in
't/t0610-reftable-basics.sh', since the error is now thrown from
'refs.c'. This is because in transactional updates, F/D conflicts are
caught before we're in the reference backend.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: add support for transactional symref updates

The reference backends currently support transactional reference
updates. While this is exposed to users via 'git-update-ref' and its
'--stdin' mode, it is also used internally within various commands.

However, we do not support transactional updates of symrefs. This commit
adds support for symrefs in both the 'files' and the 'reftable' backend.

Here, we add and use `ref_update_has_null_new_value()`, a helper
function which is used to check if there is a new_value in a reference
update. The new value could either be a symref target `new_target` or a
OID `new_oid`.

We also add another common function `ref_update_check_old_target` which
will be used to check if the update's old_target corresponds to a
reference's current target.

Now transactional updates (verify, create, delete, update) can be used
for:
- regular refs
- symbolic refs
- conversion of regular to symbolic refs and vice versa

This also allows us to expose this to users via new commands in
'git-update-ref' in the future.

Note that a dangling symref update does not record a new reflog entry,
which is unchanged before and after this commit.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: move `original_update_refname` to 'refs.c'

The files backend and the reftable backend implement
`original_update_refname` to obtain the original refname of the update.
Move it out to 'refs.c' and only expose it internally to the refs
library. This will be used in an upcoming commit to also introduce
another common functionality for the two backends.

We also rename the function to `ref_update_original_update_refname` to
keep it consistent with the upcoming other 'ref_update_*' functions
that'll be introduced.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: support symrefs in 'reference-transaction' hook

The 'reference-transaction' hook runs whenever a reference update is
made to the system. In a previous commit, we added the `old_target` and
`new_target` fields to the `reference_transaction_update()`. In
following commits we'll also add the code to handle symref's in the
reference backends.

Support symrefs also in the 'reference-transaction' hook, by modifying
the current format:
<old-oid> SP <new-oid> SP <ref-name> LF
to be be:
<old-value> SP <new-value> SP <ref-name> LF
where for regular refs the output would not change and remain the same.
But when either 'old-value' or 'new-value' is a symref, we print the ref
as 'ref:<ref-target>'.

This does break backward compatibility, but the 'reference-transaction'
hook's documentation always stated that support for symbolic references
may be added in the future.

We do not add any tests in this commit since there is no git command
which activates this flow, in an upcoming commit, we'll start using
transaction based symref updates as the default, we'll add tests there
for the hook too.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

files-backend: extract out `create_symref_lock()`

The function `create_symref_locked()` creates a symref by creating a
'<symref>.lock' file and then committing the symref lock, which creates
the final symref.

Extract the early half of `create_symref_locked()` into a new helper
function `create_symref_lock()`. Because the name of the new function is
too similar to the original, rename the original to
`create_and_commit_symref()` to avoid confusion.

The new function `create_symref_locked()` can be used to create the
symref lock in a separate step from that of committing it. This allows
to add transactional support for symrefs, where the lock would be
created in the preparation step and the lock would be committed in the
finish step.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

refs: accept symref values in `ref_transaction_update()`

The function `ref_transaction_update()` obtains ref information and
flags to create a `ref_update` and add them to the transaction at hand.

To extend symref support in transactions, we need to also accept the
old and new ref targets and process it. This commit adds the required
parameters to the function and modifies all call sites.

The two parameters added are `new_target` and `old_target`. The
`new_target` is used to denote what the reference should point to when
the transaction is applied. Some functions allow this parameter to be
NULL, meaning that the reference is not changed.

The `old_target` denotes the value the reference must have before the
update. Some functions allow this parameter to be NULL, meaning that the
old value of the reference is not checked.

We also update the internal function `ref_transaction_add_update()`
similarly to take the two new parameters.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

repository: stop setting SHA1 as the default object hash

During the startup of Git, we call `initialize_the_repository()` to set
up `the_repository` as well as `the_index`. Part of this setup is also
to set the default object hash of the repository to SHA1. This has the
effect that `the_hash_algo` is getting initialized to SHA1, as well.
This default hash algorithm eventually gets overridden by most Git
commands via `setup_git_directory()`, which also detects the actual hash
algorithm used by the repository.

There are some commands though that don't access a repository at all, or
at a later point only, and thus retain the default hash function for
some amount of time. As some of the the preceding commits demonstrate,
this can lead to subtle issues when we access `the_hash_algo` when no
repository has been set up.

Address this issue by dropping the set up of the default hash algorithm
completely. The effect of this is that `the_hash_algo` will map to a
`NULL` pointer and thus cause Git to crash when something tries to
access the hash algorithm without it being properly initialized. It thus
forces all Git commands to explicitly set up the hash algorithm in case
there is no repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

oss-fuzz/commit-graph: set up hash algorithm

Our fuzzing setups don't work in a proper repository, but only use the
in-memory configured `the_repository`. Consequently, we never go through
the full repository setup procedures and thus do not set up the hash
algo used by the repository.

The commit-graph fuzzer does rely on a properly initialized hash algo
though. Initialize it explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/shortlog: don't set up revisions without repo

It is possible to run git-shortlog(1) outside of a repository by passing
it output from git-log(1) via standard input. Obviously, as there is no
repository in that context, it is thus unsupported to pass any revisions
as arguments.

Regardless of that we still end up calling `setup_revisions()`. While
that works alright, it is somewhat strange. Furthermore, this is about
to cause problems when we unset the default object hash.

Refactor the code to only call `setup_revisions()` when we have a
repository. This is safe to do as we already verify that there are no
arguments when running outside of a repository anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/diff: explicitly set hash algo when there is no repo

The git-diff(1) command can be used outside repositories to diff two
files with each other. But even if there is no repository we will end up
hashing the files that we are diffing so that we can print the "index"
line:

    ```
    diff --git a/a b/b
    index 7898192..6178079 100644
    --- a/a
    +++ b/b
    @@ -1 +1 @@
    -a
    +b
    ```

We implicitly use SHA1 to calculate the hash here, which is because
`the_repository` gets initialized with SHA1 during the startup routine.
We are about to stop doing this though such that `the_repository` only
ever has a hash function when it was properly initialized via a repo's
configuration.

To give full control to our users, we would ideally add a new switch to
git-diff(1) that allows them to specify the hash function when executed
outside of a repository. But for now, we only convert the code to make
this explicit such that we can stop setting the default hash algorithm
for `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/bundle: abort "verify" early when there is no repository

Verifying a bundle requires us to have a repository. This is encoded in
`verify_bundle()`, which will return an error if there is no repository.
We call `open_bundle()` before we call `verify_bundle()` though, which
already performs some verifications even though we may ultimately abort
due to a missing repository.

This is problematic because `open_bundle()` already reads the bundle
header and verifies that it contains a properly formatted hash. When
there is no repository we have no clue what hash function to expect
though, so we always end up assuming SHA1 here, which may or may not be
correct. Furthermore, we are about to stop initializing `the_hash_algo`
when there is no repository, which will lead to segfaults.

Check early on whether we have a repository to fix this issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/blame: don't access potentially unitialized `the_hash_algo`

We access `the_hash_algo` in git-blame(1) before we have executed
`parse_options_start()`, which may not be properly set up in case we
have no repository. This is fine for most of the part because all the
call paths that lead to it (git-blame(1), git-annotate(1) as well as
git-pick-axe(1)) specify `RUN_SETUP` and thus require a repository.

There is one exception though, namely when passing `-h` to print the
help. Here we will access `the_hash_algo` even if there is no repo.
This works fine right now because `the_hash_algo` gets sets up to point
to the SHA1 algorithm via `initialize_repository()`. But we're about to
stop doing this, and thus the code would lead to a `NULL` pointer
exception.

Prepare the code for this and only access `the_hash_algo` after we are
sure that there is a proper repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

builtin/rev-parse: allow shortening to more than 40 hex characters

The `--short=` option for git-rev-parse(1) allows the user to specify
to how many characters object IDs should be shortened to. The option is
broken though for SHA256 repositories because we set the maximum allowed
hash size to `the_hash_algo->hexsz` before we have even set up the repo.
Consequently, `the_hash_algo` will always be SHA1 and thus we truncate
every hash after at most 40 characters.

Fix this by accessing `the_hash_algo` only after we have set up the
repo.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>