The returned string by `output_prefix()` is sometimes a string constant
and sometimes an allocated string. This has been fine until now because
we always leak the allocated strings, and thus we never tried to free
the string constant.
Fix the code to always return an allocated string and free the returned
value at all callsites.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
line-log: stop assigning string constant to file parent buffer
Stop assigning a string constant to the file parent buffer and instead
assign an allocated string. While the code is fine in practice, it will
break once we compile with `-Wwrite-strings`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `fill_textconv()` function is responsible for converting an input
file with a textconv driver, which is then passed to the caller. Weirdly
though, the function also handles the case where there is no textconv
driver at all. In that case, it will return either the contents of the
populated filespec, or an empty string if the filespec is invalid.
These two cases have differing memory ownership semantics. When there is
a textconv driver, then the result is an allocated string. Otherwise,
the result is either a string constant or owned by the filespec struct.
All callers are in fact aware of this weirdness and only end up freeing
the output buffer when they had a textconv driver.
Ideally, we'd split up this interface to only perform the conversion via
the textconv driver, and BUG in case the caller didn't provide one. This
would make memory ownership semantics much more straight forward. For
now though, let's simply cast the empty string constant to `char *` to
avoid a warning with `-Wwrite-strings`. This is equivalent to the same
cast that we already have in `fill_mmfile()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/remote: cast away constness in `get_head_names()`
In `get_head_names()`, we assign the "refs/heads/*" string constant to
`struct refspec_item::{src,dst}`, which are both non-constant pointers.
Ideally, we'd refactor the code such that both of these fields were
constant. But `struct refspec_item` is used for two different usecases
with conflicting requirements:
- To query for a source or destination based on the given refspec. The
caller either sets `src` or `dst` as the branch that we want to
search for, and the respective other field gets populated. The
fields should be constant when being used as a query parameter,
which is owned by the caller, and non-constant when being used as an
out parameter, which is owned by the refspec item. This is is
contradictory in itself already.
- To store refspec items with their respective source and destination
branches, in which case both fields should be owned by the struct.
Ideally, we'd split up this interface to clearly separate between
querying and storing, which would enable us to clarify lifetimes of the
strings. This would be a much bigger undertaking though.
Instead, accept the status quo for now and cast away the constness of
the source and destination patterns. We know that those are not being
written to or freed, so while this is ugly it certainly is fine for now.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a global tag refspec structure that is used by both git-clone(1)
and git-fetch(1). Initialization of the structure will break once we
enable `-Wwrite-strings`, even though the breakage is harmless. While we
could just add casts, the structure isn't really required in the first
place as we can simply initialize the structures at the respective
callsites.
Refactor the code accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable: cast away constness when assigning constants to records
The reftable records are used in multiple ways throughout the reftable
library. In many of those cases they merely act as input to a function
without getting modified by it at all. Most importantly, this happens
when writing records and when querying for records.
We rely on this in our tests and thus assign string constants to those
fields, which is about to generate warnings as those fields are of type
`char *`. While we could go through the process and instead allocate
those strings in all of our tests, this feels quite unnecessary.
Instead, add casts to `char *` for all of those strings. As this is part
of our tests, this also nicely serves as a demonstration that nothing
writes or frees those string constants, which would otherwise lead to
segfaults.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs/reftable: stop micro-optimizing refname allocations on copy
When copying refs, we execute `write_copy_table()` to write the new
table. As the names are given to us via `arg->newname` and
`arg->oldname`, respectively, we optimize away some allocations by
assigning those fields to the reftable records we are about to write
directly, without duplicating them. This requires us to cast the input
to `char *` pointers as they are in fact constant strings. Later on, we
then unset the refname for all of the records before calling
`reftable_log_record_release()` on them.
We also do this when assigning the "HEAD" constant, but here we do not
cast because its type is `char[]` by default. It's about to be turned
into `const char *` though once we enable `-Wwrite-strings` and will
thus cause another warning.
It's quite dubious whether this micro-optimization really helps. We're
about to write to disk anyway, which is going to be way slower than a
small handful of allocations. Let's drop the optimization altogther and
instead copy arguments to simplify the code and avoid the future warning
with `-Wwrite-strings`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
global: convert intentionally-leaking config strings to consts
There are multiple cases where we intentionally leak config strings:
- `struct gpg_format` is used to track programs that can be used for
signing commits, either via gpg(1), gpgsm(1) or ssh-keygen(1). The
user can override the commands via several config variables. As the
array is populated once, only, and the struct memers are never
written to or free'd.
- `struct ll_merge_driver` is used to track merge drivers. Same as
with the GPG format, these drivers are populated once and then
reused. Its data is never written to or free'd, either.
- `struct userdiff_funcname` and `struct userdiff_driver` can be
configured via `diff.<driver>.*` to add additional drivers. Again,
these have a global lifetime and are never written to or free'd.
All of these are intentionally kept alive and are never written to.
Furthermore, all of these are being assigned both string constants in
some places, and allocated strings in other places. This will cause
warnings once we enable `-Wwrite-strings`, so let's mark the respective
fields as `const char *` and cast away the constness when assigning
those values.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
global: improve const correctness when assigning string constants
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 29 May 2024 16:32:24 +0000 (09:32 -0700)]
Merge branch 'ps/leakfixes' into ps/no-writable-strings
* ps/leakfixes:
builtin/mv: fix leaks for submodule gitfile paths
builtin/mv: refactor to use `struct strvec`
builtin/mv duplicate string list memory
builtin/mv: refactor `add_slash()` to always return allocated strings
strvec: add functions to replace and remove strings
submodule: fix leaking memory for submodule entries
commit-reach: fix memory leak in `ahead_behind()`
builtin/credential: clear credential before exit
config: plug various memory leaks
config: clarify memory ownership in `git_config_string()`
builtin/log: stop using globals for format config
builtin/log: stop using globals for log config
convert: refactor code to clarify ownership of check_roundtrip_encoding
diff: refactor code to clarify memory ownership of prefixes
config: clarify memory ownership in `git_config_pathname()`
http: refactor code to clarify memory ownership
checkout: clarify memory ownership in `unique_tracking_name()`
strbuf: fix leak when `appendwholeline()` fails with EOF
transport-helper: fix leaking helper name
The credential helper that talks with osx keychain learned to avoid
storing back the authentication material it just got received from
the keychain.
* kn/osxkeychain-skip-idempotent-store:
osxkeychain: state to skip unnecessary store operations
osxkeychain: exclusive lock to serialize execution of operations
Junio C Hamano [Tue, 28 May 2024 18:17:07 +0000 (11:17 -0700)]
Merge branch 'ps/builtin-config-cleanup'
Code clean-up to reduce inter-function communication inside
builtin/config.c done via the use of global variables.
* ps/builtin-config-cleanup: (21 commits)
builtin/config: pass data between callbacks via local variables
builtin/config: convert flags to a local variable
builtin/config: track "fixed value" option via flags only
builtin/config: convert `key` to a local variable
builtin/config: convert `key_regexp` to a local variable
builtin/config: convert `regexp` to a local variable
builtin/config: convert `value_pattern` to a local variable
builtin/config: convert `do_not_match` to a local variable
builtin/config: move `respect_includes_opt` into location options
builtin/config: move default value into display options
builtin/config: move type options into display options
builtin/config: move display options into local variables
builtin/config: move location options into local variables
builtin/config: refactor functions to have common exit paths
config: make the config source const
builtin/config: check for writeability after source is set up
builtin/config: move actions into `cmd_config_actions()`
builtin/config: move legacy options into `cmd_config()`
builtin/config: move subcommand options into `cmd_config()`
builtin/config: move legacy mode into its own function
...
Junio C Hamano [Tue, 28 May 2024 18:17:06 +0000 (11:17 -0700)]
Merge branch 'ps/pseudo-ref-terminology'
Terminology to call various ref-like things are getting
straightened out.
* ps/pseudo-ref-terminology:
refs: refuse to write pseudorefs
ref-filter: properly distinuish pseudo and root refs
refs: pseudorefs are no refs
refs: classify HEAD as a root ref
refs: do not check ref existence in `is_root_ref()`
refs: rename `is_special_ref()` to `is_pseudo_ref()`
refs: rename `is_pseudoref()` to `is_root_ref()`
Documentation/glossary: define root refs as refs
Documentation/glossary: clarify limitations of pseudorefs
Documentation/glossary: redefine pseudorefs as special refs
Similar to the preceding commit, we have effectively given tracking
memory ownership of submodule gitfile paths. Refactor the code to start
tracking allocated strings in a separate `struct strvec` such that we
can easily plug those leaks. Mark now-passing tests as leak free.
Note that ideally, we wouldn't require two separate data structures to
track those paths. But we do need to store `NULL` pointers for the
gitfile paths such that we can indicate that its corresponding entries
in the other arrays do not have such a path at all. And given that
`struct strvec`s cannot store `NULL` pointers we cannot use them to
store this information.
There is another small gotcha that is easy to miss: you may be wondering
why we don't want to store `SUBMODULE_WITH_GITDIR` in the strvec. This
is because this is a mere sentinel value and not actually a string at
all.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Memory allocation patterns in git-mv(1) are extremely hard to follow:
We copy around string pointers into manually-managed arrays, some of
which alias each other, but only sometimes, while we also drop some of
those strings at other times without ever daring to free them.
While this may be my own subjective feeling, it seems like others have
given up as the code has multiple calls to `UNLEAK()`. These are not
sufficient though, and git-mv(1) is still leaking all over the place
even with them.
Refactor the code to instead track strings in `struct strvec`. While
this has the effect of effectively duplicating some of the strings
without an actual need, it is way easier to reason about and fixes all
of the aliasing of memory that has been going on. It allows us to get
rid of the `UNLEAK()` calls and also fixes leaks that those calls did
not paper over.
Mark tests which are now leak-free accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
makes the next patch easier, where we will migrate to the paths being
owned by a strvec. given that we are talking about command line
parameters here it's also not like we have tons of allocations that this
would save
while at it, fix a memory leak
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/mv: refactor `add_slash()` to always return allocated strings
The `add_slash()` function will only conditionally return an allocated
string when the passed-in string did not yet have a trailing slash. This
makes the memory ownership harder to track than really necessary.
It's dubious whether this optimization really buys us all that much. The
number of times we execute this function is bounded by the number of
arguments to git-mv(1), so in the typical case we may end up saving an
allocation or two.
Simplify the code to unconditionally return allocated strings.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodule: fix leaking memory for submodule entries
In `free_one_config()` we never end up freeing the `url` and `ignore`
fields and thus leak memory. Fix those leaks and mark now-passing tests
as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use a priority queue in `ahead_behind()` to compute the ahead/behind
count for commits. We may not iterate through all commits part of that
queue though in case all of its entries are stale. Consequently, as we
never make the effort to release the remaining commits, we end up
leaking bit arrays that we have allocated for each of the contained
commits.
Plug this leak and mark the corresponding test as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that memory ownership rules around `git_config_string()` and
`git_config_pathname()` are clearer, it also got easier to spot that
the returned memory needs to be free'd. Plug a subset of those cases and
mark now-passing tests as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config: clarify memory ownership in `git_config_string()`
The out parameter of `git_config_string()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're using global variables to store the log configuration. Many of
these can be set both via the command line and via the config, and
depending on how they are being set, they may contain allocated strings.
This leads to hard-to-track memory ownership and memory leaks.
Refactor the code to instead use a `struct log_config` that is being
allocated on the stack. This allows us to more clearly scope the
variables, track memory ownership and ultimately release the memory.
This also prepares us for a change to `git_config_string()`, which will
be adapted to have a `char **` out parameter instead of `const char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
convert: refactor code to clarify ownership of check_roundtrip_encoding
The `check_roundtrip_encoding` variable is tracked in a `const char *`
even though it may contain allocated strings at times. The result is
that those strings may be leaking because we never free them.
Refactor the code to always store allocated strings in this variable.
The default value is handled in `check_roundtrip()` now, which is the
only user of the variable.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff: refactor code to clarify memory ownership of prefixes
The source and destination prefixes are tracked in a `const char *`
array, but may at times contain allocated strings. The result is that
those strings may be leaking because we never free them.
Refactor the code to always store allocated strings in those variables,
freeing them as required. This requires us to handle the default values
a bit different compared to before. But given that there is only a
single callsite where we use the variables to `struct diff_options` it's
easy to handle the defaults there.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
config: clarify memory ownership in `git_config_pathname()`
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are various variables assigned via `git_config_string()` and
`git_config_pathname()` which are never free'd. This bug is relatable
because the out parameter of those functions are a `const char **`, even
though memory ownership is transferred to the caller.
We're about to adapt the functions to instead use `char **`. Prepare the
code accordingly. Note that the `(const char **)` casts will go away
once we have adapted the functions.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout: clarify memory ownership in `unique_tracking_name()`
The function `unique_tracking_name()` returns an allocated string, but
does not clearly indicate this because its return type is `const char *`
instead of `char *`. This has led to various callsites where we never
free its returned memory at all, which causes memory leaks.
Plug those leaks and mark now-passing tests as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf: fix leak when `appendwholeline()` fails with EOF
In `strbuf_appendwholeline()` we call `strbuf_getwholeline()` with a
temporary buffer. In case the call returns an error we indicate this by
returning EOF, but never release the temporary buffer. This can cause a
leak though because `strbuf_getwholeline()` calls getline(3). Quoting
its documentation:
If *lineptr was set to NULL before the call, then the buffer
should be freed by the user program even on failure.
Consequently, the temporary buffer may hold allocated memory even when
the call to `strbuf_getwholeline()` fails.
Fix this by releasing the temporary buffer on error.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- t7423: Introduced via b20c10fd9b (t7423: add tests for symlinked
submodule directories, 2024-01-28), passes since e8d0608944
(submodule: require the submodule path to contain directories only,
2024-03-26). The fix is not obviously related, but probably works
because we now die early in many code paths.
- t9xxx: All of these are exercising CVS-related tooling and pass
since at least Git v2.40. It's likely that these pass for a long
time already, but nobody ever noticed because Git developers do not
tend to have CVS on their machines.
Mark all of these tests as passing.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When initializing the transport helper in `transport_get()`, we
allocate the name of the helper. We neither end up transferring
ownership of the name, nor do we free it. The associated memory thus
leaks.
Fix this memory leak by freeing the string at the calling side in
`transport_get()`. `transport_helper_init()` now creates its own copy of
the string and thus can free it as required.
An alterantive way to fix this would be to transfer ownership of the
string passed into `transport_helper_init()`, which would avoid the call
to xstrdup(1). But it does make for a more surprising calling convention
as we do not typically transfer ownership of strings like this.
Mark now-passing tests as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "t/lib-terminal.sh", we declare a lazy prerequisite for tests that
require a TTY. The prerequisite uses a Perl script to figure out whether
we do have a usable TTY or not and thus implicitly depends on the PERL
prerequisite, as well. Furthermore though, the script requires another
dependency that is easy to miss, namely on the IO::Pty module. If that
module is not installed, then the script will exit early due to an
reason unrelated to missing TTYs.
This easily leads to missing test coverage. But most importantly, our CI
systems are missing this dependency and thus don't execute those tests
at all. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 23 May 2024 18:04:29 +0000 (11:04 -0700)]
Merge branch 'mt/openindiana-portability'
Portability updates to various uses of grep and sed.
* mt/openindiana-portability:
t/t9001-send-email.sh: sed - remove the i flag for s
t/t9118-git-svn-funky-branch-names.sh: sed needs semicolon
t/t1700-split-index.sh: mv -v is not portable
t/t4202-log.sh: fix misspelled variable
t/t0600-reffiles-backend.sh: rm -v is not portable
t/t9902-completion.sh: backslashes in echo
Switch grep from non-portable BRE to portable ERE
Junio C Hamano [Thu, 23 May 2024 18:04:26 +0000 (11:04 -0700)]
Merge branch 'la/hide-trailer-info'
The trailer API has been reshuffled a bit.
* la/hide-trailer-info:
trailer unit tests: inspect iterator contents
trailer: document parse_trailers() usage
trailer: retire trailer_info_get() from API
trailer: make trailer_info struct private
trailer: make parse_trailers() return trailer_info pointer
interpret-trailers: access trailer_info with new helpers
sequencer: use the trailer iterator
trailer: teach iterator about non-trailer lines
trailer: add unit tests for trailer iterator
Makefile: sort UNIT_TEST_PROGRAMS
Junio C Hamano [Mon, 20 May 2024 18:20:04 +0000 (11:20 -0700)]
Merge branch 'jc/compat-regex-calloc-fix'
Windows CI running in GitHub Actions started complaining about the
order of arguments given to calloc(); the imported regex code uses
the wrong order almost consistently, which has been corrected.
* jc/compat-regex-calloc-fix:
compat/regex: fix argument order to calloc(3)
Junio C Hamano [Mon, 20 May 2024 18:20:04 +0000 (11:20 -0700)]
Merge branch 'kn/ref-transaction-symref'
Updates to symbolic refs can now be made as a part of ref
transaction.
* kn/ref-transaction-symref:
refs: remove `create_symref` and associated dead code
refs: rename `refs_create_symref()` to `refs_update_symref()`
refs: use transaction in `refs_create_symref()`
refs: add support for transactional symref updates
refs: move `original_update_refname` to 'refs.c'
refs: support symrefs in 'reference-transaction' hook
files-backend: extract out `create_symref_lock()`
refs: accept symref values in `ref_transaction_update()`
Marcel Telka [Fri, 17 May 2024 15:27:41 +0000 (17:27 +0200)]
t/t1700-split-index.sh: mv -v is not portable
The -v option for mv is not specified by POSIX. The illumos
implementation of mv does not support -v. Since we do not need the
verbose mv output we just drop -v for mv.
Signed-off-by: Marcel Telka <marcel@telka.sk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Marcel Telka [Fri, 17 May 2024 13:19:00 +0000 (15:19 +0200)]
t/t0600-reffiles-backend.sh: rm -v is not portable
The -v option for rm is not specified by POSIX. The illumos
implementation of rm does not support -v. Since we do not need the
verbose rm output we just drop -v for rm.
Signed-off-by: Marcel Telka <marcel@telka.sk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Marcel Telka [Fri, 17 May 2024 14:08:45 +0000 (16:08 +0200)]
t/t9902-completion.sh: backslashes in echo
The usage of backslashes in echo is not portable. Since some tests
tries to output strings containing '\b' it is safer to use printf
here. The usage of printf instead of echo is also preferred by POSIX.
Signed-off-by: Marcel Telka <marcel@telka.sk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 17 May 2024 17:14:46 +0000 (10:14 -0700)]
diff: document what --name-only shows
The "--name-only" option is about showing the name of each file in
the post-image tree that got changed and nothing else (like "was it
created?"). Unlike the "--name-status" option that tells how the
change happened (e.g., renamed with similarity), it does not give
anything else, like the name of the corresponding file in the old
tree.
For example, if you start from a clean checkout that has a file
whose name is COPYING, here is what you would see:
Lack of the description of this fact has confused readers in the
past. Even back when dda2d79a ([PATCH] Clean up diff option
descriptions., 2005-07-13) documented "--name-only", "git diff"
already supported the renames, so in a sense, from day one, this
should have been documented more clearly but it wasn't.
Karthik Nayak [Fri, 17 May 2024 12:27:24 +0000 (14:27 +0200)]
SubmittingPatches: add section for iterating patches
Add a section to explain how to work around other in-flight patches and
how to navigate conflicts which arise as a series is being iterated.
This provides the necessary steps that users can follow to reduce
friction with other ongoing topics and also provides guidelines on how
the users can also communicate this to the list efficiently.
Co-authored-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: adapt git-config(1) to complete subcommands
With fe3ccc7aab (Merge branch 'ps/config-subcommands', 2024-05-15),
git-config(1) has gained support for subcommands. These subcommands live
next to the old, action-based mode, so that both the old and new way
continue to work.
The manpage for this command has been updated to prominently show the
subcommands, and the action-based modes are marked as deprecated. Update
Bash completion scripts accordingly to advertise subcommands instead of
actions.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 15 May 2024 19:32:42 +0000 (12:32 -0700)]
t0017: clarify dubious test set-up
1ff750b1 (tests: make GIT_TEST_GETTEXT_POISON a boolean, 2019-06-21)
added this test, in which "test-tool -C" is fed a name of a
directory that does not exist, and expects that it dies because of a
failure to read the configuration file(s), because the configuration
setting is screwed up to contain mutual inclusion loop, before it
notices that the directory to chdir into does not exist and dies.
It is of dubious value to etch the current order of events, i.e.,
the configuration needs to be read that early (for initializing
trace2 subsystem) before we even notice the lack of the directory
and have a chance to fail, into stone. Indeed, if you completely
compile out trace2 subsystem so that it does not even attempt to
read the configuration that early, we would die with a different
error message (i.e. "unable to chdir to 'cycle'") and this test will
fail.
At least give a bogus argument to "test-tool -C" a name that is
clearly bogus to make sure we can more easily see what is going on
with plenty of comments.
We may want to remove this test altogether, instead, though.
Junio C Hamano [Thu, 16 May 2024 17:10:13 +0000 (10:10 -0700)]
Merge branch 'ps/refs-without-the-repository'
The refs API lost functions that implicitly assumes to work on the
primary ref_store by forcing the callers to pass a ref_store as an
argument.
* ps/refs-without-the-repository:
refs: remove functions without ref store
cocci: apply rules to rewrite callers of "refs" interfaces
cocci: introduce rules to transform "refs" to pass ref store
refs: add `exclude_patterns` parameter to `for_each_fullref_in()`
refs: introduce missing functions that accept a `struct ref_store`
Junio C Hamano [Thu, 16 May 2024 17:10:13 +0000 (10:10 -0700)]
Merge branch 'jl/git-no-advice'
A new global "--no-advice" option can be used to disable all advice
messages, which is meant to be used only in scripts.
* jl/git-no-advice:
t0018: two small fixes
advice: add --no-advice global option
doc: add spacing around paginate options
doc: clean up usage documentation for --no-* opts
Koji Nakamaru [Wed, 15 May 2024 19:21:07 +0000 (19:21 +0000)]
osxkeychain: state to skip unnecessary store operations
git passes a credential that has been used successfully to the helpers
to record. If a credential is already stored,
"git-credential-osxkeychain store" just records the credential returned
by "git-credential-osxkeychain get", and unnecessary (sometimes
problematic) SecItemAdd() and/or SecItemUpdate() are performed.
We can skip such unnecessary operations by marking a credential returned
by "git-credential-osxkeychain get". This marking can be done by
utilizing the "state[]" feature:
- The "get" command sets the field "state[]=osxkeychain:seen=1".
- The "store" command skips its actual operation if the field
"state[]=osxkeychain:seen=1" exists.
Introduce a new state "state[]=osxkeychain:seen=1".
Suggested-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Koji Nakamaru [Wed, 15 May 2024 19:21:06 +0000 (19:21 +0000)]
osxkeychain: exclusive lock to serialize execution of operations
git passes a credential that has been used successfully to the helpers
to record. If "git-credential-osxkeychain store" commands run in
parallel (with fetch.parallel configuration and/or by running multiple
git commands simultaneously), some of them may exit with the error
"failed to store: -25299". This is because SecItemUpdate() in
add_internet_password() may return errSecDuplicateItem (-25299) in this
situation. Apple's documentation [1] also states as below:
In macOS, some of the functions of this API block while waiting for
input from the user (for example, when the user is asked to unlock a
keychain or give permission to change trust settings). In general, it
is safe to use this API in threads other than your main thread, but
avoid calling the functions from multiple operations, work queues, or
threads concurrently. Instead, serialize function calls or confine
them to a single thread.
The error has not been noticed before, because the former implementation
ignored the error.
Introduce an exclusive lock to serialize execution of operations.
The "whitespace check" task that was enabled for GitHub Actions CI
has been ported to GitLab CI.
* jt/port-ci-whitespace-check-to-gitlab:
gitlab-ci: add whitespace error check
ci: make the whitespace report optional
ci: separate whitespace check script
github-ci: fix link to whitespace error
ci: pre-collapse GitLab CI sections
Junio C Hamano [Wed, 15 May 2024 16:52:52 +0000 (09:52 -0700)]
Merge branch 'js/unit-test-suite-runner'
The "test-tool" has been taught to run testsuite tests in parallel,
bypassing the need to use the "prove" tool.
* js/unit-test-suite-runner:
cmake: let `test-tool` run the unit tests, too
ci: use test-tool as unit test runner on Windows
t/Makefile: run unit tests alongside shell tests
unit tests: add rule for running with test-tool
test-tool run-command testsuite: support unit tests
test-tool run-command testsuite: remove hardcoded filter
test-tool run-command testsuite: get shell from env
t0080: turn t-basic unit test into a helper
Pseudorefs are not stored in the ref database as by definition, they
carry additional metadata that essentially makes them not a ref. As
such, writing pseudorefs via the ref backend does not make any sense
whatsoever as the ref backend wouldn't know how exactly to store the
data.
Restrict writing pseudorefs via the ref backend.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
ref-filter: properly distinuish pseudo and root refs
The ref-filter interfaces currently define root refs as either a
detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
let's properly distinguish those ref types.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `is_root_ref()` function will happily clarify a pseudoref as a root
ref, even though pseudorefs are no refs. Next to being wrong, it also
leads to inconsistent behaviour across ref backends: while the "files"
backend accidentally knows to parse those pseudorefs and thus yields
them to the caller, the "reftable" backend won't ever see the pseudoref
at all because they are never stored in the "reftable" backend.
Fix this issue by filtering out pseudorefs in `is_root_ref()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Root refs are those refs that live in the root of the ref hierarchy.
Our old and venerable "HEAD" reference falls into this category, but we
don't yet classify it as such in `is_root_ref()`.
Adapt the function to also treat "HEAD" as a root ref. This change is
safe to do for all current callers:
- `ref_kind_from_refname()` already handles "HEAD" explicitly before
calling `is_root_ref()`.
- The "files" and "reftable" backends explicitly call both
`is_root_ref()` and `is_headref()` together.
This also aligns behaviour or `is_root_ref()` and `is_headref()` such
that we stop checking for ref existence. This changes semantics for our
backends:
- In the reftable backend we already know that the ref must exist
because `is_headref()` is called as part of the ref iterator. The
existence check is thus redundant, and the change is safe to do.
- In the files backend we use it when populating root refs, where we
would skip adding the "HEAD" file if it was not possible to resolve
it. The new behaviour is to instead mark "HEAD" as broken, which
will cause us to emit warnings in various places.
As there are no callers of `is_headref()` left afer the refactoring, we
can absorb it completely into `is_root_ref()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: do not check ref existence in `is_root_ref()`
Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.
This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.
Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`,
but then read it via `is_null_oid()`.
We have now changed terminology to clarify that pseudorefs are really
only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
in the root of the ref hierarchy are just plain refs. Thus, we do not
need to check whether the ref is symbolic or not. In fact, we can now
avoid looking up the ref completely as the name is sufficient for us to
figure out whether something would be a root ref or not.
This change of course changes semantics for our callers. As there are
only three of them we can assess each of them individually:
- "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
It's clear that the intent is to classify based on the ref name,
only.
- "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to
filter root refs. Again, using existence checks is pointless here as
the iterator has just surfaced the ref, so we know it does exist.
- "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to
determine whether it should add a ref to the root directory of its
iterator. This had the effect that we skipped over any files that
are either a symbolic ref, or which are not a ref at all.
The new behaviour is to include symbolic refs know, which aligns us
with the adapted terminology. Furthermore, files which look like
root refs but aren't are now mark those as "broken". As broken refs
are not surfaced by our tooling, this should not lead to a change in
user-visible behaviour, but may cause us to emit warnings. This
feels like the right thing to do as we would otherwise just silently
ignore corrupted root refs completely.
So in all cases the existence check was either superfluous, not in line
with the adapted terminology or masked potential issues. This commit
thus changes the behaviour as proposed and drops the existence check
altogether.
Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs: rename `is_special_ref()` to `is_pseudo_ref()`
Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
defined terminology in our gitglossary(7). Note that in the preceding
commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where
there may be confusion for in-flight patch series that add new calls to
`is_pseudoref()`. In order to intentionally break such patch series we
have thus picked `is_pseudo_ref()` instead of `is_pseudoref()` as the
new name.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live
in the root of the ref hierarchy behave the exact same as normal refs.
They can be symbolic refs or direct refs and can be read, iterated over
and written via normal tooling. All of these refs are stored in the ref
backends, which further demonstrates that they are just normal refs.
Extend the definition of "ref" to also cover such root refs. The only
additional restriction for root refs is that they must conform to a
specific naming schema.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/glossary: clarify limitations of pseudorefs
Clarify limitations that pseudorefs have:
- They can be read via git-rev-parse(1) and similar tools.
- They are not surfaced when iterating through refs, like when using
git-for-each-ref(1). They are not refs, so iterating through refs
should not surface them.
- They cannot be written via git-update-ref(1) and related commands.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/glossary: redefine pseudorefs as special refs
Nowadays, Git knows about three different kinds of refs. As defined in
gitglossary(7):
- Regular refs that start with "refs/", like "refs/heads/main".
- Pseudorefs, which live in the root directory. These must have
all-caps names and must be a file that start with an object hash.
Consequently, symbolic refs are not pseudorefs because they do not
start with an object hash.
- Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".
This state is extremely confusing, and I would claim that most folks
don't fully understand what is what here. The current definitions also
have several problems:
- Where does "HEAD" fit in? It's not a pseudoref because it can be
a symbolic ref. It's not a regular ref because it does not start
with "refs/". And it's not a special ref, either.
- There is a strong overlap between pseudorefs and special refs. The
pseudoref section for example mentions "MERGE_HEAD", even though it
is a special ref. Is it thus both a pseudoref and a special ref?
- Why do we even need to distinguish refs that live in the root from
other refs when they behave just like a regular ref anyway?
In other words, the current state is quite a mess and leads to wild
inconsistencies without much of a good reason.
The original reason why pseudorefs were introduced is that there are
some refs that sometimes behave like a ref, even though they aren't a
ref. And we really only have two of these nowadays, namely "MERGE_HEAD"
and "FETCH_HEAD". Those files are never written via the ref backends,
but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
They contain additional metadata that highlights where a ref has been
fetched from or the list of commits that have been merged.
This original intent in fact matches the definition of special refs that
we have recently introduced in 8df4c5d205 (Documentation: add "special
refs" to the glossary, 2024-01-19). Due to the introduction of the new
reftable backend we were forced to distinguish those refs more clearly
such that we don't ever try to read or write them via the reftable
backend. In the same series, we also addressed all the other cases where
we used to write those special refs via the filesystem directly, thus
circumventing the ref backend, to instead write them via the backends.
Consequently, there are no other refs left anymore which are special.
Let's address this mess and return the pseudoref terminology back to its
original intent: a ref that sometimes behave like a ref, but which isn't
really a ref because it gets written to the filesystem directly. Or in
other words, let's redefine pseudorefs to match the current definition
of special refs. As special refs and pseudorefs are now the same per
definition, we can drop the "special refs" term again. It's not exposed
to our users and thus they wouldn't ever encounter that term anyway.
Refs that live in the root of the ref hierarchy but which are not
pseudorefs will be further defined in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/config: pass data between callbacks via local variables
We use several global variables to pass data between callers and
callbacks in `get_color()` and `get_colorbool()`. Convert those to use
callback data structures instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the `do_all` and `use_key_regexp` bits essentially act like flags
to `get_value()`. Let's convert them to actual flags so that we can get
rid of the last two remaining global variables that track options.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/config: track "fixed value" option via flags only
We track the "fixed value" option via two separate bits: once via the
global variable `fixed_value`, and once via the CONFIG_FLAGS_FIXED_VALUE
bit in `flags`. This is confusing and may easily lead to issues when one
is not aware that this is tracked via two separate mechanisms.
Refactor the code to use the flag exclusively. We already pass it to all
the required callsites anyway, except for `collect_config()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `key` variable is used by the `get_value()` function for two
purposes:
- It is used to store the result of `git_config_parse_key()`, which is
then passed on to `collect_config()`.
- It is used as a store to convert the provided key to an
all-lowercase key when `use_key_regexp` is set.
Neither of these cases warrant a global variable at all. In the former
case we can pass the key via `struct collect_config_data`. And in the
latter case we really only want to have it as a temporary local variable
such that we can free associated memory.
Refactor the code accordingly to reduce our reliance on global state.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>