All the Perforce tests are free of memory leaks. This went unnoticed
because most folks do not have p4 and p4d installed on their computers.
Consequently, given that the prerequisites for running those tests
aren't fulfilled, `TEST_PASSES_SANITIZE_LEAK=check` won't notice that
those tests are indeed memory leak free.
Mark those tests accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update our Perforce version from r21.2 to r23.2. Note that the updated
version is not the newest version. Instead, it is the last version where
the way that Perforce is being distributed remains the same as in r21.2.
Newer releases stopped distributing p4 and p4d executables as well as
the macOS archives directly and would thus require more work.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the tests in t98xx modify the Perforce depot in ways that the
tool wouldn't normally allow. This is done to test behaviour of git-p4
in certain edge cases that we have observed in the wild, but which
should in theory not be possible.
Naturally, modifying the depot on disk directly is quite intimate with
the tool and thus prone to breakage when Perforce updates the way that
data is stored. And indeed, those tests are broken nowadays with r23 of
Perforce. While a file revision was previously stored as a plain file
"depot/file,v", it is now stored in a directory "depot/file,d" with
compression.
Adapt those tests to handle both old- and new-style depot layouts.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
D Harithamma [Wed, 31 Jul 2024 13:33:59 +0000 (13:33 +0000)]
convert: return early when not tracing
When Git adds a file requiring encoding conversion and tracing of encoding
conversion is not requested via the GIT_TRACE_WORKING_TREE_ENCODING
environment variable, the `trace_encoding()` function still allocates &
prepares "human readable" copies of the file contents before and after
conversion to show in the trace. This results in a high memory footprint
and increased runtime without providing any user-visible benefit.
This fix introduces an early exit from the `trace_encoding()` function
when tracing is not requested, preventing unnecessary memory allocation
and processing.
Signed-off-by: D Harithamma <harithamma.d@ibm.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: consistently use spaces inside initializers
Our coding guide is inconsistent with how it uses spaces inside of
initializers (`struct foo bar = { something }`). While we mostly carry
the space between open and closing braces and the initialized members,
in one case we don't.
Fix this one instance such that we consistently carry the space. This is
also consistent with how clang-format formats such initializers.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We semi-regularly have discussions around whether a function shall be
named `S_release()`, `S_clear()` or `S_free()`. Indeed, it may not be
obvious which of these is preferable as we never really defined what
each of these variants means exactly.
Carve out a space where we can add idiomatic names for common functions
in our coding guidelines and define each of those functions. Like this,
we can get to a shared understanding of their respective semantics and
can easily point towards our style guide in future discussions such that
our codebase becomes more consistent over time.
Note that the intent is not to rename all functions which violate these
semantics right away. Rather, the intent is to slowly converge towards a
common style over time.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: document naming schema for structs and their functions
We nowadays have a proper mishmash of struct-related functions that are
called `<verb>_<struct>` (e.g. `clear_prio_queue()`) versus functions
that are called `<struct>_<verb>` (e.g. `strbuf_clear()`). While the
former style may be easier to tie into a spoken conversation, most of
our communication happens in text anyway. Furthermore, prefixing
functions with the name of the structure they operate on makes it way
easier to group them together, see which functions are related, and will
also help folks who are using code completion.
Let's thus settle on one style, namely the one where functions start
with the name of the structure they operate on.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: clarify indentation style for C preprocessor directives
In the preceding commit, we have settled on using a single space per
nesting level to indent preprocessor directives. Clarify our coding
guidelines accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clang-format: fix indentation width for preprocessor directives
In [1], we have improved our clang-format configuration to also specify
the style for how to indent preprocessor directives. But while we have
settled the question of where to put the indentation, either before or
after the hash sign, we didn't specify exactly how to indent.
With the current configuration, clang-format uses tabs to indent each
level of nested preprocessor directives, which is in fact unintentional
and never done in our codebase. Instead, we use a mixture of indenting
by either one or two spaces, where using a single space is somewhat more
common.
Adapt our clang-format configuration accordingly by specifying an
indentation width of one space.
Junio C Hamano [Tue, 30 Jul 2024 20:47:26 +0000 (13:47 -0700)]
Merge branch 'kn/ci-clang-format' into ps/doc-more-c-coding-guidelines
* kn/ci-clang-format:
ci/style-check: add `RemoveBracesLLVM` in CI job
check-whitespace: detect if no base_commit is provided
ci: run style check on GitHub and GitLab
clang-format: formalize some of the spacing rules
clang-format: avoid spacing around bitfield colon
clang-format: indent preprocessor directives after hash
refs/files: stop using `the_repository` in `parse_loose_ref_contents()`
We implicitly rely on `the_repository` in `parse_loose_ref_contents()`
by calling `parse_oid_hex()`. Convert the function to instead use
`parse_oid_hex_algop()` and have callers pass in the hash algorithm to
use.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 30 Jul 2024 14:12:36 +0000 (16:12 +0200)]
t-strvec: use if_test
The macro TEST takes a single expression. If a test requires multiple
statements then they need to be placed in a function that's called in
the TEST expression.
Remove the cognitive overhead of defining and calling single-use
functions by using if_test instead.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 30 Jul 2024 14:10:59 +0000 (16:10 +0200)]
t-reftable-basics: use if_test
The macro TEST takes a single expression. If a test requires multiple
statements then they need to be placed in a function that's called in
the TEST expression.
Remove the overhead of defining and calling single-use functions by
using if_test instead.
Run the tests in the order of definition. We can reorder them like that
because they are independent. Technically this changes the output, but
retains the meaning of a full run and allows for easier review e.g. with
diff option --ignore-all-space.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 30 Jul 2024 14:10:09 +0000 (16:10 +0200)]
t-ctype: use if_test
Use the documented macro if_test instead of the internal functions
test__run_begin() and test__run_end(), which are supposed to be private
to the unit test framework.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 30 Jul 2024 14:08:19 +0000 (16:08 +0200)]
unit-tests: add if_test
The macro TEST only allows defining a test that consists of a single
expression. Add a new macro, if_test, which provides a way to define
unit tests that are made up of one or more statements.
if_test allows defining self-contained tests en bloc, a bit like
test_expect_success does for regular tests. It acts like a conditional;
the test body is executed if test_skip_all() had not been called before.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 30 Jul 2024 14:07:00 +0000 (16:07 +0200)]
unit-tests: show location of checks outside of tests
Checks outside of tests are caught at runtime and reported like this:
Assertion failed: (ctx.running), function test_assert, file test-lib.c, line 267.
The assert() call aborts the unit test and doesn't reveal the location
or even the type of the offending check, as test_assert() is called by
all of them.
Handle it like the opposite case, a test without any checks: Don't
abort, but report the location of the actual check, along with a message
explaining the situation. The output for example above becomes:
# BUG: check outside of test at t/helper/test-example-tap.c:75
... and the unit test program continues and indicates the error in its
exit code at the end.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 30 Jul 2024 14:05:58 +0000 (16:05 +0200)]
t0080: use here-doc test body
Improve the readability of the expected output by using a here-doc for
the test body and replacing the unwieldy ${SQ} references with literal
single quotes.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 30 Jul 2024 14:00:20 +0000 (16:00 +0200)]
t-example-decorate: remove test messages
The test_msg() calls only repeat information already present in test
descriptions and check definitions, which are shown automatically if
the checks fail. Remove the redundant messages to simplify the tests
and their output. Here it is with all of them failing before:
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:18
# when adding a brand-new object, NULL should be returned
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:21
# when adding a brand-new object, NULL should be returned
not ok 1 - Add 2 objects, one with a non-NULL decoration and one with a NULL decoration.
# check "ret == &vars->decoration_a" failed at t/unit-tests/t-example-decorate.c:29
# when readding an already existing object, existing decoration should be returned
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:32
# when readding an already existing object, existing decoration should be returned
not ok 2 - When re-adding an already existing object, the old decoration is returned.
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:40
# lookup should return added declaration
# check "ret == &vars->decoration_b" failed at t/unit-tests/t-example-decorate.c:43
# lookup should return added declaration
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:46
# lookup for unknown object should return NULL
not ok 3 - Lookup returns the added declarations, or NULL if the object was never added.
# check "objects_noticed == 2" failed at t/unit-tests/t-example-decorate.c:58
# left: 1
# right: 2
# should have 2 objects
not ok 4 - The user can also loop through all entries.
1..4
... and here with the patch applied:
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:18
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:20
not ok 1 - Add 2 objects, one with a non-NULL decoration and one with a NULL decoration.
# check "ret == &vars->decoration_a" failed at t/unit-tests/t-example-decorate.c:27
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:29
not ok 2 - When re-adding an already existing object, the old decoration is returned.
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:36
# check "ret == &vars->decoration_b" failed at t/unit-tests/t-example-decorate.c:38
# check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:40
not ok 3 - Lookup returns the added declarations, or NULL if the object was never added.
# check "objects_noticed == 2" failed at t/unit-tests/t-example-decorate.c:51
# left: 1
# right: 2
not ok 4 - The user can also loop through all entries.
1..4
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 30 Jul 2024 18:43:52 +0000 (11:43 -0700)]
safe.directory: setting safe.directory="." allows the "current" directory
When "git daemon" enters a repository, it chdir's to the requested
repository and then uses "." (the curent directory) to consult the
"is this repository considered safe?" when it is not owned by the
same owner as the process.
Make sure this access will be allowed by setting safe.directory to
".", as that was once advertised on the list as a valid workaround
to the overly tight safe.directory settings introduced by 2.45.1
(cf. <834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com>).
Also add simlar test to show what happens in the same setting if the
safe.directory is set to "*" instead of "."; in short, "." is a bit
tighter (as it is custom designed for git-daemon situation) than
"anything goes" settings given by "*".
Junio C Hamano [Tue, 30 Jul 2024 18:43:51 +0000 (11:43 -0700)]
safe.directory: normalize the configured path
The pathname of a repository comes from getcwd() and it could be a
path aliased via symbolic links, e.g., the real directory may be
/home/u/repository but a symbolic link /home/u/repo may point at it,
and the clone request may come as "git clone file:///home/u/repo/"
A request to check if /home/u/repository is safe would be rejected
if the safe.directory configuration allows /home/u/repo/ but not its
alias /home/u/repository/. Normalize the paths configured for the
safe.directory configuration variable before comparing them with the
path being checked.
Two and a half things to note, compared to the previous step to
normalize the actual path of the suspected repository, are:
- A configured safe.directory may be coming from .gitignore in the
home directory that may be shared across machines. The path
meant to match with an entry may not necessarily exist on all of
such machines, so not being able to convert them to real path on
this machine is *not* a condition that is worthy of warning.
Hence, we ignore a path that cannot be converted to a real path.
- A configured safe.directory is essentially a random string that
user throws at us, written completely unrelated to the directory
the current process happens to be in. Hence it makes little
sense to give a non-absolute path. Hence we ignore any
non-absolute paths, except for ".".
- The safe.directory set to "." was once advertised on the list as
a valid workaround for the regression caused by the overly tight
safe.directory check introduced in 2.45.1; we treat it to mean
"if we are at the top level of a repository, it is OK".
(cf. <834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com>).
Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 30 Jul 2024 18:43:50 +0000 (11:43 -0700)]
safe.directory: normalize the checked path
The pathname of a repository comes from getcwd() and it could be a
path aliased via symbolic links, e.g., the real directory may be
/home/u/repository but a symbolic link /home/u/repo may point at it,
and the clone request may come as "git clone file:///home/u/repo/".
A request to check if /home/u/repo is safe would be rejected if the
safe.directory configuration allows /home/u/repository/ but not its
alias /home/u/repo/. Normalize the path being checked before
comparing with safe.directory value(s).
Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 30 Jul 2024 18:43:49 +0000 (11:43 -0700)]
safe.directory: preliminary clean-up
The paths given in the safe.directory configuration variable are
allowed to contain "~user" (which interpolates to user's home
directory) and "%(prefix)" (which interpolates to the installation
location in RUNTIME_PREFIX-enabled builds, and a call to the
git_config_pathname() function is tasked to obtain a copy of the
path with these constructs interpolated.
The function, when it succeeds, always yields an allocated string in
the location given as the out-parameter; even when there is nothing
to interpolate in the original, a literal copy is made. The code
path that contains this caller somehow made two contradicting and
incorrect assumptions of the behaviour when there is no need for
interpolation, and was written with extra defensiveness against
two phantom risks that do not exist.
One wrong assumption was that the function might yield NULL when
there is no interpolation. This led to the use of an extra "check"
variable, conditionally holding either the interpolated or the
original string. The assumption was with us since 8959555c
(setup_git_directory(): add an owner check for the top-level
directory, 2022-03-02) originally introduced the safe.directory
feature.
Another wrong assumption was that the function might yield the same
pointer as the input when there is no interpolation. This led to a
conditional free'ing of the interpolated copy, that the conditional
never skipped, as we always received an allocated string.
Simplify the code by removing the extra defensiveness.
René Scharfe [Tue, 30 Jul 2024 14:18:54 +0000 (16:18 +0200)]
grep: -W: skip trailing empty lines at EOF, too
4aa2c4753d (grep: -W: don't extend context to trailing empty lines,
2016-05-28) stopped showing empty lines at the end of function context
when using -W. Do the same for trailing empty lines at the end of
files, for consistency -- it doesn't matter whether a function section
is ended by the next function or the end of the file.
Test it by adding a trailing empty line to the file used by the test
"grep -W" and leave its expected output the same.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 30 Jul 2024 01:17:38 +0000 (18:17 -0700)]
patch-id: tighten code to detect the patch header
The get_one_patchid() function unconditionally takes a line that
matches the patch header (namely, a line that begins with a full
object name, possibly prefixed by "commit" or "From" plus a space)
as the beginning of a patch. Even when it is *not* looking for one
(namely, when the previous call found the patch header and returned,
and then we are called again to skip the log message and process the
patch whose header was found by the previous invocation).
As a consequence, a line in the commit log message that begins with
one of these patterns can be mistaken to start another patch, with
current message entirely skipped (because we haven't even reached
the patch at all).
Allow the caller to tell us if it called us already and saw the
patch header (in which case we shouldn't be looking for another one,
until we see the "diff" part of the patch; instead we simply should
be skipping these lines as part of the commit log message), and skip
the header processing logic when that is the case. In the helper
function, it also needs to flip this "are we looking for a header?"
bit, once it finished skipping the commit log message and started
processing the patches, as the patch header of the _next_ message is
the only clue in the input that the current patch is done.
Junio C Hamano [Tue, 30 Jul 2024 01:17:37 +0000 (18:17 -0700)]
patch-id: rewrite code that detects the beginning of a patch
The get_one_patchid() function reads input lines until it finds a
patch header (the line that begins a patch), whose beginning is one
of:
(1) an "<object name>", which is what "git diff-tree --stdin" shows;
(2) "commit <object name>", which is what "git log" shows; or
(3) "From <object name>", which is what "git log --format=email" shows.
When it finds such a line, it returns to the caller, reporting the
<object name> it found, and the size of the "patch" it processed.
The caller then calls the function again, which then ignores the
commit log message, and then processes the lines in the patch part
until it hits another "beginning of a patch".
The above logic was fairly easy to see until 2bb73ae8 (patch-id: use
starts_with() and skip_prefix(), 2016-05-28) reorganized the code,
which made another logic that has nothing to do with the "where does
the next patch begin?" logic, which came from 2485eab5
(git-patch-id: do not trip over "no newline" markers, 2011-02-17)
that ignores the "\ No newline at the end", rolled into the same
single if() statement.
Let's split it out. The "\ No newline at the end" marker is part of
the patch, should not appear before we start reading the patch part,
and does not belong to the detection of patch header.
Junio C Hamano [Tue, 30 Jul 2024 01:17:36 +0000 (18:17 -0700)]
patch-id: make get_one_patchid() more extensible
We pass two independent Boolean flags (i.e. do we want the stable
variant of patch-id? do we want to hash the stuff verbatim?) into
the function as two separate parameters. Before adding the third
one and make the interface even wider, let's consolidate them into
a single flag word.
No changes in behaviour. Just a trivial interface change.
Junio C Hamano [Tue, 30 Jul 2024 01:17:35 +0000 (18:17 -0700)]
patch-id: call flush_current_id() only when needed
The caller passes a flag that is used to become no-op when calling
flush_current_id(). Instead of calling something that becomes a
no-op, teach the caller not to call it in the first place.
Junio C Hamano [Tue, 30 Jul 2024 01:17:34 +0000 (18:17 -0700)]
t4204: patch-id supports various input format
"git patch-id" was first developed to read from "git diff-tree
--stdin -p" output. Later it was enhanced to read from "git
diff-tree --stdin -p -v", which was the downstream of an early
imitation of "git log" ("git rev-list" run in the upstream of a pipe
to feed the "diff-tree"). These days, we also read from "git
format-patch".
Their output begins slightly differently, but the patch-id computed
over them for the same commit should be the same. Ensure that we
won't accidentally break this expectation.
David Disseldorp [Mon, 29 Jul 2024 15:14:00 +0000 (17:14 +0200)]
notes: do not trigger editor when adding an empty note
With "git notes add -C $blob", the given blob contents are to be made
into a note without involving an editor. But when "--allow-empty" is
given, the editor is invoked, which can cause problems for
non-interactive callers[1].
This behaviour started with 90bc19b3ae (notes.c: introduce
'--separator=<paragraph-break>' option, 2023-05-27), which changed
editor invocation logic to check for a zero length note_data buffer.
Restore the original behaviour of "git note" that takes the contents
given via the "-m", "-C", "-F" options without invoking an editor, by
checking for any prior parameter callbacks, indicated by a non-zero
note_data.msg_nr. Remove the now-unneeded note_data.given flag.
Add a test for this regression by checking whether GIT_EDITOR is
invoked alongside "git notes add -C $empty_blob --allow-empty"
[1] https://github.com/ddiss/icyci/issues/12
Signed-off-by: David Disseldorp <ddiss@suse.de>
[jc: enhanced the test with -m/-F options] Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell, thus should be avoided.
As such, check-non-portable-shell.pl warns when it detects such usage.
However, a limitation of the check is that it only detects such
invocations when variable assignment (i.e. `VAR=val`) is the first thing
on the line. Thus, it can easily be fooled by an invocation such as:
echo X | VAR=val shell-func
Address this shortcoming by loosening the check so that the variable
assignment can be recognized even when not at the beginning of the line.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Sunshine [Sat, 27 Jul 2024 05:35:08 +0000 (01:35 -0400)]
check-non-portable-shell: suggest alternative for `VAR=val shell-func`
Most problems reported by check-non-portable-shell are accompanied by
advice suggesting how the test author can repair the problem. For
instance:
error: egrep/fgrep obsolescent (use grep -E/-F)
However, when one-shot variable assignment is detected when calling a
shell function (i.e. `VAR=val shell-func`), the problem is reported, but
no advice is given. The lack of advice is particularly egregious since
neither the problem nor the workaround are likely well-known by
newcomers to the project writing tests for the first time. Address this
shortcoming by recommending the use of `test_env` which is tailor made
for this specific use-case.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a0a630192d (t/check-non-portable-shell: detect "FOO=bar
shell_func", 2018-07-13) added the check for one-shot environment
variable assignment for shell functions, the primary reason given for
avoiding them was that, under some shells, the assignment outlives the
invocation of the shell function, thus could potentially negatively
impact subsequent commands in the same test, as well as subsequent
tests.
However, it has recently become apparent that this is not the only
potential problem with one-shot assignments and shell functions. Another
problem is that some shells do not actually export the variable to
commands which the function invokes[1]. More significantly, however, the
behavior of one-shot assignments with shell functions is not specified
by POSIX[2].
Given this new understanding, the presented error message ("assignment
extends beyond 'shell_func'") is too specific and potentially
misleading. Address this by emitting a less specific error message.
(Note that the wording "is not portable" is chosen over the more
specific "behavior not specified by POSIX" for consistency with almost
all other error message issued by this "lint" script.)
Eric Sunshine [Sat, 27 Jul 2024 05:35:06 +0000 (01:35 -0400)]
t4034: fix use of one-shot variable assignment with shell function
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell, thus should be avoided.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Sunshine [Sat, 27 Jul 2024 05:35:05 +0000 (01:35 -0400)]
t3430: drop unnecessary one-shot "VAR=val shell-func" invocation
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell. One such problematic behavior
is that, with some shells, the assignment will outlive the invocation of
the function, thus may potentially impact subsequent commands in the
test, as well as subsequent tests. A common way to work around the
problem is to wrap a subshell around the one-shot assignment, thus
ensuring that the assignment is short-lived.
In this test, the subshell is employed precisely for this purpose; other
side-effects of the subshell, such as losing the effect of `test_tick`
which is invoked by `test_commit`, are immaterial.
These days, we can take advantage of `test_commit --author` to more
clearly convey that the test is interested only in overriding the author
of the commit.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --format option on the git-ls-files man page states that `%xx`
interpolates to the character with hex code `xx`. This mirrors the
documentation and behavior of `git for-each-ref --format=...`. However,
in reality it requires the character with code `XX` to be specified as
`%xXX`, mirroring the behaviour of `git log --format`.
Signed-off-by: Jayson Rhynas <jayrhynas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 25 Jul 2024 23:07:28 +0000 (16:07 -0700)]
csum-file: introduce discard_hashfile()
The hashfile API is used to write out a "hashfile", which has a
final checksum (typically SHA-1) at the end. An in-core hashfile
structure has up to two file descriptors and a few buffers that can
only be freed by calling a helper function that is private to the
csum-file implementation.
The usual flow of a user of the API is to first open a file
descriptor for writing, obtain a hashfile associated with that write
file descriptor by calling either hashfd() or hashfd_check(), call
hashwrite() number of times to write data to the file, and then call
finalize_hashfile(), which appends th checksum to the end of the
file, closes file descriptors and releases associated buffers.
But what if a caller finds some error after calling hashfd() to
start the process and/or hashwrite() to send some data to the file,
and wants to abort the operation? The underlying file descriptor is
often managed by the tempfile API, so aborting will clean the file
out of the filesystem, but the resources associated with the in-core
hashfile structure is lost.
Introduce discard_hashfile() API function to allow them to release
the resources held by a hashfile structure the callers want to
dispose of, and use that in read-cache.c:do_write_index(), which is
a central place that writes the index file.
Mark t2107 as leak-free, as this leak in "update-index --cacheinfo"
test that deliberately makes it fail is now plugged.
Junio C Hamano [Thu, 25 Jul 2024 17:27:29 +0000 (10:27 -0700)]
doc: difference in location to apply is "offset", not "fuzz"
The documentation to "git rebase" says that the line numbers (in the
rebased change) may not exactly be the same as the line numbers the
change gets replayed on top of the new base, but uses a wrong noun
"fuzz". It should have said "offset".
They are both terms of art. "fuzz" is about context lines not
exactly matching. "offset" is about the difference in the location
that a change was taken from the original and the change gets
replayed on the target. "offset" is often inevitable and part of
normal life. "fuzz" on the other hand is often a sign of trouble
(and indeed "Git" refuses to apply a change with "fuzz", except
there are options to be fuzzy about whitespaces).
Make the print command trigger the pager when invoked using a capital
'P', to make it easier for the user to review long hunks.
Note that if the PAGER ends unexpectedly before we've been able to send
the payload, perhaps because the user is not interested in the whole
thing, we might receive a SIGPIPE, which would abruptly and unexpectedly
terminate the interactive session for the user.
Therefore, we need to ignore a possible SIGPIPE signal. Add a test for
this, in addition to the test for normal operation.
For the SIGPIPE test, we need to make sure that we completely fill the
operating system's buffer, otherwise we might not trigger the SIGPIPE
signal. The normal size of this buffer in different OSs varies from a
few KBs to 1MB. Use a payload large enough to guarantee that we exceed
this limit.
Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since f67b45f862 (Introduce trivial new pager.c helper infrastructure,
2006-02-28) we have the machinery to send our output to a pager.
That machinery, once set up, does not allow us to regain the original
stdio streams.
In the interactive commands (i.e.: add -p) we want to use the pager for
some output, while maintaining the interaction with the user.
Modify the pager machinery so that we can use `setup_pager()` and, once
we've finished sending the desired output for the pager, wait for the
pager termination using a new function `wait_for_pager()`. Make this
function reset the pager machinery before returning.
One specific point to note is that we avoid forking the pager in
`setup_pager()` if the configured pager is an empty string [*1*] or
simply "cat" [*2*]. In these cases, `setup_pager()` does nothing and
therefore `wait_for_pager()` should not be called.
We could modify `setup_pager()` to return an indication of these
situations, so we could avoid calling `wait_for_pager()`.
However, let's avoid transferring that responsibility to the caller and
instead treat the call to `wait_for_pager()` as a no-op when we know we
haven't forked the pager.
1.- 402461aab1 (pager: do not fork a pager if PAGER is set to empty.,
2006-04-16)
2.- caef71a535 (Do not fork PAGER=cat, 2006-04-16)
Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We send errors to the pager since 61b80509e3 (sending errors to stdout
under $PAGER, 2008-02-16).
In a8335024c2 (pager: do not dup2 stderr if it is already redirected,
2008-12-15) an exception was introduced to avoid redirecting stderr if
it is not connected to a terminal.
In such exceptional cases, the close(STDERR_FILENO) we're doing in
close_pager_fds, is unnecessary.
Furthermore, in a subsequent commit we're going to introduce changes
that will involve using close_pager_fds multiple times.
With this in mind, controlling when we want to close stderr, become
sensible.
Let's close(STDERR_FILENO) only when necessary, and pave the way for the
upcoming changes.
Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 25 Jul 2024 15:49:34 +0000 (08:49 -0700)]
ReviewingGuidelines: encourage positive reviews more
I saw some contributors hesitate to give a positive review on
patches by their coworkers. When written well, a positive review
does not have to be a hollow "looks good" that rubber stamps an
useless approval on a topic that is not interesting to others.
Let's add a few paragraphs to encourage positive reviews, which is a
bit harder to give than a review to point out things to improve.
Alexander Shopov [Wed, 24 Jul 2024 11:11:11 +0000 (14:11 +0300)]
show-ref: improve short help messages of options
Trivial change to indicate that branches and tags are real options
that can be used combined to get more information. This helps with
linting translations and prompting the user that the terms represent
options.
Signed-off-by: Alexander Shopov <ash@kambanaria.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 22 Jul 2024 21:17:55 +0000 (14:17 -0700)]
Doc: fix Asciidoctor css workaround
The previous step introduced docinfo.html to be used to tweak the
CSS used by the asciidoctor, that by default renders <code> inside
<pre> as a block element, breaking the SYNOPSIS section of a few
pages that adopted a new convention we use since Git 2.45.
But in this project, HTML files are all generated. We do not force
any human to write HTML by hand, which is an unusual and cruel
punishment. "*.html" is in the .gitignore file, and "make clean"
removes them. Having a tracked .html file makes "make clean" make
the tree dirty by removing the tracked docinfo.html file.
Let's do an obvious, minimum and stupid workaround to generate that
file at runtime instead. The mark-up is being rethought in a major
way for the next development cycle, and the CSS workaround we added
in the previous step may have to adjusted, possibly in a large way,
anyway.
For 'clang-format', setting 'RemoveBracesLLVM' to 'true', adds a check
to ensure we avoid curly braces for single-statement bodies in
conditional blocks.
However, the option does come with two warnings [1]:
This option will be renamed and expanded to support other styles.
and
Setting this option to true could lead to incorrect code formatting
due to clang-format’s lack of complete semantic information. As
such, extra care should be taken to review code changes made by
this option.
The latter seems to be of concern. While we want to experiment with the
rule, adding it to the in-tree '.clang-format' could affect end-users.
Let's only add it to the CI jobs for now. With time, we can evaluate
its efficacy and decide if we want to add it to '.clang-format' or
retract it entirely. We do so, by adding the existing rules in
'.clang-format' and this rule to a temp file outside the working tree,
which is then used by 'git clang-format'. This ensures we don't murk
with files in-tree.
check-whitespace: detect if no base_commit is provided
The 'check-whitespace' CI script exits gracefully if no base commit is
provided or if an invalid revision is provided. This is not good because
if a particular CI provides an incorrect base_commit, it would fail
successfully.
This is exactly the case with the GitLab CI. The CI is using the
"$CI_MERGE_REQUEST_TARGET_BRANCH_SHA" variable to get the base commit
SHA, but variable is only defined for _merged_ pipelines. So it is empty
for regular pipelines [1]. This should've failed the check-whitespace
job.
Let's fallback to 'CI_MERGE_REQUEST_DIFF_BASE_SHA' if
"CI_MERGE_REQUEST_TARGET_BRANCH_SHA" isn't available in GitLab CI,
similar to the previous commit. Let's also add a check for incorrect
base_commit in the 'check-whitespace.sh' script. While here, fix a small
typo too.
We don't run style checks on our CI, even though we have a
'.clang-format' setup in the repository. Let's add one, the job will
validate only against the new commits added and will only run on merge
requests. Since we're introducing it for the first time, let's allow
this job to fail, so we can validate if this is useful and eventually
enforce it.
For GitHub, we allow the job to pass by adding 'continue-on-error: true'
to the workflow. This means the job would show as passed, even if the
style check failed. To know the status of the job, users have to
manually check the logs.
For GitLab, we allow the job to pass by adding 'allow_failure: true', to
the job. Unlike GitHub, here the job will show as failed with a yellow
warning symbol, but the pipeline would still show as passed.
Also for GitLab, we use the 'CI_MERGE_REQUEST_TARGET_BRANCH_SHA'
variable by default to obtain the base SHA of the merged pipeline (which
is only available for merged pipelines [1]). Otherwise we use the
'CI_MERGE_REQUEST_DIFF_BASE_SHA' variable.
There are some spacing rules that we follow in the project and it makes
sense to formalize them:
* Ensure there is no space inserted after the logical not '!' operator.
* Ensure there is no space before the case statement's colon.
* Ensure there is no space before the first bracket '[' of an array.
* Ensure there is no space in empty blocks.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The spacing around colons is currently not standardized and as such we
have the following practices in our code base:
- Spacing around the colon `int bf : 1`: 146 instances
- No spacing around the colon `int bf:1`: 148 instances
- Spacing before the colon `int bf :1`: 6 instances
- Spacing after the colon `int bf: 1`: 12 instances
Let's formalize this by picking the most followed pattern and add the
corresponding style to '.clang-format'.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clang-format: indent preprocessor directives after hash
We do not have a rule around the indentation of preprocessor directives.
This was also discussed on the list [1], noting how there is often
inconsistency in the styling. While there was discussion, there was no
conclusion around what is the preferred style here. One style being
indenting after the hash:
#if FOO
# if BAR
# include <foo>
# endif
#endif
The other being before the hash:
#if FOO
#if BAR
#include <foo>
#endif
#endif
Let's pick the former and add 'IndentPPDirectives: AfterHash' value to
our '.clang-format'. There is no clear reason to pick one over the
other, but it would definitely be nicer to be consistent.
It was reported that t1460-refs-migrate.sh fails when using Cygwin with
errors like the following:
error: could not link file '.git/ref_migration.sr9pEF/reftable' to '.git/reftable': Permission denied
As some debugging surfaced, the root cause of this is that some files of
the newly-initialized ref store are still open when the target format is
the "reftable" format, and Cygwin refuses to rename open files.
Fix this issue by closing the new ref store before renaming its files
into place. This is a slight change in behaviour compared to before,
where we kept the new ref store open and then updated the repository's
ref store to point to it.
While we could re-open the new ref store after we have moved files
around, this is ultimately unnecessary. We know that the only user of
`repo_migrate_ref_storage_format()` is the git-refs(1) command, and it
won't access the ref store after it has been migrated anyway. So
reinitializing the ref store would be a waste of time. Regardless of
that it is still sensible to leave the repository in a consistent state.
But instead of reinitializing the ref store, we can simply unset the
repo's ref store altogether and let `get_main_ref_store()` lazily
initialize the new ref store as required.
Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 23 Jul 2024 00:04:54 +0000 (17:04 -0700)]
CodingGuidelines: document a shell that "fails" "VAR=VAL shell_func"
Over the years, we accumulated the community wisdom to avoid the
common "one-short export" construct for shell functions, but seem to
have lost on which exact platform it is known to fail. Now during
an investigation on a breakage for a recent topic, we found one
example of failing shell. Let's document that.
This does *not* mean that we can freely start using the construct
once Ubuntu 20.04 is retired. But it does mean that we cannot use
the construct until Ubuntu 20.04 is fully retired from the machines
that matter. Moreover, posix explicitly says that the behaviour for
the construct is unspecified.
Helped-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tomas Nordin [Mon, 22 Jul 2024 22:53:02 +0000 (22:53 +0000)]
doc: remove dangling closing parenthesis
The second line of the synopsis, starting with [--dry-run] has a
dangling closing paren in the second optional group. Probably added by
mistake, so remove it.
Signed-off-by: Tomas Nordin <tomasn@posteo.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 76880f0510c (doc: git-clone: apply new documentation formatting
guidelines, 2024-03-29), the synopsis of `git clone`'s manual page is
rendered differently than before; Its parent commit did the same for
`git init`.
The result looks quite nice. When rendered with AsciiDoc, that is. When
rendered using AsciiDoctor and displayed in a graphical web browser such
as Firefox, Chrome, Edge, etc, the result is quite unpleasant to my eye,
reading something like this:
The reason is that AsciiDoctor's default style sheet contains this (see
https://github.com/asciidoctor/asciidoctor/blob/854923b15533/src/stylesheets/asciidoctor.css#L519-L521
for context):
pre > code {
display: block;
}
It is this `display: block` that forces the parts that are enclosed in
`<code>` tags (such as the `git clone` or the `--template=` part) to be
rendered on their own line.
Side note: This seems not to affect console web browsers like `lynx` or
`w3m`, most likely because most style sheet directions cannot be
respected in text terminals and therefore they seem to punt on style
sheets altogether.
To fix this, let's apply the method recommended by AsciiDoctor in
https://docs.asciidoctor.org/asciidoctor/latest/html-backend/default-stylesheet/#customize-docinfo
to partially override AsciiDoctor's default style sheet so that the
`<code>` sections of the synopsis are no longer each rendered on their
own, individual lines.
This fixes https://github.com/git-for-windows/git/issues/5063.
Even on the Git home page, where AsciiDoctor's default stylesheet is
_not_ used, this change resulted in some unpleasant rendering where not
only the font is changed for the `<code>` sections of the synopsis, but
padding and a different background color make the visual impression
quite uneven. This has been addressed in the meantime, via
https://github.com/git/git-scm.com/commit/a492d0565512.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add-patch: use normalize_marker() when recounting edited hunk
After the user has edited a hunk the number of lines in the pre- and
post- image lines is recounted the hunk header can be updated before
passing the hunk to "git apply". The recounting code correctly handles
empty context lines where the leading ' ' is omitted by treating '\n'
and '\r' as context lines.
Update this code to use normalize_marker() so that the handling of empty
context lines is consistent with the rest of the hunk parsing
code. There is a small change in behavior as normalize_marker() only
treats "\r\n" as an empty context line rather than any line starting
with '\r'. This should not matter in practice as Macs have used Unix
line endings since MacOs 10 was released in 2001 and if it transpires
that someone is still using an earlier version of MacOs where lines end
with '\r' then we will need to change the handling of '\r' in
normalize_marker() anyway.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add-patch: handle splitting hunks with diff.suppressBlankEmpty
When "add -p" parses diffs, it looks for context lines starting with a
single space. But when diff.suppressBlankEmpty is in effect, an empty
context line will omit the space, giving us a true empty line. This
confuses the parser, which is unable to split based on such a line.
It's tempting to say that we should just make sure that we generate a
diff without that option. However, although we do not parse hunks that
the user has manually edited with parse_diff() we do allow the user
to split such hunks. As POSIX calls the decision of whether to print the
space here "implementation-defined" we need to handle edited hunks where
empty context lines omit the space.
So let's handle both cases: a context line either starts with a space or
consists of a totally empty line by normalizing the first character to a
space when we parse them. Normalizing the first character rather than
changing the code to check for a space or newline will hopefully future
proof against introducing similar bugs if the code is changed.
Reported-by: Ilya Tumaykin <itumaykin@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
doc: git-clone fix discrepancy between asciidoc and asciidoctor
Asciidoc.py does not have the concept of generalized roles, whereas
asciidoctor interprets [foo]`blah` as blah with role foo in the
synopsis, making in effect foo disappear in the output. Note that
square brackets not directly followed by an inline markup do not
define a role, which is why we do not have the issue on other parts of
the documentation.
In order to get a consistant result across asciidoctor and
asciidoc.py, the hack is to use the {empty} entity
to split the bracket part from the inline format part.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 19 Jul 2024 20:52:09 +0000 (13:52 -0700)]
howto-maintain: update daily tasks
Some "implementation details" of how I perform these integration
tasks day to day have changed since the document was originally
written. Update to reflect the way things are currently done.
Junio C Hamano [Fri, 19 Jul 2024 20:44:35 +0000 (13:44 -0700)]
howto-maintain: cover a whole development cycle
The "policy" part is more important than the "daily operation" part
in that it establishes why certain maintainer tasks exist and are
performed the way they are.
The text briefly touches the role each integration branches play in
the workflow, but does not give the whole picture of what happens in
a single development cycle using these branches. Extend the
description to describe a whole development cycle.
This reverts b7d6f23a171 (midx-write.c: use `--stdin-packs` when
repacking, 2024-04-01) and then marks the test created in the previous
change as passing.
The fundamental issue with the reverted change is that the focus on
pack-files separates the object selection from how the multi-pack-index
selects a single pack-file for an object ID with multiple copies among
the tracked pack-files.
The change was made with the intention of improving delta compression in
the resulting pack-file, but that can be resolved with the existing
object list mechanism. There are other potential pitfalls of doing an
object walk at this time if the repository is a blobless partial clone,
and that will require additional testing on top of the one that changes
here.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alex Galvin [Thu, 18 Jul 2024 20:47:37 +0000 (20:47 +0000)]
git-svn: use `svn:global-ignores` to create .gitignore
`svn:global-ignores` contains a list of file patterns that should
not be tracked in version control. The syntax of these patterns is
the same as `svn:ignore`. Their semantics differ: patterns in
`svn:global-ignores` apply to all paths under the directory where
they apply, while `svn:ignore` only applies to the directory's
immediate children.
Signed-off-by: Alex Galvin <agalvin@comqi.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alex Galvin [Thu, 18 Jul 2024 20:47:36 +0000 (20:47 +0000)]
git-svn: add public property `svn:global-ignores`
Subversion 1.8 added a new property `svn:global-ignores`. It
contains a list of patterns used to determine what files should
be ignored. If Git-SVN is going to ignore these files as well, it
is important that we do not skip over directories that have this
property set.
Signed-off-by: Alex Galvin <agalvin@comqi.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.45.0 included the change b7d6f23a171 (midx-write.c: use
`--stdin-packs` when repacking, 2024-04-01) which caused the 'git
multi-pack-index repack' command to use 'git pack-objects --stdin-packs'
instead of listing the objects to repack. While this change was
motivated by efficient cross-process communication and the ability to
improve delta compression, it breaks a fundamental function of the
'incremental-repack' task that is enabled by default in Scalar clones or
Git repositories that run 'git maintenance start'.
The 'incremental-repack' task performs a two-step process of the
'expire' and 'repack' subcommands of the 'git multi-pack-index' builtin.
The 'expire' command removes any pack-files listed in the
multi-pack-index but without any referenced objects. The 'repack' task
then finds a batch of pack-files to repack and sends their objects to
'git pack-objects'. Both the pack-files chosen for the batch and the
objects chosen to repack are based on the ones that the multi-pack-index
references. Objects that appear in a pack-file but have a duplicate copy
in a newer pack-file are not considered in this case. Since the
multi-pack-index references only the newest copy of an object, this
allows the next 'incremental-repack' task to remove the pack-files in
the next 'expire' task. This delay is intentional due to how Windows
handles may block deletion of files with open read handles.
However, the mentioned commit changed this behavior to divorce the set
of objects referenced by the multi-pack-index and instead use a set of
"included" and "excluded" pack-files in the 'git pack-objects' builtin.
When a pack-file is selected as "included", only the objects it contains
but are not in any "excluded" pack-files are considered for repacking.
This has led to client repositories failing to remove old pack-files as
they still have some referenced objects. This grows over time until the
point that Git is trying to repack the same pack-files over and over.
For now, create a test case that demonstrates the expected behavior, but
also fails in its final line. The setup here it attempting to recreate a
typical situation for a repository that uses a blobless partial clone.
There would be a large initial pack-file from the clone that is never
selected in the 'repack' batch. There are other pack-files that have a
combination of new objects from incremental fetches and possibly blobs
that are not connected to those incremental fetches; these blobs could
be filled in from commands like 'git checkout' or 'git blame'. The
pack-files also have some overlap on purpose so test-1 has some
duplicates in test-2 and test-2 has some duplicates in test-3.
At the end of the test, the test-2 pack-file still exists though it
should have been expired. This test will pass when reverting the
offending commit.
Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>