The filtering ref iterator can be used to only yield refs which are not
in a specific skip list. This iterator has an option to double-check the
results it returns, which causes us to seek the reference we are about
to yield via a separate table such that we detect whether the reference
that the first iterator has yielded actually exists.
The value of this is somewhat dubious, and I cannot think of any usecase
where this functionality should be required. Furthermore, this option is
never set in our codebase, which means that it is essentially untested.
And last but not least, the `struct reftable_table` that is used to
implement it is about to go away.
So while we could refactor the code to not use a `reftable_table`, it
very much feels like a wasted effort. Let's just drop this code.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
To read a reference for the reftable stack, we first create a generic
`reftable_table` from the merged table and then read the reference via a
convenience function. We are about to remove these generic interfaces,
so let's instead open-code the logic to prepare for this removal.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/merged: stop using generic tables in the merged table
The merged table provides access to a reftable stack by merging the
contents of those tables into a virtual table. These subtables are being
tracked via `struct reftable_table`, which is a generic interface for
accessing either a single reftable or a merged reftable. So in theory,
it would be possible for the merged table to merge together other merged
tables.
This is somewhat nonsensical though: we only ever set up a merged table
over normal reftables, and there is no reason to do otherwise. This
generic interface thus makes the code way harder to follow and reason
about than really necessary. The abstraction layer may also have an
impact on performance, even though the extra set of vtable function
calls probably doesn't really matter.
Refactor the merged tables to use a `struct reftable_reader` for each of
the subtables instead, which gives us direct access to the underlying
tables. Adjust names accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/merged: expose functions to initialize iterators
We do not expose any functions via our public headers that would allow a
caller to initialize a reftable iterator from a merged table. Instead,
they are expected to go via the generic `reftable_table` interface,
which is somewhat roundabout.
Implement two new functions to initialize iterators for ref and log
records to plug this gap.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 15 Aug 2024 15:22:03 +0000 (08:22 -0700)]
Merge branch 'ps/reftable-stack-compaction' into ps/reftable-drop-generic
* ps/reftable-stack-compaction:
reftable/stack: handle locked tables during auto-compaction
reftable/stack: fix corruption on concurrent compaction
reftable/stack: use lock_file when adding table to "tables.list"
reftable/stack: do not die when fsyncing lock file files
reftable/stack: simplify tracking of table locks
reftable/stack: update stats on failed full compaction
reftable/stack: test compaction with already-locked tables
reftable/stack: extract function to setup stack with N tables
reftable/stack: refactor function to gather table sizes
Junio C Hamano [Thu, 8 Aug 2024 17:41:20 +0000 (10:41 -0700)]
Merge branch 'ps/p4-tests-updates'
Perforce tests have been updated.
* ps/p4-tests-updates:
t98xx: mark Perforce tests as memory-leak free
ci: update Perforce version to r23.2
t98xx: fix Perforce tests with p4d r23 and newer
Junio C Hamano [Thu, 8 Aug 2024 17:41:19 +0000 (10:41 -0700)]
Merge branch 'ps/doc-more-c-coding-guidelines'
Some project conventions have been added to CodingGuidelines.
* ps/doc-more-c-coding-guidelines:
Documentation: consistently use spaces inside initializers
Documentation: document idiomatic function names
Documentation: document naming schema for structs and their functions
Documentation: clarify indentation style for C preprocessor directives
clang-format: fix indentation width for preprocessor directives
Junio C Hamano [Thu, 8 Aug 2024 17:41:19 +0000 (10:41 -0700)]
Merge branch 'dd/notes-empty-no-edit-by-default'
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.
* dd/notes-empty-no-edit-by-default:
notes: do not trigger editor when adding an empty note
Junio C Hamano [Thu, 8 Aug 2024 17:41:18 +0000 (10:41 -0700)]
Merge branch 'es/shell-check-updates'
Test script linter has been updated to catch an attempt to use
one-shot export construct "VAR=VAL func" for shell functions (which
does not work for some shells) better.
* es/shell-check-updates:
check-non-portable-shell: improve `VAR=val shell-func` detection
check-non-portable-shell: suggest alternative for `VAR=val shell-func`
check-non-portable-shell: loosen one-shot assignment error message
t4034: fix use of one-shot variable assignment with shell function
t3430: drop unnecessary one-shot "VAR=val shell-func" invocation
Junio C Hamano [Thu, 8 Aug 2024 17:41:18 +0000 (10:41 -0700)]
Merge branch 'rj/add-p-pager'
A 'P' command to "git add -p" that passes the patch hunk to the
pager has been added.
* rj/add-p-pager:
add-patch: render hunks through the pager
pager: introduce wait_for_pager
pager: do not close fd 2 unnecessarily
add-patch: test for 'p' command
reftable/stack: handle locked tables during auto-compaction
When compacting tables, it may happen that we want to compact a set of
tables which are already locked by a concurrent process that compacts
them. In the case where we wanted to perform a full compaction of all
tables it is sensible to bail out in this case, as we cannot fulfill the
requested action.
But when performing auto-compaction it isn't necessarily in our best
interest of us to abort the whole operation. For example, due to the
geometric compacting schema that we use, it may be that process A takes
a lot of time to compact the bulk of all tables whereas process B
appends a bunch of new tables to the stack. B would in this case also
notice that it has to compact the tables that process A is compacting
already and thus also try to compact the same range, probably including
the new tables it has appended. But because those tables are locked
already, it will fail and thus abort the complete auto-compaction. The
consequence is that the stack will grow longer and longer while A isn't
yet done with compaction, which will lead to a growing performance
impact.
Instead of aborting auto-compaction altogether, let's gracefully handle
this situation by instead compacting tables which aren't locked. To do
so, instead of locking from the beginning of the slice-to-be-compacted,
we start locking tables from the end of the slice. Once we hit the first
table that is locked already, we abort. If we succeeded to lock two or
more tables, then we simply reduce the slice of tables that we're about
to compact to those which we managed to lock.
This ensures that we can at least make some progress for compaction in
said scenario. It also helps in other scenarios, like for example when a
process died and left a stale lockfile behind. In such a case we can at
least ensure some compaction on a best-effort basis.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: fix corruption on concurrent compaction
The locking employed by compaction uses the following schema:
1. Lock "tables.list" and verify that it matches the version we have
loaded in core.
2. Lock each of the tables in the user-supplied range of tables that
we are supposed to compact. These locks prohibit any concurrent
process to compact those tables while we are doing that.
3. Unlock "tables.list". This enables concurrent processes to add new
tables to the stack, but also allows them to compact tables outside
of the range of tables that we have locked.
4. Perform the compaction.
5. Lock "tables.list" again.
6. Move the compacted table into place.
7. Write the new order of tables, including the compacted table, into
the lockfile.
8. Commit the lockfile into place.
Letting concurrent processes modify the "tables.list" file while we are
doing the compaction is very much part of the design and thus expected.
After all, it may take some time to compact tables in the case where we
are compacting a lot of very large tables.
But there is a bug in the code. Suppose we have two processes which are
compacting two slices of the table. Given that we lock each of the
tables before compacting them, we know that the slices must be disjunct
from each other. But regardless of that, compaction performed by one
process will always impact what the other process needs to write to the
"tables.list" file.
Right now, we do not check whether the "tables.list" has been changed
after we have locked it for the second time in (5). This has the
consequence that we will always commit the old, cached in-core tables to
disk without paying to respect what the other process has written. This
scenario would then lead to data loss and corruption.
This can even happen in the simpler case of one compacting process and
one writing process. The newly-appended table by the writing process
would get discarded by the compacting process because it never sees the
new table.
Fix this bug by re-checking whether our stack is still up to date after
locking for the second time. If it isn't, then we adjust the indices of
tables to replace in the updated stack.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: use lock_file when adding table to "tables.list"
When modifying "tables.list", we need to lock the list before updating
it to ensure that no concurrent writers modify the list at the same
point in time. While we do this via the `lock_file` subsystem when
compacting the stack, we manually handle the lock when adding a new
table to it. While not wrong, it is at least inconsistent.
Refactor the code to consistently lock "tables.list" via the `lock_file`
subsytem.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: do not die when fsyncing lock file files
We use `fsync_component_or_die()` when committing an addition to the
"tables.list" lock file, which unsurprisingly dies in case the fsync
fails. Given that this is part of the reftable library, we should never
die and instead let callers handle the error.
Adapt accordingly and use `fsync_component()` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When compacting tables, we store the locks of all tables we are about to
compact in the `table_locks` array. As we currently only ever compact
all tables in the user-provided range or none, we simply track those
locks via the indices of the respective tables in the merged stack.
This is about to change though, as we will introduce a mode where auto
compaction gracefully handles the case of already-locked files. In this
case, it may happen that we only compact a subset of the user-supplied
range of tables. In this case, the indices will not necessarily match
the lock indices anymore.
Refactor the code such that we track the number of locks via a separate
variable. The resulting code is expected to perform the same, but will
make it easier to perform the described change.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: update stats on failed full compaction
When auto-compaction fails due to a locking error, we update the
statistics to indicate this failure. We're not doing the same when
performing a full compaction.
Fix this inconsistency by using `stack_compact_range_stats()`, which
handles the stat update for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: test compaction with already-locked tables
We're lacking test coverage for compacting tables when some of the
tables that we are about to compact are locked. Add two tests that
exercise this, one for auto-compaction and one for full compaction.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: extract function to setup stack with N tables
We're about to add two tests, and both of them will want to initialize
the reftable stack with a set of N tables. Introduce a new function that
handles this and refactor existing tests that use such a setup to use
it.
Note that this changes the exact records contained in the preexisting
tests. This is fine though as we only care about the shape of the stack
here, not the shape of each table.
Furthermore, with this change we now start to disable auto compaction
when writing the tables, as otherwise we might not end up with the
expected amount of new tables added. This also slightly changes the
behaviour of these tests, but the properties we care for remain intact.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/stack: refactor function to gather table sizes
Refactor the function that gathers table sizes to be more idiomatic. For
one, use `REFTABLE_CALLOC_ARRAY()` instead of `reftable_calloc()`.
Second, avoid using an integer to iterate through the tables in the
reftable stack given that `stack_len` itself is using a `size_t`.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 1 Aug 2024 17:18:11 +0000 (10:18 -0700)]
Merge branch 'jc/doc-rebase-fuzz-vs-offset-fix'
"git rebase --help" referred to "offset" (the difference between
the location a change was taken from and the change gets replaced)
incorrectly and called it "fuzz", which has been corrected.
* jc/doc-rebase-fuzz-vs-offset-fix:
doc: difference in location to apply is "offset", not "fuzz"
Junio C Hamano [Wed, 31 Jul 2024 20:34:19 +0000 (13:34 -0700)]
Merge branch 'cp/unit-test-reftable-merged'
Another reftable test has been ported to use the unit test framework.
* cp/unit-test-reftable-merged:
t-reftable-merged: add test for REFTABLE_FORMAT_ERROR
t-reftable-merged: use reftable_ref_record_equal to compare ref records
t-reftable-merged: add tests for reftable_merged_table_max_update_index
t-reftable-merged: improve the const-correctness of helper functions
t-reftable-merged: improve the test t_merged_single_record()
t: harmonize t-reftable-merged.c with coding guidelines
t: move reftable/merged_test.c to the unit testing framework
Junio C Hamano [Wed, 31 Jul 2024 20:34:18 +0000 (13:34 -0700)]
Merge branch 'kn/ci-clang-format'
A CI job that use clang-format to check coding style issues in new
code has been added.
* kn/ci-clang-format:
ci/style-check: add `RemoveBracesLLVM` in CI job
check-whitespace: detect if no base_commit is provided
ci: run style check on GitHub and GitLab
clang-format: formalize some of the spacing rules
clang-format: avoid spacing around bitfield colon
clang-format: indent preprocessor directives after hash
Junio C Hamano [Wed, 31 Jul 2024 20:34:18 +0000 (13:34 -0700)]
Merge branch 'jc/checkout-no-op-switch-errors'
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but found confusing by some users. It now says that the
user needs to give pathspec to specify what paths to checkout.
* jc/checkout-no-op-switch-errors:
checkout: special case error messages during noop switching
"git add -p" by users with diff.suppressBlankEmpty set to true
failed to parse the patch that represents an unmodified empty line
with an empty line (not a line with a single space on it), which
has been corrected.
* pw/add-patch-with-suppress-blank-empty:
add-patch: use normalize_marker() when recounting edited hunk
add-patch: handle splitting hunks with diff.suppressBlankEmpty
All the Perforce tests are free of memory leaks. This went unnoticed
because most folks do not have p4 and p4d installed on their computers.
Consequently, given that the prerequisites for running those tests
aren't fulfilled, `TEST_PASSES_SANITIZE_LEAK=check` won't notice that
those tests are indeed memory leak free.
Mark those tests accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update our Perforce version from r21.2 to r23.2. Note that the updated
version is not the newest version. Instead, it is the last version where
the way that Perforce is being distributed remains the same as in r21.2.
Newer releases stopped distributing p4 and p4d executables as well as
the macOS archives directly and would thus require more work.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the tests in t98xx modify the Perforce depot in ways that the
tool wouldn't normally allow. This is done to test behaviour of git-p4
in certain edge cases that we have observed in the wild, but which
should in theory not be possible.
Naturally, modifying the depot on disk directly is quite intimate with
the tool and thus prone to breakage when Perforce updates the way that
data is stored. And indeed, those tests are broken nowadays with r23 of
Perforce. While a file revision was previously stored as a plain file
"depot/file,v", it is now stored in a directory "depot/file,d" with
compression.
Adapt those tests to handle both old- and new-style depot layouts.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
D Harithamma [Wed, 31 Jul 2024 13:33:59 +0000 (13:33 +0000)]
convert: return early when not tracing
When Git adds a file requiring encoding conversion and tracing of encoding
conversion is not requested via the GIT_TRACE_WORKING_TREE_ENCODING
environment variable, the `trace_encoding()` function still allocates &
prepares "human readable" copies of the file contents before and after
conversion to show in the trace. This results in a high memory footprint
and increased runtime without providing any user-visible benefit.
This fix introduces an early exit from the `trace_encoding()` function
when tracing is not requested, preventing unnecessary memory allocation
and processing.
Signed-off-by: D Harithamma <harithamma.d@ibm.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: consistently use spaces inside initializers
Our coding guide is inconsistent with how it uses spaces inside of
initializers (`struct foo bar = { something }`). While we mostly carry
the space between open and closing braces and the initialized members,
in one case we don't.
Fix this one instance such that we consistently carry the space. This is
also consistent with how clang-format formats such initializers.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We semi-regularly have discussions around whether a function shall be
named `S_release()`, `S_clear()` or `S_free()`. Indeed, it may not be
obvious which of these is preferable as we never really defined what
each of these variants means exactly.
Carve out a space where we can add idiomatic names for common functions
in our coding guidelines and define each of those functions. Like this,
we can get to a shared understanding of their respective semantics and
can easily point towards our style guide in future discussions such that
our codebase becomes more consistent over time.
Note that the intent is not to rename all functions which violate these
semantics right away. Rather, the intent is to slowly converge towards a
common style over time.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: document naming schema for structs and their functions
We nowadays have a proper mishmash of struct-related functions that are
called `<verb>_<struct>` (e.g. `clear_prio_queue()`) versus functions
that are called `<struct>_<verb>` (e.g. `strbuf_clear()`). While the
former style may be easier to tie into a spoken conversation, most of
our communication happens in text anyway. Furthermore, prefixing
functions with the name of the structure they operate on makes it way
easier to group them together, see which functions are related, and will
also help folks who are using code completion.
Let's thus settle on one style, namely the one where functions start
with the name of the structure they operate on.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation: clarify indentation style for C preprocessor directives
In the preceding commit, we have settled on using a single space per
nesting level to indent preprocessor directives. Clarify our coding
guidelines accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clang-format: fix indentation width for preprocessor directives
In [1], we have improved our clang-format configuration to also specify
the style for how to indent preprocessor directives. But while we have
settled the question of where to put the indentation, either before or
after the hash sign, we didn't specify exactly how to indent.
With the current configuration, clang-format uses tabs to indent each
level of nested preprocessor directives, which is in fact unintentional
and never done in our codebase. Instead, we use a mixture of indenting
by either one or two spaces, where using a single space is somewhat more
common.
Adapt our clang-format configuration accordingly by specifying an
indentation width of one space.
Junio C Hamano [Tue, 30 Jul 2024 20:47:26 +0000 (13:47 -0700)]
Merge branch 'kn/ci-clang-format' into ps/doc-more-c-coding-guidelines
* kn/ci-clang-format:
ci/style-check: add `RemoveBracesLLVM` in CI job
check-whitespace: detect if no base_commit is provided
ci: run style check on GitHub and GitLab
clang-format: formalize some of the spacing rules
clang-format: avoid spacing around bitfield colon
clang-format: indent preprocessor directives after hash
René Scharfe [Tue, 30 Jul 2024 14:18:54 +0000 (16:18 +0200)]
grep: -W: skip trailing empty lines at EOF, too
4aa2c4753d (grep: -W: don't extend context to trailing empty lines,
2016-05-28) stopped showing empty lines at the end of function context
when using -W. Do the same for trailing empty lines at the end of
files, for consistency -- it doesn't matter whether a function section
is ended by the next function or the end of the file.
Test it by adding a trailing empty line to the file used by the test
"grep -W" and leave its expected output the same.
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
David Disseldorp [Mon, 29 Jul 2024 15:14:00 +0000 (17:14 +0200)]
notes: do not trigger editor when adding an empty note
With "git notes add -C $blob", the given blob contents are to be made
into a note without involving an editor. But when "--allow-empty" is
given, the editor is invoked, which can cause problems for
non-interactive callers[1].
This behaviour started with 90bc19b3ae (notes.c: introduce
'--separator=<paragraph-break>' option, 2023-05-27), which changed
editor invocation logic to check for a zero length note_data buffer.
Restore the original behaviour of "git note" that takes the contents
given via the "-m", "-C", "-F" options without invoking an editor, by
checking for any prior parameter callbacks, indicated by a non-zero
note_data.msg_nr. Remove the now-unneeded note_data.given flag.
Add a test for this regression by checking whether GIT_EDITOR is
invoked alongside "git notes add -C $empty_blob --allow-empty"
[1] https://github.com/ddiss/icyci/issues/12
Signed-off-by: David Disseldorp <ddiss@suse.de>
[jc: enhanced the test with -m/-F options] Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell, thus should be avoided.
As such, check-non-portable-shell.pl warns when it detects such usage.
However, a limitation of the check is that it only detects such
invocations when variable assignment (i.e. `VAR=val`) is the first thing
on the line. Thus, it can easily be fooled by an invocation such as:
echo X | VAR=val shell-func
Address this shortcoming by loosening the check so that the variable
assignment can be recognized even when not at the beginning of the line.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Sunshine [Sat, 27 Jul 2024 05:35:08 +0000 (01:35 -0400)]
check-non-portable-shell: suggest alternative for `VAR=val shell-func`
Most problems reported by check-non-portable-shell are accompanied by
advice suggesting how the test author can repair the problem. For
instance:
error: egrep/fgrep obsolescent (use grep -E/-F)
However, when one-shot variable assignment is detected when calling a
shell function (i.e. `VAR=val shell-func`), the problem is reported, but
no advice is given. The lack of advice is particularly egregious since
neither the problem nor the workaround are likely well-known by
newcomers to the project writing tests for the first time. Address this
shortcoming by recommending the use of `test_env` which is tailor made
for this specific use-case.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a0a630192d (t/check-non-portable-shell: detect "FOO=bar
shell_func", 2018-07-13) added the check for one-shot environment
variable assignment for shell functions, the primary reason given for
avoiding them was that, under some shells, the assignment outlives the
invocation of the shell function, thus could potentially negatively
impact subsequent commands in the same test, as well as subsequent
tests.
However, it has recently become apparent that this is not the only
potential problem with one-shot assignments and shell functions. Another
problem is that some shells do not actually export the variable to
commands which the function invokes[1]. More significantly, however, the
behavior of one-shot assignments with shell functions is not specified
by POSIX[2].
Given this new understanding, the presented error message ("assignment
extends beyond 'shell_func'") is too specific and potentially
misleading. Address this by emitting a less specific error message.
(Note that the wording "is not portable" is chosen over the more
specific "behavior not specified by POSIX" for consistency with almost
all other error message issued by this "lint" script.)
Eric Sunshine [Sat, 27 Jul 2024 05:35:06 +0000 (01:35 -0400)]
t4034: fix use of one-shot variable assignment with shell function
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell, thus should be avoided.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Sunshine [Sat, 27 Jul 2024 05:35:05 +0000 (01:35 -0400)]
t3430: drop unnecessary one-shot "VAR=val shell-func" invocation
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell. One such problematic behavior
is that, with some shells, the assignment will outlive the invocation of
the function, thus may potentially impact subsequent commands in the
test, as well as subsequent tests. A common way to work around the
problem is to wrap a subshell around the one-shot assignment, thus
ensuring that the assignment is short-lived.
In this test, the subshell is employed precisely for this purpose; other
side-effects of the subshell, such as losing the effect of `test_tick`
which is invoked by `test_commit`, are immaterial.
These days, we can take advantage of `test_commit --author` to more
clearly convey that the test is interested only in overriding the author
of the commit.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 25 Jul 2024 17:27:29 +0000 (10:27 -0700)]
doc: difference in location to apply is "offset", not "fuzz"
The documentation to "git rebase" says that the line numbers (in the
rebased change) may not exactly be the same as the line numbers the
change gets replayed on top of the new base, but uses a wrong noun
"fuzz". It should have said "offset".
They are both terms of art. "fuzz" is about context lines not
exactly matching. "offset" is about the difference in the location
that a change was taken from the original and the change gets
replayed on the target. "offset" is often inevitable and part of
normal life. "fuzz" on the other hand is often a sign of trouble
(and indeed "Git" refuses to apply a change with "fuzz", except
there are options to be fuzzy about whitespaces).
Make the print command trigger the pager when invoked using a capital
'P', to make it easier for the user to review long hunks.
Note that if the PAGER ends unexpectedly before we've been able to send
the payload, perhaps because the user is not interested in the whole
thing, we might receive a SIGPIPE, which would abruptly and unexpectedly
terminate the interactive session for the user.
Therefore, we need to ignore a possible SIGPIPE signal. Add a test for
this, in addition to the test for normal operation.
For the SIGPIPE test, we need to make sure that we completely fill the
operating system's buffer, otherwise we might not trigger the SIGPIPE
signal. The normal size of this buffer in different OSs varies from a
few KBs to 1MB. Use a payload large enough to guarantee that we exceed
this limit.
Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since f67b45f862 (Introduce trivial new pager.c helper infrastructure,
2006-02-28) we have the machinery to send our output to a pager.
That machinery, once set up, does not allow us to regain the original
stdio streams.
In the interactive commands (i.e.: add -p) we want to use the pager for
some output, while maintaining the interaction with the user.
Modify the pager machinery so that we can use `setup_pager()` and, once
we've finished sending the desired output for the pager, wait for the
pager termination using a new function `wait_for_pager()`. Make this
function reset the pager machinery before returning.
One specific point to note is that we avoid forking the pager in
`setup_pager()` if the configured pager is an empty string [*1*] or
simply "cat" [*2*]. In these cases, `setup_pager()` does nothing and
therefore `wait_for_pager()` should not be called.
We could modify `setup_pager()` to return an indication of these
situations, so we could avoid calling `wait_for_pager()`.
However, let's avoid transferring that responsibility to the caller and
instead treat the call to `wait_for_pager()` as a no-op when we know we
haven't forked the pager.
1.- 402461aab1 (pager: do not fork a pager if PAGER is set to empty.,
2006-04-16)
2.- caef71a535 (Do not fork PAGER=cat, 2006-04-16)
Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We send errors to the pager since 61b80509e3 (sending errors to stdout
under $PAGER, 2008-02-16).
In a8335024c2 (pager: do not dup2 stderr if it is already redirected,
2008-12-15) an exception was introduced to avoid redirecting stderr if
it is not connected to a terminal.
In such exceptional cases, the close(STDERR_FILENO) we're doing in
close_pager_fds, is unnecessary.
Furthermore, in a subsequent commit we're going to introduce changes
that will involve using close_pager_fds multiple times.
With this in mind, controlling when we want to close stderr, become
sensible.
Let's close(STDERR_FILENO) only when necessary, and pave the way for the
upcoming changes.
Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 25 Jul 2024 15:49:34 +0000 (08:49 -0700)]
ReviewingGuidelines: encourage positive reviews more
I saw some contributors hesitate to give a positive review on
patches by their coworkers. When written well, a positive review
does not have to be a hollow "looks good" that rubber stamps an
useless approval on a topic that is not interesting to others.
Let's add a few paragraphs to encourage positive reviews, which is a
bit harder to give than a review to point out things to improve.
Alexander Shopov [Wed, 24 Jul 2024 11:11:11 +0000 (14:11 +0300)]
show-ref: improve short help messages of options
Trivial change to indicate that branches and tags are real options
that can be used combined to get more information. This helps with
linting translations and prompting the user that the terms represent
options.
Signed-off-by: Alexander Shopov <ash@kambanaria.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 22 Jul 2024 21:17:55 +0000 (14:17 -0700)]
Doc: fix Asciidoctor css workaround
The previous step introduced docinfo.html to be used to tweak the
CSS used by the asciidoctor, that by default renders <code> inside
<pre> as a block element, breaking the SYNOPSIS section of a few
pages that adopted a new convention we use since Git 2.45.
But in this project, HTML files are all generated. We do not force
any human to write HTML by hand, which is an unusual and cruel
punishment. "*.html" is in the .gitignore file, and "make clean"
removes them. Having a tracked .html file makes "make clean" make
the tree dirty by removing the tracked docinfo.html file.
Let's do an obvious, minimum and stupid workaround to generate that
file at runtime instead. The mark-up is being rethought in a major
way for the next development cycle, and the CSS workaround we added
in the previous step may have to adjusted, possibly in a large way,
anyway.
For 'clang-format', setting 'RemoveBracesLLVM' to 'true', adds a check
to ensure we avoid curly braces for single-statement bodies in
conditional blocks.
However, the option does come with two warnings [1]:
This option will be renamed and expanded to support other styles.
and
Setting this option to true could lead to incorrect code formatting
due to clang-format’s lack of complete semantic information. As
such, extra care should be taken to review code changes made by
this option.
The latter seems to be of concern. While we want to experiment with the
rule, adding it to the in-tree '.clang-format' could affect end-users.
Let's only add it to the CI jobs for now. With time, we can evaluate
its efficacy and decide if we want to add it to '.clang-format' or
retract it entirely. We do so, by adding the existing rules in
'.clang-format' and this rule to a temp file outside the working tree,
which is then used by 'git clang-format'. This ensures we don't murk
with files in-tree.
check-whitespace: detect if no base_commit is provided
The 'check-whitespace' CI script exits gracefully if no base commit is
provided or if an invalid revision is provided. This is not good because
if a particular CI provides an incorrect base_commit, it would fail
successfully.
This is exactly the case with the GitLab CI. The CI is using the
"$CI_MERGE_REQUEST_TARGET_BRANCH_SHA" variable to get the base commit
SHA, but variable is only defined for _merged_ pipelines. So it is empty
for regular pipelines [1]. This should've failed the check-whitespace
job.
Let's fallback to 'CI_MERGE_REQUEST_DIFF_BASE_SHA' if
"CI_MERGE_REQUEST_TARGET_BRANCH_SHA" isn't available in GitLab CI,
similar to the previous commit. Let's also add a check for incorrect
base_commit in the 'check-whitespace.sh' script. While here, fix a small
typo too.
We don't run style checks on our CI, even though we have a
'.clang-format' setup in the repository. Let's add one, the job will
validate only against the new commits added and will only run on merge
requests. Since we're introducing it for the first time, let's allow
this job to fail, so we can validate if this is useful and eventually
enforce it.
For GitHub, we allow the job to pass by adding 'continue-on-error: true'
to the workflow. This means the job would show as passed, even if the
style check failed. To know the status of the job, users have to
manually check the logs.
For GitLab, we allow the job to pass by adding 'allow_failure: true', to
the job. Unlike GitHub, here the job will show as failed with a yellow
warning symbol, but the pipeline would still show as passed.
Also for GitLab, we use the 'CI_MERGE_REQUEST_TARGET_BRANCH_SHA'
variable by default to obtain the base SHA of the merged pipeline (which
is only available for merged pipelines [1]). Otherwise we use the
'CI_MERGE_REQUEST_DIFF_BASE_SHA' variable.
There are some spacing rules that we follow in the project and it makes
sense to formalize them:
* Ensure there is no space inserted after the logical not '!' operator.
* Ensure there is no space before the case statement's colon.
* Ensure there is no space before the first bracket '[' of an array.
* Ensure there is no space in empty blocks.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The spacing around colons is currently not standardized and as such we
have the following practices in our code base:
- Spacing around the colon `int bf : 1`: 146 instances
- No spacing around the colon `int bf:1`: 148 instances
- Spacing before the colon `int bf :1`: 6 instances
- Spacing after the colon `int bf: 1`: 12 instances
Let's formalize this by picking the most followed pattern and add the
corresponding style to '.clang-format'.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
clang-format: indent preprocessor directives after hash
We do not have a rule around the indentation of preprocessor directives.
This was also discussed on the list [1], noting how there is often
inconsistency in the styling. While there was discussion, there was no
conclusion around what is the preferred style here. One style being
indenting after the hash:
#if FOO
# if BAR
# include <foo>
# endif
#endif
The other being before the hash:
#if FOO
#if BAR
#include <foo>
#endif
#endif
Let's pick the former and add 'IndentPPDirectives: AfterHash' value to
our '.clang-format'. There is no clear reason to pick one over the
other, but it would definitely be nicer to be consistent.
It was reported that t1460-refs-migrate.sh fails when using Cygwin with
errors like the following:
error: could not link file '.git/ref_migration.sr9pEF/reftable' to '.git/reftable': Permission denied
As some debugging surfaced, the root cause of this is that some files of
the newly-initialized ref store are still open when the target format is
the "reftable" format, and Cygwin refuses to rename open files.
Fix this issue by closing the new ref store before renaming its files
into place. This is a slight change in behaviour compared to before,
where we kept the new ref store open and then updated the repository's
ref store to point to it.
While we could re-open the new ref store after we have moved files
around, this is ultimately unnecessary. We know that the only user of
`repo_migrate_ref_storage_format()` is the git-refs(1) command, and it
won't access the ref store after it has been migrated anyway. So
reinitializing the ref store would be a waste of time. Regardless of
that it is still sensible to leave the repository in a consistent state.
But instead of reinitializing the ref store, we can simply unset the
repo's ref store altogether and let `get_main_ref_store()` lazily
initialize the new ref store as required.
Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 23 Jul 2024 00:04:54 +0000 (17:04 -0700)]
CodingGuidelines: document a shell that "fails" "VAR=VAL shell_func"
Over the years, we accumulated the community wisdom to avoid the
common "one-short export" construct for shell functions, but seem to
have lost on which exact platform it is known to fail. Now during
an investigation on a breakage for a recent topic, we found one
example of failing shell. Let's document that.
This does *not* mean that we can freely start using the construct
once Ubuntu 20.04 is retired. But it does mean that we cannot use
the construct until Ubuntu 20.04 is fully retired from the machines
that matter. Moreover, posix explicitly says that the behaviour for
the construct is unspecified.
Helped-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tomas Nordin [Mon, 22 Jul 2024 22:53:02 +0000 (22:53 +0000)]
doc: remove dangling closing parenthesis
The second line of the synopsis, starting with [--dry-run] has a
dangling closing paren in the second optional group. Probably added by
mistake, so remove it.
Signed-off-by: Tomas Nordin <tomasn@posteo.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 76880f0510c (doc: git-clone: apply new documentation formatting
guidelines, 2024-03-29), the synopsis of `git clone`'s manual page is
rendered differently than before; Its parent commit did the same for
`git init`.
The result looks quite nice. When rendered with AsciiDoc, that is. When
rendered using AsciiDoctor and displayed in a graphical web browser such
as Firefox, Chrome, Edge, etc, the result is quite unpleasant to my eye,
reading something like this:
The reason is that AsciiDoctor's default style sheet contains this (see
https://github.com/asciidoctor/asciidoctor/blob/854923b15533/src/stylesheets/asciidoctor.css#L519-L521
for context):
pre > code {
display: block;
}
It is this `display: block` that forces the parts that are enclosed in
`<code>` tags (such as the `git clone` or the `--template=` part) to be
rendered on their own line.
Side note: This seems not to affect console web browsers like `lynx` or
`w3m`, most likely because most style sheet directions cannot be
respected in text terminals and therefore they seem to punt on style
sheets altogether.
To fix this, let's apply the method recommended by AsciiDoctor in
https://docs.asciidoctor.org/asciidoctor/latest/html-backend/default-stylesheet/#customize-docinfo
to partially override AsciiDoctor's default style sheet so that the
`<code>` sections of the synopsis are no longer each rendered on their
own, individual lines.
This fixes https://github.com/git-for-windows/git/issues/5063.
Even on the Git home page, where AsciiDoctor's default stylesheet is
_not_ used, this change resulted in some unpleasant rendering where not
only the font is changed for the `<code>` sections of the synopsis, but
padding and a different background color make the visual impression
quite uneven. This has been addressed in the meantime, via
https://github.com/git/git-scm.com/commit/a492d0565512.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add-patch: use normalize_marker() when recounting edited hunk
After the user has edited a hunk the number of lines in the pre- and
post- image lines is recounted the hunk header can be updated before
passing the hunk to "git apply". The recounting code correctly handles
empty context lines where the leading ' ' is omitted by treating '\n'
and '\r' as context lines.
Update this code to use normalize_marker() so that the handling of empty
context lines is consistent with the rest of the hunk parsing
code. There is a small change in behavior as normalize_marker() only
treats "\r\n" as an empty context line rather than any line starting
with '\r'. This should not matter in practice as Macs have used Unix
line endings since MacOs 10 was released in 2001 and if it transpires
that someone is still using an earlier version of MacOs where lines end
with '\r' then we will need to change the handling of '\r' in
normalize_marker() anyway.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>