Jeff King [Thu, 31 Aug 2023 06:23:20 +0000 (02:23 -0400)]
lower core.maxTreeDepth default to 2048
On my Linux system, all of our recursive tree walking algorithms can run
up to the 4096 default limit without segfaulting. But not all platforms
will have stack sizes as generous (nor might even Linux if we kick off a
recursive walk within a thread).
In particular, several of the tests added in the previous few commits
fail in our Windows CI environment. Through some guess-and-check
pushing, I found that 3072 is still too much, but 2048 is OK.
These are obviously vague heuristics, and there is nothing to promise
that another system might not have trouble at even lower values. But it
seems unlikely anybody will be too angry about a 2048-depth limit (this
is close to the default max-pathname limit on Linux even for a
pathological path like "a/a/a/..."). So let's just lower it.
Some alternatives are:
- configure separate defaults for Windows versus other platforms.
- just skip the tests on Windows. This leaves Windows users with the
annoying case that they can be crashed by running out of stack
space, but there shouldn't be any security implications (they can't
go deep enough to hit integer overflow problems).
Since the original default was arbitrary, it seems less confusing to
just lower it, keeping behavior consistent across platforms.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:22:08 +0000 (02:22 -0400)]
tree-diff: respect max_allowed_tree_depth
When diffing trees, we recurse to handle subtrees. That means we may run
out of stack space and segfault. Let's teach this code path about
core.maxTreeDepth in order to fail more gracefully.
As with the previous patch, we have no way to return an error (and other
tree-loading problems would just cause us to die()). So we'll likewise
call die() if we exceed the maximum depth.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:22:03 +0000 (02:22 -0400)]
list-objects: respect max_allowed_tree_depth
The tree traversal in list-objects.c, which is used by "rev-list
--objects", etc, uses recursion and may run out of stack space. Let's
teach it about the new core.maxTreeDepth config option.
We unfortunately can't return an error here, as this code doesn't
produce an error return at all. We'll die() instead, which matches the
behavior when we see an otherwise broken tree.
Note that this will also generally reject such deep trees from entering
the repository from a fetch or push, due to the use of rev-list in the
connectivity check. But it's not foolproof! We stop traversing when we
see an UNINTERESTING object, and the connectivity check marks existing
ref tips as UNINTERESTING. So imagine commit X has a tree
with maximum depth N. If you then create a new commit Y with a tree
entry "Y:subdir" that points to "X^{tree}", then the depth of Y will be
N+1. But a connectivity check running "git rev-list --objects Y --not X"
won't realize that; it will stop traversing at X^{tree}, since that was
already reachable.
So this will stop naive pushes of too-deep trees, but not carefully
crafted malicious ones. Doing it robustly and efficiently would require
caching the maximum depth of each tree (i.e., the longest path to any
leaf entry). That's much more complex and not strictly needed. If each
recursive algorithm limits itself already, then that's sufficient.
Blocking the objects from entering the repo would be a nice
belt-and-suspenders addition, but it's not worth the extra cost.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:21:55 +0000 (02:21 -0400)]
read_tree(): respect max_allowed_tree_depth
The read_tree() function reads trees recursively (via its read_tree_at()
helper). This can cause it to run out of stack space on very deep trees.
Let's teach it about the new core.maxTreeDepth option.
The easiest way to demonstrate this is via "ls-tree -r", which the test
covers. Note that I needed a tree depth of ~30k to trigger a segfault on
my Linux system, not the 4100 used by our "big" test in t6700. However,
that test still tells us what we want: that the default 4096 limit is
enough to prevent segfaults on all platforms. We could bump it, but that
increases the cost of the test setup for little gain.
As an interesting side-note: when I originally wrote this patch about 4
years ago, I needed a depth of ~50k to segfault. But porting it forward,
the number is much lower. Seemingly little things like cf0983213c (hash:
add an algo member to struct object_id, 2021-04-26) take it from 32,722
to 29,080.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:21:40 +0000 (02:21 -0400)]
traverse_trees(): respect max_allowed_tree_depth
The tree-walk.c code walks trees recursively, and may run out of stack
space. The easiest way to see this is with git-archive; on my 64-bit
Linux system it runs out of stack trying to generate a tarfile with a
tree depth of 13,772.
I've picked 4100 as the depth for our "big" test. I ran it with a much
higher value to confirm that we do get a segfault without this patch.
But really anything over 4096 is sufficient for its stated purpose,
which is to find out if our default limit of 4096 is low enough to
prevent segfaults on all platforms. Keeping it small saves us time on
the test setup.
The tree-walk code that's touched here underlies unpack_trees(), so this
protects any programs which use it, not just git-archive (but archive is
easy to test, and was what alerted me to this issue in a real-world
case).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:21:00 +0000 (02:21 -0400)]
add core.maxTreeDepth config
Most of our tree traversal algorithms use recursion to visit sub-trees.
For pathologically large trees, this can cause us to run out of stack
space and abort in an uncontrolled way. Let's put our own limit here so
that we can fail gracefully rather than segfaulting.
In similar cases where we recursed along the commit graph, we rewrote
the algorithms to avoid recursion and keep any stack data on the heap.
But the commit graph is meant to grow without bound, whereas it's not an
imposition to put a limit on the maximum size of tree we'll handle.
And this has a bonus side effect: coupled with a limit on individual
tree entry names, this limits the total size of a path we may encounter.
This gives us an extra protection against code handling long path names
which may suffer from integer overflows in the size (which could then be
exploited by malicious trees).
The default of 4096 is set to be much longer than anybody would care
about in the real world. Even with single-letter interior tree names
(like "a/b/c"), such a path is at least 8191 bytes. While most operating
systems will let you create such a path incrementally, trying to
reference the whole thing in a system call (as Git would do when
actually trying to access it) will result in ENAMETOOLONG. Coupled with
the recent fsck.largePathname warning, the maximum total pathname Git
will handle is (by default) 16MB.
This config option doesn't do anything yet; future patches will convert
various algorithms to respect the limit.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:20:01 +0000 (02:20 -0400)]
fsck: detect very large tree pathnames
In general, Git tries not to arbitrarily limit what it will store, and
there are currently no limits at all on the size of the path we find in
a tree. In theory you could have one that is gigabytes long.
But in practice this freedom is not really helping anybody, and is
potentially harmful:
1. Most operating systems have much lower limits for the size of a
single pathname component (e.g., on Linux you'll generally get
ENAMETOOLONG for anything over 255 bytes). And while you _can_ use
Git in a way that never touches the filesystem (manipulating the
index and trees directly), it's still probably not a good idea to
have gigantic tree names. Many operations load and traverse them,
so any clever Git-as-a-database scheme is likely to perform poorly
in that case.
2. We still have a lot of code which assumes strings are reasonably
sized, and I won't be at all surprised if you can trigger some
interesting integer overflows with gigantic pathnames. Stopping
malicious trees from entering the repository provides an extra line
of defense, protecting downstream code.
This patch implements an fsck check so that such trees can be rejected
by transfer.fsckObjects. I've picked a reasonably high maximum depth
here (4096) that hopefully should not bother anybody in practice. I've
also made it configurable, as an escape hatch.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:19:22 +0000 (02:19 -0400)]
tree-walk: rename "error" variable
The "error" variable in traverse_trees() shadows the global error()
function (meaning we can't call error() from here). Let's call the local
variable "ret" instead, which matches the idiom in other functions.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:19:16 +0000 (02:19 -0400)]
tree-walk: drop MAX_TRAVERSE_TREES macro
Since the previous commit dropped the hard-coded limit in
traverse_trees(), we don't need this macro there anymore (the code can
handle any number of trees in parallel).
We do define MAX_UNPACK_TREES using MAX_TRAVERSE_TREES, due to 5290d45134 (tree-walk.c: break circular dependency with unpack-trees,
2020-02-01). So we can just directly define that as "8" now; we know
traverse_trees() can handle whatever we throw at it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 06:17:54 +0000 (02:17 -0400)]
tree-walk: reduce stack size for recursive functions
The traverse_trees() and traverse_trees_recursive() functions call each
other recursively. In a deep tree, this can result in running out of
stack space and crashing.
There's obviously going to be some limit here based on available stack,
but the problem is exacerbated by a few large structs, many of which we
over-allocate. For example, in traverse_trees() we store a name_entry
and tree_desc_x per tree, both of which contain an object_id (which is
now 32 bytes). And we allocate 8 of them (from MAX_TRAVERSE_TREES), even
though many traversals will only look at 1 or 2.
Interestingly, we used to allocate these on the heap, prior to 8dd40c0472 (traverse_trees(): use stack array for name entries,
2020-01-30). That commit was trying to simplify away allocation size
computations, and naively assumed that the sizes were small enough not
to matter. And they don't in normal cases, but on my stock Debian system
I see a crash running "git archive" on a tree with ~3600 entries.
That's deep enough we wouldn't see it in practice, but probably shallow
enough that we'd prefer not to make it a hard limit. Especially because
other systems may have even smaller stacks.
We can replace these stack variables with a few malloc invocations. This
reduces the stack sizes for the two functions from 1128 and 752 bytes,
respectively, down to 40 and 92 bytes. That allows a depth of ~13000 on
my machine (the improvement isn't in linear proportion because my
numbers don't count the size of parameters and other function overhead).
The possible downsides are:
1. We now have to remember to free(). But both functions have an easy
single exit (and already had to clean up other bits anyway).
2. The extra malloc()/free() overhead might be measurable. I tested
this by setting up a 3000-depth tree with a single blob and running
"git archive" on it. After switching to the heap, it consistently
runs 2-3% faster! Presumably this is because the 1K+ of wasted
stack space penalized memory caches.
On a more real-world case like linux.git, the speed difference isn't
measurable at all, simply because most trees aren't that deep and
there's so much other work going on (like accessing the objects
themselves). So the improvement I saw should be taken as evidence that
we're not making anything worse, but isn't really that interesting on
its own. The main motivation here is that we're now less likely to run
out of stack space and crash.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 21:17:33 +0000 (17:17 -0400)]
format-patch: use OPT_STRING_LIST for to/cc options
The to_callback() and cc_callback() functions are identical to the
generic parse_opt_string_list() function (except that they don't handle
optional arguments, but that's OK because their callers do not use the
OPTARG flag).
Let's simplify the code by using OPT_STRING_LIST.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 21:17:16 +0000 (17:17 -0400)]
merge: simplify parsing of "-n" option
The "-n" option is implemented by an option callback, as it is really a
"reverse bool". If given, it sets show_diffstat to 0. In theory, when
negated, it would set the same flag to 1. But it's not possible to
trigger that, since short options cannot be negated.
So in practice this is really just a SET_INT to 0. Let's use that
instead, which shortens the code.
Note that negation here would do the wrong thing (as with any SET_INT
with a value of "0"). We could specify PARSE_OPT_NONEG to future-proof
ourselves against somebody adding a long option name (which would make
it possible to negate). But there's not much point:
1. Nobody is going to do that, because the negated form already
exists, and is called "--stat" (which is defined separately so that
"--no-stat" works).
2. If they did, the BUG() check added by 3284b93862 (parse-options:
disallow negating OPTION_SET_INT 0, 2023-08-08) will catch it (and
that check is smart enough to realize that our short-only option is
OK).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Thu, 31 Aug 2023 21:17:11 +0000 (17:17 -0400)]
merge: make xopts a strvec
The "xopts" variable uses a custom array with ALLOC_GROW(). Using a
strvec simplifies things a bit. We need fewer variables, and we can also
ditch our custom parseopt callback in favor of OPT_STRVEC().
As a bonus, this means that "--no-strategy-option", which was previously
a silent noop, now does something useful: like other list-like options,
it will clear the list of -X options seen so far. This matches the
behavior of revert/cherry-pick, which made the same change in fb60b9f37f
(sequencer: use struct strvec to store merge strategy options,
2023-04-10).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drew DeVault [Wed, 30 Aug 2023 06:43:33 +0000 (08:43 +0200)]
format-patch: --rfc honors what --subject-prefix sets
Rather than replacing the configured subject prefix (either through the
git config or command line) entirely with "RFC PATCH", this change
prepends RFC to whatever subject prefix was already in use.
This is useful, for example, when a user is working on a repository that
has a subject prefix considered to disambiguate patches:
Prior to this change, formatting patches with --rfc would lose the
'my-project' information.
The data flow for the subject-prefix was that rev.subject_prefix
were to be kept the authoritative version of the subject prefix even
while parsing command line options, and sprefix variable was used as
a temporary area to futz with it. Now, the parsing code has been
refactored to build the subject prefix into the sprefix variable and
assigns its value at the end to rev.subject_prefix, which makes the
flow easier to grasp.
Signed-off-by: Drew DeVault <sir@cmpwn.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wesley Schwengle [Wed, 30 Aug 2023 22:32:08 +0000 (15:32 -0700)]
git-svn: drop FakeTerm hack
Drop the FakeTerm hack, just like dfd46bae (send-email: drop
FakeTerm hack, 2023-08-08) did, for exactly the same reason.
It has been obsolete in git-svn since 30d45f798d (git-svn: delay term
initialization, 2014-09-14). Note that unlike send-email, we already
make sure to load Term::ReadLine only once. So this is just a cleanup,
and not fixing any bug.
Signed-off-by: Wesley Schwengle <wesleys@opperschaap.net> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 30 Aug 2023 19:51:32 +0000 (15:51 -0400)]
ci: deprecate ci/config/allow-ref script
Now that we have the CI_BRANCHES mechanism, there is no need for anybody
to use the ci/config/allow-ref mechanism. In the long run, we can
hopefully remove it and the whole "config" job, as it consumes CPU and
adds to the end-to-end latency of the whole workflow. But we don't want
to do that immediately, as people need time to migrate until the
CI_BRANCHES change has made it into the workflow file of every branch.
So let's issue a warning, which will appear in the "annotations" section
below the workflow result in GitHub's web interface. And let's remove
the sample allow-refs script, as we don't want to encourage anybody to
use it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 30 Aug 2023 19:51:13 +0000 (15:51 -0400)]
ci: allow branch selection through "vars"
When we added config to skip CI for certain branches in e76eec3554 (ci:
allow per-branch config for GitHub Actions, 2020-05-07), there wasn't
any way to avoid spinning up a VM just to check the config. From the
developer's perspective this isn't too bad, as the "skipped" branches
complete successfully after running the config job (the workflow result
is "success" instead of "skipped", but that is a minor lie).
But we are still wasting time and GitHub's CPU to spin up a VM just to
check the result of a short shell script. At the time there wasn't any
way to avoid this. But they've since introduced repo-level variables
that should let us do the same thing:
This is more efficient, and as a bonus is probably less confusing to
configure (the existing system requires sticking your config on a magic
ref).
See the included docs for how to configure it.
The code itself is pretty simple: it checks the variable and skips the
config job if appropriate (and everything else depends on the config job
already). There are two slight inaccuracies here:
- we don't insist on branches, so this likewise applies to tag names
or other refs. I think in practice this is OK, and keeping the code
(and docs) short is more important than trying to be more exact. We
are targeting developers of git.git and their limited workflows.
- the match is done as a substring (so if you want to run CI for
"foobar", then branch "foo" will accidentally match). Again, this
should be OK in practice, as anybody who uses this is likely to only
specify a handful of well-known names. If we want to be more exact,
we can have the code check for adjoining spaces. Or even move to a
more general CI_CONFIG variable formatted as JSON. I went with this
scheme for the sake of simplicity.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 30 Aug 2023 20:50:41 +0000 (13:50 -0700)]
Merge branch 'jc/diff-exit-code-with-w-fixes'
"git diff -w --exit-code" with various options did not work
correctly, which is being addressed.
* jc/diff-exit-code-with-w-fixes:
diff: the -w option breaks --exit-code for --raw and other output modes
t4040: remove test that succeeded for a wrong reason
diff: teach "--stat -w --exit-code" to notice differences
diff: mode-only change should be noticed by "--patch -w --exit-code"
diff: move the fallback "--exit-code" code down
Jeff King [Tue, 29 Aug 2023 23:45:40 +0000 (19:45 -0400)]
gc: mark unused descriptors in scheduler callbacks
Each of the scheduler update callbacks gets the descriptor of the lock
file, but only the crontab updater needs it. We have to retain the
unused descriptors because these are dispatched from a table of function
pointers, but we should mark them to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:37 +0000 (19:45 -0400)]
fetch: mark unused parameter in ref_transaction callback
Since this callback is just trying to collect the set of queued tag
updates, there is no need for it to look at old_oid at all. Mark it as
unused to appease -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:36 +0000 (19:45 -0400)]
credential: mark unused parameter in urlmatch callback
Our select_all() callback does not need to actually look at its
parameters, since the point is to match everything. But we need to mark
its parameters to satisfy -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:34 +0000 (19:45 -0400)]
grep: mark unused parmaeters in pcre fallbacks
When USE_LIBPCRE2 is not defined, we compile several noop fallbacks.
These need to have their parameters annotated to avoid
-Wunused-parameter warnings (and obviously we cannot remove the
parameters, since the functions must match the non-fallback versions).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:33 +0000 (19:45 -0400)]
imap-send: mark unused parameters with NO_OPENSSL
Earlier patches annotating unused parameters in imap-send missed a few
cases in code that is compiled only with NO_OPENSSL. These need to
retain the extra parameters to match the interfaces used when we compile
with openssl support.
Note in the case of socket_perror() that the function declaration and
parts of its code are shared between the two cases, and only the openssl
code looks at "sock". So we can't simply mark the parameter as always
unused. Instead, we can add a noop statement that references it. This is
ugly, but should be portable.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:28 +0000 (19:45 -0400)]
add-interactive: mark unused callback parameters
The interactive commands are dispatched from a table of abstract
pointers, but not every command uses every parameter it receives. Mark
the unused ones to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:25 +0000 (19:45 -0400)]
test-trace2: mark unused argv/argc parameters
The trace2 test helper uses function pointers to dispatch to individual
tests. Not all tests bother looking at their argv/argc parameters. We
could tighten this up (e.g., complaining when seeing unexpected
parameters), but for internal test code it's not worth worrying about.
This is similar in spirit to 126e3b3d2a (t/helper: mark unused argv/argc
arguments, 2023-03-28).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:22 +0000 (19:45 -0400)]
trace2: mark unused us_elapsed_absolute parameters
Many trace2 targets ignore the absolute elapsed time parameters.
However, the virtual interface needs to retain the parameter since it is
used by others (e.g., the perf target).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:19 +0000 (19:45 -0400)]
ls-tree: mark unused parameter in callback
The formatting functions are dispatched from a table of function
pointers. The "path name only" function unsurprisingly does not need to
look at its "oid" parameter, but we must mark it as unused to make
-Wunused-parameter happy.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:17 +0000 (19:45 -0400)]
commit-graph: mark unused data parameters in generation callbacks
The compute_generation_info code uses function pointers to abstract the
get/set generation operations. Some callers don't need the extra void
data pointer, which should be annotated to appease -Wunused-parameter.
Note that we can drop the assignment of the "data" parameter in
compute_generation_numbers(), as we've just shown that neither of the
callbacks it uses will access it. This matches the caller in
ensure_generations_valid(), which already does not bother to set "data".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:45:06 +0000 (19:45 -0400)]
ref-filter: mark unused parameters in parser callbacks
These are similar to the cases annotated in 5fe9e1ce2f (ref-filter: mark
unused callback parameters, 2023-02-24), but were added after that
commit.
Note that the ahead/behind callback ignores its "atom" parameter, which
is a little unusual, since that struct usually stores the result. But in
this case, the data is stored centrally in ref_array->counts, since we
want to compute all ahead/behinds at once, not per ref.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:44:23 +0000 (19:44 -0400)]
sequencer: mark repository argument as unused
In sequencer_get_last_command(), we don't ever look at the repository
parameter. This is due to ed5b1ca10b (status: do not report errors in
sequencer/todo, 2019-06-27), which dropped the call to parse_insn_line().
However, it _should_ be used when calling into git_path_* functions,
but the one we use here is declared with the non-REPO variant of
GIT_PATH_FUNC(), and so just uses the_repository internally.
We could change the path helper to use REPO_GIT_PATH_FUNC(), but doing
so piecemeal is not great. There are 41 uses of GIT_PATH_FUNC() in
sequencer.c, and inconsistently switching one makes the code more
confusing. Likewise, this one function is used in half a dozen other
spots, all of which would need to start passing in a repository argument
(with rippling effects up the call stack).
So let's punt on that for now and just silence any -Wunused-parameter
warning.
Note that we could also drop this parameter entirely, as the function is
always called directly, and not as a callback that has to conform to
some external interface. But since we'd eventually want to use the
repository parameter, let's leave it in place to avoid disrupting the
callers twice.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 29 Aug 2023 23:43:39 +0000 (19:43 -0400)]
sequencer: use repository parameter in short_commit_name()
Instead of just using the_repository, we can take a repository parameter
from the caller. Most of them already have one, and doing so clears up a
few -Wunused-parameter warnings. There are still a few callers which use
the_repository, but this pushes us one small step forward to eventually
getting rid of those.
Note that a few of these functions have a "rev_info" whose "repo"
parameter could probably be used instead of the_repository. I'm leaving
that for further cleanups, as it's not immediately obvious that
revs->repo is always valid, and there's quite a bit of other possible
refactoring here (even getting rid of some "struct repository" arguments
in favor of revs->repo).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Every once in a while, the `git-p4` tests flake for reasons outside of
our control. It typically fails with "Connection refused" e.g. here:
https://github.com/git/git/actions/runs/5969707156/job/16196057724
[...]
+ git p4 clone --dest=/home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git //depot
Initialized empty Git repository in /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git/.git/
Perforce client error:
Connect to server failed; check $P4PORT.
TCP connect to localhost:9807 failed.
connect: 127.0.0.1:9807: Connection refused
failure accessing depot: could not run p4
Importing from //depot into /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git
[...]
This happens in other jobs, too, but in the `linux-asan-ubsan` job it
hurts the most because that job often takes over a full hour to run,
therefore re-running a failed `linux-asan-ubsan` job is _very_ costly.
The purpose of the `linux-asan-ubsan` job is to exercise the C code of
Git, anyway, and any part of Git's source code that the `git-p4` tests
run and that would benefit from the attention of ASAN/UBSAN are run
better in other tests anyway, as debugging C code run via Python scripts
can get a bit hairy.
In fact, it is not even just `git-p4` that is the problem (even if it
flakes often enough to be problematic in the CI builds), but really the
part about Python scripts. So let's just skip any Python parts of the
tests from being run in that job.
For good measure, also skip the Subversion tests because debugging C
code run via Perl scripts is as much fun as debugging C code run via
Python scripts. And it will reduce the time this very expensive job
takes, which is a big benefit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 29 Aug 2023 20:51:44 +0000 (13:51 -0700)]
Merge branch 'py/git-gui-updates'
Git GUI updates.
* py/git-gui-updates:
git-gui - use mkshortcut on Cygwin
git-gui - use cygstart to browse on Cygwin
git-gui - remove obsolete Cygwin specific code
git gui Makefile - remove Cygwin modifications
Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
Work around Tcl's default `PATH` lookup
Move the `_which` function (almost) to the top
Move is_<platform> functions to the beginning
is_Cygwin: avoid `exec`ing anything
windows: ignore empty `PATH` elements
git-gui: Fix a typo in README
Junio C Hamano [Tue, 29 Aug 2023 20:51:44 +0000 (13:51 -0700)]
Merge branch 'jc/ci-skip-same-commit'
Tweak GitHub Actions CI so that pushing the same commit to multiple
branch tips at the same time will not waste building and testing
the same thing twice.
* jc/ci-skip-same-commit:
ci: avoid building from the same commit in parallel
Taylor Blau [Mon, 28 Aug 2023 22:49:12 +0000 (18:49 -0400)]
Documentation/gitformat-pack.txt: drop mixed version section
This section was added in 3d89a8c118 (Documentation/technical: add
cruft-packs.txt, 2022-05-20) to highlight a potential pitfall when
deploying cruft packs in an environment where multiple versions of Git
are GC-ing the same repository.
Now that it has been more than a year since 3d89a8c118 was written,
let's drop this section as it is no longer relevant.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Mon, 28 Aug 2023 22:49:10 +0000 (18:49 -0400)]
Documentation/gitformat-pack.txt: remove multi-cruft packs alternative
This text, originally from 3d89a8c118 (Documentation/technical: add
cruft-packs.txt, 2022-05-20) lists multiple cruft packs as a potential
alternative to the design of cruft packs.
We have always supported multiple cruft packs (i.e. we use the most
recent mtime for a given object among all cruft packs which contain it,
etc.), but haven't encouraged its use.
We still aren't encouraging users to go out and generate multiple cruft
packs, but let's take a step in that direction by dropping language that
suggests we aren't capable of working with multiple cruft packs.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Mon, 28 Aug 2023 22:49:07 +0000 (18:49 -0400)]
builtin/pack-objects.c: support `--max-pack-size` with `--cruft`
When pack-objects learned the `--cruft` option back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20), we
explicitly forbade `--cruft` with `--max-pack-size`.
At the time, there was no specific rationale given in the patch for not
supporting the `--max-pack-size` option with `--cruft`. (As best I can
remember, it's because we were trying to push users towards only ever
having a single cruft pack, but I cannot be sure).
However, `--max-pack-size` is flexible enough that it already works with
`--cruft` and can shard unreachable objects across multiple cruft packs,
creating separate ".mtimes" files as appropriate. In fact, the
`--max-pack-size` option worked with `--cruft` as far back as b757353676!
This is because we overwrite the `written_list`, and pass down the
appropriate length, i.e. the number of objects written in each pack
shard.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading input with the `--cruft` option, `git pack-objects` reads
each line into a strbuf, and then moves it to either the list of
discarded or fresh packs, depending on whether or not the input line
starts with a '-' character.
At the beginning of each loop iteration, the next line of input is read
with `strbuf_getline()`, which calls `strbuf_reset()` (as a part of
`strbuf_getwholeline()`) before reading the next line of input.
Thus, the call to `strbuf_reset()` (added back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20)) at the
end of the loop is unnecessary, so let's remove it accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Mon, 28 Aug 2023 22:53:04 +0000 (18:53 -0400)]
leak tests: mark t5583-push-branches.sh as leak-free
When t5583-push-branches.sh was originally introduced via 425b4d7f47
(push: introduce '--branches' option, 2023-05-06), it was not leak-free.
In fact, the test did not even run correctly until 022fbb655d (t5583:
fix shebang line, 2023-05-12), but after applying that patch, we see a
failure at t5583.8:
Direct leak of 384 byte(s) in 1 object(s) allocated from:
#0 0x7fb536330986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
#1 0x55e07606cbf9 in xrealloc wrapper.c:140
#2 0x55e075fb6cb3 in prio_queue_put prio-queue.c:42
#3 0x55e075ec81cb in get_reachable_subset commit-reach.c:917
#4 0x55e075fe9cce in add_missing_tags remote.c:1518
#5 0x55e075fea1e4 in match_push_refs remote.c:1665
#6 0x55e076050a8e in transport_push transport.c:1378
#7 0x55e075e2eb74 in push_with_options builtin/push.c:401
#8 0x55e075e2edb0 in do_push builtin/push.c:458
#9 0x55e075e2ff7a in cmd_push builtin/push.c:702
#10 0x55e075d8aaf0 in run_builtin git.c:452
#11 0x55e075d8af08 in handle_builtin git.c:706
#12 0x55e075d8b12c in run_argv git.c:770
#13 0x55e075d8b6a0 in cmd_main git.c:905
#14 0x55e075e81f07 in main common-main.c:60
#15 0x7fb5360ab6c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#16 0x7fb5360ab784 in __libc_start_main_impl ../csu/libc-start.c:360
#17 0x55e075d88f40 in _start (git+0x1ff40) (BuildId: 38ad998b85a535e786129979443630d025ec2453)
SUMMARY: LeakSanitizer: 384 byte(s) leaked in 1 allocation(s).
This leak was addressed independently via 68b51172e3 (commit-reach: fix
memory leak in get_reachable_subset(), 2023-06-03), which makes t5583
leak-free.
But t5583 was not in the tree when 68b51172e3 was written, and the two
only met after the latter was merged back in via 693bde461c (Merge
branch 'mh/commit-reach-get-reachable-plug-leak', 2023-06-20).
At that point, t5583 was leak-free. Let's mark it as such accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Mon, 28 Aug 2023 22:52:59 +0000 (18:52 -0400)]
leak tests: mark a handful of tests as leak-free
In the topic merged via 5a4f8381b6 (Merge branch
'ab/mark-leak-free-tests', 2021-10-25), a handful of tests in the suite
were marked as leak-free.
Since then, a handful of tests have become leak-free due to changes like
- 861c56f6f9 (branch: fix a leak in setup_tracking, 2023-06-11), and
- 866b43e644 (do_read_index(): always mark index as initialized unless
erroring out, 2023-06-29)
, but weren't updated at the time to mark themselves as such. This leads
to test "failures" when running:
$ make SANITIZE=leak
$ make -C t \
GIT_TEST_PASSING_SANITIZE_LEAK=check \
GIT_TEST_SANITIZE_LEAK_LOG=true \
GIT_TEST_OPTS=-vi test
This patch closes those gaps by exporting TEST_PASSES_SANITIZE_LEAK=true
before sourcing t/test-lib.sh on most remaining leak-free tests.
There are a couple of other tests which are similarly leak-free, but not
included in the list of tests touched by this patch. The remaining tests
will be addressed in the subsequent two patches.
Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
PtraceRegistersStatus have_registers =
suspended_threads.GetRegistersAndSP(i, ®isters, &sp);
if (have_registers != REGISTERS_AVAILABLE) {
Report("Unable to get registers from thread %llu.\n", os_id);
// If unable to get SP, consider the entire stack to be reachable unless
// GetRegistersAndSP failed with ESRCH.
if (have_registers == REGISTERS_UNAVAILABLE_FATAL)
continue;
sp = stack_begin;
}
The program itself still runs fine and LSan doesn't cause us to abort.
But test-lib.sh looks for any non-empty LSan logs and marks the test as
a failure anyway, under the assumption that we simply missed the failing
exit code somehow.
I don't think I've ever seen this happen in the CI job, but running
locally using clang-14 on an 8-core machine, I can't seem to make it
through a full run of the test suite without having at least one
failure. And it's a different one every time (though they do seem to
often be related to packing tests, which makes sense, since that is one
of our biggest users of threaded code).
We can hack around this by only counting LSan log files that contain a
line that doesn't match our known-uninteresting pattern.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 28 Aug 2023 16:52:28 +0000 (09:52 -0700)]
The extra batch to update credenthal helpers
These two topics did not see much interest and reviews while they
were on 'next'; let's "inflict" them to the general public and see
if anybody screams, which is much less nicer way than to merge
only topics that are well reviewed down in an orderly manner, but
that is the only thing we can do to these topics without any
development community help.
Junio C Hamano [Mon, 28 Aug 2023 16:51:15 +0000 (09:51 -0700)]
Merge branch 'mh/credential-libsecret-attrs'
The way authentication related data other than passwords (e.g.
oath token and password expiration data) are stored in libsecret
keyrings has been rethought.
* mh/credential-libsecret-attrs:
credential/libsecret: store new attributes
Derrick Stolee [Mon, 28 Aug 2023 13:52:26 +0000 (13:52 +0000)]
scalar reconfigure: help users remove buggy repos
When running 'scalar reconfigure -a', Scalar has warning messages about
the repository missing (or not containing a .git directory). Failures
can also happen while trying to modify the repository-local config for
that repository.
These warnings may seem confusing to users who don't understand what
they mean or how to stop them.
Add a warning that instructs the user how to remove the warning in
future installations.
Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 28 Aug 2023 13:52:25 +0000 (13:52 +0000)]
setup: add discover_git_directory_reason()
There are many reasons why discovering a Git directory may fail. In
particular, 8959555cee7 (setup_git_directory(): add an owner check for
the top-level directory, 2022-03-02) added ownership checks as a
security precaution.
Callers attempting to set up a Git directory may want to inform the user
about the reason for the failure. For that, expose the enum
discovery_result from within setup.c and move it into cache.h where
discover_git_directory() is defined.
I initially wanted to change the return type of discover_git_directory()
to be this enum, but several callers rely upon the "zero means success".
The two problems with this are:
1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive
results are errors.
2. There are multiple successful states; positive results are
successful.
It is worth noting that GIT_DIR_NONE is not returned, so we remove this
option from the enum. We must be careful to keep the successful reasons
as positive values, so they are given explicit positive values.
Instead of updating all callers immediately, add a new method,
discover_git_directory_reason(), and convert discover_git_directory() to
be a thin shim on top of it.
One thing that is important to note is that discover_git_directory()
previously returned -1 on error, so let's continue that into the future.
There is only one caller (in scalar.c) that depends on that signedness
instead of a non-zero check, so clean that up, too.
Because there are extra checks that discover_git_directory_reason() does
after setup_git_directory_gently_1(), there are other modes that can be
returned for failure states. Add these modes to the enum, but be sure to
explicitly add them as BUG() states in the switch of
setup_git_directory_gently().
Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 28 Aug 2023 13:52:24 +0000 (13:52 +0000)]
scalar: add --[no-]src option
Some users have strong aversions to Scalar's opinion that the repository
should be in a 'src' directory, even though this creates a clean slate
for placing build artifacts in adjacent directories.
The new --no-src option allows users to opt out of the default behavior.
While adding options, make sure the usage output by 'scalar clone -h'
reports the same as the SYNOPSIS line in Documentation/scalar.txt.
Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1b68387e02 (builtin/receive-pack.c: use parse_options API, 2016-03-02)
added the options --stateless-rpc, --advertise-refs and
--reject-thin-pack-for-testing with a NULL `help` string; 03831ef7b5
(difftool: implement the functionality in the builtin, 2017-01-19)
similarly added the "helpless" option --prompt. Presumably this was
done because all four options are hidden and self-explanatory.
They cause a NULL pointer dereference when using the option --help-all
with their respective tool, though. Handle such options gracefully
instead by turning the NULL pointer into an empty string at the top of
the loop, always printing a newline at the end and passing through the
separating newlines from the help text.
Junio C Hamano [Fri, 25 Aug 2023 17:37:37 +0000 (10:37 -0700)]
Merge branch 'rs/parse-options-negation-help'
"git cmd -h" learned to signal which options can be negated by
listing such options like "--[no-]opt".
* rs/parse-options-negation-help:
parse-options: simplify usage_padding()
parse-options: no --[no-]no-...
parse-options: factor out usage_indent() and usage_padding()
parse-options: show negatability of options in short help
t1502: test option negation
t1502: move optionspec help output to a file
t1502, docs: disallow --no-help
subtree: disallow --no-{help,quiet,debug,branch,message}
ci: avoid building from the same commit in parallel
At times, we may need to push the same commit to multiple branches
in the same push. Rewinding 'next' to rebuild on top of 'master'
soon after a release is such an occasion. Making sure 'main' stays
in sync with 'master' to help those who expect that primary branch
of the project is named either of these is another.
We already use the branch name as a "concurrency group" key, but
that does not address the situation illustrated above.
Let's introduce another `concurrency` attribute, using the commit
hash as the concurrency group key, on the workflow run level, to
address this. This will hold any workflow run in the queued state
when there is already a workflow run targeting the same commit,
until that latter run completed. The `skip-if-redundant` check of
the second run will then have a chance to see whether the first
run succeeded.
The only caveat with this strategy is that only one workflow run
will be kept in the queued state by the `concurrency` feature: if
another run targeting the same commit is triggered, the
previously-queued run will be canceled. Considering the benefit,
this seems the smaller price to pay than to overload Git's build
agent pool with undesired workflow runs.
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 24 Aug 2023 16:57:43 +0000 (09:57 -0700)]
Merge https://github.com/prati0100/git-gui
* https://github.com/prati0100/git-gui:
git-gui - use mkshortcut on Cygwin
git-gui - use cygstart to browse on Cygwin
git-gui - remove obsolete Cygwin specific code
git gui Makefile - remove Cygwin modifications
Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
Work around Tcl's default `PATH` lookup
Move the `_which` function (almost) to the top
Move is_<platform> functions to the beginning
is_Cygwin: avoid `exec`ing anything
windows: ignore empty `PATH` elements
git-gui: Fix a typo in README
Junio C Hamano [Thu, 24 Aug 2023 16:32:34 +0000 (09:32 -0700)]
Merge branch 'ds/maintenance-schedule-fuzz'
Hourly and other schedule of "git maintenance" jobs are randomly
distributed now.
* ds/maintenance-schedule-fuzz:
maintenance: update schedule before config
maintenance: fix systemd schedule overlaps
maintenance: use random minute in systemd scheduler
maintenance: swap method locations
maintenance: use random minute in cron scheduler
maintenance: use random minute in Windows scheduler
maintenance: use random minute in launchctl scheduler
maintenance: add get_random_minute()
Junio C Hamano [Thu, 24 Aug 2023 16:32:33 +0000 (09:32 -0700)]
Merge branch 'mp/rebase-label-length-limit'
Overly long label names used in the sequencer machinery are now
chopped to fit under filesystem limitation.
* mp/rebase-label-length-limit:
rebase: allow overriding the maximal length of the generated labels
sequencer: truncate labels to accommodate loose refs
Junio C Hamano [Thu, 24 Aug 2023 16:32:33 +0000 (09:32 -0700)]
Merge branch 'ds/upload-pack-error-sequence-fix'
Error message generation fix.
* ds/upload-pack-error-sequence-fix:
upload-pack: fix exit code when denying fetch of unreachable object ID
upload-pack: fix race condition in error messages
Junio C Hamano [Thu, 24 Aug 2023 16:32:32 +0000 (09:32 -0700)]
Merge branch 'rj/branch-in-use-error-message'
A message written in olden time prevented a branch from getting
checked out saying it is already checked out elsewhere, but these
days, we treat a branch that is being bisected or rebased just like
a branch that is checked out and protect it. Rephrase the message
to say that the branch is in use.
* rj/branch-in-use-error-message:
branch: error message checking out a branch in use
branch: error message deleting a branch in use
sequencer: rectify empty hint in call of require_clean_work_tree()
The canonical way to represent "no error hint" is making it NULL, which
shortcuts the error() call altogether. This fixes the output by removing
the line which said just "error:", which would appear when the worktree
is dirtied while editing the initial rebase todo file. This was
introduced by 97e1873 (rebase -i: rewrite complete_action() in C,
2018-08-28), which did a somewhat inaccurate conversion from shell.
To avoid that such bugs re-appear, test for the condition in
require_clean_work_tree().
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pratyush Yadav [Thu, 24 Aug 2023 14:46:29 +0000 (16:46 +0200)]
Merge branch 'ml/cygwin-fixes'
Remove some code supporting ancient Cygwin Tcl/Tk versions. Also fix
exploring working directory and making desktop shortcuts on Cygwin.
* ml/cygwin-fixes:
git-gui - use mkshortcut on Cygwin
git-gui - use cygstart to browse on Cygwin
git-gui - remove obsolete Cygwin specific code
git gui Makefile - remove Cygwin modifications
Mark Levedahl [Mon, 26 Jun 2023 16:53:05 +0000 (12:53 -0400)]
git-gui - use mkshortcut on Cygwin
git-gui enables the "Repository->Create Desktop Icon" item on Cygwin,
offering to create a shortcut that starts git-gui on the current
repository. The code in do_cygwin_shortcut invokes function
win32_create_lnk to create the shortcut. This latter function is shared
between Cygwin and Git For Windows and expects Windows rather than unix
pathnames, though do_cygwin_shortcut provides unix pathnames. Also, this
function tries to invoke the Windows Script Host to run a javascript
snippet, but this fails under Cygwin's Tcl. So, win32_create_lnk just
does not support Cygwin.
However, Cygwin's default installation provides /bin/mkshortcut for
creating desktop shortcuts. This is compatible with exec under Cygwin's
Tcl, understands Cygwin's unix pathnames, and avoids the need for shell
escapes to encode troublesome paths. So, teach git-gui to use mkshortcut
on Cygwin, leaving win32_create_lnk unchanged and for exclusive use by
Git For Windows.
Notes: "CHERE_INVOKING=1" is recognized by Cygwin's /etc/profile and
prevents a "chdir $HOME", leaving the shell in the working directory
specified by the shortcut. That directory is written directly by
mkshortcut eliminating any problems with shell escapes and quoting.
The code being replaced includes the full pathname of the git-gui
creating the shortcut, but that git-gui might not be compatible with the
git found after /etc/profile sets the path, and might have a pathname
that defies encoding using shell escapes that can survive the multiple
incompatible interpreters involved in the chain of creating and using
this shortcut. The new code uses bare "git gui" as the command to
execute, thus using the system git to launch the system git-gui, and
avoiding both issues.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Mark Levedahl [Mon, 26 Jun 2023 16:53:04 +0000 (12:53 -0400)]
git-gui - use cygstart to browse on Cygwin
git-gui enables the "Repository->Explore Working Copy" menu on Cygwin,
offering to open a Windows graphical file browser at the root of the
working directory. This code, shared with Git For Windows support,
depends upon use of Windows pathnames. However, git gui on Cygwin uses
unix pathnames, so this shared code will not work on Cygwin.
A base install of Cygwin provides the /bin/cygstart utility that runs
a registered Windows application based upon the file type, after
translating unix pathnames to Windows. Adding the --explore option
guarantees that the Windows file explorer is opened, regardless of the
supplied pathname's file type and avoiding possibility of some other
action being taken.
So, teach git-gui to use cygstart --explore on Cygwin, restoring the
pre-2012 behavior of opening a Windows file explorer for browsing. This
separates the Git For Windows and Cygwin code paths. Note that
is_Windows is never true on Cygwin, and is_Cygwin is never true on Git
for Windows, though this is not obvious by examining the code for those
independent functions.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Mark Levedahl [Mon, 26 Jun 2023 16:53:03 +0000 (12:53 -0400)]
git-gui - remove obsolete Cygwin specific code
In the current git release, git-gui runs on Cygwin without enabling any
of git-gui's Cygwin specific code. This happens as the Cygwin specific
code in git-gui was (mostly) written in 2007-2008 to work with Cygwin's
then supplied Tcl/Tk which was an incompletely ported variant of the
8.4.1 Windows Tcl/Tk code. In March, 2012, that 8.4.1 package was
replaced with a full port based upon the upstream unix/X11 code,
since maintained up to date. The two Tcl/Tk packages are completely
incompatible, and have different signatures.
When Cygwin's Tcl/Tk signature changed in 2012, git-gui no longer
detected Cygwin, so did not enable Cygwin specific code, and the POSIX
environment provided by Cygwin since 2012 supported git-gui as a generic
unix. Thus, no-one apparently noticed the existence of incompatible
Cygwin specific code.
However, since commit c5766eae6f in the git-gui source tree
(https://github.com/prati0100/git-gui, master at a5005ded), and not yet
pulled into the git repository, the is_Cygwin function does detect
Cygwin using the unix/X11 Tcl/Tk. The Cygwin specific code is enabled,
causing use of Windows rather than unix pathnames, and enabling
incorrect warnings about environment variables that were relevant only
to the old Tcl/Tk. The end result is that (upstream) git-gui is now
incompatible with Cygwin.
So, delete Cygwin specific code (code protected by "if is_Cygwin") that
is not needed in any form to work with the unix/X11 Tcl/Tk.
Cygwin specific code required to enable file browsing and shortcut
creation is not addressed in this patch, does not currently work, and
invocation of those items may leave git-gui in a confused state.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Mark Levedahl [Mon, 26 Jun 2023 16:53:02 +0000 (12:53 -0400)]
git gui Makefile - remove Cygwin modifications
git-gui's Makefile hardcodes the absolute Windows path of git-gui's libraries
into git-gui, destroying the ability to package git-gui on one machine and
distribute to others. The intent is to do this only if a non-Cygwin Tcl/Tk is
installed, but the test for this is wrong with the unix/X11 Tcl/Tk shipped
since 2012. Also, Cygwin does not support a non-Cygwin Tcl/Tk.
The Cygwin git maintainer disables this code, so this code is definitely
not in use in the Cygwin distribution.
The simplest fix is to just delete the Cygwin specific code,
allowing the Makefile to work out of the box on Cygwin. Do so.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The transfer.unpackLimit configuration variable is documented to be
used only as a fallback value when the more operation-specific
fetch.unpackLimit and receive.unpackLimit variables are not set, but
the implementation had the precedence reversed. Apparently this was
broken since the transfer.unpackLimit was introduced in e28714c5
(Consolidate {receive,fetch}.unpackLimit, 2007-01-24).
Often when documentation and code have diverged for so long, we
prefer to change the documentation instead, to avoid disrupting
users. But doing so would make these weirdly unlike most other
"specific overrides general" config options. And the fact that the
bug has existed for so long without anyone noticing implies to me
that nobody really tries to mix and match them much.
Signed-off-by: Taylor Santiago <taylorsantiago@google.com>
[jc: rewrote the log message, added tests, covered receive-pack as well] Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 18 Aug 2023 23:59:32 +0000 (16:59 -0700)]
diff: the -w option breaks --exit-code for --raw and other output modes
The output from "--raw", "--name-status", and "--name-only" modes in
"git diff" does depend on and does not reflect how certain different
contents are considered equal, unlike "--patch" and "--stat" output
modes do, when used with options like "-w" (another way of thinking
about it is that it is not like we recompute the hash of the blob
after removing all whitespaces to show "git diff --raw -w" output).
But the fact that "--raw" and friends ignore "-w" is not a good
excuse for "diff --raw -w --exit-code" to also ignore the request to
report the differences with its exit status. When run without "-w",
"git diff --exit-code --raw" does report with its exit status the
differences as requested, and we should do the same when run with
"-w", too.
Taylor Blau [Mon, 21 Aug 2023 21:34:42 +0000 (17:34 -0400)]
commit-graph: avoid repeated mixed generation number warnings
When validating that a commit-graph has either all zero, or all non-zero
generation numbers, we emit a warning on both the rising and falling
edge of transitioning between the two.
So if we are unfortunate enough to see a commit-graph which has a
repeating sequence of zero, then non-zero generation numbers, we'll
generate many warnings that contain more or less the same information.
Avoid this by keeping track of a single example for a commit with zero-
and non-zero generation, and emit a single warning at the end of
verification if both are non-NULL.
Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau [Mon, 21 Aug 2023 21:34:40 +0000 (17:34 -0400)]
t/t5318-commit-graph.sh: test generation zero transitions during fsck
The second test called "detect incorrect generation number" asserts that
we correctly warn during an fsck when we see a non-zero generation
number after seeing a zero beforehand.
The other transition (going from non-zero to zero) was previously
untested. Test both directions, and rename the existing test to make
clear which direction it is exercising.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In verify_one_commit_graph(), we have code that complains when a commit
is found with a generation number of zero, and then later with a
non-zero number. It works like this:
1. When we see an entry with generation zero, we set the
generation_zero flag to GENERATION_ZERO_EXISTS.
2. When we later see an entry with a non-zero generation, we complain
if the flag is GENERATION_ZERO_EXISTS.
There's a matching GENERATION_NUMBER_EXISTS value, which in theory would
be used to find the case that we see the entries in the opposite order:
1. When we see an entry with a non-zero generation, we set the
generation_zero flag to GENERATION_NUMBER_EXISTS.
2. When we later see an entry with a zero generation, we complain if
the flag is GENERATION_NUMBER_EXISTS.
But that doesn't work; step 2 is implemented, but there is no step 1. We
never use NUMBER_EXISTS at all, and Coverity rightly complains that step
2 is dead code.
We can fix that by implementing that step 1.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2ee11f7261 (commit-graph: return generation from memory, 2023-03-20),
the `commit_graph_generation()` function stopped returning zeros when
asked to locate the generation number of a given commit.
This was done at the time to prepare for a later change which set
generation values in memory, meaning that we could no longer rely on
`graph_pos` alone to tell us whether or not to trust the generation
number returned by this function.
In 2ee11f7261, it was noted that this change only impacted very old
commit-graphs, which were written with all commits having generation
number 0. Indeed, zero is not a valid generation number, so we should
never expect to see that value outside of the aforementioned case.
The test fallout in 2ee11f7261 indicated that we were no longer able to
fsck a specific old case of commit-graph corruption, where we see a
non-zero generation number after having seen a generation number of 0
earlier.
Introduce a variant of `commit_graph_generation()` which behaves like
that function did prior to 2ee11f7261, known as
`commit_graph_generation_from_graph()`. Then use this function in the
context of `verify_one_commit_graph()`, where we only want to trust the
values from the graph.
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 21 Aug 2023 20:20:46 +0000 (16:20 -0400)]
diff: drop useless "status" parameter from diff_result_code()
Many programs use diff_result_code() to get a user-visible program exit
code from a diff result (e.g., checking opts.found_changes if
--exit-code was requested).
This function also takes a "status" parameter, which seems at first
glance that it could be used to propagate an error encountered when
computing the diff. But it doesn't work that way:
- negative values are passed through as-is, but are not appropriate as
program exit codes
- when --exit-code or --check is in effect, we _ignore_ the passed-in
status completely. So a failed diff which did not have a chance to
set opts.found_changes would erroneously report "success, no
changes" instead of propagating the error.
After recent cleanups, neither of these bugs is possible to trigger, as
every caller just passes in "0". So rather than fixing them, we can
simply drop the useless parameter instead.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 21 Aug 2023 20:19:44 +0000 (16:19 -0400)]
diff: drop useless return values in git-diff helpers
Since git-diff has many diff modes, it dispatches to many helpers to
perform each one. But every helper simply returns "0", as it exits
directly if there are serious errors (and options like --exit-code are
handled afterwards). So let's get rid of these useless return values,
which makes the code flow more clear.
There's very little chance that we'd later want to propagate errors
instead of dying immediately. These are all static-local helpers for the
git-diff program implementing its various modes. More "lib-ified" code
would directly call the underlying functions.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 21 Aug 2023 20:18:55 +0000 (16:18 -0400)]
diff: drop useless return from run_diff_{files,index} functions
Neither of these functions ever returns a value other than zero.
Instead, they expect unrecoverable errors to exit immediately, and
things like "--exit-code" are stored inside the diff_options struct to
be handled later via diff_result_code().
Some callers do check the return values, but many don't bother. Let's
drop the useless return values, which are misleading callers about how
the functions work. This could be seen as a step in the wrong direction,
as we might want to eventually "lib-ify" these to more cleanly return
errors up the stack, in which case we'd have to add the return values
back in. But there are some benefits to doing this now:
1. In the current code, somebody could accidentally add a "return -1"
to one of the functions, which would be erroneously ignored by many
callers. By removing the return code, the compiler can notice the
mismatch and force the developer to decide what to do.
Obviously the other option here is that we could start consistently
checking the error code in every caller. But it would be dead code,
and we wouldn't get any compile-time help in catching new cases.
2. It communicates the situation to callers, who may want to choose a
different function. These functions are really thin wrappers for
doing git-diff-files and git-diff-index within the process. But
callers who care about recovering from an error here are probably
better off using the underlying library functions, many of
which do return errors.
If somebody eventually wants to teach these functions to propagate
errors, they'll have to switch back to returning a value, effectively
reverting this patch. But at least then they will be starting with a
level playing field: they know that they will need to inspect each
caller to see how it should handle the error.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>