From: Jacob Keller Date: Wed, 21 May 2025 23:29:17 +0000 (-0700) Subject: diff --no-index: support limiting by pathspec X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=09fb155f11128b505c227aae673de957c9388240;p=thirdparty%2Fgit.git diff --no-index: support limiting by pathspec The --no-index option of git-diff enables using the diff machinery from git while operating outside of a repository. This mode of git diff is able to compare directories and produce a diff of their contents. When operating git diff in a repository, git has the notion of "pathspecs" which can specify which files to compare. In particular, when using git to diff two trees, you might invoke: $ git diff-tree -r . where the treeish could point to a subdirectory of the repository. When invoked this way, users can limit the selected paths of the tree by using a pathspec. Either by providing some list of paths to accept, or by removing paths via a negative refspec. The git diff --no-index mode does not support pathspecs, and cannot limit the diff output in this way. Other diff programs such as GNU difftools have options for excluding paths based on a pattern match. However, using git diff as a diff replacement has several advantages over many popular diff tools, including coloring moved lines, rename detections, and similar. Teach git diff --no-index how to handle pathspecs to limit the comparisons. This will only be supported if both provided paths are directories. For comparisons where one path isn't a directory, the --no-index mode already has some DWIM shortcuts implemented in the fixup_paths() function. Modify the fixup_paths function to return 1 if both paths are directories. If this is the case, interpret any extra arguments to git diff as pathspecs via parse_pathspec. Use parse_pathspec to load the remaining arguments (if any) to git diff --no-index as pathspec items. Disable PATHSPEC_ATTR support since we do not have a repository to do attribute lookup. Disable PATHSPEC_FROMTOP since we do not have a repository root. All pathspecs are treated as rooted at the provided comparison paths. After loading the pathspec data, calculate skip offsets for skipping past the root portion of the paths. This is required to ensure that pathspecs start matching from the provided path, rather than matching from the absolute path. We could instead pass the paths as prefix values to parse_pathspec. This is slightly problematic because the paths come from the command line and don't necessarily have the proper trailing slash. Additionally, that would require parsing pathspecs multiple times. Pass the pathspec object and the skip offsets into queue_diff, which in-turn must pass them along to read_directory_contents. Modify read_directory_contents to check against the pathspecs when scanning the directory. Use the skip offset to skip past the initial root of the path, and only match against portions that are below the intended directory structure being compared. The search algorithm for finding paths is recursive with read_dir. To make pathspec matching work properly, we must set both DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC. Without DO_MATCH_DIRECTORY, paths like "a/b/c/d" will not match against pathspecs like "a/b/c". This is usually achieved by setting the is_dir parameter of match_pathspec. Without DO_MATCH_LEADING_PATHSPEC, paths like "a/b/c" would not match against pathspecs like "a/b/c/d". This is crucial because we recursively iterate down the directories. We could simply avoid checking pathspecs at subdirectories, but this would force recursion down directories which would simply be skipped. If we always passed DO_MATCH_LEADING_PATHSPEC, then we will incorrectly match in certain cases such as matching 'a/c' against ':(glob)**/d'. The match logic will see that a matches the leading part of the **/ and accept this even tho c doesn't match. To avoid this, use the match_leading_pathspec() variant recently introduced. This sets both flags when is_dir is set, but leaves them both cleared when is_dir is 0. Add test cases and documentation covering the new functionality. Note for the documentation I opted not to move the placement of '--' which is sometimes used to disambiguate arguments. The diff --no-index mode requires exactly 2 arguments determining what to compare. Any additional arguments are interpreted as pathspecs and must come afterwards. Use of '--' would not actually disambiguate anything, since there will never be ambiguity over which arguments represent paths or pathspecs. Signed-off-by: Jacob Keller Signed-off-by: Junio C Hamano --- diff --git a/Documentation/git-diff.adoc b/Documentation/git-diff.adoc index dec173a345..272331afba 100644 --- a/Documentation/git-diff.adoc +++ b/Documentation/git-diff.adoc @@ -14,7 +14,7 @@ git diff [] --cached [--merge-base] [] [--] [...] git diff [] [--merge-base] [...] [--] [...] git diff [] ... [--] [...] git diff [] -git diff [] --no-index [--] +git diff [] --no-index [--] [...] DESCRIPTION ----------- @@ -31,14 +31,18 @@ files on disk. further add to the index but you still haven't. You can stage these changes by using linkgit:git-add[1]. -`git diff [] --no-index [--] `:: +`git diff [] --no-index [--] [...]`:: This form is to compare the given two paths on the filesystem. You can omit the `--no-index` option when running the command in a working tree controlled by Git and at least one of the paths points outside the working tree, or when running the command outside a working tree - controlled by Git. This form implies `--exit-code`. + controlled by Git. This form implies `--exit-code`. If both + paths point to directories, additional pathspecs may be + provided. These will limit the files included in the + difference. All such pathspecs must be relative as they + apply to both sides of the diff. `git diff [] --cached [--merge-base] [] [--] [...]`:: diff --git a/builtin/diff.c b/builtin/diff.c index fa963808c3..c6231edce4 100644 --- a/builtin/diff.c +++ b/builtin/diff.c @@ -35,7 +35,7 @@ static const char builtin_diff_usage[] = " or: git diff [] [--merge-base] [...] [--] [...]\n" " or: git diff [] ... [--] [...]\n" " or: git diff [] \n" -" or: git diff [] --no-index [--] " +" or: git diff [] --no-index [--] [...]" "\n" COMMON_DIFF_OPTIONS_HELP; diff --git a/diff-no-index.c b/diff-no-index.c index 9739b2b268..4aeeb98cfa 100644 --- a/diff-no-index.c +++ b/diff-no-index.c @@ -15,20 +15,45 @@ #include "gettext.h" #include "revision.h" #include "parse-options.h" +#include "pathspec.h" #include "string-list.h" #include "dir.h" -static int read_directory_contents(const char *path, struct string_list *list) +static int read_directory_contents(const char *path, struct string_list *list, + const struct pathspec *pathspec, + int skip) { + struct strbuf match = STRBUF_INIT; + int len; DIR *dir; struct dirent *e; if (!(dir = opendir(path))) return error("Could not open directory %s", path); - while ((e = readdir_skip_dot_and_dotdot(dir))) + if (pathspec) { + strbuf_addstr(&match, path); + strbuf_complete(&match, '/'); + strbuf_remove(&match, 0, skip); + + len = match.len; + } + + while ((e = readdir_skip_dot_and_dotdot(dir))) { + if (pathspec) { + strbuf_setlen(&match, len); + strbuf_addstr(&match, e->d_name); + + if (!match_leading_pathspec(NULL, pathspec, + match.buf, match.len, + 0, NULL, e->d_type == DT_DIR ? 1 : 0)) + continue; + } + string_list_insert(list, e->d_name); + } + strbuf_release(&match); closedir(dir); return 0; } @@ -131,7 +156,8 @@ static struct diff_filespec *noindex_filespec(const struct git_hash_algo *algop, } static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop, - const char *name1, const char *name2, int recursing) + const char *name1, const char *name2, int recursing, + const struct pathspec *ps, int skip1, int skip2) { int mode1 = 0, mode2 = 0; enum special special1 = SPECIAL_NONE, special2 = SPECIAL_NONE; @@ -171,9 +197,9 @@ static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop, int i1, i2, ret = 0; size_t len1 = 0, len2 = 0; - if (name1 && read_directory_contents(name1, &p1)) + if (name1 && read_directory_contents(name1, &p1, ps, skip1)) return -1; - if (name2 && read_directory_contents(name2, &p2)) { + if (name2 && read_directory_contents(name2, &p2, ps, skip2)) { string_list_clear(&p1, 0); return -1; } @@ -218,7 +244,7 @@ static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop, n2 = buffer2.buf; } - ret = queue_diff(o, algop, n1, n2, 1); + ret = queue_diff(o, algop, n1, n2, 1, ps, skip1, skip2); } string_list_clear(&p1, 0); string_list_clear(&p2, 0); @@ -258,8 +284,10 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F" * Note that we append the basename of F to D/, so "diff a/b/file D" * becomes "diff a/b/file D/file", not "diff a/b/file D/a/b/file". + * + * Return 1 if both paths are directories, 0 otherwise. */ -static void fixup_paths(const char **path, struct strbuf *replacement) +static int fixup_paths(const char **path, struct strbuf *replacement) { struct stat st; unsigned int isdir0 = 0, isdir1 = 0; @@ -282,26 +310,31 @@ static void fixup_paths(const char **path, struct strbuf *replacement) if ((isdir0 && ispipe1) || (ispipe0 && isdir1)) die(_("cannot compare a named pipe to a directory")); - if (isdir0 == isdir1) - return; + /* if both paths are directories, we will enable pathspecs */ + if (isdir0 && isdir1) + return 1; + if (isdir0) { append_basename(replacement, path[0], path[1]); path[0] = replacement->buf; - } else { + } else if (isdir1) { append_basename(replacement, path[1], path[0]); path[1] = replacement->buf; } + + return 0; } static const char * const diff_no_index_usage[] = { - N_("git diff --no-index [] "), + N_("git diff --no-index [] [...]"), NULL }; int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop, int implicit_no_index, int argc, const char **argv) { - int i, no_index; + struct pathspec pathspec, *ps = NULL; + int i, no_index, skip1 = 0, skip2 = 0; int ret = 1; const char *paths[2]; char *to_free[ARRAY_SIZE(paths)] = { 0 }; @@ -317,13 +350,12 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop, options = add_diff_options(no_index_options, &revs->diffopt); argc = parse_options(argc, argv, revs->prefix, options, diff_no_index_usage, 0); - if (argc != 2) { + if (argc < 2) { if (implicit_no_index) warning(_("Not a git repository. Use --no-index to " "compare two paths outside a working tree")); usage_with_options(diff_no_index_usage, options); } - FREE_AND_NULL(options); for (i = 0; i < 2; i++) { const char *p = argv[i]; if (!strcmp(p, "-")) @@ -337,7 +369,23 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop, paths[i] = p; } - fixup_paths(paths, &replacement); + if (fixup_paths(paths, &replacement)) { + parse_pathspec(&pathspec, PATHSPEC_FROMTOP | PATHSPEC_ATTR, + PATHSPEC_PREFER_FULL | PATHSPEC_NO_REPOSITORY, + NULL, &argv[2]); + if (pathspec.nr) + ps = &pathspec; + + skip1 = strlen(paths[0]); + skip1 += paths[0][skip1] == '/' ? 0 : 1; + skip2 = strlen(paths[1]); + skip2 += paths[1][skip2] == '/' ? 0 : 1; + } else if (argc > 2) { + warning(_("Limiting comparison with pathspecs is only " + "supported if both paths are directories.")); + usage_with_options(diff_no_index_usage, options); + } + FREE_AND_NULL(options); revs->diffopt.skip_stat_unmatch = 1; if (!revs->diffopt.output_format) @@ -354,7 +402,8 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop, setup_diff_pager(&revs->diffopt); revs->diffopt.flags.exit_with_status = 1; - if (queue_diff(&revs->diffopt, algop, paths[0], paths[1], 0)) + if (queue_diff(&revs->diffopt, algop, paths[0], paths[1], 0, ps, + skip1, skip2)) goto out; diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/"); diffcore_std(&revs->diffopt); @@ -370,5 +419,7 @@ out: for (i = 0; i < ARRAY_SIZE(to_free); i++) free(to_free[i]); strbuf_release(&replacement); + if (ps) + clear_pathspec(ps); return ret; } diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh index 5e5bad61ca..01db9243ab 100755 --- a/t/t4053-diff-no-index.sh +++ b/t/t4053-diff-no-index.sh @@ -295,4 +295,79 @@ test_expect_success PIPE,SYMLINKS 'diff --no-index reads from pipes' ' test_cmp expect actual ' +test_expect_success 'diff --no-index F F rejects pathspecs' ' + test_must_fail git diff --no-index -- a/1 a/2 a 2>actual.err && + test_grep "usage: git diff --no-index" actual.err +' + +test_expect_success 'diff --no-index D F rejects pathspecs' ' + test_must_fail git diff --no-index -- a a/2 a 2>actual.err && + test_grep "usage: git diff --no-index" actual.err +' + +test_expect_success 'diff --no-index F D rejects pathspecs' ' + test_must_fail git diff --no-index -- a/1 b b 2>actual.err && + test_grep "usage: git diff --no-index" actual.err +' + +test_expect_success 'diff --no-index rejects absolute pathspec' ' + test_must_fail git diff --no-index -- a b $(pwd)/a/1 +' + +test_expect_success 'diff --no-index with pathspec' ' + test_expect_code 1 git diff --name-status --no-index a b 1 >actual && + cat >expect <<-EOF && + D a/1 + EOF + test_cmp expect actual +' + +test_expect_success 'diff --no-index with pathspec no matches' ' + test_expect_code 0 git diff --name-status --no-index a b missing +' + +test_expect_success 'diff --no-index with negative pathspec' ' + test_expect_code 1 git diff --name-status --no-index a b ":!2" >actual && + cat >expect <<-EOF && + D a/1 + EOF + test_cmp expect actual +' + +test_expect_success 'setup nested' ' + mkdir -p c/1/2 && + mkdir -p d/1/2 && + echo 1 >c/1/2/a && + echo 2 >c/1/2/b +' + +test_expect_success 'diff --no-index with pathspec nested negative pathspec' ' + test_expect_code 0 git diff --no-index c d ":!1" +' + +test_expect_success 'diff --no-index with pathspec nested pathspec' ' + test_expect_code 1 git diff --name-status --no-index c d 1/2 >actual && + cat >expect <<-EOF && + D c/1/2/a + D c/1/2/b + EOF + test_cmp expect actual +' + +test_expect_success 'diff --no-index with pathspec glob' ' + test_expect_code 1 git diff --name-status --no-index c d ":(glob)**/a" >actual && + cat >expect <<-EOF && + D c/1/2/a + EOF + test_cmp expect actual +' + +test_expect_success 'diff --no-index with pathspec glob and exclude' ' + test_expect_code 1 git diff --name-status --no-index c d ":(glob,exclude)**/a" >actual && + cat >expect <<-EOF && + D c/1/2/b + EOF + test_cmp expect actual +' + test_done