reftable/constants: make block types part of the public interface
Now that reftable blocks can be read individually via the public
interface it becomes necessary for callers to be able to distinguish the
different types of blocks. Expose the relevant constants.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/table: introduce iterator for table blocks
Introduce a new iterator that allows the caller to iterate through all
blocks contained in a table. This gives users more fine-grained control
over how exactly those blocks are being read and exposes information to
callers that was previously inaccessible.
This iterator will be required by a future patch series that adds
consistency checks for the reftable backend. In addition to that though
we will also reimplement `reftable_table_print_blocks()` on top of this
new iterator in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/table: add `reftable_table` to the public interface
The `reftable_table` interface is an internal implementation detail that
callers have no access to. Having direct access to this structure is
important though for a subsequent patch series that will implement
consistency checks for the reftable backend.
Move the structure into "reftable-table.h" so that it part of the public
interface.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: expose a generic iterator over reftable records
Expose a generic iterator over reftable records and expose it via the
public interface. Together with an upcoming iterator for reftable blocks
contained in a table this will allow users to trivially iterate through
blocks and their respective records individually.
This functionality will be used to implement consistency checks for the
reftable backend, which requires more fine-grained control over how we
read data.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the block iterators so that initialization and seeking are
different from one another. This makes the iterator trivially reseekable
by storing the pointer to the block at initialization time, which we can
then reuse on every seek.
This refactoring prepares the code for exposing a `reftable_iterator`
interface for blocks in a subsequent commit. Callsites are adjusted
accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: store block pointer in the block iterator
The block iterator requires access to a bunch of data from the
underlying `reftable_block` that it is iterating over. This data is
stored by copying over relevant data into a separate set of variables.
This has multiple downsides:
- We require more storage space than necessary. This is more of a
theoretical issue as we shouldn't ever have many blocks.
- We have to perform more bookkeeping, and the variable names are
inconsistent across the two data structures. This can lead to some
confusion.
- The lifetime of the block iterator is tied to the block anyway, but
we hide that a bit by only storing pointers pointing into the block.
There isn't really any good reason why we rip out parts of the block
instead of storing a pointer to the block itself.
Refactor the code to do so. Despite being simpler, it also allows us to
decouple the lifetime of the block iterator from seeking in a subsequent
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: create public interface for reading blocks
While users of the reftable library wouldn't generally require access to
individual blocks in a reftable table, there are valid usecases where
one may require low-level access to them. One such upcoming usecase in
the Git codebase is to implement consistency checks for the reftable
library where we want to verify each block individually.
Create a public interface for reading blocks. The interface isn't yet
complete and lacks e.g. a way to read individual records from a block.
Such missing functionality will be backfilled in subsequent commits.
Note that this change also requires us to expose `reftable_buf`, which
is used by the `reftable_block_first_key()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-zlib: use `struct z_stream_s` instead of typedef
Throughout the Git codebase we're using the typedeffed version of
`z_stream`, which maps to `struct z_stream_s`. By using a typedef
instead of the struct it becomes somewhat harder to predeclare the
symbol so that headers depending on the struct can do so without having
to pull in "zlib-compat.h".
We don't yet have users that would really care about this: the only
users that declare `z_stream` as a pointer are in "reftable/block.h",
which is a header that is internal to the reftable library. But in the
next step we're going to expose the `struct reftable_block` publicly,
and that struct does contain a pointer to `z_stream`. And as the public
header shouldn't depend on "reftable/system.h", which is an internal
implementation detail, we won't have the typedef for `z_stream` readily
available.
Prepare for this change by using `struct z_stream_s` throughout our code
base. In case zlib-ng is used we use a define to map from `z_stream_s`
to `zng_stream_s`.
Drop the pre-declaration of `struct z_stream` while at it. This struct
does not exist in the first place, and the declaration wasn't needed
because "reftable/block.h" already includes "reftable/basics.h" which
transitively includes "reftable/system.h" and thus "git-zlib.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: rename `block_reader` to `reftable_block`
The `block_reader` structure is used to access parsed data of a reftable
block. The structure is currently treated as an internal implementation
detail and not exposed via our public interfaces. The functionality
provided by the structure is useful to external users of the reftable
library though, for example when implementing consistency checks that
need to scan through the blocks manually.
Rename the structure to `reftable_block` now that the name has been made
available in the preceding commit. This name is in line with the naming
schema used for other data structures like `reftable_table` in that it
describes the underlying entity that it provides access to.
The new data structure isn't yet exposed via the public interface, which
is left for a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `reftable_block` structure associates a byte slice with a block
source. As such it only holds the data of a reftable block without
actually encoding any of the details for how to access that data.
Rename the structure to instead be called `reftable_block_data`. Besides
clarifying that this really only holds data, it also allows us to rename
the `reftable_block_reader` to `reftable_block` in the next commit, as
this is the structure that actually encapsulates access to the reftable
blocks.
Rename the `struct reftable_block_reader::block` member accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/table: move reading block into block reader
The logic to read blocks from a reftable is scattered across both the
table and the block subsystems. Besides causing somewhat fuzzy
responsibilities, it also means that we have to awkwardly pass around
the ownership of blocks between the subsystems.
Refactor the code so that we stop passing the block when initializing a
reader, but instead by passing in the block source plus the offset at
which we're supposed to read a block. Like this, the ownership of the
block itself doesn't need to get handed over as the block reader is the
one owning the block right from the start.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block: simplify how we track restart points
Restart points record the location of reftable records that do not use
prefix compression and are used to perform a binary search inside of a
block. These restart points are encoded at the end of a block, between
the record data and the footer of a table.
The block structure contains three different variables related to these
restart points:
- The block length contains the length of the reftable block up to the
restart points.
- The restart count contains the number of restart points contained in
the block.
- The restart bytes variable tracks where the restart point data
begins.
Tracking all three of these variables is unnecessary though as the data
can be derived from one another: the block length without restart points
is the exact same as the offset of the restart count data, which we
already track via the `restart_bytes` data.
Refactor the code so that we track the location of restart bytes not as
a pointer, but instead as an offset. This allows us to trivially get rid
of the `block_len` variable as described above. This avoids having the
confusing `block_len` variable and allows us to do less bookkeeping
overall.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/blocksource: consolidate code into a single file
The code that implements block sources is distributed across a couple of
files. Consolidate all of it into "reftable/blocksource.c" and its
accompanying header so that it is easier to locate and more self
contained.
While at it, rename some of the functions to have properly scoped names.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `struct reftable_reader` subsystem encapsulates a table that has
been read from the disk. As such, the current name of that structure is
somewhat hard to understand as it only talks about the fact that we read
something from disk, without really giving an indicator _what_ that is.
Furthermore, this naming schema doesn't really fit well into how the
other structures are named: `reftable_merged_table`, `reftable_stack`,
`reftable_block` and `reftable_record` are all named after what they
encapsulate.
Rename the subsystem to `reftable_table`, which directly gives a hint
that the data structure is about handling the individual tables part of
the stack.
While this change results in a lot of churn, it prepares for us exposing
the APIs to third-party callers now that the reftable library is a
standalone library that can be linked against by other projects.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 1 Apr 2025 10:05:13 +0000 (19:05 +0900)]
Merge branch 'ps/reftable-sans-compat-util' into ps/reftable-api-revamp
* ps/reftable-sans-compat-util:
Makefile: skip reftable library for Coccinelle
reftable: decouple from Git codebase by pulling in "compat/posix.h"
git-compat-util.h: split out POSIX-emulating bits
compat/mingw: split out POSIX-related bits
reftable/basics: introduce `REFTABLE_UNUSED` annotation
reftable/basics: stop using `SWAP()` macro
reftable/stack: stop using `sleep_millisec()`
reftable/system: introduce `reftable_rand()`
reftable/reader: stop using `ARRAY_SIZE()` macro
reftable/basics: provide wrappers for big endian conversion
reftable/basics: stop using `st_mult()` in array allocators
reftable: stop using `BUG()` in trivial cases
reftable/record: don't `BUG()` in `reftable_record_cmp()`
reftable/record: stop using `BUG()` in `reftable_record_init()`
reftable/record: stop using `COPY_ARRAY()`
reftable/blocksource: stop using `xmmap()`
reftable/stack: stop using `write_in_full()`
reftable/stack: stop using `read_in_full()`
Martin Ågren [Mon, 10 Mar 2025 11:07:56 +0000 (12:07 +0100)]
git-clone doc: fix indentation
Commit bc26f7690a (clone: make it possible to specify --tags,
2025-02-06) added a new paragraph in the middle of this list item. By
adding an empty line rather than using a list continuation, we broke the
list continuation, with the new paragraph ending up funnily indented.
Restore the chain of list continuations.
Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Peter Krefting [Mon, 10 Mar 2025 16:48:34 +0000 (17:48 +0100)]
l10n: sv.po: Update Swedish translation
- Update for 2.49.0.
- Fix numerous typos found by spelling checker.
- Fix more straight quotes.
- Harmonize translation of "blob" (to "blob", not "blobb").
- Harmonize translation of "reflog" (to "referenslogg").
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Taylor Blau [Fri, 7 Mar 2025 23:27:03 +0000 (18:27 -0500)]
fetch: fix following tags when fetching specific OID
In 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22),
unconditionally adds "HEAD" to the list of ref prefixes we send to the
server.
This breaks a core assumption that the list of prefixes we send to the
server is complete. We must either send all prefixes we care about, or
none at all (in the latter case the server then advertises everything).
The tag following code is careful to only add "refs/tags/" to the list
of prefixes if there are already entries in the prefix list. But because
the new code from 3f763ddf28 runs after the tag code, and because it
unconditionally adds to the prefix list, we may end up with a prefix
list that _should_ have "refs/tags/" in it, but doesn't.
When that is the case, the server does not advertise any tags, and our
auto-following breaks because we never learned about any tags in the
first place.
Fix this by only adding "HEAD" to the ref prefixes when we know that we
are already limiting the advertisement. In either case we'll learn about
HEAD (either through the limited advertisement, or implicitly through a
full advertisement).
Reported-by: Igor Todorovski <itodorov@ca.ibm.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Toon Claes [Fri, 7 Mar 2025 14:18:08 +0000 (15:18 +0100)]
help: print zlib-ng version number
When building against zlib-ng, the header file `zlib.h` is not included,
but `zlib-ng.h` is included instead. It's `zlib.h` that defines
`ZLIB_VERSION` and that macro is used to print out zlib version in
`git-version(1)` with `--build-options`. But when it's not defined, no
version is printed.
`zlib-ng.h` defines another macro: `ZLIBNG_VERSION`. Use that macro to
print the zlib-ng version in `git version --build-options` when it's
set. Otherwise fallback to `ZLIB_VERSION`.
Signed-off-by: Toon Claes <toon@iotcl.com> Helped-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Toon Claes [Fri, 7 Mar 2025 14:18:07 +0000 (15:18 +0100)]
help: include git-zlib.h to print zlib version
In 41f1a8435a (git-compat-util: move include of "compat/zlib.h" into
"git-zlib.h", 2025-01-28) some code was refactored to enable easier
linking against zlib-ng.
This removed `zlib.h` being indirectly included in `help.c`. As this
file uses `ZLIB_VERSION` to print the version number of zlib when
running git-version(1) with `--build-options`, this resulted in a
regression.
Include `git-zlib.h` directly into `help.c` to print zlib version
information. This brings back the zlib version in the output of
`git version --build-options`.
Signed-off-by: Toon Claes <toon@iotcl.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 6 Mar 2025 22:06:31 +0000 (14:06 -0800)]
Merge branch 'js/win-2.49-build-fixes'
Hotfix to help building Git-for-Windows.
* js/win-2.49-build-fixes:
cmake: generalize the handling of the `CLAR_TEST_OBJS` list
meson: fix sorting
ident: stop assuming that `gw_gecos` is writable
Junio C Hamano [Thu, 6 Mar 2025 22:06:31 +0000 (14:06 -0800)]
Merge branch 'pw/repo-layout-doc-update'
Some future breaking changes would remove certain parts of the
default repository, which were still described even when the
documents were built for the future with WITH_BREAKING_CHANGES.
* pw/repo-layout-doc-update:
docs: fix repository-layout when building with breaking changes
cmake: generalize the handling of the `CLAR_TEST_OBJS` list
A late-comer to the v2.49.0 party, `sk/unit-test-oid`, added yet another
array item to `CLAR_TEST_OBJS`, causing the `win+VS build` job to fail
with symptoms like this one:
unit-tests-lib.lib(u-oid-array.obj) : error LNK2019: unresolved
external symbol cl_parse_any_oid referenced in function fill_array
This is a similar scenario to the one that forced me to write 8afda42fce60 (cmake: generalize the handling of the `UNIT_TEST_OBJS`
list, 2024-09-18): The hard-coded echo of `CLAR_TEST_OBJS` in
`CMakeLists.txt` that recapitulates faithfully what was already
hard-coded in `Makefile` would either have to be updated whack-a-mole
style, or generalized.
Just like I chose the latter option for `UNIT_TEST_OBJS`, I now do the
same for `CLAR_TEST_OBJS`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 904339edbd80 (Introduce support for the Meson build system,
2024-12-06) the `meson.build` file was introduced, adding also a
Windows-specific list of source files. This list was obviously meant to
be sorted alphabetically, but there is one mistake. Let's fix that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 590e081dea7c (ident: add NO_GECOS_IN_PWENT for systems without
pw_gecos in struct passwd, 2011-05-19), code was introduced to iterate
over the `gw_gecos` field; The loop variable is of type `char *`, which
assumes that `gw_gecos` is writable.
However, it is not necessarily writable (and it is a bad idea to have it
writable in the first place), so let's switch the loop variable type to
`const char *`.
This is not a new problem, but what is new is the Meson build. While it
does not trigger in CI builds, imitating the commands of
`ci/run-build-and-tests.sh` in a regular Git for Windows SDK (`meson
setup build . --fatal-meson-warnings --warnlevel 2 --werror --wrap-mode
nofallback -Dfuzzers=true` followed by `meson compile -C build --`
results in this beautiful error:
"cc" [...] -o libgit.a.p/ident.c.obj "-c" ../ident.c
../ident.c: In function 'copy_gecos':
../ident.c:68:18: error: assignment discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
68 | for (src = get_gecos(w); *src && *src != ','; src++) {
| ^
cc1.exe: all warnings being treated as errors
Now, why does this not trigger in CI? The answer is as simple as it is
puzzling: The `win+Meson` job completely side-steps Git for Windows'
development environment, opting instead to use the GCC that is on the
`PATH` in GitHub-hosted `windows-latest` runners. That GCC is pinned to
v12.2.0 and targets the UCRT (unlikely to change any time soon, see
https://github.com/actions/runner-images/blob/win25/20250303.1/images/windows/toolsets/toolset-2022.json#L132-L141).
That is in stark contrast to Git for Windows, which uses GCC v14.2.0 and
targets MSVCRT. Git for Windows' `Makefile`-based build also obviously
uses different compiler flags, otherwise this compile error would have
had plenty of opportunity in almost 14 years to surface.
In other words, contrary to my expectations, the `win+Meson` job is
ill-equipped to replace the `win build` job because it exercises a
completely different tool version/compiler flags vector than what Git
for Windows needs.
Nevertheless, there is currently this huge push, including breaking
changes after -rc1 and all, for switching to Meson. Therefore, we need
to make it work, somehow, even in Git for Windows' SDK, hence this
patch, at this point in time.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 5 Mar 2025 18:37:44 +0000 (10:37 -0800)]
Merge branch 'cc/lop-remote'
Large-object promisor protocol extension.
* cc/lop-remote:
doc: add technical design doc for large object promisors
promisor-remote: check advertised name or URL
Add 'promisor-remote' capability to protocol v2
Junio C Hamano [Wed, 5 Mar 2025 18:37:43 +0000 (10:37 -0800)]
Merge branch 'sk/unit-test-oid'
Convert a few unit tests to the clar framework.
* sk/unit-test-oid:
t/unit-tests: convert oidtree test to use clar test framework
t/unit-tests: convert oidmap test to use clar test framework
t/unit-tests: convert oid-array test to use clar test framework
t/unit-tests: implement clar specific oid helper functions
Junio C Hamano [Wed, 5 Mar 2025 18:37:43 +0000 (10:37 -0800)]
Merge branch 'ps/path-sans-the-repository'
The path.[ch] API takes an explicit repository parameter passed
throughout the callchain, instead of relying on the_repository
singleton instance.
* ps/path-sans-the-repository:
path: adjust last remaining users of `the_repository`
environment: move access to "core.sharedRepository" into repo settings
environment: move access to "core.hooksPath" into repo settings
repo-settings: introduce function to clear struct
path: drop `git_path()` in favor of `repo_git_path()`
rerere: let `rerere_path()` write paths into a caller-provided buffer
path: drop `git_common_path()` in favor of `repo_common_path()`
worktree: return allocated string from `get_worktree_git_dir()`
path: drop `git_path_buf()` in favor of `repo_git_path_replace()`
path: drop `git_pathdup()` in favor of `repo_git_path()`
path: drop unused `strbuf_git_path()` function
path: refactor `repo_submodule_path()` family of functions
submodule: refactor `submodule_to_gitdir()` to accept a repo
path: refactor `repo_worktree_path()` family of functions
path: refactor `repo_git_path()` family of functions
path: refactor `repo_common_path()` family of functions
Phillip Wood [Wed, 5 Mar 2025 10:42:37 +0000 (10:42 +0000)]
docs: fix repository-layout when building with breaking changes
Since commit 8ccc75c2452 (remote: announce removal of "branches/" and
"remotes/", 2025-01-22) enabling WITH_BREAKING_CHANGES when building git
removes support for reading branches from ".git/branches" and remotes
from ".git/remotes". However those locations are still documented in
gitrepository-layout.adoc even though the build does not support them.
Rectify this by adding a new document attribute "with-breaking-changes"
and use it to make the inclusion of those sections of the documentation
conditional. Note that the name of the attribute does not match the test
prerequisite WITHOUT_BREAKING_CHANGES added in c5bc9a7f94a (Makefile:
wire up build option for deprecated features, 2025-01-22). This is to
avoid the awkward double negative ifndef::without_breaking_changes for
documentation that should be included when WITH_BREAKING_CHANGES is
enabled. The test prerequisite will be renamed to match the
documentation attribute in a future patch series.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe [Tue, 4 Mar 2025 16:11:54 +0000 (17:11 +0100)]
reftable: release name on reftable_reader_new() error
If block_source_read_block() or parse_footer() fail, we leak the "name"
member of struct reftable_reader in reftable_reader_new(). Release it.
Reported by: H Z <shiyuyuranzh@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Todd Zullinger [Mon, 3 Mar 2025 20:44:07 +0000 (15:44 -0500)]
howto/new-command: update reference to builtin docs
Commit ec14d4ecb5 (builtin.h: take over documentation from
api-builtin.txt, 2017-08-02) deleted api-builtin.txt and moved the
contents into builtin.h. Most of the references were fixed in d85e9448dd (new-command.txt: update reference to builtin docs,
2023-02-04), but one remained. Fix it.
Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Todd Zullinger [Mon, 3 Mar 2025 20:44:01 +0000 (15:44 -0500)]
doc: remove unneeded .gitattributes
The top-level .gitattributes file contains entries for the Documentation
tree. Documentation/.gitattributes has not been touched since it was
added in 14f9e128d3 (Define the project whitespace policy, 2008-02-10).
Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>