docs: improve ambiguous areas of pack format documentation

author brian m. carlson <sandals@crustytoothpaste.net>

Thu, 2 Oct 2025 22:38:50 +0000 (22:38 +0000)

committer Junio C Hamano <gitster@pobox.com>

Fri, 3 Oct 2025 16:58:54 +0000 (09:58 -0700)
diff --git a/Documentation/gitformat-pack.adoc b/Documentation/gitformat-pack.adoc
index d6ae229be5685950da9a8cff4bbe215c62e0c17c..9b7af5c1849f07751c9085983309a4fb01b14416 100644
--- a/Documentation/gitformat-pack.adoc
+++ b/Documentation/gitformat-pack.adoc
@@ -32,6 +32,10 @@ In a repository using the traditional SHA-1, pack checksums, index checksums,
  and object IDs (object names) mentioned below are all computed using SHA-1.
  Similarly, in SHA-256 repositories, these values are computed using SHA-256.
  
+CRC32 checksums are always computed over the entire packed object, including
+the header (n-byte type and length); the base object name or offset, if any;
+and the entire compressed object.  The CRC32 algorithm used is that of zlib.
+
  == pack-*.pack files have the following format:
  
     - A header appears at the beginning and consists of the following:
@@ -80,6 +84,15 @@ Valid object types are:
  
  Type 5 is reserved for future expansion. Type 0 is invalid.
  
+=== Object encoding
+
+Unlike loose objects, packed objects do not have a prefix containing the type,
+size, and a NUL byte. These are not necessary because they can be determined by
+the n-byte type and length that prefixes the data and so they are omitted from
+the compressed and deltified data.
+
+The computation of the object ID still uses this prefix, however.
+
  === Size encoding
  
  This document uses the following "size encoding" of non-negative
@@ -92,6 +105,11 @@ values are more significant.
  This size encoding should not be confused with the "offset encoding",
  which is also used in this document.
  
+When encoding the size of an undeltified object in a pack, the size is that of
+the uncompressed raw object. For deltified objects, it is the size of the
+uncompressed delta.  The base object name or offset is not included in the size
+computation.
+
  === Deltified representation
  
  Conceptually there are only four object types: commit, tree, tag and
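
The three clarifications in this patch (what the CRC32 covers, what the encoded size measures, and how the object ID is still computed) can be illustrated with a minimal Python sketch for a single undeltified blob. It is not taken from Git's source. The function names are hypothetical, SHA-1 is assumed as the repository hash, and the zlib compression level is left at the library default purely for illustration.

```python
import hashlib
import zlib

OBJ_BLOB = 3  # pack type number for a blob


def encode_type_and_size(obj_type: int, size: int) -> bytes:
    """Encode the n-byte type and length that prefixes a packed object.

    First byte: continuation bit, 3-bit type, low 4 bits of the size.
    Following bytes: continuation bit plus 7 more size bits each,
    least significant group first.
    """
    current = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    while size:
        out.append(current | 0x80)  # more size bytes follow
        current = size & 0x7F
        size >>= 7
    out.append(current)
    return bytes(out)


def object_id(kind: bytes, raw: bytes) -> str:
    """Object ID: hash of "<type> <size>\\0" followed by the raw data.

    The prefix is not stored in the pack, but it is still hashed.
    SHA-1 is an assumption here; a SHA-256 repository would use SHA-256.
    """
    prefix = kind + b" " + str(len(raw)).encode("ascii") + b"\0"
    return hashlib.sha1(prefix + raw).hexdigest()


def pack_entry(obj_type: int, raw: bytes) -> tuple[bytes, int]:
    """Build an undeltified pack entry and the CRC32 over all of it."""
    # The encoded size is that of the uncompressed raw object.
    header = encode_type_and_size(obj_type, len(raw))
    entry = header + zlib.compress(raw)
    # The CRC32 covers the header and the entire compressed object.
    return entry, zlib.crc32(entry)


if __name__ == "__main__":
    data = b"hello world"
    entry, crc = pack_entry(OBJ_BLOB, data)
    print(object_id(b"blob", data), entry[:1].hex(), f"{crc:08x}")
```

For a deltified object, the base object name or offset would sit between the header and the compressed delta. Per the text above, it would be covered by the CRC32 but not counted in the encoded size, which would be the size of the uncompressed delta.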