From: Tom Lane
Date: Tue, 31 Mar 2026 15:23:20 +0000 (-0400)
Subject: Doc: improve explanation of GiST compress/decompress methods.
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=fb7a9050d53c5cd4b7c86f8e07196bd47b9db3b2;p=thirdparty%2Fpostgresql.git

Doc: improve explanation of GiST compress/decompress methods.

The docs previously didn't explain that leaf and non-leaf keys could
be treated differently, even though many of our opclasses do exactly
that.  It also wasn't explained how that relates to the STORAGE option,
particularly since only one storage type can be specified for both leaf
and non-leaf keys.

While here, reorganize the text slightly, rather than sticking
additional detail into what's supposed to be a brief summary paragraph.

Author: Paul A Jungwirth
Co-authored-by: Tom Lane
Discussion: https://postgr.es/m/CA+renyWs5Np+FLSYfL+eu20S4U671A3fQGb-+7e22HLrD1NbYw@mail.gmail.com
---

diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
index 5c0a0c48bab..3f1a01f381f 100644
--- a/doc/src/sgml/gist.sgml
+++ b/doc/src/sgml/gist.sgml
@@ -273,14 +273,10 @@ CREATE INDEX ON my_table USING GIST (my_inet_column inet_ops);
    index will depend on the penalty and picksplit
    methods.
    Two optional methods are compress and
-   decompress, which allow an index to have internal tree data of
-   a different type than the data it indexes. The leaves are to be of the
-   indexed data type, while the other tree nodes can be of any C struct (but
-   you still have to follow PostgreSQL data type rules here,
-   see about varlena for variable sized data). If the tree's
-   internal data type exists at the SQL level, the STORAGE option
-   of the CREATE OPERATOR CLASS command can be used.
-   The optional eighth method is distance, which is needed
+   decompress, which allow an index to store keys that
+   are of a different type than the data it indexes, or are a compressed
+   representation of that type.
+   The optional eighth method distance is needed
    if the operator class wishes to support ordered scans (nearest-neighbor
    searches). The optional ninth method fetch is needed
    if the operator class wishes to support index-only scans, except when the
@@ -294,6 +290,7 @@ CREATE INDEX ON my_table USING GIST (my_inet_column inet_ops);
    src/include/access/cmptype.h) into strategy numbers
    used by the operator class. This lets the core code look up operators
    for temporal constraint indexes.
+   All these methods are described in more detail below.
@@ -484,6 +481,24 @@ my_union(PG_FUNCTION_ARGS)
        in the index without modification.
+
+       Use the STORAGE option of the CREATE
+       OPERATOR CLASS command to define the data type that is
+       stored in the index, if it is different from the data type being
+       indexed.  Be aware however that the STORAGE data
+       type is only used to define the physical properties of the index
+       entries (their typlen,
+       typbyval,
+       and typalign attributes).  What is
+       actually in the index datums is under the control of the
+       compress and decompress
+       methods, so long as the stored datums match those properties.
+       It is allowed for compress to produce different
+       representations for leaf keys than for keys on higher-level index
+       pages, so long as both representations match
+       the STORAGE data type.
+
+
        The SQL declaration of the function must look like this:
diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 76e0e11f228..75445b07455 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -10,9 +10,13 @@ GiST stands for Generalized Search Tree. It was introduced in the seminal paper
 Jeffrey F. Naughton, Avi Pfeffer:
 
     http://www.sai.msu.su/~megera/postgres/gist/papers/gist.ps
+
+Concurrency support was described in "Concurrency and Recovery in Generalized
+Search Trees", 1997, Marcel Kornacker, C. Mohan, Joseph M. Hellerstein:
+    https://dsf.berkeley.edu/papers/sigmod97-gist.pdf
 
-and implemented by J. Hellerstein and P. Aoki in an early version of
+GiST was implemented by J. Hellerstein and P. Aoki in an early version of
 PostgreSQL (more details are available from The GiST Indexing Project
 at Berkeley at http://gist.cs.berkeley.edu/). As a "university"
 project it had a limited number of features and was in rare use.
@@ -55,6 +59,9 @@ The original algorithms were modified in several ways:
   it is now a single-pass algorithm.
 * Since the papers were theoretical, some details were omitted and we
   had to find out ourself how to solve some specific problems.
+* The 1997 paper above (but not the 1995 one) states that leaf pages should
+  store the original key.  While that can be done in PostgreSQL, it is
+  also possible to use a compressed representation in leaf pages.
 
 Because of the above reasons, we have revised the interaction of GiST
 core and PostgreSQL WAL system. Moreover, we encountered (and solved)