From: Bruce Momjian Date: Fri, 12 Aug 2022 19:05:12 +0000 (-0400) Subject: doc: add section about heap-only tuples (HOT) X-Git-Tag: REL_11_18~80 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=99c1f24f5b5b0c1586e5dd143101023852ff5606;p=thirdparty%2Fpostgresql.git doc: add section about heap-only tuples (HOT) Reported-by: Jonathan S. Katz Discussion: https://postgr.es/m/c59ffbd5-96ac-a5a5-a401-14f627ca1405@postgresql.org Backpatch-through: 11 --- diff --git a/doc/src/sgml/acronyms.sgml b/doc/src/sgml/acronyms.sgml index dd8e594e9f4..e7801736a59 100644 --- a/doc/src/sgml/acronyms.sgml +++ b/doc/src/sgml/acronyms.sgml @@ -299,9 +299,7 @@ HOT - Heap-Only - Tuples + Heap-Only Tuples diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 571e75445ff..6c08f4821e9 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -3818,7 +3818,8 @@ SCRAM-SHA-256$<iteration count>:&l If true, queries must not use the index until the xmin of this pg_index row is below their TransactionXmin - event horizon, because the table may contain broken HOT chains with + event horizon, because the table may contain broken HOT chains with incompatible rows that they can see diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 111afa8cb47..1496ab32101 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -3439,7 +3439,8 @@ ANY num_sync ( HOT updates + will defer cleanup of dead row versions. The default is zero transactions, meaning that dead row versions can be removed as soon as possible, that is, as soon as they are no longer visible to any open transaction. You may wish to set this to a diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml index 9957d1959da..80584480912 100644 --- a/doc/src/sgml/indexam.sgml +++ b/doc/src/sgml/indexam.sgml @@ -37,7 +37,8 @@ extant versions of the same logical row; to an index, each tuple is an independent object that needs its own index entry. Thus, an update of a row always creates all-new index entries for the row, even if - the key values did not change. (HOT tuples are an exception to this + the key values did not change. (HOT + tuples are an exception to this statement; but indexes do not deal with those, either.) Index entries for dead tuples are reclaimed (by vacuuming) when the dead tuples themselves are reclaimed. diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml index ec2d16d94ef..82e43fe6bce 100644 --- a/doc/src/sgml/indices.sgml +++ b/doc/src/sgml/indices.sgml @@ -104,7 +104,9 @@ CREATE INDEX test1_id_index ON test1 (id); After an index is created, the system has to keep it synchronized with the - table. This adds overhead to data manipulation operations. + table. This adds overhead to data manipulation operations. Indexes can + also prevent the creation of heap-only + tuples. Therefore indexes that are seldom or never used in queries should be removed. @@ -726,7 +728,7 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name)); Index expressions are relatively expensive to maintain, because the derived expression(s) must be computed for each row insertion - and non-HOT update. However, the index expressions are + and non-HOT update. However, the index expressions are not recomputed during an indexed search, since they are already stored in the index. In both examples above, the system sees the query as just WHERE indexedcolumn = 'constant' diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 16c927fcef3..9ba8eeea46b 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -2640,7 +2640,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i n_tup_upd bigint - Number of rows updated (includes HOT updated rows) + Number of rows updated (includes HOT updated rows updated rows) n_tup_del diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml index 082d0f259f6..94af437b3b0 100644 --- a/doc/src/sgml/ref/create_table.sgml +++ b/doc/src/sgml/ref/create_table.sgml @@ -1262,7 +1262,9 @@ WITH ( MODULUS numeric_literal, REM to the indicated percentage; the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page as the - original, which is more efficient than placing it on a different page. + original, which is more efficient than placing it on a different + page, and makes heap-only tuple + updates more likely. For a table whose entries are never updated, complete packing is the best choice, but in heavily updated tables smaller fillfactors are appropriate. This parameter cannot be set for TOAST tables. diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml index ea9640f76d9..39d90ce753d 100644 --- a/doc/src/sgml/storage.sgml +++ b/doc/src/sgml/storage.sgml @@ -1046,4 +1046,74 @@ data. Empty in ordinary tables. + + + Heap-Only Tuples (<acronym>HOT</acronym>) + + + To allow for high concurrency, PostgreSQL + uses multiversion concurrency + control (MVCC) to store rows. However, + MVCC has some downsides for update queries. + Specifically, updates require new versions of rows to be added to + tables. This can also require new index entries for each updated row, + and removal of old versions of rows and their index entries can be + expensive. + + + + To help reduce the overhead of updates, + PostgreSQL has an optimization called + heap-only tuples (HOT). This optimization is + possible when: + + + + + The update does not modify any columns referenced by the table's + indexes, including expression and partial indexes. + + + + + There is sufficient free space on the page containing the old row + for the updated row. + + + + + In such cases, heap-only tuples provide two optimizations: + + + + + New index entries are not needed to represent updated rows. + + + + + Old versions of updated rows can be completely removed during normal + operation, including SELECTs, instead of requiring + periodic vacuum operations. (This is possible because indexes + do not reference their page + item identifiers.) + + + + + + + In summary, heap-only tuple updates can only be created + if columns used by indexes are not updated. You can + increase the likelihood of sufficient page space for + HOT updates by decreasing a table's fillfactor. + If you don't, HOT updates will still happen because + new rows will naturally migrate to new pages and existing pages with + sufficient free space for new row versions. The system view pg_stat_all_tables + allows monitoring of the occurrence of HOT and non-HOT updates. + + +