From 715eb62e30ac54963be63f625a1d8d50fa7d29b7 Mon Sep 17 00:00:00 2001 From: David Rowley Date: Tue, 14 Oct 2025 09:27:11 +1300 Subject: [PATCH] Doc: clarify n_distinct_inherited setting There was some confusion around how to adjust the n_distinct estimates for partitioned tables. Here we try and clarify that n_distinct_inherited needs to be adjusted rather than n_distinct. Also fix some slightly misleading text which was talking about table size rather than table rows, fix a grammatical error, and adjust some text which indicated that ANALYZE was performing calculations based on the n_distinct settings. Really it's the query planner that does this and ANALYZE only stores the overridden n_distinct estimate value in pg_statistic. Author: David Rowley Reviewed-by: David G. Johnston Reviewed-by: Chao Li Backpatch-through: 13 Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-ZmVw@mail.gmail.com --- doc/src/sgml/ref/alter_table.sgml | 34 +++++++++++++++---------------- 1 file changed, 16 insertions(+), 18 deletions(-) diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index 0d41a581972..2953639a0a8 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -329,24 +329,22 @@ WITH ( MODULUS numeric_literal, REM n_distinct_inherited, which override the number-of-distinct-values estimates made by subsequent ANALYZE - operations. n_distinct affects the statistics for the table - itself, while n_distinct_inherited affects the statistics - gathered for the table plus its inheritance children. When set to a - positive value, ANALYZE will assume that the column contains - exactly the specified number of distinct nonnull values. When set to a - negative value, which must be greater - than or equal to -1, ANALYZE will assume that the number of - distinct nonnull values in the column is linear in the size of the - table; the exact count is to be computed by multiplying the estimated - table size by the absolute value of the given number. For example, - a value of -1 implies that all values in the column are distinct, while - a value of -0.5 implies that each value appears twice on the average. - This can be useful when the size of the table changes over time, since - the multiplication by the number of rows in the table is not performed - until query planning time. Specify a value of 0 to revert to estimating - the number of distinct values normally. For more information on the use - of statistics by the PostgreSQL query - planner, refer to . + operations. n_distinct affects the statistics for the + table itself, while n_distinct_inherited affects the + statistics gathered for the table plus its inheritance children, and for + the statistics gathered for partitioned tables. When the value + specified is a positive value, the query planner will assume that the + column contains exactly the specified number of distinct nonnull values. + Fractional values may also be specified by using values below 0 and + above or equal to -1. This instructs the query planner to estimate the + number of distinct values by multiplying the absolute value of the + specified number by the estimated number of rows in the table. For + example, a value of -1 implies that all values in the column are + distinct, while a value of -0.5 implies that each value appears twice on + average. This can be useful when the size of the table changes over + time. For more information on the use of statistics by the + PostgreSQL query planner, refer to + . Changing per-attribute options acquires a -- 2.47.3